SkillAgentSearch skills...

Telseq

A software for calculating telomere length

Install / Use

/learn @zd1/Telseq
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

TelSeq is a software that estimates telomere length from whole genome sequencing data (BAMs).

The most current development version is available from our git repository: git://github.com/zd1/telseq.git

The software is implemented in C++.

Citation:

Estimating telomere length from whole genome sequence data

Zhihao Ding; Massimo Mangino; Abraham Aviv; Tim Spector; Richard Durbin. Nucleic Acids Research 2014; doi: 10.1093/nar/gku181 http://nar.oxfordjournals.org/content/42/9/e75

Compile TelSeq

TelSeq dependency:

  • the bamtools library (https://github.com/pezmaster31/bamtools)

  • A modern version of GCC (version 4.8 or above) This can been seen by "gcc --version". If multiple GCCs are installed in your system, please set environmental variables pointing to the one of version 4.8 or above. e.g. in bash,

export CXX=/path/to/gcc/gcc-4.8.1/bin/g++
export CC=/path/to/gcc/gcc-4.8.1/bin/gcc

One easy way to install a new GCC is to use homebrew,

# install homebrew if you don't have it
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

# install GCC
brew install gcc

Compile

Go to the src directory and run autogen.sh from the src directory to generate the configure file ./autogen.sh Then run

./configure
make

The executable binary will be at src/Telseq/telseq. If bamtools are installed not at the system location, you can specify their location by

./configure --with-bamtools=/path/to/bamtools

The /path/to/bamtools directory is the directory that contains 'lib' and 'include' sub directories.

Run TelSeq

Read options and usage information

telseq

Analyse one or more BAMs by specifying BAM file path as command line arguments.

telseq a.bam b.bam

Analyse a list of BAMs whose paths are specified in a 'bamlist' file.

bamlist should contain only 1 column with each row the path of a BAM. i.e.

/path/to/a.bam
/path/to/b.bam

telseq -f bamlist

BAM file path can also be provided by piping in a 'bamlist', whose format must be same as above

cat bamlist | telseq

output

By default the result will be printed out to stdout. To change it to a file, use '-o' option to specify a file path. i.e.

telseq -o /path/to/output a.bam b.bam c.bam

This can also be achived by just direct the output to a file using '>', i.e.

telseq a.bam b.bam c.bam > /path/to/output

The software will print out running status to stderr as well. To separate them from stdout, one could direct log to a file, ie.

telseq a.bam b.bam c.bam 2>outputlog

Merge results from read groups by taking a weighted mean. However, it is benetifical run without -m to output the result per lane, so to have an idea about inter-lane variation. The merging can be done afterwards. telseq -m a.bam > output

Output file format

| Column | Definitions | | -------------|----------------------------------------------| | ReadGroup | read group, Defined by the RG tag in BAM header. | | Library | sequencing library that the read group belongs to.| | Sample | defined by the SM tag in BAM header. | | Total | total number of reads in this read group. | | Mapped | total number of mapped reads, SAM flag 0x4. | | Duplicates | total number of duplicate reads, SAM flag 0x400. | | LENGH_ESTIMATE | estimated telomere length. | | TEL0 | read counts for reads containing no TTAGGG/CCCTAA repeats. | | TEL1 | read counts for reads containing only 1 TTAGGG/CCCTAA repeats. | | TELn | read counts for reads containing only n TTAGGG/CCCTAA repeats. | | TEL16 | read counts for reads containing 16 TTAGGG/CCCTAA repeats. | | GC0 | read counts for reads with GC between 40%-42%. | | GC1 | read counts for reads with GC between 42%-44%. | | GCn | read counts for reads with GC between (40%+n*2%)-(42%+(n+1)*2%). | | GC9 | read counts for reads with GC between 58%-60%. |

By default for each BAM a header line will be printed out. This can be suppressed by using the '-H' option. It is useful when one has multiple BAMs to scan and wish the output to be merged together. i.e.

telseq -H a.bam b.bam c.bam > myresult

To just print out the header, use '-h' option. i.e.

telseq -h

Docker

Install Docker

Please refer to the official website for installing Docker https://docs.docker.com/engine/installation/

Build telseq Docker image

docker build -t telseq-docker github.com/zd1/telseq

Run telseq

Note that the sample path "/path/to/bam/sample.bam" in the machine that the container is run needs to be specified. "/sample.bam" doesn't need to be changed.

docker run -v /path/to/bam/sample.bam:/sample.bam telseq-docker /sample.bam

Contact

zhihao.ding at gmail.com

View on GitHub
GitHub Stars74
CategoryDevelopment
Updated27d ago
Forks29

Languages

C++

Security Score

95/100

Audited on Mar 11, 2026

No findings