<p align="center"> <img src="img/mtags_logo.png" width="500" /> </p>

mTAGs: taxonomic profiling using degenerate consensus reference sequences of ribosomal RNA genes

mTAGs is a tool for the taxonomic profiling of metagenomes. It detects sequencing reads belonging to the small subunit of the ribosomal RNA (SSU-rRNA) gene and annotates them by aligning them to full-length degenerate consensus SSU-rRNA reference sequences. The tool can process single-end and paired-end metagenomic reads, takes advantage of the information contained in any region of the SSU-rRNA gene, and provides relative abundance profiles at multiple taxonomic ranks (Domain, Phylum, Class, Order, Family, Genus and OTUs defined at a 97% sequence identity cutoff).
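To illustrate what a relative abundance profile at a given taxonomic rank means, here is a minimal Python sketch. It is not part of mTAGs and does not reproduce its output format: the function name `relative_abundance`, the `RANKS` list, and the example lineages are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Ranks in order from broadest to most specific (OTUs are omitted in this sketch).
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]


def relative_abundance(lineages, rank):
    """Return {taxon: fraction of reads} with lineages truncated at `rank`."""
    depth = RANKS.index(rank) + 1
    # Truncate each lineage to the requested rank and count reads per taxon.
    counts = Counter(";".join(lineage[:depth]) for lineage in lineages)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical per-read annotations: one lineage per annotated rRNA read.
reads = [
    ("Bacteria", "Proteobacteria", "Gammaproteobacteria",
     "Enterobacterales", "Enterobacteriaceae", "Escherichia"),
    ("Bacteria", "Proteobacteria", "Alphaproteobacteria",
     "Rhizobiales", "Rhizobiaceae", "Rhizobium"),
    ("Bacteria", "Firmicutes", "Bacilli",
     "Bacillales", "Bacillaceae", "Bacillus"),
    ("Archaea", "Euryarchaeota", "Methanobacteria",
     "Methanobacteriales", "Methanobacteriaceae", "Methanobacterium"),
]

domain_profile = relative_abundance(reads, "domain")
phylum_profile = relative_abundance(reads, "phylum")
```

The same set of annotated reads yields one profile per rank, which is why a single run can report abundances from Domain down to Genus.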

The tool is developed by Hans-Joachim Ruscheweyh and Guillem Salazar and distributed under the GPL v3 license.

If you use mTAGs, please cite:

Salazar G*, Ruscheweyh H-J*, Hildebrand F, Acinas S and Sunagawa S. mTAGs: taxonomic profiling using degenerate consensus reference sequences of ribosomal RNA gene. Bioinformatics, 2021.

Analyses in the publication were executed using version 1.0.0.

Questions or comments? Open a GitHub issue.

Installation

mTAGs is written in Python and has the following dependencies:

  • Python (>= v3.7)
  • vsearch (tested: v2.15.0)
  • hmmer (tested: v3.3)
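Before installing manually, it can help to verify that the dependencies are visible in the current environment. The following is a small sketch, not something shipped with mTAGs: the function name `check_dependencies` is hypothetical, and it assumes that HMMER is detected through its `hmmsearch` executable and vsearch through `vsearch`. It checks presence only, not versions.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)
# Assumption: HMMER is located via its `hmmsearch` binary.
REQUIRED_TOOLS = ("vsearch", "hmmsearch")


def check_dependencies(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {found} found, but >= {wanted} is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_dependencies():
        print("WARNING:", problem)
```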

Installation using conda

The easiest way to install mTAGs is to use the conda package manager, which automatically creates an environment with the correct versions of the dependencies installed.

$ conda create -n mtags python=3.7 hmmer vsearch
$ source activate mtags
# or
$ conda activate mtags

$ pip install mTAGs

The database needs to be downloaded as the last step of the installation. This needs to be done only once, before the first metagenomic samples are processed:

# Download mTAGs database
$ mtags download

2021-06-21 12:02:40,294 INFO: Starting mTAGs
2021-06-21 12:02:40,294 INFO: Start downloading the mTAGs database. ~600MB
2021-06-21 12:05:17,883 INFO: Finished downloading the mTAGs database.
2021-06-21 12:05:17,883 INFO: Finishing mTAGs

$ mtags
<details><summary>mTAGs output</summary> <p>
Program:    mTAGs - taxonomic profiling using degenerate consensus reference
            sequences of ribosomal RNA gene
Version:    1.0.4
Reference:  Salazar, Ruscheweyh, et al. mTAGs: taxonomic profiling using
            degenerate consensus reference sequences of ribosomal RNA
            gene. Bioinformatics (2021)

Usage: mtags <command> [options]
Command:

-- General
    profile     Extract and taxonomically annotate rRNA reads in metagenomic samples
    merge       Merge profiles

-- Expert
    extract     Extract rRNA reads in metagenomic samples
    annotate    Annotate and quantify rRNA reads

-- Installation
    download    Download the mTAGs database - Once after download of the tool
</p> </details>

Manual installation

Manual installation is possible but not recommended. Install via pip after installation of dependencies:

$ pip install mTAGs

# Download mTAGs database
$ mtags download


Usage

The tool is split into two steps: profiling and merging. The first step (mtags profile [options]) uses HMM models to extract potential rRNA sequences from metagenomic data and annotates them taxonomically by aligning these sequences against a modified SILVA database. The second step (mtags merge [options]) merges taxonomic profiles from different metagenomic samples. Extraction and annotation of rRNA sequences are grouped into a single command but can also be run independently (mtags extract [options] and mtags annotate [options]).
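Conceptually, merging combines per-sample profiles into one table with a row per taxon and a column per sample. The sketch below illustrates that idea only. It is not the `mtags merge` implementation and makes no claim about the mTAGs file formats: the function name `merge_profiles`, the dictionary layout, and the example counts are all hypothetical.

```python
def merge_profiles(profiles):
    """Merge {sample: {taxon: count}} into (samples, {taxon: [count per sample]}).

    Taxa missing from a sample are filled with 0 so every row has one
    value per sample.
    """
    samples = sorted(profiles)
    # Union of all taxa observed in any sample.
    taxa = sorted({taxon for counts in profiles.values() for taxon in counts})
    table = {
        taxon: [profiles[sample].get(taxon, 0) for sample in samples]
        for taxon in taxa
    }
    return samples, table


# Hypothetical genus-level counts for two samples.
per_sample = {
    "sampleA": {"Escherichia": 12, "Bacillus": 3},
    "sampleB": {"Bacillus": 7, "Rhizobium": 5},
}

samples, merged = merge_profiles(per_sample)
```

Filling absent taxa with zero keeps the merged table rectangular, which is what most downstream statistics packages expect.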

PROFILE

This step uses precomputed HMM models to extract rRNA sequences from a metagenomic sample. The rRNA sequences are then aligned against a clustered rRNA database to annotate sequences and profile samples. mTAGs takes as input FASTA/FASTQ files with quality-controlled sequencing data.


$ mtags profile
Program:    mTAGs - taxonomic profiling using degenerate consensus reference
            sequences of ribosomal RNA gene
Version:    1.0.4
Reference:  Salazar, Ruscheweyh, et al. mTAGs: taxonomic profiling using
            degenerate consensus reference sequences of ribosomal RNA
            gene. Bioinformatics (2021)

Usage: mtags profile [options]

Input options:
    -f  FILE [FILE ...]   Forward reads file. Can be fasta/fastq and gzipped.
    -r  FILE [FILE ...]   Reverse reads file. Can be fasta/fastq and gzipped.
    -s  FILE [FILE ...]   Single/merge reads file. Can be fasta/fastq and gzipped.

Output options:
    -o  DIR               Output folder [Required]

Other options:
    -n  STR               Samplename [Required]
    -t  INT               Number of threads. [4]
    -ma INT               Maxaccepts, vsearch parameter. Larger
                          numbers increase sensitivity and runtime. [1000]
    -mr INT               Maxrejects, vsearch parameter. Larger
                          numbers increase sensitivity and runtime. [1000]
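When profiling many samples, the command line can be assembled programmatically from the options listed above. The following sketch uses only the flags shown in the help text (`-f`, `-r`, `-s`, `-o`, `-n`, `-t`, `-ma`, `-mr`). The helper name `build_profile_command` and its argument names are hypothetical, and the function only builds the argument list; running it (for example with `subprocess.run`) is left to the caller.

```python
import shlex


def build_profile_command(sample, outdir, forward=(), reverse=(), single=(),
                          threads=4, maxaccepts=1000, maxrejects=1000):
    """Return the argument list for one `mtags profile` invocation."""
    if not (forward or reverse or single):
        raise ValueError("at least one input file is required")
    cmd = ["mtags", "profile"]
    # Each input option accepts one or more files after the flag.
    if forward:
        cmd += ["-f", *forward]
    if reverse:
        cmd += ["-r", *reverse]
    if single:
        cmd += ["-s", *single]
    cmd += ["-o", outdir, "-t", str(threads), "-n", sample,
            "-ma", str(maxaccepts), "-mr", str(maxrejects)]
    return cmd


cmd = build_profile_command(
    "sample", "output",
    forward=["sample.1.fq.gz"],
    reverse=["sample.2.fq.gz"],
    single=["sample.s.fq.gz", "sample.m.fq.gz"],
)
# Shell-safe rendering of the command for logging or copy/paste.
command_line = " ".join(shlex.quote(arg) for arg in cmd)
```

Passing the list form to `subprocess.run(cmd, check=True)` avoids shell quoting issues with unusual file names.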

# Example usage of the mTAGs profile routine
$ mtags profile -f sample.1.fq.gz -r sample.2.fq.gz -s sample.s.fq.gz sample.m.fq.gz -o output -t 4 -n sample -ma 1000 -mr 1000
<details><summary>mTAGs log</summary> <p>
2021-06-21 09:04:48,644 INFO: Starting mTAGs
2021-06-21 09:04:48,646 INFO: Extracting FastA and revcomp FastA from input/sample.1.fq.gz
2021-06-21 09:04:59,536 INFO: Processed reads:	824523
2021-06-21 09:04:59,536 INFO: Finished extracting. Found 824523 sequences.
2021-06-21 09:04:59,536 INFO: Start detecting rRNA sequences in FastA files
2021-06-21 09:04:59,536 INFO: Start detecting rRNA sequences for molecule=ssu
2021-06-21 09:04:59,536 INFO: Executing:	hmmsearch --cpu 4 -o sample/sample.1.fq.gz_fw.fasta_ssu.hmmer --domtblout sample/sample.1.fq.gz_fw.fasta_ssu.dom -E 0.01 mTAGs/data/ssu.hmm sample/sample.1.fq.gz_fw.fasta
2021-06-21 09:05:06,982 INFO: Finished hmmsearch
2021-06-21 09:05:06,988 INFO: Executing:	hmmsearch --cpu 4 -o sample/sample.1.fq.gz_rev.fasta_ssu.hmmer --domtblout sample/sample.1.fq.gz_rev.fasta_ssu.dom -E 0.01 mTAGs/data/ssu.hmm sample/sample.1.fq.gz_rev.fasta
2021-06-21 09:05:14,442 INFO: Finished hmmsearch
2021-06-21 09:05:14,449 INFO: Finished detecting rRNA sequences for molecule=ssu
2021-06-21 09:05:14,449 INFO: Start detecting rRNA sequences for molecule=lsu
2021-06-21 09:05:14,450 INFO: Executing:	hmmsearch --cpu 4 -o sample/sample.1.fq.gz_fw.fasta_lsu.hmmer --domtblout sample/sample.1.fq.gz_fw.fasta_lsu.dom -E 0.01 mTAGs/data/lsu.hmm sample/sample.1.fq.gz_fw.fasta
2021-06-21 09:05:35,255 INFO: Finished hmmsearch
2021-06-21 09:05:35,266 INFO: Executing:	hmmsearch --cpu 4 -o sample/sample.1.fq.gz_rev.fasta_lsu.hmmer --domtblout sample/sample.1.fq.gz_rev.fasta_lsu.dom -E 0.01 mTAGs/data/lsu.hmm sample/sample.1.fq.gz_rev.fasta
2021-06-21 09:05:55,845 INFO: Finished hmmsearch
2021-06-21 09:05:55,859 INFO: Finished detecting rRNA sequences for molecule=lsu
2021-06-21 09:05:55,859 INFO: Found 4143 potential rRNA sequences.
2021-06-21 09:05:55,859 INFO: Finished detecting rRNA sequences from FastA files.
2021-06-21 09:05:55,859 INFO: Finding best molecule for each read
2021-06-21 09:05:55,867 INFO: Finished finding best molecule for each read
2021-06-21 09:05:55,867 INFO: Start extracting reads/writing output
2021-06-21 09:05:58,911 INFO: Processed reads:	824523
2021-06-21 09:05:58,912 INFO: Finished extracting reads/writing output
2021-06-21 09:05:58,912 INFO: euk_lsu	1571
2021-06-21 09:05:58,912 INFO: bac_lsu	1114
2021-06-21 09:05:58,912 INFO: euk_ssu	865
2021-06-21 09:05:58,912 INFO: bac_ssu	583
2021-06-21 09:05:58,912 INFO: arc_lsu	9
2021-06-21 09:05:58,912 INFO: arc_ssu	1
2021-06-21 09:05:58,927 INFO: Extracting FastA and revcomp FastA from input/sample.2.fq.gz
2021-06-21 09:06:10,001 INFO: Processed reads:	824523
2021-06-21 09:06:10,002 INFO: Finished extracting. Found 824523 sequences.
2021-06-21 09:06:10,002 INFO: Start detecting rRNA sequences in FastA files
2021-06-21 09:06:10,002 INFO: Start detecting rRNA sequences for molecule=ssu
2021-06-21 09:06:10,002 INFO: Executing:	hmmse