AGAT
Another Gtf/Gff Analysis Toolkit https://nbisweden.github.io/AGAT/
<img alt="docker_agat" src="https://img.shields.io/badge/container-Docker-blue">
<img alt="singularity_agat" src="https://img.shields.io/badge/container-Singularity-orange">
<img alt="doi_zenodo" src="https://img.shields.io/badge/DOI-10.5281/zenodo.3552717-blue">
<img align="right" src="docs/img/NBIS.png" width="200" height="100" />
<h2><em>A</em>nother <em>G</em>tf/Gff <em>A</em>nalysis <i>T</i>oolkit</h2>
Suite of tools to handle gene annotations in any GTF/GFF format.
Documentation >>here<<
Previous documentation until v1.4.0 (readthedocs) here
Table of Contents
- What can AGAT do for you?
- Installation
- Usage
- List of tools
- More about the tools
- The AGAT parser - Standardisation to create GXF files compliant to any tool
- How to cite?
- Publication using AGAT
- Troubleshooting
What can AGAT do for you?
AGAT can check, fix, and pad missing information (features/attributes) in any kind of GTF or GFF file to produce complete, sorted and standardised GFF3. Over the years it has been enriched with many tools that perform just about any task related to GTF/GFF files (sanitizing, conversions, merging, modifying, filtering, FASTA sequence extraction, adding information, etc.). Compared to other methods, AGAT is robust to even the most despicable GTF/GFF files.
- Standardize/sanitize any GTF/GFF file into a comprehensive GFF3 format (script with `_sp_` prefix)

  <details>
  <summary>See standardization/sanitization tool</summary>

  | task | tool |
  | --- | --- |
  | check, fix, pad missing information into sorted and standardised gff3 | `agat_convert_sp_gxf2gxf.pl` |

  - add missing parent features (e.g. gene and mRNA if only CDS/exon exists).
  - add missing features (e.g. exon and UTR).
  - add missing mandatory attributes (i.e. ID, Parent).
  - fix identifiers to be unique.
  - fix feature locations.
  - remove duplicated features.
  - group related features (if spread in different places in the file).
  - sort features (tabix optional).
  - merge overlapping loci into one single locus (only if option activated).

  </details>
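
A typical call to the standardisation tool can be sketched as below. It assumes the `-g` (input) and `-o` (output) options listed by the tool's `--help`; the wrapper function `standardise_gxf` and the file names are made up for this example, so check `--help` for your installed version before relying on it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of AGAT): standardise one annotation file.
#   $1 = input GTF/GFF file, $2 = output GFF3 file
standardise_gxf() {
    # Fail early with a clear message if AGAT is not installed.
    if ! command -v agat_convert_sp_gxf2gxf.pl >/dev/null 2>&1; then
        echo "agat_convert_sp_gxf2gxf.pl not found in PATH" >&2
        return 127
    fi
    agat_convert_sp_gxf2gxf.pl -g "$1" -o "$2"
}

# Example call (placeholder file names):
# standardise_gxf annotation.gtf annotation.standardised.gff3
```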
- Convert many formats

  <details>
  <summary>See conversion tools</summary>

  | task | tool |
  | --- | --- |
  | convert any GTF/GFF into BED format | `agat_convert_sp_gff2bed.pl` |
  | convert any GTF/GFF into GTF format | `agat_convert_sp_gff2gtf.pl` |
  | convert any GTF/GFF into tabulated format | `agat_sp_gff2tsv.pl` |
  | convert any BAM from minimap2 into GFF format | `agat_convert_sp_minimap2_bam2gff.pl` |
  | convert any GTF/GFF into ZFF format | `agat_sp_gff2zff.pl` |
  | convert any GTF/GFF into any GTF/GFF (bioperl) format | `agat_convert_sp_gxf2gxf.pl` |
  | convert BED format into GFF3 format | `agat_convert_bed2gff.pl` |
  | convert EMBL format into GFF3 format | `agat_convert_embl2gff.pl` |
  | convert genscan format into GFF3 format | `agat_convert_genscan2gff.pl` |
  | convert mfannot format into GFF3 format | `agat_convert_mfannot2gff.pl` |

  </details>

- Perform numerous tasks (Just about anything that is possible)

  <details>
  <summary>See tools</summary>

  | task | tool |
  | --- | --- |
  | make feature statistics | `agat_sp_statistics.pl` |
  | make function statistics | `agat_sp_functional_statistics.pl` |
  | extract any type of sequence | `agat_sp_extract_sequences.pl` |
  | extract attributes | `agat_sp_extract_attributes.pl` |
  | complement annotations (non-overlapping loci) | `agat_sp_complement_annotations.pl` |
  | merge annotations | `agat_sp_merge_annotations.pl` |
  | filter gene models by ORF size | `agat_sp_filter_by_ORF_size.pl` |
  | filter to keep only longest isoforms | `agat_sp_keep_longest_isoform.pl` |
  | create introns features | `agat_sp_add_introns.pl` |
  | fix cds phases | `agat_sp_fix_cds_phases.pl` |
  | manage IDs | `agat_sp_manage_IDs.pl` |
  | manage UTRs | `agat_sp_manage_UTRs.pl` |
  | manage introns | `agat_sp_manage_introns.pl` |
  | manage functional annotation | `agat_sp_manage_functional_annotation.pl` |
  | specificity sensitivity | `agat_sp_sensitivity_specificity.pl` |
  | fusion / split analysis between two annotations | `agat_sp_compare_two_annotations.pl` |
  | analyze differences between BUSCO results | `agat_sp_compare_two_BUSCOs.pl` |
  | ... and much more ... | ... see here ... |

  </details>
About the GTF/GFF format
The GTF/GFF formats are 9-column text formats used to describe and represent genomic features.
The formats have evolved considerably since 1997 and, despite the well-defined specifications that exist nowadays, they remain flexible enough to hold a wide variety of information.
This flexibility has a drawback: there are a great many flavours of the formats, which can cause problems in downstream programs.
For a complete overview of the GTF/GFF formats have a look here.
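
To make the 9-column layout concrete, here is a minimal sketch that flags feature lines which do not have exactly nine tab-separated fields. The function name `check_nine_columns` and the sample file are invented for this illustration; it is not an AGAT tool and performs none of AGAT's fixes.

```shell
#!/bin/sh
# Hypothetical helper (not part of AGAT): report lines of a GTF/GFF file
# that do not contain exactly 9 tab-separated columns.
check_nine_columns() {
    awk -F '\t' '
        # Skip comment/directive lines and empty lines.
        /^#/ || NF == 0 { next }
        # Report any feature line with a column count other than 9.
        NF != 9 { printf "line %d: %d columns\n", NR, NF; bad = 1 }
        # Exit status 1 if at least one bad line was found, else 0.
        END { exit bad }
    ' "$1"
}

# Tiny made-up example: one valid feature line and one truncated line.
sample=$(mktemp)
printf '##gff-version 3\n' > "$sample"
printf 'chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1\n' >> "$sample"
printf 'chr1\texample\tmRNA\t100\t900\n' >> "$sample"

check_nine_columns "$sample" || echo "file needs fixing"
```

On the sample above this reports line 3 as having 5 columns. A check like this only catches the crudest problems; the many flavours mentioned above differ mostly in how the ninth (attributes) column is written.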
Installation
Using Docker
<details> <summary>See details</summary>
First you must have Docker installed and running.
Secondly, have a look at the available AGAT biocontainers at quay.io.
Then:
# get the chosen AGAT container version
docker pull quay.io/biocontainers/agat:1.4.2--pl5321hdfd78af_0
# use an AGAT tool, e.g. agat_convert_sp_gxf2gxf.pl
docker run quay.io/biocontainers/agat:1.4.2--pl5321hdfd78af_0 agat_convert_sp_gxf2gxf.pl --help
</details>
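
A container cannot see files on the host unless a directory is mounted into it. The sketch below wraps `docker run` so that the current directory is available inside the container. `--rm`, `-v` and `-w` are standard `docker run` options; the function name `agat_docker`, the `/data` mount point and the file names in the example call are arbitrary choices for this illustration.

```shell
#!/bin/sh
# Image tag taken from the pull command above; change it to the version you pulled.
AGAT_IMAGE="quay.io/biocontainers/agat:1.4.2--pl5321hdfd78af_0"

# Hypothetical wrapper: run any AGAT tool inside the container.
agat_docker() {
    # --rm removes the container once the tool exits;
    # -v mounts the current host directory at /data in the container;
    # -w makes /data the working directory so relative paths resolve there.
    docker run --rm -v "$PWD":/data -w /data "$AGAT_IMAGE" "$@"
}

# Example call (placeholder file names):
# agat_docker agat_convert_sp_gxf2gxf.pl -g annotation.gff -o annotation.clean.gff3
```

Depending on how Docker is set up, files written this way may be owned by root on the host.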
Using Singularity
<details> <summary>See details</summary>
First you must have Singularity installed and running.
Secondly, have a look at the available AGAT biocontainers at quay.io.
Then:
# get the chosen AGAT container version
singularity pull docker://quay.io/biocontainers/agat:1.4.2--pl5321hdfd78af_0
# run the container
singularity run agat_1.4.2--pl5321hdfd78af_0.sif
You are now in the container. You can use an AGAT tool, e.g. agat_convert_sp_gxf2gxf.pl, by running:
agat_convert_sp_gxf2gxf.pl --help
</details>
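
For scripted, non-interactive use, `singularity exec` runs a single command inside the image without opening a shell. The sketch below assumes the `.sif` file produced by the pull command above sits in the current directory; the function name `agat_singularity` and the file names in the example call are invented for this illustration.

```shell
#!/bin/sh
# Image file name as produced by the pull command above; adjust to your version.
AGAT_SIF="agat_1.4.2--pl5321hdfd78af_0.sif"

# Hypothetical wrapper: run any AGAT tool inside the Singularity image.
agat_singularity() {
    # "exec" runs the given command in the container and then returns.
    singularity exec "$AGAT_SIF" "$@"
}

# Example call (placeholder file names):
# agat_singularity agat_convert_sp_gxf2gxf.pl -g annotation.gff -o annotation.clean.gff3
```

Singularity usually makes the current directory visible inside the container by default, but this depends on the site configuration.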
Using Bioconda
<details> <summary>See details</summary>
Install AGAT
conda install -c bioconda agat
Update AGAT
conda update agat
Uninstall AGAT
conda uninstall agat
</details>
Old school - Manually
<details> <summary>See details</summary>
You will have to install all prerequisites and AGAT manually.

Install prerequisites

- R (optional)
  You can install it with conda (conda install r-base), through CRAN (See here for a nice tutorial) or using your package management tool (e.g. apt for Debian, Ubuntu, and related Linux distributions). R is optional and can be used to produce some plots. You will need to install the Perl dependency Statistics::R.
- Perl >= 5.8
  It should already be available on your computer. If you are unlucky, perl.org is the place to go.
- Perl modules
  They can be installed in different ways:
  - using cpan or cpanm
    cpanm --installdeps .
