SnpEffWrapper

Takes a VCF and infers annotations and variant effects from a GFF using SnpEff.


Introduction

SnpEff is a tool that annotates and predicts the effects of variants on genes. SnpEffWrapper takes a VCF and, using SnpEff, infers annotations and variation effects from a GFF. If you use SnpEffWrapper, please consider citing SnpEff. This software is not endorsed in any respect by the original authors.

Installation

SnpEffWrapper has the following dependencies:

  • SnpEff (>= 4.1)
  • Java (>= 1.7)
  • Jinja2
  • PyVCF
  • PyYAML

Installation details are provided below. If you encounter an issue when installing SnpEffWrapper, please contact your local system administrator. If you encounter a bug, please log it on the issues page or email us at path-help@sanger.ac.uk.

Install SnpEff and Java 1.7, then run:

pip install git+https://github.com/sanger-pathogens/SnpEffWrapper.git

Running the tests

The tests can be run from the top-level directory:

./snpEffWrapper/tests/test_wrapper.py

Usage

$ snpEffBuildAndRun --help
usage: snpEffBuildAndRun [-h] [--snpeff-exec SNPEFF_EXEC]
                         [--java-exec JAVA_EXEC] [--coding-table CODING_TABLE]
                         [-o OUTPUT_VCF] [--debug] [--keep]
                         gff_file vcf_file

Takes a VCF and applies annotations from a GFF using SnpEff

positional arguments:
  gff_file              GFF with annotations including a reference genome
                        sequence
  vcf_file              VCF input to annotate (NB must be aligned to the
                        reference in your GFF

optional arguments:
  -h, --help            show this help message and exit
  --snpeff-exec SNPEFF_EXEC
                        Path to your prefered SnpEff executable (default:
                        snpEff.jar)
  --java-exec JAVA_EXEC
                        Path to Java 1.7 (default: java)
  --coding-table CODING_TABLE
                        A mapping of contig name to coding table formatted in
                        YAML
  -o OUTPUT_VCF, --output_vcf OUTPUT_VCF
                        Output for the annotated VCF (default: stdout)
  --debug               Show lots of SnpEff and other debug output
  --keep                Keep temporary files and databases (useful for
                        debugging)

  • snpEffBuildAndRun will look for snpEff.jar in the following locations:
    • the file specified by --snpeff-exec
    • snpEff.jar in your local directory
    • snpEff.jar in your PATH
  • SnpEff needs Java 1.7 to run; snpEffBuildAndRun will look for it in the following locations:
    • the file specified by --java-exec
    • java in your PATH
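
The lookup described above can be sketched roughly as follows. This is an illustration only, not SnpEffWrapper's actual code: the function name `find_snpeff_jar` is hypothetical, and the sketch assumes the locations are tried in the order listed and that a missing `--snpeff-exec` path falls through to the next candidate.

```python
import os


def find_snpeff_jar(explicit_path=None, cwd=None, path_env=None):
    """Return the first existing snpEff.jar candidate, or None.

    Hypothetical helper: the search order mirrors the list in this README,
    but the real wrapper may behave differently in edge cases.
    """
    cwd = os.getcwd() if cwd is None else cwd
    path_env = os.environ.get("PATH", "") if path_env is None else path_env

    candidates = []
    if explicit_path:
        # 1. The file given with --snpeff-exec
        candidates.append(explicit_path)
    # 2. snpEff.jar in the local directory
    candidates.append(os.path.join(cwd, "snpEff.jar"))
    # 3. snpEff.jar in any directory listed in PATH
    for directory in path_env.split(os.pathsep):
        if directory:
            candidates.append(os.path.join(directory, "snpEff.jar"))

    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

Calling `find_snpeff_jar()` with no arguments searches the current directory and then `PATH`. A `None` result suggests that passing `--snpeff-exec` explicitly is the safer option.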

Example usage

$ snpEffBuildAndRun snpEffWrapper/tests/data/minimal.gff snpEffWrapper/tests/data/minimal.vcf -o minimal.annotated.vcf

Alternative coding tables

You can provide a coding table for each VCF contig; otherwise it defaults to SnpEff's 'Bacterial_and_Plant_Plastid'. To do this, pass a YAML-formatted mapping from each contig in your VCF to the relevant table listed in snpEffWrapper/data/config.template.

For example:

snpEffBuildAndRun minimal.gff minimal.vcf \
  --coding-table 'default: Standard'
  
snpEffBuildAndRun minimal.gff minimal.vcf \
  --coding-table '{CHROM1: Standard, MITO1: Mitochondrial}'
  
snpEffBuildAndRun minimal.gff minimal.vcf \
  --coding-table '{default: Standard, MITO1: Mitochondrial}'

NB: you don't need curly brackets if you're only mapping one contig (or only setting a default); you do need them if the mapping has more than one entry.
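
To make the mapping semantics concrete, here is a minimal sketch of how a table could be chosen for each contig once the YAML has been parsed. The function `resolve_coding_table` is hypothetical and the fallback logic is an assumption based on the description above, not a copy of SnpEffWrapper's implementation. The dictionaries are what a YAML parser such as PyYAML's `yaml.safe_load` would return for the three command-line examples.

```python
# Fallback named in this README when no table is given for a contig.
DEFAULT_TABLE = "Bacterial_and_Plant_Plastid"


def resolve_coding_table(mapping, contig):
    """Pick the coding table for a contig (hypothetical helper).

    Assumed precedence: an explicit entry for the contig, then the
    'default' entry, then SnpEff's Bacterial_and_Plant_Plastid table.
    """
    if contig in mapping:
        return mapping[contig]
    return mapping.get("default", DEFAULT_TABLE)


# 'default: Standard' parses to a one-entry mapping.
only_default = {"default": "Standard"}

# '{CHROM1: Standard, MITO1: Mitochondrial}' names two contigs, no default.
per_contig = {"CHROM1": "Standard", "MITO1": "Mitochondrial"}

# '{default: Standard, MITO1: Mitochondrial}' mixes both.
mixed = {"default": "Standard", "MITO1": "Mitochondrial"}

for name, mapping in [("only_default", only_default),
                      ("per_contig", per_contig),
                      ("mixed", mixed)]:
    tables = {c: resolve_coding_table(mapping, c)
              for c in ("CHROM1", "MITO1", "OTHER")}
    print(name, tables)
```

Under these assumptions, a contig that is neither listed nor covered by a `default` entry (here `OTHER` in the second mapping) ends up with the Bacterial_and_Plant_Plastid table.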

Input

  • The GFF must contain the reference sequence in Fasta format
  • The VCF must be aligned against the reference in the GFF
  • At least one of the contigs in the VCF must have annotation data in the GFF (you'll get a warning for each VCF contig not in the GFF)
  • You cannot provide unknown coding tables (i.e. tables that can't be found in config.template)
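
The first three requirements can be checked before running the wrapper. The sketch below is a hypothetical pre-flight helper, not part of SnpEffWrapper. It assumes the embedded sequence is introduced by the GFF3 `##FASTA` directive and treats any feature line for a contig as "annotation data".

```python
def gff_contigs_and_fasta(gff_lines):
    """Return (contigs with feature lines, whether a ##FASTA section exists)."""
    contigs = set()
    has_fasta = False
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            # Everything after this directive is sequence, not features.
            has_fasta = True
            break
        if not line or line.startswith("#"):
            continue
        contigs.add(line.split("\t", 1)[0])
    return contigs, has_fasta


def vcf_contigs(vcf_lines):
    """Return the set of CHROM values used in the VCF records."""
    return {line.split("\t", 1)[0]
            for line in vcf_lines
            if line.strip() and not line.startswith("#")}


def check_inputs(gff_lines, vcf_lines):
    """Return (fatal problems, VCF contigs lacking annotations in the GFF)."""
    annotated, has_fasta = gff_contigs_and_fasta(gff_lines)
    in_vcf = vcf_contigs(vcf_lines)
    problems = []
    if not has_fasta:
        problems.append("GFF has no ##FASTA section")
    if in_vcf and not (in_vcf & annotated):
        problems.append("no VCF contig has annotations in the GFF")
    return problems, sorted(in_vcf - annotated)
```

With real files, the two arguments can simply be open file handles, for example `check_inputs(open("minimal.gff"), open("minimal.vcf"))`. The second return value lists the contigs that would be expected to trigger warnings.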

License

SnpEffWrapper is free software, licensed under GPLv3.

Feedback/Issues

Please report any issues to the issues page or email path-help@sanger.ac.uk.

Citation

If you use this, please consider citing SnpEff. This software is not endorsed in any respect by the original authors.
