Varigraph
An accurate and widely applicable pangenome graph-based variant genotyper for diploid and polyploid genomes
varigraph
Introduction
An accurate and widely applicable pangenome graph-based variant genotyper for diploid and polyploid genomes
Requirements
Please note the following requirements before building and running the software:
- Linux operating system
- cmake version 3.12 or higher
- C++ compiler that supports C++17 or higher, and the zlib library installed (we recommend using GCC version "7.3.0" or newer) for building varigraph
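Before building, the cmake requirement can be checked with a short shell sketch like the one below. The helper name version_ge is hypothetical and not part of varigraph, and the comparison assumes GNU sort with the -V option is available.

```shell
# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# Hypothetical helper; relies on GNU sort -V for version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the installed cmake meets the 3.12 minimum.
if command -v cmake >/dev/null 2>&1; then
    # The first line of "cmake --version" reads "cmake version X.Y.Z".
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.12; then
        echo "cmake $cmake_version: OK"
    else
        echo "cmake $cmake_version: too old, need 3.12 or higher"
    fi
else
    echo "cmake: not found"
fi
```

The same helper could be reused for the compiler version, provided the compiler reports a full dotted version string.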
Installation
Install via conda
conda create -n varigraph
conda activate varigraph
conda install -c duzezhen varigraph
Building on Linux
Use the following script to build the software:
- First, obtain the source code.
git clone https://github.com/JiaoLab2021/varigraph.git
cd varigraph
- Next, compile the software and add the current directory to your system's PATH environment variable.
cmake ./
make -j 5
echo 'export PATH="$PATH:'$(pwd)'"' >> ~/.bashrc
source ~/.bashrc
Usage
Input Files
- Reference Genome
- VCF File of Population Variants
- Sample File:
# Sample File
sample1 sample1.r1.fq.gz sample1.r2.fq.gz
sample2 sample2.r1.fq.gz sample2.r2.fq.gz
...
sampleN sampleN.r1.fq.gz sampleN.r2.fq.gz
Please note that the Sample file must be formatted exactly as shown above, where each sample is listed with its corresponding read files.
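For many samples, the file can be generated rather than typed by hand. The sketch below assumes read files are named <sample>.r1.fq.gz and <sample>.r2.fq.gz, as in the example above, and that columns are separated by a single space as displayed. The function name make_sample_file and the directory name reads are hypothetical.

```shell
# make_sample_file DIR: print one "name r1 r2" line per read pair found in DIR.
# Hypothetical helper; assumes the <sample>.r1.fq.gz / <sample>.r2.fq.gz naming.
make_sample_file() {
    for r1 in "$1"/*.r1.fq.gz; do
        [ -e "$r1" ] || continue            # no matches: the glob stays literal
        r2=${r1%.r1.fq.gz}.r2.fq.gz         # expected mate file
        name=$(basename "$r1" .r1.fq.gz)    # sample name taken from the file name
        if [ -e "$r2" ]; then
            printf '%s %s %s\n' "$name" "$r1" "$r2"
        else
            echo "warning: no mate file for $r1" >&2
        fi
    done
}

# Example with a hypothetical directory:
#   make_sample_file reads > samples.cfg
```

Samples whose second read file is missing are skipped with a warning, so the resulting file only lists complete pairs.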
Running
For convenience, let's assume the following file names for the input:
- refgenome.fa
- input.vcf.gz
- samples.cfg
varigraph runs in two steps: the first step builds the genome graph, and the second step performs the genotyping. Here is the specific code:
1. Building the Genome Graph:
varigraph construct -r refgenome.fa -v input.vcf.gz --save-graph graph.bin
- Adjustment for Tetraploid Genome:
  - If your VCF file involves variants from a tetraploid genome, include the --vcf-ploidy 4 parameter.
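Put together, the tetraploid adjustment to step 1 might look like the sketch below. It uses only the flags shown in this README. The helper name construct_cmd is hypothetical, and it prints the command instead of running it so the line can be inspected first.

```shell
# construct_cmd REF VCF GRAPH [VCF_PLOIDY]: print a varigraph construct command.
# Hypothetical helper; --vcf-ploidy is added only when a ploidy is given.
construct_cmd() {
    if [ -n "${4:-}" ]; then
        echo "varigraph construct -r $1 -v $2 --vcf-ploidy $4 --save-graph $3"
    else
        echo "varigraph construct -r $1 -v $2 --save-graph $3"
    fi
}

# Tetraploid VCF: pass 4 as the last argument.
construct_cmd refgenome.fa input.vcf.gz graph.bin 4
```

Copy the printed line into the shell to run it once it looks right.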
2. Performing Genotyping:
varigraph genotype --load-graph graph.bin -s samples.cfg --use-depth
- Adjustments for Genotyping:
  - Homozygous Samples: For homozygous samples, add -g hom to improve genotyping accuracy.
  - Use --use-depth for accurate genotyping regardless of ploidy.
Note on Genotyping
- The software supports species with ploidy ranging from 2 to 8. Set the --sample-ploidy parameter to the value that matches the species:
  - Autotetraploid: Set --sample-ploidy 4 (such as Solanum tuberosum, Medicago sativa, and Vaccinium corymbosum)
- Autopolyploids (such as tetraploid potato): For these species, simply set --sample-ploidy to the corresponding ploidy level (e.g., 4 for tetraploid potato).
- Allopolyploids (such as Brassica napus (AACC) or hexaploid wheat (AABBDD)): For these species, simply set --sample-ploidy to 2.
- For accurate genotyping, make sure to choose the correct ploidy setting based on whether your species is an autopolyploid or an allopolyploid.
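The ploidy rules above can be condensed into a small lookup. This is only a sketch: the function name sample_ploidy and its category labels (auto, allo, diploid) are hypothetical and not part of varigraph. The final line prints a genotype command built from the flags shown in this README.

```shell
# sample_ploidy KIND [LEVEL]: print the value to pass to --sample-ploidy.
#   auto LEVEL -> LEVEL (autopolyploid, e.g. 4 for tetraploid potato)
#   allo       -> 2     (allopolyploid, e.g. Brassica napus or hexaploid wheat)
#   diploid    -> 2
sample_ploidy() {
    case "$1" in
        auto)
            # Accept only the supported range of 2 to 8.
            if [ "${2:-}" -ge 2 ] 2>/dev/null && [ "$2" -le 8 ]; then
                echo "$2"
            else
                echo "unsupported ploidy: ${2:-}" >&2
                return 1
            fi ;;
        allo|diploid) echo 2 ;;
        *) echo "unknown kind: $1" >&2; return 1 ;;
    esac
}

# Autotetraploid sample: print the resulting genotype command.
ploidy=$(sample_ploidy auto 4)
echo "varigraph genotype --load-graph graph.bin -s samples.cfg --use-depth --sample-ploidy $ploidy"
```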
Note on GPU Acceleration
- GPU Version: varigraph also has a GPU-enabled version for faster computation if your server is equipped with GPUs.
- Usage:
  - Use --gpu to specify GPU usage. For example, --gpu 0 uses GPU 0.
  - Adjust GPU memory usage with the --buffer parameter. Smaller values consume less GPU memory.
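One way to opt into the GPU only when one is visible is sketched below. It assumes the GPU-enabled build of varigraph is installed and that nvidia-smi is present on GPU hosts, which is an assumption about the environment rather than something this README states. The helper name gpu_flag is hypothetical. The --buffer option is left out because no units or default are given here.

```shell
# gpu_flag: print "--gpu 0" when nvidia-smi lists a GPU with index 0, else nothing.
# Hypothetical helper; assumes nvidia-smi is how GPUs are detected on this host.
gpu_flag() {
    if command -v nvidia-smi >/dev/null 2>&1 &&
       nvidia-smi -L 2>/dev/null | grep -q '^GPU 0'; then
        echo "--gpu 0"
    fi
}

# Print the genotype command, with the GPU flag appended only when a GPU was found.
echo "varigraph genotype --load-graph graph.bin -s samples.cfg --use-depth $(gpu_flag)"
```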
Citation
- Du, ZZ., He, JB., et al. Varigraph: An accurate and widely applicable pangenome graph-based variant genotyper for diploid and polyploid genomes. Molecular Plant (2025)
License
MIT
