Snipar
Imputation of parental genotypes, inference of sibling IBD segments, family based GWAS, and polygenic score analyses.
snipar
<p align="center"> <img src="docs/snipar_logo.png" width="350" alt="snipar logo"> </p>
<a href="https://twitter.com/intent/follow?screen_name=alextisyoung"> <img src="https://img.shields.io/twitter/follow/alextisyoung.svg?style=social" alt="follow on Twitter"></a>
snipar (single nucleotide imputation of parents) provides a command line toolbox for family-based analyses in genetics:
- family-GWAS: Perform family-GWAS (FGWAS) with a variety of estimators. snipar can perform family-GWAS using genetic differences between siblings or parent-offspring trios, and can increase power by using imputed parental genotypes, which enables inclusion and optimal use of samples with only one parent genotyped and of samples without genotyped parents or siblings. See Simulation Exercise: family-GWAS without imputed parental genotypes and Simulation Exercise: family-GWAS with imputed parental genotypes. The regressions are performed in an efficient linear mixed model that accounts for correlations between siblings and more distant relatives.
- family-PGS analyses: Compute and analyze polygenic scores (PGS) for a set of individuals along with their siblings and parents, using both observed and imputed parental genotypes. snipar can estimate the direct effect (within-family) of a polygenic score: see Simulation Exercise: Polygenic score analyses. It can adjust for the impact of assortative mating on estimates of indirect genetic effects (effects of alleles in parents on offspring mediated through the environment) from family-based PGS analysis: see Simulation Exercise: Polygenic score analyses.
- Imputation of missing parental genotypes: For samples with at least one genotyped sibling and/or parent, but without both parents' genotypes available, snipar can impute missing parental genotypes according to Mendelian laws (Mendelian Imputation) and use these to increase power for family-GWAS and PGS analyses. See Tutorial: imputing-missing-parental-genotypes
- Identity-by-descent (IBD) segments shared by siblings: snipar implements a hidden Markov model (HMM) to accurately infer identity-by-descent segments shared between siblings. This output is needed for imputation of missing parental genotypes from siblings. See Tutorial: inferring IBD between siblings.
- Multi-generational forward simulation with indirect genetic effects and assortative mating: snipar includes a simulation module that performs forward simulation of multiple generations undergoing random and/or assortative mating. The phenotype on which assortment occurs can include indirect genetic effects from parents. Users can input phased haplotypes for the starting generation, or artificial haplotypes can be simulated. Output includes a multigenerational pedigree with phenotype values, direct and indirect genetic component values, and PLINK-formatted genotypes for the final two generations along with imputed parental genotypes. See Simulation Exercise.
- Estimate correlations between effects: Family-GWAS summary statistics include genome-wide estimates of direct genetic effects (DGEs), the within-family estimates of the effect of the allele; population effects, as estimated by standard GWAS; and non-transmitted coefficients (NTCs), the coefficients on parents' genotypes. The correlate.py script enables efficient estimation of genome-wide correlations between these different classes of effects, accounting for sampling errors. See Tutorial: correlations between effects.
The above illustrates an end-to-end workflow for performing family-GWAS in snipar, an example of which is given in the Tutorial. Not all steps are necessary for all analyses. For example, family-GWAS (and PGS analyses) can be performed without imputed parental genotypes, requiring only input genotypes in .bed or .bgen format along with pedigree information. In addition, imputation for parent-offspring pairs can proceed without IBD inference.
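To give some intuition for why a parent-offspring pair needs no IBD information, the following is a minimal sketch of the expected genotype of a missing parent at a single biallelic SNP. It is not snipar's implementation: it assumes unphased genotypes coded as allele counts (0, 1, 2), random mating, Hardy-Weinberg proportions for the missing parent, and a known allele frequency `f`. All function names are hypothetical and chosen for illustration only; the method actually used by snipar is described in the publications listed below.

```python
def transmit_prob(allele, genotype):
    """Probability that a parent with this genotype transmits `allele` (0 or 1)."""
    p_one = genotype / 2
    return p_one if allele == 1 else 1 - p_one


def child_likelihood(g_child, g_observed, g_missing):
    """P(child genotype | both parental genotypes) under Mendelian transmission."""
    total = 0.0
    for a_observed in (0, 1):
        # The child's genotype is the sum of the two transmitted alleles.
        a_missing = g_child - a_observed
        if a_missing in (0, 1):
            total += (transmit_prob(a_observed, g_observed)
                      * transmit_prob(a_missing, g_missing))
    return total


def impute_missing_parent(g_child, g_observed, f):
    """Posterior mean genotype of the missing parent (illustrative only).

    Assumes a Hardy-Weinberg prior with allele frequency f for the missing parent.
    """
    prior = [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]
    weights = [prior[g] * child_likelihood(g_child, g_observed, g)
               for g in (0, 1, 2)]
    norm = sum(weights)
    if norm == 0:
        raise ValueError("genotypes are inconsistent with Mendelian inheritance")
    return sum(g * w for g, w in zip((0, 1, 2), weights)) / norm


# Observed parent is homozygous 0 and the child is heterozygous, so the
# missing parent must have transmitted allele 1; the other allele is unknown.
print(impute_missing_parent(g_child=1, g_observed=0, f=0.5))
```

When the transmitted allele can be determined, the result is that allele plus `f` for the untransmitted one. When both the child and the observed parent are heterozygous, the child's unphased genotype carries no information about the missing parent under these assumptions, and the result falls back to the population mean `2 * f`.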
Publications
Please cite at least one of these publications if you use snipar in your work!
The methodologies implemented in snipar are described in the following publications:
Mendelian imputation of parental genotypes improves estimates of direct genetic effects.
Alexander Strudwick Young, SM Nehzati, ..., Augustine Kong.
Describes the method for imputation of missing parental genotypes and family-based GWAS with imputed parental genotypes.
🔗 Full Text
Estimation of indirect genetic effects and heritability under assortative mating.
Alexander Strudwick Young.
Describes family-PGS analysis with adjustment for the impact of assortative mating on estimates of indirect genetic effects.
🔗 Full Text
Family-GWAS reveals effects of environment and mating on genetic associations.
Tammy Tan, H Jayashankar, J Guan, SM Nehzati, M Mir, M Bennett, E Agerbo, ..., Alexander Strudwick Young.
Shows snipar applied to generate family-GWAS summary statistics from 17 different cohorts that are meta-analyzed. Describes the methodology for estimating genome-wide correlations between the different classes of effects estimated by family-GWAS.
🔗 Full Text
Family-based genome-wide association study designs for increased power and robustness.
Junming Guan, T Tan, SM Nehzati, M Bennett, P Turley, DJ Benjamin, Alexander Strudwick Young.
Describes additional family-GWAS designs: the unified estimator, which increases power for estimating direct genetic effects in homogeneous samples (typical for GWAS) by including all samples through linear imputation; and the robust estimator, which maximizes power in strongly structured or admixed samples without introducing bias. The linear mixed model used in snipar family-GWAS and PGS analyses is described here.
🔗 Full Text
Documentation
Documentation: https://snipar.rtfd.io/
It is recommended to read the guide: https://snipar.rtfd.io/en/latest/guide.html
And to work through the tutorial (https://snipar.readthedocs.io/en/latest/tutorial.html) and simulation exercise (https://snipar.readthedocs.io/en/latest/simulation.html).
Installing Using pip
snipar currently supports Python 3.7-3.9 on Linux, Windows, and Mac OSX (14.7 and higher). We recommend using a Python distribution such as Anaconda 3 (https://store.continuum.io/cshop/anaconda/).
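Before installing, it can help to confirm that the active interpreter falls within the range stated above. The following is a minimal sketch, not part of snipar; the function name and the hard-coded bounds are assumptions taken from this README and should be adjusted if the supported versions change.

```python
import sys

# Version range as stated in this README (an assumption of this sketch).
MIN_SUPPORTED = (3, 7)
MAX_SUPPORTED = (3, 9)


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies in the stated range."""
    major, minor = tuple(version or sys.version_info)[:2]
    return MIN_SUPPORTED <= (major, minor) <= MAX_SUPPORTED


if is_supported_python():
    print("Python version is within the range stated for snipar.")
else:
    print("Python version is outside the stated range; "
          "consider a separate environment with Python 3.7-3.9.")
```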
The easiest way to install is using pip:
pip install snipar
This can fail if the system's pip is outdated. You can upgrade pip using:
pip install --upgrade pip
Note: installing snipar requires the package bed_reader, which in turn requires Rust. If an error occurs at "Collecting bed-reader ...", please try installing Rust following the instructions here: https://rust-lang.github.io/rustup/installation/other.html.
Virtual Environment
You may encounter problems with the installation due to Python version incompatibility or package conflicts with your existing Python environment. To overcome this, you can try installing in a v
