Tsinfer
Infer a tree sequence from genetic variation data.
tsinfer <img align="right" width="145" height="90" src="https://raw.githubusercontent.com/tskit-dev/tsinfer/main/docs/tsinfer_logo.svg">
NOTICE
WARNING The main branch on GitHub is a development branch for the 1.0 release. It has many changes from the current 0.x series, and is largely incompatible. DO NOT USE THIS CODE for your research - it is not ready. Please use the released version of tsinfer available from PyPI and conda-forge.
README (to be updated once 1.0 stabilises)
Infer whole-genome tree sequences from genetic variation data. Tsinfer implements efficient algorithms to reconstruct ancestral haplotypes and recombination breakpoints, producing succinct tree sequences that capture shared ancestry across the genome. It scales to large cohorts and integrates cleanly with the broader tskit ecosystem for downstream statistics and analysis.
The documentation (stable • latest) contains details of how to use this software, including installation instructions.
Installation
python -m pip install tsinfer
# or
conda install -c conda-forge tsinfer
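Because the development branch is not suitable for research use, it can be worth confirming which build is actually installed. The following is a minimal sketch that uses only the Python standard library; the helper names `installed_version` and `is_dev_build` are hypothetical, and the check assumes development builds carry a PEP 440 `.dev` segment or a local `+` suffix, which is a heuristic rather than something tsinfer guarantees.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="tsinfer"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_dev_build(version_string):
    """Heuristic (an assumption): flag versions with a '.dev' segment or a
    local '+' suffix, as typically produced by installs from a git checkout."""
    return ".dev" in version_string or "+" in version_string


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("tsinfer is not installed")
    elif is_dev_build(found):
        print(f"tsinfer {found} looks like a development build")
    else:
        print(f"tsinfer {found} looks like a released version")
```

If the script reports a development build, reinstalling with one of the commands above should replace it with the released package.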
The initial algorithm, its rationale, and results from testing on simulated and real data are described in the following Nature Genetics paper:
Jerome Kelleher, Yan Wong, Anthony W Wohns, Chaimaa Fadil, Patrick K Albers and Gil McVean (2019) Inferring whole-genome histories in large population datasets. Nature Genetics 51: 1330-1338
Tsinfer versions 0.2.0 onwards allow missing data and provide a fully parameterised Li & Stephens matching algorithm (i.e. one that allows mismatch). These improvements are described in the following Science paper:
Anthony Wilder Wohns, Yan Wong, Ben Jeffery, Ali Akbari, Swapan Mallick, Ron Pinhasi, Nick Patterson, David Reich, Jerome Kelleher, and Gil McVean (2022) A unified genealogy of modern and ancient genomes. Science 375: eabi8264
Please cite either or both of these if you use tsinfer in your work. Code to reproduce the results in the first paper is present in a separate GitHub repository.
Note that tsinfer does not attempt to infer node times (i.e. branch lengths of the
inferred trees). If you require a tree sequence where the dates of common ancestors
are expressed in calendar or generation times, you should post-process the tsinfer
output using software such as tsdate.
