Mope
Infer the dynamics of mitochondrial heteroplasmy during different life stages using population genetics and phylogenetics
Install / Use
/learn @ammodramus/MopeREADME
mope
Mitochondrial Ontogenetic Phylogeny Estimation
Mope is a program for making inferences about the dynamics of mitochondrial heteroplasmy during different life stages, using approaches from population genetics and phylogenetics. See the preprint or the paper.
Requirements
Mope is written in Python and requires Python 2.7+ or Python 3.1+. The following python modules are required:
- Numpy
- Scipy
- Pandas
- H5py for HDF5 data processing
- emcee for Ensemble MCMC machinery
- future for Python 2/3 compatibility
- lru-dict for fast caching
All of these, except for emcee, future, and lru-dict, are included with the Anaconda Python distribution.
To install these dependencies, you can use pip:
# pip install -U numpy scipy pandas h5py emcee future lru-dict
Mope also requires Cython, the Python development headers and libraries, and the GNU Scientific Library development libraries. Many systems will already have Cython. On Ubuntu, the remaining libraries can be installed with the following command:
# apt-get install python-dev libgsl0-dev
Installation
To install the most up-to-date version of mope, clone this repository and
use Python and distutils:
# git clone https://github.com/ammodramus/mope
# cd mope/
# python setup.py install
This may require superuser priveleges; to install in your home directory
replace install with install --user
mope can also be installed using pip; however the version on PyPI may not
be updated as frequently. To install with pip:
# pip install mope
Again, this command may require superuser priveleges on some systems; in this case, use the command
# pip install --user mope
A successful installation will install the mope library for use in Python and
an executable script called mope, which can be used to run data analysis and
inference, perform simulations, and execute a number of other utility
functionalities. This executable script should be in the user's PATH after
installation.
To obtain example files and scripts, clone this repository rather than install
by pip. Mope is not supported on Windows.
Obtaining allele frequency transition files
Likelihood calculations with mope require precomputed allele frequency transition distributions. Mope can download these automatically:
mope download-transitions
Transition distributions can also downloaded
here
(1.1 GB,
md5).
Note that if you are downloading this file programatically (e.g., using wget or
curl), you will need to rename the downloaded file to transitions.tar.gz, due
to limitations of our filehosting service.
To generate allele frequency transition distributions locally, run
mope generate-commands
This command will generate many commands to be run in parallel so that the transition distributions can be generated more efficiently.
Usage
For usage, try
# mope run --help
or see examples/.
Format specifications
Data input
For inference with mope, allele frequency data can be provided in two formats, either as allele frequency data or as allele count data. In each case, the data takes the form of a tab-delimited table.
Required columns are the data columns, having the names of the different tissues in the ontogenetic phylogeny (and corresponding to the leaf nodes of the phylogeny) and any age columns for ages corresponding to ontogenetic phylogeny components that accumulate drift and mutation with time.
For allele frequency data, the above data columns contain the allele
frequencies. For allele count data, the data columns contain the counts of the
focal heteroplasmic allele and additional (required) coverage columns contain
the total coverage. Count columns must be named x_n, where x is the name
of a data column.
See examples/data/ for an example dataset in each format.
Ontogenetic tree file
Ontogenetic trees are specified in a modified NEWICK format. Each node requires a unique name, and a length. Only alphanumeric characters and underscores are allowed in node names.
Node lengths specify the name of the parameter pair (i.e., the genetic drift and mutation/selection parameters) associated with the branch. Optionally, this parameter name may be multiplied by an age variable, indicating that this parameter is to be interpreted as a branch length that depends on some age. (Note that this age name must be a variable in the data file.)
It is also possible to specify that the genetic drift for a certain parameter
is to be modeled as a bottleneck. This done by appending ^ to the parameter
name.
These three ways of specifying a node are demonstrated here for a node named
mother_blood:
mother_blood:blo # simple genetic drift, no dependence on age
mother_blood:blo*mother_age # rate of accumulation of drift, with mother_age
mother_blood:blo^ # mother_blood is a bottleneck
For a complete example, here is the ontogenetic phylogeny used in the original study.
(((mother_blood:blo*mother_age)mother_fixed_blood:fblo,(mother_cheek:buc*mother_age)mother_fixed_cheek:fbuc)somM:som,((((child_blood:blo*child_age)child_fixed_blood:fblo,(child_cheek:buc*child_age)child_fixed_cheek:fbuc)som1:som)loo:loo*mother_birth_age)eoo:eoo)emb;
Parameters file (simulations only)
The parameters file specifies simulation parameters. It is a
whitespace-delimited table of parameter names (first column, must match tree
file) and their values (second column). See examples/params/ for examples.
Related Skills
node-connect
342.5kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
85.3kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
342.5kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
342.5kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
