MultiPrime

multiPrime is a mismatch-tolerant tool for designing minimal primer sets for large and diverse sequence collections (e.g. viruses). A web-based version is available (test: http://multiPrime.cn)

multiPrime: version 2.1.1

Multi PCR primer pairs design processing pipeline

MultiPrime https://multiPrime.cn is a pipeline designed for broad-spectrum detection of target sequences using tNGS. It is implemented in Python and Snakemake and takes a FASTA format file as input. The pipeline has three main steps: classification by identity, primer design, and primer set combination. In the classification step, redundant sequences are removed and clusters are formed by identity. Rare sequence clusters are compared to others by average nucleotide identity, and if they are deemed similar enough, they are merged. In the primer design step, multi-alignment is performed using MUSCLE or MAFFT, and candidate primers are designed using the nearest-neighbor model. Primer pairs are selected based on PCR product length, melting temperature, dimer examination, coverage with errors, and other factors. Finally, a greedy algorithm is used to combine primer pairs into a minimal primer set according to dimer examination.
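The final step described above (greedily combining primer pairs while checking for dimers) can be illustrated with a minimal sketch. This is not multiPrime's implementation: the function names, the candidate tuples, and the crude exact-match 3'-end dimer test are all assumptions made for illustration, whereas the README states that the real dimer examination is adapted from SADDLE.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def may_dimerize(p1, p2, tail=5):
    """Hypothetical dimer test: flag a pair if the last `tail` bases of one
    primer are exactly complementary to any stretch of the other primer."""
    return (reverse_complement(p1[-tail:]) in p2
            or reverse_complement(p2[-tail:]) in p1)


def greedy_primer_set(candidates):
    """Pick at most one primer pair per cluster.

    `candidates` is a list of (cluster_id, forward, reverse) tuples, assumed
    to be sorted from most to least preferred. A pair is accepted only if
    neither primer may dimerize with itself, its partner, or any primer
    that has already been accepted.
    """
    chosen = {}
    pool = []
    for cluster, fwd, rev in candidates:
        if cluster in chosen:
            continue  # this cluster is already covered
        new = [fwd, rev]
        clash = any(may_dimerize(a, b) for a in new for b in pool + new)
        if not clash:
            chosen[cluster] = (fwd, rev)
            pool.extend(new)
    return chosen


# Made-up candidate pairs for two clusters.
candidates = [
    ("c1", "AGCTTGACCATGCAC", "TGGACTCCTAGAACT"),
    ("c2", "CAAGTGCATGGTTCA", "TCCGATAGGCTTACA"),  # forward clashes with c1
    ("c2", "CATCCTACGAGACTA", "GGTACAATCCGTCTT"),
]
selected = greedy_primer_set(candidates)
for cluster, (fwd, rev) in selected.items():
    print(cluster, fwd, rev)
```

In this toy input the first c2 pair is skipped because its forward primer contains the reverse complement of the 3' end of the c1 forward primer, so the second c2 pair is selected instead.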

If you only require primer design without primer set combination, you may use the primer design module of MultiPrime. It is available as scripts/multiPrime-core.py, or through pip install multiPrime (version >=2.4.8) using the DPrime function.

multi-DegePrime: Degenerate primer design by DEGEPRIME (MC-DPD).

multiPrime-original: Degenerate primer design by multiPrime-core (MC-EDPD or MC-DPD). It allows mismatches to be avoided in the 3' end region.

multiPrime: An update of multiPrime-original. The new multiPrime allows mismatches to be avoided at any position, making it flexible for experienced users.
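To make the idea of tolerating mismatches while protecting the 3' end concrete, here is a minimal sketch. It is not taken from multiPrime-core: the function names, the default thresholds (`max_mismatch`, `protected_3prime`), and the example sequences are assumptions. The only external fact used is the standard IUPAC degenerate-base table. The `site` argument is assumed to be the target sequence written in the same orientation as the primer.

```python
# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(primer, site):
    """Return 0-based positions where `site` is not covered by the
    (possibly degenerate) `primer`."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return [i for i, (p, s) in enumerate(zip(primer, site))
            if s not in IUPAC[p]]


def primer_binds(primer, site, max_mismatch=1, protected_3prime=4):
    """Accept a site if it has at most `max_mismatch` mismatches and none
    of them fall within the last `protected_3prime` bases of the primer."""
    mismatches = mismatch_positions(primer, site)
    first_protected = len(primer) - protected_3prime
    in_3prime = [i for i in mismatches if i >= first_protected]
    return len(mismatches) <= max_mismatch and not in_3prime


primer = "ATGCRYACGT"  # R = A/G, Y = C/T
print(primer_binds(primer, "ATGCACACGT"))  # covered by the degenerate bases
print(primer_binds(primer, "ATCCGTACGT"))  # one mismatch, outside the 3' end
print(primer_binds(primer, "ATGCACACGA"))  # mismatch at the last 3' base
```

Generalising `protected_3prime` to an arbitrary set of protected positions would correspond to the "any position" flexibility described for the new multiPrime, though how the tool actually exposes that option is not shown here.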

The scripts and pipelines provided in this repository help to design multiplex PCR primers and return a minimal primer set for multi-PCR. The repository contains all scripts needed for self-assembled processing and additionally provides pipeline scripts that run the entire process automatically.

We have provided a video tutorial to assist you with the installation and usage of multiPrime.

Why multiPrime

MultiPrime is a user-friendly and one-step tool for designing tNGS primer sets (cluster-specific primers or ultra multiplex PCR).
It integrates degenerate primer design theory with mismatch handling, resulting in improved accuracy and specificity in detecting broad spectrum sequences.
It outperformed conventional programs in terms of run time, primer number, and primer coverage.
The versatility of multiPrime is highlighted by its potential application in detecting single or multiple genes, exons, antisense strands, RNA, or other specific DNA segments.

Citation

If you find multiPrime helpful for your research or project, we kindly request that you cite the following publication:

Xia, Han et al. 2023. "MultiPrime: A Reliable and Efficient Tool for Targeted Next-Generation Sequencing." iMeta e143. https://doi.org/10.1002/imt2.143

Citing the publication will acknowledge the contribution of multiPrime to your work and help us in further development and improvement of the tool. Thank you for your support!

Requirements

To run this pipeline, your computer requires 30 GB of available memory (RAM) to process a large number of sequences (e.g. 1,000,000). Note: we do not recommend input sequences longer than 100K. If such sequences are necessary, set Maxseq in the YAML file as small as possible, but not smaller than 200. Alternatively, consider using conserved genes/regions instead of whole genomes. Snakemake is used to automate the execution of all analysis steps. The easiest way to use the pipeline is to set up a Python 3.9 virtual environment and run the pipeline in it.
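As a pre-check for the length recommendation above, a small script can separate over-long records from the rest before they reach the pipeline. This is a sketch using only the standard library, not part of multiPrime. It assumes that "100K" means 100,000 bases, and the helper names and file names in the usage note are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(records, max_len=100_000):
    """Split records into (kept, too_long) using `max_len` as the cutoff.

    The default of 100,000 is an assumed reading of the "100K" guideline.
    """
    kept, too_long = [], []
    for header, seq in records:
        target = kept if len(seq) <= max_len else too_long
        target.append((header, seq))
    return kept, too_long


# Tiny in-memory example with an artificially small cutoff.
example = [">seq1", "ACGT", "ACGT", ">seq2", "ACGTACGTACGTACGT"]
kept, too_long = split_by_length(read_fasta(example), max_len=10)
print([h for h, _ in kept], [h for h, _ in too_long])
```

For a real file, the same functions can be applied to an open file handle, for example `with open("input.fa") as fh: kept, too_long = split_by_length(read_fasta(fh))`, where `input.fa` is a placeholder name.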

Download/Provide all necessary files (The requirements file is all set. Simply clone this repository and install via conda.):

Comparison:

DEGEPRIME-1.1.0 (multi-DegePrime): DOI: 10.1128/AEM.01403-14; Please cite: "DegePrime, a program for degenerate primer design for broad-taxonomic-range PCR in microbial ecology studies." Links: https://github.com/EnvGen/DegePrime. Please move this directory into scripts.

mfeprimer-3.2.6: DOI: 10.1093/nar/gkz351; Please cite: "MFEprimer-3.0: quality control for PCR primers." Please move it into scripts and make mfeprimer-3.2.6 executable.

Programs we employed:

biopython: Not required in v1.0.1 and later versions.

The method for calculating Tm values in this study is a slightly modified version of the one in primer3-py. Reference papers: Owczarzy et al., 2004; Owczarzy et al., 2008.

The method for calculating deltaG in this study is a slightly modified version of the approach proposed by Martin et al., 2020. "Base-Pairing and Base-Stacking Contributions to Double-Stranded DNA Formation."

The method for dimer examination in this study is a slightly modified version of the approach proposed by Xie et al., 2022. "Designing highly multiplex PCR primer sets with Simulated Annealing Design using Dimer Likelihood Estimation (SADDLE)"

MUSCLE: Already listed in requirement.txt. version=v3.8.1551. http://www.drive5.com/muscle This software is donated to the public domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

MAFFT: Already listed in requirement.txt. version=v7.508 (2022/Sep/07). Please cite: "MAFFT multiple sequence alignment software version 7: improvements in performance and usability".

fastANI: Already listed in requirement.txt. version=1.33. Please cite: "FastANI, Mash and Dashing equally differentiate between Klebsiella species."

blast+: Already listed in requirement.txt. version=2.13.0+. Links: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastNews.

bowtie2: Already listed in requirement.txt. version=2.2.5. DOI:10.1038/nmeth.1923; Please cite: "Fast gapped-read alignment with Bowtie 2." Links: https://www.nature.com/articles/nmeth.1923

Installation and Snakemake

Snakemake is a workflow management system that helps to create and execute data processing pipelines. It requires Python 3. The dependent environment (shared by multi-DegePrime, multiPrime-original and multiPrime) is most easily installed via the bioconda channel of the Anaconda Python distribution.

conda create -n multiPrime -c bioconda -c conda-forge --file requirement.txt

If conflicts occur:

conda create -n multiPrime -c bioconda -c conda-forge --file requirement2.txt

Alternatively, copy multiPrime.tar.gz from the ENV directory to your Conda environment directory and unpack it there (replace /path/to/your/conda/env with your own path):

cp ENV/multiPrime.tar.gz /path/to/your/conda/env/
cd /path/to/your/conda/env/
tar -xzvf multiPrime.tar.gz
conda activate multiPrime

Activate and exit the environment

To activate the environment

source activate multiPrime

To exit the environment (after you have finished using the pipeline), run

source deactivate

Run the pipeline

Configure input parameters

The working directory contains files named multi-DegePrime.yaml, multiPrime-original.yaml and multiPrime.yaml. These are the central files in which all user settings, parameter values and path specifications are stored. multi-DegePrime.yaml employs DEGEPRIME-1.1.0 for maximum coverage degenerate primer design (MC-DPD); multiPrime-original.yaml and multiPrime.yaml use multiPrime-core.py for MC-DPD or MC-DPD with error. During a run, all steps of the pipeline retrieve their parameter values from these files. They follow the YAML syntax (see the YAML documentation for more on the format), which makes them easy to read and edit. The main principles are:

everything that comes after a # symbol is treated as a comment and will not be interpreted
parameters are given as key-value pairs, with the key being the name and the value being the value of the parameter

Before starting the pipeline, open the multiPrime.yaml configuration file and set all options as required. This should at least include:

name of the input directory - where your input FASTA files are stored - input_dir: ["abs_path_to_input_dir"]
name of the output directory - where the pipeline should store the output files (the directory is created if it does not exist) - results_dir: ["abs_path_to_results_dir"]
name of the log directory - where the pipeline should store the log files - log_dir: ["abs_path_to_log_dir"]
name of the scripts directory - where the pipeline scripts are stored - scripts_dir: ["abs_path_to"]/multiPrime/scripts
name(s) of your input samples - please note: if your sample is named sample1.fa, then sample1 will be kept as the naming scheme throughout the entire run to indicate output files that belong to this input file, e.g. the pipeline will create a file called sample1.fa. If you have multiple input files, just follow the given pattern with one sample name per line (and a dash that indicates another list item).
identity - threshold for classification. Please note: if you set it to 1, multiPrime will design candidate primer pairs for each FASTA record in the input files. Suggestion: 0.7-0.8.
others - for more information on the parameters, please refer to the YAML file.

Start a run

Once you set up your configuration file, running the pipeline locally on your computer is as easy as invoking:

sh run.sh

To run maximal coverage degenerate primer design (MC-DPD), which employs DegePrime to design degenerate primers for the target sequences, invoke:

snakemake --configfile multi-DegePrime.yaml -s multi-DegePrime.py --cores 10 --resources disk_mb=80000
