DiTing
DiTing: A pipeline to infer and compare biogeochemical pathways in metagenomic data
[!NOTE] 🚀 Major Update (v2.0): DiTing has been boosted to version 2! Compared to the initially published version, this release introduces several significant upgrades.
- Rewrite using Snakemake: The entire pipeline has been rewritten using Snakemake, providing robust workflow management, better parallelization, and the ability to resume execution from breakpoints.
- Upgrade annotation engine (kofamscan): We have replaced the manual hmmsearch parsing system with the standardized kofamscan engine for KEGG annotations. This effectively resolves the frequent parsing errors and compatibility issues previously encountered with raw hmmsearch outputs.
- Bioconda: DiTing is now available on Bioconda, making it easier to install and manage dependencies.
- Check out the new version in this fork.
Etymology
DiTing is a creature from Chinese mythology who knows everything when he puts his ear to the ground. In the same spirit, this program is developed to recognize biogeochemical cycles from environmental omic data accurately and efficiently.
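When DiTing (谛听) crouches on the ground, in an instant he can survey the mountains, rivers, and lands of the four great continents and the grotto-heavens and blessed places, and among the naked, scaly, furry, feathered, and insect creatures and the heavenly, earthly, divine, human, and ghostly immortals he can distinguish good from evil and tell the wise from the foolish.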
Citation
To cite DiTing please use
Xue CX, Lin H, Zhu XY, Liu J, Zhang Y, Rowley G, Todd JD, Li M, Zhang XH. DiTing: A Pipeline to Infer and Compare Biogeochemical Pathways From Metagenomic and Metatranscriptomic Data. Front Microbiol. 2021 Aug 2;12:698286. doi: 10.3389/fmicb.2021.698286.
Introduction
DiTing is designed to determine the relative abundance of metabolic and biogeochemical functional pathways in a set of metagenomic/metatranscriptomic data. The input is expected to be a folder containing a group of paired-end clean reads. These reads are assembled, annotated, and parsed to produce a table of the relative abundance of elemental/biogeochemical cycling pathways (e.g., nitrogen, carbon, sulfur) in each sample. Sketch maps and heatmaps are also produced so that biogeochemical functions can be compared visually.
Procedure

Dependencies
- Megahit
- SPAdes
- Prodigal
- bwa
- BBMap
- HMMER3
- python3
- Python modules:
- KofamKOALA hmm database (ftp://ftp.genome.jp/pub/db/kofam/)
  - ko_list.gz (ftp://ftp.genome.jp/pub/db/kofam/ko_list.gz)
  - profiles.tar.gz (ftp://ftp.genome.jp/pub/db/kofam/profiles.tar.gz)
Installation
Recommended configuration:
CPU threads ≥ 8
RAM ≥ 64 GB
Option 1: Conda (recommended)
Configure conda environment
# order matters
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda config --add channels silentgene
Set up a DiTing environment
conda create -n diting-env
Activate diting-env and install DiTing program
conda activate diting-env
conda install -c silentgene diting
Deactivate diting-env
conda deactivate
Option 2: Repository from GitHub
Step 1. Download main scripts
git clone https://github.com/xuechunxu/DiTing.git
Alternatively, click the green Clone or download button, select Download ZIP, and unzip the repository manually.
Step 2. Download databases
DiTing requires the KofamKOALA hmm database. This database is downloaded and unzipped automatically on the first run.
You can also download the database manually. It should be stored in the same directory as the diting.py script.
# In the home directory of this program
mkdir kofam_database
cd kofam_database
wget -c ftp://ftp.genome.jp/pub/db/kofam/ko_list.gz
wget -c ftp://ftp.genome.jp/pub/db/kofam/profiles.tar.gz
gzip -d ko_list.gz
tar zxvf profiles.tar.gz
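After a manual download it can help to confirm that the expected files are in place before the first run. The sketch below assumes the layout produced by the commands above: a kofam_database folder holding the decompressed ko_list file and a profiles directory of .hmm files extracted from profiles.tar.gz. The function name check_kofam_database is hypothetical and is not part of DiTing.

```python
from pathlib import Path


def check_kofam_database(db_dir):
    """Return a list of problems found in a manually downloaded database folder.

    Assumed layout (an assumption, not confirmed by DiTing itself):
      <db_dir>/ko_list          decompressed from ko_list.gz
      <db_dir>/profiles/*.hmm   extracted from profiles.tar.gz
    """
    db = Path(db_dir)
    problems = []
    if not (db / "ko_list").is_file():
        problems.append("missing ko_list (run: gzip -d ko_list.gz)")
    profiles = db / "profiles"
    if not profiles.is_dir():
        problems.append("missing profiles/ (run: tar zxvf profiles.tar.gz)")
    elif not any(profiles.glob("*.hmm")):
        problems.append("profiles/ contains no .hmm files")
    return problems


if __name__ == "__main__":
    # Run from the home directory of the program, next to diting.py.
    for problem in check_kofam_database("kofam_database"):
        print("WARNING:", problem)
```

An empty list means the folder matches the assumed layout; it does not verify that the downloads are complete.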
Step 3. Install the Dependencies
The dependencies must be installed and available on the system $PATH.
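A quick way to see which dependencies are visible on $PATH is sketched below. The executable names are assumptions based on each tool's usual command name (for example, pileup.sh stands in for BBMap and hmmsearch for HMMER3); DiTing may call different executables, so adjust the list to the local installation.

```python
import shutil

# Assumed executable names, one per dependency listed in this README.
REQUIRED_TOOLS = ["megahit", "spades.py", "prodigal", "bwa", "pileup.sh", "hmmsearch"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current $PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on $PATH:", ", ".join(missing))
    else:
        print("All listed dependencies were found on $PATH.")
```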
Running
1. One step running
diting.py -r <clean_reads_dir> -o <output_dir>
diting.py -r <clean_reads_dir> -a <metagenomic_assembly> -o <output_dir>
Example reads run:
# Download the example reads
Google Drive:
URL: https://drive.google.com/file/d/132605rtKuA-Xx--eh3aC7i5WIExNWl5k/view?usp=sharing
After downloading, run:
unzip Clean-reads_interleaved.zip
Or, if you are in China, you can download from Baiduyun:
URL: https://pan.baidu.com/s/1gFtJnz1G3pdEqBSFnUqFJw
Password: diti
# Run the example
diting.py -r Clean-reads_interleaved -o Clean-reads_interleaved.diting.out
The input is the <clean_reads_dir> folder containing a group of paired-end metagenomic clean reads, which looks like:
sample_one_1.fastq
sample_one_2.fastq
sample_two_1.fastq
sample_two_2.fastq
sample_three_1.fastq
sample_three_2.fastq
The paired-end metagenomic clean reads should end with .fq, .fq.gz, .fastq, or .fastq.gz.
Interleaved reads are also supported.
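To preview how file names in a reads folder map to samples, the naming rules above can be sketched as follows. This is an illustration only: the pattern and the group_reads helper are hypothetical, and DiTing's own parsing may differ. It assumes that separated mates end in _1 or _2 before the extension and that any other recognised file is interleaved.

```python
import re
from pathlib import Path

# Extensions come from this README; the _1/_2 suffix follows the example listing.
READ_PATTERN = re.compile(
    r"^(?P<sample>.+?)(?:_(?P<mate>[12]))?\.(?:fq|fastq)(?:\.gz)?$"
)


def group_reads(filenames):
    """Map each sample name to its files, keyed by mate ("1", "2") or "interleaved"."""
    samples = {}
    for name in sorted(filenames):
        match = READ_PATTERN.match(Path(name).name)
        if match is None:
            continue  # not a recognised read file
        mate = match.group("mate") or "interleaved"
        samples.setdefault(match.group("sample"), {})[mate] = name
    return samples


if __name__ == "__main__":
    listing = ["sample_one_1.fastq", "sample_one_2.fastq", "sample_two_1.fastq"]
    for sample, files in group_reads(listing).items():
        if "1" in files and "2" not in files:
            print(f"{sample}: mate 2 is missing")
```

Note that under this assumed pattern an interleaved file whose name happens to end in _1 or _2 would be read as a single mate.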
2. Optional parameter
2.1 --spades
Use metaSPAdes instead of MEGAHIT to assemble reads.
Consider setting a memory limit with -m when using SPAdes as the assembler.
-m (--memory) <int> default: 50 (in GB)
2.2 -a (--assembly) metagenomic assembly
Path to a folder containing metagenomic assemblies corresponding to the provided reads. Each assembly is expected to have the same base name as its reads. The reads will not be assembled when this parameter is used.
python diting.py -r <clean_reads_dir> -a <metagenomic_assembly> -o <output_dir>
The <metagenomic_assembly> folder looks like:
sample_one.fa
sample_two.fa
sample_three.fa
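Because each assembly is expected to share its base name with the reads, a mismatch is easy to check before a long run. The helper below is a hypothetical sketch, not DiTing code; it assumes the sample names have already been derived from the read files and that assemblies use a single extension such as .fa.

```python
from pathlib import Path


def unmatched_samples(read_samples, assembly_files):
    """Compare sample names from the reads with assembly base names.

    Returns (samples without an assembly, assemblies without reads).
    """
    assemblies = {Path(name).stem for name in assembly_files}
    samples = set(read_samples)
    return sorted(samples - assemblies), sorted(assemblies - samples)


if __name__ == "__main__":
    no_assembly, no_reads = unmatched_samples(
        ["sample_one", "sample_two", "sample_three"],
        ["sample_one.fa", "sample_two.fa"],
    )
    print("Samples without an assembly:", no_assembly)
    print("Assemblies without reads:", no_reads)
```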
2.3 Using interleaved paired-end reads
DiTing supports interleaved paired-end fastq files. Note that all read files must be of the same type: either all interleaved or all separated.
e.g. <clean_reads_dir> content:
samples1.fq.gz
samples2.fq.gz
samples3.fq.gz
samples4.fq.gz
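If it is unclear whether a file is really interleaved, a rough heuristic is to compare the read names of consecutive records. The sketch below is not how DiTing detects the read type; it assumes four-line FASTQ records and mate names that are identical apart from an optional /1 or /2 suffix. The function names are hypothetical.

```python
import gzip
import itertools


def read_id(header):
    """Return the read name from a FASTQ header line, without a /1 or /2 suffix."""
    parts = header[1:].split()
    name = parts[0] if parts else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def looks_interleaved(path, pairs_to_check=100):
    """Heuristically test whether mates are stored as consecutive records."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # With four lines per record, header lines sit at line indices 0, 4, 8, ...
        headers = itertools.islice(handle, 0, pairs_to_check * 8, 4)
        ids = [read_id(line) for line in headers]
    if not ids or len(ids) % 2:
        return False
    return all(first == second for first, second in zip(ids[0::2], ids[1::2]))
```

Only the first pairs_to_check pairs are inspected, so a True result is an indication rather than a guarantee.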
2.4 -n (--threads) number of threads
Number of threads to run (default: 4)
diting.py -r <clean_reads_dir> -a <metagenomic_assembly> -o <output_dir> -n 20
2.5 --noclean
The SAM files are retained when this flag is used.
diting.py -r <clean_reads_dir> -a <metagenomic_assembly> -o <output_dir> -n 12 --noclean
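When DiTing is launched from another script, the options described so far can be combined programmatically. The sketch below only assembles an argument list from the flags documented in this README (-r, -o, -a, -n, --spades, -m, --noclean); the function build_diting_command and its parameter names are hypothetical.

```python
import shlex


def build_diting_command(reads_dir, output_dir, assembly_dir=None, threads=None,
                         use_spades=False, memory_gb=None, keep_sam=False):
    """Assemble a diting.py argument list from the options described in this README."""
    cmd = ["diting.py", "-r", reads_dir, "-o", output_dir]
    if assembly_dir is not None:
        cmd += ["-a", assembly_dir]
    if threads is not None:
        cmd += ["-n", str(threads)]
    if use_spades:
        cmd.append("--spades")
    if memory_gb is not None:
        cmd += ["-m", str(memory_gb)]
    if keep_sam:
        cmd.append("--noclean")
    return cmd


cmd = build_diting_command("Clean-reads_interleaved",
                           "Clean-reads_interleaved.diting.out",
                           threads=12, keep_sam=True)
# Print the command for inspection before running it.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once diting.py is on $PATH.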
2.6 -vis (--visualization) pathways_relative_abundance.tab
Visualization can also be executed independently, which allows users to adjust the final result table (e.g., merge some similar samples) before the visualization.
diting.py -vis <pathways_relative_abundance.tab>
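As an example of adjusting the table before visualization, the sketch below averages groups of sample columns. It rests on an assumption about the file format that should be checked against a real pathways_relative_abundance.tab: a tab-delimited table whose first column is the pathway name and whose remaining columns are numeric values, one per sample. Averaging is only one possible way to merge samples, and merge_samples is a hypothetical helper.

```python
import csv
import io


def merge_samples(table_text, groups):
    """Average groups of sample columns in a tab-delimited abundance table.

    groups maps a new column name to the existing sample columns to average.
    Sample columns that are not named in any group are dropped.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    index = {name: i for i, name in enumerate(header)}
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([header[0]] + list(groups))
    for row in reader:
        if not row:
            continue  # skip blank lines
        merged = []
        for members in groups.values():
            values = [float(row[index[m]]) for m in members]
            merged.append(sum(values) / len(values))
        writer.writerow([row[0]] + merged)
    return out.getvalue()
```

The returned text can be written to a new .tab file and passed to diting.py -vis.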
3. Output
3.1 Table
- pathways_relative_abundance.tab: The final result with the relative abundance of pathways in each sample.
- ko_abundance_among_samples.tab: A table with the relative abundance of each k_number of KEGG annotation is produced in the KEGG_annotation folder.
3.2 Visualization
- carbon_cycle_sketch.png, nitrogen_cycle_sketch.png, DMSP_cycle_sketch.png and sulfur_cycle_sketch.png: Sketch maps regarding carbon, nitrogen and sulfur cycles
- carbon_cycle_heatmap.pdf, nitrogen_cycle_heatmap.pdf, sulfur_cycle_heatmap.pdf and other_cycle_heatmap.pdf: Heatmaps regarding carbon, nitrogen, sulfur cycles and other pathways
Example:
The sketch looks like:
<img src="./example/diting.out/sketch.png" width="792" height="624">
The heatmap looks like:
<img src="./example/diting.out/heatmap.png" width="792" height="627">
Copyright
Xue Chunxu, xuechunxu (at) outlook.com
Heyu Lin, heyu.lin (at) qut.edu.au
Xiaoyu Zhu, xiaoyuzhu321 (at) 126.com
Xiao-Hua Zhang, xhzhang (at) ouc.edu.cn
Lab of Microbial Oceanography
College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
