DeepChopper
Genomic Language Model Mitigates Chimera Artifacts in Nanopore Direct RNA Sequencing
<img src="./docs/logo.webp" alt="logo" height="100"/> DeepChopper 
🧬 DeepChopper leverages a language model to accurately detect and chop artificial sequences that may cause chimeric reads, ensuring higher quality and more reliable sequencing results. By integrating seamlessly with existing workflows, DeepChopper provides a robust solution for researchers and bioinformaticians working with Nanopore direct-RNA sequencing data.
✨ What's New in v1.3.0
- 🚀 Direct FASTQ Processing: No more encoding step! DeepChopper now works directly with FASTQ files
- ⚡ Simplified Workflow: Go from raw data to results in just 2 commands (`predict` → `chop`)
- 📦 Auto-format Detection: Automatically handles `.fastq`, `.fq`, `.fastq.gz`, and `.fq.gz` files
- ⚠️ Breaking Change: The `encode` command has been removed; update your pipelines accordingly
📘 FEATURED: We provide a comprehensive tutorial that includes an example dataset in our full documentation.
🚀 Quick Start: Try DeepChopper Online
Experience DeepChopper instantly through our user-friendly web interface. No installation required! Launch the web application to start exploring DeepChopper's capabilities.
What you can do online:
- 📤 Upload your sequencing data
- 🔬 Run DeepChopper's analysis
- 📊 Visualize results
- 🎛️ Experiment with different parameters
Perfect for quick tests or demonstrations! However, for extensive analyses or custom workflows, we recommend installing DeepChopper locally.
⚠️ Note: The online version is limited to one FASTQ record at a time and may not be suitable for large-scale projects.
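Because the online version accepts a single FASTQ record, it can help to pull one record out of a larger file before trying the demo. The following is a minimal sketch, not part of DeepChopper: the helper names `open_fastq` and `first_record` are hypothetical, and it assumes the common four-line FASTQ layout with no wrapped sequence lines.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return path.open("r")


def first_record(path):
    """Return the first record as one string: header, sequence, '+', qualities.

    Assumption: each record spans exactly four lines (no wrapped sequences).
    """
    with open_fastq(path) as handle:
        lines = [handle.readline() for _ in range(4)]
    # readline() returns '' at end of file, so a short file fails this check.
    if not all(lines) or not lines[0].startswith("@") or not lines[2].startswith("+"):
        raise ValueError(f"{path} does not start with a four-line FASTQ record")
    return "".join(lines)
```

Calling `first_record("input.fastq")` returns text that can be pasted or saved as a one-record file for the web interface.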
📦 Installation
DeepChopper can be installed using pip, the Python package installer. Follow these steps to install:
- Ensure you have Python 3.10 or later installed on your system.
- Create a virtual environment (recommended): run `python -m venv deepchopper_env`, then `source deepchopper_env/bin/activate` (on Windows, use `deepchopper_env\Scripts\activate`).
- Install DeepChopper: `pip install deepchopper`
- Verify the installation: `deepchopper --help`
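The prerequisites in the steps above can also be checked from Python before running anything heavier. This is a rough sketch using only the standard library; the function names are hypothetical, and it only checks the Python version and whether a `deepchopper` executable is visible on `PATH`, not whether the installation itself is healthy.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the installation steps


def python_is_supported(version_info=sys.version_info):
    """True if the given (major, minor, ...) version is 3.10 or later."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def cli_on_path(name="deepchopper"):
    """True if an executable with this name can be found on PATH."""
    return shutil.which(name) is not None


def describe_environment():
    """Return two human-readable findings: one for Python, one for the CLI."""
    findings = []
    version = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        findings.append(f"Python {version} meets the 3.10+ requirement")
    else:
        findings.append(f"Python {version} is older than 3.10")
    if cli_on_path():
        findings.append("deepchopper command found on PATH")
    else:
        findings.append("deepchopper command not found; is the virtual environment active?")
    return findings
```

Printing each entry of `describe_environment()` gives a quick summary to include when opening an issue.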
Compatibility and Support
DeepChopper is designed to work across various platforms and Python versions. The compatibility matrix for PyPI installations is shown below:
PyPI Support
| Python Version | Linux x86_64 | macOS Intel | macOS Apple Silicon | Windows x86_64 |
| :------------: | :----------: | :---------: | :-----------------: | :------------: |
|      3.10      |      ✅      |     ✅      |         ✅          |       ✅       |
|      3.11      |      ✅      |     ✅      |         ✅          |       ✅       |
|      3.12      |      ✅      |     ✅      |         ✅          |       ✅       |
🆘 Trouble installing? Check our Troubleshooting Guide or open an issue.
🛠️ Usage
For a comprehensive guide, check out our full tutorial. Here's a quick overview:
Command-Line Interface
🎉 New in v1.3.0: DeepChopper now works directly with FASTQ files! No encoding step required.
DeepChopper offers two main commands: predict and chop.
- Predict chimera artifacts directly from FASTQ: `deepchopper predict input.fastq --output predictions`

  Using GPUs? Add the `--gpus` flag: `deepchopper predict input.fastq --output predictions --gpus 2`

  Supports all FASTQ formats: `.fastq`, `.fq`, `.fastq.gz`, `.fq.gz`

- Chop chimera artifacts: `deepchopper chop predictions/0 input.fastq`
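For scripted pipelines, the two commands can be chained from Python with `subprocess`. The sketch below is one possible wrapper, not an official API: `build_commands` and `run_pipeline` are hypothetical names, it uses only the flags shown above (`--output`, `--gpus`), and it assumes the predictions land in a `0` subfolder of the output directory, as in the `predictions/0` example.

```python
import subprocess
from pathlib import Path


def build_commands(fastq, output_dir="predictions", gpus=None):
    """Build the predict and chop command lines as argument lists."""
    predict = ["deepchopper", "predict", str(fastq), "--output", str(output_dir)]
    if gpus is not None:
        predict += ["--gpus", str(gpus)]
    # Assumption: the "0" subfolder mirrors the `predictions/0` example above.
    prediction_dir = (Path(output_dir) / "0").as_posix()
    chop = ["deepchopper", "chop", prediction_dir, str(fastq)]
    return [predict, chop]


def run_pipeline(fastq, output_dir="predictions", gpus=None):
    """Run predict, then chop; stop with CalledProcessError if either fails."""
    for command in build_commands(fastq, output_dir, gpus):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it easy to log or inspect the exact commands before calling `run_pipeline("input.fastq")`.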
Want a GUI? Launch the web interface with `deepchopper web` (note: limited to one FASTQ record at a time).
Python Library
Integrate DeepChopper into your Python scripts:

    import deepchopper

    model = deepchopper.DeepChopper.from_pretrained("yangliz5/deepchopper")
    # Your analysis code here

📚 Cite
If DeepChopper aids your research, please cite our paper:
@article{li2026genomic,
title = {Genomic Language Model Mitigates Chimera Artifacts in Nanopore Direct {{RNA}} Sequencing},
author = {Li, Yangyang and Wang, Ting-You and Guo, Qingxiang and Ren, Yanan and Lu, Xiaotong and Cao, Qi and Yang, Rendong},
date = {2026-01-19},
journaltitle = {Nature Communications},
shortjournal = {Nat Commun},
publisher = {Nature Publishing Group},
issn = {2041-1723},
doi = {10.1038/s41467-026-68571-5},
url = {https://www.nature.com/articles/s41467-026-68571-5},
urldate = {2026-01-20}
}
🤝 Contribution
We welcome contributions! Here's how to set up your development environment:
Build Environment

    git clone https://github.com/ylab-hi/DeepChopper.git
    cd DeepChopper

    # Install dependencies
    uv sync

    # Run DeepChopper
    uv run deepchopper --help

🎉 Ready to contribute? Check out our Contribution Guidelines to get started!
🔗 Related Projects
- ChimeraLM - Identify artificial chimeric reads from whole genome amplification (WGA) processes
📬 Support
Need help? Have questions?
- 📖 Check our Documentation
- 🐛 Report issues
DeepChopper is developed with ❤️ by the YLab team. Happy sequencing! 🧬🔬
