Dorado
Oxford Nanopore's Basecaller
Dorado is a high-performance, easy-to-use, open source analysis engine for Oxford Nanopore reads.
Detailed information about Dorado and its features is available in the Dorado Documentation.
Features
- One executable with sensible defaults, automatic hardware detection and configuration.
- Runs on Apple silicon (M series) and Nvidia GPUs including multi-GPU with linear scaling (see Platforms).
- Modified basecalling.
- Duplex basecalling (watch the following video for an introduction to Duplex).
- Simplex barcode classification.
- Support for aligned read output in SAM/BAM.
- Initial support for poly(A) tail estimation.
- Support for single-read error correction.
- POD5 support for highest basecalling performance (documentation).
- Based on libtorch, the C++ API for PyTorch.
- Multiple custom optimisations in CUDA and Metal for maximising inference performance.
If you encounter any problems building or running Dorado, please report an issue.
Installation
First, download the relevant installer for your platform:
- dorado-1.4.0-linux-x64
- dorado-1.4.0-linux-arm64-cuda12 - Orin only
- dorado-1.4.0-linux-arm64-cuda13 - Jetson Thor / DGX Spark
- dorado-1.4.0-osx-arm64
- dorado-1.4.0-win64
Once the relevant .tar.gz or .zip archive is downloaded, extract the archive to your desired location.
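As a rough sketch of how the list above maps to a given machine, the archive name can be derived from the operating system and CPU architecture. The function name dorado_archive_name is hypothetical, and the uname values it matches are assumptions about typical systems. Windows is left out because uname is not generally available there.

```shell
# Hypothetical helper: map OS, CPU architecture and (for Linux arm64 only)
# the CUDA major version to one of the archive names listed above.
dorado_archive_name() {
    version="$1" os="$2" arch="$3" cuda="${4:-}"
    case "$os/$arch" in
        Linux/x86_64)
            echo "dorado-$version-linux-x64" ;;
        Linux/aarch64|Linux/arm64)
            case "$cuda" in
                12|13) echo "dorado-$version-linux-arm64-cuda$cuda" ;;
                *) echo "specify CUDA 12 (Orin) or 13 (Jetson Thor / DGX Spark)" >&2
                   return 1 ;;
            esac ;;
        Darwin/arm64)
            echo "dorado-$version-osx-arm64" ;;
        *)
            echo "no archive listed for $os/$arch" >&2
            return 1 ;;
    esac
}

# Derive the name for the current machine (errors are reported, not fatal).
dorado_archive_name 1.4.0 "$(uname -s)" "$(uname -m)" || true
```

On Linux arm64 the CUDA major version has to be passed as a fourth argument, since uname alone cannot tell an Orin from a Jetson Thor or DGX Spark.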
You can then call Dorado using the full path, for example:
/path/to/dorado-x.y.z-linux-x64/bin/dorado basecaller hac pod5s/ > calls.bam
Or you can add the bin path to your $PATH environment variable, and run with the dorado command instead, for example:
dorado basecaller hac pod5s/ > calls.bam
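A minimal sketch of the $PATH change for a POSIX shell follows. DORADO_HOME is a variable name chosen here for illustration, and its value is the placeholder path from the example above, not a real directory.

```shell
# Placeholder install location; replace with the directory you extracted to.
DORADO_HOME="/path/to/dorado-x.y.z-linux-x64"

# Prepend the bin directory so the dorado executable is found first.
export PATH="$DORADO_HOME/bin:$PATH"

# Show where the shell now resolves dorado, or print a notice if it cannot.
command -v dorado || echo "dorado not found on PATH; check DORADO_HOME" >&2
```

Adding the export line to your shell start-up file (for example ~/.bashrc or ~/.zshrc) keeps the setting across sessions.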
Please visit the dorado documentation for more information on getting started.
See DEV.md for details about building Dorado for development.
Platforms
Dorado is heavily optimised for Nvidia A100 and H100 GPUs and will deliver maximal performance on systems with these GPUs.
Dorado has been tested extensively and is supported on the following systems:
| Platform | GPU/CPU | Minimum Software Requirements |
| --- | --- | --- |
| Linux x86_64 | (G)V100, A100, H100 | CUDA Driver ≥525.105 |
| Linux arm64 | Jetson Orin, Jetson Thor, DGX Spark* | Linux for Tegra ≥36.4.3 (JetPack ≥6.2) |
| Windows x86_64 | (G)V100, A100, H100 | CUDA Driver ≥529.19 |
| Apple | Apple Silicon (M series) | macOS ≥14 |
*DGX Spark supports all Dorado commands except Dorado correct. Support for Dorado correct will be added in a future release.
Linux x64 or Windows systems not listed above but which have Nvidia GPUs with ≥8 GB VRAM and architecture from Pascal onwards (except P100/GP100) have not been widely tested but are expected to work. When basecalling with Apple devices, we recommend systems with ≥16 GB of unified memory.
If you encounter problems with running on your system, please report an issue.
AWS Benchmarks on Nvidia GPUs for Dorado 0.3.0 are available here. Please note: Dorado's basecalling speed is continuously improving, so these benchmarks may not reflect performance with the latest release.
Performance tips
- Dorado will automatically detect your GPU's free memory and select an appropriate batch size.
- Dorado will automatically run in multi-GPU cuda:all mode. If you have a heterogeneous collection of GPUs, select the faster GPUs using the --device flag (e.g., --device cuda:0,2). Not doing this will have a detrimental impact on performance.
- On Windows systems with Nvidia GPUs, open Nvidia Control Panel, navigate into “Manage 3D settings” and then set “CUDA - Sysmem Fallback Policy” to “Prefer No Sysmem Fallback”. This will provide a significant performance improvement.
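To illustrate the --device tip, the sketch below builds the device value from a list of GPU indices. The helper cuda_device_arg is hypothetical, and the final command is printed rather than executed, so the snippet runs without a GPU.

```shell
# Hypothetical helper: turn GPU indices into a --device value.
# With no arguments it falls back to cuda:all, the default multi-GPU mode.
cuda_device_arg() {
    if [ "$#" -eq 0 ]; then
        echo "cuda:all"
    else
        # "$*" joins the arguments with the first character of IFS (a comma
        # here); the subshell keeps the IFS change local.
        ( IFS=,; echo "cuda:$*" )
    fi
}

# List the GPUs first to decide which indices are the faster ones, e.g.:
#   nvidia-smi --query-gpu=index,name,memory.total --format=csv

# Print the resulting command instead of running it.
echo "dorado basecaller hac pod5s/ --device $(cuda_device_arg 0 2) > calls.bam"
```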
Running
The following are helpful commands for getting started with Dorado.
To see all options and their defaults, run dorado -h and dorado <subcommand> -h.
Simplex basecalling
To run Dorado basecalling using the automatically downloaded hac model on a directory of POD5 files or a single POD5 file, run:
dorado basecaller hac pod5s/ > calls.bam
To basecall a single file, simply replace the directory pod5s/ with a path to your data file.
See the Dorado documentation for more details on simplex basecalling, including how to use the --resume-from feature.
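As a sketch of resuming an interrupted run: as far as the upstream documentation describes it, --resume-from takes the BAM file left by the interrupted run, and the new output should go to a different file so the existing basecalls are not overwritten. The helper dorado_resume_cmd and the file names are hypothetical, and the command is printed rather than executed.

```shell
# Hypothetical helper: print a resume command, refusing to reuse the
# interrupted file as the output file.
dorado_resume_cmd() {
    incomplete="$1"
    output="$2"
    if [ "$incomplete" = "$output" ]; then
        echo "output must differ from the file passed to --resume-from" >&2
        return 1
    fi
    echo "dorado basecaller hac pod5s/ --resume-from $incomplete > $output"
}

# Placeholder file names for illustration.
dorado_resume_cmd incomplete.bam calls.bam
```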
DNA adapter and primer trimming
Dorado can detect and remove any adapter and/or primer sequences from the beginning and end of DNA reads. Note that if you intend to demultiplex the reads at some later time, trimming primers will likely result in some portions of the flanking regions of the barcodes being removed, which could prevent demultiplexing from working properly. For details see the dorado documentation on read trimming.
Modified basecalling
Beyond the traditional A, T, C, and G basecalling, Dorado can also detect modified bases such as 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), and N<sup>6</sup>-methyladenosine (6mA). These modified bases play crucial roles in epigenetic regulation.
For full details please read the documentation on modified basecalling.
To call modifications, extend the models argument with a comma-separated list of modifications:
dorado basecaller hac,5mCG_5hmCG,6mA pod5s/ > calls.bam
In the example above, basecalling is performed with the detection of both 5mC/5hmC in CG contexts and 6mA in all contexts. See here for details on modified basecalling context.
Refer to the models list table's Compatible Modifications column to see available modifications.
Modified basecalling is also supported with Duplex basecalling, where it produces hemi-methylation calls.
Duplex
To run Duplex basecalling, run the command:
dorado duplex sup pod5s/ > duplex.bam
For more details please head to the Dorado duplex basecalling documentation.
Alignment
Dorado supports aligning existing basecalls or producing aligned output directly, internally using minimap2.
To align existing basecalls, run:
dorado aligner <index> <reads> > aligned.bam
where index is the reference to align to, in FASTQ, FASTA or .mmi format, and reads is a folder or file in any HTS format.
To produce aligned output directly while basecalling (simplex or duplex), run with the --reference option:
dorado basecaller <model> <reads> --reference <index> > calls.bam
For more details please check out the Dorado aligner documentation.
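Before starting a long alignment run, it can help to check that the reference path at least looks like one of the accepted formats. The sketch below is a hypothetical pre-flight check. The extension list is an assumption made for illustration and is not taken from Dorado, which may accept other file names. The aligner command is printed, not run.

```shell
# Hypothetical pre-flight check: does the path look like a FASTA, FASTQ
# or .mmi file? The extension list is an assumption for illustration.
looks_like_reference() {
    case "$1" in
        *.fa|*.fasta|*.fq|*.fastq|*.mmi) return 0 ;;
        *.fa.gz|*.fasta.gz|*.fq.gz|*.fastq.gz) return 0 ;;
        *) return 1 ;;
    esac
}

ref="ref.fasta"   # placeholder file name
if looks_like_reference "$ref"; then
    echo "dorado aligner $ref calls.bam > aligned.bam"
else
    echo "unexpected reference extension: $ref" >&2
fi
```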
Sequencing Summary
The Dorado summary command outputs a tab-separated file with read-level sequencing information from the BAM file generated during basecalling. To create a summary, run:
dorado summary <bam> > summary.tsv
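Because the summary is plain tab-separated text, it can be inspected with standard tools. The sketch below prints the number of reads and the mean of one column, looked up by its header name. The helper tsv_column_mean is hypothetical, and the column name sequence_length_template in the usage comment is an assumption, so check the header row of your own summary.tsv.

```shell
# Hypothetical helper: print the row count and the mean of one named column
# from a tab-separated file with a header row.
# Usage: tsv_column_mean <column-name> <file.tsv>
tsv_column_mean() {
    awk -F '\t' -v col="$1" '
        NR == 1 {
            # Find the requested column in the header row.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { n++; sum += $idx }
        END { if (idx && n) printf "%d\t%.1f\n", n, sum / n }
    ' "$2"
}

# Example (column name is an assumption):
#   tsv_column_mean sequence_length_template summary.tsv
```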
Barcode Classification
Dorado supports barcode classification for existing basecalls as well as producing classified basecalls directly. Further details can be found at the Dorado barcoding documentation.
Poly(A) tail estimation
Dorado has initial support for estimating poly(A) tail lengths for cDNA (PCS and PCB kits) and RNA, and can be configured for use with custom primer sequences, interrupted tails, and plasmids. Note that Oxford Nanopore cDNA reads are sequenced in two different orientations and Dorado poly(A) tail length estimation handles both (A and T homopolymers). This feature can be enabled by passing --estimate-poly-a to the basecaller command. For more details check out the dorado poly(A) estimation documentation.
Read Error Correction
Dorado supports single-read error correction with the integration of the HERRO algorithm. HERRO uses all-vs-all alignment followed by haplotype-aware correction using a deep learning model to achieve higher single-read accuracies. The corrected reads are primarily useful for generating de novo assemblies of diploid organi
