UNCALLED
Raw nanopore signal mapper that enables real-time targeted sequencing
A Utility for Nanopore Current Alignment to Large Expanses of DNA

A read mapper which rapidly aligns raw nanopore signal to DNA references
Enables software-based targeted sequencing on Oxford Nanopore (ONT) MinION or GridION via adaptive sampling
Note that UNCALLED can only be applied to legacy r9.4.1 data. For r10.4.1 data try ReadFish or ONT's builtin adaptive sampling option.
For accurate end-to-end nanopore signal alignment, visualization, and analysis see Uncalled4
Installation
> pip3 install git+https://github.com/skovaka/UNCALLED.git --user
OR
> git clone --recursive https://github.com/skovaka/UNCALLED.git
> cd UNCALLED
> pip3 install .
Requires python >= 3.6, read-until == 3.0.0, pybind11 >= 2.5.0, and GCC >= 4.8.1 (all except GCC are automatically downloaded and installed)
Other dependencies are included via submodules, so be sure to clone with git clone --recursive
We recommend running on a Linux machine. UNCALLED has been successfully installed and run on Mac computers, but real-time ReadUntil has not been tested on a Mac. Installing UNCALLED has not been attempted on Windows.
Indexing
Example:
> uncalled index -o E.coli E.coli.fasta
Positional arguments:
fasta-file: reference genome(s) or other target sequences in FASTA format
Optional arguments:
-o/--bwa_prefix: output index prefix (default: same as input fasta)
Note that UNCALLED uses the BWA FM Index to encode the reference, and this command will use a previously built BWA index if all the required files exist with the specified prefix. Otherwise, a new BWA index will be automatically built.
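Whether an existing index is reused depends on which files are already present for the prefix. The sketch below checks for them, assuming the five files that a standard `bwa index` run writes (.amb, .ann, .bwt, .pac, .sa). The exact set that UNCALLED requires is not listed in this README and may differ, and `missing_bwa_files` is a hypothetical helper, not part of UNCALLED.

```python
from pathlib import Path

# Suffixes written by a standard `bwa index` run. Assumption: UNCALLED's
# own list of required files is not given in this README and may differ.
BWA_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_files(prefix):
    """Return the expected index files that do not exist for this prefix."""
    # Append to the full prefix string so names like "E.coli" keep their dot.
    return [prefix + s for s in BWA_SUFFIXES if not Path(prefix + s).is_file()]


if __name__ == "__main__":
    missing = missing_bwa_files("E.coli")
    if missing:
        print("A new index would likely be built; missing:", ", ".join(missing))
    else:
        print("All standard BWA index files found for this prefix")
```

An empty list suggests that `uncalled index` could reuse the existing BWA index for that prefix.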
We recommend applying repeat masking to your reference if it contains eukaryotic sequences. See masking for more details.
Fast5 Mapping
Example:
> uncalled map -t 16 E.coli fast5_list.txt > uncalled_out.paf
Loading fast5s
Mapping
> head -n 4 uncalled_out.paf
b84a48f0-9e86-47ef-9d20-38a0bded478e 3735 77 328 + Escherichia_coli_chromosome 4765434 2024611 2024838 66 228 255 ch:i:427 st:i:50085 mt:f:53.662560
77fe7f8c-32d6-4789-9d62-41ff482cf890 5500 94 130 + Escherichia_coli_chromosome 4765434 2333754 2333792 38 39 255 ch:i:131 st:i:238518 mt:f:19.497091
eee4b762-25dd-4d4a-8a59-be47065029be 2905 * * * * * * * * * 255 ch:i:44 st:i:302369 mt:f:542.985229
e175c87b-a426-4a3f-8dc1-8e7ab5fdd30d 8052 84 154 + Escherichia_coli_chromosome 4765434 1064550 1064614 41 65 255 ch:i:182 st:i:452368 mt:f:38.611683
Positional arguments:
bwa-prefix: the BWA reference index prefix generated by uncalled index
fast5-files: reads to be mapped. Can be a directory, which will be recursively searched for all files with the ".fast5" extension; a text file containing one fast5 filename per line; or a comma-separated list of fast5 filenames.
Optional arguments:
-l/--read-list: text file containing a list of read IDs. If specified, only these reads will be mapped
-n/--read-count: maximum number of reads to map
-t/--threads: number of threads to use for mapping (default: 1)
-e/--max-events-proc: number of events to attempt mapping before giving up on a read (default: 30,000). Note that there are approximately two events per nucleotide on average.
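The text-file form of the fast5-files argument can be produced with a few lines of Python. This sketch mirrors the recursive ".fast5" search described above and writes one path per line. `write_fast5_list` and the directory and file names in the usage note are illustrative, not part of UNCALLED.

```python
from pathlib import Path


def write_fast5_list(root, out_path):
    """Write every *.fast5 file found under root to out_path, one per line."""
    # rglob searches root and all of its subdirectories.
    paths = sorted(str(p) for p in Path(root).rglob("*.fast5") if p.is_file())
    with open(out_path, "w") as out:
        for p in paths:
            out.write(p + "\n")
    # Return the number of paths written so callers can sanity-check it.
    return len(paths)
```

For example, `write_fast5_list("fast5_dir", "fast5_list.txt")` would create a list file that could then be passed to `uncalled map` in place of the directory.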
See example/ for a simple read and reference example.
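The PAF records shown above can be loaded for downstream analysis with a small parser. This is a minimal sketch, assuming the 12 fixed PAF columns followed by optional TAG:TYPE:VALUE fields, with "*" placeholders for unmapped reads as in the sample output. `parse_paf_line` is a hypothetical helper, and the meaning of the individual tags is not described in this section, so the code only decodes their types.

```python
def parse_paf_line(line):
    """Parse one PAF record into a dict; unmapped reads get mapped=False."""
    cols = line.split()
    rec = {"read_id": cols[0], "read_len": int(cols[1]), "mapped": cols[5] != "*"}
    if rec["mapped"]:
        rec.update(
            read_start=int(cols[2]), read_end=int(cols[3]), strand=cols[4],
            ref_name=cols[5], ref_len=int(cols[6]),
            ref_start=int(cols[7]), ref_end=int(cols[8]),
        )
    # Optional fields follow the 12 fixed columns as TAG:TYPE:VALUE.
    casts = {"i": int, "f": float}
    tags = {}
    for field in cols[12:]:
        tag, typ, value = field.split(":", 2)
        tags[tag] = casts.get(typ, str)(value)
    rec["tags"] = tags
    return rec


if __name__ == "__main__":
    sample = ("77fe7f8c-32d6-4789-9d62-41ff482cf890 5500 94 130 + "
              "Escherichia_coli_chromosome 4765434 2333754 2333792 38 39 255 "
              "ch:i:131 st:i:238518 mt:f:19.497091")
    print(parse_paf_line(sample))
```

Splitting on any whitespace keeps the sketch tolerant of both tab-delimited files and the space-separated sample lines shown here. It assumes read and reference names contain no spaces.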
Real-Time ReadUntil
Warning: in the latest MinKNOW version, an API bug may prevent UNCALLED from properly ejecting reads. You can identify this bug if you do not see a peak of small "adaptive sampling" reads in the read length histogram. If this occurs, you should stop your sequencing run, briefly start a new sequencing run with MinKNOW's built-in version of adaptive sampling enabled, then stop that run and restart your UNCALLED run. We have found that this may initialize something in MinKNOW which allows UNCALLED to function properly.
Example:
> uncalled realtime E.coli --port 8000 -t 16 --enrich -c 3 > uncalled_out.paf
Starting client
Starting mappers
Mapping
> head -n 4 uncalled_out.paf
81ba344d-60df-4688-b37f-9064e76a3eb8 1352 * * * * * * * * * 255 ch:i:68 st:i:29101 mt:f:375.93841 wt:f:1440.934 mx:f:0.152565
404113c1-6ace-4690-885c-9c4a47da6476 450 * * * * * * * * * 255 ch:i:106 st:i:29268 mt:f:63.272270 wt:f:1591.070 en:f:0.010086
d9acafe3-23dd-4a0f-83db-efe299ee59a4 1355 * * * * * * * * * 255 ch:i:118 st:i:29378 mt:f:239.50201 wt:f:1403.641 ej:f:0.120715
8a6ec472-a289-4c50-9a75-589d7c21ef99 451 98 369 + Escherichia_coli 4765434 3421845 3422097 56 253 255 ch:i:490 st:i:29456 mt:f:79.419411 wt:f:8.551202 kp:f:0.097424
We recommend that you try mapping fast5s via uncalled map before real-time enrichment, as runtime issues could occur if UNCALLED is not installed properly.
The command can generally be run at any time before or during a sequencing run, although an error may occur if UNCALLED is run before any sequencing run has been started in the current MinKNOW session. If this happens, you should start UNCALLED after the run begins, ideally during the first mux scan. If you want to change the chunk size, you must run the command before starting the run (see below).
Positional arguments:
bwa-prefix: the BWA reference index prefix generated by uncalled index
Required arguments:
--enrich: will keep reads that map to the reference if included
OR
--deplete: will eject reads that map to the reference if included
Exactly one of --deplete or --enrich must be specified.
Optional Arguments:
-c/--max-chunks: number of chunks to attempt mapping before giving up on a read (default: 10)
--chunk-size: size of chunks in seconds (default: 1). Note: this is a new feature and may not work as intended (see below)
-t/--threads: number of threads to use for mapping (default: 1)
--port: MinION device port
--even: will only eject reads from even channels if included
--odd: will only eject reads from odd channels if included
--duration: expected duration of sequencing run in hours (default: 72)
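When launching real-time runs from a script, it can help to enforce the "exactly one of --enrich or --deplete" rule before the command is started. The following is a sketch only: `build_realtime_cmd` is a hypothetical wrapper, not part of UNCALLED, and it uses only the flags listed above.

```python
import shlex


def build_realtime_cmd(bwa_prefix, mode, port=None, threads=1, max_chunks=10):
    """Assemble an `uncalled realtime` argument list from the options above."""
    # Taking a single mode value means both flags can never be passed at once.
    if mode not in ("enrich", "deplete"):
        raise ValueError("mode must be exactly one of 'enrich' or 'deplete'")
    cmd = ["uncalled", "realtime", bwa_prefix, "--" + mode,
           "-t", str(threads), "-c", str(max_chunks)]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


if __name__ == "__main__":
    cmd = build_realtime_cmd("E.coli", "enrich", port=8000, threads=16, max_chunks=3)
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to `subprocess.run`, with standard output redirected to a PAF file as in the example command.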
Altering Chunk Size
The ReadUntil API receives signal in "chunks", which by default are one second's worth of signal. This can be changed using the --chunk-size parameter. Note that --max-chunks-proc should also be changed to compensate for changes to chunk sizes. If the chunk size is changed, you must start running UNCALLED before sequencing begins. UNCALLED is unable to change the chunk size mid-sequencing-run. In general, reducing the chunk size should improve enrichment, although previous work has found that the API becomes unreliable with chunk sizes less than 0.4 seconds. We have not thoroughly tested this feature, and recommend using the default 1 second chunk size for most cases. In the future this default size may be reduced.
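One way to compensate is to scale the chunk limit inversely with the chunk size, so that the total amount of signal examined per read stays roughly constant. The README does not prescribe a rule, so this is an assumption. `scaled_max_chunks` is a hypothetical helper, and its defaults are the documented defaults of 10 chunks and 1 second.

```python
import math


def scaled_max_chunks(chunk_size, base_chunks=10, base_chunk_size=1.0):
    """Chunk limit covering the same signal duration as the baseline."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Total seconds of signal examined per read under the baseline settings.
    seconds = base_chunks * base_chunk_size
    # Round before ceil so floating-point noise cannot add an extra chunk.
    return math.ceil(round(seconds / chunk_size, 9))


if __name__ == "__main__":
    for size in (1.0, 0.5, 0.4):
        print(size, scaled_max_chunks(size))
```

Under this assumption, halving the chunk size to 0.5 seconds doubles the limit from 10 to 20 chunks, which still covers 10 seconds of signal.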
Simulator
Example:
> uncalled sim E.coli.fasta /path/to/control/fast5s --ctl-seqsum /path/to/control/sequencing_summary.txt --unc-seqsum /path/to/uncalled/sequencing_summary.txt --unc-paf /path/to/uncalled/uncalled_out.paf -t 16 --enrich -c 3 --sim-speed 0.25 > uncalled_out.paf 2> uncalled_err.txt
> sim_scripts/est_genome_yield.py -u uncalled_out.paf --enrich -x E.coli -m mm2.paf -s sequencing_summary.txt --sim-speed 0.25
unc_on_bp 150.678033
unc_total_bp 6094.559395
cnt_on_bp 33.145022
cnt_total_bp 8271.651331
The simulator simulates a real-time run using data from two real runs: one control run and one UNCALLED run. Reads are simulated from the control run, and the pattern of channel activity is modeled after the UNCALLED run. The simulator outputs a PAF file similar to the real-time mode, which can be interpreted using scripts found in sim_scripts/.
Example files which can be used as template UNCALLED sequencing summary and PAF files for the simulator can be found here. The control reads/sequencing summary can be from any sequencing run of your sample of interest, and it does not have to match the sample used in the provided examples.
The simulator can take up a large amount of memory (> 100 GB), and loading the fast5 reads can take quite a long time. To reduce the time and memory requirements, you can truncate your control sequencing summary so that only the reads present in the summary are loaded, although this may reduce the accuracy of the simulation. Also, unfortunately, the fast5 loading portion of the simulator cannot be exited via a keyboard interrupt and must be hard-killed. I will work on fixing this in future versions.
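Truncating the control sequencing summary can be done with a short script. The sketch below assumes the summary is a text file with a single header line followed by one row per read. `truncate_summary` and the file names in the usage note are illustrative, not part of UNCALLED.

```python
from itertools import islice


def truncate_summary(src, dst, n_reads):
    """Copy the header line plus the first n_reads rows of src to dst."""
    with open(src) as fin, open(dst, "w") as fout:
        # The header must be kept so the columns can still be identified.
        fout.write(fin.readline())
        kept = 0
        for line in islice(fin, n_reads):
            fout.write(line)
            kept += 1
    # Number of read rows actually written (fewer if src is short).
    return kept
```

For example, `truncate_summary("sequencing_summary.txt", "control_small.txt", 50000)` would write a reduced summary that could be passed to --ctl-seqsum.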
Arguments:
bwa-prefix: the prefix of the index to align to. Should be a BWA index that uncalled index was run on
control-fast5-files: path to the directory where control run fast5 files are stored, or a text file containing the path to one control fast5 per line
--ctl-seqsum: sequencing summary of the control run. Read IDs must match the control fast5 files
--unc-seqsum: sequencing summary of the UNCALLED run
--unc-paf: PAF file output by UNCALLED from the UNCALLED run
--sim-speed: scaling factor of simulation duration in the range (0.0, 1.0], where smaller values are faster
