Plassembler
Program to quickly and accurately assemble plasmids in hybrid and long-only sequenced bacterial isolates
plassembler
Automated Bacterial Plasmid Assembly Program
plassembler is a program designed for fast, automated assembly of plasmids in bacterial genomes that have been hybrid sequenced with long reads and paired-end short reads. It was originally designed for Oxford Nanopore Technologies long reads, but it will also work with PacBio reads. As of v1.3.0, it also works well for genomes assembled from long reads only.
If you are assembling a small number of bacterial genomes manually, I would recommend starting by using Trycycler to recover the chromosome before using Plassembler to recover plasmids, especially the small ones.
Otherwise, I recommend you don't actually use Plassembler by itself. If you have more genomes or want to assemble your genomes in a more automated way, I would recommend Hybracter. If you use Hybracter, you will not need to use Plassembler separately, as it is built in. But please still cite Plassembler.
Quick Start
The easiest way to install plassembler is via conda:
conda install -c bioconda plassembler
Followed by database download and installation:
plassembler download -d <database directory>
And finally run plassembler:
plassembler run -d <database directory> -l <long read fastq> -o <output dir> -1 <short read R1 fastq> -2 <short read R2 fastq> -c <estimated chromosome length>
Please read the Installation section for more details, especially if you are an inexperienced command line user.
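For less experienced command line users, a few preflight checks can catch a missing install, database or read file before a run starts. The following is a minimal sketch, not part of Plassembler itself: the helper `require_files`, the file names and the chromosome length are hypothetical placeholders, and only the flags shown in the Quick Start are used.

```shell
#!/bin/sh
# Return 0 only if every argument is an existing, non-empty file.
require_files() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
}

# Hypothetical inputs; replace with your own paths and estimate.
DB=plassembler_db
LONG=long_reads.fastq.gz
R1=R1.fastq.gz
R2=R2.fastq.gz
CHROM_LEN=2500000   # estimated chromosome length in bp (made-up value)

# Run only if plassembler is on PATH, the database directory exists
# and all three read files are present.
if command -v plassembler >/dev/null 2>&1 && [ -d "$DB" ] \
    && require_files "$LONG" "$R1" "$R2"; then
    plassembler run -d "$DB" -l "$LONG" -1 "$R1" -2 "$R2" -o outdir -c "$CHROM_LEN"
else
    echo "preflight failed: check the install, database directory and read files" >&2
fi
```

If any check fails, nothing is run and a message is written to standard error.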
🐳 Container Images
There are two sources of Docker images available for Plassembler:
| Source | Repository | Tags | Notes |
| :--- | :--- | :--- | :--- |
| Biocontainers | quay.io/biocontainers/plassembler | Link | Requires an initial plassembler download. |
| StaPH-B | docker.io/staphb/plassembler | Link | Database is pre-installed at /plassembler_db. |
Example usage with Apptainer (Singularity)
To use the recommended image, define the current image and pull it.
IMAGE="quay.io/biocontainers/plassembler:1.8.1--pyhdfd78af_0"
# Pull the container image
apptainer pull --name plassembler.sif docker://$IMAGE
# Download the database (required for Biocontainers image)
apptainer exec plassembler.sif plassembler download -d plassembler_db
# Run Plassembler
apptainer exec plassembler.sif \
plassembler run --help
apptainer exec plassembler.sif \
plassembler run -d plassembler_db -l long_reads.fastq.gz -1 R1.fastq.gz -2 R2.fastq.gz -o outdir -t 4 -c 50000
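Because the StaPH-B image ships with the database at /plassembler_db, the download step can be skipped when using it. The sketch below shows what that might look like. The `latest` tag is an assumption (pin a specific tag from the repository's tag list in practice), and the `run` helper is a hypothetical dry-run wrapper that prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Assumption: a "latest" tag exists for the StaPH-B image.
IMAGE="docker.io/staphb/plassembler:latest"
DB_DIR="/plassembler_db"   # database location inside the StaPH-B image

# Print commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pull the container image
run apptainer pull --name plassembler_staphb.sif "docker://$IMAGE"

# Run Plassembler against the pre-installed database (no download step)
run apptainer exec plassembler_staphb.sif \
    plassembler run -d "$DB_DIR" -l long_reads.fastq.gz \
    -1 R1.fastq.gz -2 R2.fastq.gz -o outdir -t 4 -c 50000
```

The read file names, thread count and chromosome length are copied from the Biocontainers example above and are placeholders only.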
Google Colab Notebook
If you don't want to install plassembler locally, you can run it without any code using the colab notebook https://colab.research.google.com/github/gbouras13/plassembler/blob/main/run_plassembler.ipynb
This is only recommended if you have one or a few samples to assemble (it takes a while per sample due to the limited nature of Google Colab resources - probably an hour or two a sample). If you have more than this, a local install is recommended.
Manuscript
plassembler has been recently published in Bioinformatics:
George Bouras, Anna E. Sheppard, Vijini Mallawaarachchi, Sarah Vreugde, Plassembler: an automated bacterial plasmid assembly tool, Bioinformatics, Volume 39, Issue 7, July 2023, btad409, https://doi.org/10.1093/bioinformatics/btad409.
If you use plassembler, please see the full Citations section for a list of all programs plassembler uses under the hood, in order to fully recognise the creators of these tools for their work.
Documentation
The full documentation for Plassembler can be found here.
Table of Contents
- plassembler
- Automated Bacterial Plasmid Assembly Program
- Quick Start
- Manuscript
- Documentation
- Table of Contents
- plassembler v1.5.0 Update New Database (21 November 2023)
- plassembler v1.3.0 Updates (24 October 2023)
- Why Does Plassembler Exist?
- Why Not Just Use Unicycler?
- Other Features
- Quality Control
- Metagenomes
- Installation
- Unicycler v0.5.0 Installation Issues
- Running plassembler
- Outputs
- Benchmarking
- Acknowledgements
- Version Log
- Bugs and Suggestions
- Citations
plassembler v1.5.0 Update New Database (21 November 2023)
- If you upgrade to v1.5.0, you will need to update the database using plassembler download.
- Plassembler v1.5.0 incorporates a new expanded database thanks to the recent PLSDB release 2023_11_03_v2. Thanks @biobrad for the heads up.
plassembler v1.3.0 Updates (24 October 2023)
- plassembler long should yield improved results. It achieves this by treating long reads as both short reads (in the sense of creating a de Bruijn graph based short read assembly to begin) and long reads (for scaffolding) in Unicycler.
- While I'd still recommend short reads if you can get them, I am now confident that if your isolate has small plasmids in the long read set, plassembler long is very likely to find and recover them.
- For more information, see the documentation.
- The ability to specify --flye_assembly and --flye_info if you already have a Flye assembly for your long reads, instead of --flye_directory, has been added. Thanks to @incoherentian's issue.
- The ability to specify --no_copy_numbers with plassembler assembled if you just want to run some plasmids against the PLSDB has been added. Thanks to @gaworj's issue.
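The options above might be combined as in the following sketch. Several things here are assumptions rather than confirmed usage: that plassembler long takes the same -d, -l, -o, -t and -c options as plassembler run, that --flye_assembly and --flye_info can be passed to plassembler long, and that plassembler assembled reads its plasmids from an --input_plasmids option. All file and directory names are hypothetical, and the `run` helper only prints commands unless `DRY_RUN=0` is set. Check `plassembler long --help` and `plassembler assembled --help` before relying on any of it.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Long-read-only plasmid assembly (options assumed to mirror `plassembler run`).
run plassembler long -d plassembler_db -l long_reads.fastq.gz \
    -o outdir_long -t 4 -c 50000

# Reuse an existing Flye assembly instead of re-running Flye.
# assembly.fasta and assembly_info.txt are the files Flye writes to its output directory.
run plassembler long -d plassembler_db -l long_reads.fastq.gz \
    -o outdir_flye -t 4 -c 50000 \
    --flye_assembly flye_out/assembly.fasta \
    --flye_info flye_out/assembly_info.txt

# Compare already-assembled plasmids against PLSDB without copy number estimation.
# Assumption: --input_plasmids is the option that supplies the plasmid FASTA.
run plassembler assembled -d plassembler_db -o outdir_assembled \
    --input_plasmids plasmids.fasta --no_copy_numbers
```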
Why Does Plassembler Exist?
In long-read assembled bacterial genomes, small plasmids are difficult to assemble correctly with long read assemblers. They commonly have circularisation issues and can be duplicated or missed (see this, this and this). This recent paper in Microbial Genomics by Johnson et al also suggests that long read assemblers particularly miss small plasmids.
plassembler was therefore created as a fast automated tool to ensure plasmids are assembled correctly without duplicated regions for high-throughput uses - like Unicycler but a lot faster - and to provide some useful statistics as well (such as estimated plasmid copy numbers for both long and short read sets).
As it turns out (though this wasn't a motivation for making it), plassembler also recovers more small plasmids than the existing gold standard tool Unicycler. I think this is because it throws away chromosomal reads, similar to subsampling short read sets, which can improve recovery. As plasmid reads then make up a larger proportion of the overall read set, there seems to be a higher chance of recovering smaller plasmids.
You can see this increase in accuracy and speed in the benchmarking results for simulated and real datasets.
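To make the proportion argument above concrete, here is a small back-of-the-envelope calculation. It is only an illustration: the chromosome length, plasmid length, copy number and the fraction of chromosomal reads discarded are all made-up values, and it simply assumes sequenced bases scale with replicon length times copy number.

```shell
#!/bin/sh
# Percentage of sequenced bases that come from the plasmid.
# Args: chromosome length (bp), plasmid length (bp), plasmid copy number,
#       fraction of chromosomal bases discarded (0 to 1).
plasmid_share() {
    awk -v c="$1" -v p="$2" -v n="$3" -v r="$4" 'BEGIN {
        plasmid = p * n          # plasmid bases per chromosome copy
        chrom = c * (1 - r)      # chromosomal bases left after discarding
        printf "%.2f\n", 100 * plasmid / (plasmid + chrom)
    }'
}

# Hypothetical 5 Mb chromosome with a 5 kb plasmid at 10 copies per cell.
plasmid_share 5000000 5000 10 0      # prints 0.99  (all reads kept)
plasmid_share 5000000 5000 10 0.99   # prints 50.00 (99% of chromosomal reads discarded)
```

Under these assumptions the plasmid goes from about 1% of the data to half of it, which is the kind of enrichment the paragraph above describes.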
Plassembler also uses mash as a quick way to determine whether each assembled contig has any similar hits in PLSDB.
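As a rough illustration of this kind of lookup (not Plassembler's internal invocation), the sketch below keeps the closest reference per contig from `mash dist` output, which is tab-separated with the columns reference ID, query ID, distance, p-value and shared hashes. The helper name `best_hits` and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Read `mash dist` output on stdin and print, for each query contig,
# the reference with the smallest Mash distance as: contig, reference, distance.
best_hits() {
    awk -F'\t' '
        !($2 in best) || ($3 + 0) < best[$2] {
            best[$2] = $3 + 0                  # numeric distance for comparison
            line[$2] = $2 "\t" $1 "\t" $3      # keep the distance as printed by mash
        }
        END { for (q in line) print line[q] }
    ' | sort
}

# Example usage (assumes a PLSDB sketch plsdb.msh and a FASTA of contigs;
# -i sketches each sequence individually, -d caps the reported distance):
#   mash dist -i -d 0.1 plsdb.msh plasmids.fasta | best_hits
```

Contigs with no reference within the distance cutoff produce no `mash dist` lines, so they are absent from the output.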
Additionally, due to its mapping approach, Plassembler
