plassembler

Automated Bacterial Plasmid Assembly Program

plassembler is a program designed for fast, automated assembly of plasmids in bacterial genomes that have been hybrid sequenced with long reads and paired-end short reads. It was originally designed for Oxford Nanopore Technologies long reads, but it will also work with PacBio reads. As of v1.3.0, it also works well for genomes assembled from long reads only.

If you are assembling a small number of bacterial genomes manually, I would recommend starting by using Trycycler to recover the chromosome before using Plassembler to recover plasmids, especially the small ones.

Otherwise, I recommend you don't actually use Plassembler by itself. If you have more genomes or want to assemble your genomes in a more automated way, I would recommend Hybracter. If you use Hybracter, you will not need to use Plassembler separately, as it is built in. But please still cite Plassembler.

Quick Start

The easiest way to install plassembler is via conda:

conda install -c bioconda plassembler

Followed by database download and installation:

plassembler download -d <database directory>

And finally run plassembler:

plassembler run -d <database directory> -l <long read fastq> -o <output dir> -1 <short read R1 fastq> -2 <short read R2 fastq> -c <estimated chromosome length>
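Before launching a run, it can help to confirm that the read files are actually present. The following is a minimal sketch, not part of plassembler itself: check_inputs is a helper defined here, and the file names, database directory and chromosome length (2500000) are hypothetical placeholders to replace with your own.

```shell
# Return 0 only if every given path exists and is non-empty.
# Paths that fail the check are reported on stderr.
check_inputs() {
  local rc=0 f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Hypothetical file names and chromosome length; substitute your own.
if check_inputs long_reads.fastq.gz R1.fastq.gz R2.fastq.gz; then
  plassembler run -d plassembler_db -l long_reads.fastq.gz \
    -1 R1.fastq.gz -2 R2.fastq.gz -o outdir -c 2500000
fi
```

If any read file is missing, the run is skipped and the offending paths are listed.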

Please read the Installation section for more details, especially if you are an inexperienced command line user.

🐳 Container Images

There are two sources of Docker images available for Plassembler:

Example usage with Apptainer (Singularity)

To use the recommended image, define the current image and pull it.

IMAGE="quay.io/biocontainers/plassembler:1.8.1--pyhdfd78af_0"

# Pull the container image
apptainer pull --name plassembler.sif docker://$IMAGE

# Download the database (required for Biocontainers image)
apptainer exec plassembler.sif plassembler download -d plassembler_db

# Run Plassembler
apptainer exec plassembler.sif \
 plassembler run --help
apptainer exec plassembler.sif \
 plassembler run -d plassembler_db -l long_reads.fastq.gz -1 R1.fastq.gz -2 R2.fastq.gz -o outdir -t 4 -c 50000
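To avoid repeating the apptainer exec prefix, the calls above can be wrapped in a small shell function. This is a sketch under assumptions: plassembler_sif, APPTAINER and PLASSEMBLER_SIF are names introduced here for illustration, and the image is assumed to sit at plassembler.sif in the current directory.

```shell
# Container runtime and image path; both can be overridden from the environment.
APPTAINER="${APPTAINER:-apptainer}"
PLASSEMBLER_SIF="${PLASSEMBLER_SIF:-plassembler.sif}"

# Forward all arguments to plassembler inside the container.
plassembler_sif() {
  "$APPTAINER" exec "$PLASSEMBLER_SIF" plassembler "$@"
}
```

With this defined, plassembler_sif run --help is equivalent to the longer apptainer exec plassembler.sif plassembler run --help.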

Google Colab Notebook

If you don't want to install plassembler locally, you can run it without any code using the colab notebook https://colab.research.google.com/github/gbouras13/plassembler/blob/main/run_plassembler.ipynb

This is only recommended if you have one or a few samples to assemble (each sample probably takes an hour or two because of the limited resources of Google Colab). If you have more than this, a local install is recommended.

Manuscript

plassembler has been recently published in Bioinformatics:

George Bouras, Anna E. Sheppard, Vijini Mallawaarachchi, Sarah Vreugde, Plassembler: an automated bacterial plasmid assembly tool, Bioinformatics, Volume 39, Issue 7, July 2023, btad409, https://doi.org/10.1093/bioinformatics/btad409.

If you use plassembler, please see the full Citations section for a list of all programs plassembler uses under the hood, in order to fully recognise the creators of these tools for their work.

Documentation

The full documentation for Plassembler can be found here.


`plassembler` v1.5.0 Update New Database (21 November 2023)

If you upgrade to v1.5.0, you will need to update the database using plassembler download.
Plassembler v1.5.0 incorporates a new expanded database thanks to the recent PLSDB release 2023_11_03_v2. Thanks @biobrad for the heads up.

`plassembler` v1.3.0 Updates (24 October 2023)

plassembler long should yield improved results. It achieves this by treating long reads both as short reads (to build an initial de Bruijn graph based short read assembly) and as long reads (for scaffolding) in Unicycler.
While I'd still recommend short reads if you can get them, I am now confident that if your isolate has small plasmids in the long read set, plassembler long is very likely to find and recover them.
For more information, see the documentation.
The ability to specify --flye_assembly and --flye_info, instead of --flye_directory, if you already have a Flye assembly for your long reads has been added. Thanks to @incoherentian's issue.
The ability to specify --no_copy_numbers with plassembler assembled, if you just want to run some plasmids against the PLSDB, has been added. Thanks to @gaworj's issue.
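As a rough illustration of the long-read-only options above, the sketch below builds a plassembler long command line and prints it instead of running it. It assumes that plassembler long accepts the same -d, -l, -o and -c options as plassembler run (check plassembler long --help to confirm); long_cmd and all file names are hypothetical.

```shell
# Print a long-read-only command line without executing it.
# Arguments: database dir, long reads, output dir, chromosome length,
# then optionally an existing Flye assembly and its info file.
long_cmd() {
  local db="$1" reads="$2" outdir="$3" chrom_len="$4"
  local cmd="plassembler long -d $db -l $reads -o $outdir -c $chrom_len"
  # Optional: reuse an existing Flye assembly rather than re-running Flye.
  if [ -n "${5:-}" ] && [ -n "${6:-}" ]; then
    cmd="$cmd --flye_assembly $5 --flye_info $6"
  fi
  echo "$cmd"
}

# Hypothetical inputs; substitute your own paths and chromosome length.
long_cmd plassembler_db long_reads.fastq.gz outdir_long 2500000
long_cmd plassembler_db long_reads.fastq.gz outdir_long 2500000 \
  flye_out/assembly.fasta flye_out/assembly_info.txt
```

Printing the command first makes it easy to inspect the options before committing to a run.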

Why Does Plassembler Exist?

In long-read assembled bacterial genomes, small plasmids are difficult to assemble correctly with long read assemblers. They commonly have circularisation issues and can be duplicated or missed (see this, this and this). This recent paper in Microbial Genomics by Johnson et al also suggests that long read assemblers particularly miss small plasmids.

plassembler was therefore created as a fast automated tool to ensure plasmids are assembled correctly without duplicated regions for high-throughput uses - like Unicycler but a lot faster - and to provide some useful statistics as well (such as estimated plasmid copy numbers for both long and short read sets).

As it turns out (though this wasn't a motivation for making it), plassembler also recovers more small plasmids than the existing gold standard tool Unicycler. I think this is because it throws away chromosomal reads, similar to subsampling short read sets, which can improve recovery. As plasmid reads then make up a larger proportion of the overall read set, there seems to be a higher chance of recovering smaller plasmids.

You can see this increase in accuracy and speed in the benchmarking results for simulated and real datasets.

Plassembler also uses mash as a quick way to determine whether each assembled contig has any similar hits in PLSDB.

Additionally, due to its mapping approach, Plassembler
