LyRic
Long RNA-seq analysis workflow
LyRic is a versatile automated transcriptome annotation and analysis workflow written in the Snakemake language. Its core functionality is the production of:
- a set of high-quality RNA Transcript Models (TMs) mapped onto a genome sequence, based on Long-Read (LR) RNA sequencing data.
- various summary statistics plots and analysis results that describe the input and output data in detail.
- an interactive HTML table reporting statistics for each input sample, enabling easy and intuitive sample-to-sample comparison.
- a UCSC Track Hub to display output TMs, as well as various other tracks produced by LyRic.
(Note that outputs 2, 3 and 4 in the list above, i.e. the plots, the HTML table and the Track Hub, can easily be switched on and off.)
LyRic is platform-agnostic, i.e. it can deal with FASTQ data coming from both the ONT and PacBio platforms.
Full LyRic documentation is here.
Quickstart
Prerequisites
- Anaconda installation (miniconda/mambaforge)
- Snakemake v8
- Singularity
[!NOTE]
Snakemake appears to need an Anaconda installation even when the pipeline runs in a containerized environment.
[!TIP] If you use Pixi to install your conda environments, you can use the provided pixi.toml file to set up Snakemake and the other requirements. Just run the following command from the pipeline directory:
pixi install
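Before getting the pipeline, it can help to confirm that the prerequisites listed above are visible on your PATH. The following is a minimal sketch, not part of LyRic itself: the helper names (have_tool, major_version, check_prerequisites) are hypothetical, and it assumes the tools are exposed under the command names conda, snakemake and singularity. On systems where Singularity is installed as Apptainer, or where Snakemake is only available inside a Pixi environment, adjust accordingly.

```shell
#!/bin/sh
# Hypothetical helper script: checks the prerequisites, installs nothing.

# Return success if the named command is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print the major component of a dotted version string, e.g. "8.20.3" -> "8".
major_version() {
    printf '%s\n' "${1%%.*}"
}

# Report each prerequisite as found or missing; return non-zero if any is missing.
check_prerequisites() {
    missing=0
    # Assumed command names; your installation may differ.
    for tool in conda snakemake singularity; do
        if have_tool "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # The prerequisites list asks for Snakemake v8, so warn on other major versions.
    if have_tool snakemake; then
        major=$(major_version "$(snakemake --version)")
        if [ "$major" != "8" ]; then
            echo "warning: Snakemake major version is $major, expected 8"
        fi
    fi
    return "$missing"
}

check_prerequisites || echo "Install the missing tools before running the pipeline."
```

The script only reports what it finds, so it is safe to run repeatedly while setting up the environment.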
Get the pipeline
Clone the GitHub repo to the folder you want to use as the working directory of the pipeline and move to it:
git clone https://github.com/guigolab/LyRic lyric_test
cd lyric_test
Make a test run
The pipeline repository contains a small dataset that can be used for testing. You can run the pipeline on the test dataset with the following command:
snakemake --cores all
Execution should take less than half an hour.
