Introduction

eNRSA is an enhanced version of NRSA for analyzing nascent transcriptome data generated by PRO-seq, GRO-seq, (m)NET-seq, and Butt-seq. The source code of eNRSA is available on GitHub, and the Docker image is available on Docker Hub.

eNRSA_diagram

There are two ways to use eNRSA:

Command line usage – download eNRSA code and install the dependencies (see Installation and Usage).
Docker or Singularity container (see Container Usage).

Maintainers / Contributors

If you encounter any issues using this package, please email support#example.com (replace # with @).

Qi Liu, @liuqivandy Email: qi.liu#vumc.org

Jing Wang, @jingwang Email: jing.wang.1#vumc.org

HuaChang Chen, @chc-code Email: hua-chang.chen#vumc.org

Citation

<div> <a class="papertitle" href="https://pubmed.ncbi.nlm.nih.gov/40613436/" style="font-size: 18px; color: rgb(86, 86, 201) !important;"> eNRSA: a faster and more powerful approach for nascent transcriptome analysis. </a> </div> <div class="author_list" style="padding-bottom: 6px;"> Wang J, Chen H C, Hiebert S W, Sheng Q, Tansey W P, Shyr Y, Liu Q. </div> <div> Gigascience. 2025 Jan 6; 14; doi: 10.1093/gigascience/giaf071 PMID: 40613436 </div>

Installation

Local Installation

eNRSA has the following dependencies:

Python Packages:

fisher
pandas
numpy
pydeseq2
matplotlib
scipy
statsmodels

fisher package

For Fisher's exact test, we use the fisher package instead of scipy.stats.fisher_exact because it is significantly faster. However, the version of fisher available on PyPI is outdated, and pip install fisher may fail.

To resolve this, you can either:

- Download the package from its GitHub repository and install it locally:
git clone https://github.com/tylerjereddy/fishers_exact_test.git
pip install ./fishers_exact_test
- Use conda to install it:
conda install -c conda-forge fisher

Either method gives you a working, up-to-date version of the fisher package.

Other Packages:

bedtools
homer

We recommend using conda to install the dependencies. Follow these steps:

# Create an environment for eNRSA
conda create -n enrsa python=3.9 -y
conda activate enrsa

# Install dependencies
conda install -y -c conda-forge -c bioconda fisher pandas numpy pydeseq2 matplotlib scipy statsmodels bedtools homer
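After installing, it can be useful to confirm that the environment actually contains everything listed above. The following is a minimal sketch using only the Python standard library, not part of eNRSA itself. The package names come from the dependency list; the executable names are assumptions (`bedtools` is the bedtools entry point, and `makeTagDirectory` is one program shipped with HOMER, chosen here only as a stand-in because this README does not say which HOMER program eNRSA calls).

```python
import importlib.util
import shutil

# Python packages listed in the dependency section above.
PYTHON_PACKAGES = [
    "fisher", "pandas", "numpy", "pydeseq2",
    "matplotlib", "scipy", "statsmodels",
]

# Executables expected on PATH. "makeTagDirectory" is an assumed stand-in
# for "HOMER is installed"; adjust it to the HOMER program you rely on.
EXECUTABLES = ["bedtools", "makeTagDirectory"]


def find_missing(packages=PYTHON_PACKAGES, executables=EXECUTABLES):
    """Return (missing_packages, missing_executables) for this environment."""
    missing_packages = [
        name for name in packages
        if importlib.util.find_spec(name) is None
    ]
    missing_executables = [
        name for name in executables
        if shutil.which(name) is None
    ]
    return missing_packages, missing_executables


if __name__ == "__main__":
    pkgs, exes = find_missing()
    print("Missing Python packages:", ", ".join(pkgs) or "none")
    print("Missing executables:", ", ".join(exes) or "none")
```

Run it inside the activated enrsa environment; anything reported as missing can then be installed with conda or pip as described above.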

eNRSA

# Clone the eNRSA repository
git clone https://github.com/chc-code/eNRSA.git

To work with built-in reference files for supported organisms (hg19, hg38, mm10, mm39, dm3, dm6, ce10, danrer10), you can download them as follows:

# Download and unzip reference files
cd eNRSA
wget https://bioinfo.vanderbilt.edu/eNRSA/download/eNRSA_ref.zip
unzip eNRSA_ref.zip
rm eNRSA_ref.zip

Docker / Singularity Installation

You can also use pre-built Docker or Singularity images for eNRSA, which include all dependencies and reference files.

# Docker
docker pull chccode/enrsa:latest

# Singularity
singularity build enrsa.sif docker://chccode/enrsa:latest

Input Files

Alignment Files

Use the -in1 and -in2 options to specify alignment files for control and case samples, respectively. For multiple files, separate them with spaces.

Supported file formats:

BAM: Automatically converted to sorted BED format using bedtools bamtobed. This step may take several minutes depending on file size.
BED: If the input BED files are already sorted, add the -sorted flag to skip the sorting step.

Example:

python eRNA.py -in1 ctrl_sample.bed -in2 case_sample.bed -sorted
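Before passing -sorted, it may be worth confirming that a BED file really is sorted. The sketch below is not part of eNRSA. It assumes "sorted" means ordered by chromosome name (plain string comparison) and then by numeric start position, the order produced by `LC_ALL=C sort -k1,1 -k2,2n`. The exact ordering eNRSA expects is not stated in this README, so treat that as an assumption.

```python
def is_sorted_bed(path):
    """Return True if BED records are ordered by chromosome, then by start.

    Assumes chromosome names compare as plain strings and start positions
    compare numerically; header lines (track/browser/#) are ignored.
    """
    previous = None
    with open(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.rstrip("\n").split("\t")
            key = (fields[0], int(fields[1]))
            if previous is not None and key < previous:
                return False
            previous = key
    return True
```

If the function returns False, omit -sorted and let eNRSA sort the file.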

Design Table

To include batch correction or complex experimental designs, use the -design option. This option overrides -in1 and -in2.

Format

The design table is a tab-delimited text file with two sections:

Sample Information:
- Column 1: Full path to the alignment file (required)
- Column 2: Group name, used to define comparisons (required)
- Column 3: Batch group for correction (optional)
Comparison Definitions:
- Each row starts with @@.
- The first column specifies the case group name.
- The second column specifies the control group name.

Example:

/nobackup/INF_Cre_0hr_1-R1.bam    INF_Cre_0hr    b1
/nobackup/INF_Cre_0hr_2-R1.bam    INF_Cre_0hr    b2
/nobackup/INF_EV_0hr_1-R1.bam     INF_EV_0hr     b1
/nobackup/INF_EV_0hr_2-R1.bam     INF_EV_0hr     b2
@@INF_Cre_0hr    INF_EV_0hr
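To make the two-section layout concrete, here is a minimal sketch of a parser for the format described above. It is an illustration only, not eNRSA's own parser. It assumes fields are separated by tab characters, as the format description states, and the function name `parse_design_table` is hypothetical.

```python
def parse_design_table(text):
    """Parse a design table into (samples, comparisons).

    samples:     list of (alignment_path, group, batch_or_None)
    comparisons: list of (case_group, control_group)
    """
    samples = []
    comparisons = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 2:
            raise ValueError(f"expected at least 2 tab-separated fields: {raw!r}")
        if line.startswith("@@"):
            # Comparison row: case group first (after '@@'), control second.
            comparisons.append((fields[0][2:], fields[1]))
        else:
            # Sample row: path, group, optional batch.
            batch = fields[2] if len(fields) > 2 and fields[2] else None
            samples.append((fields[0], fields[1], batch))

    # Every group named in a comparison must appear in the sample section.
    known_groups = {group for _, group, _ in samples}
    for case, control in comparisons:
        for name in (case, control):
            if name not in known_groups:
                raise ValueError(f"comparison refers to unknown group: {name!r}")
    return samples, comparisons
```

Running a check like this before a long analysis catches misspelled group names in the @@ rows early.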

GTF Files

eNRSA supports 8 built-in organisms (hg19, hg38, mm10, mm39, dm3, dm6, ce10, danrer10). For other organisms or custom annotations, specify a GTF file using -gtf <file_path>.

GTF File Requirements:

Rows with exon as the third column (feature) will be processed.
The ninth column (attributes) must include transcript_id and gene_name.
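A custom GTF file can be checked against these two requirements before running eNRSA. The following is a minimal sketch, not eNRSA's own validation. It assumes the usual GTF attribute syntax (`key "value";` pairs separated by semicolons), and the function name `check_gtf_lines` is hypothetical.

```python
REQUIRED_ATTRIBUTES = ("transcript_id", "gene_name")


def check_gtf_lines(lines):
    """Return (exon_row_count, problems) for an iterable of GTF lines.

    problems is a list of (line_number, message) tuples. Only rows whose
    third column is "exon" are checked for the required attributes.
    """
    exon_rows = 0
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            problems.append((number, "fewer than 9 tab-separated columns"))
            continue
        if fields[2] != "exon":
            continue
        exon_rows += 1
        # Attribute keys are the first token of each ';'-separated entry.
        keys = {
            part.split()[0]
            for part in fields[8].split(";")
            if part.strip()
        }
        missing = [name for name in REQUIRED_ATTRIBUTES if name not in keys]
        if missing:
            problems.append((number, "missing attribute(s): " + ", ".join(missing)))
    return exon_rows, problems
```

For a file on disk, pass the open file handle: `check_gtf_lines(open("my_annotation.gtf"))`, where the file name is a placeholder. A result with zero exon rows or a non-empty problem list suggests the file needs fixing before use with -gtf.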

Output

eNRSA generates a variety of results, including:

Gene-Level Metrics: Nascent RNA abundance in promoter-proximal and gene body regions, pausing index, and significance.
Differential Analysis: Changes in pausing index across conditions.
Enhancer and eRNA Detection: Identified active enhancers, long eRNAs, and their quantification.
Visualizations: Heatmaps and boxplots offering a global view of the data.
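For orientation, the pausing index is commonly defined as the ratio of read density in the promoter-proximal region to read density in the gene body. The sketch below illustrates that general definition only. How eNRSA delimits the two regions and normalizes counts is not described in this section, so the function names and the example numbers are hypothetical.

```python
def read_density(read_count, region_length):
    """Reads per base pair for a region."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return read_count / region_length


def pausing_index(pp_reads, pp_length, gb_reads, gb_length):
    """Promoter-proximal read density divided by gene body read density.

    Returns NaN when the gene body has no reads, since the ratio is
    undefined in that case.
    """
    gb_density = read_density(gb_reads, gb_length)
    if gb_density == 0:
        return float("nan")
    return read_density(pp_reads, pp_length) / gb_density


if __name__ == "__main__":
    # Hypothetical gene: 100 reads in a 50 bp promoter-proximal window,
    # 200 reads across a 10,000 bp gene body.
    print(pausing_index(100, 50, 200, 10_000))
```

A larger value indicates more reads concentrated near the promoter relative to the gene body.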

Results for known genes are stored in the known_gene folder. If eRNA.py is run, an additional eRNA folder will be created.

These folders include primary tables and visualization files, such as:

Differential analysis results.
Heatmaps of transcription changes.
Boxplots of promoter and gene body read densities.

known_gene folder

| File Name | File Description |
| --- | --- |
| pindex.txt | Pausing information for each gene in all samples |
| normalized_pp_gb.txt | Normalized read counts in promoter-proximal and gene body regions for each gene in all samples |
| pp_change.txt | Differential expression results for the promoter-proximal region of each gene across two conditions |
| gb_change.txt | Differential expression results for the gene body region of each gene across two conditions |
| pindex_change.txt | Differential analysis results for the pausing index of each gene across two conditions |
| boxplot_ppdensity.pdf | Box plot of normalized read density of promoter-proximal regions for each sample |
| boxplot_gbdensity.pdf | Box plot of normalized read density of gene body regions for each sample |
| boxplot_pausingIndex.pdf | Box plot of pausing index for each sample |
| pindex_change.pdf | Heatmap of pausing index change across two conditions for genes with adjp < 0.05 |
| heatmap.pdf | Heatmap of condition-dependent transcription changes around TSS for active genes |
| Reps_condition1.tif | Histogram for variation across samples within condition 1 |
| Reps_condition2.tif | Histogram for variation across samples within condition 2 |
| TSS_alternative_isoforms_between_conditions.sig.tsv | The alternative TSS used in different conditions |
| TTS_alternative_isoforms_between_conditions.sig.tsv | The alternative TTS used in different conditions |
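The change tables can be post-processed with a few lines of Python. The sketch below is not part of eNRSA. It assumes the .txt outputs are tab-delimited with a header row, which is not confirmed here. The column to filter on is passed in by the caller because the exact header names are not listed in this README.

```python
import csv


def filter_rows(path, column, threshold=0.05):
    """Return rows of a tab-delimited table whose `column` value is below `threshold`.

    Rows where the value is not numeric (for example "NA" or an empty cell)
    are skipped.
    """
    kept = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {path}")
        for row in reader:
            try:
                value = float(row[column])
            except (TypeError, ValueError):
                continue
            if value < threshold:
                kept.append(row)
    return kept
```

For example, `filter_rows("known_gene/pindex_change.txt", column=...)` with the adjusted p-value column name taken from the file's own header would return the genes passing the 0.05 cutoff.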

eRNA folder

| File Name
