SCP
An end-to-end Single-Cell Pipeline designed to facilitate comprehensive analysis and exploration of single-cell data.
SCP: Single-Cell Pipeline
SCP provides a comprehensive set of tools for single-cell data processing and downstream analysis.
The package includes the following facilities:
- Integrated single-cell quality control methods.
- Pipelines embedded with multiple methods for normalization, feature reduction, and cell population identification (standard Seurat workflow).
- Pipelines embedded with multiple integration methods for scRNA-seq or scATAC-seq data, including Uncorrected, Seurat, scVI, MNN, fastMNN, Harmony, Scanorama, BBKNN, CSS, LIGER, Conos, ComBat.
- Multiple single-cell downstream analyses such as identification of differential features, enrichment analysis, GSEA analysis, identification of dynamic features, PAGA, RNA velocity, Palantir, Monocle2, Monocle3, etc.
- Multiple methods for automatic annotation of single-cell data and methods for projection between single-cell datasets.
- High-quality data visualization methods.
- Fast deployment of single-cell data into SCExplorer, a shiny app that provides an interactive visualization interface.
The functions in the SCP package are all developed around the Seurat object and are compatible with other Seurat functions.
R version requirement
- R >= 4.1.0
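Before installing, you may want to confirm that the running R session meets this requirement. The following is a minimal base R sketch; the variable names `min_r` and `r_ok` are illustrative and not part of SCP.

```r
# Minimum R version stated in this README.
min_r <- "4.1.0"

# getRversion() returns a version object that can be compared to a string.
r_ok <- getRversion() >= min_r

if (!r_ok) {
  message(
    "R ", as.character(getRversion()),
    " is older than the required version ", min_r
  )
}
```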
Installation in the global R environment
You can install the latest version of SCP from GitHub with:
if (!require("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("zhanghao-njmu/SCP")
Create a python environment for SCP
To run functions such as RunPAGA or RunSCVELO, SCP requires conda to create a separate python environment. The default environment name is "SCP_env". You can specify the environment name for SCP by setting options(SCP_env_name="new_name").
Now, you can run PrepareEnv() to create the python environment for
SCP. If the conda binary is not found, it will automatically download
and install miniconda.
SCP::PrepareEnv()
To force SCP to use a specific conda binary, set the reticulate.conda_binary R option before calling PrepareEnv():
options(reticulate.conda_binary = "/path/to/conda")
SCP::PrepareEnv()
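One way to confirm that the environment was created is to list the conda environments that reticulate can see. This is a sketch rather than part of SCP: it assumes the reticulate package is installed, and it assumes the environment name follows the SCP_env_name option with "SCP_env" as the fallback, as described above. How SCP resolves the name internally is not shown here.

```r
# Environment name: the SCP_env_name option if set, else the documented default.
env_name <- getOption("SCP_env_name", default = "SCP_env")

if (requireNamespace("reticulate", quietly = TRUE)) {
  # conda_list() errors when no conda installation is found, so catch that case.
  envs <- tryCatch(reticulate::conda_list(), error = function(e) NULL)

  if (is.null(envs)) {
    message("reticulate could not find a conda installation.")
  } else if (env_name %in% envs$name) {
    message("Found conda environment: ", env_name)
  } else {
    message("No conda environment named '", env_name, "' yet; run SCP::PrepareEnv().")
  }
} else {
  message("The reticulate package is not installed.")
}
```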
If the download of miniconda or pip packages is slow, you can specify the miniconda repo and PyPI mirror according to your network region.
SCP::PrepareEnv(
  miniconda_repo = "https://mirrors.bfsu.edu.cn/anaconda/miniconda",
  pip_options = "-i https://pypi.tuna.tsinghua.edu.cn/simple"
)
Available miniconda repositories:
- https://repo.anaconda.com/miniconda (default)
Available PyPI mirrors:
- https://pypi.python.org/simple (default)
Installation in an isolated R environment using renv
If you do not want to change your current R environment or require reproducibility, you can use the renv package to install SCP into an isolated R environment.
Create an isolated R environment
if (!require("renv", quietly = TRUE)) {
  install.packages("renv")
}
dir.create("~/SCP_env", recursive = TRUE) # The project directory must not be the home directory "~" itself
renv::init(project = "~/SCP_env", bare = TRUE, restart = TRUE)
Option 1: Install SCP from GitHub and create SCP python environment
renv::activate(project = "~/SCP_env")
renv::install("BiocManager")
renv::install("zhanghao-njmu/SCP", repos = BiocManager::repositories())
SCP::PrepareEnv()
Option 2: If SCP is already installed in the global environment, copy SCP from the local library
renv::activate(project = "~/SCP_env")
renv::hydrate("SCP")
SCP::PrepareEnv()
Activate the SCP environment before use
renv::activate(project = "~/SCP_env")
library(SCP)
data("pancreas_sub")
pancreas_sub <- RunPAGA(srt = pancreas_sub, group_by = "SubCellType", linear_reduction = "PCA", nonlinear_reduction = "UMAP")
CellDimPlot(pancreas_sub, group.by = "SubCellType", reduction = "draw_graph_fr")
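Save and restore the state of the SCP environment
renv::snapshot(project = "~/SCP_env") # Record the currently installed package versions in the lockfile
renv::restore(project = "~/SCP_env") # Reinstall the package versions recorded in the lockfile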
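To see whether the project library and the lockfile have drifted apart before snapshotting or restoring, renv provides a status report. The sketch below assumes the project lives in "~/SCP_env" as in the steps above; `project_dir` and `has_lockfile` are illustrative names.

```r
# Project directory used throughout this section.
project_dir <- path.expand("~/SCP_env")

# renv stores the recorded package versions in renv.lock at the project root.
has_lockfile <- file.exists(file.path(project_dir, "renv.lock"))

if (has_lockfile && requireNamespace("renv", quietly = TRUE)) {
  # Report differences between the lockfile and the project library.
  renv::status(project = project_dir)
} else {
  message("No lockfile found (or renv is not installed); run renv::snapshot() first.")
}
```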
Quick Start
Data exploration
The analysis is based on a subsetted version of mouse pancreas data.
library(SCP)
library(BiocParallel)
register(MulticoreParam(workers = 8, progressbar = TRUE))
data("pancreas_sub")
print(pancreas_sub)
#> An object of class Seurat
#> 47874 features across 1000 samples within 3 assays
#> Active assay: RNA (15958 features, 3467 variable features)
#> 2 other assays present: spliced, unspliced
#> 2 dimensional reductions calculated: PCA, UMAP
CellDimPlot(
  srt = pancreas_sub, group.by = c("CellType", "SubCellType"),
  reduction = "UMAP", theme_use = "theme_blank"
)
<img src="man/figures/EDA-1.png" width="100%" style="display: block; margin: auto;" />
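The grouping columns passed to group.by are ordinary metadata columns of the Seurat object, so they can be inspected with standard Seurat and base R tools before plotting. The sketch below assumes SCP and Seurat are installed; `group_cols` is an illustrative name.

```r
# Metadata columns used for grouping in the plot above.
group_cols <- c("CellType", "SubCellType")

if (requireNamespace("SCP", quietly = TRUE) &&
    requireNamespace("Seurat", quietly = TRUE)) {
  library(SCP)
  data("pancreas_sub")

  # Cross-tabulate coarse and fine cell type labels.
  print(table(pancreas_sub$CellType, pancreas_sub$SubCellType))

  # List the dimensional reductions available for the reduction argument.
  print(Seurat::Reductions(pancreas_sub))
}
```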
CellDimPlot(
  srt = pancreas_sub, group.by = "SubCellType", stat.by = "Phase",
  reduction = "UMAP", theme_use = "theme_blank"
)
<img src="man/figures/EDA-2.png" width="100%" style="display: block; margin: auto;" />
FeatureDimPlot(
  srt = pancreas_sub, features = c("Sox9", "Neurog3", "Fev", "Rbp4"),
  reduction = "UMAP", theme_use = "theme_blank"
)
<img src="man/figures/EDA-3.png" width="100%" style="display: block; margin: auto;" />
FeatureDimPlot(
  srt = pancreas_sub, features = c("Ins1", "Gcg", "Sst", "Ghrl"),
  compare_features = TRUE, label = TRUE, label_insitu = TRUE,
  reduction = "UMAP", theme_use = "theme_blank"
)
<img src="man/figures/EDA-4.png" width="100%" style="display: block; margin: auto;" />
ht <- GroupHeatmap(
  srt = pancreas_sub,
  features = c(
    "Sox9", "Anxa2", # Ductal
    "Neurog3", "Hes6", # EPs
    "Fev", "Neurod1", # Pre-endocrine
    "Rbp4", "Pyy", # Endocrine
    "Ins1", "Gcg", "Sst", "Ghrl" # Beta, Alpha, Delta, Epsilon
  ),
  group.by = c("CellType", "SubCellType"),
  heatmap_palette = "YlOrRd",
  cell_annotation = c("Phase", "G2M_score", "Cdh2"),
  cell_annotation_palette = c("Dark2", "Paired", "Paired"),
  show_row_names = TRUE, row_names_side = "left",
  add_dot = TRUE, add_reticle = TRUE
)
print(ht$plot)
<img src="man/figures/EDA-5.png" width="100%" style="display: block; margin: auto;" />
CellQC
pancreas_sub <- RunCellQC(srt = pancreas_sub)
CellDimPlot(srt = pancreas_sub, group.by = "CellQC", reduction = "UMAP")
<img src="man/figures/RunCellQC-1.png" width="100%" style="display: block; margin: auto;" />
CellStatPlot(srt = pancreas_sub, stat.by = "CellQC", group.by = "CellType", label = TRUE)
<img src="man/figures/RunCellQC-2.png" width="100%" style="display: block; margin: auto;" />
CellStatPlot(
  srt = pancreas_sub,
  stat.by = c(
    "db_qc", "outlier_qc", "umi_qc", "gene_qc",
    "mito_qc", "ribo_qc", "ribo_mito_ratio_qc", "species_qc"
  ),
  plot_type = "upset", stat_level = "Fail"
)
<img src="man/figures/RunCellQC-3.png" width="100%" style="display: block; margin: auto;" />
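The QC calls can also be summarised numerically from the object's metadata. The helper below is a hypothetical sketch, not an SCP function: `qc_summary` and the toy data frame are illustrative, and the "Pass" label is an assumption (only the "Fail" level appears in the calls above).

```r
# Hypothetical helper: count cells per QC status in a metadata data.frame.
qc_summary <- function(meta, qc_col = "CellQC") {
  stopifnot(qc_col %in% colnames(meta))
  counts <- table(meta[[qc_col]])
  data.frame(
    status = names(counts),
    n = as.integer(counts),
    fraction = as.numeric(counts) / nrow(meta)
  )
}

# Toy stand-in for pancreas_sub@meta.data; the values are made up for illustration.
toy_meta <- data.frame(CellQC = c("Pass", "Pass", "Fail", "Pass"))
qc_tab <- qc_summary(toy_meta)
print(qc_tab)
```

With the real object, the same helper could be called as qc_summary(pancreas_sub@meta.data) after RunCellQC has added the CellQC column.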
Standard pipeline
