# STQ

Spatial Transcriptomics Quantification pipeline for 10x Visium and H&E-stained whole slide images
Nextflow Pipeline for Visium and H&E Data Processing
Publication citation:
Domanskyi S, Srivastava A, Kaster J, Li H, Herlyn M, Rubinstein JC, Chuang JH. Nextflow pipeline for Visium and H&E data from patient-derived xenograft samples. Cell Rep Methods. 2024 May 20;4(5):100759. doi: 10.1016/j.crmeth.2024.100759. Epub 2024 Apr 15. PMID: 38626768; PMCID: PMC11133696.
Associated data and source code DOI:
- Domanskyi, S., Srivastava, A., Kaster, J., Li, H., Herlyn, M., Rubinstein, J. C., & Chuang, J. H. (2024). Nextflow Pipeline for Visium and H&E Data from Patient-Derived Xenograft Samples (v0.2.0). Zenodo. https://doi.org/10.5281/zenodo.10839655
- Domanskyi, S., Srivastava, A., Kaster, J., Li, H., Herlyn, M., Rubinstein, J. C., & Chuang, J. H. (2024). WM4237 TE histology images, H&E stain [Data set]. Zenodo. https://doi.org/10.5281/zenodo.12746982
- Domanskyi, S. (2024). Demo 10x Visium dataset for STQ [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10654467
- Overview
- Motivation
- Documentation
- Output
- Running the pipeline
- Tools used in the pipeline
- fastq-tools
- xenome
- spaceranger
- velocyto
- bafextract
- Inception v3
- CTransPath, UNI, CONCH, etc.
- HoVer-Net
- Stardist
- DeepFocus
- scanpy
- Nextflow pipeline data flow
- Nextflow pipeline resources
- Glossary of Terms
## Routes of analysis
<p> <img src="docs/route-map.png" width="800"/> </p>

The imaging sub-workflow is dedicated to H&E-stained image analysis and includes several distinct components:
<p> <img src="docs/STQ-imaging.svg" width="500"/> </p>

## Overview
This repository contains the source code of the Nextflow implementation, developed at The Jackson Laboratory, for processing 10x Visium Spatial Gene Expression data and full-resolution H&E-stained whole slide images (WSI). An overview of the pipeline is shown above. The primary input consists of compressed FASTQ files, reference FASTA files, and a full-resolution image of the 10x Visium slide sample. Additional required inputs include either pre-built Xenome indices or host and graft genome assemblies, mouse and human reference transcriptomes for read mapping, pre-trained deep learning model weights, and Singularity containers with the software tools.
## Motivation
Most of the steps implemented in our pipeline are computationally expensive and must be carried out on high-performance computing (HPC) systems. The most computationally intensive steps include RNA-seq read mapping, full-resolution image alignment, preprocessing for RNA-velocity calculation, preprocessing for RNA-based CNV inference, deep learning imaging feature extraction, and nuclear morphometrics data extraction. The pipeline generates a standardized set of files (see Section "Output") that can be used for downstream analysis with R-based Seurat, Python-based Scanpy, or any other available environment. The pipeline can also be used with a variety of stand-alone WSI formats to perform image conversion and to handle image QC, focus checking, stain normalization, nuclear segmentation, and feature extraction.
## Documentation
Descriptions of the pipeline components, parameters, analysis routes, and required resources, along with a configuration guide, are provided in this repository: see README.md, conf/README.md, and workflows/README.md.
## Running the pipeline
<details closed><summary>Animated workflow steps</summary><p>
</p></details>
### Demo data
Once the pipeline and all the prerequisite software are installed, the demo can be executed on a small dataset (https://zenodo.org/records/10654467). To get a copy of the data, modify savePath below to point to a meaningful location on your computing system and execute the lines. We recommend using an absolute path, since one is required to generate a proper samplesheet file. Then modify the STQ run.sh file so that the input samplesheet points to samplesheet_demo_local.csv. The demo run is a good test that the software is installed properly and takes approximately 30 minutes to complete.
```bash
savePath="/path/to/save/demodata"
cd $savePath
wget https://zenodo.org/records/10654467/files/SC2200092.tiff
wget https://zenodo.org/records/10654467/files/fastq.tar
tar -xvf fastq.tar
echo -e "sample,fastq,image,grid,roifile,mpp\nDemo_S1,${savePath}/fastq/,${savePath}/SC2200092.tiff,,,0.22075" > samplesheet_demo_local.csv
```
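For runs with more than one sample, the same samplesheet format extends to one row per sample. The sketch below is illustrative only: the sample names (Demo_S1, Demo_S2), the per-sample directory layout, and the mpp value are hypothetical placeholders to adapt to your own data.

```bash
# Build a samplesheet with one row per sample.
# Sample names, paths, and the 0.22075 microns-per-pixel value are illustrative.
savePath="/path/to/save/demodata"

echo "sample,fastq,image,grid,roifile,mpp" > samplesheet_local.csv
for sample in Demo_S1 Demo_S2; do
    echo "${sample},${savePath}/${sample}/fastq/,${savePath}/${sample}/image.tiff,,,0.22075" >> samplesheet_local.csv
done
```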
## Installation
- HPC environment with sufficient CPU, RAM, and storage resources

  Processing one sample requires approximately 100+ CPU hours of computing time. Some processes need 1 CPU, others 4 or 8 CPUs, as specified in the nextflow.config file. The pipeline requires roughly 250 GB of temporary storage per sample while it runs. For example, if 32 samples are processed simultaneously, about 8 TB of storage will be used until the pipeline completes.
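  The storage estimate scales linearly with the number of concurrently processed samples; assuming the ~250 GB per sample figure above, a quick shell calculation gives the total:

  ```bash
  # Estimate temporary storage for N concurrently processed samples,
  # assuming ~250 GB of scratch per sample.
  N_SAMPLES=32
  GB_PER_SAMPLE=250
  TOTAL_GB=$((N_SAMPLES * GB_PER_SAMPLE))
  echo "${TOTAL_GB} GB (~$((TOTAL_GB / 1000)) TB) of temporary storage"
  ```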
- Nextflow <img src="https://www.nextflow.io/img/nextflow2014_no-bg.png" height="30"/>

  ```bash
  # https://www.nextflow.io/docs/latest/getstarted.html#installation
  wget -qO- https://get.nextflow.io | bash
  chmod +x nextflow
  mv nextflow ~/bin
  ```

- Singularity <img src="https://docs.sylabs.io/guides/3.0/user-guide/_static/logo.png" height="30"/>

  https://docs.sylabs.io/guides/3.0/user-guide/installation.html

- Git (most likely it is already available on your system)

  ```bash
  # https://git-scm.com/
  # On Debian, git can be installed with: apt-get install -y git
  ```

- Get the pipeline source code (this repository)

  ```bash
  mkdir my-pipeline-run
  cd my-pipeline-run
  git clone https://github.com/TheJacksonLaboratory/STQ.git
  cd STQ
  ```
- Singularity software containers used in this pipeline
The Singularity containers used in our pipeline can be downloaded, or built from the definition (*.def) files and recipes contained in the assets directory.
<details closed><summary>Click to see the commands used to upload singularity-built containers to quay.io</summary><p>

Note: 10x Genomics requires that software containers with Space Ranger not be shared publicly. We provide an example definition file for building a Space Ranger container with Singularity, assets/container-singularity-spaceranger.def, which pulls a standard debian:buster-slim container from Docker and installs all necessary Linux libraries. After that, a copy of Space Ranger is downloaded and installed from the 10x Genomics download portal. To obtain a download link for a specific version of Space Ranger, navigate to https://www.10xgenomics.com/support/software/space-ranger/downloads, register, review and accept any required user agreements from 10x, and copy the download link. Next, paste the link into a copy of the .def file. Finally, build the container with any desired resource, for example https://cloud.sylabs.io/builder.
```bash
singularity remote login -u <user> docker://quay.io
singularity push /projects/chuang-lab/USERS/domans/containers/local/mamba-xenomake.sif oras://quay.io/jaxcompsci/xenomake:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-mamba-inception.sif oras://quay.io/jaxcompsci/inception:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/local/container-singularity-hovernet-py.sif oras://quay.io/jaxcompsci/hovernet:v2.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-singularity-stainnet.sif oras://quay.io/jaxcompsci/stainnet:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-singularity-staintools.sif oras://quay.io/jaxcompsci/staintools:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-singularity-vips.sif oras://quay.io/jaxcompsci/vips:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-singularity-fastqtools.sif oras://quay.io/jaxcompsci/fastqtools:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-singularity-bafextract.sif oras://quay.io/jaxcompsci/bafextract:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-singularity-velocyto.sif oras://quay.io/jaxcompsci/velocyto:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/container-singularity-python.sif oras://quay.io/jaxcompsci/pythonlow:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/deepfocus.sif oras://quay.io/jaxcompsci/deepfocus:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/local/ome.sif oras://quay.io/jaxcompsci/ome:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/local/mamba-timm.sif oras://quay.io/jaxcompsci/timm:v1.0.0
singularity push /projects/chuang-lab/USERS/domans/containers/hf-uni-conch.sif oras://quay.io/jaxcompsci/hfconch:v1.0.0
```
</p></details>
To download the containers for use with the pipeline, change to the desired download location and run the commands below:
```bash
singularity pull docker://quay.io/jaxcompsci/xenome:1.0.1
singularity pull oras://quay.io/jaxcompsci/xenomake:v1.0.0
singularity pull oras://quay.io/jaxcompsci/inception:v1.0.0
singularity pull oras://quay.io/jaxcompsci/hovernet:v2.0.0
singularity pull oras://quay.io/jaxcompsci/stainnet:v1.0.0
singularity pull oras://quay.io/jaxcompsci/staintools:v1.0.0
singularity pull oras://quay.io/jaxcompsci/vips:v1.0.0
singularity pull oras://quay.io/jaxcompsci/fastqtools:v1.0.0
singularity pull oras://quay.io/jaxcompsci/bafextract:v1.0.0
singularity pull oras://quay.io/jaxcompsci/velocyto:v1.0.0
singularity pull oras://quay.io/jaxcompsci/pythonlow:v1.0.0
singularity pull oras://quay.io/jaxcompsci/deepfocus:v1.0.0
singularity pull oras://quay.io/jaxcompsci/ome:v1.0.0
singularity pull oras://quay.io/jaxcompsci/timm:v1.0.0
singularity pull oras://quay.io/jaxcompsci/hfconch:v1.0.0
```
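Equivalently, the pulls can be scripted. The sketch below only prints each pull command (drop the echo to execute them, which requires Singularity and network access); the container list mirrors the quay.io/jaxcompsci containers pushed above.

```bash
# Print (or, without the echo, run) a singularity pull for each container.
containers="xenomake:v1.0.0 inception:v1.0.0 hovernet:v2.0.0 stainnet:v1.0.0 \
staintools:v1.0.0 vips:v1.0.0 fastqtools:v1.0.0 bafextract:v1.0.0 \
velocyto:v1.0.0 pythonlow:v1.0.0 deepfocus:v1.0.0 ome:v1.0.0 \
timm:v1.0.0 hfconch:v1.0.0"

for c in $containers; do
    echo singularity pull "oras://quay.io/jaxcompsci/${c}"
done
```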
