cSplotch
cSplotch is a hierarchical generative probabilistic model for analyzing Spatial Transcriptomics (ST) [1] data.
Features
- Supports complex hierarchical experimental designs and model-based analysis of replicates
- Full Bayesian inference with Hamiltonian Monte Carlo (HMC) using the adaptive HMC sampler as implemented in Stan [2]
- Analysis of expression differences between anatomical regions and conditions using posterior samples
- Different anatomically annotated regions are modelled using a linear model
- Zero-inflated Poisson or Poisson likelihood for counts
- Conditional autoregressive (CAR) prior for spatial random effect
- Ability to deconvolve gene expression into cell type-specific signatures using compositional data gathered from histology images
- Use single-cell/single-nuclear expression data to calculate priors over expression in each cell type
We support the original ST array design (1007 spots, a diameter of 100 μm, and a center-to-center distance of 200 μm) by Spatial Transcriptomics AB, as well as Visium Spatial Gene Expression Solution by 10x Genomics, Inc., interfacing directly with file formats output by Spaceranger and Loupe Browser.
The cSplotch code in this repository supports single-, two-, and three-level experimental designs. These three different hierarchical models are illustrated below:

Installation
Tested on Python 3.10
cSplotch has been tested on Mac and Linux. It has not been tested on Windows.
Installing cSplotch
The following command installs the cSplotch Python module:
$ pip install git+https://git@github.com/adaly/cSplotch.git
After installation, the user will have the executables splotch, splotch_prepare_count_files, splotch_generate_input_files, splotch_compile_lambdas and splotch_compile_betas.
For splotch_prepare_count_files and splotch_generate_input_files, the inputs are assumed to be in Visium v1/v2 format unless the -B/--hd-binning (Visium HD) or -S/--st-v1 (ST v1) flags are passed. We will discuss the differences in input format in the subsequent sections.
Installing CmdStan
CmdStan [2] can be installed as follows
$ STAN_VERSION=`curl -s https://api.github.com/repos/stan-dev/cmdstan/releases/latest | sed -n 's/.*"tag_name": "v\([^"]*\)",$/\1/p'`
$ cd $HOME
$ curl -LO https://github.com/stan-dev/cmdstan/releases/download/v"$STAN_VERSION"/cmdstan-"$STAN_VERSION".tar.gz
$ tar -xzvf cmdstan-"$STAN_VERSION".tar.gz
$ cd cmdstan-"$STAN_VERSION"
$ make build -j4
This will install CmdStan in the directory $HOME/cmdstan-$STAN_VERSION.
The latest CmdStan user guide can be found at https://github.com/stan-dev/cmdstan/releases.
Compiling cSplotch
The cSplotch Stan models splotch_stan_model.stan and comp_splotch_stan_model.stan can be compiled using CmdStan as follows
$ cd $HOME
$ cd cmdstan-"$STAN_VERSION"
$ make $HOME/cSplotch/stan/splotch_stan_model
$ make $HOME/cSplotch/stan/comp_splotch_stan_model
--- Translating Stan model to C++ code ---
⋮
Here we assume you have installed CmdStan in the directory $HOME/cmdstan-$STAN_VERSION and have the cSplotch code in the directory $HOME/cSplotch. Please change the paths if your environment differs.
After a successful compilation, you will have the binaries splotch_stan_model and comp_splotch_stan_model in the directory $HOME/cSplotch/stan
$ $HOME/cSplotch/stan/splotch_stan_model
$ $HOME/cSplotch/stan/comp_splotch_stan_model
Usage: [comp_]splotch_stan_model <arg1> <subarg1_1> ... <subarg1_m> ... <arg_n> <subarg_n_1> ... <subarg_n_m>
⋮
Failed to parse arguments, terminating Stan
Usage
The main steps of cSplotch analysis are the following:
- Preparation of count files
splotch_prepare_count_files
- Annotation of ST spots
- Annotation of cell types
- Preparation of metadata table
- Preparation of input data files for cSplotch
splotch_generate_input_files
- cSplotch analysis
splotch
- Summarizing cSplotch output
splotch_compile_lambdas, splotch_compile_betas
- Downstream analysis
Below we will describe these steps in detail.
Example data
In the directory examples, we have some example ST data [3]. We will use this example data set in this documentation to demonstrate the use of cSplotch.
Preparation of count files
The inputs to the count file preparation script differ depending on whether the user is supplying data from Visium/Visium HD or ST v1. Both cases are outlined below.
Visium/Visium HD count data
When working with data from the Visium platform, count data are expected in the form of the output of spaceranger count, which produces a structured directory containing count and metadata information for each sample.
Prior to downstream analysis, we must ensure that each array contains the same genes in the same index order. This is achieved by running splotch_prepare_count_files (the -V/--Visium flag is set by default) on all arrays to be included in the analysis. The script creates an additional Splotch-formatted count file (default: [ARRAY_NAME].unified.tsv.gz) at the top level of each spaceranger output directory.
$ splotch_prepare_count_files --help
usage: splotch_prepare_count_files [-h] -c COUNT_DATA [SPACERANGER_COUNT_DIRS ...]
[-s SUFFIX] [-d MINIMUM_DETECTION_RATE]
[-V] [-B NAME_OF_BINNING] [-S]
A script for preparing count files for cSplotch
optional arguments:
-h, --help show this help message and exit
-c COUNT_DATA [SPACERANGER_COUNT_DIRS ...], --count_data [SPACERANGER_COUNT_DIRS ...]
list of spaceranger count directories
-s SUFFIX, --suffix SUFFIX
suffix to be added to the sample name in creation of cSplotch count file within each spaceranger directory
(default is .unified.tsv.gz for Visium/STv1, .unified.hdf5 for Visium HD)
-d MINIMUM_DETECTION_RATE, --minimum_detection_rate MINIMUM_DETECTION_RATE
minimum detection rate (default is 0.02)
-V, --Visium data are from the Visium platform (default)
-B NAME_OF_BINNING, --hd-binning NAME_OF_BINNING
name of binning to use (Visium HD only); must be a directory within */outs/binned_outputs for all SPACERANGER_COUNT_DIRS
-S, --st-v1 data are from the STv1 platform (overrides -V and -B)
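The requirement that every array contains the same genes in the same index order can be pictured with the short Python sketch below. It is an illustration only and is not taken from the cSplotch source: the function name unify_genes, the dictionary layout, the toy gene names, and the choice to fill genes missing from an array with zeros are all assumptions.

```python
def unify_genes(arrays):
    """Give every array the same genes in the same sorted order.

    `arrays` maps array name -> {gene: [count per spot]} (hypothetical layout).
    Genes absent from an array are filled with zeros (an assumption).
    """
    # Union of the gene names seen in any array, in a fixed sorted order.
    all_genes = sorted(set().union(*(table.keys() for table in arrays.values())))
    unified = {}
    for name, table in arrays.items():
        n_spots = len(next(iter(table.values())))
        unified[name] = {g: table.get(g, [0] * n_spots) for g in all_genes}
    return all_genes, unified


# Toy data: two arrays with different gene sets and spot counts.
arrays = {
    "array1": {"GeneA": [0, 1], "GeneC": [1, 0]},
    "array2": {"GeneA": [2, 2, 2], "GeneB": [0, 0, 1]},
}
all_genes, unified = unify_genes(arrays)
print(all_genes)
print(unified["array1"]["GeneB"])
```

After this step each array can be indexed by gene position rather than by gene name, which is what a shared index order makes possible.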
ST v1 count data
Count data from the original ST workflow are fully represented by a single tab-separated value (TSV) file of the following format:
| | 32.06_2.04 | 31.16_2.04 | 14.07_2.1 | … | 28.16_33.01 |
|---------------|------------|------------|-----------|------------|-------------|
| A130010J15Rik | 0 | 0 | 0 | … | 0 |
| A230046K03Rik | 0 | 0 | 0 | … | 0 |
| A230050P20Rik | 0 | 0 | 0 | … | 0 |
| A2m | 0 | 1 | 0 | … | 0 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋱ | ⋮ |
| Zzz3 | 0 | 1 | 0 | … | 0 |
The rows and columns have gene identifiers and ST spot coordinates (in XCOORD_YCOORD format), respectively.
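To make the file layout concrete, the following Python sketch reads a table of this shape using only the standard library and splits each column name into its X and Y coordinates. It is a minimal illustration, not the parser used by cSplotch: the function name read_st_counts and the in-memory example are assumptions, and the sketch assumes the first header cell is empty and all counts are integers.

```python
import csv
import io

# Small in-memory stand-in for an ST v1 count file (tab-separated).
EXAMPLE_TSV = (
    "\t32.06_2.04\t31.16_2.04\t14.07_2.1\n"
    "A130010J15Rik\t0\t0\t0\n"
    "A2m\t0\t1\t0\n"
    "Zzz3\t0\t1\t0\n"
)


def read_st_counts(handle):
    """Return (genes, coords, counts) from a tab-separated ST count table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Skip the empty corner cell; the rest are XCOORD_YCOORD spot names.
    coords = [tuple(float(v) for v in name.split("_")) for name in header[1:]]
    genes, counts = [], []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        genes.append(row[0])
        counts.append([int(v) for v in row[1:]])
    return genes, coords, counts


genes, coords, counts = read_st_counts(io.StringIO(EXAMPLE_TSV))
print(genes)
print(coords[0])
print(counts[1])
```

For a real file, the same function could be given an open file handle instead of the io.StringIO object.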
splotch_prepare_count_files is applied in the same way as for Visium data, but with the -c/--count_data argument pointing to a list of count files in the above format, and with the -S/--st-v1 flag set.
Example
For instance, the following command prepares the count files located in examples/Count_Tables [3]
$ splotch_prepare_count_files -c examples/Count_Tables/*_stdata_aligned_counts_IDs.txt -S
INFO:root:Reading 10 count files
INFO:root:We have detected 18509 genes
INFO:root:We keep 11313 genes after discarding the lowly expressed genes (detected in less than 2.00% of the ST spots)
INFO:root:The median sequencing depth across the ST spots is 2389
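The log lines above report two quantities: the number of genes kept after discarding those detected in too few spots, and the median sequencing depth per spot. The Python sketch below shows one plausible way to compute both on toy data. It is not the cSplotch implementation: the function names, the toy counts, the use of ">=" at the threshold, and computing the depth after filtering are all assumptions.

```python
from statistics import median


def filter_genes(counts_by_gene, min_detection_rate=0.02):
    """Keep genes with a nonzero count in at least the given fraction of spots."""
    kept = {}
    for gene, counts in counts_by_gene.items():
        detected = sum(1 for c in counts if c > 0)
        if detected / len(counts) >= min_detection_rate:
            kept[gene] = counts
    return kept


def median_depth(counts_by_gene):
    """Median over spots of the total count in each spot."""
    per_spot_totals = [sum(column) for column in zip(*counts_by_gene.values())]
    return median(per_spot_totals)


# Toy data: 3 genes x 4 spots (a high threshold is used so the filter is visible).
toy = {
    "GeneA": [0, 0, 0, 0],
    "GeneB": [0, 3, 0, 1],
    "GeneC": [5, 2, 7, 4],
}
kept = filter_genes(toy, min_detection_rate=0.5)
print(sorted(kept))
print(median_depth(kept))
```

With the default of 0.02 used by the -d/--minimum_detection_rate option, a gene would need a nonzero count in at least 2% of the spots to be kept under this reading of the threshold.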
Annotation of ST spots
To get the most out of the statistical model of cSplotch one has to annotate the ST spots based on their tissue context. These annotations will allow the model to share information across tissue sections, resulting in more robust conclusions.
Visium ST annotations
10x Genomics have provided a tool for the exploration and annotation of Visium/Visium HD data called Loupe.
When working with Visium data, we expect annotation files in the CSV format exported by Loupe Browser:
| Barcode | Label |
|--------------------|----------------|
| AAACACCAATAACTGC-1 | Vent_Med_White |
| AAACATGGTGAGAGGA-1 | Vent_Horn |
| AAACATTTCCCGGATT-1 | Lat_Edge |
| AAACCTAAGCAGCCGG-1
