cSplotch

cSplotch is a hierarchical generative probabilistic model for analyzing Spatial Transcriptomics (ST) [1] data.

Features

  • Supports complex hierarchical experimental designs and model-based analysis of replicates
  • Full Bayesian inference with Hamiltonian Monte Carlo (HMC) using the adaptive HMC sampler as implemented in Stan [2]
  • Analysis of expression differences between anatomical regions and conditions using posterior samples
  • Different annotated anatomical regions are modelled using a linear model
  • Zero-inflated Poisson or Poisson likelihood for counts
  • Conditional autoregressive (CAR) prior for spatial random effect
  • Ability to deconvolve gene expression into cell type-specific signatures using compositional data gathered from histology images
  • Use single-cell/single-nuclear expression data to calculate priors over expression in each cell type
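To make the count likelihoods listed above concrete, the following is a minimal sketch of a zero-inflated Poisson log-probability for a single count. The function names `poisson_logpmf` and `zip_logpmf` and the parameters `rate` and `theta` (the probability of an excess zero) are illustrative assumptions; the parametrization used in the cSplotch Stan models may differ.

```python
import math


def poisson_logpmf(count: int, rate: float) -> float:
    """Log-probability of `count` under a Poisson distribution with mean `rate` (> 0)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def zip_logpmf(count: int, rate: float, theta: float) -> float:
    """Log-probability of `count` under a zero-inflated Poisson.

    With probability `theta` (0 <= theta < 1) the observation is an excess zero;
    otherwise it is drawn from Poisson(rate).
    """
    if count == 0:
        # A zero can come from the inflation component or from the Poisson itself.
        return math.log(theta + (1.0 - theta) * math.exp(-rate))
    # A non-zero count can only come from the Poisson component.
    return math.log1p(-theta) + poisson_logpmf(count, rate)
```

Setting `theta` to 0 recovers the plain Poisson likelihood, which mirrors the choice between the two likelihoods offered in the feature list.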

We support the original ST array design (1007 spots, a diameter of 100 μm, and a center-to-center distance of 200 μm) by Spatial Transcriptomics AB, as well as Visium Spatial Gene Expression Solution by 10x Genomics, Inc., interfacing directly with file formats output by Spaceranger and Loupe Browser.

The cSplotch code in this repository supports single-, two-, and three-level experimental designs. These three different hierarchical models are illustrated below:

[Figure: Hierarchical models]

Installation

Tested on Python 3.10

cSplotch has been tested on Mac and Linux. It has not been tested on Windows.

Installing cSplotch

The following command installs the cSplotch Python module:

$ pip install git+https://git@github.com/adaly/cSplotch.git

This installs the executables splotch, splotch_prepare_count_files, splotch_generate_input_files, splotch_compile_lambdas, and splotch_compile_betas.

For splotch_prepare_count_files and splotch_generate_input_files, the inputs are assumed to be in Visium v1/v2 format unless the -B/--hd-binning (Visium HD) or -S/--st-v1 (ST v1) flags are passed. We will discuss the differences in input format in the subsequent sections.

Installing CmdStan

CmdStan [2] can be installed as follows:

$ STAN_VERSION=`curl -s https://api.github.com/repos/stan-dev/cmdstan/releases/latest | sed -n 's/.*"tag_name": "v\([^"]*\)",$/\1/p'`
$ cd $HOME
$ curl -LO https://github.com/stan-dev/cmdstan/releases/download/v"$STAN_VERSION"/cmdstan-"$STAN_VERSION".tar.gz
$ tar -xzvf cmdstan-"$STAN_VERSION".tar.gz
$ cd cmdstan-"$STAN_VERSION"
$ make build -j4

This will install CmdStan in the directory $HOME/cmdstan-$STAN_VERSION.

The latest CmdStan user guide can be found at https://github.com/stan-dev/cmdstan/releases.

Compiling cSplotch

The cSplotch Stan models splotch_stan_model.stan and comp_splotch_stan_model.stan can be compiled using CmdStan as follows:

$ cd $HOME
$ cd cmdstan-"$STAN_VERSION"
$ make $HOME/cSplotch/stan/splotch_stan_model
$ make $HOME/cSplotch/stan/comp_splotch_stan_model

--- Translating Stan model to C++ code ---
⋮

Here we assume you have installed CmdStan in the directory $HOME/cmdstan-$STAN_VERSION and have the cSplotch code in the directory $HOME/cSplotch. Please change the paths if your environment differs.

After a successful compilation, you will have the binaries splotch_stan_model and comp_splotch_stan_model in the directory $HOME/cSplotch/stan. Running either binary without arguments prints its usage message:

$ $HOME/cSplotch/stan/splotch_stan_model
$ $HOME/cSplotch/stan/comp_splotch_stan_model
Usage: [comp_]splotch_stan_model <arg1> <subarg1_1> ... <subarg1_m> ... <arg_n> <subarg_n_1> ... <subarg_n_m>
⋮
Failed to parse arguments, terminating Stan

Usage

The main steps of cSplotch analysis are the following:

  1. Preparation of count files
    • splotch_prepare_count_files
  2. Annotation of ST spots
  3. Annotation of cell types
  4. Preparation of metadata table
  5. Preparation of input data files for cSplotch
    • splotch_generate_input_files
  6. cSplotch analysis
    • splotch
  7. Summarizing cSplotch output
    • splotch_compile_lambdas
    • splotch_compile_betas
  8. Downstream analysis

Below we will describe these steps in detail.

Example data

The examples directory contains example ST data [3]. We use this data set throughout this documentation to demonstrate the use of cSplotch.

Preparation of count files

The inputs to the count file preparation script differ depending on whether the user is supplying data from Visium/Visium HD or ST v1. Both cases are outlined below.

Visium/Visium HD count data

When working with data from the Visium platform, count data are expected in the form of the output of spaceranger count, which produces a structured directory containing count and metadata information for each sample.

Prior to downstream analysis, we must ensure that each array contains the same genes in the same index order. This is achieved by running splotch_prepare_count_files (the -V/--Visium flag is set by default) on all arrays to be included in the analysis. The script creates an additional Splotch-formatted count file (default: [ARRAY_NAME].unified.tsv.gz) within the top level of each spaceranger output directory.
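The idea of giving every array the same genes in the same order can be sketched as follows. This is an illustration only, not the code of splotch_prepare_count_files: the function name `unify_gene_order`, the in-memory layout (array name mapped to a dict of gene to per-spot counts), and the choice to take the sorted union of genes and fill missing genes with zeros are all assumptions.

```python
from typing import Dict, List

# Hypothetical layout: gene identifier -> list of counts, one per spot.
CountTable = Dict[str, List[int]]


def unify_gene_order(arrays: Dict[str, CountTable]) -> Dict[str, CountTable]:
    """Return copies of the arrays that share one gene list in one order."""
    # Assumption: use the union of genes seen in any array, sorted by name.
    all_genes = sorted(set().union(*(table.keys() for table in arrays.values())))
    unified = {}
    for name, table in arrays.items():
        n_spots = len(next(iter(table.values()))) if table else 0
        # Dicts preserve insertion order, so every array lists genes identically;
        # genes absent from an array are filled with zero counts.
        unified[name] = {gene: table.get(gene, [0] * n_spots) for gene in all_genes}
    return unified
```

For example, `unify_gene_order({"a": {"Zzz3": [0, 1], "A2m": [1, 0]}, "b": {"A2m": [2, 0, 1]}})` yields two tables that both list `A2m` followed by `Zzz3`.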

$ splotch_prepare_count_files --help

usage: splotch_prepare_count_files [-h] -c COUNT_DATA [SPACERANGER_COUNT_DIRS ...]
                                   [-s SUFFIX] [-d MINIMUM_DETECTION_RATE]
                                   [-V] [-B NAME_OF_BINNING] [-S]

A script for preparing count files for cSplotch

optional arguments:
-h, --help              show this help message and exit
-c COUNT_DATA [SPACERANGER_COUNT_DIRS ...], --count_data [SPACERANGER_COUNT_DIRS ...]
                        list of spaceranger count directories
-s SUFFIX, --suffix SUFFIX
                        suffix to be added to the sample name in creation of cSplotch count file within each spaceranger directory
                        (default is .unified.tsv.gz for Visium/STv1, .unified.hdf5 for Visium HD)
-d MINIMUM_DETECTION_RATE, --minimum_detection_rate MINIMUM_DETECTION_RATE
                        minimum detection rate (default is 0.02)
-V, --Visium            data are from the Visium platform (default)
-B NAME_OF_BINNING, --hd-binning NAME_OF_BINNING
                        name of binning to use (Visium HD only); must be a directory within */outs/binned_outputs for all SPACERANGER_COUNT_DIRS
-S, --st-v1             data are from the STv1 platform (overrides -V and -B)  

ST v1 count data

Count data from the original ST workflow are fully represented by a single tab-separated value (TSV) file of the following format:

| | 32.06_2.04 | 31.16_2.04 | 14.07_2.1 | … | 28.16_33.01 |
|---------------|------------|------------|-----------|------------|-------------|
| A130010J15Rik | 0 | 0 | 0 | … | 0 |
| A230046K03Rik | 0 | 0 | 0 | … | 0 |
| A230050P20Rik | 0 | 0 | 0 | … | 0 |
| A2m | 0 | 1 | 0 | … | 0 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋱ | ⋮ |
| Zzz3 | 0 | 1 | 0 | … | 0 |

The rows and columns have gene identifiers and ST spot coordinates (in XCOORD_YCOORD format), respectively.
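As a rough sketch of how a file in this layout could be read, the snippet below parses the header into spot coordinates and each row into a gene's counts. The function name `read_st_counts` is hypothetical, and the code assumes that the header row starts with an empty cell (as in the table above) and that all counts are written as integers.

```python
import csv
import io


def read_st_counts(handle):
    """Parse an ST v1 count table: genes in rows, XCOORD_YCOORD spots in columns."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Skip the first header cell (the empty corner above the gene column) and
    # split each remaining column name into its x and y coordinates.
    spots = [tuple(float(v) for v in name.split("_")) for name in header[1:]]
    # Map each gene identifier to its counts, in the same order as `spots`.
    counts = {row[0]: [int(v) for v in row[1:]] for row in reader if row}
    return spots, counts


# A two-spot, two-gene excerpt in the format shown above.
example = io.StringIO(
    "\t32.06_2.04\t31.16_2.04\n"
    "A2m\t0\t1\n"
    "Zzz3\t0\t1\n"
)
spots, counts = read_st_counts(example)
```

To read a real file, an open file handle can be passed in place of the `io.StringIO` object.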

splotch_prepare_count_files is applied in the same way as for Visium data, but with the -c/--count_data argument pointing to a list of count files in the above format, and with the -S/--st-v1 flag set.

Example

For instance, the following command prepares the count files located in examples/Count_Tables [3]:

$ splotch_prepare_count_files -c examples/Count_Tables/*_stdata_aligned_counts_IDs.txt -S
INFO:root:Reading 10 count files
INFO:root:We have detected 18509 genes
INFO:root:We keep 11313 genes after discarding the lowly expressed genes (detected in less than 2.00% of the ST spots)
INFO:root:The median sequencing depth across the ST spots is 2389
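The gene filtering and sequencing depth reported in this log can be illustrated with a short sketch. The function names and the in-memory layout (gene mapped to a list of per-spot counts) are assumptions, as is pooling all spots into one table; the sketch is not the script's actual implementation, and whether the script computes the median depth before or after filtering is not stated here.

```python
from statistics import median


def filter_by_detection_rate(counts, min_rate=0.02):
    """Keep genes detected (count > 0) in at least `min_rate` of the spots."""
    kept = {}
    for gene, values in counts.items():
        detected = sum(1 for v in values if v > 0)
        # Genes detected in less than `min_rate` of the spots are discarded.
        if values and detected / len(values) >= min_rate:
            kept[gene] = values
    return kept


def median_sequencing_depth(counts):
    """Median over spots of the total count per spot."""
    # zip(*...) turns the per-gene rows into per-spot columns.
    per_spot_totals = [sum(column) for column in zip(*counts.values())]
    return median(per_spot_totals)
```

The default `min_rate=0.02` matches the default of the -d/--minimum_detection_rate option shown in the help text.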

Annotation of ST spots

To get the most out of the statistical model of cSplotch one has to annotate the ST spots based on their tissue context. These annotations will allow the model to share information across tissue sections, resulting in more robust conclusions.

Visium ST annotations

10x Genomics provides a tool called Loupe for the exploration and annotation of Visium/Visium HD data.

When working with Visium data, we expect annotation files in the CSV format exported by Loupe Browser:

| Barcode | Label |
|--------------------|----------------|
| AAACACCAATAACTGC-1 | Vent_Med_White |
| AAACATGGTGAGAGGA-1 | Vent_Horn |
| AAACATTTCCCGGATT-1 | Lat_Edge |
| AAACCTAAGCAGCCGG-1
