PULSAR: a Foundation Model for Multi-scale and Multicellular Biology


PULSAR (Patient Understanding Leveraging Single-cell universAl Representation) is a multi-scale, multicellular foundation model that integrates information from genes to cells to multicellular systems. It bridges massive scRNA-seq datasets with clinical phenotypes for human peripheral immunity and is trained via self-supervision on 36.2 million cells from 6,807 donors.

<!-- insert image PULSAR-release-dev/assets/cover.png -->

| Preprint |

Installation

  • We use uv to manage virtual environments and dependencies. Refer to the uv documentation to install uv.
  • Then use uv to create a virtual environment and install dependencies:
uv sync # creates the virtual environment and installs dependencies
uv pip install -e . # installs the package in editable mode

Usage

Refer to the Examples section below for notebooks that demonstrate how to use PULSAR for various downstream tasks. In brief, you can load a pre-trained PULSAR model as follows:

from pulsar.model import PULSAR
model = PULSAR.from_pretrained("KuanP/PULSAR-pbmc")

We also provide utilities to extract donor embeddings from single-cell data in H5AD format, as follows:

from pulsar.utils import extract_donor_embeddings_from_h5ad
donor_embeddings = extract_donor_embeddings_from_h5ad(
    h5ad_path="path_to_your_h5ad_file.h5ad",
    model=model,
    donor_id_key="donor_id_column_in_obs",
)

This function returns a dictionary mapping donor IDs to their corresponding PULSAR embeddings. Use donor_id_key to specify the name of the .obs column that contains the donor IDs.

Note that this function requires cell-level embeddings to already be stored in the .obsm of the H5AD file. A pipeline for extracting UCE embeddings can be found here.
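Because the result is a dictionary keyed by donor ID, a common next step is to stack it into a single matrix with a fixed donor order. The following is a minimal sketch, not part of the PULSAR API: the dictionary is a synthetic stand-in, the donor IDs and the embedding dimension (8) are made up, and it assumes each value can be converted to a 1-D NumPy vector.

```python
import numpy as np

# Synthetic stand-in for the dictionary returned by
# extract_donor_embeddings_from_h5ad. Donor IDs and the embedding
# dimension (8) are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)
donor_embeddings = {f"donor_{i}": rng.normal(size=8) for i in range(5)}

# Fix an explicit donor order so that rows can be matched to metadata later.
donor_ids = sorted(donor_embeddings)

# Assumption: every value converts cleanly to a 1-D float vector.
embedding_matrix = np.stack(
    [np.asarray(donor_embeddings[d], dtype=np.float32) for d in donor_ids]
)

print(embedding_matrix.shape)  # (n_donors, embedding_dim)
```

Keeping donor_ids alongside the matrix makes it straightforward to join the rows with donor-level labels such as age or disease status.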

Examples

| Notebook | Description |
|----------|-------------|
| Zero-shot age regression | Demonstrates age regression using zero-shot PULSAR embeddings with the subsampled OneK1K dataset. |
| Zero-shot disease classification | Demonstrates lupus disease classification using zero-shot PULSAR embeddings with the subsampled Lupus dataset. |
| Searching donor embeddings | Demonstrates searching donors using PULSAR embeddings against DONORxEMBED. |

Data used for the examples can be downloaded from here.
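For orientation, zero-shot evaluation of this kind is often done by fitting a simple linear probe on frozen donor embeddings. The sketch below shows one such probe, a closed-form ridge regression in NumPy, on synthetic data. It is an assumption about a typical setup and not necessarily what the notebooks do; the function fit_ridge, the embedding dimension, and the simulated ages are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weight vector.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

# Synthetic stand-ins: 200 donors with 16-dimensional embeddings and a
# simulated "age" that depends linearly on the embedding plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
true_w = rng.normal(size=16)
age = 50 + 3 * (X @ true_w) + rng.normal(scale=1.0, size=200)

# Fit on the first 150 donors and evaluate on the remaining 50.
w, b = fit_ridge(X[:150], age[:150], alpha=1.0)
pred = X[150:] @ w + b
pearson_r = np.corrcoef(pred, age[150:])[0, 1]
print(f"held-out Pearson r: {pearson_r:.3f}")
```

The embeddings stay fixed throughout; only the small linear model is trained, which is what makes the evaluation a probe of the representation and not of a fine-tuned model.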

Model weights

| Model | Description | Parameters | Context Length | Download |
|-------|-------------|------------|----------------|----------|
| PULSAR-pbmc | Continually pre-trained on 8.8M PBMC data from 2,588 donors, best for PBMC-related tasks | 87.4M | 1024 | 🤗 HuggingFace |
| PULSAR-aligned | Aligned version of PULSAR-pbmc using disease labels | 87.4M | 1024 | 🤗 HuggingFace |

Model weights are directly loadable via the transformers library, for example:

from pulsar.model import PULSAR
model = PULSAR.from_pretrained("KuanP/PULSAR-pbmc")

DONORxEMBED Datasets

We release the DONORxEMBED datasets for both zero-shot and aligned PULSAR. An example of loading the datasets can be found here.

| Dataset | Download |
|---------|----------|
| PULSAR_DONORxEMBED_zero_shot | 🤗 HuggingFace |
| PULSAR_DONORxEMBED_aligned | 🤗 HuggingFace |
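Once a reference set of donor embeddings is loaded as a matrix, searching it for donors similar to a query can be sketched as a cosine-similarity lookup. This is a minimal illustration under stated assumptions and not the repository's own search code: the reference matrix is random, top_k_similar is a hypothetical helper, and the donor IDs and dimension (16) are made up.

```python
import numpy as np

def top_k_similar(query, reference, reference_ids, k=3):
    """Return the k reference donors most cosine-similar to the query."""
    # Normalize rows so that a dot product equals cosine similarity.
    ref_norm = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    q_norm = query / np.linalg.norm(query)
    scores = ref_norm @ q_norm
    order = np.argsort(-scores)[:k]  # indices of the highest scores first
    return [(reference_ids[i], float(scores[i])) for i in order]

# Random stand-in for a reference embedding table (100 donors, 16 dims).
rng = np.random.default_rng(0)
reference = rng.normal(size=(100, 16))
reference_ids = [f"ref_donor_{i}" for i in range(100)]

# Query: a slightly perturbed copy of reference donor 42.
query = reference[42] + 0.01 * rng.normal(size=16)

hits = top_k_similar(query, reference, reference_ids, k=3)
for donor_id, score in hits:
    print(donor_id, round(score, 3))
```

For reference sets much larger than this toy example, an approximate nearest-neighbor index may be preferable to the exhaustive dot product used here.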

Acknowledgements

We sincerely thank the authors of the following open-source projects:

Cite Us

@article {pang2025pulsar,
	author = {Pang, Kuan and Rosen, Yanay and Kedzierska, Kasia and He, Ziyuan and Rajagopal, Abhe and Gustafson, Claire E and Huynh, Grace and Leskovec, Jure},
	title = {PULSAR: a Foundation Model for Multi-scale and Multicellular Biology},
	elocation-id = {2025.11.24.685470},
	year = {2025},
	doi = {10.1101/2025.11.24.685470},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2025/11/26/2025.11.24.685470},
	eprint = {https://www.biorxiv.org/content/early/2025/11/26/2025.11.24.685470.full.pdf},
	journal = {bioRxiv}
}
