PULSAR: a Foundation Model for Multi-scale and Multicellular Biology

PULSAR (Patient Understanding Leveraging Single-cell universAl Representation) is a multi-scale, multicellular foundation model that integrates information from genes to cells to multicellular systems. PULSAR bridges massive scRNA-seq datasets with clinical phenotypes for human peripheral immunity, trained via self-supervision on 36.2 million cells from 6,807 donors.

| Preprint |

Installation

We use uv to manage virtual environments and dependencies. Refer to the uv documentation to install uv.
Then use uv to create a virtual environment and install dependencies:

uv sync # creates the virtual environment and installs dependencies
uv pip install -e . # installs the package in editable mode

Usage

Refer to the Examples section below for notebooks that demonstrate how to use PULSAR for various downstream tasks. In brief, you can load a pre-trained PULSAR model as follows:

from pulsar.model import PULSAR
model = PULSAR.from_pretrained("KuanP/PULSAR-pbmc")

We also provide utilities to extract donor embeddings from single-cell data in H5AD format, as follows:

from pulsar.utils import extract_donor_embeddings_from_h5ad
donor_embeddings = extract_donor_embeddings_from_h5ad(
    h5ad_path="path_to_your_h5ad_file.h5ad",
    model=model,
    donor_id_key="donor_id_column_in_obs",
)

This function returns a dictionary mapping donor IDs to their corresponding PULSAR embeddings. Use donor_id_key to specify the column in .obs that contains the donor IDs.
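Because the result is a plain dictionary, a common next step is to line the embeddings up with per-donor labels before fitting a downstream model. The following is a minimal sketch, not part of the PULSAR package: `align_embeddings_with_labels` is a hypothetical helper, the toy dictionaries stand in for real outputs, and it assumes each embedding is a one-dimensional sequence of numbers.

```python
from typing import Dict, List, Sequence, Tuple


def align_embeddings_with_labels(
    donor_embeddings: Dict[str, Sequence[float]],
    donor_labels: Dict[str, float],
) -> Tuple[List[str], List[List[float]], List[float]]:
    """Build row-aligned features and labels for donors present in both inputs.

    Hypothetical helper for illustration; it is not part of the PULSAR package.
    """
    # Sort the shared donor IDs so the row order is reproducible.
    shared_ids = sorted(set(donor_embeddings) & set(donor_labels))
    features = [[float(v) for v in donor_embeddings[d]] for d in shared_ids]
    labels = [donor_labels[d] for d in shared_ids]
    return shared_ids, features, labels


# Toy stand-in for the dictionary returned by extract_donor_embeddings_from_h5ad;
# real embeddings have many more dimensions.
toy_embeddings = {
    "donor_b": [0.3, 0.1, 0.5],
    "donor_a": [0.9, 0.2, 0.4],
    "donor_c": [0.0, 0.7, 0.6],
}
toy_ages = {"donor_a": 34.0, "donor_b": 61.0}  # donor_c has no label and is dropped

ids, X, y = align_embeddings_with_labels(toy_embeddings, toy_ages)
print(ids)
```

The returned `X` and `y` are plain lists in matching row order, so they can be handed to whichever regression or classification library you prefer.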

Note that this function requires cell-level embeddings to already be stored in .obsm of the H5AD file. A pipeline for extracting UCE embeddings can be found here.
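One way to catch a missing prerequisite early is to inspect the loaded object before extracting donor embeddings. This is a rough sketch under stated assumptions: `find_missing_inputs` is a hypothetical helper, the `.obsm` key name `"X_uce"` is made up for illustration (check which key your embedding pipeline actually writes), and a small stand-in object replaces a real AnnData so the snippet runs without any data.

```python
from types import SimpleNamespace
from typing import Any, List


def find_missing_inputs(adata: Any, embedding_key: str, donor_id_key: str) -> List[str]:
    """Return a list of problems; an empty list means the inputs look ready.

    Hypothetical helper for illustration; it only reads .obsm and .obs.columns.
    """
    problems = []
    if embedding_key not in adata.obsm:
        problems.append(f"missing cell embeddings: .obsm[{embedding_key!r}]")
    if donor_id_key not in adata.obs.columns:
        problems.append(f"missing donor IDs: .obs[{donor_id_key!r}]")
    return problems


# Stand-in exposing only the two attributes the check reads. With real data you
# would pass the object returned by anndata.read_h5ad(...) instead.
toy_adata = SimpleNamespace(
    obsm={},  # no cell-level embeddings stored yet
    obs=SimpleNamespace(columns=["donor_id", "cell_type"]),
)

# "X_uce" is an assumed key name, not one confirmed by the PULSAR documentation.
problems = find_missing_inputs(toy_adata, embedding_key="X_uce", donor_id_key="donor_id")
print(problems)
```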

Examples

| Notebook | Description |
|---------|----------|
| Zero-shot age regression | Demonstrates age regression using zero-shot PULSAR embeddings with subsampled OneK1K dataset. |
| Zero-shot disease classification | Demonstrates lupus disease classification using zero-shot PULSAR embeddings (using subsampled Lupus dataset). |
| Searching donor embeddings | Demonstrates searching donors using PULSAR embeddings against DONORxEMBED. |

Data used for the examples can be downloaded from here.
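To give a feel for what searching donor embeddings involves, here is a small self-contained sketch of nearest-neighbor lookup. It is not taken from the notebook: the choice of cosine similarity is an assumption, `search_donors` is a hypothetical helper, and the three-dimensional vectors are toy data rather than real DONORxEMBED entries.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_donors(
    query: Sequence[float],
    reference: Dict[str, Sequence[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the top_k reference donors ranked by similarity to the query.

    Hypothetical helper; the metric used in the actual notebook may differ.
    """
    scored = [(donor_id, cosine_similarity(query, emb)) for donor_id, emb in reference.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy reference set standing in for a table of donor embeddings.
reference = {
    "ref_1": [1.0, 0.0, 0.0],
    "ref_2": [0.0, 1.0, 0.0],
    "ref_3": [0.6, 0.8, 0.0],
}
hits = search_donors([1.0, 0.0, 0.0], reference, top_k=2)
for donor_id, score in hits:
    print(donor_id, round(score, 3))
```

A linear scan like this is easy to follow; for a reference set with thousands of donors, a vectorized or indexed search would be the more practical choice.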

Model weights

| Model | Description | Parameters | Context Length | Download |
|-------|-------------|------------|----------------|----------|
| PULSAR-pbmc | Continually pre-trained on 8.8M PBMC data from 2,588 donors, best for PBMC-related tasks | 87.4M | 1024 | 🤗 HuggingFace |
| PULSAR-aligned | Aligned version of PULSAR-pbmc using disease labels | 87.4M | 1024 | 🤗 HuggingFace |

Model weights are directly loadable via the transformers library, for example:

from pulsar.model import PULSAR
model = PULSAR.from_pretrained("KuanP/PULSAR-pbmc")

DONORxEMBED Datasets

We release the DONORxEMBED datasets for both zero-shot and aligned PULSAR. An example of loading the datasets can be found here.

| Dataset | Download |
|---------|----------|
| PULSAR_DONORxEMBED_zero_shot | 🤗 HuggingFace |
| PULSAR_DONORxEMBED_aligned | 🤗 HuggingFace |

Acknowledgements

We sincerely thank the authors of the following open-source projects:

Cite Us

@article {pang2025pulsar,
	author = {Pang, Kuan and Rosen, Yanay and Kedzierska, Kasia and He, Ziyuan and Rajagopal, Abhe and Gustafson, Claire E and Huynh, Grace and Leskovec, Jure},
	title = {PULSAR: a Foundation Model for Multi-scale and Multicellular Biology},
	elocation-id = {2025.11.24.685470},
	year = {2025},
	doi = {10.1101/2025.11.24.685470},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2025/11/26/2025.11.24.685470},
	eprint = {https://www.biorxiv.org/content/early/2025/11/26/2025.11.24.685470.full.pdf},
	journal = {bioRxiv}
}
