PULSAR
PULSAR: a Foundation Model for Multi-scale and Multicellular Biology
PULSAR (Patient Understanding Leveraging Single-cell universAl Representation) is a multi-scale, multicellular foundation model that integrates information from genes to cells to multicellular systems. PULSAR bridges massive scRNA-seq datasets with clinical phenotypes for human peripheral immunity, trained via self-supervision on 36.2 million cells from 6,807 donors.
<!-- insert image PULSAR-release-dev/assets/cover.png -->
| Preprint |
Installation
- We use `uv` to manage virtual environments and dependencies. Refer to the uv documentation to install `uv`.
- Then use `uv` to create a virtual environment and install dependencies:
uv sync # create venv
uv pip install -e . # installs the package in editable mode
Usage
Refer to the Examples section below for notebooks demonstrating how to use PULSAR for various downstream tasks. In brief, you can load a pre-trained PULSAR model as follows:
from pulsar.model import PULSAR
model = PULSAR.from_pretrained("KuanP/PULSAR-pbmc")
We also provide utilities to extract donor embeddings from single-cell data in H5AD format, as follows:
from pulsar.utils import extract_donor_embeddings_from_h5ad
donor_embeddings = extract_donor_embeddings_from_h5ad(
    h5ad_path="path_to_your_h5ad_file.h5ad",
    model=model,
    donor_id_key="donor_id_column_in_obs",
)
This function returns a dictionary mapping donor IDs to their corresponding PULSAR embeddings. Use `donor_id_key` to specify the column in `.obs` that contains the donor IDs.
Note that this function requires cell-level embeddings to be stored in the H5AD file's `.obsm` first. A pipeline for extracting UCE embeddings can be found here.
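Downstream tools usually expect an ordered list of donor IDs plus a matching matrix of embeddings rather than a dictionary. The snippet below is a minimal sketch of that conversion. It assumes each dictionary value is a 1-D sequence of floats (the exact array type returned by the function is not specified here), and `embeddings_to_rows` and the toy values are hypothetical names and data introduced only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def embeddings_to_rows(
    donor_embeddings: Dict[str, Sequence[float]],
) -> Tuple[List[str], List[List[float]]]:
    """Return sorted donor IDs and a row-aligned list of embedding vectors.

    Hypothetical helper, not part of the pulsar package.
    """
    donor_ids = sorted(donor_embeddings)
    rows = [[float(x) for x in donor_embeddings[d]] for d in donor_ids]
    # All donors should share one embedding size; fail early if they do not.
    dims = {len(r) for r in rows}
    if len(dims) > 1:
        raise ValueError(f"inconsistent embedding sizes: {sorted(dims)}")
    return donor_ids, rows


# Toy stand-in for the dictionary returned above (values are made up).
toy = {"donor_b": [0.0, 1.0, 2.0], "donor_a": [3.0, 4.0, 5.0]}
donor_ids, rows = embeddings_to_rows(toy)
```

Sorting the IDs gives a stable row order, so row `i` of `rows` can be joined against donor-level metadata (age, disease label) keyed by `donor_ids[i]`.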
Examples
| Notebook | Description |
|---------|----------|
| Zero-shot age regression | Demonstrates age regression using zero-shot PULSAR embeddings with a subsampled OneK1K dataset. |
| Zero-shot disease classification | Demonstrates lupus disease classification using zero-shot PULSAR embeddings with a subsampled Lupus dataset. |
| Searching donor embeddings | Demonstrates searching donors using PULSAR embeddings against DONORxEMBED. |
Data used for the examples can be downloaded from here.
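To give a feel for what searching donor embeddings involves, here is a minimal sketch of a nearest-neighbour lookup. It is not the notebook's implementation: cosine similarity is assumed as the metric, the reference set is a toy dictionary rather than DONORxEMBED, and `cosine_similarity` and `search_donors` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_donors(
    query: Sequence[float],
    reference: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k reference donors most similar to the query embedding."""
    scored = [
        (donor_id, cosine_similarity(query, vec))
        for donor_id, vec in reference.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy reference embeddings (made-up values, three dimensions for readability).
reference = {
    "ref_1": [1.0, 0.0, 0.0],
    "ref_2": [0.7, 0.7, 0.0],
    "ref_3": [0.0, 1.0, 0.0],
}
hits = search_donors([1.0, 0.05, 0.0], reference, top_k=2)
```

For reference sets with many donors, a vectorised or approximate nearest-neighbour index would be the more practical choice; the pure-Python loop here only illustrates the ranking logic.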
Model weights
| Model | Description | Parameters | Context Length | Download |
|-------|-------------|------------|----------------|----------|
| PULSAR-pbmc | Continually pre-trained on 8.8M PBMC data from 2,588 donors, best for PBMC-related tasks | 87.4M | 1024 | 🤗 HuggingFace |
| PULSAR-aligned | Aligned version of PULSAR-pbmc using disease labels | 87.4M | 1024 | 🤗 HuggingFace |
Model weights are directly loadable via the transformers library, for example:
from pulsar.model import PULSAR
model = PULSAR.from_pretrained("KuanP/PULSAR-pbmc")
DONORxEMBED Datasets
We release the DONORxEMBED datasets for both zero-shot and aligned PULSAR. An example of loading the datasets can be found here.
| Dataset | Download |
|---------|----------|
| PULSAR_DONORxEMBED_zero_shot | 🤗 HuggingFace |
| PULSAR_DONORxEMBED_aligned | 🤗 HuggingFace |
Acknowledgements
We sincerely thank the authors of the following open-source projects:
Cite Us
@article {pang2025pulsar,
author = {Pang, Kuan and Rosen, Yanay and Kedzierska, Kasia and He, Ziyuan and Rajagopal, Abhe and Gustafson, Claire E and Huynh, Grace and Leskovec, Jure},
title = {PULSAR: a Foundation Model for Multi-scale and Multicellular Biology},
elocation-id = {2025.11.24.685470},
year = {2025},
doi = {10.1101/2025.11.24.685470},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2025/11/26/2025.11.24.685470},
eprint = {https://www.biorxiv.org/content/early/2025/11/26/2025.11.24.685470.full.pdf},
journal = {bioRxiv}
}
