UniVI


<picture> <source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/Ashford-A/UniVI/v0.4.7/assets/figures/univi_overview_dark.png"> <img src="https://raw.githubusercontent.com/Ashford-A/UniVI/v0.4.7/assets/figures/univi_overview_light.png" alt="UniVI overview and evaluation roadmap" width="100%"> </picture>

UniVI is a multi-modal variational autoencoder (VAE) toolkit for aligning and integrating single-cell modalities such as RNA, ADT (CITE-seq), ATAC, and coverage-aware / proportion-like assays (e.g., single-cell methylome features).

Common use cases:

  • Joint embedding of paired multimodal data (CITE-seq, Multiome, TEA-seq)
  • Bridge mapping / projection of unimodal cohorts into a paired latent
  • Cross-modal imputation (RNA→ADT, ATAC→RNA, RNA→methylome, …)
  • Denoising / reconstruction with likelihood-aware decoders
  • Generating biologically relevant synthetic samples (leveraging the generative nature of VAEs)
  • Evaluation (FOSCTTM, Recall@k, mixing/entropy, label transfer, clustering, basic MoE gating diagnostics)
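
UniVI's evaluation utilities report FOSCTTM (Fraction Of Samples Closer Than the True Match). As a reference for what the metric measures — not UniVI's implementation — here is a standalone NumPy sketch: for each cell, the fraction of cells in the other modality whose embeddings sit closer than the cell's true paired match, averaged over both directions (0 is a perfect alignment, 0.5 is random).

```python
import numpy as np

def foscttm(z_a, z_b):
    """FOSCTTM for paired embeddings: rows i of z_a and z_b are the same cell.
    Lower is better; 0.0 means every cell's true match is its nearest neighbor."""
    d = np.linalg.norm(z_a[:, None, :] - z_b[None, :, :], axis=-1)  # (n, n) pairwise distances
    true = np.diag(d)                                # distance to the true match
    frac_a = (d < true[:, None]).mean(axis=1)        # a -> b direction, per cell
    frac_b = (d < true[None, :]).mean(axis=0)        # b -> a direction, per cell
    return float((frac_a.mean() + frac_b.mean()) / 2.0)

rng = np.random.default_rng(0)
z = rng.normal(size=(50, 8))
print(foscttm(z, z.copy()))  # 0.0 — identical embeddings align perfectly
```

A shuffled pairing drives the score toward 0.5, which makes the metric a quick sanity check on any joint latent space.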

Advanced/experimental use cases (all optional; the model can run entirely without them):

  • Supervised heads (either a decoder classification head or a full categorical encoder/decoder VAE treated as its own modality)
  • Expanded MoE gating diagnostics (fitting a simple gating network during training)
  • Transformer encoders (experimental, added for exploratory analysis)
  • Fused transformer latent space (even more experimental, added for exploratory analysis/future model expansion)

Preprint

If you use UniVI in your work, please cite:

Ashford AJ, Enright T, Somers J, Nikolova O, Demir E.
Unifying multimodal single-cell data with a mixture-of-experts β-variational autoencoder framework.
bioRxiv (2025; updated 2026). doi: 10.1101/2025.02.28.640429

@article{Ashford2025UniVI,
  title   = {Unifying multimodal single-cell data with a mixture-of-experts β-variational autoencoder framework},
  author  = {Ashford, A. J. and Enright, T. and Somers, J. and Nikolova, O. and Demir, E.},
  journal = {bioRxiv},
  date    = {2025},
  doi     = {10.1101/2025.02.28.640429},
  url     = {https://www.biorxiv.org/content/10.1101/2025.02.28.640429},
  note    = {Preprint (updated 2026)}
}

Installation

PyPI

pip install univi

UniVI requires PyTorch. If import torch fails, install PyTorch for your platform/CUDA from PyTorch's official install instructions.

Conda / mamba

conda install -c conda-forge univi
# or
mamba install -c conda-forge univi

Development install (from source)

git clone https://github.com/Ashford-A/UniVI.git
cd UniVI

conda env create -f envs/univi_env.yml
conda activate univi_env

pip install -e .

Data expectations

UniVI expects per-modality AnnData objects.

  • Each modality is an AnnData
  • For paired settings, modalities share the same cells (obs_names, same order)
  • Raw counts often live in .layers["counts"]
  • Model inputs typically live in .X (or .obsm["X_*"] for ATAC LSI)
  • Model input is a dictionary of these AnnData objects, keyed by modality name (e.g. rna, adt, atac); the same keys are used later by evaluation functions (cross-reconstruction, etc.)

Recommended convention:

  • .layers["counts"] = raw counts / raw signal
  • .X / .obsm["X_*"] = model input space (log1p RNA, CLR ADT, LSI ATAC, methyl fractions, etc.)
  • .layers["denoised_*"] / .layers["imputed_*"] = UniVI outputs

Quickstart (Python / Jupyter)

Minimal "notebook path": load paired AnnData → preprocess → train → encode/evaluate → plot.

The sections below walk through a complete CITE-seq (RNA + ADT) example. All patterns generalize to Multiome (RNA + ATAC), TEA-seq (RNA + ADT + ATAC), and any other paired combination supported by UniVI.


0) Imports

import numpy as np
import scanpy as sc
import torch
from torch.utils.data import DataLoader, Subset

from univi import UniVIMultiModalVAE, ModalityConfig, UniVIConfig, TrainingConfig
from univi.data import MultiModalDataset, align_paired_obs_names, collate_multimodal_xy_recon
from univi.trainer import UniVITrainer

collate_multimodal_xy_recon is the required collate function for DataLoader when using MultiModalDataset. It correctly handles the (x, recon_targets) batch format expected by the trainer, including coverage-aware modalities such as beta-binomial methylome. Always pass it as collate_fn=collate_multimodal_xy_recon when constructing your loaders.


1) Load paired AnnData

For CITE-seq data:

rna = sc.read_h5ad("path/to/rna_citeseq.h5ad")
adt = sc.read_h5ad("path/to/adt_citeseq.h5ad")

For Multiome (RNA + ATAC):

rna  = sc.read_h5ad("path/to/rna_multiome.h5ad")
atac = sc.read_h5ad("path/to/atac_multiome.h5ad")

For tri-modal TEA-seq / DOGMA-seq / ASAP-seq:

rna  = sc.read_h5ad("path/to/rna.h5ad")
adt  = sc.read_h5ad("path/to/adt.h5ad")
atac = sc.read_h5ad("path/to/atac.h5ad")

2) Preprocess each modality

After preprocessing, set .X to the model input space and keep raw counts in .layers["counts"]. Match the likelihood in ModalityConfig to your .X space (see the likelihood guidance table in step 4).

RNA — log-normalize, select HVGs, scale:

rna.layers["counts"] = rna.X.copy()

rna.var["mt"] = rna.var_names.str.upper().str.startswith("MT-")
sc.pp.calculate_qc_metrics(rna, qc_vars=["mt"], percent_top=None, log1p=False, inplace=True)

sc.pp.normalize_total(rna, target_sum=1e4)
sc.pp.log1p(rna)
rna.raw = rna  # snapshot log-space for plotting/DE

sc.pp.highly_variable_genes(rna, flavor="seurat_v3", n_top_genes=2000, subset=True)
sc.pp.scale(rna, max_value=10)

ADT — CLR per cell, scale per protein:

adt.layers["counts"] = adt.X.copy()

def clr_per_cell(X):
    X = X.toarray() if hasattr(X, "toarray") else np.asarray(X)
    logX = np.log1p(X)
    return logX - logX.mean(axis=1, keepdims=True)

adt.X = clr_per_cell(adt.layers["counts"])
sc.pp.scale(adt, zero_center=True, max_value=10)

ATAC — TF-IDF → LSI, drop first component:

atac.layers["counts"] = atac.X.copy()

import scipy.sparse as sp

def tfidf(X):
    X = sp.csr_matrix(X)  # accepts dense or sparse input
    cell_sum = np.asarray(X.sum(axis=1)).ravel()
    cell_sum[cell_sum == 0] = 1.0
    tf = X.multiply(1.0 / cell_sum[:, None])
    df = np.asarray((X > 0).sum(axis=0)).ravel()
    idf = np.log1p(X.shape[0] / (1.0 + df))
    return tf.multiply(idf)

X_tfidf = tfidf(atac.layers["counts"])

from sklearn.decomposition import TruncatedSVD
svd = TruncatedSVD(n_components=101, random_state=0)
X_lsi = svd.fit_transform(X_tfidf)
atac.obsm["X_lsi"] = X_lsi[:, 1:]  # drop first component (depth correlated)

Post-preprocessing: assemble adata_dict

# Sanity check (CITE-seq)
assert rna.n_obs == adt.n_obs and np.all(rna.obs_names == adt.obs_names)

# CITE-seq
adata_dict = {"rna": rna, "adt": adt}

# Multiome
# adata_dict = {"rna": rna, "atac": atac}

# Tri-modal
# adata_dict = {"rna": rna, "adt": adt, "atac": atac}

# Unimodal VAE
# adata_dict = {"rna": rna}

align_paired_obs_names(adata_dict)  # ensures matching obs_names and order

Avoiding data leakage: if you want to run UniVI inductively, apply feature selection, scaling, and any learned transforms (e.g., PCA/LSI) on the training set only, then apply the training-set-derived parameters to validation and test sets.
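
That train-only fitting pattern can be sketched with scikit-learn (illustrative synthetic data and a simple scale → SVD pipeline; the variable names are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(200, 30)).astype(float)
train_idx, test_idx = np.arange(160), np.arange(160, 200)

# Fit every learned transform on the training cells only...
scaler = StandardScaler().fit(X[train_idx])
svd = TruncatedSVD(n_components=10, random_state=0).fit(scaler.transform(X[train_idx]))

# ...then apply the training-set-derived parameters to held-out cells
X_test_lsi = svd.transform(scaler.transform(X[test_idx]))
print(X_test_lsi.shape)  # (40, 10)
```

The same idea applies to HVG selection (compute the gene list on training cells, subset all splits to it) and to the TF-IDF/LSI step above.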


3) Dataset + DataLoaders

Device detection (CUDA → MPS → XPU → CPU):

device = (
    "cuda" if torch.cuda.is_available() else
    ("mps" if getattr(torch.backends, "mps", None) is not None
               and torch.backends.mps.is_available() else
     ("xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else
      "cpu"))
)

Build dataset:

dataset = MultiModalDataset(
    adata_dict=adata_dict,
    device=None,                  # dataset yields CPU tensors; model handles GPU transfer
    X_key_by_mod={
        "rna" : "X",              # uses rna.X
        "adt" : "X",              # uses adt.X
        # "atac": "obsm:X_lsi",  # uses atac.obsm["X_lsi"]
    },
)

Train / val / test split (80 / 10 / 10):

n = rna.n_obs
idx = np.arange(n)
rng = np.random.default_rng(0)
rng.shuffle(idx)

n_train = int(0.8 * n)
n_val   = int(0.1 * n)

train_idx = idx[:n_train]
val_idx   = idx[n_train : n_train + n_val]
test_idx  = idx[n_train + n_val :]

# Save split indices for reproducibility
np.savez("splits_seed0.npz", train_idx=train_idx, val_idx=val_idx, test_idx=test_idx)
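
The saved .npz can be reloaded in a later session to reproduce exactly the same split (a self-contained sketch mirroring the lines above):

```python
import numpy as np

# Recreate and save a split as above (smaller n for the demo)
idx = np.arange(100)
np.random.default_rng(0).shuffle(idx)
train_idx, val_idx, test_idx = idx[:80], idx[80:90], idx[90:]
np.savez("splits_seed0.npz", train_idx=train_idx, val_idx=val_idx, test_idx=test_idx)

# Later / in another session: restore exactly the same indices
splits = np.load("splits_seed0.npz")
print(np.array_equal(splits["train_idx"], train_idx))  # True
```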

Construct loaders (always pass collate_fn=collate_multimodal_xy_recon):

train_loader = DataLoader(
    Subset(dataset, train_idx),
    batch_size=256,
    shuffle=True,
    num_workers=0,
    collate_fn=collate_multimodal_xy_recon,
)
val_loader = DataLoader(
    Subset(dataset, val_idx),
    batch_size=256,
    shuffle=False,
    num_workers=0,
    collate_fn=collate_multimodal_xy_recon,
)
test_loader = DataLoader(
    Subset(dataset, test_idx),
    batch_size=256,
    shuffle=False,
    num_workers=0,
    collate_fn=collate_multimodal_xy_recon,
)