ScMRDR
Unpaired single-cell multi-omics data integration
Install / Use
/learn @sjl-sjtu/ScMRDRREADME
scMRDR
We implement a scalable and flexible generative framework called single-cell Multi-omics Regularized Disentangled Representations (scMRDR) for unpaired multi-omics integration. The manuscript has been presented as a spotlight paper on The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025) [1].
An extended version with more downstream applications and biological analyses such as cross-omics translation, spatial reconstruction and SVG detection, DNA methylation regulation effect assessment via spatial addictive mixed-effect model, and multi-species integration can be found at [2].
- Free software: GPL-3.0 License
- Documentation: https://sjl-sjtu.github.io/scMRDR/
Tutorials
Installation
git clone https://github.com/sjl-sjtu/scMRDR.git
cd scMRDR
pip install -e .
Examples
import scanpy as sc
import anndata as ad
from scMRDR.module import Integration
rna = sc.read_h5ad("rna_processed.h5ad") # h5ad file of scRNA (after preprocessing)
atac_gas = sc.read_h5ad("atac_gas_processed.h5ad") # h5ad file of gene activity score from scATAC (after preprocessing)
rna.obs.modality == "rna"; atac_gas.obs.modality="atac"
rna_hvg = rna.var_names[rna.var['highly_variable']]; atac_hvg = atac.var_names[atac.var['highly_variable']]
adata = ad.concat([rna[:,rna_hvg].copy(),atatc_gas[:,atac_hvg].copy()], axis='obs', join='inner', label="modality") # an adata concated with different omics (as different observations). If you want to use masked features, you can specify join='outer' here and specify feature_list for each modality then
model = Integration(data=adata, modality_key="modality", layer="count", batch_key="batch",
feature_list=None, distribution="ZINB") # If we model the count data (stored in adata.layers['count']) with ZINB model, with omics information stored in adata.obs.modality, batch information stored in adata.obs.batch
# If you want to use masked features to handle mismatched features in different omics
# adata = ad.concat([rna[:,rna_hvg].copy(),atatc_gas[:,atac_hvg].copy()], axis='obs', join='outer', label="modality", fill_value=0)
# feature_list = {"rna":rna_hvg,"atac":atac_hvg}
# model = Integration(data=adata, modality_key="modality", layer="count", batch_key="batch", feature_list=feature_list, mask_key="modality", distribution="ZINB")
model.setup(hidden_layers = [512,512], latent_dim_shared = 20, latent_dim_specific=20,
beta = 2, gamma = 5, lambda_adv = 5, dropout_rate=0.2)
model.train(epoch_num = 200, batch_size = 128, lr = 1e-3, adaptlr = False, num_warmup = 0,
early_stopping = True, valid_prop = 0.1, weighted=False, patience=10)
model.inference(n_samples=1,update=True,returns=False)
adata = model.get_adata() # The integrated embeddings are stored in adata.obsm["latent_shared"]
# visualization
sc.pp.neighbors(adata,use_rep="latent_shared")
sc.tl.umap(adata)
sc.pl.umap(
adata,
color=["modality","cell_type","batch"],
size=2, wspace=0.5
)
Please visit (https://sjl-sjtu.github.io/scMRDR/) for detailed documents and APIs.
Citation
[1] Jianle Sun, Chaoqi Liang, Ran Wei, Peng Zheng, Lei Bai, Wanli Ouyang, Hongliang Yan, Peng Ye. scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration. The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025.
[2] Jianle Sun, Chaoqi Liang, Ran Wei, Peng Zheng, Hongliang Yan, Lei Bai, Kun Zhang, Wanli Ouyang, and Peng Ye. "Scalable integration and prediction of unpaired single-cell and spatial multi-omics via regularized disentanglement." bioRxiv (2025).
Contact
Feel free to contact me via jianles@andrew.cmu.edu or sjl-2017@alumni.sjtu.edu.cn if you have any questions.
Related Skills
node-connect
354.3kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
112.3kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
354.3kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
354.3kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
