
# ScMultiBench

Multi-task benchmarking of single-cell multimodal omics integration methods

## Install / Use

`/learn @PYangLab/ScMultiBench`

## README

<h1 style="display: flex; align-items: center; font-size: 2em;"> <img src="figure/logo.png" alt="Logo" style="width: 50px; height: 50px; margin-right: 10px;"> scMultiBench </h1>

Multi-task benchmarking of single-cell multimodal omics integration methods

Single-cell multimodal omics technologies have enabled the profiling of complex biological systems at a resolution and scale that were previously unattainable. These biotechnologies have propelled the fast-paced development of data integration methods, creating a critical need for their systematic categorisation, evaluation, and benchmarking. Selecting the most appropriate integration approach is a significant challenge: the right choice depends on the tasks relevant to the study goals and on the combination of modalities and batches present in the data at hand. Understanding how well each method performs across multiple tasks, including dimension reduction, batch correction, cell type classification and clustering, imputation, feature selection, and spatial registration, and under which data combinations, helps guide this decision. This study aims to provide a much-needed guideline for choosing the most appropriate method for single-cell multimodal omics data analysis through a systematic categorisation and comprehensive benchmarking of current methods.

<img width=100% src="https://github.com/PYangLab/scMultiBench/blob/main/figure/figure1_v9.png"/>

## Integration Tools

In this benchmark, we evaluated 40 integration methods across the four data integration categories on 64 real datasets and 22 simulated datasets, using an Ubuntu system with an RTX 3090 GPU. In particular, we include 18 vertical integration methods, 14 diagonal integration methods, 12 mosaic integration methods, and 15 cross integration methods (several methods support more than one integration category and therefore appear in multiple counts). The installation environment for each tool was set up according to its respective tutorial. The categories of tools compared are:

- Vertical Integration (Dimension Reduction and Clustering)
- Vertical Integration (Feature Selection)
- Diagonal Integration (Dimension Reduction, Batch Correction, Clustering, Classification)
- Mosaic Integration (Dimension Reduction, Batch Correction, Clustering, Classification)
- Mosaic Integration (Imputation)
- Cross Integration (Dimension Reduction, Batch Correction, Clustering, Classification)
- Cross Integration (Spatial Registration)

Note that installation time varies from tool to tool; for more detailed information, please refer to each method's original publication. For built-in classification, the classification scripts are provided in the corresponding method folders within the `tools_scripts` directory. Additional classification modules (such as kNN, SVM, random forest, and MLP) are provided in the `classification` directory; a rough sketch of this kind of module is shown below.
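
As an illustration only (not the repository's own scripts), the sketch below trains the four classifier types named above on an integrated embedding using scikit-learn. The embedding and label arrays are random placeholders standing in for a method's integrated output and the annotated cell types.

```python
# Minimal sketch (not the repository's script): train the four classifier
# types mentioned above (kNN, SVM, random forest, MLP) on an integrated
# embedding and report held-out accuracy. Inputs are random placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
embedding = rng.normal(size=(500, 32))   # placeholder: integrated latent space
labels = rng.integers(0, 5, size=500)    # placeholder: cell type labels

X_tr, X_te, y_tr, y_te = train_test_split(
    embedding, labels, test_size=0.2, stratify=labels, random_state=0
)

classifiers = {
    "kNN": KNeighborsClassifier(n_neighbors=15),
    "SVM": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
}
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(f"{name}: accuracy = {clf.score(X_te, y_te):.3f}")
```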

## Evaluation Pipeline

All evaluation pipelines can be found in the `metrics` folder, and example datasets are stored in the `example_data` folder. For spatial registration, users are required to download the data from the provided link and place it in the `example_data/spatial/` folder.
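
For orientation, here is a minimal sketch of the kind of clustering evaluation such pipelines perform, assuming an integrated embedding and ground-truth cell type labels are available. It uses scikit-learn's ARI and NMI, two metrics commonly used for this task, rather than the repository's own metric implementations.

```python
# Minimal sketch (assumed workflow, not the repository's pipeline): cluster an
# integrated embedding with k-means and score the result against ground-truth
# cell types using ARI and NMI. Inputs are random placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

rng = np.random.default_rng(0)
embedding = rng.normal(size=(500, 32))     # placeholder: method's integrated embedding
cell_types = rng.integers(0, 5, size=500)  # placeholder: annotated cell types

pred = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(embedding)
print("ARI:", adjusted_rand_score(cell_types, pred))
print("NMI:", normalized_mutual_info_score(cell_types, pred))
```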

## Dataset

The processed datasets can be downloaded from the provided link.
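
The exact file formats are not listed here; assuming the processed datasets follow the common AnnData `.h5ad` convention for single-cell data (an assumption worth verifying against the download), loading looks like this:

```python
# Minimal sketch, assuming the downloaded files are AnnData .h5ad
# (an assumption; inspect the archive for the actual formats and names).
import anndata as ad

adata = ad.read_h5ad("example_data/citeseq_rna.h5ad")  # hypothetical file name
print(adata)              # summary: n_obs x n_vars plus stored annotations
print(adata.obs.columns)  # per-cell metadata, e.g. batch / cell type labels
```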

## Shiny

Explore method performance in depth with our interactive Shiny app, designed for dynamic visualization of the benchmark results.

## License

This project is covered under the Apache 2.0 License.
