ScBiG
scBiG for representation learning of single-cell gene expression data based on bipartite graph embedding
Overview

scBiG is a graph autoencoder network. The encoder, based on multi-layer graph convolutional networks, extracts high-order representations of cells and genes from the cell-gene bipartite graph. The decoder, based on the zero-inflated negative binomial (ZINB) model, uses these representations to reconstruct the gene expression matrix. Through a model-driven self-supervised training paradigm, scBiG learns low-dimensional representations of both cells and genes that can feed diverse downstream analytical tasks.
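To make the reconstruction objective of a ZINB decoder concrete, here is a minimal sketch of the per-entry ZINB negative log-likelihood in plain Python. It only illustrates the standard ZINB formula. The function names `nb_log_pmf` and `zinb_nll` are hypothetical, and how scBiG itself derives the mean `mu`, dispersion `theta`, and dropout probability `pi` from the embeddings is not described in this README.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under a zero-inflated
    negative binomial with zero-inflation probability pi."""
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        # A zero can come from the dropout component or from the NB itself.
        likelihood = pi + (1.0 - pi) * nb
    else:
        # A non-zero count can only come from the NB component.
        likelihood = (1.0 - pi) * nb
    return -math.log(likelihood)


# Hypothetical parameters for a single cell-gene entry.
loss_zero = zinb_nll(0, mu=2.0, theta=1.5, pi=0.3)
loss_five = zinb_nll(5, mu=2.0, theta=1.5, pi=0.3)
```

Summing such a loss over all entries of the expression matrix gives a reconstruction objective of the kind a ZINB-based decoder minimizes. A practical implementation would work in log space on tensors for numerical stability.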
Installation
Please install scBiG from PyPI with:
pip install scbig
Alternatively, clone this repository and run
pip install -e .
in the root of this repository.
GPU users should install the GPU version of dgl, which is available from the official website: https://www.dgl.ai/pages/start.html
Quick start
Load the data to be analyzed:
import numpy as np
import scanpy as sc
# data is the count matrix (cells x genes)
adata = sc.AnnData(data)
Perform data pre-processing with scanpy:
# Basic filtering
sc.pp.filter_genes(adata, min_cells=3)
sc.pp.filter_cells(adata, min_genes=200)
adata.raw = adata.copy()
# Total-count normalize, calculate the cell size factor, and logarithmize the data
sc.pp.normalize_per_cell(adata)
adata.obs['cs_factor'] = adata.obs.n_counts / np.median(adata.obs.n_counts)
sc.pp.log1p(adata)
# Calculate the gene size factor
adata.var['gs_factor'] = np.max(adata.X, axis=0, keepdims=True).reshape(-1)
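The two size factors computed above can be traced by hand on a toy matrix. The sketch below uses plain Python and made-up counts. It assumes the default behaviour of `sc.pp.normalize_per_cell`, which scales every cell to the median total count and records the original totals as `n_counts`. The variable names are illustrative only.

```python
import math
from statistics import median

# Toy count matrix: rows are cells, columns are genes (hypothetical values).
counts = [
    [4, 0, 6],
    [1, 1, 0],
    [10, 5, 5],
]

# Total counts per cell, the analogue of adata.obs['n_counts'].
n_counts = [sum(row) for row in counts]

# Cell size factor: each cell's total relative to the median total.
med = median(n_counts)
cs_factor = [n / med for n in n_counts]

# Scale each cell to the median total, then apply log(1 + x).
log_norm = [
    [math.log1p(value / factor) for value in row]
    for row, factor in zip(counts, cs_factor)
]

# Gene size factor: the maximum log-normalized value of each gene across cells.
gs_factor = [max(column) for column in zip(*log_norm)]
```

Here the second cell has a fifth of the median depth, so its cell size factor is 0.2 and its counts are scaled up by a factor of 5 before the log transform.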
Run the scBiG method:
from scbig import run_scbig
adata = run_scbig(adata)
The output adata contains the cell embeddings in adata.obsm['feat'] and the gene embeddings in adata.varm['feat']. These embeddings can be used as input to other downstream analyses.
Please refer to tutorial.ipynb for a detailed description of scBiG's usage.
For users who pre-process with Seurat and then run scBiG for subsequent analysis, we provide R_tutorial.Rmd as a reference.
