Mudatasets
Multimodal datasets, in MuData format
Multimodal Datasets
mudatasets provides access to public multimodal datasets, with a primary focus on multimodal omics data.
MuData library | MuData documentation
Installation
# Stable, with muon
pip install "mudatasets[muon]"
# Dev
pip install git+https://github.com/gtca/mudatasets
Getting started
import mudatasets as mds
Find available datasets
mds.list_datasets()
Load a dataset
mdata = mds.load("pbmc3k_multiome")
print(mdata)
Some common arguments for .load() are:
- data_dir= for location to save the dataset (~/mudatasets/ by default)
- with_info=True for also returning the second argument with dataset description as a dictionary (False by default)
- backed=True for reading data in a backed format, only for .h5mu and .h5ad files (True by default)
- files= for downloading specific files from the dataset
- full=True for downloading all the files defined for the dataset (False by default)
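As a minimal sketch of how data_dir and with_info might be combined, the helper below wraps the load call and unpacks the two return values. The name load_with_description and its loader parameter are hypothetical and exist only for this illustration; loader defaults to mudatasets.load and can be swapped for a stub when working offline. It assumes that with_info=True returns a (data, info) pair, as described above.

```python
from pathlib import Path


def load_with_description(name, data_dir="~/mudatasets", loader=None):
    """Load a dataset together with its description dictionary.

    `loader` is a hypothetical hook for this sketch: it defaults to
    mudatasets.load and can be replaced by a stub for offline use.
    """
    if loader is None:
        # Imported lazily so the helper itself does not require the package.
        import mudatasets as mds
        loader = mds.load

    # Expand "~" ourselves so the target directory is explicit.
    target = str(Path(data_dir).expanduser())

    # Assumption: with_info=True makes the loader return (data, info).
    mdata, info = loader(name, data_dir=target, with_info=True)
    return mdata, info
```

With the package installed, a call such as `mdata, info = load_with_description("pbmc3k_multiome", data_dir="./data")` would then be expected to download the dataset into ./data and return its description alongside the data.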
Get dataset info
mds.info("pbmc3k_multiome")
List dataset file names
mds.list_files("pbmc3k_multiome")
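The file names can be used to narrow a download through the files= argument of .load(). The sketch below filters a list of names by suffix; select_files is a hypothetical helper, not part of mudatasets. The commented usage assumes that list_files() returns the names as a list of strings and that files= accepts such a list, neither of which is confirmed here.

```python
def select_files(file_names, suffixes=(".h5mu", ".h5ad")):
    """Keep only the file names that end with one of the given suffixes."""
    return [name for name in file_names if name.endswith(tuple(suffixes))]


# Hypothetical usage (requires mudatasets and network access), assuming
# list_files() returns file names and files= accepts a list of them:
#
#   import mudatasets as mds
#   names = mds.list_files("pbmc3k_multiome")
#   mdata = mds.load("pbmc3k_multiome", files=select_files(names))
```

Keeping the filter as a separate pure function makes it easy to check which files would be requested before any download starts.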
Webpage with all the files
mds.serve_webpage(port=8000)
This command launches a local server that serves a simple, temporarily created HTML page at http://localhost:8000, listing the files across all of the datasets.
