
EDS

:bulb: :floppy_disk: :minidisc: A simple, intuitive and Efficient single cell binary Data Storage format


What's EDS?

EDS is an acronym for Efficient single cell binary Data Storage, a format for cell-feature count matrices.


Why do we need a new storage format?

Recent advances in single-cell technologies have led to a rapid increase in the amount of data. Most single-cell studies generate a cell-by-feature count matrix (features are often genes), and the number of cells is now approaching millions. Traditional single-cell quantification pipelines use the Matrix Market exchange (mtx) format, sometimes gzipped, to share count matrices. However, the textual representation makes mtx files larger than a compressed binary format. Our quantification tool alevin writes its output in EDS format, which saves storage space.
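To make the size argument concrete, here is a minimal sketch that compares the byte count of Matrix Market style coordinate lines with that of a simple per-cell binary layout. The binary layout (one presence bit per feature, then a 4-byte float per non-zero count) is an assumption chosen for illustration and is not the EDS specification. The names demo_matrix, mtx_text_bytes and binary_bytes are hypothetical, and gzip compression is ignored on both sides.

```rust
/// One cell of a sparse count matrix: (feature index, count) pairs.
type Row = Vec<(usize, f32)>;

/// Build a synthetic matrix where every `step`-th feature of a cell is non-zero.
/// Purely illustrative data, not a real experiment.
fn demo_matrix(cells: usize, features: usize, step: usize) -> Vec<Row> {
    (0..cells)
        .map(|cell| {
            (cell % step..features)
                .step_by(step)
                .map(|f| (f, 1.0 + (f % 5) as f32))
                .collect()
        })
        .collect()
}

/// Bytes needed for text coordinate lines of the form "row col value\n" (1-based indices).
fn mtx_text_bytes(rows: &[Row]) -> usize {
    let mut total = 0;
    for (r, row) in rows.iter().enumerate() {
        for (c, v) in row {
            total += format!("{} {} {}\n", r + 1, c + 1, v).len();
        }
    }
    total
}

/// Bytes needed for the assumed binary layout: a presence bitmask per cell
/// plus a 4-byte float for each non-zero count.
fn binary_bytes(rows: &[Row], num_features: usize) -> usize {
    let mask_bytes = (num_features + 7) / 8;
    rows.iter().map(|row| mask_bytes + 4 * row.len()).sum()
}

fn main() {
    let rows = demo_matrix(100, 1000, 10);
    println!("text bytes:   {}", mtx_text_bytes(&rows));
    println!("binary bytes: {}", binary_bytes(&rows, 1000));
}
```

In this sketch the binary layout needs no row or column index per entry, which is where most of the saving comes from; how the gap changes after gzip depends on the data.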

What are the caveats?

Other formats, such as loom, are designed to optimize queries on the matrix. EDS is primarily designed to improve storage efficiency rather than query speed, and it does not currently support random access to a cell (row).
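One way to see why random access is hard without an index: if each cell is stored as a variable-length record, the byte offset of cell k is only known after reading the k cells before it. The sketch below uses a hypothetical record layout (a presence bitmask followed by little-endian 4-byte floats for the non-zero features); it is an assumption for illustration, not the EDS specification, and encode_row, decode_row and fetch_row are made-up names.

```rust
/// Append one cell as a presence bitmask followed by little-endian f32
/// values for its non-zero features (assumed layout, for illustration only).
fn encode_row(dense: &[f32], out: &mut Vec<u8>) {
    let mut mask = vec![0u8; (dense.len() + 7) / 8];
    let mut vals = Vec::new();
    for (i, &v) in dense.iter().enumerate() {
        if v != 0.0 {
            mask[i / 8] |= 1 << (i % 8);
            vals.extend_from_slice(&v.to_le_bytes());
        }
    }
    out.extend_from_slice(&mask);
    out.extend_from_slice(&vals);
}

/// Decode the record starting at `pos`; return the dense row and the
/// offset of the next record.
fn decode_row(buf: &[u8], pos: usize, num_features: usize) -> (Vec<f32>, usize) {
    let mask_len = (num_features + 7) / 8;
    let mask = &buf[pos..pos + mask_len];
    let mut cur = pos + mask_len;
    let mut dense = vec![0.0f32; num_features];
    for i in 0..num_features {
        if (mask[i / 8] & (1 << (i % 8))) != 0 {
            dense[i] = f32::from_le_bytes([buf[cur], buf[cur + 1], buf[cur + 2], buf[cur + 3]]);
            cur += 4;
        }
    }
    (dense, cur)
}

/// Fetch cell `k` by decoding every record before it, because record
/// lengths depend on how many features are non-zero in each cell.
fn fetch_row(buf: &[u8], k: usize, num_features: usize) -> Vec<f32> {
    let mut pos = 0;
    for _ in 0..k {
        pos = decode_row(buf, pos, num_features).1;
    }
    decode_row(buf, pos, num_features).0
}

fn main() {
    let mut buf = Vec::new();
    encode_row(&[0.0, 2.0, 0.0, 5.0], &mut buf);
    encode_row(&[1.0, 0.0, 0.0, 0.0], &mut buf);
    println!("{:?}", fetch_row(&buf, 1, 4));
}
```

A separate index of per-cell byte offsets would remove the scan in fetch_row, at the cost of extra storage.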

How to convert eds to mtx format?

A simple Rust converter lives inside src-rs. It can be built using cargo build --release and then run as ./target/release/eds convert -i <input gzipped file currently [eds.gz | mtx.gz]> --[mtx | eds | h5 | csv] -c <num_cells> -f <num_features>.

Benchmarks

  • Size on disk (figure: Disk Space)

  • Time to load the matrix into memory (figure: Loading time)

  • Memory required to load the matrix (figure: Memory Usage)

Future

  • [ ] Support delayedArray R object
  • [ ] Random access through EDS index

Contributors

  • Avi Srivastava
  • Mike Love
  • Rob Patro
