SoundStream

This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf

SoundStream: An End-to-End Neural Audio Codec

This repository is an implementation of the article of the same name.

<p align="center"> <img src="./images/soundstream.png" alt="SoundStream's architecture"/> </p>

The Residual Vector Quantizer (RVQ) relies on lucidrains' repository.
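
For readers new to residual vector quantization, here is a minimal NumPy sketch of the idea: each stage quantizes whatever the previous stages left unexplained, and the decoder sums the selected codewords. This is an illustration only, not the code used in this repository and not lucidrains' API. The names `rvq_encode` and `rvq_decode` are hypothetical, and the codebooks are hand-written here, whereas in practice they are learned.

```python
import numpy as np


def rvq_encode(x, codebooks):
    """Quantize x with a cascade of codebooks; return one index per stage."""
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for codebook in codebooks:
        # Pick the codeword closest to what is still unexplained.
        distances = np.linalg.norm(codebook - residual, axis=1)
        idx = int(np.argmin(distances))
        indices.append(idx)
        # The next stage only sees the remaining error.
        residual = residual - codebook[idx]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected codeword of each stage."""
    return sum(codebook[i] for codebook, i in zip(codebooks, indices))


# Toy codebooks (assumed values, for illustration only).
codebooks = [
    np.array([[0.0, 0.0], [4.0, 4.0]]),              # coarse stage
    np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),  # refinement stage
]
indices = rvq_encode([4.2, 4.9], codebooks)
print(indices, rvq_decode(indices, codebooks))
```

Each additional stage refines the reconstruction, so the number of stages sets the bitrate: more stages mean more indices to transmit per frame.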

I built this implementation to serve my own needs, so some features described in the original article are missing.

Missing pieces

  • [ ] Denoising: this implementation is not built to denoise, so it has neither a conditioning signal nor Feature-wise Linear Modulation (FiLM) blocks.
  • [ ] Bitrate scalability: for now, quantizer dropout has not been implemented.
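
For context, the paper describes quantizer dropout as follows: for each training example, the number of active quantizers is sampled uniformly between 1 and the total, and only the first ones are used. The pure-Python sketch below illustrates that sampling step under those assumptions. It is not part of this repository, and the names `sample_num_quantizers` and `apply_quantizer_dropout` are hypothetical.

```python
import random


def sample_num_quantizers(total_quantizers, rng=random):
    """Draw how many quantizer stages to keep for one training example."""
    return rng.randint(1, total_quantizers)  # inclusive on both ends


def apply_quantizer_dropout(stage_outputs, num_kept):
    """Sum only the first num_kept stage outputs (each a list of floats)."""
    kept = stage_outputs[:num_kept]
    return [sum(values) for values in zip(*kept)]


# Toy per-stage quantized outputs for one frame (assumed values).
stage_outputs = [[1.0, 2.0], [0.5, -0.5], [0.25, 0.25]]
num_kept = sample_num_quantizers(len(stage_outputs))
print(num_kept, apply_quantizer_dropout(stage_outputs, num_kept))
```

Training with a random number of stages is what lets a single model be decoded at several bitrates: at inference time, the stage count is simply fixed to match the target bitrate.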

Citations

@misc{zeghidour2021soundstream,
    title   = {SoundStream: An End-to-End Neural Audio Codec},
    author  = {Neil Zeghidour and Alejandro Luebs and Ahmed Omran and Jan Skoglund and Marco Tagliasacchi},
    year    = {2021},
    eprint  = {2107.03312},
    archivePrefix = {arXiv},
    primaryClass = {cs.SD}
}