CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages


Abstract

We describe our development of CSS10, a collection of single speaker speech datasets for ten languages. It is composed of short audio clips from LibriVox audiobooks and their aligned texts. To validate its quality we train two neural text-to-speech models on each dataset. Subsequently, we conduct Mean Opinion Score tests on the synthesized speech samples. We make our datasets, pretrained models, and test resources publicly available. We hope they will be used for future speech tasks.

For details, see our paper. Kyubyong presented this paper at the 2018 workshop of the Korean Society of Speech Sciences.

Environments & Dependencies

  • Linux
  • Python 2.X or 3.X
  • TensorFlow == 1.3
  • NumPy
  • Librosa
  • Matplotlib
  • tqdm
  • scipy
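
As a quick sanity check before training, the dependency list above can be verified with a short script. This is a minimal sketch, not part of the repository: it assumes Python 3 (`importlib.util` is not available in Python 2), and the import names in `REQUIRED_MODULES` and the helper `missing_modules` are illustrative choices. It only checks that each package can be located; it does not verify the TensorFlow 1.3 version pin.

```python
import importlib.util

# Import names assumed to correspond to the packages listed above.
REQUIRED_MODULES = ["tensorflow", "numpy", "librosa", "matplotlib", "tqdm", "scipy"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```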

Audiobooks & Datasets

|Code|Language|Audiobook|Running Time|Reader|Dataset|
|--|--|--|--|--|--|
|de|German|1. Meister Floh <br>2. Die acht Gesichter am Biwasee <br>3. Auswahl aus Die Serapionsbrüder|16:42:45|Hokuspokus|CSS German|
|el|Greek|Παραμύθι χωρίς όνομα (Tale Without Name)|04:08:14|Rapunzelina|CSS Greek|
|es|Spanish|1. Bailén <br>2. El 19 de Marzo y el 2 de Mayo <br>3. La Batalla de los Arapiles|23:49:49|Tux|CSS Spanish|
|fi|Finnish|1. Gulliverin matkat kaukaisilla mailla <br>2. Ensimmäiset novellit <br>3. Kaleri-orja <br>4. Salmelan heinätalkoot|10:32:03|Harri Tapani Ylilammi|CSS Finnish|
|fr|French|1. Les Misérables - tome 5 <br>2. Arsène Lupin contre Herlock Sholmès|19:09:03|Gilles G. Le Blanc|CSS French|
|hu|Hungarian|Egri csillagok|10:00:25|Diana Majlinger|CSS Hungarian|
|ja|Japanese|明暗 (Meian)|14:55:36|ekzemplaro|CSS Japanese|
|nl|Dutch|20.000 Mijlen onder Zee|14:06:40|Bart de Leeuw|CSS Dutch|
|ru|Russian|1. Ice March - Ледяной поход <br>2. Early Short Stories <br>3. Short Stories for Children and Adults|21:22:10|Mark Chulsky|CSS Russian|
|zh|Chinese|1. 朝花夕拾 (Chao Hua Si She) <br>2. 呐喊 (Call to Arms)|06:27:04|Jing Li|CSS Chinese|
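
The running times in the table are given as `HH:MM:SS`. To compare dataset sizes or compute the total amount of audio, they can be converted to seconds. The following is a minimal sketch: the durations are copied from the table above, while the names `RUNNING_TIMES`, `to_seconds`, and `format_hms` are illustrative and not part of the repository.

```python
# Running times copied from the table above, keyed by language code.
RUNNING_TIMES = {
    "de": "16:42:45", "el": "04:08:14", "es": "23:49:49", "fi": "10:32:03",
    "fr": "19:09:03", "hu": "10:00:25", "ja": "14:55:36", "nl": "14:06:40",
    "ru": "21:22:10", "zh": "06:27:04",
}


def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds):
    """Format a number of seconds as 'H:MM:SS' (hours may exceed 24)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{}:{:02d}:{:02d}".format(hours, minutes, seconds)


total = sum(to_seconds(value) for value in RUNNING_TIMES.values())

# List the datasets from longest to shortest, then the overall total.
for code, value in sorted(RUNNING_TIMES.items(), key=lambda kv: -to_seconds(kv[1])):
    print("{}: {:.2f} h".format(code, to_seconds(value) / 3600))
print("total:", format_hms(total))
```

Sorting by duration makes it easy to see that the amount of audio varies considerably between languages, which may matter when comparing models trained on different datasets.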

Pretrained Models & Audio Samples

|Code|Language|Pretrained Models|Audio Samples|
|--|--|--|--|
|de|German|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|el|Greek|DCTTS|DCTTS|
|es|Spanish|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|fi|Finnish|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|fr|French|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|hu|Hungarian|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|ja|Japanese|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|nl|Dutch|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|ru|Russian|DCTTS \| TACOTRON|DCTTS \| TACOTRON|
|zh|Chinese|DCTTS \| TACOTRON|DCTTS \| TACOTRON|

Cite

@article{park2019css10,
  title={CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages},
  author={Park, Kyubyong and Mulc, Thomas},
  journal={Interspeech},
  year={2019}
}

By Kyubyong Park, Tommy Mulc
