Dctts2

Deep Convolution Text to Speech

This is an implementation of the paper "Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention" (https://arxiv.org/abs/1710.08969).

The code is based on the following implementations:

https://github.com/keithito/tacotron.git
https://github.com/joisino/chainer-ETTTS.git
https://github.com/Kyubyong/tacotron.git

The model trains the "Text2Mel" and "SSRN" networks separately, through trainmel.py and trainmag.py respectively. You need to download the LJSpeech dataset, available at https://keithito.com/LJ-Speech-Dataset/

Audio Samples

You can listen to Audio Samples

Pre-Trained models can be downloaded Here

Prepare the dataset

First, prepare the dataset. If you want to use the LJSpeech dataset, you can use the following commands.

$ wget http://data.keithito.com/data/speech/LJSpeech-1.0.tar.bz2
$ tar xvf LJSpeech-1.0.tar.bz2
$ python prepro.py
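
What prepro.py does internally is not shown in this README. As a rough sketch of the input it has to read: the LJSpeech archive ships a metadata.csv whose lines are pipe-separated (clip ID, transcription, normalized transcription). The helper name parse_metadata and the sample lines below are hypothetical and not taken from the repository or the dataset.

```python
def parse_metadata(lines):
    """Return (clip_id, text) pairs from LJSpeech-style metadata lines.

    Hypothetical helper for illustration; not part of this repository.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("|")
        clip_id = fields[0]
        # Prefer the normalized transcription (last field) when present.
        text = fields[-1] if len(fields) > 1 else ""
        pairs.append((clip_id, text))
    return pairs


# Made-up lines in the metadata.csv layout: id|transcription|normalized
sample = [
    "LJ000-0001|Dr. Smith said 10.|Doctor Smith said ten.",
    "",
    "LJ000-0002|It was 1 of them.|It was one of them.",
]
pairs = parse_metadata(sample)
```

In a real run, the lines would come from the metadata.csv file inside the extracted archive, and each clip ID would map to a WAV file of the same name.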

Train the Text2Mel network

$ python trainmel.py

During training you can review the output. By default, every 200 minibatches the script dumps the first two examples in the batch into mel0.png and mel1.png, and you can view the learned attention in a0.png and a1.png.

MEL

Attention
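
The attention plots are expected to become roughly diagonal because the paper adds a guided attention penalty. A minimal sketch of that penalty, using the paper's weight W[n][t] = 1 - exp(-(n/N - t/T)^2 / (2 g^2)) with g = 0.2, is shown here in plain Python. The function names are hypothetical, and whether trainmel.py computes the term exactly this way is an assumption.

```python
import math


def guided_attention_weights(n_text, n_frames, g=0.2):
    """Build the N x T matrix W[n][t] = 1 - exp(-(n/N - t/T)^2 / (2 g^2))."""
    weights = []
    for n in range(n_text):
        row = []
        for t in range(n_frames):
            diff = n / n_text - t / n_frames
            row.append(1.0 - math.exp(-(diff * diff) / (2.0 * g * g)))
        weights.append(row)
    return weights


def guided_attention_loss(attention, g=0.2):
    """Mean of attention[n][t] * W[n][t]; small when attention is near-diagonal."""
    n_text = len(attention)
    n_frames = len(attention[0])
    w = guided_attention_weights(n_text, n_frames, g)
    total = sum(
        attention[n][t] * w[n][t]
        for n in range(n_text)
        for t in range(n_frames)
    )
    return total / (n_text * n_frames)


# Toy 4x4 attention maps: one on the diagonal, one on the anti-diagonal.
diagonal = [[1.0 if n == t else 0.0 for t in range(4)] for n in range(4)]
anti_diagonal = [[1.0 if n == 3 - t else 0.0 for t in range(4)] for n in range(4)]

loss_diagonal = guided_attention_loss(diagonal)
loss_anti = guided_attention_loss(anti_diagonal)
```

The diagonal map incurs no penalty, while the anti-diagonal map is penalized, which is why a0.png and a1.png should sharpen toward a diagonal line as training progresses.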

Train the SSRN network

$ python trainmag.py

During training you can view the output in mag0.png and mag1.png, which compare the learned spectrogram with the ground truth.

MAG

Synthesize

To synthesize a new sentence, use:

$ python synth.py --text "sentence to synthesize" --file output.wav
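
For readers unfamiliar with the two flags, the following is a sketch of how a command line like this could be parsed with argparse. Only the flag names --text and --file come from the command above. The function name build_parser, the help strings, and the output.wav default are assumptions, and the actual synth.py may be written differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the --text and --file flags."""
    parser = argparse.ArgumentParser(
        description="Synthesize speech from text (illustrative sketch)."
    )
    parser.add_argument("--text", required=True, help="sentence to synthesize")
    # The default file name is an assumption made for this sketch.
    parser.add_argument("--file", default="output.wav", help="WAV file to write")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(["--text", "hello world", "--file", "hello.wav"])
```

After parsing, args.text holds the sentence and args.file holds the output path.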

Demo web server

You can run a demo web server to do TTS by running:

$ python server.py

The demo uses the Flask framework.

Repository: eazhary/dctts2 on GitHub (Python)