AdvDCTTS
Implementation of DCTTS with Adversarial Training
AdvDCTTS (Adversarial Deep Convolutional TTS)
Prerequisites
- python 3.7
- pytorch 1.3
- librosa, scipy, tqdm, tensorboardX
Dataset
Usage
- Download the dataset above and set its path in config.py, then run the command below. The first argument controls signal preprocessing and the second controls metadata generation (train/test split).
python prepro.py 1 1
- DCTTS has two models. Train Text2Mel first. I think about 20k steps (roughly an hour) is enough to start with, but you should keep training longer while decaying the guided attention loss.
python train.py text2mel <gpu_id>
- Second, train SSRN with the GAN. SSRN outputs a large amount of high-resolution data, so training SSRN is slower than training Text2Mel.
python gan_train.py <gpu_id>
- After training, you can synthesize speech from text.
python synthesize.py <gpu_id>
- You can also test SSRN using ground-truth mel spectrograms.
python test.py <gpu_id>
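
For readers unfamiliar with the guided attention loss mentioned in the Text2Mel step, here is a minimal pure-Python sketch. It is not the code in this repository. It assumes the weight formula from the original DCTTS paper, W[n][t] = 1 - exp(-((n/N - t/T)^2) / (2 g^2)) with g = 0.2, which penalizes attention that strays from the diagonal. The function names and the `scale` argument are hypothetical; `scale` only stands in for whatever decay schedule you choose, since the schedule is not specified here.

```python
import math


def guided_attention_weights(n_text, n_frames, g=0.2):
    """Penalty matrix W[n][t]: 0 on the diagonal, approaching 1 far from it."""
    return [
        [
            1.0 - math.exp(-((n / n_text - t / n_frames) ** 2) / (2.0 * g * g))
            for t in range(n_frames)
        ]
        for n in range(n_text)
    ]


def guided_attention_loss(attention, g=0.2, scale=1.0):
    """Mean of attention * W over all (text position, mel frame) pairs.

    `attention` is a list of rows, one per text position, each holding one
    weight per mel frame. `scale` is a hypothetical multiplier that a training
    loop could shrink over time to decay this loss term.
    """
    n_text = len(attention)
    n_frames = len(attention[0])
    weights = guided_attention_weights(n_text, n_frames, g)
    total = sum(
        a * w
        for a_row, w_row in zip(attention, weights)
        for a, w in zip(a_row, w_row)
    )
    return scale * total / (n_text * n_frames)
```

A perfectly diagonal attention matrix gives zero loss, while attention that is far from the diagonal is penalized. Passing a smaller `scale` as training proceeds is one simple way to decay the term.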

Notes
- You can get sharper spectrograms.
Other Codes
