AdvDCTTS

Implementation of DCTTS with Adversarial Training

AdvDCTTS (Adversarial Deep Convolutional TTS)

Prerequisite

Python 3.7
PyTorch 1.3
librosa, scipy, tqdm, tensorboardX

Dataset

LJ Speech 1.1

Usage

Download the dataset above and set its path in config.py, then run the command below. The first argument controls signal preprocessing and the second controls metadata generation (the train/test split).
```
python prepro.py 1 1
```
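As an illustration of what a metadata train/test split can look like, here is a minimal sketch. It is not the code in prepro.py: the function name `split_metadata`, the 5% test ratio, and the fixed seed are all assumptions made for the example. It only relies on LJ Speech's metadata.csv having one pipe-separated row per utterance.

```python
import random


def split_metadata(lines, test_ratio=0.05, seed=0):
    """Shuffle metadata rows deterministically and split them into (train, test).

    `test_ratio` and `seed` are hypothetical defaults, not values from the repo.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]  # drop blank lines
    rng = random.Random(seed)                          # local RNG, reproducible
    rng.shuffle(rows)
    n_test = max(1, int(len(rows) * test_ratio))       # keep at least one test row
    return rows[n_test:], rows[:n_test]


# Example rows in the LJ Speech layout: id|transcription|normalized transcription
example = ["LJ001-%04d|some text|some text" % i for i in range(1, 101)]
train_rows, test_rows = split_metadata(example)
```

A fixed seed keeps the split stable across runs, so test utterances do not end up in the training set after re-running preprocessing.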
DCTTS consists of two models. First, train Text2Mel. About 20k steps (roughly an hour) seems to be enough to get started, but the model improves if you keep training while decaying the guided attention loss.
```
python train.py text2mel <gpu_id>
```
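The guided attention loss mentioned above penalizes attention that strays from the text/frame diagonal. Below is a dependency-free sketch of the penalty from the DCTTS paper, W[n][t] = 1 - exp(-(n/N - t/T)^2 / (2 g^2)), with g = 0.2. The `decayed_weight` schedule is purely hypothetical and is only one way to "decay" the term; the schedule in train.py may differ.

```python
import math


def guided_attention_weights(n_text, n_frames, g=0.2):
    """Penalty matrix W (n_text x n_frames): 0 on the diagonal, approaching 1 far from it."""
    return [
        [1.0 - math.exp(-((n / n_text - t / n_frames) ** 2) / (2.0 * g * g))
         for t in range(n_frames)]
        for n in range(n_text)
    ]


def guided_attention_loss(attention, g=0.2):
    """Mean of attention[n][t] * W[n][t] over an N x T attention matrix (list of rows)."""
    n_text, n_frames = len(attention), len(attention[0])
    w = guided_attention_weights(n_text, n_frames, g)
    total = sum(attention[n][t] * w[n][t]
                for n in range(n_text) for t in range(n_frames))
    return total / (n_text * n_frames)


def decayed_weight(step, initial=1.0, half_life=10000):
    """Hypothetical schedule: halve the guided-attention weight every `half_life` steps."""
    return initial * 0.5 ** (step / half_life)


# A perfectly diagonal attention matrix incurs no penalty.
diagonal = [[1.0 if n == t else 0.0 for t in range(4)] for n in range(4)]
loss = decayed_weight(step=0) * guided_attention_loss(diagonal)
```

In a real training loop the same computation would be done on tensors, with the decayed weight multiplying the guided attention term before it is added to the spectrogram losses.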
Second, train SSRN with adversarial (GAN) training. SSRN outputs a large amount of high-resolution data, so training it is slower than training Text2Mel.
```
python gan_train.py <gpu_id>
```
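To make "training with GAN" concrete, here is a generic sketch of the two objectives involved, written in plain Python on scalar discriminator outputs. It assumes a standard binary cross-entropy GAN with a non-saturating generator term added to a reconstruction loss. The objective in gan_train.py may be different, and `adv_weight` is a hypothetical value.

```python
import math


def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for one probability; clamped to avoid log(0)."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """Push D(real spectrogram) toward 1 and D(SSRN output) toward 0."""
    real = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real + fake


def generator_loss(d_fake, recon_l1, adv_weight=0.01):
    """Reconstruction loss plus a weighted adversarial term (assumed weighting)."""
    adv = sum(bce(p, 1.0) for p in d_fake) / len(d_fake)
    return recon_l1 + adv_weight * adv


# The generator's loss drops as the discriminator is fooled more often.
fooled = generator_loss([0.9, 0.8], recon_l1=0.1)
caught = generator_loss([0.1, 0.2], recon_l1=0.1)
```

Keeping a reconstruction term alongside the adversarial one is a common design choice: the reconstruction loss anchors the output to the target spectrogram, while the adversarial term discourages over-smoothed predictions.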
After training, you can synthesize some speech from text.
```
python synthesize.py <gpu_id>
```
You can also test SSRN on ground-truth mel spectrograms.
```
python test.py <gpu_id>
```
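The steps above can be chained in a small driver script. This is a convenience sketch, not part of the repository: `build_pipeline` and `run_pipeline` are hypothetical names, and the script simply replays the commands listed in this README in order. It assumes each script exits on its own when it finishes; if a training script runs until interrupted, launch that stage by hand instead.

```python
import subprocess
import sys


def build_pipeline(gpu_id):
    """Return the README's commands, in order, as argument lists."""
    gpu = str(gpu_id)
    return [
        [sys.executable, "prepro.py", "1", "1"],         # preprocessing + metadata split
        [sys.executable, "train.py", "text2mel", gpu],   # stage 1: Text2Mel
        [sys.executable, "gan_train.py", gpu],           # stage 2: SSRN with GAN
        [sys.executable, "synthesize.py", gpu],          # synthesis from text
    ]


def run_pipeline(gpu_id, dry_run=True):
    """Print each command; also execute it unless dry_run is True."""
    for cmd in build_pipeline(gpu_id):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing stage


# Dry run: only prints the commands that would be executed on GPU 0.
run_pipeline(0, dry_run=True)
```

Call `run_pipeline(0, dry_run=False)` from the repository root to actually execute the stages.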

Notes

You can get sharper spectrograms.

Other Codes
