19 skills found
ivanvovk / WaveGrad: Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.
ivanvovk / Durian Pytorch: Implementation of "Duration Informed Attention Network for Multimodal Synthesis" paper in PyTorch.
BogiHsu / Tacotron2 PyTorch: Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
gokhaneraslan / Chatterbox Finetuning: Fine-tuning toolkit for Chatterbox TTS & Chatterbox TURBO models. Supports 23 languages with smart vocabulary extension. Features offline preprocessing, automatic VAD trimming, and voice cloning capabilities. Train custom TTS models with your own dataset in LJSpeech and file-based format.
thorstenMueller / Audio To Voice Dataset: Create an LJSpeech-structured voice dataset from wave input
andi611 / TTS Tacotron Pytorch: PyTorch implementation of Tacotron, an end-to-end generative TTS model for speech synthesis.
elizaOS / LJSpeechTools: Tools for making LJSpeech datasets
webaverse / LJSpeechTools: Tools to isolate a speaker and transcribe unstructured audio clips
AlexK-PL / GST Tacotron2: An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens. The model has been trained on the English read-speech LJSpeech dataset.
vislupus / Bulgarian TTS Dataset: LibriVox dataset for Bulgarian-language TTS
ZirumAndBigBro / Audio Transcriptor Russian: [Russian] This script splits an audio file on silence, transcribes it with Google speech recognition, and saves the result in the LJSpeech-1.1 dataset format.
AlexK-PL / GST Tacotron2 PitchContourReference: An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens. Instead of using the whole mel-scale spectrogram representation in the GST input, we extracted and used only the pitch contour in a sparse representation. The model has been trained on the English read-speech LJSpeech dataset.
SortAnon / ToARPAbet: Converts LJSpeech-format transcripts to ARPAbet.
zuverschenken / XTTSv2Scripts: Scripts for creating an LJSpeech-format dataset for TTS tasks
seanghay / Speechviewer: A quick audio dataset viewer
ggiggit / Phoneme Codec Visualization: Visualize the relationships between phonemes and codec tokens in a specialized speech dataset.
nipponjo / Mixer Tts Pytorch: Mixer-TTS for efficient TTS
DominicTWHV / LJSpeech Dataset Generator: LJSpeech dataset generator for TTS model training and fine-tuning
isadrtdinov / Quartznet: QuartzNet implementation for the automatic speech recognition (ASR) task