coqui-ai / TTS: 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
modelscope / FunASR: A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
PaddlePaddle / PaddleSpeech: Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
speechbrain / Speechbrain: A PyTorch-based Speech Toolkit
espnet / Espnet: End-to-End Speech Processing Toolkit
open-mmlab / Amphion: Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
flashlight / Wav2letter: Facebook AI Research's Automatic Speech Recognition Toolkit
wenet-e2e / Wenet: Production First and Production Ready End-to-End Speech Recognition Toolkit
modelscope / ClearerVoice Studio: An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
coqui-ai / STT: 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
s3prl / S3prl: Self-Supervised Speech Pre-training and Representation Learning Toolkit
mravanelli / Pytorch Kaldi: pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
NVIDIA / OpenSeq2Seq: Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
alumae / Kaldi Gstreamer Server: Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
freewym / Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
ina-foss / InaSpeechSegmenter: CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
openspeech-team / Openspeech: Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
sooftware / Kospeech: Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
PaddlePaddle / Parakeet: PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)
soniqo / Speech Swift: AI speech toolkit for Apple Silicon with ASR, TTS, speech-to-speech, VAD, and diarization, powered by MLX and CoreML