fishaudio / Fish Speech - SOTA Open Source TTS
mozilla / DeepSpeech - DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
PaddlePaddle / PaddleSpeech - Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
speechbrain / Speechbrain - A PyTorch-based Speech Toolkit
Uberi / Speech Recognition - Speech recognition module for Python, supporting several engines and APIs, online and offline.
nl8590687 / ASRT SpeechRecognition - A Deep-Learning-Based Chinese Speech Recognition System
huggingface / Speech To Speech - Build local voice agents with open-source models
WhisperSpeech / WhisperSpeech - An Open Source text-to-speech system built by inverting Whisper.
buriburisuri / Speech To Text Wavenet - Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow
Azure-Samples / Cognitive Services Speech SDK - Sample code for the Microsoft Cognitive Services Speech SDK
zzw922cn / Awesome Speech Recognition Speech Synthesis Papers - Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
zzw922cn / Automatic Speech Recognition - End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
hahahumble / Speechgpt - 💬 SpeechGPT is a web application that enables you to converse with ChatGPT.
jameslyons / Python Speech Features - This library provides common speech features for ASR including MFCCs and filterbank energies.
edobashira / Speech Language Processing - A curated list of speech and natural language processing resources
pannous / Tensorflow Speech Recognition - 🎙 Speech recognition using the TensorFlow deep learning framework, sequence-to-sequence neural networks
ming024 / FastSpeech2 - An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
google / Live Transcribe Speech Engine - Live Transcribe is an Android application that provides real-time captioning for people who are deaf or hard of hearing. This repository contains the Android client libraries for communicating with Google's Cloud Speech API that are used in Live Transcribe.
microsoft / NeuralSpeech - No description available
microsoft / SpeechT5 - Unified-Modal Speech-Text Pre-Training for Spoken Language Processing