510 skills found
moeru-ai / Airi: 💖🧸 Self-hosted, user-owned Grok Companion: a container for the souls of waifus and cyber beings that brings them into our world, aiming to reach Neuro-sama's level. Capable of real-time voice chat and of playing Minecraft and Factorio. Supports Web, macOS, and Windows.
discordjs / Discord.js: A powerful JavaScript library for interacting with the Discord API
modelscope / FunASR: A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
QwenLM / Qwen3 TTS: Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice cloning.
Plachtaa / Seed Vc: Zero-shot voice conversion and singing voice conversion, with real-time support
WEIFENG2333 / AsrTools: ✨ Smart Voice-to-Text Tool | Efficient Batch Processing | User-Friendly Interface | No GPU Required | Supports SRT/TXT Output | Turn your audio into accurate text in an instant!
LiveHelperChat / Livehelperchat: Live Helper Chat, live support for your website, featuring web and mobile apps with voice, video, and screen sharing. Supports Telegram, Twilio (WhatsApp), and Facebook Messenger, including building a bot.
react-native-voice / Voice: 🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)
uowuo / Abaddon: An alternative Discord client with voice support, made with C++ and GTK 3
k2-fsa / Sherpa Ncnn: Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.
hungtraan / FacebookBot: A Facebook Messenger Bot that supports Voice Recognition and Natural Language Processing, with features such as searching nearby restaurants, searching trending news, and transcribing and saving memos to the cloud.
nazdridoy / Kokoro Tts: A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
OpenMOSS / MOSS TTSD: MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enabling zero-shot voice cloning from short audio references.
Henry-23 / VideoChat: Real-time voice-interactive digital human with customizable appearance and voice, supporting voice cloning, with conversation latency (initial packet delay) as low as 3 s.
PromtEngineer / Verbi: A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models. Supports OpenAI, Groq, ElevenLabs, CartesiaAI, and Deepgram APIs, plus local models via Ollama. Ideal for research and development in voice technology.
MattMoony / Figaro: Real-time voice changer for voice chat and similar uses. Will support many different voice filters and features in the future. 🎵
Vonage / Vonage Php SDK Core: Vonage REST API client for PHP. API support for SMS, Voice, Text-to-Speech, Numbers, Verify (2FA) and more.
Skythinker616 / Gpt Assistant Android: [New: PDF and Office file parsing and upload] An all-purpose GPT assistant for Android, activated via the volume keys for voice interaction, with support for web access, photo capture, templates, and parsing of PDF and Office documents.
Tomiinek / Multilingual Text To Speech: An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
talonhub / Community: Voice command set for Talon, community-supported.