AIVoiceChat
Low-latency AI companion voice chat in 60 lines of code, using faster_whisper and ElevenLabs input streaming
Seamless and real-time voice interaction with AI.
Hint: If you are interested in state-of-the-art voice solutions, please also have a look at Linguflex. It lets you control your environment by speaking and is one of the most capable and sophisticated open-source assistants currently available.
Uses faster_whisper and ElevenLabs input streaming for low-latency responses to spoken input.
Note: The demo was run on a 10 Mbit/s connection, so latency may be lower on faster connections.
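The latency benefit of input streaming comes from starting speech synthesis before the full reply text exists. The following is a minimal sketch of that idea only, not the code used in these scripts: it groups a stream of text fragments into chunks that end at punctuation, so each chunk could be handed to a speech stage as soon as it is complete. The function name `chunk_text_stream`, the delimiter set, and the simulated token list are assumptions made for illustration.

```python
from typing import Iterable, Iterator


def chunk_text_stream(tokens: Iterable[str], delimiters: str = ".,!?;:") -> Iterator[str]:
    """Group incoming text fragments into speakable chunks.

    A chunk is emitted as soon as the buffered text ends with a delimiter,
    so a downstream speech stage can start before the reply is finished.
    """
    buffer = ""
    for token in tokens:
        buffer += token
        # Ignore trailing whitespace when checking for a delimiter.
        if buffer.rstrip().endswith(tuple(delimiters)):
            yield buffer
            buffer = ""
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer


# Simulated fragments standing in for a streamed language model reply.
fake_reply = ["Hello", " there", ".", " How", " are", " you", "?"]
chunks = list(chunk_text_stream(fake_reply))
```

With a real streamed reply, the first chunk would be available after the first sentence, which is why the first audio can start early.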
voice_talk_vad.py - automatically detects speech
voice_talk.py - toggle recording on/off with the spacebar
🛠 Setup:
1. API Keys:
In the code, replace your_openai_key and your_elevenlabs_key with your OpenAI and ElevenLabs API keys.
2. Dependencies:
Install the required Python libraries:
pip install openai elevenlabs pyaudio wave keyboard faster_whisper numpy torch
3. Run the Script:
Execute the main script based on your mode preference:
python voice_talk_vad.py
or
python voice_talk.py
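Step 1 above places the API keys directly in the source. As a possible alternative, and not something the scripts are stated to do, the keys could be read from environment variables so they stay out of the code. The sketch below assumes the variable names `OPENAI_API_KEY` and `ELEVENLABS_API_KEY`; the helper `load_api_keys` is hypothetical.

```python
import os


def load_api_keys(env=None):
    """Return both API keys from the environment, or raise if one is missing.

    The variable names are assumptions for this sketch; adjust them to
    whatever your own setup uses.
    """
    env = os.environ if env is None else env
    keys = {}
    for name in ("OPENAI_API_KEY", "ELEVENLABS_API_KEY"):
        value = env.get(name, "").strip()
        if not value:
            raise RuntimeError(f"Missing environment variable: {name}")
        keys[name] = value
    return keys
```

The returned values could then be used in place of the your_openai_key and your_elevenlabs_key placeholders.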
🎙 How to Use:
For voice_talk_vad.py:
- Talk into your microphone.
- Listen to the reply.
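How voice_talk_vad.py detects speech is not described here. To illustrate the general idea of automatic speech detection, the sketch below uses a simple energy threshold, which is a stand-in technique and may differ from what the script does. The function names, the threshold of 500, and the three-frame silence limit are all assumptions; frames are plain lists of 16-bit sample values.

```python
import math
from typing import Optional, Sequence, Tuple


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of audio samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def find_utterance(
    frames: Sequence[Sequence[int]],
    threshold: float = 500.0,
    max_silent_frames: int = 3,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the first utterance, end exclusive.

    Speech starts at the first frame whose RMS reaches the threshold and
    ends once max_silent_frames quiet frames follow in a row.
    """
    start = None
    silent = 0
    for i, frame in enumerate(frames):
        loud = rms(frame) >= threshold
        if start is None:
            if loud:
                start = i
        elif loud:
            silent = 0  # a short pause was bridged by more speech
        else:
            silent += 1
            if silent >= max_silent_frames:
                return start, i - silent + 1
    if start is None:
        return None
    # Stream ended while still in speech; drop trailing quiet frames.
    return start, len(frames) - silent
```

A short pause inside a sentence does not end the utterance, because the silence counter resets whenever a loud frame arrives.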
For voice_talk.py:
- Press the space bar to start talking.
- Speak your heart out.
- Hit the space bar again once you're done.
- Listen to the reply.
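The space bar steps above amount to a two-state toggle: the first press starts capturing audio, the second press stops and hands over what was captured. The sketch below models only that logic, without any keyboard or microphone access; the class `ToggleRecorder` and its methods are hypothetical and are not taken from voice_talk.py.

```python
class ToggleRecorder:
    """Two-state recorder: idle until toggled on, then collecting frames."""

    def __init__(self):
        self.recording = False
        self._frames = []

    def on_space(self):
        """Flip the state; return the captured frames when stopping, else None."""
        if not self.recording:
            self.recording = True
            self._frames = []  # start each recording with an empty buffer
            return None
        self.recording = False
        return list(self._frames)

    def feed(self, frame):
        """Store an audio frame, but only while recording is active."""
        if self.recording:
            self._frames.append(frame)
```

In a real script, `on_space` would be called from a key handler and `feed` from the audio input loop; the frames returned on the second press would then go to transcription.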
🤝 Contribute
Feel free to fork, improve, and submit pull requests. If you're considering significant changes or additions, please start by opening an issue.
💖 Acknowledgements
Huge shoutout to:
- The hardworking developers behind faster_whisper.
- ElevenLabs for their cutting-edge voice API.
- OpenAI for pioneering the GPT-4 model.
