Izwi
On-device AI engine for transcription, TTS, and voice workflows.
Overview
Izwi is a privacy-focused audio AI platform that runs entirely on your machine. No cloud services, no API keys, no data leaving your device.
Core capabilities:
- Voice Mode — Real-time voice conversations with AI
- Text-to-Speech — Generate natural speech from text
- Speech Recognition — Convert audio to text with high accuracy
- Speaker Diarization — Identify and separate multiple speakers
- Voice Cloning — Clone any voice from a short audio sample
- Voice Design — Create custom voices from text descriptions
- Forced Alignment — Word-level audio-text alignment
- Chat — Text-based AI conversations
The server exposes OpenAI-compatible API routes under /v1.
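As an illustration of what "OpenAI-compatible" could look like from a client, here is a minimal Python sketch that builds a text-to-speech request. It assumes the server mirrors OpenAI's `/v1/audio/speech` route and its `model` / `input` JSON fields, and that it listens on `localhost:8080` as in the Quick Start. The route and the field names are assumptions, not confirmed by this README, and the helper names `build_speech_request` and `synthesize` are hypothetical.

```python
import json
import urllib.request

# Assumption: `izwi serve` listens on localhost:8080 (see Quick Start).
BASE_URL = "http://localhost:8080/v1"


def build_speech_request(text, model="Qwen3-TTS-12Hz-0.6B-Base", base_url=BASE_URL):
    """Build (but do not send) a POST request shaped like OpenAI's speech API."""
    payload = {"model": model, "input": text}
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",  # assumed route, mirrors OpenAI's API
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="hello.wav"):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Calling `synthesize("Hello from Izwi!")` requires a running server with a TTS model already pulled. Check the CLI Reference and Features docs for the routes Izwi actually serves.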
Quick Install
macOS
Download the latest .dmg from GitHub Releases:
- Open the .dmg file
- Drag Izwi.app to Applications
- Launch Izwi
Linux
wget https://github.com/izwi-ai/izwi/releases/latest/download/izwi_amd64.deb
sudo dpkg -i izwi_amd64.deb
Windows
Download and run the installer from GitHub Releases.
Full installation guides: macOS • Linux • Windows • From Source
Quick Start
1. Start the server
izwi serve
Open http://localhost:8080 in your browser.
2. Download a model
izwi pull Qwen3-TTS-12Hz-0.6B-Base
3. Generate speech
izwi tts "Hello from Izwi!" --output hello.wav
4. Transcribe audio
izwi pull Qwen3-ASR-0.6B
izwi transcribe audio.wav
Long-form ASR is handled automatically: Izwi now chunks long recordings, stitches overlapping transcripts, and returns a full transcript instead of only the first model window.
Optional tuning knobs:
IZWI_ASR_CHUNK_TARGET_SECS=24
IZWI_ASR_CHUNK_MAX_SECS=30
IZWI_ASR_CHUNK_OVERLAP_SECS=3
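To make the chunk-and-stitch idea concrete, the following Python sketch plans overlapping windows over a recording and merges per-chunk transcripts by dropping repeated words at the seams. It is not Izwi's implementation. It assumes the three knobs are environment variables, that the target is the normal chunk length, that the max lets a short tail be absorbed into the final chunk, and that stitching can be approximated by exact word overlap. The names `plan_chunks` and `stitch` are hypothetical.

```python
import os

# Assumption: the knobs are read from the environment; the fallbacks are
# the values shown above, which may or may not be Izwi's defaults.
TARGET = float(os.environ.get("IZWI_ASR_CHUNK_TARGET_SECS", "24"))
MAX = float(os.environ.get("IZWI_ASR_CHUNK_MAX_SECS", "30"))
OVERLAP = float(os.environ.get("IZWI_ASR_CHUNK_OVERLAP_SECS", "3"))


def plan_chunks(duration_secs, target_secs=TARGET, max_secs=MAX, overlap_secs=OVERLAP):
    """Return (start, end) windows that cover the recording with overlap."""
    if not 0 <= overlap_secs < target_secs <= max_secs:
        raise ValueError("need 0 <= overlap < target <= max")
    windows = []
    start = 0.0
    while start < duration_secs:
        # If the remainder fits within the max, take it as one final chunk.
        if duration_secs - start <= max_secs:
            windows.append((start, duration_secs))
            break
        end = start + target_secs
        windows.append((start, end))
        start = end - overlap_secs  # step back so neighbours share audio
    return windows


def stitch(transcripts):
    """Join chunk transcripts, removing words repeated across each seam."""
    merged = []
    for text in transcripts:
        words = text.split()
        overlap = 0
        # Find the longest suffix of `merged` that is a prefix of `words`.
        for k in range(min(len(merged), len(words)), 0, -1):
            if merged[-k:] == words[:k]:
                overlap = k
                break
        merged.extend(words[overlap:])
    return " ".join(merged)
```

With a target of 24, a max of 30, and an overlap of 3, a 60-second file would be split into windows starting at 0, 21, and 42 seconds. A real stitcher would likely need fuzzy matching or timestamps, because two passes over the same audio rarely produce identical words.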
Supported Models
| Category | Models |
|----------|--------|
| TTS | Qwen3-TTS (0.6B, 1.7B), Kokoro-82M |
| ASR | Qwen3-ASR (0.6B, 1.7B), Parakeet TDT |
| Diarization | Sortformer 4-speaker |
| Chat | Qwen3 (0.6B, 1.7B), Gemma 3 (1B, 4B) |
| Alignment | Qwen3-ForcedAligner |
Run izwi list to see all available models.
Full model documentation: Models Guide
Documentation
| Resource | Link |
|----------|------|
| Getting Started | izwiai.com/docs/getting-started |
| Installation | izwiai.com/docs/installation |
| Features | izwiai.com/docs/features |
| CLI Reference | izwiai.com/docs/cli |
| Models | izwiai.com/docs/models |
| Troubleshooting | izwiai.com/docs/troubleshooting |
License
Apache 2.0
Acknowledgments
- Qwen3-TTS by Alibaba
- Parakeet by NVIDIA
- Gemma by Google
- HuggingFace Hub for model hosting
