Sirene
Self-hosted text-to-speech platform with multi-backend support, voice cloning, and a modern web UI.
Full documentation: kevinbonnoron.github.io/sirene
Screenshots
<div align="center"> <img src="docs/assets/login.png" width="100%" alt="Login" /> </div>
<br/>
<div align="center"> <img src="docs/assets/home.png" width="49%" alt="Dashboard" /> <img src="docs/assets/voices.png" width="49%" alt="Voices" /> </div>
<div align="center"> <img src="docs/assets/models.png" width="49%" alt="Models" /> <img src="docs/assets/history.png" width="49%" alt="History" /> </div>
Quick Start
curl -sSL https://raw.githubusercontent.com/KevinBonnoron/sirene/main/install.sh | bash
Then open http://localhost.
Features
- Multi-backend TTS — Route requests to Kokoro, Qwen3-TTS, F5-TTS, Piper, CosyVoice, OpenAudio, or Chatterbox from a single interface
- Voice cloning — Create custom voices by uploading audio samples with zero-shot cloning
- Model management — Download and manage TTS models on demand from the web UI
- Real-time updates — Track downloads and generation progress via Server-Sent Events
- Transcription — Speech-to-text via Whisper models
- Self-hosted — Two lightweight Docker images: one for the web/API, one for inference
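The real-time updates listed above are delivered as Server-Sent Events. The sketch below shows how the standard SSE wire format (`event:` and `data:` fields, with a blank line ending each event) can be parsed in Python. It is only an illustration of the format: the `progress` event name and the JSON payload are hypothetical, and Sirene's actual endpoints and event names are not documented in this README.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per Server-Sent Event from an iterable of text lines."""
    event = "message"  # default event type in the SSE format
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: the event name and payload are made up for illustration.
sample = [
    "event: progress\n",
    'data: {"percent": 40}\n',
    "\n",
    ": keep-alive\n",
    "data: done\n",
    "\n",
]
events = list(parse_sse(sample))
```

In practice the lines would come from a streaming HTTP response rather than a list, but the parsing logic stays the same.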
Supported Backends
| Backend | Voice Cloning | Streaming | Languages |
|---------|:---:|:---:|---|
| Kokoro | No | No | EN, FR, JA, KO, ZH |
| Qwen3-TTS | Yes | No | 10+ languages |
| F5-TTS | Yes | Yes | Multilingual |
| Piper | No | No | 26 languages |
| CosyVoice | Yes | Yes | 9 languages |
| OpenAudio S1 | Yes | No | Multilingual |
| Chatterbox | Yes | No | EN + 23 languages |
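When choosing a backend, it can help to treat the capability table as data. The following is a minimal sketch that transcribes the table into Python and filters it by required features. The `Backend` class and `pick` function are hypothetical helpers for illustration and are not part of Sirene's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str
    voice_cloning: bool
    streaming: bool
    languages: str


# Transcribed from the Supported Backends table.
BACKENDS = [
    Backend("Kokoro", False, False, "EN, FR, JA, KO, ZH"),
    Backend("Qwen3-TTS", True, False, "10+ languages"),
    Backend("F5-TTS", True, True, "Multilingual"),
    Backend("Piper", False, False, "26 languages"),
    Backend("CosyVoice", True, True, "9 languages"),
    Backend("OpenAudio S1", True, False, "Multilingual"),
    Backend("Chatterbox", True, False, "EN + 23 languages"),
]


def pick(voice_cloning: bool = False, streaming: bool = False) -> list[str]:
    """Return the names of backends that offer every requested capability."""
    return [
        b.name
        for b in BACKENDS
        if (b.voice_cloning or not voice_cloning)
        and (b.streaming or not streaming)
    ]


if __name__ == "__main__":
    print(pick(voice_cloning=True, streaming=True))
```

Per the table, only F5-TTS and CosyVoice combine voice cloning with streaming, so those are the two names the example prints.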
Development
Prerequisites
- Bun >= 1.2.4
- Python >= 3.11
- PocketBase (installed automatically in the devcontainer)
Quick Start
The easiest way is to use the devcontainer — open the project in VS Code or GitHub Codespaces and all dependencies are installed automatically.
For manual setup:
bun install
pip install -e "./inference[cpu]"
mkdir -p data/models
Start all services
bun run dev
| Service | Port |
|---------|------|
| PocketBase | 8090 |
| Hono Server | 3000 |
| Vite Client | 5173 |
| Inference FastAPI | 8000 |
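To confirm that `bun run dev` brought everything up, one option is a quick TCP check against the ports in the table. This is a rough sketch, assuming the services listen on `127.0.0.1`. Some dev servers bind only to `localhost` over IPv6, in which case the host would need adjusting. The `is_listening` helper is hypothetical and not part of the project.

```python
import socket

# Ports taken from the service table above.
DEV_SERVICES = {
    "PocketBase": 8090,
    "Hono Server": 3000,
    "Vite Client": 5173,
    "Inference FastAPI": 8000,
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for name, port in DEV_SERVICES.items():
        status = "up" if is_listening(port) else "down"
        print(f"{name:<18} :{port:<5} {status}")
```

A successful connection only shows that something is listening on the port, not that the service behind it is healthy.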
Scripts
bun run dev # All services in dev mode
bun run build # Production build
bun run lint # Biome lint
bun run format # Biome format
bun run type-check # TypeScript check
License
MIT
