Sirene

Self-hosted text-to-speech platform with multi-backend support, voice cloning, and a modern web UI.

Full documentation: kevinbonnoron.github.io/sirene

Quick Start

curl -sSL https://raw.githubusercontent.com/KevinBonnoron/sirene/main/install.sh | bash

Then open http://localhost.

Features

- Multi-backend TTS: route requests to Kokoro, Qwen3-TTS, F5-TTS, Piper, CosyVoice, OpenAudio, or Chatterbox from a single interface
- Voice cloning: create custom voices by uploading audio samples with zero-shot cloning
- Model management: download and manage TTS models on demand from the web UI
- Real-time updates: track downloads and generation progress via Server-Sent Events
- Transcription: speech-to-text via Whisper models
- Self-hosted: two lightweight Docker images, one for the web/API and one for inference
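
The real-time updates feature relies on Server-Sent Events, which use a simple line-based text format. As a rough sketch of how a client might decode such a stream, the function below parses a raw `text/event-stream` chunk into events. The event name and JSON payload in the sample (`download`, `progress`) are hypothetical; Sirene's actual event schema is not described here.

```typescript
// Minimal parser for the Server-Sent Events wire format.
// The event name and payload in `sample` are hypothetical, not Sirene's schema.
interface SseEvent {
  event: string; // defaults to "message" when no "event:" field is present
  data: string;  // multiple "data:" lines are joined with "\n"
}

function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // skip empty lines and comments
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
      // "id" and "retry" fields are ignored in this sketch.
    }
    // A block without any data field does not produce an event.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

const sample = 'event: download\ndata: {"progress": 42}\n\n: keep-alive\n\ndata: done\n\n';
const parsed = parseSse(sample);
console.log(parsed);
```

In a browser, the built-in `EventSource` API handles this parsing automatically; a hand-written parser like this is mainly useful when reading the stream through `fetch` or in tests.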

Supported Backends

| Backend | Voice Cloning | Streaming | Languages |
|---------|:---:|:---:|---|
| Kokoro | — | — | EN, FR, JA, KO, ZH |
| Qwen3-TTS | Yes | — | 10+ languages |
| F5-TTS | Yes | Yes | Multilingual |
| Piper | — | — | 26 languages |
| CosyVoice | Yes | Yes | 9 languages |
| OpenAudio S1 | Yes | — | Multilingual |
| Chatterbox | Yes | — | EN + 23 languages |
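
When choosing a backend programmatically, the capability table can be expressed as plain data. The sketch below transcribes the table and filters it by required features. The `BackendInfo` type and `findBackends` helper are illustrative names, not part of Sirene's codebase.

```typescript
// Capability data transcribed from the table above.
// Type and function names are illustrative, not Sirene's actual API.
interface BackendInfo {
  name: string;
  voiceCloning: boolean;
  streaming: boolean;
  languages: string;
}

const BACKENDS: BackendInfo[] = [
  { name: "Kokoro", voiceCloning: false, streaming: false, languages: "EN, FR, JA, KO, ZH" },
  { name: "Qwen3-TTS", voiceCloning: true, streaming: false, languages: "10+ languages" },
  { name: "F5-TTS", voiceCloning: true, streaming: true, languages: "Multilingual" },
  { name: "Piper", voiceCloning: false, streaming: false, languages: "26 languages" },
  { name: "CosyVoice", voiceCloning: true, streaming: true, languages: "9 languages" },
  { name: "OpenAudio S1", voiceCloning: true, streaming: false, languages: "Multilingual" },
  { name: "Chatterbox", voiceCloning: true, streaming: false, languages: "EN + 23 languages" },
];

// Return the names of backends that satisfy every requested capability.
function findBackends(needs: { voiceCloning?: boolean; streaming?: boolean }): string[] {
  return BACKENDS
    .filter((b) => (!needs.voiceCloning || b.voiceCloning) && (!needs.streaming || b.streaming))
    .map((b) => b.name);
}

const cloningAndStreaming = findBackends({ voiceCloning: true, streaming: true });
console.log(cloningAndStreaming); // [ "F5-TTS", "CosyVoice" ]
```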

Development

Prerequisites

- Bun >= 1.2.4
- Python >= 3.11
- PocketBase (installed automatically in the devcontainer)
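
For a manual setup, the minimum versions above can be checked before installing anything. This is a minimal sketch of a dotted-version comparison; the `meetsMinimum` helper is a hypothetical name, and it assumes purely numeric versions (no pre-release suffixes).

```typescript
// Compare dotted numeric versions such as "1.2.4" or "3.11".
// Hypothetical helper; pre-release tags like "1.3.0-beta" are not handled.
function meetsMinimum(version: string, minimum: string): boolean {
  const a = version.split(".").map(Number);
  const b = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0; // missing components count as 0
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true; // all components are equal
}

console.log(meetsMinimum("1.2.10", "1.2.4")); // true
console.log(meetsMinimum("3.10.12", "3.11")); // false
```

Comparing component by component as numbers avoids the pitfall of string comparison, where "1.2.10" would sort before "1.2.4".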

Quick Start

The easiest way is to use the devcontainer — open the project in VS Code or GitHub Codespaces and all dependencies are installed automatically.

For manual setup:

bun install
pip install -e "./inference[cpu]"
mkdir -p data/models

Start all services

bun run dev

| Service | Port |
|---------|------|
| PocketBase | 8090 |
| Hono Server | 3000 |
| Vite Client | 5173 |
| Inference FastAPI | 8000 |
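
When scripting against a local dev environment, it can help to keep these ports in one place. The sketch below assumes the default ports from the table and that every service listens on `localhost`; the key names and the `devUrl` helper are hypothetical, not taken from the project.

```typescript
// Default dev ports from the table above; key names are illustrative.
const DEV_PORTS = {
  pocketbase: 8090,
  server: 3000,
  client: 5173,
  inference: 8000,
} as const;

type Service = keyof typeof DEV_PORTS;

// Build a localhost URL for a dev service (assumes plain HTTP on localhost).
function devUrl(service: Service, path: string = "/"): string {
  return new URL(path, `http://localhost:${DEV_PORTS[service]}`).toString();
}

console.log(devUrl("client")); // http://localhost:5173/
```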

Scripts

bun run dev          # All services in dev mode
bun run build        # Production build
bun run lint         # Biome lint
bun run format       # Biome format
bun run type-check   # TypeScript check

License

MIT
