Sussurro
A fully local, open-source voice-to-text tool that acts as a system-wide AI dictation layer, converting speech into clean, formatted text.
Sussurro is a fully local, open-source voice-to-text system with a built-in native overlay UI. It transforms speech into clean, formatted, context-aware text and injects it into any application — entirely on your machine, using Whisper.cpp for ASR and a fine-tuned Qwen 3 LLM for cleanup.
Install
curl -fsSL https://raw.githubusercontent.com/cesp99/sussurro/master/scripts/install.sh | bash
Works on Linux and macOS. The script detects your platform, downloads the right binary, and places it in /usr/local/bin (or ~/.local/bin). On first run Sussurro will guide you through downloading the AI models.
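Because the script may place the binary in `~/.local/bin`, which is not on `PATH` in every shell setup, it can help to check both locations after installing. The following is a minimal sketch; `dir_on_path` is a hypothetical helper name and is not part of Sussurro or its install script.

```shell
# Report whether a directory appears in a colon-separated PATH-style list.
# Usage: dir_on_path DIR [PATH_STRING]   (PATH_STRING defaults to $PATH)
dir_on_path() {
  dir="$1"
  list="${2:-$PATH}"
  case ":${list}:" in
    *":${dir}:"*) return 0 ;;
    *)            return 1 ;;
  esac
}

# Check the two install locations mentioned above.
for candidate in /usr/local/bin "$HOME/.local/bin"; do
  if dir_on_path "$candidate"; then
    echo "on PATH:     $candidate"
  else
    echo "not on PATH: $candidate"
  fi
done
```

If `~/.local/bin` is reported as not on `PATH`, adding it in your shell profile is one common fix.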
Wayland users: after install, bind the hotkey in your desktop environment (see Wayland Setup).
macOS users: grant Accessibility access when prompted (System Settings → Privacy & Security → Accessibility).
Features
- Built-in Native Overlay: A minimal, aesthetically clean floating capsule shows recording/transcribing state — always on top, no taskbar entry (Linux & macOS)
- Settings UI: Dark-themed settings window accessible via system tray or right-click on the overlay (Linux & macOS)
- Smart Cleanup: Removes filler words, handles self-corrections, prevents hallucinations
- Local Processing: No data leaves your machine
- System-Wide: Works in any application where you can type
- Flexible ASR: Whisper Small (fast) or Large v3 Turbo (accurate), switchable from the UI
- Live Hotkey Config: Change the global hotkey from Settings — takes effect instantly, no restart
- Hotkey Mode: Switch between Push to Talk (hold to record, release to transcribe) and Toggle (press once to start, press again to transcribe) directly from Settings (X11 & macOS only)
- Transcription Language: Choose the language Whisper listens for (or use Auto Detect) directly from Settings
- Headless Mode: `--no-ui` flag for CLI/scripting use on any platform
Quick Reference
| Platform | Default hotkey | Default mode | Access Settings |
|----------|---------------|-------------|----------------|
| Linux X11 | Ctrl+Shift+Space | Push to Talk | System tray or right-click capsule |
| Linux Wayland | configured in DE | n/a (external shortcut) | System tray or right-click capsule |
| macOS | Cmd+Shift+Space | Push to Talk | System tray or right-click capsule |
The hotkey mode can be changed at any time from Settings → Global Hotkey → Mode.
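If you are unsure which row of the table applies to your machine, a rough check is to look at the OS name and, on Linux, the `XDG_SESSION_TYPE` environment variable that most login managers set. The sketch below assumes that variable is present; `sussurro_platform` is a hypothetical helper name, not a Sussurro command.

```shell
# Map an OS name and session type to the matching row of the table above.
# Usage: sussurro_platform [UNAME_S] [SESSION_TYPE]
# Defaults: `uname -s` output and $XDG_SESSION_TYPE.
sussurro_platform() {
  os="${1:-$(uname -s)}"
  session="${2:-${XDG_SESSION_TYPE:-unknown}}"
  case "$os" in
    Darwin) echo "macOS" ;;
    Linux)
      case "$session" in
        wayland) echo "Linux Wayland" ;;
        x11)     echo "Linux X11" ;;
        *)       echo "Linux (session type unknown)" ;;
      esac ;;
    *) echo "not listed in the table: $os" ;;
  esac
}

sussurro_platform
```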
Documentation
- Quick Start: Get up and running in under 5 minutes
- Dependencies: System requirements and package installation
- Wayland Setup: One-time configuration for Wayland users
- Configuration: Detailed guide on `config.yaml` and environment variables
- Architecture: How the audio pipeline, ASR, and LLM engines work
- Compilation: Building from source (CLI and UI builds)
- File Transcription: `sussurro-transcribe` companion CLI for batch transcription of audio files
Building from Source
git clone https://github.com/cesp99/sussurro.git
cd sussurro
make build # → bin/sussurro (overlay + settings + tray)
Requires GTK3, WebKit2GTK, and AppIndicator dev headers on Linux. See Compilation for full instructions and per-distro dependency lists.
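Before running `make build` on Linux, you may want to confirm the development headers are visible to `pkg-config`. This is only a sketch: the module names passed at the bottom are common ones for GTK3, WebKit2GTK, and AppIndicator, but they are assumptions and may not match what Sussurro's build or your distro expects. `missing_modules` and `PKG_CHECK` are hypothetical names introduced for illustration.

```shell
# Print each pkg-config module from the argument list that is not installed.
# PKG_CHECK can be overridden for dry runs; it defaults to `pkg-config --exists`.
missing_modules() {
  for mod in "$@"; do
    ${PKG_CHECK:-pkg-config --exists} "$mod" 2>/dev/null || echo "$mod"
  done
}

# ASSUMPTION: module names vary by distro and version (for example,
# webkit2gtk-4.0 vs webkit2gtk-4.1, appindicator3-0.1 vs the ayatana variant).
missing_modules gtk+-3.0 webkit2gtk-4.1 ayatana-appindicator3-0.1
```

Any name printed is a module `pkg-config` could not find; the Compilation guide remains the authoritative dependency list.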
UI: The Overlay Capsule
When Sussurro runs (Linux or macOS), a sleek pill-shaped capsule appears at the bottom-center of your screen:
| State | Appearance |
|-------|-----------|
| Idle | 7 softly pulsing white dots |
| Recording | 7 waveform bars animated by your voice |
| Transcribing | "transcribing" text with a shimmer effect |
Accessing Settings:
| Method | How |
|--------|-----|
| System tray | Click the Sussurro icon → Open Settings |
| Right-click overlay | Right-click the capsule → Open Settings |
The settings window lets you switch Whisper models, download models with a live progress bar, select the transcription language, change the global hotkey, and choose the hotkey mode. All changes take effect immediately — no restart required.
Headless / CLI Mode
./sussurro --no-ui
Terminal output only — no overlay, no tray. Useful for scripting or low-resource environments.
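For scripting, one possible pattern is to start the headless process in the background and send its terminal output to a log file. The sketch below assumes only the `--no-ui` flag shown above; `start_headless`, `SUSSURRO_BIN`, and the log path are hypothetical names chosen for illustration, not Sussurro options.

```shell
# Start Sussurro headless in the background and capture its terminal output.
# Usage: start_headless [LOG_FILE]   (prints the PID of the background process)
# SUSSURRO_BIN is a hypothetical override for the binary location.
start_headless() {
  log="${1:-sussurro.log}"
  "${SUSSURRO_BIN:-./sussurro}" --no-ui >"$log" 2>&1 &
  echo "$!"
}

# Example (commented out so that sourcing this file starts nothing):
# pid=$(start_headless /tmp/sussurro.log)
# ... dictate as usual ...
# kill "$pid"
```

Keeping the PID lets the calling script stop the process when it is done.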
Switching Whisper Models
Via the Settings UI (recommended) — or from the command line:
./sussurro --whisper # (or --wsp)
| Model | Size | Best for |
|-------|------|----------|
| Whisper Small | ~488 MB | Faster, lower RAM |
| Whisper Large v3 Turbo | ~1.62 GB | Higher accuracy |
Companion Tools
sussurro-transcribe — File Transcription
A standalone CLI for transcribing audio files using the same local models. Requires ffmpeg.
Install
curl -fsSL https://raw.githubusercontent.com/cesp99/sussurro/master/scripts/install-transcribe.sh | bash
Usage
sussurro-transcribe -i recording.mp3 # raw Whisper output to stdout
sussurro-transcribe -i recording.wav -clean # with LLM cleanup
sussurro-transcribe -i audio.m4a -o out.txt # write to file
sussurro-transcribe -i audio.mp3 -lang en -debug # force language, verbose
See File Transcription for full documentation.
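The commands above handle one file at a time. A simple loop is one way to cover a whole directory; this sketch uses only the `-i` and `-o` flags shown above and assumes the files are `.mp3`. `transcribe_dir` and `TRANSCRIBE_BIN` are hypothetical names introduced here, not part of the tool.

```shell
# Transcribe every .mp3 in a directory, writing NAME.txt next to each file.
# Usage: transcribe_dir [DIR]   (DIR defaults to the current directory)
# TRANSCRIBE_BIN is a hypothetical override, handy for dry runs.
transcribe_dir() {
  dir="${1:-.}"
  for f in "$dir"/*.mp3; do
    # The glob stays literal when nothing matches, so skip non-existent paths.
    if [ ! -e "$f" ]; then continue; fi
    out="${f%.mp3}.txt"
    # Skip files that already have a transcript.
    if [ -e "$out" ]; then continue; fi
    "${TRANSCRIBE_BIN:-sussurro-transcribe}" -i "$f" -o "$out"
  done
}
```

Calling `transcribe_dir ~/recordings` would then process each file in turn, and re-running it only picks up files that do not have a `.txt` yet.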
License
GNU General Public License v3.0 — see LICENSE.
