Handy
A free, open source, and extensible speech-to-text application that works completely offline.
Handy is a cross-platform desktop application that provides simple, privacy-focused speech transcription. Press a shortcut, speak, and have your words appear in any text field. This happens on your own computer without sending any information to the cloud.
Why Handy?
Handy was created to fill the gap for a truly open source, extensible speech-to-text tool. As stated on handy.computer:
- Free: Accessibility tooling belongs in everyone's hands, not behind a paywall
- Open Source: Together we can build further. Extend Handy for yourself and contribute to something bigger
- Private: Your voice stays on your computer. Get transcriptions without sending audio to the cloud
- Simple: One tool, one job. Transcribe what you say and put it into a text box
Handy isn't trying to be the best speech-to-text app—it's trying to be the most forkable one.
How It Works
- Press a configurable keyboard shortcut to start/stop recording (or use push-to-talk mode)
- Speak your words while the shortcut is active
- Release and Handy processes your speech using Whisper
- Get your transcribed text pasted directly into whatever app you're using
The process is entirely local:
- Silence is filtered using VAD (Voice Activity Detection) with Silero
- Transcription uses your choice of models:
  - Whisper models (Small/Medium/Turbo/Large) with GPU acceleration when available
  - Parakeet V3: CPU-optimized model with excellent performance and automatic language detection
- Works on Windows, macOS, and Linux
Quick Start
Installation
- Download the latest release from the releases page or the website
  - macOS: Also available via Homebrew cask:
    brew install --cask handy
  - Windows: Also available via winget:
    winget install cjpais.Handy
  Note: The Homebrew cask and winget package are not maintained by the Handy developers.
- Install the application
- Launch Handy and grant necessary system permissions (microphone, accessibility)
- Configure your preferred keyboard shortcuts in Settings
- Start transcribing!
Development Setup
For detailed build instructions including platform-specific requirements, see BUILD.md.
Integrations
<a href="https://www.raycast.com/mattiacolombomc/handy" title="Install Handy Raycast Extension"><img src="https://www.raycast.com/mattiacolombomc/handy/install_button@2x.png?v=1.1" height="64" style="height: 64px;" alt="Install handy Raycast Extension" /></a>
Control Handy from Raycast — start/stop recording, browse transcript history, manage dictionary, switch models and languages.
Source · by @mattiacolombomc
Architecture
Handy is built as a Tauri application combining:
- Frontend: React + TypeScript with Tailwind CSS for the settings UI
- Backend: Rust for system integration, audio processing, and ML inference
- Core Libraries:
  - whisper-rs: Local speech recognition with Whisper models
  - transcription-rs: CPU-optimized speech recognition with Parakeet models
  - cpal: Cross-platform audio I/O
  - vad-rs: Voice Activity Detection
  - rdev: Global keyboard shortcuts and system events
  - rubato: Audio resampling
Debug Mode
Handy includes an advanced debug mode for development and troubleshooting. Access it by pressing:
- macOS: Cmd+Shift+D
- Windows/Linux: Ctrl+Shift+D
CLI Parameters
Handy supports command-line flags for controlling a running instance and customizing startup behavior. These work on all platforms (macOS, Windows, Linux).
Remote control flags (sent to an already-running instance via the single-instance plugin):
handy --toggle-transcription # Toggle recording on/off
handy --toggle-post-process # Toggle recording with post-processing on/off
handy --cancel # Cancel the current operation
Startup flags:
handy --start-hidden # Start without showing the main window
handy --no-tray # Start without the system tray icon
handy --debug # Enable debug mode with verbose logging
handy --help # Show all available flags
Flags can be combined for autostart scenarios:
handy --start-hidden --no-tray
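One way to use this combination at login on a Linux desktop is an XDG autostart entry. The sketch below is illustrative only: the helper name write_handy_autostart and the file name handy.desktop are assumptions, and it presumes that handy is on your PATH and that your desktop environment honors ~/.config/autostart.

```shell
#!/bin/sh
# Hypothetical helper (not part of Handy): write an XDG autostart entry
# that launches Handy hidden and without a tray icon at login.
# $1: target directory; defaults to the user's XDG autostart directory.
write_handy_autostart() {
    dir="${1:-${XDG_CONFIG_HOME:-$HOME/.config}/autostart}"
    mkdir -p "$dir" || return 1
    # Assumption: `handy` resolves via PATH in the login session.
    cat > "$dir/handy.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=Handy
Exec=handy --start-hidden --no-tray
EOF
}

# Usage (not executed here): write_handy_autostart
```

With no window and no tray icon, the running instance would then be driven through the remote control flags listed above.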
macOS tip: When Handy is installed as an app bundle, invoke the binary directly:
/Applications/Handy.app/Contents/MacOS/Handy --toggle-transcription
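If the same hotkey script has to run on both macOS and Linux, a small wrapper can pick the right way to invoke Handy. This is a minimal sketch: the function name handy_bin is hypothetical, and it assumes the app bundle sits at the default /Applications path shown above and that handy is on PATH elsewhere.

```shell
#!/bin/sh
# Hypothetical helper (not part of Handy): print the command used to
# reach Handy on a given OS.
# $1: OS name as printed by `uname -s`; defaults to the current OS.
handy_bin() {
    os="${1:-$(uname -s)}"
    case "$os" in
        # Assumption: the bundle is installed in the default location.
        Darwin) printf '%s\n' "/Applications/Handy.app/Contents/MacOS/Handy" ;;
        # Assumption: everywhere else, `handy` is available on PATH.
        *)      printf '%s\n' "handy" ;;
    esac
}

# Usage (not executed here): "$(handy_bin)" --toggle-transcription
```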
Known Issues & Current Limitations
This project is actively being developed and has some known issues. We believe in transparency about the current state:
Major Issues (Help Wanted)
Whisper Model Crashes:
- Whisper models crash on certain system configurations (Windows and Linux)
- Does not affect all systems - issue is configuration-dependent
- If you experience crashes and are a developer, please help fix the issue and share debug logs!
Wayland Support (Linux):
- Limited support for Wayland display server
- Requires wtype or dotool for text input to work correctly (see Linux Notes below for installation)
Linux Notes
Text Input Tools:
For reliable text input on Linux, install the appropriate tool for your display server:
| Display Server | Recommended Tool | Install Command |
| -------------- | ---------------- | -------------------------------------------------- |
| X11 | xdotool | sudo apt install xdotool |
| Wayland | wtype | sudo apt install wtype |
| Both | dotool | sudo apt install dotool (requires input group) |
- X11: Install xdotool for both direct typing and clipboard paste shortcuts
- Wayland: Install wtype (preferred) or dotool for text input to work correctly
- dotool setup: Requires adding your user to the input group: sudo usermod -aG input $USER (then log out and back in)
Without these tools, Handy falls back to enigo which may have limited compatibility, especially on Wayland.
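To check which tool applies to your session before installing anything, the recommendations in the table can be turned into a quick test. This is a rough sketch, not something Handy ships: the function names are hypothetical, and it assumes your login manager sets XDG_SESSION_TYPE to x11 or wayland.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Handy).

# Print the tool recommended above for a session type.
# $1: session type; defaults to $XDG_SESSION_TYPE, or "unknown" if unset.
suggest_input_tool() {
    session="${1:-${XDG_SESSION_TYPE:-unknown}}"
    case "$session" in
        x11)     echo "xdotool" ;;
        wayland) echo "wtype" ;;
        *)       echo "dotool" ;;   # listed in the table as working on both
    esac
}

# Report whether the suggested tool is already on PATH.
check_input_tool() {
    tool="$(suggest_input_tool "$@")"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
}

# Usage (not executed here): check_input_tool
```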
Other Notes:
- Runtime library dependency (libgtk-layer-shell.so.0):
  - Handy links gtk-layer-shell on Linux. If startup fails with error while loading shared libraries: libgtk-layer-shell.so.0, install the runtime package for your distro:
    | Distro        | Package to install  | Example command                      |
    | ------------- | ------------------- | ------------------------------------ |
    | Ubuntu/Debian | libgtk-layer-shell0 | sudo apt install libgtk-layer-shell0 |
    | Fedora/RHEL   | gtk-layer-shell     | sudo dnf install gtk-layer-shell     |
    | Arch Linux    | gtk-layer-shell     | sudo pacman -S gtk-layer-shell       |
  - For building from source on Ubuntu/Debian, you may also need libgtk-layer-shell-dev.
- The recording overlay is disabled by default on Linux (Overlay Position: None) because certain compositors treat it as the active window. When the overlay is visible it can steal focus, which prevents Handy from pasting back into the application that triggered transcription. If you enable the overlay anyway, be aware that clipboard-based pasting might fail or end up in the wrong window.
- If you are having trouble with the app, running with the environment variable WEBKIT_DISABLE_DMABUF_RENDERER=1 may help
- Global keyboard shortcuts (Wayland): On Wayland, system-level shortcuts must be configured through your desktop environment or window manager. Use the CLI flags as the command for your custom shortcut.
  GNOME:
  - Open Settings > Keyboard > Keyboard Shortcuts > Custom Shortcuts
  - Click the + button to add a new shortcut
  - Set the Name to Toggle Handy Transcription
  - Set the Command to handy --toggle-transcription
  - Click Set Shortcut and press your desired key combination (e.g., Super+O)
  KDE Plasma:
  - Open System Settings > Shortcuts > Custom Shortcuts
  - Click Edit > New > Global Shortcut > Command/URL
  - Name it Toggle Handy Transcription
  - In the Trigger tab, set your desired key combination
  - In the Action tab, set the command to handy --toggle-transcription
  Sway / i3:
  Add to your config file (~/.config/sway/config or ~/.config/i3/config):
  bindsym $mod+o exec handy --toggle-transcription
  Hyprland:
  Add to your config file (~/.config/hypr/hyprland.conf):
  bind = $mainMod, O, exec, handy --toggle-transcription
- You can also manage global shortcuts outside of Handy via Unix signals, which lets Wayland window managers or other hotkey daemons keep ownership of keybindings:
  | Signal  | Action                                    | Example              |
  | ------- | ----------------------------------------- | -------------------- |
  | SIGUSR2 | Toggle transcription                      | pkill -USR2 -n handy |
  | SIGUSR1 | Toggle transcription with post-processing | pkill -USR1 -n handy |
  Example Sway config:
  bindsym $mod+o exec pkill -USR2 -n handy
  bindsym $mod+p exec pkill -USR1 -n handy
  pkill here simply delivers the signal; it does not terminate the process.
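To illustrate why sending SIGUSR1 or SIGUSR2 does not end the receiving process, here is a small stand-in script. It is not Handy's code (Handy's real handling lives in its backend); the toggles counter is purely illustrative. It assumes it is run as a standalone script, so that $$ refers to the script's own process.

```shell
#!/bin/sh
# Stand-in for a process that handles SIGUSR2 (NOT Handy's actual code).
toggles=0

# With a handler installed, SIGUSR2 runs this code instead of the default
# action, which would otherwise terminate the process.
trap 'toggles=$((toggles + 1))' USR2

# Deliver the signal to this very script, much as `pkill -USR2 -n handy`
# delivers it to the newest matching Handy process.
kill -USR2 "$$"
kill -USR2 "$$"

# The script reaches this line because the signals were handled, not fatal.
echo "handled $toggles signals and still running"
```

The -n option in the pkill examples restricts the match to the newest matching process, so only one instance receives the signal.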
Platform Support
- **macOS (bo
