Vocalinux

Free, open-source, 100% offline voice dictation for Linux. Speak and type anywhere via whisper.cpp, Whisper & VOSK engines, GPU-accelerated, works on X11 + Wayland!


<img src="https://github.com/user-attachments/assets/56dabe5c-5c65-44d5-a36a-429c9fea0719" width="30" height="30"> Vocalinux

Voice-to-text for Linux, finally done right!

Status: Beta | License: GPL v3 | Platform: Linux | Python 3.9+ | Made with GTK


Linux has always punched above its weight, except when it comes to voice typing. Vocalinux fixes that.

It's a free, GPLv3-licensed desktop app that lets you dictate text into any application, on X11 or Wayland, using fully offline speech recognition. Pick from three engines (whisper.cpp, OpenAI Whisper, or VOSK), get automatic GPU acceleration via Vulkan, and control it all with customizable keyboard shortcuts: toggle or push-to-talk.

No internet required. No data leaves your machine. Just speak and type.

📚 What's New in v0.9.0-beta

🎉 Release: Left/Right modifier keys, sound effects toggle, Wayland clipboard fallback, and more.

🚀 Highlights (v0.8.0 → v0.9.0)

| Feature | Description |
|---------|-------------|
| ⌨️ Left/Right Modifier Keys | Choose Left Ctrl vs Right Ctrl (etc.) as your shortcut trigger |
| 🔔 Sound Effects Toggle | Enable or disable audio feedback from the Settings dialog |
| 📋 Wayland Clipboard Fallback | Automatic clipboard copy when virtual keyboard injection isn't available |
| 🛠️ Installation Polish | Better pipx/Debian guidance and headless display detection |

✨ New Features (v0.9.0)

  • Left/Right Modifier Key Distinction - Shortcuts now support Left Ctrl, Right Alt, etc., with grouped UI in Settings
  • Sound Effects Toggle - New Audio Settings toggle to silence start/stop/error sounds
  • Clipboard Fallback for Wayland - Auto-copies text via wl-copy/xclip when injection unavailable (KDE Plasma etc.)
  • Display Availability Check - Graceful error message when running in headless environments
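
To illustrate the clipboard fallback idea, here is a minimal sketch of choosing between `wl-copy` and `xclip`. The function names and the session detection via `WAYLAND_DISPLAY` / `XDG_SESSION_TYPE` are assumptions made for this example, not Vocalinux's actual implementation.

```python
import os
import shutil
import subprocess
from typing import Callable, List, Mapping, Optional


def pick_clipboard_command(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[List[str]]:
    """Return an argv for a clipboard tool that reads text from stdin, or None."""
    env = os.environ if env is None else env
    on_wayland = bool(env.get("WAYLAND_DISPLAY")) or env.get("XDG_SESSION_TYPE") == "wayland"
    # Prefer the tool native to the session, then fall back to the other one.
    candidates = [["wl-copy"], ["xclip", "-selection", "clipboard"]]
    if not on_wayland:
        candidates.reverse()
    for argv in candidates:
        if which(argv[0]):
            return argv
    return None


def copy_to_clipboard(text: str) -> bool:
    """Copy text with whichever tool is installed; False if none is found."""
    argv = pick_clipboard_command()
    if argv is None:
        return False
    subprocess.run(argv, input=text.encode("utf-8"), check=True)
    return True
```

Passing `env` and `which` as parameters keeps the selection logic testable without a running display server.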

🐛 Bug Fixes (v0.9.0)

  • #308: Distinguish left vs right modifier keys (evdev + pynput backends)
  • #307: Remove unwanted leading space when starting a new transcription session
  • #305: Pass configured shortcut mode to KeyboardShortcutManager on startup
  • #299: Add clipboard fallback for Wayland compositors without virtual keyboard support
  • #289: Improve Debian/pipx installation error messages and cross-distro dependency guidance

🔧 Improvements

  • Grouped shortcut selector - Settings dropdown now organises shortcuts by Either/Left/Right side
  • pipx documentation - New DISTRO_COMPATIBILITY.md section for pipx users
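
The Either/Left/Right grouping can be pictured as a lookup from a modifier and a side to raw key codes. The sketch below uses the key codes from the Linux kernel's `input-event-codes.h`; the table layout and the helper names are hypothetical and only show the idea.

```python
# Key codes as defined in the Linux kernel header input-event-codes.h.
MODIFIER_KEYCODES = {
    "ctrl": {"left": 29, "right": 97},    # KEY_LEFTCTRL, KEY_RIGHTCTRL
    "alt": {"left": 56, "right": 100},    # KEY_LEFTALT, KEY_RIGHTALT
    "shift": {"left": 42, "right": 54},   # KEY_LEFTSHIFT, KEY_RIGHTSHIFT
}


def keycodes_for(modifier: str, side: str = "either") -> frozenset:
    """Return the key codes that should trigger for a modifier and side."""
    sides = MODIFIER_KEYCODES[modifier]
    if side == "either":
        return frozenset(sides.values())
    return frozenset({sides[side]})


def is_trigger(keycode: int, modifier: str, side: str = "either") -> bool:
    """True if a raw key code matches the configured shortcut."""
    return keycode in keycodes_for(modifier, side)
```

With this shape, `is_trigger(97, "ctrl", "right")` matches only the right Ctrl key, while the default `"either"` accepts both.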

✨ Features

  • 🎤 Toggle or Push-to-Talk activation modes
  • Real-time transcription with minimal latency
  • 🌎 Universal compatibility across all Linux applications
  • 🔒 100% Offline operation for privacy and reliability
  • 🤖 whisper.cpp by default - High-performance C++ speech recognition
  • 🎮 Universal GPU support - Vulkan acceleration for AMD, Intel, and NVIDIA
  • 🎨 System tray integration with visual status indicators
  • 🚀 Start on login support via XDG autostart (desktop-session startup)
  • 🔊 Pleasant audio feedback - smooth gliding tones, headphone-friendly
  • ⚙️ Graphical settings dialog for easy configuration
  • 📦 3 engine choices - whisper.cpp (default), OpenAI Whisper, or VOSK
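
For readers curious how XDG autostart works in general, the following sketch writes a desktop entry into the user's autostart directory. The file name `vocalinux.desktop`, the entry contents, and the helper names are assumptions for illustration; the app's Settings dialog may do this differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def autostart_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve $XDG_CONFIG_HOME/autostart, defaulting to ~/.config/autostart."""
    env = os.environ if env is None else env
    config_home = env.get("XDG_CONFIG_HOME") or os.path.join(
        env.get("HOME", os.path.expanduser("~")), ".config"
    )
    return Path(config_home) / "autostart"


def desktop_entry(exec_cmd: str = "vocalinux") -> str:
    """Build a minimal desktop entry (Type, Name, Exec)."""
    return "\n".join(
        ["[Desktop Entry]", "Type=Application", "Name=Vocalinux", f"Exec={exec_cmd}", ""]
    )


def enable_autostart(env: Optional[Mapping[str, str]] = None) -> Path:
    """Write the entry (hypothetical file name) and return its path."""
    target = autostart_dir(env) / "vocalinux.desktop"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(desktop_entry(), encoding="utf-8")
    return target
```

Deleting the written file is enough to turn autostart off again.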

📸 Screenshots

Here are some screenshots showcasing Vocalinux in action:

<table> <tr> <td align="center"> <img src="resources/screenshots/00-transcription.png" alt="Transcription in Action" width="350"><br> <em>Real-time voice-to-text transcription</em> </td> <td align="center"> <img src="resources/screenshots/02-system-tray.png" alt="System Tray" width="350"><br> <em>System tray with listening indicator</em> </td> </tr> <tr> <td align="center"> <img src="resources/screenshots/05-about-view.png" alt="About View" width="350"><br> <em>About view with version info</em> </td> <td align="center"> <img src="resources/screenshots/03-log-viewer.png" alt="Log Viewer" width="350"><br> <em>Log viewer for debugging</em> </td> </tr> <tr> <td colspan="2" align="center"> <img src="resources/screenshots/04-features-overview.png" alt="Features Overview" width="500"><br> <em>Overview of key features and configuration options with annotations</em> </td> </tr> </table>

🚀 Quick Install

Interactive Install (Recommended)

Our new interactive installer guides you through setup with intelligent hardware detection:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh

Choose your engine:

  1. whisper.cpp ⭐ (Recommended) - Fast, works with any GPU via Vulkan
  2. Whisper (OpenAI) - PyTorch-based, NVIDIA GPU only
  3. VOSK - Lightweight, works on older systems

The installer will:

  • Auto-detect your hardware (GPU, RAM, Vulkan support)
  • Recommend the best engine for your system
  • Download the appropriate model (~39MB for whisper.cpp tiny)
  • Install in ~1-2 minutes (vs 5-10 min with old Whisper)
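
A rough sketch of what this kind of hardware detection could look like is shown below. It is not the installer's real logic: the 4 GiB cutoff is borrowed from the VOSK note in this README, the `vulkaninfo` check is only a heuristic, and all function names are made up for the example.

```python
import shutil
from typing import Optional

LOW_RAM_GIB = 4.0  # assumed cutoff, taken from the "4GB RAM" note for VOSK


def parse_mem_total_gib(meminfo_text: str) -> Optional[float]:
    """Extract MemTotal from /proc/meminfo-style text and convert kB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 * 1024)
    return None


def vulkan_tool_present() -> bool:
    """Heuristic only: True if the `vulkaninfo` utility is on PATH."""
    return shutil.which("vulkaninfo") is not None


def recommend_engine(mem_gib: Optional[float]) -> str:
    """Suggest VOSK on low-RAM machines, otherwise the default whisper.cpp."""
    if mem_gib is not None and mem_gib <= LOW_RAM_GIB:
        return "vosk"
    return "whisper.cpp"
```

On a Linux machine, feeding the contents of `/proc/meminfo` to `parse_mem_total_gib` and passing the result to `recommend_engine` gives a suggestion you can compare with what the installer picks.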

Note: The installer always installs the latest release. For a specific version, check GitHub Releases.

Installation Options

Default (whisper.cpp - recommended):

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh

Fastest installation (~1-2 min), universal GPU support via Vulkan.

Whisper (OpenAI) - if you prefer PyTorch:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh --engine=whisper

NVIDIA GPU only (~5-10 min, downloads PyTorch + CUDA).

VOSK only - for low-RAM systems:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh --engine=vosk

Lightweight option (~40MB), works on systems with 4GB RAM.

Alternative: Install from Source

# Clone the repository
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux

# Run the installer (will prompt for Whisper)
./install.sh

# Or with Whisper support
./install.sh --with-whisper

The installer handles everything: system dependencies, Python environment, speech models, and desktop integration.

🌙 Nightly Releases (Bleeding Edge)

For developers and early adopters who want to test the latest features, check out our GitHub Releases page, which includes both beta and nightly builds.

⚠️ Warning: Nightly releases contain the absolute latest code and may be unstable. For production use, we recommend using the latest beta release.

Nightly builds are automatically generated from the main branch every day. They include all merged changes but haven't undergone the same testing as beta releases.

Release Channels:

  • Beta (Recommended) - Tested pre-releases with known features
  • Nightly - Untested bleeding edge with latest commits

After Installation

# If ~/.local/bin is in your PATH (recommended):
vocalinux

# Or activate the virtual environment first:
source ~/.local/bin/activate-vocalinux.sh
vocalinux

# Or run directly:
~/.local/share/vocalinux/venv/bin/vocalinux

Or launch it from your application menu!
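
If you are unsure which of the launch options above applies to you, a small check like the following can help. It only inspects `PATH` for `~/.local/bin` and falls back to the virtual environment path listed above; the helper names are hypothetical.

```python
import os


def local_bin_on_path(path_var: str, home: str) -> bool:
    """True if <home>/.local/bin appears as an entry in a PATH-style string."""
    target = os.path.normpath(os.path.join(home, ".local", "bin"))
    return any(os.path.normpath(p) == target for p in path_var.split(os.pathsep) if p)


def launch_command(path_var: str, home: str) -> str:
    """Pick the short command when possible, else the venv binary from this README."""
    if local_bin_on_path(path_var, home):
        return "vocalinux"
    return os.path.join(home, ".local", "share", "vocalinux", "venv", "bin", "vocalinux")
```

Calling `launch_command(os.environ.get("PATH", ""), os.path.expanduser("~"))` returns the command to run for the current user.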

📋 Requirements

  • OS: Linux (tested on Ubuntu 22.04+, Debian 11+, Fedora 39+, Arch Linux, openSUSE Tumbleweed)
  • Python: 3.9 or newer
  • Display: X11 or Wayland
  • Hardware: Microphone for voice input

Note: See Distribution Compatibility for distribution-specific information and experimental support for Gentoo, Alpine, Void, Solus, and more.

🎙️ Usage

Voice Dictation

  1. Toggle mode: Double-tap the shortcut key (default Ctrl) to start recording
  2. Speak clearly into your microphone
  3. Toggle mode: Double-tap again (or pause speaking) to stop, or Push-to-Talk mode
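
The double-tap toggle in the steps above can be sketched as a small state machine. The 0.3 second window and the class names are assumptions chosen for the example; Vocalinux's real timing and shortcut handling may differ.

```python
from typing import Optional


class DoubleTapDetector:
    """Report a double tap when two taps arrive within max_gap seconds."""

    def __init__(self, max_gap: float = 0.3) -> None:
        self.max_gap = max_gap
        self._last_tap: Optional[float] = None

    def tap(self, now: float) -> bool:
        if self._last_tap is not None and now - self._last_tap <= self.max_gap:
            # Consume the pair so a triple tap does not count as two double taps.
            self._last_tap = None
            return True
        self._last_tap = now
        return False


class ToggleRecorder:
    """Flip a recording flag on every detected double tap."""

    def __init__(self, detector: DoubleTapDetector) -> None:
        self.detector = detector
        self.recording = False

    def on_key_tap(self, now: float) -> bool:
        if self.detector.tap(now):
            self.recording = not self.recording
        return self.recording
```

Timestamps are passed in explicitly (for example from `time.monotonic()`), which keeps the logic independent of any particular keyboard backend.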