VoiceTerm is a voice-first terminal overlay for Codex and Claude. Gemini preset support remains experimental and is currently nonfunctional. It runs Whisper on your machine and types what you say into your existing CLI. Your tools still run in a normal PTY; VoiceTerm just adds a HUD on top. Use push-to-talk or wake phrases (hey codex, hey claude), then say send / submit for hands-free delivery.

Whisper runs locally by default. No cloud API keys required. Release history: dev/CHANGELOG.md.

Install and Start

Install one supported AI CLI first:

Codex:

npm install -g @openai/codex

Claude Code:

curl -fsSL https://claude.ai/install.sh | bash
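To confirm a backend is actually installed before setting up VoiceTerm, a quick PATH check can help. This is a minimal sketch, not part of VoiceTerm: the helper name `vt_find_backends` is hypothetical, and it assumes Codex installs a `codex` binary and Claude Code installs a `claude` binary.

```shell
# Print the names from the argument list that resolve to a command on PATH.
# The function body runs in a subshell so its variables do not leak.
vt_find_backends() (
  found=""
  for cli in "$@"; do
    if command -v "$cli" >/dev/null 2>&1; then
      found="${found:+$found }$cli"
    fi
  done
  printf '%s\n' "$found"
)

# Assumption: the backend binaries are named `codex` and `claude`.
backends="$(vt_find_backends codex claude)"
if [ -n "$backends" ]; then
  echo "Found: $backends"
else
  echo "No supported AI CLI on PATH; install Codex or Claude Code first."
fi
```

If neither name is found right after installing, opening a new shell so PATH changes take effect is usually the first thing to try.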

Then choose one VoiceTerm setup path:

<details open> <summary>Homebrew (recommended)</summary>

brew tap jguida941/voiceterm
brew install voiceterm
cd ~/your-project
voiceterm

If needed, authenticate once:

voiceterm --login --codex
voiceterm --login --claude

</details> <details> <summary>PyPI (pipx / pip)</summary>

pipx install voiceterm
# or: python3 -m pip install --user voiceterm

cd ~/your-project
voiceterm

If needed, authenticate once:

voiceterm --login --codex
voiceterm --login --claude

</details> <details> <summary>From source</summary>

Requires a Rust toolchain. See Install Guide for details.

git clone https://github.com/jguida941/voiceterm.git
cd voiceterm
./scripts/install.sh

If you are running from source while developing, run:

python3 dev/scripts/devctl.py check --profile ci

</details> <details> <summary>macOS App</summary>

Double-click app/macos/VoiceTerm.app, pick a folder, and it opens Terminal with VoiceTerm running.

</details>

For model options and startup/IDE tuning:

How It Works

VoiceTerm listens to your mic, converts speech to text on your machine, and types the result into your AI CLI input.

Recording

Requirements

macOS or Linux (Windows needs WSL2)
Microphone access
~0.5 GB disk for the default small model (base is ~142 MB, medium is ~1.5 GB)
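The OS and disk requirements above can be checked with a short preflight script. This is a rough sketch under stated assumptions: the helper names are hypothetical, the 524288 KB threshold is just ~0.5 GB expressed in 1024-byte blocks, and it assumes the model is stored on the same filesystem as your home directory, which this README does not confirm. Inside WSL2, `uname -s` reports `Linux`, so the OS check passes there.

```shell
# Succeeds when the kernel name is one of the supported platforms.
vt_os_supported() {
  case "$1" in
    Darwin|Linux) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when free space (in KB) covers the ~0.5 GB default model.
vt_enough_disk() {
  [ "$1" -ge 524288 ]
}

os="$(uname -s)"
# POSIX df output: the fourth column of the second line is available KB.
free_kb="$(df -Pk "${HOME:-/}" | awk 'NR==2 {print $4}')"

vt_os_supported "$os" || echo "Unsupported OS: $os (on Windows, use WSL2)"
vt_enough_disk "${free_kb:-0}" || echo "Less than ~0.5 GB free under ${HOME:-/}"
```

Microphone access is not checked here because it depends on OS-level permissions that a portable script cannot query reliably.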

Features

Main features

| Feature | What it does |
|---------|---------------|
| Local speech-to-text | Whisper runs on your machine (no cloud calls) |
| Fast voice-to-text | Local Whisper turns speech into text quickly |
| Keep your CLI as-is | Your backend CLI layout and behavior stay the same |
| Auto voice mode | Keep listening on so you can talk instead of typing |
| Wake mode + voice send | Say hey codex/hey claude, then say send/submit in insert mode |
| Image prompts | Use Ctrl+X for one-shot screenshot prompts, or enable persistent image mode for HUD [rec] (IMG badge) |
| Transcript queue | If the CLI is busy, VoiceTerm waits and sends text when ready |
| Codex + Claude support | Primary support for Codex and Claude Code |

Everyday tools

Voice macros: expand phrases from .voiceterm/macros.yaml (toggle in Settings)
Voice navigation: spoken scroll, send, show last error, copy last error, explain last error
Dev mode tools: launch with --dev first (look for DEV badge), then use Ctrl+D for Dev panel tools; add --dev-log for JSONL diagnostics
Prompt-safe HUD: VoiceTerm suppresses HUD rows for high-confidence Codex/Claude approval prompts and fences PTY scrolling above the HUD so the active input row stays visible
Latency clarity: latency badges show completed-turn STT timing and hide while actively recording/processing
Transcript history: Ctrl+H to search and replay past text
Notification history: Ctrl+N to review recent status messages
Saved settings: stored in ~/.config/voiceterm/config.toml
Built-in themes: 11 themes including ChatGPT, Catppuccin, Dracula, Nord, Tokyo Night, and Gruvbox
Style-pack border settings: VOICETERM_STYLE_PACK_JSON supports components.overlay_border and components.hud_border (HUD applies when border mode is theme)

For full behavior details and controls, see guides/USAGE.md.

Important: if you did not launch with --dev, Ctrl+D is forwarded to the wrapped CLI as EOF (0x04) and can close/exit that CLI session.
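The 0x04 value follows from how terminals encode control keys: the byte sent for Ctrl plus a letter is the letter's ASCII code with only its low five bits kept. The sketch below works through that arithmetic; `ctrl_code` is a hypothetical helper for illustration, not a VoiceTerm command.

```shell
# Print the byte a terminal sends for Ctrl+<letter>.
# The function body runs in a subshell so `code` does not leak.
ctrl_code() (
  # A leading quote makes printf emit the character's numeric code.
  code=$(printf '%d' "'$1")
  # Keep the low five bits: 'D' is 0x44, and 0x44 & 0x1f is 0x04.
  printf '0x%02x\n' $(( code & 31 ))
)

ctrl_code D   # prints 0x04
```

A terminal in its default line mode treats that byte as end-of-file, which is why a wrapped CLI reading from the PTY may exit when it receives it.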

Dev panel usage guide: guides/DEV_MODE.md

Supported AI CLIs

VoiceTerm is optimized for Codex and Claude Code. For full backend status and setup details, see Usage Guide -> Backend Support.

Codex

Use the same workflow and controls documented for backend support in guides/USAGE.md.

Claude Code

Claude Backend

IDE Support

Active verified hosts are Cursor terminal and JetBrains terminals. AntiGravity is deferred and not supported in current releases.

| IDE host | Codex | Claude Code | Status |
|---|---|---|---|
| Cursor terminal | Fully supported | Fully supported | Recommended primary host |
| JetBrains terminals (IntelliJ, PyCharm, WebStorm, CLion) | Fully supported | Fully supported | Supported on current release; see troubleshooting for rare host-specific edge cases |
| AntiGravity | Not supported | Not supported | Deferred until runtime fingerprint evidence exists (not supported in current releases) |
| Other IDE terminals | Unverified | Unverified | Treat as experimental until listed here |

JetBrains + Claude rare edge case (long parallel turns): after very long parallel tool calls or parallel web-search turns, HUD/transcript overlap can appear at turn completion. Quick workaround: resize the terminal once (even by 1 row/column) to force layout recalculation. During these high-churn turns, VoiceTerm already applies a single-line full-HUD fallback for JetBrains+Claude to keep controls reachable while redraw settles. Details: Troubleshooting -> JetBrains + Claude overlay overlap after long parallel output.

Canonical matrix: Usage Guide -> IDE Compatibility.
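If you are unsure which host your terminal is running in, environment variables can give a rough hint. This sketch is not VoiceTerm's own detection logic. It assumes JetBrains terminals export `TERMINAL_EMULATOR=JetBrains-JediTerm` and that VS Code-derived editors such as Cursor export `TERM_PROGRAM=vscode`; the helper name `vt_guess_host` and its output labels are hypothetical.

```shell
# Guess the IDE host from two environment values.
# $1 = TERMINAL_EMULATOR, $2 = TERM_PROGRAM (either may be empty).
vt_guess_host() {
  case "$1" in
    JetBrains*) echo "jetbrains"; return 0 ;;
  esac
  case "$2" in
    vscode) echo "vscode-family" ;;  # cannot distinguish Cursor from VS Code
    *) echo "unknown" ;;
  esac
}

vt_guess_host "${TERMINAL_EMULATOR:-}" "${TERM_PROGRAM:-}"
```

An `unknown` result does not mean VoiceTerm will fail; it only means the host falls outside the verified rows in the table.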

Hands-Free Quick Start

voiceterm --auto-voice --wake-word --voice-send-mode insert

Think of this like Alexa for your terminal:

Say the wake phrase (hey codex or hey claude)
Speak your prompt
Say send (or submit)
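To avoid retyping the hands-free flags, you could wrap the command in a shell function in your shell startup file. This is a convenience sketch: the function name `vt_handsfree` is hypothetical, while the flags are exactly the ones from the quick-start command above.

```shell
# Start VoiceTerm hands-free in a project directory (default: current one).
# The function body runs in a subshell so the `cd` does not affect your shell.
vt_handsfree() (
  cd "${1:-.}" && voiceterm --auto-voice --wake-word --voice-send-mode insert
)
```

For example, `vt_handsfree ~/your-project` starts a hands-free session in that folder and returns you to your original directory when VoiceTerm exits.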

UI Tour

Theme Picker

Press Ctrl+Y to open Theme Studio and choose Theme picker. Use Ctrl+G to cycle themes quickly. Use Tab / Shift+Tab to move between Theme Studio pages (Home, Colors, Borders, Components, Preview, Export). For editor details, see Themes. For theme-file flags/env vars, see CLI Flags.

Settings Menu

![Set
