OpenWhisper

A local open-source harness for Faster-Whisper/OpenAI Whisper. Free to use, clone, and modify. Simple local audio transcription built purely with Python.

Generate Convert Improve

Install / Use

/learn @Knuckles92/OpenWhisper

About this skill

Quality Score

0/100

README

OpenWhisper

A desktop app for recording audio and transcribing it to text using local Whisper models or OpenAI API. Features a modern PyQt6 GUI, system tray integration, global hotkeys, and auto-paste.

Cursor_wTPeidZjsL

Cursor_eyykcjebiU

Features

Local Whisper – Runs offline using OpenAI's Whisper model (~150MB download on first use)
API Options – Whisper API, GPT-4o Transcribe, GPT-4o Mini Transcribe
Global Hotkeys – Start/stop recording from any app (customizable)
Auto-paste – Transcription automatically pastes to your active window
System Tray – Minimize to tray, always accessible
Smart Splitting – Large audio files split automatically to avoid API limits
Audio Device Selection – Choose your preferred microphone input
Transcription History – Browse past transcriptions with search/filter, retranscribe recordings
Audio Upload – Import existing audio files for transcription
Real-time Visualization – Animated waveform overlay shows recording status
Live Streaming (experimental) – Real-time transcription preview while recording
Meeting Mode (experimental) – Long-form transcription with live streaming and auto-save
Meeting Insights (experimental) – AI-powered summaries, action items, and custom analysis (OpenAI/OpenRouter)
Caret Indicator (experimental) – Visual marker at cursor location when pasting
Window Memory – Remembers window position and size between sessions

GPU Acceleration

For significantly faster transcription speeds with an NVIDIA GPU, install CUDA support:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

With CUDA enabled, faster-whisper runs 2-4x faster than CPU-only. The app auto-detects GPU availability and selects optimal settings (turbo model on GPU, base on CPU). Streaming transcription uses ~15-20% GPU vs 40-60% CPU.

Installation

Note: It's recommended to set up a virtual environment (venv) to avoid package version conflicts. I have found Python 3.12 to be pretty stable with this codebase.

git clone https://github.com/Knuckles92/OpenWhisper
cd OpenWhisper
python -m venv venv
# Windows
venv\Scripts\activate
# Linux/Mac
source venv/bin/activate
pip install -r requirements.txt

For cloud transcription, set your API key:

# Windows
set OPENAI_API_KEY=your-key

# Or create a .env file
OPENAI_API_KEY=your-key

# Optional: For meeting insights via OpenRouter (access to Claude, Llama, Mistral, etc.)
OPENROUTER_API_KEY=your-openrouter-key

Note: API keys can also be configured in Settings > Insights for the meeting insights feature.

Usage

python app_qt.py

Hotkeys

| Key | Action | |-----|--------| | * | Start/stop recording | | - | Cancel | | Ctrl+Alt+* | Enable/disable program |

All hotkeys can be remapped in the settings.

Settings

Access settings via File > Settings or the system tray menu. Available options:

General: Default model, auto-paste, clipboard copy, minimize to tray, streaming transcription (experimental)

Audio: Sample rate, channels, silence threshold, input device selection

Hotkeys: Customize all keyboard shortcuts

Advanced: Whisper model selection (14+ options), compute device (auto/cuda/cpu), compute type (float16/float32/int8), max file size before splitting, streaming overlay positioning, logging

Offline Usage

Local Whisper transcription works fully offline after the initial model download. However, on startup, the faster-whisper library makes a brief metadata check to HuggingFace to see if a newer model version is available. This is not a model download—just a lightweight API call. If you're offline, the check will fail silently and the cached local model loads normally.

To force fully offline operation (skip the metadata check), set this environment variable before running:

export HF_HUB_OFFLINE=1  # Linux/Mac
set HF_HUB_OFFLINE=1     # Windows
python app_qt.py

Requirements

Python 3.8+(3.12 recommended)
Windows

Note: Caret paste indicator is Windows-only (uses Windows API).

License

MIT License. Free to use, clone, and modify.

Related Skills

node-connect

344.1k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

96.8k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

344.1k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

344.1k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。