WhisperDesk
Free, open-source audio transcription app. 100% local & private. Word-level timestamps, interactive transcript, multi-format export. Built with Tauri v2, React, and Whisper.cpp.
<p align="center"> <img src="image.png" alt="WhisperDesk Screenshot" width="800" /> </p>
Features
- Local & Private — 100% offline transcription, no cloud, no API keys, no telemetry
- Word-Level Timestamps — Click any word to jump to that point in the audio
- Interactive Transcript — Real-time word highlighting during playback
- Multiple Export Formats — SRT, VTT, TXT, JSON, Markdown
- Model Manager — Download Whisper models (tiny → large-v3) with progress bars
- Transcription History — SQLite-backed history with search and stats
- Beautiful UI — Dark/light mode, glassmorphism, smooth animations
- 99 Languages — Full Whisper language support
- Built-in Audio Player — Play, pause, seek, speed control (0.5x–2x)
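The click-to-seek and live highlighting features both depend on mapping a playback time to a word. A minimal sketch of that lookup follows, assuming a hypothetical `Word` shape with start and end times in seconds and a list sorted by start time; the actual data model in WhisperDesk may differ.

```typescript
// Hypothetical shape of one transcribed word; times are in seconds.
interface Word {
  text: string;
  start: number;
  end: number;
}

// Return the index of the word being spoken at `time`, or -1 if the
// playhead sits in a gap between words or outside the transcript.
// Assumes `words` is sorted by start time and that words do not overlap.
function activeWordIndex(words: Word[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (time < words[mid].start) {
      hi = mid - 1; // playhead is before this word
    } else if (time >= words[mid].end) {
      lo = mid + 1; // playhead is past this word
    } else {
      return mid; // start <= time < end
    }
  }
  return -1;
}

// Illustrative data only, not real transcription output.
const sampleWords: Word[] = [
  { text: "Hello", start: 0.0, end: 0.4 },
  { text: "world", start: 0.5, end: 0.9 },
];
```

Binary search keeps the lookup cheap enough to run on every player time update, even for long transcripts. Clicking a word is the inverse operation: seek the player to that word's `start`.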
Supported Formats
Audio: MP3, WAV, M4A, OGG, FLAC, WMA, AAC
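As a rough illustration, a dropped file could be checked against the list above by extension before it is handed to the decoder. The function name and the extension-only check are assumptions made for this sketch, not WhisperDesk's actual validation logic.

```typescript
// Extensions taken from the supported formats listed above.
const SUPPORTED_EXTENSIONS = new Set([
  "mp3", "wav", "m4a", "ogg", "flac", "wma", "aac",
]);

// Hypothetical helper: true if the file name ends in a supported extension.
// The comparison is case-insensitive. Names without an extension, and
// dotfiles such as ".mp3", are rejected.
function isSupportedAudio(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return false;
  return SUPPORTED_EXTENSIONS.has(fileName.slice(dot + 1).toLowerCase());
}
```

An extension check only filters obvious mismatches; a file with the right extension can still fail to decode.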
Tech Stack
- Tauri v2 — Lightweight desktop framework (Rust + Web)
- React 19 + TypeScript — Frontend
- Tailwind CSS v4 — Styling
- whisper-rs — Rust bindings to whisper.cpp, a C/C++ port of OpenAI's Whisper speech recognition model
- Symphonia — Pure Rust audio decoding
- SQLite — Transcription history
- Framer Motion — Animations
- Lucide React — Icons
Getting Started
Prerequisites
- Node.js 18+
- Rust 1.70+
- Tauri v2 prerequisites
Development
    # Install dependencies
    npm install
    # Run in development mode
    npm run tauri dev
    # Build for production
    npm run tauri build
First Use
- Go to Settings → Models and download a Whisper model (start with "base" for good quality/speed balance)
- Go to Transcribe and drop an audio file
- Click Start Transcription
- View, edit, search, and export your transcript
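To give a sense of what the export step involves, here is a minimal sketch that turns timed segments into SRT text. The `Segment` shape and both function names are hypothetical; WhisperDesk's own exporter may group words into cues differently.

```typescript
// Hypothetical shape of one subtitle cue; times are in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma before milliseconds).
function srtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Build an SRT document: a 1-based cue number, a time range line, then the
// cue text, with a blank line separating consecutive cues.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${srtTimestamp(seg.start)} --> ${srtTimestamp(seg.end)}\n${seg.text}\n`,
    )
    .join("\n");
}
```

VTT output is close to this: it starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.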
Whisper Models
| Model | Size | Speed | Quality |
|-------|------|-------|---------|
| tiny | 75 MB | Fastest | Basic |
| base | 142 MB | Fast | Good |
| small | 466 MB | Medium | Better |
| medium | 1.5 GB | Slow | Great |
| large-v3 | 3.1 GB | Slowest | Best |
| large-v3-turbo | 1.6 GB | Fast | Great |
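One way to read the table is as a lookup from a download budget to a model. The sketch below picks the largest model that fits a given budget, using the sizes listed above with the GB values converted to approximate MB. The helper name is hypothetical, and file size is only a rough stand-in for quality: `large-v3-turbo`, for example, is rated "Great" and "Fast" despite being larger than `medium`.

```typescript
interface ModelInfo {
  name: string;
  sizeMB: number; // approximate download size
}

// Sizes copied from the model table; 1.5 GB, 1.6 GB and 3.1 GB are
// written as 1500, 1600 and 3100 MB for simplicity.
const MODELS: ModelInfo[] = [
  { name: "tiny", sizeMB: 75 },
  { name: "base", sizeMB: 142 },
  { name: "small", sizeMB: 466 },
  { name: "medium", sizeMB: 1500 },
  { name: "large-v3-turbo", sizeMB: 1600 },
  { name: "large-v3", sizeMB: 3100 },
];

// Hypothetical helper: the largest model whose download fits within
// `budgetMB`, or undefined if even the smallest model is too big.
function largestModelWithin(budgetMB: number): ModelInfo | undefined {
  return MODELS
    .filter((m) => m.sizeMB <= budgetMB)
    .sort((a, b) => b.sizeMB - a.sizeMB)[0];
}
```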
Part of the SpeakDock Ecosystem
WhisperDesk is the free, open-source companion to SpeakDock — our commercial real-time dictation and voice command tool.
- WhisperDesk = File transcription (open source, free)
- SpeakDock = Real-time dictation, AI polish, meeting recording, voice commands (commercial)
Author
Liaqat Eagle
License
MIT — free for personal and commercial use. See LICENSE.
Contributing
Contributions welcome! See CONTRIBUTING.md for details.
