OpenSuperMLX
macOS app for real-time audio transcription powered by MLX on Apple Silicon
OpenSuperMLX is a macOS application that provides real-time audio transcription powered by MLX on Apple Silicon. It offers a seamless way to record and transcribe audio with customizable settings and keyboard shortcuts.
<p align="center"> <img src="docs/image.png" width="400" /> </p>
Features
- 🎙️ Real-time audio recording and transcription
- 🔴 Streaming transcription — see results as you speak
- 🧠 MLX-based transcription engine — download models directly from the app
- ⌨️ Global keyboard shortcuts — tap to toggle or hold to record, fully customizable
- 📁 Drag & drop audio files for transcription with queue processing
- 🎤 Microphone selection — switch between built-in, external, Bluetooth and iPhone (Apple Continuity) mics from the menu bar
- 🌍 Support for multiple languages with auto-detection
- 🇯🇵🇨🇳🇰🇷 Asian language autocorrect (via the autocorrect library)
- 🤖 AWS Bedrock LLM post-transcription correction (optional)
- 👋 First-launch onboarding flow
Installation
Homebrew (Recommended)
brew tap axot/tap
brew install --cask opensupermlx
Manual
Download the app from the GitHub releases page.
macOS Security Approval
Since OpenSuperMLX is not signed with an Apple Developer ID, macOS will block the app on first launch. You need to manually approve it:
- Open the app — macOS will show a warning that it cannot be opened
- Go to System Settings → Privacy & Security
- Scroll down to the Security section — you'll see a message about OpenSuperMLX being blocked
- Click Open Anyway
- Confirm in the dialog that appears
You only need to do this once. After approval, the app will launch normally.
Usage
Keyboard Shortcuts
OpenSuperMLX supports two recording modes via a global keyboard shortcut — it works from any app:
| Shortcut | Action |
|---|---|
| `` ⌥` `` (Option + Backtick) | Start/stop recording |
| `` ⌥⇧` `` (Option + Shift + Backtick) | Start/stop recording with LLM correction |
| `Escape` | Cancel active recording |
Recording Modes
The shortcut automatically switches between two modes based on how you press it:
- Tap (quick press & release) — Toggles recording on and off. Press once to start recording, press again to stop. The transcribed text is automatically pasted into the frontmost app.
- Hold (press and hold) — Records while the key is held down. Release to stop and the transcribed text is automatically pasted into the frontmost app.
Tip: Shortcuts are fully customizable in Settings → Shortcuts.
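The tap and hold behavior described above can be modeled as a small state machine. The sketch below is illustrative only and is not the app's implementation: the function name `simulate_shortcut`, the event format, and the 300 ms threshold are all assumptions, since the README does not document how long a press must last to count as a hold.

```shell
#!/bin/sh
# Hypothetical threshold separating a tap from a hold (assumption, not the app's value).
HOLD_THRESHOLD_MS=300

# Reads "down <ms>" / "up <ms>" key events on stdin and prints the resulting actions.
simulate_shortcut() {
  recording=0     # 1 while a recording is active
  pressed_at=0    # timestamp of the most recent key-down
  started_now=0   # 1 if the current press is the one that started the recording
  while read -r kind ts; do
    case "$kind" in
      down)
        pressed_at=$ts
        if [ "$recording" -eq 1 ]; then
          # A second press while recording stops it (tap mode toggle).
          recording=0
          started_now=0
          echo "stop (tap toggle)"
        else
          recording=1
          started_now=1
          echo "start"
        fi
        ;;
      up)
        held=$((ts - pressed_at))
        # Releasing after a long press ends the recording (hold mode).
        if [ "$recording" -eq 1 ] && [ "$started_now" -eq 1 ] && [ "$held" -ge "$HOLD_THRESHOLD_MS" ]; then
          recording=0
          echo "stop (hold release)"
        fi
        started_now=0
        ;;
    esac
  done
}
```

Feeding it a long press, for example `printf 'down 0\nup 2000\n' | simulate_shortcut`, ends the recording on release, while a short press leaves the recording running until the next key-down.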
Requirements
- macOS 15.1+ (Apple Silicon/ARM64)
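One way to check a machine against this requirement before installing is sketched below. It relies on the standard `sw_vers` and `uname` tools; the helper names `version_ge` and `meets_requirements` are made up for this example and are not part of OpenSuperMLX.

```shell
#!/bin/sh
# Succeeds when dotted version $1 is greater than or equal to $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    # Compare up to three numeric components; missing ones count as 0.
    for (i = 1; i <= 3; i++) {
      if ((x[i] + 0) > (y[i] + 0)) exit 0
      if ((x[i] + 0) < (y[i] + 0)) exit 1
    }
    exit 0
  }'
}

# Succeeds when the given macOS version ($1) and CPU architecture ($2)
# satisfy the requirement stated above (macOS 15.1+, ARM64).
meets_requirements() {
  version_ge "$1" "15.1" && [ "$2" = "arm64" ]
}

# Only query the system when the macOS tools are present.
# Note: a shell running under Rosetta reports x86_64 from uname -m.
if command -v sw_vers >/dev/null 2>&1; then
  if meets_requirements "$(sw_vers -productVersion)" "$(uname -m)"; then
    echo "This Mac meets the requirements"
  else
    echo "Requires macOS 15.1+ on Apple Silicon"
  fi
fi
```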
Support
If you encounter any issues or have questions, please:
- Check the existing issues in the repository
- Create a new issue with detailed information about your problem
- Include system information and logs when reporting bugs
Building locally
To build locally, clone the repository with its submodules, install the build dependencies, and run the build script:
git clone git@github.com:axot/OpenSuperMLX.git
cd OpenSuperMLX
git submodule update --init --recursive
brew install cmake libomp rust ruby
gem install xcpretty
./run.sh build
If the build fails, consult .github/workflows/build.yml, the CI workflow that builds the app automatically on GitHub.
License
OpenSuperMLX is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
OpenSuperMLX is forked from OpenSuperWhisper by @Starmel. Thanks to the original project for providing the foundation for this work.
Models
MLX models are downloaded automatically from Hugging Face when selected in the app. Built-in models:
- Qwen3-ASR-0.6B-4bit — Smallest model, fastest inference
- Qwen3-ASR-1.7B-8bit — Recommended balance of accuracy and speed
- Qwen3-ASR-1.7B-bf16 — Highest quality, best accuracy
Custom models can be added by entering their Hugging Face repository ID.
