<p align="center">
  <img src="docs/logo.png" alt="Yap" width="120" />
</p>

<h1 align="center">Yap</h1>

<p align="center">
  <strong>The voice input layer for agentic coding.</strong><br/>
  Speak in any language. It transcribes, corrects, translates, and types, right where your cursor is.
</p>

<p align="center">
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-CC%20BY--NC%204.0-blue.svg" alt="License" /></a>
  <img src="https://img.shields.io/badge/platform-macOS%20(Apple%20Silicon)-black?logo=apple" alt="Platform" />
  <img src="https://img.shields.io/badge/runtime-100%25%20local-brightgreen" alt="Local" />
</p>

## 🎬 Demo

🎥 Demo video coming soon, stay tuned!

💡 Inspired by the agentic coding movement, like OpenClaw's founder voice-chatting with 10+ agents to build software. Yap is the missing input layer that makes talking to your dev tools feel native.

<!-- Optional: embed a video showing Yap + Claude Code / Cursor workflow -->
<!-- https://github.com/user-attachments/assets/agentic-workflow.mp4 -->

## 🤔 Why Yap?

The agentic coding era is here. You're talking to Claude Code, Cursor, and Copilot, but you're still typing every prompt with your fingers.

Your voice is roughly 3x faster than your keyboard. Yap bridges the gap.

- 🗣️ **Voice-first workflow**: talk to your agents, your terminal, your browser; Yap types it out.
- 🔒 **100% local**: on-device VAD + ASR via MLX. No cloud; no data leaves your machine.
- 🌍 **Multilingual**: speak Chinese, English, Japanese, Korean, and more, with real-time translation built in.
- ✨ **Smart correction**: LLM-powered spoken-to-written style conversion. Your voice, but polished.

## ⚡ How It Works

Yap lives as a floating ball on your screen. Toggle input mode, and it listens:

```
🎙️ Voice ──→ 🔇 VAD ──→ 🧠 ASR ──→ 💬 LLM ──→ ⌨️ Input
             Silero      MLX          Correct      Types into
             detects     on-device    & translate  the active
             speech      transcribes  (optional)   app
```

Models auto-download from HuggingFace on first launch. Zero config to get started.
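The pipeline above can be sketched in Python. This is an illustrative outline only, not Yap's actual code: the real app uses Silero VAD and an MLX ASR model on-device, which are stubbed here with placeholder functions.

```python
# Illustrative sketch of the Voice -> VAD -> ASR -> LLM -> Input pipeline.
# The VAD, ASR, and LLM stages are placeholders; Yap itself uses Silero VAD,
# an MLX ASR model, and an OpenAI-compatible LLM endpoint.

from dataclasses import dataclass


@dataclass
class Segment:
    """A chunk of audio that VAD decided contains speech."""
    samples: list[float]


def detect_speech(audio: list[float], threshold: float = 0.1) -> list[Segment]:
    """Placeholder VAD: keep the whole buffer if its peak amplitude exceeds
    a threshold. Yap uses Silero VAD instead of this naive energy check."""
    if any(abs(s) > threshold for s in audio):
        return [Segment(audio)]
    return []


def transcribe(segment: Segment) -> str:
    """Placeholder ASR: Yap runs an MLX model on-device here."""
    return "<transcript of %d samples>" % len(segment.samples)


def correct(text: str) -> str:
    """Optional LLM step (spoken -> written style); identity in this sketch."""
    return text


def type_into_active_app(text: str) -> None:
    """Yap simulates keystrokes via its Rust core; we just print."""
    print(text)


def run_pipeline(audio: list[float]) -> list[str]:
    """Run each detected speech segment through ASR, correction, and typing."""
    outputs = []
    for seg in detect_speech(audio):
        text = correct(transcribe(seg))
        type_into_active_app(text)
        outputs.append(text)
    return outputs
```

Swapping the stubs for the real Silero VAD and MLX models gives the overall shape of the backend's processing loop.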


## ✨ Features

|  | Feature | Description |
|---|---------|-------------|
| 🎙️ | Multilingual Voice Input | Chinese, English, Japanese, and more; switch on the fly |
| 🌐 | Real-time Translation | Speak in one language, type in another |
| ✍️ | Formal Correction | Spoken-to-written style, powered by any LLM |
| 🖥️ | Universal Input | Works with any app: Claude Code, Cursor, VS Code, Terminal, browser, Slack... |
| 🫧 | Floating Ball UI | Always-on-top, draggable, with live waveform visualization |
| 🔒 | Fully Local | On-device ASR, no cloud dependency, your data stays yours |
| 🌐 | i18n Menu | Chinese / English interface |


## 🚀 Quick Start

### Prerequisites

- macOS with Apple Silicon (M1/M2/M3/M4)
- Node.js 18+
- Python 3.10–3.12

Rust and uv are installed automatically by the setup script if missing.

### Development

```bash
git clone https://github.com/TorchFun-AI/Yap.git && cd Yap

# One-click setup (installs all dependencies + the dev environment)
./setup.sh

# Terminal 1: Python AI backend
cd src-backend && uv run python main.py

# Terminal 2: Tauri + Vue dev server
make dev
```

### Production Build

```bash
# Build the .app bundle (compiles the backend + the Tauri app)
./build.sh
```

The bundle is written to `src-tauri/target/release/bundle/`.


๐Ÿ—๏ธ Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   Vue 3 UI      โ”‚โ—„โ”€โ”€โ”€โ–บโ”‚   Tauri Core    โ”‚โ—„โ”€โ”€โ”€โ–บโ”‚  Python AI      โ”‚
โ”‚   (Webview)     โ”‚ IPC โ”‚   (Rust)        โ”‚ WS  โ”‚  (FastAPI)      โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                              โ”‚                        โ”‚
                              โ–ผ                        โ–ผ
                        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”           โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                        โ”‚ Keyboard  โ”‚           โ”‚ VAD + ASR โ”‚
                        โ”‚ Simulationโ”‚           โ”‚   + LLM   โ”‚
                        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜           โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

| Layer | Stack | |-------|-------| | Frontend | Vue 3 + TypeScript + Ant Design Vue + Pinia | | Core | Tauri 2 (Rust) | | Backend | Python + FastAPI + Silero VAD + MLX Audio |
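The Tauri core and the Python backend talk over a local WebSocket (the "WS" link above). Yap's actual message schema isn't documented in this README, so the sketch below shows one plausible JSON envelope for streaming transcription results; all field names are assumptions, not Yap's protocol.

```python
# Hypothetical WebSocket message framing between the Rust core and the
# Python backend. Field names are illustrative assumptions.

import json
from dataclasses import dataclass, asdict


@dataclass
class TranscriptMessage:
    type: str      # message kind, e.g. "transcript"
    text: str      # recognized (and optionally corrected/translated) text
    language: str  # detected source language
    final: bool    # False for a partial result, True when the utterance ends


def encode(msg: TranscriptMessage) -> str:
    """Serialize a message for the WebSocket wire."""
    return json.dumps(asdict(msg))


def decode(raw: str) -> TranscriptMessage:
    """Parse a message received from the backend."""
    return TranscriptMessage(**json.loads(raw))
```

Partial results (`final=False`) would let the UI show live text in the floating ball while the user is still speaking.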


## 🔧 LLM Configuration

Yap works with any OpenAI-compatible API for text correction and translation. Configure it in Settings:

- API key
- Base URL (e.g. `https://api.openai.com/v1`, or a local Ollama endpoint)
- Model name

This step is optional: without it, Yap still does plain voice-to-text.
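As a sketch of what "any OpenAI-compatible API" means in practice, the following stdlib-only Python builds a `/chat/completions` request that asks a model to rewrite spoken text in written style. The function name and system prompt are illustrative assumptions, not Yap's actual implementation.

```python
# Build an OpenAI-compatible chat-completions request for text correction.
# Hypothetical helper; the prompt and naming are not Yap's real code.

import json
import urllib.request


def build_correction_request(base_url: str, api_key: str,
                             model: str, text: str) -> urllib.request.Request:
    """Return a ready-to-send POST request against {base_url}/chat/completions."""
    url = base_url.rstrip("/") + "/chat/completions"
    payload = {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Rewrite the user's spoken text in clean written style."},
            {"role": "user", "content": text},
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Sending it (requires a reachable endpoint and a valid key):
# req = build_correction_request("https://api.openai.com/v1", "YOUR_KEY",
#                                "gpt-4o-mini", "um so like run the tests")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the shape is the standard chat-completions payload, pointing `base_url` at a local Ollama endpoint works the same way.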


## 📄 License

CC BY-NC 4.0: free to use, modify, and share, but not for commercial use.
