Ito
[DEPRECATED] Ito, smart dictation in every application
[DEPRECATED] - This project is no longer maintained
<div align="center">
<img src="resources/build/icon.png" width="128" height="128" alt="Ito Logo" />
<h3>Smart dictation. Everywhere you want.</h3>
<p> <strong>Ito</strong> is an intelligent voice assistant that brings seamless voice dictation to any application on your computer. Simply hold down your trigger key, speak naturally, and watch your words appear instantly in any text field. </p>
<p> <img alt="macOS" src="https://img.shields.io/badge/macOS-supported-blue?logo=apple&logoColor=white"> <img alt="Windows" src="https://img.shields.io/badge/Windows-supported-blue?logo=windows&logoColor=white"> <img alt="Version" src="https://img.shields.io/badge/version-0.2.0-green"> <img alt="License" src="https://img.shields.io/badge/license-GPL-blue"> </p>
</div>

✨ Features
🎙️ Universal Voice Dictation
- Works in any app: Emails, documents, chat applications, web browsers, code editors
- Global keyboard shortcuts: Customizable trigger keys that work system-wide
- Real-time transcription: High-accuracy speech-to-text powered by advanced AI models
- Instant text insertion: Automatically types transcribed text into the focused text field
🧠 Smart & Adaptive
- Custom dictionary: Add technical terms, names, and specialized vocabulary
- Context awareness: Learns from your usage patterns to improve accuracy
- Multi-language support: Transcribe in multiple languages
- Intelligent punctuation: Automatically adds appropriate punctuation
⚙️ Powerful Customization
- Flexible shortcuts: Configure any key combination as your trigger
- Audio preferences: Choose your preferred microphone
- Privacy controls: Local processing options and data control settings
- Seamless integration: Works with any application
💾 Data Management
- Notes system: Automatically save transcriptions for later reference
- Interaction history: Track your dictation sessions and improve over time
- Cloud sync: Keep your settings and data synchronized across devices
- Export capabilities: Export your notes and interaction data
🚀 Quick Start
Prerequisites
- macOS 10.15+ or Windows 10+
- Node.js 20+ and Bun (for development)
- Rust toolchain (for building native components)
- Microphone access and Accessibility permissions
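Before building from source, it can help to confirm that the development prerequisites above are on PATH. The following is a minimal sketch and not part of the Ito repository: the function names version_major and check_prereqs are hypothetical, and it assumes node --version prints a string such as v20.11.1.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check for the development prerequisites listed above.
# Not part of the Ito repository.

# Print the major version from a string such as "v20.11.1".
version_major() {
  local v="${1#v}"   # drop a leading "v" if present
  echo "${v%%.*}"    # keep everything before the first dot
}

# Report each missing tool; return 1 if any is missing or Node.js is older than 20.
check_prereqs() {
  local status=0 tool major
  for tool in node bun cargo; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  if command -v node >/dev/null 2>&1; then
    major="$(version_major "$(node --version)")"
    if [ "$major" -lt 20 ]; then
      echo "node $major found, but 20+ is required"
      status=1
    fi
  fi
  return "$status"
}
```

After sourcing the file, check_prereqs && echo "prerequisites look OK" gives a quick yes/no answer. Microphone and Accessibility permissions are granted through the operating system and are not covered by this check.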
Installation
- Download the latest release from heyito.ai or the GitHub releases page
- Install the application:
  - macOS: Open the .dmg file and drag Ito to Applications
  - Windows: Run the .exe installer and follow the setup wizard
- Grant permissions when prompted:
  - Microphone access: Required for voice input
  - Accessibility access: Required for global keyboard shortcuts and text insertion
- Set up authentication:
  - Sign in with Google, Apple, or GitHub through Auth0, or create a local account
  - Complete the guided onboarding process
First Use
- Configure your trigger key: Choose a comfortable keyboard shortcut (default: Fn + Space)
- Test your microphone: Ensure clear audio input during the setup process
- Try it out: Hold your trigger key and speak into any text field
- Customize settings: Adjust voice sensitivity, shortcuts, and preferences
🛠️ Development
Building from Source
Important: Ito requires a local transcription server for voice processing. See server/README.md for detailed server setup instructions.
# Clone the repository
git clone https://github.com/heyito/ito.git
cd ito
# Install dependencies
bun install
# Set up environment variables
cp .env.example .env
# Build native components (Rust binaries)
./build-binaries.sh
# Set up and start the server (required for transcription)
cd server
cp .env.example .env # Edit with your API keys
bun install
bun run local-db-up # Start PostgreSQL database
bun run db:migrate # Run database migrations
bun run dev # Start development server
cd ..
# Start the Electron app (in a new terminal, from the repository root, while the server keeps running)
bun run dev
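Both the app and the server read their configuration from a .env file copied from .env.example, as in the steps above. A minimal sketch for catching a missed copy before starting anything is shown here; missing_env_files is a hypothetical helper, not a script shipped with Ito.

```shell
# Hypothetical helper: list the .env files that the steps above create
# but that are not yet present under the given repository root.
missing_env_files() {
  local root="${1:-.}" f
  for f in ".env" "server/.env"; do
    [ -f "$root/$f" ] || echo "$f"
  done
}
```

Run from the repository root, missing_env_files . prints nothing when both files exist. It only checks that the files are present; the API keys inside server/.env still need to be filled in by hand.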
Build Requirements
All Platforms
- Rust: Install via rustup.rs
  - Windows users: See Windows-specific instructions below for GNU toolchain setup
  - macOS/Linux users: Default installation is sufficient
macOS
- Xcode Command Line Tools: xcode-select --install
Windows
Required Setup:
This setup uses Git Bash for shell operations. Git Bash is installed with Git for Windows.
- Install Docker Desktop: Download from docker.com and ensure it's running
- Install Rust (with GNU target)
Download and run the official Rust installer for Windows.
This installs rustup and the MSVC toolchain by default.
Add the GNU target (needed for our native components):
rustup toolchain install stable-x86_64-pc-windows-gnu
rustup target add x86_64-pc-windows-gnu
- Install 7-Zip
winget install 7zip.7zip
- Install GCC & MinGW-w64 via MSYS2
Install MSYS2.
Open the MSYS2 MinGW x64 shell (from the Start Menu).
Update and install the toolchain:
pacman -Syu # run twice if asked to restart
pacman -S --needed mingw-w64-x86_64-toolchain
Verify the tools exist:
ls /mingw64/bin/gcc.exe /mingw64/bin/dlltool.exe
- Use the MinGW tools when building (Git Bash)
You normally develop and build in Git Bash. Before building, prepend the MinGW path:
export PATH="/c/msys64/mingw64/bin:$PATH"
export DLLTOOL="/c/msys64/mingw64/bin/dlltool.exe"
export CC_x86_64_pc_windows_gnu="/c/msys64/mingw64/bin/x86_64-w64-mingw32-gcc.exe"
export AR_x86_64_pc_windows_gnu="/c/msys64/mingw64/bin/ar.exe"
export CARGO_TARGET_X86_64_PC_WINDOWS_GNU_LINKER="/c/msys64/mingw64/bin/x86_64-w64-mingw32-gcc.exe"
Check you’re picking up the right ones:
which gcc # -> /c/msys64/mingw64/bin/gcc.exe
which dlltool # -> /c/msys64/mingw64/bin/dlltool.exe
⚠️ Do not add C:\msys64\ucrt64\bin to PATH. That’s the wrong runtime and will break linking.
💡 To avoid running these exports every session, add the lines above to your Git Bash ~/.bashrc file. They will be applied automatically whenever you open a new Git Bash window.
- Restart Git Bash if you update MSYS2
Whenever you update MSYS2 packages with pacman -Syu, restart Git Bash so the changes take effect.
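If the exports from the MinGW step go into ~/.bashrc, re-sourcing the file prepends the MinGW directory again each time. One way to keep PATH free of duplicates is sketched below. prepend_path_once is a hypothetical helper name, and /c/msys64/mingw64/bin assumes the default MSYS2 install location used in the exports above.

```shell
# Hypothetical ~/.bashrc helper: print PATH-style list $2 with directory $1
# in front, unless $1 is already one of its entries.
prepend_path_once() {
  local dir="$1" list="$2"
  case ":$list:" in
    *":$dir:"*) echo "$list" ;;
    *) echo "$dir${list:+:$list}" ;;
  esac
}

# Only touch PATH when the MinGW directory exists (assumed default location).
MINGW_BIN="/c/msys64/mingw64/bin"
if [ -d "$MINGW_BIN" ]; then
  PATH="$(prepend_path_once "$MINGW_BIN" "$PATH")"
  export PATH
fi
```

If the directory is already present further down PATH, this sketch leaves the order unchanged, so the which gcc and which dlltool checks above are still worth running.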
Note: Windows builds use Docker for cross-compilation to ensure consistent builds. The Docker container handles the Windows build environment automatically.
Project Structure
ito/
├── app/ # Electron renderer (React frontend)
│ ├── components/ # React components
│ ├── store/ # Zustand state management
│ └── styles/ # TailwindCSS styles
├── lib/ # Shared library code
│ ├── main/ # Electron main process
│ ├── preload/ # Preload scripts & IPC
│ └── media/ # Audio/keyboard native interfaces
├── native/ # Native components (Rust/Swift)
│ ├── audio-recorder/ # Audio capture (Rust)
│ ├── global-key-listener/ # Keyboard events (Rust)
│ ├── text-writer/ # Text insertion (Rust)
│ └── active-application/ # Get the active application for context (Rust)
├── server/ # gRPC transcription server
│ ├── src/ # Server implementation
│ └── infra/ # AWS infrastructure (CDK)
└── resources/ # Build resources & assets
Available Scripts
# Development
bun run dev # Start with hot reload
bun run dev:rust # Build Rust components and start dev
# Building Native Components
bun run build:rust # Build for current platform
bun run build:rust:mac # Build for macOS (with universal binary)
bun run build:rust:win # Build for Windows
# Building Application
bun run build:mac # Build for macOS
bun run build:win # Build for Windows
./build-app.sh mac # Build macOS using build script
./build-app.sh windows # Build Windows using build script (requires Docker)
# Code Quality
bun run lint # Run ESLint
bun run format # Run Prettier
bun run lint:fix # Fix linting issues
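The platform-specific native build scripts above can be selected from the output of uname -s. This is a minimal sketch, assuming Git Bash or an MSYS2 shell on Windows, where uname -s reports a name beginning with MINGW or MSYS. rust_build_script_for is a hypothetical helper, not one of the project's scripts.

```shell
# Hypothetical helper: map a `uname -s` value to one of the bun scripts listed above.
rust_build_script_for() {
  case "$1" in
    Darwin) echo "build:rust:mac" ;;
    MINGW*|MSYS*) echo "build:rust:win" ;;
    *) echo "build:rust" ;;   # fall back to the current-platform build
  esac
}

# Example: print the command for this machine rather than run it.
echo "bun run $(rust_build_script_for "$(uname -s)")"
```

Printing the command instead of executing it keeps the sketch safe to try on a machine without the toolchain installed.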
🏗️ Architecture
Client Architecture
Ito is an Electron application with a multi-process architecture:
- Main Process: Handles system integration, permissions, and native component coordination
- Renderer Process: React-based UI with real-time audio visualization
- Preload Scripts: Secure IPC bridge between main and renderer processes
- Native Components: High-performance Rust binaries for audio capture and keyboard handling
Technology Stack
Frontend:
- Electron - Cross-platform desktop framework
- React 19 - Modern UI library with concurrent features
- TypeScript - Type-safe development
- TailwindCSS - Utility-first styling
- Zustand - Lightweight state management
- Framer Motion - Smooth animations
Backend:
- Node.js - Runtime environment
- gRPC - High-performance RPC for transcription services
- SQLite - Local data storage
- Protocol Buffers - Efficient data serialization
Native Components:
- Rust - System-level audio recording and keyboard event handling
- Swift - macOS-specific text manipulation and accessibility features
- cpal - Cross-platform audio library
- enigo - Cross-platform input simulation
Infrastructure:
- AWS CDK - Infrastructure as code
- Docker - Containerized deployments
