OnWatch
Track AI API quotas across Synthetic, Z.ai, Anthropic (Claude Code), Codex, GitHub Copilot & Antigravity in real time. Lightweight background daemon (<50MB RAM), SQLite storage, Material Design 3 dashboard. Zero telemetry.
onWatch
Free, open-source AI API quota monitoring for developers.
Track usage across Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, MiniMax, Gemini CLI, and Antigravity in one place. See history, get alerts, and open a local web dashboard before you hit throttling or run over budget.
onWatch fills the gap between "current usage snapshot" and the historical, per-cycle, cross-session view that developers actually need. It runs as a lightweight background agent (<50 MB RAM with all eight providers polling in parallel), stores historical data in SQLite, and serves a Material Design 3 web dashboard with dark/light mode.
It works with any tool that uses Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, MiniMax, Gemini CLI, or Antigravity API keys, including Cline, Roo Code, Kilo Code, Claude Code, Codex CLI, Cursor, GitHub Copilot, MiniMax Coding Plan, Antigravity, and others.
Zero telemetry. Single binary. All data stays on your machine.
Beta: onWatch is currently in active development. Features and APIs may change as we refine the product.

If onWatch helps you track your AI spending, consider giving it a star. It helps others discover the project.
Quick Start
macOS & Linux
One-line install:
curl -fsSL https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash
The installer downloads the binary to ~/.onwatch/, creates a .env config, registers a systemd service on Linux (on macOS the binary self-daemonizes instead), and adds onwatch to your PATH.
On macOS, the installer downloads the standard binary with menubar support.
Homebrew (macOS & Linux)
brew install onllm-dev/tap/onwatch
onwatch setup # Interactive setup wizard for API keys and config
Windows
One-line install (PowerShell):
irm https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.ps1 | iex
Or download install.bat from the Releases page and double-click it.
This downloads the binary to %USERPROFILE%\.onwatch\, runs interactive setup for API keys, creates a .env config, and adds onwatch to your PATH.
For manual setup or troubleshooting, see the Windows Setup Guide.
Manual Installation
Download binaries from the Releases page. Binaries are available for macOS (ARM64, AMD64), Linux (AMD64, ARM64), and Windows (AMD64).
Or build from source (requires Go 1.25+):
git clone https://github.com/onllm-dev/onwatch.git && cd onwatch
cp .env.example .env # then add your API keys
./app.sh --build && ./onwatch --debug # or: make build && ./onwatch --debug
Or use Docker (requires Docker or Docker Compose):
cp .env.docker.example .env # add your API keys
docker-compose up -d
Or via app.sh:
./app.sh --docker --run
The Docker image uses a distroless base (~10-12 MB) and runs as non-root. An Alpine variant with shell access is also available (ghcr.io/onllm-dev/onwatch:alpine). Data persists via volume mount at /data. Logs go to stdout (docker logs -f onwatch). See Docker Deployment for details.
Configure
Edit ~/.onwatch/.env (or .env in the project directory if built from source):
SYNTHETIC_API_KEY=syn_your_key_here # https://synthetic.new/settings/api
ZAI_API_KEY=your_zai_key_here # https://www.z.ai/api-keys
ANTHROPIC_TOKEN=your_token_here # Auto-detected from Claude Code credentials
CODEX_TOKEN=your_token_here # Recommended for Codex-only setups
COPILOT_TOKEN=ghp_your_token_here # GitHub PAT with copilot scope (Beta)
ONWATCH_ADMIN_USER=admin
ONWATCH_ADMIN_PASS=changeme
At least one provider key is required. Configure any combination to track them in parallel.
Anthropic: tokens are auto-detected from Claude Code credentials (macOS Keychain, Linux keyring, or ~/.claude/.credentials.json).
Codex: for Codex-only setups, set CODEX_TOKEN in .env. At runtime, onWatch re-reads the Codex auth state from ~/.codex/auth.json (or CODEX_HOME/auth.json) and picks up token changes.
Copilot: requires a GitHub Personal Access Token (classic) with the copilot scope.
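As a rough pre-flight check, the sketch below parses the text of a .env file and reports which provider variables have a value. The parsing rules (KEY=value lines, " #" starting an inline comment) and the function names are assumptions made for illustration, not onWatch's actual loader; only the five variable names come from the sample configuration above.

```python
# Provider variables listed in the sample .env above.
PROVIDER_KEYS = (
    "SYNTHETIC_API_KEY",
    "ZAI_API_KEY",
    "ANTHROPIC_TOKEN",
    "CODEX_TOKEN",
    "COPILOT_TOKEN",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and '#' comment lines (assumed format)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat ' #' as the start of an inline comment, as in the sample above.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def configured_providers(env: dict[str, str]) -> list[str]:
    """Return the provider variables that have a non-empty value."""
    return [key for key in PROVIDER_KEYS if env.get(key)]


sample = """\
SYNTHETIC_API_KEY=syn_your_key_here # https://synthetic.new/settings/api
ZAI_API_KEY=
ONWATCH_ADMIN_USER=admin
"""
print(configured_providers(parse_env(sample)))  # ['SYNTHETIC_API_KEY']
```

An empty result is not necessarily an error: as noted above, Anthropic tokens can be auto-detected from Claude Code credentials without appearing in .env.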
Provider setup guides:
- Windows Setup Guide - Detailed Windows installation & manual configuration
- Codex Setup Guide
- Copilot Setup Guide
- MiniMax Setup Guide
- Antigravity Setup Guide
Run
onwatch # start in background (daemonizes, logs to ~/.onwatch/data/.onwatch.log)
onwatch --debug # foreground mode, logs to stdout
onwatch stop # stop the running instance
onwatch status # check if running
Open http://localhost:9211 and log in with your .env credentials.
What onWatch Tracks (That Your Provider Doesn't)
┌───────────────────────────────────┬──────────────────────────────┐
│ What your provider shows          │ What onWatch adds            │
├───────────────────────────────────┼──────────────────────────────┤
│ Current quota usage               │ Historical usage trends      │
│                                   │ Reset cycle detection        │
│                                   │ Per-cycle consumption stats  │
│                                   │ Usage rate & projections     │
│                                   │ Per-session tracking         │
│                                   │ Multi-provider unified view  │
│                                   │ Live countdown timers        │
└───────────────────────────────────┴──────────────────────────────┘
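Reset cycle detection and per-cycle stats, listed in the table above, can be approximated from polled snapshots alone. The following is a minimal sketch under the assumption that a reset shows up as a drop in the usage counter between two consecutive samples. The Sample type and the split_cycles and peak_usage functions are hypothetical names; onWatch's actual detection logic may differ (for example, by using provider-reported reset timestamps).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    ts: float    # Unix timestamp of the poll
    used: float  # quota consumed so far in the current cycle


def split_cycles(samples: list[Sample]) -> list[list[Sample]]:
    """Group time-ordered samples into cycles, starting a new one when usage drops."""
    cycles: list[list[Sample]] = []
    for s in samples:
        if cycles and s.used >= cycles[-1][-1].used:
            cycles[-1].append(s)
        else:
            cycles.append([s])  # first sample, or usage fell: assume a reset
    return cycles


def peak_usage(cycle: list[Sample]) -> float:
    """Per-cycle consumption, taken here as the highest usage seen in the cycle."""
    return max(s.used for s in cycle)


samples = [Sample(0, 10), Sample(60, 40), Sample(120, 90), Sample(180, 5), Sample(240, 30)]
cycles = split_cycles(samples)
print([peak_usage(c) for c in cycles])  # [90, 30]
```

A drop-based heuristic misses a reset when usage climbs back above the previous value between two polls, which is one reason a short polling interval helps.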
Dashboard -- Material Design 3 with dark/light mode (auto-detects system preference). Provider tabs appear for each configured provider:
- Synthetic -- Subscription, Search, and Tool Call quota cards
- Z.ai -- Tokens, Time, and Tool Call quota cards
- Anthropic -- Dynamic quota cards (5-Hour, 7-Day, 7-Day Sonnet, Monthly, etc.) with utilization percentages, OAuth token auto-refresh, and automatic rate limit bypass via token rotation
- Codex -- Dynamic quota cards (LLMs, Review Requests) with OAuth auth-state refresh, historical cycle analytics, and multi-account support (Beta) for tracking multiple ChatGPT accounts
- GitHub Copilot (Beta) -- Premium Interactions, Chat, and Completions quota cards with monthly reset tracking
- MiniMax Coding Plan -- Shared quota pool tracking for M2, M2.1, and M2.5 models with 5-hour rolling window reset cycles
- Gemini CLI (Beta) -- Per-model quota tracking for Gemini 2.5/3.x Pro, Flash, and Flash Lite models with 24-hour reset cycles
- Antigravity -- Multi-model quota cards (Claude, Gemini, GPT) with grouped quota pools, logging history, and cycle overview
- All -- Side-by-side view of all configured providers
- PWA installable -- Install onWatch from your browser for a native app experience (Beta)
Each quota card shows: usage vs. limit with progress bar, live countdown to reset, status badge (healthy/warning/danger/critical), and consumption rate with projected usage.
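To make the card fields concrete, here is a small sketch of how a status badge and a projected usage figure could be derived from a single quota reading. The 50/80/95% cut-offs and the linear extrapolation are illustrative assumptions, and the function names are hypothetical; the README does not state the thresholds or the projection model onWatch uses.

```python
def status_badge(used: float, limit: float) -> str:
    """Map utilization to a badge; the 50/80/95% cut-offs are illustrative assumptions."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    pct = 100.0 * used / limit
    if pct >= 95:
        return "critical"
    if pct >= 80:
        return "danger"
    if pct >= 50:
        return "warning"
    return "healthy"


def projected_usage(used: float, elapsed_s: float, remaining_s: float) -> float:
    """Linearly extrapolate usage to the reset time from the average rate so far."""
    if elapsed_s <= 0:
        return used
    return used + used * remaining_s / elapsed_s


# 600 of 1000 units used, 2 hours into a 5-hour window.
print(status_badge(600, 1000))                   # warning
print(projected_usage(600, 2 * 3600, 3 * 3600))  # 1500.0
```

In this example the projection (1500) exceeds the limit (1000), which is the situation a warning before the reset is meant to surface.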
Time-series chart -- Chart.js area chart showing all quotas as % of limit. Time ranges: 1h, 6h, 24h, 7d, 30d.
Insights -- Burn rate forecasting, billing-period averages, usage variance, trend detection, and cross-quota ratio analysis (e.g., "1% weekly ~ 24% of 5-hr sprint"). Provider-specific: tokens-per-call efficiency and per-tool breakdowns for Z.ai.
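One way to read the cross-quota ratio example is "how many percentage points of the short-window quota does one point of the long-window quota cost". The sketch below estimates that from paired utilization samples; it is an assumed approach with a hypothetical function name, not a description of how onWatch computes its insights.

```python
from typing import Optional


def cross_quota_ratio(short_pct: list[float], long_pct: list[float]) -> Optional[float]:
    """Estimate short-window points consumed per long-window point.

    Both lists hold utilization percentages sampled at the same instants.
    Intervals where either quota decreased (a likely reset) are skipped.
    """
    short_total = 0.0
    long_total = 0.0
    for i in range(1, min(len(short_pct), len(long_pct))):
        d_short = short_pct[i] - short_pct[i - 1]
        d_long = long_pct[i] - long_pct[i - 1]
        if d_short < 0 or d_long < 0:
            continue
        short_total += d_short
        long_total += d_long
    return short_total / long_total if long_total > 0 else None


five_hour = [0.0, 12.0, 24.0, 48.0]  # made-up sample data
weekly = [10.0, 10.5, 11.0, 12.0]    # made-up sample data
ratio = cross_quota_ratio(five_hour, weekly)
print(f"1% weekly ~ {ratio:.0f}% of 5-hr window")  # 1% weekly ~ 24% of 5-hr window
```

The function returns None when the long-window quota never increased, since no ratio can be estimated from such data.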
Cycle Overview -- Cross-quota correlation table showing all quota values at peak usage points within each bi
