onWatch

Free, open-source AI API quota monitoring for developers.

Track usage across Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, MiniMax, Gemini CLI, and Antigravity in one place. See history, get alerts, and open a local web dashboard before you hit throttling or run over budget.

Links: Website | Buy Me a Coffee

onWatch fills the gap between "current usage snapshot" and the historical, per-cycle, cross-session view that developers actually need. It runs as a lightweight background agent (<50 MB RAM with all eight providers polling in parallel), stores historical data in SQLite, and serves a Material Design 3 web dashboard with dark/light mode.
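As a rough illustration of the storage model described above, the sketch below records quota snapshots in SQLite and reads back a per-quota history. The table name quota_snapshots, its columns, and the helper functions are hypothetical and are not taken from onWatch's actual schema.

```python
import sqlite3

# Hypothetical schema for illustration only; onWatch's real tables may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS quota_snapshots (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    provider    TEXT NOT NULL,
    quota_name  TEXT NOT NULL,
    used        REAL NOT NULL,
    quota_limit REAL NOT NULL,
    captured_at TEXT NOT NULL  -- ISO 8601 UTC timestamp
)
"""


def open_store(path=":memory:"):
    """Open (or create) the snapshot database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_snapshot(conn, provider, quota_name, used, quota_limit, captured_at):
    """Append one polled quota reading."""
    conn.execute(
        "INSERT INTO quota_snapshots "
        "(provider, quota_name, used, quota_limit, captured_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (provider, quota_name, used, quota_limit, captured_at),
    )
    conn.commit()


def history(conn, provider, quota_name):
    """Return (captured_at, used, quota_limit) rows in chronological order."""
    rows = conn.execute(
        "SELECT captured_at, used, quota_limit FROM quota_snapshots "
        "WHERE provider = ? AND quota_name = ? ORDER BY captured_at",
        (provider, quota_name),
    )
    return rows.fetchall()
```

Storing every poll as an append-only row is what makes the historical and per-cycle views possible; a provider's own dashboard typically exposes only the latest reading.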

It works with any tool that uses Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, MiniMax, Gemini CLI, or Antigravity API keys, including Cline, Roo Code, Kilo Code, Claude Code, Codex CLI, Cursor, GitHub Copilot, MiniMax Coding Plan, Antigravity, and others.

Zero telemetry. Single binary. All data stays on your machine.

Beta: onWatch is currently in active development. Features and APIs may change as we refine the product.

[Screenshot: Anthropic Dashboard - Light Mode]

If onWatch helps you track your AI spending, consider giving it a star. It helps others discover the project.

Powered by onllm.dev | Landing Page

Quick Start

macOS & Linux

One-line install:

curl -fsSL https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash

This downloads the binary to ~/.onwatch/, creates a .env config, and adds onwatch to your PATH. On Linux it also sets up a systemd service; on macOS, onWatch self-daemonizes instead.

On macOS, the installer downloads the standard binary with menubar support.

Homebrew (macOS & Linux)

brew install onllm-dev/tap/onwatch
onwatch setup    # Interactive setup wizard for API keys and config

Windows

One-line install (PowerShell):

irm https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.ps1 | iex

Or download install.bat from the Releases page and double-click it.

This downloads the binary to %USERPROFILE%\.onwatch\, runs interactive setup for API keys, creates a .env config, and adds onwatch to your PATH.

For manual setup or troubleshooting, see the Windows Setup Guide.

Manual Installation

Download binaries from the Releases page. Binaries are available for macOS (ARM64, AMD64), Linux (AMD64, ARM64), and Windows (AMD64).

Or build from source (requires Go 1.25+):

git clone https://github.com/onllm-dev/onwatch.git && cd onwatch
cp .env.example .env    # then add your API keys
./app.sh --build && ./onwatch --debug    # or: make build && ./onwatch --debug

Or use Docker (requires Docker or Docker Compose):

cp .env.docker.example .env   # add your API keys
docker-compose up -d

Or via app.sh:

./app.sh --docker --run

The Docker image uses a distroless base (~10-12 MB) and runs as non-root. An Alpine variant with shell access is also available (ghcr.io/onllm-dev/onwatch:alpine). Data persists via volume mount at /data. Logs go to stdout (docker logs -f onwatch). See Docker Deployment for details.

Configure

Edit ~/.onwatch/.env (or .env in the project directory if built from source):

SYNTHETIC_API_KEY=syn_your_key_here       # https://synthetic.new/settings/api
ZAI_API_KEY=your_zai_key_here             # https://www.z.ai/api-keys
ANTHROPIC_TOKEN=your_token_here           # Auto-detected from Claude Code credentials
CODEX_TOKEN=your_token_here               # Recommended for Codex-only setups
COPILOT_TOKEN=ghp_your_token_here         # GitHub PAT with copilot scope (Beta)
ONWATCH_ADMIN_USER=admin
ONWATCH_ADMIN_PASS=changeme

At least one provider key is required; configure any combination to track providers in parallel. Anthropic tokens are auto-detected from Claude Code credentials (macOS Keychain, Linux keyring, or ~/.claude/.credentials.json). For Codex-only setups, set CODEX_TOKEN in .env. At runtime, onWatch re-reads Codex auth state from ~/.codex/auth.json (or CODEX_HOME/auth.json) and picks up token changes. COPILOT_TOKEN must be a GitHub Personal Access Token (classic) with the copilot scope.
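Before starting onWatch, it can help to confirm that the .env file sets at least one provider key. The following is a minimal sketch, not part of onWatch: the functions parse_env and configured_providers are hypothetical helpers, and the check covers only the five provider variables shown in the example above. It treats any non-empty value as configured, so placeholder values would pass, and it knows nothing about Anthropic auto-detection.

```python
# Provider variables taken from the example .env above; other providers
# may use variables that are not listed here.
PROVIDER_KEYS = (
    "SYNTHETIC_API_KEY",
    "ZAI_API_KEY",
    "ANTHROPIC_TOKEN",
    "CODEX_TOKEN",
    "COPILOT_TOKEN",
)


def parse_env(text):
    """Parse KEY=value lines, skipping blanks and full-line comments.

    A trailing ' # comment' after the value is dropped.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def configured_providers(values):
    """Return the provider variables that have a non-empty value."""
    return [key for key in PROVIDER_KEYS if values.get(key)]
```

For example, reading ~/.onwatch/.env into a string and passing it through parse_env and configured_providers yields the list of provider variables that are set; an empty list means no key from the list above was found in the file.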

Provider setup guides:

Windows Setup Guide - Detailed Windows installation & manual configuration
Codex Setup Guide
Copilot Setup Guide
MiniMax Setup Guide
Antigravity Setup Guide

Run

onwatch              # start in background (daemonizes, logs to ~/.onwatch/data/.onwatch.log)
onwatch --debug      # foreground mode, logs to stdout
onwatch stop         # stop the running instance
onwatch status       # check if running

Open http://localhost:9211 and log in with your .env credentials.

What onWatch Tracks (That Your Provider Doesn't)

┌───────────────────────────────────┬──────────────────────────────┐
│ What your provider shows          │ What onWatch adds            │
├───────────────────────────────────┼──────────────────────────────┤
│ Current quota usage               │ Historical usage trends      │
│                                   │ Reset cycle detection        │
│                                   │ Per-cycle consumption stats  │
│                                   │ Usage rate & projections     │
│                                   │ Per-session tracking         │
│                                   │ Multi-provider unified view  │
│                                   │ Live countdown timers        │
└───────────────────────────────────┴──────────────────────────────┘
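To make "reset cycle detection" and "per-cycle consumption stats" concrete, here is one plausible approach, sketched under the assumption that a reset shows up as a drop in the usage counter between two consecutive polls. The source does not describe how onWatch actually detects resets; split_into_cycles and cycle_stats are hypothetical names.

```python
def split_into_cycles(samples, drop_tolerance=0.0):
    """Group (timestamp, used) samples into reset cycles.

    A new cycle starts whenever usage falls by more than drop_tolerance
    relative to the previous sample; that drop is treated as a quota reset.
    """
    cycles = []
    current = []
    for ts, used in samples:
        if current and used < current[-1][1] - drop_tolerance:
            cycles.append(current)
            current = []
        current.append((ts, used))
    if current:
        cycles.append(current)
    return cycles


def cycle_stats(cycle):
    """Summarize one cycle: first timestamp, last timestamp, peak usage."""
    peak = max(used for _, used in cycle)
    return {"start": cycle[0][0], "end": cycle[-1][0], "peak": peak}
```

With samples such as [(0, 10), (1, 40), (2, 90), (3, 5), (4, 20)], the drop from 90 to 5 splits the history into two cycles, and the first cycle's peak is 90.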

Dashboard -- Material Design 3 with dark/light mode (auto-detects system preference). Provider tabs appear for each configured provider:

Synthetic -- Subscription, Search, and Tool Call quota cards
Z.ai -- Tokens, Time, and Tool Call quota cards
Anthropic -- Dynamic quota cards (5-Hour, 7-Day, 7-Day Sonnet, Monthly, etc.) with utilization percentages, OAuth token auto-refresh, and automatic rate limit bypass via token rotation
Codex -- Dynamic quota cards (LLMs, Review Requests) with OAuth auth-state refresh, historical cycle analytics, and multi-account support (Beta) for tracking multiple ChatGPT accounts
GitHub Copilot (Beta) -- Premium Interactions, Chat, and Completions quota cards with monthly reset tracking
MiniMax Coding Plan -- Shared quota pool tracking for M2, M2.1, and M2.5 models with 5-hour rolling window reset cycles
Gemini CLI (Beta) -- Per-model quota tracking for Gemini 2.5/3.x Pro, Flash, and Flash Lite models with 24-hour reset cycles
Antigravity -- Multi-model quota cards (Claude, Gemini, GPT) with grouped quota pools, logging history, and cycle overview
All -- Side-by-side view of all configured providers
PWA installable -- Install onWatch from your browser for a native app experience (Beta)

Each quota card shows: usage vs. limit with progress bar, live countdown to reset, status badge (healthy/warning/danger/critical), and consumption rate with projected usage.
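The status badge and projected usage on a card could be derived roughly as follows. This is a sketch only: the threshold values (50%, 80%, 95%) and the linear projection are assumptions chosen for illustration, not onWatch's documented behavior.

```python
def status_badge(used, limit, warning=0.5, danger=0.8, critical=0.95):
    """Map a usage ratio to a badge name (thresholds are illustrative)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    ratio = used / limit
    if ratio >= critical:
        return "critical"
    if ratio >= danger:
        return "danger"
    if ratio >= warning:
        return "warning"
    return "healthy"


def projected_usage(used, rate_per_hour, hours_until_reset):
    """Linearly extrapolate usage to the moment the quota resets."""
    return used + rate_per_hour * max(hours_until_reset, 0.0)
```

A card showing 40 units used, consumed at 10 units per hour with 3 hours left before reset, would project to 70 units; feeding that projection back into status_badge gives an early warning before the limit is actually reached.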

Time-series chart -- Chart.js area chart showing all quotas as % of limit. Time ranges: 1h, 6h, 24h, 7d, 30d.
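Plotting every quota as a percentage of its limit lets quotas with different units share one axis. The sketch below shows that normalization together with filtering by the time ranges listed above; the function names and the (timestamp, used, limit) sample shape are assumptions for illustration.

```python
from datetime import timedelta

# The time ranges offered by the chart, as listed above.
RANGES = {
    "1h": timedelta(hours=1),
    "6h": timedelta(hours=6),
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}


def percent_of_limit(used, limit):
    """Express usage as a percentage of the limit (0.0 if limit is unset)."""
    return 100.0 * used / limit if limit > 0 else 0.0


def series_for_range(samples, now, range_key):
    """Keep samples inside the chosen window and convert them to percentages.

    samples is a list of (datetime, used, limit) tuples.
    """
    cutoff = now - RANGES[range_key]
    return [
        (ts, percent_of_limit(used, limit))
        for ts, used, limit in samples
        if ts >= cutoff
    ]
```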

Insights -- Burn rate forecasting, billing-period averages, usage variance, trend detection, and cross-quota ratio analysis (e.g., "1% weekly ~ 24% of 5-hr sprint"). Provider-specific: tokens-per-call efficiency and per-tool breakdowns for Z.ai.
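Two of these insights reduce to simple arithmetic. The sketch below estimates a burn rate from the first and last samples in a window, forecasts the time until the limit is reached, and computes a cross-quota ratio of the kind quoted in the example. These formulas are assumptions about what such insights could look like, not a description of onWatch's internals.

```python
def burn_rate(samples):
    """Average consumption in units per hour.

    samples is a time-ordered list of (hours, used) pairs.
    """
    if len(samples) < 2:
        return 0.0
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (u1 - u0) / (t1 - t0)


def hours_until_exhausted(used, limit, rate):
    """Hours until the limit is hit, or None if usage is flat or falling."""
    if rate <= 0:
        return None
    return max(limit - used, 0.0) / rate


def cross_quota_ratio(delta_pct_a, delta_pct_b):
    """Percentage points quota B moves for each point quota A moves."""
    if delta_pct_a == 0:
        return None
    return delta_pct_b / delta_pct_a
```

If a weekly quota rose by 1 percentage point over an interval in which a 5-hour quota rose by 24 points, cross_quota_ratio(1.0, 24.0) gives 24.0, which corresponds to the "1% weekly ~ 24% of 5-hr sprint" style of statement.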

Cycle Overview -- Cross-quota correlation table showing all quota values at peak usage points within each bi
