NadirClaw
Open-source LLM router & AI cost optimizer. Routes simple prompts to cheap/local models, complex ones to premium — automatically. Drop-in OpenAI-compatible proxy for Claude Code, Codex, Cursor, OpenClaw. Saves 40-70% on AI API costs. Self-hosted, no middleman.
Why NadirClaw?
Most LLM requests don't need a premium model. In typical coding sessions, 60-70% of prompts are simple — reading files, short questions, formatting. They can be handled by models that cost 10-20x less.
$ nadirclaw serve
✓ Classifier ready — Listening on localhost:8856
SIMPLE "What is 2+2?" → gemini-flash $0.0002
SIMPLE "Format this JSON" → haiku-4.5 $0.0004
COMPLEX "Refactor auth module..." → claude-sonnet $0.098
COMPLEX "Debug race condition..." → gpt-5.2 $0.450
SIMPLE "Write a docstring" → gemini-flash $0.0002
3 of 5 routed cheaper · $0.549 vs $1.37 all-premium · 60% saved
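The savings line is plain arithmetic: sum what the routed requests actually cost and compare against the all-premium figure for the same five prompts. A quick sketch using the prices from the demo session above:

```python
# Reproduce the savings math from the demo session.
routed = [0.0002, 0.0004, 0.098, 0.450, 0.0002]  # actual cost per request
all_premium = 1.37                                # same 5 prompts, premium-only

spent = sum(routed)                               # 0.5488 -> "$0.549"
saved = 1 - spent / all_premium                   # ~0.60 -> "60% saved"
print(f"${spent:.3f} vs ${all_premium:.2f} all-premium · {saved:.0%} saved")
```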
- Cut AI API costs 40-70% — real savings from day one
- ~10ms classification overhead — you won't notice it
- Drop-in proxy — works with any OpenAI-compatible tool
- Runs locally — your API keys never leave your machine
- Fallback chains — automatic failover when models are down
- Built-in cost tracking — dashboard, reports, budget alerts
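The SIMPLE/COMPLEX labels in the session above come from a fast classifier whose score is compared against tier thresholds. A toy sketch of threshold routing; the threshold values here are illustrative, not NadirClaw's defaults:

```python
# Map a complexity score in [0, 1] to a routing tier by threshold.
# Thresholds are made-up example values, not NadirClaw's actual defaults.
def route(score, thresholds=(0.3, 0.7)):
    simple_max, mid_max = thresholds
    if score < simple_max:
        return "simple"   # cheap/local model
    if score < mid_max:
        return "mid"      # cost-effective middle tier
    return "complex"      # premium model

print(route(0.1), route(0.5), route(0.9))
```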
Your keys. Your models. No middleman. NadirClaw runs locally and routes directly to providers. No third-party proxy, no subsidized tokens, no platform that can pull the plug on you. Why this matters.
Quick Start
pip install nadirclaw
Or install from source:
curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh
Then run the interactive setup wizard:
nadirclaw setup
This guides you through selecting providers, entering API keys, and choosing models for each routing tier. Then start the router:
nadirclaw serve --verbose
That's it. NadirClaw starts on http://localhost:8856 with sensible defaults (Gemini 3 Flash for simple, OpenAI Codex for complex). If you skip nadirclaw setup, the serve command will offer to run it on first launch.
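Because the proxy speaks the OpenAI chat completions API, any client can target it by swapping the base URL. A minimal stdlib sketch; the `"auto"` model name and the exact endpoint path are assumptions based on the OpenAI-compatible API, so check your own setup before relying on them:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8856/v1"  # NadirClaw's default port

def build_request(prompt, model="auto"):
    # "auto" defers model choice to the router (an assumption here;
    # routing profiles like "eco" or "premium" may also be accepted).
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(prompt):
    # Requires a running `nadirclaw serve` on BASE_URL.
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

print(build_request("What is 2+2?").full_url)
```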
Features
- Context Optimize — compacts bloated context (JSON, tool schemas, chat history, whitespace) before dispatch, saving 30-70% input tokens with zero semantic loss. Modes: `off` (default), `safe` (lossless), `aggressive` (future). See savings analysis
- Smart routing — classifies prompts in ~10ms using sentence embeddings
- Three-tier routing — simple / mid / complex tiers with configurable score thresholds (`NADIRCLAW_TIER_THRESHOLDS`); set `NADIRCLAW_MID_MODEL` for a cost-effective middle tier
- Agentic task detection — auto-detects tool use, multi-step loops, and agent system prompts; forces the complex model for agentic requests
- Reasoning detection — identifies prompts needing chain-of-thought and routes to reasoning-optimized models
- Vision routing — auto-detects image content in messages and routes to vision-capable models (GPT-4o, Claude, Gemini)
- Routing profiles — `auto`, `eco`, `premium`, `free`, `reasoning` — choose your cost/quality strategy per request
- Model aliases — use short names like `sonnet`, `flash`, `gpt4` instead of full model IDs
- Session persistence — pins the model for multi-turn conversations so you don't bounce between models mid-thread
- Context-window filtering — auto-swaps to a model with a larger context window when your conversation is too long
- Fallback chains — if a model fails (429, 5xx, timeout), NadirClaw cascades through a configurable chain of fallback models until one succeeds
- Streaming support — full SSE streaming compatible with OpenClaw, Codex, and other streaming clients
- Native Gemini support — calls Gemini models directly via the Google GenAI SDK (not through LiteLLM)
- OAuth login — use your subscription with `nadirclaw auth <provider> login` (OpenAI, Anthropic, Google), no API key needed
- Multi-provider — supports Gemini, OpenAI, Anthropic, Ollama, and any LiteLLM-supported provider
- OpenAI-compatible API — drop-in replacement for any tool that speaks the OpenAI chat completions API
- Request reporting — `nadirclaw report` with per-model and per-day cost breakdown (`--by-model`, `--by-day`), anomaly flagging, filters, latency stats, tier breakdown, and token usage
- Log export — `nadirclaw export --format csv|jsonl --since 7d` for offline analysis in spreadsheets or data tools
- Raw logging — optional `--log-raw` flag to capture full request/response content for debugging and replay
- Prometheus metrics — built-in `/metrics` endpoint with request counts, latency histograms, token/cost totals, cache hits, and fallback tracking (zero extra dependencies)
- OpenTelemetry tracing — optional distributed tracing with GenAI semantic conventions (`pip install nadirclaw[telemetry]`)
- Cost savings calculator — `nadirclaw savings` shows exactly how much money you've saved, with monthly projections
- Spend tracking and budgets — real-time per-request cost tracking with daily/monthly budget limits, alerts via `nadirclaw budget`, optional webhook and stdout notifications
- Prompt caching — in-memory LRU cache for identical chat completions, skipping redundant LLM calls entirely. Configurable TTL and max size via `NADIRCLAW_CACHE_TTL` and `NADIRCLAW_CACHE_MAX_SIZE`. Monitor with `nadirclaw cache` or the `/v1/cache` endpoint
- Live dashboard — `nadirclaw dashboard` for the terminal, or visit `http://localhost:8856/dashboard` for a web UI with real-time stats, cost tracking, and model usage
- GitHub Action — `doramirdor/nadirclaw-action` for CI/CD pipelines
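The fallback-chain behavior above is a simple cascade: try each model in order and move to the next on a retryable failure. A conceptual sketch, not NadirClaw's actual code; the names here are illustrative:

```python
# Cascade through a fallback chain until one model succeeds.
class TransientError(Exception):
    """Stand-in for a retryable failure (429, 5xx, timeout)."""

def complete_with_fallback(chain, call_model):
    last_error = None
    for model in chain:
        try:
            return model, call_model(model)   # first success wins
        except TransientError as exc:
            last_error = exc                  # remember and try the next model
    raise last_error                          # whole chain exhausted

# Illustrative provider stub: the primary is rate-limited, the backup works.
def flaky(model):
    if model == "primary":
        raise TransientError("429 rate limited")
    return "ok"

print(complete_with_fallback(["primary", "backup"], flaky))
```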
Dashboard
Monitor your routing in real time with nadirclaw dashboard.
Install the dashboard extras: pip install nadirclaw[dashboard]
Prerequisites
- Python 3.10+
- git
- At least one LLM provider:
- Google Gemini API key (free tier: 20 req/day)
- Ollama running locally (free, no API key needed)
- Anthropic API key for Claude models
- OpenAI API key for GPT models
- Provider subscriptions via OAuth (`nadirclaw auth openai login`, `nadirclaw auth anthropic login`, `nadirclaw auth antigravity login`, `nadirclaw auth gemini login`)
- Or any provider supported by LiteLLM
Install
One-line install (recommended)
curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh
This clones the repo to ~/.nadirclaw, creates a virtual environment, installs dependencies, and adds nadirclaw to your PATH. Run it again to update.
Manual install
git clone https://github.com/doramirdor/NadirClaw.git
cd NadirClaw
python3 -m venv venv
source venv/bin/activate
pip install -e .
Uninstall
rm -rf ~/.nadirclaw
sudo rm -f /usr/local/bin/nadirclaw
Docker
Run NadirClaw + Ollama with zero cost, fully local:
git clone https://github.com/doramirdor/NadirClaw.git && cd NadirClaw
docker compose up
This starts Ollama and NadirClaw on port 8856. Pull a model once it's running:
docker compose exec ollama ollama pull llama3.1:8b
To use premium models alongside Ollama, create a .env file with your API keys and model config (see .env.example), then restart.
To run NadirClaw standalone (without Ollama):
docker build -t nadirclaw .
docker run -p 8856:8856 --env-file .env nadirclaw
Configure
Environment File
NadirClaw loads configuration from ~/.nadirclaw/.env. Create or edit this file to set API keys and model preferences:
# ~/.nadirclaw/.env
# API keys (set the ones you use)
GEMINI_API_KEY=AIza...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Model routing
NADIRCLAW_SIMPLE_MODEL=gemini-3-flash-preview
NADIRCLAW_COMPLEX_MODEL=gemini-2.5-pro
# Server
NADIRCLAW_PORT=8856
If ~/.nadirclaw/.env does not exist, NadirClaw falls back to .env in the current directory.
Authentication
NadirClaw supports multiple ways to provide LLM credentials, checked in this order:
- OpenClaw stored token (`~/.openclaw/agents/main/a
