NadirClaw
Open-source LLM router & AI cost optimizer. Routes simple prompts to cheap/local models, complex ones to premium — automatically. Drop-in OpenAI-compatible proxy for Claude Code, Codex, Cursor, OpenClaw. Saves 40-70% on AI API costs. Self-hosted, no middleman.
Why NadirClaw?
Most LLM requests don't need a premium model. In typical coding sessions, 60-70% of prompts are simple — reading files, short questions, formatting. They can be handled by models that cost 10-20x less.
$ nadirclaw serve
✓ Classifier ready — Listening on localhost:8856
SIMPLE "What is 2+2?" → gemini-flash $0.0002
SIMPLE "Format this JSON" → haiku-4.5 $0.0004
COMPLEX "Refactor auth module..." → claude-sonnet $0.098
COMPLEX "Debug race condition..." → gpt-5.2 $0.450
SIMPLE "Write a docstring" → gemini-flash $0.0002
3 of 5 routed cheaper · $0.549 vs $1.37 all-premium · 60% saved
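The savings line is plain arithmetic: sum what the routed requests actually cost and compare against the all-premium figure for the same five prompts. A quick sketch using the prices from the demo session above:

```python
# Reproduce the savings math from the demo session.
routed = [0.0002, 0.0004, 0.098, 0.450, 0.0002]  # actual cost per request
all_premium = 1.37                                # same 5 prompts, premium-only

spent = sum(routed)                               # 0.5488 -> "$0.549"
saved = 1 - spent / all_premium                   # ~0.60 -> "60% saved"
print(f"${spent:.3f} vs ${all_premium:.2f} all-premium · {saved:.0%} saved")
```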
- Cut AI API costs 40-70% — real savings from day one
- ~10ms classification overhead — you won't notice it
- Drop-in proxy — works with any OpenAI-compatible tool
- Runs locally — your API keys never leave your machine
- Fallback chains — automatic failover when models are down
- Built-in cost tracking — dashboard, reports, budget alerts
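The SIMPLE/COMPLEX labels in the session above come from a fast classifier whose score is compared against tier thresholds. A toy sketch of threshold routing; the threshold values here are illustrative, not NadirClaw's defaults:

```python
# Map a complexity score in [0, 1] to a routing tier by threshold.
# Thresholds are made-up example values, not NadirClaw's actual defaults.
def route(score, thresholds=(0.3, 0.7)):
    simple_max, mid_max = thresholds
    if score < simple_max:
        return "simple"   # cheap/local model
    if score < mid_max:
        return "mid"      # cost-effective middle tier
    return "complex"      # premium model

print(route(0.1), route(0.5), route(0.9))
```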
Your keys. Your models. No middleman. NadirClaw runs locally and routes directly to providers. No third-party proxy, no subsidized tokens, no platform that can pull the plug on you. Why this matters.
Quick Start
pip install nadirclaw
Or install from source:
curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh
Then run the interactive setup wizard:
nadirclaw setup
This guides you through selecting providers, entering API keys, and choosing models for each routing tier. Then start the router:
nadirclaw serve --verbose
That's it. NadirClaw starts on http://localhost:8856 with sensible defaults (Gemini 3 Flash for simple, OpenAI Codex for complex). If you skip nadirclaw setup, the serve command will offer to run it on first launch.
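Because the proxy speaks the OpenAI chat completions API, any client can target it by swapping the base URL. A minimal stdlib sketch; the `"auto"` model name and the exact endpoint path are assumptions based on the OpenAI-compatible API, so check your own setup before relying on them:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8856/v1"  # NadirClaw's default port

def build_request(prompt, model="auto"):
    # "auto" defers model choice to the router (an assumption here;
    # routing profiles like "eco" or "premium" may also be accepted).
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(prompt):
    # Requires a running `nadirclaw serve` on BASE_URL.
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

print(build_request("What is 2+2?").full_url)
```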
Features
- Context Optimize — compacts bloated context (JSON, tool schemas, chat history, whitespace) before dispatch, saving 30-70% input tokens with zero semantic loss. Modes: `off` (default), `safe` (lossless), `aggressive` (future). See savings analysis
- Smart routing — classifies prompts in ~10ms using sentence embeddings
- Three-tier routing — simple / mid / complex tiers with configurable score thresholds (`NADIRCLAW_TIER_THRESHOLDS`); set `NADIRCLAW_MID_MODEL` for a cost-effective middle tier
- Agentic task detection — auto-detects tool use, multi-step loops, and agent system prompts; forces the complex model for agentic requests
- Reasoning detection — identifies prompts needing chain-of-thought and routes to reasoning-optimized models
- Vision routing — auto-detects image content in messages and routes to vision-capable models (GPT-4o, Claude, Gemini)
- Routing profiles — `auto`, `eco`, `premium`, `free`, `reasoning` — choose your cost/quality strategy per request
- Model aliases — use short names like `sonnet`, `flash`, `gpt4` instead of full model IDs
- Session persistence — pins the model for multi-turn conversations so you don't bounce between models mid-thread
- Context-window filtering — auto-swaps to a model with a larger context window when your conversation is too long
- Fallback chains — if a model fails (429, 5xx, timeout), NadirClaw cascades through a configurable chain of fallback models until one succeeds
- Streaming support — full SSE streaming compatible with OpenClaw, Codex, and other streaming clients
- Native Gemini support — calls Gemini models directly via the Google GenAI SDK (not through LiteLLM)
- OAuth login — use your subscription with `nadirclaw auth <provider> login` (OpenAI, Anthropic, Google), no API key needed
- Multi-provider — supports Gemini, OpenAI, Anthropic, Ollama, and any LiteLLM-supported provider
- OpenAI-compatible API — drop-in replacement for any tool that speaks the OpenAI chat completions API
- Request reporting — `nadirclaw report` with per-model and per-day cost breakdown (`--by-model`, `--by-day`), anomaly flagging, filters, latency stats, tier breakdown, and token usage
- Log export — `nadirclaw export --format csv|jsonl --since 7d` for offline analysis in spreadsheets or data tools
- Raw logging — optional `--log-raw` flag to capture full request/response content for debugging and replay
- Prometheus metrics — built-in `/metrics` endpoint with request counts, latency histograms, token/cost totals, cache hits, and fallback tracking (zero extra dependencies)
- OpenTelemetry tracing — optional distributed tracing with GenAI semantic conventions (`pip install nadirclaw[telemetry]`)
- Cost savings calculator — `nadirclaw savings` shows exactly how much money you've saved, with monthly projections
- Spend tracking and budgets — real-time per-request cost tracking with daily/monthly budget limits, alerts via `nadirclaw budget`, optional webhook and stdout notifications
- Prompt caching — in-memory LRU cache for identical chat completions, skipping redundant LLM calls entirely. Configurable TTL and max size via `NADIRCLAW_CACHE_TTL` and `NADIRCLAW_CACHE_MAX_SIZE`. Monitor with `nadirclaw cache` or the `/v1/cache` endpoint
- Live dashboard — `nadirclaw dashboard` for the terminal, or visit `http://localhost:8856/dashboard` for a web UI with real-time stats, cost tracking, and model usage
- GitHub Action — `doramirdor/nadirclaw-action` for CI/CD pipelines
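The fallback-chain behavior above is a simple cascade: try each model in order and move to the next on a retryable failure. A conceptual sketch, not NadirClaw's actual code; the names here are illustrative:

```python
# Cascade through a fallback chain until one model succeeds.
class TransientError(Exception):
    """Stand-in for a retryable failure (429, 5xx, timeout)."""

def complete_with_fallback(chain, call_model):
    last_error = None
    for model in chain:
        try:
            return model, call_model(model)   # first success wins
        except TransientError as exc:
            last_error = exc                  # remember and try the next model
    raise last_error                          # whole chain exhausted

# Illustrative provider stub: the primary is rate-limited, the backup works.
def flaky(model):
    if model == "primary":
        raise TransientError("429 rate limited")
    return "ok"

print(complete_with_fallback(["primary", "backup"], flaky))
```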
Dashboard
Monitor your routing in real time with nadirclaw dashboard.
Install the dashboard extras: pip install nadirclaw[dashboard]
Prerequisites
- Python 3.10+
- git
- At least one LLM provider:
- Google Gemini API key (free tier: 20 req/day)
- Ollama running locally (free, no API key needed)
- Anthropic API key for Claude models
- OpenAI API key for GPT models
- Provider subscriptions via OAuth (`nadirclaw auth openai login`, `nadirclaw auth anthropic login`, `nadirclaw auth antigravity login`, `nadirclaw auth gemini login`)
- Or any provider supported by LiteLLM
Install
One-line install (recommended)
curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh
This clones the repo to ~/.nadirclaw, creates a virtual environment, installs dependencies, and adds nadirclaw to your PATH. Run it again to update.
Manual install
git clone https://github.com/doramirdor/NadirClaw.git
cd NadirClaw
python3 -m venv venv
source venv/bin/activate
pip install -e .
Uninstall
rm -rf ~/.nadirclaw
sudo rm -f /usr/local/bin/nadirclaw
Docker
Run NadirClaw + Ollama with zero cost, fully local:
git clone https://github.com/doramirdor/NadirClaw.git && cd NadirClaw
docker compose up
This starts Ollama and NadirClaw on port 8856. Pull a model once it's running:
docker compose exec ollama ollama pull llama3.1:8b
To use premium models alongside Ollama, create a .env file with your API keys and model config (see .env.example), then restart.
To run NadirClaw standalone (without Ollama):
docker build -t nadirclaw .
docker run -p 8856:8856 --env-file .env nadirclaw
Configure
Environment File
NadirClaw loads configuration from ~/.nadirclaw/.env. Create or edit this file to set API keys and model preferences:
# ~/.nadirclaw/.env
# API keys (set the ones you use)
GEMINI_API_KEY=AIza...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Model routing
NADIRCLAW_SIMPLE_MODEL=gemini-3-flash-preview
NADIRCLAW_COMPLEX_MODEL=gemini-2.5-pro
# Server
NADIRCLAW_PORT=8856
If ~/.nadirclaw/.env does not exist, NadirClaw falls back to .env in the current directory.
Authentication
NadirClaw supports multiple ways to provide LLM credentials, checked in this order:
- OpenClaw stored token (`~/.openclaw/agents/main/a
