
UncommonRoute

A local LLM router that intelligently dispatches AI requests to the right model — saving cost without sacrificing quality.

Install / Use

/learn @CommonstackAI/UncommonRoute

<p align="right"><strong>English</strong> | <a href="https://github.com/CommonstackAI/UncommonRoute/blob/main/README.zh-CN.md">简体中文</a></p> <div align="center"> <h1>UncommonRoute</h1> <p><strong>Route prompts by difficulty, not habit.</strong></p> <p> UncommonRoute is a local LLM router that sits between your client and your upstream API. Easy turns go cheap, hard turns go strong, and fallback chains are ready when the first choice fails. </p> <p> Built for <strong>Codex</strong>, <strong>Claude Code</strong>, <strong>Cursor</strong>, the <strong>OpenAI SDK</strong>, and <strong>OpenClaw</strong>. </p> <p> <a href="#quick-start"><strong>Quick Start</strong></a> · <a href="#how-routing-works"><strong>How It Works</strong></a> · <a href="#configuration-that-actually-matters"><strong>Configuration</strong></a> · <a href="#detailed-benchmarks"><strong>Benchmarks</strong></a> </p>

<a href="https://python.org"><img src="https://img.shields.io/badge/Python-3.11+-3776ab?style=for-the-badge&logo=python&logoColor=white" alt="Python 3.11+"></a>  <a href="LICENSE"><img src="https://img.shields.io/badge/License-Modified_MIT-22c55e?style=for-the-badge" alt="Modified MIT"></a>  <a href="https://github.com/CommonstackAI/UncommonRoute/actions/workflows/ci.yml"><img src="https://github.com/CommonstackAI/UncommonRoute/actions/workflows/ci.yml/badge.svg" alt="CI"></a>  <a href="#quick-start"><img src="https://img.shields.io/badge/Claude_Code-Ready-f97316?style=for-the-badge&logo=anthropic&logoColor=white" alt="Claude Code"></a>  <a href="#quick-start"><img src="https://img.shields.io/badge/Codex-Ready-412991?style=for-the-badge&logo=openai&logoColor=white" alt="Codex"></a>  <a href="#quick-start"><img src="https://img.shields.io/badge/Cursor-Compatible-007acc?style=for-the-badge&logo=visual-studio-code&logoColor=white" alt="Cursor"></a>  <a href="https://openclaw.ai"><img src="https://img.shields.io/badge/OpenClaw-Plugin-e11d48?style=for-the-badge" alt="OpenClaw"></a>

</div>

## The Expensive Default

Most AI tools make one bad assumption: every request deserves the same model.

That works until your workflow starts spending premium-model money on:

  • "what is 2+2?"
  • tool selection
  • log summarization
  • boring middle turns in an agent loop

UncommonRoute is the small local layer that changes that default.

```
Your client
  (Codex / Claude Code / Cursor / OpenAI SDK / OpenClaw)
            |
            v
     UncommonRoute
   (runs on your machine)
            |
            v
    Your upstream API
 (Commonstack / OpenAI / Ollama / vLLM / Parallax / ...)
```

It does not host models. It makes a fast local routing decision, forwards the request to your chosen upstream, and keeps enough fallback logic around to recover when upstream model names or availability do not line up cleanly.


## Why It Is Worth Trying

The pitch is simple: keep one local endpoint, let the router decide when a strong model is actually worth paying for.

  • ~90-95% cost savings in real Claude Code / OpenClaw sessions versus always using premium models
  • Zero keyword lists — the classifier uses structural features + n-gram learning, no hardcoded patterns
  • Benchmark-driven quality — model quality from PinchBench replaces price-based assumptions
  • Thompson Sampling — natural exploration-exploitation balance across the model pool
  • 3 feedback clicks to change routing — user feedback takes effect immediately
  • 341 passing tests

That is the core story of the project: spend premium-model money where it changes the answer, not where it just burns the budget.


## Quick Start

If you are brand new, do these in order.

### 1. Install

```bash
pip install uncommon-route
```

Optional convenience installer. Review the script first if you are security-conscious:

```bash
curl -fsSL https://anjieyang.github.io/uncommon-route/install | bash
```

### 2. Prove the router works locally first

This step does not need a real upstream or API key.

```bash
uncommon-route route "write a Python function that validates email addresses"
uncommon-route debug "prove that sqrt(2) is irrational"
```

What this proves:

  • the package is installed
  • the local classifier works
  • the router can produce a tier, model choice, and fallback chain

What this does not prove:

  • your upstream is configured
  • your client is connected through the proxy

### 3. Point it at a real upstream

Pick one example and export the variables.

```bash
# Commonstack: one key, many providers
export UNCOMMON_ROUTE_UPSTREAM="https://api.commonstack.ai/v1"
export UNCOMMON_ROUTE_API_KEY="csk-..."

# OpenAI direct
export UNCOMMON_ROUTE_UPSTREAM="https://api.openai.com/v1"
export UNCOMMON_ROUTE_API_KEY="sk-..."

# Local OpenAI-compatible servers (Ollama, vLLM, etc.)
export UNCOMMON_ROUTE_UPSTREAM="http://127.0.0.1:11434/v1"

# Parallax scheduler endpoint (experimental)
export UNCOMMON_ROUTE_UPSTREAM="http://127.0.0.1:3001/v1"
```

If your upstream does not need a key, you can skip UNCOMMON_ROUTE_API_KEY.

Treat Parallax as experimental here: its public docs expose POST /v1/chat/completions, but /v1/models support is less clear, so discovery-driven routing may be limited.

### 4. Start the proxy

```bash
uncommon-route serve
```

If the upstream is configured, the startup banner shows:

  • the upstream host
  • the local proxy URL
  • the dashboard URL
  • a quick health-check command

If the upstream is missing, the banner tells you exactly which environment variables to set next.

### 5. Connect the client you already use

Pick the path that matches your workflow.

<details> <summary><strong>Codex</strong> · OpenAI-compatible local routing</summary>

```bash
uncommon-route setup codex
```

Manual version:

```bash
export OPENAI_BASE_URL="http://localhost:8403/v1"
export OPENAI_API_KEY="not-needed"
```

Then:

```bash
uncommon-route serve
codex
```

For smart routing, set:

```toml
model = "uncommon-route/auto"
```
</details> <details> <summary><strong>Claude Code</strong> · Anthropic-style local routing</summary>

```bash
uncommon-route setup claude-code
```

Manual version:

```bash
export ANTHROPIC_BASE_URL="http://localhost:8403"
export ANTHROPIC_API_KEY="not-needed"
```

Then:

```bash
uncommon-route serve
claude
```

Claude Code talks to /v1/messages. UncommonRoute accepts Anthropic-style requests, routes them, and converts the response shape back transparently.
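As a mental model only (this is a sketch, not the project's converter), the request-shape translation looks roughly like this; the field names come from the two public APIs:

```python
# Illustrative sketch of translating an Anthropic-style /v1/messages payload
# into an OpenAI-style /v1/chat/completions payload.
def anthropic_to_openai(payload: dict) -> dict:
    messages = []
    if "system" in payload:  # Anthropic carries the system prompt top-level
        messages.append({"role": "system", "content": payload["system"]})
    messages.extend(payload["messages"])
    return {
        "model": payload["model"],
        "messages": messages,
        "max_tokens": payload["max_tokens"],
    }


request = {
    "model": "uncommon-route/auto",
    "max_tokens": 256,
    "system": "Be concise.",
    "messages": [{"role": "user", "content": "hello"}],
}
converted = anthropic_to_openai(request)
```

The proxy does the reverse translation on the response so Claude Code never notices the detour.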

</details> <details> <summary><strong>OpenAI SDK / Cursor</strong> · One local OpenAI-compatible base URL</summary>

```bash
uncommon-route setup openai
```

Python example:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8403/v1",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="uncommon-route/auto",
    messages=[{"role": "user", "content": "hello"}],
)
```

For Cursor, point "OpenAI Base URL" to http://localhost:8403/v1.

</details> <details> <summary><strong>OpenClaw</strong> · Plugin-based integration</summary>

```bash
openclaw plugins install @anjieyang/uncommon-route
openclaw gateway restart
```

The plugin starts the proxy for you, registers a local OpenClaw provider, and syncs the discovered upstream pool into OpenClaw once /v1/models/mapping is available.

The config-patch fallback is static by nature, so it only registers the virtual routing IDs.

Example plugin config:

```yaml
plugins:
  entries:
    "@anjieyang/uncommon-route":
      port: 8403
      upstream: "https://api.commonstack.ai/v1"
      spendLimits:
        hourly: 5.00
        daily: 20.00
```

If your upstream needs authentication, set UNCOMMON_ROUTE_API_KEY in the environment where OpenClaw runs.

</details>

### 6. Verify end to end

```bash
uncommon-route doctor
curl http://127.0.0.1:8403/health
```

When something feels off, `uncommon-route doctor` should almost always be the first command you run.


## How Routing Works

You do not need to understand every internal detail to use the project, but the mental model matters.

### 1. Continuous difficulty, not discrete tiers

The classifier estimates a continuous difficulty score (0.0–1.0) from structural features and n-gram patterns. No keyword lists, no hardcoded rules. The score drives model selection through a quality prediction formula — there are no fixed tier boundaries in the routing logic.

Tiers (SIMPLE / MEDIUM / COMPLEX) still appear in logs, headers, and the dashboard, but they are display labels derived from the continuous score, not routing decisions.
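A toy illustration of that split, with invented 0.33 / 0.66 cut points (the real display mapping may differ):

```python
# Display-label sketch only: the cut points below are made up for this
# example. The router selects models from the continuous score itself,
# not from these buckets.
def tier_label(difficulty: float) -> str:
    """Map a continuous difficulty score in [0, 1] to a log/dashboard label."""
    if difficulty < 0.33:
        return "SIMPLE"
    if difficulty < 0.66:
        return "MEDIUM"
    return "COMPLEX"
```

Two prompts with scores 0.30 and 0.35 would get different labels but nearly identical routing behavior, which is why the labels are cosmetic.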

### 2. Routing mode changes quality-vs-cost preference

| Mode | What it optimizes for |
| --- | --- |
| auto | balanced — best quality-per-dollar, adapts with difficulty |
| fast | cost-dominant — cheapest acceptable model |
| best | quality-dominant — highest quality, cost nearly ignored |

These show up as virtual model IDs:

  • uncommon-route/auto
  • uncommon-route/fast
  • uncommon-route/best

Only these virtual IDs trigger routing. Explicit real model IDs still pass through unchanged.

The quality-vs-cost weight automatically increases with task difficulty: harder tasks prioritize quality more, even in auto mode.
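The two rules above can be sketched together. This is not the project's actual formula; the per-mode base weights and the blending rule are invented for illustration, and only the shape of the behavior (passthrough for real IDs, quality weight rising with difficulty) comes from the text:

```python
# Sketch of mode- and difficulty-weighted selection. Base weights and the
# scoring rule are hypothetical.
VIRTUAL_MODES = {
    "uncommon-route/fast": 0.2,
    "uncommon-route/auto": 0.5,
    "uncommon-route/best": 0.9,
}


def pick_model(model_id: str, difficulty: float, pool: dict) -> str:
    """pool maps model name -> (quality in [0, 1], cost per 1M tokens)."""
    if model_id not in VIRTUAL_MODES:
        return model_id  # explicit real model IDs pass through unchanged
    base = VIRTUAL_MODES[model_id]
    w = base + (1 - base) * difficulty  # quality weight grows with difficulty
    max_cost = max(c for _, c in pool.values())

    def score(entry):
        quality, cost = entry
        return w * quality - (1 - w) * (cost / max_cost)

    return max(pool, key=lambda name: score(pool[name]))


pool = {"cheap": (0.6, 0.5), "strong": (0.95, 15.0)}
```

Even in the cost-dominant `fast` mode, a difficulty near 1.0 drives the weight toward quality, so a genuinely hard prompt still reaches the strong model.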

### 3. Benchmark-driven quality, not price-based

Model quality comes from real benchmark data (PinchBench agent task scores), not from price assumptions. Quality scores are blended with observed experience through Bayesian updating — the system starts from benchmark data and adapts to real-world performance over time.

The selector uses Thompson Sampling (Beta distribution per model) for natural exploration-exploitation balance. Models with fewer observations have wider distributions, giving them chances to prove themselves.
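A toy version of that loop (the real selector's state and priors differ): each model keeps a Beta(successes + 1, failures + 1) posterior, we draw one sample per model, and the largest draw wins, so an under-observed model's wide posterior keeps giving it real chances.

```python
# Toy Thompson Sampling selector over a pool of models.
import random


def thompson_pick(stats: dict, rng=random) -> str:
    """stats maps model name -> (successes, failures); returns the chosen model."""
    draws = {m: rng.betavariate(s + 1, f + 1) for m, (s, f) in stats.items()}
    return max(draws, key=draws.get)


rng = random.Random(7)
stats = {
    "small-model": (40, 10),  # well observed, fairly good
    "big-model": (5, 1),      # few observations, wide posterior
    "new-model": (0, 0),      # no observations: Beta(1, 1), i.e. uniform
}
picks = [thompson_pick(stats, rng) for _ in range(1000)]
```

Over many turns the counts sharpen each posterior, so exploration fades naturally as evidence accumulates; no separate epsilon or schedule is needed.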

### 4. Three layers of learning

| Layer | Source | What it learns |
| --- | --- | --- |
| Benchmark prior | Pi

No findings