
Headroom

The Context Optimization Layer for LLM Applications


<p align="center">
  <h1 align="center">Headroom</h1>
  <p align="center">
    <strong>Compress everything your AI agent reads. Same answers, fraction of the tokens.</strong>
  </p>
  <p align="center">
    Every tool call, DB query, file read, and RAG retrieval your agent makes is 70-95% boilerplate.<br>
    Headroom compresses it away before it hits the model.<br><br>
    Works with <b>any agent</b> — coding agents (Claude Code, Codex, Cursor, Aider), custom agents<br>
    (LangChain, LangGraph, Agno, Strands, OpenClaw), or your own Python and TypeScript code.
  </p>
</p>
<p align="center">
  <a href="https://github.com/chopratejas/headroom/actions/workflows/ci.yml">
    <img src="https://github.com/chopratejas/headroom/actions/workflows/ci.yml/badge.svg" alt="CI">
  </a>
  <a href="https://pypi.org/project/headroom-ai/">
    <img src="https://img.shields.io/pypi/v/headroom-ai.svg" alt="PyPI">
  </a>
  <a href="https://pypi.org/project/headroom-ai/">
    <img src="https://img.shields.io/pypi/pyversions/headroom-ai.svg" alt="Python">
  </a>
  <a href="https://pypistats.org/packages/headroom-ai">
    <img src="https://img.shields.io/pypi/dm/headroom-ai.svg" alt="Downloads">
  </a>
  <a href="https://www.npmjs.com/package/headroom-ai">
    <img src="https://img.shields.io/npm/v/headroom-ai.svg" alt="npm">
  </a>
  <a href="https://github.com/chopratejas/headroom/blob/main/LICENSE">
    <img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License">
  </a>
  <a href="https://chopratejas.github.io/headroom/">
    <img src="https://img.shields.io/badge/docs-GitHub%20Pages-blue.svg" alt="Documentation">
  </a>
  <a href="https://discord.gg/yRmaUNpsPJ">
    <img src="https://img.shields.io/badge/Discord-Join%20us-5865F2?logo=discord&logoColor=white" alt="Discord">
  </a>
</p>

Where Headroom Fits

Your Agent / App
  (coding agents, customer support bots, RAG pipelines,
   data analysis agents, research agents, any LLM app)
      │
      │  tool calls, logs, DB reads, RAG results, file reads, API responses
      ▼
   Headroom  ← proxy, Python/TypeScript SDK, or framework integration
      │
      ▼
 LLM Provider  (OpenAI, Anthropic, Google, Bedrock, 100+ via LiteLLM)

Headroom sits between your application and the LLM provider. It intercepts requests, compresses the context, and forwards an optimized prompt. Use it as a transparent proxy (zero code changes), a Python function (compress()), or a framework integration (LangChain, LiteLLM, Agno).

What gets compressed

Headroom optimizes any data your agent injects into a prompt:

  • Tool outputs — shell commands, API calls, search results
  • Database queries — SQL results, key-value lookups
  • RAG retrievals — document chunks, embeddings results
  • File reads — code, logs, configs, CSVs
  • API responses — JSON, XML, HTML
  • Conversation history — long agent sessions with repetitive context

Quick Start

Python:

pip install "headroom-ai[all]"

TypeScript / Node.js:

npm install headroom-ai

Any agent — one function

Python:

import anthropic
from headroom import compress

client = anthropic.Anthropic()

# Compress the message list before sending it to the model.
result = compress(messages, model="claude-sonnet-4-5-20250929")
response = client.messages.create(model="claude-sonnet-4-5-20250929", max_tokens=1024, messages=result.messages)
print(f"Saved {result.tokens_saved} tokens ({result.compression_ratio:.0%})")

TypeScript:

import OpenAI from 'openai';
import { compress } from 'headroom-ai';

const openai = new OpenAI();
const result = await compress(messages, { model: 'gpt-4o' });
const response = await openai.chat.completions.create({ model: 'gpt-4o', messages: result.messages });
console.log(`Saved ${result.tokensSaved} tokens`);

Works with any LLM client — Anthropic, OpenAI, LiteLLM, Bedrock, Vercel AI SDK, or your own code.

Any agent — proxy (zero code changes)

headroom proxy --port 8787
# Point any LLM client at the proxy
ANTHROPIC_BASE_URL=http://localhost:8787 your-app
OPENAI_BASE_URL=http://localhost:8787/v1 your-app

Works with any language, any tool, any framework. Proxy docs

Coding agents — one command

headroom wrap claude       # Starts proxy + launches Claude Code
headroom wrap codex        # Starts proxy + launches OpenAI Codex CLI
headroom wrap aider        # Starts proxy + launches Aider
headroom wrap cursor       # Starts proxy + prints Cursor config

Headroom starts a proxy, points your tool at it, and compresses everything automatically.

Multi-agent — SharedContext

from headroom import SharedContext

ctx = SharedContext()
ctx.put("research", big_agent_output)      # Agent A stores (compressed)
summary = ctx.get("research")               # Agent B reads (~80% smaller)
full = ctx.get("research", full=True)       # Agent B gets original if needed

Compress what moves between agents — any framework. SharedContext Guide
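
The pattern is easy to picture with a toy in-memory version. This is illustrative only: `ToySharedContext` and its truncation-based `_summarize` are stand-ins for Headroom's actual classes and compression, which the source does not detail.

```python
class ToySharedContext:
    """Minimal sketch of the SharedContext pattern: store full payloads,
    hand out compressed summaries by default, originals on request."""

    def __init__(self, max_summary_chars=200):
        self._store = {}
        self.max_summary_chars = max_summary_chars

    def _summarize(self, text):
        # Stand-in for real compression: keep head and tail, note omission.
        if len(text) <= self.max_summary_chars:
            return text
        half = self.max_summary_chars // 2
        omitted = len(text) - 2 * half
        return f"{text[:half]} ...[{omitted} chars omitted]... {text[-half:]}"

    def put(self, key, value):
        self._store[key] = value

    def get(self, key, full=False):
        value = self._store[key]
        return value if full else self._summarize(value)


ctx = ToySharedContext()
ctx.put("research", "x" * 10_000)   # Agent A stores the full output
summary = ctx.get("research")        # Agent B reads a compact view
full = ctx.get("research", full=True)  # original is never lost
```

The key design point is that `get` defaults to the cheap view while the original stays retrievable, so downstream agents pay full token cost only when they actually need the detail.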

MCP Tools (Claude Code, Cursor)

headroom mcp install && claude

Gives your AI tool three MCP tools: headroom_compress, headroom_retrieve, headroom_stats. MCP Guide

Drop into your existing stack

| Your setup | Add Headroom | One-liner |
|------------|--------------|-----------|
| Any Python app | `compress()` | `result = compress(messages, model="gpt-4o")` |
| Any TypeScript app | `compress()` | `const result = await compress(messages, { model: 'gpt-4o' })` |
| Vercel AI SDK | Middleware | `wrapLanguageModel({ model, middleware: headroomMiddleware() })` |
| OpenAI Node SDK | Wrap client | `const client = withHeadroom(new OpenAI())` |
| Anthropic TS SDK | Wrap client | `const client = withHeadroom(new Anthropic())` |
| Multi-agent | SharedContext | `ctx = SharedContext(); ctx.put("key", data)` |
| LiteLLM | Callback | `litellm.callbacks = [HeadroomCallback()]` |
| Any Python proxy | ASGI Middleware | `app.add_middleware(CompressionMiddleware)` |
| Agno agents | Wrap model | `HeadroomAgnoModel(your_model)` |
| LangChain | Wrap model | `HeadroomChatModel(your_llm)` |
| OpenClaw | ContextEngine plugin | `openclaw plugins install headroom-openclaw` |
| Claude Code | Wrap | `headroom wrap claude` |
| Codex / Aider | Wrap | `headroom wrap codex` or `headroom wrap aider` |

Full Integration Guide | TypeScript SDK


Demo

<p align="center"> <img src="Headroom-2.gif" alt="Headroom Demo" width="800"> </p>

Does It Actually Work?

100 production log entries. One critical error buried at position 67.

| | Baseline | Headroom |
|--|----------|----------|
| Input tokens | 10,144 | 1,260 |
| Correct answers | 4/4 | 4/4 |

Both responses: "payment-gateway, error PG-5523, fix: Increase max_connections to 500, 1,847 transactions affected."

87.6% fewer tokens. Same answer. Run it: python examples/needle_in_haystack_test.py

<details> <summary><b>What Headroom kept</b></summary>

From 100 log entries, SmartCrusher kept 6: first 3 (boundary), the FATAL error at position 67 (anomaly detection), and last 2 (recency). The error was automatically preserved — not by keyword matching, but by statistical analysis of field variance.

</details>
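
The idea of selecting by statistical rarity rather than keyword matching can be sketched roughly as follows. This is an illustration of the concept only; `select_entries`, its scoring function, and its defaults are hypothetical, not SmartCrusher's actual algorithm.

```python
from collections import Counter

def select_entries(entries, head=3, tail=2, keep_outliers=1):
    """Keep boundary entries plus the statistically rarest ones.

    Each entry is a dict of fields; an entry whose field values are
    rare across the batch (e.g. one FATAL among 99 INFOs) scores high
    without any hardcoded keyword list.
    """
    # Frequency of every (field, value) pair across the whole batch.
    freq = Counter((k, v) for e in entries for k, v in e.items())
    n = len(entries)

    def rarity(entry):
        # Rare field values contribute close to 1, common ones close to 0.
        return sum(1 - freq[(k, v)] / n for k, v in entry.items())

    middle = range(head, n - tail)
    outliers = sorted(middle, key=lambda i: rarity(entries[i]), reverse=True)[:keep_outliers]
    kept = sorted(set(range(head)) | set(outliers) | set(range(n - tail, n)))
    return [entries[i] for i in kept]

logs = [{"level": "INFO", "service": "web"} for _ in range(100)]
logs[67] = {"level": "FATAL", "service": "payment-gateway"}
kept = select_entries(logs)
# The FATAL entry at position 67 survives because its field values are
# rare in the batch, not because "FATAL" was on a keyword list.
```

Under these toy defaults the function keeps exactly 6 of the 100 entries: the first 3, the anomaly, and the last 2, mirroring the behavior described above.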

Real Workloads

| Scenario | Before (tokens) | After (tokens) | Savings |
|----------|-----------------|----------------|---------|
| Code search (100 results) | 17,765 | 1,408 | 92% |
| SRE incident debugging | 65,694 | 5,118 | 92% |
| Codebase exploration | 78,502 | 41,254 | 47% |
| GitHub issue triage | 54,174 | 14,761 | 73% |

Accuracy Benchmarks

Compression preserves accuracy — tested on real OSS benchmarks.

Standard Benchmarks — Baseline (direct to API) vs Headroom (through proxy):

| Benchmark | Category | N | Baseline | Headroom | Delta |
|-----------|----------|---|----------|----------|-------|
| GSM8K | Math | 100 | 0.870 | 0.870 | 0.000 |
| TruthfulQA | Factual | 100 | 0.530 | 0.560 | +0.030 |

Compression Benchmarks — Accuracy after full compression stack:

| Benchmark | Category | N | Accuracy | Compression | Method |
|-----------|----------|---|----------|-------------|--------|
| SQuAD v2 | QA | 100 | 97% | 19% | Before/After |
| BFCL | Tool/Function | 100 | 97% | 32% | LLM-as-Judge |
| Tool Outputs (built-in) | Agent | 8 | 100% | 20% | Before/After |
| CCR Needle Retention | Lossless | 50 | 100% | 77% | Exact Match |

Run it yourself:

# Quick smoke test (8 cases, ~10s)
python -m headroom.evals quick -n 8 --provider openai --model gpt-4o-mini

# Full Tier 1 suite (~$3, ~15 min)
python -m headroom.evals suite --tier 1 -o eval_results/

# CI mode (exit 1 on regression)
python -m headroom.evals suite --tier 1 --ci

Full methodology: Benchmarks | Evals Framework


Key Capabilities

Lossless Compression

Headroom never throws data away. It compresses aggressively, stores the originals, and gives the LLM a tool to retrieve full details when needed. When it compresses 500 items to 20, it tells the model what was omitted ("87 passed, 2 failed, 1 error") so the model knows when to ask for more.
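
That contract can be sketched with toy code: compress a batch down to the interesting items, but attach a summary of what was dropped, and keep the originals retrievable. `compress_results` is a hypothetical helper for illustration, not Headroom's API.

```python
from collections import Counter

def compress_results(results, keep=3):
    """Keep a few notable items, summarize the rest by status, and
    retain the originals so nothing is silently lost."""
    # Prefer the interesting items (anything that didn't pass).
    kept = [r for r in results if r["status"] != "passed"][:keep]
    # Tell the model what was omitted, so it knows when to ask for more.
    omitted = Counter(r["status"] for r in results if r not in kept)
    summary = ", ".join(f"{n} {s}" for s, n in sorted(omitted.items()))
    return {"kept": kept, "omitted_summary": summary, "originals": results}

results = (
    [{"id": i, "status": "passed"} for i in range(87)]
    + [{"id": 87, "status": "failed"}, {"id": 88, "status": "failed"}]
    + [{"id": 89, "status": "error"}]
)
out = compress_results(results)
# The model sees 3 items plus "87 passed", and can still request the
# full 90-item payload if it needs more detail.
```

The omission summary is what makes the scheme effectively lossless in practice: the model is told exactly what it is not seeing, instead of being left to guess.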

Smart Content Detection

Auto-detects what's in your context — JSON arrays, code, logs, plain text — and routes each to the best compressor. JSON goes to SmartCrusher, code goes through AST-aware compression (Python, JS, Go, Rust, Java, C++), text goes to Kompress (ModernBERT-based, with [ml] extra).
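
A rough sketch of such routing, for intuition only: `detect_content_type` below uses crude heuristics and is not Headroom's actual detector, which the source describes only at a high level.

```python
import json

def detect_content_type(text):
    """Toy classifier for routing context to a compressor."""
    stripped = text.strip()
    # JSON objects/arrays -> structured compressor (SmartCrusher-style).
    if stripped[:1] in "[{":
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass
    # Lines that all start with a log level -> log-aware handling.
    first_tokens = {line.split(" ")[0] for line in stripped.splitlines() if line}
    if first_tokens and first_tokens <= {"DEBUG", "INFO", "WARN", "ERROR", "FATAL"}:
        return "logs"
    # Code keywords -> AST-aware compression.
    if any(kw in stripped for kw in ("def ", "class ", "function ", "import ")):
        return "code"
    # Everything else -> ML-based text compression.
    return "text"
```

A real detector would be far more robust (partial JSON, mixed content, many languages), but the routing structure, classify once, then dispatch to a specialized compressor, is the point being illustrated.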

Cache Optimization

Stabilizes message prefixes so your provider's KV cache actually works. Claude offers a 90% read discount on cached prefixes — but almost no framework takes advantage of it. Headroom does.
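
The prefix-stabilization idea can be sketched as follows. This is a hypothetical helper, not Headroom's actual transform: the principle is that the system prompt and history must stay byte-identical across calls, with per-request material appended after the cacheable boundary.

```python
def build_messages(system_prompt, history, dynamic_context):
    """Stable-prefix message builder: the system prompt and history form
    a byte-identical prefix across calls; anything per-request (fresh
    tool output, timestamps) goes in a trailing message instead of being
    interpolated up front, so the provider's prefix cache can hit."""
    prefix = [{"role": "system", "content": system_prompt}] + list(history)
    suffix = [{"role": "user", "content": dynamic_context}]
    return prefix + suffix

history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
]
a = build_messages("You are a helpful agent.", history, "It is 09:00. Summarize the log.")
b = build_messages("You are a helpful agent.", history, "It is 09:05. Summarize the log.")
# The first three messages are identical across the two calls, so a
# provider caching by prefix can reuse them; only the last one differs.
```

Interpolating a timestamp directly into the system prompt would change the very first bytes of every request and defeat prefix caching entirely, which is why the dynamic content goes last.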
