Wet Claude


Wringing Excess Tokens Claude

API proxy for Claude Code — teach your Claude to optimize its own context in a meta-transparent way.

[wet demo GIF]

Your Claude is running dry. Make it wet.

wet claude --dangerously-skip-permissions   # Claude takes it from here.

Why This Exists

Autocompact is brutal. It hits at the worst moments - mid-swarm, mid-experiment - and when it fires, it's all or nothing. Context gets shredded indiscriminately. In-flight work goes off the rails. Sessions derail. I've had a Mac Mini spiral to 60GB of swap from the fallout.

So I audited thousands of tool calls across my Claude Code sessions. The culprit was obvious: 82% of context bloat is stale tool results - old git status outputs, spent pytest runs, massive grep dumps you already acted on, 30k-token agent returns you'll never look at again. They sit there, rotting, pushing you toward the autocompact cliff.

The problem: there's no hook to intercept tool results before they enter context. I checked Claude Code and Codex - nothing. I opened a feature request. I forked Codex and wired in my own compression hooks. I tried JSONL manipulation. Too dirty.

Then the insight: reverse proxy. A Go shim that sits between Claude Code and api.anthropic.com, intercepts every POST /v1/messages, and compresses stale tool results in-place before they reach the API. No client patches. No prompt wrappers. Clean.
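The interception idea can be sketched with Go's standard reverse proxy. This is a minimal stand-in, not wet's actual code: compressStale here just collapses whitespace so the rewrite is observable, and an echo server stands in for api.anthropic.com.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strconv"
	"strings"
)

// compressStale is a placeholder for wet's real rewriting logic; here it
// merely collapses runs of whitespace so the demo has a visible effect.
func compressStale(body []byte) []byte {
	return []byte(strings.Join(strings.Fields(string(body)), " "))
}

// newWetLikeProxy forwards everything to upstream, rewriting the body of
// POST /v1/messages in-place before it leaves the machine.
func newWetLikeProxy(upstream *url.URL) http.Handler {
	rp := httputil.NewSingleHostReverseProxy(upstream)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodPost && r.URL.Path == "/v1/messages" {
			if body, err := io.ReadAll(r.Body); err == nil {
				rewritten := compressStale(body)
				r.Body = io.NopCloser(bytes.NewReader(rewritten))
				r.ContentLength = int64(len(rewritten))
				r.Header.Set("Content-Length", strconv.Itoa(len(rewritten)))
			}
		}
		rp.ServeHTTP(w, r)
	})
}

func main() {
	// Fake upstream standing in for api.anthropic.com: reports what it received.
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		fmt.Fprintf(w, "upstream saw %d bytes", len(body))
	}))
	defer upstream.Close()

	u, _ := url.Parse(upstream.URL)
	proxy := httptest.NewServer(newWetLikeProxy(u))
	defer proxy.Close()

	// The client sends a padded 21-byte body; the proxy shrinks it in transit.
	resp, _ := http.Post(proxy.URL+"/v1/messages", "application/json",
		strings.NewReader("{  \"messages\":   [] }"))
	reply, _ := io.ReadAll(resp.Body)
	fmt.Println(string(reply)) // prints "upstream saw 18 bytes"
}
```

The client never knows the proxy exists; only the bytes reaching the API change.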

But deterministic compression alone wasn't enough - it handles Bash outputs well, but agent returns and file reads need semantic understanding. So I flipped the script: instead of just compressing mechanically, put Claude in the driver's seat. Let it profile its own context, decide what's stale, and surgically rewrite its own tool results with a Sonnet subagent. Meta-compression - Claude optimizing Claude's context.

The result: instead of autocompact's sledgehammer, you get a scalpel. Claude thinks clearer with a lean context. Token savings compound across long sessions. Same work, half the noise.


What It Is

Put Claude in the driver's seat for context optimization.

One Go binary, one Claude Code skill. A toolbox and a manual.

wet is a toolbox for agents. It gives Claude (or any agent sitting on top of Claude Code) surgical access to its own context - the ability to see exactly how much each tool result block consumes, profile the entire session's token distribution, and replace any block with either deterministic compression or a meta-aware subagent rewrite.

The Go proxy is the toolbox. It sits between Claude Code and the API, intercepts every POST /v1/messages, and exposes a full control plane:

# Launch — works with --resume, --dangerously-skip-permissions, or both
wet claude [args...]                    # start Claude Code through the proxy
wet claude --resume <session-id>        # resume a previous session through wet
wet claude --dangerously-skip-permissions  # autonomous mode through wet
wet serve --host 0.0.0.0 --mode auto   # standalone proxy for Docker / IDE extension

# Observe
wet ps [--all]                          # list all active wet sessions
wet status [--json]                     # context profile: fill%, token counts, compressible items
wet inspect [--json] [--full]           # every tool result block with token count, age, staleness

# Surgical compression (port auto-discovered — run from inside the wet session or its subagents)
wet compress --ids id1,id2,...          # replace specific blocks — deterministic or with replacement text
wet compress --text-file plan.json     # batch replacement with LLM-rewritten content
wet compress --dry-run --ids ...       # preview what would change without applying

# Runtime control
wet pause                               # bypass all compression (accounting still runs)
wet resume                              # re-enable compression
wet rules list                          # show active compression rules
wet rules set KEY VALUE                 # tune thresholds at runtime

# Session forensics
wet session profile --jsonl <PATH>      # context composition analysis from session trace
wet session salt                        # session self-identification token
wet data status                         # offline storage stats
wet data inspect [--all]                # browse persisted compressed items
wet data diff <turn>                    # what changed at a specific turn

wet compress and the control commands auto-discover the proxy port via the WET_PORT environment variable — no manual port wiring needed. These commands are designed to be called by Claude from inside a wet session (or by subagents that inherit the environment).

Each tool result becomes a first-class object. You can see it, measure it, and replace it. Deterministic compression is calibrated on SWE-bench (91.2% ratio across 13,881 outputs, <5ms overhead) and understands 10 tool families natively: git, pytest, cargo, npm, pip, docker, make, ls/find, and more.

Per-item token counts are estimated from content length (chars/4 heuristic — no external tokenizer dependency). Session-level fill% and savings come from Anthropic's actual token counts in the API response — ground truth, not estimates.
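The chars/4 heuristic is small enough to show whole. Rounding up is an assumption of this sketch (so tiny blocks don't estimate to zero); the README only specifies the chars/4 ratio.

```go
package main

import "fmt"

// estimateTokens approximates a block's token count from its byte length
// using the chars/4 heuristic, rounded up. No tokenizer dependency.
func estimateTokens(content string) int {
	return (len(content) + 3) / 4
}

func main() {
	out := "On branch main\nnothing to commit, working tree clean\n"
	fmt.Printf("%d chars ≈ %d tokens\n", len(out), estimateTokens(out))
}
```

Good enough for ranking blocks by size; the session-level numbers that matter for fill% come from the API's real counts.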

Auto mode and rules. wet can run fully automatic — mode = "auto" in the config makes the proxy compress stale Bash outputs deterministically on every request without Claude lifting a finger. The rules engine controls staleness thresholds per tool family (wet rules list, wet rules set), minimum savings gates, and bypass conditions. You tune the rules, wet enforces them. See Configuration for the full config file.
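A rule check along the lines described might look like this. The field names and exact gate logic are hypothetical; only the knobs themselves (per-family staleness thresholds, a minimum-savings gate, pause/resume bypass) come from the text above.

```go
package main

import "fmt"

// Rules is a hypothetical in-memory shape for the rules engine: per tool
// family, how many turns old a result must be to count as stale, plus a
// global minimum-savings gate and the pause flag.
type Rules struct {
	StaleAfterTurns  map[string]int // staleness threshold per tool family
	MinSavingsTokens int            // skip compressions that save too little
	Paused           bool           // wet pause / wet resume
}

// compressible applies the gates: the tool family must have a threshold,
// the item must be old enough, and the savings must clear the minimum.
func (r Rules) compressible(tool string, ageTurns, estSavings int) bool {
	if r.Paused {
		return false
	}
	threshold, ok := r.StaleAfterTurns[tool]
	if !ok {
		return false
	}
	return ageTurns >= threshold && estSavings >= r.MinSavingsTokens
}

func main() {
	r := Rules{
		StaleAfterTurns:  map[string]int{"Bash": 3, "Read": 5},
		MinSavingsTokens: 200,
	}
	fmt.Println(r.compressible("Bash", 4, 1500)) // stale and worth it: true
	fmt.Println(r.compressible("Bash", 1, 1500)) // too fresh: false
	fmt.Println(r.compressible("Read", 9, 50))   // below the savings gate: false
}
```

Runtime tuning via wet rules set then amounts to mutating this table in the live proxy.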

The skill is the manual. It teaches Claude the meta game — how to use the toolbox on itself:

1. Profile — run wet status, see context fill, token distribution, what's compressible vs sacred.

2. Propose — inspect individual blocks, classify each one (mechanical Bash compression vs LLM-guided rewrite for agent returns and file reads), build a compression plan with expected savings.

3. Process — execute the plan. Bash outputs get deterministic Tier 1 compression. Agent returns and search results get rewritten by a Sonnet subagent that preserves semantic content while cutting 80-90% of tokens.

Here's what Claude sees when it profiles a real session (this README was written in it):

┌──────────────────────────────────────────────────────────────────────┐
│  Tool             Items    Tokens   Stale   Status                  │
├──────────────────────────────────────────────────────────────────────┤
│  Read               13    33.7k    13/13   ██████████████████  80%  │
│  Agent               6     3.5k     6/6    ████░░░░░░░░░░░░░░   8%  │
│  Bash               12     3.1k     9/12   ███░░░░░░░░░░░░░░░   7%  │
│  Grep                2     1.2k     2/2    █░░░░░░░░░░░░░░░░░   3%  │
│  TaskOutput          1     0.7k     1/1    █░░░░░░░░░░░░░░░░░   2%  │
│  Edit                6     0.2k     6/6    ░░░░░░░░░░░░░░░░░░  <1%  │
├──────────────────────────────────────────────────────────────────────┤
│  Total              40    42.4k    37/40   context fill: 11.5%      │
│                                                                      │
│  Sacred:    SOUL, IDENTITY, USER, MEMORY — never compressed          │
│  Fresh:     3 items (current turn) — protected                       │
│  Stale:     37 items — compressible                                  │
└──────────────────────────────────────────────────────────────────────┘

Claude sees what's sacred, what's fresh, what's fair game. It proposes a compression plan, you approve, it executes. Or in auto mode - it just handles it.

The skill is fully customizable — ask Claude to profile your sessions and adjust the compression strategy to your workflow. My case: the main session is a coordinator managing swarms of agents and agents inside agents, so agent returns are the primary culprit for context pollution. Your case might be different — heavy grep usage, large file reads, deep git histories. Tune the skill to match. The heuristics that drive what gets compressed and what stays sacred live in skill/references/heuristics.md — edit it to match your setup.


Quick Start

The fastest path: point your Claude at this repo and tell it to install wet. It will read the skill, build the binary, wire the statusline, and configure itself. That's the whole point - wet is built for agents to set up and operate.

Manual path:

# Homebrew (recommended)
brew tap buildoak/tap && brew install wet

# Build from source (requires Go 1.22+)
git clone https://github.com/buildoak/wet.git
cd wet && go build -o wet .
sudo mv wet /usr/local/bin/  # or anywhere on your PATH

# Install the skill — this is what teaches Claude the meta game
wet install-skill

# Wire the statusline into Claude Code
wet install-statusline

# Launch Claude through wet
wet claude --dangerously-skip-permissions

Docker / IDE Extension path:

# Build the standalone proxy image
docker build -t wet-proxy .

# Run it on localhost:8100 (auto mode shown here; passthrough is the default)
docker run --rm \
  -p 8100:8100 \
  -e WET_MODE=auto \
  -v wet-data:/root/.wet \
  wet-proxy

Then point Claude Code's shared settings at the published port:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8100"
  }
}

This mode is best when you want a drop-in proxy running beside Claude Code: any client that points ANTHROPIC_BASE_URL at the published port routes through it.

No findings