Skim: The Most Intelligent Context Optimization Engine for Coding Agents

Code skimming. Command rewriting. Test, build, and git output compression. Token budget cascading. 17 languages. 14ms for 3,000 lines. Built in Rust.

Other tools filter terminal noise. Skim understands your code. It parses ASTs across 17 languages, strips implementation while preserving architecture, then optimizes every other type of context your agent consumes: test output, build errors, git diffs, and raw commands. 14ms for 3,000 lines. 48x faster on cache hits.

Why Skim?

Context capacity is not the bottleneck. Attention is. Every token you send to an LLM dilutes its focus. Research on long-context models consistently shows attention dilution: models lose track of critical details even within their window. More tokens mean higher latency, degraded recall, and weaker reasoning. Past a threshold, adding context makes outputs worse.

While other tools stop at filtering command output, Skim parses your actual code structure and optimizes the full spectrum of agent context: code, test output, build errors, git diffs, and commands. Deeper, broader, and smarter than anything else available.

Take a typical 80-file TypeScript project: 63,000 tokens. That contains maybe 5,000 tokens of actual signal. The rest is implementation noise the model doesn't need for architectural reasoning.

80% of the time, the model doesn't need implementation details. It doesn't care how you loop through users or validate emails. It needs to understand what your code does and how pieces connect.

That's where Skim comes in.

| Mode       | Tokens | Reduction | Use Case                   |
|------------|--------|-----------|----------------------------|
| Full       | 63,198 | 0%        | Original source code       |
| Structure  | 25,119 | 60.3%     | Understanding architecture |
| Signatures | 7,328  | 88.4%     | API documentation          |
| Types      | 5,181  | 91.8%     | Type system analysis       |
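
The Reduction column follows directly from the token counts in the table: it is the share of the original tokens that each mode removes. The short sketch below recomputes it; the function name `reductionPercent` is ours for illustration and is not part of Skim.

```typescript
// Token counts copied from the table above.
const FULL_TOKENS = 63198;
const MEASURED: { mode: string; tokens: number }[] = [
  { mode: "structure", tokens: 25119 },
  { mode: "signatures", tokens: 7328 },
  { mode: "types", tokens: 5181 },
];

// Reduction = percentage of the original tokens that was removed,
// rounded to one decimal place.
function reductionPercent(full: number, reduced: number): number {
  return Math.round((1 - reduced / full) * 1000) / 10;
}

for (const row of MEASURED) {
  console.log(`${row.mode}: ${reductionPercent(FULL_TOKENS, row.tokens)}%`);
}
// structure: 60.3%
// signatures: 88.4%
// types: 91.8%
```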

For example:

// Before: Full implementation (100 tokens)
export async function processUser(user: User): Promise<Result> {
    const validated = validateUser(user);
    if (!validated) throw new Error("Invalid");
    const normalized = normalizeData(user);
    return await saveToDatabase(normalized);
}

// After: Structure only (12 tokens)
export async function processUser(user: User): Promise<Result> { /* ... */ }

One command. 60-92% smaller, depending on mode. Your 63,000-token codebase? As few as 5,200 tokens in types mode, which fits comfortably in a single prompt with room for your question.

That same 80-file project that wouldn't fit? Now you can ask "Explain the entire authentication flow" or "How do these services interact?", and the AI actually has enough context to answer.

Features

Code Skimming (the original, still unmatched)

17 languages: TypeScript, JavaScript, Python, Rust, Go, Java, C, C++, C#, Ruby, SQL, Kotlin, Swift, Markdown, JSON, YAML, TOML
6 transformation modes, from least to most aggressive: full, minimal, pseudo, structure, signatures, types (15-95% reduction)
14.6ms for 3,000-line files. 48x faster on cache hits
Token budget cascading that automatically selects the most aggressive mode fitting your budget
Parallel processing with multi-file globs via rayon
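
One plausible reading of token budget cascading is: walk the modes from least to most aggressive and stop at the first one whose output fits the budget. The sketch below illustrates that reading only. The cascade order, the `pickMode` function, and the `measure` callback are assumptions, not Skim's internals. The sample counts come from the 80-file project table; minimal and pseudo are left out because no counts are given for them.

```typescript
type Mode = "full" | "structure" | "signatures" | "types";

// Assumed cascade order: least aggressive first.
const CASCADE: Mode[] = ["full", "structure", "signatures", "types"];

interface Choice {
  mode: Mode;
  tokens: number;
  fits: boolean;
}

// `measure` is a hypothetical callback that returns the token count
// of the output produced in the given mode.
function pickMode(budget: number, measure: (mode: Mode) => number): Choice {
  for (const mode of CASCADE) {
    const tokens = measure(mode);
    if (tokens <= budget) return { mode, tokens, fits: true };
  }
  // Nothing fits: fall back to the most aggressive mode and say so.
  return { mode: "types", tokens: measure("types"), fits: false };
}

// Token counts from the 80-file TypeScript project table.
const SAMPLE: Record<Mode, number> = {
  full: 63198,
  structure: 25119,
  signatures: 7328,
  types: 5181,
};
const measureSample = (mode: Mode): number => SAMPLE[mode];

console.log(pickMode(30000, measureSample).mode); // structure
console.log(pickMode(10000, measureSample).mode); // signatures
```

Stopping at the first fit keeps as much detail as the budget allows. A real implementation would also need to decide what to do when even the smallest output exceeds the budget.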

Command Rewriting (`skim init`)

PreToolUse hook rewrites cat, head, tail, cargo test, npm test, git diff into skim equivalents
Two-layer rule system with declarative prefix-swap and custom argument handlers
One command installs the hook for automatic, invisible context savings
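
A two-layer rule system of this kind can be sketched as a table of prefix swaps plus per-command handlers that inspect arguments. Everything below is illustrative: the rule shapes, the `rewrite` function, and the target commands marked "assumed" are hypothetical and may not match what the hook actually emits.

```typescript
// Layer 1: declarative prefix swaps.
interface PrefixRule {
  from: string[];
  to: string[];
}

const PREFIX_RULES: PrefixRule[] = [
  { from: ["git", "diff"], to: ["skim", "git", "diff"] },
  { from: ["cargo", "test"], to: ["skim", "test", "cargo"] }, // assumed target form
  { from: ["npm", "test"], to: ["skim", "test", "npm"] }, // assumed target form
];

// Layer 2: custom handlers for commands whose arguments decide the rewrite.
// Returning null means "leave the command untouched".
type Handler = (args: string[]) => string[] | null;

const HANDLERS = new Map<string, Handler>([
  [
    "cat",
    // Hypothetical policy: only rewrite `cat` when every argument is a plain path.
    (args) =>
      args.length > 0 && args.every((a) => !a.startsWith("-"))
        ? ["skim", ...args]
        : null,
  ],
]);

function rewrite(command: string): string {
  // Naive whitespace split; a real hook would need proper shell quoting.
  const argv = command.trim().split(/\s+/);
  const handler = HANDLERS.get(argv[0] ?? "");
  if (handler) {
    const out = handler(argv.slice(1));
    return out ? out.join(" ") : command;
  }
  for (const rule of PREFIX_RULES) {
    if (rule.from.every((word, i) => argv[i] === word)) {
      return [...rule.to, ...argv.slice(rule.from.length)].join(" ");
    }
  }
  return command; // No rule matched: pass through unchanged.
}

console.log(rewrite("git diff --staged")); // skim git diff --staged
console.log(rewrite("cat -n src/app.ts")); // cat -n src/app.ts
```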

Test Output Compression (`skim test`)

Parses and compresses output from cargo, go, vitest, jest, pytest
Extracts failures, assertions, pass/fail counts while stripping noise
Three-tier degradation from structured parse to regex fallback to passthrough
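
The three-tier degradation pattern can be sketched as a chain of attempts, each cheaper and less precise than the last. This is a minimal illustration, not Skim's parser: the JSON shape with `passed` and `failed` fields, the regex, and the `compressTestOutput` name are all assumptions.

```typescript
type Tier = "structured" | "regex" | "passthrough";

interface Compressed {
  tier: Tier;
  text: string;
}

function compressTestOutput(raw: string): Compressed {
  // Tier 1: structured parse. Assumes (hypothetically) that the runner
  // emitted JSON with numeric `passed` and `failed` fields.
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data === "object" && data !== null) {
      const rec = data as Record<string, unknown>;
      const passed = rec["passed"];
      const failed = rec["failed"];
      if (typeof passed === "number" && typeof failed === "number") {
        return { tier: "structured", text: `${passed} passed, ${failed} failed` };
      }
    }
  } catch {
    // Not JSON: fall through to the next tier.
  }

  // Tier 2: regex fallback over human-readable output.
  const m = /(\d+) passed\D+(\d+) failed/.exec(raw);
  if (m) {
    return { tier: "regex", text: `${m[1]} passed, ${m[2]} failed` };
  }

  // Tier 3: passthrough. Return the output unchanged rather than lose information.
  return { tier: "passthrough", text: raw };
}

console.log(compressTestOutput('{"passed": 3, "failed": 1}').tier); // structured
console.log(compressTestOutput("test result: FAILED. 3 passed; 1 failed").tier); // regex
console.log(compressTestOutput("segmentation fault").tier); // passthrough
```

The point of the last tier is that an unrecognized format costs tokens but never hides a failure from the agent.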

Build Output Compression (`skim build`)

Parses cargo, clippy, tsc build output
Extracts errors, warnings, and summaries

Lint Output Compression (`skim lint`)

Parses ESLint, Ruff, mypy, golangci-lint output
Extracts errors and warnings with severity grouping
Three-tier degradation from structured parse to regex fallback to passthrough

Package Manager Compression (`skim pkg`)

Parses npm, pnpm, pip, cargo audit/install/outdated output
Extracts vulnerabilities, version conflicts, and dependency issues

Git Output Compression (`skim git`)

skim git diff -- AST-aware: shows changed functions with full boundaries and +/- markers, strips diff noise
- --mode structure adds unchanged functions as signatures for context
- --mode full shows entire files with change markers
- Supports --staged, commit ranges (HEAD~3, main..feature)
Compresses git status and git log with flag-aware passthrough
All subcommands support --json for machine-readable output

Intelligence

skim discover scans agent session history for optimization opportunities
skim learn detects CLI error-retry patterns and generates correction rules
Output guardrail ensures compressed output is never larger than the original
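
The output guardrail amounts to a final size comparison before anything is emitted. A minimal sketch, assuming size is measured in characters (Skim may compare bytes or tokens instead); `applyGuardrail` is a hypothetical name:

```typescript
// Return the compressed text only if it is no larger than the original;
// otherwise fall back to the original unchanged.
// Assumption: size is measured as string length (UTF-16 code units).
function applyGuardrail(original: string, compressed: string): string {
  return compressed.length <= original.length ? compressed : original;
}

console.log(applyGuardrail("fn a() { body }", "fn a() { /* ... */ }")); // fn a() { body }
console.log(applyGuardrail("a much longer original text", "short")); // short
```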

Installation

Try it (no install required)

npx rskim file.ts

Install globally (recommended for regular use)

# Via Homebrew (macOS/Linux)
brew install dean0x/tap/skim

# Via npm
npm install -g rskim

# Via Cargo
cargo install rskim

Note: Use npx for trying it out. For regular use, install globally to avoid npx overhead (~100-500ms per invocation).

From Source

git clone https://github.com/dean0x/skim.git
cd skim
cargo build --release
# Binary at target/release/skim

Quick Start

# Try it with npx (no install)
npx rskim src/app.ts

# Or install globally for better performance
npm install -g rskim

# Extract structure from single file (auto-detects language)
skim src/app.ts

# Process entire directory recursively (auto-detects all languages)
skim src/

# Process current directory
skim .

# Process multiple files with glob patterns
skim 'src/**/*.ts'

# Process all JavaScript and TypeScript files with custom parallelism
skim '*.{js,ts}' --jobs 4

# Get only function signatures from multiple files
skim 'src/*.ts' --mode signatures --no-header

# Extract type definitions
skim src/types.ts --mode types

# Extract markdown headers (H1-H3 for structure, H1-H6 for signatures/types)
skim README.md --mode structure

# Pipe to other tools
skim src/app.ts | bat -l typescript

# Read from stdin (REQUIRES --language flag)
cat app.ts | skim - --language=typescript

# Override language detection for unusual file extensions
skim weird.inc --language=typescript

# Clear cache
skim --clear-cache

# Disable caching for pure transformation
skim file.ts --no-cache

# Show token reduction statistics
skim file.ts --show-stats

Usage

# Basic usage (auto-detects language)
skim file.ts                    # Single file
skim src/                       # Directory (recursive)
skim 'src/**/*.ts'             # Glob pattern

# With options
skim file.ts --mode signatures  # Different mode
skim src/ --jobs 8             # Parallel processing
skim - --language typescript   # Stdin (requires --language)

Common options:

-m, --mode - Transformation mode: structure (default), signatures, types, full, minimal, pseudo
-l, --language - Override auto-detection (required only when reading from stdin)
-j, --jobs - Parallel processing threads (default: number of CPU cores)
--no-cache - Disable caching
--show-stats - Show token reduction stats
--disable-analytics - Disable analytics recording

📖 Full Usage Guide →

Transformation Modes

Skim offers six modes with different levels of aggressiveness:

| Mode       | Reduction | What's Kept                         | Use Case                   |
|------------|-----------|-------------------------------------|----------------------------|
| Full       | 0%        | Everything (original source)        | Testing/comparison         |
| Minimal    | 15-30%    | All code, doc comments              | Light cleanup              |
| Pseudo     | 30-50%    | Logic flow, names, values           | LLM context with logic     |
| Structure  | 70-80%    | Signatures, types, classes, imports | Understanding architecture |
| Signatures | 85-92%    | Only callable signatures            | API documentation          |
| Types      | 90-95%    | Only type definitions               | Type system analysis       |

skim file.ts --mode structure   # Default
skim file.ts --mode pseudo      # Pseudocode (strips types, visibility, decorators)
skim file.ts --mode signatures  # More aggressive
skim file.ts --mode types       # Most aggressive
skim file.ts --mode full        # No transformation

Note on JSON/YAML/TOML files: JSON, YAML, and TOML always use structure extraction regardless of mode. Since they are data (not code), there are no "signatures" or "types" to extract—only structure. All modes produce identical output for these file types.

📖 Detailed Mode Guide →

Supported Languages

| Language   | Status | Extensions | Notes             |
|------------|--------|------------|-------------------|
| TypeScript | ✅     | .ts, .tsx  | Excellent grammar |
| JavaScript
