
Lumen

Reduce Claude Code, Codex, and OpenCode wall-clock time and token use by up to 50% with open-source, local semantic search. Works for small and large codebases and monorepos! Enterprise-ready and fully compliant via Ollama and sqlite-vec.

Install / Use

/learn @ory/Lumen
Supported Platforms

Claude Code
Claude Desktop
Gemini CLI
OpenAI Codex
Cursor


Ory Lumen: Semantic code search for AI agents


Claude reads entire files to find what it needs. Lumen gives it a map.

Lumen is a 100% local semantic code search engine for AI coding agents. No API keys, no cloud, no external database: just a single static binary, SQLite, your CPU, and your own local open-source embedding server (Ollama or LM Studio).

The payoff is measurable and reproducible: across 8 benchmark runs on 8 languages and real GitHub bug-fix tasks, Lumen cuts cost in every single language — up to 39%. Output tokens drop by up to 66%, sessions complete up to 53% faster, and patch quality is maintained in every task. All verified with a transparent, open-source benchmark framework that you can run yourself.

|                        | With Lumen                | Baseline (no Lumen) |
| ---------------------- | ------------------------- | ------------------- |
| Cost (avg, bug-fix)    | $0.29 (-26%)              | $0.40               |
| Time (avg, bug-fix)    | 125s (-28%)               | 174s                |
| Output tokens (avg)    | 5,247 (-37%)              | 8,323               |
| JavaScript (marked)    | $0.32, 119s (-33%, -53%)  | $0.48, 255s         |
| Rust (toml)            | $0.38, 204s (-39%, -34%)  | $0.61, 310s         |
| PHP (monolog)          | $0.14, 34s (-27%, -34%)   | $0.19, 52s          |
| TypeScript (commander) | $0.14, 56s (-27%, -33%)   | $0.19, 84s          |
| Patch quality          | Maintained in all 8 tasks | —                   |

Table of contents

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
<!-- END doctoc generated TOC please keep comment here to allow auto update -->

Demo

<img src="docs/demo/demo.gif" alt="Lumen demo" width="600"/>

Claude Code asking about the Prometheus codebase. Lumen's semantic_search finds the relevant code without reading entire files.

Quick start

Prerequisites:

Platform support: Linux, macOS, and Windows. File locking for background indexing coordination uses flock(2) on Unix and LockFileEx on Windows (via gofrs/flock).

  1. Ollama installed and running, then pull the default embedding model:
    ollama pull ordis/jina-embeddings-v2-base-code
    
  2. Claude Code installed

Install:

/plugin marketplace add ory/claude-plugins
/plugin install lumen@ory

That's it. On first session start, Lumen:

  1. Downloads the binary automatically from the latest GitHub release
  2. Indexes your project in the background using Merkle tree change detection
  3. Registers a semantic_search MCP tool that Claude uses automatically

Two skills are also available: /lumen:doctor (health check) and /lumen:reindex (forced re-indexing).

What you get

  • Semantic vector search — Claude finds relevant functions, types, and modules by meaning, not keyword matching
  • Auto-indexing — indexes on session start, only re-processes changed files via Merkle tree diffing
  • Incremental updates — re-indexes only what changed; large codebases re-index in seconds after the first run
  • 11 language families — Go, Python, TypeScript, JavaScript, Dart, Rust, Ruby, Java, PHP, C/C++, C#
  • Git worktree support — worktrees share index data automatically; a new worktree seeds from a sibling's index and only re-indexes changed files, turning minutes of embedding into seconds
  • Zero cloud — embeddings stay on your machine; no data leaves your network
  • Ollama and LM Studio — works with either local embedding backend

How it works

Lumen sits between your codebase and Claude as an MCP server. When a session starts, it walks your project and builds a Merkle tree over file hashes: only changed files get re-chunked and re-embedded. Each file is split into semantic chunks (functions, types, methods) using Go's native AST or tree-sitter grammars for other languages. Chunks are embedded and stored in SQLite + sqlite-vec using cosine-distance KNN for retrieval.
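The change-detection step can be sketched in Go. This is a simplified illustration, not Lumen's implementation: a real Merkle tree also derives directory hashes from their children's hashes so whole unchanged subtrees can be skipped, while this sketch only diffs per-file content hashes — the part that decides what gets re-chunked and re-embedded.

```go
// Simplified sketch of hash-based change detection. All file names and
// contents below are made up for illustration.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fileHash returns the hex SHA-256 of a file's contents.
func fileHash(contents []byte) string {
	sum := sha256.Sum256(contents)
	return hex.EncodeToString(sum[:])
}

// changedFiles compares a stored manifest (path -> hash) against the
// current snapshot and returns only the paths that need re-chunking
// and re-embedding. New files change from "" to a real hash, so they
// are picked up too.
func changedFiles(manifest, current map[string]string) []string {
	var changed []string
	for path, hash := range current {
		if manifest[path] != hash {
			changed = append(changed, path)
		}
	}
	return changed
}

func main() {
	manifest := map[string]string{
		"main.go": fileHash([]byte("package main")),
		"util.go": fileHash([]byte("package util")),
	}
	current := map[string]string{
		"main.go": fileHash([]byte("package main")),           // unchanged
		"util.go": fileHash([]byte("package util // edited")), // changed
	}
	fmt.Println(changedFiles(manifest, current)) // → [util.go]
}
```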

Files → semantic chunks → vector embeddings → SQLite/sqlite-vec → KNN search
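The final retrieval step of that pipeline can be sketched in plain Go. In Lumen the KNN query runs inside SQLite via sqlite-vec; the brute-force ranking, chunk IDs, and tiny two-dimensional vectors below are illustrative only.

```go
// Minimal sketch of cosine-distance KNN over chunk embeddings.
// Real embeddings have hundreds of dimensions; two suffice here.
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosineDistance is 1 - cosine similarity: 0 for parallel vectors,
// approaching 2 for opposite ones.
func cosineDistance(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

type chunk struct {
	id  string
	vec []float64
}

// knn returns the k chunk IDs nearest to query by cosine distance.
func knn(chunks []chunk, query []float64, k int) []string {
	sort.Slice(chunks, func(i, j int) bool {
		return cosineDistance(chunks[i].vec, query) < cosineDistance(chunks[j].vec, query)
	})
	ids := make([]string, 0, k)
	for _, c := range chunks[:k] {
		ids = append(ids, c.id)
	}
	return ids
}

func main() {
	chunks := []chunk{
		{"parseConfig", []float64{0.9, 0.1}},
		{"renderHTML", []float64{0.1, 0.9}},
		{"loadConfig", []float64{0.8, 0.2}},
	}
	// A "config"-flavored query ranks the config chunks first.
	fmt.Println(knn(chunks, []float64{1, 0}, 2)) // → [parseConfig loadConfig]
}
```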

When Claude needs to understand code, it calls semantic_search instead of reading entire files. The index is stored outside your repo (~/.local/share/lumen/<hash>/index.db), keyed by project path and model name — different models never share an index.
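As an illustration of that keying scheme (the exact derivation Lumen uses is not documented here), hashing the project path and model name together guarantees that two models can never resolve to the same index file. The data directory, project path, and model names below are examples.

```go
// Sketch: derive a per-project, per-model index path by hashing both
// keys together. The NUL separator prevents ambiguous concatenations.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

func indexPath(dataDir, projectPath, model string) string {
	sum := sha256.Sum256([]byte(projectPath + "\x00" + model))
	return filepath.Join(dataDir, hex.EncodeToString(sum[:])[:16], "index.db")
}

func main() {
	a := indexPath("/home/me/.local/share/lumen", "/src/app", "jina-embeddings-v2-base-code")
	b := indexPath("/home/me/.local/share/lumen", "/src/app", "nomic-embed-text")
	fmt.Println(a != b) // → true: a different model gets a different index.db
}
```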

Benchmarks

Lumen is evaluated using bench-swe: a SWE-bench-style harness that runs Claude on real GitHub bug-fix tasks and measures cost, time, output tokens, and patch quality — with and without Lumen. All results are reproducible: raw JSONL streams, patch diffs, and judge ratings are committed to this repository.

Key results — 8 runs across 8 languages, hard difficulty, real GitHub issues (ordis/jina-embeddings-v2-base-code, Ollama):

| Language   | Cost Reduction | Time Reduction | Output Token Reduction | Quality        |
| ---------- | -------------- | -------------- | ---------------------- | -------------- |
| Rust       | -39%           | -34%           | -31% (18K → 12K)       | Poor (both)    |
| JavaScript | -33%           | -53%           | -66% (14K → 5K)        | Perfect (both) |
| TypeScript | -27%           | -33%           | -64% (5K → 1.8K)       | Good (both)    |
| PHP        | -27%           | -34%           | -59% (1.9K → 0.8K)     | Good (both)    |
| Ruby       | -24%           | -11%           | -9% (6.1K → 5.6K)      | Good (both)    |
| Python     | -20%           | -29%           | -36% (1.7K → 1.1K)     | Perfect (both) |
| Go         | -12%           | -9%            | -10% (11K → 10K)       | Good (both)    |
| C++        | -8%            | -3%            | +42% (feature task)    | Good (both)    |

Cost was reduced in every language tested. Quality was maintained in every task — zero regressions. JavaScript and TypeScript show the most dramatic efficiency gains: same quality fixes in half the time with two-thirds fewer tokens. Even on tasks too hard for either approach (Rust), Lumen cuts the cost of failure by 39%.

See docs/BENCHMARKS.md for all 8 per-language deep dives, judge rationales, and reproduce instructions.

Supported languages

Supports 11 language families with semantic chunking (9 benchmarked):

| Language         | Parser      | Extensions                    | Benchmark status                               |
| ---------------- | ----------- | ----------------------------- | ---------------------------------------------- |
| Go               | Native AST  | .go                           | Benchmarked: -12% cost, Good quality           |
| Python           | tree-sitter | .py                           | Benchmarked: Perfect quality, -36% tokens      |
| TypeScript / TSX | tree-sitter | .ts, .tsx                     | Benchmarked: -64% tokens, -33% time            |
| JavaScript / JSX | tree-sitter | .js, .jsx, .mjs               | Benchmarked: -66% tokens, -53% time            |
| Dart             | tree-sitter | .dart                         | Benchmarked: -76% cost, -82% tokens, -79% time |
| Rust             | tree-sitter | .rs                           | Benchmarked: -39% cost, -34% time              |
| Ruby             | tree-sitter | .rb                           | Benchmarked: -24% cost, -11% time              |
| PHP              | tree-sitter | .php                          | Benchmarked: -59% tokens, -34% time            |
| C / C++          | tree-sitter | .c, .h, .cpp, .cc, .cxx, .hpp | Benchmarked: -8% cost (C++ feature task)       |
| Java             | tree-sitter | .java                         | Supported                                      |
| C#               | tree-sitter | .cs                           | Supported                                      |

Go uses the native Go AST parser for the most precise chunks. All other languages use tree-sitter grammars. See docs/BENCHMARKS.md for all 9 per-language benchmark deep dives.

Configuration

All configuration is via environment variables:

| Variable          | Default    | Description |
| ----------------- | ---------- | ----------- |
| LUMEN_EMBED_MODEL | see note ¹ |             |
