Probe
We read code 10x more than we write it. Probe is a code and markdown context engine, with a built-in agent, made to work on enterprise-scale codebases.
Today's AI coding tools use a caveman approach: grep some files, read random lines, hope for the best. It works on toy projects. It falls apart on real codebases.
Probe is a context engine built for reading and reasoning. It treats your code as code—not text. AST parsing understands structure. Semantic search finds what matters. You get complete, meaningful context in a single call.
The Probe Agent is purpose-built for code understanding. It knows how to wield the Probe engine expertly—searching, extracting, and reasoning across your entire codebase. Perfect for spec-driven development, code reviews, onboarding, and any task where understanding comes before writing.
One Probe call captures what takes other tools 10+ agentic loops—deeper, cleaner, and with far less noise.
Table of Contents
- Why Probe?
- Quick Start
- Features
- Usage Modes
- LLM Script
- Installation
- Supported Languages
- Documentation
- Environment Variables
- Contributing
- License
Why Probe?
Most code search tools fall into two camps: text-based (grep, ripgrep) or embedding-based (vector search requiring indexing and an embedding model). Probe takes a third path: AST-aware structural search with zero setup.
| | grep/ripgrep | Embedding tools (grepai, Octocode) | Probe |
|---|---|---|---|
| Setup time | None | Minutes (indexing + embedding service) | None |
| Code understanding | Text only | Text chunks (can split mid-function) | AST-aware (returns complete functions/classes) |
| Search method | Regex | Vector similarity | Elasticsearch-style boolean queries + BM25 |
| Result quality | Line fragments | ~512-char chunks | Complete semantic code blocks |
| Ranking | None (line order) | Cosine similarity | BM25/TF-IDF/Hybrid with SIMD acceleration |
| External dependencies | None | Embedding API (Ollama/OpenAI) | None |
| Token awareness | No | Partial | Yes (--max-tokens, session dedup) |
| Works offline | Yes | Only with local model | Always |
| AI agent integration | None | MCP server | Full agent loop + MCP + Vercel AI SDK |
The key insight: AI agents don't need embedding search
Embedding-based tools solve vocabulary mismatch -- finding "authentication" when the code says verify_credentials. But when an AI agent is the consumer, the LLM already handles this:
User: "find the authentication logic"
-> LLM generates: probe search "verify_credentials OR authenticate OR login OR auth_handler"
-> Probe returns complete AST blocks in milliseconds
The LLM translates intent into precise boolean queries. Probe gives it a powerful query language (AND, OR, +required, -excluded, "exact phrases", ext:rs, lang:python) purpose-built for this. Combined with session dedup, the agent can run 3-4 rapid searches and cover more ground than a single embedding query -- faster, deterministic, and with zero setup cost.
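As a sketch, the expansion step is plain string assembly on the agent side. The query terms below are hypothetical; the actual `probe search` invocation is shown commented out so the example stands alone:

```shell
# Hypothetical agent-side expansion of "find the authentication logic"
# into a Probe boolean query. Only the query string is built here.
terms="verify_credentials OR authenticate OR login OR auth_handler"
query="($terms) AND lang:python"
echo "$query"
# probe search "$query" ./src --max-tokens 8000
```

Because the query is deterministic text, the same expansion always hits the same code blocks—no embedding drift between runs.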
Quick Start
Option 1: Probe Agent via MCP (Recommended)
Our built-in agent natively integrates with Claude Code, using its authentication—no extra API keys needed.
Add to ~/.claude/claude_desktop_config.json:
{
"mcpServers": {
"probe": {
"command": "npx",
"args": ["-y", "@probelabs/probe@latest", "agent", "--mcp"]
}
}
}
The Probe Agent is purpose-built to read and reason about code. It piggybacks on Claude Code's auth (or Codex auth), or works with any model via your own API key (e.g., GOOGLE_API_KEY).
Option 2: Raw Probe Tools via MCP
If you prefer direct access to search/query/extract tools without the agent layer:
{
"mcpServers": {
"probe": {
"command": "npx",
"args": ["-y", "@probelabs/probe@latest", "mcp"]
}
}
}
Option 3: Direct CLI (No MCP)
Use Probe directly from your terminal—no AI editor required:
# Semantic search with Elasticsearch syntax
npx -y @probelabs/probe search "authentication AND login" ./src
# Extract code block at line 42
npx -y @probelabs/probe extract src/main.rs:42
# AST pattern matching
npx -y @probelabs/probe query "fn $NAME($$$) -> Result<$RET>" --language rust
Option 4: CLI Agent
Ask questions about any codebase directly from your terminal:
# One-shot question (works with any LLM provider)
npx -y @probelabs/probe@latest agent "How is authentication implemented?"
# With code editing capabilities
npx -y @probelabs/probe@latest agent "Refactor the login function" --allow-edit
Features
- Code-Aware: Tree-sitter AST parsing understands your code's actual structure
- Semantic Search: Elasticsearch-style queries (AND, OR, NOT, phrases, filters)
- Complete Context: Returns entire functions, classes, or structs -- not text chunks that break mid-function
- Zero Indexing: Instant results on any codebase. No embedding models, no vector databases, no setup
- Deterministic: Same query always returns the same results. No model variance, no stale indexes
- Fully Local: Your code never leaves your machine. No API calls for search
- Blazing Fast: SIMD-accelerated pattern matching + ripgrep scanning + rayon parallelism
- Smart Ranking: BM25, TF-IDF, and hybrid algorithms with optional BERT reranking
- Token-Aware: --max-tokens budget, session-based dedup to avoid repeating context
- Built-in Agent: Multi-provider (Anthropic, OpenAI, Google, Bedrock) with retry, fallback, and context compaction
- Multi-Language: Rust, Python, JavaScript, TypeScript, Go, C/C++, Java, Ruby, PHP, Swift, C#, and more
Usage Modes
Probe Agent (MCP)
The recommended way to use Probe with AI editors. The Probe Agent is a specialized coding assistant that reasons about your code—not just pattern matches.
{
"mcpServers": {
"probe": {
"command": "npx",
"args": ["-y", "@probelabs/probe@latest", "agent", "--mcp"]
}
}
}
Why use the agent?
- Purpose-built to understand and reason about code
- Piggybacks on Claude Code / Codex authentication (or use your own API key)
- Smarter multi-step reasoning for complex questions
- Built-in code editing, task delegation, and more
Agent options:
| Option | Description |
|--------|-------------|
| --path <dir> | Search directory (default: current) |
| --provider <name> | AI provider: anthropic, openai, google |
| --model <name> | Override model name |
| --prompt <type> | Persona: code-explorer, engineer, code-review, architect |
| --allow-edit | Enable code modification |
| --enable-delegate | Enable task delegation to subagents |
| --enable-bash | Enable bash command execution |
| --max-iterations <n> | Max tool iterations (default: 30) |
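These options compose. As an illustrative example (the question and values here are made up), a review-focused run scoped to one directory might look like:

```shell
# Illustrative: run the agent with a review persona, scoped to ./src,
# with editing enabled and a higher iteration cap.
npx -y @probelabs/probe@latest agent "Review the auth module for error handling gaps" \
  --path ./src \
  --prompt code-review \
  --allow-edit \
  --max-iterations 50
```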
Raw MCP Tools
Direct access to Probe's search, query, and extract tools—without the agent layer. Use this when you want your AI editor to call Probe tools directly.
{
"mcpServers": {
"probe": {
"command": "npx",
"args": ["-y", "@probelabs/probe@latest", "mcp"]
}
}
}
Available tools:
- search - Semantic code search with Elasticsearch-style queries
- query - AST-based structural pattern matching
- extract - Extract code blocks by line number or symbol name
- symbols - List all symbols in a file (functions, classes, constants) with line numbers
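As an illustration, an MCP client invokes these tools with JSON arguments along these lines. The parameter names shown are a sketch, not the definitive schema—consult the tool schema the server advertises for the exact shape:

```json
{
  "name": "search",
  "arguments": {
    "query": "authentication AND login",
    "path": "./src"
  }
}
```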
CLI Agent
Run the Probe Agent directly from your terminal:
# One-shot question
npx -y @probelabs/probe@latest agent "How does the ranking algorithm work?"
# Specify search path
npx -y @probelabs/probe@latest agent "Find API endpoints" --path ./src
# Enable code editing
npx -y @probelabs/probe@latest agent "Add error handling to login()" --allow-edit
# Use custom persona
npx -y @probelabs/probe@latest agent "Review this code" --prompt code-review
Direct CLI Commands
For scripting and direct code analysis.
Search Command
probe search <PATTERN> [PATH] [OPTIONS]
Examples:
# Basic search
probe search "authentication" ./src
# Boolean operators (Elasticsearch syntax)
probe search "error AND handling" ./
probe search "login OR auth" ./src
probe search "database NOT sqlite" ./
# Search hints (file filters)
probe search "function AND ext:rs" ./ # Only .rs files
probe search "class AND file:src/**/*.py" ./ # Python files in src/
probe search "error AND dir:tests" ./ # Files in tests/
# Limit results for AI context windows
probe search "API" ./ --max-tokens 10000
Key options:
| Option | Description |
|--------|-------------|
| --max-tokens <n> | Limit total tokens returned |
| --max-results <n> | Limit number of results |
| --reranker <algo> | Ranking: bm25, tfidf, hybrid, hybrid2 |
| --allow-tests | Include test files |
| --format <fmt> | Output: markdown, json, xml |
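These flags combine. For instance (illustrative values), a token-budgeted, hybrid-ranked search emitted as JSON for downstream tooling:

```shell
# Illustrative: hybrid-ranked search, capped for an AI context window,
# emitted as JSON for programmatic consumption.
probe search "error AND handling" ./src \
  --reranker hybrid \
  --max-tokens 8000 \
  --format json
```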
Extract Command
probe extract <FILES> [OPTIONS]
Examples:
# Extract function at line 42
probe extract src/main.rs:42
# Extract by symbol name
probe extract src/main.rs#authenticate
# Extract line range
probe extract src/main.rs:10-50
# From compiler output
go test | probe extract
Symbols Command
probe symbols <FILES> [OPTIONS]
Examples:
# List symbols in a file
probe symbols src/main.rs
# JSON output for programmatic use
probe symbols src/main.rs --format json
