
<!-- mcp-name: io.github.srclight/srclight -->

Srclight


Deep code indexing for AI agents. SQLite FTS5 + tree-sitter + embeddings + MCP.

Srclight builds a rich, searchable index of your codebase that AI coding agents can query instantly — replacing dozens of grep/glob calls with precise, structured lookups. It is the most comprehensive code intelligence MCP server available: 29 tools covering symbol search, relationship graphs, git change intelligence, semantic search, build system awareness, and document extraction — capabilities no other single MCP server combines. Fully local and private: your code never leaves your machine.

Why?

AI coding agents (Claude Code, Cursor, etc.) spend 40-60% of their tokens on orientation — searching for files, reading code to understand structure, hunting for callers and callees. Srclight eliminates this waste.

| Without Srclight | With Srclight |
|---|---|
| 8-12 grep rounds to find callers | get_callers("lookup") — one call |
| Read 5 files to understand module | codebase_map() — instant overview |
| "Find code that does X" → 20 greps | semantic_search("dictionary lookup") — one call |
| 15-25 tool calls per bug fix | 5-8 tool calls per bug fix |

Features

  • Minimal dependencies — single SQLite file per repo, no Docker/Redis/vector DB
  • Fully offline — no API calls, works air-gapped (Ollama local embeddings)
  • Incremental — only re-indexes changed files (content hash detection)
  • 12 languages — Python, C, C++, C#, JavaScript, TypeScript, PHP, Dart, Swift, Kotlin, Java, Go
  • 10 document formats — PDF, DOCX, XLSX, HTML, CSV/TSV, email (.eml), images (PNG/JPG/SVG/etc.), plain text, RST, Markdown
  • Optional OCR — PaddleOCR for scanned/image-only PDF pages; pytesseract for images
  • 4 search modes — symbol names, source code (trigram), documentation (stemmed), semantic (embeddings)
  • Hybrid search — RRF fusion of keyword + semantic results for best accuracy
  • Multi-repo workspaces — search across all your repos simultaneously via SQLite ATTACH+UNION
  • MCP server — works with Claude Code, Cursor, and any MCP client
  • CLI — index, search, and inspect from the terminal
  • Auto-reindex — git post-commit/post-checkout hooks keep indexes fresh
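
The incremental re-indexing described above can be sketched as content-hash change detection: a file is re-indexed only when its current hash differs from the one recorded at the last run. This is a minimal illustration of the idea; the function names (hash_file, changed_files) are illustrative, not Srclight's actual API.

```python
import hashlib
from pathlib import Path

def hash_file(path: Path) -> str:
    """Return a hex digest of the file's current content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(paths, stored_hashes):
    """Yield (path, new_hash) for files whose content differs from the stored hash."""
    for path in paths:
        digest = hash_file(path)
        if stored_hashes.get(str(path)) != digest:
            yield path, digest
```

Unchanged files hash to the same digest and are skipped entirely, which is what keeps re-index runs cheap on large repos.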

Requirements

  • Python 3.11+
  • Git (for change intelligence and auto-reindex hooks)
  • Ollama (optional, for semantic search / embeddings) — ollama.com
  • NVIDIA GPU + cupy (optional, for GPU-accelerated vector search)
  • Poppler (optional, for PaddleOCR scanned-PDF support) — apt install poppler-utils / brew install poppler

Quick Start

# Install from PyPI
pip install srclight

# Install from source
git clone https://github.com/srclight/srclight.git
cd srclight
pip install -e .

# Optional: document format support (PDF, DOCX, XLSX, HTML, images)
pip install 'srclight[docs,pdf]'

# Optional: OCR for scanned PDFs (also needs poppler-utils on your system)
pip install 'srclight[pdf,paddleocr]'

# Optional: OCR for images (needs tesseract on your system)
pip install 'srclight[docs,ocr]'

# Optional: GPU-accelerated vector search (requires CUDA 12.x)
pip install 'srclight[gpu]'

# Everything (docs + pdf + ocr + paddleocr + gpu)
pip install 'srclight[all]'

# Index your project
cd /path/to/your/project
srclight index

# Index with embeddings (requires Ollama running)
srclight index --embed qwen3-embedding

# Search
srclight search "lookup"
srclight search --kind function "parse"
srclight symbols src/main.py

# Start MCP server (for Claude Code / Cursor)
srclight serve

Note: srclight index automatically adds .srclight/ to your .gitignore. Index databases and embedding files can be large and should never be committed.

Semantic Search (Embeddings)

Srclight supports embedding-based semantic search for natural language queries like "find code that handles authentication" or "where is the database connection pool".

Setup

# Install Ollama (https://ollama.com)
# Pull an embedding model
ollama pull qwen3-embedding       # Best quality (8B params, needs ~6GB VRAM)
ollama pull nomic-embed-text      # Lighter alternative (137M params)

# Index with embeddings
srclight index --embed qwen3-embedding

# Or index workspace with embeddings
srclight workspace index -w myworkspace --embed qwen3-embedding

How It Works

  1. Each symbol's name + signature + docstring + content is embedded as a float vector
  2. Vectors are stored as BLOBs in symbol_embeddings table (SQLite)
  3. After indexing, a .npy sidecar snapshot is built and loaded to GPU VRAM (cupy) or CPU RAM (numpy) for fast search
  4. semantic_search(query) embeds the query and runs cosine similarity against the GPU-resident matrix (~3ms for 27K vectors on a modern GPU)
  5. hybrid_search(query) combines FTS5 keyword results + embedding results via Reciprocal Rank Fusion (RRF)
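
The RRF fusion in step 5 can be sketched in a few lines: each ranked list contributes 1 / (k + rank) per item, and the fused score is the sum across lists. k=60 here is the constant from the original RRF paper; the value Srclight actually uses is not documented in this README, and the example result lists are made up.

```python
def rrf_fuse(ranked_lists, k=60):
    """Fuse multiple ranked result lists into one list ordered by RRF score."""
    scores = {}
    for results in ranked_lists:
        for rank, item in enumerate(results, start=1):
            # Each list contributes 1/(k + rank) for every item it ranks.
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword = ["parse_entry", "lookup", "tokenize"]    # hypothetical FTS5 results
semantic = ["lookup", "find_word", "parse_entry"]  # hypothetical embedding results
fused = rrf_fuse([keyword, semantic])
# "lookup" wins: ranked near the top of both lists.
```

Items appearing in both lists accumulate score from each, which is why RRF rewards agreement between keyword and semantic results without needing to calibrate their raw scores against each other.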

Embedding Providers

| Provider | Model | Quality | Local? | Notes |
|----------|-------|---------|--------|-------|
| Ollama (default) | qwen3-embedding | Best local | Yes | Needs ~6GB VRAM |
| Ollama | nomic-embed-text | Good | Yes | Lighter, works on 8GB VRAM |
| Voyage AI (API) | voyage-code-3 | Best overall | No | Requires VOYAGE_API_KEY |

# Use Voyage Code 3 (API, highest quality)
VOYAGE_API_KEY=your-key srclight index --embed voyage-code-3

Storage

Embeddings are stored in symbol_embeddings table in .srclight/index.db. After indexing, a .npy sidecar snapshot is built for fast GPU loading:

| File | Purpose |
|------|---------|
| index.db | Write path — per-symbol CRUD during indexing |
| embeddings.npy | Read path — contiguous float32 matrix for GPU/CPU search |
| embeddings_norms.npy | Pre-computed row norms (avoids recomputation per query) |
| embeddings_meta.json | Symbol ID mapping, model info, version for cache invalidation |

For ~27K symbols at 4096 dims (qwen3-embedding), that's ~428 MB on disk, ~450 MB in VRAM. Incremental: only re-embeds symbols whose content changed; sidecar rebuilt after each indexing run.
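
The read path above can be sketched with numpy: load the contiguous matrix and its precomputed row norms, then score a query by cosine similarity without renormalizing the matrix per query. The file names match the table; the function and its in-memory layout are an illustration, not Srclight's internals.

```python
import numpy as np

def cosine_topk(matrix, norms, query, k=5):
    """Return indices of the k matrix rows most similar to the query vector."""
    q = query / np.linalg.norm(query)
    sims = matrix @ q / norms          # row norms precomputed once at index time
    return np.argsort(sims)[::-1][:k]

# matrix = np.load("embeddings.npy")        # (n_symbols, dims) float32
# norms  = np.load("embeddings_norms.npy")  # (n_symbols,) precomputed row norms
```

Swapping numpy for cupy gives the GPU-resident variant with the same code shape, since cupy mirrors numpy's array API.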

Multi-Repo Workspaces

Search across multiple repos simultaneously. Each repo keeps its own .srclight/index.db; at query time, srclight ATTACHes them all and UNIONs across schemas.
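
The ATTACH+UNION pattern can be sketched with Python's sqlite3, using in-memory databases in place of per-repo .srclight/index.db files. The one-column symbols schema here is illustrative only, not Srclight's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")            # stands in for repo1's index.db
conn.execute("CREATE TABLE symbols (name TEXT)")
conn.execute("INSERT INTO symbols VALUES ('lookup')")

# ATTACH a second database; ':memory:' creates a fresh one for the demo.
conn.execute("ATTACH DATABASE ':memory:' AS repo2")
conn.execute("CREATE TABLE repo2.symbols (name TEXT)")
conn.execute("INSERT INTO repo2.symbols VALUES ('parse')")

# One query spans both databases, tagging each row with its project.
rows = conn.execute("""
    SELECT 'repo1' AS project, name FROM main.symbols
    UNION ALL
    SELECT 'repo2' AS project, name FROM repo2.symbols
""").fetchall()
```

Because each repo keeps its own file, a workspace query only pays the cost of ATTACHing the databases it needs; no merged index has to be rebuilt when a repo changes.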

# Create a workspace
srclight workspace init myworkspace

# Add repos
srclight workspace add /path/to/repo1 -w myworkspace
srclight workspace add /path/to/repo2 -w myworkspace -n custom-name

# Index all repos (with optional embeddings)
srclight workspace index -w myworkspace
srclight workspace index -w myworkspace --embed qwen3-embedding

# Search across all repos
srclight workspace search "Dictionary" -w myworkspace
srclight workspace search "Dictionary" -w myworkspace --project repo1

# Status
srclight workspace status -w myworkspace
srclight workspace list

# Start MCP server in workspace mode
srclight serve --workspace myworkspace

Git submodules are not indexed automatically — git ls-files does not recurse into them. To index a submodule, clone it separately and add it as its own workspace project. See docs/usage-guide.md for details.

MCP Integration

Srclight supports two transport modes: stdio (one server per session) and SSE (persistent server, multiple sessions). SSE is recommended for workspaces.

Claude Code

Stdio (simplest — one server per session):

# Single repo
claude mcp add srclight -- srclight serve

# Workspace mode
claude mcp add srclight -- srclight serve --workspace myworkspace

# Make it available in all projects (user scope)
claude mcp add --scope user srclight -- srclight serve --workspace myworkspace

SSE (persistent server — recommended for workspaces):

Run srclight as a long-lived server, then point Claude Code at it:

# Start the server (default: http://127.0.0.1:8742/sse)
srclight serve --workspace myworkspace &

# Or install as a systemd user service (Linux/WSL)
# See docs/usage-guide.md for the service file

# Connect Claude Code to the running server
claude mcp add --transport sse srclight http://127.0.0.1:8742/sse

SSE mode supports multiple concurrent sessions and survives Claude Code restarts.

Cursor

SSE (recommended): Run srclight once, then connect Cursor to it. Best for responsiveness and no cold-start per session.

Start the server: srclight serve --workspace myworkspace (default SSE on port 8742).

  • UI: Settings → Tools & MCP → Add new MCP server → Type: streamableHttp, URL: http://127.0.0.1:8742/sse.
  • JSON (project .cursor/mcp.json or global ~/.cursor/mcp.json):
"srclight": {
  "url": "http://127.0.0.1:8742/sse"
}

Stdio (alternative): One server process per Cursor session.

  • UI: Type: command, Command: srclight, Args: serve --workspace myworkspace (or serve for single-repo).
  • JSON:
"srclight": {
  "command": "srclight",
  "args": ["serve", "--workspace", "myworkspace"]
}

For single-repo: "args": ["serve"]. Restart Cursor completely after adding the server.

Verify: In Cursor chat, ask "What projects are in the srclight workspace?" or "List srclight tools" — the agent should call list_projects() or show srclight tools.

OpenClaw

OpenClaw connects to srclight via mcporter, its built-in MCP tool server CLI.

# 1. Add srclight to mcporter's home config
mcporter config add srclight http://127.0.0.1:8742/sse \
  --transport sse --scope home \
  --description "Srclight deep code indexing"

# 2. Verify the connection
mcporter call srclight.

No findings