Ultracode
Ultra-fast MCP server enabling AI agents to comprehensively work with TypeScript/JavaScript/Python projects: instant structural search, code modification with automatic linting/formatting/fixes, runtime debugging and backwards error analysis, integrated Git automation. Your AI's survival kit for massive codebases!
██ ██
██ ██ ██ ██████ █████▄ ▄████▄
██ ██ ██ ██ ██▄▄██▄ ██▄▄██
██ ██ ██ ██ ██ ██ ██ ██
██ ██ ██████ ██ ██ ██ ██ ██
▀████▀ ▄████ ▄████▄ █████▄ █████
██ ██ ██ ██ ██ ██▄▄▄
▀████ ▀████▀ █████▀ ██▄▄▄
Codebase RAG for Fast and Accurate Code Work
🌐 Language: [EN] | RU
MCP server for AI coding agents. Builds a complete code structure graph (entities, relationships, control flow, complexity) and a semantic vector index. AI agents query the graph instead of reading files — and get precise, exhaustive answers with line references.
Why this matters
Without a structural index, an AI agent exploring a codebase has to grep → read file → follow imports → grep again → read more files. Each step costs tokens and time. Missed connections lead to incomplete fixes. The agent breaks code, checks, fixes, breaks again — a cycle that can repeat 10-20 times for a single task.
With UltraCode, the same agent makes one MCP call and gets back all affected entities, their relationships, callers, and impact — in a single response. No file-reading loop, no missed connections.
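Under the hood, such a call is an ordinary MCP `tools/call` request over JSON-RPC 2.0. A minimal TypeScript sketch of the payload an agent's MCP client would send — the envelope (`method`, `params.name`, `params.arguments`) comes from the MCP spec and the `semantic_search` tool name from this README, while the `query` and `limit` argument names are hypothetical; check the server's tool schema for the real ones:

```typescript
// Sketch of the JSON-RPC 2.0 payload an MCP client sends when the agent
// invokes UltraCode's semantic_search tool. The envelope follows the MCP
// spec; the "query"/"limit" argument names are hypothetical.
function toolCallRequest(
  id: number,
  tool: string,
  args: Record<string, unknown>,
) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

const req = toolCallRequest(1, "semantic_search", {
  query: "where do we retry failed HTTP requests?", // hypothetical argument
  limit: 10,                                        // hypothetical argument
});
console.log(JSON.stringify(req));
```

The agent gets the full answer (entities, callers, line references) back in the tool result of that single request, instead of a grep/read loop.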
What changes in practice
| | Without UltraCode | With UltraCode |
| ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Search | Agent greps for keywords, reads files one by one, follows import chains manually. On a large project, finding all usages of a pattern takes dozens of agent turns and 1M+ tokens. Indirect references are often missed. | Agent calls semantic_search or query — gets all matches (including semantic: similar logic, related concepts) in one response, ~100ms, ~5K tokens. Graph traversal finds what grep cannot: indirect callers, interface implementors, data flow paths. |
| Editing | Agent modifies files without knowing the full dependency tree. Typical cycle: edit → build fails → read error → fix → new error → fix → ... This "fix loop" takes 10-20 iterations, up to 1 hour and 2M+ tokens for a cross-cutting change. | Agent calls analyze_code_impact before editing to see what will break. modify_code applies changes at entity level with auto-validation (lint before/after). Impact analysis + tracing catch breakage before compilation. Large refactors compile correctly on the first try in most cases. |
| Memory | Agent forgets prior context and recreates functionality that already exists. Or spends hours debugging a function it accidentally disabled. Token waste grows with session length. | Graph provides complete structural context on every call. AutoDoc maintains up-to-date documentation automatically. Agent always sees the current state — no "amnesia" problems. |
| Git | Branch switches and external file changes invalidate the agent's mental model. Stale data causes silent errors. Agent must be explicitly told to re-analyze. | GitWatcher detects file changes and branch switches in real-time. Incremental re-indexing of graph and embeddings happens automatically. Every query returns current data — zero manual intervention. |
Indexing speed
Full indexing of a medium project (~500 files) completes in 3-5 seconds (parallel parsing + batch SQL + streaming embeddings). Large projects like VS Code (~1.8M LOC, 7000+ files) — ~82 seconds including full embedding generation. After that, GitWatcher indexes only changed files — typically under 200ms per change.
Installation
The project is optimized for Bun (an alternative JavaScript runtime) and runs 50% faster with it.
Bun + UltraCode (recommended — one-liner):
# macOS / Linux
curl -fsSL https://bun.sh/install | bash && ~/.bun/bin/bun i -g ultracode && ~/.bun/bin/bun pm -g trust ultracode
# Windows (PowerShell)
irm bun.sh/install.ps1 | iex; bun i -g ultracode; bun pm -g trust ultracode
UltraCode only (Bun already installed):
bun i -g ultracode && bun pm -g trust ultracode
npm (alternative):
npm install -g ultracode
Why two steps for Bun? Some dependencies use postinstall scripts to build native addons:
- cbor-extract — fast native metadata serialization (via cbor-x)
- protobufjs — binary protocol for IPC
- webgpu — Dawn GPU backend for AMD/Intel
Bun blocks postinstall scripts by default. The `bun pm trust` command allows their execution — no reinstall needed. Other native components (oxc-parser, xxhash-wasm, better-sqlite3) ship prebuilt binaries and work without trust.
Note: For full code analysis across languages, the corresponding runtimes are required:
- TypeScript/JavaScript — built-in (TypeScript Compiler API)
- Python — requires Python 3.8+ (`python --version`)
- Java/Kotlin — requires JRE 11+ (`java --version`)
- Go — requires Go 1.18+ (`go version`)
- Rust — requires Rust toolchain (`rustc --version`)
- C# — requires .NET SDK 8+ (`dotnet --version`)
- Zig — built-in (regex-based, no Zig toolchain required)
- C/C++ — requires Clang 12+ (`clang --version`)
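To check which of these runtimes are already on your PATH before indexing, a small illustrative helper (not part of UltraCode) can probe for each binary:

```typescript
import { spawnSync } from "node:child_process";

// Illustrative helper, not part of UltraCode: given a list of runtime
// commands, return the ones not found on PATH. We only care whether the
// binary exists, so a non-zero exit code (e.g. `go --version` instead of
// `go version`) still counts as installed; only a spawn error (ENOENT)
// marks the runtime as missing.
function missingRuntimes(commands: string[]): string[] {
  return commands.filter((cmd) => {
    const result = spawnSync(cmd, ["--version"], { stdio: "ignore" });
    return result.error !== undefined;
  });
}

console.log(missingRuntimes(["python", "java", "go", "rustc", "dotnet", "clang"]));
```

Each name it prints corresponds to a language from the list above whose runtime the server would not find.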
Claude Code Config (~/.claude.json):
{
"mcpServers": {
"ultracode": {
"command": "ultracode"
}
}
}
Configuration: .autodoc/claude.cfg/add-to-CLAUDE.md
Local Model Setup
Local models are used for intelligent tasks: embedding model for semantic search and LLM for AutoDoc. This removes token costs from your main AI agent.
After installation, a setup wizard launches to download and configure everything needed.
Step 1: Embedding Provider (semantic search)
| Provider | Speed | Recommendation |
| --------------- | ------------- | ------------------------------------------------------------------------ |
| vLLM | 1352 emb/s | ⭐ NVIDIA GPU (recommended) |
| TEI | 1169 emb/s | ⭐ NVIDIA GPU (Blackwell: 120-latest image) |
| MLX | ~500 emb/s | ⭐ macOS Apple Silicon (Metal GPU) |
| llama.cpp | 441 emb/s | AMD GPU (Vulkan), universal |
| OVMS Native | 260-326 emb/s | ⭐ CPU / Intel GPU. <br />Can help if main VRAM is occupied by local LLM. |
Note for GTX xx50/xx60 laptops (GPU thermal throttling)
Budget NVIDIA GPUs (GTX 1650/1660, RTX 3050/3060, RTX 4050/4060) on laptops often suffer from power limit throttling, which drops TEI/vLLM embedding throughput by ~1000 emb/s. The GPU hits its power limit (PL1) and clocks down mid-batch.
Fix via ThrottleStop (Windows):
- TPL button → set PL1 to max (55–75 W for laptops), PL2 to max (90–120 W), Turbo Time Limit → 28 sec (max), enable Clamp PL1/PL2 (TPL button turns green)
- Main window → Speed Shift - EPP → `0` (max performance, reduces CPU throttle)
- BD PROCHOT Offset → `0` (disables the CPU thermal trigger for the GPU)
- Limit Reasons → check what's blocking (if "MS Platform" — ignore)
- Apply → save profile. The CPU yields thermal budget to the GPU, and TEI batches stabilize.
This typically gives +1000 emb/s on affected hardware.
Step 2: LLM Provider (AutoDoc, refactoring)
| Provider            | Models                                  | Recommendation                   |
| ------------------- | --------------------------------------- | -------------------------------- |
| Docker Model Runner | Qwen 2.5, DeepSeek R1, Phi-4, Llama 3.2 | ⭐ If Docker Desktop is installed |
| Ollama              | qwen2.5-coder, deepseek-coder, phi4     | Universal option                 |
| Skip                | —                                       | Configure later                  |
The wizard automatically:
- Detects your GPU (NVIDIA Turing/Ampere/Ada/Hopper/Blackwell*)
Suggests optimal models for your hardware