Mcpunk
MCP tools for Roaming RAG
MCPunk 🤖
Chat with your codebase without embeddings by giving the LLM tools to search your code intelligently.
MCPunk lets you explore and understand codebases through conversation. It works by:
- Breaking files into logical chunks (functions, classes, markdown sections)
- Giving the LLM tools to search and query these chunks
- Letting the LLM find the specific code it needs to answer your questions
No embeddings, no complex configuration - just clear, auditable searching that you can see and guide. It works great with Claude Desktop, or any other MCP client.
<a href="https://glama.ai/mcp/servers/3j9kxuocc6"> <img width="380" height="200" src="https://glama.ai/mcp/servers/3j9kxuocc6/badge" alt="MCPunk MCP server" /> </a>
Built with the following in mind
- Context is King - LLMs can be great but only if provided with appropriate context.
- Context is Precious - LLMs need context, but they can't handle too much. A travesty! MCPunk is RAG that inherently provides the LLM contextual hints, allowing the LLM to really narrow in on only the relevant content.
- Human in the Loop - You can see exactly what data the LLM has considered and how it found it. You can jump into chat and direct things wherever you want.
Setup
These are instructions for Claude Desktop, but MCPunk can be used anywhere MCP is used.
- Install uv
- Put the snippet below in your claude_desktop_config.json (Details about claude_desktop_config.json including location)
- Restart Claude Desktop and you should see the tools available after a small delay
- Chat away: "hey pal can you set up the ~/git/mcpunk project and tell me how python files are chunked"
("command": "uvx" might not work, and you may need to use the full path to uvx instead, e.g. "command": "/Users/michael/.local/bin/uvx")
{
  "mcpServers": {
    "MCPunk": {
      "command": "uvx",
      "args": ["mcpunk"]
    }
  }
}
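If plain "uvx" is not found by Claude Desktop, one way to produce the same snippet with a resolved path is sketched below. This is a minimal sketch, assuming uvx is on the PATH of the shell where you run it; the helper name `mcpunk_server_entry` is hypothetical and not part of MCPunk.

```python
import json
import shutil
from typing import Optional


def mcpunk_server_entry(command: Optional[str] = None) -> dict:
    """Build the mcpServers entry shown above (hypothetical helper)."""
    # shutil.which returns the path of the executable found on PATH, or None.
    # Fall back to the bare name if uvx cannot be located.
    resolved = command or shutil.which("uvx") or "uvx"
    return {"mcpServers": {"MCPunk": {"command": resolved, "args": ["mcpunk"]}}}


if __name__ == "__main__":
    # Print JSON that can be pasted into claude_desktop_config.json.
    print(json.dumps(mcpunk_server_entry(), indent=2))
```

If your config file already lists other servers, merge only the "MCPunk" key into the existing "mcpServers" object rather than overwriting the file.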

Usage Examples
Answer Questions About Your Codebase
- [User] Hey pal can you please set up the ~/git/mcpunk repo, then help me understand how a user of mcpunk would get the git diff of the currently checked out branch in the repo that they have configured with mcpunk
- [Claude] Call configure_project
- [Claude] Call list_all_files_in_project
- [Claude] Call find_files_by_chunk_content
- [Claude] ... makes a total of 21 tool calls with MCPunk.
- [Claude] Now I understand how the git diff functionality works in mcpunk.
Let me summarize how a user would get the git diff of the currently checked out branch:
To get a git diff of the currently checked out branch in a repo configured with mcpunk:
- First, the user needs to configure mcpunk to work with their project by using the configure_project tool
- To see what branches were recently checked out (to identify what to diff against), call list_most_recently_checked_out_branches
- Then, to get the diff between the current branch (HEAD) and a reference branch, call diff_with_ref. This will display the differences between the current branch (HEAD) and the specified reference branch. The ref parameter should be the name of the base branch (like "main" or "develop") that you want to compare against.
- [User] Great job pal!
PR Review
- [User] Hey pal can you please set up the ~/git/mcpunk repo and have a look at the current branch versus scratch/1.5
- [Claude] Sets up ~/git/mcpunk and calls diff_with_ref for ref scratch/1.5
- [Claude] I see you've made several changes, primarily adding a new test file tests/test_git_analysis.py...
- [User] How does this fit into the rest of the codebase?
- [Claude] Makes 5 tool calls for listing files by chunks, listing chunks, fetching chunk details
- [Claude] I see - the code parses git reflog output to track recently checked out branches. The main function get_recent_branches takes a repo path and returns the last N branches that were checked out, using _branches_from_reflog as a helper to parse the raw reflog output.
- [User] If you were in a rush and had to direct someone to review this work, what would you direct them to look at? Please outline discrete pieces of work
- [Claude]
  - The split("to ")[-1] in _branches_from_reflog looks fragile
  - Test data only shows simple branch names, what about spaces/special chars?
  - Is example_reflog representative of real git reflog output?
  - Are we testing against actual git behavior?
- [User] You're the best!
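The reflog parsing discussed in this review can be illustrated with a small sketch. It is not MCPunk's actual get_recent_branches or _branches_from_reflog code; the function name `recent_branches_from_reflog` is hypothetical, and the sketch assumes the input is the text output of `git reflog`, where checkouts are recorded as "checkout: moving from A to B" with the most recent entry first. A regular expression is used here in place of split("to ")[-1].

```python
import re

# Message git records in the reflog when HEAD moves between branches.
_CHECKOUT_RE = re.compile(r"checkout: moving from (\S+) to (\S+)$")


def recent_branches_from_reflog(reflog_text: str, limit: int = 5) -> list[str]:
    """Return up to `limit` distinct checkout targets, most recent first."""
    seen: list[str] = []
    for line in reflog_text.splitlines():
        match = _CHECKOUT_RE.search(line.strip())
        if match is None:
            continue  # commits, rebases, resets and so on are ignored
        target = match.group(2)
        if target not in seen:
            seen.append(target)
        if len(seen) >= limit:
            break
    return seen


example_reflog = """\
abc1234 HEAD@{0}: checkout: moving from main to feature/reflog
def5678 HEAD@{1}: commit: Add tests
0123abc HEAD@{2}: checkout: moving from scratch/1.5 to main
4567def HEAD@{3}: checkout: moving from main to scratch/1.5
"""

print(recent_branches_from_reflog(example_reflog))
# ['feature/reflog', 'main', 'scratch/1.5']
```

Whether detached-HEAD checkouts (where the target is a commit hash) should count as branches is left open in this sketch.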
Work across multiple codebases
You can just ask your LLM to set up multiple projects, and it can freely query across them. Handy if one depends on the other, and they're in different repos. In this case the LLM should recognise this via imports.
What is MCPunk & Other Background
MCPunk is an MCP server that provides tools to:
- Configure a project, which is a directory of files. When configured, the files will be split into logical chunks. MCPunk is built for code, but really it could be applied to any documents, even images if you want to.
- Search for files in a project containing specific text
- Search for chunks in a file containing specific text
- View the full contents of a specific chunk
Along with this, it provides a few chunkers built in. The most mature is the Python chunker.
MCPunk doesn't have to be used for conversation. It can be used as part of code review in a CI pipeline, for example. It's really general RAG.
sequenceDiagram
participant User
participant Claude as Claude Desktop
participant MCPunk as MCPunk Server
participant Files as File System
Note over User,Files: Setup Phase
User->>Claude: Ask question about codebase
Claude->>MCPunk: configure_project(root_path, project_name)
MCPunk->>Files: Scan files in root directory
Note over MCPunk,Files: Chunking Process
MCPunk->>MCPunk: For each file, apply appropriate chunker:
MCPunk->>MCPunk: - PythonChunker: functions, classes, imports
MCPunk->>MCPunk: - MarkdownChunker: sections by headings
MCPunk->>MCPunk: - VueChunker: template/script/style sections
MCPunk->>MCPunk: - WholeFileChunker: fallback
MCPunk->>MCPunk: Split chunks >10K chars into parts
MCPunk-->>Claude: Project configured with N files
Note over User,Files: Navigation Phase<br>(LLM freely uses all these tools repeatedly to drill in)
Claude->>MCPunk: list_all_files_in_project(project_name)
MCPunk-->>Claude: File tree structure
Claude->>MCPunk: find_files_by_chunk_content(project_name, "search term")
MCPunk-->>Claude: Files containing matching chunks
Claude->>MCPunk: find_matching_chunks_in_file(project_name, file_path, "search term")
MCPunk-->>Claude: List of matching chunk IDs in file
Claude->>MCPunk: chunk_details(chunk_id)
MCPunk-->>Claude: Full content of specific chunk
Claude->>User: Answer based on relevant code chunks
Note over User,Files: Optional Git Analysis
Claude->>MCPunk: list_most_recently_checked_out_branches(project_name)
MCPunk->>Files: Parse git reflog
MCPunk-->>Claude: List of recent branches
Claude->>MCPunk: diff_with_ref(project_name, "main")
MCPunk->>Files: Generate git diff
MCPunk-->>Claude: Diff between HEAD and reference
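The "Split chunks >10K chars into parts" step in the diagram can be sketched as follows. This is an illustration only: the diagram gives the 10K threshold, but splitting on line boundaries and the function name `split_into_parts` are assumptions, not a description of how MCPunk actually splits.

```python
def split_into_parts(text: str, max_chars: int = 10_000) -> list[str]:
    """Split text into parts of at most max_chars, preferring line boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if len(text) <= max_chars:
        return [text]

    parts: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit is cut into fixed-size pieces.
        while len(line) > max_chars:
            if current:
                parts.append("".join(current))
                current, size = [], 0
            parts.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new part when adding this line would exceed the limit.
        if current and size + len(line) > max_chars:
            parts.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        parts.append("".join(current))
    return parts
```

Joining the returned parts reproduces the original text, so no content is lost by splitting.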
Roaming RAG Crash Course
See
- https://arcturus-labs.com/blog/2024/11/21/roaming-rag--make-the-model-find-the-answers/
- https://simonwillison.net/2024/Dec/6/roaming-rag/
The gist of roaming RAG is
- Break down content (a codebase, pdf files, whatever) into "chunks". Each chunk is a "small" logical item like a function, a section in a markdown document, or all imports in a code file.
- Provide the LLM tools to search chunks. MCPunk does this by providing tools to search for files containing chunks with specific text, and to list the full contents of a specific chunk.
Compared to more traditional "vector search" RAG:
- The LLM has to drill down to find chunks, and naturally is aware of their broader context (like what file they're in)
- Chunks should always be coherent. Like a full function.
- You can see exactly what the LLM is searching for, and it's generally obvious if it's searching poorly and you can help it out by suggesting improved search terms.
- Requires exact search matching. MCPunk is NOT providing fuzzy search of any kind.
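The drill-down described above can be sketched with a toy in-memory project. The three functions borrow the names of MCPunk's tools for readability, but the `Chunk` type, the function bodies, and the use of case-sensitive substring matching are assumptions made for illustration, not MCPunk's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    id: str
    name: str
    content: str


# file path -> chunks in that file (hypothetical in-memory layout)
Project = dict[str, list[Chunk]]


def find_files_by_chunk_content(project: Project, needle: str) -> list[str]:
    """Broad step: which files have at least one chunk containing the text?"""
    return [
        path
        for path, chunks in project.items()
        if any(needle in chunk.content for chunk in chunks)
    ]


def find_matching_chunks_in_file(project: Project, path: str, needle: str) -> list[str]:
    """Narrow step: IDs of chunks in one file that contain the text."""
    return [chunk.id for chunk in project.get(path, []) if needle in chunk.content]


def chunk_details(project: Project, chunk_id: str) -> str:
    """Final step: full content of a single chunk."""
    for chunks in project.values():
        for chunk in chunks:
            if chunk.id == chunk_id:
                return chunk.content
    raise KeyError(chunk_id)


project: Project = {
    "git_analysis.py": [
        Chunk("c1", "<imports>", "import subprocess"),
        Chunk("c2", "get_recent_branches", "def get_recent_branches(repo): ..."),
    ],
    "README.md": [Chunk("c3", "Setup", "Install uv, then restart Claude Desktop")],
}

files = find_files_by_chunk_content(project, "get_recent_branches")
ids = find_matching_chunks_in_file(project, files[0], "get_recent_branches")
print(files, ids, chunk_details(project, ids[0]))
```

Because matching is exact, a search for "recent branch" (with a space) finds nothing here, which is the kind of miss you can spot in the chat and correct with a better search term.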
Chunks
A chunk is a subsection of a file. For example,
- A single python function
- A markdown section
- All the imports from a Python file
Chunks are created from a file by chunkers, and MCPunk comes with a handful built in.
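As a rough illustration of what a chunker does, here is a sketch that splits markdown into sections by heading. It is not MCPunk's MarkdownChunker; the function name `chunk_markdown` and the "(preamble)" label for text before the first heading are assumptions, and the sketch does not special-case "#" lines inside fenced code blocks.

```python
import re

# ATX-style headings: one to six '#' characters followed by whitespace.
_HEADING_RE = re.compile(r"^#{1,6}\s+(.*)$")


def chunk_markdown(text: str) -> list[tuple[str, str]]:
    """Split markdown into (heading, section_text) pairs."""
    chunks: list[tuple[str, str]] = []
    title = "(preamble)"
    lines: list[str] = []
    for line in text.splitlines(keepends=True):
        match = _HEADING_RE.match(line)
        if match:
            # A new heading closes the section collected so far.
            if lines:
                chunks.append((title, "".join(lines)))
            title = match.group(1).strip()
            lines = [line]
        else:
            lines.append(line)
    if lines:
        chunks.append((title, "".join(lines)))
    return chunks
```

Each section keeps its own heading line, so a chunk stays readable on its own when fetched.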
When a project is set up in MCPunk, it goes through all files and applies the first applicable chunker to it. The LLM can then use tools to (1) query for files containing chunks with specific text in them, (2) query all chunks in a specific file, and (3) fetch the full contents of a chunk.
This basic foundation enables Claude to effectively navigate relatively large codebases by starting with a broad search for relevant files and narrowing in on relevant areas.
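The "first applicable chunker" rule can be sketched as an ordered list of candidates with a catch-all at the end. The chunker names match the built-in ones, but the registry structure, the `pick_chunker` function, and the exact ordering are assumptions for illustration.

```python
from pathlib import Path
from typing import Callable

# (chunker name, applicability check) in priority order.
# The whole-file fallback accepts anything, so it must come last.
CHUNKERS: list[tuple[str, Callable[[Path], bool]]] = [
    ("PythonChunker", lambda p: p.suffix == ".py"),
    ("VueChunker", lambda p: p.suffix == ".vue"),
    ("MarkdownChunker", lambda p: p.suffix == ".md"),
    ("WholeFileChunker", lambda p: True),
]


def pick_chunker(path: Path) -> str:
    """Return the name of the first chunker that applies to the file."""
    for name, applies in CHUNKERS:
        if applies(path):
            return name
    raise LookupError(f"no chunker applies to {path}")


for example in ("src/app.py", "ui/App.vue", "README.md", "Makefile"):
    print(example, "->", pick_chunker(Path(example)))
```

With this design, adding support for a new file type means inserting an entry ahead of the fallback.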
Built-in chunkers:
- PythonChunker chunks things into classes, functions, file-level imports, and file-level statements (e.g. globals). Applicable to files ending in .py
- VueChunker chunks into 'template', 'script', 'style' chunks - or whatever top-level <blah>....</blah> items exist. Applicable to files ending in .vue
- MarkdownChunker chunks things into markdown sections (by heading). Applicable to files ending in .md
- WholeFileChunker fallback chunker that creates a single chunk for the entire file. Applicable to any
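To make the Python chunking categories concrete, here is a sketch built on the standard library ast module. It is not MCPunk's PythonChunker; the function name `chunk_python` and the "<imports>" and "<module_level_statements>" labels are assumptions, and only top-level items are considered.

```python
import ast


def chunk_python(source: str) -> dict[str, str]:
    """Map chunk name -> source text for the top-level items of a module."""
    tree = ast.parse(source)
    chunks: dict[str, str] = {}
    imports: list[str] = []
    statements: list[str] = []
    for node in tree.body:
        # Source text for this node (decorator lines are not included).
        segment = ast.get_source_segment(source, node) or ""
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            imports.append(segment)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks[node.name] = segment
        else:
            statements.append(segment)  # globals, constants, main guards
    if imports:
        chunks["<imports>"] = "\n".join(imports)
    if statements:
        chunks["<module_level_statements>"] = "\n".join(statements)
    return chunks


source = (
    "import os\n"
    "from sys import path\n"
    "\n"
    "LIMIT = 3\n"
    "\n"
    "def f(x):\n"
    "    return x + LIMIT\n"
    "\n"
    "class C:\n"
    "    pass\n"
)
for name, text in chunk_python(source).items():
    print(f"--- {name}\n{text}")
```

Grouping all imports into one chunk mirrors the "all the imports from a Python file" example given earlier, so a search for a module name leads to a single small chunk per file.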
