Sirchmunk

🐿️ Sirchmunk: Raw data to self-evolving intelligence, real-time.


<div align="center"> <img src="web/public/logo-v2.png" alt="Sirchmunk Logo" width="250" style="border-radius: 15px;">

Sirchmunk: Raw data to self-evolving intelligence, real-time.

<a href="https://trendshift.io/repositories/22808" target="_blank"><img src="https://trendshift.io/api/badge/repositories/22808" alt="modelscope%2Fsirchmunk | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

Python FastAPI Next.js TailwindCSS DuckDB License ripgrep-all OpenAI Kreuzberg MCP

📖 Documentation

Quick Start · Key Features · MCP Server · Web UI · Docker · How it Works · FAQ

</div> <div align="center">

🔍 Agentic Search  •  🧠 Knowledge Clustering  •  📊 Monte Carlo Evidence Sampling<br>Indexless Retrieval  •  🔄 Self-Evolving Knowledge Base  •  💬 Real-time Chat

</div> <br>



🌰 Why “Sirchmunk”?

Intelligence pipelines built upon vector-based retrieval can be rigid and brittle. They rely on static vector embeddings that are expensive to compute, blind to real-time changes, and detached from the raw context. We introduce Sirchmunk to usher in a more agile paradigm, where data is no longer treated as a snapshot, and insights can evolve together with the data.


✨ Key Features

1. EmbeddingDB-Free: Data in its Purest Form

Sirchmunk works directly with raw data -- bypassing the heavy overhead of squeezing your rich files into fixed-dimensional vectors.

  • Instant Search: No complex pre-processing pipelines and no hours-long indexing; just drop your files and search immediately.
  • Full Fidelity: Zero information loss. Results stay true to your data, with no vector approximation.
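
The drop-and-search idea above can be illustrated with a minimal sketch. This is not Sirchmunk's implementation: the function name `search_raw_files` and its parameters are hypothetical, and a plain case-insensitive substring scan stands in for whatever search backend the project actually uses. The only point is that nothing is indexed or embedded ahead of time, so a newly added file is searchable on the very next call.

```python
from pathlib import Path


def search_raw_files(root, keyword, max_hits=20):
    """Scan text files under `root` for `keyword` with no index or embeddings.

    Hypothetical helper for illustration only; returns a list of
    (file path, line number, stripped line) tuples.
    """
    hits = []
    needle = keyword.lower()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((str(path), lineno, line.strip()))
                if len(hits) >= max_hits:
                    return hits
    return hits
```

Because every call reads the files as they are at that moment, there is no re-indexing step to fall out of date. The trade-off is that each query pays the cost of the scan.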

2. Self-Evolving: A Living Index

Data is a stream, not a snapshot. Sirchmunk is dynamic by design, whereas a vector DB can become obsolete the moment your data changes.

  • Context-Aware: Evolves in real-time with your data context.
  • LLM-Powered Autonomy: Designed for Agents that perceive data as it lives, utilizing token-efficient reasoning that triggers LLM inference only when necessary to maximize intelligence while minimizing cost.

3. Intelligence at Scale: Real-Time & Massive

Sirchmunk bridges massive local repositories and the web with high-scale throughput and real-time awareness. <br/> It serves as a unified intelligent hub for AI agents, delivering deep insights across vast datasets at the speed of thought.

For more technical details, refer to the Sirchmunk blog.


Traditional RAG vs. Sirchmunk

<div style="display: flex; justify-content: center; width: 100%;"> <table style="width: 100%; max-width: 900px; border-collapse: separate; border-spacing: 0; overflow: hidden; border-radius: 12px; font-family: sans-serif; border: 1px solid rgba(128, 128, 128, 0.2); margin: 0 auto;"> <colgroup> <col style="width: 25%;"> <col style="width: 30%;"> <col style="width: 45%;"> </colgroup> <thead> <tr style="background-color: rgba(128, 128, 128, 0.05);"> <th style="text-align: left; padding: 16px; border-bottom: 2px solid rgba(128, 128, 128, 0.2); font-size: 1.3em;">Dimension</th> <th style="text-align: left; padding: 16px; border-bottom: 2px solid rgba(128, 128, 128, 0.2); font-size: 1.3em; opacity: 0.7;">Traditional RAG</th> <th style="text-align: left; padding: 16px; border-bottom: 2px solid rgba(58, 134, 255, 0.5); color: #3a86ff; font-weight: 800; font-size: 1.3em;">✨Sirchmunk</th> </tr> </thead> <tbody> <tr> <td style="padding: 16px; font-weight: 600; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">💰 Setup Cost</td> <td style="padding: 16px; opacity: 0.6; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">High Overhead <br/> (VectorDB, GraphDB, Complex Document Parser...)</td> <td style="padding: 16px; background-color: rgba(58, 134, 255, 0.08); color: #4895ef; border-bottom: 1px solid rgba(128, 128, 128, 0.1);"> ✅ Zero Infrastructure <br/> <small style="opacity: 0.8; font-size: 0.85em;">Direct-to-data retrieval without vector silos</small> </td> </tr> <tr> <td style="padding: 16px; font-weight: 600; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">🕒 Data Freshness</td> <td style="padding: 16px; opacity: 0.6; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">Stale (Batch Re-indexing)</td> <td style="padding: 16px; background-color: rgba(58, 134, 255, 0.08); color: #4895ef; border-bottom: 1px solid rgba(128, 128, 128, 0.1);"> ✅ Instant &amp; Dynamic <br/> <small style="opacity: 0.8; font-size: 0.85em;">Self-evolving index that reflects live changes</small> 
</td> </tr> <tr> <td style="padding: 16px; font-weight: 600; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">📈 Scalability</td> <td style="padding: 16px; opacity: 0.6; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">Linear Cost Growth</td> <td style="padding: 16px; background-color: rgba(58, 134, 255, 0.08); color: #4895ef; border-bottom: 1px solid rgba(128, 128, 128, 0.1);"> ✅ Extremely low RAM/CPU consumption <br/> <small style="opacity: 0.8; font-size: 0.85em;">Native Elastic Support, efficiently handles large-scale datasets</small> </td> </tr> <tr> <td style="padding: 16px; font-weight: 600; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">🎯 Accuracy</td> <td style="padding: 16px; opacity: 0.6; border-bottom: 1px solid rgba(128, 128, 128, 0.1);">Approximate Vector Matches</td> <td style="padding: 16px; background-color: rgba(58, 134, 255, 0.08); color: #4895ef; border-bottom: 1px solid rgba(128, 128, 128, 0.1);"> ✅ Deterministic &amp; Contextual <br/> <small style="opacity: 0.8; font-size: 0.85em;">Hybrid logic ensuring semantic precision</small> </td> </tr> <tr> <td style="padding: 16px; font-weight: 600;">⚙️ Workflow</td> <td style="padding: 16px; opacity: 0.6;">Complex ETL Pipelines</td> <td style="padding: 16px; background-color: rgba(58, 134, 255, 0.08); color: #4895ef;"> ✅ Drop-and-Search <br/> <small style="opacity: 0.8; font-size: 0.85em;">Zero-config integration for rapid deployment</small> </td> </tr> </tbody> </table> </div>

Demonstration

<div align="center"> <video controls autoplay muted loop playsinline width="100%" src="https://github.com/user-attachments/assets/704dbc0a-3df6-436a-b7f7-fb1edefbfb8c"></video> <p style="font-size: 1.1em; font-weight: 600; margin-top: 8px; color: #00bcd4;"> Access files directly to start chatting </p> </div>

| WeChat Group | DingTalk Group |
|:--------------:|:----------------:|
| <img src="assets/pic/wechat.jpg" width="200" height="200"> | <img src="assets/pic/dingtalk.png" width="200" height="200"> |


🎉 News

  • 🚀 Mar 20, 2026: Sirchmunk v0.0.6post1

    • 🐿️x🦞OpenClaw skill: Sirchmunk is now available as an OpenClaw skill on ClawHub — any OpenClaw-compatible agent can search local files via natural language. See openclaw-recipe for details.
    • Search API: New SSE streaming endpoint (POST /api/v1/search/stream) for real-time log output; concurrency control via SIRCHMUNK_MAX_CONCURRENT_SEARCHES; paths parameter now accepts both string and array, and is optional (falls back to SIRCHMUNK_SEARCH_PATHS).
    • Dependency fix: sirchmunk serve no longer requires sirchmunk[web]; uvicorn is now a core dependency, and psutil is now optional.
  • 🚀 Mar 12, 2026: Sirchmunk v0.0.6

    • Multi-turn conversation: Context management with LLM query rewriting; configs CHAT_HISTORY_MAX_TURNS / CHAT_HISTORY_MAX_TOKENS; default search token budget 128K
    • Document summarization & cross-lingual retrieval: Summarization pipeline (chunk/merge/rerank), cross-lingual keyword extraction, chat-history relevance filtering
    • Docker: SIRCHMUNK_SEARCH_PATHS env support; updated entrypoint; document-processing dependencies
    • OpenAI client: _ProviderProfile for multi-provider management; aut
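
The SSE streaming endpoint listed under v0.0.6post1 could be consumed roughly as follows. Treat this as a sketch under assumptions: only the path `POST /api/v1/search/stream` and the optional `paths` parameter (string or array) come from the notes above. The `query` field name, the base URL, and the helper names `iter_sse_data` and `stream_search` are hypothetical, and the payload format of each event is not specified here, so it is passed through as raw text.

```python
import json
import urllib.request


def iter_sse_data(lines):
    """Yield the data payload of each Server-Sent Event from text lines.

    An event ends at a blank line; multiple `data:` lines are joined with
    newlines. An unterminated trailing event is dropped.
    """
    buffer = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # strip the single optional leading space
            buffer.append(value)
        # comment lines (":...") and other fields are ignored in this sketch


def stream_search(base_url, query, paths=None):
    """POST a search request and yield streamed event payloads as text."""
    body = {"query": query}  # "query" is an assumed field name
    if paths is not None:
        body["paths"] = paths  # a string or a list of strings; may be omitted
    request = urllib.request.Request(
        base_url + "/api/v1/search/stream",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        decoded = (raw.decode("utf-8") for raw in response)
        for payload in iter_sse_data(decoded):
            yield payload
```

When `paths` is omitted, the server is expected to fall back to SIRCHMUNK_SEARCH_PATHS, as described in the release note. A caller would iterate over `stream_search(...)` with the address of a running `sirchmunk serve` instance and print each payload as it arrives.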