<img src="https://img.shields.io/badge/python-3.11+-blue?style=flat-square&logo=python&logoColor=white" alt="Python 3.11+">
<img src="https://img.shields.io/badge/license-MIT-green?style=flat-square" alt="MIT License">
<img src="https://img.shields.io/badge/MCP-native-purple?style=flat-square" alt="MCP Native">
<img src="https://img.shields.io/badge/LLM-Anthropic%20%7C%20OpenRouter%20%7C%20Ollama-orange?style=flat-square" alt="Multi-LLM">
<img src="https://img.shields.io/badge/storage-SQLite-lightgrey?style=flat-square&logo=sqlite" alt="SQLite">

<h1 align="center">ROOT</h1>

Personal knowledge graph with entity extraction, GraphRAG, and MCP integration. Turn your scattered notes into searchable intelligence that AI tools can query in real time.

<img src="assets/hero.png" alt="ROOT - Ask a question, see the graph, get a cited answer" width="800">

<a href="#what-root-does">What It Does</a> • <a href="#how-it-works">How It Works</a> • <a href="#quick-start">Quick Start</a> • <a href="#18-mcp-tools">Tools</a> • <a href="#llm-backends">LLM Backends</a> • <a href="#use-cases">Use Cases</a>

What ROOT Does

You have knowledge everywhere. Obsidian vault. Meeting transcripts. Email threads. Slack messages. When you need to answer "Who influences this project?" or "How did this decision evolve?", you end up searching each system by hand and holding the mental model in your head.

ROOT fixes this. It turns your scattered notes, meetings, and emails into an interconnected, queryable intelligence layer. Think of it as a "second brain" that actually understands the relationships between people, projects, decisions, and events across your entire professional life.

> root_ask("What decisions were made about FoS Simplification?")

# ROOT Answer

Leadership APPROVED the FoS Simplification project on March 17, 2026.
The FoS list was locked and finalized at the kick-off meeting on March 23.
Scope: consolidate 1,541 fields of study down to 131 across 16 groups,
based on the Times Higher Education framework. Owner: Sadia Hamid (coordinator),
Marco/Simon (engineering). Sprint start: April 7.

*Based on 5 search results and 2 entity matches.*

The Value

Ask questions across all your knowledge. Instead of manually searching through thousands of notes, ask ROOT in plain English. It synthesizes answers with citations from every source.

See connections you'd never find manually. ROOT builds an entity graph. It knows that "Sadia" from the kick-off meeting is the same "Sadia Hamid" from the planning session, and that she's connected to "Simon" via an implementation dependency. You can traverse these connections across hundreds of notes.
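
The alias idea can be illustrated with a minimal sketch. It assumes a plain lookup table. The names `ALIASES` and `resolve` are hypothetical and are not ROOT's actual API, and the real system stores aliases in SQLite rather than a dict.

```python
# Hypothetical alias table: every surface form maps to one canonical entity name.
ALIASES = {
    "sadia": "Sadia Hamid",
    "sadia hamid": "Sadia Hamid",
    "fos": "Field of Study",
    "field of study": "Field of Study",
}


def resolve(mention: str) -> str:
    """Return the canonical entity name for a mention, or the mention itself if unknown."""
    cleaned = mention.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


# Mentions taken from different notes collapse onto the same graph node.
mentions = ["Sadia", "Sadia Hamid", "FoS", "Simon"]
canonical = [resolve(m) for m in mentions]
print(canonical)
```

Because both "Sadia" and "Sadia Hamid" resolve to one name, relations extracted from separate notes attach to a single entity instead of two.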

Meeting intelligence without manual notes. Your meetings get ingested and entity-extracted. After a week of meetings, ask "What did I commit to this week?" and get an answer that spans all of them.

What Makes It Different From Searching Obsidian

| Obsidian search | ROOT |
|----------------|------|
| Keyword matching | Semantic understanding ("lead decline" finds "traffic drop" notes) |
| Shows files | Shows synthesized answers with citations |
| No entity awareness | Knows people, projects, decisions as first-class objects |
| No cross-source | Combines vault + meetings + emails + Slack |
| Manual navigation | Traverses relationship graph automatically |
| One note at a time | Aggregates across hundreds of notes per query |

How It Works

ROOT has a four-step pipeline: index, extract entities, build the knowledge graph, query.

┌──────────────────────────────────────────────────────────────┐
│                      DATA SOURCES                             │
│                                                               │
│  Obsidian Vault        meetings (Granola)        emails       │
│  2,500+ notes          auto or manual            Gmail MCP    │
│  auto every 2h         via root_ingest           via ingest   │
└──────────────┬───────────────┬───────────────┬───────────────┘
               │               │               │
               ▼               ▼               ▼
┌──────────────────────────────────────────────────────────────┐
│           STEP 1: INDEXING (free, runs locally)               │
│                                                               │
│  Content hashing (SHA-256) for incremental updates            │
│  Markdown-aware chunking (splits on headings)                 │
│  Local embeddings: all-MiniLM-L6-v2 (384 dims, CPU)          │
│  Stored in SQLite + sqlite-vec                                │
│                                                               │
│  Cost: $0. No API calls. ~2 min for 2,500 notes.             │
└──────────────┬───────────────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────────────┐
│           STEP 2: ENTITY EXTRACTION (pennies/day)             │
│                                                               │
│  For each new/changed note, an LLM extracts:                  │
│                                                               │
│  Entities: people, projects, decisions, events,               │
│            concepts, organizations                            │
│  Relations: works_with, owns, decided, discussed,             │
│             blocked_by, depends_on, manages, etc.             │
│  Confidence: 0.9+ explicit, 0.7 implied, 0.5 weak signals    │
│  Aliases: "Fredrik" = "Frederick", "FoS" = "Field of Study"  │
│                                                               │
│  Model: Claude Haiku (~$0.003 per note)                       │
│  Daily cost: pennies (only changed notes reprocessed)         │
└──────────────┬───────────────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────────────┐
│           STEP 3: KNOWLEDGE GRAPH (stored in SQLite)          │
│                                                               │
│  ┌───────────┐     ┌────────────┐     ┌────────────┐        │
│  │ entities  │────▶│ relations  │◀────│  aliases   │        │
│  │  13,000+  │     │  20,000+   │     │   8,000+   │        │
│  └───────────┘     └────────────┘     └────────────┘        │
│       │                                                       │
│       ▼                                                       │
│  ┌──────────────┐   ┌───────────────┐                        │
│  │ entity-note  │   │    notes      │                        │
│  │   links      │   │ with chunks   │                        │
│  │  28,000+     │   │  & embeddings │                        │
│  └──────────────┘   └───────────────┘                        │
│                                                               │
│  Graph traversal via recursive CTEs. <10ms at depth 2.        │
│  Everything in one SQLite file. No Postgres, no Neo4j.        │
└──────────────┬───────────────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────────────┐
│           STEP 4: QUERY (on demand, via MCP)                  │
│                                                               │
│  root_ask combines all three layers:                          │
│  1. Semantic search finds the 5 most relevant chunks          │
│  2. Entity graph pulls the neighborhood of mentioned entities │
│  3. Claude Sonnet synthesizes a cited, natural language answer │
│                                                               │
│  This consistently outperforms pure vector search for         │
│  multi-hop questions ("Who decided X and what happened next?")│
│                                                               │
│  Cost: ~$0.01 per query                                       │
└──────────────────────────────────────────────────────────────┘
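
The graph traversal in Step 3 can be sketched with a recursive CTE in SQLite. This is an illustration under assumptions, not ROOT's actual schema: the `entities` and `relations` tables, their columns, the sample rows, and the `neighborhood` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE relations (source_id INTEGER, target_id INTEGER, relation_type TEXT);
    INSERT INTO entities VALUES (1, 'Sadia Hamid'), (2, 'FoS Simplification'),
                                (3, 'Simon'), (4, 'Marco');
    INSERT INTO relations VALUES (1, 2, 'owns'), (3, 1, 'depends_on'), (3, 4, 'works_with');
""")

NEIGHBORHOOD_SQL = """
WITH RECURSIVE
  edges(a, b) AS (             -- treat every relation as undirected
    SELECT source_id, target_id FROM relations
    UNION
    SELECT target_id, source_id FROM relations
  ),
  walk(id, depth) AS (
    SELECT id, 0 FROM entities WHERE name = :start
    UNION                      -- drops duplicate (id, depth) rows
    SELECT edges.b, walk.depth + 1
    FROM walk JOIN edges ON edges.a = walk.id
    WHERE walk.depth < :max_depth   -- the depth bound guarantees termination
  )
SELECT e.name, MIN(w.depth) AS hops
FROM walk AS w JOIN entities AS e ON e.id = w.id
GROUP BY e.id
ORDER BY hops, e.name
"""


def neighborhood(conn, start, max_depth=2):
    """Return (name, hops) for every entity within max_depth hops of start."""
    params = {"start": start, "max_depth": max_depth}
    return conn.execute(NEIGHBORHOOD_SQL, params).fetchall()


for name, hops in neighborhood(conn, "Sadia Hamid", 2):
    print(hops, name)
```

Keeping the traversal in SQL means the whole neighborhood comes back in one query against the same SQLite file that holds the notes and embeddings.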

The Two-Model Strategy

Haiku ($0.80 input / $4 output per MTok): bulk extraction. Runs on every new or changed note, and is cheap enough to process thousands.
Sonnet ($3 input / $15 output per MTok): synthesis. Runs only when you ask a question, where higher-quality reasoning helps connect the dots.

Embeddings are always free and local. Only entity extraction and root_ask use the LLM.
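
To see how the per-MTok prices translate into per-call costs, here is a small worked sketch. The token counts are assumptions chosen for illustration, not measurements from ROOT, and `call_cost` is a hypothetical helper.

```python
# Prices in USD per million tokens as (input, output), taken from the figures above.
PRICES = {"haiku": (0.80, 4.00), "sonnet": (3.00, 15.00)}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one LLM call at the per-MTok prices above."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Assumed token counts (illustrative only): one note for extraction, one root_ask query.
extraction = call_cost("haiku", input_tokens=2_500, output_tokens=250)
query = call_cost("sonnet", input_tokens=2_000, output_tokens=300)

print(f"extraction per note: ${extraction:.4f}")
print(f"root_ask per query:  ${query:.4f}")
```

Under these assumed sizes, extraction comes to about $0.003 per note and a query to about $0.0105, in line with the figures quoted in the pipeline diagram. Longer notes or larger retrieved contexts scale the cost proportionally.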

Quick Start

# Clone and setup
git clone https://github.com/mshadmanrahman/root-kg.git
cd root-kg
python -m venv .venv && source .venv/bin/activate
pip install -e .

# Interactive setup wizard
python -m root init

# Index your notes (~2 min for 2,500 notes)
python indexer.py

# Extract entities (~$3-5 on Anthropic Haiku, or free with Ollama)
python indexer.py --extract

# Register as MCP server in Claude Code
claude mcp add root -- python server.py

# Try it
# root_search("your topic")
# root_ask("your question")
# root_graph("person name", 2)

Running Costs

| Activity | Frequency | Cost |
|----------|-----------|------|
| Initial vault index (embeddings) | Once | $0 (local model) |
| Initial entity extraction | Once (~2,500 notes) | ~$3-5 (Haiku) or $0 (Ollama) |
| Incremental re-index | Every 2 hours | $0 (local) |
| Incremental extraction | Every 2 hours, only changed notes | ~$0.01-0.05/day |
| Queries via root_ask | On-demand | ~$0.01/query (Sonnet) |
| Monthly estimate | | $1-3 |
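
The incremental rows stay cheap because only new or changed notes are reprocessed, which the indexing step detects with SHA-256 content hashes. Below is a minimal sketch of that check. It assumes an in-memory dict in place of ROOT's SQLite storage, and `content_hash`, `changed_notes`, and the sample paths are hypothetical names used for illustration.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 hex digest of a note's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_notes(notes: dict[str, str], seen: dict[str, str]) -> list[str]:
    """Return paths whose content is new or changed, updating `seen` in place."""
    changed = []
    for path, text in notes.items():
        digest = content_hash(text)
        if seen.get(path) != digest:
            seen[path] = digest
            changed.append(path)
    return changed


seen: dict[str, str] = {}  # path -> last indexed hash (stand-in for a database table)
vault = {"projects/fos.md": "Scope locked.", "people/sadia.md": "Coordinator."}

first_run = changed_notes(vault, seen)    # every note is new on the first pass
vault["projects/fos.md"] = "Scope locked. Sprint starts April 7."
second_run = changed_notes(vault, seen)   # only the edited note is returned
print(first_run, second_run)
```

Only the paths returned by such a check would need re-embedding and a new extraction call, so a quiet day with a handful of edits costs a handful of Haiku calls.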

18 MCP Tools

Search & Discovery

root_search(query)              Semantic search across all notes
root_search_folder(query, dir)  Search within a specific folder
root_note(path)                 Read full note content
root_stats()                    Index health and statistics
root_connections(path)          Cross-do
