ReasonDB
The first database built to let AI agents think their way to the right answer using structural reasoning, rather than guessing based on vector similarity.
Install / Use
/learn @brainfish-ai/ReasonDB

README
ReasonDB is an AI-native document database built in Rust, designed to go beyond simple retrieval. While traditional databases and vector stores treat documents as data to be indexed, ReasonDB treats them as knowledge to be understood - preserving document structure, enabling LLM-guided traversal, and extracting precise answers with full context.
ReasonDB introduces Hierarchical Reasoning Retrieval (HRR), a fundamentally new architecture where the LLM doesn't just consume retrieved content - it actively navigates your document structure to find exactly what it needs, like a human expert scanning summaries, drilling into relevant sections, and synthesizing answers.
ReasonDB is not another vector database. It's a reasoning engine that preserves document hierarchy, enabling AI to traverse your knowledge the way a domain expert would.
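The expert-style drill-down can be pictured with a toy tree of summarized nodes. Everything below is illustrative only: the `Node` shape and the keyword check standing in for the LLM relevance judge are assumptions for the sketch, not ReasonDB's actual types or API.

```python
from dataclasses import dataclass, field

# Hypothetical node type; ReasonDB's real internal representation is not shown here.
@dataclass
class Node:
    summary: str
    content: str = ""
    children: list["Node"] = field(default_factory=list)

def traverse(node: Node, is_relevant) -> list[str]:
    """Depth-first walk that only descends into branches whose summary the
    relevance judge accepts (an LLM in ReasonDB, a keyword check here)."""
    if not node.children:
        return [node.content] if is_relevant(node.summary) else []
    hits: list[str] = []
    for child in node.children:
        if is_relevant(child.summary):
            hits.extend(traverse(child, is_relevant))
    return hits

# Toy document tree: a contract with two top-level sections.
doc = Node("Service agreement", children=[
    Node("Payment terms", "Invoices are due within 30 days."),
    Node("Termination clause", "Either party may exit with 60 days notice."),
])

print(traverse(doc, lambda s: "termination" in s.lower()))
# Only the termination branch is expanded; the payment branch is never read.
```

The point of the structure-first walk is that the termination clause is found by navigating headings and summaries, even though its wording shares little vocabulary with a question about "exit terms".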
Key features of ReasonDB include:
- Hierarchical Reasoning Retrieval: LLM-guided tree traversal with parallel beam search - AI navigates document structure instead of relying on similarity matching
- RQL Query Language: SQL-like syntax with built-in SEARCH(BM25) and REASON(LLM) clauses in a single query
- Plugin Architecture: Extensible extraction pipeline - PDF, Office, images, audio, and URLs out of the box via MarkItDown
- Multi-Provider LLM Support: Anthropic, OpenAI, Gemini, Cohere, Vertex AI, AWS Bedrock, and more — switch providers without code changes
- Production Ready: ACID-compliant storage, API key auth, rate limiting, async parallel traversal - all in a single Rust binary
Contents
- What is ReasonDB?
- The Problem
- Benchmark
- Insurance Policy Analyser — Live Demo
- How It Works
- Quick Start
- Interactive Tutorials
- Query with RQL
- Plugin Architecture
- Use Cases
- Tech Stack
- Documentation
- Community
- Contributing
- License
The Problem
AI agents today are limited by their databases:
| Approach | What It Does | Why It Fails |
|----------|--------------|--------------|
| Vector DBs | Finds "similar" chunks | Loses structure. A contract's termination clause isn't "similar" to your question about exit terms - but it's the answer. |
| RAG Pipelines | Retrieves then generates | Garbage in, garbage out. Wrong chunks retrieved means wrong answers, no matter how capable the LLM. |
| Knowledge Graphs | Maps explicit relationships | Requires manual entity extraction. Can't handle the messy reality of real documents. |
The result? AI agents that hallucinate, miss critical context, or drown in irrelevant chunks.
ReasonDB solves this by letting the LLM reason through your documents - not just search them.
Benchmark
Results on a real-world insurance document corpus (4 policy documents, ~1,900 nodes, 12 queries across 6 complexity tiers). Full benchmark script: tutorials/data/insurance/benchmark.py.
Retrieval quality vs. typical RAG
| Metric | ReasonDB | Typical RAG |
|---|---|---|
| Pass rate | 100% (12 / 12) | 55 – 70% |
| Context recall (term match) | 90% avg | 60 – 75% |
| Median latency (RQL REASON) | 6.1 s | 15 – 45 s |
"Typical RAG" = chunked-retrieval pipelines (LlamaIndex / LangChain defaults) on the same corpus. ReasonDB uses BM25 candidate selection + LLM-guided hierarchical tree traversal instead of flat similarity matching.
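A "term match" context-recall metric like the one reported above could be computed along these lines. This is a plausible sketch only; the exact definition in benchmark.py may differ, and the query and answer strings are invented:

```python
def term_recall(expected_terms: list[str], answer: str) -> float:
    """Fraction of expected key terms that appear in the retrieved answer,
    using case-insensitive substring matching. A guess at how a 'term match'
    recall metric might work, not necessarily benchmark.py's definition."""
    if not expected_terms:
        return 0.0
    answer_lower = answer.lower()
    found = sum(1 for term in expected_terms if term.lower() in answer_lower)
    return found / len(expected_terms)

score = term_recall(
    ["waiting period", "30 days", "pre-existing condition"],
    "Claims for pre-existing conditions have a waiting period of 90 days.",
)
print(round(score, 2))  # 0.67: two of the three expected terms matched
```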
Per-category breakdown
| Category | Avg latency | Term recall | Pass |
|---|---|---|---|
| Simple | 7.1 s | 100% | 2 / 2 |
| Specific | 5.9 s | 75% | 2 / 2 |
| Multi-condition | 5.6 s | 83% | 2 / 2 |
| Comparative | 6.2 s | 100% | 2 / 2 |
| Multi-hop | 6.5 s | 83% | 2 / 2 |
| Synthesis | 6.5 s | 100% | 2 / 2 |
Cross-section reference retrieval
ReasonDB detects and follows intra-document cross-references during ingestion (LLM-extracted during summarization) and surfaces the referenced sections alongside primary results. This closes the "answer is split across two clauses" gap that defeats flat-chunk retrieval.
| Metric | Value |
|---|---|
| Queries with ≥ 1 cross-ref surfaced | 4 / 5 |
| Avg recall, primary content only | 62% |
| Avg recall, primary + cross-refs | 80% (+18 pp) |
| Example gain | Recurrent disability query: 67% → 100% once the cross-referenced policy schedule clause is included |
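The effect of surfacing cross-referenced sections can be illustrated with a toy coverage check. The clause text, expected terms, and numbers here are invented for illustration; they are not the benchmark's data:

```python
def coverage(expected_terms: list[str], texts: list[str]) -> float:
    """Fraction of expected terms found anywhere in the supplied texts
    (case-insensitive substring match)."""
    blob = " ".join(texts).lower()
    hits = sum(1 for term in expected_terms if term.lower() in blob)
    return hits / len(expected_terms)

expected = ["recurrent disability", "benefit period", "policy schedule"]
primary = ["A recurrent disability within 6 months resumes the prior claim."]
crossref = ["The policy schedule sets the benefit period at 24 months."]

# Primary section alone answers only part of the question; following the
# cross-reference to the policy schedule closes the gap.
print(round(coverage(expected, primary), 2))             # 0.33
print(round(coverage(expected, primary + crossref), 2))  # 1.0
```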
Insurance Policy Analyser — Live Demo
The benchmark above is powered by this tutorial app. It queries four insurance policy documents using REASON and shows the full traversal trace — which nodes the LLM visited, why it selected them, and how it synthesized the final answer.
Full tutorial source: tutorials/06-insurance/

How It Works
```mermaid
%%{init: {'theme': 'dark'}}%%
flowchart TD
    subgraph Ingestion["Ingestion Pipeline (Plugin-Driven)"]
        A["Documents / URLs"] -->|Extractor Plugin| B["Markdown"]
        B -->|Post-Processor Plugin| C["Cleaned Markdown"]
        C -->|Chunker| D["Semantic Chunks"]
        P["Pre-chunked JSON"] -->|"bypasses extract + chunk"| D
        D --> E["Build Hierarchical Tree"]
        E -->|Bottom-up| F["LLM Summarizes Each Node"]
    end
    subgraph Search["Search & Reasoning"]
        G["Natural Language Query"] --> G1["BM25 Candidates + Title Boost"]
        G1 --> G2["Recursive Tree-Grep Pre-Filter"]
        G2 --> H["LLM Ranks by Summaries + Match Signals"]
        H -->|Selects relevant branches| I["Traverse Tree"]
        I -->|Parallel beam search| J["Drill Into Leaf Nodes"]
    end
    subgraph Result["Response"]
        J --> K["Extract Answer"]
        K --> L["Confidence Score + Reasoning Path"]
    end
    Ingestion --> Search
```
- Extract - Extractor plugins convert documents and URLs to Markdown (built-in: MarkItDown)
- Chunk - Content is split into semantic chunks with heading detection — or bypass entirely with pre-chunked JSON via /ingest/chunks
- Build Tree - Chunks are organized into a hierarchical tree structure, preserving per-chunk metadata (page numbers, line ranges, custom attributes)
- Summarize - LLM generates summaries for each node (bottom-up); pre-supplied summaries are used as-is
- Search - 4-phase pipeline: BM25 candidate selection → recursive tree-grep filtering → LLM summary ranking → parallel beam-search traversal
- Return - Relevant content with extracted answers, confidence scores, and the full reasoning path
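The later phases of the pipeline above (summary ranking plus parallel beam-search drill-down) can be sketched as a level-by-level traversal that keeps only the top-scoring branches at each depth. The token-overlap scorer below stands in for ReasonDB's BM25 and LLM ranking, and every name here is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    summary: str
    content: str = ""
    children: list["Node"] = field(default_factory=list)

def overlap(query: str, text: str) -> float:
    """Stand-in relevance score: fraction of query tokens present in the text.
    ReasonDB uses BM25 signals plus LLM summary ranking at this step."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q)

def beam_search(root: Node, query: str, beam_width: int = 2) -> list[str]:
    """Expand the tree one level at a time, keeping only the beam_width
    best-scoring branches, and collect the leaf content that survives."""
    frontier, leaves = [root], []
    while frontier:
        next_frontier = []
        for node in frontier:
            if not node.children:
                leaves.append(node)          # reached a leaf on a kept branch
            else:
                next_frontier.extend(node.children)
        ranked = sorted(next_frontier,
                        key=lambda n: overlap(query, n.summary), reverse=True)
        frontier = ranked[:beam_width]       # prune to the beam
    return [leaf.content for leaf in leaves]

# Toy policy tree with two competing branches.
doc = Node("Policy", children=[
    Node("Exclusions", children=[
        Node("War exclusion", "Losses from war are excluded."),
        Node("Fraud exclusion", "Fraudulent claims are void."),
    ]),
    Node("Benefits", children=[
        Node("Disability benefit", "Pays 75% of salary during disability."),
    ]),
])

print(beam_search(doc, "what disability benefit is paid"))
```

With a beam width above one, several branches are explored concurrently, which is what lets multi-hop and comparative queries pull content from more than one section in a single pass.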
Quick Start
Get from zero to intelligent document search in under 5 minutes.
Download pre-built binaries
Grab the latest release for your platform:
| Platform | Architecture | Download |
|----------|-------------|----------|
| macOS | Apple Silicon (M1/M2/M3/M4) | aarch64-apple-darwin |
| Linux | x86_64 | [x86_64-unknown-linux-gnu](ht |
