Ori Mnemos
Local-first persistent memory for AI agents. Knowledge graph + ACT-R decay + four-signal retrieval. Agents wake up as themselves. MCP server (15 tools) for Claude, Cursor, Windsurf, Cline.
Open-source cognitive architecture for persistent AI agent memory.
Language models are stateless at every inference call. Without external memory, an agent cannot learn from past sessions, cannot associate across domains, and cannot improve over time. It enters every session with amnesia. The context window is a queue, not a graph — old information falls off the back as new information enters the front. Even persisted on a VPS, an agent has a heartbeat but no hippocampus.
Model intelligence is no longer the bottleneck. The bottleneck is memory — structured, persistent, and efficient enough to scale from 50 notes to 5,000 without degrading retrieval quality or inflating token cost.
Ori implements tenets of human cognition as mathematical models on a knowledge graph. Activation decay from ACT-R. Spreading activation along wiki-link edges. Hebbian co-occurrence from retrieval patterns. Reinforcement learning on retrieval itself. Ebbinghaus forgetting with spaced-repetition strength curves. The system learns what matters, forgets what doesn't, and optimizes its own retrieval pipeline — every session makes it sharper.
The result: an agent with continuous identity across sessions, clients, and machines. Accumulated understanding that persists, connects, and compounds. The model is the engine. The vault becomes the mind.
Markdown on disk. Wiki-links as graph edges. Git as version control. No database lock-in, no cloud dependency, no vendor capture.
v0.4.0 · npm · Apache-2.0
Quick Start
npm install -g ori-memory
ori init my-agent
cd my-agent
Connect to your client:
ori bridge claude-code --scope global --activation auto --vault ~/brain
ori bridge cursor --scope project --activation manual --vault ~/brain
ori bridge generic --scope global --vault ~/brain # any MCP client
Manual MCP config:
{
  "mcpServers": {
    "ori": {
      "command": "ori",
      "args": ["serve", "--mcp", "--vault", "/path/to/brain"],
      "env": { "ORI_VAULT": "/path/to/brain" }
    }
  }
}
Start a session. The agent receives its identity automatically and begins onboarding on first run.
What It Does
- Persistent identity. Agent state (name, personality, goals, methodology) is stored in plain markdown and auto-injected at session start via MCP instructions. Identity survives client switches, machine migrations, and model changes without reconfiguration.
- Knowledge graph. Every `[[wiki-link]]` is a directed edge. PageRank authority, Louvain community detection, betweenness centrality, bridge detection, orphan and dangling link analysis. Structure is queryable through MCP tools and CLI.
- Three memory spaces. Identity (`self/`) decays at 0.1x and barely fades. Knowledge (`notes/`) decays at 1.0x and lives and dies by relevance. Operations (`ops/`) decays at 3.0x, so it burns hot and clears itself. The separation is architectural, not cosmetic.
- Cognitive forgetting. Notes decay using ACT-R base-level learning equations, not arbitrary TTLs. Used notes stay alive. Their neighbors stay warm through spreading activation along wiki-link edges. Structurally critical nodes are protected by Tarjan's algorithm. `ori prune` analyzes the full activation topology before archiving anything.
- Four-signal fusion. Semantic embeddings, BM25 keyword matching, personalized PageRank, and associative warmth fused through score-weighted Reciprocal Rank Fusion. Intent classification (episodic, procedural, semantic, decision) shifts signal weights automatically.
- Dampening pipeline. Three post-fusion stages validated by ablation testing: gravity dampening halves cosine-similarity ghosts with zero query-term overlap, hub dampening applies a P90 degree penalty to prevent map notes from dominating results, and resolution boost surfaces actionable knowledge (decisions, learnings) over passive observation.
- Learning retrieval (v0.4.0). Three intelligence layers improve retrieval quality from session to session, synthesized from 63 research sources. See Retrieval Intelligence below.
- Capture-promote pipeline. `ori add` captures to inbox. `ori promote` classifies (idea, decision, learning, insight, blocker, opportunity), detects links, suggests areas. 50+ heuristic patterns. Optional LLM enhancement.
- Zero cloud dependencies. Local embeddings via all-MiniLM-L6-v2 running in-process. SQLite for vectors and intelligence state. Everything on your filesystem. Zero API keys required for core functionality.
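
To make the decay multipliers concrete, here is a minimal sketch of the ACT-R base-level learning equation, B = ln(Σ t_j^-d), applied per memory space. The function name, the default decay rate `d=0.5` (the conventional ACT-R value), and the choice to scale `d` by the space multiplier are assumptions for illustration; Ori's actual implementation may combine them differently.

```python
import math

# Decay multipliers per memory space, as listed above.
SPACE_DECAY = {"self": 0.1, "notes": 1.0, "ops": 3.0}

def base_level_activation(ages_days, space, d=0.5):
    """ACT-R base-level learning: B = ln(sum_j t_j ** -d).

    ages_days: days elapsed since each past access (each > 0).
    d: base decay rate; 0.5 is the conventional ACT-R default.
    Scaling d by the space multiplier is an assumption of this sketch.
    """
    if not ages_days:
        return float("-inf")  # never accessed, so no activation
    rate = d * SPACE_DECAY[space]
    return math.log(sum(t ** -rate for t in ages_days))

# The same access history (1, 7, and 30 days ago) in each space.
ages = [1.0, 7.0, 30.0]
for space in ("self", "notes", "ops"):
    print(space, round(base_level_activation(ages, space), 3))
```

With an identical access history, the `self/` note keeps the highest activation and the `ops/` note the lowest, which matches the "barely fades" versus "burns hot" description.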
Retrieval Intelligence (v0.4.0)
Three learning layers that improve retrieval quality over time without manual tuning. Synthesized from 63 research sources across reinforcement learning, information retrieval, cognitive science, and bandit theory.
Layer 1 — Q-Value Reranking
Notes earn Q-values from session outcomes via exponential moving average updates. Over time, genuinely useful notes rise and noise sinks.
| Signal | Reward | What triggers it |
|--------|--------|-----------------|
| Forward citation | +1.0 | You [[link]] a retrieved note in new content |
| Update after retrieval | +0.5 | You edit a note you just retrieved |
| Downstream creation | +0.6 | You create a new note after retrieving |
| Within-session re-recall | +0.4 | Same note surfaces across different queries |
| Dead end (top-3, no follow-up) | −0.15 | Retrieved in top 3 but nothing follows |
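
The reward table can be read as input to an exponential moving average. The sketch below uses the reward values from the table; the signal keys, the function name, the learning rate `alpha=0.1`, and the choice to apply one EMA step per signal are hypothetical.

```python
# Reward values from the table above; the dictionary keys are illustrative names.
REWARDS = {
    "forward_citation": 1.0,
    "update_after_retrieval": 0.5,
    "downstream_creation": 0.6,
    "re_recall": 0.4,
    "dead_end": -0.15,
}

def update_q(q, signals, alpha=0.1):
    """Apply one EMA step per observed signal: q <- (1 - alpha) * q + alpha * reward.

    alpha is a hypothetical learning rate, not a documented Ori setting.
    """
    for signal in signals:
        q = (1 - alpha) * q + alpha * REWARDS[signal]
    return q

# A note that was cited and then edited in the same session.
print(round(update_q(0.0, ["forward_citation", "update_after_retrieval"]), 3))
```

Because each step moves the value only a fraction of the way toward the reward, a single dead end cannot erase a long record of useful retrievals.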
After RRF fusion, Phase B reranks the candidate set with a lambda blend of similarity score and learned Q-value, plus a UCB-Tuned exploration bonus that ensures under-retrieved notes still get discovered. Exposure-aware correction prevents the same notes from dominating every session. A cumulative bias cap (MAX=3.0, compression=0.3) prevents runaway score inflation.
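
A rough sketch of the blend described above, assuming the standard UCB1-Tuned exploration term. The blend weight `lam`, the bonus scale `c`, the fixed bonus for never-retrieved notes, and the candidate field names are all hypothetical; exposure correction and the cumulative bias cap are left out.

```python
import math

def ucb_tuned_bonus(n_total, n_i, var_i, unseen_bonus=1.0):
    """UCB1-Tuned exploration term: sqrt((ln n / n_i) * min(1/4, V_i)).

    V_i = var_i + sqrt(2 ln n / n_i). unseen_bonus is a hypothetical
    finite value for notes that have never been retrieved.
    """
    if n_i == 0:
        return unseen_bonus
    log_term = math.log(max(n_total, 1)) / n_i
    v = var_i + math.sqrt(2 * log_term)
    return math.sqrt(log_term * min(0.25, v))

def rerank(candidates, n_total, lam=0.7, c=0.1):
    """Sort candidates by lam * sim + (1 - lam) * q + c * exploration bonus.

    Each candidate is a dict with keys: id, sim, q, pulls, var.
    """
    def score(cand):
        bonus = ucb_tuned_bonus(n_total, cand["pulls"], cand["var"])
        return lam * cand["sim"] + (1 - lam) * cand["q"] + c * bonus
    return sorted(candidates, key=score, reverse=True)

candidates = [
    {"id": "often-seen", "sim": 0.8, "q": 0.5, "pulls": 50, "var": 0.0},
    {"id": "rarely-seen", "sim": 0.8, "q": 0.5, "pulls": 1, "var": 0.0},
]
print([cand["id"] for cand in rerank(candidates, n_total=100)])
```

With equal similarity and Q-value, the rarely retrieved note ranks first because its exploration bonus is larger.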
Layer 2 — Co-Occurrence Edges
Notes that are retrieved together grow edges between them — Hebbian learning on the knowledge graph. Edge weights are computed using NPMI normalization (genuine association beyond base rate), GloVe power-law frequency scaling, and Ebbinghaus decay with strength accumulation (frequently co-retrieved pairs decay slower).
Per-node Turrigiano homeostasis prevents hub notes from absorbing all edge weight. Bibliographic coupling bootstraps day-0 edges from existing wiki-link structure before any queries have been run.
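
The three edge-weight ingredients named above can be sketched as follows. NPMI and the GloVe weighting function (`x_max=100`, exponent 0.75) follow their standard definitions; multiplying the three terms together, the `base_strength` of 7 days, and the logarithmic strength growth are assumptions of this sketch, and homeostasis is omitted.

```python
import math

def npmi(n_xy, n_x, n_y, n_total):
    """Normalized pointwise mutual information in [-1, 1]; 0 means independence."""
    if n_xy == 0:
        return -1.0
    p_xy, p_x, p_y = n_xy / n_total, n_x / n_total, n_y / n_total
    if p_xy == 1.0:
        return 1.0
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)

def glove_weight(count, x_max=100, alpha=0.75):
    """GloVe-style power-law scaling, capped at 1 for frequent pairs."""
    return min(1.0, (count / x_max) ** alpha)

def retention(days_since, strength):
    """Ebbinghaus forgetting curve: R = exp(-t / S)."""
    return math.exp(-days_since / strength)

def edge_weight(n_xy, n_x, n_y, n_total, days_since, base_strength=7.0):
    # Hypothetical strength rule: more co-retrievals means slower decay.
    strength = base_strength * (1 + math.log1p(n_xy))
    association = max(0.0, npmi(n_xy, n_x, n_y, n_total))
    return association * glove_weight(n_xy) * retention(days_since, strength)

print(round(edge_weight(10, 10, 10, 100, days_since=0), 3))
print(round(edge_weight(10, 10, 10, 100, days_since=30), 3))
```

Two notes that are each retrieved often but only coincidentally together get an NPMI near zero, so their edge stays weak no matter how large the raw count is.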
The combined wiki-link + co-occurrence graph feeds a Personalized PageRank walk (HippoRAG, α=0.5) that surfaces notes semantic search alone would never find.
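
A minimal power-iteration sketch of Personalized PageRank over a weighted graph. Treating α=0.5 as the restart probability is an assumption (it could equally denote the damping factor), and the graph layout, function name, and note names are illustrative.

```python
def personalized_pagerank(graph, seeds, alpha=0.5, iters=50):
    """graph: {node: {neighbor: weight}}; seeds: {node: weight}.

    alpha is treated as the restart probability (an assumption).
    Mass reaching a node with no out-edges returns to the seeds.
    """
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs} | set(seeds)
    total = sum(seeds.values())
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: alpha * restart[n] for n in nodes}
        for u in nodes:
            out = graph.get(u, {})
            w_sum = sum(out.values())
            if w_sum == 0:
                for n in nodes:
                    nxt[n] += (1 - alpha) * rank[u] * restart[n]
                continue
            for v, w in out.items():
                nxt[v] += (1 - alpha) * rank[u] * w / w_sum
        rank = nxt
    return rank

# Wiki-link and co-occurrence edges merged into one weighted graph.
graph = {"query-hit": {"linked": 1.0}, "linked": {"two-hops": 1.0}, "unrelated": {}}
ranks = personalized_pagerank(graph, seeds={"query-hit": 1.0})
print({name: round(score, 3) for name, score in sorted(ranks.items())})
```

The note two hops away receives a nonzero score even though it never matched the query, while the disconnected note stays at zero.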
Layer 3 — Stage Meta-Learning
Each pipeline stage (BM25, PageRank, warmth, hub dampening, Q-reranking, co-occurrence PPR) is wrapped in a LinUCB contextual bandit with an 8-dimensional query feature vector. The system learns which stages help for which query types and auto-skips stages that consistently hurt.
Each stage makes a three-way decision: run, skip, or abstain (stop the pipeline early). Cost-sensitive thresholds ensure expensive stages face a higher bar. Essential stages (semantic search, RRF fusion) are never skipped. An ACQO two-phase curriculum runs all stages during exploration (the first 50 samples), then optimizes.
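
As a sketch of the bandit idea, the class below implements one disjoint LinUCB model in plain Python, keeping the inverse matrix current with the Sherman-Morrison update. It models only a run-versus-skip choice against a cost threshold; the abstain decision and the exploration curriculum are omitted, and the class name, `should_run`, `alpha`, and the feature vectors are hypothetical.

```python
import math

class LinUCBArm:
    """Disjoint LinUCB model estimating the benefit of running one stage."""

    def __init__(self, dim, alpha=0.5):
        self.alpha = alpha
        # A^-1 starts as the identity (ridge prior); b accumulates reward * x.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec))) for row in self.a_inv]

    def ucb(self, x):
        theta = self._a_inv_times(self.b)  # theta = A^-1 b
        mean = sum(t * xi for t, xi in zip(theta, x))
        ax = self._a_inv_times(x)
        width = math.sqrt(max(0.0, sum(a * xi for a, xi in zip(ax, x))))
        return mean + self.alpha * width

    def update(self, x, reward):
        # Sherman-Morrison: (A + x x^T)^-1 = A^-1 - (A^-1 x)(A^-1 x)^T / (1 + x^T A^-1 x)
        ax = self._a_inv_times(x)
        denom = 1.0 + sum(a * xi for a, xi in zip(ax, x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        for i, xi in enumerate(x):
            self.b[i] += reward * xi

def should_run(arm, x, cost=0.0):
    """Run the stage only if its optimistic estimate clears the cost threshold."""
    return arm.ucb(x) > cost

# 8-dimensional query features; here a one-hot "query type" for illustration.
helps = [1.0] + [0.0] * 7
hurts = [0.0, 1.0] + [0.0] * 6
stage = LinUCBArm(dim=8)
for _ in range(20):
    stage.update(helps, reward=1.0)
    stage.update(hurts, reward=-1.0)
print(should_run(stage, helps), should_run(stage, hurts))
```

Raising `cost` for an expensive stage is one simple way to express the "higher bar" mentioned above.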
Session Learning Loop
Query → Retrieve → Use (cite, update, create) → Reward signals
              ↓                                      ↓
  Co-occurrence edges grow          Q-values update (session-end batch)
              ↓                                      ↓
  Stage meta-learner updates        Better retrieval next session
All updates happen in a single SQLite transaction at session end, in order: co-occurrence → Q-values → stage learning.
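
The single-transaction behavior can be illustrated with Python's built-in `sqlite3` module. The table names, columns, and function below are hypothetical and do not reflect Ori's actual schema; the point is that the three writes commit or roll back together, in the stated order.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cooccurrence (
    a TEXT, b TEXT, weight REAL NOT NULL, PRIMARY KEY (a, b));
CREATE TABLE IF NOT EXISTS q_values (note TEXT PRIMARY KEY, q REAL NOT NULL);
CREATE TABLE IF NOT EXISTS stage_stats (
    stage TEXT PRIMARY KEY, pulls INTEGER NOT NULL, reward REAL NOT NULL);
"""

def commit_session(conn, edges, q_updates, stage_updates):
    """Write all session-end updates atomically.

    The connection context manager commits on success and rolls back
    if any statement raises, so a failed session leaves no partial state.
    """
    with conn:
        # Order from the text: co-occurrence, then Q-values, then stage learning.
        conn.executemany("INSERT OR REPLACE INTO cooccurrence VALUES (?, ?, ?)", edges)
        conn.executemany("INSERT OR REPLACE INTO q_values VALUES (?, ?)", q_updates)
        conn.executemany("INSERT OR REPLACE INTO stage_stats VALUES (?, ?, ?)", stage_updates)

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
commit_session(conn, [("note-a", "note-b", 0.4)], [("note-a", 0.3)], [("bm25", 12, 0.2)])
print(conn.execute("SELECT COUNT(*) FROM cooccurrence").fetchone()[0])
```

If the Q-value write fails, the co-occurrence rows written a moment earlier are rolled back as well.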
The Stack
Layer 5: MCP Server                     15 tools, 5 resources (any agent talks to this)
Layer 4: Retrieval Intelligence         Q-value reranking, co-occurrence learning, stage meta-optimization
Layer 3: Dampening Pipeline             gravity, hub, resolution (ablation-validated)
Layer 2: Four-Signal Fusion             semantic + BM25 + PageRank + warmth → score-weighted RRF
Layer 1: Knowledge Graph + Vitality     wiki-links, ACT-R decay, spreading activation, zone classification
Layer 0: Markdown files on disk         git-friendly, human-readable, portable
15 MCP tools · 5 resources · 16 CLI commands · 579 tests
Token Economics
Without retrieval, every question requires dumping the entire vault into context. With Ori, the cost stays flat.
| Vault Size | Without Ori | With Ori | Savings |
|:----------:|:-----------:|:--------:|:-------:|
| 50 notes | 10,100 tokens | 850 tokens | 91% |
| 200 notes | 40,400 tokens | 850 tokens | 98% |
| 1,000 notes | 202,000 tokens | 850 tokens | 99.6% |
| 5,000 notes | 1,010,000 tokens | 850 tokens | 99.9% |
A typical session costs ~$0.10 with Ori. Without it: ~$6.00+.
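
The token figures in the table follow from simple arithmetic. The "Without Ori" column implies about 202 tokens per note (10,100 / 50), and the "With Ori" cost is a flat 850 tokens. The constant and function names below are illustrative.

```python
TOKENS_PER_NOTE = 202    # implied by the table: 10,100 tokens / 50 notes
RETRIEVAL_TOKENS = 850   # flat per-query cost with retrieval

def savings(n_notes):
    """Return (full-vault token count, fraction saved by retrieval)."""
    full = n_notes * TOKENS_PER_NOTE
    return full, 1 - RETRIEVAL_TOKENS / full

for n in (50, 200, 1000, 5000):
    full, saved = savings(n)
    print(f"{n:>5} notes: {full:>9,} tokens, {saved:.1%} saved")
```

The 50-note row works out to about 91.6%, which the table shows as 91%.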
Architecture
Any MCP Client
(Claude, Cursor, Windsurf,
Cline, custom agents, VPS)
│
MCP Protocol
(stdio / JSON-RPC)
│
┌───────────────────┐
│ Ori MCP Server │
│ │
│ instructions │ identity auto-injected
│ resources (5) │ ori:// endpoints
│ tools (15) │ full memory operations
└─────────┬─────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌───────────┐ ┌───────────┐ ┌───────────┐
│ Knowledge │ │ Identity │ │Operations │
│ Graph │ │ Layer │ │ Layer │
│ │ │ │ │ │
│ notes/ │ │ self/ │ │ ops/ │
│ inbox/
