# Multi-Agent Memory

A multi-agent memory MCP that connects agents across systems and machines.
Multi-Agent Memory gives your AI agents a shared brain that works across machines, tools, and frameworks. Store a fact from Claude Code on your laptop, recall it from an OpenClaw agent on your server, and get a briefing from n8n — all through the same memory system.
Born from a production setup where OpenClaw agents, Claude Code, and n8n workflows needed to share memory across separate machines. Nothing existed that did this well, so we built it.
## What's New in v2.3
- Multi-Path Retrieval with RRF Fusion — Search now runs three retrieval paths in parallel: vector (semantic similarity), keyword (BM25 full-text via Postgres tsvector or SQLite FTS5), and graph (BFS spreading activation through entity relationships). Results are merged using Reciprocal Rank Fusion (RRF). Exact names and technical terms now surface reliably even when embeddings miss them. Entity relationships that were previously stored but unused are now a first-class retrieval signal. Feature-flagged via `MULTI_PATH_SEARCH=true` (default on). Use `format=full` in `brain_search` to see which paths contributed to each result. Includes a backfill script (`scripts/backfill-keyword-index.js`) for existing memories.
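Conceptually, RRF scores each candidate by the reciprocal of its rank in each retrieval path and sums those scores across paths. A minimal sketch (a hypothetical helper, not the server's actual implementation, using the common k = 60 smoothing constant):

```javascript
// Reciprocal Rank Fusion: merge several ranked lists of memory IDs.
// Each ID scores 1 / (k + rank) per list it appears in; scores sum
// across lists, and the fused ordering is by descending total score.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) || 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}
```

The useful property: a memory ranked moderately by all three paths can outrank one that tops only a single path, which is why keyword hits and graph neighbors now influence the final ordering instead of being discarded.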
## What Was New in v2.2
- Noise-Free Entity Extraction — v2.2 filters out CSS properties, code identifiers, shell commands, sentence fragments, French prose, and generic phrases. Pattern-based filtering with 50+ generic noun/adjective blocklists. Includes a retroactive cleanup script (`scripts/cleanup-garbage-entities.js`) to purge existing noise.
- Per-Client Knowledge Base — Fingerprint-based client identification with accent normalization. One tool call (`brain_client`) returns everything known about a client: brand, strategy, meetings, content, technical details, relationships. Fuzzy name resolution ("AL" resolves to "acme-loans").
- Gemini Embedding 2 — Task-type-aware embeddings at 3072 dimensions. Uses `RETRIEVAL_DOCUMENT` for storage and `RETRIEVAL_QUERY` for search. Matryoshka support for flexible dimensionality (3072/1536/768).
- Import/Export — Full backup and migration support. Export all memories as JSON, import with automatic deduplication. Never lose data when switching embedding providers again.
- Webhook Notifications — Real-time dispatch when memories are stored, superseded, or deleted. Fire-and-forget to any HTTP endpoint.
- Entity Relationship Graph — Track how entities connect through co-occurrence. Interactive D3.js visualization with dark theme, force-directed layout, search, and PNG export.
- Auto-Resolve Client Context — Memories without an explicit client_id are automatically tagged via fingerprint matching against the content.
- Smarter Consolidation — The 6-hour LLM pass reclassifies knowledge categories and infers entity relationship types (contact_of, same_owner, uses, works_on, competitor_of). Supports OpenAI, Anthropic, Gemini, and Ollama as consolidation LLM providers.
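The Matryoshka dimensionality mentioned above works by truncating a full embedding to its leading dimensions and re-normalizing to unit length. A sketch (a hypothetical helper, not this repo's code):

```javascript
// Matryoshka-style truncation: cut a 3072-dim vector down to, say,
// 768 dims and re-normalize, trading some accuracy for storage.
function truncateEmbedding(vec, dim) {
  const cut = vec.slice(0, dim);
  const norm = Math.sqrt(cut.reduce((s, x) => s + x * x, 0)) || 1;
  return cut.map((x) => x / norm); // unit length again
}
```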
## The Problem
You run multiple AI agents — Claude Code for development, OpenClaw for autonomous tasks, n8n for automation. They each maintain their own context and forget everything between sessions. When one agent discovers something important, the others never learn about it.
Existing solutions are either single-machine only, require paid cloud services, or treat memory as a flat key-value store without understanding that a fact and an event are fundamentally different things.
## Quick Start

```bash
# 1. Clone the repo
git clone https://github.com/ZenSystemAI/multi-agent-memory.git
cd multi-agent-memory

# 2. Configure
cp .env.example .env
# Edit .env — set BRAIN_API_KEY and QDRANT_API_KEY
# For embeddings: GEMINI_API_KEY (free tier) or OPENAI_API_KEY

# 3. Start services
docker compose up -d

# 4. Verify
curl http://localhost:8084/health
# {"status":"ok","service":"shared-brain","timestamp":"..."}

# 5. Store your first memory
curl -X POST http://localhost:8084/memory \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_KEY" \
  -d '{
    "type": "fact",
    "content": "The API uses port 8084 by default",
    "source_agent": "my-agent",
    "key": "api-default-port"
  }'
```
## Features

### Typed Memory with Mutation Semantics
Not all memories are equal. Multi-Agent Memory understands four distinct types, each with its own lifecycle:
| Type | Behavior | Use Case |
|------|----------|----------|
| event | Append-only. Immutable historical record. | "Deployment completed", "Workflow failed" |
| fact | Upsert by key. New facts supersede old ones. | "API status: healthy", "Client prefers dark mode" |
| status | Update-in-place by subject. Latest wins. | "build-pipeline: passing", "migration: in-progress" |
| decision | Append-only. Records choices and reasoning. | "Chose Postgres over MySQL because..." |
### Memory Lifecycle

```
Store ──> Dedup Check ──> Supersedes Chain ──> Confidence Decay ──> LLM Consolidation
  │           │                  │                    │                    │
  │      Exact match?      Same key/subject?    Score drops over     Groups, merges,
  │      Return existing   Mark old inactive    time without access  finds insights
  │                                                                        │
  └──────────────────────── Vector + Structured DB ────────────────────────┘
```
Deduplication — Content is SHA-256 hashed on storage. Exact duplicates are caught and return the existing memory. The consolidation engine also catches near-duplicates at 92% semantic similarity.
Supersedes — When you store a fact with the same key as an existing fact, the old one is marked inactive and the new one links back to it. Same pattern for statuses by subject. Old versions remain searchable but rank lower.
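In outline, the supersedes chain for facts looks something like this (an illustrative in-memory sketch, not the server's implementation):

```javascript
// Storing a fact whose key matches an active fact deactivates the old
// record and links the new one back to it; old versions stay in the store.
function upsertFact(memories, mem) {
  const old = memories.find((m) => m.key === mem.key && m.active);
  if (old) {
    old.active = false;       // superseded: still searchable, ranks lower
    mem.supersedes = old.id;  // chain back to the previous version
  }
  memories.push({ ...mem, active: true });
  return memories;
}
```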
Confidence Decay — Facts and statuses lose confidence over time if not accessed (configurable, default 2%/day). Events and decisions don't decay — they're historical records. Accessing a memory resets its decay clock. Search results are ranked by similarity * confidence.
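Under the stated defaults, decay is a simple exponential in days since last access. A sketch of the formula as described (hypothetical helper names, not the repo's API):

```javascript
// 2%/day exponential decay for facts and statuses; events and decisions
// are historical records and never decay.
function currentConfidence(mem, daysSinceAccess, dailyDecay = 0.02) {
  if (mem.type === "event" || mem.type === "decision") return mem.confidence;
  return mem.confidence * Math.pow(1 - dailyDecay, daysSinceAccess);
}

// Search ranking: vector similarity weighted by decayed confidence.
function rankScore(similarity, confidence) {
  return similarity * confidence;
}
```

So a fact untouched for 30 days drops to roughly 55% of its original confidence, while a freshly accessed one ranks at full strength.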
LLM Consolidation — A periodic background process (configurable, default every 6 hours) analyzes unconsolidated memories via LLM to find duplicates to merge, contradictions to flag, connections between memories, cross-memory insights, and named entities to extract and normalize.
### Entity Extraction & Linking
Every memory automatically extracts named entities at storage time — clients, technologies, workflows, people, domains, and agents. Two extraction paths compound over time:
- Fast path (every write) — Regex + known-tech dictionary + alias cache lookup. Sub-millisecond, no LLM call, non-blocking. Always extracts `client_id` and `source_agent` as entities. Catches technology names (70+ built-in), domain names, quoted references, and capitalized proper nouns. v2.2 adds aggressive noise filtering: it rejects CSS properties, HTML attributes, camelCase/snake_case code identifiers, shell commands, error codes, sentence fragments, and generic adjective+noun phrases.
- Smart path (every consolidation) — The LLM discovers entities the regex missed, normalizes aliases (so "acme-corp", "ACME", and "Acme Corporation" resolve to one canonical entity), and classifies types. Discovered aliases feed back into the fast-path alias cache, so extraction gets smarter over time.
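The fast path can be illustrated with a deliberately tiny sketch (hypothetical, far simpler than the real extractor): a known-tech dictionary scan plus a capitalized-proper-noun regex, with a small blocklist standing in for v2.2's noise filtering.

```javascript
// Toy dictionaries; the real extractor ships 70+ technologies and
// 50+ blocklist entries.
const KNOWN_TECH = new Set(["docker", "postgres", "qdrant", "n8n"]);
const BLOCKLIST = new Set(["the", "important", "general"]);

function extractEntities(content) {
  const found = new Set();
  // Dictionary pass: case-insensitive technology-name hits.
  for (const word of content.toLowerCase().match(/[a-z0-9-]+/g) || []) {
    if (KNOWN_TECH.has(word)) found.add(word);
  }
  // Proper-noun pass: capitalized words not already caught or blocklisted.
  for (const m of content.match(/\b[A-Z][a-zA-Z]{2,}\b/g) || []) {
    const w = m.toLowerCase();
    if (!BLOCKLIST.has(w) && !KNOWN_TECH.has(w)) found.add(m);
  }
  return [...found];
}
```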
Entities are stored in three structured DB tables (SQLite/Postgres): canonical entities, aliases, and memory links. Each Qdrant memory payload is enriched with an `entities` array, indexed for native vector-filtered search — `GET /memory/search?entity=Docker` filters at the Qdrant level with no result-count ceiling.
Query the entity graph via `GET /entities`, filter search by entity, or use the `brain_entities` MCP tool.
### Credential Scrubbing
All content is scrubbed before storage. API keys, JWTs, SSH private keys, passwords, and base64-encoded secrets are automatically redacted. Agents can freely share context without accidentally leaking credentials into long-term memory.
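The scrubbing step amounts to running the content through a list of secret-shaped regexes before anything is embedded or stored. A minimal sketch (hypothetical patterns; the real scrubber also covers JWTs, SSH private keys, and base64-encoded secrets):

```javascript
// Two representative redaction patterns, applied in sequence.
const PATTERNS = [
  /sk-[A-Za-z0-9]{20,}/g,                  // OpenAI-style API keys
  /(password|api[_-]?key)\s*[:=]\s*\S+/gi, // key=value style secrets
];

function scrub(content) {
  return PATTERNS.reduce((text, re) => text.replace(re, "[REDACTED]"), content);
}
```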
### Agent Isolation
The API acts as a gatekeeper between your agents and the data. No agent — whether it's an OpenClaw agent, Claude Code, or a rogue script — has direct access to Qdrant or the database. They can only do what the API allows:
- Store and search memories (through validated endpoints)
- Read briefings,
