Cortex

If Cortex helps your AI remember, give it a ⭐ — it takes 1 second and helps others discover the project.

Private. Free. Local. — Memory engine for personal AI agents.

Your AI's memory lives on your device — never leaves, never costs, never spies. Pure Rust. 3.8MB. No third-party servers. Zero telemetry. Zero cost. Syncs through your own cloud storage.

Philosophy: Your memories are yours — not a cloud provider's training data, not a startup's monetization asset, not a government's surveillance target. Cortex runs 100% on your hardware, stores everything in your own database, and syncs only through your own cloud storage (iCloud, Google Drive, OneDrive, Dropbox). No middleman ever sees your data. No API key required. No account to create. Just plug it into your AI agent and it remembers — privately, permanently, and at sub-millisecond speed.

LLMs start blank every session. Your assistant forgets your name, your preferences, the conversation you had yesterday, the decision you made last week. Current "memory" solutions are flat text files, keyword grep, or cloud APIs that add 200-500ms latency, charge you for the privilege, and send your personal data to someone else's server.

Cortex fixes this. It gives your AI a structured, queryable, self-evolving long-term memory that persists across sessions, channels, and contexts — with Bayesian beliefs that self-correct, a people graph that resolves identities across platforms, and sub-millisecond performance on everything. All running locally, all yours.

Cortex vs Mem0 vs OpenAI Memory

| | Cortex | Mem0 | OpenAI Memory |
|---|---|---|---|
| Privacy | 100% local, zero cloud | Cloud API (your data on their servers) | OpenAI servers |
| Latency | 156µs ingest, 568µs search | ~200-500ms | ~300-800ms |
| Cost | Free, forever | $99+/mo (Pro) | ChatGPT Plus ($20/mo) |
| Memory tiers | 4 (Working/Episodic/Semantic/Procedural) | 1 (flat) | 1 (flat) |
| Bayesian beliefs | Self-correcting with evidence | No | No |
| People graph | Cross-channel identity resolution | Paid tier only | No |
| Conversation compression | Automatic session summarization | No | No |
| Relationship inference | Pattern-based (EN + CN) | No | No |
| Temporal retrieval | Intent-aware ("recently" / "first time") | No | No |
| Contradiction detection | Automatic with confidence scores | No | No |
| Consolidation | Episodic → Semantic auto-promotion | No | No |
| Context injection | Token-budgeted LLM-ready output | Manual | Automatic but opaque |
| Import/Export | Full JSON backup & restore | API only | No export |
| Self-hosted | Native binary, Docker, MCP | Cloud only | Cloud only |
| Binary size | 3.8 MB | npm package | N/A |
| Dependencies | 0 runtime deps | Node.js + cloud | N/A |
| Open source | MIT | Partial | No |
| Encryption | AES-256-GCM encrypted sync (opt-in) | No | No |
| Privacy levels | Private (default, never syncs) / Shared / Public | No | No |
| Zero telemetry | No analytics, no phone-home, verifiable | Unknown | No |
| Chinese NLP | Native (inference, retrieval, relationships) | No | Limited |
| Namespace isolation | Per-user/context memory separation | No | No |
| Plugin system | Compile-time hooks for ingest/retrieve/consolidation | No | No |
| MCP tools | 25 tools for Claude/LLM integration | 3rd party | N/A |

Performance Benchmarks

| Operation | Cortex | Mem0 (cloud) | File-based |
|-----------|--------|--------------|------------|
| Ingest | 156µs | ~200ms | ~1ms |
| Search (top-10) | 568µs | ~300ms | ~10ms |
| Context generation | 621µs | ~500ms | manual |
| Belief update | 66µs | N/A | N/A |
| People graph | 51µs | paid tier | N/A |
| Structured facts | 45µs | N/A | N/A |
| 1K memories search | 1.6ms | ~500ms | ~50ms |

~528× faster search than Mem0's cloud API (568µs vs ~300ms), with features neither Mem0 nor OpenAI Memory offer.

Note: Benchmarks include proactive inference (auto-extracting facts, preferences, relationships) on every ingest. Raw ingest without inference is ~15µs. Numbers from cargo bench on M-series Mac.

LoCoMo Benchmark (ACL 2024)

Academic-grade long-term conversation memory evaluation — 10 conversations, 1540 QA pairs across 4 categories.

| System | Single-hop | Multi-hop | Open-domain | Temporal | Overall |
|--------|-----------|-----------|-------------|----------|---------|
| Backboard | 89.4% | 75.0% | 91.2% | 91.9% | 90.0% |
| MemMachine v0.2 | — | — | — | — | 84.9% |
| Cortex v1.7 | 72.5% | 59.5% | 88.8% | 74.1% | 73.7% |
| Mem0-Graph | 65.7% | 47.2% | 75.7% | 58.1% | 68.4% |
| Mem0 | 67.1% | 51.2% | 72.9% | 55.5% | 66.9% |
| OpenAI Memory | — | — | — | — | 52.9% |

Key findings:

  • Open-domain 88.8% — leads Mem0 (72.9%) by 15.9 points
  • Temporal 74.1% — leads Mem0 (55.5%) by 18.6 points
  • Single-hop 72.5% — leads Mem0 (67.1%) by 5.4 points
  • Multi-hop 59.5% — leads Mem0 (51.2%) by 8.3 points
  • Overall 73.7% — beats Mem0 (66.9%) by 6.8 points and OpenAI Memory (52.9%) by 20.8 points

Cortex outperforms Mem0 on all 4 categories — while running 100% locally, end-to-end encrypted, at $0 cost.

Setup: Claude Sonnet 4 (QA + judge), nomic-embed-text (embeddings via Ollama), top-30 retrieval. Fully reproducible: python3 bench/locomo_bench.py

Architecture

Cortex implements a 4-tier memory model inspired by human cognition:

                    +---------------------+
                    |   Working Memory    |  Current session context
                    +---------------------+
                              |
                    +---------------------+
                    |   Episodic Memory   |  Raw experiences: conversations, events, observations
                    +---------------------+
                              |  consolidation (decay, promotion, pattern extraction)
                    +---------------------+
                    |   Semantic Memory   |  Distilled facts, preferences, relationships
                    +---------------------+
                              |
                    +---------------------+
                    | Procedural Memory   |  Learned routines, user-specific workflows
                    +---------------------+

Working holds the current session scratch pad. Episodic stores raw experiences with timestamps and source metadata. The Consolidation Engine periodically promotes recurring patterns into Semantic facts and decays stale episodes. Procedural captures learned workflows and routines.
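In code form, the tier model can be mirrored with a simple enum. This is an illustrative sketch of the architecture above, not Cortex's actual types:

```rust
// Illustrative tier enum mirroring the diagram above (not Cortex's real types).
enum MemoryTier {
    Working,    // current-session scratch pad
    Episodic,   // raw timestamped experiences
    Semantic,   // distilled facts, preferences, relationships
    Procedural, // learned routines and workflows
}

// What each tier holds, per the architecture description.
fn holds(tier: &MemoryTier) -> &'static str {
    match tier {
        MemoryTier::Working => "current session scratch pad",
        MemoryTier::Episodic => "raw timestamped experiences",
        MemoryTier::Semantic => "distilled facts, preferences, relationships",
        MemoryTier::Procedural => "learned routines and workflows",
    }
}
```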

Key Components

People Graph

Cross-channel identity resolution. The same person messaging you on Telegram, emailing you, and showing up in calendar events gets unified into a single identity node. Interactions, relationship strength, and communication patterns are tracked per-person.
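The core idea can be sketched as a mapping from channel-specific handles to one unified person node. The handle formats (`telegram:@alice`, etc.) below are hypothetical illustrations, not Cortex's actual identifier scheme:

```rust
use std::collections::HashMap;

// Toy cross-channel identity resolution: every channel-specific handle for the
// same person resolves to one unified person node. (Illustrative sketch only;
// Cortex's actual people graph also tracks interactions and relationship strength.)
fn resolve<'a>(graph: &'a HashMap<String, String>, handle: &str) -> Option<&'a str> {
    graph.get(handle).map(String::as_str)
}

// Build a tiny demo graph where three handles belong to one person.
fn demo_graph() -> HashMap<String, String> {
    let mut g = HashMap::new();
    for handle in ["telegram:@alice", "email:alice@example.com", "calendar:Alice L."] {
        g.insert(handle.to_string(), "person:alice".to_string());
    }
    g
}
```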

Bayesian Belief System

Self-correcting understanding of the world. Beliefs are formed from evidence, updated with each new observation, and can be contradicted. Confidence scores reflect actual certainty rather than recency bias.

```rust
cortex.observe_belief("user_prefers_morning_meetings", true, 0.8)?;
cortex.observe_belief("user_prefers_morning_meetings", false, 0.6)?;
// Confidence adjusts automatically via Bayesian update
```
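The update rule behind this behavior is a plain Bayes step. The function below is a self-contained illustration of the math, treating each observation's confidence as the probability that the observation is correct; it is not Cortex's internal implementation:

```rust
/// Illustrative Bayes update for a belief's probability (a sketch of the idea,
/// not Cortex's internals). `confidence` is the probability the observation
/// is correct.
fn update_belief(prior: f64, observed_true: bool, confidence: f64) -> f64 {
    // Likelihood of seeing this observation if the belief is true vs false.
    let (if_true, if_false) = if observed_true {
        (confidence, 1.0 - confidence)
    } else {
        (1.0 - confidence, confidence)
    };
    // Bayes' rule: P(belief | obs) = P(obs | belief) P(belief) / P(obs).
    let numerator = if_true * prior;
    numerator / (numerator + if_false * (1.0 - prior))
}
```

Starting from an uninformed prior of 0.5, the supporting observation (true, 0.8) raises the belief to 0.8, and the weaker contradiction (false, 0.6) pulls it back to ~0.73 rather than flipping it outright — the self-correcting behavior described above.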

Consolidation Engine

Episodic-to-semantic promotion, decay of stale memories, and pattern extraction. Runs as a background cycle that keeps the memory store lean and queryable. Returns a report of what was promoted, decayed, and merged.

Multi-signal Retrieval

Queries combine five signals for relevance ranking:

  • Similarity -- vector cosine distance against query embedding
  • Temporal -- recency weighting with configurable decay
  • Salience -- importance scoring from access patterns and explicit hints
  • Social -- boost for memories involving specific people
  • Channel -- filter or boost by source channel
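One plausible way to combine the five signals is a weighted sum with exponential recency decay. The struct fields and weights below are illustrative assumptions, not Cortex's actual scoring or tuning:

```rust
// Hypothetical per-candidate signals; all names and ranges are assumptions.
struct Candidate {
    similarity: f64, // cosine similarity against the query embedding, in [0, 1]
    age_hours: f64,  // time since the memory was recorded
    salience: f64,   // importance score in [0, 1]
    social: f64,     // 1.0 if a queried person is involved, else 0.0
    channel: f64,    // 1.0 if the source channel matches, else 0.0
}

// Weighted combination with configurable exponential recency decay.
fn score(c: &Candidate, half_life_hours: f64) -> f64 {
    let temporal = 0.5_f64.powf(c.age_hours / half_life_hours);
    0.5 * c.similarity + 0.2 * temporal + 0.15 * c.salience
        + 0.1 * c.social + 0.05 * c.channel
}
```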

Context Injection Protocol

Generates LLM-ready context strings from memory state. Pass a token budget, optional channel/person filters, and get back a structured text block your LLM can consume directly.
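A minimal sketch of token-budgeted packing, assuming memories are pre-ranked by relevance; the whitespace-based token estimate and bullet formatting are illustrative assumptions, not Cortex's protocol:

```rust
// Greedily pack ranked memories into a context string under a token budget.
// (Sketch only: real tokenizers count tokens, not whitespace-separated words.)
fn build_context(memories: &[&str], token_budget: usize) -> String {
    let mut out = String::new();
    let mut used = 0;
    for m in memories {
        let cost = m.split_whitespace().count(); // crude one-token-per-word estimate
        if used + cost > token_budget {
            break; // budget exhausted; stop before overflowing
        }
        out.push_str("- ");
        out.push_str(m);
        out.push('\n');
        used += cost;
    }
    out
}
```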

Storage

SQLite for persistence, in-memory vector index for fast similarity search. Single-file database, no external services required. Designed for edge deployment -- runs on a laptop, a Raspberry Pi, or a server.

Cloud Sync

Sync memories across devices through your own cloud storage — no third-party server involved.

```text
Device A (Mac)               Your Cloud Storage              Device B (iPhone)
┌───────────┐         ┌──────────────────────┐         ┌───────────┐
│ SQLite DB │ ──W──>  │ iCloud / GDrive /    │  <──R── │ SQLite DB │
│ (local)   │         │ OneDrive / Dropbox   │         │ (local)   │
│           │ <──R──  │                      │  ──W──> │           │
└───────────┘         └──────────────────────┘         └───────────┘
```
  • Changelog-based: Each device writes append-only operation logs to its own subfolder
  • No conflicts: Devices never write to the same file. Merge uses Last-Writer-Wins with Hybrid Logical Clocks
  • Encrypted: AES-256-GCM encryption (opt-in). Even if your cloud account is compromised, memories stay private
  • Privacy-aware: Private memories (the default) never leave your device. Only Shared/Public memories sync

Supported providers: iCloud Drive, Google Drive, OneDrive, Dropbox (auto-detected).
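The conflict-free merge described above can be sketched with a minimal hybrid logical clock: physical time, a logical counter for ties, and a device id as the final tie-breaker. The struct and ordering below are an illustration of the technique, not Cortex's sync code:

```rust
// Minimal Hybrid Logical Clock for Last-Writer-Wins merging (illustrative).
// Derived Ord compares fields lexicographically: wall time, then logical
// counter, then device id — giving a total order on writes.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Hlc {
    wall_ms: u64, // physical clock component (ms since epoch)
    logical: u32, // counter that breaks ties within one millisecond
    node: u16,    // device id, final tie-breaker across devices
}

// LWW merge: the write stamped with the larger HLC wins, so every device
// applying both changelogs converges to the same value.
fn merge<'a>(a: (&'a str, Hlc), b: (&'a str, Hlc)) -> &'a str {
    if a.1 >= b.1 { a.0 } else { b.0 }
}
```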

Sync is configured through `SyncConfig`:

```rust
use cortex_core::sync::SyncConfig;
```