Mnemo MCP Server
mcp-name: io.github.n24q02m/mnemo-mcp
Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.
<!-- Badge Row 1: Status --> <!-- Badge Row 2: Tech -->
<a href="https://glama.ai/mcp/servers/n24q02m/mnemo-mcp"> <img width="380" height="200" src="https://glama.ai/mcp/servers/n24q02m/mnemo-mcp/badge" alt="Mnemo MCP server" /> </a>

Features
- Hybrid search -- FTS5 full-text + sqlite-vec semantic + reranking for precision
- Knowledge graph -- Automatic entity extraction and relation tracking across memories
- Importance scoring -- LLM-scored 0.0-1.0 per memory for smarter retrieval
- Auto-archive -- Configurable age + importance threshold to keep memory clean
- STM-to-LTM consolidation -- LLM summarization of related memories in a category
- Duplicate detection -- Warns before adding semantically similar memories
- Zero config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere)
- Multi-machine sync -- JSONL-based merge sync via embedded rclone (Google Drive, S3, Dropbox)
- Proactive memory -- Tool descriptions guide AI to save preferences, decisions, facts
Quick Start
Claude Code Plugin (Recommended)
claude plugin add n24q02m/mnemo-mcp
MCP Server
Option 1: uvx
{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"],
      "env": {
        // -- optional: cloud embedding + reranking (Jina AI recommended)
        "API_KEYS": "JINA_AI_API_KEY:jina_...",
        // -- or: "API_KEYS": "GOOGLE_API_KEY:AIza...,COHERE_API_KEY:co-...",
        // -- without API_KEYS, uses built-in local Qwen3 ONNX models (CPU, ~570 MB first download)
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",  // default: false
        "SYNC_INTERVAL": "300"   // auto-sync every 5 min (0 = manual only)
      }
    }
  }
}
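If you skip `API_KEYS` entirely, the minimal configuration is just the command and args -- the built-in local Qwen3 models handle embedding and reranking with no keys at all:

```json
{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"]
    }
  }
}
```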
Option 2: Docker
{
  "mcpServers": {
    "mnemo": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "--name", "mcp-mnemo",
        "-v", "mnemo-data:/data",
        "-e", "API_KEYS",
        "-e", "SYNC_ENABLED",
        "-e", "SYNC_INTERVAL",
        "n24q02m/mnemo-mcp:latest"
      ],
      "env": {
        "API_KEYS": "JINA_AI_API_KEY:jina_...",
        "SYNC_ENABLED": "true",
        "SYNC_INTERVAL": "300"
      }
    }
  }
}
Pre-install (optional)
Pre-download the embedding model (~570 MB) to avoid first-run delays.
Use the setup MCP tool after connecting:
setup(action="warmup")
Or with cloud embedding (validates API key, skips local download if cloud works):
{
"env": { "API_KEYS": "JINA_AI_API_KEY:jina_..." }
}
// Then: setup(action="warmup")
Sync setup
Sync is fully automatic. Just set SYNC_ENABLED=true and the server handles everything:
- First sync: rclone is auto-downloaded and a browser opens for OAuth authentication
- Token saved: the OAuth token is stored locally at ~/.mnemo-mcp/tokens/ (600 permissions)
- Subsequent runs: the token is loaded automatically -- no manual steps needed
For non-Google Drive providers, set SYNC_PROVIDER and SYNC_REMOTE:
{
  "SYNC_ENABLED": "true",
  "SYNC_PROVIDER": "dropbox",
  "SYNC_REMOTE": "dropbox"
}
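As another illustration, an S3-backed setup might look like the following. The remote name is whatever you call the rclone remote; `"s3"` here is an assumption, and `SYNC_FOLDER` is shown at its documented default:

```json
{
  "SYNC_ENABLED": "true",
  "SYNC_PROVIDER": "s3",
  "SYNC_REMOTE": "s3",
  "SYNC_FOLDER": "mnemo-mcp"
}
```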
Tools
| Tool | Actions | Description |
|:-----|:--------|:------------|
| memory | add, search, list, update, delete, export, import, stats, restore, archived, consolidate | Core memory CRUD, hybrid search, import/export, archival, and LLM consolidation |
| config | status, sync, set | Server status, trigger sync, update settings |
| setup | warmup, setup_sync | Pre-download embedding model, authenticate sync provider |
| help | -- | Full documentation for any tool |
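Under the hood, these tools are invoked via standard MCP `tools/call` requests. As an illustrative sketch only -- the argument names `action` and `query` are assumptions based on the table above, not a verified schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory",
    "arguments": { "action": "search", "query": "deployment decisions" }
  }
}
```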
MCP Resources
| URI | Description |
|:----|:------------|
| mnemo://stats | Database statistics and server status |
| mnemo://recent | 10 most recently updated memories |
MCP Prompts
| Prompt | Parameters | Description |
|:-------|:-----------|:------------|
| save_summary | summary | Generate prompt to save a conversation summary as memory |
| recall_context | topic | Generate prompt to recall relevant memories about a topic |
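Prompts are fetched with the standard MCP `prompts/get` method; a hedged sketch, with a hypothetical topic value:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "prompts/get",
  "params": { "name": "recall_context", "arguments": { "topic": "database schema" } }
}
```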
Configuration
| Variable | Required | Default | Description |
|:---------|:---------|:--------|:------------|
| API_KEYS | No | -- | API keys (ENV:key,ENV:key). Enables cloud embedding + reranking |
| LITELLM_PROXY_URL | No | -- | LiteLLM Proxy URL. Enables proxy mode |
| LITELLM_PROXY_KEY | No | -- | LiteLLM Proxy virtual key |
| DB_PATH | No | ~/.mnemo-mcp/memories.db | Database location |
| EMBEDDING_BACKEND | No | auto-detect | litellm (cloud) or local (Qwen3) |
| EMBEDDING_MODEL | No | auto-detect | LiteLLM embedding model name |
| EMBEDDING_DIMS | No | 0 (auto=768) | Embedding dimensions |
| RERANK_ENABLED | No | true | Enable reranking (improves search precision) |
| RERANK_BACKEND | No | auto-detect | litellm (cloud) or local (Qwen3) |
| RERANK_MODEL | No | auto-detect | LiteLLM reranker model name |
| RERANK_TOP_N | No | 10 | Number of top results to keep after reranking |
| LLM_MODELS | No | gemini/gemini-3-flash-preview | LLM model for graph extraction, importance scoring, consolidation |
| ARCHIVE_ENABLED | No | true | Enable auto-archiving of old low-importance memories |
| ARCHIVE_AFTER_DAYS | No | 90 | Days before a memory is eligible for auto-archive |
| ARCHIVE_IMPORTANCE_THRESHOLD | No | 0.3 | Memories below this importance score are auto-archived |
| DEDUP_THRESHOLD | No | 0.9 | Similarity threshold to block duplicate memories |
| DEDUP_WARN_THRESHOLD | No | 0.7 | Similarity threshold to warn about similar memories |
| RECENCY_HALF_LIFE_DAYS | No | 7 | Half-life for temporal decay in search scoring |
| SYNC_ENABLED | No | false | Enable rclone sync |
| SYNC_PROVIDER | No | drive | rclone provider type (drive, dropbox, s3, etc.) |
| SYNC_REMOTE | No | gdrive | rclone remote name |
| SYNC_FOLDER | No | mnemo-mcp | Remote folder |
| SYNC_INTERVAL | No | 300 | Auto-sync interval in seconds (0=manual) |
| LOG_LEVEL | No | INFO | Logging level |
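To make the `RECENCY_HALF_LIFE_DAYS` knob concrete, here is a minimal sketch of half-life decay. It is illustrative only -- the server's actual scoring formula may combine this weight with other signals:

```python
def recency_weight(age_days: float, half_life_days: float = 7.0) -> float:
    """Exponential half-life decay: a memory half_life_days old weighs 0.5."""
    return 0.5 ** (age_days / half_life_days)

print(recency_weight(0.0))   # 1.0  (brand-new memory, full weight)
print(recency_weight(7.0))   # 0.5  (one half-life old)
print(recency_weight(14.0))  # 0.25 (two half-lives old)
```

With the default of 7 days, a two-week-old memory contributes a quarter of the recency weight of a fresh one.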
Embedding & Reranking
Both embedding and reranking are always available -- local models are built-in and require no configuration.
- Jina AI (recommended): a single JINA_AI_API_KEY enables both embedding and reranking
- Embedding priority: Jina AI > Gemini > OpenAI > Cohere. Local Qwen3 fallback always available
- Reranking priority: Jina AI > Cohere. Local Qwen3 fallback always available
- GPU auto-detection: CUDA/DirectML auto-detected, uses GGUF models for better performance
- All embeddings stored at 768 dims. Switching providers never breaks the vector table
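The priority order above can be sketched as a simple first-match scan. This is a hedged illustration, not the server's code; the env-var names other than `JINA_AI_API_KEY`, `GOOGLE_API_KEY`, and `COHERE_API_KEY` (which appear in the config examples earlier) are assumptions:

```python
# Cloud keys checked in the README's stated priority order.
EMBEDDING_PRIORITY = [
    "JINA_AI_API_KEY",   # Jina AI
    "GOOGLE_API_KEY",    # Gemini
    "OPENAI_API_KEY",    # OpenAI (assumed name)
    "COHERE_API_KEY",    # Cohere
]

def pick_embedding_backend(env: dict) -> str:
    """Return the first configured cloud key, else the local fallback."""
    for key in EMBEDDING_PRIORITY:
        if env.get(key):
            return key
    return "local-qwen3"  # built-in fallback, no API key needed

print(pick_embedding_backend({}))                              # local-qwen3
print(pick_embedding_backend({"GOOGLE_API_KEY": "AIza..."}))   # GOOGLE_API_KEY
```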
Build from Source
git clone https://github.com/n24q02m/mnemo-mcp.git
cd mnemo-mcp
uv sync
uv run mnemo-mcp
Compatible With
Also by n24q02m
| Server | Description |
|--------|-------------|
| wet-mcp | Web search, content extraction, and documentation indexing |
| better-notion-mcp | Markdown-first Notion API with 9 composite tools |
| better-email-mcp | Email (IMAP/SMTP) with multi-account and auto-discovery |
| better-godot-mcp | Godot Engine 4.x with 18 tools for scenes, scripts, and shaders |
| [better-telegram-mcp](https://github.com/n24q0