
Mnemo MCP Server

mcp-name: io.github.n24q02m/mnemo-mcp

Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.

<!-- Badge Row 1: Status -->

CI codecov PyPI Docker License: MIT

<!-- Badge Row 2: Tech -->

Python SQLite MCP semantic-release Renovate

<a href="https://glama.ai/mcp/servers/n24q02m/mnemo-mcp"> <img width="380" height="200" src="https://glama.ai/mcp/servers/n24q02m/mnemo-mcp/badge" alt="Mnemo MCP server" /> </a>

Features

  • Hybrid search -- FTS5 full-text + sqlite-vec semantic + reranking for precision
  • Knowledge graph -- Automatic entity extraction and relation tracking across memories
  • Importance scoring -- LLM-scored 0.0-1.0 per memory for smarter retrieval
  • Auto-archive -- Configurable age + importance threshold to keep memory clean
  • STM-to-LTM consolidation -- LLM summarization of related memories in a category
  • Duplicate detection -- Warns before adding semantically similar memories
  • Zero config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere)
  • Multi-machine sync -- JSONL-based merge sync via embedded rclone (Google Drive, S3, Dropbox)
  • Proactive memory -- Tool descriptions guide AI to save preferences, decisions, facts
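The hybrid-search feature above merges keyword and semantic result lists before reranking. A minimal sketch of one common fusion strategy, Reciprocal Rank Fusion (an illustration only — Mnemo's actual scoring, which also applies reranking and recency decay, may differ):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: combine several ranked id lists
    (e.g. FTS5 keyword hits and sqlite-vec semantic hits) into one.
    Each list contributes 1/(k + rank + 1) to a document's score."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

fts_hits = ["m3", "m1", "m7"]   # keyword matches, best first
vec_hits = ["m3", "m9", "m1"]   # semantic matches, best first
print(rrf_fuse([fts_hits, vec_hits]))  # m3 first: it leads both lists
```

Items found by both retrievers rise to the top even when neither retriever ranks them first, which is why fusion tends to beat either search mode alone.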

Quick Start

Claude Code Plugin (Recommended)

claude plugin add n24q02m/mnemo-mcp

MCP Server

Option 1: uvx

{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"],
      "env": {
        // -- optional: cloud embedding + reranking (Jina AI recommended)
        "API_KEYS": "JINA_AI_API_KEY:jina_...",
        // -- or: "API_KEYS": "GOOGLE_API_KEY:AIza...,COHERE_API_KEY:co-...",
        // -- without API_KEYS, uses built-in local Qwen3 ONNX models (CPU, ~570MB first download)
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",                    // default: false
        "SYNC_INTERVAL": "300"                     // auto-sync every 5min (0 = manual only)
      }
    }
  }
}

Option 2: Docker

{
  "mcpServers": {
    "mnemo": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "--name", "mcp-mnemo",
        "-v", "mnemo-data:/data",
        "-e", "API_KEYS",
        "-e", "SYNC_ENABLED",
        "-e", "SYNC_INTERVAL",
        "n24q02m/mnemo-mcp:latest"
      ],
      "env": {
        "API_KEYS": "JINA_AI_API_KEY:jina_...",
        "SYNC_ENABLED": "true",
        "SYNC_INTERVAL": "300"
      }
    }
  }
}

Pre-install (optional)

Pre-download the embedding model (~570 MB) to avoid first-run delays. Use the setup MCP tool after connecting:

setup(action="warmup")

Or with cloud embedding (validates API key, skips local download if cloud works):

{
  "env": { "API_KEYS": "JINA_AI_API_KEY:jina_..." }
}
// Then: setup(action="warmup")

Sync setup

Sync is fully automatic. Just set SYNC_ENABLED=true and the server handles everything:

  1. First sync: rclone is auto-downloaded, a browser opens for OAuth authentication
  2. Token saved: OAuth token is stored locally at ~/.mnemo-mcp/tokens/ (600 permissions)
  3. Subsequent runs: Token is loaded automatically -- no manual steps needed

For non-Google Drive providers, set SYNC_PROVIDER and SYNC_REMOTE:

{
  "SYNC_ENABLED": "true",
  "SYNC_PROVIDER": "dropbox",
  "SYNC_REMOTE": "dropbox"
}
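The JSONL-based merge sync mentioned in Features can be pictured as a last-write-wins merge keyed on memory id. A hypothetical sketch of that idea (the field names `id` and `updated_at` are assumptions for illustration, not Mnemo's actual schema):

```python
import json

def merge_jsonl(local_lines, remote_lines):
    """Merge two JSONL memory dumps: for each id, keep the record
    with the newest updated_at timestamp (last-write-wins)."""
    merged = {}
    for line in local_lines + remote_lines:
        rec = json.loads(line)
        cur = merged.get(rec["id"])
        if cur is None or rec["updated_at"] > cur["updated_at"]:
            merged[rec["id"]] = rec
    # Deterministic output order keeps the synced file diff-friendly
    return [json.dumps(r, sort_keys=True)
            for r in sorted(merged.values(), key=lambda r: r["id"])]

local = ['{"id": "a", "updated_at": 1, "text": "old"}',
         '{"id": "b", "updated_at": 5, "text": "keep"}']
remote = ['{"id": "a", "updated_at": 2, "text": "new"}']
print(merge_jsonl(local, remote))
```

Because each record merges independently, two machines can both add memories between syncs without clobbering each other's writes.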

Tools

| Tool | Actions | Description |
|:-----|:--------|:------------|
| memory | add, search, list, update, delete, export, import, stats, restore, archived, consolidate | Core memory CRUD, hybrid search, import/export, archival, and LLM consolidation |
| config | status, sync, set | Server status, trigger sync, update settings |
| setup | warmup, setup_sync | Pre-download embedding model, authenticate sync provider |
| help | -- | Full documentation for any tool |

MCP Resources

| URI | Description |
|:----|:------------|
| mnemo://stats | Database statistics and server status |
| mnemo://recent | 10 most recently updated memories |

MCP Prompts

| Prompt | Parameters | Description |
|:-------|:-----------|:------------|
| save_summary | summary | Generate prompt to save a conversation summary as memory |
| recall_context | topic | Generate prompt to recall relevant memories about a topic |

Configuration

| Variable | Required | Default | Description |
|:---------|:---------|:--------|:------------|
| API_KEYS | No | -- | API keys (ENV:key,ENV:key). Enables cloud embedding + reranking |
| LITELLM_PROXY_URL | No | -- | LiteLLM Proxy URL. Enables proxy mode |
| LITELLM_PROXY_KEY | No | -- | LiteLLM Proxy virtual key |
| DB_PATH | No | ~/.mnemo-mcp/memories.db | Database location |
| EMBEDDING_BACKEND | No | auto-detect | litellm (cloud) or local (Qwen3) |
| EMBEDDING_MODEL | No | auto-detect | LiteLLM embedding model name |
| EMBEDDING_DIMS | No | 0 (auto=768) | Embedding dimensions |
| RERANK_ENABLED | No | true | Enable reranking (improves search precision) |
| RERANK_BACKEND | No | auto-detect | litellm (cloud) or local (Qwen3) |
| RERANK_MODEL | No | auto-detect | LiteLLM reranker model name |
| RERANK_TOP_N | No | 10 | Number of top results to keep after reranking |
| LLM_MODELS | No | gemini/gemini-3-flash-preview | LLM model for graph extraction, importance scoring, consolidation |
| ARCHIVE_ENABLED | No | true | Enable auto-archiving of old low-importance memories |
| ARCHIVE_AFTER_DAYS | No | 90 | Days before a memory is eligible for auto-archive |
| ARCHIVE_IMPORTANCE_THRESHOLD | No | 0.3 | Memories below this importance score are auto-archived |
| DEDUP_THRESHOLD | No | 0.9 | Similarity threshold to block duplicate memories |
| DEDUP_WARN_THRESHOLD | No | 0.7 | Similarity threshold to warn about similar memories |
| RECENCY_HALF_LIFE_DAYS | No | 7 | Half-life (days) for temporal decay in search scoring |
| SYNC_ENABLED | No | false | Enable rclone sync |
| SYNC_PROVIDER | No | drive | rclone provider type (drive, dropbox, s3, etc.) |
| SYNC_REMOTE | No | gdrive | rclone remote name |
| SYNC_FOLDER | No | mnemo-mcp | Remote folder |
| SYNC_INTERVAL | No | 300 | Auto-sync interval in seconds (0 = manual) |
| LOG_LEVEL | No | INFO | Logging level |
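Two groups of knobs above interact with retrieval and archiving. A sketch of the presumed semantics of RECENCY_HALF_LIFE_DAYS and the ARCHIVE_* thresholds (the server's real formulas may differ; this only illustrates what "half-life" and "age + importance threshold" mean):

```python
def recency_weight(age_days, half_life_days=7.0):
    """Exponential temporal decay: a memory's search weight halves
    every RECENCY_HALF_LIFE_DAYS (default 7)."""
    return 0.5 ** (age_days / half_life_days)

def auto_archive_eligible(age_days, importance,
                          archive_after_days=90, threshold=0.3):
    """A memory is archived once it is old enough AND its LLM-scored
    importance falls below the threshold (AND semantics assumed)."""
    return age_days >= archive_after_days and importance < threshold

print(recency_weight(7))                    # one half-life old -> 0.5
print(auto_archive_eligible(100, 0.2))      # old and unimportant -> True
print(auto_archive_eligible(100, 0.8))      # old but important -> False
```

Raising RECENCY_HALF_LIFE_DAYS makes ranking less sensitive to age; lowering ARCHIVE_IMPORTANCE_THRESHOLD archives fewer memories.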

Embedding & Reranking

Both embedding and reranking are always available -- local models are built-in and require no configuration.

  • Jina AI (recommended): A single JINA_AI_API_KEY enables both embedding and reranking
  • Embedding priority: Jina AI > Gemini > OpenAI > Cohere. Local Qwen3 fallback always available
  • Reranking priority: Jina AI > Cohere. Local Qwen3 fallback always available
  • GPU auto-detection: CUDA/DirectML auto-detected, uses GGUF models for better performance
  • All embeddings stored at 768 dims. Switching providers never breaks the vector table
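Duplicate detection builds on these embeddings: before adding a memory, its vector is compared against existing ones. A minimal sketch of the threshold logic, using cosine similarity and the DEDUP_* defaults from the Configuration table (the actual comparison pipeline may differ):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def dedup_verdict(similarity, block=0.9, warn=0.7):
    """Map a similarity score to an action, mirroring the
    DEDUP_THRESHOLD / DEDUP_WARN_THRESHOLD defaults."""
    if similarity >= block:
        return "block"   # near-duplicate: refuse the add
    if similarity >= warn:
        return "warn"    # similar memory exists: warn before adding
    return "add"

print(dedup_verdict(cosine([1.0, 0.0], [1.0, 0.0])))  # identical -> block
print(dedup_verdict(0.75))                            # similar -> warn
```

Tightening DEDUP_WARN_THRESHOLD toward DEDUP_THRESHOLD reduces warnings at the cost of letting more near-duplicates through silently.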

Build from Source

git clone https://github.com/n24q02m/mnemo-mcp.git
cd mnemo-mcp
uv sync
uv run mnemo-mcp

Compatible With

Claude Code Claude Desktop Cursor VS Code Copilot Antigravity Gemini CLI OpenAI Codex OpenCode

Also by n24q02m

| Server | Description |
|:-------|:------------|
| wet-mcp | Web search, content extraction, and documentation indexing |
| better-notion-mcp | Markdown-first Notion API with 9 composite tools |
| better-email-mcp | Email (IMAP/SMTP) with multi-account and auto-discovery |
| better-godot-mcp | Godot Engine 4.x with 18 tools for scenes, scripts, and shaders |
| [better-telegram-mcp](https://github.com/n24q0
