Momo - Self-Hostable AI Memory System

"You, my friend, are just a few plumbs short of a fruit pie." — Momo

Momo is a self-hostable AI memory system written in Rust — inspired by SuperMemory. It provides long-term memory for AI agents using LibSQL's native vector search capabilities — no external vector database required. Single binary, runs anywhere.

Quick Start

The fastest way to get Momo running is via Docker:

docker run --name momo -d --restart unless-stopped \
  -p 3000:3000 \
  -e MOMO_API_KEYS=dev-key \
  -v momo-data:/data \
  ghcr.io/momomemory/momo:latest
export API_KEY=dev-key

Then open:

  • Web console: http://localhost:3000/
  • API docs: http://localhost:3000/api/v1/docs
  • MCP endpoint: http://localhost:3000/mcp
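Before moving on, it can help to confirm the service is reachable. A minimal Python sketch, assuming only that the web console answers on / as listed above:

```python
import urllib.request

def momo_is_up(base_url="http://localhost:3000", timeout=2.0):
    """Return True if the Momo web console answers on '/' (no other
    endpoint is assumed); False on connection errors or timeouts."""
    try:
        with urllib.request.urlopen(base_url + "/", timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        return False
```

Polling this in a loop gives a simple readiness check right after docker run.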

Add a Memory

curl -X POST http://localhost:3000/api/v1/conversations:ingest \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "I prefer dark mode"}], "containerTag": "user_1"}'
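The same ingest call can be scripted. A minimal Python sketch using only the standard library, mirroring the curl request above (endpoint, bearer auth, and field names are taken from it; sending and response handling are left to the caller):

```python
import json
import urllib.request

def build_ingest_request(base_url, api_key, messages, container_tag):
    """Build the POST /api/v1/conversations:ingest request shown above."""
    body = json.dumps({"messages": messages, "containerTag": container_tag})
    return urllib.request.Request(
        base_url + "/api/v1/conversations:ingest",
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_ingest_request(
    "http://localhost:3000", "dev-key",
    [{"role": "user", "content": "I prefer dark mode"}], "user_1",
)
# Send with: urllib.request.urlopen(req)
```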

Search Everything

curl -X POST http://localhost:3000/api/v1/search \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"q": "What are the user preferences?", "containerTags": ["user_1"], "scope": "hybrid"}'
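A matching Python sketch for the search call, again using only the standard library (field names and the "hybrid" scope come from the curl example above; other scope values are not assumed here):

```python
import json
import urllib.request

def build_search_request(base_url, api_key, query, container_tags, scope="hybrid"):
    """Build the POST /api/v1/search request shown above."""
    body = json.dumps({"q": query, "containerTags": container_tags, "scope": scope})
    return urllib.request.Request(
        base_url + "/api/v1/search",
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_search_request(
    "http://localhost:3000", "dev-key",
    "What are the user preferences?", ["user_1"],
)
# Send with: urllib.request.urlopen(req)
```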

Use As MCP Server

Momo includes a built-in MCP server (streamable HTTP) compatible with Supermemory-style workflows.

{
  "mcpServers": {
    "momo": {
      "url": "http://localhost:3000/mcp",
      "headers": {
        "Authorization": "Bearer your_momo_api_key",
        "x-sm-project": "default"
      }
    }
  }
}
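If you manage client configuration programmatically, the entry above can be emitted from code. A minimal Python sketch (key names and headers mirror the JSON above; the API key and project values are placeholders):

```python
import json

def momo_mcp_config(api_key, url="http://localhost:3000/mcp", project="default"):
    """Return an mcpServers config matching the JSON shown above."""
    return {
        "mcpServers": {
            "momo": {
                "url": url,
                "headers": {
                    "Authorization": "Bearer " + api_key,
                    "x-sm-project": project,
                },
            }
        }
    }

print(json.dumps(momo_mcp_config("your_momo_api_key"), indent=2))
```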

Provided MCP primitives:

  • Tools: memory, recall, listProjects, whoAmI
  • Resources: supermemory://profile, supermemory://projects
  • Prompt: context

For handshake details, manual curl testing, and troubleshooting, see MCP Guide.

Features

  • Vector Search: Native LibSQL vector embeddings (no external vector DB needed)
  • Embedded Web Console: Bun + Preact UI embedded in the binary and served from /
  • Local AI Pipeline: Built-in support for local embeddings (FastEmbed) and transcription (Whisper)
  • External Embeddings: Support for OpenAI, OpenRouter, Ollama, and LM Studio APIs
  • Universal Ingestion: Extract from URLs, PDFs, HTML, DOCX, Images (OCR), and Audio/Video
  • Memory Versioning: Track memory updates with parent/child relationships
  • Contradiction Management: Automatically detect and resolve conflicting memories
  • Knowledge Graph: Memory relationships, graph traversal, container-level graphs
  • User Profiling: Auto-generated user profiles from accumulated memories
  • Autonomous Synthesis: Optional background engine that derives new insights from existing data
  • Intelligent Decay: Relevance scoring that automatically prunes stale or irrelevant memories
  • AST-Aware Code Chunking: Tree-sitter based chunking for multiple programming languages
  • Multi-Tenant by Design: Scalable container-based isolation for multi-user applications
  • Reranking: Improved search relevance using cross-encoder models

Documentation

For detailed documentation, see the docs directory.

SDKs

| Language   | Package          | Status      |
|------------|------------------|-------------|
| TypeScript | @momomemory/sdk  | Stable      |
| Python     | momomemory-sdk   | Coming Soon |
| Rust       | momo-sdk         | Coming Soon |
| Go         | momo-go          | Coming Soon |

Docker

Use the published container image:

docker run --name momo -d --restart unless-stopped \
  -p 3000:3000 \
  -e MOMO_API_KEYS=dev-key \
  -v momo-data:/data \
  ghcr.io/momomemory/momo:latest
export API_KEY=dev-key

To follow logs:

docker logs -f momo

Process Isolation

Momo can run API and background ingestion/workers in separate processes.

Runtime Modes

  • MOMO_RUNTIME_MODE=all (default): supervisor mode; launches API and worker subprocesses
  • MOMO_RUNTIME_MODE=api: HTTP/API only (no background workers)
  • MOMO_RUNTIME_MODE=worker: Background workers only (no HTTP server)

To force legacy single-process behavior when MOMO_RUNTIME_MODE=all:

./momo --single-process
# or
MOMO_SINGLE_PROCESS=true ./momo

Example split deployment:

# API process
MOMO_RUNTIME_MODE=api MOMO_PORT=3000 ./momo

# Worker process
MOMO_RUNTIME_MODE=worker ./momo
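The split deployment above can also be supervised from a small script. A Python sketch, assuming only that the momo binary reads MOMO_RUNTIME_MODE and MOMO_PORT from its environment, as documented above:

```python
import os
import subprocess

def spawn_momo(cmd, mode, port=None):
    """Start one Momo process in the given runtime mode.
    cmd is the command list, e.g. ["./momo"]."""
    env = dict(os.environ, MOMO_RUNTIME_MODE=mode)
    if port is not None:
        env["MOMO_PORT"] = str(port)
    return subprocess.Popen(cmd, env=env)

# api = spawn_momo(["./momo"], "api", port=3000)
# worker = spawn_momo(["./momo"], "worker")
# api.wait(); worker.wait()
```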

Read/Write Split (LibSQL Replica Strategy)

Use a dedicated read backend while keeping writes on the primary DB:

DATABASE_URL=libsql://primary.turso.io
DATABASE_AUTH_TOKEN=primary-token

DATABASE_READ_URL=libsql://read-replica.turso.io
DATABASE_READ_AUTH_TOKEN=read-token
DATABASE_READ_LOCAL_PATH=local-read-replica.db
DATABASE_READ_SYNC_INTERVAL_SECS=2

Search reads use the read backend; writes continue through the primary backend.

Development

# From monorepo root: run backend + frontend with auto-reload
just dev

# Build frontend bundle (embedded in binary)
just build-frontend

# Build server (includes frontend bundle)
just build

# Run tests
cargo test

# Check for issues
cargo clippy

# Format code
cargo fmt

Note: when frontend assets are missing, the Rust build runs bun install and bun run build via momo/build.rs.

Maintainers

@watzon

Contributing

See CONTRIBUTING.md for details on how to get involved. PRs accepted.

Credits

License

MIT © Momo Contributors
