# Momo
> "You, my friend, are just a few plums short of a fruit pie." — Momo
Momo is a self-hostable AI memory system written in Rust — inspired by SuperMemory. It provides long-term memory for AI agents using LibSQL's native vector search capabilities — no external vector database required. Single binary, runs anywhere.
## Table of Contents
- Quick Start
- Features
- Documentation
- SDKs
- Docker
- Process Isolation
- Development
- Maintainers
- Contributing
- Credits
- License
## Quick Start
The fastest way to get Momo running is via Docker:

```bash
docker run --name momo -d --restart unless-stopped -p 3000:3000 -e MOMO_API_KEYS=dev-key -v momo-data:/data ghcr.io/momomemory/momo:latest
export API_KEY=dev-key
```
Then open:

- Web console: http://localhost:3000/
- API docs: http://localhost:3000/api/v1/docs
- MCP endpoint: http://localhost:3000/mcp
### Add a Memory

```bash
curl -X POST http://localhost:3000/api/v1/conversations:ingest \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "I prefer dark mode"}], "containerTag": "user_1"}'
```
### Search Everything

```bash
curl -X POST http://localhost:3000/api/v1/search \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"q": "What are the user preferences?", "containerTags": ["user_1"], "scope": "hybrid"}'
```
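The two curl calls above can also be wrapped in a small client. A minimal TypeScript sketch, assuming only the endpoint paths and field names shown in the examples — the class and method names, and the injectable fetch function, are illustrative:

```typescript
// Minimal client sketch for the two endpoints above. The paths and JSON
// fields mirror the curl examples; everything else is an assumption.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => unknown;

class MomoClient {
  constructor(
    private baseUrl: string,
    private apiKey: string,
    // Injectable so the client can be exercised without a running server.
    private fetchFn: FetchLike = (globalThis as any).fetch,
  ) {}

  private post(path: string, payload: unknown): unknown {
    return this.fetchFn(`${this.baseUrl}${path}`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(payload),
    });
  }

  // Mirrors the "Add a Memory" example.
  ingest(messages: Array<{ role: string; content: string }>, containerTag: string): unknown {
    return this.post("/api/v1/conversations:ingest", { messages, containerTag });
  }

  // Mirrors the "Search Everything" example.
  search(q: string, containerTags: string[], scope = "hybrid"): unknown {
    return this.post("/api/v1/search", { q, containerTags, scope });
  }
}
```

With the default `fetch`, the same calls hit the server started in the Quick Start.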
### Use As MCP Server
Momo includes a built-in MCP server (streamable HTTP) compatible with Supermemory-style workflows.
```json
{
  "mcpServers": {
    "momo": {
      "url": "http://localhost:3000/mcp",
      "headers": {
        "Authorization": "Bearer your_momo_api_key",
        "x-sm-project": "default"
      }
    }
  }
}
```
Provided MCP primitives:

- Tools: `memory`, `recall`, `listProjects`, `whoAmI`
- Resources: `supermemory://profile`, `supermemory://projects`
- Prompt: `context`
For handshake details, manual curl testing, and troubleshooting, see MCP Guide.
## Features
- Vector Search: Native LibSQL vector embeddings (no external vector DB needed)
- Embedded Web Console: Bun + Preact UI embedded in the binary and served from `/`
- Local AI Pipeline: Built-in support for local embeddings (FastEmbed) and transcription (Whisper)
- External Embeddings: Support for OpenAI, OpenRouter, Ollama, and LM Studio APIs
- Universal Ingestion: Extract from URLs, PDFs, HTML, DOCX, Images (OCR), and Audio/Video
- Memory Versioning: Track memory updates with parent/child relationships
- Contradiction Management: Automatically detect and resolve conflicting memories
- Knowledge Graph: Memory relationships, graph traversal, container-level graphs
- User Profiling: Auto-generated user profiles from accumulated memories
- Autonomous Synthesis: Optional background engine that derives new insights from existing data
- Intelligent Decay: Relevance scoring that automatically prunes stale or irrelevant memories
- AST-Aware Code Chunking: Tree-sitter based chunking for multiple programming languages
- Multi-Tenant by Design: Scalable container-based isolation for multi-user applications
- Reranking: Improved search relevance using cross-encoder models
## Documentation
For detailed documentation, see the docs directory.
## SDKs
| Language | Package | Status |
|----------|---------|--------|
| TypeScript | @momomemory/sdk | Stable |
| Python | momomemory-sdk | Coming Soon |
| Rust | momo-sdk | Coming Soon |
| Go | momo-go | Coming Soon |
## Docker
Use the published container image:

```bash
docker run --name momo -d --restart unless-stopped -p 3000:3000 -e MOMO_API_KEYS=dev-key -v momo-data:/data ghcr.io/momomemory/momo:latest
export API_KEY=dev-key
```
To follow logs:

```bash
docker logs -f momo
```
## Process Isolation
Momo can run API and background ingestion/workers in separate processes.
### Runtime Modes
- `MOMO_RUNTIME_MODE=all` (default): supervisor mode; launches API and worker subprocesses
- `MOMO_RUNTIME_MODE=api`: HTTP/API only (no background workers)
- `MOMO_RUNTIME_MODE=worker`: background workers only (no HTTP server)
To force legacy single-process behavior when `MOMO_RUNTIME_MODE=all`:

```bash
./momo --single-process
# or
MOMO_SINGLE_PROCESS=true ./momo
```
Example split deployment:

```bash
# API process
MOMO_RUNTIME_MODE=api MOMO_PORT=3000 ./momo

# Worker process
MOMO_RUNTIME_MODE=worker ./momo
```
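For container deployments, the same split can be expressed in Compose. A sketch assuming the published image and the environment variables documented above — the service names and shared volume layout are illustrative, not project conventions:

```yaml
# Illustrative Compose sketch for the split deployment above.
# Assumes both processes reach the same database (here via a shared
# /data volume; a remote DATABASE_URL would also work).
services:
  momo-api:
    image: ghcr.io/momomemory/momo:latest
    environment:
      MOMO_RUNTIME_MODE: api
      MOMO_PORT: "3000"
      MOMO_API_KEYS: dev-key
    ports:
      - "3000:3000"
    volumes:
      - momo-data:/data
  momo-worker:
    image: ghcr.io/momomemory/momo:latest
    environment:
      MOMO_RUNTIME_MODE: worker
      MOMO_API_KEYS: dev-key
    volumes:
      - momo-data:/data
volumes:
  momo-data:
```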
### Read/Write Split (LibSQL Replica Strategy)
Use a dedicated read backend while keeping writes on the primary DB:

```bash
DATABASE_URL=libsql://primary.turso.io
DATABASE_AUTH_TOKEN=primary-token
DATABASE_READ_URL=libsql://read-replica.turso.io
DATABASE_READ_AUTH_TOKEN=read-token
DATABASE_READ_LOCAL_PATH=local-read-replica.db
DATABASE_READ_SYNC_INTERVAL_SECS=2
```
Search reads use the read backend; writes continue through the primary backend.
## Development

```bash
# From monorepo root: run backend + frontend with auto-reload
just dev

# Build frontend bundle (embedded in binary)
just build-frontend

# Build server (includes frontend bundle)
just build

# Run tests
cargo test

# Check for issues
cargo clippy

# Format code
cargo fmt
```
Note: when frontend assets are missing, the Rust build uses `momo/build.rs` to run `bun install` and `bun run build`.
## Maintainers
## Contributing
See CONTRIBUTING.md for details on how to get involved. PRs accepted.
## Credits
- Inspired by Supermemory
- Named after Momo, Aang's loyal flying lemur companion 🦇
- Built with LibSQL, FastEmbed, and Axum
## License
MIT © Momo Contributors
