
EverOS

A memory OS that makes your OpenClaw agents more personal while saving tokens.

Install / Use

/learn @EverMind-AI/EverOS

README

<div align="center" id="readme-top">

![banner-gif][banner-gif]

[![][arxiv-badge]][arxiv-link] [![Docker][docker-badge]][docker] [![Ask DeepWiki][deepwiki-badge]][deepwiki] [![License][license-badge]][license]

<!-- [![][arxiv-badge]][arxiv-link] [![Python][python-badge]][python] [![Docker][docker-badge]][docker] [![FastAPI][fastapi-badge]][fastapi] [![MongoDB][mongodb-badge]][mongodb] [![Elasticsearch][elasticsearch-badge]][elasticsearch] [![Milvus][milvus-badge]][milvus] [![Ask DeepWiki][deepwiki-badge]][deepwiki] [![License][license-badge]][license] --> <!-- <p><strong>Share EverOS Repository</strong></p> [![][share-x-shield]][share-x-link] [![][share-linkedin-shield]][share-linkedin-link] [![][share-reddit-shield]][share-reddit-link] [![][share-telegram-shield]][share-telegram-link] -->

[Documentation][documentation] • [API Reference][api-docs] • [Demo][demo-section]

[![English][lang-en-badge]][lang-en-readme] [![简体中文][lang-zh-badge]][lang-zh-readme]

</div> <br> <!-- [![Memory Genesis Competition 2026][competition-image]][competition-link] -->

[!IMPORTANT]

Memory Sparse Attention

Check out our latest paper, Memory Sparse Attention: a scalable, end-to-end trainable latent-memory framework for 100M-token contexts.

  • Scalable sparse attention + document-wise RoPE (parallel/global) achieving near-linear complexity in both training and inference.
  • KV cache compression with a Memory Parallel inference engine to deliver 100M token throughput on 2×A800 GPUs.
  • Memory Interleave for multi-round, multi-hop reasoning across scattered memory segments.

Join our [Discord][discord] to ask us anything. AMA sessions are open to everyone and run biweekly.

<br> <details open> <summary><kbd>Table of Contents</kbd></summary> <br>
  • [Welcome to EverOS][welcome]
  • [Use Cases][use-cases]
  • [Quick Start][quick-start]
  • [API Usage][api-usage]
  • [Demo][demo-section]
  • [Evaluation][evaluation-section]
  • [Documentation][docs-section]
  • [GitHub Codespaces][codespaces]
  • [Questions][questions-section]
  • [Contributing][contributing]
<br> </details>

Welcome to EverOS

Welcome to EverOS! Join our community to help improve the project and collaborate with talented developers worldwide.

| Community | Purpose |
| :-------- | :------ |
| [![Discord Members][discord-members-badge]][discord] | Join the EverMind Discord community to connect with other users |
| [![WeChat][wechat-badge]][wechat] | Join the EverMind WeChat group for discussion and updates |

<!-- | [![X][x-badge]][x] | Follow updates on X | | [![LinkedIn][linkedin-badge]][linkedin] | Connect with us on LinkedIn | | [![Hugging Face Space][hugging-face-badge]][hugging-face] | Join our Hugging Face community to explore our spaces and models | | [![Reddit][reddit-badge]][reddit] | Join the Reddit community | --> <br>

Use Cases

[![EverMind + OpenClaw Agent Memory and Plugin][usecase-openclaw-image]][usecase-openclaw-link]

EverMind + OpenClaw Agent Memory and Plugin

Claw is putting the pieces of his memory together. Imagine a 24/7 agent with continuous learning memory that you can carry with you wherever you go next. Check out the [agent_memory][usecase-openclaw-link] branch and the [plugin][usecase-openclaw-plugin-link] for more details.

![divider][divider-light] ![divider][divider-dark]

<br>

[![Live2D Character with Memory][usecase-live2d-image]][usecase-live2d-link]

Live2D Character with Memory

Add long-term memory to your anime character that can talk to you in real-time powered by [TEN Framework][ten-framework-link]. See the [Live2D Character with Memory Example][usecase-live2d-link] for more details.

![divider][divider-light] ![divider][divider-dark]

<br>

[![Computer-Use with Memory][usecase-computer-image]][usecase-computer-link]

Computer-Use with Memory

Use computer-use to capture screenshots and run analysis on them, all backed by your memory. See the [live demo][usecase-computer-link] for more details.

![divider][divider-light] ![divider][divider-dark]

<br>

[![Game of Thrones Memories][usecase-got-image]][usecase-got-link]

Game of Thrones Memories

A demonstration of AI memory infrastructure through an interactive Q&A experience with "A Game of Thrones". See the [code][usecase-got-link] for more details.

![divider][divider-light] ![divider][divider-dark]

<br>

[![EverOS Claude Code Plugin][usecase-claude-image]][usecase-claude-link]

EverOS Claude Code Plugin

Persistent memory for Claude Code. Automatically saves and recalls context from past coding sessions. See the [code][usecase-claude-link] for more details.

![divider][divider-light] ![divider][divider-dark]

<br>

[![Visualize Memories with Graphs][usecase-graph-image]][usecase-graph-link]

Visualize Memories with Graphs

A Memory Graph view that visualizes your stored entities and how they relate. This is currently a frontend-only demo; backend integration is in progress. See the [live demo][usecase-graph-link].

<!-- ## Introduction > 💬 **More than memory — it's foresight.** **EverOS** enables AI to not only remember what happened, but understand the meaning behind memories and use them to guide decisions. Achieving **93% reasoning accuracy** on the LoCoMo benchmark, EverOS provides long-term memory capabilities for conversational AI agents through structured extraction, intelligent retrieval, and progressive profile building. ![EverOS Architecture Overview][overview-image] **How it works:** EverOS extracts structured memories from conversations (Encoding), organizes them into episodes and profiles (Consolidation), and intelligently retrieves relevant context when needed (Retrieval). 📄 [Paper][paper-link] • 📚 [Vision & Overview][overview-doc] • 🏗️ [Architecture][architecture-doc] • 📖 [Full Documentation][full-docs] **Latest**: v1.2.0 with API enhancements + DB efficiency improvements ([Changelog][changelog-doc]) <br> ## Why EverOS? - 🎯 **93% Accuracy** - Best-in-class performance on LoCoMo benchmark - 🚀 **Production Ready** - Enterprise-grade with Milvus vector DB, Elasticsearch, MongoDB, and Redis - 🔧 **Easy Integration** - Simple REST API, works with any LLM - 📊 **Multi-Modal Memory** - Episodes, facts, preferences, relations - 🔍 **Smart Retrieval** - BM25, embeddings, or agentic search ![EverOS Overall Benchmark Results][benchmark-summary-image] *EverOS outperforms existing memory systems across all major benchmarks* --> <br> <div align="right">

[![][back-to-top]][readme-top]

</div>

Quick Start

Prerequisites

  • Python 3.10+
  • Docker 20.10+
  • uv package manager
  • 4GB RAM

Verify Prerequisites:

```bash
# Verify you have the required versions
python --version  # Should be 3.10+
docker --version  # Should be 20.10+
```

Installation

```bash
# 1. Clone and navigate
git clone https://github.com/EverMind-AI/EverOS.git
cd EverOS

# 2. Start Docker services
docker compose up -d

# 3. Install uv and dependencies
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync

# 4. Configure API keys
cp env.template .env
# Edit .env and set:
#   - LLM_API_KEY (for memory extraction)
#   - VECTORIZE_API_KEY (for embedding/rerank)

# 5. Start server
uv run python src/run.py

# 6. Verify installation
curl http://localhost:1995/health
# Expected response: {"status": "healthy", ...}
```

✅ Server running at http://localhost:1995 • [Full Setup Guide][setup-guide]
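Because the server can take a few seconds to come up after `uv run python src/run.py`, it can help to poll the health endpoint before sending requests. A minimal sketch using only the Python standard library; the `/health` URL and its `{"status": "healthy", ...}` response shape come from the step above, while the polling interval and timeout are illustrative choices:

```python
import json
import time
import urllib.request

HEALTH_URL = "http://localhost:1995/health"


def is_healthy(payload: dict) -> bool:
    """True when a /health response body reports a healthy server."""
    return payload.get("status") == "healthy"


def wait_for_server(url: str = HEALTH_URL, timeout: float = 30.0) -> bool:
    """Poll the health endpoint until it answers, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if is_healthy(json.load(resp)):
                    return True
        except (OSError, ValueError):
            pass  # server not up yet, or non-JSON body; retry
        time.sleep(1)
    return False


if __name__ == "__main__":
    print("server ready" if wait_for_server() else "server did not come up in time")
```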

<br> <div align="right">

[![][back-to-top]][readme-top]

</div>

Basic Usage

Store and retrieve memories with simple Python code:

```python
import requests

API_BASE = "http://localhost:1995/api/v1"

# 1. Store a conversation memory
requests.post(f"{API_BASE}/memories", json={
    "message_id": "msg_001",
    "create_time": "2025-02-01T10:00:00+00:00",
    "sender": "user_001",
    "content": "I love playing soccer on weekends"
})

# 2. Search for relevant memories
response = requests.get(f"{API_BASE}/memories/search", json={
    "query": "What sports does the user like?",
    "user_id": "user_001",
    "memory_types": ["episodic_memory"],
    "retrieve_method": "hybrid"
})

result = response.json().get("result", {})
for memory_group in result.get("memories", []):
    print(f"Memory: {memory_group}")
```

📖 [More Examples][usage-examples] • 📚 [API Reference][api-docs] • 🎯 [Interactive Demos][interactive-demos]

<br> <div align="right">

[![][back-to-top]][readme-top]

</div>

Demo

Run the Demo

```bash
# Terminal 1: Start the API server
uv run python src/run.py

# Terminal 2: Run the simple demo
uv run python src/bootstrap.py demo/simple_demo.py
```

Try it now: Follow the [Demo Guide][interactive-demos] for step-by-step instructions.

Full Demo Experience

```bash
# Extract memories from sample data
uv run python src/bootstrap.py demo/extract_memory.py

# Start interactive chat with memory
uv run python src/bootstrap.py demo/chat_with_memory.py
```

See the [Demo Guide][interactive-demos] for details.
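The extraction demo feeds existing conversation data into the memory store. The same idea can be sketched for your own transcripts: split the raw lines into per-message payloads and POST each one to the memories endpoint shown in Basic Usage. The ID scheme and the stdlib-only HTTP helper here are illustrative, not mandated by the API:

```python
import json
import urllib.request

API_BASE = "http://localhost:1995/api/v1"


def transcript_to_messages(sender: str, lines: list[str], start: int = 1) -> list[dict]:
    """Turn raw transcript lines into message payloads with sequential IDs.

    The msg_001, msg_002, ... scheme is illustrative; any unique IDs work.
    """
    return [
        {"message_id": f"msg_{start + i:03d}", "sender": sender, "content": line}
        for i, line in enumerate(lines)
    ]


def post_message(payload: dict) -> int:
    """POST one message payload to the memories endpoint; returns the HTTP status."""
    req = urllib.request.Request(
        f"{API_BASE}/memories",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    transcript = ["I love playing soccer", "Usually on weekends with friends"]
    for msg in transcript_to_messages("user_001", transcript):
        print(msg["message_id"], post_message(msg))
```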

<br> <div align="right">

[![][back-to-top]][readme-top]

</div>

Advanced Techniques

  • [Group Chat Conversations][group-chat-guide] - Combine messages from multiple speakers
  • [Conversation Metadata Control][metadata-control-guide] - Fine-grained control over conversation context
  • [Memory Retrieval Strategies][retrieval-strategies-guide] - Lightweight vs Agentic retrieval modes
  • [Batch Operations][batch-operations-guide] - Process multiple messages efficiently
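The retrieval-strategies guide distinguishes lightweight and agentic modes; a small dispatcher can pick the `retrieve_method` per query. Note the hedging here: only `"hybrid"` appears in the quick-start example above, so the mapping of mode names onto concrete `retrieve_method` values is an assumption to be checked against the guide:

```python
def search_request(query: str, user_id: str, mode: str = "lightweight") -> dict:
    """Sketch of selecting a retrieval strategy per query.

    The "lightweight" -> "hybrid" and "agentic" -> "agentic" mappings are
    assumptions; see the Memory Retrieval Strategies guide for real values.
    """
    methods = {
        "lightweight": "hybrid",   # fast lexical + embedding search (assumed)
        "agentic": "agentic",      # LLM-driven multi-step retrieval (assumed name)
    }
    if mode not in methods:
        raise ValueError(f"unknown mode: {mode}")
    return {
        "query": query,
        "user_id": user_id,
        "memory_types": ["episodic_memory"],
        "retrieve_method": methods[mode],
    }
```

A caller might use the cheap mode for routine lookups and reserve the agentic mode for multi-hop questions.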
<br> <div align="right">

[![][back-to-top]][readme-top]

</div>

Documentation

| Guide | Description |
| ----- | ----------- |
| [Quick Start][getting-started] | Installation and configuration |
| [Configuration Guide][config-guide] | Environment variables and services |
| [API Usage Guide][api-usage-guide] | Endpoints and data formats |
| [Development Guide][dev-guide] | Architecture and best practices |
| [Memory API][memory-api-doc] | Complete reference |

View on GitHub

GitHub Stars: 3.4k • Forks: 359 • Category: Development • Updated: 6m ago

Languages: Python

Security Score: 100/100 (audited on Mar 30, 2026; no findings)