AuraSDK
Cognitive memory for AI agents. Pure Rust, <1ms recall, 2.7MB, zero cloud. Patent Pending.
Your AI model is smart. But it forgets everything after every conversation.
AuraSDK is a local cognitive runtime that runs alongside any frozen model. It gives agents durable memory, explainability, governed correction, bounded recall reranking, and bounded self-adaptation through experience — all locally, without fine-tuning or cloud training.
```bash
pip install aura-memory
```

```python
from aura import Aura, Level

brain = Aura("./agent_memory")
brain.enable_full_cognitive_stack()  # activate all four bounded reranking overlays

# store what happens
brain.store("User always deploys to staging first", level=Level.Domain, tags=["workflow"])
brain.store("Staging deploy prevented 3 production incidents", level=Level.Domain, tags=["workflow"])

# recall — local retrieval with optional bounded cognitive reranking
context = brain.recall("deployment decision")  # <1ms, no API call

# inspect advisory hints produced from stored evidence
hints = brain.get_surfaced_policy_hints()
# → [{"action": "Prefer", "domain": "workflow", "description": "deploy to staging first"}]
```
No API keys. No embeddings required. No cloud. The model stays the same — the cognitive layer becomes more structured, more inspectable, and more useful over time.
⭐ If AuraSDK is useful to you, a GitHub star helps us get funding to continue development from Kyiv.
Why Aura?
| | Aura | Mem0 | Zep | Cognee | Letta/MemGPT |
|---|---|---|---|---|---|
| Architecture | 5-layer cognitive engine | Vector + LLM | Vector + LLM | Graph + LLM | LLM orchestration |
| Derived cognitive layers without LLM | Yes — Belief→Concept→Causal→Policy | No | No | No | No |
| Advisory policy hints from experience | Yes — bounded and non-executing | No | No | No | No |
| Learns from agent's own responses | Yes — bounded, auditable, no fine-tuning | No | No | No | No |
| Salience weighting | Yes — what matters persists longer | No | No | No | No |
| Contradiction governance | Yes — explicit, operator-visible | No | No | No | No |
| LLM required | No | Yes | Yes | Yes | Yes |
| Recall latency | <1ms | ~200ms+ | ~200ms | LLM-bound | LLM-bound |
| Works offline | Fully | Partial | No | No | With local LLM |
| Cost per operation | $0 | API billing | Credit-based | LLM + DB cost | LLM cost |
| Binary size | ~3 MB | ~50 MB+ | Cloud service | Heavy (Neo4j+) | Python pkg |
| Memory decay & promotion | Built-in | Via LLM | Via LLM | No | Via LLM |
| Trust & provenance | Built-in | No | No | No | No |
| Encryption at rest | ChaCha20 + Argon2 | No | No | No | No |
| Language | Rust | Python | Proprietary | Python | Python |
The Core Idea: Cheap Model + Aura > Expensive Model Alone
Fine-tuning costs thousands of dollars and weeks of work. RAG requires embeddings and a vector database. Context windows are expensive per token.
Aura gives you a third path: a local cognitive runtime that accumulates structured experience between conversations — free, local, sub-millisecond.
```
Week 1: GPT-4o-mini + Aura                 Week 1: GPT-4 alone
  → average answers                          → average answers

Week 4: GPT-4o-mini + Aura                 Week 4: GPT-4 alone
  → recalls your workflow                    → still forgets everything
  → surfaces patterns you repeat             → same cost per token
  → exposes explainability + correction      → no improvement
  → boundedly adapts from experience         → no durable learning
  → $0 compute cost                          → still billing per call
```
The model stays the same. The cognitive layer gets stronger. That's Aura.
Performance
Benchmarked on 1,000 records (Windows 10 / Ryzen 7):
| Operation | Latency | vs Mem0 |
|-----------|---------|---------|
| Store | 0.09 ms | ~same |
| Recall (structured) | 0.74 ms | ~270× faster |
| Recall (cached) | 0.48 µs | ~400,000× faster |
| Maintenance cycle | 1.1 ms | No equivalent |
Mem0 recall requires an embedding API call (~200ms+) + vector search. Aura recall is pure local computation.
What Ships Today
Aura's full cognitive recall pipeline is active and bounded:
```
Record → Belief (±5%) → Concept (±4%) → Causal (±3%) → Policy (±2%)
```
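As an illustrative sketch (not Aura's internal code), the bounded percentages above can be read as caps on multiplicative score adjustments: each overlay may move a record's baseline score by at most its stated fraction, so even all four phases together shift ranking only modestly.

```python
def apply_bounded_phase(score: float, adjustment: float, bound: float) -> float:
    """Clamp a phase's raw adjustment to +/-bound, then apply it multiplicatively."""
    clamped = max(-bound, min(bound, adjustment))
    return score * (1.0 + clamped)

# Per-phase caps from the pipeline: Belief 5%, Concept 4%, Causal 3%, Policy 2%
PHASES = [("belief", 0.05), ("concept", 0.04), ("causal", 0.03), ("policy", 0.02)]

def rerank(baseline: float, adjustments: dict) -> float:
    """Apply each phase's bounded adjustment in pipeline order."""
    score = baseline
    for name, bound in PHASES:
        score = apply_bounded_phase(score, adjustments.get(name, 0.0), bound)
    return score

# Even maximal positive signals in every phase lift a score by only ~14.7%
ceiling = rerank(1.0, {"belief": 1.0, "concept": 1.0, "causal": 1.0, "policy": 1.0})
```

The bound is what keeps the cognitive layers advisory: retrieval remains dominated by the baseline signal, never by a single overlay.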
Enable everything in one call:
```python
brain.enable_full_cognitive_stack()   # activates all four bounded reranking phases
brain.disable_full_cognitive_stack()  # back to raw RRF baseline
```
Or configure individual phases:
```python
brain.set_belief_rerank_mode("limited")    # belief-aware ranking
brain.set_concept_surface_mode("limited")  # concept annotations + bounded concept reranking
brain.set_causal_rerank_mode("limited")    # causal chain boost
brain.set_policy_rerank_mode("limited")    # policy hint shaping
```
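The raw baseline mentioned above is Reciprocal Rank Fusion. RRF itself is simple to sketch; the standalone function below uses the commonly cited constant `k = 60`, which is an assumption here, not necessarily Aura's value:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each item scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" wins: ranked well by both signals beats ranked first by only one
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "a"]])
```

Because RRF works on ranks rather than raw scores, it fuses heterogeneous signals (keyword match, recency, tags) without any score normalization, which is part of why it needs no LLM or embedding call.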
Higher layers also expose advisory surfaced output:
- `get_surfaced_concepts()` — stable concept abstractions over repeated beliefs
- `get_surfaced_causal_patterns()` — learned cause→effect patterns
- `get_surfaced_policy_hints()` — advisory recommendations (Prefer / Avoid / Warn)
- no automatic behavior influence — all output is advisory and read-only
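Since hints never execute anything, the consuming agent decides what to do with each one. A small illustrative helper (not part of the SDK) that buckets hint dicts of the shape shown in the quick-start example:

```python
def partition_hints(hints):
    """Group advisory hints by action; the hints themselves never act."""
    by_action = {"Prefer": [], "Avoid": [], "Warn": []}
    for hint in hints:
        by_action.setdefault(hint["action"], []).append(hint)
    return by_action

hints = [
    {"action": "Prefer", "domain": "workflow", "description": "deploy to staging first"},
    {"action": "Warn", "domain": "workflow", "description": "risky pattern repeated"},
]
buckets = partition_hints(hints)
# buckets["Prefer"] holds the staging-first recommendation
```

An agent might inject `Prefer` hints into its prompt context and surface `Warn` hints to the operator, but that policy lives entirely outside Aura.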
Aura also ships operator-facing and plasticity-facing surfaces:
- explainability:
  - `explain_recall()`
  - `explain_record()`
  - `provenance_chain()`
  - `explainability_bundle()`
- governed correction:
- targeted retract/deprecate APIs
- persistent correction log
- correction review queue
- suggested corrections without auto-apply
- bounded autonomous plasticity:
  - `capture_experience()`
  - `ingest_experience_batch()`
  - maintenance-phase integration
- anti-hallucination guards
- plasticity risk scoring
- purge / freeze controls
- bounded v6 cognitive guidance:
- salience:
    - `mark_record_salience()`
    - `get_high_salience_records()`
    - `get_salience_summary()`
- reflection:
    - `get_reflection_summaries()`
    - `get_latest_reflection_digest()`
    - `get_reflection_digest()`
- contradiction and instability:
    - `get_belief_instability_summary()`
    - `get_contradiction_clusters()`
    - `get_contradiction_review_queue()`
- honest explainability support:
- unresolved-evidence markers in recall explanations
- bounded answer-support phrasing for agent / UI layers
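The contradiction surfaces above group records that assert conflicting values for the same subject. A minimal sketch of that clustering idea, using hypothetical `(subject, value)` pairs rather than Aura's real record format:

```python
from collections import defaultdict

def contradiction_clusters(records):
    """Group (subject, value) pairs; a subject with >1 distinct value is a cluster."""
    by_subject = defaultdict(set)
    for subject, value in records:
        by_subject[subject].add(value)
    return {s: sorted(v) for s, v in by_subject.items() if len(v) > 1}

records = [
    ("deploy_target", "staging"),
    ("deploy_target", "production"),
    ("ci_provider", "github-actions"),
]
clusters = contradiction_clusters(records)
# → {"deploy_target": ["production", "staging"]}
```

Surfacing clusters like this to an operator review queue, instead of silently overwriting one side, is what makes the governance explicit.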
How Memory Works
Aura organizes memories into 4 levels across 2 tiers. Important memories persist, trivial ones decay naturally:
```
CORE TIER (slow decay — weeks to months)
  Identity   [0.99]  Who the user is. Preferences. Personality.
  Domain     [0.95]  Learned facts. Domain knowledge.

COGNITIVE TIER (fast decay — hours to days)
  Decisions  [0.90]  Choices made. Action items.
  Working    [0.80]  Current tasks. Recent context.

SEMANTIC TYPES (modulate decay & promotion)
  fact           Default knowledge record.
  decision       More persistent than a standard fact. Promotes earlier.
  preference     Long-lived user or agent preference.
  contradiction  Preserved longer for conflict analysis.
  trend          Time-sensitive pattern tracked over repeated activation.
  serendipity    Cross-domain discovery record.
```
One call runs the lifecycle — decay, promotion, consolidation, and archival:
```python
report = brain.run_maintenance()  # background memory maintenance
```
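To make the lifecycle concrete, here is a toy model of one maintenance cycle using the per-level stability values from the table above. It is a sketch of the decay-and-promotion idea only; the real engine also handles consolidation, archival, and semantic-type modulation:

```python
# Per-cycle retention by level, from the memory table above
STABILITY = {"Identity": 0.99, "Domain": 0.95, "Decisions": 0.90, "Working": 0.80}

def maintain(records, promote_at=0.5, drop_below=0.05):
    """One toy cycle: decay each record by its level stability, drop faded
    records, and flag strong Working records as promotion candidates."""
    kept, promoted = [], []
    for level, strength in records:
        strength *= STABILITY[level]
        if strength < drop_below:
            continue  # decayed away
        if level == "Working" and strength >= promote_at:
            promoted.append((level, strength))
        kept.append((level, strength))
    return kept, promoted

kept, promoted = maintain([("Working", 0.9), ("Working", 0.05), ("Identity", 1.0)])
# the faded Working record is dropped; the strong one is flagged for promotion
```

The asymmetry in the stability constants is the whole mechanism: Identity survives hundreds of cycles while idle Working memory fades within a few.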
Key Features
Core Cognitive Runtime
- Fast Local Recall — Multi-signal ranking with optional embedding support
- Two-Tier Memory — Cognitive (ephemeral) + Core (permanent) with decay, promotion, and archival
- Semantic Memory Types — 6 roles (`fact`, `decision`, `trend`, `preference`, `contradiction`, `serendipity`) that influence memory behavior and insight generation
- Phase-Based Insights — Detects conflicts, trends, preference patterns, and cross-domain links
- Background Maintenance — Continuous memory hygiene: decay, reflect, insights, consolidation, archival
- Namespace Isolation — `namespace="sandbox"` keeps test data invisible to production recall
- Pluggable Embeddings — Optional embedding support: bring your own embedding function
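Namespace isolation can be pictured as a hard filter applied before any ranking. This standalone sketch shows the idea with hypothetical record dicts and a naive token match; it is not Aura's storage layout or recall algorithm:

```python
def recall(store, query_tokens, namespace="default"):
    """Return records in the given namespace sharing at least one query token."""
    hits = []
    for record in store:
        if record["namespace"] != namespace:
            continue  # sandbox data never leaks into production recall
        if set(query_tokens) & set(record["text"].lower().split()):
            hits.append(record["text"])
    return hits

store = [
    {"namespace": "default", "text": "User deploys to staging first"},
    {"namespace": "sandbox", "text": "test fixture: deploys to nowhere"},
]
prod = recall(store, ["deploys"])  # only the production record matches
```

Filtering before ranking (rather than after) means sandbox records cannot even influence the score distribution of production results.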
Trust & Safety
- **
