# 🧠 CogniLayer v4

**Stop re-explaining your codebase to AI.**
Infinite speed memory · Code graph · 200K+ tokens saved
Without CogniLayer, your AI agent starts every session blind. It re-reads files, re-discovers architecture, re-learns decisions you explained last week. On a 50-file project, that's 80-100K tokens burned before real work begins.
With CogniLayer, it already knows. Four things your agent doesn't have today:

- 🔗 **Persistent knowledge across agents** - facts, decisions, error fixes, and gotchas survive sessions, crashes, and agent switches. Start in Claude Code, continue in Codex CLI - zero context loss
- 🔍 **Code intelligence** - who calls what, what depends on what, what breaks if you rename a function. Tree-sitter AST parsing across 10+ languages, not grep
- 🤖 **Subagent context compression** - research subagents write findings to the DB instead of dumping 40K+ tokens into the parent context. The parent gets a 500-token summary plus on-demand `memory_search` retrieval
- ⚡ **80-200K+ tokens saved per session** - semantic search replaces file reads, and subagent findings go to the DB instead of context. Longer sessions with subagents save more
## See the Difference

**Without CogniLayer**

```
You: "Fix the login bug"
Claude: Let me read the project structure...
        Let me read src/auth/login.ts...
        Let me read src/auth/middleware.ts...
        Let me read src/config/database.ts...
        Let me understand your auth flow...
        (8 files read, 45K tokens burned, 2 minutes spent on orientation)
Claude: "Ok, I see the issue..."
```

**With CogniLayer**

```
You: "Fix the login bug"
Claude: [memory_search → "login auth flow"] → 3 facts loaded (200 tokens)
        [code_context → "handleLogin"] → caller/callee map in 0.2s
        Already knows: Express + Passport, JWT in httpOnly cookies,
        last login bug was a race condition in session refresh (fixed 2 weeks ago)
Claude: "This looks like the same pattern as the session refresh issue
        from March 1st. The fix is..."
```
That's not a small improvement. That's the difference between an agent that guesses and one that knows.
## Real-World Examples
### Debugging: "Why is checkout failing?"

Without CogniLayer, Claude reads 15 files to understand your e-commerce flow. With it:

```
memory_search("checkout payment flow")
→ fact: "Stripe webhook hits /api/webhooks/stripe, validates signature
  with STRIPE_WEBHOOK_SECRET, then calls processOrder()"
→ gotcha: "Stripe sends webhooks with 5s timeout - processOrder must
  complete within 5s or webhook retries cause duplicate orders"
→ error_fix: "Fixed duplicate orders on 2026-02-20 by adding
  idempotency key check in processOrder()"

code_impact("processOrder")
→ depth 1: createOrderRecord, sendConfirmationEmail, updateInventory
→ depth 2: InventoryService.reserve, EmailQueue.push
→ "Changing processOrder will affect 6 functions across 4 files"
```
Claude already knows the architecture, the past bugs, and what will break if it touches the wrong thing. Instead of 15 file reads (~60K tokens), it uses 3 targeted queries (~800 tokens).
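The `error_fix` fact above names a standard pattern: an idempotency key makes webhook retries harmless. A minimal sketch of that idea, assuming a SQLite order store and using the Stripe event ID as the key (the function and schema names are illustrative, not CogniLayer internals):

```python
import sqlite3

# In-memory DB standing in for the real order store (assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (idempotency_key TEXT PRIMARY KEY, status TEXT)")

def process_order(event_id: str) -> str:
    """Process a Stripe webhook event at most once.

    Stripe retries webhooks that don't complete in time, so the event ID
    doubles as an idempotency key: a retry for an already-recorded event
    becomes a no-op instead of a duplicate order.
    """
    try:
        db.execute(
            "INSERT INTO orders (idempotency_key, status) VALUES (?, ?)",
            (event_id, "processed"),
        )
        db.commit()
        return "created"
    except sqlite3.IntegrityError:
        return "duplicate-ignored"  # retry of an event we already handled

print(process_order("evt_123"))  # → created
print(process_order("evt_123"))  # → duplicate-ignored (webhook retry)
```

The PRIMARY KEY constraint does the deduplication atomically, so even concurrent retries can't both insert.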
### Code Intelligence: "What happens if I change processOrder?"

Without CogniLayer, Claude greps for the function name and hopes for the best. With it:

```
code_context("processOrder")
→ definition: src/services/order.ts:42
→ incoming (who calls it): StripeWebhookHandler.handle, OrderController.retry,
  AdminPanel.reprocessOrder
→ outgoing (what it calls): createOrderRecord, sendConfirmationEmail,
  updateInventory, PaymentLog.write

code_impact("processOrder")
→ depth 1 (WILL BREAK): StripeWebhookHandler, OrderController, AdminPanel
→ depth 2 (LIKELY AFFECTED): WebhookRouter, RetryQueue, AdminRoutes
→ depth 3 (NEED TESTING): 3 test files, 1 integration test
→ "Changing processOrder will affect 9 symbols across 7 files"
```
Before touching a single line, Claude knows the full blast radius - which files will break, which need testing, and which callers depend on the current behavior. No more surprise failures after a refactor.
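The depth levels map naturally onto a breadth-first walk of the reverse call graph. A toy sketch of the idea (the edges are hand-written here; CogniLayer derives them from tree-sitter ASTs):

```python
from collections import deque

# caller -> callees edges, hand-written for illustration
calls = {
    "StripeWebhookHandler.handle": ["processOrder"],
    "OrderController.retry": ["processOrder"],
    "AdminPanel.reprocessOrder": ["processOrder"],
    "WebhookRouter.route": ["StripeWebhookHandler.handle"],
    "RetryQueue.drain": ["OrderController.retry"],
}

def impact(symbol):
    """Group every transitive caller of `symbol` by its distance (depth)."""
    # Invert the edges: for each symbol, who calls it?
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)

    depths, seen = {}, {symbol}
    queue = deque([(symbol, 0)])
    while queue:
        node, d = queue.popleft()
        for caller in callers.get(node, []):
            if caller not in seen:
                seen.add(caller)
                depths.setdefault(d + 1, []).append(caller)
                queue.append((caller, d + 1))
    return depths

for depth, symbols in sorted(impact("processOrder").items()):
    print(depth, symbols)  # depth 1: direct callers, depth 2: their callers, ...
```

The `seen` set keeps recursive or diamond-shaped call graphs from being walked twice, so each symbol is reported at its shortest distance.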
### Refactoring: "Rename UserService to AccountService"

```
code_search("UserService")
→ class UserService in src/services/user.ts (line 14)
→ 12 references across 8 files

code_impact("UserService")
→ depth 1: AuthController, ProfileController, AdminPanel (WILL BREAK)
→ depth 2: LoginRoute, RegisterRoute, middleware/auth (LIKELY AFFECTED)
→ depth 3: 4 test files (NEED UPDATING)

memory_search("UserService")
→ decision: "UserService handles both auth and profile - planned split
  into AuthService + ProfileService (decided 2026-02-15, not yet done)"
```
Claude doesn't just find-and-replace. It knows there's a planned split and can suggest doing both changes at once - saving you a future refactoring session.
### New session after a crash: "What was I working on?"

```
[SessionStart hook fires automatically]
→ bridge loaded: "Progress: Migrated 3/5 API endpoints to v2 format.
  Done: /users, /products, /orders. Open: /payments, /shipping.
  Blocker: /payments needs Stripe SDK v12 upgrade first."

memory_search("stripe sdk upgrade")
→ gotcha: "Stripe SDK v12 changed webhook signature verification -
  verify() is now async, breaks all sync handlers"
```
Zero re-explanation. Claude picks up exactly where it left off, including the blocker you hadn't mentioned yet.
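A session bridge can be as simple as a one-row-per-session summary that the end-of-session (or crash-recovery) hook writes and the next SessionStart hook reads back. A sketch of the idea, assuming a SQLite store (the schema and field names are illustrative, not CogniLayer's actual format):

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bridges (ts REAL, progress TEXT, open_items TEXT, blocker TEXT)")

def save_bridge(progress, open_items, blocker):
    """Called at session end (or by crash recovery from the change log)."""
    db.execute("INSERT INTO bridges VALUES (?, ?, ?, ?)",
               (time.time(), progress, open_items, blocker))
    db.commit()

def load_latest_bridge():
    """Called by the SessionStart hook to rebuild context."""
    return db.execute(
        "SELECT progress, open_items, blocker FROM bridges "
        "ORDER BY ts DESC LIMIT 1").fetchone()

save_bridge("Migrated 3/5 API endpoints to v2 format",
            "/payments, /shipping",
            "/payments needs Stripe SDK v12 upgrade first")

progress, open_items, blocker = load_latest_bridge()
print(f"Progress: {progress}. Open: {open_items}. Blocker: {blocker}")
```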
### Subagent research: "What MCP frameworks exist?"

Without CogniLayer, the subagent returns a 40K-token dump into parent context:

```
Parent (200K context):
→ spawn subagent: "Research community MCP servers"
← subagent returns: 40K tokens about 15 projects
→ all 40K crammed into parent context
→ remaining: 160K → next subagent → 120K → next → 80K...
```

With CogniLayer's Subagent Memory Protocol:

```
Parent (200K context):
→ spawn subagent: "Research MCP servers, save to memory"
← subagent writes details to DB, returns: "Saved 3 facts,
  search 'MCP server ecosystem'. Summary: Python dominates,
  FastMCP most popular, 3 architectural patterns."
→ parent context: ~500 tokens
→ need details? memory_search("MCP server ecosystem") → targeted pull
```
40K tokens compressed to 500. The findings persist in DB across sessions - not just for this conversation, but forever.
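The compression trick itself is simple: the subagent writes its full findings to the shared store and hands the parent only a pointer plus a short summary. A sketch of the protocol shape (the storage layer and the placeholder "findings" are stand-ins, not CogniLayer's real schema):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE facts (topic TEXT, body TEXT)")

def memory_save(topic, body):
    db.execute("INSERT INTO facts VALUES (?, ?)", (topic, body))
    db.commit()

def memory_search(topic):
    return [row[0] for row in
            db.execute("SELECT body FROM facts WHERE topic = ?", (topic,))]

def research_subagent():
    """Writes full findings to the DB; returns only a pointer + summary."""
    topic = "MCP server ecosystem"
    memory_save(topic, "(long detailed finding #1 ...)")
    memory_save(topic, "(long detailed finding #2 ...)")
    memory_save(topic, "(long detailed finding #3 ...)")
    return (f"Saved 3 facts, search '{topic}'. "
            "Summary: Python dominates, FastMCP most popular.")

summary = research_subagent()                     # tiny string into parent context
details = memory_search("MCP server ecosystem")   # targeted pull, only on demand
print(summary)
print(len(details), "facts available")
```

The parent pays only for the summary; the detail lives in the DB until a query actually needs it, and persists for every later session.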
## Killer Features
| Feature | What it means |
|---------|--------------|
| Code Intelligence | code_context shows who calls what. code_impact maps blast radius before you touch anything. Powered by tree-sitter AST parsing |
| Semantic Search | Hybrid FTS5 + vector search finds the right fact even with different wording. Sub-millisecond response |
| MCP Toolset | Memory, code analysis, safety, project context, and safe Codex orchestration helpers |
| Token Savings | 3 targeted queries (~800 tokens) replace 15 file reads (~60K tokens). Typical session saves 80-200K+ tokens |
| Subagent Protocol | Research subagents save findings to DB instead of flooding parent context. 40K → 500 tokens per subagent task |
| Crash Recovery | Session dies? Next one auto-recovers from the change log. Works across both agents |
| Cross-Project Knowledge | Solved a CORS issue in project A? Search it from project B. Your experience compounds |
| 14 Fact Types | Not dumb notes - error_fix, gotcha, api_contract, decision, pattern, procedure, and more |
| Heat Decay | Hot facts surface first, cold facts fade. Each search hit boosts relevance |
| Safety Gates | Identity Card system blocks deploy to wrong server. Audit trail on every safety change |
| Agent Interop | Claude Code and Codex CLI share the same brain. Switch agents mid-task, zero context loss |
| Session Bridges | Every session starts with a summary of what happened last time |
| TUI Dashboard | Visual memory browser with 8 tabs - see everything at a glance |
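
The Heat Decay row can be modeled as recency-weighted scoring: every search hit bumps a fact's heat, and heat decays exponentially with time since the last access. A minimal sketch of one plausible formula (the half-life and boost constants are made-up tuning values, not CogniLayer's):

```python
import math
import time

HALF_LIFE_DAYS = 14   # assumed: heat halves every two weeks
HIT_BOOST = 1.0       # assumed: each search hit adds this much heat

class Fact:
    def __init__(self, text):
        self.text = text
        self.heat = 1.0
        self.last_hit = time.time()

    def current_heat(self, now=None):
        """Heat decayed exponentially by days since the last hit."""
        now = time.time() if now is None else now
        days = (now - self.last_hit) / 86400
        return self.heat * math.exp(-math.log(2) * days / HALF_LIFE_DAYS)

    def record_hit(self):
        """A search hit re-warms the fact from its decayed value."""
        self.heat = self.current_heat() + HIT_BOOST
        self.last_hit = time.time()

hot, cold = Fact("hit yesterday"), Fact("untouched for 60 days")
hot.last_hit = time.time() - 1 * 86400
cold.last_hit = time.time() - 60 * 86400
print(hot.current_heat() > cold.current_heat())  # → True
```

With these numbers a fact untouched for 60 days keeps only about 5% of its heat, so recently-used facts naturally rank first.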
## How It Works

```
You start a session
        ↓
SessionStart hook fires → injects project DNA, last session bridge, crash recovery
        ↓
You work normally - Claude saves facts, decisions, gotchas automatically via MCP tools
        ↓
You ask about code → code_context / code_impact answer in milliseconds from AST index
        ↓
Session ends (or crashes)
        ↓
Next session starts with full context - no re-reading, no re-explaining
```
Zero effort after install. No commands to learn, no workflow changes. CogniLayer runs in the background via hooks and MCP tools. Claude knows how to use it automatically.
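For orientation, the "runs in the background via hooks" part refers to Claude Code's hooks system: a `SessionStart` entry in `settings.json` runs a command whose output is injected into the new session's context. The shape is roughly as follows (the script path is hypothetical; the installer writes the real entries):

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "python ~/.cognilayer/hooks/session_start.py" }
        ]
      }
    ]
  }
}
```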
## Quick Start

### 1. Install (30 seconds)

```bash
git clone https://github.com/LakyFx/CogniLayer.git
cd CogniLayer
python install.py
```

That's it. Next time you start Claude Code, CogniLayer is active.
### 2. Optional: Turbocharge search

```bash
# AI-powered vector search (recommended - finds facts even with different wording)
pip install fastembed sqlite-vec
```
### 3. Optional: Add Codex CLI support

```bash
python install.py --codex   # Codex CLI only
python install.py --both    # Claude Code + Codex CLI
```

Codex installs static AGENTS instructions plus reusable workflow files in:

```
~/.cognilayer/codex/onboard.md
~/.cognilayer/codex/harvest.md
~/.cognilayer/codex/checkpoint.md
~/.cognilayer/codex/multi_agent_safe.md
```
### 4. Verify

```bash
python ~/.cognilayer/mcp-server/server.py --test
# → "OK: Registered <n> tools."
```
## Troubleshooting

MCP server not connecting? Run the diagnostic tool:

```bash
python diagnose.py
```
