Aegis
Runtime policy enforcement for AI agents. Cryptographic audit trail, human-in-the-loop approvals, kill switch. Zero code changes.
AEGIS
The firewall for AI agents.
Every tool call. Intercepted. Classified. Blocked — before it executes.
<br> </div> <br><br> <div align="center"> <img src="docs/images/dashboard-overview.png" alt="AEGIS Compliance Cockpit" width="820"> <br> <sub>The AEGIS Compliance Cockpit: real-time monitoring across all your agents.</sub> </div>
Your agent just called `DROP TABLE users` because the prompt said "clean up old records."
Your agent just exfiltrated 2GB because "the user asked for a report."
Your agent just ran `rm -rf /` because the model hallucinated a tool name.
These are not hypotheticals. Every agent framework lets AI decide which tools to call, with what arguments, at machine speed. There is no human in the loop. There is no undo button.
AEGIS is the missing layer: a pre-execution firewall that sits between your agent and its tools, classifies every call in real time, enforces policies, blocks violations, and creates a tamper-evident audit trail with hash chaining and optional signing support — all with one line of code and zero changes to your agent.
Demo
<div align="center">A real Claude-powered research assistant, fully integrated with AEGIS.<br> Watch it trace tool calls, block SQL injection, detect PII, and pause for human approval — live.
<img src="docs/images/readme_demo2.gif" alt="Live agent demo" width="820"> <br>The Compliance Cockpit: traces, policies, cost tracking, sessions, approvals.
<img src="docs/images/readme_demo1.gif" alt="Dashboard walkthrough" width="820"> </div>
Quick Start
3 commands. 30 seconds. Full protection.
git clone https://github.com/Justin0504/Aegis
cd Aegis
docker compose up -d
| Service | URL | What it does |
|---------|-----|--------------|
| Compliance Cockpit | localhost:3000 | Dashboard: traces, policies, approvals, costs |
| Gateway API | localhost:8080 | Policy engine: classifies, checks, blocks |
Then add one line to your agent:
import agentguard
agentguard.auto("http://localhost:8080", agent_id="my-agent")
# Your existing code — completely unchanged
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(model="claude-sonnet-4-20250514", tools=[...], messages=[...])
For supported Python integrations, importing agentguard and calling agentguard.auto(...) once is enough to enable auto-instrumentation:
python -c "import agentguard; agentguard.auto('http://localhost:8080', agent_id='my-agent')"
That's it. Every tool call is now classified, policy-checked, and recorded in a tamper-evident audit trail before execution.
Why AEGIS?
Every agent observability tool (LangFuse, Helicone, Arize) tells you what happened. AEGIS prevents it from happening.
| Capability | LangFuse | Helicone | Arize | AEGIS |
|------------|----------|----------|-------|-------|
| Observability dashboard | ✅ | ✅ | ✅ | ✅ |
| Pre-execution blocking | ❌ | ❌ | ❌ | ✅ |
| Human-in-the-loop approvals | ❌ | ❌ | ❌ | ✅ |
| Zero-config tool classification | ❌ | ❌ | ❌ | ✅ |
| Cryptographic audit trail | ❌ | ❌ | ❌ | ✅ |
| Kill switch | ❌ | ❌ | ❌ | ✅ |
| Natural language policy editor | ❌ | ❌ | ❌ | ✅ |
| Behavioral anomaly detection | ❌ | ❌ | ❌ | ✅ |
| HTTP proxy for closed-source agents | ❌ | ❌ | ❌ | ✅ |
| MCP server for Claude Desktop | ❌ | ❌ | ❌ | ✅ |
| LLM-as-a-Judge evaluation | ❌ | ❌ | ❌ | ✅ |
| Multi-tenancy & RBAC | ❌ | ❌ | ❌ | ✅ |
| Admin audit log (SOC 2) | ❌ | ❌ | ❌ | ✅ |
| Usage metering & quotas | ❌ | ❌ | ❌ | ✅ |
| SLA metrics (P50/P95/P99) | ❌ | ❌ | ❌ | ✅ |
| Data retention policies (GDPR) | ❌ | ❌ | ❌ | ✅ |
| Slack / PagerDuty alerts | ❌ | ❌ | ❌ | ✅ |
| Self-hostable, MIT-licensed | ✅ | ❌ | ❌ | ✅ |
How it works
Your agent calls a tool
           │
           ▼  SDK / HTTP Proxy / MCP Proxy intercepts
┌────────────────────────────────────────────────┐
│              AEGIS Gateway                     │
│                                                │
│  ① Classify   (SQL? file? network? shell?)     │
│  ② Anomaly    (baseline deviation? spike?)     │
│  ③ Evaluate   (injection? exfil? traversal?)   │
│  ④ Decide     allow / block / pending          │
└──────────┬─────────────────────────────────────┘
           │
    ┌──────┴──────────────┐
    │                     │
  allow                pending ──► Human reviews in Cockpit
    │                     │               │
    ▼                     └──── allow ────┘
Tool executes                     │
    │                           block
    ▼                             │
Optional signing                  ▼
SHA-256 hash-chained   AgentGuardBlockedError
Stored in Cockpit      (agent gets the reason)
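
The hash-chaining step at the end of the allow path can be illustrated with a minimal sketch. It assumes each audit record stores the SHA-256 of the previous record's hash plus its own canonical JSON payload. The record layout, `GENESIS_HASH`, and the function names are hypothetical and are not taken from the AEGIS codebase.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # hypothetical predecessor value for the first record


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous record's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> dict:
    """Append a record whose hash commits to the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    record = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": record_hash(prev_hash, payload),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash in order; an edited or reordered record fails the check."""
    prev_hash = GENESIS_HASH
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != record_hash(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


trail = []
append_record(trail, {"tool": "run_query", "decision": "allow"})
append_record(trail, {"tool": "helper", "decision": "block"})
print(verify_chain(trail))                  # True
trail[0]["payload"]["decision"] = "block"   # tamper with an earlier record
print(verify_chain(trail))                  # False
```

Because each hash commits to its predecessor, changing one stored record invalidates verification from that point on. A signature over the latest hash is one way optional signing could be layered on top of such a chain.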
Zero-config classification — works on any tool name, any argument shape:
| Your tool call | AEGIS detects | How |
|----------------|---------------|-----|
| run_query(sql="SELECT...") | database | SQL keyword in args |
| my_tool(path="/etc/passwd") | file | Sensitive path pattern |
| do_thing(url="http://...") | network | URL in args |
| helper(cmd="rm -rf /") | shell | Command injection signal |
| custom_fn(prompt="ignore previous...") | prompt-injection | Known attack pattern |
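
As a rough illustration of argument-based classification, the sketch below scans string arguments with a few regular expressions and ignores the tool name entirely. The patterns, their ordering, and the `classify` function are assumptions made for this example. The actual AEGIS classifier is not shown in this README and is likely more thorough.

```python
import re

# Hypothetical heuristics, checked in order; the first match wins.
_RULES = [
    ("prompt-injection", re.compile(r"ignore\s+(all\s+)?previous", re.IGNORECASE)),
    ("shell", re.compile(r"\b(rm\s+-rf|sudo)\b", re.IGNORECASE)),
    ("database", re.compile(r"\b(select|insert|update|delete|drop|truncate)\b", re.IGNORECASE)),
    ("file", re.compile(r"\.\./|^/etc/|^/root/")),
    ("network", re.compile(r"https?://", re.IGNORECASE)),
]


def classify(arguments: dict) -> str:
    """Return the first category whose pattern matches any string argument."""
    values = [v for v in arguments.values() if isinstance(v, str)]
    for category, pattern in _RULES:
        if any(pattern.search(v) for v in values):
            return category
    return "unknown"


print(classify({"sql": "SELECT * FROM users"}))   # database
print(classify({"cmd": "rm -rf /"}))              # shell
```

Looking only at argument values is what lets a scheme like this work for tools with arbitrary names, at the cost of false positives on free text that happens to contain a keyword.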
Key Features
Pre-Execution Blocking
AEGIS doesn't just log — it stops dangerous tool calls before they execute.
agentguard.auto(
    "http://localhost:8080",
    blocking_mode=True,              # pause HIGH/CRITICAL calls for human review
    human_approval_timeout_s=300,    # auto-block after 5 min with no decision
)
<table>
<tr>
<td width="50%">
SQL injection — blocked instantly
<img src="docs/images/block.png" alt="Blocked SQL injection" width="100%"> </td> <td width="50%">High-risk action — awaiting human approval
<img src="docs/images/pending.png" alt="Pending approval" width="100%"> </td> </tr> </table>
The agent pauses. You open the Cockpit, inspect the exact arguments, and click Allow or Block. The agent resumes in under a second.
from agentguard import AgentGuardBlockedError

try:
    response = client.messages.create(...)
except AgentGuardBlockedError as e:
    print(f"Blocked: {e.tool_name} — {e.reason} ({e.risk_level})")
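
The pause-and-resume behavior can be pictured as a wait on a decision with a deadline. The following sketch is not the SDK's implementation. `PendingApproval` and its methods are hypothetical names, and the only behavior borrowed from the text above is that a call with no decision before the timeout is treated as blocked.

```python
import threading


class PendingApproval:
    """Hypothetical stand-in for one paused tool call awaiting review."""

    def __init__(self) -> None:
        self._decided = threading.Event()
        self._decision = "block"

    def resolve(self, decision: str) -> None:
        """Called from the reviewer side with 'allow' or 'block'."""
        if decision not in ("allow", "block"):
            raise ValueError(f"unknown decision: {decision!r}")
        self._decision = decision
        self._decided.set()

    def wait(self, timeout_s: float) -> str:
        """Block the caller until a decision arrives; auto-block on timeout."""
        if self._decided.wait(timeout_s):
            return self._decision
        return "block"


approval = PendingApproval()
# Simulate a reviewer clicking Allow shortly after the call is paused.
threading.Timer(0.05, approval.resolve, args=("allow",)).start()
print(approval.wait(timeout_s=1.0))             # allow
print(PendingApproval().wait(timeout_s=0.05))   # block
```

Defaulting to "block" when the deadline passes keeps the failure mode safe: an unattended queue never lets a risky call through.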
Policy Engine
Five policies ship by default. Create more in plain English — the AI assistant generates the JSON schema for you.
| Policy | Risk | What it catches |
|--------|------|-----------------|
| SQL Injection Prevention | HIGH | DROP, DELETE, TRUNCATE in database tools |
| File Access Control | MEDIUM | Path traversal (../), /etc/, /root/ |
| Network Access Control | MEDIUM | HTTP (non-HTTPS) requests |
| Prompt Injection Detection | CRITICAL | "ignore previous instructions" patterns |
| Data Exfiltration Prevention | HIGH | Large payloads to external endpoints |
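
To make the default policies more concrete, here is a minimal sketch of how a rule table like the one above could be evaluated. The `Policy` fields, the regular expressions, and `evaluate` are illustrative assumptions. The JSON schema that AEGIS generates is not documented in this section.

```python
import re
from dataclasses import dataclass

RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


@dataclass(frozen=True)
class Policy:
    name: str
    risk: str
    category: str   # tool category the policy applies to
    pattern: str    # regex checked against each string argument


# Three of the five default policies, with assumed patterns.
POLICIES = [
    Policy("SQL Injection Prevention", "HIGH", "database", r"\b(DROP|DELETE|TRUNCATE)\b"),
    Policy("File Access Control", "MEDIUM", "file", r"\.\./|^/etc/|^/root/"),
    Policy("Network Access Control", "MEDIUM", "network", r"^http://"),
]


def evaluate(category: str, arguments: dict) -> list:
    """Return every policy the call violates, highest risk first."""
    values = [v for v in arguments.values() if isinstance(v, str)]
    hits = [
        p for p in POLICIES
        if p.category == category
        and any(re.search(p.pattern, v, re.IGNORECASE) for v in values)
    ]
    return sorted(hits, key=lambda p: RISK_ORDER.index(p.risk), reverse=True)


violations = evaluate("database", {"sql": "DROP TABLE users"})
print([(p.name, p.risk) for p in violations])   # [('SQL Injection Prevention', 'HIGH')]
print(evaluate("database", {"sql": "SELECT 1"}))  # []
```

How a violation's risk level maps to allow, block, or pending is left out here on purpose, since that mapping depends on gateway settings such as `blocking_mode`.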
"Block all file deletions outside the /tmp directory" → Describe button → policy created instantly.
Behavioral Anomaly Detection
AEGIS builds a behavioral profile for each agent and flags deviations in real time — no manual rules required.
Nine-dimensional analysis:
| Dimension | What it catches |
|-----------|-----------------|
| Tool novelty | Agent uses a tool it has never called before |
| Frequency spike | Sudden burst of calls (3x above normal rate) |
| Argument shape drift | Parameters don't match historical patterns |
| Argument length outlier | Unusually large payloads (data exfiltration signal) |
| Temporal anomaly | Calls at unusual hours |
| Sequence anomaly | Unexpected tool ordering (e.g. delete without prior read) |
| Cost spike | Single call costs 5x the agent's average |
| Risk escalation | Jump from LOW-risk to HIGH-risk tools |
| Session burst | Too many calls in one session |
Cold-start safe: AEGIS learns for the first 200 traces before blocking, so new agents are not flagged on the basis of an empty baseline.
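
Two of these dimensions, tool novelty and frequency spike, are simple enough to sketch. The example below assumes a per-minute call rate as the baseline and reuses the 200-trace warm-up and 3x threshold mentioned above. `AgentProfile` and `observe_minute` are hypothetical names, and the real scoring is presumably more nuanced.

```python
from collections import Counter

COLD_START_TRACES = 200   # warm-up length stated in the README
SPIKE_FACTOR = 3.0        # "3x above normal rate" from the table


class AgentProfile:
    """Hypothetical per-agent baseline covering two of the nine dimensions."""

    def __init__(self) -> None:
        self.tool_counts = Counter()
        self.total_traces = 0
        self.total_minutes = 0

    def observe_minute(self, tools_called: list) -> list:
        """Score one minute of activity, then fold it into the baseline."""
        flags = []
        if self.total_traces >= COLD_START_TRACES:
            for tool in sorted(set(tools_called)):
                if self.tool_counts[tool] == 0:
                    flags.append(f"tool novelty: {tool}")
            baseline_rate = self.total_traces / self.total_minutes
            if len(tools_called) > SPIKE_FACTOR * baseline_rate:
                flags.append("frequency spike")
        self.tool_counts.update(tools_called)
        self.total_traces += len(tools_called)
        self.total_minutes += 1
        return flags


profile = AgentProfile()
for _ in range(50):                       # 50 minutes x 4 calls = 200 warm-up traces
    profile.observe_minute(["search"] * 4)
print(profile.observe_minute(["search"] * 4))
# []
print(profile.observe_minute(["search"] * 20 + ["delete_file"]))
# ['tool novelty: delete_file', 'frequency spike']
```

During the warm-up the profile only records activity and returns no flags, which mirrors the cold-start behavior described above.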
Proxy Interception (for closed-source agents)
For agents you can't modify (compiled binaries, third-party tools), AEGIS provides two proxy modes:
HTTP Forward Proxy — intercepts LLM API calls (Anthropic / OpenAI):
# Start the proxy
agentguard http-proxy --port 8081 --agent-id my-agent
# Point any agent at it — zero code changes
export ANTHROPIC_BASE_URL=http://localhost:8081
export OPENAI_BASE_URL=http://localhost:8081/v1
Captures: full prompt/response, tool_use calls, token usage, cost. Supports SSE streaming.
MCP Stdio Proxy — wraps any MCP server with policy enforcement:
agentguard mcp-proxy \
--server npx -y @modelcontextprotocol/server-filesystem / \
--agent-id my-agent --blocking
Every MCP tools/call is policy-checked and anomaly-scored before reaching the upstream server.
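
Conceptually, a stdio proxy of this kind reads each JSON-RPC message from the client, lets most of them through, and answers `tools/call` requests itself when a policy says no. The sketch below shows only that decision. `BLOCKED_TOOLS`, `filter_message`, and the `-32000` error code are assumptions for illustration, not the behavior of `agentguard mcp-proxy`.

```python
import json

BLOCKED_TOOLS = {"delete_file"}   # hypothetical stand-in for a real policy check


def filter_message(line: str) -> tuple:
    """Return ('forward', line) or ('reject', error_json) for one client message."""
    message = json.loads(line)
    if message.get("method") != "tools/call":
        return ("forward", line)          # non-tool traffic passes through untouched
    tool = message.get("params", {}).get("name")
    if tool in BLOCKED_TOOLS:
        error = {
            "jsonrpc": "2.0",
            "id": message.get("id"),      # echo the request id so the client can match it
            "error": {"code": -32000, "message": f"blocked by policy: {tool}"},
        }
        return ("reject", json.dumps(error))
    return ("forward", line)


call = json.dumps({
    "jsonrpc": "2.0", "id": 7, "method": "tools/call",
    "params": {"name": "delete_file", "arguments": {"path": "/tmp/x"}},
})
action, payload = filter_message(call)
print(action)                                    # reject
print(json.loads(payload)["error"]["message"])   # blocked by policy: delete_file
```

A rejected call never reaches the upstream server; the client simply receives a JSON-RPC error with the same request id.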
| Proxy | Intercepts | Use case |
|-------|-----------|----------|
| HTTP Proxy | LLM API calls (Anthropic/OpenAI) | Closed-source agents, binary tools |
| MCP Proxy | MCP tool calls (stdio JSON-RPC) | Claude Desktop, any MCP client |
| SDK | LLM SDK calls (monkey-patch) | Your own Python/JS/Go code |
