AvaKill

Open-source safety firewall for AI agents

One YAML policy. Three independent enforcement paths. Every agent protected.

pipx install avakill && avakill setup

Quickstart · How It Works · Integrations · Policy · CLI · Docs · Contributing


The Problem

AI agents are shipping to production with zero safety controls on their tool calls. The results are predictable:

Replit's agent dropped a production database and fabricated 4,000 fake user accounts to cover it up.
Google's Gemini CLI wiped a user's entire D: drive — 8,000+ files, gone.
Amazon Q terminated EC2 instances and deleted infrastructure during a debugging session.

These aren't edge cases. Research shows AI agents fail in 75% of real-world tasks, and when they fail, they fail catastrophically — because nothing sits between the agent and its tools.

AvaKill is that missing layer: a firewall that intercepts every tool call, evaluates it against your safety policies, and kills dangerous operations before they execute. No ML models and no API calls, so no meaningful added latency: just fast, deterministic policy checks in <1ms.

Quickstart

pipx install avakill
avakill setup

macOS note: macOS 14+ blocks pip install at the system level (PEP 668). Use pipx or a virtualenv.

avakill setup walks you through an interactive flow that:

Detects agents across three enforcement paths (hooks, MCP proxy, OS sandbox)
Creates a policy from a catalog of 81 rules across 14 categories
Installs hooks for detected agents (Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Kiro, Amp, OpenClaw)
Wraps MCP servers for MCP-capable agents (Claude Desktop, Cline, Continue)
Shows sandbox commands for agents that support OS-level containment
Enables tracking (optional) for audit logs and diagnostics

After setup, test it:

echo '{"tool": "Bash", "args": {"command": "rm -rf /"}}' | avakill evaluate --policy avakill.yaml
# deny: Matched rule 'block-catastrophic-shell'

Safe calls pass through. Destructive calls are killed before they execute.
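The same check can be scripted. The following is a minimal sketch that assumes only the `avakill evaluate --policy` invocation and the JSON shape shown above. The helper names `build_payload` and `evaluate_call` are hypothetical, and nothing is assumed about exit codes, so the raw output is returned for the caller to inspect.

```python
import json
import subprocess


def build_payload(tool: str, args: dict) -> str:
    """Serialize a tool call in the JSON shape used in the example above."""
    return json.dumps({"tool": tool, "args": args})


def evaluate_call(tool: str, args: dict, policy: str = "avakill.yaml") -> str:
    """Pipe one tool call to `avakill evaluate` and return its stdout.

    Requires avakill on PATH. Exit-code semantics are not assumed here,
    so the caller inspects the returned text.
    """
    result = subprocess.run(
        ["avakill", "evaluate", "--policy", policy],
        input=build_payload(tool, args),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()
```

Calling `evaluate_call("Bash", {"command": "rm -rf /"})` against the generated policy should return the same deny line as the shell example.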

Optional framework extras

pip install "avakill[openai]"       # OpenAI function calling
pip install "avakill[anthropic]"    # Anthropic tool use
pip install "avakill[langchain]"    # LangChain / LangGraph
pip install "avakill[mcp]"          # MCP proxy
pip install "avakill[all]"          # Everything

How It Works

AvaKill enforces a single YAML policy across three independent enforcement paths. Each path works standalone — no daemon required, no single point of failure.

avakill.yaml (one policy file)
    |
    ├── Hooks (Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Kiro, Amp, OpenClaw)
    |     → work standalone, evaluate in-process
    |
    ├── MCP Proxy (wraps MCP servers)
    |     → works standalone, evaluates in-process
    |
    ├── OS Sandbox (launch + profiles)
    |     → works standalone, OS-level enforcement
    |
    └── Daemon (optional)
          → shared evaluation, audit logging
          → hooks/proxy CAN talk to it if running
          → enables: logs, fix, tracking, approvals, metrics

One Policy File: avakill.yaml is the single source of truth. Deny-by-default, allow lists, rate limits, argument pattern matching, shell safety checks, path resolution, and content scanning.

Native Agent Hooks: Drop-in hooks for Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Kiro, Amp, and OpenClaw. One command to install. Works standalone, with no daemon required.

MCP Proxy: Wraps any MCP server with policy enforcement. Scans tool responses for secrets, PII, and prompt injection. Works standalone, evaluates in-process.

OS Sandbox: Launch agents in OS-level sandboxes. Landlock on Linux, sandbox-exec on macOS, AppContainer on Windows. Deny-default, kernel-level enforcement.

Sub-Millisecond: Pure rule evaluation, no ML models. Adds <1ms overhead to tool calls that already take 500ms-5s. Three enforcement paths, zero bottlenecks.

Optional Daemon: Shared evaluation, audit logging, and visibility tooling. Hooks and proxy can talk to it when running. Enables logs, tracking, approvals, and metrics.
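To make response scanning concrete, here is a minimal sketch of pattern-based secret detection. It is not AvaKill's scanner: the function name `scan_text` and the two patterns (the widely documented AWS access key ID format and PEM private-key headers) are illustrative assumptions, and a real scanner would cover far more cases, including PII and prompt injection.

```python
import re

# Illustrative patterns only; a production scanner needs a much larger set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[str]:
    """Return the names of all patterns found in text, in sorted order."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )
```

A proxy built this way would run `scan_text` on each tool response and block or redact the response when the returned list is non-empty.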

Integrations

Native Agent Hooks

Protect AI agents with zero code changes — just install the hook:

# Install hooks (works standalone — no daemon required)
avakill hook install --agent claude-code  # or cursor, windsurf, gemini-cli, openai-codex, kiro, amp, openclaw, all
avakill hook list

Hooks work standalone by default — each hook evaluates policies in-process. Policies use canonical tool names (shell_execute, file_write, file_read) so one policy works across all agents.
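A minimal sketch of the name normalization this implies. The alias table is an assumption inferred from the tool lists in the example policy (for instance `Bash` listed alongside `shell_execute`), not AvaKill's actual mapping, and `canonical_name` is a hypothetical helper.

```python
# Hypothetical alias table: agent-native tool name -> canonical name.
ALIASES = {
    "Bash": "shell_execute",
    "run_shell_command": "shell_execute",
    "Write": "file_write",
    "Edit": "file_edit",
}


def canonical_name(tool: str) -> str:
    """Map an agent-specific tool name to its canonical form.

    Unknown names pass through unchanged so policies can still match them.
    """
    return ALIASES.get(tool, tool)
```

Normalizing before evaluation is what lets a rule written against `shell_execute` apply regardless of which agent issued the call.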

| Agent | Hook Status |
|---|---|
| Claude Code | Battle-tested |
| Cursor | Supported |
| Windsurf | Supported |
| Gemini CLI | Supported |
| OpenAI Codex | Supported |
| Kiro | Supported |
| Amp | Supported |
| OpenClaw | Native plugin (6-layer) |

OpenClaw native plugin: OpenClaw uses a dedicated plugin (avakill-openclaw) with 6 enforcement layers — hard block, guard tool, output scanning, message gate, spawn control, and context injection. Install with openclaw plugins install avakill-openclaw. Sandbox is available as a fallback via avakill launch --agent openclaw.

MCP Proxy

Wrap MCP servers to route all tool calls through AvaKill:

avakill mcp-wrap --agent claude-desktop   # or cursor, windsurf, cline, continue, all
avakill mcp-unwrap --agent all            # Restore original configs

Supported agents: Claude Desktop, Cursor, Windsurf, Cline, Continue.dev.

OS Sandbox

Launch agents in OS-level sandboxes with pre-built profiles:

avakill profile list                    # See available profiles
avakill profile show aider              # See what a profile restricts
avakill launch --agent aider -- aider   # Launch with OS sandbox

Profiles ship for OpenClaw (fallback — prefer the native plugin), Cline, Continue, SWE-Agent, and Aider.

Python SDK

For programmatic integration, AvaKill's Guard is available as a Python API:

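from avakill import Guard, protect

# Load the policy once and reuse the guard across decorated functions
guard = Guard(policy="avakill.yaml")

@protect(guard=guard, on_deny="return_none")  # or "raise" (default), "callback"
def execute_sql(query: str) -> str:
    return db.execute(query)  # db: your existing database handle (not defined here)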
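The three `on_deny` modes can be pictured with a stand-in decorator. This is a sketch of the pattern, not AvaKill's implementation: `protect_sketch`, `is_allowed`, `PolicyDenied`, and `no_drops` are hypothetical names, and the real `protect` may differ in signature and behavior.

```python
import functools
import inspect
from typing import Any, Callable, Optional


class PolicyDenied(Exception):
    """Raised when a call is denied and on_deny is "raise"."""


def protect_sketch(
    is_allowed: Callable[[str, dict], bool],
    on_deny: str = "raise",
    callback: Optional[Callable[[str, dict], Any]] = None,
):
    """Wrap a function so each call is checked before it runs."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind positional and keyword arguments to parameter names.
            call_args = dict(signature.bind(*args, **kwargs).arguments)
            if is_allowed(func.__name__, call_args):
                return func(*args, **kwargs)
            if on_deny == "return_none":
                return None
            if on_deny == "callback" and callback is not None:
                return callback(func.__name__, call_args)
            raise PolicyDenied(f"{func.__name__} denied")

        return wrapper

    return decorator


def no_drops(tool: str, args: dict) -> bool:
    """Toy check: deny any query containing DROP (case-insensitive)."""
    return "DROP" not in str(args.get("query", "")).upper()


@protect_sketch(no_drops, on_deny="return_none")
def execute_sql(query: str) -> str:
    return f"ran: {query}"
```

With `on_deny="return_none"` a denied call returns `None` without running the function body. The default `"raise"` mode makes a denial impossible to ignore, at the cost of requiring the caller to handle the exception.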

Framework wrappers:

# OpenAI
from openai import OpenAI
from avakill import GuardedOpenAIClient
client = GuardedOpenAIClient(OpenAI(), policy="avakill.yaml")

# Anthropic
from anthropic import Anthropic
from avakill import GuardedAnthropicClient
client = GuardedAnthropicClient(Anthropic(), policy="avakill.yaml")

# LangChain / LangGraph
from avakill import AvaKillCallbackHandler
handler = AvaKillCallbackHandler(policy="avakill.yaml")
# agent: your existing LangChain agent or LangGraph runnable
agent.invoke({"input": "..."}, config={"callbacks": [handler]})

Policy Configuration

Policies are YAML files. Rules are evaluated top-to-bottom — first match wins.

version: "1.0"
default_action: deny

policies:
  # Allow safe shell with allowlist + metacharacter protection
  - name: "allow-safe-shell"
    tools: ["shell_execute", "Bash", "run_shell_command", "run_command",
            "shell", "local_shell", "exec_command"]
    action: allow
    conditions:
      shell_safe: true
      command_allowlist: [echo, ls, cat, pwd, git, python, pip, npm, node, make]

  # Block destructive SQL
  - name: "block-destructive-sql"
    tools: ["execute_sql", "database_*"]
    action: deny
    conditions:
      args_match:
        query: ["DROP", "DELETE", "TRUNCATE", "ALTER"]
    message: "Destructive SQL blocked. Use a manual migration."

  # Block writes to system directories
  - name: "block-system-writes"
    tools: ["file_write", "file_edit", "Write", "Edit"]
    action: deny
    conditions:
      path_match:
        file_path: ["/etc/", "/usr/", "/bin/", "/sbin/"]

  # Scan for secrets in tool arguments
  - name: "block-secret-leaks"
    tools: ["*"]
    action: deny
    conditions:
      content_scan: true

  # Rate limit API calls
  - name: "rate-limit-search"
    tools: ["web_search"]
    action: allow
    rate_limit:
      max_calls: 10
      window: "60s"

  # Require human approval for file writes
  - name: "approve-writes"
    tools: ["file_write"]
    action: require_approval
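To illustrate top-to-bottom, first-match-wins evaluation, here is a minimal sketch over plain dictionaries. It is not AvaKill's engine: it assumes tool names are matched as shell-style globs and `args_match` as a case-insensitive substring search, supports only those two features, and `rule_matches` and `evaluate` are hypothetical functions. The `allow-sql` rule is invented for the example.

```python
from fnmatch import fnmatchcase
from typing import Optional


def rule_matches(rule: dict, tool: str, args: dict) -> bool:
    """True if the tool name matches a glob and every args_match entry hits."""
    if not any(fnmatchcase(tool, pattern) for pattern in rule["tools"]):
        return False
    wanted = rule.get("conditions", {}).get("args_match", {})
    for key, needles in wanted.items():
        value = str(args.get(key, "")).lower()
        if not any(needle.lower() in value for needle in needles):
            return False
    return True


def evaluate(policy: dict, tool: str, args: dict) -> tuple[str, Optional[str]]:
    """Return (action, rule name) for the first matching rule, else the default."""
    for rule in policy["policies"]:
        if rule_matches(rule, tool, args):
            return rule["action"], rule["name"]
    return policy["default_action"], None


POLICY = {
    "default_action": "deny",
    "policies": [
        {
            "name": "block-destructive-sql",
            "tools": ["execute_sql", "database_*"],
            "action": "deny",
            "conditions": {
                "args_match": {"query": ["DROP", "DELETE", "TRUNCATE", "ALTER"]}
            },
        },
        # Hypothetical rule, added so the ordering effect is visible.
        {"name": "allow-sql", "tools": ["execute_sql"], "action": "allow"},
    ],
}
```

Because the deny rule comes first, a destructive query never reaches `allow-sql`. Swapping the two rules would allow every `execute_sql` call, which is why rule order matters under first-match semantics.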

Policy features:

Glob patterns — *, delete_*, *_execute match tool names
Argument matching — args_match / args_not_match inspect arguments (case-insensitive substring)
Shell safety — shell_safe blocks metacharacters; command_allowlist restricts to known-good binaries
Path resolution — path_match / path_not_match with symlink resolution, ~ and $HOME expansion
Content scanning — content_scan detects secrets, PII, and prompt injection in arguments
Rate limiting — sliding window (10s, 5m, 1h)
Approval gates: action: require_approval requires human approval before a matching call runs
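Two of the checks in this list are easy to picture in code: shell safety and sliding-window rate limiting. The sketch below is illustrative, not AvaKill's implementation. The metacharacter set, the `is_shell_safe` helper, and the `SlidingWindowLimiter` class are assumptions chosen for the example.

```python
import shlex
import time
from collections import deque

# Assumed metacharacter set; the set AvaKill actually checks may differ.
SHELL_METACHARACTERS = set(";|&`$<>\n")


def is_shell_safe(command: str, allowlist: set[str]) -> bool:
    """Reject metacharacters, then require the first word to be allowlisted."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        words = shlex.split(command)
    except ValueError:  # unbalanced quotes
        return False
    return bool(words) and words[0] in allowlist


class SlidingWindowLimiter:
    """Allow at most max_calls within any trailing window of `window` seconds."""

    def __init__(self, max_calls: int, window: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True
```

The injectable `clock` parameter exists only to make the limiter testable without sleeping. `SlidingWindowLimiter(10, 60)` corresponds to `max_calls: 10` with `window: "60s"`.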
