DefenseClaw
Security Governance for Agentic AI
╔═══════════════════════════════════════════════════════════════╗
║ DefenseClaw — Security Governance for Agentic AI ║
╚═══════════════════════════════════════════════════════════════╝
DefenseClaw
AI agents are powerful. Unchecked, they're dangerous.
Large language model agents — like those built on OpenClaw — can install skills, call MCP servers, execute code, and reach the network. Every one of those actions is an attack surface. A single malicious skill can exfiltrate data. A compromised MCP server can inject hidden instructions. Generated code can contain hardcoded secrets or command injection.
DefenseClaw is the enterprise governance layer for OpenClaw. It sits between your AI agents and the infrastructure they run on, enforcing a simple principle: nothing runs until it's scanned, and anything dangerous is blocked automatically.
┌─────────────────────────────────────────────────────────┐
│                       DefenseClaw                       │
│                                                         │
│  ┌───────────┐   ┌───────────────────────────────────┐  │
│  │           │   │        DefenseClaw Gateway        │  │
│  │    CLI    │   │                                   │  │
│  │ (Python)  │   │  ┌─────────────────────────────┐  │  │
│  │           │   │  │         AI Gateway          │  │  │
│  │           │   │  └─────────────────────────────┘  │  │
│  │           │   │  ┌─────────────────────────────┐  │  │
│  │           │   │  │       Inspect Engine        │  │  │
│  │           │   │  └─────────────────────────────┘  │  │
│  │           │   │                                   │  │
│  └───────────┘   └─────────────────┬─────────────────┘  │
│                                    │                    │
│                             WS (v3) + REST              │
│                                    │                    │
│  ┌─────────────────────────────────┼─────────────────┐  │
│  │  NVIDIA OpenShell               │                 │  │
│  │                                 │                 │  │
│  │  ┌──────────────────────────────┴──────────────┐  │  │
│  │  │                  OpenClaw                   │  │  │
│  │  │                                             │  │  │
│  │  │  ┌───────────────────────────────────────┐  │  │  │
│  │  │  │        DefenseClaw Plugin (TS)        │  │  │  │
│  │  │  └───────────────────────────────────────┘  │  │  │
│  │  │                                             │  │  │
│  │  └─────────────────────────────────────────────┘  │  │
│  │                                                   │  │
│  └───────────────────────────────────────────────────┘  │
│                                                         │
└─────────────────────────────────────────────────────────┘
Capabilities
Skill, MCP, and Plugin Scanning
DefenseClaw scans every skill, MCP server, and plugin before it is allowed to run. The CLI wraps Cisco AI Defense scanners (skill-scanner, mcp-scanner) and an AI bill-of-materials generator (aibom) to produce a unified ScanResult with severity-ranked findings. Scan results feed into the admission gate — HIGH/CRITICAL findings auto-block the component, MEDIUM/LOW findings install with a warning, and clean components pass through. All outcomes are logged to the SQLite audit store and forwarded to SIEM.
defenseclaw skill scan web-search # scan a skill by name
defenseclaw mcp scan github-mcp # scan an MCP server
defenseclaw plugin scan code-review # scan a plugin
defenseclaw skill scan all # scan every installed skill
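The admission rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not DefenseClaw's actual code: the `Severity`, `Finding`, and `admit` names are hypothetical, and the real ScanResult schema may differ.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels, ordered so they can be compared."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Finding:
    """Hypothetical stand-in for one entry in a scan result."""
    rule: str
    severity: Severity


def admit(findings: list[Finding]) -> str:
    """Map scan findings to an admission decision.

    Returns "block" for any HIGH/CRITICAL finding, "warn" when only
    MEDIUM/LOW findings exist, and "allow" for a clean component.
    """
    if not findings:
        return "allow"
    worst = max(f.severity for f in findings)
    if worst >= Severity.HIGH:
        return "block"
    return "warn"
```

The decision depends only on the worst finding, so a single CRITICAL result blocks a component no matter how many lower-severity findings accompany it.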
CodeGuard
CodeGuard is a built-in static analysis engine that scans source files line-by-line with regex rules. It targets code written by agents or included in skills and catches:
- Hardcoded credentials: AWS keys, API tokens, embedded private keys
- Dangerous execution: os.system, eval, subprocess with shell=True, child_process.exec
- Outbound networking: HTTP calls to variable/untrusted URLs
- Unsafe deserialization: pickle.load, yaml.load without safe loader
- SQL injection: string-formatted queries
- Weak cryptography: MD5, SHA1 usage
- Path traversal: ../ sequences, path.join with ..
CodeGuard runs automatically during skill/plugin scans and is also available as a standalone scan via the sidecar API (POST /api/v1/scan/code) or the plugin's /scan code slash command.
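To make the line-by-line regex approach concrete, here is a minimal Python sketch of such a scanner. The rule IDs and patterns are illustrative assumptions covering a few of the categories listed above; they are not CodeGuard's actual rule set.

```python
import re

# Hypothetical rules: (rule id, compiled pattern). Real CodeGuard rules
# are not shown in this README and are likely more thorough.
RULES = [
    ("hardcoded-aws-key", re.compile(r"AKIA[0-9A-Z]{16}")),
    ("dangerous-exec", re.compile(r"\bos\.system\(|\beval\(|shell\s*=\s*True")),
    ("unsafe-deserialization", re.compile(r"\bpickle\.loads?\(")),
    ("weak-crypto", re.compile(r"\bhashlib\.(md5|sha1)\(")),
]


def scan_source(source: str) -> list[tuple[int, str]]:
    """Check every line against every rule.

    Returns (line number, rule id) pairs, with line numbers starting at 1.
    """
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for rule_id, pattern in RULES:
            if pattern.search(line):
                findings.append((line_no, rule_id))
    return findings
```

Scanning one line at a time keeps each finding tied to a precise location, but it cannot follow values across lines, so this style of rule is best seen as a fast first pass.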
Runtime Inspection
Message Inspection
The guardrail proxy inspects every LLM prompt and completion for secrets, PII, and injection patterns. It operates independently of the plugin — it protects the LLM channel even if the plugin is not installed. In observe mode findings are logged; in action mode dangerous content is blocked before it reaches the LLM or the user.
Tool Inspection
Every tool call passes through the inspect engine before execution. The OpenClaw plugin's before_tool_call hook sends the tool name and arguments to the gateway, which evaluates them against six rule categories:
| Category | What it catches |
|----------|----------------|
| secret | API keys, tokens, passwords in tool arguments |
| command | Dangerous shell commands (curl, wget, nc, rm -rf, etc.) |
| sensitive-path | Access to /etc/passwd, SSH keys, credential files |
| c2 | Command-and-control hostnames, metadata SSRF (169.254.169.254) |
| cognitive-file | Tampering with agent memory, instruction, or config files |
| trust-exploit | Prompt injection patterns disguised as tool arguments |
For write and edit tools, the engine additionally runs CodeGuard on the content being written. Verdicts are allow, alert, or block — in observe mode findings are logged but never block; in action mode HIGH/CRITICAL findings cancel the tool call.
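The interaction between rule categories, severities, and the two modes can be illustrated with a short Python sketch. It assumes hypothetical patterns and severities for two of the six categories, and it assumes that any finding that does not block yields an alert verdict; the gateway's real rules and verdict logic may differ.

```python
import json
import re

# Hypothetical patterns and severities for two categories from the table.
RULES = {
    "sensitive-path": (re.compile(r"/etc/passwd|\.ssh/"), "HIGH"),
    "c2": (re.compile(r"169\.254\.169\.254"), "CRITICAL"),
}


def inspect_tool_call(tool: str, args: dict, mode: str = "observe") -> dict:
    """Evaluate tool arguments and return a verdict with its findings.

    mode is "observe" (log only, never block) or "action" (HIGH/CRITICAL
    findings block the call).
    """
    payload = json.dumps(args)  # flatten arguments into one searchable string
    findings = [
        {"category": category, "severity": severity}
        for category, (pattern, severity) in RULES.items()
        if pattern.search(payload)
    ]
    if not findings:
        verdict = "allow"
    elif mode == "action" and any(
        f["severity"] in ("HIGH", "CRITICAL") for f in findings
    ):
        verdict = "block"
    else:
        verdict = "alert"
    return {"tool": tool, "verdict": verdict, "findings": findings}
```

With these assumed rules, a call such as `inspect_tool_call("read_file", {"path": "/etc/passwd"}, mode="action")` is blocked, while the same call in observe mode only raises an alert.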
Architecture
DefenseClaw is a multi-component system with three runtimes that work together:
| Component | Language | Role |
|-----------|----------|------|
| CLI | Python 3.11+ | Operator-facing tool — runs scanners, manages block/allow lists, TUI dashboard |
| Gateway | Go 1.25+ | Central daemon — REST API, WebSocket bridge to OpenClaw, policy engine, inspection pipeline, SQLite audit store, SIEM export |
| Plugin | TypeScript | Runs inside OpenClaw — intercepts tool calls via before_tool_call hook, provides /scan, /block, /allow slash commands |
The CLI and Plugin communicate with the Gateway over a local REST API. The Gateway connects to the OpenClaw Gateway over WebSocket (protocol v3) to subscribe to events and send enforcement commands. A built-in guardrail proxy inspects all LLM traffic in real time.
For the full system diagram, data flows, and component responsibilities, see docs/ARCHITECTURE.md.
Installation
Prerequisites
| Requirement | Version | Check |
|-------------|---------|-------|
| Python | 3.10+ | python3 --version |
| Go | 1.25+ | go version |
| Node.js | 20+ (plugin only) | node --version |
| Git | any | git --version |
Install OpenClaw
If you don't already have OpenClaw running:
curl -fsSL https://openclaw.ai/install.sh | bash
openclaw onboard --install-daemon
Verify the gateway is up with openclaw gateway status. See the OpenClaw Getting Started guide for full details.
Install DefenseClaw
curl -LsSf https://raw.githubusercontent.com/cisco-ai-defense/defenseclaw/main/scripts/install.sh | bash
defenseclaw init --enable-guardrail
For platform-specific instructions (DGX Spark, macOS, cross-compilation), see docs/INSTALL.md.
Quick Start
List installed components
defenseclaw skill list
defenseclaw mcp list
defenseclaw plugin list
Scan by name
# Scan a skill
defenseclaw skill scan web-search
# Scan an MCP server
defenseclaw mcp scan github-mcp
# Scan a plugin
defenseclaw plugin scan code-review
Check security alerts
defenseclaw alerts
defenseclaw alerts -n 50
For the complete walkthrough including blocking tools, enabling guardrail action mode, and testing blocked prompts, see docs/QUICKSTART.md.
Set Up Guardrails
Block / Allow tools
# Block a dangerous tool
defenseclaw tool block delete_file --reason "destructive operation"
# Allow a trusted tool
defenseclaw tool allow web_search
# View blocked and allowed tools
defenseclaw tool list
Enable guardrail action mode
By default the guardrail runs in
