AgentShield

Security auditor for AI agent configurations

Scans Claude Code setups for hardcoded secrets, permission misconfigs, hook injection, MCP server risks, and agent prompt injection vectors. Available as CLI, GitHub Action, and GitHub App integration.

Quick Start · What It Catches · API Reference · Opus Pipeline · GitHub Action · Distribution · MiniClaw · Changelog

</div>

Why

The AI agent ecosystem is growing faster than its security tooling. In January 2026 alone:

- 12% of a major agent skill marketplace was malicious (341 of 2,857 community skills)
- A CVSS 8.8 CVE exposed 17,500+ internet-facing instances to one-click RCE
- The Moltbook breach compromised 1.5M API tokens across 770,000 agents

Developers install community skills, connect MCP servers, and configure hooks without any automated way to audit the security of their setup. AgentShield scans your .claude/ directory and flags vulnerabilities before they become exploits.

Built at the Claude Code Hackathon (Cerebral Valley x Anthropic, Feb 2026). Part of the Everything Claude Code ecosystem (42K+ stars).

Quick Start

# Scan your Claude Code config (no install required)
npx ecc-agentshield scan

# Or install globally
npm install -g ecc-agentshield
agentshield scan

That's it. AgentShield auto-discovers your ~/.claude/ directory, scans all config files, and prints a graded security report.

Discovery intentionally skips common generated directories such as node_modules, build output, and .dmux worktree mirrors so transient copies do not duplicate findings.

  AgentShield Security Report

  Grade: F (0/100)

  Score Breakdown
  Secrets        ░░░░░░░░░░░░░░░░░░░░ 0
  Permissions    ░░░░░░░░░░░░░░░░░░░░ 0
  Hooks          ░░░░░░░░░░░░░░░░░░░░ 0
  MCP Servers    ░░░░░░░░░░░░░░░░░░░░ 0
  Agents         ░░░░░░░░░░░░░░░░░░░░ 0

  ● CRITICAL  Hardcoded Anthropic API key
    CLAUDE.md:13
    Evidence: sk-ant-a...cdef
    Fix: Replace with environment variable reference [auto-fixable]

  ● CRITICAL  Overly permissive allow rule: Bash(*)
    settings.json
    Evidence: Bash(*)
    Fix: Restrict to specific commands: Bash(git *), Bash(npm *), Bash(node *)

  Summary
  Files scanned: 6
  Findings: 73 total — 19 critical, 29 high, 15 medium, 4 low, 6 info
  Auto-fixable: 8 (use --fix)

More commands

# Scan a specific directory
agentshield scan --path /path/to/.claude

# Auto-fix safe issues (replaces hardcoded secrets with env var references)
agentshield scan --fix

# JSON output for CI pipelines
agentshield scan --format json

# Generate an HTML security report
agentshield scan --format html > report.html

# Three-agent Opus 4.6 adversarial analysis (requires ANTHROPIC_API_KEY)
agentshield scan --opus --stream

# Generate a secure baseline config
agentshield init

JSON reports now expose findings[].runtimeConfidence when AgentShield can distinguish active runtime config from project-local settings, template/example inventories, declarative plugin manifests, and manifest-resolved non-shell hook implementations.
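As a rough sketch of how a CI step might consume this field, the snippet below filters a parsed report down to findings that reflect active runtime config. Only `findings[].runtimeConfidence` is documented here; the `title` and `severity` keys and the sample report are assumptions for illustration, not AgentShield's actual schema.

```python
import json

# Hypothetical report shape: only findings[].runtimeConfidence is documented;
# the "title" and "severity" keys are assumptions made for this example.
SAMPLE_REPORT = json.loads("""
{
  "findings": [
    {"title": "Hardcoded Anthropic API key", "severity": "critical",
     "runtimeConfidence": "active-runtime"},
    {"title": "npx -y auto-install", "severity": "high",
     "runtimeConfidence": "template-example"},
    {"title": "Overly permissive allow rule", "severity": "critical"}
  ]
}
""")


def active_findings(report):
    """Return findings tagged active-runtime, plus any that carry no tag."""
    return [
        finding
        for finding in report.get("findings", [])
        # An untagged finding is treated as active so nothing is silently dropped.
        if finding.get("runtimeConfidence", "active-runtime") == "active-runtime"
    ]
```

Treating untagged findings as active is a conservative design choice: the field appears only when AgentShield can tell the sources apart, so its absence should not downgrade a finding.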

What It Catches

102 rules across 5 categories, graded A–F with a 0–100 numeric score.

Secrets Detection (10 rules, 14 patterns)

| What | Examples |
|------|----------|
| API keys | Anthropic (sk-ant-), OpenAI (sk-proj-), AWS (AKIA), Google (AIza), Stripe (sk_test_/sk_live_) |
| Tokens | GitHub PATs (ghp_/github_pat_), Slack (xox[bprs]-), JWTs (eyJ...), Bearer tokens |
| Credentials | Hardcoded passwords, database connection strings (postgres/mongo/mysql/redis), private key material |
| Env leaks | Secrets passed through environment variables in configs, echo $SECRET in hooks |
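To make the prefix-based detection concrete, here is a minimal sketch of a pattern scan over a config file's text. The prefixes come from the table above, but the character classes, minimum lengths, and function names are assumptions and should not be read as AgentShield's actual rules.

```python
import re

# Illustrative patterns only: prefixes are taken from the table above, while
# the character classes and minimum lengths are assumptions.
SECRET_PATTERNS = {
    "Anthropic API key": re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}"),
    "AWS access key ID": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub PAT": re.compile(r"(?:ghp_|github_pat_)[A-Za-z0-9_]{20,}"),
    "Slack token": re.compile(r"xox[bprs]-[A-Za-z0-9-]{10,}"),
}


def mask(secret):
    """Keep the first 8 and last 4 characters so the full secret is never printed."""
    return f"{secret[:8]}...{secret[-4:]}"


def find_secrets(text):
    """Return (label, line_number, masked_evidence) for every pattern match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((label, lineno, mask(match.group(0))))
    return hits
```

The masking mirrors the evidence format in the sample report (`sk-ant-a...cdef`), which shows enough to locate the secret without echoing it in full.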

Permission Audit (10 rules)

| What | Examples |
|------|----------|
| Wildcard access | Bash(*), Write(*), Edit(*): unrestricted tool permissions |
| Missing deny lists | No deny rules for rm -rf, sudo, chmod 777 |
| Dangerous flags | --dangerously-skip-permissions usage |
| Mutable tool exposure | All mutable tools (Write, Edit, Bash) allowed without scoping |
| Destructive git | git push --force, git reset --hard in allowed commands |
| Unrestricted network | curl *, wget, ssh *, scp * in allow list without scope |
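The first two rows of the table can be sketched as a small check over a parsed settings file. This assumes the settings use a `permissions` object with `allow` and `deny` lists; the function name and finding strings are hypothetical.

```python
# Assumed settings shape: {"permissions": {"allow": [...], "deny": [...]}}.
WILDCARD_RULES = {"Bash(*)", "Write(*)", "Edit(*)"}
RISKY_COMMANDS = ("rm -rf", "sudo", "chmod 777")


def audit_permissions(settings):
    """Return human-readable findings for a parsed settings dict."""
    permissions = settings.get("permissions", {})
    allow = permissions.get("allow", [])
    deny = permissions.get("deny", [])
    findings = []

    # Wildcard access: an allow rule that grants a whole tool unconditionally.
    for rule in allow:
        if rule in WILDCARD_RULES:
            findings.append(f"Overly permissive allow rule: {rule}")

    # Missing deny lists: no deny rule mentions the risky command at all.
    for command in RISKY_COMMANDS:
        if not any(command in rule for rule in deny):
            findings.append(f"No deny rule covering: {command}")

    return findings
```

A substring match on deny rules is deliberately loose; a real auditor would need to parse the rule syntax to avoid crediting an unrelated rule.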

Hook Analysis (34 rules)

| What | Examples |
|------|----------|
| Command injection | ${file} interpolation in shell commands: attacker-controlled filenames become code |
| Data exfiltration | curl -X POST with variable interpolation sending data to external URLs |
| Silent errors | 2>/dev/null, \|\| true: failing security hooks that silently pass |
| Missing hooks | No PreToolUse hooks, no Stop hooks for session-end validation |
| Network exposure | Unthrottled network requests in hooks, sensitive file access without filtering |
| Session startup | SessionStart hooks that download and execute remote scripts |
| Package installs | Global npm install -g, pip install, gem install, cargo install in hooks |
| Container escape | Docker --privileged, --pid=host, --network=host, root volume mounts |
| Credential access | macOS Keychain, GNOME Keyring, /etc/shadow reads |
| Reverse shells | /dev/tcp, mkfifo + nc, Python/Perl socket shells |
| Clipboard access | pbcopy, xclip, xsel, wl-copy: exfiltration via clipboard |
| Log tampering | journalctl --vacuum, rm /var/log, history -c: anti-forensics |
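A few of these rows lend themselves to simple pattern checks on the hook's command string. The sketch below approximates three of them; the regexes and the `analyze_hook` helper are assumptions for illustration, not AgentShield's rule definitions.

```python
import re

# Hypothetical checks approximating three rows of the table above.
HOOK_CHECKS = [
    # ${name} interpolation: a filename or similar value spliced into a shell command.
    ("Command injection", re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}")),
    # Error suppression that lets a failing hook pass silently.
    ("Silent errors", re.compile(r"2>\s*/dev/null|\|\|\s*true\b")),
    # Common reverse-shell building blocks.
    ("Reverse shell", re.compile(r"/dev/tcp/|\bmkfifo\b")),
]


def analyze_hook(command):
    """Return the label of every check that the hook command trips."""
    return [label for label, pattern in HOOK_CHECKS if pattern.search(command)]
```

For example, `analyze_hook("prettier --write ${file} 2>/dev/null || true")` trips both the command injection and silent error checks.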

MCP Server Security (23 rules)

| What | Examples |
|------|----------|
| High-risk servers | Shell/command MCPs, filesystem with root access, database MCPs, browser automation |
| Supply chain | npx -y auto-install without confirmation: typosquatting vector |
| Hardcoded secrets | API tokens in MCP environment config instead of env var references |
| Remote transport | MCP servers connecting to remote URLs (SSE/streamable HTTP) |
| Shell metacharacters | &&, \|, ; in MCP server command arguments |
| Missing metadata | No version pin, no description, excessive server count |
| Sensitive file args | .env, .pem, credentials.json passed as server arguments |
| Network exposure | Binding to 0.0.0.0 instead of localhost |
| Auto-approve | autoApprove settings that skip user confirmation for tool calls |
| Missing timeouts | High-risk servers without timeout: resource exhaustion risk |
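As an illustration of the supply chain, shell metacharacter, and network exposure rows, the sketch below walks a parsed MCP config. It assumes the common `mcpServers` layout with `command` and `args` per server; the helper name and finding text are hypothetical.

```python
# Assumed config shape: {"mcpServers": {name: {"command": str, "args": [str]}}}.
SHELL_METACHARACTERS = ("&&", "|", ";")


def audit_mcp_servers(config):
    """Return (server_name, finding) pairs for a parsed MCP config dict."""
    findings = []
    for name, server in config.get("mcpServers", {}).items():
        command = server.get("command", "")
        args = server.get("args", [])

        # Supply chain: npx -y installs whatever package name resolves, unprompted.
        if command == "npx" and "-y" in args:
            findings.append((name, "npx -y auto-install without confirmation"))
        # Shell metacharacters inside arguments suggest command chaining.
        if any(meta in arg for arg in args for meta in SHELL_METACHARACTERS):
            findings.append((name, "shell metacharacters in arguments"))
        # Network exposure: listening on every interface.
        if any("0.0.0.0" in arg for arg in args):
            findings.append((name, "binds to 0.0.0.0 instead of localhost"))
    return findings
```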

MCP Confidence Notes

AgentShield scans both active MCP config and repository-shipped MCP templates.

- Findings from mcp.json, .claude/mcp.json, .claude.json, and active settings.json should be treated as the highest-confidence runtime exposure.
- Findings from settings.local.json are emitted as runtimeConfidence: project-local-optional.
- Findings from locations such as mcp-configs/, config/mcp/, or configs/mcp/ indicate risky MCP definitions present in repository templates, not guaranteed active runtime enablement.
- JSON, markdown, terminal, and HTML outputs now expose source context via runtimeConfidence: active-runtime | project-local-optional | template-example | docs-example | plugin-manifest | hook-code.
- Non-secret template-example MCP findings are score-weighted at 0.25x, and one template file is capped at 10 deduction points per score category so a single MCP catalog cannot score like dozens of enabled servers.
- In template files, findings such as risky server type, remote URL transport, npx -y, unpinned packages, and environment inheritance are still valuable, but they should be interpreted as "this repo ships a risky MCP template" rather than "this MCP is definitely enabled right now."
- Aggregate findings like large MCP server counts are especially likely to overstate runtime exposure when the source file is a template catalog.
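The weighting and cap described in these notes can be sketched as follows. The 0.25x weight and the 10-point cap come from the notes; the base `points` values, the finding keys, and the choice to apply the cap only to the weighted non-secret findings are assumptions, since the notes do not spell out those details.

```python
from collections import defaultdict

TEMPLATE_WEIGHT = 0.25  # weight for non-secret template-example findings
TEMPLATE_FILE_CAP = 10  # max deduction per template file, per score category


def category_deduction(findings):
    """Sum the deductions for one score category.

    Each finding is a dict with hypothetical keys: "file", "points" (base
    deduction), "runtimeConfidence", and "isSecret".
    """
    per_template_file = defaultdict(float)
    total = 0.0
    for finding in findings:
        is_template = finding["runtimeConfidence"] == "template-example"
        if is_template and not finding["isSecret"]:
            # Down-weight, and accumulate per file so the cap can be applied.
            per_template_file[finding["file"]] += finding["points"] * TEMPLATE_WEIGHT
        else:
            # Active config and secrets count in full (assumption for secrets).
            total += finding["points"]
    # Apply the per-file cap after weighting.
    total += sum(min(points, TEMPLATE_FILE_CAP) for points in per_template_file.values())
    return total
```

With these assumed numbers, a catalog of 20 template findings at 8 points each would contribute 10 points rather than 160, which is the behavior the cap is meant to produce.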

Agent Config Review (25 rules)

| What | Examples |
|------|----------|
| Unrestricted tools | Agents with Bash access, no allowedTools restriction |
| Prompt injection surface | Agents processing external/user-provided content without defenses |
| Auto-run instructions | CLAUDE.md containing "Always run", "without asking", "automatically install" |
| Hidden instructions | Unicode zero-width characters, HTML comments, base64-encoded directives |
| URL execution | CLAUDE.md instructing agents to fetch and execute remote URLs |
| Time bombs | Delayed execution instructions triggered by time or absence conditions |
| Data harvesting | Bulk collection of passwords, credentials, or database dumps |
| Prompt reflection | ignore previous instructions, you are now, DAN jailbreak, fake system prompts |
| Output manipulation | always report ok, remove warnings from output, suppress security findings |
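The hidden-instruction and phrase-based rows in the table above could be approximated with checks like the ones below. This is a minimal sketch: the code point list, the phrase list, and the `review_agent_text` helper are illustrative assumptions and cover far less than the 25 rules.

```python
import re

# Zero-width and invisible code points that can hide text from a human reviewer.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")
# HTML comments are not rendered in Markdown previews but are still read by the agent.
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
# Illustrative phrases drawn from the table rows above.
INJECTION_PHRASES = ("ignore previous instructions", "without asking", "always report ok")


def review_agent_text(text):
    """Return a list of findings for CLAUDE.md-style instruction text."""
    findings = []
    if ZERO_WIDTH.search(text):
        findings.append("zero-width characters present")
    for comment in HTML_COMMENT.findall(text):
        findings.append("HTML comment: " + comment.strip())
    lowered = text.lower()
    for phrase in INJECTION_PHRASES:
        if phrase in lowered:
            findings.append("suspicious phrase: " + phrase)
    return findings
```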

Structured JSON under .claude/subagents/ and .claude/slash-commands/ is analyzed like agent config when it declares allowedTools or similar tool metadata. Freeform skill-md prompt text still has narrower security coverage than agent-md and CLAUDE.md.

Scanner Accuracy Notes

Live audit notes and follow-up items are tracked in false-positive-audit.md.
The most useful operator guidance is in the audit's [`Triage Rules For Curr
