RedteamAgent
An AI red-team agent for authorized labs and web app pentesting workflows. Turns Claude Code / OpenCode / Codex into a structured recon → test → exploit → report workflow, with containerized tools and resumable state.
Install / Use
/learn @NeoTheCapt/RedteamAgent
Category: Development & Engineering
README
An autonomous red team simulation agent that works with Claude Code, OpenCode, and Codex. It transforms any workspace into a full penetration testing environment for CTF/lab targets — featuring 8 AI agents, containerized Kali tools, a streaming case collection pipeline, and 78 security reference files.
Key Features:
- Multi-CLI support — works with Claude Code, OpenCode, and Codex out of the box
- Autonomous workflow — 5-phase methodology (Recon → Collect → Test → Exploit+OSINT → Report) runs with minimal user interaction
- Orchestrator GUI — local web UI for projects, live runs, artifacts, timelines, and terminal run metadata
- Intelligence collection — intel.md accumulates tech stack, people, domains, and credentials from recon through exploitation; the OSINT agent enriches it with CVE, breach, DNS history, and social data
- 8 specialized agents — operator, recon-specialist, source-analyzer, vulnerability-analyst, exploit-developer, fuzzer, osint-analyst, report-writer
- Containerized tools — all pentest tools run in Docker (Kali toolbox, mitmproxy, Katana, optional Metasploit RPC for OpenCode), zero local installation
- Case collection pipeline — SQLite-backed queue with 4 producers, automatic type classification, zero-token dispatcher
- 78 reference files — OWASP Top 10:2025, API Security 2023, offensive tactics, AD/Kerberos attacks
- Resume support — interrupt and continue any engagement without losing progress
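The case collection pipeline in the feature list is described as a SQLite-backed queue with automatic type classification. The following is a minimal sketch of that idea, not the project's actual implementation: the `cases` table, its columns, the `classify` rules, and the producer names are all hypothetical.

```python
import sqlite3

# Hypothetical schema: the real cases.db layout is not documented in this README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cases (
    id      INTEGER PRIMARY KEY,
    method  TEXT NOT NULL,
    url     TEXT NOT NULL,
    type    TEXT NOT NULL,
    source  TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'pending',
    UNIQUE (method, url)
)
"""

def classify(url: str) -> str:
    """Assign a coarse case type from the URL alone (illustrative rules)."""
    path = url.split("?", 1)[0].lower()
    if path.endswith(".js"):
        return "javascript"
    if "/api/" in path:
        return "api"
    if "?" in url:
        return "parameterized"
    return "page"

def enqueue(db, method, url, source):
    """Insert a case; a duplicate (method, url) from another producer is ignored."""
    cur = db.execute(
        "INSERT OR IGNORE INTO cases (method, url, type, source) "
        "VALUES (?, ?, ?, ?)",
        (method, url, classify(url), source),
    )
    return cur.rowcount == 1

def claim_next(db, case_type):
    """Mark the oldest pending case of a type as processing and return it."""
    row = db.execute(
        "SELECT id, method, url FROM cases "
        "WHERE type = ? AND status = 'pending' ORDER BY id LIMIT 1",
        (case_type,),
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE cases SET status = 'processing' WHERE id = ?", (row[0],))
    return row

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
enqueue(db, "GET", "http://your-ctf-target:8080/api/users", "crawler")
enqueue(db, "GET", "http://your-ctf-target:8080/api/users", "proxy")  # duplicate
enqueue(db, "GET", "http://your-ctf-target:8080/static/app.js", "crawler")
```

Classifying at insert time means a consumer can pull work by type with a plain SQL query, which is one plausible way a dispatcher could route cases without spending model tokens.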
Installation
Prerequisites
- Docker (with Docker Compose)
- At least one AI CLI tool if you are not using the Docker all-in-one runtime:
  - Claude Code
  - OpenCode (npm install -g opencode-ai)
  - Codex
- Local tools: curl, jq, sqlite3 (not required for the Docker all-in-one runtime)
- Native Windows/PowerShell is not supported
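The prerequisites above can be checked before installing. This is a small sketch that is not part of the project: it assumes the executables are named `docker`, `curl`, `jq`, `sqlite3`, `claude`, `opencode`, and `codex`, and it does not verify Docker Compose.

```python
import shutil

# Executable names are assumptions based on the prerequisites list and the
# Start commands later in this README; this checker is not shipped by the project.
REQUIRED = ["docker"]
LOCAL_TOOLS = ["curl", "jq", "sqlite3"]
AI_CLIS = ["claude", "opencode", "codex"]

def missing_tools(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]

def check_prerequisites(all_in_one, which=shutil.which):
    """Return a list of human-readable problems; empty means ready."""
    problems = [f"missing required tool: {n}" for n in missing_tools(REQUIRED, which)]
    if not all_in_one:
        # Local tools and an AI CLI are only needed outside the Docker runtime.
        problems += [f"missing local tool: {n}" for n in missing_tools(LOCAL_TOOLS, which)]
        if len(missing_tools(AI_CLIS, which)) == len(AI_CLIS):
            problems.append("no AI CLI found (need one of: claude, opencode, codex)")
    return problems

for problem in check_prerequisites(all_in_one=False):
    print(problem)
```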
Installation Help
./install.sh -h
Usage by CLI
Docker (Recommended)
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) docker
# or:
./install.sh docker ~/redteam-docker
./install.sh --force docker ~/redteam-docker
Start
cd ~/redteam-docker
./run.sh
Run
/engage http://your-ctf-target:8080
/autoengage http://your-ctf-target:8080
Notes
- This is the cleanest runtime path: the image bundles OpenCode, Redteam Agent, and the pentest toolchain.
- run.sh starts from the image-baked clean template, persists engagement files in workspace/, and persists the full OpenCode state directory in opencode-home/.
- Use ./run.sh --ephemeral-opencode if you do not want to persist OpenCode state outside the container.
- Use ./run.sh --rebuild to force a clean image rebuild after install.
OpenCode (Recommended)
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) opencode
# or:
./install.sh opencode
./install.sh opencode ~/my-project
./install.sh --dry-run opencode
Start
cd ~/redteam-agent
opencode
Run
/engage http://your-ctf-target:8080
/autoengage http://your-ctf-target:8080
Notes
- Configure your LLM provider in .opencode/opencode.json.
- OpenCode can optionally use the local Metasploit MCP path during Exploit when a finding clearly maps to a known module family, service, product/version, or CVE.
Claude Code
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) claude
# or:
./install.sh claude
./install.sh claude ~/my-project
Start
cd ~/redteam-agent
claude
Run
/engage http://your-ctf-target:8080
/autoengage http://your-ctf-target:8080
Codex
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) codex
# or:
./install.sh codex
./install.sh codex ~/my-project
Start
cd ~/redteam-agent
codex
Run
engage http://your-ctf-target:8080
autoengage http://your-ctf-target:8080
Notes
- Codex does not support slash commands the same way OpenCode and Claude Code do; use natural-language command invocation when needed.
Local Orchestrator GUI (Optional)
Use the local web UI when you want to manage multiple workspaces or inspect live runs outside the CLI.
Start
./orchestrator/run.sh
# or rebuild the all-in-one image first:
./orchestrator/run.sh --rebuild
Stop
./orchestrator/stop.sh
Notes
- Default URL: http://127.0.0.1:18000
- ./orchestrator/run.sh bootstraps the backend virtualenv, installs frontend dependencies if needed, and builds the frontend before starting.
- The UI exposes projects, live run status, task/phase timelines, artifacts, and terminal run metadata from the runs API.
- The backend also auto-recovers incomplete runs after supervisor loss or a backend restart, which makes the UI suitable for long-running unattended sessions.
Shared Outputs
Every runtime writes engagement artifacts to:
engagements/<timestamp-target>/
Common outputs:
- findings.md — vulnerability findings and supporting evidence
- report.md — final engagement report
- log.md — execution log and operator timeline
- intel.md — summary intelligence safe for routine review
- intel-secrets.json — full captured secrets and tokens
- auth.json — active auth material and session state
- cases.db — SQLite queue, classification, and work state
- surfaces.jsonl — high-risk surface coverage tracking
Sensitive outputs:
- Do not casually share intel-secrets.json, auth.json, or any engagement directory that still contains live credentials, tokens, or session state.
- If you need to share results, prefer report.md, selected excerpts from findings.md, and a reviewed/redacted subset of supporting files.
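The sharing guidance above can be turned into a quick triage of an engagement directory. The sketch below is an illustration only: the three buckets and the `plan_share` helper are hypothetical, and the file names are taken from the lists above.

```python
from pathlib import Path
import tempfile

# File names come from the output lists above; the bucket policy is an assumption.
SHARE_FIRST = {"report.md"}
WITHHOLD = {"intel-secrets.json", "auth.json"}

def plan_share(engagement_dir):
    """Sort the files in an engagement directory into share/review/withhold."""
    plan = {"share": [], "review": [], "withhold": []}
    for path in sorted(Path(engagement_dir).iterdir()):
        if path.name in WITHHOLD:
            plan["withhold"].append(path.name)
        elif path.name in SHARE_FIRST:
            plan["share"].append(path.name)
        else:
            # findings.md and other supporting files need excerpting or redaction.
            plan["review"].append(path.name)
    return plan

# Demo against a throwaway directory with placeholder files.
demo = Path(tempfile.mkdtemp())
for name in ("report.md", "findings.md", "auth.json", "intel-secrets.json", "log.md"):
    (demo / name).write_text("placeholder\n")
plan = plan_share(demo)
print(plan)
```

Anything in the review bucket still needs a human pass; the sketch only sorts by file name and does not inspect contents for live credentials.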
Engagement Modes
| | /engage | /autoengage |
|---|---|---|
| Auth setup | Asks you to choose (proxy/cookie/skip) | Auto-skip, auto-register if endpoint found, auto-use discovered creds |
| Phase approval | Auto-confirm by default, first phase needs approval | Never asks. Every phase auto-proceeds. |
| Decisions | Parallel by default, can choose sequential | Always parallel. No options. |
| Errors | May stop on unexpected issues | Logs error, continues next task |
| When to use | First time on a target, want oversight | Repeat runs, overnight scans, maximum coverage |
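The differences in the table can be read as a small policy record per mode. This is a sketch of that reading, not the agent's internal representation: `ModePolicy`, its field names, and `needs_approval` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModePolicy:
    asks_auth_choice: bool      # /engage asks proxy/cookie/skip; /autoengage auto-skips
    first_phase_approval: bool  # only /engage waits before the first phase
    sequential_allowed: bool    # /autoengage is always parallel
    may_stop_on_error: bool     # /autoengage logs the error and continues

# Values transcribed from the comparison table; the structure is an assumption.
MODES = {
    "/engage": ModePolicy(
        asks_auth_choice=True,
        first_phase_approval=True,
        sequential_allowed=True,
        may_stop_on_error=True,
    ),
    "/autoengage": ModePolicy(
        asks_auth_choice=False,
        first_phase_approval=False,
        sequential_allowed=False,
        may_stop_on_error=False,
    ),
}

def needs_approval(mode: str, phase_number: int) -> bool:
    """True when a phase should wait for the user before starting."""
    return MODES[mode].first_phase_approval and phase_number == 1
```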
The agent runs through 5 phases:
Phase 1: RECON ─── recon-specialist + source-analyzer (parallel)
│
Phase 2: COLLECT ─ Import endpoints → SQLite queue, start Katana crawler
│
Phase 3: TEST ──── Consume queue → vulnerability-analyst + source-analyzer
│ exploit-developer runs in parallel for HIGH/MEDIUM findings
│ (continuous loop with progress display)
Phase 4: EXPLOIT ── osint-analyst + exploit-developer (parallel)
│ osint-analyst: CVE/breach/DNS/social intel from intel.md
│ exploit-developer: chain analysis, impact assessment
│ OSINT high-value intel → 2nd round exploitation
Phase 5: REPORT ── report-writer with coverage statistics + intelligence summary
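The five phases run in a fixed order, and an interrupted engagement can be continued. One way to picture how those two properties fit together is a checkpoint written after each completed phase. This is a sketch under stated assumptions: the `state.json` file, its `completed` key, and the `run_phases` helper are hypothetical and say nothing about how the agent actually stores progress.

```python
import json
import tempfile
from enum import IntEnum
from pathlib import Path

class Phase(IntEnum):
    RECON = 1
    COLLECT = 2
    TEST = 3
    EXPLOIT = 4
    REPORT = 5

def run_phases(state_file, handlers):
    """Run the remaining phases in order, checkpointing after each one."""
    path = Path(state_file)
    done = json.loads(path.read_text())["completed"] if path.exists() else 0
    for phase in Phase:
        if phase <= done:
            continue  # already finished in an earlier run
        handlers[phase]()
        path.write_text(json.dumps({"completed": int(phase)}))

# Demo: the first run is interrupted during TEST, the second run resumes there.
ran = []
handlers = {p: (lambda p=p: ran.append(p.name)) for p in Phase}

def interrupted():
    raise RuntimeError("interrupted")

state = Path(tempfile.mkdtemp()) / "state.json"
try:
    run_phases(state, {**handlers, Phase.TEST: interrupted})
except RuntimeError:
    pass
run_phases(state, handlers)
print(ran)
```

Because the checkpoint is only written after a handler returns, an interrupted phase is rerun from its start on resume rather than skipped.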
Common Commands
| Command | Description |
|---------|-------------|
| /engage <url> | Start a new engagement (semi-autonomous) |
| /autoengage <url> | Fully autonomous — zero interaction, max coverage |
| /resume | Continue an interrupted engagement |
| /status | Show progress dashboard with queue stats |
| /proxy start/stop | Manage mitmproxy interception proxy |
| /auth cookie/header | Configure authentication credentials |
| /queue | Show case queue statistics |
| /report | Generate final report |
| /stop | Stop all background containers |
| /confirm auto/manual | Toggle auto/manual approval mode |
| /config [key] [value] | View or set runtime configuration |
| /subdomain <domain> | Enumerate subdomains for a domain |
| /vuln-analyze | Analyze scan results for vulnerabilities |
| /osint | Run OSINT intelligence gathering on current engagement |
| /recon /scan /enumerate /exploit /pivot | Manual phase overrides |
Authentication
1 — Proxy login (recommended): /proxy start → login in browser
2 — Manual cookie: /auth cookie "session=abc123"
3 — Manual header: /auth header "Authorization: Bearer ..."
4 — Skip: test unauthenticated surface, configure auth later
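Options 2 and 3 both end up as HTTP request headers. The sketch below shows that mapping with the standard library; `auth_headers` is a hypothetical helper, the bearer value is a placeholder, and nothing here reflects how the agent stores auth material in auth.json.

```python
import urllib.request

def auth_headers(kind, value):
    """Translate an /auth-style argument into a dict of HTTP headers."""
    if kind == "cookie":
        # e.g. /auth cookie "session=abc123"
        return {"Cookie": value}
    if kind == "header":
        # e.g. /auth header "Authorization: Bearer <token>"
        name, sep, content = value.partition(":")
        if not sep or not name.strip():
            raise ValueError("expected 'Name: value'")
        return {name.strip(): content.strip()}
    raise ValueError(f"unknown auth kind: {kind}")

# Building the request does not send it; no network access happens here.
req = urllib.request.Request(
    "http://your-ctf-target:8080/",
    headers=auth_headers("cookie", "session=abc123"),
)
print(req.get_header("Cookie"))
```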
Architecture
8 Agents
┌─────────────────────────┐
│ OPERATOR │
│ (primary — drives all) │
└──┬──┬──┬──┬──┬──┬──┬────┘
