PentestAgent
PentestAgent is an AI agent framework for black-box security testing, supporting bug bounty, red-team, and penetration testing workflows.
AI Penetration Testing
https://github.com/user-attachments/assets/a67db2b5-672a-43df-b709-149c8eaee975
Requirements
- Python 3.10+
- API key for OpenAI, Anthropic, or another LiteLLM-supported provider
Install
# Clone
git clone https://github.com/GH05TCREW/pentestagent.git
cd pentestagent
# Setup (creates venv, installs deps)
.\scripts\setup.ps1 # Windows
./scripts/setup.sh # Linux/macOS
# Or manual
python -m venv venv
.\venv\Scripts\Activate.ps1 # Windows
source venv/bin/activate # Linux/macOS
pip install -e ".[all]"
playwright install chromium # Required for browser tool
Configure
Create .env in the project root:
ANTHROPIC_API_KEY=sk-ant-...
PENTESTAGENT_MODEL=claude-sonnet-4-20250514
Or for OpenAI:
OPENAI_API_KEY=sk-...
PENTESTAGENT_MODEL=gpt-5
Any LiteLLM-supported model works.
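As a rough pre-flight check, the settings above can be validated before launching. The sketch below is not part of PentestAgent: `parse_env` and `missing_settings` are hypothetical helpers, and the check only knows about the two provider keys shown in this section, so a setup using another LiteLLM provider would need its own key added to `PROVIDER_KEYS`.

```python
# Hypothetical helper, not part of PentestAgent: checks a .env file's contents
# for the variables shown in the Configure section.
PROVIDER_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")  # assumption: only these two


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env: dict[str, str]) -> list[str]:
    """Return a description of each required setting that is absent or empty."""
    problems = []
    if not env.get("PENTESTAGENT_MODEL"):
        problems.append("PENTESTAGENT_MODEL")
    if not any(env.get(key) for key in PROVIDER_KEYS):
        problems.append("one of " + ", ".join(PROVIDER_KEYS))
    return problems
```

Calling `missing_settings(parse_env(open(".env").read()))` from the project root returns an empty list when both a model and a provider key are set.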
Run
pentestagent # Launch TUI
pentestagent -t 192.168.1.1 # Launch with target
pentestagent --docker # Run tools in Docker container
Docker
Run tools inside a Docker container for isolation and access to pre-installed pentesting tools.
Option 1: Pull pre-built image (fastest)
# Base image with nmap, netcat, curl
docker run -it --rm \
-e ANTHROPIC_API_KEY=your-key \
-e PENTESTAGENT_MODEL=claude-sonnet-4-20250514 \
ghcr.io/gh05tcrew/pentestagent:latest
# Kali image with metasploit, sqlmap, hydra, etc.
docker run -it --rm \
-e ANTHROPIC_API_KEY=your-key \
ghcr.io/gh05tcrew/pentestagent:kali
Option 2: Build locally
# Build
docker compose build
# Run
docker compose run --rm pentestagent
# Or with Kali
docker compose --profile kali build
docker compose --profile kali run --rm pentestagent-kali
The container runs PentestAgent with access to Linux pentesting tools. The agent can use nmap, msfconsole, sqlmap, etc. directly via the terminal tool.
Requires Docker to be installed and running.
Modes
PentestAgent has four modes, accessible via commands in the TUI:
| Mode | Command | Description |
|------|---------|-------------|
| Assist | /assist <task> | Single-shot instruction with tool execution. |
| Agent | /agent <task> | Autonomous execution of a single task. |
| Crew | /crew <task> | Multi-agent mode. An orchestrator spawns specialized workers. |
| Interact | /interact <task> | Interactive mode. Chat with the agent, which guides you through the pentesting procedure. |
TUI Commands
/assist <task> Run a single-shot instruction
/agent <task> Run autonomous agent on task
/crew <task> Run multi-agent crew on task
/interact <task> Chat with the agent in guided mode
/target <host> Set target
/tools List available tools
/notes Show saved notes
/report Generate report from session
/memory Show token/memory usage
/prompt Show system prompt
/mcp <list/add> List MCP servers or add a new one
/clear Clear chat and history
/quit Exit (also /exit, /q)
/help Show help (also /h, /?)
Press Esc to stop a running agent, or Ctrl+Q to quit.
Playbooks
PentestAgent includes prebuilt attack playbooks for black-box security testing. Playbooks define a structured approach to specific security assessments.
Run a playbook:
pentestagent run -t example.com --playbook thp3_web

Tools
PentestAgent includes built-in tools and supports MCP (Model Context Protocol) for extensibility.
Built-in tools: terminal, browser, notes, web_search (requires TAVILY_API_KEY), spawn_mcp_agent
Agent Self-Spawning (spawn_mcp_agent)
spawn_mcp_agent is a built-in tool that allows a running agent to spawn a child copy of itself as a subordinate MCP server connected over stdio. The child process is fully isolated — its own runtime, LLM client, conversation history, and notes store — and its complete tool set is injected back into the parent agent's available tools after spawning.
This enables hierarchical, multi-agent workflows without any external orchestration: the agent self-organises by delegating scoped subtasks to children it spawns on demand.
| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| target | string | — | Pentest target to pass to the child |
| scope | string[] | — | In-scope targets/CIDRs for the child |
| model | string | env var | Model identifier, overrides PENTESTAGENT_MODEL on the child |
| no_rag | boolean | false | Skip RAG engine initialisation on the child |
| no_mcp | boolean | true | Skip external MCP server connections on the child (recommended) |
After spawn_mcp_agent returns, the child's tools (run_task, run_task_async, await_tasks, etc.) are available on the next tool call. The child's server name is assigned automatically (e.g. child_agent_1) and returned in the result.
Example — orchestrator delegating parallel recon to two children:
# Turn 1: spawn two isolated child agents
spawn_mcp_agent target="10.0.1.0/24" scope=["10.0.1.0/24"]
spawn_mcp_agent target="10.0.2.0/24" scope=["10.0.2.0/24"]
# Turn 2: children's tools are now available — delegate work asynchronously
child_agent_1__run_task_async task="Full port scan and service enumeration"
child_agent_2__run_task_async task="Full port scan and service enumeration"
# Turn 3: wait and collect
child_agent_1__await_tasks task_ids=["<id1>"] timeout_seconds=600
child_agent_2__await_tasks task_ids=["<id2>"] timeout_seconds=600
child_agent_1__get_task_result task_id="<id1>"
child_agent_2__get_task_result task_id="<id2>"
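The naming pattern in this example (`child_agent_1__run_task_async`) suggests that each child's tools are added to the parent under a `<server>__<tool>` prefix. The sketch below illustrates that idea only; `ToolRegistry` and `attach_child` are hypothetical names, and the real injection logic inside PentestAgent may differ.

```python
from itertools import count


class ToolRegistry:
    """Hypothetical model of a parent agent's tool list (illustration only)."""

    def __init__(self, builtin_tools):
        self.tools = set(builtin_tools)
        self._child_ids = count(1)  # child_agent_1, child_agent_2, ...

    def attach_child(self, child_tools):
        """Register a child's tools under an auto-assigned server name."""
        server = f"child_agent_{next(self._child_ids)}"
        for name in child_tools:
            # Prefixing keeps tools from different children from colliding.
            self.tools.add(f"{server}__{name}")
        return server


registry = ToolRegistry(["terminal", "browser", "notes", "spawn_mcp_agent"])
first = registry.attach_child(["run_task", "run_task_async", "await_tasks"])
second = registry.attach_child(["run_task", "run_task_async", "await_tasks"])
```

Because every tool name carries its server prefix, two children exposing the same tool set can coexist in one parent without ambiguity.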
MCP RAG Tool Optimizer
When an MCP server exposes more than 128 tools, PentestAgent automatically replaces the full catalogue with a single mcp_<server>_rag_optimizer tool. This meta-tool uses embedding similarity (via LiteLLM, default text-embedding-3-small) to retrieve the most relevant tools for the task at hand and injects them into the agent's next turn — keeping the context window manageable without losing access to the full tool set.
The optimizer requires no special handling by the agent: the agent calls the RAG tool with focused natural-language queries describing what it needs, and the matching tools become directly callable on the next turn.
Usage guidance for the agent:
| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| queries | string[] | (required) | One focused query per capability needed. More specific queries give more accurate results |
| top_k | integer | 20 | Tools to retrieve per query (max 128). Results are merged and deduplicated |
Embeddings are computed once at startup and cached, so repeated queries are fast. The optimizer is built per-server, so each MCP server with a large catalogue gets its own independent index.
Tip: Pass one query per distinct capability rather than combining everything into one query. ["list open ports on a host", "get process memory usage"] retrieves better results than ["list ports and memory and CPU"].
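The retrieval step described above (embed once, rank by similarity per query, merge and deduplicate) can be sketched as follows. This is an illustration, not the project's code: `embed` is a toy bag-of-words stand-in for the real embedding model that PentestAgent calls through LiteLLM, and `retrieve` and the sample catalogue are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: bag-of-words token counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(queries, catalogue, top_k=20):
    """Return tool names for all queries, merged and deduplicated in rank order.

    catalogue maps a tool name to its description.
    """
    index = {name: embed(desc) for name, desc in catalogue.items()}  # built once
    selected = []
    for query in queries:
        q = embed(query)
        ranked = sorted(index, key=lambda name: cosine(q, index[name]), reverse=True)
        for name in ranked[:top_k]:
            if name not in selected:
                selected.append(name)
    return selected


catalogue = {
    "port_scan": "list open ports on a host",
    "mem_usage": "get process memory usage",
    "dns_lookup": "resolve dns records for a domain",
}
tools = retrieve(
    ["list open ports on a host", "get process memory usage"], catalogue, top_k=1
)
# tools == ["port_scan", "mem_usage"]
```

Each query is ranked independently, which is why one focused query per capability tends to work better: a combined query spreads its similarity across unrelated tools.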
MCP Integration
PentestAgent supports MCP (Model Context Protocol) in two directions: consuming external MCP servers as tool sources, and exposing itself as an MCP server so external clients (Claude Desktop, Cursor, etc.) can drive PentestAgent programmatically.
Consuming External MCP Servers (Client Mode)
Configure mcp_servers.json to connect PentestAgent to any external MCP servers. Example config:
{
  "mcpServers": {
    "nmap": {
      "command": "npx",
      "args": ["-y", "gc-nmap-mcp"],
      "env": {
        "NMAP_PATH": "/usr/bin/nmap"
      }
    }
  }
}
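Before starting PentestAgent, a config of this shape can be sanity-checked with a few lines of Python. The sketch below is not part of the project: `load_mcp_servers` is a hypothetical helper, and treating `args` and `env` as optional is an assumption based only on the example above.

```python
import json


def load_mcp_servers(text: str) -> dict:
    """Parse mcp_servers.json content and check each server entry's shape."""
    config = json.loads(text)
    servers = config.get("mcpServers", {})
    for name, spec in servers.items():
        if not isinstance(spec.get("command"), str):
            raise ValueError(f"{name}: 'command' must be a string")
        # Assumption: 'args' and 'env' may be omitted.
        if not isinstance(spec.get("args", []), list):
            raise ValueError(f"{name}: 'args' must be a list")
        if not isinstance(spec.get("env", {}), dict):
            raise ValueError(f"{name}: 'env' must be an object")
    return servers


EXAMPLE = """
{
  "mcpServers": {
    "nmap": {
      "command": "npx",
      "args": ["-y", "gc-nmap-mcp"],
      "env": {"NMAP_PATH": "/usr/bin/nmap"}
    }
  }
}
"""
servers = load_mcp_servers(EXAMPLE)
```

A malformed entry (for example, one without `command`) raises `ValueError` naming the offending server, which is easier to act on than a failure at connection time.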
Exposing PentestAgent as an MCP Server (Server Mode)
PentestAgent can run as an MCP server, allowing any MCP-compatible client to submit tasks, inspect results, and control the agent remotely. Two transports are supported:
STDIO — for local clients (e.g. Claude Desktop, Cursor):
pentestagent mcp_server --type stdio
pentestagent mcp_server --type stdio --target 192.168.1.1 --scope 192.168.1.0/24
pentestagent mcp_server --type stdio --model claude-sonnet-4-20250514 --docker
SSE (HTTP) — for remote or networked clients:
pentestagent mcp_server --type sse
pentestagent mcp_server --type sse --host 0.0.0.0 --port 8080
pentestagent mcp_server --type sse --target 10.0.0.1 --scope 10.0.0.0/24 --docker
The SSE transport exposes a single /mcp endpoint supporting POST (requests), GET (persistent SSE stream for server-initiated push), and DELETE (session teardown). Sessions are tracked via the Mcp-Session-Id header.
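To make the three verbs concrete, the sketch below builds (without sending) one request of each kind using only the standard library. Several details are assumptions rather than documented PentestAgent behaviour: the URL uses the default port from the flags table, `build_mcp_request` is a hypothetical helper, the `tools/list` JSON-RPC payload and the `Accept` header follow the general MCP specification, and `<session-id>` is a placeholder for whatever value the server returns in `Mcp-Session-Id`.

```python
import json
from urllib.request import Request

MCP_URL = "http://127.0.0.1:8080/mcp"  # assumption: server started with the default --port


def build_mcp_request(method, session_id=None, payload=None, url=MCP_URL):
    """Build, but do not send, a request for the single /mcp endpoint."""
    headers = {"Accept": "application/json, text/event-stream"}
    if session_id is not None:
        headers["Mcp-Session-Id"] = session_id  # ties the request to a session
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


# POST carries a JSON-RPC request (payload shape assumed from the MCP spec).
list_tools = build_mcp_request(
    "POST",
    session_id="<session-id>",
    payload={"jsonrpc": "2.0", "id": 1, "method": "tools/list"},
)
# GET opens the persistent SSE stream for server-initiated messages.
open_stream = build_mcp_request("GET", session_id="<session-id>")
# DELETE tears the session down.
close_session = build_mcp_request("DELETE", session_id="<session-id>")
```

Passing any of these objects to `urllib.request.urlopen` would send it to a running server; in practice an MCP client library handles this exchange, including the initial handshake that establishes the session.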
All mcp_server flags:
| Flag | Default | Description |
|------|---------|-------------|
| --type | (required) | Transport: stdio or sse |
| --host | 0.0.0.0 | SSE bind host |
| --port | 8080 | SSE bind port |
| --target | none | Primary pentest target (IP / hostname) |
| --scope | [] | In-scope targets/CIDRs (space-separated) |
| --model | env var | Model identifier, overrides PENTESTAGENT_MODEL |
| --docker | false | Use DockerRuntime instead of LocalRuntime |
| --no-rag | false | Skip RAG engine initialisa
