
LuaN1aoAgent

LuaN1aoAgent is a cognitive-driven AI hacker. It is a fully autonomous AI penetration testing agent powered by DeepSeek V3.2. Using dual-graph reasoning, LuaN1ao achieves a success rate of over 90% on the XBOW Benchmark, with a median exploit cost of just $0.09.


<p align="center"> <img src="imgs/logo.png" alt="LuaN1ao Logo" width="200" /> </p> <h1 align="center">LuaN1aoAgent</h1> <h2 align="center">

Cognitive-Driven AI Hackers

</h2> <div align="center">

License: Apache-2.0 • Python 3.10+ • PRs Welcome • Architecture: P-E-R • Powered by LLM

</div> <div align="center"> <a href="https://zc.tencent.com/competition/competitionHackathon?code=cha004"><img src="imgs/tch.png" alt="[TCH] Top-Ranked Intelligent Pentest Project" width="250" /></a>

🧠 Think Like Human Experts • 📊 Dynamic Graph Planning • 🔄 Learn From Failures • 🎯 Evidence-Driven Decisions

🚀 Quick Start • ✨ Core Innovations • 🏗️ System Architecture • 🗓️ Roadmap

🌐 Chinese Version • English Version

</div>

📖 Introduction

LuaN1ao (鸾鸟) is a next-generation Autonomous Penetration Testing Agent powered by Large Language Models (LLMs).

Traditional automated scanning tools rely on predefined rules and struggle with complex real-world scenarios. LuaN1ao addresses these limitations by combining the P-E-R (Planner-Executor-Reflector) Agent Collaboration Framework with Causal Graph Reasoning.

LuaN1ao simulates the thinking patterns of human security experts:

  • 🎯 Strategic Planning: Dynamically plan attack paths based on global situational awareness
  • 🔍 Evidence-Driven: Build rigorous "Evidence-Hypothesis-Validation" logical chains
  • 🔄 Continuous Evolution: Learn from failures and autonomously adjust tactical strategies
  • 🧠 Cognitive Loop: Form a complete cognitive cycle of planning-execution-reflection

From information gathering to vulnerability exploitation, LuaN1ao elevates penetration testing from "automated tools" to an "autonomous agent".

[!NOTE] LuaN1aoAgent achieves a 90.4% success rate on the XBOW Benchmark fully autonomously, with a median exploit cost of only $0.09.

<p align="center"> <a href="https://github.com/SanMuzZzZz/LuaN1aoAgent"> <img src="https://img.shields.io/badge/⭐-Give%20us%20a%20Star-yellow?style=for-the-badge&logo=github" alt="Give us a Star"> </a> </p>

🖼️ Showcase

https://github.com/user-attachments/assets/e2c19442-20db-40ab-a5c6-3bf5c9054ae8

💡 More demos coming soon!


🚀 Core Innovations

1️⃣ P-E-R Agent Collaboration Framework ⭐⭐⭐

LuaN1ao decouples penetration testing thinking into three independent yet collaborative cognitive roles, forming a complete decision-making loop:

  • 🧠 Planner

    • Strategic Brain: Dynamic planning based on global graph awareness
    • Adaptive Capability: Identify dead ends and automatically generate alternative paths
    • Graph Operation Driven: Output structured graph editing instructions rather than natural language
    • Parallel Scheduling: Automatically identify parallelizable tasks based on topological dependencies
    • Adaptive Step Count: Allocate extra execution steps (max_steps) per subtask for complex tasks (blind injection extraction, multi-stage bypass, etc.)
  • ⚙️ Executor

    • Tactical Execution: Focus on single sub-task tool invocation and result analysis
    • Tool Orchestration: Unified scheduling of security tools via MCP (Model Context Protocol)
    • Context Compression: Intelligent message history management to avoid token overflow
    • Fault Tolerance: Automatic handling of network transient errors and tool invocation failures
    • Hypothesis Persistence: Hypotheses from formulate_hypotheses are preserved across steps and survive context compression
    • Parallel Discovery Sharing: Parallel subtasks exchange high-value findings in real-time via a shared bulletin board (ConfirmedVulnerability and high-confidence KeyFact)
    • First-Step Guidance: When no confirmed vulnerabilities exist, automatically prompts the agent to formulate a hypothesis framework before blind exploration
  • ⚖️ Reflector

    • Audit Analysis: Review task execution and validate artifact effectiveness
    • Failure Attribution: L1-L4 level failure pattern analysis to prevent repeated errors
    • Intelligence Generation: Extract attack intelligence and build knowledge accumulation
    • Termination Control: Judge goal achievement or task entrapment

Key Advantages: Role separation avoids the "split personality" problem of single agents. Each component focuses on its core responsibilities, and the three collaborate via an event bus.
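The role separation described above can be illustrated with a minimal sketch. It assumes a tiny synchronous publish/subscribe bus; the `EventBus` class, the topic names (`goal.new`, `subtask.ready`, `subtask.done`), and the stub role logic are all hypothetical and are not taken from the LuaN1ao codebase.

```python
from collections import defaultdict


class EventBus:
    """Tiny synchronous publish/subscribe bus (hypothetical stand-in)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self._subscribers[topic]:
            handler(payload)


def wire_roles(bus, log):
    """Register stub Planner, Executor, and Reflector handlers on the bus."""

    def planner(goal):
        # Planner: turn a goal into subtasks and announce each one.
        log.append(("planner", goal))
        for name in ("scan ports", "probe web app"):
            bus.publish("subtask.ready", name)

    def executor(subtask):
        # Executor: run one subtask; this stub only "succeeds" on the port scan.
        log.append(("executor", subtask))
        bus.publish("subtask.done", {"subtask": subtask, "ok": subtask == "scan ports"})

    def reflector(result):
        # Reflector: audit the result and decide whether replanning is needed.
        verdict = "keep" if result["ok"] else "replan"
        log.append(("reflector", result["subtask"], verdict))

    bus.subscribe("goal.new", planner)
    bus.subscribe("subtask.ready", executor)
    bus.subscribe("subtask.done", reflector)


log = []
bus = EventBus()
wire_roles(bus, log)
bus.publish("goal.new", "assess http://target.example")
```

Each role only sees the events it subscribes to, so a role can be swapped or tested in isolation; a real system would presumably run these handlers asynchronously.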

2️⃣ Causal Graph Reasoning ⭐⭐⭐

LuaN1ao rejects blind guessing and LLM hallucinations, constructing explicit causal graphs to drive testing decisions:

graph LR
    E[🔍 Evidence<br/>Evidence Node] -->|Support| H[💭 Hypothesis<br/>Hypothesis Node]
    H -->|Validation| V[⚠️ Vulnerability<br/>Vulnerability Node]
    V -->|Exploitation| X[💥 Exploit<br/>Exploit Node]

Core Principles:

  • Evidence First: Any hypothesis requires explicit prior evidence support
  • Confidence Quantification: Each causal edge has a confidence score to avoid blind advancement
  • Traceability: Complete recording of reasoning chains for failure tracing and experience reuse
  • Hallucination Prevention: Mandatory evidence validation, rejecting unfounded attacks

Example Scenario:

Evidence: Port scan discovers 3306/tcp open
  ↓ (Confidence 0.8)
Hypothesis: Target runs MySQL service
  ↓ (Validation successful)
Vulnerability: MySQL weak password/unauthorized access
  ↓ (Attempt exploitation)
Exploit: mysql -h target -u root -p [brute-force/empty password]
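As a rough illustration of the "Evidence First" and "Confidence Quantification" principles, the sketch below stores the first two steps of the scenario in a small in-memory graph. The `CausalGraph` class, its method names, and the `MIN_CONFIDENCE` threshold are assumptions made for this example, not the project's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class CausalGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> (kind, text)
    edges: list = field(default_factory=list)  # (src, dst, relation, confidence)

    def add_evidence(self, node_id, text):
        self.nodes[node_id] = ("evidence", text)

    def add_hypothesis(self, node_id, text, supported_by, confidence):
        # Evidence First: refuse a hypothesis that does not point at evidence.
        kind, _ = self.nodes.get(supported_by, (None, None))
        if kind != "evidence":
            raise ValueError(f"hypothesis {node_id!r} has no evidence node {supported_by!r}")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        self.nodes[node_id] = ("hypothesis", text)
        self.edges.append((supported_by, node_id, "support", confidence))

    def support(self, node_id):
        """Return the strongest 'support' confidence pointing at node_id."""
        return max(
            (c for _, dst, rel, c in self.edges if dst == node_id and rel == "support"),
            default=0.0,
        )


graph = CausalGraph()
graph.add_evidence("e1", "Port scan discovers 3306/tcp open")
graph.add_hypothesis("h1", "Target runs MySQL service", supported_by="e1", confidence=0.8)

MIN_CONFIDENCE = 0.7  # hypothetical threshold for moving on to validation
should_validate = graph.support("h1") >= MIN_CONFIDENCE
```

Rejecting unsupported hypotheses at insertion time is one simple way to keep every later step traceable back to a recorded observation.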

3️⃣ Plan-on-Graph (PoG) Dynamic Task Planning ⭐⭐⭐

Say goodbye to static task lists. LuaN1ao models penetration testing plans as dynamically evolving Directed Acyclic Graphs (DAGs):

Core Features:

  • Graph Operation Language: Planner outputs standardized graph editing operations (ADD_NODE, UPDATE_NODE, DEPRECATE_NODE)
  • Real-time Adaptation: Task graphs deform in real-time with testing progress
    • Discover new ports → Automatically mount service scanning subgraphs
    • Encounter WAF → Insert bypass strategy nodes
    • Path blocked → Automatically prune or branch planning
  • Topological Dependency Management: Automatically identify and parallelize independent tasks based on DAG topology
  • State Tracking: Each node contains a state machine (pending, in_progress, completed, failed, deprecated)
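The features above can be sketched as follows. The operation names (`ADD_NODE`, `UPDATE_NODE`, `DEPRECATE_NODE`) and the node states come from the list; the dictionary layout of an operation (`op`, `id`, `deps`, `fields`) and the helper names are assumptions for illustration only.

```python
VALID_STATES = {"pending", "in_progress", "completed", "failed", "deprecated"}


def apply_ops(graph, ops):
    """Apply a list of graph editing operations to a task graph in place."""
    for op in ops:
        kind = op["op"]
        if kind == "ADD_NODE":
            graph[op["id"]] = {"status": "pending", "deps": list(op.get("deps", []))}
        elif kind == "UPDATE_NODE":
            status = op["fields"].get("status", graph[op["id"]]["status"])
            if status not in VALID_STATES:
                raise ValueError(f"unknown state: {status}")
            graph[op["id"]].update(op["fields"])
        elif kind == "DEPRECATE_NODE":
            graph[op["id"]]["status"] = "deprecated"
        else:
            raise ValueError(f"unknown operation: {kind}")
    return graph


def ready_nodes(graph):
    """Pending nodes whose dependencies are all completed; these can run in parallel."""
    return sorted(
        node_id
        for node_id, node in graph.items()
        if node["status"] == "pending"
        and all(graph[dep]["status"] == "completed" for dep in node["deps"])
    )


plan = {}
apply_ops(plan, [
    {"op": "ADD_NODE", "id": "recon"},
    {"op": "ADD_NODE", "id": "scan_web", "deps": ["recon"]},
    {"op": "ADD_NODE", "id": "scan_db", "deps": ["recon"]},
])
first_batch = ready_nodes(plan)

apply_ops(plan, [{"op": "UPDATE_NODE", "id": "recon", "fields": {"status": "completed"}}])
parallel_batch = ready_nodes(plan)

# A WAF blocks the web path: prune it and insert a bypass node instead.
apply_ops(plan, [
    {"op": "DEPRECATE_NODE", "id": "scan_web"},
    {"op": "ADD_NODE", "id": "waf_bypass", "deps": ["recon"]},
])
after_replan = ready_nodes(plan)
```

Because the planner emits small edits rather than a whole new plan, completed nodes keep their state across replanning.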

Comparison with Traditional Planning:

| Feature | Traditional Task List | Plan-on-Graph |
|---------|----------------------|---------------|
| Structure | Linear list | Directed graph |
| Dependency Management | Manual sorting | Topological auto-sorting |
| Parallel Capability | None | Auto-identify parallel paths |
| Dynamic Adjustment | Regenerate | Local graph editing |
| Visualization | Difficult | Native support (Web UI) |

Visualization Example: Start the Web Server to view the task graph evolution in real-time in the browser.


Core Capabilities

Tool Integration (MCP Protocol)

LuaN1ao integrates and schedules all of its tools through the Model Context Protocol (MCP):

  • HTTP/HTTPS Requests: Support for custom headers, proxies, timeout control
  • Shell Command Execution: Securely encapsulated system command invocation (containerized execution recommended)
  • Python Code Execution: Dynamic execution of Python scripts for complex logic processing
  • Metacognitive Tools: think (deep thinking), formulate_hypotheses (hypothesis generation), reflect_on_failure (failure reflection)
  • Task Control: halt_task (early task termination)
  • Local Graph Query: query_causal_graph (direct in-process causal graph lookup, zero MCP latency)

💡 Extensibility: New tools can be easily integrated via mcp.json (e.g., Metasploit, Nuclei, Burp Suite API)

Knowledge Enhancement (RAG)

  • Vector Retrieval: Efficient knowledge base retrieval based on FAISS
  • Domain Knowledge: Integration of PayloadsAllTheThings and other open-source security knowledge bases
  • Dynamic Learning: Continuous addition of custom knowledge documents
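To show the idea behind vector retrieval without requiring extra dependencies, the sketch below ranks documents by cosine similarity over bag-of-words vectors. This is a deliberately simplified, pure-Python stand-in: the project itself uses FAISS, and the `embed`, `cosine`, and `retrieve` helpers and the sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    return sorted(docs, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)[:k]


docs = [
    "sql injection union select payloads",
    "xss payloads for script tag filters",
    "ssrf payloads for cloud metadata endpoints",
]
top = retrieve("union based sql injection", docs, k=1)
```

A production setup would replace `embed` with a learned embedding model and the sorted scan with an approximate nearest-neighbor index.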

Web Visualization (New Architecture)

The Web UI is now a standalone service powered by a database, enabling persistent task monitoring and management.

  • Real-time Monitoring: Browser view of dynamic task graph evolution and live logs.
  • Node Details: Click nodes to view execution logs, artifacts, state transitions.
  • Task Management: Create, abort, and delete historical tasks.
  • Data Persistence: All task data is stored in SQLite (luan1ao.db), preserving history across restarts.
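A minimal sketch of SQLite-backed task persistence, using only the Python standard library. The database file name `luan1ao.db` comes from the text above; the `tasks` table, its columns, and the helper functions are assumptions and do not reflect the project's real schema. The demo uses an in-memory database so that it leaves no file behind.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the task store; pass "luan1ao.db" to persist across restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, "
        "goal TEXT NOT NULL, "
        "status TEXT NOT NULL, "
        "graph_json TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def create_task(conn, goal):
    cur = conn.execute(
        "INSERT INTO tasks (goal, status, graph_json) VALUES (?, 'running', ?)",
        (goal, json.dumps({})),
    )
    conn.commit()
    return cur.lastrowid


def abort_task(conn, task_id):
    conn.execute("UPDATE tasks SET status = 'aborted' WHERE id = ?", (task_id,))
    conn.commit()


def delete_task(conn, task_id):
    conn.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
    conn.commit()


def list_tasks(conn):
    return conn.execute("SELECT id, goal, status FROM tasks ORDER BY id").fetchall()


conn = open_store()
task_id = create_task(conn, "assess http://target.example")
abort_task(conn, task_id)
history = list_tasks(conn)
```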

Human-in-the-Loop (HITL) Mode

LuaN1ao supports a Human-in-the-Loop (HITL) mode that lets experts supervise and intervene in the decision-making process.

  • Enable: Set HUMAN_IN_THE_LOOP=true in .env.
  • Approval: The agent pauses after generating a plan (initial or dynamic), waiting for human approval via Web UI or CLI.
  • Modification: Experts can reject or directly modify the plan (JSON editing) before execution.
  • Injection: Supports real-time injection of new sub-tasks via the Web UI ("Active Intervention").

Interaction Methods:

  • Web UI: An approval modal pops up automatically. Use "Modify" to edit plans or the "Add Task" button to inject tasks.
  • CLI: Prompts with HITL >. Type y to approve, n to reject, or m to modify (opens system editor).
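
The CLI approval flow can be sketched roughly as below. The `HUMAN_IN_THE_LOOP` variable, the `HITL >` prompt, and the `y`/`n`/`m` answers come from the description above; the `hitl_enabled` and `review_plan` functions are hypothetical, and the `edit` callback stands in for opening the system editor.

```python
import os


def hitl_enabled(env=None):
    """True when HUMAN_IN_THE_LOOP is set to 'true' (case-insensitive)."""
    env = os.environ if env is None else env
    return env.get("HUMAN_IN_THE_LOOP", "false").strip().lower() == "true"


def review_plan(plan, ask=input, edit=None):
    """Block until a human approves, rejects, or modifies the plan.

    Returns a (decision, plan) tuple. Unrecognized answers re-prompt.
    """
    while True:
        answer = ask("HITL > ").strip().lower()
        if answer == "y":
            return "approved", plan
        if answer == "n":
            return "rejected", plan
        if answer == "m" and edit is not None:
            return "modified", edit(plan)


# Scripted answers replace a real terminal session for this demo.
answers = iter(["?", "m"])
decision, reviewed = review_plan(
    {"ops": []},
    ask=lambda prompt: next(answers),
    edit=lambda plan: {**plan, "note": "skip brute force"},
)
```

Passing `ask` and `edit` as parameters keeps the gate testable without a terminal; by default `ask` falls back to the built-in `input`.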