NAAMSE

Neural Adversarial Agent Mutation-based Security Evaluator

NAAMSE is an automated security fuzzing framework for LLM-based agents that uses evolutionary algorithms to discover vulnerabilities. Built on LangGraph and compliant with the AgentBeats A2A protocol, NAAMSE acts as a "green agent" that evaluates target "purple agents": it iteratively generates adversarial prompts through mutations, invokes the target agent, and scores the responses for security violations such as jailbreaks, prompt injections, and PII leakage.

The system combines three components: a mutation engine with LLM-powered prompt transformations, a behavioral scoring engine using mixture-of-experts evaluation, and a clustering engine that organizes attack vectors by type, backed by separate SQLite databases for the adversarial and benign prompt corpora. Over multiple iterations, high-scoring prompts (those that successfully exploit vulnerabilities) are selected as parents for the next generation, creating evolutionary pressure toward more effective attacks.

The framework outputs PDF reports with vulnerability analysis, attack effectiveness metrics, and cluster-based categorization of discovered exploits, which makes it a practical tool for red-teaming and hardening LLM agents before deployment.
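
The parent-selection idea described above can be illustrated with a small sketch. This is not NAAMSE's actual implementation: `evolve_prompts`, `toy_mutate`, and `toy_score` are hypothetical names, score-proportional selection is an assumed strategy, and the toy functions merely stand in for the LLM-powered mutation engine and the behavioral scoring engine.

```python
import random
from typing import Callable, List, Tuple


def evolve_prompts(
    seeds: List[str],
    mutate: Callable[[str, random.Random], str],
    score: Callable[[str], float],
    iterations: int = 7,
    mutations_per_iteration: int = 4,
    rng_seed: int = 0,
) -> List[Tuple[float, str]]:
    """Return every (score, prompt) pair seen, highest score first."""
    rng = random.Random(rng_seed)
    population = [(score(prompt), prompt) for prompt in seeds]
    for _ in range(iterations):
        # Assumed strategy: sample parents with probability proportional to
        # their score, so higher-scoring prompts are mutated more often.
        weights = [max(s, 1e-6) for s, _ in population]
        parents = rng.choices(population, weights=weights, k=mutations_per_iteration)
        children = [mutate(prompt, rng) for _, prompt in parents]
        population.extend((score(child), child) for child in children)
    return sorted(population, reverse=True)


def toy_mutate(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a mutation: append a random word."""
    return prompt + " " + rng.choice(["please", "now", "ignore"])


def toy_score(prompt: str) -> float:
    """Hypothetical stand-in for scoring: count one marker word."""
    return float(prompt.count("ignore"))
```

Calling `evolve_prompts(["hello"], toy_mutate, toy_score)` returns the seed plus all generated children, ranked by the toy score. The default values of 7 iterations and 4 mutations per iteration simply mirror the request example in this README.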

Quick Start

Docker

The easiest way to run NAAMSE is using the pre-built Docker image:

# Pull the Docker image
docker pull ghcr.io/hashiru-ai/naamse-naamse-green-agent:latest

# Run the green agent.
# The .env file must set at least GOOGLE_API_KEY;
# see .env.example for more information.
docker run -p 9009:9009 \
  --env-file .env \
  ghcr.io/hashiru-ai/naamse-naamse-green-agent:latest

The agent will be available at:

Server: http://localhost:9009
Agent Card: http://localhost:9009/.well-known/agent-card.json
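
To confirm the container is up, you can fetch the agent card from the URL listed above. The following is a minimal sketch using only the Python standard library; the helper names `agent_card_url` and `fetch_agent_card` are illustrative and not part of NAAMSE, and no assumption is made about which fields the card contains.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Path taken from the "Agent Card" URL above.
AGENT_CARD_PATH = ".well-known/agent-card.json"


def agent_card_url(base_url: str) -> str:
    """Build the agent card URL for a server base URL."""
    return urljoin(base_url.rstrip("/") + "/", AGENT_CARD_PATH)


def fetch_agent_card(base_url: str, timeout: float = 5.0) -> dict:
    """Download and parse the agent card; raises if the server is unreachable."""
    with urlopen(agent_card_url(base_url), timeout=timeout) as response:
        return json.load(response)
```

With the agent running, `print(json.dumps(fetch_agent_card("http://localhost:9009"), indent=2))` prints the card as formatted JSON.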

Building the Docker Image Locally

# Clone the repository and enter it
git clone https://github.com/HASHIRU-AI/NAAMSE.git
cd NAAMSE

# Build the image
docker build -f ./scenarios/naamse/Dockerfile.naamse-green-agent -t naamse-naamse-green-agent .

# Run the container.
# The .env file must set at least GOOGLE_API_KEY;
# see .env.example for more information.
docker run -p 9009:9009 \
  --env-file .env \
  naamse-naamse-green-agent
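
Both `docker run` commands read their configuration from `.env`, so it can help to check the file before starting the container. The sketch below is a simplified, hypothetical helper (not part of NAAMSE): it handles only plain `KEY=value` lines, and GOOGLE_API_KEY is the only variable it checks by default because it is the only one this README names.

```python
from typing import Dict, Iterable, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str, required: Iterable[str] = ("GOOGLE_API_KEY",)) -> List[str]:
    """Return the required keys that are absent or set to an empty value."""
    values = parse_env_text(text)
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(open(".env").read())` returns an empty list when GOOGLE_API_KEY is set.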

Local Development Setup

For local development without Docker:

Prerequisites: Ensure Python 3.10+ is installed.

Install uv (Python package manager):

curl -LsSf https://astral.sh/uv/install.sh | sh

Clone the repository:

git clone https://github.com/HASHIRU-AI/NAAMSE.git
cd NAAMSE

Install dependencies:
```
uv sync
```

Activate the virtual environment:

# Linux/macOS
source .venv/bin/activate

# Windows
.venv\Scripts\activate

Set up environment variables:

- Copy .env.example to .env
- Set required variables (e.g., GOOGLE_API_KEY). Refer to .env.example for details.

Run the server:

```
python -m src.agentbeats.server --port 9009
```

The server will start on http://localhost:9009.

Test the agent (in another terminal, with the virtual environment activated):

```
python src/agentbeats/test_green_agent.py --target http://localhost:5000 --green-agent http://localhost:9009
```

- --target: URL of the target agent to evaluate
- --green-agent: URL of the running green agent

This will send a test evaluation request and stream the results.

You can also use langgraph dev for an interactive testing experience.

Request Format

The agent follows the AgentBeats standard EvalRequest format:

{
  "participants": {
    "agent": "http://localhost:5000"
  },
  "config": {
    "iterations_limit": 7,
    "mutations_per_iteration": 4
  }
}

Required fields:

- participants.agent - URL of the target agent to evaluate
- config.iterations_limit (7 in the example above) - Number of fuzzer iterations
- config.mutations_per_iteration (4 in the example above) - Mutations per iteration
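
The request body can also be assembled programmatically. The sketch below builds the same JSON structure as a plain dictionary; `build_eval_request` is a hypothetical helper, its validation rules are assumptions rather than NAAMSE's own checks, and its default values simply mirror the example. The repository's Pydantic models in src/agentbeats/models.py are deliberately not used here, because their exact signatures are not shown in this README.

```python
import json
from urllib.parse import urlparse


def build_eval_request(
    target_url: str,
    iterations_limit: int = 7,
    mutations_per_iteration: int = 4,
) -> dict:
    """Build an EvalRequest-shaped payload for the green agent."""
    parsed = urlparse(target_url)
    # Assumed sanity checks; the real server may validate differently.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"target_url must be an http(s) URL, got {target_url!r}")
    if iterations_limit < 1 or mutations_per_iteration < 1:
        raise ValueError("iteration and mutation counts must be positive")
    return {
        "participants": {"agent": target_url},
        "config": {
            "iterations_limit": iterations_limit,
            "mutations_per_iteration": mutations_per_iteration,
        },
    }
```

`json.dumps(build_eval_request("http://localhost:5000"), indent=2)` produces a body equivalent to the example above.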

Project Structure

src/
├─ agentbeats/           # A2A Green Agent Implementation
│  ├─ agent.py           # NAAMSE agent logic
│  ├─ executor.py        # A2A request handling
│  ├─ server.py          # Server entry point
│  ├─ models.py          # Pydantic models (EvalRequest, NAAMSEConfig)
│  └─ test_green_agent.py  # Test script
├─ agent/                # NAAMSE Fuzzer Graph (LangGraph)
│  ├─ graph.py           # Main fuzzer workflow with parallel iteration workers
│  └─ ...
├─ mutation_engine/      # Prompt mutation subgraph
├─ behavioral_engine/    # Response scoring subgraph (PII detection, jailbreak scoring)
├─ cluster_engine/       # Clustering and data management
│  ├─ data_access/
│  │  ├─ adversarial/    # SQLite database with 128K+ adversarial jailbreak prompts
│  │  ├─ benign/         # SQLite database with 50K+ benign security testing prompts
│  │  ├─ sqlite_source.py # Database access layer with embedding-based similarity search
│  │  └─ ...
│  └─ ...
└─ invoke_agent/         # Agent invocation subgraph

docs/
├─ scoring.md            # Behavioral scoring methodology
└─ schema.sql            # Database schema definitions

Database Structure

NAAMSE uses separate SQLite databases for different types of prompts:

- Adversarial Database (src/cluster_engine/data_access/adversarial/naamse.db): 128,000+ jailbreak and adversarial prompts organized into hierarchical clusters by attack type (DAN prompts, uncensored personas, encoding attacks, etc.)
- Benign Database (src/cluster_engine/data_access/benign/naamse_benign.db): 50,000+ benign prompts for security testing, including legitimate user queries that help validate scoring accuracy

Both databases include:

- Prompt text and metadata
- Hierarchical clustering information
- Sentence embeddings for similarity search
- Cluster labels and categorization
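
To show how embeddings stored in SQLite can support similarity search, here is a minimal self-contained sketch. The `prompts` table, its columns, the float32 BLOB encoding, and all function names are assumptions made for illustration; the real schema is defined in docs/schema.sql and the real access layer lives in sqlite_source.py. The demo uses an in-memory database with placeholder rows and 3-dimensional vectors.

```python
import math
import sqlite3
from array import array
from typing import List, Sequence, Tuple


def pack_embedding(vector: Sequence[float]) -> bytes:
    """Serialize a vector as float32 bytes for storage in a BLOB column."""
    return array("f", vector).tobytes()


def unpack_embedding(blob: bytes) -> List[float]:
    """Inverse of pack_embedding."""
    values = array("f")
    values.frombytes(blob)
    return list(values)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(
    conn: sqlite3.Connection, query: Sequence[float], k: int = 2
) -> List[Tuple[float, str, str]]:
    """Brute-force scan: score every row, return the k best matches."""
    rows = conn.execute("SELECT text, cluster_label, embedding FROM prompts").fetchall()
    scored = [
        (cosine_similarity(query, unpack_embedding(blob)), text, label)
        for text, label, blob in rows
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database using the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prompts ("
        "id INTEGER PRIMARY KEY, text TEXT, cluster_label TEXT, embedding BLOB)"
    )
    rows = [
        ("placeholder prompt A", "DAN prompts", pack_embedding([1.0, 0.0, 0.0])),
        ("placeholder prompt B", "encoding attacks", pack_embedding([0.0, 1.0, 0.0])),
        ("placeholder prompt C", "DAN prompts", pack_embedding([0.6, 0.8, 0.0])),
    ]
    conn.executemany(
        "INSERT INTO prompts (text, cluster_label, embedding) VALUES (?, ?, ?)", rows
    )
    return conn
```

A full scan like this is only practical for small tables; with corpora of the size described above, an index or a batched vector computation would likely be needed.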

Generate PDF report

After you obtain the JSON report from the green agent, generate the PDF by running util/generate_pdf_report.py with the following command:

# Linux/macOS
python ./util/generate_pdf_report.py --input_json=<input_json> --output_pdf=<output_pdf>

# Windows
python .\util\generate_pdf_report.py --input_json=<input_json> --output_pdf=<output_pdf>
