SkillAgentSearch skills...

Basilisk

Basilisk — Open-source AI red teaming framework with genetic prompt evolution. Automated LLM security testing for GPT-4, Claude, Grok, Gemini. OWASP LLM Top 10 coverage. 32 attack modules.

Install / Use

/learn @regaan/Basilisk

README

Basilisk — Open-Source AI Red Teaming Framework

Basilisk is an open-source AI red teaming and LLM security testing framework. It automates adversarial prompt testing against ChatGPT, Claude, Gemini, Grok, and any LLM API using genetic prompt evolution. Built for security researchers, penetration testers, and AI safety engineers who need to find vulnerabilities in AI systems before attackers do.

<p align="center"> <img src="https://img.shields.io/badge/Version-1.1.0-red?style=for-the-badge" alt="Basilisk version 1.1.0" /> <img src="https://img.shields.io/badge/License-AGPL--3.0-blue?style=for-the-badge" alt="License: AGPL-3.0" /> <a href="https://doi.org/10.5281/zenodo.18909538"><img src="https://img.shields.io/badge/DOI-10.5281%2Fzenodo.18909538-blue?style=for-the-badge" alt="Zenodo DOI"></a> <a href="https://doi.org/10.6084/m9.figshare.31566853"><img src="https://img.shields.io/badge/Mirror-10.6084%2Fm9.figshare.31566853-emerald?style=for-the-badge" alt="Figshare DOI"></a> <a href="https://doi.org/10.17605/OSF.IO/H7BVR"><img src="https://img.shields.io/badge/DOI-10.17605%2FOSF.IO%2FH7BVR-lightgrey?style=for-the-badge" alt="OSF DOI"></a> <img src="https://img.shields.io/badge/Adoption-110+_Active_Users-blueviolet?style=for-the-badge&logo=github" alt="Adoption: 110+ Users" /> </p> <p align="center"> <b>Basilisk</b> is an industrial-strength, open-source AI red teaming framework designed to stress-test LLM security filters through advanced genetic prompt evolution. It automates the discovery of jailbreaks, data exfiltration vulnerabilities, and logic bypasses with forensic precision. </p>
<div align="center"> <img src="assets/demo.gif" alt="Basilisk AI Red Teaming Demo - Genetic Prompt Evolution Dashboard" style="border-radius: 12px; margin: 20px 0; max-width: 100%; border: 1px solid #1f1f27;" /> <p><i>Basilisk v1.1.0 — Automated LLM Jailbreaking & Security Testing</i></p> <a href="https://youtu.be/sgFcM1y_omY"> <img src="https://img.shields.io/badge/Watch-Full%20Demo%20on%20YouTube-red?style=for-the-badge&logo=youtube" alt="Basilisk YouTube Demo" /> </a> </div>

Key Features Shown in Demo

  • Genetic Prompt Evolution: Automated mutation engine for high-success jailbreaks.
  • Differential Mode: Side-by-side behavioral comparison across providers.
  • Guardrail Posture Scan: Non-destructive A+ to F security grading.
  • Visual Feedback Engine: Real-time toast notifications and interactive logs.
  • Forensic Audit Reports: Export findings in HTML, JSON, and SARIF formats.
<p align="center"> <a href="https://github.com/regaan/basilisk/actions/workflows/build.yml"><img src="https://github.com/regaan/basilisk/actions/workflows/build.yml/badge.svg" alt="Build Desktop" /></a> <a href="https://github.com/regaan/basilisk/actions/workflows/docker-build.yml"><img src="https://github.com/regaan/basilisk/actions/workflows/docker-build.yml/badge.svg" alt="Docker" /></a> <a href="https://github.com/regaan/basilisk/actions/workflows/python-publish.yml"><img src="https://github.com/regaan/basilisk/actions/workflows/python-publish.yml/badge.svg" alt="PyPI" /></a> <a href="https://github.com/marketplace/actions/basilisk-ai-security-scan"><img src="https://img.shields.io/badge/Marketplace-Action-blue?logo=github" alt="GitHub Marketplace" /></a> </p> <p align="center"> <a href="#what-is-basilisk">What is Basilisk?</a> • <a href="#quick-start">Quick Start</a> • <a href="#features">Features</a> • <a href="#whats-new-in-v110">What's New</a> • <a href="#attack-modules">Attack Modules</a> • <a href="#desktop-app">Desktop App</a> • <a href="#ci-cd-integration">CI/CD</a> • <a href="#docker">Docker</a> • <a href="https://basilisk.rothackers.com">Website</a> </p>
     ██████╗  █████╗ ███████╗██╗██╗     ██╗███████╗██╗  ██╗
     ██╔══██╗██╔══██╗██╔════╝██║██║     ██║██╔════╝██║ ██╔╝
     ██████╔╝███████║███████╗██║██║     ██║███████╗█████╔╝
     ██╔══██╗██╔══██║╚════██║██║██║     ██║╚════██║██╔═██╗
     ██████╔╝██║  ██║███████║██║███████╗██║███████║██║  ██╗
     ╚═════╝ ╚═╝  ╚═╝╚══════╝╚═╝╚══════╝╚═╝╚══════╝╚═╝  ╚═╝
                    AI Red Teaming Framework v1.1.0

What is Basilisk?

Basilisk is a production-grade, open-source offensive security framework purpose-built for AI red teaming and LLM penetration testing. It is the first automated red teaming tool to combine full OWASP LLM Top 10 attack coverage with a genetic algorithm engine called Smart Prompt Evolution (SPE-NL) that evolves adversarial prompt payloads across generations to discover novel AI vulnerabilities and jailbreaks that no static tool can find.

Whether you are testing OpenAI GPT-4o, Anthropic Claude, Google Gemini, xAI Grok, Meta Llama, or any custom LLM endpoint, Basilisk provides 32 attack modules, 5 recon modules, differential multi-model scanning, guardrail posture grading, and forensic audit logging out of the box.

Why Basilisk?

  • Automated AI Red Teaming: Stop manually copy-pasting jailbreak prompts. Basilisk evolves thousands of adversarial payloads automatically.
  • Genetic Prompt Evolution: The SPE-NL engine mutates, crosses over, and scores prompts like biological organisms, finding bypasses humans would never think of.
  • Full OWASP LLM Top 10 Coverage: 32 modules covering prompt injection, system prompt extraction, data exfiltration, tool abuse, guardrail bypass, denial of service, multi-turn manipulation, and RAG attacks.
  • Works with Every LLM Provider: OpenAI, Anthropic, Google, xAI (Grok), Groq, Azure, AWS Bedrock, GitHub Models, Ollama, vLLM, and any custom HTTP/WebSocket endpoint.
  • CI/CD Ready: Native GitHub Action with SARIF output for automated AI security testing in your pipeline.
  • Desktop App: Full Electron GUI for visual red teaming with real-time scan dashboards.

Built by Regaan, Lead Researcher at ROT Independent Security Research Lab, and creator of WSHawk.

🌐 Website: basilisk.rothackers.com


Quick Start

# Install Basilisk from PyPI
pip install basilisk-ai

# Full AI red team scan against an OpenAI chatbot
export OPENAI_API_KEY="sk-..."
basilisk scan -t https://api.target.com/chat -p openai

# Quick scan — top payloads, no evolution
basilisk scan -t https://api.target.com/chat --mode quick

# Deep scan — 10 generations of genetic prompt evolution
basilisk scan -t https://api.target.com/chat --mode deep --generations 10

# Stealth mode — rate-limited, human-like timing
basilisk scan -t https://api.target.com/chat --mode stealth

# Recon only — fingerprint the target LLM
basilisk recon -t https://api.target.com/chat -p openai

# Guardrail posture check (no attacks, safe for production)
basilisk posture -p openai -m gpt-4o -v

# Differential scan across AI providers
basilisk diff -t openai:gpt-4o -t anthropic:claude-3-5-sonnet-20241022

# Use GitHub Models (FREE — no API key purchase required!)
export GH_MODELS_TOKEN="ghp_..."   # github.com/settings/tokens → models:read
basilisk scan -t https://api.target.com/chat -p github -m gpt-4o

# CI/CD mode — SARIF output, fail on high severity
basilisk scan -t https://api.target.com/chat -o sarif --fail-on high

Zero-Setup Live Demo

Want to see Basilisk in action right now without configuring API keys? We maintain an intentionally vulnerable LLM target for security testing:

Target URL: https://basilisk-vulnbot.onrender.com/v1/chat/completions

Run a quick scan against it immediately:

# No API keys required for this target!
basilisk scan -t https://basilisk-vulnbot.onrender.com/v1/chat/completions -p custom --model vulnbot-1.0 --mode quick

Or use the Desktop App:

  1. Open the New Scan tab.
  2. Set Endpoint URL to https://basilisk-vulnbot.onrender.com/v1/chat/completions.
  3. Set Provider to Custom HTTP.
  4. Set Model to vulnbot-1.0.
  5. Click Start Scan.

Watch as Basilisk's genetic engine discovers 30+ vulnerabilities in real-time, including prompt injections, system leakage, and tool abuse.

Docker

docker pull rothackers/basilisk

docker run --rm -e OPENAI_API_KEY=sk-... rothackers/basilisk \
  scan -t https://api.target.com/chat --mode quick

Features

Smart Prompt Evolution (SPE-NL)

The core differentiator. Genetic algorithms adapted for natural language attack payloads:

  • 15 mutation operators — synonym swap, encoding wrap, role injection, language shift, structure overhaul, fragment split, nesting, homoglyphs, context padding, token smuggling, role assumption, temporal anchoring, nested context, authority tone
  • 5 crossover strategies — single-point, uniform, prefix-suffix, semantic blend, best-of-both
  • Multi-signal fitness function — refusal avoidance, information leakage, compliance scoring, novelty reward
  • Population diversity tracking — Jaccard distance sampling to prevent convergence collapse
  • Stagnation detection with adaptive mutation rate and early breakthrough exit
  • Payloads that fail get mutated, crossed, and re-evaluated — surviving payloads get deadlier every generation

32 Attack Modules

Full OWASP LLM Top 10 coverage across 8 attack categories + 3 multi-turn specialist modules. See Attack Modules below.

5 Reconnaissance Modules

  • Model Fingerprinting — identifies GPT-4, Claude, Gemini, Llama, Mistral via response patterns and timing
  • Guardrail Profiling — systematic probing across 8 content categories
  • Tool/Function Discovery — enumerates available tools and API schemas
  • Context Window Measurement — determines token limits
  • RAG Pipeline Detection — identifies retrieval-augmented generation setups

Differential Testing

Compare model behavior across providers — a feature n

Related Skills

View on GitHub
GitHub Stars9
CategoryDevelopment
Updated18h ago
Forks0

Languages

Python

Security Score

75/100

Audited on Mar 23, 2026

No findings