Basilisk

Basilisk — Open-source AI red teaming framework with genetic prompt evolution. Automated LLM security testing for GPT-4, Claude, Grok, Gemini. OWASP LLM Top 10 coverage. 32 attack modules.

Generate Convert Improve

Install / Use

/learn @regaan/Basilisk

About this skill

Quality Score

0/100

README

Basilisk — Open-Source AI Red Teaming Framework

Basilisk is an open-source AI red teaming and LLM security testing framework. It automates adversarial prompt testing against ChatGPT, Claude, Gemini, Grok, and any LLM API using genetic prompt evolution. Built for security researchers, penetration testers, and AI safety engineers who need to find vulnerabilities in AI systems before attackers do.

<img src="https://img.shields.io/badge/Version-1.1.0-red?style=for-the-badge" alt="Basilisk version 1.1.0" /> <img src="https://img.shields.io/badge/License-AGPL--3.0-blue?style=for-the-badge" alt="License: AGPL-3.0" /> <a href="https://doi.org/10.5281/zenodo.18909538"><img src="https://img.shields.io/badge/DOI-10.5281%2Fzenodo.18909538-blue?style=for-the-badge" alt="Zenodo DOI"></a> <a href="https://doi.org/10.6084/m9.figshare.31566853"><img src="https://img.shields.io/badge/Mirror-10.6084%2Fm9.figshare.31566853-emerald?style=for-the-badge" alt="Figshare DOI"></a> <a href="https://doi.org/10.17605/OSF.IO/H7BVR"><img src="https://img.shields.io/badge/DOI-10.17605%2FOSF.IO%2FH7BVR-lightgrey?style=for-the-badge" alt="OSF DOI"></a> <img src="https://img.shields.io/badge/Adoption-110+_Active_Users-blueviolet?style=for-the-badge&logo=github" alt="Adoption: 110+ Users" /> Basilisk is an industrial-strength, open-source AI red teaming framework designed to stress-test LLM security filters through advanced genetic prompt evolution. It automates the discovery of jailbreaks, data exfiltration vulnerabilities, and logic bypasses with forensic precision.

<div align="center"> <img src="assets/demo.gif" alt="Basilisk AI Red Teaming Demo - Genetic Prompt Evolution Dashboard" style="border-radius: 12px; margin: 20px 0; max-width: 100%; border: 1px solid #1f1f27;" /> Basilisk v1.1.0 — Automated LLM Jailbreaking & Security Testing <a href="https://youtu.be/sgFcM1y_omY"> <img src="https://img.shields.io/badge/Watch-Full%20Demo%20on%20YouTube-red?style=for-the-badge&logo=youtube" alt="Basilisk YouTube Demo" /> </a> </div>

Key Features Shown in Demo

Genetic Prompt Evolution: Automated mutation engine for high-success jailbreaks.
Differential Mode: Side-by-side behavioral comparison across providers.
Guardrail Posture Scan: Non-destructive A+ to F security grading.
Visual Feedback Engine: Real-time toast notifications and interactive logs.
Forensic Audit Reports: Export findings in HTML, JSON, and SARIF formats.

<a href="https://github.com/regaan/basilisk/actions/workflows/build.yml"><img src="https://github.com/regaan/basilisk/actions/workflows/build.yml/badge.svg" alt="Build Desktop" /></a> <a href="https://github.com/regaan/basilisk/actions/workflows/docker-build.yml"><img src="https://github.com/regaan/basilisk/actions/workflows/docker-build.yml/badge.svg" alt="Docker" /></a> <a href="https://github.com/regaan/basilisk/actions/workflows/python-publish.yml"><img src="https://github.com/regaan/basilisk/actions/workflows/python-publish.yml/badge.svg" alt="PyPI" /></a> <a href="https://github.com/marketplace/actions/basilisk-ai-security-scan"><img src="https://img.shields.io/badge/Marketplace-Action-blue?logo=github" alt="GitHub Marketplace" /></a> <a href="#what-is-basilisk">What is Basilisk?</a> • <a href="#quick-start">Quick Start</a> • <a href="#features">Features</a> • <a href="#whats-new-in-v110">What's New</a> • <a href="#attack-modules">Attack Modules</a> • <a href="#desktop-app">Desktop App</a> • <a href="#ci-cd-integration">CI/CD</a> • <a href="#docker">Docker</a> • <a href="https://basilisk.rothackers.com">Website</a>

     ██████╗  █████╗ ███████╗██╗██╗     ██╗███████╗██╗  ██╗
     ██╔══██╗██╔══██╗██╔════╝██║██║     ██║██╔════╝██║ ██╔╝
     ██████╔╝███████║███████╗██║██║     ██║███████╗█████╔╝
     ██╔══██╗██╔══██║╚════██║██║██║     ██║╚════██║██╔═██╗
     ██████╔╝██║  ██║███████║██║███████╗██║███████║██║  ██╗
     ╚═════╝ ╚═╝  ╚═╝╚══════╝╚═╝╚══════╝╚═╝╚══════╝╚═╝  ╚═╝
                    AI Red Teaming Framework v1.1.0

What is Basilisk?

Basilisk is a production-grade, open-source offensive security framework purpose-built for AI red teaming and LLM penetration testing. It is the first automated red teaming tool to combine full OWASP LLM Top 10 attack coverage with a genetic algorithm engine called Smart Prompt Evolution (SPE-NL) that evolves adversarial prompt payloads across generations to discover novel AI vulnerabilities and jailbreaks that no static tool can find.

Whether you are testing OpenAI GPT-4o, Anthropic Claude, Google Gemini, xAI Grok, Meta Llama, or any custom LLM endpoint, Basilisk provides 32 attack modules, 5 recon modules, differential multi-model scanning, guardrail posture grading, and forensic audit logging out of the box.

Why Basilisk?

Automated AI Red Teaming: Stop manually copy-pasting jailbreak prompts. Basilisk evolves thousands of adversarial payloads automatically.
Genetic Prompt Evolution: The SPE-NL engine mutates, crosses over, and scores prompts like biological organisms, finding bypasses humans would never think of.
Full OWASP LLM Top 10 Coverage: 32 modules covering prompt injection, system prompt extraction, data exfiltration, tool abuse, guardrail bypass, denial of service, multi-turn manipulation, and RAG attacks.
Works with Every LLM Provider: OpenAI, Anthropic, Google, xAI (Grok), Groq, Azure, AWS Bedrock, GitHub Models, Ollama, vLLM, and any custom HTTP/WebSocket endpoint.
CI/CD Ready: Native GitHub Action with SARIF output for automated AI security testing in your pipeline.
Desktop App: Full Electron GUI for visual red teaming with real-time scan dashboards.

Built by Regaan, Lead Researcher at ROT Independent Security Research Lab, and creator of WSHawk.

🌐 Website: basilisk.rothackers.com

Quick Start

# Install Basilisk from PyPI
pip install basilisk-ai

# Full AI red team scan against an OpenAI chatbot
export OPENAI_API_KEY="sk-..."
basilisk scan -t https://api.target.com/chat -p openai

# Quick scan — top payloads, no evolution
basilisk scan -t https://api.target.com/chat --mode quick

# Deep scan — 10 generations of genetic prompt evolution
basilisk scan -t https://api.target.com/chat --mode deep --generations 10

# Stealth mode — rate-limited, human-like timing
basilisk scan -t https://api.target.com/chat --mode stealth

# Recon only — fingerprint the target LLM
basilisk recon -t https://api.target.com/chat -p openai

# Guardrail posture check (no attacks, safe for production)
basilisk posture -p openai -m gpt-4o -v

# Differential scan across AI providers
basilisk diff -t openai:gpt-4o -t anthropic:claude-3-5-sonnet-20241022

# Use GitHub Models (FREE — no API key purchase required!)
export GH_MODELS_TOKEN="ghp_..."   # github.com/settings/tokens → models:read
basilisk scan -t https://api.target.com/chat -p github -m gpt-4o

# CI/CD mode — SARIF output, fail on high severity
basilisk scan -t https://api.target.com/chat -o sarif --fail-on high

Zero-Setup Live Demo

Want to see Basilisk in action right now without configuring API keys? We maintain an intentionally vulnerable LLM target for security testing:

Target URL: https://basilisk-vulnbot.onrender.com/v1/chat/completions

Run a quick scan against it immediately:

# No API keys required for this target!
basilisk scan -t https://basilisk-vulnbot.onrender.com/v1/chat/completions -p custom --model vulnbot-1.0 --mode quick

Or use the Desktop App:

Open the New Scan tab.
Set Endpoint URL to https://basilisk-vulnbot.onrender.com/v1/chat/completions.
Set Provider to Custom HTTP.
Set Model to vulnbot-1.0.
Click Start Scan.

Watch as Basilisk's genetic engine discovers 30+ vulnerabilities in real-time, including prompt injections, system leakage, and tool abuse.

Docker

docker pull rothackers/basilisk

docker run --rm -e OPENAI_API_KEY=sk-... rothackers/basilisk \
  scan -t https://api.target.com/chat --mode quick

Features

Smart Prompt Evolution (SPE-NL)

The core differentiator. Genetic algorithms adapted for natural language attack payloads:

15 mutation operators — synonym swap, encoding wrap, role injection, language shift, structure overhaul, fragment split, nesting, homoglyphs, context padding, token smuggling, role assumption, temporal anchoring, nested context, authority tone
5 crossover strategies — single-point, uniform, prefix-suffix, semantic blend, best-of-both
Multi-signal fitness function — refusal avoidance, information leakage, compliance scoring, novelty reward
Population diversity tracking — Jaccard distance sampling to prevent convergence collapse
Stagnation detection with adaptive mutation rate and early breakthrough exit
Payloads that fail get mutated, crossed, and re-evaluated — surviving payloads get deadlier every generation

32 Attack Modules

Full OWASP LLM Top 10 coverage across 8 attack categories + 3 multi-turn specialist modules. See Attack Modules below.

5 Reconnaissance Modules

Model Fingerprinting — identifies GPT-4, Claude, Gemini, Llama, Mistral via response patterns and timing
Guardrail Profiling — systematic probing across 8 content categories
Tool/Function Discovery — enumerates available tools and API schemas
Context Window Measurement — determines token limits
RAG Pipeline Detection — identifies retrieval-augmented generation setups

Differential Testing

Compare model behavior across providers — a feature n

Related Skills

healthcheck

333.7k

Host security hardening and risk-tolerance configuration for OpenClaw deployments

node-connect

333.7k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

prose

333.7k

OpenProse VM skill pack. Activate on any `prose` command, .prose files, or OpenProse mentions; orchestrates multi-agent workflows.

frontend-design

82.0k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

regaan

View profile

View on GitHub

GitHub Stars9

CategoryDevelopment

Updated20h ago

Forks0

regaan/basilisk

Languages

Python

Security Score

75/100

Audited on Mar 23, 2026

No findings