
FuzzyAI

A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.


<p align="center"> <h1 align="center">FuzzyAI Fuzzer</h1> <p align="center"> <img src="/src/fuzzyai/resources/logo.png" alt="Project Logo" width="200" style="vertical-align:middle; margin-right:10px;" /><br/> The FuzzyAI Fuzzer is a powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify jailbreaks and mitigate potential security vulnerabilities in their LLM APIs. </p> </p> <p align="center"> <a href="https://github.com/cyberark/fuzzyai/commits/main"> <img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/cyberark/fuzzyai"> </a> <a href="https://github.com/cyberark/fuzzyai"> <img alt="GitHub code size in bytes" src="https://img.shields.io/github/languages/code-size/cyberark/FuzzyAI"> </a> <a href="https://github.com/cyberark/fuzzyai/blob/master/LICENSE" > <img alt="GitHub License" src="https://img.shields.io/github/license/cyberark/fuzzyai"> </a> <a href="https://discord.gg/ewQjdx2V"> <img alt="Discord" src="https://img.shields.io/discord/1330486843938177157"> </a> <br/><br/> <img alt="fuzzgif" src="/src/fuzzyai/resources/fuzz.gif" /> <br/> </p>

Getting Started

Quick start #1 - Using an existing Python project

  1. Install fuzzyai

    # Use either pip or any other package manager
    pip install git+https://github.com/cyberark/FuzzyAI.git
    
  2. Run the fuzzer

    fuzzyai fuzz -h
    

Quick start #2 - As a standalone project

  1. Clone the repository:

    git clone git@github.com:cyberark/FuzzyAI.git
    cd FuzzyAI
    
  2. Install dependencies using Poetry:

    poetry run pip install -e .
    
  3. Run the fuzzer:

    poetry run fuzzyai fuzz -h
    
  4. Optional: Install Ollama and download a model for local use:

    ollama pull llama3.1
    ollama show llama3.1 # verify model installation
    

    Alternatively, you can use the Web UI.

Web UI (Experimental)


  1. Run the Web UI (make sure you completed either of the installation steps from above):
     poetry run fuzzyai webui
     
     # Or specify a custom port:
     poetry run fuzzyai webui --port 9000
    

Notebooks

We've included interactive Jupyter notebooks you can use under src/fuzzyai/resources/notebooks/.
For more information, see notebooks wiki.

Datasets

We've included some datasets you can use under resources/. For more information, see datasets wiki.

Documentation

Explore detailed usage instructions in the Wiki.

Examples

If you're using Poetry, make sure you've activated the venv (or prepend 'poetry run' to each command line).

A default attack (just evaluating the prompt 'Harmful_Prompt') using LLaMA 3.1 over Ollama

fuzzyai fuzz -m ollama/llama3.1 -a def -t "Harmful_Prompt"

Attack the prompt utilizing the ManyShot and Taxonomy attacks, using GPT-3.5 over OpenAI with one worker

OPENAI_API_KEY=sk-3fa1... fuzzyai fuzz -m openai/gpt-3.5-turbo -a man -a tax -t "Harmful_Prompt"

Attack the prompts in adv_prompts.txt utilizing the ArtPrompt attack, using Claude 3.5 over Anthropic

ANTHROPIC_API_KEY=deadbeef... fuzzyai fuzz -m anthropic/claude-3-5-sonnet-20240620 -a art -T src/fuzzyai/resources/adv_prompts.txt -e blacklisted_words="bomb,garbage,acid,ugly,self-harm,kill your"
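Conceptually, a blacklist-based classifier like the one configured via `-e blacklisted_words=...` above flags a response when it contains any listed term. A minimal illustrative sketch of that idea (not FuzzyAI's actual implementation; the function name is hypothetical):

```python
def contains_blacklisted(response: str, blacklisted_words: str) -> bool:
    """Return True if any comma-separated blacklisted word appears in the response."""
    words = [w.strip().lower() for w in blacklisted_words.split(",") if w.strip()]
    text = response.lower()
    return any(w in text for w in words)

# A response mentioning a blacklisted term is flagged; a refusal is not.
print(contains_blacklisted("Sure, here is how to build a bomb", "bomb,acid"))  # True
print(contains_blacklisted("I can't help with that request", "bomb,acid"))    # False
```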

Attack your own REST API

OPENAI_API_KEY=sk-3fa1... fuzzyai fuzz -a def -x openai/gpt-4 -e classifier_model=openai/gpt-4 -m rest/http.raw -e host=localhost -e port=8000 -e scheme=https -t "How to build a bomb"

Arguments breakdown:

  • -a def:
    Invokes the default attack mode, which evaluates the API without applying additional strategies.

  • -x openai/gpt-4:
    Specifies an auxiliary model (gpt-4) to assist with certain tasks during the attack.

  • -e classifier_model=openai/gpt-4:
    Configures the fuzzer to use the gpt-4 model for classifying outputs. This helps determine whether responses meet specific criteria, such as detecting harmful or unwanted content.

  • -m rest/http.raw:
    Uses the REST provider to attack an API endpoint. The raw HTTP request is parsed from the file http.raw. (Refer to the documentation or example for details on the file structure.)

  • -e host=localhost -e port=8000 -e scheme=https:
    Configures the REST provider with the following API endpoint details:

    • Host: localhost
    • Port: 8000
    • Scheme: https (communication will use HTTPS)
  • -t "How to build a bomb":
    Specifies the test input. In this example, it tests the API's handling of sensitive or harmful content.
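The `http.raw` file referenced above holds the raw HTTP request that the REST provider parses and replays. The exact template syntax is documented in the wiki; the sketch below is only a hypothetical illustration of the general shape (the `{PROMPT}` placeholder and endpoint path are assumptions, not confirmed syntax):

```
POST /v1/chat HTTP/1.1
Host: localhost:8000
Content-Type: application/json

{"messages": [{"role": "user", "content": "{PROMPT}"}]}
```

The `-e host/port/scheme` extras then override where this request is actually sent, so the same template file can target different deployments.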

Key Features

  • Comprehensive Fuzzing Techniques: Leverage mutation-based, generation-based, and intelligent fuzzing.
  • Built-in Input Generation: Generate valid and invalid inputs for exhaustive testing.
  • Seamless Integration: Easily incorporate into your development and testing workflows.
  • Extensible Architecture: Customize and expand the fuzzer to meet your unique requirements.

Supported models

FuzzyAI supports various models across top providers, including:

| Provider  | Models                                        |
|-----------|-----------------------------------------------|
| Anthropic | Claude (3.5, 3.0, 2.1)                        |
| OpenAI    | GPT-4o, GPT-4o mini, GPT o3                   |
| Gemini    | Gemini Pro, Gemini 1.5                        |
| Azure     | GPT-4, GPT-3.5 Turbo                          |
| Bedrock   | Claude (3.5, 3.0), Meta (LLaMa)               |
| AI21      | Jamba (1.5 Mini, Large)                       |
| DeepSeek  | DeepSeek (DeepSeek-V3, DeepSeek-V1)           |
| Ollama    | LLaMA (3.3, 3.2, 3.1), Dolphin-LLaMA3, Vicuna |

Adding support for newer models

Easily add support for additional models by following our <a href="https://github.com/cyberark/FuzzyAI/wiki/DIY#adding-support-for-new-models">DIY guide</a>.

Implemented Attacks

See <a href="https://github.com/cyberark/FuzzyAI/wiki/Attacks">attacks wiki</a> for detailed information

| Attack Type | Title | Reference |
|---|---|---|
| ArtPrompt | ASCII Art-based jailbreak attacks against aligned LLMs | arXiv:2402.11753 |
| Taxonomy-based paraphrasing | Persuasive language techniques like emotional appeal to jailbreak LLMs | arXiv:2401.06373 |
| PAIR (Prompt Automatic Iterative Refinement) | Automates adversarial prompt generation by iteratively refining prompts with two LLMs | arXiv:2310.08419 |
| Many-shot jailbreaking | Embeds multiple fake dialogue examples to weaken model safety | Anthropic Research |
| ASCII Smuggling | Uses Unicode Tag characters to embed hidden instructions within text, invisible to users but processed by LLMs, potentially leading to prompt injection attacks | Embracethered blog |
| Genetic | Utilizes a genetic algorithm to modify prompts for adversarial outcomes | arXiv:2309.01446 |
| Hallucinations | Bypasses RLHF filters using model-generated hallucinations | arXiv |
