HostileShop

A Quaint Hostel Shop with Sharp Tools

Generate Convert Improve

Install / Use

/learn @mikeperry-tor/HostileShop

About this skill

Quality Score

0/100

README

HostileShop: A Quaint Hostel Shop with Sharp Tools

HostileShop is a tool for generating prompt injections and jailbreaks against LLM agents. It creates asimulated web shopping agent environment where an attacker agent attempts to manipulate a target shopping agent into performing unauthorized actions, recording successful attack examples in the process.

The framework automatically and immediately detects success conditions for direct and indirect prompt injections, without using an LLM to judge success. This enables low cost in-context learning by the attacker agent via immediate success reporting, and long-term learning via novel injection example extraction.

HostileShop supports the entire agent-capable LLM frontier, and maintains working attack examples for all such LLMs.

HostileShop also supports adversarial evaluation of prompt filters, and has generated bypass injections for gpt-oss-safeguard, even with a custom HostileShop-adapted safeguard policy.

HostileShop also provides a Prompt Injection Assistant Mode, where the full set of injection examples are provided to an agent that has been given instructions to assist you in performing red team exercises against other agent systems. Prompt injections discovered by HostileShop can be adapted to other agentic systems with this mode, since they are issues with the underlying LLM, rather than any specific agent system or agent framework.

OpenAI GPT-OSS-20B Red Team Contest Winner

HostileShop was one of the ten prize winners in OpenAI's GPT-OSS-20B RedTeam Contest.

The official contest writeup for HostileShop contains more information specific to the attacks that HostileShop discovered against gpt-oss-20b.

The branch gpt-oss-20b-submission preserves the code used to generate contest findings, and includes reproduction instructions in its README.

This branch contains many new features and improvements since then.

Attack Capabilities
Installation
- Setup
Basic Usage
Advanced Usage
Development

Attack Capabilities

The detailed results of the framework against gpt-oss-20b are documented in my contest writeup.

HostileShop has been expanded and enhanced since then. The high-level attack capabilities are as follows:

Context Window Structure Injection

The entire agent-capable frontier is currently vulnerable to attacks that render portions of the LLM's context window in common markup languages, such as XML, JSON, TOML, YAML, and Markdown.

HostileShop discovers examples of this vulerability by providing the attacker agent with a description of the context window of the target agent, along with instructions and curated examples on how to generate prompt injections that the target will recognize as if they were native context window tags.

This includes:

HostileShop provides utilities to automatically generate context window format documentation for both open-weight and closed-weight models.

Code Debugging and Social Engineering

The attacker agent discovered that code debugging questions are quite effective at causing secrets to be revealed. It also discovered that social engineering attacks are quite effective.

When used in combination, most of the LLM frontier will perform debugging that leaks confidential information as a side effect.

Jailbreak Mutation and Porting

With the introduction of externally sourced jailbreaks, HostileShop is able to mutate and enhance these jailbreaks so that they work again, to bypass filters, overcome model adaptation, or covercome additional prompt instructions.

Interestingly, old universal jailbreaks that have been fixed by the LLM provider or blocked by system prompt instructions will often work again when mutated or combined with other attacks.

Additionally, jailbreaks can be ported between models through this mutation.

Adversarial Prompt Filter Bypass

HostileShop is capable of evaluating arbitrary prompt filter systems. It has generated bypass injections for gpt-oss-safeguard, even with a custom HostileShop-adapted safeguard policy.

ParselTongue Obfuscation

HostileShop contains an implementation of ParselTongue as a single python multitool for the attacker agent to perform obfuscation of injections, by layering multiple transforms together. It contains a prompt fragment that is useful as a general jailbreaking and for bypassing prompt filters.

Attack Stacking

Attacks become even more reliable when all of the above are stacked together. This enables attacks to succeed against larger models, such as GPT-5, Claude-4.5, GLM-4.6, and Kimi-K2. It also enables bypass of policy-based prompt filters, by combining injections for the model with injections for the prompt filter.

For new models, or new jailbreaks, it takes some probing for the attacker to discover successful combinations, but once it does, it is usually able to make them quite reliable, especially to induce tool call invocation.

Installation

Setup

Clone the repository:

git clone https://github.com/mikeperry-tor/HostileShop.git
cd HostileShop

Install dependencies:

Choose one of the following methods to install the required Python packages:

Using pip:

pip install -r requirements.txt

Using conda:

# Create a new conda environment
conda create -n HostileShop python=3.13
conda activate HostileShop
# Install dependencies
pip install -r requirements.txt

Using uv:

# Create a virtual environment
uv venv
# Activate the virtual environment
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
# Install dependencies
uv pip install -r requirements.txt

Related Skills

node-connect

347.2k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

108.0k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

347.2k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

347.2k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。

mikeperry-tor

View profile

View on GitHub

GitHub Stars71

CategoryDevelopment

Updated22d ago

Forks12

mikeperry-tor/HostileShop

Languages

Python

Security Score

95/100

Audited on Mar 12, 2026

No findings

HostileShop

Install / Use

README

HostileShop: A Quaint Hostel Shop with Sharp Tools

OpenAI GPT-OSS-20B Red Team Contest Winner

Table of Contents

Attack Capabilities

Context Window Structure Injection

Code Debugging and Social Engineering

Jailbreak Mutation and Porting

Adversarial Prompt Filter Bypass

ParselTongue Obfuscation

Attack Stacking

Installation

Setup

Related Skills