MCPSecBench

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

This repository contains the MCPSecBench benchmark and the data used in our experiments.

A technical report is available and can be cited as follows:

@article{yang2025mcpsecbench,
  title={MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols},
  author={Yang, Yixuan and Wu, Daoyuan and Chen, Yufan},
  journal={arXiv preprint arXiv:2508.13220},
  year={2025}
}

Overview of MCPSecBench

main.py: an automated testing script that covers a subset of the attacks.
addserver.py: a normal server for computation.
maliciousadd.py: a malicious server.
download.py: a normal server for checking signatures.
squatting.py: a malicious server for server name squatting.
client.py: a client that connects to the MCP host and servers. At present, it supports OpenAI and Claude. It can be extended to DeepSeek, Llama, and Qwen.
mitm.py: the script that implements the Man-in-the-Middle attack.
index.js: the script for the DNS rebinding attack.
cve-2025-6541.py: a malicious server that triggers CVE-2025-6541.
claude_desktop_config.json: the configuration for Claude Desktop.
prompts: example prompts for testing.
results: results, available only for OpenAI at present.

Set up MCPSecBench

Requires: Python version higher than 3.10

Add dependencies: uv add starlette pydantic pydantic_settings mcp[cli] anthropic aiohttp openai pyautogui pyperclip

You may need to install some extra system dependencies with apt for pyautogui to work.
Change the basepath in malicious_add.py to your real path.
For tool name squatting and server name squatting in Claude, check the order of the servers: Claude chooses the last server with a given name and calls the first tool with a given name.
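The ordering rule for name squatting can be illustrated with a small simulation. This is only a sketch of the behaviour described above, not code from the repository: the `entries` list, `resolve_server`, and `resolve_tool` are hypothetical names, and the real resolution happens inside Claude.

```python
from typing import List, Optional, Tuple

# Hypothetical data model: (server_name, [tool_names]) in configuration order.
ServerEntry = Tuple[str, List[str]]


def resolve_server(entries: List[ServerEntry], name: str) -> Optional[int]:
    """Return the index of the LAST entry registered under `name`."""
    match = None
    for index, (server_name, _tools) in enumerate(entries):
        if server_name == name:
            match = index  # a later entry overrides an earlier one
    return match


def resolve_tool(entries: List[ServerEntry], tool: str) -> Optional[int]:
    """Return the index of the FIRST entry that exposes `tool`."""
    for index, (_server_name, tools) in enumerate(entries):
        if tool in tools:
            return index
    return None


entries = [
    ("add", ["add"]),         # listed first
    ("download", ["check"]),
    ("add", ["add"]),         # same server name and tool name, listed last
]
print(resolve_server(entries, "add"))  # index of the last server named "add"
print(resolve_tool(entries, "add"))    # index of the first tool named "add"
```

Under this model, a server name squatting test needs the malicious server listed after the benign one, while a tool name squatting test needs the malicious tool listed before it.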

How to use MCPSecBench

Test Script

The automated check supports OpenAI and Cursor at present. To run it against Claude Desktop, change the image parameter of wait_for_image in main.py (for example, img/cursor_init.png) to a screenshot of Claude Desktop.
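Waiting for a reference screenshot usually comes down to a polling loop. The sketch below shows that pattern in isolation; it is not the wait_for_image implementation from main.py, whose signature may differ, and `wait_for`, `locate`, `timeout`, and `interval` are names chosen here for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(locate: Callable[[], Optional[T]],
             timeout: float = 30.0,
             interval: float = 0.5) -> T:
    """Poll `locate` until it returns a non-None value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        found = locate()
        if found is not None:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError("reference image did not appear on screen")
        time.sleep(interval)
```

With pyautogui, `locate` could wrap a call to `pyautogui.locateOnScreen` on the Claude Desktop screenshot. Depending on the pyautogui version, a missing image is reported either as `None` or as an exception, so the wrapper may need a try/except that converts the exception to `None`.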

Set the API key: export OPENAI_API_KEY=xxxx or export ANTHROPIC_API_KEY=xxx
Run: uv run main.py <mode> <protection>, where mode is 0 for Claude in CLI mode, 1 for OpenAI, or 2 for Cursor, and protection is 0 for none, 1 for MCIP, or 2 for AIM-MCP. For example: uv run main.py 1 2

Delete /tmp/state.json before running the script.
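The two steps above (clearing the state file, then invoking main.py with a mode and a protection value) can be wrapped in a small helper. This is a sketch under the assumption that main.py takes exactly the two positional arguments shown in the usage line; `build_command` and `reset_state` are hypothetical helper names, not part of the repository.

```python
import os
from typing import List

# Values taken from the usage line in this README.
MODES = {0: "Claude (CLI)", 1: "OpenAI", 2: "Cursor"}
PROTECTIONS = {0: "none", 1: "MCIP", 2: "AIM-MCP"}


def build_command(mode: int, protection: int) -> List[str]:
    """Validate the arguments and return the argv list for the test script."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if protection not in PROTECTIONS:
        raise ValueError(f"protection must be one of {sorted(PROTECTIONS)}")
    return ["uv", "run", "main.py", str(mode), str(protection)]


def reset_state(path: str = "/tmp/state.json") -> bool:
    """Delete the state file if present; return True if a file was removed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False


print(" ".join(build_command(1, 2)))
```

To launch a run, call `reset_state()` and then pass the list to `subprocess.run(..., check=True)` from the repository directory.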

When you test Cursor, make sure that Cursor is open and comes to the foreground after a single Alt+Tab, and that a new conversation is open, as in mcpbench/img/cursor_window.png.

Testing LLM models and MCP servers with own MCP client

First, launch all remote servers. For example: uv run download.py
Set the API key: export OPENAI_API_KEY=xxxx or export ANTHROPIC_API_KEY=xxx
Then launch the client: uv run client.py <mode>, where mode is 0 for Claude or 1 for OpenAI.
Finally, interact with the LLM.
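Before launching the client, it can help to check that the API key matching the chosen mode is set. The sketch below assumes that mode 0 reads ANTHROPIC_API_KEY and mode 1 reads OPENAI_API_KEY, as the steps above suggest; `missing_key` is a hypothetical helper and not part of client.py.

```python
import os
from typing import Mapping, Optional

# Client modes from the steps above: 0 for Claude, 1 for OpenAI.
KEY_FOR_MODE = {0: "ANTHROPIC_API_KEY", 1: "OPENAI_API_KEY"}


def missing_key(mode: int, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset API key variable for `mode`, or None if it is set."""
    env = os.environ if env is None else env
    name = KEY_FOR_MODE[mode]  # raises KeyError for an unknown mode
    return None if env.get(name) else name


problem = missing_key(1)
if problem is not None:
    print(f"Set {problem} before running: uv run client.py 1")
```

Passing an explicit `env` mapping makes the check easy to exercise without touching the real environment.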

Testing Claude-Desktop

First, copy the content of claude_desktop_config.json into your own claude_desktop_config.json and change the directory to your path.
Launch all remote servers. For example: uv run download.py
Test with Claude Desktop.
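Changing the directory in the copied configuration can be scripted. The sketch below assumes the usual Claude Desktop layout with a top-level mcpServers object whose entries carry command and args; the example entry and both paths are placeholders for illustration, and the actual file in the repository may differ.

```python
import json
from typing import Any, Dict


def retarget_config(config: Dict[str, Any], old_root: str, new_root: str) -> Dict[str, Any]:
    """Return a copy of `config` with `old_root` replaced by `new_root` in command and args."""
    result = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    for server in result.get("mcpServers", {}).values():
        if isinstance(server.get("command"), str):
            server["command"] = server["command"].replace(old_root, new_root)
        if "args" in server:
            server["args"] = [
                arg.replace(old_root, new_root) if isinstance(arg, str) else arg
                for arg in server["args"]
            ]
    return result


# Hypothetical example entry; paths are placeholders.
example = {
    "mcpServers": {
        "add": {
            "command": "uv",
            "args": ["--directory", "/path/to/MCPSecBench", "run", "addserver.py"],
        }
    }
}
updated = retarget_config(example, "/path/to/MCPSecBench", "/home/me/MCPSecBench")
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary and leaves its input untouched, so the result can be reviewed before it is written to the real configuration file.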

Testing Cursor

Copy the content of cursor_config.json into the Cursor configuration and change the directory to your path.
Launch all remote servers. For example: uv run download.py
Test with Cursor manually or via main.py.

Experiment Results

Experiment results are available in the data folder.

License

Released under the MIT License.
