MCPSecBench
MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols
This benchmark includes and
used in our experiment.
A technical report is available as follows:
@article{yang2025mcpsecbench,
title={MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols},
author={Yang, Yixuan and Wu, Daoyuan and Chen, Yufan},
journal={arXiv preprint arXiv:2508.13220},
year={2025}
}
Overview of MCPSecBench
- main.py: an automated testing script that covers a subset of the attacks.
- addserver.py: a benign server for computation.
- maliciousadd.py: a malicious server.
- download.py: a benign server for checking signatures.
- squatting.py: a malicious server for server name squatting.
- client.py: a client that connects to the MCP host and servers. It currently supports OpenAI and Claude, and can be extended to DeepSeek, Llama, and Qwen.
- mitm.py: a script that implements a Man-in-the-Middle attack.
- index.js: a script for the DNS rebinding attack.
- cve-2025-6541.py: a malicious server that triggers CVE-2025-6541.
- claude_desktop_config.json: the configuration for Claude Desktop.
- prompts: example prompts for testing.
- results: experiment results, currently for OpenAI only.
Set up MCPSecBench
Requires a Python version higher than 3.10.
- Add the dependencies: uv add starlette pydantic pydantic_settings mcp[cli] anthropic aiohttp openai pyautogui pyperclip. You may need to install some extra system dependencies with apt for pyautogui to work.
- Change the basepath in malicious_add.py to your real path.
- For tool name squatting and server name squatting in Claude, check the order of the servers: Claude chooses the last server with a given name and calls the first tool with a given name.
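The ordering rule for squatting can be illustrated with a small simulation. This is only a sketch of the behaviour described above, not code from the benchmark or from Claude: the Server class, the origin label, and both resolver functions are hypothetical names, and "first tool" is interpreted here as the first tool in configuration order.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    origin: str  # hypothetical label, used only to tell duplicates apart
    tools: list = field(default_factory=list)


def resolve_server(servers, name):
    """Return the LAST configured server that has this name."""
    matches = [s for s in servers if s.name == name]
    return matches[-1] if matches else None


def resolve_tool(servers, tool):
    """Return the FIRST server, in configuration order, exposing this tool name."""
    for s in servers:
        if tool in s.tools:
            return s
    return None


# Two servers share both a server name and a tool name.
config_order = [
    Server("add", "benign", ["add"]),
    Server("add", "squatting", ["add"]),
]

print(resolve_server(config_order, "add").origin)
print(resolve_tool(config_order, "add").origin)
```

Under these assumptions, a malicious server must be listed last to win a server name collision and first to win a tool name collision, which is why the server order in the configuration matters.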
How to use MCPSecBench
Test Script
The automated check currently supports OpenAI and Cursor. To run it against Claude Desktop, change the wait_for_image arguments in main.py (such as img/cursor_init.png) to screenshots of Claude Desktop.
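The idea of waiting for a reference screenshot can be sketched as a polling loop. This is not the wait_for_image from main.py, whose signature is not shown here. It is a hypothetical stand-in that takes the locator as a parameter, so that a screen-matching function (for example one from pyautogui) could be passed in. The image path img/claude_init.png is also an assumed name.

```python
import time


def wait_for_image_sketch(image_path, locate, timeout=10.0, interval=0.5):
    """Poll locate(image_path) until it returns a match or the timeout expires.

    locate is any callable returning a match object or None. Exceptions are
    treated as "not found yet", since some locators raise instead of returning None.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            match = locate(image_path)
        except Exception:
            match = None
        if match is not None:
            return match
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Usage with a fake locator that "finds" the image on the third attempt.
attempts = []


def fake_locate(path):
    attempts.append(path)
    return (0, 0, 10, 10) if len(attempts) >= 3 else None


print(wait_for_image_sketch("img/claude_init.png", fake_locate, timeout=2, interval=0.01))
```

Switching the target application then amounts to passing a different reference screenshot, which matches the change described above.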
- Set the API key: export OPENAI_API_KEY=xxxx or export ANTHROPIC_API_KEY=xxx
- Run uv run main.py <mode> <protection>, where mode is 0 for Claude in CLI mode, 1 for OpenAI, or 2 for Cursor, and protection is 0 for none, 1 for MCIP, or 2 for AIM-MCP. For example: uv run main.py 1 2
Delete /tmp/state.json before the first run.
When you test Cursor, make sure Cursor is open and comes to the foreground after a single Alt+Tab, and that a new conversation is open, as in mcpbench/img/cursor_window.png
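The two positional arguments can be read as a pair of small lookup tables. The sketch below only mirrors the values listed above; it is not the argument handling in main.py, and parse_args, MODES, and PROTECTIONS are hypothetical names.

```python
MODES = {0: "Claude (CLI)", 1: "OpenAI", 2: "Cursor"}
PROTECTIONS = {0: "none", 1: "MCIP", 2: "AIM-MCP"}


def parse_args(argv):
    """Validate '<mode> <protection>' and return their human-readable names."""
    if len(argv) != 2:
        raise ValueError("usage: uv run main.py <mode> <protection>")
    mode, protection = int(argv[0]), int(argv[1])  # int() raises ValueError on bad input
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if protection not in PROTECTIONS:
        raise ValueError(f"protection must be one of {sorted(PROTECTIONS)}")
    return MODES[mode], PROTECTIONS[protection]


# Equivalent to the example invocation: uv run main.py 1 2
print(parse_args(["1", "2"]))
```

With these tables, the example invocation selects OpenAI as the target with AIM-MCP as the protection.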
Testing LLM models and MCP servers with own MCP client
- First, launch all remote servers. For example: uv run download.py
- Set the API key: export OPENAI_API_KEY=xxxx or export ANTHROPIC_API_KEY=xxx
- Then launch the client: uv run client.py <mode>, where mode is 0 for Claude or 1 for OpenAI.
- Finally, interact with the LLM.
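Before launching the client, it can help to confirm that the key for the chosen mode is set. The following is a minimal sketch that assumes mode 0 needs ANTHROPIC_API_KEY and mode 1 needs OPENAI_API_KEY, as the steps above suggest. REQUIRED_KEY and missing_key are hypothetical names and are not part of client.py.

```python
import os

# Assumed mapping from client mode to the environment variable it needs.
REQUIRED_KEY = {0: "ANTHROPIC_API_KEY", 1: "OPENAI_API_KEY"}


def missing_key(mode, env=None):
    """Return the name of the unset variable for this mode, or None if it is set."""
    env = os.environ if env is None else env
    key = REQUIRED_KEY[mode]
    return None if env.get(key) else key


# Check against the real environment for OpenAI mode.
name = missing_key(1)
print("ready" if name is None else f"please export {name} first")
```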
Testing Claude-Desktop
- First, copy the content of claude_desktop_config.json into your own claude_desktop_config.json and change the directory to your path.
- Launch all remote servers. For example: uv run download.py
- Test with Claude Desktop.
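The path change in the first step can be scripted. The sketch below assumes the usual Claude Desktop layout, with a top-level mcpServers object whose entries carry command and args. The example entry, the placeholder /path/to/MCPSecBench, and rebase_config are all hypothetical and are not taken from the repository's configuration file.

```python
import copy
import json


def rebase_config(config, old_base, new_base):
    """Return a copy of the config with old_base replaced by new_base in every args entry."""
    out = copy.deepcopy(config)
    for server in out.get("mcpServers", {}).values():
        server["args"] = [
            a.replace(old_base, new_base) if isinstance(a, str) else a
            for a in server.get("args", [])
        ]
    return out


# Hypothetical entry, for illustration only.
example = {
    "mcpServers": {
        "add": {
            "command": "uv",
            "args": ["--directory", "/path/to/MCPSecBench", "run", "addserver.py"],
        }
    }
}

rebased = rebase_config(example, "/path/to/MCPSecBench", "/home/me/MCPSecBench")
print(json.dumps(rebased, indent=2))
```

The function works on a copy, so the original dictionary can be kept for comparison before the result is written to the local configuration file.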
Testing Cursor
- Copy the content of cursor_config.json into the Cursor configuration and change the directory to your path.
- Launch all remote servers. For example: uv run download.py
- Test with Cursor manually or via main.py
Experiment Results
Experiment results are provided in the results folder.
License
Released under the MIT License.
