182 skills found · Page 1 of 7
llm-attacks / llm-attacks: Universal and Transferable Attacks on Aligned Language Models
MorDavid / BruteForceAI: Advanced LLM-powered brute-force tool combining AI intelligence with automated login attacks
ethz-spylab / agentdojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
agencyenterprise / PromptInject: PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022
knostic / OpenAnt: OpenAnt from Knostic is an open source LLM-based vulnerability discovery product that helps defenders proactively find verified security flaws while minimizing both false positives and false negatives. Stage 1 detects. Stage 2 attacks. What survives is real.
liu00222 / Open-Prompt-Injection: This repository provides a benchmark for prompt injection attacks and defenses in LLMs
tml-epfl / llm-adaptive-attacks: Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
romovpa / Claudini: Autoresearch for LLM adversarial attacks
SecureNexusLab / LLMPromptAttackGuide: No description available
praetorian-inc / Augustus: LLM security testing framework for detecting prompt injection, jailbreaks, and adversarial attacks — 190+ probes, 28 providers, single Go binary
Yu-Fangxu / COLD-Attack: [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
PKU-YuanGroup / Hallucination-Attack: An attack that induces hallucinations in LLMs
usail-hkust / JailTrickBench: Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM jailbreaking. (NeurIPS 2024)
BishopFox / BrokenHill: A productionized greedy coordinate gradient (GCG) attack tool for large language models (LLMs)
microsoft / BIPIA: A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
GodXuxilie / PromptAttack: An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)
MrMoshkovitz / Gandalf Llm Pentester: Automated red-team toolkit for stress-testing LLM defences - Vector Attacks on LLMs (Gandalf Case Study)
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
uw-nsl / ArtPrompt: [ACL24] Official Repo of Paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs`
SaFo-Lab / JailBreakV-28K: [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and further assess the robustness and safety of MLLMs against a variety of jailbreak attacks.