182 skills found · Page 1 of 7
llm-attacks / llm-attacks: Universal and Transferable Attacks on Aligned Language Models
MorDavid / BruteForceAI: Advanced LLM-powered brute-force tool combining AI intelligence with automated login attacks
ethz-spylab / agentdojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
agencyenterprise / PromptInject: PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022
knostic / OpenAnt: OpenAnt from Knostic is an open source LLM-based vulnerability discovery product that helps defenders proactively find verified security flaws while minimizing both false positives and false negatives. Stage 1 detects. Stage 2 attacks. What survives is real.
liu00222 / Open-Prompt-Injection: This repository provides a benchmark for prompt injection attacks and defenses in LLMs
tml-epfl / llm-adaptive-attacks: Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
romovpa / Claudini: Autoresearch for LLM adversarial attacks
SecureNexusLab / LLMPromptAttackGuide: No description available
praetorian-inc / Augustus: LLM security testing framework for detecting prompt injection, jailbreaks, and adversarial attacks — 190+ probes, 28 providers, single Go binary
Yu-Fangxu / COLD-Attack: [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
PKU-YuanGroup / Hallucination-Attack: An attack that induces hallucinations in LLMs
usail-hkust / JailTrickBench: Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM jailbreaking. (NeurIPS 2024)
BishopFox / BrokenHill: A productionized greedy coordinate gradient (GCG) attack tool for large language models (LLMs)
microsoft / BIPIA: A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
GodXuxilie / PromptAttack: An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)
MrMoshkovitz / Gandalf Llm Pentester: Automated red-team toolkit for stress-testing LLM defences - Vector Attacks on LLMs (Gandalf Case Study)
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
uw-nsl / ArtPrompt: [ACL24] Official Repo of Paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs`
SaFo-Lab / JailBreakV-28K: [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and further assess the robustness and safety of MLLMs against a variety of jailbreak attacks.