
# AISecurity

AI Security Platform: Defense (61 Rust engines + Micro-Model Swarm) + Offense (39K+ payloads)

**Install / Use:** `/learn @DmitrL-dev/AISecurity`

<div align="center"> <br>

S E N T I N E L

Your AI Is Under Attack. Your Classifier Can't See It.

Deterministic, mathematically-grounded AI security. Not another ML model hoping to catch what ML models miss.

</div> <br>

## 🔴 We Broke Alibaba's Flagship AI Model. Is Yours Next?

QWEN-2026-001 — 5 critical safety bypass vectors discovered in Qwen 3.5-Plus, Alibaba's most advanced model.

|  |  |
|:--|:--|
| Model | Qwen 3.5-Plus (February 2026) — Alibaba's flagship |
| Safety Stack | Qwen3Guard + GSPO + RationaleRM — 3 layers of defense |
| Result | All 3 layers bypassed. 5 vectors. 5 stages. 3 chat sessions. |
| Output | Functional shellcode, reverse shells, jailbreak automation tools, God Mode declarations |
| Severity | High (Systemic) — the model rated its own vulnerability |

The attack chain: contextual framing → decorative refusal → God Mode → self-replicating jailbreak tools. Each stage looks like a legitimate request. No safety filter triggered.

If Alibaba's 3-layer safety stack couldn't stop us, what's protecting yours?

📄 Full advisory: QWEN-2026-001 · 🎬 Demo: YouTube


<br>

## What We Do

<table> <tr> <td width="50%" valign="top">

### 🔍 AI Security Audit

We test your LLM deployment against 39,000+ attack payloads across 15 categories. You get a detailed vulnerability report with severity ratings, reproduction steps, and remediation guidance.

Deliverable: Full audit report + OWASP LLM Top 10 compliance mapping.

</td> <td width="50%" valign="top">

### 🛡️ Sentinel Integration

Deploy 61 deterministic Rust engines as an input/output firewall around your LLM. Sub-millisecond latency. 98.5% detection rate. No GPU required.

Deliverable: Production-ready security layer + monitoring dashboard.

</td> </tr> <tr> <td width="50%" valign="top">

### ⚔️ Red Team Operations

Adversarial testing by the team that broke Qwen 3.5-Plus. We find what your safety stack misses — prompt injection chains, multi-turn escalation, contextual framing, tool-call exploitation.

Deliverable: Attack chain documentation + video demonstrations.

</td> <td width="50%" valign="top">

### 🎓 Sentinel Academy

90+ lessons across 3 skill levels. From prompt injection basics to formal verification of safety properties. Available in English and Russian.

Deliverable: Team training program + certification.

</td> </tr> </table> <br>
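The Sentinel Integration service above amounts to the classic input/output firewall pattern: scan the prompt before it reaches the model, scan the completion before it reaches the user. A minimal Python sketch of that shape, with hypothetical `check_input` / `check_output` stand-ins rather than the real Sentinel API:

```python
# Illustrative firewall wrapper. The checks below are toy stand-ins;
# in the real platform they would be the deterministic Rust engines.
BLOCKED = "[blocked by policy]"

def check_input(prompt: str) -> bool:
    # Pre-flight: reject a known prompt-injection phrase (toy rule).
    return "ignore previous instructions" not in prompt.lower()

def check_output(completion: str) -> bool:
    # Post-flight: reject completions that leak key material (toy rule).
    return "BEGIN PRIVATE KEY" not in completion

def guarded_completion(prompt: str, call_llm) -> str:
    """Wrap any LLM callable with input and output scanning."""
    if not check_input(prompt):
        return BLOCKED
    completion = call_llm(prompt)
    if not check_output(completion):
        return BLOCKED
    return completion
```

Because both checks are plain predicates, the wrapper composes with any model client passed in as `call_llm`, e.g. `guarded_completion("hello", my_client)`.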

## Why Sentinel

| Metric | Value | What It Means |
|:--|:--|:--|
| 61 engines | Rust, deterministic, zero ML | No false negatives from model drift. Same input = same result. Always. |
| 1,101 tests | 0 failures | Every engine, every pattern, every edge case — verified. |
| 98.5% detection | 250,000 simulated attacks | Across 15 attack categories. The 1.5% residual is the theoretical floor. |
| <1ms latency | Per query | Fast enough for real-time production. No GPU. No batching. |
| 7 novel primitives | 0 prior implementations | 51 searches on grep.app confirmed: we invented these. |
| 19 scientific domains | From formal verification to immunology | Each domain solves a problem the others can't. Independent failure modes. |
| OWASP 9/10 | Agentic AI Top 10 coverage | Full platform: sentinel-core + shield + immune. Compliance mapping available. |

<br>

## The Problem

Every AI system deployed today faces the same fundamental challenge: the model cannot distinguish legitimate instructions from adversarial ones. A prompt injection attack looks identical to normal input. A jailbreak uses the same natural language as a help request. A data exfiltration chain can consist entirely of individually-legitimate tool calls.

Current defenses rely on ML classifiers that share the same blindness as the models they protect. Sentinel takes a different approach: deterministic, mathematically-grounded defense that doesn't depend on another AI to detect what AI can't see.

61 Rust detection engines. Sub-millisecond latency. 98.5% detection across 250,000 simulated attacks spanning 15 categories. 7 novel security primitives derived from 19 scientific domains — from formal verification to mechanism design to immunology.

<br>

## How It Works

Sentinel operates as a defense-in-depth cascade. Each layer catches what the previous one missed, and each layer uses a fundamentally different detection paradigm — so a bypass for one layer doesn't help against the next.

```text
250,000 attacks enter the system
    |
    +-- L1  Sentinel Core (regex engines) ----- catches  36.0%   ← deterministic pattern matching
    |   Remaining: 160,090
    |
    +-- L2  Capability Proxy (IFC) ------------ catches  20.3%   ← structural: data CAN'T flow wrong
    |   Remaining: 109,241
    |
    +-- L3  Behavioral EDR -------------------- catches  10.9%   ← runtime anomaly detection
    |   Remaining: 82,090
    |
    +-- PASR  Provenance tracking ------------- catches   2.0%   ← unforgeable provenance certificates
    +-- TCSA  Temporal chains + capabilities -- catches   0.8%   ← LTL safety automata
    +-- ASRA  Ambiguity resolution ------------ catches   1.3%   ← argumentation + mechanism design
    +-- Combinatorial layers (A+B+G) ---------- catches   6.1%   ← impossibility proofs
    +-- MIRE  Model containment --------------- contains  0.7%   ← don't detect, CONTAIN
    |
    RESIDUAL: ~1.5% (~3,750 attacks — theoretical floor)
```

The key insight: each layer uses a different scientific paradigm, so they don't share failure modes. Pattern matching, information flow control, temporal logic, argumentation theory, mechanism design, and containment are mathematically independent approaches.
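The cascade shape itself is simple to express: independent predicates evaluated in order, where each predicate uses a different kind of evidence. A minimal Python sketch with invented checks standing in for the real layers (the production engines are Rust; nothing here is the Sentinel implementation):

```python
import re

# Toy stand-ins for three cascade layers. Each uses a different paradigm:
# pattern matching, information-flow rules, and runtime behavior.

def l1_pattern_match(text: str) -> bool:
    # L1: deterministic pattern matching (one illustrative regex).
    return re.search(r"ignore (all )?previous instructions", text, re.I) is not None

def l2_flow_control(text: str, tainted: bool) -> bool:
    # L2: structural rule -- tainted data must never reach an output sink.
    return tainted and "send" in text.lower()

def l3_behavioral(call_count: int) -> bool:
    # L3: runtime anomaly -- an unusually long tool-call burst.
    return call_count > 50

def cascade(text: str, tainted: bool = False, call_count: int = 0) -> str:
    """Evaluate layers in order; return the first layer that fires."""
    if l1_pattern_match(text):
        return "blocked:L1"
    if l2_flow_control(text, tainted):
        return "blocked:L2"
    if l3_behavioral(call_count):
        return "blocked:L3"
    return "pass"
```

A bypass that defeats the L1 regex (say, a paraphrased injection) still has to survive the flow-control and behavioral layers, which never look at surface text at all.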

<br>

## Platform Components

### Defense

- **sentinel-core** (Rust) — 61 deterministic detection engines, 810+ regex patterns, 1,101 tests. Sub-millisecond per-query latency. Covers OWASP LLM Top 10, CSA MCP TTPs, GenAI Attacks Matrix, and all 7 Sentinel Lattice primitives.
- **brain** (Python) — AI Security Backend. gRPC API with 32 modules: analyzer, audit, compliance, graph, hive, GPU inference, rules engine, SDK.
- **shield** (C11) — AI Security DMZ. 36,000+ LOC pure C11, 21 protocols, 119 CLI handlers, 103 tests. Zero external dependencies.
- **immune** (C) — EDR/XDR for AI infrastructure. Kernel-level endpoint protection, TLS/mTLS, Bloom filters, eBPF hooks.
- **micro-swarm** (Python) — Lightweight ML ensemble. <1ms inference, F1 = 0.997. Complements the deterministic engines with statistical detection.
- **sentinel-sdk** (Python) — Integration SDK.
- **sentinel CLI** (Python) — CLI framework wrapping sentinel-core.
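To make "deterministic detection engines" concrete: each engine is essentially a named, pre-compiled pattern set whose verdict is a pure function of its input. A Python sketch of that idea, with invented engine names and patterns (the real sentinel-core engines are Rust and far more extensive):

```python
import re

# Illustrative pattern table -- NOT the real sentinel-core pattern database.
ENGINES = {
    "prompt_injection": [
        r"ignore (all )?previous instructions",
        r"you are now (in )?developer mode",
    ],
    "data_exfiltration": [
        r"curl\s+https?://\S+",
    ],
}

# Compile once at startup; per-query matching is then fast and deterministic.
COMPILED = {
    name: [re.compile(p, re.IGNORECASE) for p in patterns]
    for name, patterns in ENGINES.items()
}

def scan(text: str) -> list[str]:
    """Return the names of engines that flag the text: same input, same result."""
    return [name for name, patterns in COMPILED.items()
            if any(p.search(text) for p in patterns)]
```

The determinism claim in the table above falls out for free: there is no sampling and no model state, so `scan(x)` always returns the same verdict for the same `x`.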

### Offense

- **strike** (Python) — AI Red Team Platform. 39,000+ attack payloads across 15 categories. Autonomous adversarial testing against your own defenses.

### Infrastructure

- **gomcp** (Go) — MCP server with hierarchical memory, cognitive state, causal reasoning graphs.
- **devkit** — Agent-first development toolkit.
- **patterns** (YAML) — Detection pattern databases (CJK jailbreaks, Pipelock taxonomy).
- **signatures** (JSON) — Signature databases (jailbreaks EN/RU, PII, keywords).

<br>

## The Sentinel Lattice — 7 Novel Security Primitives

These aren't incremental improvements. Each primitive addresses a mathematically proven limitation of existing approaches. 51 cross-domain searches on grep.app confirmed: zero prior implementations exist for any of these.

| Primitive | Source Domain | The Problem | How It Solves It |
|:--|:--|:--|:--|
| TSA | Runtime Verification (Havelund & Rosu) | Individual tool calls are legitimate, but the chain is malicious. Current guards only check pairs. | LTL safety properties compiled to O(1) monitor automata. Checks arbitrary-length chains in constant time. |
| CAFL | Information Flow Control | LLM can perform ANY information transformation — taint tracking breaks because the model is a black box. | Worst-case assumption: if tainted data enters LLM, ALL output is tainted. Capabilities only DECREASE through chains. Sound by construction. |
| GPS | Predictive Analytics | Attacks are detected only AFTER the damage is done. | Enumerates the 16-bit abstract state space (65,536 states). Computes what fraction of continuations lead to danger. GPS > 0.7 = early warning BEFORE the attack arrives. |
| AAS | Argumentation Theory (Dung 1995) | "How do I mix bleach and ammonia?" — chemistry student or attacker? Same text, same semantics. No classifier can distinguish them. | Constructs explicit argumentation frameworks. Computes grounded extension via fixed-point iteration. Context-conditioned att |
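The TSA row is easiest to appreciate with a toy example. Below is a sketch of a single safety property ("once a secret is read, a network send is never allowed", i.e. G(read_secret → G(¬network_send))) compiled by hand into a monitor automaton: each tool-call event advances the state in O(1), regardless of chain length. The tool names are illustrative; the real TSA compiles arbitrary LTL safety formulas, not this one hard-coded rule.

```python
# Hand-compiled monitor automaton for one hypothetical safety property:
#   G(read_secret -> G(!network_send))
# Three states: SAFE (no secret read yet), TAINTED (secret read),
# VIOLATION (send attempted after a secret was read). O(1) per event.

class ChainMonitor:
    SAFE, TAINTED, VIOLATION = "safe", "tainted", "violation"

    def __init__(self):
        self.state = self.SAFE

    def step(self, tool_call: str) -> str:
        """Advance the automaton by one tool-call event."""
        if self.state == self.SAFE and tool_call == "read_secret":
            self.state = self.TAINTED
        elif self.state == self.TAINTED and tool_call == "network_send":
            self.state = self.VIOLATION
        return self.state
```

Note that every call in a chain like `list_files → read_secret → summarize → network_send` is individually legitimate, and a pairwise guard would pass it (`summarize → network_send` is harmless on its own); only the automaton's accumulated state catches the chain.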
