AISecurity
AI Security Platform: Defense (61 Rust engines + Micro-Model Swarm) + Offense (39K+ payloads)
Install / Use
/learn @DmitrL-dev/AISecurityQuality Score
Category
Development & EngineeringSupported Platforms
README
S E N T I N E L
Your AI Is Under Attack. Your Classifier Can't See It.
Deterministic, mathematically-grounded AI security. Not another ML model hoping to catch what ML models miss.
<br> <br>
🔴 We Broke Alibaba's Flagship AI Model. Is Yours Next?
QWEN-2026-001 — 5 critical safety bypass vectors discovered in Qwen 3.5-Plus, Alibaba's most advanced model.
| | | |:--|:--| | Model | Qwen 3.5-Plus (February 2026) — Alibaba's flagship | | Safety Stack | Qwen3Guard + GSPO + RationaleRM — 3 layers of defense | | Result | All 3 layers bypassed. 5 vectors. 5 stages. 3 chat sessions. | | Output | Functional shellcode, reverse shells, jailbreak automation tools, God Mode declarations | | Severity | High (Systemic) — the model rated its own vulnerability |
The attack chain: contextual framing → decorative refusal → God Mode → self-replicating jailbreak tools. Each stage looks like a legitimate request. No safety filter triggered.
If Alibaba's 3-layer safety stack couldn't stop us, what's protecting yours?
📄 Full advisory:
QWEN-2026-001· 🎬 Demo: YouTube
<br>
What We Do
<table> <tr> <td width="50%" valign="top">🔍 AI Security Audit
We test your LLM deployment against 39,000+ attack payloads across 15 categories. You get a detailed vulnerability report with severity ratings, reproduction steps, and remediation guidance.
Deliverable: Full audit report + OWASP LLM Top 10 compliance mapping.
</td> <td width="50%" valign="top">🛡️ Sentinel Integration
Deploy 61 deterministic Rust engines as an input/output firewall around your LLM. Sub-millisecond latency. 98.5% detection rate. No GPU required.
Deliverable: Production-ready security layer + monitoring dashboard.
</td> </tr> <tr> <td width="50%" valign="top">⚔️ Red Team Operations
Adversarial testing by the team that broke Qwen 3.5-Plus. We find what your safety stack misses — prompt injection chains, multi-turn escalation, contextual framing, tool-call exploitation.
Deliverable: Attack chain documentation + video demonstrations.
</td> <td width="50%" valign="top">🎓 Sentinel Academy
90+ lessons across 3 skill levels. From prompt injection basics to formal verification of safety properties. Available in English and Russian.
Deliverable: Team training program + certification.
</td> </tr> </table> <br>Why Sentinel
| Metric | Value | What It Means | |:--|:--|:--| | 61 engines | Rust, deterministic, zero ML | No false negatives from model drift. Same input = same result. Always. | | 1,101 tests | 0 failures | Every engine, every pattern, every edge case — verified. | | 98.5% detection | 250,000 simulated attacks | Across 15 attack categories. The 1.5% residual is the theoretical floor. | | <1ms latency | Per query | Fast enough for real-time production. No GPU. No batching. | | 7 novel primitives | 0 prior implementations | 51 searches on grep.app confirmed: we invented these. | | 19 scientific domains | From formal verification to immunology | Each domain solves a problem the others can't. Independent failure modes. | | OWASP 9/10 | Agentic AI Top 10 coverage | Full platform: sentinel-core + shield + immune. Compliance mapping available. |
<br>The Problem
Every AI system deployed today faces the same fundamental challenge: the model cannot distinguish legitimate instructions from adversarial ones. A prompt injection attack looks identical to normal input. A jailbreak uses the same natural language as a help request. A data exfiltration chain can consist entirely of individually-legitimate tool calls.
Current defenses rely on ML classifiers that share the same blindness as the models they protect. Sentinel takes a different approach: deterministic, mathematically-grounded defense that doesn't depend on another AI to detect what AI can't see.
61 Rust detection engines. Sub-millisecond latency. 98.5% detection across 250,000 simulated attacks spanning 15 categories. 7 novel security primitives derived from 19 scientific domains — from formal verification to mechanism design to immunology.
<br>How It Works
Sentinel operates as a defense-in-depth cascade. Each layer catches what the previous one missed, and each layer uses a fundamentally different detection paradigm — so a bypass for one layer doesn't help against the next.
250,000 attacks enter the system
|
+-- L1 Sentinel Core (regex engines) ---- catches 36.0% ← deterministic pattern matching
| Remaining: 160,090
|
+-- L2 Capability Proxy (IFC) ----------- catches 20.3% ← structural: data CAN'T flow wrong
| Remaining: 109,241
|
+-- L3 Behavioral EDR ------------------- catches 10.9% ← runtime anomaly detection
| Remaining: 82,090
|
+-- PASR Provenance tracking ------------ catches 2.0% ← unforgeable provenance certificates
+-- TCSA Temporal chains + capabilities -- catches 0.8% ← LTL safety automata
+-- ASRA Ambiguity resolution ----------- catches 1.3% ← argumentation + mechanism design
+-- Combinatorial layers (A+B+G) --------- catches 6.1% ← impossibility proofs
+-- MIRE Model containment -------------- contains 0.7% ← don't detect, CONTAIN
|
RESIDUAL: ~1.5% (~3,750 attacks — theoretical floor)
The key insight: each layer uses a different scientific paradigm, so they don't share failure modes. Pattern matching, information flow control, temporal logic, argumentation theory, mechanism design, and containment are mathematically independent approaches.
<br>Platform Components
Defense
sentinel-core
Rust— 61 deterministic detection engines, 810+ regex patterns, 1101 tests. Sub-millisecond per-query latency. Covers OWASP LLM Top 10, CSA MCP TTPs, GenAI Attacks Matrix, and all 7 Sentinel Lattice primitives.
brain
Python— AI Security Backend. gRPC API with 32 modules: analyzer, audit, compliance, graph, hive, GPU inference, rules engine, SDK.
shield
C11— AI Security DMZ. 36,000+ LOC pure C11, 21 protocols, 119 CLI handlers, 103 tests. Zero external dependencies.
immune
C— EDR/XDR for AI infrastructure. Kernel-level endpoint protection, TLS/mTLS, Bloom filters, eBPF hooks.
micro-swarm
Python— Lightweight ML ensemble. <1ms inference, F1=0.997. Complements deterministic engines with statistical detection.
sentinel-sdk
Python— Integration SDK. | sentinel CLIPython— CLI framework wrapping sentinel-core.
Offense
strike
Python— AI Red Team Platform. 39,000+ attack payloads across 15 categories. Autonomous adversarial testing against your own defenses.
Infrastructure
<br>gomcp
Go— MCP server with hierarchical memory, cognitive state, causal reasoning graphs.devkit — Agent-first development toolkit. | patterns
YAML— Detection pattern databases (CJK jailbreaks, Pipelock taxonomy). | signaturesJSON— Signature databases (jailbreaks EN/RU, PII, keywords).
The Sentinel Lattice — 7 Novel Security Primitives
These aren't incremental improvements. Each primitive addresses a mathematically proven limitation of existing approaches. 51 cross-domain searches on grep.app confirmed: zero prior implementations exist for any of these.
| Primitive | Source Domain | The Problem | How It Solves It | |:--|:--|:--|:--| | TSA | Runtime Verification (Havelund & Rosu) | Individual tool calls are legitimate, but the chain is malicious. Current guards only check pairs. | LTL safety properties compiled to O(1) monitor automata. Checks arbitrary-length chains in constant time. | | CAFL | Information Flow Control | LLM can perform ANY information transformation — taint tracking breaks because the model is a black box. | Worst-case assumption: if tainted data enters LLM, ALL output is tainted. Capabilities only DECREASE through chains. Sound by construction. | | GPS | Predictive Analytics | Attacks are detected only AFTER the damage is done. | Enumerates the 16-bit abstract state space (65,536 states). Computes what fraction of continuations lead to danger. GPS > 0.7 = early warning BEFORE the attack arrives. | | AAS | Argumentation Theory (Dung 1995) | "How do I mix bleach and ammonia?" — chemistry student or attacker? Same text, same semantics. No classifier can distinguish them. | Constructs explicit argumentation frameworks. Computes grounded extension via fixed-point iteration. Context-conditioned att
Related Skills
node-connect
328.6kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
openai-image-gen
328.6kBatch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.
claude-opus-4-5-migration
80.9kMigrate prompts and code from Claude Sonnet 4.0, Sonnet 4.5, or Opus 4.1 to Opus 4.5
frontend-design
80.9kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
