AutoPiff
Semantic analysis engine for detecting vulnerability fixes in Windows kernel driver patches — 58 YAML rules, Ghidra decompilation, reachability tracing, and scoring
Install / Use
/learn @splintersfury/AutoPiffREADME
AutoPiff
Automated Patch Intelligence and Finding Framework
A semantic analysis engine for detecting vulnerability fixes in Windows kernel driver patches. AutoPiff uses conservative YAML rules to identify security-relevant code changes with high precision and explainability.
Overview
AutoPiff analyzes the differences between vulnerable and patched driver versions to automatically detect:
- Use-After-Free fixes (null assignments after
ExFreePool) - Bounds check additions (length validation before
memcpy) - User/kernel boundary hardening (
ProbeForRead/ProbeForWrite) - Integer overflow protections (safe math helpers)
- State hardening (interlocked refcounting)
- IOCTL input validation, pool corruption guards, privilege checks, and more
Key Features
- High Precision: Conservative rules minimize false positives
- Explainable: Every finding includes rationale and evidence
- Sink-Aware: Rules consider proximity to dangerous APIs
- Scoring Model: Ranks findings by exploitability and reachability
- Karton Integration: Runs as a distributed service in malware analysis pipelines
Why AutoPiff?
Needle in a Haystack
Vendor releases 500 driver updates/year
├── 490 are feature/performance/cosmetic changes
├── 8 are minor bug fixes
└── 2 are silent security fixes (no CVE assigned)
Without automation: Manually review 500 to find 2
With AutoPiff: Review 10 high-scorers to find 2
Security patches are often released without CVE assignments. Manually reverse engineering every driver update to find the security-relevant ones is not feasible. AutoPiff solves this by automatically surfacing the changes that matter.
What AutoPiff Automates
| Phase | Manual Effort | With AutoPiff | Time Saved | |-------|---------------|---------------|------------| | Version pairing | 5-15 min/driver | Automatic | ~100% | | Decompilation | 2-10 min/binary | Batched, parallel | ~95% | | Function matching | 30-60 min/pair | Instant | ~100% | | Identifying security changes | 2-8 hours/pair | Seconds | ~99% | | Initial triage & ranking | 1-2 hours | Instant | ~100% | | Report generation | 30-60 min | Instant | ~100% |
Total: 4-12 hours per driver pair down to 2-5 minutes
What Still Requires Human Expertise
┌─────────────────────────────────────────────────────────────────┐
│ AUTOMATED by AutoPiff │
│ ├── Find the needle: "This function changed near ExFreePool" │
│ ├── Classify: "Looks like a use-after-free fix" │
│ └── Rank: "Score 5.5 - worth investigating" │
├─────────────────────────────────────────────────────────────────┤
│ STILL MANUAL (Your expertise) │
│ ├── Confirm exploitability: "Can I actually trigger this?" │
│ ├── Root cause analysis: "Why was this vulnerable?" │
│ ├── Exploit development: "How do I reach this sink?" │
│ └── Impact assessment: "What's the real-world risk?" │
└─────────────────────────────────────────────────────────────────┘
AutoPiff doesn't replace exploitation research. It makes it feasible at scale by automating the reconnaissance phase.
Use Cases
1. Silent Patch Detection
- Monitor drivers for security fixes released without CVEs
- Get alerts when high-scoring semantic deltas appear
- Catch vulnerabilities before they're publicly disclosed
2. 1-Day Vulnerability Research
- When a CVE is announced, quickly identify the exact patch
- Correlate patch patterns with vulnerability classes
- Accelerate exploit development timelines
3. Vendor Security Auditing
- Analyze all versions of a driver family over time
- Generate timelines showing when fixes appeared
- Identify patterns in how vendors address vulnerabilities
4. Historical CVE Corpus Building
- Process known CVE driver pairs to build training data
- Validate and improve detection rules
- Create a knowledge base of patch signatures
Architecture
AutoPiff runs as a Karton pipeline with 8 sequential stages plus a parallel DriverAtlas triage branch. Each stage is an independent microservice communicating through Redis/RabbitMQ.
graph LR
sources["WinBIndex<br/>VirusTotal"]:::src --> s0["Stage 0<br/>Monitor"]
s0 --> s14["Stages 1-4<br/>Patch Differ"]
s0 --> triage["DriverAtlas<br/>Triage"]:::triage
s14 --> s5["Stage 5<br/>Reachability"]
s5 --> s6["Stage 6<br/>Ranking"]
s6 --> s7["Stage 7<br/>Report"]
s6 --> s8["Stage 8<br/>Alerter"]
triage --> alerts["MWDB Tags<br/>+ Alerts"]:::triage
classDef src fill:#1a1a2e,stroke:#e94560,color:#eee
classDef triage fill:#1a1a2e,stroke:#e9a345,color:#eee
classDef default fill:#16213e,stroke:#0f3460,color:#eee
| Stage | Service | What it does |
|-------|---------|-------------|
| 0 | driver-monitor | Polls WinBIndex and VirusTotal for new driver versions, uploads to MWDB |
| 1-4 | karton-patch-differ | Version pairing, Ghidra decompilation, function matching, semantic rule evaluation |
| 5 | karton-reachability | Ghidra call-graph BFS from IOCTL/IRP entry points to changed functions, full decompilation export |
| 6 | karton-ranking | Scores findings using reachability, semantic severity, and attack surface |
| 7 | karton-report | Generates structured markdown reports, uploads to MWDB |
| 8 | autopiff-alerter | Sends Telegram alerts for findings scoring >= 8.0 |
| — | autopiff-driver-triage | DriverAtlas attack surface scoring (parallel to 1-4), tags MWDB samples, Telegram alerts |
Semantic Rules
AutoPiff includes 58 rules across 22 categories. See Docs/semantic_rules.md for the full specification and Docs/SEMANTIC_RULES_REFERENCE.md for the technical reference.
| Category | Example Detection |
|----------|-------------------|
| bounds_check | Added length check before memcpy |
| lifetime_fix | Null assignment after ExFreePool |
| user_boundary_check | Added ProbeForRead/ProbeForWrite |
| int_overflow | Safe math helper usage |
| state_hardening | Interlocked refcount operations |
| ioctl_input_validation | New size/type checks in dispatch handlers |
| pool_type_hardening | Migration to NonPagedPoolNx |
| privilege_check | Added SeSinglePrivilegeCheck |
Sink Groups
The rule engine tracks 50+ dangerous API symbols across 8 sink groups:
memory_copy: RtlCopyMemory, memcpy, memmovepool_alloc: ExAllocatePool, ExAllocatePoolWithTagpool_free: ExFreePool, ExFreePoolWithTaguser_probe: ProbeForRead, ProbeForWriteio_sanitization: RtlULongAdd, RtlSizeTMultexceptions: __try, __exceptstring_copy: strcpy, wcsncpyrefcounting: InterlockedIncrement/Decrement
Scoring Model
Findings are scored using a configurable model (rules/scoring.yaml):
final_score = semantic_score + reachability_bonus + sink_bonus - penalties
Score Components:
- Semantic Score: Rule weight x confidence x category multiplier
- Reachability Bonus: IOCTL (+4.0), IRP (+2.5), PnP (+2.0), Internal (+0.5)
- Sink Bonus: memory_copy (+1.5), user_probe (+1.5), pool_alloc (+1.2)
- Penalties: Low matching quality, high noise risk
Gating:
- Findings with confidence < 0.45 are dropped
- Matching confidence < 0.40 caps score at 3.0
Installation
As Karton Service (Recommended)
git clone https://github.com/splintersfury/AutoPiff.git
cd AutoPiff
docker compose up -d
For the full production stack with MWDB, dashboards, and monitoring, see driver_analyzer.
Standalone Library
pip install pyyaml
from services.karton_patch_differ.rule_engine import SemanticRuleEngine
engine = SemanticRuleEngine('rules/semantic_rules.yaml', 'rules/sinks.yaml')
hits = engine.evaluate(func_name, old_code, new_code, diff_lines)
Configuration
Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| MWDB_API_URL | MWDB Core API endpoint | http://mwdb-core:8080/api/ |
| MWDB_API_KEY | MWDB API key for uploads | (required) |
| KARTON_REDIS_HOST | Redis host for Karton | karton-redis |
| AUTOPIFF_GHIDRA_TIMEOUT | Ghidra decompilation timeout (sec) | 900 |
| VT_API_KEY | VirusTotal API key for driver monitoring | (optional) |
| TELEGRAM_BOT_TOKEN | Telegram bot token for alerts | (optional) |
| TELEGRAM_CHAT_ID | Telegram chat for alerts | (optional) |
| AUTOPIFF_SCORE_THRESHOLD | Minimum score for Telegram alerts | 8.0 |
| DRIVERATLAS_SCORE_THRESHOLD | Minimum attack surface score for triage alerts | 8.0 |
Rule Customization
Edit rules/semantic_rules.yaml to add or modify rules:
rules:
- rule_id: my_custom_rule
category: bounds_check
confidence: 0.85
required_signals:
- sink_group: memory_copy
- change_type: guard_added
- guard_kind: length_check
plain_english_summary: Added length validation before memory copy.
Output Format
AutoPiff produces JSON reports attached to MWDB samples:
{
"pairing": {
"driver_new": {"sha256": "...", "version": "2.0.9.0"},
"driver_old": {"sha256": "...", "version": "2.0.8.0"},
"decision": "accept",
"confidence": 0.95
},
"semantic_deltas": {
"deltas": [
{
"function": "HandleIoctl",
"rule_id": "null_after_free_added",
"category": "lifetime_fix",
"confidence": 0.88,
"sinks": ["pool_free"],
"final_score": 5.5,
"why_matters": "Pointer is now set to NULL after freeing memor
