Aiwaf

An Adaptive AI‑Powered Web Application Firewall for Django. Detects anomalies, blocks suspicious IPs, prevents UUID tampering, stops honeypot field exploits, and continuously improves via daily log-based retraining.

AI‑WAF

A self‑learning, Django‑friendly Web Application Firewall
with enhanced context-aware protection, rate‑limiting, anomaly detection, honeypots, UUID‑tamper protection, smart keyword learning, file‑extension probing detection, exempt path awareness, and daily retraining.

🆕 Latest Enhancements:

✅ Smart Keyword Filtering - Prevents blocking legitimate pages like /profile/
✅ Granular Reset Commands - Clear specific data types (--blacklist, --keywords, --exemptions)
✅ Context-Aware Learning - Only learns from suspicious requests, not legitimate site functionality
✅ Enhanced Configuration - AIWAF_ALLOWED_PATH_KEYWORDS and AIWAF_EXEMPT_KEYWORDS
✅ Comprehensive HTTP Method Validation - Blocks GET→POST-only, POST→GET-only, unsupported REST methods
✅ Enhanced Honeypot Protection - POST validation & 4-minute page timeout with smart reload detection
✅ HTTP Header Validation - Comprehensive bot detection via header analysis and quality scoring

🚀 Quick Installation

pip install aiwaf

⚠️ Important: Add 'aiwaf' to your Django INSTALLED_APPS to avoid setup errors.
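
A minimal settings.py sketch of that step. The other entries are standard Django contrib apps shown only for context; keep whatever your project already lists.

```python
# settings.py (fragment)
INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    # ... your other apps ...
    "aiwaf",  # makes AI-WAF's models and management commands available
]
```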

📋 Complete Setup Guide: See INSTALLATION.md for detailed installation instructions and troubleshooting.

System Requirements

No GPU is needed: AI-WAF runs entirely on CPU. Small sites need only Python 3.8+, Django 3.2+, a single vCPU, and about 512 MB of RAM. For moderate production traffic, plan for 2–4 vCPUs and 2–4 GB of RAM, offload the daily detect-and-train job to a worker, and rotate logs to keep memory use bounded.

📁 Package Structure

aiwaf/
├── __init__.py
├── blacklist_manager.py
├── middleware.py
├── trainer.py                   # exposes train()
├── utils.py
├── template_tags/
│   └── aiwaf_tags.py
├── resources/
│   ├── model.pkl                # pre‑trained base model
│   └── dynamic_keywords.json    # evolves daily
├── management/
│   └── commands/
│       ├── detect_and_train.py      # `python manage.py detect_and_train`
│       ├── add_ipexemption.py       # `python manage.py add_ipexemption`
│       ├── aiwaf_reset.py           # `python manage.py aiwaf_reset`
│       └── aiwaf_logging.py         # `python manage.py aiwaf_logging`
└── LICENSE

🚀 Features

IP Blocklist
Instantly blocks suspicious IPs using Django models with real-time performance.
Rate Limiting
Sliding‑window blocks flooders (> AIWAF_RATE_MAX per AIWAF_RATE_WINDOW), then blacklists them.
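
The sliding-window idea can be sketched in plain Python. This is an illustration under stated assumptions, not AI-WAF's actual middleware: `RATE_MAX` and `RATE_WINDOW` stand in for the AIWAF_RATE_MAX and AIWAF_RATE_WINDOW settings, and the in-memory dictionaries replace whatever storage the package really uses.

```python
import time
from collections import defaultdict, deque

# Hypothetical stand-ins for the AIWAF_RATE_MAX / AIWAF_RATE_WINDOW settings.
RATE_MAX = 3       # requests allowed per window
RATE_WINDOW = 10   # window length in seconds

_hits = defaultdict(deque)   # ip -> timestamps of recent requests
_blacklist = set()           # IPs that exceeded the limit


def allow_request(ip, now=None):
    """Return True if the request may proceed, False if the IP is blocked."""
    now = time.monotonic() if now is None else now
    if ip in _blacklist:
        return False
    window = _hits[ip]
    # Drop timestamps that have slid out of the window.
    while window and now - window[0] >= RATE_WINDOW:
        window.popleft()
    window.append(now)
    if len(window) > RATE_MAX:
        _blacklist.add(ip)   # flooders are blacklisted, not just throttled
        return False
    return True
```

Passing `now` explicitly keeps the function easy to test; in a real middleware the timestamp would come from the request cycle.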
GeoIP Support
AIWAF supports optional geo-blocking and country-level traffic statistics using a local GeoIP database.
AI Anomaly Detection
IsolationForest trained on:
- Path length
- Keyword hits (static + dynamic)
- Response time
- Status‑code index
- Burst count
- Total 404s
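
To make the six IsolationForest features concrete, here is a hedged sketch of how one log record might be turned into a feature row. The `LogRecord` type, the `STATIC_KEYWORDS` tuple, and the `STATUS_CODES` ordering are all assumptions made for illustration; the README does not specify them.

```python
from dataclasses import dataclass

# Illustrative values only; the real keyword list and status ordering
# are not given in this README.
STATIC_KEYWORDS = (".php", "wp-", ".env", "shell")
STATUS_CODES = ["200", "403", "404", "500"]


@dataclass
class LogRecord:
    ip: str
    path: str
    status: str
    response_time: float


def extract_features(record, burst_count, total_404, dynamic_keywords=()):
    """Build one feature row in the order listed above."""
    path = record.path.lower()
    keywords = tuple(STATIC_KEYWORDS) + tuple(dynamic_keywords)
    keyword_hits = sum(1 for kw in keywords if kw in path)
    # Unknown status codes map to -1 rather than raising.
    status_idx = (
        STATUS_CODES.index(record.status) if record.status in STATUS_CODES else -1
    )
    return [
        len(record.path),      # path length
        keyword_hits,          # keyword hits (static + dynamic)
        record.response_time,  # response time
        status_idx,            # status-code index
        burst_count,           # burst count
        total_404,             # total 404s
    ]
```

Rows like these would then be stacked into a matrix and passed to scikit-learn's `IsolationForest` for fitting; that step is omitted here to keep the sketch dependency-free.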
Enhanced Dynamic Keyword Learning with Django Route Protection
- Smart Context-Aware Learning: Only learns keywords from suspicious requests on non-existent paths
- Automatic Django Route Extraction: Automatically excludes keywords from:
  - Valid Django URL patterns (/profile/, /admin/, /api/, etc.)
  - Django app names and model names (users, posts, categories)
  - View function names and URL namespaces
- Unified Logic: Both trainer and middleware use identical legitimate keyword detection
- Configuration Options:
  - AIWAF_ALLOWED_PATH_KEYWORDS - Explicitly allow certain keywords in legitimate paths
  - AIWAF_EXEMPT_KEYWORDS - Keywords that should never trigger blocking
- Automatic Cleanup: Keywords from AIWAF_EXEMPT_PATHS are automatically removed from the database
- False Positive Prevention: Stops learning legitimate site functionality as "malicious"
- Inherent Malicious Detection: Middleware also blocks obviously malicious keywords (hack, exploit, attack) even if not yet learned
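
The context-aware learning rule above can be sketched as follows. This is a simplified illustration, not the package's trainer: the `LEGITIMATE` set stands in for keywords extracted from Django routes, and the segment-length and `min_count` thresholds are assumptions.

```python
import re
from collections import Counter

# Hypothetical stand-in for keywords extracted from Django URL patterns,
# app names, and model names.
LEGITIMATE = {"profile", "admin", "api", "users", "posts"}


def learn_keywords(requests, known_paths, exempt_keywords=(), min_count=2):
    """Learn candidate keywords from (path, status) pairs.

    Only 404s on paths that do not exist are treated as suspicious, so
    legitimate site functionality is never learned as malicious.
    """
    excluded = LEGITIMATE | set(exempt_keywords)
    counts = Counter()
    for path, status in requests:
        if status != 404 or path in known_paths:
            continue  # legitimate or successful request: learn nothing
        for segment in re.split(r"\W+", path.lower()):
            # Assumed filter: ignore very short segments and excluded words.
            if len(segment) > 3 and segment not in excluded:
                counts[segment] += 1
    return sorted(kw for kw, n in counts.items() if n >= min_count)
```

For example, two 404 probes for `/wp-admin/setup.php` and `/users/setup` would yield `setup` as a learned keyword, while `admin` and `users` are skipped because they match legitimate routes.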
File‑Extension Probing Detection
Tracks repeated 404s on common extensions (e.g. .php, .asp) and blocks IPs.
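
A rough sketch of that tracking logic, assuming an in-memory counter. The extension set beyond .php and .asp and the `MAX_PROBES` threshold are hypothetical; the README does not state the real values.

```python
import posixpath
from collections import defaultdict

# .php and .asp come from the description above; the rest are assumed.
PROBED_EXTENSIONS = {".php", ".asp", ".aspx", ".jsp"}
MAX_PROBES = 3  # hypothetical threshold

_probe_counts = defaultdict(int)  # ip -> number of 404s on probed extensions


def record_response(ip, path, status):
    """Record one response; return True once the IP should be blocked."""
    extension = posixpath.splitext(path)[1].lower()
    if status == 404 and extension in PROBED_EXTENSIONS:
        _probe_counts[ip] += 1
    return _probe_counts[ip] >= MAX_PROBES
```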
🆕 HTTP Header Validation
Advanced header analysis to detect bots and malicious requests:
- Missing Required Headers - Blocks requests without User-Agent or Accept headers
- Suspicious User-Agents - Detects curl, wget, python-requests, automated tools
- Header Quality Scoring - Calculates realism score based on browser-standard headers
- Legitimate Bot Whitelist - Allows Googlebot, Bingbot, and other search engines
- Header Combination Analysis - Detects impossible combinations (HTTP/2 + old browsers)
- Static File Exemption - Skips validation for CSS, JS, images

🛡️ Header Validation Middleware Features

The HeaderValidationMiddleware provides advanced bot detection through HTTP header analysis:

What it detects:

- Missing Headers: Requests without standard browser headers
- Suspicious User-Agents: WordPress scanners, exploit tools, basic scrapers
- Bot-like Patterns: Low header diversity, missing Accept headers
- Quality Scoring: 0-11 point system based on header completeness

What it allows:

- Legitimate Browsers: Chrome, Firefox, Safari, Edge with full headers
- Search Engine Bots: Google, Bing, DuckDuckGo, Yandex crawlers
- API Clients: Properly identified with good headers
- Static Files: CSS, JS, images (automatically exempted)

Real-world effectiveness:

✅ Blocks: WordPress scanners, exploit bots, basic scrapers
✅ Allows: Real browsers, legitimate bots, API clients
✅ Quality Score: 10/11 = Legitimate, 2/11 = Suspicious bot
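
As an illustration of how a 0-11 header quality score could work, here is a small sketch. The weights, the `min_score` threshold, and the bot and tool lists are invented for this example so that the total comes to 11; they are not the values HeaderValidationMiddleware actually uses.

```python
# Hypothetical weights chosen to sum to 11, matching the scale described above.
HEADER_WEIGHTS = {
    "user-agent": 2,
    "accept": 2,
    "accept-language": 2,
    "accept-encoding": 2,
    "connection": 1,
    "cache-control": 1,
    "referer": 1,
}
SUSPICIOUS_AGENTS = ("curl", "wget", "python-requests")
LEGITIMATE_BOTS = ("googlebot", "bingbot")


def _normalize(headers):
    """Lower-case header names and drop empty values."""
    return {name.lower(): value for name, value in headers.items() if value}


def header_quality(headers):
    """Score 0-11 based on which browser-standard headers are present."""
    present = _normalize(headers)
    return sum(weight for name, weight in HEADER_WEIGHTS.items() if name in present)


def classify(headers, min_score=5):
    """Return 'allow' or 'block' for a request's headers."""
    present = _normalize(headers)
    agent = present.get("user-agent", "").lower()
    if any(bot in agent for bot in LEGITIMATE_BOTS):
        return "allow"   # whitelisted search-engine crawler
    if not agent or "accept" not in present:
        return "block"   # missing required headers
    if any(tool in agent for tool in SUSPICIOUS_AGENTS):
        return "block"   # known automated tool
    return "allow" if header_quality(headers) >= min_score else "block"
```

With these assumed weights, a bare `curl` request (User-Agent and Accept only) scores 4 and is blocked, while a browser sending six of the seven headers scores 10.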

Testing header validation:

# Test with curl (will be blocked - low quality headers)
curl http://yoursite.com/

# Test with browser (will be allowed - high quality headers)
# Visit site normally in Chrome/Firefox

# Check logs for header validation blocks
python manage.py aiwaf_logging --recent

Enhanced Timing-Based Honeypot
Advanced GET→POST timing analysis with comprehensive HTTP method validation:
- Blocks forms submitted faster than AIWAF_MIN_FORM_TIME seconds after the page was loaded (default: 1 second)
- 🆕 Smart HTTP Method Validation - Comprehensive protection against method misuse:
  - Blocks GET requests to POST-only views (form endpoints, API creates)
  - Blocks POST requests to GET-only views (list pages, read-only content)
  - Blocks unsupported REST methods (PUT/DELETE to non-REST views)
  - Uses Django view analysis: class-based views, method handlers, URL patterns
- 🆕 Page expiration after AIWAF_MAX_PAGE_TIME (4 minutes) with smart reload detection
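
The GET→POST timing check can be sketched like this. The two constants mirror the documented defaults for AIWAF_MIN_FORM_TIME and AIWAF_MAX_PAGE_TIME; the return values and the decision to block a POST with no recorded GET are assumptions made for illustration.

```python
MIN_FORM_TIME = 1.0    # seconds; documented default of AIWAF_MIN_FORM_TIME
MAX_PAGE_TIME = 240.0  # seconds; the 4-minute AIWAF_MAX_PAGE_TIME


def check_form_post(get_time, post_time):
    """Decide what to do with a POST given when the form page was fetched.

    get_time is the timestamp of the earlier GET (or None if there was none);
    post_time is the timestamp of the POST.
    """
    if get_time is None:
        return "block"   # assumed: POST with no prior page view
    elapsed = post_time - get_time
    if elapsed < MIN_FORM_TIME:
        return "block"   # submitted faster than a human plausibly could
    if elapsed > MAX_PAGE_TIME:
        return "reload"  # page expired; assumed to ask for a fresh page
    return "allow"
```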
UUID Tampering Protection
Blocks guessed or invalid UUIDs that don't resolve to real models.
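
A minimal sketch of the two-step check implied here: first reject strings that are not well-formed UUIDs, then reject well-formed UUIDs that match no record. The `known_ids` set is a hypothetical stand-in for a database lookup against real models.

```python
import uuid


def resolve_uuid(raw, known_ids):
    """Return the parsed UUID if it is valid and known, else None."""
    try:
        value = uuid.UUID(raw)
    except (ValueError, AttributeError, TypeError):
        return None  # malformed: not a UUID at all
    # Well-formed but unknown UUIDs are treated as guesses.
    return value if value in known_ids else None
```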
Built-in Request Logger
Optional middleware logger that captures requests to Django models:
- Automatic fallback when main access logs unavailable
- Real-time storage in database for instant access
- Captures response times for better anomaly detection
- Zero configuration - works out of the box
Blocked Request Debug Logging
Optional debug logs that explain why a request was blocked:
- Reason included (keyword, flood pattern, AI anomaly, header validation, etc.)
- Request context (IP, method, path, user agent)
- Disabled by default - enable via Django LOGGING
Example settings.py:
```
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        "aiwaf.middleware": {"handlers": ["console"], "level": "DEBUG"},
    },
}
```
Blocked Request Responses
By default, AI‑WAF raises PermissionDenied("blocked") when a request is blocked, allowing Django to render a standard 403 page. For API clients that need JSON, add JsonExceptionMiddleware near the top of your MIDDLEWARE list; it will translate PermissionDenied into a JSON 403 response when request.content_type == "application/json".
Smart Training System
AI trainer automatically uses the best available data source:
- Primary: Configured access log files (AIWAF_ACCESS_LOG)
- Fallback: Database RequestLog model when files unavailable
- Seamless switching between data sources
- Enhanced compatibility with exemption system
- Minimum log thresholds: AI training requires at least AIWAF_MIN_AI_LOGS log entries (default 10,000). With fewer, training falls back to keyword-only mode, which still requires at least AIWAF_MIN_TRAIN_LOGS entries (default 50)
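
The threshold logic can be summarized in a few lines. The constants use the documented defaults; the function name and the returned labels are illustrative, not part of AI-WAF's API.

```python
MIN_AI_LOGS = 10_000   # documented default of AIWAF_MIN_AI_LOGS
MIN_TRAIN_LOGS = 50    # documented default of AIWAF_MIN_TRAIN_LOGS


def choose_training_mode(log_count):
    """Pick a training mode from the number of available log entries."""
    if log_count >= MIN_AI_LOGS:
        return "ai"            # enough data for the anomaly model
    if log_count >= MIN_TRAIN_LOGS:
        return "keyword-only"  # too little for AI, enough for keywords
    return "skip"              # assumed label: not enough data to train
```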

Exempt Path & IP Awareness

Exempt Paths: AI‑WAF automatically exempts common login paths (/admin/, /login/, /accounts/login/, etc.) from all blocking mechanisms. You can add additional exempt paths in your Django settings.py:

AIWAF_EXEMPT_PATHS = [
    "/api/webhooks/",
    "/health/",
    "/special-endpoint/",
]
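
How an exempt-path check might behave is sketched below. It assumes prefix matching on whole path segments, which the README does not confirm; `DEFAULT_EXEMPT` lists only the login paths named above, and `is_exempt` is a hypothetical helper.

```python
# Login paths the README says are exempted automatically.
DEFAULT_EXEMPT = ["/admin/", "/login/", "/accounts/login/"]
# Example project setting from above.
AIWAF_EXEMPT_PATHS = ["/api/webhooks/", "/health/", "/special-endpoint/"]


def is_exempt(path, exempt_paths=None):
    """Return True if path falls under any exempt prefix (assumed semantics)."""
    if exempt_paths is None:
        exempt_paths = DEFAULT_EXEMPT + AIWAF_EXEMPT_PATHS
    # Normalize so "/health" and "/health/" are treated the same.
    normalized = path.lower().rstrip("/") + "/"
    return any(
        normalized.startswith(prefix.lower().rstrip("/") + "/")
        for prefix in exempt_paths
    )
```

Matching on a trailing slash keeps `/healthcheck/` from being exempted just because `/health/` is.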

You can also store exempt paths in the database (no deploy needed):

python manage.py aiwaf_pathshell

Or add directly:

python manage.py add_pathexemption /myapp/api/ --reason "API traffic"

AIWAF Path Shell Commands:

ls                     # list routes at current level
cd <index|name>        # enter a route segment
up / cd ..             # go up one level
pwd                    # show current path prefix
exempt <index|name|.>  # add exemption for selection or current path
exit                   # quit
