
Aiwaf

An Adaptive AI‑Powered Web Application Firewall for Django. Detects anomalies, blocks suspicious IPs, prevents UUID tampering, stops honeypot field exploits, and continuously improves via daily log-based retraining.


AI‑WAF

A self‑learning, Django‑friendly Web Application Firewall
with enhanced context-aware protection, rate‑limiting, anomaly detection, honeypots, UUID‑tamper protection, smart keyword learning, file‑extension probing detection, exempt path awareness, and daily retraining.

🆕 Latest Enhancements:

  • Smart Keyword Filtering - Prevents blocking legitimate pages like /profile/
  • Granular Reset Commands - Clear specific data types (--blacklist, --keywords, --exemptions)
  • Context-Aware Learning - Only learns from suspicious requests, not legitimate site functionality
  • Enhanced Configuration - AIWAF_ALLOWED_PATH_KEYWORDS and AIWAF_EXEMPT_KEYWORDS
  • Comprehensive HTTP Method Validation - Blocks GET requests to POST-only views, POST requests to GET-only views, and unsupported REST methods
  • Enhanced Honeypot Protection - POST validation & 4-minute page timeout with smart reload detection
  • HTTP Header Validation - Comprehensive bot detection via header analysis and quality scoring

🚀 Quick Installation

pip install aiwaf

⚠️ Important: Add 'aiwaf' to your Django INSTALLED_APPS to avoid setup errors.
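
A minimal settings.py fragment for the step above. Only the 'aiwaf' entry comes from this README; the other app entries are placeholders for whatever your project already lists.

```python
# settings.py (fragment)
INSTALLED_APPS = [
    "django.contrib.auth",          # placeholder: your existing apps
    "django.contrib.contenttypes",  # placeholder: your existing apps
    "aiwaf",  # lets Django discover AI-WAF's models and management commands
]
```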

📋 Complete Setup Guide: See INSTALLATION.md for detailed installation instructions and troubleshooting.


System Requirements

No GPU is needed: AI-WAF runs entirely on CPU. Small sites need only Python 3.8+, Django 3.2+, a single vCPU, and about 512 MB of RAM. For moderate production traffic, scale to 2–4 vCPUs and 2–4 GB of RAM, offload the daily detect-and-train job to a worker, and rotate logs to keep memory use bounded.

📁 Package Structure

aiwaf/
├── __init__.py
├── blacklist_manager.py
├── middleware.py
├── trainer.py                   # exposes train()
├── utils.py
├── template_tags/
│   └── aiwaf_tags.py
├── resources/
│   ├── model.pkl                # pre‑trained base model
│   └── dynamic_keywords.json    # evolves daily
├── management/
│   └── commands/
│       ├── detect_and_train.py      # `python manage.py detect_and_train`
│       ├── add_ipexemption.py       # `python manage.py add_ipexemption`
│       ├── aiwaf_reset.py           # `python manage.py aiwaf_reset`
│       └── aiwaf_logging.py         # `python manage.py aiwaf_logging`
└── LICENSE

🚀 Features

  • IP Blocklist
    Instantly blocks suspicious IPs using Django models with real-time performance.

  • Rate Limiting
    Sliding‑window blocks flooders (> AIWAF_RATE_MAX per AIWAF_RATE_WINDOW), then blacklists them.

  • AI Anomaly Detection
    IsolationForest trained on:

    • Path length
    • Keyword hits (static + dynamic)
    • Response time
    • Status‑code index
    • Burst count
    • Total 404s
  • GeoIP Support
    AIWAF supports optional geo-blocking and country-level traffic statistics using a local GeoIP database.

  • Enhanced Dynamic Keyword Learning with Django Route Protection

    • Smart Context-Aware Learning: Only learns keywords from suspicious requests on non-existent paths
    • Automatic Django Route Extraction: Automatically excludes keywords from:
      • Valid Django URL patterns (/profile/, /admin/, /api/, etc.)
      • Django app names and model names (users, posts, categories)
      • View function names and URL namespaces
    • Unified Logic: Both trainer and middleware use identical legitimate keyword detection
    • Configuration Options:
      • AIWAF_ALLOWED_PATH_KEYWORDS - Explicitly allow certain keywords in legitimate paths
      • AIWAF_EXEMPT_KEYWORDS - Keywords that should never trigger blocking
    • Automatic Cleanup: Keywords from AIWAF_EXEMPT_PATHS are automatically removed from the database
    • False Positive Prevention: Stops learning legitimate site functionality as "malicious"
    • Inherent Malicious Detection: Middleware also blocks obviously malicious keywords (hack, exploit, attack) even if not yet learned
  • File‑Extension Probing Detection
    Tracks repeated 404s on common extensions (e.g. .php, .asp) and blocks IPs.

  • 🆕 HTTP Header Validation
    Advanced header analysis to detect bots and malicious requests:

    • Missing Required Headers - Blocks requests without User-Agent or Accept headers
    • Suspicious User-Agents - Detects curl, wget, python-requests, automated tools
    • Header Quality Scoring - Calculates realism score based on browser-standard headers
    • Legitimate Bot Whitelist - Allows Googlebot, Bingbot, and other search engines
    • Header Combination Analysis - Detects impossible combinations (HTTP/2 + old browsers)
    • Static File Exemption - Skips validation for CSS, JS, images

🛡️ Header Validation Middleware Features

The HeaderValidationMiddleware provides advanced bot detection through HTTP header analysis:

What it detects:

  • Missing Headers: Requests without standard browser headers
  • Suspicious User-Agents: WordPress scanners, exploit tools, basic scrapers
  • Bot-like Patterns: Low header diversity, missing Accept headers
  • Quality Scoring: 0-11 point system based on header completeness

What it allows:

  • Legitimate Browsers: Chrome, Firefox, Safari, Edge with full headers
  • Search Engine Bots: Google, Bing, DuckDuckGo, Yandex crawlers
  • API Clients: Properly identified with good headers
  • Static Files: CSS, JS, images (automatically exempted)

Real-world effectiveness:

✅ Blocks: WordPress scanners, exploit bots, basic scrapers
✅ Allows: Real browsers, legitimate bots, API clients
✅ Quality Score: 10/11 = Legitimate, 2/11 = Suspicious bot
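
To make the 0-11 quality score concrete, here is a rough sketch of header-completeness scoring. Only the 0-11 range comes from this README; the header names chosen, their weights, and the function name header_quality are invented for illustration and will not match AI-WAF's real scoring.

```python
# Hypothetical weights that happen to sum to 11; AI-WAF's real table may differ.
HEADER_POINTS = {
    "user-agent": 2,
    "accept": 2,
    "accept-language": 2,
    "accept-encoding": 2,
    "connection": 1,
    "cache-control": 1,
    "upgrade-insecure-requests": 1,
}


def header_quality(headers):
    """Sum the points of every scored header that is present and non-empty."""
    present = {name.lower() for name, value in headers.items() if value}
    return sum(points for name, points in HEADER_POINTS.items() if name in present)


browser_like = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "text/html",
    "Accept-Language": "en-US",
    "Accept-Encoding": "gzip",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
}
bare_client = {"User-Agent": "curl/8.0", "Accept": "*/*"}
```

Under these made-up weights the browser-like request scores far higher than the bare client, which is the kind of gap the middleware relies on.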

Testing header validation:

# Test with curl (will be blocked - low quality headers)
curl http://yoursite.com/

# Test with browser (will be allowed - high quality headers)
# Visit site normally in Chrome/Firefox

# Check logs for header validation blocks
python manage.py aiwaf_logging --recent
  • Enhanced Timing-Based Honeypot
    Advanced GET→POST timing analysis with comprehensive HTTP method validation:

    • Blocks forms submitted faster than AIWAF_MIN_FORM_TIME seconds (default: 1 second)
    • 🆕 Smart HTTP Method Validation - Comprehensive protection against method misuse:
      • Blocks GET requests to POST-only views (form endpoints, API creates)
      • Blocks POST requests to GET-only views (list pages, read-only content)
      • Blocks unsupported REST methods (PUT/DELETE to non-REST views)
      • Uses Django view analysis: class-based views, method handlers, URL patterns
    • 🆕 Page expiration after AIWAF_MAX_PAGE_TIME (4 minutes) with smart reload detection
  • UUID Tampering Protection
    Blocks guessed or invalid UUIDs that don't resolve to real models.

  • Built-in Request Logger
    Optional middleware logger that captures requests to Django models:

    • Automatic fallback when main access logs unavailable
    • Real-time storage in database for instant access
    • Captures response times for better anomaly detection
    • Zero configuration - works out of the box
  • Blocked Request Debug Logging
    Optional debug logs that explain why a request was blocked:

    • Reason included (keyword, flood pattern, AI anomaly, header validation, etc.)
    • Request context (IP, method, path, user agent)
    • Disabled by default - enable via Django LOGGING

    Example settings.py:

    LOGGING = {
        "version": 1,
        "disable_existing_loggers": False,
        "handlers": {
            "console": {"class": "logging.StreamHandler"},
        },
        "loggers": {
            "aiwaf.middleware": {"handlers": ["console"], "level": "DEBUG"},
        },
    }
    
  • Blocked Request Responses
    By default, AI‑WAF raises PermissionDenied("blocked") when a request is blocked, allowing Django to render a standard 403 page. For API clients that need JSON, add JsonExceptionMiddleware near the top of your MIDDLEWARE list; it translates PermissionDenied into a JSON 403 response when request.content_type == "application/json".

  • Smart Training System
    AI trainer automatically uses the best available data source:

    • Primary: Configured access log files (AIWAF_ACCESS_LOG)
    • Fallback: Database RequestLog model when files unavailable
    • Seamless switching between data sources
    • Enhanced compatibility with exemption system
    • Minimum log thresholds: AI training requires at least AIWAF_MIN_AI_LOGS log entries (default 10,000); with fewer, training falls back to keyword-only mode, which still requires AIWAF_MIN_TRAIN_LOGS entries (default 50)
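
The two log thresholds above amount to a three-way decision. The sketch below shows that logic under stated assumptions: the function name choose_training_mode and the returned labels are hypothetical, and treating each threshold as inclusive is a guess, since the README does not say how the boundary is handled.

```python
MIN_AI_LOGS = 10_000   # AIWAF_MIN_AI_LOGS default stated in this README
MIN_TRAIN_LOGS = 50    # AIWAF_MIN_TRAIN_LOGS default stated in this README


def choose_training_mode(log_count, min_ai=MIN_AI_LOGS, min_train=MIN_TRAIN_LOGS):
    """Pick a training mode from the number of available log entries."""
    if log_count >= min_ai:
        return "ai"            # enough data to train the anomaly model
    if log_count >= min_train:
        return "keyword-only"  # too few logs for AI, enough for keyword learning
    return "skip"              # too few logs for any training


for count in (25, 500, 20_000):
    print(count, choose_training_mode(count))
```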

Exempt Path & IP Awareness

Exempt Paths: AI‑WAF automatically exempts common login paths (/admin/, /login/, /accounts/login/, etc.) from all blocking mechanisms. You can add additional exempt paths in your Django settings.py:

AIWAF_EXEMPT_PATHS = [
    "/api/webhooks/",
    "/health/",
    "/special-endpoint/",
]
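
One plausible way such a list could be applied is prefix matching on the request path, sketched below. This is an assumption about the behaviour, not a description of AI-WAF's implementation; the helper name is_exempt is hypothetical, and the path is expected to arrive without a query string.

```python
AIWAF_EXEMPT_PATHS = [
    "/api/webhooks/",
    "/health/",
    "/special-endpoint/",
]


def is_exempt(path, exempt_paths=AIWAF_EXEMPT_PATHS):
    """Return True if `path` falls under any exempt prefix."""
    # Add a trailing slash so "/health" matches "/health/"
    # while "/healthcheck/" does not.
    normalized = path if path.endswith("/") else path + "/"
    return any(normalized.startswith(prefix) for prefix in exempt_paths)
```

With this matching rule, /api/webhooks/stripe/ is exempt because it sits under /api/webhooks/, while /healthcheck/ is still inspected.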

You can also store exempt paths in the database (no deploy needed):

python manage.py aiwaf_pathshell

Or add directly:

python manage.py add_pathexemption /myapp/api/ --reason "API traffic"

AIWAF Path Shell Commands:

ls                     # list routes at current level
cd <index|name>        # enter a route segment
up / cd ..             # go up one level
pwd                    # show current path prefix
exempt <index|name|.>  # add exemption for selection or current path
exit                   # quit
