Openadserver
Open Source Ad Serving Platform with ML-Powered CTR Prediction | Self-hosted alternative to Google Ad Manager | Python, FastAPI, PyTorch
Install / Use
/learn @seanZhang414/OpenadserverREADME
🤔 Why OpenAdServer?
Most ad servers are either too simple (just serving static banners) or too complex (requiring Google-scale infrastructure).
OpenAdServer is the sweet spot — a production-ready, self-hosted ad platform with machine learning powered CTR prediction, designed for teams who want full control without the complexity.
Comparison
| Feature | OpenAdServer | Google Ad Manager | Revive Adserver | AdButler | |---------|:------------:|:-----------------:|:---------------:|:--------:| | Self-hosted | ✅ | ❌ | ✅ | ❌ | | ML CTR Prediction | ✅ DeepFM/LR | ❌ | ❌ | ❌ | | Real-time eCPM Bidding | ✅ | ✅ | ❌ | ⚠️ | | Modern Tech Stack | ✅ Python/FastAPI | N/A | ❌ PHP | ❌ | | One-click Deploy | ✅ Docker | ❌ | ⚠️ | ❌ | | Free & Open Source | ✅ | ❌ | ✅ | ❌ | | No Revenue Share | ✅ | ❌ 💰 | ✅ | ❌ 💰 |
Perfect For
- 🏢 SMBs building their own ad network
- 🎮 Gaming companies monetizing in-app traffic
- 📱 App developers running house ads or direct deals
- 🛒 E-commerce platforms with sponsored listings
- 🔬 Researchers studying computational advertising
- 🎓 Students learning ad-tech systems
✨ Features
🚀 Ad Serving
- High-Performance API — <10ms P99 latency with FastAPI
- Multiple Ad Formats — Banner, native, video, interstitial
- Smart Targeting — Geo, device, OS, demographics, interests
- Frequency Capping — Daily/hourly limits per user
- Budget Pacing — Smooth delivery throughout the day
🤖 Machine Learning
- CTR Prediction Models — DeepFM, Logistic Regression, FM
- Real-time Inference — <5ms prediction latency
- Automatic Feature Engineering — Sparse/dense feature processing
- Model Hot-swap — Update models without downtime
💰 Monetization
- eCPM Ranking — Maximize revenue automatically
- Multiple Bid Types — CPM, CPC, CPA, oCPM
- Real-time Bidding Ready — OpenRTB compatible (roadmap)
📊 Analytics
- Event Tracking — Impressions, clicks, conversions
- Real-time Dashboards — Grafana integration
- Prometheus Metrics — Full observability
🏗️ Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Ad Request Flow │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 📱 Client │
│ │ │
│ ▼ │
│ ┌──────────┐ ┌───────────┐ ┌──────────┐ ┌─────────┐ │
│ │ FastAPI │───▶│ Retrieval │───▶│ Ranking │───▶│Response │ │
│ │ Router │ │(Targeting)│ │ (eCPM) │ │ │ │
│ └──────────┘ └───────────┘ └──────────┘ └─────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌───────────┐ ┌──────────┐ │
│ │PostgreSQL│ │ Redis │ │ PyTorch │ │
│ │(Campaigns)│ │ (Cache) │ │ (Models) │ │
│ └──────────┘ └───────────┘ └──────────┘ │
│ │
│ Pipeline: Retrieve → Filter → Predict → Rank → Return │
│ │
└─────────────────────────────────────────────────────────────────┘
🚀 Quick Start
Option 1: Docker Compose (Recommended)
# Clone the repository
git clone https://github.com/pysean/openadserver.git
cd openadserver
# Start all services (PostgreSQL, Redis, Ad Server)
docker compose up -d
# Initialize sample data
python scripts/init_test_data.py
# Verify it's running
curl http://localhost:8000/health
# {"status":"healthy","version":"1.0.0"}
Option 2: Local Development
# Prerequisites: Python 3.11+, PostgreSQL 14+, Redis 6+
# Install dependencies
pip install -e ".[dev]"
# Start databases
docker compose up -d postgres redis
# Run the server
OPENADSERVER_ENV=dev python -m openadserver.ad_server.main
📡 Your First Ad Request
curl -X POST http://localhost:8000/api/v1/ad/request \
-H "Content-Type: application/json" \
-d '{
"slot_id": "banner_home",
"user_id": "user_12345",
"device": {"os": "ios", "os_version": "17.0"},
"geo": {"country": "US", "city": "new_york"},
"num_ads": 3
}'
Response:
{
"request_id": "req_a1b2c3d4",
"ads": [
{
"ad_id": "ad_1001_5001",
"campaign_id": 1001,
"creative": {
"title": "Summer Sale - 50% Off!",
"description": "Limited time offer",
"image_url": "https://cdn.example.com/ads/summer-sale.jpg",
"landing_url": "https://shop.example.com/sale"
},
"tracking": {
"impression_url": "http://localhost:8000/api/v1/event/track?type=impression&req=req_a1b2c3d4&ad=1001",
"click_url": "http://localhost:8000/api/v1/event/track?type=click&req=req_a1b2c3d4&ad=1001"
},
"metadata": {
"ecpm": 35.50,
"pctr": 0.0355
}
}
],
"count": 1
}
📖 Documentation
API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/v1/ad/request | POST | Request ads for a placement |
| /api/v1/event/track | GET/POST | Track impression/click/conversion |
| /api/v1/campaign | CRUD | Manage campaigns |
| /api/v1/creative | CRUD | Manage creatives |
| /api/v1/advertiser | CRUD | Manage advertisers |
| /health | GET | Health check |
| /metrics | GET | Prometheus metrics |
Configuration
# configs/production.yaml
server:
host: "0.0.0.0"
port: 8000
workers: 4
database:
host: "postgres"
port: 5432
name: "openadserver"
user: "adserver"
password: "${DB_PASSWORD}"
redis:
host: "redis"
port: 6379
db: 0
ad_serving:
enable_ml_prediction: true
model_path: "models/deepfm_ctr.pt"
default_pctr: 0.01
default_pcvr: 0.001
Train Your Own CTR Model
# Prepare training data from your logs
python scripts/prepare_training_data.py \
--input logs/events/ \
--output data/training/
# Train DeepFM model
python -m openadserver.trainer.train_ctr \
--model deepfm \
--data data/training/train.parquet \
--epochs 10 \
--output models/
# Or train a simpler LR model (faster, good baseline)
python -m openadserver.trainer.train_ctr \
--model lr \
--data data/training/train.parquet \
--output models/
# Evaluate model
python -m openadserver.trainer.evaluate \
--model models/deepfm_ctr.pt \
--data data/training/test.parquet
# AUC: 0.72, LogLoss: 0.45
📊 Benchmarks
Stress Test Results (Simulated 2 vCPU / 6GB)
Full pipeline tested: Retrieval → Filter → Prediction → Ranking → Rerank
Test Environment: SQLite in-memory + FakeRedis (zero external dependencies). Results reflect core pipeline performance without network I/O overhead.
| Model | QPS | Avg Latency | P95 | P99 | Relative | |-------|-----|-------------|-----|-----|----------| | LR | 189.7 | 5.24ms | 7.64ms | 10.02ms | 100% (baseline) | | FM | 166.1 | 5.99ms | 8.10ms | 11.54ms | 87.6% | | DeepFM | 151.2 | 6.58ms | 10.30ms | 14.13ms | 79.7% |
Pipeline Stage Breakdown (LR Model)
┌─────────────────┬───────────┬─────────────┐
│ Stage │ Avg (ms) │ % of Total │
├─────────────────┼───────────┼─────────────┤
│ Retrieval │ 0.97 │ 18.5% │
│ Filter │ 0.20 │ 3.8% │
│ Prediction (ML) │ 3.63 │ 69.3% │ ← Bottleneck
│ Ranking │ 0.35 │ 6.7% │
│ Rerank │ 0.10 │ 1.9% │
└─────────────────┴───────────┴─────────────┘
Capacity Estimation (1M DAU)
| Model | Single Server QPS | Peak QPS Needed | Servers Required | |-------|-------------------|-----------------|------------------| | LR | ~190 | 870 | 5 | | FM | ~166 | 870 | 6 | | DeepFM | ~151 | 870 | 6 |
Note: Calculation assumes 1M DAU × 15 requests/user/day = 15M daily → 174 avg QPS → 870 peak (5x factor). Production deployments with PostgreSQL + Redis may have ~10-20% additional I/O overhead.
🧪 Dataset & Model Training
Criteo Click Logs Dataset
We use the Criteo Display Advertising Challenge dataset for CTR model training and evaluation.
Dataset Characteristics:
- Size: ~45GB (full), 100K samples used for benchmarks
- Features: 13 integer features (I1-I13), 26 categorical features (C1-C26)
- Label: Click (0/1)
- Positive Rate: ~3.4%
Model Comparison (100K Criteo Samples)
| Model | Test AUC | Model Size | De
