PrediChurn

A full ML pipeline for customer churn prediction in telecom, banking, or SaaS. Includes robust data cleaning, automatic feature engineering, model training/tuning (Logistic Regression, RF, XGBoost), interpretability, and interactive dashboards for actionable business retention insights.

PrediChurn 🚦 – End-to-End Customer Churn Prediction Suite

"Transforming churn risk into retention strategies with advanced ML."
🔍 Powered by: XGBoost, Random Forest, Optuna, SHAP
🧑‍💻 Engineered by: vishnupriyanpr

Overview 🚀

PrediChurn is a modular machine learning pipeline for customer churn prediction. Designed for telecom, SaaS, and banking datasets, it automates data wrangling, business-driven feature engineering, model selection, and evaluation, then produces business insights and analytics dashboards. Its outputs help retention teams target customers with ROI-driven strategies.

Key Features 🧠

🔄 Multi-model engine: Logistic Regression, Random Forest, XGBoost—all Optuna-optimized
🛠️ Feature engineering: Tenure, ARPU, contract/payment, and behavior features with full NaN/infinite safety
🔍 Explainable AI: SHAP for both global and local churn driver visualization
📊 Business metrics: Churn rate, “revenue at risk”, “potential revenue saved”, intervention ROI
📑 Automated reporting: Executive summaries, actionable recommendations, and visualization outputs

ML Pipeline Details 🏗️

1. Data Preparation

Loads raw CSV data
Cleans missing values and outliers
Encodes categoricals
Scales numerical data
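
The four preparation steps can be illustrated with a dependency-free sketch. It is not the project's `data_preprocessor.py`: the `prepare` and `to_float` helpers, the inline sample rows, and the choice of median imputation with z-score scaling are all assumptions made for illustration, and outlier handling is left out. The column names follow the Kaggle Telco Churn layout, where `TotalCharges` can be blank.

```python
import csv
import io
import statistics

# Made-up sample rows in the Kaggle Telco Churn column layout.
SAMPLE_CSV = """customerID,tenure,Contract,MonthlyCharges,TotalCharges,Churn
A1,1,Month-to-month,30.0,30.0,Yes
A2,0,Two year,50.0, ,No
A3,24,One year,70.0,1680.0,No
"""


def to_float(text):
    """Parse a numeric cell; blank or malformed cells become None."""
    try:
        return float(text)
    except ValueError:
        return None


def prepare(rows, numeric_cols, categorical_cols, target_col):
    """Impute, scale, and one-hot encode rows; return (features, labels)."""
    scaled = {}
    for col in numeric_cols:
        values = [to_float(r[col]) for r in rows]
        observed = [v for v in values if v is not None]
        median = statistics.median(observed)  # assumed imputation strategy
        filled = [median if v is None else v for v in values]
        mean = statistics.fmean(filled)
        std = statistics.pstdev(filled) or 1.0  # constant column: avoid dividing by 0
        scaled[col] = [(v - mean) / std for v in filled]

    # One indicator column per observed category level.
    levels = {c: sorted({r[c] for r in rows}) for c in categorical_cols}
    features = []
    for i, row in enumerate(rows):
        record = {c: scaled[c][i] for c in numeric_cols}
        for c in categorical_cols:
            for level in levels[c]:
                record[f"{c}_{level}"] = 1.0 if row[c] == level else 0.0
        features.append(record)

    labels = [1 if r[target_col] == "Yes" else 0 for r in rows]
    return features, labels


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
features, labels = prepare(
    rows, ["tenure", "MonthlyCharges", "TotalCharges"], ["Contract"], "Churn"
)
```

In a real run the imputation and scaling statistics would normally be computed on the training split only and then reused for validation and test data, so that no information leaks across splits.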

2. Feature Engineering

Generates >10 additional business-focused features (e.g., avg_charges_per_tenure, high_value_customer)
Handles division-by-zero/NaN/infinite edge cases
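
As a rough illustration of the edge-case handling, here is a minimal sketch of a guarded ratio feature. The README names `avg_charges_per_tenure` and `high_value_customer` but does not give their formulas, so the definitions below (`TotalCharges / tenure` and a `MonthlyCharges` threshold of 80) are assumptions, as are the helper names `safe_ratio` and `add_features`.

```python
import math


def safe_ratio(numerator, denominator, default=0.0):
    """Return numerator / denominator, or `default` if the result is not finite."""
    try:
        value = numerator / denominator
    except (ZeroDivisionError, TypeError):  # zero tenure or missing value
        return default
    return value if math.isfinite(value) else default  # catches NaN and +/-inf


def add_features(customer, high_value_threshold=80.0):
    """Return a copy of `customer` with two illustrative engineered features."""
    out = dict(customer)
    # Assumed formula: lifetime charges spread over months of tenure.
    out["avg_charges_per_tenure"] = safe_ratio(
        customer.get("TotalCharges"), customer.get("tenure")
    )
    # Assumed rule: flag customers whose monthly bill meets the threshold.
    monthly = customer.get("MonthlyCharges") or 0.0
    out["high_value_customer"] = int(monthly >= high_value_threshold)
    return out
```

A brand-new customer with `tenure` of 0 gets the default value instead of raising `ZeroDivisionError`, which keeps NaN and infinite values out of the downstream models.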

3. Modeling and Optimization

Trains Logistic Regression, Random Forest, and XGBoost models
Balances training data with SMOTE for rare churn events
Tunes hyperparameters with Optuna to maximize ROC-AUC
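
To show what the balancing step does conceptually, the sketch below implements the interpolation idea behind SMOTE using only the standard library. It is a simplified stand-in, not the imbalanced-learn implementation the project presumably calls; the function names, the choice of `k=3`, and the assumption that churners (label 1) are the minority class are all illustrative.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority samples, nearest first (Euclidean distance).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, seed=0):
    """Oversample class 1 until both classes have the same number of rows."""
    minority = [x for x, label in zip(X, y) if label == 1]
    majority_count = len(y) - len(minority)
    n_new = max(0, majority_count - len(minority))
    new_points = smote_like_oversample(minority, n_new, seed=seed)
    return X + new_points, y + [1] * len(new_points)
```

Oversampling of this kind is normally applied to the training split only, so that validation and test metrics still reflect the real churn rate.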

4. Evaluation

Measures accuracy, precision, recall, and ROC-AUC
Generates confusion matrix, ROC, Precision-Recall plots
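
For reference, the four reported metrics can be computed from labels and scores as in the sketch below. It uses only the standard library and is meant to make the definitions concrete; the function names and the 0.5 decision threshold are assumptions, and the project itself may rely on a metrics library instead.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means churn."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, and recall at a given probability threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(y_true, scores):
    """Probability that a random churner outscores a random non-churner
    (ties count as half). Quadratic in sample count, fine for a sketch."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Unlike accuracy, precision, and recall, ROC-AUC does not depend on the decision threshold, which is one reason it is a common tuning objective for imbalanced problems such as churn.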

5. Explainability

Computes and saves SHAP summary and bar plots
Ranks top churn features both globally and per-customer
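
The difference between global and per-customer rankings can be sketched as follows, assuming a matrix of SHAP values (one row per customer, one column per feature) has already been computed, for example by the shap library. The function names are illustrative. Global importance here is the mean absolute SHAP value per feature, which is the quantity a SHAP bar plot displays.

```python
def global_importance(shap_rows, feature_names):
    """Mean absolute contribution per feature, sorted from high to low."""
    n = len(shap_rows)
    means = [
        sum(abs(row[j]) for row in shap_rows) / n
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, means), key=lambda kv: kv[1], reverse=True)


def local_drivers(shap_row, feature_names, top_n=3):
    """Features with the largest absolute contribution for one customer.
    The sign is kept: positive pushes toward churn, negative away from it."""
    ranked = sorted(
        zip(feature_names, shap_row), key=lambda kv: abs(kv[1]), reverse=True
    )
    return ranked[:top_n]
```

A feature can rank low globally and still be the top driver for an individual customer, which is why both views are useful to a retention team.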

6. Business Analytics

Calculates "revenue at risk", "potential savings", intervention efficiency
Generates markdown and visual HTML reports
Lists top churn drivers and segment-wise actionable steps
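
The README does not define these business metrics, so the sketch below shows one plausible way to compute them. Every definition in it is an assumption: revenue at risk is taken as projected charges over a horizon for customers at or above a churn-probability threshold, potential savings as that amount times an assumed retention success rate, and ROI as net savings over campaign cost. The field names `monthly_charges` and `churn_prob` are hypothetical.

```python
def revenue_at_risk(customers, threshold=0.5, horizon_months=12):
    """Projected revenue from customers predicted to churn (assumed definition)."""
    return sum(
        c["monthly_charges"] * horizon_months
        for c in customers
        if c["churn_prob"] >= threshold
    )


def intervention_summary(customers, success_rate, cost_per_customer,
                         threshold=0.5, horizon_months=12):
    """Estimate savings and ROI of contacting every at-risk customer."""
    targeted = [c for c in customers if c["churn_prob"] >= threshold]
    at_risk = revenue_at_risk(customers, threshold, horizon_months)
    saved = at_risk * success_rate          # assumed: a fixed share is retained
    cost = cost_per_customer * len(targeted)
    roi = (saved - cost) / cost if cost else None  # undefined with zero spend
    return {
        "targeted": len(targeted),
        "revenue_at_risk": at_risk,
        "potential_savings": saved,
        "campaign_cost": cost,
        "roi": roi,
    }
```

Because the result depends heavily on the assumed success rate and horizon, it is worth reporting those inputs alongside the figures.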

Workflow 🔁

1. Clone Project & Install
```
git clone https://github.com/vishnupriyanpr/churnguard-ai.git
cd churnguard-ai
pip install -r requirements.txt
```

2. Prepare Dataset
- Place your CSV data in data/raw/telco_churn.csv (Kaggle Telco Churn format recommended)

3. Run Pipeline
```
python main.py
```

4. View Outputs
- Metrics, SHAP PNGs, and business report: in reports/
- Model artifacts: in models/

Workflow ER Diagram 🗺️

erDiagram

    RAW_DATA {
        string customerID
        string features
        string churn_label
    }
    PROCESSED_DATA {
        string encoded_features
        string target
    }
    ENGINEERED_DATA {
        string new_features
    }
    TRAIN_DATA {
        string balanced_features
        string balanced_target
    }
    MODEL {
        string model_type
        string hyperparameters
        string trained_weights
    }
    METRICS {
        float accuracy
        float precision
        float recall
        float roc_auc
    }
    SHAP_PLOTS {
        string summary_plot
        string feature_importance
    }
    BUSINESS_REPORT {
        string revenue_at_risk
        string recommendations
        string top_drivers
    }

    RAW_DATA ||--o{ PROCESSED_DATA : cleaned_and_preprocessed
    PROCESSED_DATA ||--o{ ENGINEERED_DATA : feature_engineered
    ENGINEERED_DATA ||--o{ TRAIN_DATA : balanced_with_SMOTE
    TRAIN_DATA ||--o{ MODEL : trained_to
    MODEL ||--o{ METRICS : generates
    MODEL ||--o{ SHAP_PLOTS : explains
    METRICS ||--o{ BUSINESS_REPORT : summarized_in
    SHAP_PLOTS ||--o{ BUSINESS_REPORT : visualized_in

Key Results (Latest Run) 📊

Accuracy: 78.1%
Precision: 57.9%
Recall: 65.0%
ROC-AUC: 0.822
Churn Rate: 26.5%
Revenue at Risk: $374,000
Potential Revenue Saved: $72,900
Intervention Efficiency: 57.2%
Top Churn Drivers:
- avg_charges_per_tenure (0.132)
- MonthlyCharges (0.083)
- charges_trend (0.076)
- TotalCharges (0.076)
- price_per_month_ratio (0.075)

🧾 Business Recommendations

Immediate Action: Target high-risk customers (churn probability > 70%) with retention offers
Monitor Medium-Risk: Engage the 30–70% churn probability group
Feature Focus: Optimize avg_charges_per_tenure and related drivers
Ongoing Scoring: Recompute churn risk monthly for all customers
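
The probability bands above translate directly into a small tiering helper. This is a sketch: the 30% and 70% cut-offs come from the recommendations, while the function names, the "low" label for everything under 30%, and the treatment of the exact boundaries are assumptions.

```python
def risk_tier(churn_prob, medium=0.30, high=0.70):
    """Map a churn probability to a tier using the README's cut-offs."""
    if churn_prob > high:        # strictly above 70%: immediate action
        return "high"
    if churn_prob >= medium:     # 30% to 70% inclusive: monitor and engage
        return "medium"
    return "low"


def group_by_tier(scored_customers):
    """Group (customer_id, churn_prob) pairs into tiers for a monthly run."""
    tiers = {"high": [], "medium": [], "low": []}
    for customer_id, prob in scored_customers:
        tiers[risk_tier(prob)].append(customer_id)
    return tiers
```

Re-running `group_by_tier` on each month's fresh scores yields the lists a retention team would act on.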

Project Structure 📁

churnguard-ai/
├── data/
│ ├── raw/
│ └── processed/
├── models/
├── reports/
├── src/
│ ├── data_loader.py
│ ├── data_preprocessor.py
│ ├── feature_engineer.py
│ ├── model_trainer.py
│ ├── model_evaluator.py
│ └── utils.py
├── main.py
├── requirements.txt
└── README.md

Output 🖼

Model Evaluation Dashboard

License 📜

MIT License — use, modify, and scale freely!

Credits 🙌

<div align="center"> <table style="width:100%;"> <tr> <td align="center" style="width:50%;"> <a href="https://github.com/vishnupriyanpr"> <img src="https://github.com/vishnupriyanpr.png?size=120" width="120px;" alt="Vishnupriyan P R"/> </a> </td> <td align="center" style="width:50%;"> <blockquote> <p>“Tools should disappear into the background and let you build.”</p> <footer>— Vishnupriyan P R, <i>caffeinated coder ☕</i></footer> </blockquote> </td> </tr> </table> </div>
