DrWhy

DrWhy is a collection of tools for eXplainable AI (XAI). It is based on shared principles and a simple grammar for exploration, explanation and visualisation of predictive models.

Responsible Machine Learning

With Great Power Comes Great Responsibility. Voltaire (well, maybe)

How can machine learning models be developed in a responsible manner? There are several topics worth considering:

Effective. Is the model good enough? Models with low performance should not be used because they can do more harm than good. Communicate the performance of the model in a language that the user understands. Remember that the models will work on a different dataset than the training one. Make sure to assess the performance on the target dataset.
Transparent. Does the user know what influences model predictions? Interpretability and explainability are important. If the model's decisions affect us directly or indirectly, we should know where these decisions come from and how they can be changed.
Fair. Does the model discriminate on the basis of gender, age, race or another sensitive attribute, whether directly or indirectly? It should not! Discrimination takes many forms. The model may give lower scores, may have lower performance, or may be based on different variables for the protected population.
Secure. Do not let your model be hacked. Every complex system has its vulnerabilities. Seek them out and fix them. Some users may use various tricks to pull model predictions in their favour.
Confidential. Models are often built on sensitive data. Make sure that the data does not leak, so that sensitive attributes are not shared with unauthorized persons. Also beware of model leaks.
Reproducible. Usually the model development process consists of many steps. Make sure that they are completely reproducible and thus can be verified one by one.
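The Effective and Fair points above mention that a model may give lower scores or perform worse for a protected population. A minimal sketch of such a check is shown below, in plain Python. The function name `audit_by_group`, the 0.5 threshold and the toy data are all assumptions made for illustration; they are not part of any DrWhy.AI package.

```python
from collections import defaultdict

def audit_by_group(y_true, y_score, group, threshold=0.5):
    """Return the count, mean score and accuracy for each group.

    Hypothetical helper for illustration; not a DrWhy.AI function.
    """
    buckets = defaultdict(list)
    for y, s, g in zip(y_true, y_score, group):
        buckets[g].append((y, s))

    report = {}
    for g, rows in buckets.items():
        n = len(rows)
        mean_score = sum(s for _, s in rows) / n
        # A prediction is correct when the thresholded score matches the label.
        accuracy = sum((s >= threshold) == bool(y) for y, s in rows) / n
        report[g] = {"n": n, "mean_score": mean_score, "accuracy": accuracy}
    return report

# Toy data (made up): labels, model scores and a sensitive attribute.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.9, 0.2, 0.8, 0.7, 0.4, 0.4, 0.6, 0.3]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

report = audit_by_group(y_true, y_score, group)
for g, stats in sorted(report.items()):
    print(g, stats)
```

A large gap between groups in mean score or accuracy is a signal to investigate further, not a verdict on its own.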

Collection of tools for Visual Exploration, Explanation and Debugging of Predictive Models

It takes a village to raise a <del>child</del> model.

The way we do predictive modeling is very ineffective. We spend far too much time on manual, time-consuming activities that are easy to automate, such as data cleaning and exploration, crisp modeling and model validation. We should focus more on model understanding, productisation and communication.

Gathered here are tools that can make our work more efficient throughout the whole model lifecycle. The unified grammar behind the DrWhy.AI universe is described in the book Explanatory Model Analysis: Explore, Explain and Examine Predictive Models.

Lifecycle for Predictive Models

DrWhy is based on a unified Model Development Process inspired by RUP (Rational Unified Process). Find an overview in the diagram below.

The DrWhy.AI family

Packages in the DrWhy.AI family of models may be divided into four classes.

Model adapters. Predictive models created with different tools have different structures and different interfaces. Model adapters create uniform wrappers, so that other packages can operate on models in a unified way. DALEX is a lightweight package with a generic interface. DALEXtra is a package with extensions for heavyweight interfaces like scikit-learn, h2o and mlr.
Model agnostic explainers. These packages implement specific methods for model exploration. They can be applied to a single model or used to compare different models. ingredients implements variable-specific techniques like Ceteris Paribus, Partial Dependency and Permutation-based Feature Importance. iBreakDown implements techniques for variable attribution, like Break Down or Shapley values. auditor implements techniques for model validation, residual diagnostics and performance diagnostics.
Model specific explainers. These packages implement model-specific techniques. randomForestExplainer implements techniques for exploration of randomForest models. EIX implements techniques for exploration of gbm and xgboost models. cr19 implements techniques for exploration of survival models.
Automated exploration. These packages combine a series of model exploration techniques and produce an automated report or website for model exploration. modelStudio implements a dashboard generator for local and global interactive model exploration. modelDown implements an HTML website generator for global model cross comparison.

Here is a more detailed overview.

DALEX <img src="https://modeloriented.github.io/DALEX/reference/figures/logo.png" align="right" width="100"/>

The DALEX package (Descriptive mAchine Learning EXplanations) helps to understand how complex models work. The main function, explain(), creates a wrapper around a predictive model. Wrapped models may then be explored and compared with a collection of local and global explainers that draw on recent developments in Interpretable Machine Learning / eXplainable Artificial Intelligence.
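The wrapper idea can be illustrated with a small sketch in plain Python. It does not use or reproduce the DALEX API: the class `ModelWrapper`, its arguments and the two toy models are hypothetical, chosen only to show how models with different interfaces can be scored through one uniform `predict()` call.

```python
import math

class ModelWrapper:
    """Hypothetical uniform wrapper: a model, its data and one predict function."""

    def __init__(self, model, data, y, predict_function, label):
        self.model = model
        self.data = data
        self.y = y
        self.predict_function = predict_function
        self.label = label

    def predict(self, rows):
        # Every wrapped model is called the same way, whatever its native interface.
        return self.predict_function(self.model, rows)

    def rmse(self):
        preds = self.predict(self.data)
        return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, self.y)) / len(self.y))

class LinearModel:
    """Toy model exposing a batch .predict(rows) method."""

    def __init__(self, intercept, slope):
        self.intercept = intercept
        self.slope = slope

    def predict(self, rows):
        return [self.intercept + self.slope * r["x"] for r in rows]

def constant_model(value):
    """Toy model exposed as a plain callable that scores one x at a time."""
    return lambda x: value

data = [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}, {"x": 4.0}]
y = [2.0, 4.0, 6.0, 8.0]

linear = ModelWrapper(LinearModel(0.0, 2.0), data, y,
                      predict_function=lambda m, rows: m.predict(rows),
                      label="linear")
baseline = ModelWrapper(constant_model(5.0), data, y,
                        predict_function=lambda m, rows: [m(r["x"]) for r in rows],
                        label="baseline")

# Downstream code only needs the wrapper, so both models are compared identically.
for wrapped in (linear, baseline):
    print(wrapped.label, round(wrapped.rmse(), 3))
```

Keeping the interface difference inside `predict_function` means any explainer written against the wrapper works for both models unchanged.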

DALEX wraps methods from other packages, including 'pdp' (Greenwell 2017) doi:10.32614/RJ-2017-016, 'ALEPlot' (Apley 2018) arXiv:1612.08468, 'factorMerger' (Sitko and Biecek 2017) arXiv:1709.04412, the 'breakDown' package (Staniak and Biecek 2018) doi:10.32614/RJ-2018-072, and (Fisher et al. 2018) arXiv:1801.01489.

Vignettes:

General introduction: Survival on the RMS Titanic

DALEXtra <img src="https://github.com/ModelOriented/DALEXtra/blob/master/man/figures/logo.png?raw=true" align="right" width="100"/>

The DALEXtra package is an extension pack for the DALEX package. It provides easy-to-use connectors for models created with scikit-learn, keras, H2O, mljar and mlr.

Vignettes:

General introduction: DALEX with scikitlearn models

survex <img src="https://raw.githubusercontent.com/ModelOriented/survex/main/man/figures/survex.png" align="right" width="100"/>

The survex package provides model-agnostic explanations for machine learning survival models. It is based on the DALEX package.

Survival models predict a function, either a survival function or a cumulative hazard function, rather than a single number. For this reason, standard model-agnostic explanations cannot be applied directly to machine learning survival models. The survex package contains implementations of explanation methods specific to survival analysis, as well as extensions of existing methods for classification or regression.
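To illustrate why a functional prediction changes the explanation, the sketch below computes permutation importance separately at each time point, so the result is a curve over time and not a single number. This is not the survex API. The exponential toy model, the squared-error loss against the indicator "still alive at time t", the simulated data and the absence of censoring are all simplifying assumptions.

```python
import math
import random

def predict_survival(row, times):
    """Toy model (assumed): S(t | x) = exp(-rate * t); rate depends on x1 only."""
    rate = 0.1 * math.exp(0.8 * row["x1"])
    return [math.exp(-rate * t) for t in times]

def loss_by_time(rows, event_times, times):
    """Mean squared error between S(t | x) and the indicator T > t, per time point.

    Assumes no censoring, which real survival data rarely satisfies.
    """
    totals = [0.0] * len(times)
    for row, t_event in zip(rows, event_times):
        surv = predict_survival(row, times)
        for k, t in enumerate(times):
            alive = 1.0 if t_event > t else 0.0
            totals[k] += (surv[k] - alive) ** 2
    return [total / len(rows) for total in totals]

def permutation_importance_by_time(rows, event_times, times, feature, rng):
    """Loss after shuffling one feature minus the baseline loss, per time point."""
    baseline = loss_by_time(rows, event_times, times)
    shuffled = [row[feature] for row in rows]
    rng.shuffle(shuffled)
    permuted_rows = [dict(row, **{feature: v}) for row, v in zip(rows, shuffled)]
    permuted = loss_by_time(permuted_rows, event_times, times)
    return [p - b for p, b in zip(permuted, baseline)]

# Simulated data: event times follow the same exponential law the toy model assumes.
rng = random.Random(0)
rows = [{"x1": rng.uniform(-1, 1), "x2": rng.uniform(-1, 1)} for _ in range(200)]
event_times = [rng.expovariate(0.1 * math.exp(0.8 * r["x1"])) for r in rows]
times = [1.0, 5.0, 10.0, 20.0]

importance = {
    feature: permutation_importance_by_time(rows, event_times, times, feature, rng)
    for feature in ("x1", "x2")
}
for feature, curve in importance.items():
    print(feature, [round(v, 4) for v in curve])
```

Because the toy model ignores `x2`, its importance is zero at every time point, while the importance of `x1` varies with time.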

Vignettes:

ingredients <img src="https://modeloriented.github.io/ingredients/reference/figures/logo.png" align="right" width="100"/>

The ingredients package is a collection of tools for assessment of feature importance and feature effects.
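The two ideas, feature importance and feature effects, can be sketched from scratch in plain Python. The code below does not call the ingredients package. The function names, the toy model and the simulated data are hypothetical; it only shows permutation-based importance, a Ceteris Paribus profile for one observation, and a partial dependence profile taken as the average of such profiles.

```python
import math
import random

def model(row):
    """Hypothetical fitted model: uses x1 and x2, ignores x3."""
    return 3.0 * row["x1"] + row["x2"] ** 2

def rmse(rows, y):
    return math.sqrt(sum((model(r) - t) ** 2 for r, t in zip(rows, y)) / len(y))

def permutation_importance(rows, y, feature, rng):
    """Increase in RMSE after shuffling a single feature column."""
    shuffled = [r[feature] for r in rows]
    rng.shuffle(shuffled)
    permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
    return rmse(permuted, y) - rmse(rows, y)

def ceteris_paribus_profile(row, feature, grid):
    """Predictions for one observation as one feature varies, others held fixed."""
    return [model(dict(row, **{feature: v})) for v in grid]

def partial_dependence_profile(rows, feature, grid):
    """Average of the Ceteris Paribus profiles over all observations."""
    profiles = [ceteris_paribus_profile(r, feature, grid) for r in rows]
    return [sum(column) / len(rows) for column in zip(*profiles)]

rng = random.Random(1)
rows = [{"x1": rng.uniform(0, 1), "x2": rng.uniform(0, 1), "x3": rng.uniform(0, 1)}
        for _ in range(100)]
y = [model(r) for r in rows]  # noise-free targets, so the baseline RMSE is 0

importance = {f: permutation_importance(rows, y, f, rng) for f in ("x1", "x2", "x3")}
grid = [0.0, 1.0, 2.0]
cp = ceteris_paribus_profile({"x1": 1.0, "x2": 2.0, "x3": 0.0}, "x1", grid)
pd_x2 = partial_dependence_profile(rows, "x2", grid)

print(importance)
print(cp)
print(pd_x2)
```

The unused feature `x3` gets an importance of exactly zero, and the profiles show how the prediction responds to one feature at a time.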

Key functions: feature_importance() for assessment of global level feature importance, ceteris_paribus() for calculation of the Ceteris Paribus / What-If Profiles, partial_dependency() for Partial Dependency Plots, conditional_dependency() for Conditional Dependency Plots also called M Plots, accumulated_dependency() for Accumulated Local Effects Plo
