imodels
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
Install / Use
<img align="center" width=100% src="https://csinva.io/imodels/img/anim.gif"> </img>
Modern machine-learning models are increasingly complex, often making them difficult to interpret. This package provides a simple interface for fitting and using state-of-the-art interpretable models, all compatible with scikit-learn. These models can often replace black-box models (e.g. random forests) with simpler models (e.g. rule lists) while improving interpretability and computational efficiency, all without sacrificing predictive accuracy! Simply import a classifier or regressor and use the fit and predict methods, same as standard scikit-learn models.
```python
from sklearn.model_selection import train_test_split
from imodels import get_clean_dataset, HSTreeClassifierCV  # import any imodels model here

# prepare data (a sample clinical dataset)
X, y, feature_names = get_clean_dataset('csi_pecarn_pred')
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# fit the model
model = HSTreeClassifierCV(max_leaf_nodes=4)  # initialize a tree model with at most 4 leaf nodes
model.fit(X_train, y_train, feature_names=feature_names)  # fit model
preds = model.predict(X_test)  # discrete predictions: shape is (n_test,)
preds_proba = model.predict_proba(X_test)  # predicted probabilities: shape is (n_test, n_classes)
print(model)  # print the model
```
```
------------------------------
Decision Tree with Hierarchical Shrinkage
Prediction is made by looking at the value in the appropriate leaf of the tree
------------------------------
|--- FocalNeuroFindings2 <= 0.50
|   |--- HighriskDiving <= 0.50
|   |   |--- Torticollis2 <= 0.50
|   |   |   |--- value: [0.10]
|   |   |--- Torticollis2 >  0.50
|   |   |   |--- value: [0.30]
|   |--- HighriskDiving >  0.50
|   |   |--- value: [0.68]
|--- FocalNeuroFindings2 >  0.50
|   |--- value: [0.42]
```
Installation
Install with `pip install imodels` (see here for help).
Supported models
<p align="left"> <a href="https://csinva.io/imodels/">🗂️</a> Docs &nbsp; 📄 Research paper &nbsp; 🔗 Reference code implementation <br/> </p>

| Model | Reference | Description |
| :-------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| Rulefit rule set | 🗂️, 📄, 🔗 | Fits a sparse linear model on rules extracted from decision trees |
| Skope rule set | 🗂️, 🔗 | Extracts rules from gradient-boosted trees, deduplicates them,<br/>then linearly combines them based on their OOB precision |
| Boosted rule set | 🗂️, 📄, 🔗 | Sequentially fits a set of rules with Adaboost |
| Slipper rule set | 🗂️, 📄 | Sequentially learns a set of rules with SLIPPER |
| Bayesian rule set | 🗂️, 📄, 🔗 | Finds concise rule set with Bayesian sampling (slow) |
| Bayesian rule list | 🗂️, 📄, 🔗 | Fits compact rule list distribution with Bayesian sampling (slow) |
| Greedy rule list | 🗂️, 🔗 | Uses CART to fit a list (only a single path), rather than a tree |
| OneR rule list | 🗂️, 📄 | Fits rule list restricted to only one feature |
| Optimal rule tree | 🗂️, 📄, 🔗 | Fits succinct tree using global optimization for sparsity (GOSDT) |
| Greedy rule tree | 🗂️, 📄, 🔗 | Greedily fits tree using CART |
| C4.5 rule tree | 🗂️, 📄, 🔗 | Greedily fits tree using C4.5 |
| TAO rule tree | 🗂️, 📄 | Fits tree using alternating optimization |
| Iterative random<br/>forest | 🗂️, 📄, 🔗 | Repeatedly fit random forest, giving features with<br/>high importance a higher chance of being selected |
| Sparse integer<br/>linear model | 🗂️, 📄 | Sparse linear model with integer coefficients |
| Tree GAM | 🗂️, 📄, 🔗 | Generalized additive model fit with short boosted trees |
| <b>Greedy tree<br/>sums (FIGS)</b> | 🗂️, 📄 | Sum of small trees with very few total rules (FIGS) |
| <b>Hierarchical<br/>shrinkage wrapper</b> | 🗂️, 📄 | Improve a decision tree, random forest, or<br/>gradient-boosting ensemble with ultra-fast, post-hoc regularization |
| <b>RF+ (MDI+)</b> | 🗂️, 📄 | Flexible random forest-based feature importance |
| Distillation<br/>wrapper | 🗂️ | Train a black-box model,<br/>then distill it into an interpretable model |
| AutoML wrapper | 🗂️ | Automatically fit and select an interpretable model |
| More models | ⌛ | (Coming soon!) Lightweight Rule Induction, MLRules, ... |
Demo notebooks
Demos are contained in the notebooks folder.
<details>
<summary><a href="https://github.com/csinva/imodels/blob/master/notebooks/imodels_demo.ipynb">Quickstart demo</a></summary>
Shows how to fit, predict, and visualize with different interpretable models
</details>
<details>
<summary><a href="https://auto.gluon.ai/dev/tutorials/tabular_prediction/tabular-interpretability.html">Autogluon demo</a></summary>
Fit/select an interpretable model automatically using Autogluon AutoML
</details>
<details>
<summary><a href="https://colab.research.google.com/drive/1WfqvSjegygT7p0gyqiWpRpiwz2ePtiao#scrollTo=bLnLknIuoWtQ"