
dtreeviz : Decision Tree Visualization

Description

A Python library for decision tree visualization and model interpretation. Decision trees are the fundamental building block of gradient boosting machines and Random Forests(tm), probably the two most popular machine learning models for structured data. Visualizing decision trees is a tremendous aid when learning how these models work and when interpreting them. The visualizations are inspired by R2D3's educational animation, A Visual Introduction to Machine Learning. Please see How to visualize decision trees for a deeper discussion of our decision tree visualization library and the visual design decisions we made.

Currently dtreeviz supports: scikit-learn, XGBoost, Spark MLlib, LightGBM, and TensorFlow. See Installation instructions.

Authors

Major code and visualization cleanup contributions were made by Matthew Epland (@mepland).

Sample Visualizations

Tree visualizations

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/iris-TD-2.svg" width="250"></td> <td><img src="testing/samples/boston-TD-2.svg" width="250"></td> <td><img src="testing/samples/knowledge-TD-4-simple.svg" width="250"></td> </tr> </table>

Prediction path explanations

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/breast_cancer-TD-3-X.svg" width="250"></td> <td><img src="testing/samples/diabetes-LR-2-X.svg" width="300"></td> <td><img src="testing/samples/knowledge-TD-15-X-simple.svg" width="250"></td> </tr> </table>

Leaf information

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/titanic-leaf-regression.png" width="150"></td> <td><img src="testing/samples/titanic-leaf-samples-by-class.png" width="250"></td> </tr> </table>

Feature space exploration

Regression

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/cars-univar-2.svg" width="250"></td> <td><img src="https://user-images.githubusercontent.com/178777/49104999-4edb0d80-f234-11e8-9010-73b7c0ba5fb9.png" width="250"></td> <td><img src="https://user-images.githubusercontent.com/178777/49107627-08d57800-f23b-11e8-85a2-ab5894055092.png" width="250"></td> </tr> </table>

Classification

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="https://user-images.githubusercontent.com/178777/49105084-9497d600-f234-11e8-9097-56835558c1a6.png" width="250"></td> <td><img src="https://user-images.githubusercontent.com/178777/49105085-9792c680-f234-11e8-8af5-bc2fde950ab1.png" width="250"></td> </tr> </table>

Classification boundaries

As a utility function, dtreeviz provides dtreeviz.decision_boundaries(), which illustrates one- and two-dimensional feature spaces for classifiers, including colors that represent probabilities, decision boundaries, and misclassified entities. This function is not limited to tree models: it should work with any model that implements predict_proba(), which means any classifier from scikit-learn (we also made it work with Keras models that define predict()). Because it is not tree-specific, the function does not use adaptors obtained from dtreeviz.model(). See classifier-decision-boundaries.ipynb.
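To make the predict_proba() requirement concrete, here is a minimal scikit-learn sketch (plain scikit-learn, no dtreeviz calls) showing the kind of non-tree model and probability output that decision_boundaries() consumes:

```python
# Any classifier exposing predict_proba() qualifies for decision_boundaries();
# here a (non-tree) scikit-learn LogisticRegression on two iris features.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X2 = X[:, :2]                        # two features -> a 2-D feature space

model = LogisticRegression(max_iter=1000).fit(X2, y)

proba = model.predict_proba(X2[:2])  # one row of class probabilities per sample
print(proba.shape)                   # (2, 3): 2 samples x 3 classes
```

Each row of predict_proba() output sums to 1; decision_boundaries() uses these per-class probabilities to color the feature space.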

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="https://user-images.githubusercontent.com/178777/113516364-b608db00-952e-11eb-91cf-efe2386622f1.png" width="250"><br><img src="https://user-images.githubusercontent.com/178777/113516379-d5076d00-952e-11eb-955e-1dd7c09f2f29.png" width="250"></td> <td><img src="https://user-images.githubusercontent.com/178777/113516349-a12c4780-952e-11eb-86f3-0ae457eb500f.png" width="250"></td> </tr> </table>

Sometimes it's helpful to see animations that vary one of the hyperparameters. The notebook classifier-boundary-animations.ipynb contains code that generates animations such as the following (animated PNG files):

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/smiley-dtree-maxdepth.png" width="250"></td> <td><img src="testing/samples/smiley-numtrees.png" width="250"></td> </tr> </table>
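The effect of the max_depth hyperparameter that these animations sweep can also be checked numerically; here is a small scikit-learn sketch (not taken from the notebook):

```python
# Deeper trees fit the training data more closely; the animations above
# visualize this same max_depth sweep over the decision boundaries.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
for depth in (1, 2, 4):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X, y)
    print(depth, round(clf.score(X, y), 3))  # training accuracy rises with depth
```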

Quick start

See Installation instructions, then take a look at the specific notebooks for the supported ML library you're using.

To interoperate with these different libraries, dtreeviz uses an adaptor object, obtained from the function dtreeviz.model(), to extract the model information necessary for visualization. Given such an adaptor object, all of the dtreeviz functionality is available to you through the same programmer interface. The basic dtreeviz usage recipe is:

  1. Import dtreeviz and your decision tree library
  2. Acquire and load data into memory
  3. Train a classifier or regressor model using your decision tree library
  4. Obtain a dtreeviz adaptor model using<br>viz_model = dtreeviz.model(your_trained_model,...)
  5. Call dtreeviz functions, such as<br>viz_model.view() or viz_model.explain_prediction_path(sample_x)

Example

Here's a complete example Python file that displays the following tree in a popup window:

<img src="testing/samples/iris-TD-4.svg" width="200">

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

import dtreeviz

iris = load_iris()
X = iris.data
y = iris.target

clf = DecisionTreeClassifier(max_depth=4)
clf.fit(X, y)

viz_model = dtreeviz.model(clf,
                           X_train=X, y_train=y,
                           feature_names=iris.feature_names,
                           target_name='iris',
                           class_names=iris.target_names)

v = viz_model.view()     # render as SVG into internal object 
v.show()                 # pop up window
v.save("/tmp/iris.svg")  # optionally save as svg

In a notebook, you can render inline without calling show(). Just call view():

viz_model.view()       # in notebook, displays inline

AI-Powered Tree Analysis

With AI integration enabled, you can ask ad hoc questions about your decision tree model using the chat() method. The AI has access to comprehensive knowledge about your tree structure, nodes, and training data, enabling it to answer questions about:

  • Tree structure: Overall architecture, depth, node count, splitting criteria, and tree type (classification/regression)
  • Tree nodes: Split conditions, feature usage, node statistics, sample distributions, and purity measures at internal nodes
  • Leaf nodes: Predictions, confidence scores, sample counts, and class distributions
  • Training dataset: Feature statistics, target distributions, and data characteristics within nodes or leaves
# Enable AI chat when creating the model
viz_model = dtreeviz.model(tree_classifier,
                           X_train=dataset[features], y_train=dataset[target],
                           feature_names=features,
                           target_name=target, class_names=["perish", "survive"],
                           # ... (AI-enabling arguments elided in the original)
                           )
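As a concrete illustration of the "purity measures" the AI can discuss, here is a small pure-Python sketch (not dtreeviz code) of the Gini impurity statistic computed at a tree node:

```python
# Gini impurity of the class labels reaching a tree node:
# 0.0 means a pure node; 0.5 is maximal impurity for two classes.
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["perish"] * 50 + ["survive"] * 50))  # 0.5
print(gini(["survive"] * 80))                    # 0.0
```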
