dtreeviz : Decision Tree Visualization
Description
A Python library for decision tree visualization and model interpretation. Decision trees are the fundamental building block of gradient boosting machines and Random Forests™, probably the two most popular machine learning models for structured data. Visualizing decision trees is a tremendous aid when learning how these models work and when interpreting models. The visualizations are inspired by an educational animation by R2D3, A Visual Introduction to Machine Learning. Please see How to visualize decision trees for a deeper discussion of our decision tree visualization library and the visual design decisions we made.
Currently dtreeviz supports: scikit-learn, XGBoost, Spark MLlib, LightGBM, and Tensorflow. See Installation instructions.
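For reference, the core package installs from PyPI; the extras names below for the optional backends are assumptions based on the project's packaging and may differ by version, so check the installation instructions. Note that dtreeviz also relies on the Graphviz library to render trees.

```shell
# core package (scikit-learn support); requires Graphviz on the system
pip install dtreeviz

# optional backends (extras names are assumptions; see the install docs)
pip install "dtreeviz[xgboost]"
pip install "dtreeviz[lightgbm]"
pip install "dtreeviz[pyspark]"
pip install "dtreeviz[tensorflow_decision_forests]"
```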
Authors
- Terence Parr, a tech lead at Google; until 2022 he was a professor of data science / computer science at the University of San Francisco, where he was founding director of the University of San Francisco's MS in data science program in 2012.
- Tudor Lapusan
- Prince Grover
With major code and visualization cleanup contributions by Matthew Epland (@mepland).
Sample Visualizations
Tree visualizations
<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/iris-TD-2.svg" width="250"></td> <td><img src="testing/samples/boston-TD-2.svg" width="250"></td> <td><img src="testing/samples/knowledge-TD-4-simple.svg" width="250"></td> </tr> </table>

Prediction path explanations

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/breast_cancer-TD-3-X.svg" width="250"></td> <td><img src="testing/samples/diabetes-LR-2-X.svg" width="300"></td> <td><img src="testing/samples/knowledge-TD-15-X-simple.svg" width="250"></td> </tr> </table>

Leaf information

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/titanic-leaf-regression.png" width="150"></td> <td><img src="testing/samples/titanic-leaf-samples-by-class.png" width="250"></td> </tr> </table>

Feature space exploration

Regression

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/cars-univar-2.svg" width="250"></td> <td><img src="https://user-images.githubusercontent.com/178777/49104999-4edb0d80-f234-11e8-9010-73b7c0ba5fb9.png" width="250"></td> <td><img src="https://user-images.githubusercontent.com/178777/49107627-08d57800-f23b-11e8-85a2-ab5894055092.png" width="250"></td> </tr> </table>

Classification

<table cellpadding="0" cellspacing="0"> <tr> <td><img src="https://user-images.githubusercontent.com/178777/49105084-9497d600-f234-11e8-9097-56835558c1a6.png" width="250"></td> <td><img src="https://user-images.githubusercontent.com/178777/49105085-9792c680-f234-11e8-8af5-bc2fde950ab1.png" width="250"></td> </tr> </table>

Classification boundaries
As a utility function, dtreeviz provides dtreeviz.decision_boundaries(), which illustrates one- and two-dimensional feature space for classifiers, including colors that represent probabilities, decision boundaries, and misclassified entities. This method is not limited to tree models and should work with any model that implements predict_proba(). That means any classifier from scikit-learn should work (and we also made it work with Keras models that define predict()). (Because it is not tree-specific, the function does not use adaptors obtained from dtreeviz.model().) See classifier-decision-boundaries.ipynb.
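Because the contract is just a `predict_proba()` method, even a hand-rolled model qualifies. A minimal sketch of the duck-typing requirement (the `LogisticByFirstFeature` class is a hypothetical illustration, and the `decision_boundaries` call in the final comment is the only dtreeviz piece):

```python
import numpy as np

# Hypothetical minimal classifier for illustration: the only contract
# dtreeviz.decision_boundaries() needs is a predict_proba() method.
class LogisticByFirstFeature:
    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        p1 = 1.0 / (1.0 + np.exp(-X[:, 0]))   # class-1 probability from feature 0
        return np.column_stack([1.0 - p1, p1])

model = LogisticByFirstFeature()
grid = np.array([[-2.0, 0.0], [0.0, 0.0], [2.0, 0.0]])
probs = model.predict_proba(grid)
print(probs.shape)  # (3, 2): one row per sample, one column per class
# A duck-typed model like this can then be passed alongside labeled data:
# dtreeviz.decision_boundaries(model, X, y, ...)
```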
Sometimes it's helpful to see animations that vary some of the hyperparameters. If you look in notebook classifier-boundary-animations.ipynb, you will see code that generates animations such as the following (animated png files):
<table cellpadding="0" cellspacing="0"> <tr> <td><img src="testing/samples/smiley-dtree-maxdepth.png" width="250"></td> <td><img src="testing/samples/smiley-numtrees.png" width="250"></td> </tr> </table>

Quick start
See Installation instructions then take a look at the specific notebooks for the supported ML library you're using:
- sklearn-based examples (colab)
- LightGBM-based examples (colab)
- Spark-based examples (colab)
- TensorFlow-based examples (colab); also see the tensorflow.org blog post Visualizing TensorFlow Decision Forest Trees with dtreeviz
- XGBoost-based examples (colab)
- Classifier decision boundaries for any scikit-learn model.ipynb (colab)
- Changing colors notebook (colab)
- AI-powered tree analysis (sklearn) - Interactive chat and explanations using LLMs
To interoperate with these different libraries, dtreeviz uses an adaptor object, obtained from function dtreeviz.model(), to extract the model information necessary for visualization. Given such an adaptor object, all of the dtreeviz functionality is available to you through the same programmer interface. The basic dtreeviz usage recipe is:
- Import dtreeviz and your decision tree library
- Acquire and load data into memory
- Train a classifier or regressor model using your decision tree library
- Obtain a dtreeviz adaptor model using<br>`viz_model = dtreeviz.model(your_trained_model, ...)`
- Call dtreeviz functions, such as<br>`viz_model.view()` or `viz_model.explain_prediction_path(sample_x)`
Example
Here's a complete example Python file that displays the following tree in a popup window:
<img src="testing/samples/iris-TD-4.svg" width="200">

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

import dtreeviz

iris = load_iris()
X = iris.data
y = iris.target

clf = DecisionTreeClassifier(max_depth=4)
clf.fit(X, y)

viz_model = dtreeviz.model(clf,
                           X_train=X, y_train=y,
                           feature_names=iris.feature_names,
                           target_name='iris',
                           class_names=iris.target_names)

v = viz_model.view()      # render as SVG into internal object
v.show()                  # pop up window
v.save("/tmp/iris.svg")   # optionally save as svg
```
In a notebook, you can render inline without calling show(). Just call view():
```python
viz_model.view()  # in notebook, displays inline
```
AI-Powered Tree Analysis
With AI integration enabled, you can ask ad hoc questions about your decision tree model using the chat() method. The AI has access to comprehensive knowledge about your tree structure, nodes, and training data, enabling it to answer questions about:
- Tree structure: Overall architecture, depth, node count, splitting criteria, and tree type (classification/regression)
- Tree nodes: Split conditions, feature usage, node statistics, sample distributions, and purity measures at internal nodes
- Leaf nodes: Predictions, confidence scores, sample counts, and class distributions
- Training dataset: Feature statistics, target distributions, and data characteristics within nodes or leaves
```python
# Enable AI chat when creating the model
viz_model = dtreeviz.model(tree_classifier,
                           X_train=dataset[features], y_train=dataset[target],
                           feature_names=features,
                           target_name=target, class_names=["perish", "survive"],
                           ...)
```