EconML
ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
EconML is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. This package was designed and built as part of the ALICE project at Microsoft Research with the goal of combining state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems. The promise of EconML:
- Implement recent techniques in the literature at the intersection of econometrics and machine learning
- Maintain flexibility in modeling the effect heterogeneity (via techniques such as random forests, boosting, lasso and neural nets), while preserving the causal interpretation of the learned model and often offering valid confidence intervals
- Use a unified API
- Build on standard Python packages for Machine Learning and Data Analysis
One of the biggest promises of machine learning is to automate decision making in a multitude of domains. At the core of many data-driven personalized decision scenarios is the estimation of heterogeneous treatment effects: what is the causal effect of an intervention on an outcome of interest for a sample with a particular set of features? In a nutshell, this toolkit is designed to measure the causal effect of some treatment variable(s) T on an outcome variable Y, controlling for a set of features X and W, and to measure how that effect varies as a function of X. The methods implemented are applicable even with observational (non-experimental or historical) datasets. For the estimation results to have a causal interpretation, some methods assume no unobserved confounders (i.e., no variable outside X and W simultaneously affects both T and Y), while others assume access to an instrument Z (i.e., an observed variable Z that affects the treatment T but has no direct effect on the outcome Y). Most methods provide confidence intervals and inference results.
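To make the roles of T, Y, X and W concrete, here is a small simulation that uses only the Python standard library and does not call EconML. It is a sketch under assumed conditions: the data-generating process, the function names (`simulate`, `naive_effect`, `adjusted_effect`) and all constants are hypothetical and chosen purely for illustration. A binary confounder W raises both the chance of treatment and the outcome, and the true effect of T is 1 when X = 0 and 3 when X = 1.

```python
import random
from statistics import mean

def simulate(n=20000, seed=0):
    """Toy observational data: W confounds T and Y; the effect of T depends on X."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)                  # feature driving effect heterogeneity
        w = rng.randint(0, 1)                  # confounder
        p_treat = 0.8 if w == 1 else 0.2       # W raises the chance of treatment
        t = 1 if rng.random() < p_treat else 0
        effect = 1.0 + 2.0 * x                 # assumed true effect: 1 if x=0, 3 if x=1
        y = effect * t + 5.0 * w + rng.gauss(0, 1)  # W also raises the outcome
        rows.append((x, w, t, y))
    return rows

def naive_effect(rows, x):
    """Treated-minus-untreated mean outcome among units with X = x, ignoring W."""
    treated = [y for (xi, _, t, y) in rows if xi == x and t == 1]
    control = [y for (xi, _, t, y) in rows if xi == x and t == 0]
    return mean(treated) - mean(control)

def adjusted_effect(rows, x):
    """Average the within-W differences, weighting each W stratum by its share."""
    total = sum(1 for r in rows if r[0] == x)
    estimate = 0.0
    for w in (0, 1):
        stratum = [r for r in rows if r[0] == x and r[1] == w]
        treated = [y for (_, _, t, y) in stratum if t == 1]
        control = [y for (_, _, t, y) in stratum if t == 0]
        estimate += (mean(treated) - mean(control)) * len(stratum) / total
    return estimate

rows = simulate()
for x in (0, 1):
    print(f"x={x}: naive={naive_effect(rows, x):.2f}, "
          f"adjusted={adjusted_effect(rows, x):.2f}")
```

In this setup the naive comparison overstates the effect by roughly 3 in expectation, because treated units are more likely to have W = 1, while the stratified estimate recovers the effect for each value of X. This only works because W is observed; with an unobserved confounder, neither estimate would be reliable.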
For detailed information about the package, consult the documentation at https://www.pywhy.org/EconML/.
For information on use cases and background material on causal inference and heterogeneous treatment effects see our webpage at https://www.microsoft.com/en-us/research/project/econml/
<details>
<summary><strong><em>Table of Contents</em></strong></summary>

- News
- Getting Started
- For Developers
- Blogs and Publications
- Citation
- Contributing and Feedback
- Community
- References

</details>
News
If you'd like to contribute to this project, see the Help Wanted section below.
July 10, 2025: Release v0.16.0, see release notes here
<details><summary>Previous releases</summary>

July 3, 2024: Release v0.15.1, see release notes here
February 12, 2024: Release v0.15.0, see release notes here
November 11, 2023: Release v0.15.0b1, see release notes here
May 19, 2023: Release v0.14.1, see release notes here
November 16, 2022: Release v0.14.0, see release notes here
June 17, 2022: Release v0.13.1, see release notes here
January 31, 2022: Release v0.13.0, see release notes here
August 13, 2021: Release v0.12.0, see release notes here
August 5, 2021: Release v0.12.0b6, see release notes here
August 3, 2021: Release v0.12.0b5, see release notes here
July 9, 2021: Release v0.12.0b4, see release notes here
June 25, 2021: Release v0.12.0b3, see release notes here
June 18, 2021: Release v0.12.0b2, see release notes here
June 7, 2021: Release v0.12.0b1, see release notes here
May 18, 2021: Release v0.11.1, see release notes here
May 8, 2021: Release v0.11.0, see release notes here
March 22, 2021: Release v0.10.0, see release notes here
March 11, 2021: Release v0.9.2, see release notes here
March 3, 2021: Release v0.9.1, see release notes here
February 20, 2021: Release v0.9.0, see release notes here
January 20, 2021: Release v0.9.0b1, see release notes here
November 20, 2020: Release v0.8.1, see release notes here
November 18, 2020: Release v0.8.0, see release notes here
September 4, 2020: Release v0.8.0b1, see release notes here
March 6, 2020: Release v0.7.0, see release notes here
February 18, 2020: Release v0.7.0b1, see release notes here
January 10, 2020: Release v0.6.1, see release notes here
December 6, 2019: Release v0.6, see release notes here
November 21, 2019: Release v0.5, see release notes here.
June 3, 2019: Release v0.4, see release notes here.
May 3, 2019: Release v0.3, see release notes here.
April 10, 2019: Release v0.2, see release notes here.
March 6, 2019: Release v0.1. Feel free to try it out and provide feedback.
</details>

Getting Started
Installation
Install the latest release from PyPI:
pip install econml
To install from source, see the For Developers section below.
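One way to confirm that the installation succeeded is to ask Python which version of the package it can see. The snippet below is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of EconML.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string for `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("econml")
print("econml is not installed" if version is None else f"econml {version}")
```

If the message reports that the package is missing, the `pip install` most likely ran against a different Python environment than the one executing the check.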
Usage Examples
Estimation Methods
<details>
<summary>Double Machine Learning (aka RLearner) (click to expand)</summary>

- Linear final stage
from econml.dml import LinearDML
from sklearn.linear_model import LassoCV
from econml.inference import BootstrapInference
### Y (outcome), T (treatment), X (features), W (controls) and X_test are assumed to be defined by the caller
est = LinearDML(model_y=LassoCV(), model_t=LassoCV())
### Estimate with OLS confidence intervals
est.fit(Y, T, X=X, W=W) # W -> high-dimensional confounders, X -> features
treatment_effects = est.effect(X_test)
lb, ub = est.effect_interval(X_test, alpha=0.05) # OLS confidence intervals
### Estimate with bootstrap confidence intervals
est.fit(Y, T, X=X, W=W, inference='bootstrap') # with default bootstrap parameters
est.fit(Y, T, X=X, W=W, inference=BootstrapInference(n_bootstrap_samples=100)) # or customized
lb, ub = est.effect_interval(X_test, alpha=0.05) # Bootstrap confidence intervals
- Sparse linear final stage
from econml.dml import SparseLinearDML
from sklearn.linear_model import LassoCV
est = SparseLinearDML(model_y=LassoCV(), model_t=LassoCV())
est.fit(Y, T, X=X, W=W) # X -> high dimensional features
treatment_effects = est.effect(X_test)
lb, ub = est.effect_interval(X_test, alpha=0.05) # Confidence intervals via debiased lasso
- Generic Machine Learning last stage
from econml.dml import NonParamDML
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
est = NonParamDML(model_y=RandomForestRegressor(),
                  model_t=RandomForestClassifier(),
                  model_final=RandomForestRegressor(),
                  discrete_treatment=True)
est.fit(Y, T, X=X, W=W)
treatment_effects = est.effect(X_test)
</details>
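The residual-on-residual idea behind the Double Machine Learning estimators above can be illustrated without EconML. What follows is a toy sketch, not the library's implementation: it assumes a single confounder W, linear nuisance models, a constant treatment effect, and two-fold cross-fitting, all implemented with the standard library. The names `fit_line` and `residual_on_residual` and the simulated data are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope for a single regressor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def residual_on_residual(Y, T, W, n_folds=2):
    """Cross-fitted partialling-out estimate of a constant effect of T on Y."""
    n = len(Y)
    y_res, t_res = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance models (Y on W, T on W) are fit on the other fold(s) only.
        ay, by = fit_line([W[i] for i in train], [Y[i] for i in train])
        at, bt = fit_line([W[i] for i in train], [T[i] for i in train])
        for i in test:
            y_res[i] = Y[i] - (ay + by * W[i])
            t_res[i] = T[i] - (at + bt * W[i])
    # Final stage: regress outcome residuals on treatment residuals (no intercept).
    return sum(a * b for a, b in zip(y_res, t_res)) / sum(b * b for b in t_res)

# Simulated data with an assumed true effect of 2.0; W confounds T and Y.
rng = random.Random(0)
W = [rng.gauss(0, 1) for _ in range(5000)]
T = [0.7 * w + rng.gauss(0, 1) for w in W]
Y = [2.0 * t + 3.0 * w + rng.gauss(0, 1) for t, w in zip(T, W)]

theta_hat = residual_on_residual(Y, T, W)
naive_slope = fit_line(T, Y)[1]   # ignores W, so it absorbs the confounding
print(f"naive slope: {naive_slope:.2f}, residual-on-residual: {theta_hat:.2f}")
```

The estimators shown in the snippets above go further than this sketch: they accept arbitrary scikit-learn style models for `model_y` and `model_t`, let the effect vary with X, and provide confidence intervals.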
<details>
<summary>Dynamic Double 