HonestDiD

Robust inference in difference-in-differences and event study designs


The HonestDiD R package implements the tools for robust inference and sensitivity analysis for differences-in-differences and event study designs developed in Rambachan and Roth (2022). There is also an HonestDiD Stata package, and a Shiny app developed by Chengcheng Fang.

Background

The robust inference approach in Rambachan and Roth formalizes the intuition that pre-trends are informative about violations of parallel trends. They provide a few different ways of formalizing what this means.

Bounds on relative magnitudes. One way of formalizing this idea is to say that the violations of parallel trends in the post-treatment period cannot be much bigger than those in the pre-treatment period. This can be formalized by imposing that the post-treatment violation of parallel trends is no more than some constant <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> times the maximum violation of parallel trends in the pre-treatment period. Setting <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> = 1, for instance, imposes that the post-treatment violation of parallel trends is no larger than the worst pre-treatment violation of parallel trends (between consecutive periods). Likewise, setting <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> = 2 imposes that the post-treatment violation of parallel trends is no more than twice the worst pre-treatment violation.
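For readers who prefer a formula, the relative magnitudes restriction can be written as a set of allowed differences in trends. The notation here follows Rambachan and Roth and is not defined elsewhere in this README: it assumes that δ_t denotes the difference in trends in period t, with t = 0 the reference (last pre-treatment) period and δ_0 = 0.

```latex
\Delta^{RM}(\bar{M}) = \left\{ \delta \;:\; \forall t \geq 0,\;
  \left| \delta_{t+1} - \delta_{t} \right| \;\leq\;
  \bar{M} \cdot \max_{s < 0} \left| \delta_{s+1} - \delta_{s} \right| \right\}
```

The right-hand side is the largest change in the difference in trends between consecutive pre-treatment periods, scaled by the chosen constant.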

Smoothness restrictions. A second way of formalizing this is to say that the post-treatment violations of parallel trends cannot deviate too much from a linear extrapolation of the pre-trend. In particular, we can impose that the slope of the pre-trend can change by no more than M across consecutive periods, as shown in the figure below for an example with three periods.

<figure> <img src="deltaSD.png" alt="diagram-smoothness-restriction" /> <figcaption aria-hidden="true">diagram-smoothness-restriction</figcaption> </figure>

Thus, imposing a smoothness restriction with M = 0 implies that the counterfactual difference in trends is exactly linear, whereas larger values of M allow for more non-linearity.
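To make the restriction concrete, here is a minimal base R sketch that checks whether a given path of differences in trends satisfies the smoothness bound. The helper `satisfies_smoothness()` and the example vectors are hypothetical illustrations and are not part of the HonestDiD package.

```r
# Hypothetical helper (not part of HonestDiD): checks whether a vector of
# differences in trends, ordered by period, satisfies the smoothness bound M.
satisfies_smoothness <- function(delta, M, tol = 1e-12) {
  # Second differences measure how much the slope changes between
  # consecutive periods: (delta[t+1] - delta[t]) - (delta[t] - delta[t-1]).
  slope_changes <- diff(delta, differences = 2)
  all(abs(slope_changes) <= M + tol)
}

linear_trend <- c(-2, -1, 0, 1, 2)  # constant slope in every period
kinked_trend <- c(-2, -1, 0, 1, 3)  # slope increases by 1 in the last period

satisfies_smoothness(linear_trend, M = 0)  # TRUE: exactly linear
satisfies_smoothness(kinked_trend, M = 0)  # FALSE: the slope changes
satisfies_smoothness(kinked_trend, M = 1)  # TRUE: the slope changes by at most 1
```

The package itself does not check candidate paths one at a time. It constructs confidence intervals that are valid for every path satisfying the restriction.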

Other restrictions. The Rambachan and Roth framework allows for a variety of other restrictions on the differences in trends as well. For example, the smoothness restrictions and relative magnitudes ideas can be combined to impose that the non-linearity in the post-treatment period is no more than <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> times larger than that in the pre-treatment periods. The researcher can also impose monotonicity or sign restrictions on the differences in trends.

Robust confidence intervals. Given restrictions of the type described above, Rambachan and Roth provide methods for creating robust confidence intervals that are guaranteed to include the true parameter at least 95% of the time when the imposed restrictions are satisfied. These confidence intervals account for the fact that there is estimation error both in the treatment effect estimates and in our estimates of the pre-trends.

Sensitivity analysis. The approach described above naturally lends itself to sensitivity analysis. That is, the researcher can report confidence intervals under different assumptions about how bad the post-treatment violation of parallel trends can be (e.g., different values of <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> or M). They can also report the “breakdown value” of <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> (or M) for a particular conclusion, e.g., the largest value of <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> for which the effect is still significant.

Package installation

The package may be installed by using the function install_github() from the remotes package:

## Installation

# Install remotes package if not installed
install.packages("remotes")

# Turn off warning-error-conversion, because the tiniest warning stops installation
Sys.setenv("R_REMOTES_NO_ERRORS_FROM_WARNINGS" = "true")

# install from github
remotes::install_github("asheshrambachan/HonestDiD")

Example usage – Medicaid expansions

As an illustration of the package, we will examine the effects of Medicaid expansions on insurance coverage using publicly-available data derived from the ACS. We first load the data and packages relevant for the analysis.

#Install here, dplyr, did, haven, ggplot2, fixest packages from CRAN if not yet installed
#install.packages(c("here", "dplyr", "did", "haven", "ggplot2", "fixest"))

library(here)
library(dplyr)
library(did)
library(haven)
library(ggplot2)
library(fixest)
library(HonestDiD)

df <- read_dta("https://raw.githubusercontent.com/Mixtape-Sessions/Advanced-DID/main/Exercises/Data/ehec_data.dta")
head(df,5)
## # A tibble: 5 × 5
##   stfips      year         dins yexp2      W
##   <dbl+lbl>   <dbl+lbl>   <dbl> <dbl>  <dbl>
## 1 1 [alabama] 2008 [2008] 0.681    NA 613156
## 2 1 [alabama] 2009 [2009] 0.658    NA 613156
## 3 1 [alabama] 2010 [2010] 0.631    NA 613156
## 4 1 [alabama] 2011 [2011] 0.656    NA 613156
## 5 1 [alabama] 2012 [2012] 0.671    NA 613156

The data is a state-level panel with information on health insurance coverage and Medicaid expansion. The variable dins shows the share of low-income childless adults with health insurance in the state. The variable yexp2 gives the year that a state expanded Medicaid coverage under the Affordable Care Act, and is missing if the state never expanded.

Estimate the baseline DiD

For simplicity, we will first focus on assessing sensitivity to violations of parallel trends in a non-staggered DiD (see below regarding methods for staggered timing). We therefore restrict the sample to the years 2015 and earlier, and drop the small number of states that are first treated in 2015. We are now left with a panel dataset where some units are first treated in 2014 and the remaining units are not treated during the sample period. We can then estimate the effects of Medicaid expansion using a canonical two-way fixed effects event-study specification,

<img src= "https://latex.codecogs.com/svg.image?Y_%7Bit%7D%20=%20%5Calpha_i%20&plus;%20%5Clambda_t%20&plus;%20%5Csum_%7Bs%20%5Cneq%202013%7D%201%5Bs=t%5D%20%5Ctimes%20D_i%20%5Ctimes%20%5Cbeta_s%20&plus;%20u_%7Bit%7D%20" title = "TWFE" />

where D is 1 if a unit is first treated in 2014 and 0 otherwise.

df <- read_dta("https://raw.githubusercontent.com/Mixtape-Sessions/Advanced-DID/main/Exercises/Data/ehec_data.dta")

#Keep years before 2016. Drop the 2015 cohort
df_nonstaggered <- df %>% filter(year < 2016 &
                                 (is.na(yexp2)| yexp2 != 2015) )

#Create a treatment dummy
df_nonstaggered <- df_nonstaggered %>% mutate(D = case_when( yexp2 == 2014 ~ 1,
                                                             TRUE ~ 0))

#Run the TWFE spec
twfe_results <- fixest::feols(dins ~ i(year, D, ref = 2013) | stfips + year,
                        cluster = "stfips",
                        data = df_nonstaggered)


betahat <- summary(twfe_results)$coefficients #save the coefficients
sigma <- summary(twfe_results)$cov.scaled #save the covariance matrix


fixest::iplot(twfe_results)


Sensitivity analysis using relative magnitudes restrictions

We are now ready to apply the HonestDiD package to do sensitivity analysis. Suppose we’re interested in assessing the sensitivity of the estimate for 2014, the first year after treatment.
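The sensitivity functions need to know how many of the event-study coefficients are pre-treatment and how many are post-treatment. As a rough sketch of where those counts come from, the snippet below assumes the filtered sample covers 2008 through 2015 and uses 2013 as the omitted reference year, as in the specification above. The variable names `sample_years`, `reference_year`, and `event_years` are illustrative only.

```r
# Assumed sample years and reference period, consistent with the filtered
# data (year < 2016) and with ref = 2013 in the event-study specification.
sample_years <- 2008:2015
reference_year <- 2013

# The reference year has no coefficient, so it is excluded from the count.
event_years <- setdiff(sample_years, reference_year)

numPrePeriods <- sum(event_years < reference_year)   # coefficients for 2008-2012
numPostPeriods <- sum(event_years > reference_year)  # coefficients for 2014-2015

c(pre = numPrePeriods, post = numPostPeriods)
```

Under these assumptions the coefficient vector should have numPrePeriods + numPostPeriods entries, ordered from the earliest pre-treatment period to the latest post-treatment period.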

delta_rm_results <-
HonestDiD::createSensitivityResults_relativeMagnitudes(
                                    betahat = betahat, #coefficients
                                    sigma = sigma, #covariance matrix
                                    numPrePeriods = 5, #num. of pre-treatment coefs
                                    numPostPeriods = 2, #num. of post-treatment coefs
                                    Mbarvec = seq(0.5,2,by=0.5) #values of Mbar
                                    )

delta_rm_results
## # A tibble: 4 × 5
##         lb     ub method Delta    Mbar
##      <dbl>  <dbl> <chr>  <chr>   <dbl>
## 1  0.0241  0.0673 C-LF   DeltaRM   0.5
## 2  0.0171  0.0720 C-LF   DeltaRM   1  
## 3  0.00859 0.0796 C-LF   DeltaRM   1.5
## 4 -0.00107 0.0883 C-LF   DeltaRM   2

The output of the previous command shows a robust confidence interval for each value of <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" />. The “breakdown value” for a significant effect is approximately <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> = 2: the interval still excludes zero at <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> = 1.5 and only just includes it at <img src="https://latex.codecogs.com/svg.image?%5Cbar%7BM%7D" title="Mbar" /> = 2. The significant result is therefore robust to violations of parallel trends up to roughly twice as big as the maximum violation in the pre-treatment period.
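On a coarse grid the breakdown value can only be bracketed, not pinned down exactly. The sketch below shows one way to read the bracket off a results table. It copies the interval endpoints from the output above into a plain data frame so that it runs on its own, and the names `rm_results`, `largest_significant`, and `smallest_insignificant` are illustrative, not part of the package.

```r
# Robust confidence intervals copied from the printed output above.
rm_results <- data.frame(
  Mbar = c(0.5, 1, 1.5, 2),
  lb   = c(0.0241, 0.0171, 0.00859, -0.00107),
  ub   = c(0.0673, 0.0720, 0.0796, 0.0883)
)

# An effect is "significant" at a given Mbar if the interval excludes zero.
excludes_zero <- rm_results$lb > 0 | rm_results$ub < 0

# The breakdown value lies between the last grid point that excludes zero
# and the first grid point that does not.
largest_significant <- max(rm_results$Mbar[excludes_zero])
smallest_insignificant <- min(rm_results$Mbar[!excludes_zero])

c(largest_significant, smallest_insignificant)
```

A finer grid for Mbarvec would narrow the bracket. Note that this sketch assumes at least one grid point falls on each side of the breakdown value.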

We can also visualize the sensitivity analysis using the createSensitivityPlot_relativeMagnitudes function. To do this, we first calculate the CI for the original OLS estimates using the constructOriginalCS function. We then pass our sensitivity analysis and the original results to createSensitivityPlot_relativeMagnitudes.

originalResults <- HonestDiD::constructOriginalCS(betahat = betahat,
                                                  sigma = sigma,
                                                  numPrePeriods = 5,
                                                  numPostPeriods = 2)

HonestDiD::createSensitivityPlot_relativeMagnitudes(delta_rm_results, originalResults)

