Difference-in-Differences with Bad Controls

A Practical Guide

Authors and affiliations

Carolina Caetano, University of Georgia
Brantly Callaway, University of Georgia
Stroud Payne, Vanderbilt University
Hugo Sant’Anna, University of Alabama at Birmingham

What is the bad controls problem?

In difference-in-differences (DiD), researchers often condition on time-varying covariates to make parallel trends more plausible. But if treatment itself affects a covariate, conditioning on its observed post-treatment value introduces post-treatment selection bias. Such a covariate is called a bad control.
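
One way to make this precise is sketched below in potential-outcome notation. The symbols \(D\) (treatment-group indicator), \(X_t(1)\) and \(X_t(0)\) (treated and untreated potential covariates), and \(\Delta Y_t(0)\) (change in the untreated potential outcome) are introduced here for illustration. The conditioning set is simplified and may be smaller than the one used in the paper.

```latex
% Observed post-treatment covariate: a mix of potential covariates
X_t = D \, X_t(1) + (1 - D) \, X_t(0)

% Parallel trends, conditional on the untreated potential covariate
\mathbb{E}[\Delta Y_t(0) \mid X_t(0), D = 1]
  = \mathbb{E}[\Delta Y_t(0) \mid X_t(0), D = 0]
```

For treated units the data reveal \(X_t(1)\), not \(X_t(0)\). Conditioning on the observed \(X_t\) therefore does not condition on the variable that appears in the assumption whenever treatment moves the covariate.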

This guide shows you how to detect and fix this problem using the badcontrols R package.

The core tension

Suppose you want to include a time-varying covariate \(X_t\) in your DiD model. Each way of handling it has a consequence:

| Approach | Result |
| --- | --- |
| Include \(X_t\) directly | Biased: post-treatment selection |
| Drop \(X_t\) entirely | Biased: may violate parallel trends |
| Use only \(X_{t-1}\) | Works under restrictive conditions |
| Impute \(X_t(0)\) | Correct: our proposed solution |
| Doubly Robust ML | Correct: semiparametric, robust |
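
The first two approaches above, and the idea behind imputing \(X_t(0)\), can be illustrated with a small base R simulation. This is a sketch under an assumed linear two-period design, not the package's estimator. The variable names (`D`, `dX0`, `dX`, `dY`), the effect sizes, and the "oracle" regression that uses the normally unobserved untreated covariate change are all illustrative assumptions.

```r
set.seed(1)
n <- 20000

# Treatment-group indicator
D <- rbinom(n, 1, 0.5)

# Untreated change in the covariate, X_t(0) - X_{t-1}.
# Its mean differs by group, so unconditional parallel trends fails.
dX0 <- rnorm(n, mean = 0.5 * D)

# Observed change in the covariate: treatment raises X_t by 1 (assumed),
# which is what makes X_t a bad control.
dX <- dX0 + D

# Change in the outcome: the untreated trend depends on dX0,
# and the true ATT is 2 (assumed).
att_true <- 2
dY <- 1 + dX0 + att_true * D + rnorm(n)

# Coefficient on D under three specifications
est <- c(
  drop_X      = unname(coef(lm(dY ~ D))["D"]),        # omit the covariate
  bad_control = unname(coef(lm(dY ~ D + dX))["D"]),   # observed covariate
  oracle_X0   = unname(coef(lm(dY ~ D + dX0))["D"])   # untreated covariate
)
print(round(est, 2))
```

In this design the population value of the coefficient on `D` is 2.5 when the covariate is dropped, 1 when the observed change is included, and 2 (the true ATT) when the untreated change is used. The last regression is infeasible with real data because \(X_t(0)\) is not observed for treated units, which is why it has to be imputed.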

Getting started

Installation

# Install devtools first if needed: install.packages("devtools")

# Dependency: install before badcontrols
devtools::install_github("bcallaway11/pte")

# Install the badcontrols package
devtools::install_github("hsantanna88/bad-controls")

# Optional: for the DR/ML estimator
install.packages("grf")
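
After installation, a quick check along the following lines can confirm which packages R can find. This is a minimal sketch: the package names `badcontrols` and `pte` are assumptions inferred from the `library()` call and the repository name, and may differ from the names the packages actually install under.

```r
# Package names are assumptions inferred from the repository names above
pkgs <- c("badcontrols", "pte", "grf")

# TRUE for each package that can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)
```

A `FALSE` entry for `grf` only matters if you plan to use the DR/ML estimator.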

Quick example

library(badcontrols)
set.seed(20240301)

# Simulate data with a known bad control
sim <- simulate_bad_controls(n = 2000, T_max = 4)

# Imputation approach
res <- bc_att_gt(
  yname = "Y", gname = "G", tname = "period", idname = "id",
  data = sim$data,
  bad_control_formula = ~X,
  xformla = ~Z + W,
  est_method = "imputation"
)
extract_att(res)

Guide structure

  1. The Problem: what goes wrong when you condition on a post-treatment covariate, with DAGs and intuition
  2. Estimation: our two proposed estimators, imputation and doubly robust ML
  3. Worked Example: a full step-by-step R walkthrough with simulated data, comparing all methods
  4. Application: wage scars from job loss (NLSY79), with occupation as the bad control

Citation

Caetano, C., Callaway, B., Payne, S., and Sant’Anna, H. (2024). “Difference-in-Differences with Bad Controls.” arXiv:2405.10557.

@article{caetano2024bad,
  title={Difference-in-Differences with Bad Controls},
  author={Caetano, Carolina and Callaway, Brantly and Payne, Stroud and Sant'Anna, Hugo},
  year={2024},
  journal={arXiv preprint arXiv:2405.10557}
}