Week 12: Causal Forests for Heterogeneous Treatment Effects

DSAN 5650: Causal Inference for Computational Social Science
Summer 2026, Georgetown University

Jeff Jacobs

jj1088@georgetown.edu

Wednesday, August 5, 2026

Schedule

Today’s Planned Schedule:

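|         | Start  | End    | Topic                            |
|---------|--------|--------|----------------------------------|
| Lecture | 6:30pm | 6:45pm | Final Projects                   |
|         | 6:45pm | 7:10pm | Included Variable Bias           |
|         | 7:10pm | 8:00pm | Heterogeneous Treatment Effects  |
| Break!  | 8:00pm | 8:10pm |                                  |
|         | 8:10pm | 9:00pm | Causal Forests                   |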

Final Project / End of Term Things

  • Midterm grading will be done TONIGHT
  • Submit button for HW3 / HW4 TONIGHT

Sensitivity Analysis

  • So you’ve got an estimate of a causal effect! What now? Two possible next steps…
  • Sensitivity Analysis: Check robustness of your results
    • Modeling choices → one point in param space (garden of forking paths)
    • If results “truly” hold, shouldn’t disappear with small changes in parameters/choices
    • Last week: Can re-estimate with informative / weak / skeptical priors
    • This week: Simulate how “strong” omissions would have to be to “ruin” finding
  • Heterogeneous Treatment Effects: How does the effect vary btwn subgroups?
    • Example today: PROGRESA (Cash transfer poverty-alleviation program in Mexico)
    • Program has an overall causal effect, but also a greater causal effect on indigenous households, relative to non-indigenous
    • How can we find group-specific effects? Causal forests!
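
To make "group-specific effects" concrete before we get to forests, here is a minimal simulation sketch. Everything in it is made up for illustration (the variable names, the sample size, and the assumed effects of 1.0 and 2.0); it is not PROGRESA data. Treatment is randomized, so a difference in means within each subgroup estimates that subgroup's effect, while the pooled difference in means averages over the two.

```python
import numpy as np

rng = np.random.default_rng(5650)
n = 20_000

# Hypothetical binary subgroup indicator and randomized binary treatment
group = rng.integers(0, 2, size=n)   # 1 = subgroup of interest (assumed)
treat = rng.integers(0, 2, size=n)   # randomized, so no confounding by design

# Assumed true effects: 1.0 in group 0, 2.0 in group 1
true_effect = np.where(group == 1, 2.0, 1.0)
y = 0.5 * group + true_effect * treat + rng.normal(0, 1, size=n)

def diff_in_means(outcome, treatment):
    """Treated-minus-control difference in mean outcomes."""
    return outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Pooled estimate: a single overall average treatment effect
ate = diff_in_means(y, treat)

# Subgroup estimates: one conditional average treatment effect per group
cate = {g: diff_in_means(y[group == g], treat[group == g]) for g in (0, 1)}

print(f"Overall effect estimate: {ate:.2f}")
for g, est in cate.items():
    print(f"Group {g} effect estimate: {est:.2f}")
```

This only works because we told the code which subgroups to compare. The harder problem is searching over many covariates for subgroups whose effects differ, without knowing them in advance.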

Sensitivity Analysis 2: How Sensitive Are My Results to Omitted/Included Variable Bias?

  • You probably know about omitted variable bias from previous classes / intuition… but, included variable bias?
  • That’s right folks… colliders can be agents of chaos, laying waste to our best, most meticulously-planned-out models
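
A minimal sketch of how including a collider can create an effect out of nothing. The data-generating process is an assumption chosen for illustration: `x` and `y` are independent by construction, and the collider `c` is caused by both.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

x = rng.normal(size=n)
y = rng.normal(size=n)           # independent of x: the true effect of x on y is zero
c = x + y + rng.normal(size=n)   # collider: caused by both x and y

def ols(columns, outcome):
    """OLS coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(outcome)), *columns])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coef

b_without = ols([x], y)[1]   # slope on x, collider left out
b_with = ols([x, c], y)[1]   # slope on x, collider included as a "control"

print(f"Slope on x without the collider: {b_without:.3f}")
print(f"Slope on x with the collider:    {b_with:.3f}")
```

Under this setup the slope on `x` should sit near zero when `c` is left out, and move to a clearly negative value once `c` is included, even though `x` has no effect on `y` at all.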

Does Aging Cause Sadness?

Assessing the Impact of Omitted / Included Vars

  • Working example from Cinelli and Hazlett (2020): War in Darfur (2004-2020)
  • What is the causal impact of experiencing violence on support for a peace deal?
  • Researcher A models this scenario as:

\[ \textsf{PeaceIndex} = \hat{\tau}_{\text{res}} \textsf{DirectHarm} + \hat{\beta}_{\text{f},\text{res}}\textsf{Female} + \textsf{Village} \hat{\boldsymbol \beta}_{\text{v},\text{res}} + \mathbf{X} \hat{\boldsymbol \beta}_{\text{res}} + \hat{\varepsilon}_{\text{res}} \]

  • Researcher B instead prefers:

\[ \textsf{PeaceIndex} = \hat{\tau} \textsf{DirectHarm} + \hat{\beta}_{\text{f}}\textsf{Female} + \textsf{Village} \hat{\boldsymbol \beta}_{\text{v}} + \mathbf{X} \hat{\boldsymbol \beta} + \boxed{\hat{\gamma}\textsf{Center}} + \hat{\varepsilon}_{\text{full}} \]

“Our earlier estimate \(\hat{\tau}_{\text{res}}\) would differ from our target quantity \(\hat{\tau}\): but how badly? […] How strong would the confounder(s) have to be to change the estimates in such a way to affect the main conclusions of a study?” (Cinelli and Hazlett 2020)

The Classical OVB Equation

  • Remember that \(D\) is our treatment variable! (\(\mathbf{X}\) = covariates, \(Y\) = outcome)
  • In a perfect world, we would estimate \(Y = \hat{\tau}D + \mathbf{X} \hat{\beta} + \hat{\gamma}Z+ \hat{\varepsilon}_{\text{full}}\)
  • But, \(Z\) is unobserved \(\leadsto\) “restricted” model \(Y = \hat{\tau}_{\text{res}}D + \mathbf{X} \hat{\beta}_{\text{res}} + \hat{\varepsilon}_{\text{res}}\)
  • What is the “gap” between \(\hat{\tau}\) and \(\hat{\tau}_{\text{res}}\)? \(\text{OVB} = \hat{\tau}_{\text{res}} - \hat{\tau}\)

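\[ \begin{align*} \hat{\tau}_{\text{res}} &= \frac{ \text{Cov}[D^{\perp \mathbf{X}}, Y^{\perp \mathbf{X}}] }{ \text{Var}[D^{\perp \mathbf{X}}] } \\ &= \frac{ \text{Cov}[D^{\perp \mathbf{X}}, \hat{\tau}D^{\perp \mathbf{X}} + \hat{\gamma}Z^{\perp \mathbf{X}}] }{ \text{Var}[D^{\perp \mathbf{X}}] } \\ &= \hat{\tau} + \hat{\gamma}\frac{ \text{Cov}[D^{\perp \mathbf{X}}, Z^{\perp \mathbf{X}}] }{ \text{Var}[D^{\perp \mathbf{X}}] } \\ &= \hat{\tau} + \hat{\gamma}\hat{\delta} \\ \implies \text{OVB} &= \hat{\tau}_{\text{res}} - \hat{\tau} = \overbrace{ \boxed{\hat{\gamma} \times \hat{\delta}} }^{\mathclap{\text{Impact} \, \times \, \text{Imbalance}}} \end{align*} \]

  • Notation: \(V^{\perp \mathbf{X}}\) = \(V\) with \(\mathbf{X}\) partialled out (the residual from regressing \(V\) on \(\mathbf{X}\))
  • Line 2 substitutes the full model for \(Y^{\perp \mathbf{X}}\); the residual term drops out because \(\hat{\varepsilon}_{\text{full}}\) is orthogonal to \(D^{\perp \mathbf{X}}\)
  • \(\hat{\delta}\) = coefficient on \(D\) when regressing \(Z\) on \(D\) and \(\mathbf{X}\): how imbalanced \(Z\) is across treatment levels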
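
The OVB decomposition is an exact in-sample identity, not just an approximation, which we can check numerically. The sketch below uses simulated data with made-up coefficients (every number in the data-generating process is an assumption for illustration). It pretends we can see \(Z\), so that we can compare the directly computed gap \(\hat{\tau}_{\text{res}} - \hat{\tau}\) against \(\hat{\gamma} \times \hat{\delta}\), where \(\hat{\delta}\) is computed from \(D\) and \(Z\) after each is residualized on \(\mathbf{X}\).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000

# Assumed data-generating process: z confounds the d -> y relationship
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)
d = 0.8 * z + 0.3 * x + rng.normal(size=n)
y = 1.0 * d + 2.0 * z + 0.5 * x + rng.normal(size=n)

def ols(columns, outcome):
    """OLS coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(outcome)), *columns])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coef

def residualize(v, covariate):
    """Residual of v after regressing it on an intercept and the covariate."""
    X = np.column_stack([np.ones(len(v)), covariate])
    coef, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ coef

# Full model (z included) and restricted model (z omitted)
tau_full, gamma_hat = ols([d, z, x], y)[1:3]
tau_res = ols([d, x], y)[1]

# Imbalance: Cov / Var of the residualized d and z
d_perp = residualize(d, x)
z_perp = residualize(z, x)
delta_hat = (d_perp @ z_perp) / (d_perp @ d_perp)

ovb_direct = tau_res - tau_full
ovb_formula = gamma_hat * delta_hat

print(f"tau_res - tau_full: {ovb_direct:.6f}")
print(f"gamma * delta:      {ovb_formula:.6f}")
```

The two printed quantities should agree up to floating-point error. In practice \(Z\) is unobserved, so neither \(\hat{\gamma}\) nor \(\hat{\delta}\) can be computed; the point of the identity is that it tells us which two quantities we would have to reason about.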

More Generally

  • What if we don’t have a specific omitted variable in mind? We just want to know the expected impact if there were omitted vars… Enter “Partial \(R^2\)”
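
A sketch of why partial \(R^2\) is a useful scale here, assuming the reparameterization in Cinelli and Hazlett (2020): the absolute bias can be written using only the restricted model's standard error and degrees of freedom, plus two partial \(R^2\) values describing the omitted variable (how much residual variation in \(Y\) it explains, and how much residual variation in \(D\) it explains). The simulated data and all coefficients below are assumptions for illustration, and the code pretends \(Z\) is observed purely so the formula can be checked against the directly computed bias.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2_000

# Assumed data-generating process with a confounder z
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)
d = 0.8 * z + 0.3 * x + rng.normal(size=n)
y = 1.0 * d + 2.0 * z + 0.5 * x + rng.normal(size=n)

def fit(columns, outcome):
    """OLS with intercept; returns (coefficients, residual sum of squares)."""
    X = np.column_stack([np.ones(len(outcome)), *columns])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    resid = outcome - X @ coef
    return coef, float(resid @ resid)

coef_full, rss_y_full = fit([d, x, z], y)   # full model for y
coef_res, rss_y_res = fit([d, x], y)        # restricted model for y
_, rss_d_x = fit([x], d)                    # d on x alone
_, rss_d_xz = fit([x, z], d)                # d on x and z

# Partial R^2 of z with the outcome (given d, x) and with the treatment (given x)
r2_yz = 1 - rss_y_full / rss_y_res
r2_dz = 1 - rss_d_xz / rss_d_x

# Classical standard error of the restricted treatment coefficient
df = n - 3                                  # intercept, d, x
se_res = np.sqrt((rss_y_res / df) / rss_d_x)

bias_from_r2 = se_res * np.sqrt(r2_yz * r2_dz / (1 - r2_dz) * df)
bias_direct = abs(coef_res[1] - coef_full[1])

print(f"|bias| from partial R^2: {bias_from_r2:.6f}")
print(f"|bias| computed directly: {bias_direct:.6f}")
```

With no particular omitted variable in mind, the same expression can be evaluated over a grid of hypothetical values for the two partial \(R^2\) terms to see how strong an omission would have to be before the finding changes.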

Heterogeneous Treatment Effects

PROGRESA

References

Cinelli, Carlos, and Chad Hazlett. 2020. “Making Sense of Sensitivity: Extending Omitted Variable Bias.” Journal of the Royal Statistical Society Series B: Statistical Methodology 82 (1): 39–67. https://doi.org/10.1111/rssb.12348.