DSAN 5650: Causal Inference for Computational Social Science
Summer 2025, Georgetown University
Wednesday, July 23, 2025
Today’s Planned Schedule:
| | Start | End | Topic |
|---|---|---|---|
| Lecture | 6:30pm | 6:45pm | Final Projects: Dependent vs. Independent Vars |
| | 6:45pm | 7:10pm | Instrumental Variables Lite |
| | 7:10pm | 8:00pm | Text-as-Data Part 1: TAD in General |
| Break! | 8:00pm | 8:10pm | |
| | 8:10pm | 9:00pm | Text-as-Data Part 2: Causal Text Analysis |
Independent Variable / Treatment \(D\)
Dependent Variable / Outcome \(Y\)
If randomization works to obtain causal effects…
…Find something random in the causal system, use e.g. matching to “force” the same scenario!
General form: \(\text{Effect}(D \rightarrow Y) = \frac{\text{Effect}(Z \rightarrow Y)}{\text{Effect}(Z \rightarrow D)}\) (Try “plugging in” \(Z\) = Coin Flip!)
\[ \beta_{\text{IV}}^{\text{Wald}} = \frac{ \mathbb{E}[Y_i \mid Z_i = 1] - \mathbb{E}[Y_i \mid Z_i = 0] }{ \mathbb{E}[D_i \mid Z_i = 1] - \mathbb{E}[D_i \mid Z_i = 0] }, \; \beta_{\text{IV}} = \frac{\text{Cov}[Y, Z]}{\text{Cov}[D,Z]} \]
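To make the ratio concrete, here is a minimal simulation sketch of "plugging in" a coin flip for \(Z\). Everything about the data-generating process (the confounder `u`, the coefficients, the sample size, the seed) is an assumption made up for illustration, not something taken from the lecture. The helper names `wald_iv` and `cov_iv` are likewise hypothetical.

```python
import numpy as np

def wald_iv(y, d, z):
    """Wald estimator: (difference in mean Y by Z) / (difference in mean D by Z)."""
    y, d, z = map(np.asarray, (y, d, z))
    reduced_form = y[z == 1].mean() - y[z == 0].mean()  # Effect(Z -> Y)
    first_stage = d[z == 1].mean() - d[z == 0].mean()   # Effect(Z -> D)
    return reduced_form / first_stage

def cov_iv(y, d, z):
    """Covariance form: Cov[Y, Z] / Cov[D, Z]."""
    return np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]

# Assumed (illustrative) data-generating process
rng = np.random.default_rng(5650)
n = 100_000
u = rng.normal(size=n)          # unobserved confounder affecting both D and Y
z = rng.integers(0, 2, size=n)  # the instrument: a fair coin flip
d = ((0.8 * z + 0.5 * u + rng.normal(size=n)) > 0.5).astype(int)  # treatment taken
true_effect = 2.0               # assumed constant effect of D on Y
y = true_effect * d + 1.5 * u + rng.normal(size=n)

naive = y[d == 1].mean() - y[d == 0].mean()  # confounded by u
beta_wald = wald_iv(y, d, z)
beta_cov = cov_iv(y, d, z)

print(f"naive difference in means: {naive:.3f}")
print(f"Wald IV estimate:          {beta_wald:.3f}")
print(f"Cov-ratio IV estimate:     {beta_cov:.3f}")
```

With a binary instrument the two formulas coincide numerically, since the sample covariance with a 0/1 variable is a rescaled difference in group means and the rescaling cancels in the ratio. In this simulated setup the naive comparison is pulled away from the assumed effect by `u`, while the IV ratio is not.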
(The necessity for sample splitting!)
| | \(Y_i \mid \textsf{do}(D_i \leftarrow 1)\) | \(Y_i \mid \textsf{do}(D_i \leftarrow 0)\) |
---|---|---|
Person 1 | Candidate’s Morals | Taxes |
Person 2 | Candidate’s Morals | Taxes |
Person 3 | Polarization | Immigration |
Person 4 | Polarization | Immigration |
| | Actual Assignment | Outcome \(Y_i\) |
---|---|---|
Person 1 | \(D_1 = 1\) | Morals |
Person 2 | \(D_2 = 1\) | Morals |
Person 3 | \(D_3 = 0\) | Immigration |
Person 4 | \(D_4 = 0\) | Immigration |
| | Actual Assignment | Outcome \(Y_i\) |
---|---|---|
Person 1 | \(D_1 = 1\) | Morals |
Person 2 | \(D_2 = 0\) | Taxes |
Person 3 | \(D_3 = 1\) | Polarization |
Person 4 | \(D_4 = 0\) | Immigration |
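The tables above can be reproduced with a few lines of Python. This is only a sketch: the potential outcomes and the two assignments are copied from the tables, while the helper `observe` and the variable names are hypothetical. It reveals \(Y_i\) for each person given \(D_i\), then collects which outcome labels actually appear in the data under each assignment.

```python
# Potential outcomes from the table: (Y under do(D <- 1), Y under do(D <- 0))
potential_outcomes = {
    "Person 1": ("Candidate's Morals", "Taxes"),
    "Person 2": ("Candidate's Morals", "Taxes"),
    "Person 3": ("Polarization", "Immigration"),
    "Person 4": ("Polarization", "Immigration"),
}

def observe(potential, assignment):
    """Reveal one potential outcome per person; the other one stays unobserved."""
    return {
        person: (y1 if assignment[person] == 1 else y0)
        for person, (y1, y0) in potential.items()
    }

# The two "Actual Assignment" tables
assignment_a = {"Person 1": 1, "Person 2": 1, "Person 3": 0, "Person 4": 0}
assignment_b = {"Person 1": 1, "Person 2": 0, "Person 3": 1, "Person 4": 0}

observed_a = observe(potential_outcomes, assignment_a)
observed_b = observe(potential_outcomes, assignment_b)

# Which outcome labels show up in the data at all?
labels_a = set(observed_a.values())
labels_b = set(observed_b.values())

for name, obs, labels in [("A", observed_a, labels_a), ("B", observed_b, labels_b)]:
    print(f"Assignment {name}: {obs}")
    print(f"  labels seen: {sorted(labels)}")
```

Under the first assignment only two of the four labels are ever observed, while under the second all four appear. One reading of the tables is that any set of categories discovered from the text depends on which assignment happened to be realized.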
Section | Keywords |
---|---|
U.S. News | state, court, federal, republican |
World News | government, country, officials, minister |
Arts | music, show, art, dance |
Sports | game, league, team, coach |
Real Estate | home, bedrooms, bathrooms, building |
| | Arts | Real Estate | Sports | U.S. News | World News |
---|---|---|---|---|---|
Correct | 3020 | 690 | 4860 | 1330 | 1730 |
Incorrect | 750 | 60 | 370 | 1100 | 590 |
Accuracy | 0.801 | 0.920 | 0.929 | 0.547 | 0.746 |
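As a quick check on the Accuracy row, each value is Correct / (Correct + Incorrect) for that section. The sketch below recomputes it from the counts in the table. The pooled `overall` figure is derived here from those same counts and does not appear in the source.

```python
# Counts copied from the table above
correct = {"Arts": 3020, "Real Estate": 690, "Sports": 4860,
           "U.S. News": 1330, "World News": 1730}
incorrect = {"Arts": 750, "Real Estate": 60, "Sports": 370,
             "U.S. News": 1100, "World News": 590}

# Per-section accuracy: share of documents in each section labeled correctly
accuracy = {
    section: correct[section] / (correct[section] + incorrect[section])
    for section in correct
}

# Pooled accuracy across all sections (derived, not reported in the table)
total_correct = sum(correct.values())
total_docs = total_correct + sum(incorrect.values())
overall = total_correct / total_docs

for section, acc in accuracy.items():
    print(f"{section:12s} {acc:.3f}")
print(f"{'Overall':12s} {overall:.3f}")
```

The per-section numbers make the spread easier to see: the sections differ a lot in how reliably they are labeled, which a single pooled accuracy would hide.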
From Blei (2012)
…Unlocks a world of social modeling through text!
Blaydes, Grimmer, and McQueen (2018)
From Barron et al. (2018)
DSAN 5650 Week 10: Text-as-Data