DSAN 5100: Probabilistic Modeling and Statistical Computing
Section 01
Tuesday, September 23, 2025
\[ \DeclareMathOperator*{\argmax}{argmax} \DeclareMathOperator*{\argmin}{argmin} \newcommand{\bigexp}[1]{\exp\mkern-4mu\left[ #1 \right]} \newcommand{\bigexpect}[1]{\mathbb{E}\mkern-4mu \left[ #1 \right]} \newcommand{\convergesAS}{\overset{\text{a.s.}}{\longrightarrow}} \newcommand{\definedas}{\overset{\text{def}}{=}} \newcommand{\definedalign}{\overset{\phantom{\text{def}}}{=}} \newcommand{\eqeventual}{\overset{\mathclap{\text{\small{eventually}}}}{=}} \newcommand{\Err}{\text{Err}} \newcommand{\expect}[1]{\mathbb{E}[#1]} \newcommand{\expectsq}[1]{\mathbb{E}^2[#1]} \newcommand{\fw}[1]{\texttt{#1}} \newcommand{\given}{\mid} \newcommand{\green}[1]{\color{green}{#1}} \newcommand{\heads}{\outcome{heads}} \newcommand{\iid}{\overset{\text{\small{iid}}}{\sim}} \newcommand{\lik}{\mathcal{L}} \newcommand{\loglik}{\ell} \newcommand{\mle}{\textsf{ML}} \newcommand{\nimplies}{\;\not\!\!\!\!\implies} \newcommand{\orange}[1]{\color{orange}{#1}} \newcommand{\outcome}[1]{\textsf{#1}} \newcommand{\param}[1]{{\color{purple} #1}} \newcommand{\pgsamplespace}{\{\green{1},\green{2},\green{3},\purp{4},\purp{5},\purp{6}\}} \newcommand{\prob}[1]{P\left( #1 \right)} \newcommand{\purp}[1]{\color{purple}{#1}} \newcommand{\sign}{\text{Sign}} \newcommand{\spacecap}{\; \cap \;} \newcommand{\spacewedge}{\; \wedge \;} \newcommand{\tails}{\outcome{tails}} \newcommand{\Var}[1]{\text{Var}[#1]} \newcommand{\bigVar}[1]{\text{Var}\mkern-4mu \left[ #1 \right]} \]
| | \(H = 0\) | \(H = 1\) |
|---|---|---|
| \(G = 10\) | 10 | 5 |
| \(G = 11\) | 6 | 4 |
| \(G = 12\) | 7 | 1 |
Q1, for example, asks for \(\Pr(G = 11, H = 1)\), a question we can answer once we know the joint distribution \(f_{G,H}(g, h)\).
Using our naïve definition of probability, we can compute this value from the frequencies in the table:
\[ \Pr(G = 11, H = 1) = \frac{\#[G = 11, H = 1]}{\#\text{ Students Total}} \]
Plugging in the values from Table 1, we obtain the answer:
\[ \Pr(G = 11, H = 1) = \frac{4}{33} \approx 0.121 \]
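The same calculation can be sketched in a few lines of Python. This is only an illustration: the `counts` dictionary and the `joint_prob` helper are names made up here, and the numbers are the counts from Table 1.

```python
from fractions import Fraction

# Counts from Table 1, keyed by (g, h). The dictionary layout is an
# assumption made for this sketch, not something the slides prescribe.
counts = {
    (10, 0): 10, (10, 1): 5,
    (11, 0): 6,  (11, 1): 4,
    (12, 0): 7,  (12, 1): 1,
}

def joint_prob(g, h):
    """Naive probability: students matching (g, h) divided by all students."""
    total = sum(counts.values())
    return Fraction(counts[(g, h)], total)

p = joint_prob(11, 1)
print(p, float(p))  # 4/33, roughly 0.121
```

Using `Fraction` keeps the result exact, so it can be compared directly with the fractions shown in the tables.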
We could compute the total by summing columns, then summing over our individual column totals to get 33:
| | \(H = 0\) | \(H = 1\) | Total |
|---|---|---|---|
| \(G = 10\) | 10 | 5 | |
| \(G = 11\) | 6 | 4 | |
| \(G = 12\) | 7 | 1 | |
| Total | 23 | 10 | 33 |

Or, we could compute the total by summing rows, then summing over our individual row totals to get 33:
| | \(H = 0\) | \(H = 1\) | Total |
|---|---|---|---|
| \(G = 10\) | 10 | 5 | 15 |
| \(G = 11\) | 6 | 4 | 10 |
| \(G = 12\) | 7 | 1 | 8 |
| Total | | | 33 |

Either way, the completed table contains both sets of totals:
| | \(H = 0\) | \(H = 1\) | Total |
|---|---|---|---|
| \(G = 10\) | 10 | 5 | 15 |
| \(G = 11\) | 6 | 4 | 10 |
| \(G = 12\) | 7 | 1 | 8 |
| Total | 23 | 10 | 33 |

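As a quick check that both routes reach the same grand total, here is a minimal sketch using plain Python lists. The nested-list layout (rows for \(G\), columns for \(H\)) and the variable names are assumptions chosen for illustration.

```python
# Rows are G = 10, 11, 12; columns are H = 0, 1 (counts from Table 1).
table = [
    [10, 5],
    [6, 4],
    [7, 1],
]

row_totals = [sum(row) for row in table]        # one total per value of G
col_totals = [sum(col) for col in zip(*table)]  # one total per value of H

# Summing the row totals or the column totals gives the same grand total.
assert sum(row_totals) == sum(col_totals)
grand_total = sum(row_totals)

print(row_totals, col_totals, grand_total)  # [15, 10, 8] [23, 10] 33
```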
Now let’s use the overall total (33) to convert counts into probabilities:
| | \(H = 0\) | \(H = 1\) | Total |
|---|---|---|---|
| \(G = 10\) | \(\frac{10}{33}\) | \(\frac{5}{33}\) | \(\frac{15}{33}\) |
| \(G = 11\) | \(\frac{6}{33}\) | \(\frac{4}{33}\) | \(\frac{10}{33}\) |
| \(G = 12\) | \(\frac{7}{33}\) | \(\frac{1}{33}\) | \(\frac{8}{33}\) |
| Total | \(\frac{23}{33}\) | \(\frac{10}{33}\) | \(\frac{33}{33}\) |

\[ \Pr(A) = \underset{\mathclap{\small \text{Probability }\textbf{mass}}}{\boxed{\frac{|\{A\}|}{|\Omega|}}} = \frac{1}{|\{A,B,C,D\}|} = \frac{1}{4} \]
\[ \Pr(A) = \underset{\mathclap{\small \text{Probability }\textbf{density}}}{\boxed{\frac{\text{Area}(\{A\})}{\text{Area}(\Omega)}}} = \frac{\pi r^2}{s^2} = \frac{\pi \left(\frac{1}{4}\right)^2}{4} = \frac{\pi}{64} \]
Now that we have normalized counts, different pieces of this table give different probability distributions:
Joint Distribution \(f_{G,H}(g, h)\): Look at value in row \(g\), col \(h\)
Two Marginal Distributions
\(f_G(g)\): Look at total for row \(g\)
\(f_H(h)\): Look at total for column \(h\)
| | \(H = 0\) | \(H = 1\) | Total |
|---|---|---|---|
| \(G = 10\) | \(\frac{10}{33}\) | \(\frac{5}{33}\) | \(\frac{15}{33}\) |
| \(G = 11\) | \(\frac{6}{33}\) | \(\frac{4}{33}\) | \(\frac{10}{33}\) |
| \(G = 12\) | \(\frac{7}{33}\) | \(\frac{1}{33}\) | \(\frac{8}{33}\) |
| Total | \(\frac{23}{33}\) | \(\frac{10}{33}\) | \(\frac{33}{33}\) |

\[ \begin{align*} \Pr(H = 0, G = 10) + \Pr(H = 0, G = 11) + \Pr(H = 0, G = 12) &= \Pr(H = 0) \\[0.6em] \Pr(H = 1, G = 10) + \Pr(H = 1, G = 11) + \Pr(H = 1, G = 12) &= \Pr(H = 1) \end{align*} \]
\[ \begin{align*} \Pr(G = 10, H = 0) + \Pr(G = 10, H = 1) &= \Pr(G = 10) \\[0.6em] \Pr(G = 11, H = 0) + \Pr(G = 11, H = 1) &= \Pr(G = 11) \\[0.6em] \Pr(G = 12, H = 0) + \Pr(G = 12, H = 1) &= \Pr(G = 12) \end{align*} \]
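These sums can be checked numerically. The sketch below assumes the Table 1 counts are stored in a dictionary keyed by \((g, h)\). The names `joint`, `marginal_g`, and `marginal_h` are illustrative only.

```python
from fractions import Fraction

g_values, h_values = [10, 11, 12], [0, 1]

# Counts from Table 1, keyed by (g, h).
counts = {
    (10, 0): 10, (10, 1): 5,
    (11, 0): 6,  (11, 1): 4,
    (12, 0): 7,  (12, 1): 1,
}
total = sum(counts.values())

# Joint distribution f_{G,H}(g, h): each count divided by the grand total.
joint = {gh: Fraction(n, total) for gh, n in counts.items()}

# Marginals: sum the joint over the variable we want to remove.
marginal_g = {g: sum(joint[(g, h)] for h in h_values) for g in g_values}
marginal_h = {h: sum(joint[(g, h)] for g in g_values) for h in h_values}

print(marginal_h[1])   # 10/33
print(marginal_g[10])  # 5/11, i.e. 15/33 in lowest terms
```

Note that `Fraction` reduces to lowest terms automatically, so \(\frac{15}{33}\) prints as `5/11`.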
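Joint and marginal probabilities are implicitly conditional on the whole sample space \(\Omega\), so their denominator is the grand total. A conditional probability swaps in a smaller denominator:
\[ \begin{align*} \Pr(G = 10, H = 1) &= \Pr(G = 10, H = 1 \mid \Omega) \\[0.6em] &= \frac{\#(G = 10, H = 1, \Omega)}{\#\text{ Total }(\Omega)\text{ ✅}} \end{align*} \]
\[ \begin{align*} \Pr(G = 10) &= \Pr(G = 10 \mid \Omega) \\[0.6em] &= \frac{\#(G = 10, \Omega)}{\#\text{ Total }(\Omega)\text{ ✅}} \end{align*} \]
\[ \begin{align*} \Pr(G = 10 \mid H = 1) &= \frac{\Pr(G = 10, H = 1)}{\Pr(H = 1)} \\[0.6em] &= \frac{\#(G = 10, H = 1)}{\#(H = 1)\text{ 😳}} \end{align*} \]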
Let’s extract just the \(H = 1\) column:
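| | \(H = 1\) |
|---|---|
| \(G = 10\) | 5 |
| \(G = 11\) | 4 |
| \(G = 12\) | 1 |
| Total | 10 |

Dividing each count by the column total (10) gives the conditional distribution of \(G\) given \(H = 1\):
| | \(H = 1\) |
|---|---|
| \(G = 10\) | \(\frac{5}{10}\) |
| \(G = 11\) | \(\frac{4}{10}\) |
| \(G = 12\) | \(\frac{1}{10}\) |
| Total | \(\frac{10}{10}\) |
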
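A minimal sketch of this column-wise normalization, assuming the \(H = 1\) counts from Table 1 are held in a dictionary keyed by the value of \(G\) (the names `h1_column` and `g_given_h1` are made up for this example):

```python
from fractions import Fraction

# The H = 1 column of Table 1: counts for each value of G.
h1_column = {10: 5, 11: 4, 12: 1}

# Conditioning on H = 1 shrinks the denominator from the grand total (33)
# to the column total (10).
column_total = sum(h1_column.values())
g_given_h1 = {g: Fraction(n, column_total) for g, n in h1_column.items()}

print(g_given_h1[10])  # 1/2, i.e. 5/10 in lowest terms
```

The resulting values sum to 1, which is what makes the normalized column a probability distribution in its own right.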
Let’s extract just the \(G = 10\) row:
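| | \(H = 0\) | \(H = 1\) | Total |
|---|---|---|---|
| \(G = 10\) | 10 | 5 | 15 |

Dividing each count by the row total (15) gives the conditional distribution of \(H\) given \(G = 10\):
| | \(H = 0\) | \(H = 1\) | Total |
|---|---|---|---|
| \(G = 10\) | \(\frac{10}{15}\) | \(\frac{5}{15}\) | \(\frac{15}{15}\) |
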
We now have the link between three types of distributions derived from our table:
Distribution Type | How Many? | Example Value |
---|---|---|
Joint Distribution | 1 | \(\Pr(G = 11, H = 1) = \frac{4}{33}\) |
Marginal Distributions | 2 | \(\Pr(H = 1) = \frac{10}{33}\) |
Conditional Distributions | 5 (2 for \(G\) given each value of \(H\), 3 for \(H\) given each value of \(G\)) | \(\Pr(G = 10 \mid H = 1) = \frac{5}{10}\) |
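
Dividing each column of counts by its column total gives the conditional distributions of \(G\) given each value of \(H\):
| | \(H = 0\) | \(H = 1\) |
|---|---|---|
| \(G = 10\) | \(\frac{10}{23}\) | \(\frac{5}{10}\) |
| \(G = 11\) | \(\frac{6}{23}\) | \(\frac{4}{10}\) |
| \(G = 12\) | \(\frac{7}{23}\) | \(\frac{1}{10}\) |
| Total | \(\frac{23}{23}\) | \(\frac{10}{10}\) |
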
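The same idea works in either direction: fix a column and normalize by its total, or fix a row and normalize by its total. The sketch below is one possible way to express that. The helper names `conditional_g_given_h` and `conditional_h_given_g` are hypothetical, and the counts are those of Table 1.

```python
from fractions import Fraction

# Counts from Table 1, keyed by (g, h).
counts = {
    (10, 0): 10, (10, 1): 5,
    (11, 0): 6,  (11, 1): 4,
    (12, 0): 7,  (12, 1): 1,
}

def conditional_g_given_h(h):
    """Distribution of G within the H = h column (divide by the column total)."""
    column = {g: n for (g, hh), n in counts.items() if hh == h}
    column_total = sum(column.values())
    return {g: Fraction(n, column_total) for g, n in column.items()}

def conditional_h_given_g(g):
    """Distribution of H within the G = g row (divide by the row total)."""
    row = {h: n for (gg, h), n in counts.items() if gg == g}
    row_total = sum(row.values())
    return {h: Fraction(n, row_total) for h, n in row.items()}

print(conditional_g_given_h(0)[10])  # 10/23
print(conditional_h_given_g(10)[1])  # 1/3, i.e. 5/15 in lowest terms
```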
\[ \Pr(A \mid B) = \frac{\Pr(A, B)}{\Pr(B)} \iff \Pr(A, B) = \Pr(A \mid B) \cdot \Pr(B) \]
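As a worked check of this identity with the values from the tables above, take \(A\) to be \(G = 10\) and \(B\) to be \(H = 1\) (this particular choice of cell is just an example):

\[ \Pr(G = 10, H = 1) = \Pr(G = 10 \mid H = 1) \cdot \Pr(H = 1) = \frac{5}{10} \cdot \frac{10}{33} = \frac{5}{33} \]

The conditional column total (10) cancels, leaving the count over the grand total, which matches the joint table entry for \(G = 10\), \(H = 1\).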
DSAN 5100 W05B: Joint, Marginal, Conditional Distributions