DSAN 5100: Probabilistic Modeling and Statistical Computing
Section 01
Monday, September 1, 2025
“Our Word is Our Weapon”: Text-Analyzing Wars of Ideas from the French Revolution to the First Intifada
\[ \leadsto F_g = G\frac{m_1m_2}{r^2} \]
\[ \text{Outcome}\left(\text{Dice Roll}\right) = \; ?\frac{?_1?_2}{?^2} \]
library(ggplot2)

# NOTE: cbPalette, g_linesize, g_pointsize, and dsan_theme() are not defined
# in this snippet; they are assumed to come from the course's shared setup code.

# Draw one circle per entry of `radii`, overlaid on the unit circle.
# (N is accepted for the caller's convenience but is not used inside the function.)
plot_circ_with_distr <- function(N, radii, ptitle, alpha = 0.1) {
  # Angles (in degrees) at which each circle is traced
  theta <- seq(0, 360, 4)
  # One row per (angle, radius) pair
  circ_df <- expand.grid(x = theta, y = radii)
  ggplot(circ_df, aes(x = x, y = y, group = y)) +
    geom_path(alpha = alpha, color = cbPalette[1], linewidth = g_linesize) +
    # Plot the full unit circle
    geom_path(data = data.frame(x = theta, y = 1), aes(x = x), linewidth = g_linesize) +
    # Mark the center of the circle
    geom_point(data = data.frame(x = 0, y = 0), aes(x = x), size = g_pointsize) +
    coord_polar(theta = "x", start = -pi / 2, direction = -1) +
    ylim(0, 1) +
    scale_x_continuous(limits = c(0, 360), breaks = NULL) +
    dsan_theme("quarter") +
    labs(
      title = ptitle,
      x = NULL,
      y = NULL
    ) +
    # Strip all axes, grid lines, and margins
    # See https://stackoverflow.com/a/19821839
    theme(
      axis.line = element_blank(),
      axis.text = element_blank(),
      axis.ticks = element_blank(),
      axis.title = element_blank(),
      panel.border = element_blank(),
      panel.grid.major = element_blank(),
      panel.grid.minor = element_blank(),
      plot.margin = unit(c(0, 0, 0, 0), "cm"),
      title = element_text(size = 18)
    )
}

# Draw N radii uniformly at random from [0, 1] and plot the resulting circles
N <- 500
radii <- runif(N, 0, 1)
title <- paste0(N, " Uniformly-Distributed Radii")
alpha <- 0.2
plot_circ_with_distr(N, radii, title, alpha)
\[ \Pr(X = 1) = \Pr(X = 2) = \cdots = \Pr(X = 6) = \frac{1}{6} \]
Working Definition: Independence
Two random variables \(X\) and \(Y\) are independent if learning information about \(X\) does not give you information about the value of \(Y\), or vice-versa.
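To make this working definition concrete, here is a minimal sketch that enumerates two fair dice. It assumes \(X\) is the first roll, \(Y\) is the second roll, and all 36 pairs are equally likely; the variable names are illustrative and not taken from the course code.

```r
# Enumerate all 36 equally likely outcomes of rolling two fair dice
# (assumption: x is the first roll, y is the second roll)
rolls <- expand.grid(x = 1:6, y = 1:6)

# Unconditional probability that Y = 6
p_y6 <- mean(rolls$y == 6)

# Probability that Y = 6 once we have learned that X = 1
p_y6_given_x1 <- mean(rolls$y[rolls$x == 1] == 6)

# If the two values match, learning X = 1 told us nothing about Y
print(c(p_y6 = p_y6, p_y6_given_x1 = p_y6_given_x1))
```

Both probabilities come out to \(1/6\), which is what the definition predicts for two separate rolls.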
Naïve Definition of Probability
Given a finite sample space \(S\) whose outcomes are all equally likely, and an event \(E \subseteq S\),
\[ \Pr(\underbrace{E}_{\text{event}}) = \frac{\text{\# Favorable Outcomes}}{\text{\# Possible Outcomes}} = \frac{|E|}{|S|} \]
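Example: flip a fair coin twice, so that \(S = \{TT, TH, HT, HH\}\). Let \(E_1 = \{HT\}\) and \(E_2 = \{TH, HT, HH\}\) (at least one heads):
\[ \begin{align*} \Pr(E_1) &= \frac{|\{HT\}|}{|S|} = \frac{|\{HT\}|}{|\{TT, TH, HT, HH\}|} = \frac{1}{4} \\ \Pr(E_2) &= \frac{|\{TH, HT, HH\}|}{|S|} = \frac{|\{TH, HT, HH\}|}{|\{TT, TH, HT, HH\}|} = \frac{3}{4} \end{align*} \]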
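The same counting can be sketched in R. The helper `naive_prob()` is a hypothetical name introduced here for illustration, and the sketch assumes the four outcomes are equally likely (two flips of a fair coin).

```r
# Sample space for two coin flips: every ordered pair of T/H
flips <- expand.grid(first = c("T", "H"), second = c("T", "H"),
                     stringsAsFactors = FALSE)
S <- paste0(flips$first, flips$second)

# Naive probability |E| / |S| (hypothetical helper; only valid when
# every outcome in S is equally likely)
naive_prob <- function(E, S) length(intersect(E, S)) / length(S)

E1 <- c("HT")              # the single outcome HT
E2 <- c("TH", "HT", "HH")  # at least one heads

p_E1 <- naive_prob(E1, S)
p_E2 <- naive_prob(E2, S)
print(c(p_E1 = p_E1, p_E2 = p_E2))
```

This reproduces \(\Pr(E_1) = 1/4\) and \(\Pr(E_2) = 3/4\) by counting outcomes directly.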
Example: Student Government vs. Student Sports
\[ \begin{align*} P_{n,k} = \frac{n!}{(n-k)!}, \; C_{n,k} = \frac{n!}{k!(n-k)!} \end{align*} \]
\[ C_{n,k} = \frac{P_{n,k}}{k!} \genfrac{}{}{0pt}{}{\leftarrow \text{Permutations}}{\leftarrow \text{Duplicate groups}} \]
Where does \(k!\) come from? (How many different orderings can we make of the same group?)
\(k = 2\): \((\underbrace{\boxed{\phantom{a}}}_{\text{2 choices}},\underbrace{\boxed{\phantom{a}}}_{\text{1 remaining choice}}) \implies 2\)
\(k = 3\): \((\underbrace{\boxed{\phantom{a}}}_{\text{3 choices}},\underbrace{\boxed{\phantom{a}}}_{\text{2 remaining choices}}, \underbrace{\boxed{\phantom{a}}}_{\text{1 remaining choice}}) \implies 6\)
\(k = 4\): \((\underbrace{\boxed{\phantom{a}}}_{\text{4 choices}}, \underbrace{\boxed{\phantom{a}}}_{\text{3 remaining choices}}, \underbrace{\boxed{\phantom{a}}}_{\text{2 remaining choices}}, \underbrace{\boxed{\phantom{a}}}_{\text{1 remaining choice}}) \implies 24\)
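The slot-counting argument above can be checked numerically. This is a minimal sketch using base R's `factorial()` and `choose()`; the function names `n_perm()` and `n_comb()` and the values \(n = 5\), \(k = 3\) are hypothetical, chosen only for illustration.

```r
# Number of ordered selections (permutations) of k objects out of n
n_perm <- function(n, k) factorial(n) / factorial(n - k)

# Number of unordered groups (combinations): divide out the k! orderings
# of each group
n_comb <- function(n, k) n_perm(n, k) / factorial(k)

# k! for k = 2, 3, 4, matching the slot-counting argument
orderings <- factorial(2:4)
print(orderings)

# Hypothetical example: pick k = 3 people out of n = 5
print(n_perm(5, 3))   # ordered selections
print(n_comb(5, 3))   # unordered groups
print(choose(5, 3))   # base R's built-in combination count, for comparison
```

Each group of 3 shows up \(3! = 6\) times among the ordered selections, which is why dividing by \(k!\) turns the permutation count into the combination count.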
Without Replacement | With Replacement |
---|---|
Most statistical problems: “Check off” objects as you collect data about them, so that each observation in your data is unique | Special (but important!) set of statistical problems: let objects appear in your sample multiple times, to “squeeze” more information out of the sample (called Bootstrapping—much more later in the course!) |
Example: From \(N = 3\) population, how many ways can we take samples of size \(k = 2\)?
Without Replacement | With Replacement |
---|---|
\(3 \cdot 2 = 6\) ways (3 objects to choose from for first element of sample, 2 remaining objects to choose from for second element of sample) | \(3\cdot 3 = 3^2 = 9\) ways (3 objects to choose from for first element of sample, still 3 objects to choose from for second element of sample) |
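For a population this small, the samples can simply be listed out. The sketch below assumes ordered samples (first element, second element) and uses hypothetical labels `"A"`, `"B"`, `"C"` for the three objects.

```r
population <- c("A", "B", "C")  # hypothetical labels for the N = 3 objects

# With replacement: every ordered pair, repeats allowed
with_repl <- expand.grid(first = population, second = population,
                         stringsAsFactors = FALSE)

# Without replacement: drop the pairs that reuse the same object
without_repl <- with_repl[with_repl$first != with_repl$second, ]

print(nrow(with_repl))     # number of samples with replacement
print(nrow(without_repl))  # number of samples without replacement
```

The only difference between the two lists is the three pairs that repeat an object (AA, BB, CC), which accounts for the gap between 9 and 6.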
General Case: From population of size \(N\), how many ways can we take samples of size \(k\)? (Try to extrapolate from above example before looking at answer!)
Without Replacement | With Replacement |
---|---|
\(\displaystyle \underbrace{N \cdot (N-1) \cdot \cdots \cdot (N - k + 1)}_{k\text{ times}} = \frac{N!}{(N - k)!}\) (This formula should look somewhat familiar…) | \(\displaystyle \underbrace{N \cdot N \cdot \cdots \cdot N}_{k\text{ times}} = N^k\) |
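Both general formulas translate directly into R. This is a minimal sketch; the function names and the larger test case (\(N = 10\), \(k = 4\)) are hypothetical and chosen only for illustration.

```r
# Ordered samples of size k from N objects, without replacement:
# the product N * (N - 1) * ... * (N - k + 1), which has k factors
count_without_repl <- function(N, k) prod(seq(N, by = -1, length.out = k))

# With replacement: N choices for each of the k draws
count_with_repl <- function(N, k) N^k

# The N = 3, k = 2 example from above
print(count_without_repl(3, 2))
print(count_with_repl(3, 2))

# The falling product agrees with N! / (N - k)! on a larger hypothetical case
print(all.equal(count_without_repl(10, 4), factorial(10) / factorial(6)))
```

Writing the without-replacement count as a product of \(k\) factors, rather than as a ratio of factorials, avoids computing very large factorials when \(N\) is big and \(k\) is small.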
DSAN 5100 W02A: Probabilistic Modeling