1: Combinatorics

Permutations (ordered): $\frac{n!}{(n-k)!}$ 3 unique awards, 10 students: $10 \times 9 \times 8 = \frac{10!}{(10-3)!}$


Combinations (unordered): $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ 3 interchangeable awards, 10 students: $\frac{10!}{(10-3)!}$ permutations, each group of 3 winners counted $3!$ times, so $\frac{10!}{3!(10-3)!} = \binom{10}{3}$


$\binom{n}{k} = \binom{n}{n-k}$, including $k$ = excluding $n-k$

$(x+y)^n = \sum_{k=0}^n \binom{n}{k} x^k y^{n-k}$

dividing $n$ objects into groups of $n_i$: $\ \binom{n}{n_1, n_2, \dots, n_r} = \frac{n!}{n_1! n_2! \dots n_r!}$
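These counts can be sanity-checked with Python's `math` module; the award numbers follow the examples above, and the group sizes 3, 3, 4 are an arbitrary illustration:

```python
from math import comb, factorial, perm

# 3 unique awards among 10 students: ordered selection (permutation)
assert perm(10, 3) == 10 * 9 * 8 == factorial(10) // factorial(10 - 3)

# 3 interchangeable awards: each set of 3 winners was counted 3! times
assert comb(10, 3) == perm(10, 3) // factorial(3)

# Symmetry: including k = excluding n - k
assert comb(10, 3) == comb(10, 7)

# Binomial theorem: (x + y)^n = sum over k = 0..n of C(n, k) x^k y^(n-k)
x, y, n = 2, 5, 6
assert sum(comb(n, k) * x**k * y**(n - k) for k in range(n + 1)) == (x + y) ** n

# Multinomial: divide 10 objects into groups of sizes 3, 3, 4
groups = [3, 3, 4]
multinomial = factorial(10)
for g in groups:
    multinomial //= factorial(g)
print(multinomial)  # 10! / (3! 3! 4!) = 4200
```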

3: Independence

$E \perp \!\!\! \perp F \iff P(E \cap F) = P(E) \cdot P(F)$

$E \perp \!\!\! \perp F \implies E \perp \!\!\! \perp F^C$ $P(E \cap F) = P(E) P(F)$
$P(E) = P(E \cap F) + P(E \cap F^C)$
$P(E) = P(E) P(F) + P(E \cap F^C)$
$P(E \cap F^C) = P(E) (1-P(F)) = P(E) P(F^C)$


4: Conditional Probabilities

$P(E|F) = \frac{P(E \cap F)}{P(F)}$

$E \perp \!\!\! \perp F \iff P(E|F) = P(E)$ $P(E \cap F) = P(E) P(F) = P(E|F) P(F)$


Bayes’ Theorem: $P(E|F) = \frac{P(F|E)P(E)}{P(F)}$
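A quick numeric instance of Bayes' theorem; all three input probabilities here are hypothetical values chosen purely for illustration:

```python
# Hypothetical numbers: the prior P(E) and the two conditionals are assumptions.
p_e = 0.01              # P(E)
p_f_given_e = 0.95      # P(F|E)
p_f_given_not_e = 0.05  # P(F|E^c)

# Total probability: P(F) = P(F|E)P(E) + P(F|E^c)P(E^c)
p_f = p_f_given_e * p_e + p_f_given_not_e * (1 - p_e)

# Bayes: P(E|F) = P(F|E)P(E) / P(F)
p_e_given_f = p_f_given_e * p_e / p_f
print(round(p_e_given_f, 4))
```

Even with a strong conditional $P(F|E)$, a small prior keeps the posterior modest, which is the usual point of working the numbers.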

5: Discrete Random Variables

$\mathbb{E}[x] = \sum_x x p(x)$

$\mathbb{E}[ax + b] = a \mathbb{E}[x] + b$

$n$-th Moment: $\mathbb{E}[x^n]$

$\text{Var}[x] = \mathbb{E}[(x - \mathbb{E}[x])^2] = \mathbb{E}[x^2] - \mathbb{E}[x]^2$
$\text{Var}[x] = \mathbb{E}[x^2] - \mathbb{E}[2x \mathbb{E}[x]] + \mathbb{E}[x]^2 \\$ $ = \mathbb{E}[x^2] - 2 \mathbb{E}[x]^2 + \mathbb{E}[x]^2 \\$ $ = \mathbb{E}[x^2] - \mathbb{E}[x]^2 \\$


$\text{Var}[ax + b] = a^2 \text{Var}[x]$
$\text{Var}[ax] = \mathbb{E}[a^2 x^2] - \mathbb{E}[ax]^2 \\$ $= a^2 (\mathbb{E}[x^2] - \mathbb{E}[x]^2) \\$ $= a^2 \text{Var}[x] \\ \\$ $\text{Var}[x+b] = \mathbb{E}[x^2 + 2xb + b^2] - \mathbb{E}[x + b]^2 \\$ $= \mathbb{E}[x^2] + 2b\mathbb{E}[x] + b^2 - (\mathbb{E}[x] + b)^2 \\$ $= \mathbb{E}[x^2] + 2b\mathbb{E}[x] + b^2 - (\mathbb{E}[x]^2 + 2b\mathbb{E}[x] + b^2) \\$ $= \mathbb{E}[x^2] - \mathbb{E}[x]^2 \\$ $= \text{Var}[x]$


$\text{Std}(x) = \sqrt{\text{Var}(x)}$

Probability Mass Function (PMF): probability distribution over possible values of $X$

Cumulative Distribution Function (CDF): $F_X(x) = P(X \le x)$

$P(a < X \le b) = F_X(b) - F_X(a)$

6: Discrete Distributions

Bernoulli

$X \sim \text{Ber}(p)$

$P(X=1) = p \qquad P(X=0) = 1-p$

$\mathbb{E}[X] = 1(p) + 0(1-p) = p$

$\text{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = p - p^2 = p(1-p)$

Binomial

$X \sim \text{Bin}(n, p)$

$P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$

$\mathbb{E}[X] = np \\$ $\text{Var}[X] = np(1-p)$
$X = \sum_{i=1}^n Y_i \qquad Y_i \sim \text{Ber}(p) \\ \\$ $\mathbb{E}[X] = \mathbb{E}[\sum_{i=1}^n Y_i] = \sum_{i=1}^n \mathbb{E}[Y_i] = np \\ \\$ $\text{Var}[X] = \mathbb{E}[(\sum_{i=1}^n Y_i)^2] - \mathbb{E}[\sum_{i=1}^n Y_i]^2 \\$ $= \mathbb{E}[(\sum_{i=1}^n Y_i^2) + \sum_{i \ne j} Y_i Y_j] - n^2 p^2 \\$ $= np + (n^2 - n)p^2 - n^2 p^2 \\$ $= np - np^2 \\$ $= np(1-p)$
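The binomial mean and variance can be verified directly from the PMF; $n = 12$, $p = 0.3$ are arbitrary choices:

```python
from math import comb

n, p = 12, 0.3
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

assert abs(sum(pmf) - 1) < 1e-12              # PMF sums to 1
mean = sum(k * pmf[k] for k in range(n + 1))
var = sum(k**2 * pmf[k] for k in range(n + 1)) - mean**2
assert abs(mean - n * p) < 1e-9               # E[X] = np
assert abs(var - n * p * (1 - p)) < 1e-9      # Var[X] = np(1-p)
```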


Poisson

Event occurs on average $\lambda$ times per interval, $X$ is # of occurrences in an interval, then $X \sim \text{Pois}(\lambda)$

$P(X=i) = e^{-\lambda} \frac{\lambda^i}{i!}$

$\sum_{i=0}^\infty e^{-\lambda} \frac{\lambda^i}{i!} = 1$
$\sum_{i=0}^\infty e^{-\lambda} \frac{\lambda^i}{i!} = e^{-\lambda} \sum_{i=0}^\infty \frac{\lambda^i}{i!} \\$ $= e^{-\lambda} e^\lambda \qquad \text{by Taylor Series of } e^x \\$ $= 1$


$\mathbb{E}[X] = \sum_{i=0}^\infty i e^{-\lambda} \frac{\lambda^i}{i!} = \lambda e^{-\lambda} \sum_{i=1}^\infty \frac{\lambda^{i-1}}{(i-1)!} = \lambda$

For small $p$ and large $n$, $\text{Pois}(\lambda = np)$ estimates $\text{Bin}(n, p)$
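A sketch of the approximation with hypothetical $n$ and $p$; the pointwise gap between the two PMFs is on the order of $np^2$:

```python
from math import comb, exp, factorial

n, p = 1000, 0.003      # large n, small p (illustrative values)
lam = n * p             # Pois(lambda = np)

# Compare Bin(n, p) and Pois(np) pointwise over the bulk of the support
max_diff = max(
    abs(comb(n, k) * p**k * (1 - p) ** (n - k) - exp(-lam) * lam**k / factorial(k))
    for k in range(20)
)
assert max_diff < 0.01   # agreement on the order of n*p^2 = 0.009
```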

Geometric

$X$ is # of independent trials to get 1st success (success prob $p$), $X \sim \text{Geo}(p)$

$P(X=i) = (1-p)^{i-1} p$

$\mathbb{E}[X] = \frac{1}{p}$
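Truncating the series $\sum_i i(1-p)^{i-1}p$ far into the tail confirms $\mathbb{E}[X] = 1/p$; $p = 0.25$ is an arbitrary choice:

```python
p = 0.25

# PMF sums to 1 (tail beyond i = 500 is negligible)
total = sum((1 - p) ** (i - 1) * p for i in range(1, 500))
assert abs(total - 1) < 1e-9

# E[X] = sum_i i (1-p)^(i-1) p = 1/p
approx = sum(i * (1 - p) ** (i - 1) * p for i in range(1, 500))
assert abs(approx - 1 / p) < 1e-9
```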

Negative Binomial

$X$ is # of trials until $r$-th success; $r$-th success on $n$-th trial: $X \sim \text{NBin}(r, p)$

$P(X=n) = \binom{n-1}{r-1} p^r (1-p)^{n-r}$

Hypergeometric

$X \sim \text{Hyp}(N, m, n)$

$P(X=i)$: probability of drawing $i$ objects of interest in $n$ draws without replacement, where there are $N$ total objects and $m$ objects of interest.

Ex: 20 eggs, 3 double yolk eggs, take 5 eggs. $X \sim \text{Hyp}(N=20, m=3, n=5)$. $P(2 \text{ double yolks}) = \frac{\binom{3}{2} \binom{17}{3}}{\binom{20}{5}}$

$P(X=i) = \frac{\binom{m}{i} \binom{N-m}{n-i}}{\binom{N}{n}}$
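The double-yolk example computed directly with `math.comb`:

```python
from math import comb

N, m, n = 20, 3, 5   # 20 eggs, 3 double-yolk, draw 5
i = 2
p = comb(m, i) * comb(N - m, n - i) / comb(N, n)   # P(2 double yolks)

# The PMF sums to 1 over all feasible counts
total = sum(comb(m, j) * comb(N - m, n - j) / comb(N, n) for j in range(m + 1))
assert abs(total - 1) < 1e-12
print(p)
```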

7: Continuous Distributions

Probability Density Function (PDF) $f(x)$: $P(a \le x \le b) = \int_a^b f(x) dx$

Cumulative Distribution Function (CDF): $F_X(y) = P(X \le y) = \int_{-\infty}^y f(x) dx$

$F_X’(y) = f(y)$

$\mathbb{E}[g(x)] = \int_{-\infty}^\infty g(x) f(x) dx$

8: Normal Distribution

$X \sim N(\mu, \sigma^2) \qquad f(x) = \frac{1}{\sqrt{2 \pi} \sigma} e^{-\frac{1}{2} (\frac{x-\mu}{\sigma})^2}$

$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi} \sigma} e^{-\frac{1}{2} (\frac{x-\mu}{\sigma})^2} dx = 1 \\ \\$ $\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi} \\$
$\text{Let } I = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi} \sigma} e^{-\frac{1}{2} (\frac{x-\mu}{\sigma})^2} dx \\$ $\text{Let } u = \frac{x-\mu}{\sqrt{2} \sigma} \qquad du = \frac{dx}{\sqrt{2} \sigma} \qquad dx = \sqrt{2} \sigma du \\$ $I = \int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}} e^{-u^2} du \\$ $\text{Let } A = \int_{-\infty}^{\infty} e^{-u^2} du \\$ $A^2 = \int_{-\infty}^{\infty} e^{-u^2} du \int_{-\infty}^{\infty} e^{-t^2} dt \\$ $ = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(u^2+t^2)} du dt \\$ $\text{Let } u = r \cos(\theta) \qquad t = r \sin(\theta) \\$ $ = \int_{0}^{2\pi} \int_{0}^{\infty} r e^{-r^2} dr d\theta \\$ $ = 2\pi \int_{0}^{\infty} r e^{-r^2} dr \\$ $ = 2\pi \frac{-1}{2} e^{-r^2} \big|_{0}^{\infty} \\$ $ = -\pi (0-1) \\$ $A^2 = \pi \\$ $A = \int_{-\infty}^{\infty} e^{-u^2} du = \sqrt{\pi} \\$ $I = \frac{1}{\sqrt{\pi}} \sqrt{\pi} = 1 \\$

$\mathbb{E}[X] = \mu \qquad \text{Var}[X] = \sigma^2$

Standard Normal: $X \sim N(\mu, \sigma^2) \qquad \text{Let } Z = \frac{X - \mu}{\sigma} \qquad Z \sim N(0, 1)$

$\text{CDF } \Phi(x): \ \Phi(-x) = 1 - \Phi(x)$

9: Normal approx to the Binomial

$X \sim \text{Bin}(n, p), \ \text{if Var}[X] > 10, \ X \approx Y \sim N(\mu = \mathbb{E}[X], \sigma^2 = \text{Var}[X])$
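A sketch of the approximation using the error function for the normal CDF, with a continuity correction; $n$, $p$, and the interval are arbitrary choices:

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    # Standard identity: Phi via the error function
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p = 100, 0.5
mu, var = n * p, n * p * (1 - p)    # Var[X] = 25 > 10, so approx applies

# Exact P(40 <= X <= 60) from the binomial PMF
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(40, 61))

# Normal approximation with continuity correction: P(39.5 <= Y <= 60.5)
approx = normal_cdf(60.5, mu, sqrt(var)) - normal_cdf(39.5, mu, sqrt(var))
assert abs(exact - approx) < 0.01
```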

10: Continuous Distributions

Uniform

\(\begin{align} f(x) = \begin{cases} \frac{1}{b-a} \ &x \in [a, b] \\ 0 \ &\text{else} \end{cases} \end{align}\)

$\mathbb{E}[X] = \frac{b+a}{2}$

Exponential

\(\begin{align} f(x) = \begin{cases} \lambda e^{-\lambda x} \ &x \ge 0 \\ 0 \ &\text{else} \end{cases} \end{align}\)

$\mathbb{E}[X] = \frac{1}{\lambda}$

$F(x) = 1 - e^{-\lambda x} \qquad x \ge 0$

$X$ is time for event to occur, event occurs at rate $\lambda$: $\ X \sim \text{Exp}(\lambda)$

Memoryless: $P(X > s + t | X > t) = \frac{e^{-\lambda (s + t)}}{e^{-\lambda t}} = e^{-\lambda s} = P(X > s)$
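The memoryless identity checked numerically from the survival function $P(X > x) = e^{-\lambda x}$; $\lambda$, $s$, $t$ are arbitrary:

```python
from math import exp, isclose

lam, s, t = 0.5, 2.0, 3.0
surv = lambda x: exp(-lam * x)    # P(X > x) for X ~ Exp(lam)

# P(X > s+t | X > t) = P(X > s+t) / P(X > t), since {X > s+t} implies {X > t}
cond = surv(s + t) / surv(t)
assert isclose(cond, surv(s))     # memoryless: equals P(X > s)
```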

Density of Func of Random Var

$X \sim \text{Uniform}(0, 1) \qquad f_X(x) = 1 \qquad F_X(x) = \int_0^x f_X(t) dt = \int_0^x 1 dt = x \qquad x \in [0, 1]$

$\text{Let } Y = X^n$

$F_Y(x) = P(Y \le x) = P(X^n \le x) = P(X \le x^{\frac{1}{n}}) = x^\frac{1}{n}$

$f_Y(x) = \frac{d}{dx} F_Y(x) = \frac{d}{dx} x^\frac{1}{n} = \frac{1}{n} x^{\frac{1}{n} - 1}$
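Checking that $f_Y$ really is the derivative of $F_Y$ at a few interior points; $n = 3$ is an arbitrary choice:

```python
n = 3
F = lambda x: x ** (1 / n)                  # CDF of Y = X^n, X ~ Uniform(0,1)
f = lambda x: (1 / n) * x ** (1 / n - 1)    # claimed density

assert F(0) == 0 and F(1) == 1              # valid CDF endpoints on [0, 1]

# f_Y should match the central-difference derivative of F_Y
h = 1e-6
for x in (0.2, 0.5, 0.9):
    num_deriv = (F(x + h) - F(x - h)) / (2 * h)
    assert abs(num_deriv - f(x)) < 1e-6
```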

11: Multivariate Distributions

Marginal density: $f_x(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) dy$

$X \perp \!\!\! \perp Y \iff f_{X,Y}(x,y) = f_X(x) f_Y(y) \iff f_{X|Y=y}(x) = f_X(x)$

Density of Func of RVs

$f_{X,Y}(x,y) = x + \frac{3}{2} y^2 \qquad 0 \le x,y \le 1$

$\text{Let } U = X + Y \qquad V = X^2$

$X = \sqrt{V} \qquad Y = U - \sqrt{V}$

$f_{U,V}(u,v) = f_{X,Y}(\sqrt{v}, u - \sqrt{v}) \cdot |J|^{-1}$

\[J = \frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \\ \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 2x & 0 \end{vmatrix} = -2x\]

$f_{U,V}(u,v) = (\sqrt{v} + \frac{3}{2}(u-\sqrt{v})^2) \frac{1}{2\sqrt{v}} \qquad (u-\sqrt{v}) \in [0, 1] \qquad v \in [0, 1]$
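A pointwise check of the change-of-variables formula: multiplying $f_{U,V}$ back by $|J| = |-2x| = 2x$ must recover $f_{X,Y}$, and $f_{X,Y}$ itself integrates to 1 over the unit square (grid size and test points are arbitrary):

```python
f_xy = lambda x, y: x + 1.5 * y**2          # given joint density on [0,1]^2

def f_uv(u, v):
    # Transformed density via x = sqrt(v), y = u - sqrt(v), |J| = 2x
    x = v ** 0.5
    return f_xy(x, u - x) / (2 * x)

# f_{U,V}(u, v) * |J| == f_{X,Y}(x, y) at corresponding points
for x, y in [(0.3, 0.7), (0.5, 0.5), (0.9, 0.1)]:
    u, v = x + y, x * x
    assert abs(f_uv(u, v) * 2 * x - f_xy(x, y)) < 1e-12

# Normalization of the original density: midpoint-rule double integral
N = 400
grid = [(i + 0.5) / N for i in range(N)]
total = sum(f_xy(x, y) for x in grid for y in grid) / N**2
assert abs(total - 1) < 1e-4
```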

12: Multivariate Expectations

$\mathbb{E}[g(x,y)] = \int \int g(x,y) f_{X,Y}(x,y) dx dy$

$X \perp \!\!\! \perp Y \implies \mathbb{E}[h(X) k(Y)] = \mathbb{E}[h(X)] \mathbb{E}[k(Y)]$

Covariance

$\text{Cov}(X,Y) = \mathbb{E}[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])] = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y]$

$\text{Cov}(X,X) = \text{Var}(X)$

$X \perp \!\!\! \perp Y \implies \text{Cov}(X,Y) = 0$

$\text{Var}(aX + bY) = a^2 \text{Var}(X) + 2ab \text{Cov}(X,Y) + b^2 \text{Var}(Y)$

Correlation Coeff

$\rho(X,Y) = \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X) \text{Var}(Y)}}$

$|\rho| \in [0, 1]$ measures the strength of the linear relationship

$\rho = 1 \implies Y = aX + b, a > 0$

$\rho = -1 \implies Y = aX + b, a < 0$
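For an exactly linear relationship $Y = aX + b$, the correlation is $\pm 1$ with the sign of $a$; the sample values and coefficients below are arbitrary, treating the finite set as the whole distribution:

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # X uniform over these values (assumption)
a, b = -2.0, 7.0
ys = [a * x + b for x in xs]          # Y = aX + b with a < 0

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n
var_y = sum((y - my) ** 2 for y in ys) / n

assert abs(cov - a * var_x) < 1e-12            # Cov(X, aX+b) = a Var(X)
rho = cov / (var_x * var_y) ** 0.5
assert abs(rho - (-1.0)) < 1e-9                # a < 0  =>  rho = -1
```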

13: Moment Generating Functions

$m_X(t) = \mathbb{E}[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f_X(x) dx$

$\frac{d^n}{dt^n} m_X(t) \big|_{t=0} = \frac{d^n}{dt^n} \mathbb{E}[\sum_{i=0}^\infty \frac{(tX)^i}{i!}] \big|_{t=0} = \mathbb{E}[X^n]$
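Differentiating the Bernoulli MGF $m(t) = 1 - p + pe^t$ at $t = 0$ recovers the moments; here the derivatives are taken numerically, and $p = 0.3$ is an arbitrary choice:

```python
from math import exp

p = 0.3
m = lambda t: (1 - p) + p * exp(t)    # MGF of Ber(p)

h = 1e-5
d1 = (m(h) - m(-h)) / (2 * h)             # ~ m'(0)  = E[X]   = p
d2 = (m(h) - 2 * m(0) + m(-h)) / h**2     # ~ m''(0) = E[X^2] = p
assert abs(d1 - p) < 1e-8
assert abs(d2 - p) < 1e-4

var = d2 - d1**2                          # Var[X] = E[X^2] - E[X]^2
assert abs(var - p * (1 - p)) < 1e-4
```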

14: Limit Laws

IID: independent and identically distributed

$\text{for I.I.D } X_i \qquad S_n = \sum_{i=1}^n X_i \qquad \mu = \mathbb{E}[X_i] \qquad \sigma^2 = \text{Var}(X_i)$

Strong Law of Large Numbers: $\frac{S_n}{n} \to \mu$ almost surely as $n \to \infty$

Central Limit Theorem: $P(a \le \frac{S_n - n \mu}{\sigma \sqrt{n}} \le b) \to P(a \le Z \le b)$ as $n \to \infty$, where $Z \sim N(0, 1)$

The standardized sum of IID RVs (with finite variance, regardless of dist) converges to a Normal.
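A seeded Monte Carlo sketch of both limit laws for $X_i \sim \text{Uniform}(0,1)$, where $\mu = \frac{1}{2}$ and $\sigma^2 = \frac{1}{12}$; the sample sizes and tolerances are arbitrary choices:

```python
import random

random.seed(0)

# SLLN: the sample mean of n uniforms approaches mu = 0.5
n = 10_000
s_n = sum(random.random() for _ in range(n))
assert abs(s_n / n - 0.5) < 0.02

# CLT: standardized sums of 50 uniforms look standard normal
mu, sigma = 0.5, (1 / 12) ** 0.5
m = 2_000
zs = [(sum(random.random() for _ in range(50)) - 50 * mu) / (sigma * 50**0.5)
      for _ in range(m)]
inside = sum(1 for z in zs if -1.96 <= z <= 1.96) / m
assert abs(inside - 0.95) < 0.03      # P(-1.96 <= Z <= 1.96) ~ 0.95
```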