Probability Theory

Probability theory provides the mathematical framework for analyzing random phenomena and uncertainty. It forms the foundation of statistics, machine learning, and many areas of science.

1. Probability Spaces

Definition 1.1 (Probability Space).
A probability space is a triple $(\Omega, \mathcal{F}, P)$ where:
  • $\Omega$ is the sample space (the set of all possible outcomes),
  • $\mathcal{F}$ is a $\sigma$-algebra of events,
  • $P: \mathcal{F} \to [0,1]$ is the probability measure.

The probability measure $P$ must satisfy:

  1. $P(\Omega) = 1$ (normalization)
  2. For countably many pairwise disjoint events $A_1, A_2, \ldots$ (countable additivity):
    $$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
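Both axioms can be checked concretely on a finite sample space. The sketch below (plain Python; the die model and all names are illustrative, not from the text) verifies them for a fair six-sided die, where finite additivity is the relevant special case of countable additivity:

```python
from fractions import Fraction

# Finite sample space for a fair six-sided die. The event space is the
# full power set of omega, and P gives each outcome probability 1/6.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Uniform probability measure on subsets of omega."""
    assert event <= omega, "events must be subsets of the sample space"
    return Fraction(len(event), len(omega))

# Axiom 1 (normalization): P(omega) = 1.
assert P(omega) == 1

# Axiom 2 (additivity for disjoint events): evens and odds are disjoint,
# so the probability of their union is the sum of their probabilities.
evens, odds = {2, 4, 6}, {1, 3, 5}
assert evens & odds == set()
assert P(evens | odds) == P(evens) + P(odds)
```

Using exact `Fraction` arithmetic avoids floating-point noise when checking the equalities.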

2. Random Variables

Definition 2.1 (Random Variable).
A random variable $X$ is a measurable function $X: \Omega \to \mathbb{R}$. Its expected value (or mean) is
$$\mathbb{E}[X] = \int_{\Omega} X \, dP = \int_{-\infty}^{\infty} x \, f(x) \, dx,$$
where $f(x)$ is the probability density function (for continuous $X$).
Definition 2.2 (Variance).
The variance of a random variable $X$ measures the spread of its distribution:
$$\text{Var}(X) = \mathbb{E}[(X - \mathbb{E}[X])^2] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2.$$
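For a discrete random variable both formulas reduce to finite sums over the probability mass function. A minimal sketch (again using a fair die, an assumption for illustration) computes $\mathbb{E}[X]$ and $\text{Var}(X)$ via the second identity:

```python
# Mean and variance of a discrete random variable, computed directly
# from its probability mass function: a fair die, P(X = x) = 1/6.
pmf = {x: 1 / 6 for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())              # E[X]
second_moment = sum(x**2 * p for x, p in pmf.items())  # E[X^2]
var = second_moment - mean**2                          # E[X^2] - (E[X])^2

print(mean, var)  # E[X] = 3.5, Var(X) = 35/12 ≈ 2.9167
```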

3. Important Distributions

Example 3.1.
The normal (Gaussian) distribution with mean $\mu$ and variance $\sigma^2$ has density
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$
We write $X \sim \mathcal{N}(\mu, \sigma^2)$.
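The density formula translates directly into code. The function name and defaults below are illustrative choices, not part of the text:

```python
import math

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2), transcribing the formula in Example 3.1."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# The standard normal density peaks at x = 0 with value 1/sqrt(2*pi) ≈ 0.3989,
# and is symmetric about its mean.
print(normal_pdf(0.0))
assert abs(normal_pdf(1.0) - normal_pdf(-1.0)) < 1e-12
```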

4. The Law of Large Numbers

Theorem 4.1 (Strong Law of Large Numbers).
Let $X_1, X_2, \ldots$ be i.i.d. random variables with $\mathbb{E}[X_i] = \mu$. Then, with probability 1,
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i = \mu.$$
Remark. This theorem justifies the intuition that sample averages converge to the true mean as the sample size grows.
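The convergence in the remark is easy to observe by simulation. The sketch below (an illustrative choice: i.i.d. Uniform(0, 1) draws, whose true mean is $\mu = 0.5$, with a fixed seed for reproducibility) prints running sample averages for increasing $n$:

```python
import random

random.seed(0)  # fixed seed so the simulation is reproducible

# Simulate the law of large numbers: sample means of n i.i.d.
# Uniform(0, 1) draws approach the true mean mu = 0.5 as n grows.
def sample_mean(n):
    return sum(random.random() for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))
```

The deviation from 0.5 shrinks on the order of $1/\sqrt{n}$, which is exactly the scale the Central Limit Theorem below makes precise.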
Theorem 4.2 (Central Limit Theorem).
Let $X_1, X_2, \ldots$ be i.i.d. random variables with mean $\mu$ and finite variance $\sigma^2 > 0$. Then, as $n \to \infty$,
$$\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1).$$
Corollary 4.1.
For large $n$, the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ is approximately normally distributed:
$$\bar{X}_n \approx \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right).$$
Proof.
This follows directly from the Central Limit Theorem: rearranging the standardized sum gives
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i = \mu + \frac{\sigma}{\sqrt{n}} \cdot Z_n,$$
where $Z_n = \frac{\sum_{i=1}^n X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0,1)$ by Theorem 4.2. An approximately standard normal variable scaled by $\sigma/\sqrt{n}$ and shifted by $\mu$ is approximately $\mathcal{N}(\mu, \sigma^2/n)$. ∎
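Corollary 4.1 can also be checked empirically. The sketch below (illustrative assumptions: Uniform(0, 1) draws, so $\mu = 0.5$ and $\sigma^2 = 1/12$, with a fixed seed) estimates the standard deviation of the sample mean over many trials and compares it with the predicted $\sigma/\sqrt{n}$:

```python
import math
import random

random.seed(1)  # fixed seed for reproducibility

# Empirical check of Corollary 4.1: the sample mean of n i.i.d.
# Uniform(0, 1) draws (mu = 0.5, sigma^2 = 1/12) should be centered at
# mu with standard deviation close to sigma / sqrt(n).
n, trials = 100, 10_000
means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

avg = sum(means) / trials
sd = math.sqrt(sum((m - avg) ** 2 for m in means) / trials)

print(avg, sd, math.sqrt(1 / (12 * n)))  # sd ≈ sigma/sqrt(n) ≈ 0.0289
```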