
One main reason why we care about measure-theoretic probability theory is that it can rigorously describe the convergence properties of stochastic sequences.

Consider the stochastic sequence $\{X_k\} \doteq \{X_1, X_2, \ldots, X_k, \ldots\}$. Each element in this sequence is a random variable defined on a probability triple $(\Omega, \mathcal{F}, \mathbb{P})$. When we say $\{X_k\}$ converges to a random variable $X$, we should be careful since there are different types of convergence, as shown below.

\diamond Sure convergence:

Definition: {Xk}\{X_k\} converges surely (or everywhere or pointwise) to XX if

\lim_{k \to \infty} X_k(\omega) = X(\omega), \quad \text{for all } \omega \in \Omega.

It means that $\lim_{k\to \infty} X_k(\omega) = X(\omega)$ holds at every point in $\Omega$. This definition can be equivalently stated as

A = \Omega \quad \text{where} \quad A = \left\{\omega \in \Omega : \lim_{k \to \infty} X_k(\omega) = X(\omega) \right\}.
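As a numerical sketch of this definition (the sample space $\Omega = [0,1]$ and the helper `X_k` below are illustrative assumptions, not part of the text), take $X_k(\omega) = \omega/k$, which converges to $X(\omega) = 0$ at every single outcome $\omega$:

```python
# Sketch: sure (pointwise) convergence on the sample space Omega = [0, 1].
# X_k(omega) = omega / k converges to X(omega) = 0 at EVERY omega, so the
# set A in the definition is all of Omega.

def X_k(k, omega):
    """The k-th random variable, viewed as a function of the outcome omega."""
    return omega / k

# Check the pointwise limit on a grid of outcomes: for each omega,
# |X_k(omega) - 0| is made arbitrarily small by taking k large enough.
omegas = [i / 10 for i in range(11)]   # sample points of Omega = [0, 1]
k = 10**6
assert all(abs(X_k(k, w) - 0.0) < 1e-5 for w in omegas)
```

Note that the grid only samples $\Omega$; the point of sure convergence is that the same bound argument works at every $\omega$, with no exceptional set at all.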

\diamond Almost sure convergence:

Definition: {Xk}\{X_k\} converges almost surely (or almost everywhere or with probability 1 or w.p.1) to XX if

\mathbb{P}(A) = 1 \quad \text{where} \quad A = \left\{\omega \in \Omega : \lim_{k \to \infty} X_k(\omega) = X(\omega) \right\}. \tag{B.3}

It means that $\lim_{k\to \infty} X_k(\omega) = X(\omega)$ holds for almost all points in $\Omega$. The points at which this limit fails form a set of measure zero. For the sake of simplicity, (B.3) is often written as

P(limkXk=X)=1.\mathbb {P} \left(\lim _ {k \rightarrow \infty} X _ {k} = X\right) = 1.

Almost sure convergence can be denoted as $X_k \xrightarrow{\text{a.s.}} X$.
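A minimal sketch of the gap between sure and almost sure convergence (the choice $\Omega = [0,1]$ with the uniform measure and the sequence $X_k(\omega) = \omega^k$ are illustrative assumptions): the limit fails only on the single point $\{1\}$, which has probability zero.

```python
# Sketch: almost sure but NOT sure convergence. On Omega = [0, 1] with the
# uniform (Lebesgue) measure, let X_k(omega) = omega**k. For omega < 1 the
# limit is 0, but at omega = 1 the limit is 1, so convergence to X = 0
# fails exactly on the set {1}, which has measure zero.

def X_k(k, omega):
    return omega ** k

k = 100_000
# Convergence holds at every omega strictly below 1 ...
assert all(X_k(k, w) < 1e-6 for w in [0.0, 0.3, 0.9, 0.999])
# ... but not at omega = 1, a single point of probability zero.
assert X_k(k, 1.0) == 1.0
```

So the set $A$ in (B.3) is $[0,1)$ here: $\mathbb{P}(A) = 1$ even though $A \neq \Omega$.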

\diamond Convergence in probability:

Definition: {Xk}\{X_k\} converges in probability to XX if for any ϵ>0\epsilon > 0 ,

\lim_{k \rightarrow \infty} \mathbb{P}(A_k) = 0 \quad \text{where} \quad A_k = \left\{\omega \in \Omega : |X_k(\omega) - X(\omega)| > \epsilon \right\}. \tag{B.4}

For simplicity, (B.4) can be written as

limkP(XkX>ϵ)=0.\lim _ {k \to \infty} \mathbb {P} (| X _ {k} - X | > \epsilon) = 0.

The difference between convergence in probability and (almost) sure convergence is as follows. Both sure convergence and almost sure convergence first evaluate the limit at every point in $\Omega$ and then check the measure of the set of points that converge. By contrast, convergence in probability first identifies, for each $k$, the set of points satisfying $|X_k(\omega) - X(\omega)| > \epsilon$ and then checks whether the measure of this set converges to zero as $k\to \infty$.
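This difference is visible in the classic "typewriter" counterexample, which converges in probability but not almost surely (the setting $\Omega = [0,1]$ with the uniform measure and the helper `interval` are illustrative assumptions): enumerate the dyadic intervals $[j/2^n, (j+1)/2^n]$ over $n = 0, 1, 2, \ldots$ and let $X_k$ be the indicator of the $k$-th interval.

```python
# Sketch: the "typewriter" sequence converges in probability to 0 (interval
# lengths shrink) but not almost surely (every omega is hit infinitely often).

def interval(k):
    """Return (left, right, length) of the k-th typewriter interval."""
    n = 0
    while k >= 2 ** n:        # find the generation n containing index k
        k -= 2 ** n
        n += 1
    return k / 2 ** n, (k + 1) / 2 ** n, 1 / 2 ** n

# P(|X_k - 0| > eps) equals the k-th interval's length, which tends to 0 ...
lengths = [interval(k)[2] for k in range(2 ** 10, 2 ** 11)]
assert max(lengths) <= 2 ** -10

# ... yet every omega lies in one interval of EVERY generation, so
# X_k(omega) = 1 infinitely often and the pointwise limit fails everywhere.
omega = 0.3
hits = [k for k in range(2 ** 11) if interval(k)[0] <= omega < interval(k)[1]]
assert len(hits) >= 11   # omega is covered once per generation n = 0..10
```

Here the sets $A_k$ of (B.4) shrink in measure, so convergence in probability holds, even though no single $\omega$ has a convergent trajectory.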

\diamond Convergence in mean:

Definition: {Xk}\{X_k\} converges in the rr -th mean (or in the LrL^r norm) to XX if

limkE[XkXr]=0.\lim _ {k \to \infty} \mathbb {E} [ | X _ {k} - X | ^ {r} ] = 0.

The most frequently used cases are $r = 1$ and $r = 2$. It is worth mentioning that convergence in mean is not equivalent to $\lim_{k\to \infty}\mathbb{E}[X_k - X] = 0$ or $\lim_{k\to \infty}\mathbb{E}[X_k] = \mathbb{E}[X]$: these weaker conditions only indicate that $\mathbb{E}[X_k]$ converges, while the deviation $|X_k - X|$ (for instance, its variance) may not vanish.
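A standard counterexample separating convergence in mean from convergence in probability can be checked directly (the two-point distribution below and the helper names are illustrative assumptions): let $X_k = k$ with probability $1/k$ and $X_k = 0$ otherwise, with $X = 0$.

```python
# Sketch: convergence in probability WITHOUT convergence in mean (r = 1).
# X_k = k with probability 1/k, else 0; the target is X = 0.
# Then P(|X_k| > eps) = 1/k -> 0, yet E[|X_k|] = k * (1/k) = 1 for all k.

def prob_exceeds(k, eps=0.5):
    """P(|X_k - 0| > eps): the only nonzero value is k, which exceeds eps."""
    return 1 / k

def first_moment(k):
    """E[|X_k - 0|] computed from the two-point distribution."""
    return k * (1 / k) + 0 * (1 - 1 / k)

assert prob_exceeds(10**6) < 1e-5                          # -> 0 in probability
assert all(abs(first_moment(k) - 1.0) < 1e-12 for k in (2, 10, 10**6))
```

The rare but enormous value $k$ keeps the first moment pinned at 1, which is exactly why convergence in mean is the stronger requirement here.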

\diamond Convergence in distribution:

Definition: The cumulative distribution function of $X_k$ is defined as $\mathbb{P}(X_k \leq a)$, where $a \in \mathbb{R}$. Then, $\{X_k\}$ converges to $X$ in distribution if the cumulative distribution functions converge:

\lim_{k \to \infty} \mathbb{P}(X_k \leq a) = \mathbb{P}(X \leq a), \quad \text{for all } a \in \mathbb{R}.

Strictly speaking, the limit is only required to hold at every $a$ at which the cumulative distribution function of $X$ is continuous.

A compact expression is

limkP(Ak)=P(A),\lim _ {k \to \infty} \mathbb {P} (A _ {k}) = \mathbb {P} (A),

where

A_k \doteq \left\{\omega \in \Omega : X_k(\omega) \leq a \right\}, \quad A \doteq \left\{\omega \in \Omega : X(\omega) \leq a \right\}.
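Convergence in distribution is the weakest mode: it compares only the distribution functions, not the random variables on $\Omega$. A minimal sketch (the fair-coin variable and helper `cdf` below are illustrative assumptions): take $X$ uniform on $\{-1, +1\}$ and $X_k = -X$ for all $k$.

```python
# Sketch: convergence in distribution WITHOUT convergence in probability.
# X is +1 or -1 with probability 1/2 each, and X_k = -X for every k.
# By symmetry, X_k and X have the SAME distribution, so the CDFs agree
# trivially; yet |X_k - X| = 2|X| = 2 on every outcome.

def cdf(a):
    """Common CDF of X and of every X_k (fair coin on {-1, +1})."""
    if a < -1:
        return 0.0
    if a < 1:
        return 0.5
    return 1.0

cdf_X = cdf_Xk = cdf   # X_k = -X shares the CDF of X by symmetry

# Identical CDFs at every a: convergence in distribution is immediate.
assert all(cdf_Xk(a) == cdf_X(a) for a in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0))

# But |X_k - X| = |-x - x| = 2 on both outcomes, so for eps = 1,
# P(|X_k - X| > eps) = 1 for all k: no convergence in probability.
assert all(abs(-x - x) == 2 for x in (-1, 1))
```

This illustrates why the implications below run one way only: matching distributions say nothing about $X_k$ and $X$ being close on the same outcome $\omega$.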

The relationships between the above types of convergence are given below:

almost sure convergence $\Rightarrow$ convergence in probability $\Rightarrow$ convergence in distribution

convergence in mean $\Rightarrow$ convergence in probability $\Rightarrow$ convergence in distribution

Almost sure convergence and convergence in mean do not imply each other. More information can be found in [102].
