Borel-Cantelli Lemma

Huarui Zhou

$ \newcommand{\ud}{\mathrm{d} } \newcommand{\calX}{\mathcal{X}} \newcommand{\calF}{\mathcal{F}} \newcommand{\ve}{\varepsilon} \newcommand{\N}{\mathcal{N}} \newcommand{\bX}{\mathbf{X}} \newcommand{\bA}{\mathbf{A}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bY}{\mathbf{Y}} \newcommand{\by}{\mathbf{y}} \newcommand{\bI}{\mathbf{I}} \newcommand{\bbeta}{\pmb{\beta}} \newcommand{\bve}{\pmb{\varepsilon}} \newcommand{\bzero}{\pmb{0}} \newcommand{\ds}{\displaystyle} \newcommand{\tL}{\tilde{L}} \newcommand{\pt}{\partial} \newcommand{\E}{\mathbb{E}} \newcommand{\P}{\mathbb{P}} \newcommand{\chr}{\mathbb{1}} \newcommand{\Cov}{\mathrm{Cov}} \newcommand{\Var}{\mathrm{Var}} \newcommand{\raw}{\rightarrow} \newcommand{\gt}{>} \newcommand{\lt}{<} \newcommand{\sm}{\setminus} $

1. $\lim\inf$ and $\lim\sup$ of set sequences

Similar as define the $\lim\inf$ and $\lim\sup$ of real number sequences $\{a_k\}$, \[\liminf\limits_{k\raw\infty}a_k = \sup_{n\geq 1}(\inf_{k\geq n} a_k),\quad \limsup\limits_{k\raw\infty}a_k = \inf_{n\geq 1}(\sup_{k\geq n} a_k),\] we can also define the $\lim\inf$ and $\lim\sup$ of set sequences $\{A_k\}$, just simply replace $\inf$ with $\cap$ and $\sup$ with $\cup$, i.e. \[\liminf\limits_{k\raw\infty}A_k = \bigcup_{n\geq 1}\bigcap_{k\geq n} A_k,\quad \limsup\limits_{k\raw\infty}A_k = \bigcap_{n\geq 1}\bigcup_{k\geq n} A_k.\] By De Morgan's law, we have \[\left(\limsup\limits_{k\raw\infty}A_k\right)^c = \left(\bigcap_{n\geq 1}\bigcup_{k\geq n} A_k\right)^c = \bigcup_{n\geq 1}\left(\bigcup_{k\geq n} A_k\right)^c = \bigcup_{n\geq 1}\bigcap_{k\geq n} A_k^c =\liminf\limits_{k\raw\infty}A_k^c.\tag{1}\]

There are two important properties of these two notations.

  1. $a\in\liminf_kA_k$ if and only if there exists a integer $N$ s.t. $a\in A_n$ for all $n\geq N$.
  2. $a\in\limsup_kA_k$ if and only if $a$ belongs to infinitely many terms of $\{A_k\}$.

First, if $a\in\liminf_kA_k$, it means $a$ belongs to at least one of $\{\cap_{k\geq n}A_k:n\geq 1\}$, suppose $a\in \cap_{k\geq N}A_k$, then $a\in A_k$, for all $k\geq N$.

Second, if $a\in\limsup_kA_k$, it means $a$ belongs to all $\{\cup_{k\geq n}A_k:n\geq 1\}$. Assume $a$ belongs to only finite terms of $\{A_k\}$, e.g., $\{A_{k_1},\cdots,A_{k_m}\}$, then let $M = \max\{k_1,\cdots,k_m\}$, we have $a\notin A_k$ for all $k\geq M+1$, i.e. $a\notin \cup_{k\geq M+1}A_k$, which leads to a contradiction. \[\tag*{$\blacksquare$}\]

Due to the above theorem, sometimes we say events $A_k$ ($k=1,2,\cdots$) occur infinitely often if the event $\limsup_kA_k$ occurs, or just $A_k$ i.o. for short.

If each $A_k$ is an event (i.e. $\in\calF$), we have
  1. \[\P(\liminf\limits_{k\raw\infty}A_k)=\lim_{n\raw\infty}\P(\bigcap_{k\geq n}A_k)\tag{2}\]
  2. \[\P(\limsup\limits_{k\raw\infty}A_k)=\lim_{n\raw\infty}\P(\bigcup_{k\geq n}A_k)\tag{3}\]

Denote \[F_n = \bigcap_{k\geq n}A_k,\] then $\{F_n\}$ is an increasing sequence, i.e. $F_1\subseteq F_2 \subseteq F_3\cdots$, by the continuity of probability measure, we have \[\P(\bigcup_{n}F_n) = \lim_{n\raw\infty}\P(F_n),\] thus (2) holds. (3) also holds directly from (1) and (2). \[\tag*{$\blacksquare$}\]

2. First Borel-Cantelli lemma

Suppose $\{A_k\}$ are events, if \[\sum_k\P(A_k)\lt\infty,\] then we have\[\P(A_k\;\text{i.o.})=0.\]

Denote $B_n$ as\[B_n = \bigcup_{k\geq n} A_k,\] from (3) we have \[\P(A_k\;\text{i.o.}) = \P(\limsup\limits_{k\raw\infty}A_k)=\lim_{n\raw\infty}\P(B_n).\] We only need to prove $\P(B_n)\raw 0$. By the subadditivity of probability measure, we have \[\P(B_n) = \P(\bigcup_{k\geq n} A_k)\leq \sum_{k=n}^\infty \P(A_k)=\sum_{k=1}^\infty \P(A_k) -\sum_{k=1}^{n-1} \P(A_k)\raw 0,\] as $n\raw \infty$, the convergence is guaranteed by \[\sum_{k=1}^\infty \P(A_k)\lt\infty.\] \[\tag*{$\blacksquare$}\]

3. Second Borel-Cantelli lemma

Suppose $\{A_k\}$ are independent events, if \[\sum_k\P(A_k)=\infty,\] then we have\[\P(A_k\;\text{i.o.})=1.\]

From (1) \[\begin{split} \P(A_k\;\text{i.o.}) &= \P(\limsup\limits_{k\raw\infty}A_k)\\ &=1-\P(\liminf\limits_{k\raw\infty}A_k^c)\\ &=1-\lim_{n\raw\infty}\P(\bigcap_{k\geq n}A_k^c)\quad (\text{from (2)})\\ &=1-\lim_{n\raw\infty}\prod_{k=n}^\infty \P(A_k^c)\quad (\text{by independency})\\ &=1-\lim_{n\raw\infty}\prod_{k=n}^\infty (1-\P(A_k))\\ &\geq 1-\lim_{n\raw\infty}\prod_{k=n}^\infty \exp(-\P(A_k))\quad (\text{since $1-x\leq e^{-x}$})\\ &=1-\lim_{n\raw\infty} \exp(-\sum_{k=n}^\infty \P(A_k))\\ &=1\quad\quad (\text{since for any $n$, $\sum_{k=n}^\infty\P(A_k)=\infty$}) \end{split}\] \[\tag*{$\blacksquare$}\]

We can enhance the second Borel-Cantelli lemma by sbstituting "indenpendent" with a weaker condition "pairwise independent".

Suppose $\{A_k\}$ are pairwise independent events, if \[\sum_k\P(A_k)=\infty,\] then we have\[\P(A_k\;\text{i.o.})=1.\]

Let $\chr_{k}$ be $A_k$'s indicator, then $\E(\chr_k) = \P(A_k)$.Let $S_n$ be the partial sum of $\chr_k$, i.e. \[S_n = \sum^n_{k=1} \chr_k,\] denote \[S = \lim_{n\raw \infty} S_n = \sum^\infty_{k=1} \chr_k.\] Then $\sum_k\P(A_k) = \infty$ means \[\E(S) = \infty.\] And $x \in A_k \;\text{i.o.}$ is equivalent to \[S(x) = \sum^\infty_{k=1}\chr_k(x) = \infty,\] thus our goal is to prove $\P(S=\infty)=1$. Denote $p_k = \P(A_k)$, the variance of $S_n$ is \[\begin{split}\Var(S_n) &= \E(S_n^2)-[\E(S_n)]^2\\ &=\E(\sum_{k=1}^n\chr_k^2+\sum_{i\neq j}\chr_i\chr_j) - (\sum_{k=1}^np_k)^2\\ &=\sum_{k=1}^n\E(\chr_k^2)+\sum_{i\neq j}\E(\chr_i)\E(\chr_j) - (\sum_{k=1}^np_k)^2\\ &=\sum_{k=1}^np_k+\sum_{i\neq j}p_ip_j - (\sum_{k=1}^np_k)^2\\ &= \sum_{k=1}^np_k+(\sum^n_{k=1}p_k^2+\sum_{i\neq j}p_ip_j)-\sum^n_{k=1}p_k^2 - (\sum_{k=1}^np_k)^2 \\ &=\sum^n_{k=1}(p_k-p_k^2)\\&\leq \E(S_n) \end{split} \] Then \[\begin{split}\P\left(S\lt \frac{\E(S_n)}{2}\right)&\leq \P\left(S_n\lt \frac{\E(S_n)}{2}\right)\\ & = \P\left(S_n-\E(S_n)\lt -\frac{\E(S_n)}{2}\right)\\ &\leq \P\left(|S_n-\E(S_n)|\geq \frac{\E(S_n)}{2}\right)\\ &\leq \frac{4\Var(S_n)}{[\E(S_n)]^2}\quad (\text{by Chebyshev's inequality})\\ &\leq \frac{4}{\E(S_n)} \quad (\text{since $\Var(S_n)\leq \E(S_n)$}) \end{split}\tag{4}\] Notice $\{\ds S\lt \frac{\E(S_n)}{2}\}\uparrow \{S\lt\infty\}$, finally by the continuity of probability measure, \[\P(S\lt\infty)=\P\left(\bigcup^\infty_{n=1}\{S\lt \frac{\E(S_n)}{2}\}\right) = \lim_{n\raw\infty}\P\left(S\lt \frac{\E(S_n)}{2}\right)\leq \lim_{n\raw\infty} \frac{4}{\E(S_n)} = 0,\] then we concludes that \[\P(S=\infty)=1-\P(S\lt\infty) = 1.\] \[\tag*{$\blacksquare$}\]

4. Erdös-Rényi theorem

The pairwise independent condition in the above theorem can be further relaxed, leading to the Erdös-Rényi theorem.

Suppose $\{A_k\}$ are events, if \[\sum_k\P(A_k)=\infty,\] and \[\liminf\limits_{n\raw\infty}\frac{\ds\sum^n_{j=1}\sum^n_{k=1}\P(A_j\cap A_k)}{\ds\left(\sum^n_{k=1}\P(A_k)\right)^2}=1,\tag{5}\] then we have\[\P(A_k\;\text{i.o.})=1.\]

We will use the same notation $S_n$ and $S$ as before. Since \[\E(S^2_n) = \E[(\sum^n_{k=1}\chr_k)^2]=\sum_{j,k=1}^n\E(\chr_j\chr_k)=\sum_{j,k=1}^n\P(A_j\cap A_k),\] and \[\E(S_n)=\sum^n_{k=1}\E(\chr_k)=\sum^n_{k=1}\P(A_k),\] thus (5) is equivalent to \[\liminf\limits_{n\raw\infty}\frac{\E(S^2_n)}{[\E(S_n)]^2}=1.\] By the first 4 lines of (4), we have \[\P\left(S\lt \frac{\E(S_n)}{2}\right)\leq \frac{4\Var(S_n)}{[\E(S_n)]^2} = 4\frac{\E(S_n^2)-[\E(S_n)]^2}{[\E(S_n)]^2}=4(\frac{\E(S_n^2)}{[\E(S_n)]^2}-1),\] then we have \[\begin{split} \P(S\lt\infty)&=\P\left(\bigcup^\infty_{n=1}\{S\lt \frac{\E(S_n)}{2}\}\right) \\ &=\lim_{n\raw\infty} \P\left(S\lt \frac{\E(S_n)}{2}\right)\quad (\text{by $\{S\lt \frac{\E(S_n)}{2}\}$ $\uparrow$ and continuity of prob. meas.})\\ &= \liminf\limits_{n\raw\infty}\P\left(S\lt \frac{\E(S_n)}{2}\right)\quad (\text{since last limit exists by monotone cvg thm})\\ &\leq 4\left(\liminf\limits_{n\raw\infty}\frac{\E(S^2_n)}{[\E(S_n)]^2}-1\right)=0, \end{split}\] which shows \[\P(S=\infty) = 1 - \P(S\lt\infty) = 1.\] \[\tag*{$\blacksquare$}\]

Bibliography

  1. Kai Lai Chung, A Course in Probability Theory, 3rd edition (2001)

  2. Rick Durrett, Probability: Theory and Examples, 5th edition (2019)