Huarui Zhou
Similar as define the $\lim\inf$ and $\lim\sup$ of real number sequences $\{a_k\}$, \[\liminf\limits_{k\raw\infty}a_k = \sup_{n\geq 1}(\inf_{k\geq n} a_k),\quad \limsup\limits_{k\raw\infty}a_k = \inf_{n\geq 1}(\sup_{k\geq n} a_k),\] we can also define the $\lim\inf$ and $\lim\sup$ of set sequences $\{A_k\}$, just simply replace $\inf$ with $\cap$ and $\sup$ with $\cup$, i.e. \[\liminf\limits_{k\raw\infty}A_k = \bigcup_{n\geq 1}\bigcap_{k\geq n} A_k,\quad \limsup\limits_{k\raw\infty}A_k = \bigcap_{n\geq 1}\bigcup_{k\geq n} A_k.\] By De Morgan's law, we have \[\left(\limsup\limits_{k\raw\infty}A_k\right)^c = \left(\bigcap_{n\geq 1}\bigcup_{k\geq n} A_k\right)^c = \bigcup_{n\geq 1}\left(\bigcup_{k\geq n} A_k\right)^c = \bigcup_{n\geq 1}\bigcap_{k\geq n} A_k^c =\liminf\limits_{k\raw\infty}A_k^c.\tag{1}\]
There are two important properties of these two notations.
First, if $a\in\liminf_kA_k$, it means $a$ belongs to at least one of $\{\cap_{k\geq n}A_k:n\geq 1\}$, suppose $a\in \cap_{k\geq N}A_k$, then $a\in A_k$, for all $k\geq N$.
Second, if $a\in\limsup_kA_k$, it means $a$ belongs to all $\{\cup_{k\geq n}A_k:n\geq 1\}$. Assume $a$ belongs to only finite terms of $\{A_k\}$, e.g., $\{A_{k_1},\cdots,A_{k_m}\}$, then let $M = \max\{k_1,\cdots,k_m\}$, we have $a\notin A_k$ for all $k\geq M+1$, i.e. $a\notin \cup_{k\geq M+1}A_k$, which leads to a contradiction. \[\tag*{$\blacksquare$}\]
Due to the above theorem, sometimes we say events $A_k$ ($k=1,2,\cdots$) occur infinitely often if the event $\limsup_kA_k$ occurs, or just $A_k$ i.o. for short.
Denote \[F_n = \bigcap_{k\geq n}A_k,\] then $\{F_n\}$ is an increasing sequence, i.e. $F_1\subseteq F_2 \subseteq F_3\cdots$, by the continuity of probability measure, we have \[\P(\bigcup_{n}F_n) = \lim_{n\raw\infty}\P(F_n),\] thus (2) holds. (3) also holds directly from (1) and (2). \[\tag*{$\blacksquare$}\]
Denote $B_n$ as\[B_n = \bigcup_{k\geq n} A_k,\] from (3) we have \[\P(A_k\;\text{i.o.}) = \P(\limsup\limits_{k\raw\infty}A_k)=\lim_{n\raw\infty}\P(B_n).\] We only need to prove $\P(B_n)\raw 0$. By the subadditivity of probability measure, we have \[\P(B_n) = \P(\bigcup_{k\geq n} A_k)\leq \sum_{k=n}^\infty \P(A_k)=\sum_{k=1}^\infty \P(A_k) -\sum_{k=1}^{n-1} \P(A_k)\raw 0,\] as $n\raw \infty$, the convergence is guaranteed by \[\sum_{k=1}^\infty \P(A_k)\lt\infty.\] \[\tag*{$\blacksquare$}\]
From (1) \[\begin{split} \P(A_k\;\text{i.o.}) &= \P(\limsup\limits_{k\raw\infty}A_k)\\ &=1-\P(\liminf\limits_{k\raw\infty}A_k^c)\\ &=1-\lim_{n\raw\infty}\P(\bigcap_{k\geq n}A_k^c)\quad (\text{from (2)})\\ &=1-\lim_{n\raw\infty}\prod_{k=n}^\infty \P(A_k^c)\quad (\text{by independency})\\ &=1-\lim_{n\raw\infty}\prod_{k=n}^\infty (1-\P(A_k))\\ &\geq 1-\lim_{n\raw\infty}\prod_{k=n}^\infty \exp(-\P(A_k))\quad (\text{since $1-x\leq e^{-x}$})\\ &=1-\lim_{n\raw\infty} \exp(-\sum_{k=n}^\infty \P(A_k))\\ &=1\quad\quad (\text{since for any $n$, $\sum_{k=n}^\infty\P(A_k)=\infty$}) \end{split}\] \[\tag*{$\blacksquare$}\]
We can enhance the second Borel-Cantelli lemma by sbstituting "indenpendent" with a weaker condition "pairwise independent".
Let $\chr_{k}$ be $A_k$'s indicator, then $\E(\chr_k) = \P(A_k)$.Let $S_n$ be the partial sum of $\chr_k$, i.e. \[S_n = \sum^n_{k=1} \chr_k,\] denote \[S = \lim_{n\raw \infty} S_n = \sum^\infty_{k=1} \chr_k.\] Then $\sum_k\P(A_k) = \infty$ means \[\E(S) = \infty.\] And $x \in A_k \;\text{i.o.}$ is equivalent to \[S(x) = \sum^\infty_{k=1}\chr_k(x) = \infty,\] thus our goal is to prove $\P(S=\infty)=1$. Denote $p_k = \P(A_k)$, the variance of $S_n$ is \[\begin{split}\Var(S_n) &= \E(S_n^2)-[\E(S_n)]^2\\ &=\E(\sum_{k=1}^n\chr_k^2+\sum_{i\neq j}\chr_i\chr_j) - (\sum_{k=1}^np_k)^2\\ &=\sum_{k=1}^n\E(\chr_k^2)+\sum_{i\neq j}\E(\chr_i)\E(\chr_j) - (\sum_{k=1}^np_k)^2\\ &=\sum_{k=1}^np_k+\sum_{i\neq j}p_ip_j - (\sum_{k=1}^np_k)^2\\ &= \sum_{k=1}^np_k+(\sum^n_{k=1}p_k^2+\sum_{i\neq j}p_ip_j)-\sum^n_{k=1}p_k^2 - (\sum_{k=1}^np_k)^2 \\ &=\sum^n_{k=1}(p_k-p_k^2)\\&\leq \E(S_n) \end{split} \] Then \[\begin{split}\P\left(S\lt \frac{\E(S_n)}{2}\right)&\leq \P\left(S_n\lt \frac{\E(S_n)}{2}\right)\\ & = \P\left(S_n-\E(S_n)\lt -\frac{\E(S_n)}{2}\right)\\ &\leq \P\left(|S_n-\E(S_n)|\geq \frac{\E(S_n)}{2}\right)\\ &\leq \frac{4\Var(S_n)}{[\E(S_n)]^2}\quad (\text{by Chebyshev's inequality})\\ &\leq \frac{4}{\E(S_n)} \quad (\text{since $\Var(S_n)\leq \E(S_n)$}) \end{split}\tag{4}\] Notice $\{\ds S\lt \frac{\E(S_n)}{2}\}\uparrow \{S\lt\infty\}$, finally by the continuity of probability measure, \[\P(S\lt\infty)=\P\left(\bigcup^\infty_{n=1}\{S\lt \frac{\E(S_n)}{2}\}\right) = \lim_{n\raw\infty}\P\left(S\lt \frac{\E(S_n)}{2}\right)\leq \lim_{n\raw\infty} \frac{4}{\E(S_n)} = 0,\] then we concludes that \[\P(S=\infty)=1-\P(S\lt\infty) = 1.\] \[\tag*{$\blacksquare$}\]
The pairwise independent condition in the above theorem can be further relaxed, leading to the Erdös-Rényi theorem.
We will use the same notation $S_n$ and $S$ as before. Since \[\E(S^2_n) = \E[(\sum^n_{k=1}\chr_k)^2]=\sum_{j,k=1}^n\E(\chr_j\chr_k)=\sum_{j,k=1}^n\P(A_j\cap A_k),\] and \[\E(S_n)=\sum^n_{k=1}\E(\chr_k)=\sum^n_{k=1}\P(A_k),\] thus (5) is equivalent to \[\liminf\limits_{n\raw\infty}\frac{\E(S^2_n)}{[\E(S_n)]^2}=1.\] By the first 4 lines of (4), we have \[\P\left(S\lt \frac{\E(S_n)}{2}\right)\leq \frac{4\Var(S_n)}{[\E(S_n)]^2} = 4\frac{\E(S_n^2)-[\E(S_n)]^2}{[\E(S_n)]^2}=4(\frac{\E(S_n^2)}{[\E(S_n)]^2}-1),\] then we have \[\begin{split} \P(S\lt\infty)&=\P\left(\bigcup^\infty_{n=1}\{S\lt \frac{\E(S_n)}{2}\}\right) \\ &=\lim_{n\raw\infty} \P\left(S\lt \frac{\E(S_n)}{2}\right)\quad (\text{by $\{S\lt \frac{\E(S_n)}{2}\}$ $\uparrow$ and continuity of prob. meas.})\\ &= \liminf\limits_{n\raw\infty}\P\left(S\lt \frac{\E(S_n)}{2}\right)\quad (\text{since last limit exists by monotone cvg thm})\\ &\leq 4\left(\liminf\limits_{n\raw\infty}\frac{\E(S^2_n)}{[\E(S_n)]^2}-1\right)=0, \end{split}\] which shows \[\P(S=\infty) = 1 - \P(S\lt\infty) = 1.\] \[\tag*{$\blacksquare$}\]
Kai Lai Chung, A Course in Probability Theory, 3rd edition (2001)
Rick Durrett, Probability: Theory and Examples, 5th edition (2019)