Lec 2
Definition 1.2 (Convergence in r-th mean). Let X, {Xn} be random variables defined on a given probability space (Ω, F, P) such that for r ∈ N, E[|X|^r] < ∞ and E[|Xn|^r] < ∞ for all n. We say that {Xn} converges in the r-th mean to X, denoted by Xn →^r X, if the following holds:

lim_{n→∞} E[|Xn − X|^r] = 0.
Example 1.6. Let {Xn} be i.i.d. random variables with E[Xn] = µ and Var(Xn) = σ². Define Yn = (1/n) ∑_{i=1}^n Xi. Then Yn →^2 µ. Indeed, writing Sn = ∑_{i=1}^n Xi,

E[|Yn − µ|²] = E[|(Sn − nµ)/n|²] = (1/n²) E[|Sn − E(Sn)|²] = (1/n²) Var(Sn) = σ²/n → 0 as n → ∞.

Hence Yn →^2 µ.
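The σ²/n rate in Example 1.6 can be checked numerically. The following is a rough Monte Carlo sketch (my own illustration, taking the Xi standard normal so that µ = 0 and σ² = 1):

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 500  # independent copies of Yn per sample size
mse = {}

for n in (100, 10_000):
    # each row holds one sample X1, ..., Xn; its mean is one realization of Yn
    Y = rng.standard_normal((trials, n)).mean(axis=1)
    mse[n] = np.mean(Y**2)  # estimates E[|Yn - mu|^2] since mu = 0
    print(n, mse[n], 1 / n)  # empirical mean-square error vs. sigma^2 / n
```

The printed empirical mean-square error tracks σ²/n = 1/n closely for both sample sizes.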
Theorem 1.8. The following hold:
i) Xn →^r X =⇒ Xn →^P X for any r ≥ 1.
ii) Let f be a given continuous function. If Xn →^P X, then f(Xn) →^P f(X).
Proof. Proof of (i) follows from Markov’s inequality. Indeed, for any given ε > 0,

P(|Xn − X| > ε) ≤ E[|Xn − X|^r]/ε^r
=⇒ lim_{n→∞} P(|Xn − X| > ε) ≤ (1/ε^r) lim_{n→∞} E[|Xn − X|^r] = 0.
Proof of (ii): For any k > 0, we see that

{|f(Xn) − f(X)| > ε} ⊂ {|f(Xn) − f(X)| > ε, |X| ≤ k} ∪ {|X| > k}.

Since f is continuous, it is uniformly continuous on any bounded interval. Therefore, for any given ε > 0, there exists δ ∈ (0, 1] such that |f(x) − f(y)| ≤ ε whenever |x − y| ≤ δ and x, y ∈ [−k − 1, k + 1]. If |X| ≤ k and |Xn − X| ≤ δ, then both X and Xn lie in [−k − 1, k + 1], so |f(Xn) − f(X)| ≤ ε. This means that

{|f(Xn) − f(X)| > ε, |X| ≤ k} ⊂ {|Xn − X| > δ, |X| ≤ k} ⊂ {|Xn − X| > δ}.

Thus we have

{|f(Xn) − f(X)| > ε} ⊂ {|Xn − X| > δ} ∪ {|X| > k}
=⇒ P(|f(Xn) − f(X)| > ε) ≤ P(|Xn − X| > δ) + P(|X| > k).

Since Xn →^P X, letting n → ∞ gives lim sup_{n→∞} P(|f(Xn) − f(X)| > ε) ≤ P(|X| > k); letting k → ∞ and using lim_{k→∞} P(|X| > k) = 0, we obtain lim_{n→∞} P(|f(Xn) − f(X)| > ε) = 0. This completes the proof.
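Part (i) with r = 2 is Chebyshev’s inequality. A quick numerical sketch (my own illustration, assuming the setting of Example 1.6 with standard normal Xi, so µ = 0 and σ² = 1) compares the empirical P(|Yn − µ| > ε) with the bound E[|Yn − µ|²]/ε² = σ²/(nε²):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, eps = 100, 10_000, 0.3

# sample means of n i.i.d. standard normals: each row mean is one copy of Yn
Y = rng.standard_normal((trials, n)).mean(axis=1)
empirical = np.mean(np.abs(Y) > eps)  # estimates P(|Yn - mu| > eps)
bound = 1 / (n * eps**2)              # E[|Yn - mu|^2] / eps^2 = sigma^2 / (n eps^2)
print(empirical, bound)  # the empirical probability sits far below the bound
```

As expected, the Markov/Chebyshev bound holds but is far from tight here.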
In general, convergence in probability does not imply convergence in r-th mean. To see this, consider the following example.

Example 1.7. Let Ω = [0, 1], F = B([0, 1]) and P(dx) = dx. Let Xn = n 1_{(0,1/n)}. Then Xn →^P 0 but Xn does not converge to 0 in r-th mean for any r ≥ 1. To show this, observe that for any ε > 0,

P(|Xn| > ε) ≤ 1/n =⇒ lim_{n→∞} P(|Xn| > ε) = 0, i.e., Xn →^P 0.

On the other hand, E[|Xn − 0|^r] = n^r P((0, 1/n)) = n^{r−1} ≥ 1 for all r ≥ 1, so E[|Xn − 0|^r] does not tend to 0.
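A small simulation (illustration only; U denotes a uniform sample from [0, 1]) shows both effects for Xn = n 1_{(0,1/n)}: the probability of a nonzero value vanishes while the mean stays at 1:

```python
import numpy as np

rng = np.random.default_rng(2)
samples = 200_000
p, m = {}, {}

for n in (10, 100, 1000):
    U = rng.uniform(size=samples)
    X = n * (U < 1 / n)      # realizations of Xn = n * 1_(0, 1/n)
    p[n] = np.mean(X > 0.5)  # estimates P(|Xn| > 1/2) = 1/n -> 0
    m[n] = np.mean(X)        # estimates E[|Xn|] = n * (1/n) = 1 for every n
    print(n, p[n], m[n])
```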
Definition 1.3 (Almost sure convergence). Let X, {Xn} be random variables defined on a given probability space (Ω, F, P). We say that {Xn} converges to X almost surely or with probability 1 if the following holds:

P(lim_{n→∞} Xn = X) = 1.

We denote it by Xn →^{a.s.} X.
Example 1.8. Let Ω = [0, 1], F = B([0, 1]) and P(dx) = dx. Define

Xn(ω) = 1 if ω ∈ (0, 1 − 1/n), and Xn(ω) = n otherwise.

It is easy to check that if ω = 0 or ω = 1, then lim_{n→∞} Xn(ω) = ∞. For any ω ∈ (0, 1), we can find n0 ∈ N such that ω ∈ (0, 1 − 1/n) for all n ≥ n0. As a consequence, Xn(ω) = 1 for all n ≥ n0. In other words, for ω ∈ (0, 1), lim_{n→∞} Xn(ω) = 1. Define X(ω) = 1 for all ω ∈ [0, 1]. Then

P(ω ∈ [0, 1] : {Xn(ω)} does not converge to X(ω)) = P({0, 1}) = 0 =⇒ Xn →^{a.s.} 1.
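Evaluating Xn pointwise (a throwaway sketch, not part of the notes) makes the dichotomy concrete: for ω = 0.9 the sequence equals n until 1 − 1/n passes ω and equals 1 afterwards, while at ω = 1 it diverges:

```python
def X(n: int, w: float) -> float:
    """Xn from Example 1.8 on Omega = [0, 1]: 1 on (0, 1 - 1/n), n otherwise."""
    return 1.0 if 0 < w < 1 - 1 / n else float(n)

print([X(n, 0.9) for n in (5, 20, 100)])  # [5.0, 1.0, 1.0]: eventually 1
print([X(n, 1.0) for n in (5, 20, 100)])  # [5.0, 20.0, 100.0]: diverges
```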
Sufficient condition for almost sure convergence: Let {An} be a sequence of events in F. Define

lim sup_{n→∞} An = ∩_{n=1}^∞ ∪_{m≥n} Am = lim_{n→∞} ∪_{m≥n} Am.

This can be interpreted probabilistically as

lim sup_n An = “An occurs infinitely often”.

We denote this as

{An i.o.} = lim sup_n An.
Theorem 1.9 (Borel–Cantelli lemma). Let {An} be a sequence of events in (Ω, F, P).
i) If ∑_{n=1}^∞ P(An) < +∞, then P(An i.o.) = 0.
ii) If the An are mutually independent events, and if ∑_{n=1}^∞ P(An) = ∞, then P(An i.o.) = 1.
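Both halves of the lemma show up in a simulation (my own sketch; the events An are simulated as independent coin flips with the stated probabilities): with P(An) = 1/n² only a handful of events ever occur, while with P(An) = 1/n they keep occurring:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000
n = np.arange(1, N + 1)

# summable case: sum 1/n^2 < oo, so only finitely many An occur (part i)
hits_sq = np.flatnonzero(rng.uniform(size=N) < 1.0 / n**2) + 1
# divergent case: sum 1/n = oo and the events are independent (part ii)
hits_harm = np.flatnonzero(rng.uniform(size=N) < 1.0 / n) + 1

print(len(hits_sq), hits_sq.max())      # a handful of events, then silence
print(len(hits_harm), hits_harm.max())  # roughly log(N) events in total
```

Of course, a finite simulation can only hint at “infinitely often”; the point is the contrast between the two counts.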
Remark 1.1. For mutually independent events An, since ∑_{n=1}^∞ P(An) is either finite or infinite, the event {An i.o.} has probability either 0 or 1. This is sometimes called a zero–one law.
As a consequence of the Borel–Cantelli lemma, we have the following proposition.

Proposition 1.10. Let {Xn} be a sequence of random variables defined on a probability space (Ω, F, P). If ∑_{n=1}^∞ P(|Xn| > ε) < +∞ for any ε > 0, then Xn →^{a.s.} 0.
Example 1.9. Let {Xn} be a sequence of i.i.d. random variables such that P(Xn = 1) = 1/2 and P(Xn = −1) = 1/2. Let Sn = ∑_{i=1}^n Xi. Then (1/n²) S_{n²} →^{a.s.} 0. To show the result, we use Proposition 1.10. Since E[|S_{n²}|²] = Var(S_{n²}) = n², Chebyshev’s inequality gives

P((1/n²)|S_{n²}| > ε) ≤ E[|S_{n²}|²]/(n⁴ε²) = 1/(n²ε²)
=⇒ ∑_{n=1}^∞ P((1/n²)|S_{n²}| > ε) < ∞ =⇒ (1/n²) S_{n²} →^{a.s.} 0.
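The almost sure convergence in Example 1.9 is visible along a single simulated path of the ±1 walk (a sketch for illustration; one realization only):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1000                                 # inspect S_{n^2} for n = 1, ..., N
steps = rng.choice((-1, 1), size=N * N)  # one path of i.i.d. +-1 steps
S = np.cumsum(steps)                     # S[i] = S_{i+1}

k = np.arange(1, N + 1)
ratios = np.abs(S[k**2 - 1]) / k**2      # |S_{n^2}| / n^2 along the path
print(ratios[[9, 99, 999]])              # values at n = 10, 100, 1000 shrink toward 0
```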
The following classical example shows that convergence in probability does not imply almost sure convergence. Note that, for each positive integer n, there exist integers j and k (uniquely determined) such that

n = 2^k + j, j = 0, 1, . . . , 2^k − 1, k = 0, 1, 2, . . .

(for n = 1, k = j = 0, and for n = 5, k = 2, j = 1, and so on). On Ω = [0, 1] with F = B([0, 1]) and P(dx) = dx, define Xn = 1_{[j/2^k, (j+1)/2^k]} for n = 2^k + j, and let An = {Xn > 0}. Then P(An) = 2^{−k} → 0 as n → ∞, so clearly Xn →^P 0. However, for every ω ∈ [0, 1] and every k there is some j with ω ∈ [j/2^k, (j+1)/2^k], so Xn(ω) = 1 for infinitely many n. Consequently, Xn →^P 0 but Xn(ω) ↛ 0 for all ω ∈ Ω.
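The “infinitely often” behaviour can be made concrete with a short sketch (illustration only; this is the standard sliding-interval construction where Xn is the indicator of [j/2^k, (j+1)/2^k] for n = 2^k + j). It lists, for a fixed ω, the indices n with Xn(ω) = 1 — at least one hit at every dyadic level k, hence infinitely many in total, even though P(An) → 0:

```python
def X(n: int, w: float) -> int:
    """Sliding-interval sequence: write n = 2^k + j, Xn = 1 on [j/2^k, (j+1)/2^k]."""
    k = n.bit_length() - 1  # the unique k with 2^k <= n < 2^(k+1)
    j = n - 2**k
    return 1 if j / 2**k <= w <= (j + 1) / 2**k else 0

hits = [n for n in range(1, 128) if X(n, 0.3) == 1]
print(hits)  # [1, 2, 5, 10, 20, 41, 83]: one hit per level k = 0, ..., 6
```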
Theorem 1.11. The following hold.
i) If Xn →^{a.s.} X, then Xn →^P X.
ii) If Xn →^P X, then there exists a subsequence {X_{n_k}} of {Xn} such that X_{n_k} →^{a.s.} X.
iii) If Xn →^{a.s.} X, then for any continuous function f, f(Xn) →^{a.s.} f(X).
Proof. Proof of i): For any ε > 0, define An^ε = {|Xn − X| > ε} and Bm^ε = ∪_{n=m}^∞ An^ε. Since Xn →^{a.s.} X and ∩_m Bm^ε = {|Xn − X| > ε i.o.} ⊂ {Xn does not converge to X}, we have P(∩_m Bm^ε) = 0. Note that {Bm^ε} is a nested, decreasing sequence of events. Hence, from the continuity of the probability measure P, we have

lim_{m→∞} P(Bm^ε) = P(∩_m Bm^ε) = 0.

Since Am^ε ⊂ Bm^ε, we have P(Am^ε) ≤ P(Bm^ε). This implies that lim_{m→∞} P(Am^ε) = 0. In other words, Xn →^P X.
Proof of ii): We will use the Borel–Cantelli lemma. Since Xn →^P X, we can choose a subsequence {X_{n_k}} such that P(|X_{n_k} − X| > 1/k) ≤ 1/2^k. Let Ak := {|X_{n_k} − X| > 1/k}. Then ∑_{k=1}^∞ P(Ak) < +∞. Hence, by the Borel–Cantelli lemma, P(Ak i.o.) = 0. This implies that

P(∪_{n=1}^∞ ∩_{k=n}^∞ Ak^c) = 1 =⇒ P({ω ∈ Ω : ∃ n0 such that ∀ k ≥ n0, |X_{n_k} − X| ≤ 1/k}) = 1 =⇒ X_{n_k} →^{a.s.} X.
Proof of iii): Let N = {ω : lim_{n→∞} Xn(ω) ≠ X(ω)}. Then P(N) = 0. If ω ∉ N, then by the continuity of f, we have

lim_{n→∞} f(Xn(ω)) = f(lim_{n→∞} Xn(ω)) = f(X(ω)).

This is true for any ω ∉ N and P(N) = 0. Hence f(Xn) →^{a.s.} f(X).