Chapter 7 (Probability)
1 Random sequence
lim_{n→∞} Xn = l = 1.
This means that in any neighbourhood around 1 we can trap the sequence, i.e., we can pick ε to be very small and make sure that the sequence will be trapped after reaching n0(ε). Therefore, as ε decreases, n0(ε) increases. For example, in the considered sequence:
ε = 1/2 : n0(ε) = 2,
ε = 1/1000 : n0(ε) = 1001.
A random sequence Xn is said to converge almost surely to a random variable X if
P( lim_{n→∞} Xn = X ) = 1.
We write
Xn →a.s. X.
Example 2. Let ω be a random variable that is uniformly distributed on [0, 1]. Define the random sequence Xn as Xn = ω^n.
So X0 = 1, X1 = ω, X2 = ω², X3 = ω³, . . .
Let us take specific values of ω. For instance, if ω = 1/2, then
X0 = 1, X1 = 1/2, X2 = 1/4, X3 = 1/8, . . .
We can think of it as an urn containing sequences: each time we draw a value of ω, we get a fixed sequence of numbers. In the experiment of tossing a coin, the outcome is either heads or tails; here, in contrast, the outcome of the experiment is a random sequence, i.e., each outcome is an infinite sequence of numbers.
Since the pdf is continuous, the probability P(ω = a) = 0 for any constant a. Notice that convergence of the sequence to 1 is possible (it happens when ω = 1), but this event has probability 0.
Therefore, we say that Xn converges almost surely to 0, i.e., Xn →a.s. 0.
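This behaviour is easy to see numerically. The following Python sketch (a minimal illustration, not part of the proof; the seed, the number of draws and the horizon n ≤ 50 are arbitrary choices) draws a few values of ω and prints how fast the realized paths ω^n decay to 0:

import numpy as np

rng = np.random.default_rng(0)
omegas = rng.uniform(0.0, 1.0, size=5)    # a few draws of omega ~ Uniform[0, 1]

for omega in omegas:
    n = np.arange(0, 51)                  # indices n = 0, 1, ..., 50
    x = omega ** n                        # realized path X_n(omega) = omega^n
    # For omega < 1 the path decays to 0; omega = 1 (a probability-0 event) would stay at 1.
    print(f"omega = {omega:.3f}, X_10 = {x[10]:.3e}, X_50 = {x[50]:.3e}")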
Example 3. Consider a random variable ω uniformly distributed on Ω = [0, 1] (so that P(ω ∈ [a, b]) = b − a for 0 ≤ a ≤ b ≤ 1), and the sequence Xn(ω), n = 1, 2, . . ., defined by:
Xn(ω) = 1 if 0 ≤ ω < (n+1)/(2n), and Xn(ω) = 0 otherwise.
We claim that Xn converges almost surely to the random variable X(ω) defined by X(ω) = 1 for 0 ≤ ω < 1/2 and X(ω) = 0 for 1/2 ≤ ω ≤ 1. We need to prove that P(A) = 1, where A = {ω : Xn(ω) → X(ω)}. Let's first find A. Note that (n+1)/(2n) > 1/2 for every n, so for any ω ∈ [0, 1/2[, we have
Xn(ω) = X(ω) = 1 for all n.
Therefore, we conclude that [0, 0.5[ ⊂ A. Now, if ω > 1/2, then ω ≥ (n+1)/(2n) for all sufficiently large n, so Xn(ω) = 0 eventually and Xn(ω) → X(ω) = 0; hence ]0.5, 1] ⊂ A as well. At the remaining point we have
Xn(0.5) = 1, ∀n,
so Xn(0.5) does not converge to X(0.5) = 0 and 0.5 ∉ A. Nevertheless, A = [0, 1] \ {0.5} and P(A) = 1, so Xn →a.s. X.
Theorem 1. Consider the sequence X1, X2, X3, . . . and a random variable X. For any ε > 0, define the set of events
Am = {|Xn − X| < ε, ∀n ≥ m}.
Then Xn →a.s. X if and only if, for any ε > 0,
lim_{m→+∞} P(Am) = 1.
Example 4. Let X1, X2, X3, . . . be independent random variables, where Xn ∼ Bernoulli(1/n) for n = 2, 3, . . .. The goal here is to check whether Xn →a.s. 0.
1. Check that ∑_{n=1}^{+∞} P(|Xn| > ε) = +∞.
2. Show that the sequence X1 , X2 , . . . does not converge to 0 almost surely using Theorem 1.
Solution:
1. Note that for 0 < ε < 1, we have P(|Xn| > ε) = P(Xn = 1) = 1/n, so
∑_{n} P(|Xn| > ε) = ∑_{n} 1/n = +∞
(the harmonic series diverges).
2. For 0 < ε < 1 and X = 0, the events of Theorem 1 become
Am = {Xn = 0, ∀n ≥ m}.
We can in fact show that lim_{m→+∞} P(Am) = 0. To show this, we will prove P(Am) = 0 for every m ≥ 2. For every integer N ≥ m, independence gives
P(Am) ≤ P(Xm = 0) P(Xm+1 = 0) · · · P(XN = 0) = (1 − 1/m)(1 − 1/(m+1)) · · · (1 − 1/N) = (m−1)/N,
and letting N → +∞ yields P(Am) = 0. Hence lim_{m→+∞} P(Am) = 0 ≠ 1, and by Theorem 1 the sequence does not converge to 0 almost surely.
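A small numerical sanity check of these two facts (illustrative only; the index m = 5 and the cut-offs N are arbitrary choices) can be done in Python:

import math

# Partial sums of sum 1/n = sum P(|X_n| > eps) grow without bound (harmonic series).
for N in (10, 1000, 100000):
    print(N, sum(1.0 / n for n in range(2, N + 1)))

# Partial products prod_{n=m}^{N} (1 - 1/n) collapse to 0 as N grows, matching (m-1)/N.
m = 5
for N in (10, 1000, 100000):
    prod = math.prod(1.0 - 1.0 / n for n in range(m, N + 1))
    print(N, prod, (m - 1) / N)   # the last two columns agree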
We say that Xn converges in probability to X if, for every ε > 0, lim_{n→∞} P(|Xn − X| ≥ ε) = 0. We write:
Xn →p X.
Example 5. Consider a random variable ω uniformly distributed on [0, 1] and the sequence Xn shown in the figure below. Notice that only one of X2 and X3 can be equal to 1 for the same value of ω. Similarly, only one of X4, X5, X6 and X7 can be equal to 1 for the same value of ω, and so on and so forth.
[Figure: X1(ω), X2(ω), . . . , X7(ω) plotted as functions of ω ∈ [0, 1]; each Xn equals 1 on a single subinterval and 0 elsewhere.]
Answer: Intuitively, the sequence will converge to 0. Let us take some examples to see how the sequence behaves.
From a calculus point of view, these sequences never converge to zero because a “jump” keeps showing up no matter how many zeros precede it (see the figure below); for any ω, Xn(ω) does not converge in the “calculus” sense. This also means that Xn does not converge to zero almost surely (a.s.).
[Figure: a sample path of Xn versus n, for n = 0, . . . , 600; the path is mostly 0 with intermittent jumps to 1.]
Remark 1. The observed sequence may not converge in the “calculus” sense because of the intermittent “jumps”; however, the frequency of those “jumps” goes to zero as n goes to infinity.
Example 6. Consider independent random variables ω1, ω2, . . ., each uniformly distributed over [0, 1], and the sequence Xn defined as:
Xn = 1 for ωn ≤ 1/n, and Xn = 0 otherwise.
Solution:
1. First, we will use Theorem 1 to show that the sequence does not converge a.s. Let
Am = {Xn = 0, ∀n ≥ m}.
P(Am) = P({Xn = 0, ∀n ≥ m})
 ≤ P({Xn = 0, ∀n = m, m + 1, . . . , N})          (for every positive integer N ≥ m)
 = P(Xm = 0) P(Xm+1 = 0) · · · P(XN = 0)          (since the Xi's are independent)
 = P(ωm > 1/m) P(ωm+1 > 1/(m+1)) · · · P(ωN > 1/N)
 = ((m−1)/m) · (m/(m+1)) · · · ((N−1)/N)
 = (m−1)/N.
We conclude that lim_{m→+∞} P(Am) = 0. Thus, according to Theorem 1, the sequence X1, X2, . . . does not converge to 0 almost surely.
(Weak law of large numbers.) Let X1, X2, . . . be i.i.d. random variables with mean µ, and define
Sn = (X1 + X2 + . . . + Xn)/n.
Then, for every ε > 0,
P[|Sn − µ| ≥ ε] → 0 as n → ∞,
i.e., Sn converges in probability to µ.
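The law of large numbers is easy to visualise by simulation. The sketch below (a minimal illustration; the fair-coin distribution, the tolerance ε = 0.05, the seed and the sample sizes are all arbitrary choices) estimates P(|Sn − µ| ≥ ε) by Monte Carlo and shows it shrinking as n grows:

import numpy as np

rng = np.random.default_rng(1)
mu, eps, trials = 0.5, 0.05, 10_000               # Bernoulli(1/2) mean, tolerance, Monte Carlo runs

for n in (10, 100, 1000, 10000):
    S = rng.binomial(n, 0.5, size=trials) / n     # sample means S_n of n fair-coin tosses
    print(n, np.mean(np.abs(S - mu) >= eps))      # estimate of P(|S_n - mu| >= eps), decreasing in n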
We say that Xn converges to X in mean square if lim_{n→∞} E[|Xn − X|²] = 0. We write:
Xn →m.s. X.
Remark 2. In mean square convergence, not only does the frequency of the “jumps” go to zero as n goes to infinity, but the “energy” in the jumps must also go to zero.
Answer: For the sequence of Example 6,
E[|Xn − 0|²] = 1 · P(Xn = 1) + 0 · P(Xn = 0) = 1/n.
Hence
lim_{n→∞} E[|Xn − 0|²] = lim_{n→∞} 1/n = 0.
Therefore, Xn →m.s. 0.
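A quick Monte Carlo check of this computation (illustrative only; the seed, sample size and the chosen values of n are arbitrary) estimates E[|Xn − 0|²] and compares it with the exact value 1/n:

import numpy as np

rng = np.random.default_rng(2)
samples = 1_000_000

for n in (2, 10, 100, 1000):
    omega = rng.uniform(0.0, 1.0, size=samples)
    xn = (omega <= 1.0 / n).astype(float)      # X_n = 1 if omega_n <= 1/n, else 0
    print(n, np.mean(xn ** 2), 1.0 / n)        # empirical E|X_n|^2 vs the exact value 1/n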
In the next example, we replace the value 1 by √n in Example 6.
Example 7. Consider independent random variables ω1, ω2, . . ., each uniformly distributed over [0, 1], and the sequence Xn defined as:
Xn = √n for ωn ≤ 1/n, and Xn = 0 otherwise.
Note that P(Xn = √n) = 1/n and P(Xn = 0) = 1 − 1/n.
Answer:
1. Almost sure convergence: Xn does not converge a.s. for the same reasons as in Example 6.
2. Convergence in probability: Xn →p 0 for the same reasons as in Example 6. Namely, for any ε > 0,
lim_{n→+∞} P(Xn ≥ ε) = lim_{n→+∞} P(Xn = √n) = lim_{n→+∞} 1/n = 0.
3. Convergence in mean square: E[|Xn − 0|²] = (√n)² · (1/n) = 1 for every n, which does not tend to 0, so Xn does not converge to 0 in mean square.
Example 8. Let X1 ∼ Bernoulli(1/2) and define Xn+1 = 1 − Xn for n ≥ 1, so that the realized sequence alternates between 0 and 1:
X1 = 1 : Xn = 101010101010 . . .
X1 = 0 : Xn = 010101010101 . . .
Question: In which sense, if any, does this sequence converge?
1. Almost sure convergence: Xn does not converge almost surely because the probability of every jump is always equal to 1/2.
2. Convergence in probability: Xn does not converge in probability because the frequency of the jumps is constant, equal to 1/2.
3. Convergence in mean square: Xn does not converge to 1/2 in the mean square sense because
lim_{n→∞} E[|Xn − 1/2|²] = E[Xn² − Xn + 1/4]
 = E[Xn²] − E[Xn] + 1/4
 = 1² · (1/2) + 0² · (1/2) − (1 · (1/2) + 0 · (1/2)) + 1/4
 = 1/4,
which does not go to 0.
4. Convergence in distribution: since we do not know the value of X1, each Xn can be either 0 or 1 with probability 1/2. Hence every Xn is a random variable ∼ Bernoulli(1/2). We say that Xn converges in distribution to Bernoulli(1/2), and we denote it by:
Xn →d Bernoulli(1/2).
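Empirically, the value of Xn at any fixed n indeed looks like a fair coin. The sketch below (a hypothetical illustration; the seed, sample size and chosen indices n are arbitrary) estimates P(Xn = 1) by simulating X1 and propagating the alternation:

import numpy as np

rng = np.random.default_rng(4)
trials = 100_000
x1 = rng.integers(0, 2, size=trials)   # X_1 ~ Bernoulli(1/2)

for n in (10, 11, 1000, 1001):
    # X_n equals X_1 when n is odd and 1 - X_1 when n is even (the sequence alternates).
    xn = x1 if n % 2 == 1 else 1 - x1
    print(n, xn.mean())                # approximately 1/2 for every n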
Example 9. (Central Limit Theorem) Consider the zero-mean, unit-variance, independent random variables X1, X2, . . . , Xn and define the sequence Sn as follows:
Sn = (X1 + X2 + . . . + Xn)/√n.
The CLT states that Sn converges in distribution to N (0, 1), i.e.,
Sn →d N(0, 1).
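The statement can be checked by simulation. The sketch below (illustrative only; the uniform summands, the seed, the sample sizes and the evaluation points z are arbitrary choices) rescales centred uniform random variables to unit variance and compares the empirical P(Sn ≤ z) with the standard normal CDF Φ(z):

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)
n, trials = 100, 50_000

# Zero-mean, unit-variance summands: (U - 1/2) * sqrt(12) with U ~ Uniform[0, 1].
X = (rng.uniform(0.0, 1.0, size=(trials, n)) - 0.5) * sqrt(12.0)
S = X.sum(axis=1) / sqrt(n)                      # S_n = (X_1 + ... + X_n) / sqrt(n)

for z in (-1.0, 0.0, 1.0, 2.0):
    phi = 0.5 * (1.0 + erf(z / sqrt(2.0)))       # standard normal CDF at z
    print(z, np.mean(S <= z), phi)               # empirical P(S_n <= z) vs the N(0, 1) CDF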
Theorem 4.
Almost sure convergence ⇒ convergence in probability ⇒ convergence in distribution;
Convergence in mean square ⇒ convergence in probability ⇒ convergence in distribution.
Note:
3 Convergence of a random sequence
Example 1: Let the random variable U be uniformly distributed on [0, 1]. Consider the sequence
defined as:
X(n) = (−1)^n U / n.
Answer: Fix a realization U = a ∈ [0, 1]. The corresponding sequence of numbers is
X1 = −a,
X2 = a/2,
X3 = −a/3,
X4 = a/4,
. . .
which converges to 0 whatever the value of a. Hence Xn converges to 0 almost surely,
which means that by proving almost-sure convergence, we get directly the convergence in probability and in distribution. However, for completeness we will formally prove that Xn converges to 0 in probability. To do so, we have to prove that, for every ε > 0,
lim_{n→∞} P(|Xn − 0| ≥ ε) = 0.
By definition,
|Xn| = U/n ≤ 1/n.
Thus,
lim_{n→∞} P(|Xn| ≥ ε) = lim_{n→∞} P(U/n ≥ ε)    (1)
 = lim_{n→∞} P(U ≥ nε)    (2)
 = 0,    (3)
where equation (3) follows from the fact that U ∈ [0, 1], so P(U ≥ nε) = 0 as soon as nε > 1. (A numerical check of the bound |Xn| ≤ 1/n is sketched after this example.)
3. Convergence in mean square sense: Does Xn converge to 0 in the mean square sense?
We know that,
Hereafter, we want to prove that Xn →d 0.
Recall that the limit r.v. X is the constant 0 and therefore has the following CDF:
FX(x) = 0 for x < 0, and FX(x) = 1 for x ≥ 0.
Since Xn = (−1)^n U / n, the distribution of the Xn can be derived as follows:
[Figure: CDF of Xn, plotted for x ∈ [−2, 2].]
Remark 4. At 0, the CDF of Xn flip-flops between 0 (if n is even) and 1 (if n is odd) (cf. Figure 2), which implies that there is a discontinuity at that point. Therefore, we say that Xn converges in distribution to a CDF FX(x) except at points where FX(x) is not continuous.
Definition 6. Xn converges to X in distribution, i.e., Xn →d X, iff
lim_{n→∞} FXn(x) = FX(x) except at points where FX(x) is not continuous.
Therefore, Xn converges to X in distribution. We could have deduced this directly from convergence
in mean square sense or almost sure convergence.
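The bound |Xn| ≤ 1/n used in the convergence-in-probability step is easy to verify numerically. The following sketch (illustrative only; the seed, the number of draws and the chosen indices n are arbitrary choices) prints the largest observed |Xn| over many draws of U:

import numpy as np

rng = np.random.default_rng(5)
U = rng.uniform(0.0, 1.0, size=100_000)          # many draws of U ~ Uniform[0, 1]

for n in (1, 10, 100, 1000):
    xn = ((-1) ** n) * U / n                      # X_n = (-1)^n U / n
    print(n, np.max(np.abs(xn)), 1.0 / n)         # max |X_n| never exceeds 1/n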
Theorem 5.
a) If Xn →a.s. X, then Xn →p X.
b) If Xn →m.s. X, then Xn →p X.
c) If Xn →p X, then Xn →d X.
d) If P{|Xn| ≤ Y} = 1 for all n for a random variable Y with E[Y²] < ∞, then
Xn →p X ⇒ Xn →m.s. X.
Remark 6. Convergence in probability allows the sequence, as n → ∞, to deviate from the limit by an arbitrarily large amount, as long as this happens with small probability; whereas convergence in mean square also limits the amplitude of this deviation when n → ∞. (We can think of it as energy: we cannot allow a big deviation from the limit.)
[Figure 2: CDFs of U, X1, X2 and X3, each plotted for x ∈ [−2, 2].]
Definition 7. A sequence (xn)n≥1 is Cauchy if for every ε > 0, there exists a large number N s.t. |xn − xm| < ε for all n, m ≥ N.
Counter example 1. A Cauchy sequence need not converge within its original space. Consider the rational sequence defined by xn+1 = (xn + 2/xn)/2 (starting from, e.g., x1 = 1). Its limit l must satisfy
l = (l + 2/l)/2,
2l² = l² + 2,
l = ±√2 ∉ Q,
so the sequence is Cauchy in Q but its limit is not rational.
Counter example 2. Consider the sequence xn = 1/n in (0, 1). Obviously it does not converge in (0, 1) since the limit l = 0 ∉ (0, 1).
Definition 8. A space where every Cauchy sequence converges (to a limit within the space) is called a complete space.
Theorem 6. R is a complete space.
Theorem 7 (Cauchy criteria).
a) Xn →a.s. X ⇐⇒ P( lim_{m,n→∞} |Xm − Xn| = 0 ) = 1.
b) Xn →m.s. X ⇐⇒ lim_{m,n→∞} E[|Xm − Xn|²] = 0.
c) Xn →p X ⇐⇒ lim_{m,n→∞} P[|Xm − Xn| ≥ ε] = 0 for every ε > 0.
Goal: Our goal is to prove that the alternating sequence of Example 8 does not converge in mean square, using the Cauchy criterion. Recall:
X1 = 1 : Xn = 101010101010 . . .
X1 = 0 : Xn = 010101010101 . . .
Therefore,
E[|Xn − Xm|²] = E[Xn²] + E[Xm²] − 2E[Xm Xn]
 = 1/2 + 1/2 − 2E[Xm Xn].
Consider, without loss of generality, that m > n. Then
E[Xn Xm] = 0 if m − n is odd,
E[Xn Xm] = E[Xn²] = 1/2 if m − n is even.
Hence,
E[|Xn − Xm|²] = 1 if m − n is odd, and 0 if m − n is even,
so lim_{n,m→∞} E[|Xn − Xm|²] does not exist (in particular, it is not 0), which implies that Xn does not converge in mean square by Theorem 7-b).
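The same conclusion can be seen numerically. The sketch below (illustrative only; the seed, sample size and index pairs are arbitrary choices) estimates E[|Xn − Xm|²] and shows it staying at 1 whenever m − n is odd:

import numpy as np

rng = np.random.default_rng(6)
x1 = rng.integers(0, 2, size=100_000)            # X_1 ~ Bernoulli(1/2)

def x(n):
    # X_n = X_1 if n is odd, 1 - X_1 if n is even (the alternating sequence).
    return x1 if n % 2 == 1 else 1 - x1

for n, m in ((100, 101), (100, 102), (1001, 1004), (1001, 1005)):
    print(n, m, np.mean((x(n) - x(m)) ** 2))     # 1.0 when m - n is odd, 0.0 when even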
Lemma 1. Let Xn be a random sequence with E[Xn²] < ∞ ∀n. Then
Xn →m.s. X iff lim_{m,n→∞} E[Xm Xn] exists and is finite.
For the zero-mean, unit-variance, independent random variables X1, X2, . . . of Example 9, define
Zn = (X1 + X2 + . . . + Xn)/√n.
Then
P[Zn ≤ z] → ∫_{−∞}^{z} (1/√(2π)) e^{−t²/2} dt as n → ∞.
Using the language of this chapter:
Zn →d N(0, 1).