Chapter7 (Probability)


ECE541: Stochastic Signals and Systems Fall 2018

Chapter 7: Convergence of Random Sequences


Dr. Salim El Rouayheb
Scribe: Abhay Ashutosh Donel, Qinbo Zhang, Peiwen Tian, Pengzhe Wang, Lu Liu

1 Random sequence

Definition 1. An infinite sequence $X_n$, $n = 1, 2, \dots$, of random variables is called a random sequence.

2 Convergence of a random sequence

Example 1. Consider the sequence of real numbers

$$X_n = \frac{n}{n+1}, \quad n = 0, 1, 2, \dots$$

This sequence converges to the limit $l = 1$. We write

$$\lim_{n\to\infty} X_n = l = 1.$$

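This means that we can trap the sequence in any neighbourhood of 1, i.e.,

$$\forall \epsilon > 0,\ \exists\, n_0(\epsilon) \ \text{s.t.}\ |X_n - l| \le \epsilon \ \text{for all } n \ge n_0(\epsilon).$$

We can pick $\epsilon$ to be very small and make sure that the sequence stays trapped after reaching $n_0(\epsilon)$. Therefore, as $\epsilon$ decreases, $n_0(\epsilon)$ increases. For example, for the sequence considered, the following choices work:

$$\epsilon = \tfrac{1}{2}:\ n_0(\epsilon) = 2, \qquad \epsilon = \tfrac{1}{1000}:\ n_0(\epsilon) = 1001.$$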
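As a small illustration, the threshold $n_0(\epsilon)$ for the sequence $n/(n+1)$ can be found by direct search. The sketch below is only an illustration; the function name `smallest_n0` is hypothetical and not part of the notes. It returns the smallest valid threshold, and any larger value (such as the 2 and 1001 quoted above) is also a valid choice of $n_0(\epsilon)$.

```python
from fractions import Fraction

def smallest_n0(eps):
    """Smallest n0 with |X_n - 1| <= eps for all n >= n0, where X_n = n/(n+1)."""
    n = 0
    # |X_n - 1| = 1/(n+1) decreases with n, so the first n that meets the
    # bound guarantees it for every later n as well.
    while Fraction(1, n + 1) > eps:
        n += 1
    return n

print(smallest_n0(Fraction(1, 2)))     # 1
print(smallest_n0(Fraction(1, 1000)))  # 999
```

Exact rational arithmetic is used so that the boundary case $1/(n+1) = \epsilon$ is not affected by floating-point rounding.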

2.1 Almost sure convergence

Definition 2. A random sequence $X_n$, $n = 0, 1, 2, 3, \dots$, converges almost surely, or with probability one, to the random variable $X$ iff

$$P\left(\lim_{n\to\infty} X_n = X\right) = 1.$$

We write

$$X_n \xrightarrow{a.s.} X.$$

Example 2. Let $\omega$ be a random variable that is uniformly distributed on $[0, 1]$. Define the random sequence $X_n$ as $X_n = \omega^n$.

So $X_0 = 1$, $X_1 = \omega$, $X_2 = \omega^2$, $X_3 = \omega^3, \dots$

Let us take specific values of $\omega$. For instance, if $\omega = \frac{1}{2}$,

$$X_0 = 1,\quad X_1 = \tfrac{1}{2},\quad X_2 = \tfrac{1}{4},\quad X_3 = \tfrac{1}{8},\ \dots$$

We can think of the experiment as an urn containing sequences: each time we draw a value of ω, we get a sequence of fixed numbers. In the coin-tossing example, the outcome is either heads or tails, whereas here each outcome of the experiment is an infinite sequence of numbers.

Question: Does this sequence of random variables converge?

Answer: This sequence converges to

$$X = \begin{cases} 0 & \text{if } \omega \ne 1, \text{ with probability } 1 = P(\omega \ne 1), \\ 1 & \text{if } \omega = 1, \text{ with probability } 0 = P(\omega = 1). \end{cases}$$

Since $\omega$ is a continuous random variable, $P(\omega = a) = 0$ for any constant $a$. Notice that the convergence of the sequence to 1 is possible but happens with probability 0.

Therefore, we say that $X_n$ converges almost surely to 0, i.e., $X_n \xrightarrow{a.s.} 0$.
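Example 3. Consider a random variable $\omega$ uniformly distributed on $\Omega = [0, 1]$, i.e., $P(\omega \in [a, b]) = b - a$ for all $0 \le a \le b \le 1$, and the sequence $X_n(\omega)$, $n = 1, 2, \dots$, defined by:

$$X_n(\omega) = \begin{cases} 1 & \text{if } 0 \le \omega < \frac{n+1}{2n}, \\ 0 & \text{otherwise.} \end{cases}$$

Also, define the random variable $X$ by:

$$X(\omega) = \begin{cases} 1 & \text{if } 0 \le \omega < \frac{1}{2}, \\ 0 & \text{otherwise.} \end{cases}$$

Show that $X_n \xrightarrow{a.s.} X$.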

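Solution: Define the set $A$ as follows:

$$A = \left\{\omega \in \Omega : \lim_{n\to+\infty} X_n(\omega) = X(\omega)\right\}.$$

We need to prove that $P(A) = 1$. Let's first find $A$. Note that $\frac{n+1}{2n} > \frac{1}{2}$, so for any $\omega \in [0, \frac{1}{2}[$, we have

$$X_n(\omega) = X(\omega) = 1.$$

Therefore, we conclude that $[0, 0.5[ \subset A$. Now, if $\omega > \frac{1}{2}$, then

$$X(\omega) = 0.$$

Also, since $2\omega - 1 > 0$, we can write

$$X_n(\omega) = 0, \quad \forall n > \frac{1}{2\omega - 1}.$$

Therefore,

$$\lim_{n\to+\infty} X_n(\omega) = X(\omega) = 0, \quad \forall \omega > \frac{1}{2}.$$

We conclude $]0.5, 1] \subset A$. You can check that $\omega = 0.5 \notin A$, since

$$X_n(0.5) = 1, \quad \forall n,$$

while $X(0.5) = 0$. We conclude

$$A = \left[0, \tfrac{1}{2}\right[ \ \cup\ \left]\tfrac{1}{2}, 1\right] = \Omega - \left\{\tfrac{1}{2}\right\}.$$

Since $P(A) = 1$, we conclude $X_n \xrightarrow{a.s.} X$.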
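The pointwise behaviour used in this solution can be checked numerically for a few values of $\omega$. The sketch below is illustrative only; the helper names `x_n`, `x_limit` and `agrees_from` are hypothetical, and a finite range of $n$ can of course only suggest the limit, not prove it.

```python
from fractions import Fraction

def x_n(n, omega):
    """X_n(omega) = 1 if 0 <= omega < (n+1)/(2n), else 0."""
    return 1 if 0 <= omega < Fraction(n + 1, 2 * n) else 0

def x_limit(omega):
    """X(omega) = 1 if 0 <= omega < 1/2, else 0."""
    return 1 if 0 <= omega < Fraction(1, 2) else 0

def agrees_from(omega, start, stop):
    """True if X_n(omega) == X(omega) for every n in [start, stop)."""
    return all(x_n(n, omega) == x_limit(omega) for n in range(start, stop))

# omega < 1/2: X_n and X agree for every n.
print(agrees_from(Fraction(1, 4), 1, 500))
# omega = 3/5 > 1/2: agreement once n is large enough (1/(2*omega - 1) = 5).
print(x_n(4, Fraction(3, 5)), agrees_from(Fraction(3, 5), 5, 500))
# omega = 1/2: X_n stays at 1 while X is 0, so this point is not in A.
print(all(x_n(n, Fraction(1, 2)) == 1 for n in range(1, 500)), x_limit(Fraction(1, 2)))
```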

Theorem 1. Consider the sequence $X_1, X_2, X_3, \dots$. For any $\epsilon > 0$, define the set of events

$$A_m = \{|X_n - X| < \epsilon,\ \forall n \ge m\}.$$

Then $X_n \xrightarrow{a.s.} X$ if and only if for any $\epsilon > 0$, we have

$$\lim_{m\to+\infty} P(A_m) = 1.$$

Example 4. Let $X_1, X_2, X_3, \dots$ be independent random variables, where $X_n \sim \text{Bernoulli}\left(\frac{1}{n}\right)$ for $n = 2, 3, \dots$. The goal here is to check whether $X_n \xrightarrow{a.s.} 0$.

1. Check that $\sum_{n=1}^{+\infty} P(|X_n| > \epsilon) = +\infty$.

2. Show that the sequence $X_1, X_2, \dots$ does not converge to 0 almost surely using Theorem 1.

Solution:

1. We first note that for $0 < \epsilon < 1$, we have

$$\sum_{n=1}^{+\infty} P(|X_n| > \epsilon) = \sum_{n=1}^{+\infty} P(X_n = 1) = \sum_{n=1}^{+\infty} \frac{1}{n} = +\infty.$$

2. To use Theorem 1, we define

$$A_m = \{|X_n| < \epsilon,\ \forall n \ge m\}.$$

Note that for $0 < \epsilon < 1$, we have

$$A_m = \{X_n = 0,\ \forall n \ge m\}.$$

According to Theorem 1, it suffices to show that

$$\lim_{m\to+\infty} P(A_m) < 1.$$

We can in fact show that $\lim_{m\to+\infty} P(A_m) = 0$. To show this, we will prove $P(A_m) = 0$ for every $m \ge 2$. For $0 < \epsilon < 1$, we have

$$\begin{aligned}
P(A_m) &= P(\{X_n = 0,\ \forall n \ge m\}) \\
&\le P(\{X_n = 0,\ \forall n = m, m+1, \dots, N\}) && \text{(for every positive integer } N \ge m) \\
&= P(X_m = 0)\, P(X_{m+1} = 0) \cdots P(X_N = 0) && \text{(since the } X_i\text{'s are independent)} \\
&= \frac{m-1}{m} \cdot \frac{m}{m+1} \cdots \frac{N-1}{N} \\
&= \frac{m-1}{N}.
\end{aligned}$$

Thus, by choosing $N$ large enough, we can show that $P(A_m)$ is less than any positive number. Therefore, $P(A_m) = 0$ for all $m \ge 2$. We conclude that $\lim_{m\to+\infty} P(A_m) = 0$. Thus, according to Theorem 1, the sequence $X_1, X_2, \dots$ does not converge to 0 almost surely.
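The telescoping product used in part 2 can be verified with exact arithmetic. This is a minimal sketch; the function name `prob_all_zero` is hypothetical and only evaluates the finite product $\prod_{n=m}^{N} (1 - 1/n)$ for independent $X_n \sim \text{Bernoulli}(1/n)$.

```python
from fractions import Fraction

def prob_all_zero(m, N):
    """P(X_m = 0, ..., X_N = 0) for independent X_n ~ Bernoulli(1/n)."""
    p = Fraction(1)
    for n in range(m, N + 1):
        p *= 1 - Fraction(1, n)  # P(X_n = 0) = (n - 1) / n
    return p

# The product collapses to (m - 1) / N, which goes to 0 as N grows.
for N in (10, 100, 1000):
    print(N, prob_all_zero(2, N), prob_all_zero(5, N))
```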
Theorem 2. Strong law of large numbers
Let $X_1, X_2, X_3, \dots$ be iid random variables with $E[X_i] = \mu$, $\forall i$. Let

$$S_n = \frac{X_1 + X_2 + \dots + X_n}{n}.$$

Then, for every $\epsilon > 0$,

$$P\left[\lim_{n\to\infty} |S_n - \mu| \ge \epsilon\right] = 0.$$

Using the language of this chapter:

$$S_n \xrightarrow{a.s.} \mu.$$

2.2 Convergence in probability

Definition 3. A random sequence $X_n$ converges to the random variable $X$ in probability if

$$\forall \epsilon > 0, \quad \lim_{n\to\infty} P\{|X_n - X| \ge \epsilon\} = 0.$$

We write:

$$X_n \xrightarrow{p.} X.$$
Example 5. Consider a random variable $\omega$ uniformly distributed on $[0, 1]$ and the sequence $X_n$ given in Figure 1. Notice that only one of $X_2$ and $X_3$ can be equal to 1 for the same value of $\omega$. Similarly, only one of $X_4$, $X_5$, $X_6$ and $X_7$ can be equal to 1 for the same value of $\omega$, and so on.

Question: Does this sequence converge?

Figure 1: Plot of the distribution of $X_n(\omega)$ (panels $X_1(\omega), \dots, X_7(\omega)$, each plotted against $\omega \in [0, 1]$)

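Answer: Intuitively, the sequence will converge to 0. Let us take some examples to see how the sequence behaves. The terms are grouped in blocks, where block $k$ lists the $2^{k-1}$ consecutive terms $X_{2^{k-1}}, \dots, X_{2^k - 1}$.

$$\text{for } \omega = 0:\quad \underbrace{1}_{\text{block } 1}\ \underbrace{10}_{\text{block } 2}\ \underbrace{1000}_{\text{block } 3}\ \underbrace{10000000}_{\text{block } 4}\ \dots$$

$$\text{for } \omega = \tfrac{1}{3}:\quad \underbrace{1}_{\text{block } 1}\ \underbrace{10}_{\text{block } 2}\ \underbrace{0100}_{\text{block } 3}\ \underbrace{00100000}_{\text{block } 4}\ \dots$$

From a calculus point of view, these sequences never converge to zero because there is always a “jump” showing up no matter how many zeros precede it (Fig. 2); for any $\omega$, $X_n(\omega)$ does not converge in the “calculus” sense. This also means that $X_n$ does not converge to zero almost surely (a.s.).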

Figure 2: Plot of the sequence for $\omega = 0$ ($X_n$ against $n$, for $n$ from 0 to 600)

This sequence converges in probability since

$$\lim_{n\to\infty} P(|X_n - 0| \ge \epsilon) = 0 \quad \forall \epsilon > 0.$$

Remark 1. The observed sequence may not converge in “calculus” sense because of the intermittent
“jumps”; however the frequency of those “jumps” goes to zero when n goes to infinity.
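The behaviour described in Example 5 and Remark 1 can be reproduced with a short sketch. It assumes that, for $2^k \le n < 2^{k+1}$, $X_n$ is the indicator of the interval $[j/2^k, (j+1)/2^k)$ with $j = n - 2^k$; this indexing matches the two sample paths listed above, but the exact treatment of interval endpoints is an assumption. The names `x` and `prob_one` are hypothetical.

```python
from fractions import Fraction

def x(n, omega):
    """X_n(omega): indicator of the j-th dyadic interval of width 2^-k."""
    k = n.bit_length() - 1      # block index: 2^k <= n < 2^(k+1)
    j = n - (1 << k)            # position inside the block
    width = Fraction(1, 1 << k)
    return 1 if j * width <= omega < (j + 1) * width else 0

def prob_one(n):
    """P(X_n = 1) for omega uniform on [0, 1]: the interval width 2^-k."""
    return Fraction(1, 1 << (n.bit_length() - 1))

omega = Fraction(1, 3)
print("".join(str(x(n, omega)) for n in range(1, 16)))
# Each block contains exactly one "jump", so jumps never stop ...
print(sum(x(n, omega) for n in range(1, 1024)))
# ... but the probability of a jump at time n goes to zero.
print([prob_one(n) for n in (1, 2, 4, 8, 1024)])
```

Every block contributes exactly one 1 for a fixed $\omega$, which is why the path does not converge, while $P(X_n = 1)$ halves from one block to the next, which is why the sequence converges in probability.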

Example 6. Consider a random variable $\omega$ uniformly distributed over $[0, 1]$, and the sequence $X_n(\omega)$ defined as:

$$X_n(\omega) = \begin{cases} 1 & \text{for } \omega \le \frac{1}{n}, \\ 0 & \text{otherwise.} \end{cases}$$

Question: Does this sequence converge a.s.? in probability?

Solution:

1. First, we use Theorem 1 to check almost sure convergence to 0. Let

$$A_m = \{|X_n| < \epsilon,\ \forall n \ge m\}.$$

Note that for $0 < \epsilon < 1$, we have

$$A_m = \{X_n = 0,\ \forall n \ge m\}.$$

Here all the $X_n$ are functions of the same $\omega$, so they are not independent and $P(A_m)$ cannot be factored into a product as in Example 4. Instead, $X_n = 0$ if and only if $\omega > \frac{1}{n}$, and the events $\{\omega > \frac{1}{n}\}$ grow with $n$. Hence

$$P(A_m) = P\left(\omega > \tfrac{1}{n},\ \forall n \ge m\right) = P\left(\omega > \tfrac{1}{m}\right) = 1 - \frac{1}{m}.$$

We conclude that $\lim_{m\to+\infty} P(A_m) = 1$. Thus, according to Theorem 1, the sequence $X_1, X_2, \dots$ converges to 0 almost surely. Equivalently, for every $\omega > 0$ we have $X_n(\omega) = 0$ as soon as $n > 1/\omega$, and $P(\omega = 0) = 0$.

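2. Now we check for convergence in probability. For $0 < \epsilon \le 1$,

$$P(X_n \ge \epsilon) = P(X_n = 1) = P\left(\omega \le \tfrac{1}{n}\right) = \frac{1}{n}.$$

Hence,

$$\lim_{n\to+\infty} P(X_n \ge \epsilon) = \lim_{n\to+\infty} \frac{1}{n} = 0.$$

Therefore, $X_n \xrightarrow{p.} 0$.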

Theorem 3. Weak law of large numbers

Let $X_1, X_2, X_3, \dots$ be iid random variables with $E[X_i] = \mu$, $\forall i$. Let

$$S_n = \frac{X_1 + X_2 + \dots + X_n}{n}.$$

Then, for every $\epsilon > 0$,

$$P[|S_n - \mu| \ge \epsilon] \xrightarrow[n\to\infty]{} 0.$$

Using the language of this chapter:

$$S_n \xrightarrow{p.} \mu.$$
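A quick simulation can illustrate the law of large numbers. The sketch below assumes iid samples uniform on $[0, 1]$ (so $\mu = 0.5$), a choice made only for illustration; the function name `sample_mean` is hypothetical. A single simulated path suggests the behaviour but does not prove either theorem.

```python
import random

def sample_mean(n, rng):
    """S_n = (X_1 + ... + X_n) / n for iid X_i uniform on [0, 1]."""
    return sum(rng.random() for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (10, 1000, 100000):
    # The deviation |S_n - mu| tends to shrink as n grows.
    print(n, abs(sample_mean(n, rng) - 0.5))
```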

2.3 Convergence in mean square

Definition 4. A random sequence $X_n$ converges to a random variable $X$ in mean square sense if

$$\lim_{n\to\infty} E\left[|X - X_n|^2\right] = 0.$$

We write:

$$X_n \xrightarrow{m.s.} X.$$

Remark 2. In mean square convergence, not only the frequency of the “jumps” goes to zero when
n goes to infinity; but also the “energy” in the jump should go to zero.

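Example 6. (Revisited) Does $X_n$ converge in m.s.?

Answer:

$$E\left[|X_n - 0|^2\right] = 1 \cdot P\left(\omega \le \tfrac{1}{n}\right) + 0 \cdot P\left(\omega > \tfrac{1}{n}\right) = \frac{1}{n},$$

$$\lim_{n\to\infty} E\left[|X_n - 0|^2\right] = \lim_{n\to\infty} \frac{1}{n} = 0.$$

Therefore, $X_n \xrightarrow{m.s.} 0$.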

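In the next example, we replace the value 1 by $\sqrt{n}$ in Example 6.
Example 7. Consider a random variable $\omega$ uniformly distributed over $[0, 1]$, and the sequence $X_n(\omega)$ defined as:

$$X_n(\omega) = \begin{cases} \sqrt{n} & \text{for } \omega \le \frac{1}{n}, \\ 0 & \text{otherwise.} \end{cases}$$

Note that $P(X_n = \sqrt{n}) = \frac{1}{n}$ and $P(X_n = 0) = 1 - \frac{1}{n}$.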

Question: Does this sequence converge a.s.? in probability? in m.s.?

Answer:

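1. Almost sure convergence: $X_n \xrightarrow{a.s.} 0$. For every $\omega > 0$ we have $X_n(\omega) = 0$ as soon as $n > 1/\omega$, and $P(\omega = 0) = 0$.

2. Convergence in probability: $X_n \xrightarrow{p.} 0$ for the same reasons as in Example 6. Namely, for any $\epsilon > 0$,

$$\lim_{n\to+\infty} P(X_n \ge \epsilon) = \lim_{n\to+\infty} P(X_n = \sqrt{n}) = \lim_{n\to+\infty} \frac{1}{n} = 0.$$

(Flash Forward: almost sure convergence $\Rightarrow$ convergence in probability, but convergence in probability $\not\Rightarrow$ almost sure convergence.)

3. Mean Square Convergence:

$$E\left[|X_n - 0|^2\right] = n \cdot P\left(\omega \le \tfrac{1}{n}\right) + 0 \cdot P\left(\omega > \tfrac{1}{n}\right) = n \cdot \frac{1}{n} = 1.$$

Hence,

$$\lim_{n\to\infty} E\left[|X_n - 0|^2\right] = 1 \ \Rightarrow\ X_n \text{ does not converge in m.s. to } 0.$$

Together with item 1, this shows that almost sure convergence does not imply mean square convergence.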
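The contrast between Example 6 (Revisited) and Example 7 comes down to the second moment $E[X_n^2] = a_n^2 \cdot \frac{1}{n}$, where $a_n$ is the height of the jump. The sketch below evaluates it exactly; the function name `second_moment` is hypothetical, and the amplitude is passed already squared to avoid taking square roots.

```python
from fractions import Fraction

def second_moment(n, amplitude_squared):
    """E[X_n^2] when X_n equals the amplitude with probability 1/n, else 0."""
    return amplitude_squared * Fraction(1, n)

for n in (1, 10, 1000):
    # Jump of height 1 (Example 6): energy 1/n, which vanishes.
    # Jump of height sqrt(n) (Example 7): energy n * (1/n) = 1, which does not.
    print(n, second_moment(n, 1), second_moment(n, n))
```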

2.4 Convergence in distribution

Definition 5. (First attempt) A random sequence $X_n$ converges to $X$ in distribution if, when $n$ goes to infinity, the values of the sequence are distributed according to a known distribution. We write

$$X_n \xrightarrow{d.} X.$$
Example 8. Consider the sequence $X_n$ defined as:

$$X_n = \begin{cases} X_1 \sim B\left(\tfrac{1}{2}\right) & \text{for } n = 1, \\ (X_{n-1} + 1) \bmod 2 = X_{n-1} \oplus 1 & \text{for } n > 1. \end{cases}$$

Question: In which sense, if any, does this sequence converge?

Answer: This sequence has two outcomes depending on the value of X1 :

X1 = 1, Xn : 101010101010 . . .
X1 = 0, Xn : 010101010101 . . .

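1. Almost sure convergence: $X_n$ does not converge almost surely because every realization alternates between 0 and 1 forever, whichever value $X_1$ takes (each with probability $\frac{1}{2}$).

2. Convergence in probability: $X_n$ does not converge in probability because $|X_{n+1} - X_n| = 1$ for every $n$, so the jumps never become rare.

3. Convergence in mean square: $X_n$ does not converge to $\frac{1}{2}$ in mean square sense because, for every $n$,

$$\begin{aligned}
E\left[\left|X_n - \tfrac{1}{2}\right|^2\right] &= E\left[X_n^2 - X_n + \tfrac{1}{4}\right] \\
&= E[X_n^2] - E[X_n] + \tfrac{1}{4} \\
&= \left(1^2 \cdot \tfrac{1}{2} + 0^2 \cdot \tfrac{1}{2}\right) - \tfrac{1}{2} + \tfrac{1}{4} \\
&= \tfrac{1}{4},
\end{aligned}$$

so the limit as $n \to \infty$ is $\frac{1}{4} \ne 0$.

4. Convergence in distribution: Since we do not know the value of $X_1$, each $X_n$ can be either 0 or 1 with probability $\frac{1}{2}$. Hence, every $X_n$ is a random variable $\sim B\left(\frac{1}{2}\right)$. We say that $X_n$ converges in distribution to Bernoulli$\left(\frac{1}{2}\right)$ and we denote it by:

$$X_n \xrightarrow{d} \text{Ber}\left(\tfrac{1}{2}\right).$$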
Example 9. (Central Limit Theorem) Consider the zero-mean, unit-variance, independent and identically distributed random variables $X_1, X_2, \dots, X_n$ and define the sequence $S_n$ as follows:

$$S_n = \frac{X_1 + X_2 + \dots + X_n}{\sqrt{n}}.$$

The CLT states that $S_n$ converges in distribution to $N(0, 1)$, i.e.,

$$S_n \xrightarrow{d} N(0, 1).$$

Theorem 4.

$$\left.\begin{array}{l} \text{Almost sure convergence} \\ \text{Convergence in mean square} \end{array}\right\} \Rightarrow \text{Convergence in probability} \Rightarrow \text{Convergence in distribution.}$$

Note:

• In general, there is no implication in either direction between almost sure convergence and mean square convergence.

• The implications are unidirectional, i.e., convergence in distribution does not imply convergence in probability, almost sure convergence, or mean square convergence.

3 Convergence of a random sequence

Example 1: Let the random variable $U$ be uniformly distributed on $[0, 1]$. Consider the sequence defined as:

$$X_n = \frac{(-1)^n U}{n}.$$

Question: Does this sequence converge? If yes, in what sense(s)?

Answer:

1. Almost sure convergence: Suppose $U = a$. The sequence becomes

$$X_1 = -a,\quad X_2 = \frac{a}{2},\quad X_3 = -\frac{a}{3},\quad X_4 = \frac{a}{4},\ \dots$$

In fact, for any $a \in [0, 1]$,

$$\lim_{n\to\infty} X_n = 0,$$

therefore, $X_n \xrightarrow{a.s.} 0$.

Remark 3. $X_n \xrightarrow{a.s.} 0$ because, by definition, a random sequence converges almost surely to the random variable $X$ if the sequence of functions $X_n$ converges for all values of $U$ except for a set of values that has probability zero.
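2. Convergence in probability: Does $X_n \xrightarrow{p.} 0$? Recall from Theorem 4:

$$\left.\begin{array}{c} \text{a.s.} \\ \text{m.s.} \end{array}\right\} \Rightarrow \text{p.} \Rightarrow \text{d.},$$

which means that by proving almost-sure convergence, we directly get convergence in probability and in distribution. However, for completeness we will formally prove that $X_n$ converges to 0 in probability. To do so, we have to prove that

$$\lim_{n\to\infty} P(|X_n - 0| \ge \epsilon) = 0 \quad \forall \epsilon > 0,$$

i.e.,

$$\lim_{n\to\infty} P(|X_n| \ge \epsilon) = 0 \quad \forall \epsilon > 0.$$

By definition,

$$|X_n| = \frac{U}{n} \le \frac{1}{n}.$$

Thus,

$$\begin{aligned}
\lim_{n\to\infty} P(|X_n| \ge \epsilon) &= \lim_{n\to\infty} P\left(\frac{U}{n} \ge \epsilon\right), && (1) \\
&= \lim_{n\to\infty} P(U \ge n\epsilon), && (2) \\
&= 0. && (3)
\end{aligned}$$

Equation (3) follows from the fact that $U \in [0, 1]$, so $P(U \ge n\epsilon) = 0$ as soon as $n > 1/\epsilon$.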

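3. Convergence in mean square sense: Does $X_n$ converge to 0 in the mean square sense?

In order to answer this question, we need to prove that

$$\lim_{n\to\infty} E\left[|X_n - 0|^2\right] = 0.$$

We know that

$$\begin{aligned}
\lim_{n\to\infty} E\left[|X_n - 0|^2\right] &= \lim_{n\to\infty} E\left[X_n^2\right], \\
&= \lim_{n\to\infty} E\left[\frac{U^2}{n^2}\right], \\
&= \lim_{n\to\infty} \frac{1}{n^2} E\left[U^2\right], \\
&= \lim_{n\to\infty} \frac{1}{n^2} \int_0^1 u^2 \, du, \\
&= \lim_{n\to\infty} \frac{1}{n^2} \left[\frac{u^3}{3}\right]_0^1, \\
&= \lim_{n\to\infty} \frac{1}{3n^2}, \\
&= 0.
\end{aligned}$$

Hence, $X_n \xrightarrow{m.s.} 0$.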

4. Convergence in distribution: Does $X_n$ converge to 0 in distribution? The formal definition of convergence in distribution is stated in terms of CDFs:

$$X_n \xrightarrow{d.} X \ \Rightarrow\ \lim_{n\to\infty} F_{X_n}(x) = F_X(x).$$

Hereafter, we want to prove that $X_n \xrightarrow{d.} 0$.

Recall that the limit r.v. $X$ is the constant 0 and therefore has the CDF shown in Figure 3, i.e., $F_X(x) = 0$ for $x < 0$ and $F_X(x) = 1$ for $x \ge 0$.

Figure 3: Plot of the CDF of 0

Since $X_n = \frac{(-1)^n U}{n}$, the distribution of the $X_n$ can be derived as shown in Figure 4.

Remark 4. At 0 the CDF of $X_n$ will be flip-flopping between 0 (if $n$ is even) and 1 (if $n$ is odd) (c.f. Figure 4), so $F_{X_n}(0)$ has no limit, while $F_X$ has a discontinuity at that point. Therefore, we only require $F_{X_n}(x)$ to converge to $F_X(x)$ at the points where $F_X(x)$ is continuous.
Definition 6. $X_n$ converges to $X$ in distribution, i.e., $X_n \xrightarrow{d.} X$, iff

$$\lim_{n\to\infty} F_{X_n}(x) = F_X(x) \quad \text{except at points where } F_X(x) \text{ is not continuous.}$$

Remark 5. It is clear here that

$$\lim_{n\to\infty} F_{X_n}(x) = F_X(x) \quad \text{except for } x = 0.$$

Therefore, $X_n$ converges to $X$ in distribution. We could have deduced this directly from convergence in mean square sense or almost sure convergence.
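The CDFs involved can be written down explicitly and evaluated. The sketch below assumes $X_n = (-1)^n U / n$ with $U$ uniform on $[0, 1]$, so that $X_n$ is uniform on $[0, 1/n]$ for even $n$ and on $[-1/n, 0]$ for odd $n$; the function names `cdf_xn` and `cdf_zero` are hypothetical.

```python
def cdf_xn(n, x):
    """F_{X_n}(x) = P(X_n <= x) for X_n = (-1)^n U / n, U uniform on [0, 1]."""
    if n % 2 == 0:
        # X_n = U/n is uniform on [0, 1/n].
        return min(max(n * x, 0.0), 1.0)
    # X_n = -U/n is uniform on [-1/n, 0].
    return min(max(1.0 + n * x, 0.0), 1.0)

def cdf_zero(x):
    """CDF of the constant random variable 0."""
    return 1.0 if x >= 0 else 0.0

# At x = 0 the value alternates with the parity of n ...
print([cdf_xn(n, 0.0) for n in range(1, 7)])
# ... while at any x != 0 it settles on the CDF of the constant 0.
for x in (-0.01, 0.01):
    print(x, [cdf_xn(n, x) for n in (10, 1000, 1001)], cdf_zero(x))
```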
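Theorem 5. a) If $X_n \xrightarrow{a.s.} X$, then $X_n \xrightarrow{p.} X$.

b) If $X_n \xrightarrow{m.s.} X$, then $X_n \xrightarrow{p.} X$.

c) If $X_n \xrightarrow{p.} X$, then $X_n \xrightarrow{d.} X$.

d) If $P\{|X_n| \le Y\} = 1$ for all $n$ for a random variable $Y$ with $E\left[Y^2\right] < \infty$, then

$$X_n \xrightarrow{p.} X \ \Rightarrow\ X_n \xrightarrow{m.s.} X.$$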

Proof. The proof is omitted.

Remark 6. Convergence in probability allows the sequence, as $n \to \infty$, to deviate from the limit by any amount, as long as this happens with a small probability; whereas convergence in mean square also limits the amplitude of this deviation when $n \to \infty$. (We can think of it as energy: we cannot allow a big deviation from the limit.)

Figure 4: Plot of the CDF of $U$, $X_1$, $X_2$ and $X_3$

4 Back to real analysis

Definition 7. A sequence $(x_n)_{n\ge1}$ is Cauchy if for every $\epsilon > 0$, there exists a large number $N$ s.t.

$$\forall\, m, n > N,\ |x_m - x_n| < \epsilon \quad\Leftrightarrow\quad \lim_{n,m\to\infty} |x_m - x_n| = 0.$$

Claim 1. Every Cauchy sequence is convergent.


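Counter example 1. Consider the sequence $x_n \in \mathbb{Q}$ defined as $x_0 = 1$, $x_{n+1} = \frac{x_n + \frac{2}{x_n}}{2}$. If the sequence has a limit $l$, this limit must satisfy:

$$l = \frac{l + \frac{2}{l}}{2}, \qquad 2l^2 = l^2 + 2, \qquad l = \pm\sqrt{2} \notin \mathbb{Q}.$$

This implies that the sequence does not converge in $\mathbb{Q}$, although it is a Cauchy sequence (it converges to $\sqrt{2}$ in $\mathbb{R}$).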
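The recursion of Counter example 1 can be run with exact rational arithmetic to see that every term stays in $\mathbb{Q}$ while $x_n^2$ approaches 2. This is a minimal sketch; the function name `babylonian` is hypothetical.

```python
from fractions import Fraction

def babylonian(steps):
    """Terms x_0, ..., x_steps of x_{n+1} = (x_n + 2/x_n) / 2 with x_0 = 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2  # stays a Fraction, i.e. a rational number
        seq.append(x)
    return seq

seq = babylonian(5)
for x in seq:
    # Every term is rational, yet x^2 - 2 shrinks rapidly toward 0.
    print(x, float(x * x - 2))
```

Consecutive terms get arbitrarily close to each other, as the Cauchy property requires, but no rational number can be the limit because its square would have to equal 2.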

Counter example 2. Consider the sequence $x_n = 1/n$, $n \ge 2$, in $(0, 1)$. It does not converge in $(0, 1)$ since its limit $l = 0 \notin (0, 1)$.
Definition 8. A space where every Cauchy sequence converges is called a complete space.
Theorem 6. R is a complete space.

Proof. The proof is omitted.

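Theorem 7. Cauchy criteria for convergence of a random sequence.

a) $X_n \xrightarrow{a.s.} X \iff P\left[\lim_{m,n\to\infty} |X_m - X_n| = 0\right] = 1.$

b) $X_n \xrightarrow{m.s.} X \iff \lim_{m,n\to\infty} E\left[|X_m - X_n|^2\right] = 0.$

c) $X_n \xrightarrow{p.} X \iff \lim_{m,n\to\infty} P\left[|X_m - X_n| \ge \epsilon\right] = 0 \quad \forall \epsilon > 0.$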

Proof. The proofs are omitted.

Example 10. Consider the sequence of Example 8,

$$X_n = \begin{cases} X_1 \sim B\left(\tfrac{1}{2}\right) & \text{for } n = 1, \\ (X_{n-1} + 1) \bmod 2 = X_{n-1} \oplus 1 & \text{for } n > 1. \end{cases}$$

Goal: Our goal is to prove that this sequence does not converge in mean square using Cauchy
criteria.

This sequence has two outcomes depending on the value of X1 :

X1 = 1, Xn : 101010101010 . . .
X1 = 0, Xn : 010101010101 . . .

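Therefore,

$$E\left[|X_n - X_m|^2\right] = E\left[X_n^2\right] + E\left[X_m^2\right] - 2E[X_m X_n] = \frac{1}{2} + \frac{1}{2} - 2E[X_m X_n].$$

Consider, without loss of generality, that $m > n$:

$$E[X_n X_m] = \begin{cases} 0 & \text{if } m - n \text{ is odd,} \\ E\left[X_n^2\right] = \frac{1}{2} & \text{if } m - n \text{ is even.} \end{cases}$$

Hence,

$$E\left[|X_n - X_m|^2\right] = \begin{cases} 1 & \text{if } m - n \text{ is odd,} \\ 0 & \text{if } m - n \text{ is even,} \end{cases}$$

so $E\left[|X_n - X_m|^2\right]$ does not tend to 0 as $n, m \to \infty$, which implies that $X_n$ does not converge in mean square by Theorem 7-b).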
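The two cases above can be checked by enumerating the two equally likely outcomes of $X_1$. This is a minimal sketch; the function names `x` and `mean_square_gap` are hypothetical.

```python
from fractions import Fraction

def x(n, x1):
    """n-th term (n >= 1) of the alternating sequence that starts at x1."""
    return (x1 + n - 1) % 2

def mean_square_gap(n, m):
    """E[|X_n - X_m|^2], averaging over X_1 in {0, 1} with probability 1/2 each."""
    return sum(Fraction(1, 2) * (x(n, x1) - x(m, x1)) ** 2 for x1 in (0, 1))

# The gap depends only on the parity of m - n, however large n and m are.
for n, m in ((10, 13), (10, 12), (1000, 1001), (1000, 1002)):
    print(n, m, mean_square_gap(n, m))
```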

Lemma 1. Let $X_n$ be a random sequence with $E\left[X_n^2\right] < \infty$ $\forall n$. Then

$$X_n \xrightarrow{m.s.} X \quad \text{iff} \quad \lim_{m,n\to\infty} E[X_m X_n] \text{ exists and is finite.}$$

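Theorem 8. Central limit theorem

Let $X_1, X_2, X_3, \dots$ be iid random variables with $E[X_i] = 0$ and $\text{Var}(X_i) = 1$, $\forall i$. Let

$$Z_n = \frac{X_1 + X_2 + \dots + X_n}{\sqrt{n}}.$$

Then, for every $z$,

$$\lim_{n\to\infty} P[Z_n \le z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} \, dt.$$

Using the language of this chapter:

$$Z_n \xrightarrow{d.} N(0, 1).$$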
