Probability
Introduction

1 Basic Concepts
1.1 Sample Space
1.2 Classical Probability
1.3 Combinatorial Analysis
1.4 Stirling's Formula

2 The Axiomatic Approach

3 Random Variables
3.1 Expectation
3.2 Variance
3.3 Indicator Function
3.4 Inclusion-Exclusion Formula
3.5 Independence

4 Inequalities
4.1 Jensen's Inequality
4.2 Cauchy-Schwarz Inequality
4.3 Markov's Inequality
4.4 Chebyshev's Inequality
4.5 Law of Large Numbers

5 Generating Functions
5.1 Combinatorial Applications
5.2 Conditional Expectation
5.3 Properties of Conditional Expectation
5.4 Branching Processes
5.5 Random Walks

6 Continuous Random Variables
6.1 Jointly Distributed Random Variables
6.2 Transformation of Random Variables
6.3 Moment Generating Functions
6.4 Central Limit Theorem
6.5 Multivariate Normal Distribution
These notes are based on the course "Probability" given by Prof. F.P. Kelly in Cambridge in the Lent Term 1996. This typed version of the notes is totally unconnected with Prof. Kelly.

Other sets of notes are available for different courses. At the time of typing these courses were: Probability, Discrete Mathematics, Analysis, Further Analysis, Methods, Quantum Mechanics, Fluid Dynamics 1, Quadratic Mathematics, Geometry, Dynamics of D.E.'s, Foundations of QM, Electrodynamics, Methods of Math. Phys, Fluid Dynamics 2, Waves (etc.), Statistical Physics, General Relativity, Dynamical Systems, Combinatorics, and Bifurcations in Nonlinear Convection.
Copyright (c) The Archimedeans, Cambridge University.
All rights reserved.
Redistribution and use of these notes in electronic or printed form, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of the electronic files must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in printed form must reproduce the above copyright notice, this
list of conditions and the following disclaimer.
3. All materials derived from these notes must display the following acknowledge-
ment:
4. Neither the name of The Archimedeans nor the names of their contributors may
be used to endorse or promote products derived from these notes.
5. Neither these notes nor any derived products may be sold on a for-profit basis,
although a fee may be required for the physical act of copying.
6. You must cause any edited versions to carry prominent notices stating that you
edited them and the date of any change.
Chapter 1

Basic Concepts

For a finite sample space $\Omega$ in which each outcome is equally likely, the classical probability of an event $A$ is
$$P(A) = \frac{|A|}{|\Omega|}.$$

Example. Choose $r$ digits from a table of random numbers. Find the probability that, for $0 \le k \le 9$,

1. no digit exceeds $k$,
2. $k$ is the greatest digit drawn.

Solution. Let
$$A_k = \{(a_1, \dots, a_r) : 0 \le a_i \le k,\ i = 1, \dots, r\}.$$
Now $|A_k| = (k+1)^r$, so that $P(A_k) = \left(\frac{k+1}{10}\right)^r$.

Let $B_k$ be the event that $k$ is the greatest digit drawn. Then $B_k = A_k \setminus A_{k-1}$. Also $A_{k-1} \subset A_k$, so that $|B_k| = (k+1)^r - k^r$. Thus
$$P(B_k) = \frac{(k+1)^r - k^r}{10^r}.$$
1.4 Stirling's Formula

Theorem (Stirling's Formula). As $n \to \infty$,
$$\log\left(\frac{n!}{\sqrt{2\pi n}\, n^n e^{-n}}\right) \to 0,$$
and thus $n! \sim \sqrt{2\pi n}\, n^n e^{-n}$.

We first prove the weak form of Stirling's formula, that $\log(n!) \sim n \log n$.

Proof. $\log n! = \sum_{k=1}^{n} \log k$. Now
$$\int_1^n \log x \, dx \le \sum_{k=1}^{n} \log k \le \int_1^{n+1} \log x \, dx,$$
and $\int_1^z \log x \, dx = z \log z - z + 1$, so
$$n \log n - n + 1 \le \log n! \le (n+1)\log(n+1) - n.$$
Divide by $n \log n$ and let $n \to \infty$ to sandwich $\frac{\log n!}{n \log n}$ between terms that tend to 1. Therefore $\log n! \sim n \log n$.
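Before proving the full result, here is a quick numerical sanity check (not part of the original notes): a minimal Python sketch comparing $n!$ with $\sqrt{2\pi n}\, n^n e^{-n}$. If the theorem is right, the ratio should tend to 1.

```python
import math

# Compare n! with Stirling's approximation sqrt(2*pi*n) * n^n * e^(-n).
# The ratio should tend to 1 as n grows.
for n in [1, 5, 10, 50, 100]:
    stirling = math.sqrt(2 * math.pi * n) * n ** n * math.exp(-n)
    print(n, math.factorial(n) / stirling)
```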
Proof. For $0 < x < 1$,
$$1 - x + x^2 - x^3 < \frac{1}{1+x} < 1 - x + x^2.$$
Now integrate from 0 to $y$ to obtain
$$y - \frac{y^2}{2} + \frac{y^3}{3} - \frac{y^4}{4} < \log(1+y) < y - \frac{y^2}{2} + \frac{y^3}{3}.$$
Let $h_n = \log\frac{n!}{n^{n+1/2}e^{-n}}$, so that $h_n - h_{n+1} = (n + \frac{1}{2})\log(1 + \frac{1}{n}) - 1$. By playing silly buggers with $\log\left(1 + \frac{1}{n}\right)$ using the bounds above, we obtain
$$\frac{1}{12n^2} - \frac{1}{12n^3} \le h_n - h_{n+1} \le \frac{1}{12n^2} + \frac{1}{6n^3}.$$
For $n \ge 2$, $0 \le h_n - h_{n+1} \le \frac{1}{n^2}$. Thus $(h_n)$ is a decreasing sequence, and
$$0 \le h_2 - h_{n+1} = \sum_{r=2}^{n}(h_r - h_{r+1}) \le \sum_{r=2}^{\infty}\frac{1}{r^2}.$$
Therefore $h_n$ is bounded below and decreasing, so is convergent. Let the limit be $A$. We have obtained
$$n! \sim e^A n^{n+1/2} e^{-n}.$$
We need a trick to find $A$. Let $I_r = \int_0^{\pi/2}\sin^r\theta \, d\theta$. We obtain the recurrence $I_r = \frac{r-1}{r}I_{r-2}$ by integrating by parts. Therefore
$$I_{2n} = \frac{(2n)!}{(2^n n!)^2}\,\frac{\pi}{2} \qquad\text{and}\qquad I_{2n+1} = \frac{(2^n n!)^2}{(2n+1)!}.$$
Now $I_n$ is decreasing in $n$, so $I_{2n}/I_{2n+1} \to 1$; substituting $n! \sim e^A n^{n+1/2}e^{-n}$ into this ratio gives $e^{2A} = 2\pi$, hence $e^A = \sqrt{2\pi}$ as required.
Chapter 2

The Axiomatic Approach

A probability measure $P$ assigns to each event $A \subseteq \Omega$ a number $P(A)$ such that:

1. $0 \le P(A) \le 1$ for all $A \subseteq \Omega$,
2. $P(\Omega) = 1$,
3. for a finite or infinite sequence $A_1, A_2, \dots$ of disjoint events,
$$P\left(\bigcup_i A_i\right) = \sum_i P(A_i).$$

The number $P(A)$ is called the probability of event $A$.

We can look at some distributions here. Consider an arbitrary finite or countable $\Omega = \{\omega_1, \omega_2, \dots\}$ and an arbitrary collection $\{p_1, p_2, \dots\}$ of non-negative numbers with sum 1. If we define
$$P(A) = \sum_{i : \omega_i \in A} p_i,$$
it is easy to see that this function satisfies the axioms. The numbers $p_1, p_2, \dots$ are called a probability distribution. If $\Omega$ is finite with $n$ elements, and if $p_1 = p_2 = \dots = p_n = \frac{1}{n}$, we recover the classical definition of probability.

Another example would be to let $\Omega = \{0, 1, \dots\}$ and attach to outcome $r$ the probability $p_r = e^{-\lambda}\frac{\lambda^r}{r!}$ for some $\lambda > 0$. This is a distribution (as may be easily checked).

Two simple consequences of the axioms:

1. $P(A^c) = 1 - P(A)$,
2. $P(\emptyset) = 0$.
Proof (of the inclusion-exclusion formula, by induction on $n$). The result is true for $n = 2$. If true for $n - 1$, then it is true for $n$ and $1 \le r \le n - 1$ by the inductive step above, which expresses an $n$-fold union in terms of two $(n-1)$-fold unions. It is true for $r = n$ by the inclusion-exclusion formula.
Example (Derangements). After a dinner, the n guests take coats at random from a
pile. Find the probability that at least one guest has the right coat.
Solution. Let $A_k$ be the event that guest $k$ has his own coat. We want $P\left(\bigcup_{i=1}^n A_i\right)$. Now,
$$P(A_{i_1} \cap \dots \cap A_{i_r}) = \frac{(n-r)!}{n!}$$
by counting the number of ways of matching guests and coats after $i_1, \dots, i_r$ have taken theirs. Thus
$$\sum_{i_1 < \dots < i_r} P(A_{i_1} \cap \dots \cap A_{i_r}) = \binom{n}{r}\frac{(n-r)!}{n!} = \frac{1}{r!},$$
and the required probability is
$$P\left(\bigcup_{i=1}^n A_i\right) = 1 - \frac{1}{2!} + \frac{1}{3!} - \dots + \frac{(-1)^{n-1}}{n!},$$
which tends to $1 - e^{-1}$ as $n \to \infty$.

Furthermore, let $P_m(n)$ be the probability that exactly $m$ guests take the right coat. Then $P_0(n) \to e^{-1}$, and $n!\,P_0(n)$ is the number of derangements of $n$ objects. Therefore
$$P_m(n) = \binom{n}{m}\frac{P_0(n-m)\,(n-m)!}{n!} = \frac{P_0(n-m)}{m!} \to \frac{e^{-1}}{m!} \quad\text{as } n \to \infty.$$
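A minimal Monte Carlo sketch of this example (not in the original notes): shuffle the coats uniformly and count how often at least one guest is matched. The estimate should be close to $1 - e^{-1}$ even for modest $n$.

```python
import math
import random

def simulate(n, trials=100_000):
    """Estimate P(at least one of n guests gets his own coat)."""
    hits = 0
    for _ in range(trials):
        coats = list(range(n))
        random.shuffle(coats)
        if any(coats[i] == i for i in range(n)):
            hits += 1
    return hits / trials

print(simulate(10))          # empirical estimate for n = 10
print(1 - math.exp(-1))      # limiting value 1 - 1/e = 0.6321...
```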
2.2 Independence

Definition 2.1. Two events $A$ and $B$ are said to be independent if
$$P(A \cap B) = P(A)P(B).$$

Example. Throw two fair dice. Let $A_1$ be the event that the first die shows an odd number, $A_2$ the event that the second die shows an odd number, and $A_3$ the event that the total is odd.

Event: Probability
$A_1$: $\frac{18}{36} = \frac{1}{2}$
$A_2$: as above, $\frac{1}{2}$
$A_3$: $\frac{18}{36} = \frac{1}{2}$
$A_1 \cap A_2$: $\frac{9}{36} = \frac{1}{4}$
$A_1 \cap A_3$: $\frac{9}{36} = \frac{1}{4}$
$A_1 \cap A_2 \cap A_3$: $0$

Thus by a series of multiplications, we can see that $A_1$ and $A_2$ are independent, $A_1$ and $A_3$ are independent (also $A_2$ and $A_3$), but that $A_1$, $A_2$ and $A_3$ are not independent.
2.3 Distributions

The binomial distribution with parameters $n$ and $p$, $0 \le p \le 1$, has $\Omega = \{0, \dots, n\}$ and probabilities $p_i = \binom{n}{i}p^i(1-p)^{n-i}$.

Theorem (Poisson approximation to the binomial). Fix $\lambda > 0$ and let $p = \lambda/n$. Then as $n \to \infty$,
$$\binom{n}{r}p^r(1-p)^{n-r} \to e^{-\lambda}\frac{\lambda^r}{r!}.$$

Proof.
$$\binom{n}{r}p^r(1-p)^{n-r} = \frac{n(n-1)\cdots(n-r+1)}{r!}\,p^r(1-p)^{n-r}$$
$$= \frac{n}{n}\,\frac{n-1}{n}\cdots\frac{n-r+1}{n}\,\frac{(np)^r}{r!}(1-p)^{n-r}$$
$$= \left(\prod_{i=1}^{r}\frac{n-i+1}{n}\right)\frac{\lambda^r}{r!}\left(1 - \frac{\lambda}{n}\right)^n\left(1 - \frac{\lambda}{n}\right)^{-r}$$
$$\to 1 \cdot \frac{\lambda^r}{r!}\cdot e^{-\lambda}\cdot 1 = e^{-\lambda}\frac{\lambda^r}{r!}.$$
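A small numerical illustration of this convergence (my addition, not from the notes): hold $np = \lambda$ fixed and compare the two pmfs at a single value $r$ as $n$ grows.

```python
import math

def binom_pmf(n, p, r):
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(lam, r):
    return math.exp(-lam) * lam ** r / math.factorial(r)

lam, r = 2.0, 3
for n in [10, 100, 1000]:
    p = lam / n  # np = lam held constant
    print(n, binom_pmf(n, p, r), poisson_pmf(lam, r))
```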
Example (Gambler's Ruin). A fair coin is tossed repeatedly. At each toss a gambler wins \$1 if a head shows and loses \$1 if tails. He continues playing until his capital reaches $\$m$ or he goes broke. Find $p_x$, the probability that he goes broke if his initial capital is $\$x$.

Solution. Let $A$ be the event that he goes broke before reaching $\$m$, and let $H$ or $T$ be the outcome of the first toss. We condition on the first toss to get $P(A) = P(A|H)P(H) + P(A|T)P(T)$. But $P(A|H) = p_{x+1}$ and $P(A|T) = p_{x-1}$. Thus we obtain the recurrence
$$p_{x+1} - p_x = p_x - p_{x-1}.$$
Note that $p_x$ is linear in $x$, with $p_0 = 1$, $p_m = 0$. Thus $p_x = 1 - \frac{x}{m}$.
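A quick simulation of this example (my sketch, not from the notes), running the fair-coin game to absorption and comparing with $p_x = 1 - x/m$:

```python
import random

def broke_probability(x, m, trials=20_000):
    """Estimate the probability of going broke from capital x (fair coin)."""
    broke = 0
    for _ in range(trials):
        capital = x
        while 0 < capital < m:
            capital += random.choice((1, -1))
        if capital == 0:
            broke += 1
    return broke / trials

x, m = 3, 10
print(broke_probability(x, m), 1 - x / m)  # simulation vs. theory
```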
Theorem (Bayes' Formula). If $B_1, B_2, \dots$ is a partition of $\Omega$, then
$$P(B_i|A) = \frac{P(A|B_i)P(B_i)}{\sum_j P(A|B_j)P(B_j)}.$$

Proof.
$$P(B_i|A) = \frac{P(A \cap B_i)}{P(A)} = \frac{P(A|B_i)P(B_i)}{\sum_j P(A|B_j)P(B_j)}$$
by the law of total probability.
Chapter 3

Random Variables

For a discrete random variable $X$ with range $R_X$, and $B \subseteq R_X$,
$$P(X \in B) = \sum_{x \in B} P(X = x).$$
Then
$$(P(X = x),\ x \in R_X)$$
is the distribution of the random variable $X$. Note that it is a probability distribution over $R_X$.

3.1 Expectation

Definition 3.2. The expectation of a random variable $X$ is the number
$$E[X] = \sum_{\omega \in \Omega} p_\omega X(\omega),$$
provided that this sum converges absolutely.

Note that
$$E[X] = \sum_{\omega \in \Omega} p_\omega X(\omega) = \sum_{x \in R_X}\,\sum_{\omega : X(\omega) = x} p_\omega X(\omega) = \sum_{x \in R_X} x \sum_{\omega : X(\omega) = x} p_\omega = \sum_{x \in R_X} x\,P(X = x).$$
If the sum fails to converge absolutely, then $E[X]$ is undefined.

Example. If $P(X = r) = e^{-\lambda}\frac{\lambda^r}{r!}$, then $E[X] = \lambda$.

Solution.
$$E[X] = \sum_{r=0}^{\infty} r\,e^{-\lambda}\frac{\lambda^r}{r!} = \lambda e^{-\lambda}\sum_{r=1}^{\infty}\frac{\lambda^{r-1}}{(r-1)!} = \lambda e^{-\lambda}e^{\lambda} = \lambda.$$

Example. If $P(X = r) = \binom{n}{r}p^r(1-p)^{n-r}$, then $E[X] = np$.

Solution.
$$E[X] = \sum_{r=0}^{n} r\binom{n}{r}p^r(1-p)^{n-r} = np\sum_{r=1}^{n}\binom{n-1}{r-1}p^{r-1}(1-p)^{n-r} = np.$$
A function of a random variable is again a random variable: $f(X)(\omega) = f(X(\omega))$.

Example. If $a$, $b$ and $c$ are constants, then $a + bX$ and $(X - c)^2$ are random variables defined by
$$(a + bX)(\omega) = a + bX(\omega) \qquad\text{and}\qquad (X - c)^2(\omega) = (X(\omega) - c)^2.$$

Theorem. Let $X$ be a random variable. Then:

1. If $X \ge 0$ then $E[X] \ge 0$.
2. If $X \ge 0$ and $E[X] = 0$ then $P(X = 0) = 1$.
3. $E[a + bX] = a + bE[X]$.
4. $E[X + Y] = E[X] + E[Y]$.
5. $E[(X - c)^2]$ is minimised by $c = E[X]$.

Proof. 1. $X \ge 0$ means $X(\omega) \ge 0$ for all $\omega \in \Omega$, so
$$E[X] = \sum_{\omega \in \Omega} p_\omega X(\omega) \ge 0.$$

2. If there exists $\omega \in \Omega$ with $p_\omega > 0$ and $X(\omega) > 0$ then $E[X] > 0$; hence if $X \ge 0$ and $E[X] = 0$, then $P(X = 0) = 1$.

3.
$$E[a + bX] = \sum_{\omega \in \Omega}(a + bX(\omega))p_\omega = a\sum_{\omega \in \Omega} p_\omega + b\sum_{\omega \in \Omega} p_\omega X(\omega) = a + bE[X].$$

4. Trivial.

5. Now
$$E[(X - c)^2] = E[(X - E[X] + E[X] - c)^2]$$
$$= E\left[(X - E[X])^2 + 2(X - E[X])(E[X] - c) + (E[X] - c)^2\right]$$
$$= E[(X - E[X])^2] + 2(E[X] - c)E[X - E[X]] + (E[X] - c)^2$$
$$= E[(X - E[X])^2] + (E[X] - c)^2.$$
This is clearly minimised when $c = E[X]$.
3.2 Variance

For a random variable $X$, the variance is
$$\operatorname{Var} X = E[(X - E[X])^2] = \sigma^2,$$
and the standard deviation is $\sigma = \sqrt{\operatorname{Var} X}$.

(i) $\operatorname{Var}(a + bX) = b^2\operatorname{Var} X$.

Proof.
$$\operatorname{Var}(a + bX) = E[(a + bX - a - bE[X])^2] = b^2 E[(X - E[X])^2] = b^2\operatorname{Var} X.$$

(iii) $\operatorname{Var} X = E[X^2] - E[X]^2$.

Proof.
$$E[(X - E[X])^2] = E[X^2 - 2XE[X] + (E[X])^2] = E[X^2] - 2E[X]E[X] + E[X]^2 = E[X^2] - (E[X])^2.$$
Example. Let $X$ have the geometric distribution $P(X = r) = pq^r$ with $r = 0, 1, 2, \dots$ and $p + q = 1$. Then $E[X] = \frac{q}{p}$ and $\operatorname{Var} X = \frac{q}{p^2}$.

Solution.
$$E[X] = \sum_{r=0}^{\infty} rpq^r = pq\sum_{r=0}^{\infty} rq^{r-1} = pq\sum_{r=0}^{\infty}\frac{d}{dq}(q^r) = pq\,\frac{d}{dq}\left(\frac{1}{1-q}\right) = pq(1-q)^{-2} = \frac{q}{p}.$$
Similarly,
$$E[X(X-1)] = \sum_{r=0}^{\infty} r(r-1)pq^r = pq^2\,\frac{d^2}{dq^2}\left(\frac{1}{1-q}\right) = \frac{2q^2}{p^2},$$
so that
$$\operatorname{Var} X = E[X(X-1)] + E[X] - E[X]^2 = \frac{2q^2}{p^2} + \frac{q}{p} - \frac{q^2}{p^2} = \frac{q}{p^2}.$$
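A short empirical check of these two formulas (my addition): sample the geometric distribution as the number of failures before the first success, then compare sample mean and variance with $q/p$ and $q/p^2$.

```python
import random

p = 0.3
q = 1 - p
samples = []
for _ in range(200_000):
    # Count failures before the first success: geometric on {0, 1, 2, ...}.
    r = 0
    while random.random() > p:
        r += 1
    samples.append(r)

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, q / p)        # expectation q/p
print(var, q / p ** 2)    # variance q/p^2
```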
$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var} X\,\operatorname{Var} Y}}$$
Linear Regression
3.3 Indicator Function

The indicator function of an event $A$ is the random variable
$$I[A](\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A. \end{cases} \tag{3.1}$$

1. $E[I[A]] = P(A)$, since
$$E[I[A]] = \sum_{\omega \in \Omega} p_\omega I[A](\omega) = P(A).$$

2. $I[A^c] = 1 - I[A]$.

3. $I[A \cap B] = I[A]I[B]$.

4. $I[A \cup B] = I[A] + I[B] - I[A]I[B]$. Check: $I[A \cup B](\omega) = 1$ if $\omega \in A$ or $\omega \in B$, and in every case
$$I[A \cup B](\omega) = I[A](\omega) + I[B](\omega) - I[A](\omega)I[B](\omega).$$
WORKS!
Example. $n$ couples are arranged randomly around a table such that males and females alternate. Let $N$ be the number of husbands sitting next to their wives. Calculate $E[N]$ and $\operatorname{Var} N$.

Solution. Write
$$N = \sum_{i=1}^{n} I[A_i], \qquad A_i = \text{the event that couple } i \text{ are together}.$$
Then
$$E[N] = E\left[\sum_{i=1}^{n} I[A_i]\right] = \sum_{i=1}^{n} E[I[A_i]] = \sum_{i=1}^{n}\frac{2}{n} = 2.$$
For the variance,
$$E[N^2] = E\left[\left(\sum_{i=1}^{n} I[A_i]\right)^2\right] = E\left[\sum_{i=1}^{n} I[A_i]^2 + \sum_{i \ne j} I[A_i]I[A_j]\right] = nE[I[A_1]] + n(n-1)E[I[A_1]I[A_2]].$$
Now
$$E[I[A_i]^2] = E[I[A_i]] = \frac{2}{n},$$
and
$$E[I[A_1]I[A_2]] = E[I[A_1 \cap A_2]] = P(A_1 \cap A_2) = P(A_1)P(A_2|A_1) = \frac{2}{n}\left(\frac{1}{n-1}\cdot\frac{1}{n-1} + \frac{n-2}{n-1}\cdot\frac{2}{n-1}\right).$$
Thus
$$\operatorname{Var} N = E[N^2] - E[N]^2 = \frac{2}{n-1}\bigl(1 + 2(n-2)\bigr) + 2 - 4 = \frac{2(n-2)}{n-1}.$$
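A Monte Carlo check of both answers (my sketch, not from the notes). The seating model below is my assumption, consistent with "males and females alternate": men occupy the even seats and women the odd seats of a circle of $2n$ seats, each in uniformly random order.

```python
import random

def stats_N(n, trials=50_000):
    """Monte Carlo mean and variance of N = number of couples together."""
    total = total_sq = 0
    for _ in range(trials):
        men = list(range(n))
        women = list(range(n))
        random.shuffle(men)     # men[k] sits in even seat 2k
        random.shuffle(women)   # women[k] sits in odd seat 2k+1
        count = 0
        for k in range(n):
            # Neighbours of even seat 2k are odd seats 2k-1 and 2k+1,
            # occupied by women[(k-1) % n] and women[k].
            if men[k] == women[k] or men[k] == women[(k - 1) % n]:
                count += 1
        total += count
        total_sq += count * count
    mean = total / trials
    return mean, total_sq / trials - mean * mean

n = 10
mean, var = stats_N(n)
print(mean, 2)                        # E[N] = 2
print(var, 2 * (n - 2) / (n - 1))     # Var N = 2(n-2)/(n-1)
```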
3.4 Inclusion-Exclusion Formula

Note that
$$\left(\bigcup_{i=1}^{N} A_i\right)^c = \bigcap_{i=1}^{N} A_i^c,$$
so
$$I\left[\bigcup_{i=1}^{N} A_i\right] = 1 - I\left[\bigcap_{i=1}^{N} A_i^c\right] = 1 - \prod_{i=1}^{N} I[A_i^c] = 1 - \prod_{i=1}^{N}(1 - I[A_i])$$
$$= \sum_{i=1}^{N} I[A_i] - \sum_{i_1 < i_2} I[A_{i_1}]I[A_{i_2}] + \dots + (-1)^{j+1}\sum_{i_1 < i_2 < \dots < i_j} I[A_{i_1}]I[A_{i_2}]\cdots I[A_{i_j}] + \dots$$
Take expectations:
$$P\left(\bigcup_{i=1}^{N} A_i\right) = \sum_{i=1}^{N} P(A_i) - \sum_{i_1 < i_2} P(A_{i_1} \cap A_{i_2}) + \dots + (-1)^{j+1}\sum_{i_1 < \dots < i_j} P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_j}) + \dots$$
3.5 Independence

Definition 3.5. Discrete random variables $X_1, \dots, X_n$ are independent if and only if for any $x_1, \dots, x_n$:
$$P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n) = \prod_{i=1}^{n} P(X_i = x_i).$$

Theorem 3.6. If $X_1, \dots, X_n$ are independent and $f_1, \dots, f_n$ are functions $\mathbb{R} \to \mathbb{R}$, then $f_1(X_1), \dots, f_n(X_n)$ are independent.

Proof.
$$P(f_1(X_1) = y_1, \dots, f_n(X_n) = y_n) = \sum_{\substack{x_1 : f_1(x_1) = y_1,\ \dots \\ x_n : f_n(x_n) = y_n}} P(X_1 = x_1, \dots, X_n = x_n)$$
$$= \prod_{i=1}^{n}\,\sum_{x_i : f_i(x_i) = y_i} P(X_i = x_i) = \prod_{i=1}^{n} P(f_i(X_i) = y_i).$$

NOTE that $E\left[\sum_i X_i\right] = \sum_i E[X_i]$ without requiring independence.

Theorem 3.7. If $X_1, \dots, X_n$ are independent random variables and $f_1, \dots, f_n$ are functions $\mathbb{R} \to \mathbb{R}$, then:
$$E\left[\prod_{i=1}^{n} f_i(X_i)\right] = \prod_{i=1}^{n} E[f_i(X_i)].$$
Theorem. If $X_1, \dots, X_n$ are independent random variables, then
$$\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n}\operatorname{Var} X_i.$$

Proof.
$$\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = E\left[\left(\sum_{i=1}^{n} X_i\right)^2\right] - \left(E\left[\sum_{i=1}^{n} X_i\right]\right)^2$$
$$= E\left[\sum_i X_i^2 + \sum_{i \ne j} X_iX_j\right] - \left(E\left[\sum_{i=1}^{n} X_i\right]\right)^2$$
$$= \sum_i E[X_i^2] + \sum_{i \ne j} E[X_iX_j] - \sum_i E[X_i]^2 - \sum_{i \ne j} E[X_i]E[X_j]$$
$$= \sum_i\left(E[X_i^2] - E[X_i]^2\right) = \sum_i \operatorname{Var} X_i,$$
using $E[X_iX_j] = E[X_i]E[X_j]$ for $i \ne j$, by independence.

Corollary. If $X_1, \dots, X_n$ are independent and identically distributed, then
$$\operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\operatorname{Var} X_1.$$

Proof.
$$\operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var} X_i = \frac{1}{n}\operatorname{Var} X_1.$$
Example. Two objects are weighed on a scale whose error has mean 0 and variance $\sigma^2$, so a single weighing of each gives measurements $A$ and $B$ with
$$E[A] = a, \quad \operatorname{Var} A = \sigma^2, \qquad E[B] = b, \quad \operatorname{Var} B = \sigma^2.$$
Instead, weigh the two objects together, then one against the other: $X = A + B$ and $Y = A - B$, with
$$E[X] = a + b, \quad \operatorname{Var} X = \sigma^2, \qquad E[Y] = a - b, \quad \operatorname{Var} Y = \sigma^2.$$
Then
$$E\left[\frac{X + Y}{2}\right] = a, \qquad \operatorname{Var}\left(\frac{X + Y}{2}\right) = \frac{1}{2}\sigma^2,$$
$$E\left[\frac{X - Y}{2}\right] = b, \qquad \operatorname{Var}\left(\frac{X - Y}{2}\right) = \frac{1}{2}\sigma^2.$$
So this is better: the same two weighings estimate both weights with half the variance.
Example. Non-standard dice. You choose one die, then I choose one. Around this cycle each die beats the next, so whoever chooses second can always pick a die that beats the first.
Chapter 4

Inequalities
A function $f : (a, b) \to \mathbb{R}$ is convex if
$$f(px + (1-p)y) \le pf(x) + (1-p)f(y)$$
for all $x, y \in (a, b)$ and $p \in (0, 1)$; it is concave if the reverse inequality holds, and strictly convex or strictly concave if the inequality is strict.

Example. $f(x) = -\log x$ has
$$f'(x) = -\frac{1}{x}, \qquad f''(x) = \frac{1}{x^2} > 0,$$
so $f$ is strictly convex on $(0, \infty)$; equivalently, $\log x$ is strictly concave.

Example. $f(x) = x^3$ is strictly convex on $(0, \infty)$ but not on $(-\infty, \infty)$.
Theorem 4.1 (Jensen's Inequality). Let $f : (a, b) \to \mathbb{R}$ be a convex function. Then
$$\sum_{i=1}^{n} p_i f(x_i) \ge f\left(\sum_{i=1}^{n} p_i x_i\right)$$
for all $x_1, \dots, x_n \in (a, b)$ and $p_1, \dots, p_n \in (0, 1)$ with $\sum p_i = 1$. Furthermore, if $f$ is strictly convex, then equality holds if and only if all the $x_i$ are equal.

In probabilistic form: for a random variable $X$ taking values in $(a, b)$,
$$E[f(X)] \ge f(E[X]).$$
Proof. The proof is by induction on $n$; the case $n = 2$ is just the definition of convexity. Suppose the result holds for $n - 1$, and set $p_i' = \frac{p_i}{1 - p_1}$ for $i = 2, \dots, n$, so that $\sum_{i=2}^{n} p_i' = 1$. Then
$$\sum_{i=1}^{n} p_i f(x_i) = p_1 f(x_1) + (1 - p_1)\sum_{i=2}^{n} p_i' f(x_i)$$
$$\ge p_1 f(x_1) + (1 - p_1)f\left(\sum_{i=2}^{n} p_i' x_i\right)$$
$$\ge f\left(p_1 x_1 + (1 - p_1)\sum_{i=2}^{n} p_i' x_i\right) = f\left(\sum_{i=1}^{n} p_i x_i\right).$$
If $f$ is strictly convex, $n \ge 3$ and not all the $x_i$ are equal, then we may assume not all of $x_2, \dots, x_n$ are equal. But then
$$(1 - p_1)\sum_{i=2}^{n} p_i' f(x_i) > (1 - p_1)f\left(\sum_{i=2}^{n} p_i' x_i\right),$$
so the inequality is strict.
Corollary (AM/GM Inequality). For positive real numbers $x_1, \dots, x_n$,
$$\left(\prod_{i=1}^{n} x_i\right)^{1/n} \le \frac{1}{n}\sum_{i=1}^{n} x_i.$$
Equality holds if and only if $x_1 = x_2 = \dots = x_n$.

Proof. Let $X$ take the values $x_1, \dots, x_n$ with
$$P(X = x_i) = \frac{1}{n},$$
and recall that $f(x) = -\log x$ is a convex function on $(0, \infty)$. So
$$E[f(X)] \ge f(E[X]) \qquad\text{(Jensen's Inequality)},$$
that is,
$$-E[\log X] \ge -\log E[X], \tag{1}$$
$$-\frac{1}{n}\sum_i \log x_i \ge -\log\left(\frac{1}{n}\sum_i x_i\right).$$
Therefore, exponentiating,
$$\left(\prod_{i=1}^{n} x_i\right)^{1/n} \le \frac{1}{n}\sum_{i=1}^{n} x_i. \tag{2}$$
For strictness: since $f$ is strictly convex, equality holds in [1], and hence in [2], if and only if $x_1 = x_2 = \dots = x_n$.
Here is a second proof of the probabilistic form. If $f : (a, b) \to \mathbb{R}$ is a convex function, then it can be shown that at each point $y \in (a, b)$ there exists a linear function $\alpha_y + \beta_y x$ such that
$$f(x) \ge \alpha_y + \beta_y x, \quad x \in (a, b), \qquad f(y) = \alpha_y + \beta_y y.$$
If $f$ is differentiable at $y$, then the linear function is the tangent $f(y) + (x - y)f'(y)$.

Let $y = E[X]$, $\alpha = \alpha_y$ and $\beta = \beta_y$. Then
$$f(E[X]) = \alpha + \beta E[X].$$
So for any random variable $X$ taking values in $(a, b)$,
$$E[f(X)] \ge E[\alpha + \beta X] = \alpha + \beta E[X] = f(E[X]).$$
4.2 Cauchy-Schwarz Inequality

Theorem. For any random variables $X$ and $Y$,
$$E[XY]^2 \le E[X^2]\,E[Y^2].$$

Proof. For $a, b \in \mathbb{R}$, let $Z = aX - bY$. Then
$$0 \le E[Z^2] = a^2E[X^2] - 2abE[XY] + b^2E[Y^2].$$
This is a quadratic in $a$ with at most one real root, and therefore has discriminant $\le 0$. Taking $b \ne 0$,
$$E[XY]^2 \le E[X^2]\,E[Y^2].$$

Corollary.
$$|\operatorname{Corr}(X, Y)| \le 1.$$
4.3 Markov's Inequality

Theorem (Markov's Inequality).
$$P(|X| \ge a) \le \frac{E[|X|]}{a} \quad\text{for any } a > 0.$$

Proof. Let $A = \{|X| \ge a\}$. Then $|X| \ge aI[A]$. Take expectations:
$$E[|X|] \ge aP(A), \qquad\text{i.e.}\qquad E[|X|] \ge aP(|X| \ge a).$$

4.4 Chebyshev's Inequality

Theorem (Chebyshev's Inequality). For any $\epsilon > 0$,
$$P(|X| \ge \epsilon) \le \frac{E[X^2]}{\epsilon^2}.$$

Proof. Note that
$$I[|X| \ge \epsilon] \le \frac{X^2}{\epsilon^2},$$
since the right-hand side is non-negative and is at least 1 whenever $|X| \ge \epsilon$. Take expectations:
$$P(|X| \ge \epsilon) \le \frac{E[X^2]}{\epsilon^2}.$$
Note

1. The result is "distribution free": no assumption is made about the distribution of $X$ (other than $E[X^2] < \infty$).

2. It is the best possible inequality, in the following sense. Take $0 < c \le \epsilon^2$ and let
$$X = \begin{cases} +\epsilon & \text{with probability } \frac{c}{2\epsilon^2} \\ -\epsilon & \text{with probability } \frac{c}{2\epsilon^2} \\ 0 & \text{with probability } 1 - \frac{c}{\epsilon^2}. \end{cases}$$
Then $P(|X| \ge \epsilon) = \frac{c}{\epsilon^2}$ and $E[X^2] = c$, so
$$P(|X| \ge \epsilon) = \frac{c}{\epsilon^2} = \frac{E[X^2]}{\epsilon^2}.$$

3. If $\mu = E[X]$, then applying the inequality to $X - \mu$ gives
$$P(|X - \mu| \ge \epsilon) \le \frac{\operatorname{Var} X}{\epsilon^2}.$$
Often the most useful form.
4.5 Law of Large Numbers

Theorem (Weak law of large numbers). Let $X_1, X_2, \dots$ be independent identically distributed random variables with mean $\mu$ and variance $\sigma^2 < \infty$, and let
$$S_n = \sum_{i=1}^{n} X_i.$$
Then for all $\epsilon > 0$,
$$P\left(\left|\frac{S_n}{n} - \mu\right| \ge \epsilon\right) \to 0 \quad\text{as } n \to \infty.$$

Proof. By Chebyshev's inequality applied to $\frac{S_n}{n} - \mu$,
$$P\left(\left|\frac{S_n}{n} - \mu\right| \ge \epsilon\right) \le \frac{E[(S_n - n\mu)^2]}{n^2\epsilon^2} = \frac{\operatorname{Var} S_n}{n^2\epsilon^2} \qquad\text{since } E[S_n] = n\mu$$
$$= \frac{n\sigma^2}{n^2\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \to 0 \quad\text{as } n \to \infty.$$
Example. $A_1, A_2, \dots$ are independent events, each with probability $p$. Let $X_i = I[A_i]$. Then
$$\frac{S_n}{n} = \frac{\text{number of times } A \text{ occurs}}{\text{number of trials}},$$
and $\mu = E[I[A_i]] = P(A_i) = p$. The theorem states that
$$P\left(\left|\frac{S_n}{n} - p\right| \ge \epsilon\right) \to 0 \quad\text{as } n \to \infty,$$
which recovers the intuitive definition of probability.
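A minimal numerical illustration of this (my addition): the empirical frequency $S_n/n$ of an event of probability $p$ settles near $p$ as $n$ grows.

```python
import random

# Empirical frequency of an event with probability p over n trials:
# S_n / n should settle near p as n grows.
p = 0.3
for n in [100, 10_000, 1_000_000]:
    s = sum(1 for _ in range(n) if random.random() < p)
    print(n, s / n)
```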
Example. A random sample of size $n$ is a sequence $X_1, X_2, \dots, X_n$ of independent identically distributed random variables ("$n$ observations").
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \quad\text{is called the SAMPLE MEAN.}$$
The theorem states that, provided the variance of $X_i$ is finite, the probability that the sample mean differs from the mean of the distribution by more than $\epsilon$ approaches 0 as $n \to \infty$.

We have shown the weak law of large numbers. Why "weak"? There exists a strong form of the law of large numbers:
$$P\left(\frac{S_n}{n} \to \mu \text{ as } n \to \infty\right) = 1.$$
This is NOT the same as the weak form. What does this mean? Each $\omega \in \Omega$ determines
$$\frac{S_n(\omega)}{n}, \quad n = 1, 2, \dots,$$
as a sequence of real numbers. Hence it either tends to $\mu$ or it doesn't, and the assertion is
$$P\left(\left\{\omega : \frac{S_n(\omega)}{n} \to \mu \text{ as } n \to \infty\right\}\right) = 1.$$
Chapter 5

Generating Functions

In this chapter, assume that $X$ is a random variable taking values in the range $0, 1, 2, \dots$. Let $p_r = P(X = r)$, $r = 0, 1, 2, \dots$.

Definition 5.1. The Probability Generating Function (p.g.f.) of the random variable $X$, or of the distribution $p_r$, $r = 0, 1, 2, \dots$, is
$$p(z) = E[z^X] = \sum_{r=0}^{\infty} z^r P(X = r) = \sum_{r=0}^{\infty} p_r z^r.$$
This $p(z)$ is a polynomial or a power series. If a power series, then it is convergent for $|z| \le 1$ by comparison with a geometric series:
$$|p(z)| \le \sum_r p_r|z|^r \le \sum_r p_r = 1.$$

Example. For a fair die, $p_r = \frac{1}{6}$, $r = 1, \dots, 6$, and
$$p(z) = E[z^X] = \frac{1}{6}\left(z + z^2 + \dots + z^6\right) = \frac{z}{6}\,\frac{1 - z^6}{1 - z}.$$

Differentiating term by term,
$$p'(z) = p_1 + 2p_2z + 3p_3z^2 + \dots,$$
so that $\lim_{z \to 1} p'(z) = E[X]$. Recall for the die example
$$p(z) = \frac{z}{6}\,\frac{1 - z^6}{1 - z}.$$
Theorem 5.3.
$$E[X(X-1)] = \lim_{z \to 1} p''(z).$$

Proof.
$$p''(z) = \sum_{r=2}^{\infty} r(r-1)p_r z^{r-2}.$$
The proof is now the same as for Abel's Lemma.

Theorem 5.4. Suppose that $X_1, X_2, \dots, X_n$ are independent random variables with p.g.f.'s $p_1(z), p_2(z), \dots, p_n(z)$. Then the p.g.f. of
$$X_1 + X_2 + \dots + X_n$$
is
$$p_1(z)p_2(z)\cdots p_n(z).$$
Example. For the Poisson distribution with parameter $\lambda$,
$$E[z^X] = \sum_{r=0}^{\infty} z^r e^{-\lambda}\frac{\lambda^r}{r!} = e^{-\lambda}e^{\lambda z} = e^{-\lambda(1-z)}.$$
Let's calculate the variance of $X$:
$$p'(z) = \lambda e^{-\lambda(1-z)}, \qquad p''(z) = \lambda^2 e^{-\lambda(1-z)}.$$
Then
$$E[X] = \lim_{z \to 1} p'(z) = p'(1) = \lambda \quad(\text{since } p'(z) \text{ is continuous at } z = 1),$$
$$E[X(X-1)] = p''(1) = \lambda^2.$$
So
$$\operatorname{Var} X = E[X^2] - E[X]^2 = E[X(X-1)] + E[X] - E[X]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.$$
Example. Suppose that $X$ has a Poisson distribution with parameter $\lambda$ and $Y$ has a Poisson distribution with parameter $\mu$. If $X$ and $Y$ are independent, then
$$E[z^{X+Y}] = E[z^X]\,E[z^Y] = e^{-\lambda(1-z)}e^{-\mu(1-z)} = e^{-(\lambda+\mu)(1-z)}.$$
But this is the p.g.f. of a Poisson random variable with parameter $\lambda + \mu$. By uniqueness (the first theorem of the p.g.f.) this must be the distribution of $X + Y$.
Example. For the binomial distribution $P(X = r) = \binom{n}{r}p^r(1-p)^{n-r}$, $r = 0, 1, \dots, n$:
$$E[z^X] = \sum_{r=0}^{n}\binom{n}{r}(pz)^r(1-p)^{n-r} = (pz + 1 - p)^n.$$
This is the $n$-th power of $pz + 1 - p$, the p.g.f. of a random variable $Y_i$ with
$$P(Y_i = 1) = p, \qquad P(Y_i = 0) = 1 - p,$$
so $X = Y_1 + \dots + Y_n$, a sum of independent Bernoulli trials. Note: if the p.g.f. factorizes, look to see if the random variable can be written as a sum.
5.1 Combinatorial Applications

Generating functions are also useful for non-random sequences. The Fibonacci numbers satisfy
$$f_n = f_{n-1} + f_{n-2}, \qquad f_0 = f_1 = 1.$$
Let
$$F(z) = \sum_{n=0}^{\infty} f_n z^n.$$
Multiplying the recurrence $f_nz^n = f_{n-1}z^n + f_{n-2}z^n$ by $z^n$ and summing over $n \ge 2$,
$$\sum_{n=2}^{\infty} f_n z^n = \sum_{n=2}^{\infty} f_{n-1}z^n + \sum_{n=2}^{\infty} f_{n-2}z^n,$$
$$F(z) - f_0 - zf_1 = z(F(z) - f_0) + z^2F(z),$$
$$F(z)(1 - z - z^2) = f_0(1 - z) + zf_1 = 1 - z + z = 1.$$
Since $f_0 = f_1 = 1$, then $F(z) = \frac{1}{1 - z - z^2}$. Let
$$\alpha_1 = \frac{1 + \sqrt{5}}{2}, \qquad \alpha_2 = \frac{1 - \sqrt{5}}{2},$$
so that $1 - z - z^2 = (1 - \alpha_1 z)(1 - \alpha_2 z)$. Then, by partial fractions,
$$F(z) = \frac{1}{(1 - \alpha_1 z)(1 - \alpha_2 z)} = \frac{1}{\alpha_1 - \alpha_2}\left(\frac{\alpha_1}{1 - \alpha_1 z} - \frac{\alpha_2}{1 - \alpha_2 z}\right) = \frac{1}{\alpha_1 - \alpha_2}\left(\alpha_1\sum_{n=0}^{\infty}\alpha_1^n z^n - \alpha_2\sum_{n=0}^{\infty}\alpha_2^n z^n\right).$$
The coefficient of $z^n$, that is $f_n$, is
$$f_n = \frac{\alpha_1^{n+1} - \alpha_2^{n+1}}{\alpha_1 - \alpha_2}.$$
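A quick check of this closed form against the recurrence (my addition, not from the notes):

```python
import math

def fib_closed(n):
    """Closed form f_n = (a1^(n+1) - a2^(n+1)) / (a1 - a2)."""
    a1 = (1 + math.sqrt(5)) / 2
    a2 = (1 - math.sqrt(5)) / 2
    return (a1 ** (n + 1) - a2 ** (n + 1)) / (a1 - a2)

f = [1, 1]
for n in range(2, 20):
    f.append(f[-1] + f[-2])   # the recurrence f_n = f_{n-1} + f_{n-2}
for n in range(20):
    assert round(fib_closed(n)) == f[n]
print("closed form matches the recurrence for n < 20")
```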
5.2 Conditional Expectation

For jointly distributed discrete random variables $X$ and $Y$, the conditional distribution of $X$ given $Y = y$ is
$$P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)},$$
and the conditional expectation $E[X|Y]$ is the random variable defined by
$$E[X|Y](\omega) = E[X \mid Y = Y(\omega)].$$
Thus $E[X|Y] : \Omega \to \mathbb{R}$.

Example. Let $X_1, X_2, \dots, X_n$ be independent identically distributed random variables with $P(X_1 = 1) = p$ and $P(X_1 = 0) = 1 - p$. Let
$$Y = X_1 + X_2 + \dots + X_n.$$
Then
$$E[X_1 \mid Y = r] = 0 \cdot P(X_1 = 0 \mid Y = r) + 1 \cdot P(X_1 = 1 \mid Y = r) = \frac{r}{n},$$
so
$$E[X_1 \mid Y = Y(\omega)] = \frac{1}{n}Y(\omega).$$
Therefore $E[X_1|Y] = \frac{1}{n}Y$. Note this is a random variable: a function of $Y$.
Let $N$ be a random variable taking values in $\{0, 1, 2, \dots\}$ with p.g.f. $h(z)$, independent of the i.i.d. sequence $X_1, X_2, \dots$ with common p.g.f. $p(z)$. Then the p.g.f. of the random sum $X_1 + \dots + X_N$ is $h(p(z))$.

Exercise. Calculate $\frac{d^2}{dz^2}h(p(z))$ and hence
$$\operatorname{Var}(X_1 + \dots + X_N)$$
in terms of $\operatorname{Var} N$ and $\operatorname{Var} X_1$.
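Working the exercise through gives $\operatorname{Var}(X_1 + \dots + X_N) = E[N]\operatorname{Var} X_1 + \operatorname{Var} N\,(E[X_1])^2$ (this is the standard answer, stated here as an assumption since the notes leave it as an exercise). A simulation consistent with it, taking $N$ Poisson and $X_i$ Bernoulli as illustrative choices of mine:

```python
import math
import random

def poisson(lam):
    # Knuth's method for sampling a Poisson random variable.
    limit = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

lam, p, trials = 4.0, 0.3, 100_000
sums = []
for _ in range(trials):
    n = poisson(lam)
    sums.append(sum(1 for _ in range(n) if random.random() < p))

m = sum(sums) / trials
v = sum((s - m) ** 2 for s in sums) / trials
# Here E[N] = Var N = lam, E[X1] = p, Var X1 = p(1 - p).
print(v, lam * p * (1 - p) + lam * p * p)
```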
5.4 Branching Processes

Let $X_n$ be the size of the $n$-th generation of a population in which each individual independently leaves a random number of offspring in the next generation, with $X_0 = 1$. Let $f_k$ be the probability of $k$ offspring and
$$F(z) = \sum_{k=0}^{\infty} f_k z^k = E\left[z^{Y_i^n}\right],$$
where $Y_i^n$ denotes the number of offspring of the $i$-th individual of generation $n$. Let
$$F_n(z) = E\left[z^{X_n}\right].$$
Then $F_1(z) = F(z)$, the probability generating function of the offspring distribution.

Theorem 5.7.
$$F_{n+1}(z) = F_n(F(z)) = F(F(\dots(F(z))\dots)),$$
so $F_n(z)$ is an $n$-fold iterate of $F$.
Proof.
$$F_{n+1}(z) = E\left[z^{X_{n+1}}\right] = E\left[E\left[z^{X_{n+1}} \mid X_n\right]\right]$$
$$= \sum_{k=0}^{\infty} P(X_n = k)\,E\left[z^{X_{n+1}} \mid X_n = k\right]$$
$$= \sum_{k=0}^{\infty} P(X_n = k)\,E\left[z^{Y_1^n + Y_2^n + \dots + Y_k^n}\right]$$
$$= \sum_{k=0}^{\infty} P(X_n = k)\,E\left[z^{Y_1^n}\right]\cdots E\left[z^{Y_k^n}\right]$$
$$= \sum_{k=0}^{\infty} P(X_n = k)\,(F(z))^k$$
$$= F_n(F(z)).$$
Suppose
$$m = \sum_{k=0}^{\infty} kf_k < \infty \qquad\text{and}\qquad \sigma^2 = \sum_{k=0}^{\infty}(k - m)^2 f_k < \infty$$
are the mean and variance of the offspring distribution. Then $E[X_n] = m^n$ and
$$\operatorname{Var} X_n = \begin{cases}\sigma^2 m^{n-1}\,\dfrac{m^n - 1}{m - 1} & m \ne 1 \\ n\sigma^2 & m = 1.\end{cases}$$

Proof.
$$E[X_n] = E[E[X_n \mid X_{n-1}]] = E[mX_{n-1}] = mE[X_{n-1}] = m^n \quad\text{by induction}.$$
Also
$$E[(X_n - mX_{n-1})^2] = E\left[E[(X_n - mX_{n-1})^2 \mid X_{n-1}]\right] = E[\operatorname{Var}(X_n \mid X_{n-1})] = E[\sigma^2 X_{n-1}] = \sigma^2 m^{n-1}.$$
Thus
$$E[X_n^2] - 2mE[X_nX_{n-1}] + m^2E[X_{n-1}^2] = \sigma^2 m^{n-1},$$
and since $E[X_nX_{n-1}] = E[X_{n-1}E[X_n \mid X_{n-1}]] = mE[X_{n-1}^2]$, this gives a recurrence for $E[X_n^2]$ which yields the stated variance.
Now consider extinction. Let
$$A_n = \{X_n = 0\} = \{\text{extinction occurs by generation } n\},$$
and let
$$A = \bigcup_{n=1}^{\infty} A_n = \text{the event that extinction ever occurs}.$$
Can we calculate $P(A)$ from $P(A_n)$? More generally, let $A_n$ be an increasing sequence
$$A_1 \subseteq A_2 \subseteq \dots$$
and define
$$A = \lim_{n\to\infty} A_n = \bigcup_{n=1}^{\infty} A_n.$$
Define $B_n$ for $n \ge 1$ by
$$B_1 = A_1, \qquad B_n = A_n \cap \left(\bigcup_{i=1}^{n-1} A_i\right)^c = A_n \cap A_{n-1}^c.$$
The $B_n$ are disjoint with $\bigcup_{i=1}^{n} B_i = A_n$, so by the third axiom
$$P(A) = \sum_{n=1}^{\infty} P(B_n) = \lim_{n\to\infty} P(A_n).$$
Thus, writing $q$ for the probability of ultimate extinction, and noting that $P(A_n) = P(X_n = 0) = F_n(0)$,
$$q = \lim_{n\to\infty} F_n(0).$$
Also
$$F(q) = F\left(\lim_{n\to\infty} F_n(0)\right) = \lim_{n\to\infty} F(F_n(0)) \qquad\text{since } F \text{ is continuous}$$
$$= \lim_{n\to\infty} F_{n+1}(0) = q.$$
Thus $F(q) = q$.
In fact $q$ is the smallest non-negative root of $F(z) = z$. Note that $F'(z) \ge 0$ and
$$F''(z) = \sum_{j=2}^{\infty} j(j-1)f_j z^{j-2} \ge 0 \quad\text{in } 0 \le z \le 1,$$
so $F$ is increasing and convex on $[0, 1]$ (strictly, since $f_0 + f_1 < 1$); also $F(0) = f_0 > 0$. Thus if $m \le 1$, there does not exist a $q \in (0, 1)$ with $F(q) = q$, and extinction is certain. If $m > 1$, let $\alpha$ be the smallest positive root of $F(z) = z$. Then
$$F(0) \le F(\alpha) = \alpha,$$
$$F(F(0)) \le F(\alpha) = \alpha,$$
$$\vdots$$
$$F_n(0) \le \alpha \quad\text{for all } n \ge 1.$$
Hence
$$q = \lim_{n\to\infty} F_n(0) \le \alpha, \qquad\text{and } q = \alpha \text{ since } q \text{ is a root of } F(z) = z.$$
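Because $q = \lim_n F_n(0)$, the extinction probability can be computed by simple fixed-point iteration. A minimal sketch (my addition), using a Poisson offspring distribution as an illustrative assumption, for which $F(z) = e^{-\lambda(1-z)}$ and $m = \lambda$:

```python
import math

def extinction_prob(F, tol=1e-12):
    """Iterate q <- F(q) from q = 0; converges to the extinction
    probability, the smallest non-negative root of F(z) = z."""
    q = 0.0
    while True:
        nxt = F(q)
        if abs(nxt - q) < tol:
            return nxt
        q = nxt

# Offspring distribution Poisson(lam): F(z) = exp(-lam * (1 - z)), mean lam.
for lam in [0.8, 1.0, 1.5, 2.0]:
    print(lam, extinction_prob(lambda z: math.exp(-lam * (1 - z))))
```

Note how the output shows $q = 1$ for $\lambda \le 1$ and $q < 1$ for $\lambda > 1$, in agreement with the criterion above.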
5.5 Random Walks

Let $X_1, X_2, \dots$ be independent identically distributed random variables, and
$$S_n = S_0 + X_1 + X_2 + \dots + X_n, \qquad\text{where usually } S_0 = 0.$$
We shall assume
$$X_n = \begin{cases} +1 & \text{with probability } p \\ -1 & \text{with probability } q. \end{cases} \tag{5.2}$$
This is a simple random walk. If $p = q = \frac{1}{2}$ then the random walk is called symmetric.

Example (Gambler's Ruin). You have an initial fortune of $A$ and I have an initial fortune of $B$. We toss coins repeatedly: I win with probability $p$ and you win with probability $q$. What is the probability that I bankrupt you before you bankrupt me?
Set $a = A + B$. Let $p_z$ be the probability that the random walk hits $a$ before it hits 0, starting from $z$, and let $q_z$ be the probability that the random walk hits 0 before it hits $a$, starting from $z$. After the first step the gambler's fortune is either $z + 1$ or $z - 1$, with probability $p$ and $q$ respectively. From the law of total probability,
$$p_z = qp_{z-1} + pp_{z+1}, \qquad 0 < z < a.$$
Also $p_0 = 0$ and $p_a = 1$. We must solve the auxiliary equation $pt^2 - t + q = 0$:
$$t = \frac{1 \pm \sqrt{1 - 4pq}}{2p} = \frac{1 \pm (1 - 2p)}{2p} = 1 \ \text{or}\ \frac{q}{p}.$$
The general solution for $p \ne q$ is
$$p_z = A + B\left(\frac{q}{p}\right)^z,$$
and the boundary conditions give $A + B = 0$, $A + B\left(\frac{q}{p}\right)^a = 1$, and so
$$p_z = \frac{1 - \left(\frac{q}{p}\right)^z}{1 - \left(\frac{q}{p}\right)^a}.$$
If $p = q$, the general solution is $A + Bz$, and
$$p_z = \frac{z}{a}.$$
To calculate $q_z$, observe that this is the same problem with $p, q, z$ replaced by $q, p, a - z$ respectively. Thus
$$q_z = \frac{\left(\frac{q}{p}\right)^a - \left(\frac{q}{p}\right)^z}{\left(\frac{q}{p}\right)^a - 1} \qquad\text{if } p \ne q,$$
or
$$q_z = \frac{a - z}{a} \qquad\text{if } p = q.$$
Thus $q_z + p_z = 1$ and so, as we expected, the game ends with probability one.

What happens as $a \to \infty$? The events
$$\{\text{path hits 0 before it hits } a\}, \qquad a = z + 1, z + 2, \dots,$$
are increasing, with union $\{\text{path hits 0 ever}\}$. Therefore
$$P(\text{hits 0 ever}) = \lim_{a\to\infty} P(\text{hits 0 before } a) = \lim_{a\to\infty} q_z = \begin{cases}\left(\frac{q}{p}\right)^z & p > q \\ 1 & p \le q.\end{cases}$$
Let $G$ be the ultimate gain or loss:
$$G = \begin{cases} a - z & \text{with probability } p_z \\ -z & \text{with probability } q_z. \end{cases} \tag{5.3}$$
$$E[G] = \begin{cases} ap_z - z & \text{if } p \ne q \\ 0 & \text{if } p = q. \end{cases} \tag{5.4}$$
A fair game remains fair: if the coin is fair, then games based on it have expected reward 0.
Duration of a Game. Let $D_z$ be the expected time until the random walk hits 0 or $a$, starting from $z$. Is $D_z$ finite? $D_z$ is bounded above by a multiple of the mean of a geometric random variable (the number of windows of size $a$ before a window with all $+1$'s or all $-1$'s). Hence $D_z$ is finite. Consider the first step:
$$E[\text{duration}] = E[E[\text{duration} \mid \text{first step}]]$$
$$= p\,(E[\text{duration} \mid \text{first step up}]) + q\,(E[\text{duration} \mid \text{first step down}])$$
$$= p(1 + D_{z+1}) + q(1 + D_{z-1}),$$
so
$$D_z = 1 + pD_{z+1} + qD_{z-1}.$$
The equation holds for $0 < z < a$ with $D_0 = D_a = 0$. Let's try a particular solution $D_z = Cz$:
$$Cz = Cp(z + 1) + Cq(z - 1) + 1 \implies C = \frac{1}{q - p} \quad\text{for } p \ne q.$$
The homogeneous equation has auxiliary equation $pt^2 - t + q = 0$, with roots $t_1 = 1$ and $t_2 = \frac{q}{p}$. The general solution for $p \ne q$ is
$$D_z = A + B\left(\frac{q}{p}\right)^z + \frac{z}{q - p}.$$
Substitute $z = 0, a$ to get $A$ and $B$:
$$D_z = \frac{z}{q - p} - \frac{a}{q - p}\cdot\frac{1 - \left(\frac{q}{p}\right)^z}{1 - \left(\frac{q}{p}\right)^a}, \qquad p \ne q.$$
If $p = q$ then a particular solution is $-z^2$, giving the general solution
$$D_z = -z^2 + A + Bz.$$
Substituting the boundary conditions gives
$$D_z = z(a - z), \qquad p = q.$$
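A short simulation of the symmetric case (my addition), comparing the empirical mean duration with $z(a - z)$:

```python
import random

def mean_duration(z, a, trials=20_000):
    """Average steps for a symmetric walk from z to reach 0 or a."""
    total = 0
    for _ in range(trials):
        pos, steps = z, 0
        while 0 < pos < a:
            pos += random.choice((1, -1))
            steps += 1
        total += steps
    return total / trials

z, a = 4, 10
print(mean_duration(z, a), z * (a - z))  # simulation vs. z(a - z)
```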
Example (generating function for the duration). Let $\phi(s) = E[s^T]$, where $T$ is the time to absorption at 0 starting from initial capital 1. Conditioning on the first step, $\phi$ must satisfy
$$\phi(s) = ps(\phi(s))^2 + qs.$$
This quadratic has two roots,
$$\phi_{1,2}(s) = \frac{1 \pm \sqrt{1 - 4pqs^2}}{2ps}.$$
The same method gives the generating function for absorption probabilities at the other barrier. The generating function for the duration of the game is the sum of these two generating functions.
Chapter 6

Continuous Random Variables

In this chapter we drop the assumption that $\Omega$ is finite or countable. Assume we are given a probability $P$ defined on (a suitable collection of) subsets of $\Omega$.

For example, spin a pointer, and let $\omega \in \Omega$ give the position at which it stops, with $\Omega = \{\omega : 0 \le \omega < 2\pi\}$. Let
$$P(\omega \in [0, \theta]) = \frac{\theta}{2\pi}, \qquad \theta \in [0, 2\pi).$$

Definition 6.1. A continuous random variable $X$ is a function $X : \Omega \to \mathbb{R}$ for which
$$P(a \le X(\omega) \le b) = \int_a^b f(x)\,dx,$$
where $f(x)$ is a function satisfying

1. $f(x) \ge 0$,
2. $\int_{-\infty}^{+\infty} f(x)\,dx = 1$.

The function $f$ is called the Probability Density Function.

For example, if $X(\omega) = \omega$ is the position of the pointer, then $X$ is a continuous random variable with p.d.f.
$$f(x) = \begin{cases}\frac{1}{2\pi} & 0 \le x < 2\pi \\ 0 & \text{otherwise}\end{cases}$$
in this case. Intuition about probability density functions is based on the approximate relation
$$P(X \in [x, x + \delta x]) = \int_x^{x+\delta x} f(z)\,dz \approx f(x)\,\delta x.$$
Proofs, however, more often use the distribution function
$$F(x) = P(X \le x),$$
which is increasing in $x$. Whether $X$ is continuous or discrete,
$$P(a < X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a).$$
Theorem 6.1. If $X$ is a continuous random variable with pdf $f(x)$ and $h(x)$ is a continuous, strictly increasing function with $h^{-1}(x)$ differentiable, then $h(X)$ is a continuous random variable with pdf
$$f_h(x) = f\left(h^{-1}(x)\right)\frac{d}{dx}h^{-1}(x).$$

Proof.
$$P(h(X) \le x) = P\left(X \le h^{-1}(x)\right) = F\left(h^{-1}(x)\right),$$
since $h$ is strictly increasing and $F$ is the distribution function of $X$. Then
$$\frac{d}{dx}P(h(X) \le x) = f\left(h^{-1}(x)\right)\frac{d}{dx}h^{-1}(x),$$
so $h(X)$ is a continuous random variable with pdf $f_h$ as claimed. Note: it is usually easier to repeat this proof than to remember the result.
Example. Suppose $U \sim U[0, 1]$ and $F$ is a distribution function. Let $X = F^{-1}(U)$. Then
$$P(X \le x) = P\left(F^{-1}(U) \le x\right) = P(U \le F(x)) = F(x),$$
so $X$ has distribution function $F$.

Remark. The same idea works for discrete distributions: if
$$P(X = x_i) = p_i, \qquad i = 0, 1, \dots,$$
let
$$X = x_j \quad\text{if}\quad \sum_{i=0}^{j-1} p_i \le U < \sum_{i=0}^{j} p_i, \qquad U \sim U[0, 1].$$
This is useful for simulations.
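A minimal sketch of this inverse-transform method (my addition), for the exponential distribution where $F(x) = 1 - e^{-\lambda x}$ inverts in closed form:

```python
import math
import random

def sample_exponential(lam):
    """Inverse transform: F(x) = 1 - exp(-lam*x), F^{-1}(u) = -log(1-u)/lam."""
    u = random.random()
    return -math.log(1 - u) / lam

lam = 2.0
xs = [sample_exponential(lam) for _ in range(100_000)]
print(sum(xs) / len(xs), 1 / lam)  # sample mean vs. true mean 1/lam
```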
6.1 Jointly Distributed Random Variables

For jointly distributed random variables $X$ and $Y$ with joint distribution function $F(x, y) = P(X \le x, Y \le y)$, the marginal distribution functions are
$$F_X(x) = P(X \le x) = P(X \le x, Y < \infty) = F(x, \infty) = \lim_{y\to\infty} F(x, y),$$
$$F_Y(y) = F(\infty, y).$$
$X_1, X_2, \dots, X_n$ are jointly distributed continuous random variables if, for suitable sets $C \subseteq \mathbb{R}^n$,
$$P((X_1, X_2, \dots, X_n) \in C) = \underset{(x_1, \dots, x_n) \in C}{\int\!\!\int\cdots\!\int} f(x_1, \dots, x_n)\,dx_1\cdots dx_n$$
for some function $f$, called the joint probability density function, satisfying the obvious conditions:

1. $f(x_1, \dots, x_n) \ge 0$,
2. $\int\!\!\int\cdots\!\int_{\mathbb{R}^n} f(x_1, \dots, x_n)\,dx_1\cdots dx_n = 1$.

Jointly continuous random variables are independent if and only if
$$f(x_1, \dots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i),$$
where $f_{X_i}(x_i)$ are the pdf's of the individual random variables.
Example. Two points $X$ and $Y$ are tossed at random and independently onto a line segment of length $L$. What is the probability that $|X - Y| \le l$?

Solution. Let $A = \{(x, y) : |x - y| \le l\}$. The desired probability is
$$\int\!\!\int_A f(x, y)\,dx\,dy = \frac{\text{area of } A}{L^2} = \frac{L^2 - 2\cdot\frac{1}{2}(L - l)^2}{L^2} = \frac{2Ll - l^2}{L^2}.$$
Example (Buffon's Needle). A needle of length $l$ is tossed at random onto a floor ruled with parallel lines a distance $L$ apart, $l \le L$. Let $\Theta \in [0, \pi)$ be the angle between the needle and the parallel lines, and let $X$ be the distance from the bottom of the needle to the line closest to it above. It is reasonable to suppose that $X$ and $\Theta$ are distributed uniformly and independently:
$$X \sim U[0, L], \qquad \Theta \sim U[0, \pi).$$
The needle intersects a line if and only if $X \le l\sin\Theta$. Calling this event $A$,
$$P(A) = \int\!\!\int_A f(x, \theta)\,dx\,d\theta = \int_0^{\pi}\frac{l\sin\theta}{L\pi}\,d\theta = \frac{2l}{\pi L}.$$
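Since the crossing probability involves $\pi$, this example doubles as a (famously inefficient) way to estimate $\pi$. A minimal Monte Carlo sketch (my addition):

```python
import math
import random

def buffon(l, L, trials=1_000_000):
    """Estimate the crossing probability; theory gives 2*l / (pi*L)."""
    hits = 0
    for _ in range(trials):
        x = random.uniform(0, L)           # distance to nearest line above
        theta = random.uniform(0, math.pi)
        if x <= l * math.sin(theta):
            hits += 1
    return hits / trials

l, L = 1.0, 2.0
p = buffon(l, L)
print(p, 2 * l / (math.pi * L))
print("pi estimate:", 2 * l / (p * L))
```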
Definition 6.2. The expectation or mean of a continuous random variable $X$ is
$$E[X] = \int_{-\infty}^{\infty} xf(x)\,dx,$$
provided that not both of $\int_0^{\infty} xf(x)\,dx$ and $\int_{-\infty}^{0} xf(x)\,dx$ are infinite.

Example (Normal Distribution). Let
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty.$$
This is non-negative; for it to be a pdf we also need to check that
$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$
Make the substitution $z = \frac{x - \mu}{\sigma}$. Then
$$I = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{z^2}{2}}\,dz.$$
Thus
$$I^2 = \frac{1}{2\pi}\left(\int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\,dx\right)\left(\int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,dy\right) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\frac{x^2+y^2}{2}}\,dx\,dy$$
$$= \frac{1}{2\pi}\int_0^{2\pi}\!\!\int_0^{\infty} re^{-\frac{r^2}{2}}\,dr\,d\theta = \frac{1}{2\pi}\int_0^{2\pi} d\theta = 1.$$
Therefore $I = 1$. A random variable with the pdf $f(x)$ given above has a Normal distribution with parameters $\mu$ and $\sigma^2$; we write this as
$$X \sim N[\mu, \sigma^2].$$
The expectation is
$$E[X] = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} xe^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
$$= \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty}(x - \mu)e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \mu\cdot\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx.$$
The first integral is 0 by symmetry about $\mu$, so $E[X] = 0 + \mu = \mu$.
Theorem 6.4. If $X$ is a continuous random variable, then
$$E[X] = \int_0^{\infty} P(X \ge x)\,dx - \int_0^{\infty} P(X \le -x)\,dx.$$

Proof. For the first term,
$$\int_0^{\infty} P(X \ge x)\,dx = \int_0^{\infty}\left(\int_x^{\infty} f(y)\,dy\right)dx$$
$$= \int_0^{\infty}\!\!\int_0^{\infty} I[y \ge x]\,f(y)\,dy\,dx$$
$$= \int_0^{\infty}\left(\int_0^y dx\right)f(y)\,dy$$
$$= \int_0^{\infty} yf(y)\,dy,$$
and similarly the second term equals $-\int_{-\infty}^{0} yf(y)\,dy$; subtracting gives $E[X]$.

There is a discrete analogue: if $X$ takes values in $\{0, 1, 2, \dots\}$, then
$$\sum_{n=0}^{\infty} P(X > n) = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} I[m > n]\,P(X = m) = \sum_{m=0}^{\infty}\left(\sum_{n=0}^{\infty} I[m > n]\right)P(X = m) = \sum_{m=0}^{\infty} mP(X = m) = E[X].$$
Theorem 6.5. Let $X$ be a continuous random variable with pdf $f(x)$ and let $h(x)$ be a continuous real-valued function. Then, provided
$$\int_{-\infty}^{\infty} |h(x)|\,f(x)\,dx < \infty,$$
$$E[h(X)] = \int_{-\infty}^{\infty} h(x)f(x)\,dx.$$

Proof. Using Theorem 6.4,
$$\int_0^{\infty} P(h(X) \ge y)\,dy = \int_0^{\infty}\left[\int_{x : h(x) \ge y} f(x)\,dx\right]dy = \int_{x : h(x) \ge 0}\left(\int_0^{h(x)} dy\right)f(x)\,dx = \int_{x : h(x) \ge 0} h(x)f(x)\,dx.$$
Similarly,
$$\int_0^{\infty} P(h(X) \le -y)\,dy = -\int_{x : h(x) \le 0} h(x)f(x)\,dx,$$
and subtracting gives the result.
Note. The properties of expectation and variance are the same for discrete and continuous random variables; just replace $\sum$ with $\int$ in the proofs.

Example.
$$\operatorname{Var} X = E[X^2] - E[X]^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \left(\int_{-\infty}^{\infty} xf(x)\,dx\right)^2.$$
Example. Suppose $X \sim N[\mu, \sigma^2]$ and let $Z = \frac{X - \mu}{\sigma}$. Then
$$P(Z \le z) = P\left(\frac{X - \mu}{\sigma} \le z\right) = P(X \le \mu + \sigma z)$$
$$= \int_{-\infty}^{\mu + \sigma z}\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
$$= \int_{-\infty}^{z}\frac{1}{\sqrt{2\pi}}e^{-\frac{u^2}{2}}\,du \qquad\text{letting } u = \frac{x - \mu}{\sigma}$$
$$= \Phi(z), \quad\text{the distribution function of a } N(0, 1) \text{ random variable},$$
so $Z \sim N(0, 1)$. Its variance is
$$\operatorname{Var} Z = E[Z^2] - E[Z]^2 \qquad\text{(the last term is zero)}$$
$$= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z^2e^{-\frac{z^2}{2}}\,dz = \left[-\frac{1}{\sqrt{2\pi}}\,ze^{-\frac{z^2}{2}}\right]_{-\infty}^{\infty} + \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}\,dz = 0 + 1 = 1.$$
What of the variance of $X$? Since $X = \mu + \sigma Z$,
$$E[X] = \mu \quad\text{(we knew that already)}, \qquad \operatorname{Var} X = \sigma^2\operatorname{Var} Z = \sigma^2,$$
so the parameters of $X \sim N(\mu, \sigma^2)$ are indeed its mean and variance.
6.2 Transformation of Random Variables

Suppose
$$Y_1 = r_1(X_1, X_2, \dots, X_n)$$
$$Y_2 = r_2(X_1, X_2, \dots, X_n)$$
$$\vdots$$
$$Y_n = r_n(X_1, X_2, \dots, X_n),$$
and let $R \subseteq \mathbb{R}^n$ be such that
$$P((X_1, X_2, \dots, X_n) \in R) = 1.$$
Let $S$ be the image of $R$ under the above transformation, and suppose the transformation from $R$ to $S$ is 1-1 (bijective), with inverse
$$x_1 = s_1(y_1, y_2, \dots, y_n), \quad x_2 = s_2(y_1, y_2, \dots, y_n), \quad\dots,\quad x_n = s_n(y_1, y_2, \dots, y_n).$$
Assume that $\frac{\partial s_i}{\partial y_j}$ exists and is continuous at every point $(y_1, y_2, \dots, y_n)$ in $S$, and let
$$J = \det\begin{pmatrix}\frac{\partial s_1}{\partial y_1} & \cdots & \frac{\partial s_1}{\partial y_n}\\ \vdots & \ddots & \vdots \\ \frac{\partial s_n}{\partial y_1} & \cdots & \frac{\partial s_n}{\partial y_n}\end{pmatrix}. \tag{6.3}$$
If $A \subseteq R$ with image $B$, then
$$P((X_1, \dots, X_n) \in A) = \int\cdots\int_A f(x_1, \dots, x_n)\,dx_1\cdots dx_n$$
$$= \int\cdots\int_B f(s_1, \dots, s_n)\,|J|\,dy_1\cdots dy_n = P((Y_1, \dots, Y_n) \in B),$$
so $(Y_1, \dots, Y_n)$ has joint density $f(s_1, \dots, s_n)|J|$ on $S$.
Example (density of products and quotients). Suppose that $(X, Y)$ has density
$$f(x, y) = \begin{cases} 4xy & \text{for } 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise.}\end{cases} \tag{6.4}$$
Let $U = \frac{X}{Y}$ and $V = XY$, so that
$$X = \sqrt{UV}, \qquad Y = \sqrt{\frac{V}{U}},$$
i.e. $x = \sqrt{uv}$, $y = \sqrt{v/u}$. Then
$$\frac{\partial x}{\partial u} = \frac{1}{2}\sqrt{\frac{v}{u}}, \qquad \frac{\partial x}{\partial v} = \frac{1}{2}\sqrt{\frac{u}{v}},$$
$$\frac{\partial y}{\partial u} = -\frac{1}{2}\,\frac{v^{1/2}}{u^{3/2}}, \qquad \frac{\partial y}{\partial v} = \frac{1}{2\sqrt{uv}}.$$
Therefore $|J| = \frac{1}{2u}$, and so
$$g(u, v) = \begin{cases}\frac{2v}{u} & \text{if } (u, v) \in D \\ 0 & \text{otherwise,}\end{cases}$$
where $D$ is the image of the unit square. Note that $U$ and $V$ are NOT independent:
$$g(u, v) = \frac{2v}{u}\,I[(u, v) \in D]$$
is not a product of a function of $u$ alone and a function of $v$ alone, because of the indicator.
When the transformations are linear, things are simpler still. Let $A$ be an $n \times n$ invertible matrix and
$$\begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} = A\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix};$$
then $|J| = |\det A|^{-1}$ and the density of $(Y_1, \dots, Y_n)$ is $f(A^{-1}\vec{y}\,)\,|\det A|^{-1}$.

For example, let $Y = X_1 + X_2$ and $Z = X_2$. Then
$$g(y, z) = f(x_1, x_2) = f(y - z, z)$$
is the joint density of $Y$ and $Z$. The marginal density of $Y$ is
$$g(y) = \int_{-\infty}^{\infty} f(y - z, z)\,dz, \qquad -\infty < y < \infty,$$
$$\text{or}\quad g(y) = \int_{-\infty}^{\infty} f(z, y - z)\,dz \quad\text{by change of variable.}$$
If $X_1$ and $X_2$ are independent, with pdf's $f_1$ and $f_2$, then
$$g(y) = \int_{-\infty}^{\infty} f_1(y - z)f_2(z)\,dz,$$
the convolution of $f_1$ and $f_2$.
If X1 and X2 are independent, with pgf’s f1 and f2 then
The median $\hat{x}$ of a distribution satisfies
$$P(X \le \hat{x}) \ge \frac{1}{2} \qquad\text{and}\qquad P(X \ge \hat{x}) \ge \frac{1}{2};$$
for a continuous random variable, $F(\hat{x}) = \frac{1}{2}$. If $X_1, \dots, X_n$ is a sample from the distribution, then recall that the sample mean is
$$\frac{1}{n}\sum_{i=1}^{n} X_i.$$
Order the sample as $Y_1 \le Y_2 \le \dots \le Y_n$, the order statistics. For the minimum, $P(Y_1 \le y) = 1 - (1 - F(y))^n$, so $Y_1$ has pdf
$$n\,(1 - F(y))^{n-1}f(y).$$
What about the joint density of $Y_1$ and $Y_n$?
$$G(y_1, y_n) = P(Y_1 \le y_1, Y_n \le y_n)$$
$$= P(Y_n \le y_n) - P(Y_n \le y_n, Y_1 > y_1)$$
$$= P(Y_n \le y_n) - P(y_1 < X_1 \le y_n,\ y_1 < X_2 \le y_n,\ \dots,\ y_1 < X_n \le y_n)$$
$$= (F(y_n))^n - (F(y_n) - F(y_1))^n.$$
Thus the pdf of $(Y_1, Y_n)$ is
$$\frac{\partial^2 G}{\partial y_1\,\partial y_n} = \begin{cases} n(n-1)(F(y_n) - F(y_1))^{n-2}f(y_1)f(y_n) & -\infty < y_1 \le y_n < \infty \\ 0 & \text{otherwise.}\end{cases}$$
What happens if the mapping is not 1-1? For example, if $X$ has pdf $f$, what is the pdf $g$ of $|X|$? For $0 \le a < b$,
$$P(|X| \in (a, b)) = \int_a^b\bigl(f(x) + f(-x)\bigr)\,dx, \qquad\text{so}\qquad g(x) = f(x) + f(-x).$$

Suppose $X_1, \dots, X_n$ are iidrv's. What is the pdf of $Y_1, \dots, Y_n$, the order statistics?
$$g(y_1, \dots, y_n) = \begin{cases} n!\,f(y_1)\cdots f(y_n) & y_1 \le y_2 \le \dots \le y_n \\ 0 & \text{otherwise.}\end{cases} \tag{6.6}$$
Example. Suppose $X_1, \dots, X_n$ are iidrv's exponentially distributed with parameter $\lambda$. Let
$$Z_1 = Y_1, \quad Z_2 = Y_2 - Y_1, \quad\dots,\quad Z_n = Y_n - Y_{n-1}.$$
The map $(y_1, \dots, y_n) \mapsto (z_1, \dots, z_n)$ is linear with Jacobian 1, and $y_1 + \dots + y_n = \sum_{i=1}^{n}(n + 1 - i)z_i$, so
$$h(z_1, \dots, z_n) = n!\,\lambda^n e^{-\lambda\sum_{i=1}^{n}(n+1-i)z_i} = \prod_{i=1}^{n}(n + 1 - i)\lambda\,e^{-(n+1-i)\lambda z_i}.$$
Thus $h(z_1, \dots, z_n)$ is expressed as the product of $n$ density functions, with
$$Z_{n+1-i} \sim \exp(i\lambda),$$
i.e. exponentially distributed with parameter $i\lambda$, and $Z_1, \dots, Z_n$ independent.
Example. Let $X$ and $Y$ be independent $N(0, 1)$ random variables. Let
$$D = R^2 = X^2 + Y^2, \qquad \tan\Theta = \frac{Y}{X},$$
so that $d = x^2 + y^2$ and $\theta = \arctan\frac{y}{x}$. The Jacobian of the map $(x, y) \mapsto (d, \theta)$ is
$$\det\begin{pmatrix} 2x & 2y \\ \dfrac{-y/x^2}{1 + (y/x)^2} & \dfrac{1/x}{1 + (y/x)^2}\end{pmatrix} = 2, \tag{6.8}$$
so the inverse map has $|J| = \frac{1}{2}$. Now
$$f(x, y) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}\cdot\frac{1}{\sqrt{2\pi}}e^{-\frac{y^2}{2}} = \frac{1}{2\pi}e^{-\frac{x^2+y^2}{2}}.$$
Thus
$$g(d, \theta) = \frac{1}{4\pi}e^{-\frac{d}{2}}, \qquad 0 \le d < \infty,\ 0 \le \theta < 2\pi,$$
$$g_D(d) = \frac{1}{2}e^{-\frac{d}{2}},\quad 0 \le d < \infty, \qquad g_\Theta(\theta) = \frac{1}{2\pi},\quad 0 \le \theta < 2\pi.$$
So $D$ and $\Theta$ are independent, with $D$ exponential of mean 2 and $\Theta \sim U[0, 2\pi]$.
Note this is useful for the simulation of normal random variables. We know we can simulate a random variable with distribution function $F$ by $X = F^{-1}(U)$ when $U \sim U[0, 1]$, but this is difficult for an $N[0, 1]$ random variable, since
$$F(x) = \Phi(x) = \int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}\,dz$$
is difficult to invert. Instead, let $U_1$ and $U_2$ be independent $U[0, 1]$. Let $R^2 = -2\log U_1$, so that $R^2$ is exponential with mean 2, and let $\Theta = 2\pi U_2$, so that $\Theta \sim U[0, 2\pi]$. Now let
$$X = R\cos\Theta = \sqrt{-2\log U_1}\,\cos(2\pi U_2),$$
$$Y = R\sin\Theta = \sqrt{-2\log U_1}\,\sin(2\pi U_2).$$
Then $X$ and $Y$ are independent $N[0, 1]$ random variables.
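This is the Box-Muller method. A minimal implementation (my addition, not from the notes), checking that the samples have mean near 0 and variance near 1:

```python
import math
import random

def box_muller():
    """Return two independent N(0, 1) samples."""
    u1 = 1.0 - random.random()   # in (0, 1], avoids log(0)
    u2 = random.random()
    r = math.sqrt(-2 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

xs = []
for _ in range(100_000):
    x, y = box_muller()
    xs += [x, y]
mean = sum(xs) / len(xs)
var = sum(v * v for v in xs) / len(xs) - mean * mean
print(mean, var)  # should be close to 0 and 1
```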
Example (Bertrand's Paradox). Calculate the probability that a "random chord" of a circle of radius 1 has length greater than $\sqrt{3}$, the length of the side of an inscribed equilateral triangle.

There are at least 3 interpretations of a random chord.

(1) The ends are independently and uniformly distributed over the circumference.
$$\text{answer} = \frac{1}{3}$$

(2) The foot of the perpendicular to the chord from the centre of the circle is uniformly distributed over a fixed diameter. A chord at distance $a$ from the centre has length greater than $\sqrt{3}$ exactly when
$$a^2 + \left(\frac{\sqrt{3}}{2}\right)^2 < 1^2, \qquad\text{i.e. } a < \frac{1}{2}.$$
$$\text{answer} = \frac{1}{2}$$

(3) The foot of the perpendicular to the chord from the centre of the circle is uniformly distributed over the interior of the circle.
$$\text{answer} = \frac{1}{4}$$
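A simulation of the three interpretations (my sketch, not from the notes). It uses the criterion derived in (2): a chord is longer than $\sqrt{3}$ exactly when its distance to the centre is less than $\frac{1}{2}$.

```python
import math
import random

def chord_len_sq_random_ends():
    t1 = random.uniform(0, 2 * math.pi)
    t2 = random.uniform(0, 2 * math.pi)
    return (math.cos(t1) - math.cos(t2)) ** 2 + (math.sin(t1) - math.sin(t2)) ** 2

def estimate(sample_dist_to_centre, trials=200_000):
    # Chord longer than sqrt(3) iff its distance to the centre is < 1/2.
    return sum(sample_dist_to_centre() < 0.5 for _ in range(trials)) / trials

trials = 200_000
print(sum(chord_len_sq_random_ends() > 3 for _ in range(trials)) / trials)  # ~1/3
print(estimate(lambda: random.uniform(0, 1)))        # uniform on a diameter, ~1/2
print(estimate(lambda: math.sqrt(random.random())))  # uniform over the disc, ~1/4
```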
6.3 Moment Generating Functions

The moment generating function (mgf) of a random variable $X$ is
$$m(\theta) = E\left[e^{\theta X}\right].$$

Theorem 6.6. The moment generating function determines the distribution of $X$, provided $m(\theta)$ is finite for some interval containing the origin.

Proof. Not proved.

Theorem 6.7. If $X$ and $Y$ are independent random variables with moment generating functions $m_X(\theta)$ and $m_Y(\theta)$, then $X + Y$ has the moment generating function
$$m_{X+Y}(\theta) = m_X(\theta)\,m_Y(\theta).$$

Theorem 6.8. The $r$th moment of $X$, i.e. the expected value of $X^r$, $E[X^r]$, is the coefficient of $\frac{\theta^r}{r!}$ in the series expansion of $m(\theta)$.

Proof.
$$e^{\theta X} = 1 + \theta X + \frac{\theta^2}{2!}X^2 + \dots$$
$$E\left[e^{\theta X}\right] = 1 + \theta E[X] + \frac{\theta^2}{2!}E[X^2] + \dots$$
Example. Suppose $X$ is exponentially distributed with parameter $\lambda$. Then
$$E\left[e^{\theta X}\right] = \int_0^{\infty} e^{\theta x}\lambda e^{-\lambda x}\,dx = \lambda\int_0^{\infty} e^{-(\lambda - \theta)x}\,dx = \frac{\lambda}{\lambda - \theta} = m(\theta) \qquad\text{for } \theta < \lambda.$$
Hence
$$E[X] = m'(0) = \left.\frac{\lambda}{(\lambda - \theta)^2}\right|_{\theta = 0} = \frac{1}{\lambda},$$
$$E[X^2] = m''(0) = \left.\frac{2\lambda}{(\lambda - \theta)^3}\right|_{\theta = 0} = \frac{2}{\lambda^2}.$$
Thus
$$\operatorname{Var} X = E[X^2] - E[X]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.$$
Example. Suppose $X_1, \dots, X_n$ are independent, each exponentially distributed with parameter $\lambda$. Then
$$E\left[e^{\theta(X_1 + \dots + X_n)}\right] = E\left[e^{\theta X_1}\right]\cdots E\left[e^{\theta X_n}\right] = \left(E\left[e^{\theta X_1}\right]\right)^n = \left(\frac{\lambda}{\lambda - \theta}\right)^n.$$
Suppose that $Y \sim \Gamma(n, \lambda)$, i.e. $Y$ has density
$$\frac{\lambda^n e^{-\lambda x}x^{n-1}}{(n-1)!}, \qquad x \ge 0.$$
Then
$$E\left[e^{\theta Y}\right] = \int_0^{\infty} e^{\theta x}\,\frac{\lambda^n e^{-\lambda x}x^{n-1}}{(n-1)!}\,dx = \left(\frac{\lambda}{\lambda - \theta}\right)^n\int_0^{\infty}\frac{(\lambda - \theta)^n e^{-(\lambda - \theta)x}x^{n-1}}{(n-1)!}\,dx = \left(\frac{\lambda}{\lambda - \theta}\right)^n,$$
since the last integrand is the $\Gamma(n, \lambda - \theta)$ density. Hence the claim that $X_1 + \dots + X_n \sim \Gamma(n, \lambda)$, since the moment generating function characterizes the distribution.
Example (Normal Distribution). Let $X \sim N[\mu, \sigma^2]$. Then
$$E\left[e^{\theta X}\right] = \int_{-\infty}^{\infty} e^{\theta x}\,\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
$$= \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(\frac{-1}{2\sigma^2}\left(x^2 - 2x\mu + \mu^2 - 2\sigma^2\theta x\right)\right)dx$$
$$= \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(\frac{-1}{2\sigma^2}\left(x - \mu - \sigma^2\theta\right)^2 + \theta\mu + \frac{\theta^2\sigma^2}{2}\right)dx$$
$$= e^{\theta\mu + \frac{\theta^2\sigma^2}{2}}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(\frac{-1}{2\sigma^2}\left(x - \mu - \sigma^2\theta\right)^2\right)dx.$$
The integral equals 1 as it is the density of $N[\mu + \theta\sigma^2, \sigma^2]$, so
$$m(\theta) = e^{\theta\mu + \frac{\theta^2\sigma^2}{2}}.$$

Theorem. Suppose $X \sim N[\mu_1, \sigma_1^2]$ and $Y \sim N[\mu_2, \sigma_2^2]$ are independent. Then:

1. $X + Y \sim N[\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2]$,
2. $aX \sim N[a\mu_1,\ a^2\sigma_1^2]$.
Proof. 1.
$$E\left[e^{\theta(X+Y)}\right] = E\left[e^{\theta X}\right]E\left[e^{\theta Y}\right] = e^{\mu_1\theta + \frac{1}{2}\sigma_1^2\theta^2}\,e^{\mu_2\theta + \frac{1}{2}\sigma_2^2\theta^2} = e^{(\mu_1 + \mu_2)\theta + \frac{1}{2}(\sigma_1^2 + \sigma_2^2)\theta^2},$$
which is the moment generating function for
$$N[\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2].$$

2.
$$E\left[e^{\theta(aX)}\right] = E\left[e^{(\theta a)X}\right] = e^{\mu_1(\theta a) + \frac{1}{2}\sigma_1^2(\theta a)^2} = e^{(a\mu_1)\theta + \frac{1}{2}a^2\sigma_1^2\theta^2},$$
which is the moment generating function of
$$N[a\mu_1,\ a^2\sigma_1^2].$$
6.4 Central Limit Theorem

Suppose $X_1, X_2, \dots$ are independent identically distributed random variables with mean $\mu$ and variance $\operatorname{Var} X_i = \sigma^2$. Then $X_1 + \dots + X_n$ has variance
$$\operatorname{Var}(X_1 + \dots + X_n) = n\sigma^2,$$
so
$$\operatorname{Var}\left(\frac{X_1 + \dots + X_n}{n}\right) = \frac{\sigma^2}{n} \qquad\text{while}\qquad \operatorname{Var}\left(\frac{X_1 + \dots + X_n}{\sqrt{n}}\right) = \sigma^2.$$

Theorem (Central Limit Theorem). With $S_n = X_1 + \dots + X_n$,
$$\lim_{n\to\infty} P\left(a \le \frac{S_n - n\mu}{\sigma\sqrt{n}} \le b\right) = \int_a^b \frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}\,dz,$$
the integrand being the pdf of a $N[0, 1]$ random variable.
Proof sketch (taking $\mu = 0$, $\sigma^2 = 1$; in general replace $X_i$ by $(X_i - \mu)/\sigma$). The mgf of $X_i$ is
$$m_{X_1}(\theta) = 1 + \frac{\theta^2}{2} + \frac{\theta^3}{3!}E[X_i^3] + \dots$$
The mgf of $\frac{S_n}{\sqrt{n}}$ is
$$E\left[e^{\theta\frac{S_n}{\sqrt{n}}}\right] = E\left[e^{\frac{\theta}{\sqrt{n}}(X_1 + \dots + X_n)}\right] = E\left[e^{\frac{\theta}{\sqrt{n}}X_1}\right]\cdots E\left[e^{\frac{\theta}{\sqrt{n}}X_n}\right] = \left(E\left[e^{\frac{\theta}{\sqrt{n}}X_1}\right]\right)^n$$
$$= \left(m_{X_1}\left(\frac{\theta}{\sqrt{n}}\right)\right)^n = \left(1 + \frac{\theta^2}{2n} + \frac{\theta^3 E[X^3]}{3!\,n^{3/2}} + \dots\right)^n \to e^{\frac{\theta^2}{2}} \quad\text{as } n \to \infty,$$
which is the mgf of a $N[0, 1]$ random variable.

Note: if $S_n \sim \operatorname{Bin}[n, p]$, i.e. $X_i = 1$ with probability $p$ and $0$ with probability $1 - p$, then
$$\frac{S_n - np}{\sqrt{npq}} \approx N[0, 1].$$
This is called the normal approximation to the binomial distribution. It applies as $n \to \infty$ with $p$ constant. Earlier we discussed the Poisson approximation to the binomial, which applies when $n \to \infty$ and $np$ is constant.
Example. There are two competing airlines: $n$ passengers each select one of the two planes at random. The number of passengers in plane one is
$$S \sim \operatorname{Bin}\left[n, \tfrac{1}{2}\right].$$
Suppose each plane has $s$ seats and let $f(s) = P(S > s)$ be the probability that plane one is over-subscribed. By the normal approximation,
$$\frac{S - \frac{1}{2}n}{\frac{1}{2}\sqrt{n}} \approx N[0, 1],$$
so
$$f(s) = P\left(\frac{S - \frac{1}{2}n}{\frac{1}{2}\sqrt{n}} > \frac{s - \frac{1}{2}n}{\frac{1}{2}\sqrt{n}}\right) = 1 - \Phi\left(\frac{2s - n}{\sqrt{n}}\right).$$
Therefore, if $n = 1000$ and $s = 537$ then $f(s) = 0.01$: the planes hold 1074 seats in total, only 74 in excess of the number of passengers.
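The normal approximation here is easy to evaluate directly, using $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}\left(z/\sqrt{2}\right)\right)$ (a quick check of the 0.01 figure; my addition):

```python
import math

def overflow_prob(n, s):
    """Normal approximation to P(Bin(n, 1/2) > s): 1 - Phi((2s - n)/sqrt(n))."""
    z = (2 * s - n) / math.sqrt(n)
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(overflow_prob(1000, 537))  # about 0.01
```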
Example (Opinion Poll). Suppose a proportion $p$ of the population vote for a particular party, and we estimate $p$ by the sample proportion $p' = \frac{S_n}{n}$ from a poll of $n$ voters (without complete enumeration). We want to be 95% sure that $|p' - p| \le 0.005$. Instead, choose $n$ so that the event $|p' - p| \ge 0.005$ has probability at most 0.05:
$$P(|p' - p| \ge 0.005) = P(|S_n - np| \ge 0.005n) = P\left(\frac{|S_n - np|}{\sqrt{npq}} \ge 0.005\sqrt{\frac{n}{pq}}\right).$$
Since
$$\int_{-1.96}^{1.96}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}\,dx = 2\Phi(1.96) - 1 = 0.95,$$
we require
$$0.005\sqrt{\frac{n}{pq}} \ge 1.96.$$
Since $pq \le \frac{1}{4}$, it suffices to take
$$n \ge \frac{1.96^2}{0.005^2}\cdot\frac{1}{4} \approx 40\,000.$$
If we replace 0.005 by 0.01 then $n \approx 10\,000$ will be sufficient, and if we replace 0.005 by 0.045 then $n \approx 475$ will suffice.

Note: the answer does not depend upon the total population.
6.5 Multivariate Normal Distribution

Let $X_1, \dots, X_n$ be independent $N(0, 1)$ random variables, with joint density
$$f(x_1, \dots, x_n) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}}e^{-\frac{x_i^2}{2}} = \frac{1}{(2\pi)^{n/2}}e^{-\frac{1}{2}\sum_{i=1}^{n}x_i^2} = \frac{1}{(2\pi)^{n/2}}e^{-\frac{1}{2}\vec{x}^T\vec{x}}.$$
Write
$$\vec{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}$$
and let $\vec{Z} = \vec{\mu} + A\vec{X}$, where $A$ is an invertible matrix, so that $\vec{x} = A^{-1}(\vec{z} - \vec{\mu})$. The density of $\vec{Z}$ is
$$f(z_1, \dots, z_n) = \frac{1}{(2\pi)^{n/2}|\det A|}\,e^{-\frac{1}{2}\left(A^{-1}(\vec{z}-\vec{\mu})\right)^T\left(A^{-1}(\vec{z}-\vec{\mu})\right)} = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}\,e^{-\frac{1}{2}(\vec{z}-\vec{\mu})^T\Sigma^{-1}(\vec{z}-\vec{\mu})},$$
where $\Sigma = AA^T$. This is the multivariate normal (MVN) density, and $\Sigma$ is the covariance matrix:
$$E\left[(\vec{Z} - \vec{\mu})(\vec{Z} - \vec{\mu})^T\right] = E\left[(A\vec{X})(A\vec{X})^T\right] = A\,E\left[\vec{X}\vec{X}^T\right]A^T = AIA^T = AA^T = \Sigma.$$
If the covariance matrix of the MVN distribution is diagonal, then the components of the random vector $\vec{Z}$ are independent, since
$$f(z_1, \dots, z_n) = \prod_{i=1}^{n}\frac{1}{(2\pi)^{1/2}\sigma_i}\,e^{-\frac{1}{2}\left(\frac{z_i - \mu_i}{\sigma_i}\right)^2},$$
where
$$\Sigma = \begin{pmatrix}\sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2\end{pmatrix}.$$
This is not necessarily true if the distribution is not MVN: recall sheet 2, question 9.
For $n = 2$, the bivariate normal density is
$$f(x_1, x_2) = \frac{1}{2\pi(1 - \rho^2)^{1/2}\sigma_1\sigma_2}\exp\left[-\frac{1}{2(1 - \rho^2)}\left[\left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2\right]\right],$$
with
$$\Sigma^{-1} = \frac{1}{1 - \rho^2}\begin{pmatrix}\frac{1}{\sigma_1^2} & \frac{-\rho}{\sigma_1\sigma_2} \\ \frac{-\rho}{\sigma_1\sigma_2} & \frac{1}{\sigma_2^2}\end{pmatrix}, \qquad \Sigma = \begin{pmatrix}\sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2\end{pmatrix},$$
so that
$$\operatorname{Corr}(X_1, X_2) = \frac{\operatorname{Cov}(X_1, X_2)}{\sigma_1\sigma_2} = \rho.$$