2. Random Variables and Probability Distributions
2.1 The Concept of a Random Variable
Examples:
1. Tossing a fair coin twice. Let the random variable X be the number of heads obtained. The sample space is

Outcome   Number of heads   Number of tails   Number of tosses
HH        2                 0                 2
HT        1                 1                 2
TH        1                 1                 2
TT        0                 2                 2

We say that the space of the R.V. X = {0, 1, 2}.
2. Tossing a fair coin three times. The sample space is

Outcome   Number of heads
HHH       3
HHT       2
HTH       2
HTT       1
THH       2
THT       1
TTH       1
TTT       0

The space of the R.V. X = {0, 1, 2, 3}.
3. The sample space for rolling a die once is S = {1, 2, 3, 4, 5, 6}. Let the rv X denote the number on the face that turns up; then we can write
X(1) = 1, X(2) = 2, X(3) = 3, X(4) = 4, X(5) = 5, X(6) = 6.
Note that in this case X(x) = x (the value of the outcome equals the value of the rv X): we call such a function an identity function.
4. Rolling a pair of fair dice. Let the random variable X be the sum of the two dice. The sample space is {(1,1), (1,2), …, (6,6)}, with N = 36 sample points.
The space of the R.V. X = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}; the number of elements is 11.
5. The sample space for tossing a coin until a head turns up is S = {H, TH, TTH, TTTH, ...}.
Let the rv X be the number of trials required to produce the first head; then we can write
X(H) = 1, X(TH) = 2, X(TTH) = 3, ...
In this case, the space (or range) of the rv X = {1, 2, 3, ...}.
We can say X is more tangible than S. It is easier to work with X for two reasons: (i) it is smaller, having 11 elements (as opposed to 36) in the case of example 4, and (ii) its elements are numbers (as opposed to outcomes), allowing the use of various mathematical operations.
Note that a sample space S contains all possible outcomes, whereas a random variable X takes real-number values. That is, elements of S are events, but a specific value of X is a real number.
a. Discrete Random Variable: if a random variable can assume only a finite or countably infinite set of values, it is said to be a discrete random variable.
b. Continuous Random Variable: if a random variable can assume an uncountably infinite set of values, it is known as a continuous random variable.
Once a random variable X is defined, the sample space is no longer important. All relevant aspects of the experiment can be captured by listing the possible values of X and their corresponding probabilities. This list is called a probability distribution; the associated function is also called a probability mass function (or, loosely, a probability density function, pdf). Formally, it is the function f defined by f(x) = P(X = x) for −∞ < x < ∞.
Definition: If X is a discrete random variable, the function given by f(x) = P(X = x) for each x within the range of X is known as the probability distribution of X:

f(x) = { P(X = xᵢ),  x = xᵢ, i = 1, 2, 3, ...
       { 0,          x ≠ xᵢ
The probability distribution shows the association of each value of X with its corresponding probability.
A function can serve as the probability distribution of a discrete random variable X iff its values f(x) satisfy the following two conditions:
1. f(x) ≥ 0 for each value within its domain;
2. Σₓ f(x) = 1, where the summation extends over all the values within its domain.
Examples:
1. Tossing a fair coin twice; let X be the number of heads.

Possible outcomes   X = x   f(x) = P(X = x)
HH                  2       1/4
HT                  1       1/4
TH                  1       1/4
TT                  0       1/4

Alternatively,

X = x   f(x) = P(X = x)
0       1/4
1       1/2
2       1/4

Both conditions hold: 1. f(x) ≥ 0, and 2. Σₓ f(x) = 1.
2. Tossing a fair coin three times. The random variable X can be the number of heads in the three tosses. The probability distribution is given by the table:

X = x             0     1     2     3
f(x) = P(X = x)   1/8   3/8   3/8   1/8

Again, 1. f(x) ≥ 0, and 2. Σₓ f(x) = 1.
3. Rolling a pair of fair dice; the random variable X is the sum of the two dice.

X = x             2      3      4      5      6      7      8      9      10     11     12
f(x) = P(X = x)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

Here also 1. f(x) ≥ 0 for every x, and 2. Σₓ f(x) = 1.
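A minimal numerical check of these two conditions for the dice-sum distribution (a sketch in plain Python, using only the standard library):

```python
from fractions import Fraction
from itertools import product

# Build the pmf of the sum of two fair dice from the 36 equally likely outcomes.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)

assert all(p >= 0 for p in pmf.values())   # condition 1: f(x) >= 0
assert sum(pmf.values()) == 1              # condition 2: sum of f(x) equals 1
print(pmf[7])                              # Fraction(1, 6), i.e. 6/36
```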
4. In the experiment of flipping a coin until a head appears, with X the number of tosses required to produce the first head, the sample space is S = {H, TH, TTH, TTTH, ...}. Let P(H) = p and P(T) = 1 − p; then

f(x) = P(X = x) = p(1 − p)^(x−1), x = 1, 2, 3, ...

Functions of Probability Distribution for a Discrete Random Variable
A function can serve as the probability distribution of a discrete random variable X iff its values f(x) satisfy the following two conditions:
1. f(x) ≥ 0 for each value within its domain;
2. Σₓ f(x) = 1, where the summation extends over all the values within its domain.
For the distribution above, Σ_{x=1}^∞ p(1 − p)^(x−1) = p · 1/(1 − (1 − p)) = 1, so it is a valid probability distribution.
4. Check whether the following function represents a probability distribution:

f(x) = [3!/(x!(3 − x)!)] (1/2)³, x = 0, 1, 2, 3

f(0) = [3!/(0!3!)](1/8) = 1/8
f(1) = [3!/(1!2!)](1/8) = 3/8
f(2) = [3!/(2!1!)](1/8) = 3/8
f(3) = [3!/(3!0!)](1/8) = 1/8

Σ_{x=0}^{3} f(x) = f(0) + f(1) + f(2) + f(3) = 1, and each f(x) ≥ 0, so it is a probability distribution.
5. For what value of k can the following function serve as a probability distribution?

f(x) = k(1/4)^x, x = 1, 2, 3, ...

f(1) = k/4, f(2) = k/16, f(3) = k/64, ...

The series k/4 + k/16 + k/64 + ⋯ is geometric (GP) with a₁ = k/4 and r = 1/4. Then

S∞ = a₁/(1 − r) = (k/4)/(1 − 1/4) = (k/4)/(3/4) = k/3 = 1,

hence k = 3.
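A quick numerical sanity check of this value (a sketch in plain Python): with k = 3 the partial sums of 3(1/4)^x approach 1.

```python
# Partial sum of the geometric series 3*(1/4)^x for x = 1, 2, ..., 59.
total = sum(3 * (1/4)**x for x in range(1, 60))
print(round(total, 12))   # 1.0 to within floating-point error
```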
Graphs of Probability Distributions for a Discrete Random Variable
The graphs include the (probability) histogram and the bar chart.
Example: tossing a fair coin three times; the random variable X is the number of heads in the three tosses. The probability distribution is given by the table:

X = x             0     1     2     3
f(x) = P(X = x)   1/8   3/8   3/8   1/8
[Figure: probability histogram of f(x) against the number of heads, with bars of height 1/8, 3/8, 3/8 and 1/8 at x = 0, 1, 2, 3.]
Cumulative Distribution Function (cdf) of a Discrete Random Variable
Definition: If X is a discrete random variable, the function given by

F(x) = P(X ≤ x) = Σ_{t ≤ x} f(t), for −∞ < x < ∞,

is called the cumulative distribution function of X.
The values of F(x) satisfy the following conditions:
a) F(∞) = 1;
b) F(−∞) = 0;
c) if a ≤ b, then F(a) ≤ F(b) for any real numbers a and b;
d) F(x) = 0 for x < x₁, x₁ being the minimum/least of the values of the random variable X;
e) F(x) = 1 for x ≥ xₙ, xₙ being the maximum/largest value of X;
f) P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F_X(b) − F_X(a).

Then, if x₁ < x₂ < ⋯ < x_k are the values of X,
F(x) = 0 for x < x₁
F(x) = f(x₁) for x₁ ≤ x < x₂
F(x) = f(x₁) + f(x₂) for x₂ ≤ x < x₃
F(x) = f(x₁) + f(x₂) + f(x₃) for x₃ ≤ x < x₄
⋮
F(x) = 1 when x ≥ x_k
Example: f(x) = 1/6 for x = xᵢ, i = 1, 2, ..., 6 (a fair die); 0 elsewhere.
Then,
F(x) = 0 for x < 1
F(x) = f(1) = 1/6 for 1 ≤ x < 2
F(x) = f(1) + f(2) = 2/6 for 2 ≤ x < 3
F(x) = f(1) + f(2) + f(3) = F(2) + f(3) = 3/6 for 3 ≤ x < 4
F(x) = F(3) + f(4) = 4/6 for 4 ≤ x < 5
F(x) = F(4) + f(5) = 5/6 for 5 ≤ x < 6
F(x) = F(5) + f(6) = 6/6 = 1 for x ≥ 6

Thus,

         0,    for x < 1
         1/6,  for 1 ≤ x < 2
         2/6,  for 2 ≤ x < 3
F(x) =   3/6,  for 3 ≤ x < 4
         4/6,  for 4 ≤ x < 5
         5/6,  for 5 ≤ x < 6
         1,    for x ≥ 6
[Figure: graph of F(x), a step function rising from 0 to 1 in jumps of 1/6 at x = 1, 2, ..., 6.]
Graphically, F(x) is a step function, with the height of the step at xᵢ equal to f(xᵢ).
Note that:
1. F(x) gives us the probability that the rv X will assume a value less than or equal to a given number, while f(x) gives us the probability that the rv X will assume a particular value.
E.g. in the experiment of rolling a die we have the sample space S = {1, 2, 3, 4, 5, 6}. Here X(x) = x and P(X = x) = 1/6. Such distributions are known as uniform distributions.
2. Given f(x) we can derive F(x), or given F(x) we can derive f(x):
F(x) = Σ_{t ≤ x} f(t), for −∞ < x < ∞.
E.g. for f(x) = 1/6, x = xᵢ, i = 1, 2, ..., 6 (0 elsewhere), we obtained

         0,    for x < 1
         1/6,  for 1 ≤ x < 2
         2/6,  for 2 ≤ x < 3
F(x) =   3/6,  for 3 ≤ x < 4
         4/6,  for 4 ≤ x < 5
         5/6,  for 5 ≤ x < 6
         1,    for x ≥ 6
If the range of a rv X consists of the values x₁ < x₂ < x₃ < ⋯ < xₙ, then f(x₁) = F(x₁) and f(xᵢ) = F(xᵢ) − F(xᵢ₋₁) for i = 2, 3, 4, ..., n.
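A small sketch of this relationship in plain Python, recovering f from F for the fair-die cdf derived above:

```python
from fractions import Fraction

xs = [1, 2, 3, 4, 5, 6]
F = {x: Fraction(x, 6) for x in xs}      # F(x_i) = i/6 for the fair die
f = {xs[0]: F[xs[0]]}                    # f(x_1) = F(x_1)
for prev, cur in zip(xs, xs[1:]):
    f[cur] = F[cur] - F[prev]            # f(x_i) = F(x_i) - F(x_{i-1})
print(f)                                 # every value is 1/6
```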
For a continuous random variable X, probabilities are obtained by integration.
Definition: if X is a continuous random variable, the function F given by

P(X ≤ x) = F(x) = ∫_{−∞}^{x} f(u) du, for every real number x,

is called the cumulative distribution function of X, and the integrand f(u) in F(x) = ∫_{−∞}^{x} f(u) du is called the probability density function of X.
NB: the conditions below are not "iff" as in the case of a discrete rv X. The reason is that f(x) could be negative for some values of the rv without affecting any of the probabilities. In practice, all probabilities are non-negative and hence these conditions are satisfied:
1. f(x) ≥ 0, for −∞ < x < ∞;
2. ∫_{−∞}^{∞} f(x) dx = 1;
3. the probability of a fixed value of a continuous rv X is zero, so that
P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b), for any real constants a and b with a ≤ b.
A probability density function (pdf) can be indicated using graphs or functions. Since the probability that a continuous rv X will assume a particular value is zero, the probability distribution of a continuous rv X cannot be given in tabular form. The graph of a pdf of a rv X is a continuous curve, which may have any shape.
Examples:
1. Check whether f(x) = 6x(1 − x), for 0 < x < 1 (0 elsewhere), can serve as a probability density function.
Solution:
∫₀¹ 6x(1 − x) dx = 6 ∫₀¹ (x − x²) dx = 6 (x²/2 − x³/3) |₀¹ = 6 · (3 − 2)/6 = 1,
and f(x) ≥ 0 on (0, 1): its graph is a parabola with x-intercepts at x = 0 and x = 1. Hence f is a pdf.
2. For what value of k can the function f(x) = kx(1 − x), for 0 < x < 1, serve as a probability density function?
Solution:
∫₀¹ kx(1 − x) dx = 1
k ∫₀¹ (x − x²) dx = 1
k (x²/2 − x³/3) |₀¹ = 1
k (3 − 2)/6 = 1
k/6 = 1
k = 6
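A quick numerical check of both results (a sketch, assuming SciPy is available): with k = 6 the density integrates to one.

```python
from scipy.integrate import quad

# Integrate 6x(1-x) over (0, 1); a proper pdf must integrate to 1.
val, err = quad(lambda x: 6 * x * (1 - x), 0, 1)
print(val)   # ~1.0
```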
3. The pdf of the continuous rv X is given by:

f(x) = { c/√x,  0 < x < 4
       { 0,     otherwise

a. Find the value of c.
b. Find P(X < 1/4) and P(X > 1).
Solution:
a. ∫₀⁴ (c/√x) dx = c ∫₀⁴ x^(−1/2) dx = 2c√x |₀⁴ = 2c(2) = 4c = 1, hence c = 1/4.

b. P(X < 1/4) = ∫₀^{1/4} (1/(4√x)) dx = (1/2)√x |₀^{1/4} = (1/2)(1/2) = 1/4,

P(X > 1) = ∫₁⁴ (1/(4√x)) dx = (1/2)√x |₁⁴ = (1/2)(2) − (1/2)(1) = 1 − 1/2 = 1/2.
4. Given the pdf:

f(x) = { kxe^(−x²),  for x > 0
       { 0,          for x ≤ 0

find the value of k.
Solution:
∫₀^∞ kxe^(−x²) dx = 1.
Let v = e^(−x²); then dv = −2xe^(−x²) dx, so xe^(−x²) dx = −dv/2. Then

∫₀^∞ kxe^(−x²) dx = −(k/2) ∫ dv = −(k/2) v |₀^∞ = −(k/2) e^(−x²) |₀^∞
                  = lim_{x→∞} (−(k/2)e^(−x²)) − (−(k/2)e⁰) = 0 + k/2 = 1,

hence k = 2.
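As before, a one-line numerical verification (a sketch, assuming SciPy and NumPy):

```python
import numpy as np
from scipy.integrate import quad

# With k = 2 the density 2x*exp(-x^2) on (0, inf) should integrate to 1.
val, _ = quad(lambda x: 2 * x * np.exp(-x**2), 0, np.inf)
print(val)   # ~1.0
```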
Cumulative Distribution Functions (cdf) of a continuous rv X
Definition: if X is a continuous rv and the value of its probability density at t is f(t), then the function given by

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt, for −∞ < x < ∞,

is known as the cumulative distribution function of X. F(·) must be continuous, with domain the set of all real numbers and range between 0 and 1 (inclusive).
Properties of the cdf of a continuous rv X
1. F(+∞) ≡ lim_{x→+∞} F(x) = 1;
2. F(−∞) ≡ lim_{x→−∞} F(x) = 0;
3. if a ≤ b, then F(a) ≤ F(b) (F is non-decreasing);
4. if f(x) and F(x) are the values of the pdf and cdf of a continuous rv X, then P(a ≤ X ≤ b) = F(b) − F(a), and f(x) = dF(x)/dx wherever the derivative exists.
Note that f(a) ≠ P(X = a), and f(a) could actually be greater than 1.
Examples
1. Given f(x) = 6x(1 − x), for 0 < x < 1, find its cdf and P(0.5 < X < 1).
Solution:
i. F(x) = ∫_{−∞}^{x} f(t) dt = ∫₀^x 6t(1 − t) dt = 6 ∫₀^x (t − t²) dt = 6(t²/2 − t³/3) |₀^x = 3x² − 2x³, for 0 < x < 1. Hence

         0,          for x ≤ 0
F(x) =   3x² − 2x³,  for 0 < x < 1
         1,          for x ≥ 1

Its graph is continuous between 0 and 1.
ii. P(0.5 < X < 1) = F(1) − F(0.5) = 1 − (3(0.5)² − 2(0.5)³) = 1 − 0.5 = 0.5.
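A one-line numerical check of this result in plain Python:

```python
# cdf of the density 6x(1-x) on (0, 1)
F = lambda x: 3 * x**2 - 2 * x**3
print(F(1) - F(0.5))   # 0.5
```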
2. The pdf of the continuous rv X is given by:

f(x) = { c/√x,  0 < x < 4
       { 0,     otherwise

Find its cdf and P(X > 1).
Solution: with c = 1/4 (from the earlier example),

f(x) = { 1/(4√x),  for 0 < x < 4
       { 0,        otherwise

i. F(x) = ∫_{−∞}^{x} f(t) dt = (1/4) ∫₀^x t^(−1/2) dt = (1/2)√t |₀^x = (1/2)√x, for 0 < x < 4. Hence

         0,         for x ≤ 0
F(x) =   (1/2)√x,   for 0 < x < 4
         1,         for x ≥ 4

ii. P(X > 1) = F(4) − F(1) = (1/2)(√4 − √1) = 1 − 1/2 = 1/2.
1. Find the cdf of the pdf of the form:

f(x) = { kxe^(−x²),  for x > 0
       { 0,          for x ≤ 0

2. Suppose X is a continuous rv with cdf F(x) = 1/(1 + e^(−x)). Find its pdf and compute P(−1 < X < 2) using both the pdf and the cdf.
Note that: given F(x) we can find f(x), and vice versa.
2.4. The Expected Value of a Random Variable and Moments
Mathematical Expectation:
1. Let X be a discrete random variable taking values x₁, x₂, x₃, ... with f(xᵢ) as its probability distribution; then the expected value of X, denoted by E(X), is defined as

E(X) = x₁f(x₁) + x₂f(x₂) + x₃f(x₃) + ⋯ = Σᵢ xᵢ f(xᵢ).

That is, E(X) is the weighted mean of the possible values of X, each value being weighted by its probability.
2. Let X be a continuous random variable with probability density function f(x); then the expected value of X, denoted by E(X), is defined as: E(X) = ∫_{−∞}^{∞} x f(x) dx.

Example: f(x) = 20,000/x³, x > 100.
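For this density the expected value works out to E(X) = ∫₁₀₀^∞ x · 20,000/x³ dx = 20,000 [−1/x]₁₀₀^∞ = 200, which a quick numerical integration confirms (a sketch, assuming SciPy):

```python
from scipy.integrate import quad

# E(X) = integral of x * f(x) over (100, inf) for f(x) = 20000/x^3
val, _ = quad(lambda x: x * 20000 / x**3, 100, float("inf"))
print(val)   # ~200.0
```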
Properties of mathematical expectations
a. If c is a constant, E(c) = c, since E(c) = c · Σₓ f(x) = c · 1 = c.
b. E(aX + b) = aE(X) + b, where a and b are constants.
Proof (discrete case):
E(aX + b) = Σ (ax + b) f(x) = Σ axf(x) + Σ bf(x) = a Σ xf(x) + b Σ f(x) = aE(X) + b, since Σ f(x) = 1.
For the continuous case:
E(aX + b) = ∫_{−∞}^{∞} (ax + b) f(x) dx = a ∫_{−∞}^{∞} xf(x) dx + b ∫_{−∞}^{∞} f(x) dx = aE(X) + b, since ∫_{−∞}^{∞} f(x) dx = 1.
c. Let X and Y be random variables with finite expected values. Then E(X + Y) = E(X) + E(Y).
Expectation of a Function of a Random Variable
Examples:
1. Given:

X = x    0     1     2     3
f(x)     1/3   1/2   0     1/6

find the expected value of g(X) = (X − 1)².
Solution:

E(g(X)) = Σ_{x=0}^{3} g(x) f(x) = Σ_{x=0}^{3} (x − 1)² f(x)
        = (0 − 1)²(1/3) + (1 − 1)²(1/2) + (2 − 1)²(0) + (3 − 1)²(1/6)
        = 1/3 + 0 + 0 + 4/6 = 1.
2. Let X be a rv with density function

f(x) = { x²/3,  −1 < x < 2
       { 0,     otherwise

Find the expected value of the function g(X) = 2X − 1.
Solution:

E(g(X)) = ∫_{−∞}^{∞} g(x) f(x) dx = ∫_{−1}^{2} (2x − 1)(x²/3) dx = (1/3) ∫_{−1}^{2} (2x³ − x²) dx
        = (1/3)(x⁴/2 − x³/3) |_{−1}^{2} = (1/3)[(16/2 − 8/3) − (1/2 + 1/3)] = 3/2.

More generally, E(c·g(X)) = Σ c·g(x) f(x) = c Σ g(x) f(x) = c·E(g(X)).
Variance and Standard Deviation of a rv X
The variance of the rv X measures the spread or dispersion of X.
Let X be a rv with the following distribution:

X = x    x₁      x₂      x₃      ...
f(x)     f(x₁)   f(x₂)   f(x₃)   ...
Var(X) = E[(X − μ)²] = Σ (x − μ)² f(x) = E(X²) − [E(X)]² = E(X²) − μ²,

where E(X²) = Σ x² f(x) for a discrete rv and E(X²) = ∫ x² f(x) dx for a continuous rv. Thus

Var(X) = Σ x² f(x) − μ²  (discrete),   Var(X) = ∫ x² f(x) dx − μ²  (continuous).
Properties of Var(X)
1. If a is any real constant, then Var(a) = 0.
2. If Var(X) = σ², then the variance of Y in Y = aX + b is given by Var(Y) = a²σ².
Examples (discrete):
1. Given

X = x    0     1     2     3
f(x)     1/8   3/8   3/8   1/8

E(X) = Σ x f(x) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 1.5
E(X²) = 0²(1/8) + 1²(3/8) + 2²(3/8) + 3²(1/8) = 0 + 3/8 + 12/8 + 9/8 = 24/8 = 3
Var(X) = 3 − (1.5)² = 0.75

In the three tosses of a fair coin, on average, we get 1.5 heads.
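The same mean and variance, computed from the pmf table in plain Python:

```python
from fractions import Fraction

f = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
mean = sum(x * p for x, p in f.items())             # E(X) = 3/2
var = sum(x**2 * p for x, p in f.items()) - mean**2 # E(X^2) - E(X)^2
print(mean, var)                                    # 3/2 3/4
```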
2. Bernoulli Random Variable: a random variable with only two outcomes (0 and 1) is known as a Bernoulli R.V.
Let X be a random variable with probability p of success and (1 − p) of failure:

           x    f_X(x)
Success    1    p
Failure    0    1 − p

f(x) = { p^x (1 − p)^(1−x),  if x = 0, 1
       { 0,                  otherwise
Let X be the number of trials required to produce the 1st success, say a head in tosses of a fair coin. This is easily described by a geometric random variable and is given as:

x    f_X(x)
1    p
2    (1 − p)p
3    (1 − p)²p
4    (1 − p)³p
.    .
.    .

E(X) = (1)p + (2)(1 − p)p + (3)(1 − p)²p + (4)(1 − p)³p + ⋯
     = p[1 + 2(1 − p) + 3(1 − p)² + 4(1 − p)³ + ⋯]
Let S = 1 + 2(1 − p) + 3(1 − p)² + 4(1 − p)³ + ⋯, so E(X) = Sp.
Then (1 − p)S = (1 − p) + 2(1 − p)² + 3(1 − p)³ + ⋯, and subtracting,
S − (1 − p)S = Sp = 1 + (1 − p) + (1 − p)² + (1 − p)³ + ⋯
Similarly, (1 − p)Sp = (1 − p) + (1 − p)² + (1 − p)³ + ⋯, so
Sp − (1 − p)Sp = Sp² = 1, i.e. E(X) = Sp = 1/p.
Alternatively, f(x) = P(X = x) = { p(1 − p)^(x−1),  x = 1, 2, 3, ...
                                 { 0,               elsewhere
E(X) = Σ_{x=1}^∞ x f(x) = Σ_{x=1}^∞ x p(1 − p)^(x−1) = p Σ_{x=1}^∞ x(1 − p)^(x−1)
     = p Σ_{x=1}^∞ [−d/dp (1 − p)^x],  since −d/dp (1 − p)^x = x(1 − p)^(x−1)
     = p [−d/dp Σ_{x=0}^∞ (1 − p)^x]   (adding the constant x = 0 term does not change the derivative)
     = p [−d/dp (1/p)] = p (1/p²) = 1/p.
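A Monte Carlo sketch of this result in plain Python: the average number of tosses to the first head is approximately 1/p.

```python
import random

p, n = 0.5, 100_000

def trials_to_first_success():
    x = 1
    while random.random() >= p:   # repeat until a "success" occurs
        x += 1
    return x

print(sum(trials_to_first_success() for _ in range(n)) / n)  # ~2.0 = 1/p
```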
2. Given F(x) = { 1 − e^(−λx),  x ≥ 0, λ > 0
                { 0,            otherwise

find: a. E(X) and b. Var(X).
Solution
E(X) = 1/λ, Var(X) = 1/λ².
Moments of a probability distribution
The mean of a distribution is the expected value of the random variable X. A generalisation of this is to raise X to any power r, for r = 0, 1, 2, ..., and compute E(X^r). This is known as the moment of order r about the origin, denoted μ′_r:
for r = 0, μ′₀ = E(X⁰) = 1
for r = 1, μ′₁ = E(X¹) = E(X) = mean = Σ x f(x)
for r = 2, μ′₂ = E(X²)
for r = 3, μ′₃ = E(X³)
⋮
μ′_r = E(X^r)
In general,
μ′_r = E(X^r) = Σ x^r f_X(x), r = 0, 1, 2, ...  (discrete)
μ′_r = E(X^r) = ∫_{−∞}^{∞} x^r f(x) dx, r = 0, 1, 2, ...  (continuous)
Moments can also be generated around the mean; these are known as central moments.
In general,
μ_r = E[(X − μ)^r] = Σₓ (x − μ)^r f(x), r = 0, 1, 2, ...  (discrete), or
μ_r = E[(X − μ)^r] = ∫_{−∞}^{∞} (x − μ)^r f(x) dx, r = 0, 1, 2, ...  (continuous)
Relationship between μ′_r and μ_r
a) μ₀ = μ′₀ = 1.
b) μ′₁ = μ, the mean.
c) μ₂ = E[(X − μ)²]:
E(X − μ)² = E(X² − 2μX + μ²) = E(X²) − 2μE(X) + μ² = E(X²) − 2μ² + μ² = E(X²) − μ².
Thus μ₂ = μ′₂ − (μ′₁)² = E(X²) − [E(X)]²; i.e., the variance of a random variable is the expected value of the square of the random variable less the square of the expected value of the random variable.
d) μ₃ = E[(X − μ)³]:
E(X − μ)³ = E(X³ − 3X²μ + 3Xμ² − μ³) = E(X³) − 3μE(X²) + 3μ²E(X) − μ³ = E(X³) − 3μE(X²) + 2μ³.
e) μ₄ = E[(X − μ)⁴], the fourth moment about the mean:
E(X − μ)⁴ = E(X⁴ − 4X³μ + 6X²μ² − 4Xμ³ + μ⁴)
          = E(X⁴) − 4μE(X³) + 6μ²E(X²) − 4μ⁴ + μ⁴
          = E(X⁴) − 4μE(X³) + 6μ²E(X²) − 3μ⁴.
Interpretations
1. The first moment about the origin is the mean of the distribution. That is, μ′₁ = μ is a measure of location (central tendency).
Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak; a uniform distribution would be the extreme case.
4. μ₃ is the third moment about the mean and is used to calculate the measure of skewness, given as α₃ = μ₃/σ³ and known as Pearson's measure of skewness. If α₃ = 0 the distribution is symmetric. If α₃ > 0 the distribution is positively skewed: there is a spread to the right, where a few observations on the right-hand side of the mean pull the mean to the right. If α₃ < 0 the distribution is negatively skewed.
5. μ₄ is the fourth moment about the mean and is used to calculate the measure of peakedness or flatness (known as kurtosis), given as α₄ = μ₄/σ⁴. α₄ = 3 for a normal distribution. α₄ > 3 if the distribution has a sharper peak and heavier tails than the normal distribution (it is known as leptokurtic). α₄ < 3 if the distribution is flatter, with thinner tails, than the normal distribution (it is known as platykurtic).
Example
We have already obtained the expected value of the Bernoulli random variable

f_X(x) = { p^x (1 − p)^(1−x),  if x = 0, 1
         { 0,                  otherwise

to be equal to E(X) = 0(1 − p) + 1(p) = p.
To obtain the variance of the Bernoulli random variable we first get
E(X²) = 0²(1 − p) + 1²(p) = p.
Thus σ² = E(X²) − [E(X)]² = p − p² = p(1 − p),
σ = √(p − p²) = √(p(1 − p)),
and cv = √(p − p²)/p = √(p(1 − p))/p.
Exercise
Let the random variable Y be the number of failures preceding a success in an
experiment of tossing a fair coin, where success is obtaining a head. Find E(Y) and σ² for this random variable.
The Moment Generating Function
The moment generating function (mgf) of a rv X is defined as m(t) = E(e^(tX)). Recall Taylor's series expansion of a function f about x₀:

f(x) = f(x₀)(x − x₀)⁰/0! + f′(x₀)(x − x₀)¹/1! + f″(x₀)(x − x₀)²/2! + f‴(x₀)(x − x₀)³/3! + ⋯
     = f(x₀) + Σ_{r=1}^∞ f^(r)(x₀)(x − x₀)^r / r!,

where f^(r)(x₀) = d^r f(x)/dx^r |_{x = x₀}.
The Maclaurin series (the Taylor series expansion about the origin, x₀ = 0) of e^(tx) is given by:

e^(tx) = 1 + tx + (tx)²/2! + (tx)³/3! + (tx)⁴/4! + (tx)⁵/5! + ⋯

Hence, m(t) = E(e^(tX)) = 1 + tE(X) + (t²/2!)E(X²) + (t³/3!)E(X³) + (t⁴/4!)E(X⁴) + (t⁵/5!)E(X⁵) + ⋯

(d/dt)m(t) = E(X) + (2t/2!)E(X²) + (3t²/3!)E(X³) + (4t³/4!)E(X⁴) + (5t⁴/5!)E(X⁵) + ⋯

(d/dt)m(t) |_{t=0} = m′(0) = E(X), the first moment about the origin.

Similarly, (d²/dt²)m(t) = E(X²) + (6t/3!)E(X³) + (12t²/4!)E(X⁴) + (20t³/5!)E(X⁵) + ⋯

(d²/dt²)m(t) |_{t=0} = m″(0) = E(X²), the second moment about the origin.
In general,

d^r m(t)/dt^r |_{t=0} = μ′_r, the r-th moment about the origin.
Examples
1. Given the pdf f(x) = 2x for 0 < x < 1 (0 elsewhere), find the moment generating function and use it to obtain the mean and the variance.

m(t) = E(e^(tX)) = ∫₀¹ 2x e^(tx) dx.

Integrating by parts with v′ = e^(tx), so v = (1/t)e^(tx):

∫₀¹ 2x e^(tx) dx = 2[(x/t)e^(tx) − (1/t²)e^(tx)] |₀¹ = 2(e^t/t − e^t/t² + 1/t²)
               = 2(te^t − e^t + 1)/t²,

so m(t) = (2/t²)[e^t(t − 1) + 1].
The Maclaurin’s series of the mgf is:
m(t) = (2/t²)[(1 + t + t²/2! + t³/3! + t⁴/4! + ⋯)(t − 1) + 1]
     = (2/t²)[t − 1 + t² − t + (t³ − t²)/2 + (t⁴ − t³)/6 + (t⁵ − t⁴)/24 + ⋯ + 1]
     = (2/t²)[t²/2 + t³/3 + t⁴/8 + ⋯]
     = 1 + (2/3)t + (1/4)t² + ⋯

Then dm(t)/dt |_{t=0} = 2/3 = mean, and
d²m(t)/dt² |_{t=0} = 2(1/4) = 1/2 = E(X²), so
Var(X) = 1/2 − (2/3)² = 1/2 − 4/9 = 1/18.
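The series expansion above can be verified symbolically (a sketch, assuming SymPy is available): the coefficient of t^r in m(t) equals E(X^r)/r!.

```python
import sympy as sp

t = sp.symbols('t')
m = 2 * (sp.exp(t) * (t - 1) + 1) / t**2        # mgf of the example above
s = sp.series(m, t, 0, 3).removeO()             # 1 + 2t/3 + t^2/4
EX = s.coeff(t, 1)                              # E(X)   = 2/3
EX2 = s.coeff(t, 2) * sp.factorial(2)           # E(X^2) = 1/2
print(EX, EX2, EX2 - EX**2)                     # 2/3 1/2 1/18
```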
2. Given the probability distribution f(x) = 2(1/3)^x, x = 1, 2, 3, ..., find:
a. the moment generating function;
b. the mean and the variance of the distribution, using (a).
Solution:

m(t) = E(e^(tX)) = Σ_{x=1}^∞ e^(tx) · 2(1/3)^x = 2 Σ_{x=1}^∞ (e^t/3)^x = 2 · (e^t/3)/(1 − e^t/3) = 2e^t/(3 − e^t).

To solve for the mean and variance, take the derivatives of m(t) as it is (no need for the Maclaurin series) and evaluate them at zero.
Note:
1. Taking higher-order derivatives of the moment generating function and then evaluating the resulting function at the origin (t = 0) generates all higher-order moments about the origin of a random variable.
3. Special Probability Distributions and Densities
3.1 Some Special Probability Distributions: The Case of a Discrete Random Variable
3.1.1 The Bernoulli distribution
A random variable with only two outcomes (0 = failure and 1 = success) is known as a Bernoulli rv.
Let P(success) = p, implying P(failure) = 1 − p. Defining the rv X by X(success) = 1 and X(failure) = 0, then P(X = 1) = p and P(X = 0) = 1 − p. That is,
Definition: a rv X has a Bernoulli distribution, and is referred to as a Bernoulli rv, iff its probability distribution is given by:

f(x; p) = { p^x (1 − p)^(1−x),  for x = 0, 1 and 0 ≤ p ≤ 1
          { 0,                  elsewhere
3.1.2 The Binomial Distribution
Definition: a rv X has a binomial distribution, and is referred to as a binomial rv, iff its probability distribution is given by:

f(x; n, p) = C(n, x) p^x (1 − p)^(n−x), x = 0, 1, 2, ..., n
where n = the number of independent trials and p = the probability of success, and

C(n, x) = n!/(x!(n − x)!)

is "n combination x", the number of ways of choosing x items out of n.
Consider the case of 4 trials (n = 4) and denote success by S and failure by F.
1) A single success can occur in any one of the following four ways: (SFFF, FSFF, FFSF, FFFS). Each of these has the probability p(1 − p)³; therefore,
f(1; 4, p) = 4p(1 − p)³ = C(4, 1) p(1 − p)³.
2) Two successes can occur in six distinct ways: (SSFF, SFSF, SFFS, FSSF, FSFS, FFSS). Each of these has the probability p²(1 − p)²; thus f(2; 4, p) = 6p²(1 − p)². Using the factorial notation we can write
f(2; 4, p) = C(4, 2) p²(1 − p)² = [4!/((4 − 2)!2!)] p²(1 − p)² = 6p²(1 − p)².
By a similar argument:
f(3; 4, p) = C(4, 3) p³(1 − p) = [4!/((4 − 3)!3!)] p³(1 − p) = 4p³(1 − p).
The general form of the probability function of the binomial distribution is given by

f(x; n, p) = { C(n, x) p^x (1 − p)^(n−x),  x = 0, 1, 2, ..., n
             { 0,                          otherwise

Note that Σ_{x=0}^{n} C(n, x) p^x (1 − p)^(n−x) = (p + 1 − p)^n = 1; that is, the sum of the probabilities is equal to one.
For a random variable having a binomial distribution with parameters n and p, the mean and the variance can be obtained from the moment generating function.
The moment generating function:

m(t) = E(e^(tX)) = Σ_{x=0}^{n} e^(tx) C(n, x) p^x (1 − p)^(n−x)
     = Σ_{x=0}^{n} C(n, x) (pe^t)^x (1 − p)^(n−x)
     = (pe^t + (1 − p))^n.

Then, dm(t)/dt = n(pe^t + (1 − p))^(n−1) pe^t,
dm(t)/dt |_{t=0} = np,
E(X) = m′(0) = np, and

d²m(t)/dt² = n(n − 1)(pe^t + (1 − p))^(n−2) (pe^t)² + n(pe^t + (1 − p))^(n−1) pe^t,
d²m(t)/dt² |_{t=0} = n(n − 1)p² + np.

Then E(X²) = n(n − 1)p² + np, and

Var(X) = E(X²) − [E(X)]² = n(n − 1)p² + np − (np)²
       = n²p² − np² + np − n²p² = np − np² = np(1 − p).
Alternatively,

E(X) = Σ_{x=0}^{n} x C(n, x) p^x (1 − p)^(n−x) = Σ_{x=0}^{n} x [n!/(x!(n − x)!)] p^x q^(n−x)
     = np Σ_{x=1}^{n} C(n − 1, x − 1) p^(x−1) (1 − p)^(n−x) = np (p + 1 − p)^(n−1) = np.
Example
Suppose a manufacturer of TV tubes draws a random sample of 10 tubes. The production process is such that the probability that a single TV tube, selected at random, is defective is 10 percent. Calculate the probability of finding
a) exactly 3 defective tubes;
b) no more than 2 defective tubes.
a) Note n = 10, x = 3, p = 0.1, and q = 0.9; therefore

P(X = 3) = C(10, 3)(0.1)³(0.9)⁷ ≈ 0.0574.
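Both parts can be computed directly (a sketch, assuming SciPy is available):

```python
from scipy.stats import binom

print(binom.pmf(3, n=10, p=0.1))   # a) P(X = 3)  ~0.0574
print(binom.cdf(2, n=10, p=0.1))   # b) P(X <= 2) ~0.9298
```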
Exercise
1. A company that markets brand A cola drink claims that 65 percent of all residents of
a certain area prefer its brand to brand B. The company that makes Brand B employs an
independent market research consultant to test the claim. The consultant takes a random
sample of 25 persons and decides in advance to reject the claim if fewer than 12 people
prefer Brand A. What is the probability that the market researcher will make the error of
rejecting the claim even though it is correct?
2. An English teacher in Flen 101 gives a test consisting of 20 multiple choice questions
with four possible answers to each, of which only one is correct. One of the students,
who has not been studying, decides to check off answers at random. What is the
probability that he will get half of the questions right?
3. An owner of a mountain resort has 15 rooms available for rent, and they are rented independently. The probability that any one of them will be rented for a single night is 0.8. Compute the probability that at least 12 of the rooms will be rented in a single night.
3.1.3 The Hypergeometric Distribution
When sampling is done without replacement, so that the probability of success is not the same from trial to trial, the binomial distribution is not applicable. In this case, we use the hypergeometric distribution. Conditions: sampling must be without replacement, the population size (N) is finite, and the sample size (n) is greater than 5% of the total population.
Suppose a set of N elements (a finite population) of which M are successes and N − M are failures. Here again we are interested in the probability of getting x successes in n trials (the sample size), but now the choice of n out of the N elements is without replacement.
The number of ways of choosing x successes out of the total of M successes is C(M, x), and of choosing n − x failures out of the total of N − M failures is C(N − M, n − x). Then the number of ways of choosing x successes and n − x failures is C(M, x)·C(N − M, n − x). The number of ways of choosing n elements (the sample) out of the N elements (the population) is C(N, n).
The probability of getting x successes in n trials is therefore:

C(M, x) C(N − M, n − x) / C(N, n)

Definition: a random variable X has a hypergeometric distribution, and it is known as a hypergeometric rv, iff its probability distribution is given by:

f(x; n, N, M) = C(M, x) C(N − M, n − x) / C(N, n), for x = 0, 1, 2, ..., n; x ≤ M and n − x ≤ N − M.

The mean and variance of the hypergeometric distribution:

E(X) = nM/N  and  Var(X) = nM(N − M)(N − n) / (N²(N − 1)).
Example 1: Suppose an electronic component factory ships components in lots of 100, of which 10 are defective. A quality controller draws a sample of 5 to test. What is the probability that two of the five are defective?

f(2; 5, 100, 10) = C(10, 2) C(90, 3) / C(100, 5) ≈ 0.0702.
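The same value via SciPy (a sketch; note that SciPy's parameterization differs from the text's: M is the population size, n the number of success states, N the number of draws):

```python
from scipy.stats import hypergeom

print(hypergeom.pmf(2, M=100, n=10, N=5))   # ~0.0702
```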
Example 2: An urn contains 5 black and 7 red balls. Two balls are selected at random without replacement.
Let X = the number of black balls in the selected sample of 2 balls, so X ∈ {0, 1, 2}.
Question: find P(X = x).
a) The total number of ways of selecting two balls from an urn containing 12 balls is C(12, 2).
b) The number of ways of selecting x black balls is C(5, x).
c) The number of ways of selecting 2 − x red balls is C(7, 2 − x).

P(X = x) = C(5, x) C(7, 2 − x) / C(12, 2)

P(X = x) for x = 0, 1, 2 is as given below:

P(X = 0) = C(5, 0) C(7, 2) / C(12, 2) = 21/66 ≈ 0.32
P(X = 1) = C(5, 1) C(7, 1) / C(12, 2) = 35/66 ≈ 0.53

P(X = 2) = C(5, 2) C(7, 0) / C(12, 2) = 10/66 ≈ 0.15
3.1.4 The Poisson Distribution
Suppose the number of trials is very large (i.e., n → ∞) and the probability of getting a success is very small (i.e., p → 0). In such a case, calculating binomial probabilities is very difficult. Fortunately, we can approximate the binomial distribution using the Poisson distribution, named after the French mathematician Siméon Denis Poisson.
Consider the binomial distribution f(x; n, p) with the following holding:
(a) n → ∞, and
(b) p → 0, but in such a way that
(c) np remains constant; hence let np = λ for λ > 0.
Thus, the probability of success is very small, but the number of trials is very large: for example, when the probability of a disease is small but the number of patients is large. We are interested in the distribution in this case:
f(x; n, p) = [n(n − 1)(n − 2)⋯(n − x + 1)/x!] p^x (1 − p)^(n−x)
           = [n(n − 1)(n − 2)⋯(n − x + 1)/n^x] (λ^x/x!) (1 − λ/n)^(n−x)
           = 1(1 − 1/n)(1 − 2/n)⋯(1 − (x − 1)/n) (λ^x/x!) (1 − λ/n)^n (1 − λ/n)^(−x).

If we let n → ∞,
(a) lim_{n→∞} (1 − i/n) = 1 for i = 1, 2, ..., x − 1, and (b) lim_{n→∞} (1 − λ/n)^n = e^(−λ),

so the limiting distribution is

f(x; λ) = { (λ^x/x!) e^(−λ),  x = 0, 1, 2, ...; λ > 0
          { 0,                elsewhere

Checking that the probabilities sum to one:

Σ_{x=0}^∞ f(x) = e^(−λ) Σ_{x=0}^∞ λ^x/x! = e^(−λ) e^λ = 1   (Σ_{x=0}^∞ λ^x/x! is the Maclaurin expansion of e^λ).
If X is a random variable and X ~ Poi(λ), then its moment generating function is:
m(t) = E(e^(tX)) = Σ_{x=0}^∞ e^(tx) f(x) = Σ_{x=0}^∞ e^(tx) λ^x e^(−λ)/x! = e^(−λ) Σ_{x=0}^∞ (λe^t)^x/x!.

Let f(z) = e^z with z = λe^t. Then the Maclaurin expansion of f is

e^z = f(0) + f′(0)z/1! + f″(0)z²/2! + f‴(0)z³/3! + ⋯ = Σ_{x=0}^∞ z^x/x! = Σ_{x=0}^∞ (λe^t)^x/x!,

i.e. the sum above is the Maclaurin expansion of e^(λe^t). Thus

m(t) = e^(−λ) e^(λe^t) = e^(λ(e^t − 1)).
dm(t)/dt = λe^t e^(λ(e^t − 1)), so
dm(t)/dt |_{t=0} = λ = E(X).

d²m(t)/dt² = λe^t e^(λ(e^t − 1)) λe^t + λe^t e^(λ(e^t − 1)), so
d²m(t)/dt² |_{t=0} = λ² + λ = E(X²).

Var(X) = λ² + λ − λ² = λ.
Applications:
- analysis of accidents;
- analysis of waiting at service-giving centres;
- analysis of telephone calls per hour;
- defective parts in outgoing shipments.
Example: suppose the probability that an item produced by a certain process is defective is 0.02. What is the probability that a shipment of 100 items contains at most 3 defective items? Using the binomial distribution,

P(X ≤ 3) = Σ_{x=0}^{3} [100!/(x!(100 − x)!)] (0.02)^x (0.98)^(100−x)
         = (0.98)¹⁰⁰ + 100(0.02)(0.98)⁹⁹ + 4950(0.02)²(0.98)⁹⁸ + 161700(0.02)³(0.98)⁹⁷
         = 0.1326 + 0.2707 + 0.2734 + 0.1823
         = 0.8590.
We could also use the Poisson distribution to approximate this result as follows:
n = 100, p = 0.02, implying np = λ = 2; thus f(x) = 2^x e^(−2)/x!, and

P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
         = e^(−2)(2⁰/0! + 2¹/1! + 2²/2! + 2³/3!)
         = 0.1353 × 6.3333 ≈ 0.8571.
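The two tail probabilities side by side (a sketch, assuming SciPy):

```python
from scipy.stats import binom, poisson

print(binom.cdf(3, n=100, p=0.02))   # ~0.8590 (exact binomial)
print(poisson.cdf(3, mu=2))          # ~0.8571 (Poisson approximation)
```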
3.2. Some Special Probability Densities: The Case of Continuous Random Variables
3.2.1 The Continuous Uniform Distribution
It is the simplest of the special probability densities:

f(x) = { 1/(β − α),  if α < x < β
       { 0,          otherwise

E(X) = ∫ x f(x) dx = ∫_α^β x/(β − α) dx = x²/(2(β − α)) |_α^β = (β² − α²)/(2(β − α))
     = (β − α)(β + α)/(2(β − α)) = (α + β)/2.

Similarly,
E(X²) = ∫ x² f(x) dx = ∫_α^β x²/(β − α) dx = x³/(3(β − α)) |_α^β = (β³ − α³)/(3(β − α))
      = (β − α)(β² + αβ + α²)/(3(β − α)) = (α² + αβ + β²)/3.
Therefore,

Var(X) = E(X²) − [E(X)]² = (α² + αβ + β²)/3 − ((α + β)/2)²
       = (α² + αβ + β²)/3 − (α² + 2αβ + β²)/4
       = (4α² + 4αβ + 4β² − 3α² − 6αβ − 3β²)/12
       = (α² − 2αβ + β²)/12 = (β − α)²/12.
m(t) = E(e^(tX)) = ∫_α^β e^(tx)/(β − α) dx = e^(tx)/(t(β − α)) |_α^β = (e^(tβ) − e^(tα))/(t(β − α)).
Exercise: for a uniformly distributed random variable X over the interval [α, β], show, using the moment generating function, that

E(X) = (α + β)/2 and E(X²) = (α² + αβ + β²)/3,

and hence deduce its variance.
Note that: first find the Maclaurin expansions of e^(tβ) and e^(tα) separately.
3.2.2 The Normal Distribution
Definition: a random variable X with parameters μ (where μ is any real number) and σ (> 0) is said to follow the normal distribution if its probability density function is given by:

f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)), −∞ < x < ∞.

The distribution is usually written as X ~ N(μ, σ²), and the values of the two parameters are unknown in practice. Though tedious, it is possible to show that for the normal distribution:
(1) f(x) > 0 for all x ∈ R, and ∫_{−∞}^{∞} f(x) dx = 1;
(2) it is bell-shaped;
(3) the mean (average) lies at the centre of the distribution, and the distribution is symmetrical around the mean;
(4) the two tails of the distribution extend indefinitely and never touch the horizontal axis;
(5) the shape of the distribution is determined by its mean (μ) and standard deviation (σ).
The shape of the normal distribution shows that observations occur mostly in the neighbourhood of the mean, which is equal to the median and the mode of the distribution. Their frequency decreases as they move away from the mean. Approximately 68% of the area under the curve lies in the region [μ − σ, μ + σ], 95% in [μ − 2σ, μ + 2σ] and 99.7% in [μ − 3σ, μ + 3σ]. That is, P(μ − σ < X < μ + σ) ≈ 0.68, P(μ − 2σ < X < μ + 2σ) ≈ 0.95 and P(μ − 3σ < X < μ + 3σ) ≈ 0.997.
[Figure: the normal curve with the 68%, 95% and 99.7% regions marked.]
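These areas follow directly from the standard normal cdf introduced below (a sketch, assuming SciPy):

```python
from scipy.stats import norm

# P(mu - k*sigma < X < mu + k*sigma) for a standardized normal variable
for k in (1, 2, 3):
    print(k, norm.cdf(k) - norm.cdf(-k))   # 0.6827, 0.9545, 0.9973
```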
Moment Generating Function of a Normal Random Variable
Let X ~ N(μ, σ²); then

m(t) = E(e^(tX)) = ∫_{−∞}^{∞} e^(tx) (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)) dx.

But (x − μ)² = x² − 2μx + μ², so the exponent is −(1/(2σ²))(x² − 2μx + μ² − 2tσ²x), and completing the square,

x² − 2μx + μ² − 2tσ²x = (x − (μ + tσ²))² + μ² − (μ + tσ²)².

Hence, writing μ* = μ + tσ²,

m(t) = e^(−(μ² − (μ + tσ²)²)/(2σ²)) ∫_{−∞}^{∞} (1/(σ√(2π))) e^(−(x − μ*)²/(2σ²)) dx,

where the remaining integral equals 1, as the integrand is the N(μ + tσ², σ²) density (the sum of probability is one). Therefore

m(t) = e^(−(μ² − (μ + tσ²)²)/(2σ²)) = e^((2μtσ² + t²σ⁴)/(2σ²)),

m(t) = e^(μt + σ²t²/2).
So dm(t)/dt = (μ + tσ²) e^(μt + σ²t²/2), and
E(X) = dm(t)/dt |_{t=0} = μ.

d²m(t)/dt² = σ² e^(μt + σ²t²/2) + (μ + tσ²)(μ + tσ²) e^(μt + σ²t²/2), so
d²m(t)/dt² |_{t=0} = σ² + μ² = E(X²).

Var(X) = E(X²) − [E(X)]² = σ².
The Standard Normal Distribution
If X ~ N(μ, σ²), the standardized variable Z = (X − μ)/σ has mean 0 and variance 1. Its cdf is given as

Φ(z) = F(z) = ∫_{−∞}^{z} (1/√(2π)) e^(−t²/2) dt;

the integral does not have a closed-form solution but requires numerical integration. The values of the standard normal cdf are tabulated in most statistics books.
Proof that Z ~ N(0, 1):
Let Φ(z) be the distribution function of Z, and F(x) be the distribution function of X. Then

Φ(z) = P(Z ≤ z) = P((X − μ)/σ ≤ z) = P(X ≤ μ + zσ) = ∫_{−∞}^{μ+zσ} (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)) dx.

Differentiating with respect to z (by the chain rule, with x = μ + zσ),

φ(z) = dΦ(z)/dz = (1/√(2π)) e^(−z²/2).

Hence, Z ~ N(0, 1).
Φ(z) = ∫_{−∞}^{z} (1/√(2π)) e^(−u²/2) du is the cumulative distribution function of a standard normal random variable, and is tabulated for the positive values of z in any standard statistics textbook.
We use the fact that P(Z ≤ −z) = P(Z ≥ z) = 1 − P(Z ≤ z) to calculate probabilities for negative values of z.
Example: let Y be the marks obtained by students in an examination, and suppose the following probabilities are given:
P(Y ≥ 60) = 0.2 and P(Y < 40) = 0.3.
Find the mean and the standard deviation of the marks. Now, assuming Y ~ N(μ, σ²),

P(Y ≥ 60) = P((Y − μ)/σ ≥ (60 − μ)/σ) = P(Z ≥ (60 − μ)/σ) = 0.2, where Z ~ N(0, 1).

From the z (standard normal) table we obtain z = 0.84 as the value associated with a right-tail probability of 0.2: we first find 0.5 − 0.2 = 0.3, then look around 0.3 in the body of the table; the values 0.2995 and 0.3023 straddle it and 0.2995 is closer to 0.3. From 0.2995, reading first horizontally to the left and then vertically, we get the value 0.84.
Similarly, we have

P(Y < 40) = P(Z < (40 − μ)/σ) = 0.3.

From the z table we obtain z = −0.52 as the value associated with a left-tail probability of 0.3: 0.5 − 0.3 = 0.2, and the value associated with 0.2 is 0.52; it is negative because it lies to the left of zero.
Therefore,
(60 − μ)/σ = 0.84 → μ + 0.84σ = 60, and
(40 − μ)/σ = −0.52 → μ − 0.52σ = 40.
From the first equation μ = 60 − 0.84σ, and substituting into the second,
60 − 0.84σ − 0.52σ = 40
20 = 1.36σ
σ = 14.71
→ μ = 60 − 0.84 × 14.71 ≈ 47.64.
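The same system can be solved with exact normal quantiles instead of table values (a sketch, assuming SciPy); the small differences from the answer above come from table rounding:

```python
from scipy.stats import norm

z1 = norm.ppf(0.8)    # P(Z <= z1) = 0.8  ->  z1 ~ 0.8416
z2 = norm.ppf(0.3)    # P(Z <= z2) = 0.3  ->  z2 ~ -0.5244
sigma = (60 - 40) / (z1 - z2)
mu = 60 - z1 * sigma
print(mu, sigma)      # ~47.68, ~14.64
```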
Note that the standard normal distribution:
1. is symmetric about z = 0, i.e., f(−z) = f(z);
2. attains its maximum value at z = 0;
3. has maximum value 1/√(2π) ≈ 0.3989;
4. has mean = mode = median.
3.2.3 The Normal Approximation to the Binomial Distribution
Let X ~ B(n, p); then E(X) = μ = np and Var(X) = σ² = npq. Let

Y = (X − np)/√(npq).

In the limit, i.e., as n gets larger, it can be shown that Y ~ N(0, 1) approximately. Under such circumstances, binomial probabilities can be approximated by normal ones.
Example: let X ~ B(16, 1/2); find P(X = 6). Exactly,

P(X = 6) = C(16, 6)(1/2)¹⁶ = [16!/(6! × 10!)](1/2)¹⁶ = 8008/65536 = 0.1222.

P(X = 6) is approximated by the area of the normal distribution between 5.5 and 6.5 (the continuity correction). Thus, np = μ = 8 and σ² = np(1 − p) = 4 → σ = 2, and

P(X = 6) ≈ P(5.5 ≤ X ≤ 6.5) = P((5.5 − 8)/2 ≤ Z ≤ (6.5 − 8)/2), where Z = (X − μ)/σ ~ N(0, 1)
         = P(−1.25 ≤ Z ≤ −0.75) = Φ(−0.75) − Φ(−1.25).

Note Φ(−1.25) = 1 − Φ(1.25) = 1 − 0.8944 = 0.1056 and Φ(−0.75) = 1 − Φ(0.75) = 1 − 0.7734 = 0.2266, so

P(X = 6) ≈ 0.2266 − 0.1056 = 0.121.
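Exact and approximate values side by side (a sketch, assuming SciPy):

```python
from scipy.stats import binom, norm

exact = binom.pmf(6, n=16, p=0.5)                               # ~0.1222
approx = norm.cdf(6.5, loc=8, scale=2) - norm.cdf(5.5, loc=8, scale=2)
print(exact, approx)                                            # ~0.1222 ~0.1210
```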
Exercise: let X be a random variable with probability density function given by:

f(x) = { cxe^(−2x),  0 < x < ∞
       { 0,          otherwise

a) Find the value of c that makes f(x) a proper probability density function.
b) Give the mean and variance of the random variable X.
3.2.4 The Gamma Distribution
A random variable X has a gamma distribution with parameters α > 0 and β > 0 iff its density is given by:

f(x) = { x^(α−1) e^(−x/β) / (β^α Γ(α)),  0 < x < ∞
       { 0,                              elsewhere

The moment generating function for a gamma random variable is:
m(t) = E(e^(tX)) = ∫₀^∞ e^(tx) x^(α−1) e^(−x/β) / (β^α Γ(α)) dx
     = (1/(β^α Γ(α))) ∫₀^∞ x^(α−1) e^(−x(1/β − t)) dx
     = (1/(β^α Γ(α))) ∫₀^∞ x^(α−1) e^(−x(1 − βt)/β) dx
     = (1/(β^α Γ(α))) Γ(α) (β/(1 − βt))^α
     = (1 − βt)^(−α).

So E(X) = m′(0) = αβ,
E(X²) = m″(0) = α(α + 1)β² = α²β² + αβ², and
Var(X) = E(X²) − [E(X)]² = αβ².
i) When α = 1, the gamma distribution reduces to the exponential distribution with parameter β, with density f(x) = (1/β)e^(−x/β) for x > 0. Its mgf:

m(t) = E(e^(tX)) = ∫₀^∞ e^(tx) (1/β) e^(−x/β) dx = (1/β) ∫₀^∞ e^(−x(1/β − t)) dx
     = (1/β) [−1/(1/β − t)] e^(−x(1/β − t)) |₀^∞ = (1/β) (1/(1/β − t)) = 1/(1 − βt) = (1 − βt)^(−1).

dm(t)/dt = −1(1 − βt)^(−2)(−β) = β(1 − βt)^(−2), so
dm(t)/dt |_{t=0} = β = E(X).

d²m(t)/dt² = 2(1 − βt)^(−3)(−β)(−β) = 2β²(1 − βt)^(−3), so
d²m(t)/dt² |_{t=0} = 2β² = E(X²).

Var(X) = 2β² − β² = β².
ii) When α = ν/2 and β = 2, the gamma distribution simplifies to the chi-square distribution with ν degrees of freedom, given by:

f(x) = { x^(ν/2−1) e^(−x/2) / (2^(ν/2) Γ(ν/2)),  0 < x < ∞
       { 0,                                      elsewhere

The moment generating function for the chi-square distribution is

m(t) = (1 − 2t)^(−ν/2), with E(X) = ν and Var(X) = 2ν.
In terms of the rate parameter λ = 1/β, the exponential density is:

f(x) = { λe^(−λx),  x ≥ 0, λ > 0
       { 0,         elsewhere

Exercise: for a random variable X possessing an exponential distribution, show that E(X) = 1/λ and Var(X) = 1/λ².
The cumulative distribution function of an exponential distribution is F(x) = 1 − e^(−λx).
Proof:
Given f(x) = λe^(−λx) for x ≥ 0 (and f(x) = 0 when x < 0),

F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{0} f(t) dt + ∫₀^{x} f(t) dt = 0 + ∫₀^{x} λe^(−λt) dt
     = −e^(−λt) |₀^{x} = −e^(−λx) − (−1) = 1 − e^(−λx).
Note that the exponential distribution is used to find the waiting time between two events, say the time elapsing between two telephone calls. Recall that the Poisson distribution defines the probability for the number of times that an event takes place per unit time: the probability distribution of a random variable X used to forecast such occurrences is given by

f(x) = λ^x e^(−λ)/x!, x = 0, 1, 2, ...

Recall that λ = E(X) is the expected number of occurrences of the event per unit of time.
Let Y = the number of occurrences in t units of time (t > 0); then E(Y) = λt, i.e., X ~ Poi(λ) and Y ~ Poi(λt).
Let Z = the time elapsing before the 1st occurrence of the event (z > 0).
Then F(z) = P(Z ≤ z) = 1 − P(Z > z), and P(Z > z) = P(no occurrence of the event in the time interval z) = e^(−λz). So

F_Z(z) = 1 − e^(−λz),
f_Z(z) = dF_Z(z)/dz = λe^(−λz).

Hence, E(Z) = ∫₀^∞ z f_Z(z) dz = ∫₀^∞ zλe^(−λz) dz = 1/λ.
Example: suppose that the life of a light bulb has an exponential distribution with λ = 1/400 = 0.0025. What is the probability that 4 out of 5 bulbs chosen at random have lives in excess of 500 hours?
Let X = the life of a bulb; then p = P(X ≥ 500) is the probability of observing that a bulb lasts for at least 500 hours. Thus, with

f(x) = { λe^(−λx),  x ≥ 0
       { 0,         otherwise

p = P(X ≥ 500) = ∫_{500}^∞ 0.0025 e^(−0.0025x) dx = −e^(−0.0025x) |_{500}^∞ = 0 − (−e^(−1.25)) = e^(−1.25) ≈ 0.2865,

and the probability that 4 out of 5 bulbs have X > 500 is C(n, y) p^y q^(n−y) with n = 5, y = 4, p = 0.2865; thus the required probability is 5(0.2865)⁴(0.7135) ≈ 0.024.
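The whole example end to end (a sketch, assuming SciPy):

```python
import math
from scipy.stats import binom

p = math.exp(-0.0025 * 500)     # P(X >= 500) = e^{-1.25} ~0.2865
print(binom.pmf(4, n=5, p=p))   # P(4 of 5 bulbs exceed 500 hours) ~0.0240
```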