Expectations of Discrete Random Variables: Scott Sheffield

18.440: Lecture 9
Expectations of discrete random variables

Scott Sheffield


Defining expectation

Functions of random variables



Expectation of a discrete random variable

• Recall: a random variable X is a function from the state space
  to the real numbers.
• Can interpret X as a quantity whose value depends on the
  outcome of an experiment.
• Say X is a discrete random variable if (with probability one)
  it takes one of a countable set of values.
• For each a in this countable set, write p(a) := P{X = a}.
  Call p the probability mass function.
• The expectation of X, written E[X], is defined by
  $E[X] = \sum_{x : p(x) > 0} x \, p(x)$.
• Represents weighted average of possible values X can take,
  each value being weighted by its probability.

Simple examples

• Suppose that a random variable X satisfies P{X = 1} = .5,
  P{X = 2} = .25 and P{X = 3} = .25.
• What is E[X]?
• Answer: .5 × 1 + .25 × 2 + .25 × 3 = 1.75.
• Suppose P{X = 1} = p and P{X = 0} = 1 − p. Then what
  is E[X]?
• Answer: p.
• Roll a standard six-sided die. What is the expectation of
  the number that comes up?
• Answer:
  $\frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \frac{1}{6} \cdot 4 + \frac{1}{6} \cdot 5 + \frac{1}{6} \cdot 6 = \frac{21}{6} = 3.5$.
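
The three answers above can be checked with a few lines of Python. This is only a sketch: the helper name `expectation`, the dictionary representation of the probability mass function, and the choice p = 3/10 are assumptions made for illustration, not part of the lecture.

```python
from fractions import Fraction

def expectation(pmf):
    """Return E[X], the sum of x * p(x) over the values x with p(x) > 0."""
    assert sum(pmf.values()) == 1, "probabilities must sum to one"
    return sum(x * p for x, p in pmf.items())

# P{X = 1} = .5, P{X = 2} = .25, P{X = 3} = .25
three_values = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

# P{X = 1} = p, P{X = 0} = 1 - p, with p = 3/10 chosen arbitrarily
p = Fraction(3, 10)
bernoulli = {1: p, 0: 1 - p}

# Standard six-sided die: each face has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(three_values))  # 7/4
print(expectation(bernoulli))     # 3/10
print(expectation(die))           # 7/2
```

Exact fractions are used so that the results match the hand computations (1.75, p, and 3.5) without rounding.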

Expectation when state space is countable

• If the state space S is countable, we can give SUM OVER
  STATE SPACE definition of expectation:
  $E[X] = \sum_{s \in S} P\{s\} X(s)$.
• Compare this to the SUM OVER POSSIBLE X VALUES
  definition we gave earlier:
  $E[X] = \sum_{x : p(x) > 0} x \, p(x)$.
• Example: toss two coins. If X is the number of heads, what is
  E[X]?
• State space is {(H, H), (H, T), (T, H), (T, T)} and summing
  over state space gives
  $E[X] = \frac{1}{4} \cdot 2 + \frac{1}{4} \cdot 1 + \frac{1}{4} \cdot 1 + \frac{1}{4} \cdot 0 = 1$.
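
To see that the two definitions agree on the two-coin example, one can compute both sums explicitly. The sketch below assumes fair coins (each outcome has probability 1/4, as in the example); the names `states`, `prob`, and `num_heads` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# State space for two fair coin tosses; each outcome has probability 1/4.
states = list(product("HT", repeat=2))
prob = {s: Fraction(1, 4) for s in states}

def num_heads(s):
    """The random variable X: number of heads in outcome s."""
    return s.count("H")

# Sum over the state space: sum of P{s} * X(s).
by_state = sum(prob[s] * num_heads(s) for s in states)

# Sum over possible X values: first build p(x) = P{X = x}.
pmf = defaultdict(Fraction)
for s in states:
    pmf[num_heads(s)] += prob[s]
by_value = sum(x * p for x, p in pmf.items())

print(by_state, by_value)  # 1 1
```

The second computation simply groups the outcomes s by the value X(s), which is why the two sums coincide.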

A technical point

• If the state space S is countable, is it possible that the sum
  $E[X] = \sum_{s \in S} P(\{s\}) X(s)$ somehow depends on the order in
  which s ∈ S are enumerated?
• In principle, yes... We only say expectation is defined when
  $\sum_{s \in S} P(\{s\}) |X(s)| < \infty$, in which case it turns out that the
  sum does not depend on the order.
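
A numerical sketch of the order dependence, under a hypothetical setup that is not from the lecture: take states s_1, s_2, ... with P({s_k}) = 6/(π²k²) and X(s_k) = (−1)^(k+1) π²k/6, so that the k-th term P({s_k})X(s_k) equals (−1)^(k+1)/k. The sum of the absolute values is the harmonic series, which diverges, so the expectation is not defined, and the two enumerations below approach different limits (ln 2 and (3/2) ln 2, a standard fact about the alternating harmonic series).

```python
import math

def natural_order(n_terms):
    """Partial sum 1 - 1/2 + 1/3 - 1/4 + ... in the usual order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Same terms, enumerated as two positive terms then one negative term."""
    total = 0.0
    for j in range(1, n_blocks + 1):
        total += 1 / (4 * j - 3) + 1 / (4 * j - 1)  # next two odd denominators
        total -= 1 / (2 * j)                        # next even denominator
    return total

print(natural_order(30000), math.log(2))     # both near 0.693
print(rearranged(10000), 1.5 * math.log(2))  # both near 1.040
```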

Expectation of a function of a random variable

• If X is a random variable and g is a function from the real
  numbers to the real numbers then g(X) is also a random
  variable.
• How can we compute E[g(X)]?
• Answer:
  $E[g(X)] = \sum_{x : p(x) > 0} g(x) \, p(x)$.
• Suppose that constants a, b, µ are given and that E[X] = µ.
• What is E[X + b]?
• How about E[aX]?
• Generally, E[aX + b] = aE[X] + b = aµ + b.
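
The formula for E[g(X)] and the rule E[aX + b] = aE[X] + b can be checked on a small example. The sketch below assumes an illustrative probability mass function (P{X = 1} = 1/2, P{X = 2} = P{X = 3} = 1/4) and arbitrary constants a = 5, b = −2; the helper name `expectation_of` is hypothetical.

```python
from fractions import Fraction

def expectation_of(g, pmf):
    """Return E[g(X)], the sum of g(x) * p(x) over the values x with p(x) > 0."""
    return sum(g(x) * p for x, p in pmf.items())

# Illustrative pmf: P{X = 1} = 1/2, P{X = 2} = 1/4, P{X = 3} = 1/4
pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
a, b = 5, -2  # arbitrary constants

mu = expectation_of(lambda x: x, pmf)               # E[X] = 7/4
shifted = expectation_of(lambda x: x + b, pmf)      # E[X + b]
scaled = expectation_of(lambda x: a * x, pmf)       # E[aX]
affine = expectation_of(lambda x: a * x + b, pmf)   # E[aX + b]

print(shifted == mu + b, scaled == a * mu, affine == a * mu + b)  # True True True
```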

More examples

• Let X be the number that comes up when you roll a standard
  six-sided die. What is $E[X^2]$?
• Let X_j be 1 if the jth coin toss is heads and 0 otherwise.
  What is the expectation of $X = \sum_{j=1}^{n} X_j$?
• Can compute this directly as $\sum_{k=0}^{n} P\{X = k\} \, k$.
• Alternatively, use symmetry. Expected number of heads
  should be same as expected number of tails.
• This implies E[X] = E[n − X]. Applying
  E[aX + b] = aE[X] + b formula (with a = −1 and b = n), we
  obtain E[X] = n − E[X] and conclude that E[X] = n/2.
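
Both questions above can also be answered by direct computation. The sketch below assumes fair, independent coin tosses, so that P{X = k} = C(n, k)/2^n (the binomial formula, which the slide does not write out); the function name `expected_heads` is illustrative.

```python
from fractions import Fraction
from math import comb

# E[X^2] for a standard die, using E[g(X)] with g(x) = x^2.
e_x_squared = sum(Fraction(1, 6) * face ** 2 for face in range(1, 7))
print(e_x_squared)  # 91/6

def expected_heads(n):
    """Direct sum over k = 0..n of P{X = k} * k for n fair coin tosses."""
    return sum(Fraction(comb(n, k), 2 ** n) * k for k in range(n + 1))

# The direct sum agrees with the symmetry argument E[X] = n/2.
print([expected_heads(n) == Fraction(n, 2) for n in (1, 2, 5, 10)])
```

Note that E[X^2] = 91/6 differs from (E[X])^2 = 49/4, so the expectation of g(X) is not in general g applied to the expectation.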

Additivity of expectation

• If X and Y are distinct random variables, then can one say
  that E[X + Y] = E[X] + E[Y]?
• Yes. In fact, for real constants a and b, we have
  E[aX + bY] = aE[X] + bE[Y].
• This is called the linearity of expectation.
• Another way to state this fact: given sample space S and
  probability measure P, the expectation E[·] is a linear
  real-valued function on the space of random variables.
• Can extend to more variables:
  E[X_1 + X_2 + ... + X_n] = E[X_1] + E[X_2] + ... + E[X_n].
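
Linearity does not require X and Y to be independent. As a sketch, take a hypothetical example that is not from the lecture: roll two fair dice, let X be the first die and Y the larger of the two (so Y clearly depends on X), and compare both sides of E[aX + bY] = aE[X] + bE[Y] using the sum over the state space.

```python
from fractions import Fraction
from itertools import product

# State space: ordered pairs of die faces, each with probability 1/36.
states = list(product(range(1, 7), repeat=2))
prob = Fraction(1, 36)

def expect(f):
    """Sum over the state space of P{s} * f(s)."""
    return sum(prob * f(s) for s in states)

def X(s):
    return s[0]      # value of the first die

def Y(s):
    return max(s)    # larger of the two dice; depends on X

a, b = 3, -2  # arbitrary real constants
lhs = expect(lambda s: a * X(s) + b * Y(s))
rhs = a * expect(X) + b * expect(Y)
print(lhs == rhs)  # True
```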

More examples

• Now can we compute the expected number of people who get
  their own hats in the n hat shuffle problem?
• Let X_i be 1 if ith person gets own hat and zero otherwise.
• What is E[X_i], for i ∈ {1, 2, ..., n}?
• Answer: 1/n.
• Can write total number with own hat as
  X = X_1 + X_2 + ... + X_n.
• Linearity of expectation gives
  E[X] = E[X_1] + E[X_2] + ... + E[X_n] = n × 1/n = 1.
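
For small n the hat computation can be confirmed by brute force. The sketch below assumes the hats are returned according to a uniformly random permutation, and enumerates all n! permutations; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def expected_own_hats(n):
    """Average number of fixed points over all permutations of n hats."""
    perms = list(permutations(range(n)))
    total_fixed = sum(
        sum(1 for person, hat in enumerate(perm) if person == hat)
        for perm in perms
    )
    return Fraction(total_fixed, len(perms))

def prob_own_hat(n, i):
    """P{X_i = 1}: fraction of permutations in which person i gets hat i."""
    perms = list(permutations(range(n)))
    return Fraction(sum(1 for perm in perms if perm[i] == i), len(perms))

print([expected_own_hats(n) for n in range(1, 7)])  # every entry equals 1
print(prob_own_hat(4, 2))                           # 1/4
```

The enumeration grows like n!, so it is only practical for small n; the linearity argument gives the answer for every n at once.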

Why should we care about expectation?

• Laws of large numbers: choose lots of independent random
  variables with the same probability distribution as X; their
  average tends to be close to E[X].
• Example: roll N = 10^6 dice, let Y be the sum of the numbers
  that come up. Then Y/N is probably close to 3.5.
• Economic theory of decision making: Under “rationality”
  assumptions, each of us has utility function and tries to
  optimize its expectation.
• Financial contract pricing: under “no arbitrage/interest”
  assumption, price of derivative equals its expected value in
  so-called risk neutral probability.
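
The dice example above is easy to simulate. This is a sketch using Python's standard pseudo-random generator with a fixed seed (an arbitrary choice made here for reproducibility); the function name `average_roll` is illustrative.

```python
import random

def average_roll(n, seed=0):
    """Roll n fair six-sided dice and return Y / n, where Y is their sum."""
    rng = random.Random(seed)
    rolls = rng.choices(range(1, 7), k=n)
    return sum(rolls) / n

N = 10 ** 6
print(average_roll(N))  # typically within a few thousandths of 3.5
```

The standard deviation of a single roll is about 1.71, so the average of 10^6 rolls has standard deviation of roughly 0.0017, which is why Y/N lands so close to 3.5.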

Expected utility when outcome only depends on wealth

• Contract one: I’ll toss 10 coins, and if they all come up heads
  (probability about one in a thousand), I’ll give you 20 billion
  dollars.
• Contract two: I’ll just give you ten million dollars.
• What are expectations of the two contracts? Which would
  you prefer?
• Can you find a function u(x) such that given two random
  wealth variables W_1 and W_2, you prefer W_1 whenever
  E[u(W_1)] > E[u(W_2)]?
• Let’s assume u(0) = 0 and u(1) = 1. Then u(x) = y means
  that you are indifferent between getting 1 dollar no matter
  what and getting x dollars with probability 1/y.
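
The expectations of the two contracts can be computed directly, assuming fair coins so that ten heads has probability 1/1024. The sketch below also evaluates expected utility for one hypothetical utility function, u(x) = √x, chosen only because it satisfies u(0) = 0 and u(1) = 1; the lecture does not specify a utility function.

```python
import math
from fractions import Fraction

p_all_heads = Fraction(1, 2 ** 10)  # ten fair coins all come up heads

# Each contract as a pmf over wealth in dollars.
contract_one = {20_000_000_000: p_all_heads, 0: 1 - p_all_heads}
contract_two = {10_000_000: Fraction(1)}

def expected(u, wealth_pmf):
    """Return E[u(W)], the sum of u(w) * P{W = w}."""
    return sum(u(w) * p for w, p in wealth_pmf.items())

def identity(w):
    return w

print(expected(identity, contract_one))  # 19531250
print(expected(identity, contract_two))  # 10000000

# Hypothetical utility u(x) = sqrt(x): contract two has the larger E[u(W)].
print(expected(math.sqrt, contract_one) < expected(math.sqrt, contract_two))  # True
```

So contract one has nearly twice the expected dollar value, yet under this concave utility the sure ten million dollars is preferred.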

MIT OpenCourseWare

18.440 Probability and Random Variables

Spring 2014

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

