Sum of Variances
Friday, April 20
We shall assume (unless otherwise mentioned) that the random variables we deal
with have expectations. We shall use the notation
$\mu_X = E[X],$
where X is a random variable.
Proposition. Let X and Y be independent random variables on the same probability
space. Let g and h be functions of one variable such that the random variables g(X)
and h(Y ) are defined. Then
\[
E[g(X)h(Y)] = E[g(X)]\, E[h(Y)],
\]
provided that either X and Y are both discrete or X and Y both have densities.
Proof. We first assume that X and Y are both discrete. Let $p_X$ and $p_Y$ be their probability mass functions. Since they are independent, their joint probability mass function is given by $p(x, y) = p_X(x)\, p_Y(y)$. Thus
\begin{align*}
E[g(X)h(Y)] &= \sum_y \sum_x g(x)h(y)\, p(x, y) \\
&= \sum_y \sum_x g(x)h(y)\, p_X(x)\, p_Y(y) \\
&= \sum_y h(y)\, p_Y(y) \sum_x g(x)\, p_X(x) \\
&= E[g(X)]\, E[h(Y)].
\end{align*}
Next we assume that X and Y both have densities. Let $f_X$ and $f_Y$ be their densities. Since they are independent, their joint density is given by $f(x, y) = f_X(x)\, f_Y(y)$.
Thus
\begin{align*}
E[g(X)h(Y)] &= \iint g(x)h(y)\, f(x, y)\, dx\, dy \\
&= \iint g(x)h(y)\, f_X(x)\, f_Y(y)\, dx\, dy \\
&= \int h(y)\, f_Y(y) \left( \int g(x)\, f_X(x)\, dx \right) dy \\
&= E[g(X)] \int h(y)\, f_Y(y)\, dy \\
&= E[g(X)]\, E[h(Y)].
\end{align*}
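The proposition is easy to check empirically. Here is a minimal Monte Carlo sketch in Python; the particular distributions and the functions g and h below are arbitrary illustrative choices, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

# Independent draws: X ~ Exponential(1), Y ~ Uniform(0, 1).
x = rng.exponential(1.0, n)
y = rng.uniform(0.0, 1.0, n)

def g(t):
    return np.sqrt(t)   # g(X) = sqrt(X)

def h(t):
    return t ** 2       # h(Y) = Y^2

lhs = np.mean(g(x) * h(y))            # Monte Carlo estimate of E[g(X)h(Y)]
rhs = np.mean(g(x)) * np.mean(h(y))   # estimate of E[g(X)] E[h(Y)]
print(lhs, rhs)                       # the two agree up to Monte Carlo error
```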
Covariance
Definition. Let X and Y be random variables defined on the same probability space.
Assume that both X and Y have expectations and variances. The covariance of X
and Y is defined by
\[
\mathrm{Cov}(X, Y) = E\left[ (X - \mu_X)(Y - \mu_Y) \right].
\]
Proposition (Some Properties of Covariance). Let X, Y, $X_j$, $j = 1, \dots, m$, and $Y_k$, $k = 1, \dots, n$, be random variables with expectations and variances, and assume that they are all defined on the same probability space.
1. $\mathrm{Cov}(X, Y) = E[XY] - E[X]\,E[Y]$.
2. If X and Y are independent, then $\mathrm{Cov}(X, Y) = 0$.
3. $\mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X)$.
4. $\mathrm{Cov}(aX, Y) = a\,\mathrm{Cov}(X, Y)$, for $a \in \mathbb{R}$.
5. $\mathrm{Cov}(X_1 + X_2, Y) = \mathrm{Cov}(X_1, Y) + \mathrm{Cov}(X_2, Y)$.
6. $\displaystyle \mathrm{Cov}\Bigl( \sum_{j=1}^{m} X_j,\ \sum_{k=1}^{n} Y_k \Bigr) = \sum_{j=1}^{m} \sum_{k=1}^{n} \mathrm{Cov}(X_j, Y_k)$.
Proof of 1.
\begin{align*}
\mathrm{Cov}(X, Y) &= E[(X - \mu_X)(Y - \mu_Y)] \\
&= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
&= E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y \\
&= E[XY] - E[X]\,E[Y].
\end{align*}
Proof of 5.
\begin{align*}
\mathrm{Cov}(X_1 + X_2, Y) &= E\bigl[ \bigl( (X_1 + X_2) - (\mu_{X_1} + \mu_{X_2}) \bigr)(Y - \mu_Y) \bigr] \\
&= E\bigl[ (X_1 - \mu_{X_1})(Y - \mu_Y) + (X_2 - \mu_{X_2})(Y - \mu_Y) \bigr] \\
&= E[(X_1 - \mu_{X_1})(Y - \mu_Y)] + E[(X_2 - \mu_{X_2})(Y - \mu_Y)] \\
&= \mathrm{Cov}(X_1, Y) + \mathrm{Cov}(X_2, Y).
\end{align*}
Proof of 6. Applying 5 repeatedly (and using 3 to swap the arguments), we get
\[
\mathrm{Cov}\Bigl( X, \sum_{k=1}^{n} Y_k \Bigr) = \mathrm{Cov}\Bigl( \sum_{k=1}^{n} Y_k, X \Bigr) = \sum_{k=1}^{n} \mathrm{Cov}(Y_k, X) = \sum_{k=1}^{n} \mathrm{Cov}(X, Y_k).
\]
Thus,
\[
\mathrm{Cov}\Bigl( \sum_{j=1}^{m} X_j,\ \sum_{k=1}^{n} Y_k \Bigr) = \sum_{j=1}^{m} \mathrm{Cov}\Bigl( X_j,\ \sum_{k=1}^{n} Y_k \Bigr) = \sum_{j=1}^{m} \sum_{k=1}^{n} \mathrm{Cov}(X_j, Y_k).
\]
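Property 6 (with property 1 supplying the covariance formula) can also be checked numerically. The sketch below uses arbitrary, deliberately dependent variables for illustration; none of the specific distributions come from the notes.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**6

def cov(u, v):
    # Sample analogue of property 1: Cov(U, V) = E[UV] - E[U] E[V].
    return np.mean(u * v) - np.mean(u) * np.mean(v)

x1 = rng.normal(0.0, 1.0, n)
x2 = rng.normal(2.0, 3.0, n)
y1 = x1 + rng.normal(0.0, 1.0, n)   # deliberately dependent on x1
y2 = rng.exponential(1.0, n)

# Property 6 with m = n = 2: the covariance of the sums equals the
# sum of the four pairwise covariances.
lhs = cov(x1 + x2, y1 + y2)
rhs = cov(x1, y1) + cov(x1, y2) + cov(x2, y1) + cov(x2, y2)
print(lhs, rhs)   # identical up to floating-point rounding
```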
Variance of a Sum
Proposition. Let $X_1, \dots, X_n$ be random variables with expectations and variances, all defined on the same probability space. Then
\[
\mathrm{Var}\Bigl( \sum_{k=1}^{n} X_k \Bigr) = \sum_{k=1}^{n} \mathrm{Var}(X_k) + 2 \sum_{j<k} \mathrm{Cov}(X_j, X_k).
\]
In particular, if $X_1, \dots, X_n$ are independent, the covariance terms vanish and the variance of the sum is the sum of the variances.
Proof.
\begin{align*}
\mathrm{Var}\Bigl( \sum_{k=1}^{n} X_k \Bigr) &= \mathrm{Cov}\Bigl( \sum_{j=1}^{n} X_j,\ \sum_{k=1}^{n} X_k \Bigr) \\
&= \sum_{j=1}^{n} \sum_{k=1}^{n} \mathrm{Cov}(X_j, X_k) \\
&= \sum_{j=k} \mathrm{Cov}(X_j, X_k) + \sum_{j \ne k} \mathrm{Cov}(X_j, X_k).
\end{align*}
The first sum is $\sum_{k=1}^{n} \mathrm{Cov}(X_k, X_k) = \sum_{k=1}^{n} \mathrm{Var}(X_k)$. In the second sum, each pair $(j, k)$ of indices with $j \ne k$ occurs twice: once as $(j, k)$ and once as $(k, j)$. Since the terms $\mathrm{Cov}(X_j, X_k)$ and $\mathrm{Cov}(X_k, X_j)$ are equal, we have
\[
\mathrm{Var}\Bigl( \sum_{k=1}^{n} X_k \Bigr) = \sum_{k=1}^{n} \mathrm{Var}(X_k) + 2 \sum_{j<k} \mathrm{Cov}(X_j, X_k).
\]
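Here is a minimal simulation sketch of the proposition. The three variables below are arbitrary constructions sharing common sources, chosen only so that the covariance terms are nonzero.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 10**6

# Three dependent random variables built from common normal sources.
z = rng.normal(0.0, 1.0, (m, 3))
x1 = z[:, 0]
x2 = 0.5 * z[:, 0] + z[:, 1]    # shares a source with x1
x3 = z[:, 2] - z[:, 1]          # shares a source with x2
xs = [x1, x2, x3]

lhs = np.var(x1 + x2 + x3)
rhs = sum(np.var(x) for x in xs) + 2 * sum(
    np.cov(xs[j], xs[k], bias=True)[0, 1]   # population-style covariance
    for j in range(3)
    for k in range(j + 1, 3)
)
print(lhs, rhs)   # identical up to floating-point rounding
```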
Example. Suppose that we draw a random sample of n individuals from a large population, and let $X_k$ be the height of the k-th individual sampled. We treat $X_1, \dots, X_n$ as independent random variables, each with mean $\mu$ (the population mean) and variance $\sigma^2$ (the population variance). Let
\[
\bar{X} = \frac{1}{n} \sum_{k=1}^{n} X_k.
\]
$\bar{X}$ is called the sample mean; it is the arithmetic mean of the heights of the individuals in the sample. Now,
\[
E[X_k] = \mu
\]
for each k, $k = 1, \dots, n$. Thus,
\begin{align*}
E[\bar{X}] &= E\Bigl[ \frac{1}{n} \sum_{k=1}^{n} X_k \Bigr] \\
&= \frac{1}{n} \sum_{k=1}^{n} E[X_k] \\
&= \frac{1}{n} \cdot n\mu \\
&= \mu.
\end{align*}
Statisticians would summarize this calculation by saying that $\bar{X}$ is an unbiased estimator of the population mean $\mu$. This is one of the (many) reasons that we would use the observed value of $\bar{X}$ as our estimate of the population mean $\mu$.
We also have
\begin{align*}
\mathrm{Var}(\bar{X}) &= \mathrm{Var}\Bigl( \frac{1}{n} \sum_{k=1}^{n} X_k \Bigr) \\
&= \frac{1}{n^2} \mathrm{Var}\Bigl( \sum_{k=1}^{n} X_k \Bigr) \\
&= \frac{1}{n^2} \sum_{k=1}^{n} \mathrm{Var}(X_k) \qquad \text{(by independence)} \\
&= \frac{1}{n^2} \cdot n\sigma^2 \\
&= \frac{\sigma^2}{n},
\end{align*}
where $\sigma^2$ is the population variance. Note that the variance of the sample mean decreases as the sample size n increases.
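The following sketch illustrates both facts about $\bar{X}$ by simulation; the population values (heights with mean 170 cm and SD 10 cm) are hypothetical numbers chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma = 170.0, 10.0   # hypothetical population mean and SD of heights (cm)

for n in (4, 16, 64):
    # 100,000 independent samples of size n; each row yields one X-bar.
    samples = rng.normal(mu, sigma, (100_000, n))
    xbar = samples.mean(axis=1)
    print(n, xbar.mean(), xbar.var(), sigma**2 / n)
    # xbar.mean() is close to mu (X-bar is unbiased), and
    # xbar.var() is close to sigma^2 / n, shrinking as n grows.
```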
To estimate the population variance, $\sigma^2$, we would use the sample variance:
\[
S^2 = \frac{1}{n-1} \sum_{k=1}^{n} \bigl( X_k - \bar{X} \bigr)^2.
\]
We show that
\[
E[S^2] = \sigma^2.
\]
Now
\begin{align*}
(n-1)S^2 &= \sum_{k=1}^{n} \bigl( X_k - \bar{X} \bigr)^2 \\
&= \sum_{k=1}^{n} \bigl( (X_k - \mu) - (\bar{X} - \mu) \bigr)^2 \\
&= \sum_{k=1}^{n} \bigl[ (X_k - \mu)^2 - 2(X_k - \mu)(\bar{X} - \mu) + (\bar{X} - \mu)^2 \bigr] \\
&= \sum_{k=1}^{n} (X_k - \mu)^2 - 2(\bar{X} - \mu) \sum_{k=1}^{n} (X_k - \mu) + n(\bar{X} - \mu)^2.
\end{align*}
Since
\[
\sum_{k=1}^{n} (X_k - \mu) = \sum_{k=1}^{n} X_k - n\mu = n\bar{X} - n\mu = n(\bar{X} - \mu),
\]
we have
\begin{align*}
(n-1)S^2 &= \sum_{k=1}^{n} (X_k - \mu)^2 - 2n(\bar{X} - \mu)^2 + n(\bar{X} - \mu)^2 \\
&= \sum_{k=1}^{n} (X_k - \mu)^2 - n(\bar{X} - \mu)^2.
\end{align*}
Therefore,
\begin{align*}
(n-1)E[S^2] &= \sum_{k=1}^{n} E\bigl[ (X_k - \mu)^2 \bigr] - n E\bigl[ (\bar{X} - \mu)^2 \bigr] \\
&= n\sigma^2 - n\,\mathrm{Var}(\bar{X}) \qquad \text{(since } E[\bar{X}] = \mu\text{)} \\
&= n\sigma^2 - n \cdot \frac{\sigma^2}{n} \\
&= (n-1)\sigma^2,
\end{align*}
and dividing by $n - 1$ gives $E[S^2] = \sigma^2$.
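The unbiasedness of $S^2$ shows up clearly in simulation. Here is a minimal sketch comparing the $n - 1$ divisor with the naive $n$ divisor; the distribution and parameters are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, n = 0.0, 2.0, 5          # small n makes the bias easy to see

samples = rng.normal(mu, sigma, (200_000, n))
s2 = samples.var(axis=1, ddof=1)    # divides by n - 1: the sample variance S^2
v_n = samples.var(axis=1)           # divides by n: the "obvious" alternative

print(s2.mean(), v_n.mean(), sigma**2)
# s2.mean() is close to sigma^2 = 4, while v_n.mean() is close to
# ((n - 1)/n) * sigma^2 = 3.2, so dividing by n - 1 removes the bias.
```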
Correlation
Let X and Y be random variables defined on the same probability space, and assume
that the expectations and variances of X and Y exist.
Definition. The correlation of X and Y is denoted $\rho(X, Y)$ and is defined by
\[
\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}},
\]
provided that $\mathrm{Var}(X) \ne 0 \ne \mathrm{Var}(Y)$.
We remarked earlier that we can think of the covariance as an inner product. This
analogy works most precisely if we think of
\[
\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]
\]
as the inner product of the random variables $X - \mu_X$ and $Y - \mu_Y$, both of which have mean 0. Then the variance is analogous to the square of
the length of the vector, and the correlation is analogous to the cosine of the angle
between the two vectors.
If we have vectors $\vec{x}$ and $\vec{y}$ with $\vec{x} \ne \vec{0}$, then there is a scalar a and there is a vector $\vec{z}$ such that
\begin{align*}
\vec{y} &= a\vec{x} + \vec{z}, \\
\vec{z} \cdot \vec{x} &= 0.
\end{align*}
Now $|a\vec{x}| = |\cos\theta|\,|\vec{y}|$ and $|\vec{z}| = \sin\theta\,|\vec{y}|$, where $\theta$ is the angle between $\vec{x}$ and $\vec{y}$. Thus $\cos\theta$ is a measure of the strength of the component of $\vec{y}$ in the direction of $\vec{x}$.
Returning to the random variables X and Y, and assuming that $\mathrm{Var}(X) \ne 0$, we have that $\vec{x}$ corresponds to $X - \mu_X$ and $\vec{y}$ corresponds to $Y - \mu_Y$. Also, $\cos\theta$ corresponds to $\rho(X, Y)$. A multiple of $X - \mu_X$ has the form
\[
a(X - \mu_X) = aX - a\mu_X = aX + b,
\]
the form of a linear function of X. Thus $\rho(X, Y)$ measures the extent to which Y is a linear function of X. (Since $\rho(X, Y) = \rho(Y, X)$, this is also the extent to which X is a linear function of Y.)
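This interpretation is easy to see numerically. In the sketch below (arbitrary distributions, chosen only for illustration), $\rho$ is $\pm 1$ exactly when Y is a linear function of X, shrinks toward 0 as noise is added, and can vanish even when Y depends on X, provided the dependence is not linear.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(0.0, 1.0, 10**6)
noise = rng.normal(0.0, 5.0, x.size)

def rho(u, v):
    # np.corrcoef returns the 2x2 correlation matrix of u and v.
    return np.corrcoef(u, v)[0, 1]

print(rho(x, 3 * x + 7))        # exactly linear, positive slope: rho = 1
print(rho(x, -3 * x + 7))       # exactly linear, negative slope: rho = -1
print(rho(x, 3 * x + noise))    # linear plus noise: 0 < rho < 1
print(rho(x, x**2))             # dependent but not linearly related: rho ~ 0
```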
Terminology. If $\rho(X, Y) > 0$, we say that X and Y are positively correlated. If $\rho(X, Y) < 0$, we say that X and Y are negatively correlated. If $\rho(X, Y) = 0$, we say that X and Y are uncorrelated.
Remark. $\rho(X, Y)$ and $\mathrm{Cov}(X, Y)$ have the same sign.
Example 5. Let A be an event with $P(A) > 0$. Let $I_A$ be the indicator of A; that is, $I_A = 1$ if A occurs, and $I_A = 0$ otherwise. Similarly let B be an event (in the same probability space) with $P(B) > 0$, and let $I_B$ be the indicator of B. We shall show that
\[
\mathrm{Cov}(I_A, I_B) = P(AB) - P(A)P(B).
\]
Solution.
\begin{align*}
E[I_A] &= P(A), \\
E[I_B] &= P(B), \\
E[I_A I_B] &= E[I_{AB}] = P(AB),
\end{align*}
since $I_A I_B = 1$ exactly when both A and B occur, that is, $I_A I_B = I_{AB}$.
Thus
\begin{align*}
\mathrm{Cov}(I_A, I_B) &= E[I_A I_B] - E[I_A]\,E[I_B] \\
&= P(AB) - P(A)P(B).
\end{align*}
For example, roll two fair dice, let A be the event that the sum of the dice is a perfect square, and let B be the event that the sum is even. Then $P(A) = 7/36$, $P(B) = 1/2$, and $P(AB) = P(\text{sum} = 4) = 1/12$, so
\[
\mathrm{Cov}(I_A, I_B) = \frac{1}{12} - \frac{7}{36} \cdot \frac{1}{2} = -\frac{1}{72}.
\]
Thus $I_A$ and $I_B$ are negatively correlated. This reflects the fact that if the sum is a perfect square, it is more likely to be 9 than 4, and hence more likely to be odd than even.
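As a sanity check, here is a short Python sketch that computes this covariance exactly by enumerating all 36 equally likely outcomes of the two dice.

```python
from fractions import Fraction
from itertools import product

# The 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

ia = [1 if a + b in (4, 9) else 0 for a, b in outcomes]   # sum is a perfect square
ib = [1 if (a + b) % 2 == 0 else 0 for a, b in outcomes]  # sum is even

def ev(vals):
    # Exact expectation under the uniform distribution on the 36 outcomes.
    return sum(p * v for v in vals)

cov = ev([u * v for u, v in zip(ia, ib)]) - ev(ia) * ev(ib)
print(cov)   # Fraction(-1, 72)
```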