Lecture 2
Yi = β0 + β1 Xi + εi

E{Yi} = E{β0 + β1 Xi + εi}
      = β0 + β1 Xi + E{εi}
      = β0 + β1 Xi
Expectation Review
- Definition

  E{X} = ∫ X P(X) dX,  X ∈ R

- Linearity property

  E{aX} = aE{X}
  E{aX + bY} = aE{X} + bE{Y}

- Example: P(X) = 2X,  0 ≤ X ≤ 1
Expectation
E{X} = ∫_0^1 X P(X) dX
     = ∫_0^1 2X² dX
     = 2X³/3 |_0^1
     = 2/3
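A quick numerical sanity check of this integral, as a minimal Python sketch (the density P(X) = 2X is from the slide; using scipy's quad for the integration is my own choice):

```python
from scipy.integrate import quad

# Density P(X) = 2X on [0, 1]
def p(x):
    return 2 * x

# E{X} = integral over [0, 1] of X * P(X) dX
e_x, _ = quad(lambda x: x * p(x), 0, 1)
print(e_x)  # 0.666..., matching the exact value 2/3
```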
Expectation of a Product of Random Variables
If X, Y are random variables with joint distribution P(X, Y), then the
expectation of the product is given by

E{XY} = ∫∫_{X,Y} X Y P(X, Y) dX dY.
Expectation of a product of random variables
What if X and Y are independent? If X and Y are independent with
density functions f and g respectively, then

E{XY} = ∫∫_{X,Y} X Y f(X) g(Y) dX dY
      = ∫_X ∫_Y X Y f(X) g(Y) dY dX
      = ∫_X X f(X) [ ∫_Y Y g(Y) dY ] dX
      = ∫_X X f(X) E{Y} dX
      = E{X} E{Y}
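This identity is easy to check by simulation. A minimal sketch, where the two independent distributions (uniform and exponential) are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Independent draws: X uniform on [0, 1], Y exponential with mean 2
x = rng.uniform(0, 1, n)
y = rng.exponential(2.0, n)

print(np.mean(x * y))            # Monte Carlo estimate of E{XY}
print(np.mean(x) * np.mean(y))   # E{X} E{Y} -- should be very close
```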
Regression Function
- The response Yi comes from a probability distribution with mean

  E{Yi} = β0 + β1 Xi

- The regression function is

  E{Y} = β0 + β1 X

- The variance of the response is constant:

  σ²{Yi} = σ²{β0 + β1 Xi + εi}
         = σ²{εi}
         = σ²
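A small simulation sketch of this slide. The parameter values β0, β1, σ and the fixed Xi below are illustrative choices, not values from the lecture:

```python
import numpy as np

rng = np.random.default_rng(1)

beta0, beta1, sigma = 2.0, 0.5, 1.5   # illustrative parameters
xi = 3.0                              # a fixed value of X_i
n = 1_000_000

# Y_i = beta0 + beta1 * X_i + eps_i, with E{eps_i} = 0 and variance sigma^2
eps = rng.normal(0.0, sigma, n)
yi = beta0 + beta1 * xi + eps

print(np.mean(yi), beta0 + beta1 * xi)  # sample mean vs beta0 + beta1 * X_i
print(np.var(yi), sigma**2)             # sample variance vs sigma^2
```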
Variance (2nd central moment) Review
- Continuous distribution

  σ²{X} = E{(X − E{X})²} = ∫ (X − E{X})² P(X) dX,  X ∈ R

- Discrete distribution

  σ²{X} = E{(X − E{X})²} = Σ_i (Xi − E{X})² P(Xi),  X ∈ Z
Alternative Form for Variance
σ²{X} = E{(X − E{X})²}
      = E{X² − 2X E{X} + E{X}²}
      = E{X²} − 2E{X}E{X} + E{X}²
      = E{X²} − 2E{X}² + E{X}²
      = E{X²} − E{X}².
Example Variance Derivation
P(X) = 2X,  0 ≤ X ≤ 1

σ²{X} = E{(X − E{X})²} = E{X²} − E{X}²
      = ∫_0^1 2X · X² dX − (2/3)²
      = 2X⁴/4 |_0^1 − 4/9
      = 1/2 − 4/9
      = 1/18
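A Monte Carlo check of this result, using both the definitional form and the alternative form of the variance. The inverse-CDF sampling step (X = √U, since the CDF of P(X) = 2X is F(x) = x²) is my own device for drawing from this density:

```python
import numpy as np

rng = np.random.default_rng(2)

# Sample from P(X) = 2X on [0, 1]: the CDF is F(x) = x^2, so X = sqrt(U)
u = rng.uniform(0, 1, 1_000_000)
x = np.sqrt(u)

# Definitional form E{(X - E{X})^2} and alternative form E{X^2} - E{X}^2
print(np.mean((x - np.mean(x))**2))
print(np.mean(x**2) - np.mean(x)**2)
# Both should be close to 1/18 ~ 0.0556
```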
Variance Properties
σ²{aX} = a² σ²{X}
σ²{aX + bY} = a² σ²{X} + b² σ²{Y}   if X ⊥ Y
σ²{a + cX} = c² σ²{X}   if a, c are both constant

More generally,

σ²{Σ_i ai Xi} = Σ_i Σ_j ai aj Cov(Xi, Xj)
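A simulation sketch of the general formula; the covariance matrix and coefficient vector below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# Correlated (X1, X2, X3) drawn from an illustrative covariance matrix
cov = np.array([[2.0, 0.5, 0.3],
                [0.5, 1.0, 0.2],
                [0.3, 0.2, 1.5]])
a = np.array([1.0, -2.0, 0.5])   # coefficients a_i
xs = rng.multivariate_normal(np.zeros(3), cov, size=1_000_000)

# sigma^2{ sum_i a_i X_i }  versus  sum_i sum_j a_i a_j Cov(X_i, X_j)
print(np.var(xs @ a))
print(a @ cov @ a)
```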
Covariance
- The covariance between two real-valued random variables X and Y,
  with expected values E{X} = µ and E{Y} = ν, is defined as

  Cov(X, Y) = E{(X − µ)(Y − ν)} = E{XY} − µν.

- If X and Y are independent, then E{XY} = E{X}E{Y} = µν, and then

  Cov(X, Y) = µν − µν = 0.
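A quick simulation of the independent case; the two normal distributions (with µ = 1 and ν = 3) are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

# Independent X and Y, so Cov(X, Y) should be ~0
x = rng.normal(1.0, 2.0, n)   # mu = 1
y = rng.normal(3.0, 1.0, n)   # nu = 3

print(np.mean(x * y) - np.mean(x) * np.mean(y))  # E{XY} - mu*nu, near 0
print(np.cov(x, y)[0, 1])                        # same quantity via np.cov
```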
Least Squares Linear Regression
- Seek to minimize

  Q = Σ_{i=1}^n (Yi − (b0 + b1 Xi))²
How?
Guess #1
Guess #2
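One way to see why guessing is unsatisfying is to compute Q directly for candidate lines and compare. The data set and the two guessed (b0, b1) pairs in this sketch are purely illustrative, not the ones plotted in the lecture:

```python
import numpy as np

# Illustrative data (not from the lecture)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

def Q(b0, b1):
    """Sum of squared deviations from the line b0 + b1 * X."""
    return np.sum((y - (b0 + b1 * x))**2)

print(Q(0.0, 1.2))  # Guess #1
print(Q(1.0, 1.0))  # Guess #2
# Least squares chooses the (b0, b1) that makes Q as small as possible.
```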
Function maximization
- Important technique to remember!
- Take the derivative
- Set the result equal to zero and solve
- Test the second derivative at that point
- Question: does this always give you the maximum?
- Going further: multiple variables, convex optimization
Function Maximization
Find
argmax_x  −x² + ln(x)

[Figure: plot of −x² + ln(x) over 0 < x ≤ 5]
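Applying the recipe from the previous slide: f'(x) = −2x + 1/x = 0 gives x = 1/√2, and f''(x) = −2 − 1/x² < 0 confirms it is a maximum. A minimal sketch that checks this numerically (the bounded search interval is an arbitrary choice):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Maximize f(x) = -x^2 + ln(x) by minimizing its negative
f = lambda x: -x**2 + np.log(x)
res = minimize_scalar(lambda x: -f(x), bounds=(1e-6, 5.0), method="bounded")

# Calculus: f'(x) = -2x + 1/x = 0  ->  x = 1/sqrt(2), a maximum since f'' < 0
print(res.x, 1 / np.sqrt(2))   # numeric argmax vs analytic argmax
print(f(res.x))                # maximum value of f
```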
Least Squares Max(min)imization
- Function to minimize w.r.t. b0 and b1; b0 and b1 are called the
  point estimators of β0 and β1 respectively:

  Q = Σ_{i=1}^n (Yi − (b0 + b1 Xi))²

- Setting the partial derivatives of Q to zero gives

  b0 = Ȳ − b1 X̄

  where

  X̄ = Σ Xi / n,   Ȳ = Σ Yi / n
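A sketch of the resulting estimators on made-up data. The slope formula b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² used below is the standard least squares companion to b0 = Ȳ − b1 X̄; it is not shown on this slide, so it is carried in here from the usual derivation:

```python
import numpy as np

# Illustrative data (not from the lecture)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

x_bar, y_bar = x.mean(), y.mean()

# Standard least squares slope, then the intercept b0 = y_bar - b1 * x_bar
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar)**2)
b0 = y_bar - b1 * x_bar

print(b0, b1)
# Cross-check against numpy's least squares polynomial fit
print(np.polyfit(x, y, 1))  # returns [b1, b0]
```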