
Summary of Topics for Midterm Exam #2

STA 371G, Fall 2017

Listed below are the major topics covered in class that are likely to appear on Midterm Exam #2:
Mean (expectation), variance, and standard deviation of a discrete random variable:

$E[X] = \sum_{i=1}^{n} x_i P(X = x_i), \quad \mathrm{Var}[X] = \sum_{i=1}^{n} (x_i - E[X])^2 P(X = x_i), \quad \mathrm{sd}[X] = \sqrt{\mathrm{Var}[X]}$
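For concreteness, here is a minimal sketch of these formulas in Python (NumPy and the made-up distribution below are illustrative assumptions, not part of the course notes):

```python
import numpy as np

# A hypothetical discrete distribution: values x_i and probabilities P(X = x_i)
x = np.array([0.0, 1.0, 2.0])
p = np.array([0.2, 0.5, 0.3])          # must sum to 1

mean = np.sum(x * p)                   # E[X] = sum_i x_i P(X = x_i)
var = np.sum((x - mean) ** 2 * p)      # Var[X] = sum_i (x_i - E[X])^2 P(X = x_i)
sd = np.sqrt(var)                      # sd[X] = sqrt(Var[X])

print(mean, var, sd)
```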

Normal distribution $X \sim N(\mu, \sigma^2)$, where $\mu$ is the mean, $\sigma^2$ is the variance, and $\sigma$ is the standard deviation.
Probability density function: the area under the curve represents probability.
Standard normal distribution $Z \sim N(0, 1)$.
Standardizing a normal random variable: $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$.
$P(X < x) = P\left(\frac{X - \mu}{\sigma} < \frac{x - \mu}{\sigma}\right) = P\left(Z < \frac{x - \mu}{\sigma}\right)$.
$P(-1 < Z < 1) \approx 0.68$; $P(\mu - \sigma < X < \mu + \sigma) \approx 0.68$.
$P(-2 < Z < 2) \approx 0.95$; $P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$.
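A small illustrative check of the standardization identity and the 68%/95% rules (SciPy and the example values of $\mu$ and $\sigma$ are assumptions for illustration):

```python
from scipy.stats import norm

mu, sigma = 10.0, 2.0                  # example mean and standard deviation
x = 12.5

# P(X < x) = P(Z < (x - mu)/sigma): both calls give the same probability
z = (x - mu) / sigma
print(norm.cdf(x, loc=mu, scale=sigma), norm.cdf(z))

# The 68%/95% rules for the standard normal
print(norm.cdf(1) - norm.cdf(-1))      # approximately 0.68
print(norm.cdf(2) - norm.cdf(-2))      # approximately 0.95
```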
Simple Linear Regression
Least squares estimation: given $n$ observations $(x_1, y_1), \ldots, (x_n, y_n)$, we estimate the intercept $b_0$ and slope $b_1$ by finding the straight line $\hat{y}_i = b_0 + b_1 x_i$ that minimizes the sum of squared residuals (SSE)

$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left[ y_i - (b_0 + b_1 x_i) \right]^2.$

Sample means of X and Y:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \quad \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}.$

Sample covariance:

$\mathrm{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Sample correlation:

$r_{xy} = \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{s_x^2 s_y^2}} = \frac{\mathrm{Cov}(X, Y)}{s_x s_y},$

where

$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}, \quad s_x = \sqrt{s_x^2}, \quad s_y^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}, \quad s_y = \sqrt{s_y^2}.$
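The sample statistics above are easy to check numerically; a minimal sketch (NumPy and the small made-up data set are assumptions for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

xbar, ybar = x.mean(), y.mean()                       # sample means
cov_xy = np.sum((x - xbar) * (y - ybar)) / (n - 1)    # sample covariance
sx, sy = x.std(ddof=1), y.std(ddof=1)                 # sample standard deviations
r_xy = cov_xy / (sx * sy)                             # sample correlation

# Cross-check against NumPy's built-in estimators
print(np.allclose(cov_xy, np.cov(x, y)[0, 1]))
print(np.allclose(r_xy, np.corrcoef(x, y)[0, 1]))
```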

Interpreting covariance, correlation, and regression coefficients.
$SST = SSR + SSE$
Coefficient of determination:

$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} = r_{xy}^2$
Regression assumptions and statistical model:

$Y = \beta_0 + \beta_1 X + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)$
$y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$

Assuming $\beta_0$, $\beta_1$, and $\sigma^2$ are known, given $x_i$, the 95% prediction interval for $y_i$ is

$(\beta_0 + \beta_1 x_i) \pm 2\sigma.$

We estimate $\sigma$ with the regression standard error $s$:

$s = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}.$

Approximately, $b_1 \sim N(\beta_1, s_{b_1}^2)$ and $b_0 \sim N(\beta_0, s_{b_0}^2)$, where the standard errors of $b_1$ and $b_0$ are

$s_{b_1} = \sqrt{\frac{s^2}{(n - 1) s_x^2}}, \quad s_{b_0} = s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{(n - 1) s_x^2}}.$

Thus, approximately, the 95% confidence intervals for $\beta_1$ and $\beta_0$ are

$b_1 \pm 2 s_{b_1}, \quad b_0 \pm 2 s_{b_0}.$

Hypothesis testing:
We test the null hypothesis $H_0: \beta_1 = \beta_1^0$ against the alternative $H_1: \beta_1 \neq \beta_1^0$.
The t-stat $t = \frac{b_1 - \beta_1^0}{s_{b_1}}$ measures how many standard errors the estimate $b_1$ is from the proposed value $\beta_1^0$.
The p-value measures how unusual the estimate $b_1$ would be if the null hypothesis were true.
We usually reject the null hypothesis if $|t| > 2$, $p < 0.05$, or $\beta_1^0$ is not within the 95% confidence interval $(b_1 - 2 s_{b_1}, b_1 + 2 s_{b_1})$.
Forecasting:
Given $X_f$, the 95% plug-in prediction interval for $Y_f$ is $(b_0 + b_1 X_f) \pm 2s$.
A large predictive error variance (high uncertainty) comes from a large $s$, a small $n$, a small $s_x$, and a large difference between $X_f$ and $\bar{x}$.
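A minimal end-to-end sketch of simple linear regression inference and forecasting, assuming statsmodels and simulated data (the true coefficients and the forecast point $X_f = 7$ are made up for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=50)        # simulated data, true beta0 = 1, beta1 = 0.5

X = sm.add_constant(x)                               # adds the intercept column
fit = sm.OLS(y, X).fit()
b0, b1 = fit.params                                  # least squares estimates

print(fit.bse)                                       # standard errors s_b0, s_b1
print(fit.tvalues, fit.pvalues)                      # t-stats and p-values for H0: beta_j = 0
print(fit.conf_int(alpha=0.05))                      # 95% confidence intervals

s = np.sqrt(fit.ssr / (len(y) - 2))                  # regression standard error
xf = 7.0                                             # forecast at X_f
print((b0 + b1 * xf - 2 * s, b0 + b1 * xf + 2 * s))  # 95% plug-in prediction interval
```

The t-stats and p-values reported by statsmodels test $H_0: \beta_j = 0$; for a different proposed value $\beta_1^0$, compute $t = (b_1 - \beta_1^0)/s_{b_1}$ directly.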

Multiple Linear Regression

Statistical model:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$

$Y \mid X_1, \ldots, X_p \sim N(\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p, \sigma^2)$

Interpretation of regression coefficients.

Fitted values: $\hat{y}_i = b_0 + b_1 x_{i1} + \cdots + b_p x_{ip}$

Least squares estimation: find $b_0, b_1, \ldots, b_p$ that minimize the sum of squared residuals $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.

Regression standard error:

$s = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - p - 1}} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - p - 1}}.$

$\bar{e} = 0$, $\mathrm{Corr}(X_j, e) = 0$, $\mathrm{Corr}(\hat{Y}, e) = 0$

$R^2 = \mathrm{Corr}(Y, \hat{Y})^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$

Approximately, $b_j \sim N(\beta_j, s_{b_j}^2)$.

95% confidence interval for $\beta_j$: $b_j \pm 2 s_{b_j}$

t-stat: $t_j = \frac{b_j - \beta_j^0}{s_{b_j}}$.

$H_0: \beta_j = \beta_j^0$ versus $H_1: \beta_j \neq \beta_j^0$. Reject $H_0$ if $|t_j| > 2$, the p-value $< 0.05$, or $\beta_j^0$ is not within $(b_j - 2 s_{b_j}, b_j + 2 s_{b_j})$.
F-test of overall significance.
$H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0$ versus $H_1$: at least one $\beta_j \neq 0$.

$f = \frac{R^2 / p}{(1 - R^2)/(n - p - 1)} = \frac{SSR / p}{SSE / (n - p - 1)}$

If $H_0$ were true, $f$ would typically be close to 1; as a rough rule, $f > 4$ is very significant in general.
If $f$ is large (the corresponding p-value is small), we reject $H_0$.
Understanding multiple linear regression:
Correlation is not causation.
Multiple linear regression allows us to control for important variables by including them in the regression model.
Dependencies between the explanatory variables (the Xs) affect our interpretation of the regression coefficients.
Dependencies between the explanatory variables (the Xs) inflate the standard errors of the regression coefficients:

$s_{b_j}^2 = \frac{s^2}{\text{variation in } X_j \text{ not associated with the other Xs}}$
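A minimal sketch of these ideas with simulated data (statsmodels, NumPy, and the chosen degree of correlation between the predictors are illustrative assumptions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)       # x2 shares most of its variation with x1
y = 2.0 + 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(fit.rsquared)                            # R^2
print(fit.fvalue, fit.f_pvalue)                # overall F-test of H0: beta1 = beta2 = 0
print(fit.bse)                                 # standard errors, inflated by the dependence between x1 and x2

# Refit with x1 alone: its standard error shrinks because all of its
# variation is now "not associated with other Xs"
print(sm.OLS(y, sm.add_constant(x1)).fit().bse)
```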

Dummy Variables and Interactions
Dummy variables
Examples of categorical variables: gender (Male, Female); education level (High school, Bachelor, Master, Doctor); month (Jan, Feb, ..., Dec).
A variable with $n$ categories can be included in a multiple linear regression using $C$ dummy variables, where $1 \leq C \leq n - 1$.
Representing a variable with $n$ categories using $n$ dummy variables leads to perfect multicollinearity.
Interpretation: the same slope but different intercepts.
Interactions
Interpretation: different intercepts and different slopes.
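A minimal sketch of a dummy variable and an interaction in a regression, assuming pandas and statsmodels with a made-up data frame (the variable names and effect sizes are illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 100
df = pd.DataFrame({
    "x": rng.uniform(0, 10, size=n),
    "gender": rng.choice(["Male", "Female"], size=n),
})
df["y"] = 1.0 + 0.5 * df["x"] + 2.0 * (df["gender"] == "Female") + rng.normal(size=n)

# Dummy variable only: same slope, different intercepts
print(smf.ols("y ~ x + C(gender)", data=df).fit().params)

# Interaction: different intercepts and different slopes
print(smf.ols("y ~ x * C(gender)", data=df).fit().params)
```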
Diagnostics and Transformations
Diagnostics
Model assumptions:
Statistical model:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$

The mean of $Y$ is a linear combination of the Xs.
The errors $\varepsilon_i$ (deviations from the true mean) are independent and identically normally distributed, $\varepsilon_i \sim N(0, \sigma^2)$.
Understanding the consequences of violating the model assumptions.
Detecting and explaining common violations of the model assumptions using residual plots.
Modeling non-linearity with polynomial regression
Statistical model:

$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \cdots + \beta_m X^m + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$

We can always increase $m$ if necessary, but $m = 2$ is usually sufficient.
Be very careful about over-fitting and about predicting outside the range of the data, especially if $m$ is large.
For $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$, the marginal effect of $X$ on $Y$ is

$\frac{\partial E[Y \mid X]}{\partial X} = \beta_1 + 2 \beta_2 X,$

which means the slope is a function of $X$ (no longer a constant).
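A minimal sketch of a quadratic fit and its marginal effect, assuming statsmodels and simulated data (the true curvature is made up):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(-3, 3, size=200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(size=200)   # simulated quadratic relationship

X = sm.add_constant(np.column_stack([x, x**2]))          # columns: 1, X, X^2
fit = sm.OLS(y, X).fit()
b0, b1, b2 = fit.params

# Marginal effect b1 + 2*b2*X at a few values of X: the slope changes with X
for x0 in (-2.0, 0.0, 2.0):
    print(x0, b1 + 2 * b2 * x0)
```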
Handling non-constant variance with the log-log transformation
Statistical model:

$\log(Y) = \beta_0 + \beta_1 \log(X) + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
$Y = e^{\beta_0} X^{\beta_1} e^{\varepsilon}, \quad \varepsilon \sim N(0, \sigma^2)$

Interpretation: about a $\beta_1$% change in $Y$ for each 1% change in $X$.
Example: price elasticity.
95% plug-in prediction interval for $\log(Y)$:

$(\beta_0 + \beta_1 \log(X)) \pm 2s$

95% plug-in prediction interval for $Y$:

$\left( e^{\beta_0 + \beta_1 \log(X) - 2s}, \; e^{\beta_0 + \beta_1 \log(X) + 2s} \right) = \left( e^{\beta_0 - 2s} X^{\beta_1}, \; e^{\beta_0 + 2s} X^{\beta_1} \right)$
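A minimal sketch of the log-log model and the back-transformed prediction interval, assuming statsmodels and simulated data (the true elasticity of -1.5 and the prediction point X = 50 are made up):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(1, 100, size=200)
y = np.exp(2.0 - 1.5 * np.log(x) + rng.normal(0, 0.2, size=200))   # simulated, elasticity -1.5

fit = sm.OLS(np.log(y), sm.add_constant(np.log(x))).fit()
b0, b1 = fit.params
s = np.sqrt(fit.ssr / (len(y) - 2))                    # regression standard error
print(b1)                                              # roughly -1.5: about a 1.5% drop in Y per 1% increase in X

x_new = 50.0
center = b0 + b1 * np.log(x_new)                       # point prediction for log(Y)
print(np.exp(center - 2 * s), np.exp(center + 2 * s))  # 95% plug-in prediction interval for Y
```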

Log transformation of Y
Statistical model:

$\log(Y) = \beta_0 + \beta_1 X + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$
$Y = e^{\beta_0} e^{\beta_1 X} e^{\varepsilon}, \quad \varepsilon \sim N(0, \sigma^2)$

Interpretation: about a $(100\,\beta_1)$% change in $Y$ per unit change in $X$ (if $\beta_1$ is small).
Example: exponential growth.

Time Series

Trend, seasonal, cyclical, and random components of a time series


Fitting a trend
Linear trend:
$Y_t = \beta_0 + \beta_1 t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$
Exponential trend:
Model: $\log(Y_t) = \beta_0 + \beta_1 t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$
Interpretation: $Y_t$ increases by about $(100\,\beta_1)$% per unit increase in time.
Modeling non-linearity by adding $t^2$ to the regression model: the slope changes as time changes.
95% plug-in prediction interval
Autoregressive models
Random walk model: $Y_t = \beta_0 + Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$
Autoregressive model of order 1 (AR(1)):

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$

Autocorrelation of residuals: $\mathrm{Corr}(e_t, e_{t-1})$

Trend + AR(1):

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$

Log transformation + trend + AR(1):

$\log(Y_t) = \beta_0 + \beta_1 \log(Y_{t-1}) + \beta_2 t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$
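A minimal sketch of fitting a trend-plus-AR(1) model by regressing the series on its own lag and on time, assuming statsmodels and a simulated series:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
T = 200
y = np.zeros(T)
for t in range(1, T):
    y[t] = 1.0 + 0.6 * y[t - 1] + 0.05 * t + rng.normal(0, 0.5)   # simulated AR(1) with trend

y_t = y[1:]                                   # Y_t
y_lag = y[:-1]                                # Y_{t-1}
trend = np.arange(1, T)                       # t
X = sm.add_constant(np.column_stack([y_lag, trend]))
fit = sm.OLS(y_t, X).fit()
print(fit.params)                             # estimates of beta0, beta1 (AR term), beta2 (trend)
```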

Modeling seasonality
Use no more than 11 dummy variables for 12 months; use no more than 3 dummy variables for 4 quarters.
Seasonal model:

$Y_t = \beta_0 + \beta_1 \mathrm{Jan} + \cdots + \beta_{11} \mathrm{Nov} + \varepsilon_t$

Seasonal + AR(1) + linear trend:

$Y_t = \beta_0 + \beta_1 \mathrm{Jan} + \cdots + \beta_{11} \mathrm{Nov} + \beta_{12} Y_{t-1} + \beta_{13} t + \varepsilon_t$

Model for $t$ in December: $Y_t = \beta_0 + \beta_{12} Y_{t-1} + \beta_{13} t + \varepsilon_t$
Model for $t$ in January: $Y_t = (\beta_0 + \beta_1) + \beta_{12} Y_{t-1} + \beta_{13} t + \varepsilon_t$
Model for $t$ in October: $Y_t = (\beta_0 + \beta_{10}) + \beta_{12} Y_{t-1} + \beta_{13} t + \varepsilon_t$
Log transformation + seasonal + AR(1) + trend:

$\log(Y_t) = \beta_0 + \beta_1 \mathrm{Jan} + \cdots + \beta_{11} \mathrm{Nov} + \beta_{12} \log(Y_{t-1}) + \beta_{13} t + \varepsilon_t$
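A minimal sketch of seasonal dummies combined with an AR(1) term and a linear trend, using quarterly data so only 3 dummies are needed (pandas, statsmodels, and the simulated quarterly series are illustrative assumptions):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
T = 120
quarter = np.tile(["Q1", "Q2", "Q3", "Q4"], T // 4)
season_effect = {"Q1": 0.0, "Q2": 1.0, "Q3": 2.0, "Q4": -1.0}
y = np.zeros(T)
for t in range(1, T):
    y[t] = 2.0 + season_effect[quarter[t]] + 0.5 * y[t - 1] + 0.02 * t + rng.normal(0, 0.3)

df = pd.DataFrame({"y": y[1:], "y_lag": y[:-1],
                   "t": np.arange(1, T), "quarter": quarter[1:]})

# C(quarter) expands into 3 dummy variables (Q1 serves as the baseline intercept)
fit = smf.ols("y ~ C(quarter) + y_lag + t", data=df).fit()
print(fit.params)
```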

Diagnose the residual plot of a time series regression model:
Are there any clear temporal patterns?
Are the residuals autocorrelated?
Which model assumptions have been violated?
Understand when and how to include log transformations, non-linearity, dummy variables, interactions, AR(1) terms, and trends to improve a time series regression model.
