4.1 Multiple Choice: Chapter 4 Linear Regression With One Regressor
1) When the estimated slope coefficient in the simple regression model, $\hat{\beta}_1$, is zero, then
A) R2 = $\bar{Y}$.
B) 0 < R2 < 1.
C) R2 = 0.
D) R2 > (SSR/TSS).
Answer: C
C) $\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 \sum_{i=1}^{n} (X_i - \bar{X})^2}$
D) $\frac{SSR}{n-2}$
Answer: A
Copyright © 2011 Pearson Education, Inc.
5) Binary variables
A) are generally used to control for outliers in your sample.
B) can take on more than two values.
C) exclude certain individuals from your sample.
D) can take on only two values.
Answer: D
6) The following are all least squares assumptions with the exception of:
A) The conditional distribution of ui given Xi has a mean of zero.
B) The explanatory variable in regression model is normally distributed.
C) (Xi, Yi), i = 1,..., n are independently and identically distributed.
D) Large outliers are unlikely.
Answer: B
11) The variance of Yi is given by
A) $\beta_0^2 + \beta_1^2 var(X_i) + var(u_i)$.
C) $\beta_1^2 var(X_i) + var(u_i)$.
14) The slope estimator, β1, has a smaller standard error, other things equal, if
A) there is more variation in the explanatory variable, X.
B) there is a large variance of the error term, u.
C) the sample size is smaller.
D) the intercept, β0, is small.
Answer: A
17) The OLS residuals
A) can be calculated using the errors from the regression function.
B) can be calculated by subtracting the fitted values from the actual values.
C) are unknown since we do not know the population regression function.
D) should not be used in practice since they indicate that your regression does not run through all your
observations.
Answer: B
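19) If the three least squares assumptions hold, then the large sample normal distribution of $\hat{\beta}_1$ is
A) $N\left(0, \frac{1}{n} \frac{var[(X_i - \mu_X) u_i]}{[var(X_i)]^2}\right)$.
B) $N\left(\beta_1, \frac{1}{n} \frac{var[(X_i - \mu_X) u_i]}{[var(X_i)]^2}\right)$.
C) $N\left(\beta_1, \frac{\sigma_u^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}\right)$.
D) $N\left(\beta_1, \frac{1}{n} \frac{var(u_i)}{[var(X_i)]^2}\right)$.
Answer: B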
21) To obtain the slope estimator using the least squares principle, you divide the
A) sample variance of X by the sample variance of Y.
B) sample covariance of X and Y by the sample variance of Y.
C) sample covariance of X and Y by the sample variance of X.
D) sample variance of X by the sample covariance of X and Y.
Answer: C
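The rule in question 21 can be checked on a toy data set. The following is a minimal sketch, assuming made-up numbers that lie exactly on the line Y = 1 + 2X; the function name ols_slope_intercept is illustrative and not part of the textbook.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Slope = sample covariance of X and Y divided by sample variance of X."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # Sample covariance and sample variance, both with the n - 1 divisor.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data generated from Y = 1 + 2X with no error term.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
slope, intercept = ols_slope_intercept(x, y)
print(slope, intercept)
```

Because the n - 1 divisors cancel, the same slope results from dividing the raw sums of cross-products and squares.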
22) To decide whether or not the slope coefficient is large or small,
A) you should analyze the economic importance of a given increase in X.
B) the slope coefficient must be larger than one.
C) the slope coefficient must be statistically significant.
D) you should change the scale of the X variable if the coefficient appears to be too small.
Answer: A
25) Multiplying the dependent variable by 100 and the explanatory variable by 100,000 leaves the
A) OLS estimate of the slope the same.
B) OLS estimate of the intercept the same.
C) regression R2 the same.
D) variance of the OLS estimators the same.
Answer: C
26) Assume that you have collected a sample of observations from over 100 households and their
consumption and income patterns. Using these observations, you estimate the following regression Ci =
β0+β1Yi+ ui where C is consumption and Y is disposable income. The estimate of β1 will tell you
A) $\frac{\Delta Income}{\Delta Consumption}$
B) The amount you need to consume to survive
C) $\frac{Income}{Consumption}$
D) $\frac{\Delta Consumption}{\Delta Income}$
Answer: D
27) In which of the following relationships does the intercept have a real-world interpretation?
A) the relationship between the change in the unemployment rate and the growth rate of real GDP
("Okun's Law")
B) the demand for coffee and its price
C) test scores and class-size
D) weight and height of individuals
Answer: A
29) Changing the units of measurement, e.g. measuring test scores in 100s, will do all of the following
EXCEPT for changing the
A) residuals
B) numerical value of the slope estimate
C) interpretation of the effect that a change in X has on the change in Y
D) numerical value of the intercept
Answer: C
30) To decide whether the slope coefficient indicates a "large" effect of X on Y, you look at the
A) size of the slope coefficient
B) regression R2
C) economic importance implied by the slope coefficient
D) value of the intercept
Answer: C
4.2 Essays and Longer Questions
1) Sir Francis Galton, a cousin of Charles Darwin, examined the relationship between the height of children
and their parents towards the end of the 19th century. It is from this study that the name "regression"
originated. You decide to update his findings by collecting data from 110 college students, and estimate
the following relationship:
$\widehat{Studenth}$ = 19.6 + 0.73 × Midparh, R2 = 0.45, SER = 2.0
where Studenth is the height of students in inches, and Midparh is the average of the parental heights.
(Following Galton's methodology, both variables were adjusted so that the average female height was
equal to the average male height.)
(a) Interpret the estimated coefficients.
(b) What is the meaning of the regression R2?
(c) What is the prediction for the height of a child whose parents have an average height of 70.06 inches?
(d) What is the interpretation of the SER here?
(e) Given the positive intercept and the fact that the slope lies between zero and one, what can you say
about the height of students who have quite tall parents? Those who have quite short parents?
(f) Galton was concerned about the height of the English aristocracy and referred to the above result as
"regression towards mediocrity." Can you figure out what his concern was? Why do you think that we
refer to this result today as "Galton's Fallacy"?
Answer:
(a) For every one inch increase in the average height of their parents, the student's height increases by
0.73 of an inch. There is no reasonable interpretation for the intercept.
(b) The model explains 45 percent of the variation in the height of students.
(c) 19.6 + 0.73 × 70.06 = 70.74.
(d) The SER is a measure of the spread of the observations around the regression line. The magnitude of
the typical deviation from the regression line or the typical regression error here is two inches.
(e) Tall parents will have, on average, tall students, but they will not be as tall as their parents. Short
parents will have short students, although on average, they will be somewhat taller than their parents.
(f) This is an example of mean reversion. Since the aristocracy was, on average, taller, he was concerned
that their children would be shorter and resemble more the rest of the population. If this conclusion were
true, then eventually everyone would be of the same height. However, we have not observed a decrease
in the variance in height over time.
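The arithmetic in (c) and the pattern described in (e) can be sketched as follows. This is an illustration only: the coefficients 19.6 and 0.73 come from the answer above, while the midparent heights of 76 and 64 inches are hypothetical values chosen for the example.

```python
def predict_student_height(midparent_height, intercept=19.6, slope=0.73):
    """Predicted student height (inches) from the estimated regression line."""
    return intercept + slope * midparent_height

# Part (c): prediction at a midparent height of 70.06 inches.
prediction = predict_student_height(70.06)

# Height at which the predicted student height equals the midparent height.
crossover = 19.6 / (1 - 0.73)

# Hypothetical tall and short parents for part (e).
child_of_tall = predict_student_height(76.0)
child_of_short = predict_student_height(64.0)

print(round(prediction, 2), round(crossover, 2))
print(round(child_of_tall, 2), round(child_of_short, 2))
```

With these estimates, predicted student height falls below the midparent height only for parents taller than the crossover point, and exceeds it for parents shorter than that point.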
2) (Requires Appendix material) At a recent county fair, you observed that at one stand people's weight
was forecasted, and were surprised by the accuracy (within a range). Thinking about how the person
could have predicted your weight fairly accurately (despite the fact that she did not know about your
"heavy bones"), you think about how this could have been accomplished. You remember that medical
charts for children contain 5%, 25%, 50%, 75% and 95% lines for a weight/height relationship and decide
to conduct an experiment with 110 of your peers. You collect the data and calculate the following sums:
where the height is measured in inches and weight in pounds. (Small letters refer to deviations from
means, as in $z_i = Z_i - \bar{Z}$.)
(a) Calculate the slope and intercept of the regression and interpret these.
(b) Find the regression R2 and explain its meaning. What other factors can you think of that might have
an influence on the weight of an individual?
Answer:
(a) $\hat{\beta}_1 = \frac{7625.9}{1248.9} = 6.11$, $\hat{\beta}_0 = 157.95 - 6.11 \times 69.69 = -267.86$. For every additional inch in height, students weigh roughly 6 pounds more, on average.
(b) $R^2 = \frac{ESS}{TSS} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2} = \frac{46{,}624.1}{94{,}228.8} = 0.495$. Roughly half of the weight variation in the 110 students is
explained by the single explanatory variable, height. Answers will vary by student for the other factors,
but calorie intake and amount of exercise typically appear as part of the list.
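The calculation above can be reproduced from the sums that appear in the answer. The sketch below assumes those numbers (7625.9, 1248.9, 94228.8 and the means 157.95 and 69.69) and follows the answer in rounding the slope to two decimals before using it; the variable names are illustrative.

```python
# Sums in deviations from means, as used in the answer above.
sum_xy = 7625.9     # sum of x_i * y_i
sum_xx = 1248.9     # sum of x_i squared (height)
sum_yy = 94228.8    # sum of y_i squared (weight), i.e. the TSS
y_bar = 157.95      # mean weight in pounds
x_bar = 69.69       # mean height in inches

slope_exact = sum_xy / sum_xx
slope = round(slope_exact, 2)          # the answer works with 6.11
intercept = y_bar - slope * x_bar

ess = slope ** 2 * sum_xx              # explained sum of squares
r_squared = ess / sum_yy

# Without intermediate rounding the R2 is slightly lower.
r_squared_exact = slope_exact ** 2 * sum_xx / sum_yy

print(slope, round(intercept, 2), round(r_squared, 3), round(r_squared_exact, 3))
```

The small gap between the two R2 values comes only from rounding the slope before squaring it.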
3) You have obtained a sub-sample of 1744 individuals from the Current Population Survey (CPS) and
are interested in the relationship between weekly earnings and age. The regression, using
heteroskedasticity-robust standard errors, yielded the following result:
$\widehat{Earn}$ = 239.16 + 5.20 × Age, R2 = 0.05, SER = 287.21
where Earn and Age are measured in dollars and years respectively.
(a) Interpret the results.
(b) Is the effect of age on earnings large?
(c) Why should age matter in the determination of earnings? Do the results suggest that there is a
guarantee for earnings to rise for everyone as they become older? Do you think that the relationship
between age and earnings is linear?
(d) The average age in this sample is 37.5 years. What is annual income in the sample?
(e) Interpret the measures of fit.
Answer:
(a) A person who is one year older increases her weekly earnings by $5.20. There is no meaning attached
to the intercept. The regression explains 5 percent of the variation in earnings.
(b) Assuming that people worked 52 weeks a year, the effect of being one year older translates into an
additional $270.40 a year. This does not seem particularly large in 2002 dollars, but may have been earlier.
(c) In general, age-earnings profiles take on an inverted U-shape. Hence it is not linear and the linear
approximation may not be good at all. Age may be a proxy for "experience," which in itself can
approximate "on the job training." Hence the positive effect between age and earnings. The results do not
suggest that there is a guarantee for earnings to rise for everyone as they become older since the
regression R2 does not equal 1. Instead the result holds "on average."
(d) Since $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \Rightarrow \bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Substituting the estimates for the slope and the intercept then
results in average weekly earnings of $434.16 or annual average earnings of $22,576.32.
(e) The regression R2 indicates that five percent of the variation in earnings is explained by the model.
The typical error is $287.21.
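The numbers in (b) and (d) follow from two small calculations, sketched below. The slope of 5.20, the mean age of 37.5, and the mean weekly earnings of $434.16 are taken from the answer; the intercept is backed out from them using the fact that the OLS line passes through the sample means, so it is an implied value rather than one quoted in the answer.

```python
slope = 5.20              # dollars of weekly earnings per year of age
weeks_per_year = 52       # assumption stated in part (b)
mean_age = 37.5
mean_weekly_earnings = 434.16

# Part (b): annual effect of being one year older.
annual_effect = slope * weeks_per_year

# Part (d): the regression line passes through (mean age, mean earnings),
# so the intercept implied by the reported mean is:
implied_intercept = mean_weekly_earnings - slope * mean_age
annual_mean_earnings = mean_weekly_earnings * weeks_per_year

print(round(annual_effect, 2), round(implied_intercept, 2), round(annual_mean_earnings, 2))
```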
4) The baseball team nearest to your home town is, once again, not doing well. Given that your
knowledge of what it takes to win in baseball is vastly superior to that of management, you want to find
out what it takes to win in Major League Baseball (MLB). You therefore collect the winning percentage of
all 30 baseball teams in MLB for 1999 and regress the winning percentage on what you consider the
primary determinant for wins, which is quality pitching (team earned run average). You find the
following information on team performance:
(a) What is your expected sign for the regression slope? Will it make sense to interpret the intercept? If
not, should you omit it from your regression and force the regression line through the origin?
(b) OLS estimation of the relationship between the winning percentage and the team ERA yields the
following:
$\widehat{Winpct}$ = 0.94 - 0.10 × TeamERA, R2 = 0.49
where winpct is measured as wins divided by games played, so for example a team that won half of its
games would have Winpct = 0.50. Interpret your regression results.
(c) It is typically sufficient to win 90 games to be in the playoffs and/or to win a division. Winning over
100 games a season is exceptional: the Atlanta Braves had the most wins in 1999 with 103. Teams play a
total of 162 games a year. Given this information, do you consider the slope coefficient to be large or
small?
(d) What would be the effect on the slope, the intercept, and the regression R2 if you measured Winpct in
percentage points, i.e., as (Wins/Games) × 100?
(e) Are you impressed with the size of the regression R2? Given that there is 51% of unexplained variation
in the winning percentage, what might some of these factors be?
Answer:
(a) You expect a negative relationship, since a higher team ERA implies a lower quality of the input. No
team comes close to a zero team ERA, and therefore it does not make sense to interpret the intercept.
Forcing the regression through the origin is a false implication from this insight. Instead the intercept
fixes the level of the regression.
(b) For every one point increase in Team ERA, the winning percentage decreases by 10 percentage points,
or 0.10. Roughly half of the variation in winning percentage is explained by the quality of team pitching.
(c) The coefficient is large, since increasing the winning percentage by 0.10 is the equivalent of winning 16
more games per year (0.10 × 162). Since it is typically sufficient to win 56 percent of the games (90 of 162) to
qualify for the playoffs, this difference of 0.10 in winning percentage can easily turn a losing team into a
winning team.
(d) Clearly the regression R2 will not be affected by a change in scale, since a descriptive measure of the
quality of the regression would depend on whim otherwise. The slope of the regression will compensate
in such a way that the interpretation of the result is unaffected, i.e., it will become 10 in the above
example. The intercept will also change to reflect the fact that if X were 0, then the dependent variable
would now be measured in percentage, i.e., it will become 94.0 in the above example.
(e) It is impressive that a single variable can explain roughly half of the variation in winning percentage.
Answers to the second question will vary by student, but will typically include the quality of hitting,
fielding, and management. Salaries could be included, but should be reflected in the inputs.
5) You have learned in one of your economics courses that one of the determinants of per capita income
(the "Wealth of Nations") is the population growth rate. Furthermore you also found out that the Penn
World Tables contain income and population data for 104 countries of the world. To test this theory, you
regress the GDP per worker (relative to the United States) in 1990 (RelPersInc) on the difference between
the average population growth rate of that country (n) to the U.S. average population growth rate (nus )
for the years 1980 to 1990. This results in the following regression output:
6) The neoclassical growth model predicts that for identical savings rates and population growth rates,
countries should converge to the same per capita income level. This is referred to as the convergence
hypothesis. One way to test for the presence of convergence is to compare the growth rates over time to
the initial starting level.
(a) If you regressed the average growth rate over a time period (1960-1990) on the initial level of per
capita income, what would the sign of the slope have to be to indicate this type of convergence? Explain.
Would this result confirm or reject the prediction of the neoclassical growth model?
(b) The results of the regression for 104 countries were as follows:
where g6090 is the average annual growth rate of GDP per worker for the 1960-1990 sample period, and
RelProd60 is GDP per worker relative to the United States in 1960.
Interpret the results. Is there any evidence of unconditional convergence between the countries of the
world? Is this result surprising? What other concept could you think about to test for convergence
between countries?
(c) You decide to restrict yourself to the 24 OECD countries in the sample. This changes your regression
output as follows:
Answer:
(a) You would require a negative sign. Countries that are far ahead of others at the beginning of the
period would have to grow relatively slower for the others to catch up. This represents unconditional
convergence, whereas the neoclassical growth model predicts conditional convergence, i.e., there will
only be convergence if countries have identical savings, population growth rates, and production
technology.
(b) An increase of 10 percentage points in RelProd60 results in a decrease of 0.00006 in the growth rate
from 1960 to 1990, i.e., countries that were further ahead in 1960 do grow by less. There are some
countries in the sample that have a value of RelProd60 close to zero (China, Uganda, Togo, Guinea) and
you would expect these countries to grow roughly by 2 percent per year over the sample period. The
regression R2 indicates that the regression has virtually no explanatory power. The result is not
surprising given that there are not many theories that predict unconditional convergence between the
countries of the world.
(c) Judging by the size of the slope coefficient, there is strong evidence of unconditional convergence for
the OECD countries. The regression R2 is quite high, given that there is only a single explanatory variable
in the regression. However, since we do not know the sampling distribution of the estimator in this case,
we cannot conduct inference.
7) In 2001, the Arizona Diamondbacks defeated the New York Yankees in the Baseball World Series in 7
games. Some players, such as Bautista and Finley for the Diamondbacks, had a substantially higher
batting average during the World Series than during the regular season. Others, such as Brosius and Jeter
for the Yankees, did substantially worse. You set out to investigate whether or not the regular season
batting average is a good indicator for the World Series batting average. The results for 11 players who
had the most at bats for the two teams are:
where Wsavg and Seasavg indicate the batting average during the World Series and the regular season
respectively.
(a) Focusing on the coefficients first, what is your interpretation?
(b) What can you say about the explanatory power of your equation? What do you conclude from this?
Answer:
(a) The two regressions are quite different. For the Diamondbacks, players who had a 10 point higher
batting average during the regular season had roughly a 23 point higher batting average during the
World Series. Hence top performers did relatively better. The opposite holds for the Yankees.
(b) Both regressions have little explanatory power as seen from the regression R2. Hence performance
during the season is a poor forecast of World Series performance.
8) For the simple regression model of Chapter 4, you have been given the following data:
$\sum_{i=1}^{420} Y_i$ = 274,745.75; $\sum_{i=1}^{420} X_i$ = 8,248.979;
9) Your textbook presented you with the following regression output:
$\widehat{TestScore}$ = 698.9 - 2.28 × STR, R2 = 0.051, SER = 18.6
(a) How would the slope coefficient change, if you decided one day to measure testscores in 100s, i.e., a
test score of 650 became 6.5? Would this have an effect on your interpretation?
(b) Do you think the regression R2 will change? Why or why not?
(c) Although Chapter 4 in your textbook did not deal with hypothesis testing, it presented you with the
large sample distribution for the slope and the intercept estimator. Given the change in the units of
measurement in (a), do you think that the variance of the slope estimator will change numerically? Why
or why not?
Answer:
(a) The new regression line would be $\widehat{TestScore}$ = 6.989 - 0.0228 × STR, with test scores now measured
in 100s. Hence the decimal point would simply move two digits to the left. The interpretation remains the
same, since an increase in the student-teacher ratio by 2, say, lowers the new test score by 0.0456 points on
the new scale, which is 4.56 points on the original scale.
(b) The regression R2 should not change, since, if it did, an objective measure of fit would depend on
whim (the units of measurement). The SER will change (from 18.6 to 0.186). This is to be expected, since
the TSS obviously changes, and with the regression R2 unchanged, the SSR (and hence SER) have to
adjust accordingly.
(c) Since statistical inference will depend on the ratio of the estimator and its standard error, the standard
error must change in proportion to the estimator: it is divided by 100, so the variance of the slope estimator
is divided by 100 squared. If this were not true, then statistical inference would again depend on the whim of
the investigator.
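The scale effects in (a) to (c) can be illustrated numerically. The sketch below uses simulated data, not the California data set: the sample of 420 observations is generated from a line with intercept 698.9, slope -2.28, and error standard deviation 18.6, purely as an assumption for the example. The standard error uses the homoskedasticity-only formula for simplicity, which is also an assumption.

```python
import numpy as np

def simple_ols(x, y):
    """Slope, homoskedasticity-only SE of the slope, R2, and SER for Y on X."""
    n = len(x)
    xd = x - x.mean()
    yd = y - y.mean()
    slope = (xd @ yd) / (xd @ xd)
    intercept = y.mean() - slope * x.mean()
    resid = y - intercept - slope * x
    ssr = resid @ resid
    ser = np.sqrt(ssr / (n - 2))
    se_slope = ser / np.sqrt(xd @ xd)
    r2 = 1.0 - ssr / (yd @ yd)
    return {"slope": slope, "se_slope": se_slope, "r2": r2, "ser": ser}

# Simulated (hypothetical) data resembling the test score example.
rng = np.random.default_rng(0)
str_ratio = rng.uniform(14.0, 26.0, size=420)
test_score = 698.9 - 2.28 * str_ratio + rng.normal(0.0, 18.6, size=420)

original = simple_ols(str_ratio, test_score)
rescaled = simple_ols(str_ratio, test_score / 100.0)   # test scores in 100s

t_original = original["slope"] / original["se_slope"]
t_rescaled = rescaled["slope"] / rescaled["se_slope"]
print(original, rescaled, t_original, t_rescaled)
```

The slope, its standard error, and the SER all shrink by a factor of 100, while the R2 and the ratio of the slope to its standard error are unchanged.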
10) The news-magazine The Economist regularly publishes data on the so called Big Mac index and
exchange rates between countries. The data for 30 countries from the April 29, 2000 issue is listed below:
The concept of purchasing power parity or PPP ("the idea that similar foreign and domestic goods …
should have the same price in terms of the same currency," Abel, A. and B. Bernanke, Macroeconomics, 4th
edition, Boston: Addison Wesley, 476) suggests that the ratio of the Big Mac priced in the local currency
to the U.S. dollar price should equal the exchange rate between the two countries.
(a) Enter the data into your regression analysis program (EViews, Stata, Excel, SAS, etc.). Calculate the
predicted exchange rate per U.S. dollar by dividing the price of a Big Mac in local currency by the U.S.
price of a Big Mac ($2.51).
(b) Run a regression of the actual exchange rate on the predicted exchange rate. If purchasing power
parity held, what would you expect the slope and the intercept of the regression to be? Is the value of the
slope and the intercept "far" from the values you would expect to hold under PPP?
(c) Plot the actual exchange rate against the predicted exchange rate. Include the 45 degree line in your
graph. Which observations might cause the intercept and the slope to differ from zero and one?
Answer:
(a)
Country          Predicted exchange rate per U.S. dollar
Indonesia 5777
Italy 1793
South Korea 1195
Chile 502
Spain 149
Hungary 135
Japan 117
Taiwan 27.9
Thailand 21.9
Czech Rep. 21.7
Russia 15.7
Denmark 9.86
Sweden 9.56
Mexico 8.33
France 7.37
Israel 5.78
China 3.94
South Africa 3.59
Switzerland 2.35
Poland 2.19
Germany 1.99
Malaysia 1.80
New Zealand 1.35
Singapore 1.27
Brazil 1.18
Canada 1.14
Australia 1.03
Argentina 1.00
Britain 0.76
(b) The estimated regression is as follows:
For PPP to hold exactly, you would expect an intercept of zero and a slope of unity. Since we do not
know the standard error of the slope and the intercept, and since Chapter 4 has not dealt with hypothesis
testing, it is hard to judge how "far" 27.05 and 1.35 are away from zero and one respectively.
(c) The regression is represented by the solid line, while the dashed one is the 45 degree line. Most of the
observations are bunched towards the origin, making it hard to judge from this graph which observations
cause the regression line to differ from the 45 degree line. However, the Indonesian Rupiah is certainly a
possible candidate.
11) At the Stock and Watson (http://www.pearsonhighered.com/stock_watson) website go to Student
Resources and select the option "Datasets for Replicating Empirical Results." Then select the "California
Test Score Data Used in Chapters 4-9" (caschool.xls) and open it in a spreadsheet program such as Excel.
In this exercise you will estimate various statistics of the Linear Regression Model with One Regressor
through construction of various sums and ratios within a spreadsheet program.
Throughout this exercise, let Y correspond to Test Scores (testscore) and X to the Student Teacher
Ratio (str). To generate answers to all exercises here, you will have to create seven columns and the sums
of five of these. They are
(i) $Y_i$, (ii) $X_i$, (iii) $(Y_i - \bar{Y})$, (iv) $(X_i - \bar{X})$, (v) $(Y_i - \bar{Y}) \times (X_i - \bar{X})$, (vi) $(X_i - \bar{X})^2$, (vii) $(Y_i - \bar{Y})^2$
Although neither the sum of (iii) nor the sum of (iv) will be required for further calculations, you may want to
generate these as a check (both have to sum to zero).
a. Use equation (4.7) and the sums of columns (v) and (vi) to generate the slope of the regression.
b. Use equation (4.8) to generate the intercept.
c. Display the regression line (4.9) and interpret the coefficients.
d. Use equation (4.16) and the sum of column (vii) to calculate the regression R2.
e. Use equation (4.19) to calculate the SER.
f. Use the "Regression" function in Excel to verify the results.
Answer:
Column (i): mean 654.156548 (sum 274745.75)
Column (ii): mean 19.64043 (sum 8248.979)
Column (iii): 1.27329E-11
Column (iv): 1.13E-12
Column (v): -3418.76
Column (vi): 1499.58
Column (vii): 152109.6
a. $\hat{\beta}_1 = \frac{-3418.76}{1499.58} = -2.27981$
b. $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 654.156548 - (-2.27981) \times 19.64043 = 698.933$
c. $\hat{Y}_i = 698.9 - 2.28 \times X_i$. A decrease in the student-teacher ratio of one results in an increase in test scores
of 2.28. It is best not to interpret the intercept; it simply determines the height of the regression line.
d. To calculate the regression R2, you need the TSS given by the sum in column (vii) and either the
ESS or the SSR. In principle, you could use equation (4.10) to generate the residuals, square these, and sum
them up to get the SSR. However, the textbook suggests a shortcut at the bottom of p. 142:
$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 - \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$
(the cross-product vanishes due to the orthogonality conditions
(4.32) and (4.36)). The various terms on the RHS of the equation have been calculated, and equation (4.35)
implies that $\hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$ = ESS = 7794.11. Hence the regression R2 = $\frac{7794.11}{152109.6}$ = 0.051.
e. The answer in (d) can be used to calculate the SSR, which is 152109.6 - 7794.11 = 144315.5. Hence the SER
must be $\sqrt{144315.5 / 418}$ = 18.6.
f.
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.226
R Square 0.051
Adjusted R Square 0.049
Standard Error 18.581
Observations 420
ANOVA
df SS
Regression 1 7794.11
Residual 418 144315.5
Total 419 152109.6
Coefficients
Intercept 698.93
str -2.28
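Steps a through e can also be reproduced outside a spreadsheet. The following is a minimal sketch that assumes only the column totals and means reported in the answer above; the variable names are illustrative.

```python
from math import sqrt

n = 420
y_bar = 654.156548      # mean of column (i), test scores
x_bar = 19.64043        # mean of column (ii), student-teacher ratio
sum_xy = -3418.76       # sum of column (v)
sum_xx = 1499.58        # sum of column (vi)
tss = 152109.6          # sum of column (vii)

slope = sum_xy / sum_xx                 # step a
intercept = y_bar - slope * x_bar       # step b
ess = slope ** 2 * sum_xx               # shortcut used in step d
r_squared = ess / tss
ssr = tss - ess                         # step e
ser = sqrt(ssr / (n - 2))

print(round(slope, 5), round(intercept, 3))
print(round(ess, 2), round(r_squared, 3), round(ssr, 1), round(ser, 3))
```

The results agree with the Excel output up to rounding of the reported column sums.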
12) You have obtained a sample of 14,925 individuals from the Current Population Survey (CPS) and are
interested in the relationship between average hourly earnings and years of education. The regression
yields the following result:
$\widehat{ahe}$ = -4.58 + 1.71 × educ, R2 = 0.182, SER = 9.30
where ahe and educ are measured in dollars and years respectively.
c. Why should education matter in the determination of earnings? Do the results suggest that there is a
guarantee for average hourly earnings to rise for everyone as they receive an additional year of
education? Do you think that the relationship between education and average hourly earnings is linear?
d. The average years of education in this sample is 13.5 years. What is mean of average hourly earnings
in the sample?
b. The difference between a high school graduate and a college graduate is four years of education. Hence
a college graduate will earn almost $7 more per hour, on average ($6.84 to be precise). If you assume that
there are 2,000 working hours per year, then the average salary difference would be close to $14,000
(actually $13,680). Depending on how much you have spent for an additional year of education and how
much income you have forgone, this does not seem particularly large.
c. In general, you would expect to find a positive relationship between years of education and average
hourly earnings. Education is considered investment in human capital. If this were not the case, then it
would be a puzzle as to why there are students in the econometrics course — surely they are not there to
just "find themselves" (which would be quite expensive in most cases). However, if you consider
education as an investment and you wanted to see a return on it, then the relationship will most likely not
be linear. For example, a constant percent return would imply an exponential relationship whereby the
additional year of education would bring a larger increase in average hourly earnings at higher levels of
education. The results do not suggest that there is a guarantee for earnings to rise for everyone as they
become more educated since the regression R2 does not equal 1. Instead the result holds "on average."
d. Since $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \Rightarrow \bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Substituting the estimates for the slope and the intercept then
results in a mean of average hourly earnings of roughly $18.50.
e. The typical prediction error is $9.30. Since the measure is related to the deviation of the actual and
fitted values, the unit of measurement must be the same as that of the dependent variable, which is in
dollars here.
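The figures in (b) and (d), and the nonlinearity argument in (c), can be sketched as follows. The coefficients -4.58 and 1.71 come from the regression above. The constant-return wage function is purely hypothetical: its starting wage of $5 and its 8 percent return per year of education are assumptions chosen for the illustration, not estimates.

```python
intercept, slope = -4.58, 1.71

# Part (b): four additional years of education, 2,000 hours per year.
hourly_premium = 4 * slope
annual_premium = hourly_premium * 2000

# Part (d): the OLS line passes through the sample means.
mean_ahe = intercept + slope * 13.5

def wage_constant_return(educ, base_wage=5.0, rate=0.08):
    """Hypothetical wage when each year of education raises earnings by a fixed percent."""
    return base_wage * (1 + rate) ** educ

# Dollar gain from one more year of education at different starting levels.
gains = [wage_constant_return(e + 1) - wage_constant_return(e) for e in (8, 12, 16)]

print(round(hourly_premium, 2), round(annual_premium, 2), round(mean_ahe, 3))
print([round(g, 2) for g in gains])
```

Under the linear model every extra year adds the same $1.71, whereas under a constant percent return the dollar gain grows with the level of education.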
4.3 Mathematical and Graphical Problems
1) Prove that the regression R2 is identical to the square of the correlation coefficient between two
variables Y and X. Regression functions are written in a form that suggests causation running from X to
Y. Given your proof, does a high regression R2 present supportive evidence of a causal relationship? Can
you think of some regression examples where the direction of causality is not clear? Where it is without a doubt?
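Answer: The regression R2 = $\frac{ESS}{TSS}$, where ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore ESS = $\hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. Using small letters to
indicate deviations from mean, i.e., $z_i = Z_i - \bar{Z}$, we get that the regression R2 = $\frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$. The square of
the correlation coefficient is
$r^2 = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2} \cdot \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.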
Correlation does not imply causation. Income is a regressor in the consumption function, yet
consumption enters on the right-hand side of the GDP identity. Regressing the weight of individuals on
the height is a situation where causality is without doubt, since the author of this test bank should be
seven feet tall otherwise. The authors of the textbook use weather data to forecast orange juice prices later
in the text.
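The identity between the regression R2 and the squared correlation coefficient can be checked numerically. The sketch below uses simulated data with an arbitrary slope of 0.5, an assumption made only for the illustration. It also runs the reverse regression of X on Y to show that the R2 is the same in both directions, which is why a high R2 by itself says nothing about the direction of causality.

```python
import numpy as np

# Simulated (hypothetical) data.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 1.0 + 0.5 * x + rng.normal(size=200)

xd = x - x.mean()
yd = y - y.mean()

slope_y_on_x = (xd @ yd) / (xd @ xd)
slope_x_on_y = (xd @ yd) / (yd @ yd)

# R2 written as slope squared times variation in X over variation in Y.
r2_y_on_x = slope_y_on_x ** 2 * (xd @ xd) / (yd @ yd)
r2_x_on_y = slope_x_on_y ** 2 * (yd @ yd) / (xd @ xd)

r = np.corrcoef(x, y)[0, 1]
print(r2_y_on_x, r2_x_on_y, r ** 2)
```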
2) You have analyzed the relationship between the weight and height of individuals. Although you are
quite confident about the accuracy of your measurements, you feel that some of the observations are
extreme, say, two standard deviations above and below the mean. You therefore decide to disregard
these individuals. What consequence will this have on the standard deviation of the OLS estimator of the
slope?
Answer: Other things being equal, the standard error of the slope coefficient decreases as the variation in X
increases. Hence you prefer more variation rather than less. This can be seen from formula (4.20) in
the text. Discarding the extreme observations reduces the variation in X, so the standard deviation of the
slope estimator will increase. Intuitively, it is easier for OLS to detect a response to a unit change in X if the
data vary more.
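A small simulation illustrates the point. It is a sketch under several assumptions: the height and weight data are simulated with made-up parameters, observations are dropped when height (the regressor) lies two or more standard deviations from its mean, and the standard error uses the homoskedasticity-only formula.

```python
import numpy as np

def slope_se(x, y):
    """Homoskedasticity-only standard error of the OLS slope."""
    n = len(x)
    xd = x - x.mean()
    slope = (xd @ (y - y.mean())) / (xd @ xd)
    resid = (y - y.mean()) - slope * xd
    ser = np.sqrt((resid @ resid) / (n - 2))
    return ser / np.sqrt(xd @ xd)

# Simulated (hypothetical) heights in inches and weights in pounds.
rng = np.random.default_rng(2)
height = rng.normal(69.7, 3.4, size=2000)
weight = -268.0 + 6.1 * height + rng.normal(0.0, 20.0, size=2000)

# Keep only observations within two standard deviations of mean height.
z = (height - height.mean()) / height.std()
keep = np.abs(z) < 2

se_full = slope_se(height, weight)
se_trimmed = slope_se(height[keep], weight[keep])
print(se_full, se_trimmed)
```

Trimming lowers both the number of observations and the variation in X, and both effects raise the standard error of the slope.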
3) In order to calculate the regression R2 you need the TSS and either the SSR or the ESS. The TSS is fairly
straightforward to calculate, being just the variation of Y. However, if you had to calculate the SSR or
ESS by hand (or in a spreadsheet), you would need all fitted values from the regression function and their
deviations from the sample mean, or the residuals. Can you think of a quicker way to calculate the ESS
simply using terms you have already used to calculate the slope coefficient?
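Answer: Since $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$, $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore ESS = $\hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. The right-hand side contains the
estimated slope squared and the denominator of the slope, i.e., all values that have already been
calculated.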
4) (Requires Appendix material) In deriving the OLS estimator, you minimize the sum of squared
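residuals with respect to the two parameters $\hat{\beta}_0$ and $\hat{\beta}_1$. The resulting two equations imply two
restrictions that OLS places on the data, namely that $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that you get the
same formula for the regression slope and the intercept if you impose these two conditions on the sample
regression function.
Answer: The sample regression function is $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$. Summing both sides results in
$\sum_{i=1}^{n} Y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} \hat{u}_i$. Imposing the first restriction, namely that the sum of the residuals is zero,
dividing both sides of the equation by n, and solving for $\hat{\beta}_0$ gives the OLS formula for the intercept, $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$.
For the second restriction, multiply both sides of the sample regression function by Xi and then sum both
sides to get $\sum_{i=1}^{n} Y_i X_i = \hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} \hat{u}_i X_i$. After imposing the restriction $\sum_{i=1}^{n} \hat{u}_i X_i = 0$ and
substituting the formula for the intercept, you get
$\sum_{i=1}^{n} Y_i X_i = (\bar{Y} - \hat{\beta}_1 \bar{X}) \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2$, i.e., $\sum_{i=1}^{n} Y_i X_i - n\bar{X}\bar{Y} = \hat{\beta}_1 \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)$.
Dividing by the variation in X, $\sum_{i=1}^{n} X_i^2 - n\bar{X}^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2$, results in the OLS estimator for the slope.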
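The argument can be verified numerically by treating the two summed equations, with the residual terms set to zero, as a linear system in the intercept and slope. The sketch below uses a small made-up data set; the numbers are hypothetical.

```python
import numpy as np

# Hypothetical data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

# The two restrictions, sum(u_hat) = 0 and sum(u_hat * X) = 0, written as
#   sum(Y)   = n * b0      + b1 * sum(X)
#   sum(Y*X) = b0 * sum(X) + b1 * sum(X^2)
A = np.array([[n, x.sum()], [x.sum(), (x ** 2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])
b0, b1 = np.linalg.solve(A, rhs)

# Textbook formulas for comparison.
slope_formula = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
intercept_formula = y.mean() - slope_formula * x.mean()

residuals = y - b0 - b1 * x
print(b0, b1, residuals.sum(), (residuals * x).sum())
```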
5) (Requires Appendix material) Show that the two alternative formulae for the slope given in your
textbook are identical.
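$$\frac{\frac{1}{n}\sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2}
= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$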
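Answer: The numerator of the right-hand side expression can be written as follows:
$$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} X_i Y_i - \bar{X}\sum_{i=1}^{n} Y_i - \bar{Y}\sum_{i=1}^{n} X_i + n\bar{X}\bar{Y}
= \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}.$$
Multiplying out the terms in the denominator and moving the summation sign into the expression in
parentheses similarly yields $\sum_{i=1}^{n} X_i^2 - n\bar{X}^2$. Dividing both of these expressions by n then results in the
left-hand side expression.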
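$Y_i = \beta_0 + u_i$.
Answer: To derive the OLS estimator, minimize the sum of squared prediction mistakes $\sum_{i=1}^{n} (Y_i - b_0)^2$.
Taking the derivative with respect to $b_0$ results in
$$\frac{\partial}{\partial b_0} \sum_{i=1}^{n} (Y_i - b_0)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b_0} (Y_i - b_0)^2
= 2\sum_{i=1}^{n} (Y_i - b_0)(-1) = -2\sum_{i=1}^{n} (Y_i - b_0) = -2\left( \sum_{i=1}^{n} Y_i - n b_0 \right).$$
Setting the derivative to zero then results in the OLS estimator:
$$-2\left( \sum_{i=1}^{n} Y_i - n\hat\beta_0 \right) = 0 \;\Rightarrow\; \hat\beta_0 = \bar{Y}.$$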
7) (Requires Calculus) Consider the following model:
Yi = β1Xi + ui.
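Answer: To derive the OLS estimator, minimize the sum of squared prediction mistakes $\sum_{i=1}^{n} (Y_i - b_1 X_i)^2$.
Taking the derivative with respect to $b_1$ results in
$$\frac{\partial}{\partial b_1} \sum_{i=1}^{n} (Y_i - b_1 X_i)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b_1} (Y_i - b_1 X_i)^2
= 2\sum_{i=1}^{n} (Y_i - b_1 X_i)(-X_i) = -2\sum_{i=1}^{n} (Y_i X_i - b_1 X_i^2).$$
Setting the derivative to zero then results in the OLS estimator:
$$-2\left( \sum_{i=1}^{n} Y_i X_i - \hat\beta_1 \sum_{i=1}^{n} X_i^2 \right) = 0 \;\Rightarrow\; \hat\beta_1 = \frac{\sum_{i=1}^{n} Y_i X_i}{\sum_{i=1}^{n} X_i^2}.$$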
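The result for the model without an intercept can be verified with a few lines of Python. This is only a sketch on made-up numbers: it evaluates the first-order condition at the candidate estimator and confirms that nearby slope values produce a larger sum of squared prediction mistakes. The function name `ssr_origin` is hypothetical.

```python
# Hypothetical data, for illustration only.
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 3.0, 7.0, 8.0]


def ssr_origin(b1):
    """Sum of squared prediction mistakes for the model Y = b1 * X."""
    return sum((y - b1 * x) ** 2 for x, y in zip(X, Y))


# Estimator from the derivation: sum of Y*X over sum of X squared.
beta1_hat = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)

# Derivative of the objective evaluated at beta1_hat (should be zero).
first_order_condition = -2.0 * sum((y - beta1_hat * x) * x for x, y in zip(X, Y))

ssr_at_hat = ssr_origin(beta1_hat)
ssr_left = ssr_origin(beta1_hat - 0.01)
ssr_right = ssr_origin(beta1_hat + 0.01)
```

Because the objective is a quadratic in the slope with a positive coefficient on the squared term, the point where the derivative vanishes is the minimum.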
8) Show first that the regression $R^2$ is the square of the sample correlation coefficient. Next, show that the
slope of a simple regression of Y on X is only identical to the inverse of the regression slope of X on Y if
the regression $R^2$ equals one.
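Answer: The regression $R^2 = \frac{ESS}{TSS}$, where ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But
$\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i$ and $\bar{Y} = \hat\beta_0 + \hat\beta_1 \bar{X}$. Hence
$(\hat{Y}_i - \bar{Y})^2 = \hat\beta_1^2 (X_i - \bar{X})^2$ and therefore $ESS = \hat\beta_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$.
Using small letters to indicate deviations from mean, i.e., $z_i = Z_i - \bar{Z}$, we get that the regression
$R^2 = \frac{\hat\beta_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$. The square of the correlation coefficient is
$$r^2 = \frac{\left( \sum_{i=1}^{n} y_i x_i \right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2}
= \frac{\left( \sum_{i=1}^{n} y_i x_i \right)^2 \sum_{i=1}^{n} x_i^2}{\left( \sum_{i=1}^{n} x_i^2 \right)^2 \sum_{i=1}^{n} y_i^2}
= \frac{\hat\beta_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}.$$
Hence the two are the same.
Now, if the regression $R^2$ equals one, then $1 = r^2 = \hat\beta_1^2 \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$, so that
$\hat\beta_1^2 = \frac{\sum_{i=1}^{n} y_i^2}{\sum_{i=1}^{n} x_i^2}$. But $\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$ and therefore
$\hat\beta_1 = \frac{\sum_{i=1}^{n} y_i^2}{\sum_{i=1}^{n} x_i y_i}$, which is the inverse of the regression slope of X on Y.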
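A small numerical illustration of both claims, using made-up data and hypothetical variable names: the regression R-squared coincides with the squared sample correlation, and the product of the two slopes (Y on X and X on Y) equals the R-squared, so one slope is the inverse of the other only when the R-squared is one.

```python
# Hypothetical data, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
s_xx = sum((x - x_bar) ** 2 for x in X)
s_yy = sum((y - y_bar) ** 2 for y in Y)

slope_y_on_x = s_xy / s_xx  # regression of Y on X
slope_x_on_y = s_xy / s_yy  # regression of X on Y

r_squared = slope_y_on_x ** 2 * s_xx / s_yy  # ESS / TSS
corr_squared = s_xy ** 2 / (s_xx * s_yy)     # squared sample correlation

# The two slopes multiply to the R-squared; they are inverses only if it is 1.
slope_product = slope_y_on_x * slope_x_on_y
```

In this example the R-squared is below one, so the slope of Y on X differs from the inverse of the slope of X on Y.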
9) Consider the sample regression function
$Y_i = \hat\beta_0 + \hat\beta_1 X_i + \hat{u}_i$.
First, take averages on both sides of the equation. Second, subtract the resulting equation from the above
equation to write the sample regression function in deviations from means. (For simplicity, you may want
to use small letters to indicate deviations from the mean, i.e., $z_i = Z_i - \bar{Z}$.) Finally, illustrate in a
two-dimensional diagram with SSR on the vertical axis and the regression slope on the horizontal axis how
you could find the least squares estimator for the slope by varying its values through trial and error.
Answer: Taking averages results in the following equation: $\bar{Y} = \hat\beta_0 + \hat\beta_1 \bar{X}$. Subtracting this equation
from the above one, we get $y_i = \hat\beta_1 x_i + \hat{u}_i$.
$SSR = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat\beta_1 x_i)^2$ is a quadratic which takes on different values for different choices of $\hat\beta_1$
(the y and x are given in this case, i.e., different from the usual calculus problems, they cannot vary here).
You could choose a starting value of the slope and calculate SSR. Next you could choose a different value
for the slope and calculate the new SSR. There are two choices for the new slope value for you to make:
first, in which direction you want to move, and second, how large a distance you want to choose the new
slope value from the old one. (In essence, this is what sophisticated search algorithms do.) You continue
with this procedure until you find the smallest SSR. The slope coefficient which has generated this SSR is
the OLS estimator.
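The trial-and-error procedure described above can be sketched in Python. This is a deliberately naive search, not what a statistical package actually does: it moves the slope in whichever direction lowers the SSR and halves the step when neither direction helps. The data, the function names, the starting value, and the tolerance are all assumptions made for illustration.

```python
# Hypothetical data, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
x = [xi - x_bar for xi in X]  # deviations from the mean
y = [yi - y_bar for yi in Y]


def ssr(b1):
    """Sum of squared residuals for a candidate slope b1."""
    return sum((yi - b1 * xi) ** 2 for xi, yi in zip(x, y))


def search_slope(start=0.0, step=1.0, tol=1e-6):
    """Move the slope while the SSR falls; otherwise shrink the step."""
    b1 = start
    while step > tol:
        if ssr(b1 + step) < ssr(b1):
            b1 += step          # moving right lowers the SSR
        elif ssr(b1 - step) < ssr(b1):
            b1 -= step          # moving left lowers the SSR
        else:
            step /= 2.0         # neither direction helps: refine the step
    return b1


b1_search = search_slope()
b1_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
```

The slope found by the search agrees with the OLS formula up to the chosen tolerance, which illustrates that the OLS estimator sits at the bottom of the SSR parabola.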
10) Given the amount of money and effort that you have spent on your education, you wonder if it was
(is) all worth it. You therefore collect data from the Current Population Survey (CPS) and estimate a
linear relationship between earnings and the years of education of individuals. What would be the effect
on your regression slope and intercept if you measured earnings in thousands of dollars rather than in
dollars? Would the regression $R^2$ be affected? Should statistical inference be dependent on the scale of
variables? Discuss.
Answer: It should be clear that interpretation of estimated relationships and statistical inference should
not depend on the units of measurement. Otherwise whim could dictate conclusions. Hence the
regression $R^2$ and statistical inference cannot be affected. It is easy but tedious to show this
mathematically. Next, the intercept indicates the value of Y when X is zero. Since Y is now measured in
thousands of dollars, the intercept is divided by 1,000. The slope coefficient changes in the same way to
compensate for the change in the units of measurement of Y: it is also divided by 1,000. In the above case,
the decimal point will move 3 digits to the left in both coefficients.
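The effect of rescaling the dependent variable can be checked numerically. The sketch below uses invented education and earnings figures (not CPS data) and a hypothetical helper `ols`; it runs the regression once with earnings in dollars and once in thousands of dollars.

```python
def ols(X, Y):
    """Return (intercept, slope, R-squared) of a simple regression of Y on X."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    s_xx = sum((x - x_bar) ** 2 for x in X)
    s_yy = sum((y - y_bar) ** 2 for y in Y)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r_squared = slope ** 2 * s_xx / s_yy
    return intercept, slope, r_squared


# Invented numbers, for illustration only (not CPS data).
education = [10.0, 12.0, 12.0, 14.0, 16.0, 16.0, 18.0]
earnings_dollars = [21000.0, 26000.0, 30000.0, 33000.0, 41000.0, 45000.0, 52000.0]
earnings_thousands = [e / 1000.0 for e in earnings_dollars]

b0_d, b1_d, r2_d = ols(education, earnings_dollars)
b0_k, b1_k, r2_k = ols(education, earnings_thousands)
```

With earnings in thousands, both the intercept and the slope equal their dollar counterparts divided by 1,000, while the regression R-squared is identical in the two runs.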
where * indicates that the variable has been standardized. What are the units of measurement for the
dependent and explanatory variable? Why would you want to transform both variables in this way?
Show that the OLS estimator for the intercept equals zero. Next prove that the OLS estimator for the slope
in this case is identical to the formula for the least squares estimator where the variables have not been
standardized, times the ratio of the sample standard deviation of X and Y, i.e., $\hat\beta_1^* = \hat\beta_1 \frac{S_X}{S_Y}$.
Answer: The units of measurement are in standard deviations. Standardizing the variables allows
conversion into common units and allows comparison of the size of coefficients. The mean of
standardized variables is zero, and hence the OLS intercept must also be zero. The slope coefficient is
obtained from the usual OLS formula applied to the standardized variables.
Note that means of standardized variables are zero, and hence we get
$\hat\beta_1^* = \frac{\sum_{i=1}^{n} x_i^* y_i^*}{\sum_{i=1}^{n} x_i^{*2}}$. Writing this
expression in terms of originally observed variables results in
$\hat\beta_1^* = \frac{\frac{1}{S_X S_Y} \sum_{i=1}^{n} x_i y_i}{\frac{1}{S_X^2} \sum_{i=1}^{n} x_i^2}$, which is the same as the
sought after expression after simplification.
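A brief numerical check of the standardized regression, using made-up data and a hypothetical helper `ols`. The sample standard deviations come from Python's `statistics.stdev`, which divides by n - 1; any common divisor would give the same ratio.

```python
import statistics


def ols(X, Y):
    """Return (intercept, slope) of a simple regression of Y on X."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    s_xx = sum((x - x_bar) ** 2 for x in X)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical data, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

s_x = statistics.stdev(X)
s_y = statistics.stdev(Y)
X_std = [(x - statistics.mean(X)) / s_x for x in X]
Y_std = [(y - statistics.mean(Y)) / s_y for y in Y]

b0, b1 = ols(X, Y)
b0_std, b1_std = ols(X_std, Y_std)
expected_slope = b1 * s_x / s_y
```

The standardized intercept is zero up to rounding, and the standardized slope equals the original slope times the ratio of the standard deviations of X and Y.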
12) The OLS slope estimator is not defined if there is no variation in the data for the explanatory variable.
You are interested in estimating a regression relating earnings to years of schooling. Imagine that you
had collected data on earnings for different individuals, but that all these individuals had completed a
college education (16 years of education). Sketch what the data would look like and explain intuitively
why the OLS coefficient does not exist in this situation.
Answer: There is no variation in X in this case, and it is therefore unreasonable to ask by how much Y
would change if X changed by one unit. Regression analysis cannot figure out the answer to this
question, because a change in X never happens in the sample.
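The same point shows up mechanically in the slope formula: with no variation in X, both its numerator and its denominator are zero. The sketch below uses invented earnings figures and hypothetical variable names.

```python
# Everyone in this invented sample has 16 years of education.
years_of_schooling = [16.0, 16.0, 16.0, 16.0, 16.0]
earnings = [38000.0, 45000.0, 52000.0, 41000.0, 60000.0]  # hypothetical

n = len(earnings)
x_bar = sum(years_of_schooling) / n
y_bar = sum(earnings) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years_of_schooling, earnings))
s_xx = sum((x - x_bar) ** 2 for x in years_of_schooling)

try:
    slope = s_xy / s_xx
except ZeroDivisionError:
    slope = None  # the slope is zero divided by zero, i.e. not defined
```

In a scatterplot, all points would lie on a single vertical line at X = 16, and no line through them identifies a response of Y to a change in X.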
13) Indicate in a scatterplot what the data for your dependent variable and your explanatory variable
would look like in a regression with an $R^2$ equal to zero. How would this change if the regression $R^2$ was
equal to one?
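Answer: For a regression $R^2$ of zero, the data would show no linear relationship between Y and X: the
observations form a shapeless cloud of points and the fitted regression line is horizontal at $\bar{Y}$.
In the case of the regression $R^2$ being one, all observations would lie on a straight line.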
14) Imagine that you had discovered a relationship that would generate a scatterplot very similar to the
relationship Yi = , and that you would try to fit a linear regression through your data points. What do
you expect the slope coefficient to be? What do you think the value of your regression $R^2$ is in this
situation? What are the implications from your answers in terms of fitting a linear regression through a
non-linear relationship?
Answer: You would expect the fitted regression line to be flat (slope = 0) and the regression $R^2$ to be zero in this
situation. The implication is that although there may be a relationship between two variables, you may
not detect it if you use the wrong functional form.
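To make this concrete, the sketch below assumes, purely for illustration, that the non-linear relationship is a symmetric parabola, Y = X squared, observed at X values centered on zero. The relationship in the question is not reproduced here, so this choice is an assumption, as are the variable names.

```python
# Assumed relationship for illustration: Y = X**2 on X values symmetric around zero.
X = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
Y = [x ** 2 for x in X]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
s_xx = sum((x - x_bar) ** 2 for x in X)
s_yy = sum((y - y_bar) ** 2 for y in Y)

slope = s_xy / s_xx
r_squared = slope ** 2 * s_xx / s_yy
```

Y is an exact function of X here, yet the linear fit has a zero slope and a zero R-squared, because positive and negative deviations of X contribute equal and opposite amounts to the numerator of the slope.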
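15) (Requires Appendix material) A necessary and sufficient condition to derive the OLS estimator is that
the following two conditions hold: $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that these conditions imply that
$\sum_{i=1}^{n} \hat{u}_i \hat{Y}_i = 0$.
Answer: $\sum_{i=1}^{n} \hat{u}_i \hat{Y}_i = \sum_{i=1}^{n} \hat{u}_i (\hat\beta_0 + \hat\beta_1 X_i) = \hat\beta_0 \sum_{i=1}^{n} \hat{u}_i + \hat\beta_1 \sum_{i=1}^{n} \hat{u}_i X_i = 0$.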
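A numerical illustration on made-up data (variable names are hypothetical): once the residuals sum to zero and are orthogonal to X, they are also orthogonal to the fitted values, since the fitted values are a linear function of X.

```python
# Hypothetical data, for illustration only.
X = [2.0, 4.0, 6.0, 8.0]
Y = [1.0, 3.0, 2.0, 5.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
beta1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
         / sum((x - x_bar) ** 2 for x in X))
beta0 = y_bar - beta1 * x_bar

fitted = [beta0 + beta1 * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

sum_u = sum(residuals)
sum_u_x = sum(u * x for u, x in zip(residuals, X))
sum_u_fitted = sum(u * f for u, f in zip(residuals, fitted))
```

All three sums are zero up to floating-point rounding.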
16) The help function for a commonly used spreadsheet program gives the following definition for the
regression slope it estimates:
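$$\frac{n \sum_{i=1}^{n} X_i Y_i - \left( \sum_{i=1}^{n} X_i \right) \left( \sum_{i=1}^{n} Y_i \right)}{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}$$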
Prove that this formula is the same as the one given in the textbook.
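Answer:
$$\frac{n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}
= \frac{n \sum_{i=1}^{n} X_i Y_i - n\bar{X}\, n\bar{Y}}{n \sum_{i=1}^{n} X_i^2 - (n\bar{X})^2}
= \frac{n \left( \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} \right)}{n \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)}.$$
Dividing both numerator and denominator by n then gives you the desired result,
$\frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$.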
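The equivalence can also be confirmed numerically. The sketch below codes the spreadsheet-style formula and the textbook formula separately and applies both to the same made-up data; the variable names are hypothetical.

```python
# Hypothetical data, for illustration only.
X = [3.0, 5.0, 7.0, 9.0, 11.0]
Y = [10.0, 14.0, 15.0, 20.0, 22.0]

n = len(X)
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_xx = sum(x * x for x in X)

# Spreadsheet-style formula based on raw sums.
spreadsheet_slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)

# Textbook formula based on deviations from means.
x_bar = sum_x / n
y_bar = sum_y / n
textbook_slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
                  / sum((x - x_bar) ** 2 for x in X))
```

The raw-sum version avoids computing deviations observation by observation, which is presumably why a spreadsheet help page would state it in that form.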
17) In order to calculate the slope, the intercept, and the regression $R^2$ for a simple sample regression
function, list the five sums of data that you need.
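Answer: Depending on whether or not the data is in deviations from means ($z_i = Z_i - \bar{Z}$ or $Z_i$, say),
you need the following sums:
$\sum_{i=1}^{n} Y_i$, $\sum_{i=1}^{n} X_i$, $\sum_{i=1}^{n} x_i y_i$, $\sum_{i=1}^{n} y_i^2$, $\sum_{i=1}^{n} x_i^2$
(or $\sum_{i=1}^{n} X_i Y_i$, $\sum_{i=1}^{n} Y_i^2$, $\sum_{i=1}^{n} X_i^2$ for the last three if the data is not in deviation form).
Using these five sums, you can calculate the slope
$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$, the
intercept $\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X}$, and the regression
$$R^2 = \frac{\hat\beta_1 \sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} y_i^2} = \frac{\hat\beta_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}.$$
Alternatively, if the data is not given in deviation form, the formulae are as follows:
$\hat\beta_1 = \frac{\sum_{i=1}^{n} Y_i X_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}$, and for the regression
$$R^2 = \frac{\hat\beta_1 \left( \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} \right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}
= \frac{\hat\beta_1^2 \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}.$$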
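The raw-data version of these formulae translates directly into a small function. The sketch below is one possible arrangement, with a hypothetical function name and made-up data; it takes the sample size and the five sums and returns the intercept, the slope, and the regression R-squared.

```python
def regression_from_sums(n, sum_y, sum_x, sum_xy, sum_yy, sum_xx):
    """Intercept, slope and R-squared from the sample size and five raw sums."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    s_xy = sum_xy - n * x_bar * y_bar  # sum of x_i * y_i in deviation form
    s_xx = sum_xx - n * x_bar ** 2     # sum of x_i squared in deviation form
    s_yy = sum_yy - n * y_bar ** 2     # sum of y_i squared in deviation form
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r_squared = slope * s_xy / s_yy
    return intercept, slope, r_squared


# Hypothetical data, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope, r_squared = regression_from_sums(
    n=len(X),
    sum_y=sum(Y),
    sum_x=sum(X),
    sum_xy=sum(x * y for x, y in zip(X, Y)),
    sum_yy=sum(y * y for y in Y),
    sum_xx=sum(x * x for x in X),
)
```

No fitted values or residuals are needed at any point, which is the practical appeal of working from the five sums.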
18) A peer of yours, who is a major in another social science, says he is not interested in the regression
slope and/or intercept. Instead he only cares about correlations. For example, in the testscore/student-
teacher ratio regression, he claims to get all the information he needs from the negative correlation
coefficient corr(X,Y)=-0.226. What response might you have for your peer?
Answer: First of all, the regression slope is related to the regression $R^2$, and hence to its square root, the
correlation coefficient (in absolute value), since
$$R^2 = \frac{\hat\beta_1 \left( \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} \right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}
= \frac{\hat\beta_1^2 \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}.$$
However, while the correlation coefficient tells you something about the direction and strength of the
relationship between two variables, it does not inform you about the effect of a one-unit increase in the
explanatory variable. Hence it cannot answer the question of whether or not the relationship is important
(although even with the knowledge of the slope coefficient, this requires further information). Your friend
would not be able to answer the questions that policy makers and researchers are typically interested in,
such as: what would be the effect on test scores of a reduction in the student-teacher ratio by one?
19) Assume that there is a change in the units of measurement on both Y and X. The new variables are
Y*= aY and X* = bX. What effect will this change have on the regression slope?
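Answer: We now have the following sample regression function $\hat{Y}^* = \hat\beta_0^* + \hat\beta_1^* X^*$. The formula for the
slope will be
$$\hat\beta_1^* = \frac{\sum_{i=1}^{n} x_i^* y_i^*}{\sum_{i=1}^{n} x_i^{*2}} = \frac{\sum_{i=1}^{n} (b x_i)(a y_i)}{\sum_{i=1}^{n} (b x_i)^2}
= \frac{ab \sum_{i=1}^{n} x_i y_i}{b^2 \sum_{i=1}^{n} x_i^2} = \frac{a}{b} \hat\beta_1.$$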
20) Assume that there is a change in the units of measurement on X. The new variable is X* = bX. Prove
that this change in the units of measurement on the explanatory variable has no effect on the intercept in
the resulting regression.
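Answer: Consider the sample regression function $\hat{Y} = \hat\beta_0^* + \hat\beta_1^* X^*$. The formula for the intercept will be
$\hat\beta_0^* = \bar{Y} - \hat\beta_1^* b \bar{X}$. But
$$\hat\beta_1^* = \frac{\sum_{i=1}^{n} x_i^* y_i}{\sum_{i=1}^{n} x_i^{*2}} = \frac{\sum_{i=1}^{n} b x_i y_i}{\sum_{i=1}^{n} (b x_i)^2}
= \frac{b \sum_{i=1}^{n} x_i y_i}{b^2 \sum_{i=1}^{n} x_i^2} = \frac{1}{b} \hat\beta_1.$$
Hence $\hat\beta_0^* = \bar{Y} - \frac{1}{b} \hat\beta_1 b \bar{X} = \bar{Y} - \hat\beta_1 \bar{X} = \hat\beta_0$.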
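The rescaling results in questions 19 and 20 can be checked together on made-up data. In the sketch below the scale factors a and b, the data, and the helper `ols` are all hypothetical choices for illustration.

```python
def ols(X, Y):
    """Return (intercept, slope) of a simple regression of Y on X."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    s_xx = sum((x - x_bar) ** 2 for x in X)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical data and scale factors, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = 100.0, 12.0

b0, b1 = ols(X, Y)

# Question 19: rescale both variables, Y* = aY and X* = bX.
b0_both, b1_both = ols([b * x for x in X], [a * y for y in Y])

# Question 20: rescale only the explanatory variable, X* = bX.
b0_x_only, b1_x_only = ols([b * x for x in X], Y)
```

Rescaling both variables multiplies the slope by a/b, while rescaling X alone divides the slope by b and leaves the intercept unchanged.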
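The regression $R^2 = \frac{ESS}{TSS}$, where ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But
$\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i$ and $\bar{Y} = \hat\beta_0 + \hat\beta_1 \bar{X}$.
Hence $(\hat{Y}_i - \bar{Y})^2 = \hat\beta_1^2 (X_i - \bar{X})^2$ and therefore $ESS = \hat\beta_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. Using small letters to indicate
deviations from mean, i.e., $z_i = Z_i - \bar{Z}$, we get that the regression
$R^2 = \frac{\hat\beta_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$. The square of the
correlation coefficient is
$$r^2 = \frac{\left( \sum_{i=1}^{n} y_i x_i \right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2}
= \frac{\left( \sum_{i=1}^{n} y_i x_i \right)^2 \sum_{i=1}^{n} x_i^2}{\left( \sum_{i=1}^{n} x_i^2 \right)^2 \sum_{i=1}^{n} y_i^2}
= \frac{\hat\beta_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}.$$
Hence the two are the same.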
22) At the Stock and Watson (http://www.pearsonhighered.com/stock_watson) website, go to Student
Resources and select the option "Datasets for Replicating Empirical Results." Then select the "California
Test Score Data Used in Chapters 4-9" and read the data either into Excel or STATA (or another statistical
program).
Run a regression of the average reading score (read_scr) on the average math score (math_scr). What
values for the slope and the intercept would you expect? Interpret the coefficients in the resulting
regression output and the regression $R^2$.
Answer: On average, it would seem plausible, a priori, that schools which score high on the math score
would also do well in the reading score. Perhaps an underlying variable, such as genes, parental interest,
or the quality of teachers, is driving results in both. The relationship is close to the 45 degree line, where
the intercept would be zero and the slope would be one. Interpreted literally, 85 percent of the variation
in the reading score is explained by our model.
23) In a simple regression with an intercept and a single explanatory variable, the variation in Y
($TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$) can be decomposed into the explained sums of squares
($ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$) and the
sum of squared residuals ($SSR = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$) (see, for example, equation (4.35) in the textbook).
Consider any regression line, positively or negatively sloped in {X,Y} space. Draw a horizontal line
where, hypothetically, you consider the sample mean of Y ($\bar{Y}$) to be. Next add a single actual
observation of Y.
In this graph, indicate where you find the following distances: the
(i) residual
(ii) actual minus the mean of Y
(iii) fitted value minus the mean of Y
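Answer: Each distance is measured vertically at the X value of the observation:
(i) the residual is the distance between the actual observation and the regression line;
(ii) actual minus the mean of Y is the distance between the actual observation and the horizontal line at $\bar{Y}$;
(iii) fitted value minus the mean of Y is the distance between the regression line and the horizontal line at $\bar{Y}$.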