4.1 Multiple Choice: Chapter 4 Linear Regression With One Regressor

Introduction to Econometrics, 3e (Stock)

Chapter 4 Linear Regression with One Regressor

4.1 Multiple Choice

1) When the estimated slope coefficient in the simple regression model, $\hat{\beta}_1$, is zero, then
A) R2 = $\bar{Y}$.
B) 0 < R2 < 1.
C) R2 = 0.
D) R2 > (SSR/TSS).
Answer: C

2) The regression R2 is defined as follows:
A) $\frac{ESS}{TSS}$
B) $\frac{RSS}{TSS}$
C) $\frac{\sum_{i=1}^{n}(Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2 \sum_{i=1}^{n}(X_i - \bar{X})^2}$
D) $\frac{SSR}{n-2}$
Answer: A

3) The standard error of the regression (SER) is defined as follows
A) $\sqrt{\frac{1}{n-2}\sum_{i=1}^{n}\hat{u}_i^2}$
B) SSR
C) 1-R2
D) $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\hat{u}_i^2}$
Answer: A

4) (Requires Appendix material) Which of the following statements is correct?


A) TSS = ESS + SSR
B) ESS = SSR + TSS
C) ESS > TSS
D) R2 = 1 - (ESS/TSS)
Answer: A

Copyright © 2011 Pearson Education, Inc.
5) Binary variables
A) are generally used to control for outliers in your sample.
B) can take on more than two values.
C) exclude certain individuals from your sample.
D) can take on only two values.
Answer: D

6) The following are all least squares assumptions with the exception of:
A) The conditional distribution of ui given Xi has a mean of zero.
B) The explanatory variable in regression model is normally distributed.
C) (Xi, Yi), i = 1,..., n are independently and identically distributed.
D) Large outliers are unlikely.
Answer: B

7) The reason why estimators have a sampling distribution is that


A) economics is not a precise science.
B) individuals respond differently to incentives.
C) in real life you typically get to sample many times.
D) the values of the explanatory variable and the error term differ across samples.
Answer: D

8) In the simple linear regression model, the regression slope


A) indicates by how many percent Y increases, given a one percent increase in X.
B) when multiplied with the explanatory variable will give you the predicted Y.
C) indicates by how many units Y increases, given a one unit increase in X.
D) represents the elasticity of Y on X.
Answer: C

9) The OLS estimator is derived by


A) connecting the Yi corresponding to the lowest Xi observation with the Yi corresponding to the highest
Xi observation.
B) making sure that the standard error of the regression equals the standard error of the slope estimator.
C) minimizing the sum of absolute residuals.
D) minimizing the sum of squared residuals.
Answer: D

10) Interpreting the intercept in a sample regression function is


A) not reasonable because you never observe values of the explanatory variables around the origin.
B) reasonable because under certain conditions the estimator is BLUE.
C) reasonable if your sample contains values of Xi around the origin.
D) not reasonable because economists are interested in the effect of a change in X on the change in Y.
Answer: C

11) The variance of Yi is given by
A) $\beta_0^2 + \beta_1^2 \mathrm{var}(X_i) + \mathrm{var}(u_i)$.
B) the variance of ui.
C) $\beta_1^2 \mathrm{var}(X_i) + \mathrm{var}(u_i)$.
D) the variance of the residuals.
Answer: C

12) (Requires Appendix) The sample average of the OLS residuals is


A) some positive number since OLS uses squares.
B) zero.
C) unobservable since the population regression function is unknown.
D) dependent on whether the explanatory variable is mostly positive or negative.
Answer: B

13) The OLS residuals, $\hat{u}_i$, are defined as follows:
A) $\hat{Y}_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$
B) $Y_i - \beta_0 - \beta_1 X_i$
C) $Y_i - \hat{Y}_i$
D) $(Y_i - \bar{Y})^2$
Answer: C

14) The slope estimator, β1, has a smaller standard error, other things equal, if
A) there is more variation in the explanatory variable, X.
B) there is a large variance of the error term, u.
C) the sample size is smaller.
D) the intercept, β0, is small.
Answer: A

15) The regression R2 is a measure of


A) whether or not X causes Y.
B) the goodness of fit of your regression line.
C) whether or not ESS > TSS.
D) the square of the determinant of R.
Answer: B

16) (Requires Appendix) The sample regression line estimated by OLS


A) will always have a slope smaller than the intercept.
B) is exactly the same as the population regression line.
C) cannot have a slope of zero.
D) will always run through the point ($\bar{X}$, $\bar{Y}$).
Answer: D
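
The algebraic properties behind questions 4, 12, 16, and 21 can be checked numerically. The following is a minimal sketch in plain Python; the five data points are made up for illustration and do not come from the textbook.

```python
# Illustrative data (made up for this sketch, not from the textbook).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope: sample covariance of X and Y divided by the sample variance of X.
# The 1/(n-1) factors cancel, so sums of deviations are enough.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

tss = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
ess = sum((fi - y_bar) ** 2 for fi in fitted)  # explained sum of squares
ssr = sum(u ** 2 for u in residuals)           # sum of squared residuals

mean_residual = sum(residuals) / n             # zero up to rounding error
fitted_at_x_bar = intercept + slope * x_bar    # equals y_bar
r_squared = ess / tss
```

Up to floating-point rounding, `mean_residual` is zero, `fitted_at_x_bar` equals `y_bar`, and `tss` equals `ess + ssr`.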

17) The OLS residuals
A) can be calculated using the errors from the regression function.
B) can be calculated by subtracting the fitted values from the actual values.
C) are unknown since we do not know the population regression function.
D) should not be used in practice since they indicate that your regression does not run through all your
observations.
Answer: B

18) The normal approximation to the sampling distribution of $\hat{\beta}_1$ is powerful because


A) many explanatory variables in real life are normally distributed.
B) it allows econometricians to develop methods for statistical inference.
C) many other distributions are not symmetric.
D) it implies that OLS is the BLUE estimator for β1.
Answer: B

19) If the three least squares assumptions hold, then the large sample normal distribution of $\hat{\beta}_1$ is
A) $N\left(0, \frac{1}{n}\frac{\mathrm{var}[(X_i - \mu_X)u_i]}{[\mathrm{var}(X_i)]^2}\right)$.
B) $N\left(\beta_1, \frac{1}{n}\frac{\mathrm{var}[(X_i - \mu_X)u_i]}{[\mathrm{var}(X_i)]^2}\right)$.
C) $N\left(\beta_1, \frac{\sigma_u^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right)$.
D) $N\left(\beta_1, \frac{1}{n}\frac{\mathrm{var}(u_i)}{[\mathrm{var}(X_i)]^2}\right)$.
Answer: B

20) In the simple linear regression model Yi = β0 + β1Xi + ui,


A) the intercept is typically small and unimportant.
B) β0 + β1Xi represents the population regression function.
C) the absolute value of the slope is typically between 0 and 1.
D) β0 + β1Xi represents the sample regression function.
Answer: B

21) To obtain the slope estimator using the least squares principle, you divide the
A) sample variance of X by the sample variance of Y.
B) sample covariance of X and Y by the sample variance of Y.
C) sample covariance of X and Y by the sample variance of X.
D) sample variance of X by the sample covariance of X and Y.
Answer: C

22) To decide whether or not the slope coefficient is large or small,
A) you should analyze the economic importance of a given increase in X.
B) the slope coefficient must be larger than one.
C) the slope coefficient must be statistically significant.
D) you should change the scale of the X variable if the coefficient appears to be too small.
Answer: A

23) $E(u_i \mid X_i) = 0$ says that


A) dividing the error by the explanatory variable results in a zero (on average).
B) the sample regression function residuals are unrelated to the explanatory variable.
C) the sample mean of the Xs is much larger than the sample mean of the errors.
D) the conditional distribution of the error given the explanatory variable has a zero mean.
Answer: D

24) In the linear regression model, Yi = β0 + β1Xi + ui, β0 + β1Xi is referred to as


A) the population regression function.
B) the sample regression function.
C) exogenous variation.
D) the right-hand variable or regressor.
Answer: A

25) Multiplying the dependent variable by 100 and the explanatory variable by 100,000 leaves the
A) OLS estimate of the slope the same.
B) OLS estimate of the intercept the same.
C) regression R2 the same.
D) variance of the OLS estimators the same.
Answer: C

26) Assume that you have collected a sample of observations from over 100 households and their
consumption and income patterns. Using these observations, you estimate the following regression Ci =
β0+β1Yi+ ui where C is consumption and Y is disposable income. The estimate of β1 will tell you
A) $\frac{\Delta Income}{\Delta Consumption}$
B) The amount you need to consume to survive
C) $\frac{Income}{Consumption}$
D) $\frac{\Delta Consumption}{\Delta Income}$
Answer: D

27) In which of the following relationships does the intercept have a real-world interpretation?
A) the relationship between the change in the unemployment rate and the growth rate of real GDP
("Okun's Law")
B) the demand for coffee and its price
C) test scores and class-size
D) weight and height of individuals
Answer: A

28) The OLS residuals, $\hat{u}_i$, are sample counterparts of the population


A) regression function slope
B) errors
C) regression function's predicted values
D) regression function intercept
Answer: B

29) Changing the units of measurement, e.g. measuring testscores in 100s, will do all of the following
EXCEPT for changing the
A) residuals
B) numerical value of the slope estimate
C) interpretation of the effect that a change in X has on the change in Y
D) numerical value of the intercept
Answer: C

30) To decide whether the slope coefficient indicates a "large" effect of X on Y, you look at the
A) size of the slope coefficient
B) regression R2
C) economic importance implied by the slope coefficient
D) value of the intercept
Answer: C

4.2 Essays and Longer Questions

1) Sir Francis Galton, a cousin of Charles Darwin, examined the relationship between the height of children
and their parents towards the end of the 19th century. It is from this study that the name "regression"
originated. You decide to update his findings by collecting data from 110 college students, and estimate
the following relationship:

$\widehat{Studenth} = 19.6 + 0.73 \times Midparh$, R2 = 0.45, SER = 2.0

where Studenth is the height of students in inches, and Midparh is the average of the parental heights.
(Following Galton's methodology, both variables were adjusted so that the average female height was
equal to the average male height.)
(a) Interpret the estimated coefficients.
(b) What is the meaning of the regression R2?
(c) What is the prediction for the height of a child whose parents have an average height of 70.06 inches?
(d) What is the interpretation of the SER here?
(e) Given the positive intercept and the fact that the slope lies between zero and one, what can you say
about the height of students who have quite tall parents? Those who have quite short parents?
(f) Galton was concerned about the height of the English aristocracy and referred to the above result as
"regression towards mediocrity." Can you figure out what his concern was? Why do you think that we
refer to this result today as "Galton's Fallacy"?
Answer:
(a) For every one inch increase in the average height of their parents, the student's height increases by
0.73 of an inch. There is no reasonable interpretation for the intercept.
(b) The model explains 45 percent of the variation in the height of students.
(c) 19.6 + 0.73 × 70.06 = 70.74.
(d) The SER is a measure of the spread of the observations around the regression line. The magnitude of
the typical deviation from the regression line or the typical regression error here is two inches.
(e) Tall parents will have, on average, tall students, but they will not be as tall as their parents. Short
parents will have short students, although on average, they will be somewhat taller than their parents.
(f) This is an example of mean reversion. Since the aristocracy was, on average, taller, he was concerned
that their children would be shorter and resemble more the rest of the population. If this conclusion were
true, then eventually everyone would be of the same height. However, we have not observed a decrease
in the variance in height over time.
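
The arithmetic in parts (c) and (e) can be reproduced with a short sketch. The coefficients are the ones reported in the question; the "tall" and "short" parental heights below are hypothetical values chosen only to illustrate regression toward the mean.

```python
# Coefficients as reported in the question: Studenth-hat = 19.6 + 0.73 * Midparh.
INTERCEPT = 19.6
SLOPE = 0.73


def predict_student_height(midparent_height):
    """Predicted student height in inches for a given mid-parent height."""
    return INTERCEPT + SLOPE * midparent_height


# Part (c): parents with an average height of 70.06 inches.
prediction_c = predict_student_height(70.06)

# Height at which the predicted student height equals the mid-parent height,
# i.e. the solution of h = 19.6 + 0.73 * h.
fixed_point = INTERCEPT / (1 - SLOPE)

# Part (e): hypothetical tall and short parents (illustrative values only).
tall, short = 76.0, 64.0
pred_tall = predict_student_height(tall)
pred_short = predict_student_height(short)
```

With these coefficients the crossover point is a bit above 72 inches: parents taller than that are predicted to have children shorter than themselves, and parents shorter than that are predicted to have children taller than themselves.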

2) (Requires Appendix material) At a recent county fair, you observed that at one stand people's weight
was forecasted, and were surprised by the accuracy (within a range). Thinking about how the person
could have predicted your weight fairly accurately (despite the fact that she did not know about your
"heavy bones"), you think about how this could have been accomplished. You remember that medical
charts for children contain 5%, 25%, 50%, 75% and 95% lines for a weight/height relationship and decide
to conduct an experiment with 110 of your peers. You collect the data and calculate the following sums:

where the height is measured in inches and weight in pounds. (Small letters refer to deviations from
means, as in $z_i = Z_i - \bar{Z}$.)
(a) Calculate the slope and intercept of the regression and interpret these.
(b) Find the regression R2 and explain its meaning. What other factors can you think of that might have
an influence on the weight of an individual?
Answer:
(a) $\hat{\beta}_1 = \frac{7625.9}{1{,}248.9} = 6.11$, $\hat{\beta}_0 = 157.95 - 6.11 \times 69.69 = -267.86$. For every
additional inch in height, students weigh roughly 6 pounds more, on average.
(b) $R^2 = \frac{ESS}{TSS} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2} = \frac{46{,}624.1}{94{,}228.8} = 0.495$.
Roughly half of the weight variation in the 110 students is explained by the single explanatory variable,
height. Answers will vary by student for the other factors, but calorie intake and amount of exercise
typically appear as part of the list.
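
As a check on the answer, the sketch below recomputes the slope, intercept, and R2 from the sums and means that appear in the answer itself. It follows the answer key in rounding the slope to two decimals before computing the intercept and the ESS; the variable names are illustrative.

```python
# Quantities taken from the answer to parts (a) and (b).
sum_xy = 7625.9       # sum of x_i * y_i (deviations from means)
sum_xx = 1248.9       # sum of x_i^2 (height deviations, squared)
sum_yy = 94228.8      # sum of y_i^2 (weight deviations, squared), i.e. TSS
mean_weight = 157.95  # pounds
mean_height = 69.69   # inches

slope_exact = sum_xy / sum_xx

# The answer key rounds the slope to two decimals before going on.
slope = round(slope_exact, 2)
intercept = mean_weight - slope * mean_height

ess = slope ** 2 * sum_xx   # explained sum of squares
r_squared = ess / sum_yy
```

Carrying the unrounded slope through instead changes the intercept and R2 only slightly.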

3) You have obtained a sub-sample of 1744 individuals from the Current Population Survey (CPS) and
are interested in the relationship between weekly earnings and age. The regression, using
heteroskedasticity-robust standard errors, yielded the following result:

$\widehat{Earn} = 239.16 + 5.20 \times Age$, R2 = 0.05, SER = 287.21,

where Earn and Age are measured in dollars and years respectively.
(a) Interpret the results.
(b) Is the effect of age on earnings large?
(c) Why should age matter in the determination of earnings? Do the results suggest that there is a
guarantee for earnings to rise for everyone as they become older? Do you think that the relationship
between age and earnings is linear?
(d) The average age in this sample is 37.5 years. What is average annual income in the sample?
(e) Interpret the measures of fit.
Answer:
(a) On average, a person who is one year older earns $5.20 more per week. There is no meaning attached
to the intercept. The regression explains 5 percent of the variation in earnings.
(b) Assuming that people worked 52 weeks a year, the effect of being one year older translates into an
additional $270.40 a year. This does not seem particularly large in 2002 dollars, but may have been earlier.
(c) In general, age-earnings profiles take on an inverted U-shape. Hence it is not linear and the linear
approximation may not be good at all. Age may be a proxy for "experience," which in itself can
approximate "on the job training." Hence the positive effect between age and earnings. The results do not
suggest that there is a guarantee for earnings to rise for everyone as they become older since the
regression R2 does not equal 1. Instead the result holds "on average."
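(d) Since $\hat{\beta}_0 = \overline{Earn} - \hat{\beta}_1 \overline{Age}$ ⇒ $\overline{Earn} = \hat{\beta}_0 + \hat{\beta}_1 \overline{Age}$. Substituting the estimates
for the slope and the intercept then results in average weekly earnings of $434.16 or annual average
earnings of $22,576.32.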
(e) The regression R2 indicates that five percent of the variation in earnings is explained by the model.
The typical error is $287.21.
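
The numbers in parts (b) and (d) follow from the fact that the OLS line passes through the sample means. A minimal sketch, assuming 52 working weeks per year as the answer does:

```python
intercept = 239.16    # dollars per week, from the reported regression
slope = 5.20          # dollars per week per additional year of age
mean_age = 37.5       # sample average age, given in part (d)
weeks_per_year = 52   # assumption made in the answer to part (b)

# The regression line runs through the point of means.
mean_weekly_earnings = intercept + slope * mean_age
mean_annual_earnings = mean_weekly_earnings * weeks_per_year

# Annual earnings difference associated with being one year older.
annual_effect_of_one_year = slope * weeks_per_year
```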

4) The baseball team nearest to your home town is, once again, not doing well. Given that your
knowledge of what it takes to win in baseball is vastly superior to that of management, you want to find
out what it takes to win in Major League Baseball (MLB). You therefore collect the winning percentage of
all 30 baseball teams in MLB for 1999 and regress the winning percentage on what you consider the
primary determinant for wins, which is quality pitching (team earned run average). You find the
following information on team performance:

Summary of the Distribution of Winning Percentage and Team Earned Run Average for MLB in 1999

                     Average  Standard   Percentile
                              deviation  10%   25%   40%   50% (median)  60%   75%   90%
Team ERA             4.71     0.53       3.84  4.35  4.72  4.78          4.91  5.06  5.25
Winning Percentage   0.50     0.08       0.40  0.43  0.46  0.48          0.49  0.59  0.60

(a) What is your expected sign for the regression slope? Will it make sense to interpret the intercept? If
not, should you omit it from your regression and force the regression line through the origin?
(b) OLS estimation of the relationship between the winning percentage and the team ERA yield the
following:

$\widehat{Winpct} = 0.94 - 0.10 \times teamera$, R2 = 0.49, SER = 0.06,

where winpct is measured as wins divided by games played, so for example a team that won half of its
games would have Winpct = 0.50. Interpret your regression results.
(c) It is typically sufficient to win 90 games to be in the playoffs and/or to win a division. Winning over
100 games a season is exceptional: the Atlanta Braves had the most wins in 1999 with 103. Teams play a
total of 162 games a year. Given this information, do you consider the slope coefficient to be large or
small?
(d) What would be the effect on the slope, the intercept, and the regression R2 if you measured Winpct in
percentage points, i.e., as (Wins/Games) × 100?
(e) Are you impressed with the size of the regression R2? Given that there is 51% of unexplained variation
in the winning percentage, what might some of these factors be?

Answer:
(a) You expect a negative relationship, since a higher team ERA implies a lower quality of the input. No
team comes close to a zero team ERA, and therefore it does not make sense to interpret the intercept.
Forcing the regression through the origin is a false implication from this insight. Instead the intercept
fixes the level of the regression.
(b) For every one point increase in Team ERA, the winning percentage decreases by 10 percentage points,
or 0.10. Roughly half of the variation in winning percentage is explained by the quality of team pitching.
(c) The coefficient is large, since increasing the winning percentage by 0.10 is the equivalent of winning 16
more games per year. Since it is typically sufficient to win 56 percent of the games to qualify for the
playoffs, this difference of 0.10 in winning percentage can easily turn a losing team into a winning
team.
(d) Clearly the regression R2 will not be affected by a change in scale, since a descriptive measure of the
quality of the regression would depend on whim otherwise. The slope of the regression will compensate
in such a way that the interpretation of the result is unaffected, i.e., it will become -10 in the above
example. The intercept will also change to reflect the fact that if X were 0, then the dependent variable
would now be measured in percentage, i.e., it will become 94.0 in the above example.
(e) It is impressive that a single variable can explain roughly half of the variation in winning percentage.
Answers to the second question will vary by student, but will typically include the quality of hitting,
fielding, and management. Salaries could be included, but should be reflected in the inputs.
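
The arithmetic behind parts (c) and (d) is sketched below. The intercept of 0.94 is an assumption consistent with the value of 94.0 quoted in the answer to part (d); the slope and the 162-game season come from the question.

```python
intercept = 0.94   # assumed; implied by the rescaled intercept of 94.0 in answer (d)
slope = -0.10      # change in winning percentage per one-point increase in team ERA
games = 162        # games per season
scale = 100        # part (d): measure Winpct in percentage points

# Part (c): wins associated with a one-point change in team ERA,
# and the share of games won by a 90-win team.
games_per_era_point = abs(slope) * games
playoff_share = 90 / games

# Part (d): multiplying the dependent variable by a constant multiplies
# both coefficients by that constant (R2 is unaffected).
new_slope = slope * scale
new_intercept = intercept * scale
```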

5) You have learned in one of your economics courses that one of the determinants of per capita income
(the "Wealth of Nations") is the population growth rate. Furthermore you also found out that the Penn
World Tables contain income and population data for 104 countries of the world. To test this theory, you
regress the GDP per worker (relative to the United States) in 1990 (RelPersInc) on the difference between
the average population growth rate of that country (n) to the U.S. average population growth rate (nus )
for the years 1980 to 1990. This results in the following regression output:

$\widehat{RelPersInc} = 0.518 - 18.831 \times (n - n_{us})$, R2 = 0.522, SER = 0.197

(a) Interpret the results carefully. Is this relationship economically important?


(b) What would happen to the slope, intercept, and regression R2 if you ran another regression where the
above explanatory variable was replaced by n only, i.e., the average population growth rate of the
country? (The population growth rate of the United States from 1980 to 1990 was 0.009.) Should this have
any effect on the t-statistic of the slope?
(c) 31 of the 104 countries have a dependent variable of less than 0.10. Does it therefore make sense to
interpret the intercept?
Answer:
(a) A relative increase in the population rate of one percentage point, from 0.01 to 0.02, say, lowers
relative per-capita income by almost 20 percentage points (0.188). This is a quantitatively important and
large effect. Nations which have the same population growth rate as the United States have, on average,
roughly half as much per capita income.
(b) The interpretation of the partial derivative is unaffected, in that the slope still indicates the effect of a
one percentage point increase in the population growth rate. The regression R2 will remain the same
since only a constant was removed from the explanatory variable. The intercept will change as a result of
the change in $\bar{X}$.
(c) To interpret the intercept, you must observe values of X close to zero, not Y.
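
The effect of replacing (n - nus) by n in part (b) can be worked out directly from the reported coefficients, as in this sketch. The function names are illustrative; the U.S. growth rate of 0.009 is the value given in the question.

```python
B0, B1 = 0.518, -18.831   # reported intercept and slope
N_US = 0.009              # U.S. population growth rate, 1980 to 1990


def predict_original(n):
    """Prediction from the regression on (n - n_us)."""
    return B0 + B1 * (n - N_US)


# Regressing on n alone: the slope is unchanged and the intercept
# absorbs the constant, since B0 + B1*(n - N_US) = (B0 - B1*N_US) + B1*n.
new_slope = B1
new_intercept = B0 - B1 * N_US


def predict_shifted(n):
    """Prediction from the regression on n only."""
    return new_intercept + new_slope * n


# Part (a): effect of a one percentage point rise in relative population growth.
effect_of_one_point = B1 * 0.01
```

Both forms give identical fitted values, which is why the residuals, the R2, and the slope's standard error are unaffected by the shift.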

6) The neoclassical growth model predicts that for identical savings rates and population growth rates,
countries should converge to the same per capita income level. This is referred to as the convergence
hypothesis. One way to test for the presence of convergence is to compare the growth rates over time to
the initial starting level.

(a) If you regressed the average growth rate over a time period (1960-1990) on the initial level of per
capita income, what would the sign of the slope have to be to indicate this type of convergence? Explain.
Would this result confirm or reject the prediction of the neoclassical growth model?
(b) The results of the regression for 104 countries were as follows:

$\widehat{g6090} = 0.019 - 0.0006 \times RelProd60$, R2 = 0.00007, SER = 0.016,

where g6090 is the average annual growth rate of GDP per worker for the 1960-1990 sample period, and
RelProd60 is GDP per worker relative to the United States in 1960.
Interpret the results. Is there any evidence of unconditional convergence between the countries of the
world? Is this result surprising? What other concept could you think about to test for convergence
between countries?
(c) You decide to restrict yourself to the 24 OECD countries in the sample. This changes your regression
output as follows:

$\widehat{g6090} = 0.048 - 0.0404 \times RelProd60$, R2 = 0.82, SER = 0.0046

How does this result affect your conclusions from above?

Answer:
(a) You would require a negative sign. Countries that are far ahead of others at the beginning of the
period would have to grow relatively slower for the others to catch up. This represents unconditional
convergence, whereas the neoclassical growth model predicts conditional convergence, i.e., there will
only be convergence if countries have identical savings, population growth rates, and production
technology.
(b) An increase in 10 percentage points in RelProd60 results in a decrease of 0.00006 in the growth rate
from 1960 to 1990, i.e., countries that were further ahead in 1960 do grow by less. There are some
countries in the sample that have a value of RelProd60 close to zero (China, Uganda, Togo, Guinea) and
you would expect these countries to grow roughly by 2 percent per year over the sample period. The
regression R2 indicates that the regression has virtually no explanatory power. The result is not
surprising given that there are not many theories that predict unconditional convergence between the
countries of the world.
(c) Judging by the size of the slope coefficient, there is strong evidence of unconditional convergence for
the OECD countries. The regression R2 is quite high, given that there is only a single explanatory variable
in the regression. However, since we do not know the sampling distribution of the estimator in this case,
we cannot conduct inference.

7) In 2001, the Arizona Diamondbacks defeated the New York Yankees in the Baseball World Series in 7
games. Some players, such as Bautista and Finley for the Diamondbacks, had a substantially higher
batting average during the World Series than during the regular season. Others, such as Brosius and Jeter
for the Yankees, did substantially poorer. You set out to investigate whether or not the regular season
batting average is a good indicator for the World Series batting average. The results for 11 players who
had the most at bats for the two teams are:

$\widehat{AZWsavg} = -0.347 + 2.290 \times AZSeasavg$, R2 = 0.11, SER = 0.145,

$\widehat{NYWsavg} = 0.134 + 0.136 \times NYSeasavg$, R2 = 0.001, SER = 0.092,

where Wsavg and Seasavg indicate the batting average during the World Series and the regular season
respectively.
(a) Focusing on the coefficients first, what is your interpretation?
(b) What can you say about the explanatory power of your equation? What do you conclude from this?
Answer:
(a) The two regressions are quite different. For the Diamondbacks, players who had a 10 point higher
batting average during the regular season had roughly a 23 point higher batting average during the
World Series. Hence top performers did relatively better. The opposite holds for the Yankees.
(b) Both regressions have little explanatory power as seen from the regression R2. Hence performance
during the season is a poor forecast of World Series performance.

8) For the simple regression model of Chapter 4, you have been given the following data:

$\sum_{i=1}^{420} Y_i$ = 274,745.75; $\sum_{i=1}^{420} X_i$ = 8,248.979; $\sum_{i=1}^{420} X_i Y_i$ = 5,392,705;

$\sum_{i=1}^{420} X_i^2$ = 163,513.03; $\sum_{i=1}^{420} Y_i^2$ = 179,878,841.13

(a) Calculate the regression slope and the intercept.


(b) Calculate the regression R2
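Answer: (a) $\hat{\beta}_1 = \frac{5{,}392{,}705 - 420 \times 19.64 \times 654.16}{163{,}513.03 - 420 \times 19.64^2} = -2.28$;
$\hat{\beta}_0 = 654.2 - (-2.28) \times 19.6 = 698.9$.
(The sample means are shown rounded; the reported values follow from the unrounded means. This is the
data set for Chapter 4.)
(b) $R^2 = \frac{-2.28 \times (5{,}392{,}704.6 - 420 \times 19.6 \times 654.2)}{179{,}878{,}841.1 - 420 \times 654.2^2} = 0.051$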
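
Because the numerator and denominator are small differences between large numbers, rounding the sample means matters here. The sketch below recomputes the answer from the five raw sums given in the question, keeping the means unrounded; variable names are illustrative.

```python
n = 420
sum_y = 274745.75
sum_x = 8248.979
sum_xy = 5392705.0
sum_xx = 163513.03
sum_yy = 179878841.13

x_bar = sum_x / n
y_bar = sum_y / n

# Sums of cross-products and squares in deviation form.
s_xy = sum_xy - n * x_bar * y_bar
s_xx = sum_xx - n * x_bar ** 2
tss = sum_yy - n * y_bar ** 2

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# ESS = slope^2 * s_xx, which equals slope * s_xy.
r_squared = slope * s_xy / tss
```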

9) Your textbook presented you with the following regression output:

$\widehat{TestScore} = 698.9 - 2.28 \times STR$


n = 420, R2 = 0.051, SER = 18.6

(a) How would the slope coefficient change, if you decided one day to measure testscores in 100s, i.e., a
test score of 650 became 6.5? Would this have an effect on your interpretation?
(b) Do you think the regression R2 will change? Why or why not?
(c) Although Chapter 4 in your textbook did not deal with hypothesis testing, it presented you with the
large sample distribution for the slope and the intercept estimator. Given the change in the units of
measurement in (a), do you think that the variance of the slope estimator will change numerically? Why
or why not?
Answer:
(a) The new regression line would be $\widehat{NewTestscore} = 6.989 - 0.0228 \times STR$. Hence the decimal point
would simply move two digits to the left. The interpretation remains the same, since an increase in the
student-teacher ratio by 2, say, lowers the new testscore by 0.0456 points on the new testscore scale,
which is 4.56 in the original testscores.
(b) The regression R2 should not change, since, if it did, an objective measure of fit would depend on
whim (the units of measurement). The SER will change (from 18.6 to 0.186). This is to be expected, since
the TSS obviously changes, and with the regression R2 unchanged, the SSR (and hence SER) have to
adjust accordingly.
(c) Since statistical inference will depend on the ratio of the estimator and its standard error, the standard
error must change in proportion to the estimator: it is divided by 100, so the variance of the slope
estimator is divided by 100^2 = 10,000. If this was not true, then statistical inference again
would depend on the whim of the investigator.
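
The claims in (a) and (b) can be illustrated numerically. The sketch below fits a one-regressor OLS line twice, once with test scores in original units and once divided by 100. The six observations are made up for illustration; they are not the California data.

```python
from math import sqrt


def ols_summary(x, y):
    """Return (slope, intercept, R2, SER) for an OLS fit with one regressor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    tss = sum((b - y_bar) ** 2 for b in y)
    return slope, intercept, 1 - ssr / tss, sqrt(ssr / (n - 2))


# Made-up student-teacher ratios and test scores, for illustration only.
str_ratio = [15.0, 17.0, 19.0, 20.0, 22.0, 25.0]
scores = [680.0, 665.0, 660.0, 640.0, 655.0, 630.0]
scores_in_100s = [s / 100 for s in scores]

b1, b0, r2, ser = ols_summary(str_ratio, scores)
b1_new, b0_new, r2_new, ser_new = ols_summary(str_ratio, scores_in_100s)
```

The slope, intercept, and SER of the second fit are the first fit's values divided by 100, while the R2 is identical.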

10) The news-magazine The Economist regularly publishes data on the so called Big Mac index and
exchange rates between countries. The data for 30 countries from the April 29, 2000 issue is listed below:

Country         Currency   Price of Big Mac    Actual Exchange Rate
                           (local currency)    per U.S. dollar

Indonesia       Rupiah     14,500              7,945
Italy           Lira       4,500               2,088
South Korea     Won        3,000               1,108
Chile           Peso       1,260               514
Spain           Peseta     375                 179
Hungary         Forint     339                 279
Japan           Yen        294                 106
Taiwan          Dollar     70                  30.6
Thailand        Baht       55                  38.0
Czech Rep.      Crown      54.37               39.1
Russia          Ruble      39.50               28.5
Denmark         Crown      24.75               8.04
Sweden          Crown      24.0                8.84
Mexico          Peso       20.9                9.41
France          Franc      18.5                7.07
Israel          Shekel     14.5                4.05
China           Yuan       9.90                8.28
South Africa    Rand       9.0                 6.72
Switzerland     Franc      5.90                1.70
Poland          Zloty      5.50                4.30
Germany         Mark       4.99                2.11
Malaysia        Dollar     4.52                3.80
New Zealand     Dollar     3.40                2.01
Singapore       Dollar     3.20                1.70
Brazil          Real       2.95                1.79
Canada          Dollar     2.85                1.47
Australia       Dollar     2.59                1.68
Argentina       Peso       2.50                1.00
Britain         Pound      1.90                0.63
United States   Dollar     2.51

The concept of purchasing power parity or PPP ("the idea that similar foreign and domestic goods …
should have the same price in terms of the same currency," Abel, A. and B. Bernanke, Macroeconomics, 4th
edition, Boston: Addison Wesley, 476) suggests that the ratio of the Big Mac priced in the local currency
to the U.S. dollar price should equal the exchange rate between the two countries.
(a) Enter the data into your regression analysis program (EViews, Stata, Excel, SAS, etc.). Calculate the
predicted exchange rate per U.S. dollar by dividing the price of a Big Mac in local currency by the U.S.
price of a Big Mac ($2.51).
(b) Run a regression of the actual exchange rate on the predicted exchange rate. If purchasing power
parity held, what would you expect the slope and the intercept of the regression to be? Is the value of the
slope and the intercept "far" from the values you would expect to hold under PPP?
(c) Plot the actual exchange rate against the predicted exchange rate. Include the 45 degree line in your
graph. Which observations might cause the intercept and the slope to differ from zero and one?

Answer:
(a)
Country Predicted Exchange Rate
per U.S. dollar

Indonesia 5777
Italy 1793
South Korea 1195
Chile 502
Spain 149
Hungary 135
Japan 117
Taiwan 27.9
Thailand 21.9
Czech Rep. 21.7
Russia 15.7
Denmark 9.86
Sweden 9.56
Mexico 8.33
France 7.37
Israel 5.78
China 3.94
South Africa 3.59
Switzerland 2.35
Poland 2.19
Germany 1.99
Malaysia 1.80
New Zealand 1.35
Singapore 1.27
Brazil 1.18
Canada 1.14
Australia 1.03
Argentina 1.00
Britain 0.76
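
The predicted exchange rates in this table are the local Big Mac prices divided by the U.S. price of $2.51. A short sketch for a handful of the countries (the dictionary below copies prices from the data table; the remaining countries work the same way):

```python
US_BIG_MAC_PRICE = 2.51  # U.S. dollars, from the data table

# Local-currency Big Mac prices for a few of the countries in the table.
local_prices = {
    "Indonesia": 14500,
    "Italy": 4500,
    "South Korea": 3000,
    "Chile": 1260,
    "Britain": 1.90,
}

# PPP-implied exchange rate: local currency units per U.S. dollar.
predicted_rates = {
    country: price / US_BIG_MAC_PRICE for country, price in local_prices.items()
}
```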

(b) The estimated regression is as follows:

$\widehat{ActualExRate} = -27.05 + 1.35 \times PredExRate$


R2 = 0.994, n = 29, SER = 122.15

For PPP to hold exactly, you would expect an intercept of zero and a slope of unity. Since we do not
know the standard error of the slope and the intercept, and since Chapter 4 has not dealt with hypothesis
testing, it is hard to judge how "far" -27.05 and 1.35 are from zero and one, respectively.
(c) The regression is represented by the solid line, while the dashed one is the 45 degree line. Most of the
observations are bunched towards the origin, making it hard to judge from this graph which observations
cause the regression line to differ from the 45 degree line. However, the Indonesian Rupiah is certainly a
possible candidate.

11) At the Stock and Watson (http://www.pearsonhighered.com/stock_watson) website go to Student
Resources and select the option "Datasets for Replicating Empirical Results." Then select the "California
Test Score Data Used in Chapters 4-9" (caschool.xls) and open it in a spreadsheet program such as Excel.

In this exercise you will estimate various statistics of the Linear Regression Model with One Regressor
by constructing various sums and ratios within a spreadsheet program.

Throughout this exercise, let Y correspond to Test Scores (testscore) and X to the Student Teacher
Ratio (str). To generate answers to all exercises here, you will have to create seven columns and the sums
of five of these. They are

(i) Yi, (ii) Xi, (iii) (Yi- Y ), (iv) (Xi- X ), (v) (Yi- Y )×(Xi- X ), (vi) (Xi- X )2, (vii) (Yi- Y )2

Although neither the sum of (iii) or (iv) will be required for further calculations, you may want to
generate these as a check (both have to sum to zero).

a. Use equation (4.7) and the sums of columns (v) and (vi) to generate the slope of the regression.
b. Use equation (4.8) to generate the intercept.
c. Display the regression line (4.9) and interpret the coefficients.
d. Use equation (4.16) and the sum of column (vii) to calculate the regression $R^2$.
e. Use equation (4.19) to calculate the SER.
f. Use the "Regression" function in Excel to verify the results.

Answer:
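Column (i), sample mean of Y: 654.156548
Column (ii), sample mean of X: 19.64043
Column (iii), sum: 1.27329E-11
Column (iv), sum: 1.13E-12
Column (v), sum: -3418.76
Column (vi), sum: 1499.58
Column (vii), sum: 152109.6

a. $\hat{\beta}_1 = \frac{-3418.76}{1499.58} = -2.27981$

b. $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = \frac{274745.75}{420} - (-2.27981) \times \frac{8248.979}{420} = 654.1565 - (-2.27981) \times 19.64043 = 698.933$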

c. $\hat{Y}_i = 698.9 - 2.28 \times X_i$. A decrease in the student-teacher ratio of one student per teacher is
associated, on average, with an increase in test scores of 2.28 points. It is best not to interpret the
intercept; it simply determines the height of the regression line.

d. To calculate the regression $R^2$, you need the TSS given from the sum in column (vii) and either the
ESS or SSR. In principle, you could use equation (4.10) to generate the residuals, square these and sum
them up to get SSR. However, the textbook suggests a shortcut at the bottom of p. 142:

$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 - \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$

(the cross-product vanishes due to the orthogonality conditions (4.32) and (4.36)). The various terms on
the RHS of the equation have been calculated and equation (4.35) implies that
$\hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2 = ESS = 7794.11$. Hence the regression $R^2 = \frac{7794.11}{152109.6} = 0.051$.
e. The answer in (d) can be used to calculate the SSR, which is 152109.6 - 7794.11 = 144315.5. Hence the
SER must be $\sqrt{144315.5 / (n - 2)} = \sqrt{144315.5 / 418} \approx 18.6$.

f.

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.226
R Square 0.051
Adjusted R Square 0.049
Standard Error 18.581
Observations 420

ANOVA
df SS
Regression 1 7794.11
Residual 418 144315.5
Total 419 152109.6

Coefficients
Intercept 698.93
str -2.28
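
The spreadsheet steps (a) through (e) can also be sketched in a few lines of Python. This is only an illustration that starts from the means and sums reported in the answer above rather than from caschool.xls itself; the variable names are made up for the sketch.

```python
import math

# Values reported in the answer above: means for columns (i)-(ii), sums for (v)-(vii).
n = 420
y_bar = 654.156548      # sample mean of test scores
x_bar = 19.64043        # sample mean of the student-teacher ratio
sum_xy = -3418.76       # column (v):   sum of (Yi - Ybar)(Xi - Xbar)
sum_xx = 1499.58        # column (vi):  sum of (Xi - Xbar)^2
sum_yy = 152109.6       # column (vii): sum of (Yi - Ybar)^2, i.e. TSS

slope = sum_xy / sum_xx                 # step a, equation (4.7)
intercept = y_bar - slope * x_bar       # step b, equation (4.8)
ess = slope ** 2 * sum_xx               # shortcut for the explained sum of squares
ssr = sum_yy - ess                      # TSS = ESS + SSR
r_squared = ess / sum_yy                # step d
ser = math.sqrt(ssr / (n - 2))          # step e, standard error of the regression

print(f"slope = {slope:.5f}, intercept = {intercept:.3f}")
print(f"ESS = {ess:.2f}, SSR = {ssr:.2f}, R^2 = {r_squared:.3f}, SER = {ser:.3f}")
```

Because the inputs are rounded, the ESS and SSR computed this way can differ from the Excel output in the last digit or two.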

12) You have obtained a sample of 14,925 individuals from the Current Population Survey (CPS) and are
interested in the relationship between average hourly earnings and years of education. The regression
yields the following result:
$\widehat{ahe} = -4.58 + 1.71 \times educ$, $R^2 = 0.182$, $SER = 9.30$

where ahe and educ are measured in dollars and years respectively.

a. Interpret the coefficients and the regression $R^2$.

b. Is the effect of education on earnings large?

c. Why should education matter in the determination of earnings? Do the results suggest that there is a
guarantee for average hourly earnings to rise for everyone as they receive an additional year of
education? Do you think that the relationship between education and average hourly earnings is linear?

d. The average years of education in this sample is 13.5 years. What is the mean of average hourly
earnings in the sample?

e. Interpret the measure SER. What is its unit of measurement?


Answer:
a. A person with one more year of education earns, on average, $1.71 more per hour. There is no meaning
attached to the intercept; it just determines the height of the regression line. The model explains 18.2
percent of the variation in average hourly earnings.

b. The difference between a high school graduate and a college graduate is four years of education. Hence
a college graduate will earn almost $7 more per hour, on average ($6.84 to be precise). If you assume that
there are 2,000 working hours per year, then the average salary difference would be close to $14,000
(actually $13,680). Depending on how much you have spent for an additional year of education and how
much income you have forgone, this does not seem particularly large.

c. In general, you would expect to find a positive relationship between years of education and average
hourly earnings. Education is considered investment in human capital. If this were not the case, then it
would be a puzzle as to why there are students in the econometrics course — surely they are not there to
just "find themselves" (which would be quite expensive in most cases). However, if you consider
education as an investment and you wanted to see a return on it, then the relationship will most likely not
be linear. For example, a constant percent return would imply an exponential relationship whereby the
additional year of education would bring a larger increase in average hourly earnings at higher levels of
education. The results do not suggest that there is a guarantee for earnings to rise for everyone as they
become more educated since the regression $R^2$ does not equal 1. Instead the result holds "on average."

d. The intercept formula $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$ implies $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Substituting the
estimates for the slope and the intercept then results in a mean of average hourly earnings of roughly $18.50.

e. The typical prediction error is $9.30. Since the measure is related to the deviation of the actual and
fitted values, the unit of measurement must be the same as that of the dependent variable, which is in
dollars here.
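
The arithmetic behind parts (b) and (d) can be written out as follows. This is just a worked restatement of the numbers already given in the answer; the bar notation for the sample means of ahe and educ is introduced here for convenience.

```latex
\begin{align*}
\overline{ahe} &= \hat{\beta}_0 + \hat{\beta}_1\,\overline{educ}
                = -4.58 + 1.71 \times 13.5 = 18.505 \approx 18.5 \\
4 \times 1.71 &= 6.84 \quad \text{(dollars per hour, four extra years of education)} \\
6.84 \times 2000 &= 13{,}680 \quad \text{(dollars per year at 2,000 working hours)}
\end{align*}
```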

4.3 Mathematical and Graphical Problems

1) Prove that the regression $R^2$ is identical to the square of the correlation coefficient between two
variables Y and X. Regression functions are written in a form that suggests causation running from X to
Y. Given your proof, does a high regression $R^2$ present supportive evidence of a causal relationship? Can
you think of some regression examples where the direction of causality is not clear? Where it is without
a doubt?

Answer: The regression $R^2 = \frac{ESS}{TSS}$, where ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence
$(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$ and therefore $ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$.
Using small letters to indicate deviations from mean, i.e., $z_i = Z_i - \bar{Z}$, we get that the regression

$R^2 = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.

The square of the correlation coefficient is

$r^2 = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2 \sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2 \sum_{i=1}^{n} y_i^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.

Hence the two are the same.

Correlation does not imply causation. Income is a regressor in the consumption function, yet
consumption enters on the right-hand side of the GDP identity. Regressing the weight of individuals on
the height is a situation where causality is without doubt, since the author of this test bank should be
seven feet tall otherwise. The authors of the textbook use weather data to forecast orange juice prices later
in the text.

2) You have analyzed the relationship between the weight and height of individuals. Although you are
quite confident about the accuracy of your measurements, you feel that some of the observations are
extreme, say, two standard deviations above and below the mean. You therefore decide to disregard
these individuals. What consequence will this have on the standard deviation of the OLS estimator of the
slope?
Answer: Other things being equal, the standard error of the slope coefficient decreases as the variation
in X increases. Hence you prefer more variation rather than less. This can be seen from formula (4.20) in
the text. Intuitively, it is easier for OLS to detect a response to a unit change in X if the data vary more.
Disregarding the extreme observations reduces the variation in X and therefore increases the standard
deviation of the slope estimator.

3) In order to calculate the regression $R^2$ you need the TSS and either the SSR or the ESS. The TSS is
fairly straightforward to calculate, being just the variation of Y. However, if you had to calculate the SSR
or ESS by hand (or in a spreadsheet), you would need all fitted values from the regression function and
their deviations from the sample mean, or the residuals. Can you think of a quicker way to calculate the
ESS simply using terms you have already used to calculate the slope coefficient?

Answer: The ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and
$\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore
$ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. The right-hand side contains the estimated slope squared and the
denominator of the slope, i.e., all values that have already been calculated.

4) (Requires Appendix material) In deriving the OLS estimator, you minimize the sum of squared
residuals with respect to the two parameters $\hat{\beta}_0$ and $\hat{\beta}_1$. The resulting two equations imply two
restrictions that OLS places on the data, namely that $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that
you get the same formula for the regression slope and the intercept if you impose these two conditions on
the sample regression function.
Answer: The sample regression function is $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$. Summing both sides results in

$\sum_{i=1}^{n} Y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} \hat{u}_i$.

Imposing the first restriction, namely that the sum of the residuals is zero, dividing both sides of the
equation by n, and solving for $\hat{\beta}_0$ gives the OLS formula for the intercept.

For the second restriction, multiply both sides of the sample regression function by $X_i$ and then sum both
sides to get

$\sum_{i=1}^{n} Y_i X_i = \hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} \hat{u}_i X_i$.

After imposing the restriction $\sum_{i=1}^{n} \hat{u}_i X_i = 0$ and substituting the formula for the intercept, you get

$\sum_{i=1}^{n} Y_i X_i = (\bar{Y} - \hat{\beta}_1 \bar{X}) \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2$ or $\sum_{i=1}^{n} Y_i X_i - n\bar{Y}\bar{X} = \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 - \hat{\beta}_1 n \bar{X}^2$,

which, after isolating $\hat{\beta}_1$ and dividing by the variation in X, results in the OLS estimator for the slope.
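
As a quick numerical check of the two restrictions, the following sketch fits a line to a small made-up data set using the formulas from this answer and confirms that both sums are zero up to floating-point error. The data values are hypothetical and serve only as an illustration.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept from the formulas derived in the answer.
slope = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / (
    sum(xi ** 2 for xi in x) - n * x_bar ** 2
)
intercept = y_bar - slope * x_bar

residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

sum_u = sum(residuals)                                  # first restriction
sum_ux = sum(u * xi for u, xi in zip(residuals, x))     # second restriction

print(f"sum of residuals     = {sum_u:.2e}")
print(f"sum of residuals * X = {sum_ux:.2e}")
```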

5) (Requires Appendix material) Show that the two alternative formulae for the slope given in your
textbook are identical.

$\frac{\frac{1}{n} \sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\frac{1}{n} \sum_{i=1}^{n} X_i^2 - \bar{X}^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$

Answer: Let's start with the numerators. The numerator of the right-hand side expression can be written
as follows:

$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} (X_i Y_i - \bar{X} Y_i - \bar{Y} X_i + \bar{X}\bar{Y}) = \sum_{i=1}^{n} X_i Y_i - \bar{X} \sum_{i=1}^{n} Y_i - \bar{Y} \sum_{i=1}^{n} X_i + n\bar{X}\bar{Y}$
$= \sum_{i=1}^{n} Y_i X_i - n\bar{X}\bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} = \sum_{i=1}^{n} Y_i X_i - n\bar{X}\bar{Y}$. (Note that $\sum_{i=1}^{n} X_i = n\bar{X}$.)

Multiplying out the terms in the denominator and moving the summation sign into the expression in
parentheses similarly yields $\sum_{i=1}^{n} X_i^2 - n\bar{X}^2$. Dividing both of these expressions by n then results in
the left-hand side fraction.

6) (Requires Calculus) Consider the following model:

$Y_i = \beta_0 + u_i$.

Derive the OLS estimator for $\beta_0$.

Answer: To derive the OLS estimator, minimize the sum of squared prediction mistakes $\sum_{i=1}^{n} (Y_i - b_0)^2$.
Taking the derivative with respect to $b_0$ results in

$\frac{\partial}{\partial b_0} \sum_{i=1}^{n} (Y_i - b_0)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b_0} (Y_i - b_0)^2 = 2 \sum_{i=1}^{n} (Y_i - b_0)(-1) = (-2) \sum_{i=1}^{n} (Y_i - b_0) = (-2) \left( \sum_{i=1}^{n} Y_i - n b_0 \right)$.

Setting the derivative to zero then results in the OLS estimator:

$(-2) \left( \sum_{i=1}^{n} Y_i - n\hat{\beta}_0 \right) = 0 \Rightarrow \hat{\beta}_0 = \bar{Y}$.

7) (Requires Calculus) Consider the following model:

$Y_i = \beta_1 X_i + u_i$.

Derive the OLS estimator for $\beta_1$.

Answer: To derive the OLS estimator, minimize the sum of squared prediction mistakes $\sum_{i=1}^{n} (Y_i - b_1 X_i)^2$.
Taking the derivative with respect to $b_1$ results in

$\frac{\partial}{\partial b_1} \sum_{i=1}^{n} (Y_i - b_1 X_i)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b_1} (Y_i - b_1 X_i)^2 = 2 \sum_{i=1}^{n} (Y_i - b_1 X_i)(-X_i)$
$= (-2) \sum_{i=1}^{n} (Y_i - b_1 X_i) X_i = (-2) \sum_{i=1}^{n} (Y_i X_i - b_1 X_i^2)$.

Setting the derivative to zero then results in the OLS estimator:

$(-2) \left( \sum_{i=1}^{n} Y_i X_i - \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \right) = 0 \Rightarrow \hat{\beta}_1 = \frac{\sum_{i=1}^{n} Y_i X_i}{\sum_{i=1}^{n} X_i^2}$
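
The first-order condition for the model without an intercept can be checked numerically. The sketch below uses a small hypothetical data set; the helper name `ssr` is an invention for this illustration.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.2, 1.9, 3.2, 3.9]

def ssr(b1):
    """Sum of squared prediction mistakes for the model Y = b1 * X + u."""
    return sum((yi - b1 * xi) ** 2 for xi, yi in zip(x, y))

# Closed-form estimator obtained from the first-order condition.
beta1_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

# The derivative -2 * sum((Y - b1 X) X) should vanish at beta1_hat ...
derivative = -2 * sum((yi - beta1_hat * xi) * xi for xi, yi in zip(x, y))

# ... and nearby slopes should give a larger sum of squares.
print(beta1_hat, derivative)
print(ssr(beta1_hat) < ssr(beta1_hat - 0.01), ssr(beta1_hat) < ssr(beta1_hat + 0.01))
```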

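8) Show first that the regression $R^2$ is the square of the sample correlation coefficient. Next, show that
the slope of a simple regression of Y on X is only identical to the inverse of the regression slope of X on Y
if the regression $R^2$ equals one.
Answer: The regression $R^2 = \frac{ESS}{TSS}$, where ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence
$(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore $ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$.
Using small letters to indicate deviations from mean, i.e., $z_i = Z_i - \bar{Z}$, we get that the regression

$R^2 = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.

The square of the correlation coefficient is

$r^2 = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2 \sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2 \sum_{i=1}^{n} y_i^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.

Hence the two are the same.

Now $1 = r^2 = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2} \Rightarrow \hat{\beta}_1^2 = \frac{\sum_{i=1}^{n} y_i^2}{\sum_{i=1}^{n} x_i^2}$. But
$\hat{\beta}_1^2 = \hat{\beta}_1 \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$ and therefore $\hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i^2}{\sum_{i=1}^{n} x_i y_i}$, which is the
inverse of the regression slope of X on Y.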
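
A small numerical sketch may make the second result more concrete. With hypothetical data whose $R^2$ is below one, the slope of Y on X differs from the inverse of the slope of X on Y, and the product of the two slopes equals $R^2$. The data and variable names are made up for this illustration.

```python
import math

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 2.5, 4.5, 4.0, 6.5, 6.0]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

slope_y_on_x = sxy / sxx                      # regression of Y on X
slope_x_on_y = sxy / syy                      # regression of X on Y
r = sxy / math.sqrt(sxx * syy)                # sample correlation coefficient
r_squared = slope_y_on_x ** 2 * sxx / syy     # ESS / TSS

print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
print(f"slope(Y on X) = {slope_y_on_x:.4f}, 1 / slope(X on Y) = {1 / slope_x_on_y:.4f}")
print(f"product of the two slopes = {slope_y_on_x * slope_x_on_y:.4f}")
```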

9) Consider the sample regression function

$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$.

First, take averages on both sides of the equation. Second, subtract the resulting equation from the above
equation to write the sample regression function in deviations from means. (For simplicity, you may want
to use small letters to indicate deviations from the mean, i.e., $z_i = Z_i - \bar{Z}$.) Finally, illustrate in a
two-dimensional diagram with SSR on the vertical axis and the regression slope on the horizontal axis how
you could find the least squares estimator for the slope by varying its values through trial and error.

Answer: Taking averages results in the following equation: $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Subtracting this equation
from the above one, we get $y_i = \hat{\beta}_1 x_i + \hat{u}_i$.

$SSR = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_i)^2$ is a quadratic which takes on different values for different
choices of $\hat{\beta}_1$
(the y and x are given in this case, i.e., different from the usual calculus problems, they cannot vary here).
You could choose a starting value of the slope and calculate SSR. Next you could choose a different value
for the slope and calculate the new SSR. There are two choices for the new slope value for you to make:
first, in which direction you want to move, and second, how large a distance you want to choose the new
slope value from the old one. (In essence, this is what sophisticated search algorithms do.) You continue
with this procedure until you find the smallest SSR. The slope coefficient which has generated this SSR is
the OLS estimator.
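
The trial-and-error procedure described in this answer can be sketched as a simple step-halving search. This is one possible way to organize the search, not the method of any particular software package; the data, the starting value, and the step rule are all assumptions made for the illustration.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Work in deviations from means, as in the answer.
x_bar, y_bar = sum(x) / n, sum(y) / n
xd = [xi - x_bar for xi in x]
yd = [yi - y_bar for yi in y]

def ssr(b1):
    """Sum of squared residuals for a trial slope b1."""
    return sum((yi - b1 * xi) ** 2 for xi, yi in zip(xd, yd))

# Trial and error: start somewhere, try a step in each direction,
# move if the SSR falls, and halve the step when neither direction helps.
b1, step = 0.0, 1.0
while step > 1e-6:
    if ssr(b1 + step) < ssr(b1):
        b1 += step
    elif ssr(b1 - step) < ssr(b1):
        b1 -= step
    else:
        step /= 2

ols_slope = sum(xi * yi for xi, yi in zip(xd, yd)) / sum(xi ** 2 for xi in xd)
print(f"search result = {b1:.6f}, OLS formula = {ols_slope:.6f}")
```

The two choices mentioned in the answer appear directly in the loop: the direction is decided by comparing the SSR on either side of the current slope, and the distance is the current step size.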

10) Given the amount of money and effort that you have spent on your education, you wonder if it was
(is) all worth it. You therefore collect data from the Current Population Survey (CPS) and estimate a
linear relationship between earnings and the years of education of individuals. What would be the effect
on your regression slope and intercept if you measured earnings in thousands of dollars rather than in
dollars? Would the regression $R^2$ be affected? Should statistical inference be dependent on the scale of
variables? Discuss.
Answer: It should be clear that interpretation of estimated relationships and statistical inference should
not depend on the units of measurement. Otherwise whim could dictate conclusions. Hence the
regression $R^2$ and statistical inference cannot be affected. It is easy but tedious to show this
mathematically. Here the change in units applies to the dependent variable: earnings in thousands of
dollars are $Y^* = Y/1000$. The intercept indicates the predicted value of Y when X is zero, and the slope
indicates the predicted change in Y for a one-year change in X. Both are measured in the units of Y, so
both are divided by 1,000. In the above case, the decimal point of each coefficient will move 3 digits to
the left.

11) (Requires Appendix material) Consider the sample regression function

$Y_i^* = \hat{\beta}_0^* + \hat{\beta}_1^* X_i^* + \hat{u}_i$,

where * indicates that the variable has been standardized. What are the units of measurement for the
dependent and explanatory variable? Why would you want to transform both variables in this way?
Show that the OLS estimator for the intercept equals zero. Next prove that the OLS estimator for the slope
in this case is identical to the formula for the least squares estimator where the variables have not been
standardized, times the ratio of the sample standard deviation of X and Y, i.e., $\hat{\beta}_1^* = \hat{\beta}_1 \cdot \frac{S_X}{S_Y}$.
Answer: The units of measurement are in standard deviations. Standardizing the variables allows
conversion into common units and allows comparison of the size of coefficients. The mean of
standardized variables is zero, and hence the OLS intercept must also be zero. The slope coefficient is
given by the formula

$\hat{\beta}_1^* = \frac{\sum_{i=1}^{n} x_i^* y_i^*}{\sum_{i=1}^{n} x_i^{*2}}$,

where small letters indicate deviations from mean, i.e., $z = Z - \bar{Z}$. Note that means of standardized
variables are zero, and hence we get

$\hat{\beta}_1^* = \frac{\sum_{i=1}^{n} X_i^* Y_i^*}{\sum_{i=1}^{n} X_i^{*2}}$.

Writing this expression in terms of originally observed variables results in

$\hat{\beta}_1^* = \frac{\frac{1}{S_X S_Y} \sum_{i=1}^{n} x_i y_i}{\frac{1}{S_X^2} \sum_{i=1}^{n} x_i^2}$,

which is the same as the sought after expression after simplification.
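
The result can be checked numerically with a short sketch. The data are hypothetical, and the helper names `ols` and `standardize` are made up for this illustration; only the standard library is used.

```python
import statistics

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 2.5, 4.5, 4.0, 6.5, 6.0]

def ols(xs, ys):
    """Return (intercept, slope) of a simple OLS regression of ys on xs."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys)) / sum(
        (a - x_bar) ** 2 for a in xs
    )
    return y_bar - slope * x_bar, slope

def standardize(values):
    """Subtract the sample mean and divide by the sample standard deviation."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [(v - m) / s for v in values]

b0, b1 = ols(x, y)
b0_star, b1_star = ols(standardize(x), standardize(y))
s_x, s_y = statistics.stdev(x), statistics.stdev(y)

print(f"standardized intercept = {b0_star:.2e}")
print(f"standardized slope = {b1_star:.6f}, b1 * s_x / s_y = {b1 * s_x / s_y:.6f}")
```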

12) The OLS slope estimator is not defined if there is no variation in the data for the explanatory variable.
You are interested in estimating a regression relating earnings to years of schooling. Imagine that you
had collected data on earnings for different individuals, but that all these individuals had completed a
college education (16 years of education). Sketch what the data would look like and explain intuitively
why the OLS coefficient does not exist in this situation.
Answer: There is no variation in X in this case, and it is therefore unreasonable to ask by how much Y
would change if X changed by one unit. Regression analysis cannot figure out the answer to this
question, because a change in X never happens in the sample.

13) Indicate in a scatterplot what the data for your dependent variable and your explanatory variable
would look like in a regression with an $R^2$ equal to zero. How would this change if the regression $R^2$ was
equal to one?
Answer: For the zero regression $R^2$, the data would look something like this:

In the case of the regression $R^2$ being one, all observations would lie on a straight line.

14) Imagine that you had discovered a relationship that would generate a scatterplot very similar to the

relationship Yi = , and that you would try to fit a linear regression through your data points. What do

you expect the slope coefficient to be? What do you think the value of your regression $R^2$ is in this
situation? What are the implications from your answers in terms of fitting a linear regression through a
non-linear relationship?
Answer: You would expect the regression line to be flat (slope = 0) and the regression $R^2$ to be zero in
this situation. The implication is that although there may be a relationship between two variables, you
may not detect it if you use the wrong functional form.

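15) (Requires Appendix material) A necessary and sufficient condition to derive the OLS estimator is that
the following two conditions hold: $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that these conditions
imply that $\sum_{i=1}^{n} \hat{u}_i \hat{Y}_i = 0$.

Answer: $\sum_{i=1}^{n} \hat{u}_i \hat{Y}_i = \sum_{i=1}^{n} \hat{u}_i (\hat{\beta}_0 + \hat{\beta}_1 X_i) = \hat{\beta}_0 \sum_{i=1}^{n} \hat{u}_i + \hat{\beta}_1 \sum_{i=1}^{n} \hat{u}_i X_i = 0$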

16) The help function for a commonly used spreadsheet program gives the following definition for the
regression slope it estimates:

$\frac{n \sum_{i=1}^{n} X_i Y_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}$

Prove that this formula is the same as the one given in the textbook.
Answer:

$\frac{n \sum_{i=1}^{n} X_i Y_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2} = \frac{n \sum_{i=1}^{n} X_i Y_i - n\bar{X} \, n\bar{Y}}{n \sum_{i=1}^{n} X_i^2 - (n\bar{X})^2} = \frac{n \left(\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}\right)}{n \left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)}$.

Dividing both numerator and denominator by n then gives you the desired result.
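
The equivalence can also be confirmed numerically. The sketch below evaluates the spreadsheet formula, the deviations-from-means formula, and the averages formula on the same small hypothetical data set; all names are made up for the illustration.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 6.0, 11.0]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi ** 2 for xi in x)
x_bar, y_bar = sum_x / n, sum_y / n

# Spreadsheet help-function form.
slope_spreadsheet = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)

# Deviations-from-means form.
slope_deviations = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)

# Averages form: (mean of XY - Xbar * Ybar) / (mean of X^2 - Xbar^2).
slope_averages = (sum_xy / n - x_bar * y_bar) / (sum_xx / n - x_bar ** 2)

print(slope_spreadsheet, slope_deviations, slope_averages)
```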

17) In order to calculate the slope, the intercept, and the regression $R^2$ for a simple sample regression
function, list the five sums of data that you need.
Answer: Depending on whether or not the data is in deviations from means ($z_i = Z_i - \bar{Z}$ or $Z_i$, say),
you need the following sums:

$\sum_{i=1}^{n} Y_i, \sum_{i=1}^{n} X_i, \sum_{i=1}^{n} x_i y_i, \sum_{i=1}^{n} y_i^2, \sum_{i=1}^{n} x_i^2$ (data in deviation form) or

$\sum_{i=1}^{n} Y_i, \sum_{i=1}^{n} X_i, \sum_{i=1}^{n} X_i Y_i, \sum_{i=1}^{n} Y_i^2, \sum_{i=1}^{n} X_i^2$.

Using these five sums, you can calculate the slope $\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$, the intercept
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$, and the regression

$R^2 = \frac{\hat{\beta}_1 \sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} y_i^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.

Alternatively, if the data is not given in deviation form, the formulae are as follows:

$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} Y_i X_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}$,

and for the regression

$R^2 = \frac{\hat{\beta}_1 \left(\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}\right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2} = \frac{\hat{\beta}_1^2 \left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}$.

18) A peer of yours, who is a major in another social science, says he is not interested in the regression
slope and/or intercept. Instead he only cares about correlations. For example, in the testscore/student-
teacher ratio regression, he claims to get all the information he needs from the negative correlation
coefficient corr(X,Y)=-0.226. What response might you have for your peer?
Answer: First of all, the regression slope is related to the regression $R^2$, and hence to the correlation
coefficient (whose square equals the regression $R^2$), since

$R^2 = \frac{\hat{\beta}_1 \left(\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}\right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2} = \frac{\hat{\beta}_1^2 \left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)}{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}$.

However, while the correlation coefficient tells you something about the direction and strength of the
relationship between two variables, it does not inform you about the effect of a one-unit increase in the
explanatory variable. Hence it cannot answer the question whether or not the relationship is important
(although even with the knowledge of the slope coefficient, this requires further information). Your friend
would not be able to answer the questions that policy makers and researchers are typically interested in,
such as, what would be the effect on test scores of a reduction in the student-teacher ratio by one?

19) Assume that there is a change in the units of measurement on both Y and X. The new variables are
Y* = aY and X* = bX. What effect will this change have on the regression slope?
Answer: We now have the following sample regression function $\hat{Y}^* = \hat{\beta}_0^* + \hat{\beta}_1^* X^*$. The formula for
the slope will be

$\hat{\beta}_1^* = \frac{\sum_{i=1}^{n} x_i^* y_i^*}{\sum_{i=1}^{n} x_i^{*2}} = \frac{\sum_{i=1}^{n} (b x_i)(a y_i)}{\sum_{i=1}^{n} (b x_i)^2} = \frac{ab \sum_{i=1}^{n} x_i y_i}{b^2 \sum_{i=1}^{n} x_i^2} = \frac{a}{b} \hat{\beta}_1$

20) Assume that there is a change in the units of measurement on X. The new variable is X* = bX. Prove
that this change in the units of measurement on the explanatory variable has no effect on the intercept in
the resulting regression.
Answer: Consider the sample regression function $\hat{Y} = \hat{\beta}_0^* + \hat{\beta}_1^* X^*$. The formula for the intercept will be
$\hat{\beta}_0^* = \bar{Y} - \hat{\beta}_1^* b \bar{X}$. But

$\hat{\beta}_1^* = \frac{\sum_{i=1}^{n} x_i^* y_i}{\sum_{i=1}^{n} x_i^{*2}} = \frac{\sum_{i=1}^{n} (b x_i) y_i}{\sum_{i=1}^{n} (b x_i)^2} = \frac{b \sum_{i=1}^{n} x_i y_i}{b^2 \sum_{i=1}^{n} x_i^2} = \frac{1}{b} \hat{\beta}_1$.

Hence $\hat{\beta}_0^* = \bar{Y} - \frac{1}{b} \hat{\beta}_1 b \bar{X} = \hat{\beta}_0$.
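
Both scaling results can be illustrated with a short numerical sketch. The data and the scale factors a and b are hypothetical, and the helper name `ols` is made up for this illustration.

```python
# Hypothetical data and scale factors, chosen only for illustration.
x = [8.0, 12.0, 12.0, 16.0, 18.0]
y = [9000.0, 15000.0, 13000.0, 22000.0, 26000.0]
a, b = 0.001, 12.0   # Y* = a * Y and X* = b * X

def ols(xs, ys):
    """Return (intercept, slope) of a simple OLS regression of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = sum((u - x_bar) * (v - y_bar) for u, v in zip(xs, ys)) / sum(
        (u - x_bar) ** 2 for u in xs
    )
    return y_bar - slope * x_bar, slope

b0, b1 = ols(x, y)
b0_both, b1_both = ols([b * xi for xi in x], [a * yi for yi in y])   # rescale Y and X
b0_x_only, b1_x_only = ols([b * xi for xi in x], y)                  # rescale X only

print(f"original:  intercept = {b0:.4f}, slope = {b1:.4f}")
print(f"Y* and X*: intercept = {b0_both:.4f}, slope = {b1_both:.6f}")
print(f"X* only:   intercept = {b0_x_only:.4f}, slope = {b1_x_only:.4f}")
```

When only X is rescaled the intercept is unchanged and the slope is divided by b; when Y is rescaled as well, the slope becomes (a/b) times the original slope and the intercept is multiplied by a.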

21) At the Stock and Watson (http://www.pearsonhighered.com/stock_watson) website, go to Student
Resources and select the option "Datasets for Replicating Empirical Results." Then select the "California
Test Score Data Used in Chapters 4-9" and read the data either into Excel or STATA (or another statistical
program). First run a regression where the dependent variable is test scores and the independent variable
is the student-teacher ratio. Record the regression $R^2$. Then run a regression where the dependent
variable is the student-teacher ratio and the independent variable is test scores. Record the regression $R^2$
from this regression. How do they compare?
Answer: The regression $R^2$ is 0.051 in both regressions, confirming the idea that the regression $R^2$ is
only the square of the correlation coefficient between two variables. This can also be shown formally as
follows:

The regression $R^2 = \frac{ESS}{TSS}$, where ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and
$\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$ and therefore
$ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. Using small letters to indicate deviations from mean, i.e., $z_i = Z_i - \bar{Z}$,
we get that the regression

$R^2 = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.

The square of the correlation coefficient is

$r^2 = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2 \sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2 \sum_{i=1}^{n} y_i^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}$.

Hence the two are the same.

22) At the Stock and Watson (http://www.pearsonhighered.com/stock_watson) website, go to Student
Resources and select the option "Datasets for Replicating Empirical Results." Then select the "California
Test Score Data Used in Chapters 4-9" and read the data either into Excel or STATA (or another statistical
program).

Run a regression of the average reading score (read_scr) on the average math score (math_scr). What
values for the slope and the intercept would you expect? Interpret the coefficients in the resulting
regression output and the regression $R^2$.
Answer: On average, it would seem plausible, a priori, that schools which score high on the math score
would also do well in the reading score. Perhaps an underlying variable, such as genes, parental interest,
or the quality of teachers, is driving results in both. The relationship is close to the 45 degree line, where
the intercept would be zero and the slope would be one. Interpreted literally, 85 percent of the variation
in the reading score is explained by our model.

23) In a simple regression with an intercept and a single explanatory variable, the variation in Y
($TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$) can be decomposed into the explained sum of squares
($ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$) and the sum of squared residuals
($SSR = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$) (see, for example, equation (4.35) in the textbook).

Consider any regression line, positively or negatively sloped in {X,Y} space. Draw a horizontal line
where, hypothetically, you consider the sample mean of Y ($\bar{Y}$) to be. Next add a single actual
observation of Y.

In this graph, indicate where you find the following distances: the

(i) residual
(ii) actual minus the mean of Y
(iii) fitted value minus the mean of Y

Answer:
