Testing Endogeneity


How do I test endogeneity?

How do I perform a Durbin–Wu–Hausman test?

Consider a regression

y = b0 + b1*z + b2*x3 + e

where z is endogenous. Suppose that x1 and x2 are instrumental variables for z. One
should decide whether it is necessary to use an instrumental variable, i.e., whether a set of
estimates obtained by least squares is consistent or not.

An augmented regression test can easily be formed by including the residuals of each
endogenous right-hand side variable, as a function of all exogenous variables, in a
regression of the original model. We would first perform a regression

z = c0 + c1*x1 + c2*x2 + c3*x3 + u

to get residuals z_res, then perform an augmented regression:

y = d0 + d1*z + d2*x3 + d3*z_res + v

If d3 is significantly different from zero, then OLS is not consistent.
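
The two-step procedure above can be sketched on simulated data. This is only an illustration in plain Python, not part of the original Stata session: the data-generating process, the coefficient values, the seed, and the helper names invert and ols are all assumptions made for the example. z is made endogenous by letting the structural error e share the first-stage error u.

```python
import random

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [row[n:] for row in m]

def ols(X, y):
    """OLS of y on the columns of X; returns (coefs, std. errors, residuals)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    b = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    res = [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in res) / (n - k)          # residual variance
    se = [(s2 * inv[i][i]) ** 0.5 for i in range(k)]
    return b, se, res

# Assumed data-generating process (hypothetical values, for illustration only)
random.seed(0)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
e = [0.8 * ui + random.gauss(0, 1) for ui in u]     # e is correlated with u
z = [1 + a + b + 0.5 * c + ui for a, b, c, ui in zip(x1, x2, x3, u)]
y = [2 + 1.5 * zi + c + ei for zi, c, ei in zip(z, x3, e)]

# Step 1: regress z on all exogenous variables and keep the residuals
_, _, z_res = ols([[1, a, b, c] for a, b, c in zip(x1, x2, x3)], z)

# Step 2: augmented regression of y on z, x3 and z_res
d, se, _ = ols([[1, zi, c, r] for zi, c, r in zip(z, x3, z_res)], y)
t_d3 = d[3] / se[3]
print(f"d3 = {d[3]:.3f}, t = {t_d3:.2f}")
```

Under these assumptions d3 should land near 0.8, the assumed loading of e on u, with a large t statistic, so the test flags z as endogenous. Setting that loading to zero should usually leave d3 statistically insignificant.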

For example, let us assume that you wish to estimate

rent = b0 + b1*hsngval + b2*pcturban + e

where hsngval is endogenous and pcturban is exogenous. The instrumental variables for
hsngval are faminc, reg2, reg3, and reg4. To test the endogeneity of hsngval,

(i) we first run a reduced form model, using all exogenous variables:

. regress hsngval faminc reg2-reg4 pcturban

      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  5,    44) =   19.66
       Model |  8.4187e+09     5  1.6837e+09           Prob > F      =  0.0000
    Residual |  3.7676e+09    44  85626930.6           R-squared     =  0.6908
-------------+------------------------------           Adj R-squared =  0.6557
       Total |  1.2186e+10    49   248700555           Root MSE      =  9253.5

------------------------------------------------------------------------------
     hsngval |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      faminc |   2.731324   .6818931     4.01   0.000     1.357058    4.105589
        reg2 |  -5095.038   4122.112    -1.24   0.223    -13402.61    3212.533
        reg3 |   -1778.05   4072.691    -0.44   0.665    -9986.019    6429.919
        reg4 |   13413.79   4048.141     3.31   0.002     5255.296    21572.28
    pcturban |   182.2201   115.0167     1.58   0.120    -49.58092    414.0211
       _cons |  -18671.87   11995.48    -1.56   0.127    -42847.17    5503.438
------------------------------------------------------------------------------

(ii) Then, we save the residual from the above regression. Call it “hsng_res”. Then,
include hsng_res in the main equation, and estimate the main equation by OLS.

. predict hsng_res, res

. regress rent hsngval pcturban hsng_res


      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  3,    46) =   47.05
       Model |  46189.1513     3  15396.3838           Prob > F      =  0.0000
    Residual |  15053.9687    46   327.26019           R-squared     =  0.7542
-------------+------------------------------           Adj R-squared =  0.7382
       Total |    61243.12    49  1249.85959           Root MSE      =   18.09

------------------------------------------------------------------------------
        rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     hsngval |   .0022398   .0002681     8.36   0.000     .0017003    .0027794
    pcturban |    .081516   .2438355     0.33   0.740    -.4092993    .5723313
    hsng_res |  -.0015889   .0003984    -3.99   0.000    -.0023908    -.000787
       _cons |   120.7065   12.42856     9.71   0.000     95.68912    145.7239
------------------------------------------------------------------------------

(iii) Finally, we test the significance of the coefficient on the added residual.


. test hsng_res

( 1) hsng_res = 0.0

F( 1, 46) = 15.91
Prob > F = 0.0002

The small p-value rejects the null hypothesis that hsngval is exogenous, which indicates that OLS is not consistent.
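
Because only one coefficient is being tested, the F statistic reported by test is simply the square of the t statistic on hsng_res in the augmented regression. A short check in Python, using the coefficient and standard error copied from that output (the variable names are illustrative):

```python
# Values copied from the hsng_res row of the augmented regression output
coef = -0.0015889
std_err = 0.0003984

t_stat = coef / std_err
f_stat = t_stat ** 2   # a single-restriction F statistic is the squared t

print(f"t = {t_stat:.2f}, F(1, 46) = {f_stat:.2f}")
```

The result agrees, up to rounding of the displayed inputs, with the t of -3.99 and the F(1, 46) of 15.91 shown above.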

To perform an IV regression, run ivreg (in Stata 10 and later, the equivalent command is ivregress 2sls):

. ivreg rent pcturban (hsngval = faminc reg2-reg4)

Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  2,    47) =   42.66
       Model |  36677.4033     2  18338.7017           Prob > F      =  0.0000
    Residual |  24565.7167    47  522.674823           R-squared     =  0.5989
-------------+------------------------------           Adj R-squared =  0.5818
       Total |    61243.12    49  1249.85959           Root MSE      =  22.862

------------------------------------------------------------------------------
        rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     hsngval |   .0022398   .0003388     6.61   0.000     .0015583    .0029213
    pcturban |    .081516   .3081528     0.26   0.793    -.5384074    .7014394
       _cons |   120.7065   15.70688     7.68   0.000     89.10834    152.3047
------------------------------------------------------------------------------
Instrumented:  hsngval
Instruments:   pcturban faminc reg2 reg3 reg4
------------------------------------------------------------------------------
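
What the 2SLS point estimates amount to can be sketched by hand on simulated data: regress the endogenous variable on all exogenous variables, then replace it with its fitted values in the main equation. The Python below is an illustration under assumed values (the data-generating process, the seed, and the helper name fit are hypothetical), not a reproduction of the Stata run. It also fits plain OLS and the augmented regression for comparison.

```python
import random

def fit(X, y):
    """Least-squares coefficients via the normal equations."""
    k = len(X[0])
    # Augmented system [X'X | X'y], solved by Gauss-Jordan elimination
    m = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(k):
            if r != c:
                f = m[r][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [row[k] for row in m]

# Assumed data-generating process; the true coefficient on z is 1.5
random.seed(1)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
e = [0.8 * ui + random.gauss(0, 1) for ui in u]     # makes z endogenous
z = [1 + a + b + 0.5 * c + ui for a, b, c, ui in zip(x1, x2, x3, u)]
y = [2 + 1.5 * zi + c + ei for zi, c, ei in zip(z, x3, e)]

# First stage: fitted values and residuals of z
W = [[1, a, b, c] for a, b, c in zip(x1, x2, x3)]
g = fit(W, z)
z_hat = [sum(gj * wj for gj, wj in zip(g, w)) for w in W]
z_res = [zi - zh for zi, zh in zip(z, z_hat)]

b_ols = fit([[1, zi, c] for zi, c in zip(z, x3)], y)        # ignores endogeneity
b_2sls = fit([[1, zh, c] for zh, c in zip(z_hat, x3)], y)   # second stage
b_aug = fit([[1, zi, c, r] for zi, c, r in zip(z, x3, z_res)], y)

print("OLS  coefficient on z:", round(b_ols[1], 3))
print("2SLS coefficient on z:", round(b_2sls[1], 3))
print("Augmented regression :", round(b_aug[1], 3))
```

In this setup the OLS coefficient on z should be biased upward, while the 2SLS and augmented-regression coefficients coincide and sit close to the assumed true value. Running the second stage by hand gives the right coefficients but not the right standard errors, which is one reason to use the IV command itself.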

Note that the coefficient estimates from the augmented regression and the IV regression are the same; however, the standard errors are different.
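
The size of the difference can be read off the two outputs. The short Python check below uses only numbers copied from the tables above (the variable names are illustrative) and compares the ratio of IV to augmented-regression standard errors with the ratio of the two Root MSE values.

```python
# Standard errors copied from the two outputs: (augmented OLS, IV)
std_errs = {
    "hsngval": (0.0002681, 0.0003388),
    "pcturban": (0.2438355, 0.3081528),
    "_cons": (12.42856, 15.70688),
}
root_mse_aug, root_mse_iv = 18.09, 22.862

rmse_ratio = root_mse_iv / root_mse_aug
ratios = {name: iv / aug for name, (aug, iv) in std_errs.items()}

for name, r in ratios.items():
    print(f"{name:9s} IV / augmented SE ratio = {r:.4f}")
print(f"Root MSE ratio                    = {rmse_ratio:.4f}")
```

All three ratios come out at about 1.26, the same as the Root MSE ratio. This appears consistent with the two regressions differing only in the residual variance used to scale the standard errors: the augmented regression's residuals also net out the hsng_res term, while the IV residuals do not.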
