ECON6001 F2021 Topic4
S&W: Chapter 8
Nonlinear Regression Functions
Outline
1. Why do we need to account for non-linearities?
2. Nonlinear functions of one variable: polynomials and logs
3. Nonlinear functions of two variables: interactions
4. Application to the California Test Score data set
• Interpretation:
– An additional $1,000 of income (income is measured in $’000s) is
associated with a β2 change in test scores …
– … regardless of whether $1,000 is given to ________ family
– This linearity assumption may be unrealistic
------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      avginc |   3.850995   .2680941    14.36   0.000      3.32401    4.377979
     avginc2 |  -.0423085   .0047803    -8.85   0.000     -.051705   -.0329119
       _cons |   607.3017   2.901754   209.29   0.000     601.5978    613.0056
------------------------------------------------------------------------------
Test of non-linearity: What does the null hypothesis mean?
H0: β2 = 0
H1: β2 ≠ 0
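As a rough check on the quadratic regression output, the t-statistic for H0: β2 = 0 is the coefficient on avginc2 divided by its robust standard error. The Python sketch below uses the two values printed in the avginc2 row; the variable names are illustrative, and the 1.96 cutoff assumes a two-sided test at the 5% level using the large-sample normal approximation.

```python
# Values copied from the avginc2 row of the quadratic regression output.
coef_avginc2 = -0.0423085
se_avginc2 = 0.0047803

# t-statistic for H0: beta2 = 0 against H1: beta2 != 0.
t_stat = coef_avginc2 / se_avginc2

# Assumption: two-sided 5% test with the large-sample normal critical value.
CRITICAL_5PCT = 1.96
reject_linearity = abs(t_stat) > CRITICAL_5PCT

print(f"t = {t_stat:.2f}, reject H0 at the 5% level: {reject_linearity}")
```

Rejecting H0 here means rejecting the linear specification in favour of the quadratic one.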
Copyright © 2019 Pearson Education Ltd. All Rights Reserved.
The general nonlinear population regression function
Yi = f (X1i, X2i,…, Xki) + ui, i = 1,…, n
Assumptions
1. E(ui| X1i, X2i,…, Xki) = 0 (same), so f is the conditional expectation
of Y given the X’s.
2. (X1i,…, Xki, Yi) are i.i.d. (same).
3. Big outliers are rare (same idea; the precise mathematical condition depends
on the specific f ).
4. No perfect multicollinearity (same idea; the precise statement depends on
the specific f ).
The expected difference in Y associated with a difference in X1, holding X2,…,
Xk constant is
ΔY = f (X1 + ΔX1, X2,…, Xk) – f (X1, X2,…, Xk)
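To make the ΔY formula concrete, here is a small Python sketch that applies it to the estimated quadratic regression of testscr on avginc and avginc2. The coefficients come from that Stata output; the function names and the two income levels (5 and 25, in $'000s) are illustrative assumptions, not values fixed by the slides.

```python
def predicted_testscore(income):
    """Estimated quadratic regression function; income is in $'000s."""
    return 607.3017 + 3.850995 * income - 0.0423085 * income ** 2


def effect_of_change(f, x, dx=1.0):
    """Delta Y = f(x + dx) - f(x): predicted change in Y when X moves by dx."""
    return f(x + dx) - f(x)


# Same $1,000 increase, evaluated at a low and at a higher income level.
effect_at_5 = effect_of_change(predicted_testscore, 5)
effect_at_25 = effect_of_change(predicted_testscore, 25)

print(f"Income 5 -> 6:   {effect_at_5:.2f} points")
print(f"Income 25 -> 26: {effect_at_25:.2f} points")
```

Because f is nonlinear, the predicted effect of the same ΔX depends on the starting value of X, which is exactly what the linear model rules out.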
Yi = β0 + β1Xi + β2Xi^2 + … + βrXi^r + ui
------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      avginc |   5.018677   .7073505     7.10   0.000     3.628251    6.409104
     avginc2 |  -.0958052   .0289537    -3.31   0.001    -.1527191   -.0388913
     avginc3 |   .0006855   .0003471     1.98   0.049     3.27e-06    .0013677
       _cons |    600.079   5.102062   117.61   0.000     590.0499     610.108
------------------------------------------------------------------------------
Testing the null hypothesis of linearity, against the alternative that the population
regression is quadratic and/or cubic, that is, it is a polynomial of degree up to 3:
H0: population coefficients on Income^2 and Income^3 = 0
H1: at least one of these coefficients is nonzero.
test avginc2 avginc3
( 1) avginc2 = 0.0
( 2) avginc3 = 0.0
F( 2, 416) = 37.69
Prob > F = 0.0000
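The reported Prob > F can be reproduced by hand. When an F distribution has 2 numerator degrees of freedom, its upper-tail probability has the closed form (1 + 2F/d2)^(-d2/2), where d2 is the denominator degrees of freedom. The sketch below is a minimal illustration using the F( 2, 416) = 37.69 reported by Stata; the function name is made up for this example, and the shortcut applies only to tests of exactly two restrictions.

```python
def f_pvalue_two_restrictions(f_stat, df_denominator):
    """Upper-tail probability of an F(2, df_denominator) distribution.

    Valid only for 2 numerator degrees of freedom, where the survival
    function reduces to (1 + 2*F/d2) ** (-d2/2).
    """
    return (1.0 + 2.0 * f_stat / df_denominator) ** (-df_denominator / 2.0)


# F statistic and denominator degrees of freedom from the Stata test output.
p_value = f_pvalue_two_restrictions(37.69, 416)

print(f"p-value = {p_value:.2e}")
```

The result is far below 0.0001, which Stata displays as 0.0000, so the null hypothesis of linearity is rejected at any conventional significance level.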
Now 100 × ∆X/X = percentage change in X, so a 1% increase
in X (multiplying X by 1.01) is associated with a .01β1
change in Y.
predicted TestScore = 557.8 + 36.42 × ln(Incomei)
                      (3.8)   (1.40)
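For the linear-log estimate (TestScore on ln(Income), intercept 557.8, slope 36.42), the sketch below compares the .01β1 rule of thumb with exact predicted changes. The income levels used (10, 5 to 6, 40 to 41, all in $'000s) are illustrative choices, and the function name is an assumption of this example.

```python
import math

BETA0, BETA1 = 557.8, 36.42  # intercept and coefficient on ln(Income)


def predicted_testscore(income):
    """Estimated linear-log regression function; income is in $'000s."""
    return BETA0 + BETA1 * math.log(income)


# Rule of thumb: a 1% increase in income changes TestScore by about .01 * beta1.
approx_1pct = 0.01 * BETA1
# Exact predicted change for a 1% increase (the starting level cancels out).
exact_1pct = predicted_testscore(10 * 1.01) - predicted_testscore(10)

# The same $1,000 increase matters much more at low income than at high income.
change_5_to_6 = predicted_testscore(6) - predicted_testscore(5)
change_40_to_41 = predicted_testscore(41) - predicted_testscore(40)

print(f"1% increase: approx {approx_1pct:.4f}, exact {exact_1pct:.4f}")
print(f"Income 5 -> 6:   {change_5_to_6:.2f} points")
print(f"Income 40 -> 41: {change_40_to_41:.2f} points")
```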
so ∆Y/Y ≅ β1∆X
or β1 ≅ (∆Y/Y)/∆X (small ∆X)
Example
• Consider the following relationship between income and years of
education (also including controls for age and gender):
so ∆Y/Y ≅ β1(∆X/X)
or β1 ≅ (∆Y/Y)/(∆X/X) (small ∆X)
predicted ln(TestScore) = 6.336 + 0.0554 × ln(Incomei)
                          (0.006)  (0.0021)
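In the log-log estimate (ln(TestScore) on ln(Income), slope 0.0554), the slope is an elasticity: a 1% increase in income is associated with roughly a 0.0554% increase in test scores. The sketch below compares that approximation with the exact percentage change implied by the model; the function name is illustrative.

```python
ELASTICITY = 0.0554  # estimated coefficient on ln(Income)


def pct_change_in_y(pct_change_in_x, elasticity=ELASTICITY):
    """Exact % change in Y implied by a log-log model for a given % change in X."""
    return 100.0 * ((1.0 + pct_change_in_x / 100.0) ** elasticity - 1.0)


# Approximation: % change in Y is about elasticity times % change in X.
approx_pct = ELASTICITY * 1.0
exact_pct = pct_change_in_y(1.0)

print(f"1% more income: approx {approx_pct:.4f}%, exact {exact_pct:.4f}%")
```

For small changes in X the two numbers are nearly identical, which is why the coefficient is read directly as an elasticity.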
Logarithms – Example 2
• Here’s an estimated relationship between neighborhood pollution
and housing prices:
predicted ln(price) = 9.23 − 0.72 ln(nox) + 0.31 rooms
– price = median housing price in the neighborhood
– nox = amount of nitrous oxide in the air (parts per million)
– rooms = mean number of rooms in houses in the neighborhood
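One way to read the two coefficients in the housing price equation (−0.72 on ln(nox), 0.31 on rooms) is sketched below. The coefficient on ln(nox) is an elasticity, while the coefficient on rooms is a log-linear effect, for which the usual 100 × β approximation becomes imprecise when β is as large as 0.31. The variable names are illustrative.

```python
import math

COEF_LN_NOX = -0.72  # coefficient on ln(nox): an elasticity
COEF_ROOMS = 0.31    # coefficient on rooms: a log-linear (semi-elasticity) effect

# Log-log part: a 1% increase in nox is associated with about a 0.72% lower price.
pct_price_per_1pct_nox = COEF_LN_NOX * 1.0

# Log-linear part: one more room, holding nox fixed.
approx_pct_per_room = 100.0 * COEF_ROOMS                   # small-change approximation
exact_pct_per_room = 100.0 * (math.exp(COEF_ROOMS) - 1.0)  # exact implied change

print(f"1% more nox: {pct_price_per_1pct_nox:.2f}% change in price")
print(f"One more room: approx {approx_pct_per_room:.1f}%, exact {exact_pct_per_room:.1f}%")
```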
• A: Three cases
– Binary × Binary
– Binary × Continuous
– Continuous × Continuous
Binary × Binary
Yi = β0 + β1D1i + β2D2i + ui
• D1i, D2i are binary (dummy variables)
predicted TestScore = 664.1 − 18.2 HiEL − 1.9 HiSTR − 3.5 (HiSTR × HiEL)
                      (1.4)   (2.3)       (1.9)       (3.1)
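With the interaction term, the estimated effect of HiSTR depends on the value of HiEL. The sketch below evaluates the estimated equation (664.1 − 18.2 HiEL − 1.9 HiSTR − 3.5 HiSTR × HiEL) at the four combinations of the two dummies; the function and variable names are illustrative.

```python
def predicted_testscore(hi_str, hi_el):
    """Estimated regression with a binary-by-binary interaction (inputs are 0 or 1)."""
    return 664.1 - 18.2 * hi_el - 1.9 * hi_str - 3.5 * (hi_str * hi_el)


# Effect of HiSTR (moving from 0 to 1) at each value of HiEL.
effect_when_low_el = predicted_testscore(1, 0) - predicted_testscore(0, 0)
effect_when_high_el = predicted_testscore(1, 1) - predicted_testscore(0, 1)

print(f"Effect of HiSTR when HiEL = 0: {effect_when_low_el:.1f}")
print(f"Effect of HiSTR when HiEL = 1: {effect_when_high_el:.1f}")
```

The difference between the two effects equals the interaction coefficient, −3.5.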
• For non-immigrants?
• Here’s a graphical
version of the model:
• We can do so by creating an
interaction between the dummy
variable immigrant and the
continuous variable educ by
multiplying the two:
– Immigrants?
Immigrant (immig=1)
PctEL    ∆TestScore/∆STR
0        –1.12
20%      –1.12 + .0012 × 20 = –1.10
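In the continuous-by-continuous case, the slope of test scores with respect to STR is itself a linear function of PctEL. The sketch below reproduces the two table rows using only the terms shown there (−1.12 and .0012); the function name is illustrative, and PctEL is assumed to be measured in percentage points (so 20% is entered as 20).

```python
COEF_STR = -1.12          # coefficient on STR
COEF_STR_X_PCTEL = 0.0012  # coefficient on the STR x PctEL interaction


def str_effect(pct_el):
    """Change in predicted TestScore per unit change in STR at a given PctEL."""
    return COEF_STR + COEF_STR_X_PCTEL * pct_el


for pct_el in (0, 20):
    print(f"PctEL = {pct_el:>2}: dTestScore/dSTR = {str_effect(pct_el):.2f}")
```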