04 - Panel Data PDF
Empirical Methods in CF
Lecture 4 – Panel Data
2
Background readings
n Angrist and Pischke
q Sections 5.1, 5.3
n Wooldridge
q Chapter 10 and Sections 13.9.1, 15.8.2, 15.8.3
n Greene
q Chapter 11
3
Outline for Today
n Quick review
n Motivate how panel data is helpful
q Fixed effects model
q Random effects model
q First differences
q Lagged y models
4
Quick Review [Part 1]
n What is the key assumption needed for us
to make causal inferences? And what are
the ways in which it can be violated?
q Answer = Conditional mean independence (CMI).
It is violated whenever an independent variable, x,
is correlated with the error, u. This occurs when there is…
n Omitted variable bias
n Measurement error bias
n Simultaneity bias
5
Quick Review [Part 2]
n When is it possible to determine the sign of
an omitted variable bias?
q Answer = Basically, when there is just one
OMV that is correlated with just one of the x's;
other scenarios are much more complicated
6
Quick Review [Part 3]
n When is measurement error of the
dependent variable problematic (for
identifying the causal CEF)?
q Answer = If error is correlated with any x.
7
Quick Review [Part 4]
n What is the bias on the coefficient of x,
and on other coefficients, when an
independent variable, x, is measured with error?
q Answer = Hard to know!
n If ME is uncorrelated with observed x, no bias
n If ME is uncorrelated with unobserved x*, the
coefficient on x has an attenuation bias, but the
sign of the bias on all other coefficients is unclear
8
Quick Review [Part 5]
n When will an estimation suffer from
simultaneity bias?
q Answer = If we can think of any x as a
potential outcome variable; i.e. we think y
might directly affect an x
9
Outline for Panel Data
n Motivate how panel data is helpful
n Fixed effects model
q Benefits [There are many]
q Costs [There are some…]
10
Motivation [Part 1]
n As noted in prior lecture, omitted
variables pose a substantial hurdle in
our ability to make causal inferences
n What's worse… many of them are
inherently unobservable to researchers
11
Motivation [Part 2]
n E.g. consider the firm-level estimation
$\text{leverage}_{i,j,t} = \beta_0 + \beta_1 \text{profit}_{i,j,t-1} + u_{i,j,t}$
where leverage is debt/assets for firm i,
operating in industry j in year t, and profit is
the firm's net income/assets
12
Motivation [Part 3]
n Oh, there are so, so many…
q Managerial talent and/or risk aversion
q Industry supply and/or demand shock
q Cost of capital
q Investment opportunities
q And so on…
[Sadly, this is easy to do with other dependent
or independent variables…]
13
Motivation [Part 4]
n Using observations from various
geographical regions (e.g. state or country)
opens up even more possibilities…
q Can you think of some unobserved variables
that might be related to a firm's location?
n Answer: any unobserved differences in local economic
environment, e.g. institutions, protection of property
rights, financial development, investor sentiment,
regional demand shocks, etc.
14
Motivation [Part 5]
n Sometimes, we can control for these
unobservable variables using proxy variables
q But, what assumption was required for a
proxy variable to provide consistent
estimates on the other parameters?
n Answer: It needs to be a sufficiently good proxy such
that the unobserved variable can't be correlated with
the other explanatory variables after we control for
the proxy variable… This might be hard to find
15
Panel data to the rescue…
n Thankfully, panel data can help us with a
particular type of unobserved variable…
16
Panel data
n Panel data = whenever you have multiple
observations per unit of observation i (e.g.
you observe each firm over multiple years)
q Let's assume N units i
q And, T observations per unit i [i.e. balanced panel]
n Ex. #1 – You observe 5,000 firms in Compustat
over a twenty year period [i.e. N=5,000, T=20]
n Ex. #2 – You observe 1,000 CEOs in Execucomp
over a 10 year period [i.e. N=1,000, T=10]
18
Time-invariant unobserved variable
n Consider the following model…
$y_{i,t} = \alpha + \beta x_{i,t} + \delta f_i + u_{i,t}$
where f is an unobserved, time-invariant variable
19
If we ignore f, we get OVB
n If we estimate the model…
$y_{i,t} = \alpha + \beta x_{i,t} + v_{i,t}$, where $v_{i,t} = \delta f_i + u_{i,t}$
20
Can solve this by transforming data
n First, notice that if you take the mean over
time of the dependent variable for each
unit of observation, i, you get…
$\bar{y}_i = \alpha + \beta \bar{x}_i + \delta f_i + \bar{u}_i$
where
$\bar{y}_i = \frac{1}{T} \sum_t y_{i,t}$, $\bar{x}_i = \frac{1}{T} \sum_t x_{i,t}$, $\bar{u}_i = \frac{1}{T} \sum_t u_{i,t}$
[Again, I assumed there are T obs. per unit i]
21
Transforming data [Part 2]
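n Now, if we subtract $\bar{y}_i$ from $y_{i,t}$, we have
$y_{i,t} - \bar{y}_i = \beta (x_{i,t} - \bar{x}_i) + (u_{i,t} - \bar{u}_i)$
q The constant α and the term δ fi do not vary within
unit i, so both drop out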
22
Fixed Effects (or Within) Estimator
n Answer: OLS estimation of transformed
model will yield a consistent estimate of β
n The prior transformation is called the
“within transformation” because it
demeans all variables within their group
q In this case, the “group” was each cross-section
of observations over time for each firm
q This is also called the FE estimator
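A minimal simulation sketch of the within transformation, assuming a hypothetical balanced panel in which x is correlated with the unobserved f (the values N = 500, T = 10, β = δ = 1 and the 0.8 loading of x on f are illustrative choices, not from the lecture). Pooled OLS that ignores f should be biased upward here, while OLS on the demeaned data should land close to the true β.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 10            # hypothetical panel dimensions
beta, delta = 1.0, 1.0    # hypothetical true parameters

f = rng.normal(size=(N, 1))              # unobserved, time-invariant f_i
x = 0.8 * f + rng.normal(size=(N, T))    # x is correlated with f by construction
u = rng.normal(size=(N, T))
y = 2.0 + beta * x + delta * f + u


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    xd = x - x.mean()
    return float((xd * (y - y.mean())).sum() / (xd ** 2).sum())


# Pooled OLS ignores f, so delta * f ends up in the error term
beta_pooled = ols_slope(x.ravel(), y.ravel())

# Within transformation: subtract each unit's time average from every observation
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_fe = ols_slope(x_w.ravel(), y_w.ravel())

print(f"pooled OLS: {beta_pooled:.3f}   within (FE): {beta_fe:.3f}   true: {beta}")
```

In this design the omitted variable bias is δ·Cov(x, f)/Var(x) = 0.8/1.64, so the pooled slope should sit near 1.5. As the practical advice that follows stresses, hand-rolled demeaning like this is fine for point estimates but does not give correct standard errors.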
23
Unobserved heterogeneity – Tangent
n Unobserved variable, f, is very general
q Doesn't just capture one unobserved
variable; captures all unobserved variables
that don't vary within the group
q This is why we often just call it
“unobserved heterogeneity”
24
FE Estimator – Practical Advice
n When you use the fixed effects (FE)
estimator in programs like Stata, it does
the within transformation for you
n Don't do it on your own because…
q The degrees of freedom (dof), which are used
to get the standard errors, sometimes need to be
adjusted down by the number of panels, N
q What adjustment is necessary depends on
whether you cluster, etc.
25
Least Squares Dummy Variable (LSDV)
n Another way to do the FE estimation is
by adding indicator (dummy) variables
q Notice that the coefficient on fi, δ, doesn't
really have any meaning; so, can just rescale
the unobserved fi to make it equal to 1
$y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t}$
q Now, to estimate this, we can just treat each
fi as a parameter to be estimated
26
LSDV continued…
n I.e. create a dummy variable for each
group i, and add it to the regression
q This is least squares dummy variable model
q Now, our estimation equation exactly matches
the true underlying model
$y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t}$
q We get consistent estimates and SE that are
identical to what we'd get with within estimator
27
LSDV – Practical Advice
n Because the dummy variables will be
collinear with the constant, one of them
will be dropped in the estimation
q Therefore, don't try to interpret the intercept;
it is just the average y when all the x's are
equal to zero for the group corresponding to
the dropped dummy variable
q In xtreg, fe, the reported intercept is just the
average of the individual-specific intercepts
28
LSDV versus FE [Part 1]
n Can show that LSDV and FE are identical,
using partial regression results [How?]
q Remember, to control for some variable z, we can
regress y onto both x and z, or we can just partial
z out from both y and x before regressing y on x
(i.e. regress residuals from regression of y on z
onto residual from regression of x on z)
q The demeaned variables are the residuals from a
regression of them onto the group dummies!
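The partial regression argument can be checked numerically. The sketch below uses a small hypothetical panel (N = 30, T = 6, all parameter values illustrative) and compares three things: the LSDV coefficient on x, the coefficient from regressing residualized y on residualized x, and the residuals themselves against the group-demeaned variable.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 30, 6                                   # hypothetical small panel
f = rng.normal(size=N)
group = np.repeat(np.arange(N), T)             # unit index for each row
x = 0.5 * f[group] + rng.normal(size=N * T)
y = 1.0 * x + f[group] + rng.normal(size=N * T)

# LSDV: regress y on x plus one dummy per group (no separate constant)
D = (group[:, None] == np.arange(N)[None, :]).astype(float)
coef_lsdv, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
beta_lsdv = coef_lsdv[0]


def residualize(v, D):
    """Residuals from an OLS regression of v on the columns of D."""
    b, *_ = np.linalg.lstsq(D, v, rcond=None)
    return v - D @ b


# Partial the dummies out of both y and x, then regress residual on residual
x_res, y_res = residualize(x, D), residualize(y, D)
beta_partial = (x_res @ y_res) / (x_res @ x_res)

# The residuals are just the group-demeaned variables
x_demeaned = x - (np.bincount(group, weights=x) / T)[group]

print(beta_lsdv, beta_partial, np.allclose(x_res, x_demeaned))
```

The two coefficients agree up to floating-point error, which is the sense in which LSDV and the within estimator are the same estimator.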
29
LSDV versus FE [Part 2]
n Reported R2 will be larger with LSDV
q All the dummy variables will explain a lot of the
variation in y, driving up R2
q Within R2 reported for FE estimator just reports
the proportion of the within variation in y that is
explained by the within variation in x
q The within R2 is usually of more interest to us
30
R-squared with FE – Practical Advice
n The within R2 is usually of more interest
since it describes explanatory power of x's
[after partialling out the FE]
q To get within R2, use xtreg, fe
n Reporting overall adjusted-R2 is also useful
q To get overall adjusted-R2, use areg command
instead of xtreg, fe. The “overall R2” reported
by xtreg does not include variation explained
by FE, but the R2 reported by areg does
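To make the distinction between the two R2 measures concrete, here is a sketch that computes both by hand on simulated data (hypothetical N = 200, T = 8, with deliberately large unit effects). It is an illustration of the definitions, not a reproduction of what any Stata command prints. Both measures use the same residuals; they differ only in whether the denominator is the within variation in y or the total variation in y.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 200, 8                               # hypothetical panel dimensions
f = 2.0 * rng.normal(size=(N, 1))           # large unit effects (illustrative)
x = rng.normal(size=(N, T))
y = 0.5 * x + f + rng.normal(size=(N, T))

# Within transformation and FE slope
xw = x - x.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
beta_fe = (xw * yw).sum() / (xw ** 2).sum()

# Residual sum of squares; LSDV produces the same residuals
ssr = ((yw - beta_fe * xw) ** 2).sum()

# Within R2: share of the within variation in y explained by x
r2_within = 1 - ssr / (yw ** 2).sum()

# R2 counting the dummies as explanatory: share of total variation in y
r2_with_fe = 1 - ssr / ((y - y.mean()) ** 2).sum()

print(f"within R2: {r2_within:.3f}   R2 including FE: {r2_with_fe:.3f}")
```

Because the unit effects account for most of the variation in y in this design, the second number is much larger than the first even though x explains only a modest share of the within variation.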
31
FE Estimator – Benefits [Part 1]
n There are many benefits of FE estimator
q Allows for arbitrary correlation between each
fixed effect, fi, and each x within group i
n I.e. it's very general and does not impose much structure on
what the underlying data must look like
33
FE Estimator – Benefits [Part 2]
q It is also very flexible and can help us control for
many types of unobserved heterogeneities
n Can add year FE if worried about unobserved
heterogeneity across time [e.g. macroeconomic shocks]
n Can add CEO FE if worried about unobserved
heterogeneity across CEOs [e.g. talent, risk aversion]
n Add industry-by-year FE if worried about unobserved
heterogeneity across industries over time [e.g. investment
opportunities, demand shocks]
34
FE Estimator – Tangent [Part 1]
35
FE Estimator – Tangent [Part 2]
36
FE Estimator – Costs
38
FE Cost #1 – Can't estimate some var.
n If no within-group variation in the
independent variable, x, of interest, can't
disentangle it from group FE
q It is collinear with group FE; and will be
dropped by computer or swept out in the
within transformation
39
FE Cost #1 – Example
q Consider following CEO-level estimation
$\ln(\text{totalpay})_{ijt} = \alpha + \beta_1 \ln(\text{firmsize})_{ijt} + \beta_2 \text{volatility}_{ijt} + \beta_3 \text{female}_i + \delta_t + f_i + \lambda_j + u_{ijt}$
n Ln(totalpay) is for CEO i, firm j, year t
n Estimation includes year, CEO, and firm FE
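A short sketch of why a variable like female cannot be estimated alongside CEO fixed effects. It simplifies the example above to CEO dummies only (no year or firm FE) and uses hypothetical sizes (100 CEOs, 5 years each). The within transformation turns the indicator into a column of zeros, and equivalently the design matrix with CEO dummies plus the indicator is rank deficient.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 5                                    # hypothetical: 100 CEOs, 5 years each
female = rng.integers(0, 2, size=(N, 1)).astype(float)   # fixed for each CEO
female_it = np.repeat(female, T, axis=1)         # same value in every year

# Within transformation: a time-invariant variable demeans to exactly zero
female_within = female_it - female_it.mean(axis=1, keepdims=True)

# Dummy-variable view: the indicator is a linear combination of the CEO dummies
D = np.kron(np.eye(N), np.ones((T, 1)))          # CEO dummies, shape (N*T, N)
X = np.column_stack([female_it.ravel(), D])      # N + 1 columns
rank = np.linalg.matrix_rank(X)

print(np.abs(female_within).max(), X.shape[1], rank)
```

The design matrix has N + 1 columns but rank N, so one column must be dropped. Which column the software drops determines whether a (meaningless) coefficient on female is reported.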
40
FE Cost #1 – Practical Advice
n Be careful of this!
q Programs like xtreg are good about dropping the
female variable and not reporting an estimate…
q But, if you create dummy variables yourself and
input them yourself, the estimation might drop one
of them rather than the female indicator
n I.e. you'll get an estimate for β3, but it has no
meaning! It's just a random intercept value that
depends entirely on the random FE dropped by Stata
41
FE Cost #1 – Any Solution?
n Instrumental variables can provide a
possible solution for this problem
q See Hausman and Taylor (Econometrica 1981)
q We will discuss this next week
42
FE Cost #2 – Measurement error [P1]
n Measurement error of independent variable
(and resulting biases) can be amplified
q Think of there being two types of variation
n Good (meaningful) variation
n Noise variation because we don't perfectly
measure the underlying variable of interest
43
FE Cost #2 – Measurement error [P2]
n Answer: Attenuation bias on
mismeasured variable will go up!
q Practical advice: Be careful in interpreting 'zero'
coefficients on potentially mismeasured
regressors; might just be attenuation bias!
q And remember, sign of bias on other
coefficients will be generally difficult to know
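The amplification can be seen in a simulation sketch. The design below is an assumption chosen to isolate the measurement-error channel: the true regressor x* is mostly persistent within unit, the measurement error is classical and i.i.d., and there is no unit effect in y at all, so any gap between pooled OLS and FE comes only from how demeaning changes the signal-to-noise ratio.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 2000, 5                     # hypothetical panel dimensions
beta = 1.0

# True regressor: large persistent component plus small transitory component
x_star = 2.0 * rng.normal(size=(N, 1)) + 0.5 * rng.normal(size=(N, T))
x_obs = x_star + rng.normal(size=(N, T))          # classical measurement error
y = beta * x_star + rng.normal(size=(N, T))       # no unit effect, by assumption


def slope(x, y):
    """Bivariate OLS slope with an intercept."""
    xd = x - x.mean()
    return float((xd * (y - y.mean())).sum() / (xd ** 2).sum())


beta_pooled = slope(x_obs.ravel(), y.ravel())

# Demeaning removes the persistent (good) variation but keeps most of the noise
xw = x_obs - x_obs.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
beta_fe = slope(xw.ravel(), yw.ravel())

print(f"pooled OLS: {beta_pooled:.3f}   FE: {beta_fe:.3f}   true: {beta}")
```

With these variances the large-sample attenuation factor is about 4.25/5.25 ≈ 0.8 for pooled OLS but only 0.25/1.25 = 0.2 after demeaning, so a true coefficient of 1 can look close to zero in the FE estimation.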
44
FE Cost #2 – Measurement error [P3]
n Problem can also apply even when all
variables are perfectly measured [How?]
n Answer: Adding FE might throw out relevant
variation; e.g. y in firm FE model might respond to
sustained changes in x, rather than transitory
changes [see McKinnish 2008 for more details]
n With FE you'd only have the transitory variation
left over; might find x uncorrelated with y in FE
estimation even though sustained changes in x are
the most important determinant of y
45
FE Cost #2 – Example
n Difficult to identify causal effect of credit
shocks on firm output because credit shocks
coincide with demand shocks [i.e. OVB]
q Paravisini, Rappoport, Schnabl, Wolfenzon
(2014) used product-level export data & shock to
some Peru banks to address this
n Basically regressed product output on total firm credit,
and added firm, bank, and product×destination FE (i.e.
dummy for selling a product to a particular country!)
n Found small effect… [Concern?]
46
FE Cost #2 – Example continued
n Concern = Credit extended to firms may
be measured with error!
q E.g. some loan originations and payoffs may
not be recorded in timely fashion
q Need to be careful interpreting a coefficient
from a model with so many FE as “small”
n Note: This paper is actually very good (and does
IV as well), and the authors are very careful to not
interpret their findings as evidence that financial
constraints only have a “small” effect
47
FE Cost #2 – Any solution?
n Admittedly, measurement error, in
general, is difficult to address
n For examples on how to deal with
measurement error, see following papers
q Griliches and Hausman (JoE 1986)
q Biorn (Econometric Reviews 2000)
q Erickson and Whited (JPE 2000, RFS 2012)
q Almeida, Campello, and Galvao (RFS 2010)
48
FE Cost #3 – Computation issues [P1]
n Estimating a model with multiple types of
FE can be computationally difficult
q When more than one type of FE, you cannot
remove both using within-transformation
n Generally, you can only sweep one away with
within-transformation; other FE dealt with by
adding dummy variable to model
n E.g. firm and year fixed effects [See next slide]
49
FE Cost #3 – Computation issues [P2]
n Consider the model below:
$y_{i,t} = \alpha + \beta x_{i,t} + \delta_t + f_i + u_{i,t}$
where $\delta_t$ is the year FE and $f_i$ is the firm FE
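A sketch of the approach described above for two types of FE: sweep out the firm FE with the within transformation and handle the year FE with dummy variables. The panel is hypothetical (N = 40 firms, T = 6 years, balanced, illustrative parameter values). The key detail is that the year dummies must be demeaned by firm along with y and x; the result is then compared with a regression that enters both sets of dummies explicitly.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 40, 6                                   # hypothetical balanced panel
f = rng.normal(size=(N, 1))                    # firm FE
d = rng.normal(size=(1, T))                    # year FE
x = 0.5 * f + 0.5 * d + rng.normal(size=(N, T))
y = 1.0 * x + f + d + rng.normal(size=(N, T))

# Rows are ordered firm by firm; build year dummies (one year dropped)
year = np.tile(np.arange(T), N)
D_year = (year[:, None] == np.arange(1, T)[None, :]).astype(float)


def demean_by_firm(M):
    """Subtract firm-level means from each column (rows sorted by firm, T per firm)."""
    M3 = M.reshape(N, T, -1)
    return (M3 - M3.mean(axis=1, keepdims=True)).reshape(N * T, -1)


# Sweep firm FE from y, x, AND the year dummies, then run OLS
X_w = demean_by_firm(np.column_stack([x.ravel(), D_year]))
y_w = demean_by_firm(y.ravel()).ravel()
coef_w, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
beta_sweep = coef_w[0]

# Benchmark: firm dummies and year dummies both entered explicitly
firm = np.repeat(np.arange(N), T)
D_firm = (firm[:, None] == np.arange(N)[None, :]).astype(float)
X_full = np.column_stack([x.ravel(), D_firm, D_year])
coef_full, *_ = np.linalg.lstsq(X_full, y.ravel(), rcond=None)
beta_dummies = coef_full[0]

print(beta_sweep, beta_dummies)
```

The explicit-dummy regression needs N + T columns, while the swept version needs only T. That gap is harmless here but is the source of the memory problem once the second set of FE is also high-dimensional.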
50
FE Cost #3 – Computation issues [P3]
51
FE Cost #3 – Example
n But, computational issues are becoming
increasingly problematic
q Researchers using larger datasets with many
more complicated FE structures
q E.g. if you try adding both firm and
industry×year FE, you'll have a problem
n Estimating 4-digit SIC×year and firm FE in
Compustat requires ≈ 40 GB memory
n No one has this; hence, no one does it…
52
FE Cost #3 – Any Solution?
n Yes, there are some potential solutions
q Gormley and Matsa (2014) discusses some
of these solutions in Section 4
q We will come back to this in “Common
Limitations and Errors” lecture
53
FE – Some Remaining Issues
54
Predicted values of FE [Part 1]
n Sometimes, predicted value of
unobserved FE is of interest
n Can get predicted value using
$\hat{f}_i = \bar{y}_i - \hat{\beta} \bar{x}_i$, for all $i = 1, \ldots, N$
q E.g. Bertrand and Schoar (QJE 2003) did
this to back out CEO fixed effects
n They show that the CEO FE are jointly
statistically different from zero, suggesting
CEOs have 'styles' that affect their firms
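A minimal sketch of backing out the fixed effects with the formula above, on simulated data where the true f is known. The panel size and parameters are hypothetical, and the intercept α is set to zero so that the predicted value lines up with f itself (with a nonzero α, the same formula recovers α + f).

```python
import numpy as np

rng = np.random.default_rng(6)
N, T = 300, 10                                  # hypothetical panel dimensions
f = rng.normal(size=(N, 1))                     # true unit effects
x = 0.5 * f + rng.normal(size=(N, T))
y = 1.0 * x + f + rng.normal(size=(N, T))       # alpha = 0 for simplicity

# Step 1: FE estimate of beta from the within-transformed data
xw = x - x.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
beta_hat = (xw * yw).sum() / (xw ** 2).sum()

# Step 2: predicted fixed effect for each unit, f_hat_i = ybar_i - beta_hat * xbar_i
f_hat = y.mean(axis=1) - beta_hat * x.mean(axis=1)

corr = np.corrcoef(f_hat, f.ravel())[0, 1]
print(f"beta_hat: {beta_hat:.3f}   corr(f_hat, f): {corr:.3f}")
```

The predicted values track the true effects closely but not perfectly: each one still contains the unit's average error, which averages over only T observations.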
55
Predicted values of FE [Part 2]
n But, be careful with using these predicted
values of the FE
q They are unbiased, but inconsistent
n As sample size increases (and we get more
groups), we have more parameters to estimate…
never get the necessary asymptotics
n We call this the Incidental Parameters Problem
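The incidental parameters problem can be illustrated by tracking how far the predicted FE are from the truth as the panel grows in each dimension. The sketch below reuses a simple hypothetical design (α = 0, unit-variance errors) and reports the root mean squared error of the predicted FE for three panel shapes. Under these assumptions the error should be roughly 1/√T regardless of N.

```python
import numpy as np


def rmse_of_f_hat(N, T, seed):
    """Simulate a panel, predict each f_i, and return the RMSE of the predictions."""
    rng = np.random.default_rng(seed)
    f = rng.normal(size=(N, 1))
    x = 0.5 * f + rng.normal(size=(N, T))
    y = 1.0 * x + f + rng.normal(size=(N, T))    # alpha = 0 for simplicity
    xw = x - x.mean(axis=1, keepdims=True)
    yw = y - y.mean(axis=1, keepdims=True)
    beta_hat = (xw * yw).sum() / (xw ** 2).sum()
    f_hat = y.mean(axis=1) - beta_hat * x.mean(axis=1)
    return float(np.sqrt(np.mean((f_hat - f.ravel()) ** 2)))


rmse_base = rmse_of_f_hat(N=500, T=5, seed=7)     # baseline
rmse_more_n = rmse_of_f_hat(N=5000, T=5, seed=8)  # ten times more groups
rmse_more_t = rmse_of_f_hat(N=500, T=50, seed=9)  # ten times more obs per group

print(f"base: {rmse_base:.3f}   more groups: {rmse_more_n:.3f}   longer panel: {rmse_more_t:.3f}")
```

Adding groups leaves the accuracy of each predicted FE essentially unchanged, because every new group brings a new parameter with it. Only more observations per group helps.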
56
Predicted values of FE [Part 3]
57
Nonlinear models with FE [Part 1]
n Because we don't get consistent estimates
of the FE, we can't estimate nonlinear
panel data models with FE
q In practice, Logit, Tobit, Probit should not be
estimated with many fixed effects
q They only give consistent estimates under
rather strong and unrealistic assumptions
58
Nonlinear models with FE [Part 2]
q E.g. Probit with FE requires…
n Unobserved fi to be distributed normally
n fi and xi,t to be independent
[Why should we believe this to be true?
Almost surely not true in CF]
q And, Logit with FE requires…
n No serial correlation of y after conditioning on the
observable x and unobserved f
[Probably unlikely in many CF settings]
q For more details, see…
n Wooldridge (2010), Sections 13.9.1, 15.8.2-3
n Greene (2004) – uses simulation to show how bad
59
Random effects (RE) model [Part 1]
n Very similar model as FE…
$y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t}$
61
Random effects (RE) model [Part 2]
62
Random effects (RE) model [Part 3]
63
Random effects – My Take
64
First differencing (FD) [Part 1]
n First differencing is another way to
remove unobserved heterogeneities
q Rather than subtracting off the group
mean of the variable from each variable,
you instead subtract the lagged observation
q Easy to see why this also works…
66
First differencing (FD) [Part 2]
n Notice that,
$y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t}$
$y_{i,t-1} = \alpha + \beta x_{i,t-1} + f_i + u_{i,t-1}$
n From this, we can see that
$y_{i,t} - y_{i,t-1} = \beta (x_{i,t} - x_{i,t-1}) + (u_{i,t} - u_{i,t-1})$
[Note: we'll lose one observation per cross-section
because there won't be a lag]
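A minimal sketch of the first-difference estimator on simulated data, assuming a hypothetical balanced panel (N = 500, T = 6) in which x is correlated with f. Differencing removes both α and f, so the differenced regression is run without an intercept, matching the equation above.

```python
import numpy as np

rng = np.random.default_rng(7)
N, T = 500, 6                                   # hypothetical panel dimensions
f = rng.normal(size=(N, 1))
x = 0.8 * f + rng.normal(size=(N, T))           # x correlated with f
y = 2.0 + 1.0 * x + f + rng.normal(size=(N, T))

# First differences along the time dimension; the first period has no lag
dx = np.diff(x, axis=1)                         # shape (N, T - 1)
dy = np.diff(y, axis=1)

# OLS of dy on dx without an intercept (alpha and f have differenced out)
beta_fd = float((dx * dy).sum() / (dx ** 2).sum())

print(dx.shape, f"beta_fd: {beta_fd:.3f}")
```

Each unit contributes T − 1 differenced observations rather than T, which is the lost observation per cross-section.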
67
First differences (without time)
68
FD versus FE [Part 1]
n When just two observations per group,
they are identical to each other
n In other cases, both are consistent;
difference is generally about efficiency
q FE is more efficient if disturbances,
ui,t, are serially uncorrelated
q FD is more efficient if disturbances,
ui,t, follow a random walk
[Which is true? Unclear. Truth is that it is
probably something in between]
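The claim that FD and FE coincide with two observations per group, and generally differ otherwise, can be checked directly. The sketch below simulates a hypothetical three-period panel, computes both estimators on the first two periods only, and then on all three. It assumes a model without time effects, so the FD regression has no intercept.

```python
import numpy as np


def fe_and_fd(x, y):
    """Return (within estimate, first-difference estimate) of the slope."""
    xw = x - x.mean(axis=1, keepdims=True)
    yw = y - y.mean(axis=1, keepdims=True)
    beta_fe = (xw * yw).sum() / (xw ** 2).sum()
    dx, dy = np.diff(x, axis=1), np.diff(y, axis=1)
    beta_fd = (dx * dy).sum() / (dx ** 2).sum()
    return float(beta_fe), float(beta_fd)


rng = np.random.default_rng(8)
N = 200                                         # hypothetical number of groups
f = rng.normal(size=(N, 1))
x = 0.5 * f + rng.normal(size=(N, 3))
y = 1.0 * x + f + rng.normal(size=(N, 3))

fe2, fd2 = fe_and_fd(x[:, :2], y[:, :2])        # T = 2: identical
fe3, fd3 = fe_and_fd(x, y)                      # T = 3: generally different

print(f"T=2: FE {fe2:.6f}  FD {fd2:.6f}")
print(f"T=3: FE {fe3:.6f}  FD {fd3:.6f}")
```

With T = 2, each demeaned observation is plus or minus half the first difference, so the two estimators are algebraically the same. With T = 3 both are still consistent here, but the numbers differ.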
69
FD versus FE [Part 2]
n If strict exogeneity is violated (i.e. xi,t is
correlated with ui,s for s≠t), FE might be better
q As long as we believe xi,t and ui,t are uncorrelated,
the FE's inconsistency shrinks to 0 at rate 1/T;
but, FD gets no better with larger T
q Remember: T is the # of observations per group
70
FD versus FE [Part 3]
n Bottom line: not a bad idea to try both…
q If different, you should try to understand why
q With an omitted variable or measurement
error, you’ll get diff. answers with FD and FE
n In fact, Griliches and Hausman (1986) shows that
because measurement error causes predictably
different biases in FD and FE, you can (under
certain circumstances) use the biased estimates to
back out the true parameter
71
Lagged dependent variables with FE
n We cannot easily estimate models with both
a lagged dep. var. and unobserved FE
$y_{i,t} = \alpha + \rho y_{i,t-1} + \beta x_{i,t} + f_i + u_{i,t}$, $\rho < 1$
73
Lagged y & FE – Problem with OLS
n To see the problem with OLS, suppose
you estimate the following:
$y_{i,t} = \alpha + \rho y_{i,t-1} + \beta x_{i,t} + v_{i,t}$, where $v_{i,t} = f_i + u_{i,t}$
q But, $y_{i,t-1} = \alpha + \rho y_{i,t-2} + \beta x_{i,t-1} + f_i + u_{i,t-1}$
q Thus, $y_{i,t-1}$ and the composite error, $v_{i,t}$, are positively
correlated because they both contain $f_i$
q I.e. you get omitted variable bias
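A simulation sketch of this omitted variable bias. To keep it short, the design drops x and the intercept (β = 0, α = 0), which is an assumption made purely for illustration, and uses hypothetical values ρ = 0.5, N = 2000 and five usable periods. Pooled OLS of y on its lag, ignoring f, should overstate ρ because the lag and the composite error share f.

```python
import numpy as np


def simulate_dynamic_panel(N, T, rho, sd_f, rng, burn=50):
    """Simulate y_it = rho * y_i,t-1 + f_i + u_it; return an (N, T + 1) array."""
    f = sd_f * rng.normal(size=N)
    y = f / (1.0 - rho)                  # start each unit at its long-run mean
    path = []
    for t in range(burn + T + 1):
        y = rho * y + f + rng.normal(size=N)
        if t >= burn:                    # discard the burn-in periods
            path.append(y)
    return np.column_stack(path)


rng = np.random.default_rng(10)
rho_true = 0.5
Y = simulate_dynamic_panel(N=2000, T=5, rho=rho_true, sd_f=1.0, rng=rng)

# Pooled OLS of y_t on y_{t-1} with an intercept, leaving f in the error
y_lag, y_cur = Y[:, :-1].ravel(), Y[:, 1:].ravel()
y_lag_c = y_lag - y_lag.mean()
rho_ols = float((y_lag_c * (y_cur - y_cur.mean())).sum() / (y_lag_c ** 2).sum())

print(f"true rho: {rho_true}   pooled OLS: {rho_ols:.3f}")
```

The estimate lands well above the true value, consistent with the positive correlation between the lagged dependent variable and the composite error.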
74
Lagged y & FE – Problem with FE
n Will skip the math, but it is always biased
q Basic idea is that if you do a within
transformation, the lagged mean of y, which will be
on RHS of the model now, will always be
negatively correlated with demeaned error, u
n Note #1 – This is true even if there was no unobserved
heterogeneity, f; FE with lagged values is always bad idea
n Note #2: Same problem applies to FD
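Note #1 can be checked with a simulation sketch in which there is no unobserved heterogeneity at all (f is set to zero). As before, the design omits x and the intercept for brevity and uses hypothetical values ρ = 0.5, N = 2000 and five usable periods. Pooled OLS is then fine, but the within estimator is biased downward because demeaning makes the transformed lag correlated with the transformed error.

```python
import numpy as np


def simulate_dynamic_panel(N, T, rho, sd_f, rng, burn=50):
    """Simulate y_it = rho * y_i,t-1 + f_i + u_it; return an (N, T + 1) array."""
    f = sd_f * rng.normal(size=N)
    y = f / (1.0 - rho)
    path = []
    for t in range(burn + T + 1):
        y = rho * y + f + rng.normal(size=N)
        if t >= burn:                    # discard the burn-in periods
            path.append(y)
    return np.column_stack(path)


rng = np.random.default_rng(11)
rho_true = 0.5
Y = simulate_dynamic_panel(N=2000, T=5, rho=rho_true, sd_f=0.0, rng=rng)  # no f
y_lag, y_cur = Y[:, :-1], Y[:, 1:]

# Pooled OLS with an intercept: consistent here because there is no f
lag_c = y_lag - y_lag.mean()
rho_pooled = float((lag_c * (y_cur - y_cur.mean())).sum() / (lag_c ** 2).sum())

# Within estimator: demean the lag and the current value by unit
lag_w = y_lag - y_lag.mean(axis=1, keepdims=True)
cur_w = y_cur - y_cur.mean(axis=1, keepdims=True)
rho_fe = float((lag_w * cur_w).sum() / (lag_w ** 2).sum())

print(f"true rho: {rho_true}   pooled OLS: {rho_pooled:.3f}   FE: {rho_fe:.3f}")
```

The size of the downward bias depends on the number of periods per unit; rerunning with a larger T shrinks it, while adding more units does not.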
75
How do we estimate this? IV?
n Basically, you're going to need instrument;
we will come back to this next week….
76
Lagged y versus FE – Bracketing
n Suppose you don't know which is correct
q Lagged value model: $y_{i,t} = \alpha + \gamma y_{i,t-1} + \beta x_{i,t} + u_{i,t}$
q Or, FE model: $y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t}$
77
Bracketing continued…
n Use this to 'bracket' where true β is…
q But sometimes, you won't observe bracketing
q Likely means your model is incorrect in other
ways, or there is some severe finite sample bias
78
Summary of Today [Part 1]
79
Summary of Today [Part 2]
n FE estimator, however, has weaknesses
q Can't estimate variables that don't vary within
groups [or at least, not without an instrument]
q Could amplify any measurement error
n For this reason, be cautious interpreting zero or small
coefficients on possibly mismeasured variables
80
Summary of Today [Part 3]
81
In First Half of Next Class
n Instrumental variables
q What are the necessary assumptions? [E.g.
what is the exclusion restriction?]
q Is there a way we can test whether our
instruments are okay?
82
Assign papers for next week…
n Khwaja and Mian (AER 2008)
q Bank liquidity shocks
83
Break Time
n Let's take our 10 minute break
n We'll do presentations when we get back
84