04 - Panel Data PDF

FNCE 926
Empirical Methods in CF
Lecture 4 – Panel Data
Professor Todd Gormley

Announcements
n  Exercise #2 is due next week
q  You can download it from Canvas
q  Largely just has you manipulate panel data
n  Please upload both completed DO file and
typed solutions to Canvas [don't e-mail]
2
Background readings
n  Angrist and Pischke
q  Sections 5.1, 5.3
n  Wooldridge
q  Chapter 10 and Sections 13.9.1, 15.8.2, 15.8.3
n  Greene
q  Chapter 11
3
Outline for Today
n  Quick review
n  Motivate how panel data is helpful
q  Fixed effects model
q  Random effects model
q  First differences
q  Lagged y models
n  Student presentations of “Causality”
4
Quick Review [Part 1]
n  What is the key assumption needed for us
to make causal inferences? And what are
the ways in which it can be violated?
q  Answer = CMI is violated whenever an
independent variable, x, is correlated with the
error, u. This occurs when there is…
n  Omitted variable bias
n  Measurement error bias
n  Simultaneity bias
5
n  When is it possible to determine the sign of
an omitted variable bias?
q  Answer = Basically, when there is just one
OMV that is correlated with just one of the x's;
other scenarios are much more complicated
6
n  When is measurement error of the
dependent variable problematic (for
identifying the causal CEF)?
q  Answer = If error is correlated with any x.
7
n  What is the bias on the coefficient of x,
and on other coefficients, when an independent
variable, x, is measured with error?
q  Answer = Hard to know!
n  If ME is uncorrelated with observed x, no bias
n  If ME is uncorrelated with unobserved x*, the
coefficient on x has an attenuation bias, but the
sign of the bias on all other coefficients is unclear
8
n  When will an estimation suffer from
simultaneity bias?
q  Answer = If we can think of any x as a
potential outcome variable; i.e. we think y
might directly affect an x
9
Outline for Panel Data
n  Fixed effects model
q  Benefits [There are many]
q  Costs [There are some…]
n  Random effects model

n  First differences
n  Lagged y models
10
Motivation [Part 1]
n  As noted in prior lecture, omitted
variables pose a substantial hurdle in
our ability to make causal inferences
n  What's worse… many of them are
inherently unobservable to researchers
11
Motivation [Part 2]
n  E.g. consider the firm-level estimation
$$ leverage_{i,j,t} = \beta_0 + \beta_1 profit_{i,j,t-1} + u_{i,j,t} $$
where leverage is debt/assets for firm i,
operating in industry j in year t, and profit is
the firm's net income/assets
What might be some unobservable
omitted variables in this estimation?
12
Motivation [Part 3]
n  Oh, there are so, so many…
q  Managerial talent and/or risk aversion
q  Industry supply and/or demand shock
q  Cost of capital
q  Investment opportunities
q  And so on…
[Sadly, this is easy to do with other dependent or independent variables…]
n  Easy to think of ways these might affect
leverage and be correlated with profits
13
Motivation [Part 4]
n  Using observations from various
geographical regions (e.g. state or country)
opens up even more possibilities…
q  Can you think of some unobserved variables
that might be related to a firm's location?
n  Answer: any unobserved differences in local economic
environment, e.g. institutions, protection of property
rights, financial development, investor sentiment,
regional demand shocks, etc.
14
Motivation [Part 5]
n  Sometimes, we can control for these
unobservable variables using proxy variables
q  But, what assumption was required for a
proxy variable to provide consistent
estimates on the other parameters?
n  Answer: It needs to be a sufficiently good proxy such
that the unobserved variable can't be correlated with
the other explanatory variables after we control for
the proxy variable… This might be hard to find
15
Panel data to the rescue…
n  Thankfully, panel data can help us with a
particular type of unobserved variable…
q  What type of unobserved variable does

panel data help us with, and why?
q  Answer = It helps us with time-invariant
omitted variables; now, let's see why…
[Actually, it helps with any unobserved variable that
doesn't vary within groups of observations]
16

17
Panel data
n  Panel data = whenever you have multiple
observations per unit of observation i (e.g.
you observe each firm over multiple years)
q  Let's assume N units i
q  And, T observations per unit i [i.e. balanced panel]
n  Ex. #1 – You observe 5,000 firms in Compustat
over a twenty year period [i.e. N=5,000, T=20]
n  Ex. #2 – You observe 1,000 CEOs in Execucomp
over a 10 year period [i.e. N=1,000, T=10]
18
Time-invariant unobserved variable
n  Consider the following model…
$$ y_{i,t} = \alpha + \beta x_{i,t} + \delta f_i + u_{i,t} $$
where $f_i$ is an unobserved, time-invariant variable, and
$$ E(u_{i,t}) = 0 $$
$$ \mathrm{corr}(x_{i,t}, f_i) \neq 0 $$
$$ \mathrm{corr}(f_i, u_{i,t}) = 0 $$
q  These imply what?
n  Answer: If we don't control for f, we have OVB, but if we
could, then we wouldn't
q  We also assume
$$ \mathrm{corr}(x_{i,t}, u_{i,s}) = 0 \text{ for all } s, t $$
Note: This is a stronger assumption than we usually make; it's
called strict exogeneity. In words, this assumption means what?
19
If we ignore f, we get OVB
n  If we estimate the model…
$$ y_{i,t} = \alpha + \beta x_{i,t} + v_{i,t}, \quad v_{i,t} = \delta f_i + u_{i,t} $$
q  x is correlated with the disturbance v (through
its correlation with the unobserved variable, f,
which is now part of the disturbance)
q  Easy to show
$$ \hat{\beta}^{OLS} = \beta + \delta \frac{\sigma_{xf}}{\sigma_x^2} $$
This is standard OVB: the coefficient from a regression
of the omitted variable, f, on x,
times the true coefficient on f
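A minimal numerical check of the OVB expression, assuming a tiny made-up dataset with no disturbance u, so that the identity holds exactly in sample. All numbers and variable names are illustrative assumptions, not taken from the lecture.

```python
# Hypothetical noise-free data: y = alpha + beta*x + delta*f, with x correlated with f.
alpha, beta, delta = 1.0, 2.0, 3.0
f = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]      # unobserved, constant within each unit
x = [1.0, 2.0, 2.0, 4.0, 5.0, 6.0]      # observed regressor
y = [alpha + beta * xi + delta * fi for xi, fi in zip(x, f)]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

beta_ols = cov(x, y) / cov(x, x)                      # slope from regressing y on x only
ovb_formula = beta + delta * cov(x, f) / cov(x, x)    # beta + delta * sigma_xf / sigma_x^2

print(beta_ols, ovb_formula)   # the two numbers coincide and exceed beta = 2
```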
20
Can solve this by transforming data
n  First, notice that if you take the mean over
time of the dependent variable for each
unit of observation, i, you get…
$$ \bar{y}_i = \alpha + \beta \bar{x}_i + \delta f_i + \bar{u}_i $$
where
$$ \bar{y}_i = \frac{1}{T}\sum_t y_{i,t}, \quad \bar{x}_i = \frac{1}{T}\sum_t x_{i,t}, \quad \bar{u}_i = \frac{1}{T}\sum_t u_{i,t} $$
[Again, I assumed there are T obs. per unit i]
21
Transforming data [Part 2]
n  Now, if we subtract $\bar{y}_i$ from $y_{i,t}$, we have
$$ y_{i,t} - \bar{y}_i = \beta (x_{i,t} - \bar{x}_i) + (u_{i,t} - \bar{u}_i) $$
q  And look! The unobserved variable, $f_i$, is gone
(as is the constant) because it is time-invariant
q  With our assumption of strict exogeneity earlier,
easy to see that $(x_{i,t} - \bar{x}_i)$ is uncorrelated with the
new disturbance, $(u_{i,t} - \bar{u}_i)$, which means…?
22
Fixed Effects (or Within) Estimator
n  Answer: OLS estimation of transformed
model will yield a consistent estimate of β
n  The prior transformation is called the
“within transformation” because it
demeans all variables within their group
q  In this case, the “group” was each cross-section
of observations over time for each firm
q  This is also called the FE estimator
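A minimal sketch of the within transformation on a made-up, noise-free panel (three hypothetical firms, three years each, true β = 2). The firm labels, fixed effects, and x values are assumptions chosen so that x is correlated with the fixed effect.

```python
# Hypothetical noise-free panel: y = 2*x + f_i.
beta = 2.0
panel = {                       # firm -> (f_i, [x_i1, x_i2, x_i3])
    "A": (0.0, [1.0, 2.0, 3.0]),
    "B": (5.0, [2.0, 3.0, 5.0]),
    "C": (10.0, [4.0, 4.5, 6.0]),
}

def slope(xs, ys):
    """OLS slope of ys on xs (with a constant)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

x_all, y_all, x_dm, y_dm = [], [], [], []
for f_i, xs in panel.values():
    ys = [beta * x + f_i for x in xs]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    x_all += xs
    y_all += ys
    x_dm += [x - x_bar for x in xs]     # within transformation
    y_dm += [y - y_bar for y in ys]

beta_pooled = slope(x_all, y_all)   # ignores f_i, so it suffers from OVB
beta_within = slope(x_dm, y_dm)     # f_i has been demeaned away

print(beta_pooled, beta_within)     # pooled is well above 2; within recovers 2
```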
23
Unobserved heterogeneity – Tangent
n  Unobserved variable, f, is very general
q  Doesn't just capture one unobserved
variable; captures all unobserved variables
that don't vary within the group
q  This is why we often just call it
“unobserved heterogeneity”
24
FE Estimator – Practical Advice
n  When you use the fixed effects (FE)
estimator in programs like Stata, it does
the within transformation for you
n  Don't do it on your own because…
q  The degrees of freedom (DoF), which are used
to get the standard errors, sometimes need to be
adjusted down by the number of panels, N
q  What adjustment is necessary depends on
whether you cluster, etc.
25
Least Squares Dummy Variable (LSDV)
n  Another way to do the FE estimation is
by adding indicator (dummy) variables
q  Notice that the coefficient on fi, δ, doesn't
really have any meaning; so, can just rescale
the unobserved fi to make it equal to 1
$$ y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t} $$
q  Now, to estimate this, we can just treat each
fi as a parameter to be estimated
26
LSDV continued…
n  I.e. create a dummy variable for each
group i, and add it to the regression
q  This is least squares dummy variable model
q  Now, our estimation equation exactly matches
the true underlying model
q  We get consistent estimates and SE that are
identical to what we'd get with within estimator
27
LSDV – Practical Advice
n  Because the dummy variables will be
collinear with the constant, one of them
will be dropped in the estimation
q  Therefore, don't try to interpret the intercept;
it is just the average y when all the x's are
equal to zero for the group corresponding to
the dropped dummy variable
q  In xtreg, fe, the reported intercept is just
average of individual specific intercepts
28
LSDV versus FE [Part 1]
n  Can show that LSDV and FE are identical,
using partial regression results [How?]
q  Remember, to control for some variable z, we can
regress y onto both x and z, or we can just partial
z out from both y and x before regressing y on x
(i.e. regress residuals from regression of y on z
onto residual from regression of x on z)
q  The demeaned variables are the residuals from a
regression of them onto the group dummies!
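The equivalence can be checked numerically. The sketch below assumes a small made-up panel, estimates LSDV by solving the normal equations with a hand-written Gaussian elimination (a helper written for this illustration, not a library routine), and compares the coefficient on x with the within estimator.

```python
# Hypothetical small panel with noise; firm labels and numbers are made up.
firms = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x = [1.0, 2.0, 3.0, 2.0, 3.0, 5.0, 4.0, 4.5, 6.0]
y = [2.1, 3.8, 6.3, 9.2, 10.7, 15.1, 18.0, 19.4, 21.9]

def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            k = m[r][c] / m[c][c]
            m[r] = [vr - k * vc for vr, vc in zip(m[r], m[c])]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][j] * z[j] for j in range(r + 1, n))) / m[r][r]
    return z

# LSDV: regress y on x plus one dummy per firm (no constant, so nothing is collinear).
groups = sorted(set(firms))
X = [[xi] + [1.0 if g == fi else 0.0 for g in groups] for xi, fi in zip(x, firms)]
k = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
beta_lsdv = solve(xtx, xty)[0]

# Within estimator: demean x and y by firm, then regress without a constant.
def demean(v):
    means = {g: sum(vi for vi, fi in zip(v, firms) if fi == g) / firms.count(g)
             for g in groups}
    return [vi - means[fi] for vi, fi in zip(v, firms)]

xd, yd = demean(x), demean(y)
beta_within = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(beta_lsdv, beta_within)   # identical up to floating-point error
```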
29
LSDV versus FE [Part 2]
n  Reported R2 will be larger with LSDV
q  All the dummy variables will explain a lot of the
variation in y, driving up R2
q  Within R2 reported for FE estimator just reports
the proportion of the within variation in y that is
explained by the within variation in x
q  The within R2 is usually of more interest to us
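A small sketch of the two R2 measures, assuming a made-up panel in which most of the variation in y is across firms. Both measures use the same residuals; only the total sum of squares in the denominator differs.

```python
# Hypothetical panel (made-up numbers): most variation in y is across firms.
firms = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x = [1.0, 2.0, 3.0, 2.0, 3.0, 5.0, 4.0, 4.5, 6.0]
y = [2.1, 3.8, 6.3, 9.2, 10.7, 15.1, 18.0, 19.4, 21.9]

def demean_by_firm(v):
    out = []
    for vi, fi in zip(v, firms):
        grp = [vj for vj, fj in zip(v, firms) if fj == fi]
        out.append(vi - sum(grp) / len(grp))
    return out

xd, yd = demean_by_firm(x), demean_by_firm(y)
beta = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

ssr = sum((b - beta * a) ** 2 for a, b in zip(xd, yd))   # same residuals as LSDV
sst_within = sum(b * b for b in yd)                      # variation in y within firms
y_bar = sum(y) / len(y)
sst_total = sum((yi - y_bar) ** 2 for yi in y)           # all variation in y

r2_within = 1 - ssr / sst_within
r2_lsdv = 1 - ssr / sst_total

print(round(r2_within, 3), round(r2_lsdv, 3))   # LSDV R2 is the larger of the two
```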
30
R-squared with FE – Practical Advice
n  The within R2 is usually of more interest
since it describes explanatory power of x's
[after partialling out the FE]
q  To get within R2, use xtreg, fe
n  Reporting overall adjusted-R2 is also useful
q  To get overall adjusted-R2, use areg command
instead of xtreg, fe. The “overall R2” reported
by xtreg does not include variation explained
by FE, but the R2 reported by areg does
31

32
FE Estimator – Benefits [Part 1]
n  There are many benefits of FE estimator
q  Allows for arbitrary correlation between each
fixed effect, fi, and each x within group i
n  I.e. it's very general and not imposing much structure on
what the underlying data must look like
q  Very intuitive interpretation; coefficient is

identified using only changes within cross-sections
33
FE Estimator – Benefits [Part 2]
q  It is also very flexible and can help us control for
many types of unobserved heterogeneities
n  Can add year FE if worried about unobserved
heterogeneity across time [e.g. macroeconomic shocks]
n  Can add CEO FE if worried about unobserved
heterogeneity across CEOs [e.g. talent, risk aversion]
n  Add industry-by-year FE if worried about unobserved
heterogeneity across industries over time [e.g. investment
opportunities, demand shocks]
34
FE Estimator – Tangent [Part 1]
n  FE estimator is very general

q  It applies to any scenario where
observations can be grouped together
n  Ex. #1 – Firms can be grouped by industry
n  Ex. #2 – CEO observations (which may span multiple
firms) can be grouped by CEO-firm combinations
q  Textbook example of grouping units i across time
is just one example (though, the most common)
35
FE Estimator – Tangent [Part 2]
n  Once you are able to construct groups, you

can remove any unobserved 'group-level
heterogeneity' by adding group FE
q  Consistency just requires there be a large
number of groups
36

37
FE Estimator – Costs
n  But, FE estimator also has its costs

q  Can't identify variables that don't vary within group
q  Subject to potentially large measurement error bias
q  Can be hard to estimate in some cases
q  Miscellaneous issues
38
FE Cost #1 – Can't estimate some var.
n  If no within-group variation in the
independent variable, x, of interest, can't
disentangle it from group FE
q  It is collinear with group FE; and will be
dropped by computer or swept out in the
within transformation
39
FE Cost #1 – Example
q  Consider following CEO-level estimation
$$ \ln(totalpay)_{ijt} = \alpha + \beta_1 \ln(firmsize)_{ijt} + \beta_2 volatility_{ijt} + \beta_3 female_i + \delta_t + f_i + \lambda_j + u_{ijt} $$
n  Ln(totalpay) is for CEO i, firm j, year t
n  Estimation includes year, CEO, and firm FE
q  What coefficient can't be estimated?

n  Answer: β3! Being female doesn’t vary within
the group of each CEO’s observations; i.e. it is
collinear with the CEO fixed effect
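A tiny illustration of why the coefficient cannot be estimated, assuming made-up CEO identifiers: demeaning a variable that is constant within each CEO leaves only zeros.

```python
# Hypothetical CEO-year observations; 'female' is fixed for each CEO.
ceo    = ["ann", "ann", "ann", "bob", "bob", "cat", "cat", "cat"]
female = [1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0]

def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

female_within = demean_by_group(female, ceo)
print(female_within)   # every entry is 0.0, so no variation is left to identify beta_3
```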
40
FE Cost #1 – Practical Advice
n  Be careful of this!
q  Programs like xtreg are good about dropping the
female variable and not reporting an estimate…
q  But, if you create dummy variables yourself and
input them yourself, the estimation might drop one
of them rather than the female indicator
n  I.e. you'll get an estimate for β3, but it has no
meaning! It's just a random intercept value that
depends entirely on the random FE dropped by Stata
41
FE Cost #1 – Any Solution?
n  Instrumental variables can provide a
possible solution for this problem
q  See Hausman and Taylor (Econometrica 1981)
q  We will discuss this next week
42
FE Cost #2 – Measurement error [P1]
n  Measurement error of independent variable
(and resulting biases) can be amplified
q  Think of there being two types of variation
n  Good (meaningful) variation
n  Noise variation because we don't perfectly
measure the underlying variable of interest
q  Adding FE can sweep out a lot of the good

variation; fraction of remaining variation coming
from noise goes up [What will this do?]
43
n  Answer: Attenuation bias on
mismeasured variable will go up!
q  Practical advice: Be careful in interpreting 'zero'
coefficients on potentially mismeasured
regressors; might just be attenuation bias!
q  And remember, sign of bias on other
coefficients will be generally difficult to know
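A simulation sketch of the amplified attenuation bias. The design is an assumption made for illustration: the true β is 1, the true regressor has a persistent firm component (variance 4) plus a transitory component (variance 1), the measurement error has variance 1, and y contains no fixed effect, so any bias comes from measurement error alone. Under these assumed variances, the large-sample values are roughly 5/6 for pooled OLS and 1/2 for FE.

```python
import random

random.seed(0)
beta, N, T = 1.0, 2000, 5
rows = []   # (firm id, y, mismeasured x)
for i in range(N):
    a_i = random.gauss(0, 2)                    # persistent "good" variation in x*
    for _ in range(T):
        x_star = a_i + random.gauss(0, 1)       # true regressor
        y = beta * x_star + random.gauss(0, 1)
        x_obs = x_star + random.gauss(0, 1)     # classical measurement error
        rows.append((i, y, x_obs))

def slope(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))

# Pooled OLS on the mismeasured regressor
beta_ols = slope([r[2] for r in rows], [r[1] for r in rows])

# FE: demean y and x within firm, then run OLS
xd, yd = [], []
for i in range(N):
    block = rows[i * T:(i + 1) * T]
    x_bar = sum(r[2] for r in block) / T
    y_bar = sum(r[1] for r in block) / T
    xd += [r[2] - x_bar for r in block]
    yd += [r[1] - y_bar for r in block]
beta_fe = slope(xd, yd)

print(round(beta_ols, 2), round(beta_fe, 2))   # both below 1; FE is attenuated more
```

The firm FE sweep out the persistent component of x*, which was the bulk of the good variation, while the noise survives demeaning almost intact.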
44
n  Problem can also apply even when all
variables are perfectly measured [How?]
n  Answer: Adding FE might throw out relevant
variation; e.g. y in firm FE model might respond to
sustained changes in x, rather than transitory
changes [see McKinnish 2008 for more details]
n  With FE you'd only have the transitory variation
left over; might find x uncorrelated with y in FE
estimation even though sustained changes in x are the
most important determinant of y
45
FE Cost #2 – Example
n  Difficult to identify causal effect of credit
shocks on firm output because credit shocks
coincide with demand shocks [i.e. OVB]
q  Paravisini, Rappoport, Schnabl, Wolfenzon
(2014) used product-level export data & a shock to
some Peruvian banks to address this
n  Basically regressed product output on total firm credit,
and added firm, bank, and product×destination FE (i.e.
dummy for selling a product to a particular country!)
n  Found small effect… [Concern?]
46
FE Cost #2 – Example continued
n  Concern = Credit extended to firms may
be measured with error!
q  E.g. some loan originations and payoffs may
not be recorded in timely fashion
q  Need to be careful interpreting a coefficient
from a model with so many FE as “small”
n  Note: This paper is actually very good (and does
IV as well), and the authors are very careful to not
interpret their findings as evidence that financial
constraints only have a “small” effect
47
FE Cost #2 – Any solution?
n  Admittedly, measurement error, in
general, is difficult to address
n  For examples on how to deal with
measurement error, see following papers
q  Griliches and Hausman (JoE 1986)
q  Biorn (Econometric Reviews 2000)
q  Erickson and Whited (JPE 2000, RFS 2012)
q  Almeida, Campello, and Galvao (RFS 2010)
48
FE Cost #3 – Computation issues [P1]
n  Estimating a model with multiple types of
FE can be computationally difficult
q  When more than one type of FE, you cannot
remove both using within-transformation
n  Generally, you can only sweep one away with
within-transformation; other FE dealt with by
adding dummy variable to model
n  E.g. firm and year fixed effects [See next slide]
49
n  Consider the model below, where $\delta_t$ is the year FE and $f_i$ is the firm FE:
$$ y_{i,t} = \alpha + \beta x_{i,t} + \delta_t + f_i + u_{i,t} $$
q  To estimate this in Stata, we'd use a
command something like the following…
    * Tells Stata that panel dimension is given by firm variable
    xtset firm
    * i.year tells Stata to create and add dummy variables for year variable;
    * fe tells Stata to remove FE for panels (i.e. firms) by doing
    * within-transformation
    xi: xtreg y x i.year, fe
50
n  Dummies not swept away in within-

transformation are actually estimated
q  With year FE, this isn't problem because
there aren't that many years of data
q  If had to estimate 1,000s of firm FE,
however, it might be a problem
n  In fact, this is why we sweep away the firm FE
rather than the year FE; there are more firms!
51
n  But, computational issues are becoming
increasingly problematic
q  Researchers using larger datasets with many
more complicated FE structures
q  E.g. if you try adding both firm and
industry×year FE, you'll have a problem
n  Estimating 4-digit SIC×year and firm FE in
Compustat requires ≈ 40 GB memory
n  No one has this; hence, no one does it…
52
FE Cost #3 – Any Solution?
n  Yes, there are some potential solutions
q  Gormley and Matsa (2014) discusses some
of these solutions in Section 4
q  We will come back to this in “Common
Limitations and Errors” lecture
53
FE – Some Remaining Issues
n  Two more issues worth noting about FE

q  Predicted values of unobserved FE
q  Non-linear estimations with FE and the
incidental parameter problem
54
Predicted values of FE [Part 1]
n  Sometimes, predicted value of
unobserved FE is of interest
n  Can get predicted value using
$$ \hat{f}_i = \bar{y}_i - \hat{\beta} \bar{x}_i, \text{ for all } i = 1, \ldots, N $$
q  E.g. Bertrand and Schoar (QJE 2003) did
this to back out CEO fixed effects
n  They show that the CEO FE are jointly
statistically different from zero, suggesting
CEOs have 'styles' that affect their firms
55
n  But, be careful with using these predicted
values of the FE
q  They are unbiased, but inconsistent
n  As sample size increases (and we get more
groups), we have more parameters to estimate…
never get the necessary asymptotics
n  We call this the Incidental Parameters Problem
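A simulation sketch of why more groups do not help, under assumed values (β = 1, unit-variance disturbances, T = 4, no separate constant). The function name `fe_mse` and the sample sizes are hypothetical choices for this illustration. Each estimated fixed effect uses only its own T observations, so its error variance stays near σ_u²/T however large N becomes.

```python
import random

def fe_mse(N, T, seed):
    """Mean squared error of the estimated fixed effects in a simulated panel."""
    rng = random.Random(seed)
    beta = 1.0
    f, x, y = [], [], []
    for _ in range(N):
        f_i = rng.gauss(0, 1)
        xs = [f_i + rng.gauss(0, 1) for _ in range(T)]     # x correlated with f
        ys = [beta * xv + f_i + rng.gauss(0, 1) for xv in xs]
        f.append(f_i)
        x.append(xs)
        y.append(ys)
    # Within estimator of beta
    sxy = sxx = 0.0
    for xs, ys in zip(x, y):
        xb, yb = sum(xs) / T, sum(ys) / T
        sxy += sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
        sxx += sum((a - xb) ** 2 for a in xs)
    b_hat = sxy / sxx
    # f_hat_i = ybar_i - b_hat * xbar_i (the simulated model has no separate constant)
    errs = [(sum(ys) / T - b_hat * sum(xs) / T - f_i) ** 2
            for xs, ys, f_i in zip(x, y, f)]
    return sum(errs) / N

T = 4
mse_small = fe_mse(N=200, T=T, seed=1)
mse_large = fe_mse(N=5000, T=T, seed=2)
print(round(mse_small, 2), round(mse_large, 2))   # both stay near 1/T = 0.25
```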
56
q  Moreover, doing an F-test to show they are

statistically different from zero is only valid
under rather strong assumptions
n  Need to assume errors, u, are distributed normally,
homoskedastic, and serially uncorrelated
n  See Wooldridge (2010, Section 10.5.3) and Fee,
Hadlock, and Pierce (2011) for more details
57
Nonlinear models with FE [Part 1]
n  Because we don't get consistent estimates
of the FE, we can't estimate nonlinear
panel data models with FE
q  In practice, Logit, Tobit, Probit should not be
estimated with many fixed effects
q  They only give consistent estimates under
rather strong and unrealistic assumptions
58
Nonlinear models with FE [Part 2]
q  E.g. Probit with FE requires…
n  Unobserved $f_i$ to be distributed normally
[Why should we believe this to be true?]
n  $f_i$ and $x_{i,t}$ to be independent
[Almost surely not true in CF]
q  And, Logit with FE requires…
n  No serial correlation of y after conditioning on the
observable x and unobserved f
[Probably unlikely in many CF settings]
q  For more details, see…
n  Wooldridge (2010), Sections 13.9.1, 15.8.2-3
n  Greene (2004) – uses simulation to show how bad
59

60
Random effects (RE) model [Part 1]
n  Very similar model as FE…
n  But, one big difference…

q  It assumes that unobserved heterogeneity, fi,
and observed x's are uncorrelated
n  What does this imply about consistency of OLS?
n  Is this a realistic assumption in corporate finance?
61
n  Answer #1 – That assumption means that

OLS would give you consistent estimate of β!
n  Then why bother?
q  Answer… potential efficiency gain relative to FE
n  FE is no longer most efficient estimator. If our
assumption is correct, we can get more efficient estimate
by not eliminating the FE and doing generalized least
squares [Note: can't just do OLS; it will be consistent as well but
SE will be wrong since they ignore serial correlation]
62
n  Answer #2 – The assumption that f and x

are uncorrelated is likely unrealistic in CF
q  The violation of this assumption is whole
motivation behind why we do FE estimation!
n  Recall that correlation between unobserved
variables, like managerial talent, demand shocks,
etc., and x will cause omitted variable bias
63
Random effects – My Take
n  In practice, RE model is not very useful

q  As Angrist-Pischke (page 223) write,
n  Relative to fixed effects estimation, random effects
requires stronger assumptions to hold
n  Even if right, asymptotic efficiency gain likely modest
n  And, finite sample properties can be worse
q  Bottom line, don't bother with it
64

65
First differencing (FD) [Part 1]
n  First differencing is another way to
remove unobserved heterogeneities
q  Rather than subtracting off the group
mean of the variable from each variable,
you instead subtract the lagged observation
q  Easy to see why this also works…
66
First differencing (FD) [Part 2]
n  Notice that,
$$ y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t} $$
$$ y_{i,t-1} = \alpha + \beta x_{i,t-1} + f_i + u_{i,t-1} $$
n  From this, we can see that
$$ y_{i,t} - y_{i,t-1} = \beta (x_{i,t} - x_{i,t-1}) + (u_{i,t} - u_{i,t-1}) $$
[Note: we'll lose one observation per cross-section because there won't be a lag]
q  When will OLS estimate of this provide a
consistent estimate of β?
n  Answer: With same strict exogeneity assumption of
FE (i.e. xi,t and ui,s are uncorrelated for all t and s)
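A minimal sketch of first differencing on a made-up, noise-free panel (true β = 2, three hypothetical firms observed for three periods). Differencing removes both the constant and the fixed effect, at the cost of one observation per firm.

```python
# Hypothetical noise-free panel: y = 2*x + f_i, ordered by time within firm.
beta = 2.0
panel = {"A": (0.0, [1.0, 2.0, 4.0]),
         "B": (5.0, [2.0, 3.0, 3.5]),
         "C": (10.0, [4.0, 6.0, 7.0])}

dx, dy = [], []
for f_i, xs in panel.values():
    ys = [beta * x + f_i for x in xs]
    # Differencing drops the first observation of each firm (no lag available)
    dx += [xs[t] - xs[t - 1] for t in range(1, len(xs))]
    dy += [ys[t] - ys[t - 1] for t in range(1, len(ys))]

# OLS through the origin on the differenced data (the constant has differenced out)
beta_fd = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

print(len(dx), beta_fd)   # 6 differenced observations from 9 originals
```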
67
First differences (without time)
n  First differences can also be done even

when observations within groups aren't
ordered by time
q  Just order the data within groups in whatever
way you want, and take 'differences'
q  Works, but admittedly, not usually done
68
FD versus FE [Part 1]
n  When just two observations per group,
they are identical to each other
n  In other cases, both are consistent;
difference is generally about efficiency
q  FE is more efficient if disturbances,
ui,t, are serially uncorrelated
q  FD is more efficient if disturbances,
ui,t, follow a random walk
[Which is true? Unclear. Truth is that it is probably
something in between]
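The two-observation case can be verified directly. The sketch below assumes a made-up two-period panel with noise, and neither regression includes a constant or a time dummy. With T = 2, each demeaned value is plus or minus half of the first difference, so the two slopes coincide.

```python
# Hypothetical two-period panel with noise (made-up numbers).
x = {"A": (1.0, 2.5), "B": (2.0, 2.2), "C": (4.0, 6.0), "D": (3.0, 2.0)}
y = {"A": (2.3, 5.1), "B": (9.0, 9.9), "C": (18.2, 21.5), "D": (7.7, 6.4)}

# First differences: one observation per firm
dx = [x[i][1] - x[i][0] for i in x]
dy = [y[i][1] - y[i][0] for i in x]
beta_fd = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

# Within transformation: demean both periods by the firm mean
xd, yd = [], []
for i in x:
    xb, yb = sum(x[i]) / 2, sum(y[i]) / 2
    xd += [v - xb for v in x[i]]
    yd += [v - yb for v in y[i]]
beta_fe = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(beta_fd, beta_fe)   # equal up to floating-point error
```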
69
n  If strict exogeneity is violated (i.e. xi,t is
correlated with ui,s for s≠t), FE might be better
q  As long as we believe xi,t and ui,t are uncorrelated,
the FE's inconsistency shrinks to 0 at rate 1/T;
but, FD gets no better with larger T
q  Remember: T is the # of observations per group
n  But, if y and x are spuriously correlated, and N

is small, T large, FE can be quite bad
70
n  Bottom line: not a bad idea to try both…
q  If different, you should try to understand why
q  With an omitted variable or measurement
error, you’ll get diff. answers with FD and FE
n  In fact, Griliches and Hausman (1986) shows that
because measurement error causes predictably
different biases in FD and FE, you can (under
certain circumstances) use the biased estimates to
back out the true parameter
71

72
Lagged dependent variables with FE
n  We cannot easily estimate models with both
a lagged dep. var. and unobserved FE
$$ y_{i,t} = \alpha + \rho y_{i,t-1} + \beta x_{i,t} + f_i + u_{i,t}, \quad |\rho| < 1 $$
q  Same as before, but now true model contains

lagged y as independent variable
n  Can't estimate with OLS even if x & f are uncorrelated
n  Can't estimate with FE
73
Lagged y & FE – Problem with OLS
n  To see the problem with OLS, suppose
you estimate the following:
$$ y_{i,t} = \alpha + \rho y_{i,t-1} + \beta x_{i,t} + v_{i,t}, \quad v_{i,t} = f_i + u_{i,t} $$
q  But, $y_{i,t-1} = \alpha + \rho y_{i,t-2} + \beta x_{i,t-1} + f_i + u_{i,t-1}$
q  Thus, $y_{i,t-1}$ and composite error, $v_{i,t}$, are positively
correlated because they both contain $f_i$
q  I.e. you get omitted variable bias
74
Lagged y & FE – Problem with FE
n  Will skip the math, but it is always biased
q  Basic idea is that if you do a within
transformation, the lagged mean of y, which will be
on RHS of the model now, will always be
negatively correlated with demeaned error, u
n  Note #1 – This is true even if there was no unobserved
heterogeneity, f; FE with lagged values is always bad idea
n  Note #2: Same problem applies to FD
q  Problem, however goes away as T goes to infinity
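A simulation sketch of both problems, under assumptions chosen for illustration: no x in the model, ρ = 0.5, unit-variance f and u, and only five usable observations per firm after a burn-in. Pooled OLS leaves f in the error and overstates ρ; the within estimator understates it when T is small.

```python
import random

random.seed(42)
rho, N, T, burn = 0.5, 2000, 6, 50
lag_all, cur_all, lag_dm, cur_dm = [], [], [], []
for _ in range(N):
    f_i = random.gauss(0, 1)
    y = f_i / (1 - rho)                       # start at the firm's long-run mean
    path = []
    for t in range(burn + T):
        y = rho * y + f_i + random.gauss(0, 1)
        if t >= burn:
            path.append(y)
    lag, cur = path[:-1], path[1:]            # T-1 usable (y_{t-1}, y_t) pairs
    lag_all += lag
    cur_all += cur
    lb, cb = sum(lag) / len(lag), sum(cur) / len(cur)
    lag_dm += [v - lb for v in lag]           # within transformation by firm
    cur_dm += [v - cb for v in cur]

def slope(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))

rho_ols = slope(lag_all, cur_all)   # f_i left in the error: biased upward
rho_fe = slope(lag_dm, cur_dm)      # within estimator: biased downward for small T

print(round(rho_ols, 2), round(rho_fe, 2))   # OLS above 0.5, FE below 0.5
```

Raising T in this sketch should move the FE estimate back toward 0.5, in line with the bias vanishing as T goes to infinity.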
75
How do we estimate this? IV?
n  Basically, you're going to need instrument;
we will come back to this next week….
76
Lagged y versus FE – Bracketing
n  Suppose you don't know which is correct
q  Lagged value model: $y_{i,t} = \alpha + \gamma y_{i,t-1} + \beta x_{i,t} + u_{i,t}$
q  Or, FE model: $y_{i,t} = \alpha + \beta x_{i,t} + f_i + u_{i,t}$
n  Can show that estimate of β>0 will…

q  Be too high if lagged model is correct, but you
incorrectly use FE model
q  Be too low if FE model is correct, but you
incorrectly use lagged model
77
Bracketing continued…
n  Use this to 'bracket' where true β is…
q  But sometimes, you won't observe bracketing
q  Likely means your model is incorrect in other
ways, or there is some severe finite sample bias
78
Summary of Today [Part 1]
n  Panel data allows us to control for certain

types of unobserved variables
q  FE estimator can control for these potential
unobserved variables in very flexible way
q  Greatly reduces the scope for potential omitted
variable biases we need to worry about
q  Random effects model is useless in most
empirical corporate finance settings
79
n  FE estimator, however, has weaknesses
q  Can't estimate variables that don't vary within
groups [or at least, not without an instrument]
q  Could amplify any measurement error
n  For this reason, be cautious interpreting zero or small
coefficients on possibly mismeasured variables
q  Can't be used in models with lagged values of the

dependent variable [or at least, not without an IV]
80
n  FE are generally not a good idea when

estimating nonlinear models [e.g. Probit,
Tobit, Logit]; estimates are inconsistent
n  First differences can also remove
unobserved heterogeneity
q  Largely just differs from FE in terms of relative
efficiency; which depends on error structure
81
In First Half of Next Class
n  Instrumental variables
q  What are the necessary assumptions? [E.g.
what is the exclusion restriction?]
q  Is there a way we can test whether our
instruments are okay?
n  Related readings… see syllabus
82
Assign papers for next week…
n  Khwaja and Mian (AER 2008)
q  Bank liquidity shocks
n  Paravisini, et al. (ReStud 2014)

q  Impact of credit supply on trade
n  Becker, Ivkovic, and Weisbenner (JF 2011)

q  Local dividend clienteles
83
Break Time
n  Let's take our 10 minute break
n  We'll do presentations when we get back
84
