Lec 1
551: Applied Econometrics
Lecture note 1
Anbes Tenaye
Department of Agricultural Economics
Hawassa University
01. November 2021.
Last updated November 01, 2021
Pre-requisites for this course
Contents
Course information
This course
In this course we will focus on procedures and tests that are commonly
used in practice, such as:
• Instrumental variable regression (chapter 12)
• Program evaluation (natural experiments, chapter 13)
• Forecasting (chapter 14)
• Time series regression (chapter 15)
Econometrics as a combined discipline
What is statistics?
Statistics is:
• Collecting raw data
• Manipulating raw data
• Summarizing data
Types of statistics:
• Descriptive statistics - used to summarize and describe data.
Example: The Detroit foreclosure rate was 5% in 1997
• Inferential statistics - try to reach conclusions that extend beyond the
immediate data set.
• Estimation
• Hypothesis testing
• Draw conclusions
Example: At 99% confidence: Living in an area with a significant
cancer risk lowers housing prices between 11% and 20%
Data
Data sources:
• Experiments - data you collect yourself.
• Observational data, administrative records or surveys - data obtained
from the data owner.
Data types:
• Cross-sectional: data on different entities for a single time period.
• Time series: data for a single entity collected at multiple time periods.
• Panel data: data for multiple entities in which each entity is observed
at two or more time periods.
• (Repeated cross section: a collection of cross-sectional data sets,
where each cross-sectional data set corresponds to a different time
period.)
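As a rough illustration of these definitions, the Python sketch below classifies a data set from its (entity, period) pairs. The function name `classify_data`, the label "mixed" and the toy records are hypothetical and not part of the course material.

```python
from collections import defaultdict


def classify_data(records):
    """Classify a data set from an iterable of (entity, period) pairs."""
    periods_by_entity = defaultdict(set)
    for entity, period in records:
        periods_by_entity[entity].add(period)
    if not periods_by_entity:
        raise ValueError("no observations")

    all_periods = set().union(*periods_by_entity.values())
    counts = [len(periods) for periods in periods_by_entity.values()]

    if len(all_periods) == 1:
        return "cross-sectional"          # many entities, one period
    if len(periods_by_entity) == 1:
        return "time series"              # one entity, many periods
    if min(counts) >= 2:
        return "panel"                    # every entity seen at least twice
    if max(counts) == 1:
        return "repeated cross-section"   # new entities in each period
    return "mixed"                        # hypothetical label for other cases


# Hypothetical toy data: (entity, year) pairs.
print(classify_data([("A", 2020), ("B", 2020), ("C", 2020)]))
print(classify_data([("A", 2019), ("A", 2020), ("A", 2021)]))
print(classify_data([("A", 2019), ("A", 2020), ("B", 2019), ("B", 2020)]))
print(classify_data([("A", 2019), ("B", 2019), ("C", 2020), ("D", 2020)]))
```

The distinction between panel data and repeated cross sections rests on whether the same entities are followed over time, which is why the sketch keys on entity identifiers.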
Economic theory
From S&W: Economic issues are anything dealing with the interaction of
agents.
Economic theory specifies relationships between variables of interest.
So what is econometrics?
Quantitative questions, quantitative answers
Wages example
A classic question in economics is the effect of education on wage
outcomes.
[Figure: respondent wage (vertical axis, 0 to 3000) plotted against a horizontal axis running from 10 to 18; the horizontal axis label is not recoverable.]
Steps in an econometric analysis
Step 4: Estimate the econometric model
Types of estimators:
• Linear regression with single or multiple regressors (ch 4-6)
• Non-linear regression functions (ch 8)
• Regression with panel data (ch 10)
• Regressions with binary dependent variable (ch 11)
• Instrumental variable regression (ch 12)
Step 5: Statistical inference
Reliability
Review statistics
References
Consult your statistics textbook if you need more information than is provided
in this lecture and the textbook.
Notation
• $\sum$ is shorthand for addition. Suppose $x_i$ is the ith observation:
$$x_1 + x_2 + x_3 = \sum_{i=1}^{3} x_i$$
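The summation sign corresponds to a simple loop. The short Python sketch below uses three hypothetical observations and confirms that the running total matches the built-in `sum`.

```python
# Hypothetical observations x_1, x_2, x_3.
x = [4.0, 7.0, 1.0]

# Sum over i = 1, ..., 3 written out as an explicit loop.
total = 0.0
for x_i in x:
    total += x_i

print(total)            # x_1 + x_2 + x_3
print(total == sum(x))  # the built-in sum gives the same result
```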
Statistics that describe distributions
Measures of central tendency
Mean:
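• The expected value; the most common measure of central
tendency. For n equally weighted observations, or for a continuous variable with density p(y):
$$E(Y) = \mu_Y = \frac{1}{n}\sum_{i=1}^{n} y_i \quad \text{or} \quad E(Y) = \int_{-\infty}^{\infty} y\,p(y)\,dy$$
Median:
• The midpoint of the data.
• To calculate:
1. Order the data.
2. Calculate (n + 1)/2 (i.e. the position of the middle observation).
3. Take the value at position (n + 1)/2, or the average of the two closest
observations if (n + 1)/2 is not a whole number.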
Mode:
• The value that occurs most often.
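The three measures can be sketched in a few lines of Python. The function names and the data are hypothetical; the median follows the (n + 1)/2 rule from the slide, and when several values tie for the mode this sketch simply returns the one encountered first.

```python
from collections import Counter


def mean(data):
    """Arithmetic mean: the sum of the observations divided by n."""
    return sum(data) / len(data)


def median(data):
    """Median via the (n + 1)/2 position rule."""
    ordered = sorted(data)                 # step 1: order the data
    n = len(ordered)
    position = (n + 1) / 2                 # step 2: 1-based middle position
    if position.is_integer():
        return ordered[int(position) - 1]  # step 3: exact middle observation
    lower = ordered[int(position) - 1]     # observation just below the middle
    upper = ordered[int(position)]         # observation just above the middle
    return (lower + upper) / 2             # step 3: average the two closest


def mode(data):
    """Most frequent value (first one encountered if there is a tie)."""
    return Counter(data).most_common(1)[0][0]


y = [2, 3, 3, 5, 7, 10]   # hypothetical observations
print(mean(y), median(y), mode(y))
```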
Measures of variation
Sample variance:
• The most commonly used measure of dispersion.
• Summarizes how far a typical observation is from the mean.
$$\hat{\sigma}_x^2 = \mathrm{var}(x) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \hat{\mu}_x)^2$$
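A minimal Python sketch of this formula, using hypothetical data, is given here. It divides by n - 1 as in the formula and cross-checks the result against `statistics.variance` from the standard library, which uses the same divisor.

```python
import statistics


def sample_variance(x):
    """Sample variance with the n - 1 divisor."""
    n = len(x)
    mu_hat = sum(x) / n                                   # sample mean
    squared_deviations = [(x_i - mu_hat) ** 2 for x_i in x]
    return sum(squared_deviations) / (n - 1)


x = [1, 2, 4, 7]   # hypothetical observations
print(sample_variance(x))
print(statistics.variance(x))   # standard-library cross-check
```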
Range:
• A measure of data dispersion, though not used for many applications.
• To calculate:
1. Identify the largest observation.
2. Identify the smallest observation.
3. Take the difference.
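The three steps translate directly into Python. The function name `data_range` and the data are hypothetical.

```python
def data_range(x):
    """Range: largest observation minus smallest observation."""
    largest = max(x)             # step 1
    smallest = min(x)            # step 2
    return largest - smallest    # step 3


print(data_range([1, 2, 4, 7]))   # hypothetical observations
```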
Measures of shape
Skewness:
• Measures the asymmetry of a distribution.
• A positive skewness implies that the tail on the right side is longer or
fatter than the left side.
• A negative skewness implies that the tail on the left side is longer or
fatter than on the right side.
Kurtosis:
• Measures the mass in the tails, i.e. the probability of extreme values.
• The most common measure captures how heavy the tails are. Higher
kurtosis means more of the variance is the result of extreme
deviations (as opposed to frequent, modestly sized deviations).
• The normal distribution has a kurtosis of 3; a distribution with
kurtosis greater than 3 is called leptokurtic (heavy-tailed).
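The slides do not state formulas for skewness and kurtosis, so the Python sketch below assumes the common moment-based definitions: skewness is the third central moment divided by the second central moment raised to the power 3/2, and kurtosis is the fourth central moment divided by the squared second central moment. Other software may apply small-sample corrections. The function names and data are hypothetical.

```python
def central_moment(x, k):
    """k-th central moment, using the 1/n divisor (an assumption)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x_i - mean) ** k for x_i in x) / n


def skewness(x):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def kurtosis(x):
    """Moment-based kurtosis: m4 / m2^2 (a normal distribution gives 3)."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2


symmetric = [1, 2, 3, 4, 5]       # hypothetical, symmetric around 3
right_tail = [1, 1, 1, 1, 10]     # hypothetical, long right tail

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tail), kurtosis(right_tail))
```

The symmetric sample has zero skewness, while the sample with one large value has positive skewness, matching the description of a longer right tail.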
Percentiles
Measures of distribution in Stata
. sum popgrowth, detail

      Percentiles      Smallest
 1%          -.5           -.5
 5%          -.2           -.4
10%            0           -.3       Obs                  68
25%           .3           -.2       Sum of Wgt.          68
Illustration distribution in Stata
Using lifexp.dta
Measures of distribution
The measures of distribution can be used simply to describe the data. But
they can also be used to test econometric models.
Data scaling
What happens if we scale variables by adding a constant?
• Assume that X = {2, 3, 4}. The mean is then:
$$\hat{\mu}_x = \frac{1}{n}\sum_{i=1}^{n} x_i = 3$$
• The variance is:
$$\hat{\sigma}_x^2 = \frac{1}{2}\left((2-3)^2 + (3-3)^2 + (4-3)^2\right) = 1$$
• If we define a new variable Z = X + 3, then Z = {5, 6, 7}.
• The mean is:
$$\hat{\mu}_z = \frac{1}{3}(5 + 6 + 7) = 6$$
• The variance is:
$$\hat{\sigma}_z^2 = \frac{1}{2}\left((5-6)^2 + (6-6)^2 + (7-6)^2\right) = 1$$
• Thus the mean is affected by adding a constant, but the
dispersion is not.
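The same calculation can be checked numerically. This Python sketch uses the slide's values X = {2, 3, 4} and Z = X + 3, with `statistics.variance` (n - 1 divisor) standing in for the sample variance formula.

```python
import statistics

x = [2, 3, 4]
z = [x_i + 3 for x_i in x]   # Z = X + 3

mean_x, var_x = statistics.mean(x), statistics.variance(x)
mean_z, var_z = statistics.mean(z), statistics.variance(z)

print(mean_x, var_x)   # mean and variance of X
print(mean_z, var_z)   # the mean shifts by 3, the variance is unchanged
```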
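• If we instead define J = 3X, then J = {6, 9, 12}.
• The mean is:
$$\hat{\mu}_J = \frac{1}{3}(6 + 9 + 12) = 9$$
• The variance is:
$$\hat{\sigma}_J^2 = \frac{1}{2}\left((6-9)^2 + (9-9)^2 + (12-9)^2\right) = 9$$
• Multiplying by a constant affects both mean and variance.
• Generally, if J = aX then $\hat{\sigma}_J^2 = a^2\hat{\sigma}_x^2$ and $\hat{\mu}_J = a\hat{\mu}_x$.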
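The general rule for J = aX can be verified numerically. The Python sketch below uses X = {2, 3, 4} with a = 3, as on the slide, plus a = -2 as an additional hypothetical case to show that the variance scales with a squared even when a is negative.

```python
import statistics


def scale(x, a):
    """Return J = aX, multiplying every observation by the constant a."""
    return [a * x_i for x_i in x]


x = [2, 3, 4]
results = {}
for a in (3, -2):
    j = scale(x, a)
    results[a] = (statistics.mean(j), statistics.variance(j))
    print(a, results[a])   # mean is a * mean(X), variance is a^2 * var(X)
```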
What about covariance and correlation? Test yourself.
Review of probability
Key terms
Random variables and probability distribution
Joint probability
The joint probability distribution is the probability that two (or more)
random variables take on certain values simultaneously.
• Two variables are independent if knowing the value of one of the
variables provides no information about the other. For independent events A and B:
$$P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B)$$
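One way to check independence in a small discrete example is to compare every joint probability with the product of the corresponding marginals. The Python sketch below does this for two hypothetical joint distributions; the function names are illustrative only.

```python
import math
from collections import defaultdict


def marginals(joint):
    """Marginal distributions from a dict {(a, b): probability}."""
    p_a, p_b = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        p_a[a] += p
        p_b[b] += p
    return dict(p_a), dict(p_b)


def is_independent(joint, tol=1e-12):
    """True if P(a, b) = P(a) * P(b) for every pair of values."""
    p_a, p_b = marginals(joint)
    return all(
        math.isclose(joint.get((a, b), 0.0), p_a[a] * p_b[b], abs_tol=tol)
        for a in p_a
        for b in p_b
    )


# Two separate fair coin tosses (hypothetical): independent.
two_coins = {("H", "H"): 0.25, ("H", "T"): 0.25,
             ("T", "H"): 0.25, ("T", "T"): 0.25}

# The second variable always copies the first (hypothetical): dependent.
copied = {("H", "H"): 0.5, ("T", "T"): 0.5}

print(is_independent(two_coins))
print(is_independent(copied))
```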
Conditional probability
The conditional probability is the probability that one event happens given
that another event has occurred.
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0$$
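As a small worked case, consider one roll of a fair die (a hypothetical example, not from the slides), with A = "the roll is even" and B = "the roll is above three". The Python sketch below applies the definition as the probability of the intersection of A and B divided by the probability of B, assuming all outcomes are equally likely.

```python
from fractions import Fraction


def probability(event, sample_space):
    """P(event) when all outcomes in the sample space are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional_probability(a, b, sample_space):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    p_b = probability(b, sample_space)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return probability(a & b, sample_space) / p_b


die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}            # event A
above_three = {4, 5, 6}     # event B

print(conditional_probability(even, above_three, die))
```

Here the intersection is {4, 6}, so the result is (2/6) / (3/6) = 2/3, which is larger than the unconditional probability of an even roll, 1/2.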
Example
Example - Coin toss
Number of heads            0       1       2
Probability                0.25    0.5     0.25
Cumulative probability     0.25    0.75    1
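The table is consistent with two tosses of a fair coin; that setup is an assumption here, since the slide does not state it. Under that assumption, the Python sketch below enumerates the four equally likely outcomes and builds the probability and cumulative probability rows.

```python
from fractions import Fraction
from itertools import product

# Assumption: two independent tosses of a fair coin.
outcomes = list(product("HT", repeat=2))     # HH, HT, TH, TT
p_outcome = Fraction(1, len(outcomes))       # each outcome has probability 1/4

# Probability distribution of the number of heads.
pmf = {}
for outcome in outcomes:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, Fraction(0)) + p_outcome

# Cumulative probability: running sum over 0, 1, 2 heads.
cdf = {}
running = Fraction(0)
for heads in sorted(pmf):
    running += pmf[heads]
    cdf[heads] = running

for heads in sorted(pmf):
    print(heads, float(pmf[heads]), float(cdf[heads]))
```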
Covariance
• $\hat{\sigma}_{xy} > 0$: $x_i > \hat{\mu}_x$ tends to occur when $y_i > \hat{\mu}_y$ (and vice versa).
• $\hat{\sigma}_{xy} < 0$: $x_i > \hat{\mu}_x$ tends to occur when $y_i < \hat{\mu}_y$ (and vice versa).
Correlation
$$\mathrm{corr}(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}}$$
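Covariance and correlation can be sketched together in Python. The slides do not give a sample covariance formula, so this sketch assumes the n - 1 divisor, in line with the sample variance; the function names and data are hypothetical.

```python
import math


def sample_covariance(x, y):
    """Sample covariance with the n - 1 divisor (an assumption)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((x_i - mean_x) * (y_i - mean_y) for x_i, y_i in zip(x, y))
    return cross / (n - 1)


def correlation(x, y):
    """corr(X, Y) = cov(X, Y) / sqrt(var(X) * var(Y))."""
    var_x = sample_covariance(x, x)   # the variance is cov(X, X)
    var_y = sample_covariance(y, y)
    return sample_covariance(x, y) / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4]   # hypothetical observations
y = [2, 4, 5, 9]   # hypothetical observations

print(sample_covariance(x, y))
print(correlation(x, y))
print(correlation(x, [2 * x_i for x_i in x]))   # exact linear relation
```

Because the correlation divides by the two standard deviations, it lies between -1 and 1 and does not depend on the units of X and Y, unlike the covariance.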
Probability and regression models
Lecture summary
Next lecture
• Distributions
• Estimators and estimates
• Hypothesis testing of means
Reminder