Lecture Topic 1
Statistics
Unit Information
Teaching staff
Unit Learning Outcomes
Lecture Schedule
Unit Guide
- Available in iLearn. Go through it.
- Read the unit information folder in iLearn.
Tutorial classes
Useful References
Assessment Tasks
Reminder of Academic Integrity Policy
Acting with academic integrity is important for all staff and
students at MQ, and includes:
- submitting assignments that are your own work
- acknowledging the ideas and information of others by citing
and referencing
- providing accurate and honest data and information to the
university.
Breaches of academic integrity (what is not acceptable
behaviour) include:
- Contract cheating: having someone else complete your
assignment – paid or not – or exchanging, downloading or
purchasing answers on a file-sharing site and/or social media
platform
- Uploading MQ course materials to file-sharing sites: MQ
course materials are protected by copyright and must not be
shared or uploaded to any website or platform
- Plagiarism: failing to properly acknowledge sources of
information you use
- Assessment/Exam cheating: copying answers from the
internet or the output of algorithms like ChatGPT and submitting
them as your own work, asking someone else to help you do
the exam, or giving answers to someone else
- Collusion: working with other students to produce
assessments assigned as an individual activity
- Deception: providing false or misleading information to the
University.
Note that through the unit iLearn sites, there are links to resources
and services to help you reach your potential while ensuring a high
level of personal integrity as you undertake your studies. If you are
in any doubt about what is and what is not allowed under this
policy, ask your lecturer/tutor immediately.
ECON6034: Econometrics and Business Statistics
What is Econometrics?
The Methodology of Econometrics
Example: Application of the Methodology of
Econometrics
1. Theory: We have two competing theories about the labour
force participation rate.
Discouraged-worker hypothesis
- More people would work if jobs were easy to find, but they
become discouraged in an economic downturn and do not
search when work is scarce. So participation falls in an
economic downturn.
Added-worker hypothesis
- In an economic downturn, spouses or partners go to work to
compensate for the lost income of main wage-earners who
might be made redundant. So participation rises in an
economic downturn.
2. Collection of data
Time series
Cross-sectional
Pooled
Data sources include the ABS, the RBA, the Federal Reserve, the
World Bank, etc.
3. Mathematical Model
$LFPR = \beta_1 + \beta_2\, UNR$
4. Econometric Model
$LFPR_t = \beta_1 + \beta_2\, UNR_t + \varepsilon_t$
where $\varepsilon_t$ is the random error term.
5. Estimation
Choose an estimator (formula) to estimate the econometric
model.
e.g. Ordinary Least Squares estimator
Estimated model:
$\widehat{LFPR}_t = 69.46 - 0.58\, UNR_t$
Residual:
$\hat{\varepsilon}_t = LFPR_t - \widehat{LFPR}_t$
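A minimal sketch of how the Ordinary Least Squares estimator produces an intercept, a slope and residuals for a two-variable model. The five observations below are invented purely for illustration and are not the data behind the lecture's estimates; the function name `ols_simple` is also hypothetical.

```python
# Hypothetical observations, made up for illustration (NOT the lecture's data).
unr = [4.0, 5.0, 6.0, 7.0, 8.0]         # unemployment rate, %
lfpr = [66.0, 65.5, 64.0, 64.5, 63.0]   # labour force participation rate, %


def ols_simple(x, y):
    """Return (b1, b2) for the fitted line y = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations over sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    # Intercept: forces the fitted line through the point of means.
    b1 = y_bar - b2 * x_bar
    return b1, b2


b1, b2 = ols_simple(unr, lfpr)
fitted = [b1 + b2 * x for x in unr]
residuals = [y - f for y, f in zip(lfpr, fitted)]

print("intercept:", round(b1, 2), "slope:", round(b2, 2))
print("residuals:", [round(e, 2) for e in residuals])
```

With an intercept in the model, the OLS residuals sum to zero (up to rounding), which is one quick check on the arithmetic.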
6. Diagnostic Checking
Check the behaviour of the residuals to see if the assumptions
are satisfied.
Check if the sign/size of the regression coefficients accord with
theory and common sense.
Consider alternative models, such as
7. Hypothesis Testing and Interpretation
Which hypothesis is supported by the estimation results?
Is UNR an important factor in determining LFPR?
Are the coefficient estimates different from previous findings?
What will happen to LFPR if UNR increases by one percentage
point ceteris paribus?
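As a rough illustration of the last question, the slope of the estimated model (−0.58) is the predicted change in LFPR for a one percentage point change in UNR, holding everything else fixed. The sketch below assumes both variables are measured in percentage points; the function name is hypothetical.

```python
B2_HAT = -0.58  # estimated slope on UNR, taken from the estimated model


def change_in_lfpr(delta_unr):
    """Predicted change in LFPR (percentage points) for a change in UNR."""
    return B2_HAT * delta_unr


effect = change_in_lfpr(1.0)
direction = "falls" if effect < 0 else "rises"
print(f"UNR up by 1 point: predicted LFPR {direction} by {abs(effect):.2f} points")
```

A negative slope has the sign predicted by the discouraged-worker hypothesis. Whether it is statistically different from zero needs a formal test using the standard errors, which are not shown here.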
8. Prediction
What will be the expected LFPR when UNR is 5%?
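A short worked calculation for this question, substituting UNR = 5 into the estimated model. It assumes UNR enters the model in percentage points (so 5% is the value 5); the function name is hypothetical.

```python
def predict_lfpr(unr, b1=69.46, b2=-0.58):
    """Predicted LFPR from the estimated model: b1 + b2 * UNR."""
    return b1 + b2 * unr


predicted = predict_lfpr(5.0)
print(round(predicted, 2))  # 69.46 - 0.58 * 5 = 66.56
```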
Some Basic Notation: The Summation Operator
- Let
N = the number of observations
$X_i$ = the ith observation
- The summation operator is defined as follows:
$\sum_{i=1}^{N} X_i = X_1 + X_2 + \dots + X_N$
- Notice that the following notations are often used:
$\sum_{i=1}^{N} X_i$, $\sum X_i$, $\sum X$
Example
i      1   2   3   4   5   6   7   8   9  10
X_i    2   5   7   4   2   1   6   5   2   2
Y_i   10   8   9  12  10  15  10   5   3   9

$\sum_{i=4}^{9} X_i = X_4 + X_5 + X_6 + X_7 + X_8 + X_9 = 20$
$\sum X_i^2 = X_1^2 + X_2^2 + \dots + X_{10}^2 = 168$
$\sum_{i=1}^{10} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \dots + X_{10} Y_{10} = 315$
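The three sums in this example can be checked with a few lines of Python. This is only a sketch; the variable names are chosen for illustration.

```python
X = [2, 5, 7, 4, 2, 1, 6, 5, 2, 2]
Y = [10, 8, 9, 12, 10, 15, 10, 5, 3, 9]

# Python lists start at index 0, so observations i = 4..9 are X[3:9].
partial_sum = sum(X[3:9])

# Sum of squares: square each observation first, then add.
sum_sq = sum(x ** 2 for x in X)

# Sum of cross-products: multiply the pairs first, then add.
sum_xy = sum(x * y for x, y in zip(X, Y))

print(partial_sum, sum_sq, sum_xy)  # 20 168 315
```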
Some properties of the summation operator
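$\sum_{i=1}^{N} a = Na$, where $a$ is a constant
$\sum_{i=1}^{N} aX_i = a \sum_{i=1}^{N} X_i$
$\sum_{i=1}^{N} (X_i + Y_i) = \sum X_i + \sum Y_i$, e.g. 36 + 91 = 127
$\sum_{i=1}^{N} (X_i - Y_i) = \sum X_i - \sum Y_i$, e.g. 36 − 91 = −55
Note:
$\sum_{i=1}^{N} X_i^2 = 168 \neq \left(\sum_{i=1}^{N} X_i\right)^2 = 1296$
$\sum_{i=1}^{N} X_i Y_i = 315 \neq \left(\sum_{i=1}^{N} X_i\right)\left(\sum_{i=1}^{N} Y_i\right) = 3276$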
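These properties can be verified numerically on the X and Y values from the example (sums 36 and 91). A small sketch follows; the constant `a = 3` is an arbitrary value chosen for illustration and is not from the slides.

```python
X = [2, 5, 7, 4, 2, 1, 6, 5, 2, 2]
Y = [10, 8, 9, 12, 10, 15, 10, 5, 3, 9]
N = len(X)
a = 3  # arbitrary constant, chosen only for illustration

sum_x, sum_y = sum(X), sum(Y)  # 36 and 91

# Properties that hold
const_rule = sum(a for _ in range(N)) == N * a
scale_rule = sum(a * x for x in X) == a * sum_x
add_rule = sum(x + y for x, y in zip(X, Y)) == sum_x + sum_y
sub_rule = sum(x - y for x, y in zip(X, Y)) == sum_x - sum_y

# Tempting shortcuts that do NOT hold
sum_of_squares = sum(x ** 2 for x in X)
square_of_sum = sum_x ** 2
sum_of_products = sum(x * y for x, y in zip(X, Y))
product_of_sums = sum_x * sum_y

print(const_rule, scale_rule, add_rule, sub_rule)  # True True True True
print(sum_of_squares, square_of_sum)               # 168 1296
print(sum_of_products, product_of_sums)            # 315 3276
```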
Summary of Data
Descriptive Techniques
Example: ECON6034 anxiety...
List of previous year’s marks: 65, 71, 66, 79, 65, 82, ...
Summary information derived about the class, e.g. class average,
proportion of class receiving F’s, most frequent mark, marks
distribution, etc.
Q2: Are most of the marks clustered around the mean or are
they more spread out?
- Range = Maximum − Minimum = 92 − 53 = 39
- Variance
- Standard deviation
Numerical Descriptive Techniques
Measures of Variability
Range, Variance, Standard Deviation
Key Statistical Concepts
- Descriptive Measure
Parameter: A descriptive measure of a population.
Statistic: A descriptive measure of a sample.
Notation
Measures of Central Tendency
Measures of Dispersion/Variability
Let’s say we receive the final grades for the semester for the two
tutorial classes:
            A     B
           75    75
           80   100
           70    50
           77    85
           73    65
           75    98
           90    52
           60
Mean       75    75
Median     75    75
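The means and medians in the table can be reproduced with Python's standard `statistics` module. A minimal sketch; the variable names are illustrative.

```python
from statistics import mean, median

class_a = [75, 80, 70, 77, 73, 75, 90, 60]
class_b = [75, 100, 50, 85, 65, 98, 52]

mean_a, median_a = mean(class_a), median(class_a)
mean_b, median_b = mean(class_b), median(class_b)

# Both classes share the same centre even though the marks look very different.
print("A:", mean_a, median_a)
print("B:", mean_b, median_b)
```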
- Yet, clearly the outcomes for the two classes are quite
different, in a way which is not described by the mean or
median.
- Is there another characteristic of the distributions of marks
that we can measure and compare?
- Variability: Which tutor is more consistent in their teaching
methods?
- Variance
The variance of a population is:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
The variance of a sample is:
$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Sample Variance: Calculation
            A     B
           75    75
           80   100
           70    50
           77    85
           73    65
           75    98
           90    52
           60
Mean       75    75
n           8     7

Squared deviations $(X_i - \bar{X})^2$:
                                          A       B
                                          0       0
                                         25     625
                                         25     625
                                          4     100
                                          4     100
                                          0     529
                                        225     529
                                        225
$\sum_{i=1}^{n} (X_i - \bar{X})^2$      508    2508
Variance                              72.57     418
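The calculation in the table can be reproduced step by step: subtract the mean, square, add up, then divide by n − 1. The sketch below uses hypothetical helper names and cross-checks the result against `statistics.variance`, which also uses the n − 1 divisor.

```python
from statistics import variance

class_a = [75, 80, 70, 77, 73, 75, 90, 60]
class_b = [75, 100, 50, 85, 65, 98, 52]


def sum_sq_dev(marks):
    """Sum of squared deviations from the sample mean."""
    m = sum(marks) / len(marks)
    return sum((x - m) ** 2 for x in marks)


def sample_variance(marks):
    """Sample variance: sum of squared deviations divided by n - 1."""
    return sum_sq_dev(marks) / (len(marks) - 1)


ss_a, ss_b = sum_sq_dev(class_a), sum_sq_dev(class_b)
var_a, var_b = sample_variance(class_a), sample_variance(class_b)

print(ss_a, round(var_a, 2))  # 508.0 72.57
print(ss_b, round(var_b, 2))  # 2508.0 418.0
```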
Standard Deviation
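The standard deviation is the positive square root of the variance, so it is expressed in the same units as the data (marks, here). As a sketch, the sample standard deviations of the two tutorial classes can be obtained from `statistics.stdev`, which uses the n − 1 divisor.

```python
from statistics import stdev

class_a = [75, 80, 70, 77, 73, 75, 90, 60]
class_b = [75, 100, 50, 85, 65, 98, 52]

sd_a = stdev(class_a)  # square root of 508 / 7
sd_b = stdev(class_b)  # square root of 418

print(round(sd_a, 2), round(sd_b, 2))
more_variable = "B" if sd_b > sd_a else "A"
print("More variable class:", more_variable)
```

Class B's standard deviation is more than twice that of class A, even though both classes have the same mean and median.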
Descriptive statistics: Using Excel
- Install the Analysis ToolPak as an add-in in Excel.
- On the Data tab, choose Data Analysis (in the Analysis group),
select Descriptive Statistics, and tick Summary statistics.
Measures of Relative Standing
Example:
0 0 5 7 8 9 12 14 22 33
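We have special names for the 25th, 50th, and 75th percentiles,
namely quartiles.
- The first (lower) quartile: Q1 = 25th percentile.
- The second quartile: Q2 = 50th percentile (the median).
- The third (upper) quartile: Q3 = 75th percentile.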
Interquartile Range
- Interquartile range = Q3 − Q1
- Large values of this statistic mean that the 1st and 3rd
quartiles are far apart, indicating a high level of variability.
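A small sketch of the quartiles and interquartile range for the ten observations used in the Measures of Relative Standing example. It assumes the manual method locates the P-th percentile at position (n + 1)P/100 with linear interpolation; Python's `statistics.quantiles` with `method="exclusive"` follows that convention (Python 3.8 or later).

```python
from statistics import quantiles

data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q2, q3)  # 3.75 8.5 16.0
print(iqr)         # 12.25
```

For example, Q1 sits at position (10 + 1) × 0.25 = 2.75, that is, three quarters of the way from the 2nd observation (0) to the 3rd (5), giving 3.75.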
Using Excel ...
=quartile.(range, quart)
=QUARTILE.EXC(range, quart) # this formula aligns with
our manual calculation method explained above.
- Percentile
=percentile(range, k)
=max(range)
=min(range)
Box Plots
- The box plot is a technique that graphs five statistics:
  - the minimum and maximum observations, and
  - the first, second, and third quartiles.
- The lines extending to the left and right are called whiskers.
Any points that lie outside the whiskers are called outliers.
The whiskers extend outward to the smaller of 1.5 times the
interquartile range or to the most extreme point that is not an
outlier.
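The whisker rule can be sketched in code using the ten observations from the Measures of Relative Standing example. The sketch assumes quartiles are computed with the (n + 1)P/100 location rule (`method="exclusive"`), and treats any point more than 1.5 × IQR beyond the box as an outlier; the variable names are illustrative.

```python
from statistics import quantiles

data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# Points beyond these limits are flagged as outliers.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_limit or x > upper_limit]

# Each whisker stops at the most extreme point that is not an outlier.
inside = [x for x in data if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = min(inside), max(inside)

five_statistics = (min(data), q1, q2, q3, max(data))
print(five_statistics)            # (0, 3.75, 8.5, 16.0, 33)
print(lower_limit, upper_limit)   # -14.625 34.375
print(outliers, whisker_low, whisker_high)  # [] 0 33
```

In this data set no observation lies beyond the limits, so the whiskers run all the way to the minimum and maximum.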
Excel Help