An Introduction To Statistical Inference
• The aim of hypothesis testing is to determine whether the differences between the coefficient
estimates that are actually obtained and the expectations arising from financial theory are a
long way from one another in a statistical sense.
• Two hypotheses are set up, known as the:
1. Null hypothesis (denoted H0): the statement or statistical hypothesis that is actually being
tested.
2. Alternative hypothesis (denoted H1): represents the remaining outcomes of interest.
Example: Scenario I: suppose it is of interest to test the hypothesis that the true value of β is
0.5. The following notation would be used.
H0 : β = 0.5
H1 : β ≠ 0.5
Outcomes under the alternative hypothesis: both β < 0.5 and β > 0.5 are subsumed (two-sided test).
• Scenario II: prior information is available, suggesting that β > 0.5 would be expected rather
than β < 0.5.
H0 : β = 0.5
H1 : β > 0.5 (one-sided test)
• Scenario III: prior information is available, suggesting that β < 0.5 would be expected rather
than β > 0.5.
H0 : β = 0.5
H1 : β < 0.5 (one-sided test)
• Note:
a) There is always an equality under the null hypothesis. So, for example, β < 0.5 would not be
specified under the null hypothesis.
b) What you hope or expect to be able to conclude as a result of the test should usually be placed
in the alternative hypothesis.
c) The null hypothesis is the hypothesis that is tested.
• There are two ways to conduct a hypothesis test:
1. Test of significance approach
2. Confidence interval approach
Both methods centre on a statistical comparison of the estimated value of the
coefficient and its value under the null hypothesis.
The Test of Significance Approach
• In testing a hypothesis, the objective is to decide whether the sample evidence is consistent
with the hypothesis or not.
• The testing procedure uses the information from random samples drawn from the respective
population.
• If the sample information is in favour of the hypothesis, we do not reject it and proceed as if
the hypothesis is true.
• On the other hand, if the sample information is not in favour of the hypothesis, we reject it
and conclude that the hypothesis is false.
Procedure in testing hypothesis
• Take a random sample from the given population
• Compute an appropriate test statistic from the sample data
• Use this statistic to decide whether the hypothesis is rejected or not rejected.
• To test a hypothesis, a ‘significance level’, often denoted by α, must be chosen.
• For a given significance level, a rejection region and a non-rejection region can then be
determined.
Example: If a 5% significance level is employed, 5% of the total distribution (5% of the area
under the curve) will be in the rejection region. That rejection region can either be split in half
(for a two-sided test) or it can all fall on one side of the y-axis, as is the case for a one-sided
test.
[Figure I: Rejection regions for a two-sided 5% hypothesis test]
• An alpha level of 0.05 means that we will consider our sample mean to be significantly different
from the hypothesized mean if the chances of observing that sample mean are less than 5%.
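The boundaries of the rejection region can be sketched numerically. The snippet below is only an illustration: it assumes the test statistic follows a standard normal distribution, and the helper name `in_two_sided_rejection_region` is hypothetical, not part of the original slides.

```python
from statistics import NormalDist

alpha = 0.05                      # significance level from the example above
std_normal = NormalDist()         # standard normal: mean 0, standard deviation 1

# Two-sided test: alpha is split in half, 2.5% in each tail.
two_sided_upper = std_normal.inv_cdf(1 - alpha / 2)
two_sided_lower = -two_sided_upper

# One-sided (upper-tail) test: all 5% sits in one tail.
one_sided_upper = std_normal.inv_cdf(1 - alpha)


def in_two_sided_rejection_region(z):
    """Return True if z falls in either tail of the two-sided rejection region."""
    return abs(z) > two_sided_upper


print(f"two-sided critical values: {two_sided_lower:.3f}, {two_sided_upper:.3f}")
print(f"one-sided critical value:  {one_sided_upper:.3f}")
```

The two-sided boundaries are the familiar values of about ±1.96, and the one-sided boundary is about 1.645.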
Example: Suppose the average age of employees in an organization is stated to be 35 years.
(i) The null hypothesis is H0 : µ = 35 and the alternative is H1 : µ ≠ 35. Draw the diagram to show
the rejection and non-rejection regions.
• Draw the diagram again when the alternative hypothesis is H1 : µ > 35.
Note: We perform a one-tailed test when the alternative hypothesis is of the form ‘greater than’ or
‘less than’. When the alternative hypothesis is of the form ‘not equal to’, we perform a two-tailed
test.
Confidence Interval Approach to Hypothesis Testing
• For example: one might estimate a parameter, say β̂, to be 0.93, and a ‘95% confidence
interval’ to be (0.77, 1.09). This means that in many repeated samples, 95% of the intervals
constructed in this way will contain the true value of β.
• Confidence intervals are almost invariably estimated in a two-sided form, although in theory a
one-sided interval can be constructed.
2) Choose a significance level, α (again the convention is 5%). This is equivalent to choosing a
(1 − α) × 100% confidence interval:
5% significance level = 95% confidence interval
3) Use the t-tables to find the appropriate critical value.
4) Perform the test: if the hypothesized value of β lies outside the confidence interval, then
reject the null hypothesis; otherwise do not reject the null hypothesis.
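The decision rule in step 4 can be sketched in a few lines. This is a minimal illustration using the interval (0.77, 1.09) quoted above. The function name `ci_test` is hypothetical, and the two hypothesized values (1.0 and 0.5) are chosen here purely as examples.

```python
def ci_test(hypothesized_value, ci_lower, ci_upper):
    """Apply the confidence interval decision rule to one hypothesized value."""
    if ci_lower <= hypothesized_value <= ci_upper:
        return "do not reject H0"
    return "reject H0"


# 95% confidence interval quoted in the text for the estimate 0.93
lower, upper = 0.77, 1.09

# Illustrative hypothesized values (assumptions, not taken from the slides)
decision_beta_1 = ci_test(1.0, lower, upper)      # 1.0 lies inside the interval
decision_beta_half = ci_test(0.5, lower, upper)   # 0.5 lies outside the interval

print("H0: beta = 1.0 ->", decision_beta_1)
print("H0: beta = 0.5 ->", decision_beta_half)
```

One interval is enough to test any number of hypothesized values at the same significance level, which is the main practical convenience of this approach.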
Test of Significance
• For large samples, where the sample size n ≥ 30.
Suppose that an automobile manufacturer advertises that its new hybrid car has a mean gas
mileage of 50 miles per gallon. You take a simple random sample of n = 30 hybrid vehicles and
test their gas mileage. You find that in this sample, the average is x̄ = 47 miles per gallon with a
standard deviation of s = 5.5 miles per gallon. Is there enough evidence to support the advertised
claim using α = 0.05?
• Step 01: Develop Hypothesis
H0 : µ = 50
H1 : µ ≠ 50
(Two-tailed test)
Step 02: Identify the parameter values
α = 0.05
n = 30
x̄ = 47
s = 5.5
Step 03: Calculate the P-value (large sample, so the z-table is used):
i) Find the z-score.
ii) Find the cumulative area corresponding to the z-score.
iii) How the P-value is obtained depends on the hypotheses (one-tailed or two-tailed test).
• i) z = (x̄ − µ) / (s / √n) = (47 − 50) / (5.5 / √30) ≈ −3.0
ii) The cumulative area for this z-score is 0.0013. As this is a two-tailed test,
P = 2(0.0013) = 0.0026.
iii) Since P = 0.0026 < 0.05 = α, we reject the null hypothesis. This means that there is enough
statistical evidence to reject the advertiser’s claim of an average of 50 miles per gallon.
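The same calculation can be checked numerically. The sketch below assumes the large-sample normal approximation used above and relies only on Python's standard library; the variable names are illustrative. Because it does not round z to −3.0 before looking up the area, it gives a P-value of roughly 0.0028 rather than the table-based 0.0026, and the decision is unchanged.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50.0      # mean claimed under H0 (miles per gallon)
x_bar = 47.0    # sample mean
s = 5.5         # sample standard deviation
n = 30          # sample size
alpha = 0.05    # significance level

# Step i: z-score of the sample mean under H0
z = (x_bar - mu0) / (s / sqrt(n))

# Steps ii and iii: two-tailed P-value, i.e. twice the area in the tail beyond |z|
p_value = 2 * NormalDist().cdf(-abs(z))

reject_h0 = p_value < alpha
print(f"z = {z:.2f}, P = {p_value:.4f}, reject H0: {reject_h0}")
```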
• Example 02:
A pit crew claims that its mean pit stop time (for 4 new tires and fuel) during an auto race is less
than 13 seconds. A random sample of 32 pit stop times has a sample mean of 12.9 seconds and a
standard deviation of 0.19 second. Is there enough evidence to support the claim of the pit crew at
α = 0.01?
H0 : µ ≥ 13
H1 : µ < 13
x̄ = 12.9
s = 0.19
n = 32
α = 0.01
i) z = (x̄ − µ) / (s / √n) = (12.9 − 13) / (0.19 / √32) ≈ −2.98 (left-tailed)
P = 0.0014
Decision: 0.0014 < 0.01, so reject the null hypothesis.
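A small helper makes the role of the tail explicit. This is a sketch under the large-sample normal approximation; the function name `z_test` and its `tail` argument are hypothetical conveniences, not notation from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, s, n, tail):
    """Large-sample z-test; tail must be 'left', 'right' or 'two'."""
    z = (x_bar - mu0) / (s / sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        p = std_normal.cdf(z)                  # area to the left of z
    elif tail == "right":
        p = 1.0 - std_normal.cdf(z)            # area to the right of z
    elif tail == "two":
        p = 2.0 * std_normal.cdf(-abs(z))      # area in both tails
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return z, p


# Pit stop example: H1 is mu < 13, so the test is left-tailed
z, p_value = z_test(x_bar=12.9, mu0=13.0, s=0.19, n=32, tail="left")
reject_h0 = p_value < 0.01
print(f"z = {z:.2f}, P = {p_value:.4f}, reject H0: {reject_h0}")
```

The computed P-value is slightly above the table value of 0.0014 because z is not rounded before the area is evaluated; the conclusion is the same.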
Hypothesis testing when the sample size n < 30
• The population is assumed to be approximately normally distributed.
• The z table will not be used for small samples, instead the t-table will be used.
Example 02
A used car dealer says that the mean price of a 2008 Honda CR-V is at least EUR 20,500. You
suspect this claim is incorrect and find that a random sample of 14 similar vehicles has a mean
price of EUR 19,850 and a standard deviation of EUR 1,084. Is there enough evidence to reject the
dealer’s claim at α = 0.05? Assume the population is normally distributed.
Step 01: Develop Hypothesis
H0 : µ ≥ 20,500
H1 : µ < 20,500
(Left-tailed)
Step 02: Identify the parameter values
α = 0.05
x̄ = 19,850
s = 1,084
New input: degrees of freedom
n − 1 = 14 − 1 = 13 degrees of freedom
Step 03:
Calculating the P-value is not possible in this case with our tables, as we use the t-table rather
than the z-table. Instead:
• i) Determine the critical value, t0, for t using the table and n − 1, α and H0.
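The remaining steps of this example are not shown in the slides. As a hedged sketch of how the comparison is usually completed, the code below computes the standard one-sample t statistic, t = (x̄ − µ) / (s / √n), and compares it with a critical value. The critical value of 1.771 is an assumption taken from a standard t-table (13 degrees of freedom, 5% in one tail), not from the original material.

```python
from math import sqrt

mu0 = 20500.0    # mean price under H0 (EUR)
x_bar = 19850.0  # sample mean (EUR)
s = 1084.0       # sample standard deviation (EUR)
n = 14           # sample size
df = n - 1       # degrees of freedom = 13

# Assumption: one-tailed 5% critical value for 13 degrees of freedom from a
# standard t-table; negative because the test is left-tailed.
t_crit = -1.771

# Standard one-sample t statistic
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Left-tailed rule: reject H0 if the statistic lies below the critical value
reject_h0 = t_stat < t_crit
print(f"t = {t_stat:.3f}, critical value = {t_crit}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is about −2.24, which lies in the rejection region, so the dealer's claim would be rejected at the 5% level.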
2. 28 employees of XYZ Company travel an average (mean) of 14.3 miles to work. The standard
deviation of their travel distance was 2 miles. Find the 90% confidence interval of the true
(population) mean.
• Exercise:
1. A researcher wishes to determine the average take-home pay of a part-time college student.
He takes a sample of 100 students and finds:
x̄ = LKR 15,000
s = LKR 20,000
What is the true average take-home pay of a part-time college student?
a) Use a 95% confidence interval estimator.
b) Use a 99% confidence interval estimator.
2. Average life of a GE refrigerator.
n = 100
x̄ = 18 years
s = 4 years
Find the confidence interval at:
i) 99% confidence
ii) 95% confidence
iii) 90% confidence
iv) 68% confidence
3. A company is interested in determining the average life of its watches. An employee randomly
4. Tire Manufacturer
n = 100
x̄ = 30,000 miles
s = 2,000 miles
Construct a 95% C.I.E of µ
5. Average Income of a College Graduate
n = 1,600
x̄ = LKR 50,000
s = LKR 20,000
Construct a 99% C.I.E of µ
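For large samples such as these, a confidence interval estimate of the mean can be sketched as x̄ ± z × s / √n. The snippet below is an illustration only: it assumes the normal approximation is adequate (n ≥ 30), the function name `mean_confidence_interval` is hypothetical, and it is run on the tire manufacturer figures from exercise 4.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, s, n, confidence):
    """Large-sample confidence interval for a population mean."""
    alpha = 1.0 - confidence
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Exercise 4: n = 100, sample mean 30,000 miles, s = 2,000 miles, 95% confidence
lower, upper = mean_confidence_interval(30000.0, 2000.0, 100, 0.95)
print(f"95% C.I.E of the mean: ({lower:.0f}, {upper:.0f}) miles")
```

Changing the `confidence` argument (for example to 0.99 or 0.90) widens or narrows the interval accordingly.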