Testing of Hypothesis
(Significance Test)
Terms - Definition
A hypothesis is a statement or assertion or assumption or claim
or belief about the state of nature (about the true value of an
unknown population parameter):
1 2 0
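Terms - Definition (contd.)
$\alpha = P(\text{Reject } H_0 \mid H_0 \text{ is true})$ (probability of a Type I error)
$\beta = P(\text{Accept } H_0 \mid H_0 \text{ is false})$ (probability of a Type II error)
Power $= 1 - \beta$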
1. Making assumptions
2. Constructing hypotheses
3. Determining the test statistic
4. Constructing critical region
5. Determining p-values
6. Drawing conclusion
UNIVARIATE POPULATION
Significance test for population
mean (when $\sigma$ is known)
Assumptions :
A random sample is drawn from a population (from a normal
distribution if the sample is small) with mean $\mu$ and sd $\sigma$
Sample size should be large (it may be small if the population is normal)
Population sd $\sigma$ is known
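Hypotheses:
$H_0: \mu = \mu_0$    $H_0: \mu = \mu_0$    $H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$    $H_1: \mu < \mu_0$    $H_1: \mu > \mu_0$
Critical Region :
Two-sided ($H_1: \mu \neq \mu_0$): reject $H_0$ if $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
Left-tailed ($H_1: \mu < \mu_0$): reject $H_0$ if $z < -z_{\alpha}$
Right-tailed ($H_1: \mu > \mu_0$): reject $H_0$ if $z > z_{\alpha}$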
$H_0: \mu = 2000$
$H_1: \mu < 2000$
Hypothesis Test of the Population
Mean When $\sigma$ Is Known
The p-value Approach
Determining the p-value depending on the specification
of the competing hypotheses.
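Test statistic ; p-value
$$z_{obs} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{1999.6 - 2000}{1.3/\sqrt{40}} = -1.95$$
$z_{0.05} = 1.645$, and $z_{obs} = -1.95 < -z_{0.05} = -1.645$
$$p\text{-value} = P(Z \le -1.95) = 0.0256 < 0.05$$
Both approaches reject $H_0$ at $\alpha = 0.05$.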
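The calculation above can be checked numerically. This is a minimal sketch using only the Python standard library; the function name `z_test` and its `alternative` argument are illustrative choices, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z_obs, p_value) for a one-sample z-test with known sigma."""
    z_obs = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "less":        # H1: mu < mu0
        p_value = std_normal.cdf(z_obs)
    elif alternative == "greater":   # H1: mu > mu0
        p_value = 1 - std_normal.cdf(z_obs)
    else:                            # H1: mu != mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z_obs)))
    return z_obs, p_value


# Worked example: x-bar = 1999.6, mu0 = 2000, sigma = 1.3, n = 40, H1: mu < 2000
z_obs, p_value = z_test(1999.6, 2000, 1.3, 40, alternative="less")
print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
```

The printed p-value differs slightly from 0.0256 in the later decimals because the table lookup rounds z to -1.95 first.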
Hypotheses:
$H_0: \mu = \mu_0$    $H_0: \mu = \mu_0$    $H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$    $H_1: \mu < \mu_0$    $H_1: \mu > \mu_0$
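$H_0: \mu = 2000$
$H_1: \mu < 2000$
Given $n = 40$, $\bar{x} = 1999.6$, $s = 1.3$
Test statistic ; p-value
$$z_{obs} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1999.6 - 2000}{1.3/\sqrt{40}} = -1.95$$
$z_{0.05} = 1.645$, and $z_{obs} = -1.95 < -z_{0.05} = -1.645$
$$p\text{-value} = P(Z \le -1.95) = 0.0256 < 0.05$$
With the sample sd $s$ in place of $\sigma$ (large sample), $H_0$ is again rejected at $\alpha = 0.05$.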
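To test
$H_0: \mu = 15$
$H_1: \mu \neq 15$
Test statistic ; p-value
$$z_{obs} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{16.3 - 15}{3.6/\sqrt{8}} = 1.02$$
$z_{0.025} = 1.96$, and $|z_{obs}| = 1.02 < 1.96$
$$p\text{-value} = P(Z \le -1.02) + P(Z \ge 1.02) = 0.1539 \times 2 = 0.3078 > 0.05$$
$H_0$ is not rejected at $\alpha = 0.05$.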
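For the two-sided case, the critical-value and p-value routes can be compared side by side. This is a rough sketch with the numbers from the example above; the variable names are illustrative, and the critical value comes from `NormalDist.inv_cdf` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sd, n, alpha = 16.3, 15, 3.6, 8, 0.05

z_obs = (xbar - mu0) / (sd / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 point, about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))  # area in both tails

reject = abs(z_obs) > z_crit  # equivalent to p_value < alpha
print(f"z_obs = {z_obs:.2f}, z_crit = {z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Both routes always agree: $|z_{obs}| > z_{\alpha/2}$ exactly when the two-sided p-value is below $\alpha$.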
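Hypotheses:
$H_0: p = p_0$    $H_0: p = p_0$    $H_0: p = p_0$
$H_1: p \neq p_0$    $H_1: p < p_0$    $H_1: p > p_0$
Test Statistic: sample proportion $= \hat{p}$
By CLT,
$$\hat{p} \approx N\left(p, \frac{p(1-p)}{n}\right)$$
$$z_{obs} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} \sim N(0,1), \text{ under } H_0$$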
Critical Region :
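$H_0: p = 0.01$
$H_1: p > 0.01$
Approach 1: Test statistic:
$$z_{obs} = \frac{0.03 - 0.01}{\sqrt{0.01(0.99)/100}} = 2.01, \text{ under } H_0$$
$z_{0.05} = 1.645$
$z_{obs} > z_{0.05}$, so $H_0$ is rejected at $\alpha = 0.05$.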
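The proportion example can be reproduced in a few lines. This is a sketch under the example's values ($\hat{p} = 0.03$, $p_0 = 0.01$, $n = 100$); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p0, n, alpha = 0.03, 0.01, 100, 0.05

# The standard error uses p0 because the statistic is evaluated under H0
se0 = sqrt(p0 * (1 - p0) / n)
z_obs = (p_hat - p0) / se0
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value, about 1.645

reject = z_obs > z_crit  # right-tailed test, H1: p > p0
print(f"z_obs = {z_obs:.2f}, z_crit = {z_crit:.3f}")
print("Reject H0" if reject else "Do not reject H0")
```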
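Hypotheses:
$H_0: \mu = \mu_0$    $H_0: \mu = \mu_0$    $H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$    $H_1: \mu < \mu_0$    $H_1: \mu > \mu_0$
$$t_{obs} = \frac{\bar{x} - \mu_0}{s'/\sqrt{n}} \sim t_{n-1}, \text{ under } H_0$$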
Critical Region :
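$H_0: \mu = 27$
$H_1: \mu \neq 27$
$n = 18$, $\bar{x} = 26.3$, $s = 6.15$
For $\alpha = 0.05$ and $(18 - 1) = 17$ df, the critical values of t are $\pm 2.11$.
The test statistic is:
$$t_{obs} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{26.3 - 27}{6.15/\sqrt{18}} = -0.48$$
Since $|t_{obs}| = 0.48 < 2.11$, do not reject $H_0$.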
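The t statistic above is easy to verify. The sketch below uses only the standard library, which has no t distribution, so the critical value 2.11 is copied from the example rather than computed.

```python
from math import sqrt

n, xbar, s, mu0 = 18, 26.3, 6.15, 27
t_crit = 2.11  # two-sided critical value for 17 df at alpha = 0.05, taken from the example

t_obs = (xbar - mu0) / (s / sqrt(n))
reject = abs(t_obs) > t_crit

print(f"t_obs = {t_obs:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```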
Equivalence between Hypotheses
tests and Confidence intervals
The main idea is that a two-sided hypothesis test will give
us exactly the same conclusion (about the population
parameter) as a confidence interval, i.e. if we test
$H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$ and fail to reject $H_0$ at significance
level $\alpha$ (= 0.01/0.05/0.1), then the corresponding $100(1-\alpha)\%$
(99%, 95%, 90%) confidence interval will contain the null
value (i.e. $\mu_0$).
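95% CI of $\mu$:
$[26.3 - (1.96 \times 6.15/\sqrt{18}),\ 26.3 + (1.96 \times 6.15/\sqrt{18})]$
$= [23.46, 29.14]$
The interval contains the null value 27, in agreement with not rejecting $H_0$.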
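A short sketch of the interval computation, using the same sample values and the multiplier 1.96 that appears in the interval above; the variable names are illustrative.

```python
from math import sqrt

n, xbar, s, mu0 = 18, 26.3, 6.15, 27
z_crit = 1.96  # multiplier used in the interval above

half_width = z_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
contains_null = lower <= mu0 <= upper

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
print("Null value inside the interval:", contains_null)
```

Using the t critical value 2.11 in place of 1.96 gives a slightly wider interval that also contains 27.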
Problem
The manager of a small convenience store does not want her customers
standing in line for too long prior to a purchase. In particular, she is willing to
hire an employee for another cash register if the average wait time of the
customers is more than five minutes. She randomly observes the wait time (in
minutes) of customers during the day as:
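$\beta = P(\bar{x} < 177.57 \mid \mu = 180)$
$= P\left(\dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \dfrac{177.57 - 180}{65/\sqrt{400}}\right)$
$= P(z < -.75)$
$= .2266$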
Effects on $\beta$ of Changing $\alpha$
Consider this diagram again. Shifting the critical
value line to the right (to decrease $\alpha$) will mean a
larger area under the lower curve for $\beta$ (and vice
versa).
Judging the Test
A statistical test of hypothesis is effectively defined
by the significance level ($\alpha$) and the sample size ($n$),
both of which are selected by the statistics
practitioner.
Therefore, if the probability of a Type II error ($\beta$) is
judged to be too large, we can reduce it by
increasing $\alpha$,
and/or
increasing the sample size, $n$.
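Judging the Test (contd.)
For example, suppose we increased n from a
sample size of 400 account balances to 1,000 in
Example 11.1.
$z > z_{\alpha} = z_{.05} = 1.645$
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - 170}{65/\sqrt{1{,}000}} > 1.645$
$\bar{x} > 173.38$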
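Judging the Test (contd.)
$\beta = P(\bar{x} < 173.38 \mid \mu = 180)$
$= P\left(\dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \dfrac{173.38 - 180}{65/\sqrt{1{,}000}}\right)$
$= P(z < -3.22)$
$\approx 0$ (approximately)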
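By increasing the sample size we reduce the
probability of a Type II error:
[Figure: sampling distributions of $\bar{x}$ for n = 400 (critical value 175.35) and n = 1,000 (critical value 173.38); compare $\beta$ at n = 400 and n = 1,000]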
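The effect of $\alpha$ and $n$ on $\beta$ can be tabulated directly. This is a minimal sketch assuming the right-tailed test of $H_0: \mu = 170$ with $\sigma = 65$ and a true mean of 180, as in the calculations above; the function name `type_ii_error` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()


def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Beta for the right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    se = sigma / sqrt(n)
    x_crit = mu0 + std_normal.inv_cdf(1 - alpha) * se  # reject H0 when x-bar > x_crit
    beta = std_normal.cdf((x_crit - mu1) / se)         # P(x-bar < x_crit | mu = mu1)
    return x_crit, beta


for alpha, n in [(0.05, 400), (0.01, 400), (0.05, 1000)]:
    x_crit, beta = type_ii_error(mu0=170, mu1=180, sigma=65, n=n, alpha=alpha)
    print(f"alpha = {alpha}, n = {n}: critical x-bar = {x_crit:.2f}, beta = {beta:.4f}")
```

Small differences from the slide values can appear because the slides use rounded table values of $z_{\alpha}$.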
TWO INDEPENDENT
UNIVARIATE
POPULATIONS
Sampling
In order to compare two groups (populations), we have to select
samples from both the groups. If the observations in one sample
are independent of those in the other, then those are
called independent samples.
Eg. Suppose we want to compare two drugs. We select a sample of
patients and randomly allocate them to the two drugs. These two
groups of patients (and also the observations coming from them)
will constitute independent samples since they were randomly
allocated to the two groups corresponding to the two drugs.
Significance test for difference
between population proportions
Notations:
p1 (p2) : population proportion of success in the first
(second) group.
n1 (n2) : sizes of random samples drawn from the first
(second) populations.
Assumptions :
Independent random samples from the two groups.
Large enough sample sizes so that in each sample there
are at least 5 successes and 5 failures.
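Hypotheses:
$H_0: p_1 = p_2$    $H_0: p_1 = p_2$    $H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$    $H_1: p_1 < p_2$    $H_1: p_1 > p_2$
Test Statistic: difference between sample proportions
$= \hat{p}_1 - \hat{p}_2$
$$z_{obs} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0,1), \text{ under } H_0$$
where $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$ is the pooled sample proportion.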
Critical Region :
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{36 + 31}{72 + 50} = \frac{67}{122} = .549$$
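Example: contd.
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(.50 - .62) - 0}{\sqrt{.549(1 - .549)\left(\dfrac{1}{72} + \dfrac{1}{50}\right)}} = -1.31$$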
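The pooled-proportion statistic can be checked from the raw counts (36 successes out of 72 and 31 out of 50). This is a sketch; the function name `two_proportion_z` is illustrative.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, z_obs) for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)  # common proportion estimated under H0
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return p_pooled, (p1_hat - p2_hat) / se


p_pooled, z_obs = two_proportion_z(36, 72, 31, 50)
print(f"pooled p = {p_pooled:.3f}, z_obs = {z_obs:.2f}")
```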
Assumptions :
Independent random samples are drawn from the two populations
(from normal distributions if the samples are small).
$n_1$ and $n_2$ large (they may be small if the populations are normal)
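Hypotheses:
$H_0: \mu_1 = \mu_2$    $H_0: \mu_1 = \mu_2$    $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$    $H_1: \mu_1 < \mu_2$    $H_1: \mu_1 > \mu_2$
Test Statistic: difference between sample means
$\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2$ (unbiased estimators)
By CLT,
$$\bar{x}_1 - \bar{x}_2 \approx N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
$$z_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1), \text{ under } H_0$$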
Critical Region :
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 > \mu_2$
$$Z_{obs} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Example : contd..
Reject $H_0$ if $Z_{obs} > 1.645$ at $\alpha = 0.05$
Computations:
Since $\bar{x}_1 = 121$ minutes, $\bar{x}_2 = 112$ minutes,
$\sigma_1^2 = \sigma_2^2 = 8^2 = 64$ minutes$^2$ and $n_1 = n_2 = 10$, the value of the test
statistic is
$$Z_0 = \frac{121 - 112}{\sqrt{\dfrac{8^2}{10} + \dfrac{8^2}{10}}} = 2.52$$
Conclusion: Since $Z_0 = 2.52 > 1.645$, we reject $H_0: \mu_1 - \mu_2 = 0$ at the 0.05
level of significance and conclude that adding the new ingredient to the
paint significantly reduces the drying time.
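The drying-time computation can be reproduced as follows. This is a sketch with the values from the example; the p-value line is an addition for illustration and is not part of the slide's solution.

```python
from math import sqrt
from statistics import NormalDist

xbar1, xbar2 = 121, 112   # sample mean drying times in minutes
var1 = var2 = 8 ** 2      # known population variances
n1 = n2 = 10
alpha = 0.05

z_obs = (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value, about 1.645
p_value = 1 - NormalDist().cdf(z_obs)     # illustrative addition, not in the example

print(f"z_obs = {z_obs:.2f}, z_crit = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if z_obs > z_crit else "Do not reject H0")
```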
Significance test for difference
between population means
(population sds are unknown)
Large sample
Notations:
$\mu_1$ ($\mu_2$) : population mean in the first (second) group.
$\sigma_1$ ($\sigma_2$) : population sd in the first (second) group.
n1 (n2) : sizes of random samples drawn from the first
(second) populations.
Assumptions :
Independent random samples from the two groups are
drawn.
n1 and n2 large
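Hypotheses:
$H_0: \mu_1 = \mu_2$    $H_0: \mu_1 = \mu_2$    $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$    $H_1: \mu_1 < \mu_2$    $H_1: \mu_1 > \mu_2$
Test Statistic: difference between sample means
$\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2$ (unbiased estimators)
$$\hat{\sigma}_1 = s_1 = \sqrt{\frac{1}{n_1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2}$$
$$\hat{\sigma}_2 = s_2 = \sqrt{\frac{1}{n_2}\sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2}$$
By CLT,
$$\bar{x}_1 - \bar{x}_2 \approx N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
$$z_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim N(0,1), \text{ under } H_0$$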
Critical Region :
Assumptions :
Independent random samples are drawn from normal
distributions
$\sigma_1 = \sigma_2 = \sigma$ (say)
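Hypotheses:
$H_0: \mu_1 = \mu_2$    $H_0: \mu_1 = \mu_2$    $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$    $H_1: \mu_1 < \mu_2$    $H_1: \mu_1 > \mu_2$
Test Statistic: difference between sample means
$\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2$ (unbiased estimators)
$$\hat{\sigma}_1 = s_1' = \sqrt{\frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2}$$
$$\hat{\sigma}_2 = s_2' = \sqrt{\frac{1}{n_2 - 1}\sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2}$$
Pooled estimator of $\sigma^2$ is
$$s'^{2} = \frac{(n_1 - 1)\,s_1'^{2} + (n_2 - 1)\,s_2'^{2}}{n_1 + n_2 - 2}$$
$$\bar{x}_1 - \bar{x}_2 \sim N\left(\mu_1 - \mu_2,\ \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$$
$$\frac{(n_1 + n_2 - 2)\,s'^{2}}{\sigma^2} \sim \chi^2_{n_1 + n_2 - 2}$$
$$t_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s'\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}, \text{ under } H_0$$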
Critical Region :
Men : 72 69 98 66 85 76 79 80 77
Women : 81 67 90 78 81 80 76
Hypothesis:
$H_0: \mu_f = \mu_m$    $H_1: \mu_f \neq \mu_m$
Solution:
                                Women          Men
Mean                            79             78
Variance                        47.33333333    90
Observations                    7              9
Pooled Variance                 71.71428571
Hypothesized Mean Difference    0
df                              14
t Stat                          0.234318967
P(T<=t) one-tail                0.409064729
t Critical one-tail             1.761310115
P(T<=t) two-tail                0.818129458
t Critical two-tail             2.144786681
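The summary figures in the table can be recomputed from the raw data. This is a sketch using only the standard library: the difference is taken as Women minus Men to match the sign of the t Stat row, and the two-tail critical value is copied from the table because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

men = [72, 69, 98, 66, 85, 76, 79, 80, 77]
women = [81, 67, 90, 78, 81, 80, 76]

n_w, n_m = len(women), len(men)
var_w, var_m = variance(women), variance(men)  # sample variances, divisor n - 1

df = n_w + n_m - 2
pooled_var = ((n_w - 1) * var_w + (n_m - 1) * var_m) / df
t_obs = (mean(women) - mean(men)) / sqrt(pooled_var * (1 / n_w + 1 / n_m))

t_crit_two_tail = 2.144786681  # from the table above (14 df, alpha = 0.05)

print(f"pooled variance = {pooled_var:.4f}, df = {df}, t = {t_obs:.4f}")
print("Reject H0" if abs(t_obs) > t_crit_two_tail else "Do not reject H0")
```

Since the statistic is far below the critical value, the data give no evidence of a difference between the two group means.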