Two-Sample Hypothesis Tests

LECTURE 10
Two-sample Hypothesis Tests

Learning Objectives
In this chapter, you will learn about:
• Comparing Two Means: Independent Samples
• Confidence Interval for the Difference of Two Means
• Comparing Two Means: Paired Samples
• Comparing Two Proportions
• Confidence Interval for the Difference of Two Proportions
• Comparing Two Variances
Two-sample Tests
• A one-sample test compares a sample estimate against a non-sample benchmark.
• A two-sample test compares two sample estimates with each other.
Two-Sample Tests
• Population means, independent samples
• Population means, related samples
• Population proportions
• Population variances
Difference Between Two Means
Independent Samples
Goal: Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2.
The point estimate for the difference is X̄1 – X̄2.

Hypothesis Tests for Two Population Means
Two Population Means, Independent Samples

Lower-tail test:        Upper-tail test:        Two-tail test:
H0: μ1 ≥ μ2             H0: μ1 ≤ μ2             H0: μ1 = μ2
H1: μ1 < μ2             H1: μ1 > μ2             H1: μ1 ≠ μ2
i.e.,                   i.e.,                   i.e.,
H0: μ1 – μ2 ≥ 0         H0: μ1 – μ2 ≤ 0         H0: μ1 – μ2 = 0
H1: μ1 – μ2 < 0         H1: μ1 – μ2 > 0         H1: μ1 – μ2 ≠ 0
Rejection region:       Rejection region:       Rejection region:
α in the lower tail     α in the upper tail     α/2 in each tail
Hypothesis tests for µ1 - µ2 with σ1 and σ2 known
Population means, independent samples:
• σ1 and σ2 known
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal
Assumptions (σ1 and σ2 known):
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are known
The test statistic is:
$$Z_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
The confidence interval for μ1 – μ2 is:
$$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
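The lecture gives no numeric example for the known-σ case, so the following is only a minimal sketch of the Z statistic and confidence interval above. The input values are hypothetical and chosen purely for illustration, and the function name `two_sample_z` is not from the lecture.

```python
import math
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2,
                 hypothesized_diff=0.0, alpha=0.05):
    """Z statistic and (1 - alpha) CI for mu1 - mu2 with known sigmas."""
    # Standard error of the difference between the two sample means
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = xbar1 - xbar2
    z_stat = (diff - hypothesized_diff) / se
    # Two-tail critical value Z_{alpha/2} from the standard normal
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return z_stat, ci


# Hypothetical inputs (NOT from the lecture), for illustration only
z_stat, ci = two_sample_z(xbar1=50.0, xbar2=48.0,
                          sigma1=4.0, sigma2=3.0, n1=40, n2=36)
print(f"Z_STAT = {z_stat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

For a two-tail test at α = 0.05, H0 would be rejected when |Z_STAT| exceeds the critical value of about 1.96.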
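Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and assumed equal
Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are unknown but assumed equal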
The pooled variance is:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$
The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where tSTAT has d.f. = (n1 + n2 – 2).

Confidence interval for µ1 - µ2 with σ1 and σ2 unknown and assumed equal
The confidence interval for μ1 – μ2 is:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where tα/2 has d.f. = n1 + n2 – 2.

Pooled-Variance t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number              21        25
Sample mean       3.27      2.53
Sample std dev    1.30      1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?
Pooled-Variance t Test Example:
Calculating the Test Statistic
H0: μ1 - μ2 = 0
H1: μ1 - μ2 ≠ 0
The pooled variance is:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)\,1.30^2 + (25 - 1)\,1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$
The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$
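The calculation above can be checked with a few lines of code. This is a sketch that works from the summary statistics in the example; the function name `pooled_t` is an illustrative choice, not something the lecture defines.

```python
import math


def pooled_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    # Standard error of the difference in sample means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((xbar1 - xbar2) - hypothesized_diff) / se
    return sp2, se, t_stat, df


# NYSE (sample 1) and NASDAQ (sample 2) summary statistics from the example
sp2, se, t_stat, df = pooled_t(3.27, 1.30, 21, 2.53, 1.16, 25)
print(f"Sp^2 = {sp2:.4f}, SE = {se:.4f}, t_STAT = {t_stat:.3f}, d.f. = {df}")
```

The square root of the pooled variance is the pooled standard deviation, which is the quantity statistical packages usually report.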
Hypothesis Test Solution
α = 0.05
df = 21 + 25 - 2 = 44
Critical values: t = ±2.0154 (area .025 in each tail; reject H0 if tSTAT < -2.0154 or tSTAT > 2.0154)
Test statistic:
$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$
Decision: Reject H0 at α = 0.05, because 2.040 > 2.0154.
Conclusion: There is evidence of a difference in means.
Minitab Pooled-Variance t test
Comparing NYSE & NASDAQ
Two-Sample T-Test and CI
Sample N Mean StDev SE Mean

1 21 3.27 1.30 0.28
2 25 2.53 1.16 0.23
Difference = mu (1) - mu (2)

Estimate for difference: 0.740
95% CI for difference: (0.009, 1.471)
T-Test of difference = 0 (vs not =): T-Value = 2.04 P-Value = 0.047 DF = 44
Both use Pooled StDev = 1.2256
Decision: Reject H0 at α = 0.05
Conclusion: There is evidence of a difference in means.
Confidence Interval for µ1 - µ2
Since we rejected H0, can we be 95% confident that µNYSE > µNASDAQ?
95% Confidence Interval for µNYSE - µNASDAQ:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009,\ 1.471)$$
Since 0 is less than the entire interval, we can be 95% confident that µNYSE > µNASDAQ.
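As a quick check of the interval, the sketch below recomputes it from the pooled variance and the critical value quoted in the lecture. The critical value t = 2.0154 (44 d.f.) is taken from the text rather than computed, since the Python standard library has no t distribution.

```python
import math

T_CRIT = 2.0154          # t_{0.025, 44}, as quoted in the lecture
n1, n2 = 21, 25          # NYSE and NASDAQ sample sizes
diff = 3.27 - 2.53       # point estimate of mu_NYSE - mu_NASDAQ
sp2 = 1.5021             # pooled variance from the worked example

# Standard error of the difference, then the margin of error
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
margin = T_CRIT * se
lower, upper = diff - margin, diff + margin

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
# The interval supports mu_NYSE > mu_NASDAQ only if it lies entirely above 0
print("Entire interval above 0:", lower > 0)
```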

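Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown, not assumed equal
Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are unknown and cannot be assumed to be equal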
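Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and not assumed equal
The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
tSTAT has d.f. ν:
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$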
A Quick Rule for degrees of freedom is to use min(n1 – 1, n2 – 1).
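To see how the degrees-of-freedom formula and the quick rule compare, here is a small sketch. It reuses the NYSE/NASDAQ summary statistics from the pooled-variance example purely for illustration (the lecture analyzes those data with the pooled test, not this one), and the function name `separate_variance_t` is an assumption.

```python
import math


def separate_variance_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Separate-variance t statistic and its approximate d.f."""
    v1 = s1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2   # estimated variance of the second sample mean
    t_stat = ((xbar1 - xbar2) - hypothesized_diff) / math.sqrt(v1 + v2)
    # Degrees of freedom from the formula above
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Illustration only: NYSE/NASDAQ summary statistics from the earlier example
t_stat, df = separate_variance_t(3.27, 1.30, 21, 2.53, 1.16, 25)
quick_df = min(21 - 1, 25 - 1)   # the quick rule: min(n1 - 1, n2 - 1)

print(f"t_STAT = {t_stat:.3f}, d.f. = {df:.2f}, quick-rule d.f. = {quick_df}")
```

The quick rule gives fewer degrees of freedom than the formula here, so it leads to a larger critical value and a more conservative test.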

Confidence interval for µ1 - µ2 with σ1 and σ2 unknown and not assumed equal
The confidence interval for μ1 - μ2 is:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
Difference Between Two Means
The table below presents the summary statistics for the starting annual salaries (in thousands of dollars) for individuals entering the public accounting and financial planning professions.

Sample I (public accounting):    X̄1 = 60.35, S1 = 3.25, n1 = 12
Sample II (financial planning):  X̄2 = 58.20, S2 = 2.48, n2 = 14

Test whether the mean starting annual salary for individuals entering the public accounting profession is higher than that for individuals entering financial planning.
Summary:
Two Independent Sample Tests
Related Populations
The Paired Difference Test
• Tests means of 2 related populations
  • Paired samples
  • Repeated measures (before/after)
• Use the difference between paired values: Di = X1i - X2i
• Assumptions:
  • Both populations are normally distributed
  • Or, if not normal, use large samples
Related Populations
The Paired Difference Test
The ith paired difference is: Di = X1i - X2i

The point estimate for the paired difference population mean μD is:
$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$
The sample standard deviation is SD:
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
n is the number of pairs in the paired sample

The Paired Difference Test:
Finding tSTAT
• The test statistic for μD is:
$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$
• Where tSTAT has n - 1 d.f.

The Paired Difference Test:
Possible Hypotheses

Lower-tail test:           Upper-tail test:           Two-tail test:
H0: μD ≥ 0                 H0: μD ≤ 0                 H0: μD = 0
H1: μD < 0                 H1: μD > 0                 H1: μD ≠ 0
Rejection region:          Rejection region:          Rejection region:
α in the lower tail        α in the upper tail        α/2 in each tail
Reject H0 if tSTAT < -tα   Reject H0 if tSTAT > tα    Reject H0 if tSTAT < -tα/2 or tSTAT > tα/2

Where tSTAT has n - 1 d.f.
The Paired Difference Confidence Interval
The confidence interval for μD is:
$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$
Paired Difference Test: Example
• Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints
Salesperson   Before (1)   After (2)   Difference, Di = (2) - (1)
C.B.               6           4           -2
T.F.              20           6          -14
M.H.               3           2           -1
R.K.               0           0            0
M.O.               4           0           -4
Total                                     -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$
$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$
Paired Difference Test: Solution
Has the training made a difference in the number of complaints (at the 0.01 level)?
H0: μD = 0
H1: μD ≠ 0
α = .01, D̄ = -4.2
d.f. = n - 1 = 4
Critical values: t0.005 = ±4.604 (area α/2 in each tail)
Test statistic:
$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$
Decision: Do not reject H0 (tSTAT = -1.66 is not in the rejection region).
Conclusion: There is insufficient evidence of a significant change in the number of complaints.
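The paired calculation can be reproduced directly from the complaint counts. This is a minimal sketch using only the Python standard library; the critical value 4.604 is the one quoted in the solution, not computed here.

```python
import math
from statistics import mean, stdev

# Complaints per salesperson (C.B., T.F., M.H., R.K., M.O.) from the example
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Paired differences, After minus Before, as in the lecture's table
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = mean(diffs)    # point estimate of mu_D
s_d = stdev(diffs)     # sample standard deviation (divides by n - 1)
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

T_CRIT = 4.604         # t_{0.005, 4}, as quoted in the lecture
reject_h0 = abs(t_stat) > T_CRIT

print(f"D-bar = {d_bar:.1f}, S_D = {s_d:.2f}, t_STAT = {t_stat:.2f}")
print("Reject H0:", reject_h0)
```

Note that `stdev` uses the n - 1 divisor, which matches the definition of SD in this lecture; `pstdev` would not.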
Paired t Test In Minitab Yields
The Same Conclusions
Paired T-Test and CI: After, Before
Paired T for After - Before
N Mean StDev SE Mean

After 5 2.40 2.61 1.17
Before 5 6.60 7.80 3.49
Difference 5 -4.20 5.67 2.54
95% CI for mean difference: (-11.25, 2.85)

T-Test of mean difference = 0 (vs not = 0): T-Value = -1.66 P-Value = 0.173
Two Population Proportions
Goal: test a hypothesis or form a confidence interval for the difference between two population proportions, π1 – π2.
The point estimate for the difference is p1 – p2.

The pooled estimate for the overall proportion is:
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$
where X1 and X2 are the number of items of interest in samples 1 and 2.
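The test statistic for π1 – π2 is a Z statistic:
$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \quad p_1 = \frac{X_1}{n_1}, \quad p_2 = \frac{X_2}{n_2}$$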
Lower-tail test:           Upper-tail test:           Two-tail test:
H0: π1 ≥ π2                H0: π1 ≤ π2                H0: π1 = π2
H1: π1 < π2                H1: π1 > π2                H1: π1 ≠ π2
i.e.,                      i.e.,                      i.e.,
H0: π1 – π2 ≥ 0            H0: π1 – π2 ≤ 0            H0: π1 – π2 = 0
H1: π1 – π2 < 0            H1: π1 – π2 > 0            H1: π1 – π2 ≠ 0
Rejection region:          Rejection region:          Rejection region:
α in the lower tail        α in the upper tail        α/2 in each tail
Reject H0 if ZSTAT < -Zα   Reject H0 if ZSTAT > Zα    Reject H0 if ZSTAT < -Zα/2 or ZSTAT > Zα/2
Hypothesis Test Example:
Two Population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
• In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote “Yes”
• Test at the .05 level of significance

• The hypothesis test is:
H0: π1 – π2 = 0
H1: π1 – π2 ≠ 0
• The sample proportions are:
  • Men: p1 = 36/72 = 0.50
  • Women: p2 = 35/50 = 0.70
• The pooled estimate for the overall proportion is:
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 35}{72 + 50} = \frac{71}{122} = 0.582$$
The test statistic for π1 – π2 is:
$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(0.50 - 0.70) - 0}{\sqrt{0.582(1 - 0.582)\left(\dfrac{1}{72} + \dfrac{1}{50}\right)}} = -2.20$$
Critical values = ±1.96 for α = .05 (area .025 in each tail)
Decision: Reject H0, because -2.20 < -1.96.
Conclusion: There is evidence of a difference in proportions who will vote yes between men and women.
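The pooled Z test above can be sketched as follows. The function name `two_proportion_z` is illustrative, and the critical value is obtained from the standard library's `NormalDist` rather than a table.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: pi1 - pi2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion under H0
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z_stat = (p1 - p2) / se
    return p_pool, z_stat


# Men: 36 of 72 said "Yes"; women: 35 of 50 said "Yes"
p_pool, z_stat = two_proportion_z(36, 72, 35, 50)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
reject_h0 = abs(z_stat) > z_crit

print(f"p-bar = {p_pool:.3f}, Z_STAT = {z_stat:.2f}, reject H0: {reject_h0}")
```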
Two Proportion Test In Minitab
Shows The Same Conclusions
Test and CI for Two Proportions
Sample X N Sample p
1 36 72 0.500000
2 35 50 0.700000
Difference = p (1) - p (2)

Estimate for difference: -0.2
95% CI for difference: (-0.371676, -0.0283244)
Test for difference = 0 (vs not = 0): Z = -2.28 P-Value = 0.022
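Conclusion: There is evidence of a difference in proportions who will vote yes between men and women.
Note: The Z = -2.28 in this output differs from the hand-computed -2.20 because it is based on the unpooled standard error (separate p1 and p2) rather than the pooled estimate. The decision is the same.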
Confidence Interval for π1 – π2
If the hypothesized difference is nonzero (for example, π1 – π2 = 0.02), use the following formula:
$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$
Confidence Interval for π1 – π2
The confidence interval for π1 – π2 is:
$$(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
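Applying this interval to the Proposition A data reproduces the 95% CI in the Minitab output shown earlier. The sketch below uses the unpooled standard error from the formula above; the function name `proportion_diff_ci` is an illustrative choice.

```python
import math
from statistics import NormalDist


def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """CI for pi1 - pi2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # Each sample contributes its own variance term (no pooling)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se, se


# Men: 36 of 72; women: 35 of 50 (Proposition A example)
lower, upper, se = proportion_diff_ci(36, 72, 35, 50)
z_unpooled = (36 / 72 - 35 / 50) / se   # Z with a hypothesized difference of 0

print(f"95% CI: ({lower:.6f}, {upper:.6f})")
print(f"Unpooled Z = {z_unpooled:.2f}")
```

Because the interval lies entirely below 0, it agrees with the two-tail test's rejection of H0 at α = 0.05.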
Testing for the Ratio of Two Population Variances
Hypotheses:
Two-tail test:    H0: σ1² = σ2²    H1: σ1² ≠ σ2²
Upper-tail test:  H0: σ1² ≤ σ2²    H1: σ1² > σ2²
FSTAT:
$$F_{STAT} = \frac{S_1^2}{S_2^2}$$
Where:
S1² = Variance of sample 1 (the larger sample variance)
n1 = sample size of sample 1
S2² = Variance of sample 2 (the smaller sample variance)
n2 = sample size of sample 2
The F Distribution
• The F critical value is found from the F table
• There are two degrees of freedom required: numerator and denominator
• The larger sample variance is always the numerator
• When $F_{STAT} = S_1^2 / S_2^2$: df1 = n1 – 1; df2 = n2 – 1
• In the F table,
  • numerator degrees of freedom determine the column
  • denominator degrees of freedom determine the row
Finding the Rejection Region
Two-tail test (H0: σ1² = σ2², H1: σ1² ≠ σ2²):
• Area α/2 in each tail
• $F_R = F_{\alpha/2,\ df_1,\ df_2}$
• $F_L = F_{1-\alpha/2,\ df_1,\ df_2} = 1 / F_{\alpha/2,\ df_2,\ df_1}$
• Reject H0 if FSTAT > FR or FSTAT < FL
Upper-tail test (H0: σ1² ≤ σ2², H1: σ1² > σ2²):
• Area α in the upper tail
• Reject H0 if FSTAT > Fα

F Test: An Example
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE and NASDAQ. You collect the following data:

            NYSE    NASDAQ
Number        21        25
Mean        3.27      2.53
Std dev     1.30      1.16

Is there a difference in the variances between the NYSE and NASDAQ at the α = 0.05 level?
F Test: Example Solution
• Form the hypothesis test:
  H0: σ1² = σ2² (there is no difference between variances)
  H1: σ1² ≠ σ2² (there is a difference between variances)
• Significance level α = 0.05
• Numerator d.f. = n1 – 1 = 21 – 1 = 20
• Denominator d.f. = n2 – 1 = 25 – 1 = 24
• FR = F.025, 20, 24 = 2.33 (FINV(0.025, 20, 24))
• FL = 1/F.025, 24, 20 = 0.41 (or FINV(0.975, 20, 24))
F Test: Example Solution
H0: σ1² = σ2²
H1: σ1² ≠ σ2²
• The test statistic is:
$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$
• Rejection region (α/2 = .025 in each tail): reject H0 if FSTAT < FL = 0.41 or FSTAT > FR = 2.33
• FSTAT = 1.256 is not in the rejection region, so we do not reject H0
• Conclusion: There is insufficient evidence of a difference in variances at α = .05
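A short sketch of the same decision rule follows. The critical values FL = 0.41 and FR = 2.33 are copied from the solution above rather than computed, because the Python standard library has no F distribution; the function name `f_statistic` is an illustrative choice.

```python
def f_statistic(s1, s2):
    """Ratio of sample variances, given the sample standard deviations."""
    return s1 ** 2 / s2 ** 2


# NYSE (sample 1, larger variance) and NASDAQ (sample 2) from the example
n1, s1 = 21, 1.30
n2, s2 = 25, 1.16

f_stat = f_statistic(s1, s2)
df1, df2 = n1 - 1, n2 - 1     # numerator and denominator d.f.

# Critical values for alpha = 0.05 (two-tail), as quoted in the lecture
F_L, F_R = 0.41, 2.33
reject_h0 = f_stat > F_R or f_stat < F_L

print(f"F_STAT = {f_stat:.3f} with d.f. = ({df1}, {df2}); reject H0: {reject_h0}")
```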
Two Variance F Test In Minitab
Yields The Same Conclusion
Test and CI for Two Variances
Null hypothesis Sigma(1) / Sigma(2) = 1

Alternative hypothesis Sigma(1) / Sigma(2) not = 1
Significance level Alpha = 0.05
Statistics
Sample N StDev Variance
1 21 1.300 1.690
2 25 1.160 1.346
Ratio of standard deviations = 1.121

Ratio of variances = 1.256
95% Confidence Intervals
Distribution of Data   CI for StDev Ratio   CI for Variance Ratio
Normal                 (0.735, 1.739)       (0.540, 3.024)
Tests
Test
Method DF1 DF2 Statistic P-Value
F Test (normal) 20 24 1.26 0.589
Chapter Summary
• Compared two independent samples
  • Performed pooled-variance t test for the difference in two means
  • Performed separate-variance t test for the difference in two means
  • Formed confidence intervals for the difference between two means
• Compared two related samples (paired samples)
  • Performed paired t test for the mean difference
  • Formed confidence intervals for the mean difference
• Compared two population proportions
  • Formed confidence intervals for the difference between two population proportions
  • Performed Z test for two population proportions
• Performed F test for the ratio of two population variances
The Wall Street Journal recently published an article indicating differences in perception of sexual harassment on the job between men and women. The article claimed that women perceived the problem to be much more prevalent than did men. One question asked of both men and women was: "Do you think sexual harassment is a major problem in the American workplace?" 24% of the men compared to 62% of the women responded "Yes." Let W designate women's responses and M designate men's.
1. What hypothesis should The Wall Street Journal test in order to show that its claim is true?
2. Suppose that 150 women and 200 men were interviewed. For a 0.01 level of significance, what is the critical value for the rejection region?
3. What is the value of the test statistic?
4. Construct a 99% confidence interval estimate of the difference between the proportion of women and men who think sexual harassment is a major problem in the American workplace.
5. Calculate the p-value for testing the above claim of The Wall Street Journal.
Homework
• Ebook: Chapter 10
  • 10.70
  • 10.76
  • 10.82
  • 10.86
  • 10.90
