Inferential Statistics
COMPILED BY:
JAYMIL B. DELOS REYES, LPT
INTRODUCTION
This course presents techniques for statistical analysis. Rank-based and resampling
techniques are well represented, but strong techniques are considered as well. These
techniques include one-sample testing and estimation, multi-sample testing and estimation, and
regression.
This course begins with parametric statistics and then shifts to nonparametric
statistics. Students will practice making inferences using both parametric and
nonparametric methods. Students will also examine these techniques to identify which
statistical method is appropriate for testing a difference, relationship, or association between
two or more variables.
By the end of this course, the student should be able to understand the basic
concepts, identify the correct usage of statistical tests by conducting investigations and
research to formulate data-driven conclusions and decisions, develop skill in problem solving
by giving appropriate examples that can be solved using nonparametric statistics, and
appreciate statistics by advocating the use of statistical data in making important decisions in
everyday life.
LEARNING OUTCOMES
At the end of each lesson, the student should be able to:
Introduction:
The basic concepts of hypothesis testing were explained previously. With the z, t, and χ² tests, a
sample mean, variance, or proportion can be compared to a specific population mean, variance,
or proportion to determine whether the null hypothesis should be rejected. There are, however,
many instances when researchers wish to compare two sample means, using experimental and
control groups. For example, the average lifetimes of two different brands of bus tires might be
compared to see whether there is any difference in tread wear. Two different brands of fertilizer
might be tested to see whether one is better than the other for growing plants. Or two brands of
cough syrup might be tested to see whether one brand is more effective than the other. In the
comparison of two means, the same basic steps for hypothesis testing are used, and the z and t
tests are also used.
Learning Objectives:
After successful completion of this lesson, you should be able to:
1. Test the difference between sample means, using the z test.
Course Materials:
The theory behind testing the difference between two means is based on selecting pairs
of samples and comparing the means of the pairs. The population means need not be known. All
possible pairs of samples are taken from populations. The means for each pair of samples are
computed and then subtracted, and the differences are plotted. If both populations have the same
mean, then most of the differences will be zero or close to zero. Before you can use the z test to
test the difference between two independent sample means, you must make sure that the
following assumptions are met.
ASSUMPTIONS FOR THE Z TEST TO DETERMINE THE DIFFERENCE BETWEEN TWO
MEANS
The basic format for hypothesis testing using the traditional method is reviewed here.
A study using two random samples of 35 people each found that the average amount of
time those in the age group of 26–35 years spent per week on leisure activities was 39.6 hours,
and those in the age group of 46–55 years spent 35.4 hours. Assume that the population
standard deviation for those in the first age group found by previous studies is 6.3 hours, and
the population standard deviation of those in the second group found by previous studies was
5.8 hours. At alpha = 0.05, can it be concluded that there is a significant difference in the
average times each group spends on leisure activities?
Solution:
𝐻0 : 𝜇1 = 𝜇2
𝐻1 : 𝜇1 ≠ 𝜇2
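The test value is z = (39.6 − 35.4) / √(6.3²/35 + 5.8²/35) = 2.90. For a two-tailed test at
alpha = 0.05, the critical values are ±1.96, so the null hypothesis is rejected.
There is enough evidence to support the claim that the means are not equal. That is, the
average of the times spent on leisure activities is different for the groups.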
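The leisure-time example can be checked with a short Python sketch. It assumes the standard two-sample z statistic, z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2), with a hypothesized difference of zero. The function name `two_sample_z` and its parameter names are illustrative only and do not come from the course text.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Two-sample z statistic with known population standard deviations."""
    # Standard error of the difference between two independent sample means.
    standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error


alpha = 0.05
# Values from the example: ages 26-35 (group 1) and ages 46-55 (group 2).
z = two_sample_z(39.6, 35.4, 6.3, 5.8, 35, 35)
# Two-tailed critical value from the standard normal distribution.
critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z) > critical

print(f"z = {z:.2f}, critical values = +/-{critical:.2f}, reject H0: {reject_null}")
```

Under these assumptions the test value is about 2.90 and the critical values are about ±1.96, which agrees with the decision to reject the null hypothesis.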
Watch:
The (Pearson) Correlation Coefficient Explained in One Minute: From Definition to
Formula
https://www.youtube.com/watch?v=WpZi02ulCvQ
Read:
Z-test: Two Sample Means
Bluman, A. G. (2012). Descriptive and Inferential Statistics. In Bluman, A. G.,
ELEMENTARY STATISTICS: A STEP BY STEP APPROACH, EIGHTH EDITION. New
York: McGraw-Hill Education
UNIT 1: TESTING THE DIFFERENCE BETWEEN 2 MEANS
LESSON 2– T-TEST
Introduction:
The basic concepts of hypothesis testing were explained previously. With the z, t, and χ² tests, a
sample mean, variance, or proportion can be compared to a specific population mean, variance,
or proportion to determine whether the null hypothesis should be rejected. There are, however,
many instances when researchers wish to compare two sample means, using experimental and
control groups. When comparing two means by using the t test, the researcher must decide if the
two samples are independent or dependent.
Learning Objectives:
After successful completion of this lesson, you should be able to:
1. Test the difference between two means for independent samples, using the t test.
Course Materials:
The z test was used to test the difference between two means when the population
standard deviations were known and the variables were normally or approximately
normally distributed, or when both sample sizes were greater than or equal to 30. In
many situations, however, these conditions cannot be met; that is, the population
standard deviations are not known. In these cases, a t test is used to test the difference
between means when the two samples are independent and when the samples are
taken from two normally or approximately normally distributed populations. Samples are
independent when they are not related. It will also be assumed that the
variances are not equal.
FORMULA FOR THE T TEST FOR TESTING THE DIFFERENCE BETWEEN TWO MEANS,
INDEPENDENT SAMPLES
ASSUMPTIONS FOR THE T-TEST FOR TWO INDEPENDENT MEANS WHEN 𝝈𝟏 AND 𝝈𝟐
ARE UNKNOWN
1. The samples are random samples.
2. The sample data are independent of one another.
3. When the sample sizes are less than 30, the populations must be normally or
approximately normally distributed.
Example
A researcher wishes to see if the average weights of newborn male infants are different
from the average weights of newborn female infants. She selects a random sample of 10 male
infants and finds the mean weight is 7 pounds 11 ounces and the standard deviation of the
sample is 8 ounces. She selects a random sample of 8 female infants and finds that the mean
weight is 7 pounds 4 ounces and the standard deviation of the sample is 5 ounces. Can it be
concluded at alpha = 0.05 that the mean weight of the males is different from the mean weight
of the females? Assume that the variables are normally distributed.
Solution:
𝐻0 : 𝜇1 = 𝜇2
𝐻1 : 𝜇1 ≠ 𝜇2
The test is two-tailed with alpha = 0.05, and the degrees of freedom are the smaller of
n1 - 1 and n2 - 1. In this case, n1 - 1 = 10 - 1 = 9 and n2 - 1 = 8 - 1 = 7, so d.f. = 7. From the
t distribution table (Table F), the critical values are +2.365 and -2.365.
Step 3 Compute the test value. Change the means to ounces (1 lb = 16 oz):
7 lb 11 oz = 7 x 16 + 11 = 123 oz
7 lb 4 oz = 7 x 16 + 4 = 116 oz
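The test value is t = (123 − 116) / √(8²/10 + 5²/8) = 2.268. Since 2.268 lies between the
critical values, the null hypothesis is not rejected.
There is not enough evidence to support the claim that the mean of the weights of the
male infants is different from the mean of the weights of the female infants.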
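The infant-weight example can be reproduced with the following Python sketch. It assumes the unequal-variance form of the statistic, t = ((x̄1 − x̄2) − (μ1 − μ2)) / √(s1²/n1 + s2²/n2), with degrees of freedom taken as the smaller of n1 − 1 and n2 − 1, as described in this lesson. The function name `two_sample_t` is hypothetical, and the critical value is copied from the example rather than computed.

```python
from math import sqrt


def two_sample_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """t statistic for two independent samples, variances not assumed equal."""
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t_value = ((mean1 - mean2) - hypothesized_diff) / standard_error
    # Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1.
    degrees_of_freedom = min(n1 - 1, n2 - 1)
    return t_value, degrees_of_freedom


# Convert the sample means to ounces (1 lb = 16 oz).
male_mean_oz = 7 * 16 + 11    # 7 lb 11 oz
female_mean_oz = 7 * 16 + 4   # 7 lb 4 oz

t, df = two_sample_t(male_mean_oz, female_mean_oz, 8, 5, 10, 8)

# Critical value taken from the example (two-tailed, alpha = 0.05, d.f. = 7).
critical = 2.365
reject_null = abs(t) > critical

print(f"t = {t:.3f}, d.f. = {df}, reject H0: {reject_null}")
```

With these inputs the test value is about 2.268 with 7 degrees of freedom, which falls inside the noncritical region, so the null hypothesis is not rejected.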
Read:
T-test for Two Independent Means
Bluman, A. G. (2012). Descriptive and Inferential Statistics. In Bluman, A. G.,
ELEMENTARY STATISTICS: A STEP BY STEP APPROACH, EIGHTH EDITION. New
York: McGraw-Hill Education
UNIT 1: TESTING THE DIFFERENCE BETWEEN 2 MEANS
LESSON 3– DEPENDENT SAMPLES
Introduction:
The z test was used to compare two sample means when the samples were independent and
σ1 and σ2 were known. The t test was used to compare two sample means when the samples were
independent and σ1 and σ2 were unknown. In this section, a different version of the t test is
explained. This version is used when the samples are dependent.
Learning Objectives:
After successful completion of this lesson, you should be able to:
1. Test the difference between two means for dependent samples, using the t test.
Course Materials:
Samples are considered to be dependent samples when the subjects are paired or
matched in some way. Dependent samples are sometimes called matched-pair samples. For
example, suppose a medical researcher wants to see whether a drug will affect the reaction
time of its users. To test this hypothesis, the researcher must pretest the subjects in the sample.
That is, they are given a test to ascertain their normal reaction times. Then after taking the drug,
the subjects are tested again, using a posttest. Finally, the means of the two tests are compared
to see whether there is a difference. Since the same subjects are used in both cases, the
samples are related; subjects scoring high on the pretest will generally score high on the
posttest, even after consuming the drug. Likewise, those scoring lower on the pretest will tend to
score lower on the posttest. To take this effect into account, the researcher employs a t test,
using the differences between the pretest values and the posttest values. Thus, only the gain or
loss in values is compared.
When the samples are dependent, a special t test for dependent means is used. This
test employs the difference in values of the matched pairs. The hypotheses are as follows:
Before you can use the testing method presented in this section, the following
assumptions must be met.
Assumptions for the t Test for Two Means When the Samples Are Dependent
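1. The sample or samples are random.
2. The sample data are dependent.
3. When the sample size or sample sizes are less than 30, the population or populations
must be normally or approximately normally distributed.
$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$
$$\bar{D} = \frac{\sum D}{n} \quad \text{and} \quad s_D = \sqrt{\frac{n \sum D^2 - (\sum D)^2}{n(n-1)}}$$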
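Step 3 Compute the test value.
a. Make a table with the following columns.
X1    X2    A: D = X1 − X2    B: D² = (X1 − X2)²
b. Find the differences and place the results in column A.
D = X1 − X2
c. Find the mean of the differences.
$$\bar{D} = \frac{\sum D}{n}$$
d. Square the differences and place the results in column B. Complete the
table.
D² = (X1 − X2)²
e. Find the standard deviation of the differences.
$$s_D = \sqrt{\frac{n \sum D^2 - (\sum D)^2}{n(n-1)}}$$
f. Find the test value.
$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$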
Step 4 Make the decision.
EXAMPLE:
A dietitian wishes to see if a person’s cholesterol level will change if the diet is
supplemented by a certain mineral. Six randomly selected subjects were pretested, and then
they took the mineral supplement for a 6-week period. The results are shown in the table.
(Cholesterol level is measured in milligrams per deciliter.) Can it be concluded that the
cholesterol level has been changed at alpha = 0.10? Assume the variable is approximately
normally distributed.
Subject        1    2    3    4    5    6
Before (X1)  210  235  208  190  172  244
After (X2)   190  170  210  188  173  228
SOLUTION:
If the diet is effective, the before cholesterol levels should be different from the after
levels.
𝐻0 : 𝜇𝐷 = 0
𝐻1 : 𝜇𝐷 ≠ 0
The degrees of freedom are 6 – 1 = 5. At alpha = 0.10, the critical values are ±2.015.
Step 3 Compute the test value.
Before (X1)   After (X2)   A: D = X1 − X2   B: D² = (X1 − X2)²
210           190
235           170
208           210
190           188
172           173
244           228
D = X1- X2
210 - 190 = 20
235 – 170 = 65
208 – 210 = -2
190 – 188 = 2
172 – 173 = -1
244 – 228 = 16
∑D = 100
$$\bar{D} = \frac{\sum D}{n} = \frac{100}{6} = 16.7$$
d. Square the differences and place the results in column B. Complete the
table.
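D² = (X1 − X2)²
(20)² = 400
(65)² = 4225
(−2)² = 4
(2)² = 4
(−1)² = 1
(16)² = 256
∑D² = 4890
$$s_D = \sqrt{\frac{n \sum D^2 - (\sum D)^2}{n(n-1)}} = \sqrt{\frac{6(4890) - (100)^2}{6(6-1)}} = 25.4$$
$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}} = \frac{16.7 - 0}{25.4 / \sqrt{6}} = 1.610$$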
The decision is to not reject the null hypothesis, since the test value 1.610 is in the
noncritical region.
Step 5 Summarize the results.
There is not enough evidence to support the claim that the mineral changes a person’s
cholesterol level.
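The cholesterol example can be verified with a small Python sketch that follows the same steps: find the differences, their mean, their standard deviation, and then the test value. The function name `paired_t` is illustrative, and the critical value 2.015 is copied from the example rather than computed.

```python
from math import sqrt

before = [210, 235, 208, 190, 172, 244]   # X1, pretest cholesterol (mg/dL)
after = [190, 170, 210, 188, 173, 228]    # X2, after six weeks on the supplement


def paired_t(x1, x2, hypothesized_mean_diff=0.0):
    """t statistic for dependent (matched-pair) samples."""
    diffs = [a - b for a, b in zip(x1, x2)]        # column A: D = X1 - X2
    n = len(diffs)
    sum_d = sum(diffs)                              # sum of D
    sum_d_squared = sum(d * d for d in diffs)       # column B total: sum of D^2
    mean_d = sum_d / n
    # Shortcut formula for the standard deviation of the differences.
    s_d = sqrt((n * sum_d_squared - sum_d ** 2) / (n * (n - 1)))
    t_value = (mean_d - hypothesized_mean_diff) / (s_d / sqrt(n))
    return diffs, mean_d, s_d, t_value


diffs, mean_d, s_d, t = paired_t(before, after)

# Critical value taken from the example (two-tailed, alpha = 0.10, d.f. = 5).
critical = 2.015
reject_null = abs(t) > critical

print(f"D-bar = {mean_d:.1f}, s_D = {s_d:.1f}, t = {t:.3f}, reject H0: {reject_null}")
```

Without intermediate rounding the test value comes out near 1.608 rather than 1.610; the small gap is due to rounding the mean and standard deviation to one decimal place in the hand calculation. The decision is the same in both cases.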
Read:
Correlation Coefficient
Bluman, A. G. (2012). Descriptive and Inferential Statistics. In Bluman, A. G.,
ELEMENTARY STATISTICS: A STEP BY STEP APPROACH, EIGHTH EDITION. New
York: McGraw-Hill Education
UNIT 1: TESTING THE DIFFERENCE BETWEEN 2 MEANS
LESSON 4: TESTING THE DIFFERENCE BETWEEN PROPORTIONS
Introduction:
The z test with some modifications can be used to test the equality of two proportions. For
example, a researcher might ask, Is the proportion of men who exercise regularly less than the
proportion of women who exercise regularly? Is there a difference in the percentage of students
who own a personal computer and the percentage of nonstudents who own one? Is there a
difference in the proportion of college graduates who pay cash for purchases and the proportion
of non-college graduates who pay cash?
Learning Objectives:
After successful completion of this lesson, you should be able to:
1. Test the difference between two proportions.
Course Materials:
The symbol 𝑝̂ (“p hat”) is the sample proportion used to estimate the population
proportion, denoted by p. For example, if in a sample of 30 college students, 9 are on probation,
then the sample proportion is 𝑝̂ = 9/30, or 0.3. The population proportion p is the number of all
students who are on probation, divided by the number of students who attend the college. The
formula for the sample proportion is
$$\hat{p} = \frac{X}{n}$$
Where:
X = number of units that possess the characteristic of interest
n = sample size
When you are testing the difference between two population proportions p1 and p2, the
hypotheses can be stated as follows if no specific difference between the proportions is hypothesized.
𝐻0 : 𝑝1 = 𝑝2 and 𝐻1 : 𝑝1 ≠ 𝑝2
or, equivalently,
𝐻0 : 𝑝1 − 𝑝2 = 0 and 𝐻1 : 𝑝1 − 𝑝2 ≠ 0
Similar statements using < or > in the alternative hypothesis can be formed for one-tailed tests.
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} \qquad \hat{p}_1 = \frac{X_1}{n_1}$$
$$\bar{q} = 1 - \bar{p} \qquad \hat{p}_2 = \frac{X_2}{n_2}$$
Before you can test the difference between two sample proportions, the following
assumptions must be met.
ASSUMPTIONS FOR THE Z TEST FOR TWO PROPORTIONS
1. The samples must be random samples.
2. The sample data are independent of one another.
3. For both samples np ≥ 5 and nq ≥ 5.
EXAMPLE:
In a nursing home study, the researchers found that 12 out of 34 randomly selected
small nursing homes had a resident vaccination rate of less than 80%, while 17 out of 24
randomly selected large nursing homes had a vaccination rate of less than 80%. At
alpha = 0.05, test the claim that there is no difference in the proportions of the small and
large nursing homes with a resident vaccination rate of less than 80%.
Solution:
Step 1 State the hypotheses and identify the claim.
𝐻0 : 𝑝1 = 𝑝2
𝐻1 : 𝑝1 ≠ 𝑝2
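$$\hat{p}_1 = \frac{X_1}{n_1} = \frac{12}{34} = 0.35 \qquad \hat{p}_2 = \frac{X_2}{n_2} = \frac{17}{24} = 0.71$$
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{12 + 17}{34 + 24} = \frac{29}{58} = 0.5$$
$$\bar{q} = 1 - \bar{p} = 1 - 0.5 = 0.5$$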
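The remaining calculation for the nursing home example can be sketched in Python. The sketch assumes the usual pooled form of the statistic, z = ((p̂1 − p̂2) − (p1 − p2)) / √(p̄ q̄ (1/n1 + 1/n2)), with p1 − p2 = 0 under the null hypothesis. The z formula does not appear in this section, so treat it as an assumption; the function name `two_proportion_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two proportions (pooled estimate)."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)   # pooled proportion
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    # The hypothesized difference p1 - p2 is zero under H0.
    return (p_hat1 - p_hat2) / standard_error


alpha = 0.05
# Small homes: 12 of 34; large homes: 17 of 24.
z = two_proportion_z(12, 34, 17, 24)
critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z) > critical

print(f"z = {z:.2f}, critical values = +/-{critical:.2f}, reject H0: {reject_null}")
```

Using unrounded proportions, the test value is about −2.67; rounding the sample proportions to 0.35 and 0.71 first gives about −2.70. Either way the value falls outside ±1.96, so under these assumptions the null hypothesis of no difference would be rejected at alpha = 0.05.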
Read:
Correlation Coefficient
Bluman, A. G. (2012). Descriptive and Inferential Statistics. In Bluman, A. G.,
ELEMENTARY STATISTICS: A STEP BY STEP APPROACH, EIGHTH EDITION. New
York: McGraw-Hill Education