Unit 6 Inferential Statistics




Written By:
Salman Khalil Chaudhary

Reviewed By:
Dr. Rizwan Akram Rana
Inferential statistics is of vital importance in educational research. It is used to make
inferences about the population on the basis of data obtained from the sample. It is also
used to judge the probability that an observed difference among groups is a
dependable one rather than one that might have happened by chance in the study.

In this unit, you will study the introduction, areas, logic and importance of inferential
statistics. Hypothesis testing, the logic and process of hypothesis testing, and errors in
hypothesis testing are also discussed. The last part of the unit discusses the t-test, its
types and the general assumptions regarding its use.

After reading this unit, you will be able to:
1. explain the term “Inferential Statistics”.
2. explain the areas of inferential statistics.
3. explain the logic of inferential statistics.
4. explain the importance of inferential statistics in educational research.
5. state what hypothesis testing is.
6. explain the logic of hypothesis testing.
7. explain the uncertainty and errors in hypothesis testing.
8. explain the t-test and its types.

6.1 Introduction to Inferential Statistics

Many statistical techniques have been developed to help researchers make sense of the
data they have collected. These techniques are divided into two categories: descriptive
and inferential. Descriptive statistics are the techniques that allow a researcher to quickly
summarize the major characteristics of the data set. Inferential statistics, on the other
hand, is a set of techniques that allow a researcher to go a step further: to uncover
patterns or relationships in the data set, make judgments about data, or
apply information about a smaller data set to a larger group. These techniques are part of
the process of data analysis used by researchers to analyze, interpret and make
inferences about their results. In simple words, inferential statistics helps
researchers to make generalizations about a population based on the data obtained from
the sample. Since the sample is a small subset of the larger population, the inferences
made on the basis of the data obtained from the sample cannot be free from error. That is,
we cannot say with 100% confidence that the characteristics of the sample accurately
reflect the characteristics of the larger population. Hence only qualified inferences can be
made, with a degree of certainty, which is often expressed in terms of probability (90% or
95% probability that the sample reflects the population).

Descriptive statistics only gives us the central values and the dispersion or variability of the
data, but inferential statistics leads us to a decision about the whole population and, in
the end, to a conclusion. Inferential statistics allows us to use what we have learnt from
descriptive statistics. It enables us to infer, from the data obtained from the
sample, what the population might think.

6.1.1 Areas of Inferential Statistics

Inferential statistics has two broad areas:

i) Estimating Parameters
This means taking a statistic from the sample data (e.g. the sample mean) and
saying something about a population parameter (e.g. the population mean).
ii) Hypothesis Testing
This is where a researcher uses sample data to answer research questions.
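
As a minimal sketch of parameter estimation, the Python snippet below takes a small set of hypothetical attitude scores (invented for illustration, not data from this unit), uses the sample mean as a point estimate, and builds a 95% confidence interval for the population mean. The value 2.262 is the standard two-tailed table value of t for 9 degrees of freedom; all variable names are assumptions.

```python
import math
import statistics

# Hypothetical attitude scores from a sample of 10 students (illustrative only).
scores = [72, 65, 80, 77, 69, 74, 71, 83, 68, 76]

n = len(scores)
sample_mean = statistics.mean(scores)      # point estimate of the population mean
sample_sd = statistics.stdev(scores)       # sample SD (n - 1 in the denominator)
standard_error = sample_sd / math.sqrt(n)  # estimated SD of the sample mean

# Two-tailed 5% critical value of t for df = n - 1 = 9, taken from a t table.
t_critical = 2.262

lower = sample_mean - t_critical * standard_error
upper = sample_mean + t_critical * standard_error

print(f"Sample mean (point estimate): {sample_mean:.2f}")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

The interval, rather than the single sample mean, is what expresses the "qualified inference" about the population: it states a range together with a stated level of confidence.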

Inferential statistics deals with two or more variables. If an analysis involves
two variables it is called bivariate analysis, and if it involves more than two variables it is
called multivariate analysis. A number of different types of inferential statistics are in
use, all of which depend on the type of variable, i.e. nominal, ordinal, interval or ratio.
Although the type of statistical analysis is different for these variables, the main
theme is the same: we try to determine how one variable compares to another.

It should be noted that inferential statistics always speaks in terms of probability. The
inferences can be made highly reliable by designing the right experimental conditions. They are
always an estimate with a confidence interval. In some cases there is simply a rejection of
a hypothesis.

Several models are available in inferential statistics that help in the process of data
analysis. A researcher should be careful while choosing a model, because choosing the
wrong model may lead to wrong conclusions.

6.1.2 Logic of Inferential Statistics

Suppose a researcher wants to know the difference between male and female students
with respect to interest in learning English as a foreign language. He hypothesizes that
female students are more interested in learning English as a foreign language than
male students. To test the hypothesis he randomly selects 60 male students from a population
of 1000 male students of an English language course and 60 female students from a population
of 1000 female students of the same course. All the students are given an attitude scale to
complete. Now the researcher has two data sets: the attitude scores of the male group and the
attitude scores of the female group. The design of the study is as shown:

Population of male                  Population of female
English language students           English language students
N = 1000                            N = 1000

Sample 1                            Sample 2
n = 60                              n = 60

Fig: Selection of two samples from two different populations

The researcher wants to know whether the male population is different from the female
population, that is, whether the mean score of the male population on the attitude scale is
different from the mean score of the female population. The researcher does not know the means
of the two populations. He only has the mean scores of two samples, on which he has to rely to
provide information about the populations.

The question now arises: is it reasonable to assume that each sample will give a fairly
accurate picture of the whole population? It certainly is possible, because each sample
was selected randomly from its population. On the other hand, the students in each
sample are only a small portion of their respective population. It is rare that a sample
is absolutely identical to the population from which it is drawn on given characteristics.
The data the researcher obtains from the two samples depend on the individual students
selected to be in each sample. If another two samples were selected randomly, their makeup
would differ from the previously selected samples. Their means on the attitude scale would be
different, and the researcher would end up with a different data set. How can the
researcher be sure that any particular selected sample is a true representative of its
population? Indeed he cannot. He needs some help to judge whether the sample is
representative of the population and whether the results obtained from the sample data can be
generalized to the whole population. Inferential statistics helps the researcher here: it allows
him to make judgments about data and to make generalizations about a population based on
the data obtained from the sample.
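
The point that different random samples give different means can be illustrated with a short Python simulation. This is only a sketch: the population of 1000 attitude scores is generated artificially (a mean of 50 and a standard deviation of 10 are arbitrary assumptions), and the seed is fixed so the run is repeatable.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: attitude scores of N = 1000 students (made-up values).
population = [random.gauss(50, 10) for _ in range(1000)]
population_mean = statistics.mean(population)

# Draw five different random samples of n = 60 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 60))
    for _ in range(5)
]

print(f"Population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.2f} (differs from population by {m - population_mean:+.2f})")
```

Each sample mean lands near the population mean but none matches it exactly. That gap is the sampling error that inferential statistics is designed to take into account.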

6.2 Importance of Inferential Statistics in Research

Inferential statistics is of vital importance in research in general and in educational
research in particular. It allows us to use what we have learnt from descriptive statistics,
and allows us to go beyond the immediate data. Inferential statistics infers, on the basis of
sample data, what the population might think. It helps us to make judgments about the
probability that an observation is dependable rather than one that happened by chance in the
study. It enables researchers to infer properties of a population based on data
collected from a sample of individuals.

Inferential statistics has great value because its techniques offset problems
associated with data collection. For example, the time and cost of collecting
data on the entire population may be prohibitive, or the population may be large and
difficult to manage. In such cases inferential statistics can prove invaluable to the
educational or social scientist.

6.3 Hypothesis Testing

It is usually impossible for a researcher to observe each individual in a population.
Therefore, he selects some individuals from the population as a sample and collects data
from the sample. He then uses the sample data to answer questions about the population.
For this purpose, he uses statistical techniques.

Hypothesis testing is a statistical method that uses sample data to evaluate a hypothesis
about a population parameter (Gravetter & Wallnau, 2002). A hypothesis test is usually
used in the context of a research study. Depending on the type of research and the type of
data, the details of the hypothesis test will change from one situation to another.

Hypothesis testing is a formalized procedure that follows a standard series of operations.
In this way a researcher has a standardized method for evaluating the results of his
research study. Other researchers will recognize and understand exactly how the data
were evaluated and how conclusions were drawn.

6.3.1 Logic of Hypothesis Testing

According to Gravetter & Wallnau (2002) the logic underlying hypothesis testing is as
follows:
i) First, a researcher states a hypothesis about a population. Usually, the hypothesis
concerns the value of the population mean. For example, we might hypothesize that
the mean IQ for the registered voters in Pakistan is μ = 100.
ii) Before a researcher actually selects a sample, he uses the hypothesis to predict the
characteristics that the sample should have. For example, if he hypothesizes that the
population mean IQ = 100, then he would predict that the sample should have a
mean around 100. It should be kept in mind that the sample should be similar to the
population, but there is always a chance of a certain amount of error.
iii) Next, the researcher obtains a random sample from the population. For example, he
might select a random sample of n = 200 registered voters and compute the mean IQ
for the sample.
iv) Finally, he compares the obtained sample data with the prediction that was made
from the hypothesis. If the sample mean is consistent with the prediction, he will
conclude that the hypothesis is reasonable. But if there is a big difference between
the data and the prediction, he will decide that the hypothesis is wrong.
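
The IQ example can be sketched in Python to show what "a mean around 100" might mean in practice. The text does not give the spread of IQ scores, so a population standard deviation of 15 is assumed here purely for illustration; the helper name `consistent_with_hypothesis` is also hypothetical. The sketch computes the range in which about 95% of sample means would fall if the hypothesis were true, and then checks an obtained sample mean against that range.

```python
import math
from statistics import NormalDist

mu_hypothesized = 100  # population mean IQ stated in the hypothesis
sigma = 15             # ASSUMPTION: population SD of IQ scores (not given in the text)
n = 200                # sample size from the example

# Standard error: how much sample means vary from sample to sample.
standard_error = sigma / math.sqrt(n)

# z value that cuts off the middle 95% of a normal distribution (about 1.96).
z = NormalDist().inv_cdf(0.975)

low = mu_hypothesized - z * standard_error
high = mu_hypothesized + z * standard_error

def consistent_with_hypothesis(sample_mean):
    """True if the sample mean lies in the range predicted by the hypothesis."""
    return low <= sample_mean <= high

print(f"Predicted range for the sample mean: {low:.2f} to {high:.2f}")
for observed in (101, 104):  # two hypothetical sample means
    verdict = "consistent" if consistent_with_hypothesis(observed) else "not consistent"
    print(f"Sample mean {observed}: {verdict} with the hypothesis")
```

With a sample as large as 200, the predicted range is narrow, so even a difference of a few IQ points between the sample mean and 100 counts as a "big difference" in the sense of step iv.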

6.3.2 Four-Step Process for Hypothesis Testing
The process of hypothesis testing goes through the following four steps.
i) Stating the Hypothesis
The process of hypothesis testing begins by stating a hypothesis about the unknown
population. Usually, a researcher states two opposing hypotheses, and both
hypotheses are stated in terms of population parameters.

The first and most important of the two hypotheses is called the null hypothesis. A null
hypothesis states that the treatment has no effect. In general, the null hypothesis states
that there is no change, no effect, no difference; in short, nothing happened. The null
hypothesis is denoted by the symbol H0 (H stands for hypothesis and 0 denotes that
this is zero effect).

The null hypothesis (H0) states that in the general population there is no change, no
difference, or no relationship. In an experimental study, the null hypothesis (H0)
predicts that the independent variable (treatment) will have no effect on the
dependent variable for the population.

The second hypothesis is simply the opposite of the null hypothesis and is called the
scientific or alternative hypothesis. It is denoted by H1. This hypothesis states that
the treatment has an effect on the dependent variable.

The alternative hypothesis (H1) states that there is a change, a difference, or a
relationship for the general population. In an experiment, H1 predicts that the
independent variable (treatment) will have an effect on the dependent variable.

ii) Setting Criteria for the Decision

In common practice, a researcher uses the data from the sample to evaluate the
credibility of the null hypothesis. The data will either support or contradict the null
hypothesis. To formalize the decision process, a researcher uses the null hypothesis
to predict exactly what kind of sample should be obtained if the treatment has no
effect. In particular, a researcher examines all the possible sample means that
could be obtained if the null hypothesis is true.

iii) Collecting data and computing sample statistics

The next step in hypothesis testing is to obtain the sample data. The raw data are then
summarized with appropriate statistics such as the mean and standard deviation. It
is then possible for the researcher to compare the sample mean with the null hypothesis.

iv) Make a Decision

In the final step the researcher decides, in the light of the analysis of data, whether to
reject or retain the null hypothesis. If the data are consistent with the null hypothesis,
he retains it (fails to reject it); if they are not, he rejects it.
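
The four steps above can be sketched as a small Python function. This is a simplified illustration, not the only way to run a test: it uses a z-test, which assumes the population standard deviation is known (15 is assumed here), a 5% significance level, and a hypothetical sample of 16 scores. The function name `z_test` and all data values are assumptions.

```python
import math
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-tailed z-test of H0: population mean = mu0, with sigma assumed known."""
    # Step 1: state the hypotheses. H0: mu = mu0, H1: mu != mu0.
    # Step 2: set the decision criterion (critical z for the chosen alpha).
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: collect data and compute the sample statistic.
    sample_mean = mean(sample)
    standard_error = sigma / math.sqrt(len(sample))
    z = (sample_mean - mu0) / standard_error
    # Step 4: make a decision.
    decision = "reject H0" if abs(z) > z_critical else "fail to reject H0"
    return z, decision

# Hypothetical scores from a treated sample (illustrative values only).
sample = [108, 112, 97, 105, 118, 101, 110, 99, 115, 107,
          104, 113, 109, 96, 111, 106]

z, decision = z_test(sample, mu0=100, sigma=15)
print(f"z = {z:.2f}, decision: {decision}")
```

Note that the function never returns "accept H0": a result inside the critical boundaries only means that the sample does not provide enough evidence against the null hypothesis.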

6.3.3 Uncertainty and Error in Hypothesis Testing
Hypothesis testing is an inferential process. It means that it uses limited information
obtained from the sample to reach general conclusions about the population. As a sample
is a small subset of the population, it provides only limited or incomplete information
about the whole population. Yet a hypothesis test relies on the information obtained from the
sample. In this situation, there is always a probability of reaching an incorrect conclusion.
Generally two kinds of errors can be made.

i) Type I Errors
A Type I error occurs when a researcher rejects a null hypothesis that is actually
true. It means that the researcher concludes that the treatment does have an effect
when in fact the treatment has no effect.

A Type I error is not a stupid mistake in the sense that the researcher is overlooking
something that should be perfectly obvious. He is looking at data obtained from
the sample that appear to show a clear treatment effect. The researcher then makes
a careful decision based on the available information. He never knows for certain whether a
hypothesis is true or false.

The consequences of a Type I error can be very serious because the researcher has
rejected the null hypothesis and believed that the treatment had a real effect. It is
likely that the researcher will report or publish the research results. Other researchers
may then try to build theories or develop other experiments based on false results.

ii) Type II Errors

A Type II error occurs when a researcher fails to reject a null hypothesis that is
really false. It means that a treatment effect really exists, but the hypothesis test has
failed to detect it. This type of error often occurs when the effect of the treatment is
relatively small. That is, the treatment does influence the sample but the magnitude
of the effect is very small.

The consequences of a Type II error are usually less serious than those of a Type I error.
In the case of a Type II error the research data do not show the results that the researcher
had hoped to obtain. The researcher can accept this outcome and conclude that the treatment
either has no effect or has a small effect that is not worth pursuing. Or the researcher can
repeat the experiment with some improvement and try to demonstrate that the treatment does
work. It is impossible to determine a single, exact probability value for a Type II error,
because that probability depends on the unknown size of the true treatment effect.

In summary, a hypothesis test always leads to one of two decisions.
i) The sample data provide sufficient evidence to reject the null hypothesis and the
researcher concludes that the treatment has an effect.
ii) The sample data do not provide enough evidence to reject the null hypothesis. The
researcher fails to reject the null hypothesis and concludes that the treatment does
not appear to have an effect.

In either case, there is a chance that the data are misleading and the decision is wrong.
The complete set of decisions and outcomes is shown in the following table.

Table 6.1
Possible outcomes of a statistical decision

                          Actual Situation
Decision        No effect, H0 true      Effect exists, H0 false
Reject H0       Type I error            Correct decision
Retain H0       Correct decision        Type II error

Source: Gravetter & Wallnau (2002)
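
The two kinds of error can be made concrete with a small Python simulation. This is a sketch under stated assumptions, not a result from the text: scores are drawn from a normal distribution with an assumed standard deviation of 15, each simulated study uses n = 25 and a two-tailed z-test at the 5% level, and the "small effect" is an arbitrary shift of the true mean from 100 to 102.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the sketch is repeatable

SIGMA, N, ALPHA, TRIALS = 15, 25, 0.05, 2000   # all assumed values
Z_CRITICAL = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_h0(true_mean, mu0=100):
    """Run one simulated study and report whether H0: mu = mu0 is rejected."""
    sample = [random.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (mean(sample) - mu0) / (SIGMA / math.sqrt(N))
    return abs(z) > Z_CRITICAL

# H0 is true (true mean = 100): every rejection is a Type I error.
type1_rate = sum(rejects_h0(100) for _ in range(TRIALS)) / TRIALS

# H0 is false but the effect is small (true mean = 102):
# every failure to reject is a Type II error.
type2_rate = sum(not rejects_h0(102) for _ in range(TRIALS)) / TRIALS

print(f"Proportion of Type I errors when H0 is true:   {type1_rate:.3f}")
print(f"Proportion of Type II errors with small effect: {type2_rate:.3f}")
```

The Type I error proportion stays close to the chosen significance level, while the Type II error proportion depends on how large the true effect is, which is why no single probability can be attached to it in advance.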

6.4 T-Test
A t-test is a useful statistical technique for comparing the mean values of two data sets
obtained from two groups. The comparison tells us whether these data sets are different
from each other. It further tells us how significant the differences are and whether these
differences could have happened by chance. The statistical significance of a t-test indicates
whether or not the difference between the means of two groups most likely reflects a real
difference in the populations from which the groups are selected.

t-tests are used when there are two groups (male and female) or two sets of data (before and
after), and the researcher wishes to compare the mean score on some continuous variable.

6.4.1 Types of T-Test

There are a number of t-tests available, but two main types, the independent sample t-test
and the paired sample t-test, are most commonly used. Let us deal with these types in some detail.
i) Independent sample t-test
The independent sample t-test is used when there are two different independent groups
of people and the researcher is interested in comparing their scores. In this case the
researcher collects information from two different groups of people on only one
occasion.

ii) Paired sample t-test

The paired sample t-test is also called the repeated measures t-test. It is used when the
researcher is interested in comparing changes in the scores of the same group tested on two
different occasions.
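
The two types of t-test differ in how the t statistic is computed. The Python sketch below shows both calculations using the standard textbook formulas: a pooled-variance t for independent groups (which assumes equal variances) and a t based on difference scores for paired data. The function names and all scores are hypothetical, and only the t statistic and degrees of freedom are computed; the result would still have to be compared with a critical value from a t table.

```python
import math
import statistics

def independent_t(group1, group2):
    """t statistic and df for two independent groups (pooled variance)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t, n1 + n2 - 2

def paired_t(first, second):
    """t statistic and df for the same group measured on two occasions."""
    diffs = [b - a for a, b in zip(first, second)]  # second minus first
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Hypothetical attitude scores of two different groups (one occasion).
male_scores = [61, 58, 70, 65, 59, 63, 67, 60]
female_scores = [68, 72, 66, 75, 70, 64, 73, 69]
t_ind, df_ind = independent_t(male_scores, female_scores)
print(f"Independent samples: t = {t_ind:.2f}, df = {df_ind}")

# Hypothetical scores of the same six students before and after a course.
before = [55, 60, 52, 64, 58, 61]
after = [59, 63, 57, 66, 62, 60]
t_pair, df_pair = paired_t(before, after)
print(f"Paired samples: t = {t_pair:.2f}, df = {df_pair}")
```

The paired version works on one column of difference scores, which is why its degrees of freedom are n - 1 rather than n1 + n2 - 2.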

At this level it is necessary to know some general assumptions regarding the use of the
t-test. The first assumption concerns the scale of measurement: the dependent variable is
assumed to be measured on an interval or ratio scale. The
second assumption is that of a simple random sample, that is, the data are collected
from a representative, randomly selected portion of the total population. The third
assumption is that the data, when plotted, follow a normal distribution, i.e. a
bell-shaped distribution curve. The fourth assumption is that the observations that make up
the data must be independent of one another. That is, each observation or measurement must not
be influenced by any other observation or measurement. The fifth assumption is that a
reasonably large sample size is used. A large sample size means that the distribution of
results should approach a normal bell-shaped curve. The final assumption is homogeneity
of variance. Variances are homogeneous, or equal, when the standard deviations of the
samples are approximately equal.

6.5 Self-Assessment Questions

Q. 1 What do you mean by inferential statistics?
Q. 2 Write down the areas of inferential statistics.
Q. 3 What is the importance of inferential statistics in educational research?
Q. 4 What do you mean by hypothesis testing?
Q. 5 Briefly state the logic behind hypothesis testing.
Q. 6 What are type I and type II errors?
Q. 7 In what situation will you use independent sample t-test for your data?
Q. 8 In what situation will you use paired sample t-test for your data?
Q. 9 What do you know about:
a) An independent sample t-test.
b) A paired sample t-test.

6.6 Activities
1. Suppose we exclude inferential statistics from our research. What will happen?
Write down a few lines.
2. You have scores of two different groups of students and you have to compare the
scores. Discuss with your colleague and select an appropriate statistical test.

6.7 Bibliography
Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to Design and Evaluate Research in
Education (8th Ed.). New York: McGraw-Hill.

Frey, L. R., Botan, C. H., & Kreps, G. L. (2000). Investigating Communication: An
Introduction to Research Methods (2nd Ed.). Boston: Allyn and Bacon.

Gravetter, F. J., & Wallnau, L. B. (2002). Essentials of Statistics for the Behavioral
Sciences (4th Ed.). California, USA: Wadsworth.

Lohr, S. L. (1999). Sampling: Design and Analysis. Albany: Duxbury Press.

Pallant, J. (2005). SPSS Survival Manual – A step by step guide to data analysis using
SPSS for Windows (Version 12). Australia: Allen & Unwin.

