Unit 6 Inferential Statistics
INFERENTIAL STATISTICS
Written By:
Salman Khalil Chaudhary
Reviewed By:
Dr. Rizwan Akram Rana
Introduction
Inferential statistics is of vital importance in educational research. It is used to make
inferences about a population on the basis of data obtained from a sample. It is also
used to judge the probability that an observed difference among groups is a
dependable one rather than one that might have happened by chance in the study.
In this unit, you will study the introduction, area, logic and importance of inferential
statistics. Hypothesis testing, the logic and process of hypothesis testing, and errors in
hypothesis testing are also discussed. At the end of the unit, the t-test, its types and the
general assumptions regarding its use are discussed.
Objectives
After reading this unit, you will be able to:
1. explain the term “inferential statistics”.
2. explain the area of inferential statistics.
3. explain the logic of inferential statistics.
4. explain the importance of inferential statistics in educational research.
5. describe what hypothesis testing is.
6. explain the logic of hypothesis testing.
7. explain uncertainty and errors in hypothesis testing.
8. explain the t-test and its types.
Descriptive statistics gives us only the central values and the dispersion or variability of
the data, whereas inferential statistics leads us to a decision about the whole population
and, in the end, to a conclusion. Inferential statistics allows us to use what we have learnt
from descriptive statistics: it enables us to infer, from the data obtained from the sample,
what the population might think.
i) Estimating Parameter
This means taking a statistic from the sample data (e.g. the sample mean) and
saying something about a population parameter (e.g. the population mean).
ii) Hypothesis testing
This is where a researcher can use sample data to answer research questions.
Inferential statistics deals with two or more variables. An analysis of two variables is
called bivariate analysis, and an analysis of more than two variables is called
multivariate analysis. A number of different types of inferential statistics are in use, all
of which depend on the type of variable, i.e. nominal, ordinal, interval or ratio.
Although the type of statistical analysis differs for these variables, the main theme is
the same: we try to determine how one variable compares with another.
It should be noted that inferential statistics always speaks in terms of probability. Its
conclusions can be made highly reliable by designing the right experimental conditions.
The inferences are always estimates with a confidence interval. In some cases there is
simply a rejection of a hypothesis.
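As a rough illustration of an estimate with a confidence interval, the following sketch computes a sample mean and a 95% interval around it. The scores are simulated, the function name is hypothetical, and the critical value 2.001 (two-tailed, 5%, 59 degrees of freedom) is read from a t table; all of these are assumptions made for the example, not values from this unit.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, t_critical):
    """Return (mean, lower, upper) for a t-based confidence interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = t_critical * standard_error
    return mean, mean - margin, mean + margin


# Simulated attitude scores for a random sample of 60 students (illustrative only).
rng = random.Random(1)
sample = [rng.gauss(50, 10) for _ in range(60)]

# 2.001 is the two-tailed 5% critical value of t for 59 degrees of freedom,
# taken from a t table (assumption: a 95% confidence level is wanted).
mean, lower, upper = mean_confidence_interval(sample, t_critical=2.001)
print(f"sample mean = {mean:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval is a statement of probability rather than certainty: a wider interval reflects more uncertainty about the population mean.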
Several models are available in inferential statistics that help in the process of data
analysis. A researcher should be careful when choosing a model, because choosing the
wrong model may lead to wrong conclusions.
[Figure: two populations, male English language students (N = 1000) and female English
language students (N = 1000), with one random sample drawn from each: Sample 1 (n = 60)
and Sample 2 (n = 60).]
The researcher wants to know whether the male population is different from the female
population, that is, whether the mean score of the male group on the attitude scale differs
from the mean score of the female group. The researcher does not know the means of the
two populations. He has only the mean scores of the two samples, on which he has to rely
to provide information about the populations.
The question then arises: is it reasonable to assume that each sample will give a fairly
accurate picture of its whole population? It certainly is possible, because each sample
was selected randomly from its population. On the other hand, the students in each
sample are only a small portion of their respective population. A sample is only rarely
identical to the population from which it is drawn on a given characteristic.
The data the researcher obtains from the two samples depend on the individual students
selected to be in each sample. If another two samples were selected randomly, their
makeup would differ from that of the previously selected samples. Their means on the
attitude scale would be different, and the researcher would end up with a different data
set. How can the researcher be sure that any particular sample is truly representative of
its population? Indeed he cannot. He needs some help to judge whether the sample is
representative of the population and whether the results obtained from the sample data
can be generalized to the whole population. Inferential statistics helps the researcher
here: it allows him to make judgments about the data and to generalize about a
population based on the data obtained from the sample.
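The point that different random samples give different means can be sketched with a small simulation. The population below is invented (1000 simulated attitude scores), and the number of samples drawn is an arbitrary choice for illustration.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: attitude scores of 1000 students.
population = [rng.gauss(50, 10) for _ in range(1000)]
population_mean = statistics.mean(population)

# Draw five independent random samples of 60 students each.
sample_means = [
    statistics.mean(rng.sample(population, 60)) for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f} (difference {m - population_mean:+.2f})")
```

Each sample mean lands near the population mean but not exactly on it, which is the sampling variability that inferential statistics has to take into account.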
study. It enables researchers to infer properties of a population based on data
collected from a sample of individuals.
Inferential statistics has great value because its techniques offset problems
associated with data collection. For example, the time and cost of collecting data on the
entire population may be prohibitive, or the population may be large and difficult to
manage. In such cases inferential statistics can prove invaluable to educational and
social scientists.
Hypothesis testing is a statistical method that uses sample data to evaluate a hypothesis
about a population parameter (Gravetter & Wallnau, 2002). A hypothesis test is usually
used in the context of a research study. Depending on the type of research and the type
of data, the details of the hypothesis test will change from one situation to another.
6.3.2 Four-Step Process for Hypothesis Testing
The process of hypothesis testing goes through the following four steps.
i) Stating the Hypothesis
The process of hypothesis testing begins by stating a hypothesis about the unknown
population. Usually, a researcher states two opposing hypotheses, both expressed in
terms of population parameters.
The first and most important of the two hypotheses is called the null hypothesis. A null
hypothesis states that the treatment has no effect. In general, the null hypothesis states
that there is no change, no effect, no difference: nothing happened. The null
hypothesis is denoted by the symbol H0 (H stands for hypothesis and 0 denotes
zero effect).
The null hypothesis (H0) states that in the general population there is no change, no
difference, or no relationship. In an experimental study, the null hypothesis (H0)
predicts that the independent variable (treatment) will have no effect on the
dependent variable for the population.
The second hypothesis is simply the opposite of the null hypothesis and is called the
scientific or alternative hypothesis. It is denoted by H1. This hypothesis states that
the treatment has an effect on the dependent variable.
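For the example of male and female English language students, the two hypotheses might be written in terms of population means as follows. The symbols for the two population means are notation introduced here for illustration, and a nondirectional (two-tailed) alternative is assumed.

```latex
H_0 : \mu_{\text{male}} = \mu_{\text{female}}
\qquad
H_1 : \mu_{\text{male}} \neq \mu_{\text{female}}
```

Both statements refer to the populations, not to the samples; the sample means are only the evidence used to choose between them.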
6.3.3 Uncertainty and Error in Hypothesis Testing
Hypothesis testing is an inferential process. It uses limited information obtained from
the sample to reach general conclusions about the population. As a sample is a small
subset of the population, it provides only limited or incomplete information about the
whole population. Yet a hypothesis test relies on the information obtained from the
sample, so there is always a possibility of reaching an incorrect conclusion.
Generally, two kinds of errors can be made.
i) Type I Errors
A Type I error occurs when a researcher rejects a null hypothesis that is actually
true. It means that the researcher concludes that the treatment does have an effect
when in fact the treatment has no effect.
A Type I error is not a stupid mistake in the sense that the researcher is overlooking
something that should be perfectly obvious. He is looking at data obtained from
the sample that appear to show a clear treatment effect. The researcher then makes
a careful decision based on the available information. He can never know for
certain whether a hypothesis is true or false.
The consequences of a Type I error can be very serious because the researcher has
rejected the null hypothesis and believes that the treatment had a real effect. It is
likely that the researcher will report or publish the research results. Other researchers
may then try to build theories or develop other experiments based on false results.
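A Type I error can be made visible with a simulation in which the null hypothesis is true by construction: both samples are drawn from the same simulated population, so every rejection is a Type I error. The significance level of 0.05, the critical value 1.98 (approximate two-tailed value of t for 118 degrees of freedom, from a t table), the function names and the simulated scores are all assumptions of this sketch.

```python
import math
import random
import statistics


def pooled_t(sample_a, sample_b):
    """Independent-samples t statistic with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(sample_a)
        + (n_b - 1) * statistics.variance(sample_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error


def type_i_error_rate(trials, n, t_critical, seed=0):
    """Share of trials in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both samples come from the SAME population, so H0 is true.
        a = [rng.gauss(50, 10) for _ in range(n)]
        b = [rng.gauss(50, 10) for _ in range(n)]
        if abs(pooled_t(a, b)) > t_critical:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials


# 1.98 approximates the two-tailed 5% critical t for 118 degrees of freedom.
rate = type_i_error_rate(trials=2000, n=60, t_critical=1.98)
print(f"proportion of Type I errors: {rate:.3f}")
```

Under these assumptions the proportion of rejections should settle close to the chosen significance level, which shows that a careful researcher will still commit a Type I error in a small share of studies.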
The consequences of a Type II error are usually not as serious. In the case of a Type II
error, the research data do not show the results that the researcher had hoped to obtain.
The researcher can accept this outcome and conclude that the treatment either has no
effect or has a small effect that is not worth pursuing. Or the researcher can repeat the
experiment with some improvements and try to demonstrate that the treatment does
work. It is impossible to determine a single, exact probability value for a Type II error.
To summarize, a hypothesis test always leads to one of two decisions.
i) The sample data provide sufficient evidence to reject the null hypothesis and the
researcher concludes that the treatment has an effect.
ii) The sample data do not provide enough evidence to reject the null hypothesis. The
researcher fails to reject the null hypothesis and concludes that the treatment does
not appear to have an effect.
In either case, there is a chance that the data are misleading and the decision is wrong.
The complete set of decisions and outcomes is shown in the following table.
Table 6.1: Possible outcomes of a statistical decision
                    Actual Situation
                    No effect, H0 true      Effect exists, H0 false
6.4 T-Test
A t-test is a statistical technique for comparing the mean values of two data sets
obtained from two groups. The comparison tells us whether these data sets differ
from each other, how significant the differences are, and whether the differences
could have happened by chance. The statistical significance of a t-test indicates
whether or not the difference between the means of two groups most likely reflects a
real difference in the populations from which the groups were selected.
The t-test is used when there are two groups (male and female) or two sets of data (before
and after), and the researcher wishes to compare the mean scores on some continuous
variable.
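The two situations just described (two separate groups, and two sets of scores from the same group) can be sketched as two small functions. The names independent_t and paired_t are hypothetical, the scores are invented for illustration, and the independent version uses a pooled variance, which assumes roughly equal variances in the two groups.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """t statistic and degrees of freedom for two independent groups
    (pooled variance, which assumes roughly equal variances)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


def paired_t(before, after):
    """t statistic and degrees of freedom for two related sets of scores."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


# Hypothetical scores, invented for illustration only.
male = [52, 48, 55, 60, 47, 51]
female = [58, 62, 57, 65, 59, 61]
before = [40, 45, 38, 50, 42]
after = [44, 49, 41, 52, 47]

t_ind, df_ind = independent_t(male, female)
t_pair, df_pair = paired_t(before, after)
print(f"independent samples: t({df_ind}) = {t_ind:.2f}")
print(f"paired samples: t({df_pair}) = {t_pair:.2f}")
```

The computed t value is then compared with the critical value from a t table for the given degrees of freedom, or converted to a probability by statistical software such as SPSS.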
At this level it is necessary to know some general assumptions regarding the use of the
t-test. The first assumption concerns the scale of measurement: the dependent variable is
assumed to be measured on an interval or ratio scale. The second assumption is that of a
simple random sample, that is, the data are collected from a representative, randomly
selected portion of the total population. The third assumption is that the data, when
plotted, result in a normal distribution, i.e. a bell-shaped distribution curve. The fourth
assumption is that the observations that make up the data must be independent of one
another. That is, each observation or measurement must not be influenced by any other
observation or measurement. The fifth assumption is that a reasonably large sample size
is used. A large sample size means that the distribution of results should approach a
normal bell-shaped curve. The final assumption is homogeneity of variance. Variance is
homogeneous, or equal, when the standard deviations of the samples are approximately
equal.
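The last assumption can be checked informally by comparing the standard deviations of the two samples. The sketch below uses invented scores, and the cut-off of 2.0 is only an illustrative rule of thumb, not a value given in this unit.

```python
import statistics


def sd_ratio(group_a, group_b):
    """Ratio of the larger sample standard deviation to the smaller one."""
    sd_a = statistics.stdev(group_a)
    sd_b = statistics.stdev(group_b)
    return max(sd_a, sd_b) / min(sd_a, sd_b)


# Hypothetical scores, invented for illustration only.
group_a = [52, 48, 55, 60, 47, 51]
group_b = [58, 62, 57, 65, 59, 61]

ratio = sd_ratio(group_a, group_b)
# max_ratio is an illustrative cut-off, not a value given in this unit.
max_ratio = 2.0
print(f"SD ratio = {ratio:.2f}; roughly equal: {ratio <= max_ratio}")
```

A ratio close to 1 suggests that the two samples have similar spread; statistical packages offer formal tests of the same assumption.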
6.6 Activities
1. Suppose we exclude inferential statistics from our research. What will happen?
Write down a few lines.
2. You have the scores of two different groups of students and you have to compare the
scores. Discuss with your colleague and select an appropriate statistical test.
6.7 Bibliography
Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to Design and Evaluate Research
in Education (8th Ed.). New York: McGraw-Hill.
Gravetter, F. J., & Wallnau, L. B. (2002). Essentials of Statistics for the Behavioral
Sciences (4th Ed.). California, USA: Wadsworth.
Pallant, J. (2005). SPSS Survival Manual: A step by step guide to data analysis using
SPSS for Windows (Version 12). Australia: Allen & Unwin.