
NOTES IN STATISTICS

Basic Concepts:

Statistics is a collection of methods for planning experiments, obtaining data, and then
organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions
based on the data.

A population is the set of all the individuals of interest in a particular study.

A sample is a set of individuals selected from a population, usually intended to represent the population in a research study.

A parameter is a value, usually a numerical value, that describes a population. A parameter may be obtained from a single measurement, or it may be derived from a set of measurements from a population.

A statistic is a value, usually a numerical value, that describes a sample. A statistic may
be obtained from a single measurement, or it may be derived from a set of
measurements from the sample.

Descriptive statistics are statistical procedures used to summarize, organize, and simplify data.

Inferential statistics consist of techniques that allow us to study samples and then make
generalizations about the populations from which they were selected.

Sampling error is the discrepancy, or amount of error, that exists between a sample
statistic, and the corresponding population parameter.

Margin of error is an estimate of the extent to which sample results are likely to deviate
from the population value.

Data (plural) are measurements or observations. A data set is a collection of measurements or observations. A datum (singular) is a single measurement or observation and is commonly called a score or raw score.

Quantitative data consist of numbers representing counts or measurements. When data represent counts, they are discrete; when they represent measurements, they are continuous.

Qualitative (or categorical or attribute) data can be separated into different categories
that are distinguished by some nonnumeric characteristics.

A variable is a characteristic or condition that changes or has different values for different individuals.

A constant is a characteristic or condition that does not vary, but is the same for every
individual.

The independent variable is the variable that is manipulated by the researcher. In behavioral research, the independent variable usually consists of the two (or more) treatment conditions to which subjects are exposed. The independent variable consists of the antecedent conditions that were manipulated prior to observing the dependent variable.

The dependent variable is the one that is observed for changes in order to assess the
effect of the treatment.

A confounding variable is an uncontrolled extraneous variable whose effects on the dependent variable may incorrectly be attributed to the independent variable.

An extraneous variable is an uncontrolled variable that may affect the dependent variable of a study; its effect may be mistakenly attributed to the independent variable of the study.

A discrete variable consists of separate, indivisible categories. No values can exist between two neighboring categories.

For a continuous variable, there are an infinite number of possible values that fall
between any two observed values. A continuous variable is divisible into an infinite
number of fractional parts.
Scales of Measurement

S.S. Stevens devised a system that provides a logical approach to measurement. He
defined four types of scales: nominal, ordinal, interval, and ratio. These scales are
distinguished on the basis of the relationships assumed to exist between objects having
different scale values.

Nominal scale: Mutually exclusive categories differing in some qualitative aspect.
Ordinal scale: Scale has the property of a nominal scale (mutually exclusive
categories) and in addition has observations ranked in order of
magnitude. Ranks, which may be numerical, express a “greater
than” relationship, but with no implication about how much
greater.
Interval scale: Scale has all the properties of an ordinal scale and in addition
numerical values indicate order of merit and meaningfully reflect
relative distances between points along the scale. A given
interval between measures has the same meaning at any point
in the scale.
Ratio scale: Scale has all the properties of an interval scale and in addition
has an absolute zero point. Ratio between measures becomes
meaningful.

Frequency Distributions

A frequency distribution is an organized tabulation of the number of individuals located in each category on the scale of measurement. A frequency distribution can be structured either as a table or as a graph, but in either case the distribution presents the same two elements:

1. The set of categories that make up the original measurement scale.
2. A record of the frequency, or number of individuals, in each category.

Thus, a frequency distribution presents a picture of how the individual scores are
distributed on the measurement scale – hence the name frequency distribution.

Proportion measures the fraction of the total group that is associated with each score.
In general, the proportion associated with each score is

"
proportion = #=
!
Because proportions describe the frequency (f) in relation to the total number (N), they
often are called relative frequencies.

In addition to using frequencies (f) and proportions (p), researchers often describe a distribution of scores with percentages.

"
percentage = #$"##! = $"##!
!
An ungrouped or simple frequency distribution is used when there is a small number of observations. It is merely an arrangement of the data, usually from the highest to the lowest, that shows the frequency of occurrence of the different values of the variable.
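A minimal sketch of these ideas using only the Python standard library; the raw scores are invented for illustration, and the table lists each score with its frequency, proportion, and percentage from the highest value down:

```python
from collections import Counter

scores = [8, 9, 8, 7, 10, 9, 8, 6, 9, 8]   # hypothetical raw scores
N = len(scores)
freq = Counter(scores)                      # frequency f of each score X

# Simple (ungrouped) frequency distribution, highest score first,
# with the proportion p = f/N and the percentage p(100%) for each score
for x in sorted(freq, reverse=True):
    f = freq[x]
    p = f / N
    print(f"X = {x:2d}  f = {f}  p = {p:.2f}  % = {p * 100:.0f}")
```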

A grouped frequency distribution is an arrangement of data that shows the frequency of occurrence of values falling within arbitrarily defined ranges of the variable known as class intervals.

The steps in grouping a large mass of data into a frequency distribution are as follows:

1. Find the range between the highest and the lowest scores.
2. Determine the interval size by dividing the range by the desired number of classes
which is normally not less than 10 and not more than 20.
3. Determine the class limits of the class intervals. Tabulation is facilitated if the lower
class limits of the class intervals are multiples of the class size. The bottom interval
must include the lowest score.
4. Tally the frequencies for each class interval and get the sum of the frequency
column.
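A rough sketch of these four steps in Python; the scores, the choice of 10 classes, and rounding the interval size up to a whole number are illustrative assumptions:

```python
import math

scores = [23, 45, 12, 67, 34, 56, 78, 41, 29, 88, 15, 62, 37, 54, 70, 48]  # hypothetical data

# Step 1: range between the highest and lowest scores
rng = max(scores) - min(scores)

# Step 2: interval size from the desired number of classes (10 assumed here)
k = 10
i = math.ceil(rng / k)

# Step 3: lower class limits as multiples of the class size,
# starting low enough that the bottom interval includes the lowest score
start = (min(scores) // i) * i

# Step 4: tally the frequencies for each class interval
for lower in range(start, max(scores) + 1, i):
    upper = lower + i - 1
    f = sum(lower <= x <= upper for x in scores)
    print(f"{lower}-{upper}: {f}")
```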
Frequency Distribution Graphs

A frequency distribution graph is basically a picture of the information available in a frequency distribution table. All types of graphs start with two perpendicular lines called axes. The horizontal line is called the X-axis, or the abscissa. The vertical line is called the Y-axis, or the ordinate. The measurement scale (set of X values) is listed along the X-axis in increasing value from left to right. The frequencies are listed on the Y-axis in increasing value from bottom to top. As a general rule, the point where the two axes intersect should have a value of zero for both the scores and the frequencies. A final general rule is that the graph should be constructed so that its height (Y-axis) is approximately two-thirds to three-quarters of its length (X-axis). Violating these guidelines can result in graphs that give a misleading picture of the data.

Histogram. When a frequency distribution graph shows data from an interval or a ratio scale, the bars are drawn so that adjacent bars touch each other. The touching bars produce a continuous figure, which emphasizes the continuity of the variable. This type of frequency distribution graph is called a histogram.

For a histogram, vertical bars are drawn above each score so that

1. The height of the bar corresponds to the frequency.
2. The width of the bar extends to the real limits of the score.

A histogram is used when the data are measured on an interval or a ratio scale.

When data have been grouped into class intervals, you can construct a frequency
distribution histogram by drawing a bar above each interval so that the width of the bar
extends to the real limits of the interval.
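A sketch of a grouped-data histogram using matplotlib (the plotting library is an assumption; the scores and class limits are invented). The bin edges are placed at the real limits of the intervals so that adjacent bars touch:

```python
import matplotlib.pyplot as plt

scores = [12, 14, 15, 17, 18, 18, 21, 22, 23, 25, 27, 28, 31, 33]  # hypothetical data
real_limits = [9.5, 14.5, 19.5, 24.5, 29.5, 34.5]                  # class interval boundaries

plt.hist(scores, bins=real_limits, edgecolor="black")  # touching bars: a histogram
plt.xlabel("Score (X)")        # measurement scale on the abscissa
plt.ylabel("Frequency (f)")    # frequencies on the ordinate
plt.show()
```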

Bar graphs. When you are presenting the frequency distribution for data from nominal
or ordinal scale, the graph is constructed so that there is some space between the bars.
In the case of a nominal scale, the separate bars emphasize that the scale consists of
separate, distinct categories. For ordinal scales, the bar graph is used because
differences between ranks do not provide information about interval size on the X-axis.

For a bar graph, a vertical bar is drawn above each score (or category) so that

1. The height of the bar corresponds to the frequency.
2. There is a space separating each bar from the next.

A bar graph is used when the data are measured on a nominal or an ordinal scale.

Frequency Distribution Polygons

Instead of a histogram, many researchers prefer to display a frequency distribution using a polygon.

In a frequency distribution polygon, a single dot is drawn above each score so that

1. The dot is centered above the score.
2. The height (vertical position) of the dot corresponds to the frequency.

A continuous line is then drawn connecting these dots. The graph is completed by
drawing a line down to the X-axis (zero frequency) at each end of the range of scores.

As with a histogram, the frequency distribution polygon is intended for use with interval or
ratio scales. A polygon also can be used with data that have been grouped into class
intervals. In this case, you position each dot directly above the midpoint of a class
interval. The midpoint can be found by averaging the apparent limits of the interval or by
averaging the real limits of the interval.

Cumulative Frequency Polygon

A cumulative frequency distribution is represented graphically by a cumulative frequency polygon. In this graph, the real limits of the intervals are laid on the X-axis, and the
cumulative frequencies are laid on the Y-axis. To construct the graph, a point is plotted
on the exact real limit of each interval at a height corresponding to the cumulative
frequency.

The raw frequencies may be converted to percentages such that the cumulative
frequencies will total 100 percent instead of the total number of frequencies. If these
cumulative frequencies are graphed, a cumulative percentage polygon or ogive is
obtained.

The Shape of a Frequency Distribution

Nearly all distributions can be classified as being symmetrical or skewed.

In a symmetrical distribution, it is possible to draw a vertical line through the middle so that one side of the distribution is a mirror image of the other.

In a skewed distribution, the scores tend to pile up toward one end of the scale and taper
off gradually at the other end.

A skewed distribution with the tail on the right-hand side is said to be positively skewed
because the tail points toward the positive (above-zero) end of the X-axis. If the tail
points to the left, the distribution is said to be negatively skewed.

Percentiles and Percentile Ranks

The percentile system is widely used in educational measurement to report the standing
of an individual relative to the performance of a known group. It is based on the
cumulative percentage distribution. A percentile point is a point on the measurement scale below which a specified percentage of the cases in the distribution falls. It is often called a percentile. A percentile rank is the percentage of cases falling below a given point on the measurement scale.

Do not confuse percentiles and percentile ranks: percentile ranks may take values only
between zero and 100, whereas a percentile (point) may have any value that scores may
have.

To determine a particular percentile when the distribution is ungrouped, the scores must
first be arranged according to size. Then, the position p of the score or point which
defines the pth percentile in a distribution consisting of n observations is:

"(! + !)
!""
If the data are given in a grouped distribution, the pth percentile is obtained by a procedure similar to that used in finding the median.

pth percentile = L + ((pN/100 − F) / f) · i

where L is the lower real limit of the class containing the percentile, F is the cumulative frequency below that class, f is the frequency of that class, and i is the interval width.

Measures of Central Tendency

Mean. The arithmetic mean or simple mean (popularly called the average) is the sum
of the separate scores or measures divided by the number of the scores.

The mean has a number of interesting properties. Some of these are:

1. The sum of deviations of all the measurements in a distribution from the mean is 0.
2. In many statistical cases, the squares of the deviations from the mean are used in
statistical computations. A second useful property of the mean involves the sum of
squares of deviations from the mean. This second property of the mean states that
the sum of the squared deviations of scores from their arithmetic mean is less than
the sum of the squared deviations around any point other than the mean.
3. If a constant c is added to each score in a distribution, the mean of the distribution
will be increased by the same constant.
4. If each score in a distribution is multiplied by a constant, the mean of the original
scores is also multiplied by the same constant.
5. The mean may not be a value that exists in the distribution.
6. All the values of the variable under investigation are incorporated in the computation
of the mean.

The mean is appropriate to use in the following situations:

1. When the distribution consists of interval or ratio data which have no extreme values
(too high or too low in comparison with the other scores in the set).
2. When other statistics (like standard deviation, coefficient of correlation, etc.) are
subsequently to be computed.
3. When the distribution is normal or is not greatly skewed, the mean is usually
preferred to either the median or the mode. In such cases, it provides a better
estimate of the corresponding population parameter than either the median or the
mode.

The formula for the mean X̄ of a series of ungrouped measures is:

X̄ = ΣX / N
When some scores occur several times, the mean is computed with the formula:

X̄ = ΣfX / N
When only a grouped frequency distribution is available, the mean is approximated by the
formula:

X̄ = AM + (Σfd / N) · i

where AM is the assumed mean (the midpoint of a chosen class interval), d is the deviation of each class from that interval in interval units, and i is the interval width.
Median. The median is that point on the scale of measurement that divides a series of
ranked observations into halves, such that half of the observations fall above it and the
other half fall below it.

The median has the following properties:

1. The median is the point below which half of the scores in a distribution lie and above
which the other half of the scores lie.
2. The median is an ordinal statistic because its calculation is based on the ordinal
properties of the data being analyzed.
3. When the distribution is grossly asymmetrical or skewed or when a series contains
either a few extremely high or a few extremely low scores compared with the rest of
the scores, the median is the most representative average. This is because the
values of the different scores have nothing to do with the computation of the median.
4. In an open-ended distribution, the median is the most reliable measure of central
tendency that can be computed.
5. Unlike the mean, the medians of separate or different distributions cannot be
combined to give the median of the resulting combined distribution.
6. The median is less reliable or less dependable than the mean. If different samples
are randomly selected from a given population, the medians of these samples are
likely to vary or fluctuate more from each other and from the median of the given
population than the means of the same samples.

For ungrouped data, the calculation of the median is based on the following formula:

# +"
$%& = !"
!
The median is calculated from grouped data using the formula below:

Mdn = L + ((N/2 − F) / f) · i

where L is the lower real limit of the class containing the median, F is the cumulative frequency below that class, f is the frequency of the median class, and i is the interval width.
Mode. The mode is the point on the measurement scale with the maximum frequency in
the given distribution. In an ungrouped distribution, it is the measurement which occurs
most frequently.

The mode does not always exist. In a rectangular distribution where all the frequencies
are equal, there is no mode. On the other hand, for some sets of data there may be two
or more scores with the same highest frequency.
The mode has the following properties:

1. The mode is a nominal statistic, which means that it is used for nominal data. Its computation does not depend on the values of the variable or on their order, but merely on their frequency of occurrence. It is rarely used with interval, ratio, and ordinal variables, where means and medians can be calculated.
2. It is usually employed as a simple, inspectional measure which indicates roughly the
center of concentration of distribution. As such, there is no need to calculate it as
exactly as the median or the mean.
3. The mode is a very unstable value. It can change radically if the method of rounding
the data is changed.
4. The mode is the appropriate measure of central tendency if the distribution is bimodal
with the modes at the extreme ends of the distribution.

VARIABILITY

Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together.

The Range. The range is the distance between the largest score (Xmax) and the smallest score (Xmin) in the distribution. The problem with using the range as a measure of variability is that it is completely determined by the two extreme values and ignores the other scores in the distribution.

Because the range does not consider all the scores in the distribution, it often does not
give an accurate description of the variability for the entire distribution. For this reason,
the range is considered to be a crude and unreliable measure of variability.

The Interquartile Range and Semi-Interquartile Range

A distribution can be divided into four equal parts using quartiles. By definition, the first
quartile (Q1) is the score that separates the lowest 25% of the distribution from the rest.
The second quartile (Q2) is the score that has exactly two quarters, or 50%, of the
distribution below. Notice that the second quartile and the median are the same. Finally,
the third quartile (Q3) is the score that divides the bottom three-fourths of the distribution
from the top quarter.

The interquartile range is the distance between the first quartile and the third quartile:

interquartile range = Q3 – Q1

When the interquartile range is used to describe variability, it commonly is transformed into the semi-interquartile range. As the name implies, the semi-interquartile range is simply one-half of the interquartile range. Conceptually, the semi-interquartile range measures the distance from the middle of the distribution to the boundaries that define the middle 50%.
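A quick sketch of both measures with NumPy on invented scores; note that np.percentile uses linear interpolation by default, which may differ slightly from textbook quartile conventions:

```python
import numpy as np

scores = np.array([3, 5, 6, 7, 8, 9, 10, 12, 14, 15, 18, 20])  # hypothetical data

q1, q3 = np.percentile(scores, [25, 75])   # first and third quartiles
iqr = q3 - q1                              # interquartile range
siqr = iqr / 2                             # semi-interquartile range

print(iqr, siqr)
```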

Standard Deviation and Variance

The standard deviation is the most commonly used and the most important measure of
variability. Standard deviation uses the mean of the distribution as a reference point and
measures variability by considering the distance between each score and the mean. It
determines whether the scores are generally near or far from the mean. That is, are the scores clustered together, or are they scattered? In simple terms, the standard deviation approximates the average distance from the mean.

Variance is the mean of the squared deviation scores. Standard deviation is the
square root of the variance.

Formula of the population standard deviation:

#( " " µ )
!
!=
!
Formula of the sample standard deviation:

"( " ! " )


"

#=
! !!
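A sketch of both formulas with NumPy on invented scores; the ddof argument controls whether the sum of squared deviations is divided by N (population) or by n − 1 (sample):

```python
import numpy as np

scores = np.array([4, 6, 7, 9, 10, 12])     # hypothetical data

# Population standard deviation: divide the sum of squared deviations by N
sigma = np.sqrt(np.sum((scores - scores.mean()) ** 2) / scores.size)

# Sample standard deviation: divide by n - 1 instead
s = np.sqrt(np.sum((scores - scores.mean()) ** 2) / (scores.size - 1))

# numpy's ddof argument gives the same results
assert np.isclose(sigma, scores.std(ddof=0)) and np.isclose(s, scores.std(ddof=1))
print(sigma, s)
```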

THE NORMAL DISTRIBUTION

One theoretical distribution that has proved to be extremely valuable is the normal
distribution (or normal curve), a distribution that, among other things, describes how
chance operates. It is a bell-shaped, theoretical distribution that predicts the frequency of
occurrence of chance events.

The mathematical representation of this curve was first studied by the famous German
mathematician Carl Gauss. That is why the curve is often referred to as the Gaussian
distribution.

Standard Normal Distribution. The simplest of the family of normal distributions is the standard normal distribution, also called the z distribution. It is a distribution of a normal random variable with a mean equal to zero (µ = 0) and a standard deviation equal to one (σ = 1).
The standard normal distribution has the following characteristics:

1. It is symmetrical about the vertical line drawn through z = 0. This means that the
shape of the distribution at the right is a mirror image of the left.
2. The highest point on the curve, at z = 0, is y = 1/√(2π) ≈ 0.3989.
3. The curve is asymptotic to the x-axis. This means that both positive and negative
ends approach the horizontal axis but do not touch it.
4. For all practical purposes, the area under the curve from z = -3 to z = +3 equals 1,
hence the term unit normal curve.
5. The three measures of central tendency (mean, median, and mode) coincide with
each other.

Skewness

When a distribution has many more observations on the right side of the curve, we say
that the curve is negatively skewed.

When a distribution has more observations on the left side of the curve, we say that the
curve is positively skewed.

Area Under the Unit Normal Curve

The area under the unit normal curve may represent several things like the probability of
an event, the percentile rank of a score, or the percentage distribution of the whole
population.

• About 68% of all scores fall within 1 standard deviation of the mean
• About 95% of all scores fall within 2 standard deviations of the mean
• About 99.7% of all scores fall within 3 standard deviations of the mean
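These three areas can be checked with SciPy's standard normal distribution, for example:

```python
from scipy.stats import norm

# Area under the standard normal curve within 1, 2, and 3 standard deviations of the mean
for k in (1, 2, 3):
    area = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {area:.4f}")   # ~0.6827, ~0.9545, ~0.9973
```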

HYPOTHESIS TESTING

Components of a Formal Hypothesis Test

1. The null hypothesis (denoted by H0) is a statement about the value of a
population parameter (such as the mean µ), and it must contain the condition of
equality (that is, it must be written with the symbol =, ≤, or ≥). For the mean, the
null hypothesis will be stated in only one of three possible forms:
a. H0: µ = some value
b. H0: µ ≤ some value
c. H0: µ ≥ some value

We test the null hypothesis directly in the sense that the conclusion will be either
a rejection of H0 or a failure to reject H0.
2. The alternative hypothesis (denoted by H1) is the statement that must be true if
the null hypothesis is false. For the mean, the alternative hypothesis will be
stated in only one of the three possible forms:
a. H1: µ ≠ some value
b. H1: µ > some value
c. H1: µ < some value

Note that H1 is the opposite of H0.

Alternative hypotheses are classified as either nondirectional or directional hypotheses.

A nondirectional hypothesis is one which asserts that one value is different from another (or others). More specifically, it is an assertion that there is a significant difference between two statistical measures (or that there are significant differences among three or more summary measures).

A directional hypothesis, on the other hand, is an assertion that one measure is less than (or greater than) another measure of similar nature.

Mathematically, a nondirectional hypothesis makes use of the “not equal to” (≠) sign, while a directional hypothesis involves one of the order relations, “less than” (<) or “greater than” (>).

Nondirectional hypotheses are also called two-sided hypotheses, and directional hypotheses are also known as one-sided hypotheses.

Very Important Note 1: Depending on the original wording of the problem, the
original claim will sometimes be the null hypothesis H0, and at other times it will
be the alternative hypothesis H1. Regardless of whether the original claim
corresponds to H0 or H1, the null hypothesis H0 must always contain equality.

Very Important Note 2: Even though we sometimes express H0 with the symbol
≤ or ≥ as in H0: µ ≤ some value or H0: µ ≥ some value, we conduct the test by
assuming that H0: µ = some value is true. We must have a fixed and specific
value for µ so that we can work with a single distribution having a specific mean.

Very Important Note 3: If we are making our own claims, we should arrange the
null and alternative hypotheses so that the most serious error would be the
rejection of a true null hypothesis. Ideally, all claims would be made so that they
would all be null hypotheses. Unfortunately, our real world is not ideal. There is
poverty, war, crime, and people who make claims that are actually alternative
hypotheses.

3. Type I error: The mistake of rejecting the null hypothesis when it is true. The
probability of rejecting the null hypothesis when it is true is called the
significance level; that is, the significance level is the probability of a type I
error. The symbol α (alpha) is used to represent the significance level. The
values of α = 0.05 and α = 0.01 are commonly used.

Type I Error

The rejection of a true null hypothesis is labeled a Type I error. A Type I error,
symbolized with a Greek alpha (α), is a “false alarm” – the investigator thinks he
or she has something when there is nothing there.

Significance Level

The actual probability figure that you obtain from the data is referred to as the
significance level. Thus, p ≤ .001 is an expression of the level of significance of
the difference. In some statistical reports, an α level is not specified; only
significance levels are given. Thus, in the same report, some differences may be
reported as significant at the .001 level, some at the .01, and some at the .05
level. Regardless of how the results are reported, however, researchers view .05
as an important cutoff (Nelson, Rosenthal, and Rosnow, 1986). When .10 or .20
is used as an α level, a justification should be given.

The level of significance is the probability of a Type I error that an investigator is willing to risk in rejecting a null hypothesis. If an investigator sets the level of
significance at .01, it means that the null hypothesis will be rejected if the
estimated probability of the observed relationship’s being a chance occurrence is
one in a hundred. The most commonly used levels of significance in the field of
education are the .05 and the .01 levels.

Traditionally, investigators determine the level of significance after weighing the relative seriousness of Type I and Type II errors, but before running the experiment. If the data derived from the completed experiment indicate that the
experiment. If the data derived from the completed experiment indicate that the
probability of the null hypothesis being true is equal to or less than the
predetermined acceptable probability, the null hypothesis is rejected and the
results are declared to be statistically significant. If the probability is greater than
the predetermined acceptable probability, the results are described as
nonsignificant – that is, the null hypothesis is retained.

The familiar meaning of the word significant is “important” or “meaningful.” In statistics this word means “less likely to be a function of chance than some
predetermined probability.” Results of investigations can be statistically
significant without being inherently meaningful or important.

The Meaning of p in p<.05

The p in p<.05 is the probability of getting the sample statistic if H0 is true. This is
a simple definition that is easy to memorize. Nevertheless, the meaning of p is
commonly misinterpreted. Everitt and Hay (1992) report that among 70
academic psychologists, only 3 scored 100 percent in a six-item test on the
meaning of p. Here is what p is not:

p is not the probability that H0 is true
p is not the probability of a Type I error
p is not the probability of making a wrong decision
p is not the probability that the sample statistic is due to chance

Sampling distributions that are used to determine probabilities are always ones
that assume the null hypothesis is true. Thus, the probabilities obtained from these
sampling distributions for a particular statistic are accurate when H0 is true. Thus, p is the
probability of getting the sample statistic if H0 is true.

4. Type II error: The mistake of failing to reject the null hypothesis when it is false.
The symbol β (beta) is used to represent the probability of a type II error.

Type II Error

The retention of a false null hypothesis is labeled a Type II error. A Type II error,
symbolized with a Greek beta (β), is a “miss” – the investigator concludes that
there is nothing when there really is something.

Comparison of Type I and Type II Errors

Type I errors typically lead to changes that are unwarranted. Type II errors
typically lead to a maintenance of the status quo when a change is warranted.
The consequences of a Type I error are generally considered more serious than
the consequences of a Type II error, although there are certainly exceptions.

5. Controlling Type I and Type II Errors. The following practical considerations may
be relevant in controlling type I and type II errors:

a. For any fixed α, an increase in the sample size n will cause a decrease in β.
That is, a larger sample will lessen the chance that you will fail to reject a
false null hypothesis.
b. For any fixed sample size n, a decrease in α will cause an increase in β.
Conversely, an increase in α will cause a decrease in β.
c. To decrease both α and β, increase the sample size.

6. The following terms are associated with key components in the hypothesis-
testing procedure.

a. Test statistic: A sample statistic or a value based on the sample data. A test statistic is used in making the decision about the rejection of the null hypothesis.
b. Critical region: The set of all values of the test statistic that would cause us
to reject the null hypothesis.
c. Critical value: The value or values that separate the critical region from the
values of the test statistic that would not lead to the rejection of the null
hypothesis. The critical value depends on the nature of the null hypothesis,
the relevant sampling distribution, and the level of significance α.

7. Degrees of Freedom

a. The number of degrees of freedom for a data set corresponds to the number of scores that can vary after certain restrictions have been imposed on all scores.
b. Degrees of freedom are a function of such factors as the number of subjects
and the number of groups.
c. Each test of significance has its own formula for determining degrees of
freedom.

The “freedom” in degrees of freedom refers to the freedom of a number to have
any possible value. If you were asked to pick two numbers and there were no
restrictions, both numbers would be free to vary (take any value) and you would
have two degrees of freedom. If a restriction is imposed – say, that ΣX = 0 –
then one degree of freedom is lost because of that restriction; that is, when you
now pick the two numbers, only one of them is free to vary. As an example, if
you choose 3 for the first number, the second must be –3. Because of the
restriction that ΣX = 0, the second number is not free to vary. In a similar way, if
you were to pick five numbers with a restriction that ΣX = 0, you would have
four degrees of freedom. Once four numbers are chosen (say, -5, 3, 16, and 8),
the last number (-22) is determined.

Walker (1940) summarizes this reasoning by stating: “A universal rule holds: The
number of degrees of freedom is always equal to the number of observations
minus the number of necessary relations obtaining among these observations.”

Another approach to explaining degrees of freedom is to emphasize the parameters that are being estimated by statistics. The rule with this approach is that df is equal to the number of observations minus the number of parameters that are estimated with a sample statistic.

8. Conclusions in Hypothesis Testing

The initial conclusion will always be one of the following:

a. Fail to reject the null hypothesis H0.
b. Reject the null hypothesis H0.

The conclusion of failing to reject the null hypothesis or rejecting it is fine for
those of us with the wisdom to take a statistics course, but then it’s usually
necessary to use simple, nontechnical terms in stating what the conclusion
suggests. Students often have difficulty formulating this final nontechnical
statement, which describes the practical consequence of the data and
computations. It’s important to be precise in the language used; the implications
of words such as “support” and “fail to reject” are very different. If you want to
justify some claim, state it in such a way that it becomes the alternative
hypothesis and then hope that the null hypothesis gets rejected. This claim
(alternative hypothesis) will be supported if you reject the null hypothesis. If, on
the other hand, your claim is stated in the null form, you will either reject or fail to
reject the claim; in either case you will not support the original claim.

Some texts say “accept the null hypothesis,” instead of “fail to reject the null
hypothesis.” Whether we use the term accept or fail to reject, we should
recognize that we are not proving the null hypothesis; we are merely saying that
the sample evidence is not strong enough to warrant rejection of the null
hypothesis. The term accept is somewhat misleading because it seems to
incorrectly imply that the null hypothesis has been proved. The phrase fail to
reject says more correctly that the available evidence isn’t strong enough to
warrant rejection of the null hypothesis. So, we will use the conclusion fail to
reject the null hypothesis, instead of accept the null hypothesis.

ASSUMPTIONS FOR THE USE OF PARAMETRIC TESTS

Level of measurement

Each of these approaches assumes that the dependent variable is measured at the
interval or ratio level, that is, using a continuous scale rather than discrete
categories. Whenever possible when designing your study, try to make use of
continuous, rather than categorical, measures of your dependent variable. This
gives you a wider range of possible techniques to use when analyzing your data.

Random sampling

The techniques assume that the scores are obtained using a random sample from
the population.

Independence of observations

The observations that make up your data must be independent of one another. That
is, each observation or measurement must not be influenced by any other
observation or measurement. Violation of this assumption, according to Stevens
(1996), is very serious.

Any situation where the observations or measurements are collected in a group setting, or subjects are involved in some form of interaction with one another, should be considered suspect. In designing your study you should try to ensure that all observations are independent. If you suspect some violation of this assumption, Stevens (1996) recommends that you set a more stringent alpha value (e.g., p<.01).

Normal distribution

It is assumed that the populations from which the samples are taken are normally
distributed. In a lot of research (particularly in the social sciences), scores on the
dependent variable are not nicely normally distributed. Fortunately, most of the
techniques are reasonably ‘robust’ or tolerant of violations of this assumption.

Homogeneity of variance

Techniques in this section make the assumption that the samples are obtained from populations of equal variances. This means that the variability of scores for each of the groups is similar.

ASSUMPTIONS FOR THE USE OF PEARSON r

Level of measurement

The scale of measurement for the variables should be interval or ratio (continuous).
The exception to this is if you have one dichotomous independent variable (with
only two values: e.g., gender) and one continuous variable. You should, however,
have roughly the same number of people or cases in each category of the
dichotomous variable.

Related pairs

Each subject must provide a score on both variables X and Y (related pairs). Both
pieces of information must be from the same subject.

Independence of observations

The observations that make up your data must be independent of one another.
That is, each observation or measurement must not be influenced by any other
observation or measurement.

Normality

Scores on each variable should be normally distributed. This can be checked by inspecting the histograms of scores on each variable.

Linearity

The relationship between the two variables should be linear. This means that when
you look at a scatterplot of scores you should see a straight line (roughly).

Homoscedasticity

The variability in scores for variable X should be similar at all values of variable Y.
Check the scatterplot. It should show a fairly even cigar shape along its length.
Formulas of the different statistical tests:

1. Independent Samples t-test

#" ! # !
$=
"! "!
+
!" ! !

! (" " " )! ! (" " ! )!


" "" ! + " "! !
!" !!
# =
!

!" + ! ! ! !

2. Paired Samples t-test

#" ! # ! #" ! # !
$= =
"% "%
!

"" "
!
(" ")
"

#" = !
! !!
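A corresponding sketch for the paired case with SciPy; ttest_rel works from the pairwise differences D, as in the formula above. The before/after scores are hypothetical:

```python
from scipy.stats import ttest_rel

before = [12, 15, 11, 10, 14, 13, 16]   # hypothetical pre-test scores
after = [14, 18, 12, 13, 15, 15, 19]    # hypothetical post-test scores, same subjects

t, p = ttest_rel(after, before)          # based on the pairwise differences D
print(t, p)
```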
3. One-Way Analysis of Variance

'' !"!#$ = ! & !


!
"
(! &! )
!

%!

( (! & # )! (! & ! )! (! & " )! % (! & ! )!


'' "#!$##% =& + + #"
&' %# %! %" #$ %!

'' #$"%$& = '' " ! '' !

"# $ = ! ! !

#$ & = "% ! !

$$ !
%$ ! =
"# !

$$ !
%$ ! =
"# !

#$ "
%=
#$ !
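A sketch with SciPy's one-way ANOVA; f_oneway returns the F ratio and its p value for the hypothetical groups below:

```python
from scipy.stats import f_oneway

group1 = [4, 5, 6, 5, 7]     # hypothetical scores, treatment A
group2 = [8, 9, 7, 8, 10]    # hypothetical scores, treatment B
group3 = [6, 6, 7, 5, 6]     # hypothetical scores, control

F, p = f_oneway(group1, group2, group3)
print(F, p)
```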

4. Pearson Product Moment Coefficient of Correlation

" ! #! " ! # ! !
$=
[" ! # !
" (! # )
!
][" ! ! !
" (! ! )
!
]
5. Spearman rho

$" " %
# ="!
! #! % ! "!
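A sketch computing both correlation coefficients with SciPy on invented paired scores:

```python
from scipy.stats import pearsonr, spearmanr

x = [2, 4, 5, 7, 8, 10, 11]      # hypothetical scores on variable X
y = [1, 3, 6, 6, 9, 10, 13]      # hypothetical scores on variable Y, same subjects

r, p_r = pearsonr(x, y)          # Pearson product-moment correlation
rho, p_rho = spearmanr(x, y)     # Spearman rank-order correlation
print(r, rho)
```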

6. Chi – Square Test of Goodness of Fit

χ² = Σ[ (O − E)² / E ]
7. Chi – Square Test of Association

χ² = Σ[ (O − E)² / E ]
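A sketch of both chi-square tests with SciPy; the observed frequencies and the contingency table are hypothetical:

```python
from scipy.stats import chisquare, chi2_contingency

# Goodness of fit: observed vs expected frequencies (both sum to 100 here)
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]
chi2, p = chisquare(observed, f_exp=expected)
print(chi2, p)

# Test of association: a 2 x 3 contingency table of observed frequencies
table = [[10, 20, 30],
         [20, 25, 15]]
chi2, p, df, exp = chi2_contingency(table)
print(chi2, p, df)
```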
8. Mann – Whitney U test

U1 = n1n2 + n1(n1 + 1)/2 − R1

U2 = n1n2 + n2(n2 + 1)/2 − R2

where R1 and R2 are the sums of the ranks assigned to the first and second samples.
When both samples are relatively large, around 20, the following formula is used.
#"#!
A!
B !µ "
&= =
" #"#! $#" + #! + !#
!"
9. Wilcoxon – Signed Ranks test

For samples larger than 50, the formula below is used.

z = (T − µT) / σT = ( T − n(n + 1)/4 ) / √( n(n + 1)(2n + 1) / 24 )
10. Kruskal – Wallis H test

"% %
#&%
$= " ! $# ! + "!
! # ! + "! & =" "&
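A sketch of the three rank-based tests with SciPy, which computes the U, T, and H statistics (and their large-sample approximations) internally; all scores are invented:

```python
from scipy.stats import mannwhitneyu, wilcoxon, kruskal

a = [12, 15, 11, 18, 14, 13]          # hypothetical scores, group 1
b = [22, 19, 17, 25, 21, 20]          # hypothetical scores, group 2
c = [16, 14, 19, 15, 17, 18]          # hypothetical scores, group 3

U, p_u = mannwhitneyu(a, b)           # Mann-Whitney U test (two independent groups)
T, p_t = wilcoxon(a, b)               # Wilcoxon signed-ranks test (paired scores)
H, p_h = kruskal(a, b, c)             # Kruskal-Wallis H test (three or more groups)
print(U, T, H)
```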
