Statistics How To: The ANOVA Test
ANOVA Test: Definition, Types, Examples
Contents:
1. The ANOVA Test
2. One Way ANOVA
3. Two Way ANOVA
4. What is MANOVA?
5. What is Factorial ANOVA?
6. How to run an ANOVA
7. ANOVA vs. T Test
8. Repeated Measures ANOVA
9. Sphericity
10. Related Articles
For example, if your independent variable is the type of alcoholism treatment, you might have
three levels:
Medication only,
Medication and counseling,
Counseling only.
Your dependent variable would be the number of alcoholic beverages consumed per day.
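The F statistic behind this kind of comparison can be computed by hand. Below is a minimal Python sketch; the drinks-per-day counts are invented purely for illustration:

```python
# One-way ANOVA F statistic, computed from scratch (illustrative data).
def one_way_anova_f(*groups):
    k = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group means vs. the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: observations vs. their own group mean
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, group_means))
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical drinks-per-day counts for the three treatment groups:
medication = [5, 4, 6, 3, 5]
med_plus_counseling = [2, 3, 1, 2, 2]
counseling_only = [4, 3, 4, 5, 4]
f_stat = one_way_anova_f(medication, med_plus_counseling, counseling_only)
```

A large F means the variation between the group means is big relative to the variation within the groups.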
If your groups or levels have a hierarchical structure (each level has unique subgroups), then use
a nested ANOVA for the analysis.
Types of Tests.
There are two main types: one-way and two-way. Two-way tests can be with or without replication.
One-way ANOVA between groups: used when you want to test two or more groups to see if
there’s a difference between them.
Two way ANOVA without replication: used when you have one group and you’re double-
testing that same group. For example, you’re testing one set of individuals before and after they
take a medication to see if it works or not.
Two way ANOVA with replication: Two groups, and the members of those groups are doing
more than one thing. For example, two groups of patients from different hospitals trying two
different therapies.
What is MANOVA?
MANOVA is just an ANOVA with several dependent variables. It’s similar to many other tests
and experiments in that its purpose is to find out whether the response variable (i.e., your dependent
variable) is changed by manipulating the independent variable. The test helps to answer many
research questions, including:
Do changes to the independent variables have statistically significant effects on dependent
variables?
What are the interactions among dependent variables?
What are the interactions among independent variables?
MANOVA Example
Suppose you wanted to find out if a difference in textbooks affected students’ scores in
math and science. Improvements in math and science mean that there are two dependent variables,
so a MANOVA is appropriate.
An ANOVA will give you a single (univariate) F value, while a MANOVA will give you a
multivariate F value. MANOVA tests the multiple dependent variables by creating new, artificial,
dependent variables that maximize group differences. These new dependent variables are linear
combinations of the measured dependent variables.
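As a sketch, the textbook example could be run with statsmodels’ MANOVA (my choice of tool, not the article’s; the scores below are invented):

```python
# MANOVA: two dependent variables (math, science), one factor (textbook).
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.DataFrame({
    "textbook": ["A"] * 5 + ["B"] * 5,
    "math":     [70, 72, 68, 75, 71, 80, 83, 79, 85, 82],
    "science":  [65, 66, 64, 70, 67, 74, 77, 72, 78, 75],
})
fit = MANOVA.from_formula("math + science ~ textbook", data=df)
print(fit.mv_test())  # Wilks' lambda, Pillai's trace, etc. for "textbook"
```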
Disadvantages
1. MANOVA is many times more complicated than ANOVA, making it a challenge to see
which independent variables are affecting dependent variables.
2. One degree of freedom is lost with the addition of each new variable.
3. The dependent variables should be uncorrelated as much as possible. If they are correlated,
the loss in degrees of freedom means that there isn’t much advantage in including more than
one dependent variable in the test.
Reference: SFSU.
Variability
In a one-way ANOVA, variability is due to the differences between groups and the differences within
groups. In factorial ANOVA, each level and factor are paired up with each other (“crossed”). This
helps you to see what interactions are going on between the levels and factors. If there is an
interaction then the differences in one factor depend on the differences in another.
Let’s say you were running a two-way ANOVA to test male/female performance on a final exam. The
subjects had either had 4, 6, or 8 hours of sleep.
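In Python, this two-way design could be analyzed with statsmodels (a stand-in for the SPSS workflow the article assumes; all scores below are invented):

```python
# Two-way ANOVA: gender crossed with hours of sleep (4, 6, or 8).
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

data = pd.DataFrame({
    "score":  [78, 81, 85, 88, 90, 93, 75, 80, 83, 86, 89, 94],
    "gender": ["M"] * 6 + ["F"] * 6,
    "sleep":  [4, 4, 6, 6, 8, 8] * 2,
})
model = smf.ols("score ~ C(gender) * C(sleep)", data=data).fit()
table = anova_lm(model, typ=2)   # main effects plus the interaction term
print(table)
```

The `C(gender):C(sleep)` row of the table is the interaction: whether the effect of sleep on scores differs between males and females.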
ANOVA tests in statistics packages are run on parametric data. If you have rank or ordered data,
you’ll want to run a non-parametric ANOVA (usually found under a different heading in the software,
like “nonparametric tests“).
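For example, the Kruskal-Wallis H test is the usual nonparametric counterpart of the one-way ANOVA; in scipy it is a single call (the rank-type data below are invented):

```python
# Kruskal-Wallis H test: a rank-based alternative to one-way ANOVA.
from scipy import stats

group_a = [1, 2, 2, 3, 4]
group_b = [3, 4, 4, 5, 5]
group_c = [2, 3, 3, 4, 4]
h_stat, p_value = stats.kruskal(group_a, group_b, group_c)
```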
Steps
It is unlikely you’ll want to do this test by hand, but if you’re running it in SPSS, these are the
steps you’ll want to take:
Step 1: Click “Analyze,” then mouse over “General Linear Model” and click “Repeated
Measures.”
Step 2: Replace the “factor1” name with something that represents your independent variable. For
example, you could put “age” or “time.”
Step 3: Enter the “Number of Levels.” This is how many times the dependent variable has been
measured. For example, if you took measurements every week for a total of 4 weeks, this number
would be 4.
Step 4: Click the “Add” button and then give your dependent variable a name.
Step 5: Click the “Add” button. A Repeated Measures Define box will pop up. Click the “Define”
button.
Step 6: Use the arrow keys to move your variables from the left to the right so that your screen looks
similar to the image below:
Step 7: Click “Plots” and use the arrow keys to transfer the factor from the left box onto the
Horizontal Axis box.
Step 8: Click “Add” and then click “Continue” at the bottom of the window.
Step 9: Click “Options”, then transfer your factors from the left box to the Display Means for box on
the right.
Step 10: Click the following check boxes:
Compare main effects.
Descriptive Statistics.
Estimates of Effect Size.
Step 11: Select “Bonferroni” from the drop down menu under Confidence Interval Adjustment.
Step 12: Click “Continue” and then click “OK” to run the test.
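Outside SPSS, the same repeated measures analysis can be sketched with statsmodels’ AnovaRM (my substitution; the weekly scores for 5 subjects below are invented):

```python
# Repeated measures ANOVA: 5 subjects, each measured in weeks 1-4.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

data = pd.DataFrame({
    "subject": [s for s in range(1, 6) for _ in range(4)],
    "week":    [1, 2, 3, 4] * 5,
    "score":   [5, 6, 7, 9, 4, 5, 7, 8, 6, 6, 8, 9, 5, 7, 8, 10, 4, 6, 7, 9],
})
res = AnovaRM(data, depvar="score", subject="subject", within=["week"]).fit()
print(res.anova_table)  # F value and p-value for the "week" factor
```

Note that AnovaRM requires balanced data: every subject must be measured at every level of the within factor.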
Sphericity
In statistics, sphericity (ε) is an assumption about variances that is checked with Mauchly’s
sphericity test, developed in 1940 by John W. Mauchly, who also co-developed the first
general-purpose electronic computer.
Definition
Sphericity is used as an assumption in repeated measures ANOVA. The assumption states that
the variances of the differences between all possible group pairs are equal. If your data violates this
assumption, it can result in an increase in a Type I error (the incorrect rejection of the null
hypothesis).
It’s very common for repeated measures ANOVA to result in a violation of the assumption. If the
assumption has been violated, corrections have been developed that can avoid increases in the type I
error rate. The correction is applied to the degrees of freedom in the F-distribution.
Image: UVM.EDU
You would report the above result as “Mauchly’s Test indicated that the assumption of sphericity had
not been violated, χ2(2) = 2.588, p = .274.”
If your test returned a small p-value, you should apply a correction, usually either the:
Greenhouse-Geisser correction.
Huynh-Feldt correction.
When ε ≤ 0.75 (or you don’t know what the value for the statistic is), use the Greenhouse-Geisser
correction.
When ε > 0.75, use the Huynh-Feldt correction.
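The decision rule above is simple enough to write down as code; here’s a small sketch (the correction scales both F-test degrees of freedom by ε):

```python
# Pick a sphericity correction from epsilon, per the rule of thumb above.
def choose_correction(epsilon=None):
    if epsilon is None or epsilon <= 0.75:
        return "Greenhouse-Geisser"
    return "Huynh-Feldt"

# Both numerator and denominator degrees of freedom are multiplied by epsilon.
def corrected_dfs(epsilon, df_num, df_den):
    return epsilon * df_num, epsilon * df_den
```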
Related Articles
Grand mean
To make things clearer, let’s visualize the mean IQ scores per school in a simple bar chart.
Clearly, our sample from school B has the highest mean IQ: roughly 113 points. The lowest
mean IQ (some 93 points) is seen for school C.
Now, here’s the problem: our mean IQ scores are only based on tiny samples of 10 children
per school. So couldn’t it be that all 1,000 children per school have the same mean IQ?
Perhaps we just happened to sample the smartest children from school B and the dumbest
children from school C. Is that realistic? We’ll try to show that this statement, our null
hypothesis, is not credible given our data.
Now, F itself is not interesting at all. However, we can obtain the statistical significance from
F if it follows an F-distribution. It will do just that if 3 assumptions are met.
ANOVA - Assumptions
The assumptions for ANOVA are:
independent observations;
normality: the outcome variable must follow a normal distribution in each subpopulation.
Normality is really only needed for small sample sizes, say n < 20 per group.
homogeneity: the variances within all subpopulations must be equal. Homogeneity is only
needed if sample sizes are very unequal. In this case, Levene's test indicates if it's met.
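Levene’s test is available in most statistics packages; in scipy, for instance (the IQ samples below are invented):

```python
# Levene's test for equal variances across three school samples.
from scipy import stats

school_a = [100, 105, 98, 110, 102]
school_b = [115, 110, 118, 109, 113]
school_c = [90, 95, 92, 97, 91]
w_stat, p_value = stats.levene(school_a, school_b, school_c)
# A small p-value (say, below .05) would signal unequal variances.
```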
If these assumptions hold, then F follows an F-distribution with DFbetween and DFwithin
degrees of freedom. In our example -3 groups of n = 10 each- that'll be F(2,27).
Given this distribution, we can look up the statistical significance. We usually report:
F(2,27) = 6.15, p = 0.006. If our schools have equal mean IQs, there’s only a 0.006 chance of
finding our sample mean differences or larger ones. We usually say something is “statistically
significant” if p < 0.05. Conclusion: our population means are very unlikely to be equal.
The figure below shows how SPSS presents the output for this example.
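The reported p-value follows directly from the F-distribution; it can be checked, for example, with scipy (a stand-in for a table lookup):

```python
# P(F > 6.15) for an F-distribution with 2 and 27 degrees of freedom.
from scipy.stats import f

p = f.sf(6.15, 2, 27)  # survival function = right-tail probability
print(round(p, 3))     # ≈ 0.006, matching the reported p-value
```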
Effect Size - (Partial) Eta Squared
So far, our conclusion is that the population means are not all exactly equal. Now, “not
equal” doesn’t say much. What I’d like to know is exactly how different the means are. A
number that estimates just that is the effect size. An effect size measure for ANOVA is partial
eta squared, written as η2. For a one-way ANOVA, partial eta-squared is equal to simply
eta-squared.
Technically, (partial) eta-squared is the proportion of variance accounted for by a factor.
Some rules of thumb are that
η2 > 0.01 indicates a small effect;
η2 > 0.06 indicates a medium effect;
η2 > 0.14 indicates a large effect.
The exact calculation of eta-squared is shown in the formulas section. For now, suffice it to say
that η2 = 0.31 for our example. This huge effect size explains why our F-test is
statistically significant despite our very tiny sample sizes of n = 10 per school.
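The rules of thumb above amount to a tiny classifier (the “negligible” label for values at or below 0.01 is my own addition):

```python
# Classify an eta-squared value using the thresholds listed above.
def effect_size_label(eta_squared):
    if eta_squared > 0.14:
        return "large"
    if eta_squared > 0.06:
        return "medium"
    if eta_squared > 0.01:
        return "small"
    return "negligible"
```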
Last but not least, there are many other post hoc tests as well. Some require the homogeneity
assumption and others don't. The figure below shows some examples.
ANOVA - Basic Formulas
For the sake of completeness, we'll list the main formulas used for the one-
way ANOVA in our example. You can see them in action in this
Googlesheet. We'll start off with the between-groups variance: