Aem214 CH-3C
CHAPTER – 3
DISPERSION
Measures of Dispersion
There are different tools for measuring dispersion, such as Range, Mean Deviation,
Absolute Deviation, Standard Deviation, Variance, etc.
Range in Statistics
In statistics, range is defined simply as the difference between the maximum and
minimum observations. The reason for this definition is intuitive: range should
suggest how widely spread out the values are, and the difference between the
maximum and minimum values gives an estimate of the spread of the data.
For example, suppose an experiment involves finding out the weight of lab rats and the
values in grams are 310, 367, 423, 471 and 485. In this case, the range is simply
computed as 485-310 = 175 grams.
Range is a useful indication of how spread out the data is, but it has a serious
limitation: it depends only on the two extreme values, so a single outlier that lies far
from the other data points can distort it. In such cases, the range does not give a true
indication of the spread of the data.
For example, in our previous case, consider a small baby rat added to the data set that
weighs only 50 grams. Now the range is computed as 485-50 = 435 grams, which
overstates the dispersion of the other five weights.
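The two range calculations for the rat weights can be checked with a short Python sketch. The helper name data_range is only an illustrative choice, not something defined in these notes.

```python
# Rat weights in grams, taken from the example in the text.
weights = [310, 367, 423, 471, 485]


def data_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)


# Range of the original five weights.
range_without_outlier = data_range(weights)

# Range after adding the 50 g baby rat, which acts as an outlier.
range_with_outlier = data_range(weights + [50])

print(range_without_outlier)  # 175
print(range_with_outlier)     # 435
```

A single extra observation more than doubles the range, even though the other five weights have not changed.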
EXAMPLES
For example, the data points 50, 51, 52, 55, 56, 57, 59 and 60 have a mean of 55 (Blue).
Another data set, 12, 32, 43, 48, 64, 71, 83 and 87, also has a mean of 55 (Pink).
However, the two sets clearly have different properties: the first set is much more
closely packed than the second one. Standard deviation measures this spread of the
data about the mean.
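As a rough illustration, the two data sets can be compared in Python. This sketch assumes the population form of the standard deviation (dividing by N), which is what statistics.pstdev computes. The variable names are illustrative only.

```python
import statistics

# The two data sets from the text; both have a mean of 55.
closely_packed = [50, 51, 52, 55, 56, 57, 59, 60]
widely_spread = [12, 32, 43, 48, 64, 71, 83, 87]

mean_a = statistics.mean(closely_packed)
mean_b = statistics.mean(widely_spread)

# Population standard deviation (divides by N); this is an assumption,
# since the notes do not say which form is intended here.
sd_a = statistics.pstdev(closely_packed)
sd_b = statistics.pstdev(widely_spread)

print(mean_a, mean_b)                  # both means are 55
print(round(sd_a, 2), round(sd_b, 2))  # about 3.46 and 24.28
```

The means are identical, but the second standard deviation is roughly seven times the first, which matches the visual impression that the second set is far more spread out.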
USAGE
STATISTICAL VARIANCE
Statistical variance gives a measure of how the data distributes itself about the mean or
expected value. In many cases of statistics and experimentation, it is the variance that
gives invaluable information about the data distribution.
For example, suppose you want to find the variance of scores on a test. Suppose the
scores are 67, 72, 85, 93 and 98.
$$\sigma^2 = \frac{\sum (X_i - \text{Mean})^2}{N}$$
or, when each value $X_i$ occurs with frequency $F_i$ (so that $N = \sum F_i$),
$$\sigma^2 = \frac{\sum F_i (X_i - \text{Mean})^2}{N}$$
With five scores, N = 5:
$$\sigma^2 = \frac{\sum (X_i - \text{Mean})^2}{5}$$
3. Find the mean (µ) of the five scores (67, 72, 85, 93, 98): µ = Mean = 83.
$$\sigma^2 = \frac{\sum (X_i - 83)^2}{5}$$
4. Now, compare each score (x = 67, 72, 85, 93, 98) to the mean (µ = 83).
$$\sigma^2 = \frac{(67-83)^2+(72-83)^2+(85-83)^2+(93-83)^2+(98-83)^2}{5}$$
$$\sigma^2 = \frac{(-16)^2+(-11)^2+(2)^2+(10)^2+(15)^2}{5}$$
6. Then, square each parenthesis. We get 256, 121, 4, 100 and 225.
$$\sigma^2 = \frac{(-16)(-16)+(-11)(-11)+(2)(2)+(10)(10)+(15)(15)}{5}$$
$$\sigma^2 = \frac{706}{5}$$
8. To get the final answer, we divide the sum by 5 (because there were five scores).
This is the variance for the dataset:
$$\sigma^2 = 141.2$$
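The worked steps for the five test scores can be reproduced with a minimal Python sketch. It follows the population formula used in the example (dividing by N = 5). The variable names are illustrative.

```python
# Test scores from the worked example.
scores = [67, 72, 85, 93, 98]
n = len(scores)

# Mean of the scores.
mean = sum(scores) / n

# Deviation of each score from the mean, then the squared deviations.
deviations = [x - mean for x in scores]
squared_deviations = [d ** 2 for d in deviations]

# Sum the squared deviations and divide by the number of scores.
sum_of_squares = sum(squared_deviations)
variance = sum_of_squares / n

print(mean)                # 83.0
print(squared_deviations)  # [256.0, 121.0, 4.0, 100.0, 225.0]
print(sum_of_squares)      # 706.0
print(variance)            # 141.2
```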
If in terms of Frequency
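When the data are given as values with frequencies, each squared deviation is weighted by its frequency Fi and N is the total frequency. The sketch below illustrates this with a small frequency table that is purely hypothetical and does not come from these notes.

```python
# Hypothetical frequency table (not from the notes):
# each value occurs the corresponding number of times.
values = [60, 70, 80]
frequencies = [2, 5, 3]

# N is the total frequency.
n = sum(frequencies)

# Frequency-weighted mean.
mean = sum(f * x for x, f in zip(values, frequencies)) / n

# Frequency-weighted variance: sum of Fi * (Xi - Mean)^2, divided by N.
variance = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / n

print(n)         # 10
print(mean)      # 71.0
print(variance)  # 49.0
```

The result is the same as listing every observation individually (60 twice, 70 five times, 80 three times) and applying the ungrouped formula.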
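When the data are a sample rather than the whole population, we need to slightly
change the formula for standard deviation to
$$s = \sqrt{\frac{\sum (X_i - \text{Mean})^2}{n - 1}}$$
Note that the denominator is one less than the sample size in this case.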
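The effect of the changed denominator can be seen by treating the five test scores first as a whole population and then as a sample. This sketch uses Python's statistics module, where pvariance and pstdev divide by N while variance and stdev divide by N - 1. Reusing the test scores here is an illustrative choice.

```python
import statistics

# Test scores from the variance example.
scores = [67, 72, 85, 93, 98]

# Population forms: the denominator is N.
population_variance = statistics.pvariance(scores)
population_sd = statistics.pstdev(scores)

# Sample forms: the denominator is N - 1.
sample_variance = statistics.variance(scores)
sample_sd = statistics.stdev(scores)

print(population_variance)  # 706 / 5 = 141.2
print(sample_variance)      # 706 / 4 = 176.5
print(population_sd < sample_sd)  # True
```

Because the sum of squared deviations is divided by a smaller number, the sample figures are always slightly larger than the population figures for the same data.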
USAGE
The concept of variance can be extended to continuous data sets too. In that case, instead
of summing up the individual squared differences from the mean, we need to integrate
them. This approach is also useful when the number of data points is very large, like the
population of a country.
Variance is extensively used in probability theory, wherein from a given smaller sample
set, more generalized conclusions need to be drawn. This is because variance gives us an
idea about the distribution of data around the mean, and thus from this distribution, we
can work out where we can expect an unknown data point.
The standard error of the mean, also called the standard deviation of the mean, is an
estimate of the standard deviation of the sampling distribution of the mean. To understand
this, first we need to understand why a sampling distribution is required.
$$\sigma_M = \frac{\sigma}{\sqrt{N}}$$
where σM = standard error of the mean, σ = the standard deviation of the original
distribution, and N = the number of data points in each sample.
CONCEPTS:
It can be seen from the formula that the standard error of the mean decreases as N
increases. This is expected because if the mean at each step is calculated using many
data points, then a small deviation in one value will have less effect on the final mean.
The standard error of the mean tells us how the mean varies across different experiments
measuring the same quantity. Thus, if the effect of random changes is significant, the
standard error of the mean will be higher. If there is no change in the data points as
experiments are repeated, then the standard error of the mean is zero.
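The dependence on N can be illustrated with a few numbers, assuming the usual relation in which the standard error of the mean equals σ divided by the square root of N. The value σ = 10 and the sample sizes below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean, assuming sigma_M = sigma / sqrt(N)."""
    return sigma / math.sqrt(n)


# Hypothetical standard deviation of the original distribution.
sigma = 10.0

# The standard error shrinks as the sample size N grows.
for n in (4, 25, 100):
    print(n, standard_error(sigma, n))
# prints: 4 5.0, then 25 2.0, then 100 1.0

# With no variation in the data (sigma = 0), the standard error is zero.
print(standard_error(0.0, 25))  # 0.0
```

Quadrupling N only halves the standard error, because N enters through its square root.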
Solved Examples:
1. Find the Range and Range Coefficient of the following series which gives the height of
seven plants:
22, 35, 32, 45, 42, 48, 39
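A possible way to check this exercise is sketched below. It assumes the commonly used definition of the range coefficient (coefficient of range), (L - S) / (L + S), where L is the largest and S the smallest observation. That definition is not stated in these notes.

```python
# Heights of the seven plants from the exercise.
heights = [22, 35, 32, 45, 42, 48, 39]

largest = max(heights)
smallest = min(heights)

# Range: difference between the largest and smallest observation.
height_range = largest - smallest

# Coefficient of range, assuming the definition (L - S) / (L + S).
range_coefficient = (largest - smallest) / (largest + smallest)

print(height_range)                 # 26
print(round(range_coefficient, 3))  # 0.371
```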
2. Ten measurements were made with the following results. Find the standard deviation.
Length in cm.   77   73   75   70   72   76   75   72   74   76
Xi – m           3   -1    1   -4   -2    2    1   -2    0    2
(Xi – m)²        9    1    1   16    4    4    1    4    0    4
Di               2   -2    0   -5   -3    1    0   -3   -1    1
Di²              4    4    0   25    9    1    0    9    1    1
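The table can be turned into a standard deviation in two ways, sketched below in Python. The first uses deviations from the true mean m. The second uses deviations Di from an assumed mean of 75, a value inferred from the Di row rather than stated in the notes. Both divide by N, following the population formula used earlier in the chapter, which is also an assumption here.

```python
import math

# The ten length measurements in cm.
lengths = [77, 73, 75, 70, 72, 76, 75, 72, 74, 76]
n = len(lengths)

# Direct method: deviations from the true mean m.
m = sum(lengths) / n
sum_sq_dev = sum((x - m) ** 2 for x in lengths)
sd_direct = math.sqrt(sum_sq_dev / n)

# Assumed-mean method: deviations Di from an assumed mean of 75
# (inferred from the Di row of the table).
assumed_mean = 75
d = [x - assumed_mean for x in lengths]
sum_d = sum(d)
sum_d_sq = sum(v * v for v in d)
sd_assumed = math.sqrt(sum_d_sq / n - (sum_d / n) ** 2)

print(m)                     # 74.0
print(sum_sq_dev)            # 44.0
print(sum_d, sum_d_sq)       # -10 54
print(round(sd_direct, 3))   # 2.098
print(round(sd_assumed, 3))  # 2.098
```

Both routes give the square root of 4.4, about 2.098 cm. The assumed-mean route avoids fractional deviations when the true mean is not a round number.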
3. From the following figures find the mean, standard deviation and coefficient of
variation.