Measures of Dispersion, Relative Standing and Shape: No. Biostat - 8 Date:25.01.2009
Biostatistics
No. Biostat -8
Date:25.01.2009
MEASURES OF DISPERSION,
RELATIVE STANDING AND
SHAPE
Characteristics of an Ideal Measure of Dispersion
Range
Quartile Deviation
Mean Deviation
Standard Deviation
How dispersions are measured? Contd.
The following measures of dispersion are used
to study the variation:
The range
The inter quartile range and quartile
deviation
The mean deviation or average deviation
The standard deviation
How dispersions are measured? Contd.
Range:
The difference between the values of the two extreme items of a
series.
Example:
The ages of a sample of 10 subjects drawn from a population of 169 subjects
are:
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
42 28 28 61 31 23 50 34 32 37
The youngest subject in the sample is
23 years old and the oldest is 61 years. The
range: R = XL - XS
= 61 - 23 = 38
Co-efficient of Range:
R= (XL - XS) / (XL + XS)
= (61 -23) / (61 + 23) =38 /84 = 0.452
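The range and its coefficient can be checked with a few lines of Python. This is a minimal sketch using the ten ages listed above; the variable names are illustrative only.

```python
# Ages of the 10 sampled subjects (from the example above).
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

x_largest = max(ages)    # XL, the oldest subject
x_smallest = min(ages)   # XS, the youngest subject

# Range: difference between the two extreme values.
data_range = x_largest - x_smallest

# Coefficient of range: (XL - XS) / (XL + XS), a unit-free relative measure.
coefficient_of_range = data_range / (x_largest + x_smallest)

print(data_range)                      # 38
print(round(coefficient_of_range, 3))  # 0.452
```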
Characteristics of Range
Simplest and most crude measure of dispersion
It is not based on all the observations.
Unduly affected by the extreme values and fluctuations of
sampling.
The range may increase as observations are added to the set,
but it cannot decrease
Gives an idea of the variability very quickly
The Range
The range is defined as the difference between the
largest score in the set of data and the smallest score in
the set of data, XL - XS
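Mean deviation
$$MD = \frac{\sum |x - \bar{x}|}{n}$$
Mean Deviation (grouped data)
$$MD_{\bar{x}} = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{n}$$
k = Number of classes
x_i = Midpoint of the i-th class
f_i = Frequency of the i-th class
n = Total frequency (sum of the f_i)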
Standard Deviation
Standard deviation is the positive square root
of the mean-square deviations of the
observations from their arithmetic mean.
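Population:
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$
Sample:
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
$$SD = \sqrt{\text{variance}}$$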
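The difference between the population and sample formulas is only the divisor. The sketch below illustrates this with the ten ages from the range example, treating them first as a whole population (divisor N) and then as a sample (divisor n - 1). The cross-check against Python's standard `statistics` module is an addition for illustration, not part of the original notes.

```python
import math
import statistics

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

n = len(ages)
mean = sum(ages) / n

# Sum of squared deviations from the arithmetic mean.
sum_sq_dev = sum((x - mean) ** 2 for x in ages)

# Population form divides by N; sample form divides by n - 1.
population_sd = math.sqrt(sum_sq_dev / n)
sample_sd = math.sqrt(sum_sq_dev / (n - 1))

print(round(population_sd, 2))  # 10.94
print(round(sample_sd, 2))      # 11.53

# The standard library gives the same two values.
assert math.isclose(population_sd, statistics.pstdev(ages))
assert math.isclose(sample_sd, statistics.stdev(ages))
```

The sample value is always slightly larger, and the gap shrinks as n grows.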
The standard deviation
Measures the variation of observations from the mean
Note:
1. MD is based on all values and hence cannot be calculated for open-
ended distributions.
2. It uses average but ignores signs and hence appears unmethodical.
3. MD can be calculated from the mean or from the median, both for
ungrouped data (direct method) and for continuous distributions
(assumed mean method or short-cut method).
4. The average used is either the arithmetic mean or median
Computation of Mean absolute Deviation
For individual series: X1, X2, ..., Xn
$$M.A.D = \frac{\sum |X_i - \bar{X}|}{n}$$
For discrete series: X1, X2, ..., Xn with
corresponding frequencies f1, f2, ..., fn
$$M.A.D = \frac{\sum f_i \, |X_i - \bar{X}|}{\sum f_i}$$
$\bar{X}$: Mean of the data series.
Computation of Mean absolute Deviation:
For continuous grouped data: m1, m2, ..., mn are the class mid
points with corresponding class frequencies f1, f2, ..., fn
$$M.A.D = \frac{\sum f_i \, |m_i - \bar{X}|}{\sum f_i}$$
$\bar{X}$: Mean of the data series.
Coeff. of MAD = MAD / Average
where Average is the average (mean or median) from which the deviations are
calculated. It is a relative measure of dispersion
and can be compared with the same measure for other
series.
Example:
Find the MAD of the days of confinement after delivery in the
following series.
Days of        No. of      Total days of     Absolute deviation   f|X - X̄|
confinement    patients    confinement of    from mean
(X)            (f)         each group (Xf)   |X - X̄|
 6             5            30               1.61                  8.05
 7             4            28               0.61                  2.44
 8             4            32               0.39                  1.56
 9             3            27               1.39                  4.17
10             2            20               2.39                  4.78
Total         18           137                                    21.00
Mean: X̄ = 137 / 18 = 7.61 days
MAD = 21.00 / 18 = 1.17 days
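The worked example can be verified by computing each absolute deviation directly from the days (X) and patient counts (f). This is a minimal sketch of the discrete-series formula, with illustrative variable names; the last line also applies the coefficient of MAD defined earlier, taking the mean as the average.

```python
# Days of confinement (X) and number of patients (f) from the example.
days = [6, 7, 8, 9, 10]
patients = [5, 4, 4, 3, 2]

total_patients = sum(patients)                             # sum of f
total_days = sum(x * f for x, f in zip(days, patients))    # sum of X*f

# Frequency-weighted arithmetic mean.
mean_days = total_days / total_patients

# Sum of f * |X - mean| over all groups.
weighted_abs_dev = sum(f * abs(x - mean_days) for x, f in zip(days, patients))

# Mean absolute deviation and its coefficient (MAD / mean).
mad = weighted_abs_dev / total_patients
coefficient_of_mad = mad / mean_days

print(total_patients, total_days)     # 18 137
print(round(mean_days, 2))            # 7.61
print(round(mad, 2))                  # 1.17
print(round(coefficient_of_mad, 3))   # 0.153
```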
Standard deviation
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
Variance $\sigma^2$
The square of the population standard deviation is
called the variance.
Variance
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
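Standard Deviation (σ)
It is the positive square root of the average of squares of
deviations of the observations from the mean. This is also called
root mean squared deviation (σ).
For individual series: x1, x2, ..., xn
$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} = \sqrt{\frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2}$$
For discrete series: x1, x2, ..., xn with corresponding frequencies f1, f2, ..., fn
$$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \left(\frac{\sum f_i x_i}{\sum f_i}\right)^2}$$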
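Standard Deviation (σ) Contd.
For continuous grouped data: m1, m2, ..., mn are the class mid
points with corresponding class frequencies f1, f2, ..., fn
$$\sigma = \sqrt{\frac{\sum f_i (m_i - \bar{x})^2}{\sum f_i}} = \sqrt{\frac{\sum f_i m_i^2}{\sum f_i} - \left(\frac{\sum f_i m_i}{\sum f_i}\right)^2}$$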
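For frequency data the standard deviation can be written in two equivalent ways: the definitional form (frequency-weighted mean of squared deviations from the mean) and the computational form (weighted mean of squares minus the square of the weighted mean). The sketch below checks that both agree, reusing the confinement data from the MAD example as a discrete series. For continuous grouped data the class midpoints would take the place of `days`. Variable names are illustrative.

```python
import math

# Discrete series: days of confinement (x) with patient counts (f).
days = [6, 7, 8, 9, 10]
patients = [5, 4, 4, 3, 2]

total_f = sum(patients)
mean = sum(f * x for x, f in zip(days, patients)) / total_f

# Definitional form: sqrt( sum f*(x - mean)^2 / sum f ).
sd_definition = math.sqrt(
    sum(f * (x - mean) ** 2 for x, f in zip(days, patients)) / total_f
)

# Computational form: sqrt( sum f*x^2 / sum f  -  (sum f*x / sum f)^2 ).
mean_of_squares = sum(f * x ** 2 for x, f in zip(days, patients)) / total_f
sd_shortcut = math.sqrt(mean_of_squares - mean ** 2)

print(round(sd_definition, 2))  # 1.34
assert math.isclose(sd_definition, sd_shortcut)
```

The computational form avoids computing every deviation, which was convenient for hand calculation, but it can lose precision in floating point when the mean is large relative to the spread.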
Variance: It is the square of the s.d
Coefficient of Variation (CV): the corresponding
relative measure of dispersion.
$$CV = \frac{\sigma}{\bar{X}} \times 100$$
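The coefficient of variation expresses the standard deviation as a percentage of the mean. This is a short sketch using the ten ages from the range example. It assumes the divisor-n (population) form of the standard deviation, computed here with `statistics.pstdev`; using the sample form instead would give a slightly larger value.

```python
import statistics

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

mean_age = statistics.mean(ages)   # arithmetic mean
sd_age = statistics.pstdev(ages)   # SD with divisor n (assumption, see lead-in)

# CV = (SD / mean) * 100, a unit-free percentage.
cv_percent = sd_age / mean_age * 100

print(round(cv_percent, 1))  # 29.9
```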
Characteristics of Standard Deviation:
SD is very satisfactory and most widely used
measure of dispersion
Amenable for mathematical manipulation
It is independent of origin, but not of scale
If SD is small, there is a high probability of
getting a value close to the mean; if it is large,
values tend to lie farther away from the mean
Does not ignore the algebraic signs and it is less
affected by fluctuations of sampling
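The origin and scale property listed above can be checked numerically. This is a minimal sketch using the ten ages from the range example; the shift of 10 years and the conversion from years to months (factor 12) are illustrative choices, not taken from the notes.

```python
import math
import statistics

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
sd_original = statistics.pstdev(ages)

# Change of origin: add a constant to every observation.
shifted = [x + 10 for x in ages]
sd_shifted = statistics.pstdev(shifted)

# Change of scale: multiply every observation by a constant (years to months).
scaled = [x * 12 for x in ages]
sd_scaled = statistics.pstdev(scaled)

# SD is unchanged by the shift but multiplied by the scale factor.
assert math.isclose(sd_shifted, sd_original)
assert math.isclose(sd_scaled, 12 * sd_original)
```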
SD can be calculated by :
Direct method
Assumed mean method.
Step deviation method.
Roughly, it is a typical (root-mean-square) distance of the observed
values from the mean value for a set of data
Basic rule: more spread will yield a larger SD
Measure of Shape
The fourth important numerical characteristic of a
data set is its shape: Skewness and kurtosis.
Skewness
Skewness characterizes the degree of asymmetry of
a distribution around its mean. For a sample data,
the skewness is defined by the formula:
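$$\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^3$$
$$\text{Kurtosis} = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$$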
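The sample skewness and kurtosis formulas can be sketched in Python as follows. Here s is taken to be the sample standard deviation (divisor n - 1), the usual convention for these bias-adjusted formulas. The function names are hypothetical, and the ages are the ten values from the range example.

```python
import statistics


def sample_skewness(data):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(data)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = sum(data) / n
    s = statistics.stdev(data)  # sample SD, divisor n - 1
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)


def sample_kurtosis(data):
    """Adjusted sample (excess) kurtosis as in the formula above."""
    n = len(data)
    if n < 4:
        raise ValueError("kurtosis needs at least 4 observations")
    mean = sum(data) / n
    s = statistics.stdev(data)  # sample SD, divisor n - 1
    fourth_moment_term = (
        n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
        * sum(((x - mean) / s) ** 4 for x in data)
    )
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return fourth_moment_term - correction


ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
print(sample_skewness(ages))   # positive: the oldest subject (61) stretches the right tail
print(sample_kurtosis(ages))
```

A perfectly symmetric data set such as 1, 2, 3, 4, 5 gives a skewness of 0 and, with this formula, a kurtosis of -1.2, since the subtracted term makes the measure zero for a normal distribution.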