UNIT IV Dispersion and Skewness
DEFINITION:
“The degree to which numerical data tend to spread about an average value is called the variation
or dispersion of the data.” - Spiegel.
Measures of dispersion are also called “averages of the second order” for the reason that these
measures give an average of the differences of the various items from an average.
Methods of measuring dispersion:
1. The range.
2. The interquartile range and the quartile deviation.
3. The mean deviation.
4. The standard deviation.
5. The Lorenz curve.
Of the above, the range, quartile deviation, mean deviation and standard deviation are
mathematical methods, and the Lorenz curve is a graphical method.
RANGE:
The range is the simplest measure of dispersion. When the data are arranged in order, the
difference between the largest value and the smallest value in the arranged group is called the
range:
Range = L – S
L= largest value
S= smallest value
This is the range as an absolute measure. As a relative measure, the coefficient of range is given as:
Coefficient of range = absolute range / sum of the two extremes
Coefficient of range = (L − S) / (L + S)
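As a quick illustration of the range and the coefficient of range, the following Python sketch computes both for a small data set. The share prices are made-up values and the function names are chosen only for this example.

```python
def value_range(values):
    """Absolute range: largest value minus smallest value (L - S)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative range: (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


# Hypothetical daily closing prices of a share (illustrative data only).
prices = [200, 210, 208, 160, 220, 250]

print("Range =", value_range(prices))
print("Coefficient of range =", round(coefficient_of_range(prices), 4))
```

Only the two extreme values (160 and 250 here) enter the calculation, which is why the range is so sensitive to outliers.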
Uses of range:
The range is widely used in quality control, in weather forecasting and in tracking fluctuations
in share prices and stock market indices.
Merits of range:
a. Simplest measure of dispersion.
b. Simplest to understand and to compute.
Demerits of range:
a. It is largely affected by extreme values.
b. It is not based on each and every observation.
c. It is not amenable to further mathematical treatment.
d. Not suitable for open-end class distribution.
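QUARTILE DEVIATION:
The interquartile range is the difference between the third quartile and the first quartile, Q3 − Q1.
When the interquartile range is halved to give the semi-interquartile range, the result is called the
quartile deviation (Q.D.):
Q.D. = (Q3 − Q1) / 2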
Quartile deviation gives the average amount by which the two quartiles differ from the median.
Quartile deviation is an absolute measure of dispersion. Its relative measure is the co-efficient of
quartile deviation. It is given as:
Coefficient of Q.D. = (Q3 − Q1) / (Q3 + Q1)
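The quartile deviation and its coefficient can be sketched in Python as follows. The marks are made-up values. The text does not say how the quartiles are located, so this sketch assumes the common textbook convention that Q1 and Q3 are the (N + 1)/4-th and 3(N + 1)/4-th items of the sorted data, which is what `statistics.quantiles` does with `method="exclusive"`.

```python
import statistics


def quartile_deviation(values):
    """Return (Q.D., coefficient of Q.D.) for ungrouped data."""
    # method="exclusive" places Q1 and Q3 at the (N + 1)/4-th and
    # 3(N + 1)/4-th items of the sorted data (assumed convention).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    qd = (q3 - q1) / 2
    coefficient = (q3 - q1) / (q3 + q1)
    return qd, coefficient


# Hypothetical marks of seven students (illustrative data only).
marks = [20, 28, 40, 12, 30, 15, 50]

qd, coeff = quartile_deviation(marks)
print("Q.D. =", qd)
print("Coefficient of Q.D. =", round(coeff, 4))
```

With seven items the quartiles fall exactly on the 2nd and 6th sorted values (15 and 40), so no interpolation is needed.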
Merits:
1. It is easy to compute and easy to understand.
2. It is based on the central 50% of the observations. Therefore extreme values do not affect it.
3. It is useful in open-end distributions.
Demerits:
1. It ignores 50% of items. Therefore it is not based on all items.
2. It is not amenable to further mathematical treatment.
3. Quartile deviation is not computed from any central value. Therefore some experts argue
that Q.D is not a measure of dispersion.
4. Q.D is not affected by the change in the distribution outside the quartiles.
MEAN DEVIATION:
Mean deviation or average deviation is the average difference between the items in a distribution
and the mean, median or mode of that series.
In simple words, the arithmetic average of the deviations (ignoring signs) from the mean, median
or mode is known as the mean deviation.
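Formulas:
M.D. = ∑|D| / N (or ∑f|D| / N for a frequency distribution), where D = deviation of an item from
the mean, median or mode and N = number of items.
Coefficient of M.D. = M.D. / (the mean, median or mode from which the deviations are taken)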
Problems:
1. The following are the marks obtained by 10 students in an examination. Find mean
deviation about both mean and median and also find the coefficient of M.D.
2. Find out the value of Mean Deviation and its coefficient from the following:
45 , 70 , 78 , 52 , 75 , 83 , 110 , 98 , 64.
3. Calculate Mean Deviation and its coefficient from the following data:
4. Calculate the Mean Deviation and its coefficient from mean as well as median of the
following data:
X 115 125 135 145 155 165 175
F 31 48 72 116 60 22 3
5. Calculate the Mean Deviation and its coefficient from both mean and median for the
following:
6. Compute Mean Deviation about median and the coefficient of mean deviation :
Wages (Rs.) 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
No. of persons 8 10 15 25 20 18 9 5
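The mean deviation for ungrouped data can be sketched in Python as follows, using the nine values given in Problem 2 of this section. The coefficient is taken as the mean deviation divided by the average used, which is the usual textbook definition. The function name is chosen only for this example.

```python
import statistics


def mean_deviation(values, centre):
    """Arithmetic average of the absolute deviations from `centre`."""
    return sum(abs(x - centre) for x in values) / len(values)


# The nine values given in Problem 2 of this section.
data = [45, 70, 78, 52, 75, 83, 110, 98, 64]

mean = statistics.mean(data)
median = statistics.median(data)

md_mean = mean_deviation(data, mean)
md_median = mean_deviation(data, median)

print("M.D. about mean =", round(md_mean, 2),
      "coefficient =", round(md_mean / mean, 4))
print("M.D. about median =", round(md_median, 2),
      "coefficient =", round(md_median / median, 4))
```

For this data set the mean and the median coincide, so both versions of the mean deviation are equal. For grouped data each absolute deviation would additionally be weighted by its frequency.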
STANDARD DEVIATION:
Standard Deviation (S.D.) is the square root of the arithmetic mean of the squared deviations of
values from their arithmetic mean. It is generally denoted by the symbol σ (read as sigma).
Formulas (step deviation method):
σ = √[ ∑d'²/N − (∑d'/N)² ] × i
where d' = (X − A)/i in the case of a discrete series and d' = (M − A)/i in the case of a
continuous series,
A = assumed mean, i = class interval or common factor, M = mid-point, N = number of items.
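The step deviation formula can be checked with a short Python sketch. It uses the six X values listed under Problem 3 of this section as an individual series. The assumed mean A = 15 and common factor i = 1 are choices made for this example, not values given in the text.

```python
import math
import statistics


def sd_step_deviation(values, assumed_mean, i=1):
    """S.D. via step deviations d' = (X - A) / i."""
    n = len(values)
    d = [(x - assumed_mean) / i for x in values]
    sum_d = sum(d)
    sum_d_sq = sum(step * step for step in d)
    return math.sqrt(sum_d_sq / n - (sum_d / n) ** 2) * i


# The six X values listed under Problem 3 of this section.
x = [10, 12, 15, 18, 20, 23]

sigma = sd_step_deviation(x, assumed_mean=15, i=1)
print("S.D. (assumed mean method) =", round(sigma, 2))

# Cross-check against the direct definition (population S.D.).
print("S.D. (direct definition)   =", round(statistics.pstdev(x), 2))
```

Any other assumed mean or common factor gives the same σ, because the formula corrects for the shift and rescales by i. For a frequency distribution each d' and d'² would be multiplied by its frequency before summing.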
Problems:
3. Calculate S.D. from the following data under assumed mean method :
X 10 12 15 18 20 23
X 3 4 5 6 7 8 9
F 3 9 11 14 12 7 4
X 10 20 30 40 50 60 70
F 5 11 19 22 15 6 2
9. Compute S.D. from the following data ( use Step Deviation Method ) :
Marks : 5 10 15 20 25 30 35 40
The S.D. is an absolute measure of dispersion. The corresponding relative measure is known as
the Coefficient of Variation (C.V.). It is given as: C.V. = (σ / mean) × 100.
Variance
1. The following table shows the scores of two batsmen, Rahul and Raju, in a recent cricket
tournament. Find out who is the better run getter and who is more consistent.
Rahul
Mean = 51.7
Std. Deviation = 43.05
Coefficient of Variation = (Std. Deviation / Mean) × 100
= (43.05 / 51.7) × 100 = 83.27%
Raju
Mean = 57.9
Std. Deviation = 35.13
Coefficient of Variation = (Std. Deviation / Mean) × 100
= (35.13 / 57.9) × 100 = 60.67%
Raju is the better run getter because his mean score is higher.
The C.V. of Raju is lower than the C.V. of Rahul; therefore Raju is more consistent than Rahul.
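The comparison can be reproduced with a short Python sketch that takes the means and standard deviations stated in the solution as given. The dictionary layout and function name are chosen only for this example.

```python
def coefficient_of_variation(std_dev, mean):
    """C.V. = (sigma / mean) x 100, expressed as a percentage."""
    return std_dev / mean * 100


# Mean and S.D. of each batsman as stated in the worked solution.
batsmen = {
    "Rahul": {"mean": 51.7, "sd": 43.05},
    "Raju": {"mean": 57.9, "sd": 35.13},
}

cv = {name: coefficient_of_variation(s["sd"], s["mean"])
      for name, s in batsmen.items()}
for name, value in cv.items():
    print(f"{name}: C.V. = {value:.2f}%")

# Higher mean -> better run getter; lower C.V. -> more consistent.
better_run_getter = max(batsmen, key=lambda name: batsmen[name]["mean"])
more_consistent = min(cv, key=cv.get)
print("Better run getter:", better_run_getter)
print("More consistent:", more_consistent)
```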
2. Find which of the following batsmen is more consistent in scoring. Would you accept
him as the better run getter? Why?
3. You are given below the daily wages paid to workers in two factories A and B.
a) Which factory pays higher average wages?
b) In which factory are wages more variable?
5. The average marks of 2nd sem B.Com students in Business Statistics at a college increased
from 65 to 68 and the S.D. increased from 4.5 to 5.2. Have the marks in Business Statistics
at that college become more consistent than before?
Ans:
Before:
Mean = 65
S.D. = 4.5
CV = (4.5 / 65) × 100 = 6.92%
After:
Mean = 68
S.D. = 5.2
CV = (5.2 / 68) × 100 = 7.65%
Since the CV has increased, the marks have become less consistent than before.
Brand A Brand B
M 1000 hours 820 hours
S.D. 100 hours 65 hours
Calculate a measure of relative dispersion for the two brands and interpret the results.
7. The following particulars relate to wages paid by two factories, M and N, belonging to the same
industry:
Factory M Factory N
No. of workers 856 684
Average wages Rs. 552 Rs. 574
Variance 144 196
a) Which factory pays higher wages?
b) Which factory has greater variability in wages?
Solution:
8. An organization has two units, A and B. An analysis of weekly wages paid to workers
gave the following results.
Unit A Unit B
No. of wage earners 500 670
Average weekly wages (Rs.) 65 72
S.D. (Rs.) 9 9
a) Which unit pays the larger amount as weekly wages?
b) In which unit is there greater variability in the wage distribution?
c) Find the combined average wage and the combined S.D. of wages for the
whole organization.
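One way to work through Problem 8 is sketched below in Python. The combined mean and combined S.D. formulas are the standard textbook ones for pooling two groups. They are not stated in this section, so treat them as an assumption of the sketch, along with the function name.

```python
import math


def combined_mean_and_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combine two groups using the standard pooled formulas.

    combined mean = (n1*mean1 + n2*mean2) / (n1 + n2)
    combined S.D. = sqrt((n1*(sd1^2 + d1^2) + n2*(sd2^2 + d2^2)) / (n1 + n2)),
    where d1 and d2 are the gaps between each group mean and the combined mean.
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    d1, d2 = mean1 - mean, mean2 - mean
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / n
    return mean, math.sqrt(variance)


# Figures for Unit A and Unit B as given in Problem 8.
mean, sd = combined_mean_and_sd(500, 65, 9, 670, 72, 9)
print("Combined mean wage = Rs.", round(mean, 2))
print("Combined S.D.      = Rs.", round(sd, 2))

# Parts (a) and (b): total weekly wage bill and C.V. of each unit.
print("Wage bill A =", 500 * 65, " Wage bill B =", 670 * 72)
print("C.V. A =", round(9 / 65 * 100, 2), " C.V. B =", round(9 / 72 * 100, 2))
```

Although both units have the same S.D., the combined S.D. is larger than 9 because the two unit means differ, and the unit with the lower mean has the higher C.V.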
Skewness :
Definition :
“A distribution is said to be ‘skewed’ when the mean and the median fall at different points in
the distribution, and the balance (or centre of gravity) is shifted to one side or the other, to left
or right.” - Garrett
In the absence of a mode, the empirical formula [Mode = 3 Median – 2 Mean] is used instead. In
such a case, Pearson’s coefficient of skewness is calculated using the formula:
Sk_p = 3 (Mean − Median) / Standard Deviation
Problems :
Wages (Rs.) 100 200 300 400 500 600 700 800 900
No. of Workers 35 40 48 100 125 87 43 22 50
Calculate Karl Pearson’s coefficient of Skewness .
4. Calculate Karl Pearson’s coefficient of skewness from the following table:
Wages (Rs.) 270-280 280-290 290-300 300-310 310-320 320-330 330-340 340-350
No. of workers 12 18 35 42 50 45 20 8
Distribution A Distribution B
Mean 100 90
Median 90 80
S.D. 10 10
State whether the following statements are true or false.
a) Distribution A has the same degree of variation as Distribution B.
b) Both distributions have the same degree of skewness.
Solution:
a) Coefficient of variation
Distribution A: CV = (10 / 100) × 100 = 10%
Distribution B: CV = (10 / 90) × 100 = 11.11%
False. The degrees of variation of distributions A and B are not the same.
b) Coefficient of skewness
Distribution A: Sk_p = 3 (100 − 90) / 10 = 3
Distribution B: Sk_p = 3 (90 − 80) / 10 = 3
True. Both distributions have the same degree of skewness.
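The two checks in this solution can be sketched in Python using the median-based formula for Pearson's coefficient of skewness and the summary figures of distributions A and B. The function names and dictionary layout are chosen only for this example.

```python
def pearson_skewness(mean, median, std_dev):
    """Sk_p = 3 (Mean - Median) / Standard Deviation."""
    return 3 * (mean - median) / std_dev


def coefficient_of_variation(std_dev, mean):
    """C.V. = (S.D. / mean) x 100."""
    return std_dev / mean * 100


# Summary figures of the two distributions from the problem above.
distributions = {
    "A": {"mean": 100, "median": 90, "sd": 10},
    "B": {"mean": 90, "median": 80, "sd": 10},
}

for name, d in distributions.items():
    cv = coefficient_of_variation(d["sd"], d["mean"])
    sk = pearson_skewness(d["mean"], d["median"], d["sd"])
    print(f"Distribution {name}: C.V. = {cv:.2f}%, Sk_p = {sk:.1f}")
```

Equal standard deviations do not imply equal relative variation, because the C.V. also depends on the mean. The skewness coefficients match here because the gap between mean and median is 10 in both distributions.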