Unit 05 - 08
INTRODUCTION
Visual presentation of data discloses some characteristic features of a mass of data. Further
summarization is essential to show the relationship between variables and to correlate one
variable with another. To describe the characteristic features of an entire mass of data with a
single value, the most obvious measure, and one that helps in making quicker and better
decisions, is the measure of central tendency, also called the average. An average gives a
bird's-eye view of a huge mass of data that is not easily intelligible, since it is a numerical
value at a central point about which the other values in a series are dispersed.
CONTENTS:
5.1 Definition
5.2 Purpose of Average
5.3 Requisites of a good average
5.4 Glossary
5.5 References
5.1 DEFINITION
Statistics provides tools to reduce each group of values to a single summary figure
representing that group. These representative values are called averages (measures of
central tendency). In other words, they are measures that condense a huge, unwieldy set of
numerical data into a single value. This value always lies between the minimum and maximum
values; that is, it tends to be somewhere at the center. In general, the measures of central
tendency are divided into two groups:
1. Mathematical Measures of Central Tendency
2. Positional Measures of Central Tendency
5.4 GLOSSARY
Fluctuation - Move up and down or be irregular (of price, level, etc.)
Extreme values - Refers to the largest or smallest values taken by the members of a set. The
expression also covers values neighboring these end values.
Inference - Drawing conclusion from facts or by reasoning.
Parameter - Refers to characteristic or determining feature.
5.5 REFERENCES
Business Statistics, C.R. REEDY. M Com Ph. D., 1994
Business Statistics [A textbook for B.Com. Students of Indian Universities]. R.H.
DHARESHWAR, M.Sc. M.Phil. 1999
6.1 INTRODUCTION
The definition should not leave anything to the discretion of the person who calculates the
average. Averages should be computed with sufficient ease and rapidity; that is, they should not
involve much mathematical complexity. The most popular and widely used measure for
representing the entire data by one value is the arithmetic mean.
The summation operator, $\sum$, implies that the values that follow it are to be summed or added
together. In $\sum_{i=1}^{n} x_i$, i = 1 is the lower limit and n is the upper limit.
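Properties:
1. The summation of sums or differences
$\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$ , $\sum_{i=1}^{n} (x_i - y_i) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i$
Example: Suppose x1 = 1 , x2 = 3 , x3 = 4 , y1 = 2 , y2 = 5 , y3 = 3
Then $\sum_{i=1}^{3} (x_i + y_i) = \sum_{i=1}^{3} x_i + \sum_{i=1}^{3} y_i$ , i.e. 3 + 8 + 7 = 8 + 10 = 18
$\sum_{i=1}^{3} (x_i - y_i) = \sum_{i=1}^{3} x_i - \sum_{i=1}^{3} y_i$ …… left for the student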
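2. Multiplication by a constant
$\sum_{i=1}^{n} k x_i = k \sum_{i=1}^{n} x_i$
3. Summation of a constant
$\sum_{i=1}^{n} k = nk$
Example: $\sum_{i=1}^{4} 6 = 4(6)$
6 + 6 + 6 + 6 = 24
More generally, $\sum_{i=m}^{n} k = (n - m + 1)k$ for m < n
Example: $\sum_{i=4}^{6} 8 = (6 - 4 + 1)8$
8 + 8 + 8 = 3(8)
24 = 24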
4. Sum of summations
$\sum_{i=1}^{k} x_i + \sum_{i=k+1}^{n} x_i = \sum_{i=1}^{n} x_i$ for any k < n
Example: with k = 3 and n = 6,
$\sum_{i=1}^{3} x_i + \sum_{i=4}^{6} x_i = \sum_{i=1}^{6} x_i$
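The summation properties above can be checked numerically. The following is a minimal Python sketch; the x and y values come from the first example, while the constant `k = 6` and the split point `split` are illustrative choices, not part of the original text.

```python
# Values from the example for property 1.
x = [1, 3, 4]
y = [2, 5, 3]

# Property 1: the sum of (x_i + y_i) equals sum(x_i) + sum(y_i).
sum_of_sums = sum(a + b for a, b in zip(x, y))
sums_added = sum(x) + sum(y)

# Property 2: a constant factor can be taken outside the summation.
k = 6  # illustrative constant
scaled_inside = sum(k * a for a in x)
scaled_outside = k * sum(x)

# Property 3: summing a constant from i = m to i = n gives (n - m + 1) * k.
m, n = 4, 6
constant_sum = sum(8 for _ in range(m, n + 1))

# Property 4: a summation can be split at any point inside its range.
split = 2  # illustrative split point
split_sum = sum(x[:split]) + sum(x[split:])

print(sum_of_sums, sums_added, scaled_inside, constant_sum, split_sum)
```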
CYP 1 Let $\sum_{i=3}^{6} x_i = 10$ , $\sum_{i=3}^{6} x_i^2 = 148$ , x1 = 3 , x2 = 2
6 6 6 6
xi xi xi ( xi 2)
2
Find i. ii. iii. iv. ( 2 xi 3) 2
i 1 i 1 i 1 i 1
2
v. i 1
(ixi 4)
6.3.0 Definition
The arithmetic mean is the sum of the values in a group divided by the number of items in that
group. Let x1, x2, …, xn be n values of a variable x; then their arithmetic mean is defined by:
$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
Where $\sum x_i$ – sum of all observations
n – total number of observations
Direct method: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Short cut method: $\bar{x} = A + \frac{\sum_{i=1}^{n} d_i}{n}$
Where n – number of items, A – assumed mean, $d_i = x_i - A$ – deviation of each value from the assumed mean
Example: Find the arithmetic mean for the following data by
i. direct method ii. short cut method
23.4 15.6 22.1 20.0 26.7 31.4 18.9 22.3
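Solution:
i. $\sum_{i=1}^{8} x_i = 180.4$ , n = 8
$\bar{x} = \frac{\sum_{i=1}^{8} x_i}{n} = \frac{180.4}{8} = 22.55$
ii. Let A = 22; then di : 1.4, -6.4, 0.1, -2, 4.7, 9.4, -3.1, 0.3
$\sum_{i=1}^{8} d_i = 4.4$ , n = 8
$\bar{x} = A + \frac{\sum_{i=1}^{8} d_i}{n} = 22 + \frac{4.4}{8} = 22 + 0.55 = 22.55$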
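The two methods can be compared in code. This is a small Python sketch using the eight values of the example; the function names `mean_direct` and `mean_shortcut` are illustrative, not taken from the text.

```python
data = [23.4, 15.6, 22.1, 20.0, 26.7, 31.4, 18.9, 22.3]

def mean_direct(values):
    """Direct method: sum of the values divided by their count."""
    return sum(values) / len(values)

def mean_shortcut(values, assumed_mean):
    """Short cut method: assumed mean plus the mean of the deviations."""
    deviations = [v - assumed_mean for v in values]
    return assumed_mean + sum(deviations) / len(values)

direct = mean_direct(data)
shortcut = mean_shortcut(data, 22)
print(round(direct, 2), round(shortcut, 2))
```

Any assumed mean gives the same result; a value close to the true mean only keeps the deviations small for hand calculation.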
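For grouped data:
For Discrete Series:
i. $\bar{x} = \frac{\sum fx}{n} = \frac{2010}{50} = 40.20$
ii. $\bar{x} = A + \frac{\sum fd}{n} = 40 + \frac{10}{50} = 40.20$
For continuous series:
i. $\bar{x} = \frac{\sum f \cdot cm}{n} = \frac{3020}{80} = 37.75$
ii. $\bar{x} = A + \frac{\sum fd}{n} = 30 + \frac{620}{80} = 30 + 7.75 = 37.75$
iii. $\bar{x} = A + \frac{\sum fd'}{n} \times c = 30 + \frac{62}{80} \times 10 = 30 + 0.775 \times 10 = 37.75$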
iii.
Classes 10 - 15 15 - 20 20 - 25 25 - 30 30 - 35
Frequencies 5 6 7 7 5
2) The algebraic sum of the deviations of the given values from the arithmetic mean is equal
to zero. Mathematically,
$\sum (x_i - \bar{x}) = 0$ … for ungrouped data
$\sum f_i (x_i - \bar{x}) = 0$ … for grouped data
Because of this property, the arithmetic mean may be characterized as a point of ‘Balance’.
3) The sum of squares of deviations is minimum when deviations are taken from the
arithmetic mean, i.e.
$\sum (x_i - \bar{x})^2 < \sum (x_i - A)^2$ … for ungrouped data
where A is any value other than the mean.
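4) Suppose the mean of the values x1 , x2, … , xn is $\bar{x}_0$. Then
i. if a constant k is added to each xi, then the new mean $\bar{x}_n$ will be $\bar{x}_0 + k$.
Proof: Arithmetic mean of x1 + k, x2 + k, …, xn + k is
$A.M = \frac{\sum_{i=1}^{n} (x_i + k)}{n} = \frac{(x_1 + k) + (x_2 + k) + \dots + (x_n + k)}{n}$
$A.M = \frac{(x_1 + x_2 + \dots + x_n) + (k + k + \dots + k)}{n}$
$A.M = \frac{x_1 + x_2 + \dots + x_n}{n} + \frac{nk}{n}$
$A.M = \bar{x}_0 + k$
ii. if each value is multiplied by a constant k, then the new mean will be $k\bar{x}_0$
Proof: A.M for kx1 , kx2, … kxn, is
$A.M = \frac{\sum_{i=1}^{n} k x_i}{n} = \frac{kx_1 + kx_2 + \dots + kx_n}{n}$
$A.M = \frac{k(x_1 + x_2 + \dots + x_n)}{n}$
$A.M = k\bar{x}_0$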
Example: Given data 12, 10, 8, 6, 16, 7, 11. If each item is multiplied by 4/5 and 8 is added,
what will be the new mean?
$\bar{x}_0 = \frac{\sum_{i=1}^{7} x_i}{7} = \frac{70}{7} = 10$
New mean: $\bar{x}_n = \frac{4}{5}\bar{x}_0 + 8 = \frac{4}{5}(10) + 8 = 16$
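Properties 2 and 4 can be confirmed on the data of this example. The sketch below is an illustration in Python; the variable names are arbitrary.

```python
data = [12, 10, 8, 6, 16, 7, 11]

old_mean = sum(data) / len(data)

# Property 4: multiply each item by 4/5 and add 8, then average again.
transformed = [0.8 * v + 8 for v in data]
new_mean = sum(transformed) / len(transformed)

# Property 2: deviations from the arithmetic mean sum to zero.
deviation_sum = sum(v - old_mean for v in data)

print(old_mean, round(new_mean, 6), round(deviation_sum, 6))
```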
CYP 3 Given data 3, 8, 9, 4, 7, 5, 10, 11, 6 if each item is multiplied by 2 and 6 is added, then
i. The new mean will be _______________
ii. $\sum (x_i - \bar{x})$ = __________________
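Combined mean: $\bar{x}_c = \frac{N_1\bar{x}_1 + N_2\bar{x}_2 + \dots + N_n\bar{x}_n}{N_1 + N_2 + \dots + N_n} = \frac{\sum_{i=1}^{n} N_i\bar{x}_i}{\sum_{i=1}^{n} N_i}$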
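Example: The mean heights of 25 males and 20 females are 161.0 cm and 155.6 cm respectively.
What will be the combined mean height?
$\bar{x}_M$ = 161.0 cm, $\bar{x}_F$ = 155.6 cm, $N_M$ = 25, $N_F$ = 20
$\bar{x}_c = \frac{\bar{x}_M N_M + \bar{x}_F N_F}{N_M + N_F}$
$\bar{x}_c = \frac{161.0 \times 25 + 155.6 \times 20}{25 + 20} = \frac{7137}{45} = 158.60$ cm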
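The combined mean generalizes to any number of groups. A minimal Python sketch, with the hypothetical helper name `combined_mean`, applied to the height example:

```python
def combined_mean(means, sizes):
    """Mean of all items pooled together, from group means and group sizes."""
    total = sum(m * n for m, n in zip(means, sizes))
    return total / sum(sizes)

# Mean heights (cm) and group sizes for males and females.
height_mean = combined_mean([161.0, 155.6], [25, 20])
print(round(height_mean, 2))
```

Note that this differs from the simple average of the group means whenever the group sizes are unequal.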
CYP 4 In a factory, 120 workers get an average wage of birr 30 a day, 160 workers get Birr 50
a day, 80 workers get Birr 60 a day and 40 workers get birr 80 a day. Find
i. the average of averages.
ii. the general average.
An item or value may be relatively more important or less important than other items. This
relative importance is technically known as weight. In case where the relative importance of the
different items is not the same we compute weighted arithmetic mean.
If w1, w2, …, wn are weights attached to the values x1, x2, … , xn respectively, then the weighted
AM is defined as
$\bar{x}_w = \frac{x_1 w_1 + x_2 w_2 + \dots + x_n w_n}{w_1 + w_2 + \dots + w_n} = \frac{\sum wx}{\sum w}$
Example: An auto ride costs Birr 5 for the first km, Birr 4 for the next 3kms and Birr 9 for each
of the subsequent kms. Find the average cost per km for 10 kms.
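Cost per km (x)   No. of kms (w)   xw
5.00              1                5.00
4.00              3                12.00
9.00              6                54.00
Total             10               71.00
$\bar{x}_w = \frac{\sum xw}{\sum w} = \frac{71.00}{10} = 7.10$ Birr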
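The weighted arithmetic mean is a one-line computation. The sketch below is in Python; `weighted_mean` is an illustrative name, and the fares and distances are those stated in the auto ride example.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

fares_per_km = [5, 4, 9]   # Birr per km in each stage of the ride
kms = [1, 3, 6]            # kilometres travelled at each fare (10 km in total)

average_cost = weighted_mean(fares_per_km, kms)
print(round(average_cost, 2))
```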
Examples:
1. The average mark of 100 students was found to be 40, but later it was discovered that a score
of 33 was misread as 83. Find the correct average corresponding to the correct sum.
$\bar{x} = 40$ , N = 100
$\sum x_i = \bar{x} \cdot N = 40 \times 100 = 4000$ … wrong sum
Wrong Entry = 83
Correct Entry = 33
Correct Mean $= \frac{4000 - 83 + 33}{100} = \frac{3950}{100} = 39.5$
2. The average age of a class having 35 pupils is 14 years. When the age of the class teacher is
added to the sum of the ages of the pupils, the average rises by 0.5 year. What must be the
age of the teacher?
$\bar{x} = 14$ , N = 35
$\sum x_i = 14 \times 35 = 490$ … Sum of ages of the pupils
$\bar{x} = 14.5$ , N = 36
$\sum x_i = 14.5 \times 36 = 522$ … Sum of ages of the pupils and the teacher
Age of the teacher = 522 - 490 = 32 years
CYP6 The mean of 200 items is 50. Later on it is discovered that two items were wrongly
taken as 92 and 8 instead of 192 and 88. Find out the correct mean.
CYP7 The average rainfall for a week, excluding Sundays, was 10cm. Due to heavy rainfall on
Sunday, the average rainfall for the week rose to 15cm. How much rain fall was there
on Sunday?
6.4.0 DEFINITION
Geometric mean is defined as the nth root of the product of n items or values of a series. If there
are two items, we take the square root; if there are three items, the cube root; and so on.
Symbolically, let x1, x2, … , xn be the n values of a variable x; then their G.M is defined as
$G.M = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$
If the number of observations is three or more, the computation of the nth root is very
tedious. To simplify the computation, logarithms are used. In terms of logs,
$\log G.M = \log (x_1 \cdot x_2 \cdots x_n)^{1/n}$
$= \frac{1}{n} \log (x_1 \cdot x_2 \cdots x_n)$
$= \frac{1}{n} (\log x_1 + \log x_2 + \dots + \log x_n)$
$= \frac{1}{n} \sum_{i=1}^{n} \log x_i$
$\text{Antilog}(\log G.M) = \text{Antilog}\left[\frac{1}{n} \sum_{i=1}^{n} \log x_i\right]$
$G.M = \text{Antilog}\left[\frac{1}{n} \sum_{i=1}^{n} \log x_i\right]$
For ungrouped data: $G.M = \text{Antilog}\left[\frac{1}{n} \sum_{i=1}^{n} \log x_i\right]$
For grouped data: $G.M = \text{Antilog}\left[\frac{1}{n} \sum_{i=1}^{n} f_i \log x_i\right]$ , where $n = \sum f_i$
Log x
i 1
i 9.4021 n = 5
1
G.M Anti log 9.4021 Anti log 1.5670 36.9
5
ii. x:            10      16      22      28      34
f:                5       4       3       6       2
Log x:            1.0000  1.2041  1.3424  1.4472  1.5315
fi log xi:        5.0000  4.8164  4.0272  8.6832  3.0630
$\sum f_i \log x_i = 25.5898$ , $n = \sum f_i = 20$
$G.M = \text{Antilog}\left[\frac{1}{20}(25.5898)\right] = \text{Antilog}(1.2795) \approx 19.0$
iii. Classes:     30 – 40  40 – 50  50 – 60  60 – 70
fi :              5        8        4        3
CMi :             35       45       55       65
Log CMi :         1.5441   1.6532   1.7404   1.8129
fi Log CMi :      7.7203   13.2257  6.9615   5.4387
$\sum f_i \log CM_i = 33.3462$ , $n = \sum f_i = 20$
$G.M = \text{Antilog}\left[\frac{1}{20}(33.3462)\right] = \text{Antilog}(1.6673) \approx 46.5$
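The logarithmic formula for the geometric mean can be coded directly, which is a convenient way to check arithmetic done with log tables. The following Python sketch is an illustration; the function name `geometric_mean` and its optional `freqs` argument are assumptions, not notation from the text. It recomputes examples ii and iii from their raw data using exact logarithms.

```python
import math

def geometric_mean(values, freqs=None):
    """G.M = antilog( sum(f * log10(x)) / sum(f) ); freqs default to 1 each."""
    if freqs is None:
        freqs = [1] * len(values)
    log_sum = sum(f * math.log10(v) for v, f in zip(values, freqs))
    return 10 ** (log_sum / sum(freqs))

# Example ii: discrete series.
gm_discrete = geometric_mean([10, 16, 22, 28, 34], [5, 4, 3, 6, 2])

# Example iii: continuous series, using class marks 35, 45, 55, 65.
gm_classes = geometric_mean([35, 45, 55, 65], [5, 8, 4, 3])

print(round(gm_discrete, 2), round(gm_classes, 2))
```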
CYP8 Calculate GM for the following data.
i. x: 8 40 175 1209 2000
ii. x: 2 3 4 5 6
f: 5 7 8 3 2
iii. Classes: 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60
fi : 2 5 6 18 13 6
f      5     7     8     3     2
f/x    2.5   2.33  2     0.6   0.33
$\sum f_i = 25$ , $\sum \frac{f_i}{x_i} = 7.76$
$H.M = \frac{\sum f_i}{\sum (f_i / x_i)} = \frac{25}{7.76} = 3.22$
iii.
Classes   20 - 24  25 - 29  30 - 34  35 - 39  40 - 44  45 - 49  50 - 54
fi        11       18       32       37       21       47       13
Cmi       22       27       32       37       42       47       52
fi/Cmi    0.5      0.67     1        1        0.5      1        0.25
$\sum f_i = 179$ , $\sum \frac{f_i}{CM_i} = 4.92$
$H.M = \frac{\sum f_i}{\sum (f_i / CM_i)} = \frac{179}{4.92} = 36.38$
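The grouped harmonic mean used in these solutions can be sketched in a few lines of Python. The function name `harmonic_mean` is illustrative. Because the code does not round the individual f/CM terms, its result may differ in the second decimal place from the hand calculation.

```python
def harmonic_mean(values, freqs):
    """H.M = sum(f) / sum(f / x) for a frequency distribution."""
    return sum(freqs) / sum(f / v for v, f in zip(values, freqs))

class_marks = [22, 27, 32, 37, 42, 47, 52]
freqs = [11, 18, 32, 37, 21, 47, 13]

hm = harmonic_mean(class_marks, freqs)
print(round(hm, 2))
```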
ii.
Marks 40 50 60 70
No. of Students 20 30 50 10
iii.
Classes 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70
fi 4 6 10 12 5 3
6.6.1 Advantages:
All three means (AM, GM and HM) are
i. rigidly defined.
ii. based on all the observations.
iii. suitable for further mathematical treatment.
AM iv. is easy to calculate and understand.
v. is least affected by fluctuations of sampling compared to other averages.
GM iv. gives more weight to smaller values and less weight to larger values.
v. is the proper average for measuring relative change (such as the percentage increase in
population or sales over a period of time).
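Consider $\frac{x + y}{2} > \sqrt{xy}$ for two unequal positive values x and y. This is proved above.
Multiplying both sides by $\frac{2\sqrt{xy}}{x + y}$, we get
$\sqrt{xy} > \frac{2xy}{x + y}$
Since $\frac{2xy}{x + y} = \frac{2}{\frac{x + y}{xy}} = \frac{2}{\frac{1}{x} + \frac{1}{y}}$ ,
$\sqrt{xy} > \frac{2}{\frac{1}{x} + \frac{1}{y}}$ , i.e. GM > HM
Therefore, HM < GM < AM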
Note- We can have the following relationship between the three means.
$AM \cdot HM = \frac{x + y}{2} \cdot \frac{2xy}{x + y} = xy$
To equalize AM . HM to GM, we put AM . HM under the square root:
$GM = \sqrt{xy} = \sqrt{\frac{x + y}{2} \cdot \frac{2xy}{x + y}} = \sqrt{AM \cdot HM}$
$GM^2 = AM \cdot HM$ , if there are only two positive observations in the series.
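Both results, the ordering HM < GM < AM and the identity GM² = AM . HM, are easy to confirm for a pair of numbers. In the Python sketch below, the values x = 4 and y = 9 are arbitrary illustrative choices.

```python
import math

x, y = 4.0, 9.0  # any two unequal positive values would do

am = (x + y) / 2            # arithmetic mean
gm = math.sqrt(x * y)       # geometric mean
hm = 2 * x * y / (x + y)    # harmonic mean

print(am, gm, round(hm, 4))
print(hm < gm < am, math.isclose(gm ** 2, am * hm))
```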
2. The price of a commodity increased by 5% from 1979 to 1980, by 9% from 1980 to 1981
and by 73% from 1981 to 1982. The average increase from 1979 to 1982 is quoted as 25.6%
and not 29%. Verify.
Solution:
Year   Price at the end of the year, taking the preceding year as 100% (X)   Log X
1980   100 + 5 = 105                                                          2.0212
1981   100 + 9 = 109                                                          2.0374
1982   100 + 73 = 173                                                         2.2380
Total                                                                         6.2966
AM = (5 + 9 + 73) / 3 = 87 / 3 = 29
GM = Antilog[1/3(6.2966)] = Antilog(2.0989) = 125.6
Therefore, Rise in price is 125.6 - 100 = 25.6%
Verification:
Year   Rise   Price would be   Growth 25.6%   Growth 29%
1979          100              100            100
1980   5%     105              125.6          129
1981   9%     114.45           157.75         166.41
1982   73%    198              198            214.67
Thus GM is the best average to give us the true rise in price.
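The verification table can be reproduced programmatically. The following Python sketch, with illustrative variable names, compounds the three yearly rises and compares the final price with what a constant geometric-mean rate and a constant arithmetic-mean rate would give.

```python
import math

rises = [5, 9, 73]                         # yearly percentage rises
factors = [1 + r / 100 for r in rises]     # 1.05, 1.09, 1.73

# Actual price in 1982 starting from 100 in 1979.
final_price = 100 * math.prod(factors)

# Average rate from the geometric mean of the growth factors.
gm_factor = math.prod(factors) ** (1 / len(factors))
gm_rate = (gm_factor - 1) * 100

# Average rate from the arithmetic mean of the rises.
am_rate = sum(rises) / len(rises)

price_with_gm = 100 * gm_factor ** 3
price_with_am = 100 * (1 + am_rate / 100) ** 3

print(round(final_price, 2), round(gm_rate, 1), round(price_with_gm, 2), round(price_with_am, 2))
```

Only the geometric-mean rate reproduces the actual final price; the arithmetic-mean rate overshoots it.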
3. World Population has increased from 5 billion to 6 billion within 12 years. Calculate the
average increment per year.
Solution:
The average annual increase is computed by applying the formula
$P_n = P_0 (1 + r)^n$ or $r = \sqrt[n]{P_n / P_0} - 1$
Where $P_n$ - the amount at the end of the period
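As a sketch of how this formula is applied, the Python snippet below solves for r with the figures of the question, assuming P0 = 5 billion, Pn = 6 billion and n = 12 years. The function name `average_growth_rate` is illustrative.

```python
def average_growth_rate(p0, pn, n):
    """Solve Pn = P0 * (1 + r)**n for r."""
    return (pn / p0) ** (1 / n) - 1

r = average_growth_rate(5, 6, 12)
print(f"{r:.4%}")

# Compounding 5 billion at rate r for 12 years should give back 6 billion.
check = 5 * (1 + r) ** 12
```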
Solution:
Depreciation (%)   After depreciation (%) = X   Log X
40%                60%                          1.7782
25%                75%                          1.8751
10%                90%                          1.9542
10%                90%                          1.9542
10%                90%                          1.9542
Total                                           9.5159
GM = Antilog [1/5(9.5159)]
= Antilog (1.9032)
≈ 80
Rate of depreciation per annum is 100 - 80 = 20%
5. The weighted GM of 5 numbers 10, 15, 25, 12 and 20 is 17.15. If the weights of the first
four numbers are 2, 3, 5 and 2 respectively, find out the weight of the fifth number.
Solution:
X     W    Log X    (Log X).W
10    2    1.0000   2.0000
15    3    1.1761   3.5283
25    5    1.3979   6.9895
12    2    1.0792   2.1584
20    x    1.3010   1.3010x
Total 12 + x        14.6762 + 1.3010x
Log 17.15 = (14.6762 + 1.3010x) / (12 + x)
1.2343 = (14.6762 + 1.3010x) / (12 + x)
14.8116 + 1.2343x = 14.6762 + 1.3010x
-0.0667x = -0.1354
x = 2.03
The missing weight is 2.
6. A cyclist pedals from his house to his college at a speed of 8 kmph and back from the
college to home at 12 kmph. Find the average speed.
Solution:
Let the distance between the house and the college be x kms. Then the distance from house to
college is covered in x/8 hrs and from college to house in x/12 hrs.
And the total distance = 2 x (house to college and back) is covered in (x/8 + x/12)hrs.
Average Speed = Total distance traveled / Time taken
= 2x / (x/8 + x/12) = 2x / (5x/24) = 48x / 5x = 9.60 kmph
7. Mr. Raga traveled a distance of 900 kms by train at an average speed of 60 kmph, 200 km
by boat at speed of 20 kmph, 1000 km by plane at 800 kmph speed and finally 4 km by taxi
at 25 kmph speed. What is the average speed for the entire distance?
Solution:
Speed (X)   Distance (W)   W/X
60          900            15.00
20          200            10.00
800         1000           1.25
25          4              0.16
Total       2104           26.41
Weighted HM = ΣW / Σ(W/X)
= 2104 / 26.41 = 79.67 kmph.
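The weighted harmonic mean here is simply total distance divided by total time, which the following Python sketch makes explicit. Variable names are illustrative.

```python
distances = [900, 200, 1000, 4]   # km by train, boat, plane, taxi (weights W)
speeds = [60, 20, 800, 25]        # kmph for each leg (values X)

# Time for each leg is W / X hours.
total_time = sum(d / s for d, s in zip(distances, speeds))

# Weighted harmonic mean of the speeds.
average_speed = sum(distances) / total_time

print(round(total_time, 2), round(average_speed, 2))
```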
CYP 10 If the arithmetic mean and the geometric mean of two items is 12.5 and 10
respectively, then
i. find the HM of the two items.
ii. find the value of the two items.
CYP 11 A motorist travels at a uniform speed of 20 kmph, 60 kmph and 30 kmph from A to B,
B to C and C to D respectively. Find the average speed.
CYP 12 In a factory, a unit of work is completed by A in 5 minutes, by B in 7 minutes, by C in
4 minutes, by D in 8 minutes and by E in 6 minutes.
i. What is their average rate of work?
ii. What is the average number of units of work completed per minute?
iii. At this rate, how many units of work will they complete in six hours a day?
CYP 13 Find the average rate of increase in Population which in the first decade had increased
by 20%, in the next by 30% and in the third by 40%.
6.7. SUMMARY
Arithmetic mean is the average most widely used in practice, in all areas, because it is a
characteristic value that represents all the items of the variable.
Geometric mean is widely used in averaging ratios and percentages and in computing average
rates of increase or decrease.
Harmonic mean is useful in comparing the values of a variable with constant quantity of
another variable, i.e. time, rate, distance covered, quantities purchased or sold per unit etc.
3. Find the class intervals if the AM of the following distribution is 30.1 and assumed mean
is 31.5.
i. their HM is ____________
ii. the two observations are _________ and _________
CYP 3 i. 20 ii. 0
CYP 4 i. 55 ii. 49
CYP 6 50.9
CYP 7 45cm.
CYP 11 30 kmph.
CYP 12 i. 5.65 minutes. ii. 0.177 units of work / minute. iii. 63.72 units of work.
CYP 13 i. 29.7%
6.10. GLOSSARY
Assumed mean - Refers to an estimated or approximate value for the arithmetic mean or
average which is used to simplify its calculation. The nearer it is to the mean,
the smaller are the numbers involved.
Class Interval - The range of interval between the highest and lowest values allowed in a
particular class.
6.11. REFERENCES
7.1 INTRODUCTION
The mode and median are called positional measures of central tendency. The term position
refers to the place of a value in the series. Values that divide a series into a number of equal
parts are called partition values. Besides the median, which divides a series into two equal
parts, the quartiles, deciles and percentiles are important partition values.
7.2 MODE
A value that occurs most frequently in a series of observations is called the mode. So the mode
can be identified by inspecting the observations.
It is the value, which has the greatest frequency density in its immediate neighborhood.
Importance:
1. Mode can be used as a central location for qualitative as well as quantitative data, like the
median. Example, if a beauty measurement turns in to three impressions or scores, which
we rate ‘very beautiful’, ‘beautiful’ and ‘not beautiful’, then the modal value is beautiful.
2. Unlike the mean, the mode is not affected by extreme values.
3. Mode can be used when one or more of the classes are open-ended.
iii.
Classes 0-9 10 - 19 20 - 29 30 - 39 40 - 49 50 - 59 60 - 69 70 - 79
fi 328 350 720 664 598 524 378 244
Solution:
Mode $\hat{x} = l + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times c$
$= 19.5 + \frac{720 - 350}{(720 - 350) + (720 - 664)} \times 10 = 19.5 + \frac{3700}{426} = 28.1854$
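The grouped-mode formula translates directly into code. In the Python sketch below, the function name `grouped_mode` and its parameter names are illustrative; f0, f1 and f2 are the frequencies of the class before the modal class, the modal class itself, and the class after it.

```python
def grouped_mode(lower, width, f0, f1, f2):
    """Mode = l + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * c."""
    return lower + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * width

# Modal class 20 - 29: lower class boundary 19.5, class width 10.
mode = grouped_mode(lower=19.5, width=10, f0=350, f1=720, f2=664)
print(round(mode, 4))
```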
7.3. MEDIAN
The median is that value of the variable, which divides the group in to two equal parts, one part
comprising all the values greater and the other all the values less than median. Or median can be
defined as the middle value of a set of data values when they are arranged in ascending or
descending order.
Importance:
In dealing with qualitative data, the median is the more suitable average.
The median is recommended if the distribution has unequal classes, since it is simpler to
compute than the mean.
The median is especially useful in case of open-ended classes, since it is a positional
average and not a calculated one.
The magnitudes of extreme deviations do not influence the median.
x 4 6 8 10 12 14 16
f 2 4 5 3 2 4 1
iii.
x 50 - 60 60 - 70 70 - 80 80 - 90 90 - 100 100 - 110
fi 20 21 50 40 53 16
Solution:
i. a. Rearranging:
12 16 18 23 25 25 27 27 28 33 33 42
n = 12 … even
$\tilde{x} = \frac{1}{2}\left(x_{n/2} + x_{n/2 + 1}\right) = \frac{1}{2}(x_6 + x_7) = \frac{1}{2}(25 + 27) = 26$
b. Rearranging: 2 5 6 8 10 15 25
n = 7 … odd
$\tilde{x} = x_{(n+1)/2} = x_4 = 8$
ii.
x 4 6 8 10 12 14 16
f 2 4 5 3 2 4 1
<cfi 2 6 11 14 16 20 21
n = 21 Median = the value of the ((N + 1)/2)th item
= the value of the ((21 + 1)/2)th item
= the value of the 11th item
= 8
iii.
x 50 - 60 60 - 70 70 - 80 80 - 90 90 - 100 100 - 110
fi 20 21 50 40 53 16
<cfi 20 41 91 131 184 200
Median class = class containing the (n/2)th item = 100th item, i.e. 80 - 90
Median $\tilde{x} = l + \frac{c}{f}\left(\frac{n}{2} - c.f\right) = 80 + \frac{10}{40}(100 - 91) = 80 + \frac{9}{4} = 82.25$
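The steps of solution iii (accumulate frequencies, locate the median class, interpolate within it) can be sketched as a short Python function. The name `grouped_median` and the representation of classes as (lower, upper) pairs are assumptions made for illustration.

```python
def grouped_median(classes, freqs):
    """Median = l + (c / f) * (n/2 - c.f), found by scanning cumulative frequencies."""
    n = sum(freqs)
    cumulative = 0  # c.f: cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= n / 2:
            return lower + (upper - lower) / f * (n / 2 - cumulative)
        cumulative += f
    raise ValueError("empty distribution")

classes = [(50, 60), (60, 70), (70, 80), (80, 90), (90, 100), (100, 110)]
freqs = [20, 21, 50, 40, 53, 16]

median = grouped_median(classes, freqs)
print(median)
```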
ii.
x 28 30 32 34 36 38 40 42
f 14 15 16 24 16 10 6 4
iii.
x 30 - 34 35 - 39 40 - 44 45 - 49 50 - 54 55 - 59
fi 5 10 15 20 6 4
Draw both the more than and less than ogives on the same graph. From the point of intersection
of these two curves, draw a perpendicular line to the x – axis. The foot of the perpendicular line
is the value of the median.
Classes 0 - 20 20 - 40 40 - 60 60 - 80 80 - 100
fi 15 25 30 14 16
Solution:
Classes 0 - 20 20 - 40 40 - 60 60 - 80 80 - 100
fi 15 25 30 14 16
<cfi 15 40 70 84 100
>cfi 100 85 60 30 16
[Figure: less-than (<cfi) and more-than (>cfi) ogives; cumulative frequency (CFi) on the vertical
axis, class boundaries (CBi) on the horizontal axis.]
The perpendicular line drawn from the intersection point meets the x-axis approximately at 46.
Therefore, the Median of the distribution is 46.
Importance:
The quartiles are more widely used in Economics and Business while the deciles and percentiles
are important in Psychology and Educational Statistics concerning grades, rates, ranks, etc. The
working principle for computing the partition value is basically the same as that of computing
the median.
Example: For the data given below, compute the value of Quartiles, D3, D7, P15 and P88 and
interpret.
Solution:
Q1 – size of the (N/4)th item = 25th item; quartile class 10 – 20
l = 10, c = 10, f = 15, c.f = 10
$Q_1 = l + \frac{c}{f}\left(\frac{n}{4} - c.f\right) = 10 + \frac{10}{15}(25 - 10) = 20$
The marks of 25% of the students are below 20.
Q2 – size of the (2N/4)th item = 50th item; quartile class 20 – 40
l = 20, c = 20, f = 25, c.f = 25
$Q_2 = l + \frac{c}{f}\left(\frac{n}{2} - c.f\right) = 20 + \frac{20}{25}(50 - 25) = 40$
The marks of half of the students are below 40.
Q3 – size of the (3N/4)th item = 75th item; quartile class 40 – 60
l = 40, c = 20, f = 30, c.f = 50
$Q_3 = l + \frac{c}{f}\left(\frac{3n}{4} - c.f\right) = 40 + \frac{20}{30}(75 - 50) = 56.67$
The marks of three-fourths of the students are below 56.67.
D3 – size of the (3N/10)th item = 30th item; decile class 20 – 40
l = 20, c = 20, f = 25, c.f = 25
$D_3 = l + \frac{c}{f}\left(\frac{3n}{10} - c.f\right) = 20 + \frac{20}{25}(30 - 25) = 24$
The marks of 30% of the students are below 24.
D7 – size of the (7N/10)th item = 70th item; decile class 40 – 60
l = 40, c = 20, f = 30, c.f = 50
$D_7 = l + \frac{c}{f}\left(\frac{7n}{10} - c.f\right) = 40 + \frac{20}{30}(70 - 50) = 53.33$
The marks of 70% of the students are below 53.33.
P15 – size of the (15N/100)th item = 15th item; percentile class 10 – 20
l = 10, c = 10, f = 15, c.f = 10
$P_{15} = l + \frac{c}{f}\left(\frac{15n}{100} - c.f\right) = 10 + \frac{10}{15}(15 - 10) = 13.3$
The marks of 15% of the students are below 13.3.
P88 – size of the (88N/100)th item = 88th item; percentile class 60 – 80
l = 60, c = 20, f = 14, c.f = 80
$P_{88} = l + \frac{c}{f}\left(\frac{88n}{100} - c.f\right) = 60 + \frac{20}{14}(88 - 80) = 71.43$
The marks of 88% of the students are below 71.43.
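Since every partition value follows the same interpolation rule, one function can serve quartiles, deciles and percentiles alike. The Python sketch below is an illustration under stated assumptions: the function name `partition_value` is hypothetical, and the frequency table is inferred from the l, c, f and c.f values quoted in the solution (N = 100). In particular, the first class 0 – 10 with frequency 10 and the last class 80 – 100 with frequency 6 are assumptions, since the original table is not shown.

```python
def partition_value(classes, freqs, k, parts):
    """k-th partition value when the data are split into `parts` equal parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    target = k * n / parts   # position of the required item
    cumulative = 0           # c.f before the current class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f
    raise ValueError("k must not exceed parts")

# Assumed table, reconstructed from the values used in the solution.
classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [10, 15, 25, 30, 14, 6]

q1, q2, q3 = (partition_value(classes, freqs, k, 4) for k in (1, 2, 3))
d3 = partition_value(classes, freqs, 3, 10)
d7 = partition_value(classes, freqs, 7, 10)
p15 = partition_value(classes, freqs, 15, 100)
p88 = partition_value(classes, freqs, 88, 100)

print(q1, q2, round(q3, 2), d3, round(d7, 2), round(p15, 2), round(p88, 2))
```

A useful sanity check is that the values come out in order of position: P15 < Q1 < D3 < Q2 < D7 < Q3 < P88.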
CYP 16 Compute the value of Quartiles, D4, P69 and interpret for the data given below.
i. 46 35 28 52 54 43 35 49 46 50 41
ii.
Daily Wages 40 45 50 55 60 65 70
No. of Workers 9 22 26 18 13 8 5
iii.
Rent in Birr    150-250 250-350 350-450 450-550 550-650 650-750 750-850 850-950
No. of Houses   8       10      15      25      40      20      15      7
7.5 SUMMARY
The arithmetic mean and median satisfy the conditions of rigid definition and stability. The
median has a distinct merit over the mean insofar as ease of calculation is concerned. The mode
can be located just by inspection; if every value occurs the same number of times, however, the
mode is a useless measure. The median, quartiles, deciles and percentiles are closely related,
since all are partition values computed on the same principle.
Weight in Kgs 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80 80 - 90
No. of People 18 37 45 27 15 8
3. For the data given below, find the missing frequencies if median is 37 and mode is 43
million birr.
Fund raised in millions of birr   0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60
No. of NGO’s                      3      F2      16      20      F2      16
4. The following distribution shows the marks of 60 students in Economics. Calculate Q3,
D5, P57 and the median.
Marks 31 - 39 41 - 49 51 - 59 61 - 69 71 - 79 81 - 89 91 - 99
No. of Students 12 10 12 9 6 7 4
5. For the following data Q1 is found to be 41. Find the missing frequency.
Classes 30 - 34 35 - 39 40 - 44 45 - 49 50 - 54 55 - 59
fi 8 10 f3 20 12 25
CYP16 i. Q1 = 35 Q2 = 46 Q3 = 50 D4 = 43 P69 = 50
ii.Q1 = 45 Q2 = 50 Q3 = 60 D4 = 50 P69 = 55
iii.Q1 = 458 Q2 = 580 Q3 = 685 D4 = 542 P69 = 646.5
7.8 GLOSSARY
7.9 REFERENCES
8.1 INTRODUCTION
For a moderately symmetric distribution, median lies between mean and mode. An approximate
relationship among these averages is:
Mean – Mode = 3 (Mean – Median) or
Mean – Median = 1/3 (Mean – Mode).
From this empirical relationship, we can see that the median is closer to the mean than the mode
is. If the maximum frequency is repeated, or if the grouping gives two modal classes, then the
distribution is called a bi-modal distribution. In such a situation, the mode is obtained from:
Mean – Mode = 3 (Mean – Median) or
Mode = 3 Median – 2 Mean
Example: Find the value of mode for the following distribution.
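Wages            0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80
No. of Persons   10     40      20      0       10      40      16      14
$\bar{x} = \frac{\sum f_i x_i}{N} = \frac{5890}{150} = 39.2667$
$\tilde{x} = l + \frac{c}{f}\left(\frac{n}{2} - c.f\right) = 40 + \frac{10}{10}(75 - 70) = 45$
Then
$\hat{x} = 3\tilde{x} - 2\bar{x} = 3(45) - 2(39.2667) = 135 - 78.5334 = 56.4666$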
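The empirical relationship is straightforward to apply in code. The Python sketch below recomputes the mean from class marks and takes the median of 45 from the worked solution; the function name `empirical_mode` is illustrative. The last decimal may differ slightly from the hand calculation because the mean is not rounded first.

```python
def empirical_mode(mean, median):
    """Mode = 3 * Median - 2 * Mean (empirical relationship)."""
    return 3 * median - 2 * mean

class_marks = [5, 15, 25, 35, 45, 55, 65, 75]
freqs = [10, 40, 20, 0, 10, 40, 16, 14]

mean = sum(f * x for f, x in zip(freqs, class_marks)) / sum(freqs)
median = 45  # from the median formula in the worked solution

mode = empirical_mode(mean, median)
print(round(mean, 4), round(mode, 4))
```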
CYP 19 Calculate mode using the empirical relationship of mean and median for the
following distribution.
fi 5 15 28 24 17 10 1
A distribution is said to be symmetrical when the values of the variables, equidistant from the
mean, have equal frequencies.
Consider the following frequency distribution
Classes 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80 80 - 90
fi 12 18 25 36 25 18 12
In this distribution, the mirror images of the frequencies with respect to the central frequency are
present on both sides. Such a distribution is called a Symmetric Frequency Distribution. If we
calculate the mean, median and mode for this distribution, we find that $\bar{x} = \tilde{x} = \hat{x} = 55$.
Symmetric               Positively Skewed        Negatively Skewed
Mean = Median = Mode    Mean > Median > Mode     Mean < Median < Mode
Q2 – Q1 = Q3 – Q2       Q2 – Q1 < Q3 – Q2        Q2 – Q1 > Q3 – Q2
fi 7 18 35 34 6
CMi 61 64 67 70 73
< Cfi 7 25 60 94 100
fi CMi 427 1152 2345 2380 438
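$\bar{x} = \frac{\sum f_i x_i}{N} = \frac{6742}{100} = 67.42$
$\tilde{x} = l + \frac{c}{f}\left(\frac{n}{2} - c.f\right) = 65.5 + \frac{3}{35}(50 - 25) = 65.5 + 2.1 = 67.6$
$\hat{x} = l + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times c = 65.5 + \frac{35 - 18}{(35 - 18) + (35 - 34)} \times 3 = 65.5 + \frac{51}{18} = 68.3$
Since $\bar{x} < \tilde{x} < \hat{x}$ , the distribution is negatively skewed.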
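The comparison of mean, median and mode used to test skewness can be written as a small decision function. This Python sketch is an illustration; the function name `skew_direction`, the tolerance `tol` and the "undetermined" fallback are assumptions rather than part of the text.

```python
def skew_direction(mean, median, mode, tol=1e-9):
    """Classify a distribution by the order of its mean, median and mode."""
    if abs(mean - median) <= tol and abs(median - mode) <= tol:
        return "symmetric"
    if mean > median > mode:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    return "undetermined"  # the three averages are not in a consistent order

print(skew_direction(67.42, 67.6, 68.3))
print(skew_direction(55, 55, 55))
```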
8.3 SUMMARY
Skewness discloses the difference between the manner in which the observations are
distributed in a particular distribution and the manner in which they are distributed in a
normal distribution.
2. For a certain symmetric distribution the first and the last deciles are 200 and 360
respectively. What is the modal value of the distribution?
3. Test the skewness of the following distribution.
8.6 GLOSSARY
Bi - modal - Refers to a distribution of data points in which two values occur more frequently
than the rest of the values in the data set.
Empirical - Derived from or relating to experiment and observation, rather than theory.
Skewness - A form of asymmetry in a frequency distribution.
Symmetric FD -A frequency distribution in which the distribution of frequencies is identical on
both sides of the mode. The Mean, Median and Mode coincide.
8.7 REFERENCES