Slides
Definition of Statistics:
Statistics is concerned with scientific methods for collecting, organizing, summarizing,
presenting and analyzing sample data as well as drawing valid conclusions about
population characteristics and making reasonable decisions on the basis of such analysis.
According to Lovitt, statistics is the science which deals with the collection, classification
and tabulation of numerical facts as the basis for the explanation, description and comparison
of phenomena.
Limitations of Statistics:
The drawbacks of statistics are:
(i) Statistics is not suited to the study of qualitative phenomena.
(ii) Statistics does not study individuals.
(iii) Statistical laws are not exact.
(iv) Statistics is liable to be misused.
Types of Statistics:
Statistics deals with both statistical data and statistical methods. Statistical methods are
further divided into two branches:
(i) Descriptive Statistics and
(ii) Inductive Statistics
Descriptive Statistics:
Descriptive Statistics deals with collection, tabulation, presentation and analysis of data
without considering the theory of probability. The study of frequency distribution is an
aspect of tabulation. The analytical aspects deal with the measures of central tendency,
measures of dispersion, skewness and kurtosis. The shape of the frequency curve is
studied by skewness and kurtosis. All the above measures are used for univariate, bi-
variate and multivariate data. The study of correlation, regression and association of
attributes are included in the bi-variate descriptive statistics.
Inductive Statistics:
Statistics is based on inductive logic. Inductive Statistics is concerned with making
estimates, predictions and generalizations, or reaching decisions about population based
on sample observations. The method of taking decision is known as statistical inference.
The inference is made by sampling, sampling distribution, estimation of parameter and
test regarding any hypothesis on parameter.
Statistical data:
Any measurements of one or more characteristics recorded (as a result of observation,
interview and so on) from either population or sample units are called data. Data are the
raw, disorganized facts and figures collected from any field of inquiry.
For example, the heights of 14 randomly selected persons from a group of N = 100
persons are as follows: 152, 160, 158, 155, 150, 152, 151, 150, 153, 154, 153, 154, 151,
155. This information on the heights of people constitutes data.
Types of data
Statistical data depending upon the sources are of two types, they are:
(i) Primary data
(ii) Secondary Data
Primary Data:
The data, which are collected from the main sources by basic investigation or direct
observation of the experimental units, are called primary data.
Secondary data:
The data that are collected from indirect sources, such as any institution or
organization, publication, report, journal etc., are called secondary data.
Collected data are stored in two fashions, they are:
(i) Raw data
(ii) Classified data
Raw data:
The data which are collected from sampling units and stored or recorded without any
systematic fashion are known as raw data.
For example: the number of road accidents on highways on 5 selected days in a year: 12,
10, 4, 12, 5
Classified data:
The primary data which are presented in a systematic fashion in rows or columns or even
in ordered way are known as classified data.
For example, the following data represent the number of some selected private
universities according to their number of students:
Measurement of Data:
Data are also classified as follows:
Categorical Data:
A set of data is said to be categorical if the values or observations belonging to it can be
sorted according to category. For example, people have the characteristic of 'gender' with
categories 'male' and 'female'.
Nominal Data:
A set of data is said to be nominal if the observations belonging to it can be assigned a
code in the form of a number where the numbers are simply labels. One can count but not
order or measure nominal data. For example, in a data set males could be coded as 0,
females as 1; marital status of an individual could be coded as Y if married, N if single.
Ordinal Data:
A set of data is said to be ordinal if the observations belonging to it can be ranked (put in
order) or have a rating scale attached. One can count and order, but not measure, ordinal
data. For example, suppose a group of people were asked to taste varieties of biscuit and
classify each biscuit on a rating scale of 1 to 5, representing strongly dislike, dislike,
neutral, like, strongly like. A rating of 5 indicates more enjoyment than a rating of 4, for
example, so such data are ordinal.
Interval data:
An interval scale is a scale of measurement where the distance between any two adjacent
units of measurement (or 'intervals') is the same but the zero point is arbitrary. Scores on
an interval scale can be added and subtracted but cannot be meaningfully multiplied or
divided. For example, the time interval between the starts of years 1981 and 1982 is the
same as that between 1983 and 1984, namely 365 days. The zero point, year 1 AD, is
arbitrary; time did not begin then. Other examples of interval scales include the heights of
tides, and the measurement of longitude.
Ratio data:
A ratio variable is one which can take numeric values that are actual as well as absolute.
The zero value on this scale is a true (absolute) zero. The variables height, weight, family size
etc. are examples of ratio variables.
Sources of Statistical data:
Statistical data may be collected in a variety of ways. These sources may broadly be
categorized as primary source and secondary source. Primary data come mainly from
direct field operations, which may either be a census or a specially designed survey. On
the other hand, secondary data are usually procured from already published or
unpublished documents rather than undertaking first-hand field investigations. So the
primary data collected by an agency or organization, constitute the secondary data in the
hands of other agencies. Bangladesh Bureau of Statistics (BBS), for example, conducts
occasional surveys on various aspects, such as health, migration, marriage and morbidity.
Such data in their hands are regarded as primary data. They are compiling, analyzing and
preparing periodic reports on the issues. If these data are used by some other interested
groups to serve their own purpose, the BBS data become secondary in nature to them.
Population
An aggregate of all individuals or items under investigation defined on some common
characteristics is called a population.
For example, first year honours students in statistics (session: 2007-2008) of SUST
constitute a population. Here, the common characteristics are:
(i) Students of SUST
(ii) First year honours students in statistics, and
(iii) Students of the session 2007-2008
Alternatively, a population is an aggregate of all individuals or items under investigation according to
some pre-determined objective that are available in a specified area during a specified time period.
For example, if the objective is to estimate the per capita salary of female employees
working in different garments industries in Bangladesh, then all female employees in all
industries of Bangladesh during a particular time period constitute the population.
Types of Population:
A population can be classified into different types, as shown in the following
diagram:
[Diagram: types of population]
Finite Population:
A population consisting of a finite number of individuals or items is called a finite
population. For example, first year honours students in statistics (session: 2007-2008) of
SUST constitute a finite population.
Infinite Population:
A population consisting of an infinite number of individuals or items is called an infinite
population. For example, if we toss a coin for an infinite number of times and write down
the upturned face of the coin then the sequence of Head (H) and Tail (T) (like
HHHTTHT-----) will constitute an infinite population.
Sample
A small but representative part of a population under investigation, consisting of a finite
number of individuals or items, is called a sample.
For example, a group of students, representing the first year first semester honours
students of Shahjalal University of Science and Technology is called a sample.
Random Sample:
If each individual or item in the population from which a sample has been drawn or
selected, has an equal chance of being included in the sample, then the sample is called a
random sample.
For example, if we have a complete list of 100 students and we select a sample of 20
students from these 100 students completely at random, then each of the students has an
equal chance of being included in the sample. Therefore, the sample of 20 students is a
random sample.
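The selection described in this example can be sketched in Python. This is only an illustration: the roll numbers 1 to 100 are a hypothetical stand-in for the complete list of students, and the fixed seed is an assumption made so that the draw can be reproduced.

```python
import random

# Hypothetical roll numbers 1..100 standing in for the complete list of students.
population = list(range(1, 101))

rng = random.Random(2024)             # fixed seed only so the draw is reproducible
sample = rng.sample(population, 20)   # simple random sample without replacement

print(sorted(sample))
```

Because `sample` draws without replacement, no student can appear twice, and every student on the list has the same chance of being selected.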
Variable:
A variable is a characteristic whose value can vary from person to person, object to object
or from phenomenon to phenomenon.
For example, (i) Sex is a variable which is composed of two categories, male and female,
and it varies from one person to another; (ii) Age is a variable which may vary from person to
person and may assume values 10 years, 15 years, 20 years and so on.
Types of variable:
There are two types of variables, they are
(i) Qualitative variable: A qualitative variable is one for which numerical
measurement is not possible, such as hair colour, religion, race, sex etc.
(ii) Quantitative variable: A quantitative variable is one for which the resulting
observations are numeric and thus possesses a natural ordering. Example: Age,
Height, Family size etc.
Quantitative variable can also be classified as
(i) Discrete variable: When the variable can assume only the isolated values, the
variable is called discrete variable. Example: - the number of children in a family.
(ii) Continuous variable: A variable is said to be continuous if it assumes any value
within a certain range. Example: - Age, Height, Temperature etc.
Attribute: A qualitative characteristic, when used to classify a series or individuals into
two or more mutually exclusive and exhaustive classes is called an attribute. Example: -
Sex (Male/Female), Hair Colour (Black/Gray) etc.
Condensation of Data: The primary data which are collected through survey are called
raw data. The raw data are not always suitable for proper statistical analysis. For
analytical purpose and for the purpose of comprehensive idea about the population under
investigation the data are usually ordered, classified and tabulated. The process of
ordering, classification and tabulation is called condensation of data.
Tabulation: Tabulation is the process of arranging the data in an orderly manner into
rows and columns. A statistical table is the logical listing of related quantitative
information in vertical columns and in horizontal rows with sufficient explanatory and
qualifying words, phrases and statements in the form of title, sub-titles, headings and
notes to make clear the full meaning of data and their origin.
References:
1. Islam, M.N “An Introduction to Statistics and Probability”
2. Mostafa, M.G “Methods of Statistics”
3. Bhuyan, K.C. “Methods of Statistics”
4. Shil, R.N and Debnath, S.C. “An Introduction to the Theory of Statistics”
Frequency Distribution
SOME DEFINITIONS REGARDING FREQUENCY DISTRIBUTION
FREQUENCY DISTRIBUTION
A frequency distribution is a table in which the values for a variable are grouped into classes
and observed frequencies are recorded.
CLASS
In the process of condensation, raw data are assigned to some chosen groups of appropriate
size. These groups are called classes. A class is thus an interval containing observations, each
observation being classified into one and only one class.
FREQUENCY
The number of observations or values falling into each group or class is called class frequency
or simply frequency. The frequency thus shows how many times a particular value or
observation is repeated. For example, if in a set of data, a value 10 occurs 6 times, then 6 is the
frequency of 10.
CLASS INTERVAL
Ordinarily, for numerical data, the frequencies of a particular class are bounded by two values.
The width or length of the class formed by these two boundary values is known as class
interval.
CLASS LIMITS
The smallest value of a class is technically known as the lower class limit of the interval, while
the largest value is known as the upper class limit of the interval. Thus for a class interval 15-
19, 15 is the lower limit and 19 is the upper limit.
CLASS MID-POINT
The mid-point or mid-value of a class is the value that falls in the middle of the class interval and
is obtained as the average of the two class limits. For the class interval 15-19, the mid-point is
17.
CLASS WIDTH
The size of a class is referred to as the class width and is the difference between the two class
limits. For example, a class with interval 45-50 has a class width of 5.
OPEN INTERVAL
An open interval is an interval with one of its limits (on either side) indeterminate. Thus the age of
a person recorded as less than 45 years (i.e. < 45) forms an open interval.
CLASS BOUNDARY
In the inclusive method it is necessary to make an adjustment to determine the correct class
intervals and to have continuity. The adjustment consists in finding a Correction Factor (CF).
The Correction Factor can be expressed as
$$\mathrm{CF} = \frac{1}{2}\,(\text{Lower limit of the second class} - \text{Upper limit of the first class})$$
For each class, this CF is subtracted from the lower limit and added to the upper limit of the
class to maintain the continuity of the data. The true class interval thus obtained is called the class
boundary. For example, if the class limits are 20 and 29, then all values between 19.5 and 29.5
would actually fall in the given class, so the class boundaries are 19.5 and 29.5.
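The adjustment can be sketched in a few lines of Python. The function name `class_boundaries` and the three inclusive classes are illustrative assumptions; only the 20-29 class comes from the example above.

```python
def class_boundaries(limits):
    """Convert inclusive class limits [(lower, upper), ...] to class boundaries."""
    # Correction factor: half the gap between the first two consecutive classes.
    cf = (limits[1][0] - limits[0][1]) / 2
    return [(lo - cf, hi + cf) for lo, hi in limits]

limits = [(20, 29), (30, 39), (40, 49)]   # hypothetical inclusive classes
boundaries = class_boundaries(limits)
print(boundaries)   # [(19.5, 29.5), (29.5, 39.5), (39.5, 49.5)]
```

After the adjustment the upper boundary of each class coincides with the lower boundary of the next, which is the continuity the correction factor is meant to provide.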
TALLY MARKS
To indicate the accommodation of an observation to a particular class a tally sign is used. This
sign is known as tally mark.
CUMULATIVE FREQUENCY
Cumulative frequency is computed by adding successive class frequencies from top to bottom.
(The entry corresponding to the top interval is the frequency of that class; the entry opposite the
second interval is the sum of the frequencies in the first and second class intervals etc.)
INCLUSIVE METHOD
Under the inclusive method, the upper limit of one class is included in that class itself. The
method is inclusive in the sense that it includes both ends of the intervals such that its inclusion
does not alter the width of the interval.
EXCLUSIVE METHOD
When the class intervals are so fixed that the upper limit of one class is the lower limit of the
next class, the classification is conventionally known as the exclusive method.
Frequency Distribution
1. Find out range by subtracting the lowest value from the highest value of a variable.
2. Decide on the number of classes and the class interval:
Let k = number of classes in a frequency table. The number of classes, k, should not be
less than 4 or greater than 20. However, the value of k can be found by a
mathematical formula,
$$k = 1 + 3.322 \log_{10} N,$$
where N is the total number of observations. (This rule for k is known as Sturges' Rule for
the number of classes.)
Once the value of k is decided, the interval (h, width) of a class is found by
$$h = \frac{\text{Range}}{k}$$
3. Arrange the table with three columns having headings: Class Interval, Tally Marks and
Frequency. The first class interval will start with the smallest value and continue until
the interval with the highest value of the given series of data is reached.
4. Read the items and give tick mark to each of the values of the original table of raw data
and put tally mark against the appropriate class interval. It is convenient to mark each
fifth by a diagonal. Thus exhaust all the values one after another.
5. Count the number of tally marks corresponding to each class interval and write the result
in the respective frequency column.
PROBLEM
The marks obtained in statistics by 40 students of 1st year 1st semester are given below:
40 38 44 28 30 22 35 42 40 36
50 67 25 58 53 48 65 35 55 39
72 44 70 55 62 20 76 46 57 68
59 34 41 56 60 42 64 73 38 41
SOLUTION
We have, total number of observations, N = 40
Lowest observation = 20
Highest observation = 76
Therefore, Range = Highest observation – Lowest observation
= 76 - 20 = 56
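The construction steps can be sketched in Python for the 40 marks listed in the problem. The range is computed directly from the listed values, and k follows Sturges' rule. The class width of 10 is an assumption (a convenient rounding of Range / k), not a value stated in the notes, and the classes are formed by the exclusive method.

```python
import math

marks = [40, 38, 44, 28, 30, 22, 35, 42, 40, 36,
         50, 67, 25, 58, 53, 48, 65, 35, 55, 39,
         72, 44, 70, 55, 62, 20, 76, 46, 57, 68,
         59, 34, 41, 56, 60, 42, 64, 73, 38, 41]

n = len(marks)
data_range = max(marks) - min(marks)
k = 1 + 3.322 * math.log10(n)       # Sturges' rule
h = 10                              # assumed: Range / k rounded to a convenient width

# Exclusive method: a value equal to an upper limit goes to the next class.
start = min(marks)
num_classes = math.floor(data_range / h) + 1
freq = [0] * num_classes
for x in marks:
    freq[(x - start) // h] += 1

print(f"N = {n}, range = {data_range}, k = {k:.2f}")
for i, f in enumerate(freq):
    lower = start + i * h
    print(f"{lower}-{lower + h}: {f}")
```

A different rounding of the class width would give a different but equally valid table; the frequencies must always add up to N.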
Table: Grouped Frequency Distribution (Exclusive method)
Table: Continuous Frequency Distribution using class boundary (Modified Inclusive Method)
GRAPHICAL REPRESENTATION
One very simple but effective form of statistical analysis is to present the tabulated data with
the help of graphs and diagrams.
BASIC PRINCIPLES OF GRAPHS
1. The graphs should be simple and well-defined.
2. The graphs should be completely self explanatory.
3. The origin, vertical scale, horizontal scale should be so chosen that a graph does not
carry a false impression about the nature of the data
4. Frequency or rate is usually represented on vertical scale, the variable or method of
classification on horizontal scale.
USES OF GRAPHS
1. It is useful in elucidating the main features of a set of data.
2. It is often valuable in suggesting an appropriate method of analysis and in explaining
the conclusions founded upon the analysis.
3. It can sometimes pin-point gross errors in statistical records.
LIMITATIONS OF GRAPHS AND DIAGRAMS
1. They may be misleading unless drawn and studied with care.
2. The conclusions drawn from graphs should normally be regarded as tentative, and
therefore, the graphs are no substitute for more critical statistical analysis.
The important graphs and diagrams which are used for presentation of statistical data are
i. Line Diagram
ii. Bar Diagram
iii. Pie Diagram
iv. Histogram
v. Pictogram
vi. Frequency Polygon
vii. Frequency Curve
viii. Cumulative Frequency Curve or Ogive
ix. Stem-and-leaf Plot
x. Box and Whisker Plot
For instance, the line diagram is used for time series data, the bar diagram and pie chart
are used for qualitative data and the rest are used for quantitative data.
BAR DIAGRAM
A bar diagram, also known as bar chart, is a form of presentation in which the frequencies are
presented by rectangles usually separated along the horizontal axis and drawn as bars of
convenient widths.
EXAMPLE
Let us consider the following data for constructing a bar diagram
The vertical bar diagram constructed from these data is shown below.
Figure: Vertical bar diagram for health center visit data
PIE CHART/DIAGRAM
The pie chart consists of a circle sub-divided into sectors, whose areas are proportional to the
various parts into which the whole quantity is divided.
To construct it, we first form the relative frequency distribution (%) and convert the percentage
values into angles. As a circle consists of 360° (degrees), the whole quantity to be represented is
equated to 360°. For example, the angle in degrees for the category 'frequently' is arrived at as follows:
$$\frac{49}{150} \times 360 = 117.6$$
Other category values are obtained in a similar manner. The necessary computations are shown
below.
Table 2.0: Health center visit data for constructing a pie diagram
[Figure: pie diagram for health center visit data. Never 4%, Rarely 16%, Frequently 33%, Occasionally 47%]
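The conversion of frequencies into sector angles can be sketched in Python. Only the count 49 out of 150 for 'Frequently' is stated in the text; the other three counts are assumptions, back-calculated from the rounded percentages on the pie diagram, and are used here purely for illustration.

```python
# Counts per category out of 150 respondents. Only 49 ("Frequently") is stated
# in the text; the other three counts are assumed, back-calculated from the
# rounded percentages shown on the pie diagram.
visits = {"Never": 6, "Rarely": 24, "Occasionally": 71, "Frequently": 49}

total = sum(visits.values())
angles = {cat: count * 360 / total for cat, count in visits.items()}

for cat, angle in angles.items():
    print(f"{cat:<13}{visits[cat] / total:7.1%}{angle:8.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick check on the arithmetic.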
HISTOGRAM
A histogram is constructed by placing the class boundaries on the horizontal axis of a graph and
the frequencies on the vertical axis. Each class is shown on the graph by drawing a rectangle
whose base is the class boundary and whose height is the corresponding frequency for the
class.
FREQUENCY POLYGON
Table 4.0
Class interval Mid Value Frequency
4.5-8.5 6.5 6
8.5-12.5 10.5 19
12.5-16.5 14.5 23
16.5-20.5 18.5 18
20.5-24.5 22.5 9
24.5-28.5 26.5 3
28.5-32.5 30.5 1
32.5-36.5 34.5 1
[Figure: frequency polygon for the data in Table 4.0. Horizontal axis: class mid-values, 2.5 to 38.5; vertical axis: frequency, 0 to 25]
CONSTRUCTING OGIVE
A graph of the cumulative frequency distribution or cumulative relative frequency distribution is
called an ogive. To construct a less than type ogive, the upper class limits (precisely, the upper
boundaries) are put on the horizontal axis and cumulative frequencies are shown on the vertical
axis. A point is then plotted directly above each upper class limit at a height corresponding to the
cumulative frequency at that upper class limit. One additional point is then plotted above the
lower class limit of the first class at a height of zero. These points are then connected by straight
lines. The straight lines allow one to approximate the cumulative frequency between the class
limits by interpolating. The resulting graph is a less than type ogive.
To construct a more than type ogive, a point is plotted against each lower class limit at a height
corresponding to the cumulative frequency at that lower class limit (the number of observations
at or above it). As before, an additional point is plotted above the upper class limit of the
terminal class at a height of zero.
These points are then connected by straight lines. The resulting graph is a more than type ogive.
EXAMPLE
The following table is constructed from data collected on the life length of 40 rats in years for a
laboratory experiment. Display the data by a less than type and a more than type ogive.
Table 5.1
Class interval (years)   Frequency
1.45-1.95                2
1.95-2.45                1
2.45-2.95                4
2.95-3.45                15
3.45-3.95                10
3.95-4.45                5
4.45-4.95                3
Total                    40
This table is constructed to draw the required ogives and the resulting ogives are sketched in
Figures 5.1 and 5.2
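The cumulative frequencies needed for the two ogives can be computed from Table 5.1 as in the following Python sketch. The variable names are illustrative; the points `(boundary, less_than)` and `(boundary, more_than)` are the ones that would be joined by straight lines.

```python
from itertools import accumulate

boundaries = [1.45, 1.95, 2.45, 2.95, 3.45, 3.95, 4.45, 4.95]
freq = [2, 1, 4, 15, 10, 5, 3]          # frequencies from Table 5.1

total = sum(freq)
running = list(accumulate(freq))

# Less than type: cumulative frequency at each boundary, starting from zero
# at the lower boundary of the first class.
less_than = [0] + running
# More than type: observations at or above each boundary, ending at zero
# at the upper boundary of the last class.
more_than = [total - c for c in less_than]

for b, lt, mt in zip(boundaries, less_than, more_than):
    print(f"{b:.2f}  less than: {lt:2d}  more than: {mt:2d}")
```

At every boundary the two cumulative frequencies add up to the total of 40, so the two ogives cross where each equals 20, which is the median.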
The ogive or cumulative frequency polygon has the advantage of providing a convenient way to
estimate the median and the percentiles of a sample. In addition, it has the advantage that the
number of items between two values can be readily ascertained. The ogive shows how
many observations in a data set fall at or below a given point on the scale. This is most useful
when we have a distribution of scores and we are interested in finding out how one score
compares to the rest of the scores.
TABLE 5.2 Cumulative frequency distributions for less than and more than type ogives based
on the rat life data.
Figure 5.1: Less than type ogive for the data in Table 5.2 (horizontal axis: class boundaries, 1.45 to 4.95; vertical axis: cumulative frequency, 0 to 45)
Figure 5.2: More than type ogive for the data in Table 5.2 (same axes)
Measures of Central Tendency
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}.$$
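In the case of a frequency distribution, where $f_i$ is the frequency of the variable $x_i$,
i.e.
Observation ($x_i$)   Frequency ($f_i$)
$x_1$                 $f_1$
$x_2$                 $f_2$
...                   ...
$x_k$                 $f_k$
the arithmetic mean is
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{1}{N}\sum_{i=1}^{k} f_i x_i, \qquad N = \sum_{i=1}^{k} f_i .$$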
Ex. Find the arithmetic mean from the following frequency distribution:
Class Interval   Frequency
11-13 3
13-15 4
15-17 5
17-19 10
19-21 6
21-23 4
23-25 3
Solution.
Class Interval   Mid-value ($x_i$)   Frequency ($f_i$)   $f_i x_i$
11-13            12                  3                   36
13-15            14                  4                   56
15-17            16                  5                   80
17-19            18                  10                  180
19-21            20                  6                   120
21-23            22                  4                   88
23-25            24                  3                   72
Total                                $\sum f_i = 35$     $\sum f_i x_i = 632$
$$\therefore \mathrm{AM} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{632}{35} = 18.06.$$
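The same computation can be checked with a short Python sketch. The variable names are illustrative; the class limits and frequencies are those of the example.

```python
classes = [(11, 13), (13, 15), (15, 17), (17, 19), (19, 21), (21, 23), (23, 25)]
freq = [3, 4, 5, 10, 6, 4, 3]

mid = [(lo + hi) / 2 for lo, hi in classes]      # class mid-values x_i
N = sum(freq)                                    # total frequency
sum_fx = sum(f * x for f, x in zip(freq, mid))   # sum of f_i * x_i
mean = sum_fx / N

print(N, sum_fx, round(mean, 2))   # 35 632.0 18.06
```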
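Proof.
$$\sum_{i=1}^{k} f_i (x_i - \bar{x}) = \sum_{i=1}^{k} f_i x_i - \bar{x} \sum_{i=1}^{k} f_i = N\bar{x} - N\bar{x} = 0,$$
where
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{N} \;\Rightarrow\; N\bar{x} = \sum_{i=1}^{k} f_i x_i .$$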
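(ii) The sum of squares of the deviations of a set of observations is minimum when the deviations
are taken about the AM, i.e. for any value $A \neq \bar{x}$,
$$\sum_{i=1}^{k} f_i (x_i - A)^2 > \sum_{i=1}^{k} f_i (x_i - \bar{x})^2 .$$
Writing $x_i - A = (x_i - \bar{x}) + (\bar{x} - A)$, we have
$$\sum_{i=1}^{k} f_i (x_i - A)^2 = \sum_{i=1}^{k} f_i (x_i - \bar{x})^2 + 2(\bar{x} - A) \sum_{i=1}^{k} f_i (x_i - \bar{x}) + (\bar{x} - A)^2 \sum_{i=1}^{k} f_i$$
$$= \sum_{i=1}^{k} f_i (x_i - \bar{x})^2 + N(\bar{x} - A)^2, \quad \text{since } \sum_{i=1}^{k} f_i (x_i - \bar{x}) = 0.$$
Since $N(\bar{x} - A)^2 > 0$ whenever $A \neq \bar{x}$, therefore
$$\sum_{i=1}^{k} f_i (x_i - A)^2 > \sum_{i=1}^{k} f_i (x_i - \bar{x})^2 .$$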
Let $u_i = \dfrac{x_i - A}{h}$, $(i = 1, 2, 3, \ldots, k)$,
where $u_i$ is a transformed variable, A is the provisional mean taken to be equal to one of the
mid-values, and h is the size of the class interval.
The above relation can be rewritten as follows:
$$x_i = A + h u_i$$
or, $$f_i x_i = A f_i + h f_i u_i$$
or, $$\sum_{i=1}^{k} f_i x_i = A \sum_{i=1}^{k} f_i + h \sum_{i=1}^{k} f_i u_i \quad \text{[summing both sides over i from 1 to k]}$$
or, $$\frac{1}{N} \sum_{i=1}^{k} f_i x_i = \frac{A}{N} \sum_{i=1}^{k} f_i + \frac{h}{N} \sum_{i=1}^{k} f_i u_i$$
or, $$\bar{x} = \frac{A}{N} \cdot N + h \bar{u}$$
or, $$\bar{x} = A + h \bar{u}$$
Thus the arithmetic mean $\bar{x}$ depends on the change of origin A and scale h.
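As a numerical check of this relation, the following Python sketch applies it to the frequency distribution of the earlier example (classes 11-13 to 23-25). The choice A = 18 is an assumption made here for illustration; any mid-value would do.

```python
mid = [12, 14, 16, 18, 20, 22, 24]
freq = [3, 4, 5, 10, 6, 4, 3]
A, h = 18, 2                 # assumed provisional mean (a mid-value) and class width

u = [(x - A) // h for x in mid]          # exact here: every x - A is a multiple of h
N = sum(freq)
u_bar = sum(f * ui for f, ui in zip(freq, u)) / N

mean_coded = A + h * u_bar                                   # x-bar = A + h * u-bar
mean_direct = sum(f * x for f, x in zip(freq, mid)) / N      # direct method

print(round(mean_coded, 4), round(mean_direct, 4))
```

Both routes give the same mean; the coded values $u_i$ are small integers, which is what makes the method convenient for hand calculation.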
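(iv) If a set of m observations $x_1, x_2, \ldots, x_m$ and a set of n observations $y_1, y_2, \ldots, y_n$ have means $\bar{x}$
and $\bar{y}$ respectively, then the combined mean $\bar{z}$ of the (m+n) observations of the two sets is
$$\bar{z} = \frac{m\bar{x} + n\bar{y}}{m + n}.$$
Proof. By definition, the mean $\bar{x}$ of the first set is
$$\bar{x} = \frac{\sum_{i=1}^{m} x_i}{m} \;\Rightarrow\; \sum_{i=1}^{m} x_i = m\bar{x}.$$
Similarly, the mean $\bar{y}$ of the second set is
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \;\Rightarrow\; \sum_{i=1}^{n} y_i = n\bar{y}.$$
Therefore the combined mean is
$$\bar{z} = \frac{\sum_{i=1}^{m} x_i + \sum_{i=1}^{n} y_i}{m + n} = \frac{m\bar{x} + n\bar{y}}{m + n}.$$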
If $\bar{x}_i$ (i = 1, 2, ..., k) are the means of k series of sizes $n_i$ (i = 1, 2, ..., k) respectively, then the
combined mean $\bar{x}$ of all the series is
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k}.$$
Ex. If 20 students in one section of a statistics course receive an average score of 67, the 18
students in a second section receive 57, and the 12 students in the third section receive 62,
then the overall average score of these 20+18+12=50 students is given by
$$\bar{x} = \frac{20 \times 67 + 18 \times 57 + 12 \times 62}{20 + 18 + 12} = \frac{3110}{50} = 62.2 .$$
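The combined mean of several groups can be written as a small Python function. The name `combined_mean` is an illustrative choice; the figures are those of the example.

```python
def combined_mean(sizes, means):
    """Mean of the pooled observations, from group sizes and group means."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

overall = combined_mean([20, 18, 12], [67, 57, 62])
print(overall)   # 62.2
```

Note that the result differs from the plain average of the three section means (62.0), because the sections are of unequal size.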
Merits.
1. It is rigidly defined.
2. It is easy to understand and easy to calculate.
3. It is based upon all the observations.
4. It is amenable to further algebraic treatment.
5. Of all the averages, AM is affected least by sampling fluctuation.
Demerits.
1. It cannot be determined by inspection nor can it be located graphically.
2. AM cannot be used if we are dealing with qualitative characteristics, such as intelligence,
honesty, beauty etc.
3. AM cannot be obtained if a single observation is missing or lost, unless we drop it and
compute the arithmetic mean of the remaining values.
4. AM is affected very much by extreme values.
5. It does not provide exact value for open-end classes.
Weighted Arithmetic Mean: In an ordinary arithmetic mean, each item in the series is assumed
to have equal importance. But there are situations where the relative importance of the different
items is not the same. Sometimes we associate with the numbers x1,x2,…..,xn certain weighting
factors or weights w1,w2,……….,wn depending on the significance or importance attached to the
numbers. In this case,
$$\bar{x} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
Ex. If a final exam in a course is weighted 3 times as much as a quiz and a student has a final
exam grade of 85 and quiz grades 70 and 90, the mean grade is
$$\bar{x} = \frac{1 \times 70 + 1 \times 90 + 3 \times 85}{1 + 1 + 3} = \frac{415}{5} = 83 .$$
Problem. Find the simple and weighted arithmetic mean of the first n natural numbers, the
weights being the corresponding numbers.
Solution. The first n natural numbers are 1, 2, 3, ……., n. (n>0)
Let us construct the following table:
$x_i$   $w_i$   $w_i x_i$
1       1       $1^2$
2       2       $2^2$
3       3       $3^2$
.       .       .
.       .       .
.       .       .
n       n       $n^2$
We know that
$$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$$
and
$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$$
Therefore, the simple arithmetic mean (AM) is
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{n(n+1)/2}{n} = \frac{n(n+1)}{2} \cdot \frac{1}{n} = \frac{n+1}{2}$$
and the weighted AM is
$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} = \frac{n(n+1)(2n+1)/6}{n(n+1)/2} = \frac{n(n+1)(2n+1)}{6} \times \frac{2}{n(n+1)} = \frac{2n+1}{3}$$
Therefore, simple AM = $\dfrac{n+1}{2}$ and weighted AM = $\dfrac{2n+1}{3}$.
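Both the weighted-mean definition and the two closed forms just derived can be checked with a short Python sketch. The function name `weighted_mean` is illustrative, and `Fraction` is used only so that the comparison with the formulas is exact.

```python
from fractions import Fraction

def weighted_mean(values, weights):
    """Weighted arithmetic mean sum(w * x) / sum(w), kept exact with Fraction."""
    return Fraction(sum(w * x for x, w in zip(values, weights)), sum(weights))

# Exam example: two quizzes with weight 1, final exam with weight 3.
exam = weighted_mean([70, 90, 85], [1, 1, 3])
print(exam)   # 83

# First n natural numbers, each weighted by itself.
for n in (1, 5, 10, 100):
    xs = list(range(1, n + 1))
    simple = Fraction(sum(xs), n)
    weighted = weighted_mean(xs, xs)
    assert simple == Fraction(n + 1, 2)
    assert weighted == Fraction(2 * n + 1, 3)
```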
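$$\Rightarrow \bar{x} = a + h\bar{U}, \quad \text{where } \bar{U} = \frac{\sum_{i=1}^{n} f_i U_i}{\sum_{i=1}^{n} f_i}$$
$$\sum_{i=1}^{n} f_i = 50, \qquad \sum_{i=1}^{n} f_i U_i = 81$$
Here, a = 71, h = 5 and $U = \dfrac{x - 71}{5}$.
Thus
$$\bar{U} = \frac{\sum_{i=1}^{n} f_i U_i}{\sum_{i=1}^{n} f_i} = \frac{81}{50} = 1.62$$
$$\therefore \bar{x} = a + h\bar{U} = 71 + 5 \times 1.62 = 79.1$$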
2. Geometric Mean. The Geometric Mean of a set of n non-zero positive observations is the nth root
of their product.
Let $x_1, x_2, \ldots, x_n$ be n non-zero positive observations of a series of data. Then the GM is
$$\mathrm{GM} = (x_1 \cdot x_2 \cdots x_n)^{1/n}$$
The calculation may sometimes be simplified by taking logarithms,
$$\log G = \frac{1}{n} \sum_{i=1}^{n} \log x_i$$
$$\Rightarrow G = \operatorname{Antilog}\left[ \frac{1}{n} \sum_{i=1}^{n} \log x_i \right]$$
In the case of a frequency distribution, the GM is given by
$$G = \left[ x_1^{f_1} \, x_2^{f_2} \cdots x_k^{f_k} \right]^{1/\sum f_i}$$
$$\Rightarrow \log G = \frac{1}{\sum f_i} \sum_{i=1}^{k} f_i \log x_i$$
$$\Rightarrow G = \operatorname{Antilog}\left[ \frac{1}{N} \sum_{i=1}^{k} f_i \log x_i \right]$$
Ex. Suppose the sales of a departmental store increased from taka 2 lac in 1990 to taka 4 lac
in 1991 and taka 6 lac in 1992. The increase is $\frac{6-2}{2} \times 100 = 200$ percent over the 2-year period
1990-92, which naively suggests an average increase of 100 percent per year.
This is obviously wrong since the average rate of increase per year was much less. The
appropriate average in this instance is the geometric mean. Since the sales in 1991 were twice as
high as in 1990, and the sales in 1992 were 1.5 times as high as in 1991, the GM of these two values
is $G = \sqrt{2 \times 1.5} = \sqrt{3} = 1.7321$.
Thus the average rate of increase in sales per year is 1.7321 - 1 = 0.7321, i.e. 73.21%.
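The growth-rate calculation can be reproduced with a short Python sketch, using both the direct nth-root form and the logarithmic form of the GM. Variable names are illustrative; `math.prod` requires Python 3.8 or later.

```python
import math

sales = [2, 4, 6]                                    # taka (lac) in 1990, 1991, 1992
ratios = [b / a for a, b in zip(sales, sales[1:])]   # year-on-year ratios: [2.0, 1.5]

gm = math.prod(ratios) ** (1 / len(ratios))
# Same result through logarithms, as in the log formula for G.
gm_log = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

print(round(gm, 4), f"{gm - 1:.2%}")   # 1.7321 73.21%
```

As a check, growing taka 2 lac by the factor G for two years gives 2 × G² = 6 lac, the actual 1992 figure, which the naive 100 percent per year would overshoot.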
Merits.
1. It is rigidly defined.
2. It is based upon all the observations.
3. It is amenable to further algebraic treatment.
4. It is not affected much by sampling fluctuations.
5. It gives comparatively more weight to small items.
Demerits.
1. Because of its abstract mathematical character GM is not easy to understand and to calculate
for a non-mathematics student.
2. If any one of the observations is zero, GM becomes zero, and if any one observation is
negative, GM becomes imaginary regardless of the magnitude of the other items.
Uses. GM is used –
1. To calculate the average of ratios and percentages.
2. To calculate the growth rate of population and the average rate of increase or decrease in
economic activities.
3. To calculate index numbers.
3. Harmonic Mean. The Harmonic Mean of a set of n non-zero observations $x_1, x_2, \ldots, x_n$ in a
series is the reciprocal of the arithmetic mean of the reciprocals. That is,
$$\mathrm{HM} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
In the case of a frequency distribution,
$$\mathrm{HM} = \frac{N}{\sum_{i=1}^{k} \frac{f_i}{x_i}}, \quad \text{where } \sum_{i=1}^{k} f_i = N$$
Ex. Suppose an aeroplane flies around a square with sides 100 miles long, taking the first side at
100 miles/h, the second side at 200 miles/h, the third side at 300 miles/h and the fourth side at
400 miles/h. What is the average speed of the plane in its flight around the square?
Soln. Here the HM is the appropriate measure for the average speed.
$$\mathrm{HM} = \frac{4}{\frac{1}{100} + \frac{1}{200} + \frac{1}{300} + \frac{1}{400}} = \frac{4 \times 1200}{25} = 192 \text{ miles/h}$$
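A short Python sketch confirms the result and shows why the HM is the right average here: it coincides with total distance divided by total time. Variable names are illustrative.

```python
speeds = [100, 200, 300, 400]     # miles per hour on each side
side = 100                        # miles per side

hm = len(speeds) / sum(1 / s for s in speeds)

# Cross-check: total distance divided by total time.
total_time = sum(side / s for s in speeds)
avg_speed = side * len(speeds) / total_time

print(round(hm, 6), round(avg_speed, 6))   # 192.0 192.0
```

The arithmetic mean of the four speeds would be 250 miles/h, which overstates the average because the plane spends more time on the slower sides.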
Merits.
1. It is rigidly defined.
2. It is based upon all the observations.
3. It is amenable to further algebraic treatment.
4. It is not affected much by sampling fluctuations.
5. It gives greater importance to small items and is useful only when small items have to be given
very high weightage.
Demerits.
1. It is not easy to understand and is difficult to calculate.
2. It cannot be calculated if any observation is zero.
3. It is impossible to calculate for open-end class interval.
Uses. HM is used –
1. To calculate average speed of any vehicle.
2. To calculate the average growth rate and average profit in any business.
4. Median. The median is defined as the middlemost observation when the observations are
arranged in order of magnitude (ascending or descending).
For ungrouped data, when n is odd, the median is the middlemost observation,
i.e. the $\left(\frac{n+1}{2}\right)$th observation in the series.
For example, let a series be 5, 10, 7, 3, 2, where n = 5.
The ascending order of this series is 2, 3, 5, 7, 10.
Hence the median of the series is 5.
Again, when n is even, the median is the arithmetic mean of the $\frac{n}{2}$th and $\left(\frac{n}{2} + 1\right)$th
observations in the series.
E.g., let us consider the values 11, 3, 9, 5, 7, 12, 15, 18; n = 8.
Ascending order: 3, 5, 7, 9, 11, 12, 15, 18
So the median is $\frac{9 + 11}{2} = 10$, [$\frac{8}{2}$th = 4th and $\left(\frac{8}{2} + 1\right)$th = 5th].
For a grouped frequency distribution the median is given by
$$M_e = L + \frac{\frac{N}{2} - F_c}{f} \times c$$
where,
L = the lower limit of the median class (the median class is the class which contains the $\frac{N}{2}$th
observation of the series),
N = total number of observations,
$F_c$ = cumulative frequency of the class preceding the median class,
f = frequency of the median class,
c = length of the median class.
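The formula can be sketched as a Python function and applied to the frequency distribution used in the arithmetic mean example (classes 11-13 to 23-25). The function name and the assumption of equal class widths are illustrative choices.

```python
def grouped_median(lower_limits, freq, width):
    """Median of a grouped frequency distribution with equal class width."""
    N = sum(freq)
    cum = 0                                   # cumulative frequency before the current class
    for L, f in zip(lower_limits, freq):
        if cum + f >= N / 2:                  # first class reaching the (N/2)th observation
            return L + (N / 2 - cum) / f * width
        cum += f

lower = [11, 13, 15, 17, 19, 21, 23]
freq = [3, 4, 5, 10, 6, 4, 3]
median = grouped_median(lower, freq, 2)
print(round(median, 2))   # 18.1
```

Here N/2 = 17.5 falls in the class 17-19, with $F_c$ = 12 and f = 10, so the median is 17 + (5.5 / 10) × 2 = 18.1.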
Merits.
1. It is rigidly defined.
2. It is easy to understand and easy to calculate. In some cases it can be located merely by
inspection.
3. It is not at all affected by extreme values.
4. It can be calculated for distributions with open-end classes.
Demerits.
1. In case of even number of observations median cannot be determined exactly.
2. It is not based on all the observations.
3. It is not amenable to algebraic treatment.
4. As compared with AM, it is affected much by sampling fluctuations.
Uses.
1. Median is the only average to be used while dealing with qualitative data which cannot be
measured quantitatively but still can be arranged in ascending or descending order of
magnitude, e.g., to find the average intelligence or average honesty among a group of
people.
2. It is to be used for determining the typical value in problems concerning wages, distribution
of wealth etc.
5. Mode. The mode is the value of the variable that occurs most frequently, i.e., for which the
frequency is maximum. It is the most fashionable value of the variable. The mode is often
denoted by Mo and is frequently referred to as the modal value.
For example – The number of children in 10 families be as follows:
3, 4, 1, 0, 3, 2, 3, 5, 3, 2
Then the mode of this series is 3 because this value occurs most frequently in the series.
For grouped frequency distribution, the mode can be obtained from the formula
\[ Mo = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c = L + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times c \]
where,
L = lower limit of the modal class.
∆1 = the difference between the frequency of the modal class and pre-modal class (f1-f0).
∆2 = the difference between the frequency of the modal class and post-modal class (f1-f2).
f1 = frequency of the modal class
f0 = frequency of the class preceding the modal class,
f2 = frequency of the class following the modal class,
c = length of the modal class.
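A minimal sketch of the grouped-mode formula is given below. The function name `grouped_mode` and the illustrative frequencies (classes 11-13, 13-15, ..., 23-25 of width 2) are assumptions chosen for the example, and equal class widths are assumed throughout.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode of a grouped frequency distribution with equal class widths."""
    # Index of the modal class (first class with the largest frequency).
    i = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                      # pre-modal class
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # post-modal class
    delta1, delta2 = f1 - f0, f1 - f2
    if delta1 + delta2 == 0:
        raise ValueError("mode is not clearly defined for this distribution")
    return lower_limits[i] + delta1 / (delta1 + delta2) * width


# Illustrative data: the modal class is 17-19 with f1 = 10, f0 = 5, f2 = 6.
mode = grouped_mode([11, 13, 15, 17, 19, 21, 23], [3, 4, 5, 10, 6, 4, 3], 2)
print(round(mode, 2))   # 18.11
```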
Merits.
1. Mode is readily comprehensible and easy to calculate. Like median, in some cases it can be
located merely by inspection.
2. It is not at all affected by extreme values.
3. Mode can be conveniently located even if the frequency distribution has class intervals of
unequal magnitude provided the modal class and the classes preceding and succeeding it
are of the same magnitude. Open-end classes also do not pose any problem in the location
of mode.
Demerits.
1. Mode is ill-defined. It is not always possible to find a clearly defined mode.
2. It is not based on all the observations.
3. It is not capable of further mathematical treatment.
4. As compared with AM, it is affected to greater extent by sampling fluctuations.
Uses.
Mode is the average to be used to find the ideal size, e.g. in business forecasting, in
manufacturing of ready-made garments, shoes, etc.
Quadratic Mean or Root Mean Square (QM or RMS). If we have a set of n observations
represented by x1, x2, …, xn, then the QM of these values is given by
\[ Q = \sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n}} \]
This mean is seen to have applications in physical and engineering sciences.
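As a quick illustration, the QM can be computed directly from its definition. The helper name `quadratic_mean` and the two sample values are assumptions made for this sketch.

```python
import math


def quadratic_mean(values):
    """Square root of the arithmetic mean of the squared observations."""
    return math.sqrt(sum(x * x for x in values) / len(values))


# (1^2 + 7^2) / 2 = 25, whose square root is 5; the AM of 1 and 7 is only 4.
print(quadratic_mean([1, 7]))   # 5.0
```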
Quantiles. The quartiles, deciles and percentiles are collectively known as quantiles. Quantiles
are those values in a series, which divide the total frequency into number of equal parts when
the series is arranged in order of magnitude.
Quartiles. Quartiles are those values, which divide the total frequency into four equal parts.
There are three quartiles, i.e., 1st quartile (Q1), 2nd quartile (Q2) and the third quartile (Q3).
Second quartile Q2 is equal to the median.
For grouped frequency distribution, the quartiles are given by
\[ Q_i = L_i + \frac{\frac{i \times N}{4} - F_i}{f_i} \times c ; \quad i = 1, 2, 3 \]
where $L_i$ = lower limit of the i-th quartile class,
N = total number of observations,
$F_i$ = cumulative frequency of the class preceding the i-th quartile class,
$f_i$ = frequency of the i-th quartile class,
c = length of the i-th quartile class.
Deciles. Deciles are those values which divide the total frequency into 10 equal parts. There are
9 deciles. These are D1, D2, …, D9. The 5th decile D5 is the second quartile or median (D5 = Q2 = Me).
For grouped frequency distribution, the deciles are given by
\[ D_j = L_j + \frac{\frac{j \times N}{10} - F_j}{f_j} \times c ; \quad j = 1, 2, 3, \ldots, 9 \]
where $L_j$ = lower limit of the jth decile class,
N = total number of observations (frequency),
$F_j$ = cumulative frequency of the class preceding the jth decile class,
$f_j$ = frequency of the jth decile class, and
c = length of class interval of the jth decile class.
Percentiles. Percentiles are those values which divide the total frequency into 100 equal parts.
Thus we have 99 percentiles. These are P1 ≤ P2 ≤ P3 ≤ … ≤ P50 ≤ … ≤ P99. The median is the 50th
percentile P50 as well as the 5th decile (D5) and the 2nd quartile (Q2).
For grouped frequency distribution, the percentiles are given by
\[ P_k = L_k + \frac{\frac{k \times N}{100} - F_k}{f_k} \times c ; \quad k = 1, 2, 3, \ldots, 99 \]
where $L_k$ = lower limit of the kth percentile class,
N = total number of observations (frequency),
$F_k$ = cumulative frequency of the class preceding the kth percentile class,
$f_k$ = frequency of the kth percentile class, and
c = length of class interval of the kth percentile class.
Ex. Find (a) 1st and 3rd quartile, (b) 7th decile and (c) 62nd percentile from the frequency
distribution given below:
Class Interval    Frequency    Cumulative frequency
11-13             3            3
13-15             4            7
15-17             5            12
17-19             10           22
19-21             6            28
21-23             4            32
23-25             3            35
Total             N = 35
(a) (15-17) is the 1st quartile class because the $\frac{N}{4}$th = $\frac{35}{4}$th = 8.75th observation lies in that class.
Hence $Q_1 = 15 + \frac{8.75 - 7}{5} \times 2 = 15 + 0.70 = 15.70$.
The $\frac{3 \times N}{4}$th = $\frac{3 \times 35}{4}$th = 26.25th observation falls in the class (19-21).
Hence $Q_3 = 19 + \frac{26.25 - 22}{6} \times 2 = 20.42$.
(b) The 7th decile class is (19-21), because $\frac{7 \times N}{10} = \frac{7 \times 35}{10} = 24.5$.
Hence $D_7 = 19 + \frac{24.5 - 22}{6} \times 2 = 19.83$.
(c) The 62nd percentile class is (17-19), because $\frac{62 \times N}{100} = \frac{62 \times 35}{100} = 21.7$.
Hence $P_{62} = 17 + \frac{21.7 - 12}{10} \times 2 = 18.94$.
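Quartiles, deciles and percentiles share one interpolation rule, so a single helper covers all of them. The sketch below assumes equal class widths; the name `grouped_quantile` and its `fraction` parameter (i/4, j/10 or k/100) are illustrative choices, not notation from the notes.

```python
def grouped_quantile(lower_limits, frequencies, width, fraction):
    """Quantile of a grouped distribution: L + (fraction * N - F) / f * c."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must lie in (0, 1]")
    total = sum(frequencies)
    target = fraction * total          # position of the required observation
    cumulative = 0                     # cumulative frequency of preceding classes
    for lower, freq in zip(lower_limits, frequencies):
        if freq > 0 and cumulative + freq >= target:
            return lower + (target - cumulative) / freq * width
        cumulative += freq
    raise ValueError("quantile class not found")


lower = [11, 13, 15, 17, 19, 21, 23]
freq = [3, 4, 5, 10, 6, 4, 3]
print(round(grouped_quantile(lower, freq, 2, 1 / 4), 2))     # 15.7
print(round(grouped_quantile(lower, freq, 2, 3 / 4), 2))     # 20.42
print(round(grouped_quantile(lower, freq, 2, 7 / 10), 2))    # 19.83
print(round(grouped_quantile(lower, freq, 2, 62 / 100), 2))  # 18.94
```

With `fraction = 1/2` the same helper returns the grouped median.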
Theorems
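Theorem 1: For two non-zero positive observations prove that AM × HM = GM².
Proof: Let the two observations be X1 and X2. Then
\[ AM = \frac{X_1 + X_2}{2}, \quad GM = (X_1 X_2)^{1/2} \quad \text{and} \quad HM = \frac{1}{\frac{1}{2}\left(\frac{1}{X_1} + \frac{1}{X_2}\right)} = \frac{2 X_1 X_2}{X_1 + X_2} \]
Therefore,
\[ AM \times HM = \frac{X_1 + X_2}{2} \cdot \frac{2 X_1 X_2}{X_1 + X_2} = X_1 X_2 = \left\{ (X_1 X_2)^{1/2} \right\}^2 = GM^2 \]
Hence, AM × HM = GM².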
Theorem 2: A variable X takes on values which are in geometric progression. Obtain the AM,
GM and HM and hence show that AM × HM = GM².
Proof: Let the variable assume the values a, ar, ar², …, ar^(n−1), which are in geometric
progression (a > 0, r > 0, r ≠ 1). Then by definition,
\[ AM = \frac{a + ar + ar^2 + \cdots + ar^{n-1}}{n} = \frac{a(1 - r^n)}{n(1 - r)} \]
\[ GM = \left(a \cdot ar \cdot ar^2 \cdots ar^{n-1}\right)^{1/n} = \left(a^n r^{1 + 2 + \cdots + (n-1)}\right)^{1/n} = a r^{\frac{n-1}{2}}, \qquad (GM)^2 = a^2 r^{n-1} \]
\[ HM = \frac{n}{\frac{1}{a} + \frac{1}{ar} + \frac{1}{ar^2} + \cdots + \frac{1}{ar^{n-1}}} = \frac{na}{1 + \frac{1}{r} + \frac{1}{r^2} + \cdots + \frac{1}{r^{n-1}}} \]
\[ = \frac{na}{\dfrac{\left(\frac{1}{r}\right)^n - 1}{\frac{1}{r} - 1}} = \frac{na \, r^n (1 - r)}{(1 - r^n) \, r} = \frac{na \, r^{n-1} (1 - r)}{1 - r^n} \]
Hence
\[ AM \times HM = \frac{a(1 - r^n)}{n(1 - r)} \cdot \frac{na \, r^{n-1}(1 - r)}{1 - r^n} = a^2 r^{n-1} = (GM)^2 \]
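Theorem 3: For a set of n non-zero positive values x1, x2, …, xn prove that AM ≥ GM ≥ HM.
Proof: By definition,
\[ AM = A = \frac{1}{n}\sum_{i=1}^{n} x_i \quad \text{and} \quad GM = G = (x_1 x_2 \cdots x_n)^{1/n} \]
Taking logarithm of G,
\[
\begin{aligned}
\log G &= \frac{1}{n}\log(x_1 x_2 \cdots x_n) = \frac{1}{n}\sum_{i=1}^{n}\log x_i \\
&= \frac{1}{n}\sum_{i=1}^{n}\log\left[\frac{\sum_{i=1}^{n} x_i}{n} - \frac{\sum_{i=1}^{n} x_i}{n} + x_i\right] \\
&= \frac{1}{n}\sum_{i=1}^{n}\log\left[A - A + x_i\right] \\
&= \frac{1}{n}\sum_{i=1}^{n}\log\left[A\left(1 - \frac{A - x_i}{A}\right)\right] \\
&= \frac{1}{n}\sum_{i=1}^{n}\log A + \frac{1}{n}\sum_{i=1}^{n}\log\left(1 - \frac{A - x_i}{A}\right) \\
&= \frac{1}{n}\cdot n\log A + \frac{1}{n}\sum_{i=1}^{n}\left\{-\frac{A - x_i}{A} - \frac{1}{2}\left(\frac{A - x_i}{A}\right)^2 - \frac{1}{3}\left(\frac{A - x_i}{A}\right)^3 - \cdots\right\} \quad \left[\text{if } \left|\frac{A - x_i}{A}\right| < 1\right] \\
&= \log A - \frac{1}{n}\sum_{i=1}^{n}\left\{\frac{A - x_i}{A} + \frac{1}{2}\left(\frac{A - x_i}{A}\right)^2 + \frac{1}{3}\left(\frac{A - x_i}{A}\right)^3 + \cdots\right\} \\
&= \log A - \text{a non-negative quantity}
\end{aligned}
\]
The subtracted sum is non-negative because the first-order terms cancel, $\sum_{i=1}^{n}(A - x_i) = 0$, and $-\log(1 - t) - t \ge 0$ for every $t < 1$; it is zero only when all the $x_i$ are equal.
or, log G ≤ log A
or, A ≥ G ............(i)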
Using the relation A ≥ G, we have
\[ \frac{x_1 + x_2 + \cdots + x_n}{n} \ge (x_1 x_2 \cdots x_n)^{1/n} \]
Replacing x1, x2, …, xn by $\frac{1}{x_1}, \frac{1}{x_2}, \ldots, \frac{1}{x_n}$ respectively, we get
\[ \frac{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}{n} \ge \left(\frac{1}{x_1} \cdot \frac{1}{x_2} \cdots \frac{1}{x_n}\right)^{1/n} \]
\[ \text{or, } \frac{1}{H} \ge \frac{1}{G} \]
or, G ≥ H ............(ii)
Combining (i) and (ii), we get
AM ≥ GM ≥ HM.
Hence the theorem is proved.
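The three theorems can be checked numerically. The sketch below is only a sanity check on one hypothetical geometric progression (a = 2, r = 3, n = 4); the helper names are assumptions made for the illustration.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # exp of the mean logarithm equals the n-th root of the product
    return math.exp(sum(math.log(x) for x in values) / len(values))


def harmonic_mean(values):
    return len(values) / sum(1 / x for x in values)


gp = [2 * 3 ** k for k in range(4)]    # 2, 6, 18, 54
am, gm, hm = arithmetic_mean(gp), geometric_mean(gp), harmonic_mean(gp)

print(am >= gm >= hm)                   # True  (Theorem 3)
print(math.isclose(am * hm, gm ** 2))   # True  (Theorem 2: both equal a^2 r^(n-1) = 108)
```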
Measures of Dispersion
Measures of Dispersion: Measures of central tendency give us an idea of the concentration of the
observations about the central value of the distribution.
Let us consider two groups, each of 6 students with their scores in a particular examination.
Group-A: 48, 50, 52, 51, 50, 49
Group-B: 1, 2, 100, 99, 98, 0
The average score for each group is 50. It is evident that the distributions A and B have the same AM, but
they differ in how widely the observations are scattered about the AM. Such variation is usually called dispersion.
Definition. Measures of dispersion give the degree of scatter about the central location and thus
give a measure of variability or lack of homogeneity of the data.
Purpose of Dispersion. A measure of Dispersion serves two purposes:
1. It provides one of the most important characteristics of a frequency distribution.
2. It helps us to compare two or more frequency distributions.
Characteristics of an ideal Measure of Dispersion:
1. It should be rigidly defined.
2. It should be easy to understand and easy to calculate.
3. It should be based on all the observations of the data set.
4. It should be so defined that the measure is amenable to mathematical treatment in further analysis.
5. It should be affected as little as possible by sampling fluctuation.
6. It should not be affected by extreme observations.
Measures of Dispersion: Following are the measures of dispersions:
a) Absolute measures:
i. Range
ii. Quartile deviation
iii. Mean deviation
iv. Standard deviation
b) Relative measures:
i. Co-efficient of variation
ii. Co-efficient of range
iii. Co-efficient of mean deviation
iv. Co-efficient of quartile deviation
a) Absolute measures of dispersion. These measures are absolute in the sense that they are expressed
in the same statistical units in which the original data are presented.
i. Range: The range of a set of values is the difference between the highest and the lowest value in the
set. Thus if x1 and x2 denote the smallest and the highest value respectively in a set then the range is
given by
Range (R) = x2 - x1
In grouped frequency distribution, the range is taken either as the difference between the highest and
the lowest mid-values or as the difference between the upper boundary of the last class and the lower
boundary of the first class.
For a set of observations 90, 110, 20, 51, 210 and 190, the range is 210-20=190.
Merits.
1. It is the simplest measure of dispersion.
2. It is easy to understand and easy to calculate.
3. It is based on the extreme observations only and no detailed information is required.
4. It gives us a quick idea of the variability of the observations involving least amount of time and
calculations.
Demerits.
1. It is a crude measure of dispersion as the two extreme observations may be subject to sampling
fluctuations.
2. It would be misleading if either of the extreme values is unusually high or low.
3. It cannot be calculated if the extreme classes of the frequency distribution are open.
Uses.
1. It is used in measurement of share market fluctuation.
2. It is also used in statistical quality control work.
ii. Quartile deviation: If Q1 and Q3 denote the first and third quartile respectively, then the
inter-quartile range is Q = Q3 − Q1. The inter-quartile range is frequently reduced to the measure of
semi-interquartile range, also known as the Quartile Deviation (Q.D.), by dividing it by 2. Thus
\[ Q.D. = \frac{Q_3 - Q_1}{2} \]
Quartile Deviation is definitely a better measure than the range as it makes use of 50% of the data. But
since it ignores the other 50% of the data, it cannot be regarded as a reliable measure.
Ex. Find out quartile deviation from the following frequency distribution
C.I.      Frequency
0-5       2
5-10      5
10-15     7
15-20     13
20-25     21
25-30     16
30-35     8
35-40     3
Here Q1 = 16.83, Q2 = 22.5 and Q3 = 27.58.
\[ \text{Therefore, } Q.D. = \frac{Q_3 - Q_1}{2} = \frac{27.58 - 16.83}{2} = 5.375 \]
iii. Mean deviation: The mean deviation is an average of absolute deviations of individual
observations from the central value of a series.
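Let x1, x2, …, xn be n observations of a variable with mean $\bar{x}$; then the mean deviation from the arithmetic
mean is defined by
\[ M.D._{\bar{x}} = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \bar{x} \right| \]
In case of grouped data,
\[ M.D._{\bar{x}} = \frac{1}{N}\sum_{i=1}^{n} f_i \left| x_i - \bar{x} \right|, \quad \text{where } N = \sum_{i=1}^{n} f_i \qquad (*) \]
If the deviations are taken from the median or the mode, $\bar{x}$ in (*) has to be replaced by Me or Mo; then for grouped data
\[ M.D._{Me} = \frac{1}{N}\sum_{i=1}^{n} f_i \left| x_i - Me \right| \]
\[ \text{and } M.D._{Mo} = \frac{1}{N}\sum_{i=1}^{n} f_i \left| x_i - Mo \right|, \quad \text{where } N = \sum_{i=1}^{n} f_i \]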
Merits.
1. It is easy to understand
2. It is relatively easy to calculate.
3. It takes all the observations into account.
4. It is less affected by extreme values.
Demerits.
1. It is not amenable to further algebraic treatment.
2. It cannot be calculated if the extreme classes of the frequency distribution are open.
3. It is less stable than standard deviation.
Since MD is based on all the observations, it is a better measure of dispersion than range or quartile
deviation. But the step of ignoring the signs of the deviations creates artificiality and renders it
useless for further mathematical treatment.
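To see the mean deviation separate the two groups of scores introduced at the start of this chapter, a minimal sketch is given below. The function name `mean_deviation` and its optional `center` argument (for deviations taken from the median or mode) are assumptions for illustration.

```python
def mean_deviation(values, center=None):
    """Average absolute deviation from `center` (the arithmetic mean by default)."""
    n = len(values)
    if center is None:
        center = sum(values) / n
    return sum(abs(x - center) for x in values) / n


group_a = [48, 50, 52, 51, 50, 49]
group_b = [1, 2, 100, 99, 98, 0]

# Both groups have AM = 50, but their scatter about it is very different.
print(mean_deviation(group_a))   # 1.0
print(mean_deviation(group_b))   # 49.0
```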
iv. Standard deviation: The arithmetic mean of the squares of the deviations of the given observations
from their arithmetic mean is known as variance. The positive square root of variance is the Standard
deviation.
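Let x1, x2, …, xn be n observations of a variable; then the standard deviation is defined as
\[ S_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2} \]
where $\bar{x}$ is the AM of the distribution. For a frequency distribution, $N = \sum_{i=1}^{k} f_i$ denotes the total frequency.
Now,
\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum (x_i^2 - 2 x_i \bar{x} + \bar{x}^2) = \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2 = \sum x_i^2 - n\bar{x}^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} \]
\[ \text{Thus, } S_x = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n} - \left(\frac{\sum_{i=1}^{n} x_i}{n}\right)^2} \]
In case of frequency distribution,
\[ \sum_{i=1}^{k} f_i (x_i - \bar{x})^2 = \sum f_i (x_i^2 - 2 x_i \bar{x} + \bar{x}^2) = \sum f_i x_i^2 - 2\bar{x}\sum f_i x_i + N\bar{x}^2 = \sum f_i x_i^2 - N\bar{x}^2 = \sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{N} \]
\[ \text{Thus, } S_x = \sqrt{\frac{\sum_{i=1}^{k} f_i x_i^2}{N} - \left(\frac{\sum_{i=1}^{k} f_i x_i}{N}\right)^2} \]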
Ex. Calculate the SD from the following frequency distribution
C.I.      Mid-values (xi)    Frequency (fi)    fi·xi      fi·xi²
0-5       2.5                2                 5          12.5
5-10      7.5                5                 37.5       281.25
10-15     12.5               7                 87.5       1093.75
15-20     17.5               13                227.5      3981.25
20-25     22.5               21                472.5      10631.25
25-30     27.5               16                440        12100
30-35     32.5               8                 260        8450
35-40     37.5               3                 112.5      4218.75
Total                        75                1642.5     40768.75
\[ \therefore S = \sqrt{\frac{40768.75}{75} - \left(\frac{1642.5}{75}\right)^2} = \sqrt{543.58 - 479.61} = \sqrt{63.97} \approx 8.00 \]
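The computation in this example can be reproduced with a short sketch of the shortcut formula for a frequency distribution. The function name `grouped_sd` is an assumption made here for illustration.

```python
import math


def grouped_sd(mid_values, frequencies):
    """SD of a frequency distribution: sqrt(sum(f x^2)/N - (sum(f x)/N)^2)."""
    total = sum(frequencies)                                             # N
    sum_fx = sum(f * x for x, f in zip(mid_values, frequencies))         # sum of f_i x_i
    sum_fx2 = sum(f * x * x for x, f in zip(mid_values, frequencies))    # sum of f_i x_i^2
    variance = sum_fx2 / total - (sum_fx / total) ** 2
    return math.sqrt(variance)


mids = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5]
freqs = [2, 5, 7, 13, 21, 16, 8, 3]
print(round(grouped_sd(mids, freqs), 2))   # 8.0
```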
Merits.
1. It is rigidly defined.
2. It takes all the observations into account.
3. It is amenable to algebraic treatment.
4. It is less affected by sampling fluctuations.
5. The SD of combined series can be obtained if the means, SDs and numbers of observations in each
series are given.
Demerits.
1. It is not readily comprehensible; the calculation requires a good deal of time and knowledge of
arithmetic.
2. It is affected by the extreme values.
3. It cannot be calculated if the extreme classes of the frequency distribution are open.
Uses. It is the most useful measure of dispersion and has got immense use in advanced statistical work
such as sampling, correlation analysis, normal curve errors, comparing variability and uniformity of
two sets of data, etc.
When the deviations of the observations are taken from any arbitrary value A other than $\bar{x}$,
standard deviation reduces to Root Mean Square Deviation, which is defined by
\[ S' = \sqrt{\frac{1}{N}\sum_{i=1}^{k} f_i (x_i - A)^2} \]
where $N = \sum_{i=1}^{k} f_i$ and A is any arbitrary value.
        Height     Weight
Mean    40 inch    10 kg
SD      5 inch     2 kg
CV      0.125      0.20
Examination of the respective standard deviations does not tell us in any meaningful way which
characteristic has more variability than the other, because they are measured in different units. Since
the CV for weight is greater than that of the height, we would conclude that weight has more
variability than height in the population.
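The CV values in the table are consistent with the ratio SD / Mean (5/40 and 2/10). A minimal sketch under that assumption is shown below; the function name is hypothetical, and some texts multiply the ratio by 100 to express it as a percentage.

```python
def coefficient_of_variation(sd, mean):
    """Unit-free ratio SD / mean (assumed definition, matching the table values)."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean


cv_height = coefficient_of_variation(5, 40)   # inches cancel out
cv_weight = coefficient_of_variation(2, 10)   # kilograms cancel out
print(cv_height, cv_weight)    # 0.125 0.2
print(cv_weight > cv_height)   # True: weight is relatively more variable
```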
ii. Co-efficient of range: The co-efficient of range is a relative measure corresponding to range and is
obtained by the following formula
\[ \text{Co-efficient of range} = \frac{L - S}{L + S} \times 100 \]
where L and S are respectively the largest and the smallest observations in the data set.
iii. Co-efficient of Mean Deviation: As the mean deviation can be computed from mean, median,
mode or from any arbitrary value, a general formula for computing coefficient of mean deviation is as
follows:
\[ \text{Co-efficient of mean deviation} = \frac{M.D.(A)}{A} \times 100 \]
where A is the mean, median, mode or any other arbitrary value.
iv. Co-efficient of quartile deviation: The coefficient of quartile deviation is defined by
\[ C.Q.D. = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100 \]
where Q1 and Q3 denote the first and third quartile respectively.
Theorem-1: The variance of the first n natural numbers is $\frac{n^2 - 1}{12}$.
Proof: Let x1, x2, x3, …, xn denote respectively the first n natural numbers 1, 2, 3, …, n.
\[ \text{Then } \sum_{i=1}^{n} x_i = 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2} \]
\[ \text{and } \sum_{i=1}^{n} x_i^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6} \]
Hence the sum of squares of deviations of x1, x2, x3, …, xn from their mean is
\[
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \\
&= \frac{n(n+1)(2n+1)}{6} - \frac{\left\{\frac{n(n+1)}{2}\right\}^2}{n} \\
&= \frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)^2}{4} \\
&= \frac{n(n+1)}{2}\left[\frac{2n+1}{3} - \frac{n+1}{2}\right] \\
&= \frac{n(n+1)}{2} \cdot \frac{n-1}{6}
\end{aligned}
\]
\[ \text{or, } \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{n(n^2 - 1)}{12} \]
\[ \text{or, } \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{n^2 - 1}{12} \]
\[ \text{or, } S_x^2 = \frac{n^2 - 1}{12} \]
Therefore, the variance of the first n natural numbers is $\frac{n^2 - 1}{12}$.
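The result can be checked exactly for a few values of n. The sketch below uses Python's `fractions.Fraction` to avoid rounding; the function name `variance_first_n` and the sample values of n are arbitrary choices for illustration.

```python
from fractions import Fraction


def variance_first_n(n):
    """Exact variance (divisor n) of the natural numbers 1, 2, ..., n."""
    values = range(1, n + 1)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / n


for n in (1, 2, 5, 10, 37):
    # Compare the direct computation with the closed form (n^2 - 1) / 12.
    print(n, variance_first_n(n) == Fraction(n * n - 1, 12))   # True for every n
```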
Theorem-2: The standard deviation is independent of origin but depends on scale.
Proof: Let x1, x2, x3, …, xn be n values of a variable X. Let us define the new values corresponding to
the original observed values by the following relation:
\[ u_i = \frac{x_i - A}{h}, \quad i = 1, 2, \ldots, n, \]
where A is any provisional value (origin) and h is the width of the class interval (scale).
Therefore, $x_i - A = h u_i$
or, $x_i = A + h u_i$ .........(i)
or, $\bar{x} = A + h \bar{u}$ .........(ii)
\[ \text{where } \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \text{ and } \bar{u} = \frac{\sum_{i=1}^{n} u_i}{n} \]
Subtracting (ii) from (i), we have
\[ x_i - \bar{x} = h(u_i - \bar{u}) \]
\[ \text{or, } \sum_{i=1}^{n} (x_i - \bar{x})^2 = h^2 \sum_{i=1}^{n} (u_i - \bar{u})^2 \]
\[ \text{or, } \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = h^2 \cdot \frac{1}{n}\sum_{i=1}^{n} (u_i - \bar{u})^2 \]
\[ \text{or, } S_x^2 = h^2 S_u^2 \]
\[ \text{or, } S_x = h \cdot S_u \]
It is clear from the above relation that Sx does not depend on the origin (A) but depends on scale (h).
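A small numerical sketch of this theorem follows. The data are the six observations from the range example; the origin A = 100 and scale h = 10 are hypothetical values chosen only for the demonstration.

```python
import math


def sd(values):
    """Standard deviation with divisor n, as defined in the text."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


x = [90, 110, 20, 51, 210, 190]
A, h = 100, 10                           # hypothetical origin and scale
u = [(xi - A) / h for xi in x]           # u_i = (x_i - A) / h

print(math.isclose(sd(x), h * sd(u)))    # True: S_x = h * S_u

shifted = [xi + 1000 for xi in x]        # change of origin only
print(math.isclose(sd(shifted), sd(x)))  # True: SD is unchanged
```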
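Theorem-3: For two unequal observations, MD = SD = $\frac{R}{2}$, where MD = mean deviation,
SD = Standard deviation and R = range.
Proof: Let the two observations be x1 and x2 (x1 > x2). By definition, the mean, range and standard
deviation of x1 and x2 are respectively
\[ \bar{x} = \frac{x_1 + x_2}{2}, \quad R = x_1 - x_2 \quad \text{and} \quad S = \sqrt{\frac{\sum_{i=1}^{2} (x_i - \bar{x})^2}{2}} \]
Also the mean deviation of the numbers is
\[ MD = \frac{\sum_{i=1}^{2} \left| x_i - \bar{x} \right|}{2} \]
Now,
\[
\begin{aligned}
MD &= \frac{\sum_{i=1}^{2} \left| x_i - \bar{x} \right|}{2} = \frac{\left| x_1 - \bar{x} \right| + \left| x_2 - \bar{x} \right|}{2} \\
&= \frac{\left| x_1 - \frac{x_1 + x_2}{2} \right| + \left| x_2 - \frac{x_1 + x_2}{2} \right|}{2} \\
&= \frac{\left| \frac{x_1 - x_2}{2} \right| + \left| \frac{x_2 - x_1}{2} \right|}{2} \\
&= \frac{\frac{x_1 - x_2}{2} + \frac{x_1 - x_2}{2}}{2} \\
&= \frac{x_1 - x_2}{2} = \frac{R}{2}
\end{aligned}
\]
Again,
\[
\begin{aligned}
S^2 &= \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2}{2} \\
&= \frac{\left(x_1 - \frac{x_1 + x_2}{2}\right)^2 + \left(x_2 - \frac{x_1 + x_2}{2}\right)^2}{2} \\
&= \frac{\left(\frac{x_1 - x_2}{2}\right)^2 + \left(\frac{x_2 - x_1}{2}\right)^2}{2} \\
&= \frac{\left(\frac{x_1 - x_2}{2}\right)^2 + \left(\frac{x_1 - x_2}{2}\right)^2}{2} \\
&= \frac{2\left(\frac{x_1 - x_2}{2}\right)^2}{2} \\
&= \left(\frac{x_1 - x_2}{2}\right)^2 = \left(\frac{R}{2}\right)^2
\end{aligned}
\]
\[ \text{or, } SD = \frac{R}{2} \]
Hence MD = SD = $\frac{R}{2}$.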
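Theorem-4: SD is the least possible root mean square deviation.
Or, Root mean square deviation is the least when the deviations are taken from the arithmetic mean.
Proof: Let x1, x2, x3, …, xk be the values of k observations with corresponding frequencies f1, f2, …, fk.
Also let $\bar{x}$ be the mean of the observations and A be any arbitrary value. Then
\[
\begin{aligned}
S'^2 &= \frac{1}{N}\sum_{i=1}^{k} f_i (x_i - A)^2 \\
&= \frac{1}{N}\sum_{i=1}^{k} f_i \left[(x_i - \bar{x}) + (\bar{x} - A)\right]^2 \\
&= \frac{1}{N}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 + \frac{2}{N}(\bar{x} - A)\sum_{i=1}^{k} f_i (x_i - \bar{x}) + (\bar{x} - A)^2 \, \frac{1}{N}\sum_{i=1}^{k} f_i \\
&= \frac{1}{N}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 + (\bar{x} - A)^2 \qquad \left[\text{since } \sum_{i=1}^{k} f_i (x_i - \bar{x}) = 0 \text{ and } \sum_{i=1}^{k} f_i = N\right] \\
&= \frac{1}{N}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 + \text{a non-negative quantity}
\end{aligned}
\]
∴ S′² = S² + a non-negative quantity
∴ S² ≤ S′², i.e., S ≤ S′, with equality only when A = $\bar{x}$.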
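The identity S′² = S² + (x̄ − A)² behind this proof can be checked numerically. The sketch below uses a small illustrative frequency distribution (values 2 to 6 with frequencies 1, 3, 7, 3, 1, whose mean is 4) and a few arbitrary choices of A; all names are assumptions made for the example.

```python
import math


def rms_deviation(values, frequencies, a):
    """Root mean square deviation of a frequency distribution about the value a."""
    total = sum(frequencies)
    return math.sqrt(sum(f * (x - a) ** 2 for x, f in zip(values, frequencies)) / total)


values = [2, 3, 4, 5, 6]
freqs = [1, 3, 7, 3, 1]
mean = sum(f * x for x, f in zip(values, freqs)) / sum(freqs)   # 4.0
s = rms_deviation(values, freqs, mean)                          # the SD

for a in (0, 3.5, 4, 10):
    s_prime = rms_deviation(values, freqs, a)
    # S'^2 = S^2 + (mean - A)^2, so S' can never fall below S.
    print(a, s_prime >= s, math.isclose(s_prime ** 2, s ** 2 + (mean - a) ** 2))
```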
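Theorem-5: If $\bar{x}$ and S be the AM and SD respectively of non-negative observations, then
$\bar{x}\sqrt{n - 1} \ge S$.
Proof: Let us consider x1, x2, x3, …, xn to be n non-negative observations. We know
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \Rightarrow \sum_{i=1}^{n} x_i = n\bar{x} \]
\[ S^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n} \Rightarrow nS^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \]
Now
\[ \left(\sum_{i=1}^{n} x_i\right)^2 = \sum_{i=1}^{n} x_i^2 + \sum_{i \ne j} x_i x_j \quad \text{and} \quad \sum_{i \ne j} x_i x_j \ge 0 \]
\[ \therefore \left(\sum_{i=1}^{n} x_i\right)^2 \ge \sum_{i=1}^{n} x_i^2 \]
\[ \Rightarrow \left(\sum_{i=1}^{n} x_i\right)^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \ge \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \]
\[ \Rightarrow \left(\frac{n - 1}{n}\right)\left(\sum_{i=1}^{n} x_i\right)^2 \ge nS^2 \]
\[ \Rightarrow \left(\frac{n - 1}{n}\right)(n\bar{x})^2 \ge nS^2 \]
\[ \Rightarrow \bar{x}^2 (n - 1) \ge S^2 \]
\[ \Rightarrow \bar{x}\sqrt{n - 1} \ge S. \]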
MOMENTS, SKEWNESS & KURTOSIS
Moments. In statistics, moments are certain constant values in a given distribution. The moments help
us to ascertain the nature and form of the underlying distribution.
Moments of a distribution may be calculated from arithmetic mean of the distribution or from any
arbitrary chosen value including zero (origin). When the moments are computed from the arithmetic
mean of the distribution, we call them moments about mean or central moments. When they are
computed from an arbitrary value, we call them raw moments. When they are computed from zero, they
are called moment about origin. Moments about origin are also called raw moments.
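If x1, x2, …, xn be n observations of a variable, then the rth raw moment is defined by
\[ \mu'_r = \frac{\sum_{i=1}^{n} (x_i - A)^r}{n}, \quad \text{where A is any arbitrary value.} \]
The rth central moment is defined by
\[ \mu_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n}. \]
If x1, x2, …, xk occur with frequencies f1, f2, …, fk respectively then the rth raw moment is
\[ \mu'_r = \frac{\sum_{i=1}^{k} f_i (x_i - A)^r}{n}, \quad \text{where } n = \sum_{i=1}^{k} f_i \text{ and A is any arbitrary value,} \]
and the rth central moment is
\[ \mu_r = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^r}{n}, \quad \text{where } n = \sum_{i=1}^{k} f_i \text{ and } \bar{x} = \text{Arithmetic Mean.} \]
Therefore,
\[ \mu'_1 = \frac{\sum_{i=1}^{k} f_i (x_i - A)}{n}, \quad \mu'_2 = \frac{\sum_{i=1}^{k} f_i (x_i - A)^2}{n}, \quad \mu'_3 = \frac{\sum_{i=1}^{k} f_i (x_i - A)^3}{n}, \quad \mu'_4 = \frac{\sum_{i=1}^{k} f_i (x_i - A)^4}{n}, \text{ and so on.} \]
\[ \mu_1 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})}{n} = 0, \quad \mu_2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n}, \quad \mu_3 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^3}{n}, \quad \mu_4 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^4}{n}. \]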
Ex. Compute the first three central moments for the following frequency distribution
Xi: 2 3 4 5 6
fi: 1 3 7 3 1
Solution. We prepare the following table for computing the moments
xi fi fixi xi − x f i ( xi − x ) f i ( xi − x ) 2 f i ( xi − x ) 3
2 1 2 -2 -2 4 -8
3 3 9 -1 -3 3 -3
4 7 28 0 0 0 0
5 3 15 1 3 3 3
6 1 6 2 2 4 8
Total 15 60 0 0 14 0
\[ \text{Here, } \bar{x} = \frac{\sum f_i x_i}{n} = \frac{60}{15} = 4. \]
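\[ \text{Thus, } \mu_1 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})}{n} = \frac{0}{15} = 0, \]
\[ \mu_2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n} = \frac{14}{15} = 0.933, \qquad \mu_3 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^3}{n} = \frac{0}{15} = 0. \]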
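The central moments can be expressed in terms of the raw moments as follows.
\[ \mu'_1 = \frac{\sum_{i=1}^{n} (x_i - A)}{n} = \frac{\sum_{i=1}^{n} x_i - nA}{n} = \bar{x} - A \]
\[ \mu_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})}{n} = 0 \]
\[
\begin{aligned}
\mu_2 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \frac{\sum_{i=1}^{n} \{(x_i - A) - (\bar{x} - A)\}^2}{n} \\
&= \frac{\sum_{i=1}^{n} (x_i - A)^2}{n} - 2\,\frac{\sum_{i=1}^{n} (x_i - A)}{n}\,(\bar{x} - A) + (\bar{x} - A)^2 \\
&= \mu'_2 - 2\mu'_1 \cdot \mu'_1 + (\mu'_1)^2 \\
&= \mu'_2 - (\mu'_1)^2.
\end{aligned}
\]
\[
\begin{aligned}
\mu_3 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n} = \frac{\sum_{i=1}^{n} \{(x_i - A) - (\bar{x} - A)\}^3}{n} \\
&= \frac{\sum_{i=1}^{n} (x_i - A)^3}{n} - 3\,\frac{\sum_{i=1}^{n} (x_i - A)^2}{n}\,(\bar{x} - A) + 3\,\frac{\sum_{i=1}^{n} (x_i - A)}{n}\,(\bar{x} - A)^2 - (\bar{x} - A)^3 \\
&= \mu'_3 - 3\mu'_2 \cdot \mu'_1 + 3\mu'_1 (\mu'_1)^2 - (\mu'_1)^3 \\
&= \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3.
\end{aligned}
\]
\[ \text{Similarly, } \mu_4 = \mu'_4 - 4\mu'_3 \cdot \mu'_1 + 6\mu'_2 (\mu'_1)^2 - 3(\mu'_1)^4. \]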
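These conversion formulas can be verified against a direct computation. In the sketch below the data set and the arbitrary origin A = 100 are illustrative assumptions, and the helper names are hypothetical.

```python
import math


def moment_about(values, a, r):
    """r-th moment of ungrouped data about the value a."""
    return sum((x - a) ** r for x in values) / len(values)


def central_from_raw(m1, m2, m3, m4):
    """Convert raw moments (about any A) into the central moments mu2, mu3, mu4."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


data = [90, 110, 20, 51, 210, 190]
A = 100                                            # arbitrary origin (assumption)
raw = [moment_about(data, A, r) for r in (1, 2, 3, 4)]
converted = central_from_raw(*raw)

mean = sum(data) / len(data)
direct = [moment_about(data, mean, r) for r in (2, 3, 4)]
print(all(math.isclose(c, d, rel_tol=1e-9) for c, d in zip(converted, direct)))   # True
```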
Sheppard’s Correction for grouping. It is generally assumed that the frequencies in a group are
concentrated at the mid point of the class interval. This is surely an approximation. W.F.Sheppard
observed that if a) the frequency distribution is continuous and b) the frequency tapers off to zero in both
directions of the frequency distribution, then the correction for different moments due to grouping at the
mid point of the class interval are done by the following formula, known as Sheppard’s correction.
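\[ \mu_2(\text{Corrected}) = \mu_2 - \frac{C^2}{12}, \qquad \mu_3(\text{Corrected}) = \mu_3, \]
\[ \mu_4(\text{Corrected}) = \mu_4 - \frac{1}{2} C^2 \mu_2 + \frac{7}{240} C^4; \quad \text{where C is the length of class interval.} \]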
Skewness. Skewness means ‘lack of symmetry’ that is departure from symmetry of a distribution. A
distribution is said to be skewed if
(i) Mean, Median and Mode give different values.
(ii) Q1 and Q3 are not equidistant from the median.
(iii) The curve drawn with the help of the given data is not symmetrical but is stretched more to one
side than to the other.
Measures of Skewness.
\[ \text{Pearson's Coefficient of skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}} \]
If Mean > Mode, the skew is positive,
If Mean < Mode, the skew is negative,
If Mean = Mode, the skew is zero, in which case the distribution is symmetrical.
For a moderately skewed distribution, Mean – Mode = 3(Mean – Median)
\[ \text{Therefore, Pearson's Coefficient of skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}} \]
\[ \text{Quartile Coefficient of skewness} = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1} \]
A relative measure of skewness, denoted by β1, is defined as $\beta_1 = \frac{\mu_3^2}{\mu_2^3}$.
Kurtosis. The degree of peakedness or flatness of a distribution relative to a normal distribution is called
kurtosis.
A curve having a relatively higher peak than the normal curve is known as leptokurtic. If the curve is
more flat-topped than the normal curve, it is called platykurtic. A normal curve itself is called
mesokurtic, which is neither too peaked nor too flat-topped.
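Measures of Kurtosis. The most important measure of kurtosis based on 2nd and 4th moments is β2,
defined as $\beta_2 = \frac{\mu_4}{\mu_2^2}$, where μ2 and μ4 are respectively the second and fourth moments about the mean.
Theorem. For any set of observations x1, x2, …, xn, prove that (i) β2 ≥ β1 + 1 and (ii) β2 ≥ 1.
Proof. (i) We know,
\[ \mu_2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}, \quad \mu_3 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n}, \quad \mu_4 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n}. \]
Consider the following expression,
\[ \left\{a(x_i - \bar{x})^2 + b(x_i - \bar{x}) + c\right\}^2 \]
where a, b, c are arbitrary constants. If a, b and c are assumed to be real, then the above is always
non-negative. Thus, expanding the square, summing over i and dividing by n throughout,
\[ \frac{a^2 \sum (x_i - \bar{x})^4}{n} + \frac{b^2 \sum (x_i - \bar{x})^2}{n} + c^2 + 2ab\,\frac{\sum (x_i - \bar{x})^3}{n} + 2ac\,\frac{\sum (x_i - \bar{x})^2}{n} + 2bc\,\frac{\sum (x_i - \bar{x})}{n} \ge 0 \]
\[ \Rightarrow a^2 \mu_4 + b^2 \mu_2 + c^2 + 2ab\,\mu_3 + 2ac\,\mu_2 + 0 \ge 0 \]
Choosing a = 1, $b = -\frac{\mu_3}{\mu_2}$ and $c = -\mu_2$, the above expression becomes
\[ \mu_4 + \frac{\mu_3^2}{\mu_2} + \mu_2^2 - 2\,\frac{\mu_3^2}{\mu_2} - 2\mu_2^2 \ge 0 \]
\[ \Rightarrow \mu_4 - \frac{\mu_3^2}{\mu_2} - \mu_2^2 \ge 0 \]
\[ \Rightarrow \frac{\mu_4}{\mu_2^2} - \frac{\mu_3^2}{\mu_2^3} - 1 \ge 0 \quad [\text{dividing by } \mu_2^2] \]
\[ \Rightarrow \beta_2 - \beta_1 - 1 \ge 0 \]
\[ \Rightarrow \beta_2 \ge \beta_1 + 1. \]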
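(ii) Let $z_i = (x_i - \bar{x})^2$, so that $\bar{z} = \frac{1}{n}\sum z_i = \mu_2$ and $\frac{1}{n}\sum z_i^2 = \mu_4$.
\[ \text{Since } \frac{1}{n}\sum (z_i - \bar{z})^2 \ge 0 \Rightarrow \frac{1}{n}\sum z_i^2 - \bar{z}^2 \ge 0, \]
substituting the values of $z_i$ and $\bar{z}$ we get
\[ \mu_4 - \mu_2^2 \ge 0 \]
\[ \Rightarrow \frac{\mu_4}{\mu_2^2} - 1 \ge 0 \]
\[ \Rightarrow \beta_2 \ge 1. \]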
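Both coefficients, and the two inequalities just proved, can be illustrated with a short sketch. The data below are the distribution with values 2 to 6 and frequencies 1, 3, 7, 3, 1 written out as individual observations; the function names are assumptions made for the example.

```python
def central_moment(values, r):
    """r-th central moment of ungrouped data."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


def beta_coefficients(values):
    """Return (beta1, beta2) = (mu3^2 / mu2^3, mu4 / mu2^2)."""
    mu2 = central_moment(values, 2)
    mu3 = central_moment(values, 3)
    mu4 = central_moment(values, 4)
    if mu2 == 0:
        raise ValueError("beta coefficients are undefined when all values are equal")
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2


data = [2] + [3] * 3 + [4] * 7 + [5] * 3 + [6]
beta1, beta2 = beta_coefficients(data)
print(round(beta1, 3), round(beta2, 3))      # 0.0 2.908
print(beta2 >= beta1 + 1 and beta2 >= 1)     # True
```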