Unit - 1: Statistics: Meaning, Significance & Limitations
Therefore, the processes of collecting, classifying, presenting, analyzing and interpreting numerical facts
that are comparable for some predetermined purpose are collectively known as “Statistics”.
Scope of Statistics.
The scope of statistics is very extensive. It can be divided into two parts –
(i) Statistical Methods such as Collection, Classification, Tabulation, Presentation, Analysis, Interpretation
and Forecasting.
Limitations of statistics
1. Statistics can study only numerical or quantitative aspects of a problem.
2. Statistics deals with aggregates, not with individuals.
3. Statistical results are true only on an average.
4. Statistical laws are not exact.
5. Statistics does not reveal the entire story.
6. Statistical relations do not necessarily bring out the cause and effect relationship between
phenomena.
7. Statistics is collected with a given purpose.
8. Statistics can be used only by experts.
Reasons for distrust in Statistics
By distrust of statistics we mean a lack of confidence in statistical statements and statistical methods. People
often remark:
“Statistics can prove anything.”
“There are three types of lies – lies, damned lies and statistics – wicked in the order of their naming.”
The main reasons for such views are -
a) Figures are convincing, and therefore people are easily led to believe them.
b) Ignorance of the limitations of statistics.
c) Lack of test of accuracy.
d) Contradiction of data from actual circumstances.
e) Lack of specific ability to arrive at correct and appropriate results.
f) Statistics can easily be manipulated.
Functions and Importance of Statistics
Statistical methods are used not only in the social, economic and political fields but in every field of science
and knowledge. Statistical analysis has become more significant in global relations and in the age of fast
developing information technology.
According to Prof. Bowley, “The proper function of statistics is to enlarge individual experiences”.
Following are some of the important functions of Statistics :
Nominal Data
Nominal data is a type of qualitative information that labels variables without assigning them a numerical
value. It is also said to be measured on the nominal scale. Nominal values cannot be ordered or measured.
Sometimes the labels are numerals (for example, codes), but they still act only as names. Examples of nominal
data are letters, symbols, words, gender, etc.
Nominal data are examined using the grouping method: the data are grouped into categories, and then the
frequency or percentage of each category is calculated. These data are commonly represented visually using
pie charts.
Ordinal Data
Ordinal data (an ordinal variable) is a type of data that follows a natural order. The significant feature of
ordinal data is that the difference between the data values is not determined. Such variables are mostly found
in surveys, finance, economics, questionnaires, and so on.
Ordinal data are commonly represented using a bar chart and can be examined with many other visualisation
tools. The information may also be expressed in a table in which each row shows a distinct category.
Discrete Data
Discrete data can take only distinct, separate values. A discrete variable has a finite or countable number of
possible values, and those values cannot be subdivided meaningfully. Here, things are counted in whole numbers.
Example: Number of students in the class
Continuous Data
Continuous data is data that can be measured. It has an infinite number of possible values within any given
range.
Example: Temperature range
Primary and Secondary Data.
Collection of data is the basic activity of statistical science. It means the collection of facts and figures
relating to a particular phenomenon under study, whether the problem lies in business, economics, or the social
or natural sciences.
Such material can be obtained directly from the individual units, called primary sources, or from material
published earlier elsewhere, known as secondary sources.
Difference between Primary & Secondary Data
Basis                 Primary Data                               Secondary Data
Nature                Primary data are original and are          Data collected earlier by someone else,
                      collected for the first time.              now in published or unpublished form.
Collecting agency     These data are collected by the            Secondary data were collected earlier
                      investigator himself.                      by some other person.
Post-collection       These data do not need alteration, as      These have to be analyzed, and the
alterations           they already match the requirements        necessary changes made, to make them
                      of the investigation.                      useful for the investigation.
Time and money        More time, energy and money have to be     Comparatively less time and money
                      spent in collecting these data.            have to be spent.
Classification and Tabulation of Data
Classification is the process of arranging data into various groups, classes and sub-classes according to some
common characteristics, or of separating them into different but related parts.
Main objectives of Classification :-
(i) Classification should be so exhaustive and complete that every individual unit is included in one or the
other class.
(ii) Classification should be suitable according to the objectives of investigation.
(iii) There should be stability in the basis of classification so that comparison can be made.
(iv) The facts should be arranged in proper and systematic way.
(v) Data should be classified according to homogeneity.
(vi) It should be arithmetically accurate.
Data Tabulation
According to Blair, “Tabulation in its broad sense is an orderly arrangement of data in columns and rows.”
Tabulation is a process of presenting the collected and classified data in proper order and systematic way in
columns and rows so that it can be easily compared and its characteristics can be elucidated.
Objects of Tabulation :
➢ Orderly and systematic presentation of data.
➢ Making data precise and stable.
➢ To facilitate comparison.
➢ To make the problem clear and self evident.
➢ To facilitate analysis & interpretation of data.
Frequency Distributions
When the data entered in a table represent the number of items in each class, the classification is called a
‘frequency distribution’. When the variable counted is not a nominal variable, the classes have no natural
definition and must be chosen with care. The main considerations in constructing a frequency
distribution are:
• Determining the number of classes.
• Deciding the size of the classes.
Number of Classes
A frequency distribution must be made with a suitable number of classes. If there are too few classes, the
original data will be compressed: each class will be crowded, and information may be lost. If there are too
many classes, many of them will contain only a few observations, and the distribution will look irregular. As
a rule of thumb, the number of class intervals should lie between 6 and 15.
Example:
Suppose the marks secured by 50 students in a class are given and they range from 0 to 100, then the number
of classes can be decided as 10.
Size of Classes
As far as possible, all classes should be of the same size. To decide the size, find the range (maximum −
minimum) and divide it by the number of class intervals.
Class size = (maximum value − minimum value) / number of class intervals
Example:
Consider two class intervals, 10–20 and 20–30. In the class 10–20, 10 is the lower limit and 20 is the upper
limit.
Class width = upper limit − lower limit = 20 − 10 = 10
The midpoint of a class is the average of its two limits: midpoint = (lower limit + upper limit)/2.
In this example,
Class Interval    Midpoint
10–20             (10 + 20)/2 = 15
20–30             (20 + 30)/2 = 25
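The steps above (choose the number of classes, compute the class size, then count the items in each class) can be sketched in Python. This is only an illustrative sketch: the list of marks is hypothetical, since the notes do not give the raw data, and the function names class_size and frequency_distribution are made up for this example.

```python
# Hypothetical marks out of 100 (the notes do not list the raw data).
marks = [5, 12, 18, 23, 27, 31, 35, 38, 42, 47,
         51, 55, 58, 63, 66, 72, 78, 84, 91, 100]


def class_size(minimum, maximum, n_classes):
    """Class size = (maximum value - minimum value) / number of class intervals."""
    return (maximum - minimum) / n_classes


def frequency_distribution(values, minimum, maximum, n_classes):
    """Return a list of (lower, upper, midpoint, frequency) rows.

    Each class includes its lower limit and excludes its upper limit,
    except the last class, which also includes the maximum value.
    """
    size = class_size(minimum, maximum, n_classes)
    counts = [0] * n_classes
    for v in values:
        index = int((v - minimum) // size)
        index = min(index, n_classes - 1)  # put the maximum into the last class
        counts[index] += 1
    rows = []
    for i, freq in enumerate(counts):
        lower = minimum + i * size
        upper = lower + size
        rows.append((lower, upper, (lower + upper) / 2, freq))
    return rows


table = frequency_distribution(marks, 0, 100, 10)
for lower, upper, mid, freq in table:
    print(f"{lower:>5.0f}-{upper:<5.0f} midpoint {mid:>5.1f}  frequency {freq}")
```

The class 10–20 comes out with midpoint 15, matching the worked example. Treating the upper limit as excluded is one common convention, assumed here so that a value such as 20 is counted in only one class.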
Importance of Diagrams :
A properly constructed diagram appeals to the eye as well as the mind since it is practical, clear and easily
understandable even by those who are unacquainted with the methods of presentation. Utility or importance
of diagrams will become clearer from the following points –
(i) Attractive and Effective Means of Presentation : Well-drawn lines, colours and signs attract the
eye and do not strain the mind of the observer. A layman who does not wish to work through
figures can still get the message from a well-prepared diagram.
(ii) Make Data Simple and Understandable : A mass of complex data, when presented through a
diagram, can be understood easily. According to Shri Morane, “Diagrams help us to understand
the complete meaning of a complex numerical situation at one sight only”.
(iii) Facilitate Comparison : Diagrams make it possible to compare two sets of data relating to
different periods, regions or other facts by placing them side by side.
(iv) Save Time and Energy : Data that would take hours to understand become clear at a glance
when represented through diagrams.
(v) Universal Utility : Because of their merits, diagrams are used to present statistical data
in many different areas. They are a widely used technique in economics, business, administration,
social and other fields.
(vi) Helpful in Information Communication : A diagram can convey information more readily than the
same data shown in a table. Communicating data to the general public becomes easier through
diagrams, which are readily grasped by a person with ordinary knowledge.
Types of Diagrams
Bar Diagram
The bar diagram is simple to draw and easy to read, so it is widely used. It is useful for comparing simple
magnitudes. Bar diagrams can be classified into simple bars (vertical or horizontal), multiple or compound
bars, component bars, and bilateral bars, which show positive and negative values such as profits and losses.
Consider the following points before preparing a bar diagram:
• Proper scale must be used.
• The bars should be of the same width.
• Uniform space must be given between the bars.
• Descriptions of the bars and components are usually given in the diagram itself.
• The title and the diagram number should be mentioned.
Example:
Number of cars sold by the Ford Company in the following months of 2017:
Month of the Year No. of Cars Sold
January 1000
February 1100
March 1100
April 1200
Construct a bar diagram to represent the same.
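As a rough illustration of the points above (a common scale, bars of equal width, uniform spacing, a title), the data can be drawn as a text bar diagram in Python. This is only a sketch, not a substitute for a drawn diagram: the function name text_bar_diagram and the scale of one symbol per 100 cars are assumptions made for this example.

```python
# Data from the example: cars sold in the given months of 2017.
sales = {"January": 1000, "February": 1100, "March": 1100, "April": 1200}


def text_bar_diagram(data, unit=100, symbol="#"):
    """Return one text line per bar; each symbol stands for `unit` items."""
    width = max(len(label) for label in data)  # align all bars at the same column
    lines = []
    for label, value in data.items():
        bar = symbol * round(value / unit)
        lines.append(f"{label:<{width}} | {bar} {value}")
    return lines


print("Cars sold by month, 2017 (one # = 100 cars)")
for line in text_bar_diagram(sales):
    print(line)
```

Because every bar uses the same scale, the lengths can be compared directly: April has the longest bar and January the shortest.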
Simple bar chart
• A chart consisting of one or more bars.
• The actual magnitude of each item is shown.
• The lengths of the bars on the chart allow magnitudes to be compared.
Component bar chart
• A bar chart that gives a breakdown of each total into its components.
• A percentage component bar chart does not show total magnitudes.
Multiple (compound) bar chart
• Two or more separate bars are used to present sub-divisions of data.
• There is usually no space between the bars for data in the same category.
Pie Diagram
A pie diagram is a circular diagram in which the component parts are shown as sectors. The total is
represented by the whole circle, so the total angle is taken as 360°, and a proportional angle is calculated
for each component part. The required degrees are marked off on the circumference and sectors are drawn to
denote the parts. The sectors are coloured differently, with a description given in the diagram or in a
separate legend, and a title is given as well. A pie diagram is used to show the percentage share of each
component in a total.
Working procedure
Total percentage = 100%
Total angle = 360°
1% = 360°/100 = 3.6°
Thus 3.6° represents 1% of the whole. For example, if one component is 10% of the total, its sector angle is
3.6° × 10 = 36°.
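The working procedure can be checked with a short Python sketch. The component names and percentage shares below are hypothetical and chosen only for illustration, and the function name pie_angles is likewise an assumption.

```python
def pie_angles(components):
    """Map each component to its sector angle in degrees.

    Each angle is value * 360 / total, so the angles add up to 360.
    """
    total = sum(components.values())
    return {name: value * 360 / total for name, value in components.items()}


# Hypothetical percentage shares of four components (they add up to 100).
shares = {"A": 10, "B": 25, "C": 40, "D": 25}
angles = pie_angles(shares)
for name, angle in angles.items():
    print(f"{name}: {shares[name]}% -> {angle:.1f} degrees")
```

Dividing by the total means the same function also works when the components are given as raw amounts rather than percentages.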
Example:
The following is an extract of the expenditure of the state government of Tamil Nadu under different heads
(1 unit is equivalent to 1 lakh dollars):
Histogram
A histogram is a method of presenting a frequency distribution in the form of a graph. It consists of bars of
the same width, each referring to a class, with heights equal to the class frequencies. To draw the histogram,
we normally take the class limits of the variable along the x-axis and the frequencies of the class intervals
on the y-axis.
Mark the midpoint of the top of each bar, and also mark, at zero frequency, the midpoints of the class
preceding the first class and of the class succeeding the last class. Linking all the midpoints with straight
lines gives a ‘frequency polygon’; linking them with a smooth freehand curve gives a ‘frequency curve’.
NOTE 1: If the class intervals are uniform in length but are not continuous, they must first be converted
into continuous intervals.
NOTE 2: If the class intervals do not have equal widths, the frequencies must be adjusted according to the
width of each class interval.
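The two notes can be illustrated with a small Python sketch. It assumes one common convention for each adjustment: for NOTE 1, half of the gap between consecutive classes is moved to each side of the class limits; for NOTE 2, each frequency is scaled to the smallest class width. The intervals, frequencies and function names are hypothetical.

```python
def make_continuous(intervals):
    """Close the gap between consecutive inclusive classes (NOTE 1).

    Half of the gap between one class's upper limit and the next class's
    lower limit is subtracted from every lower limit and added to every
    upper limit, e.g. 10-19, 20-29 becomes 9.5-19.5, 19.5-29.5.
    """
    gap = intervals[1][0] - intervals[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in intervals]


def adjusted_frequencies(intervals, frequencies):
    """Scale each frequency to the smallest class width (NOTE 2).

    After scaling, bar areas rather than bar heights are proportional
    to the original frequencies.
    """
    widths = [upper - lower for lower, upper in intervals]
    base = min(widths)
    return [f * base / w for f, w in zip(frequencies, widths)]


# Hypothetical inclusive classes 10-19, 20-29, 30-39.
print(make_continuous([(10, 19), (20, 29), (30, 39)]))

# Hypothetical unequal classes: the last class is twice as wide as the others.
print(adjusted_frequencies([(0, 10), (10, 20), (20, 40)], [5, 8, 12]))
```

In the second call the class 20–40 is twice as wide as the others, so its frequency of 12 is halved to 6 before the bar is drawn.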
Example:
Monthly sales (in lakhs of $) 10–20 20–30 30–40 40–50
Number of companies 3 4 2 1
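For the data above, a rough text version of the histogram and the points of the corresponding frequency polygon can be produced in Python. This is a sketch only: the bars are printed sideways, and the function name frequency_polygon_points is an assumption made for this example.

```python
# Data from the example: monthly sales classes and number of companies.
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
companies = [3, 4, 2, 1]


def frequency_polygon_points(intervals, frequencies):
    """Return (midpoint, frequency) points for the frequency polygon.

    The polygon is closed with zero-frequency classes before the first
    interval and after the last one. Equal class widths are assumed.
    """
    width = intervals[0][1] - intervals[0][0]
    mids = [(lower + upper) / 2 for lower, upper in intervals]
    points = [(mids[0] - width, 0)]
    points += list(zip(mids, frequencies))
    points.append((mids[-1] + width, 0))
    return points


# Histogram printed sideways: one row per class, one # per company.
for (lower, upper), freq in zip(classes, companies):
    print(f"{lower}-{upper} | {'#' * freq} {freq}")

polygon = frequency_polygon_points(classes, companies)
print(polygon)
```

Joining the listed points in order with straight lines gives the frequency polygon, which starts at the midpoint 5 and ends at the midpoint 55, both with frequency zero.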
Frequency Polygon
Frequency Curve
The curve obtained by smoothing the lines of a frequency polygon is called a frequency curve.
It is drawn freehand so that the angularity of the polygon disappears while the area under the frequency curve
remains equal to that under the frequency polygon.
Select the midpoints of the intervals, including those of the preceding and succeeding class intervals, and
link them freehand. The resulting graph is the required frequency curve.
Ogive Curve
An ogive is a cumulative frequency curve. It can be drawn in two ways, as a ‘less than’ or a ‘more than’
curve, so two ogive curves can be drawn from a given set of data. The two curves intersect at a point whose
x-coordinate is the median. Always take the cumulative frequency on the y-axis. To get the ‘less than’ ogive,
plot the points (upper limit of the class interval, ‘less than’ cumulative frequency) and link them with
straight lines. To get the ‘more than’ ogive, plot the points (lower limit of the class interval, ‘more than’
cumulative frequency) and link them with straight lines.
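The cumulative frequencies needed for the two ogives can be computed with a short Python sketch. It reuses the monthly sales data from the histogram example and assumes the usual convention of pairing ‘less than’ cumulative frequencies with upper class limits and ‘more than’ cumulative frequencies with lower class limits. The function names are assumptions made for this example.

```python
# Monthly sales classes and number of companies from the histogram example.
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [3, 4, 2, 1]


def less_than_ogive(intervals, freqs):
    """Points (upper limit, number of items below that limit)."""
    points, running = [], 0
    for (_, upper), f in zip(intervals, freqs):
        running += f
        points.append((upper, running))
    return points


def more_than_ogive(intervals, freqs):
    """Points (lower limit, number of items at or above that limit)."""
    points, remaining = [], sum(freqs)
    for (lower, _), f in zip(intervals, freqs):
        points.append((lower, remaining))
        remaining -= f
    return points


less_points = less_than_ogive(classes, frequencies)
more_points = more_than_ogive(classes, frequencies)
print("less than:", less_points)
print("more than:", more_points)
```

The ‘less than’ points rise to the total frequency of 10, while the ‘more than’ points fall from 10, which is why the two curves cross.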
Line Diagram
Of the two given series, the base entry and the item A entry, take the base entry on the x-axis and the item A
entry on the y-axis. Plot the coordinate points (base, item A) and link them with straight lines.
NOTE: If more than one item is given, plot (base, item 1), (base, item 2), and so on, and draw each line in a
different colour. The base entry is usually the year, month, name of company, etc.
Example: Present the following data graphically.
Year Area (in lakhs of acres) Production (in lakhs of tons)
2003 500 250
2004 550 275
2005 600 275
2006 650 300
2007 700 350