Unit - 1: Statistics: Meaning, Significance & Limitations


“Statistics” means the numerical presentation of facts. The word is used in two senses, plural and singular.
In the plural, “statistics” means a collection of numerical facts or data, for example price
statistics, agricultural statistics, production statistics, etc. In the singular, the word means the statistical
methods with the help of which the collection, analysis and interpretation of data are accomplished.
Characteristics of Statistics -
a) Aggregate of facts/data
b) Numerically expressed
c) Affected by different factors
d) Collected or estimated
e) Reasonable standard of accuracy
f) Predetermined purpose
g) Comparable
h) Systematic collection.
Therefore, the processes of collecting, classifying, presenting, analyzing and interpreting numerical facts
that are comparable and gathered for some predetermined purpose are collectively known as “Statistics”.
Scope of Statistics.
The scope of statistics is very extensive. It can be divided into two parts –
(i) Statistical Methods such as Collection, Classification, Tabulation, Presentation, Analysis, Interpretation
and Forecasting.
(ii) Applied Statistics – It is further divided into three parts:
a) Descriptive Applied Statistics : Purpose of this analysis is to provide descriptive information.

b) Scientific Applied Statistics : Data are collected with the purpose of some scientific research and with
the help of these data some particular theory or principle is propounded.
c) Business Applied Statistics : Under this branch statistical methods are used for the study, analysis
and solution of various problems in the field of business.
Limitations of statistics
1. Statistics can study only numerical or quantitative aspects of a problem.
2. Statistics deals with aggregates not with individuals.
3. Statistical results are true only on an average.
4. Statistical laws are not exact.
5. Statistics does not reveal the entire story.
6. Statistical relations do not necessarily bring out the cause and effect relationship between
phenomena.
7. Statistics is collected with a given purpose.
8. Statistics can be used only by experts.
Reasons for distrust in Statistics
By distrust of statistics we mean a lack of confidence in statistical statements and statistical methods. People
often remark:
“Statistics can prove anything.”
“There are three types of lies – lies, damned lies and statistics – wicked in the order of their naming.”
The main reasons for such views are -
a) Figures are convincing, and therefore people are easily led to believe them.
b) Ignorance of the limitations of statistics.
c) Lack of test of accuracy.
d) Contradiction of data from actual circumstances.
e) Lack of specific ability to arrive at correct and appropriate results.
f) Can easily be manipulated.
Functions and Importance of Statistics
Statistical methods are used not only in the social, economic and political fields but in every field of science
and knowledge. Statistical analysis has become more significant in global relations and in the age of fast
developing information technology.
According to Prof. Bowley, “The proper function of statistics is to enlarge individual experiences”.
Following are some of the important functions of Statistics :
a) To provide numerical facts.

b) To simplify complex facts.
c) To enlarge human knowledge and experience.
d) Helps in formulation of policies.
e) To provide comparison.
f) To establish mutual relations.
g) Helps in forecasting.
h) Test the accuracy of scientific theories.
i) To study extensively and intensively.
The use of statistics has become almost essential in order to clearly understand and solve a problem.
Statistics proves to be very useful in unfamiliar fields of application and in complex situations such as:
a) Planning
b) Administration
c) Economics
d) Trade & Commerce
e) Production management
f) Quality control
g) Helpful in inspection
h) Insurance business
i) Railways & transport companies
j) Banking Institutions
k) Speculation and Gambling
Basic Vocabulary of Statistics

VARIABLES
A variable is a characteristic of an item or individual and is what you analyze when you use a statistical
method.
DATA
Data are the different values associated with a variable.
OPERATIONAL DEFINITIONS
Data values are meaningless unless their variables have operational definitions, universally accepted
meanings that are clear to all associated with an analysis.
POPULATION
A population consists of all the items or individuals about which you want to draw a conclusion. The
population is the “large group”.
SAMPLE
A sample is the portion of a population selected for analysis. The sample is the “small group”.
PARAMETER
A parameter is a numerical measure that describes a characteristic of a population.
STATISTIC
A statistic is a numerical measure that describes a characteristic of a sample.
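The distinction between a parameter and a statistic can be illustrated with a small sketch. The marks below are hypothetical values chosen only for illustration; the mean is used as the numerical measure, although any other measure would serve equally well.

```python
from statistics import mean

# Hypothetical population: marks of all 10 students in a small class.
population = [52, 61, 47, 70, 58, 66, 49, 75, 63, 59]

# Hypothetical sample: marks of 4 students selected from that class.
sample = [61, 70, 49, 63]

# A parameter describes a characteristic of the whole population.
population_mean = mean(population)

# A statistic describes the same characteristic of the sample only,
# and is typically used to estimate the parameter.
sample_mean = mean(sample)

print("Parameter (population mean):", population_mean)
print("Statistic (sample mean):", sample_mean)
```

The two values generally differ, which is why a statistic is treated as an estimate of the parameter rather than as the parameter itself.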
Types of Data in Statistics
Qualitative or Categorical Data

Qualitative data, also known as categorical data, are data that fit into categories. Qualitative
data are not numerical. Categorical information involves categorical variables that describe features
such as a person’s gender, home town etc. Categorical measures are defined in terms of natural language
specifications, not in terms of numbers.
Sometimes categorical data can hold numerical values (quantitative value), but those values do not have
mathematical sense. Examples of the categorical data are birthdate, favourite sport, school postcode. Here, the
birthdate and school postcode hold the quantitative value, but it does not give numerical meaning.
Nominal Data
Nominal data is one of the types of qualitative information which helps to label the variables without providing
the numerical value. Nominal data is also called the nominal scale. It cannot be ordered and measured. But
sometimes, the data can be qualitative and quantitative. Examples of nominal data are letters, symbols, words,
gender etc.
Nominal data are examined using the grouping method. In this method, the data are grouped into
categories, and then the frequency or the percentage of each category is calculated. These data are commonly
represented visually using pie charts.
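The grouping method can be sketched in a few lines of Python. The responses below are hypothetical, and the helper name `frequency_table` is an illustrative choice, not something defined in these notes.

```python
from collections import Counter

# Hypothetical nominal data: favourite sport reported by ten people.
responses = ["cricket", "football", "cricket", "hockey", "football",
             "cricket", "tennis", "cricket", "football", "hockey"]

def frequency_table(values):
    """Group values into categories; return {category: (frequency, percentage)}."""
    counts = Counter(values)
    total = len(values)
    return {category: (count, 100 * count / total)
            for category, count in counts.most_common()}

table = frequency_table(responses)
for category, (count, percent) in table.items():
    print(f"{category:<10} {count:>3} {percent:6.1f}%")
```

The percentages obtained this way are the shares that a pie chart of the same data would display.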
Ordinal Data
Ordinal data (an ordinal variable) is a type of data which follows a natural order. The significant feature of ordinal
data is that the difference between the data values is not determined. This variable is mostly found in surveys,
finance, economics, questionnaires, and so on.
The ordinal data is commonly represented using a bar chart. These data are investigated and interpreted
through many visualisation tools. The information may be expressed using tables in which each row in the
table shows the distinct category.
Quantitative or Numerical Data

Quantitative data is also known as numerical data which represents the numerical value (i.e., how much, how
often, how many). Numerical data gives information about the quantities of a specific thing. Some examples
of numerical data are height, length, size, weight, and so on. The quantitative data can be classified into two
different types based on the data sets. The two different classifications of numerical data are discrete data and
continuous data.
Discrete Data
Discrete data can take only distinct, separate values. Discrete information contains only a finite or countable
number of possible values, which cannot be subdivided meaningfully. Here, things can be counted in whole numbers.
Example: Number of students in the class
Continuous Data
Continuous data is data that can be measured. It has an infinite number of possible values within a given
range.
Example: Temperature range
Primary and Secondary Data.
Collection of data is the basic activity of statistical science. It means the collection of facts and figures relating to
a particular phenomenon under study, whether the problem is in business, economics, or the social or natural
sciences.
Such material can be obtained directly from the individual units, called primary sources, or from material
published earlier elsewhere, known as secondary sources.
Difference between Primary & Secondary Data
| Basis | Primary Data | Secondary Data |
|---|---|---|
| Nature | Primary data are original and are collected for the first time. | Data which were collected earlier by someone else, and which are now in published or unpublished state. |
| Collecting agency | These data are collected by the investigator himself. | Secondary data were collected earlier by some other person. |
| Post-collection alterations | These data do not need alteration as they are according to the requirement of the investigation. | These have to be analyzed and necessary changes have to be made to make them useful as per the requirements of the investigation. |
| Time & money | More time, energy and money has to be spent in collection of these data. | Comparatively less time and money is to be spent. |
Classification and Tabulation of Data
Classification is the process of arranging data into various groups, classes and sub-classes according to some
common characteristics, or of separating them into different but related parts.
Main objectives of Classification :-
(i) To make the data easy and precise

(ii) To facilitate comparison
(iii) Classified facts expose the cause-effect relationship.
(iv) To arrange the data in proper and systematic way
(v) Only classified data can be presented in a proper tabular form.
Essentials of an Ideal Classification :-
(i) Classification should be so exhaustive and complete that every individual unit is included in one or the
other class.
(ii) Classification should be suitable according to the objectives of investigation.
(iii) There should be stability in the basis of classification so that comparison can be made.
(iv) The facts should be arranged in proper and systematic way.
(v) Data should be classified according to homogeneity.
(vi) It should be arithmetically accurate.
Data Tabulation
According to Blair, “Tabulation in its broad sense is an orderly arrangement of data in columns and rows.”
Tabulation is a process of presenting the collected and classified data in proper order and systematic way in
columns and rows so that it can be easily compared and its characteristics can be elucidated.
Objects of Tabulation :
➢ Orderly and systematic presentation of data.
➢ Making data precise and stable.
➢ To facilitate comparison.
➢ To make the problem clear and self evident.
➢ To facilitate analysis & interpretation of data.
Frequency Distributions
A classification that records the number of items falling in each class is called a ‘frequency
distribution’. When the variable counted is not a nominal variable, there
may be problems with the definition of classes. The main considerations in constructing a frequency
distribution are:
• Determining the number of classes.
• Deciding the size of the classes.
Number of Classes
A frequency distribution must be made with a suitable number of classes. If there are too few classes, the original
data will be compressed: each class will be crowded, and information may be lost. If there are too many
classes, many of them will contain only a few observations, and the distribution will look irregular. As a
rule of thumb, a distribution is best presented with between 6 and 15 class intervals.
Example:
Suppose the marks secured by 50 students in a class are given and they range from 0 to 100, then the number
of classes can be decided as 10.
Size of Classes
As far as possible, all classes should be of the same size. To decide the size, find the range (max – min) and
divide it by the number of intervals.
Class size = ([maximum value − minimum value]/number of class intervals)
Example:
Consider two class intervals 10–20 and 20–30. In the class 10–20, 10 is the lower limit and 20 is the upper
limit.
Class width = upper limit − lower limit = 20 − 10 = 10
Midpoint of the class:

Midpoint of the class interval = (lower limit + upper limit)/2
In this example,
| Class Interval | Midpoint |
|---|---|
| 10–20 | (20 + 10)/2 = 15 |
| 20–30 | (30 + 20)/2 = 25 |
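The class size and midpoint formulas above can be combined into a short sketch that builds a frequency distribution. The function names and the list of marks are hypothetical; the rule that a value equal to an upper limit is counted in the next class (except for the last class) is an assumed convention, not one stated in these notes.

```python
def class_intervals(minimum, maximum, number_of_classes):
    """Return equal-width (lower, upper) class limits covering minimum..maximum."""
    size = (maximum - minimum) / number_of_classes  # class size formula from the text
    return [(minimum + i * size, minimum + (i + 1) * size)
            for i in range(number_of_classes)]

def midpoint(lower, upper):
    """Midpoint of a class interval = (lower limit + upper limit) / 2."""
    return (lower + upper) / 2

def frequencies(values, intervals):
    """Count the values in each class.

    Assumed convention: a value equal to an upper limit belongs to the next
    class, except in the last class, which includes its upper limit.
    """
    counts = [0] * len(intervals)
    last = len(intervals) - 1
    for value in values:
        for i, (lower, upper) in enumerate(intervals):
            if lower <= value < upper or (i == last and value == upper):
                counts[i] += 1
                break
    return counts

# Hypothetical marks, used only to exercise the functions.
marks = [12, 25, 37, 41, 58, 60, 73, 88, 95, 100]
intervals = class_intervals(0, 100, 10)

for (lower, upper), count in zip(intervals, frequencies(marks, intervals)):
    print(f"{lower:5.0f}-{upper:<5.0f} midpoint {midpoint(lower, upper):5.1f}  frequency {count}")
```

With marks ranging from 0 to 100 and 10 classes, the class size works out to 10, matching the earlier example.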
Diagrammatic and Graphic Presentation of Data

Diagrammatic Representation
Depicting statistical data in the form of attractive shapes such as bars, circles, and rectangles is called
diagrammatic presentation.
A diagram is a visual form of presentation of statistical data, highlighting their basic facts and relationships.
Diagrams use geometrical figures such as lines, bars, squares, rectangles, circles, curves, etc., and are used with
great effectiveness in the presentation of all types of data.
When properly constructed, they readily show information that might otherwise be lost amid the details of
numerical tabulation.
Importance of Diagrams :
A properly constructed diagram appeals to the eye as well as the mind since it is practical, clear and easily
understandable even by those who are unacquainted with the methods of presentation. Utility or importance
of diagrams will become clearer from the following points –
(i) Attractive and Effective Means of Presentation : Well-drawn lines, colours and signs
attract the eye and do not strain the mind of the observer. A common man who does not wish
to work through figures gets the message from a well-prepared diagram.
(ii) Make Data Simple and Understandable : A mass of complex data, when presented through a
diagram, can be understood easily. According to Shri Morane, “Diagrams help us to understand
the complete meaning of a complex numerical situation at one sight only”.
(iii) Facilitate Comparison : Diagrams make comparison possible between two sets of data of
different periods, regions or other facts by placing them side by side.
(iv) Save Time and Energy : Data which would take hours to understand become clear at a
glance when represented through diagrams.
(v) Universal Utility : Because of their merits, diagrams are used for the presentation of statistical data
in many different areas. They are a widely used technique in economics, business, administration, social and
other fields.
(vi) Helpful in Information Communication : A diagram can convey information more readily than the same data
shown in a table. Communicating data to the general public becomes easier through
diagrams, which are readily grasped by a person with ordinary knowledge.
Types of Diagrams
Bar Diagram
The bar diagram is simple to draw and easy to read. It is widely used and is useful for comparing simple
magnitudes. Bar diagrams can be classified into simple (vertical) bars, horizontal bars, multiple (compound) bars,
component bars, and bilateral bars, the last of which show profits and losses.
Consider the following points before preparing a bar diagram:
• Proper scale must be used.
• The bars should be of the same width.
• Uniform space must be given between the bars.
• Descriptions of the bars and components are usually given in the diagram itself.
• The title and the diagram number should be mentioned.
Example:
Number of cars sold by the Ford Company in the following months of 2017:
| Month of the Year | No. of Cars Sold |
|---|---|
| January | 1000 |
| February | 1100 |
| March | 1100 |
| April | 1200 |
Construct a bar diagram to represent the same.
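As a rough illustration, the bar diagram for this example can be sketched as text output. This is only a stand-in for a drawn diagram: the function name `text_bar_chart` and the scale of one `#` per 100 cars are assumptions made for the sketch.

```python
# Data from the example above: cars sold per month.
sales = {"January": 1000, "February": 1100, "March": 1100, "April": 1200}

def text_bar_chart(data, units_per_symbol=100):
    """Return one line per item; each '#' stands for units_per_symbol units."""
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Same scale for every bar, so bar lengths are directly comparable.
        bar = "#" * round(value / units_per_symbol)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines

print("Cars sold (each # = 100 cars)")
for line in text_bar_chart(sales):
    print(line)
```

Every bar is drawn on the same scale and with the same thickness, and the title states the scale, which mirrors the points listed above for preparing a bar diagram.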
Types of Bar Graph
Simple
• A chart consisting of one or more bars.
• The actual magnitude of each item is shown.
• The lengths of the bars on the chart allow magnitudes to be compared.
Component
• A bar chart that gives a breakdown of each total into its components.
• A percentage component bar chart does not show total magnitudes.
Compound
• Two or more separate bars are used to present sub-divisions of data.
• There is usually no space between the bars for data in the same category.
Pie Diagram
A pie diagram is a circular diagram in which the component parts are shown as different sectors. The total figure is
represented by the whole circle, the total angle is taken as 360°, and for each component part the proportional
angle is calculated. The desired degrees are marked off on the circumference and sectors are drawn to denote
the parts. The sectors are coloured differently, and a description is given in the diagram or in a separate legend. A title is given
as well. It is used to show the percentage share of each component in a total.
Working procedure
Total percentage = 100%
Total angle = 360°
1% = 360/100 = 3.6°
where 3.6° will represent 1% of the whole. For example, if 1 component is 10%, it implies that it is equivalent
to 3.6°* 10 = 36°.
Example:
The following is an extract of the expenditure of the state government of Tamil Nadu on different heads (1
unit is equivalent to 1 lakh of dollars):

| Different Heads | Expenditure (lakhs of $) |
|---|---|
| Direct demands on revenue (DDR) | 3,00,000 |
| Administration (ADMIN) | 20,00,000 |
| Other items (OI) | 22,00,000 |
| Overall expenses (Total) | 45,00,000 |
Step 1: Express each component as a percentage of the total expenditure.
DDR = (300000/4500000) * 100 = 6.7%
ADMIN = (2000000/4500000) * 100 = 44.4%
OI = (2200000/4500000) * 100 = 48.9%
Step 2: Convert each percentage into its equivalent angle in degrees.
DDR = 6.7 * 3.6 = 24.12°
ADMIN = 44.4 * 3.6 = 159.84°
OI = 48.9 * 3.6 = 176.04°
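The two steps can be checked with a short sketch. The function name `pie_angles` is hypothetical; the sketch computes each angle from the unrounded share, which is an assumption that differs slightly from the hand calculation above, where the rounded percentage is multiplied by 3.6.

```python
# Expenditure figures from the example above.
expenditure = {"DDR": 300000, "ADMIN": 2000000, "OI": 2200000}

def pie_angles(components):
    """Return {name: (percentage, angle in degrees)} for each component of a total."""
    total = sum(components.values())
    result = {}
    for name, value in components.items():
        percentage = value / total * 100
        angle = value / total * 360  # equivalent to percentage * 3.6
        result[name] = (round(percentage, 1), round(angle, 2))
    return result

for name, (percentage, angle) in pie_angles(expenditure).items():
    print(f"{name}: {percentage}% -> {angle} degrees")
```

Working from the unrounded shares gives 24°, 160° and 176°, which add up to 360°. The small differences from 24.12°, 159.84° and 176.04° come only from rounding the percentages to one decimal place before converting them.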
Histogram
A histogram is a method of presenting a frequency distribution in the form of a graph. It consists of bars of
the same width, each referring to a class, with heights equal to the class frequencies. To draw the histogram,
we normally take the class limits of the variable along the x-axis and the frequencies of the class intervals
on the y-axis.
To derive further graphs from it, mark the midpoint of the top of each bar, and also mark on the x-axis the
midpoints of the class preceding the first class and of the class succeeding the last class. Linking all these
midpoints with straight lines gives a ‘frequency polygon’. Linking them with a smooth free-hand curve gives a
‘frequency curve’.
NOTE 1: If the class intervals are uniform in length but are not continuous, then they must first be converted
into continuous intervals.
NOTE 2: If the class intervals do not have equal width, then the frequencies must be adjusted based on the
width of each class interval.
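Both notes can be sketched in code. The notes do not spell out the adjustments, so the sketch assumes two common conventions: closing the gap between inclusive classes by moving each limit half the gap outward, and using frequency density (frequency divided by class width) as the bar height for unequal classes. The function names and sample classes are hypothetical.

```python
def make_continuous(classes):
    """Convert inclusive classes such as 10-19, 20-29 into continuous ones.

    Half of the gap between one upper limit and the next lower limit is
    subtracted from every lower limit and added to every upper limit.
    Assumes the gap is the same between all neighbouring classes.
    """
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in classes]

def frequency_density(classes, frequencies):
    """Frequency per unit of class width, used as bar height for unequal classes."""
    return [f / (upper - lower) for (lower, upper), f in zip(classes, frequencies)]

# Hypothetical inclusive classes, for illustration only.
inclusive = [(10, 19), (20, 29), (30, 39)]
print(make_continuous(inclusive))

# Hypothetical unequal classes with their frequencies.
unequal = [(0, 10), (10, 30), (30, 40)]
print(frequency_density(unequal, [5, 8, 6]))
```

With frequency density as the height, the area of each bar stays proportional to its class frequency, so a wide class does not look more frequent than it is.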
Example:
| Monthly sales (in lakhs of $) | 10–20 | 20–30 | 30–40 | 40–50 |
|---|---|---|---|---|
| Number of companies | 3 | 4 | 2 | 1 |
Frequency Polygon
A frequency polygon is a graphical presentation of both discrete and continuous series.
For a discrete frequency distribution, the frequency polygon is obtained by plotting frequencies on the Y-axis against
the corresponding values of the variable on the X-axis and then joining consecutive points with straight lines.
In a continuous series, the mid-points of the tops of the rectangles of the histogram are joined with straight lines. To
make the area of the frequency polygon equal to that of the histogram, the line so drawn is extended to meet the base line
(X-axis) on both sides.
In practice: select the midpoints of the class intervals, including those of the preceding and succeeding class
intervals, and link them with straight lines. The resulting graph is the required frequency polygon.
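The points of a frequency polygon can be computed before anything is drawn. The sketch below uses the monthly sales data from the histogram example; the function name is hypothetical, and equal-width continuous classes are assumed.

```python
def frequency_polygon_points(classes, frequencies):
    """Return (midpoint, frequency) points, closed on the x-axis at both ends.

    Assumes equal-width, continuous classes given as (lower, upper) pairs.
    """
    width = classes[0][1] - classes[0][0]
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    # Midpoints of the imaginary classes before the first and after the last,
    # each with zero frequency, so that the polygon meets the base line.
    points = [(midpoints[0] - width, 0)]
    points += list(zip(midpoints, frequencies))
    points.append((midpoints[-1] + width, 0))
    return points

# Monthly sales example from the histogram section.
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
companies = [3, 4, 2, 1]
print(frequency_polygon_points(classes, companies))
```

Joining consecutive points with straight lines gives the polygon; joining them with a smooth free-hand curve instead gives the frequency curve.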
Frequency Curve
The curve derived by smoothing a frequency polygon is called a frequency curve.
The curve is drawn free-hand so that the angularity of the polygon disappears while the area under the
frequency curve remains equal to that of the frequency polygon.
In practice: select the midpoints of the class intervals, including those of the preceding and succeeding class
intervals, and link them with a smooth free-hand curve. The resulting graph is the required frequency curve.
Ogive Curve
An ogive is a cumulative frequency curve. It can be constructed in two ways, as a ‘less than’ or a ‘more than’
curve, so two ogive curves can be drawn from a given set of data. The two curves intersect at a point whose
x-value is the median. Always take the cumulative frequency on the y-axis. To get the less-than ogive, plot the
points (upper limit of the class interval, less-than cumulative frequency) and link them with a smooth curve or
straight line segments. To get the more-than ogive, plot the points (lower limit of the class interval,
more-than cumulative frequency) and link them in the same way.
Example: Construct Ogive curves for the following data:

| Weight (kg) | 30–40 | 40–50 | 50–60 | 60–70 | 70–80 |
|---|---|---|---|---|---|
| No. of students | 400 | 500 | 700 | 300 | 100 |

Take the class limits on the x-axis and the cumulative frequency on the y-axis. Draw the less-than ogive
through the points (upper limit, less-than cumulative frequency) and the more-than ogive through the points
(lower limit, more-than cumulative frequency).
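The cumulative frequencies for this example can be worked out with a short sketch. It follows the usual convention, assumed here, of pairing less-than cumulative frequencies with upper class limits and more-than cumulative frequencies with lower class limits; the function names are hypothetical.

```python
from itertools import accumulate

# Data from the example above.
classes = [(30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
students = [400, 500, 700, 300, 100]

def less_than_ogive(classes, frequencies):
    """Points (upper limit, number of observations below that limit)."""
    cumulative = list(accumulate(frequencies))
    return [(upper, cf) for (_, upper), cf in zip(classes, cumulative)]

def more_than_ogive(classes, frequencies):
    """Points (lower limit, number of observations at or above that limit)."""
    total = sum(frequencies)
    # Observations lying below each lower limit.
    below = [0] + list(accumulate(frequencies))[:-1]
    return [(lower, total - b) for (lower, _), b in zip(classes, below)]

print("Less than:", less_than_ogive(classes, students))
print("More than:", more_than_ogive(classes, students))
```

The less-than values rise from 400 to the total of 2000, while the more-than values fall from 2000 to 100, so the two curves cross once.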
Line Diagram
Given two series, a base entry and an item A entry, take the base entry on the x-axis and the item A entry
on the y-axis. Plot the coordinate points (base, item A) and link consecutive points with straight lines.
NOTE: If more than one item is given, draw a separate line for each item, (base, item 1), (base, item 2),
using different colours. Usually, the base entry is a year, month, name of a company, etc.
Example: Present the following data graphically.
| Year | Area (in lakhs of acres) | Production (in lakhs of tons) |
|---|---|---|
| 2003 | 500 | 250 |
| 2004 | 550 | 275 |
| 2005 | 600 | 275 |
| 2006 | 650 | 300 |
| 2007 | 700 | 350 |
