Chapter One


CHAPTER ONE: INTRODUCTION TO STATISTICS

1.1 Definition and classification of Statistics


The word statistics is defined in different ways depending on its use in the plural and singular
sense.

In the plural sense: statistics is defined as a collection of numerical facts or figures (or the raw
data themselves).
E.g. 1. Vital statistics (numerical data on marriages, births, deaths, etc.).

2. The statement "the average mark of students in the statistics course is 70%" is considered
statistics, whereas "Abebe scored 90% in the statistics course" is not.
Remark: statistics are aggregates of facts. Single and isolated figures are not statistics, as they
cannot be compared and are unrelated.
In its singular sense:- the word Statistics is the subject that deals with the methods of collecting,
organizing, presenting, analyzing and interpreting statistical data.
Classification of Statistics
Statistics is broadly divided into two categories based on how the collected data are used.
Descriptive Statistics:- deals with describing the collected data without drawing any conclusions beyond the data themselves.
Example 1.1: Suppose that the marks of 6 Mathematics students in the Statistics course are
40, 45, 50, 60, 70 and 80. The average mark of these 6 students is 57.5, and reporting it is
descriptive statistics.
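The average in Example 1.1 can be checked with a short Python sketch. The variable names are illustrative only and are not part of the course material.

```python
from statistics import mean

# Marks of the 6 students from Example 1.1
marks = [40, 45, 50, 60, 70, 80]

# Sum of the marks divided by the number of students
average_mark = mean(marks)
print(average_mark)  # 57.5
```

The value only summarizes these six marks; no claim is made about any other students.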
Inferential Statistics:- It deals with making inferences and/or conclusions about a population
based on data obtained from a sample of observations. It consists of performing hypothesis testing,
determining relationships among variables and making predictions.
Example 1.2: In the above example, if we treat the 6 students as a sample and conclude that the
average mark in the Statistics course for all Mathematics students is 57.5, then we are doing
inferential statistics (drawing a conclusion about the population based on the sample observations).

1.2 Stages of Statistical Investigation


A statistical investigation proceeds through five stages: collection, organization,
presentation, analysis and interpretation of data.
Collection of data: This is the process of obtaining measurements or counts, that is, obtaining raw data.
Data can be collected in a variety of ways; one of the most common is a
sample or census survey. A survey can itself be carried out in different ways. Three of the most common
methods are:
• Telephone survey
• Mailed questionnaire
• Personal interview.
Organization of data: - Data collected from published sources are generally in organized form.
However, if an investigator has collected data through a survey, it is necessary to edit these data in
order to correct any apparent inconsistencies, ambiguities, and recording errors.
This phase also includes grouping the data into classes and tabulating them.
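Grouping data into classes and tabulating them can be sketched as follows. The marks are borrowed from Example 1.1; the class width of 20 and the starting point of 40 are assumptions chosen only for illustration.

```python
from collections import Counter

marks = [40, 45, 50, 60, 70, 80]  # marks from Example 1.1
class_width = 20                  # assumed class width
lowest = 40                       # assumed lower limit of the first class

def class_label(mark):
    """Return the class (e.g. '40-59') that a mark falls into."""
    lower = lowest + ((mark - lowest) // class_width) * class_width
    return f"{lower}-{lower + class_width - 1}"

# Tabulate: count how many marks fall into each class
frequency_table = Counter(class_label(m) for m in marks)
for label, count in sorted(frequency_table.items()):
    print(label, count)
```

A different class width would give a different table; choosing the classes is part of the organization stage.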
Presentation of data:- After the data have been collected and organized they can be presented in
the form of tables, charts, diagrams and graphs. This presentation in an orderly manner facilitates
the understanding as well as analysis of data.
Analysis of data: - The basic purpose of data analysis is to extract useful information for decision
making. This analysis may simply be a critical observation of the data to draw some meaningful
conclusions about them, or it may involve highly complex and sophisticated mathematical techniques.
Interpretation of data: - Interpretation means drawing conclusions from the data collected and
analyzed. Correct interpretation leads to a valid conclusion of the study and thus can aid
decision making.
1.3 Definition of some statistical terms
Population: - It is the totality of objects under study. The population represents the target of an
investigation, and the objective of the investigation is to draw conclusions about the population;
hence it is sometimes called the target population. The word population does not necessarily refer to
people.
Examples:
• All clients of a telephone company
• All students of Debre Markos University (DMU)
• A population of families, etc.
The population could be finite or infinite (an imaginary collection of units).
Sample: - is a part or subset of the population under study.

Sampling frame: - is the list of all possible units of the population from which the sample can be
drawn.

E.g. a list of all students of DMU, a list of all residential houses in Debre Markos town, etc.

Survey: - is an investigation of a certain population to assess its characteristics. It may be a census
or a sample survey.
Census survey: a complete enumeration of the population under study.
Sample survey: the process of collecting data covering a representative part or portion of a
population.
Parameter: - is a statistical measure of a population, or a summary value calculated from a
population. Examples: the population average, range, proportion, variance, etc.
Statistic: - is a descriptive measure of a sample, or a summary value calculated from a sample.
Sampling: - The process or method of selecting a sample from the population.
Sample size: - The number of elements or observations to be included in the sample.
An element: - is a member of a sample or population. It is the specific subject or object (for example a
person, firm, item, etc.) about which information is collected.
Variable: - It is an item of interest that can take numerical or non-numerical values for different
elements. It may be qualitative or quantitative. Example: age, weight, sex, marital status, etc.
Observation (measurement):- is the value of a variable for an element.
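The difference between a parameter and a statistic can be illustrated with a small Python sketch. The population below is invented purely for illustration (500 random marks); it is not real data, and the sample size of 25 is an arbitrary assumption.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: marks of 500 students (invented for illustration)
population = [random.randint(30, 95) for _ in range(500)]

# Sampling: select 25 elements from the population without replacement
sample_size = 25
sample = random.sample(population, sample_size)

parameter = mean(population)  # summary value of the population: a parameter
statistic = mean(sample)      # summary value of the sample: a statistic

print(f"parameter = {parameter:.2f}, statistic = {statistic:.2f}")
```

The two values generally differ; inferential statistics is concerned with what the statistic can tell us about the parameter.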
1.4 Applications, uses and limitations of Statistics
Statistics can be applied in any field of study which seeks quantitative evidence. For instance,
engineering, economics, natural science, etc.
a) Engineering: Statistics has wide application in engineering.
• To compare the breaking strength of two types of materials.
• To determine the reliability of a product.
• To control the quality of products in a given production process.
• To compare the improvement in yield due to certain additives such as fertilizer, herbicides,
etc.
b) Economics: Statistics is widely used in economic study and research.
• To measure and forecast Gross National Product (GNP).
• Statistical analyses of population growth, inflation rate, poverty, unemployment figures,
rural or urban population shifts and so on influence much of economic policy making.
• Financial statistics are necessary in the fields of money and banking, including consumer
savings and credit availability.
c) Statistics and research: there is hardly any advanced research going on without the use of
statistics in one form or another. Statistics are used extensively in medical, pharmaceutical and
agricultural research.
Function/Uses of Statistics
Today the field of statistics is recognized as a highly useful tool for decision making by
managers in modern business and industry and in frequently changing technology. It has many functions
in everyday activities. The following are some uses of statistics:
• It condenses and summarizes a mass of data: the original set of data (raw data) is normally
voluminous and disorganized until it is summarized and expressed in a few presentable,
understandable and precise figures.
• Statistics facilitates comparison of data: measures obtained from different sets of data can be
compared to draw conclusions about those sets. Statistical values such as averages, percentages,
ratios, rates, coefficients, etc., are the tools that can be used for comparing sets of
data.
• Statistics helps to predict future trends: statistics is very useful for analyzing past and
present data and forecasting future events.
• Statistics helps to formulate and review policies: statistics provides the basic material for framing
suitable policies. Statistical study results in areas such as taxation, the unemployment rate,
inflation, the performance of military equipment, etc., may convince a government
to review its policies and plans with a view to meeting national needs and aspirations.
• Formulating and testing hypotheses: statistical methods are extremely useful in formulating
and testing hypotheses and in developing new theories.
Limitations of Statistics

The field of statistics, though widely used in all areas of human knowledge and widely applied in
a variety of disciplines such as engineering, economics and research, has its own limitations. Some
of these limitations are:
a) It does not deal with individual values: as discussed earlier, statistics deals with aggregates of
facts. For example, the wage earned by an individual worker at any one time, taken by itself, is not
statistics.
b) It does not deal with qualitative characteristics directly: statistics is not directly applicable to
qualitative characteristics such as beauty, honesty, poverty, standard of living and so on, since these
cannot be expressed in quantitative terms. These characteristics can, however, be dealt with
statistically if quantitative values can be assigned to them according to some logical criterion. For example,
intelligence may be compared to some degree by comparing IQs or other scores on certain
intelligence tests.
c) Statistical conclusions are not universally true: statistics is not an exact science in the way
the natural sciences are, so statistical conclusions are true only under certain assumptions.
d) It can be misused: statistics cannot be used to full advantage in the absence of proper
understanding of the subject matter.
1.5 Levels of Measurement

Proper knowledge about the nature and type of data to be dealt with is essential in order to specify
and apply the proper statistical method for their analysis and inferences.
Scale Types
Measurement is the assignment of values to objects or events in a systematic fashion. Four levels
of measurement scales are commonly distinguished: nominal, ordinal, interval, and ratio. Each
possesses different properties of measurement systems. The first two are qualitative while the last
two are quantitative.
Nominal scale: The values of a nominal attribute are just different names, i.e., nominal attributes
provide only enough information to distinguish one object from another. They are qualities with no ranking
or ordering and no numerical or quantitative value. Data of this type consist of names, labels
and categories. This is a scale for grouping individuals into different categories.
Example 1.3: Eye color: brown, black, etc.; sex: male, female.
• In this scale, one value is simply different from another.
• Arithmetic operations (+, -, *, ÷) are not applicable, and order comparisons (<, >) are not
meaningful; values can only be checked for being the same or different (=, ≠).
Ordinal scale: - defined as nominal data that can be ordered or ranked.
• Values can be arranged in some order, but the differences between the data values are
meaningless.
• Data consisting of an ordering or ranking of measurements are said to be on an ordinal
scale of measurement. That is, the values of an ordinal scale provide enough information
to order objects.
• One value is different from, and greater/better/less than, another.
• Arithmetic operations (+, -, *, ÷) are not applicable; comparison (<, >, ≠, etc.) is possible.
Example 1.4: Letter grading (A, B, C, D, F); rating scales (excellent, very good, good, fair,
poor); military rank (general, colonel, lieutenant, etc.).

Interval scale: data are ordinal data for which the differences between data values are also
meaningful. However, there is no true zero, or starting point, so ratios of data values are
meaningless. Note: Celsius and Fahrenheit temperature readings have no meaningful zero, so their ratios
are meaningless.
In this measurement scale:
• One value is different from, and greater or less than, another by a measurable amount.
• It is possible to add and subtract. For example, 80°C - 50°C = 30°C and 70°C - 40°C = 30°C.
• Multiplication and division are not meaningful. For example, 60°C = 3(20°C) numerically, but this does
not imply that an object at 60°C is three times as hot as an object at 20°C.
The most common examples are IQ and temperature (in Celsius or Fahrenheit).
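One way to see why ratios are meaningless on an interval scale is to convert the same temperatures from Celsius to Fahrenheit. In the sketch below, the helper name celsius_to_fahrenheit is illustrative; the conversion formula F = C × 9/5 + 32 is the standard one.

```python
def celsius_to_fahrenheit(c):
    """Standard Celsius-to-Fahrenheit conversion."""
    return c * 9 / 5 + 32

# Differences: equal gaps in Celsius remain equal gaps in Fahrenheit
gap_a_c = 80 - 50
gap_b_c = 70 - 40
gap_a_f = celsius_to_fahrenheit(80) - celsius_to_fahrenheit(50)
gap_b_f = celsius_to_fahrenheit(70) - celsius_to_fahrenheit(40)
print(gap_a_c == gap_b_c, gap_a_f == gap_b_f)  # True True

# Ratios: "three times" in Celsius is not "three times" in Fahrenheit
ratio_c = 60 / 20
ratio_f = celsius_to_fahrenheit(60) / celsius_to_fahrenheit(20)
print(ratio_c, round(ratio_f, 2))  # 3.0 2.06
```

Because the ratio depends on the arbitrary choice of zero point, it carries no physical meaning, whereas the comparison of differences survives the change of unit.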
Ratio scale: Similar to the interval scale, except that there is a true zero (absolute absence), or starting point,
and the ratios of data values have meaning.
• Arithmetic operations (+, -, *, ÷) are applicable. For ratio variables, both differences and
ratios are meaningful.
• One value is different from, and larger/taller/better/less than, another by a certain amount and by
so many times.
• This measurement scale provides more information than the interval scale of measurement.
Example 1.5: weight, age, number of students.
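The four scales can be summarized by the kinds of operations that are meaningful at each level. The following Python sketch encodes that summary; the names Scale, MINIMUM_SCALE and is_meaningful are hypothetical and chosen only for this illustration.

```python
from enum import IntEnum

class Scale(IntEnum):
    """Levels of measurement, from least to most informative."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Lowest scale at which each kind of operation becomes meaningful
MINIMUM_SCALE = {
    "equality": Scale.NOMINAL,     # same or different
    "ordering": Scale.ORDINAL,     # greater than, less than
    "difference": Scale.INTERVAL,  # addition and subtraction
    "ratio": Scale.RATIO,          # multiplication and division
}

def is_meaningful(operation, scale):
    """Return True if the operation is meaningful for data on the given scale."""
    return scale >= MINIMUM_SCALE[operation]

print(is_meaningful("ordering", Scale.NOMINAL))     # False
print(is_meaningful("difference", Scale.INTERVAL))  # True
print(is_meaningful("ratio", Scale.INTERVAL))       # False
print(is_meaningful("ratio", Scale.RATIO))          # True
```

Each higher scale keeps every property of the scales below it, which is why a single ordered enumeration is enough to express the rules.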
