The Basic Concepts of Statistics
“Research and education are the two wheels of development, with research being the front wheel” – Duro Clement Dolapo
Introduction
Researchers use a wide variety of tools to gain an understanding of the phenomena they study.
Perhaps the most important of these is statistics. Statistics plays a fundamental role not only in the analysis
of collected data, but also in the planning and design of research, the process of data collection, and the
interpretation of results. It is no wonder that statistics is taking an increasingly prominent position in research.
What is Research?
Research is simply the process of arriving at dependable solutions to problems through the planned and
systematic collection, analysis and interpretation of data. Research is the most important tool for
advancing knowledge, for promoting progress, and for enabling man to relate more effectively to his
environment, to accomplish his purposes, and to resolve his conflicts (Osuala, 2001). It is oriented towards
finding what works and what does not work in a given situation. Without research, development is a
mirage. Any nation that desires development but is not ready to channel a great deal of her resources to
education and research is just tempting God, no matter the amount of prayers.
What is Statistics?
Statistics is the scientific method of collecting, organizing, summarizing, analyzing, interpreting and
presenting data. If one compares the definitions of research and statistics, one will realize that both
rest on one word, “data”, which is explained further later. The definition of research
can therefore be reframed as “the process of arriving at dependable solutions to problems through statistics”.
Statistics finds application in all human endeavors. The branch of statistics that deals primarily with the
biological sciences and medical/health-related disciplines is biostatistics.
When approaching the study of any organized body of knowledge, especially one as diverse and complex
as statistics, it is important to identify a conceptual framework from which the component
material can be viewed.
A population is a set of persons (or objects) having a common observable characteristic (Kuzma, 1998).
This is referred to as the popular population. The term can also refer to the observable
characteristics of the persons or things themselves; this is referred to as the statistical population.
Populations vary in size. In statistics, two types of population (infinite and finite) can be distinguished
on the basis of size. Infinite populations can be thought of as large populations, while finite populations are
those that are smaller. The distinction is arbitrary, although some researchers regard populations of
10,000 or more as large populations and those of fewer than 10,000 as small populations.
A sample is a subset of a population. The distinction between population and sample is crucial to
understanding research because, more often than not, the researcher is unable to observe
all the units constituting a population for reasons of cost and logistics. He/she can still conduct
the research by observing a representative sample of the population and then extrapolating
the results obtained from the sample to the population.
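The idea of observing a sample and extrapolating to the population can be illustrated with a short Python
sketch. The population below is simulated (hypothetical blood pressure values, not real data), and the
sample size of 200 is an arbitrary choice made for illustration.

```python
import random
import statistics

# Hypothetical population: systolic blood pressure (mmHg) of 10,000 people.
# The values are simulated for illustration only; they are not real data.
rng = random.Random(42)
population = [rng.gauss(120, 15) for _ in range(10_000)]

# Simple random sample of 200 units, drawn without replacement.
sample = rng.sample(population, 200)

# In practice the population mean is unknown; it is computed here only
# so that it can be compared with the estimate from the sample.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.1f}")
print(f"Sample mean:     {sample_mean:.1f}")
```

The sample mean will usually lie close to, but not exactly on, the population mean. How close depends on
the sample size and on how the sample was drawn.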
The word data refers to the recordings of measurements made on characteristics. The singular form is
datum, but since statistics is about groups of persons or objects, the word “data” predominates. Data are
the values that a characteristic or variable can assume.
When characteristics take on different values they are referred to as variables. In other words, a variable
is a characteristic that can assume different values, or a characteristic that varies from one subject (person
or object) to another. For example, sex is a variable because it differs among people. The possible values
of sex are “male” and “female”. The values “male” or “female” recorded for a group of people are referred to as data.
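The difference between a variable and its data can be sketched in Python as follows. The records, variable
names and values are hypothetical and serve only as an illustration.

```python
# Hypothetical records for four subjects; names and values are illustrative.
records = [
    {"sex": "female", "age_years": 34, "weight_kg": 61.5},
    {"sex": "male", "age_years": 41, "weight_kg": 78.0},
    {"sex": "male", "age_years": 29, "weight_kg": 70.2},
    {"sex": "female", "age_years": 52, "weight_kg": 66.8},
]

# The variable is the characteristic ("sex"); the data are its recorded values.
sex_data = [record["sex"] for record in records]

# The distinct values observed for the variable.
possible_values = sorted(set(sex_data))

print("Data:", sex_data)
print("Possible values:", possible_values)
```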
Classification of variables is depicted in the diagram below. The classification is also used for data.
Activity 2.1: List and discuss ten examples of each of the following and their possible values:
1. Nominal variable
2. Ordinal variable
3. Discrete variable
4. Continuous variable
Note: Nominal variables often have only two possible values, e.g. dead or alive, male or
female, cured or not cured. A variable of this type is referred to as a binary or dichotomous variable.
Activity 2.2: Discuss five different parameters and their corresponding statistics
graphical representations of the data. Thus, descriptive statistics, as the name implies, deals with
the description of data. In contrast, inferential statistics is made up of various
techniques used to provide information about parameter values based on observations made on the
values of statistics.
Figure 2: The relationship between population and sample, parameter and statistic, and inferential and descriptive statistics
(Diagram labels: descriptive statistics, probability, inferential statistics)
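The contrast between describing a sample and making an inference about a parameter can be sketched in
Python. The weights below are hypothetical, and the 95% interval uses the normal approximation (1.96), which
is an assumption made here for simplicity; for a sample this small a t-based critical value would normally
be more appropriate.

```python
import math
import statistics

# Hypothetical sample: weights (kg) of 10 subjects.
weights = [58.0, 62.5, 65.0, 70.2, 71.8, 66.4, 59.9, 73.1, 68.0, 64.1]

# Descriptive statistics: summarize the sample itself.
n = len(weights)
mean = statistics.mean(weights)
sd = statistics.stdev(weights)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: use the sample statistic to say something
# about the unknown population parameter (the population mean).
standard_error = sd / math.sqrt(n)
lower = mean - 1.96 * standard_error  # normal approximation, see note above
upper = mean + 1.96 * standard_error

print(f"Sample mean = {mean:.1f} kg, sample SD = {sd:.1f} kg")
print(f"Approximate 95% interval for the population mean: {lower:.1f} to {upper:.1f} kg")
```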
Scales of measurement
We have learnt that populations and samples are made up of subjects (persons or objects), and that
subjects have measurable and observable characteristics that take on different values. We have also learnt
that once measurements are carried out and recorded, the result is called data. But what is meant by the
word measure? Simply put, it means that we assign numbers, letters, words or some other symbols to
persons or things in order to convey information about the characteristic being measured. Thus, we may
assign the number 65 to a man to represent his weight in kilograms, or an “M” to represent his sex
or gender. It is worth noting that measurements taken on variables can yield different amounts of
information depending on the scale employed in the measurement process. Thus, measurements that
produce the numbers 1, 2, 3, 4 and 5 on one scale may convey a different amount of information about
the variable than the same numbers obtained from a different scale. This in turn has
implications for the statistical treatment of such data. The scales of measurement were first described by
Stanley S. Stevens in his 1946 article “On the theory of scales of measurement”. According to
Stevens, the measurement process can be conceived of as existing on four different levels, which he
referred to as the nominal, ordinal, interval (or equal interval), and ratio scales, as described below.
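As a rough illustration of why the scale affects statistical treatment, the Python sketch below applies to
each of the four scales a summary that is meaningful at that level. The data are hypothetical, and the
pairing of scale and summary follows the usual textbook convention rather than anything specific to these notes.

```python
import statistics

# Nominal: categories with no order; counting and the mode are meaningful.
blood_groups = ["A", "O", "B", "O", "AB", "O"]
most_common_group = statistics.mode(blood_groups)

# Ordinal: categories can be ranked, but the gaps between ranks are undefined,
# so the median category is meaningful while a mean is not.
severity_order = ["mild", "moderate", "severe"]
pain = ["moderate", "mild", "severe", "moderate", "mild"]
ranks = sorted(severity_order.index(level) for level in pain)
median_pain = severity_order[ranks[len(ranks) // 2]]

# Interval: equal intervals but no true zero (e.g. temperature in Celsius).
# Differences are meaningful; 40 degrees is not "twice as hot" as 20 degrees.
temp_difference = 40 - 20

# Ratio: equal intervals and a true zero (e.g. weight in kilograms).
# Ratios are meaningful: 80 kg is twice as heavy as 40 kg.
weight_ratio = 80 / 40

print(most_common_group, median_pain, temp_difference, weight_ratio)
```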
Independent variable: A variable thought to be the cause of some effect. This is the variable that can be
manipulated.
Predictor variable: A variable thought to predict an outcome variable. It is another term for independent
variable.
Ratio
Bibliography
1. Adamu SO, Tinuke L. Johnson. Statistics for Beginners, Evans Brothers Nigeria Limited (2011)
2. Afolabi Bamgboye E, A Companion of Medical Statistics, FalBam Publishers, Ibadan, Nigeria, Second edition
(2008).
3. Andy Field, Discovering Statistics Using SPSS, Sage, Los Angeles, 3rd edition
4. Aviva Petrie and Caroline Sabin, Medical Statistics at a Glance, Wiley-Blackwell, 3rd edition
5. Barbara Fadem, High-Yield Behavioural Science, Lippincott Williams & Wilkins
6. Beth Dawson, Robert G. Trapp, Basic & Clinical Biostatistics, Lange Medical Books/McGraw-Hill, 4th edition,
2004.
7. Betty R. Kirkwood, Jonathan A.C. Sterne. Essentials of Medical Statistics. Blackwell Scientific Publications,
California USA, 2nd edition, 2003
8. Bill Taylor, Gautam Sinha, Taposh Ghoshal. Research Methodology: A Guide for Researchers in Management &
Social Sciences. Prentice-Hall of India Private Limited (2006).
9. Bonita R, Beaglehole R, Kjellström T. Basic Epidemiology, World Health Organization, Second edition
10. Clifford Blair R, Richard A. Taylor, Biostatistics for the Health Sciences. Upper Saddle River, New Jersey
11. David Bowers, Allan House, David Owens, Understanding Clinical Papers. John Wiley & Sons Limited, 2012
12. David Machin, Michael J. Campbell, Stephen J. Walters, Medical Statistics: A Textbook for the Health Sciences.
John Wiley & Sons Limited, fourth edition
13. Gail F Dawson, Easy Interpretation of Biostatistics: The Vital Link to Applying Evidence in Medical Decisions.
14. James F. Jekel, David L. Katz, Joan G. Elmore, Epidemiology, Biostatistics, and Preventive Medicine. Saunders,
Second edition
15. Kenneth F. Schulz, David A. Grimes, The Lancet Handbook on Essential Concepts in Clinical Research.
16. Kothari CR, Research Methodology: Methods and Techniques. New Age International Publishers, Second edition
(2011)
17. Mahajan BK, Methods of Biostatistics for Medical Students and Research Workers, Jaypee. Sixth edition
18. Nigel Bruce, Daniel Pope, Debbi Stanistreet, Quantitative Methods for Health Research, John Wiley & Sons
Limited (2008).
19. Osuala EC, Introduction to Research Methodology, Africana-Fep Publishers Limited, Third edition (2001)
20. Stephen H. Gehlbach, Interpreting Medical Literature. McGraw-Hill, fifth edition (2006)
21. Sylvia Wassertheil-Smoller, Biostatistics and Epidemiology: A Primer for Health Professionals, Springer-Verlag,
New York
Practice Questions
1. The data from one of the following types of variables cannot be ordered
A. Nominal
B. Ordinal
C. Discrete
D. Continuous
E. None of the above
4. The following are examples of data except
A. “80kg”
B. “Female”
C. “Australia”
D. “Blood sugar”
E. “One Dollar”
9. Which of the following scales of measurement is known for not having a true zero origin?
A. Ratio
B. Ordinal
C. Interval
D. Nominal
E. None of the above
E. Parameter ↔ Population
11. All the following notations are peculiar to descriptive statistics except
A. s
B.
C. μ
D. s²
E. None of the above
12. Which of the following scales of measurement has all the properties of the others?
A. Interval
B. Nominal
C. Ratio
D. Ordinal
E. None of the above