Data and Variable - PG 9


Paper No: 14 Statistical Applications in Environmental Sciences

Module: 2 Data and its Types

Development Team
Principal Investigator & Co-Principal Investigator: Prof. R.K. Kohli, Prof. V.K. Garg & Prof. Ashok Dhawan, Central University of Punjab, Bathinda
Paper Coordinator: Dr. Harmanpreet Singh Kapoor, Central University of Punjab, Bathinda
Content Writer: Dr. Harmanpreet Singh Kapoor, Central University of Punjab, Bathinda
Content Reviewer: Prof. Kanchan Jain, Panjab University, Chandigarh
Anchor Institute: Central University of Punjab

Statistical Applications in Environmental Sciences


Environmental
Sciences Data and its Types
Description of Module

Subject Name: Environmental Sciences
Paper Name: Statistical Applications in Environmental Sciences
Module Name/Title: Data and its Types
Module Id: EVS/SAES-XIV/2
Pre-requisites: Basic Mathematics
Objectives: To give a basic introduction to data and its various types, with examples
Keywords: Statistics, data, quantitative, qualitative, attributes, variables, discrete variables, continuous variables

Module 2: Data and its Types
• Learning Objectives
• Introduction
• Types of Data
• Comparison of qualitative and quantitative data
• Variable
• Summary
• Suggested Readings
1. Learning Objectives
This module gives brief and relevant information about data, with examples. It helps the reader understand the various types of data and how they are segmented. Self-check questions are included to deepen understanding of the topic.

2. Introduction
‘Data’ is the plural form of the word ‘datum’. Data are a collection of items, in either qualitative or quantitative form, that carry the relevant information about the objective of a study. The data are then analyzed to extract that information.
Data are collected from the sources in which one is interested. In many fields one cannot approach the object of study directly because of constraints of time and money. A study of the factors affecting the environment, for example, requires large instruments and considerable manpower to collect values. However, other departments, such as those dealing with meteorological sciences and remote sensing, study the same objects, so one can use the data they publish for further analysis. A source used in this way is called a secondary source. If one has direct access to the objects and the information, the data collected are said to come from a primary source. So data are collected from two sources:
(a) Primary Source
(b) Secondary Source
Information is collected in the form of data from one of the above sources and is then used for analysis. Whether a characteristic is observed in quantitative or qualitative form depends solely on the nature of that characteristic. For example, the height, weight and age of a person are expressed in numbers. These are examples of quantitative data, and the variables used to record such values are called quantitative variables.
Other characteristics, such as the religion of a person, designation, severity of a disease or gender, are difficult to express as numbers. One can still assign numbers to them for recording purposes, but these numbers have no meaning as values. For example, to record the sex of a person in a government survey, one can use ‘1’ for male and ‘2’ for female. Similarly, to record the health status of patients suffering from a particular disease, one can code ‘good’ as ‘1’, ‘mild’ as ‘2’ and ‘severe’ as ‘3’. These are examples of qualitative data, which are further evaluated to draw conclusions.
The main point to keep in mind is that the values ‘1’, ‘2’ and ‘3’ assigned to the different categories are only representative labels. They do not mean that one value is double or half of another. So one cannot apply arithmetic operations, such as taking an average, to these codes as if they were measured values when making a decision about the study.
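The point about coded categories can be sketched in Python. This is a minimal illustration that assumes the coding used earlier (good = 1, mild = 2, severe = 3); the list of recorded codes is hypothetical. It shows that counting how often each category occurs is a meaningful summary, because it treats the codes purely as labels.

```python
from collections import Counter

# Coding scheme from the text: the numbers are labels only, not measurements.
SEVERITY_LABELS = {1: "good", 2: "mild", 3: "severe"}

# Hypothetical recorded codes for five patients (illustrative values only).
recorded_codes = [1, 3, 2, 3, 3]

# Meaningful summary: how many patients fall in each category.
category_counts = Counter(SEVERITY_LABELS[code] for code in recorded_codes)

# The most frequent category (the mode) is also meaningful for coded data.
most_common_label = category_counts.most_common(1)[0][0]

print(dict(category_counts))
print(most_common_label)
```

An arithmetic mean of `recorded_codes` could be computed, but it would depend entirely on the arbitrary choice of the numbers 1, 2 and 3.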
In the coming sections, the structure of data and its types are discussed in detail.
Self-Check Exercise
Question: Which source is used to study the factors affecting environmental pollution in the country?
Answer: The objective shows that the area under study is very vast, so it is difficult for one person to gather the data with limited time and money. However, data related to the study can be found in reports published by many government departments. Hence a secondary source is generally used in such cases.
Question: Is it true to say that a secondary source of data is always reliable for a study in environmental sciences?
Answer: Several points must be considered before commenting on this statement. Does the secondary data come from a renowned agency or government institution? Are the data as recent as the study requires? Is it actually beneficial to use secondary data? If the answer is yes in each case, the source can be considered reliable for that study, but this is not true in general.

3. Types of data

This section gives a detailed understanding of the types of data.
The following chart shows the different segments of data and their relation to one another.

Figure 1: Types of data. Data are divided into qualitative data (attributes (dichotomous), nominal and ordinal) and quantitative data (discrete and continuous).
From the above chart, one can observe that data are mainly segmented into two forms, and these forms are further divided into various segments. One main branch is qualitative data, which is further divided into attributes, nominal and ordinal data. Similarly, quantitative data include continuous data and discrete data.

3.1 Qualitative Data


Qualitative data deal with information about the characteristics or qualities of an object under study that cannot be measured, for example the color of skin or the shape and color of eyes. Qualitative data are further divided into the following three types:
(a) Attributes
(b) Nominal
(c) Ordinal
3.1 (a) Attributes
Attributes are a type of qualitative data that have only two categories, for example male/female, yes/no or dead/alive. They are called attribute data because, with only two categories, one can say whether or not an item possesses the attribute (a characteristic related to the objective).
For example, in a census many items, such as male/female (excluding third sex), employed/unemployed and educated/illiterate, are treated as attributes.

3.1(b) Nominal
Nominal data are data that have more than two categories and whose categories cannot be ordered.
For example, color of hair (black, brown, blonde etc.), marital status (married, unmarried, divorced, separated), nature of disaster (fire, theft, accident, earthquake, etc.). In these examples, one can observe
that there are more than two categories and that these categories are unordered. It means that one cannot rank black hair against brown hair. Likewise, the categories of marital status, or of nature of disaster, cannot be ranked against one another.

3.1 (c) Ordinal


Ordinal data consist of observations that can be ordered in terms of their characteristics.
For example, tidiness among students has the categories messy, fairly tidy or very neat; build of body has the categories fat, medium-sized or thin; agreement has the categories strongly agree, agree, neither agree nor disagree, disagree, strongly disagree. One can order observations of tidiness from a low to a high level. Similarly, responses on build of body and agreement can be ordered using the available categories.
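The ordering property of ordinal data can be sketched in Python. This is a minimal illustration that uses the agreement categories listed above; the function name `agreement_rank` and the list of responses are hypothetical. It shows that ordinal observations can be sorted once the order of the categories is fixed.

```python
# Agreement categories from the text, arranged from lowest to highest agreement.
AGREEMENT_LEVELS = [
    "strongly disagree",
    "disagree",
    "neither agree nor disagree",
    "agree",
    "strongly agree",
]

def agreement_rank(level: str) -> int:
    """Return the position of a category in the agreed order (0 = lowest)."""
    return AGREEMENT_LEVELS.index(level)

# Hypothetical responses from four people (illustrative values only).
responses = ["agree", "strongly disagree", "strongly agree", "agree"]

# Ordinal data can be sorted by rank; nominal data have no such rank.
sorted_responses = sorted(responses, key=agreement_rank)
print(sorted_responses)
```

For nominal categories such as color of hair, no list like `AGREEMENT_LEVELS` has a natural order, so sorting of this kind would be arbitrary.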

Self-Check Exercise
In each of the following questions, state the type of data with a reason.
Question: Which one of the following subjects do you learn here?
(a) Mathematics
(b) Physics
(c) Statistics
Answer: This question gives qualitative data, which are further categorized as nominal data. The reason is that one cannot order the subjects.
In the previous question, the subjects are nominal data. But they can be ordered, depending on the question.
Question: Which of the following subjects do you like the most?
(a) Mathematics
(b) Physics
(c) Statistics
Answer: It is an example of qualitative data, specifically ordinal data, as one can order the subjects by one’s liking.
Question: How would you rate your learning technique?
Answer: It is an example of qualitative data, specifically ordinal data, as the categories are ordered from poor to excellent.
Question: Did you study statistics in your college?
Answer: Qualitative data; it is an attribute. In this case, we simply say whether or not the item has this attribute.
Question: How would you rate your learning techniques? (1 = excellent, 5 = poor)

Answer: It is an example of qualitative data, specifically ordinal data, as you can grade your learning techniques from 1 to 5.
One can keep the above examples in mind to understand the concept in depth.
In the next sub-section, quantitative data are explained with examples.

3.2 Quantitative Data


Quantitative data consist of numbers (e.g. 1, 0.8, -3.7, ¾……) or quantities (e.g. 1.2 kg, 155 cm……). Most books refer to numerical data as quantitative data.
Quantitative data are further segmented into two types:
(a) Discrete quantitative data
(b) Continuous quantitative data

3.2(a) Discrete quantitative data


Discrete quantitative data are data that can take only particular values, such as whole numbers*. For example, in a study of cancer patients, the number of patients is counted in whole numbers, that is 0, 1, 2, 3….. One cannot count 2.3 or 3.5 persons. Whenever observations can be recorded only as whole numbers, they are examples of discrete quantitative data.

*Whole numbers are the numbers that start from 0 and continue without end, that is 0, 1, 2, 3………………….

3.2(b) Continuous quantitative data


Another form of quantitative data is continuous data, which can take any real value, whether a whole number or not, including negative values. For example, the temperature of a room can be negative, positive or zero. The duration of an event such as an earthquake is measured in minutes and seconds, for example 5½ min, 2 min etc. Continuous quantitative data are typically obtained by measurement, for example of height (cm) or time (sec).
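The contrast between discrete and continuous quantitative data can be sketched in Python. This is a minimal illustration with hypothetical values and a hypothetical helper `is_whole_number_data`; it checks whether a set of observations consists only of whole numbers. The check is a rough heuristic, not a rule: a continuous measurement may happen to be a whole number (such as 2 min), so the nature of the characteristic decides the type, not the recorded values alone.

```python
# Hypothetical observations (illustrative values only).
patient_counts = [0, 3, 12, 7]            # discrete: counts of patients
room_temperatures_c = [-2.5, 0.0, 21.3]   # continuous: measured temperatures

def is_whole_number_data(values) -> bool:
    """True if every value is a non-negative whole number (0, 1, 2, ...)."""
    return all(float(v).is_integer() and v >= 0 for v in values)

print(is_whole_number_data(patient_counts))
print(is_whole_number_data(room_temperatures_c))
```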
Self-Check exercise
Question: State the type of data in each of the following.
(a) Weight of a student
(b) Place of birth
(c) Number of claims arising from a natural disaster
(d) Nature of loss due to a natural disaster
(e) Age in completed years
(f) Loss due to flood
Answer:
(a) Continuous data (Quantitative data)
(b) Nominal (Qualitative data)
(c) Discrete data (Quantitative data)
(d) Ordinal data (Qualitative data), as the loss can be graded, for example as small or large
(e) Discrete data (Quantitative data)
(f) Continuous data (Quantitative data)
In this section, examples of the various types of data were discussed so that one can easily understand the differences between them.
One now has an understanding of qualitative data, quantitative data and their types. One more important topic related to this module is the variable.
4. Comparison of qualitative data and quantitative data
This section gives a clear understanding of the difference between qualitative data and quantitative data through a real-life situation.
Consider a study to quantify the loss caused by a natural disaster in a state. How will one arrive at the total amount of loss? There are many types of loss, such as financial loss and loss of human life. One must have some instrument or tool to quantify the financial loss, whereas loss of human life cannot be measured in the same way. So one needs a questionnaire or paperwork to collect information about the financial loss. The draft questionnaire below shows which item falls under which form of data. The draft questionnaire is
Name:
Age (in years):
Sex:
Occupation:
Type of financial loss:
Estimated financial loss:
Details………………..
No. of family members injured:
No. of family members died:
From the above questionnaire, one can observe that the data contain both qualitative and quantitative forms. Taking the items in turn:
Age (in years): Discrete quantitative data
Sex: Attribute (Qualitative data)
Occupation: Nominal qualitative data (Farmer, shopkeeper, labourer)
Type of financial loss: Ordinal qualitative data
Estimated financial loss: Continuous quantitative data
No. of family members injured: Discrete quantitative data
No. of family members died: Discrete quantitative data
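The classification of the questionnaire items can be written down as a small lookup table in Python. This is only a sketch: the key names (such as `age_years`) are hypothetical shorthand for the questionnaire items, and the types are the ones assigned in the text above.

```python
# Form and type of each questionnaire item, as classified in the text.
# The key names are hypothetical shorthand for the questionnaire items.
QUESTIONNAIRE_TYPES = {
    "age_years": ("quantitative", "discrete"),
    "sex": ("qualitative", "attribute"),
    "occupation": ("qualitative", "nominal"),
    "type_of_financial_loss": ("qualitative", "ordinal"),
    "estimated_financial_loss": ("quantitative", "continuous"),
    "family_members_injured": ("quantitative", "discrete"),
    "family_members_died": ("quantitative", "discrete"),
}

def items_of_form(form: str) -> list:
    """Return the items whose data form is 'qualitative' or 'quantitative'."""
    return [
        item
        for item, (item_form, _) in QUESTIONNAIRE_TYPES.items()
        if item_form == form
    ]

qualitative_items = items_of_form("qualitative")
quantitative_items = items_of_form("quantitative")
print(qualitative_items)
print(quantitative_items)
```

A table of this kind, prepared before data collection, makes it explicit which summaries are appropriate for each item.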

From the above example, one can see that real-life data consist of different types. Hence, one must be clear about the type of each item (question) while preparing a questionnaire or study. The following points about qualitative data and quantitative data are worth keeping in mind:

Qualitative Data                                Quantitative Data

It helps to gain an understanding of the        It helps to quantify data and generalize
features of the data.                           results from the data of interest.

It helps to provide insight into the problem,   It measures the incidence of various
generating ideas for further research.          views and opinions.

It provides a view of the problem where         It is sometimes followed by qualitative
complete enumeration based on numbers           research for further investigation.
is not possible.

Hence, from the above comparison, one can observe that the two forms are complementary to each other. The questionnaire given above illustrates this. Information is collected from the respondents to draw conclusions, which include results about all the attributes and variables available in the data. For example, from the above questionnaire one can conclude, on the basis of the data, that x persons (male/female) of age y suffered a monetary loss of amount z on average. This shows the relationship of the loss amount with attributes like sex and variables like age.

5. Variable
‘Variable’ is another commonly used word when collecting information from observations. Before looking at the definition, let us first understand it through an example.
Basically, data are realizations of a variable. For example, in a study measuring the average height of students, height is the variable, as the height of each student is different and can take any value within a specified range. Other characteristics whose values vary under different conditions are likewise measured through variables.
For example, suppose there are 3 students whose heights in inches are 78, 81 and 56 respectively. Our data then have three values, collected by measuring the heights of the students.
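The distinction between a variable and its data can be sketched in Python using the three heights from the example. The variable is "height"; the list holds its observed values, from which the average height is computed. The names used are illustrative.

```python
from statistics import mean

# The variable is "height"; the data are its observed values (in inches)
# for the three students in the example above.
heights_in = [78, 81, 56]

number_of_observations = len(heights_in)
average_height = mean(heights_in)

print(number_of_observations)
print(round(average_height, 2))
```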

Definition

A variable is defined as a measurement tool whose value fluctuates with changes in the conditions of an objective.
In quantitative data, observations are of either discrete or continuous form. Hence there are two types of variables:
(1) Discrete variables
(2) Continuous variables
5.1) Discrete Variables
A variable is an instrument whose value can change. Discrete variables are those variables whose values can change only in whole numbers, for example the number of boys in a class, the number of patients in a hospital, the number of accidents in a year and the number of states in a country.
Definition
A variable that can take only a finite or countable number of distinct values is termed a discrete variable. Its value varies from one observation to another.
5.2) Continuous Variables
In the previous example, the heights of different persons vary and can take decimal values, so height is a continuous variable. Thus continuous variables are those variables that measure the value of an item as a real number.
Definition
A variable is a tool whose value keeps changing from one observation to another. A continuous variable is a variable that can take any value on the real line that is possible for the observation, whether positive, negative or fractional.
There are uncountably many values between any two numbers such as 1 and 2, e.g. 1.001, 1.01, 1.1…..
The following examples give more insight into continuous variables:
• Time taken by a person to complete a task
• Height of a person
• Wind speed
• Dust particles in the air
• Cost of an object or piece of equipment
• Average speed of a bike
• Mileage of a car

Once the definition of a variable is understood, it is easy to understand independent and dependent variables, which are used in most studies.

Independent Variable
A variable is said to be an independent variable if its value is not affected by changes in the values of the other variables in the study. The values of the other variables may change, but this causes no change in the value of the independent variable.
For example,
Algal density is a variable whose value determines the quantity of Chlorophyll-a, which is used as an indicator of lake water quality. But a change in Chlorophyll-a has no impact on the value of algal density. In this case, algal density is the independent variable.

Dependent Variable
A variable is said to be a dependent variable if its value changes due to a change in another variable. The variable that influences the value of the dependent variable is called the independent variable (from the above definition).
In the previous example, the quantities involved are algal density, lake water quality and Chlorophyll-a, which is used as an indicator of lake water quality. The value of Chlorophyll-a depends on the value of algal density and also on the quality of the lake water. So Chlorophyll-a is the dependent variable and the others are independent variables.
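The roles of the two kinds of variable can be sketched in Python. This is purely illustrative: the linear relation, the conversion factor and the density values below are assumptions made for the sketch, not an empirical model of any lake. The point is only the direction of dependence: the dependent variable is computed from the independent one, never the other way round.

```python
# Hypothetical conversion factor: Chlorophyll-a per unit of algal density.
# This number is an assumption for illustration, not a measured value.
CHLOROPHYLL_PER_UNIT_DENSITY = 0.5

def chlorophyll_a(algal_density: float) -> float:
    """Dependent variable, computed from the independent variable."""
    return CHLOROPHYLL_PER_UNIT_DENSITY * algal_density

# Independent variable: algal density (illustrative values only).
algal_densities = [10.0, 20.0, 40.0]

# Dependent variable: changes whenever algal density changes.
chlorophyll_values = [chlorophyll_a(density) for density in algal_densities]
print(chlorophyll_values)
```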
Self -Check Exercise
Question: A scientist conducts an experiment to test the theory that a vitamin could extend a person’s life expectancy. Which are the independent and dependent variables in the study?
Answer: Here the independent variable is the amount of vitamin given to the subjects in the experiment. The dependent variable is the variable affected by the independent variable, which in this case is life span.
Question: A scientist studies the impact of a drug on cancer. What is the independent variable?
Answer: The scientist studies the impact of a drug on cancer, hence the state of the cancer is the dependent variable. The independent variables are aspects of the administration of the drug, such as the dosage and the timing.
Question: A scientist studies the impact of withholding affection on rats. Which one is the independent variable?
Answer: Here the amount of affection is the independent variable and the dependent variable is the reaction of the rats.
Question: A scientific study examines how many days people can eat soup until they get sick. State the independent and dependent variables.
Answer: Here the number of days of consuming soup is the independent variable and the dependent variable is the onset of illness.

6. Summary
This module gives a detailed account of the different types of data. The types are discussed with examples that help the reader understand the topic easily. In the fourth section, qualitative and quantitative data are compared in terms of their features and properties. Variables and their types are discussed because of their importance in statistics. The questions and answers help the reader check their understanding of the topic.
7. Suggested Readings
Agresti, A. and B. Finlay, Statistical Methods for the Social Sciences, 3rd Edition, Prentice Hall, 1997.

Daniel, W. W. and C. L. Cross, Biostatistics: A Foundation for Analysis in the Health Sciences, 10th Edition, John Wiley & Sons, 2013.

Hogg, R. V., J. Mckean and A. Craig, Introduction to Mathematical Statistics, Macmillan Pub. Co. Inc.,
1978.

Meyer, P. L., Introductory Probability and Statistical Applications, Oxford & IBH Pub, 1975.

Triola, M. F., Elementary Statistics, 13th Edition, Pearson, 2017.

Weiss, N. A., Introductory Statistics, 10th Edition, Pearson, 2017.
