Sampling and Data Collection: Lecture 19-20 Research Methods (Business) Isp-Aht


Lecture 19-20 Research Methods (Business) ISP-AHT

Table of Contents

Sampling and Data Collection
What is sampling?
Some Important Definitions
Types of Sampling
  1- Random/Probability Sampling
    Simple Random Sample
    Systematic Random Sample
    Stratified Random Sample
    Cluster Sampling
  2- Non-Probability Sampling
    Convenience Sampling
    Judgmental/Purposive Sampling
    Quota Sampling
    Snowball Sampling
    Sequential Sampling
    Theoretical Sampling
What is the Appropriate Sample Design?
  Degree of Accuracy
  Resources
  Advance Knowledge of the Population
  National versus Local Project
  Need for Statistical Analysis and Literature support
Sample Size Selection

Sampling and Data Collection


What is sampling?
Sampling is the process of selecting a few respondents from a bigger group (the population) to
serve as the basis for estimating or predicting an unknown piece of information, situation, or
problem concerning the bigger group. A sample is a subgroup of the population that we are
interested in studying.
Suppose you want to estimate the average age of students in your class. One way is to contact all
students, ask their ages, add the ages up, and divide the total by the number of students. The other
method is to select a few students from the class, add up their ages, and divide the total by their
number. From this you can estimate the average age of the class. Similarly, suppose you want
to find out the average income of families living in a city. You would need large resources, time,
and effort to collect information about each family. The other way is to select a few families and
use them to estimate the average income of families in the city.
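The class-age example can be sketched in a few lines of Python. This is only an illustration: the ages are made-up data, and the function name estimate_mean_from_sample and the seed value are assumptions introduced here, not part of the lecture.

```python
import random
import statistics

def estimate_mean_from_sample(population, sample_size, seed=None):
    """Return the mean of a random sample drawn without replacement."""
    rng = random.Random(seed)                    # seeded only to make the sketch repeatable
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)

# Hypothetical ages for a class of 40 students (made-up data).
class_ages = [19, 20, 21, 22, 23] * 8

# Method 1: contact every student (a census).
population_mean = statistics.mean(class_ages)

# Method 2: ask only 10 students and use their mean as the estimate.
sample_mean = estimate_mean_from_sample(class_ages, 10, seed=1)
```

The sample mean will usually be close to, but not exactly equal to, the class mean; that gap is the estimation error listed among the disadvantages of sampling.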
Advantages of Sampling:
1. It saves time, resources, and effort
2. Through sampling we can make predictions or estimates about a large population
Disadvantages of Sampling:
1. We can only estimate or predict about the population
2. We cannot collect full information about the population’s characteristics
3. We can make errors in estimation and in the selection of a sample
4. It can reduce the level of accuracy of the results
5. If the method of enquiry is incorrect, the findings will also be incorrect
Some Important Definitions
What is population?
The population is the entire group of people, events, or things of interest that the researcher
wishes to investigate. It is the group we want to study, from which we take a sample, and about
which we want to draw findings or conclusions. Its size is denoted by N.
What is Sample size?
A sample is a subset of the population. It comprises some members selected from the population.
The number of students or families we select as a sample for our study is called the sample size,
and it is denoted by n.
What is Sample Design?
The way we select students, families, etc. is known as the sampling design or sampling strategy.
What is Sampling Unit/Element?
A unit or element is a single member of the population. Each student or family included in a
sample is called a sampling unit.
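What is Sampling Statistics?
The findings we obtain from a sample are called sample statistics.
What is Sampling Frame?
The sampling frame is the list of all elements in the population from which the sample is drawn.
What is Population Mean?
The population mean is one example of a population parameter, that is, a characteristic of the
whole population. Sample statistics are used to estimate population parameters such as the
population mean.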
What is Double Sampling?
Double sampling is a sampling design in which a sample is first used to collect some preliminary
information of interest, and a sub-sample of this primary sample is later used to examine the
matter in more detail. This plan is adopted when further information is needed from a subset of
the group from which some information has already been collected for the same study.
Types of Sampling:
There are three main types of sampling:
1. Random/Probability Sampling
2. Non-Random Sampling
3. Mixed Sampling/Systematic Sampling

1-Random/Probability Sampling
In probability sampling, all elements in the population have some known, non-zero chance or
probability of being selected as sample subjects.
Probability samples that rely on random processes require more work than non-random ones. A
researcher must identify specific sampling elements (e.g. persons) to include in the sample. For
example, when conducting a telephone survey, the researcher needs to try to reach the specific
sampled person, calling back several times if necessary, to get an accurate sample.

Simple Random Sample


The simple random sample is both the easiest random sample to understand and the one on which
other types are modeled. In simple random sampling, a researcher develops an accurate sampling
frame, selects elements from the sampling frame according to a mathematically random procedure,
and then locates the exact element that was selected for inclusion in the sample.
For example, for a sample of 100, 100 random numbers are needed. The researcher can get
random numbers from a random-number table, a table of numbers chosen in a mathematically
random way. Random-number tables are available in most statistics and research methods books.
The numbers are generated by a pure random process so that any number has an equal probability
of appearing in any position. Computer programs can also produce lists of random numbers. A
random starting point should be selected at the outset.
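As a minimal sketch of the computer-based alternative to a random-number table, the following Python draws a simple random sample of 100 from a numbered sampling frame. The frame contents, the function name simple_random_sample, and the seed are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each equally likely."""
    rng = random.Random(seed)   # a fixed seed only makes the sketch repeatable
    return rng.sample(frame, n)

# Hypothetical sampling frame of 500 numbered persons.
frame = [f"person_{i:03d}" for i in range(1, 501)]

# For a sample of 100, the program picks 100 elements at random, without repeats.
sample = simple_random_sample(frame, 100, seed=42)
```

After the draw, the researcher still has to locate and contact exactly the elements named in the sample; substituting whoever is easier to reach would undo the random selection.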
Random sampling does not guarantee that every random sample perfectly represents the
population. Instead, it means that most random samples will be close to the population most of the
time, and that one can calculate the probability of a particular sample being inaccurate. A
researcher estimates the chance that a particular sample is off or unrepresentative by using
information from the sample to estimate the sampling distribution. The sampling distribution is
the key idea that lets a researcher calculate sampling error and confidence interval.
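One common way to turn these ideas into numbers is to compute the standard error of the sample mean and a confidence interval around it. The sketch below assumes a normal approximation with z = 1.96 for a 95 percent interval, which is an assumption made here and not something the lecture specifies; for small samples a t-based interval would usually be more appropriate. The ages are made-up data.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, lower, upper) using the normal approximation.

    z=1.96 corresponds to roughly 95% confidence (an assumption of this sketch).
    """
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by the square root of n.
    std_error = statistics.stdev(sample) / math.sqrt(n)
    margin = z * std_error
    return mean, mean - margin, mean + margin

# Hypothetical ages of 10 sampled students (made-up data).
sample_ages = [19, 21, 20, 22, 23, 20, 21, 22, 19, 23]
mean, low, high = mean_confidence_interval(sample_ages)
```

A larger sample shrinks the standard error, so the interval around the estimate becomes narrower.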

Systematic Random Sample


Systematic random sampling is simple random sampling with a shortcut for random selection.
Again, the first step is to number each element in the sampling frame. Instead of using a list of
random numbers, the researcher calculates a sampling interval, and the interval becomes his or
her own quasi-random selection method. The sampling interval (i.e. 1 in K, where K is some
number) tells the researcher how to select elements from a sampling frame by skipping elements
in the frame before selecting one for the sample.
Sampling intervals are easy to compute. We need the sample size and the population size. You can
think of the sampling interval as the inverse of the sampling ratio. The sampling ratio for 300
names out of 900 is 300/900 = .333 = 33.3 percent. The sampling interval is 900/300 = 3.
Begin with a random start. The easiest way to do this is to point blindly at one of the numbers at
the beginning of the list, within the first sampling interval.
When the elements are organized in some kind of cycle or pattern, systematic sampling may
not give a representative sample.
For example, suppose there are 100,000 elements in the population and a sample of 1,000 is
desired. In this case the sampling interval, K, is 100. A random number between 1 and 100 is
selected. If, for example, this number is 23, the sample consists of elements 23, 123, 223, 323,
423, 523, and so on.
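The procedure above can be sketched as follows. The function name systematic_sample and the seed are illustrative assumptions; the sketch also assumes the population size is at least as large as the sample size and simply ignores any elements left over when the two do not divide evenly.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Select every k-th element after a random start within the first interval."""
    interval = len(frame) // sample_size       # sampling interval K
    rng = random.Random(seed)
    start = rng.randint(1, interval)           # random start, counted from 1
    positions = range(start, start + interval * sample_size, interval)
    return interval, start, [frame[p - 1] for p in positions]

# Population of 100,000 numbered elements and a desired sample of 1,000.
frame = list(range(1, 100_001))
interval, start, sample = systematic_sample(frame, 1000, seed=7)
```

With a start of 23, the same code would return elements 23, 123, 223, and so on, matching the example in the text.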

Stratified Random Sample


When the population is heterogeneous, a simple random sample may not produce a
representative sample. Some of the bigger strata may be over-represented, while some of the
small ones may be left out entirely. Look at the variables that are likely to affect the results,
and stratify the population in such a way that each stratum becomes a homogeneous group within
itself. Then draw the required sample by using the table of random numbers. Hence, in stratified
random sampling, a sub-sample is drawn using simple random sampling within each stratum.
(Randomization is not done in quota sampling.)
There are three reasons why a researcher chooses a stratified random sample: (1) to increase a
sample’s statistical efficiency, (2) to provide adequate data for analyzing the various
subpopulations, and (3) to enable different research methods and procedures to be used in
different strata.
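A minimal sketch of stratified random sampling, assuming proportionate allocation (the same sampling fraction in every stratum) and at least one element per stratum so that small strata are not eliminated. Both choices, along with the function name stratified_sample and the employee data, are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the given fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # group units into strata
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))       # random draw inside the stratum
    return sample

# Hypothetical population: 20 managers and 180 clerks (made-up data).
employees = [("manager", i) for i in range(20)] + [("clerk", i) for i in range(180)]
sample = stratified_sample(employees, lambda e: e[0], 0.10, seed=3)
```

A plain simple random sample of 20 from these 200 employees could by chance contain no managers at all; stratifying first guarantees that both groups appear.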

Cluster Sampling
The target population is first divided into mutually exclusive and collectively exhaustive
subpopulations, or clusters. Then a random sample of clusters is selected, based on a probability
sampling technique.
The purpose of cluster sampling is to sample economically while retaining the characteristics of a
probability sample. Groups or chunks of elements that, ideally, would have heterogeneity among
the members within each group are chosen for study in cluster sampling. This is in contrast to
choosing some elements from the population as in simple random sampling, or stratifying and
then choosing members from the strata, or choosing every nth case in the population in systematic
sampling. When several groups with intra-group heterogeneity and inter-group homogeneity are
found, then a random sampling of the clusters or groups can ideally be done and information
gathered from each of the members in the randomly chosen clusters.
Cluster samples offer more heterogeneity within groups and more homogeneity among groups.
This is the reverse of stratified random sampling, where there is homogeneity within each group
and heterogeneity across groups.
A researcher draws several samples in stages in cluster sampling. In a three-stage sample, stage 1
is random sampling of big clusters; stage 2 is random sampling of small clusters within each
selected big cluster; and the last stage is sampling of elements from within the sampled small
clusters. For example, one first randomly samples city blocks, then households within
blocks, then individuals within households. This can also be an example of multistage area
sampling.
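The three-stage example of blocks, households, and individuals can be sketched as below. The city structure is made-up data, and the function name three_stage_sample, the choice of one individual per household, and the seed are assumptions of this sketch.

```python
import random

def three_stage_sample(city, n_blocks, n_households, seed=None):
    """Sample blocks, then households within blocks, then one person per household."""
    rng = random.Random(seed)
    selected = []
    for block in rng.sample(sorted(city), n_blocks):                      # stage 1: blocks
        households = city[block]
        for household in rng.sample(sorted(households), n_households):    # stage 2: households
            person = rng.choice(households[household])                    # stage 3: individual
            selected.append((block, household, person))
    return selected

# Hypothetical city: 10 blocks, 5 households per block, 3 persons per household.
city = {
    f"block_{b}": {
        f"house_{b}_{h}": [f"person_{b}_{h}_{p}" for p in range(3)]
        for h in range(5)
    }
    for b in range(10)
}
sample = three_stage_sample(city, n_blocks=4, n_households=2, seed=11)
```

Only the four selected blocks need to be visited, which is where the economy of cluster sampling comes from.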

2- Non-Probability Sampling:
In non-probability sampling, the elements do not have a known or predetermined chance of being
selected as subjects.
In non-probability sampling designs, the elements in the population do not have any probabilities
attached to their being chosen as sample subjects. This means that the findings from the study of
the sample cannot be confidently generalized to the population. However, researchers may at
times be less concerned about generalizability than about obtaining some preliminary information
in a quick and inexpensive way. Sometimes non-probability sampling is the only way to collect
the data.

Convenience Sampling
Convenience sampling (also called haphazard or accidental sampling) refers to sampling by
obtaining units or people who are most conveniently available. For example, it may be convenient
and economical to sample employees in companies in a nearby area, or to sample from a pool of
friends and neighbors. The person-on-the-street interview conducted by TV programs is another
example. TV interviewers go out on the street with a camera and microphone to talk to a few
people who are convenient to interview. The people walking past a TV studio in the middle of the
day do not represent everyone (for example, homemakers or people in rural areas are left out).
Likewise, TV interviewers select people who look “normal” to them and avoid people who are
unattractive, poor, very old, or inarticulate.
Convenience samples are the least reliable but normally the cheapest and easiest to conduct.
Convenience sampling is most often used during the exploratory phase of a research project and is
perhaps the best way of getting some basic information quickly and efficiently. Often such a
sample is taken to test ideas or even to gain ideas about a subject of interest.

Judgmental/Purposive Sampling
Depending upon the type of topic, the researcher lays down the criteria for the subjects to be
included in the sample on the basis of his or her own judgment, for a special purpose. Whoever
meets those criteria could be selected for the sample. The researcher might select such cases or
might provide the criteria to somebody else and leave it to his/her judgment for the actual
selection of the subjects. That is why such a sample is also called a judgmental or expert opinion
sample. For example, a researcher is interested in studying students who are enrolled in a course
on research methods, are highly regular, are frequent participants in the class discussions, and
often come up with new ideas. Once the criteria have been laid down, the researcher may do this
job himself/herself, or may ask the teacher of the class to select the students using the said
criteria. In the latter situation we are leaving it to the judgment of the teacher to select the
subjects. Similarly, we can give some criteria to the fieldworkers and leave it to their judgment to
select the subjects accordingly. In a study of working women, the researcher may lay down
criteria such as: the woman is married, has two children, at least one of whom is of school-going
age, and lives in a nuclear family.

Quota Sampling
Quota sampling is a sampling procedure that ensures that certain characteristics of a population
will be represented in the sample to the exact extent that the researcher desires. In this case the
researcher first identifies relevant categories of people (e.g. male and female; or under age 30,
ages 30 to 60, over 60, etc.) and then decides how many to get in each category. Thus the number
of people in the various categories of the sample is fixed. For example, the researcher decides to
select 5 males and 5 females under age 30, 10 males and 10 females aged 30 to 60, and 5 males
and 5 females over age 60 for a 40-person sample. This is quota sampling.
Once the quota has been fixed, the researcher may use convenience sampling. The
convenience sampling may introduce bias. For example, the field worker might select
individuals according to his/her liking, such as those who can easily be contacted, are willing to
be interviewed, and belong to the middle class. Quota sampling can be considered a form of
proportionate stratified sampling, in which a predetermined proportion of people are sampled from
different groups, but on a convenience basis.
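The 40-person example can be sketched as filling fixed quotas from whoever is encountered first. The stream of passers-by is made-up data, and the function name quota_sample is an assumption of this sketch. Note that no random selection takes place: the order of encounter decides who is included, which is exactly where the bias can enter.

```python
def quota_sample(stream, category_of, quotas):
    """Accept people in the order encountered until every quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in stream:
        category = category_of(person)
        if remaining.get(category, 0) > 0:   # this category still has room
            selected.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):      # every quota is filled
            break
    return selected

# Quotas from the example: 5/5 under 30, 10/10 aged 30 to 60, 5/5 over 60.
quotas = {
    ("male", "under 30"): 5, ("female", "under 30"): 5,
    ("male", "30-60"): 10, ("female", "30-60"): 10,
    ("male", "over 60"): 5, ("female", "over 60"): 5,
}

# Hypothetical passers-by in the order the field worker meets them.
stream = [
    (sex, band, i)
    for i in range(20)
    for sex in ("male", "female")
    for band in ("under 30", "30-60", "over 60")
]
sample = quota_sample(stream, lambda p: (p[0], p[1]), quotas)
```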

Snowball Sampling
Snowball sampling (also called network, chain referral, or reputational sampling) is a method for
identifying and sampling (or selecting) cases in a network. It is based on an analogy to a
snowball, which begins small but becomes larger as it is rolled on wet snow and picks up
additional snow. It begins with one or a few people or cases and spreads out on the basis of links
to the initial cases.
This design has been found quite useful where respondents are difficult to identify and are best
located through referral networks. In the initial stage of snowball sampling, individuals are
discovered and may or may not be selected through probability methods. This group is then used
to locate others who possess similar characteristics and who, in turn, identify others. The
“snowball” gathers subjects as it rolls along.
For example, a researcher examines friendship networks among teenagers in a community. He or
she begins with three teenagers who do not know each other. Each teen names four close friends.
The researcher then goes to the four friends and asks each to name four close friends, then goes to
those four and does the same thing again, and so forth. Before long, a large number of people are
involved. Each person in the sample is directly or indirectly tied to the original teenagers, and
several people may have named the same person. The researcher eventually stops, either because
no new names are given, indicating a closed network, or because the network is so large that it is
at the limit of what he or she can study.
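The referral process in this example can be sketched as a breadth-first walk over a friendship network. The network, the names, the function name snowball_sample, and the max_size limit are all hypothetical; the two stopping conditions mirror the two reasons given above (no new names, or the sample reaching the limit of what can be studied).

```python
from collections import deque

def snowball_sample(seeds, names_friends, max_size):
    """Follow referrals from the initial cases until no new names or the limit is hit."""
    sampled = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        for friend in names_friends(person):
            # Skip people already named by someone else; respect the size limit.
            if friend not in seen and len(sampled) < max_size:
                seen.add(friend)
                sampled.append(friend)
                queue.append(friend)
    return sampled

# Hypothetical friendship nominations (made-up data).
friendships = {
    "Ali": ["Bea", "Cal"],
    "Bea": ["Ali", "Dee"],
    "Cal": ["Dee"],
    "Dee": ["Bea"],
    "Eve": ["Fay"],
    "Fay": ["Eve"],
}
sample = snowball_sample(["Ali"], lambda p: friendships.get(p, []), max_size=10)
```

Starting from Ali, the walk stops once the closed network of four people is exhausted; Eve and Fay are never reached because nobody in that network names them.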

Sequential Sampling
Sequential sampling is similar to purposive sampling, with one difference. In purposive sampling,
the researcher tries to find as many relevant cases as possible, until time, financial resources, or
his or her energy is exhausted. The principle is to get every possible case. In sequential sampling,
a researcher continues to gather cases until the amount of new information or diversity of cases
is exhausted. The principle is to gather cases until a saturation point is reached. In economic
terms, information is gathered until the marginal utility, or incremental benefit of additional
cases, levels off or drops significantly. This requires the researcher to continuously evaluate all
the collected cases. For example, a researcher locates and plans in-depth interviews with 60
widows over 70 years old who have been living without a spouse for 10 or more years. Depending
on the researcher’s purposes, getting an additional 20 widows whose life experiences, social
background, and worldview differ little from the first 60 may be unnecessary.
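One way to make the saturation idea concrete is a stopping rule that ends data collection after several consecutive cases add nothing new. This is only a sketch: the idea of coding each case into a set of themes, the patience parameter, the function name sequential_sample, and the interview data are all assumptions introduced here, and in practice the judgment of saturation is qualitative.

```python
def sequential_sample(cases, themes_of, patience=3):
    """Gather cases until `patience` consecutive cases add no new theme."""
    seen_themes = set()
    collected = []
    cases_without_news = 0
    for case in cases:
        collected.append(case)
        new_themes = set(themes_of(case)) - seen_themes
        if new_themes:
            seen_themes |= new_themes
            cases_without_news = 0           # new information: reset the counter
        else:
            cases_without_news += 1
            if cases_without_news >= patience:
                break                        # saturation point reached
    return collected, seen_themes

# Hypothetical interviews, each coded as a set of themes (made-up data).
interviews = [
    {"grief"}, {"grief", "finances"}, {"family"},
    {"finances"}, {"grief"}, {"family"}, {"health"},
]
collected, themes = sequential_sample(interviews, lambda case: case, patience=3)
```

Here collection stops after the sixth interview, so the "health" theme in the seventh is never seen. A stricter rule (larger patience) costs more cases but lowers the risk of stopping too early.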

Theoretical Sampling
In theoretical sampling, what the researcher is sampling (e.g. people, situations, events, time
periods, etc.) is carefully selected as the researcher develops grounded theory. A growing
theoretical interest guides the selection of sample cases. The researcher selects cases based on
the new insights they may provide. For example, a field researcher may be observing a site and a
group of people during weekdays. Theoretically, the researcher may question whether the people
act the same at other times or when other aspects of the site change. He or she could then sample
other time periods (e.g. nights and weekends) to get a fuller picture and learn whether important
conditions are the same.

What is the Appropriate Sample Design?


A researcher who must make a decision concerning the most appropriate sample design for a
specific project will identify a number of sampling criteria and evaluate the relative importance of
each criterion before selecting a sample design.

Degree of Accuracy
Selecting a representative sample is, of course, important to all researchers. However, the degree
of accuracy required may vary from project to project, especially when cost savings or another
benefit may be traded off against a reduction in accuracy.
Resources
The costs associated with the different sampling techniques vary tremendously. If the researcher’s
financial and human resources are restricted, this limitation of resources will eliminate certain
methods. For a graduate student working on a master’s thesis, conducting a national survey is
almost always out of the question because of limited resources. Managers who weigh the cost
of research against the value of the information will often opt to save money by using a
non-probability sampling design rather than decide to conduct no research at all.

Advance Knowledge of the Population


Advance knowledge of population characteristics, such as the availability of lists of population
members, is an important criterion. The lack of an adequate list may automatically rule out any
type of probability sampling.

National versus Local Project


Geographic proximity of population elements will influence sample design. When population
elements are unequally distributed geographically, a cluster sampling may become more

Need for Statistical Analysis and Literature support


The need for statistical projections based on the sample is often a criterion. Non-probability
sampling techniques do not allow the researcher to use statistical analysis to project the data
beyond the sample.

Sample Size Selection


The choice of sample size depends on:
• Population size (known or unknown)
• Variance estimates (within the variable of interest)
• Desired level of accuracy
• Expected non-response rate
• Number of constructs in the study
• Model complexity
• Budget constraints
• Type of statistical technique selected for data analysis
(Krejcie & Morgan, 1970; Cochran, 1977; Bartlett et al., 2001; Hair et al., 2010)
Also try the following online calculator:
http://www.nss.gov.au/nss/home.nsf/pages/sample+size+calculator
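To show how several of these factors combine, here is a hedged sketch of the widely quoted Cochran-style formula for estimating a proportion, n0 = z^2 * p * (1 - p) / e^2, with the usual finite population correction. The lecture cites Cochran (1977) but does not state a formula, so the defaults below (z = 1.96 for about 95 percent confidence, p = 0.5 as the most conservative variance estimate) and both function names are assumptions of this illustration.

```python
import math

def cochran_sample_size(margin_of_error, z=1.96, p=0.5, population_size=None):
    """Sample size for estimating a proportion, rounded up.

    p*(1-p) plays the role of the variance estimate and margin_of_error the
    desired level of accuracy. If population_size is given, the finite
    population correction is applied.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)

def inflate_for_nonresponse(n, expected_response_rate):
    """Increase the number of people contacted to allow for non-response."""
    return math.ceil(n / expected_response_rate)

unknown_population = cochran_sample_size(0.05)                       # 385
known_population = cochran_sample_size(0.05, population_size=1000)   # 278
to_contact = inflate_for_nonresponse(unknown_population, 0.8)        # 482
```

The other factors in the list (number of constructs, model complexity, analysis technique, budget) are not captured by this formula and have to be weighed separately.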
