Mathematical Statistics

Sampling Methods
Defining the Target Population
 It is critical to the success of the
research project to clearly define
the target population.

 Rely on logic and judgment.

 The population should be defined in
connection with the objectives of
the study.
Technical Terminology
 An element is an object on which a
measurement is taken.
 A population is a collection of elements
about which we wish to make an
inference.
 Sampling units are nonoverlapping
collections of elements from the
population that cover the entire
population.
Technical Terms
 A sampling frame is a list of sampling
units.
 A sample is a collection of sampling units
drawn from a sampling frame.
 Parameter: numerical characteristic of a
population.
 Statistic: numerical characteristic of a
sample.
Errors of nonobservation
 The deviation between an estimate
from an ideal sample and the true
population value is the sampling
error.
 Almost always, the sampling frame
does not match up perfectly with the
target population, leading to errors of
coverage.
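
The distinction between a parameter, a statistic, and the sampling error can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numeric measurements is made up, the mean is chosen as the numerical characteristic, and the function name `sampling_error` is hypothetical. It shows sampling error only, not coverage error.

```python
import random
import statistics

def sampling_error(population, n, seed=0):
    """Return (parameter, statistic, error) for the mean of a simple random sample."""
    rng = random.Random(seed)                 # seeded so the draw is reproducible
    parameter = statistics.mean(population)   # characteristic of the whole population
    sample = rng.sample(population, n)        # n elements drawn without replacement
    statistic = statistics.mean(sample)       # characteristic of the sample
    return parameter, statistic, statistic - parameter

# Hypothetical population: measurements 1..100 on 100 elements.
population = list(range(1, 101))
parameter, statistic, error = sampling_error(population, n=10)
print(parameter, statistic, error)
```

When the sample is the entire population, the statistic equals the parameter and the sampling error is zero.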
Errors of nonobservation
 Nonresponse is probably the most serious
of these errors.
 Arises in three ways:
 Inability of the person responding to
come up with the answer
 Refusal to answer
 Inability to contact the sampled
element
Errors of observation
 These errors can be classified as
due to the interviewer, respondent,
instrument, or method of data
collection.
 Interviewers have a direct and dramatic
effect on the way a person responds to a
question.
 Most people tend to side with the view
apparently favored by the interviewer,
especially if they themselves are neutral.
 Friendly interviewers are more successful.
 In general, interviewers of the same gender,
race, and ethnic group as those being
interviewed are slightly more successful.
 Respondents differ greatly in motivation
to answer correctly and in ability to do so.
 Obtaining an honest response to sensitive
questions is difficult.
 Basic errors
 Recall bias: simply does not remember
 Prestige bias: exaggerates to ‘look’ better
 Intentional deception: lying
 Incorrect measurement: does not understand
the units or definition
Census Sample
 A census study is appropriate when the
population is very small or when it is
otherwise reasonable to include the
entire population.

 It is called a census sample because
data is gathered on every member
of the population.
Why sample?
 The population of interest is usually
too large to attempt to survey all of
its members.
 A carefully chosen sample can be
used to represent the population.
 The sample reflects the characteristics
of the population from which it is
drawn.
Probability versus Nonprobability
 Probability Samples: each member of
the population has a known non-zero
probability of being selected
 Methods include random sampling, systematic
sampling, and stratified sampling.

 Nonprobability Samples: members are
selected from the population in some
nonrandom manner
 Methods include convenience sampling,
judgment sampling, quota sampling, and
snowball sampling
Random Sampling
Random sampling is the purest form of
probability sampling.
 Each member of the population has an equal and
known chance of being selected.
 When there are very large populations, it is often
‘difficult’ to identify every member of the
population, so the pool of available subjects
becomes biased.
 You can use software, such as Minitab, to generate
random numbers or to draw the sample directly.
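
As an alternative to a statistics package, a simple random sample can be drawn with Python's standard library. The sketch below is illustrative only: the frame of 500 member labels and the function name `simple_random_sample` are assumptions, not part of the original material.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n sampling units without replacement, each with equal chance."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(frame, n)    # raises ValueError if n > len(frame)

# Hypothetical sampling frame listing 500 population members.
frame = [f"member_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, n=20, seed=42)
print(sample)
```

Note that the draw is only as good as the frame: members missing from the list have no chance of selection.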
Systematic Sampling
 Systematic sampling is often used instead
of random sampling. It is also called an Nth
name selection technique.
 After the required sample size has been
calculated, every Nth record is selected from a
list of population members.
 As long as the list does not contain any
hidden order, this sampling method is as good
as the random sampling method.
 Its only advantage over the random sampling
technique is simplicity (and possibly cost).
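
The "every Nth record" rule can be sketched as follows. Two details are assumptions added for the illustration rather than taken from the slides: the interval is computed as the list length divided by the sample size (rounded down), and the starting record is chosen at random within the first interval. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th record from the list after a random start."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the length of the frame")
    k = len(frame) // n            # sampling interval (the "N" in "every Nth")
    rng = random.Random(seed)
    start = rng.randrange(k)       # assumed: random start within the first interval
    return frame[start::k][:n]     # every k-th record, trimmed to n records

# Hypothetical list of 500 record numbers; every 25th record is selected.
records = list(range(500))
print(systematic_sample(records, n=20, seed=1))
```

If the list has a hidden periodic order that lines up with the interval, the selected records will share that pattern, which is the caveat noted above.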
Stratified Sampling
 Stratified sampling is a commonly used
probability method that is often superior to random
sampling because it can reduce sampling error.
 A stratum is a subset of the population that shares
at least one common characteristic, such as
males or females.
 Identify the relevant strata and their actual
representation in the population.
 Random sampling is then used to select a sufficient
number of subjects from each stratum.
 Stratified sampling is often used when one or more
of the strata in the population have a low
incidence relative to the other strata.
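
The steps above can be sketched in Python. This is a minimal illustration assuming proportional allocation (each stratum sampled in proportion to its share of the frame) with at least one unit per stratum, so that a low-incidence stratum is never left out. The frame, the stratum labels, and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n, seed=None):
    """Randomly sample within each stratum, proportionally to stratum size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:                       # group the units by stratum
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        share = len(members) / len(frame)    # stratum's representation in the frame
        k = max(1, round(n * share))         # assumed: at least one unit per stratum
        sample[name] = rng.sample(members, min(k, len(members)))
    return sample

# Hypothetical frame: 900 units in stratum "A" and 100 in stratum "B".
frame = [("A", i) for i in range(900)] + [("B", i) for i in range(100)]
result = stratified_sample(frame, stratum_of=lambda unit: unit[0], n=50, seed=7)
print({name: len(units) for name, units in result.items()})
```

Because of rounding, the total drawn can differ slightly from n when the stratum shares do not divide evenly.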
Cluster Sampling
 Cluster Sample: a probability sample in which
each sampling unit is a collection of elements.
 Effective under the following conditions:
 A good sampling frame is not available or is costly
to obtain, while a frame listing clusters is easily obtained
 The cost of obtaining observations increases as the
distance separating the elements increases

 Examples of clusters:
 City blocks – political or geographical
 Housing units – college students
 Hospitals – illnesses
 Automobile – set of four tires
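
A one-stage cluster sample can be sketched as follows: clusters are drawn at random from a frame of clusters, and every element in a chosen cluster is observed. The city blocks, household labels, and the function name `cluster_sample` are hypothetical, and observing all elements of each chosen cluster is an assumption of this sketch.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Randomly choose m clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)   # the frame lists clusters, not elements
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical frame: city blocks, each a collection of households.
blocks = {
    "block_1": ["h1", "h2", "h3"],
    "block_2": ["h4", "h5"],
    "block_3": ["h6", "h7", "h8", "h9"],
    "block_4": ["h10"],
}
picked = cluster_sample(blocks, m=2, seed=3)
print(picked)
```

Only a list of blocks is needed to draw the sample; no list of individual households is required in advance.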
Convenience Sampling
 Convenience sampling is used in
exploratory research where the
researcher is interested in getting an
inexpensive approximation.
 The sample is selected because they are
convenient.
 It is a nonprobability method.
 Often used during preliminary research efforts
to get an estimate without incurring the cost or
time required to select a random sample
Judgment Sampling
 Judgment sampling is a common
nonprobability method.
 The sample is selected based upon
judgment.
 It is an extension of convenience sampling.
 When using this method, the researcher
must be confident that the chosen
sample is truly representative of the
entire population.
Quota Sampling
 Quota sampling is the nonprobability
equivalent of stratified sampling.

 First identify the strata and their
proportions as they are represented in
the population.
 Then convenience or judgment sampling
is used to select the required number of
subjects from each stratum.
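
The two steps can be sketched in Python. In this illustration the quotas are given directly as counts, and "convenience" is modeled by taking subjects in the order they happen to become available. The subject list, the quota values, and the function name `quota_sample` are hypothetical.

```python
def quota_sample(arrivals, stratum_of, quotas):
    """Fill each stratum's quota with the first available subjects."""
    remaining = dict(quotas)                 # subjects still needed per stratum
    sample = []
    for subject in arrivals:                 # convenience order, not random order
        stratum = stratum_of(subject)
        if remaining.get(stratum, 0) > 0:
            sample.append(subject)
            remaining[stratum] -= 1
        if not any(remaining.values()):      # stop once every quota is filled
            break
    return sample

# Hypothetical subjects in the order they were encountered: (stratum, id).
arrivals = [("F", 1), ("F", 2), ("M", 3), ("F", 4), ("M", 5), ("M", 6), ("F", 7)]
quota = quota_sample(arrivals, stratum_of=lambda s: s[0], quotas={"F": 2, "M": 2})
print(quota)  # [('F', 1), ('F', 2), ('M', 3), ('M', 5)]
```

The strata proportions match the targets, but within each stratum the subjects are simply the first ones encountered, which is why this remains a nonprobability method.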
Snowball Sampling
 Snowball sampling is a special nonprobability
method used when the desired sample
characteristic is rare.
 It may be extremely difficult or cost prohibitive
to locate respondents in these situations.
 This technique relies on referrals from initial
subjects to generate additional subjects.
 It lowers search costs; however, it introduces
bias because the technique itself reduces the
likelihood that the sample will represent a good
cross section from the population.
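
The referral process can be sketched as a walk over a referral network. The names, the referral lists, and the function name `snowball_sample` are hypothetical, and following referrals in first-come order is an assumption of this sketch. The initial subjects are assumed to be distinct.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Grow a sample by following referrals from the initial subjects."""
    sample = []
    seen = set(seeds)                        # everyone already contacted or queued
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:         # do not recruit anyone twice
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referral lists; "eve" has the trait but nobody refers to her.
referrals = {
    "ann": ["bob", "cy"],
    "bob": ["cy", "dee"],
    "cy": [],
    "dee": ["ann"],
    "eve": [],
}
snowball = snowball_sample(["ann"], referrals, max_size=10)
print(snowball)  # ['ann', 'bob', 'cy', 'dee']
```

In this example "eve" can never enter the sample because she is outside the initial subject's referral network, which illustrates the bias described above.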
Sample Size?
 The more heterogeneous a population is,
the larger the sample needs to be.
 Depends on the topic: how frequently does it occur?
 For probability sampling, the larger the
sample size, the better.
 With nonprobability samples, results are not
generalizable regardless of sample size; still, consider
the stability of the results.
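
The link between heterogeneity and sample size can be illustrated with a textbook formula that is not part of the slides: for estimating a population mean to within a margin E using a normal approximation, n is roughly (z * sigma / E) squared, where sigma is the population standard deviation. The sketch below assumes simple random sampling from a large population, and the function name `required_sample_size` is hypothetical.

```python
import math

def required_sample_size(sigma, margin, z=1.96):
    """Approximate n needed to estimate a mean within +/- margin (z=1.96 for 95%)."""
    return math.ceil((z * sigma / margin) ** 2)

# Doubling the spread of the population roughly quadruples the required n.
print(required_sample_size(sigma=10, margin=2))  # 97
print(required_sample_size(sigma=20, margin=2))  # 385
```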
Response Rates
 About 20 to 30% usually return a
questionnaire.
 Follow up techniques could bring it up to
about 50%
 Still, response rates under 60 to 70%
challenge the integrity of the random
sample.
 How the survey is distributed can affect
the quality of sampling
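
The practical consequence of these rates is simple arithmetic: the number of surveys to distribute is the number of completed responses wanted divided by the expected response rate. The sketch below is illustrative only. The target of 200 completed surveys and the function name `surveys_to_distribute` are assumptions.

```python
import math

def surveys_to_distribute(target_completes, response_rate):
    """Number of surveys to send so the expected returns reach the target."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(target_completes / response_rate)

# Hypothetical target of 200 completed surveys.
print(surveys_to_distribute(200, 0.25))  # 800 at a 25% response rate
print(surveys_to_distribute(200, 0.50))  # 400 if follow-up raises it to 50%
```

Sending more surveys raises the count of returns but does not address who fails to respond, so a low response rate still threatens how well the returns represent the sample.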

