Chapter 1 • Statistics
Statistics
Section 1.1
True-False Questions
1. The field of statistics can be roughly subdivided into two areas: descriptive statistics and
probability.
ANSWER: F
ANSWER: T
3. In the field of statistics, descriptive statistics includes collecting and describing data while
inferential statistics involves interpreting results from the data.
ANSWER: T
6. A company employs 750 individuals. To ascertain how the employees feel regarding a
pension plan, 75 of the employees are surveyed. The proportion of the 75 employees
who favor the pension plan is a parameter.
ANSWER: F
7. A balance is used to measure weights to three decimal places. The data that result from
this process would be classified as qualitative data.
ANSWER: F
ANSWER: T
ANSWER: T
ANSWER: T
11. A quantitative variable that can assume a countable number of values is referred to as a continuous
variable.
ANSWER: F
12. A quantitative variable that can assume an uncountable number of values is referred to as a
discrete variable.
ANSWER: F
14. Inferential statistics is the study and description of data that result from an experiment.
ANSWER: F
15. Descriptive statistics is the study of a sample that enables us to make projections or
estimates about the population from which the sample is drawn.
ANSWER: F
16. A population is typically a very large collection of individuals or objects about which we
desire information.
ANSWER: T
ANSWER: F
ANSWER: F
20. The “number of rotten oranges per shipping crate” is an example of a quantitative
variable.
ANSWER: T
21. The “thickness of a sheet of sheet metal” used in a manufacturing process is an example
of a quantitative variable.
ANSWER: T
22. The basic objective of statistics is that of obtaining a sample, inspecting this sample, and
then making inferences about the unknown characteristics of the population from which
the sample was drawn.
ANSWER: T
Multiple-Choice Questions
23. Which of the following best describes the data: zip codes for students attending college
in the state of Michigan?
A) Attribute data
B) Numerical data
C) Quantitative data
D) Sample data
ANSWER: A
24. Which of the following best describes the data: grade point averages for athletes?
A) Attribute data
B) Quantitative data
C) Qualitative data
D) Sample data
ANSWER: B
25. Which of the following best describes the data: classifications of unlikely, likely, or very
likely to describe possible buying of a product?
A) Attribute data
B) Numerical data
C) Quantitative data
D) Sample data
ANSWER: A
26. Consider the following data: the height in centimeters of children in a third-grade class. Which of
the following best describes these data?
A) Attribute data
B) Qualitative data
C) Quantitative data
D) Sample data
ANSWER: C
27. Consider the following data: the 18-hole score for all rounds of golf played at Oak Hill Country
Club last year. Which of the following best describes these data?
A) Attribute data
B) Quantitative data
C) Qualitative data
D) Statistic
ANSWER: B
28. Consider the following data: like, no preference, or dislike. Which of the following best describes
these data?
A) Qualitative data
B) Numerical data
C) Quantitative data
D) Statistic
ANSWER: A
29. Consider the following data: the weights of babies born in a given hospital. Which of the following
best describes these data?
A) Attribute data
B) Qualitative data
C) Quantitative data
D) Statistic
ANSWER: C
30. Which of the following data types would not be considered quantitative data?
31. A company has developed a new battery, but the average lifetime is unknown. In order
to estimate this average, a sample of 100 batteries is tested and the average lifetime of
this sample is found to be 250 hours. The 250 hours is the value of a:
A) parameter.
B) statistic.
C) sampling frame.
D) population.
ANSWER: B
32. Which of the following data types would be considered attribute data?
A) Hair color
B) Ages of college freshmen
C) 18-hole score of golfers
D) Shoe size of 3rd grade students
ANSWER: A
Short-Answer Questions
33. The Nielsen Company reports that 30% of the television audience watched a world-
premiere movie. Is this an example of descriptive or inferential statistics?
ANSWER:
Inferential statistics
34. As part of the graduation paperwork, seniors at a particular college were asked to
indicate their post-graduation plans. Results showed that 15% planned to start graduate
school right after college graduation. Is this an example of descriptive or inferential
statistics?
ANSWER:
Descriptive statistics
ANSWER:
Statistic
ANSWER:
Parameter
37. In statistics, what name do we give to a set of all individuals whose properties are to be
analyzed?
ANSWER:
Population
ANSWER:
Sample
39. What is the difference between descriptive statistics and inferential statistics?
ANSWER:
Descriptive statistics involves collecting, presenting, and describing sample data. Inferential
statistics involves interpreting the values produced by descriptive techniques in order to make
decisions and draw conclusions about the population from which the sample was drawn.
40. What is the difference between a finite population and an infinite population?
ANSWER:
A population is finite when the membership could be physically listed. When the
membership is unlimited, the population is infinite.
41. Discuss the difference between a variable and a parameter. Include an illustration.
ANSWER:
42. In completing a survey, respondents use the following numbers to indicate marital
status.
ANSWER:
Even though marital status is coded by number, the data is qualitative as it categorizes
each individual respondent. Also, the average of single and divorced is meaningless.
43. In completing a survey, respondents use the following numbers to indicate ages.
Is this data qualitative or quantitative? Explain.
ANSWER:
44. Explain the difference between the terms “variable” and “data.” Include an illustration
that demonstrates this difference.
ANSWER:
An office supply warehouse has boxes of pencils, 100 pencils to the box. Information about the
entire warehouse as well as a sample of the boxes is shown below:
Defective pencils   Boxes in    Boxes in
per box             warehouse   sample
0                   1500        50
1                   250         20
2                   75          3
3                   40          3
4                   10          1
ANSWER:
All boxes of pencils in the warehouse
ANSWER:
1875 boxes
ANSWER:
Finite; since the number of boxes in the population can be (or could be) physically listed.
ANSWER:
The boxes of pencils sampled.
ANSWER:
77 boxes
50. A quality control technician is interested in the number of boxes with more than two
defectives. What is the value of the parameter?
ANSWER:
50
51. A quality control technician is interested in the number of boxes with more than two
defectives. What is the value of the statistic?
ANSWER:
4
52. A quality control technician is interested in the proportion of boxes with no more than
one defective pencil. What is the value of the parameter?
ANSWER:
1750/1875 = 0.933
53. A quality control technician is interested in the proportion of boxes with no more than
one defective pencil. What is the value of the statistic?
ANSWER:
70/77 = 0.909
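The answers to questions 50 through 53 can be checked with a short sketch. It assumes the first column of the pencil table is the number of defective pencils per box, the second is the count of boxes in the whole warehouse, and the third is the count of boxes in the sample; that reading is consistent with the stated totals of 1875 and 77 boxes. The function names are illustrative only.

```python
# Boxes of pencils keyed by number of defective pencils per box.
# Column meanings are an assumption consistent with the stated answers.
population = {0: 1500, 1: 250, 2: 75, 3: 40, 4: 10}   # all boxes in the warehouse
sample = {0: 50, 1: 20, 2: 3, 3: 3, 4: 1}             # boxes that were sampled


def count_more_than(boxes, threshold):
    """Number of boxes with more than `threshold` defective pencils."""
    return sum(n for defects, n in boxes.items() if defects > threshold)


def proportion_at_most(boxes, threshold):
    """Proportion of boxes with at most `threshold` defective pencils."""
    total = sum(boxes.values())
    return sum(n for defects, n in boxes.items() if defects <= threshold) / total


print(sum(population.values()), sum(sample.values()))   # sizes: 1875 77
print(count_more_than(population, 2))                   # parameter: 50
print(count_more_than(sample, 2))                       # statistic: 4
print(round(proportion_at_most(population, 1), 3))      # parameter: 0.933
print(round(proportion_at_most(sample, 1), 3))          # statistic: 0.909
```

The same function applied to the population gives a parameter and applied to the sample gives a statistic, which is the distinction these questions test.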
A paper company is interested in estimating the proportion of trees in a 500-acre forest with
diameters exceeding 2 feet. The company selects 25 plots (100 feet by 100 feet) from the forest
and utilizes the information from the 25 plots to help estimate the proportion for the whole forest.
ANSWER:
Population
ANSWER:
Sample
56. At a large community college 120 students are randomly selected and asked the
distance of their commute to campus. From this group a mean of 9.8 miles is computed.
Match the items in Column II with the statistical term in Column I.
Column I Column II
1. Data (one) a. The process used to select the 120 students and
ANSWER:
(1, g), (2, d), (3, a), (4, h), (5, c), (6, e), (7, b), (8, f)
57. In a community of 10,987, 100 homeowners were randomly selected and asked the
amount of their January heating bill. From this group a mean of $76.98 is computed.
Match the items in Column II with the statistical term in Column I.
Column I Column II
5. Population e. The heating bill for one home
ANSWER:
(1, f), (2, c), (3, h), (4, g), (5, b), (6, d), (7, a), (8, e)
A quality-control inspector selects assembled parts from an assembly line and records the
information concerning each part as: A: defective or nondefective, B: the employee number of
the individual who assembled the part, and C: the weight of the part.
ANSWER:
ANSWER:
Infinite; since all assembled parts from the assembly line can’t be (or couldn’t be)
physically listed.
ANSWER:
ANSWER:
QUESTIONS 62 THROUGH 65 ARE BASED ON THE FOLLOWING INFORMATION:
Select ten students currently enrolled at your college and collect data for these three variables:
X: number of courses enrolled in, Y: total cost of textbooks and supplies for courses, and Z:
method of payment used for textbooks and supplies
ANSWER:
ANSWER:
Finite
ANSWER:
ANSWER:
66. Identify the statement “A poll of registered voters asking which candidate they support”
as an example of (1) nominal, (2) ordinal, (3) discrete, or (4) continuous variables.
ANSWER:
Nominal
67. Identify the statement “The length of time required for a wound to heal when using a new
medicine.” as an example of (1) nominal, (2) ordinal, (3) discrete, or (4) continuous
variables.
ANSWER:
Continuous
68. Identify the statement “The number of telephone calls arriving at a switchboard per ten-
minute period.” as an example of (1) nominal, (2) ordinal, (3) discrete, or (4) continuous
variables.
ANSWER:
Discrete
69. Identify the statement “The distance first-year college women can kick a football.” as an
example of (1) nominal, (2) ordinal, (3) discrete, or (4) continuous variables.
ANSWER:
Continuous
70. Identify the statement “The number of pages per job coming off a computer printer.” as
an example of (1) nominal, (2) ordinal, (3) discrete, or (4) continuous variables.
ANSWER:
Discrete
71. Identify the statement “The kind of tree used as a Christmas tree.” as an example of (1)
nominal, (2) ordinal, (3) discrete, or (4) continuous variables.
ANSWER:
Nominal
A study by health economists at Midwestern University indicated that Alzheimer’s disease cost
the nation $90 billion a year in medical expenses and lost productivity. Patients’ earning loss
was $25 billion, the value of time of unpaid caregivers was $40 billion, and the cost of paid care
was $30 billion.
ANSWER:
ANSWER:
The response variable is the cost in medical expenses and lost productivity per patient
per year.
ANSWER:
The total cost per year for all Alzheimer patients in the U.S.
75. What is the statistic?
ANSWER:
The total cost per year for the Alzheimer patients used as a sample by the health
economists at the Midwestern University, the basis for the estimation.
A health magazine presented results of a recent study that analyzed data collected by the U.S.
Census Bureau in 2000. Results reveal that for both men and women in the United States,
heart disease remains the number one killer, victimizing 500,000 people annually. Age, obesity,
and inactivity all contribute to heart disease, and all three of these factors vary considerably
from one location to the next. The highest mortality rates (deaths per 100,000 people) were in
New York, Florida, Oklahoma, and Arkansas, whereas the lowest were reported in Alaska,
Utah, Colorado, and New Mexico.
ANSWER:
ANSWER:
ANSWER:
79. Classify all the variables of the study as either attribute or numerical.
ANSWER:
Select twenty employees currently working at a local supermarket and collect data for the
following four variables:
W: Marital status
Y: total amount they spend every year on clothes and toys for their children
ANSWER:
The population consists of all employees currently working at the local supermarket.
ANSWER:
Finite
82. What is the sample?
ANSWER:
ANSWER:
ANSWER:
Nominal
ANSWER:
Continuous
ANSWER:
Discrete
ANSWER:
Continuous
ANSWER:
Discrete
A study by health economists at a southern university indicated that Parkinson’s disease cost
the nation $85 billion a year in medical expenses and lost productivity. Patients’ earning loss
was $25 billion, the value of time of unpaid caregivers was $35 billion, and the cost of paid care
was $22 billion.
ANSWER:
ANSWER:
The response variable is the cost in medical expenses and lost productivity per patient
per year.
ANSWER:
The total cost per year for all Parkinson patients in the U.S.
ANSWER:
The total cost per year for the Parkinson patients used as a sample by the health
economists at the southern university; the basis for the estimation.
At Central Michigan University, 500 students are randomly selected and asked the distance of their
commute to campus. From this group a mean of 15.6 miles is computed.
ANSWER:
ANSWER:
95. What is the parameter?
ANSWER:
The mean commute distance for all students at the college to the campus
ANSWER:
97. Describe in detail how you would select a 5% systematic sample of the adults in Detroit
in order to complete a survey about a political issue.
ANSWER:
Randomly select an integer between 1 and 20 (100/5 = 20). This integer identifies the first
item in the sample. Then select every 20th item thereafter until you have the desired
number of items for the sample.
98. Identify the following statement as descriptive in nature, or as inferential: “The average
age of 500 surveyed students in your institution is 23 years.”
ANSWER:
Descriptive
ANSWER:
Inferential
100. Identify the following statement as descriptive in nature, or as inferential: “Based on a
sample of 20,000 college students, we may conclude that 80% of all college students
dislike true-false questions.”
ANSWER:
Inferential
ANSWER:
Descriptive
102. Explain why the polls that are so frequently quoted during early returns on Election Day
TV coverage are an example of cluster sampling.
ANSWER:
Each precinct is considered a cluster, and not all precincts are sampled.
True-False Questions
ANSWER: T
104. Statistical process control uses statistical methodology to control (or reduce) variability in
a manufacturing process.
ANSWER: T
105. In statistics, a random sample means a sample that is selected haphazardly (without
pattern).
ANSWER: F
106. To say that a sample is selected in such a way that every element in the population has
an equal chance of being selected is equivalent to saying that all samples of size n have
an equal chance of being selected.
ANSWER: T
107. A list of elements belonging to the population from which the sample will be drawn is referred to
as the sampling frame.
ANSWER: T
108. When a judgment sample is drawn, the person selecting the sample chooses items in such a way
that every element in the population has an equal probability of being chosen.
ANSWER: F
109. If we desire to select a 4% systematic sample, first we will randomly select one element
from the first 40 elements, and then proceed to select every 4th item thereafter until we
have the desired number of data for our sample.
ANSWER: F
110. In general, probability samples are samples in which the elements to be selected are drawn on
the basis of probability in such a way that each element in the population has the same
probability of being selected as part of the sample.
ANSWER: F
111. A cluster sample is a sample obtained by selecting some of, but not all of, the possible
subdivisions within a population. These subdivisions, called clusters, often occur
naturally within the population.
ANSWER: T
112. When a proportional random sample is drawn, the sampling frame is subdivided into various
strata, and then a subsample is drawn from each stratum.
ANSWER: T
113. A stratified random sample is obtained by stratifying the sampling frame, and then
selecting a fixed number of items from some of, but not all of, the strata by means of a
simple random sampling technique.
ANSWER: F
114. A representative sample is a sample obtained in such a way that all individuals had an
equal chance to be selected.
ANSWER: F
Multiple-Choice Questions
115. In the 1936 presidential election, Alfred Landon was predicted (incorrectly) to beat
Franklin D. Roosevelt based on the results of a telephone survey. Because telephones
were considered a luxury item during this period, the survey was biased because it
related only to the opinion of those who could be reached by telephone. This incident
represents which of the following?
116. Choose the item that best completes the following statement: No matter what the
variable is, if the tool of measurement is precise enough, there will be __________.
A) uncertainty
B) variability
C) probability
D) measurability
ANSWER: B
Short-Answer Questions
117. In statistics, what name do we give to a list of elements belonging to a population from
which a sample will be drawn?
ANSWER:
Sampling frame
ANSWER:
Census
ANSWER:
Sampling frame is a subset of the population (or census) from which the sample is
selected.
120. Discuss what the lack of variability in a quantitative response variable would tend to
indicate. Include an illustration.
ANSWER:
121. Discuss the difference between the following two methods of data collection: experiment
and survey. Include an illustration of each.
ANSWER:
Experiment: the investigator controls or modifies the environment and observes the effect on
the variable under study.
Survey: the investigator collects data by sampling a population but does not modify the
environment.
An illustration of a survey: A researcher stops people in a mall and asks them about the
medicines they take and their effectiveness.
122. Describe in detail how you would select a 4% systematic sample of the adults in a
nearby large city in order to complete a survey about a political issue.
ANSWER:
Randomly select an integer between 1 and 25 (100/x = 100/4 = 25). This integer
represents the first item in the sample. Then, select every 25th data item thereafter until
you have the desired number of data for the sample.
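The procedure in this answer can be sketched in a few lines. This is only an illustration under stated assumptions: the sampling frame is a hypothetical numbered list of 1,000 adults, and the function name systematic_sample is invented for the example.

```python
import random


def systematic_sample(frame, percent, seed=None):
    """Return a systematic sample of roughly `percent`% of `frame`."""
    k = round(100 / percent)          # sampling interval: 100/4 = 25
    rng = random.Random(seed)
    start = rng.randint(1, k)         # random starting position, 1..k inclusive
    # Positions are 1-based, so subtract 1 to index the list.
    return frame[start - 1::k]


adults = list(range(1, 1001))         # hypothetical sampling frame of 1,000 adults
chosen = systematic_sample(adults, 4, seed=1)
print(len(chosen), chosen[:3])
```

Only the starting position is random; every later selection is determined by the interval, which is what distinguishes a systematic sample from a simple random sample.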
Sections 1.4 and 1.5
True-False Questions
123. If it were not for the laws of probability, the theory of statistics would not be possible.
ANSWER: T
Multiple-Choice Questions
124. Suppose you are interested in determining the preferred candidate for governor of
Michigan among registered voters in Mecosta County. Which of the following best
describes this problem?
125. Suppose you are interested in determining the likelihood of winning a state lottery by
purchasing one ticket. Which of the following best describes this problem?
126. Suppose you are interested in determining the mean age of all students attending
community colleges in the state of Texas. Which of the following best describes this
problem?
A) This is a problem in probability.
B) This is a problem in statistics.
C) Neither A nor B
D) Both A and B
ANSWER: B
Short-Answer Questions
127. Discuss the validity of the following statement: “Computers can analyze any and all sets
of data and give statistically correct results.”
ANSWER:
Standard statistical packages are good at performing tedious operations; however, the
user must ensure that appropriate methods are correctly applied and that accurate
conclusions are drawn.
128. Explain the difference between probability and statistics. Include an illustration.
ANSWER:
In probability, we know the population and are interested in the likelihood of a particular
sample (e.g., when rolling a die, we know the likelihood that the number will be even). In statistics,
we draw a sample and then make inferences about the population (e.g., roll a die 100 times
and keep a record of the outcomes).
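The die illustration can be made concrete with a small sketch. It assumes a fair six-sided die and uses a fixed random seed so the run is repeatable; both are choices made for this example, not part of the original answer.

```python
import random
from fractions import Fraction

# Probability: the population (a fair six-sided die) is known.
faces = [1, 2, 3, 4, 5, 6]
p_even = Fraction(sum(1 for f in faces if f % 2 == 0), len(faces))
print(p_even)                         # 1/2

# Statistics: only a sample of 100 rolls is observed.
rng = random.Random(0)                # fixed seed so the run is repeatable
rolls = [rng.choice(faces) for _ in range(100)]
sample_prop_even = sum(1 for r in rolls if r % 2 == 0) / len(rolls)
print(sample_prop_even)               # an estimate of p_even; it varies with the seed
```

The first half reasons from a known population to a sample outcome; the second half reasons from observed rolls back toward the unknown proportion.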
Applied and Computational Questions
ANSWER:
Statistics
130. Classify the following statement as a probability or a statistics problem: “Determining the
chance that heads will result when a coin is flipped.”
ANSWER:
Probability
131. Classify the following statement as a probability or a statistics problem: “Determining the
amount of waiting time required to check out at a certain grocery store”.
ANSWER:
Statistics
132. Classify the following statement as a probability or a statistics problem: “Determining the
chance that you will be dealt a “blackjack.””
ANSWER:
Probability
ANSWER:
Statistics
134. Classify the following statement as a probability or a statistics problem: “Determining the
length of life for the 100-meg zip disk produced by Fuji.”
ANSWER:
Statistics
135. Classify the following statement as a probability or a statistics problem: “Determining the
chance that a blue ball will be drawn from a bowl that contains 15 balls, of which 5 are
blue.”
ANSWER:
Probability
136. Classify the following statement as a probability or a statistics problem: “Determining the
average price of the new computers that your company just purchased.”
ANSWER:
Statistics
ANSWER:
Probability
138. Classify the following statement as a probability or a statistics problem: “Determining
whether a new drug shortens the recovery time from a certain illness.”
ANSWER:
Statistics
139. Classify the following statement as a probability or a statistics problem: “Determining the
chance that tails will result when a coin is tossed twice.”
ANSWER:
Probability
140. Classify the following statement as a probability or a statistics problem: “Determining the
amount of waiting time required to check out at a grocery store.”
ANSWER:
Statistics
141. Classify the following statement as a probability or a statistics problem: “Determining the
chance that you will receive an “A” grade in your statistics class.”
ANSWER:
Probability
ANSWER:
Probability
143. Classify the following statement as a probability or a statistics problem: “Determining the
length of life for the 100-watt light bulbs a company produces”.
ANSWER:
Statistics
144. Classify the following statement as a probability or a statistics problem: “Determining the
shearing strength of the rivets that your company just purchased for building airplanes.”
ANSWER:
Statistics
Chapter 2
Section 2.1
True-False Questions
1. Circle graphs and bar graphs are graphs that are used to summarize qualitative, or attribute, or
categorical data.
ANSWER: T
2. All graphic representations of sets of data need to be completely self-explanatory. That includes a
descriptive meaningful title, and identification of the vertical and horizontal scales.
ANSWER: T
ANSWER: T
4. There is no single correct answer when constructing a graphic display. The analyst’s
judgment and the circumstances surrounding the problem play a major role in the
development of the graphic.
ANSWER: T
5. Circle graphs and bar graphs are graphs that are used to summarize quantitative data.
ANSWER: F
6. Circle graphs (pie diagrams) show the amount of data that belong to each category as a
proportional part of a circle.
ANSWER: T
7. Circle graphs show the amount of data that belong to each category as a frequency.
ANSWER: F
8. Bar graphs show the amount of data that belong to each category as a proportionally
sized rectangular area.
ANSWER: T
9. Bar graphs of attribute data should be drawn with connected bars of equal width.
ANSWER: F
10. One major reason for constructing a graph of quantitative data is to display its
distribution.
ANSWER: T
Multiple-Choice Questions
A) Pareto diagram is a bar graph with the bars arranged from the most numerous
categories to the least numerous categories.
B) Pareto diagram includes a line graph displaying the cumulative percentages and
counts for the bars.
C) A Pareto diagram of types of defects will show the ones that have the greatest effect
on the defective rate in order of effect. It is then easy to see which defects should be
targeted in order to most effectively lower the defective rate.
D) None of the above.
ANSWER: D
A) A dotplot displays the data of a sample by representing each data value with a dot positioned
along a scale. This scale can be either horizontal or vertical. The frequency of the
values is represented along the other scale.
B) Pareto diagram includes a line graph displaying the frequency (counts) for the bars.
C) Dotplot display is a convenient technique to use as you first begin to analyze the
data. It results in a picture of the data as well as sorts the data into numerical order.
D) The stem-and-leaf display is a combination of a graphic technique and a sorting
technique. This display is simple to create and use, and it is well suited to computer
applications.
ANSWER: B
Short-Answer Questions
13. Complete the following statement: A stem-and-leaf display is a combination of a sorting technique
and a __________ technique.
ANSWER:
graphing
14. Complete the following statement: Circle graphs and bar graphs are often used to summarize
____________ data.
ANSWER:
attribute
15. Data for the distribution of land in a particular county is given in percentages. Name two types of
graphs that would be most appropriate to display these results.
ANSWER:
ANSWER:
21 689
22 25567799
23 01112334445
24 00136
17. The number of vehicles passing a tollgate between 7 a.m. and 8 a.m. was recorded on twenty
different days. Construct a stem-and-leaf display for these data.
10 26 32 15 16 22 31 46 27 33 27 15 16 19 20 16 12 22
30 41
ANSWER:
1 02556669
2 022677
3 0123
4 16
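The display above can be reproduced with a short sketch that uses the tens digit as the stem and the units digit as the leaf. The function name stem_and_leaf is illustrative only.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each tens digit (stem) to its sorted units digits (leaves)."""
    display = defaultdict(list)
    for v in sorted(values):
        display[v // 10].append(v % 10)
    return dict(display)


vehicles = [10, 26, 32, 15, 16, 22, 31, 46, 27, 33,
            27, 15, 16, 19, 20, 16, 12, 22, 30, 41]

for stem, leaves in sorted(stem_and_leaf(vehicles).items()):
    print(stem, "".join(str(leaf) for leaf in leaves))
```

Sorting the values before grouping is what puts the leaves in numerical order within each stem.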
18. A group of hypertensive patients (with diastolic blood pressure between 110 and 130) were given
a medication for reducing elevated blood pressure. The decreases in blood pressure produced by
the medication were categorized into four categories as follows:
Thirty patients who used the medication experienced the following blood pressure
reductions. Give the height of each of the four bars of a bar graph for these results.
12 15 6 4 20 17 25 4 5 18
10 12 18 13 14 20 30 12 14 17
30 18 10 8 16 32 27 13 8 4
ANSWER:
A 14
B 9
C 4
D 3
19. A random sample of test scores was taken from two different sections of an introductory statistics
course. Construct a back-to-back stem-and-leaf display for this set of data.
Section A: 46 97 99 64 78 76 45 73 81 51 68 81 81 79 100
Section B: 80 69 92 75 88 47 98 92 90 81 42 50 59 66 67 66
ANSWER:
Sec. A   Stem   Sec. B
    56      4   27
     1      5   09
    48      6   6679
  3689      7   5
   111      8   018
    79      9   0228
     0     10
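A back-to-back display shares one column of stems between two data sets. The sketch below rebuilds the rows of this answer from the two score lists; the helper names are illustrative, and the left-hand leaves are kept in ascending order to match the answer key (some texts instead write them outward from the stem).

```python
def leaves_by_stem(values):
    """Group sorted units digits by stem (all digits except the last)."""
    groups = {}
    for v in sorted(values):
        groups.setdefault(v // 10, []).append(v % 10)
    return groups


def back_to_back(left, right):
    """Return display rows as (left leaves, stem, right leaves)."""
    lhs, rhs = leaves_by_stem(left), leaves_by_stem(right)
    rows = []
    for stem in range(min(min(lhs), min(rhs)), max(max(lhs), max(rhs)) + 1):
        # Leaves on both sides are kept in ascending order, as in the answer key.
        l = "".join(map(str, lhs.get(stem, [])))
        r = "".join(map(str, rhs.get(stem, [])))
        rows.append((l, stem, r))
    return rows


section_a = [46, 97, 99, 64, 78, 76, 45, 73, 81, 51, 68, 81, 81, 79, 100]
section_b = [80, 69, 92, 75, 88, 47, 98, 92, 90, 81, 42, 50, 59, 66, 67, 66]

for l, stem, r in back_to_back(section_a, section_b):
    print(f"{l:>5} {stem:>3}  {r}")
```

Looping over every stem in the combined range keeps empty stems visible, so gaps in either section are not hidden.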
20. The total amount spent for textbooks (to the nearest dollar) was recorded for several students.
Some of the information was collected for the summer session (denoted by S), and some was
collected for the fall semester (denoted by F). Construct a back-to-back stem-and-leaf display for
this set of data.
Semester: S F F S F F F F S F S
Semester: S F F S F F F S F F S
Amount: 35 75 80 50 122 95 79 20 95 65 42
Semester: F F F F F S F F
ANSWER:
Summer   Stem   Fall
   059     02
    57     03
   026     04
     0     05
           06   059
           07   559
           08   000
           09   025558
           10   58
           11   25
           12   02
Major               Number of Majors
Mathematics         50
Computer Science    22
Actuarial Science   15
Statistics          10
If a circle graph is constructed for these data, what would be the percentage of the graph for each
major?
ANSWER:
Major               % of Majors
Mathematics         51.5
Computer Science    22.7
Actuarial Science   15.5
Statistics          10.3
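The percentages follow from dividing each count by the total of 97 majors. A minimal sketch, using the four counts listed for the majors; the central-angle column is an added illustration of how a circle graph would use the same shares.

```python
majors = {
    "Mathematics": 50,
    "Computer Science": 22,
    "Actuarial Science": 15,
    "Statistics": 10,
}

total = sum(majors.values())          # 97 majors in all
percents = {name: round(100 * n / total, 1) for name, n in majors.items()}

for name, pct in percents.items():
    # Each slice's central angle is the same share of 360 degrees.
    print(f"{name:<18} {pct:>5}%  {360 * majors[name] / total:6.1f} degrees")
```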
The final-inspection defect report for an assembly line is reported on the table and Pareto
diagram as shown below:
Defect type   Blem   Scratch   Chip   Bend   Dent   Others
Count           61        50     28     17     13       11
[Pareto diagram: Count (left axis, 0 to 180) and cumulative Percent (right axis, 0 to 1) by Defect type]
22. What is the total defect count in the report?
ANSWER:
180 defects
ANSWER:
24. Find the “cum % for bend”, and explain what that value means.
ANSWER:
[(61+50+28+17) /180] ⋅ 100% = (156/180) ⋅ 100% = 86.67%. The value 86.67% is the sum
of the percentages for all defects that occurred more often than Bend, including Bend.
25. Management has given the production line the goal of reducing their defects by 50%.
What two defects would you suggest they give special attention to in working toward this
goal? Explain.
ANSWER:
The two defects, Blemish and Scratch, total 61.67%. If they can control these two
defects, the goal should be within reach.
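The cumulative percentages used in questions 24 and 25 can be recomputed from the defect counts. This is a sketch only; the variable names are illustrative.

```python
from itertools import accumulate

defects = {"Blem": 61, "Scratch": 50, "Chip": 28,
           "Bend": 17, "Dent": 13, "Others": 11}

# A Pareto diagram orders categories from most to least frequent.
# The counts here are already in descending order, so sorting leaves them unchanged.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)
total = sum(defects.values())                      # 180 defects

running = accumulate(count for _, count in ordered)
cum_percent = {name: round(100 * c / total, 2)
               for (name, _), c in zip(ordered, running)}

for name, pct in cum_percent.items():
    print(f"{name:<8} {pct:6.2f}")
```

The entry for Scratch gives the combined share of the two most frequent defects, and the entry for Bend gives the value asked for in question 24.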
QUESTIONS 26 THROUGH 29 ARE BASED ON THE FOLLOWING INFORMATION:
The points scored by the winning teams on opening night of a recent NBA season are shown in
the table below:
26. Draw a bar graph of these scores using a vertical scale ranging from 80 to 120.
ANSWER:
[Bar graph: Score (vertical scale from 80 to 120) for each Team: Detroit, Dallas, Chicago]
27. Draw a bar graph of the scores using a vertical scale ranging from 50 to 120.
ANSWER:
[Bar graph: Score (vertical scale from 50 to 120) for each Team: Detroit, Dallas, Chicago]
28. In which bar graph does it appear that the NBA scores vary more? Why?
ANSWER:
The bar graph in question 26 emphasizes the variation in the scores because its vertical scale
starts at 80, so it shows only the variation and not the relative size of the scores.
29. How could you create an accurate representation of the relative size and variation
between these scores? Draw this new bar graph.
ANSWER:
An accurate representation of both the size and variation of the values would be best served by
starting the vertical scale at zero.
[Bar graph: Score (vertical scale from 0 to 120) for each Team: Detroit, Dallas, Chicago]
QUESTIONS 30 THROUGH 33 ARE BASED ON THE FOLLOWING INFORMATION:
What not to get them on Valentine’s Day! A recent study among adults in the USA shows that adults
prefer not to receive certain items as gifts on Valentine’s Day, as shown below:
Teddy bears: 45%; Chocolate: 25%; Jewelry: 15%; Flowers: 12%; Don’t Know: 3%.
30. Draw a bar graph picturing the percentages of “Presents not wanted”.
ANSWER:
[Bar graph: Percent (vertical scale from 0 to 50) for each of the Presents not wanted: Teddy bears, Chocolate, Jewelry, Flowers, Don't know]
31. Draw a Pareto diagram picturing the “Presents not wanted”.
ANSWER:
[Pareto diagram: Count (left axis, 0 to 100) and cumulative Percent (right axis, 0 to 100) by unwanted present]
Unwanted Presents   Teddy Bears   Chocolate   Jewelry   Flowers   Other
Count                        45          25        15        12       3
Percent                    45.0        25.0      15.0      12.0     3.0
Cum %                      45.0        70.0      85.0      97.0   100.0
32. If you want to be 80% sure you did not get your valentine something unwanted, what
should you avoid buying? How does the Pareto diagram show this?
ANSWER:
Teddy bears, chocolates, jewelry; these are listed first in the Pareto diagram.
33. If 400 adults are to be surveyed, what frequencies would you expect to occur for each
unwanted item listed on the snapshot?
ANSWER:
The frequencies are 180, 100, 60, 48, and 12 for teddy bears, chocolates, jewelry,
flowers, and don’t know, respectively.
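Both the expected frequencies here and the 80% cutoff in question 32 come straight from the reported percentages. A minimal sketch; the variable names are illustrative.

```python
percent_unwanted = {"Teddy bears": 45, "Chocolate": 25, "Jewelry": 15,
                    "Flowers": 12, "Don't know": 3}
n_surveyed = 400

# Expected frequency = sample size x relative frequency.
# Integer division is exact here because every product is a multiple of 100.
expected = {gift: n_surveyed * pct // 100 for gift, pct in percent_unwanted.items()}
print(expected)

# Question 32: walk down the Pareto order until at least 80% is covered.
cumulative = 0
avoid = []
for gift, pct in percent_unwanted.items():   # already in descending order
    if cumulative >= 80:
        break
    avoid.append(gift)
    cumulative += pct
print(avoid, cumulative)                     # three gifts covering 85%
```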
The points scored during each game by the Big Rapids High School basketball team last
season were: 60, 58, 65, 75, 50, 65, 60, 72, 64, 70, 58, 65, 56, 40, 68, and 55.
ANSWER:
[Dotplot: Score, horizontal axis from 40 to 75 in steps of 5]
35. Use the dotplot in question 34 to uncover the lowest and highest scores.
ANSWER:
The lowest score was 40 and the highest was 75.
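A text version of the dotplot can be built by counting how often each score occurs. The sketch below uses the sixteen game scores listed for the Big Rapids team and prints one dot per game.

```python
from collections import Counter

scores = [60, 58, 65, 75, 50, 65, 60, 72, 64, 70, 58, 65, 56, 40, 68, 55]
counts = Counter(scores)

# Text dotplot: one row per observed value, one dot per game.
for value in sorted(counts):
    print(f"{value:>3} {'.' * counts[value]}")

print(min(scores), max(scores))       # 40 75
```

The ends of the scale give the lowest and highest scores, and the row with the most dots marks the most common score.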
36. Use the dotplot in question 34 to determine the most common score. How many games
share that score?
ANSWER:
The data shown below are the heights (in inches) of the basketball players who were the first
round picks by the professional NBA teams in a recent year.
83 83 75 80 76 80 81 84 79 80
84 86 72 82 82 79 81 79 80 73
90 82 81 75 77 80 79 76 85
37. Construct a dotplot of the heights of these players.
ANSWER:
[Dotplot: Heights of NBA Players, horizontal axis from 72 to 90 in steps of 3]
38. Use the dotplot in question 37 to uncover the shortest and the tallest players.
ANSWER:
39. Use the dotplot in question 37 to determine the most common height and how many
players share that height?
ANSWER:
40. What feature of the dotplot in question 37 illustrates the most common height?
ANSWER:
True-False Questions
ANSWER: F
42. One major reason for constructing a graph of quantitative data is to display its
distribution.
ANSWER: T
43. In a J-shaped histogram, there is one tail on the side of the class with the highest
frequency.
ANSWER: F
ANSWER: T
45. The frequency of a class is the number of pieces of data whose values fall within the
boundaries of that class.
ANSWER: T
46. Frequency distributions are used in statistics to present large quantities of repeating
values in a concise form.
ANSWER: T
47. If grouping data are used to form a frequency distribution, the class width is the
difference between the upper and lower class boundaries.
ANSWER: T
48. If grouping data are used to form a frequency distribution, the class midpoint (sometimes
called the class mark) is the numerical value that is exactly in the middle of each class. It
is found by adding the class boundaries and dividing by 2.
ANSWER: T
49. A histogram is a bar graph that represents a frequency distribution of categorical data.
ANSWER: F
50. A bimodal distribution has two high-frequency classes separated by classes with lower
frequencies. It is not necessary for the two high frequencies to be the same.
ANSWER: T
51. Relative frequency can be expressed as a common fraction, in decimal form, but not as
a percentage.
ANSWER: F
52. The histogram of a sample should have a distribution shape very similar to that of the
population from which the sample was drawn.
ANSWER: T
ANSWER: T
54. Every ogive starts on the left with a cumulative relative frequency of zero at the lower
class boundary of the first class and ends on the right with a cumulative relative
frequency of 100% at the upper class boundary of the last class.
ANSWER: T
55. Measures of central tendency measure the spread of a set of data about its center.
ANSWER: F
56. For every set of data, the value of the median will always be one of the original items of
data.
ANSWER: F
ANSWER: F
58. The midrange for a set of data is found by subtracting the lowest valued data L from the highest
valued data H.
ANSWER: F
59. The mean, median and mode are the most common measures of dispersion (spread).
ANSWER: F
60. Measures of central tendency are numerical values that locate, in some sense, the center of a set
of data.
ANSWER: T
61. The mean, median and mode for the set of data {3, 5, 3, 8, 6} are all the same value.
ANSWER: F
62. The mean of a sample always divides the data into two equal halves (half larger and half
smaller in value than itself).
ANSWER: F
63. A measure of central tendency is a quantitative value that describes how widely the data
are dispersed about a central value.
ANSWER: F
64. For any distribution, the sum of the deviations from the mean equals zero.
ANSWER: T
65. Measures of central tendency are attribute data that locate, in some sense, the center of
a set of data.
ANSWER: F
66. The term average is often associated with all measures of central tendency.
ANSWER: T
67. The population mean, µ (lowercase mu in the Greek alphabet), is the mean of all x
values in the entire population.
ANSWER: T
68. The median is the value of the data that occupies the middle position when the data are
ranked in order according to size.
ANSWER: T
ANSWER: F
70. The midrange is the number exactly midway between a lowest value data L and a
highest value data H. It is found by averaging the low and high values.
ANSWER: T
ANSWER: F
72. The population median is represented by M (the uppercase mu in the Greek alphabet).
ANSWER: T
73. When n is odd, the depth of the median, d(x̃), will always be an integer.
ANSWER: T
74. When n is even, the depth of the median, d(x̃), will always be an integer or a half-number.
ANSWER: F
75. According to your book, if two or more values in a sample are tied for the highest
frequency (number of occurrences), we say there is no mode.
ANSWER: T
ANSWER: F
77. There are several kinds of measures ordinarily known as averages and each gives a
different picture of the figures it is called on to represent.
ANSWER: T
78. The standard deviation is the positive square root of the variance.
ANSWER: T
79. The sum of the squares of the deviations from the mean, ∑(x − x̄)², will sometimes be
negative.
ANSWER: F
ANSWER: F
81. The sample variance, s², is the mean of the squared deviations of x values from the
sample mean x̄, calculated using n – 1 as the divisor.
ANSWER: T
82. The measures of dispersion include the range, variance, and standard deviation.
ANSWER: T
83. The unit of measure for the variance is the same as the unit of measure for the data.
ANSWER: F
84. There is no limit to how widely spread out the data can be; therefore, measures of
dispersion can be very large.
ANSWER: T
85. Although the mean deviation is always zero, it is a useful statistic in some occasions.
ANSWER: F
86. The range is the difference in value between the highest-valued (H) and the lowest-
valued (L) data.
ANSWER: T
87. The sample variance, s², is the mean of the deviations of x values from the sample mean
x̄.
ANSWER: F
88. The standard deviation of a sample is the square of the sample variance.
ANSWER: F
89. If a rounded value of x̄ is used, then ∑(x − x̄) will not always be exactly zero. It will,
however, be reasonably close to zero.
ANSWER: T
90. In a box-and-whisker display, the length of the “box” is the same as the interquartile
range.
ANSWER: T
91. Each set of data has four quartiles; they divide the ranked data into four equal quarters.
ANSWER: F
92. The numerical value midway between the first quartile and the third quartile is referred to as the
midquartile.
ANSWER: T
93. Each set of data has 100 percentiles; they divide the ranked data into 100 equal
subsets.
ANSWER: F
94. The median, the midrange, and the midquartile are always the same value, since each is
a middle value.
ANSWER: F
95. The interquartile range is the difference between the first and third quartiles; it is the range of the
middle 50% of the data.
ANSWER: T
96. The standard score (or z-score) identifies the position a particular value of x has relative
to the mean, measured in standard deviations; that is, z = (x − x̄)/s.
ANSWER: T
97. On a test Jim scored at the 50th percentile and Jean scored at the 25th percentile;
therefore, Jim’s test score was twice Jean’s test score.
ANSWER: F
98. The unit of measure for the standard score is always in standard deviations.
ANSWER: T
99. Data must be ranked before calculating many of the measures of position.
ANSWER: T
ANSWER: F
101. Measures of position are used to describe the position a specific data value possesses
in relation to the mean of the data.
ANSWER: F
102. Measures of position are used to describe the position a specific data value possesses
in relation to the rest of the data.
ANSWER: T
103. Quartiles and percentiles are two of the most popular measures of dispersion.
ANSWER: F
104. The median, the second quartile, and the 50th percentile are all the same.
ANSWER: T
105. The first quartile, Q1 , is a number such that at most 25 of the data values are smaller in
value than Q1 and at most 75 of the data values are larger.
ANSWER: F
106. The median, the midrange, and the midquartile are not necessarily the same value.
Each is the middle value, but by different definitions of “middle.”
ANSWER: T
107. Percentiles are values of the variable that divide a set of ranked data into 100 equal
subsets.
ANSWER: T
ANSWER: F
109. The 30th percentile, P30 , is a value such that at most 30% of the data are smaller in value
than P30 and at most 70% of the data are larger.
ANSWER: T
110. The first quartile and the 25th percentile are the same.
ANSWER: T
111. The mean, median, the second quartile, and the 50th percentile are all the same.
ANSWER: F
ANSWER: T
113. The 5-number summary divides a set of data into four subsets, with one-quarter of the
data in each subset.
ANSWER: T
114. The median, the midrange, and the midquartile are the same, since each is the middle
value.
ANSWER: F
115. The midquartile is the numerical value midway between the first quartile and the third
quartile.
ANSWER: T
116. The interquartile range is the average of the first and third quartiles.
ANSWER: F
117. The interquartile range is the range of the middle 50% of the data.
ANSWER: T
118. The interquartile range is very unique in the sense that it is a measure of central
tendency as well as a measure of dispersion.
ANSWER: F
119. Since the z-score is a measure of relative position with respect to the mean, it can be
used to help us compare two raw scores that come from separate populations.
ANSWER: T
120. The midquartile, defined as the average of the first and third quartiles, is a measure of
position, simply because quartiles are one of the most popular measures of position.
ANSWER: F
Multiple-Choice Questions
121. At a large company, the majority of the employees earn from $20,000 to $30,000 per year. Middle
management employees earn between $30,000 and $50,000 per year while top management
earn between $50,000 and $100,000 per year. A histogram of all salaries would have which of
the following shapes?
A) Symmetrical
B) Uniform
C) Skewed to right
D) Skewed to left
ANSWER: C
A) Relative frequencies are often useful in a presentation because nearly everybody
understands fractional parts when expressed as percents.
B) Relative frequencies are particularly useful when comparing the frequency
distributions of two different size sets of data.
C) The histogram of a sample should have a distribution shape that is bimodal.
D) A stem-and-leaf display contains all the information needed to create a histogram.
ANSWER: C
126. The following set of data represents letter grades on term papers in a rhetoric class:
A, A, A, B, B, B, B, C, C, C, C, C, C, C, C, C, C, C, D, D, D, F.
Select the most appropriate measure of central tendency for the data described.
A) Mean
B) Median
C) Mode
D) Midrange
ANSWER: C
127. The following set of data represents the ages of students in a small seminar: 20, 21, 22, 25, 26,
27, and 68. Select the most appropriate measure of central tendency for the data described.
A) Mean
B) Median
C) Mode
D) Midrange
ANSWER: B
128. The following set of data represents the temperature high for seven consecutive days in February
in Chicago: 22, 14, 26, 27, 35, 38, and 41. Select the most appropriate measure of central
tendency for the data described.
A) Mean
B) Median
C) Mode
D) Midrange
ANSWER: A
129. Which of the following is not affected by extreme values?
A) Median
B) Tenth percentile
C) Third quartile
D) All of the above
ANSWER: D
A) mean
B) second quartile
C) first quartile
D) midquartile
ANSWER: A
A) Mean
B) Median
C) Midrange
D) None of the above
ANSWER: D
133. The following data set represents shirt sizes for a girls’ field hockey team:
S, S, S, M, M, M, M, M, M, M, M, M, M, L, L, L, L, L, XL, XL
Select the most appropriate measure of central tendency for the data described.
A) Mean
B) Median
C) Mode
D) Midrange
ANSWER: C
134. Adding 5 to each value in a data set would not change which of the following measures?
A) Mode
B) Mean
C) Mid-range
D) Standard deviation
ANSWER: D
A) The interquartile range is found by taking the difference between the first and third
quartiles and dividing that value by 2.
B) The standard deviation is expressed in terms of the original units of measurement
but the variance is not.
C) The values of the standard deviation may be either positive or negative, while the
value of the variance will always be positive.
D) A large measure of dispersion is the result of an error of calculation because there is
a limit to how widely spread out data can be.
ANSWER: B
137. The difference between the largest and smallest values in an ordered array is called the:
A) standard deviation
B) variance
C) interquartile range
D) range
ANSWER: D
A) Range
B) Variance
C) Standard deviation
D) None of them
ANSWER: A
A) The measures of dispersion include the range, variance, and standard deviation.
B) The numerical values of measures of dispersion describe the amount of spread, or
variability that is found among the data values.
C) Closely grouped data have relatively small measures of dispersion values, and more
widely spread-out data have larger values.
D) None of the above
ANSWER: D
141. Which of the following types of graphs would not be good for qualitative data?
A) Box-and-whiskers display
B) Circle graph
C) Bar graph
D) Pareto diagram
ANSWER: A
142. For a normal distribution, a value that is two standard deviations below the mean would be closer
to which of the following?
A) Third percentile
B) First quartile
C) Fortieth percentile
D) Median
ANSWER: A
A) Measures of position are used to describe the position a specific data value
possesses in relation to the rest of the data.
B) Quartiles and percentiles are two of the most popular measures of position.
C) Quartiles are values of the variable that divide the ranked data into 4 equal subsets
called quarters.
D) All of the above.
ANSWER: D
A) 9.5
B) 10.0
C) 10.5
D) 11.0
ANSWER: C
A) The first quartile, Q1 , is a number such that at most 25% of the data are smaller in
value than Q1 and at most 75% are larger.
B) The second quartile is the mean.
C) The third quartile, Q3 , is a number such that at most 75% of the data are smaller in
value than Q3 , and at most 25% are larger.
D) None of the above
ANSWER: B
146. Which is the depth of the 65th percentile for a ranked set of 50 student ages?
A) 32.5
B) 33.0
C) 33.5
D) 34.0
ANSWER: B
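The depth calculation behind this answer can be sketched as follows. The rule coded here is an assumption about the textbook's procedure: compute nk/100; if the result is a whole number, add 0.5; otherwise move up to the next whole number. The function name is hypothetical.

```python
def percentile_depth(n, k):
    """Depth (position from the smallest value) of the k-th percentile
    in a ranked set of n data, under the assumed nk/100 rule."""
    whole, remainder = divmod(n * k, 100)
    if remainder == 0:
        # nk/100 is a whole number: the percentile lies halfway
        # between this position and the next one.
        return whole + 0.5
    # nk/100 has a fraction: use the next larger whole position.
    return float(whole + 1)


print(percentile_depth(50, 65))
```

Under the same assumed rule, the 5th percentile of 35 ranked values has nk/100 = 1.75, which moves up to a depth of 2.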
147. If the 70th percentile for a set of exam scores is 82, what does this mean?
148. The 5-number summary divides a set of data into how many subsets?
A) 6
B) 5
C) 4
D) 3
ANSWER: C
A) The 5-number summary is more informative when it is displayed on a diagram drawn
to scale. A computer-generated graphic display that accomplishes this is known as
the box-and-whiskers display.
B) The position of a specific value in a set of data can be measured in terms of the
mean and variance using the standard score, commonly called the z-score.
C) The z-scores typically range in value from approximately -3.00 to +3.00.
D) None of the above
ANSWER: B
150. Which is the depth of the 5th percentile for a ranked set of 35 student weights?
A) 1.50
B) 2.00
C) 2.50
D) 3.00
ANSWER: B
Short-Answer Questions
151. Explain the difference between a J-shaped histogram and a skewed histogram.
ANSWER:
A J-shaped histogram has no tail on the side of the class with the highest frequency; that
class is an end class. A skewed histogram has tails on both sides of the class with the
highest frequency, with one tail being considerably longer.
152. If a histogram is constructed for the following frequency distribution, what shape would it have?
20 ≤ x < 30 5
30 ≤ x < 40 15
40 ≤ x < 50 20
50 ≤ x < 60 18
60 ≤ x < 70 13
70 ≤ x < 80 10
80 ≤ x < 90 5
90 ≤ x ≤ 100 1
ANSWER:
153. What is the largest possible value needed on the vertical axis of a relative frequency
histogram?
ANSWER:
One
154. A relative frequency distribution was constructed for a sample of size n = 120. The
relative frequency for the third class was 0.15. How many items of data fell into the third
class?
ANSWER:
18
155. A relative frequency distribution was constructed for a sample of size n = 150. The
relative frequency for the second class was 0.067. How many items of data fell into this
class?
ANSWER:
10
156. In an ogive, what does the vertical scale identify?
ANSWER:
The vertical scale identifies either the cumulative frequencies or the cumulative relative
frequencies.
ANSWER:
The horizontal scale identifies the upper class boundaries. Until the upper boundary of a
class has been reached, you cannot be sure you have accumulated all the data in that
class. Therefore, the horizontal scale for an ogive is always based on the upper class
boundaries.
158. Explain what is wrong with the statement, “The mean is always the best measure of central
tendency.”
ANSWER:
It depends on the type of data, and what would be an appropriate measure of central
tendency.
159. A company found that the mean number of sales for the 20 salesmen during the past month was
8.5. What was the total number of sales for the salesmen?
ANSWER:
170
160. For a particular sample x̄ = 14.7 and s = 3.5. A new sample is formed by subtracting 2
from each value in the original sample. Find x̄ for this new sample.
ANSWER:
x̄ = 12.7
161. Explain why it is possible to find the mean for the data of a quantitative variable, but not
for a qualitative variable.
ANSWER:
162. Find the median height of cheerleaders of a college basketball team: 66, 69, 65, 63 and
67 inches.
ANSWER:
x̃ = 66 inches
163. Explain why the standard deviation is not always less than the variance and give an
example.
ANSWER:
164. Which of the three measures of variability, range, standard deviation, and variance, does
not preserve the same unit of measurement as the observations themselves?
ANSWER:
Variance
165. If a sample has a standard deviation of 4.5, what is its variance?
ANSWER:
20.25
166. For a particular sample x̄ = 14.7 and s = 3.5. A new sample is formed by subtracting 2
from each value in the original sample. Find s for this new sample.
ANSWER:
s = 3.5
167. For a particular sample of size n = 10, the sample variance is 4.8 and x̄ = 0.5. For this
sample, find ∑x².
ANSWER:
45.7
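One way to reach this value is to rearrange the shortcut formula for the sample variance, s² = [∑x² − (∑x)²/n] / (n − 1), and solve for ∑x². The sketch below assumes that formula; the variable names are illustrative.

```python
n = 10
sample_variance = 4.8
sample_mean = 0.5

# The sum of the data follows from the mean: sum(x) = n * mean.
sum_x = n * sample_mean

# Rearranged shortcut formula: sum(x^2) = (n - 1) * s^2 + (sum(x))^2 / n.
sum_x_squared = (n - 1) * sample_variance + sum_x ** 2 / n

print(round(sum_x_squared, 1))
```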
ANSWER:
∑(x − x̄) is always zero because the deviations of x values smaller than the mean
(which are negative) cancel out the deviations of x values larger than the mean (which
are positive).
169. Explain the meaning of the following statement “The data value x = 30 has a deviation
value of 6.”
ANSWER:
170. Explain the meaning of the following statement “The data value x = 80 has a deviation
value of -15.”
ANSWER:
171. A particular standardized test has a mean score of 455 with a standard deviation of 112.
A student scored 575 on this test. Determine the student's z-score.
ANSWER:
1.07
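The standard-score calculation, and its inverse for recovering a raw score, can be sketched as below. The function names are hypothetical; the formula z = (x − mean) / standard deviation is the one used throughout this section.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def raw_score(z, mean, sd):
    """Raw value that corresponds to a given standard score."""
    return mean + z * sd


# Question 171: mean 455, standard deviation 112, score 575.
print(round(z_score(575, 455, 112), 2))

# The inverse: which score has z = -0.40 when the mean is 4.74 and s = 3.10?
print(round(raw_score(-0.40, 4.74, 3.10), 2))
```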
172. On a standardized test, a student's z-score was near zero. What does this tell us about
the student's actual score on the test?
ANSWER:
173. For a particular sample, the mean is 4.74, and the standard deviation is 3.10. What
score in the sample has a z-score equal to −0.40?
ANSWER:
3.5
174. What statistical measure gives the range of the middle 50% of the data?
ANSWER:
Interquartile range
175. An aptitude test is known to have a mean score of 37.75 with a standard deviation equal to 3.5. A
company requires a standard score of at least 1.5 for employment as one of its requirements.
What must your test score be in order to be considered for employment?
ANSWER:
43 or larger
176. A normal distribution has a mean equal to 55.0 and a standard deviation equal to 7.5. Find the
value of the midquartile.
ANSWER:
55.0
177. For a particular sample x̄ = 4.2, one item in the sample is x = 4.8. This item has a
z-score of 2.50. Find the sample standard deviation.
ANSWER:
s = 0.24
178. For a particular sample x̄ = 4.4, an item in the sample is x = 3.4, and the z-score of this
item is equal to –1.25. Find the sample variance.
ANSWER:
s² = 0.64
179. Determine your raw score on a test that has a sample mean of 65 and a sample
variance of 121 if your instructor told you that your standard score is 1.50.
ANSWER:
z = (x − x̄)/s ⇒ 1.50 = (x − 65)/11 ⇒ x = 81.5
180. In general, the median, the midrange, and the midquartile are not necessarily the same
value. Each is the middle value, but by different definitions of “middle”. What property
does the distribution need for these three measures to all be the same value?
ANSWER:
The distribution of the data needs to be symmetric for these three measures to all be the
same value.
181. What does it mean to say that x = 163 has a standard score of +1.60?
ANSWER:
182. Determine your raw score on a test that has a sample mean of 74 and a sample
standard deviation of 12 if your instructor told you that your standard score is -0.50.
ANSWER:
z = (x − x̄)/s ⇒ −0.50 = (x − 74)/12 ⇒ x = 68
183. What does it mean to say that a particular value of x has a z score of -1.94?
ANSWER:
It means that value of x is 1.94 standard deviations below the mean.
ANSWER:
The standard score is a measure of the number of standard deviations from the mean.
The frequency distribution below gives the weight loss in pounds for 90 patients.
7 30.0 ≤ x ≤ 35.0 2
ANSWER:
25.0
186. What is the class width?
ANSWER:
5.0
ANSWER:
12.5
ANSWER:
90
ANSWER:
90
ANSWER:
1 0.0 ≤ x < 5.0 0.056
A sample of families living in a large, suburban subdivision resulted in the following frequency
distribution, where: x = number of children in the family.
x    f
0    8
1    11
2    23
3    21
4    13
5    7
6    2
191. What does the “3” represent?
ANSWER:
ANSWER:
ANSWER:
85
ANSWER:
219
195. Determine the mean number of children per family in the sample.
ANSWER:
x̄ = 219 / 85 = 2.58
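The totals 85 and 219 and the resulting mean can be reproduced with a short sketch. It assumes the frequencies are 8, 11, 23, 21, 13, 7, and 2 for x = 0 through 6, a reading of the table that matches both totals given above.

```python
# Frequency distribution: x = number of children, f = number of families.
frequency = {0: 8, 1: 11, 2: 23, 3: 21, 4: 13, 5: 7, 6: 2}

# Sample size is the sum of the frequencies.
n = sum(frequency.values())

# Each x value is counted f times, so the data total is the sum of x * f.
sum_xf = sum(x * f for x, f in frequency.items())

mean = sum_xf / n
print(n, sum_xf, round(mean, 2))
```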
196. A sample of twenty-five snow blowers of a given brand were filled with gasoline (one gallon) and
allowed to run until the tank was empty. The times (in minutes) that the snow blowers operated
were recorded as follows:
65 70 60 65 67 68 63 62 63 70 72 66 63
66 66 62 70 58 60 60 60 62 67 71 65
ANSWER:
58 ≤ x < 61    5
61 ≤ x < 64    6
64 ≤ x < 67    6
67 ≤ x < 70    3
70 ≤ x ≤ 73    5
Class Boundaries    f
0 ≤ x < 3           2
3 ≤ x < 6           4
6 ≤ x < 9           7
9 ≤ x < 12          10
12 ≤ x < 15         8
15 ≤ x < 18         6
18 ≤ x ≤ 21         3
197. In how many days was the daily high temperature between 9 and 12 degrees?
ANSWER:
10 days
ANSWER:
Class Boundaries    Relative Frequency
0 ≤ x < 3           0.050
3 ≤ x < 6           0.100
6 ≤ x < 9           0.175
9 ≤ x < 12          0.250
12 ≤ x < 15         0.200
15 ≤ x < 18         0.150
18 ≤ x ≤ 21         0.075
199. What is the proportion of days in which the daily high temperature was between 15 and
18?
ANSWER:
0.15
ANSWER:
Class Boundaries    Cumulative Frequency
0 ≤ x < 3           2
3 ≤ x < 6           6
6 ≤ x < 9           13
9 ≤ x < 12          23
12 ≤ x < 15         31
15 ≤ x < 18         37
18 ≤ x ≤ 21         40
ANSWER:
Class Boundaries    Cumulative Relative Frequency
0 ≤ x < 3           0.050
3 ≤ x < 6           0.150
6 ≤ x < 9           0.325
9 ≤ x < 12          0.575
12 ≤ x < 15         0.775
15 ≤ x < 18         0.925
18 ≤ x ≤ 21         1.000
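The relative, cumulative, and cumulative relative frequencies in the answers above all follow from the class frequencies 2, 4, 7, 10, 8, 6, and 3. A minimal sketch, with illustrative variable names:

```python
from itertools import accumulate

# Class frequencies for the daily high temperatures, in class order.
frequencies = [2, 4, 7, 10, 8, 6, 3]
n = sum(frequencies)

# Relative frequency: each class frequency divided by the sample size.
relative = [round(f / n, 3) for f in frequencies]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: running total divided by the sample size.
cumulative_relative = [round(c / n, 3) for c in cumulative]

print(relative)
print(cumulative)
print(cumulative_relative)
```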
202. The following frequency distribution gives the pay ranges (in thousands of dollars) for all middle
management personnel in a large company.
Class Boundaries    f
20 ≤ x < 30         4
30 ≤ x < 40         27
40 ≤ x < 50         29
50 ≤ x < 60         25
60 ≤ x < 70         17
ANSWER:
The ages of 50 students who are attending a community college in Iowa are shown below:
20 20 19 21 21 22 19 19 21 19
18 21 19 18 22 21 24 20 24 17
21 19 22 19 18 20 23 19 19 20
19 20 21 22 21 20 22 20 21 20
21 19 21 21 19 19 20 19 19 19
ANSWER:
Age          17   18   19   20   21   22   23   24
Frequency     1    3   16   10   12    5    1    2
204. Prepare an ungrouped relative frequency distribution of the same data.
ANSWER:
Age          17     18     19     20     21     22     23     24
Rel. Freq.   0.02   0.06   0.32   0.20   0.24   0.10   0.02   0.04
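Both ungrouped distributions can be tallied directly from the 50 ages listed above. The sketch below uses Python's `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

# Ages of the 50 community college students, as listed in the data set.
ages = [
    20, 20, 19, 21, 21, 22, 19, 19, 21, 19,
    18, 21, 19, 18, 22, 21, 24, 20, 24, 17,
    21, 19, 22, 19, 18, 20, 23, 19, 19, 20,
    19, 20, 21, 22, 21, 20, 22, 20, 21, 20,
    21, 19, 21, 21, 19, 19, 20, 19, 19, 19,
]

counts = Counter(ages)
n = len(ages)

# Ungrouped frequency distribution, ordered by age.
frequency = {age: counts[age] for age in sorted(counts)}

# Relative frequency: each frequency divided by the sample size.
relative_frequency = {age: round(f / n, 2) for age, f in frequency.items()}

print(frequency)
print(relative_frequency)
```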
ANSWER:
[Histogram: Age on the horizontal axis (17 to 23 labeled); Frequency on the vertical axis, scaled 0 to 20.]
206. Prepare a cumulative relative frequency distribution of the same data.
ANSWER:
Age               17     18     19     20     21     22     23     24
Cum. rel. freq.   0.02   0.08   0.40   0.60   0.84   0.94   0.96   1.00
207. Briefly discuss the basic guidelines to follow in constructing a grouped frequency
distribution.
ANSWER:
guideline for the number of classes with samples of fewer than 125 data.)
(d) Use a system that takes advantage of a number pattern to guarantee accuracy.
(e) When it is convenient, an even class width is often advantageous.
208. The terms “symmetrical, uniform, skewed, J-shaped, bimodal, and normal” are usually
used to describe histograms. Discuss each term briefly.
ANSWER:
Symmetrical: Both sides of this distribution are identical (halves are mirror images).
Skewed: One tail is stretched out longer than the other. The direction of skewness is on
the side of the longer tail.
J-Shaped: There is no tail on the side of the class with the highest frequency.
Bimodal: The two most populous classes are separated by one or more classes. This
situation often implies that two populations are being sampled.
Normal: A symmetrical distribution is mounded up about the mean and becomes sparse
at the extremes.
The following frequency distribution provides the number of managers and their annual salaries
(in $1000):
Number of Managers 24 74 52 38 12
ANSWER:
Class Cumulative
Boundaries Frequency
15 ≤ x ≤ 25 24
15 ≤ x ≤ 25 98
15 ≤ x ≤ 25 150
15 ≤ x ≤ 25 188
15 ≤ x ≤ 25 200
210. Prepare a cumulative relative frequency distribution for this frequency distribution.
ANSWER:
Class Cumulative
Boundaries Frequency
15 ≤ x ≤ 25 0.12
15 ≤ x ≤ 25 0.49
15 ≤ x ≤ 25 0.75
15 ≤ x ≤ 25 0.94
15 ≤ x ≤ 25 1.00
The players on a professional soccer team scored 40 goals during last season.
Player 1 2 3 4 5 6 7 8 9 10 11 12 13
Goals 2 7 3 2 2 5 2 1 6 2 3 2 3
211. If you want to show the number of goals scored by each player, would it be more
appropriate to display this information on a bar graph or a histogram? Explain.
ANSWER:
In order to show the number of goals scored by each player, it would be more
appropriate to display this information on a bar graph.
ANSWER:
[Bar graph: Player (1 to 13) on the horizontal axis; Number of Goals on the vertical axis.]
213. If you wanted to show (emphasize) the distribution of scoring by the team, would it be
more appropriate to display this information on a bar graph or a histogram? Explain.
ANSWER:
ANSWER:
[Histogram: Number of Goals (1 to 7) on the horizontal axis; Frequency on the vertical axis.]
QUESTIONS 215 AND 216 ARE BASED ON THE FOLLOWING INFORMATION:
A sample of twenty-five snow blowers of a given brand were filled with gasoline (one gallon) and allowed
to run until the tank was empty. The times (in minutes) that the snow blowers operated were recorded as
follows:
65 70 60 65 67 68 63 62 63 70 72 66 63
66 66 62 70 58 60 60 60 62 67 71 65
ANSWER:
Stems Leaves
5 8
6 0000222333555666778
7 00012
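The display above can be rebuilt from the 25 running times by splitting each value into a tens digit (the stem) and a units digit (the leaf). A minimal sketch, with illustrative names:

```python
from collections import defaultdict

# Running times (in minutes) of the 25 snow blowers.
times = [
    65, 70, 60, 65, 67, 68, 63, 62, 63, 70, 72, 66, 63,
    66, 66, 62, 70, 58, 60, 60, 60, 62, 67, 71, 65,
]

# Sort first so the leaves on each stem come out in increasing order.
stems = defaultdict(list)
for t in sorted(times):
    stems[t // 10].append(t % 10)  # tens digit is the stem, units digit the leaf

# One string of leaves per stem, stems in increasing order.
display = {
    stem: "".join(str(leaf) for leaf in leaves)
    for stem, leaves in sorted(stems.items())
}

for stem, leaves in display.items():
    print(stem, leaves)
```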
ANSWER:
217. Nine households had the following number of children per household: 2, 0, 2, 2, 1, 2, 4, 3, 2. Find
the mean, median, mode, and midrange for these data.
ANSWER:
QUESTIONS 218 THROUGH 219 ARE BASED ON THE FOLLOWING INFORMATION:
The commuting distance was determined for each of 10 employees at Acme manufacturing. One of the
employees lives in another town and has a large commuting distance. The 10 distances were as follows:
5, 10, 7, 15, 10, 12, 8, 120, 20, 18.
ANSWER:
22.5
ANSWER:
11
ANSWER:
221. Consider the following sample: 26, 49, 9, 42, 60, 11, 43, 26, 30, and 14. Find the mean.
ANSWER:
31.0
222. Consider the following sample: 26, 49, 9, 42, 60, 11, 43, 26, 30, and 14. Find the
median.
ANSWER:
28.0
223. Consider the following sample: 26, 49, 9, 42, 60, 11, 43, 26, 30, and 14. Find the
midrange.
ANSWER:
34.5
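The three answers for this sample (questions 221 through 223) can be verified with Python's standard `statistics` module. The midrange has no library function, so it is computed directly as the average of the lowest and highest values.

```python
import statistics

sample = [26, 49, 9, 42, 60, 11, 43, 26, 30, 14]

# Mean: sum of the values divided by the number of values.
mean = statistics.mean(sample)

# Median: with n = 10 (even), the average of the 5th and 6th ranked values.
median = statistics.median(sample)

# Midrange: average of the lowest value L and the highest value H.
midrange = (min(sample) + max(sample)) / 2

print(mean, median, midrange)
```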
224. For a particular sample of 50 scores on a statistics exam, the following results were obtained:
Mean = 78 Midrange = 72 Third quartile = 94 Mode = 84
Median = 80 Standard deviation = 11 Range = 52 First quartile = 68
What score was earned by more students than any other score? Why?
ANSWER:
225. If a sample with a mean of 10.5 and a standard deviation of 2.30 has every item multiplied by 10,
find the mean of the new sample.
ANSWER:
105
226. For a particular sample, the mean is 3.7 and the standard deviation is 1.2. A new sample is
formed by adding 6.3 to every item of data in the original sample. Find the mean of the new
sample.
ANSWER:
10.0
227. Find the mean, median, mode, and midrange for the following data:
x f
10 2
11 4
12 7
15 3
20 1
ANSWER:
228. A student computed the mean of a particular sample to be 40.0. After computing the mean, he
discovered that he forgot to include the number 36 in the sample. When this number was
included, the sample mean changed to 39.5. What is the sample size when the number 36 is
correctly included in the sample?
ANSWER:
n=8
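One way to arrive at this answer, writing n for the corrected sample size, is to equate the two expressions for the sum of the data: the original n − 1 values had a mean of 40.0, and adding 36 gives n values with a mean of 39.5.

```latex
\begin{align*}
40.0\,(n - 1) + 36 &= 39.5\,n \\
40.0\,n - 4 &= 39.5\,n \\
0.5\,n &= 4 \\
n &= 8
\end{align*}
```

As a check, 7 values with mean 40.0 total 280; adding 36 gives 316, and 316 / 8 = 39.5.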
229. What are the three data values such that the new sample has a mean of 100? Justify
your answer.
ANSWER:
Many different answers are possible. The sum of the five numbers needs to be 500;
therefore we need any three numbers that total 330, such as 100, 110, 120. Thus, the
new sample mean x̄ = 500 / 5 = 100.
230. What are the three data values such that the new sample has a median of 70? Justify
your answer.
ANSWER:
Many different answers are possible. Need two numbers smaller than 70 and one
number larger than 70. For example, we may choose 50, 60, and 80. Thus the five
numbers are 50, 60, 70, 80, 100, and the median is 70.
231. What are the three data values such that the new sample has a mode of 87? Justify
your answer.
ANSWER:
Many different answers are possible. Need multiple 87's. For example, we may choose
87, 87 and 95. Thus, the five numbers are 70, 87, 87, 95, 100, and the mode = 87.
232. What are the three data values such that the new sample has a midrange of 70? Justify
your answer.
ANSWER:
Many different answers are possible. Need any two numbers that total 140 for the
extreme values L and H, where one is 100 or larger. For example, we may choose the
numbers 40, 50, and 60. Thus the five numbers are 40, 50, 60, 70, 100, and midrange =
(L+H)/2 = (40+100)/2 = 70.
233. What are the three data values such that the new sample has a mean of 100 and a
median of 70? Justify your answer.
ANSWER:
Many different answers are possible. Need two numbers smaller than 70 and one
number larger than 70 so that their total is 330. For example, we may choose the
numbers 65, 65, and 200. Thus the five numbers are 65, 65, 70, 100, 200. Hence, x̄ =
500/5 = 100, and the median is 70.
234. What are the three data values such that the new sample has a mean of 100 and a
mode of 87? Justify your answer.
ANSWER:
Many different answers are possible. Need two numbers of 87 and a number large
enough so that the total of all five numbers is 500. Therefore the three numbers are 87,
87, 156. The five numbers are 70, 87, 87, 100, 156. Thus the mode = 87, and x̄ = 500 / 5
= 100.
235. What are the three data values such that the new sample has a mean of 100, a median
of 70, and a mode of 87? Justify your answer.
ANSWER:
No solution is possible. There must be two 87's in order to have a mode of 87, and
there can only be two data values larger than 70 in order for 70 to be the median.
Since 100 is already one of the numbers, the two 87's would make three of the five
numbers larger than 70.
236. The Next Door Store kept track of the number of paying customers it had during the noon hour
each day for 100 days. The following are the resulting statistics rounded to the nearest integer:
Mean = 95, Median = 97, Mode = 98, First quartile = 85, Third quartile = 107, Midrange = 93,
Range = 56, and Standard deviation = 12. The Next Door Store served what number of paying
customers during the noon hour more often than any other number? Explain how you determined
your answer.
ANSWER:
98 customers; this is the mode.
237. A statistics test was given with the following results:
80, 69, 92, 75, 88, 37, 98, 92, 90, 81, 32, 50, 59, 66, 67, 66
Find the range, standard deviation, and variance for the scores.
ANSWER:
238. What are the three data values such that the new sample has a mean of 110 (Hint: Many
different answers are possible). Justify your answer.
ANSWER:
∑x needs to be 550; therefore, need any three numbers that total 370, such as 110,
120, and 140. Hence, the mean x̄ = ∑x / n = 550 / 5 = 110.
239. What are the three data values such that the new sample has a median of 75 (Hint:
Many different answers are possible). Justify your answer.
ANSWER:
Need two numbers smaller than 75 and one number larger. For example, choose the
numbers 60, 70, and 80. Hence, the five data values are 60, 70, 75, 80, 105, and d(x̃) =
(n+1)/2 = (5+1)/2 = 3rd value; therefore the median x̃ = 75.
240. What are the three data values such that the new sample has a mode of 85 (Hint: Many
different answers are possible). Justify your answer.
ANSWER:
Choose three numbers, each equal to 85. Hence the five data values are 75, 85, 85, 85, 105,
and the mode = 85.
241. What are the three data values such that the new sample has a midrange of 80 (Hint:
Many different answers are possible). Justify your answer.
ANSWER:
Need any two numbers that total 160 for the extreme values where one is 105 or larger.
For example, choose the values 40, 50, and 120. Hence the five data values are 40, 50,
75, 105, and 120. Therefore, midrange = (L+H)/2 = (40+120)/2 = 80.
242. What are the three data values such that the new sample has a mean of 110 and a
median of 75 (Hint: Many different answers are possible). Justify your answer.
ANSWER:
Need two numbers smaller than 75 and one larger than 75 so that their total is 370. For
example, choose the values 65, 70, and 235. Hence the five data values are 65, 70, 75,
105, and 235. Hence the mean x̄ = ∑x / n = 550 / 5 = 110, and d(x̃) = (n+1)/2 = (5+1)/2
= 3rd value; therefore the median x̃ = 75.
243. What are the three data values such that the new sample has a mean of 110 and a
mode of 80 (Hint: Many different answers are possible). Justify your answer.
ANSWER:
Need two numbers of 80 and a third number large enough so that the total of all five
values is 550. Then the third number must be 210. Hence, the five values are 75, 80, 80,
105, and 210. Hence, the mean x̄ = ∑x / n = 550 / 5 = 110, and the mode = 80.
244. What are the three data values such that the new sample has a mean of 110 and a
midrange of 80 (Hint: Many different answers are possible). Justify your answer.
ANSWER:
We started with the data values 75 and 105. A mean of 110 requires the five data values
to total 550, and a midrange of 80 requires the total of the lowest value L and the highest
value H to be 160. The sum of 75 and 105 is 180; hence, the total of the three
remaining numbers is 370. If L and H are two of the new numbers, then the third new number
must be 370 − 160 = 210, which would exceed H unless L is −50 or smaller; the cases where
75 or 105 is an extreme value are also ruled out. So, for nonnegative data, this situation is
impossible!
245. What are the three data values such that the new sample has a mean of 110, a median
of 75, and a mode of 85 (Hint: Many different answers are possible). Justify your answer.
ANSWER:
There must be two 85's in order to have a mode of 85, and there can only be two data
values larger than 75 in order for 75 to be the median, but since 105 is one of the
starting numbers, then we have three data values larger than 75; namely 85, 85, and
105. As a result, 75 can’t be the median. So, this situation is impossible!
246. A sample of twenty-five snow blowers of a given brand were filled with gasoline (one gallon) and
allowed to run until the tank was empty. The times (in minutes) that the snow blowers operated
were recorded as follows:
65 70 60 65 67 68 63 62 63 70 72 66 63
66 66 62 70 58 60 60 60 62 67 71 65
ANSWER:
247. A group of children had the following heights in inches: 45, 46, 42, 56, 37, 50, 51, 50, 47, 47. Find
the range, standard deviation, and variance for the scores.
ANSWER:
ANSWER:
294.89
ANSWER:
17.17
ANSWER:
ANSWER:
46
ANSWER:
529
ANSWER:
1.20
ANSWER:
1.44
255. For the following three samples, for which sample is the data most closely grouped about the
sample mean? Give a written explanation that supports your conclusion.
ANSWER:
Since the sample standard deviation s measures dispersion about the mean, we
compute s for each sample. Sample 1, s = 5.17; Sample 2, s = 4.66; Sample 3, s = 5.54.
Since sample 2 has the smallest standard deviation, its data are most closely grouped about its
mean.
256. The mean for 50 pressure readings equals 5.5, and the sum of the squares of the readings
equals 1622.75. Find the standard deviation of these pressure readings.
ANSWER:
257. A set of 25 measurements has a mean of 24.5 and a standard deviation equal to 4.0.
Find ∑x².
ANSWER:
Consider the following data: 21, 41, 41, 36, 39, 23, 30, 30, 34, 31, 26, 25, 29, 28, 36.
ANSWER:
x̄ = 31.3
s = 6.3
Set 1: 45 55 50 48 52
Set 2: 35 50 65 47 53
Both sets have the same mean x̄ = 50. Compare the following measures for both sets:
∑(x − x̄), SS(x), and range. Comment on the meaning of these comparisons.
ANSWER:
Set 1:
   x     x − x̄    (x − x̄)²
  45      -5        25
  55      +5        25
  50       0         0
  48      -2         4
  52      +2         4
 250       0        58
Set 2:
   x     x − x̄    (x − x̄)²
  35     -15       225
  50       0         0
  65     +15       225
  47      -3         9
  53      +3         9
 250       0       468
Comparisons:
         ∑x    ∑(x − x̄)    ∑(x − x̄)²    Range
Set 1    250       0           58          10
Set 2    250       0          468          30
The larger SS(x) and range for set 2 indicate more variability in the data forming set 2
than in the data of set 1. ∑(x − x̄) = 0 for both sets of data (in fact this is always
true for any data).
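The comparison above can be reproduced with a short computation. This is a minimal sketch; the function name deviation_summary and the dictionary keys are illustrative choices, not notation from the test bank.

```python
def deviation_summary(data):
    """Return the sum, sum of deviations, SS(x), and range of a sample."""
    n = len(data)
    x_bar = sum(data) / n
    deviations = [x - x_bar for x in data]
    return {
        "sum_x": sum(data),
        "sum_dev": sum(deviations),             # always 0 (up to rounding)
        "ss": sum(d * d for d in deviations),   # SS(x) = sum of squared deviations
        "range": max(data) - min(data),
    }

set_1 = [45, 55, 50, 48, 52]
set_2 = [35, 50, 65, 47, 53]
summary_1 = deviation_summary(set_1)
summary_2 = deviation_summary(set_2)
print("Set 1:", summary_1)
print("Set 2:", summary_2)
```

Both sets share the same total and the same zero sum of deviations, so only SS(x) and the range separate them.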
65 70 60 65 67 68 63 62 63 70 72 66 63
66 66 62 70 58 60 60 60 62 67 71 65
Q1 = 62
ANSWER:
P90 = 70
263. For a particular sample of 50 scores on a statistics exam, the following results were obtained:
Mean = 78, Midrange = 72, third quartile = 94, Mode = 84, Median = 80, Standard deviation = 11,
Range = 52, and first quartile = 68. How many students scored between 68 and 94 on the exam?
ANSWER:
25
Consider the following sample of size n = 65, ordered from smallest to largest:
124 127 128 129 133 134 137 139 141 143
147 148 156 159 163 166 169 170 173 179
199 201 207 210 213 217 219 222 225 228
234 238 244 259 261 262 263 264 266 268
279 280 286 298 299 305 306 307 311 313
320 328 333 345 350 351 361 362 363 364
ANSWER:
P80 =330.5
ANSWER:
P29 = 173115.
24 27 28 29 33 34 37 39 41 43 47 48
56 59 63 66 69 70 73 79 99 21 27 10
13 17 19 22 25 28 34 38 44 59 61 62
63 64 66 68 79 80 86 98 99 35 36 37
11 13 20 28 33 45 50 51 61 62 63 64
ANSWER:
P20 =27
269. Consider the sample 9, 11, 17, 23, 26, 38, 47. Find the z-score for the data point of “11.”
ANSWER:
270. In which of these situations (A, B, or C) is the x-value lowest in relation to the sample
from which it comes? These samples come from three different populations.
Situation C: x = 1.6, x̄ = 2.00, s = 0.30
ANSWER:
271. Find the first quartile and the third quartile for the following data:
2.1, 2.1, 2.2, 2.4, 2.5, 2.5, 2.5, 2.5, 2.6, 2.6, 2.6, 2.7, 2.7,
2.7, 2.8, 2.9, 3.0, 3.0, 3.2, 3.2, 3.3, 3.3, 3.5, 3.6, 4.0
ANSWER:
11.1, 11.5, 11.9, 12.0, 11.6, 12.2, 11.9, 12.5, 12.8, 19.0,
10.9, 11.6, 12.7, 5.0, 11.5, 12.6, 19.5, 12.7, 4.0, 19.1
The mean equals 12.31 and the variance equals 14.2884. Find the standard score for the
smallest and largest data values.
ANSWER:
The smallest data value is 4.0, and its z-score is -2.198, while the largest data value is
19.5 and its z-score is 1.902.
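The two standard scores can be checked from the stated mean and variance. This is a minimal sketch; the helper name z_score is an illustrative choice.

```python
import math

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

mean = 12.31
variance = 14.2884
std_dev = math.sqrt(variance)  # the standard deviation is the square root of the variance

z_small = z_score(4.0, mean, std_dev)
z_large = z_score(19.5, mean, std_dev)
print(round(z_small, 3), round(z_large, 3))
```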
273. Use the following stem-and-leaf display to find the tenth percentile for the distribution of
lengths:
Stems Leaves
2.1 0 2 1
2.3 3 6 5 2 1 1
2.4 1 1
2.5 0 1 2
2.7 7 7 8
3.1 2 2 4
3.5 1 1 2 1 1
ANSWER:
P10 = 2.12
ANSWER:
IQR = Q3 − Q1 = 83 – 74 = 9
275. The following subscripted x’s represent a sample of size n = 67 which has been ranked
from smallest (x1) to largest (x67): x1, x2, x3, …, x65, x66, x67. Prepare a 5-number
summary for this sample in terms of the subscripted x’s.
ANSWER:
276. What does it mean to say that x = 152 has a standard score of +1.5?
ANSWER:
277. What does it mean to say that a particular value of x has a z-score of –2.1?
ANSWER:
ANSWER:
The standard score is a measure of the number of standard deviations from the mean.
Below are the ACT scores attained by the 25 members of a local high school graduating class.
23 26 25 19 33 21 21 22 21 27
19 25 18 23 22 30 27 27 23 16
21 19 20 30 22
ANSWER:
[Figure: display of the ACT scores; horizontal axis labeled ACT Scores, 18 to 33]
280. Using the concept of depth, describe the position of 26 in the set of 25 ACT scores in two
different ways.
ANSWER:
The data values in ascending order are:
16 18 19 19 19 20 21 21 21 21
22 22 22 23 23 23 25 25 26 27
27 27 30 30 33
Therefore, the value 26 is in the 19th position from L = 16, and in the 7th position from H = 33.
ANSWER:
nk / 100 =(25)(10) / 100 = 2.5. Hence d( P10 ) = 3, and P10 = 19
ANSWER:
nk / 100 =(25)(20) / 100 = 5. Hence d( P20 ) = 5.5, and P20 =(19+20)/2 = 19.5
ANSWER:
Since k = 99 > 50, subtract 99 from 100 and use 100 – k in place of k to determine the depth,
which is then counted from the largest-valued data H. Therefore, n(100 – k) / 100 = 25 (1) / 100 =
0.25; then d( P99 ) = 1, and P99 = 33
ANSWER:
Since k = 90 > 50, subtract 90 from 100 and use 100 – k in place of k to determine the depth,
which is then counted from the largest-valued data H. Therefore, P90 : n(100 – k) / 100 = 25 (10) /
100 = 2.5; then d( P90 ) = 3, and P90 = 30
ANSWER:
Since k = 80 > 50, subtract 80 from 100 and use 100 – k in place of k to determine the depth,
which is then counted from the largest-valued data H. Therefore, n(100 – k) / 100 = 25 (20) / 100
= 5; then d( P80 ) = 5.5, and P80 = (27+27) / 2 = 27
ANSWER:
nk / 100 =(25)(25) / 100 = 6.25. Hence d( Q1 ) = 7, and Q1 = 21
ANSWER:
nk / 100 =(25)(50) / 100 = 12.5. Hence d( Q2 ) = 13, and Q2 = 22
ANSWER:
Since k = 75 > 50, subtract 75 from 100 and use 100 – k in place of k to determine the depth,
which is then counted from the largest-valued data H. Therefore, n(100 – k) / 100 = 25 (25) / 100
= 6.25; then d( Q3 ) = 7, and Q3 = 26.
ANSWER:
The five-number summary reported by Minitab is: L = 16, Q1 = 20.5, Q2 = 22, Q3 = 26.5, and
H = 33.
Note that the values of Q1 and Q3 reported by Minitab are slightly different compared to our
earlier calculations that showed Q1 = 21 and Q3 = 26.
[Figure: horizontal axis labeled ACT Scores, 15 to 35]
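The depth rule used in the percentile and quartile answers above (compute nk/100, round up when it is not an integer, average two adjacent values when it is, and count from H when k > 50) can be sketched as follows. The function name percentile_by_depth is hypothetical, and the sketch assumes an integer k with 0 < k < 100.

```python
import math

def percentile_by_depth(data, k):
    """k-th percentile (integer k, 0 < k < 100) found by the depth rule."""
    ordered = sorted(data)
    n = len(ordered)
    if k > 50:
        # Count from the largest value H, using 100 - k in place of k.
        ordered.reverse()
        k = 100 - k
    if (n * k) % 100 == 0:
        # nk/100 is an integer: the depth is halfway between two positions.
        i = n * k // 100
        return (ordered[i - 1] + ordered[i]) / 2
    # Otherwise round nk/100 up to the next integer position.
    depth = math.ceil(n * k / 100)
    return ordered[depth - 1]

act_scores = [23, 26, 25, 19, 33, 21, 21, 22, 21, 27,
              19, 25, 18, 23, 22, 30, 27, 27, 23, 16,
              21, 19, 20, 30, 22]

for k in (10, 20, 25, 50, 75, 80, 90, 99):
    print(f"P{k} =", percentile_by_depth(act_scores, k))
```

Statistical packages often interpolate between positions instead, which is one reason Minitab's Q1 and Q3 differ slightly from the depth-rule values.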
On how many days were there between 85 and 107 paying customers during the noon hour?
Explain how you determined your answer.
ANSWER:
50 days; since 50% of the 100 days fall between the first and third quartiles.
The annual salaries (in $100) of high school teachers employed at one of the high schools in
Kent County, Michigan are listed below:
600 440 461 419 397 477 464 275 507 497
332 373 440 373 501 382 377 301 323 383
ANSWER:
ANSWER:
294. Find the first quartile for these salaries, and interpret the result.
ANSWER:
nk / 100 =(20)(25) / 100 =5.0. Hence d( Q1 ) =5.5, and Q1 = (373+373)/2 = 373 or $37,300. This
means that at most 25% of high school teachers’ salaries are lower than $37,300 and at most
75% are higher.
ANSWER:
Since k = 75 > 50, subtract 75 from 100 and use 100 – k in place of k to determine the depth,
which is then counted from the largest-valued data H. Therefore,
n(100 – k) / 100 = 20 (25) / 100 = 5.0; then d ( Q3 ) = 5.5, and Q3 = (464+477)/2 = 470.5 or
$47,050.
This means that at most 75% of high school teachers’ salaries are lower than $47,050 and at most
25% are higher.
296. Find the midquartile for these salaries, and interpret the result.
ANSWER:
Midquartile = (Q1 + Q3)/2 = (373 + 470.5)/2 = 421.75 or $42,175.
This means that the salary midway between the first and third quartile is $42,175.
297. Find the interquartile range for these salaries, and interpret the result.
ANSWER:
IQR = Q3 − Q1 = 470.5 − 373 = 97.5 or $9,750.
This means that the range of the middle 50% of the salaries is $9,750.
True-False Questions
298. Chebyshev’s Theorem says that within two standard deviations of the mean, you will
always find at least 89% of the data.
299. The Empirical Rule can be used to determine whether or not a set of data is approximately
normally distributed.
ANSWER: T
300. For a bell-shaped distribution, the range will be approximately equal to six standard
deviations.
ANSWER: T
301. The standard deviation is a kind of yardstick by which we can compare the variability of
one set of data with another.
ANSWER: T
ANSWER: T
303. The Empirical Rule applies specifically to a normal (bell-shaped) distribution, but it is
frequently applied as an interpretive guide to any mounded distribution.
ANSWER: T
304. The Empirical Rule applies to any distribution, regardless of its shape, as an interpretive
guide to the distribution.
ANSWER: F
305. The Empirical Rule can be used to determine whether or not a set of data is
approximately normally distributed.
ANSWER: T
ANSWER: T
ANSWER: F
308. In the event that the data do not display an approximately normal distribution,
Chebyshev’s Theorem gives us information about how much of the data will fall within
intervals centered at the mean for all distributions.
ANSWER: T
309. Graphs in which the frequency scale starts at zero tend to emphasize the size of the
numbers involved.
ANSWER: T
310. Graphs that are chopped off may tend to emphasize the variation in the numbers without
regard to the actual size of the numbers.
ANSWER: T
ANSWER: T
Multiple-Choice Questions
313. According to the Empirical Rule, if the variable is normally distributed, then within one
standard deviation of the mean, there will be approximately:
314. The proportion of any distribution that lies within four standard deviations of the mean is:
A) 93.75% or more.
B) 93.75% or less.
C) 6.25% or more.
D) 6.25% or less.
ANSWER: A
Short-Answer Questions
315. According to Chebyshev's Theorem, what percent of a set of data will be more than three
standard deviations from the mean?
ANSWER:
At most about 11% (1/3² = 1/9)
316. According to the Empirical Rule, at least what percent of a set of data will lie within two standard
deviations from the mean?
ANSWER:
Approximately 95%
317. A sample has a mean of 100.0 and a standard deviation of 15.0. According to Chebyshev's
Theorem, at least 8/9 of all of the data will lie between what two values?
ANSWER:
318. A sample of size 50 has a mean of 60.0 and a standard deviation of 10.0. According to
Chebyshev's Theorem, at least what percent of the data is between 10 and 110?
ANSWER:
96%
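Chebyshev's Theorem guarantees that at least 1 − 1/k² of any distribution lies within k standard deviations of the mean (k > 1). The sketch below applies that bound to the interval in the question above; the function name chebyshev_min_fraction is an illustrative choice.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is informative only for k > 1")
    return 1 - 1 / k ** 2

mean, std_dev = 60.0, 10.0
low, high = 10.0, 110.0

# The interval is symmetric about the mean, so either endpoint gives k.
k = (high - mean) / std_dev
print(f"k = {k}, at least {chebyshev_min_fraction(k):.0%} of the data")

for k_value in (2, 3, 4, 5):
    print(k_value, f"{chebyshev_min_fraction(k_value):.4f}")
```

The loop lists the bounds that recur in this section: 75% for k = 2, about 89% for k = 3, 93.75% for k = 4, and 96% for k = 5.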
319. A sample of size 100 from a normal population has a mean of 110 and a standard deviation of
10.0. Using the Empirical Rule, about how many items of the sample will be above 130?
ANSWER:
Approximately 2 to 3 items
320. Complete the following statement: According to the Empirical Rule, ________ of the data for any
distribution will occur within one standard deviation of the mean of the distribution.
ANSWER:
68%
321. The lifetimes of electronic components have a mean equal to 2.5 years and a standard deviation
equal to 0.2 years. Within what time interval will at least 75% of the lifetimes fall?
ANSWER:
ANSWER:
55.6%
323. For a normal distribution, a value that is one standard deviation above the mean would be
approximately the same as what percentile?
ANSWER:
Eighty-fourth percentile
324. According to Chebyshev's Theorem, how many standard deviations on both sides of the mean do
you need to go so that at least 96% of the distribution is covered?
ANSWER:
Five
325. According to Chebyshev’s Theorem at least 75% of all the data in a particular sample
lies between 74.5 and 82.9. Find the sample mean for this sample.
ANSWER:
x̄ = 78.7
326. According to Chebyshev’s Theorem at least 75% of all the data in a particular sample
lies between 74.5 and 82.9. Find the sample standard deviation for this sample.
ANSWER:
s = 2.1
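Both answers follow from reading the 75% interval as x̄ ± 2s, since 1 − 1/k² = 0.75 gives k = 2. A minimal sketch of that arithmetic, with illustrative variable names:

```python
low, high = 74.5, 82.9
k = 2  # 1 - 1/k**2 = 0.75, so the interval is the mean plus or minus 2s

x_bar = (low + high) / 2     # the mean is the midpoint of the interval
s = (high - low) / (2 * k)   # the interval width equals 2 * k * s

print(round(x_bar, 1), round(s, 1))
```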
ANSWER:
From the graph it appears that the time for the boys is twice the time for the girls.
However, the time for the boys is 80 seconds while the time for the girls is 65 seconds.
The deception is caused by the vertical scale not starting at zero.
328. A large sample is selected from a normal distribution. The middle 99.7% of the sample data falls
between 24.2 and 69.2. Estimate the sample mean and the sample standard deviation.
ANSWER:
The average clean-up time for a crew of a medium-size firm is 80.0 hours and the standard
deviation is 6.5 hours. Assume that the Empirical Rule is appropriate.
329. What proportion of the time will it take the clean-up crew 93.0 or more hours to clean the
plant?
ANSWER:
z = (93 -80) / 6.5 = 2. Therefore, 93.0 is 2 standard deviations above the mean. Hence,
2.5% of the time more than 93.0 hours will be required.
ANSWER:
95% of the time, the total clean-up time will fall within 2 standard deviations of the mean;
that is 80.0 ± 2 (6.5) or from 67 to 93 hours.
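The two clean-up answers use the Empirical Rule percentages (68%, 95%, 99.7%) and the symmetry of the normal curve. The following sketch encodes that reasoning; the dictionary and function names are illustrative.

```python
# Approximate proportion within z standard deviations (Empirical Rule).
EMPIRICAL_WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}

def upper_tail(z):
    """Approximate proportion more than z standard deviations above the mean."""
    return (1 - EMPIRICAL_WITHIN[z]) / 2  # half of what lies outside, by symmetry

mean, std_dev = 80.0, 6.5

z = (93.0 - mean) / std_dev                          # standard score of 93.0 hours
tail = upper_tail(int(z))                            # proportion above 93.0 hours
interval = (mean - 2 * std_dev, mean + 2 * std_dev)  # middle 95% of clean-up times

print(z, round(tail, 3), interval)
```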
331. At most, what percentage of a distribution will be three or more standard deviations from
the mean?
ANSWER:
At most 11%
332. At most, what percentage of a distribution will be four or more standard deviations from
the mean?
ANSWER:
At most 6.25%
333. The Next Door Store kept track of the number of paying customers it had during the noon hour
each day for 100 days. The following are the resulting statistics rounded to the nearest integer:
For how many of the 100 days was the number of paying customers within three standard
deviations of the mean (x̄ ± 3s)? Explain how you determined your answer.
ANSWER:
According to Chebyshev’s Theorem, the proportion of any distribution that lies within 3 standard
deviations of the mean is at least 89%. Therefore, we should expect in at least 89 of the 100 days
that the number of paying customers was within three standard deviations of the mean.
The mean lifetime of a certain tire is 50,000 miles and the standard deviation is 2,500 miles.
334. If we assume the mileages are normally distributed, approximately what percentage of
all such tires will last between 42,500 and 57,500 miles?
ANSWER:
According to the Empirical Rule, approximately 99.7% of all such tires will last between
42,500 and 57,500 miles (i.e., within three standard deviations of the mean).
335. If we assume nothing about the shape of distribution, approximately what percentage of
all such tires will last between 42,500 and 57,500 miles?
ANSWER:
According to Chebyshev’s Theorem, at least 89% of all such tires will last between
42,500 and 57,500 miles (i.e., within three standard deviations of the mean).
Chapter 3
Section 3.1
True-False Questions
1. The scatter diagram is an appropriate display of bivariate data when both variables are
quantitative.
ANSWER: T
2. In problems that deal with two quantitative variables, we will present the sample data
pictorially on a scatter diagram.
ANSWER: T
ANSWER: T
4. When bivariate data result from two quantitative variables, the data are often arranged
on a cross-tabulation or contingency table.
ANSWER: F
5. The total of the marginal totals in a contingency table is the grand total and is equal to n,
the sample size.
ANSWER: T
6. Bivariate data refers to the values of two different variables that are obtained from two
different populations.
ANSWER: F
7. When bivariate data result from two qualitative variables, the data are often arranged on
a cross-tabulation or contingency table.
ANSWER: T
ANSWER: T
ANSWER: F
ANSWER: T
11. The frequencies in a contingency table can easily be converted to percentages of the
grand total by dividing each frequency by the row or column total and multiplying the
result by 100.
ANSWER: F
12. The frequencies in a contingency table can be expressed as percentages of the column
totals by dividing each column entry by that column’s total and multiplying the result by
100.
ANSWER: T
13. The frequencies in a contingency table can be expressed as percentages of the column
totals by dividing each column entry by the grand total and multiplying the result by 100.
ANSWER: F
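The three conversions contrasted in questions 11 through 13 differ only in the divisor: the grand total, the row total, or the column total. The sketch below uses a hypothetical 2 × 2 table; the labels and frequencies are invented for illustration and do not come from any question in this section.

```python
# Hypothetical contingency table of frequencies (rows x columns).
table = {
    "Row A": {"Col 1": 20, "Col 2": 30},
    "Row B": {"Col 1": 10, "Col 2": 40},
}

row_totals = {row: sum(cells.values()) for row, cells in table.items()}
col_totals = {}
for cells in table.values():
    for col, freq in cells.items():
        col_totals[col] = col_totals.get(col, 0) + freq
grand_total = sum(row_totals.values())

def percentages(divisor_for):
    """Convert every frequency to a percentage of divisor_for(row, col)."""
    return {
        row: {col: 100 * freq / divisor_for(row, col) for col, freq in cells.items()}
        for row, cells in table.items()
    }

pct_of_grand = percentages(lambda row, col: grand_total)
pct_of_row = percentages(lambda row, col: row_totals[row])
pct_of_col = percentages(lambda row, col: col_totals[col])

print(pct_of_grand)
print(pct_of_row)
print(pct_of_col)
```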
14. When bivariate data result from one qualitative and one quantitative variable, the
quantitative values are viewed as separate samples, each set identified by levels of the
qualitative variable.
ANSWER: T
15. The scatter diagram is a plot of all the ordered pairs of bivariate data on a coordinate
axis system. The input variable x is plotted on the horizontal axis, and the output variable
y is plotted on the vertical axis.
ANSWER: T
16. When the bivariate data are the result of two attribute variables, it is customary to
express the data mathematically as ordered pairs (x, y).
ANSWER: F
ANSWER: T
Multiple-Choice Questions
18. In bivariate data, where both response variables are quantitative ordered pairs (x, y),
what name do we give to the variable x?
A) Attribute variable
B) Dependent variable
C) Output variable
D) Independent variable
ANSWER: D
A) Contingency table
B) Two histograms
C) Two bar graphs
D) Two circle graphs
ANSWER: B
20. For which of the following situations is it appropriate to use a scatter diagram?
A) The total of the marginal totals in a contingency table is the grand total and is equal
to n, the sample size.
B) The frequencies in a contingency table can be expressed as percentages of the row
totals by dividing each row entry by the grand total and multiplying the results by 100.
C) In problems that deal with two quantitative variables, we present the sample data
pictorially on a scatter diagram.
D) None of the above.
ANSWER: B
Short-Answer Questions
ANSWER:
The quantitative values are viewed as separate samples, each set identified by levels of the
qualitative variable.
24. In an experiment, a fixed amount of fertilizer was applied to each of 10 plots, and the
corresponding yield in pounds of corn was measured. Identify the independent and
dependent variables in this experiment.
ANSWER:
25. Briefly discuss the three combinations of variable types that can form bivariate data.
26. When the bivariate data are the result of two quantitative variables, it is customary to
express the data mathematically as ordered pairs (x, y). What do the variables x and y
represent?
ANSWER:
x is the input variable (sometimes called the independent variable) and y is the output
variable (sometimes called the dependent variable).
27. When the bivariate data are the result of two quantitative variables, it is customary to
express the data mathematically as ordered pairs (x, y). Why are the data said to be
ordered?
ANSWER:
The data are said to be ordered because one value, x, is always written first.
28. When the bivariate data are the result of two quantitative variables, it is customary to
express the data mathematically as ordered pairs (x, y). Why are the data called paired?
ANSWER:
The data are called paired because for each x value, there is a corresponding y value
from the same source.
29. Consider the two variables, a person’s height and weight. Which variable, height or
weight, would you use as the input variable when studying their relationship? Explain
why.
ANSWER:
Shown below is a scatter diagram for high school GPAs (x) versus college GPAs (y). The
sample was selected from freshmen who had completed two semesters at a small college.
[Scatter diagram: High School GPA (x, horizontal axis, 2.0 to 4.0) versus College GPA (y, vertical axis, 2.0 to 4.0)]
ANSWER:
10
31. What is the smallest value reported for the output variable?
ANSWER:
2.0
ANSWER:
4.0
33. A survey of 15 doctors and 15 nurses was conducted, and one question related to their smoking
habit. The following coding was used: Doctor (D), Nurse (N), Smoker (S), Nonsmoker (NS). The
following results were obtained:
Respondent      D   D   D   N   N   D   D   D   N   D
Smoking Habit   S   NS  NS  S   S   NS  NS  NS  S   NS

Respondent      D   N   N   N   D   N   N   D   D   D
Smoking Habit   NS  NS  S   NS  S   NS  NS  NS  NS  NS

Respondent      D   N   N   N   D   D   N   N   N   N
Smoking Habit   NS  S   NS  NS  S   S   NS  S   NS  NS
ANSWER:
                   Respondent
Smoker          Doctor   Nurse   Row total
Yes                4       6        10
No                11       9        20
Column total      15      15        30
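The cross-tabulation can be tallied directly from the 30 coded responses. This is a minimal sketch using collections.Counter; the variable names are illustrative.

```python
from collections import Counter

# Coded survey results, in the order listed in the question.
respondents = ("D D D N N D D D N D "
               "D N N N D N N D D D "
               "D N N N D D N N N N").split()
habits = ("S NS NS S S NS NS NS S NS "
          "NS NS S NS S NS NS NS NS NS "
          "NS S NS NS S S NS S NS NS").split()

# Count each (smoking habit, respondent type) pair.
counts = Counter(zip(habits, respondents))

print(f"{'':8}{'D':>4}{'N':>4}{'Total':>7}")
for habit in ("S", "NS"):
    d, n = counts[(habit, "D")], counts[(habit, "N")]
    print(f"{habit:8}{d:>4}{n:>4}{d + n:>7}")
total_d = counts[("S", "D")] + counts[("NS", "D")]
total_n = counts[("S", "N")] + counts[("NS", "N")]
print(f"{'Total':8}{total_d:>4}{total_n:>4}{total_d + total_n:>7}")
```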
A large survey of doctors and nurses was conducted, and one of the topics investigated was their
smoking habit, i.e., whether they were smokers or not. The following results were obtained:
Respondent
34. Convert this table to a table of percentages based on the grand total (entire sample).
ANSWER:
Respondent
ANSWER:
Respondent
36. Convert this table to a table of percentages based on the row totals.
ANSWER:
Respondent
ANSWER:
12.09%
38. A study was done on undergraduate students. Of the males sampled, 80 were in the
college of liberal arts and sciences, 40 were in the college of commerce, and 10 were in
the college of engineering. For the females sampled, 70 were in the college of liberal
arts and sciences, 16 were in the college of commerce, and 34 were in the college of
engineering. For this sample construct a complete contingency table showing percents
ANSWER:
College
39. Shown below is a scatter diagram for high school GPAs (x) versus college GPAs (y).
The sample was selected from freshmen that had completed two semesters at a small
college.
[Scatter diagram: High School GPA (x, horizontal axis, 2.0 to 4.0) versus College GPA (y, vertical axis, 2.0 to 4.0)]
Match the items described in Column I with the terms in Column II.
3. Input variable c. All freshmen at the college having completed two semesters.
4. Output variable d. The students whose college GPA’s are shown in the scatter
diagram.
ANSWER:
In a national survey, 400 business travelers and 400 leisure travelers were each asked where they
would most like “more space.”
ANSWER:
41. Express the table as percentages of the row totals. Why might one prefer the table to be
expressed that way?
ANSWER:
42. Express the table as percentages of the column totals. Why might one prefer the table to
be expressed that way?
ANSWER:
One might prefer the table to be expressed that way because each category (Airplane,
Room, Other) is treated as a separate distribution.
A statewide survey was conducted to investigate the relationship between viewers’ preferences
for ABC, CBS, NBC, or PBS for news information and their political party affiliation. The results
are shown in tabular form:
ANSWER:
4000
44. Why is this bivariate data? Name the two variables. What type of variable is each one?
ANSWER:
This is bivariate data since the values of the two variables - television network viewers’
preferences and political affiliation - are obtained from the same population element.
Both variables are attribute variables.
ANSWER:
Viewers’ Preferences
ANSWER:
Viewers’ Preferences
ANSWER:
Viewers’ Preferences
ANSWER:
490
ANSWER:
44.45%
ANSWER:
17.82%
51. What percentage of the viewers were Republicans and preferred CNN?
ANSWER:
6.5%
52. What percentage of the viewers who preferred ABC were Democrats?
ANSWER:
25.882%
ANSWER:
15.186%
54. What percentage of the viewers who preferred Fox were Republicans?
ANSWER:
46.154%
55. What percentage of the viewers were neither Democrats nor Republicans and preferred
Fox?
ANSWER:
18.29%
ANSWER:
23.5%
Can a man’s height be predicted from his father’s height? The heights of some father-son pairs
are listed; x is the father’s height and y is the son’s height.
x 70 70 74 72 68 70 68 71 69 70 71
y 70 72 72 72 71 71 70 69 70 71 71
x 70 71 71 70 74 68 72 71 72 73 67
y 71 72 72 69 73 69 70 73 73 72 70
ANSWER:
[Figure: heights of fathers and sons displayed as two separate sets on a common Height axis, 67 to 74]
58. What can you conclude from seeing the two sets of heights as separate sets in
question 57? Explain.
ANSWER:
ANSWER:
[Scatter diagram: Father Height (horizontal axis, 67 to 74) versus Son Height (vertical axis, 69 to 73)]
ANSWER:
Fear of being in the dentist’s chair is an emotion felt by many people of all ages. A survey of
100 individuals in five age groups was conducted about this fear. These results are shown in the
table below:
ANSWER:
ANSWER:
63. Express the frequencies as percentages of each age group; marginal totals
ANSWER:
64. Express the frequencies as percentages of those who fear and those who do not fear.
ANSWER:
[Bar graph: Percentage (vertical axis, 0 to 80) of Fear and Don't Fear responses by Age Group (Elementary, Jr. High, Sr. High, College, Adult)]
True-False Questions
66. If the value of the coefficient of linear correlation, r, is near –1 for two variables, then the
variables are not related.
ANSWER: F
68. If two variables are not linearly correlated, then they are not related.
ANSWER: F
69. When both variables from a bivariate set of data are quantitative, the appropriate measure of
linear relationship is the coefficient of linear correlation.
ANSWER: T
70. The equation for the line of best fit relating the height (x) and weight (y) for freshman
women attending a particular college was found to be ŷ = -187.4 + 4.82x. This equation
could be used to predict weights of senior women attending this college.
ANSWER: F
71. The signs of r and b1 are always the same; that is, r and b1 are both either positive or
negative.
ANSWER: T
72. The closer the absolute value of r is to one, the better will be the predictions made using
the equation of the line of best fit, provided the prediction is made for x values between
the smallest value of x and the largest value of x in the observed data.
ANSWER: T
73. If the data points form a straight horizontal or vertical line, there is strong correlation.
ANSWER: F
74. Although the correlation coefficient measures the strength of a linear relationship, it does
not tell us about the mathematical relationship between the two variables.
75. Perfect positive linear relationship occurs when all the points fall exactly above the
straight line.
ANSWER: F
76. The primary purpose of linear correlation analysis is to measure the strength of a linear
relationship between two variables.
ANSWER: T
77. Correlation analysis is a method of obtaining the equation that represents the
relationship between two variables.
ANSWER: F
78. The linear correlation coefficient is used to determine the equation that represents the
relationship between two variables.
ANSWER: F
79. A correlation coefficient of zero means that the two variables are perfectly correlated.
ANSWER: F
80. Whenever the slope of the regression line is zero, the correlation coefficient will also be zero.
ANSWER: T
ANSWER: F
82. The slope of the regression line represents the amount of change expected to take place
in y when x increases by one unit.
83. When the calculated value of r is positive, the calculated value of b1 may be negative.
ANSWER: F
ANSWER: F
85. The value being predicted is called the output or predicted value.
ANSWER: T
86. The line of best fit is used to predict the average value of y that can be expected to occur
at a given value of x.
ANSWER: T
87. The primary purpose of linear correlation analysis is to measure the strength of a linear
relationship between two variables.
ANSWER: T
88. If as x increases there is no definite shift in the values of y, we say there is no correlation
or no relationship between x and y.
ANSWER: T
89. If as x increases there is a definite shift in the values of y, we say there is a positive
correlation between the two variables.
ANSWER: F
ANSWER: T
91. If as x decreases there is a definite shift in the values of y, we say there is a negative
correlation between the two variables.
ANSWER: F
92. If as x increases there is a definite shift in the values of y, we say there is a correlation
between the two variables.
ANSWER: T
93. Remember that a strong correlation between two variables does imply causation.
ANSWER: F
94. When the calculated value of the linear correlation coefficient r is close to zero, we
conclude that there is little or no linear correlation.
ANSWER: T
95. As the calculated value of the linear correlation coefficient r changes from 0.0 toward
−1.0, it indicates an increasingly weaker linear correlation between the two variables.
ANSWER: F
96. The equation of the line of best fit is determined by its slope ( b1 ) and its y-intercept ( b0 ).
ANSWER: T
97. The slope, b1 , represents the predicted change in y per unit increase in x.
ANSWER: T
ANSWER: F
99. The y-intercept is the value of y where the line of best fit intersects the y-axis.
ANSWER: T
100. Correlation analysis is a method of obtaining the equation that represents the
relationship between two variables.
ANSWER: F
101. Whenever the slope of the regression line is zero, the correlation coefficient will also be
zero.
ANSWER: T
102. When the linear correlation coefficient r is positive, the slope of the regression line b1 will
also be positive.
ANSWER: T
103. The linear correlation coefficient is used to determine the equation that represents the
relationship between two variables.
ANSWER: F
104. A correlation coefficient of zero means that the two variables are perfectly correlated.
ANSWER: F
105. The slope of the regression line represents the average amount of change expected to
take place in y when x increases by one unit.
ANSWER: T
ANSWER: F
Multiple-Choice Questions
107. Select the most likely answer for the coefficient of linear correlation for the following two
variables: x = the number of hours spent studying for a test, and y = the number of
points earned on the test
A) r = 1.20
B) r = 0.70
C) r = −0.85
D) r = 0.05
ANSWER: B
108. Select the most likely answer for the coefficient of linear correlation for the following two
variables: x = the weight, in pounds, of a college student, and y = the grade point average for the
student
A) r = 0.98
B) r = 0.65
C) r = 0.07
D) r = −0.65
ANSWER: C
109. Select the most likely value for the coefficient of linear correlation for the following two variables: x
= the number of police patrol cars cruising in a given neighborhood, and y = the number of
burglaries committed in the neighborhood
A) r = 1.14
B) r = 0.78
C) r = −0.13
D) r = −0.75
ANSWER: D
A) r = −0.87
B) r = 0.65
C) r = −0.02
D) r = 0.47
ANSWER: C
111. Suppose we used the equation, y = −2x + 3, to generate eight ordered pairs, (x, y).
Then using these ordered pairs to compute the coefficient of linear correlation, what
value should we expect to obtain for r?
A) +3
B) −1
C) +1
D) −2
ANSWER: B
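The result in question 111 can be checked numerically. The sketch below is only an illustration: the eight x values are arbitrary (the question does not specify them), and `pearson_r` is a hypothetical helper name, not part of the original test bank.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r = SS(xy) / sqrt(SS(x) * SS(y))."""
    n = len(xs)
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return ss_xy / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5, 6, 7, 8]      # eight arbitrary x values (assumed)
ys = [-2 * x + 3 for x in xs]      # ordered pairs generated by y = -2x + 3
r = pearson_r(xs, ys)
print(round(r, 6))
```

Because every point lies exactly on a line with negative slope, r comes out as −1 whatever distinct x values are chosen.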
112. A strong linear relationship (r = 0.97) exists between the two variables x and y in the
table. The equation of the least squares line is ŷ = 15.75 – 0.55x. For what values of x
should we use this equation to make predictions?
x 5 7 8 10 11 12
y 5.5 8 8 9 10 11
113. Shown below is a scatter diagram for high-school GPAs (x) versus college GPAs (y).
The sample was selected from freshmen who had completed two semesters at a small
college.
[Scatter diagram: College GPA (y) versus High School GPA (x)]
What can we say about the slope of the line of best fit?
114. Suppose we find the equation of the line of best fit for a set of bivariate data. If we use
x = x̄ in the equation, what value should we expect for ŷ?
A) ŷ = b0
B) ŷ = b1
C) ŷ = ȳ
D) Cannot predict the value of ŷ.
ANSWER: C
A) The linear correlation coefficient, r, always has a value between -1 and +1.
B) The coefficient of linear correlation, r, is the numerical measure of the strength of the
linear relationship between two variables.
C) If the data form a straight horizontal or vertical line, there is a weak correlation, since
one variable has no significant effect on the other.
D) None of the above.
ANSWER: C
A) The lurking variable is a variable that has an important effect on the relationship
between the variables of a study but is not included in the study.
B) If there is a strong linear correlation between two variables, then one can definitely
conclude that there is a direct cause-and-effect relationship between the two
variables.
C) As the calculated value of the linear correlation coefficient r changes from 0.0 toward
+1 or -1.0, it indicates an increasingly stronger linear correlation between the two
variables.
D) None of the above.
ANSWER: B
A) The value of the linear correlation coefficient ranges between 0 and +1.
B) The value being predicted in regression analysis is called the input variable.
C) The line of best fit is used to predict the average value of y that can be expected to
occur at a given value of x.
D) All of the above
ANSWER: C
ANSWER:
120. Explain why the following statement is false: “If the value of the coefficient of linear
correlation, r, is near zero for two variables, then the variables are never related”.
ANSWER:
121. What is the largest value that the coefficient of linear correlation can ever equal?
ANSWER:
1.0
x 1 2 3 4 5
y 6 6 6 6 6
ANSWER:
y 5 3 2 3 5
ANSWER:
Plotted data would be curvilinear rather than linear and r would show little or no
relationship.
124. For a particular set of bivariate data, the equation of the line of best fit is ŷ = −73.1 + 5.12x,
and ȳ = 11.80. Find the value of x̄.
ANSWER:
x̄ = 16.58
ANSWER:
126. A student correctly computed the coefficient of linear correlation for two variables and
found the value to be r = −0.02 . The student’s conclusion was that since the value of r is
near zero, the two variables are not related. Comment on this conclusion.
ANSWER:
Variables are not linearly related, but some other type of relationship may exist.
127. A study investigating the relationship between speed (miles per hour) and gas rate
(miles per gallon) covered speeds ranging from 20 mph to 70 mph. Speed was the
independent variable, and gas rate was the dependent variable. The equation of the line
of best fit was ŷ = 35.5 − 0.1x. Estimate the average miles per gallon for cars of the type
tested traveling at 50 mph.
ANSWER:
30.5 mpg
ANSWER:
129. When making predictions on the line of best fit, explain what is wrong with utilizing data from 20
years ago to make predictions for today.
ANSWER:
130. A strong linear relationship exists between the two variables in the table (r = –0.95). The
equation of the least squares line is ŷ = 15.75 – 0.55x. For what values of x should we
use this equation to make predictions?
x 6 7 10 12 14 15
ANSWER:
6 ≤ x ≤ 15
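The idea behind this answer, that the line should only be used inside the sampled range of x, can be sketched in code. The function name `predict` and its range guard are illustrative assumptions; the coefficients and the limits 6 and 15 come from question 130.

```python
def predict(x, b0=15.75, b1=-0.55, x_min=6, x_max=15):
    """Return y-hat = b0 + b1*x, refusing to extrapolate outside the sampled x range."""
    if not (x_min <= x <= x_max):
        raise ValueError(f"x = {x} lies outside the sampled range [{x_min}, {x_max}]")
    return b0 + b1 * x

print(predict(10))   # x = 10 is inside the sampled range
```

Calling `predict(20)` raises an error instead of returning a value, since no data were observed near x = 20.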
131. For a particular set of bivariate data, the equation of the line of best fit is ŷ = −82.4 +
6.28x, and ȳ = 11.8. Find x̄ for this data.
ANSWER:
x̄ = 15
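This answer relies on the fact that the line of best fit always passes through the point (x̄, ȳ). A minimal sketch of the arithmetic, reading the given mean as ȳ = 11.8 and using a hypothetical helper name:

```python
def mean_x_from_line(b0, b1, y_bar):
    """Solve y_bar = b0 + b1 * x_bar for x_bar (the fitted line passes through the centroid)."""
    return (y_bar - b0) / b1

x_bar = mean_x_from_line(b0=-82.4, b1=6.28, y_bar=11.8)
print(round(x_bar, 2))
```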
132. Diane collected a set of bivariate data and calculated r, the linear correlation coefficient.
The resulting value was – 1.54. Diane proclaimed that this indicated that there was no
correlation between the two variables since the value of r was not between –1.0 and
+1.0. Amy argues that –1.54 was impossible and that only values of r near zero implied
no correlation. Who is correct? Justify your answer.
ANSWER:
Amy is correct. The linear correlation coefficient r must take a value between –1.0
and +1.0, so –1.54 is impossible, and only values of r near zero imply no correlation.
133. How would you interpret the findings of a correlation study that reported a linear
correlation coefficient of -1.25? Why?
ANSWER:
134. How would you interpret the findings of a correlation study that reported a linear
correlation coefficient of +0.095?
ANSWER:
A linear correlation coefficient of +0.095 indicates that there is very little or no linear
correlation.
135. Explain why it makes sense for a set of data to have a correlation coefficient of zero
when the scatter diagram of the data shows a very definite pattern.
ANSWER:
The scatter diagram may suggest a nonlinear relationship between the two variables.
The correlation coefficient measures the strength of a linear relationship; therefore a
value near zero indicates no linear relationship.
136. Briefly discuss the difference between the purpose of regression analysis and the
purpose of correlation.
ANSWER:
In regression analysis, we seek a relationship between the variables. The equation that
represents this relationship may be the answer that is desired, or it may be the means to
the prediction that is desired. In correlation analysis, we measure the strength of the
linear relationship between the two variables.
137. Determine whether the following question requires correlation analysis or regression
analysis to obtain an answer: “Is there a correlation between the grades a student
obtained in high school and the grades he or she attained in college?”
ANSWER:
Correlation
138. Determine whether the following question requires correlation analysis or regression
analysis to obtain an answer: “What is the relationship between the weight of a package
and the cost of mailing it first class?”
ANSWER:
Regression
139. Determine whether the following question requires correlation analysis or regression
analysis to obtain an answer: “Is there a linear relationship between a person’s height
and shoe size?”
ANSWER:
140. Determine whether the following question requires correlation analysis or regression
analysis to obtain an answer: “What is the relationship between the number of worker-
hours and the number of units of production completed?”
ANSWER:
Regression
141. Determine whether the following question requires correlation analysis or regression
analysis to obtain an answer: “Is the score obtained on a certain aptitude test linearly
related to a person’s ability to perform a certain job?”
ANSWER:
Correlation
x 0 1 2 4
y 2 6 7 1
x 76 70 82 90 68 60 62 60 62 72 68 80 74
y 12 10 11 12 10 13 10 11 13 11 10 12 12
ANSWER:
r = 0.18
144. For a group of army inductees, the weight, x, and exercise capacity, y, were recorded for
10 individuals. For the following results, give the values for SS(xy), SS(x), SS(y), and r.
x 18 15 20 15 22 17 13 25 16 19
y 30 25 20 30 15 28 30 20 26 20
ANSWER:
145. The coefficient of linear correlation equals 0.8 for a set of bivariate data. The standard
deviation for the x variable is 20.5, and the standard deviation for the y variable equals
ANSWER:
495.28
ANSWER:
147. Compute the value of the coefficient of linear correlation for the following data and interpret the
value obtained.
ANSWER:
148. Compute the value of the coefficient of linear correlation for the following data and then
interchange the values of x and y and compute the value of the coefficient of linear
correlation for the changed data. How do the two values compare?
x 2 5 9 14
ANSWER:
149. Based on the following bivariate data, find the value of k so that the value of the
coefficient of linear correlation r will be exactly zero.
y 3 5 k
ANSWER:
k = 3.25
150. Based on the following bivariate data, find the value of k so that the value of the
coefficient of linear correlation r will be exactly +1.0.
x 5 k 7
y 8 9.5 11
ANSWER:
k=6
x    y    x²   xy   y²
2    14   4    28   196
3    13   9    39   169
4    11   16   44   121
5    8    25   40   64
5    9    25   45   81
7    4    49   28   16
7    3    49   21   9
ANSWER:
21.43
ANSWER:
106.86
ANSWER:
−47.29
ANSWER:
−0.99
ANSWER:
−2.21
19.26
ANSWER:
ŷ = 19.26 − 2.21x
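The answers above (21.43, 106.86, −47.29, −0.99, −2.21, and 19.26) can be reproduced from the seven (x, y) pairs in the table. The following is a sketch using the usual sum-of-squares formulas; the variable names are illustrative only.

```python
import math

# The seven (x, y) pairs from the extensions table
x = [2, 3, 4, 5, 5, 7, 7]
y = [14, 13, 11, 8, 9, 4, 3]
n = len(x)

ss_x = sum(v * v for v in x) - sum(x) ** 2 / n                   # SS(x)
ss_y = sum(v * v for v in y) - sum(y) ** 2 / n                   # SS(y)
ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n   # SS(xy)

r = ss_xy / math.sqrt(ss_x * ss_y)    # linear correlation coefficient
b1 = ss_xy / ss_x                     # slope of the line of best fit
b0 = (sum(y) - b1 * sum(x)) / n       # y-intercept

print(round(ss_x, 2), round(ss_y, 2), round(ss_xy, 2))
print(round(r, 2), round(b1, 2), round(b0, 2))
```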
ANSWER:
159. For a group of army inductees, the weight, x, and exercise capacity, y, were recorded for
10 individuals. Based on the results in the table, find the equation of the line of best fit.
x 18 15 20 15 22 17 13 25 16 19
y 30 25 20 30 15 28 30 20 26 20
ANSWER:
ŷ = 45.0 − 1.144x
160. Students were given a reading competency test (scores range from 0 to 48) and also a
math competency test (scores range from 50 to 100). Find SS(xy), SS(x), and the
equation of the line of best fit for the data.
Math (y) 78 80 90 60 95 70 77 83 90 80
ANSWER:
161. The moisture content of a chemical compound is determined for different relative humidity values.
Treat the humidity as the independent variable and the moisture content as the dependent
variable and find the equation of the line of best fit.
Humidity          30 45 60 50 80 65 75 20
Moisture Content   8 10 12  7 15 10 12  8
ANSWER:
ŷ = 4.9 + 0.1x
162. Using the following bivariate data, find the equation of the line of best fit and use it to
predict the value of y when x = 7.
x 2 5 9 13
y 65 10 21 25
ANSWER:
ŷ = 2.95 + 18.0x
ANSWER:
SS(x) = 11.5
164. For a particular set of bivariate data, the equation of the line best fit is ŷ = 3.5 + 7.2x and
SS(x) = 10.1. For this data, find SS(xy).
ANSWER:
SS(xy) = 72.72
An experimental psychologist asserts that the older a child is, the fewer irrelevant answers he or
she will give during a controlled experiment. To investigate this claim, the following data were
collected.
Age (x) 2 3 4 5 6 7 8 9 10
# Irrelevant Answers (y) 12 14 9 7 11 8 6 9 5
ANSWER:
[Scatter diagram: number of irrelevant answers (y) versus age (x)]
ANSWER:
ANSWER:
The coefficient of linear correlation r is the numerical measure of the strength of the linear
relationship between two variables.
In a study involving children’s fear related to being examined by a physician, the age and the
score each child made on the Child Medical Fear Scale (CMFS) were:
Age (x) 8 9 9 9 9 9 10 10 11 11
CMFS score (y) 30 25 25 29 35 42 28 27 32 35
ANSWER:
[Scatter diagram: CMFS score (y) versus age (x)]
ANSWER:
170. Calculate the coefficient of linear correlation r and interpret its meaning.
ANSWER:
There is a very weak positive linear relationship between the age of a child and the
score each child made on the CMFS.
Ali used linear regression to help him understand his monthly telephone bill. The line of best fit
was ŷ = 25.75 + 1.32x, where x is the number of long-distance calls made during a month and y is the
total telephone cost for a month. In terms of number of long-distance calls and cost:
ANSWER:
The y-intercept of $25.75 is the amount of the total monthly telephone cost when x, the
number of long distance calls, is equal to zero. That is, when no long distance calls are
made, there is still the monthly phone charge of $25.75.
ANSWER:
The slope of $1.32 is the rate at which the total phone bill will increase for each
additional long-distance call; it is related to the average cost of a long-distance call.
Consider the following data, which give the weight (in thousands of pounds) x and gasoline
mileage (miles per gallon) y for ten different automobiles.
x 2.0 2.4 2.6 2.9 3.2 3.5 3.8 4.2 4.6 5.2
y 45 40 42 39 44 36 34 28 18 13
ANSWER:
ANSWER:
There is a strong negative linear relationship between the weight (in thousands of
pounds) and gasoline mileage (miles per gallon) for different automobiles.
y 3 5 7 9 11
ANSWER:
[Scatter diagram: y versus x]
The scatter diagram of these data results in five points that fall perfectly on a straight
line.
176. Find the correlation coefficient and the equation of the line of best fit.
ANSWER:
n = 5, ∑x = 10, ∑y = 35, ∑x² = 30, ∑y² = 285, ∑xy = 90
SS(x) = ∑x² − (∑x)²/n = 30 − (10)²/5 = 10.0
SS(xy) = ∑xy − (∑x ⋅ ∑y)/n = 90 – (10)(35)/5 = 20
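The printed answer stops after SS(xy). As a sketch only, the remaining quantities could be obtained from the same summary values with the standard formulas; nothing below is taken from the original answer key beyond the six sums listed above.

```python
import math

# Summary values as listed in the answer
n, sum_x, sum_y = 5, 10, 35
sum_x2, sum_y2, sum_xy = 30, 285, 90

ss_x = sum_x2 - sum_x ** 2 / n         # SS(x)
ss_y = sum_y2 - sum_y ** 2 / n         # SS(y)
ss_xy = sum_xy - sum_x * sum_y / n     # SS(xy)

r = ss_xy / math.sqrt(ss_x * ss_y)     # correlation coefficient
b1 = ss_xy / ss_x                      # slope
b0 = (sum_y - b1 * sum_x) / n          # y-intercept

print(r, b1, b0)
```

A value of r = 1 here would agree with the observation that the five points fall perfectly on a straight line.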
A medical researcher studied the relationship between two variables: a person’s current age (x)
and the expected number of years remaining (y). The following data for a sample of ten people
were recorded:
x 64 66 68 70 72 74 76 78 80 82
y 16.6 15.2 13.8 12.6 11.5 10.2 9.3 8.5 7.1 6.2
[Scatter diagram: years remaining (y) versus age (x)]
ANSWER:
ANSWER:
The line of best fit is shown on the scatter diagram in question 185.
180. What are the expected years remaining for a person who is 75 years old? Find the
answer in two different ways: Use the equation from question 182 and use the line on
the scatter diagram in question 181.
ANSWER:
Using the equation of best fit: ŷ = 52.6881 – 0.5697(75) = 9.96. Using the graph of the
line of best fit shown in question 185: ŷ ≈ 9.7
181. Are you surprised that the data all lie so close to the line of best fit? Explain why the
ordered pairs follow the line of best fit so closely.
ANSWER:
The apparent linear relationship should not be a surprise. One’s age and years
remaining should total a fixed value, life expectancy.
A dietician conducted a study to compare calories (x) and fat (y) in 18 of the most popular fast-
food items. The results of the study are shown below:
y 7 13 11 12 10 8 26 28 8
y 36 20 20 22 22 55 25 40 20
ANSWER:
[Scatter diagram: fat (y) versus calories (x)]
ANSWER:
ANSWER:
185. Explain the meaning of the answers to questions 186, 187, and 188.
ANSWER:
186. Briefly discuss all possible situations that may be true about the relationship between
two variables x and y if there is a strong linear correlation between them.
ANSWER:
One of the following situations may be true about the relationship
between the two variables:
The number of hours studied, x, is compared to the exam score, y as shown below:
x 2 5 6 3 4 6 5 2 3
y 58 95 92 85 80 85 88 75 65
ANSWER:
ANSWER:
SS(x) = ∑x² − (∑x)²/n = 164 − (36)²/9 = 20
SS(xy) = ∑xy − (∑x)(∑y)/n = 3013 − (36)(723)/9 = 121
ANSWER:
r = SS(xy) / √(SS(x) ⋅ SS(y)) = 121 / √((20)(1216)) = 0.776
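The sums of squares and the value of r in these answers can be recomputed directly from the nine (hours, score) pairs. This is a checking sketch; the list names `hours` and `score` are illustrative.

```python
import math

hours = [2, 5, 6, 3, 4, 6, 5, 2, 3]            # x, hours studied
score = [58, 95, 92, 85, 80, 85, 88, 75, 65]   # y, exam score
n = len(hours)

ss_x = sum(h * h for h in hours) - sum(hours) ** 2 / n
ss_y = sum(s * s for s in score) - sum(score) ** 2 / n
ss_xy = sum(h * s for h, s in zip(hours, score)) - sum(hours) * sum(score) / n
r = ss_xy / math.sqrt(ss_x * ss_y)

print(ss_x, ss_xy, ss_y, round(r, 3))
```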
ANSWER:
There is a strong linear correlation between the number of hours studied for a test and
the test scores. In other words, studying for an exam pays off.
A study was conducted to investigate the relationship between the resale price, y (in hundreds
of dollars), and the age, x (in years) of midsize American automobiles. The equation of the line
of best fit was determined to be ŷ = 195.6 – 21.5x.
191. Find the resale value of such a car when it is four years old.
ANSWER:
192. Find the resale value of such a car when it is seven years old.
ANSWER:
193. What is the average annual decrease in the resale price of these cars?
ANSWER:
(21.5)($100) = $2,150
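Because y is measured in hundreds of dollars, each prediction from ŷ = 195.6 – 21.5x has to be multiplied by 100 to express it in dollars. A short sketch of that conversion for questions 191 through 193; the function name is hypothetical.

```python
def resale_price_dollars(age_years, b0=195.6, b1=-21.5):
    """Predicted resale price in dollars; the fitted line is in hundreds of dollars."""
    return (b0 + b1 * age_years) * 100

for age in (4, 7):
    print(age, round(resale_price_dollars(age), 2))

# The average annual decrease is the slope converted to dollars
annual_decrease = 21.5 * 100
print(annual_decrease)
```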
194. Start with the point (4, 4) and add at least four ordered pairs, (x, y), to make a set of
ordered pairs such that the correlation of x and y is 0.0.
ANSWER:
One possible answer is (4, 4), (1, 4), (2, 4), (0, 3), (0, 5).
195. Start with the point (4, 4) and add at least four ordered pairs, (x, y), to make a set of
ordered pairs such that the correlation of x and y is +1.0.
ANSWER:
One possible answer is (4, 4), (0, 0), (1, 1), (2, 2), (3, 3), (5, 5).
196. Start with the point (4, 4) and add at least four ordered pairs, (x, y), to make a set of
ordered pairs such that the correlation of x and y is -1.0.
ANSWER:
One possible answer is (4, 4), (7, 1), (1, 7), (6, 2), (2, 6), (5, 3), (3, 5).
Start with the point (4, 4) and add at least four ordered pairs, (x, y), to make a set of
ordered pairs such that the correlation of x and y is between -0.25 and 0.0.
197. Start with the point (4, 4) and add at least four ordered pairs, (x, y), to make a set of
ordered pairs such that the correlation of x and y is between +0.5 and +0.8.
ANSWER:
One possible answer is (4, 4), (2, 4), (1, 3), (2, 2), (0, 1).
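The sample answers to questions 194 through 197 can be verified by computing r for each listed set of ordered pairs. The sketch below uses a hypothetical helper, `pearson_r`, and the dictionary labels are only descriptive.

```python
import math

def pearson_r(points):
    """Linear correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    ss_x = sum(x * x for x, _ in points) - sx ** 2 / n
    ss_y = sum(y * y for _, y in points) - sy ** 2 / n
    ss_xy = sum(x * y for x, y in points) - sx * sy / n
    return ss_xy / math.sqrt(ss_x * ss_y)

answers = {
    "r = 0.0": [(4, 4), (1, 4), (2, 4), (0, 3), (0, 5)],
    "r = +1.0": [(4, 4), (0, 0), (1, 1), (2, 2), (3, 3), (5, 5)],
    "r = -1.0": [(4, 4), (7, 1), (1, 7), (6, 2), (2, 6), (5, 3), (3, 5)],
    "+0.5 < r < +0.8": [(4, 4), (2, 4), (1, 3), (2, 2), (0, 1)],
}
for label, pts in answers.items():
    print(label, round(pearson_r(pts), 3))
```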
Chapter 4
Probability
Section 4.1
True-False Questions
1. If A is any event of a sample space S, then P(A) represents the relative frequency with
which event A can be expected to occur.
2. If A is any event of a sample space S and if P(A) is computed using P(A) = n(A)/n(S),
then n(A) may never equal zero.
ANSWER: F
3. If A is any event of a sample space S and if the probability of event A is denoted by P(A),
then the probability of A is a theoretical probability.
ANSWER: F
4. Under certain conditions, it is possible that the sum of the probabilities of all the sample
points in a sample space is less than one.
ANSWER: F
5. If A is any event of a sample space S, then P(A) is a numerical value between −1 and 1,
inclusive.
ANSWER: F
ANSWER: F
7. The concepts of probability and relative frequency as related to an event are very
similar.
ANSWER: T
ANSWER: T
10. The value found for experimental probability will always be exactly equal to the
theoretical probability assigned to the same event.
ANSWER: F
11. The empirical probability that event A will occur is the relative frequency with which
event A can be expected to occur, and this probability is denoted by P′ (A).
ANSWER: T
12. The probability of an event may be obtained in three different ways: (1) empirically, (2)
theoretically, and (3) objectively.
ANSWER: F
13. The experimental, or empirical, probability P′(A) of an event A is the ratio n(A)/n of the number
of times A occurred to the number of trials n.
ANSWER: T
14. The theoretical method for obtaining the probability of an event uses a sample space in
which each possible outcome has a certain probability of occurring, but the probabilities
of all outcomes do not necessarily have the same value.
ANSWER: F
15. A sample space is a listing of all possible outcomes from the experiment being
considered.
ANSWER: T
16. A probability is always a numerical value larger than zero but smaller than one.
ANSWER: F
ANSWER: T
18. The number of times an event can be expected to occur in n trials is always less than or
equal to the total number of trials, n.
ANSWER: T
19. The Law of Large Numbers tells us that the larger the number of experimental trials n,
the larger the empirical probability P′(A) is expected to be compared to the true or
theoretical probability P(A).
ANSWER: F
20. The Law of Large Numbers states that “as the number of times an experiment is
repeated increases, the ratio of the number of successful occurrences to the number of
trials will tend to approach the theoretical probability of the outcome for an individual
trial.”
ANSWER: T
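The Law of Large Numbers in questions 19 and 20 can be illustrated with a small simulation. This is a sketch under assumptions not in the original: the event is "rolling a six with a fair die", and a fixed seed is used only so that the run is repeatable.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Empirical probability P'(A) of rolling a six in n_trials rolls of a fair die."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 6)
    return hits / n_trials

# The theoretical probability is 1/6; P'(A) tends to settle near it as n grows
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

The empirical probability does not become larger with more trials; it becomes closer, on average, to the theoretical value.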
21. Odds are a way of expressing probabilities by expressing the number of ways an event
can happen compared to the number of ways it can’t happen.
ANSWER: T
Multiple-Choice Questions
A) With the theoretical method for obtaining the probability of an event, the sample
space must contain equally likely sample points.
B) The theoretical probability P(A) of an event A is the ratio of the number n(A) of points
that satisfy the definition of event A to the number of trials n.
24. Which of the following probabilities is suitable in establishing proper life insurance rates?
A) Empirical probability
B) Theoretical probability
C) Subjective probability
D) All of the above.
ANSWER: A
25. If the odds in favor of an event A are a to b, which of the following statements is false?
27. If the odds favoring rain tomorrow are 3 to 1, then the probability of rain tomorrow is
A) 1.00
B) 0.75
C) 0.50
D) 0.25
ANSWER: B
Short-Answer Questions
28. State whether the probability in the following situation is being determined empirically,
theoretically, or subjectively: “A box contains 30 red beads and 70 blue beads. Jessica is
going to randomly select one bead from the box and is interested in determining the
relative frequency that the bead will be blue. She determines a relative frequency of
0.700”.
ANSWER:
Theoretically
29. State whether the probability in the following situation is being determined empirically,
theoretically, or subjectively: “Abby takes a test, and based on feeling, assigns a relative
frequency of 0.8 that her grade will be an A.”
ANSWER:
Subjectively
30. State whether the probability in the following situation is being determined empirically,
theoretically, or subjectively: “In order to determine the relative frequency of obtaining a sum of 17
when three dice are tossed, Heidi tosses three dice 200 times and observes that the sum of 17
occurs 5 times. She obtains a relative frequency of 0.025.”
ANSWER:
Empirically
31. State whether the probability in the following situation is being determined empirically,
theoretically, or subjectively: “Lily is interested in determining the relative frequency of
being dealt blackjack, which is an ace and a ten or an ace and a face card. She correctly
reasons that there are 64 possible blackjacks and 1326 possible two-card hands. She
then computes the relative frequency of being dealt blackjack as approximately 0.048.”
ANSWER:
Theoretically
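Lily's counts in question 31 can be reproduced by counting. The sketch below assumes a standard 52-card deck with 4 aces and 16 ten-valued cards (four each of tens, jacks, queens, and kings).

```python
from math import comb

aces = 4
ten_valued = 16                   # 10, J, Q, K in each of four suits
blackjacks = aces * ten_valued    # one ace paired with one ten-valued card
hands = comb(52, 2)               # number of unordered two-card hands

p_blackjack = blackjacks / hands
print(blackjacks, hands, round(p_blackjack, 3))
```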
32. A computer program produces a random integer between 0 and 9 (inclusive). Find the
probability that the integer is a number greater than 5.
ANSWER:
0.40
33. A computer program produces a random integer between 0 and 9 (inclusive). Find the
probability that the integer is a number less than 7.
ANSWER:
0.70
34. After examining 5000 records of children of age 5, a dentist finds that 2235 had at least
one cavity on their first dental check-up. What empirical probability would the dentist
assign to the event that a 5-year-old would have at least one cavity on his/her first dental
check-up?
ANSWER:
0.447
ANSWER:
1/3
36. Explain why the following statement is false: “If a sample space S has 5 sample points
and if event A contains exactly 1 of these sample points, then it must follow that P(A) =
0.20”.
ANSWER:
If the sample points are not equally likely, then P(A) is not necessarily 0.20.
37. Explain why the following statement is true: if A is an event of a sample space S, then it
is possible that P(A) = 1.
ANSWER:
If A = S, then P(A) = 1.
38. Heidi is interested in determining the probability that a randomly selected student in her
statistics class earned a passing grade (A, B, C, or D) on the first test. She reasons that
each student earned either a passing grade (P) or a failing grade (F) and constructs the
sample space S = {P,F}. Are the sample points equally likely or not equally likely?
ANSWER:
39. Amy is interested in determining the probability that a randomly selected card from a
standard deck of 52 will be a club. She reasons that the deck contains clubs (C), spades
ANSWER:
Equally likely
40. A sample space is composed of three outcomes, called A, B, and C. Outcome B is twice
as probable as A, and C is twice as probable as B. Find the probabilities of the events of
A, B, and C.
ANSWER:
ANSWER:
The formula P(A) = n(A)/n(S) cannot be used since the sample points are not equally
likely to occur.
42. If the odds in favor of an event B are x to y, what is the probability that event B will
occur?
ANSWER:
P(B) = x / (x + y)
43. If the odds in favor of an event A are 2 to 3, what is the probability that event A will not
occur?
ANSWER:
P(not A) = 0.60
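The conversion from odds to probability used in questions 42 and 43 can be written as a one-line function. This is an illustrative sketch; the function name is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def probability_from_odds(a, b):
    """Odds in favor of a to b correspond to P = a / (a + b)."""
    return Fraction(a, a + b)

p_a = probability_from_odds(2, 3)   # odds in favor of A are 2 to 3
print(p_a, 1 - p_a)                 # P(A) and P(not A)
```

The same function applied to odds of 3 to 1 gives 3/4, matching the rain example in question 27.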
44. One single-digit number is to be selected randomly. List the sample space.
ANSWER:
S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
45. Explain why an empirical probability, an observed proportion, and a relative frequency
are actually three different names for the same thing.
ANSWER:
All three are calculated by dividing the experimental count by the sample size.
46. A single die is rolled once. What is the probability that the number on top is an odd
number?
ANSWER:
3/6
47. A two-stage experiment is performed. In the first stage, a coin is tossed and heads
(H) or tails (T) is observed. In the second stage, a single card is randomly selected from
a standard deck of 52 cards, and the suit of clubs (C), spades (S), diamonds (D), or
hearts (H) is observed. List the sample space for this experiment.
ANSWER:
S = {(H, C), (H, S), (H, D), (H, H), (T, C), (T, S), (T, D), (T, H)}
ANSWER:
S = {HH, HT, TH, TT}; P(one head) = P(HT) + P(TH) = 0.25 + 0.25 = 0.50
A sample of 240 undergraduates is randomly selected from a state university in Michigan. For
the male students, 80 were in the College of A&S, 40 were in the College of Business (COB),
and 10 were in the College of Engineering (COE). For the female students, 60 were in the
College of A&S, 16 were in the COB, and 34 were in the COE.
ANSWER:
50. If one student is randomly selected, find the probability that the student is a male in the
College of Business.
ANSWER:
0.167
51. If one student is randomly selected, find the probability that the student is a female in the
College of Engineering.
ANSWER:
52. If one student is randomly selected, find the probability that the student is a male.
ANSWER:
0.542
53. If one student is randomly selected, find the probability that the student is a student in
the College of A&S.
ANSWER:
0.583
In a sample of 300 undergraduates, 90 males and 65 females were in the College of A&S, 45
males and 36 females were in the College of Business (COB), and 30 males and 34 females
were in the College of Education (COE).
ANSWER:
55. If one student is randomly selected, find the probability that the student is a female in the
College of Business.
ANSWER:
0.12
ANSWER:
0.10
57. If one student is randomly selected, find the probability that the student is a female.
ANSWER:
0.45
58. If one student is randomly selected, find the probability that the student is a student in
the College of A&S.
ANSWER:
0.52
59. Suppose that a box of marbles contains an equal number of red and white marbles but
twice as many blue marbles as red marbles. Draw one marble from the box and
observe its color. Assign probabilities to the elements in the sample space.
ANSWER:
Let P(R) = a, then P(W) = a, and P(B) = 2a. Hence, a + a + 2a = 1, which implies a = 0.25.
Therefore, P(R) = 0.25, P(W) = 0.25, P(B) = 0.50.
60. Events A, B, and C are defined on sample space S. Their corresponding sets of sample
points do not intersect and their union is S. Further, event B is twice as likely to occur as
event A, and event C is twice as likely to occur as event B. Determine the probability of
each of these three events.
ANSWER:
Type of Diabetes
Gender A B
Male (M) 50 40
Female (F) 70 40
ANSWER:
ANSWER:
63. Find the probability that the selected individual is Type A male.
ANSWER:
ANSWER:
S = {1, 2, 3, 4, 5, 6}
ANSWER:
1/6
ANSWER:
3/6
ANSWER:
2/6
ANSWER:
5/6
An experiment consists of drawing one marble from a box that contains a mixture of red, white,
and blue marbles.
ANSWER:
S = {R, W, B}
70. Can we be sure that each outcome in the sample space is equally likely? Explain.
ANSWER:
No; since no information is given in regard to the proportion of marbles for each color.
71. If two marbles are drawn from the box, list the sample space.
ANSWER:
A group of files in a medical clinic classifies the patients by gender and by the type of diabetes (I
or II). The cross-tabulation (contingency table) below gives the number in each classification.
Type of Diabetes
Gender I II
Male 42 21
Female 49 28
ANSWER:
Type of Diabetes
Gender I II Row
Male 0.30 0.15 0.45
Female 0.35 0.20 0.55
Column 0.65 0.35 1.00
ANSWER:
P(Female) = 0.55
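The probability table above comes from dividing each cell count by the grand total, and the marginal probabilities come from summing across a row or down a column. A sketch of that calculation, with illustrative variable names:

```python
# Cell counts from the cross-tabulation: (gender, diabetes type) -> count
counts = {
    ("Male", "I"): 42, ("Male", "II"): 21,
    ("Female", "I"): 49, ("Female", "II"): 28,
}
total = sum(counts.values())

# Joint probabilities: each cell count divided by the grand total
joint = {cell: c / total for cell, c in counts.items()}

# Marginal probabilities: sum the joint probabilities over the other variable
p_female = sum(p for (gender, _), p in joint.items() if gender == "Female")
p_type_ii = sum(p for (_, kind), p in joint.items() if kind == "II")

print(total, joint[("Male", "I")], round(p_female, 2), round(p_type_ii, 2))
```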
74. Find the probability that the selected individual is Type II.
ANSWER:
75. Find the probability that the selected individual is a male and Type I.
ANSWER:
76. Find the probability that the selected individual is a female and Type II.
ANSWER:
Researchers have for a long time been interested in the relationship between cigarette smoking
and lung cancer. The following table shows the percentages of adult females observed in a
recent study.
Cigarette smoking
77. What is the probability that she smokes and gets cancer?
ANSWER:
ANSWER:
79. What is the probability that she does not get cancer?
ANSWER:
80. What is the probability that she does not smoke and does not get cancer?
81. What is the probability that she gets cancer knowing she smokes?
ANSWER:
82. What is the probability that she does not get cancer, knowing she does not smoke?
ANSWER:
83. Events A, B, and C are defined on sample space S. Their corresponding sets of sample
points do not intersect and their union is S. Furthermore, event B is twice as likely to
occur as event A, and event C is twice as likely to occur as event B. Determine the
probability of each of the three events.
ANSWER:
Given information: P(A) + P(B) + P(C) = 1. Let P(A) = p, then P(B) = 2p and P(C) = 4p.
Now, p + 2p + 4p = 1, then p = 1/7. Therefore, P(A) = 1/7, P(B) = 2/7, and P(C) = 4/7.
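The same reasoning can be expressed as normalizing relative weights so that they sum to 1. The sketch below assigns weights 1, 2, and 4 to A, B, and C, following the "twice as likely" conditions in question 83.

```python
from fractions import Fraction

weights = {"A": 1, "B": 2, "C": 4}   # B is twice as likely as A, C twice as likely as B
total = sum(weights.values())

# Divide each weight by the total so the probabilities sum to 1
probs = {event: Fraction(w, total) for event, w in weights.items()}
print(probs)
```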
The odds for a student to pass a statistics class with an “A” grade are 3 to 7.
84. What is the probability the student will pass the class with an “A” grade?
ANSWER:
P(A) = 3 / 10 or 0.30
ANSWER:
Odds against passing the class with an “A” grade are 7 to 3 (or 7:3).
86. What is the probability the student will not pass the class with an “A” grade?
ANSWER:
P(not A) = 7 / 10 or 0.70
True-False Questions
87. If A is an event of a sample space with P(A) = P(Ā), then P(A) = 0.50.
ANSWER: T
ANSWER: T
89. Suppose A, B, and C are three nonempty events of a sample space S, all of which have
no sample points in common, then it is possible that A = B .
ANSWER: F
ANSWER: T
91. If A and B are any two events of a sample space S, then P(A) = P(A and B) − P(B).
ANSWER: F
ANSWER: T
93. A compound event formed by use of the word and requires the use of the addition rule.
ANSWER: F
94. A conditional probability is the relative frequency with which an event A can be expected
to occur under the condition that additional pre-existing information is known about some
other event, B.
ANSWER: T
95. If the results of a probability experiment can be any integer from 0 to 20, then the
probability of each integer is 0.05.
ANSWER: F
96. The complement of an event A, denoted by Ā, is the set of all sample points in the
sample space that do not belong to event A.
ANSWER: T
Multiple-Choice Questions
A) q – 1.
B) 1 / q.
C) q + 1.
D) 1 – q.
ANSWER: D
98. A sample space is composed of three outcomes, called A, B, and C. Outcome A is twice
as probable as B, and B is twice as probable as C. The probabilities of A, B, and C
would be:
99. Suppose A and B are two nonempty events of a sample space S, then P(B) always
equals to:
A) P(B | A).
B) P(B and A) + P(B and Ā).
C) P( B ) – 1.
D) P(B or A) ⋅P ( B or A) .
ANSWER: B
100. If P(A) = 0.80, P(B) = 0.70, and P(A or B) = 0.90, then P(A and B) is:
A) 0.10.
B) 0.14.
C) 0.60.
D) 0.72.
ANSWER: C
101. If P(A) = 0.45, P(B) = 0.35, and P(A and B) = 0.25, then P(A | B) is:
A) 1.4.
102. If P(A) = 0.60, P(B) = 0.63, and P(A and B) = 0.73, then P(A or B) is:
A) 1.23.
B) 0.50.
C) 0.13.
D) 0.10.
ANSWER: B
104. If events A and B are defined on a sample space, with P(A) = 0.25 and P(B | A) = 0.18,
then the probability that A and B can both occur at the same time is
A) 0.250
B) 0.180
C) 0.070
D) 0.045
ANSWER: D
105. If events A and B are defined on a sample space, with P(A) = 0.5 and P(A and B) = 0.8,
then the probability that event B will occur given that event A has already occurred is
A) 0.80
B) 0.50
C) 0.30
D) impossible to find P(B | A)
ANSWER: D
106. Five cards are randomly selected from a standard deck. Let A be the event that all five
selected cards are the same suit. Using probability rules, P(A) can be computed to be
0.002. Find the probability that all the cards are not the same suit.
ANSWER:
0.998
107. Events A and B are defined on a common sample space. If P(A) = 0.20, P(B) = 0.40,
and P(A or B) = 0.56, find P(A and B).
ANSWER:
0.04
108. If the probability that event A occurs during an experiment is 0.62, what is the probability
that event A doesn’t occur during that experiment?
ANSWER:
1 – 0.62 = 0.38
109. If the results of a probability experiment can be any integer from 15 to 30 and the
probability that the integer is less than 20 is 0.58, what is the probability the integer will
be 20 or more?
ANSWER:
Let A = The integer is less than 20, then not A = The integer is 20 or more, and
P(not A) = 1 – P(A) = 1 – 0.58 = 0.42.
ANSWER:
111. If P(A) = 0.54, P(B) = 0.29, and P(A and B) = 0.17, find P(A or B).
ANSWER:
P(A or B) = P(A) + P(B) – P(A and B) = 0.54 + 0.29 – 0.17 = 0.66
112. If P(A) = 0.35, P(B) = 0.45, and P(A or B) = 0.65, find P(A and B).
ANSWER:
P(A or B) = P(A) + P(B) – P(A and B) ⇒ 0.65 = 0.35 + 0.45 – P(A and B) ⇒ P(A and B)
= 0.15
113. If P(A) = 0.35, P(A or B) = 0.85, and P(A and B) = 0.2, find P(B).
ANSWER:
P(A or B) = P(A) + P(B) – P(A and B) ⇒ 0.85 = 0.35 + P(B) – 0.20 ⇒ P(B) = 0.70
Twenty percent of the trees in a particular forest have a disease, 30% of the trees are too small
to be used for lumber, and 40% are too small to be used for lumber or have a disease. What
percent of the trees are too small to be used for lumber and have a disease?
ANSWER:
10%
115. What percent of the trees are not too small to be used for lumber and do not have a
disease?
ANSWER:
60%
116. If a tree is too small to be used for lumber, what is the probability it has a disease?
ANSWER:
0.333
117. If a tree has a disease, what is the probability it is not too small to be used for lumber?
ANSWER:
0.50
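The four tree answers above follow from the addition rule, the complement rule, and the definition of conditional probability. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative and are not part of the original problem.

```python
# Percentages from the tree problem, written as probabilities.
p_disease = 0.20            # tree has a disease
p_small = 0.30              # tree is too small for lumber
p_small_or_disease = 0.40   # tree is too small or has a disease

# Addition rule rearranged: P(A and B) = P(A) + P(B) - P(A or B)
p_both = p_small + p_disease - p_small_or_disease

# Complement of the union: neither too small nor diseased
p_neither = 1 - p_small_or_disease

# Conditional probabilities: P(X | Y) = P(X and Y) / P(Y)
p_disease_given_small = p_both / p_small
p_not_small_given_disease = (p_disease - p_both) / p_disease

for value in (p_both, p_neither, p_disease_given_small, p_not_small_given_disease):
    print(round(value, 3))
```

Rounding is applied only when printing, because sums of decimal fractions such as 0.30 + 0.20 are not exact in floating point.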
                    Age Group
DBP              1      2      3
Below 70        20     30     20
70-90           60    140     60
Above 90        20     80     70
118. Find the probability that a randomly selected individual in this study was in age group 3 or had a
DBP above 90.
ANSWER:
0.50
119. Find the probability that a randomly selected individual in this study was in age group 1 or had a
DBP below 70.
ANSWER:
0.30
120. If a randomly selected individual in this study was in group 2, what is the probability that she/he
has a DBP between 70 and 90?
ANSWER:
0.56
121. If a randomly selected individual in that study had a DBP between 70 and 90, what is the
probability that she/he was in group 1?
ANSWER:
0.23
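The answers to questions 118 through 121 can be reproduced from the table counts. The sketch below is one possible way to do so; it assumes that the three count columns correspond to age groups 1, 2, and 3 from left to right, and the function names are hypothetical.

```python
# Counts from the DBP table. Treating the three columns as age groups
# 1, 2, and 3 (left to right) is an assumption.
counts = {
    "below 70": {1: 20, 2: 30, 3: 20},
    "70-90": {1: 60, 2: 140, 3: 60},
    "above 90": {1: 20, 2: 80, 3: 70},
}
total = sum(sum(row.values()) for row in counts.values())


def dbp_total(dbp):
    """Number of individuals in one DBP row."""
    return sum(counts[dbp].values())


def group_total(group):
    """Number of individuals in one age-group column."""
    return sum(row[group] for row in counts.values())


def p_group_or_dbp(group, dbp):
    """Addition rule: P(group or DBP) = P(group) + P(DBP) - P(both)."""
    return (group_total(group) + dbp_total(dbp) - counts[dbp][group]) / total


def p_dbp_given_group(dbp, group):
    """Conditional probability of a DBP level within one age group."""
    return counts[dbp][group] / group_total(group)


def p_group_given_dbp(group, dbp):
    """Conditional probability of an age group within one DBP level."""
    return counts[dbp][group] / dbp_total(dbp)


print(p_group_or_dbp(3, "above 90"))               # question 118
print(p_group_or_dbp(1, "below 70"))               # question 119
print(p_dbp_given_group("70-90", 2))               # question 120
print(round(p_group_given_dbp(1, "70-90"), 2))     # question 121
```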
The probability that a first-time tourist to the city of Chicago will visit the Art Institute is 0.4, will
visit the Museum of Science and Industry is 0.3, and will visit both is 0.1. Assume a first-time
tourist to Chicago is randomly selected.
ANSWER:
0.6
123. Find the probability that the tourist will visit neither of these attractions.
ANSWER:
0.4
124. Find the probability that the tourist will visit one, but not both, of these attractions.
ANSWER:
0.5
The probability that a first-time tourist to the city of Toledo will visit the Art Museum is 0.5, will
visit the Toledo Zoo is 0.4, and will visit both is 0.25. Assume a first-time tourist to Toledo is
randomly selected.
125. Find the probability that the tourist will visit the Art Museum or the Toledo Zoo.
ANSWER:
0.65
126. Find the probability that the tourist will visit neither of these attractions.
ANSWER:
0.35
127. Find the probability that the tourist will visit one, but not both, of these attractions.
ANSWER:
0.40
Sixty percent of the applicants at a “high tech” firm have a college degree, 45% have at least
three years experience in the high tech industry, and 35% have both a college degree and three
years experience in the high tech industry. An applicant is randomly chosen.
128. Find the probability that the applicant has a college degree or has had at least three
years of experience in the high tech industry.
ANSWER:
0.70
129. Find the probability that the applicant has no college degree.
ANSWER:
0.40
130. Find the probability that the applicant has less than three years experience in the high
tech industry.
ANSWER:
0.55
ANSWER:
0.10
Five hundred people are classified based on their smoking habits and whether or not they have
prominent wrinkles. The results are shown below:
Light or nonsmoker 75 245
132. Given that the individual is a heavy smoker, what is the probability that he/she does not
have prominent wrinkles?
ANSWER:
0.333
133. What is the probability that the selected individual is a heavy smoker or has prominent
wrinkles?
ANSWER:
0.510
ANSWER:
0.855
135. The probability that an individual will contract a particular disease is 0.005. Past
experience reveals that the probability that an individual who contracts the disease will
make a complete recovery is 0.68. Find the probability that a randomly selected
individual contracts the disease and does not make a complete recovery.
ANSWER:
0.0016
Records at a particular bank show that if a customer at the bank is randomly selected, the
probability that the customer has a savings account at the bank is 0.42, the probability that the
customer has a checking account at the bank is 0.74, and the probability that the customer has
both is 0.28. A customer is randomly selected.
136. Find the probability that he/she has a checking account given that the customer has a
savings account.
ANSWER:
0.667
137. Find the probability that he/she has a savings account given that the customer has a
checking account.
ANSWER:
0.378
Suppose A and B are events of a sample space S with P(A) = 0.36, P(B) = 0.24, and P(A and B)
= 0.06.
ANSWER:
0.54
ANSWER:
0.25
140. Find P(A | not B).
ANSWER:
0.395
ANSWER:
0.395
ANSWER:
0.281
A published article in a medical journal stated that one out of every ten American women will get
breast cancer. It also states that of those who do, one out of four will die of it.
143. Find the probability that a randomly selected American woman will never get breast
cancer.
ANSWER:
Let C represent the event that a woman gets breast cancer, and D represent the event that
she dies of it. P(C) = 0.1; then P(not C) = 1.0 – 0.1 = 0.9
144. Find the probability that a randomly selected American woman will get breast cancer and
not die of it.
ANSWER:
Let C represent the event that a woman gets breast cancer, and D represent the event that
she dies of it. P(D | C) = 0.25; P(not D | C) = 1 – P(D | C) = 0.75. Then,
P(C and not D) = P(C) ⋅ P(not D | C) = (0.10)(0.75) = 0.075.
145. Find the probability that a randomly selected American woman will get breast cancer and
die from it.
ANSWER:
Let C represent the event that a woman gets breast cancer, and D represent the event that
she dies of it. P(C and D) = P(C) ⋅ P(D | C) = (0.1)(0.25) = 0.025
A shipment of grapefruit arrived containing the following proportions of types: 10% pink
seedless, 20% white seedless, 30% pink with seeds, 40% white with seeds. A grapefruit is
selected at random from the shipment.
ANSWER:
P(seedless) = 0.10 + 0.20 = 0.30
ANSWER:
P(pink) = 0.10 + 0.30 = 0.40
ANSWER:
P(pink and seedless) = 0.10
ANSWER:
P(pink or seedless) = P(pink) + P(seedless) – P(pink and seedless)
= 0.40 + 0.30 – 0.10 = 0.60
ANSWER:
P (pink | seedless) = 0.10 / 0.30 = 0.333
ANSWER:
P(seedless | pink) = 0.10 / 0.40 = 0.25
152. Bianca wants to become a police officer. She must pass a physical exam and then a
written exam. Records show the probability of passing the physical exam is 0.75 and
that once the physical is passed the probability of passing the written exam is 0.60.
What is the probability that Bianca passes both exams?
ANSWER:
Let A represent passing physical exam and B represent passing written exam.
P(A) = 0.75, and P(B|A) = 0.60. Then, P(A and B) = P(B|A) ⋅ P(A) = (0.60)(0.75) = 0.45
Gender
ANSWER: Gender
ANSWER:
P(S) = 0.45
ANSWER:
ANSWER:
ANSWER:
P(M) = 0.334
ANSWER:
ANSWER:
ANSWER:
ANSWER:
ANSWER:
ANSWER:
ANSWER:
165. Events A and B are defined on a sample space, with P(A) = 0.8 and P(B | A) = 0.3.
What is the probability that A and B can both occur at the same time?
ANSWER:
P(A and B) = P(A) ⋅ P(B | A) = (0.8)(0.3) = 0.24
166. Events A and B are defined on a sample space, with P(A | B) = 0.5 and P(B) = 0.6. What
is the probability that A and B can both occur at the same time?
ANSWER:
P(A and B) = P(B) ⋅ P(A | B) = (0.6)(0.5) = 0.30
167. Events A and B are defined on a sample space, with P(A) = 0.8 and P(A and B) = 0.4.
Find the probability that event B will occur given that event A has already occurred.
ANSWER:
P(B | A) = P(A and B) / P(A) = 0.4 / 0.8 = 0.5
168. Events A and B are defined on a sample space, with P(B) = 0.36 and P(A and B) = 0.5.
Find the probability that event A will occur given that event B has already occurred.
ANSWER:
It is impossible to find P(A | B) in this situation since P(A and B) cannot exceed P(B).
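Questions 165 through 168 all rest on the relation P(A and B) = P(A) ⋅ P(B | A), including the consistency requirement that a joint probability cannot exceed either marginal. The sketch below illustrates this; the function names are hypothetical, and P(B) in question 166 is read as 0.6.

```python
def joint_probability(p_given, p_conditional):
    """Multiplication rule: P(A and B) = P(A) * P(B | A)."""
    return p_given * p_conditional


def conditional_probability(p_joint, p_given):
    """Return P(B | A) = P(A and B) / P(A), rejecting impossible inputs."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    if p_joint > p_given:
        raise ValueError("P(A and B) cannot exceed the probability of either event")
    return p_joint / p_given


print(round(joint_probability(0.8, 0.3), 2))        # question 165
print(round(joint_probability(0.6, 0.5), 2))        # question 166, assuming P(B) = 0.6
print(round(conditional_probability(0.4, 0.8), 2))  # question 167

try:
    conditional_probability(0.5, 0.36)              # question 168
except ValueError as error:
    print("impossible:", error)
```

Raising an error for inconsistent inputs mirrors the written answer to question 168, where no conditional probability exists because the stated values contradict each other.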
169. Suppose that A and B are two events defined on a common sample space and that the
following probabilities are known: P(A) = 0.4, P(B) = 0.3, and P(B | A) = 0.2. Find
P(A or B).
ANSWER:
Since P(A and B) = P(A) ⋅ P(B | A) = (0.4)(0.2) = 0.08, then P(A or B) = P(A) + P(B) – P(A
and B) = 0.4 + 0.3 – 0.08 = 0.62.
170. Suppose that A and B are events defined on a common sample space and that the
following probabilities are known: P(A or B) = 0.75, P(B) = 0.5, and P(A | B) = 0.25. Find
P(A).
ANSWER:
P(A or B) = P(A) + P(B) – P(A and B) ⇒ 0.75 = P(A) + 0.5 – 0.125 ⇒ P(A) = 0.375
171. Suppose that A and B are events defined on a common sample space and that the
following probabilities are known: P(A) = 0.5, P(A and B) = 0.16, and P(A | B) = 0.4. Find
P(A or B).
ANSWER:
P(A and B) = P(B) ⋅ P(A | B) ⇒ 0.16 = P(B) ⋅ (0.4) ⇒ P(B) = 0.4. Then,
P(A or B) = P(A) + P(B) – P(A and B) = 0.5 + 0.4 – 0.16 = 0.74
True-False Questions
172. If A and B are any two mutually exclusive events of a sample space S, then the
occurrence of B means that A will occur.
ANSWER: F
173. Suppose A, B, and C are three nonempty events of a sample space S, all of which have
no outcomes in common, then it is possible that P(A) = 0.4, P(B) = 0.5, and P(C) = 0.6.
ANSWER: F
174. If A and B are any two mutually exclusive events of a sample space S, then if A has
occurred, B may also occur.
ANSWER: F
175. If A and B are both nonempty events of a sample space S, and A and B are mutually
exclusive, then A and B are dependent.
ANSWER: T
176. If A and B are two nonempty events of a sample space S, that have no outcomes in
common, then P(AB) = 1.
ANSWER: F
177. If A and B are any two independent events of a sample space S, then A and B may be
mutually exclusive.
178. If A and B are any two independent events of a sample space S, then P(A and B) = P(A)
⋅ P(B|A).
ANSWER: T
179. If two events are mutually exclusive, they are also independent.
ANSWER: F
180. If events A and B are mutually exclusive, the sum of their probabilities must be exactly
one.
ANSWER: F
181. If the sets of sample points belonging to two different events do not intersect, the events
are mutually exclusive or dependent.
ANSWER: T
182. If P(A) = 0.3, P(B) = 0.6, and P(A and B) = 0.18, then A and B are independent events.
ANSWER: T
183. If P(A) = 0.2, P(B) = 0.5, and P(A and B) = 0.05, then A and B are mutually exclusive
events.
ANSWER: F
184. If P(A) = 0.4, P(B) = 0.3, and P(A and B) = 0.15, then P(B | A) = 0.45.
ANSWER: F
186. Mutually exclusive events are non-empty events defined on the same sample space with
each event excluding the occurrence of the other. In other words, they are events that
share no common elements.
ANSWER: T
ANSWER: F
ANSWER: T
189. If events A and B are not independent, they must be mutually exclusive.
ANSWER: F
ANSWER: F
191. Two events are independent if the occurrence (or nonoccurrence) of one gives us no
information about the likeliness of occurrence for the other.
ANSWER: T
192. If the occurrence of one event does have an effect on the probability for occurrence of
the other event, we say that the two events are mutually exclusive.
ANSWER: F
ANSWER: F
194. If the occurrence of one event does have an effect on the probability for occurrence of
the other event, we say that the two events are dependent.
ANSWER: T
195. If events A and B are not mutually exclusive, they must be independent.
ANSWER: F
196. If events A and B are not mutually exclusive, they may be either independent or
dependent.
ANSWER: T
ANSWER: F
Multiple-Choice Questions
198. Which of the following defines a sample space that has sample points in common?
A) must equal 1.
B) could equal 1.
C) would equal P(A) ⋅ P(B).
D) greater than 1.
ANSWER: B
200. Suppose A and B are two independent events of a sample space S with P(A) = 0.30 and
P(B) = 0.50, then P(A and B) is
A) 0.80.
B) 0.60.
C) 0.20.
D) 0.15.
ANSWER: D
201. Suppose A and B are events of a sample space S with P(A) = 0.22, P(B) = 0.40, and
P(A and B) = 0.04, then P(A | not B) is
A) 0.462.
B) 0.300.
C) 0.182.
D) 0.100.
ANSWER: B
202. If P(A) = 0.20, P(B) = 0.40 and P(A and B) = 0.08, then A and B are:
A) dependent events.
B) independent events.
C) mutually exclusive events.
D) complementary events.
ANSWER: B
203. If A and B are mutually exclusive events with P(A) = 0.40, then P(B):
204. If A and B are independent events with P(A) = 0.35 and P(A | B) = 0.35, then P(B):
A) equals 0.35.
B) equals 0.70.
C) equals 0.65.
D) cannot be determined with the information given.
ANSWER: D
A) P(A | B) = 1.
B) P(B | A) =1.
C) P(A and B) = 1.
D) P(A and B) = 0.
ANSWER: D
A) P(A and B) = P(A) ⋅ P(B) can be used as the definition of independence of events A
and B.
B) P(A and B) = P(A) ⋅ P(B) cannot be used as a test for independence of events A and
B
C) P(A and B) = P(A) ⋅ P(B) can be used as the definition of mutually exclusive events
D) None of the above
ANSWER: D
209. Suppose that A and B are mutually exclusive events, and that P(A) = 0.4 and P(B) = 0.3.
Then, P(A and B) will be
A) 0.0
B) 0.4
C) 0.3
D) 0.7
ANSWER: A
210. Which of the following is true if A and B are mutually exclusive events?
A) P(A | B) = 0
B) P(B | A) = 0
C) P(A and B) = 0
D) All of the above.
ANSWER: D
A) If two events are mutually exclusive, this means that the two events cannot occur
together; that is, they have no intersection.
B) If two events are independent, this means that the occurrence of either event does
not affect the probability of the other event.
C) Either (A) or (B), but not both, is true.
D) Both (A) and (B) are true.
ANSWER: C
A) If two events are mutually exclusive, then they are not independent.
B) If two events are independent, then they are not mutually exclusive.
C) Both (A) and (B) are true.
D) Both (A) and (B) are false.
ANSWER: D
A) If two events are not mutually exclusive, then they may be either dependent or
independent.
B) If two events are not independent, then they may be either mutually exclusive or not
mutually exclusive.
C) Both (A) and (B) are true.
D) Both (A) and (B) are false.
ANSWER: C
214. Explain why events A and B cannot be mutually exclusive if they are defined on a
common sample space with P(A) = 0.56 and P(B) = 0.61.
ANSWER:
If A and B were mutually exclusive, P(A or B) = 0.56 + 0.61 = 1.17 which is impossible.
215. Explain why P(B occurring when A has already occurred) = 0 when events A and B are
mutually exclusive.
ANSWER:
Since A and B are mutually exclusive events, the occurrence of either event excludes
the occurrence of the other. Now, since A has already occurred, then B cannot possibly
occur.
216. Events A and B are mutually exclusive events defined on a common sample space. If
P(A) = 0.4 and P(A or B) = 0.9, find P(B).
ANSWER:
0.5
217. Events A and B are defined on a common sample space. If P(A) = 0.7, P(B) = 0.6, and A
and B are independent events, find P(A or B).
ANSWER:
0.88
ANSWER:
0.311
219. Explain why nonempty, mutually exclusive events A and B must be dependent.
ANSWER:
If A and B are independent, P(A and B) = P(A) ⋅ P(B). Since A and B are mutually
exclusive, P(A and B) = 0. Thus, 0 = P(A) ⋅ P(B). This is impossible since P(A) ≠ 0 and
P(B) ≠ 0. Therefore, A and B are dependent.
220. If A and B are independent events, and P(A) = 0.7 and P(B) = 0.6, find P(A and B)
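ANSWER:
P(A and B) = P(A) ⋅ P(B) = (0.7)(0.6) = 0.42
221. If events A and B are independent, and P(A) = 0.6 and P(B) = 0.5, find P(A and B).
ANSWER:
P(A and B) = P(A) ⋅ P(B) = (0.6)(0.5) = 0.30
222. If A and B are independent events, and P(A) = 0.8 and P(B) = 0.1, find P(B | A).
ANSWER:
P(B | A) = P(B) = 0.1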
ANSWER:
224. If A and B are independent events, and P(B) = 0.3 and P(A and B) = 0.4, find P(A).
ANSWER:
225. If A and B are independent events, and P(A) = 0.5 and P(B) = 0.3, find P(A | B).
ANSWER:
P(A | B) = P(A) = 0.5
226. Events A and B are events of a sample space S with P(A) = 0.32, P(B) = 0.11, and P(A
and B) = 0.08. Are A and B independent events? You must give a written explanation. A
simple answer of “yes” or “no” will receive no credit.
ANSWER:
P(A | B) = P(A and B) / P(B) = 0.08 / 0.11 = 0.727, but P(A) = 0.32. Since P(A | B) ≠ P(A),
events A and B are not independent.
ANSWER:
(26/52)(25/51)(24/50)(23/49)(22/48) = 0.0253
228. Find the probability that all five cards are red if they are selected with replacement.
ANSWER:
(26/52)^5 = 0.0313
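The two card answers differ only in whether the deck shrinks after each draw. The following sketch contrasts the two cases with exact fractions; the function name and its parameters are illustrative rather than taken from the source.

```python
from fractions import Fraction


def all_draws_favorable(favorable, total, draws, replace):
    """Probability that every one of `draws` selections is a favorable item."""
    probability = Fraction(1)
    for already_drawn in range(draws):
        if replace:
            # The item goes back, so each draw sees the same population.
            probability *= Fraction(favorable, total)
        else:
            # One favorable item and one item overall are gone after each draw.
            probability *= Fraction(favorable - already_drawn, total - already_drawn)
    return probability


without_replacement = all_draws_favorable(26, 52, 5, replace=False)
with_replacement = all_draws_favorable(26, 52, 5, replace=True)
print(round(float(without_replacement), 4))
print(round(float(with_replacement), 4))
```

Using Fraction keeps the products exact until the final conversion, which avoids accumulating rounding error across the five factors.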
Events A, B, and C are events of a sample space S with A and C mutually exclusive, B and C
mutually exclusive, P(A) = 0.32, P(B) = 0.11, P(A and B) = 0.08, and P(C) = 0.42.
ANSWER:
0.35
ANSWER:
0.74
ANSWER:
0.53
ANSWER:
0.68
233. Find P(not C).
ANSWER:
0.58
ANSWER:
0.0
A box contains 12 red marbles and 8 blue marbles. Three marbles are randomly selected, one
at a time.
235. Find the probability that all three are blue if they are selected with replacement.
ANSWER:
(8/20)^3 = 0.064
236. Find the probability that all three are blue if they are selected without replacement.
ANSWER:
(8/20)(7/19)(6/18) = 0.0491
Let the sample space be the set of all students currently enrolled at your college. Suppose a
student is randomly selected. Define the events A, B, C, D, and E as follows:
ANSWER:
Independent
ANSWER:
Dependent
ANSWER:
Independent
Independent
ANSWER:
ANSWER:
k = 0.14 or 0.24
A and B are two independent events of a sample space S with P(A) = 0.25 and P(B) = 0.48.
0.12
ANSWER:
0.61
ANSWER:
0.25
246. Find P( A | B ) .
ANSWER:
0.25
ANSWER:
0.48
248. Find P( B | A ) .
ANSWER:
0.48
ANSWER:
250. Professor Brown gives her students a maximum of three attempts to pass a final
examination in her statistics course. She has found that the probability of passing on the
first attempt is 0.40, the probability of passing on the second attempt is 0.65, and the
probability of not passing on the third attempt is 0.15. Find the probability that a
randomly selected student of hers will pass the final examination.
ANSWER:
1 – (0.60)(0.35)(0.15) = 0.9685
An experiment consists of selecting a marble from box one and placing it in box two, and then a
marble is selected from box two and its color is noted. Box one contains two red, three blue, and
five white marbles, and box two contains six red, two blue, and two white marbles.
251. Find the probability that the first and second selected marbles were both red.
ANSWER:
(2/10)(7/11) = 0.127
252. Find the probability that the marble selected from box two was red.
ANSWER:
0.564
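The two-box experiment is a two-stage process: the color moved from box one changes the composition of box two before the second draw. A minimal sketch under that reading is shown below; the dictionary layout and function names are assumptions made for illustration.

```python
from fractions import Fraction

box_one = {"red": 2, "blue": 3, "white": 5}
box_two = {"red": 6, "blue": 2, "white": 2}


def p_first(color):
    """Probability of drawing `color` from box one."""
    return Fraction(box_one[color], sum(box_one.values()))


def p_second_given_first(second, first):
    """Probability of drawing `second` from box two after `first` was added to it."""
    added = 1 if second == first else 0
    return Fraction(box_two[second] + added, sum(box_two.values()) + 1)


# Multiplication rule: both selected marbles are red.
p_both_red = p_first("red") * p_second_given_first("red", "red")

# Law of total probability: sum over every color that could have been moved.
p_second_red = sum(p_first(c) * p_second_given_first("red", c) for c in box_one)

print(p_both_red, round(float(p_both_red), 3))
print(p_second_red, round(float(p_second_red), 3))
```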
ANSWER:
(3/20)(2/19)=0.0158
A hospital classifies some of the patients’ files by gender and by type of care received (Intensive
Care Unit (ICU) and Surgical Unit). The number of patients in each classification is presented
below:
Type of Care
Male (M) 25 39
Female (F) 21 15
254. Are the events “being a female” and “being in the ICU” mutually exclusive?
ANSWER:
No, they can occur at the same time; i.e., a patient can be both female and in ICU.
255. Are the events “being in the ICU” and “being in the surgical unit” mutually exclusive?
ANSWER:
Yes, they cannot occur at the same time; i.e., a patient cannot be in ICU and the
Surgical Unit at the same time.
ANSWER:
ANSWER:
ANSWER:
ANSWER:
P(A and B) = P(A) ⋅ P(B) = ( 0.4)(0.5) = 0.20
ANSWER:
P(B | A) = P(B) = 0.5
ANSWER:
P(A or B) = P(A) + P(B) – P(A and B) = 0.4 + 0.5 – 0.20 = 0.70
Suppose that P(A) = 0.3, P(B) = 0.6, and P(A and B) = 0.18.
ANSWER:
P(A and B) = P(B) ⋅ P(A | B) ⇒ 0.18 = (0.6) ⋅ P(A | B); therefore, P(A | B) = 0.3.
ANSWER:
P(A and B) = P(A) ⋅ P(B | A) ⇒ 0.18 = (0.3) ⋅ P(B | A); therefore, P(B | A) = 0.6.
265. Are A and B independent? Justify your answer in three different ways.
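ANSWER:
Yes. (1) P(A | B) = 0.3 = P(A); (2) P(B | A) = 0.6 = P(B); (3) P(A and B) = 0.18 = P(A) ⋅ P(B)
= (0.3)(0.6).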
266. What is the probability that they together make the right decision on the first try?
ANSWER:
Let A = right decision, B = wrong decision
P(right decision) = P(A1 and A2) = (0.7) ⋅ (0.7) = 0.49
267. What is the probability that they together make the wrong decision on the first try?
ANSWER:
Let A = right decision, B = wrong decision
P(wrong decision) = P(B1 and B2) = (0.3) ⋅ (0.3) = 0.09
268. What is the probability that they together delay the decision for further study?
ANSWER:
Let A = right decision, B = wrong decision
P(delay the decision) = P[(A1 and B2) or (B1 and A2)] = (0.7)(0.3) + (0.3)(0.7) = 0.42
ANSWER:
Let D = defective, N = nondefective.
ANSWER:
Let D = defective, N = nondefective.
Let P(A) = 0.3, P(B) = 0.4, and events A and B are mutually exclusive.
ANSWER:
P(A and B) = 0.0 (they are mutually exclusive)
ANSWER:
P(A or B) = P(A) + P(B) = 0.3 + 0.4 = 0.7
ANSWER:
P(A | B) = 0.0 (they are mutually exclusive)
ANSWER:
P(A | not B) = P(A and not B) / P(not B) = P(A) / P(not B) = 0.3 / 0.6 = 0.5
ANSWER:
No; mutually exclusive events are disjoint, therefore they must be dependent.
278. A company that manufactures windows has three factories. Factory 1 produces 30% of
the company’s windows, Factory 2 produces 60%, and Factory 3 produces 10%. One
percent of the windows produced by Factory 1 are mislabeled, 0.5% of those produced
by Factory 2 are mislabeled, and 2% of those produced by Factory 3 are mislabeled. If
you purchase one window manufactured by this company, what is the probability that the
window is mislabeled?
ANSWER:
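Let Fi represent the factory where the window was produced, with i = 1, 2, 3, and M represent
mislabeled. Then, M = (M and F1) or (M and F2) or (M and F3), and hence
P(M) = P(F1) ⋅ P(M | F1) + P(F2) ⋅ P(M | F2) + P(F3) ⋅ P(M | F3)
= (0.30)(0.01) + (0.60)(0.005) + (0.10)(0.02) = 0.003 + 0.003 + 0.002 = 0.008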
Male Female
Satisfied 70 30 5 20 125
Unsatisfied 30 20 15 10 75
279. Find the probability that an unskilled worker is satisfied with work.
ANSWER:
280. Find the probability that a skilled woman employee is satisfied with work.
ANSWER:
281. Is satisfaction for women employees independent of their being skilled or unskilled?
ANSWER:
Since these two probabilities are not equal, the two events are not
independent. That is, satisfaction for women employees depends on their being skilled
or unskilled.
Suppose a certain ophthalmic trait is associated with eye color. One hundred and fifty randomly
selected individuals are studied with results as follows:
Eye Color
Yes 35 15 10 60
No 10 55 25 90
Total 45 70 35 150
282. What is the probability that a person selected at random has blue eyes?
ANSWER:
P(blue eyes) = 45 / 150 = 0.30
283. What is the probability that a person selected at random has the trait?
ANSWER:
P(yes) = 60 / 150 = 0.40
284. Are events A (has blue eyes) and B (has the trait) independent? Justify your answer.
ANSWER:
P(A and B) = 35 / 150 = 0.233, and P(A) ⋅ P(B) = (45/150) ⋅ (60/150) = 0.12; therefore A
and B are not independent events.
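The independence check in question 284 compares the observed joint proportion with the product of the two marginal proportions. The sketch below repeats that comparison from the table counts; the variable names are illustrative.

```python
import math

total = 150          # individuals studied
blue_eyes = 45       # column total for blue eyes
has_trait = 60       # row total for "Yes"
blue_and_trait = 35  # blue-eyed individuals with the trait

p_a = blue_eyes / total             # P(A): has blue eyes
p_b = has_trait / total             # P(B): has the trait
p_a_and_b = blue_and_trait / total  # P(A and B)

# A and B are independent only if P(A and B) equals P(A) * P(B).
independent = math.isclose(p_a_and_b, p_a * p_b)

print(round(p_a_and_b, 3), round(p_a * p_b, 3), independent)
```

math.isclose is used instead of == so that the comparison is not thrown off by floating-point rounding when the two quantities really are equal.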
285. How are the two events A (has blue eyes) and C (has brown eyes) related (independent,
mutually exclusive, complementary, or all-inclusive)? Explain why or why not each term
applies.
ANSWER:
Blue eyes and brown eyes are mutually exclusive events. They are not complementary since not
everyone was classified as having brown or blue eyes. Since they are mutually exclusive, they
cannot be independent events.
286. In the United States, the professional basketball championship is often decided by two teams
playing each other in a seven-game series. Suppose that team A is the better team, and
the probability it will beat team B in any one game is 0.7. What is the probability that
team A will win the series?
ANSWER:
P(team A wins best of 7 game series)
= P(A wins in 4 games) + P(A wins in 5 games) + P(A wins in 6 games) +
P(A wins in 7 games)
= 1 ⋅ (0.7)^4 + 4 ⋅ (0.7)^4 (0.3)^1 + 10 ⋅ (0.7)^4 (0.3)^2 + 20 ⋅ (0.7)^4 (0.3)^3
= 0.2401 + 0.2881 + 0.2161 + 0.1297 = 0.8740
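The coefficients 1, 4, 10, and 20 in the series calculation count the ways team A can collect its first three wins before clinching in the final game played. The following sketch computes the same sum for any single-game win probability; the function name is hypothetical, and games are assumed to be independent.

```python
from math import comb


def series_win_probability(p, wins_needed=4):
    """Probability of winning a series that ends at `wins_needed` wins.

    The series winner must win the last game played, so the earlier games
    contain wins_needed - 1 wins and some number of losses in any order.
    """
    q = 1 - p
    total = 0.0
    for losses in range(wins_needed):
        arrangements = comb(wins_needed - 1 + losses, losses)
        total += arrangements * p**wins_needed * q**losses
    return total


print(round(series_win_probability(0.7), 4))
```

As a sanity check, evenly matched teams (p = 0.5) give a series win probability of exactly one half.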
Events A and B are defined on a sample space. Assume that P(A) = 0.2 and P(B) = 0.4.
ANSWER:
ANSWER:
ANSWER:
P(A and B) = 0
Suppose that A and B are mutually exclusive events, and that P(A) = 0.4 and P(B) = 0.3.
290. Find P(not A).
ANSWER:
P(not A) = 1 – P(A) = 1 – 0.4 = 0.6
291. Find P(not B).
ANSWER:
P(not B) = 1 – P(B) = 1 – 0.3 = 0.7
ANSWER:
Since A and B are mutually exclusive, then P(A or B) = P(A) + P(B) = 0.4 + 0.3 = 0.7.
294. Give an example to demonstrate the fact that “If events A and B are mutually exclusive,
they cannot be independent”.
ANSWER:
Let P(A) = 0.3 and P(B) = 0.4. If A and B are mutually exclusive events, then P(A and B)
= 0, and then P(A | B) = 0.0. Since we are given P(A) = 0.3, we see that the occurrence
of B has an effect on the probability of A, therefore A and B cannot be independent
events.
295. Give an example to demonstrate the fact that “If events A and B are not mutually
exclusive, they may be either independent or dependent”.
ANSWER:
Let P(A) = 0.3, and P(B) = 0.5. If events A and B are not mutually exclusive, it must be
true that P(A and B) is greater than zero. Now if P(A and B) happens to be exactly 0.15,
then events A and B are independent since P(A and B) = 0.15 = P(A) ⋅ P(B). But, if P(A
and B) is any other positive value, say 0.12, then events A and B are not independent,
since P(A and B) = 0.12 ≠ P(A) ⋅ P(B) = 0.15. Therefore, our conclusion is that if the
events A and B are not mutually exclusive, they may be either independent or
dependent, and additional information is needed in order to determine which.
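The distinction drawn in questions 294 and 295 can be expressed as two separate tests on P(A and B): one against zero and one against the product P(A) ⋅ P(B). The sketch below applies both tests to the numbers used in those answers; the function name and return format are assumptions for illustration.

```python
import math


def classify(p_a, p_b, p_a_and_b):
    """Return (mutually_exclusive, independent) for two events."""
    mutually_exclusive = math.isclose(p_a_and_b, 0.0, abs_tol=1e-12)
    independent = math.isclose(p_a_and_b, p_a * p_b, abs_tol=1e-12)
    return mutually_exclusive, independent


print(classify(0.3, 0.4, 0.0))   # mutually exclusive, hence dependent
print(classify(0.3, 0.5, 0.15))  # not mutually exclusive, independent
print(classify(0.3, 0.5, 0.12))  # not mutually exclusive, dependent
```

Both flags can be true only when at least one of the events has probability zero, which is why nonempty mutually exclusive events are always dependent.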
Suppose that P(A) = 0.5, P(B) = 0.7, and P(A and B) = 0.35.
ANSWER:
ANSWER:
ANSWER:
Yes, A and B are independent events since the following three equalities are satisfied:
P(A | B) = P(A), P(B | A) = P(B), and P(A and B) = P(A) ⋅ P(B) (need to satisfy only one
equality, since if one is true, the other two must be true).
An aquarium at a pet store contains 50 orange swordfish (27 females and 23 males) and 30
green swordtails (14 females and 16 males). You randomly net one of the fish.
ANSWER:
ANSWER:
Color of fish
ANSWER:
P(O) = 0.625
ANSWER:
P(M) = 0.4875
ANSWER:
ANSWER:
ANSWER:
306. What is the probability that it is a male, knowing that it is a green swordtail?
ANSWER:
P(male | green) = 16 / 30 = 0.533
307. What is the probability that it is a female, knowing that it is an orange swordfish?
ANSWER:
P(F | O) = 27 / 50 = 0.54
308. Are the events “male” and “female” mutually exclusive? Explain.
ANSWER:
Yes; a fish cannot be male and female at the same time.
309. Are the events “male” and “swordfish” mutually exclusive? Explain.
ANSWER:
No; a fish can be both male and swordfish, that is, a male swordfish.
310. Are the events “gender” and “color of fish” independent? Explain.
ANSWER:
Since, for example, P(F | O) = 0.54 and P(F) = 0.5125, then P(F | O) ≠ P(F). Therefore,
the two events are not independent.
311. Give an example to demonstrate the fact that “If events A and B are independent and
both have nonzero probabilities, they cannot be mutually exclusive.”
ANSWER:
Let P(A) = 0.3 and P(B) = 0.5. If A and B are independent, then P(A and B) = P(A) ⋅ P(B)
= 0.15, which is greater than zero. This means there is an intersection between events A
and B, and the events cannot be mutually exclusive.
Suppose that P(A) = 0.20, P(B) = 0.40, and P(A and B) = 0.15.
ANSWER:
ANSWER:
ANSWER:
A and B are not independent events since, for example, P(A | B) = 0.375 ≠ P(A) = 0.20
One student is selected at random from a group of 150 known to consist of 105 full-time (60
female and 45 male) students and 45 part-time (30 female and 15 male) students. Event A is
“the student selected is full-time,” event B is “the student selected is part-time”, event M is “the
selected student is male”, and event F is “the selected student is female”.
ANSWER:
Student Status
Student Status
ANSWER:
Since P(A and F) = 0.4 ≠ P(A) ⋅ P(F) = (0.7)(0.6) = 0.42, then events A and F are not
independent.
ANSWER:
Since P(B and M) = 0.1 ≠ P(B) ⋅ P(M) = (0.3)(0.4) = 0.12, then events B and M are not
independent.
319. Based on your answers to questions 317 and 318, what is your conclusion?
ANSWER:
We conclude that student status (as full-time or part-time) is not independent of the
gender of the student.
ANSWER:
322. Give an example to demonstrate the fact that “If events A and B are not independent,
they can be either mutually exclusive or not mutually exclusive”.
ANSWER:
Let P(A) = 0.3 and P(B) = 0.5. If A and B are not independent events, it must be that P(A
and B) is different than 0.15; the value it would be if they were independent [P(A) ⋅ P(B) =
(0.3)(0.5) = 0.15]. Now if P(A and B) happens to be exactly 0.00, then events A and B
are mutually exclusive, but if P(A and B) is any other positive value, say 0.13, then
events A and B are not mutually exclusive. Therefore, our conclusion is that if the
events A and B are not independent, they could be either mutually exclusive or not, and
some other information is needed to make that determination.
A box contains 40 parts, of which 5 are defective and 35 are nondefective. Assume that 2 parts
are selected without replacement. Let event D1 = first part is defective, event D2 = second part is
defective, event N1 = first part is not defective, and event N2 = second part is not defective.
ANSWER:
ANSWER:
Are college graduation rates low? A recent survey shows that the percentage of students who
graduate within five years is 42% for public colleges and 55% for private colleges. One of the
reasons for this might be that only 56% of the students attend full time.
325. What additional information do you need to determine the probability that a student
selected at random is part time and will graduate within five years?
ANSWER:
We need to know whether or not the events part-time and graduate within five years are
independent.
326. Is it likely that the two events cited in question 325 have the needed property? Explain.
ANSWER:
Clearly the events part-time and graduate within five years are not independent.
Whether a student is part-time or full-time will make a difference in how soon he/she will
graduate.
327. If appropriate, find the probability that a student selected at random is part time and will
graduate within five years.
ANSWER:
Suppose that when a candidate comes to a campus interview for an administrative position at
an academic institution, the probability that he or she will want the job (event A) after the
interview is 0.70. Also, the probability that the institution wants the candidate (event B) is 0.35.
In addition, assume that P(A | B) is 0.90.
ANSWER:
ANSWER:
ANSWER:
Since P(B | A) = 0.45 ≠ P(B) = 0.35, events A and B are not independent.
ANSWER:
332. What would it mean to say A and B are mutually exclusive events in this particular
situation?
ANSWER:
“Candidate wants the administrative position” and “institution wants candidate” could not
both happen.
The odds against throwing a pair of dice and getting a total of 5 are 8 to 1. The odds against
throwing a pair of dice and getting a total of 10 are 11 to 1.
333. What are the odds in favor of throwing a pair of dice and getting a total of 5?
ANSWER:
1 to 8
334. What are the odds in favor of throwing a pair of dice and getting a total of 10?
ANSWER:
1 to 11
335. What is the probability of throwing a pair of dice and getting a total of 5?
ANSWER:
1 / (1 + 8) = 1 / 9
336. What is the probability of throwing a pair of dice and getting a total of 10?
ANSWER:
1 / (1 + 11) = 1 / 12
337. What is the probability of throwing the dice twice and getting a total of 5 on the first throw
and 10 on the second throw?
ANSWER:
Clearly the events of getting a total of 5 on the first throw and 10 on the second throw
are independent, so the special multiplication rule applies.
Therefore,
P(5 on first throw and 10 on second throw) = P(5 on first throw) ⋅ P(10 on second throw)
= (1/9)(1/12) = 1/108 ≈ 0.0093
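The probabilities used in question 337 can be checked by enumerating the 36 equally likely outcomes of two fair dice and converting between probability and odds. The sketch below does this for totals of 5 and 10; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def p_total(target):
    """Probability that two fair six-sided dice sum to `target`."""
    outcomes = list(product(range(1, 7), repeat=2))
    hits = sum(1 for first, second in outcomes if first + second == target)
    return Fraction(hits, len(outcomes))


def odds_against(probability):
    """Express a probability as odds against, returned as (against, in_favor)."""
    ratio = (1 - probability) / probability
    return ratio.numerator, ratio.denominator


p_five = p_total(5)
p_ten = p_total(10)
print(p_five, odds_against(p_five))
print(p_ten, odds_against(p_ten))

# Independent throws: special multiplication rule.
p_five_then_ten = p_five * p_ten
print(p_five_then_ten, round(float(p_five_then_ten), 4))
```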
Chapter 6
Normal Probability
Distributions
True-False Questions
ANSWER: T
2. If the random variable z is the standard normal score, then the mean of the distribution
of z is 0.
ANSWER: T
3. If the random variable z is the standard normal score, then the standard deviation of the
distribution of z is 1.
ANSWER: T
4. The total area under the curve of the standard normal distribution is not necessarily 1.0.
ANSWER: F
5. If the random variable z is the standard normal score, then z has a mean of one and a
standard deviation of zero.
ANSWER: F
ANSWER: F
7. The area under the normal curve between µ − 2σ and µ + 2σ is about 0.95.
ANSWER: T
ANSWER: F
ANSWER: T
10. The theoretical probability that a particular value of a continuous random variable will
occur is exactly zero.
ANSWER: T
11. The unit of measure for the standard score is the same as the unit of measure of the
data.
ANSWER: F
12. All normal distributions have the same general probability function and distribution.
ANSWER: T
13. Standard normal scores have a mean of zero and a standard deviation of one.
ANSWER: T
14. Probability distributions of all continuous random variables are normally distributed.
ANSWER: F
15. We are able to add and subtract the areas under the curve of a continuous distribution
because these areas represent probabilities of independent events.
ANSWER: F
16. The most common distribution of a continuous random variable is the normal probability.
ANSWER: T
18. The normal probability distribution is considered the single most important probability
distribution.
ANSWER: T
19. The most common distribution of a continuous random variable is the binomial
probability.
ANSWER: F
20. Each different pair of values for the mean, µ and standard deviation, σ will result in a
different normal probability distribution function. This means there are infinitely many
probability distribution functions.
ANSWER: T
21. The Empirical Rule is a fairly crude measuring device; with it we are able to find
probabilities associated only with any number multiple of the standard deviation from the
mean.
ANSWER: F
22. The standard normal table can be used to find probabilities for all combinations of mean,
µ and standard deviation, σ values.
ANSWER: T
ANSWER: F
24. All normal probability distributions have the same shape and distribution relative to the
mean and standard deviation.
25. Probability distributions of all continuous random variables are normally distributed.
ANSWER: F
26. The unit of measure for the standard score is the same as the unit of measure of the
data.
ANSWER: F
27. The total area under the curve of any normal distribution is 1.0.
ANSWER: T
28. The total area under the curve of any normal distribution is 100.
ANSWER: F
29. The Empirical Rule is a fairly crude measuring device; with it we are able to find
probabilities associated only with whole-number multiples of the standard deviation
(within one, two, or three standard deviations of the mean).
ANSWER: T
30. The total area under the curve of any continuous distribution is 1.0 as long as the
distribution is symmetric around the mean value.
ANSWER: F
31. The theoretical probability that a particular value of a continuous random variable will
occur is exactly zero.
ANSWER: T
A) The total area under the curve of any normal distribution is 1.0.
B) Nearly all the area under the standard normal curve is between z = -3.00 and z =
3.00.
C) The symmetry of the normal distribution is a key factor in determining probabilities
associated with values below (to the left of) the mean.
D) The z-score associated with the 50th percentile of the standard normal distribution is
1.0.
ANSWER: D
33. The distribution that has a mean of zero and a standard deviation of one is called the
34. Given a standard normal probability distribution, what can be said about the mean and
standard deviation?
35. If the random variable z is the standard normal score, which of the following probabilities
could easily be determined without referring to a table?
36. The area under the normal curve between z = 0.0 and z = 2.0 is
A) 0.9772.
B) 0.7408.
C) 0.1359.
D) 0.4772.
ANSWER: D
37. The area under the normal curve between z = -1.0 and z = -2.0 is
A) 0.3413.
B) 0.1359.
C) 0.4772.
D) 0.0228.
ANSWER: B
Short-Answer Questions
39. The random variable z is the standard normal score. Find the number k if P(z > k) =
0.9750.
ANSWER:
-1.96
40. The random variable z is the standard normal score. Find the number k if P(−k < z < k) =
0.3900.
ANSWER:
0.51
ANSWER:
0.2643
ANSWER:
0.0278
ANSWER:
0.5266
0.9463
ANSWER:
-0.45
46. The random variable z is the standard normal score. Find z as shown in the diagram
below given that the area of the shaded region is 0.4927.
[Diagram: standard normal curve with the region between z and 0 shaded; z lies to the left of 0.]
ANSWER:
-2.44
47. Find the probability that a randomly selected piece of data from a normal population will
have a z-score between 0 and 1.25.
ANSWER:
0.3944
48. Find the probability that a randomly selected piece of data from a normal population will
have a z-score greater than 1.25.
ANSWER:
0.1056
49. Find the probability that a randomly selected piece of data from a normal population will
have a z-score less than 2.25.
ANSWER:
0.9878
50. Find the probability that a randomly selected piece of data from a normal population will
have a z-score between 0 and –1.9.
ANSWER:
0.4713
51. Find the probability that a randomly selected piece of data from a normal population will
have a z-score greater than –1.65.
ANSWER:
0.9505
52. Find the probability that a randomly selected piece of data from a normal population will
have a z-score between –1.9 and 1.25.
ANSWER:
0.8657
ANSWER:
0.0881
ANSWER:
-0.53
ANSWER:
-1.23
ANSWER:
1.56
57. Give the z-scores for the first, second, and third quartiles for the standard normal
distribution.
ANSWER:
ANSWER:
90th percentile
59. What is the percentage of the total area under the normal curve within plus and minus
three standard deviations of the mean?
ANSWER:
99.74%
60. “About one-third of the students entering a certain university drop out during or at the
end of their first year.” Does this statement illustrate percentage, proportion or
probability?
ANSWER:
Proportion
61. “A recent survey reported that 56% of registered voters in Michigan are Democrats.”
Does this statement illustrate percentage, proportion or probability?
ANSWER:
Percentage
62. “The chance of receiving an “A” grade in this statistics class is 0.25”. Does this
statement illustrate percentage, proportion or probability?
ANSWER:
Probability
ANSWER:
0.4798
ANSWER:
ANSWER:
ANSWER:
67. Find a value of z such that 40% of the distribution lies between it and the mean.
68. Find the standard z-score such that 80% of the distribution is to the left of this value.
ANSWER:
z = 0.84
69. Find the standard z-score such that the area to the right of this value is 0.15.
ANSWER:
z = 1.04
70. Find the two standard z-scores that bound the middle 50% of a normal distribution.
ANSWER:
71. Find the two standard scores z such that the middle 90% of a normal distribution is
bounded by them.
ANSWER:
z = -1.65 and z = +1.65
72. Find the two standard scores z such that the middle 98% of a normal distribution is
bounded by them.
ANSWER:
z = -2.33 and z = +2.33
ANSWER:
P( |z| > 1.75) = P(z < -1.75) + P(z > +1.75) = 2(0.5000 – 0.4599) = 0.0802
ANSWER:
ANSWER:
(a) The total area under the normal curve is equal to one.
(b) The distribution is mounded and symmetric with respect to the vertical line drawn
through z = 0; it extends indefinitely in both directions, approaching but never touching
the horizontal axis.
(c) The distribution has a mean of 0 and a standard deviation of 1.
(d) The mean divides the area in half - 0.50 on each side.
(e) Nearly all the area is between z = -3.00 and z = 3.00
ANSWER:
z = 0.84. This says that the 80th percentile in a normal distribution is 0.84 standard
deviations above the mean.
77. Find the z-scores that bound the middle 75% of the standard normal distribution.
ANSWER:
78. Find the area under the standard normal curve to the right of z = 2.12.
ANSWER:
79. Find the area under the standard normal curve to the left of z = 1.93.
ANSWER:
80. Find the area under the standard normal curve between -1.48 and the mean.
ANSWER:
81. Find the area under the standard normal curve to the left of z = -1.33.
82. Find the area under the standard normal curve between z = -1.52 and z =1.25.
ANSWER:
83. Find the probability that it will have a standard score (z) that lies between 0 and 0.95.
ANSWER:
84. Find the probability that it will have a standard score (z) that lies to the right of 0.95.
ANSWER:
85. Find the probability that it will have a standard score (z) that lies to the left of 0.95.
ANSWER:
ANSWER:
ANSWER:
0.4960
ANSWER:
ANSWER:
91. Find the area under the standard normal curve between z = 0.25 and z = 2.75.
ANSWER:
92. Find a value of z such that 43.7% of the distribution lies between it and the mean. (Hint:
There are two possible answers.)
ANSWER:
Since P(0 < z < 1.53) = 0.437 = P(-1.53 < z < 0), then z = ± 1.53.
93. Find the two standard scores z such that the middle 75.4% of a normal distribution is
bounded by them.
ANSWER:
Since P(-1.16 < z < 1.16) = 0.377 + 0.377 = 0.754, then z = ± 1.16.
94. Find the two standard scores z such that the middle 82.3% of a normal distribution is
bounded by them.
ANSWER:
Since P(-1.35 < z < 1.35) = 0.4115 + 0.4115 = 0.823, then z = ± 1.35.
95. Find the two standard scores z such that the middle 90% of a normal distribution is
bounded by them.
ANSWER:
Since P(-1.645 < z < 1.645) = 0.45 + 0.45 = 0.90, then z = ± 1.645.
96. Find the two standard scores z such that the middle 95% of a normal distribution is
bounded by them.
ANSWER:
Since P(-1.96 < z < 1.96) = 0.475 + 0.475 = 0.95, then z = ± 1.96.
97. Find the two standard scores z such that the middle 99% of a normal distribution is
bounded by them.
ANSWER:
Since P(-2.575 < z < 2.575) = 0.495 + 0.495 = 0.99, then z = ± 2.575
98. Find the z-scores that bound the middle 55% of the standard normal distribution.
ANSWER:
Since P(-0.76 < z < 0.76) = 0.2764 + 0.2764 = 0.5528 ≈ 0.55, then z = ± 0.76.
99. Find the z-score for the first quartile of the standard normal distribution.
ANSWER:
Since P(z < -0.67) = 0.5000 – 0.2486 = 0.2514 ≈ 0.25, then the z-score for the first quartile
of the standard normal distribution is -0.67.
100. Find the z-score for the second quartile of the standard normal distribution.
ANSWER:
Since P(z < 0.0) = 0.50, then the z-score for the second quartile of the standard normal
distribution is 0.
101. Find the z-score for the third quartile of the standard normal distribution.
ANSWER:
Since P(z < 0.67) = 0.5000 + 0.2486 = 0.7486 ≈ 0.75, then the z-score for the third quartile of
the standard normal distribution is 0.67.
ANSWER:
P(|z| > 1.88) = P(z < -1.88) + P(z > +1.88) = 2P(z > +1.88) = 2(0.5000 - 0.4699) =
0.0602, then k = 0.0602.
ANSWER:
P(|z| < 2.28) = P(-2.28 < z < +2.28) = 2 P(0 < z < +2.28) = 2(0.4887) = 0.9774, then
k = 0.9774.
ANSWER:
P(|z| > c) = P(z < -c) + P(z > +c) = 2P(z > +c) = 0.0204, then P(z > +c) = 0.0102. Hence,
P(0 < z < c) = 0.5000 - 0.0102 = 0.4898, and c = 2.32.
ANSWER:
P(|z| < c) = P(-c < z < +c) = 2P(0 < z < +c) = 0.8948, then P(0 < z < +c) = 0.4474, and
c =1.62.
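The four preceding answers all rely on the symmetry of the standard normal curve. A minimal sketch of those calculations, using `statistics.NormalDist` from the Python standard library in place of the printed table, is given below. The helper names are assumptions made for illustration, and the results are exact to floating-point precision, so they can differ from the table-based answers in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def two_tail_area(c: float) -> float:
    """P(|z| > c) for c >= 0, using symmetry: 2 * P(z > c)."""
    return 2 * (1 - Z.cdf(c))

def central_area(c: float) -> float:
    """P(|z| < c) = P(-c < z < c) for c >= 0."""
    return Z.cdf(c) - Z.cdf(-c)

def cutoff_for_two_tail_area(area: float) -> float:
    """c such that P(|z| > c) = area; each tail holds area / 2."""
    return Z.inv_cdf(1 - area / 2)

def cutoff_for_central_area(area: float) -> float:
    """c such that P(|z| < c) = area; the left tail holds (1 - area) / 2."""
    return Z.inv_cdf(0.5 + area / 2)

print(round(two_tail_area(1.88), 4))
print(round(central_area(2.28), 4))
print(round(cutoff_for_two_tail_area(0.0204), 2))
print(round(cutoff_for_central_area(0.8948), 2))
```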
True-False Questions
106. Assume that x is a normally distributed random variable with a mean of µ and standard
deviation of σ. If x is converted to the standard score z, then given any three of the
values of x, µ, σ, and z, we can always find the fourth value.
ANSWER: T
107. If the random variable z is the standard normal score, then z(0.30) > z(0.20).
ANSWER: F
108. If the random variable z is the standard normal score, then z(0.65) = –z(0.35).
109. If the random variable z is the standard normal score, then z(0.35) = –z(0.35).
ANSWER: F
110. If the random variable z is the standard normal score, then z(0.50) = –z(0.50).
ANSWER: T
111. In the statement z(0.33) = 0.44, the number 0.33 represents a value for z and the
number 0.44 represents the area to the right of 0.33.
ANSWER: F
112. When using the notation z(α), the number α in parentheses is the measure of the area
to the right of the z-score.
ANSWER: T
113. z(0.15) is the algebraic name for the z such that the area to the left and under the standard
normal curve is exactly 0.15.
ANSWER: F
114. When using the notation z(0.05), the number in parentheses is the measure of the area
to the right of the z-score.
ANSWER: T
ANSWER: F
ANSWER: T
ANSWER: F
118. The standard normal distribution is the normal distribution of the standard variable z
(called the “standard score” or “z-score”).
ANSWER: T
119. In the notation z(0.05), the number in parentheses is the measure of the area to the left
of the z-score.
ANSWER: F
120. A standard notation used to abbreviate “normal distribution with mean µ and standard
deviation σ” is N(µ, σ).
ANSWER: T
121. z(α) is the value of z such that the area to the right of z and under the standard normal
curve is exactly α.
ANSWER: T
122. The middle 0.90 of the standard normal distribution is bounded by -1.96 and 1.96.
ANSWER: F
123. The random variable x is normally distributed with a mean of 75 and a standard
deviation of 15.0. For this distribution, the twenty-third percentile, P23 , is
A) 65.7.
B) 63.9.
C) 86.1.
D) 84.3.
ANSWER: B
124. If x is a normally distributed random variable with a standard score of z, a mean of µ, and
a standard deviation of σ, then x is equal to:
A) ( z − µ ) / σ
B) ( z − σ ) / µ
C) µ − σ z
D) µ + σ z
ANSWER: D
A) +0.67
B) −0.44
C) 0.2486
D) −0.17
ANSWER: B
126. If the random variable z is the standard normal score, then z(0.2611) is equal to
A) + z (0.2389).
B) + z (0.7611).
C) – z (0.7389) .
D) – z (0.1026) .
ANSWER: C
A) + 0.6324.
B) + 0.7324.
C) – z (0.7676) .
D) – z (0.2324).
ANSWER: C
128. Using the symbolic notation z(α), identify the value for α.
[Diagram: standard normal curve with the horizontal axis marked 0 and z; labeled area 0.2910]
A) z(0.2910)
B) z(0.2090)
C) z(0.8100)
D) z(0.7090)
ANSWER: D
129. If the random variable z is the standard normal distribution, then z(0.75) is equal to
[Diagram: standard normal curve with the horizontal axis marked 0 and z; labeled area 0.2700]
A) z(0.1064)
B) z(0.2300)
C) z(0.5064)
D) z(0.7400)
ANSWER: B
A) 1.0.
B) 0.6.
C) 0.0.
D) None of the above.
ANSWER: C
A) 1.04.
B) -1.04.
C) 0.52.
D) -0.52.
ANSWER: B
134. The mean of a normal probability distribution is 500 and the standard deviation is 10.
About 68% of the observations lie between what two values?
ANSWER:
ANSWER:
0.36
ANSWER:
0.0
137. The mean of a normal probability distribution is 400 and the standard deviation is 10.
About 95% of the observations lie between what two values?
ANSWER:
138. Use the standard normal table and the definition of z(α) notation to find z(0.18).
ANSWER:
139. Find the area under the normal curve for z between z(0.95) and z(0.05).
ANSWER:
The area to the right of z(0.95) is 0.95; the area to the right of z(0.05) is 0.05; therefore
the area between them is found by 0.95 - 0.05 and it is 0.90.
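The z(α) notation can also be expressed as a one-line function: because α is the area to the right of the z-score, z(α) is the point with cumulative (left-hand) area 1 − α. The sketch below uses the standard-library `statistics.NormalDist`; the function name `z_alpha` is an assumption made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def z_alpha(alpha: float) -> float:
    """z(alpha): the z-score with area alpha to its RIGHT under the standard normal curve."""
    return Z.inv_cdf(1 - alpha)

lower = z_alpha(0.95)  # negative, since 95% of the area lies to its right
upper = z_alpha(0.05)  # positive, since only 5% of the area lies to its right

# Area between the two scores = (area right of lower) - (area right of upper)
area_between = Z.cdf(upper) - Z.cdf(lower)

print(round(lower, 3), round(upper, 3), round(area_between, 2))
```

Note that z(0.95) = −z(0.05), which is the symmetry property tested in the true-false questions on this notation.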
140. Use the standard normal table and the definition of z(α) notation to find z(0.78).
ANSWER:
z(0.78) = -0.77
ANSWER:
X has a normal distribution with a mean of 47.5 and a standard deviation of 5.0.
ANSWER:
a = 55.75
ANSWER:
a = 39.25
144. A traffic study at one point on an interstate highway shows that vehicle speeds are
normally distributed with a mean of 61.3 mph and a standard deviation of 3.3 mph. If a
vehicle is randomly checked, find the probability that its speed is between 55.0 mph and
60.0 mph.
ANSWER:
0.3202
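A probability such as the one in question 144 comes from standardizing both endpoints and subtracting cumulative areas. The sketch below shows two variants under stated assumptions: an exact calculation, and a "table-style" calculation that first rounds each z-score to two decimal places, as one would when reading a printed table. The function names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def prob_between(lo: float, hi: float, mu: float, sigma: float) -> float:
    """Exact P(lo < x < hi) for a normal variable with mean mu and std dev sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(hi) - dist.cdf(lo)

def prob_between_table_style(lo: float, hi: float, mu: float, sigma: float) -> float:
    """Same probability, but with z-scores rounded to 2 decimals as in a printed table."""
    z_lo = round((lo - mu) / sigma, 2)
    z_hi = round((hi - mu) / sigma, 2)
    return Z.cdf(z_hi) - Z.cdf(z_lo)

exact = prob_between(55.0, 60.0, mu=61.3, sigma=3.3)
table = prob_between_table_style(55.0, 60.0, mu=61.3, sigma=3.3)

print(round(exact, 4), round(table, 4))
```

The small gap between the two values comes only from rounding the z-scores, which is why table-based answers throughout this chapter may differ slightly from software output.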
145. Two-year college students have mathematics competency scores that are normally
distributed with a mean of 35 (the maximum possible score is 48). The 90th percentile is
40. Find the standard deviation of the math competency scores.
ANSWER:
3.9
146. Scores on a computer science aptitude test are normally distributed. The standard
deviation of the distribution is 6.0, and the 95th percentile for the test is 92. Find the
mean score for this test.
ANSWER:
Mean = 82.1
147. If heights of a certain group of adult males are normally distributed with a mean of 68.2
inches and a standard deviation of 4.1 inches, find the 25th percentile, P25, for this
distribution.
ANSWER:
65.5 inches
A machine cuts circular filters from large rolls of material. The diameters of the filters are
normally distributed with a mean equal to 2.00 cm and a standard deviation equal to 0.02 cm.
148. Find the 95th percentile for the distribution of filter diameters.
ANSWER:
P95 = 2.033
149. If specifications call for the filters to have diameters between 1.95 cm and 2.03 cm,
about what percent would be expected to not meet specifications?
ANSWER:
7.3%
Reading comprehension scores for junior high students in a school district are normally
distributed with a mean of 80.0 and a standard deviation 5.0.
ANSWER:
6.68%
68.26%
The times required to assemble a product part are normally distributed with a mean of 47.5
minutes and a standard deviation equal to 8.5 minutes.
152. What percent of the assembly workers require: more than one hour?
ANSWER:
7.08%
153. What percent of the assembly workers require: less than one-half hour?
ANSWER:
1.97%
The random variable x has a normal distribution with mean of 75.0 and standard deviation of
2.5.
ANSWER:
0.0228
0.8185
ANSWER:
0.0013
157. Waiting times to see a doctor at a large clinic are normally distributed with a mean of
68.2 minutes and a standard deviation of 14.8 minutes. Find the probability that the
waiting time to see a doctor is less than 45.0 minutes.
ANSWER:
0.0582
158. Scores on a particular test are normally distributed with a mean of 126 points. Find the
standard deviation if for these scores, P90 = 160.0.
ANSWER:
26.6
159. For a particular normal distribution, Q3 = 27.8 and P40 = 24.2. Find the mean and
standard deviation of this distribution.
ANSWER:
ANSWER:
z(0.0228)
161. Use the standard normal table to find the values of z: z(0.9940).
ANSWER:
−2.51
162. Use the standard normal table to find the values of z: z(0.2054).
ANSWER:
0.82
163. Use the standard normal table to find the values of z: z(0.3315).
ANSWER:
0.44
164. A machine cuts circular filters from large rolls of material. Specifications call for the filters
to have diameters between 1.95 cm and 2.05 cm. If the diameters of the filters are
normally distributed with a mean equal to 2.00 cm, then the machine needs to be fine-
tuned to give what standard deviation so that only 1% of the filters do not meet
specifications? (Give the answer to three decimal places.)
ANSWER:
165. Use the standard normal table to find the values of z: z(0.7881).
ANSWER:
−0.80
166. The value z(0.25) associated with the standard normal distribution would correspond to
what value associated with the nonstandard normal distribution having a mean equal to
90 and a standard deviation of 10?
ANSWER:
96.7
167. Find the probability that the piece of data will have a standard score less than 2.00.
ANSWER:
168. Find the probability that the piece of data will have a standard score greater than –1.40.
ANSWER:
169. Find the probability that the piece of data will have a standard score less than –1.75.
170. Find the probability that the piece of data will have a standard score less than 1.25.
ANSWER:
171. Find the probability that the piece of data will have a standard score greater than –1.58.
ANSWER:
Assume that x is a normally distributed random variable with a mean of 70 and a standard
deviation of 10.
ANSWER:
ANSWER:
ANSWER:
P(67 < x < 93) = P(-0.30 < z < 2.30) = 0.1179 + 0.4893 = 0.6072
ANSWER:
P(75 < x < 92) = P(0.50 < z < 2.20) = 0.4861 – 0.1915 = 0.2946
ANSWER:
P(48 < x < 88) = P(-2.20 < z < 1.80) = 0.4861 + 0.4641 = 0.9502
ANSWER:
For a particular age group of adult females, the distribution of cholesterol readings, in mg/dl, is
normally distributed with a mean of 180 and a standard deviation of 12.
178. What percentage of this population would have readings exceeding 210?
ANSWER:
P(x > 210) = P(z > 2.5) = 0.5000 – 0.4938 = 0.0062, or 0.62%
ANSWER:
P(x < 156) = P(z < -2) = 0.5000 – 0.4772 = 0.0228 or 2.28%
180. The weights of ripe watermelons grown at Mr. Howard’s farm are normally distributed
with a standard deviation of 2.4 lb. Find the mean weight of Mr. Howard’s ripe
watermelons if approximately 4% weigh less than 15 lb.
ANSWER:
The z-score is z = -1.75 (since the area to the left of z is 0.04). Now using the formula
z = (x − µ) / σ, we have -1.75 = (15 − µ) / 2.4. Solving for µ, we get µ = 15 – (-1.75)(2.4)
= 19.2 lb.
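The same rearrangement, µ = x − zσ, can be checked numerically. This is a sketch under the assumption that the z-score is obtained from `statistics.NormalDist().inv_cdf` rather than from the printed table; the function name is a hypothetical helper.

```python
from statistics import NormalDist

def mean_from_percentile(x: float, left_area: float, sigma: float) -> float:
    """Solve z = (x - mu) / sigma for mu, where z has the given area to its left."""
    z = NormalDist().inv_cdf(left_area)  # about -1.75 when left_area is 0.04
    return x - z * sigma

# About 4% of watermelons weigh less than 15 lb; sigma = 2.4 lb
mu = mean_from_percentile(x=15.0, left_area=0.04, sigma=2.4)
print(round(mu, 1))
```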
The waiting time x at a fast-food restaurant during lunch time is approximately normally
distributed with a mean of 4.5 min and a standard deviation of 1.2 min.
181. Find the probability that a randomly selected customer has to wait less than 2.7 min.
ANSWER:
P(x < 2.7) = P( z < -1.50) = 0.5000 – 0.4332 = 0.0668
182. Find the probability that a randomly selected customer has to wait more than 6.8 min.
ANSWER:
P(x > 6.8) = P(z > 1.92) = 0.5000 – 0.4726 = 0.0274
ANSWER:
The 75th percentile is a value such that 75% of the data is less than this value; therefore the
z-score of this value is to the right of 0 such that the area between 0 and z is 0.25.
Hence, the corresponding z value is z = +0.67. Now, the formula z = (x – 4.5) / 1.2 implies
0.67 = (x – 4.5) / 1.2. Solving for x, we get x = (0.67)(1.2) + 4.5 = 5.304 minutes.
184. A machine is programmed to fill 16-oz bottles. However, the variability inherent in any
machine causes the actual amounts of fill to vary. The distribution is normal with a
standard deviation of 0.02 oz. What must the mean amount µ be in order that only 5%
of the bottles receive less than 16 oz?
ANSWER:
The area to the left of z is 0.05, therefore z = -1.65. Then, the formula z = ( x − µ ) / σ
reduces to –1.65 = (16 – µ ) / 0.02. Solving for µ , we get µ = 16 - (-1.65)(0.02) = 16.033
oz.
The z notation, z(α), combines two related concepts, the z-score and the area to the right of z,
into a mathematical symbol.
185. If z(A) = 0.10, identify the letter as being a z-score or being an area, and then with the
aid of a diagram explain what both the given number and the letter represent on the
standard curve.
ANSWER:
A is an area. z is 0.10 and the area to the right of z = 0.10 is 0.5000 – 0.0398 = 0.4602.
186. If z(0.10) = B, identify the letter as being a z-score or being an area, and then with the
aid of a diagram explain what both the given number and the letter represent on the
standard curve.
ANSWER:
B is a z-score. 0.10 is the area to the right of z = B. Use 0.4000 to look up the z-score on the
table of standard normal distribution, z = B = 1.28
187. If z(C) = -0.05, identify the letter as being a z-score or being an area, and then with the
aid of a diagram explain what both the given number and the letter represent on the
standard curve.
ANSWER:
C is an area. z is –0.05 and the area to the right of z = -0.05 is 0.5000 + 0.0199 =
0.5199.
188. If -z(0.05) = D, identify the letter as being a z-score or being an area, and then with the
aid of a diagram explain what both the given number and the letter represent on the
standard curve.
ANSWER:
D is a z-score, and 0.05 is the area to the left of z = D. Then, D is to the left of zero
[negative]. Use 0.4500 to look it up; z = D = -1.65.
Assume that the average annual salary for a worker in the United States is $27,500, and that
the annual salaries for Americans are normally distributed with a standard deviation equal to
$6250.
ANSWER:
P(x < 18,000) = P(z < -1.52) = 0.5000 – 0.4357 = 0.0643 or 6.43%
ANSWER:
P(x > 40,000) = P(z > 2.0) = 0.5000 – 0.4772 = 0.0228 or 2.28%
191. If z(0.08) , find the value asked for and then with the aid of a diagram explain what your
answer represents.
ANSWER:
192. If the area between z(0.98) and z(0.02) , find the value asked for and then with the aid of
a diagram explain what your answer represents.
193. If z(1.00 – 0.01) , find the value asked for and then with the aid of a diagram explain
what your answer represents.
ANSWER:
ANSWER:
The length of life of a certain type of washer is approximately normally distributed with a mean
of 6.2 years and a standard deviation of 1.4 years.
195. If this machine is guaranteed for two years, what is the probability that the machine you
purchased will require replacement under guarantee?
ANSWER:
196. What period of time should the manufacturer give as a guarantee if it is willing to replace
only 0.5% of the machines?
ANSWER:
The area to the left of z is 0.005, hence z = -2.58. Then, -2.58 = (x – 6.2) / 1.4. Solving
for x, we get x = (-2.58)(1.4) + 6.2 = 2.588 years.
The grades of an examination whose mean is 82 and whose standard deviation is 14 are
normally distributed.
197. Anyone who scores below 55 will be retested. What percentage does this represent?
ANSWER:
198. The top 10% are to receive a special commendation. What score must be surpassed to
receive this special commendation?
ANSWER:
The area to the right of z is 0.10, hence z = 1.28. Then, 1.28 = (x – 82)/14. Solving for x,
we get x = (1.28)(14) + 82 = 99.92.
199. Find the grade such that only 1% will score above it.
ANSWER:
The area to the right of z is 0.01. Hence z = 2.33. Then, 2.33 = (x – 82)/14. Solving for x
we get, x = (2.33)(14) + 82 = 114.62.
200. Find the interquartile range for the grades on this examination.
201. A vending machine can be regulated to dispense an average of µ oz of coffee per cup.
If the ounces dispensed per cup are normally distributed with a standard deviation of 0.2
oz, find the setting for µ that will allow 8-oz cup to hold (without overflowing) the amount
dispensed 99% of the time.
ANSWER:
The area to the left of z is 0.99. Hence z = 2.33. Then, 2.33 = (8 – µ ) / 0.2. Solving for
µ , we get µ = 8 - (2.33)(0.2) = 7.534.
202. The amount of time, x, spent commuting daily, one way, to college by students is
believed to have a mean of 25 min with a standard deviation of 10 min. If the length of
time spent commuting is approximately normally distributed, find the time, x, that
separates the 25% who spend the most time commuting from the rest of the commuters.
ANSWER:
The area to the right of z is 0.25. Therefore z = 0.67. Then, 0.67 = (x –25)/10. Solving for
x, we get x = (0.67)(10.0) + 25.0 = 31.7 min.
The SAT scores attained by the students in New York City are approximately normally
distributed with a mean of 500 and a standard deviation of 80.
ANSWER:
204. Find the percentage of students who score less than 700.
ANSWER:
P(x < 700) = P(z < 2.5) = 0.5000 + 0.4938 = 0.9938 or 99.38%
ANSWER:
ANSWER:
The 15th percentile, P15 , has a z-score of –1.04. Then, -1.04 = ( P15 - 500)/80. Solving for
P15 we get, P15 = (-1.04)(80) + 500 = 416.8.
ANSWER:
It is known that college students sleep an average of 6 hours per night with a standard deviation
equal to 1.8 hours. A student is selected at random.
208. Find the probability that he/she sleeps between 6 and 9 hours.
ANSWER:
209. Find the probability that he/she sleeps less than 5 hours.
ANSWER:
210. Find the probability that he/she sleeps between 8 and 10 hours.
ANSWER:
P(8 < x < 10) = P(1.11 < z < 2.22) = 0.4868 – 0.3665 = 0.1203
211. Approximately 80% of college students sleep less than “w” hours per night. What is the
value of w?
ANSWER:
P(x < w) = 0.80 ⇒ P(z < (w − 6)/1.8) = 0.80 ⇒ (w − 6)/1.8 = 0.84 ⇒ w = 6 + (0.84)(1.8) ≈ 7.5 hours.
ANSWER:
P(x < w) = 0.30 ⇒ P(z < (w − 6)/1.8) = 0.30 ⇒ (w − 6)/1.8 = -0.52 ⇒ w = 6 + (-0.52)(1.8) ≈ 5 hours.
213. Approximately 10% of college students sleep at least “w” hours per night. What is the
value of w?
ANSWER:
P(x ≥ w) = 0.10 ⇒ P(z ≥ (w − 6)/1.8) = 0.10 ⇒ (w − 6)/1.8 = 1.28 ⇒ w = 6 + (1.28)(1.8) ≈ 8.3 hours.
214. Approximately 25% of college students sleep at most “w” hours per night. What is the
value of w?
ANSWER:
P(x ≤ w) = 0.25 ⇒ P(z ≤ (w − 6)/1.8) = 0.25 ⇒ (w − 6)/1.8 = -0.67 ⇒ w = 6 + (-0.67)(1.8) ≈ 4.8 hours.
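The "find w" questions on sleep time all invert the normal distribution: a statement about area to the left of w uses that area directly, while "at least w" must first be converted to a left-hand area. A minimal sketch follows, assuming (as the solutions do) that sleep time is normally distributed with mean 6 and standard deviation 1.8; the variable names are illustrative.

```python
from statistics import NormalDist

sleep = NormalDist(mu=6.0, sigma=1.8)  # hours of sleep per night (normality assumed)

# "80% sleep less than w": area to the left of w is 0.80
w_less_than_80 = sleep.inv_cdf(0.80)

# "10% sleep at least w": area to the right is 0.10, so area to the left is 0.90
w_at_least_10 = sleep.inv_cdf(1 - 0.10)

# "25% sleep at most w": area to the left of w is 0.25
w_at_most_25 = sleep.inv_cdf(0.25)

print(round(w_less_than_80, 1), round(w_at_least_10, 1), round(w_at_most_25, 1))
```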
Assume that x is a normally distributed random variable with a mean of 30 and a standard
deviation of 6.
ANSWER:
ANSWER:
ANSWER:
P(26 < x < 42) = P(-0.67 < z < 2.0) = 0.2486 + 0.4772 = 0.7258
ANSWER:
P(32< x < 47) = P(0.33 < z < 2.83 ) = 0.4977 – 0.1293 = 0.3684
ANSWER:
P(21 < x < 37) = P(-1.50 < z < 1.17) = 0.4332 + 0.3790 = 0.8122
ANSWER:
Suppose that daycare costs are normally distributed with a mean equal to $10,000 a year and a
standard deviation equal to $2,000.
221. What percentage of daycare centers will cost between $8,000 and $12,000?
ANSWER:
P(8,000 < x < 12,000) = P(-1.0 < z < 1.0) = 0.3413 + 0.3413 = 0.6826
222. What percentage of daycare centers will cost between $6,000 and $14,000?
ANSWER:
P(6,000 < x < 14,000) = P(-2.0 < z < 2.0) = 0.4772 + 0.4772 = 0.9544
223. What percentage of daycare centers will cost between $4,000 and $16,000?
ANSWER:
P(4,000 < x < 16,000) = P(-3.0 < z < 3.0) = 0.4987 + 0.4987 = 0.9974
224. Compare the results of questions 221, 222, and 223 with the Empirical Rule. Explain the
relationship.
ANSWER:
The Empirical Rule states that “If a variable is normally distributed, then: within one
standard deviation of the mean there will be approximately 68% of the data; within two
standard deviations of the mean there will be approximately 95% of the data; and within
three standard deviations of the mean there will be approximately 99.7% of the data.”
The answers to questions 221, 222, and 223 are 0.6826, 0.9544, and 0.9974,
respectively. Since 0.6826 ≈ 68%, 0.9544 ≈ 95%, and 0.9974 ≈ 99.7%, our results are
the same as stated in the Empirical Rule.
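The comparison with the Empirical Rule can also be checked numerically. The sketch below computes the area within one, two, and three standard deviations for the daycare-cost distribution using `statistics.NormalDist`; the function name is an assumption made for illustration, and the software values may differ from the table-based answers in the fourth decimal place.

```python
from statistics import NormalDist

costs = NormalDist(mu=10_000, sigma=2_000)  # annual daycare cost in dollars

def area_within(k: float, dist: NormalDist) -> float:
    """P(mu - k*sigma < x < mu + k*sigma) for the given normal distribution."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return dist.cdf(hi) - dist.cdf(lo)

within_1, within_2, within_3 = (area_within(k, costs) for k in (1, 2, 3))
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

Because the result depends only on k and not on the particular mean or standard deviation, the same three areas apply to every normal distribution.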
ANSWER:
P(7,400 < x < 11,000) = P(-1.30 < z < 0.50) = 0.4032 + 0.1915 = 0.5947
226. What percentage of daycare centers will cost between $5,600 and $12,800?
ANSWER:
P(5,600 < x < 12,800) = P(-2.20 < z < 1.40) = 0.4861 + 0.4192 = 0.9053
227. What percentage of daycare centers will cost between $3,800 and $14,600?
ANSWER:
P(3,800 < x < 14,600) = P(-3.10 < z < 2.30) = 0.4990 + 0.4893 = 0.9883
228. Approximately 80% of daycare costs are less than “w” dollars per year. What is the value
of w?
ANSWER:
229. Approximately 30% of daycare costs are less than “w” dollars per year. What is the value
of w?
230. Approximately 10% of daycare costs are at least “w” dollars per year. What is the value
of w?
ANSWER:
231. Approximately 25% of daycare costs are at most “w” dollars per year. What is the value
of w?
ANSWER:
Final score averages are typically approximately normally distributed with a mean of 75 and a
standard deviation of 13. Your professor says that the top 8% of the class will receive an A; the
next 20%, a B; the next 42%, a C; the next 18%, a D; and the bottom 12%, an F.
ANSWER:
ANSWER:
P(x ≥ B) = 0.28 ⇒ P(z ≥ (B − 75)/13) = 0.28 ⇒ (B − 75)/13 = 0.58
⇒ B = 75 + (0.58)(13) = 82.54
234. What average must you exceed to receive a grade better than a C?
ANSWER:
P(x ≥ C) = 0.70 ⇒ P(z ≥ (C − 75)/13) = 0.70 ⇒ (C − 75)/13 = -0.52
⇒ C = 75 + (-0.52)(13) = 68.24
235. What average must you obtain to pass the course? (You’ll need a “D” grade or better.)
ANSWER:
P(x ≥ D) = 0.88 ⇒ P(z ≥ (D − 75)/13) = 0.88 ⇒ (D − 75)/13 = -1.175
⇒ D = 75 + (-1.175)(13) ≈ 59.7
236. Find the 90th percentile for the variable “final averages”.
ANSWER:
(90th percentile − 75)/13 = 1.28
⇒ 90th percentile = 75 + (1.28)(13) = 91.64 ≈ 91.6.
237. Find the first quartile for the variable “final averages”.
ANSWER:
P(x < Q1) = 0.25 ⇒ P(z < (Q1 − 75)/13) = 0.25 ⇒ (Q1 − 75)/13 = -0.67
⇒ Q1 = 75 + (-0.67)(13) = 66.29
238. The weights of ripe watermelons grown at Mr. Cooper’s farm are normally distributed
with a standard deviation of 2.5 lbs. Find the mean weight of Mr. Cooper’s ripe
watermelons if only 5% weigh less than 12 lbs.
ANSWER:
P(x < 12) = 0.05 ⇒ P(z < (12 − µ)/2.5) = 0.05 ⇒ (12 − µ)/2.5 = -1.645
⇒ µ = 12 + (1.645)(2.5) ≈ 16.1 lbs
239. A machine fills containers with a mean weight per container of 16.0 oz. If no more than
5% of the containers are to weigh less than 15.75 oz, what must the standard deviation
of the weights equal? Assume the weights are normally distributed.
ANSWER:
P(x < 15.75) = 0.05 ⇒ P(z < (15.75 − 16)/σ) = 0.05 ⇒ −0.25/σ = -1.645 ⇒ σ = 0.152
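When the mean and one percentile are known, the standard deviation follows from rearranging z = (x − µ)/σ into σ = (x − µ)/z. A minimal sketch of that step is shown below; the function name is a hypothetical helper, and the z-score comes from `statistics.NormalDist` rather than the printed table.

```python
from statistics import NormalDist

def sigma_from_percentile(x: float, left_area: float, mu: float) -> float:
    """Solve z = (x - mu) / sigma for sigma, where z has the given area to its left."""
    z = NormalDist().inv_cdf(left_area)  # about -1.645 when left_area is 0.05
    return (x - mu) / z

# No more than 5% of containers may weigh less than 15.75 oz; mean fill is 16.0 oz
sigma = sigma_from_percentile(x=15.75, left_area=0.05, mu=16.0)
print(round(sigma, 3))
```

The helper requires `left_area` different from 0.5, since z would then be 0 and the percentile would carry no information about the spread.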
ANSWER:
The area to the right of z(0.90) is 0.90; the area to the right of z(0.05) is 0.05; therefore
the area between z(0.90) and z(0.05) is found by 0.90 - 0.05, which is 0.85.
ANSWER:
The z notation, z(α), combines two related concepts, the z-score and the area to the right, into a
mathematical symbol.
242. If z(A) = 0.15, identify the letter A as being a z-score or being an area.
ANSWER:
A is an area. z is 0.15 and the area to the right of z = 0.15 is 0.5000 - 0.0596 = 0.4404.
ANSWER:
B is a z-score. 0.15 is the area to the right of z = B. Use 0.35 to look up the z-score on the
standard normal table. B = z = 1.04.
244. If z(C) = -0.04, identify the letter C as being a z-score or being an area.
ANSWER:
C is an area. z is -0.04 and the area to the right of z = -0.04 is 0.5000 + 0.0160 = 0.5160.
ANSWER:
D is a z-score. D is to the left of zero (negative), use 0.46 to look it up; D = z = -1.75.
ANSWER:
z(0.09) = 1.34
ANSWER:
ANSWER:
ANSWER:
The long-term record for weather shows that for Northeast States, the annual precipitation has a
mean of 39.50 inches and a standard deviation of 4.30 inches. Assume the annual precipitation
amount has a normal distribution.
250. What is the probability that next year the precipitation amount is more than 45.0 inches?
ANSWER:
251. What is the probability that next year the precipitation amount is between 44.0 and 48.0
inches?
ANSWER:
P(44 < x < 48) = P(1.05 < z < 1.98) = 0.4761 – 0.3531 = 0.123
252. What is the probability that next year the precipitation amount is between 30.0 and 38.0
inches?
ANSWER:
P(30 < x < 38) = P(-2.21 < z < -0.35) = 0.4864 – 0.1368 = 0.3496
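The interval probabilities in questions 251 and 252 can be cross-checked without a table. This is a sketch that assumes the stated model (mean 39.50 inches, standard deviation 4.30 inches); `prob_between` is a hypothetical helper name. Because z is not rounded to two decimals here, the results agree with the table-based answers only to about three decimal places.

```python
from statistics import NormalDist

# Assumed model from the problem statement: mean 39.50 in., standard deviation 4.30 in.
precip = NormalDist(mu=39.50, sigma=4.30)

def prob_between(low: float, high: float) -> float:
    """P(low < x < high) for the precipitation model."""
    return precip.cdf(high) - precip.cdf(low)

p_44_48 = prob_between(44.0, 48.0)
p_30_38 = prob_between(30.0, 38.0)
print(round(p_44_48, 3), round(p_30_38, 3))
```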
ANSWER:
254. What is the probability that next year the precipitation amount is less than 50.0 inches?
ANSWER:
255. What is the probability that next year the precipitation amount is less than 33.0 inches?
ANSWER:
The length of life of a certain type of DVD is approximately normally distributed with a mean of
5.0 years and a standard deviation of 1.5 years.
256. If this type of DVD is guaranteed for 2 years, what is the probability that the DVD you
purchased will require replacement under the guarantee?
ANSWER:
257. What period of time should the manufacturer of this type of DVD give as a guarantee if it
is willing to replace only 0.5% of the DVDs?
ANSWER:
P(x < t) = 0.005 ⇒ P[z < (t − 5)/1.5] = 0.005 ⇒ (t − 5)/1.5 = -2.575
The grades on an examination whose mean is 475 and whose standard deviation is 75 are
normally distributed.
258. Anyone who scores below 315 will be retested. What percentage does this represent?
ANSWER:
259. The top 10% are to receive a special commendation. What score must be surpassed to
receive this special commendation?
ANSWER:
P(x > a) = 0.10 ⇒ P[z > (a − 475)/75] = 0.10 ⇒ (a − 475)/75 = 1.28
⇒ a = 475 + (1.28)(75) = 571
260. Find the first quartile for the grades on this examination.
ANSWER:
P(x < Q1) = 0.25 ⇒ P[z < (Q1 − 475)/75] = 0.25 ⇒ (Q1 − 475)/75 = -0.67
261. Find the third quartile for the grades on this examination.
ANSWER:
262. Recall that the interquartile range of a distribution is the difference between the first and
third quartiles. Find the interquartile range for the grades on this examination.
ANSWER:
263. Find the grade such that only 1% will score above it. What does this grade represent?
ANSWER:
P(x > a) = 0.01 ⇒ P[z > (a − 475)/75] = 0.01 ⇒ (a − 475)/75 = 2.33
This grade represents the 99th percentile for the grades on this examination.
True-False Questions
264. For a binomial distribution with a fixed value of p, the binomial distribution begins to look
like a normal distribution as n increases in size.
ANSWER: T
265. A binomial distribution has n = 100 and p = 0.01. The normal distribution provides a
reasonable approximation to the probability of getting two or fewer successes in the 100
trials.
ANSWER: F
ANSWER: F
ANSWER: F
ANSWER: T
269. The addition and subtraction of 0.5 to the z-value from a discrete variable is commonly
called the continuity correction factor. It is a common method of converting a continuous
variable into a discrete variable.
ANSWER: F
ANSWER: T
ANSWER: T
Multiple-Choice Questions
272. Consider the binomial random variable x with n = 50 and p = 0.5. Suppose we want to
use a normal approximation to find the probability of at least 30 successes. A reasonable
approximation would be obtained by computing:
A) n = 50, p = 0.01
B) n = 500, p = 0.001
C) n = 100, p = 0.05
D) n = 50, p = 0.02
ANSWER: C
274. In a southern state, 5% of all individuals who drive automobiles are not properly
licensed. Use the normal approximation of the binomial distribution to find the probability
that among 200 randomly selected individuals, between seven and nine, inclusive, are
not properly licensed.
ANSWER:
0.3093
275. Find the 90th percentile for a binomial distribution having 400 identical trials, and
probability of success of 0.1.
ANSWER:
48
276. If 15% of the population is left-handed, find the probability that in a class of 35 students
that 3 or fewer are left-handed.
ANSWER:
0.2033
Consider a binomial distribution with 15 identical trials, and probability of success of 0.5.
ANSWER:
0.003
ANSWER:
0.004
279. A machine cuts circular filters from large rolls of material. If 7.3% of the filters fail to meet
specifications, use the normal approximation to the binomial to compute the probability
that a sample of 100 of the filters will contain 5 or fewer that fail to meet specifications.
ANSWER:
0.2451
ANSWER:
0.023
ANSWER:
0.022
Consider a binomial distribution with 15 identical trials, and probability of success of 0.5.
ANSWER:
0.733
ANSWER:
0.734
284. Use the normal approximation of the binomial distribution to find the probability of
obtaining at least 60 heads when a coin is flipped 100 times.
ANSWER:
0.0287
285. If 68% of all individuals who take a qualifying examination fail it on the first attempt, use
the normal approximation of a binomial distribution to find the probability that in a group
of 171 individuals taking the examination for the first time at least 60 will pass.
ANSWER:
0.2177
286. A drug manufacturer states that only 5% of the patients using a high blood pressure drug
will experience side effects. Doctors at a large university hospital use the drug in
treating 200 patients. What is the probability that 15 or fewer of the 200 patients
experience side effects?
ANSWER:
Let x represent the number of patients in the 200 who will experience a side effect.
287. Suppose we have a binomial distribution with n = 180 and p = 0.45. Furthermore,
suppose we want to use a normal approximation to find the probability of at least 120
successes. Explain why we need only compute P(x > 119.5) instead of P(119.5 < x <
180.5).
ANSWER:
The value 180.5 converts to a z-score of 14.91. For all practical purposes, the area from
z = 0 to 14.91 is 0.5. Therefore, P(x > 119.5) ≈ P(119.5 < x < 180.5).
288. Find the normal approximation for the binomial probability P(x = 6), where n = 15 and p = 0.4.
Compare this to the value of P(x = 6) obtained from the binomial table.
ANSWER:
µ = np = (15)(0.4) = 6.0, and σ = √(npq) = √[(15)(0.4)(0.6)] = 1.897
289. If 25% of all students entering a certain university drop out during or at the end of their
first year, what is the probability that more than 550 of this year’s entering class of 2000
will drop out during or at the end of their first year?
ANSWER:
P(x > 550) = P(x ≥ 551) = P(x > 550.5) = P(z > 2.61) = 0.5000 – 0.4955 = 0.0045.
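The continuity-corrected calculation in question 289 can be sketched as a small function. The function name and signature are assumptions made for illustration; only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_more_than(n: int, p: float, k: int) -> float:
    """Approximate P(x > k) for a binomial x using a continuity correction.

    P(x > k) = P(x >= k + 1), which the normal curve treats as the area
    to the right of k + 0.5.
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return 1.0 - NormalDist(mu, sigma).cdf(k + 0.5)

dropouts = normal_approx_more_than(n=2000, p=0.25, k=550)
print(round(dropouts, 4))
```

The result should agree with the table-based 0.0045 to within rounding of the z-score.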
290. Find by the appropriate method, the probability that the machine records 2 wrong grades
in a set of 10 exams.
ANSWER:
291. Find by the appropriate method, the probability that the machine records no more than 2
wrong grades in a set of 10 exams.
ANSWER:
292. Find by the appropriate method, the probability that the machine records no more than 2
wrong grades in a set of 15 exams.
ANSWER:
293. Find by the appropriate method, the probability that the machine records no more than 2
wrong grades in a set of 150 exams.
ANSWER:
µ = np = (150)(0.05) = 7.5
Then,
It is believed that 60% of married couples with children agree on methods of disciplining their
children. Assume this to be the case, and suppose a researcher conducts a random survey of 200
married couples.
294. What is the probability that exactly 115 couples agree?
ANSWER:
Then,
P(x = 115) = P(114.5 < x < 115.5) = P[(114.5 – 120) / 6.9282 < z < (115.5 –
120)/6.9282]
295. What is the probability that fewer than 115 couples agree?
ANSWER:
P(x < 115) = P(x ≤ 114) = P(x < 114.5) = P[z < (114.5 – 120)/6.9282] = P(z < -0.79)
296. What is the probability that more than 110 couples agree?
ANSWER:
P(x > 110) = P(x ≥ 111) = P(x > 110.5) = P[z > (110.5 – 120)/6.9282] = P(z > -1.37)
297. For a binomial distribution with n = 10, and p = 0.2, does the normal distribution provide a
reasonable approximation? Use the “rule of thumb” for normal approximation to the
binomial distribution.
ANSWER:
Since np = 2 < 5 and nq = 8 > 5, the normal approximation to the binomial distribution is
not appropriate in this case.
ANSWER:
299. For a binomial distribution with n = 600, and p = 0.1, does the normal distribution provide
a reasonable approximation? Use the “rule of thumb” for normal approximation to the
binomial distribution.
ANSWER:
Since np = 60 > 5 and nq = 540 > 5, the normal approximation to the binomial
distribution is appropriate in this case.
300. For a binomial distribution with n = 50, and p = 0.4, does the normal distribution provide
a reasonable approximation? Use the “rule of thumb” for normal approximation to the
binomial distribution.
ANSWER:
Since np = 20 > 5 and nq = 30 > 5, the normal approximation to the binomial distribution
is appropriate in this case.
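The rule-of-thumb check applied in these answers is mechanical enough to sketch in code. The threshold of 5 for both np and nq is taken from the answers; the function name is hypothetical.

```python
def normal_approx_reasonable(n: int, p: float, threshold: float = 5.0) -> bool:
    """Rule of thumb: both np and nq should be at least the threshold."""
    q = 1.0 - p
    return n * p >= threshold and n * q >= threshold

for n, p in [(10, 0.2), (600, 0.1), (50, 0.4)]:
    print(n, p, normal_approx_reasonable(n, p))
```

Both products must pass, so a single small value of np or nq is enough to rule the approximation out.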
301. For a binomial distribution with n = 12, and p = 0.4, does the normal distribution provide a
reasonable approximation? Use the “rule of thumb” for normal approximation to the
binomial distribution.
ANSWER:
In order to see what happens when the normal approximation is improperly used, consider the
binomial distribution with n = 10 and p = 0.7. Since np = 7 and nq = 3, the rule of thumb (np ≥ 5
and nq ≥ 5) is not satisfied.
302. Find the probability of eight or more successes using the binomial tables.
ANSWER:
303. Find the probability of eight or more successes using the normal approximation.
ANSWER:
= P[z ≥ (7.5 − 7.0)/1.449] = P(z ≥ 0.35) = 0.50 - 0.1368 = 0.3632
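To see how far the approximation drifts when the rule of thumb fails, the exact binomial tail for n = 10 and p = 0.7 can be computed next to the continuity-corrected normal value. This is a sketch using only the standard library; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 10, 0.7
q = 1 - p

# Exact binomial probability of eight or more successes.
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(8, n + 1))

# Normal approximation with continuity correction: area to the right of 7.5.
approx = 1 - NormalDist(n * p, sqrt(n * p * q)).cdf(7.5)

print(round(exact, 4), round(approx, 4))
```

The approximate value differs slightly from 0.3632 because the code does not round z to 0.35 before looking up the area.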
ANSWER:
305. Find the probability of seven successes using the binomial tables.
ANSWER:
P(x = 7) = 0.157
306. Find the probability of seven successes using the normal approximation.
ANSWER:
307. Find the probability of five successes using the binomial tables.
ANSWER:
P(x = 5) = 0.227
308. Find the probability of five successes using the normal approximation.
ANSWER:
309. Find the probability of one or fewer successes using the normal approximation.
ANSWER:
= P[z < (1.5 − 0.60)/0.755] = P(z < 1.19) = 0.50 + 0.383 = 0.883
310. Find the probability of one or fewer successes using the binomial tables.
ANSWER:
311. If 25% of all students entering a certain university drop out during or at the end of their
first year, what is the probability that more than 420 of this year’s entering class of 1500
will drop out during or at the end of their first year?
ANSWER:
Since np = (1500)(0.25) = 375.0 > 5 and nq = (1500)(0.75) = 1125 > 5, the normal
approximation to the binomial is appropriate. Now,
With σ = √(npq) = √[(1500)(0.25)(0.75)] = 16.771,
P(x > 420) = P[z ≥ (420.5 − 375.0)/16.771] = P(z ≥ 2.71) = 0.50 - 0.4966 = 0.0034
312. Explain why the normal approximation to the binomial distribution is reasonable.
ANSWER:
313. Find the mean and standard deviation of the normal distribution that is used in the
approximation.
ANSWER:
ANSWER:
ANSWER:
P(x = 5) = (25 choose 5)(0.4)⁵(0.6)²⁰ = 0.0199
ANSWER:
In this situation, the normal approximation to the binomial is excellent. The difference
between the two answers is 0.0207 – 0.0199 = 0.0008.
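The comparison of 0.0207 with 0.0199 can be reproduced directly. The sketch below assumes n = 25 and p = 0.4, as in the binomial formula above, and represents the single value x = 5 by the interval from 4.5 to 5.5 on the normal curve.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 25, 0.4, 5
q = 1 - p

# Exact binomial probability of exactly five successes.
exact = comb(n, k) * p**k * q**(n - k)

# The normal curve represents x = 5 by the interval from 4.5 to 5.5.
curve = NormalDist(n * p, sqrt(n * p * q))
approx = curve.cdf(k + 0.5) - curve.cdf(k - 0.5)

print(round(exact, 4), round(approx, 4), round(approx - exact, 4))
```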
It is believed that 60% of married couples with children agree on methods of disciplining their
children. Assume that a survey of 250 married couples is conducted.
317. Explain why the normal approximation to the binomial distribution is reasonable.
ANSWER:
Since np = (250)(0.60) = 150 > 5 and nq = (250)(0.40) = 100 > 5, the normal
approximation to the binomial distribution is reasonable.
318. Find the mean and standard deviation of the normal distribution that is used in the
approximation.
ANSWER:
ANSWER:
320. What is the probability we would find fewer than 135 couples who agree?
ANSWER:
321. What is the probability we would find more than 135 couples who agree?
ANSWER:
A recent study showed that 75% of commercial airline flights in and out of US airports were
on-time arrivals and 19% were late departures. Three hundred flights are to be randomly
identified from all flights and their flight logs examined closely.
ANSWER:
323. What is the probability that more than 80% of the sample will be on-time arrivals?
ANSWER:
324. What are the mean and standard deviation of the number of commercial airline flights in
and out of US airports that were late departures?
ANSWER:
Mean = µ = np = (300)(0.19) = 57
325. What is the probability that less than 15% of the sample will have departed late?
ANSWER:
A soda-filling machine is known to under fill 5% of the cans it fills.
326. Find by the appropriate method, the probability that the machine under fills 1 can in a set
of 15 cans.
ANSWER:
P(1 under filled can in 15 cans) = P[x = 1 | B(n = 15, p = 0.05)] = 0.366
327. Find by the appropriate method, the probability that the machine under fills no more than
3 cans in a set of 15 cans.
ANSWER:
P(no more than 3 under filled cans in 15 cans) = P[x = 0, 1, 2, 3 | B(n = 15, p = 0.05)]
= 0.995
328. Find by the appropriate method, the probability that the machine under fills no more than
3 cans in a set of 10 cans.
ANSWER:
P(no more than 3 under filled cans in 10 cans) = P[x = 0, 1, 2, 3 | B(n = 10, p = 0.05)]
= 0.999
329. Find by the appropriate method, the probability that the machine under fills no more than
3 cans in a set of 200 cans.
ANSWER:
µ = np = (200)(0.05) = 10.0
Then,
330. Find by the appropriate method, the probability that the machine under fills no less than
2 cans in a set of 12 cans.
ANSWER:
P(no less than 2 under filled cans in 12 cans) = P[x = 2, 3, …, 12 | B(n = 12, p = 0.05)]
= 0.119
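For the small sets of cans, the "appropriate method" is the binomial formula itself. The sketch below defines two hypothetical helpers, `binom_pmf` and `binom_cdf`, and evaluates the four binomial answers in this group with p = 0.05.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(x = k) for a binomial distribution B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(x <= k) for a binomial distribution B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

p_under = 0.05  # probability that a single can is under filled, from the problem statement

one_in_15 = binom_pmf(1, 15, p_under)
at_most_3_in_15 = binom_cdf(3, 15, p_under)
at_most_3_in_10 = binom_cdf(3, 10, p_under)
at_least_2_in_12 = 1 - binom_cdf(1, 12, p_under)  # complement of "0 or 1"

print(round(one_in_15, 3), round(at_most_3_in_15, 3),
      round(at_most_3_in_10, 3), round(at_least_2_in_12, 3))
```

Small differences in the third decimal place relative to the printed answers can occur because binomial table entries are rounded individually before they are added.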
Chapter 5
Probability Distributions
(Discrete Variables)
Section 5.1
True-False Questions
1. A random variable may assume many values for each outcome of a probability
experiment.
ANSWER: F
ANSWER: T
3. The number of hours you studied for your final exams last semester is an example of a
continuous random variable.
ANSWER: F
4. The number of speeding tickets you received last year is an example of a discrete
random variable.
ANSWER: T
5. The number of hours you waited in line to register this semester is an example of a
discrete random variable.
ANSWER: F
6. The number of automobile accidents you were involved in as a driver last year is an
example of a discrete random variable.
ANSWER: T
7. The various values of a random variable form a list of mutually exclusive events.
ANSWER: T
8. A random variable is a variable that assumes a unique numerical value for each of the
outcomes in the sample space of a probability experiment.
ANSWER: T
ANSWER: F
ANSWER: T
11. Discrete random variable is a qualitative random variable that can assume an
uncountable number of values.
ANSWER: F
Multiple-Choice Questions
12. Which of the following probability experiments would result in a discrete random
variable?
A) Discrete random variable is a quantitative random variable that can assume each
countable number of values.
B) Continuous random variable is a quantitative random variable that can assume an
uncountable number of values.
Short-Answer Questions
15. Classify the following as discrete or continuous random variables: The weight of bags of
apples, with 10 apples in each bag.
ANSWER:
Continuous
16. Classify the following as discrete or continuous random variables: The number of times
required for a modem to dial an internet provider before connecting.
ANSWER:
Discrete
17. Classify the following as discrete or continuous random variables: Out of 10 times
connecting to an internet provider, the average number of attempts necessary before
connecting.
ANSWER:
Continuous
18. Classify the following as discrete or continuous random variables: A pair of dice is rolled,
and the sum to appear on the dice is recorded.
ANSWER:
19. A bag contains nickels, dimes, and quarters (more than two of each). Two coins are
randomly selected and their total value is noted. Describe what the random variable x
represents.
ANSWER:
The random variable x represents the total value of the two coins.
20. A bag contains nickels, dimes, and quarters (more than two of each). Two coins are
randomly selected and their total value in cents is noted. Find the possible values of the
random variable x.
ANSWER:
The possible values of x are 10, 15, 20, 30, 35, and 50.
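The list of possible totals can be confirmed by enumerating every pair of coin types. This is a brief sketch; the dictionary of coin values is an illustrative construct, not part of the question.

```python
from itertools import combinations_with_replacement

coin_values = {"nickel": 5, "dime": 10, "quarter": 25}  # values in cents

# Two coins are drawn; order does not matter and both may be the same kind.
totals = sorted({a + b for a, b in combinations_with_replacement(coin_values.values(), 2)})
print(totals)
```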
21. In order to monitor the quality of a production process, samples of size five are selected
daily. The random variable of interest is the number of defectives in the five items
selected. What values are possible for this random variable?
ANSWER:
0, 1, 2, 3, 4, or 5
22. A bridge hand of 13 cards is dealt from a standard deck. Let x represent the number of
clubs in the hand. What values are possible for x?
ANSWER:
The possible values for x are whole numbers from 0 through 13, inclusive.
ANSWER:
The random variable is: number of cars per family. It is discrete with possible values:
0,1,2,3 … n.
24. Is the distance you travel from home to school a discrete or continuous random variable?
Explain.
ANSWER:
The distance you travel from home to school is a continuous random variable, since it
can assume an uncountable number of numerical values. In other words, distance is a
measurement and can assume any value along a line interval including all possible
fractions.
25. Is the number of textbooks you bought this semester a discrete or continuous random
variable? Explain.
ANSWER:
The number of textbooks you bought this semester is a discrete random variable, since it
can only assume a countable number of numerical values. A value of 2.75, for example,
would not make sense.
27. Suppose a random variable W is defined to equal the absolute value of the difference
between x and y. How many distinct values are possible for W?
ANSWER:
“Are you getting a summer job?” A recent study reported that 68% of college students
answered, “I have one”; 22% said “Maybe” and 10% said “No”.
28. What is the variable involved, and what are the possible values?
ANSWER:
The variable is: summer job status; with 3 possible values: have one, maybe, no.
ANSWER:
Survey your friends about the number of siblings they have and the length of the last phone call
they had with their boyfriend / girlfriend.
30. Identify the two random variables of interest and list their possible values.
ANSWER:
The first variable is the number of siblings that a friend has, with possible values 0, 1, 2, …, n.
The second variable is the length of the last phone call to a boyfriend / girlfriend, with possible
values from 0 upward, including whole numbers (e.g., 36, 52, 81, …) and fractional values
(e.g., 41.67, 59.04, 75.92, …).
31. The two variables in question 30 are either discrete or continuous. Which are they and
why?
ANSWER:
Number of siblings that a friend has is a discrete random variable, since it can only
assume a countable number of numerical values. A value of 1.68, for example, would
not make sense.
Length of last phone call to boyfriend / girlfriend is a continuous random variable, since it
can assume an uncountable number of numerical values. In other words, length is a
measurement and can assume any value along a line interval including all possible
fractions.
True-False Questions
32. The histogram of a probability distribution uses the physical area of each bar to
represent its assigned probability.
ANSWER: T
33. The mean, µ, of a discrete random variable x is found by multiplying each possible
value of x by its own probability and then adding all the products together; that is,
µ = ∑[xP(x)].
34. For every discrete random variable x, the variance is given by the formula: σ² = npq.
ANSWER: F
35. The formula µ = np may be used to find the mean of any discrete random variable x.
ANSWER: F
36. The sum of all probabilities in any discrete probability distribution is not always exactly
one, since some of the probabilities may be slightly larger than one.
ANSWER: F
37. The sum of all the probabilities in any probability distribution is always exactly one.
ANSWER: T
ANSWER: T
39. Sample statistics are represented by letters from the Greek alphabet.
ANSWER: F
40. The probability of event A or B is equal to the sum of the probability of event A and the
probability of event B when A and B are mutually exclusive events.
ANSWER: T
ANSWER: T
43. A probability function is a rule that assigns probabilities to the values of the random
variable of interest.
ANSWER: T
44. The formula µ = np may be used to compute the mean of many discrete populations.
ANSWER: F
ANSWER: F
46. A probability function provides a probability of zero for all values of the random variable x
other than the values specified as part of the domain.
ANSWER: T
ANSWER: F
48. The mean of the probability distribution of a discrete random variable, or the mean of a
discrete random variable, is found in a manner somewhat similar to that used to find the
mean of a frequency distribution.
ANSWER: T
ANSWER: F
50. The mean of a discrete random variable is often referred to as its expected value.
ANSWER: T
51. The sum of all the probabilities in any probability distribution is always exactly 1.25.
ANSWER: F
Multiple-Choice Questions
52. Given that the numbers 1 through 6 are equally likely to occur, what is P(x ≤ 2)?
A) Cannot be determined since we do not know the probability for each number.
B) 1/2
C) 1/3
D) 1/6
ANSWER: C
53. Consider the probability function P(x) = (6 − |x − 7|)/36 for x = 2, 3, 4, 5, ..., 12. Find the
probability that x takes values between 6 and 8 (not inclusive).
A) 5/36
B) 6/36
C) 10/36
D) 16/36
ANSWER: B
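Reading the function in question 53 as P(x) = (6 − |x − 7|)/36, which is the distribution of the sum of two fair dice, the answer can be verified with exact fractions. The sketch below is illustrative; the function name `p` is chosen here for brevity.

```python
from fractions import Fraction

def p(x: int) -> Fraction:
    """P(x) = (6 - |x - 7|) / 36 for x = 2, 3, ..., 12."""
    return Fraction(6 - abs(x - 7), 36)

support = range(2, 13)
total = sum(p(x) for x in support)

# "Between 6 and 8 (not inclusive)" leaves only x = 7.
between_6_and_8 = sum(p(x) for x in support if 6 < x < 8)

print(total, between_6_and_8)
```

`Fraction` reduces 6/36 to 1/6 when printed; the two are equal.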
54. Consider the data in the table. Which answer is not true?
x P(x)
1 0.60
2 0.20
3 0.15
4 0.05
55. A ball is drawn from a box containing three balls, one red, one blue, and one green. The
ball is returned and a second ball is drawn. A tree diagram is drawn to give the
outcomes of the experiment with respect to the colors of the two balls. If x represents the
number of red balls in the two selected, how many branches are assigned the value of x
= 1?
A) 1
B) 2
C) 3
D) 4
ANSWER: D
56. A tree diagram is constructed for the experiment of tossing a coin three times. If x
represents the number of tails in the three tosses, how many branches are assigned the
value x = 3?
A) 0
B) 1
C) 2
D) 3
ANSWER: B
59. Determine the value of the constant c in the following probability function: P(x) = c for x =
1, 2, 3, 4, 5.
ANSWER:
c = 0.20
60. The values of a random variable x have a uniform probability distribution. If the random
variable x has the values of 0, 1, 2, 3, and 4, what is the probability that the value of x is
less than 2?
ANSWER:
0.40
61. Is F(x) = (4 − x)/5 for x = 1, 2, 3, 4, and 5 a probability function? Give a short explanation by
writing a sentence or two.
ANSWER:
Since F(5) = −1/5, this is not a probability function. Probabilities can never be negative.
62. Explain why the following statement is false: “The mean of a probability distribution of a
discrete random variable always has a value equal to one of the values of the random
variable”.
ANSWER:
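Although the values of the random variable are discrete, the mean is a weighted average of
those values and need not coincide with any of them. For example, the number of heads in one
toss of a fair coin takes only the values 0 and 1, but its mean is 0.5.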
ANSWER:
64. Determine the value of the constant c in the following probability function: P(x) = c for x =
0, 1, 2, 3.
ANSWER:
c = 0.25
65. A probability distribution has a mean equal to 10 and a standard deviation equal to 2.
Find ∑[x²P(x)].
ANSWER:
104
66. A probability distribution has a mean equal to 8 and a standard deviation equal to 5. Find
∑[x²P(x)].
ANSWER:
89
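Questions 65 and 66 rely on rearranging the variance formula σ² = ∑[x²P(x)] − µ². A one-line function makes the rearrangement explicit; the name `sum_x2_p` is hypothetical.

```python
def sum_x2_p(mean: float, std_dev: float) -> float:
    """Rearrange sigma^2 = sum[x^2 P(x)] - mu^2 to get sum[x^2 P(x)]."""
    return std_dev**2 + mean**2

print(sum_x2_p(10, 2), sum_x2_p(8, 5))
```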
67. Hope and Mike were discussing one entry in a probability distribution: P(x) = 0.5 when x
= -3. Hope felt that this entry was okay since the P(x) was a value between 0.0 and 1.0.
Mike argued that this entry was impossible for a probability distribution since x = –3, and
negative values are not possible in probability distributions. Who is correct, Hope or
Mike? Justify your choice.
ANSWER:
Hope is correct. Negative values of x are possible in a probability distribution; the only
restriction is on P(x), which must be a value between 0.0 and 1.0, since probabilities can
never be negative.
68. Express the tossing of two coins as a probability distribution of x, the number of heads
occurring.
ANSWER:
x P(x)
0 0.25
1 0.50
2 0.25
69. Explain how the various values of x in a probability distribution form a set of mutually
exclusive events.
ANSWER:
Each unique outcome is assigned a specific numerical value. In other words, the values of x in
a probability distribution can never overlap.
70. Explain how the various values of x in a probability distribution form a set of “all
inclusive” events.
ANSWER:
71. Let x represent the number of times a one appears when a pair of dice is rolled once.
Give the probability distribution for x.
72. A card is selected from a standard deck of 52. Random variable x is defined to be 0, if
an ace occurs; 1, if a two through ten occurs; and 2, if a face card (Jack, Queen, or King)
occurs. Give the probability distribution for x.
ANSWER:
73. A small bag of M&M candies has the following assortment: red (10), blue (2), orange (5),
brown (21), green (0), and yellow (18). Give the probability distribution for x.
ANSWER:
With 56 candies in the bag: P(red) = 0.179; P(blue) = 0.036; P(orange) = 0.089;
P(brown) = 0.375; P(green) = 0.0; P(yellow) = 0.321
74. Consider the function T(x) = (x + k)/12 for x = 1, 2, 3, 4. Find all values of k which make the
function T a probability function.
ANSWER:
k = 0.5
x P(x)
2 2k
3 0.52
4 k
ANSWER:
76. The function P(x) = [1 + (x − 3)²]/10 for x = 1, 2, 3, and 4 is a probability function. Find the mean
and standard deviation of this distribution.
ANSWER:
µ = 2.0 and σ = 1.2
77. Find the mean and standard deviation of the following probability distribution:
x P(x)
1 0.3
2 0.5
3 0.2
ANSWER:
µ = 1.9 and σ = 0.7
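The mean and standard deviation of any small discrete distribution follow from µ = ∑[xP(x)] and σ² = ∑[x²P(x)] − µ². A minimal sketch, with a hypothetical helper `mean_and_std` applied to the distribution in question 77:

```python
from math import sqrt

def mean_and_std(dist: dict[float, float]) -> tuple[float, float]:
    """Return (mu, sigma) for a discrete distribution given as {x: P(x)}."""
    mu = sum(x * p for x, p in dist.items())
    variance = sum(x**2 * p for x, p in dist.items()) - mu**2
    return mu, sqrt(variance)

mu, sigma = mean_and_std({1: 0.3, 2: 0.5, 3: 0.2})
print(round(mu, 2), round(sigma, 2))
```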
78. Compare the standard deviations of the following two probability distributions, both of
which have a mean equal to 5.
x 4 5 6
Distribution B:
x 1 2 3 4 5 6 7 8 9
P(x) 0.05 0.05 0.1 0.2 0.2 0.2 0.1 0.0 0.0
ANSWER:
Standard deviation for distribution A = 0.45, Standard deviation for distribution B = 1.92
79. A probability distribution has a standard deviation equal to 2.5 and ∑[x²P(x)] = 10.25. Find
the mean.
ANSWER:
Mean = 2 or -2
80. Find the amount of the probability distribution within two standard deviations of the mean
for rolling a pair of dice and observing the sum. Compare this with the bound given by
Chebyshev's Theorem.
ANSWER:
81. An arsenal contains several identical boxes of ammunition. If the number of defective
bullets per box has the following distribution, find the mean and standard deviation for x.
ANSWER:
x P(x)
1 0.25
2 0.25
3 0.25
4 0.25
82. Find the mean and standard deviation of the probability distribution.
ANSWER:
ANSWER:
This is a uniform distribution since the probability is the same for all possible values of x.
84. Census data for families with a combined income of $60,000 or more in Michigan show
that 25% have no children, 30% have one child, 35% have two children, and 10% have
ANSWER:
x 0 1 2 3
P(x) = (x² + 5)/80; for x = 1, 2, 3, 4, or 5.
x P(x)
1 0.0750
2 0.1125
3 0.1750
4 0.2625
5 0.3750
Notice that each P(x) is a value between 0.0 and 1.0, and the sum of all P(x) values is exactly
1.0. Therefore, P(x) is a probability function.
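The two conditions checked here (each value between 0 and 1, and a total of exactly 1) can be tested mechanically. The sketch below uses exact fractions to avoid rounding; `is_probability_function` is an illustrative name. It also applies the same check to a function of the form (4 − x)/5 on x = 1 to 5, which fails because one value is negative.

```python
from fractions import Fraction

def is_probability_function(probs) -> bool:
    """True when every value lies in [0, 1] and the values sum to exactly 1."""
    probs = list(probs)
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

# P(x) = (x^2 + 5)/80 for x = 1..5, kept as exact fractions.
p_values = [Fraction(x**2 + 5, 80) for x in range(1, 6)]
print([float(p) for p in p_values], is_probability_function(p_values))

# F(x) = (4 - x)/5 for x = 1..5 fails because F(5) is negative.
f_values = [Fraction(4 - x, 5) for x in range(1, 6)]
print(is_probability_function(f_values))
```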
86. Given the probability function P(x) = (6 − x)/15 for x = 1, 2, 3, 4, or 5, find the mean
and standard deviation.
ANSWER:
x    P(x)    xP(x)    x²P(x)
1    5/15    5/15     5/15
2    4/15    8/15     16/15
3    3/15    9/15     27/15
4    2/15    8/15     32/15
5    1/15    5/15     25/15
∑    1.0     35/15    105/15
σ = √σ² = √1.556 = 1.247
x 1 2 3 4 5
ANSWER:
x P(x) xP(x) x²P(x)
µ = ∑[xP(x)] = 2.1
σ = √σ² = √1.89 = 1.3748
ANSWER:
ANSWER:
x H(x)
1 0.25
2 0.25
3 0.25
4 0.25
ANSWER:
ANSWER:
[Histogram: H(x) on the vertical axis (0 to 0.3) versus x = 1, 2, 3, 4 on the horizontal axis]
ANSWER:
It is uniform or rectangular.
ANSWER:
94. Determine the variance and standard deviation of the probability distribution in question
89.
ANSWER:
σ = √σ² = √1.25 = 1.118
ANSWER:
x P(x)
1 0.12
2 0.18
3 0.28
4 0.42
ANSWER:
ANSWER:
98. Determine the standard deviation of the probability function in question 95.
ANSWER:
σ = √σ² = √1.08 = 1.039
[Histogram: P(x) on the vertical axis (0 to 0.45) versus x = 1, 2, 3, 4 on the horizontal axis]
x 1 2 3 4 5
P(x) 0.30 0.20 0.25 0.15 0.10
100. Use a computer (or random numbers table) to generate a random sample of 25
observations drawn from the discrete probability distribution.
ANSWER:
Everyone's generated values will be different. Listed here is one such sample.
3 1 3 1 3 2 4 2 5 1
1 2 3 2 2 4 1 5 4 5
4 1 1 1 3
101. Form a relative frequency distribution of the observed data (generated random data).
ANSWER:
x 1 2 3 4 5
Relative Frequency 0.32 0.20 0.20 0.16 0.12
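One way to "use a computer" for questions 100 and 101 is sketched below with Python's standard library. The seed value is arbitrary and only makes the run repeatable; as the answer notes, every generated sample will differ, so no particular output is expected.

```python
import random
from collections import Counter

values = [1, 2, 3, 4, 5]
probabilities = [0.30, 0.20, 0.25, 0.15, 0.10]

rng = random.Random(2024)  # arbitrary seed so the run is repeatable
sample = rng.choices(values, weights=probabilities, k=25)

# Relative frequency of each value in the generated sample.
counts = Counter(sample)
relative_freq = {x: counts[x] / len(sample) for x in values}

print(sample)
print(relative_freq)
```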
[Histogram: P(x) on the vertical axis (0 to 0.35) versus x = 1, 2, 3, 4, 5 on the horizontal axis]
103. Construct a relative frequency histogram of the observed data using class marks of 1, 2,
3, 4, and 5.
[Histogram: relative frequency on the vertical axis (0 to 0.35) versus x = 1, 2, 3, 4, 5 on the horizontal axis]
104. Compare the observed data with the theoretical distribution. Describe your conclusions.
ANSWER:
The distribution of the sample is somewhat similar to that of the given distribution. The
two highest relative frequencies in the random data occurred at x = 1 and 3, matching the two
highest probabilities for the given distribution. Also, the two lowest relative frequencies in the
random data occurred at x = 4 and 5, matching the two lowest probabilities for the given
distribution. Finally, the relative frequency at x = 2 in the random data is identical to
the probability at x = 2 in the given distribution.
ANSWER:
x 1 2 3 4 5
P(x) 1/15 2/15 3/15 4/15 5/15
ANSWER:
∑[xP(x)] = 3.667 and ∑[x²P(x)] = 15
ANSWER:
µ = ∑[xP(x)] = 3.667
σ² = ∑[x²P(x)] − µ² = 15 − (3.667)² = 1.553
ANSWER:
σ = √σ² = √1.553 = 1.246
ANSWER:
x 1 2 3 4
P(x) 0.1 0.2 0.3 0.4
ANSWER:
ANSWER:
µ = ∑[xP(x)] = 3.0
ANSWER:
ANSWER:
σ = √σ² = 1.0
The number of credits that full-time college students take on any given semester is a random
variable represented by x. The probability distribution for x is
x 12 13 14 15 16
P(x) 0.4 0.2 0.2 0.1 0.1
115. Find the mean of the number of credits that full-time college students take in a given
semester.
ANSWER:
µ = ∑[xP(x)] = 13.3
116. Find the standard deviation of the number of credits that full-time college students take
on a given semester.
ANSWER:
σ = √σ² = 1.345
117. How much of the probability distribution is within two standard deviations of the mean?
ANSWER:
The interval from 10.61 to 15.99 encompasses the values 12, 13, 14, and 15.
118. How much of the probability distribution is within one standard deviation of the mean?
ANSWER:
The interval from 11.955 to 14.645 encompasses the values 12, 13, and 14.
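The intervals quoted in questions 117 and 118 can be turned into probabilities by adding P(x) for the values they contain. The following sketch assumes the credit distribution given above; `prob_within` is a hypothetical helper.

```python
from math import sqrt

credits = {12: 0.4, 13: 0.2, 14: 0.2, 15: 0.1, 16: 0.1}

mu = sum(x * p for x, p in credits.items())
sigma = sqrt(sum(x**2 * p for x, p in credits.items()) - mu**2)

def prob_within(k: float) -> float:
    """Total probability of the values lying within k standard deviations of the mean."""
    low, high = mu - k * sigma, mu + k * sigma
    return sum(p for x, p in credits.items() if low <= x <= high)

print(round(mu, 1), round(sigma, 3))
print(round(prob_within(2), 2), round(prob_within(1), 2))
```

Because the variable is discrete, only the listed values of x contribute; the interval endpoints themselves carry no probability unless they coincide with a value of x.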
119. Find P(µ − 2σ ≤ x ≤ µ + 2σ).
ANSWER:
120. Find P(µ − σ ≤ x ≤ µ + σ).
ANSWER:
121. The histogram for a binomial distribution that has a success probability close to one will
be skewed to the right, and the histogram for a binomial distribution that has a success
probability close to zero will be skewed to the left.
ANSWER: F
122. It is possible to obtain eight successes in a binomial probability experiment with six trials,
provided the probability of a success on a single trial is greater than 0.5.
ANSWER: F
123. A binomial experiment always has at least three possible outcomes to each trial.
ANSWER: F
124. The binomial random variable x is the count of the number of successful trials that occur
in n repeated (identical) independent trials; x may take on any integer value from 0 to n.
ANSWER: T
125. In any binomial probability experiment, independent trials mean that the result of one
trial does not affect the probability of success of any other trial in the experiment.
ANSWER: T
126. The binomial coefficient (n choose x) = n!/[x!(n − x)!] is equivalent to the number of combinations, nCx,
the symbol most likely on your calculator.
ANSWER: T
127. A binomial experiment always has three or more possible outcomes to each trial.
ANSWER: F
ANSWER: T
129. The binomial parameter p is the probability of one success occurring in n trials when a
binomial experiment is performed, while 2p is the probability of two successes.
ANSWER: F
130. A convenient notation to identify the binomial probability distribution for a binomial
experiment with n = 20 and p = 0.25 is B(20, 0.25).
ANSWER: T
131. The binomial random variable x is the count of the number of successful trials that occur
in n trials. The random variable x may take on any real value from zero to n.
ANSWER: F
132. Each trial in a binomial probability experiment has two possible outcomes (success,
failure), and P(success) + P(failure) = 1.
ANSWER: T
133. A binomial experiment always has two or more possible outcomes to each trial.
ANSWER: F
134. The binomial parameter p is the possibility of one success occurring in n trials when a
binomial experiment is performed.
ANSWER: F
ANSWER: T
136. The number of hours you waited in line to register this semester is an example of a
binomial random variable.
ANSWER: F
Multiple-Choice Questions
A) $\binom{n}{x} p^x q^{n-x}$
B) $\binom{n}{n-x} p^x q^{n-x}$
C) $\binom{n}{x} p^{n-x} q^x$
ANSWER: C
138. In a binomial probability experiment with P(success) = p, P(failure) = q, and eight trials,
what is the probability of three successes?
A) $5p^3q^5$
B) $5p^5q^3$
C) $56p^3q^5$
D) $56p^5q^3$
ANSWER: C
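Note: as a rough illustration of the formula behind question 138, the sketch below evaluates $\binom{n}{x} p^x q^{n-x}$ with Python's standard library. The function name `binomial_pmf` and the sample value p = 0.5 are illustrative assumptions and are not part of the test bank.

```python
from math import comb

def binomial_pmf(n, p, x):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure on a single trial
    return comb(n, x) * p**x * q**(n - x)

# Number of ways to place 3 successes among 8 trials (the coefficient in question 138).
coefficient = comb(8, 3)
print(coefficient)

# Hypothetical value p = 0.5, chosen only to show a numeric result.
print(binomial_pmf(8, 0.5, 3))
```

The coefficient counts the arrangements of successes and failures; the powers of p and q give the probability of any one such arrangement.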
A) 10! / 3!
B) 120
C) 720
D) 30
ANSWER: B
141. Which of the following is not true regarding a binomial distribution for n = 50 and p = 0.4?
142. If a tree diagram is drawn for a binomial experiment having n trials, how many branches
will it have?
A) $2^n$
B) $2n$
C) $n^2$
D) Need to know the value of n before number of branches can be determined.
ANSWER: A
143. For a binomial distribution with five trials and equal probability of success per trial, what
is the highest probability?
144. Suppose that the value of n in a binomial distribution is fixed, but we let the value of p
vary. As the value of p increases from values near 0 to values close to 1, what
conclusion can be made about the mean of the distribution?
145. If a tree diagram is drawn for a binomial experiment having 4 trials, how many branches
will it have?
A) 2
B) 4
C) 8
D) 16
ANSWER: D
146. Given a binomial probability experiment with six trials, in how many ways can we obtain
two successes?
ANSWER:
ANSWER:
p = 0.20
148. For a particular binomial distribution, n = 4. If P(2) = 0.346 and P(3) = 0.154, find p.
ANSWER:
p = 0.40
149. How many times must a fair coin be flipped in order that the mean number of heads
equals 25?
ANSWER:
50
150. For a particular binomial distribution, n = 28 and p = 0.35. For this distribution, find
∑ x ⋅ P( x ).
ANSWER:
151. For a particular binomial experiment, n = 18 and p = 0.7 . For this experiment, find the
value of ∑ [x ⋅ P(x )] .
ANSWER:
12.6
152. A particular binomial distribution is given by $P(x) = \binom{100}{x}(0.2)^x(0.8)^{100-x}$, for x = 0, 1, 2, 3,
..., 100. Find the mean and standard deviation of this distribution.
ANSWER:
µ = 20 and σ = 4.0
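Note: the answer above follows from µ = np and σ = √(npq). A minimal sketch of that calculation is shown below; the helper name `binomial_mean_sd` is an assumption introduced only for illustration.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial distribution B(n, p)."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))  # sqrt(npq) with q = 1 - p
    return mean, sd

# n = 100 and p = 0.2 are read off the probability function in question 152.
mean, sd = binomial_mean_sd(100, 0.2)
print(mean, sd)
```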
153. Briefly define a binomial probability experiment and discuss its properties.
ANSWER:
154. State a very practical reason why the defective item in an industrial situation would be
defined to be the “success” in a binomial experiment.
ANSWER:
The number of defective items in an industrial situation should be fairly small and
therefore easier to count.
ANSWER:
n = 75 repeated identical independent trials (towels); there are only two outcomes (first
quality, irregular); p = P(success) = P(irregular); x = number of irregular towels, which may
take on any integer value from 0 to 75.
156. The employees at a Ford assembly plant are polled as they leave work. Each is asked,
“What brand of automobile are you riding home in?” The random variable to be reported
is the number of each brand mentioned. Is x a binomial random variable? Justify your
answer.
ANSWER:
x is not a binomial random variable because there are more than two categories of
outcomes. As the exercise is stated, each different brand (or make) of automobile is an
outcome; therefore there are many different possible outcomes on each trial.
157. Four cards are selected, one at a time, from a standard deck of 52 cards. Let x represent
the number of jacks drawn in the set of four cards. If this experiment is completed
without replacement, explain why x is not a binomial random variable.
ANSWER:
x is not a binomial random variable because the trials are not independent. The
probability of success (get a jack) changes from trial to trial. On the first trial it is 4 / 52.
The probability of a jack on the second trial depends on the outcome of the first trial; it is
4 / 51 if a jack is not selected, and it is 3 / 51 if a jack was selected. The probability of a
jack on any given trial continues to change when the experiment is completed without
replacement.
ANSWER:
x is a binomial random variable because the trials are independent. n = 4, the number of
independent trials; two outcomes, success = queen and failure = not queen; p =
P(queen) = 4/ 52 and q = P(not queen) = 48 / 52; x = number of queens drawn in 4 trials,
and could be any integer number 0, 1, 2, 3 or 4. Further, the probability of success (get a
queen) remains 4 / 52 for each trial throughout the experiment, as long as the card
drawn on each trial is replaced before the next trial occurs.
159. Find the mean and standard deviation of x = number of heads seen in 100 tosses of a
quarter.
ANSWER:
x is a binomial random variable with n = 100 and p = 0.5. Then, the mean µ = np = 50
and standard deviation $\sigma = \sqrt{npq} = \sqrt{(100)(0.5)(0.5)} = 5.0$.
160. Let x represent the flip upon which a head first occurs when a coin is flipped repeatedly.
Find the probability that x is equal to or greater than 4.
ANSWER:
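0.125 (for a fair coin, x ≥ 4 requires tails on each of the first three flips, so the probability is $(0.5)^3$)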
161. Thirty percent of hospital admissions for diabetic patients are related to problems with
the kidneys. In a sample of 10 diabetic hospital admissions, what is the probability that
none will be for a kidney problem?
ANSWER:
162. A manufacturer of matches puts 100 matches in each box of matches produced. One-tenth of
one percent of the matches produced has flaws. If a box is randomly selected, what is the
probability that it will have one or fewer matches with a flaw?
ANSWER:
0.995
In testing a new drug, researchers found that 5% of all patients using it will have a mild side
effect. A random sample of 11 patients using the drug is selected.
163. Find the probability that exactly two will have this mild side effect.
ANSWER:
0.087
164. Find the probability that at least one will have this mild side effect.
ANSWER:
0.431
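Note: questions 163 and 164 can be checked numerically. The sketch below assumes the binomial model B(n = 11, p = 0.05) stated in the problem; the function and variable names are illustrative only.

```python
from math import comb

def binomial_pmf(n, p, x):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 11, 0.05

# Question 163: exactly two patients have the side effect.
p_exactly_two = binomial_pmf(n, p, 2)

# Question 164: "at least one" is the complement of "none".
p_at_least_one = 1 - binomial_pmf(n, p, 0)

print(round(p_exactly_two, 3), round(p_at_least_one, 3))
```

Using the complement avoids summing the eleven terms P(1) through P(11).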
165. A quality control inspector has determined that 0.25% of all parts manufactured by a
particular machine are defective. If 50 parts are randomly selected, find the probability
that there will be at most one defective part.
ANSWER:
0.9930
ANSWER:
0.0000016
167. A fair die is rolled 10 times. Compute the probability that a “one” appears exactly once.
ANSWER:
0.323
168. If two dice are tossed six times, find the probability of obtaining a sum of 7 two or three
times.
ANSWER:
0.255
Consider the probability distribution for x, the number of heads to occur when a coin is tossed
four times.
x 0 1 2 3 4
169. A binomial distribution is based on n = 15 trials and success probability p = 0.4 . What is
the probability that the binomial random variable equals its mean value?
ANSWER:
0.207
170. A coin is tossed 100 times. Find numbers a and b that are such that the number of
heads to appear will be between a and b at least 89% of the time.
ANSWER:
a = 35 , b = 65
171. A binomial distribution has a mean equal to 20 and a standard deviation equal to 4. Find
n and p.
ANSWER:
n = 100, p = 0.2
172. Find the mean and standard deviation of the binomial distribution when n = 60 and p =
1/6. Note that this would correspond to the number of times a “one” would appear in 60
tosses of a fair die.
ANSWER:
173. A manufacturer of matches puts 100 matches in each box of matches produced. One-
tenth of one percent of the matches produced has a flaw. If a box is randomly selected,
what is the mean and standard deviation of x where x is defined as the number of
matches having a flaw in the box?
ANSWER:
174. For the binomial distribution with n = 48 and p = 1/3, which of the possible values of x (x
= 0, 1, 2, 3, ..., 48) lie between µ − 2σ and µ + 2σ?
ANSWER:
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, and 22
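Note: one way to confirm this list is to compute µ and σ and filter the integers from 0 to n. The sketch below does that for n = 48 and p = 1/3; the variable names are illustrative.

```python
from math import sqrt

n, p = 48, 1 / 3
mu = n * p                      # mean of the binomial distribution
sigma = sqrt(n * p * (1 - p))   # standard deviation

low, high = mu - 2 * sigma, mu + 2 * sigma

# Keep only the possible values of x that fall inside the interval.
values_within_two_sd = [x for x in range(n + 1) if low <= x <= high]
print(round(low, 2), round(high, 2), values_within_two_sd)
```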
ANSWER:
P(shut down) = P ( x ≥ 2), where x represents the number of defective parts in the sample.
By using the binomial formula with n = 10, and p = 0.002, we get P(x = 0) = 0.9802, and
P(x = 1) = 0.0196. Hence, P(x ≥ 2) =1.0 – [P(x = 0) + P(x = 1)] = 1.0 – (0.9802 + 0.0196)
= 0.0002.
176. For a particular binomial distribution, $\mu = 4$ and $\sigma = \sqrt{3}$. Find the values of n and p.
ANSWER:
n = 16 and p = 0.25
177. A binomial distribution has a mean of 12 and a standard deviation of 2.683. Find n and
p.
ANSWER:
n = 30, p = 0.4
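Note: problems of this type can be solved by dividing the variance by the mean, since σ²/µ = npq/np = q. The sketch below is one possible way to organize the arithmetic; the helper name `binomial_params` is an assumption, and n is rounded to the nearest integer because the given standard deviation is itself rounded.

```python
def binomial_params(mean, sd):
    """Recover (n, p) of a binomial distribution from its mean and standard deviation."""
    q = sd**2 / mean   # variance / mean = npq / np = q
    p = 1 - q
    n = mean / p       # mean = np, so n = mean / p
    return round(n), p

n, p = binomial_params(12, 2.683)
print(n, round(p, 2))
```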
Four cards are selected, one at a time, from a standard deck of 52 cards. Let x represent the
number of aces drawn in the set of 4 cards.
178. If this experiment is completed without replacement, explain why x is not a binomial
random variable.
ANSWER:
179. If this experiment is completed with replacement, explain why x is a binomial random
variable.
ANSWER:
x is a binomial random variable because the trials are independent. n = 4, the number of
trials; two outcomes, success = ace and failure = not ace; p = P(ace) = 4/52 and q =
P(not ace) = 48/52; x = number of aces drawn in 4 trials, which could be any integer 0, 1, 2, 3, or
4. Further, the probability of success (get an ace) remains 4/52 for each trial throughout
the experiment, as long as the card drawn on each trial is replaced before the next trial
occurs.
It was reported in a medical journal that about 70% of the individuals needing a kidney
transplant find a suitable donor when they turn to registries of unrelated donors. Consider a
group of ten individuals needing a kidney transplant, and let x represent the number of them
who will find a suitable donor among the registries of unrelated donors.
181. Find the probability that all ten will find a suitable donor among the registries of unrelated
donors.
ANSWER:
182. Find the probability that exactly eight will find a suitable donor among the registries of
unrelated donors.
ANSWER:
P(x = 8) = 0.233
183. Find the probability that at least eight will find a suitable donor among the registries of
unrelated donors.
ANSWER:
184. Find the probability that no more than five will find a suitable donor among the registries
of unrelated donors.
ANSWER:
$P(x) = \binom{4}{x}(0.75)^x(0.25)^{4-x}$ for x = 0, 1, 2, 3, 4
ANSWER:
x P(x)
0 0.0039
1 0.0469
2 0.2109
3 0.4219
4 0.3164
This is a binomial probability function since each P(x) is between 0 and 1, and
∑ P( x) = 1.0
186. A recent study showed that only 20% of the women who lived with their boyfriends
eventually walked down the aisle with them. In a sample of 15 women who have lived
with a boyfriend in the past, what is the probability that 5 or fewer of them married the
boyfriend?
ANSWER:
Let x represent the number of women who lived with their boyfriends and eventually
married the boyfriend. The random variable x is B(n = 15, p = 0.2). Using the table of
binomial probabilities, we have: P(x ≤ 5) = 0.035 + 0.132 + 0.231 + 0.250 + 0.188 +
0.103 = 0.939.
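Note: in place of a printed table, the cumulative probability can be obtained by summing the binomial terms directly. The sketch below assumes the model B(n = 15, p = 0.2) from question 186; the function name `binomial_cdf` is illustrative.

```python
from math import comb

def binomial_cdf(n, p, k):
    """P(x <= k) for a binomial random variable, summed term by term."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

p_five_or_fewer = binomial_cdf(15, 0.2, 5)
print(round(p_five_or_fewer, 3))
```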
187. If the binomial (q + p) is squared, the result is $(q + p)^2 = q^2 + 2qp + p^2$. For the binomial
experiment with n = 2, the probability of no successes in two trials is $q^2$ (the first term in
the expansion), the probability of one success in two trials is $2qp$ (the second term in the
expansion), and the probability of two successes in two trials is $p^2$ (the third term). Find
$(q + p)^3$ and compare its terms to the binomial probability for n = 3 trials.
ANSWER:
$(q + p)^3 = q^3 + 3q^2p + 3qp^2 + p^3$
188. The probability of success on a single trial of a binomial experiment is known to be 0.40.
The random variable x, number of successes, has a mean value of 80. Find the number
of trials involved in this experiment and the standard deviation of x.
ANSWER:
189. In Florida, 40% of the people have a certain blood type. What is the probability that
exactly 5 out of a randomly selected group of 15 Floridians will have that blood type?
ANSWER:
$\sigma^2 = \sum x^2 P(x) - \mu^2 \Rightarrow 4.2 = \sum x^2 P(x) - 6^2 \Rightarrow \sum x^2 P(x) = 40.2$
A large shipment of TV sets is accepted upon delivery if an inspection of ten randomly selected
TV sets yields no more than one defective TV.
191. Find the probability that this shipment is accepted if 5% of the total shipment is defective.
ANSWER:
P(accepted) = P[x = 0, 1 | B(n = 10, p = 0.05)] = P(0) + P(1) = 0.599 + 0.315 = 0.914
ANSWER:
193. The binomial probability distribution is often used in situations similar to this one,
namely, large populations sampled without replacement. Explain why the binomial
yields a good estimate.
ANSWER:
Even though P(defective) changes from trial to trial, if the population is very large,
the probabilities are very similar. For example, suppose the population has 10,000 items
and 50 are defective. P(defective) on the first trial is 50/10,000 = 0.0050; if after 10 trials
5 defectives have been selected, 45 remain and P(defective) will be 45/9990 = 0.0045.
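Note: the quality of the approximation can be seen by comparing the exact without-replacement (hypergeometric) probabilities to the binomial ones. The sketch below uses the population of 10,000 items with 50 defectives from the answer above and a sample of 10; the function names are illustrative assumptions.

```python
from math import comb

def hypergeom_pmf(N, D, n, x):
    """P(x defectives) when n items are drawn without replacement
    from N items of which D are defective."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

def binomial_pmf(n, p, x):
    """P(x successes) in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

N, D, n = 10_000, 50, 10
p = D / N  # P(defective) on the first trial

# Largest gap between the exact and the approximate probability over all x.
max_gap = max(abs(hypergeom_pmf(N, D, n, x) - binomial_pmf(n, p, x))
              for x in range(n + 1))
print(max_gap)
```

The gap is small because removing at most 10 items barely changes the makeup of a population of 10,000.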
Suppose that you buy 25 plants from a nursery and the nursery claims that 95% of its plants
survive when planted. Let x represent the number of plants that survive.
ANSWER:
195. Use a computer (or statistical software) to determine the probability that all 25 will survive.
ANSWER:
196. Use a computer (or statistical software) to determine the probability that at most 21 will
survive.
ANSWER:
197. Use a computer (or statistical software) to determine the probability that at least 23 will
survive.
ANSWER:
198. Find the mean and standard deviation of x = number of right-handed students in a
classroom of 30 students. Assume that 10% of the population is left-handed.
ANSWER:
x is a binomial random variable with n = 30 and p = 0.9. Then, the mean µ = np = 27 and
standard deviation $\sigma = \sqrt{npq} = \sqrt{(30)(0.9)(0.1)} = 1.643$.
Assume that x is a binomial random variable, with p = P(success), n = number of trials, and x =
number of successes in n trials. Use the binomial probabilities table available in your text to answer
the questions below.
ANSWER:
0.001
ANSWER:
0.006
ANSWER:
0.205
ANSWER:
0.279
ANSWER:
0.187
ANSWER:
0.961
x 0 1 2 3 4
P(x) 0.42 0.33 0.10 0.09 0.06
ANSWER:
If this distribution were binomial, then n would be 4 and P(x = 0) = 0.42 would be $q^4$;
that means $q = \sqrt[4]{0.42} = 0.805$. Also, P(x = 4) = 0.06 would be $p^4$, which means
$p = \sqrt[4]{0.06} = 0.495$. Since p + q = 0.495 + 0.805 = 1.30, which does not add up to 1.0, the
only conclusion is that this distribution is not binomial.
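Note: the check described in this answer can be sketched in a few lines. The code below takes the fourth roots of P(0) and P(4) from the table and tests whether they sum to 1; the variable names are illustrative.

```python
n = 4
probs = [0.42, 0.33, 0.10, 0.09, 0.06]  # P(0), ..., P(4) from the table

# If the distribution were binomial, P(0) = q**n and P(n) = p**n.
q = probs[0] ** (1 / n)
p = probs[n] ** (1 / n)

print(round(q, 3), round(p, 3), round(p + q, 2))
```

A sum noticeably different from 1 rules out a binomial model; a sum equal to 1 would be necessary but not sufficient.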
206. A machine produces parts of which 1% are defective. If a random sample of twenty parts
produced by this machine contains two or more defectives, the machine is shut down for
repairs. Find the probability that the machine will be shut down for repairs based on this
sampling plan.
ANSWER:
P(machine will be shut down) = P(x ≥ 2), where x represents the number of defectives in
the sample of n = 20. Since
$P(x = 0) = \binom{20}{0}(0.01)^0(0.99)^{20} = 0.8179$ and $P(x = 1) = \binom{20}{1}(0.01)^1(0.99)^{19} = 0.1652$,
the probability that the machine will be shut down for repairs based on this sampling plan
is given by P(x ≥ 2) = 1 – [P(x = 0) + P(x = 1)] = 1 – (0.8179 + 0.1652) = 0.0169.
207. Find the mean and standard deviation of x = number of melon seeds that germinate
when a package of 75 seeds is planted. The package states that the probability of
germination is 0.92.
ANSWER:
x is a binomial random variable with n = 75 and p = 0.92. Then, the mean µ = np = 69 and
standard deviation $\sigma = \sqrt{npq} = \sqrt{(75)(0.92)(0.08)} = 2.349$.
Consider the following function: $P(x) = \binom{4}{x}(0.5)^x(0.5)^{4-x}$ for x = 0, 1, 2, 3, 4.
ANSWER:
By inspecting the function P(x) we see it satisfies the following binomial properties: n = 4,
p = 0.5, q = 0.5 (p + q = 1), the two exponents x and 4-x add up to n = 4, and x can
take on any integer value from zero to n = 4; therefore P(x) is a binomial probability
function.
ANSWER:
x 0 1 2 3 4
P(x) 0.0625 0.250 0.375 0.250 0.0625
210. Sketch a histogram of the probability distribution of x in question 209, and briefly
describe its shape.
ANSWER:
[Histogram: probability P(x) on the vertical axis versus x = 0, 1, 2, 3, 4 on the horizontal axis, with bar heights 0.0625, 0.25, 0.375, 0.25, and 0.0625.]
211. Calculate the mean and standard deviation of the probability distribution of x directly by
using your answer to question 210.
ANSWER:
Mean: $\mu = \sum x \cdot P(x) = 2.0$
Variance: $\sigma^2 = \sum x^2 \cdot P(x) - \mu^2 = 5 - 4 = 1.0$; then the standard deviation is $\sigma = \sqrt{\sigma^2} = 1.0$.
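Note: the direct calculation can be reproduced from the probability table for x = 0, 1, 2, 3, 4. The sketch below uses the probabilities 0.0625, 0.25, 0.375, 0.25, 0.0625 given earlier in this group of questions; the variable names are illustrative.

```python
from math import sqrt

xs = [0, 1, 2, 3, 4]
ps = [0.0625, 0.25, 0.375, 0.25, 0.0625]

mean = sum(x * p for x, p in zip(xs, ps))               # sum of x * P(x)
second_moment = sum(x**2 * p for x, p in zip(xs, ps))   # sum of x^2 * P(x)
variance = second_moment - mean**2
sd = sqrt(variance)

print(mean, second_moment, variance, sd)
```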
212. Calculate the mean and standard deviation of the probability distribution of x by using
your answer to question 210.
ANSWER:
ANSWER:
214. If boys and girls are equally likely to be born, what is the probability that in a randomly
selected family of four children, there will be no boys? (Find the answer using a formula).
ANSWER:
$P(x = 0) = \binom{4}{0}(0.5)^0(0.5)^4 = 0.0625$
215. Find the mean and standard deviation of x = number of cars found to have unsafe
brakes among the 500 cars stopped at a roadblock for inspection. Assume that 5% of all
cars have one or more unsafe brakes.
ANSWER:
x is a binomial random variable with n = 500 and p = 0.05. Then, the mean µ = np = 25
and standard deviation $\sigma = \sqrt{npq} = \sqrt{(500)(0.05)(0.95)} = 4.873$.
216. A binomial random variable x is based on 12 trials with the probability of success equal
to 0.30. Find the probability that this variable will take on a value more than two
standard deviations above the mean.
ANSWER:
A doctor knows from experience that 20% of the patients to whom he gives a high blood
pressure drug will have undesirable side effects. Assume the doctor gives that drug to ten of his
patients.
217. Find the probability that among the ten patients to whom he gives the drug, at most two
will have undesirable side effects.
218. Find the probability that among the ten patients to whom he gives the drug, at least two
will have undesirable side effects.
ANSWER:
A random sample of 12 players from the active rosters of the 30 Major League Baseball teams
is to be selected and tested for the use of illegal drugs.
219. If 10% of all the players are using illegal drugs at the time of the test, what is the
probability that two or more test positive and fail the test?
ANSWER:
220. If 20% of all the players are using illegal drugs at the time of the test, what is the
probability that two or more test positive and fail the test?
ANSWER:
ANSWER:
A large retailer has purchased 10,000 high quality videotapes. The retailer is assured by the
supplier that the shipment contains no more than 1% defective tapes (according to agreed
specifications). To check the supplier’s claim, the retailer randomly selects 100 tapes and finds
six of the 100 to be defective.
222. Assuming the supplier’s claim is true, compute the mean and the standard deviation of
the number of defective tapes in the sample.
ANSWER:
223. Based on your answer to question 222, is it likely that as many as six tapes would be
found to be defective, if the claim is correct?
ANSWER:
No. If you were 3 standard deviations to the right of the mean, the value would be 3.985. It is
unlikely you would observe 6 defects out of 100.
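Note: the value 3.985 is µ + 3σ for a binomial model with n = 100 and p = 0.01, which is what the supplier's claim implies if exactly 1% of tapes are defective. The sketch below reproduces that figure and, as an extra check not in the original answer, sums the binomial terms to estimate how rare six or more defectives would be.

```python
from math import comb, sqrt

n, p = 100, 0.01            # sample size and claimed defect rate
mu = n * p
sigma = sqrt(n * p * (1 - p))
upper = mu + 3 * sigma      # three standard deviations above the mean

# P(x >= 6) as the complement of P(x <= 5).
p_six_or_more = 1 - sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(6))

print(round(upper, 3), p_six_or_more)
```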
224. Suppose that six tapes are indeed found to be defective. Based on your answer to
question 222, what might be a reasonable inference about the manufacturer’s claim for
this shipment of 10,000 tapes?
ANSWER:
The service manager for a new appliances store reviewed sales records of the past 20 sales of
new microwaves to determine the number of warranty repairs he will be called on to perform in
the next 90 days. Corporate reports indicate that the probability any one of their new
microwaves needs a warranty repair in the first 90 days is 0.05. The manager assumes that
calls for warranty repair are independent of one another and is interested in predicting the
number of warranty repairs he will be called on to perform in the next 90 days for this batch of
new microwaves sold.
225. What type of probability distribution will most likely be used to analyze warranty repair
needs on new microwaves in this situation?
ANSWER:
Binomial distribution
226. What is the probability that none of the 20 new microwaves sold will require a warranty
repair in the first 90 days?
ANSWER:
P(X= 0) = 0.3585
227. What is the probability that exactly two of the 20 new microwaves sold will require a
warranty repair in the first 90 days?
ANSWER:
P(X = 2) = 0.1887
228. What is the probability that at most two of the 20 new microwaves sold will require a
warranty repair in the first 90 days?
ANSWER:
P(X ≤ 2) = 0.9245
229. What is the probability that between two and four (inclusive) of the 20 new microwaves
sold will require a warranty repair in the first 90 days?
ANSWER:
P(2 ≤ X ≤ 4) = 0.2616
Chapter 7
Sample Variability
True-False Questions
1. In general, the term “standard error” is the name only used for the standard deviation of
the sampling distribution of sample means.
ANSWER: F
2. The histogram for a population and the histogram for a sampling distribution of a sample
mean have the same shape.
ANSWER: F
3. The sampling distribution of sample means will be approximately normally distributed for
large samples when the parent population is not normally distributed.
ANSWER: T
ANSWER: T
5. The standard error of the mean is the standard deviation of the population from which
the samples have been taken.
ANSWER: F
6. We do not need to repeatedly sample a population in order to use the concept of the
sampling distribution.
ANSWER: F
7. As the sample size increases, the sampling distribution of the sample means from a normal
distribution has a normal curve that becomes more peaked.
ANSWER: T
8. The Central Limit Theorem provides us with a description of the three characteristics of a
sampling distribution of sample medians.
ANSWER: F
ANSWER: F
10. The standard error of the mean increases as the sample size increases.
ANSWER: F
11. A sample obtained in such a way that each possible sample of fixed size n has an equal
probability of being selected is referred to as a random sample.
ANSWER: T
ANSWER: T
13. The sampling distribution of sample means is normal for samples of all sizes, provided
that the parent sampled population has a normal distribution.
ANSWER: T
14. The fundamental goal of a survey is to come up with the same results that would have
been obtained had every single member of a population been interviewed.
ANSWER: T
15. The Central Limit Theorem states that the sampling distribution of sample means will more
closely resemble the normal distribution regardless of the sample size.
ANSWER: F
16. The sampling distribution of a sample statistic is the distribution of values for a sample
statistic obtained from repeated samples, all of the same size and all drawn from the
same population.
ANSWER: T
17. A random sample is a sample obtained in such a way that each possible sample of fixed
size n selected from the same population has a chance or probability of being selected.
ANSWER: F
18. If the sampled distribution is normal, then the sampling distribution of sample means
(SDSM) is normal and the Central Limit Theorem does not apply.
ANSWER: T
ANSWER: T
20. The standard error of the sample mean is the standard deviation of the population from
which the samples have been selected.
ANSWER: F
21. Repeated samples are commonly used in the field of production control, in which
samples are taken to determine whether a product is of the proper size or quantity.
When the sample statistic does not fit the standards, a mechanical adjustment of the
machinery is necessary.
ANSWER: T
ANSWER: F
23. The mean of the sampling distribution of sample means x is equal to the mean of the
population from which the samples have been selected.
ANSWER: T
24. Which of the following is not a characteristic of the sampling distribution of a sample
statistic?
25. Assume that you have repeatedly taken samples of size 5 from a population of 30. What
can be said about the individual sample means?
26. As the sample size increases, what happens to the standard error of the mean ($\sigma_{\bar{x}}$)?
A) Increases
B) Decreases
C) Remains the same
D) Becomes negative
ANSWER: B
27. Given that all possible random samples of size n are taken from any population, which of
the following would be true?
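A) $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma$.
B) $\mu_{\bar{x}} < \mu$ and $\sigma_{\bar{x}} > \sigma$.
C) $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} < \sigma$.
D) The raw data must be seen before any true statement can be made.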
ANSWER: C
29. As the size of the sample increases, what happens to the shape of the sampling
distribution of sample means?
30. If all possible random samples of size n are taken from a population that is not normally
distributed, and the mean of each sample is determined, what can you say about the
sampling distribution of sample means?
A) It is positively skewed.
B) It is negatively skewed.
C) It is approximately normal provided that n is large enough.
D) None of the above.
ANSWER: C
31. If the standard deviation of the sampling distribution of sample means is 5.0 for samples of size
16, then the population standard deviation must be
A) 20.
B) 5.0.
C) 3.2.
D) 80.
ANSWER: A
32. Which of the following statements about the Central Limit Theorem is correct?
33. Consider a large population with a mean of 100 and a standard deviation of 21. A
random sample of size 36 is taken from this population. The standard error of the
sampling distribution of sample mean is equal to:
A) 16.67.
B) 3.50.
C) 12.25.
D) 1.71.
ANSWER: B
A) $\sigma$.
B) $\sigma / \sqrt{n}$.
C) s.
D) $\sigma / n$.
ANSWER: B
A) 40.
B) 35.
C) 30.
D) 25.
ANSWER: D
A) The standard error of the mean ($\sigma_{\bar{x}}$) is the standard deviation of the sampling
distribution of sample means.
B) If the sampled population is not normal, the sampling distribution of sample means
will still be approximately normally distributed under the right conditions.
C) The standard error of the mean ($\sigma_{\bar{x}}$) is the standard deviation of the sampling
distribution of sample means.
D) None of the above
ANSWER: A
Short-Answer Questions
37. There are 50 possible samples of size two when selected with replacement from a total
of 10 items. In order to be a random sample, each possible sample must have what
probability of being selected?
ANSWER:
0.02
38. Determine the number of ways that two letters can be selected from {A, B, C, D} if order
in the sample is not to be considered. List the possible samples.
ANSWER:
39. Determine the number of ways that two letters can be selected from {A, B, C, D} if order
in the sample is to be considered. List the possible samples.
ANSWER:
12 possible ways: {A, B}, {B, A}, {A, C}, {C, A}, {A, D}, {D, A}, {B, C}, {C, B}, {B, D}, {D, B},
{D, C} and {C, D}
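Note: questions 38 and 39 differ only in whether order matters, which corresponds to combinations versus permutations. The sketch below enumerates both with Python's `itertools`; the variable names are illustrative.

```python
from itertools import combinations, permutations

letters = ["A", "B", "C", "D"]

# Order not considered: each pair appears once.
unordered_samples = list(combinations(letters, 2))

# Order considered: (A, B) and (B, A) are counted separately.
ordered_samples = list(permutations(letters, 2))

print(len(unordered_samples), unordered_samples)
print(len(ordered_samples), ordered_samples)
```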
40. How many samples of size 5 are possible when selecting from a set of 10 distinct
integers if the sampling is done with replacement?
ANSWER:
100,000 samples
41. Explain why the sample means become more variable as the sample size decreases.
ANSWER:
With a smaller sample size there will be more “gaps” between the values; as the sample
size increases the “gaps” become filled in.
42. What name do we give to the standard deviation of the sampling distribution of sample
means?
ANSWER:
43. Suppose samples of size 50 are selected from the distributions listed in parts a through
e below. What type of distribution will the sample mean $\bar{x}$ have in each of the five cases?
ANSWER:
The sample mean would have a normal distribution in part (b) since the parent
population is normal. In all other parts, the distribution is approximately normal since n =
50 > 30, so the Central Limit Theorem applies.
44. Consider the integers {0, 1, 2, 3, 4}. If all samples of size 3 are taken, with replacement,
and the sampling distribution of the sample mean is found, what would the mean of the
sample mean equal?
ANSWER:
45. Consider the integers {10, 20, 30, 40, 50, 60}. If all samples of size 3 are taken, with
replacement, and the sampling distribution of the sample mean is found, what would the
mean of the sample mean equal?
ANSWER:
46. Discuss the effect on the standard error of the mean as the sample size increases.
ANSWER:
As the sample size increases, the standard error of the mean decreases.
ANSWER:
Both sampling distributions are normal. With n = 100 the distribution has a standard
error of 0.1σ, while the distribution for n = 60 has a standard error of 0.129σ.
48. Abby stated that “a sampling distribution of the standard deviation tells you how the
standard deviation varies from sample to sample.” Debra argues that “a population
distribution tells you that.” Who is right? Justify your answer.
ANSWER:
Abby is right. A population distribution is a distribution formed for all x values that make
up the entire population.
49. Lily says that it is the “size of each sample used” and Sue says that it is the “number of
samples used” that determines the spread of an empirical sampling distribution. Who is
right? Justify your answer.
ANSWER:
Lily is right. The standard error is found by dividing the standard deviation by the square
root of the sample size.
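Note: Lily's point can be illustrated with a small simulation. The sketch below draws repeated samples from a hypothetical uniform population (an assumption made only for this illustration) and compares the spread of the sample means when the number of samples changes versus when the sample size changes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population; any fixed population would serve the same purpose.
population = [random.uniform(0, 100) for _ in range(10_000)]

def sd_of_sample_means(sample_size, num_samples):
    """Standard deviation of the means of num_samples random samples."""
    means = [statistics.mean(random.sample(population, sample_size))
             for _ in range(num_samples)]
    return statistics.pstdev(means)

spread_few = sd_of_sample_means(25, 200)        # n = 25, 200 samples
spread_many = sd_of_sample_means(25, 2000)      # n = 25, 2000 samples
spread_large_n = sd_of_sample_means(100, 2000)  # n = 100, 2000 samples

print(spread_few, spread_many, spread_large_n)
```

Taking more samples of the same size mainly steadies the estimate of the spread, while quadrupling the sample size should roughly halve it.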
50. If a population has a standard deviation σ of 25 units, what is the standard error of the
mean if samples of size 80 are selected?
ANSWER:
$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 25 / \sqrt{80} = 2.795$
ANSWER:
The equal probability of selection principle states that if every member of a population
has an equal probability of being selected in a sample, then that sample will be
representative of the population.
52. If a population has a standard deviation σ of 25 units, what is the standard error of the
mean if samples of size 20 are selected?
ANSWER:
$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 25 / \sqrt{20} = 5.59$
53. What is the total measure of the area for any probability distribution?
ANSWER:
1.0
54. Is the statement “$\bar{x}$ becomes less variable as n increases” correct? Justify your answer.
ANSWER:
Yes, the statement is correct, because the standard error of the sample mean $\bar{x}$ is
given by $\sigma_{\bar{x}} = \sigma / \sqrt{n}$; as n increases, the value of this fraction, the standard deviation of
the sample mean, gets smaller.
55. If a population has a standard deviation σ of 25 units, what is the standard error of the
mean if samples of size 40 are selected?
ANSWER:
56. Make a list of all possible samples of size 2 that could be drawn with replacement from
this set of numbers.
ANSWER:
57. Construct the sampling distribution of sample means for the samples in question 56.
ANSWER:
ANSWER:
59. Construct the sampling distribution of sample means for the samples in question 60.
ANSWER:
60. If a population has a mean equal to 25 and a standard deviation equal to 5, give the mean of the
sample means and the standard error for each of the sample sizes 9, 100, 225, and 10,000,
respectively. What trend do you notice for the mean and standard error?
ANSWER:
n $\mu_{\bar{x}}$ $\sigma_{\bar{x}}$
9 25 1.667
100 25 0.500
225 25 0.333
10,000 25 0.050
The mean remains constant, but the standard error decreases as n increases.
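Note: the table above can be regenerated from $\sigma_{\bar{x}} = \sigma / \sqrt{n}$. The sketch below does so for µ = 25 and σ = 5; the helper name `standard_error` is an assumption introduced for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean for samples of size n."""
    return sigma / sqrt(n)

mu, sigma = 25, 5

# The mean of the sample means equals mu for every sample size.
for n in (9, 100, 225, 10_000):
    print(n, mu, round(standard_error(sigma, n), 3))
```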
ANSWER:
µ =30.0
ANSWER:
σ = 14.142
ANSWER:
64. Find the mean of the sample mean using your answer to question 66.
ANSWER:
$\mu_{\bar{x}} = \sum [\bar{x} \cdot P(\bar{x})] = 30.0$
65. Find the standard error of the mean using your answer to question 66.
ANSWER:
ANSWER:
67. What shape would you expect the distribution of all sample means to have?
ANSWER:
68. What value would you expect to find for the mean of the sample means?
ANSWER:
Approximately 58.5
69. What value would you expect to find for the standard deviation of the sample means?
ANSWER:
0.5
70. Make a list of all possible samples of size 2 that could be drawn with replacement from
this set of numbers.
ANSWER:
71. Construct the sampling distribution of sample means for the samples in question 70.
ANSWER:
72. A pair of dice is rolled 25 times, the sum of the dice observed each time, and the mean
of the 25 rolls is computed. This procedure is repeated 99 more times, and the 100
means are plotted on a histogram. The mean of the distribution will be close to what
number?
ANSWER:
73. Make a list of all possible samples of size 3 that could be drawn with replacement from
this set of numbers.
ANSWER:
74. Construct the sampling distribution of sample means for the samples in question 73.
ANSWER:
75. Make a list of all possible samples of size 2 that could be drawn with replacement from
this set of numbers.
ANSWER:
76. Construct the sampling distribution of sample means for the samples in question 75.
ANSWER:
77. Make a list of all the possible samples of size 3 that can be drawn from this set of
integers. (Sample with replacement; that is, the first number is drawn, observed, then
replaced before the next drawing.)
ANSWER:
78. Construct the sampling distribution of the sample medians for samples of size 3.
ANSWER:
x̃ P(x̃)
0 10/64
2 22/64
4 22/64
6 10/64
79. Construct the sampling distribution of the sample means for samples of size 3.
ANSWER:
x̄ P(x̄)
0/3 1/64
2/3 3/64
4/3 6/64
6/3 10/64
8/3 12/64
10/3 12/64
12/3 10/64
14/3 6/64
16/3 3/64
18/3 1/64
Assume that the average amount spent per month for long-distance calls through the long-
distance carrier is $38.25, and that the standard deviation is $11.75. If a sample of 100
customers is selected, the mean amount spent per month for long-distance calls of this sample
belongs to a sampling distribution.
ANSWER:
The shape of the sampling distribution of sample means is approximately normal, since n
= 100 is large and the Central Limit Theorem applies in this case.
ANSWER:
µ_x̄ = µ = $38.25
ANSWER:
83. Make a list of all samples of size 2 that can be drawn from this set of integers. (Sample
with replacement; that is, the first number is drawn, observed, then replaced before the
next drawing.)
ANSWER:
2,2 2,4 2,6 2,8
84. Construct the sampling distribution of sample means for samples of size 2 selected from
this set.
ANSWER:
x̄ 2 3 4 5 6 7 8
Consider a very small, finite population consisting of the set of odd integers
{3, 5, 7, 9, 11}.
85. Make a list of all samples of size 2 that can be drawn with replacement from this set of
integers. (Sample with replacement means that the first number is drawn, observed,
then replaced before the next drawing.)
86. Construct the sampling distribution of sample means for samples of size 2 selected from
this small population.
ANSWER:
x̄      3     4     5     6     7     8     9     10    11
P(x̄)   0.04  0.08  0.12  0.16  0.20  0.16  0.12  0.08  0.04
87. Calculate the mean of the sampling distribution of sample means in question 86.
ANSWER:
µ_x̄ = ∑ x̄ ⋅ P(x̄) = 7.0
ANSWER:
µ = ∑ x / N = 35 / 5 = 7.0
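The sampling distribution for this small population can be built by brute-force enumeration. The sketch below lists every ordered sample of size 2 drawn with replacement from {3, 5, 7, 9, 11} and tallies the sample means with exact fractions; the function name `sampling_distribution_of_means` is an illustrative choice, not something defined in the test bank.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [3, 5, 7, 9, 11]

def sampling_distribution_of_means(values, size):
    """Exact distribution of the sample mean over all equally likely ordered samples."""
    samples = list(product(values, repeat=size))  # sampling with replacement
    counts = Counter(Fraction(sum(s), size) for s in samples)
    return {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

dist = sampling_distribution_of_means(population, 2)
for mean, prob in dist.items():
    print(f"x-bar = {float(mean):>4}  P = {float(prob):.2f}")

# Mean of the sampling distribution versus mean of the population.
mean_of_means = sum(m * p for m, p in dist.items())
population_mean = Fraction(sum(population), len(population))
print(mean_of_means, population_mean)
```

Because the fractions are exact, the two means can be compared for equality rather than approximate agreement.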
89. Compare your answers to questions 87 and 88. What do you notice? What is your
conclusion?
ANSWER:
90. Make a list of all the possible samples of size 3 that can be drawn with replacement from
this set of integers.
ANSWER:
000 020 040 060 080 600 620 640 660 680
002 022 042 062 082 602 622 642 662 682
004 024 044 064 084 604 624 644 664 684
006 026 046 066 086 606 626 646 666 686
008 028 048 068 088 608 628 648 668 688
200 220 240 260 280 800 820 840 860 880
202 222 242 262 282 802 822 842 862 882
204 224 244 264 284 804 824 844 864 884
206 226 246 266 286 806 826 846 866 886
208 228 248 268 288 808 828 848 868 888
400 420 440 460 480
402 422 442 462 482
404 424 444 464 484
406 426 446 466 486
408 428 448 468 488
91. Construct the sampling distribution of the sample medians for samples of size 3.
ANSWER:
x̃ 0 2 4 6 8
92. Construct the sampling distribution of the sample means for samples of size 3.
ANSWER:
x̄ P(x̄)
0/3 0.008
2/3 0.024
4/3 0.048
6/3 0.080
8/3 0.120
10/3 0.144
12/3 0.152
14/3 0.144
16/3 0.120
18/3 0.080
20/3 0.048
22/3 0.024
24/3 0.008
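Both distributions for samples of size 3 can be reproduced by enumeration. This sketch assumes the population is {0, 2, 4, 6, 8}, which is inferred from the digits in the sample list above rather than stated in the question text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import median

population = [0, 2, 4, 6, 8]                   # assumed from the listed samples
samples = list(product(population, repeat=3))  # 5**3 = 125 ordered samples

# Tally the median and the sum (mean = sum / 3) of every sample.
median_counts = Counter(median(s) for s in samples)
sum_counts = Counter(sum(s) for s in samples)

median_dist = {m: Fraction(c, len(samples)) for m, c in sorted(median_counts.items())}
mean_dist = {Fraction(t, 3): Fraction(c, len(samples)) for t, c in sorted(sum_counts.items())}

for m, p in median_dist.items():
    print(f"median = {m}  P = {float(p):.3f}")
for m, p in mean_dist.items():
    print(f"mean = {m}  P = {float(p):.3f}")
```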
93. What does the sampling distribution of sample means (SDSM) say if all possible random
samples, each of size n, are taken from any population with mean µ and standard
deviation σ?
ANSWER:
The SDSM states that the sampling distribution of sample means will have a mean µ_x̄
equal to µ and a standard deviation σ_x̄ equal to σ/√n. Furthermore, if the
sampled population has a normal distribution, then the sampling distribution of x̄ will
also be normal for samples of all sizes.
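The two claims about the mean and standard deviation of the sample means can be illustrated by simulation. The sketch below uses a hypothetical right-skewed population (exponential with µ = 1 and σ = 1) that does not come from the test bank, along with an arbitrary seed and repetition count.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)      # fixed seed so the run is repeatable
n, repetitions = 36, 5000

# Exponential population with rate 1: mu = 1 and sigma = 1, strongly right-skewed.
sample_means = [
    fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(repetitions)
]

mean_of_means = fmean(sample_means)   # expected to be close to mu = 1
sd_of_means = pstdev(sample_means)    # expected to be close to sigma / sqrt(n) = 1/6
print(round(mean_of_means, 3), round(sd_of_means, 3))
```

A simulation only approximates the full set of all possible samples, so the two printed values should be near 1 and 0.167 rather than exactly equal to them.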
A certain population has a mean of 529 and a standard deviation of 29.7. Many samples of size
36 are randomly selected and means calculated.
94. What value would you expect to find for the mean of all these sample means? Why?
ANSWER:
529, since µ_x̄ = µ
95. What value would you expect to find for the standard deviation of all these sample
means?
ANSWER:
σ_x̄ = σ/√n = 29.7/√36 = 4.95
96. What shape would you expect the distribution of all these sample means to have?
Why?
ANSWER:
According to the Central Limit Theorem (n = 36 is large), we would expect the shape of the
distribution of all these sample means to be approximately normal.
Egyptians watch an average of 2.5 hours of television per person per day. If the standard
deviation for the number of hours of television watched per day is 1.6 and a random sample of
225 Egyptians is selected, the mean of this sample belongs to a sampling distribution.
ANSWER:
According to the Central Limit Theorem (n = 225 is large), we would expect the shape of this
sampling distribution to be approximately normal.
ANSWER:
µ_x̄ = µ = 2.5
ANSWER:
Suppose the mean annual consumption of chicken is 20.84 pounds per person, and that the
standard deviation for the consumption of chicken per person is 9.193 pounds. The mean
100. Describe the shape of this sampling distribution. Justify your answer.
ANSWER:
According to the Central Limit Theorem (n = 200 is large), we would expect the shape of this
sampling distribution to be approximately normal.
ANSWER:
µ_x̄ = µ = 20.84
ANSWER:
True-False Questions
103. A sugar company packages sugar in 5-pound bags. The amount of sugar per bag varies
according to a normal distribution and has a mean equal to 5.0 pounds and a standard
deviation equal to 0.05 pounds. The computation of probabilities of events involving
weights of individual bags of sugar will utilize the variable z = (x – 5.0) / 0.05, while the
computation of probabilities of events involving the weights of sample means for
samples of size n = 25 each will utilize the variable z = (x̄ – 5.0) / 0.01.
ANSWER: T
104. As the sample size n increases, the standard error of the sample means σ_x̄ becomes
smaller so that the distribution of sample means becomes much narrower.
ANSWER: T
105. The standard error of the sample mean increases as the sample size increases.
ANSWER: F
106. The shape of the distribution of sample means is always that of a normal distribution.
ANSWER: F
107. We need to take repeated samples in order to use the concept of the sampling
distribution.
ANSWER: T
Multiple-Choice Questions
108. A soft drink bottling machine is set to dispense soft drink into containers labeled 16
ounces. While the actual quantities vary, they are normally distributed with a mean of
109. A normally distributed population has a mean of 250 pounds and a standard deviation of
10 pounds. Given n = 20, what is the probability that this sample will have a mean value
between 245 and 255 pounds?
A) 0.9750
B) 0.4875
C) 0.3830
D) 0.0876
ANSWER: A
Short-Answer Questions
110. A manufacturer of light bulbs claims that the bulbs have a mean life of 800 hours with a
standard deviation of 20 hours. You test a random sample of 100 of these bulbs and find
a sample mean of 750 hours. Discuss the likelihood of the manufacturer’s claim.
ANSWER:
If the manufacturer’s claim is true, x̄ = 750 has a z-score of −25.0, an extremely unlikely
occurrence. Therefore, it seems unlikely the manufacturer’s claim is true.
111. Consider a population with a mean µ of 51 and a standard deviation σ of 5.1. Calculate
the z-score for an x̄ of 48.5 from a sample of size 36.
ANSWER:
z = (x̄ − µ)/(σ/√n) = (48.5 − 51)/(5.1/√36) = −2.94
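The z-score for a sample mean divides by the standard error rather than by σ itself. A minimal sketch of that computation, with the illustrative function name `z_for_sample_mean`:

```python
import math

def z_for_sample_mean(x_bar: float, mu: float, sigma: float, n: int) -> float:
    """z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

# Question 111: mu = 51, sigma = 5.1, n = 36, x_bar = 48.5
z = z_for_sample_mean(48.5, 51, 5.1, 36)
print(round(z, 2))
```

The same helper applies to the light-bulb question above, where x̄ = 750, µ = 800, σ = 20, and n = 100.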
ANSWER:
Recall that the area (probability) under the normal curve is always exactly one. So as the
width of the curve narrows, the height of the curve has to increase in order to maintain
this area.
113. A sugar company packages sugar in 5-pound bags. The amount of sugar per bag varies
according to a normal distribution. A sample of 15 bags is selected from the day's
production, and if the total weight of the sample is less than 74.5 pounds, the fill per bag
is increased. If the mean for the day is 5.00 pounds and the standard deviation is 0.05
pounds, what is the probability that the fill per bag will be increased?
ANSWER:
0.0049
114. If we are sampling from a normal population with a mean of 80 and a standard deviation
of 12, what size sample must be taken so that the middle 90% of the sampling
distribution of sample means falls between 78.35 and 81.65?
ANSWER:
144
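One way to confirm a required sample size is to search for the smallest n whose middle 90% limits fit inside the stated bounds. The sketch below does this with `statistics.NormalDist`; the function name `smallest_n` is illustrative, and the exact normal quantile (about 1.645) is used instead of a rounded table value.

```python
import math
from statistics import NormalDist

def smallest_n(mu, sigma, lower, upper, central_prob):
    """Smallest n for which the central limits of x-bar lie within [lower, upper]."""
    tail = (1 - central_prob) / 2
    n = 1
    while True:
        x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))
        if x_bar_dist.inv_cdf(tail) >= lower and x_bar_dist.inv_cdf(1 - tail) <= upper:
            return n
        n += 1

print(smallest_n(80, 12, 78.35, 81.65, 0.90))   # question 114
print(smallest_n(50, 5, 48.5, 51.5, 0.90))      # question 115
```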
115. If we are sampling from a normal population with a mean of 50 and a standard deviation
of 5, what size sample must be taken so that the middle 90% of the sampling distribution
of sample means falls between 48.5 and 51.5?
ANSWER:
Samples of size 10 are selected from a normal population with a mean of 35.5 and a standard
deviation of 6.5.
ANSWER:
0.5761
117. A sugar company packages sugar in 5-pound bags. The amount of sugar per bag varies
according to a normal distribution. A sample of 25 bags is selected from the day's
production, and if the mean of the sample is less than 4.98 pounds, the fill per bag is
increased. If the mean for the day is 5.00 pounds per bag and the standard deviation is
0.05 pounds, what is the probability that the fill per bag will be increased?
ANSWER:
0.0228
118. The daily production of product parts has lengths that are normally distributed with a
mean of 3.0 cm and a standard deviation of 0.05 cm. The daily production is 100%
inspected if a sample of 25 has a mean length that exceeds 3.02 cm or is less than 2.98
cm. What is the probability that a daily production is 100% inspected?
ANSWER:
0.0456
119. A sample of size 50 is selected from a normal distribution having a mean equal to 95
and a standard deviation equal to 15. What is the probability of selecting a sample
having a mean exceeding 100?
ANSWER:
0.0091
120. A normal population has a mean equal to 100 and a standard deviation equal to 5. If a
sample of size 25 is selected, what is the probability that the sample mean will be
between 98.04 and 101.96?
ANSWER:
0.95
121. A population has a mean equal to 50. To have only a 10% chance of getting a sample of
size 36 whose mean exceeds 52.5, what must the standard deviation equal?
ANSWER:
122. A random sample of 100 items is selected from a population having a mean equal to 75.
If there is a 20% probability that the sample mean will be at most 70 and assuming z
= −0.84, what would be the population standard deviation?
ANSWER:
59.52
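Solving z = (x̄ − µ)/(σ/√n) for σ gives σ = (x̄ − µ)√n / z. A short sketch of that rearrangement using the numbers in question 122; the function name `sigma_from_z` is illustrative.

```python
import math

def sigma_from_z(x_bar: float, mu: float, n: int, z: float) -> float:
    """Population standard deviation implied by a z-score for a sample mean."""
    return (x_bar - mu) * math.sqrt(n) / z

sigma = sigma_from_z(x_bar=70, mu=75, n=100, z=-0.84)
print(round(sigma, 2))
```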
123. A population has a mean equal to x and a standard deviation equal to y. Find the 90th
percentile for the distribution of sample means based on samples of size 64.
ANSWER:
x + 0.16y
ANSWER:
n=5
125. A normal population has a mean of 64 and a standard deviation of 10. If the probability
that a sample of size n = 25 will have a sample mean less than x̄ is 0.0062, find x̄.
ANSWER:
Sample mean = 59
126. If the probability that a sample of size n will have a mean greater than 77 is 0.2389, find
n.
ANSWER:
n = 20
127. If the probability that a sample of size n = 100 will have a sample mean of at least x̄ is
0.9452, find x̄.
ANSWER:
x̄ = 73
ANSWER:
x̄ = 77
Individual scores of a placement examination are normally distributed with a mean of 84.2 and a
standard deviation of 12.8.
129. If the score of an individual is randomly selected, find the probability that the score will
be less than 90.0.
ANSWER:
0.6736
130. If a random sample of size n = 20 is selected, find the probability that the sample mean
will be less than 90.0.
ANSWER:
0.9788
131. The mean of a population is 64, and its standard deviation is 12. Samples of size n = 40
are randomly selected. Find a value of k such that 90% of all such samples will have a
mean x̄ such that 64 − k < x̄ < 64 + k.
ANSWER:
k = 3.13
Assume that the population of heights of male college students is normally distributed with
mean µ of 68 inches and standard deviation σ of 3.75 inches. A random sample of 16 heights
is obtained.
ANSWER:
Heights are normally distributed with mean µ = 68 and standard deviation σ = 3.75.
133. Find the proportion of male college students whose height is greater than 70 inches.
ANSWER:
P(x > 70) = P[z > (70 – 68)/3.75] = P(z > 0.53) = 0.5000 – 0.2019 = 0.2981
ANSWER:
The sample means x̄ will be normally distributed, since the sampled population is normal.
ANSWER:
µ_x̄ = µ = 68; σ_x̄ = σ/√n = 3.75/√16 = 0.9375
ANSWER:
P(x̄ > 70) = P[z > (70 – 68)/0.9375] = P(z > 2.13) = 0.5000 – 0.4834 = 0.0166
ANSWER:
P(x̄ < 67) = P[z < (67 – 68)/0.9375] = P(z < −1.07) = 0.5000 – 0.3577 = 0.1423
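The contrast between an individual height and a sample mean of 16 heights can be sketched with two normal distributions that share a mean but differ in spread. The code uses unrounded z-values, so its results should agree with table-based answers only to about two or three decimal places.

```python
import math
from statistics import NormalDist

mu, sigma, n = 68, 3.75, 16

individual = NormalDist(mu, sigma)                  # distribution of one height x
sample_mean = NormalDist(mu, sigma / math.sqrt(n))  # distribution of x-bar

p_individual_over_70 = 1 - individual.cdf(70)
p_mean_over_70 = 1 - sample_mean.cdf(70)
p_mean_under_67 = sample_mean.cdf(67)

print(f"P(x > 70)     = {p_individual_over_70:.4f}")
print(f"P(x-bar > 70) = {p_mean_over_70:.4f}")
print(f"P(x-bar < 67) = {p_mean_under_67:.4f}")
```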
138. Find the probability that the wind speed on any one reading will exceed 13.5 miles per hour.
ANSWER:
Since µ = 12 and σ = 3.4, then
P(x > 13.5) = P[z > (13.5 – 12)/3.4] = P(z > 0.44) = 0.5000 – 0.1700 = 0.3300
139. Find the probability that the mean of a random sample of 9 readings exceeds 13.5 miles per hour.
ANSWER:
Since µ = 12 and σ = 3.4, then
P(x̄ > 13.5) = P[z > (13.5 – 12)/(3.4/√9)] = P(z > 1.32) = 0.5000 – 0.4066 = 0.0934
ANSWER:
It is hard to tell if the assumption of normality is reasonable or not without studying wind speeds
more extensively. However, it would not be surprising if wind speeds have a mounded distribution
that could reasonably be approximated by the normal distribution. One might also expect the
distribution to be skewed to the right since very high winds can occur. However, the assumption
of normality seems reasonable.
141. What effect do you think the assumption of normality had on the answers to 138 and 139?
Explain.
ANSWER:
The assumption of normality allowed the use of the normal probability distribution to estimate the
probabilities.
142. Find the probability that the mean of the sample is less than $445.
ANSWER:
P(x̄ < 445) = P[z < (445 – 450)/(50/√100)] = P(z < −1.0) = 0.5000 – 0.3413 = 0.1587
143. Find the probability that the mean of the sample is between $445 and $455.
ANSWER:
P(445 < x̄ < 455) = P[(445 – 450)/(50/√100) < z < (455 – 450)/(50/√100)]
= P(−1.0 < z < 1.0) = 2(0.3413) = 0.6826
144. Find the probability that the mean of the sample is greater than $460.
ANSWER:
P(x̄ > 460) = P[z > (460 – 450)/(50/√100)] = P(z > 2.0) = 0.5000 – 0.4772 = 0.0228
ANSWER:
The sample size is large; n = 100 is greater than 30, so the Central Limit Theorem applies.
146. What percentage of the oranges in this orchard have diameters less than 4.5 inches?
ANSWER:
P(x < 4.5) = P[z < (4.5 – 5.26)/0.5] = P(z< -1.52) = 0.5000 – 0.4357 = 0.0643 or 6.43%
147. What percentage of the oranges in this orchard is larger than 5.12 inches?
ANSWER:
P(x >5.12) = P[z > (5.12 – 5.26)/0.5] = P(z >-0.28) = 0.5000 + 0.1103 = 0.6103 or 61.03%
148. A random sample of 100 oranges is gathered and the mean diameter obtained was x̄ = 5.12. If
another sample of size 100 is taken, what is the probability that its sample mean will be greater
than 5.12 inches?
ANSWER:
P(x̄ > 5.12) = P[z > (5.12 – 5.26)/(0.5/√100)] = P(z > −2.80) = 0.5000 + 0.4974 = 0.9974
149. Why is the z-score used in answering questions 146, 147, and 148?
ANSWER:
z is used in questions 146 and 147 since the distribution of x is given to be normal, and it is also
used in question 148 since the sampling distribution of x̄ is normal. (The sampled population is
normal.)
150. Why is the z-formula used in question 148 different from that used in questions 146 and 147?
ANSWER:
Questions 146 and 147 deal with the distribution of individual x-values, while question 148 deals
with the sampling distribution of x̄ values.
151. A manufacturer of light bulbs says that its light bulbs have a mean life of 800 hours and a
standard deviation of 120 hours. You purchased 169 of these bulbs with the idea that you would
purchase more if the mean life of your sample were more than 780 hours. What is the probability
that you will not buy again from this manufacturer?
ANSWER:
Given information: µ = 800, σ = 120, and n = 169
P(x̄ < 780) = P[z < (780 – 800)/(120/√169)] = P(z < −2.17) = 0.500 – 0.485 = 0.015
152. A tire manufacturer claims (based on years of experience with its tires) that the mean mileage is
45,000 miles and the standard deviation is 6000 miles. A consumer agency randomly selects
100 of these tires and finds a sample mean of 41,000. Should the consumer agency doubt the
manufacturer’s claim?
153. The baggage weights for passengers using a domestic airline are normally distributed with a
mean of 22 lbs. and a standard deviation of 4 lbs. If the limit on total luggage weight is 2250 lbs.,
what is the probability that the limit will be exceeded for 100 passengers?
ANSWER:
Given information: µ = 22, σ = 4, and n = 100. Let ∑x represent the total baggage weight for
the 100 passengers:
P(∑x > 2250) = P(x̄ > 2250/100) = P(x̄ > 22.5)
= P[z > (22.5 – 22)/(4/√100)] = P(z > 1.25) = 0.5000 – 0.3944 = 0.1056
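A limit on total weight can be handled either through the distribution of the total or through the distribution of the sample mean; both routes give the same probability. The sketch below relies on the standard result that the sum of n independent normal weights has mean nµ and standard deviation σ√n.

```python
import math
from statistics import NormalDist

mu, sigma, n, limit = 22, 4, 100, 2250

# Route 1: distribution of the total weight of n bags.
total = NormalDist(n * mu, sigma * math.sqrt(n))
p_total = 1 - total.cdf(limit)

# Route 2: distribution of the mean weight per bag.
mean = NormalDist(mu, sigma / math.sqrt(n))
p_mean = 1 - mean.cdf(limit / n)

print(f"{p_total:.4f} {p_mean:.4f}")
```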
A random sample of size 36 is to be selected from a population that has a mean µ of 75 and a
standard deviation σ of 15.
154. This sample of 36 has a mean value of x̄ which belongs to a sampling distribution. Find
the shape of this sampling distribution.
ANSWER:
ANSWER:
µ_x̄ = µ = 75
σ_x̄ = σ/√n = 15/√36 = 2.5
157. What is the probability that this sample mean will be between 68 and 82?
ANSWER:
P(68 < x̄ < 82) = P(−2.8 < z < 2.8) = 2(0.4974) = 0.9948
158. What is the probability that the sample mean will have a value greater than 72?
ANSWER:
159. What is the probability that the sample mean will be within 2 units of the mean?
ANSWER:
P(73 < x̄ < 77) = P(−0.8 < z < 0.8) = 2(0.2881) = 0.5762
160. What is the probability that the sample mean will be within 3 units of the mean?
ANSWER:
P(72 < x̄ < 78) = P(−1.2 < z < 1.2) = 2(0.3849) = 0.7698
Consider the approximately normal population of weights of female college students with mean
µ of 118 pounds and standard deviation σ of 6.8 pounds. A random sample of 16 weights is
obtained.
ANSWER:
162. Find the proportion of female college students whose weight is greater than 120 pounds.
ANSWER:
ANSWER:
The distribution of x̄, the mean of samples of size 16, will be approximately normally
distributed.
ANSWER:
165. Find the probability that the sample mean weight exceeds 121 pounds.
ANSWER:
166. Find the probability that the sample mean weight is less than 114 pounds.
167. Find the probability that the sample mean weight is between 116 and 121 pounds.
ANSWER:
P(116 < x̄ < 121) = P(−1.18 < z < 1.76) = 0.3810 + 0.4608 = 0.8418
168. Within what limits does the middle 95% of the sampling distribution of sample means for
samples of size 16 fall?
ANSWER:
−1.96 = (x̄ − 118)/1.7 ⇒ x̄ − 118 = −3.332 ⇒ x̄ = 114.668 ≈ 114.7 pounds, and
1.96 = (x̄ − 118)/1.7 ⇒ x̄ − 118 = 3.332 ⇒ x̄ = 121.332 ≈ 121.3 pounds
Therefore, the middle 95% of the sampling distribution of sample mean weights of
female college students is bounded by 114.7 pounds and 121.3 pounds.
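The same limits can be read off as the 2.5th and 97.5th percentiles of the sampling distribution of x̄. A minimal sketch using `statistics.NormalDist`:

```python
import math
from statistics import NormalDist

mu, sigma, n = 118, 6.8, 16
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))   # standard error = 1.7

lower = x_bar_dist.inv_cdf(0.025)   # cuts off the lowest 2.5%
upper = x_bar_dist.inv_cdf(0.975)   # cuts off the highest 2.5%
print(f"{lower:.1f} to {upper:.1f} pounds")
```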
A recent study showed that the average amount that high school graduates in the USA spend on
their open house is $932. Assume that amounts spent are normally distributed with a standard
deviation of $348, and that open houses for 36 high school graduates are randomly selected
from Lansing, Michigan.
169. Describe the distribution of x̄, the sample average amount spent on open houses of high
school graduates.
ANSWER:
ANSWER:
171. Find the probability that the sample mean cost to have an open house is between $816
and $874.
ANSWER:
P(816 < x̄ < 874) = P(−2.0 < z < −1.0) = 0.4772 – 0.3413 = 0.1359
172. Find the probability that the sample mean cost to have an open house is higher than
$1042.
ANSWER:
A recent report in a women's magazine stated that the average age for women to marry in the
United States is now 25 years of age, and that the standard deviation is assumed to be 3.2
years. A sample of 50 U.S. women is randomly selected.
173. Describe the distribution of x̄, the sample average age for women to marry in the United
States.
ANSWER:
ANSWER:
175. Find the probability that the sample mean age for women to marry is at most 24 years.
ANSWER:
176. Find the probability that the sample mean age for women to marry is more than 25.5
years.
ANSWER:
177. Find the probability that the sample mean age for women to marry is between 24 and 25
years.
ANSWER:
Chapter 8
True-False Questions
1. A confidence interval estimate for µ will always contain the corresponding point estimate
for µ .
ANSWER: T
ANSWER: T
ANSWER: T
4. The objective of inferential statistics is to use the information contained in the sample
data to increase our knowledge of the sample.
ANSWER: F
5. If the maximum error E is expressed as a multiple of the standard deviation σ , then the
actual value of σ is not needed in order to calculate the sample size.
ANSWER: T
6. The sample mean, x̄, is the point estimate (single number value) for the mean µ of the
sampled population.
ANSWER: F
8. The Central Limit Theorem can only be applied to large samples when the data provide
a strong indication of a unimodal distribution that is approximately symmetric.
ANSWER: F
ANSWER: T
10. The sampling distribution of sample means (SDSM) and the Central Limit Theorem
provide the information needed to describe how close the point estimate, s, is expected
to be to the population standard deviation, σ .
ANSWER: F
11. z(α/2) in the formula x̄ ± z(α/2) ⋅ σ/√n is the confidence coefficient. It is the number of
multiples of the standard error needed to formulate an interval estimate of the correct
width to have a level of confidence of 1 − α.
ANSWER: T
Multiple-Choice Questions
12. When estimating a population mean with a confidence interval estimate, then E is:
13. Suppose you selected 200 different samples from a large population and used each
sample to construct a 0.95 confidence interval estimate for the population mean. How
many of the 200 confidence interval estimates should you expect to actually contain the
population mean µ ?
A) 200
B) 190
C) 100
D) 95
ANSWER: B
14. What value is always located at the center of a confidence interval for µ ?
A) E
B) µ
C) x̄
D) σ
ANSWER: C
15. You are constructing a 95% confidence interval using the following information: n = 60,
x̄ = 65.5, s = 2.5, and E = 0.7. What is the value of the middle of the interval?
A) 0.7
B) 2.5
C) 0.95
D) 65.5
ANSWER: D
A) σ/√n is the standard error of the mean, or the standard deviation of the sampling
distribution of sample means.
B) z(α/2) ⋅ σ/√n is the width of the confidence interval (the product of the confidence
coefficient and the standard error) and is called the maximum error of estimate, E.
C) The higher the level of confidence, the more likely the interval is to contain the
parameter, and the narrower the interval, the more precise the estimation.
D) None of the above.
ANSWER: B
A) The confidence interval has two basic characteristics that determine its quality: its
level of confidence and its width.
B) It is preferred that the confidence interval has a high level of confidence and be
precise (narrow) at the same time.
Short-Answer Questions
20. Discuss the difference between a point estimate for a parameter and an interval estimate
for a parameter.
ANSWER:
Point estimate for a parameter is a single value, the value of the corresponding sample
statistic. An interval estimate is an interval bounded by two values.
21. Five hundred confidence intervals, each having level of confidence 85%, were computed
for population mean µ . Approximately how many of the confidence intervals would not
capture µ ?
ANSWER:
75
22. When a (1 – α ) 100% confidence interval is formed for µ , what is the probability that the
interval will not contain µ within its limits?
ANSWER:
24. If the sample mean is used to estimate µ and a maximum error of estimate is specified,
then n may be determined for a known standard deviation and a given level of
confidence. If the maximum error of estimate is doubled, what is the effect on the
required sample size?
ANSWER:
25. Does decreasing the sample size increase or decrease the width of the confidence
interval for a particular parameter (all other things remaining the same)?
ANSWER:
26. Consider the statement: “The variance among the test scores on last week’s exam in
your statistics class was 112”. Identify each numerical value that appears above by name
(mean, variance, etc.) and by symbol (x̄, σ, etc.)
ANSWER:
27. Explain the difference between a point estimate and an interval estimate.
ANSWER:
28. Consider the statement: “The mean height of a sample of 50 senior high school boys is
68 inches”. Identify each numerical value that appears above by name (mean, variance,
etc.) and by symbol (x̄, σ, etc.)
ANSWER:
29. Explain the difference between an interval estimate and confidence interval.
ANSWER:
An interval estimate is an interval bounded by two values and used to estimate the value
of a population parameter. The values that bound the interval are statistics calculated
from the sample that is being used as the basis for the estimation. A confidence interval
is an interval estimate with a specified level of confidence.
30. Consider the statement: “The standard deviation for I.Q. scores is 12.3”. Identify each
numerical value that appears above by name (mean, variance, etc.) and by symbol (x̄, σ,
etc.)
ANSWER:
31. The number 1.96 in the formula x̄ ± 1.96 ⋅ σ/√n is the confidence coefficient. What does this
mean?
32. Consider the statement: “The mean height of all cadets who have ever entered West
Point is 69 inches”. Identify each numerical value that appears above by name (mean,
variance, etc.) and by symbol (x̄, σ, etc.)
ANSWER:
Population mean = µ = 69
33. What value would the standard deviation need to be in order for x̄ (based on 150
observations) to estimate µ with a maximum error of estimate equal to 0.15 and with
95% confidence?
ANSWER:
0.94
34. A sample of size 40 is taken from a population having σ = 2.7. If the mean of the
sample equals 48.5, then give a point estimate for µ and find an 85% confidence interval
for µ .
ANSWER:
ANSWER:
(217 to 233)
36. What sample size would be needed to estimate the population mean to within one-half
standard deviation with 95% confidence?
ANSWER:
16
37. A machine is programmed to put 737 grams of salt in a container. Due to uncontrolled
variation in the process, there is variation in content from container to container. To
estimate the mean amount of salt per container, a sample of 50 boxes is selected and x̄
= 739.5 grams. From experience with the machine, it is known that σ = 7.5 grams.
Find a 90% confidence interval for µ .
ANSWER:
(737.7 to 741.3)
38. A 95% confidence interval estimate for a population mean was computed to be (44.8 to
50.2). Determine the mean of the sample, which was used to determine the interval
estimate.
ANSWER:
x̄ = 47.5
A sample was selected from a normal population with a standard deviation σ = 6.1. The
sample values are 114, 120, 108, 118, 119, 123, 117, 124, 115, and 129.
39. Construct a confidence interval estimate of the population mean with 0.90 level of
confidence.
ANSWER:
(115.52 to 121.88)
40. Construct a confidence interval estimate of the population mean with 0.95 level of
confidence.
ANSWER:
(114.92 to 122.48)
41. Construct a confidence interval estimate of the population mean with 0.99 level of
confidence.
ANSWER:
(113.72 to 123.68)
42. Based on your answers to questions 39, 40, and 41, what is the relationship between the
level of confidence and the width of the confidence interval?
ANSWER:
The larger the level of confidence, the wider the width of the confidence interval.
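The widening of the interval with the level of confidence can be seen by computing all three intervals from the ten sample values. This sketch uses exact normal quantiles from `statistics.NormalDist`, so the endpoints may differ in the second decimal place from answers based on rounded table values such as 1.65 or 2.58; the function name `z_interval` is illustrative.

```python
import math
from statistics import NormalDist, fmean

data = [114, 120, 108, 118, 119, 123, 117, 124, 115, 129]
sigma = 6.1   # known population standard deviation

def z_interval(values, sigma, confidence):
    """Confidence interval for mu when sigma is known: x-bar +/- z(alpha/2) * sigma / sqrt(n)."""
    x_bar = fmean(values)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    e = z * sigma / math.sqrt(len(values))
    return x_bar - e, x_bar + e

for level in (0.90, 0.95, 0.99):
    low, high = z_interval(data, sigma, level)
    print(f"{level:.0%}: ({low:.2f} to {high:.2f}), width = {high - low:.2f}")
```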
A random sample of the amount paid for taxi fare from downtown to the airport was obtained
and produced the following summary statistics: n = 15, ∑x = 301, ∑x² = 6159.
ANSWER:
x̄ = ∑x/n = 301/15 = 20.0667
ANSWER:
ANSWER:
s = √s² = √8.4952 = 2.9146
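When only the summary statistics n, ∑x, and ∑x² are available, the sample variance follows from the shortcut formula s² = [∑x² − (∑x)²/n]/(n − 1). A minimal sketch with the illustrative function name `sample_variance_from_sums`:

```python
import math

def sample_variance_from_sums(n: int, sum_x: float, sum_x2: float) -> float:
    """Shortcut formula: s^2 = (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

n, sum_x, sum_x2 = 15, 301, 6159
x_bar = sum_x / n
s2 = sample_variance_from_sums(n, sum_x, sum_x2)
s = math.sqrt(s2)
print(f"mean = {x_bar:.4f}, variance = {s2:.4f}, standard deviation = {s:.4f}")
```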
46. Find the level of confidence assigned to an interval estimate of the mean formed using the
interval x̄ − 1.28 ⋅ σ_x̄ to x̄ + 1.28 ⋅ σ_x̄.
ANSWER:
47. Find the level of confidence assigned to an interval estimate of the mean formed using the
interval x̄ − 1.75 ⋅ σ_x̄ to x̄ + 1.75 ⋅ σ_x̄.
ANSWER:
48. Find the level of confidence assigned to an interval estimate of the mean formed using the
interval x̄ − 1.96 ⋅ σ_x̄ to x̄ + 1.96 ⋅ σ_x̄.
ANSWER:
49. Find the level of confidence assigned to an interval estimate of the mean formed using the
interval x̄ − 2.33 ⋅ σ_x̄ to x̄ + 2.33 ⋅ σ_x̄.
ANSWER:
50. Find the level of confidence assigned to an interval estimate of the mean formed using the
interval x̄ − 2.75 ⋅ σ_x̄ to x̄ + 2.75 ⋅ σ_x̄.
ANSWER:
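Questions 46 to 50 all rest on one relationship: an interval that extends z standard errors on each side of x̄ has a level of confidence equal to the central area under the standard normal curve between −z and z. A sketch of that calculation, with the illustrative function name `confidence_level`:

```python
from statistics import NormalDist

def confidence_level(z: float) -> float:
    """Central area between -z and +z under the standard normal curve."""
    return 2 * NormalDist().cdf(z) - 1

for z in (1.28, 1.75, 1.96, 2.33, 2.75):
    print(f"z = {z}: level of confidence = {confidence_level(z):.4f}")
```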
Consider the information: the sampled population is normally distributed, the population
standard deviation σ = 10.4, the sample size n = 60, and the sample mean x̄ = 81.2.
ANSWER:
ANSWER:
With 98% confidence we can say the population mean µ is between 78.072 and 84.328.
ANSWER:
In a recent article, it was reported that the mean percentile score on the California Achievement
Test (CAT) for 20 students was 59.80. Assume the population of CAT scores is normally
distributed and that σ = 20.5.
54. Find a point estimate for the mean of the population the sample represents.
55. Find the maximum error of estimate for a level of confidence equal to 95%.
ANSWER:
E = z(α/2) ⋅ σ/√n = (1.96)(20.5/√20) = 8.985
ANSWER:
x̄ ± E = 59.80 ± 8.985. Then, the 95% confidence interval for µ is 50.815 to 68.785.
57. Explain the meaning of the answers to questions 54, 55, and 56.
ANSWER:
The above answers are the main parts of the 95% confidence interval for the population
mean µ .
Given the following information: the sample size n = 20, the sample mean x̄ = 75.3, and the
population standard deviation σ = 6.0.
ANSWER:
The parameter of interest is µ. Normality cannot be assumed for x; with n = 20, the
Central Limit Theorem does not assure us that x̄ will be approximately normal either. It
may be meaningless to complete the procedure. However, since σ = 6.0 and 1 − α = 0.99,
x̄ ± E = 75.3 ± 3.46, and the 99% confidence interval for µ is 71.84 to 78.76.
No; the distribution for the variable x is unknown, and n = 20 is not large enough to
satisfy the Central Limit Theorem. The resulting interval is likely to have a level of
confidence that is unknowingly less than 99%.
60. How large a sample should be taken if the population mean is to be estimated with 99%
confidence to within $72? The population has a standard deviation of $800.
ANSWER:
By measuring the amount of time it takes a component of a product to move from one
workstation to the next, an engineer has estimated that the standard deviation is 4.5 seconds.
61. How many measurements should be made in order to be 95% certain that the maximum
error of estimation will not exceed 1 second?
ANSWER:
n = [z(α/2) ⋅ σ/E]² = [(1.96)(4.5)/1]² = 77.79, or 78
ANSWER:
n = [z(α/2) ⋅ σ/E]² = [(1.96)(4.5)/2]² = 19.45, or 20
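The sample-size formula always rounds up, because a fractional observation cannot be taken and rounding down would leave the error slightly above the target. A minimal sketch, with the illustrative function name `required_sample_size`:

```python
import math
from statistics import NormalDist

def required_sample_size(sigma: float, max_error: float, confidence: float) -> int:
    """n = (z(alpha/2) * sigma / E)^2, rounded up to the next whole number."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / max_error) ** 2)

print(required_sample_size(4.5, 1, 0.95))   # maximum error of 1 second
print(required_sample_size(4.5, 2, 0.95))   # maximum error of 2 seconds
```

Doubling the maximum error divides the unrounded sample size by four, which is why the second result is roughly a quarter of the first.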
Waiting times (in hours) at a popular restaurant are believed to be approximately normally
distributed with a standard deviation of 1.5 hours during busy periods.
ANSWER:
µ = the mean waiting time (in hours) at a popular restaurant. Normality indicated. Since
σ = 1.5, 1 − α = 0.95, n = 20, and x̄ = 1.58, then
64. Suppose that the mean of 1.58 hours had resulted from a sample of 32 customers. Find
the 95% confidence interval.
ANSWER:
µ = the mean waiting time (in hours) at a popular restaurant. Normality indicated. Since
σ = 1.5, 1 − α = 0.95, n = 32, and x̄ = 1.58, then α/2 = 0.025 and z(0.025) = 1.96.
Hence, the maximum error of estimate is E = z(α/2) ⋅ σ/√n = (1.96)(1.5/√32) = 0.52.
Then x̄ ± E = 1.58 ± 0.52, and the 95% confidence interval for µ is 1.06 to 2.10.
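The effect of sample size on the interval can be seen by computing the maximum error of estimate for both sample sizes mentioned for the waiting-time data. In this sketch the function name `max_error` is illustrative.

```python
import math
from statistics import NormalDist

def max_error(sigma: float, n: int, confidence: float) -> float:
    """Maximum error of estimate E = z(alpha/2) * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / math.sqrt(n)

x_bar, sigma = 1.58, 1.5
for n in (20, 32):
    e = max_error(sigma, n, 0.95)
    print(f"n = {n}: E = {e:.2f}, interval = {x_bar - e:.2f} to {x_bar + e:.2f}")
```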
65. What effect does a larger sample size have on the confidence interval?
ANSWER:
66. An automobile manufacturer wants to estimate the mean gasoline mileage that its
customers will obtain with its new compact model. How many sample runs must be
performed in order that the estimate be accurate to within 0.25 mpg at 95% confidence?
(Assume that σ = 2.0.)
ANSWER:
A random sample of taxi fares (in dollars) from Big Rapids to Ford International airport in Grand
Rapids, Michigan, was obtained: 55, 59, 57, 63, 61, 57, 56, 58, 52, 58, 60, 62, 55, 58, and 60.
ANSWER:
x̄ = ∑x/n = 871/15 = 58.067
ANSWER:
s² = [∑x² − (∑x)²/n]/(n − 1) = [50,695 − (871)²/15]/14 = 118.933/14 = 8.495
ANSWER:
s = √s² = √8.495 = 2.915
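The three summary values for the taxi fares can be cross-checked directly from the raw data with Python's `statistics` module, whose `variance` and `stdev` functions use the n − 1 divisor.

```python
from statistics import mean, stdev, variance

fares = [55, 59, 57, 63, 61, 57, 56, 58, 52, 58, 60, 62, 55, 58, 60]

x_bar = mean(fares)    # sample mean
s2 = variance(fares)   # sample variance (divides by n - 1)
s = stdev(fares)       # sample standard deviation

# Intermediate sums used by the shortcut formula for the variance.
sum_x = sum(fares)
sum_x2 = sum(x * x for x in fares)
print(sum_x, sum_x2, round(x_bar, 3), round(s2, 3), round(s, 3))
```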
70. Find the level of confidence assigned to an interval estimate of µ formed using the
interval x -1.15 ⋅σ x to x +1.15 ⋅σ x .
ANSWER:
ANSWER:
72. Find the level of confidence assigned to an interval estimate of µ formed using the
interval x - 2.17 ⋅σ x to x +2.17 ⋅σ x .
ANSWER:
73. Find the level of confidence assigned to an interval estimate of µ formed using the
interval x̄ − 2.58 ⋅ σ_x̄ to x̄ + 2.58 ⋅ σ_x̄.
ANSWER:
74. Find the level of confidence assigned to an interval estimate of µ formed using the
interval x̄ − 1.96 ⋅ σ_x̄ to x̄ + 1.96 ⋅ σ_x̄.
ANSWER:
75. Find the level of confidence assigned to an interval estimate of µ formed using the
interval x̄ − 2.33 ⋅ σ_x̄ to x̄ + 2.33 ⋅ σ_x̄.
ANSWER:
76. Find the level of confidence assigned to an interval estimate of µ formed using the
interval x̄ − 1.645 ⋅ σ_x̄ to x̄ + 1.645 ⋅ σ_x̄.
ANSWER:
ANSWER:
ANSWER:
79. Determine the value of the confidence coefficient z (α / 2 ) for 98% confidence.
ANSWER:
80. Determine the value of the confidence coefficient z (α / 2 ) for 99% confidence.
ANSWER:
81. Determine the level of the confidence given the confidence coefficient z (α / 2 ) =1.645.
ANSWER:
82. Determine the level of the confidence given the confidence coefficient z (α / 2 ) =1.96.
ANSWER:
83. Determine the level of the confidence given the confidence coefficient z (α / 2 ) =2.575.
ANSWER:
84. Determine the level of the confidence given the confidence coefficient z (α / 2 ) =2.05.
ANSWER:
85. Determine the level of the confidence given the confidence coefficient z (α / 2 ) =2.88.
ANSWER:
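For the questions above that convert between a confidence coefficient z(α/2) and a level of confidence, the relationship is level = 2Φ(z) − 1, where Φ is the standard normal cumulative distribution. The Python sketch below uses `statistics.NormalDist` for this; the function names are illustrative assumptions. Exact quantiles differ slightly from rounded table values (for example, 2.576 versus the tabled 2.575 for 99%).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def confidence_level(z: float) -> float:
    """Area between -z and +z under the standard normal curve."""
    return 2 * STD_NORMAL.cdf(z) - 1


def confidence_coefficient(level: float) -> float:
    """z(alpha/2) for a given level of confidence 1 - alpha."""
    alpha = 1 - level
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


print(f"{confidence_level(1.96):.4f}")        # level for z = 1.96
print(f"{confidence_coefficient(0.98):.3f}")  # z for 98% confidence
```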
Consider a random sample of size n = 100, and mean x =125. Assume that the population
standard deviation σ =15.
ANSWER:
x̄ ± E = 125 ± 2.4675. Hence the 90% confidence interval for µ is 122.5325 to 127.4675.
ANSWER:
A sample of size 100 should be large enough for the Central Limit Theorem to apply and
ensure that the sampling distribution of sample means will be normally distributed.
Consider a random sample of size n = 20, and mean x =70.3. Assume that the population
standard deviation σ = 5.4.
ANSWER:
x̄ ± E = 70.3 ± 3.109. Hence the 99% confidence interval for µ is 67.191 to 73.409.
ANSWER:
The assumptions are not satisfied since the distribution for the variable x is unknown,
and a sample of size n = 20 is not large enough to satisfy the Central Limit Theorem and
assure us that x will be approximately normal. The resulting interval is likely to have a
level of confidence that is unknowingly less than 99%.
90. Discuss the effect that the point estimate has on the confidence interval for µ .
ANSWER:
The point estimate is the center of the confidence interval; as it changes in value, the
interval “slides” along the number line, but does not change in width.
91. Discuss the effect that the level of confidence has on the confidence interval for µ .
ANSWER:
92. Discuss the effect that the sample size has on the confidence interval for µ .
ANSWER:
When the sample size increases, the denominator in the confidence interval formula
increases, causing the maximum error to decrease; thus the confidence interval
decreases in width. Conversely, if the sample size decreases, the denominator decreases,
the maximum error increases, and the width of the confidence interval increases.
93. Discuss the effect that the variability of the characteristic being measured has on the
confidence interval for µ .
ANSWER:
The lengths of 225 fish caught in Lake Michigan had a mean of 15.0 inches. Assume that the
population standard deviation is 2.5 inches.
ANSWER:
x̄ = 15
ANSWER:
96. Find the 90% confidence interval for the population mean length.
ANSWER:
ANSWER:
ANSWER:
99. What is the effect of increasing the level of confidence from 0.90 to 0.98 on the
maximum error of estimate for µ ?
ANSWER:
When the level of confidence increases from 0.90 to 0.98, the confidence coefficient
z(α/2) increases from 1.645 to 2.33; and thus the maximum error of estimate for µ
increases from 0.274 to 0.388.
100. What is the effect of increasing the level of confidence from 0.90 to 0.98 on the width of
the confidence interval for µ ?
ANSWER:
When the level of confidence increases from 0.90 to 0.98, the confidence coefficient
z(α/2) also increases from 1.645 to 2.33; and the maximum error of estimate E for µ
increases from 0.274 to 0.388. As a result, the width of the confidence interval increases
from 0.548 to 0.776.
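The effect of the confidence level on the maximum error can be tabulated with a short Python sketch for the fish-length data (σ = 2.5, n = 225). The helper `max_error` is a hypothetical name, and it uses exact normal quantiles from `statistics.NormalDist` rather than the rounded table values 1.645 and 2.33, so later decimals can differ slightly from the hand computation.

```python
import math
from statistics import NormalDist


def max_error(level: float, sigma: float, n: int) -> float:
    """Maximum error of estimate E = z(alpha/2) * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * sigma / math.sqrt(n)


# Fish lengths: sigma = 2.5 inches, n = 225
margins = {level: max_error(level, 2.5, 225) for level in (0.90, 0.95, 0.98, 0.99)}
for level, e in margins.items():
    print(f"{level:.2f}: E = {e:.3f}, width = {2 * e:.3f}")
```

With σ and n held fixed, every increase in the level of confidence widens the interval.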
A certain adjustment to a machine will change the length of the parts it is making but will not
affect the standard deviation. The length of the parts is normally distributed, and the standard
deviation is 0.5mm. After an adjustment is made, a random sample is taken to determine the
mean length of parts now being produced. The resulting lengths are: 78.0, 78.7, 77.1, 79.0,
79.7, 77.6, 79.7, 79.2, 78.5, and 77.7.
The parameter of interest is the mean length of parts being produced after adjustment.
102. Find the point estimate for the mean length of all parts now being produced.
ANSWER:
x̄ = 78.52
ANSWER:
x̄ ± E = 78.52 ± 0.407. Hence the 99% confidence interval for µ is 78.113 to 78.927.
By measuring the amount of time it takes a component of a product to move from one
workstation to the next, an engineer has estimated that the standard deviation is 6 seconds.
104. How many measurements should be made in order to be 95% certain that the maximum
error of estimation will not exceed 1.5 seconds?
ANSWER:
n = [z(α/2) ⋅ σ / E]^2 = [(1.96)(6.0)/1.5]^2 = 61.47, or 62
105. What sample size is required for a maximum error of 3.0 seconds?
ANSWER:
n = [z(α/2) ⋅ σ / E]^2 = [(1.96)(6.0)/3.0]^2 = 15.37, or 16
106. How large a sample would be needed to estimate the population mean weight of the
new mini-laptop computers if the maximum error of estimate is to be 0.4 of one standard
deviation with 95% confidence?
ANSWER:
n = [z(α/2) ⋅ σ / E]^2 = [(1.96)(σ)/(0.4σ)]^2 = 24.01, or 25
107. Assume the annual college fees for private colleges have a mounded distribution and
the standard deviation is $1725. Find the 95% confidence interval for the mean costs for
private colleges.
ANSWER:
108. Assume the annual college fees for public colleges have a mounded distribution and the
standard deviation is $1125. Find the 95% confidence interval for the mean costs for
public colleges.
109. Compare the confidence intervals found in questions 107 and 108 and describe the
effect the two different standard deviations had on the resulting answers.
ANSWER:
When the standard deviation decreases from 1725 to 1125, the width of the confidence
interval also decreases from $1127 to $735.
110. Find the sample size needed to estimate µ of a normal population with σ = 3.5 to within
1.0 unit at the 98% level of confidence.
ANSWER:
n = [z(α/2) ⋅ σ / E]^2 = [(2.33)(3.5)/1.0]^2 = 66.50, or 67
111. How large a sample should be taken if the population mean is to be estimated with 90%
confidence to within $75 if the population has a standard deviation of $800?
ANSWER:
n = [z(α/2) ⋅ σ / E]^2 = [(1.645)(800)/75]^2 = 307.89, or 308
The weights of full boxes of Frosted Mini-Wheat cereal are normally distributed with a standard
deviation of 0.52 oz. A sample of 18 randomly selected boxes produced a mean weight of 24.3
oz.
ANSWER:
x̄ ± E = 24.3 ± 0.2402. Hence the 95% confidence interval for µ is 24.0598 to 24.5402.
113. Find the 99% confidence interval for the true mean weight of a box of this cereal.
ANSWER:
x̄ ± E = 24.3 ± 0.3156. Hence the 99% confidence interval for µ is 23.9844 to 24.6156.
114. What effect did the increase in the level of confidence have on the width of the
confidence interval?
ANSWER:
When the level of confidence increases from 0.95 to 0.99, the confidence coefficient
z (α / 2 ) also increases from 1.96 to 2.575. As a result, the width of the confidence interval
increases from 0.4804 to 0.6312.
115. A pharmaceutical company wants to estimate the mean response time for Tenormin 50
mg tablets to reduce blood pressure. How large a sample should they take in order to
estimate the mean response time to within 0.80 week at 95% confidence? Assume that
σ = 4.2 weeks.
ANSWER:
116. We are interested in estimating the mean life of a new product. How large a sample do
we need to take in order to estimate the mean to within 0.20 of a standard deviation with
90% confidence?
ANSWER:
n = [z(α/2) ⋅ σ / E]^2 = [(1.645)(σ)/(0.20σ)]^2 = 67.65, or 68
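When the maximum error is stated as a fraction k of one standard deviation (E = kσ), σ cancels and the sample size depends only on z and k: n = (z/k)^2. A small Python sketch, with the hypothetical helper name `sample_size_in_sigma_units`:

```python
import math


def sample_size_in_sigma_units(z: float, k: float) -> int:
    """Sample size when the maximum error is E = k * sigma; sigma cancels out."""
    return math.ceil((z / k) ** 2)


# Within 0.20 of a standard deviation at 90% confidence (z = 1.645)
n_product_life = sample_size_in_sigma_units(1.645, 0.20)
# Within 0.4 of a standard deviation at 95% confidence (z = 1.96)
n_laptop_weight = sample_size_in_sigma_units(1.96, 0.4)
print(n_product_life, n_laptop_weight)
```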
Section 8.3
True-False Questions
117. When we reject the null hypothesis, we are certain that the null hypothesis is false.
ANSWER: F
118. If our decision in a hypothesis test is to fail to reject the null hypothesis, then we know
that the null hypothesis must be true.
ANSWER: F
119. If α is reduced and β remains constant, then the sample size n must be increased.
ANSWER: T
ANSWER: T
ANSWER: F
ANSWER: F
123. Depending on the statement of the original problem, the equal sign may be in the null
hypothesis or the alternative hypothesis.
ANSWER: F
ANSWER: F
125. The risk of a Type I error is directly controlled in a hypothesis test by establishing a level
for α .
ANSWER: T
ANSWER: F
127. α is the measure of the area under the curve of the standard score that lies in the
rejection region for the null hypothesis.
ANSWER: T
129. Failing to reject the null hypothesis when it is false is a correct decision.
ANSWER: F
130. To conclude that the mean is larger (or smaller) than a claimed value, the calculated
value of the test statistic must fall in the rejection (critical) region.
ANSWER: T
131. The null hypothesis is sometimes referred to as the research hypothesis since it
represents what the researcher hopes will be found to be true.
ANSWER: F
132. Failing to reject the null hypothesis when it is true is referred to as a Type A correct
decision.
ANSWER: T
133. Rejecting the null hypothesis when it is false is referred to as a Type B correct decision.
ANSWER: T
ANSWER: T
135. A Type A correct decision occurs when the null hypothesis is false, and we decide in its
favor.
ANSWER: F
136. A Type I error is committed when a true null hypothesis is rejected - that is, when the null
hypothesis is true but we decide against it.
137. The Greek letter α is always the probability of rejecting the null hypothesis.
ANSWER: F
138. A Type B correct decision occurs when the null hypothesis is true, and the decision is in
opposition to the null hypothesis.
ANSWER: F
139. A Type II error is committed when we decide in favor of a null hypothesis that is actually
false.
ANSWER: T
140. The Type I error often results in what represents a “lost opportunity”.
ANSWER: F
141. A test statistic is a random variable whose value is calculated from the sample data and is
used in making the decision “fail to reject H o ” or “reject H o ”.
ANSWER: T
142. When writing the decision and the conclusion, remember that the decision is about H a
and the conclusion is a statement about whether or not the contention of H o was upheld.
ANSWER: F
Multiple-Choice Questions
143. You have rejected the null hypothesis when it is false, and therefore you have made a
144. Consider the situation: “A newly developed drug will not increase incidences of heart
attacks among its users.” Which of the following would be the most appropriate choices
for α and β ?
145. Which of the following is the name given to rejecting the null hypothesis when it is true?
146. Consider the following nonmathematical situation: “I do not have to study for my
statistics test.” Which of the following would be the most appropriate choices for α and
β?
147. Consider the following nonmathematical situation: “The brakes on my automobile are in
need of repair.” Which of the following would be the most appropriate choices for α and
β?
A) α
B) 1 – α
C) β
D) 1 – β
ANSWER: A
150. You have failed to reject the null hypothesis when it is false, and therefore you have
made a
A) α
B) 1 – α
C) β
D) 1 – β
ANSWER: C
A) α
B) 1 – α
C) β
D) 1 – β
ANSWER: B
153. Which of the following is the probability of making a Type B correct decision?
A) α
B) 1 – α
C) β
D) 1 – β
ANSWER: D
154. Which of the following is the probability of having the computed value of the test statistic
fall in the critical region when the null hypothesis is true?
A) α
B) 1 − α
C) β
D) 1 − β
ANSWER: A
155. Which of the following is the probability of having the computed value of the test statistic
fall in the non-critical region when the null hypothesis is true?
A) α
B) 1 − α
C) β
D) 1 − β
ANSWER: B
157. Which of the following statements is false regarding the alternative hypothesis H a ?
A) It is a statement about the same population parameter that is used in the null
hypothesis.
B) It is a statement that specifies the population parameter has a value different, in
some way, from the value given the null hypothesis.
C) The rejection of the null hypothesis will imply the likely truth of this alternative
hypothesis.
D) None of the above.
ANSWER: D
A) The basic idea of the hypothesis test is for the evidence to have a chance to
“disprove” the alternative hypothesis.
B) The null hypothesis is the statement that the evidence might disprove.
C) Your concern (belief or desired outcome), as the person doing the testing, is
expressed in the alternative hypothesis.
D) The alternative hypothesis is sometimes referred to as the research hypothesis;
since it represents what the researcher hopes will be found to be “true”.
ANSWER: A
A) The probability assigned to the Type I error is α (called “alpha”; α is the first letter of
the Greek alphabet).
B) The probability of the Type II error is β (called “beta”; β is the second letter of the
Greek alphabet).
A) 1- β is the probability of a correct decision when the null hypothesis is false (i.e.,
probability of Type B correct decision).
B) 1- β is called the power of the statistical test, since it is the measure of the ability of a
hypothesis test to reject a false null hypothesis, a very important characteristic.
C) If α is decreased, then either β must decrease, or n must be decreased.
D) There is an interrelationship among the probability of the Type I error ( α ), the
probability of the Type II error ( β ), and the sample size (n).
ANSWER: C
A) The null hypothesis is the statement that is “on trial”, and therefore the decision must
be about it.
B) The contention of the alternative hypothesis is the thought that brought about the
need for a decision.
C) The question that led to the alternative hypothesis must be answered when the
conclusion is written.
D) All of the above
ANSWER: D
Short-Answer Questions
ANSWER:
163. If you do not reject the null hypothesis when some alternative hypothesis is correct, what
type of error do you make?
ANSWER:
Type II error
164. What error could be made if the test statistic falls in the noncritical region?
ANSWER:
Type II
165. What proportion of the probability distribution is in the critical region, provided the null hypothesis
is correct?
ANSWER:
166. What error could be made if the test statistic falls in the critical region?
ANSWER:
Type I
167. What proportion of the probability distribution is in the noncritical region, provided the null
hypothesis is not correct?
ANSWER:
168. If the null hypothesis is false, the probability of a correct decision is identified by what
symbol?
ANSWER:
1- β
169. You are investigating a complaint that “special computer brand takes too much time” to
start. State the null and alternative hypotheses.
H o : Special computer brand does not take too much time to start
170. If the probability of Type II error, β , decreases, how does this affect the probability of
Type I error, α , or the sample size n?
ANSWER:
171. If the null hypothesis is false, the probability of a decision error is identified by what
symbol?
ANSWER:
172. If the sample size n decreases, how does this affect the probability of Type I error, α , or
the probability of Type II error, β ?
ANSWER:
173. You are testing a new security system and you are concerned that the system is not
reliable. State the null and alternative hypotheses.
ANSWER:
ANSWER:
Type I error
175. If the null hypothesis is false, what decision error could be made?
ANSWER:
Type II error
176. If the decision “reject H o ” is made, what decision error could have been made?
ANSWER:
Type I error
177. If the decision “fail to reject H o ” is made, what decision error could have been made?
ANSWER:
Type II error
178. Find the power of a test when the probability of committing Type II error is 0.01
ANSWER:
ANSWER:
180. If the null hypothesis is true, the probability of a correct decision is identified by what
symbol?
ANSWER:
1- α
181. Find the power of a test when the probability of making Type II error is 0.10
ANSWER:
182. Explain why the probability of rejecting the null hypothesis is not always α
ANSWER:
The probability of rejecting the null hypothesis is α only if the null hypothesis is true.
183. Find the power of a test when the probability of committing Type II error is 0.05
ANSWER:
184. Describe the action that would result in a Type I error and a Type II error if the following
null hypothesis was tested; “ H o : The majority of Americans favor laws against assault
weapons.”
ANSWER:
A Type I error occurs when it is determined that the majority of Americans do not favor
laws against assault weapons when, in fact, the majority do favor such laws.
A Type II error occurs when it is determined that the majority of Americans do favor laws
against assault weapons when, in fact, they do not favor such laws.
185. Describe the action that would result in a Type I error and a Type II error if the following
null hypothesis was tested; “ H o : This fast-food menu is not low fat.”
ANSWER:
A Type I error occurs when it is determined that the fast food is low fat when, in fact, it is
not low fat.
A Type II error occurs when it is determined that the fast food is not low fat when, in fact,
it is low fat.
186. Describe the action that would result in a Type I error and a Type II error if the following
null hypothesis was tested; “ H o : This old book must not be thrown away”
ANSWER:
A Type I error occurs when it is determined that the old book must be thrown away
when, in fact, it should not be thrown away.
A Type II error occurs when it is determined that the old book must not be thrown away
when, in fact, it should be thrown away.
187. Describe the action that would result in a Type I error and a Type II error if the following
null hypothesis was tested; “ H o : There is no waste in the US Defense Department
spending.”
ANSWER:
A Type I error occurs when it is determined that there is waste in the US Defense
Department spending when, in fact, there is not waste.
A Type II error occurs when it is determined that there is no waste in the US Defense
Department spending when, in fact, there is waste.
188. Describe the action that would result in a correct decision Type A and a correct decision
Type B, if the following null hypothesis was tested; “ H o : The majority of Americans
favor laws against assault weapons.”
ANSWER:
Type A correct decision: The majority of Americans do favor laws against assault
weapons and it is decided that they do favor the laws.
Type B correct decision: The majority of Americans do not favor laws against assault
weapons and it is decided that they do not favor the laws.
189. Describe the action that would result in a correct decision Type A and a correct decision
Type B, if the following null hypothesis was tested; “ H o : This fast-food menu is not low
fat.”
ANSWER:
Type B correct decision: The fast food menu is low fat and it is decided that it is low fat.
190. Describe the action that would result in a correct decision Type A and a correct decision
Type B, if the following null hypothesis was tested; “ H o : This old book must not be
thrown away”
ANSWER:
Type A correct decision: The old book must not be thrown away and it is decided that it
should not be thrown away.
Type B correct decision: The old book must be thrown away and it is decided that it
should be thrown away.
191. Describe the action that would result in a correct decision Type A, and a correct decision
Type B, if the following null hypothesis was tested; “ H o : There is no waste in the US
Defense Department spending.”
ANSWER:
A normally distributed population is known to have a standard deviation of 5, but its mean is in
question. It has been argued to be either µ = 70 or µ = 80 , and the following hypothesis test has
been devised to settle the argument. The null hypothesis, H o : µ = 70 , will be tested using one
randomly selected data value and comparing it to the critical value 76. If the data value is
greater than or equal to 76, the null hypothesis will be rejected.
ANSWER:
α = P(rejecting H o when H o is true) = P( x ≥ 76 | µ = 70) = P[ z > (76 − 70) / 5] = P( z > 1.20)
= 0.5000 – 0.3849 = 0.1151
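The α calculation above, and the matching β under the competing claim µ = 80, can be sketched in Python by standardizing the critical value under each mean. The function name and argument names are illustrative assumptions; the sketch uses the exact normal distribution from `statistics.NormalDist` rather than a printed z-table.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_error_probabilities(mu_null, mu_alt, sigma, critical_value):
    """Alpha and beta for a test that rejects H0 when one observation >= critical_value."""
    z_under_null = (critical_value - mu_null) / sigma
    z_under_alt = (critical_value - mu_alt) / sigma
    alpha = 1 - STD_NORMAL.cdf(z_under_null)  # P(reject H0 | H0 true)
    beta = STD_NORMAL.cdf(z_under_alt)        # P(fail to reject H0 | alternative true)
    return alpha, beta


alpha, beta = type_error_probabilities(mu_null=70, mu_alt=80, sigma=5, critical_value=76)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Here z = (76 − 70)/5 = 1.20 under the null hypothesis and z = (76 − 80)/5 = −0.80 under the alternative.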
ANSWER:
194. In order to complete a hypothesis test, we will need to write a conclusion that carefully
describes the meaning of the decision relative to the intent of the hypothesis test. What
does this mean?
ANSWER:
If the decision is “reject H o ” then the conclusion should be worded something like,
“There is sufficient evidence at the α level of significance to show that … (the meaning
of the alternative hypothesis)”. On the other hand, if the decision is “fail to reject H o ” then
the conclusion should be worded something like, “There is not sufficient evidence at the
α level of significance to show that … (the meaning of the alternative hypothesis)”.
195. You want to show that professors find the new method of teaching calculus more
effective than the traditional method. State the null and alternative hypotheses.
ANSWER:
H o : The new method of teaching calculus is not more effective than the traditional method
H a : The new method of teaching calculus is more effective than the traditional method
ANSWER:
A statistician is interested in testing the null hypothesis H o : Iraq was not a threat to US national
security vs. the alternative hypothesis H a : Iraq was a threat to US national security.
197. Identify the following situation as Type A or B correct decision or Type I or II error:
ANSWER:
Type II error
198. Identify the following situation as Type A or B correct decision or Type I or II error:
ANSWER:
199. Identify the following situation as Type A or B correct decision or Type I or II error:
ANSWER:
Type I error
200. Identify the following situation as Type A or B correct decision or Type I or II error:
ANSWER:
When an airplane is inspected, the inspector is looking for anything that might indicate the plane
might not be safe to fly.
ANSWER:
202. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type A correct decision in this situation as a possible outcome.
ANSWER:
203. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type B correct decision in this situation as a possible outcome.
ANSWER:
Type B correct decision: The plane will not be safe to fly and the inspector did not okay
its use.
204. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type I error in this situation as a possible outcome.
ANSWER:
Type I error: The plane will be safe to fly and the inspector did not okay its use.
205. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type II error in this situation as a possible outcome.
ANSWER:
Type II error: The plane will not be safe to fly and the inspector okayed its use.
206. Describe the seriousness of the two possible errors in questions 204 and 205.
ANSWER:
The Type I error is not at all serious. A plane that is safe to fly will not be allowed to fly.
The Type II error is very serious. A plane that is not safe to fly will be allowed to fly, and
passengers may be injured or killed in a crash.
207. You are testing a new formula for hand lotion hoping to show it is effective on “dry or
damaged skin”. State the null and alternative hypotheses.
ANSWER:
H o : The new formula for hand lotion is not effective on dry or damaged skin
H a : The new formula for hand lotion is effective on dry or damaged skin
208. You are trying to show that tennis lessons have a positive effect on a child’s self esteem.
State the null and alternative hypotheses.
ANSWER:
When a medic at the scene after the collapse of the World Trade Center in New York on
September 11, 2001 inspects each victim, he administers the appropriate medical assistance to
all victims, unless he is certain the victim is dead.
ANSWER:
210. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type A correct decision in this situation as a possible outcome.
ANSWER:
Type A correct decision: The victim is alive and is treated as though alive.
211. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type B correct decision in this situation as a possible outcome.
ANSWER:
212. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type I error in this situation as a possible outcome.
213. Depending on the truth of the null hypothesis and the decision reached, describe what is
meant by Type II error in this situation as a possible outcome.
ANSWER:
214. Describe the seriousness of the two possible errors in questions 212 and 213.
ANSWER:
The Type I error is very serious. The victim may well die shortly without the
medical attention that is being withheld.
The Type II error is not as serious. The victim is receiving attention that is of no value.
This would be serious only if there were other victims that needed this attention.
215. Consider the null hypothesis: “H o : The majority of Americans favor laws against
abortion.” Describe the action that would result in a Type I error and a Type II error if this
hypothesis was tested.
ANSWER:
A Type I error occurs when it is determined that the majority of Americans do not favor
laws against abortion when, in fact, the majority do favor such laws.
A Type II error occurs when it is determined that the majority of Americans do favor laws
against abortion when, in fact, they do not favor such laws.
216. Consider the null hypothesis: “H o : This fast-food menu is not low sodium.” Describe the
action that would result in a Type I error and a Type II error if this hypothesis was tested.
ANSWER:
A Type I error occurs when it is determined that the fast food is low sodium when, in fact,
it is not low sodium.
A Type II error occurs when it is determined that the fast food is not low sodium when, in
fact, it is low sodium.
217. Consider the null hypothesis: “H o : This historical building must not be demolished.”
Describe the action that would result in a Type I error and a Type II error if this
hypothesis was tested.
ANSWER:
A Type I error occurs when it is determined that the historical building must be
demolished when, in fact, it should not be demolished.
A Type II error occurs when it is determined that the historical building must not be
demolished when, in fact, it should be demolished.
218. Consider the null hypothesis: “H o : There is no waste in Bush’s government spending.”
Describe the action that would result in a Type I error and a Type II error if this
hypothesis was tested.
ANSWER:
A Type I error occurs when it is determined that there is waste in Bush’s government
spending when, in fact, there is no waste.
A Type II error occurs when it is determined that there is no waste in Bush’s government
spending when, in fact, there is waste.
219. If α is assigned the value 0.001, what are we saying about the Type I error?
ANSWER:
The Type I error is very serious and, therefore, we are willing to allow it to occur with a
probability of 0.001; that is, only 1 chance in 1000.
220. Consider the null hypothesis: “H o : The majority of Americans favor laws against
abortion.” Describe the action that would result in a correct decision Type A and a
correct decision Type B if this hypothesis was tested.
ANSWER:
Type A correct decision: The majority of Americans do favor laws against abortion and it
is decided that they do favor the laws.
Type B correct decision: The majority of Americans do not favor laws against abortion
and it is decided that they do not favor the laws.
221. Consider the null hypothesis: “H o : This fast-food menu is not low sodium.” Describe the
action that would result in a correct decision Type A and a correct decision Type B if this
hypothesis was tested.
ANSWER:
Type A correct decision: The fast food menu is not low sodium and it is decided that it is
not low sodium.
Type B correct decision: The fast food menu is low sodium and it is decided that it is low
sodium.
ANSWER:
Type A correct decision: The historical building must not be demolished and it is
decided that it should not be demolished.
Type B correct decision: The historical building must be demolished and it is decided
that it should be demolished.
223. Consider the null hypothesis: “H o : There is no waste in Bush’s government spending.”
Describe the action that would result in a correct decision Type A and a correct decision
Type B if this hypothesis was tested.
ANSWER:
Type B correct decision: There is waste in Bush’s government spending and it is decided
that there is waste.
224. If α is assigned the value 0.05, what are we saying about the Type I error?
ANSWER:
The Type I error is somewhat serious and, therefore, we are willing to allow it to occur
with a probability of 0.05; that is, 1 chance in 20.
225. If α is assigned the value 0.10, what are we saying about the Type I error?
ANSWER:
The Type I error is not at all serious and, therefore, we are willing to allow it to occur with
a probability of 0.10; that is, 1 chance in 10.
226. If β is assigned the value 0.001, what are we saying about the Type II error?
ANSWER:
The Type II error is very serious and, therefore, we are willing to allow it to occur with a
probability of 0.001; that is, only 1 chance in 1000.
227. If β is assigned the value 0.05, what are we saying about the Type II error?
ANSWER:
The Type II error is somewhat serious and, therefore, we are willing to allow it to occur
with a probability of 0.05; that is, 1 chance in 20.
228. If β is assigned the value 0.10, what are we saying about the Type II error?
ANSWER:
The Type II error is not at all serious and, therefore, we are willing to allow it to occur
with a probability of 0.10; that is, 1 chance in 10.
The owner of a life insurance company is concerned with the effectiveness of a television
commercial to promote his company.
229. What null hypothesis is he testing if he commits a Type I error when he erroneously says
that the commercial is effective?
ANSWER:
230. What null hypothesis is he testing if he commits a Type II error when he erroneously
says that the commercial is effective?
ANSWER:
H o : Commercial is effective
A normally distributed population is known to have standard deviation of 4, but its mean is in
question. It has been argued to be either µ = 90 or µ = 100, and the following hypothesis test
has been devised to settle the argument. The null hypothesis, H o : µ = 90, will be tested using
one randomly selected data value and comparing it to the critical value 96. If the data value is
greater than or equal to 96, the null hypothesis will be rejected.
α = P(Type I error)
= P(x ≥ 96 | µ = 90)
= P[z > (96 − 90) / 4] = P(z > 1.50)
= 0.5000 − 0.4332
= 0.0668
ANSWER:
β = P(Type II error)
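To see how α, β, and the power 1 − β trade off in this setting (σ = 4, H o : µ = 90 against the competing claim µ = 100), the Python sketch below moves the critical value and recomputes both error probabilities. The critical values 94 and 98 are illustrative assumptions added for comparison; only 96 comes from the problem.

```python
from statistics import NormalDist

SIGMA = 4.0
null_dist = NormalDist(mu=90, sigma=SIGMA)   # distribution if H0 is true
alt_dist = NormalDist(mu=100, sigma=SIGMA)   # distribution if the alternative is true


def error_rates(critical_value: float):
    """Return (alpha, beta) when H0 is rejected for an observation >= critical_value."""
    alpha = 1 - null_dist.cdf(critical_value)  # area of the critical region under H0
    beta = alt_dist.cdf(critical_value)        # area of the noncritical region under Ha
    return alpha, beta


for c in (94, 96, 98):
    a, b = error_rates(c)
    print(f"critical value {c}: alpha = {a:.4f}, beta = {b:.4f}, power = {1 - b:.4f}")
```

Moving the critical value to the right widens the noncritical region, which lowers α and raises β when σ and n stay fixed.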
ANSWER:
234. In a particular hypothesis test, if α = 0.01 and p-value = 0.019, then the correct decision
would be to fail to reject the null hypothesis.
ANSWER: T
235. The classical approach to hypothesis testing is completed using a five-step model.
ANSWER: T
236. In a particular hypothesis test, if α = 0.05 and p-value = 0.042, then the correct decision
would be to fail to reject the null hypothesis.
ANSWER: F
237. If the noncritical region in a hypothesis test is made wider (assuming σ and n remain
fixed), then α becomes larger.
ANSWER: F
ANSWER: F
239. The probability-value approach, or simply p-value approach, is the hypothesis test
process that has gained popularity in recent years, largely as a result of the convenience
and the “number crunching” ability of the computer.
ANSWER: T
240. If the p-value is less than or equal to the level of significance, α , then the decision must
be not to reject H o .
ANSWER: F
ANSWER: T
242. The alternative hypothesis assigns a specific value to the parameter in question, and
therefore “equals” will always be part of the alternative hypothesis.
ANSWER: F
243. Probability value, or p-value is the probability that the test statistic could be the value it is
or a more extreme value (in the direction of the alternative hypothesis) when the null
hypothesis is true.
ANSWER: T
ANSWER: F
245. The fundamental idea of the p-value is to express the degree of belief in the null
hypothesis.
ANSWER: T
246. If the p-value is greater than the level of significance α , then the decision must be to
reject H o .
ANSWER: F
247. The alternative hypothesis is referred to as being “two-tailed” when H a is “not equal”.
ANSWER: T
ANSWER: T
Multiple-Choice Questions
249. Choose the pair of words that correctly completes the following statement: “The p-value
of a hypothesis test is the ______ level of significance for which the observed sample
information is ______, provided the null hypothesis is true.”
250. In a particular hypothesis test, the p-value is 0.0211. What must be true of α in order to
reject the null hypothesis?
251. You have conducted a hypothesis test and found that p-value = 0.04. Based on this
information you know that you cannot reject the null hypothesis if
A) α < 0.04.
B) α > 0.04.
C) α ≤ 0.04.
D) α ≥ 0.04.
ANSWER: A
252. In the classical approach to hypothesis testing, we use an asterisk “∗” to identify which of
the following?
253. Which of the following would be the correct hypotheses for testing the claim that the
mean lifetime of a cellular phone battery, while the phone is left on, is less than 24
hours?
A) H o : µ = 24, H a : µ ≠ 24
B) H o : µ = 24(≥), H a : µ < 24
C) H o : µ = 24(≤), H a : µ > 24
D) H o : µ > 24, H a : µ ≤ 24
ANSWER: B
A) H o : µ = 3.14
B) H o : µ = 3.14(≥)
C) H o : µ = 3.14(≤)
D) H o : µ ≠ 3.14
ANSWER: A
255. Which of the following would be the correct hypotheses for testing the claim that the
mean monthly rainfall in Toledo daily during April is no less than 2.5 inches?
256. Which of the following would be the alternative hypothesis in testing the claim that the
mean distance students commute to campus is no more than 7.1 miles?
A) H a : µ ≠ 7.1
B) H a : µ < 7.1
C) H a : µ > 7.1
D) H a : µ = 7.1(≤)
ANSWER: C
257. Which of the following would be the correct hypotheses for testing the claim that the
mean cost of a meal at a fast food restaurant is less than $3.79?
A) H o : µ = 3.79, H a : µ ≠ 3.79
B) H o : µ = 3.79(≥), H a : µ < 3.79
C) H o : µ = 3.79(≤), H a : µ > 3.79
D) H o : µ > 3.79, H a : µ = 3.79(≤)
ANSWER: B
A) When the p-value is minuscule (like 0.0003), the null hypothesis would be rejected by
everybody because the sample results are very unlikely for a true H o . However,
when the p-value is fairly small (like 0.012), the evidence against H o is quite strong
and H o will be rejected by many.
B) When the p-value begins to get larger (say, 0.02 to 0.08), there is too much
probability that data like the sample involved could have occurred even if H o were
true, and the rejection of H o is not an easy decision.
A) Critical region is the set of values for the test statistic that will cause us to always
reject the null hypothesis for specific level(s) of significance α .
B) Critical region is the set of values for the test statistic that will cause us to always
reject the null hypothesis for any level of significance α .
C) The set of values that are not in the critical region is called the noncritical region
(sometimes called the acceptance region).
D) None of the above
ANSWER: B
Short-Answer Questions
260. Suppose the null hypothesis is “the mean diameter of parts produced by a machine is
0.85” ( µ = 0.85) and the alternative is µ > 0.85. If n items are tested and based on the
results, it is concluded that µ > 0.85 when in fact µ = 0.85. What type of error is made?
ANSWER:
Type I error
261. Suppose that we want to test the hypothesis that the mean fill by a bottling machine is
less than 12 ounces. Explain the conditions that would exist if we make an error in
decision by committing a Type I error.
ANSWER:
ANSWER:
We fail to reject the null hypothesis that µ ≥ 105 when, in fact, µ < 105.
263. Do large or small values for the p-value help support the alternative hypothesis?
ANSWER:
The smaller the p-value, the stronger the support for the alternative hypothesis
264. An experimenter is testing the following hypothesis, H o : µ = 14.8(≥) and H a : µ < 14.8 ,
using the p-value approach and from his sample information computes a p-value of
0.0778. Then he sets the value of α = 0.10 so that he may reject the null hypothesis.
Discuss the procedure described.
ANSWER:
An honest experimenter decides on the seriousness of Type I error and sets α before
performing the test, not after the test is performed.
265. State the null and alternative hypotheses used to test the following claim: “The mean of
ACT scores is 25.”
ANSWER:
H o : µ = 25 vs. H a : µ ≠ 25
ANSWER:
267. For the following pair of values, p-value = 0.025 and α = 0.05, state the decision that will
be reached and state why.
ANSWER:
268. State the null and alternative hypotheses used to test the following claim: “The mean
lifetime of fluorescent light bulbs is at most 2000 hours.”
ANSWER:
ANSWER:
270. State the null and alternative hypotheses used to test the claim that “The mean score on
that MCAT (Medical College Admission Test) is different from 27.”
ANSWER:
H o : µ = 27 vs. H a : µ ≠ 27
271. Assume that z is the test statistic and calculate the value of z ∗ for testing the null
hypothesis H o : µ = 150.0 when σ = 4.5, n = 15, x̄ = 147.8
z ∗ = (x̄ − µ) / (σ/√n) = (147.8 − 150.0) / (4.5/√15) = −1.89
ANSWER:
273. State the null and alternative hypotheses used to test the claim that “The mean selling
price of foreign made mini vans is no less than $38,000.”
ANSWER:
274. For the following pair of values, p-value = 0.12 and α = 0.10, state the decision that will
be reached and state why.
ANSWER:
275. Assume that z is the test statistic and calculate the value of z ∗ for testing the null
hypothesis H o : µ = 415 when σ = 42.6, n = 50, x = 430
ANSWER:
z ∗ = (x̄ − µ) / (σ/√n) = (430 − 415) / (42.6/√50) = 2.49
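The calculated test statistic in Questions 271 and 275 follows one formula, so it can be sketched as a small helper. The function name `z_star` is illustrative only; the inputs are the values given in those two questions.

```python
from math import sqrt


def z_star(x_bar: float, mu0: float, sigma: float, n: int) -> float:
    """Calculated test statistic z* = (x_bar - mu0) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (x_bar - mu0) / standard_error


# Question 271: mu0 = 150.0, sigma = 4.5, n = 15, sample mean 147.8
print(round(z_star(147.8, 150.0, 4.5, 15), 2))
# Question 275: mu0 = 415, sigma = 42.6, n = 50, sample mean 430
print(round(z_star(430, 415, 42.6, 50), 2))
```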
276. Use the p-value approach to test the hypotheses H o : µ = 500(≥) vs. H a : µ < 500 at the
0.05 level of significance, given that σ = 30.2 , and that a sample of size 81 produced a
sample mean of 508.2.
ANSWER:
z * = 2.44, so p-value = P(z < 2.44) = 0.9927. Since p-value > α , fail to reject the null
hypothesis; the sample does not provide sufficient evidence that the population mean is
less than 500.
277. For the hypothesis test, H o : µ = a(≥) vs. H a : µ < a , p-value = 0.0013. Give the
calculated value for the test statistic.
ANSWER:
z * = −3.0
278. A statistician was testing the following hypotheses: H o : µ = 500 vs. H a : µ ≠ 500 . The p-
value approach was to be used. A sample of size 49 gave a sample mean of 508. Given
that σ = 30.2 , and α = 0.01, find the p-value, and write your conclusion.
ANSWER:
z * = 1.85, so p-value = 2P(z > 1.85) = 0.0644. Since p-value > α , fail to reject the null
hypothesis; there is not sufficient evidence at the 0.01 level that the population mean
differs from 500.
279. The mean cost for a home nationwide is reported to be $80,000 with a standard
deviation equal to $9,500. To test that the mean in Omaha is less than the national
mean, 35 homes for sale are randomly selected and the mean is found to be $65,000.
Assuming the variability is the same locally as nationally, write the appropriate null and
alternative hypotheses for this situation, calculate the p-value for the test, and write your
conclusion.
p-value is practically zero, since z * = −9.34 . Therefore we reject the null hypothesis and
conclude that the mean cost for homes in Omaha is less than the national mean of
$80,000.
280. For the hypothesis test, H o : µ = a vs. H a : µ ≠ a , p-value = 0.1260. Give the calculated
value for the test statistic.
ANSWER:
z * = ± 1.53
281. For the hypothesis test, H o : µ = a(≤) vs. H a : µ > a , p-value = 0.2358. Give the
calculated value for the test statistic.
ANSWER:
z * = 0.72
282. For testing the hypothesis H o : µ = a vs. H a : µ ≠ a , give the absolute value of the
calculated test statistic, which would correspond to p-value = 0.0672.
ANSWER:
| z * | = 1.83
283. For testing the hypothesis H o : µ = a vs. H a : µ ≠ a , give the absolute value of the
calculated test statistic, which would correspond to p-value = 0.0120.
ANSWER:
284. For a national compliance test for diabetics, µ = 74 and σ = 4. To test that diabetic
patients at a particular hospital have this mean versus a value different than the national
mean, the test is administered to 50 diabetic patients at the hospital, and x = 78.5.
Assuming the variability in test scores at the hospital is the same as that at the national
level, find the p-value for this hypothesis test, and write your conclusion.
ANSWER:
p-value is practically zero, since z * = 7.95 . Therefore we reject the null hypothesis and
conclude that the mean for diabetic patients at this hospital is different from the
national mean.
285. For the hypothesis H o : µ = a vs. H a : µ ≠ a , give the absolute value of the calculated
test statistic, which would correspond to p-value = 0.1336.
ANSWER:
| z * | =1.50
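Questions 277 through 285 run the table lookup in reverse, from a p-value back to the test statistic. A minimal sketch using the standard library's inverse normal CDF follows; the function name and the `tail` labels are assumptions made for illustration, and the results differ from the tabled answers only by rounding.

```python
from statistics import NormalDist


def z_from_p_value(p_value: float, tail: str) -> float:
    """Recover z* from a p-value; for a two-tailed test, return |z*|."""
    std_normal = NormalDist()
    if tail == "left":    # H a : mu < a
        return std_normal.inv_cdf(p_value)
    if tail == "right":   # H a : mu > a
        return std_normal.inv_cdf(1.0 - p_value)
    if tail == "two":     # H a : mu != a, half the p-value in each tail
        return std_normal.inv_cdf(1.0 - p_value / 2.0)
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(round(z_from_p_value(0.1260, "two"), 2))    # Question 280
print(round(z_from_p_value(0.2358, "right"), 2))  # Question 281
```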
A machine is programmed to put 737 grams of salt in each container that passes underneath its
nozzle. In order to test H o : µ = 737(≤) vs. H a : µ > 737 , a sample of 35 boxes of salt is selected. It
is found that x = 740.5 , and it is known that σ = 7.5 grams.
ANSWER:
z * = 2.76
ANSWER:
p-value = 0.0029
ANSWER:
Since p-value < α , reject the null hypothesis and conclude that the machine, on
average, puts more than 737 grams of salt in each container that passes underneath its
nozzle.
289. Calculate the p-value for testing H o : µ = 25(≥) vs. H a : µ < 25 , if the value of the test
statistic z * = -2.84.
ANSWER:
p-value = 0.0023
290. Calculate the p-value for testing H o : µ = 50 vs. H a : µ ≠ 50 , if the value of the test
statistic z * = 1.98.
ANSWER:
p-value = 0.0478
291. Calculate the p-value for testing H o : µ = c (≤) vs. H a : µ > c , if the value of the test
statistic z * = 3.16.
ANSWER:
292. Consider the hypothesis test H o : µ = 165(≤) vs. H a : µ > 165 , with σ = 15. Find the
critical value of the test statistic x if samples of size 50 and α = 0.01 are utilized.
ANSWER:
169.94
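For a right-tailed test such as Question 292, the critical value of the sample mean is µ + z(α)·σ/√n. The sketch below assumes that form; the function name is illustrative, and because it uses an exact z(0.01) rather than the tabled 2.33, it agrees with 169.94 only to within rounding.

```python
from math import sqrt
from statistics import NormalDist


def critical_sample_mean(mu0: float, sigma: float, n: int, alpha: float) -> float:
    """Smallest sample mean that rejects H o in a right-tailed z test."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # z with area alpha to its right
    return mu0 + z_alpha * sigma / sqrt(n)


# Question 292: mu0 = 165, sigma = 15, n = 50, alpha = 0.01
print(round(critical_sample_mean(165, 15, 50, 0.01), 2))
```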
The following terms are commonly used in research findings: if 0.01 < p-value ≤ 0.05, the result
is said to be statistically significant. If p-value ≤ 0.01, the result is said to be highly significant. If
p-value > 0.05, the result is said to be non-statistically significant. Classify each result as
non-statistically significant, statistically significant, or highly significant.
ANSWER:
Statistically significant
ANSWER:
Highly significant
Non-statistically significant
A sample of size 35 is used to test H o : µ = 65(≥) vs. H a : µ < 65 , and produced a sample mean
x = 63.5. Assume that the population standard deviation is σ = 2.5.
ANSWER:
z * = −3.55
297. What distribution does the test statistic have when the null hypothesis is true?
ANSWER:
ANSWER:
One-tailed alternative
ANSWER:
ANSWER:
z * = 3.04
301. What distribution does the test statistic have when the null hypothesis is true?
ANSWER:
ANSWER:
Two-tailed alternative
ANSWER:
p-value = 0.0024
A random sample was selected from a normal population with a standard deviation σ = 5.70.
The sample values were 236, 240, 229, 237, 241, 243, 239, 228, 231, and 225.
ANSWER:
p-value = 0.3085
ANSWER:
Since p-value > α , we fail to reject the null hypothesis and conclude that the population
mean is at least 235.8.
ANSWER:
x ≥ a + 0.233b
307. In testing H o : µ = 28.7(≥) vs. H a : µ < 28.7 , using the p-value approach, a p-value of
0.0764 was obtained. If σ = 9.8, find the sample mean which produced this p-value
given that a sample of size n = 40 was randomly selected.
ANSWER:
x = 26.48
308. Suppose a sample of size 50 was taken to test the null hypothesis H o : µ = 75 versus
the alternative hypothesis H a : µ < 75 at α = 0.05 . Determine the critical region for this
test.
z ≤ −1.65
309. We wish to test the null hypothesis “the mean is no more than 20,” versus the alternative
hypothesis “the mean is more than 20.” The test statistic z is to be used. Find the value
of α that corresponds to the critical region: z ≥ 1.68.
ANSWER:
α = 0.0465
310. Suppose a sample of size 50 was taken to test the null hypothesis H o : µ = 85 versus
the alternative hypothesis H a : µ ≠ 85 at α = 0.05 . Determine the critical region for this
test.
ANSWER:
z ≤ −1.96 or z ≥ 1.96
311. We wish to test the null hypothesis “the mean is no more than 20,” versus the alternative
hypothesis “the mean is more than 20.” The test statistic z is to be used. Find the value
of α that corresponds to the critical region: z ≥ 1.75.
ANSWER:
α = 0.0401
312. Suppose a sample of size 50 was taken to test the null hypothesis H o : µ = 95 versus
the alternative hypothesis H a : µ > 95 at α = 0.10 . Determine the critical region for this
test.
ANSWER:
313. A machine is programmed to put 737 grams of salt in each container that passes
underneath its nozzle. In order to test H o : µ = 737(≤) vs. H a : µ > 737 , a sample of 100
boxes of salt is selected. How large must the sample mean be before the null hypothesis
can be rejected for α = 0.01? It is known that σ = 7.5 grams.
ANSWER:
x ≥ 738.75 grams
314. To test the null hypothesis that the average lifetime for a particular brand of bulb is 750
hours versus the alternative that the average lifetime is different from 750 hours, a
sample of 75 bulbs is used. If the standard deviation is known to equal 50 hours and if α
is equal to 0.01, what values for x will result in rejection of the null hypothesis?
ANSWER:
x̄ ≤ 735.1 hours or x̄ ≥ 764.9 hours
315. Suppose we were testing the hypothesis H o : µ = 82.4(≤) vs. H a : µ > 82.4 , using α =
0.10. Suppose further that σ = 16.7. What is the smallest sample mean that would
cause us to reject the null hypothesis using samples of size n = 35?
ANSWER:
x = 86.0
316. Suppose we were testing the hypothesis H o : µ = 76.9 vs. H a : µ ≠ 76.9 , using α = 0.05.
Suppose further that σ = 14.6. What is the smallest sample size that would cause us to
reject the null hypothesis if the sample mean is 74.8?
ANSWER:
n = 186
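The sample size in Question 316 comes from requiring |z ∗| ≥ z(α/2) and solving for n. A minimal sketch under that reading follows; the function name is an assumption made for illustration.

```python
from math import ceil
from statistics import NormalDist


def smallest_sample_size(x_bar: float, mu0: float, sigma: float, alpha: float) -> int:
    """Smallest n for which a two-tailed z test rejects H o at level alpha."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # |x_bar - mu0| / (sigma / sqrt(n)) >= z_crit  =>  n >= (z_crit * sigma / |x_bar - mu0|)^2
    return ceil((z_crit * sigma / abs(x_bar - mu0)) ** 2)


# Question 316: mu0 = 76.9, sigma = 14.6, alpha = 0.05, sample mean 74.8
print(smallest_sample_size(74.8, 76.9, 14.6, 0.05))
```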
ANSWER:
318. Calculate the p -value for testing H o : µ = 12 vs. H a : µ > 12, z * = 1.58 .
ANSWER:
319. Calculate the p-value for testing H o : µ = 100 vs. H a : µ < 100, z * = −0.75 .
ANSWER:
320. Calculate the p-value for testing H o : µ = 15.6 vs. H a : µ ≠ 15.6, z * = 1.37 .
ANSWER:
321. Calculate the p-value for testing H o : µ = 9.46 vs. H a : µ < 9.46, z * = −2.19 .
ANSWER:
322. Calculate the p-value for testing H o : µ = 115 vs. H a : µ ≠ 115, z * = −0.99 .
ANSWER:
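Questions 318 through 322 all reduce to a tail area of the standard normal distribution, chosen by the direction of H a . The helper below is a sketch of that rule; its name and the `tail` labels are illustrative, and its output may differ from table-based answers in the fourth decimal place.

```python
from statistics import NormalDist


def p_value(z_star: float, tail: str) -> float:
    """p-value of a z test for a left-, right-, or two-tailed alternative."""
    std_normal = NormalDist()
    if tail == "left":    # H a : mu < mu0
        return std_normal.cdf(z_star)
    if tail == "right":   # H a : mu > mu0
        return 1.0 - std_normal.cdf(z_star)
    if tail == "two":     # H a : mu != mu0
        return 2.0 * (1.0 - std_normal.cdf(abs(z_star)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Questions 318 to 322, in order
for z, tail in [(1.58, "right"), (-0.75, "left"), (1.37, "two"),
                (-2.19, "left"), (-0.99, "two")]:
    print(f"z* = {z:+.2f} ({tail}-tailed): p-value = {p_value(z, tail):.4f}")
```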
323. Find the value of z ∗ for testing H o : µ = 40 vs. H a : µ > 40 when p-value = 0.0582.
Sketch a normal curve to display the results.
ANSWER:
324. Find the value of z ∗ for testing H 0 : µ = 40 versus H a : µ < 40 when p-value = 0.0166.
Sketch a normal curve to display the results.
ANSWER:
P = P( z < z ∗ ) = 0.0166
ANSWER:
326. The null hypothesis, H o : µ = 50 , was tested against the alternative hypothesis,
H a : µ > 50 . A sample of 100 resulted in a calculated p-value of 0.102. If σ = 4.0 , find
the value of the sample mean, x .
P = P( z > z ∗ ) = 0.1020, so z ∗ = 1.27.
The formula z = (x̄ − µ) / (σ/√n) reduces to 1.27 = (x̄ − 50) / (4.0/√100). Solving for x̄ ,
we get x̄ = 50 + (1.27)(4.0/√100) = 50.508.
ANSWER:
328. If the test is completed using α = 0.05 , what decision and conclusion are reached?
Explain.
ANSWER:
ANSWER:
σ x̄ = σ/√n = 1.25/√80 = 0.1398
ANSWER:
331. Determine the critical region and critical values for z that would be used to test the null
hypothesis H o : µ = 25 vs. H a : µ ≠ 25, at the level of significance α = 0.10. Sketch a
normal curve to display the results.
ANSWER:
z ≤ −1.65, z ≥ 1.65
ANSWER:
z ≥ 2.33
333. Determine the critical region and critical values for z that would be used to test the null
hypothesis H o : µ = 13(≥) vs. H a : µ < 13, at the level of significance α =0.05. Sketch a
normal curve to display the results.
ANSWER:
z ≤ −1.65
334. Determine the critical region and critical values for z that would be used to test the null
hypothesis H o : µ = 18 vs. H a : µ ≠ 18, at the level of significance α =0.01. Sketch a
normal curve to display the results.
z ≤ −2.58, z ≥ 2.58
335. The manager at Fed Express feels that the weights of packages shipped recently are
less than in the past. Records show that in the past packages have had a mean weight
of 38.5 lb. and a standard deviation of 13.4 lb. A random sample of last month’s
shipping records yielded a mean weight of 34.2 lb. for 64 packages. Is this sufficient
evidence to reject the null hypothesis in favor of the manager’s claim? Use α = 0.01.
ANSWER:
z ∗ falls in the critical region, therefore we reject H o at the 0.01 level of significance in
favor of the manager’s claim, and conclude that the population mean is significantly less
than the mean of 38.5.
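The classical approach in Question 335 can be sketched step by step: compute z ∗, find the left-tail critical value for α = 0.01, and compare. The variable names are illustrative; the data are those given in the question.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 38.5, 13.4   # past mean weight and standard deviation (lb)
n, x_bar = 64, 34.2       # last month's sample
alpha = 0.01

z_star = (x_bar - mu0) / (sigma / sqrt(n))  # calculated test statistic
z_crit = NormalDist().inv_cdf(alpha)        # left-tail critical value, about -2.33
reject_h0 = z_star <= z_crit                # H a : mu < 38.5, so reject in the left tail

print(f"z* = {z_star:.2f}, critical value = {z_crit:.2f}, reject H o: {reject_h0}")
```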
ANSWER:
337. Find the value of x for testing H o : µ = 80, given that z * = -0.95, σ = 6.75, and n = 36.
ANSWER:
From a population of unknown mean µ and a known standard deviation σ = 6.0 , a sample of
n = 100 is selected and the sample mean 43.5 is found.
ANSWER:
339. Complete the hypothesis test involving H a : µ ≠ 42 using the p-value approach and
α = 0.05.
ANSWER:
H o : µ = 42 vs. H a : µ ≠ 42
P = 2 P ( z > 2.5) . Using the table of the standard normal distribution, we get P = 2(0.5000 − 0.4938) = 0.0124.
Since P < α , we reject H o at the 0.05 level of significance, and conclude that there is
sufficient evidence to support the contention that the mean is not equal to 42.
340. Complete the hypothesis test involving H a : µ ≠ 42 using the classical approach and
α = 0.05.
ANSWER:
z * falls in the critical region, therefore we reject H o at the 0.05 level of significance, and
conclude that there is sufficient evidence to support the contention that the mean is not
equal to 42.
341. Describe the relationship between the three separate procedures performed in
questions 338, 339, and 340.
ANSWER:
The null hypothesis is rejected at the 0.05 level of significance since z ∗ = 2.5 is in the
critical region, or p-value is less than α , and µ = 42 is not within the 95% confidence
interval estimate of 42.324 to 44.676.
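The agreement among the three procedures can be checked numerically. The sketch below uses the sample described above (σ = 6.0, n = 100, sample mean 43.5) and tests µ = 42 at α = 0.05; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 42.0, 6.0, 100, 43.5, 0.05
std_normal = NormalDist()

standard_error = sigma / sqrt(n)
z_star = (x_bar - mu0) / standard_error
z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)  # two-tailed critical value

# Classical approach: is z* in the critical region?
classical_reject = abs(z_star) >= z_crit
# p-value approach: is the two-tailed p-value at most alpha?
p = 2.0 * (1.0 - std_normal.cdf(abs(z_star)))
p_value_reject = p <= alpha
# Confidence interval: does the 95% interval exclude mu0?
ci = (x_bar - z_crit * standard_error, x_bar + z_crit * standard_error)
interval_reject = not (ci[0] <= mu0 <= ci[1])

print(classical_reject, p_value_reject, interval_reject)
```

All three flags agree because each one compares the same distance, |x̄ − 42|, with the same margin z(α/2)·σ/√n.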
In Meijer supermarket, the customer’s waiting time to check out is approximately normally
distributed with a standard deviation of 2.5 minutes. A sample of 25 customer waiting times
produced a mean of 8.2 minutes. Is this evidence sufficient to reject the supermarket’s claim
that its customer checkout time averages no more than 7 minutes? Complete this hypothesis
test using the 0.02 level of significance.
ANSWER:
P = P( z > 2.40) = 0.5000 − 0.4918 = 0.0082 . Since P < α , we reject H o at α = 0.02. The
sample does provide sufficient evidence to conclude that the mean waiting time is more
than the claimed 7 minutes.
ANSWER:
The Food and Drug Administration (FDA) must approve all drugs before they can be marketed
by a drug company. The FDA must weigh the error of marketing an ineffective drug, with the
usual risks of side effects, against the consequences of not allowing an effective drug to be
sold. Suppose, using standard medical treatment, that the mortality rate (r) of a certain disease
is known to be C. A manufacturer submits for approval a new drug that is supposed to treat this
disease. The FDA sets up the hypothesis to test the mortality rate for the drug as follows:
344. If C = 0.95, which test do you think the FDA should use? Explain.
ANSWER:
H a : r > C . Failure to reject H o will result in the new drug being marketed. Because of
the high current mortality rate (0.95), burden of proof is on the old ineffective drug.
345. If C = 0.05, which test do you think the FDA should use? Explain.
ANSWER:
H a : r < C . Failure to reject H o will result in the new drug not being marketed. Because
of the low current mortality rate (0.05), burden of proof is on the new drug.
346. State the null and alternative hypotheses used to test the following claim: “The mean
weight of college female students is 120 pounds.”
H o : µ = 120
H a : µ ≠ 120
347. For the following pair of values, p-value = 0.021 and α = 0.01, state the decision that will
occur and state why.
ANSWER:
348. State the null and alternative hypotheses used to test the following claim: “The mean life
of fluorescent light bulbs is at least 1650 hours.”
ANSWER:
H o : µ = 1650 ( ≥ )
H a : µ < 1650
ANSWER:
350. State the null and alternative hypotheses used to test the claim that “The mean score on
that ACT is different from 22.”
ANSWER:
H o : µ = 22
H a : µ ≠ 22
351. Assume that z is the test statistic and calculate the value of z ∗ for testing the null
hypothesis H o : µ = 140.0 when σ = 4.2, n = 15, x = 143.8
ANSWER:
z ∗ = (x̄ − µ) / (σ/√n) = (143.8 − 140.0) / (4.2/√15) = 3.50
ANSWER:
353. State the null and alternative hypotheses used to test the claim that “The mean selling
price of full-size cars is no more than $35,000.”
ANSWER:
H o : µ = 35,000 ( ≤ )
H a : µ > 35,000
354. For the following pair of values, p-value = 0.016 and α = 0.025, state the decision that
will occur and state why.
ANSWER:
ANSWER:
z ∗ = (x̄ − µ) / (σ/√n) = (500 − 515) / (38.3/√60) = −3.03
356. For the following pair of values, p-value = 0.068 and α = 0.10, state the decision that will
occur and state why.
ANSWER:
357. Assume that z is the test statistic and calculate the value of z ∗ for testing the null
hypothesis H o : µ = 11.6 when σ = 1.54, n = 16, x = 12.3
ANSWER:
z ∗ = (x̄ − µ) / (σ/√n) = (12.3 − 11.6) / (1.54/√16) = 1.82
Chapter 9
Section 9.1
1. For a sample of size n = 31, the critical value of the t-distribution equals the
corresponding critical value of the standard normal distribution.
ANSWER: F
2. In considering Student's t-distribution, we see that t = (x̄ − µ) / (s/√n) is distributed with
a variance less than 1.
ANSWER: F
ANSWER: F
ANSWER: T
5. If σ is unknown when completing a hypothesis test about the population mean, then the
best estimate for the unknown standard deviation is the sample standard deviation s.
ANSWER: T
6. The Student’s t-distributions have an approximately normal distribution but are more
dispersed than the standard normal distribution.
ANSWER: T
ANSWER: F
8. When making inferences about one population mean when the value of the standard
deviation σ is unknown, the t-score is the test statistic.
ANSWER: T
9. When the test statistic is t and the number of degrees of freedom gets very large, the
critical value of t gets very close to that of the standard normal z.
ANSWER: T
10. The Student’s t-distribution is distributed symmetrically about its mean, and approaches
the standard normal distribution as the number of degrees of freedom increases.
ANSWER: T
11. Inferences about the population mean µ are based on the sample mean x and
information obtained from the sampling distribution of sample means.
ANSWER: T
12. The test statistic t = (x̄ − µ) / (s/√n) is distributed so as to be less peaked at the mean and thicker
at the tails than is the normal distribution.
ANSWER: T
13. The sampling distribution of sample means has a mean µ and a standard error of σ/√n
for all samples of size n.
ANSWER: F
ANSWER: T
15. Samples as small as n =15 or 20 may be considered large enough for the Central Limit
Theorem to hold if the sample data are unimodal, nearly symmetric, short-tailed, and
without outliers.
ANSWER: T
16. The test statistic t = (x̄ − µ) / (s/√n) is distributed symmetrically about its mean µ ( µ ≠ 0 ).
ANSWER: F
17. The t-distribution approaches the standard normal distribution as the number of degrees
of freedom increases.
ANSWER: T
18. The test statistic t = (x̄ − µ) / (s/√n) is distributed with a variance greater than 1, but as the
number of degrees of freedom increases, the variance approaches 1.
ANSWER: T
19. The number of degrees of freedom, df, is a statistic that identifies each different
distribution of Student’s t-distribution.
ANSWER: F
20. The number of degrees of freedom associated with s² is the divisor (n − 1) used to
calculate the sample variance s².
ANSWER: T
ANSWER: F
22. The Central Limit Theorem indicates that the t-distribution can also be applied to
nonnormal populations when the sample size is sufficiently large.
ANSWER: T
23. t (df, 0.95) is the same as t (df, 0.05) since the t-distribution is symmetric around its
mean, zero.
ANSWER: F
24. Once df is “greater than 100,” the critical values of the t-distribution are the same as the
corresponding critical values of the standard normal distribution.
ANSWER: T
25. t (df, 0.90) is the same as -t (df, 0.10) since the t-distribution is symmetric around its
mean, zero.
ANSWER: T
Multiple-Choice Questions
26. In a two-tailed test, with n = 20, the computed value of t is found to be t * = 1.85.
Assuming the sample is randomly selected from a normal population, then the p-value is
given by:
28. When testing the claim that the printing speed for a certain inkjet printer is at least 6 pages per
minute, which of the following would be the alternative hypothesis?
A) H a : µ > 6.0
B) H a : µ = 6.0
C) H a : µ < 6.0
D) H a : µ ≥ 6.0
ANSWER: C
29. Which of the following would be the null hypothesis and alternative hypothesis in testing
the claim that the mean gasoline consumption of a particular model of an automobile is
no more than 19 miles per gallon?
30. Which of the following would be the null hypothesis and alternative in testing the claim
that the mean waiting time to be served at a large post office is at least 6.5 minutes?
31. In comparing Student's t-distribution to the standard normal distribution, we see that
Student's t-distribution is:
A) x
B) s
C) σ
D) µ
ANSWER: B
33. A researcher wants to test the claim that the average female college student is at least
66 inches tall. A random sample of 25 female students produced a mean of 64.5 inches
and a standard deviation of 1.23 inches. The correct symbol for 64.5 inches is:
A) x .
B) s.
C) σ .
D) µ .
ANSWER: A
35. Which of the following statements is false regarding the test statistic t = (x̄ − µ) / (s/√n)?
A) Once df is greater than or equal to 10, the critical values of the t-distribution are the
same as the corresponding critical values of the standard normal distribution.
B) t(df, 0.99) is the same as -t(df, 0.01) since the t-distribution is symmetric around its
mean, zero.
C) t(10, 0.05) = 1.81
D) t(15, 0.95) = -1.75
ANSWER: A
37. Which of the following statements is false regarding a t-distribution with df = 15?
38. Which of the following statements is false regarding the test statistic t = (x̄ − µ) / (s/√n)?
Short-Answer Questions
ANSWER:
The sample standard deviation, s, is divided by the square root of the sample size.
ANSWER:
2.76
ANSWER:
2.09
ANSWER:
-2.65
ANSWER:
0.095
44. What distribution does the Student t-distribution approach as the degrees of freedom
become larger?
ANSWER:
ANSWER:
-1.31
ANSWER:
2.88
47. The alternative hypothesis is sometimes called the “research hypothesis.” The
conclusion is a statement written about the alternative hypothesis. Explain why these
two statements are compatible.
ANSWER:
The alternative hypothesis expresses the concern; the conclusion answers the concern.
ANSWER:
1.30
ANSWER:
-1.83
ANSWER:
-2.06
To test the null hypothesis that the mean waist size for males under 40 years equals 34 inches
versus the hypothesis that the mean differs from 34, the following data were collected: 33, 33,
30, 34, 34, 40, 35, 35, 32, 38, 34, 32, 35, 32, 32, 34, 36, 30.
ANSWER:
ANSWER:
t * = -0.25
ANSWER:
54. Test the stated hypothesis at α = .05 and write your conclusion.
ANSWER:
Since p-value > α , we fail to reject H o , and conclude that the mean waist size for males
under 40 equals 34.
55. A new supervisor initiates procedures to reduce the mean time of 6.34 hours currently
required to complete an assembly line procedure. In a random sample of 23 assembly
line runs, the mean time required was 5.77 hours with a sample standard deviation of
1.82 hours. At the 0.05 level of significance, test the claim that the mean time has been
reduced. Determine the critical region, the computed value of the test statistic, and the
decision reached.
ANSWER:
56. A machine produces 3-inch nails. A sample is obtained and the lengths determined. The
results are as follows: 2.89, 2.95, 3.00, 3.05, 2.99, 2.96, 3.10, 3.06, 3.00, and 3.12. Find
a 99% confidence interval for µ .
(2.94 to 3.09)
57. In order to estimate the pulse rate for young males (less than 30 years), the following
sample of pulse rates were obtained: 61, 73, 58, 64, 70, 64, 72, 60, 74, 65, 65, 80, 55,
72, 56, 56. Use these data to find a 95% confidence interval for µ , the mean for all such
males.
ANSWER:
(61.3 to 69.3)
ANSWER:
3.17
12
ANSWER:
0.10
The program director for a medical assistants' program wishes to test the hypothesis that her
students score higher than the national mean on the certified medical assistants' (CMA) exam.
She randomly selects 15 recent graduates of the yearlong program and finds that x = 640 and s =
25. Assume the national mean is 615.
ANSWER:
ANSWER:
t * = 3.87
ANSWER:
64. Test the hypothesis in question 61 at α = 0.01 and write your conclusion.
ANSWER:
Since p -value < α , we reject H o and conclude that the program director for the medical
assistants’ program was right that her students score higher than the national mean
(615) on the CMA exam.
65. Ten farms (randomly selected from a large agricultural region) were selected, and the
yield per acre in wheat was determined for each. The summary data were as follows:
x̄ = 95.0 and s = 8.5. Find a 95% confidence interval for the mean yield per acre for all
such farms in this region.
ANSWER:
(88.9 to 101.1)
66. Find a 95% confidence interval for µ , the mean amount of antibiotic per capsule.
ANSWER:
(247.7 to 254.9)
67. Give bounds on the p-value, and test H o : µ = 250 vs. H a : µ ≠ 250 at α = 0.10.
0.20 ≤ p-value ≤ 0.50. Since p-value > α , we fail to reject the null hypothesis. We
conclude that the average amount of antibiotic is 250 milligrams.
68. Give the critical region, the computed test statistic, and your conclusion if you used
these data to test the hypothesis in question 67 at the 0.05 level of significance.
ANSWER:
69. A machine produces 3-inch nails. A sample of 10 nails is obtained and the lengths
determined. The results are as follows: 2.89, 2.95, 3.00, 3.05, 2.99, 2.96, 3.10, 3.06,
3.00, and 3.12. Use these results to test H o : µ = 3.0 vs. H a :µ ≠ 3.0 at a level of
significance equal to 0.01. Give the critical region, the computed test statistic, and the
conclusion.
ANSWER:
Critical region: t ≤ −3.25 or t ≥ 3.25, t * = 0.534; Conclusion: unable to reject the null
hypothesis.
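The nail data serve both Question 56 (99% confidence interval) and Question 69 (two-tailed test at α = 0.01). The sketch below recomputes both from the raw lengths. Python's standard library has no t-distribution, so the critical value 3.25 is taken from the critical region stated in the answer above rather than computed; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

lengths = [2.89, 2.95, 3.00, 3.05, 2.99, 2.96, 3.10, 3.06, 3.00, 3.12]
MU0 = 3.0      # hypothesized mean length (inches)
T_CRIT = 3.25  # t(9, 0.005), read from the answer to Question 69

n = len(lengths)
x_bar = mean(lengths)
s = stdev(lengths)              # sample standard deviation, divisor n - 1
standard_error = s / sqrt(n)

t_star = (x_bar - MU0) / standard_error
reject_h0 = abs(t_star) >= T_CRIT
ci = (x_bar - T_CRIT * standard_error, x_bar + T_CRIT * standard_error)

print(f"t* = {t_star:.3f}, reject H o: {reject_h0}")
print(f"99% CI: {ci[0]:.3f} to {ci[1]:.3f}")
```

Because 3.0 lies inside the 99% interval, the two-tailed test at α = 0.01 cannot reject H o , which matches the stated conclusion.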
In order to test the claim that the mean of a particular normal population is greater than 4.8, the
following random sample was selected: 5, 7, 3, 4, 5, 4, and 6. The test is to be completed using
a level of significance α = 0.10.
ANSWER:
ANSWER:
The test statistic is t * and the level of significance is α = 0.10. We reject H o if t * > 1.44.
ANSWER:
t * = 0.11
ANSWER:
Since t * = 0.11, we fail to reject H o . There is not sufficient evidence to suggest that the
population mean is greater than 4.8.
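The calculated t ∗ for this sample can be reproduced from the seven data values. This is a sketch only: the function name is illustrative, and the critical value 1.44 is the tabled t(6, 0.10) quoted in the rejection rule above.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample: list, mu0: float) -> float:
    """Calculated test statistic t* = (x_bar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


sample = [5, 7, 3, 4, 5, 4, 6]
T_CRIT = 1.44  # t(6, 0.10), from the stated rejection rule

t_star = t_statistic(sample, 4.8)
reject_h0 = t_star > T_CRIT  # right-tailed test of H a : mu > 4.8
print(f"t* = {t_star:.2f}, reject H o: {reject_h0}")
```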
In order to test the claim that the mean of a particular normal population is greater than 7.6 the
following random sample was selected: 11, 6, 8, 9, 7, 6, 5, 10, 9, and 8. The test is to be
completed using α = 0.10.
ANSWER:
ANSWER:
Fail to reject H o . There is not sufficient evidence to suggest that the population mean is
greater than 7.6.
ANSWER:
s = 1.576
78. Find the first percentile of the Student’s t-distribution with df = 20.
ANSWER:
–2.53
79. Find the 95th percentile of the Student’s t-distribution with df = 20.
ANSWER:
1.72
ANSWER:
x = ∑ x / n = 540 / 40 = 13.5
ANSWER:
82. Find the 90% confidence interval to estimate the true mean textbook cost based on this
sample.
ANSWER:
µ = The mean textbook cost. Normality assumed. Since n = 40, x = 13.50, s = 6.445, and
1 − α = 0.90 , then α / 2 = 0.05; df = n − 1 = 39, and t(39, 0.05) ≈ 1.68.
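With the quantities listed above, the interval follows from x̄ ± t(39, 0.05)·s/√n. The sketch below carries out that arithmetic using only the values stated in the answer; the variable names are illustrative.

```python
from math import sqrt

n, x_bar, s = 40, 13.50, 6.445
T_CRIT = 1.68  # t(39, 0.05), as stated above

margin = T_CRIT * s / sqrt(n)  # maximum error of estimate
lower, upper = x_bar - margin, x_bar + margin
print(f"90% CI: {lower:.2f} to {upper:.2f}")
```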
83. Find the first quartile of the Student’s t-distribution with df = 20.
ANSWER:
–0.687
ANSWER:
85. Find the percent of the Student’s t-distribution that lies between t = –1.77 and
t = 3.01, when df = 13.
ANSWER:
The pulse rates for 10 adult women were as follows: 60, 72, 58, 78, 66, 82, 78, 99, 70, and 80.
ANSWER:
x̄ = ∑ x / n = 743 / 10 = 74.3
ANSWER:
88. Find the 90% confidence interval to estimate the true mean pulse rate for women based on
this sample.
89. Use the table of probability values for Student’s t-distribution with 10 degrees of freedom
to determine the p-value for testing H o : µ = 13.5 vs. H a : µ > 13.5, when the test statistic
t * = 1.94.
ANSWER:
90. Use the table of probability values for Student’s t-distribution with 10 degrees of freedom
to determine the p-value for testing H o : µ = 13.5 vs. H a : µ ≠ 13.5, when the test statistic
t * = 1.94.
ANSWER:
P = P(t < −1.94 | df = 10) + P (t > +1.94 | df = 10) = 2 P (t > 1.94 | df = 10) , we have 0.074 <
P < 0.086.
91. Use the table of probability values for Student’s t-distribution with 10 degrees of freedom
to determine the p-value for testing H o : µ = 13.5 vs. H a : µ ≠ 13.5, when the test statistic
t * = -1.94.
ANSWER:
P = p-value = P (t < −1.94 | df = 10) + P (t > +1.94 | df = 10) = 2 P (t > 1.94 | df = 10); we have
0.074 < P < 0.086.
ANSWER:
P = P (t < −1.94 | df = 10) = P (t > 1.94 | df = 10) , we have 0.037 < P < 0.043.
The p-value approach and the classical approach are two different approaches to
hypothesis testing. The former requires finding the p-value of the test, and the latter
requires finding the critical value(s) and the rejection region(s). Both approaches lead
to the same decision and conclusion.
93. Compare the p-value approach and classical approach to hypothesis testing by
comparing the decision of the p-value approach to the decision of the classical
approach, for testing H o : µ = 100 vs. H a : µ ≠ 100 , when n = 15, t * = 1.60, and α = 0.05.
ANSWER:
P = 2 P (t > 1.60 | df = 14) . Using the “Probability Values for Student’s t-Distribution” table,
we get 0.065 < ½ P < 0.068; hence 0.130 < P < 0.136. Since P > α , we fail to reject H o .
P = P (t > 2.16 | df = 24) . Using the “Probability Values for Student’s t-Distribution” table,
we get 0.019 < P < 0.024. Since P < α , we reject H o .
95. Compare the p-value approach and classical approach to hypothesis testing by
comparing the decision of the p-value approach to the decision of the classical
approach, for testing H o : µ = 40 vs. H a : µ < 40 , when n = 45, t * = -1.73, and α = 0.05.
ANSWER:
P = P (t < −1.73 | df = 44) = P (t > 1.73 | df = 44). Using the “Probability Values for
Student’s t-Distribution” table, we get 0.039 < P < 0.049. Since P < α , we reject H o .
ANSWER:
The results of the two techniques for each of the decisions made to questions 93, 94,
and 95 are identical.
97. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test: The mean weight of newborn babies is at least 5 lbs.
ANSWER:
98. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test: The mean age of patients at Mecosta County General Hospital is no more than 56
years.
ANSWER:
ANSWER:
H o : µ = 15 vs. H a : µ ≠ 15
100. What evidence do you have that the assumption of normality is reasonable? Explain.
ANSWER:
The “population” data ranged from 6% to 72%, therefore the midrange is 39%. When
the midrange is close in value to the mean, the distribution is approximately symmetrical;
therefore, the assumption of normality is reasonable.
101. Test the hypothesis of “different from” at a level of significance equal to 0.05, using the
p-value approach. Include t * , p-value, and your conclusion.
ANSWER:
P = 2 P (t > 1.07 | df = 14). Using the “Probability Values for Student’s t-Distribution” table,
we get 0.144 < ½ P < 0.169. Then 0.288 < P < 0.338. Since P > α , we fail to reject H o .
ANSWER:
The test statistic t * = 1.07 falls in the noncritical region, therefore we fail to reject H o .
We conclude that the sample does not provide sufficient evidence to justify the
contention that the mean percentage is different than 39%, at the 0.05 level of
significance.
It is claimed that the students at a certain university in Michigan will score an average of 85 on a
given test. Is the claim reasonable if a random sample of test scores from this university yields
83, 92, 88, 87, 80, and 92? Assume test results are normally distributed.
x = 87, s = 4.817
ANSWER:
ANSWER:
ANSWER:
P = p-value = 2 P (t > 1.02 | df = 5); using the “Probability Values for Student’s
t-Distribution” table, we have 0.161 < ½ P < 0.182, then 0.322 < P < 0.364. Since P > α ,
we fail to reject H o .
ANSWER:
Gasoline pumped from a supplier’s pipeline is supposed to have an octane rating of 86.5. On 13
consecutive days a sample was taken and analyzed with the following results: 87.6, 85.4, 86.2,
87.4, 86.2, 86.6, 85.8, 85.1, 86.4, 86.3, 85.4, 85.6, and 86.1. Assume that the octane ratings
have a normal distribution. We wish to determine at the 0.05 level of significance if there is
sufficient evidence to show that these octane readings were taken from gasoline with a mean
octane significantly less than 86.5.
ANSWER:
x = 86.162, s = 0.742
ANSWER:
ANSWER:
ANSWER:
P = p-value = P (t < −1.64 | df = 12) = P (t > 1.64 | df = 12); using the “Probability Values
for Student’s t-Distribution” table, we have 0.057 < P < 0.068. Since P > α , we fail to
reject H o . The sample does not provide sufficient evidence to conclude at the 0.05 level
of significance that the mean octane level is less than 86.5.
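The summary statistics and the test statistic reported for this problem can be reproduced from the 13 readings. The sketch below assumes the hypothesized mean is the stated rating of 86.5, since that is the value that reproduces the reported t* of −1.64; the variable names are illustrative.

```python
import math
import statistics

# The 13 daily octane readings from the problem statement.
octane = [87.6, 85.4, 86.2, 87.4, 86.2, 86.6, 85.8,
          85.1, 86.4, 86.3, 85.4, 85.6, 86.1]
mu_0 = 86.5  # rating the gasoline is supposed to have (assumed hypothesized mean)

n = len(octane)
x_bar = statistics.mean(octane)
s = statistics.stdev(octane)  # sample standard deviation (n - 1 divisor)
t_star = (x_bar - mu_0) / (s / math.sqrt(n))

print(f"x-bar = {x_bar:.3f}, s = {s:.3f}, t* = {t_star:.2f}")
```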
ANSWER:
A random sample of 20 weights is taken from babies born at the University of Iowa Hospital. A
mean of 7.55 lb and a standard deviation of 1.85 lb were found for the sample. Based on past
information, it is assumed that weights of newborns are normally distributed.
113. Estimate, with 95% confidence, the mean weight of all babies born in this hospital.
x ± E = 7.55 ± 0.865 . Thus, the 95% confidence interval for µ is 6.685 to 8.415.
ANSWER:
With 95% confidence, we estimate the mean weight of babies born at the University of
Iowa Hospital to be between 6.685 and 8.415 lbs.
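The interval above follows from E = t(df, α/2) · s / √n. A minimal sketch of that computation is given here; the value 2.09 for t(19, 0.025) is assumed to be read from a t table and is not computed by the script.

```python
import math

# Summary statistics from the problem statement.
n = 20
x_bar = 7.55
s = 1.85
t_table = 2.09  # t(19, 0.025), assumed table value

# Maximum error of estimate and the confidence limits.
E = t_table * s / math.sqrt(n)
lower, upper = x_bar - E, x_bar + E

print(f"E = {E:.3f}; interval = {lower:.3f} to {upper:.3f}")
```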
Consider the Student’s t-distribution with 20 degrees of freedom. Recall that the kth percentile,
denoted by Pk , is a value such that at most k% of the ranked data are smaller in value than Pk
and at most (100-k)% of the data are larger.
ANSWER:
P1 = -2.53
ANSWER:
P5 = -1.72
P10 = -1.33
ANSWER:
P25 = Q1 = -0.687
ANSWER:
P50 = Q2 = 0.0
ANSWER:
P75 = Q3 = 0.687
ANSWER:
P90 = 1.33
ANSWER:
ANSWER:
P99 = 2.53
ANSWER:
125. Find the percent of the Student’s t-distribution with df =10 that lies between –1.37 and
2.76.
ANSWER:
126. Find the percent of the Student’s t-distribution with df =15 that lies between –1.75 and
2.60.
ANSWER:
127. Find the percent of the Student’s t-distribution with df =20 that lies between – 0.687 and
2.09.
ANSWER:
ANSWER:
129. Ninety percent of Student’s t-distribution lies between t = –1.81 and t =1.81 for how
many degrees of freedom?
ANSWER:
df = 10
130. Ninety percent of Student’s t-distribution lies to the right of t = –1.44 for how many
degrees of freedom?
ANSWER:
df = 6
131. Eighty percent of Student’s t-distribution lies between t = –1.40 and t =1.40 for how
many degrees of freedom?
ANSWER:
df = 8
132. Ninety five percent of Student’s t-distribution lies between t = –2.12 and t =2.12 for how
many degrees of freedom?
ANSWER:
df = 16
ANSWER:
df = 18
134. Ninety nine percent of Student’s t-distribution lies to the left of t = 2.68 for how many
degrees of freedom?
ANSWER:
df = 12
135. Construct a 90% confidence interval estimate for the mean µ using the sample
information n =21, x =13.6, and s =2.4.
ANSWER:
While doing an article on the high cost of college education, a reporter took a random sample of
the cost of new textbooks for a semester. The random variable x is the cost of one book. Her
sample data can be summarized by n = 51, ∑ x = 4425.88, and ∑ ( x − x̄ )² = 12,280.12.
x̄ = ∑ x / n = 4425.88 / 51 = $86.78
ANSWER:
138. Find the 90% confidence interval to estimate the true mean textbook cost for the
semester based on this sample.
ANSWER:
ANSWER:
With 90% confidence, we estimate the average cost of a new college textbook to be
between $83.09 and $90.47.
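The mean, standard deviation, and interval for the textbook data can be recovered from the three summary values alone. The sketch below assumes a table value of about 1.68 for t(50, 0.05); small differences in the last digit can arise from rounding intermediate results.

```python
import math

# Summary values from the problem statement.
n = 51
sum_x = 4425.88
sum_sq_dev = 12280.12  # sum of squared deviations from the sample mean
t_table = 1.68         # approximate t(50, 0.05), assumed table value

x_bar = sum_x / n
s = math.sqrt(sum_sq_dev / (n - 1))  # sample standard deviation
E = t_table * s / math.sqrt(n)       # maximum error of estimate
lower, upper = x_bar - E, x_bar + E

print(f"x-bar = {x_bar:.2f}, s = {s:.2f}, E = {E:.2f}")
print(f"90% interval: {lower:.2f} to {upper:.2f}")
```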
The pulse rates for 15 adult women were 95, 66, 76, 106, 84, 76, 81, 56, 68, 54, 74, 62, 78, 74,
and 68.
x= ∑ x / n = 1110 / 15 = 74
ANSWER:
142. Find the maximum error of estimate for a 90% confidence interval for µ .
ANSWER:
143. Find the lower and upper confidence limits for a 90% confidence interval.
ANSWER:
x ± E = 74.000 ± 6.429 . Hence, LCL = 67.571 ≈ 67.6 and UCL = 80.429 ≈ 80.4.
ANSWER:
With 90% confidence, we estimate the average pulse rate for adult women to be
between 67.6 and 80.4.
ANSWER:
146. What assumption is required to ensure the validity of the results to question 145?
ANSWER:
These results are based on the assumption that the variable Quiz Score is
approximately normally distributed. If this is not the case, then these results might not be
valid, especially since a sample size of 20 is considered small.
ANSWER:
148. What is the effect of decreasing the confidence level from 98% to 90%?
ANSWER:
149. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to test
the following claim: “A chicken farmer claims that his chickens have a mean weight of 4
pounds.”
H o : µ = 4 vs. H a : µ ≠ 4
150. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to test
the following claim: “The mean age of Egypt’s commercial jets is less than 25 years.”
ANSWER:
H o : µ = 25 ( ≥ ) vs. H a : µ < 25
151. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to test
the following claim: “The mean monthly unpaid balance on Discover card accounts is
more than $425.”
ANSWER:
152. Determine the p-value for testing H o : µ = 20 vs. H a : µ < 20, if t* = −2.01 .
ANSWER:
153. Determine the p-value for testing H o : µ = 20 vs. H a : µ > 20, if t* = 2.01 .
ANSWER:
P = p-value = P(t < -2.01 | df =10) + P(t > +2.01 | df =10) = 2P(t > 2.01| df =10)
Using the “Critical Values of Student’s t-Distribution” Table, we get: 0.05 < P < 0.10
Using the “Probability Values for Student’s t-Distribution” Table, we get: 0.062 < P < 0.074
ANSWER:
P = p-value = P(t < -2.01 | df =10) + P(t > +2.01 | df =10) = 2P(t > 2.01| df =10)
Using the “Critical Values of Student’s t-Distribution” Table, we get: 0.05 < P < 0.10
Using the “Probability Values for Student’s t-Distribution” Table, we get: 0.062 < P < 0.074
156. Draw an approximately normal distribution curve to determine the critical region and
critical value(s) that would be used in the classical approach to test the hypothesis
H o : µ = 18 vs. H a : µ ≠ 18 given that α = 0.05 and n =15 .
ANSWER:
ANSWER:
158. Draw an approximately normal distribution curve to determine the critical region and
critical value(s) that would be used in the classical approach to test the hypothesis
H o : µ = −32 vs. H a : µ < −32 given that α =0.05 and n = 18 .
ANSWER:
ANSWER:
Homes in nearby East Lansing, Michigan have a mean value of $178,750. It is assumed that
homes in the vicinity of Michigan State University (MSU) have a higher value. To test this
theory, a random sample of 12 homes is chosen from the MSU area. Their mean valuation is
$182,210 and the standard deviation is $5,600. Assume prices are normally distributed, and
that α = 0.05 is used in testing the appropriate hypothesis.
ANSWER:
161. Test the hypothesis in question 160 using the p-value approach.
ANSWER:
Using the “Critical Values of Student’s t-Distribution” Table, we get: 0.025 < P < 0.05.
Using the “Probability Values for Student’s t-Distribution” Table, we get 0.024 < P <
0.031. Since p-value < α = 0.05, we reject H o . The sample does provide sufficient
evidence to justify the contention that the mean value is higher than $178,750 at the
0.05 level of significance.
162. Test the hypothesis in question 160 using the classical approach.
ANSWER:
Since the value of the test statistic t * = 2.14 falls in the rejection region, we reject H o at
α = 0.05, and reach the same conclusion as stated in question 161.
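The test statistic used in questions 161 and 162 comes directly from the summary statistics. A minimal sketch follows; the critical value 1.80 for t(11, 0.05) is an assumed table value, not something the script derives.

```python
import math

# Summary statistics from the problem statement.
mu_0 = 178750   # mean home value in East Lansing
n = 12
x_bar = 182210
s = 5600

# t* = (x_bar - mu_0) / (s / sqrt(n))
t_star = (x_bar - mu_0) / (s / math.sqrt(n))

t_critical = 1.80  # t(11, 0.05), assumed table value for the right-tailed test
reject_h0 = t_star > t_critical

print(f"t* = {t_star:.2f}, reject Ho: {reject_h0}")
```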
The weights of 20 adult males were recorded as: 169, 174, 149, 152, 163, 175, 169, 133, 163,
170, 148, 167, 159, 166, 149, 155, 195, 127, 190, and 185. It is believed that the mean weight
for adult males is at least 160 lb. Assume that the weights for adult males are normally
distributed.
ANSWER:
164. Use a computer to calculate the sample mean and sample standard deviation.
ANSWER:
ANSWER:
ANSWER:
Using the “Probability Values for Student’s t-Distribution” Table, we get: 0.216 < P < 0.246.
ANSWER:
p-value = 0.231
168. Use a computer to verify your answers to questions 165, 166, and 167.
ANSWER:
ANSWER:
Since p-value > α = 0.05, we fail to reject H o . The sample does not provide sufficient
evidence to justify the contention that the mean weight for adult males is higher than 160
lbs.
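Where no statistical package is at hand, the p-value reported for the weight data can be approximated with the standard library alone. The sketch below assumes a right-tailed alternative (mean weight higher than 160 lb), as the conclusion above implies, and integrates the t density numerically with Simpson's rule; the helper names t_pdf and t_upper_tail are hypothetical.

```python
import math
import statistics

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=1000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

weights = [169, 174, 149, 152, 163, 175, 169, 133, 163, 170,
           148, 167, 159, 166, 149, 155, 195, 127, 190, 185]
mu_0 = 160

n = len(weights)
x_bar = statistics.mean(weights)
s = statistics.stdev(weights)
t_star = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_star, n - 1)

print(f"x-bar = {x_bar:.1f}, s = {s:.2f}, t* = {t_star:.2f}, p-value = {p_value:.3f}")
```

The approximation agrees with the reported p-value of 0.231 and falls inside the table bounds 0.216 < P < 0.246.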
The water pollution readings at Lake Michigan seem to be lower than last year. A sample of 15
readings was randomly selected from the records of this year’s daily readings: 2.9, 3.2, 4.6, 3.1,
3.3, 3.7, 2.6, 2.9, 2.3, 3.3, 4.2, 2.9, 2.9, 3.1, and 2.6. A researcher claims that the mean of this
year’s pollution readings is significantly lower than last year’s mean of 3.60. Assume that all
such readings have a normal distribution.
ANSWER:
171. Use a computer to calculate the sample mean and sample standard deviation.
ANSWER:
172. Does this sample provide sufficient evidence to support the researcher’s claim at the
0.05 level? Use a computer to complete the hypothesis test.
ANSWER:
173. The recommended number of hours of sleep per night is 8 hours, but everybody “knows”
that the average college student sleeps less than 7 hours. The numbers of hours slept
last night by 15 randomly selected college students are: 5.0, 6.6, 6.0, 5.3, 7.6, 5.6, 6.9,
7.9, 6.7, 5.4, 6.5, 7.2, 5.9, 6.8, and 7.0. Assume that the variable sleeping hours is
approximately normally distributed. Use a computer to test the hypothesis
H o : µ = 7 vs. H a : µ < 7 at α = 0.02.
Since p-value = 0.011 < α = 0.02, we reject H o . The sample does provide sufficient
evidence to justify the belief that college students sleep on average less than 7 hours
per night.
ANSWER:
H o : µ = 35 vs. H a : µ ≠ 35
ANSWER:
176. Use a computer to complete the hypothesis test using the p-value approach at α = 0.05.
ANSWER:
Since p-value = 0.096 > α = 0.05 , we fail to reject H o . The sample does not provide
sufficient evidence to justify that the average MCAT score for medical students at the
University of Michigan is different from 35.
177. Complete the hypothesis test using the classical approach at α = 0.05.
t(df, α /2) = t(9, 0.025) = 2.26. The critical values are ± 2.26.
The value of the test statistic t * = -1.862 does not fall in the rejection region; therefore we
fail to reject H o at α = 0.05. We reach the same conclusion as stated in question 176.
178. Use a computer to construct a 95% confidence interval for the MCAT average score.
ANSWER:
179. Verify the lower and upper 95% confidence limits for µ shown on the computer output in
question 178.
ANSWER:
ANSWER:
Since the hypothesized value µ = 35 falls in the 95% confidence interval, we fail to reject
H o at α = 0.05.
It has been suggested that abnormal male children tend to occur more often in children born to
older-than-average mothers. Case histories of 25 abnormal males were obtained; the ages of the 25
mothers were:
21 39 31 21 29 28 34 45 21 41
31 38 40 38 32 28 37 28 16 39
35 29 43 27 42
The mean age at which mothers in the general population give birth is 28.0 years. Assume
ages have a normal distribution.
ANSWER:
182. Use a computer to calculate the sample mean and standard deviation.
183. Does the sample give sufficient evidence to support the claim that abnormal male
children have older-than-average mothers? Use a computer and the p-value approach at
α = 0.05.
ANSWER:
Since p-value = 0.004 < α = 0.05, we reject H o . Yes, the sample provides sufficient
evidence to support the claim that the mean age of mothers of abnormal male children is
significantly greater than the mean age of mothers with normal male children, at the 0.05
level.
184. Does the sample give sufficient evidence to support the claim that abnormal male
children have older-than-average mothers? Use a computer and the classical approach at
α = 0.05.
The value of the test statistic t * = 2.909 does fall in the rejection region; therefore we
reject H o at α = 0.05. We reach the same conclusion stated in question 183.
Section 9.2
True-False Questions
185. The maximum error of estimate for a proportion is a multiple of the standard error of
proportion.
ANSWER: T
186. The best point estimate of the population proportion p is the observed proportion p′ .
ANSWER: T
187. In determining the sample size required to estimate a population proportion, the size of
sample needed may need to be reduced if a reasonably good estimate for p exists from
previous studies or perhaps from a small pilot study.
ANSWER: T
ANSWER: F
ANSWER: F
ANSWER: T
ANSWER: F
192. If a random sample of size n is selected from a large population with p =P(success), then
the sampling distribution of p ′ has a mean equal to p ′ .
ANSWER: F
193. If a random sample of size n is selected from a large population with p =P(success), then
the sampling distribution of p′ has a standard error σ p′ equal to √( pq / n ).
ANSWER: T
194. When we construct a confidence interval for the population proportion p, we will base our
estimation on the biased sample statistic p′ , where p′ is the center of the confidence
interval.
ANSWER: F
195. If a random sample of size n is selected from a large population with p =P(success), then
the sampling distribution of p′ has an approximately normal distribution if n is sufficiently
large.
ANSWER: T
ANSWER: F
197. When the binomial parameter p is to be tested using a hypothesis-testing procedure, the
test statistic is assumed to be normally distributed when the null hypothesis is true, when
the assumptions for the test have been satisfied, and when n is sufficiently large (n > 20,
np > 5, and nq > 5).
ANSWER: T
Multiple-Choice Questions
198. Which of the following would be the hypotheses in testing the claim that the percentage
of students who have part-time jobs is at least 82%?
199. Which of the following would be the hypothesis for testing the claim that the proportion of
students at a large university who smoke is significantly different from 0.15?
201. In references about the binomial probability of success, the largest possible value of pq
where p = P(success) and q = P(failure) is:
A) 1.00.
B) 0.75.
C) 0.50.
D) 0.25.
ANSWER: D
202. When testing the claim that bags of M&M candies will have less than 3% broken pieces,
which of the following would be the null hypothesis and alternative hypothesis?
A) gets smaller.
B) also gets larger.
C) stays the same.
D) size depends on n.
ANSWER: A
204. If we do not know the value of the theoretical probability of a success on a single trial in
a binomial experiment, then the best replacement available of the standard error of
proportion is:
A) npq
B) np ′q ′
C) pq / n
D) √( p′q′ / n )
ANSWER: D
205. The standard deviation of the sampling distribution of the sample binomial probability p ′
is:
A) p
B) np
C) npq
D) √( pq / n )
ANSWER: D
206. The mean of the sampling distribution of the sample binomial probability p ′ is:
A) p
B) np
C) npq
D) pq
207. Which of the following is not true about the binomial parameter p?
A) The point estimate is the center of the confidence interval, and the hypothesized
mean is the center of the noncritical region.
B) If the hypothesized value of p is contained in the confidence interval, then the null
hypothesis will be rejected.
C) If the hypothesized value of p does not fall within the confidence interval, then the
test statistic will be in the critical region.
D) If the hypothesized value of p is contained in the confidence interval, then the test
statistic will be in the noncritical region.
ANSWER: B
Short-Answer Questions
209. If the claim “65% of all new cars bought in 1991 were compacts” were tested, what
distribution would be used to determine the p-value for the test?
ANSWER:
210. Assume a random sample of size n is selected from a large population with p
=P(success). Briefly discuss the practical guidelines that will ensure normality for the
sample binomial probabilities p′ .
ANSWER:
A particular candidate claims she has the support of at least 60% of the voters in her district. A
random sample of 150 voters yields 87 who support her. The candidate wishes to test her claim
at the 0.05 level of significance.
ANSWER:
ANSWER:
ANSWER:
ANSWER:
Fail to reject H o . There is not sufficient evidence to conclude that the candidate has the
support of less than 60% of the voters.
ANSWER:
(0.506, 0.634)
216. To test H o : p = 0.7(≤) vs. H a : p > 0.7 , a sample of size 75 is selected at random. What
is the minimum value of the binomial random variable x that would result in rejection of
H o if α = 0.05?
ANSWER:
Minimum value = 60
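The minimum count of 60 can be traced by working backward from the critical z-value. The sketch below assumes the normal approximation to the binomial applies (n = 75 with np and nq both above 5) and uses the standard library's NormalDist for the critical value; variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 75
p_0 = 0.7
alpha = 0.05

# Right-tailed critical value z(0.05), roughly 1.645.
z_critical = NormalDist().inv_cdf(1 - alpha)

# Standard error of p' under the null hypothesis.
se = math.sqrt(p_0 * (1 - p_0) / n)

# Ho is rejected when x / n exceeds p_0 + z_critical * se,
# so the smallest rejecting count is the next integer above this boundary.
x_boundary = n * (p_0 + z_critical * se)
min_x = math.floor(x_boundary) + 1

print(f"boundary = {x_boundary:.2f}, minimum x = {min_x}")
```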
217. Determine the sample size that is required to estimate the true proportion of homes with
a DVD if you want your estimate to be within 0.03 with 90% confidence.
ANSWER:
757 homes
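The sample size here comes from n = z² · p* · q* / E², rounded up, with p* = 0.5 when no prior estimate is available. A small sketch of that formula follows; the function name sample_size is hypothetical, and z = 1.65 is assumed to be the rounded table value for 90% confidence that the answer relies on.

```python
import math

def sample_size(z, E, p_star=0.5):
    """Sample size for estimating a proportion to within E, rounded up."""
    return math.ceil(z ** 2 * p_star * (1 - p_star) / E ** 2)

# 90% confidence (z taken as 1.65), maximum error 0.03, no prior estimate of p.
n_required = sample_size(z=1.65, E=0.03)
print(n_required)
```

Supplying a prior estimate through p_star (any value other than 0.5) gives a smaller required sample, because p*q* is largest at p* = 0.5.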
218. A machine produces 3-inch nails. A sample of 100 nails is selected, and it is found that
25 are shorter than 3.00 inches. Find a 95% confidence interval of the proportion of all
such nails that are shorter than 3.00 inches.
ANSWER:
(0.17 to 0.33)
ANSWER:
ANSWER:
221. State the decision and conclusion at the 0.05 level of significance.
ANSWER:
Since p-value > α , we fail to reject H o . There is not sufficient evidence to conclude that
the percentage of insurance company claims that are settled within two months of being
filed is less than 75%.
222. A marketing research firm wishes to conduct a poll in a certain region to estimate the
proportion of residents who would oppose the construction of a pipeline. Determine the
sample size needed in order to be 90% confident that the sample proportion will be
within 0.05 of the true proportion.
ANSWER:
n = 273
ANSWER:
p -value = 0.0104. Since p – value < α , reject the null hypothesis. We conclude that less
than 80% of the seeds germinate.
ANSWER:
n = 139
ANSWER:
n = 385
In order to estimate the proportion of universities that provide some dental coverage for their
employees, a survey was conducted. Thirty-eight out of 75 universities responded yes to the
survey.
ANSWER:
227. Estimate the proportion of all universities that provide some dental coverage by
constructing a 98% confidence interval for p.
ANSWER:
(0.37 to 0.64)
228. The null hypothesis being tested is “a coin is fair” and the alternative hypothesis is “the
coin favors heads.” Let p be the probability of a head occurring. The null hypothesis is
H o : p = 0.5 , and the alternative is H a : p > 0.5 . The test statistic, x, is the number of
heads to occur in a set of 12 tosses of this coin. Determine the largest critical region for
which α does not exceed 0.05, by using a discrete variable. (Determine what values of
x form the critical region, and state the corresponding value of α).
ANSWER:
On a test of 12 True/False questions we wish to test the null hypothesis that “a student guessed
at the answers” versus the alternative that the student “studied and performed better than he or
she would have by simply guessing.” The test statistic is x, the number of correct answers out of the 12.
ANSWER:
0.073
ANSWER:
0.073
231. Find α using a discrete variable if the critical region is x > 10.
ANSWER:
0.003
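When x is used as a discrete test statistic, α is the binomial probability of the critical region under the null hypothesis (here n = 12 and p = 0.5). A minimal sketch of that calculation is shown below; the helper name upper_tail is hypothetical.

```python
from math import comb

def upper_tail(n, p, k):
    """P(X >= k) for a binomial(n, p) random variable."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k, n + 1))

# Critical region x > 10 means x = 11 or x = 12.
alpha = upper_tail(12, 0.5, 11)
print(f"alpha = {alpha:.3f}")
```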
In order to test at the 0.10 level of significance the claim that at least 60% of a large student
population is in favor of an administrative proposal, a random sample of 150 students is
selected. Of this number, 88 are in favor of the proposal.
ANSWER:
ANSWER:
The test statistic is z * . The level of significance is α = 0.10. The critical value is z = −1.28.
Reject H o if z * < −1.28.
[Figure: standard normal curve with the critical region to the left of z = −1.28]
ANSWER:
z * = −0.33
ANSWER:
Fail to reject the null hypothesis. There is not sufficient evidence to indicate that the
proportion of student population who are in favor of the administrative proposal is less
than 0.60.
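The test statistic for this problem is z* = (p′ − p) / √(pq / n). A short sketch of the calculation and the classical decision follows; the critical value −1.28 is copied from the answer above and the variable names are illustrative.

```python
import math

n = 150
x = 88        # students in favor of the proposal
p_0 = 0.60    # hypothesized proportion

p_prime = x / n
z_star = (p_prime - p_0) / math.sqrt(p_0 * (1 - p_0) / n)

z_critical = -1.28  # left-tailed critical value at alpha = 0.10, from the answer above
reject_h0 = z_star < z_critical

print(f"p' = {p_prime:.4f}, z* = {z_star:.2f}, reject Ho: {reject_h0}")
```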
236. A sample study was randomly selected to construct a 95% confidence interval for p. The
interval estimate was (0.078, 0.142). Find the value of p ′ , the observed binomial
probability.
ANSWER:
p ′ = 0.11
237. Find the best estimate of the standard error of p ′ if a sample of size 53 yields 16
successes.
ANSWER:
σ p = 0.063
238. Give a point estimate for the population proportion of households who have an
answering machine.
ANSWER:
p′ = x / n = 85 / 400 = 0.2125
ANSWER:
240. Construct a 95% confidence interval for the true proportion of households who have an
answering machine.
ANSWER:
241. Independent bank randomly selected 400 checking-account customers and found that
150 of them also had savings accounts at this same bank. Construct a 95% confidence
interval for the true proportion of checking-account customers who also have savings
accounts.
ANSWER:
p = the proportion of checking account customers who also have savings accounts.
The sample was randomly selected and each subject’s response was independent of
those of the others surveyed.
Then p′ ± E = 0.375 ± 0.0474 , and the 95% interval for p is 0.3276 to 0.4224.
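The interval above uses E = z(α/2) · √(p′q′ / n). The following is a minimal sketch of that computation, assuming the usual table value z(0.025) = 1.96; variable names are illustrative.

```python
import math

n = 400
x = 150          # customers who also have savings accounts
z_table = 1.96   # z(0.025), assumed table value for 95% confidence

p_prime = x / n
E = z_table * math.sqrt(p_prime * (1 - p_prime) / n)
lower, upper = p_prime - E, p_prime + E

print(f"p' = {p_prime:.3f}, E = {E:.4f}, interval = {lower:.4f} to {upper:.4f}")
```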
A policeman wishes to conduct a survey in his city to determine what percent of the bicyclists
own helmets. He decided to use the known national figure of 18% for his initial estimate of p.
242. Find the sample size if he wants his estimate to be within 0.02 with 90% confidence.
ANSWER:
243. Find the sample size if he wants his estimate to be within 0.03 with 90% confidence.
ANSWER:
244. Find the sample size if he wants his estimate to be within 0.02 with 98% confidence.
ANSWER:
245. What effect does changing the level of confidence have on the sample size? Explain.
ANSWER:
ANSWER:
247. It is known that about 15% of lung cancer patients survive for five years after diagnosis.
Suppose a physician wants to see if this survival rate is accurate. How large a sample
would he need to take to estimate the true proportion surviving for five years after
diagnosis to within 1% with 95% confidence?
ANSWER:
248. Determine the p-value testing H o : p = 0.25 vs. H a : p ≠ 0.25, if the value of the test
statistic z * = 1.84.
ANSWER:
249. Determine the p-value testing H o : p = 0.75 vs. H a : p ≠ 0.75 , if the value of the test statistic
z * = -2.05.
ANSWER:
250. Determine the p-value testing H o : p = 0.46 vs. H a : p > 0.46 , if the value of the test statistic
z * = 0.89.
251. Determine the p-value testing H o : p = 0.12 vs. H a : p < 0.12 , if the value of the test statistic
z * = -1.69.
ANSWER:
252. The binomial random variable, x, may be used as the test statistic when testing
hypotheses about the binomial parameter, p, when n is small (say, 15 or less). Use the
table of binomial probabilities and determine the p-value for testing H o : p = 0.4
vs. H a : p ≠ 0.4, where n = 13 and x = 10 .
ANSWER:
253. The binomial random variable, x, may be used as the test statistic when testing
hypotheses about the binomial parameter, p, when n is small (say, 15 or less). Use the
table of binomial probabilities and determine the p-value for testing H o : p = 0.3
vs. H a : p ≠ 0.3, where n = 15 and x = 10 .
ANSWER:
ANSWER:
255. The binomial random variable, x, may be used as the test statistic when testing
hypotheses about the binomial parameter, p, when n is small (say, 15 or less). Use the
table of binomial probabilities and determine the p-value for testing H o : p = 0.9
vs. H a : p < 0.9, where n = 13 and x = 9 .
ANSWER:
256. Use the table of binomial probabilities to determine the critical region used in testing
each of the hypothesis H o : p = 0.4 vs. H a : p > 0.4, where n = 15 and α = 0.05 . (Note: Since
x is discrete, choose critical regions that do not exceed the value of α given.)
ANSWER:
257. Use the table of binomial probabilities to determine the critical region used in testing
each of the hypothesis H o : p = 0.5 vs. H a : p ≠ 0.5, where n = 14 and α = 0.05 . (Note: Since
x is discrete, choose critical regions that do not exceed the value of α given.)
ANSWER:
ANSWER:
259. Use the table of binomial probabilities to determine the critical region used in testing
each of the hypothesis H o : p = 0.7 vs. H a : p > 0.7, where n = 13 and α = 0.01 . (Note: Since
x is discrete, choose critical regions that do not exceed the value of α given.)
ANSWER:
State Farm insurance company states that 90% of its claims are settled within 5 weeks. A
consumer group selected a random sample of 100 of the company’s claims to test this
statement. If the consumer group found that 75 of the claims were settled within 5 weeks, do
they have sufficient reason to support their contention that fewer than 90% of the claims are
settled within 5 weeks?
ANSWER:
261. Identify the probability distribution to be used and calculate the test statistic.
Since n = 100 > 20, np = (100)(0.90) = 90 > 5, and nq = (100)(0.10) = 10 > 5, then p′ is expected
to be approximately normally distributed.
262. Complete the test at the 0.05 level of significance using the p-value approach.
P = p-value = P( z < −5.0) = P( z > 5.0); Using the table of standard normal distribution,
we have P = 0.5000 – 0.4999997 = 0.0000003. Since P < α ; we reject H o .
263. Complete the test at the 0.05 level of significance using the classical approach.
ANSWER:
The test statistic z * falls in the critical region, therefore we reject H o , and conclude that
the sample provides sufficient evidence that p is significantly less than 0.90; it appears
that less than 90% are settled within 5 weeks as claimed, at the 0.05 level of
significance.
ANSWER:
Then, p′ ± E = 0.35 ± 0.0935 , and the 95% interval for p is 0.2565 to 0.4435.
The full-time student body of Big Rapids High School is composed of 50% males and 50%
females. Does a random sample of 25 male and 15 female students from a calculus
course show sufficient evidence to reject the hypothesis that the proportions of male and female
students who take this course are the same as those of the whole student body?
ANSWER:
266. Identify the probability distribution to be used and calculate the value of the test statistic.
ANSWER:
Since n = 40; n > 20, np = (40)(0.50) = 20 > 5, and nq = (40)(0.50) = 20 > 5, then p′ is expected
to be approximately normally distributed. x = 25, p′ = x / n = 25 / 40 = 0.625. Then,
z* = ( p′ − p ) / √( pq / n ) = (0.625 − 0.50) / √( (0.5)(0.5) / 40 ) = 1.58
ANSWER:
The p-value approach: P = 2 ⋅ P ( z > 1.58); using the table of standard normal
distribution, we have P = 2(0.5000 – 0.4429) = 0.1142. Since P > α , we fail to reject H o .
The sample does not provide sufficient evidence that the proportion is significantly different
from 0.50 at the 0.05 level; that is, the sample evidence does not indicate the proportion
of males taking calculus to be different from 50%.
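The two-tailed p-value can also be obtained from the standard normal cumulative distribution instead of a printed table. The sketch below uses the standard library's NormalDist and rounds z* to two decimals to mimic a table lookup; that rounding step is an assumption made only to match the table-based answer.

```python
import math
from statistics import NormalDist

n = 40
x = 25       # males in the sample
p_0 = 0.50   # proportion of males in the whole student body

p_prime = x / n
z_star = round((p_prime - p_0) / math.sqrt(p_0 * (1 - p_0) / n), 2)

# Two-tailed p-value: twice the area beyond |z*|.
p_value = 2 * (1 - NormalDist().cdf(abs(z_star)))

alpha = 0.05
print(f"z* = {z_star}, p-value = {p_value:.4f}, reject Ho: {p_value < alpha}")
```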
268. Complete the test at the 0.05 level of significance using the classical approach.
ANSWER:
The test statistic z * = 1.58 falls in the noncritical region, therefore we fail to reject H o . We
reach the same conclusion as stated in question 267.
Section 9.3
True-False Questions
ANSWER: F
270. The chi-square distribution is used for inferences about the population mean µ when the
standard deviation σ is unknown.
ANSWER: F
271. Often the concern with testing the variance (or standard deviation) is to keep its size
under control or relatively small. Therefore, many of the hypotheses tests with chi-
square will be one-tailed.
ANSWER: T
272. The Student’s t-distribution is used for all inferences about a population’s variance.
ANSWER: F
273. The chi-square distribution is a skewed distribution whose mean value is n for degrees
of freedom larger than two.
ANSWER: F
274. When random samples are drawn from a normal population of a known variance σ² , the
quantity (n − 1) s² / σ² possesses a probability distribution that is known as the chi-square
distribution, with (n – 1) degrees of freedom.
ANSWER: T
275. The chi-square distributions, like the Student’s t-distributions, are a family of probability
distributions, with each member of the family being identified by the number of degrees
of freedom.
ANSWER: T
ANSWER: F
277. Inferences about the variance of a normally distributed population use the chi-square,
χ 2 , distributions.
ANSWER: T
278. When random samples are drawn from a normal population with a known variance σ²,
the quantity (n − 1)s² / σ² possesses a probability distribution that is known as the
chi-square distribution with n − 1 degrees of freedom.
ANSWER: T
Multiple-Choice Questions
279. The mean age of 25 randomly selected college seniors was found to be 23.5 years, and
the standard deviation of all college seniors was 1.3 years. The correct symbol for the
1.3 years is which of the following?
A) µ
B) s
C) σ
D) x
ANSWER: C
A) degrees of freedom.
B) median.
C) mode.
D) standard deviation.
ANSWER: A
282. Which of the following statements is false as a property of the chi-square distribution?
A) Inferences about the variance of a normally distributed population use the chi-
square, χ 2 , distributions.
B) χ 2 ( df , α ) (read “chi-square of df, alpha”) is the symbol used to identify the critical
value of chi-square with df degrees of freedom and with α area to the right.
C) When df > 2, the mean value of the chi-square distribution is the square root of the df.
D) None of the above
ANSWER: C
A) The t procedures for inferences about the mean were based on the assumption of
normality, but they are generally useful even when the sampled population is
nonnormal, especially for larger samples.
B) The statistical procedures for the standard deviation are very sensitive to nonnormal
distributions (skewness, in particular), and this makes it difficult to determine whether
an apparent significant result is the result of the sample evidence or a violation of the
assumptions.
285. Which of the following critical values of the chi-square distribution is the smallest?
A) χ 2 (15, 0.95 )
B) χ 2 (18, 0.95 )
C) χ 2 ( 32, 0.95)
D) χ 2 ( 40, 0.95 )
ANSWER: A
286. Which of the following critical values of the chi-square distribution is the largest?
287. Which of the following critical values of the chi-square distribution is the smallest?
A) χ 2 (16, 0.01)
B) χ 2 (10, 0.10 )
C) χ 2 ( 24, 0.50 )
D) χ 2 ( 28, 0.95 )
ANSWER: B
288. Which of the following critical values of the chi-square distribution is the largest?
A) χ 2 ( 20, 0.025 )
B) χ 2 (12, 0.95 )
C) χ 2 ( 8, 0.005 )
D) χ 2 (15, 0.90 )
ANSWER: A
Short-Answer Questions
289. For a chi-square distribution with a mean value of 30, find the area under the curve to
the right of 34.8.
ANSWER:
0.25
ANSWER:
291. If we correctly reject the claim that a population variance is at least 25.0, then can we
also reject the claim that the population standard deviation is at least 5.0? Explain.
ANSWER:
The techniques employ the sample variance rather than the sample standard deviation.
Since the standard deviation is the positive square root of the variance, talking about the
variance is comparable to talking about the standard deviation. Thus, we could also
reject the claim that the standard deviation is at least 5.0.
292. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim: The standard deviation has increased from its previous value of 15.
ANSWER:
293. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim: The standard deviation is no larger than 0.4 oz.
ANSWER:
294. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim: The standard deviation is not equal to 5.2.
295. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim: The variance is no less than 10.
ANSWER:
296. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim: The variance is different from the value of 0.025.
ANSWER:
297. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim: The variance has increased from 13.2.
ANSWER:
ANSWER:
26.2
ANSWER:
27.5
ANSWER:
10.9
ANSWER:
10.5
ANSWER:
32.0
ANSWER:
31.5
ANSWER:
16.0
ANSWER:
43.0
ANSWER:
16.9
ANSWER:
5.01
ANSWER:
29.1
ANSWER:
29.7
ANSWER:
H o : σ 2 = 9.0, H a : σ 2 ≠ 9.0
ANSWER:
ANSWER:
χ 2 * = 7.56
Fail to reject null. There is not sufficient evidence to suggest that the variance is not
equal to 9.0.
314. Give a bound on the p-value for testing: H o : σ 2 = a vs. H a : σ 2 > a given that the
computed test statistic = 25.2 and n = 15.
ANSWER:
315. Give a bound on the p-value for testing: H o : σ 2 = b vs. H a : σ 2 > b given that the computed
test statistic = 6.10 and n = 15.
ANSWER:
316. The null hypothesis H o : σ 2 = 150 is tested against H a : σ 2 > 150 . For a sample of size 20,
the p-value has the bound 0.05 < p < 0.10. What is the range of s 2 ?
ANSWER:
ANSWER:
ANSWER:
s 2 = 5.1552
ANSWER:
χ²* = 35.556
ANSWER:
ANSWER:
A machine produces 3-inch nails. A sample of ten nails is obtained and their lengths
determined. The results are as follows: 2.89, 2.95, 3.00, 3.05, 2.99, 2.96, 3.10, 3.06, 3.00 and
3.12. These data are used to test H o : σ = 0.03 vs. H a : σ ≠ 0.03.
ANSWER:
s 2 = 0.00504
ANSWER:
χ²* = 50.396
ANSWER:
ANSWER:
327. Give a bound on the p-value for testing H o : σ 2 = 27(≤) vs. H a : σ 2 > 27 , with df = 16
and χ 2 = 28.4.
ANSWER:
328. Give a bound on the p-value for testing H o : σ 2 = 46.1 vs. H a : σ 2 ≠ 46.1 , with df = 20 and
χ 2 = 9.01.
ANSWER:
329. In testing the hypothesis H o : σ 2 = 30.0(≤) vs. H a : σ 2 > 30.0 , a sample of size n = 21
yielded χ 2 = 24.0. Find the sample variance.
ANSWER:
s 2 = 36.0
330. Calculate the p-value for testing the alternative hypothesis H a : σ 2 ≠ 18, when n = 15, and
χ 2 * = 28.2 .
ANSWER:
P = 2 P( χ 2 * > 28.2 | df = 14); Since 0.01 < ½ P < 0.025; then 0.02 < P <0.05
ANSWER:
332. Calculate the p-value for testing the alternative hypothesis H a : σ 2 ≠ 32, when df = 20,
and χ 2 * = 33.1 .
ANSWER:
P = 2 P( χ 2 * > 33.1| df = 20); Since 0.025 < ½ P <0.05; then 0.05 < P < 0.10.
333. Calculate the p-value for testing the alternative hypothesis H a : σ 2 < 13, when df = 30,
and χ 2 * = 17.4 .
ANSWER:
A random sample of 51 observations was selected from a normally distributed population. The
sample mean was x = 88.6 , and the sample variance was s 2 = 38.2. We wish to determine if
there is sufficient reason to conclude that the population standard deviation is not equal to 8 at
the 0.05 level of significance.
ANSWER:
H o : σ = 8 vs. H a : σ ≠ 8
ANSWER:
ANSWER:
Since 0.01 < ½ P < 0.025, then 0.02 < P < 0.05. Since P < α = 0.05, reject H o . There is
sufficient reason to conclude that the population standard deviation is not equal to 8, at
the 0.05 level of significance.
ANSWER:
The test statistic χ 2 * = 29.84 falls in the critical region, therefore we reject H o . We reach
the same conclusion as stated in question 338.
A foreign car manufacturer claims that the miles per gallon for a certain model of their cars are
normally distributed with a mean equal to 41.5 miles with a standard deviation equal to 3.5
miles. The following data are obtained from a random sample of 15 such cars; 39.0, 43.5, 41.0,
43.5, 37.0, 31.0, 38.5, 38.0, 39.0, 43.5, 46.0, 35.0, 33.0, 37.0, and 37.5. We wish to test the
hypothesis that the standard deviation differs from 3.5.
ANSWER:
s 2 = 17.2024
ANSWER:
ANSWER:
ANSWER:
P = p-value = 2 ⋅ P( χ 2 > 19.66 | df = 14). Since 0.10 < ½ P < 0.25, then 0.20 < P < 0.50.
Since P > α = 0.05, fail to reject H o . There is not sufficient reason at the 0.05 level of
significance to contradict the manufacturer’s claim about the standard deviation and
conclude that it is different from 3.5.
ANSWER:
The critical values are χ 2 (14, 0.975) = 5.63 and χ 2 (14, 0.025) = 26.1 .
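The sample variance and test statistic for the 15-car sample can be reproduced as follows. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the two critical values are copied from the answer above rather than computed.

```python
import statistics

# Miles per gallon for the 15 sampled cars, as listed in the problem statement
mpg = [39.0, 43.5, 41.0, 43.5, 37.0, 31.0, 38.5, 38.0,
       39.0, 43.5, 46.0, 35.0, 33.0, 37.0, 37.5]
sigma0 = 3.5                                   # hypothesized standard deviation

n = len(mpg)
s2 = statistics.variance(mpg)                  # sample variance (divides by n - 1)
chi2_star = (n - 1) * s2 / sigma0 ** 2         # chi-square test statistic

# Two-tailed critical values for df = 14, alpha = 0.05, quoted from the answer above
lower_crit, upper_crit = 5.63, 26.1
reject = chi2_star < lower_crit or chi2_star > upper_crit

print(round(s2, 4), round(chi2_star, 2), reject)
```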
343. For a chi-square distribution having 25 degrees of freedom, find the area under the
curve between χ 2 ( 25, 0.94 ) and χ 2 ( 25, 0.18) .
ANSWER:
344. The central 80% of the distribution lies between what values?
ANSWER:
Therefore the central 80% of the distribution lies between 8.55 and 22.3.
345. The central 90% of the distribution lies between what values?
ANSWER:
Therefore the central 90% of the distribution lies between 7.26 and 25.0.
346. The central 95% of the distribution lies between what values?
ANSWER:
Therefore the central 95% of the distribution lies between 6.26 and 27.5.
347. The central 99% of the distribution lies between what values?
ANSWER:
Therefore the central 99% of the distribution lies between 4.60 and 32.8.
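The limits quoted in questions 344 to 347 can be checked numerically. They match table values for 15 degrees of freedom; that df is inferred from the numbers and is not stated in the surviving text. The sketch below is a hypothetical helper, chi2_right_tail, that evaluates the area to the right of a value with the series expansion of the regularized incomplete gamma function, using only the standard library.

```python
import math

def chi2_right_tail(x, df):
    """Area to the right of x under a chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z)
    term = 1.0 / a
    total = term
    for k in range(1, 10000):
        term *= z / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    lower = math.exp(a * math.log(z) - z - math.lgamma(a)) * total
    return 1.0 - lower

# Central 80% for df = 15 (assumed): 0.90 to the right of 8.55, 0.10 to the right of 22.3
left_limit_area = chi2_right_tail(8.55, 15)
right_limit_area = chi2_right_tail(22.3, 15)
print(round(left_limit_area - right_limit_area, 2))
```

The same function gives a direct check of any table-based p-value bound in this section, for example chi2_right_tail(28.2, 14) for question 330.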
348. For a chi-square distribution having 45 degrees of freedom, find the area under the
curve between χ 2 ( 45, 0.98) and χ 2 ( 45, 0.13) .
ANSWER:
Problems often arise that require us to make inferences about variability (the spread of data).
This is accomplished by performing hypotheses testing about the population variance σ 2 or the
population standard deviation σ . This requires us to carefully state the null and alternative
hypotheses based on the information provided to us.
ANSWER:
350. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The standard deviation is no larger than 0.2 oz”.
ANSWER:
351. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The standard deviation is not equal to 15.”
ANSWER:
H o : σ = 15 and H a : σ ≠ 15
352. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The variance is no less than 24.”
353. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The variance is different from the value of 0.01, the value called for in the
specs.”
ANSWER:
354. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The variance has decreased from its previous value of 32.25.”
ANSWER:
355. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The variance is at most 28.”
ANSWER:
356. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The standard deviation is at least 4.25.”
ANSWER:
ANSWER:
358. Find the value of the test statistic for testing H o : σ 2 = 55 vs H a : σ 2 ≠ 55 using the sample
information n = 26 and s 2 =75.
ANSWER:
χ 2∗ = (n − 1) s 2 / σ 2 = (25)(75) / 55 = 34.09
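The same formula can be used in either direction: to get the test statistic from a sample variance, or to recover the sample variance from a reported statistic (as in question 329). A minimal sketch follows; both function names are illustrative.

```python
def chi2_statistic(n, s2, sigma2_0):
    """Chi-square test statistic (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * s2 / sigma2_0

def sample_variance_from_chi2(n, chi2_star, sigma2_0):
    """Solve the test-statistic formula for the sample variance s^2."""
    return chi2_star * sigma2_0 / (n - 1)

print(round(chi2_statistic(n=26, s2=75, sigma2_0=55), 2))
print(sample_variance_from_chi2(n=21, chi2_star=24.0, sigma2_0=30.0))
```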
359. Place bounds on the p-value for testing H a : σ 2 ≠ 24, given that n = 12, and χ 2 * = 20.8
ANSWER:
P = p-value = 2 ⋅ P( χ 2 > 20.8 | df = 11). Since 0.025 < 1/ 2 P < 0.05; then, 0.05 < P < 0.10
360. Place bounds on the p-value for testing H a : σ 2 > 32, given that n = 16, and χ 2 * = 28.6 .
ANSWER:
361. Place bounds on the p-value for testing H a : σ 2 ≠ 40, given that df = 30, and χ 2 * = 48.9
ANSWER:
P = p-value = 2 ⋅ P( χ 2 > 48.9 | df = 30). Since 0.01 < 1/ 2 P < 0.025; then, 0.02 < P < 0.05
ANSWER:
363. Draw an approximate chi-square distribution and determine the critical region and critical
value(s) that would be used to test H o : σ = 0.4 vs. H a : σ > 0.4, given that n =18 and
α = 0.05 , using the classical approach:
ANSWER:
364. Draw an approximate chi-square distribution and determine the critical region and critical
value(s) that would be used to test H o : σ 2 = 10 and H a : σ 2 < 10, with n =15 and α = 0.01 ,
using the classical approach:
ANSWER:
ANSWER:
366. Place bounds on the p-value for testing H a : σ 2 < 44, given that n = 30, and χ 2 * = 18.9
ANSWER:
ANSWER:
368. Draw an approximate chi-square distribution and determine the critical region and critical
value(s) that would be used to test H o : σ = 0.6 and H a : σ < 0.6, with n =12 and α = 0.10 ,
using the classical approach:
ANSWER:
A random sample of 51 observations was selected from a normally distributed population. The
sample mean was x = 88.2, and the sample variance was s 2 =38.5. Suppose you will use this
sample to determine whether there is sufficient reason to conclude that the population standard
deviation is not equal to 8.2 at the 0.05 level of significance.
ANSWER:
ANSWER:
P = p-value = 2 ⋅ P( χ 2 < 28.63 | df = 50). Since 0.005 < 1/ 2 P < 0.01; then, 0.01 < P < 0.02
Since p-value < α = 0.05, reject H o . There is sufficient reason to conclude that the
population standard deviation is not equal to 8.2, at the 0.05 level of significance
ANSWER:
The critical values are χ 2 (50, 0.975) = 32.4, and χ 2 (50, 0.025) = 71.4 as shown
below.
Since the test statistic χ 2∗ = 28.63 < 32.4, it falls in the rejection region and H o is
rejected. We reach the same conclusion as stated in question 373.
The standard deviation of weights of certain 64.0-oz cans of tomato soup filled by a machine
was 0.28 oz. A random sample of 20 cans showed a standard deviation of 0.38 oz. Suppose
you will use this sample to determine whether there is an apparent increase in variability at the
0.10 level of significance. Assume can weight is normally distributed.
ANSWER:
ANSWER:
Since p-value < α = 0.10, reject H o . There is sufficient reason to conclude that the
apparent increase in variability is significant at the 0.10 level of significance
Since the test statistic χ 2∗ = 34.99 > 27.2, it falls in the rejection region and H o is
rejected. We reach the same conclusion as stated in question 377.
General Motors claims that their Malibu 2005 model has mean miles per gallon equal to 38 with
a standard deviation equal to 4.0 mi. A random sample of 15 such cars produced the
following miles per gallon: 36.0, 37.0, 41.5, 44.0, 33.0, 31.0, 35.0, 34.5, 37.0, 41.5, 39.0, 41.5,
34.0, 29.0, and 36.5. Assume normality. Suppose you wish to use this sample to test the
hypothesis that the standard deviation differs from 3.8 at level of significance α = 0.05.
ANSWER:
379. Use a computer to complete the hypothesis test using the p-value approach.
ANSWER:
Since p-value = 0.479 > α = 0.05, we fail to reject H o . There is not sufficient evidence to
conclude that the population standard deviation is significantly different from 3.8 at the
0.05 level of significance. In other words, there is not sufficient reason to contradict the
manufacturer's claim about the standard deviation, at the 0.05 level of significance.
ANSWER:
Since the test statistic χ 2∗ = 17.234 does not fall in the rejection region, we fail to reject
H o at α = 0.05. We reach the same conclusion as stated in question 381.
Chapter 10
INFERENCES INVOLVING
TWO POPULATIONS
Section 10.1
True-False Questions
1. Pretest versus posttest (before versus after) studies are usually independent samples.
ANSWER: F
ANSWER: T
ANSWER: F
ANSWER: T
5. If two samples have the same size, the samples may or may not be independent.
ANSWER: T
ANSWER: F
ANSWER: T
8. If two samples have the same size, the samples must be dependent.
ANSWER: F
Multiple-Choice Questions
A) When comparing two populations, we need two samples, one from each population.
A) Comparing the final test scores of male and female students in your statistics class is
an example of two dependent samples.
B) Pretest versus posttest (before versus after) studies usually use dependent samples.
C) Studies involving identical twins result in dependent samples of data.
D) None of the above
ANSWER: A
11. A political analyst in Michigan surveys a random sample of registered Democrats and compares
the results with those obtained from a random sample of registered Republicans. This would be
an example of:
A) dependent samples.
B) independent samples.
C) independent samples only if the sample sizes are equal.
D) dependent samples only if the sample sizes are equal.
ANSWER: B
13. Describe how one could select two dependent samples from among his/her co-workers
in General Motors to compare their starting salaries after graduation from high school to
their salaries when they continue working at GM and reach the age of 40.
ANSWER:
14. Explain why studies involving identical twins result in dependent samples of data.
ANSWER:
Identical twins are so much alike that the information obtained from one would not be
independent from the information obtained from the other twin.
15. Describe how one could select two independent samples from among his/her co-workers
to compare the salaries of female and male workers.
ANSWER:
Divide the co-workers into two groups, males and females. Randomly select a sample
from each of the two groups.
16. Twenty people were selected to participate in a psychology experiment. They answered
a short multiple-choice quiz about their attitudes on abortion and then viewed a 50-
minute film. The following day the same 20 people were asked to answer a follow-up
questionnaire about their attitudes. At the completion of the experiment, the
experimenter will have two sets of scores. Do these two samples represent dependent
or independent samples? Explain.
ANSWER:
These two samples represent dependent samples. The two sets of data were obtained
from the same set of 20 people, each person providing one piece of data for each
sample.
17. An experiment is designed to study the effect diet has on the uric acid level. Thirty
people are used for the study. Fifteen are randomly selected and given a junk-food diet.
The other fifteen received a high-fiber, low-fat diet. Uric acid levels of the two groups are
ANSWER:
The resulting sets of data represent independent samples. The two samples are from
two separate unrelated sets of fifteen people.
An auto insurance company is concerned that body shop “A” charges more for repair work than
body shop “B” charges. It plans to send 20 cars to each body shop and obtain separate
estimates for the repairs needed for each car.
18. How can the company do this and obtain independent samples? Explain in detail.
ANSWER:
Independent samples will result if the company sends one set of 20 cars to body shop “A”
and another set of 20 cars to body shop “B”. This means the company sends 40 cars and
receives 40 estimates (one estimate for each car).
19. How can the company do this and obtain dependent samples? Explain in detail.
ANSWER:
Dependent samples will result if the company sends the same set of 20 cars to both body
shops “A” and “B”. This means the company sends 20 cars and receives 40 estimates (two
estimates for each car, one from each body shop).
Suppose that 800 students at Michigan State University are taking elementary statistics this
semester. Two samples of size 50 are needed in order to test some pre-course skill against the
same skill after the students complete the course.
ANSWER:
Randomly select 50 students from the 800 students and take a measure of this skill from
each of these 50 both before and after the course. This leads to 100 measurements from
50 students (two from each student).
21. Describe how you would obtain your samples if you were to use independent samples.
ANSWER:
Obtain a measurement of this skill from 50 randomly selected students before the course
begins. Then obtain another sample of 50 randomly selected from those completing the
course. This leads to 100 measurements from 100 students (one from each student).
Section 10.2
True-False Questions
22. In constructing a confidence interval for the mean difference in paired data we see that
as the sample size increases the width of the interval also increases.
ANSWER: F
23. Suppose we were testing the hypothesis H o : µ d = 0(≥), vs. H a : µd < 0 , where
d = x1 − x2 . If we reject H o , then this would indicate that the mean of population 2 is
less than the mean of population 1.
ANSWER: F
24. In dependent sampling, two sets of data are combined into one set using d = x1 − x2 . In
this case, ∑d / n = x̄1 − x̄2 .
25. Consider a right-tail hypothesis test concerning the mean difference between two
dependent samples where d = x1 − x2 . If we were to interchange the two populations,
then the test would change to a left-tail hypothesis test.
ANSWER: T
26. In dependent sampling, the two data values, one from each set, that come from the
same source are called paired data.
ANSWER: T
27. When the means of two unrelated samples are used to compare two populations, we are
dealing with two dependent means.
ANSWER: F
28. The use of paired data often allows for the control of immeasurable or confounding
variables because each pair is subjected to these confounding effects equally.
ANSWER: T
29. The z-distribution is used when two dependent means are to be compared.
ANSWER: F
30. In constructing a confidence interval for the mean difference in paired data, the interval
increases in width when the sample size is increased.
ANSWER: F
31. When paired observations are randomly selected from normal populations, the paired
difference, d = x1 − x2 , will be normally distributed about a mean µ d with a standard
deviation σ d .
ANSWER: T
ANSWER: T
33. The procedures for comparing two population means are based on the relationship
between two sets of sample data, one sample from each population. When dependent
samples are involved, the data are thought of as “paired data”, where the pairs of data
values are compared directly to each other by using the difference in their numerical
values.
ANSWER: T
34. When paired observations are randomly selected from normal populations, the paired
difference, d = x1 − x2 , will be approximately normally distributed about a mean µd with a
standard deviation of σ d . In this situation, the z-test for one mean is applied.
ANSWER: F
35. In a confidence interval for the mean difference in paired data, the interval increases in
width when the sample size is increased.
ANSWER: F
36. When paired observations are randomly selected from normal populations, the paired
difference, d = x1 − x2 , will be approximately normally distributed about a mean µd with a
standard deviation of σ d . In this situation, the t-test for one mean is applied with
df = n − 1, where n is the number of matched pairs of data.
ANSWER: T
37. When the means of two unrelated samples are used to compare two populations, we are
dealing with two dependent means.
ANSWER: F
38. The z-distribution is used when two dependent means are to be compared.
Multiple-Choice Questions
39. When constructing a confidence interval for the mean difference in paired data, which of
the following symbols indicates the middle point of the interval?
A) µ d
B) σ d
C) d
D) sd
ANSWER: C
40. A statistics professor is testing the claim that the use of computers will help students to
better understand elementary statistics concepts. Based on this claim, if
d = X comp. − X no comp. , which of the following would be the correct null and alternative
hypotheses?
A) H o : µd = 0, H a : µd ≠ 0
B) H o : µd = 0(≤), H a : µd > 0
C) H o : µd > 0, H a : µd = 0(≤)
D) H o : µd < 0, H a : µ d = 0(≥)
ANSWER: B
41. A research laboratory interested in the medicinal effect of herbs is testing the claim that
a particular herb will reduce stress-related symptoms in adults. Based on this claim and
assuming d = X after − X before , which of the following would be the correct null and alternative
hypotheses?
A) H o : µd = 0, H a : µd ≠ 0
B) H o : µd > 0, H a : µd = 0(≤)
C) H o : µd = 0(≤), H a : µd > 0
D) H o : µd < 0, H a : µ d = 0(≥)
ANSWER: C
A) H o : µd = 0
B) H o : µd = 0(≥)
C) H o : µd ≠ 0
D) H o : µd = 0(≤)
ANSWER: B
43. When using paired differences to test the mean difference between two dependent
samples, which of the following is the point estimate of µ d ?
A) d
B) µ1 − µ 2
C) x1 − x2
D) ∑ d
ANSWER: A
A) When we test a null hypothesis about the mean difference, µd of two population
means using two dependent samples, the test statistic used will be the difference
between the sample mean d and the hypothesized value of µd , divided by the
estimated standard error.
B) The assumption for inferences about the mean of paired differences µd is that the
paired data are randomly selected from normally distributed populations.
C) The assumption for inferences about the mean of paired differences µd is that the
paired data are randomly selected from t- distributed populations.
D) None of the above
ANSWER: C
Short-Answer Questions
45. What is the assumption for inferences about the mean of paired differences µ d ?
The paired data are randomly selected from normally distributed populations.
ANSWER:
d =x−y
47. In order to compare two scales, 30 objects are weighed on both scales. Each object
would then have two weight values (one from scale 1 and one from scale 2). Based on
the nature of the differences in the two weight measurements for the 30 objects, the two
scales may be compared. Do these samples represent dependent or independent
samples?
ANSWER:
Dependent samples
48. State the null and alternative hypotheses that would be used to test each of the following
claims:
a. The mean weight loss due to a special diet is at least 5 pounds. Assume dependent
sampling was used.
b. The mean adult body temperature is not 98.6°F.
ANSWER:
a. H o : µ d = 5(≥), H a : µd < 5
b. H o : µ = 98.6, H a : µ ≠ 98.6
49. Define the paired difference d, and then state the null hypothesis, H o , and the alternative
hypothesis, H a , that would be used to test the claim “There is an increase in the mean
difference between posttest and pretest scores for an introduction to macroeconomics
course.”
50. Define the paired difference d, and then state the null hypothesis, H o , and the alternative
hypothesis, H a , that would be used to test the claim “As a result of a computer training
session in Microsoft Office 2003, it is believed that the mean of the difference in
performance scores will not be zero.”
ANSWER:
Let d = scores after computer training session - scores before computer training session.
Then, H o : µ d = 0 and H a : µ d ≠ 0 .
51. Define the paired difference d, and then state the null hypothesis, H o , and the alternative
hypothesis, H a , that would be used to test the claim “The mean of the differences
between pre and post self-esteem scores showed improvement after involvement in a
community service project to build a playground for children.”
ANSWER:
52. Define the paired difference d, and then state the null hypothesis, H o , and the alternative
hypothesis, H a , that would be used to test the claim “The mean of the differences
between the posttest and the pretest scores is greater than 10.”
ANSWER:
ANSWER:
Let d = weight before diet plan – weight after diet plan. Then,
H o : µd = 25 (≤) and H a : µd > 25 .
54. Define the paired difference d, and then state the null hypothesis, H o , and the alternative
hypothesis, H a , that would be used to test the claim “The mean difference in the home
reassessments from the two town assessors was no more than $800.”
ANSWER:
Let d = home reassessment from first assessor - home reassessment from second
assessor. Then, H o : µd = 800 (≤) and H a : µd > 800
55. Two different makes of stopwatches were used to time 12 different runners over a
particular course. Using the times in seconds shown in the table below, find a 95%
confidence interval for the mean time difference where d = Type 1 − Type 2.
Runner
Stopwatch 1 2 3 4 5 6 7 8 9 10 11 12
Type 1 59 49 64 60 54 47 49 58 66 76 70 66
Type 2 57 46 63 60 50 48 54 54 60 72 72 66
ANSWER:
(-0.65 to 3.31)
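The interval above can be reproduced from the stopwatch table. This is a minimal sketch using only the standard library; the critical value t(11, 0.025) = 2.20 is taken from a Student's t table and is an assumption about the value used in the original solution.

```python
import math
import statistics

type1 = [59, 49, 64, 60, 54, 47, 49, 58, 66, 76, 70, 66]
type2 = [57, 46, 63, 60, 50, 48, 54, 54, 60, 72, 72, 66]

d = [a - b for a, b in zip(type1, type2)]      # d = Type 1 - Type 2
n = len(d)
d_bar = statistics.mean(d)                     # mean of the paired differences
s_d = statistics.stdev(d)                      # sample standard deviation of d

t_crit = 2.20                                  # t(df = 11, 0.025), table value (assumed)
margin = t_crit * s_d / math.sqrt(n)           # maximum error of estimate E
lower, upper = d_bar - margin, d_bar + margin

print(round(lower, 2), round(upper, 2))
```

Small differences in the last digit can arise from rounding d̄ and E before subtracting, as a hand calculation would.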
The exercise capacity of an individual is measured by the number of minutes the individual can
exercise before certain medical criteria are met. The exercise capacity before and after basic
training were measured for 20 marines. A summary of the data was provided as follows:
∑ d = 65 , ∑ d 2 = 1076 , where d = after capacity − before capacity. Assume that you wish to test
H o : µ d = 0(≤) vs. H a : µ d > 0 .
ANSWER:
ANSWER:
Using the Table of critical values of Student’s t-distribution, we have: 0.01 < p < 0.025.
Using the Table of probability values for Student’s t-distribution, we have: 0.02 < P <
0.025.
Ten men compared two brands of razors. One side of the face was shaved by brand A, and the
other was shaved by brand B. A “smoothness score” (from 1 to 10) was given by each person
for each side. The side on which a given shaver was used was assigned by the flip of a coin and
the smoothness scores are shown below.
Man
Razors 1 2 3 4 5 6 7 8 9 10
Brand A score 7 8 3 5 4 4 9 8 7 4
Brand B score 5 6 3 4 6 5 6 7 3 4
ANSWER:
59. Calculate ∑d = 10, ∑d², (∑d)², d̄, sd.
ANSWER:
60. Test H o : µ d = 0 vs. H a : µd ≠ 0 by giving the critical region, t * , and your conclusion.
(Use α = 0.01).
ANSWER:
Critical region: t < −3.25 or t > 3.25; Value of the test statistic: t * = 1.732; Conclusion:
Unable to reject the null hypothesis.
Two different testing agencies develop their own achievement tests for the same subject. Both
tests are given to the same random sample of 10 students. The results are given below:
Student
Tests 1 2 3 4 5 6 7 8 9 10
Test A 83 79 96 87 93 90 77 73 85 84
Test B 90 88 98 83 97 94 82 80 92 88
Suppose we were to test the claim that there is no difference in the mean score for the two tests
at the 0.01 level of significance.
ANSWER:
d = -7, -9, -2, 4, -4, -4, -5, -7, -7, and -4. Critical region: t < −3.25 or t > 3.25
62. Calculate ∑d, ∑d², (∑d)², d̄, and sd.
ANSWER:
ANSWER:
H o : µd = 0 vs. H a : µd ≠ 0
64. Determine the critical region, the computed value of the test statistic, and the decision
reached.
ANSWER:
Critical region: t < -3.25 or t > 3.25; Value of the test statistic: t * = -3.922; Decision:
reject null.
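The quantities asked for in question 62 and the test statistic in question 64 can be reproduced from the table of Test A and Test B scores. This is a minimal sketch with illustrative variable names; the critical value 3.25 is copied from the answer above (df = 9, two-tailed, α = 0.01) rather than computed.

```python
import math

test_a = [83, 79, 96, 87, 93, 90, 77, 73, 85, 84]
test_b = [90, 88, 98, 83, 97, 94, 82, 80, 92, 88]

d = [a - b for a, b in zip(test_a, test_b)]    # paired differences, Test A - Test B
n = len(d)

sum_d = sum(d)                                 # sum of d
sum_d2 = sum(x * x for x in d)                 # sum of d squared
d_bar = sum_d / n                              # mean difference
# Shortcut formula for the standard deviation of the differences
s_d = math.sqrt((sum_d2 - sum_d ** 2 / n) / (n - 1))

t_star = (d_bar - 0) / (s_d / math.sqrt(n))    # test statistic under H0: mu_d = 0
t_crit = 3.25                                  # table value quoted in the answer
reject = abs(t_star) > t_crit

print(sum_d, sum_d2, sum_d ** 2, d_bar, round(s_d, 3), round(t_star, 3), reject)
```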
65. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following claims: The mean difference between the posttest and pretest scores
is greater than 12.
66. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following claims: The mean weight gain, due to the change in diet for the
laboratory animals, is at least 8 oz.
ANSWER:
67. What is the point estimate for the mean reduction in the diastolic reading after two weeks
on this diet?
ANSWER:
68. Find the 98% confidence interval for the mean reduction in the diastolic reading.
ANSWER:
A sociologist is studying the effects of a certain motion picture film on the attitudes of white men
toward black men. Twelve white men were randomly selected and asked to fill out a
questionnaire before and after viewing the film. The scores received by the 12 men are shown
in the table below. Assume the questionnaire scores are normally distributed.
Before 11 13 19 13 9 8 14 13 18 21 7 12
After 6 9 12 16 4 5 10 14 13 17 7 11
69. Construct a 95% confidence interval for the mean shift in score that takes place when
this film is viewed.
ANSWER:
ANSWER:
Since the 95% confidence interval for µd does not include the hypothesized value 0, we
reject H o at α =0.05 and conclude that there is a difference in the mean score of the
attitude of white men toward black men after viewing this motion picture film.
ANSWER:
72. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following claims: The mean of the difference in performance scores due to
special training session will not be zero.
ANSWER:
The number of sit-ups that a person could do in one minute, both before and after a physical
fitness course was recorded as shown below for ten randomly selected participants. Suppose
you wish to determine whether a significant amount of improvement took place after the
physical fitness course.
Before 30 23 25 29 27 25 30 45 34 26
After 31 27 25 35 34 35 32 52 50 43
ANSWER:
ANSWER:
75. Test the hypotheses in question 75 at the 0.01 level of significance using the p - value
approach.
ANSWER:
The p - value approach: P = P (t > 3.77 | df = 9); Using the table of probability values for
Student’s t -distribution, we get 0.002 < P < 0.003. Since P< α , reject H o and conclude
that there is an improvement after the course.
76. Test the hypotheses in question 75 at the 0.01 level of significance using the
classical approach.
ANSWER:
The critical region is t ≥ 2.82 . Since t * = 3.77 falls in the critical region, we reject H o at
α = 0.01 and conclude that there is an improvement after the course.
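The test statistic t * = 3.77 used in the two answers above can be checked directly from the sit-up data. This is a minimal sketch, assuming d = after − before and taking the critical value 2.82 from the answer rather than computing it.

```python
import math
import statistics

before = [30, 23, 25, 29, 27, 25, 30, 45, 34, 26]
after = [31, 27, 25, 35, 34, 35, 32, 52, 50, 43]

# Improvement per participant; after - before is an assumed convention.
d = [a - b for a, b in zip(after, before)]

n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)

# Paired t statistic for H0: mu_d = 0.
t_star = (d_bar - 0) / (s_d / math.sqrt(n))

t_crit = 2.82  # critical value quoted in the answer (df = 9, alpha = 0.01)
reject = t_star >= t_crit
print(f"d_bar = {d_bar}, s_d = {s_d:.3f}, t* = {t_star:.2f}, reject H0: {reject}")
```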
Ten individuals with high cholesterol levels participated in a nutrition education session. The
participants’ cholesterol levels before and after the session were recorded as shown in the table
below.
Subject
Pre-session 300 284 255 240 260 295 315 265 280 245
Post-session 262 263 248 243 233 233 238 253 253 218
Suppose you wish to test the hypothesis that participation in the nutrition education session
lowers the cholesterol level. Assume normality.
ANSWER:
Let d = pre – post be the difference between each participant’s cholesterol level before and
after the education session, and let µd be the mean of these differences. Then the null and
alternative hypotheses are H o : µd = 0 (≤) vs. H a : µd > 0 (improvement).
ANSWER:
79. Test the hypotheses in question 79 at α = 0.05. Solve using the p-value approach.
ANSWER:
P = p-value = P(t > 3.83 | df = 9); using the table of probability values for Student’s
t-distribution, we have 0.001 < P < 0.003. Since P < α = 0.05, reject H o .
80. Test the hypotheses in question 79 at α = 0.05. Solve using the classical approach.
ANSWER:
The critical region is: t ≥ 1.83 . Since t * = 3.83 falls in the critical region, we reject H o at
the 0.05 level of significance, and conclude that there is sufficient evidence that the
education session does help to lower cholesterol levels.
81. Find the 95% confidence interval for µd given: n =25, d = 5.2, and sd = 3.9.
ANSWER:
d ± E = 5.2 ± 1.61. The lower and upper confidence limits are 3.59 and 6.81,
respectively.
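The arithmetic behind this answer can be sketched as follows. The value t(24, 0.025) = 2.06 is an assumed table value; the other numbers come from the question.

```python
import math

n = 25        # number of paired differences
d_bar = 5.2   # sample mean of the differences
s_d = 3.9     # sample standard deviation of the differences
t_crit = 2.06 # assumed table value t(24, 0.025)

# Maximum error of estimate: E = t * s_d / sqrt(n)
margin = t_crit * s_d / math.sqrt(n)
lower = d_bar - margin
upper = d_bar + margin
print(f"E = {margin:.2f}, interval: {lower:.2f} to {upper:.2f}")
```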
Ten subjects with borderline-high cholesterol levels were recruited for a study. The study
involved taking a nutrition education class. Cholesterol readings were taken before the class
and three months after the class.
Subject
Ed. Class 1 2 3 4 5 6 7 8 9 10
Pre-class 238 293 253 298 258 282 243 263 278 313
Post-class 243 233 248 268 233 269 218 253 253 238
ANSWER:
ANSWER:
d = 26.3
ANSWER:
sd = 24.5
85. Use computer to develop the 95% confidence interval for the mean amount of reduction
in cholesterol readings resulting from taking the nutrition education class.
ANSWER:
ANSWER:
87. Use computer to test the hypotheses in question 88 at the 0.05 level of significance
using the p-value approach.
ANSWER:
88. Test the hypotheses in question 88 at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical value is t(df, α ) = t(9, 0.05) = 1.83, so the critical region is t ≥ 1.83. Since
the value of the test statistic t ∗ = 3.395 falls in the critical region, we reject H o at the
0.05 level of significance. We reach the same conclusion as stated in question 89.
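The summary values quoted above (mean difference 26.3, standard deviation 24.5, t ∗ = 3.395) can be reproduced from the pre-class and post-class readings. This is a sketch in plain Python rather than the statistical package the questions have in mind, and it assumes d = pre − post.

```python
import math
import statistics

pre = [238, 293, 253, 298, 258, 282, 243, 263, 278, 313]
post = [243, 233, 248, 268, 233, 269, 218, 253, 253, 238]

# Reduction in cholesterol per subject (pre - post, assumed convention).
d = [a - b for a, b in zip(pre, post)]

n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)

# Paired t statistic for H0: mu_d = 0 vs. Ha: mu_d > 0.
t_star = d_bar / (s_d / math.sqrt(n))
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.1f}, t* = {t_star:.3f}")
```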
Salt-free diets are often prescribed to people with high blood pressure. The data shown below
were obtained from an experiment designed to estimate the reduction in diastolic blood
pressure as a result of following a salt-free diet for two weeks.
Let d = diastolic blood pressure before diet – diastolic blood pressure after diet. Assume
diastolic readings to be normally distributed.
89. Use computer to provide summary measure for d = before diet – after diet.
ANSWER:
ANSWER:
d = 0.80
ANSWER:
sd = 2.348
92. Use computer to develop the 98% confidence interval for the mean reduction in the
diastolic reading after two weeks on this diet.
ANSWER:
ANSWER:
94. Use computer to test the hypotheses in question 95 at the 0.05 level of significance
using the p-value approach.
ANSWER:
Since p-value = 0.155 > α = 0.05, we fail to reject H o . There is not sufficient evidence to
indicate that a salt-free diet for two weeks is effective in reducing the diastolic blood
pressure.
95. Test the hypotheses in question 95 at the 0.05 level of significance using the classical
approach.
ANSWER:
96. Place bounds on the p-value using the table of “critical values of Student’s t-distribution”
available in your textbook.
ANSWER:
97. Place bounds on the p-value using the table of “probability values for Student’s t-
distribution” available in your textbook.
ANSWER:
98. Place bounds on the p-value using the table of “critical values of Student’s t-distribution”
available in your textbook.
ANSWER:
P = p-value = P(t < -2.27 | df = 24) + P(t > 2.27 | df = 24) = 2 P(t > 2.27 | df = 24)
ANSWER:
P = p-value = P(t < -2.27 | df = 24) + P(t > 2.27 | df = 24) = 2 P(t > 2.27 | df = 24)
100. Place bounds on the p-value for, using the table of “critical values of Student’s t-
distribution” available in your textbook.
ANSWER:
P = p-value = P(t < -2.59 | df = 29) = P(t > 2.59 | df = 29) ⇒ 0.005 < P < 0.01
101. Place bounds on the p-value using the table of “probability values for Student’s t-
distribution” available in your textbook.
ANSWER:
P = p-value = P(t < -2.59 | df = 29) = P(t > 2.59 | df = 29) ⇒ 0.007 < P < 0.009
Consider testing H o : µd = 1.0 ( ≤ ) vs. H a : µd > 1.0, with n =10 and t ∗ =3.63.
ANSWER:
103. Place bounds on the p-value using the table of “probability values for Student’s t-
distribution” available in your textbook.
ANSWER:
104. Determine the test criteria that would be used to test H o : µd = 0 ( ≤ ) vs. H a : µd > 0, with n
=15 and α = 0.05 , using the classical approach.
ANSWER:
df = 14
105. Determine the test criteria that would be used to test H o : µd = 0 vs. H a : µd ≠ 0, with n =25
and α = 0.05 , using the classical approach
df = 24
106. Determine the test criteria that would be used H o : µd = 0 ( ≥ ) vs. H a : µd < 0, with n =12 and
α = 0.10 , using the classical approach
df = 11
107. Determine the test criteria that would be used to test H o : µd = 1.0 ( ≤ ) vs. H a : µd > 1.0, with
n =18 and α = 0.01 , using the classical approach.
ANSWER:
df = 17
Section 10.3
True-False Questions
108. Confidence interval for the difference between the means of two populations using
independent sampling may contain negative values.
ANSWER: F
110. If independent samples are drawn from two large populations, then the sampling
distribution of x1 − x2 will be normally distributed.
ANSWER: F
111. In comparing two independent means when the σ ’s are unknown, we may use the
standard normal distribution.
ANSWER: F
112. When making inferences about the difference between two independent means for the
case when the number of degrees of freedom is estimated, the number of degrees of
freedom for the critical value of t is equal to the smaller of n1 − 1 or n2 − 1 .
ANSWER: T
113. If we are testing for the difference between two independent population means, it is
assumed that the two populations are approximately normal and have equal variances.
ANSWER: T
114. A hypothesized difference between two population means, µ1 − µ2 , must be zero in order
to be able to make inferences about that difference.
ANSWER: F
115. When we test a null hypothesis about the difference between two population means,
using two independent samples, the test statistic used will be the difference between the
ANSWER: T
116. A hypothesized difference between two population means, µ1 − µ 2 , can be any specified
value. The most common value specified is zero; however, the difference can be
nonzero.
ANSWER: T
117. In comparing two independent means when the σ ’s are unknown, we need to use the
standard normal distribution.
ANSWER: F
Multiple-Choice Questions
118. If two independent samples are used in a hypothesis test concerning the difference
between population means for which the combined degrees of freedom is 20, which of
the following could not be true about the sample sizes n1 and n2 ?
A) n1 = 12 and n2 = 8
B) n1 = 12 and n2 = 10
C) n1 = 13 and n2 = 9
D) Cannot be determined from the given information
ANSWER: A
119. If two independent samples are used in a hypothesis test concerning the difference
between population means for which the combined degrees of freedom is 25, which of
the following is true about the sample sizes n1 and n2 ?
A) n1 = 12 and n2 = 13
120. The director of student services for a large urban university is interested in testing the
claim that evening college students have a higher grade point average than that of day
students. Based on this claim, which of the following would be the correct null and
alternative hypotheses?
A) H o : µe = µd (≥), H a : µe < µ d
B) H o : µe > µ d , H a : µe ≤ µ d
C) H o : µe = µ d , H a : µe ≠ µ d
D) H o : µe − µd = 0(≤), H a : µe − µ d > 0
ANSWER: D
121. Which of the following are the null and alternative hypotheses that would be used to test
the following claim using independent sampling: the mean gasoline consumption of
automobile model A is no more than the mean gasoline consumption of automobile
model B?
A) H o : µ A − µ B = 0(≥), H a : µ A − µ B ≠ 0
B) H o : µ A − µ B = 0(≥), H a : µ A − µ B < 0
C) H o : µ A − µ B = 0(≤), H a : µ A − µ B > 0
D) H o : µ A − µ B = 0(≤), H a : µ A − µ B ≠ 0
ANSWER: C
122. Which of the following would be the alternative hypothesis that would be used to test the
claim that the mean IQ of individuals in population A is significantly different from the
mean IQ of individuals in population B, assuming independent sampling?
A) H a : µ A − µ B = 0
B) H a : µ A − µ B > 0
123. Which of the following is not one of the required assumptions stated in your textbook for
inferences about the difference between two population means, µ1 − µ2 , using two
samples?
124. Which of the following statements is false if independent samples of sizes n1 and n2 are
drawn randomly from large populations with means µ1 and µ2 and variances $\sigma_1^2$ and $\sigma_2^2$,
respectively?
B) The sampling distribution of x1 − x2 has standard error $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.
C) The sampling distribution of x1 − x2 will be normally distributed, regardless of the
sample sizes, if both populations have normal distributions.
D) None of the above
ANSWER: D
125. Which of the following is not one of the required assumptions stated in your textbook for
inferences about the difference between two population means, µ1 − µ2 , using two
samples?
126. What is the assumption for inferences about the difference between two means, µ1 − µ 2
?
ANSWER:
The samples are randomly selected from normally distributed populations, and the
samples are selected in an independent manner.
127. A group of sheep, infested with tapeworms, are randomly divided into two groups as
follows. Each sheep is assigned a number (1 through 20) and then 10 numbers are
selected by drawing 10 slips of paper from a box having the numbers 1 through 20
written on them. The drawing divides the sheep into two groups. One group is given a
placebo and the other is given an experimental drug. After six weeks the sheep are
sacrificed and tapeworm counts are made. Do these samples represent dependent or
independent samples?
ANSWER:
Independent samples
128. State the null and alternative hypotheses that would be used to test each of the following
claims:
ANSWER:
a. H o : µ A − µ B = 8(≤), H a : µ A − µ B > 8
b. H o : pM − pF = 0, H a : pM − pF ≠ 0
ANSWER:
H o : µ A − µ B = 0 and H a : µ A − µ B ≠ 0
130. State the null and alternative hypotheses that would be used to test the claim “The mean
of population A is greater than the mean of population B.”
ANSWER:
H o : µ A − µ B = 0 ( ≤ ) and H a : µ A − µ B > 0
131 State the null and alternative hypotheses that would be used to test the claim “The mean
age of workers at General Motors is less than the mean age of workers at Ford.”
132. State the null and alternative hypotheses that would be used to test the claim “There is
no difference in the mean number of hours spent studying per week between male and
female college students.”
ANSWER:
H o : µ M − µ F = 0 and H a : µ M − µ F ≠ 0
133. A survey was conducted to compare the mean cost of a meal at fast food restaurants in
two different cities. With the data below, set a 90% confidence interval on µ1 − µ 2 .
City n x s
A 40 $4.05 $0.55
B 35 $4.85 $0.85
ANSWER:
(-1.08 to -0.52)
134. Suppose two independent samples of equal size are selected from two populations and
both having standard deviation σ = 10. What common sample size is needed so that
x1 − x2 has a standard error equal to 2?
ANSWER:
n = 50
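The answer follows from setting the standard error √(σ²/n + σ²/n) equal to 2 and solving for n. A small sketch of that step (the variable names are illustrative only):

```python
import math

sigma = 10.0     # common population standard deviation
target_se = 2.0  # required standard error of x1 - x2

# SE = sqrt(sigma^2/n + sigma^2/n) = sqrt(2 * sigma^2 / n)
# Solving for n gives n = 2 * sigma^2 / SE^2.
n = 2 * sigma**2 / target_se**2

# Plug n back in as a check.
check_se = math.sqrt(sigma**2 / n + sigma**2 / n)
print(f"n = {n:.0f}, standard error with that n = {check_se}")
```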
ANSWER:
136. An experiment was designed to test the effectiveness of a short course that teaches
diabetic self-care. Fifty diabetics were enrolled in the course, and 50 others served as a
control group. Six months after the course, blood tests were made to determine the
hemoglobin A1C levels. This test measures the blood sugar control over the past few
months. Based on the results, give the p-value for testing the hypothesis
H o : µ1 − µ 2 = 0 vs. H a : µ1 − µ2 < 0 , at α = 0.05.
ANSWER:
The value of the test statistic is: t * = −9.04 , p – value < 0.005
Since p-value < α , we reject the null hypothesis. There is sufficient evidence to
indicate that the short course was effective.
Attitude toward mathematics was measured for two different groups. The attitude scores range
from 0 to 80 with the higher scores indicating a more positive attitude. One group consisted of
Elementary Education majors, and the other group consisted of majors from several other
areas. The data are shown below:
Group (major) n x s
ANSWER:
ANSWER:
p -value = 0.0039
139. Give the critical region, and the conclusion for testing the hypotheses in question 140.
ANSWER:
ANSWER:
(-10.63 to -2.57)
A sample of size 60 is selected from population 1, with x1 = 15.4 and s1 = 1.7. A sample of size
40 is selected from population 2, with x2 = 16.8 and s2 = 2.0. Suppose we were to test the claim
that there is no difference in the population means at the 0.05 level of significance.
ANSWER:
H o : µ1 − µ2 = 0 vs. H a : µ1 − µ2 ≠ 0
142. Determine the critical region, the computed value of the test statistic, the decision
reached, and conclusion.
ANSWER:
Critical region: t ≤ –2.03 or t ≥ 2.03; Value of the test statistic: t * = −3.64; Decision:
Reject H o . Conclusion: There is a difference in the population means at the 0.05 level of
significance
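The test statistic in this answer can be recomputed from the summary statistics. The sketch below takes the critical value 2.03 from the answer as given rather than deriving it.

```python
import math

n1, x1, s1 = 60, 15.4, 1.7  # sample from population 1
n2, x2, s2 = 40, 16.8, 2.0  # sample from population 2

# Estimated standard error of x1 - x2 for independent samples.
se = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Test statistic under H0: mu1 - mu2 = 0.
t_star = ((x1 - x2) - 0) / se

t_crit = 2.03  # two-tailed critical value quoted in the answer
reject = abs(t_star) >= t_crit
print(f"SE = {se:.4f}, t* = {t_star:.2f}, reject H0: {reject}")
```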
An experiment was conducted to compare the mean absorptions of two drugs in specimens of
muscle tissue. Eighty tissue specimens were randomly divided into two equal groups. Each
group was tested with one of the two drugs. The sample results were as follows:
x A = 8.2, xB = 8.8, s A = 0.12 and sB = 0.11 . Assume both populations are normal.
143. Construct the 98% confidence interval for the difference in the mean absorption rates.
ANSWER:
The difference between the mean absorption rates for the two drugs is µB − µA . Normality is
indicated. nA = 40, xA = 8.2, sA = 0.12, nB = 40, xB = 8.8, sB = 0.11. Then xB − xA = 0.6.
Since 1 − α = 0.98, α/2 = 0.01 and t(39, 0.01) ≈ 2.42. [We used the conservative
approach in calculating the degrees of freedom: df = min(df1 = n1 − 1, df2 = n2 − 1) = 39.]
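The answer as given stops after the critical value. The remaining step, the maximum error and the interval limits, might be computed along these lines, using only the figures quoted above:

```python
import math

n_a, x_a, s_a = 40, 8.2, 0.12  # drug A
n_b, x_b, s_b = 40, 8.8, 0.11  # drug B
t_crit = 2.42                  # t(39, 0.01) as quoted in the answer

diff = x_b - x_a
se = math.sqrt(s_a**2 / n_a + s_b**2 / n_b)

# Maximum error of estimate and the resulting limits for mu_B - mu_A.
margin = t_crit * se
lower, upper = diff - margin, diff + margin
print(f"E = {margin:.3f}, 98% CI: {lower:.2f} to {upper:.2f}")
```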
144. Use the confidence interval in question 145 to test the hypothesis that there is a
difference in the mean absorptions of the two drugs at α = 0.02.
ANSWER:
145. The two independent samples shown in the following table were obtained in order to
estimate the difference between the two population means. Construct the 98%
confidence interval.
Sample A 9 10 10 9 9 8 9 11 8 7
Sample B 9 4 6 5 5 7 6 8 6 4
ANSWER:
Sample statistics:
146. State the null and alternative hypotheses that would be used to test the following claims.
There is a difference between the mean ages of students at two different colleges.
ANSWER:
H o : µ1 − µ2 = 0 vs. H a : µ1 − µ2 ≠ 0
147. State the null and alternative hypotheses that would be used to test the following claims.
The mean of population 1 is greater than the mean of population 2.
ANSWER:
148. Determine the p-value for the hypothesis test of the difference between two means with
unknown population variances given H a : µ1 − µ 2 > 0, with n1 = 8, n2 = 12, t* = 1.4
ANSWER:
We will use the conservative approach to determine the degrees of freedom; namely, df
= min( n1 − 1, n2 − 1 ), and use the table of probability values for Student’s t –distribution.
Then, P = P (t > 1.4 | df = 7) = 0.102
ANSWER:
150. Determine the p-value for the hypothesis test of the difference between two means with
unknown population variances given H a : µ1 − µ 2 < 0, with n1 = 18, n2 = 11, t* = −2.9
ANSWER:
We will use the conservative approach to determine the degrees of freedom; namely, df
= min( n1 − 1, n2 − 1 ), and use the table of probability values for Student’s t –distribution.
Then P = P(t < −2.9 | df = 10) = P(t > 2.9 | df = 10) = 0.008
151. Determine the p-value for the hypothesis test of the difference between two means with
unknown population variances given H a : µ1 − µ 2 ≠ 0, with n1 = 30, n2 = 13, t* = 1.6
ANSWER:
We will use the conservative approach to determine the degrees of freedom; namely, df
= min( n1 − 1, n2 − 1 ), and use the table of probability values for Student’s t –distribution.
Then P = 2 P (t > 1.6 | df = 12) = 2(0.068) = 0.136
152. Determine the p-value for the following hypothesis test for the difference between two
means with unknown population variances.
H a : µ1 − µ 2 ≠ 5, with n1 = 26, n2 = 38, t* = −2.1
ANSWER:
Suppose a random sample of 20 homes east of State Street in Big Rapids, Michigan has a
mean selling price of $128,000 and a standard deviation of $4500, and a random sample of 20
homes west of State Street has a mean selling price of $125,000 and a standard deviation of
$2500. Suppose that you wish to test that there is a significant difference between the selling
prices of homes in these two areas of Big Rapids at the 0.05 level.
ANSWER:
The difference between the mean selling prices of homes in two areas of Big Rapids is
µ E − µW . Therefore, H o : µ E − µW = 0 vs. H a : µ E − µW ≠ 0 .
ANSWER:
$t^* = \dfrac{(\bar{x}_E - \bar{x}_W) - (\mu_E - \mu_W)}{\sqrt{(s_E^2/n_E) + (s_W^2/n_W)}}$
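As a check on the value t * = 2.61 used in the next two answers, the sketch below evaluates the difference in sample means divided by √(sE²/nE + sW²/nW), with the hypothesized difference set to 0.

```python
import math

n_e, mean_e, s_e = 20, 128_000, 4_500  # homes east of State Street
n_w, mean_w, s_w = 20, 125_000, 2_500  # homes west of State Street

# Estimated standard error of the difference in sample means.
se = math.sqrt(s_e**2 / n_e + s_w**2 / n_w)

# Test statistic under H0: mu_E - mu_W = 0.
t_star = ((mean_e - mean_w) - 0) / se
print(f"SE = {se:.2f}, t* = {t_star:.2f}")
```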
155. Test the hypotheses in question 155 using the p-value approach.
ANSWER:
P = p-value = 2 P (t > 2.61 | df = 19); using the table of probability values for Student’s
t-distribution, we get 0.007 < ½ P < 0.009; then 0.014 < P < 0.018. Since P < α , reject
H o and conclude that there is sufficient evidence, at the 0.05 level of significance, to
show that the mean home prices are different.
156. Test the hypotheses in question 155 using the classical approach.
ANSWER:
The critical regions are t ≤ −2.09 and t ≥ 2.09 ; t * falls in the critical region, therefore we
reject H o , and conclude that there is sufficient evidence, at the 0.05 level of
significance, to show that the mean home prices are different.
The purchasing department for Meijer supermarket chain is considering two sources from which
to purchase 10-lb bags of potatoes. A random sample taken from each source shows the
following results.
Suppose you wish to determine whether there is a difference between the mean weights of the
10-lb bags of potatoes.
The difference in mean weights of 10-lb bags of potatoes is µb − µ s . Therefore the null
and alternative hypotheses are H o : µb − µ s = 0 vs. H a : µb − µ s ≠ 0 .
159. Test the hypotheses in question 159 at the 0.05 level of significance using the p-value
approach.
ANSWER:
P = 2 P(t > 2.58 | df = 99); using the table of probability values for Student’s
t-distribution, 0.006 < ½ P < 0.008; then 0.012 < P < 0.016. Since P < α = 0.05, reject H o .
There is sufficient evidence to indicate that there is a difference between the mean
weights of the 10-lb bags of potatoes.
160. Test the hypotheses in question 159 at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical regions are: t ≤ −1.99 and t ≥ 1.99 ; t* falls in the critical region, therefore we
reject H o . There is sufficient evidence to indicate that there is a difference between the
mean weights of the 10-lb bags of potatoes.
A professor wishes to determine whether these data show that the college graduates, on the
average, score significantly higher on the test.
ANSWER:
The difference between mean scores college graduates and high school graduates is
µc − µh . Then the hypotheses of interest are H o : µc − µh = 0 vs. H a : µc − µ h > 0 .
ANSWER:
Normality assumed. Since xc = 80.5, xh = 53.4, sc² = 42.25, sh² = 114.49, then
$t^* = \dfrac{(\bar{x}_c - \bar{x}_h) - (\mu_c - \mu_h)}{\sqrt{(s_c^2/n_c) + (s_h^2/n_h)}}$
163. Test the hypotheses in question 163 at α = 0.05 using the p-value approach.
P = p-value = P(t > 21.65 | df = 99); Using the table of probability values for Student’s t-
distribution, P = 0+. Since P < α = .05; reject H o .
164. Test the hypotheses in question 163 at α = 0.05 using the classical approach.
ANSWER:
The critical region is t ≥ 1.66 . The value of the test statistic t * = 21.65 falls in the critical
region, therefore we reject H o . There is sufficient evidence to conclude that the college
graduates did score significantly higher on the test.
Two independent random samples of sizes 16 and 20 were obtained to make inferences about
the difference between two means.
165. If you’re completing the inference with the aid of a computer and its statistical software,
what is the number of degrees of freedom?
ANSWER:
⇒ 15 ≤ df ≤ 34
166. If you’re completing the inference without the aid of a computer and its statistical
software, what is the number of degrees of freedom?
ANSWER:
ANSWER:
ANSWER:
ANSWER:
170. Two independent random samples resulted in the following: Sample A: n A = 25, s A = 8.7 ,
and Sample B: nB = 20, sB = 10.5 . Find the estimate for the standard error for the
difference between two means.
ANSWER:
Define the population parameter of interest as µ non − µdonor ; the difference between the mean
anxiety scores of nondonors and the mean anxiety scores of donors.
ANSWER:
( x1 − x2 ) ± E = (7.80 − 5.45) ± 1.54 = 2.35 ± 1.54 ⇒ LCL = 0.81 and UCL = 3.89.
ANSWER:
173. Do the sample results support the researcher’s belief? Test at the 0.05 level of
significance using the p-value approach.
$t^* = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \dfrac{2.35 - 0}{0.7485} = 3.14$
P = p-value = P(t > 3.14 | df = 24) ⇒ P < 0.005. Since p-value < α = 0.05, we reject H o .
Yes, there is sufficient evidence to support the researcher’s belief that nondonors have
mean anxiety scores higher than the mean anxiety scores of donors.
174. Do the sample results support the researcher’s belief? Test at the 0.05 level of
significance using the classical approach.
The critical value is t ( df , α ) = t(24, 0.05) = 1.71. Since t ∗ = 3.14 falls in the rejection
region, we reject H o . We reach the same conclusion as stated in question 175.
Male 70 66 73 80 79 58 73 83 78 68
69 82 66 83 80 78 52 79 84 77
97 89 66 80 58 61 65 70 75 49
59 69 79 72 77 74
Female 79 74 92 87 81 76 83 89 81 81
82 78 82 86 75 72 61 67 78 80
87 67 72 95 71 77 53 74 76 79
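The questions below call for statistical software. As a stand-in, here is a minimal sketch in plain Python that computes the sample means and standard deviations for the scores listed above, along with the two-sample test statistic for H o : µM − µF = 0. The variable names are illustrative only.

```python
import math
import statistics

male = [70, 66, 73, 80, 79, 58, 73, 83, 78, 68,
        69, 82, 66, 83, 80, 78, 52, 79, 84, 77,
        97, 89, 66, 80, 58, 61, 65, 70, 75, 49,
        59, 69, 79, 72, 77, 74]
female = [79, 74, 92, 87, 81, 76, 83, 89, 81, 81,
          82, 78, 82, 86, 75, 72, 61, 67, 78, 80,
          87, 67, 72, 95, 71, 77, 53, 74, 76, 79]

mean_m, sd_m = statistics.mean(male), statistics.stdev(male)
mean_f, sd_f = statistics.mean(female), statistics.stdev(female)

# Standard error of (mean_m - mean_f) for independent samples,
# without assuming equal variances.
se = math.sqrt(sd_m**2 / len(male) + sd_f**2 / len(female))
t_star = (mean_m - mean_f) / se

print(f"male:   n = {len(male)}, mean = {mean_m:.3f}, sd = {sd_m:.3f}")
print(f"female: n = {len(female)}, mean = {mean_f:.3f}, sd = {sd_f:.3f}")
print(f"t* = {t_star:.2f}")
```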
175. Use computer to find the mean and standard deviation, for each set of data.
ANSWER:
ANSWER:
177. Use computer to construct 95% confidence interval for mean score for all female
students.
ANSWER:
ANSWER:
Yes, the mean scores for males and females could be the same since the two
confidence intervals (69.259 to 76.186) and (74.546 to 81.121) do overlap.
179. Use computer to construct 95% confidence interval for the difference between the mean
scores for male and female students.
ANSWER:
180. Do the results found in question 181 show that the mean scores for males and females
could be the same? Explain.
ANSWER:
No, the results found in question 181 show the mean scores for males and females
could not be the same since “zero” is not included in the interval (-9.794 to -0.428).
ANSWER:
The questions are asking for different information. In questions 178 and 179, two intervals are
constructed that are each centered on separate sample means. In this case, the two sample
means are a distance apart, but their intervals overlap allowing for the possibility of coming from
populations with a common mean. Yet the two sample means are themselves far enough apart
to be significantly different.
182. If you are interested in testing whether there is a difference for male and female
students, state the appropriate null and alternative hypotheses.
ANSWER:
183. Use computer to test the hypothesis in question 184 using the p-value approach at α =
0.05.
184. Test the hypothesis in question 184 using the classical approach at α = 0.05.
ANSWER:
The critical values are ±t (df ,α / 2) = ± t(29, 0.025) = ± 2.05 [df = smaller (35,29) = 29] .
Since the value of the test statistic t ∗ = -2.18 falls in the rejection region, we reject H o .
We reach the same conclusion stated in question 185.
185. Did you reach the same conclusion in questions 182, 184, and 185?
ANSWER:
Yes, we reached the same conclusion of rejecting H o at the 0.05 level of significance.
True-False Questions
186. Confidence interval estimates for the difference between the proportions of two
populations always have values between −1 and 1.
ANSWER: T
ANSWER: T
188. The standard normal score is used for all inferences concerning population proportions.
ANSWER: T
189. A pooled estimate for any statistic in a problem dealing with two populations is a value
arrived at by combining the two separate sample statistics so as to achieve the best
possible point estimate.
ANSWER: T
190. For right-hand tail test of the difference between proportions using two independent
samples at the 5% level of significance, the critical value for the z-test is 1.65, but it is
1.96 for the t-test.
ANSWER: F
191. When we estimate the difference between two proportions, p1 − p2 , we will base our
estimate on the unbiased sample statistic p1′ − p2′ .
192. When we estimate the difference between two proportions, p1 − p2 , we will base our
estimates on the unbiased sample statistic x1 − x2 ; the difference between number of
successes in the two samples.
ANSWER: F
193. When the null hypothesis “there is no difference between two population proportions” is
being tested, the test statistic will be the difference between the two population
proportions, divided by the standard error.
ANSWER: F
Multiple-Choice Questions
194. Which of the following should be used as a point estimate of p1 − p2 when constructing
confidence interval for estimating the difference between the proportions of two
populations?
A) 0
B) ( x1 / n1 ) − ( x2 / n2 )
C) n1 p1′ − n2 p2′
D) x1 − x2
ANSWER: B
195. Which of the following would be the null hypothesis used to test the claim that the
proportion of male students (M) who smoke at a particular college is greater than the
proportion of female students (F) who smoke?
A) H o : pM − pF = 0(≥)
B) H o : pM − pF = 0(≤)
C) H o : pM − pF > 0
D) H o : pM − pF < 0
ANSWER: B
A) H o : pW − pC = 0(≤), H a : pW − pC > 0
B) H o : pW − pC = 0(≥), H a : pW − pC < 0
C) H o : pW − pC = 0, H a : pW − pC ≠ 0
D) H o : pW − pC > 0, H a : pW − pC < 0
ANSWER: C
197. Select the correct hypotheses for testing the claim that the proportion of male voters (M)
that support gun control is at least as large as the proportion of female voters (F) that
support gun control.
A) H o : pM − pF = 0, H a : pM − pF ≠ 0
B) H o : pM − pF = 0(≥), H a : pM − pF < 0
C) H o : pM − pF < 0, H a : pM − pF > 0
D) H o : pM − pF = 0(≤), H a : pM − pF > 0
ANSWER: B
198. The sampling distribution of p1′ − p2′ is approximately normally distributed with a mean
equal to:
A) p1 − p2
B) n1 p1 − n2 p2
C) ( p1q1 / n1 ) + ( p2 q2 / n2 )
D) 0
ANSWER: A
199. Assume that two independent samples of sizes n1 and n2 are drawn randomly from large
populations with p1 = P1 (success) and p2 = P2 (success), respectively, and that p1′ − p2′ is
A) Its mean is $\mu_{p_1' - p_2'} = p_1 - p_2$.
B) Its standard error is $\sigma_{p_1' - p_2'} = \sqrt{(p_1 q_1 / n_1) + (p_2 q_2 / n_2)}$.
C) It has an approximately normal distribution if n1 and n2 are sufficiently large.
D) None of the above
ANSWER: D
Short-Answer Questions
200. When estimating the difference between the proportions of two populations using a
confidence interval estimate, why do we not use a pooled sample proportion?
ANSWER:
201. Only 48 of the 200 people interviewed were able to name the Secretary of State of the
United States. Find the value for x, n, p′, and q′ .
ANSWER:
202. Briefly discuss the practical guidelines to ensure normality, when comparing two
population proportions.
ANSWER:
203. If n1 = 50, p1′ = 0.8, n2 = 40, and p2′ = 0.9 , would this satisfy the guidelines for approximately
normal? Explain.
ANSWER:
n1 p1′ = (50)(0.8) = 40, n1q1′ = (50)(0.2) = 10, n2 p2′ = (40)(0.9)= 36, and n2 q2′ = (40)(0.1) = 4
are not all greater than 5, therefore this situation does not satisfy the guidelines for
approximately normal.
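The check in this answer can be written as a small helper. The function name and the exact form of the guideline (n larger than 20, and both np′ and nq′ larger than 5) are assumptions based on the wording used in these answers, not a quotation from the textbook.

```python
def meets_normal_guidelines(n, p_hat, min_n=20, min_count=5):
    """Return True when n > min_n and both n*p' and n*q' exceed min_count."""
    q_hat = 1 - p_hat
    return n > min_n and n * p_hat > min_count and n * q_hat > min_count

sample_1_ok = meets_normal_guidelines(50, 0.8)  # n1*q1' = 10
sample_2_ok = meets_normal_guidelines(40, 0.9)  # n2*q2' = 4
both_ok = sample_1_ok and sample_2_ok
print(sample_1_ok, sample_2_ok, both_ok)
```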
204. Two different methods for teaching human anatomy were compared. One method is
traditional lecture, and the other method utilizes computer-assisted instruction (CAI).
Ninety out of 130 in the traditional method passed the course, and ninety-eight out of
125 in the CAI method passed the course. Let p1 be the proportion of all students taking
this course by the CAI method who would pass it, and let p2 be a similar proportion for
the traditional method. Find a 90% confidence interval for p1 − p2 .
ANSWER:
(0 to 0.18)
205. In a survey of 150 men and 150 women, 36% of the men and 28% of the women listed
the evening news as their primary source of information concerning world affairs. Set a
99% confidence interval on p1 − p2 , where p1 is the proportion of men and p2 is the
proportion of women who use the evening news as their primary source of information
concerning world affairs.
ANSWER:
(-0.06 to 0.22)
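Both of the preceding intervals use the unpooled standard error √(p1′q1′/n1 + p2′q2′/n2). The sketch below reproduces them, assuming table values z(0.05) = 1.65 and z(0.005) = 2.58. The helper name is hypothetical, and the counts 54 and 42 are 36% and 28% of 150.

```python
import math

def two_proportion_ci(x1, n1, x2, n2, z):
    """Confidence limits for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z * se
    return (p1 - p2) - margin, (p1 - p2) + margin

# CAI (98 of 125) vs. traditional lecture (90 of 130), 90% confidence.
cai_low, cai_high = two_proportion_ci(98, 125, 90, 130, z=1.65)

# Men (54 of 150) vs. women (42 of 150), 99% confidence.
news_low, news_high = two_proportion_ci(54, 150, 42, 150, z=2.58)

print(f"90% CI: {cai_low:.2f} to {cai_high:.2f}")
print(f"99% CI: {news_low:.2f} to {news_high:.2f}")
```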
ANSWER:
A random sample of 500 persons was questioned regarding political affiliation and attitude
toward government-sponsored mandatory testing of AIDS as shown in the table below.
Republicans 95 60 65 220
ANSWER:
H o : P1 − P2 = 0 vs. H a : P1 − P2 ≠ 0
208. Test the hypotheses at α = 0.05, by giving the critical region, test statistic z * , and the
conclusion.
ANSWER:
Critical regions: z ≤ –1.96 or z ≥ 1.96; Value of the test statistic: z * = 0.32; Conclusion:
unable to reject the null hypothesis. That is, there is not sufficient evidence to indicate
209. Two different display types were compared to determine their effect upon sales for a
new product. The results shown below were found regarding the number who looked at
the product and the number who purchased the product. Give the p-value when
H o : p1 = p2 vs. H a : p1 ≠ p2 is tested. What is your conclusion?
Display type   Number who looked   Number who purchased
1              850                 75
2              700                 70
ANSWER:
The value of the test statistic z * = −0.81 , and p –value = 0.418. Since the p-value is relatively
large, we fail to reject the null hypothesis. There is not sufficient evidence of a difference
between the two display types in the proportion of customers who looked at the product and
then purchased it.
A survey of 100 male and 100 female high school seniors showed that 35% of the males and
29% of the females had used marijuana previously. One wishes to determine whether the results of
this survey indicate a difference in proportions for the population of high school seniors.
ANSWER:
H o : P1 − P2 = 0 vs. H a : P1 − P2 ≠ 0
211. Test the hypotheses at α = 0.05, giving the critical region, the test statistic z * , and your
conclusion.
ANSWER:
Critical region: z ≤ −1.96 or z ≥ 1.96; Value of the test statistic: z * = 0.91; Conclusion: do not
reject the null hypothesis. There is not sufficient evidence to indicate a difference in the proportions.
A marketing research analyst, interested in who purchased new computers, compared the
buying average by men and women as shown below.
Male 500 70
ANSWER:
p-value = 0.002
ANSWER:
Since p –value < α , reject the null hypothesis and conclude that the proportions of male
and female customers who purchased new computers are not the same.
214. In a random sample of 50 brown-haired individuals, 28 indicated that they used hair
coloring. In another random sample of 50 blonde individuals, 34 indicated that they used
hair coloring. Use a 95% confidence interval to estimate the difference in the proportion
of these groups that use hair coloring.
ANSWER:
Sample information:
Now,
pbl′ − pbr′ = 0.68 − 0.56 = 0.12, and 1 − α = 0.95, so α / 2 = 0.025 and z(0.025) = 1.96. Then
$E = z(\alpha/2)\sqrt{(p'_{bl} q'_{bl}/n_{bl}) + (p'_{br} q'_{br}/n_{br})} = 1.96\sqrt{(0.68 \cdot 0.32/50) + (0.56 \cdot 0.44/50)}$
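The answer as given ends at the expression for E. A sketch of the last step, evaluating E and the interval limits from the sample counts in the question:

```python
import math

n_bl, x_bl = 50, 34  # blonde individuals using hair coloring
n_br, x_br = 50, 28  # brown-haired individuals using hair coloring
z = 1.96             # z(0.025) for 95% confidence

p_bl, p_br = x_bl / n_bl, x_br / n_br
diff = p_bl - p_br

# Maximum error of estimate with the unpooled standard error.
margin = z * math.sqrt(p_bl * (1 - p_bl) / n_bl + p_br * (1 - p_br) / n_br)
lower, upper = diff - margin, diff + margin
print(f"E = {margin:.3f}, 95% CI: {lower:.2f} to {upper:.2f}")
```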
215. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following claim: There is no difference between the proportions of men and
women who will vote for the incumbent governor in the next election.
ANSWER:
H o : pm − pw = 0 vs. H a : pm − pw ≠ 0
216. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following claim: The percentage of boys who play soccer is greater than the
percentage of girls who play soccer.
ANSWER:
H o : pb − pg = 0 ( ≤ ) vs. H a : pb − pg > 0
ANSWER:
H o : pn − pd = 0 ( ≥ ) vs. H a : pn − pd < 0
ANSWER:
The difference in the proportions of males and females responding “yes” to the survey
question is pm − p f . Therefore the null and alternative hypotheses are:
H o : pm − p f = 0 vs. H a : pm − p f ≠ 0 .
ANSWER:
Since n’s >20, np’s and nq’s all > 5, nm = 200, pm′ = 0.30, n f = 200, p′f = 0.25 , then
p′p = ( xm + x f ) /(nm + n f ) = (60 + 50) / (200 + 200) = 0.275, and q′p = 1 − p′p = 1.0 – 0.275
= 0.725. Hence, the value of the test statistic is
ANSWER:
Since P > α = 0.05, we fail to reject H o . There is not sufficient evidence to indicate that
there is a difference in the proportions of males and females responding “yes” to the above
question.
221. Test the hypotheses in question 220 using the classical approach.
ANSWER:
The critical regions are: z ≤ -1.96 and z ≥ 1.96. Since z ∗ = 1.12 falls in the noncritical
region, we fail to reject H o . We reach the same conclusion as stated in question 222.
A researcher wants to test the hypothesis that smoking rate (proportion of smokers) is higher for
males than females.
ANSWER:
ANSWER:
p′p = ( xm + xw ) / (nm + nw ) = (172 + 136) / (400 + 400) = 0.385, and q′p = 1 − p′p = 1.0 − 0.385
= 0.615; Hence, the value of the test statistic is
z ∗ = [( pm′ − pw′ ) − ( pm − pw )] / √{( p′p )(q′p )[(1/ nm ) + (1/ nw )]} = (0.43 − 0.34) / √[(0.385)(0.615)(1/400 + 1/400)] = 2.62
224. Calculate the p-value. What decision and conclusion would be reached at the 0.05 level
of significance?
ANSWER:
P = P( z > 2.62); Using the table of standard normal distribution, P = (0.5000 – 0.4956) =
0.0044. Since P < α = 0.05; we reject H o . There is sufficient evidence to indicate that
the smoking rate for male diabetics is significantly higher than for female diabetics, at the
0.05 level.
225. Construct a 99% confidence interval estimate of the difference in the proportion of stocks
making a gain.
ANSWER:
Since pn′ − pa′ = 0.36 − 0.30 = 0.06, and 1 − α = 0.99, then α / 2 = 0.005; and z(0.005) =
2.58. Hence,
( pn′ − pa′ ) ± E = 0.06 ± 0.171 . The 99% confidence interval for pn − pa is –0.111 to 0.231
226. Does the answer to question 227 suggest that there is a significant difference between
the proportions of stocks making gains on the two stock exchanges?
ANSWER:
No, there is no significant difference at the 0.01 level because the confidence interval
estimate contains the value 0.
227. Calculate the estimate for the standard error of the difference between two proportions
given that n1 = 50, p1′ = 0.9, n2 = 40, and p2′ = 0.9 .
ANSWER:
228. Calculate the maximum error of estimate for a 95% confidence interval for the difference
between two proportions if n1 = 32, p1′ = 0.32, n2 = 38, and p2′ = 0.38
ANSWER:
= (1.96)(0.114) = 0.223
229. Calculate the maximum error of estimate for a 90% confidence interval for the difference
between two proportions n1 = 33, p1′ = 0.35, n2 = 37, and p2′ = 0.42
ANSWER:
= (1.645)(0.1161) = 0.191
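The standard error and maximum error values in questions 227 through 229 all come from the same formula, so they can be reproduced with one helper. This is a sketch only; std_error_diff is an illustrative name, and the z values 1.96 and 1.645 are the ones used in the answers above.

```python
from math import sqrt


def std_error_diff(p1, n1, p2, n2):
    """Estimated standard error of p1' - p2' for two independent samples."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


se_227 = std_error_diff(0.9, 50, 0.9, 40)            # question 227
e_228 = 1.96 * std_error_diff(0.32, 32, 0.38, 38)    # question 228, 95%
e_229 = 1.645 * std_error_diff(0.35, 33, 0.42, 37)   # question 229, 90%
print(round(se_227, 4), round(e_228, 3), round(e_229, 3))
```

The last two values agree with the answers 0.223 and 0.191; the first gives a standard error of about 0.064 for question 227.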
The proportions of defective parts produced by two machines were compared, and the following
data were collected:
230. Calculate the maximum error of estimate for a 90% confidence interval for the difference
between the proportions of defective parts produced by the two machines.
ANSWER:
n1 = 200, p1′ = 10 / 200 = 0.05, n2 = 200, and p2′ = 6 / 200 = 0.03 . z (α / 2) = z(0.05) = 1.645. Then,
= (1.645)(0.0196) = 0.032
ANSWER:
232. If you wish to test there is no difference in the proportion of defective parts produced by
both machines, state the appropriate null and alternative hypotheses.
ANSWER:
H o : p1 − p2 = 0 and H a : p1 − p2 ≠ 0
233. Can you use the confidence interval in question 233 to test the hypotheses in question
234 at the 0.10 level of significance? Explain in detail.
ANSWER:
Yes, we can use the confidence interval in question 233 to test the hypotheses in
question 234. Since the hypothesized value of zero falls in the 90% confidence interval,
we fail to reject H o at the 0.10 level of significance.
ANSWER:
235. State null hypothesis, H o , and the alternative hypothesis, H a , that would be used to test
the claim “There is no difference between the proportions of male and female students
who will vote for the president of student government at Iowa State University.”
ANSWER:
H o : pM − pF = 0 and H a : pM − pF ≠ 0
236. State null hypothesis, H o , and the alternative hypothesis, H a , that would be used to test
the claim “The percentage of boys who missed statistics classes is greater than the
percentage of girls who missed the same classes.”
ANSWER:
237. State null hypothesis, H o , and the alternative hypothesis, H a , that would be used to test
the claim “The percentage of college students who drive old cars is lower than the
percentage of non-college people of the same age who drive old cars.”
ANSWER:
Let p1 = percentage of college students who drive old cars, and p2 = percentage of non-
college students who drive old cars. Then, H o : p1 − p2 = 0 (≥) and H a : p1 − p2 < 0
238. Determine the p-value that would be used to test H o : p1 = p2 vs. H a : p1 > p2 , if the value
of the test statistic z * = 2.12
ANSWER:
239. Determine the p-value that would be used to test H o : p A = pB vs. H a : p A ≠ pB , if the value
of the test statistic z * = -2.28.
ANSWER:
240. Determine the p-value that would be used to test H o : p1 − p2 = 0 vs. H a : p1 − p2 < 0 , if the
value of the test statistic z * = - 0.75.
ANSWER:
241. Determine the p-value that would be used to test H o : pm − p f =0 vs. H a : pm − p f > 0 , if the
value of the test statistic z * = 3.09.
ANSWER:
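The p-values asked for in questions 238 through 241 depend only on z * and on the direction of the alternative hypothesis. The following sketch computes them with the standard library; the function name p_value and its tail labels are assumptions made for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_value(z_star, tail):
    """p-value for a z statistic; tail is 'left', 'right', or 'two'."""
    if tail == "left":
        return STD_NORMAL.cdf(z_star)
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z_star)
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z_star)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


p_238 = p_value(2.12, "right")   # Ha: p1 > p2
p_239 = p_value(-2.28, "two")    # Ha: pA != pB
p_240 = p_value(-0.75, "left")   # Ha: p1 - p2 < 0
p_241 = p_value(3.09, "right")   # Ha: pm - pf > 0
print(round(p_238, 4), round(p_239, 4), round(p_240, 4), round(p_241, 4))
```

Values read from a printed normal table may differ from these in the last decimal place because the table rounds areas to four places.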
242. Draw an approximate normal curve and determine the critical region and critical value(s)
that would be used to test H o : p1 = p2 vs. H a : p1 > p2 , with α = 0.05 .
ANSWER:
ANSWER:
244. Draw an approximate normal curve and determine the critical region and critical value(s)
that would be used to test H o : p1 − p2 =0 vs. H a : p1 − p2 =0, with α = 0.04 .
ANSWER:
245. Draw an approximate normal curve and determine the critical region and critical value(s)
that would be used to test H o : p1 − p2 =0 vs. H a : p1 − p2 =0, with α = 0.01
ANSWER:
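Critical values of z for a given α can be obtained from the inverse of the standard normal distribution. The sketch below is illustrative only: critical_z is a hypothetical helper, and because the direction of the alternative in questions 244 and 245 is not recoverable from the text, the two-tailed calls are an assumption.

```python
from statistics import NormalDist


def critical_z(alpha, tail):
    """Critical value(s) of z for a 'left', 'right', or 'two'-tailed test."""
    dist = NormalDist()
    if tail == "right":
        return dist.inv_cdf(1 - alpha)
    if tail == "left":
        return dist.inv_cdf(alpha)
    if tail == "two":
        upper = dist.inv_cdf(1 - alpha / 2)
        return -upper, upper
    raise ValueError("tail must be 'left', 'right', or 'two'")


right_05 = critical_z(0.05, "right")   # question 242: Ha: p1 > p2
two_04 = critical_z(0.04, "two")       # assumed two-tailed, alpha = 0.04
two_01 = critical_z(0.01, "two")       # assumed two-tailed, alpha = 0.01
print(round(right_05, 3), round(two_04[1], 2), round(two_01[1], 2))
```

For the right-tailed test in question 242 the critical region is then z ≥ 1.645, shaded in the right tail of the curve.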
Two randomly selected groups of citizens were exposed to different media campaigns that dealt
with the image of a presidential candidate. One week later, the citizen groups were surveyed to
see whether they would vote for the candidate. The results were as follows:
Exposed to Exposed to
Conservative Image Moderate Image
Number in Sample 100 100
Proportion for the Candidate 0.42 0.46
A political analyst believes that there is no difference in the effectiveness of the two image
campaigns.
246. Would this situation satisfy the guidelines for approximately normal? Explain.
ANSWER:
n1 p1′ = (100)(0.42) = 42, n1q1′ = (100)(0.58) = 58, n2 p2′ = (100)(0.46)= 46, and n2 q2′ =
(100)(0.54) = 54 are all greater than 5, therefore this situation would satisfy the
guidelines for approximately normal.
247. Calculate the maximum error of estimate for a 95% confidence interval for the difference
between the two proportions of those who would vote for the presidential candidate.
ANSWER:
248. Construct a 95% confidence interval for the difference between the two proportions of
those who would vote for the presidential candidate.
ANSWER:
249. State the null and alternative hypotheses for this situation.
ANSWER:
H o : p1 − p2 = 0 and H a : p1 − p2 ≠ 0
250. Calculate the value of the test statistic for testing the hypotheses in question 251.
ANSWER:
251. Test the hypotheses in question 251 at the 5% level of significance using the p-value
approach.
ANSWER:
P = p-value = P(z < -0.57) + P(z > 0.57) = 2 P(z > 0.57) = 2 (0.50- 0.2157) = 0.5686.
Since p-value = 0.5686 > α = 0.05, we fail to reject H o . There is not sufficient evidence to
indicate a difference in the effectiveness of the two image campaigns.
252. Test the hypotheses in question 251 at the 5% level of significance using the classical
approach.
ANSWER:
Since the value of the test statistic z ∗ = -0.57 does not fall in the rejection region, we fail
to reject H o . We reach the same conclusion as stated in question 253.
253. Can you use the confidence interval in question 250 to test the hypotheses in question
251? Explain in detail.
ANSWER:
Yes, we can use the confidence interval in question 250 to test the hypotheses in
question 251. Since the hypothesized value of zero falls in the confidence interval, we
fail to reject H o .
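The campaign questions use the same two samples for both a confidence interval and a test, so the quantities can be computed side by side. This is a sketch, not the text's own worked solution: the interval uses the unpooled standard error and the test statistic uses the pooled proportion, following the formulas shown in earlier answers; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, p1 = 100, 0.42   # exposed to conservative image
n2, p2 = 100, 0.46   # exposed to moderate image
norm = NormalDist()

# 95% confidence interval for p1 - p2 (unpooled standard error)
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
max_error = norm.inv_cdf(0.975) * se_unpooled
interval = (p1 - p2 - max_error, p1 - p2 + max_error)

# Test of H0: p1 - p2 = 0 (pooled standard error)
pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z_star = (p1 - p2) / se_pooled
p_val = 2 * (1 - norm.cdf(abs(z_star)))

print(round(max_error, 3), [round(v, 3) for v in interval])
print(round(z_star, 2), round(p_val, 3))
```

The interval contains 0 and the p-value exceeds 0.05, so both routes lead to the same decision of failing to reject H o .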
Section 10.5
True-False Questions
254. The chi-square distribution is used for making inferences about the ratio of the variances
of two populations.
ANSWER: F
ANSWER: F
ANSWER: F
257. Inferences about the ratio of two variances require that the samples are randomly
selected from F-distributed populations, and that the two samples are selected in an
independent manner.
ANSWER: F
258. The critical F-value for samples of size 8 and 10 with 5% of the area in the right-hand tail
is determined by the value F(8, 10, 0.05).
ANSWER: F
259. Inferences about the ratio of variances for two normally distributed populations use the
F-distribution.
ANSWER: T
260. Each F-distribution is identified by two numbers of degrees of freedom, one for each of
the two samples involved.
ANSWER: T
261. The tables of critical values for the F-distribution give only the right-hand critical values.
ANSWER: T
262. The chi-square distribution is used for making inferences about the ratio of the variances
of two populations.
ANSWER: F
Multiple-Choice Questions
264. In comparing the variances of two normally distributed populations using two
independent samples, which of the following statements is false?
265. Which of the following is not needed for calculating the critical values for the F-
distribution?
A) The degrees of freedom associated with the sample whose variance is in the
numerator of the calculated F.
B) The degrees of freedom associated with the sample whose variance is in the
denominator of the calculated F.
C) The values of the two samples variances.
D) The area under the distribution curve to the right of the critical value being sought.
ANSWER: C
266. How many values are needed to identify a single critical value of the F-distribution?
A) 5
B) 4
C) 3
D) 2
ANSWER: C
267. Suppose we were to test the hypotheses, H o : σ1² / σ2² = 1 (≤) vs. H a : σ1² / σ2² > 1 , and then
reject the null hypothesis, what would this suggest about which population is more
variable? Why?
ANSWER:
This would suggest that population 1 is more variable since σ1² / σ2² > 1 is equivalent to
σ1² > σ2² . If the variance of population 1 is greater than that of population 2, population 1
is more variable.
268. In using the F-test to test equality of variances in a two-tailed test, what can we do to
ensure that we will not need a left-tail critical value of F?
ANSWER:
Always use the sample with the larger variance for the “numerator.” This will make F *
larger than 1 and place it in the right tail of the distribution.
269. What assumption must be met about two populations if we use the F test for equality of
variances?
ANSWER:
270. For a two-tailed test with n1 = 10, n2 = 18, and α = 0.05, find the right-tail critical value,
assuming that F ∗ = s1² / s2² .
ANSWER:
271. In a particular F test for the ratio of two variances, the test statistic F ∗ = s1² / s2² = 331. If n1
= 10 and n2 = 12, find bounds for the p-value.
ANSWER:
272. Discuss properties of the F-distribution in regard to possible values of F and symmetry.
ANSWER:
273. To conclude statistically at the 0.05 level of significance that population 1 is more
variable than population 2, s1² / s2² must exceed what value if n1 = 10 and n2 = 5 ?
ANSWER:
274. Testing the hypotheses, H o : σ1² / σ2² = 1 (≥) vs. H a : σ1² / σ2² < 1 , given F(10,15,0.05) and s2²
= 10.1, what is the largest possible value of s1 which would allow us to reject H o ?
ANSWER:
275. Suppose we were to test the hypotheses, H o : σ1² / σ2² = 1 (≥) vs. H a : σ1² / σ2² < 1 , using the
0.05 level of significance. If n1 = 31 and n2 = 16, what is the smallest possible value of
the ratio of s1 / s2 which causes us to reject H o ?
ANSWER:
276. Briefly discuss the assumptions for inferences about the ratio of two variances.
ANSWER:
The samples are randomly selected from normally distributed populations, and the two
samples are selected in an independent manner.
The results were as follows (in bushels of corn per plot).
Brand n x̄ s
A 10 17.5 1.2
B 10 20.2 4.7
Suppose you wish to test for unequal variability in yield at a level of significance equal to 0.05.
ANSWER:
H o : σ1² / σ2² = 1 vs. H a : σ1² / σ2² ≠ 1
278. Give the critical region, computed test statistic, and conclusion.
ANSWER:
Critical region: F ≥ 4.03; Value of the test statistic: F * = 15.3; Conclusion: Reject the
null hypothesis of equal variances.
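The test statistic in question 278 can be reproduced from the two sample standard deviations. The sketch below places the larger sample variance in the numerator, as the two-tailed procedure requires; f_ratio_larger_on_top is a hypothetical helper name, and the critical value 4.03 is copied from the answer above rather than computed.

```python
def f_ratio_larger_on_top(s1, s2):
    """Return F* = larger sample variance / smaller sample variance."""
    v1, v2 = s1 ** 2, s2 ** 2
    return max(v1, v2) / min(v1, v2)


f_star = f_ratio_larger_on_top(1.2, 4.7)   # brands A and B, n = 10 each
F_CRITICAL = 4.03                          # F(9, 9, 0.025) as given in the answer
reject = f_star >= F_CRITICAL
print(round(f_star, 1), reject)
```

Since both samples have n = 10, the degrees of freedom are 9 and 9 regardless of which variance ends up in the numerator.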
279. Find the following critical value for F: F(10, 30, 0.05).
ANSWER:
2.16
280. Find the following critical value for F: F(60, 150, 0.05).
ANSWER:
2.97
281. Find the following critical value for F: F(10, 15, 0.025).
ANSWER:
3.06
282. Find the following critical value for F: F(5, 20, 0.01).
ANSWER:
4.10
A study was designed to compare the self-care knowledge of two different groups of cardiac
patients. A standard test was administered to the two groups. One group was selected from
patients having only a high school education and the other was selected from college graduates
who were cardiac patients. The results were as follows:
ANSWER:
H o : σ1² / σ2² = 1 vs. H a : σ1² / σ2² ≠ 1
284. Give the critical region, computed test statistic, and conclusion.
ANSWER:
Critical region: F ≥ 3.72; Value of the test statistic: F * = 5.62; Conclusion: Reject the null
hypothesis.
A researcher wishes to compare two different groups of students with respect to their mean time
to complete a particular task. The time required is determined for each independent group as
shown in the following summary. Suppose you wish to test the claim of unequal variances at α
= 0.05.
1 10 23.5 2.7
2 8 20.4 5.2
ANSWER:
H o : σ1² / σ2² = 1 vs. H a : σ1² / σ2² ≠ 1
286. Give the critical region, test statistic value, and conclusion for the F-test.
ANSWER:
Critical region: F ≥ 4.20; Value of the test statistic: F * = 3.71; Do not reject the null
hypothesis of equal variances.
287. Twenty individuals with cholesterol readings in the range from 250 to 275 were randomly
divided into two groups of ten each. The two groups were put on two different diets and
after 6 months, the change in cholesterol was determined for each individual. Using the
summarized results shown below , give the critical region, the test statistic, and the
conclusion for testing the null hypothesis of equal variances versus the alternative
hypothesis of unequal variances at a level of significance equal to 0.05.
Diet n Mean change SD
1 10 20.5 5.5
2 10 14.8 6.5
ANSWER:
288. A study was designed to compare the variability of male and female diastolic blood
pressures. The null hypothesis was that the population standard deviations were equal
versus the alternative that they were not equal. State the critical region for α = 0.05, F*,
and the conclusion if the following sample results were observed. Males: n = 25, s = 9.9
and Females: n = 25, s = 8.7
ANSWER:
289. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following claim: The variances of populations A and B are not equal.
ANSWER:
H o : σA² = σB² vs. H a : σA² ≠ σB²
290. State the null hypothesis H o , and the alternative hypothesis, H a , that would be used to
test the following claim: The standard deviation of population 1 is larger than the
standard deviation of population 2.
ANSWER:
H o : σ1 = σ2 (≤) vs. H a : σ1 > σ2
ANSWER:
H o : σA² / σB² = 1 vs. H a : σA² / σB² ≠ 1
292. State the null hypothesis H o , and the alternative hypothesis, H a , that would be used to
test the following claim: The variability within population A is less than the variability
within population B.
ANSWER:
293. If each sample has a size of 3, find the probability that one of the sample variances is at
least 39 times larger than the other one.
ANSWER:
P( s1² ≥ 39 s2² or s2² ≥ 39 s1² ) = P( s1² / s2² ≥ 39) + P( s2² / s1² ≥ 39)
= 2 P(F ≥ 39 | df = 2, 2)
294. If each sample has a size of 6, find the probability that one of the sample variances is no
more than 11 times larger than the other one.
ANSWER:
P( s1² ≥ 11 s2² or s2² ≥ 11 s1² ) = P( s1² / s2² ≥ 11) + P( s2² / s1² ≥ 11)
= 2 P[F ≥ 11 | df = 5, 5]
ANSWER:
The ratio of the standard deviations for scores of younger children and older children is
σy / σo . Therefore the null and alternative hypotheses are given by H o : σy = σo (≤) and
H a : σy > σo .
ANSWER:
Normality assumed, and independence exists. Since n y = 40, s y = 24.5, n o = 40, and
s o = 7.5 , then F ∗ = sy² / so² = (24.5)² / (7.5)² = 10.67 .
297. Test the hypotheses in question 297 at α = 0.01 using the p-value approach.
ANSWER:
P = p-value = P(F > 10.67 | df = 39, 39). Using the F-distribution table, we get P < 0.01.
Since P < α = 0.01, reject H o .
ANSWER:
The critical region is F ≥ 2.11. Since the value of the test statistic F ∗ falls in the critical
region, we reject H o . There is sufficient evidence at the 0.01 level of significance that the
standard deviation of scores for younger children is larger than the standard deviation for
older children.
299. Reorganize the alternative hypothesis shown below so that the critical region will be the
right-hand tail: H a : σ2² < σ1² or σ2² / σ1² < 1 (population 2 is less variable)
ANSWER:
Reverse the direction of the inequality, and reverse the roles of the numerator and
denominator. Therefore, H a : σ1² > σ2² or σ1² / σ2² > 1 (population 1 is more variable), and the
calculated test statistic F * will be s1² / s2² .
300. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The variances of populations A and B are not equal.”
ANSWER:
H o : σA² / σB² = 1 and H a : σA² / σB² ≠ 1
301. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The standard deviation of population 1 is larger than the standard
deviation of population 2”.
ANSWER:
H o : σ1 / σ2 = 1 ( ≤ ) and H a : σ1 / σ2 > 1
ANSWER:
H o : σC² / σD² = 1 and H a : σC² / σD² ≠ 1
303. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the claim “The variability within population X is less than the variability within
population Y.”
ANSWER:
H o : σY² / σX² = 1 ( ≤ ) and H a : σY² / σX² > 1
304. Use the table of critical values of the F-distribution to find F(24, 12, 0.01).
ANSWER:
3.78
305. Use the table of critical values of the F-distribution to find F(30, 40, 0.05).
ANSWER:
1.74
306. Use the table of critical values of the F-distribution to find F(12, 10, 0.025).
ANSWER:
3.62
ANSWER:
2.71
308. Use the table of critical values of the F-distribution to find F(15, 18, 0.05).
ANSWER:
2.27
309. Use the table of critical values of the F-distribution to find F(15, 9, 0.025).
ANSWER:
3.77
310. Use the table of critical values of the F-distribution to find F(40, 30, 0.01).
ANSWER:
2.30
311. Determine the p-value that would be used to test H o : σ 1 = σ 2 vs. H a : σ 1 > σ 2 with
n1 = 8, n2 = 15 and F* = 2.96 .
ANSWER:
ANSWER:
313. Determine the p-value that would be used to test the null hypothesis H o : σ1² / σ2² = 1 vs. the
alternative hypothesis H a : σ1² / σ2² ≠ 1 , with n1 = 31 , n2 = 61 and F* = 1.94 .
ANSWER:
P = p-value = 2 P( F > 1.94 | 30, 60 ) ⇒ 2(0.01) < P <2( 0.025) ⇒ 0.02 < P < 0.05
314. Draw an approximate F-curve and determine the critical region and critical value(s) that
would be used to test H o : σ1² = σ2² vs. H a : σ1² > σ2² , with n1 = 10, n2 = 16, and α = 0.05 .
ANSWER:
315. Draw an approximate F-curve and determine the critical region and critical value(s) that
would be used to test H o : σ1² / σ2² = 1 vs. H a : σ1² / σ2² ≠ 1, with n1 = 25, n2 = 31, and α = 0.05 .
316. Draw an approximate F-curve and determine the critical region and critical value(s) that
would be used to test H o : σ1² / σ2² = 1 vs. H a : σ1² / σ2² > 1, with n1 = 10, n2 = 10, and α = 0.01 .
ANSWER:
317. Draw an approximate F-curve and determine the critical region and critical value(s) that
would be used to test H o : σ1² = σ2² vs. H a : σ1² < σ2² , with n1 = 25, n2 = 16, and α = 0.01 .
ANSWER:
ANSWER:
2.89
319. Two independent samples of sizes 3 and 4, respectively, are drawn from a normally
distributed population. Find the probability that the variance of the first sample is at least
16 times larger than the variance of the second sample.
ANSWER:
P( s1² ≥ 16 s2² ) = P( s1² / s2² ≥ 16) = P(F ≥ 16 | df = 2, 3) = 0.025, since F(2, 3, 0.025) = 16.
321. Test the hypothesis at the 0.05 level of significance using the p-value approach.
ANSWER:
322. Test the hypothesis at the 0.05 level of significance using the classical approach.
ANSWER:
323. Two independent samples, each of size 3, are drawn from a normally distributed
population. Find the probability that one of the sample variances is at least 19 times
larger than the other one.
ANSWER:
P( s1² ≥ 19 s2² or s2² ≥ 19 s1² ) = P( s1² / s2² ≥ 19) + P( s2² / s1² ≥ 19) = 2 P(F ≥ 19 | df = 2, 2)
324. Two independent samples of sizes 3 and 5, respectively, are drawn from a normally
distributed population. Find the probability that the variance of the first sample is at least
18 times larger than the variance of the second sample.
ANSWER:
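When the numerator sample has size 3, the F-distribution has 2 numerator degrees of freedom and its right-tail area has a simple closed form, P(F ≥ x) = (1 + 2x / df2)^(−df2 / 2). The sketch below uses that special case to check the probabilities in questions 319, 323, and 324 without a table; it is not the table-lookup method the answers rely on, and the function name is illustrative.

```python
def f_upper_tail_df1_2(x, df2):
    """P(F >= x) for an F-distribution with 2 numerator degrees of freedom.

    Closed form valid only when the numerator df is 2.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    return (1 + 2 * x / df2) ** (-df2 / 2)


p_319 = f_upper_tail_df1_2(16, 3)       # sizes 3 and 4 -> df = (2, 3)
p_323 = 2 * f_upper_tail_df1_2(19, 2)   # sizes 3 and 3 -> df = (2, 2), either direction
p_324 = f_upper_tail_df1_2(18, 4)       # sizes 3 and 5 -> df = (2, 4)
print(round(p_319, 3), round(p_323, 3), round(p_324, 3))
```

The factor of 2 in the second call reflects that either sample variance may be the larger one, and with equal sample sizes both directions have the same probability.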
325. State the appropriate null and alternative hypotheses for this situation.
ANSWER:
ANSWER:
327. Test the hypothesis in question 327 at α = 0.01 using the p-value approach.
ANSWER:
Since p-value < α = 0.01, we reject H o . There is sufficient evidence to support the
director’s belief that the standard deviation of GRE scores for females is larger than
the standard deviation of GRE scores for males.
328. Test the hypothesis in question 327 at α = 0.05 using the classical approach.
ANSWER:
Since F ∗ = 12.35 falls in the rejection region, we reject H o . We reach the same
conclusion as stated in question 329.
329. Use a computer to calculate summary measures for the two samples.
ANSWER:
H o : σW² / σM² = 1 vs. H a : σW² / σM² ≠ 1
ANSWER:
332. Do the sample data support the experimenter’s claim at the 0.05 level of significance?
Use the classical approach.
ANSWER:
The critical values for this test are: left tail, F(12,15, 0.975) and right tail, F(12, 15,
0.025). However, since we chose the sample with the larger variance for the numerator,
the value of F ∗ is greater than one, and will be in the right-hand tail; therefore, only the
right-hand critical value is needed. Since F(12, 15, 0.025) = 2.96 and F ∗ = 2.256, we fail
to reject H o . There is not sufficient evidence to support the experimenter’s claim that the
variances were unequal.
Chapter 11
APPLICATIONS OF
CHI-SQUARE
ANSWER: F
ANSWER: F
ANSWER: T
ANSWER: F
5. When using the chi-square distribution in a hypothesis test for a multinomial experiment,
the number of degrees of freedom is the number of cells.
ANSWER: F
6. In a multinomial experiment, n = ∑ O = ∑ E, where each sum is taken over all cells.
ANSWER: T
9. In the multinomial experiment we have (r -1) times (c -1) degrees of freedom, where r is
the number of rows, and c is the number of columns.
ANSWER: F
ANSWER: T
11. A multinomial experiment arranges the data into a two-way classification such that the
totals in one direction are predetermined.
ANSWER: F
ANSWER: F
13. The data used in a chi-square multinomial test are always enumerative in nature.
ANSWER: T
14. The shape of the chi-square distribution depends on the size of the sample.
ANSWER: F
ANSWER: F
17. The critical value at the 0.05 level of significance for a chi-square multinomial test where
there are six categories is 11.07.
ANSWER: T
18. If the value of the chi-square test statistic is less than the critical value, the null
hypothesis must be rejected at a predetermined level of significance.
ANSWER: F
19. The chi-square multinomial test can be applied if there are equal or unequal expected
frequencies.
ANSWER: T
20. A multinomial experiment, in general, differs from a binomial experiment in that each trial
has three or four outcomes rather than two outcomes.
ANSWER: F
21. The chi-square distribution will be used to test hypotheses concerning enumerated data.
ANSWER: T
22. The middle 0.95 portion of the chi-square distribution with 9 degrees of freedom has table values
of 3.33 and 16.9 respectively.
ANSWER: F
23. Suppose that we have k cells into which n observations have been sorted, where the
observed frequencies in each cell are denoted by O1 , O2 ,...., Ok and the expected or
theoretical frequencies are denoted by E1 , E2 ,...., Ek . Then ∑ Oi (i = 1 to k), ∑ Ei (i = 1 to k), and n must be
equal.
ANSWER: T
ANSWER: F
25. ∑ ( O − E ) must always equal zero, where the symbols O and E refer to the observed and
expected frequencies, respectively.
ANSWER: T
ANSWER: F
27. A multinomial experiment, where the outcome of each trial can be classified into one of two
categories, is identical to the binomial experiment.
ANSWER: T
28. For a chi-square distributed random variable with 10 degrees of freedom and a level of
significance of 0.025, the chi-square critical value is 20.5. If the computed value of the test
statistic is 17.87, this will lead us to reject the null hypothesis.
ANSWER: F
Multiple-Choice Questions
30. If H o : P(A) = 0.15, P(B) = 0.25, P(C) = 0.35, and P(D) = 0.25 is the null hypothesis in a
hypothesis test for a multinomial experiment, what is the appropriate alternative
hypothesis?
31. In a multinomial experiment with more than five cells and with α ≤0.10, which of the
following could not be a critical value of χ² ?
A) 6.00
B) 10.00
C) 14.00
D) 18.00
ANSWER: A
32. In a chi-square test comparing observed to expected frequencies, we fail to reject the
null hypothesis whenever the observed frequencies are
A) 17.0
B) 18.0
C) 19.0
D) None of these is possible.
ANSWER: D
A) ∑ ( O − E ) must always equal zero, where O and E are the observed and expected
frequencies, respectively.
B) In a multinomial experiment, df =k, where k is the number of cells.
C) Not all multinomial experiments result in equal expected frequencies.
D) None of the above.
ANSWER: B
37. In a chi-square test of multinomial parameters, suppose that a sample showed that the observed
frequency Oi and expected frequency Ei were equal for each cell i. Then, the null hypothesis is
38. In a chi-square test of multinomial parameters, suppose that the value of the test statistic is 13.08
and the number of degrees of freedom is 6. At the 5% significance level, the null hypothesis is
39. Of the values for a chi-square test statistic listed below, which one is likely to lead to rejecting the
null hypothesis in a goodness-of-fit test?
A) 0.78
B) 2.02
C) 1.94
D) 45.1
ANSWER: D
40. If we use the χ² test of multinomial parameters to test for the differences among 5 proportions,
the degrees of freedom are equal to:
A) 2.
B) 3.
C) 4.
D) 5.
ANSWER: C
Short-Answer Questions
ANSWER:
A categorical variable categorizes each individual into exactly one of several cells or
classes; the cells are all-inclusive and mutually exclusive.
ANSWER:
36.7
43. One guideline to ensure a good approximation to the χ² distribution is that Ei ≥ 5. If this
is not possible, what would be a possible solution?
ANSWER:
ANSWER:
5.23
45. Complete the following statement: multinomial experiments will always use a
___________ critical region.
ANSWER:
positive
ANSWER:
19.7
ANSWER:
The sample information is obtained using a random sample drawn from a population in
which each individual is classified according to the categorical variable(s) involved in the
test.
ANSWER:
49. Classes at a large university that meet on Monday, Wednesday, and Friday were
sampled for student absence. Using the following results, state the null and alternative
hypotheses to test the claim that absences occur on the three days with equal frequency
ANSWER:
H o : p1 = 1/ 3, p2 = 1/ 3, p3 = 1/ 3 ;
1 400
2 500
3 450
4 500
5 150
ANSWER:
51. The following table gives theoretical distribution over four categories and the actual
observed distribution. Why would you be reluctant to apply the chi-square analysis to
determine the goodness of fit in this sample?
1 0.05 5
2 0.45 15
3 0.39 15
4 0.11 5
ANSWER:
The chi-square analysis should not be applied to determine the goodness of fit in this
sample because the expected frequencies are less than 5 in two of the
categories.
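The expected frequencies behind the answer to question 51 follow directly from the table. This sketch assumes the three columns are category, theoretical probability, and observed frequency, since the column headers are not shown; the variable names are illustrative.

```python
probabilities = [0.05, 0.45, 0.39, 0.11]   # theoretical distribution
observed = [5, 15, 15, 5]                  # observed frequencies

n = sum(observed)
expected = [n * p for p in probabilities]

# Categories (numbered from 1) that violate the guideline E >= 5
too_small = [i + 1 for i, e in enumerate(expected) if e < 5]
print(n, [round(e, 1) for e in expected], too_small)
```

With n = 40 the expected counts for categories 1 and 4 fall below 5, which is why the chi-square approximation would be questionable here.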
Mars, Inc., the manufacturer of M&M candies, claims that the distribution of the different colors
of candies in a bag of M&Ms (brown, red, yellow, green, orange, and blue) will appear in the
ratio 3:2:2:1:1:1. In testing this claim, Mars, Inc. obtained frequencies of 38, 15, 33, 4, 6, and 4,
respectively.
52. State the null and alternative hypotheses to test the claim to support this ratio.
ANSWER:
53. Find the computed value of χ² . If α = 0.05, what decision would be made?
ANSWER:
Color Brown Red Yellow Green Orange Blue
Expected % 30 20 20 10 10 10
Observed % 38 15 33 4 6 4
χ² * = 20.633, and the critical value is χ² (5, 0.05) = 11.07. Reject H o at α = 0.05, and
conclude that Mars’ claim is not correct.
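The value of the test statistic in question 53 can be recomputed from the claimed ratio and the observed frequencies. This is a minimal sketch; chi_square_statistic is an illustrative helper name and not part of the text.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


ratio = [3, 2, 2, 1, 1, 1]          # brown, red, yellow, green, orange, blue
observed = [38, 15, 33, 4, 6, 4]

n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]
chi2_star = chi_square_statistic(observed, expected)
df = len(observed) - 1
print(expected, round(chi2_star, 3), df)
```

The result matches the stated value of 20.633 with 5 degrees of freedom, well beyond the critical value quoted in the answer.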
A research report gives the following seasonal distribution of colds. A researcher randomly
selects 200 cases from a large clinic that have been diagnosed as a cold and observed the
results shown in the table below. The researcher wishes to test that the clinic has the reported
seasonal distribution at α = 0.05.
Winter 20 30
Spring 35 80
Summer 10 25
Fall 35 65
ANSWER:
ANSWER:
ANSWER:
ANSWER:
Since p-value > α , fail to reject H o that the clinic has the reported seasonal distribution.
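The p-value comparison for the seasonal distribution of colds can be sketched as follows. Two assumptions are made: the table is read as season, reported percentage, and observed count, because its headers are not shown; and the right-tail area is computed with a standard recurrence for the chi-square distribution with integer degrees of freedom rather than read from a table. The helper name chi2_upper_tail is hypothetical.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """P(chi-square with integer df >= x), built up from df = 1 or df = 2."""
    if x <= 0:
        return 1.0
    h = x / 2
    if df % 2 == 0:
        a, total = 1.0, exp(-h)          # exact tail for df = 2
    else:
        a, total = 0.5, erfc(sqrt(h))    # exact tail for df = 1
    while a < df / 2:                    # each step adds 2 degrees of freedom
        total += h ** a * exp(-h) / gamma(a + 1)
        a += 1
    return total


reported = [0.20, 0.35, 0.10, 0.35]      # winter, spring, summer, fall
observed = [30, 80, 25, 65]

n = sum(observed)
expected = [n * p for p in reported]
chi2_star = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_val = chi2_upper_tail(chi2_star, len(observed) - 1)
print(round(chi2_star, 3), round(p_val, 3), p_val > 0.05)
```

Under this reading of the table the p-value exceeds 0.05, consistent with the decision to fail to reject H o .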
Using a deck containing 52 cards and 4 suits, a gambler draws one card and notes whether a
club, diamond, heart, or spade is drawn. The card is replaced and another one is drawn. This
experiment is performed 100 times, and the results are shown in the table below. The gambler
wishes to determine if the results indicate an equal number of clubs, diamonds, hearts, and
spades in the deck.
Club 25
Diamond 15
Heart 30
Spade 30
ANSWER:
59. Determine the critical region at α = .05 and calculate the value of the test statistic.
ANSWER:
ANSWER:
χ² * = 6.00. Fail to reject H o since χ² * < χ² (3, 0.05). The data are consistent with an equal number of
clubs, diamonds, hearts, and spades in the deck.
If a fair coin is tossed three times, the number of heads to occur has a binomial distribution with
the probability distribution given in the table below. A coin is tossed three times, with the
experiment repeated 100 times. The observed frequencies are shown in the table. One wishes
to determine if the coin is fair at α = 0.05.
0 0.125 20
1 0.375 25
2 0.375 35
3 0.125 20
ANSWER:
62. Determine the critical region and calculate the value of the test statistic.
ANSWER:
ANSWER:
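For the coin-tossing experiment, the expected frequencies come from the binomial distribution with n = 3 and p = 0.5, and the test statistic follows from them. The sketch below is illustrative only; the critical value 7.81 is the standard table value of χ²(3, 0.05) and is an input supplied here, not a value given in this passage.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


observed = [20, 25, 35, 20]                  # 0, 1, 2, 3 heads
repetitions = sum(observed)
expected = [repetitions * binomial_pmf(k, 3, 0.5) for k in range(4)]

chi2_star = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
CHI2_CRITICAL = 7.81                         # chi-square(3, 0.05), table value
reject = chi2_star >= CHI2_CRITICAL
print(expected, round(chi2_star, 2), reject)
```

With these frequencies the statistic falls in the critical region, so the hypothesis of a fair coin would be rejected at α = 0.05.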
64. Suppose we have a multinomial experiment with the cells shown below. What observed
frequencies a, b, c, d, and e would result in χ² = 0, if we were testing the hypothesis that
I, II, III, IV, and V occur in the ratio 10:7:5:4:2 with a random sample of size 840?
I II III IV V
a b c d e
ANSWER:
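The statistic χ² equals 0 only when every observed frequency equals its expected frequency, so question 64 reduces to splitting 840 in the ratio 10:7:5:4:2. A short sketch, with expected_from_ratio as an illustrative helper name:

```python
def expected_from_ratio(ratio, n):
    """Split a sample of size n across cells in the given ratio."""
    total = sum(ratio)
    return [n * r / total for r in ratio]


expected = expected_from_ratio([10, 7, 5, 4, 2], 840)   # cells I through V
observed = expected                                      # required for chi-square = 0
chi2_star = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected, chi2_star)
```

Each ratio unit is worth 840 / 28 = 30 observations, giving a = 300, b = 210, c = 150, d = 120, and e = 60.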
An instructor claims that final grades in his course occur in the ratio 1:3:5:2:1 for the grades of
A, B, C, D, and F. A random sample of 240 of the students showed that 15 received a grade of
A, 55 received a grade of B, 90 received a grade of C, 50 received a grade of D, and 30
received a grade of F. Find the computed value of χ² . If α = 0.05, what decision would be
made?
ANSWER:
66. Calculate the value of the test statistic and determine the critical region at α = 0.05.
ANSWER:
ANSWER:
Reject H o since χ² * > χ² (4, 0.05). There is sufficient evidence to conclude that the ratio of
final grades is not as claimed by the instructor.
68. In a multinomial experiment with three cells we are testing the claim that p1 = p2 = p3
using α = 0.05. If the observed frequencies in the first two cells are 20 and 16, what are
the possible observed frequencies in the third cell which would cause us to fail to reject
the claim?
ANSWER:
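One way to explore this question is to try each candidate count for the third cell and keep those whose statistic stays below the critical value. The sketch below assumes the critical value χ²(2, 0.05) = 5.99 used elsewhere in this chapter and an arbitrary search range of 0 to 100; it is an illustration, not the key's worked answer.

```python
CRITICAL_VALUE = 5.99  # chi-square(2, 0.05); df = 3 cells - 1 = 2

def chi_square_equal_cells(observed):
    """Goodness-of-fit statistic when all cells are claimed equally likely."""
    n = sum(observed)
    expected = n / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# First two cells are fixed at 20 and 16; scan the third cell.
fail_to_reject = [
    x for x in range(0, 101)
    if chi_square_equal_cells([20, 16, x]) < CRITICAL_VALUE
]
print(fail_to_reject[0], fail_to_reject[-1])  # 8 31
```

Under these assumptions, third-cell frequencies from 8 through 31 keep the statistic outside the critical region.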
69. At a large university five different professors teach the same course. Random samples
of 50 students taking the course from each of the instructors were selected. The number
of students earning satisfactory grades in the course (A, B, or C) and the number
earning unsatisfactory grades in the course were determined. The number of satisfactory
grades from each of the instructors were 35, 42, 30, 40, and 39. Does the sample
evidence support the claim that satisfactory grades are given in the same proportion by
all five instructors? Use α = 0.05. Find the computed value of χ² and state the decision.
Ho: Satisfactory grades are given in the same proportion by all five instructors.
Ha: Satisfactory grades are not given in the same proportion by all five instructors.
χ²* = 9.53; critical region: χ² ≥ χ²(4, 0.05) = 9.49. We barely reject the null hypothesis at α
= 0.05. We conclude that the sample evidence does not support the claim that
satisfactory grades are given in the same proportion by all five instructors.
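The computation behind χ²* = 9.53 treats the data as a 2 × 5 contingency table (satisfactory and unsatisfactory counts for each instructor). A minimal sketch, with chi_square_contingency as an illustrative helper name:

```python
def chi_square_contingency(table):
    """Return (statistic, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected = row total * column total / n
            statistic += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

satisfactory = [35, 42, 30, 40, 39]
unsatisfactory = [50 - s for s in satisfactory]      # 50 students sampled per instructor
statistic, df = chi_square_contingency([satisfactory, unsatisfactory])
print(round(statistic, 2), df)  # 9.53 4
```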
70. State the null hypothesis, Ho, and the alternative hypothesis, Ha, that would be used to
test the following statement: The five numbers: 10, 11, 12, 13, and 14, are equally likely.
ANSWER:
71. State the null hypothesis, Ho, and the alternative hypothesis, Ha, that would be used to
test the following statement: The multiple-choice question with choices A, B, C, D, and E
has a history of students selecting answers in the ratio of 2:3:2:1:2.
ANSWER:
Ho: P(A) = 0.20, P(B) = 0.30, P(C) = 0.20, P(D) = 0.10, P(E) = 0.20
72. State the null hypothesis, Ho, and the alternative hypothesis, Ha, that would be used to
test the following statement: The poll will show a distribution of 17%, 37%, 40%, and 6%
for the possible ratings of excellent (E), good (G), fair (F), and poor (P) on a specific
issue.
ANSWER:
polish A B C D E Total
Frequency 30 17 14 21 18 100
73. State the null hypothesis for “no preference” in statistical terminology.
ANSWER:
74. What test statistic will be used in testing the null hypothesis in question 73?
ANSWER:
χ² test statistic
ANSWER:
The observed and expected frequencies are shown in the table below:
polish A B C D E Total
Observed 30 17 14 21 18 100
Expected 20 20 20 20 20 100
χ²* = ∑ [(O − E)² / E] = 7.50, summed over all cells.
76. Complete the hypothesis test at the 0.10 level of significance using the p-value
approach.
ANSWER:
P = p-value = P(χ² > 7.50 | df = 4). Using the table of the χ² distribution: 0.10 < P < 0.25.
Since P > α = 0.10, we fail to reject Ho and conclude that the preferences for polish are not
significantly different from equal proportions.
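The table only brackets the p-value. For an even number of degrees of freedom the right-tail area has a closed form, so the bracket can be checked directly. This is a sketch under that assumption; chi_square_sf_even_df is a hypothetical helper name, not a library function.

```python
import math

def chi_square_sf_even_df(x, df):
    """P(chi-square > x) for a positive even df:
    exp(-x/2) * sum_{i < df/2} (x/2)**i / i!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

p_value = chi_square_sf_even_df(7.50, 4)
print(round(p_value, 4))  # 0.1117
```

The result lies between 0.10 and 0.25, in agreement with the table bounds, and exceeds α = 0.10.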
77. Complete the hypothesis test at the 0.10 level of significance using the classical
approach.
ANSWER:
The critical region is χ² ≥ χ²(4, 0.10) = 7.78. Since the test statistic χ²* falls in the
noncritical region, we fail to reject Ho at α = 0.10, and conclude that the preferences for
polish are not significantly different from equal proportions.
ANSWER:
ANSWER:
The expected values are calculated according to the formula E = np. The observed and
expected frequencies are shown in the table below:
Quality A B C D Total
Observed 18 65 77 40 200
Expected 30 60 70 40 200
χ²* = ∑ [(O − E)² / E] = 5.917
80. Does this sample contradict the expected proportions at α = 0.05? Solve using the
p-value approach.
ANSWER:
81. Does this sample contradict the expected proportions at α = 0.05? Solve using the
classical approach.
ANSWER:
The critical region is χ² ≥ χ²(3, 0.05) = 7.82. Since the test statistic χ²* falls in the
noncritical region, we fail to reject Ho at α = 0.05, and conclude that the proportions of
meat qualities bought at Carters are not significantly different from the claimed
proportions.
It is believed that about 40% of Americans own guns for hunting, 30% for protection, 18% for
both hunting and protection, and 12% for other reasons. A survey in Detroit of 1000 individuals
gave the following results.
Hunting 370
Protection 300
Other 150
Suppose you are interested in testing the hypothesis that the distribution of reasons for owning a
gun is the same in Detroit as it is nationally.
Ho: The proportions of reasons for owning a handgun are 0.40, 0.30, 0.18, 0.12.
ANSWER:
The observed and expected frequencies are shown in the table below:
χ²* = ∑ [(O − E)² / E] = 9.75, summed over all cells.
84. Complete the hypothesis test at α = 0.05 using the p-value approach.
ANSWER:
P = p-value = P(χ² > 9.75 | df = 3). Using the table of the χ² distribution: 0.01 < P < 0.025.
Since P < α = 0.05, reject Ho, and conclude that the proportions for reasons for owning
a handgun in Detroit are significantly different from those nationally at the 0.05 level of
significance.
85. Complete the hypothesis test at α = 0.05 using the classical approach.
The critical region is χ² ≥ χ²(3, 0.05) = 7.82. Since the test statistic χ²* falls in the critical
region, we reject Ho and conclude that the proportions for reasons for owning
a handgun in Detroit are significantly different from those nationally at the 0.05 level of
significance.
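The statistic and p-value for the gun-ownership data can be reproduced as follows. This is a sketch resting on one loud assumption: the table above lists only three of the four categories, so the count for "both hunting and protection" is inferred here as the remainder of the 1000 respondents. The helper name chi_square_sf_df3 is illustrative.

```python
import math

claimed = {"Hunting": 0.40, "Protection": 0.30, "Both": 0.18, "Other": 0.12}
observed = {"Hunting": 370, "Protection": 300, "Other": 150}
n = 1000
# ASSUMPTION: the "Both" count is missing from the table; infer it from the total.
observed["Both"] = n - sum(observed.values())

statistic = sum(
    (observed[k] - n * p) ** 2 / (n * p) for k, p in claimed.items()
)

def chi_square_sf_df3(x):
    """P(chi-square > x) with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi_square_sf_df3(statistic)
print(round(statistic, 2))  # 9.75
print(p_value)
```

Under that assumption the statistic matches the 9.75 reported above, and the computed p-value falls between 0.01 and 0.025.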
A sample of 500 individuals is tested for blood type (A, B, O, or AB), and the results are
used to test the hypothesized distribution of blood types of 41% A, 9% B, 46% O, and 4% AB.
The observed results were as follows:
Blood Type A B O AB
A doctor wishes to determine if there is sufficient evidence to show that the stated distribution is
incorrect.
ANSWER:
ANSWER:
A B O AB Total
χ²* = ∑ [(O − E)² / E] = 3.882, summed over all cells.
88. Complete the hypothesis test at the 0.05 level of significance using the p-value
approach.
ANSWER:
P = p-value = P(χ² > 3.882 | df = 3). Using the table of the χ² distribution: 0.25 < P < 0.50.
Since P > α = 0.05, fail to reject Ho, and conclude that we do not have sufficient
evidence to show that the hypothesized distribution of blood types is incorrect.
89. Complete the hypothesis test at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical region is χ² ≥ 7.82. Since the test statistic χ²* falls in the noncritical region,
we fail to reject Ho at α = 0.05, and conclude that we do not have sufficient evidence to
show that the hypothesized distribution of blood types is incorrect.
Frequency 18 20 28 23 11
ANSWER:
ANSWER:
ANSWER:
χ²* = 7.90
93. Do the data provide enough evidence to support the professor’s claim?
ANSWER:
Since χ²* = 7.90 < 9.49, we fail to reject Ho. The data provide enough evidence to support the
professor’s claim.
The mathematics department at a certain college in Texas claims that the grades in its
introductory algebra course are distributed as follows: 10% A’s, 20% B’s, 40% C’s, 20% D’s,
ANSWER:
Ho: The distribution of grades is 10% A’s, 20% B’s, 40% C’s, 20% D’s, 10% F’s.
ANSWER:
The observed and expected frequencies are shown in the table below, where E = np.
A B C D F Total
96. Does this sample contradict the department’s claim at the 0.05 level? Solve using the p-
value approach.
ANSWER:
P = p-value = P(χ² > 11.563 | df = 4). Using the table of the χ² distribution: 0.01 < P < 0.025.
Since P < α = 0.05, reject Ho. There is sufficient evidence at the 0.05 level of
significance to show that the grade distribution is different than claimed. In other words,
this sample contradicts the department’s claim.
ANSWER:
The critical region is χ² ≥ 9.49. Since the test statistic χ²* falls in the critical region, we
reject Ho at α = 0.05. There is sufficient evidence at the 0.05 level of significance to
show that the grade distribution is different than claimed. In other words, this sample
contradicts the department’s claim.
ANSWER:
37.6
ANSWER:
21.6
ANSWER:
40.3
ANSWER:
18.3
ANSWER:
26.2
ANSWER:
32.4
105. State the null hypothesis, Ho, and the alternative hypothesis, Ha, that would be used to
test the following statement: “The numbers, 1, 2, 3, and 4, are equally likely to be
drawn.”
ANSWER:
ANSWER:
ANSWER:
108. State the null hypothesis, Ho, and the alternative hypothesis, Ha, that would be used to
test the following statement: “The poll will show a distribution of 7%, 15%, 38%, and 40%
for the possible ratings of excellent (E), good (G), fair (F), and poor (P) on US foreign
policy in the Middle East during the George W. Bush administration.”
ANSWER:
109. Place bounds on the p-value for testing the null hypothesis Ho: P(1) = P(2) = P(3) = P(4)
= P(5) = 0.20, given that the value of the test statistic χ²* = 12.89.
ANSWER:
ANSWER:
The critical value = χ²(5, 0.05) = 11.1 and the critical region is the right-hand tail area
that is greater than 11.1. The null hypothesis Ho is rejected if the value of the test
statistic χ²* > 11.1.
111. Determine the critical value and critical region that would be used in the classical
approach of a multinomial experiment to test the null hypothesis Ho: P(A) = 0.28, P(B) =
0.37, P(C) = 0.35, with α = 0.01.
ANSWER:
The critical value = χ²(2, 0.01) = 9.21 and the critical region is the right-hand tail area
that is greater than 9.21. The null hypothesis Ho is rejected if the value of the test
statistic χ²* > 9.21.
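With three categories there are k − 1 = 2 degrees of freedom, and for df = 2 the right-tail area is exactly exp(−x/2), so the critical value can be written down without a table. A minimal sketch; the function name is illustrative and the shortcut applies only to df = 2.

```python
import math

def chi_square_critical_df2(alpha):
    # For df = 2, P(chi-square > x) = exp(-x / 2), so x = -2 ln(alpha).
    return -2.0 * math.log(alpha)

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(chi_square_critical_df2(alpha), 2))
# 0.1 4.61
# 0.05 5.99
# 0.01 9.21
```

These agree with the tabled values χ²(2, 0.10) = 4.61, χ²(2, 0.05) = 5.99, and χ²(2, 0.01) = 9.21 used in this chapter.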
112. In 2004, Brand A microwaves had 45% of the market, Brand B had 35%, and Brand C had 20%.
This year the makers of brand C launched a heavy advertising campaign. A random sample of
appliance stores shows that of 10,000 microwaves sold, 4350 were Brand A, 3450 were Brand B,
and 2200 were Brand C. Has the market changed? Test at α = 0.01.
ANSWER:
Ho: p1 = 0.45, p2 = 0.35, p3 = 0.20
Ha: At least two proportions differ from their specified values.
The critical value is χ²(2, 0.01) = 9.21, and the value of the test statistic is χ²* = 25.714. Therefore,
we reject the null hypothesis. There is sufficient evidence to indicate that the market has changed
since 2004.
113. Place bounds on the p-value for testing the null hypothesis Ho: P(A) = 0.25, P(B) = 0.30,
P(C) = 0.35, P(D) = 0.10 given that the value of the test statistic χ²* = 8.95.
ANSWER:
A certain type of flower seed will produce magenta, chartreuse, and ochre flowers in the ratio
6:3:1 (one flower per seed). A total of 150 seeds are planted and all germinate, yielding the
following results:
ANSWER:
115. If the null hypothesis is true, what is the expected number of magenta flowers?
ANSWER:
116. If the null hypothesis is true, what is the expected number of chartreuse flowers?
ANSWER:
117. If the null hypothesis is true, what is the expected number of ochre flowers?
ANSWER:
k–1=3–1=2
ANSWER:
120. Complete the hypothesis test at α = 0.10, using the p-value approach.
ANSWER:
Since p-value > α = 0.10, we fail to reject Ho. There is no significant evidence to suggest
that this type of flower seed will not produce magenta, chartreuse, and ochre flowers in
the ratio 6:3:1. In other words, the proportions of the three colors are not significantly
different from the 6:3:1 ratio.
121. Compute the hypothesis test at α = 0.10 using the classical approach.
ANSWER:
The critical value = χ²(2, 0.10) = 4.61. Since χ²* = 4 does not fall in the rejection region,
we fail to reject Ho. We reach the same conclusion as stated in question 120.
ANSWER:
ANSWER:
The expected values are calculated according to the formula E = np. The observed and
expected frequencies are shown in the table below:
Quality
χ²* = ∑ [(O − E)² / E] = 6.27
124. Does this sample contradict the expected proportions? Test at the 0.05 level of
significance using the p-value approach.
ANSWER:
125. Does this sample contradict the expected proportions? Test at the 0.05 level of
significance using the classical approach.
ANSWER:
The critical value = χ²(3, 0.05) = 7.82. Since χ²* = 6.27 does not fall in the rejection
region, we fail to reject Ho. We reach the same conclusion as stated in question 124.
Integer 0 1 2 3 4 5 6 7 8 9
Frequency 16 12 11 10 15 15 12 17 21 21
The programmer has sufficient reason to believe that the integers are not being generated
uniformly.
ANSWER:
127. Test the hypotheses in question 126 at α = 0.10 using the p-value approach.
P = p-value = P(χ² > 9.07 | df = 9) ⇒ 0.25 < P < 0.50. Since P > α = 0.10, we fail to
reject Ho. There is not sufficient evidence to support the programmer’s belief that the integers
are not being generated uniformly.
128. Test the hypotheses in question 126 at α = 0.10 using the classical approach.
ANSWER:
The critical value = χ²(9, 0.10) = 14.7. Since χ²* = 9.07 does not fall in the rejection
region, we fail to reject Ho. We reach the same conclusion as stated in question 127.
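Both the statistic 9.07 and the table bounds on its p-value can be checked numerically. The sketch below computes the right-tail area for any positive integer df with a standard recurrence on the incomplete gamma function; chi_square_sf is a hypothetical helper written for this illustration, not a library call.

```python
import math

def chi_square_sf(x, df):
    """P(chi-square > x) for a positive integer df, using
    Q(x; v + 2) = Q(x; v) + (x/2)**(v/2) * exp(-x/2) / gamma(v/2 + 1)."""
    if df % 2 == 0:
        v, q = 2, math.exp(-x / 2)                 # base case df = 2
    else:
        v, q = 1, math.erfc(math.sqrt(x / 2))      # base case df = 1
    while v < df:
        q += (x / 2) ** (v / 2) * math.exp(-x / 2) / math.gamma(v / 2 + 1)
        v += 2
    return q

observed = [16, 12, 11, 10, 15, 15, 12, 17, 21, 21]   # digits 0 through 9
expected = sum(observed) / len(observed)               # 15 per digit under uniformity
statistic = sum((o - expected) ** 2 / expected for o in observed)
p_value = chi_square_sf(statistic, len(observed) - 1)
print(round(statistic, 2))  # 9.07
print(p_value)
```

The computed p-value lies between 0.25 and 0.50, consistent with the table bounds in question 127.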
Skittles Original Fruit bite size candies are multiple colored candies in a bag and you can “Taste
the Rainbow” with their five colors and flavors: Green-Lime, Purple-Grape, Yellow-Lemon,
Orange-Orange, and Red-Strawberry. Unlike some of the other multi-colored candies available,
Skittles claims their 5 colors are equally likely. In an attempt to reject this claim, an 8-ounce bag
of Skittles was purchased and colors counted.
ANSWER:
ANSWER:
P = p-value = P(χ² > 5.4 | df = 4) ⇒ 0.10 < P < 0.25 [almost 0.25, since χ²(4, 0.25) =
5.39]. Since P > α = 0.05, we fail to reject Ho. There is not sufficient evidence to
contradict Skittles’ claim that the 5 colors are equally likely.
131. Does this sample contradict Skittles’ claim? Test at the .05 level of significance using
the classical approach.
ANSWER:
The critical value = χ²(4, 0.05) = 9.49. Since χ²* = 5.4 does not fall in the rejection
region, we fail to reject Ho. We reach the same conclusion as stated in question 130.
132. Suppose we purchase a 16-ounce bag and count the five colors. The results are shown
below:
Calculate the value of chi-square for these data. How is the new chi-square value
related to the one found in question 130? What effect does this new value have on the
test results? Explain.
ANSWER:
Each of the expected frequencies = (400)(0.20) = 80, and the new chi-square value χ²*
= 10.80. This value is exactly twice the value found in question 130. In this case, we
reject Ho since χ²* = 10.80 falls in the rejection region χ² > 9.49. Now, we can say that
When interbreeding two strains of roses, we expect the hybrid to appear in three genetic
classes in the ratio 1:3:4. The results of an experiment yield 60 hybrids of the first type, 255 of
the second type, and 285 of the third type.
ANSWER:
ANSWER:
The expected values are calculated according to the formula E = np. The observed and
expected frequencies are shown in the table below:
Quality
χ²* = ∑ [(O − E)² / E] = 3.0 + 4.0 + 0.75 = 7.75, summed over all cells.
ANSWER:
P = p-value = P(χ² > 7.75 | df = 2) ⇒ 0.01 < P < 0.025. Since P < α = 0.05, we reject
Ho. There is sufficient evidence to reject the hypothesized genetic ratio at the 0.05 level
of significance.
136. Do we have sufficient evidence to reject the hypothesized genetic ratio at the 0.05 level
of significance? Test using the classical approach.
ANSWER:
The critical value = χ²(2, 0.05) = 5.99. Since χ²* = 7.75 falls in the rejection region,
we reject Ho. We reach the same conclusion as stated in question 135.
A national survey states that 67% of college students are under the age of 25, 21% are between the ages
of 25 and 30, 8% are between 30 and 40, and 4% are over 40. A random sample of 250 students at
Grand Rapids Community College yielded the following data:
Age Frequency
Under 25 138
25 but under 30 62
30 but under 40 32
Over 40 18
137. State the null and alternative hypotheses to test whether the distribution of students’
ages at Grand Rapids Community College agrees with the national survey.
ANSWER:
ANSWER:
The expected cell counts for each of the four age categories, computed by using the
formula Ei = npi, are 167.5, 52.5, 20, and 10, respectively. The chi-square test statistic
can now be calculated as: χ²* = ∑ (Oi − Ei)² / Ei = 20.52.
ANSWER:
ANSWER:
Since χ²* = 20.52 > 7.82, reject Ho. We conclude the distribution of students’ ages at
Grand Rapids Community College does not agree with the national survey.
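The expected counts and the per-cell contributions for the age data can be listed with a few lines of Python. This is a sketch for checking the arithmetic, with variable names chosen only for illustration.

```python
proportions = [0.67, 0.21, 0.08, 0.04]   # national survey: under 25, 25-30, 30-40, over 40
observed = [138, 62, 32, 18]             # Grand Rapids Community College sample
n = sum(observed)                        # 250 students

expected = [n * p for p in proportions]  # Ei = n * pi
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

print([round(e, 1) for e in expected])
print([round(c, 2) for c in contributions])
print(round(statistic, 2))
```

Summing the unrounded contributions gives about 20.51. The 20.52 reported above corresponds to adding the contributions after rounding each to two decimals (5.20 + 1.72 + 7.20 + 6.40); the decision is the same either way.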
Section 11.3
True-False Questions
ANSWER: T
142. In a contingency table, the sum of the observed frequencies in a given row equals the
sum of the expected frequencies for the same row.
ANSWER: T
ANSWER: F
ANSWER: T
145. The number of degrees of freedom in the chi-square test of independence, where the
contingency table has r rows and c columns, is determined by df = r ⋅ c.
ANSWER: F
146. The chi-square test of homogeneity is used when the two categorical variables in the
contingency table are controlled by the experimenter so that the row (or column) totals
are predetermined.
ANSWER: F
147. The observed frequency of a cell should not be allowed to be smaller than 5 when a chi-
square test of homogeneity is being conducted.
ANSWER: F
ANSWER: T
149. The null hypothesis being tested by a test of homogeneity is that the distribution of
proportions is the same for each of the subpopulations.
ANSWER: T
150. For a contingency table, the expected frequency for a cell is determined by dividing the
column total by the grand total.
ANSWER: F
151. The sum of the observed frequencies in a chi-square test of independence need not
equal the sum of the expected frequencies.
ANSWER: F
152. Chi-square tests of independence are always lower-tailed because a perfect fit between
observed and expected frequencies makes the test statistic χ²* equal to zero.
ANSWER: F
153. The degrees of freedom associated with a chi-square test of independence where data
are summarized in a contingency table with r rows and c columns equal the number of
rows times the number of columns in the table minus two; that is, rc − 2.
ANSWER: F
154. A chi-square test for independence is applied to a contingency table with 4 rows and 4 columns
for two qualitative variables. The degrees of freedom for this test must be 9.
ANSWER: T
155. In a chi-square test of independence, if the value of the test statistic was χ²* = 16.55, and the
critical value at α = 0.025 was 14.5, then we must reject the null hypothesis at α = 0.05.
ANSWER: T
157. The chi-square test statistic for a contingency table with r rows and c columns can be
negative if r is much smaller than c.
ANSWER: F
158. A chi-square test for independence is applied to a contingency table with 3 rows and 5
columns for two qualitative variables. The number of degrees of freedom for this test is
8.
ANSWER: T
ANSWER: F
160. A chi-squared test for independence is applied to a contingency table with 3 rows and 5 columns
for two qualitative variables. The degrees of freedom for this test must be 15.
ANSWER: F
Multiple-Choice Questions
A) In the test of independence, one set of marginal totals (either row totals or column
totals) is known before the data are collected.
B) In the test for homogeneity, the null hypothesis says, “The distribution of proportions
is the same in all subpopulations.”
C) In the test of independence, the number of degrees of freedom is r + c – 1.
D) In the test for homogeneity, the number of degrees of freedom is rc – 1, where r is
the number of rows and c is the number of columns in the contingency table.
ANSWER: B
163. You have calculated the chi-square test statistic in a test of independence and
determined that χ²* = −4.23. Therefore, you will know that you
A) automatically reject Ho.
B) automatically fail to reject Ho.
C) had observed frequencies that were greater than the corresponding expected
frequencies.
D) made a mistake in the calculation.
ANSWER: D
164. What is your conclusion for a chi-square test of independence with critical value of 17.34
and χ²* = 2.54?
A) The actual testing procedure for independence and homogeneity with contingency
tables is not the same
B) In a test of homogeneity, we are actually testing the null hypothesis: The distribution
of proportions within the rows is the same for all rows.
C) In a test of homogeneity, the alternative hypothesis is stated as: The distribution of
proportions within the rows is not the same for all rows; that is, at least one is
different from the others.
D) All of the above.
ANSWER: A
168. The number of degrees of freedom for a contingency table with 6 rows and 6 columns is
A) 36.
B) 25.
C) 12.
D) 6.
ANSWER: B
A) 19.2.
B) 3.125.
C) 20.0.
D) 1.786.
ANSWER: A
170. A chi-square test of independence with 10 degrees of freedom results in a test statistic
of 19.25. Using the chi-square table, the most accurate statement that can be made
about the p-value for this test is that:
172. A chi-square test of independence is applied to a contingency table with 4 rows and 5
columns for two qualitative variables. The degrees of freedom for this test will be:
A) 20.
B) 16.
C) 15.
D) 12.
ANSWER: D
173. In a chi-square test of independence, the value of the test statistic was χ²* = 9.572, and
the critical value at α = 0.025 was 11.1433. Thus,
Short-Answer Questions
174. Suppose we are interested in determining whether or not there is a particular difference
in the preference for a particular product depending on the gender of the consumer.
ANSWER:
175. In using a contingency table, what assumption allows us to compute the expected
frequency for a cell as we do?
ANSWER:
176. What must “a” and “b” equal in order that the chi-square value, χ²*, be zero?
1 2 Total
Levels of B 1 30 70 100
2 a b 200
ANSWER:
a = 60 , b = 140
177. What hypothesis is being tested when a contingency table is used to perform a test of
homogeneity?
ANSWER:
The proportions within the row are the same for all rows.
178. In performing a hypothesis test concerning a contingency table, discuss the implication
of obtaining a computed value of χ² that is very close in value to zero.
ANSWER:
If χ²* is close to zero, then the observed frequencies are very close in value to the
expected frequencies.
ANSWER:
ANSWER:
181. How does a test of homogeneity differ from a general contingency table problem?
ANSWER:
182. The “test of independence” and the “test of homogeneity” are completed in identical
fashion, using the contingency table to display and organize the calculations. Explain
how these two hypothesis tests differ.
ANSWER:
The test of independence has one sample of data that is being cross-tabulated
according to the categories of two separate variables; the test of homogeneity has
multiple samples being compared side-by-side and together these samples form the
entire sample used in the contingency table.
A group of high school seniors was given both a math aptitude test as well as a computer
aptitude test. They were then grouped into one of three math aptitude classes as well as one of
three computer science aptitude classes as shown below. One wishes to test the null
hypothesis that computer science aptitude is independent of mathematics aptitude at α = 0.05.
Low 40 25 10
High 20 40 15
183. Determine the critical region and calculate the value of the test statistic.
ANSWER:
ANSWER:
Short 200
Tall 200
ANSWER:
Leader Follower
Short 50 150
Tall 50 150
Consider the following data regarding germination rates for treated and untreated seeds. Test
the null hypothesis that the germination rate is the same for the treated as the untreated seed,
at α = 0.01.
Treated 85 15
Untreated 120 30
186. Determine the critical region and calculate the value of the test statistic.
ANSWER:
Fail to reject the null hypothesis since χ²* < χ²(1, 0.01) = 6.63. There is not sufficient evidence to
conclude that germination rates differ for treated and untreated seeds.
Veterans and non-veterans were surveyed concerning giving veteran preference in hiring for
state government jobs. Suppose the results for the veteran preference were as follows:
Yes No
Non-veteran 360 90
ANSWER:
80%
ANSWER:
80%
190. What would these answers lead you to believe about the independence of veteran preference
and veteran/non-veteran status?
ANSWER:
ANSWER:
χ²* = 0
Single Married
Candidate A 40 30 70
Candidate B 20 30 50
Total 60 60 120
192. Let p1 be the population proportion of singles who prefer Candidate A and p2 be the
population proportion of married who prefer Candidate A. Compute the test statistic, z*,
for testing Ho: p1 = p2 vs. Ha: p1 ≠ p2.
193. Compute the test statistic for testing that candidate preference is independent of marital
status. That is, compute χ²*.
ANSWER:
ANSWER:
χ²* = 3.429 = (1.852)² = (z*)².
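The identity χ²* = (z*)² for a 2 × 2 table can be checked directly on the candidate data. The sketch below computes both statistics from the same counts; the variable names are illustrative only.

```python
import math

# Rows: Candidate A, Candidate B; columns: Single, Married.
table = [[40, 30], [20, 30]]
col_totals = [sum(col) for col in zip(*table)]   # 60 single, 60 married
row_totals = [sum(row) for row in table]         # 70 for A, 50 for B
n = sum(row_totals)

# Two-proportion z statistic with the pooled estimate.
p1 = table[0][0] / col_totals[0]                 # singles preferring A
p2 = table[0][1] / col_totals[1]                 # married preferring A
pooled = row_totals[0] / n
z = (p1 - p2) / math.sqrt(pooled * (1 - pooled) * (1 / col_totals[0] + 1 / col_totals[1]))

# Chi-square statistic for independence on the same table.
chi2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2) for j in range(2)
)
print(round(z, 3), round(z * z, 3), round(chi2, 3))  # 1.852 3.429 3.429
```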
195. Refer to the contingency table below with the given observed frequencies. What possible
values of a would cause us to fail to reject the claim that the row variable is independent
of the column variable using α = 0.01?
18 20 16
14 30 a
ANSWER:
We fail to reject the claim if a were any one of the values 5, 6, 7, ..., 45, 46, or 47.
196. A study involving marijuana use and antisocial behavior resulted in the following data.
Give the p-value for testing that the type of dominant antisocial behavior is independent
of the level of marijuana use, and write your conclusion if α = 0.05.
Insomnia 15 8 8
Aggressiveness 10 8 20
Transient Psychosis 8 12 7
None Apparent 15 10 6
ANSWER:
Value of the test statistic: χ²* = 13.995; 0.025 < p-value < 0.05. Since p-value < α,
reject the null hypothesis that type of dominant antisocial behavior is independent of the
level of marijuana use.
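The statistic and the p-value bounds for this 4 × 3 table can be reproduced as follows. This is a sketch: the column labels for the levels of marijuana use are not shown in the table above, so the columns are simply taken in the order given, and the p-value uses the closed form that holds for even degrees of freedom.

```python
import math

table = [
    [15, 8, 8],    # Insomnia
    [10, 8, 20],   # Aggressiveness
    [8, 12, 7],    # Transient Psychosis
    [15, 10, 6],   # None Apparent
]
row_totals = [sum(r) for r in table]
col_totals = [sum(c) for c in zip(*table)]
n = sum(row_totals)

statistic = sum(
    (o - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
    for i, row in enumerate(table) for j, o in enumerate(row)
)
df = (len(table) - 1) * (len(table[0]) - 1)   # (4 - 1)(3 - 1) = 6

# Even df: P(chi-square > x) = exp(-x/2) * sum_{i < df/2} (x/2)**i / i!
half = statistic / 2
p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
print(statistic, df, p_value)
```

The computed statistic agrees with 13.995 to within rounding, and the p-value falls between 0.025 and 0.05.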
The individuals in the following table have an eye irritation, or a nose irritation, or a throat
irritation. They have only one of the three.
Age (years)
Type of Irritation 18-29 30-44 45-64 65 and over Total
Throat 75 90 45 10 220
A physician wishes to determine if there is sufficient evidence to reject the hypothesis that the
type of ENT irritation is independent of the age group.
ANSWER:
Age (years)
χ²* = ∑ [(O − E)² / E] = 5.2471, summed over all cells.
ANSWER:
P = p-value = P(χ² > 5.2471 | df = 6). Using the table of the χ² distribution: 0.50 < P <
0.75. Since P > α = 0.05, we fail to reject Ho. There is not sufficient evidence to indicate
ANSWER:
The critical region is χ² ≥ 12.6. Since the test statistic χ²* falls in the noncritical region,
we fail to reject Ho. We reach the same conclusion as stated in question 199.
The manager of an assembly process wants to determine whether the number of defective parts
manufactured depends on the day of the week the parts are produced. He collected the
following information.
Days of Week
Nondefective 34 36 38 38 36 182
Defective 6 4 2 2 4 18
ANSWER:
Ha: The number of defective parts is not independent of the day of the week.
ANSWER:
χ²* = ∑ [(O − E)² / E] = 3.4186, summed over all cells.
ANSWER:
P = p-value = P(χ² > 3.4186 | df = 4). Using the table of the χ² distribution: 0.25 < P <
0.50. Since P > α = 0.05, we fail to reject Ho. There is not sufficient evidence to indicate
that the number of defective parts is not independent of the day of the week on which
they are produced.
ANSWER:
The critical region is χ² ≥ 9.49; χ²* falls in the noncritical region, therefore we fail to
reject Ho at the 0.05 level of significance. We reach the same conclusion as stated in
question 203.
Professor
Grades #1 #2 #3
A 15 12 30
B 20 32 26
C 25 20 10
Other 20 26 24
The department head of statistics wishes to determine if there is sufficient evidence to
conclude that the distribution of grades is not the same for all three professors.
ANSWER:
ANSWER:
Professor
Grades #1 #2 #3
χ²* = ∑ [(O − E)² / E] = 18.807, summed over all cells.
207. Complete the hypothesis test at the 0.01 level of significance using the p-value
approach.
ANSWER:
P = p-value = P(χ² > 18.807 | df = 6). Using the table of the χ² distribution: P < 0.005.
Since P < α = 0.01, we reject Ho. There is sufficient evidence to indicate that the
distribution of grades is not the same for all professors, at the 0.01 level of significance.
208. Complete the hypothesis test at the 0.01 level of significance using the classical
approach.
ANSWER:
The critical region is χ² ≥ 16.8. Since the test statistic χ²* is in the critical region, we
reject Ho. We reach the same conclusion as stated in question 207.
209. Which professor is the easiest grader? Explain, citing specific supporting evidence.
ANSWER:
Professor #3 gives A’s in higher proportion and C’s in lower proportion than expected if
all graded the same. This can be supported by the value of chi-square that comes from
those two cells.
The table below reports the responses of 300 students selected from schools with low
graduation rates to the question “Do tests required for graduation discourage some students
from staying in school?”
Yes 60 30 50 140
No 25 15 15 55
Unsure 45 25 35 105
One wishes to determine if there is a relationship between a student’s response and the
school’s location.
ANSWER:
Ha: The student’s response and the school location are not independent.
ANSWER:
χ²* = ∑ [(O − E)² / E], summed over all cells,
= 0.007 + 0.218 + 0.238 + 0.057 + 0.366 + 0.606 + 0.005 + 0.010 + 0.000 = 1.507.
212. Complete the hypothesis test at the 0.05 level of significance using the p-value
approach.
ANSWER:
P = p-value = P(χ² > 1.507 | df = 4). Using the table of the χ² distribution: 0.75 < P < 0.90.
Since P > α = 0.05, we fail to reject Ho. There is not sufficient evidence to show that the
student’s response and the school location are not independent.
213. Complete the hypothesis test at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical region is χ² ≥ 9.49. Since the test statistic χ²* falls in the noncritical region,
we fail to reject Ho at the 0.05 level of significance. We reach the same conclusion as
stated in question 212.
Response
Group 1 38 12 50
Group 2 35 15 50
Total 73 27 100
214. Compute the value of the test statistic z* that would be used to test the null hypothesis
that p1 = p2 where p1 and p2 are the proportions of “yes” responses in the respective
groups.
ANSWER:
p′p = (x1 + x2) / (n1 + n2) = (38 + 35) / (50 + 50) = 0.73, and q′p = 1 − p′p = 1 − 0.73 = 0.27.
z* = (p′1 − p′2) / √{p′p q′p [(1/n1) + (1/n2)]}
   = (0.76 − 0.70) / √{(0.73)(0.27)[(1/50) + (1/50)]} = 0.6757
215. Compute the value of the test statistic χ²* that would be used to test the hypothesis
that “response is independent of group.”
ANSWER:
Yes No
ANSWER:
217. State the null hypothesis Ho and the alternative hypothesis Ha that would be used to test
the following statement: “In the recent Egyptian presidential election that was held
September 7, 2005, the voters expressed preferences that were not independent of their
party affiliations.”
ANSWER:
218. State the null hypothesis Ho and the alternative hypothesis Ha that would be used to test
the following statement: “The distribution of opinions is the same for all five
communities.”
ANSWER:
219. State the null hypothesis Ho and the alternative hypothesis Ha that would be used to test
the following statement: “The proportion of strongly agree responses was the same for
all categories surveyed.”
Ho: The proportion of strongly agree responses was the same in all categories sampled.
The table below outlines the results of a survey conducted recently to collect information from
Michigan high school students about their opinion on seatbelt usage. They were asked whether
or not they rarely or never wear seatbelts when riding in someone else’s car.
Gender
220. Suppose you wish to test the hypothesis that gender is independent of seatbelt usage,
state the null and alternative hypotheses.
ANSWER:
ANSWER:
Gender
223. Using the classical approach at α = 0.05, does this sample present sufficient evidence to
reject the null hypothesis that gender is independent of seatbelt usage?
ANSWER:
The critical value = χ²(1, 0.05) = 3.84. Since χ²* = 31.93 falls in the rejection region, we
reject Ho. There is sufficient evidence to indicate that seatbelt usage depends on gender.
224. Complete the hypothesis test at the 0.05 level of significance using the p-value
approach.
ANSWER:
P = p-value = P( χ 2 > 31.93 | df = 1); Using the table of χ 2 distribution: P < 0.005.
Since P < α = 0.05, we reject H o . We reach the same conclusion as stated in question
223.
A survey of randomly selected travelers who visited the restrooms in US 131 during their
summer vacation in 2004 showed the following results:
225. Suppose you wish to test the hypothesis that quality of responses is independent of the
gender of the respondent, state the null and alternative hypotheses.
ANSWER:
228. Using the classical approach at α = 0.05, does this sample present sufficient evidence to
reject the null hypothesis that quality of responses is independent of the gender of the
respondent?
ANSWER:
The critical value = χ 2 (2, 0.05) = 5.99. Since χ 2∗ = 15.58 does fall in the rejection region,
we reject H o . There is sufficient evidence to indicate that quality of responses is
dependent on the gender of the respondent.
229. Using the p-value approach at α = 0.05, does this sample present sufficient evidence to
reject the null hypothesis that quality of responses is independent of the gender of the
respondent?
ANSWER:
P = p-value = P( χ 2 > 15.58 | df = 2); Using the table of χ 2 distribution: P < 0.005.
Since P < α = 0.05, we reject H o . We reach the same conclusion as stated in question
228.
Age Group
230. Suppose you wish to test the hypothesis that the same proportion of each age group has
serious fears of darkness, state the null and alternative hypotheses.
ANSWER:
H o : The proportion of individuals who have serious fears of darkness is the same in all five age groups.
H a : The proportion of individuals who have serious fears of darkness is not the same in all five age groups.
ANSWER:
Age Group
ANSWER:
χ 2∗ = ∑ (O − E ) 2 / E
= 1.38 + 0.01 + 4.56 + 10.17 + 16.25 + 0.75 + 0.01 + 2.50 + 5.56 + 8.89
= 50.08
ANSWER:
The critical value = χ 2 (4, 0.01) = 13.3. Since χ 2∗ = 50.08 falls in the rejection region, we
reject H o . There is sufficient evidence to indicate that the proportion of individuals who
have serious fears of darkness is not the same in all five age groups.
234. A study of the purchase decisions of three stock portfolio managers, A, B, C, was
conducted to compare the numbers of stock purchases that resulted in profits over a
time period less than or equal to 1 year. One hundred randomly selected purchases
were examined for each of the managers. Do the data provide evidence of differences
among the rates of successful purchases for the three managers?
Manager
Portfolio A B C
Profit 65 73 57
No Profit 35 27 43
ANSWER:
Manager
A B C Total
With (r – 1)(c – 1) = 2 df, the p-value is bounded between 0.05 and 0.10. Therefore, H o
is not rejected and the results are declared not significant. There is not enough
information to conclude that the proportion of successful purchases will differ among the
managers.
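The conclusion for question 234 can be reproduced from the observed counts. The sketch below is an illustration only: `chi_square_statistic` is a hypothetical helper, and the p-value line relies on the fact that a chi-square variable with exactly 2 degrees of freedom has right-tail probability exp(−x/2), so it does not generalize to other df.

```python
from math import exp

def chi_square_statistic(observed):
    """Return (chi-square statistic, df) for an r x c contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence: (row total)(column total)/n
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Rows: Profit, No Profit; columns: managers A, B, C (question 234).
observed = [[65, 73, 57], [35, 27, 43]]
chi2, df = chi_square_statistic(observed)

# Closed form for the right-tail area, valid only because df == 2.
p_value = exp(-chi2 / 2)
print(f"chi-square* = {chi2:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The statistic comes out near 5.63 with 2 df, and the p-value lands between 0.05 and 0.10, in agreement with the bounds quoted in the answer.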
235. The personnel manager of a consumer product company asked a random sample of
employees how they felt about the work they were doing. The table below gives a
breakdown of their responses by gender. Do the data provide sufficient evidence to
conclude that the level of job satisfaction is related to gender? Use α = 0.10.
Response
Gender Very Interesting Fairly Interesting Not Interesting
Male 70 41 9
Female 35 34 11
ANSWER:
H o : Job satisfaction and gender are independent
H a : Job satisfaction and gender are dependent
The critical value is χ 2 (2, 0.10) = 4.61 and the value of the test statistic is χ 2∗ = 4.708. Therefore,
we reject the null hypothesis. There is sufficient evidence to conclude that job satisfaction is
related to gender.
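To see where the value 4.708 comes from, the expected count and contribution of each cell can be listed individually. This is a sketch under the usual independence assumption; the variable names are illustrative, and the critical value 4.61 is copied from the answer above rather than computed.

```python
# Observed counts from question 235 (columns: Very, Fairly, Not Interesting).
observed = {"Male": [70, 41, 9], "Female": [35, 34, 11]}

col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

chi2 = 0.0
for gender, row in observed.items():
    row_total = sum(row)
    for obs, col_total in zip(row, col_totals):
        expected = row_total * col_total / grand_total
        contribution = (obs - expected) ** 2 / expected
        chi2 += contribution
        print(f"{gender:6s} O = {obs:3d}  E = {expected:5.1f}  (O-E)^2/E = {contribution:.4f}")

critical_value = 4.61  # chi-square(2, 0.10), as quoted in the answer
reject_null = chi2 > critical_value
print(f"chi-square* = {chi2:.3f}, reject H0: {reject_null}")
```

The six contributions sum to roughly 4.708, which exceeds 4.61, so the decision matches the stated answer.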
Chapter 12
ANALYSIS OF VARIANCE
1. The ANOVA test assumes sampling from normal populations with equal variances.
ANSWER: T
2. In single-factor ANOVA, if the null hypothesis is rejected then all of the population means
are declared to differ from one another.
ANSWER: F
3. We do not need to assume that the observations are independent to perform analysis of
variance.
ANSWER: F
4. Experimental error is the name given to the variability that takes place among the
replicates of an experiment as it is repeated under constant conditions.
ANSWER: T
5. The rejection of H o in single-factor ANOVA indicates that you have identified the level(s)
of the factor that is (are) different from the others.
ANSWER: F
6. To partition the sum of squares for the total in single-factor ANOVA is to separate the
numerical value of SS(total) into two values, SS(factor) and SS(error), such that the
sum of these two values is equal to SS(total).
ANSWER: T
7. In order to apply the F-test in ANOVA, the sample standard deviation from each factor
level sample must be the same.
ANSWER: F
ANSWER: F
9. Independent samples were collected in order to test the effect a factor had on a variable
of interest. The data are summarized in the ANOVA table shown below.
df SS
Factor 2 810
Error 8 720
Total 10 1530
ANSWER: F
10. Fail to reject H o in single-factor ANOVA is the desired decision when the means for the
levels of the factor being tested are all different.
ANSWER: F
11. In single-factor ANOVA, the degrees of freedom for the factor are equal to the number of
factor levels tested less one.
ANSWER: T
12. The measure of a specific level of a factor being tested in an ANOVA is the variance of
the factor level.
ANSWER: F
13. Independent samples were collected in order to test the effect a factor had on a variable
of interest. The data are summarized in the ANOVA table shown below.
Factor 2 84.5
Error 10 9.5
Total 12 94.0
ANSWER: F
14. In single-factor ANOVA, when the calculated value of the test statistic F * , is greater
than the table value for F, the conclusion will be: “The factor being tested does have an
effect on the variable.”
ANSWER: T
15. In single-factor ANOVA, when the calculated value of the test statistic F * is greater than
the table value for F, then the decision will be: “Fail to reject H o .”
ANSWER: F
16. In single-factor ANOVA, if 10 is subtracted from every data value, then the calculated
value of the test statistic F * is also reduced by 10.
ANSWER: F
ANSWER: T
19. A possible interpretation of H a in single-factor ANOVA is that “The factor being tested
has no effect on the random variable x.”
ANSWER: F
20. In single-factor ANOVA, the sample size from each factor level must be the same in
order to apply the F-test.
ANSWER: F
21. In single-factor ANOVA, we want to reject H o and conclude that the factor has an effect
on the variable when the amount of variance assigned to the factor is significantly larger
than the variance assigned to error.
ANSWER: T
22. Independent samples were collected in order to test the effect a factor had on a variable
of interest. The data are summarized in the ANOVA table shown below.
df SS
Factor 2 28.5
Error 12 125.3
Total 14 153.8
ANSWER: T
ANSWER: F
ANSWER: T
25. In single-factor ANOVA, if the computed value of F is F * = 9.56, and the critical value is
F = 6.39, we would conclude that all the population means are equal.
ANSWER: F
26. In single-factor ANOVA, the alternative hypothesis used in the F-test states that
µ1 = µ2 = µ3 .
ANSWER: F
27. One characteristic of the F-distribution is that the computed value of F can only range
between − 1.0 and +1.0, inclusive.
ANSWER: F
28. In single-factor ANOVA, if the computed value of F is F * = 4.21, and the critical value is
F = 8.89, we would fail to reject the null hypothesis.
ANSWER: T
ANSWER: T
ANSWER: T
ANSWER: F
Multiple-Choice Questions
32. When hypothesis testing involves more than two means, we use the ANOVA rather than
the t-test. ANOVA stands for:
33. Which of the following is a correct interpretation of the null hypothesis for analysis of
variance for one factor?
A) There is no difference between the mean values of the random variable at the
various levels of the test factor.
B) The factor being tested had no effect on the random variable x.
C) There is no variance amongst the mean values of x for each of the different factor
levels.
D) All of the above.
ANSWER: D
34. Given the following set of data, there are three degrees of freedom values. Identify the
correct statement below.
Replicates
A B C D
I 6 11 8 3
Factor II 9 10 11 10
III 14 11 12 15
A) df(Factor) = 3
B) df(Error) = 9
C) df(Total) = 12
D) All of the above.
ANSWER: A
35. In single-factor ANOVA, when the calculated value of the test statistic F * is greater
than the table value for F, we will:
A) fail to reject H o and conclude the factor being tested does have an effect on the
variable.
B) fail to reject H o and conclude the factor being tested does not have an effect on the
variable.
C) reject H o and conclude the factor being tested does have an effect on the variable.
D) reject H o and conclude the factor being tested does not have an effect on the variable.
ANSWER: C
36. Identify the correct statement about the analysis of variance technique.
37. In single-factor ANOVA, if the test is conducted and the null hypothesis is rejected, what
does this indicate?
38. What distribution does the F-distribution approach as the sample size increases?
A) Binomial
B) Normal
C) Student’s t - distribution
D) Chi-square
ANSWER: B
A) variances.
B) proportions.
C) medians.
D) means.
ANSWER: D
40. Given the significance level α = 0.01, the critical F-value for the degrees of freedom, d.f.
= (3, 8) is equal to
A) 7.59.
B) 27.5.
C) 5.42.
D) 4.07.
ANSWER: A
41. In a single-factor ANOVA, there are three treatments with sizes n1 = 5 , n2 = 6 and n3 = 5 . Then
the rejection region for this test at the 0.05 level of significance is
A) F > 3.74.
B) F > 4.86.
C) F > 4.97.
D) F > 3.81.
42. Given the significance level α = 0.025, the critical F-value for the degrees of freedom,
d.f. = (3, 8) is
A) 7.59.
B) 27.5.
C) 5.42.
D) 4.07.
ANSWER: C
43. In a single-factor ANOVA test, the test statistic is F * = 4.25. The rejection region is F > 3.06 for
the α = 0.05, F > 3.8 for α = 0.025, and F > 4.89 for α = 0.01. For this test, the approximate p-
value is
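The bracketing logic behind question 43 can be sketched as follows. If F* exceeds the critical value for a given α, the p-value is below that α; otherwise it is at least α. The helper name `bracket_p_value` is hypothetical, and the critical values are the ones stated in the question.

```python
def bracket_p_value(f_star, critical_values):
    """Bracket a right-tail p-value from a dict {alpha: critical F value}.

    Returns (lower, upper) meaning lower <= p < upper; either bound is
    None when the supplied table does not determine it.
    """
    exceeded = [a for a, crit in critical_values.items() if f_star > crit]
    not_exceeded = [a for a, crit in critical_values.items() if f_star <= crit]
    upper = min(exceeded) if exceeded else None
    lower = max(not_exceeded) if not_exceeded else None
    return lower, upper

# Rejection regions quoted in question 43.
lower, upper = bracket_p_value(4.25, {0.05: 3.06, 0.025: 3.8, 0.01: 4.89})
print(f"{lower} < p-value < {upper}")
```

With F* = 4.25 the statistic passes the 0.05 and 0.025 cutoffs but not the 0.01 cutoff, so the p-value lies between 0.01 and 0.025.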
44. A professor of statistics in Michigan State University wants to determine whether the average
starting salaries among graduates of the 15 universities in Michigan are equal. A sample of 25
recent graduates from each university was randomly taken. The appropriate critical value for the
ANOVA test is obtained from the F-distribution with numerator and denominator degrees of
freedom, respectively, equal to:
A) 15 and 25
B) 14 and 360
C) 360 and 14
D) 25 and 15
ANSWER: B
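The degrees of freedom in question 44 follow directly from the number of levels c and the total sample size n. A minimal sketch, with `anova_degrees_of_freedom` as a hypothetical helper name:

```python
def anova_degrees_of_freedom(sample_sizes):
    """Degrees of freedom for single-factor ANOVA given the size of each sample."""
    c = len(sample_sizes)   # number of factor levels
    n = sum(sample_sizes)   # total number of data
    return {"factor": c - 1, "error": n - c, "total": n - 1}

# Question 44: 15 universities, 25 graduates sampled from each.
df = anova_degrees_of_freedom([25] * 15)
print(df)  # {'factor': 14, 'error': 360, 'total': 374}
```

The same function handles unequal sample sizes, since only the count of levels and the total n enter the formulas.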
45. Given the significance level α = 0.05, the critical F-value for the degrees of freedom, d.f.
= (3, 8) is
A) 7.59.
B) 27.5.
C) 5.42.
D) 4.07.
ANSWER: D
A) The factor degrees of freedom are 1 less than the number of levels (columns) for
which the factor is tested; that is df(factor) = c – 1.
B) The error degrees of freedom are the sum of the degrees of freedom for all levels
tested (columns in the data table). Since each column has ki degrees of freedom;
therefore, df(error) = k1 + k2 + k3 + ⋯ = ∑ ki = n
C) The total degrees of freedom are 1 less than the total number of data; that is df(total)
= n – 1.
D) None of the above
ANSWER: C
A) The mean square for the factor being tested, MS(factor), and the mean square for
error, MS(error), are obtained by dividing the sum-of-squares value by the
corresponding number of degrees of freedom; that is MS(factor)= SS(factor) /
df(factor) and MS(error) = SS (error) / df(error).
B) MS(total) = MS(factor) + MS(error).
C) The calculated value of the test statistic, F ∗ , is found by dividing the MS(factor) by
the MS(error).
D) None of the above
ANSWER: B
A) 24
B) 25
C) 29
D) 30
ANSWER: D
52. The number of degrees of freedom for the denominator in one-way ANOVA test involving 4
population means with 15 observations sampled from each population is:
A) 3.55
B) 39.45
C) 4.56
D) 29.45
ANSWER: C
A) variation between the treatments plus the variation within the treatments.
B) variation within the treatments minus the variation between the treatments.
C) variation between the treatments divided by the variation within the treatments.
D) variation within the treatments divided by the variation between the treatments.
ANSWER: C
55. A single-factor ANOVA is applied to three independent samples having means 8, 11,
and 16, respectively. If each observation in the third sample were increased by 20, the
value of the F-statistic would:
A) increase
B) decrease
C) remain unchanged
D) increase by 20
ANSWER: A
Short-Answer Questions
56. In single-factor ANOVA, if df(Factor) = 3, what is the null hypothesis being tested?
ANSWER:
Since df(Factor)=3, then, number of levels of the factor = 4. The null hypothesis must be:
57. Explain how to determine df(Factor), df(Error), and df(Total) if n is the number of data in
the total sample and c is the number of levels (columns) for which the factor is being
tested.
ANSWER:
58. When simultaneously comparing three or more population means, an efficient technique
is called ________.
ANSWER:
ANOVA
59. In ANOVA, explain what is meant by “replicates” and “levels of the tested factor.”
ANSWER:
“Levels of the tested factor” refers to random samples at each level of the factor being
tested.
60. In single-factor ANOVA, if MS(factor) is significantly larger than MS(error), what is your
decision and conclusion?
ANSWER:
We reject H o . There is sufficient evidence to conclude that the means for the factor levels
being tested are not all the same.
SS df MS F∗
Factor A 3 18 E
Error B 15 D
Total 162 C
ANSWER:
ANSWER:
63. The single-factor ANOVA technique separated the variance among the sample data into
two measures of variance. What are they? Briefly explain what does each one measure?
ANSWER:
(1) MS(factor), the measure of variance between the levels of the factor being tested,
and
(2) MS(error), the measure of variance within the levels of the factor being tested.
64. In single-factor ANOVA, if MS(factor) is not significantly larger than MS(error), what is
your decision and conclusion?
We will not be able to reject H o . There is not sufficient evidence to conclude that the
means for the factor levels being tested are not all the same.
Replicates 1 2 3
1 2 3 7
2 5 0 8
3 4 6 9
ANSWER:
x1,3 = 4
ANSWER:
x3,2 = 8
67. Find C1 .
ANSWER:
C1 = 11
ANSWER:
69. Find ∑ (Ci)².
ANSWER:
∑ (Ci)² = 778
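The column totals and the sum of their squares can be verified from the data table that precedes question 67. This sketch assumes the table's columns are the factor levels 1 to 3 and its rows are the replicates, which is consistent with the answers x1,3 = 4 and x3,2 = 8.

```python
# Keys are factor levels (columns); values are the replicates at that level.
levels = {1: [2, 5, 4], 2: [3, 0, 6], 3: [7, 8, 9]}

# Ci is the total of the data in column i.
column_totals = {level: sum(values) for level, values in levels.items()}
sum_of_squared_totals = sum(total ** 2 for total in column_totals.values())

print(column_totals)          # {1: 11, 2: 9, 3: 24}
print(sum_of_squared_totals)  # 778
```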
70. The following ANOVA table shows results of independent samples collected to test the
effect a factor had on a variable. Find the critical value for F at α = 0.05 and determine
if H o can be rejected.
df SS
Factor 2 810
Error 8 720
Total 10 1530
ANSWER:
Since F * = (810/2) / (720/8) = 4.5, and F (2, 8, 0.05) = 4.46, we reject the null hypothesis.
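The test statistic for question 70 can be recomputed from the df and SS columns of the table. A minimal sketch; `f_statistic` is a hypothetical helper, and the critical value 4.46 is taken from the answer rather than computed.

```python
def f_statistic(df_factor, ss_factor, df_error, ss_error):
    """Return (MS(factor), MS(error), F*) from an ANOVA table's df and SS entries."""
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return ms_factor, ms_error, ms_factor / ms_error

ms_factor, ms_error, f_star = f_statistic(2, 810, 8, 720)
critical_value = 4.46  # F(2, 8, 0.05), as quoted in the answer

print(f"MS(factor) = {ms_factor}, MS(error) = {ms_error}, F* = {f_star}")
print("reject H0" if f_star > critical_value else "fail to reject H0")
```

MS(factor) is 405 and MS(error) is 90, giving F* = 4.5, which narrowly exceeds the 0.05 critical value.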
Replicates
3 80 80 85 89
ANSWER:
ANSWER:
SS df MS F*
Total 1481 11
ANSWER:
H o : µ1 = µ2 = µ3 vs. H a : at least two of the population means are not the same.
Since F* = 2.697, and F (2, 9, 0.025) = 5.71, we fail to reject the null hypothesis.
Independent samples were collected in order to test the effect a factor had on a variable. Consider the
ANOVA table below.
df SS
Factor 2 810
Error 8 720
Total 10 1530
ANSWER:
H o : µ1 = µ2 = µ3 vs. H a : at least two of the population means are not the same.
ANSWER:
F * = 4.5
ANSWER:
Since F * = 4.5, and F (2, 8, 0.01) = 8.65, we fail to reject the null hypothesis.
28 35 42 20 32 29
32 37 40 40 42 39
30 39 35 30 37 49
x1 = 30 x2 = 37 x3 = 39 x1 = 30 x2 = 37 x3 = 39
ANSWER:
79. Complete the ANOVA table shown below by filling in the appropriate values for A, B, C,
D, and E.
Source SS df MS F*
Method A B D E
Total 175.5 13
ANSWER:
Factor Levels
1 2
12.2 13.1
13.0 14.2
12.5 15.0
12.9 14.7
ANSWER:
SS df MS F*
Total 7.62 7
ANSWER:
t * = 3.506
Treatments
1 2 3
2 6 7
3 6 6
2 9 8
10
ANSWER:
SS df MS F*
Total 70.90 9
ANSWER:
H o : µ1 = µ2 = µ3 vs. H a : at least two of the population means are not the same.
Since F* = 12.592, and the critical value is F(2, 7, 0.05) = 4.74, we reject the null
hypothesis at α = 0.05, and conclude that at least two of the population means are not
the same.
86. Place bounds on the p-value for the following situation: F* = 4.21, df(Factor) = 3, and
df(Error) = 10.
ANSWER:
0.025 < P < 0.05
87. Place bounds on the p-value for the following situation: F* = 3.99, df(Factor) = 5, and
df(Error) = 15.
ANSWER:
0.01 < P < 0.025
88. Suppose that an F-test has a p-value of 0.029. What is the interpretation of the situation
if you had previously decided on a 0.05 level of significance?
ANSWER:
Reject the null hypothesis; since the p-value is less than the previously set value for α .
89. Determine the critical region(s) and critical value(s) that would be used to test
H o : µ1 = µ 2 = µ3 = µ4 with n = 18, α = 0.05 . Sketch a graph to display the results.
90. Determine the critical region(s) and critical value(s) that would be used to test
H o : µ1 = µ 2 = µ3 = µ4 =µ5 with n = 15, α = 0.01 . Sketch a graph to display the results.
ANSWER:
91. Determine the critical region(s) and critical value(s) that would be used to test
H o : µ1 = µ2 = µ3 with n = 25, α = 0.01 . Sketch a graph to display the results.
ANSWER:
92. Suppose that an F-test has a p-value of 0.035. What is the interpretation of p-value =
0.035?
ANSWER:
The p-value is the area under the F-distribution curve to the right of F ∗ . If the null
hypothesis is true, the probability of obtaining a test statistic at least as extreme as F ∗ is 0.035.
93. Suppose that an F-test has a p-value of 0.073. What is the interpretation of the situation
if you had previously decided on a 0.05 level of significance?
ANSWER:
Fail to reject the null hypothesis; since the p-value is greater than the set value for α .
94. Each department at a large industrial plant is rated weekly. State the hypotheses used to
test “the mean weekly ratings are the same in four departments.”
ANSWER:
H o : µ1 = µ 2 = µ3 = µ 4
Source df SS MS
Factor 3 * *
Error * 51.17 *
Total 20 93.44
ANSWER:
Source df SS MS
Factor 3 42.27 14.09
Error 17 51.17 3.01
Total 20 93.44
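The blanks in the partial table can be filled in by subtraction and division, as this sketch shows. Only the four given entries (df(factor) = 3, df(total) = 20, SS(error) = 51.17, SS(total) = 93.44) come from the question; the variable names are illustrative.

```python
# Entries given in the partial ANOVA table.
df_factor, df_total = 3, 20
ss_error, ss_total = 51.17, 93.44

# Degrees of freedom and sums of squares are additive.
df_error = df_total - df_factor
ss_factor = ss_total - ss_error

# Each mean square is its sum of squares divided by its df.
ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error

print(f"Factor  {df_factor:3d}  {ss_factor:6.2f}  {ms_factor:6.2f}")
print(f"Error   {df_error:3d}  {ss_error:6.2f}  {ms_error:6.2f}")
print(f"Total   {df_total:3d}  {ss_total:6.2f}")
```

The results agree with the completed table: df(error) = 17, SS(factor) = 42.27, MS(factor) = 14.09, and MS(error) = 3.01.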
ANSWER:
ANSWER:
ANSWER:
H o : µ1 = µ 2 = µ3 = µ 4
H a : The means are not equal (that is, at least one mean is different)
99. Test the hypotheses in question 99 at the 0.05 level of significance using the p-value
approach
ANSWER:
100. Test the hypotheses in question 99 at the 0.05 level of significance using the classical
approach.
ANSWER:
ANSWER:
N–t
ANSWER:
t–1
ANSWER:
N–1
104. The F-value is a ratio of two variance estimates. What variance is used as a
denominator of the ratio?
ANSWER:
MS(error)
ANSWER:
MS(factor)
106. Fill in the blanks (identified by asterisks) in the following partial ANOVA table:
Source of SS df MS F
Variation
Factor * * 195 *
Error 625 * *
Total 1600 25
ANSWER:
Source of SS df MS F
Variation
Total 1600 24
Source of SS df MS F
Variation
Error * * 4
Total * *
107. Fill in the blanks (identified by asterisks) in the above ANOVA Table.
ANSWER:
Source of Variation SS df MS F∗
Treatments 12 2 6 1.50
Error 108 27 4
Total 120 29
ANSWER:
H o : µ1 = µ2 = µ3
H a : The population means are not equal (that is; at least one of the means is different)
109. Test at the 5% significance level to determine if differences exist among the three
treatment means.
ANSWER:
Since the test statistic is F ∗ = 1.50, and the critical value is F(2, 27, 0.05) ≈ 3.32, we fail
to reject the null hypothesis. There is not sufficient evidence to conclude that the
population means are not equal.
True-False Questions
110. The mathematical model for a particular problem is an equational statement showing the
anticipated makeup of an individual piece of data.
ANSWER: T
111. Side-by-side dotplots are very useful in visualizing the within-sample variation, the
between-sample variation, and the relationship between them.
ANSWER: T
112. In a single-factor ANOVA, the null hypothesis is that there is no difference between the
levels of the factor being tested.
ANSWER: T
113. In a single-factor ANOVA, our goal is to investigate the effect that various levels of the
factor being tested have on each other.
ANSWER: F
114. In a single-factor ANOVA, we must assume independence among all observations of the
experiment.
ANSWER: T
115. In a single-factor ANOVA, we must assume that the effects due to chance and due to
untested factors are F-distributed.
ANSWER: F
ANSWER: F
Multiple-Choice Questions
117. A Wal-Mart department store examined a sample of 20 credit sales and recorded
the amounts charged for each of four types of credit cards as follows: 4 for American
Express, 5 for Master Card, 6 for Visa, and 5 for Discover. What are the degrees of
freedom for the F statistic?
118. Five different fertilizers were applied to a field of tomatoes. In constructing the ANOVA
table, how many degrees of freedom are there in the numerator?
A) 2
B) 3
C) 4
D) 5
ANSWER: C
119. One-way ANOVA is applied to three independent samples having means 12, 15, and 20,
respectively. If each observation in the third sample were increased by 25, the value of
the statistic would:
A) increase.
B) decrease.
C) remain unchanged.
D) increase by 25.
ANSWER: A
121. In an ANOVA test, the test statistic is F = 6.75. The rejection region is F > 3.97 for the 5% level of
significance, F > 5.29 for the 2.5% level, and F > 7.46 for the 1% level. For this test, the p-value is
122. In a single-factor analysis of variance, the null hypothesis of equal population means is rejected if:
124. The distribution of the test statistic for analysis of variance is the:
A) normal distribution.
B) Student’s t-distribution.
C) F-distribution.
D) chi-squared distribution.
ANSWER: C
Short-Answer Questions
125. In single-factor ANOVA, rejection of H o implies that there is a difference between the
levels. Discuss the problem that would follow.
ANSWER:
If we reject H o , the problem is to locate the level or levels that are different. This may be the main
objective of the analysis.
126. In single-factor ANOVA, the null hypothesis is that there is no difference between the
levels of the factor being tested. How would you interpret a “fail to reject H o ” decision?
ANSWER:
127. Why does df(Factor), the number of degrees of freedom associated with the factor,
always appear first in the critical value notation F[df(factor), df(error), α ]?
ANSWER:
ANSWER:
x c,k is the value of the variable at the kth replicate of level c.
µ is the mean value for all the data without respect to the test factor.
F c is the effect that the factor being tested has on the response variable at each
different level c.
ε k(c) is the experimental error that occurs among the k replicates in each of the c
columns.
129. In single-factor ANOVA, the null hypothesis is that there is no difference between the
levels of the factor being tested. How would you interpret a “reject H o ” decision?
ANSWER:
A “reject H o ” decision implies that there is a difference between the levels. That is, at
least one level is different from the others.
A study was designed to compare the fasting blood sugar readings for three groups of diabetic
patients. One group used insulin to control their problem, one group used oral drugs, and one
group used exercise and diet. The blood sugar readings for the three samples were as follows:
95 135 95
ANSWER:
x2 = 130
ANSWER:
x3 = 104
132. If asked to speculate on what two population means differ, what would your choice be?
ANSWER:
µ 2 and µ3
133. Develop the ANOVA table for testing the claim of equal means.
SS df MS F*
Total 3010 14
ANSWER:
H o : µ1 = µ2 = µ3 vs. H a : at least two of the population means are not the same.
ANSWER:
ANSWER:
Since P = p-value < 0.01, we reject the null hypothesis at α = 0.05 and conclude that at least
two of the population means differ.
137. The coded values for the measure of elasticity in plastic, prepared by two different
processes, for samples of six drawn randomly from each of the processes are shown
below. Using the F test, at α = 0.05, determine if the data presents sufficient evidence to
indicate a difference in mean elasticity for the two processes.
ANSWER:
SS df MS F*
Total 7.55 11
Since F * = 2.88, and critical region is F ≥ 4.96, we fail to reject the null hypothesis. The
data does not present sufficient evidence to indicate a difference in mean elasticity for
the two processes.
In order to better control inflation, the government suggested pay increases be limited to 8% or
less. A member of the Inflation Fighters group compiled the following percent increases for three
different industry groups.
ANSWER:
Total 4.927 11
ANSWER:
H o : µ1 = µ2 = µ3 vs. H a : at least two of the population means are not the same.
ANSWER:
The table below shows the cars that ran out of gas during a one day period on the New York
State Thruway for 4 observation periods.
Westbound, AM 37 34 38 36
Eastbound, AM 37 40 37 42
Westbound, PM 33 34 38 35
Eastbound, PM 41 36 40 39
SS df MS F*
Total 103.44 15
ANSWER:
H o : µ1 = µ2 = µ3 =µ4 vs. H a : at least two of the population means are not the same.
143. At the 0.05 level of significance, does the data contradict the hypothesis that the mean
number of cars running out of gas is the same in all four categories? Test at α = 0.05
using the classical approach.
ANSWER:
Critical region is: F ≥ 3.49, F * = 3.56, therefore we reject H o at α = 0.05. The data
contradicts the hypothesis that the mean number of cars running out of gas is the same
in all four categories
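The ANOVA table for the Thruway data can be rebuilt from the raw counts using the shortcut sum-of-squares formulas. This is a sketch only: `one_way_anova` is a hypothetical helper, each row of the data table is treated as one factor level with four replicates, and the critical value 3.49 is copied from the answer.

```python
def one_way_anova(groups):
    """Single-factor ANOVA from raw data; groups is a list of samples."""
    n = sum(len(g) for g in groups)
    c = len(groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n                      # (sum x)^2 / n
    ss_total = sum(x * x for g in groups for x in g) - correction
    ss_factor = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_factor
    df_factor, df_error = c - 1, n - c
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "ss_factor": ss_factor, "ss_error": ss_error, "ss_total": ss_total,
        "df_factor": df_factor, "df_error": df_error, "df_total": n - 1,
        "ms_factor": ms_factor, "ms_error": ms_error,
        "f_star": ms_factor / ms_error,
    }

thruway = [
    [37, 34, 38, 36],  # Westbound, AM
    [37, 40, 37, 42],  # Eastbound, AM
    [33, 34, 38, 35],  # Westbound, PM
    [41, 36, 40, 39],  # Eastbound, PM
]
table = one_way_anova(thruway)
critical_value = 3.49  # F(3, 12, 0.05), as quoted in the answer

print(f"SS(total) = {table['ss_total']:.2f}, df(total) = {table['df_total']}")
print(f"F* = {table['f_star']:.2f}, reject H0: {table['f_star'] > critical_value}")
```

The computed SS(total) of about 103.44 on 15 df and F* of about 3.56 match the values in the answer key.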
Four brands of gasoline were compared in an experiment. Sixteen small engines were used and
the time of operation for one gallon of gasoline was measured. Four engines were randomly
assigned to each brand.
Brand
A B C D
25 30 32 35
30 35 36 35
32 34 30 38
ANSWER:
SS df MS F*
Total 159.00 15
ANSWER:
H o : µ1 = µ2 = µ3 =µ4 vs. H a : at least two of the population means are not the same.
146. Test for equal means at α = 0.01 using the classical approach.
ANSWER:
Critical region is: F ≥ 5.95, F * = 2.33, therefore we fail to reject H o at α = 0.01 . There is
not sufficient evidence to indicate that at least two population means of the four brands of
gasoline are different.
A cookie salesman interested in increasing his sales volume arranged to have displays of his
best selling cookie in three locations in a market as shown below.
ANSWER:
SS df MS F*
Total 712.4 14
ANSWER:
H o : µ1 = µ2 = µ3 vs. H a : at least two of the population means are not the same.
149. Calculate p-value for testing equal means in the three locations based on the weekly
sales volumes. What is your conclusion at α = 0.01 ?
ANSWER:
p -value > 0.05; therefore we fail to reject H o at α = 0.01 . Conclusion: The data do not provide
sufficient evidence to conclude that the means in the three locations differ.
Four different statistical computer programs were tested for the time (in seconds) required
to complete a particular task. The results are shown below.
ANSWER:
4 treatments
ANSWER:
ANSWER:
SS df MS F*
Total 401.93 11
153. Write the appropriate null and alternative hypotheses, and your conclusion at the 0.01
level.
ANSWER:
ANSWER:
Since F* = 126.03, and the critical value is F = 7.59, we reject the null hypothesis. There
is sufficient evidence to indicate that the mean time (in seconds) required to complete a
particular task is different for at least two of the four statistical computer programs.
A new worker was recently assigned to a crew of workers who perform a certain job. From the
records of the number of units of work completed by each worker each day last month, a
sample of size five was randomly selected for each of the two experienced workers and the new
worker as shown in the table below.
Workers
New A B
Units of Work 10 13 12
(replicates)
12 14 15
11 12 11
13 14 14
10 15 15
ANSWER:
ANSWER:
Source df SS MS F∗
Total 14 42.933
157. At the 0.05 level of significance, does the evidence provide sufficient reason to reject the
claim that there is no difference in the amount of work done by the three workers? Solve
using the p-value approach.
ANSWER:
158. At the 0.05 level of significance, does the evidence provide sufficient reason to reject the
claim that there is no difference in the amount of work done by the three workers? Solve
using the classical approach.
ANSWER:
The critical region is: F ≥ 3.89 . Since the value of the test statistic F* falls in the critical
region, we reject H o . We reach the same conclusion as stated in question 157.
An experiment was designed to compare the lengths of time that four different drugs provided
pain relief following heart surgery. The results (in hours) are shown in the following table.
Drug
A B C D
9 7 7 5
7 7 9 5
5 5 9 3
3 5 9
11
ANSWER:
H o : The mean amount of relief time is the same for all four drugs.
H a : The mean amount of relief time is not the same for all four drugs.
160. Assume the data were randomly collected and are independent, and the effects due to
chance and untested factors are normally distributed. Develop the ANOVA table.
ANSWER:
Source df SS MS F∗
Total 15 81.75
162. Is there enough evidence to reject the null hypothesis that there is no significant
difference in the length of pain relief for the four drugs at α = 0.05? Solve using the
classical approach.
ANSWER:
The critical region is: F ≥ 3.49. Since the test statistic F* falls in the critical region, we
reject H o . We reach the same conclusion as stated in question 161.
A certain vending company’s soft-drink dispensing machines are supposed to serve eight
ounces of beverage. Various machines were sampled and the resulting amounts of dispensed
drink were recorded, as shown in the following table.
Machines
A B C D E
ANSWER:
164. Assume the data were randomly collected and are independent, and the effects due to
chance and untested factors are normally distributed. Develop the ANOVA table.
ANSWER:
∑x² = 1315.01
Source df SS MS F∗
Total 17 22.996
165. Does this sample evidence provide sufficient reason to reject the null hypothesis that all
five machines dispense the same average amount of soft drink? Solve using the p-value
approach.
ANSWER:
166. Does this sample evidence provide sufficient reason to reject the null hypothesis that all
five machines dispense the same average amount of soft drink? Solve using the
classical approach.
ANSWER:
It is believed that the median family incomes for three counties in Michigan are as follows:
Wexford $37,780, Osceola $32,135, and Macomb $39,630. The following data represent the
family incomes (in thousands) for nine randomly selected individuals from each of the three
counties.
ANSWER:
H o : The mean family income is the same for all three counties.
H a : The mean family income is not the same for at least two of the three counties.
168. Assume the data were randomly collected and are independent, and the effects due to
chance and untested factors are normally distributed. Develop the ANOVA table.
Source df SS MS F∗
Total 26 1006.107
169. Is there sufficient evidence to conclude that the mean family income is the same for
each of the three counties at the 0.05 level of significance? Solve using the p-value
approach.
ANSWER:
170. Is there sufficient evidence to conclude that the mean family income is the same for
each of the three counties at the 0.05 level of significance? Solve using the classical
approach.
ANSWER:
The critical region is: F ≥ 3.40 . Since the test statistic F* falls in the critical region, we
reject H o . We reach the same conclusion as stated in question 169.
A B C
ANSWER:
H a : The mean mpg is not the same for all three companies.
ANSWER:
Source df SS MS F∗
Total 14 63.340
173. Is there any difference in mean mpg? Perform the appropriate test at α = 0.05 using the
classical approach.
ANSWER:
The critical value is: F(2,12,0.05) = 3.89. Since the value of the test statistic is F* = 5.05
> 3.89, we reject H o . There is sufficient evidence at the 0.05 level of significance to
conclude that the mean mpg is not the same for all three companies.
174. Is there any difference in mean mpg? Perform the appropriate test at α = 0.05 using the
p-value approach.
ANSWER:
175. State the null hypothesis H o and the alternative hypothesis H a that would be used to test
the following statement: “The mean scores are the same at all five levels of the
experiment.”
ANSWER:
176. State the null hypothesis H o and the alternative hypothesis H a that would be used to test
the following statement: “The test scores are the same at all three sections.”
ANSWER:
H o : µ1 = µ2 = µ3
ANSWER:
H a : Not all test means are equal. (The test factor has an effect)
178. State the null hypothesis H o and the alternative hypothesis H a that would be used to test
the following statement: “The four different methods of treatment do affect the variable.”
ANSWER:
H a : Not all test means are equal. (The different methods of treatment have an effect)
179. Place bounds on the p-value for the following situation: F* = 4.85, df(Factor) = 2,
df(Error) = 10
ANSWER:
180. Place bounds on the p-value for the following situation: F* = 4.89, df(Factor) = 4,
df(Error) = 15
ANSWER:
ANSWER:
182. Place bounds on the p-value for the following situation: F* = 3.57, df(Factor) = 6,
df(Error) = 21
ANSWER:
183. Sketch an approximate F-curve and use the classical approach to determine the critical
region(s) and critical value(s) that would be used to test the null hypothesis
H o : µ1 = µ2 = µ3 with n = 25, α = 0.05
ANSWER:
ANSWER:
Suppose that an F-test (as described in this chapter using the p-value approach) has a p-value
of 0.039.
ANSWER:
186. What is the interpretation of the situation if you had previously decided on a 0.05 level of
significance?
ANSWER:
Reject the null hypothesis; since the p-value is smaller than the previously set value for
α.
187. What is the interpretation of the situation if you had previously decided on a 0.025 level
of significance?
ANSWER:
Fail to reject the null hypothesis; since the p-value is greater than the previously set
value for α .
The single-factor analysis of variance (ANOVA) is used to test a hypothesis about several
population means. Assume that c is the number of levels (columns) for which the factor is
tested, k_i is the number of replicates at each level tested, and n = ∑ k_i is the number of data in
the total sample.
188. State the null hypothesis, in a general form, for the one-way ANOVA.
ANSWER:
H o : The test factor has no effect on the mean at the tested levels.
189. State the alternative hypothesis, in a general form, for the one-way ANOVA.
H a : The test factor does have an effect on the mean at the tested levels.
ANSWER:
191. What must happen in order to “reject H o ” if using the classical approach?
ANSWER:
The calculated value of F; namely F ∗ , must fall in the critical region; that is, the variance
between levels of the factor must be significantly larger than variance within the levels.
ANSWER:
193. What must happen in order to “fail to reject H o ” if using the p-value approach?
ANSWER:
194. What must happen in order to “fail to reject H o ” if using the classical approach?
ANSWER:
ANSWER:
The tested factor does not have a significant effect on the variable.
Three new drugs are being tested for their effect on the number of days of hospitalization
needed by the patient following surgery. There is a control group receiving a placebo and three
treatment groups with each receiving one of three new drugs, all developed to promote
recovery. The results of an analysis of variance used to analyze the data are shown here.
Source of Variation df SS MS F∗ P-value
Total 19 51.9
ANSWER:
df(Total) = n -1 = 19 ⇒ n = 20 patients
197. How do these results verify that there was one control and three test groups?
198. Using the SS values, verify the two mean square values.
ANSWER:
ANSWER:
ANSWER:
ANSWER:
H o : µ1 = µ2 = µ3 = µ4
H a : The means are not all equal (that is, at least one mean is different)
ANSWER:
A new operator was recently assigned to a crew of workers who perform a certain job. From the
records of the number of units of work completed by each worker each day last month, a
sample of size five was randomly selected for each of the three experienced workers and the
new worker as shown in the table below. There is a reason to believe that there is no difference
in the amount of work done by the three workers.
Units of Work (replicates) by Worker
New    A    B    C
  9   12   11   13
 11   13   14   12
 10   11   10   11
 12   13   13   12
  9   14   14   13
ANSWER:
204. Use computer and statistical software to develop the ANOVA table.
ANSWER:
ANSWER:
Since p-value = 0.0389 < α = 0.05, we reject H o . There is sufficient evidence to indicate
that the mean values for workers are not all equal. In other words, there is a significant
difference among the workers with regard to the mean amount of work produced.
206. Test the hypotheses in question 204 at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical value is 3.239. Since the value of the test statistic F ∗ = 3.533 falls in the
rejection region, we reject H o . We reach the same conclusion as stated in question
205.
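The ANOVA quantities for the worker data above can be reproduced without statistical software. The following is a minimal sketch, not the software output the question asks for; the function name `one_way_anova` and the variable names are illustrative assumptions. It computes the between-level and within-level sums of squares directly from their definitions and forms F∗ as their mean-square ratio.

```python
def one_way_anova(groups):
    """Return (df_factor, df_error, ss_factor, ss_error, f_star) for a list of samples."""
    n = sum(len(g) for g in groups)                 # total number of data
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]       # one mean per level
    # Variation between the levels of the factor
    ss_factor = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation within the levels (error)
    ss_error = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_factor = len(groups) - 1
    df_error = n - len(groups)
    f_star = (ss_factor / df_factor) / (ss_error / df_error)
    return df_factor, df_error, ss_factor, ss_error, f_star


# Units of work for the new worker and experienced workers A, B, C
workers = [
    [9, 11, 10, 12, 9],     # New
    [12, 13, 11, 13, 14],   # A
    [11, 14, 10, 13, 14],   # B
    [13, 12, 11, 12, 13],   # C
]

df_factor, df_error, ss_factor, ss_error, f_star = one_way_anova(workers)
print(df_factor, df_error, round(ss_factor, 3), round(ss_error, 3), round(f_star, 3))
```

The resulting F∗ agrees with the value 3.533 quoted in the answer, with df(Factor) = 3 and df(Error) = 16.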
A new all-purpose cleaner is being test-marketed by placing sales displays in three different
locations within various supermarkets. The number of bottles sold from each location within
each of the supermarkets tested is reported below.
I 42 37 46 40
Locations II 34 40 32 37
III 47 50 52 54
Based on past experience, there is no sufficient evidence to doubt that the location of the sales
display had no effect on the number of bottles sold.
ANSWER:
208. Develop the ANOVA table by using a computer and statistical software.
ANSWER:
209. Using the information obtained in question 208, state the decision and conclusion to the
hypothesis test at the 0.01 level of significance using the p-value approach.
ANSWER:
Since p-value = 0.0005 < α = 0.01, we reject H o . There is sufficient evidence to indicate
that the location of the sales display had an effect on sales.
210. State the decision and conclusion to the hypothesis test at the 0.01 level of significance
using the classical approach.
ANSWER:
The critical value is 8.022. Since the value of the test statistic F ∗ = 19.511 falls in the
rejection region, we reject H o . We reach the same conclusion as stated in question 209.
211. What is the practical interpretation of the p-value in this case? Explain.
Since the p-value is very small (0.0005), it tells us the sample data is very unlikely to
have occurred under the assumed conditions and a true null hypothesis. Therefore, the
decision was to reject H o .
An experiment was designed to compare the lengths of time that four different drugs provided
pain relief following brain surgery. The results (in hours) are shown in the following table. A
doctor claims that there is no significant difference in the length of pain relief for the four drugs.
Drug
A    B    C    D
10   14   12   14
10   16   12   12
 8   16   10   10
     16   10    8
     18
H o : The mean length of pain relief time is the same for all four drugs.
H a : The mean length of pain relief time is not the same for all four drugs.
213. Develop the ANOVA table by using a computer and statistical software.
ANSWER:
214. Is there enough evidence to reject the null hypothesis in question 213 at α = 0.05? Use
the p-value approach.
ANSWER:
Since p-value = 0.0005 < α = 0.05, we reject H o . There is sufficient evidence to indicate
that the mean length of pain relief time is not the same for all four drugs.
215. Is there enough evidence to reject the null hypothesis in question 213 at α = 0.05? Use
the classical approach.
ANSWER:
The critical value is 3.49. Since the value of the test statistic F ∗ = 12.5 falls in the
rejection region, we reject H o . We reach the same conclusion as stated in question
214.
ANSWER:
Since the p-value is very small (0.0005), it tells us the sample data is very unlikely to
have occurred under the assumed conditions and a true null hypothesis. Therefore, the
decision was to reject H o .
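When the levels have unequal numbers of replicates, as in the drug data above, the sums of squares are often computed from column totals. The sketch below uses the standard shortcut formulas SS(total) = ∑x² − (∑x)²/n and SS(factor) = ∑(C_i²/k_i) − (∑x)²/n, where C_i is the total of column i; the symbol C_i and the function name `anova_from_totals` are assumptions introduced here for illustration.

```python
def anova_from_totals(groups):
    """Build a one-way ANOVA table from column totals (unequal replicates allowed)."""
    n = sum(len(g) for g in groups)                       # n = sum of k_i
    grand_total = sum(sum(g) for g in groups)             # sum of all x
    sum_of_squares = sum(v * v for g in groups for v in g)
    correction = grand_total ** 2 / n                     # (sum x)^2 / n
    ss_total = sum_of_squares - correction
    ss_factor = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_factor
    df_factor = len(groups) - 1
    df_error = n - len(groups)
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "df": (df_factor, df_error, n - 1),
        "SS": (ss_factor, ss_error, ss_total),
        "MS": (ms_factor, ms_error),
        "F": ms_factor / ms_error,
    }


# Hours of pain relief for drugs A, B, C, D
drugs = [
    [10, 10, 8],            # A
    [14, 16, 16, 16, 18],   # B
    [12, 12, 10, 10],       # C
    [14, 12, 10, 8],        # D
]

table = anova_from_totals(drugs)
print(table["df"], [round(s, 3) for s in table["SS"]], round(table["F"], 3))
```

With these data the sketch gives df = (3, 12, 15) and an F∗ that matches the 12.5 used in the answers.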
Methods of Teaching
ANSWER:
H o : All three methods of instruction are equally effective, as measured by the mean test
scores.
H a : The three methods of instruction are not all equally effective, as measured by the mean
test scores.
ANSWER:
219. Using the information in the computer printout in question 218, state the decision and the
conclusion to the hypothesis test at α = 0.05 using the p-value approach.
ANSWER:
ANSWER:
The critical value is 3.403. Since the value of the test statistic F ∗ = 5.184 falls in the
rejection region, we reject H o . We reach the same conclusion as stated in question
219.
Chapter 13
Linear Correlation and Regression Analysis
Sections 13.1 and 13.2
True-False Questions
ANSWER: F
2. The variance of y about the line of best fit is the same as the variance of the error e
where e = y − ŷ .
ANSWER: T
4. Generally speaking, the higher the correlation between x and y, the better will be the
predictions which are made using the line of best fit provided the prediction is made for
an x-value within the range of observed x-values.
ANSWER: T
5. The linear correlation coefficient is used to measure the strength of the linear
relationship between two variables.
ANSWER: T
ANSWER: T
ANSWER: F
8. Correlation analysis attempts to find the equation of the line of best fit for two variables.
ANSWER: F
ANSWER: T
10. The linear correlation coefficient for the population is always a number between 0 and 1.
ANSWER: F
ANSWER: F
12. Analysis of linear dependency between two variables uses two measures: covariance
and the coefficient of linear correlation.
ANSWER: T
13. Like the variance and standard deviation, the covariance of a single set of bivariate data
is always positive.
ANSWER: F
14. Inferences about the linear correlation coefficient are about the pattern of behavior of the
two variables involved and the usefulness of one variable in predicting the other.
ANSWER: T
15. The covariance of a single set of data is positive if the graph is dominated by points to
the upper right and to the lower left of the centroid (x̄, ȳ).
ANSWER: T
16. A confidence interval may be used to estimate the value of ρ , the linear correlation
coefficient of the population. Usually this is accomplished by using the t-table with
degrees of freedom equal to n -1.
ANSWER: F
ANSWER: T
18. Failure to reject the null hypothesis H o : ρ = 0 is interpreted as meaning that a linear
relationship between the two variables in the population has been shown.
Multiple-Choice Questions
19. The values below are suggested coefficients of correlation, r. The one that indicates the
strongest negative relationship between the input variable x and the output variable y is:
A) -1.5.
B) -0.7.
C) 0.0.
D) 0.8.
ANSWER: B
20. The values below are suggested coefficients of correlation, r. The one that indicates the
strongest positive relationship between the input variable x and the output variable y is:
A) 1.2.
B) 0.7.
C) 0.0.
D) 0.8.
ANSWER: D
22. In publishing the results of some research work, the following values of the correlation
coefficient were listed. Which one would appear to be incorrect?
A) 1.05
B) 1.0
C) 0.95
D) -0.95
ANSWER: A
A) The linear correlation coefficient r is a quantity that measures the strength of a linear
relationship (dependency) between two variables.
B) Analysis of linear dependency between two variables uses two measures:
covariance and the coefficient of linear correlation.
C) The covariance of x and y is defined as the sum of the products of the distances of
all values of x and y from centroid (x̄, ȳ).
D) None of the above
ANSWER: C
C) r = SS ( xy ) / SS ( x) ⋅ SS ( y )
D) None of the above
ANSWER: B
A) It can be negative.
B) It can be positive.
C) It can be zero.
D) It is always zero since ∑(x − x̄) and ∑(y − ȳ) are always zero and the covariance is
defined as ∑(x − x̄)(y − ȳ) divided by (n – 1).
ANSWER: D
27. If the coefficient of linear correlation for a single set of bivariate data is 0.0698, while the
standard deviation of x is 4.099 and the standard deviation of y is 2.098, then the
covariance of x and y is
A) 0.205.
B) 0.300.
C) 0.286.
D) 0.146.
ANSWER: B
28. For a bivariate set of data, if SS(xy) =200, SS(x) = 350 and SS(y) = 125, then the
Pearson’s product moment is
A) 0.956.
B) 0.005.
C) 1.046.
D) None of the above.
ANSWER: A
29. If the coefficient of linear correlation and the covariance for a single set of bivariate data
are 0.582 and 0.854, respectively, and the standard deviation of x is 1.625, then the
standard deviation of y is
A) 0.681.
B) 0.526.
C) 1.107.
D) 0.903.
ANSWER: D
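Questions 28 and 29 both rest on two identities: r = SS(xy) / √[SS(x) ⋅ SS(y)] and covar(x, y) = r ⋅ s_x ⋅ s_y. As a rough check of the keyed answers, the sketch below evaluates both; the function names are illustrative assumptions.

```python
import math


def r_from_ss(ss_xy, ss_x, ss_y):
    """Pearson's product moment r from the three sums of squares."""
    return ss_xy / math.sqrt(ss_x * ss_y)


def s_y_from_covariance(covariance, r, s_x):
    """Solve covar(x, y) = r * s_x * s_y for s_y."""
    return covariance / (r * s_x)


# Question 28: SS(xy) = 200, SS(x) = 350, SS(y) = 125
r_28 = r_from_ss(200, 350, 125)

# Question 29: r = 0.582, covariance = 0.854, s_x = 1.625
s_y_29 = s_y_from_covariance(0.854, 0.582, 1.625)

print(round(r_28, 3), round(s_y_29, 3))
```

Both values round to the keyed choices, 0.956 for question 28 and 0.903 for question 29.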
A) 0.234.
B) 0.300.
C) 0.094.
D) 0.265.
ANSWER: C
A) Inferences about the linear correlation coefficient are about the pattern of behavior of
the two variables involved and the usefulness of one variable in predicting the other.
B) Significance of the linear correlation coefficient means that you have established a
cause-and-effect relationship.
C) The linear correlation coefficient of the population is denoted by the Greek letter ρ .
D) None of the above.
ANSWER: B
33. Which of the following statements is false regarding the assumptions for inferences
about the linear correlation coefficient?
A) The test statistic used to test the null hypothesis H o : ρ = 0 is the calculated value of r
from the sample data.
B) When we perform a hypotheses test about ρ , the linear correlation coefficient for the
population, the number of degrees of freedom for the r statistic is 2 less than the
sample size; that is, df = n – 2.
C) Rejection of the null hypothesis H o : ρ = 0 means that there is no evidence of a linear
relationship between the two variables in the population.
D) None of the above
ANSWER: C
Short-Answer Questions
35. Suppose you are given a particular set of data and found that r = 2.5. How would you
interpret this result?
ANSWER:
You would have made a computation error since it is always true that –1 ≤ r ≤ 1.
36. If a scatter diagram for a bivariate data set results in a horizontal or vertical line, what
value does r take on?
ANSWER:
Positive
ANSWER:
Negative
ANSWER:
Negative
ANSWER:
Positive
41. Thirty-four students in an Algebra course were given a math competency test on the first
day of class. Thirty-two students completed the course and their scores on a
comprehensive final exam were recorded. The correlation coefficient between math
competency scores and final exam scores was computed. Give the critical region for
testing H o : ρ = 0(≤) vs. H a : ρ > 0 at α = 0.05.
ANSWER:
ANSWER:
Spread of data is a strong factor in size of covariance. Covariance does not have a
standardized unit of measure.
43. Indicate whether the symbol ρ is a parameter or statistic. Justify your answer.
ANSWER:
44. Indicate whether the symbol r is a parameter or statistic. Justify your answer.
ANSWER:
ANSWER:
46. What is the best analysis to describe a linear relationship between two variables?
ANSWER:
Linear correlation
ANSWER:
A “moment” is the distance from the mean, and the product of both the horizontal
moment and the vertical moment is summed in calculating the correlation coefficient.
48. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “The linear correlation coefficient is positive”.
ANSWER:
H o : ρ = 0 ( ≤ ) vs. H a : ρ > 0
49. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is no linear correlation”.
ANSWER:
H o : ρ = 0 vs. H a : ρ ≠ 0
50. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is evidence of negative correlation”.
ANSWER:
H o : ρ = 0 ( ≥ ) vs. H a : ρ < 0
51. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is positive linear relationship”.
ANSWER:
52. Does the value of the sample linear correlation coefficient, r, indicate that there is a
linear dependency between the two variables in the population from which the sample
was drawn? Briefly explain how to answer this question.
ANSWER:
To answer this question we can perform a hypothesis test. The null hypothesis is: The
two variables are linearly unrelated ( ρ = 0), where ρ is the linear correlation coefficient
for the population. The alternative hypothesis may be either one-tailed or two-tailed.
Most frequently it is two-tailed, ρ ≠ 0. However, when we suspect that there is only a
positive or only a negative correlation, we should use a one-tailed test. The alternative
hypothesis of a one-tailed test is ρ > 0 or ρ < 0.
53. Calculate the correlation coefficient for the following set of data. What property do the
points exhibit when plotted on a scatter diagram?
x 1 3 0 2 4
ANSWER:
r = –1. All the points fall on a straight line having a negative slope.
The scores (x) on a computer science aptitude test range from 0 to 25, and the course grade (y)
with possible values: 0.0, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, were recorded for 20 students in an
introductory computer science course as shown below.
x 18 10 15 20 18 13 20 16 12 22
y 2.5 1.5 3.0 3.5 2.5 2.0 2.5 3.0 2.0 4.0
x 5 8 16 20 11 14 20 16 15 24
y 0.0 1.0 1.5 3.0 2.0 2.5 4.0 4.0 2.5 3.5
ANSWER:
r = 0.833
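The value of r for the aptitude data can be checked with the sum-of-squares form of Pearson's coefficient, r = SS(xy) / √[SS(x) ⋅ SS(y)]. This is a minimal sketch; the function name `pearson_r` and the list names are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient via SS(x), SS(y), and SS(xy)."""
    n = len(xs)
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return ss_xy / math.sqrt(ss_x * ss_y)


# Aptitude test scores (x) and course grades (y) for the 20 students
aptitude = [18, 10, 15, 20, 18, 13, 20, 16, 12, 22,
            5, 8, 16, 20, 11, 14, 20, 16, 15, 24]
grade = [2.5, 1.5, 3.0, 3.5, 2.5, 2.0, 2.5, 3.0, 2.0, 4.0,
         0.0, 1.0, 1.5, 3.0, 2.0, 2.5, 4.0, 4.0, 2.5, 3.5]

r = pearson_r(aptitude, grade)
print(round(r, 3))
```

The result agrees with the stated r = 0.833 to three decimal places.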
55. Give the p-value if this data is used to test H o : ρ = 0(≤) vs. H a : ρ > 0 at α = 0.1. What
is your decision?
ANSWER:
56. The following data represent the number of credit hours, x, and the cost for textbooks, y,
for five students. Calculate SS(x), SS(y), SS(xy), and the coefficient of linear correlation.
x 15 12 18 9 12
y 110 86 105 65 94
ANSWER:
57. Considering the following set of bivariate data, find the value of k that would result in a
coefficient of linear correlation equal to exactly +1.
x 2 5 7 8 11
ANSWER:
k = 24.8
58. A set of bivariate data has a Pearson’s product moment equal to 0.55, and the standard
deviation of x equals 15.5, and the standard deviation of y equals 14.0. Find the
covariance of x and y.
ANSWER:
59. Find the covariance of x and y and the centroid of the data shown in the table below
ANSWER:
60. Two different scales were used to measure the weights of 20 different objects. Find a
95% confidence interval for ρ if r = 0.4.
(–0.05 to 0.70)
Ground 95 69 110 90 95 77 84 92
(x)
61. Use the given data to find a 95% confidence interval for ρ , the population correlation
between heights obtained on the ground and heights determined from aerial
photographs.
ANSWER:
ANSWER:
63. A sample of size 52 was used to test H o : ρ = 0 vs. H a : ρ ≠ 0 . Give a bound on the p-
value if r * = 0.31.
ANSWER:
64. Compute the coefficient of linear correlation for the following set of bivariate data and
find a 95% confidence interval for ρ .
ANSWER:
ANSWER:
r = 0.882.
ANSWER:
r * = 0.912
ANSWER:
ANSWER:
69. A study was conducted to determine the relationship between actual areas of planted
corn and estimates of those areas obtained from earth observation satellites as shown in
the table below.
Give the p-value for testing H o : ρ = 0 vs. H a : ρ ≠ 0 . What is your conclusion at α = 0.05?
ANSWER:
r = 0.999, p-value < 0.01. We reject the null hypothesis at α = 0.05, and conclude that
ρ ≠ 0.
Point A B C D E F G H I J
x 2 2 4 4 6 6 8 8 10 10
y 2 3 3 4 4 5 5 6 6 7
[Scatter diagram: y (0 to 7) versus x (0 to 10)]
ANSWER:
∑xy = 310, ∑y² = 225.
ANSWER:
ANSWER:
ANSWER:
r = SS(xy) / √[SS(x) ⋅ SS(y)] = 40 / √[(80)(22.5)] = 0.943
76. Use the confidence belts chart for the correlation coefficient to determine a 95% confidence
interval for the true population linear correlation coefficient based on the following sample
statistics: n = 8, r = 0.20.
– 0.55 to 0.76
77. Use the confidence belts chart for the correlation coefficient to determine a 95% confidence
interval for the true population linear correlation coefficient based on the following sample
statistics: n = 100, r = – 0.40.
ANSWER:
– 0.55 to – 0.22
78. Use the confidence belts chart for the correlation coefficient to determine a 95% confidence
interval for the true population linear correlation coefficient based on the following sample
statistics: n =25, r = +0.65.
ANSWER:
0.34 to 0.82
79. Use the confidence belts chart for the correlation coefficient to determine a 95% confidence
interval for the true population linear correlation coefficient based on the following sample
statistics: n = 15, r = -0.23.
ANSWER:
– 0.65 to 0.31
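The confidence belts chart has to be read by eye. As a rough cross-check, the Fisher z transformation gives an approximate interval for ρ. This is a different technique from the chart the questions call for, so its limits are close to the chart readings but not identical, especially for small n. The function name `fisher_z_interval` is an assumption for illustration; 1.96 is the standard normal value for 95% confidence.

```python
import math


def fisher_z_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for rho via Fisher's z transformation."""
    z = math.atanh(r)                    # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)          # approximate standard error of z
    lower = math.tanh(z - z_crit * se)   # transform the limits back to the r scale
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# Question 77: n = 100, r = -0.40
low, high = fisher_z_interval(-0.40, 100)
print(round(low, 2), round(high, 2))
```

For n = 100 and r = −0.40 the approximation rounds to the chart reading of −0.55 to −0.22. For the smaller samples in questions 76, 78, and 79 it differs from the chart by a few hundredths.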
First Score 78 90 63 78 99 83 71 87 50 75
Second Score 74 92 54 77 96 80 74 82 55 72
ANSWER:
ANSWER:
From the chart of confidence belts for the correlation coefficient we determine that the
95% confidence interval for ρ is 0.78 to 0.98.
82. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statements: “The linear correlation coefficient is positive.”
ANSWER:
H o : ρ = 0 vs. H a : ρ > 0
83. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statements: “There is no linear correlation.”
ANSWER:
84. State the null hypothesis, H o , and the alternative hypothesis H a , that would be used to
test the following statements: “There is evidence of negative correlation.”
ANSWER:
H o : ρ = 0 vs. H a : ρ < 0
85. State the null hypothesis, H o , and the alternative hypothesis H a , that would be used to
test the following statements: “There is positive linear relationship.”
ANSWER:
H o : ρ = 0 vs. H a : ρ > 0
86. If a sample of size 20 has a linear correlation coefficient of –0.547, is there sufficient
evidence to conclude that the linear correlation coefficient of the population is negative?
Use α = 0.01 , and apply both the p-value approach and the classical approach.
ANSWER:
H o : ρ = 0 vs. H a : ρ < 0
Assume normality for y at each x. Since n = 20, then df = n –2 = 18. r = - 0.547, and the
test statistic r ∗ = -0.547. α = 0.01.
Using the table of “critical values of r when ρ = 0” we get 0.005 < P < 0.01. Since P < α ;
reject H o .
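The table of critical values of r is equivalent to a Student's t test with df = n − 2, through the transformation t = r√(n − 2) / √(1 − r²). The sketch below applies that relationship to question 86. It is an alternative route to the same decision, not the table lookup used in the answer. The constant 2.552 is the one-tailed 0.01 critical value of t for df = 18 from a standard t table, and the function names are illustrative assumptions.

```python
import math


def t_from_r(r, n):
    """Convert a sample correlation to the equivalent t statistic, df = n - 2."""
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)


def r_critical_from_t(t_crit, n):
    """Convert a critical t value into the equivalent critical value of r."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


T_CRIT_DF18_ONE_TAIL_01 = 2.552   # from a standard t table (assumed input)

t_star = t_from_r(-0.547, 20)
r_crit = r_critical_from_t(T_CRIT_DF18_ONE_TAIL_01, 20)
reject = t_star <= -T_CRIT_DF18_ONE_TAIL_01   # left-tailed test, H a : rho < 0

print(round(t_star, 3), round(r_crit, 3), reject)
```

The test statistic falls below −2.552, so the sketch also rejects H o at α = 0.01. The derived critical r of about 0.515 matches the tabled value for df = 18 to within rounding.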
ANSWER:
P = p-value = P(r > 0.295). Using the table of “critical values of r when ρ = 0” we have
0.01 < P < 0.025. Since P < α ; reject H o . There is sufficient evidence to indicate at the
0.05 level of significance that the linear correlation coefficient of the population is
positive.
88. Is a value of r = + 0.295 significant in trying to show that ρ is greater than zero for a
sample of 52 data at the 0.05 level of significance? Use the classical approach.
ANSWER:
The test statistic r ∗ = 0.295, and the critical region is: r ≥ 0.231. Since r ∗ falls in the
critical region, we reject H o . There is sufficient evidence to indicate at the 0.05 level of
significance that the correlation coefficient is positive.
Population 10.1 1.4 2.2 7.1 4.5 0.4 0.4 0.3 0.3 0.5
Crime Rate 12.2 9.7 9.4 8.6 8.4 7.5 7.3 7.2 7.1 7.1
90. Do these data provide evidence to reject the null hypothesis that ρ = 0 in favor of the
alternative ρ ≠ 0 at α = 0.05? Use the p-value approach.
ANSWER:
P = p-value = P(r < -0.798) + P(r > 0.798) = 2 P(r > 0.798). Using the table of “critical
values of r when ρ = 0” we get P < 0.01. Since P < α = 0.05; reject H o .
91. Do these data provide evidence to reject the null hypothesis that ρ = 0 in favor of
ρ ≠ 0 at α = 0.05? Use the classical approach.
ANSWER:
The critical regions are: r ≤ -0.632 and r ≥ 0.632. Since r ∗ falls in the critical region, we
reject H o . There is sufficient evidence to indicate at the 0.05 level of significance that
the correlation coefficient is different from zero.
92. Consider a set of paired bivariate data (x, y). Describe the relationship of the ordered
pairs that will cause ∑[(x − x̄) ⋅ (y − ȳ)] to be positive.
The set of data will be predominantly ordered pairs whose coordinates are either
both larger than x̄ and ȳ or both smaller than x̄ and ȳ; in either case the
product (x − x̄)(y − ȳ) is positive. Graphically, the points will be mostly
located in the upper right and the lower left of the four quarters of the graph formed by
the vertical line x = x̄ and the horizontal line y = ȳ.
93. Consider a set of paired bivariate data (x, y). Describe the relationship of the ordered
pairs that will cause ∑[(x − x̄) ⋅ (y − ȳ)] to be negative.
ANSWER:
The set of data will be predominantly ordered pairs whose coordinates are such that
either x is larger than x̄ and y is smaller than ȳ, or x is smaller than x̄ and y
is larger than ȳ; in either case the product (x − x̄)(y − ȳ) is negative. Graphically,
the points will be mostly located in the upper left and the lower right of the four quarters
of the graph formed by the vertical line x = x̄ and the horizontal line y = ȳ.
The following set of 25 scores was randomly selected from Dr. Maas’ inferential statistics class.
Let x be the pre-final average and y the final examination score. (The final examination had a
maximum of 100 points.)
Student 1 2 3 4 5 6 7 8 9 10 11 12 13
x 80 91 73 88 62 71 60 89 66 73 69 81 76
y 87 88 80 82 76 84 71 90 82 79 75 86 89
Student 14 15 16 17 18 19 20 21 22 23 24 25
x 78 83 76 91 76 99 99 64 86 63 95 97
y 85 89 85 94 78 95 98 72 94 81 90 98
ANSWER:
[Scatter diagram: final exam score (50 to 100) versus pre-final average (50 to 100)]
ANSWER:
ANSWER:
[Scatter diagram: final exam score (50 to 100) versus pre-final average (50 to 100)]
98. Test the significance of r at α = 0.10 using the p-value approach and the classical
approach.
ANSWER:
Assume normality for y at each x. Since n = 25, then df = n − 2 = 23. α = 0.10, r ∗ = 0.875.
P = 2P(r > 0.875). Using the table of “critical values of r when ρ = 0” we get P < 0.01.
Since P < α ; reject H o .
The critical regions are: r ≤ −0.34, and r ≥ 0.34 . Since the test statistic r ∗ is in the critical
region, we reject the null hypothesis. There is sufficient evidence to conclude that there
is a correlation between the pre-final average and the final examination score.
99. Find the 95% confidence interval for the true value of ρ .
ANSWER:
The 0.95 interval for ρ (from the confidence belts chart for the correlation coefficient) is
0.70 to 0.92.
Year
Federal Unemployment Insurance Payments 11.8 10.7 18.0 19.7 23.7 31.5 18.4 16.8
# Unemployed Persons 6.2 6.1 7.6 8.3 10.7 10.7 8.5 8.3
100. Assume that a simple linear regression model is appropriate for these data. Identify the
dependent and independent variables.
ANSWER:
101. Develop a scatter diagram for these data. What does the scatter diagram indicate about
the relationship between these two variables?
ANSWER:
[Scatter diagram: y (0 to 12) versus x (0 to 35)]
ANSWER:
ŷ = 3.664 + 0.2463x
ANSWER:
r = 0.9327
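The stated line of best fit and correlation coefficient can be reproduced from the eight data pairs. The sketch below assumes that the insurance payments are the input variable x and the number of unemployed persons is the output variable y, since that assignment reproduces the line ŷ = 3.664 + 0.2463x. The variable names are illustrative.

```python
import math

# Assumed roles: x = federal unemployment insurance payments, y = unemployed persons
payments = [11.8, 10.7, 18.0, 19.7, 23.7, 31.5, 18.4, 16.8]
unemployed = [6.2, 6.1, 7.6, 8.3, 10.7, 10.7, 8.5, 8.3]

n = len(payments)
ss_x = sum(x * x for x in payments) - sum(payments) ** 2 / n
ss_y = sum(y * y for y in unemployed) - sum(unemployed) ** 2 / n
ss_xy = (sum(x * y for x, y in zip(payments, unemployed))
         - sum(payments) * sum(unemployed) / n)

b1 = ss_xy / ss_x                                     # slope of the line of best fit
b0 = sum(unemployed) / n - b1 * sum(payments) / n     # y-intercept
r = ss_xy / math.sqrt(ss_x * ss_y)                    # linear correlation coefficient

print(round(b0, 3), round(b1, 4), round(r, 4))
```

Under that assumption the intercept, slope, and r agree with the answers to the stated precision.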
104. Test the null hypothesis that the true population coefficient of correlation equals zero
using the 0.05 significance level and the classical approach.
ANSWER:
The rejection regions are: r ≤ −0.707 and r ≥ 0.707 (df = n − 2 = 6). Since the test statistic r ∗ = 0.9327
falls in the rejection region; we reject the null hypothesis at the 0.05 level of significance.
There is sufficient evidence to indicate that a linear relationship exists between these
two variables.
ANSWER:
The set of data will be ordered pairs whose products (x − x̄)(y − ȳ) are distributed
among positive, negative, and zero values so that their sum is near
zero. Graphically, the points will be approximately evenly distributed among the four
quarters of the graph formed by the vertical line x = x̄ and the horizontal line y = ȳ.
Consider the following set of bivariate data: (25,15), (35,55), (65,35), (85,25), (115,65) and
(125,15).
ANSWER:
107. Calculate the standard deviation of the six x-values and the standard deviation of the six
y-values.
ANSWER:
s_x = √{[∑x² − (∑x)²/n]/(n − 1)} = √{[42150 − (450)²/6]/5} = 40.988
s_y = √{[∑y² − (∑y)²/n]/(n − 1)} = √{[9550 − (210)²/6]/5} = 20.976
ANSWER:
x 42 49 46 57 64 58
y 19 23 18 25 30 29
x y x² xy y²
42 19 1764 798 361
49 23 2401 1127 529
46 18 2116 828 324
57 25 3249 1425 625
64 30 4096 1920 900
58 29 3364 1682 841
Sum 316 144 16990 7780 3580
∑x = 316, ∑y = 144, ∑x² = 16990, ∑xy = 7780, and ∑y² = 3580
ANSWER:
SS(x) = ∑x² − (∑x)²/n = 16990 − (316)²/6 = 347.333
SS(y) = ∑y² − (∑y)²/n = 3580 − (144)²/6 = 124.0
ANSWER:
s_x = √{[∑x² − (∑x)²/n]/(n − 1)} = √{[16990 − (316)²/6]/5} = 8.335
ANSWER:
ANSWER:
x 2 3 3 4 5 6 7 8 8 9
y 8 8 9 6 7 4 5 2 3 3
[Scatter diagram: y (0 to 10) versus x (0 to 10)]
115. What does the scatter diagram tell you about the relationship between x and y?
ANSWER:
ANSWER:
x y x − x̄ y − ȳ (x − x̄)(y − ȳ)
Therefore the covariance is: covar(x, y) = ∑(x_i − x̄)(y_i − ȳ)/(n − 1), summed over i = 1 to n,
= (−50.5)/9 = −5.611.
x y xy x² y²
2 8 16 4 64
3 8 24 9 64
3 9 27 9 81
4 6 24 16 36
5 7 35 25 49
6 4 24 36 16
7 5 35 49 25
8 2 16 64 4
8 3 24 64 9
9 3 27 81 9
Sum 55 55 252 357 357
s_x = √{[∑x² − (∑x)²/n]/(n − 1)} = √{[357 − (55)²/10]/9} = 2.461
s_y = √{[∑y² − (∑y)²/n]/(n − 1)} = √{[357 − (55)²/10]/9} = 2.461
118. Use your answers to questions 116 and 117 to calculate the coefficient of linear
correlation, r.
ANSWER:
ANSWER:
SS(x) = ∑x² − (∑x)²/n = 357 − (55)²/10 = 54.5
SS(y) = ∑y² − (∑y)²/n = 357 − (55)²/10 = 54.5
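The covariance, the two standard deviations, and the coefficient of linear correlation for these ten data pairs are linked by r = covar(x, y) / (s_x ⋅ s_y). The sketch below works through that chain from the definitions; the variable names are illustrative assumptions.

```python
import math

xs = [2, 3, 3, 4, 5, 6, 7, 8, 8, 9]
ys = [8, 8, 9, 6, 7, 4, 5, 2, 3, 3]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Covariance: sum of products of deviations from the centroid, divided by n - 1
covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Sample standard deviations of x and y
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))

# Coefficient of linear correlation from the covariance
r = covariance / (s_x * s_y)

print(round(covariance, 3), round(s_x, 3), round(s_y, 3), round(r, 3))
```

The covariance and standard deviations agree with the values −5.611 and 2.461 worked out above. The resulting r is about −0.927, a strong negative linear relationship.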
The table of “Confidence Belts for the Correlation Coefficient 1 − α = 0.95 ” available in your
textbook is used to determine a 95% confidence interval for the true population linear
correlation coefficient based on the sample statistics n and r.
ANSWER:
-0.55 to 0.77
121. Find the 95% confidence interval for ρ if n = 100, and r = - 0.40.
ANSWER:
-0.55 to -0.22
122. Find the 95% confidence interval for ρ if n = 25, and r = +0.65.
ANSWER:
0.34 to 0.82
ANSWER:
-0.65 to 0.31
124. Find the 95% confidence interval for ρ if n = 50, and r = 0.60.
ANSWER:
0.40 to 0.73
The Test-Retest Method is one way of establishing the reliability of a test. The test is
administered and then, at a later date, the same test is re-administered to the same individuals.
The correlation coefficient is computed between the two sets of scores. The following test
scores were obtained in a Test-Retest situation.
1st Score 48 73 61 85 99 69 81 76 76 88
2nd Score 54 71 53 81 95 73 79 76 73 91
ANSWER:
ANSWER:
The 95% confidence interval for ρ is read from the “Confidence Belts for the Correlation
Coefficient 1 − α = 0.95 ” table available in your textbook. The values are 0.78 and 0.98.
127. State the null and alternative hypotheses for testing “The Test-Retest Method led to a
reliable test”.
ANSWER:
H o : ρ = 0 vs. H a : ρ ≠ 0
128. Test the hypotheses in question 127 at the 0.05 level of significance using the p-value
approach.
ANSWER:
P = p-value = P( r < -0.955) + P(r > 0.955) = 2 ⋅ P(r > 0.955) with df = n – 2 = 8.
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. This implies that P < 0.01. Since p-value < α = 0.05, we reject
H o . There is sufficient evidence of a linear relationship between the two sets of scores.
129. Test the hypotheses in question 127 at the 0.05 level of significance using the classical
approach.
ANSWER:
130. Can you use the 95% confidence interval for ρ in question 126 for testing the
hypotheses in question 127? Explain in detail.
ANSWER:
Since the hypothesized value ρ = 0 is not included in the 95% confidence interval (0.78,
0.98) for ρ , we reject H o . We reach the same conclusion as stated in question 128.
131. Place bounds on the p-value resulting from a sample with n = 15 and r = 0.525, if H a is
two-tailed.
ANSWER:
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 13, then 0.02 < P =p-value < 0.05.
132. Place bounds on the p-value resulting from a sample with n = 20 and r = 0.405, if H a is
one-tailed.
ANSWER:
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 18, then 0.05 < 2P = 2 p-value < 0.10 ⇒
0.025 < P < 0.05.
133. Determine the bounds on the p-value that would be used in testing H o : ρ = 0 vs.
H a : ρ ≠ 0 , using the p-value approach with n = 15, and r = 0.552.
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 13, then 0.02 < P = p-value < 0.05.
134. Determine the bounds on the p-value that would be used in testing H o : ρ = 0 vs.
H a : ρ > 0 using the p-value approach with n = 8, and r = 0.772.
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 6, then 0.02 < 2P = 2 p-value < 0.05. This
implies that 0.01 < P < 0.025.
135. Determine the bounds on the p-value that would be used in testing H o : ρ = 0 vs.
H a : ρ < 0 using the p-value approach with n = 22, and r = -0.396.
ANSWER:
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 20, then 0.05 < 2P = 2 p-value < 0.10 ⇒
0.025 < P < 0.05.
136. What are the critical values of r for α = 0.05 and n = 27 if H a is two-tailed?
ANSWER:
The critical value is found at the intersection of the df = 25 row and the two-tailed 0.05
column of the “Critical Values of r When ρ = 0” table available in your textbook. The
value is 0.381. Since this is a two-tailed test, we have two critical values: ± 0.381.
137. What are the critical values of r for α = 0.05 and n = 42 if H a is one-tailed?
ANSWER:
The critical value is found at the intersection of the df = 40 row and the two-tailed 0.10
column of the “Critical Values of r When ρ = 0” table available in your textbook. The
value is 0.257. Since this is a one-tailed test, the critical value is -0.257 if the critical region is in
the left tail, and 0.257 if it is in the right tail.
138. Determine the critical values that would be used in testing H o : ρ = 0 vs. H a : ρ ≠ 0 using
the classical approach with n = 21, α = 0.05.
The critical value is found at the intersection of the df = 19 row and the two-tailed 0.05
column of the “Critical Values of r When ρ = 0” table available in your textbook. The
value is 0.433. Since this is a two-tailed test, we have two critical values: ± 0.433.
139. Determine the critical values that would be used in testing H o : ρ = 0 vs. H a : ρ < 0 using
the classical approach with n = 16, α = 0.05.
ANSWER:
The critical value is found at the intersection of the df = 14 row and the two-tailed 0.10
column of the “Critical Values of r When ρ = 0” table available in your textbook. Since
this is a one-tailed test with the critical region in the left tail, the critical value is -0.426, as shown in the
graph.
140. Determine the critical values that would be used in testing H o : ρ = 0 vs. H a : ρ > 0 using
the classical approach with n = 52, α = 0.01.
ANSWER:
The critical value is found at the intersection of the df = 50 row and the two-tailed 0.02
column of the “Critical Values of r When ρ = 0” table available in your textbook. Since
this is a one-tailed test with the critical region in the right tail, the critical value is 0.322.
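The table look-ups in questions 136 to 140 can be cross-checked against Student's t. The sketch below assumes the standard identity r = t / √(t² + df), with df = n – 2, which links a critical value of r when ρ = 0 to the matching critical value of t. The t values are typed in from a standard t table (an assumption, not part of the answers above), and the function name critical_r is hypothetical.

```python
import math

def critical_r(t_crit, df):
    """Convert a critical t value (df = n - 2) into the matching critical r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# (question, df, area in one tail, t from a standard table) -- t values are assumptions
cases = [
    (136, 25, 0.025, 2.060),
    (137, 40, 0.05, 1.684),
    (138, 19, 0.025, 2.093),
    (139, 14, 0.05, 1.761),
    (140, 50, 0.01, 2.403),
]

for question, df, tail_area, t_crit in cases:
    # Print the magnitude only; the sign depends on which tail holds the critical region.
    print(question, df, tail_area, round(critical_r(t_crit, df), 3))
```

The magnitudes agree with the table values quoted in the answers (0.381, 0.257, 0.433, 0.426, and 0.322) to three decimal places.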
ANSWER:
H o : ρ = 0 vs. H a : ρ > 0
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 18, then 0.02 < 2P = 2 p-value < 0.05 ⇒ 0.01
< P < 0.025. Since p-value > α , we fail to reject H o . There is not sufficient evidence to
conclude that the linear correlation coefficient of the population is positive.
142. A sample of 20 pieces of bivariate data has a linear correlation coefficient of r = 0.489.
Does this provide sufficient evidence to reject the null hypothesis that ρ = 0 in favor of a
two-sided alternative? Use the classical approach at α = 0.10.
ANSWER:
H o : ρ = 0 vs. H a : ρ ≠ 0
The critical value is found at the intersection of the df = 18 row and the two-tailed 0.10
column of the “Critical Values of r When ρ = 0” table available in your textbook. The
value is 0.378. Since this is a two-tailed test, we have two critical values: ± 0.378, as
shown below.
Since r ∗ = 0.489 falls in the rejection region, we reject H o . There is sufficient evidence to
conclude that the linear correlation coefficient of the population is not zero.
ANSWER:
H o : ρ = 0 vs. H a : ρ < 0
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 12, then 0.05 < 2P = 2 p-value < 0.10 ⇒
0.025 < P < 0.05. Since p-value < α , we reject H o . There is sufficient evidence to
conclude that the linear correlation coefficient of the population is negative.
The population (in millions) and the violent crime rate (per 1000) were recorded for ten
metropolitan areas. The data are shown in the following table:
ANSWER:
[Scatter diagram: Crime Rate (vertical axis, 0 to 14) versus Population (horizontal axis, 0 to 12)]
ANSWER:
There is a positive linear relationship between population size and violent crime rate.
146. Use a computer to form the extensions table and calculate ∑x, ∑y, ∑xy, ∑x², and ∑y².
ANSWER:
ANSWER:
SS(y) = ∑y² − (∑y)²/n = 680.59 − (79.3)²/10 = 51.741
ANSWER:
ANSWER:
150. Do these data provide evidence to reject the null hypothesis that ρ = 0 in favor of ρ ≠ 0 at
α = 0.05 ? Use the p-value approach.
ANSWER:
H o : ρ = 0 vs. H a : ρ ≠ 0
We use the “Critical Values of r When ρ = 0” table available in your textbook to place
bounds on the p-value. Since df = n – 2 = 8, then P = p-value < 0.01. Since p-value < α
= 0.05, we reject H o . There is sufficient evidence to conclude that the linear correlation
coefficient of the population is not zero.
151. Do these data provide evidence to reject the null hypothesis that ρ = 0 in favor of ρ ≠ 0
at α = 0.05 ? Use the classical approach.
H o : ρ = 0 vs. H a : ρ ≠ 0
The critical value is found at the intersection of the df = 8 row and the two-tailed 0.05
column of the “Critical Values of r When ρ = 0” table available in your textbook. The
value is 0.632. Since this is a two-tailed test, we have two critical values: ± 0.632. Since
r ∗ = 0.958 > 0.632 falls in the rejection region, we reject H o . We reach the same
conclusion stated in question 150.
Section 13.3
True-False Questions
152. The random variable, e (also known as the residual), is positive when the predicted
value ŷ is greater than the observed value of y, and is negative when ŷ is less than y.
ANSWER: F
153. The slope β1 of the regression line of the population can be estimated by means of a
confidence interval that is determined by the formula b1 ± z (α / 2) ⋅ sb1 .
ANSWER: F
154. We test the hypothesis H o : β1 = 0 to determine whether the equation for the line of best
fit is of any real value in predicting the output variable y.
ANSWER: T
ANSWER: F
ANSWER: T
157. In regression analysis, the error term must be normally distributed if inferences are to be
made.
ANSWER: T
158. The value of the input variable x must be randomly selected to achieve valid regression
results.
ANSWER: F
159. The output variable y must be normally distributed about the regression line for each
value of the input variable x.
ANSWER: T
160. The sum of squares for error is the name given to the numerator portion of the formula used to
calculate the variance of y about the regression line.
ANSWER: T
161. The line of best fit results from an analysis of two (or more) related quantitative
variables.
ANSWER: T
162. The line of best fit, provided one exists, will best predict the value of the dependent, or
output, variable from a value of the independent, or input, variable.
ANSWER: T
Multiple-Choice Questions
164. What is the linear model used to explain the relationship between two variables in a
population?
A) y = b0 + b1 x + e
B) y = α + β x
C) y = β + bx
D) y = β 0 + β1 x + ε
ANSWER: D
166. If all of the values of an independent variable x are equal, then regressing a dependent
variable y on x will result in a correlation coefficient, r of:
167. The vertical spread of the data points about the regression line is measured by:
168. In a regression problem the following pairs of (x, y) are given: (3, 2), (3, 1), (3, 0), (3, -1)
and (3, -2). That indicates that the correlation coefficient is
A) 2.
B) 1.
C) 0.
D) -1.
ANSWER: C
169. A regression analysis between sales (y in $1000) and advertising (x in $100) resulted in
the following least squares line: ŷ = 75 +5x. This implies that if advertising is $800, then
the predicted amount of sales (in dollars) is:
A) $79,000.
B) $75,040.
C) $115,000.
D) $4,075.
ANSWER: C
170. A regression analysis between weight (y in pounds) and height (x in inches) resulted in
the following least squares line: ŷ = 130 + 5x. This implies that if the height is increased
by 1 inch, the weight, on average, is expected to:
A) When there is no relationship between the variables, a horizontal line of best fit will
result.
B) A horizontal line has a slope of zero, which implies that the value of the input variable
has no effect on the output variable.
C) The linear model used to explain the behavior of linear bivariate data in the
population is ŷ = β 0 + β1 x + ε , where β 0 is the y-intercept, β1 is the slope, and ε
(lowercase Greek letter “epsilon”) is the random experimental error in the observed
value of y at a given value of x.
D) None of the above.
ANSWER: D
173. Which of the following formulas represent the sum of squares for error (SSE)?
A) ∑(y − ŷ)²
B) ∑(y − b0 − b1x)²
C) ∑y² − b0(∑y) − b1(∑xy)
A) The sum of the errors (residuals) for all values of y for a given value of x is exactly
zero.
B) The variance of the error e (also known as the residual) is estimated by the formula
se² = ∑(y − ŷ)²/(n − 1), where n – 2 is the number of degrees of freedom.
C) The variance of y about the line of best fit is the same as the variance of the error e.
Recall that e = y – ŷ .
D) None of the above.
ANSWER: B
Short-Answer Questions
ANSWER:
ANSWER:
Parameter
ANSWER:
Statistic
178. What are the primary questions we answer in linear regression analysis?
ANSWER:
179. If you know the value of r is very close to zero, what value would you anticipate for b1 ?
Explain.
ANSWER:
The value of b1 would be close to zero also. The formulas used to calculate r and b1 have
the same numerator; namely, SS(xy).
180. Describe why the method used to find the line of best fit is referred to as “the method of
least squares”.
ANSWER:
181. Comment on the statement “The two coefficients for the line of best fit have the same
sign.” as sometimes true, always true, or never true. Explain your response if your
answer is “sometimes true” or “never true”
ANSWER:
Sometimes true. The two coefficients (slope and y-intercept) measure two completely
different concepts. Their signs are unrelated.
The following summary data are given: n = 5, ∑x = 13, ∑y = 246, ∑x² = 51,
∑y² = 12,946, and ∑xy = 760.
ANSWER:
ŷ = 31 + 7x
ANSWER:
Hence, se = 0.
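The two answers above (ŷ = 31 + 7x and se = 0) follow directly from the summary sums. A minimal sketch of that calculation, assuming the usual shortcut formulas SS(x) = ∑x² − (∑x)²/n, SS(xy) = ∑xy − (∑x)(∑y)/n, and SSE = ∑y² − b0∑y − b1∑xy; the function name fit_from_sums is hypothetical, and the max(..., 0.0) guard is added only to absorb floating-point round-off when SSE is essentially zero.

```python
import math

def fit_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return (b0, b1, se) for the line of best fit, using summary sums only."""
    ss_x = sum_x2 - sum_x ** 2 / n
    ss_xy = sum_xy - sum_x * sum_y / n
    b1 = ss_xy / ss_x                       # slope
    b0 = (sum_y - b1 * sum_x) / n           # y-intercept
    sse = sum_y2 - b0 * sum_y - b1 * sum_xy
    se = math.sqrt(max(sse, 0.0) / (n - 2))  # guard against tiny negative round-off
    return b0, b1, se

# Summary data from the n = 5 problem above
b0, b1, se = fit_from_sums(5, 13, 246, 51, 12946, 760)
print(round(b0, 3), round(b1, 3), round(se, 3))
```

A standard error of zero means every data point lies exactly on the line of best fit.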
ANSWER:
The following summary data are given: n = 10, ∑x = 39, ∑y = 35.1, ∑x² = 193,
∑y² = 130.05, and ∑xy = 152.7.
ANSWER:
ŷ = 2 + 0.387x
186. Find se .
ANSWER:
Hence, se = 0.307.
187. Find se .
ANSWER:
188. Based on the value of se, what do you know about this bivariate data?
ANSWER:
The following data show the number of hours (x) studied for a final exam, and the score (y)
received on the exam for a random sample of 15 students.
x 3 4 4 5 5 6 6 7 7 7 8 8 8 9 9
y 53 59 74 58 77 78 86 68 90 83 79 97 100 89 96
ANSWER:
[Scatter diagram: Test score (vertical axis, 0 to 100) versus Hours of study (horizontal axis, 2 to 10)]
ANSWER:
Summary of data:
If x = 3, then ŷ = 57.662
If x = 4, then ŷ = 63.977
If x = 5, then ŷ = 70.292
If x = 6, then ŷ = 76.607
If x = 7, then ŷ = 82.922
If x = 8, then ŷ = 89.237
192. Find the five values of e that are associated with the points where x = 4 and x = 7.
ANSWER:
x 4 4 7 7 7
y 59 74 68 90 83
193. Find the variance se² of all the points about the line of best fit.
ANSWER:
se² = [∑y² − b0∑y − b1∑xy] / (n − 2) = [96939 – (38.717)(1187) – (6.315)(7910)] / 13
= 79.252
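The variance in question 193 can also be obtained from the residuals themselves, which ties it to question 192. The sketch below is an illustration, not the textbook's computer output. It assumes the first y-value is 53, the reading consistent with ∑y = 1187 used in the answer above. Because it carries full precision for b0 and b1, it gives about 79.23 rather than the 79.252 obtained with rounded coefficients.

```python
hours = [3, 4, 4, 5, 5, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9]
score = [53, 59, 74, 58, 77, 78, 86, 68, 90, 83, 79, 97, 100, 89, 96]
n = len(hours)

sum_x, sum_y = sum(hours), sum(score)
sum_xy = sum(x * y for x, y in zip(hours, score))
sum_y2 = sum(y * y for y in score)

ss_x = sum(x * x for x in hours) - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n
b1 = ss_xy / ss_x                 # slope
b0 = (sum_y - b1 * sum_x) / n     # y-intercept

# Residual e = y - y_hat for every data point
residuals = [y - (b0 + b1 * x) for x, y in zip(hours, score)]
se2 = sum(e * e for e in residuals) / (n - 2)

# The five residuals asked for in question 192 (x = 4 and x = 7)
e_192 = [(x, y, round(e, 3)) for x, y, e in zip(hours, score, residuals) if x in (4, 7)]
print(e_192)
print(round(b0, 3), round(b1, 3), round(se2, 3))
```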
194. Using the following bivariate data, calculate the standard error of estimate.
x 2 3 3 5 7 7 8
y 5 6 9 4 2 2 0
ANSWER:
1.66
195. Find the equation of the line of best fit for the data shown below. Then, find the variance
error by evaluating ∑(y − ŷ)²/(n − 2).
x 0 1 3 4
y 4 4 10 12
ANSWER:
The equation of the line of best fit is ŷ = 3.1 + 2.2x, and the variance error is se² = 1.3.
The average number of client contacts per month, x, and the sales volume, y (in $1000), were
recorded for each of 10 salespeople.
x 22 16 50 48 57 14 25 52 18 52
y 35 30 100 85 135 20 35 95 35 115
ANSWER:
[Scatter diagram: Sales Volume (vertical axis, 0 to 160) versus Number of Client Contacts (horizontal axis, 0 to 60)]
197. Does the scatter diagram suggest a linear relationship between x and y?
ANSWER:
198. Calculate ∑x, ∑y, ∑x², ∑y², and ∑xy.
ANSWER:
∑x = 354, ∑y = 685, ∑x² = 15346, ∑y² = 62675, and ∑xy = 30730
ANSWER:
SS(x) = ∑x² − (∑x)²/n = 15346 − (354)²/10 = 2814.4
200. Calculate the slope and y-intercept for the line of best fit.
ANSWER:
ANSWER:
ŷ = -13.0191 + 2.3028x
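The sums, SS(x), and the line of best fit for the salespeople data can be reproduced by building the extensions table directly. This is a minimal sketch in plain Python, not the output of any particular statistics package; the variable names are illustrative.

```python
contacts = [22, 16, 50, 48, 57, 14, 25, 52, 18, 52]
sales = [35, 30, 100, 85, 135, 20, 35, 95, 35, 115]
n = len(contacts)

# Extensions table: one row (x, y, x^2, xy, y^2) per salesperson
table = [(x, y, x * x, x * y, y * y) for x, y in zip(contacts, sales)]
sum_x, sum_y, sum_x2, sum_xy, sum_y2 = (sum(column) for column in zip(*table))

ss_x = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n
b1 = ss_xy / ss_x                 # slope
b0 = (sum_y - b1 * sum_x) / n     # y-intercept

# Point prediction for a salesperson with 50 client contacts (in $1000)
y_hat_50 = b0 + b1 * 50
print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(round(ss_x, 1), round(b0, 4), round(b1, 4), round(y_hat_50, 2))
```

The value x = 50 lies inside the observed range of client contacts (14 to 57), so the prediction is an interpolation.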
202. Predict the sales volume for a salesperson who contacted 50 clients.
ANSWER:
ANSWER:
SSE = ∑y² − (b0)(∑y) − (b1)(∑xy)
ANSWER:
The variance of y about the line of best fit is the same as the variance of the error e.
205. Draw a scatter diagram of the data: carat weight (x) and price (y).
ANSWER:
[Scatter diagram: diamond price (y, 2000 to 3200) versus carat weight (x, 0.5 to 0.7)]
ANSWER:
There is a linear pattern to the data; however, the data fall into two groups forming two
parallel linear patterns, one forming the top and the other forming the bottom of the total
pattern.
207. Diamonds smaller than 0.50 carats and diamonds larger than 0.66 carats may not fit the
linear pattern demonstrated by this data. Explain.
ANSWER:
Since we have data only in this weight range, we cannot predict with confidence outside
this range. Prices for weights below 0.50 carats and above 0.66 carats decrease
and increase, respectively, exponentially.
208. Use a computer to find the equation for the line of best fit.
209. According to the results obtained in question 208, what would be a typical price for a
0.50 carat loose diamond of this quality?
ANSWER:
210. On the average, by how much does the price increase for each extra 0.01 carat in
weight? Within what interval of x-values would you expect this to be true?
ANSWER:
The price, on the average, increases by $41.98 for each extra 0.01 carat in weight. We
would expect this to be true for x-values within the interval 0.50 to 0.66 carats.
211. Use a computer to find the variance of y about the regression line.
ANSWER:
212. Graph and display the line of best fit on the scatter diagram. What characteristics in the
scatter diagram support the large value obtained in question 211?
ANSWER:
[Scatter diagram with the line of best fit: diamond price (y, 2000 to 2800) versus carat weight (x, 0.5 to 0.7)]
True-False Questions
213. There are n – 1 degrees of freedom involved with the inferences about the regression
line.
ANSWER: F
ANSWER: T
215. The confidence interval for µ y|x0 and the prediction interval for y x0 are constructed in a
similar fashion.
ANSWER: T
216. The symbol µ y| x0 refers to the mean of the population y-values at a given value of x,
while y x0 refers to the individual y-value selected at random that will occur at a given
value of x.
ANSWER: T
217. The standard error of regression (slope) is σb1 and is estimated by sb1 ; the estimate of the
ANSWER: T
218. The best point estimate, or prediction for both µ y|x0 and y x0 , is the actual value of y.
219. The prediction interval for an individual value of y is wider than the confidence interval
for the mean value of y; both calculated at the same value x0 .
ANSWER: T
220. The confidence interval for an individual value of y is wider than the prediction interval
for the mean value of y; both calculated at the same value x0 .
ANSWER: F
Multiple-Choice Questions
221. In a simple linear regression problem, which of the following table values would be
appropriate for a 95% confidence interval for the mean of y for a given value of x if the
sample size is 10?
A) 1.86
B) 1.81
C) 2.31
D) 2.36
ANSWER: C
222. In a simple linear regression problem including eight observations, which of the following
table values would be appropriate for a 90% prediction interval of the value of a single
randomly selected y?
A) 1.40
B) 1.86
C) 1.44
D) 1.94
ANSWER: D
A) The slope β1 of the regression line of the population can be estimated by means of a
confidence interval. The confidence interval is determined by b ± z (α / 2 ) .
B) The null hypothesis H o : β1 = 0 will be tested using the Student’s t-distribution with (n
– 2) degrees of freedom.
C) The test statistic t* found by using the formula t* = (b1 − β1 ) / sb1 is used for testing
H o : β1 = 0
D) None of the above
ANSWER: A
A) The best point estimate, or prediction for both µ y|x0 and y x0 , is ŷ . This is the y value
obtained when an x value is substituted into the equation of the line of best fit.
B) The sampling distribution of ŷ is the Student’s t-distribution with df = n – 2.
C) The prediction interval for an individual value of y is wider than the confidence
interval for the mean value of y; both calculated at the same value x0 .
D) None of the above
ANSWER: B
227. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the statement: “There is evidence that the slope of the line of best fit is negative”.
ANSWER:
228. Determine the p-value for testing H a : β1 < 0 , with n = 50, b1 = -1.20, sb1 = 0.80.
ANSWER:
229. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the statement: “The slope for the line of best fit is greater than 1.0”.
ANSWER:
230. Determine the p-value for testing H a : β1 > 0 , with n = 20, t ∗ = 2.8.
ANSWER:
231. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the statement: “There is no significant relationship between the x and y variables”.
H o : β1 = 0 vs. H a : β1 ≠ 0
232. Determine the critical value(s) and rejection region(s) that would be used with the
classical approach in testing H o : β1 = 0 vs. H a : β1 > 0 , with n = 30 and α = 0.025.
ANSWER:
233. Determine the p-value for testing H a : β1 ≠ 0 , with df = 12, b1 = 0.20, and sb1 = 0.125.
ANSWER:
234. Determine the critical value(s) and rejection region(s) that would be used with the
classical approach in testing H o : β1 = 0 vs. H a : β1 ≠ 0 , given that n = 18 and α = 0.10.
ANSWER:
x 18 10 15 20 18 13 20 16 12 22
y 2.5 1.5 3.0 3.5 2.5 2.0 2.5 3.0 2.0 4.0
x 5 8 16 20 11 14 20 16 15 24
y 0.0 1.0 1.5 3.0 2.0 2.5 4.0 4.0 2.5 3.5
235. Test the null hypothesis H o : β1 = 0 (≤) vs. H a : β1 > 0 by giving the critical region for α
= 0.05, the value of the test statistic t*, and the conclusion.
ANSWER:
236. Find the equation of the line of best fit, and construct a 95% confidence interval for the
mean course grade for students who score 15 on the computer science aptitude test.
ANSWER:
(2.13 to 2.69) is the 95% confidence interval for the mean course grade for students who
score 15 on the computer science aptitude test.
237. Construct a 95% prediction interval for the course grade for a student who scores 15 on
the computer science aptitude test.
ANSWER:
(1.13 to 3.69)
ANSWER:
0.080
239. Determine the p-value for testing H a : β1 > 0 , given that n = 20 and t * = 2.0.
ANSWER:
0.030
240. Determine the p-value for testing H a : β1 ≠ 0 , given that n = 10 and t * = 2.4.
ANSWER:
0.044
241. Find the equation of the line of best fit for the data below, and estimate the value of y
when x is 6.
x 2 3 3 5 7 7 8
y 5 6 9 4 2 2 0
ANSWER:
ANSWER:
243. Test H o : β1 = 0 (≤) vs. H a : β1 > 0 at α = 0.05 using the classical approach.
ANSWER:
Since t* = 1.499 and the critical region is t > 1.86, we fail to reject H o at α = 0.05. There
is not sufficient evidence to indicate that β1 > 0.
ANSWER:
(0.26 to 0.76)
The following data were collected on eight insulin dependent diabetics. The variable x is the
average of thirty fasting blood sugar readings taken over the past month, and y is the
hemoglobin A1C reading obtained at the end of the month in which the blood sugar
determinations were made. The data are shown in the table below:
245. Test the hypothesis H o : β1 = 0 vs. H a : β1 > 0 at the 0.05 level of significance. Report
t * , the p-value for the test, and your conclusion.
ANSWER:
246. Find a 95% confidence interval for the mean of y when x = 140.
ANSWER:
(6.6 to 7.3)
x 1 5 6 6 7 9
ANSWER:
(0.52 to 1.63)
Waist measurements, x, and weights, y, were obtained for eighteen males under 30 years of
age. The results were as follows:
x 33 33 30 34 34 40 35 35 32 38 34 32
y 160 187 156 179 187 230 197 196 173 201 174 163
x 35 32 32 34 36 30
ANSWER:
(4.0 to 11.4)
249. Set a 95% confidence interval on the mean weight for all those males with 34-inch waist
measurements.
(175.84 to 189.06)
250. Set a 95% confidence interval on the weight of a given adult male with a 34-inch waist.
ANSWER:
(153.69 to 211.22)
Varying amounts of fertilizer were used on ten different plots and the yield of corn in bushels per
plot was measured for each plot. Let x represent the amount of fertilizer and y represent the
yield of corn. A summary of the results is as follows: ŷ = 6.63 + 0.51x and se = 0.3. The
x-values ranged from 2.0 to 4.5, with x̄ = 3.2 and SS(x) = 7.6.
251. Construct a 95% confidence interval for the mean yield for all plots that have 3.0 units of
fertilizer added.
ANSWER:
(7.94 to 8.38)
252. Construct a 95% confidence interval for the yield of an individual plot to which 3.0
units of fertilizer are added.
ANSWER:
(7.43 to 8.89)
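Both intervals for the fertilizer problem come from the same summary values and differ only by the extra "1 +" under the square root. A sketch under stated assumptions: the printed "x = 3.2" is read as the sample mean x̄ = 3.2, and t(8, 0.025) = 2.31 is taken from a t table. The helper name interval_half_widths is hypothetical.

```python
import math

def interval_half_widths(t_crit, se, n, x0, x_bar, ss_x):
    """Half-widths E for the mean of y at x0 and for an individual y at x0."""
    core = 1 / n + (x0 - x_bar) ** 2 / ss_x
    e_mean = t_crit * se * math.sqrt(core)        # confidence interval for the mean
    e_indiv = t_crit * se * math.sqrt(1 + core)   # prediction interval for one y
    return e_mean, e_indiv

x0 = 3.0
y_hat = 6.63 + 0.51 * x0   # point estimate from the line of best fit
e_mean, e_indiv = interval_half_widths(t_crit=2.31, se=0.3, n=10,
                                       x0=x0, x_bar=3.2, ss_x=7.6)

ci = (y_hat - e_mean, y_hat + e_mean)     # question 251
pi = (y_hat - e_indiv, y_hat + e_indiv)   # question 252
print([round(v, 2) for v in ci], [round(v, 2) for v in pi])
```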
y 14 24 30 28 30 80 90 85 110 120
253. Construct a 95% confidence interval for the mean of the population y - values when x =
30.
ANSWER:
(46.7 to 60.6)
254. Construct a 95% prediction interval for an individual y-value when x = 30.
ANSWER:
(31.1 to 76.2)
255. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “The slope for the line of best fit is positive.”
ANSWER:
H o : β1 = 0 vs. H a : β1 > 0
256. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is no regression.”
ANSWER:
H o : β1 = 0 vs. H a : β1 ≠ 0
257. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is evidence of negative regression.”
H o : β1 = 0 vs. H a : β1 < 0
258. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is evidence of positive regression.”
ANSWER:
H o : β1 = 0 vs. H a : β1 > 0
259. Determine the p-value for testing H a : β1 > 0, with n = 20, and t* = 2.2 .
ANSWER:
To determine the p-value for the test of the slope of the regression line, the table of
probability values for Student’s t-distribution is used with df = n – 2, and the value of the test
statistic is t ∗ = (b1 − β1 ) / sb1 . P = P(t > 2.20 | df = 18) = 0.021.
260. Determine the p-value for testing H a : β1 ≠ 0, with n = 14, b1 = 0.21, and sb1 = 0.07 .
ANSWER:
To determine the p-value for the test of the slope of the regression line, the table of
probability values for Student’s t-distribution is used with df = n – 2, and the value of the test
statistic is t ∗ = (b1 − β1 ) / sb1 . P = 2 P(t > 3.0 | df = 12) = 2(0.006) = 0.012.
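The table probabilities in questions 259 and 260 can be approximated without a table by integrating the Student's t density numerically. This is a sketch using only the Python standard library; Simpson's rule is a choice made here for illustration, and the function names are hypothetical. The results agree with the table values to about two significant digits, since the table rounds its probabilities.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail(t_star, df, steps=2000):
    """P(t > t_star) for t_star >= 0, using Simpson's rule on [0, t_star]."""
    h = t_star / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_star, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3              # area beyond t_star in the right tail

# Question 259: one-tailed, n = 20, t* = 2.2
p_259 = upper_tail(2.2, 18)

# Question 260: two-tailed, n = 14, t* = (b1 - 0) / sb1
t_260 = (0.21 - 0) / 0.07
p_260 = 2 * upper_tail(t_260, 12)

print(round(p_259, 3), round(t_260, 2), round(p_260, 3))
```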
261. Determine the p-value for testing H a : β1 < 0, with n = 27, b1 = −1.20, and sb1 = 0.75
ANSWER:
A sample of ten students were asked by their statistics professor for the distance (rounded to
nearest mile) and the time (rounded to nearest minute) required to commute to college daily.
The data collected are shown in the following table.
Distance 2 4 5 6 7 7 8 9 10 12
Time 6 13 15 20 18 23 20 25 28 30
ANSWER:
[Scatter diagram: Time (vertical axis, 0 to 35) versus Distance (horizontal axis, 0 to 14)]
ANSWER:
b1 = SS ( xy ) / SS ( x) = 185 / 78 = 2.372
264. Give a point estimate for the mean time required to commute four miles.
ANSWER:
When x = 4, ŷ = 3.196 + 2.372 (4) = 12.684. Then, the point estimate for µ y|x=4 is 12.684.
ANSWER:
β1 is the slope of the line of best fit for the population of distances and their
corresponding times required for students to commute to college.
se² = [∑y² − b0∑y − b1∑xy] / (n − 2) = [4392 – (3.196)(198) – (2.372)(1571)] / 8 = 4.0975
The critical region is t ≥ 1.86. Since the test statistic t ∗ falls in the critical region, we
reject H o . There is sufficient evidence at the 0.05 level of significance to indicate that the
slope is significantly greater than zero.
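The slope test for the commuting data can be traced step by step from the raw data. This sketch assumes the usual estimate of the slope's standard error, sb1 = √(se² / SS(x)), and takes the critical value t(8, 0.05) = 1.86 from the answer above. Because it carries full precision for b0 and b1, se² comes out near 4.10 rather than exactly 4.0975.

```python
import math

distance = [2, 4, 5, 6, 7, 7, 8, 9, 10, 12]      # miles
time = [6, 13, 15, 20, 18, 23, 20, 25, 28, 30]   # minutes
n = len(distance)

sum_x, sum_y = sum(distance), sum(time)
ss_x = sum(x * x for x in distance) - sum_x ** 2 / n
ss_xy = sum(x * y for x, y in zip(distance, time)) - sum_x * sum_y / n

b1 = ss_xy / ss_x                 # slope
b0 = (sum_y - b1 * sum_x) / n     # y-intercept

# Variance of y about the line of best fit, from the residuals
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(distance, time))
se2 = sse / (n - 2)

# Test statistic for H o : beta1 = 0 vs. H a : beta1 > 0
sb1 = math.sqrt(se2 / ss_x)
t_star = (b1 - 0) / sb1

T_CRIT = 1.86                     # t(8, 0.05), taken from the answer above
reject_h0 = t_star >= T_CRIT
print(round(b1, 3), round(se2, 4), round(sb1, 4), round(t_star, 2), reject_h0)
```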
266. Find the 98% confidence interval for the estimation of β1.
ANSWER:
267. Give a 90% confidence interval for the mean travel time required to commute four miles.
ANSWER:
E = t(n − 2, α/2) ⋅ se ⋅ √[(1/n) + (x0 − x̄)²/SS(x)]
= (1.86)(2.0242) √[(1/10) + (4 − 7)²/78]
= (1.86)(2.0242)(0.4641) = 1.747
Hence ŷ ± E = 12.684 ± 1.747 , and the 90% confidence interval for µ y|x=4 is 10.937 to
14.431.
268. Give a 90% prediction interval for the travel time required for one person to commute
four miles.
y x=4 is the travel time required for one person to commute four miles.
E = t(n − 2, α/2) ⋅ se ⋅ √[1 + (1/n) + (x0 − x̄)²/SS(x)]
= (1.86)(2.0242) √[1 + (1/10) + (4 − 7)²/78]
= (1.86)(2.0242)(1.1024) = 4.151
Hence ŷ ± E = 12.684 ± 4.151 , and the 90% prediction interval for y x=4 is 8.533 to 16.835.
People not only live longer today but they also are living independently longer, even though an
individual may become temporarily dependent at some age. The table shown below includes
two variables: people’s age at which they became dependent (x) and the number of
independent years they had remaining (y).
x 65 66 67 68 70 72 74 76 78 80 83 85
y 11.1 10.0 10.4 9.3 8.2 6.8 6.8 4.4 5.4 2.5 2.7 0.9
ANSWER:
[Scatter diagram: Independent years (vertical axis, 0 to 12) versus Age dependent (horizontal axis, 60 to 90)]
ANSWER:
[Scatter diagram: Independent years (vertical axis, 0 to 12) versus Age dependent (horizontal axis, 60 to 90)]
ANSWER:
When x = 80, ŷ = 43.222 – 0.497(80) = 3.462. Reading from the graph in question 272,
ŷ ≈ 3.5 when x = 80.
273. Construct a 99% prediction interval for the number of years of independent living
remaining for a person who becomes dependent at age 80.
ANSWER:
y x=80 is the number of years of independent living remaining for a person who becomes
dependent at age 80. n = 12, x0 = 80, x̄ = ∑x / n = 73.17,
se² = [∑y² − b0∑y − b1∑xy] / (n − 2) = [695.87 − (43.222)(82.3) − (−0.497)(5772.2)] / 10 = 0.74828
se = √0.74828 = 0.865. Since α/2 = 0.005 and df = 10, t(10, 0.005) = 3.17, and
E = t(n − 2, α/2) ⋅ se ⋅ √[1 + (1/n) + (x0 − x̄)²/SS(x)]
Since ŷ = 3.462, ŷ ± E = 3.462 ± 2.974, and the 99% prediction interval for y x=80 is
0.488 to 6.436.
The following set of 25 scores was randomly selected from Dr. Maas’ inferential statistics class.
Let x be the pre-final average and y the final examination score. (The final examination had a
maximum of 100 points.)
Student 1 2 3 4 5 6 7 8 9 10 11 12 13
x 80 91 73 88 62 71 60 89 66 73 69 81 76
y 87 88 80 82 76 84 71 90 82 79 75 86 89
Student 14 15 16 17 18 19 20 21 22 23 24 25
x 78 83 76 91 76 99 99 64 86 63 95 97
y 85 89 85 94 78 95 98 72 94 81 90 98
274. Given that ∑y = 2128, ∑y² = 182,522, and ∑xy = 170,971, find the standard deviation of the
y-values about the regression line ŷ = 41.2057 + 0.5528x.
ANSWER:
se² = [∑y² − b0∑y − b1∑xy] / (n − 2) = 13.982
se = √13.982 = 3.793
275. Calculate a 95% confidence interval for the true value of the slope given that SS(x) =
3478.16.
276. Test the significance of the slope at α = 0.05 using the p-value approach and the
classical approach.
ANSWER:
β1 is the slope of the line of best fit for the population of pre-final averages and final
exam. H o : β1 = 0 vs. H a : β1 > 0 (Note: the alternative hypothesis can be either one-
tailed or two-tailed. Since the slope is positive, a one-tail test is appropriate). Assume
normality for y at each x. n = 25, df = n – 2 = 23, b1 = 0.5528, sb1 = 0.0634
P = P(t > 8.719 | df = 23). Using the table of critical values of Student’s t-distribution, we
get P < 0.005. Since P < α = 0.05, we reject H o . There is sufficient evidence to indicate at
the 0.05 level of significance that the slope is significantly greater than zero.
The critical region is t ≥ 1.71. Since the test statistic t ∗ falls in the critical region, we
reject H o . We reach the same conclusion as stated above in the p-value approach.
277. Estimate the mean final-exam grade that all students with an 85 pre-final average will
obtain (95% confidence interval).
ANSWER:
µ y|x =85 is the mean final exam grade that all students with an 85 pre-final average will
obtain. Normality assumed for y at each x.
E = t(n − 2, α/2) ⋅ se ⋅ √[(1/n) + (x0 − x̄)²/SS(x)]
= (2.07)(3.793) √[(1/25) + (85 − 79.44)²/3478.16]
= (2.07)(3.793)(0.2211) = 1.736
Hence, ŷ ± E = 88.19 ± 1.736 , and the 95% confidence interval for µ y|x=85 is 86.454 to
89.926.
278. Using the 95% prediction interval, predict the score that Terri will receive on her final,
knowing that her pre-final average is 80.
ANSWER:
y x=80 is the final exam score that Terri will receive on her final, knowing that her pre-final
average is 80. n = 25, x0 = 80, x̄ = ∑x / n = 79.44, se = 3.793, ŷ = 85.43
E = t(n − 2, α/2) ⋅ se ⋅ √[1 + (1/n) + (x0 − x̄)²/SS(x)]
= (2.07)(3.793)(1.0198) = 8.01
Hence, ŷ ± E = 85.43 ± 8.01 , and the 95% prediction interval for Terri is 77.42 to 93.44.
279. The data below are for the number of unemployed persons (in millions) and the federal
unemployment insurance payments (in billions of dollars) for the years 1978 – 1985.
Some economists state that these two variables are positively related.
Year 1978 1979 1980 1981 1982 1983 1984 1985
Federal Unemployment 11.8 10.7 18.0 19.7 23.7 31.5 18.4 16.8
# Unemployed Persons 6.2 6.1 7.6 8.3 10.7 10.7 8.5 8.3
Use the classical approach at the 0.05 level of significance and a computer to test the
null hypothesis that the population slope is zero.
ANSWER:
The test statistic is t * = 6.335; and the critical regions are: t ≤ -2.45 or t ≥ 2.45.
Therefore we reject the null hypothesis. There is sufficient evidence at α = 0.05 to
indicate that a linear relationship exists between the two variables.
280. Sketch a t-curve to determine the critical value(s) and rejection regions that would be
used with the classical approach in testing H o : β1 = 0 vs. H a : β1 < 0 , with n = 16, α = 0.05
ANSWER:
ANSWER:
ŷ = 16.8221+ 0.64983x
ANSWER:
ANSWER:
The slope b1 : For each additional year an employee is with this company, his or her
salary increases, on average, by $650.
The y-intercept b0 : An employee just starting a job with this company has a starting
salary of $16,820.
284. Does a linear relationship exist between x and y? Test using α = 0.05.
ANSWER:
H o : β1 = 0 vs. H a : β1 ≠ 0
An experiment was conducted to study the effect of a new drug in lowering the heart rate in
adults. The data collected are shown in the following table.
Drug Dose in mg. (x) 1.75 2.50 0.50 2.00 2.75 2.25 0.75 1.25 1.50 1.00
Heart Rate Reduction (y) 13 17 9 19 20 19 6 11 14 14
ANSWER:
[Scatter diagram: Heart Rate Reduction (vertical axis, 0 to 25) versus Drug Dose in mg. (horizontal axis, 0 to 3)]
ANSWER:
The scatter diagram suggests a positive linear relationship between drug dose and heart
rate reduction.
287. Use a computer to determine the equation of the line of best fit.
ANSWER:
288. What is the estimated or predicted heart rate reduction for a dose of 2.00 mg?
ANSWER:
ANSWER:
SSE = ∑y² − (b0)(∑y) − (b1)(∑xy) = 2210 – (5.3758)(142) – (5.4303)(258.75) = 41.546
SS(x) = ∑x² − (∑x)²/n = 31.5625 − (16.25)²/10 = 5.1563
290. Find the 95% confidence interval for the mean heart-rate reduction for a dose of 2.00
mg.
ANSWER:
E = t(n − 2, α/2) ⋅ se ⋅ √[1/n + (x0 − x̄)²/SS(x)] = (2.31)(2.2789) √[1/10 + (2 − 1.625)²/5.1563] = 1.88
The lower and upper confidence limits for the mean heart-rate reduction when x = 2 are,
291. Find the 95% prediction interval for the heart-rate reduction expected for an individual
receiving a dose of 2.00 mg.
ANSWER:
E = t(n − 2, α/2) ⋅ se ⋅ √[1 + 1/n + (x0 − x̄)²/SS(x)] = (2.31)(2.2789) √[1 + 1/10 + (2 − 1.625)²/5.1563] = 5.59
ŷ ± E = 16.24 ± 5.59
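The full chain for the drug-dose data (line of best fit, se, and both interval half-widths at x0 = 2.00 mg) can be checked from the raw data in one pass. This is a plain-Python sketch, not the output of the statistics package the questions refer to; t(8, 0.025) = 2.31 is taken from the answers above.

```python
import math

dose = [1.75, 2.50, 0.50, 2.00, 2.75, 2.25, 0.75, 1.25, 1.50, 1.00]   # mg
reduction = [13, 17, 9, 19, 20, 19, 6, 11, 14, 14]                    # heart rate reduction
n = len(dose)

x_bar = sum(dose) / n
y_bar = sum(reduction) / n
ss_x = sum((x - x_bar) ** 2 for x in dose)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dose, reduction))

b1 = ss_xy / ss_x            # slope
b0 = y_bar - b1 * x_bar      # y-intercept

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(dose, reduction))
se = math.sqrt(sse / (n - 2))

x0 = 2.0
T_CRIT = 2.31                # t(8, 0.025), taken from the answers above
y_hat = b0 + b1 * x0
core = 1 / n + (x0 - x_bar) ** 2 / ss_x
e_mean = T_CRIT * se * math.sqrt(core)        # question 290: mean reduction
e_indiv = T_CRIT * se * math.sqrt(1 + core)   # question 291: individual reduction

print(round(b0, 4), round(b1, 4), round(se, 4))
print(round(y_hat, 2), round(e_mean, 2), round(e_indiv, 2))
```

The prediction interval is wider than the confidence interval at the same x0 because it also includes the variability of an individual observation about its mean.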
ANSWER:
New Obs
1 2.00
293. Comment on the widths of the two intervals formed in questions 291 and 292.
ANSWER:
It is always the case that the prediction interval for an individual value of y is wider than
the confidence interval for the mean value of y when both are calculated at the same value
x0 (x0 = 2 in our case).
Shoe Sizes 8.5 7.5 12 8.5 8.5 9.5 13 13 6 13 8.5 7.5 12 7.5 6.5
Height 66 64 73 67 66 67 74 74 65 77 67 63 69 64 62
294. Construct a scatter diagram of the data, and comment on the visual linear relationship.
ANSWER:
The linear relationship between shoe size and height seems appropriate.
[Scatter diagram: Height (y) versus Shoe Size (x)]
ANSWER:
296. Is the population correlation coefficient significant? Test the appropriate hypotheses at
the 0.05 level of significance.
ANSWER:
H o : ρ = 0 vs. H a : ρ ≠ 0
The critical values are found in the “Critical Values of r When ρ = 0” table, at the
intersection of the df = 8 row and the two-tailed 0.05 column of the table. These values
are ± 0.632. Since r ∗ = 0.902 > 0.632, we reject H o . There is sufficient evidence to
indicate that there is a linear dependency between the shoe size and the height of a
person in the population from which this sample was drawn.
ANSWER:
ANSWER:
While the slope of the original equation is slightly larger than the slope of the line of best
fit (2.0 vs. 1.6725), its y-intercept is slightly smaller (50 vs. 52.1932).
299. Using the line of best fit found in question 298, estimate height for a student with a size
10 shoe. Compare results.
ANSWER:
The line of best fit provides an excellent estimate compared to the original equation.
300. Use a computer to construct the 95% confidence interval for the mean height of all college
students with a size 10 shoe using the equation formed in question 298. Is your
estimate using y = 2x + 50 for a size 10 included in this interval?
ANSWER:
1 10.0
The height for a student with a size 10 shoe is estimated using the equation y = 2x + 50
to be 70 inches, which is not included in this confidence interval.
301. Construct the 95% prediction interval for the individual heights of all college students
with a size 10 shoe using the equation formed in question 298.
1 10.0
302. Comment on the widths of the two intervals formed in questions 300 and 301. Explain.
ANSWER:
It is always the case that the prediction interval for an individual value of y is wider than
the confidence interval for the mean value of y when both are calculated at the same value
x0 (x0 = 10 in our case).
303. Comment on the statement “The correlation coefficient has the same sign as the slope
of the least squares line fitted to the same data.” as sometimes true, always true, or
never true. Explain your response if your answer is “sometimes true” or “never true”.
ANSWER:
Always true
304. Explain why a 95% confidence interval for the mean value of y at a particular x is much
narrower than a 95% prediction interval for an individual y-value at the same value of x.
According to the Central Limit Theorem, the standard error for x̄'s (sample means) is much
smaller than the standard deviation for individual x's. Thus the confidence interval for the
mean value of y will be narrower than the prediction interval for an individual y-value at
the same value of x.
x 5 7 9 11 13
y 10 12 14 16 18
305. Calculate ∑x, ∑y, ∑x², ∑y², and ∑xy.
ANSWER:
x y xy x² y²
5 10 50 25 100
7 12 84 49 144
9 14 126 81 196
11 16 176 121 256
13 18 234 169 324
Sum 45 70 670 445 1020
∑x = 45, ∑y = 70, ∑x² = 445, ∑y² = 1020, and ∑xy = 670
306. Calculate SS(x) , SS(y), and SS(xy).
SS(x) = ∑x² − (∑x)²/n = 445 − (45)²/5 = 40
SS(y) = ∑y² − (∑y)²/n = 1020 − (70)²/5 = 40
ANSWER:
ANSWER:
309. The sample correlation coefficient, r, is related to the slope of the line of best fit, b1, by
the equation r = b1 ⋅ √[SS(x)/SS(y)]. Verify this equation using this data set.
ANSWER:
b1 ⋅ √[SS(x)/SS(y)] = 1 ⋅ √(40/40) = 1 = r
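The sums of squares and the relationship between r and b1 for this five-point data set can be checked numerically. The sketch below assumes the usual shortcut formula SS(xy) = ∑xy − (∑x)(∑y)/n, which is not written out in the answers here; the names are illustrative.

```python
import math

x = [5, 7, 9, 11, 13]
y = [10, 12, 14, 16, 18]
n = len(x)

# Shortcut formulas for the three sums of squares
ss_x = sum(v * v for v in x) - sum(x) ** 2 / n
ss_y = sum(v * v for v in y) - sum(y) ** 2 / n
ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

b1 = ss_xy / ss_x                      # slope of the line of best fit
r = ss_xy / math.sqrt(ss_x * ss_y)     # sample correlation coefficient

# The identity r = b1 * sqrt(SS(x) / SS(y)) should hold for any data set
print(ss_x, ss_y, ss_xy, b1, r)
print(math.isclose(r, b1 * math.sqrt(ss_x / ss_y)))
```

Since every point lies exactly on the line y = x + 5, both the slope and the correlation coefficient come out as 1.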
A scientist is studying the relationship between wind velocity (x) and DC output of a windmill (y).
The following MINITAB output is from a regression analysis for predicting y from x.
Analysis of Variance
Source DF SS MS F p
Total 13 6.0721
ANSWER:
ŷ = -0.1346 + 0.28996x
ANSWER:
ANSWER:
313. One of the assumptions about the random error ε in the regression model is that the
values of ε have a common variance equal to σ 2 . What is the best estimator of σ?
314. Does a linear relationship exist between x and y? Test using α = 0.05.
ANSWER:
A scientist is studying the relationship between x = inches of annual rainfall and y = inches of
shoreline erosion. One study reported the following data. Use the following MINITAB output to
answer the questions below.
x 30 25 90 60 50 35 75 110 45 80
y 0.3 0.2 5.0 3.0 2.0 0.5 4.0 6.0 1.5 4.0
Analysis of Variance
Source DF SS MS F p
Total 9 38.405
315. Construct a scatter diagram of the data, display the estimated regression line on the
graph, and comment on the visual linear relationship.
[Scatter diagram: Shoreline Erosion (y) versus Annual Rainfall (x), with fitted line y = 0.0731x − 1.7359]
A linear relationship between inches of rainfall and inches of shoreline erosion seems
appropriate.
316. Specify the y-intercept and the slope of the estimated regression line.
ANSWER:
317. Interpret the estimated slope of the regression line in question 316.
ANSWER:
The slope b1 = 0.0731. This means that for each additional inch of annual rainfall, the
shoreline erodes, on average, by 0.0731 inch.
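The slope, the intercept, and the total sum of squares reported for the rainfall data can be reproduced from the raw values. This is a rough sketch with illustrative names; the 10-inch comparison at the end is only an example of how the slope is read.

```python
# Annual rainfall in inches (x) and shoreline erosion in inches (y)
rainfall = [30, 25, 90, 60, 50, 35, 75, 110, 45, 80]
erosion = [0.3, 0.2, 5.0, 3.0, 2.0, 0.5, 4.0, 6.0, 1.5, 4.0]
n = len(rainfall)

x_bar = sum(rainfall) / n
y_bar = sum(erosion) / n

# Sums of squares about the means
ss_x = sum((x - x_bar) ** 2 for x in rainfall)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(rainfall, erosion))
ss_total = sum((y - y_bar) ** 2 for y in erosion)

slope = ss_xy / ss_x
intercept = y_bar - slope * x_bar

print(f"y-hat = {slope:.4f}x + ({intercept:.4f})")
print(f"Total SS = {ss_total:.3f}")

# Reading the slope: 10 extra inches of rain go with about 10 * slope inches of erosion
print(f"Change in predicted erosion for 10 more inches of rain: {10 * slope:.3f}")
```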
318. If we wish to test the usefulness of the simple linear regression model for predicting
shoreline erosion from a given amount of rainfall, what are the appropriate null and
alternative hypotheses?
ANSWER:
H o : β = 0 vs. H a : β ≠ 0
319. Test the hypotheses in question 318 at α = 0.05 using the p-value approach.
ANSWER:
Since p-value = 0.0 < α , reject H o . There is sufficient evidence to indicate that a linear
relationship does exist between x and y. That is, the simple linear regression model is
useful for predicting erosion from a given amount of rainfall.
Chapter 14
Elements of Nonparametric
Statistics
1. One of the advantages that the nonparametric tests have is the necessity for less
restrictive assumptions.
ANSWER: T
2. If a tie occurs in a set of ranked data, the data that form the tie are removed from the set.
ANSWER: F
ANSWER: F
4. The efficiency of a nonparametric test is the probability that a false null hypothesis is
rejected.
ANSWER: F
ANSWER: T
6. When choosing between parametric and nonparametric tests, we are interested primarily
in the control of error, the relative power of the test, and efficiency.
ANSWER: T
7. The Runs test is the nonparametric counterpart of the parametric t-test for two
dependent means.
ANSWER: F
ANSWER: T
9. The nonparametric methods, or distribution-free methods as they are also known, do not
depend on the distribution of the population being sampled, but they depend on the
distribution of the sample itself.
ANSWER: F
10. While nonparametric methods require few assumptions about the parent population,
they are generally harder to apply than their parametric counterparts.
ANSWER: F
11. Nonparametric methods can be used in situations where parametric methods cannot be
used.
ANSWER: T
12. Suppose that you set the levels of risk you can tolerate for Type I and Type II error at α
and β , respectively, and then you are able to determine the sample size it would take to
meet your specified challenge. The test that required the larger sample size would seem
to have the edge, since it would be more efficient.
ANSWER: F
13. When we compare two or more tests, they must be equally qualified for use. That is,
each test has a set of assumptions that must be satisfied before it can be applied.
ANSWER: T
ANSWER: T
ANSWER: F
16. Efficiency is the ratio of the sample size of the best parametric test to the sample size of
the best nonparametric test when compared under a fixed set of risk values.
ANSWER: T
Multiple-Choice Questions
17. Which one of the following statements is correct in describing the nonparametric tests?
19. Which one of the following statements is incorrect about the comparison of parametric
and nonparametric statistical methods?
20. When trying to control the risk of error and two tests are equal candidates, we should
select the one with
A) no error.
B) lowest efficiency.
C) greatest power.
D) most calculations.
ANSWER: C
24. When choosing between parametric and nonparametric tests, we are interested primarily
in
A) It is the ratio of the sample size of the best nonparametric test to the sample size of
the best parametric test when compared under a fixed set of risk values.
B) It is the ratio of the sample size of the best parametric test to the sample size of the
best nonparametric test when compared under a fixed set of risk values.
C) It is the sum of the sample size of the best nonparametric test and the sample size of
the best nonparametric test when compared under a fixed set of risk values.
D) It is the difference of the sample size of the best parametric test and the sample size
of the best nonparametric test when compared under a fixed set of risk values.
ANSWER: B
A) The risk associated with a Type I error is controlled directly by the level of
significance α .
B) P(Type I error) = α
C) P(Type II error) = β .
D) It is α , not β , that we must control.
ANSWER: D
30. The efficiency rating for the sign test is approximately 0.63. What does this mean?
ANSWER:
This means that a sample of size 63 with a parametric test will do the same job as a
sample of size 100 will do with the sign test.
31. The power and the efficiency of a test cannot be used alone to determine the choice of a
test. Explain in detail.
Sometimes you will be forced to use a certain test because of the data you are given.
When there is a decision to be made, the final decision rests in a trade-off of three
factors: (1) the power of the test, (2) the efficiency of the test, and (3) the data (and the
number of data) available.
32. Briefly discuss the reasons for the recent popularity of nonparametric statistics.
ANSWER:
33. Explain why nonparametric methods are also called distribution-free methods.
ANSWER:
34. What two factors influence our decision as to the “best” test?
ANSWER:
The two factors are the ability to control the risk of errors and the sample size required.
35. The efficiency of a particular nonparametric test is 0.82. For a fixed set of risk values, the
sample size of the best nonparametric test is 50. Find the sample size of the parametric
test.
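Because efficiency is defined in this chapter as the ratio of the parametric sample size to the nonparametric sample size, the calculation is a single multiplication. The helper below is a hypothetical illustration of that definition, not a textbook routine.

```python
def parametric_sample_size(efficiency, n_nonparametric):
    """Parametric sample size implied by efficiency = n_parametric / n_nonparametric."""
    return efficiency * n_nonparametric

# Sign test example from question 30: efficiency 0.63, nonparametric n = 100
n_sign = round(parametric_sample_size(0.63, 100))

# Setup of question 35: efficiency 0.82, nonparametric n = 50
n_q35 = round(parametric_sample_size(0.82, 50))

print(n_sign, n_q35)  # 63 41
```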
Section 14.4
True-False Questions
36. The sign test is a versatile and an exceptionally easy-to-apply nonparametric method
that uses only plus and minus signs.
ANSWER: T
37. The sign test can be used when the null hypothesis to be tested concerns the value of the
population median.
ANSWER: T
39. The sign test may be applied to a hypothesis test dealing with the median difference
between independent data that result from two independent samples.
ANSWER: F
40. Two dependent means can be compared nonparametrically by using the sign test.
ANSWER: T
41. The sign test is a possible alternative to the Student's t-test for one mean value.
ANSWER: T
ANSWER: F
ANSWER: F
44. The sign test can be used in a hypothesis test concerning the median difference (paired
difference) for two dependent samples.
ANSWER: T
45. In the sign test, if the observed value of the less frequent sign is larger than the critical
value k displayed in the “Critical Values of the Sign Test” table available in your
textbook, we reject H o .
ANSWER: F
46. The sign test is the nonparametric alternative to the t-test used for one mean.
ANSWER: T
ANSWER: T
ANSWER: F
49. The sign test can be applied to obtain a single-sample confidence interval for the
unknown population mean µ .
ANSWER: F
50. The sign test can be applied to obtain a single-sample confidence interval for the
unknown population median M.
ANSWER: T
51. The sign test is a nonparametric procedure for testing whether two populations have
identical
A) means
B) medians
C) variance
D) Interquartile ranges
ANSWER: B
53. Which of the following statements is false regarding the sign test?
A) In the sign test, reject the null hypothesis whenever the number of the less frequent
sign is extremely small.
B) If the number of the less frequent sign is less than or equal to the critical value k in
the “Critical Values of the Sign Test” table available in your textbook, we will reject H o
.
C) If the observed value of the less frequent sign is larger than the critical value k in the
“Critical Values of the Sign Test” table available in your textbook, we will fail to reject
Ho .
D) In the sign test, reject the null hypothesis whenever the number of the less frequent
sign is extremely large.
ANSWER: D
55. Which of the following statements is false regarding the sign test?
56. Which of the following statements is not always true regarding the sign test?
A) It can be used when the null hypothesis to be tested concerns the value of the
population median.
B) It may be either one- or two-tailed test.
C) It uses only the plus and minus signs; therefore, the zeros are discarded and the
usable sample size is adjusted accordingly.
D) Its test statistic is the number of the (+) signs; that is, n(+).
ANSWER: D
A) The sign test may be carried out by means of a normal approximation using the
standard normal variable z.
B) The normal approximation to the sign test will be used if the “Critical Values of the
Sign Test” table available in your textbook does not show the particular levels of
significance desired or if n is large.
C) The sign test may be the easiest test procedure of all nonparametric tests to use.
D) None of the above.
ANSWER: D
Short-Answer Questions
58. What nonparametric test can be used in place of either the one mean t-test or the two
dependent means t-test?
ANSWER:
ANSWER:
The sign test is a binomial experiment of n trials (the n data observations) with two
outcomes for each data value [(+) or (−)], and p = P(+) = 0.5. The variable x is the number
of the least frequent sign.
60. Why does the sign test use a null hypothesis about the median instead of the mean, as
a t-test does?
ANSWER:
The median is the middle value such that 50% of the distribution is larger in value and
50% is smaller in value.
61. A restaurant has collected data on which of two seating arrangements (A and B) its customers
prefer. In a sign test to determine which one seating arrangement is significantly preferred, the
null hypothesis would be: (a) M = 0, (b) M = 0.5, (c) p = 0, or (d) p = 0.5. Explain your choice.
ANSWER:
The right choice is (d); p = P(+) = P(prefer seating arrangement A) = 0.5.
62. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is no change in weight from weigh-in until after
three weeks of the aerobic exercises.”
ANSWER:
63. Briefly discuss the assumptions for inferences about the population median using the
sign test.
(a) The n random observations that form the sample are selected independently.
(b) The population is continuous in the vicinity of the median M.
64. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “The median tax rate is 5%”.
ANSWER:
65. Briefly discuss the assumptions for inferences about the median of paired differences
using the sign test.
ANSWER:
66. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “The median length of vacation time taken by university
administrators is less than 21 days per academic year.”
ANSWER:
ANSWER:
The nonparametric statistics do not require assumptions about the distribution of the
variable.
ANSWER:
The extreme value in a set of data can have a sizeable effect on the mean and standard
deviation in the parametric methods. The nonparametric methods typically use rank
numbers. The extreme value with ranks is either 1 or n, and neither changes if the value
is more extreme.
69. A computer center claims that the median downtime for its large mainframe computer is
45 minutes. A random sample of 30 downtimes for this computer revealed that 17
exceeded 45 minutes, 3 equaled 45 minutes, and 10 were less than 45 minutes. Give
the critical region, test statistic and conclusion for testing H o : M =45 vs. H a : M ≠ 45 at
α = 0.05.
ANSWER:
Critical region: x ≤ 7
Test statistic: x = 10
Conclusion: Fail to reject the null hypothesis
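The tabled critical value can be reproduced from the binomial distribution with p = 0.5. The sketch below assumes the table lists the largest k whose two-tailed probability does not exceed α; the function names are illustrative.

```python
from math import comb

def binom_cdf_half(k, n):
    """P(X <= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

def sign_test_critical_value(n, alpha):
    """Largest k with P(X <= k) <= alpha / 2 (two-tailed); -1 if no such k exists."""
    k = -1
    while binom_cdf_half(k + 1, n) <= alpha / 2:
        k += 1
    return k

# Downtime example: 17 above 45 minutes, 3 equal to 45, 10 below 45
n_plus, n_zero, n_minus = 17, 3, 10
n = n_plus + n_minus            # zeros are discarded
x = min(n_plus, n_minus)        # number of the less frequent sign

k = sign_test_critical_value(n, 0.05)
p_value = min(1.0, 2 * binom_cdf_half(x, n))   # exact two-tailed p-value

print(f"n = {n}, x = {x}, critical region: x <= {k}")
print("reject Ho" if x <= k else "fail to reject Ho", f"(p = {p_value:.3f})")
```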
A blood bank claims that the median usage for red blood cells in a liver transplant is 15 units. A
random sample of 34 transplants revealed that 16 exceeded 15 units, 4 equaled 15 units, and
14 were less than 15 units.
ANSWER:
H o : M = 15 vs. H a : M ≠ 15
71. If testing the claim, what would be the test statistic, critical region, and conclusion at α = 0.05?
ANSWER:
ANSWER:
73. A marketing company conducted a taste preference test for a new brand of peanut
butter. Customers were asked to compare crunchy versus creamy. Seventy chose
crunchy over creamy, twenty-five chose creamy over crunchy, and five said they were
equally good. Give the critical region, x, and conclusion for testing that there is a
difference in preference using α = 0.05.
ANSWER:
74. A tire manufacturer claims that the median mileage for their Eagle tire is 40,000 miles. A
consumer agency wishes to test H o : M = 40,000 vs. H a : M < 40,000 at α = 0.05.
When the agency tested 100 such tires, sixty-five gave less than 40,000 miles of wear
and thirty-five gave more than 40,000. A minus sign was recorded if a tire gave less
than 40,000 miles and a plus was recorded if it gave 40,000 or more. Give the critical
region, z * , and the conclusion.
ANSWER:
17 13 20 12 14 16 16 18 12 19
16 10 10 20 14 8 19 19 16 12
16 12 21 17 16 17 14 16 12 9
ANSWER:
The table of “critical values of the sign test” indicates that for n = 10 and a two-tailed test, the
critical region for the sign test is x ≤ 1 if α = 0.05.
76. Verify this by finding P(x ≤ 1) + P(x ≥ 9) for a binomial distribution with n =10 and p = 0.5.
ANSWER:
ANSWER:
P(x ≤ 2) + P(x ≥ 8) = 0.001 + 0.010 + 0.044 + 0.044 + 0.010 + 0.001 = 0.110
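Both tail sums can be computed exactly from the binomial probability function with n = 10 and p = 0.5. A minimal sketch, with illustrative names:

```python
from math import comb

def binom_pmf_half(x, n):
    """P(X = x) for X ~ Binomial(n, 0.5)."""
    return comb(n, x) / 2 ** n

n = 10
# Critical region x <= 1 (two-tailed): P(x <= 1) + P(x >= 9)
tail_k1 = sum(binom_pmf_half(x, n) for x in (0, 1, 9, 10))
# Wider region x <= 2 (two-tailed): P(x <= 2) + P(x >= 8)
tail_k2 = sum(binom_pmf_half(x, n) for x in (0, 1, 2, 8, 9, 10))

print(f"{tail_k1:.4f}")  # 0.0215
print(f"{tail_k2:.4f}")  # 0.1094
```

The first sum stays below α = 0.05 while the second exceeds it, which is why the table gives x ≤ 1 as the critical region for n = 10.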
A reading speed and comprehension test was given to a random sample of 10 individuals
before and after a reading course. The scores are given below:
After 78 82 94 69 75 79 91 65 85 90
The claim being tested is that scores will improve after the course. Use α = 0.05.
ANSWER:
H o : M = 0 (≤), H a : M > 0
79. Give the critical region, and the value of the test statistic.
ANSWER:
ANSWER:
Fail to reject H o , and conclude that the scores did not improve after the course.
81. The Computer Anxiety Index (CAIN) was administered to one hundred and fifty students
in a statistics course that utilizes statistical packages. The CAIN was given at the
beginning and the end of the course. Ninety showed a reduction in computer anxiety, ten
showed no change, and fifty showed an increase. Test the null hypothesis of no change
versus the hypothesis that anxiety was reduced at α = 0.05. Use the sign test and give
the critical region, z * , and conclusion.
82. Estimate the population median with a 0.95 confidence interval for a given set of 100
pieces of ordered data: x1, x2, …, x100.
ANSWER:
( x40 to x61)
83. The following sample data represents the percent of yearly income spent on
entertainment for 16 residents selected from various care centers. Find a 95%
confidence interval on the median percent spent on entertainment for all such residents.
ANSWER:
84. The following diastolic blood pressures were obtained from 40 females who were over
60 years in age. Find a 95% confidence interval for the median diastolic reading for the
population of females who are over 60.
72 72 74 74 75 75 75 77 77 78
79 79 80 80 80 81 81 84 84 85
85 86 86 86 87 88 90 90 90 90
80 to 90
85. The ages (rounded to the nearest year) for a sample of 30 students in the evening
school at a particular college are shown below:
25 30 32 41 27 34 47 31 28 24 23
40 37 35 40 29 25 23 22 30 48 21
21 34 35 32 28 50 25 32
Find a 90% confidence interval on the median age of all students in the evening school.
ANSWER:
(28, 34)
49 32 34 36 38 40 30 28 36 38
86. Use the sign test to determine the 95% confidence interval for the median daily high
temperature in Chicago during December.
ANSWER:
Ranked data:
18 21 22 25 25 27 28 30 31 32
For n = 20 and 1 - α = 0.95, the critical value from the table of critical values of the sign
test is k = 5. Then, x(k+1) = x6 = 27 and x(n−k) = x15 = 38. Hence, 27 to 38 is the 95%
confidence interval for the unknown population median M.
87. Use the sign test to test at α = 0.05 the hypothesis that the median daily high
temperature in the city of Chicago during the month of December is 40 degrees, using
the p-value approach.
ANSWER:
Using the table of critical values of the sign test: P ≈ 0.01 . Since P < α ; reject H o .
88. Use the sign test to test at α = 0.05 the hypothesis that the median daily high
temperature in the city of Chicago during the month of December is 40 degrees, using
the classical approach.
ANSWER:
The classical approach: Critical region: n (least freq sign) ≤ 4 ; x* = 2 is in the critical
region; therefore we reject H o at α = 0.05 , and conclude that the median temperature in
the city of Chicago during the month of December is significantly different from 40.
89. A blind taster wishes to test whether the preference for the taste of the new cola is significantly
greater than one-half. State the null and alternative hypotheses.
ANSWER:
ANSWER:
x is a binomial random variable and is approximately normal. Let + = prefer new; then n(+) =
710, n(0) = 300, and n(−) = 650. The usable sample size is n = n(+) + n(−) = 710 + 650 =
1360. Since x = n(+) = 710, x′ = x − 0.5 = 709.5. The value of the test statistic is:
z* = [x′ − (n/2)] / (√n / 2) = [709.5 − (1360/2)] / (√1360 / 2) = 1.60.
91. Test the hypothesis in question 89 at the 0.01 level of significance using the p-value approach.
ANSWER:
The p-value approach: P = P(z > 1.60) = 0.5000 – 0.4452 = 0.0548. Since P > α = .01;
fail to reject H o .
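The normal approximation used for the cola data can be sketched in a few lines. The continuity-correction rule below (subtract 0.5 when x is above n/2, add 0.5 otherwise) is an assumption that matches the x′ = x − 0.5 step in the answer; the function names are illustrative.

```python
import math

def sign_test_z(x, n):
    """Continuity-corrected z statistic for the sign test under p = 0.5."""
    x_adj = x - 0.5 if x > n / 2 else x + 0.5   # assumed correction rule
    return (x_adj - n / 2) / (math.sqrt(n) / 2)

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

n_plus, n_zero, n_minus = 710, 300, 650
n = n_plus + n_minus                 # zeros are discarded
z_star = sign_test_z(n_plus, n)
p_value = upper_tail(z_star)

print(f"z* = {z_star:.2f}, p-value = {p_value:.4f}")
print("reject Ho" if p_value <= 0.01 else "fail to reject Ho")
```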
92. Test the hypothesis in question 89 at the 0.01 level of significance using the classical approach.
ANSWER:
The classical approach: Critical region: z ≥ 2.33 ; The test statistic is not in the critical
region; therefore we fail to reject H o . The evidence does not allow us to conclude that
there is a significant preference for the new cola.
44 45 51 49 53 57 54 45 54 53 48
60 60 50 31 44 45 57 51 50 35
93. If you wish to determine whether this sample shows that the median score for the exam
differs from 50, what are the null and alternative hypotheses?
ANSWER:
94. Calculate the appropriate value of the test statistic for testing the hypothesis in question
93.
ANSWER:
Let + = above 50, then n (+) = 14, n (0) = 3, and n (-) = 15.
95. Test the hypothesis in question 93 at α = 0.05 using the p-value approach.
ANSWER:
P = p-value = 2P(x ≤ 14 | n = 29). Using the table of critical values of the sign test, we
get P > 0.25. Since P > α ; fail to reject H o . The sample evidence is not sufficient to
justify the claim that median score is different from 50.
96. Test the hypothesis in question 93 at α = 0.05 using the classical approach.
The critical region: n (least freq sign) ≤ 8. The test statistic is not in the critical region;
therefore we fail to reject H o . We reach the same conclusion as stated in question 95.
97. If you wish to determine whether this sample shows that the median score for the exam
is less than 50, what are the null and alternative hypotheses?
ANSWER:
H o : Median score on exam = 50 (≥) vs. H a : Median score on exam < 50.
98. Calculate the appropriate value of the test statistic for testing the hypothesis in question
97.
ANSWER:
Let + = above 50 or equal to 50, then n(+) = 17, n(-) = 15, and n = 32.
99. Test the hypothesis in question 97 at α = 0.05 using the p-value approach.
ANSWER:
P = p-value = P(x ≤ 15 | n = 32). Using the table of critical values of the sign test we get
P > 0.125. Since P > α , we fail to reject H o . The sample evidence is not sufficient to
justify the claim that the median score is less than 50.
100. Test the hypothesis in question 97 at α = 0.05 using the classical approach.
The critical region is: n (least freq sign) ≤ 10; Since the test statistic is not in the critical
region; we fail to reject H o . We reach the same conclusion as stated in question 99.
101. Suppose that we have 15 pieces of data in ascending order ( x1 , x2 , x3 ,......, x15 ). Explain
how to form a 90% confidence interval for the population median M.
ANSWER:
The “Critical Values of the Sign Test” table available in your textbook shows a critical value of
3 (k = 3) for n = 15 and α = 0.10 for a hypothesis test. This means that we drop the last
three values on each end (x1, x2, and x3 on the left; x13, x14, and x15 on the right). The
confidence interval is bounded by x4 and x12, inclusively. That is, the 90% confidence
interval for M is x4 to x12.
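The positions that bound the interval can also be found directly from the binomial distribution instead of the table. This sketch assumes the table's k is the largest count whose lower-tail probability is at most α/2; the function name is hypothetical.

```python
from math import comb

def median_ci_positions(n, confidence):
    """Return (lower, upper) positions in the ordered data bounding the CI for the median."""
    alpha = 1 - confidence
    cumulative = 0.0
    k = -1
    for i in range(n + 1):
        cumulative += comb(n, i) / 2 ** n   # P(X = i) for Binomial(n, 0.5)
        if cumulative > alpha / 2:
            break
        k = i                               # largest i with P(X <= i) <= alpha / 2
    return k + 1, n - k

print(median_ci_positions(15, 0.90))   # (4, 12)
print(median_ci_positions(20, 0.95))   # (6, 15)
print(median_ci_positions(100, 0.95))  # (40, 61)
```

The three calls reproduce the positions used in this section: x4 to x12 for n = 15, x6 to x15 for n = 20, and x40 to x61 for n = 100.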
102. Ten randomly selected college students were each asked how many hours of television
they watched last week. The results are: 22, 6, 30, 24, 15, 28, 20, 34, 50, and 31.
Determine the 90% confidence interval estimate for the median number of hours of
television watched per week by college students.
ANSWER:
103. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “The median value is at least 40.”
ANSWER:
ANSWER:
105. Using the classical approach and the sign test, determine the critical value that would be
used to test H o : P(+) = 0.5 vs. H a : P(+) ≠ 0.5, with n = 20, α = 0.05.
ANSWER:
106. Using the classical approach and the sign test, determine the critical value that would be
used to test H o : P(+) = 0.5 vs. H a : P(+) > 0.5, with n = 80, α = 0.025.
ANSWER:
x = n(-) ≤ 30
107. Using the classical approach and the sign test, determine the critical value that would be
used to test H o : P(+) = 0.5 vs. H a : P(+) ≠ 0.5 with n = 175, α = 0.10.
ANSWER:
z ≤ -1.645 or z ≥ +1.645
108. Using the classical approach and the sign test, determine the critical value that would be
used to test H o : P(+) = 0.5 vs. H a : P(+) ≠ 0.5 with n = 100, α = 0.05.
ANSWER:
ANSWER:
x = n(+) ≤ 14
110. Using the classical approach and the sign test, determine the critical value that would be
used to test H o : P(+) = 0.5 vs. H a : P(+) ≠ 0.5 with n = 160, α = 0.05.
ANSWER:
z ≤ -1.96 or z ≥ +1.96
111. Using the critical values of the sign test available in your textbook, determine the p-value
that would be used to test H o : Median = 25 vs. H a : Median ≠ 25, given that n(+) = 34,
n(0) = 0, n(-) = 56, and α = 0.05. State your decision.
ANSWER:
P = p-value = 2P(x ≤ 34 | n = 90) ⇒ 0.01 < P < 0.05. Since p-value < α = 0.05, we
reject H o .
112. Using the critical values of the sign test available in your textbook, determine the p-value
that would be used to test H o : Median = 24 ( ≥ ) vs. H a : Median < 24, given that n(+) = 17,
n(0) = 2, n(-) = 28, and α = 0.05. State your decision.
ANSWER:
P = p-value = P(x ≤ 17 | n = 45) ⇒ 0.05 < P < 0.125. Since p-value > α = 0.05, we fail
to reject H o .
A recent study reported that the mean salary of a full professor in US academic institutions was
$84,175. The following table lists the average salary for a random sample of 20 institutions in
Virginia.
87300 58200 58700 57700 74400
113. State the null and alternative hypotheses in testing the claim that the median salary of
full professors in Virginia is lower than the mean for the whole country.
ANSWER:
ANSWER:
115. Test the hypotheses in question 113 at α = 0.05 using the p-value approach.
ANSWER:
ANSWER:
Critical region: n(least freq sign) = x ≤ 5. Therefore, the test statistic is in the critical
region, and H o is rejected. We reach the same conclusion as stated in question 115.
117. A researcher compared baseline values for antithrombin III with antithrombin II values 7
days after a bone marrow transplant for 50 patients. The differences were found to be
nonsignificant. Suppose 19 of the differences were positive and 31 were negative. The
null hypothesis is that the median difference is zero, and the alternative hypothesis is
that the median difference is not zero. Use the 0.05 level of significance. Complete the
test and carefully state your conclusion.
ANSWER:
Since n(+) = 19, n(0) = 0, n(−) = 31, then n = n(+) + n(−) = 50, and the value of the test
statistic is x = n(least frequent sign) = n(+) = 19.
P = p-value = 2P(x ≤ 19 | n = 50) ⇒ 0.10 < P < 0.25. Since p-value > α = 0.05, we fail
to reject H o . There is not sufficient evidence to indicate that the median difference is
different from zero.
A taste test was conducted with a regular beef pizza. Each of 130 individuals was given two
pieces of pizza, one with a whole-wheat crust and the other with a white crust. Each person was
then asked whether she or he preferred whole-wheat or white crust. The results were: 64
preferred whole-wheat to white crust, 52 preferred white to whole-wheat crust, and 14 had no
preference.
118. A blind taster wishes to test whether the whole-wheat crust is preferred to white crust. State
the null and alternative hypotheses.
ANSWER:
ANSWER:
120. Test the hypothesis in question 118 at the 0.05 level of significance using the p-value approach.
ANSWER:
P = p-value = P(z > 1.02) = 0.5000 – 0.3461 = 0.1539. Since P > α = 0.05, we fail to
reject H o . There is not sufficient evidence to indicate that whole-wheat crust is preferred
to white crust.
121. Test the hypothesis in question 118 at the 0.05 level of significance using the classical approach.
ANSWER:
The critical region is z ≥ 1.645 . Since the test statistic z∗ = 1.02 does not fall in the critical
region; we fail to reject H o . We reach the same conclusion as stated in question 120.
Section 14.5
True-False Questions
122. The Mann-Whitney U Test is used to compare two dependent sample means.
ANSWER: F
123. The Mann-Whitney U test is a nonparametric alternative for the t- test for the difference
between two independent means.
124. The calculation of the Mann-Whitney test statistic U is a two-step procedure. We first
determine the sum of the ranks for each of the two samples. Then, using the two sums
of ranks, we calculate a U score. The larger U score is the test statistic.
ANSWER: F
125. The Mann-Whitney test may be carried out by means of a normal approximation using
the standard normal variable z whenever the two sample sizes n1 and n2 are both
greater than 10.
ANSWER: T
126. The sum of U a and U b in the Mann-Whitney U test will always be equal to the product of
the two sample sizes na and nb .
ANSWER: T
127. The sum of U a and U b in the Mann-Whitney U test will always be equal to the sum of the
two sample sizes na and nb .
ANSWER: F
Multiple-Choice Questions
128. Which of the following statements is false regarding the Mann-Whitney U test?
A) It is a nonparametric alternative for the t-test for the difference between two
dependent (matched pairs) means.
B) It is often used in situations in which two independent random samples are drawn
from the same population of subjects but different “treatments” are used on each
test.
C) One of the assumptions of the test is that the random variables are ordinal or
numerical.
D) None of the above.
ANSWER: A
130. Which of the following equations is true regarding U a and U b and the two sample sizes
na and nb in the Mann-Whitney U test?
A) U a + U b = na + nb
B) U a / U b = na / nb
C) U a + U b = na ⋅ nb
D) U a / U b = na − nb
ANSWER: C
Short-Answer Questions
131. Let n1 and n2 be the sample sizes in the Mann-Whitney U test. Let Ra and Rb be the
two rank sums. Give a formula involving n1 and n2 which will always give the sum Ra +
Rb .
ANSWER:
Ra + Rb = ( n1 + n2 ) ⋅ ( n1 + n2 + 1) / 2
132. Suppose the Mann-Whitney U test were used to test a two-tailed alternative hypothesis
at α = 0.05. If two independent samples (each of size 40) were used, what partitions in
the sum of ranks is necessary in order to reject the null hypothesis?
ANSWER:
133. Briefly discuss the assumptions for inferences about two populations using the Mann-
Whitney U test.
ANSWER:
(a) The two independent random samples are independent within each sample as well
as between samples.
(b) The random variables are ordinal or numerical.
ANSWER:
The t-test for the difference between two independent means is the parametric test
procedure that is equivalent to Mann-Whitney U test.
135. State the null hypothesis H o , and the alternative hypothesis, H a , that would be used to
test the following statement: “The cholesterol level for group A is lower than for group B”.
ANSWER:
H o : The average cholesterol level is the same for both groups A and B.
H a : The average cholesterol level for group A is lower than for group B.
136. State the null hypothesis H o , and the alternative hypothesis, H a , that would be used to
test the following statement: “The average test score is not the same for both male and
female groups”.
ANSWER:
H o : The average test score is the same for both male and female groups.
H a : The average test score is not the same for male and female groups.
(a) We first determine the sum of the ranks for each of the two samples A and B. Then,
using the two sums of ranks, we calculate via a pair of formulas a U score for each
sample, U a and U b respectively.
(b) The test statistic is the smaller of U a and U b .
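The two-step procedure described above can be sketched in Python. This is a minimal illustration, not the textbook's own code: the function name mann_whitney_u is hypothetical, and the formulas are the ones used in the worked answers of this section (U a = na ⋅ nb + nb(nb + 1)/2 − Rb, and likewise for U b). The rank sums are those given in question 142.

```python
def mann_whitney_u(n_a, n_b, rank_sum_a, rank_sum_b):
    """Return (U_a, U_b, U_star) from the sample sizes and the two rank sums."""
    total = n_a + n_b
    # Sanity check: the ranks 1..N always add up to N(N + 1)/2.
    if abs((rank_sum_a + rank_sum_b) - total * (total + 1) / 2) > 1e-9:
        raise ValueError("rank sums are inconsistent with the sample sizes")
    u_a = n_a * n_b + n_b * (n_b + 1) / 2 - rank_sum_b
    u_b = n_a * n_b + n_a * (n_a + 1) / 2 - rank_sum_a
    # The test statistic is the smaller of the two U scores.
    return u_a, u_b, min(u_a, u_b)


# Question 142: 13 plots of variety 1 (rank sum 208.5), 12 of variety 2 (rank sum 116.5).
u_a, u_b, u_star = mann_whitney_u(13, 12, 208.5, 116.5)
print(u_a, u_b, u_star)  # 117.5 38.5 38.5
```

Note that U a + U b = 156 = (13)(12), as stated in question 126.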
138. What characteristic of the data used in a parametric test is not part of the data when
using the Mann-Whitney U test?
ANSWER:
The actual size of the data is not used, only its rank.
139. State the null hypothesis H o , and the alternative hypothesis H a , that would be used to
test the following statement: “There is a difference in the value of the variable between
the two professional groups of people”.
ANSWER:
140. State the null hypothesis H o , and the alternative hypothesis, H a , that would be used to
test the following statement: “The blood pressure for group A is higher than for group B”.
ANSWER:
H o : The average blood pressure is the same for both groups A and B.
H a : The average blood pressure for group A is higher than for group B.
ANSWER:
142. The yield of two different varieties of citrus trees is being compared. Twenty-five similar
plots are available on an experimental farm. Variety 1 was planted on 13 randomly
selected plots and variety 2 was planted on the remaining plots. Two years later the
yields from the 25 trees were recorded. The yields were jointly ranked and it was found that
the sum of ranks for variety 1 was 208.5 and the sum of ranks for variety 2 was 116.5.
Test the null hypothesis that the two varieties give equal yields versus the alternative
that they do not, at α = 0.05. Give the critical region, U*, and the conclusion.
ANSWER:
Since the critical region is: U ≤ 41, and the test statistic is U * = 38.5, we reject the null
hypothesis.
Conclusion: The two varieties don’t give equal yields.
143. A new product was tested in 10 stores. Five stores were randomly selected and the item
was placed at the check-out stand. In the other stores, the item was located in a section
containing similar items. It is of interest to test for no difference in sales versus greater
sales occurring at the checkout location. Using the following data, give the critical region
for α = 0.05, U*, and your conclusion.
ANSWER:
Since the critical region is: U ≤ 4, and the test statistic is U * = 5.5, we fail to reject the null
hypothesis.
Conclusion: No difference in sales; unable to conclude that location affects sales.
144. Two populations were compared by selecting samples of size 20 from each. The Mann-
Whitney U test was selected to test a two-tailed alternative at α = 0.05. If Ra = 380 and
Rb = 440, find z* and give the p-value for the test.
ANSWER:
145. A driver kept track of her gasoline mileage using full tanks of two different brands of
gasoline. The gasoline consumption in miles per gallon for the two brands is shown
below:
Brand A 17 16 18 21 19 20
Brand B 15 18 19 17 20 21 22
Test the claim that the two brands of gasoline result in the same gasoline consumption.
Use the Mann-Whitney U Test with α = 0.05. Give the critical region, U*, and the
conclusion.
ANSWER:
Since the critical region is U ≤ 6, and the test statistic is U * = 11.5; we fail to reject the
null hypothesis.
Conclusion: The two brands result in the same gasoline consumption.
Fertilizer A 36 25 36 27 38 29 40 29 30 30 34 34 34
Fertilizer B 36 18 35 20 32 20 24 26 26 30 30 27
Test for differences in yields at α = 0.05. Give the critical region, the value of U*, and
your conclusion.
ANSWER:
147. The time required to assemble a product part was determined for 25 males and 25
females. Since the data indicated non-normality of the times for the two sexes, the
Mann-Whitney U test was selected to determine whether the assembly times differed for
males and females. Find the sum of ranks for males, Rm , and the sum of ranks for
females, R f , which would be the strongest evidence possible supporting the hypothesis
that females were faster in assembling the product part.
ANSWER:
A study involving 14 adults in the age group 40–45 years, gave the following weight values (in
pounds).
H o : Weight values are the same for both groups of boys and girls.
H a : Weight values are not the same for both groups of boys and girls.
ANSWER:
150 1 G
160 2.5 G
160 2.5 B
170 4.5 G
170 4.5 B
180 6 B
185 7.5 G
185 7.5 B
195 9.5 G
195 9.5 B
200 11.5 G
200 11.5 G
205 13 G
210 14 B
nb = 6, ng = 8
The value of the test statistic is U ∗ = min( U b , U g ) = min (23, 25) = 23.
150. Test the hypothesis in question 148 at the 0.05 level of significance using the p-value
approach.
ANSWER:
151. Test the hypothesis in question 148 at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical region is U ≤ 8 . Since the test statistic does not fall in the critical region; we
fail to reject H o . We reach the same conclusion as stated in question 150.
Aug. 8 5
Sept. 9 8
Oct. 9 6
Nov. 10 7
Jan. 5 7
Feb. 7 8
Mar. 5 5
Apr. 8 6
May 5 6
June 3 2
July 3 4
Aug. 3 2
152. Convert the table to a table of ranks of the on-time arrivals (A) and baggage handling (B)
for United.
ANSWER:
2 1.5 B 6 14 B
2 1.5 B 6 14 B
3 4 A 7 17.5 B
3 4 A 7 17.5 A
3 4 A 7 17.5 B
4 6 B 7 17.5 A
5 9.5 B 8 21.5 A
5 9.5 B 8 21.5 B
5 9.5 A 8 21.5 B
5 9.5 A 8 21.5 A
5 9.5 B 9 24.5 A
6 14 B 10 26 A
153. If you wish to test the hypothesis that baggage handling obtained higher ratings than on-
time arrivals during the period, state the null and alternative hypotheses.
ANSWER:
H o : Baggage handling scores are not higher than on-time arrivals.
ANSWER:
na = 13, nb = 13 ; Ra = 4 + 4 + 4 + 9.5 + 9.5 + 9.5 + 17.5 + 17.5 + 21.5 + 21.5 + 24.5 + 24.5 + 26 = 193.5
Rb = 1.5 + 1.5 + 6 + 9.5 + 9.5 + 9.5 + 14 + 14 + 14 + 17.5 + 17.5 + 21.5 + 21.5 = 157.5 ; Then,
U a = na ⋅ nb + [(nb )(nb + 1) / 2] − Rb = (13)(13) + [(13)(14)/2] – 157.5 = 102.5
The value of the test statistic is U ∗ = min( U a , U b ) = min (102.5, 66.5) = 66.5.
155. Test the hypothesis in question 153 at the 0.05 level of significance using the p-value
approach.
ANSWER:
P = p-value = P(U < 66.5) . Using the table of critical values of U in the Mann-Whitney
test, we get: P > 0.05. Since P > α = 0.05, we fail to reject H o . There is not sufficient
evidence to indicate that United baggage handling obtained higher ratings than
on-time arrivals at the 0.05 level of significance.
156. Test the hypothesis in question 153 at the 0.05 level of significance using the classical
approach.
Group 1 80 88 65 94 82 97 93 95 60 75
Group 2 82 97 95 90 77 64 70 97 95 84
157. If you wish to test the claim that a computer-assisted approach produces higher
achievement (as measured by final exam scores) in biology courses than does a lecture
approach, what are the null and alternative hypotheses?
ANSWER:
60 1 1
64 2 2
65 3 1
70 4 2
75 5 1
77 6 2
80 7 1
82 8.5 1
82 8.5 2
84 10 2
88 11 1
90 12 2
93 13 1
94 14 1
95 16 1
95 16 2
95 16 2
97 19 1
97 19 2
97 19 2
159. Test the hypothesis in question 157 at the .05 level of significance using the p-value
approach.
ANSWER:
P = p-value = P (U < 42.5) . Using the table of critical values of U in the Mann-Whitney
test, we get P > 0.05. Since P > α = 0.05 , we fail to reject H o . The evidence does not
allow us to conclude that the computer assisted instruction approach produced higher
achievement scores.
160. Test the hypothesis in question 157 at the .05 level of significance using the classical
approach.
ANSWER:
The critical region is U ≤ 27. Since the test statistic does not fall in the critical region, we
fail to reject H o . We reach the same conclusion as stated in question 159.
161. In a Mann-Whitney U test, suppose all 10 data values of sample “A” come before the
smallest of the 10 data values in sample “B” when they are ranked together. Calculate
the value U* of the test statistic.
ANSWER:
Ra = 55 and Rb = 155. Then,
U a = na ⋅ nb + [(nb)(nb + 1) / 2] − Rb = (10)(10) + [(10)(10 + 1) / 2] − 155 = 0, and
U b = na ⋅ nb + [(na)(na + 1) / 2] − Ra = (10)(10) + [(10)(10 + 1) / 2] − 55 = 100.
Therefore, U ∗ = min( U a , U b ) = min (0, 100) = 0.
ANSWER:
Ra = Rb = 68. Therefore, U a = U b = (8)(8) + [(8)(8 + 1) / 2] − 68 = 32
163. Determine the p-value when testing H o : Average score for group A = Average score for
group B vs. H a : Average score for group A > Average score for group B , given that na =
15, nb = 15 and U = 80.
ANSWER:
164. Determine the p-value when testing H o : Average weight for group A = Average weight
for group B vs. H a : Average weight for group A ≠ Average weight for group B, given that
na = 9, nb = 10, and U = 22.
ANSWER:
165. Determine the p-value when testing H o : The average height is the same for both groups
A and B. vs. H a : Group A average heights are less than those for group B, given that
with na = 50, nb = 45, and z = -2.18.
ANSWER:
ANSWER:
167. Use the classical method to determine the critical region that would be used to test
H o : The average score is the same for both groups A and B vs. H a : Group A average
scores are less than those for group B, for an experiment involving two independent
samples, given that na = 78, nb = 45, and α = 0.05.
ANSWER:
Pulse rates were recorded for 16 men and 13 women. The results are shown below:
Males 62 74 59 65 71 65 73 61 66 81 56 73 57 57 75 66
Females 82 57 69 55 75 63 79 67 77 107 75 69 96
Assume a doctor wishes to test the hypothesis that the distribution of pulse rates differs for men
and women.
ANSWER:
H o : Average pulse rates are the same for both males and females.
H a : Average pulse rates are not the same for males and females.
ANSWER:
ANSWER:
U a = na ⋅ nb + [(nb)(nb + 1) / 2] − Rb = (16)(13) + [(13)(14) / 2] − 236 = 63, and
U b = na ⋅ nb + [(na)(na + 1) / 2] − Ra = (16)(13) + [(16)(17) / 2] − 199 = 145.
171. Test the hypotheses in question 168 at the 0.05 level of significance using the p-value
approach.
ANSWER:
Using the table of “critical values of U in the Mann-Whitney test”, available in your
textbook, we get P = p-value = P(U ≤ 63) ⇒ 0.05 < P < 0.10.
Since P > α = 0.05, we fail to reject H o . There is no significant evidence to indicate that
the average pulse rates are not the same for males and females.
172. Test the hypotheses in question 168 at the 0.05 level of significance using the classical
approach.
The critical region is U ≤ 59. Since U ∗ = 63 does not fall in the rejection region, we fail to
reject H o . We reach the same conclusion as stated in question 171.
173. Approximate the distribution of the test statistic identified in question 169 using the
normal distribution, and calculate the value of the standardized test statistic z * .
ANSWER:
174. Calculate the p-value associated with the test statistic z * in question 173 and use it to
test the hypotheses in question 168 at the 0.05 level of significance.
ANSWER:
Since P > α = 0.05, we fail to reject H o . We reach the same conclusion as stated in
question 171.
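The normal approximation used in questions 173 and 174 can be sketched as follows. The sketch assumes the usual large-sample formulas, mean na ⋅ nb / 2 and standard deviation sqrt(na ⋅ nb ⋅ (na + nb + 1) / 12), with no correction for ties; the function names are hypothetical. It uses U ∗ = 63 with na = 16 and nb = 13 from the pulse-rate data.

```python
import math


def mann_whitney_z(u, n_a, n_b):
    """Standardize U with the large-sample mean and standard deviation (no tie correction)."""
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (u - mean_u) / sd_u


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


z_star = mann_whitney_z(63, 16, 13)
# Two-tailed alternative: double the tail area beyond |z*|.
p_value = 2 * normal_cdf(-abs(z_star))
print(round(z_star, 2), round(p_value, 3))
```

Under these assumptions the p-value falls between 0.05 and 0.10, which agrees with the table-based bounds given in question 171.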
A study involving 8 obese boys and 8 obese girls gave the following total-cholesterol values.
Obese Boys 186 199 165 205 175 177 210 195
Obese Girls 168 192 171 188 193 155 145 200
Suppose you wish to test the hypothesis that the total-cholesterol values differ for the two
groups.
ANSWER:
H o : Total cholesterol values are the same for both obese boys and girls.
H a : Total cholesterol values are not the same for obese boys and girls.
176. Identify the test statistic to be used in testing the hypotheses in question 175.
ANSWER:
ANSWER:
U a = na ⋅ nb + [(nb)(nb + 1) / 2] − Rb = (8)(8) + [(8)(9) / 2] − 56 = 44, and
U b = na ⋅ nb + [(na)(na + 1) / 2] − Ra = (8)(8) + [(8)(9) / 2] − 80 = 20
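The rank sums Ra = 80 and Rb = 56 used in this answer can be reproduced from the raw cholesterol values. The following is a minimal sketch; the helper average_ranks is a hypothetical name, and it gives tied values the mean of the ranks they occupy, as in the ranking tables elsewhere in this section.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


boys = [186, 199, 165, 205, 175, 177, 210, 195]
girls = [168, 192, 171, 188, 193, 155, 145, 200]
ranks = average_ranks(boys + girls)  # rank the two samples jointly
r_a = sum(ranks[:len(boys)])
r_b = sum(ranks[len(boys):])
print(r_a, r_b)  # 80.0 56.0
```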
178. Test the hypotheses in question 175 at the 0.05 level of significance using the p-value
approach.
ANSWER:
Using the table of “critical values of U in the Mann-Whitney test”, available in your
textbook, we get P = p-value = P(U ≤ 20) ⇒ P > 0.10.
179. Test the hypotheses in question 175 at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical region is U ≤ 13. Since U ∗ = 20 does not fall in the rejection region, we fail to
reject H o . We reach the same conclusion as stated in question 178.
Section 14.5
True-False Questions
180. The Runs test is most frequently used to test the randomness (or lack of randomness) of data.
ANSWER: T
ANSWER: F
182. The hypothesis test about randomness in the Runs test will be rejected when there are
too few or too many runs.
ANSWER: T
183. The Runs test may be carried out by means of a normal approximation using the
standard normal z variable whenever the two sample sizes n1 and n2 are both smaller
than 20 or when the level of significance α is other than 0.025.
ANSWER: F
ANSWER: F
185. To complete the hypothesis test about randomness, when n1 and n2 are larger than 20
or when α is other than 0.05, we will use z, the standard normal random variable.
ANSWER: T
Multiple-Choice Questions
A) 6.
B) 7.
C) 8.
D) 9.
ANSWER: C
187. The following data were collected to determine whether the data points form a random
sequence with regard to being above or below the median value.
13 13 14 15 17 18 18 19
20 21 22 24 24 24 24 27
188. Which of the following statements is false regarding the runs test?
A) It is used most frequently to test the randomness of data (or lack of randomness).
B) Its test statistic is V, the number of runs observed.
C) It is generally a two-tailed test.
D) None of the above.
ANSWER: D
189. Which of the following statements is false when testing for randomness?
Short-Answer Questions
190. What is the assumption for inferences about randomness using the Runs test?
ANSWER:
The assumption is that each data value in the sample can be classified into one of two
categories (e.g., male or female).
191. In testing the randomness of data using the Runs test, when do we reject the null
hypothesis?
ANSWER:
State the null hypothesis and the alternative hypotheses that would be used to test the following
statements:
ANSWER:
193. Cars passing a toll booth were classified as either foreign or domestic. The sequence of
foreign or domestic is not random.
ANSWER:
194. The values in a set of data were replaced by the symbol “A” if the number was above the
median and by the symbol “B” if the number was below the median. The following
sequence was obtained.
A B B A B A B B A A A
Give the critical region using α = 0.05, the computed test statistic and the conclusion for
testing that the sequence is random versus it is not.
ANSWER:
Critical region: V ≤ 6 or V ≥ 17, and test statistic is V* =13; fail to reject the null
hypothesis.
Conclusion: The sequence is random.
2 5 6 1 1 2 2 3 4 6
4 3 1 2 5 1 2 3 6 5
4 6 4 3 1
Give the critical region for α = 0.05, the value of the test statistic V*, and the conclusion
for determining whether or not this sequence of odd and even numbers is random.
ANSWER:
n1 =12, n2 =13; Critical regions are V ≤ 8 or V ≥ 19, Test statistic V* =16; fail to reject the
null hypothesis of randomness.
196. Thirty cars which enter a vehicle inspection station are monitored and it is noted whether
they pass (P) or fail (F) the inspection. The following sequence was recorded.
P, P, P, P, P, F, F, F, P, P, P, P, P, P, P, F, F, F, F, P, P, P, P, P, P, F, F, P, P, F
Test at α = 0.05 to determine if the sequence is random. Give the critical region, V*,
and the conclusion.
n p = 20, n f =10; Critical regions are: V ≤ 9 or V ≥ 20, Test statistic is V* = 8; reject the
null hypothesis of randomness. Conclusion: The number of runs is unusually small.
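The counts in this answer can be checked with a short Python sketch. The helper count_runs is a hypothetical name; it counts maximal blocks of identical adjacent symbols, which is the definition of a run.

```python
from itertools import groupby


def count_runs(sequence):
    """Count maximal blocks of identical adjacent symbols."""
    return sum(1 for _ in groupby(sequence))


# Sequence from question 196, copied as printed.
sequence = ("P, P, P, P, P, F, F, F, P, P, P, P, P, P, P, "
            "F, F, F, F, P, P, P, P, P, P, F, F, P, P, F").split(", ")
n_pass = sequence.count("P")
n_fail = sequence.count("F")
v_star = count_runs(sequence)
print(n_pass, n_fail, v_star)  # 20 10 8
```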
197. The following are the number of defective pieces turned out by a machine during 20
consecutive shifts:
10 12 13 12 15 16 17 17 18 10
10 14 17 17 17 12 12 16 16 16.
Test the null hypothesis that the numbers in the sample form a random sequence with
respect to the two properties “above” and “below” the median value, versus the
alternative hypothesis that the sequence is not random. Use α = 0.05. Give the critical
region, V*, and the conclusion.
ANSWER:
Critical regions are: V ≤ 6 or V ≥ 16, and test statistic is V* = 6; reject the null hypothesis
of randomness.
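A sketch of how V* = 6 arises for these data: classify each value as above (a) or below (b) the median, then count the runs. The function name is hypothetical, and dropping values equal to the median is an assumption of this sketch (none occur here, since the median is 15.5).

```python
from itertools import groupby
from statistics import median


def runs_about_median(data):
    """Return (n_above, n_below, runs) for a sequence classified about its median."""
    m = median(data)
    # Assumption: values exactly equal to the median are dropped.
    labels = ["a" if x > m else "b" for x in data if x != m]
    runs = sum(1 for _ in groupby(labels))
    return labels.count("a"), labels.count("b"), runs


defects = [10, 12, 13, 12, 15, 16, 17, 17, 18, 10,
           10, 14, 17, 17, 17, 12, 12, 16, 16, 16]
print(runs_about_median(defects))  # (10, 10, 6)
```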
198. A product part is routinely selected from a production line and classified as either
defective or non-defective. For the last 100 selected parts, 90 have been non-defective
and 10 have been defective. If the defectives and non-defectives are displayed as a
sequence of N’s and D’s, how many runs would you expect to see if their occurrences
are random?
ANSWER:
199. A sequence of inspected items consists of 15 defectives and 105 non-defectives. If there
are 11 runs in the sequence, find z * and give the p-value for testing the alternative
hypothesis that the sequence is not random.
200. A sequence consists of 30 zeros and 20 ones. How many runs would there need to be to
reject H o : the sequence of 0’s and 1’s is random in favor of H a : the sequence is non-
random and there are more runs than should occur. Use α = 0.05.
ANSWER:
31 or more runs
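The cutoff of 31 runs can be reproduced with the normal approximation. This sketch assumes the usual formulas for the mean and standard deviation of the number of runs V, namely 2 ⋅ n1 ⋅ n2 / (n1 + n2) + 1 and sqrt(2 ⋅ n1 ⋅ n2 ⋅ (2 ⋅ n1 ⋅ n2 − n1 − n2) / ((n1 + n2)² (n1 + n2 − 1))); the function names are hypothetical.

```python
import math
from statistics import NormalDist


def runs_mean_sd(n1, n2):
    """Mean and standard deviation of the number of runs V under randomness."""
    n = n1 + n2
    mean_v = 2 * n1 * n2 / n + 1
    var_v = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return mean_v, math.sqrt(var_v)


def min_runs_upper_tail(n1, n2, alpha):
    """Smallest V whose z score reaches the one-tailed critical value z(alpha)."""
    mean_v, sd_v = runs_mean_sd(n1, n2)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return math.ceil(mean_v + z_crit * sd_v)


print(runs_mean_sd(30, 20)[0])            # 25.0
print(min_runs_upper_tail(30, 20, 0.05))  # 31
```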
201. A sequence consists of 15 items above the median (a) and 15 items below the median
(b). We are testing at α = 0.05 the following hypotheses:
ANSWER:
In order to fail to reject H o , the number of runs must be 9, 10, 11, ..., 19, 20, or 21.
202. If you wish to test the student’s claim that the results reported are random, state the null and
alternative hypotheses.
ANSWER:
ANSWER:
Since n( H ) = 13, n(T ) = 12 , and there are 21 runs, then, the value of the test statistic is:
V* = 21.
204. Test the hypothesis in question 202 at the 5% level of significance using the p-value
approach.
ANSWER:
Using the table of critical values for total number of runs (V); P < 0.05. Since P < α ;
reject H o .
205. Test the hypothesis in question 202 at the 5% level of significance using the classical
approach.
ANSWER:
The critical regions are: V ≤ 8 or V ≥ 19 ; The test statistic falls in the critical region;
therefore we reject H o . There is sufficient evidence to indicate that the results are not
randomly ordered.
Minutes: 7 2 4 10 11 11 3 6 6 7 13 4 8 9 10 5 6 9 12 15
ANSWER:
ANSWER:
208. Test the hypothesis in question 206 at the 0.05 level of significance using the p-value
approach.
ANSWER:
Using the table of critical values for total number of runs (V), we get P > 0.05. Since P >
α = 0.05, we fail to reject the null hypothesis. There is not sufficient evidence to conclude
that there is an increase in wait time.
209. Test the hypothesis in question 206 at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical regions are V ≤ 3 or V ≥ 10 ; The test statistic is not in the critical region;
therefore we fail to reject H o . We reach the same conclusion as stated in question 208.
ANSWER:
H a : The data did not occur in a random order about the median.
211. State the null hypothesis H o , and the alternative hypothesis, H a that would be used to
test the following statement: “The sequence of head and tail is not random”.
ANSWER:
212. State the null hypothesis H o , and the alternative hypothesis H a that would be used to
test the following statement: “The gender of students entering a college library was
recorded; the entry is not random in order.”
ANSWER:
213. What aspect of randomness will be tested using the Runs test?
ANSWER:
The Runs test will test the order, or sequence, of occurrence for the numbers generated.
ANSWER:
When testing for randomness, it is the null hypothesis that states random, thereby
making the “fail to reject” decision the desired outcome. The probability associated with
that result is 1 – α , not the level of significance, and 1 – α is known as the level of
confidence.
215. Determine the p-value that would be used to complete the hypothesis test for the
following Runs test:
H o : The sequence of gender of students coming into the university gym was random
H a : The sequence was not random; with n(A) = 10, n(B) = 12, and V ∗ = 9.
ANSWER:
We use the “critical values for total number of runs V “ table, available in your textbook,
to place bounds on the p-value. In this case, larger of n(A) and n(B) = 12, and smaller of
n(A) and n(B) = 10. Then, P = p-value = 2 ⋅ P( V ≥ 9 | n(B) = 12 and n(A) = 10) ⇒ P < 0.05.
216. Determine the p-value that would be used to complete the hypothesis test for the
following runs test:
H o : The new home cost prices collected occurred in random order above and below the
median.
H a : The new home cost prices did not occur in random order with z = 1.52.
ANSWER:
H o : The results collected occurred in random order above and below the median.
H a : The results were not random; with n(A) = 10, n(B) = 16, and α = 0.05.
ANSWER:
Critical regions: V ≤ 8 or V ≥ 19
218. Determine the critical values that would be used to complete these hypothesis tests for
the following runs tests using the classical approach.
= 0.05.
ANSWER:
219. My youngest daughter, Jessica, did not feel she was playing a game with a fair coin. She
felt that if the coin was fair, the tossing of the coin should result in a random order of
head and tail output. She performed her experiment 20 times. After each toss, Jessica
recorded the results. The following data were reported (H= head, T= tail).
Use the runs test at the 0.05 level of significance to test Jessica’s claim that the results
reported are random. Use the p-value approach.
H a : The heads / tails sequence is not of random order; n(H) = 11, n(T) = 9, and V ∗ = 10.
We use the table of “critical values for total number of runs V “, available in your
textbook, to place bounds on the p-value. In this case, larger of n(H) and n(T) = 11, and
smaller of n(H) and n(T) = 9. Then, P = p-value = 2 ⋅ P( V ≥ 10 | n(H) = 11 and n(T) = 9) ⇒ P
> 0.05.
Since P > α = 0.05, we fail to reject H o . There is not sufficient evidence to indicate that
the heads / tails sequence is not of random order. In other words, we cannot reject
Jessica’s claim that the results reported are random.
The office of human resources at Western Michigan University recorded the gender of the last
35 individuals hired (M = male, F = female) as shown below:
ANSWER:
221. At the α = 0.05 level of significance, are we correct in concluding that this sequence is
not random? Test using the p-value approach.
ANSWER:
222. At the α = 0.05 level of significance, are we correct in concluding that this sequence is
not random? Test using the classical approach.
ANSWER:
Critical regions: V ≤ 12 or V ≥ 25
Since V ∗ = 17 does not fall in the rejection region, we fail to reject H o . We reach the
same conclusion as stated in question 221.
223. Use a computer to verify the above results in questions 221 and 222.
ANSWER:
The number of absences recorded at a large lecture that met at 6 PM Tuesdays and Thursdays
last winter semester were (in order of occurrence) as shown below:
5 17 6 10 18 13 16 20 14 17
6 12 7 18 25 6 7 5 10 19
6 8
ANSWER:
225. State the null and alternative hypotheses that can be used in testing whether these data
show randomness about the median value found in question 224.
ANSWER:
H o : The numbers in the sample form a random sequence about the median value.
226. Test the hypotheses in question 225 at α = 0.05 using the p-value approach.
ANSWER:
P = p-value = 2 ⋅ P( V ≥ 13 | n = 18 and 14) ⇒ P > 0.05. Since P > α = 0.05, we fail to reject
H o . There is not sufficient evidence to indicate that the sequence is not of random order.
227. Test the hypotheses in question 225 at α = 0.05 using the classical approach.
ANSWER:
Critical regions: V ≤ 10 or V ≥ 23
Since V ∗ = 13 does not fall in the rejection region, we fail to reject H o . We reach the
same conclusion as stated in question 226.
228. Are the assumptions of using the normal approximation to complete the hypothesis test
about randomness met in this situation? Discuss.
ANSWER:
The normal approximation is used to complete the hypothesis test about randomness
when n1 and n2 are both larger than 20 or when the level of significance α is other
than 0.05. Neither condition is met in this particular situation.
229. Regardless of your answer to question 228, use a computer and the normal approximation
to test the hypotheses in question 225 at α = 0.05.
Since p-value = 0.171 > α = 0.05, we fail to reject H o . We reach the same conclusion as
stated in question 226.
230. Did you reach the same conclusion in questions 226, 227, and 229?
ANSWER:
Yes; in the three questions, we failed to reject the null hypothesis at α = 0.05.
Section 14.6
True-False Questions
231. The Spearman Rank Correlation coefficient and the Pearson Product Moment always
give the same value.
ANSWER: F
ANSWER: T
ANSWER: F
234. The value of Spearman Rank Correlation coefficient, rs , will range from 0 to 1.
ANSWER: F
235. Spearman’s rank correlation coefficient is an alternative to using the linear correlation
coefficient.
ANSWER: T
236. Charles Spearman developed the rank correlation coefficient in the early 1900’s. It is a
parametric alternative to the linear correlation coefficient (Pearson’s product moment r).
ANSWER: F
237. The alternative hypothesis may be either two-tailed, there is correlation, or one-tailed if
we anticipate either positive or negative correlation.
ANSWER: T
238. When there are only a few ties in either set of the ordered pairs of rankings, the value of
the Spearman rank correlation coefficient ( rs ) is exactly equal to the value of the
Pearson product moment correlation coefficient (r).
ANSWER: F
239. The value of the Spearman rank correlation coefficient, rs , ranges from -1 to +1 and is used in
much the same manner as Pearson’s linear correlation coefficient r is used.
ANSWER: T
A) The Spearman rank correlation test of significance will result in a failure to reject the
null hypothesis when rs is close to zero.
B) The Spearman rank correlation test of significance will result in a rejection of the null
hypothesis when rs is found to be close to +1 or -1.
C) One of the assumptions for inferences about rank correlation is that the variables are
nominal.
D) None of the above.
ANSWER: C
Short-Answer Questions
243. Briefly discuss the assumptions for Inferences about Rank Correlation.
244. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following statement: “There is no relationship between the two rankings”.
ANSWER:
245. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following statement: “There is a positive correlation between the two variables”
246. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following statement: “Age has a decreasing effect on monetary value”
ANSWER:
247. State the null hypothesis, H o , and the alternative hypothesis, H a , that would be used to
test the following statement: “The two variables are unrelated”
ANSWER:
248. Determine the critical value that would be used to test the hypotheses H o : No
correlation versus H a : Negatively correlated for a multinomial experiment, with n = 14
and α = 0.05.
ANSWER:
Aspect
Ranking 1 2 3 4 5 6 7
Managers Ranking 4 3 1 7 2 5 6
Employee Ranking 6 2 3 5 1 4 7
ANSWER:
rs = 0.714
Rank of X 1 2 4 3 7 8 6 5
Rank of Y 2 3 1 4 6 8 5 7
ANSWER:
0.786
ANSWER:
0.786
ANSWER:
The Pearson’s product moment (r) and the Spearman rank correlation coefficient ( rs )
have the same value.
253. Consider the following data, which has the shape of a parabola. Find both the Pearson’s
product moment (r) and the Spearman rank correlation coefficient( rs ).
x 1 2 4 6 10 13 15 20 25 30
ANSWER:
r = 0.963, and rs = 1.
x 2 2 2 4 4 4 4 6 6 6
y 4 3 7 10 10 10 14 18 16 16
254. Rank both variables and then apply the Pearson product moment to the ranks to find the
Spearman rank correlation coefficient.
ANSWER:
0.9585
ANSWER:
0.9606
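The two values above, 0.9585 and 0.9606, can be reproduced from the data given before question 254. The following sketch (function names hypothetical) ranks both variables with average ranks for ties, then computes Pearson's r on the ranks and, separately, the shortcut formula rs = 1 − 6∑d² / [n(n² − 1)]. The second printed answer appears to correspond to the shortcut formula; with this many ties the two results differ slightly.

```python
import math


def average_ranks(values):
    """Average of the first and last 1-based positions of each value in sorted order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def pearson(x, y):
    """Pearson product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_shortcut(rank_x, rank_y):
    """Shortcut formula based on the squared rank differences."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


x = [2, 2, 2, 4, 4, 4, 4, 6, 6, 6]
y = [4, 3, 7, 10, 10, 10, 14, 18, 16, 16]
rx, ry = average_ranks(x), average_ranks(y)
print(round(pearson(rx, ry), 4), round(spearman_shortcut(rx, ry), 4))  # 0.9585 0.9606
```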
256. The weights and gestational ages of seven very low birth weight infants were recorded.
The results were as follows:
Age 25 28 30 27 32 31 27
Use the Spearman rank correlation coefficient ( rs ) to test for positive correlation
between the two variables at α = 0.01.
ANSWER:
Since the critical region is rs ≥ 0.893; and test statistic is rs* = 0.827; we fail to reject the
null hypothesis.
257. The ages for seven mated pairs of California gulls (in years) were recorded with the
following results:
Males: 4 14 10 5 4 8 9
Females: 3 12 10 7 4 6 5
Use the Spearman rank correlation coefficient to test the alternative hypothesis that the
ages are positively correlated at α = 0.05.
Since the critical region is rs ≥ 0.714, and the test statistic is rs* = 0.847; we reject the null
hypothesis.
Conclusion: There is sufficient evidence to conclude that the ages are positively related.
x 1 2 3 4
y 2 9 11 k
ANSWER:
259. Determine the test criteria that would be used to test H a : Variable B decreases as A
increases given n = 18 and α = 0.01 in a Spearman rank correlation experiment.
ANSWER:
Do foods high in fiber tend to have more sodium? The following table was obtained by selecting
11 soups from a list published in a health magazine. The soups were measured on the basis of
both sodium content and fiber:
Soup A B C D E F G H I J K
Sodium 490 840 520 470 500 590 430 300 460 440 400
Fiber 13 1 2 6 4 8 3 5 11 7 10
260. Rank the soups in ascending order on the basis of their sodium content and on
their fiber content, and show your results in a table.
Soup   Sodium rank   Fiber rank     d     d²
A           5             1          4     16
B           1            11        -10    100
C           3            10         -7     49
D           6             6          0      0
E           4             8         -4     16
F           2             4         -2      4
G           9             9          0      0
H          11             7          4     16
I           7             2          5     25
J           8             5          3      9
K          10             3          7     49
261. Compute the Spearman rank order correlation coefficient for the two sets of rankings.
ANSWER:
rs = 1 − 6∑d² / [n(n² − 1)] = 1 − 6(284) / [11(120)] = 1 − 1.291 = −0.291
262. Does higher sodium content accompany foods that are higher in fiber? Test the null
hypothesis that there is no relationship between the fiber and sodium content of the
soups versus the alternative that there is a relationship between them at α = 0.05, using
the p-value approach.
ANSWER:
H o : ρ s = 0 vs. H a : ρ s > 0 and the test statistic is rs * = −0.291 .
Using the table of critical values of Spearman’s rank correlation coefficient we get: P >
0.10. Since P > α = 0.05 , we fail to reject H o . There is not sufficient evidence presented
263. Test the hypothesis stated in question 262 at α = 0.05 using the classical approach.
ANSWER:
The critical region is rs ≥ 0.618 . Since the test statistic rs * does not fall in the critical
region; we fail to reject H o . We reach the same conclusion as stated in question 262.
264. Determine the p-value that would be used to test “ H o : No relationship between the two
variables vs. H a : There is a positive relationship, “ for the Spearman rank correlation
experiment with n = 20 and rs = 0.51.
ANSWER:
265. Determine the p-value that would be used to test “ H o : No correlation vs. H a : There is a
relationship,” for the Spearman rank correlation experiment with n = 25, and rs = 0.35.
ANSWER:
266. Determine the p-value that would be used to test “ H o : Variable A has no effect on
Variable B vs. H a : Variable B decreases as A increases” for the Spearman rank
correlation experiment with n = 15, and rs = 0.66.
ANSWER:
ANSWER:
268. Determine the critical region(s) that would be used to test “ H o : No relationship between
the two variables. vs. H a : There is a relationship,” for the Spearman rank correlation
experiment with n = 15 and α = 0.05.
ANSWER:
269. Determine the critical region(s) that would be used to test ” H o : No correlation vs. H a :
Positively correlated,” for the Spearman rank correlation experiment with n = 24 and
α = 0.05.
ANSWER:
270. Determine the critical region(s) that would be used to test “ H o : Variable A has no effect
on Variable B vs. H a : Variable B decreases as A increases, “for the Spearman rank
correlation experiment with n = 19 and α = 0.01.
ANSWER:
U 3.5 3.1 2.7 3.7 2.5 3.3 3.0 2.9 3.8 3.2 3.6 3.1
G 3.4 3.2 3.0 3.6 3.1 3.4 3.0 3.4 3.7 3.8 3.7 3.0
271. Rank the undergraduate GPA and the graduate GPA for the 12 students, and present
your results in a table.
ANSWER:
Rankings
U 9 5.5 2 11 1 8 4 3 12 7 10 5.5
G 7 5 2 9 4 7 2 7 10.5 12 10.5 2
272. Compute the Spearman rank order correlation coefficient for the two sets of rankings.
ANSWER:
Let d = U – G.
Rankings
U 9 5.5 2 11 1 8 4 3 12 7 10 5.5
G 7 5 2 9 4 7 2 7 10.5 12 10.5 2
di 2 0.5 0 2 -3 1 2 -4 1.5 -5 -0.5 3.5
rs = 1 − 6 ⋅ ∑di² / [n(n² − 1)] = 1 − 6(78) / [12(12² − 1)] = 1 − 0.2727 = 0.7273
ANSWER:
274. Test the hypotheses in question 273 at the 0.05 level of significance using the p-value
approach.
ANSWER:
Since P < α = 0.05, we reject H o . There is sufficient evidence to indicate that a positive
correlation exists between undergraduate GPA and GPA at graduation from a graduate
business program (MBA).
275. Test the hypotheses in question 273 at the 0.05 level of significance using the classical
approach.
ANSWER:
The critical region is rs ≥ 0.497. Since the test statistic rs∗ = 0.7273 falls in the rejection
region, we reject H o . We reach the same conclusion as stated in question 274.