Statistics
(Abstract)
B.Sc Programme in Statistics under Choice based Credit Semester System – Scheme and
Syllabus – implemented with effect from 2009 admission onwards – approved - Orders
issued.
-------------------------------------------------------------------------------------------------------------
GENERAL AND ACADEMIC BRANCH – I ‘J’ SECTION
No. GA. I/J2/2455/06 Dated, Calicut University. P.O., 25.06.2009
-------------------------------------------------------------------------------------------------------------
Read : 1. U.O. No. GAI/J2/3601/08 (Vol. II) dated 19.06.2009.
2. Minutes of meeting of the Board of Studies in Statistics (UG) held on
29.01.2009 and 30.04.2009
3. Item No.2. vii(a) of the minutes of the meeting of the Faculty of Science held
on 05.05.2009.
4. Item No.IIA (8) of the minutes of meeting of the Academic Council held on
14.05.2009.
ORDER
Choice based Credit Semester System and Grading has been introduced for UG
Curriculum in the affiliated colleges of the University with effect from 2009 admission
onwards and the Regulation for the same implemented vide University Order cited 1st
paper above.
Vide paper read as (2), the Board of Studies in Statistics (UG) approved the draft
regulation and the syllabi of B Sc Programme in Statistics prepared as per draft regulation
of Choice based Credit Semester System 2009.
The Faculty of Science vide paper read as 3rd above endorsed the minutes of the
Board of Studies in Statistics (UG).
The Academic Council, vide paper read as 4 above, approved the minutes of the
Faculty of Science.
Sanction has therefore been accorded for implementing the scheme & syllabus of
B.Sc Programme in Statistics under Choice based Credit Semester System from 2009
admission onwards.
Orders are issued accordingly. Syllabus appended.
Sd/-
DEPUTY REGISTRAR (G&A I)
For REGISTRAR
To
The Principals of all affiliated colleges -
offering B.Sc Statistics programme
Forwarded / By order
SECTION OFFICER
SYLLABUS OF B.Sc. STATISTICS MAIN – SEMESTER SYSTEM
CCSSUG 2009 (2009 admission onwards)
*For Practical paper the internal marks are based on the practical records
Semester Course Course Title Instructional Credit Exam Ratio
No. Code Hours/week hours Ext:int
1 ST6B01 Probability Models and Risk Theory 3 2 3 3:1
Table showing the components and weightage for internal assessment
Components Weight
Assignment 1
Test paper 2
Seminar 1
Attendance 1
There shall be two test papers and the average grade point is to be considered for
internal assessment.
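The weighted average implied by the table can be sketched as follows. Only the weights and the averaging of the two test papers come from the text above; the grade points passed in (on a hypothetical 0 to 4 scale) and the function name are illustrative assumptions.

```python
# Weights taken from the internal assessment table.
WEIGHTS = {"assignment": 1, "test_paper": 2, "seminar": 1, "attendance": 1}


def internal_grade_point(assignment, test_papers, seminar, attendance):
    """Weighted average grade point; the two test papers are averaged first."""
    test_avg = sum(test_papers) / len(test_papers)
    points = {
        "assignment": assignment,
        "test_paper": test_avg,
        "seminar": seminar,
        "attendance": attendance,
    }
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[k] * points[k] for k in WEIGHTS) / total_weight


# Hypothetical grade points for one student (assumed values).
example = internal_grade_point(assignment=3, test_papers=[4, 2], seminar=3, attendance=4)
print(example)
```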
There shall be 4 parts A, B, C and D in all the question papers except for course 12,
practical. Part A consists of 12 objective type questions. Part B consists of 8 questions to
be answered in a word, phrase or sentence. Part C consists of 6 questions of short essay
type of which the student can attempt 4. Part D consists of 3 questions of long essay type
of which the student can attempt 2. In Part A the weightage per question is ¼. For Part B the
weightage is 1 per question. For Part C the weightage is 2 per question and for Part D the
weightage is 4 per question. As far as possible the number of questions should be proportional
to the modules.
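As a quick check of the pattern, the total weight of a theory paper can be tallied as below. This is a sketch that takes Part C at weight 2 per question, as the Part C headings of the model papers indicate; the variable names are illustrative.

```python
from fractions import Fraction

# (number of questions to be answered, weight per question) for each part
PARTS = {
    "A": (12, Fraction(1, 4)),
    "B": (8, Fraction(1)),
    "C": (4, Fraction(2)),
    "D": (2, Fraction(4)),
}

part_weights = {part: n * w for part, (n, w) in PARTS.items()}
total_weight = sum(part_weights.values())

for part, weight in part_weights.items():
    print(f"Part {part}: weight {weight}")
print(f"Total weight: {total_weight}")
```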
The practical paper consists of 6 questions and the student can attempt 4. Calculators are
permitted.
The internal assessment for the practicals shall be based on the average grade point of two
practical test papers and the practical record. The test papers shall have weight 1 each and
the record shall have weight 2.
CORE COURSE I: METHODOLOGY OF STATISTICS,
BASIC CALCULUS AND PROBABILITY THEORY
17 hours
Module 3. Probability concepts: Random experiment, sample space, event, classical
definition, axiomatic definition and relative frequency definition of probability.
Concept of probability measure. Addition and multiplication theorem (limited to
three events). Conditional probability and Bayes’ Theorem – numerical problems
15 hours
Model Question Paper
B.Sc. STATISTICS
I Semester
CORE COURSE I: METHODOLOGY OF
STATISTICS, BASIC CALCULUS AND PROBABILITY THEORY
Time: 3 Hrs
PART A
Answer all questions (a bunch of 4 carries weightage 1)
(d)F(x)=1 for every x
12. If f(x) = x, 0 < x < 1, obtain F(x).
(a) $F(x) = x^2$, (b) $F(x) = \frac{x^2}{2}$, (c) $F(x) = x$, (d) $F(x) = 2x^2$
PART B
Answer all questions wt 1
PART C
Answer any four questions wt 2
23. Evaluate $\iint xy(x^2 + y^2)\,dx\,dy$ over [(0,a),(0,b)].
24. State the three axioms of probability.
25. A continuous random variable X has the pdf given by $f(x) = 2x$, $0 < x < 1$, and 0
elsewhere. Find F(x) and P(X < 1/2).
26. Given $f(x) = e^{-x}$, $x \ge 0$, find the pdf of y = -3x + 7.
PART D
Answer any two questions wt 4
Module 1. Mathematical Expectations: Expectation of a random variable, moments,
relation between raw and central moments, moment generating function (mgf) and
15 hours
Module 2. Bivariate random variable: Definition (discrete and continuous type)
Joint probability mass function and probability density function, marginal and
conditional distributions, independence of random variables.
Bivariate moments: Definition of raw and central product moments, conditional
mean and conditional variance, covariance, correlation and regression coefficients.
Mean and variance of a random variable in terms of conditional mean and
conditional variance
20 hours
3. Mood A.M., Graybill F.A. and Boes D.C.: Introduction to the Theory of
Statistics, McGraw Hill
4. John E. Freund: Mathematical Statistics (Sixth Edition), Pearson Education
(India), New Delhi.
5. If x is a random variable having probability function f(x), then the function
$\sum e^{tx} f(x)$ is known as
a. moment generating function
b. probability generating function
c. probability distribution function
d. characteristic function
a) ¼ b) 2/4 c) 4 d) 2
8. X is normally distributed with zero mean and unit variance. The variance of
$X^2$ is
a) 0 b) 1 c) 2 d) 4
9. In a normal curve the area to the right of the point x1 is 0.6 and to the left of the
point x2 is 0.7. Which is the correct statement?
a) x1 > x2 b) x1 < x2 c) x1 = x2 d) none of them
10. For a normal distribution, Q.D., M.D. and S.D. are in the ratio
a) $\frac{4}{5} : \frac{2}{3} : 1$ b) $\frac{2}{3} : \frac{4}{5} : 1$ c) $1 : \frac{4}{5} : \frac{2}{3}$ d) $\frac{1}{2} : 1 : \frac{4}{5}$
11. If x is a continuous r.v. with mean µ and variance σ², then for any positive
number k, $P[|x - \mu| > k\sigma] \le \frac{1}{k^2}$ is known as
a) Liapunov’s inequality b) Tchebycheff’s inequality
c) Bienayme-Tchebycheff’s inequality d) Khinchin’s inequality
12.If x and y are two random variables such that their expectations exist and
P(x ≤y) =1 then
a) E(x) ≤E (y) b) E (x) >E (y)
c. E (x) = E (y) d) None of the above
15.Name the discrete distribution for which mean and variance have the same
value.
16. What is the third moment about the mean of a Poisson distribution if the
second moment about the origin is 12?
17. Identify the distribution (using the uniqueness property) if the moment
generating function of the distribution
is $M_X(t) = (1 + e^t)^5/32$.
18. State the additive property of Binomial distribution.
19. Write down the pdf of the exponential distribution and write down its first
raw moments.
20. What are the points of inflexion of a normal curve N(µ,σ).
Part C
(Answer any 4 questions) Weight 2
Part D
(Answer any 2 questions) Weight 4
CORE COURSE III: STATISTICAL INFERENCE – I
20 hours
Module 4. Interval Estimation: Large sample confidence intervals for mean,
equality of means, proportions, equality of proportions. Derivation of exact
confidence intervals for means, equality of means, variance and ratio of
variances based on Normal, t, chi-square and F distributions
15 hours
Books for reference
Model Question Paper
Part A
Answer all questions; a bunch of 4 questions carries weight 1
1. The mean of a Chi – square distribution with n degrees of freedom is
( a ) 2n ( b ) n 2 ( c ) n (d ) n
2. The relation between Student’s t and F distributions is
(a) $t_{(n)}^2 = F_{(n,1)}$ (b) $t_{(n)}^2 = F_{(1,n)}$ (c) $t_{(1)}^2 = F_{(1,n)}$ (d) $t_{(n)}^2 = F_{(1,1)}$
3. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$, then the
distribution of $\frac{\sum (x_i - \bar{x})^2}{\sigma^2}$ is
(a) $\chi^2_{(n)}$ (b) $t_{(n)}$ (c) $\chi^2_{(n-1)}$ (d) $t_{(n-1)}$
4. If $s^2 = \frac{1}{n}\sum (x_i - \bar{x})^2$, the unbiased estimator for the population variance $\sigma^2$ is
(a) $\frac{1}{n-1}s^2$ (b) $\frac{1}{n}s^2$ (c) $\frac{n}{n-1}s^2$ (d) $\frac{n-1}{n}s^2$
5. If T is a consistent estimator of θ then
(a) T is a consistent estimator of $\theta^2$ (b) $T^2$ is a consistent estimator of θ
(c) $T^2$ is a consistent estimator of $\theta^2$ (d) None of the above
6. Let $X_1, X_2, \ldots, X_n$ be a random sample from a Bernoulli population. A sufficient
statistic for p is
(a) $\sum X_i$ (b) $\prod X_i$ (c) $\max(X_1, X_2, \ldots, X_n)$ (d) $\min(X_1, X_2, \ldots, X_n)$
known σ 2
single observation from the population $f(x, \theta) = \theta e^{-\theta x}$, $x > 0$, then the value of type I
error is
(a) $e$ (b) $e^2$ (c) $e^{-2}$ (d) $e^{-1}$
Part B
Answer all questions; each question carries weightage 1
13. Let $X_1, X_2$ be a random sample of size 2 from $N(0,1)$. Then the distribution of
$\frac{(X_1 + X_2)^2}{(X_1 - X_2)^2}$ is -------------
15. Let $X_1, X_2, X_3$ be a random sample of size 3 from $N(\mu, \sigma^2)$. The efficiency of
$\frac{X_1 + 2X_2 + X_3}{4}$ relative to $\frac{X_1 + X_2 + X_3}{3}$ is ------------
16. Let $X_1, X_2, \ldots, X_n$ be a random sample from the population with pdf
$f(x, \theta) = \frac{1}{2} e^{-|x - \theta|}$. The m.l.e. of θ is ---------
17. The diameter of a cylindrical rod is assumed to be normally distributed with a variance
of 0.04 cm². A sample of 25 rods has a mean diameter of 4.5 cm. The 95% confidence interval
for the population mean is -----------
18.The power of a test is ----------
19.Degrees of freedom for chi-square in case of contingency table of order 4x3 is ---
20. In tossing a coin, let the probability of a head turning up be p. The hypotheses are
$H_0: p = 0.4$ against $H_1: p = 0.6$. H0 is rejected if there are five or more heads in six
tosses. Then the probability of type I error is ----------
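For illustration, the type I error in question 20 is a binomial tail probability under H0. The sketch below (helper name `binom_tail` is an illustrative assumption) also evaluates the same tail under H1, which is the power of the test.

```python
from math import comb


def binom_tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Reject H0 when there are 5 or more heads in 6 tosses.
alpha = binom_tail(6, 5, 0.4)  # probability of rejecting H0 when p = 0.4
power = binom_tail(6, 5, 0.6)  # probability of rejecting H0 when p = 0.6

print(f"type I error = {alpha:.5f}")
print(f"power        = {power:.5f}")
```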
Part C
Answer any 4 questions; each question carries a weightage of 2
21.Obtain the distribution of the sample mean of a random sample X 1 , X 2 ,..., X n of size n
from N ( µ , σ 2 ) .
$B(1, p)$. Let $T = \sum X_i$.
Show that $\frac{T(T-1)}{n(n-1)}$ is an unbiased estimator of $p^2$.
23.Define sufficient statistic. Let X 1 , X 2 ,..., X n be a random sample of size n from
24. An oil company claims that less than 20% of all car owners have not tried its gasoline.
Test this claim at the 0.01 level of significance if a random check reveals that 22 out of
200 car owners have not tried the oil company’s gasoline.
25. In the comparison of two kinds of paint, a consumer testing service finds that four
1-gallon cans of one brand cover on the average 546 square feet with a standard deviation of
31 square feet, whereas four 1-gallon cans of another brand cover on the average 492
square feet with a standard deviation of 26 square feet. Assuming that the two populations
sampled are normal and have equal variance, test the hypothesis that on the average the
first kind of paint covers a greater area than the second.
26. Mention the advantages of non-parametric tests over parametric test.
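For illustration, question 24 above can be approached with a large sample test for a proportion. The sketch assumes H0: p = 0.20 against H1: p < 0.20 (the company's claim taken as the alternative) and uses the normal approximation; the helper `normal_cdf` is an illustrative name.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n, x, p0 = 200, 22, 0.20       # sample size, owners who have not tried it, H0 value
p_hat = x / n                  # sample proportion
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = normal_cdf(z)        # left-tailed test
reject_h0 = p_value < 0.01

print(f"z = {z:.3f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```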
Part D
Answer any 2 questions; each question carries weightage 4
27 Let X 1 , X 2 ,..., X n be a random sample of size n from N ( µ , σ 2 ) . Find the mle’s of
CORE COURSE IV: STATISTICAL INFERENCE – 2
1. Module 1. Testing of Hypotheses; concept of testing hypotheses, simple and
composite hypotheses, null and alternative hypotheses, type I and type II
errors, critical region, level of significance and power of a test, most
powerful test, Neyman Pearson theorem and its simple applications. Concept
of p value
35 hours
2. Module 2. Large sample tests concerning mean, equality of means,
proportions, equality of proportions. Small sample tests based on t
distribution for mean, equality of means and paired mean for paired data.
Tests based on F distribution for ratio of variances. Test based on chi-square
distribution for variance, goodness of fit and for independence of attributes
and homogeneity of proportions. Test for correlation coefficients- Z
transformation
35 hours
Module 3. Non parametric tests: Basic idea of distribution free method.
Kolmogorov Smirnov test-one sample and two sample sign tests. Wilcoxon
matched pairs signed rank test- Kruskal Wallis test and test for randomness
(run test).
20 hours
Books for reference
Model Question Paper
Time: 3 Hrs
Part A
(Answer all questions)
(Contains 12 questions, 4 questions carry a weightage of 1)
1. In a chi-square contingency table with 3 rows and 5 columns, the d.f of chi-square
statistic is
a) 15
b) 24
c) 8
d) 7
2. The chi-square test statistic for a goodness of fit test is given by:
a) $\frac{O_i - E_i}{E_i}$
b) $\sum \frac{O_i - E_i}{E_i^2}$
c) $\sum \frac{(O_i - E_i)^2}{E_i^2}$
d) $\sum \frac{(O_i - E_i)^2}{E_i}$
3. In a Poisson goodness of fit test having ‘k’ sets of observed frequencies with estimated
value of λ , the chi-square statistic has d.f.
a) k-2
b) k
c) k-1
d) k-2
6. The test used to check the randomness of the collected set of symbols is:
a) Sign test
b) Rank sum test
c) Signed rank test
d) Run test
7 When there are 3 groups, each following normal distribution, and the null hypothesis is
concerned with the equality of means the test used is:
a) Chi square test
b) t-test for equality of means
c) Analysis of variance
d) none of the above
( a ) 2n ( b ) n 2 ( c ) n (d ) n
9. The relation between Student’s t and F distributions is
(a) $t_{(n)}^2 = F_{(n,1)}$ (b) $t_{(n)}^2 = F_{(1,n)}$ (c) $t_{(1)}^2 = F_{(1,n)}$ (d) $t_{(n)}^2 = F_{(1,1)}$
12. If X > 1 is the critical region for testing $H_0: \theta = 2$ against $H_1: \theta = 1$ on the basis of a
single observation from the population $f(x, \theta) = \theta e^{-\theta x}$, $x > 0$, then the value of type I
error is
(a) $e$ (b) $e^2$ (c) $e^{-2}$ (d) $e^{-1}$
13. In chi-square test of independences of 2 attributes with 2 observations each, the d.f of
the test statistic is 1.
14 In the case of sign test, the test statistic follows a binomial distribution.
15 In χ 2 test of goodness of fit if the calculated value of χ 2 is zero, then it is a bad fit.
c) Let $X_1, X_2$ be a random sample of size 2 from $N(0,1)$. Then the distribution of
$\frac{(X_1 + X_2)^2}{(X_1 - X_2)^2}$ is -------------
21. What is the null hypothesis for a chi-square test of homogeneity of proportions and
give the layout of observations.
23. Give an example for a paired t test. Give the test statistics and explain the notations
24. An oil company claims that less than 20% of all car owners have not tried its gasoline.
Test this claim at the 0.01 level of significance if a random check reveals that 22 out of
200 car owners have not tried the oil company’s gasoline.
25. In the comparison of two kinds of paint, a consumer testing service finds that four
1-gallon cans of one brand cover on the average 546 square feet with a standard
deviation of 31 square feet, whereas four 1-gallon cans of another brand cover on the
average 492 square feet with a standard deviation of 26 square feet. Assuming that the two
populations sampled are normal and have equal variance, test the hypothesis that on the
average the first kind of paint covers a greater area than the second.
26. Mention the advantages of non-parametric tests over parametric test.
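For illustration, question 25 above calls for a two sample t test with pooled variance. The question does not state a significance level, so the sketch assumes a one-sided 5% level and uses the tabulated critical value 1.943 for 6 degrees of freedom; it also assumes the quoted standard deviations use the n - 1 divisor.

```python
from math import sqrt

n1, mean1, sd1 = 4, 546.0, 31.0   # first brand
n2, mean2, sd2 = 4, 492.0, 26.0   # second brand

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
t_stat = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

T_CRIT_05_DF6 = 1.943             # one-sided 5% point of t with 6 d.f. (assumed level)
reject_h0 = t_stat > T_CRIT_05_DF6

print(f"pooled variance = {pooled_var:.1f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```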
27. A factory operates in three shifts. The factory manager feels that quality of part is
related to shifts. For this purpose he has collected the following data from the past
records of production.
Shift        No. of Parts
             Good    Bad
Day          900     130
Evening      700     170
Night        400     200
28. Fifteen patient records from each of two hospitals were received and assigned a score
designed to measure level of care. The scores were as follows:
Hospital A: 99 85 73 98 83 88 99 80 74 91 80 94 94 98 80
Hospital B: 78 74 69 79 57 78 79 68 59 91 89 55 60 55 79
Use a proper non-parametric test to see whether the two populations are identical with
respect to the level of care.
theorem and fundamental theorem of integral calculus.
20 hours
3. Module 3. Complex Numbers: Analytic functions – Cauchy Riemann
equations – Cauchy’s integral formula – Taylor and Laurent’s series
expansion – fundamental theorem of algebra – poles and singularities –
contour integration – simple problems.
40 hours
Books for reference
B.Sc. STATISTICS
Semester III
Core Course V – Mathematical Methods
Part A
(Answer all questions) weight 1 for a bunch of 4 questions
1. The value of $\lim_{x \to 0} \frac{e^{1/x}}{1 + e^{1/x}}$ is
a) 0 b) 1 c) .2 d) does not exist
2. If $\lim_{x \to c} f(x)$ exists and $\lim_{x \to c} f(x) \ne f(c)$, then f(x) has
a) Discontinuity of first kind at x = c b) Discontinuity of second kind at x = c
c) Removable discontinuity at x = c d) None of these
3. If f(x) = { 1, when x is irrational; -1, when x is rational } then
1 1
c) in not derivable at x ≠ c d) in not derivable at x ≥ o
f ( n) f ( n)
9. The function defined by f(x) = { 0 when x is rational; 1 when x is irrational }
a) Is integrable on any interval on R
b) Is not integrable on any interval on R
c) Is integrable on (0, ∞)
d) Is not integrable on (0, ∞)
10. If f(x) is integrable on (a,b), then
a) |f(x)| is also integrable on (a,b)
b) |f(x)| is not integrable on (a,b)
c) |f(x)| is integrable on (a,b) only if a ≠ 0
d) Nothing can be said about the integrability of |f(x)| on (a,b)
11. If $\int_a^b f(x)\,dx = F(b) - F(a)$, then F(.) is called
b) Both f+g and f-g are not integrable on (a,b)
c) Nothing can be said about the integrability of f+g and f-g on (a,b)
Part- B
( Answer all questions) weight 1
13. Define uniform continuity
14. State Rolle’s Theorem
15. Write the Taylor series of f(x) in powers of (x-a)
16. What is meant by a partition of an interval?
17. When do you say that the integral of f(x) exists?
18 What do you mean by Analytic functions
19 State Cauchy’s integral formula.
20. Define contour
CORE COURSE VI
INFORMATICS AND NUMERICAL MATHEMATICS
20 hours
Module 6. Numerical Analysis: Operators E and Delta and their basic
properties. Divided differences. Interpolation formulae: Newton’s forward
and backward formulae, Lagrange’s formulae, Newton’s divided difference
formula. Numerical Integration: Trapezoidal rule, Simpson’s 1/3rd and 3/8th
rules and Weddle’s rule
30 hours
Books for reference
Model Question Paper
B.Sc. STATISTICS
Semester III
Core Course VI: Informatics and Numerical Mathematics
Time: 3 hrs
a. ? b. /t c) & d) 01
29. Given
Weight (lbs)    : 20-40  40-60  60-80  80-100  100-120
No. of Students : 25     120    100    70      30
Estimate i) the number of students having weight less than 32 lbs.
ii) the number of students having weight more than 105 lbs.
5. Module 4. Cluster sampling: Clusters with equal sizes – estimation of
population mean and total – comparison with simple random sampling – two
stage cluster sampling – estimate of the variance of the population mean.
20 hours
Model Question Paper
B.Sc. STATISTICS
Semester IV
CORE COURSE VII SAMPLE SURVEYS
8. In srswor Var(p) is
(a) $\frac{PQ}{n} \cdot \frac{N-n}{N-1}$ (b) $\frac{PQ}{n} \cdot \frac{N-1}{N-n}$ (c) $\frac{PQ}{N} \cdot \frac{N-n}{N-1}$ (d) $\frac{PQ}{n} \cdot \frac{N-n}{N1-1}$
9. The total number of samples of size n = 2 from a population of N = 6 is:
(a) 2.9 (b) 2.8 (c) 2.7 (d) 3
PART-B
PART-C
(Answer any four questions) weight 2
21. Explain the concept of stratified sampling.
22. What is the difference between cluster and systematic sampling?
23 Derive the expression for variance of sample mean in srswor.
24 Show that sample mean is an unbiased estimate of population
25 What are the advantages of sampling over census.
26 List out the simple random samples for the data given in question
PART – D
(Answer any two questions) weight 4
1. Module 1. Linear programming: Mathematical formulation of LPP,
Graphical and Simplex methods of solving LPP – duality in linear
programming
20 hours
2. Module 2. Transportation and Assignment problems: North – west corner
rule, row column and least cost method – Vogel’s approximation method.
Assignment problem Hungarian algorithm of solution
20 hours
3. Module 3. General theory of control charts, causes of variations in quality,
control limits, sub grouping, summary of out- of control criteria, charts of
attributes, np chart, p chart, c chart. Charts of variables: X bar chart, R chart
and sigma chart. Revised control charts. Applications and advantages.
25 hours
4. Module 4. Principles of acceptance sampling – Problems and lot acceptance,
stipulation of good and bad lots- producers’ and consumers’ risks, simple
and double sampling plans, their OC functions, concepts AQL, LTPD,
AOQL, Average amount of inspection and ASN function 25 hrs
B.Sc. STATISTICS
Semester IV
CORE COURSE VIII OPERATIONS RESEARCH AND STATISTICAL QUALITY CONTROL
Time 3hrs
Part A
Answer all questions (Weight 1 for bunch of 4)
4. Dual of a dual is
a) slack b) surplus c) artificial d) primal.
5. In a control chart the manageable cause is
a) assignable cause b) random cause c) chance cause d) none of them
6. A control chart for fraction defectives is said to be in control if the points lie within
a) $\bar{X} \pm 3\sigma$ b) $p' \pm 3np'q'$ c) $p' \pm 3\sqrt{p'q'/n}$ d) $c \pm 3\sqrt{c}$
7. The spread of a process is given by
a) 3σ b) 6σ c)2σ d) 1.96σ
8. Upper control limit for R Chart is
a) $A_2\bar{R}$ b) $A_1\bar{R}$ c) $D_3\bar{R}$ d) $D_4\bar{R}$
9. Consumers risk is usually denoted by
a) µ b)∂ c) β d) α
10. The acceptance sampling plan is used for
a) Identifying good lots b) protecting the consumers interest c) protecting the producers
interest
d) All of the above
11. The Consumers risk usually fixed at
a) .05 b).01 c).95 d) .99
12. The OC curve gives
a) proportion of bad lots b) proportion of good lots c) discriminating power of the
sampling plan
d) none of them.
Part B ( answer all questions ,weight 1)
13. The inequality constraints are made equalities in an LPP using ---------- variables
14. If the objective of a maximization LPP can be increased infinitely the problem is said
to have ------------ solutions
15. A sampling plan in which we take a decision based on one sample only is called -------------
16. In a non-degenerate transportation problem with m rows and n columns the number of
allocations will be ----------
17. Expand the term LTPD
18. The method used to solve an assignment problem is called---------------
19. An artificial variable is used for--------------------
20. Chart used for number of defects is based on ---------- distribution
Part C ( answer 4 questions, Weight 2)
21. Define AOQ and LTPD.
22. Define the Linear programming problem.
23. What is double sampling plan?
24. Write the assignment problem as an lpp.
25. What are probability limits?
26. What is an unbalanced transportation problem?
Part D ( answer any 2 questions, weight 4)
27. Distinguish between double and single sampling plans.
28. Draw the OC curve of the single sampling plan showing the consumers and producers
risks.
29. Find the initial basic feasible solution of the following transportation problem. There
are four origins and three destinations. The availabilities are 9, 10, 8, 7 and the requirements
are 17, 10, 7 respectively.
     A   B   C
D    2   3   2
E    1   3   4
F    2   3   1
G    2   4   3
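For illustration, one way to obtain an initial basic feasible solution for question 29 is the North-west corner rule from Module 2. The sketch below is a minimal version under the assumption that rows D to G are the origins and columns A to C the destinations; the function name is illustrative. Other methods (least cost, Vogel's approximation) would generally give a different starting solution.

```python
def north_west_corner(supply, demand):
    """Allocate from the top-left cell, moving down or right as rows/columns exhaust."""
    supply, demand = list(supply), list(demand)
    alloc = [[0] * len(demand) for _ in supply]
    i = j = 0
    while i < len(supply) and j < len(demand):
        q = min(supply[i], demand[j])
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            i += 1          # origin exhausted: move to the next row
        else:
            j += 1          # destination satisfied: move to the next column
    return alloc


cost = [[2, 3, 2], [1, 3, 4], [2, 3, 1], [2, 4, 3]]   # rows D, E, F, G; columns A, B, C
alloc = north_west_corner([9, 10, 8, 7], [17, 10, 7])
total_cost = sum(c * a for crow, arow in zip(cost, alloc) for c, a in zip(crow, arow))
positive_cells = sum(1 for row in alloc for a in row if a > 0)

print(alloc, total_cost, positive_cells)
```

Here the number of positive allocations is less than m + n - 1 = 6, so this starting solution is degenerate.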
2. Module 2. Analysis of income and allied distributions- Pareto distribution,
graphical test, fitting of Pareto’s law, illustrations, log normal
distribution and properties. Lorenz curve, Gini’s coefficient.
20 hours
Model Question Paper
B.Sc. STATISTICS
Semester IV
CORE COURSE- IX OPERATIONS RESEARCH AND
STATISTICAL QUALITY CONTROL
Part A
Time 3hrs Answer all questions (Weight 1 for bunch of 4)
distribution d) none.
a) Trend b) Seasonal Variation c) Cyclic variation d) Random variation
5. Which of the method can be used for getting trend values for each given time point
10. A model of time- series explains the ……………….relation between value of variable and
time series components
13. Give an example each for seasonal and cyclic variation in a time – series
14. Define period of Moving average.
15. Give any three examples of irregular variation affecting a Time- series data.
17. Give the formula for converting chain base into fixed base and fixed base into chain base
Index numbers.
25. With the help of an Index Number formula, explain Time and Factor Reversal Tests.
26. Explain the use for developing Cost of Living Index Number.
27. Given the following data related to yield of a crop in three different seasons.
Year   Season 1   Season 2   Season 3
1990   12         19         17
1991   14         25         23
1992   13         27         20
1993   15         28         22
1994   17         31         24
28. Briefly explain the use of Pareto distribution and its applications
29. Calculate the cost of Living Index Number for the data given below.
Rice
Food 30 47 4
Fuel 8 12 1
Clothing 14 18 3
House Rent 22 15 2
Miscellaneous 25 30 1
1. S.C. Gupta & V.K.Kapoor: Fundamentals of Applied Statistics, Sultan
B.Sc. STATISTICS
Semester IV
Part A
Time 3hrs Answer all questions (Weight 1 for bunch of 4)
PART-B
answer all questions (weight 1)
13. Write down the Gauss Markov linear model.
14. State the necessary and sufficient condition for estimability of a parametric function.
15. What are the principles of experimental design?
16. Write the expression for estimating a missing value in LSD.
17. If there are two missing values in an RBD with 4 blocks and 5 treatments, what will be
the degrees of freedom of the error sum of squares?
18. In an LSD with 4 treatments the error sum of squares is 16; find the mean error sum of
squares.
19. Write the expression for efficiency of LSD compared to CRD
Part C
25. How can we estimate the effects and calculate the sum of squares
in a factorial experiment?
1 21 20 19
2 19 18 18
3 18 19 19
4 27 25 24
PART – D
(Answer any two questions)Weight 4
A C B D
12 19 10 8
C B D _
18 12 6
B D A C
22 10 5 21
D A C B
12 7 27 17
CORE COURSE XI: POPULATION STUDIES AND ACTUARIAL SCIENCE
Model Question Paper
B.Sc. STATISTICS
Semester V
CORE COURSE XI: POPULATION STUDIES AND ACTUARIAL SCIENCE
Part A
Time 3hrs Answer all questions (Weight 1 for bunch of 4)
1. Vital statistics is mainly concerned with
(a) births (b) deaths (c) marriages (d) all the above
2. Vital rates are customarily expressed as
(a) percentages (b) per thousand (c) per million (d) per ten thousand
3. The registration of births, deaths and marriages are
(a) a fancy of society (b) a part of medical research
(c) a legal document (d) all the above
4. The child bearing age in India is
(a) 20-24 years (b) 20-29 years (c) 13-49 years (d) 15- 49 years
5. The relation between N.R.R and G.R.R is
(a) N.R.R and G.R.R are usually equal (b) N.R.R can never exceed G.R.R
(c) N.R.R is generally greater than G.R.R (d) none of the above
6. Life-table has also been named as
(a) survival table (b) mortality table (c) life expectancy table (d) all the above
7. Normally a life-table is constructed for an age interval of
(a) five years (b) ten years (c) one year (d) 5-10 years
8. The central mortality rate $m_x$ in terms of $q_x$ is given by the formula
(a) $\frac{2q_x}{2 + q_x}$ (b) $\frac{2q_x}{2 - q_x}$ (c) $\frac{q_x}{2 + q_x}$ (d) $\frac{q_x}{2 - q_x}$
9. The payment received by the insurer is known as
(a) loss (b) cost (c) premium (d) benefit
10. _______ is a condition that increases the frequency or severity of loss.
(a) peril (b) hazard (c) risk (d) loss exposure
11. Uncertainty of loss is known as
(a) probability (b) hazard (c) loss exposure (d) risk
12. The cause of loss is defined as
(a) hazard (b) risk (c) peril (d) claim
SECTION B
(Answer all the questions) Weight 1
13. Death rate computed for a specified section of the population is known as ______.
14. The ratio of instantaneous rate of decrease in lx to the value of lx is
defined as _______.
15. The expectation of life at any age can be obtained from a ________.
16. Pearl’s Vital Index = ________
17. An abridged life table usually consists of ages at distance of ________ years.
18. ______ is a financial arrangement that redistributes the costs of unexpected losses.
19. The insured’s possibility of loss is called the insured’s _______.
20. If the covered peril is death, the contract is called _______.
SECTION C
(Answer any four questions) weight 2
21. What are the various uses of vital statistics?
22. What is expectation of life? Distinguish between ‘curtate expectation’ and ‘complete
expectation’ of life.
23. Define general fertility rate. Explain its merits and demerits.
24. What do you understand by an abridged life table?
25. Discuss the costs and benefits of insurance to society.
26. Explain life insurance and fire insurance.
SECTION D
(Answer any two questions) weight 4
27. Compute the crude and standardized death rates of the two populations A and B,
regarding A as standard population, from the following data:
Age-group    A                       B
(Years)      Population   Deaths     Population   Deaths
Under 10     20,000       600        12,000       372
10-20        12,000       240        30,000       660
20-40        50,000       1250       62,000       1612
40-60        30,000       1050       15,000       525
Above 60     10,000       500        3,000        180
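For illustration, the rates asked for in question 27 can be sketched as below, assuming direct standardization (the age-specific death rates of each population applied to the age distribution of standard population A) and rates expressed per thousand. The function names are illustrative.

```python
pop_a = [20000, 12000, 50000, 30000, 10000]
deaths_a = [600, 240, 1250, 1050, 500]
pop_b = [12000, 30000, 62000, 15000, 3000]
deaths_b = [372, 660, 1612, 525, 180]


def crude_rate(pop, deaths):
    """Crude death rate per 1000 population."""
    return 1000 * sum(deaths) / sum(pop)


def standardized_rate(pop, deaths, standard_pop):
    """Direct method: age-specific rates weighted by the standard population."""
    expected = sum(d / p * s for d, p, s in zip(deaths, pop, standard_pop))
    return 1000 * expected / sum(standard_pop)


cdr_a = crude_rate(pop_a, deaths_a)
cdr_b = crude_rate(pop_b, deaths_b)
std_a = standardized_rate(pop_a, deaths_a, pop_a)   # equals the crude rate of A
std_b = standardized_rate(pop_b, deaths_b, pop_a)

print(f"CDR A = {cdr_a:.2f}, CDR B = {cdr_b:.2f}")
print(f"STDR A = {std_a:.2f}, STDR B = {std_b:.2f}")
```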
Age in years   lx       dx    px   qx   Lx   Tx          $e_x^0$
4              95,000   500   ?    ?    ?    4,850,300   ?
5              ?        400   ?    ?    ?    ?           ?
CORE COURSE XII: PRACTICAL
1. Numerical questions from the following topics of the syllabi are to be asked
for external examination of this paper. The questions are to be evenly chosen
d. Sample surveys
e. Design of Experiments
g. Linear Programming
h. Numerical Analysis
i. Time series
j. Index Numbers
d) Numerical Analysis
e) Sample surveys
f) Design of Experiments
g) Construction of Control Charts
h) Linear Programming
i) Time Series
B.Sc. STATISTICS
Semester VI
1(a) Compute chain index numbers with 1981 prices as base from the following table
giving the average wholesale prices of the commodities A, B and C for the years
1986 to 1990. (Wt-1)
Commodity Average Whole Sale Price (Rs)
1986 1987 1988 1989 1990
A 20 16 28 35 21
B 25 30 24 36 45
C 20 25 30 24 30
(b) Calculate seasonal indices by the ratio to moving average method (Wt-1)
Year I Qtr II Qtr III Qtr IV Qtr
1998 68 62 61 63
1999 65 58 66 61
2000 68 63 63 67
2(a) In a study it is reported that 60 out of a group of 1000 insured persons died within a
year. Examine whether this justifies the assumption that less than 4% only are
likely to die within a year, at 5% level of significance (Wt-1)
(b) A sample of 200 boys who passed the SSLC examination has mean marks 50 with
standard deviation 5. The mean marks for a sample of 100 girls was found to be 48
with standard deviation 4. Does this indicate any significant difference between
the abilities of boys and girls, assuming that the standard deviations are the same,
at 5% level of significance (Wt-1)
3 (a) Ten accountants were given intensive coaching and two tests were conducted in a
month. The scores of tests 1 and 2 are given below. (Wt-1)
S.No. of Accountants : 1  2  3  4  5  6  7  8  9  10
Marks in 1st Test    : 50 42 51 42 60 41 70 55 62 38
Marks in 2nd Test    : 62 40 61 52 68 51 64 63 72 50
Do the scores from test 1 to test 2 show an improvement? Test at 5% level of
significance.
C (12) B (8) B (9) A (8)
B (10) A (8) C (10) C (9)
Analyze the data and give your conclusions. (Wt-1)
(b) The following are the number of defects noted in the final inspection of 30 days of
woolen cloths:
0, 3, 1, 3, 2, 2, 1, 3, 5, 0, 2, 0, 0, 1, 2, 4, 3, 0, 0, 0,
1, 2, 4, 5, 0, 9, 4, 10, 3 and 6
Draw a suitable control chart. (Wt-1)
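For illustration, since the data are counts of defects, a c chart is one natural choice here. The sketch below computes 3-sigma limits from the Poisson model and lists the observations falling outside them; treating each of the 30 values as one inspection unit is an assumption.

```python
from math import sqrt

defects = [0, 3, 1, 3, 2, 2, 1, 3, 5, 0, 2, 0, 0, 1, 2, 4, 3, 0, 0, 0,
           1, 2, 4, 5, 0, 9, 4, 10, 3, 6]

c_bar = sum(defects) / len(defects)            # central line
ucl = c_bar + 3 * sqrt(c_bar)                  # upper control limit
lcl = max(0.0, c_bar - 3 * sqrt(c_bar))        # lower limit, truncated at zero

# (observation number, count) pairs lying outside the control limits
out_of_control = [(k, c) for k, c in enumerate(defects, start=1) if c > ucl or c < lcl]

print(f"CL = {c_bar:.3f}, UCL = {ucl:.3f}, LCL = {lcl:.3f}")
print("Out of control:", out_of_control)
```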
PROJECT
1. The project is offered in the fifth and sixth semester of the degree course
and duration of the project may spread over the complete year
in a group shall not exceed five. However, the project report shall be
3. There shall be a teacher from the department to supervise the project and the
synopsis of the project should be approved by that teacher. The head of the
4. As far as possible, topics for the project may be selected from the applied
The following books may be used to get an idea about projects and project
report writing.
ELECTIVE SUBJECTS
Module.1 Individual risk model for a short time: Model for individual claim
random variables-Sums of independent random variable-
Approximation for the distribution of the sum-Application to
insurance 10hrs
Time: 3 Hrs
Part A
Choose the correct answer from the brackets
A bunch of four questions carries one weightage
1. Let X be the number obtained when one true die is tossed. Let Y be the sum of the
numbers obtained when X true dice are then thrown. Calculate E[Y].
(a) 4/7 (b) 7/4 (c) 2/6 (d) 3/6
2. Under certain assumptions, the probability of ruin is
$\psi(u) = (0.3)e^{-2u} + (0.2)e^{-4u} + (0.1)e^{-7u}$, $u > 0$. Calculate θ.
(a) 2/3 (b) 1/3 (c) 1 (d) ½
3. Suppose that λ = 3, C = 1 and $P(x) = \frac{1}{3}e^{-3x} + \frac{16}{3}e^{-6x}$, x > 0. Calculate P1.
(a) 3/27 (b) 6/27 (c) 4/27 (d) 5/27
4. Suppose that λ = 1, C = 10 and $P(x) = \frac{9x}{25}e^{-3x/5}$, x > 0. Calculate θ.
(a) 3 (b) 4 (c) 2 (d) 5
5. Suppose that the claim amount distribution is discrete with P(1) = 1/4 and
P(2) = 3/4. If R = log 2, calculate θ.
(a) $\frac{10}{7\log 2} - 1$ (b) $\frac{10}{7\log 2}$ (c) $\frac{10}{5\log 2} - 1$ (d) $\frac{10}{5\log 2}$
6. Suppose that Wi assumes only, the value 0 and +2 and that
P[W=0]=p,P[W=2]=q,where p+q=1,Assume that C=1,P>1/254
7. Consider an insurance portfolio that will produce 0, 1, 2 or 3 claims in a fixed
time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual
claim will be of amount 1, 2 or 3 with probabilities 0.5, 0.4 and 0.1
respectively. Calculate E[N].
(a) 1.7 (b) 2.7 (c) 2.8 (d) 1.6
8. Suppose that θ = 2/5 and $p(x) = \frac{3}{2}e^{-3x} + \frac{7}{2}e^{-7x}$, x > 0. Calculate γ.
(a) 2 (b) 3 (c) 4 (d) 2.5
9. If S has a compound Poisson distribution given by λ = 3, p(1) = 5/6, p(2) = 1/6,
calculate $f_S(x)$ for x = 0.
(a) 0.050 (b) 0.25 (c) 0.052 (d) 0.523
10. Consider an insurance portfolio that will produce 0, 1, 2 or 3 claims in a fixed
time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual
claim will be of amount 1, 2 or 3 with probabilities 0.5, 0.4 and 0.1
respectively. Calculate V[N].
(a) 0.8 (b) 0.028 (c) 0.08 (d) 0.285
11. Suppose that θ = 2/5 and $p(x) = \frac{3}{2}e^{-3x} + \frac{7}{2}e^{-7x}$, x > 0. Calculate R.
(a) 2.5 (b) 3.45 (c) 4.25 (d) 2.5
PART B
Attempt all questions; each question carries a weightage of one
$Z = \dfrac{S - \lambda p_1}{\sqrt{\lambda p_2}}$ converges to the standard normal distribution as λ → ∞?
15. Write an expression for the distribution of the surplus level at the first time, the
surplus falls below the initial level u, given that it does fall below u, if all
claims are of size 2?
16. Derive an expression for Ψ(u) if the Xi’s have an exponential claim amount
distribution?
17. Write an expression for the distribution of L, if the size of the individual claims
has an exponential distribution with parameter β?
18. Find the mean and variance of the Inverse Gaussian distribution, by using its
mgf
19. Derive an expression for R in the special case where the Wi’s common
distribution is N(µ, σ²)?
20. Determine the adjustment coefficient if the claim amount distribution is
exponential with parameter β>0?
PART C
Attempt any four questions; each question carries a weightage of two
21. Assume that u(λ) is the gamma probability density function with parameters α
and β,
$u(\lambda) = \dfrac{\beta^{\alpha}\lambda^{\alpha-1}e^{-\beta\lambda}}{\Gamma(\alpha)}, \quad \lambda > 0,$
where $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha-1}e^{-y}\,dy$. Show that the marginal distribution of N is negative
binomial with parameters $r = \alpha$, $p = \dfrac{\beta}{1+\beta}$.
22. Prove that if $S_1, S_2, \ldots, S_m$ are mutually independent random variables such
that $S_i$ has a compound Poisson distribution with parameter $\lambda_i$ and d.f. of
claim amount $P_i(x)$, $i = 1, 2, \ldots, m$, then $S = S_1 + S_2 + \cdots + S_m$ has a
compound Poisson distribution with $\lambda = \sum_{i=1}^{m}\lambda_i$ and $P(x) = \sum_{i=1}^{m}\dfrac{\lambda_i}{\lambda}P_i(x)$.
23. Assume that u(λ) is the inverse Gaussian p.d.f. with parameters α and β. Exhibit
the moment generating function of N, E[N] and V[N].
25. Calculate the adjustment coefficient if all the claims are of size 1?
26. Calculate the probability of ruin in the case that the claim amount distribution is
exponential with parameter β
PART D
Attempt any two questions; each question carries a weightage of four
is 1/6 and B, the benefit amount given that there is a claim, has pdf
f(y) = 2(1 − y), 0 < y < 1
     = 0, elsewhere.
Let S be the total claims for the portfolio. Using a normal distribution, estimate
P[S > 4].
28. Prove that for a compound distribution where the probability distribution for N,
the number of claims, satisfies the condition
$\dfrac{P[N = n]}{P[N = n-1]} = a + \dfrac{b}{n}$, for n = 1, 2, ...,
and where the distribution of claim amounts is restricted to the
positive integers,
$f_S(x) = \sum_{i=1}^{x}\left[a + \dfrac{bi}{x}\right]p(i)\,f_S(x - i)$, x = 1, 2, ...,
with the starting value $f_S(0) = P[N = 0]$.
29. Given that θ = 2/5 and $P(x) = \frac{3}{2}e^{-3x} + \frac{7}{2}e^{-7x}$, x > 0, calculate Ψ(u), γ and R.
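The recursion in Question 28 can be illustrated numerically. For a Poisson claim count, P[N = n]/P[N = n − 1] = λ/n, so a = 0 and b = λ. The sketch below applies the recursion to the compound Poisson data of Part A, Question 9 (λ = 3, p(1) = 5/6, p(2) = 1/6); the function name `compound_poisson_pmf` and the truncation point 40 are illustrative choices, not part of the paper.

```python
import math

def compound_poisson_pmf(lam, claim_pmf, x_max):
    """Return [f_S(0), ..., f_S(x_max)] using the recursion with a = 0, b = lam.

    claim_pmf maps positive integer claim sizes i to probabilities p(i).
    """
    f = [math.exp(-lam)]  # starting value f_S(0) = P[N = 0]
    for x in range(1, x_max + 1):
        total = sum((lam * i / x) * p * f[x - i]
                    for i, p in claim_pmf.items() if i <= x)
        f.append(total)
    return f

f = compound_poisson_pmf(3.0, {1: 5 / 6, 2: 1 / 6}, 40)
print(f"f_S(0) = {f[0]:.4f}, f_S(1) = {f[1]:.4f}, f_S(2) = {f[2]:.4f}")
```

The starting value is simply e^(−λ), and each further term reuses the earlier ones, so no convolutions need to be computed directly.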
B. STOCHASTIC MODELING
Module 1. Concept of mathematical modeling, definition, natural testing a
Definition of stochastic process, classification, Markov chain, transition
30hrs
14hrs
B.Sc. STATISTICS
Semester VI
STOCHASTIC MODELING
(a) $P(X = k)$ (b) $\sum_k P(X = k)$ (c) $\sum_k P(X = k)s^k$ (d) $P(X = k)s^k$
(a) $\int_0 g(x - y)f(y)\,dy$ (b) $\int_0 g(x + y)f(y)\,dy$ (c) $\int_0 g(x - y)f(y - x)\,dy$ (d) $\int_0^x g(x)f(y)\,dy$
6. State j is absorbing if
(a) $P_{jj}^{(n)} > 0$ for some n ≥ 1 (b) $f_{jj} = 1$ (c) $P_{jj} < 1$ (d) $f_{ij} = 1$
7. For an irreducible Markov chain, if one state is ergodic, then
(a) all states are ergodic (b) one more state is ergodic (c) no other state is
ergodic (d) none
8. For the following Markov chain, with states 1, 2, 3,
$P = \begin{pmatrix} 0 & 1 & 0 \\ 1/2 & 0 & 1/2 \\ 0 & 1 & 0 \end{pmatrix}$
the chain is
(a) transient (b) recurrent (c) absorbing (d) none of these
9. For the above Markov chain
(a) $P^2 = P$ (b) $P^3 = P$ (c) $P^4 = P$ (d) $P^2 = P^3$
10. For a Poisson process {N(t)}, $p_n(t)$ is
(a) independent of time (b) depends on t (c) depends on time length
(d) zero
11. Which of the following is incorrect for a Poisson process?
(a) Markovian (b) time homogeneous (c) independent (d) nonstationary
12. The interarrival distribution of a Poisson process is
(a) gamma (b) geometric (c) exponential (d) binomial
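For Question 9 of this part, the powers of the transition matrix in Question 8 can be checked directly. The following is a small sketch using exact fractions; the helper `matmul` is an illustrative name, not a library function.

```python
from fractions import Fraction as F

# Transition matrix of Question 8 (states 1, 2, 3).
P = [[F(0), F(1), F(0)],
     [F(1, 2), F(0), F(1, 2)],
     [F(0), F(1), F(0)]]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P2 = matmul(P, P)
P3 = matmul(P2, P)
P4 = matmul(P3, P)

print("P^2 == P:", P2 == P)
print("P^3 == P:", P3 == P)
print("P^4 == P:", P4 == P)
print("P^2 == P^3:", P2 == P3)
```

Because the chain alternates between state 2 and the pair {1, 3}, odd powers reproduce P and even powers reproduce P squared.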
Part B.
Answer all questions Wt 1
Part C.
Answer any four questions Wt 2
Part D.
Answer any two questions Wt 4
27. Prove that state j is persistent if $\sum_{n=0}^{\infty} p_{ij}^{(n)} = \infty$.
28. For the following Markov chain,
$P = \begin{pmatrix} 1/3 & 2/3 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 \\ 0 & 0 & 1/2 & 1/2 \end{pmatrix}$
show that state 1 is ergodic, state 2 is recurrent and the chain is ergodic.
29. Derive, for a Poisson process, $p_n(t) = \dfrac{e^{-\lambda t}(\lambda t)^n}{n!}$, n = 0, 1, ..., using its postulates.
C. RELIABILITY THEORY
1. R. E. Barlow and F. Proschan (1975) Statistical Theory of Reliability and Life Testing,
Holt, Rinehart and Winston
2. N. Ravi Chandran, Reliability Theory, Wiley Eastern
Model Question Paper
B.Sc. STATISTICS
Semester VI
ELECTIVE- RELIABILITY THEORY
(a) all components function, (b) only one component functions, (c) at least k components
function, (d) at most k components function
3. If $\phi(1_i, x) = \phi(0_i, x)$ for all $(\cdot_i, x)$, then component i is
(a) relevant, (b) irrelevant, (c) coherent, (d) monotonic
4. If φ is the structure function, then its dual is
(a) $\phi^D(x) = 1 - \phi(x)$, (b) $\phi^D(x) = 1 - \phi(1 - x)$, (c) $\phi^D(x) = \phi(1 - x)$, (d) $\phi^D(x) = 1 + \phi(x)$
5. For a coherent system, which of the following statements is correct?
(a) a component may be relevant, (b) each of the components is relevant, (c) no component
is relevant, (d) at least two components are relevant
6. Which of the following is the reliability of a binary system?
(a) $E\phi(x)$, (b) $E\phi^2(x)$, (c) $1 - E\phi(x)$, (d) $E\phi(x) - 1$
7. Reliability of a three component series system is
(a) $(1 - p)^3$, (b) $p^3$, (c) $1 - (1 - p)^3$, (d) $p(1 - p)^2$
8. Let h(p) be the reliability function of a coherent structure. Then
(a) h(p) is increasing in $p_i$, (b) h(p) is decreasing in $p_i$, (c) constant in $p_i$,
(d) independent of $p_i$
9. Which of the following is true?
(a) $0 < I_h(j) \le 1$, (b) $0 < I_h(j) < 1$, (c) $1 < I_h(j) \le \infty$, (d) $0 < I_h(j) \le \infty$
10. Which of the following is a failure rate function?
(a) $\dfrac{f(t)}{F(t)}$, (b) $\dfrac{f(t)}{1 - F(t)}$, (c) $\dfrac{F(t)}{f(t)}$, (d) $\dfrac{1 - f(t)}{F(t)}$
11. Which distribution has constant failure rate?
(a) normal, (b) Poisson, (c) exponential, (d) lognormal
12. A process which has stationary independent increments is
(a) gamma process, (b) Poisson process, (c) exponential process, (d) geometric
process
PART B
Answer all questions (Weight 1)
PART C
Answer any four questions (weight 2)
PART D
Answer any two questions (weight 4)
OPEN COURSES
A. ECONOMIC STATISTICS
Module 2. Index Numbers: Meaning and definition – uses and types- problems
in the construction of index numbers- simple aggregate and weighted
aggregate index numbers. Test of consistency of index numbers- factor
reversal- time reversal test and unit test. Chain base index numbers- Base
shifting- splicing- and deflating of index numbers. Consumer price index
numbers- family budget enquiry- limitations of index numbers.
30 hours
Books for reference
1. S.C. Gupta and V.K. Kapoor: Fundamentals of Applied Statistics,
Sultan Chand & Sons
2. Goon A.M., Gupta M.K. and Das Gupta: Fundamentals of Statistics
Vol. II, The World Press, Calcutta.
B.Sc. STATISTICS
Semester V
OPEN COURSE (ECONOMIC STATISTICS)
Time 3Hr
Part A
Answer all questions (Weight 1 for bunch of 4)
7. Seasonal variations are periodic due to
a) Man made customs, habits, rituals etc
b) Resulting due to Natural reasons
c) Resulting due to change in weather condition
d) Any force that operate regularly year after year
8. Seasonal variation is measured using
a) Seasonal Averages b) Seasonal Indices
c) Seasonal Relatives d) None of these
9. A monthly seasonal variation measures are adjusted to
a) 12 b) 120 c) 1200 d) None of these
10. A model of time- series explains the ……………….relation between value of
variable and time series components
a) Additive b) Multiplicative c) Mathematical d) None of these
Part- C (answer any 4 questions) weight 2
21. How is trend measured using moving averages?
22. Explain periodic variations in Time- Series with suitable examples.
23. Explain the Link Relative Method of measuring seasonal variation.
24. Explain the uses of Index Numbers.
25. With the help of an Index Number formula, explain Time and Factor Reversal Tests.
26. Explain the concept behind developing a Cost of Living Index Number.
Part- D (Answer any 2 Questions) weight 4
27 Given the following data related to yield of a crop in three different seasons.
Yield (Kg/10 cent plot)
Year Season 1 Season 2 Season 3
1990 12 19 17
1991 14 25 23
1992 13 27 20
1993 15 28 22
1994 17 31 24
i) If this trend is followed, what will be the expected yield in 1995?
ii) Does season influence yield of crop?
28. Briefly explain the problems in the construction of an Index Number.
29. Calculate the cost of Living Index Number for the data given below.
Rice
Year Season 1 Season 2 Season 3
Food 30 47 4
Fuel 8 12 1
Clothing 14 18 3
House Rent 22 15 2
Miscellaneous 25 30 1
B. QUALITY CONTROL
Sons
Sons
B.Sc. STATISTICS
SEMESTER V -OPEN COURSE (QUALITY CONTROL)
Time 3Hr
Part A
Answer all questions (Weight 1 for bunch of 4)
a) 3σ b) 6σ c)2σ d) 1.96σ
2. Upper control limit for R Chart is
a) A2R‾ b) A1R‾ c) D3R‾ d) D4R‾
3. Consumer's risk is usually denoted by
a) µ b) ∂ c) β d) α
4. The acceptance sampling plan is used for
a) identifying good lots b) protecting the consumer's interest c) protecting the producer's
interest
d) all of the above
5. The consumer's risk is usually fixed at
a) .05 b) .01 c) .95 d) .99
6. The OC curve gives
a) proportion of bad lots b) proportion of good lots c) discriminating power of the
sampling plan
d) none of them.
7. Number of breakdowns in an electric wire is studied using
a) R chart b) Sigma chart c) d chart d) c chart
8. The manageable cause of a process out of control is
a) assignable b) random c) unknown d) none
9. The quality of the lot after rectifying inspection will
a) not change b) change c) improve d) worsen.
10.Which of the following is an assignable cause.
a) Humidity b) Temperature d) Location c) Wear & tear.
11. To study the variation of a process where of costly items we use
a) R chart b) sigma chart c) p chart d) d chart.
12. The exact distribution used in acceptance sampling is
a) Binomial b) poisson c) geometric d) hyper geometric.
18. Expand the term AOQ
19. Give an example where there is only an upper specification limit.
20. Give an example where there is only a lower specification limit.
C. BASIC STATISTICS
Module 1. Elements of sample surveys: Census and sampling, advantages, principal
steps in a sample survey, sampling and non sampling errors. Probability sampling,
judgement sampling and simple random sampling
15 hours
Module 2. Measures of central tendency: Mean, median, mode and their empirical
relationship. Weighted arithmetic mean- Dispersion: absolute and relative measures,
standard deviation and coefficient of variation
15 hours
19 hours
20hrs
Section A
Answer all questions (Contains 12 questions, 4 Questions carry a weightage of 1)
3. Mean of 20 values is 45. If one of these values is to be taken 64 instead of 46, the
correct value of mean is:
a) 49.5
b) 45.9
c) 40.9
d) 42.9
4. The formula to find coefficient of variation is:
a) $\dfrac{\sigma}{\bar{X}} \times 100$ b) $\dfrac{\bar{X}}{\sigma} \times 100$
c) $\dfrac{\text{Median}}{\sigma} \times 100$ d) $\sigma \times 100$
5. Mean deviation from median is:
a) Equal to mean deviation from mean
b) Greater than mean deviation from mean
c) Less than mean deviation from mean
d) No relation
a) Leptokurtic curve
b) Mesokurtic curve
6. The value of the square of Karl Pearson’s coefficient of correlation lies between:
a) 0 and 1 b) -1 and 1
7. Karl Pearson’s coefficient of correlation for the following set of observation (3,12),(5,6)
a) Negative b) Positive
c) Zero d) No relation
9. Mutually exclusive events other than null event and sure event are:
a) not independent
b) independent
c) no relation
d) independent under some conditions
10. The probability that India wins a cricket match against England is 1/3. If India and
England play 3 matches, what is the probability that India will lose all the three
matches?
11. What is the probability that a non leap year selected at random will have 53 Sundays?
Q12. For a discrete r.v P(X >0) = P(X <0) and P(X =0) = p. The variable takes the
following values -2, -1, 0, 1, 2. What is the probability that X >0?
15. Classical definition of probability can be used in the case of a sample space with
infinite outcomes.
16. In the case of disjoint events A and B, P(A ∪ B) < P(A) + P(B).
a) Say true or false
b) Explain your answer
17. Getting a queen and getting a Jack while drawing cards from a deck of cards are
independent events.
18. The correlation coefficient between X and Y is 0.85. Find the coefficient of
determination. 1
21. Explain why A.M. is considered as the best measure of central tendency? 2)
22. Calculate quartile deviation for the following data:-
26, 54, 33, 41, 94, 41, 54, 26, 93, 87, 81, 64, 68, 95.
23. The first two-sub-groups have 10 items with mean 15 and S. D. 3. If the whole group
has 250 items with mean 15. 6 and S.D. 13.44 , find the standard deviation of the
second subgroup.
24. If A and B are two independent events such that
$P(A^c) = 0.7$, $P(B^c) = k$, $P(A \cup B) = 0.8$, then find the value of k.
25. A and B stand in a ring with 12 other persons. Find the probability that A & B are
together.
26. Explain why in the case of two variables there are always two regression lines? When
do they coincide?
PART D ( Answer any 2 questions) Weight 4
27. State and prove addition theorem for two events? Explain what happens when A is
subset of B?
28. P (A) = 1/3, P(B) = 1/4, P(A∩B) = 1/11. Find the following probabilities.
STATISTICS: COMPLEMENTARY – I Syllabus for BSc.
4 ST4C04 5 3 3 3:1
APPLIED STATISTICS
There shall be 4 parts A, B, C and D in all the question papers*. Part A consists of 12
objective type questions. Part B consists of 8 questions to be answered in a word, phrase
or sentence. Part C consists of 6 questions of short essay type of which the student can
attempt 4. Part D consists of 3 questions of long essay type of which the student can
attempt 2. In part A the weightage per question is ¼. For part B the weightage is 1/question.
For part C the weightage is 2/question and for part D the weightage is 4/question.
As far as possible the number of questions should be proportional to the modules.
2. 8 short answer questions 4 theory + 4 problems weight 1
Components Weight
Assignment 1
Test paper 2
Seminar 1
Attendance 1
There shall be two test papers and the average grade point is to be considered for
internal assessment
Semester I
COURSE I : PROBABILITY THEORY
properties.
15 hours
properties.
20 hours
Book for reference
Model Question Paper
Semester I
COMPLEMENTARY COURSE I
PROBABILITY THEORY
Time: 3 Hrs
Part-A
Answer all the questions weight 1 for bunch of 4
1. Cans of soft drinks cost $0.30 in a certain vending machine. What is the
expected value and variance of daily revenue (Y) from the machine, if X, the
number of cans sold per day has E(X) = 125, and Var(X) = 50 ?
Solution: b
Annual Cash Flow   $10,000   $30,000   $70,000   $90,000   $100,000
Probability        0.10      0.15      0.50      0.15      ?
The expected cash flow for the new location is:
(a) $12,800
(b) $64,000
(c) $70,000
(d) $60,000
(e) $50,000
Solution: b
3 The probability that the Red River will flood in any given year has been estimated
from 200 years of historical data to be one in four. This means
(a) The Red River will flood every four years.
(b) In the next 100 years, the Red River will flood exactly 25 times.
(c) In the last 100 years, the Red River flooded exactly 25 times.
(d) In the next 100 years, the Red River will flood about 25 times.
(e) in the next 100 years, it is very likely that the Red River will flood exactly 25
times.
Solution: d
4 The chances that you will be ticketed for illegal parking on campus are about 1/3.
During the last nine days, you have illegally parked every day and have NOT been
ticketed (you lucky person)! Today, on the 10th day, you again decide to park
illegally. The chances that you will be caught are:
(a) greater than 1/3 because you were not caught in the last nine days.
(b) less than 1/3 because you were not caught in the last nine days.
(c) still equal to 1/3 because the last nine days do not affect the probability.
(d) equal to 1/10 because you were not caught in the last nine days.
(e) equal to 9/10 because you were not caught in the last nine days.
Solution: c
5. The chance that a person will contract AIDS after a sexual contact with an infected
partner has been estimated to be 1/4. This means:
(a) A person will be infected after exactly 4 sexual contacts with infected partners.
(b) Of 1000 people having sexual contacts with infected partners, exactly 250 will
become infected.
(c) Of 200 people having sexual contacts with infected partners, about 50 will
become infected.
(d) In exactly 25% of all sexual contacts with infected partners, the infection will
spread.
(e) Of 20 people having sexual contact with infected partners it is very likely that
exactly 5 people will become infected.
Solution: c
7. A random variable X has probability distribution as follows
x          0    1    2     3
P[X = x]   2k   3k   13k   2k
The probability P[X < 2] is equal to
a) 0.9
b) 0.25
c) 0.65
d) 0.15
e) 0.75
Solution b
8 If A, B, C are any three events probability of at least one is represented by
a) P[A ∪ B ∪ C]
b) P[AB ∪ AC ∪ BC]
c) P[A ∩ B ∩ C]
d) 1 − P[A ∪ B ∪ C]
e) P[A ∪ B ∪ C]
9 A continuous random variable X has p.d.f. f ( x) = 3 x 2 ,0 ≤ x ≤ 1 . If
P[ X ≤ a ] = P[ X > a ] , then a is
1
a)
3
−1 / 3
b) 2
3
1
c)
2
1
d)
3
3
1
e)
2
Solution b
10 If F(x) is the distribution function of X, and if Y = F(x), then E(Y) is
a) 1/2
b) 1
c) y
d) 2
e) none of the above
11 For a continuous random variable with p.d.f. f(x) and distribution function F(x),
which may not be true
a) 0 ≤ f(x) ≤ 1
b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
c) 0 ≤ F(x) ≤ 1
d) P[X = 0] = 0
e) F(∞) = 1
Solution a
12 If the rth moment of a random variable X is $\mu_r' = r!$, the moment generating
function is
a) $(1 - t)$
b) $\dfrac{t}{1 - t}$
c) $(1 - t)^{-1}$
d) ln(1 − t)
e) None of these
Part-B
Answer all the questions ,weight 1
Part-C
Answer any four questions ,weight 2
21 State and prove addition and multiplication theorem of probability for two events.
22 From a vessel containing 3 white and 5 black balls, four balls are transferred in to
an empty vessel. From this vessel a ball is drawn and is found to be white. What
is the probability that out of four balls transferred, 3 are white and 1 is black.
23 Let X be a continuous random variable with p.d.f.
$f(x) = \begin{cases} kx, & 0 \le x < 1 \\ k, & 1 \le x < 2 \\ -kx + 3k, & 2 \le x < 3 \\ 0, & \text{elsewhere} \end{cases}$
(1) Find the constant k, (2) Determine the distribution function.
24 Define raw and central moments. Establish the relation between raw and central
moments of a random variable.
25 Find the measures of skewness and kurtosis based on moments for the following
p.d.f. $f(x) = \frac{1}{2}x^2e^{-x}$, $0 < x < \infty$.
26 State and prove Bayes' theorem.
Part-D
Answer any two questions, weight 4
27 The distance X, in thousands of kms, which car owners get with a certain kind of tyre is
a random variable having probability density function
$f(x) = \begin{cases} \frac{1}{20}e^{-x/20}, & x > 0 \\ 0, & x \le 0 \end{cases}$
Find the probabilities that one of these tyres will last (1) at least 10000 kms, (2)
anywhere from 16000 to 24000 kms and (3) at least 30000 kms. (4) Find the
expected distance in kms the car owners get with the tyre.
28 Explain axiomatic definition of probability.
29 Explain the terms (1) Random experiment, (2) Sample space, (3) Mutually
exclusive events, (4) Equally likely events, with examples.
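For Question 27 of this part, the probabilities follow from the survival function of the exponential distribution, P[X > x] = e^(−x/20) with x in thousands of kms. A minimal sketch, assuming the density is read as exponential with mean 20; the names `MEAN` and `survival` are illustrative.

```python
import math

MEAN = 20.0  # mean of the exponential distribution, in thousands of kms

def survival(x):
    """P[X > x] for the exponential distribution with mean MEAN."""
    return math.exp(-x / MEAN)

p_at_least_10 = survival(10)               # (1) at least 10000 kms
p_16_to_24 = survival(16) - survival(24)   # (2) between 16000 and 24000 kms
p_at_least_30 = survival(30)               # (3) at least 30000 kms
expected_kms = MEAN * 1000                 # (4) expected distance in kms

print(f"(1) {p_at_least_10:.4f}  (2) {p_16_to_24:.4f}  "
      f"(3) {p_at_least_30:.4f}  (4) {expected_kms:.0f} kms")
```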
Semester II
15hours
15 hours
30 hours
Module 4. Law of large Numbers: Chebychev’s inequality, convergence
12 hours
Model Question Paper
Semester II
COMPLEMENTARY COURSE I
PROBABILITY DISTRIBUTIONS
Time 3hrs
(Answer all the questions. Choose the correct answer from the alternatives
given below each question). Weight 1 for a bunch of 4 questions
1. For two random variables x and y, the relation E (xy)= E(x) E(y) holds good.
a) if x and y are identical
b) for all x and y
c) if x and y are statistically independent
d) None of the above.
2. If V(x) = 1, then V(2x ± 3) is
a) 5 b) 13 c) 14 d) 1
3. $E(x - k)^2$ is minimum when
a) k < E(x) b) k = E(x) c) k > E(x) d) $k^2 = E(x)$
4. If x is a random variable having probability function f(x), then the function
$\sum e^{itx} f(x)$, where i is the imaginary unit, is known as
a) moment generating function
b) probability generating function
c) probability distribution function
d) characteristic function
5. The skewness of a binomial distribution will be zero if
a) p < ½
b) p = ½
c) p > ½
d) p < q
6. The coefficient of variation of a Poisson distribution with mean 4 is
a) ¼ b) 2/4 c) 4 d) 2
7. X is normally distributed with zero mean and unit variance. The variance of
X² is
a) 0 b) 1 c) 2 d) 4
8. In a normal curve, the area to the right of the point x1 is 0.6 and to the left of the
point x2 is 0.7. Which is the correct statement?
a) x1 > x2 b) x1 < x2 c) x1 = x2 d) none of them
9. For a normal distribution, Q.D., M.D. and S.D. are in the ratio
a) 4/5 : 2/3 : 1 b) 2/3 : 4/5 : 1 c) 1 : 4/5 : 2/3 d) 1/2 : 1 : 4/5
10. If x is a continuous r.v. with mean µ and variance σ², then for any positive
number k, $P[|x - \mu| > k\sigma] \le \dfrac{1}{k^2}$ is known as
a) Liapunov's inequality b) Tchebycheff's inequality
c) Bienaymé-Tchebycheff's inequality d) Khinchin's inequality
11. If x and y are two random variables such that their expectations exist and
P(x ≤ y) = 1 then
a) E(x) ≤ E(y) b) E(x) > E(y)
c) E(x) = E(y) d) None of the above
12. If x is a standard normal variate then $\frac{1}{2}x^2$ is
a) Gamma variate with parameter 1/2
b) Normal variable
c) Poisson variable with parameter 1/2
d) Exponential variable with parameter 2
Part B
(Answer all the questions) Weight 1
15. Name the discrete distribution for which mean and variance have the same
value.
16. What is the third moment about the mean of a Poisson distribution if the
second moment about the origin is 12?
17. Identify the distribution (using the uniqueness property) if the moment
generating function of the distribution
is $M_X(t) = (1 + e^t)^5/32$.
18. The relationship between Beta distributions of the first and second kind is ----
19. What is the characteristic function of a standard Cauchy distribution?
20. What are the points of inflexion of a normal curve N(µ, σ)?
Part C
(Answer any 4 questions) Weight 2
Part D
(Answer any 2 questions) Weight 4
SEMESTER III
10 hours
and composite hypotheses, null and alternative hypotheses, type I and type
on t distribution for mean, equality of means and paired mean for paired
of attributes. 30 hours
(India),New Delhi.
Model Question Paper
Semester III
Time 3hrs
COMPLEMENTARY COURSE- I
STATISTICAL INFERENCE
Part A
Answer all questions ,4 questions carry weight 1
1. The mean of a Chi – square distribution with n degrees of freedom is
( a ) 2n ( b ) n 2 ( c ) n (d ) n
2. The relation between Student's t and F distributions is
(a) $t_{(n)}^2 = F_{(n,1)}$ (b) $t_{(n)}^2 = F_{(1,n)}$ (c) $t_{(1)}^2 = F_{(1,n)}$ (d) $t_{(n)}^2 = F_{(1,1)}$
3. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$; then the
distribution of $\dfrac{\sum (x_i - \bar{x})^2}{\sigma^2}$ is
(a) $\chi^2_{(n)}$ (b) $t_{(n)}$ (c) $\chi^2_{(n-1)}$ (d) $t_{(n-1)}$
$s^2 = \dfrac{1}{n}\sum (x_i - \bar{x})^2$, the unbiased estimator for the population variance $\sigma^2$ is
(a) $\dfrac{1}{n-1}s^2$ (b) $\dfrac{1}{n}s^2$ (c) $\dfrac{n}{n-1}s^2$ (d) $\dfrac{n-1}{n}s^2$
5. If T is a consistent estimator of θ then
(a) T is a consistent estimator of θ² (b) T² is a consistent estimator of θ
(c) T² is a consistent estimator of θ² (d) None of the above
6. Let $X_1, X_2, \ldots, X_n$ be a random sample from a Bernoulli population. A sufficient
statistic for p is
(a) $\sum X_i$ (b) $\prod X_i$ (c) $\max(X_1, X_2, \ldots, X_n)$ (d) $\min(X_1, X_2, \ldots, X_n)$
8. The 95% confidence interval for mean µ of a normal population N ( µ , σ 2 ) with
known σ 2
( a ) 27 ( b ) 9 ( c ) 3 ( d ) 0
10. A sample of 12 specimens taken from a normal population is expected to have a
mean of 50 mg/cc. The sample has a mean of 64 mg/cc with a variance of 25. To test
$H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$, you will choose
11. A random sample of size 20 from a normal population gives a mean of 42 and a
variance of 25. Then the value of the χ² statistic used for testing the significance of
population variance is
single observation from the population f ( x, θ ) = θ eθ x , x > 0 ,then the value of type I
error is
(a) $e$ (b) $e^2$ (c) $e^{-2}$ (d) $e^{-1}$
Part B
Answer all questions ,each questions carries weightage 1
13. Let $X_1, X_2$ be a random sample of size 2 from N(0, 1). Then the distribution of
$\dfrac{(X_1 + X_2)^2}{(X_1 - X_2)^2}$ is -------------
$\dfrac{X_1 + 2X_2 + X_3}{4}$ relative to $\dfrac{X_1 + X_2 + X_3}{3}$ is ------------
16. Let $X_1, X_2, \ldots, X_n$ be a random sample from the population with pdf $f(x, \theta) = \frac{1}{2}e^{-|x - \theta|}$.
The m.l.e. of θ is ---------
17. The diameter of a cylindrical rod is assumed to be normally distributed with a
variance of 0.04 cm². A sample of 25 rods has a mean diameter of 4.5 cm. The 95% confidence
interval for the population mean is -----------
18. The power of a test is ----------
19. Degrees of freedom for chi-square in case of a contingency table of order 4×3 is ---
20. In tossing of a coin, let the probability of a head turning up be p. The hypotheses are
$H_0: p = 0.4$ against $H_1: p = 0.6$. $H_0$ is rejected if there are five or more heads in six
tosses. Then the probability of type I error is ----------
Part C
Answer any 4 questions; each question carries a weightage of 2
21.Obtain the distribution of the sample mean of a random sample X 1 , X 2 ,..., X n of size n
from N ( µ , σ 2 ) .
B(1, p). Let $T = \sum X_i$.
Show that $\dfrac{T(T - 1)}{n(n - 1)}$ is an unbiased estimator of $p^2$.
23.Define sufficient statistic. Let X 1 , X 2 ,..., X n be a random sample of size n from
24.An oil company claims that less than 20% of all car owners have not tried its gasoline
.Test this claim at the 0.01 level of significance if a random check reveals that 22 out of
200 car owners have not tried oil company’s gasoline.
25.In the comparison of two kinds of paint ,a consumer testing service finds that four 1-
gallon cans of one brand cover on the average 546 square feet with a standard deviation
of 31 square feet ,whereas four 1-gallon cans of another brand cover on the average 492
square feet with a standard deviation of 26 square feet. Assuming that the two
populations sampled are normal and have equal variance. Test the hypothesis that on the
average the first kind of paint covers a greater area than the second.
26. Mention the advantages of non-parametric tests over parametric test.
Part D
Answer any 2 questions; each question carries a weightage of 4
27 Let X 1 , X 2 ,..., X n be a random sample of size n from N ( µ , σ 2 ) . Find the mle’s
29 Use the data shown in the following table to test at the 0.01 level of significance
whether a person’s ability in mathematics is independent of his or her interest in
statistics.
                                   Ability in Mathematics
                                   Low    Average    High
Interest in Statistics   Low       63     42         15
                         Average   58     61         31
                         High      14     47         29
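The chi-square test of independence asked for in Question 29 can be sketched as follows. The table is read with interest in statistics as rows and ability in mathematics as columns; the critical value 13.277 (chi-square with 4 degrees of freedom, upper 1% point) is an assumption taken from standard tables, not from the paper.

```python
# Observed frequencies: rows = interest in statistics (low, average, high),
# columns = ability in mathematics (low, average, high).
observed = [[63, 42, 15],
            [58, 61, 31],
            [14, 47, 29]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: (row total * column total) / n.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: upper 1% point of chi-square with 4 d.f. from standard tables.
CHI2_CRIT = 13.277
reject_independence = chi2 > CHI2_CRIT

print(f"chi-square = {chi2:.2f} with {dof} d.f., reject H0: {reject_independence}")
```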
SEMESTER IV
kurtosis
5 hours
30 hours
15 hours
control charts, 3 sigma limits. Control chart for variables – X-bar chart and
25 hours
ANOVA Table 15 hours
1. Goon A.M., Gupta M.K and Das Gupta: Fundamentals of Statistics Vol.1
4. 3 long essay type question 1 problem + 2 theory weight 4
Semester IV
Time 3hrs
COMPLEMENTARY COURSE- I
APPLIED STATISTICS
Part A
Answer all questions (weight 1 for a bunch of 4 questions)
Calculators are permitted
1. If the coefficient of kurtosis is equal to 3 the distribution is called
( a ) 0 to1 ( b ) 0 to ∞ ( c ) − 1to1 ( d ) − ∞ to ∞
4. The test statistic for testing the significance of ρ = 0 with usual notation is.
r 1− r2 r n−2 r n−2 r 2 (1 − r 2 )
(a)t = (b ) t = (c) t = (d )t =
n−2 1− r2 1− r2 n−2
( a ) a number of years ( b ) parts of a year
( c ) parts of a month ( d ) none of the above
8. Link relatives in a time series remove the influence of.
( a ) Trend ( b ) Cyclic variation
( c ) Seasonal variation ( d ) all the above
( a ) k − 1 ( b ) n − 1 ( c )( k − 1)( n − 1) ( d ) nk − 1
11. The causes leading to vast variation in the specifications of a product are
( a ) random causes ( b ) assignable causes
( c ) non − traceable causes ( d ) all the above
12. The control charts for fraction defectives are known as
15 The formula for the multiple correlation coefficient $R_{2.13}$ in terms of the simple
correlation coefficients $r_{12}$, $r_{13}$ and $r_{23}$ is ----------
16 Given the trend equation Y = 108 + 2.8X with 2000 as origin and yearly data from
2000 to 2002, the estimated trend value for 2005 is ---------
19 One or more points outside the control limit indicates that -------
Part C
Answer any 4 questions, weight 2
22. Show that the correlation coefficient is independent of change of origin and scale.
two pairs as (6,14) and (8,6) while the correct values were (8,12) and (6,8)
24. In a trivariate distribution r12 = .77, r13 = .72, r23 = .52 .Find the partial correlation
26. What do you understand by 3-σ control chart. Obtain the 3-σ control limits for
X bar chart
Part D
Answer any 2 questions , weight 4
27. The following are the cholesterol contents in milligrams per package that four
laboratories obtained for 6-ounce packages of three very similar diet foods
             Diet food A   Diet food B   Diet food C
Laboratory 1 3.4 2.6 2.8
Laboratory 2 3.0 2.7 3.1
Laboratory 3 3.3 3.0 3.4
Laboratory 4 3.5 3.1 3.7
Perform a two way analysis of variance and test the null hypotheses concerning
the diet foods and laboratories at the 0.05 level of significance.
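The two-way analysis of variance in Question 27 can be sketched as below, with one observation per cell. The 5% critical values F(2, 6) = 5.14 and F(3, 6) = 4.76 are assumptions taken from standard F tables; variable names are illustrative.

```python
# Rows = laboratories 1 to 4, columns = diet foods A, B, C.
data = [[3.4, 2.6, 2.8],
        [3.0, 2.7, 3.1],
        [3.3, 3.0, 3.4],
        [3.5, 3.1, 3.7]]

r, c = len(data), len(data[0])
grand_total = sum(sum(row) for row in data)
cf = grand_total ** 2 / (r * c)  # correction factor

ss_total = sum(x * x for row in data for x in row) - cf
ss_lab = sum(sum(row) ** 2 for row in data) / c - cf
ss_food = sum(sum(col) ** 2 for col in zip(*data)) / r - cf
ss_error = ss_total - ss_lab - ss_food

ms_error = ss_error / ((r - 1) * (c - 1))
f_food = (ss_food / (c - 1)) / ms_error
f_lab = (ss_lab / (r - 1)) / ms_error

# Assumption: 5% critical values from standard F tables.
F_CRIT_FOOD = 5.14  # F(2, 6)
F_CRIT_LAB = 4.76   # F(3, 6)

print(f"Diet foods:   SS = {ss_food:.4f}, F = {f_food:.3f}, reject: {f_food > F_CRIT_FOOD}")
print(f"Laboratories: SS = {ss_lab:.4f}, F = {f_lab:.3f}, reject: {f_lab > F_CRIT_LAB}")
print(f"Error:        SS = {ss_error:.4f}")
```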
28. Calculate the seasonal index for the following time series by the ratio to moving
average method.
29. The net weight of a dry bleach product is to be monitored by X-bar and R
charts using a sample size of n = 5. Data for 12 preliminary samples are as follows.
Sample no. X1 X2 X3 X4 X5
1 15.8 16.3 16.2 16.1 16.6
2 16.3 15.9 15.9 16.2 16.4
3 16.1 16.2 16.5 16.4 16.3
4 16.3 16.2 15.9 16.4 16.2
5 16.1 16.1 16.4 16.5 16.0
6 16.1 15.8 16.7 16.6 16.4
7 16.2 16.1 16.2 16.1 16.2
8 16.2 16.1 16.2 16.1 16.3
9 16.3 16.2 16.4 16.1 16.5
10 16.6 16.3 16.4 16.1 16.5
11 16.2 16.4 15.9 16.3 16.4
12 15.9 16.6 16.7 16.2 16.5
Set up X-bar and R control charts using these data. Does the process exhibit statistical
control?
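A sketch of the control limits for Question 29. The constants A2 = 0.577, D3 = 0 and D4 = 2.114 for samples of size 5 are assumptions taken from standard control chart tables; they are not given in the paper.

```python
# Twelve preliminary samples of size 5, as tabulated in the question.
samples = [
    [15.8, 16.3, 16.2, 16.1, 16.6],
    [16.3, 15.9, 15.9, 16.2, 16.4],
    [16.1, 16.2, 16.5, 16.4, 16.3],
    [16.3, 16.2, 15.9, 16.4, 16.2],
    [16.1, 16.1, 16.4, 16.5, 16.0],
    [16.1, 15.8, 16.7, 16.6, 16.4],
    [16.2, 16.1, 16.2, 16.1, 16.2],
    [16.2, 16.1, 16.2, 16.1, 16.3],
    [16.3, 16.2, 16.4, 16.1, 16.5],
    [16.6, 16.3, 16.4, 16.1, 16.5],
    [16.2, 16.4, 15.9, 16.3, 16.4],
    [15.9, 16.6, 16.7, 16.2, 16.5],
]

# Assumption: standard control chart constants for n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114

means = [sum(s) / len(s) for s in samples]
ranges = [max(s) - min(s) for s in samples]
x_bar_bar = sum(means) / len(means)  # centre line of the X-bar chart
r_bar = sum(ranges) / len(ranges)    # centre line of the R chart

ucl_x, lcl_x = x_bar_bar + A2 * r_bar, x_bar_bar - A2 * r_bar
ucl_r, lcl_r = D4 * r_bar, D3 * r_bar

# Sample numbers (from 1) falling outside the respective limits.
out_x = [i for i, m in enumerate(means, start=1) if not lcl_x <= m <= ucl_x]
out_r = [i for i, rg in enumerate(ranges, start=1) if not lcl_r <= rg <= ucl_r]

print(f"X-bar chart: CL = {x_bar_bar:.4f}, UCL = {ucl_x:.4f}, LCL = {lcl_x:.4f}")
print(f"R chart:     CL = {r_bar:.4f}, UCL = {ucl_r:.4f}, LCL = {lcl_r:.4f}")
print("Outside X-bar limits:", out_x, " Outside R limits:", out_r)
```

With these constants no sample mean or range falls outside its limits, which would suggest the process is in statistical control.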
SYLLABUS OF COMPLEMENTARY II- ACTUARIAL SCIENCE
STATISTICS: COMPLEMENTARY – II
CUCCSSUG 2009 (2009 admission onwards)
For part C the weightage is 2/question and for part D the weightage is 4/question.
As far as possible the number of questions should be proportional to the modules.
Components Weight
Assignment 1
Test paper 2
Seminar 1
Attendance 1
There shall be two test papers and the average grade point is to be considered for
internal assessment
SEMESTER I
Course I
Financial mathematics
Semester I
COMPLEMENTARY COURSE II
FINANCIAL MATHEMATICS
Time: 3 Hrs
Part A
Choose the correct answer from the brackets
Each bunch of four questions carries a weightage of one
1. If an investor deposits £4000 in a bank account that pays simple interest at a rate
of 6% pa, then after 8 years the balance will be ------------------
(a) 5920 (b) 4920 (c) 3920 (d) 3000
2. If an investor deposits £4000 in a bank account that pays compound interest at a
rate of 6% pa, then after 8 years the balance will be ------------------
(a) 5920 (b) 4920 (c) 6375 (d) 6000
3. An investor must make a payment of £5000 in 5years time. The investor wishes to
make provision for this payment by investing a single sum now in a deposit
account that pays 10% pa compound interest. How much should the initial
investment be?
(a)3105 (b)4105 (c)4000 (d)3000
4. An 8 month loan repayable by a single repayment is issued at a rate of
commercial discount of 15% pa. If the amount of the repayment is £1,00,000, how
much was initially lent to the borrower?
(a)80000 (b)90000 (c)100000 (d)75000
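The arithmetic behind questions 1 to 4 can be sketched as follows, using the standard formulas for simple interest, compound interest, compound discounting and commercial (simple) discount. The variable names are illustrative only.

```python
# Q1: simple interest accumulation of 4000 at 6% pa for 8 years.
principal, rate, years = 4000.0, 0.06, 8
simple_amount = principal * (1 + rate * years)

# Q2: compound interest accumulation over the same term.
compound_amount = principal * (1 + rate) ** years

# Q3: present value of 5000 due in 5 years at 10% pa compound interest.
present_value = 5000.0 / (1 + 0.10) ** 5

# Q4: amount lent under a commercial discount of 15% pa for 8 months.
amount_lent = 100000.0 * (1 - 0.15 * 8 / 12)

for label, value in [("Q1", simple_amount), ("Q2", compound_amount),
                     ("Q3", present_value), ("Q4", amount_lent)]:
    print(label, round(value))
```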
5. £80 is invested at time 5 and the accumulated amount at time 8 is £100. What is the
rate of interest?
(a)8.33% (b)8% (c)7% (d)7.33%
6. Find the value at time t=0 of $250 due at time t=6 and $600 due at time t=8, if
δ(t)=3% pa for all t
(a)680.79 (b)650 (c)675.25 (d)680
7. Calculate a25 at 13½%pa effective
(a)7.095 (b)7.25 (c)8.095 (d)8.75
8. A loan of £900 is repayable by equal monthly payments for 3years, with interest
payable at 18½%pa effective. Calculate the amount of each monthly payments
(a)32.13 (b)31.13 (c)35.25 (d)30.75
9. Find R,if P=7892, l=5, i= 10% and n=10
(a)125.01 (b)123.25 (c)175 (d)150
10. Find P, if l=5, R=125, i=10% and n=20
(a)61.15 (b)65.25 (c)60.825 (d)62.13
11. Calculate numerical value for ā7 @7½%pa
(a)5.4928 (b)6.492 (c)7.25 (d)8.125
12. Calculate 5\ ä8(3) @ 6%
(a) 3.8247 (b) 4.8247 (c) 5.25 (d)6.875
Part B
Attempt all questions- each question carries a weightage of one
PART C
Attempt any four questions- each question carries a weightage of two
21. Consider two non-overlapping time periods. Period 1 has length l time units and
period 2 has length m time units. If the effective period 1 interest rate is i, express
the equivalent effective period 2 interest rate in terms of i, l and m
22. If the force of interest is δ(t)=0.04 for 0<t<6 and δ(t)=0.2-0.02t for 6<t<9, find the
accumulated value at time 8 of a payment of $400 at time 3
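For question 22 the accumulation factor is the exponential of the integral of the force of interest from time 3 to time 8. A small numerical sketch, using a midpoint rule (exact for piecewise-linear δ up to rounding) and illustrative function names:

```python
import math

def delta(t):
    """Force of interest from question 22."""
    return 0.04 if t < 6 else 0.2 - 0.02 * t

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

# Split the integral at t = 6, where the formula for delta changes.
exponent = integrate(delta, 3, 6) + integrate(delta, 6, 8)
accumulated = 400 * math.exp(exponent)
print(f"integral of delta = {exponent:.4f}, accumulated value = {accumulated:.2f}")
```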
23. Find the accumulated value of a payment stream of 0.3+1.5t that is received
continuously from time 4 to time 8, during which time the force of interest is
0.01+0.05t
24. A motorist buys a car costing £5000 using a loan with a flat rate of interest of
10% and repayments at the end of each of the next 12 months. Calculate the loan
outstanding immediately after the second payment
25. A loan of $50000 is repayable by equal annual payments at the end of each of the
next 5 years; interest is 8% pa for the first 3 years and 12% pa thereafter. Calculate
the loan outstanding immediately after the second payment
26. Derive formulae for (Iä)n
a. Algebraically, and
b. By general reasoning, starting from the formula for (Ia)n
PART D
Attempt any two questions- each question carries a weightage of four
27. A woman takes out a home improvement loan for £11000 over 5 years. She makes
monthly repayments in arrears and the bank charges an effective rate of interest of
6% pa
(a) What is the monthly repayment?
(b) How much interest does she pay in the 3rd year?
(c) How much capital is repaid in the 20th instalment?
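One way to organise the three parts of question 27 is to build the loan schedule month by month. The sketch below assumes the monthly effective rate is derived from the 6% pa effective rate as (1.06)^(1/12) - 1; the variable names are illustrative.

```python
loan, annual_rate, months = 11000.0, 0.06, 60

# Monthly effective rate equivalent to 6% pa effective.
i_m = (1 + annual_rate) ** (1 / 12) - 1
# Present value of 60 monthly payments of 1 in arrears.
annuity = (1 - (1 + i_m) ** -months) / i_m
payment = loan / annuity                      # part (a)

balance = loan
interest_year3 = 0.0                          # part (b): months 25 to 36
capital_in_20th = None                        # part (c)
for k in range(1, months + 1):
    interest = balance * i_m
    capital = payment - interest
    balance -= capital
    if 25 <= k <= 36:
        interest_year3 += interest
    if k == 20:
        capital_in_20th = capital

print(f"(a) monthly repayment      = {payment:.2f}")
print(f"(b) interest in 3rd year   = {interest_year3:.2f}")
print(f"(c) capital in 20th payment = {capital_in_20th:.2f}")
```

The closing balance should be zero apart from rounding, which is a convenient check on the schedule.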
28. An investor wishes to find the present value of a stream of property income
payments. She proposes to make the following assumptions
• The level of current payment is £20,000 paid quarterly in advance
• Payments will remain fixed for each 5-year period. At the end of each
5-year period the payments will rise in line with total inflationary
growth over the previous 5 years
• Inflation is assumed to be constant at 3% pa
• The interest rate for the calculation is 12% pa effective
Find the P.V. of the income stream, assuming that the payments continue for 50
years
29. The force of interest is given by
δ(t) = 0.04+0.002t, 0<t<10
     = 0.015t-0.08, 10<t<12
     = 0.07, t>12
Find the expression for the accumulation factor from time 0 to t.
SEMESTER II
Module II: Multiple life functions: Joint life status-the last survivor status-
Probabilities and expectations-Insurance and annuity benefits-
Evaluation-Special mortality laws-Evaluation-Uniform distribution
of death-Simple contingent functions-Evaluation 10hrs
Semester II
COMPLEMENTARY COURSE II
PART A
13. On the basis of the life table, evaluate the probability that (20) will
(a) live to 100
(b) die before 70
14.Explain complete expectation of life
15. Under the assumption of uniform distribution of deaths, show that
(a) e°x = ex + 1/2
(b) Var[T] = Var[K] + 1/12
16. The pdf of the future lifetime T for (x) is assumed to be
fT(t) = 1/80, 0<t<80
      = 0, elsewhere
At a force of interest δ, calculate for Z, the PVRV for a whole life insurance
of unit amount issued to (x)
(a) The actuarial present value
(b) The variance
17. Explain Endowment life insurance at the moment of death
18. Compare the variances of the PVRV’s for the complete annuity - immediate
19. Prove that n/qx = (Ax:n - Ax)/d – nEx
20. Explain n-year temporary life annuity – due
PART C
Attempt any four questions- each question carries a weightage of two
21. Assuming that future life times of (80) and (80) are independent, obtain an
expression in single life table functions for the probability that their
(a) First death occurs after 5 and before 10 years from now
(b) Last death occurs after 5 and before 10 years from now
22. Prove that nqx2y = nqx-nqx1y & nqx1x= ½nqxx
23. Using life tables, evaluate
(a)2P[30] (b) 5P[30] (c) 1\q[31] (d)q[31]+1
24. Under the constant force of mortality assumption, are the random variables K and
S independent?
25. Assume that each of 100 independent lives
(i) Is age x
(ii) Is subjected to a constant force of mortality µ=0.04 and
(iii) Is insured for a death benefit amount of 10 units, payable at the
moment of death
The benefit payments are to be withdrawn from an investment fund earning δ=
0.06. Calculate the minimum amount at t=0, so that the probability is
approximately 0.95 that sufficient funds will be on hand to withdraw the benefit
payment at the death of each individual
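Question 25 is usually handled with a normal approximation to the total present value of the 100 benefit payments. The sketch below assumes that approach and uses the constant-force results E[v^T] = µ/(µ+δ) and E[v^(2T)] = µ/(µ+2δ); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, delta, benefit, lives = 0.04, 0.06, 10.0, 100

# Moments of the present value Z = 10 v^T for one life.
mean_z = benefit * mu / (mu + delta)
second_moment_z = benefit ** 2 * mu / (mu + 2 * delta)
var_z = second_moment_z - mean_z ** 2

# Sum over 100 independent lives, then the 95th percentile under normality.
mean_s = lives * mean_z
sd_s = math.sqrt(lives * var_z)
z95 = NormalDist().inv_cdf(0.95)
fund = mean_s + z95 * sd_s
print(f"E[S] = {mean_s:.0f}, SD[S] = {sd_s:.0f}, required fund = {fund:.2f}")
```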
26. Consider a 5-year deferred whole life insurance payable at the moment of death of
(x). The individual is subject to a constant force of mortality µ=0.04. For the
distribution of the PV of the benefit payment at δ=0.10
(a) Calculate the expectation
(b) Calculate the variance
PART D
Attempt any two questions- each question carries a weightage of four
27. Explain the relationship between insurances payable at the moment of death and at the end of
the year of death
28. Under the assumptions of a constant force of mortality µ and of a constant force
of interest δ, evaluate
(a) āx=E[āT]
(b) Var[āT]
(c) Probability that āT exceeds āx
29. The future lifetimes T(x) and T(y) are independent and each has a distribution defined
by the pdf
f(t) = 0.02(10-t), 0<t<10
     = 0, elsewhere
(a) Determine the distribution function, survival function and force of
mortality
(b) Determine the joint pdf & joint distribution function and joint survival
Function for T(x) and T(y)
(c) Determine complete expectation for the joint life status T(x,y)
SEMESTER III
Course III
Module II: Fully continuous net premium reserves-other formulas for fully
discrete net premium results-Reserves on semi continuous basis-
Reserves based on semi continuous basis-Reserves based on
apportionable or discounted continuous basis-Recursive formulae
for fully discrete basis-Reserves at fractional duration-Allocation
of the loss to the policy years-Differential equation for fully
continuous reserves 25
Module III: Concept of Risk-the concept of Insurance-Classification of
Insurance-Types of Life Insurance-Insurance Act, fire ,marine,
motor engineering, Aviation and agricultural-Alternative
classification-Insurance of property-pecuniary interest, liability
&person, Distribution between Life & General Insurance-History
of General Insurance in India. 25hrs
Semester III
COMPLEMENTARY COURSE II
Time: 3 Hrs
PART A
Choose the correct answer from the brackets
A bunch of four questions carries a weightage of one
1. Given for a double decrement table, that q401(1) = 0.02 and q1(2)=0.04. Calculate
q40(1) to four decimals
(a) .0909 (b)0.0592 (c)0.0426 (d)0.3296
2. Let the loss random variable X have a pdf given by f(x)=0.1e^(-0.1x), x>0. Calculate
E[X]
(a) 10 (b)30 (c)25 (d)15
3. The loss random variable X have the pdf given by f(x)=1/100, 0<x<100, calculate
V[X]?
(a) 50, (b) 2500/3 (c) 250/3 (d)45/3
1 (12) (12)
4. If Px :20 = 1.032 and Px:20=0.040, what is the value of Px:20 ?
(a) 0.035 (b) 0.326 (c) 0.957 (d) 0.583
5. Using the illustrative life table, directly calculate P(2)[Ā50:20/ā(2)50:20]
(a) 0.0413 (b) 0.0328 (c) 0.191 (d) 0.0456
6. Using the illustrative life table and an interest rate of 6%, calculate the components of the
decomposition
1000 P50:20 = 1000(P50:120 + P50:201`)
13. Determine an expression in actuarial present values and benefit premiums for the
Var[ kL / k(x) = k, k+1,……..] for a fully discrete n-year endowment insurance with a
unit benefit
14. A fully discrete whole life insurance with a unit benefit issued to (x) has its first years
benefit and the remaining benefit premiums are level and determined by the equivalence
principle
Determine formulas for
a. The first year benefit premium
b. The level benefit premium after the 1st year
15. Calculate P[Āx] and Var[L] with the assumptions that the force of mortality is a
constant µ=0.04 and the force of interest δ=0.06
16. Derive relationships among continuous benefit premiums using identities
17. A decision maker’s utility function is given by u(w) = -e^(-5w). The decision maker has
two random economic prospects available. The outcome of the first has a normal
distribution with mean 5 and variance 2 and the outcome of the second has a normal
distribution with mean 6 and variance 2.5. Which prospect will be preferred?
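For question 17, reading the utility function as the exponential utility u(w) = -exp(-5w), the expected utility of a normal prospect follows from the normal moment generating function: E[u(W)] = -exp(-5µ + 12.5σ²). A short sketch with an illustrative helper name:

```python
import math

def expected_utility(mean, variance, alpha=5.0):
    """E[-exp(-alpha*W)] for W ~ N(mean, variance), via the normal mgf."""
    return -math.exp(-alpha * mean + 0.5 * alpha ** 2 * variance)

eu_first = expected_utility(5, 2)
eu_second = expected_utility(6, 2.5)
# The prospect with the larger expected utility is preferred.
preferred = "first" if eu_first > eu_second else "second"
print(f"E[u] first = {eu_first:.4f}, second = {eu_second:.4f}; preferred: {preferred}")
```

The second prospect has the higher mean, but under this utility its extra variance more than offsets the gain.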
18. Explain fully continuous benefit reserves in whole life insurance
19. Derive a general expression for [²Āx - (Āx)²]/(δāx)², where µx(t)=µ for t>0 and δ is the force of
interest
20. Prove and interpret the formula Px:n = nPx + Px:n1(1-Ax+n)
PART C
Attempt any four questions- each question carries a weightage of two
25. The probability that a property will not be damaged in the next period is 0.75. The pdf
of a possible loss is given by f(x)=0.25(0.01)e^(-0.01x), x>0. The owner of the property has a
utility function given by u(w)= -e^(-0.05w). Calculate the expected loss and the maximum
insurance premium the property owner will pay for complete insurance
27. An insurer is planning to issue a policy to a life aged 0, whose curtate future life
time K is governed by the p.f. k|q0=0.2, k=0,1,2,3,4
The policy will pay 1 unit at the end of the year of death in exchange for the payment
of a premium P at the beginning of each year, provided the life survives. Find the
annual premium P as determined by:
c. Principle I: P will be the annual premium such that the insurer, using a
utility of wealth function u(x)=x, will be indifferent between accepting and
not accepting the risk
d. Principle II: P will be the annual premium such that the insurer, using a
utility of wealth function u(x)= -e^(-0.01x), will be indifferent between accepting and
not accepting the risk
28. If k|qx = c(0.96)^(k+1), k=0, 1, 2, ……, where c=0.04/0.96 and i=0.06, calculate Px
and V[L]
29. On the basis of De Moivre’s law with lx=100-x and an interest rate of 6%,
calculate
(a) P(Ā35), (b) tV(Ā35) and (c) V[tL | T(x)>t], for t=0,10,20,….,60
SEMESTER IV
Course IV
Probability models and Risk theory
Module I: Individual risk model for a short time: Model for individual claim
random variables-Sums of independent random variable-
Approximation for the distribution of the sum-Application to
insurance 20hrs
Module II: Collective risk models for a single period: The distribution of
aggregate claims-Selection of basic distributions-Properties of
compound Poisson distributions –Approximations to the
distribution of aggregate claims 25hrs
Module III: Collective risk models over an extended period: Claims process-
The adjustment coefficient-Discrete time model-The first surplus
below the initial level-The maximal aggregate loss 20hrs
Semester IV
COMPLEMENTARY COURSE II
Time: 3 Hrs
Part A
Choose the correct answer from the brackets
A bunch of four questions carries a weightage of one
1. Let X be the number obtained when one true die is tossed. Let Y be the sum of the
numbers obtained when X true dice are then thrown. Calculate E[Y]
(a) 4/7 (b)7/4 (c)2/6 (d)3/6
2. Under certain assumptions, the probability of ruin is
Ψ(u) = (0.3)e^(-2u) + (0.2)e^(-4u) + (0.1)e^(-7u), u > 0. Calculate θ
(a)2/3 (b)1/3 (c)1 (d)½
3. Suppose that λ = 3, C = 1 and P(x) = (1/3)e^(-3x) + (16/3)e^(-6x), x > 0. Calculate P1
(a)3/27 (b)6/27 (c)4/27 (d)5/27
4. Suppose that λ = 1, C = 10 and P(x) = (9x/25)e^(-3x/5), x > 0. Calculate θ
(a) 3 (b) 4 (c) 2 (d)5
5. Suppose that the claim amount distribution is discrete with P(1)=1/4 and
P(2)=3/4. If R = log 2, calculate θ
(a) 10/(7 log 2) - 1 (b) 10/(7 log 2) (c) 10/(5 log 2) - 1 (d) 10/(5 log 2)
6. Suppose that Wi assumes only, the value 0 and +2 and that
P[W=0]=p,P[W=2]=q,where p+q=1,Assume that C=1,P>1/254
7. Consider an insurance portfolio that will produce 0, 1, 2, or 3 claims in a fixed
time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual
claim will be of amount 1, 2, or 3 with probabilities 0.5, 0.4 and 0.1
respectively. Calculate E[N]
(a) 1.7 (b) 2.7 (c) 2.8 (d)1.6
8. Suppose that θ=2/5 and p(x) = (3/2)e^(-3x) + (7/2)e^(-7x), x>0. Calculate γ
(a) 2 (b)3 (c)4 (d)2.5
9. If S has a compound Poisson distribution given by λ=3,p(1)= 5/6,p(2)=1/6,
Calculate fs(x) for x=0
(a) 0.050 (b) 0.25 (c) 0.052 (d) 0.523
10. Consider an insurance portfolio that will produce 0, 1, 2, or 3 claims in a fixed
time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual
claim will be of amount 1, 2, or 3 with probabilities 0.5, 0.4 and 0.1
respectively. Calculate V[N]
(a) 0.8 (b) 0.028 (c) 0.08 (d) 0.285
11. Suppose that θ=2/5 and p(x) = (3/2)e^(-3x) + (7/2)e^(-7x), x>0. Calculate R
(a) 2.5 (b) 3.45 (c) 4.25 (d) 2.5
12. If S has a compound Poisson distribution given by λ=3,p(1)= 5/6,p(2)=1/6,
Calculate Fs(x) for x=2
(a) 0.354 (b) 0.258 (c) 0.520 (d) 0.545
PART B
Attempt all questions- each question carries a weightage of one
13. Assume that N has a geometric distribution; that is, the probability function of N
is given by
P[N=n] = pq^n, n=0,1,2…..
where 0<q<1 and p=1-q. Determine MS(t) in terms of MX(t)
14. If S has a compound Poisson distribution, specified by λ and p(x), then the
distribution of Z = (S - λP1)/√(λP2) converges to the standard normal distribution as
λ→∞?
15. Write an expression for the distribution of the surplus level at the first time the
surplus falls below the initial level u, given that it does fall below u, if all claims
are of size 2
16. Derive an expression for Ψ(u) if the Xi’s have an exponential claim amount
distribution?
17. Write an expression for the distribution of L, if the size of the individual claims
has an exponential distribution with parameter β?
18. Find the mean and variance of the Inverse Gaussian distribution, by using its
mgf
19. Derive an expression for R in the special case where the Wi’s common
distribution is N(µ,σ2)?
20. Determine the adjustment coefficient if the claim amount distribution is
exponential with parameter β>0?
PART C
Attempt any four questions- each question carries a weightage of two
21. Assume that u(λ) is the gamma probability distribution function with parameters
α and β,
u(λ) = [β^α / Γ(α)] λ^(α-1) e^(-βλ), λ>0
where Γ(α) = ∫_0^∞ y^(α-1) e^(-y) dy. Show that the marginal distribution of N is negative
binomial with parameters r = α, p = β/(1+β)
25. Calculate the adjustment coefficient if all the claims are of size 1?
26. Calculate the probability of ruin in the case that the claim amount distribution
is exponential with parameter β >0
PART D
Attempt any two questions- each question carries a weightage of four
28. Prove that for a compound distribution where the probability distribution for N,
the number of claims, satisfies the condition
P[N=n] / P[N=n-1] = a + b/n, for n = 1,2,………..
and where the distribution of claim amounts is restricted to the
positive integers,
fS(x) = ∑_{i=1}^{x} [a + bi/x] p(i) fS(x-i), x=1,2,………
with the starting value fS(0) = P[N=0]
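The recursion in question 28 can be illustrated numerically. The sketch below applies it to the compound Poisson case of Part A (λ=3, p(1)=5/6, p(2)=1/6), for which P[N=n]/P[N=n-1] = λ/n, so a=0 and b=λ. The function name `recursive_pf` is illustrative only.

```python
import math

def recursive_pf(a, b, f0, claim_pf, x_max):
    """fS(0..x_max) from fS(x) = sum_{i=1..x} (a + b*i/x) p(i) fS(x-i)."""
    f = [f0]
    for x in range(1, x_max + 1):
        total = 0.0
        for i in range(1, x + 1):
            total += (a + b * i / x) * claim_pf.get(i, 0.0) * f[x - i]
        f.append(total)
    return f

lam = 3.0
claim_pf = {1: 5 / 6, 2: 1 / 6}
# Starting value fS(0) = P[N=0] = exp(-lambda) for the Poisson case.
f = recursive_pf(0.0, lam, math.exp(-lam), claim_pf, 2)
cdf_2 = sum(f)
print(f"fS(0) = {f[0]:.4f}, FS(2) = {cdf_2:.4f}")
```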
29. Given that θ = 2/5 and P(x) = (3/2)e^(-3x) + (7/2)e^(-7x), x>0, calculate Ψ(u), γ and R
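For the adjustment coefficient in question 29, R is the smallest positive root of 1 + (1+θ)p1·r = MX(r). The sketch below reads the claim density as a mixture of exponentials with rates 3 and 7 and solves the equation by bisection; the bracket [0.1, 2.9] is an assumption chosen to exclude the trivial root r=0 and the pole of the mgf at r=3.

```python
theta = 2 / 5
p1 = 0.5 / 3 + 0.5 / 7          # mean claim size

def mgf(r):
    """Moment generating function of the claim amount, valid for r < 3."""
    return 0.5 * 3 / (3 - r) + 0.5 * 7 / (7 - r)

def g(r):
    """Difference whose positive root is the adjustment coefficient."""
    return mgf(r) - 1 - (1 + theta) * p1 * r

lo, hi = 0.1, 2.9               # g(lo) < 0 < g(hi)
for _ in range(100):
    mid = (lo + hi) / 2
    if g(mid) < 0:
        lo = mid
    else:
        hi = mid
R = (lo + hi) / 2

ruin_at_zero = 1 / (1 + theta)  # general result: psi(0) = 1/(1+theta)
print(f"R = {R:.6f}, psi(0) = {ruin_at_zero:.4f}")
```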
STATISTICS: COMPLEMENTARY – I Syllabus for BSc.
4 SG4C04 5 3 3 3:1
TESTING OF
HYPOTHESIS
There shall be 4 parts A, B, C and D in all the question papers. Part A consists of 12
objective type questions. Part B consists of 8 questions to be answered in a word,
phrase or sentence. Part C consists of 6 questions of short essay type of which the
student can attempt 4. Part D consists of 3 questions of long essay type of which the
student can attempt 2. In part A the weightage per question is ¼. For part B the weightage
is 1/question. For part C the weightage is 2/question and for part D the weightage is
4/question. As far as possible the number of questions should be proportional to the
modules.
Table showing the components and weightage for internal assessment
Components Weight
Assignment 1
Test paper 2
Seminar 1
Attendance 1
There shall be two test papers and the average grade point is to be considered for
internal assessment.
Semester I
B.Sc.Geography (Main)
I Semester
COURSE I : (Complementary I)
STATISTICAL METHODS
Section A
Answer all questions (Contains 12 questions, 4 Questions carry a weightage of 1)
1. The heights of 150 students are collected. The type of classification that is best
suited is
a) Qualitative
b) Quantitative
c) Geographical
d) Chronological
2. A frequency distribution in which the upper limits are not included in their
respective classes is called a
a) Continuous frequency distribution
b) Discrete frequency distribution
c) Raw data
d) Ungrouped frequency distribution
3. The class mark of a class is obtained by
a) upper limit-lower limit
b) upper limit + lower limit
c) (upper limit + lower limit)/2
d) (upper limit − lower limit)/2
4. When there are zeroes in the data we can not use
a) Median
b) Mode
c) Geometric mean
d) Arithmetic mean
5. The most suitable measure for an ordinal data is:
a) Median
b) Arithmetic mean
c) Combined mean
d) Mode
6. Mean of 20 values is 45. If one of these values is to be taken 64 instead of 46, the
correct value of mean is:
a) 49.5
b) 45.9
c) 40.9
d) 42.9
7. The formula to find coefficient of variation is:
a) (σ / X̄) × 100 b) (X̄ / σ) × 100
c) (Median / σ) × 100 d) σ × 100
8. Mean deviation from median is:
a) Equal to mean deviation from mean
b) Greater than mean deviation from mean
c) Less than mean deviation from mean
d) No relation
9. The 50th percentile is equal to:
a) 10th decile
b) 1st decile
c) 2nd decile
d) 5th decile
10. For a symmetric distribution median and mode = 10. The value of mean is:
a) Zero
b) 20
c) 10
d) 5
11. For a positively skewed data:
a) Mean = mode
b) Mean < mode
c) Mean > mode
d) (Mean – Mode)/2
12. A curve which is flatter than a normal curve is called
a) Skewed curve
b) Platykurtic curve
c) Leptokurtic curve
d) Mesokurtic curve
Section B (Contains 6 questions answer any 4) Weight-1
13. When there are open end classes, we use median as a measure of central tendency
(1) Say true or false
(2) Explain your answer
14. In the case of categorical data we can not use histogram
(1) Say true or false
(2) Explain your answer
15 Suppose that the standard deviation of a set of observation is 3. If from each
observation ‘3’ is subtracted, the new standard deviation is zero.
(1) Say true or false
(2) Explain your answer
16. If 25% of the items are less than 10 and 25% are more than 40 the coefficient of
quartile deviation is -------.
17. Karl Pearson’s coefficient of skewness of a distribution is 0.32 and its standard
deviation is 6.5. The mean is 29.6. The mode is -------.
18. Define harmonic mean of n observations.
19. Give an example of a primary data.
20. Give the empirical relationship between mean ,median and mode.
Section C
(4 Questions to be answered out of 6) Weight-2
21. Explain why A.M. is considered as the best measure of central tendency?
22. Calculate quartile deviation for the following data:-
26, 54, 33, 41, 94, 41, 54, 26, 93, 87, 81, 64, 68, 95.
23. The first of two sub-groups has 10 items with mean 15 and S.D. 3. If the whole
group has 250 items with mean 15.6 and S.D. 13.44, find the standard deviation
of the second subgroup.
Module 2. Index numbers, meaning and use of index numbers – simple and
number, chain base and fixed base index number – construction of cost of
20hrs
secular trend semi average, moving average and least square methods (linear
B.Sc.Geography (Main)
Semester II
COURSE II : (Complementary I)
Part B
Answer all questions Weight 1
13. Karl Pearson’s formula for measure of skewness is -------------
15. Write down the normal equations for fitting a straight line.
16. Given the trend equation Y = 108 + 2.8 X with 2000 as origin and yearly data
from 2000 to 2002, the estimated trend value for 2005 is ---------
17. The formula for calculating the rank correlation coefficient is --------
Part C
Answer any 4 questions, weight 2
25.With the help of an Index Number formula, explain Time and Factor Reversal Tests.
27.Given the following data related to yield of a crop in three different seasons.
1990 12 19 17
1991 14 25 23
1992 13 27 20
1993 15 28 22
1994 17 31 24
29. Calculate the cost of Living Index Number for the data given below.
Rice
Food 30 47 4
Fuel 8 12 1
Clothing 14 18 3
House Rent 22 15 2
Miscellaneous 25 30 1
Semester III
Course III-PROBABILITY
problems. 30hrs
Model Question Paper
B.Sc.Geography (Main)
Semester III
COURSE III : (Complementary I)
PROBABILITY
Part A
(Answer all the questions. Choose the correct answer from the
alternatives given below each question). Bunch of 4 carries weight 1
Part B
(Answer all the questions) Weight-1
21. Show that for any two events A and B in a sample space S
P ( A ∩ B) ≥ P (A) + P (B) -1
22. In a swimming race the odds that A will win are 2 to 3 and the odds that
B will win are 1 to 4. Find the probability that A or B wins the race.
23. State and prove the multiplication law of probability.
24. What are the properties of a distribution function.
25. For a Poisson distribution with parameter 3, find Pr(X>2)
26. Examine whether f(x) as defined below is a pdf.
f(x) = 0 for x<2
     = (1/18)(3+2x) for 2 ≤ x<4
     = 0 for x>4
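The check asked for in question 26 amounts to verifying that the function is non-negative and integrates to 1. A minimal sketch, reading the non-zero piece as (3+2x)/18 on 2 ≤ x < 4 and using exact fractions (the helper name is illustrative):

```python
from fractions import Fraction

def antiderivative(x):
    """Antiderivative of (3 + 2x)/18, i.e. (3x + x^2)/18."""
    x = Fraction(x)
    return (3 * x + x * x) / 18

# Total probability over the support [2, 4).
total = antiderivative(4) - antiderivative(2)
# (3 + 2x) is increasing, so non-negativity on [2, 4) follows from x = 2.
nonnegative = (3 + 2 * 2) > 0
print("integral =", total, "| non-negative:", nonnegative)
```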
Part D
(Answer any 2 questions) Weight-4
Complementary I
Course-IV-TESTING OF HYPOTHESIS
Module 1. Testing of statistical hypotheses, large and small sample tests, basic
squares tests.
35hours
Module 2. Non parametric tests – advantages, sign test, run test, signed rank
30 hours
5. Box, G.E.P. and G.M. Jenkins: Time Series Analysis, Holden –Day
Model Question Paper
B.Sc.Geography (Main)
Semester IV
COURSE IV : (Complementary I)
TESTING OF HYPOTHESIS
Part A
Time 3 hours Answer all questions
2. In a paired t-test:
a) The sample sizes should be equal
b) The size of the first sample should be less than the size of the second
c) The size of the second sample should be less than the size of the first
d) Both sample sizes should be ≥ 50
r 1− r2
b.
n−2
r 1− r2
c. n−3
r
d. n−2
1− r2
4. In the test of equality of means of two normal population with small samples of
sizes n1 and n2 taken from them and if the population have equal but unknown
variance, the test statistic follows:
a) t n1+n2-1 b)t n1+n2 c)t n1+n2/2 d) t n1+n2-2
5. In a chi-square contingency table with 3 rows and 5 columns, the d.f of chi-square
statistic is
a) 15
b) 24
c) 8
d) 7
6. The chi-square test statistic for a goodness of fit test is given by:
a) (Oi − Ei)/Ei
b) ∑ (Oi − Ei)/Ei²
c) ∑ (Oi − Ei)²/Ei²
d) ∑ (Oi − Ei)²/Ei
9. The test used to check the randomness of the collected set of symbols is:
a) Sign test
b) Rank sum test
c) Signed rank test
d) Run test
10. When there are 3 groups, each following normal distribution, and the null
hypothesis is concerned with the equality of means the test used is:
a) Chi square test
b) t-test for equality of means
c) Analysis of variance
d) none of the above
11. The test statistic in a two way ANOVA table follows:
a) Chi-square distribution
b) t-distribution
c) Normal distribution
d) F-distribution
12. In a one way ANOVA if the d.f of the total S.S is 13 and the d.f of the between
sample sum of squares is 6, the d.f of the error sum of squares is:
a) 7 b) 6 c) 19 d) 3
13. In chi-square test of independences of 2 attributes with 2 observations each, the d.f
of the test statistic is 1.
14. In the case of sign test, the test statistic follows a binomial distribution.
15. In a one-way ANOVA, the total sum of squares of observations is 6212 and the
error sum of squares is 3272. The sum of squares between samples is 2900.
18. A sample of size 12 is taken from a normal distribution. The sample variance is 1.8.
What is the value of the test statistic for the test with H0: σ² = 3?
21. What is the null hypothesis for a chi-square test of homogeneity of proportions
and give the layout of observations.
24. In a lot containing 1235 articles, 35 were found to be defective. Does the
hypothesis: The proportion of defective articles is less than 0.02 hold?
25. The value of the sample mean from a population which was assumed to have
mean 5 is 4. The sample size is 100 and the variance of the sample is 1. Is there a
significant difference between the sample mean and the population mean?
26. Explain paired t-test.
Part D Weight-4 ( Answer any 2 questions)
27 A factory operates in three shifts. The factory manager feels that quality of
part is related to shifts. For this purpose he has collected the following data
from the past records of production.
            No. of Parts
Shift       Good   Bad
Day         900    130
Evening     700    170
Night       400    200
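A chi-square test of independence is the natural tool for question 27. The sketch below assumes the counts read as Day 900/130, Evening 700/170 and Night 400/200 (good/bad), which is a reconstruction of the printed table; 5.991 is the usual 5% table value for 2 d.f.

```python
# Observed counts (good, bad) per shift; layout assumed as described above.
observed = {"Day": (900, 130), "Evening": (700, 170), "Night": (400, 200)}

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)

chi2 = 0.0
for row, row_total in zip(rows, row_totals):
    for obs, col_total in zip(row, col_totals):
        expected = row_total * col_total / n   # expected count under independence
        chi2 += (obs - expected) ** 2 / expected

dof = (len(rows) - 1) * (len(col_totals) - 1)
print(f"chi-square = {chi2:.2f} on {dof} d.f. (5% critical value 5.991)")
```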
28 Fifteen patient records from each of two hospitals were received and
assigned a score designed to measure level of care. The scores were as
follows:-
Hospital A: 99 85 73 98 83 88 99 80 74 91 80 94 94 98 80
Hospital B: 78 74 69 79 57 78 79 68 59 91 89 55 60 55 79
Use a proper non-parametric test to see whether the two populations are
identical with respect to the level of care.
29. The laboratories A and B carry out independent estimates of fat content in ice-
creams made by a firm. A sample is taken from each batch, halved and the
separate halves sent to the two laboratories. The fat contents obtained by the
laboratories are recorded below. (Fat contents in milligrams are given below)
Batch No. 1 2 3 4 5 6 7 8 9 10
Lab A 7 8 7 3 8 6 9 4 7 8
Lab B 9 8 8 4 7 7 9 6 6 6
Is there a significant difference between the mean fat content obtained by the two
laboratories A and B?
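Because each batch is halved and sent to both laboratories, question 29 calls for a paired comparison. A minimal sketch of the paired t statistic (2.262 is the usual two-sided 5% table value for 9 d.f.):

```python
import math
from statistics import mean, stdev

lab_a = [7, 8, 7, 3, 8, 6, 9, 4, 7, 8]
lab_b = [9, 8, 8, 4, 7, 7, 9, 6, 6, 6]

# Work with the within-batch differences A - B.
diffs = [a - b for a, b in zip(lab_a, lab_b)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
dof = n - 1
print(f"t = {t_stat:.3f} on {dof} d.f.; compare |t| with 2.262")
```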
STATISTICS: COMPLEMENTARY – I
SYLLABUS FOR BSc. PSYCHOLOGY (MAIN)
CUCCSSUG 2009 (2009 admission onwards)
1 PS1C01 4 3 3 3:1
STATISTICAL
METHODS
There shall be 4 parts A, B, C and D in all the question papers .Part A consists of 12 objective
type questions. Part B consists of 8 questions to be answered in a word, phrase or sentence.
Part C consists of 6 questions of short essay type of which the student can attempt 4. Part D
consists of 3 questions of long essay type of which the student can attempt 2. In part A the
weightage per question is ¼. For part B the weightage is 1/question. For part C the weightage is
2/question and for part D the weightage is 4/question. As far as possible the number of
questions should be proportional to the modules.
Table showing the components and weightage for internal assessment
Components Weight
Assignment 1
Test paper 2
Seminar 1
Attendance 1
There shall be two test papers and the average grade point is to be considered for
internal assessment
Semester-I STATISTICAL METHODS
Module 1. Pre-requisites.
A basic idea about data, its collection, organization and planning of survey and
diagrammatic representation of data is expected from the part of the students.
Classification of data, frequency distribution, formation of a frequency distribution, Graphic
representation viz. Histogram, Frequency Curve, Polygon, Ogives and Pie Diagram. 20hr
B.Sc. Psychology
I Semester -Statistical Methods
COURSE I : Psychological Statistics (Complementary I)
Time: 3 Hrs
PART A
(Contains 12 questions, 4 Questions carry a weightage of 1)
1. The heights of 150 students are collected. The type of classification that is best suited is
a) Qualitative
b) Quantitative
c) Geographical
d) Chronological
2. A frequency distribution in which the upper limits are not included in their respective
classes is called a
c) Raw data
a) Median
b) Mode
c) Geometric mean
d) Arithmetic mean
a) Median
b) Arithmetic mean
c) Combined mean
d) Mode
6. Mean of 20 values is 45. If one of these values is to be taken 64 instead of 46, the correct
value of mean is:
a) 49.5
b) 45.9
c) 40.9
d) 42.9
d) No relation
9. The 50th percentile is equal to:
a) 10th decile
b) 1st decile
c) 2nd decile
d) 5th decile
10. For a symmetric distribution median and mode = 10. The value of mean is:
a) Zero
b) 20
c) 10
d) 5
a) Mean = mode
d) (Mean – Mode)/2
a) Skewed curve
b) Platykurtic curve
c) Leptokurtic curve
d) Mesokurtic curve
PART B
13. When there are open end classes, we use median as a measure of central tendency
15. Suppose that the standard deviation of a set of observation is 3. If from each observation
‘3’ is subtracted, the new standard deviation is zero.
16. If 25% of the items are less than 10 and 25% are more than 40 the coefficient of quartile
deviation is -------.
17. Karl Pearson’s coefficient of skewness of a distribution is 0.32 and its standard deviation
is 6.5. The mean is 29.6. The mode is -------.
PART C
21. Explain why A.M. is considered as the best measure of central tendency?
26, 54, 33, 41, 94, 41, 54, 26, 93, 87, 81, 64, 68, 95.
23. The first of two sub-groups has 10 items with mean 15 and S.D. 3. If the whole group has
250 items with mean 15.6 and S.D. 13.44, find the standard deviation of the second
subgroup.
26. The means of two samples of sizes 50 and 100 respectively are 54.1 and 50.3. The
standard deviations are 8 and 7. Obtain the mean and standard deviations of the sample
consisting of 150 observations by combining the two samples.
PART D
Freq: 13 15 19 20 23 25 28 13
29. Calculate mean deviation about median for the data given below.
Freq: 8 7 30 26 12 7
COURSE II -SEMESTER-II
REGRESSION ANALYSIS AND PROBABILITY
B. Sc. Psychology
II Semester REGRESSION ANALYSIS AND PROBABILITY
Part A
1. The value of the square of Karl Pearson’s coefficient of correlation lies between:
a) 0 and 1 b) -1 and 1
2. Karl Pearson’s coefficient of correlation for the following set of observations (3,12),(5,6) is:
a) Zero b) -1 c) +1 d) infinity
a) Negative b) Positive
c) Zero d) No relation
b) The joint effect of X2 and X3 are studied keeping the effect of X1 a constant.
d) The correlation between X2 and X3 are studied keeping the effect of X1 a constant.
7. Mutually exclusive events other than null event and sure event are:
a) not independent
b) independent
c) no relation
8. The probability that India wins a cricket match against England is 1/3. If India and England play 3
matches, what is the probability that India will lose all the three matches?
9. What is the probability that a non leap year selected at random will have 53 Sundays?
10. The probability mass function of a discrete r.v is: p(x) = cx/15, x = 1, 2, 3, 4, 5. The value of c is:
a) zero b) 15 c) 5 d) 1
a) constant
b) non-decreasing
c) non-increasing
d) never exists
12. For a discrete r.v P(X >0) = P(X <0) and P(X =0) = p. The variable takes the following values -2, -
1, 0, 1, 2. What is the probability that X >0?
Part B
13. Classical definition of probability can be used in the case of a sample space with infinite
outcomes.
14. In the case of disjoint events A and B, P(A ∪ B) < P(A) + P(B).
15. Getting a queen and getting a Jack while drawing cards from a deck of cards are independent
events.
16. The correlation coefficient between X and Y is 0.85. Find the coefficient of determination.
21. Give the axiomatic definition of probability. Mention one advantage of the definition.
22. If A and B are two independent events such that P ( A c ) = 0.7, P ( B c ) = k , P ( A ∪ B ) = 0.8 , then
find the value of k.
23. A and B stand in a ring with 12 other persons. Find the probability that A & B are together.
24. Explain briefly the concept of partial correlation with the help of an example.
25. Explain why, in the case of two variables, there are always two regression lines. When do they
coincide?
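The numerical items among the questions above (16, 22 and 23) can be checked with exact arithmetic. This is a sketch only; the variable names are hypothetical, and Question 23 is read as 14 people placed at random around a ring.

```python
from fractions import Fraction

# Q16: the coefficient of determination is the square of the correlation.
r = Fraction(85, 100)
r_squared = r ** 2

# Q22: for independent events, P(A u B) = P(A) + P(B) - P(A)P(B).
# Solving for P(B) and then taking the complement gives k.
p_a = 1 - Fraction(7, 10)
p_union = Fraction(8, 10)
p_b = (p_union - p_a) / (1 - p_a)
k = 1 - p_b

# Q23: fix A's place in the ring; B is equally likely to take any of the
# other 13 places, and exactly two of them are next to A.
n_people = 12 + 2
p_together = Fraction(2, n_people - 1)

print(float(r_squared), k, p_together)  # 0.7225 2/7 2/13
```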
Part D
27. From a bag containing 5 red and 6 blue balls, 4 balls are taken at random. Find the probability
mass function of:
28. P(A) = 1/3, P(B) = 1/4, P(A∩B) = 1/11. Find the following probabilities.
29. Give an example to show that correlation coefficient is a measure of linear correlation only
Semester-III
Course III - PROBABILITY DISTRIBUTIONS AND PARAMETRIC TESTS
B. Sc. Psychology
III Semester PROBABILITY DISTRIBUTIONS AND PARAMETRIC TESTS
4. If a sample of size n is taken without replacement from a population with N units, the
probability of getting a sample is:
a) 1/n b) 1/N c) 1/NCn d) 1/2N
5. The test statistic that is used to check equality of variance of two normal populations when
two small samples are taken from them is:
a) standard normal
b) F
c) t
d) χ²
6. A statistic is
a) Constant
b) Same as parameter
8. In a paired t-test:
b) The size of the first sample should be less than the size of the second
c) The size of the second sample should be less than the size of the first
r
a. 1− r2
n−3
r 1− r2
b.
n−2
r 1− r2
c. n −3
r
d. n−2
1− r2
10. In the test of equality of means of two normal populations with small samples of sizes n1
and n2 taken from them, if the populations have equal but unknown variances, the test
statistic follows:
a) heterogeneous population
b) homogeneous population
c) infinite population
d) always
18. A sample of size 12 is taken from a normal distribution. The sample variance is 1.8. What
is the value of the test statistic for the test with H0: σ² = 3?
19. Define type II error
20 Define the power of the test.
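Question 18 does not say which divisor the sample variance uses, and textbooks differ on this. The sketch below (a hypothetical helper, not taken from the syllabus) computes the chi-square statistic under both conventions so the two candidate answers can be compared.

```python
def chi_square_for_variance(sample_variance, n, sigma0_sq, divisor_n_minus_1=True):
    """Chi-square statistic for H0: sigma^2 = sigma0_sq under normality."""
    # Recover the sum of squared deviations from the reported sample variance.
    divisor = (n - 1) if divisor_n_minus_1 else n
    sum_sq = sample_variance * divisor
    return sum_sq / sigma0_sq

stat_unbiased = chi_square_for_variance(1.8, 12, 3.0)        # s^2 with divisor n - 1
stat_biased = chi_square_for_variance(1.8, 12, 3.0, False)   # s^2 with divisor n
print(stat_unbiased, stat_biased)
```

The two values are approximately 6.6 and 7.2; in either case the statistic is referred to a chi-square distribution with n - 1 = 11 d.f.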
Part C (4 Questions to be answered out of 6) wt 2
21. What do you mean by standard error?
22. Explain paired t-test.
23. What are the advantages of systematic sampling compared to SRS?
24. A correlation coefficient 0.65 was observed in a sample of 50 bi-variate observations. Is
the value significant?
25. In a lot containing 1235 articles, 35 were found to be defective. Does the hypothesis: The
proportion of defective articles is less than 0.02 hold?
26. The value of the sample mean from a population which was assumed to have mean = 5 is
4. The sample size is 100 and the variance of the sample is 1. Is there significant
difference between sample mean and population mean?
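The test statistics needed for Questions 24 to 26 can be sketched as follows. The function names are illustrative, and the large-sample normal approximation for Questions 25 and 26 is an assumption based on the sample sizes given.

```python
import math

def t_for_correlation(r, n):
    """t statistic with n - 2 d.f. for H0: population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def z_for_proportion(x, n, p0):
    """Large-sample z statistic for H0: p = p0, from x items out of n."""
    return (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)

def z_for_mean(x_bar, mu0, s, n):
    """Large-sample z statistic for H0: mean = mu0, using the sample s.d. s."""
    return (x_bar - mu0) / (s / math.sqrt(n))

t_q24 = t_for_correlation(0.65, 50)
z_q25 = z_for_proportion(35, 1235, 0.02)
z_q26 = z_for_mean(4, 5, 1, 100)
print(round(t_q24, 2), round(z_q25, 2), round(z_q26, 2))  # 5.93 2.09 -10.0
```

Each value is then compared with the appropriate table value (t with 48 d.f. for Question 24, the standard normal for Questions 25 and 26) at the chosen level of significance.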
Part D (2 Questions to be answered out of 3) wt4
27. Using Poisson approximation to the Binomial distribution, solve the following. If the
probability that an individual suffers a bad reaction from a particular infection is 0.001,
determine the probability that out of 2,000 individuals,
a) Exactly 3
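For part (a), the Poisson approximation uses λ = np = 2000 × 0.001 = 2. A minimal sketch of the computation, with an illustrative function name:

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2000 * 0.001           # mean of the approximating Poisson distribution
p_exactly_3 = poisson_pmf(3, lam)
print(round(p_exactly_3, 4))  # 0.1804
```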
28. The laboratories A and B carry out independent estimates of fat content in ice-creams
made by a firm. A sample is taken from each batch, halved and the separate halves sent to
the two laboratories. The fat contents obtained by the laboratories are recorded below.
(Fat contents in milligrams are given below)
Batch No. 1 2 3 4 5 6 7 8 9 10
Lab A 7 8 7 3 8 6 9 4 7 8
Lab B 9 8 8 4 7 7 9 6 6 6
Is there a significant difference between the mean fat content obtained by the two laboratories
A and B?
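Because each batch is halved and sent to both laboratories, the observations are paired, so a paired t-test on the batch-wise differences is one natural reading of Question 28. The sketch below uses only the Python standard library; the critical value 2.262 is the usual table value for 9 d.f. at the two-sided 5% level and is entered by hand.

```python
import math
from statistics import mean, stdev

lab_a = [7, 8, 7, 3, 8, 6, 9, 4, 7, 8]
lab_b = [9, 8, 8, 4, 7, 7, 9, 6, 6, 6]

# Batch-wise differences (Lab A minus Lab B).
diffs = [a - b for a, b in zip(lab_a, lab_b)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                      # sample s.d. with divisor n - 1
t_stat = d_bar / (s_d / math.sqrt(n))   # t with n - 1 = 9 d.f.

T_CRITICAL = 2.262  # table value, 9 d.f., two-sided 5% level

print(round(t_stat, 3), abs(t_stat) > T_CRITICAL)  # -0.709 False
```

Under these assumptions the data do not show a significant difference between the two laboratories at the 5% level.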
B. Sc. Psychology
Time: 3 Hrs
Q.1. In a chi-square contingency table with 3 rows and 5 columns, the d.f of chi-square
statistic is
a) 15
b) 24
c) 8
2. The chi-square test statistic for a goodness of fit test is given by:
a) $\frac{O_i - E_i}{E_i}$
b) $\sum \frac{O_i - E_i}{E_i^2}$
c) $\sum \frac{(O_i - E_i)^2}{E_i^2}$
d) $\sum \frac{(O_i - E_i)^2}{E_i}$
3. In a Poisson goodness of fit test having ‘k’ sets of observed frequencies with estimated
value of λ , the chi-square statistic has d.f.
a) k-2
b) k
c) k-1
d) k-2
c) Run test
d) Sign test
6. The test used to check the randomness of the collected set of symbols is:
a) Sign test
d) Run test
7 When there are 3 groups, each following normal distribution, and the null hypothesis is
concerned with the equality of means the test used is:
c) Analysis of variance
a) Chi-square distribution
b) t-distribution
c) Normal distribution
d) F-distribution
a) t-test
b) Normal test
c) Chi-square test
d) ANOVA
10. In a one-way ANOVA, if the d.f. of the total S.S. is 13 and the d.f. of the
between-sample sum of squares is 6, the d.f. of the error sum of squares is:
a) 7 b) 6 c) 19
11. The mean value of a set of scores is 50 with S.D.=5. If the raw score of an individual is
55, his z-score is:
a) Zero b) -1 c) 50 d) +1
12. The reliability coefficient of a test of 50 items is 0.60. How much should it be lengthened
to raise the self correlation to 0.9?
a) 5 b) 6 c) 7 d)
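Question 12 is an application of the Spearman-Brown prophecy formula, r_n = n r / (1 + (n - 1) r), solved for the lengthening factor n. The sketch below uses hypothetical function names and assumes the added items are comparable to the existing ones.

```python
def spearman_brown(r, n):
    """Reliability of a test lengthened n times, given current reliability r."""
    return n * r / (1 + (n - 1) * r)

def lengthening_factor(r, target):
    """Factor by which the test must be lengthened to reach the target reliability."""
    return target * (1 - r) / (r * (1 - target))

n_factor = lengthening_factor(0.60, 0.90)
doubled = spearman_brown(0.60, 2)
print(round(n_factor, 2), round(doubled, 2))  # 6.0 0.75
```

A factor of 6 corresponds to a test of 300 items; doubling the length alone would raise the reliability from 0.60 to 0.75.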
Part B Answer all questions weight 1
13. In a chi-square test of independence of 2 attributes with 2 observations each, the d.f. of the
test statistic is 1.
14 In the case of sign test, the test statistic follows a binomial distribution.
15. In a one-way ANOVA, the total sum of squares of observations is 6212 and the error
sum of squares is 3272. The sum of squares between samples is 2900.
16. In a χ² test of goodness of fit, if the calculated value of χ² is zero, then it is a bad fit.
18. In test re-test method Karl Pearson’s coefficient of correlation between two test scores is
0.9. What is the coefficient of reliability?
21. State the null hypothesis for a chi-square test of homogeneity of proportions and give
the layout of observations.
24. The reliability coefficient of a test of 50 items is 0.6. How much should the test be
lengthened to raise the self correlation to 0.9? What effect will the doubling of the test
length have upon the reliability coefficient?
25. A test of 50 items has reliability 0.7 and validity 0.5. If another 150 comparable items are
added to it what will be the validity?
26. In a one-way analysis of variance with three groups (samples) each consisting of 5
observations, the mean error sum of squares is 30.5. Calculate the critical difference. The
group means are 20, 25 and 26 respectively. Find which pairs show significant difference
if any.
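For Question 26, the critical difference for comparing two group means is t × sqrt(2 × MSE / n), with t taken at the error d.f. The sketch below reads "mean error sum of squares" as the error mean square, and the value 2.179 is the table value of t for 12 d.f. (15 observations minus 3 groups) at the two-sided 5% level; both are assumptions stated here rather than given in the question.

```python
import math
from itertools import combinations

mse = 30.5
n_per_group = 5
group_means = {"group 1": 20, "group 2": 25, "group 3": 26}

T_CRITICAL = 2.179  # table value, 12 error d.f., two-sided 5% level

critical_difference = T_CRITICAL * math.sqrt(2 * mse / n_per_group)

# A pair differs significantly when its mean difference exceeds the CD.
significant_pairs = [
    (a, b)
    for a, b in combinations(group_means, 2)
    if abs(group_means[a] - group_means[b]) > critical_difference
]

print(round(critical_difference, 2), significant_pairs)  # 7.61 []
```

The largest observed difference is 6, which is below the critical difference, so under these assumptions no pair differs significantly.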
Section D
27. A factory operates in three shifts. The factory manager feels that quality of part is related
to shifts. For this purpose he has collected the following data from the past records of
production.
No. of Parts
Good Bad
Hospital A: 99 85 73 98 83 88 99 80 74 91 80 94 94 98 80
Hospital B: 78 74 69 79 57 78 79 68 59 91 89 55 60 55 79
Use a proper non-parametric test to see whether the two populations are identical with respect
to the level of care.
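One non-parametric test that fits two independent samples of this kind is the Mann-Whitney U test (equivalently, the Wilcoxon rank-sum test); the question does not name a test, so this choice is an assumption. The sketch below assigns midranks to tied scores and uses the large-sample normal approximation without a tie correction, which is a simplification.

```python
import math

hospital_a = [99, 85, 73, 98, 83, 88, 99, 80, 74, 91, 80, 94, 94, 98, 80]
hospital_b = [78, 74, 69, 79, 57, 78, 79, 68, 59, 91, 89, 55, 60, 55, 79]

def midranks(values):
    """Rank the values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

n_a, n_b = len(hospital_a), len(hospital_b)
ranks = midranks(hospital_a + hospital_b)

rank_sum_a = sum(ranks[:n_a])
u_a = rank_sum_a - n_a * (n_a + 1) / 2   # pairs in which A exceeds B (ties count 1/2)
u_b = n_a * n_b - u_a

# Normal approximation to the null distribution of U (no tie correction).
mean_u = n_a * n_b / 2
sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
z = (u_a - mean_u) / sd_u

print(u_a, u_b, round(z, 2))
```

The value of z is compared with the standard normal table value (1.96 for a two-sided test at the 5% level).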
1. The value of the F-statistic for testing the equality of the means is:
(a) 4.35
(b) .0028
(c) 13.05
(d) 11.60
(e) 116.00
Solution: d
Past performance 1990 Apr - 75%
Past performance 1991 Feb - 63% (c-27%)
Past performance 1993 Feb - 84% (c-10%)
2. The hypothesis would be rejected at α=0.05 if the test statistic is greater
than:
(a) 4.07
(b) 3.86
(c) 8.85
(d) 8.81
(e) 3.59
Solution: a
Past performance 1990 Apr - 79%
Past performance 1991 Feb - 61% (b-31%)
Past performance 1993 Feb - 86% (b-12%)
(a) Because the p-value is small, there is evidence that all the brands
differ from each other in the mean amount of tar present.
(b) Because the p-value is small, there is no evidence that any of the
brands differ in the mean tar content.
(c) Because the p-value is small, there is evidence that at least one brand
has a different mean tar content from the other brands.
(d) Because the p-value is small, there is no evidence that at least one
brand has a different mean tar content from the other brands.
(e) Because the p-value is small, there is evidence that all of brands have
the same mean tar content.
Solution: c
Past performance 1993 Feb - 95%
Since the p-value is 0.0028 the hypothesis of equal means is rejected.
Consequently a multiple comparison procedure was performed. Here is a
portion of the output:
© 2006 Carl James Schwarz
T GROUPING MEAN N BRAND
A 122.000 3 Wheezer
B 112.000 3 Choker
B
B 110.000 3 Hacker
B
B 108.000 3 Killer
Solution: d
Past performance 1990 Apr - 62% (B-15%)
Solution: c
Past performance 1990 Apr - 76% (E-13%)
Past performance 1991 Feb - 86%
6. Suppose the analyst wishes to repeat the experiment blocking by the type
of inhalation of smokers. Which of the following is NOT CORRECT about
a randomized complete block design?
Solution: e
Past performance 1990 Apr - 79%
Source df SS MS F-value
Nematocides * 3.456 * *
Error 8 1.200 *
Total 11 4.656
(a) 23.04
(b) 2.89
(c) 3.46
(d) 1.20
(e) 7.68
Solution: e
Past performance 1990 Feb - 90%
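The starred entries of the nematocide ANOVA table above can be filled in from the values that are given. The question stem is missing from this copy, so the sketch assumes the item asks for the F-value; the variable names are illustrative.

```python
ss_treatment = 3.456   # Nematocides sum of squares
ss_error = 1.200
df_total = 11
df_error = 8

df_treatment = df_total - df_error          # 3, i.e. four nematocides
ms_treatment = ss_treatment / df_treatment  # mean square for nematocides
ms_error = ss_error / df_error              # error mean square
f_value = ms_treatment / ms_error

print(df_treatment, round(ms_treatment, 3), round(ms_error, 3), round(f_value, 2))
# 3 1.152 0.15 7.68
```

The result 7.68 agrees with option (e), and the error mean square of 0.15 is the variance estimate quoted later in Question 9.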
Solution: c
Past performance 1990 Feb - 92%
9. Suppose that based upon this experiment, the scientist wishes to be 80%
sure of detecting a difference of about 0.45 kg/plot in the average yield
among the four nematocides when testing at α=0.05. She decides to use
0.15 as an estimate of the population variance. Then:
(a) The required sample size is about 20 plots per nematocide for a total
of 80 plots.
(b) The required total sample size is 20 plots, i.e., 5 plots per nematocide.
(c) The required sample size is about 4 plots per nematocide for a total
of 16 plots.
(d) The required total sample size is 4 plots, i.e., 1 plot per nematocide.
(e) The required sample size cannot be determined because the individual
population means are not known.
Solution: a
Past performance 1990 Feb - 40% (A-40%, C-46%)
10. What is the best reason for randomly assigning treatment levels to the
experimental units?
(d) Randomization is required by statistical consultants before they will
help you analyze the experiment.
(e) Randomization implies that it is not necessary to be careful during
the experiment, during data collection, and during data analysis.
Solution: b
Past performance 1990 Feb - 97%
(a) Conclude that the mean yields of the four nematocides are equal
when in fact at least one is not equal.
(b) Conclude that the mean yields of the four nematocides are equal
when in fact they are equal.
(c) Conclude that the mean yields of the four nematocides are unequal
when in fact at least one is not equal.
(d) Conclude that the mean yields of the four nematocides are unequal
when in fact they are equal.
(e) Fail this exam because you used the osmosis method of studying.
Solution: d
Past performance 1990 Feb - 82%
12. Which is the null and alternate hypothesis?
(a) H: all sample means are equal;
A: at least one sample mean differs from the others.
(b) H: all host species have the same population mean cuckoo egg size;
(d) H: all host species are the same;
A: at least one host species is different from the others.
(e) H: all host species have the same size eggs;
A: at least one host species has different sized eggs from the others.
Solution: b
Past performance 2006 Dec - 38% (30%-a; 19%-c)
Multiple Choice Questions
Analysis of Variance - general
Solution: d
Multiple Choice Questions
Analysis of Variance - Single factor randomized
complete block designs
Source df SS MS F Prob
prep * 38 **.* **.* 0.1517
Error * 73 **.*
Total * 111
prep Mean
burn 12.0
fertilize 16.0
none 12.5
Source df SS MS F Prob
prep * 38.0 **.* **.* 0.0121
locn * 61.7 **.* **.* 0.0077
Error * 11.3 **.*
Total * 111
prep Mean
burn 12.0
fertilize 16.0
none 12.5
1. The value of the test statistic for testing the appropriate hypothesis is:
(a) 2.3
(b) 10.1
(c) 10.9
(d) 2.6
(e) 11.8
Solution: b
Past performance 1993 Apr - 60% (a-15%; e-10%)
(a) 2.1
(b) 4.3
(c) 2.7
(d) 4.6
(e) 2.4
Solution: e
Past performance 1993 Apr - 33% (c-33%)
(a) > 17
(b) 4
(c) > 21
(d) 12
(e) 14
Solution: a
Past performance 1993 Apr - 25% (b-15%; c-42%; d-10%)
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 8 891.05166667 111.38145833 13.73 0.0001
Error 15 121.67791667 8.11186111
Corr Total 23 1012.72958333
B 7.475 4 30
B
C B 6.000 4 35
C B
C B 5.775 4 45
C
C 3.075 4 50
(b) F*= 9.31; Reject H if F ∗ > 8.71.
(c) F*= 13.73; Reject H if F ∗ > 2.64.
(d) F*= 9.31; Reject H if F ∗ > 3.29.
(e) F*= 16.38; Reject H if F ∗ > 4.62.
Solution: a
Past performance 1991 Apr - 55% (C-25%)
Solution: d
Past performance 1991 Apr - 95%
6. The results of this experiment were interesting but not conclusive. She
now wishes to detect differences when testing at α =.05. Which of the
following is not correct?
Solution: d
Past performance 1991 Apr - 71% (A-10%)
Multiple Choice Questions
Chi-square tests for independence
1. Which of the following is not correct?
(a) Operators who had both operations could not be used because this
type of analysis requires each unit to be counted in one and only one
cell.
(b) The null hypothesis is that the severity of the rodent problem is
independent of the type of operator.
(c) The alternate hypothesis is that the proportion of turkey operators
with mild, moderate, and severe rodent problems is different from the
proportion of egg operators with mild, moderate, and severe rodent
problems.
(d) A Type I error would be to conclude that the severity of rodent
problems is dependent upon the type of operator while, in fact, the
proportion of turkey operators with mild, moderate, and severe rodent
problems is the same as the proportion of egg operators with
mild, moderate, and severe rodent problems.
(e) A Type II error would be to conclude that the proportion of egg
operators with mild, moderate, or severe rodent problems is the same
as the proportion of turkey operators with mild, moderate, or severe
rodent problems when in fact they are independent.
Solution: e
Past performance 1993 Apr - 52% (a-10%; b-10%; c-14%; d-14%)
Past performance 1996 Dec - 61% (a-10%, d-12%)
Past performance 1998 Dec - 72%
(b) about 9.71
(c) about 6.81
(d) about 5.64
(e) about 8.60
Solution: d
Past performance 1993 Apr - 65% (a-14%; c-10%)
Past performance 1998 Dec - 99%
5. One reviewer of the study suggested that there may be a problem with the
study because results from small operators were pooled with the results
from large operators. Which of the following is NOT CORRECT?
(a) Simpson’s paradox occurs when conclusions from a pooled table differ
from the individual tables.
(b) Tables can be pooled when the underlying rates are equal among
tables.
(c) Simpson’s paradox occurs when tables with unequal row totals are
pooled.
(d) Inspection of the row or column percents will give a good clue if
Simpson’s paradox is likely to occur.
(e) Simpson’s paradox occurs when the pooled table gives no evidence
of an effect but the individual tables show evidence of an effect.
Solution: c
Past performance 1990 Dec - 68%
Past performance 1993 Apr - 32% (b-16%; d-22%; e-25%)
Past performance 1996 Dec - 65% (b-10%, d-10%)
Past performance 1998 Dec - 73% ( d-10%)
Frequency|
Row Pct |anger |happy |love |pain | Total
---------+--------+--------+--------+--------+
f | 27 | 19 | 39 | 17 | 102
| 26.47 | 18.63 | 38.24 | 16.67 |
---------+--------+--------+--------+--------+
m | 34 | 12 | 38 | 28 | 112
| 30.36 | 10.71 | 33.93 | 25.00 |
---------+--------+--------+--------+--------+
Total 61 31 77 45 214
6. Under a suitable null hypothesis, the expected frequency for the cell
corresponding to Anger and Males is:
(a) 15.9
(b) 55.7
(c) 30.4
(d) 31.9
(e) 29.1
Solution: d
Past performance 1991 Apr - 63% (C-17%, E-15%)
Past performance 1991 Dec - 84% (e-11%)
Past performance 1997 Aug - 87%
7. The null hypothesis will be rejected at α=0.05 if the test statistic exceeds:
(a) 3.84
(b) 5.99
(c) 7.81
(d) 9.49
(e) 14.07
Solution: c
Past performance 1991 Apr - 86%
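The answers to Questions 6 and 7 follow from the margins of the sex-by-emotion table: the expected count is (row total × column total) / grand total, and the d.f. are (rows - 1)(columns - 1). A minimal sketch, with illustrative names:

```python
observed = {
    "f": {"anger": 27, "happy": 19, "love": 39, "pain": 17},
    "m": {"anger": 34, "happy": 12, "love": 38, "pain": 28},
}

# Marginal totals of the two-way table.
row_totals = {sex: sum(cells.values()) for sex, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for emotion, count in cells.items():
        col_totals[emotion] = col_totals.get(emotion, 0) + count
grand_total = sum(row_totals.values())

def expected(sex, emotion):
    """Expected count under independence of sex and emotion."""
    return row_totals[sex] * col_totals[emotion] / grand_total

df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(round(expected("m", "anger"), 1), df)  # 31.9 3
```

With 3 d.f. the 5% table value of chi-square is 7.81, which matches option (c) of Question 7.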
Solution: a
Past performance 1991 Dec - 77% (e-11%)
(a) The children were cross-classified by sex and emotion associated with
red. Each child was counted in one and only one cell.
(b) The null hypothesis is that the type of emotion associated with red
is independent of the sex of the child.
(c) The null hypothesis is that the proportion of emotions associated
with red is the same for both sexes.
(d) All expected cell counts should be greater than five in order that
the distribution of the test statistic is an approximate chi-square
distribution.
(e) If we reject the null hypothesis then we have proven that the two
sexes associate red with emotions in different ways.
Solution: e
Past performance 1991 Apr - 76% (C-12%)
Past performance 1991 Dec - 77% (c-9%, d-12%)
Past performance 1993 Feb - 67% (d-16%)
Solution: e
Past performance 1993 Feb - 67% (d-16%)
Past performance 1996 Oct - 92%
(a) We conclude that the sex of the child and the emotion associated
with red are independent when in fact they are not independent.
(b) We conclude that the sex of the child and the emotion associated
with red are not independent when in fact they are not independent.
(c) We conclude that the proportion of emotions associated with red
differs between males and females when in fact they are the same.
(d) We conclude that the proportion of emotions associated with red is
the same for male and female when in fact they are the same.
(e) We fail to find any association between the color red and emotions
for either sex.
Solution: c
Past performance 1991 Apr - 76% (E-20%)
Past performance 1991 Dec - 84%
Past performance 1997 Aug - 76%
(b) gender is dependent upon the emotional association with red
(c) the probability of selecting an emotion with red is related to gender
(d) the number of children in each cell does not depend upon gender nor
upon emotion
(e) the color red is independent of the emotion associated with it and
with gender.
Solution: c
Past performance 1997 Aug - 74%
14. Each person in a random sample of 50 was asked to state his/her sex and
preferred colour. The resulting frequencies are shown below.
               Colour
               Red   Blue   Green
Sex   Male       5     14       6
      Female    15      6       4
A chi-square test is used to test the null hypothesis that sex and preferred
colour are independent. Which of the following statements is a correct
decision about the null hypothesis?
15. The following data were obtained from a company which manufactures
special plastic containers which are to hold a specified volume of hazardous
material. On each of the three 8 hour shifts workers are able to make 500
of the containers. Some containers do not meet specifications as required
by the company’s customer because they are too small, others because
they are too large.
Conformance to Specification
Shift Too Small Within Spec. Too Large
8am 36 452 12
4pm 24 443 33
midnight 12 438 50
(a) 166.7
(b) 443
(c) 33
(d) 444.3
(e) 500
16. Are all employees equally prone to having accidents? To investigate this
hypothesis, Parry (1985) looked at a light manufacturing plant and
classified the accidents by type and by age of the employee.
Accident Type
Age Sprain Burn Cut
Under 25 | 9 17 5
25 or over | 61 13 12
Solution: c
Past performance 1989 Apr - 64%
                  Question 1
                  Yes   No | Total
Question 2  Yes    22   48 |    70
            No     12   18 |    30
Total              34   66 |   100
Solution: c
Smoker Non-smoker
Drinker 193 165
Non-drinker 89 153
(e) At level .01 we conclude that smoking and alcohol consumption are
related.
Solution: e
If the county type of practice and the use of tetracycline are independent,
then the expected number of rural doctors who prescribe tetracycline is:
(a) 31.0
(b) 27.7
(c) 1.37
(d) 51%
(e) 62
Solution: b
20. For the problem outlined above, the critical value(table value) of the test
statistic when the level of significance is α =0.05, is:
(a) 0.1026
(b) 7.3778
(c) 5.9915
(d) 12.5916
(e) 7.8147
Solution: c
DEATH SIZE
FREQUENCY| m | s | L | TOTAL
---------+--------+--------+--------+
no | 63 | 128 | 46 | 237
---------+--------+--------+--------+
yes | 26 | 95 | 16 | 137
---------+--------+--------+--------+
TOTAL 89 223 62 374
21. Under a suitable null hypothesis, the expected frequency for the cell
corresponding to fatal type of accident and small size automobile is:
(a) 81.68
(b) 67.00
(c) 61.43
(d) 63.41
(e) 59.72
Solution: a
Past performance 1990 Apr - 92%
Solution: e
Past performance 1990 Apr - 39% (B-12%, C-36%)
Past performance 1990 Dec - 20% ( 15% - c, 56% - d)
23. The null hypothesis will be rejected at α=0.05 if the test statistic exceeds:
(a) 12.59
(b) 7.81
(c) 5.99
(d) 3.84
(e) 9.49
Solution: c
Past performance 1990 Apr - 79%
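The full chi-square statistic for the death-by-size table can be computed and compared with the 5% table value for (2 - 1)(3 - 1) = 2 d.f. quoted in Question 23. This is a sketch only; the column labels m, s and L are read as medium, small and large, which the table itself does not spell out.

```python
observed = [
    [63, 128, 46],   # death = no;  sizes m, s, L
    [26, 95, 16],    # death = yes; sizes m, s, L
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

CHI2_CRITICAL = 5.99  # table value, 2 d.f., 5% level (option (c) of Question 23)

print(round(expected[1][1], 2), chi_square > CHI2_CRITICAL)  # 81.69 True
```

The expected count for fatal accidents in small automobiles is 137 × 223 / 374, about 81.69, which corresponds to option (a) of Question 21 (81.68) up to rounding.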
Solution: c
Past performance 1990 Dec - 78%
Past performance 1993 Apr - 80%
25. A controversial issue in sports is the use of the “instant replay” for making
decisions on plays that are extremely close or hard to call by an official.
A survey of players in each of four professional sports was conducted,
asking them if they felt “instant replays” should be used to decide close or
controversial calls. The results are as follows:
In testing to see whether opinion with respect to the use of instant replays
is independent of sport, a table of expected frequencies is found. In this
table, the expected number of professional baseball players opposing the
use of instant replays is equal to:
(a) 10.4
(b) 24.1
(c) 11.0
(d) 6.0
(e) 8.4
26. Each person in a random sample of males and females was asked to state
his/her sex and preferred colour. The resulting frequencies are shown
below.
               Colour
               Red   Blue   Green
Sex   Male       3     11       6
      Female    17     11       2
Multiple Choice Questions
Experimental and Survey Design
Solution: c
Past performance 1997 Jun - 99%
Past performance 1997 Aug - 99%
3. A nutritionist wants to study the effect of storage time (6, 12, and 18
months) on the amount of vitamin C present in freeze dried fruit when
stored for these lengths of time. Vitamin C is measured in milligrams per
100 milligrams of fruit. Six fruit packs were randomly assigned to each of
the three storage times. The treatment, experimental unit, and response
are respectively:
Solution: d
Past performance 1992 Dec - 92%
Past performance 1996 Dec - 97%
Past performance 1996 Dec - 68% (28%-e)
Past performance 1998 Dec - 80% (15%-e)
(a) Sex and starting salary are explanatory variables; area of
specialization is a response variable.
(b) Sex is an explanatory variable; starting salary and area of
specialization are response variables.
(c) Sex is an explanatory variable; starting salary is a response variable;
area of specialization is a possible confounding variable.
(d) Sex is a response variable; starting salary is an explanatory variable;
area of specialization is a possible confounding variable.
(e) Sex and area of specialization are response variables; starting salary
is an explanatory variable.
Solution: c
Past performance 1991 Dec - 74% (b-10%)
Past performance 1993 Apr 99%
Solution: d
Past performance 1992 Dec - 63% (30%e)
Past performance 1996 Dec - 89%
identical fish tanks (1 & 2) to put fish in and is considering how to assign
the 40 tagged fish to the tanks. To properly assign the fish, one step would
be to:
(a) put all the odd tagged numbered fish in one tank, the even in the
other, and give the standard food type to the odd numbered ones
(b) obtain pairs of fish whose weights are virtually equal at the start of
the experiment and randomly assign one to the group tank 1, the
other to tank 2 with the feed assigned at random to the tanks.
(c) proceed as in (b), but put the heavier of the pair into tank
2.
(d) assign the fish at random to the two tanks and give the standard feed
to tank 1.
(e) not to proceed as in (b) because using the initial weight in (b) is a
non-random process. Use the initial length of the fish instead.
Solution: d
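Option (d) describes a completely randomized assignment. A minimal sketch of one way to do it, assuming the tags run from 1 to 40 and that each tank should receive 20 fish (the equal split, the seed, and the function name are illustrative assumptions, not part of the question):

```python
import random

def assign_fish(n_fish=40, seed=2006):
    """Randomly split tag numbers 1..n_fish into two equal-sized tanks."""
    rng = random.Random(seed)          # fixed seed only so the example is repeatable
    tags = list(range(1, n_fish + 1))
    rng.shuffle(tags)                  # every ordering of the tags is equally likely
    half = n_fish // 2
    return sorted(tags[:half]), sorted(tags[half:])

tank1, tank2 = assign_fish()
print("Tank 1 (standard feed):", tank1)
print("Tank 2 (other feed):   ", tank2)
```

Because the split depends only on the shuffle, initial weight, tag parity, and any other fish characteristic cannot systematically favour one tank.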
(a) use a table of random numbers to divide the 20 plots into 10 pairs
and then, for each pair, flip a coin to assign the fertilizers to the 2
plots.
(b) subjectively divide the 20 plots into 10 pairs (making the plots within
a block as similar as possible) and then, for each pair, flip a coin to
assign the fertilizers to the 2 plots.
(c) use a table of random numbers to divide the 20 plots into 10 pairs
and then use the table of random numbers a second time to decide
upon the fertilizer to be applied to each pair.
(d) flip a coin to divide the 20 plots into 10 pairs and then, for each pair,
use a table of random numbers to assign the fertilizers to the 2 plots.
(e) use a table of random numbers to assign the 2 fertilizers to the 20
plots and then use the table of random numbers a second time to
place the plots into 10 pairs.
Solution: b
9. A student wishes to examine the effect of wing width and wing length on
the length of flight of a paper airplane. There are 4 different models of
airplanes. Which of the following is NOT correct?
(a) A factor (such as wing width) is an experimental variable under con-
trol of the experimenter.
(b) The order of flights was randomized to remove the influence of any
other variables upon the flight distance of each flight.
(c) It would be better to make four copies of each model of plane to give
some feel for the plane-to-plane variations. Flying a single copy four
times gives information about the internal variation.
(d) Interaction between two factors means that the effect of a factor at
one level depends on the level of the second factor.
(e) Planned experiments (where randomization can take place) are one of
the strongest pieces of evidence in trying to establish a causal
relationship.
Solution: b - randomization does not remove influences - makes them
equal in all groups
Past performance 1996 Nov - 8% (41%-c; 18%-d; 30%-e)
10. An experiment was conducted where you flew paper airplanes after mod-
ifying wing depth and wing length. There were four different models of
airplane. One design consideration was the choice between
flying each plane four times or making four copies of each model, each of
which is flown once. Which of the following is NOT correct?
(a) Flying multiple copies of each model (i.e. separate planes of each
model) could give information on variability in flight due to fabrica-
tion effects (i.e. how you made the plane).
(b) Flying a single copy of each model four times could give information
on variability in flight due to changes in initial launch conditions.
(c) The differences in flight length among the different models gives in-
formation on the “effects” of the design factors - wing depth and wing
length.
(d) The response variable is flight length; the explanatory variables are
wing depth and wing width.
(e) Interaction between the effects of wing depth and wing width implies
that the effects of wing depth are the same for all wing widths.
Solution: e
Past performance 1997 Jul - 83%
(a) The response variable is the plant height.
(b) The explanatory variables are the amount of water and seed variety.
(c) Randomization was used to eliminate the effect of other possible fac-
tors upon the growth of the plants.
(d) A possible uncontrollable factor in this experiment is any nutrients
that might be present in the clay pots.
(e) Designed experiments give the best evidence of “cause-and-effect” re-
lationships.
Solution: c - randomization does not remove influences - makes them
equal in all groups
Past performance 1997 Jun - 54% (11%-b; 19%-d; 15%-e)
12. A survey was conducted by visiting a student parking lot to estimate the
proportion of cars that were red. Which of the following is NOT correct?
(a) If the sampled stall was empty, we can simply choose another stall, at
random, to take its place because it is not likely that the stall being
vacant is related to a car being red.
(b) The sample would be representative of the population if 100 cars were
chosen, regardless of whether randomization was used.
(c) Even though a random sample was taken from cars in the parking
lot, the sample may not be representative of the cars driven by SFU
students because the decision to park in B-lot is self-selected.
(d) If another sample of cars was chosen, it is likely that a different
proportion of cars that are red would be obtained.
(e) The confidence interval computed gave a 95% confidence interval for
the true proportion of cars that were red in the population of cars
that park in B-lot (assuming that the sample was selected using the
3 R’s).
Solution: b
Past performance 1997 Jun - 91%
13. A survey was done to estimate the proportion of cars that are red and are
Japanese made in the City of Vancouver by taking a random sample of
size 25 from a student parking lot at Simon Fraser University. Which of
the following is NOT CORRECT:
(a) This sample may not be representative of the cars in Vancouver be-
cause mainly students park at SFU.
(b) If the particular stall is vacant, we can simply select another stall at
random because it is unlikely that a stall being vacant is related to the
color or manufacturer of the car.
(c) It would be dangerous to simply select the first 25 stalls in the lot
closest to the Applied Science Building because there are a number
of stalls reserved for service vehicles whose primary color is white.
(d) Different students obtained different answers for their sample propor-
tions. This is an example of a sampling distribution for an estimator.
(e) The margin of error will depend upon the total number of cars in the
lot when we did the sample.
Solution: e
Past performance 1998 Nov - 76%
15. An experiment was conducted where you tried to distinguish among
authors based on sentence length and other statistics. Which of the fol-
lowing is NOT correct?
(a) We needed to adjust some variables to a “per 100 word basis” to
adjust for the different number of words on a page.
(b) This was a simplified form of discriminant analysis where, in general,
one wishes to distinguish among groups of objects based on charac-
teristics observed.
(c) Another example of this method might be a bank making a decision
on granting a student a loan based on characteristics such as grade
point average, past credit history, etc.
(d) The polygon plot is a way of “enclosing” typical values of the statistics
for each author.
(e) Potentially useful variables are selected by finding variables whose
distributions are as similar as possible for all the authors.
Solution: e
Past performance 1997 Jul - 71% (20%-c)
16. An experiment was conducted where you analyzed the results of the plant
growth experiment after you manipulated the amount of water and seed
variety. Which of the following is correct?
(a) We randomized the plants to plots to eliminate any effect of hidden
variables.
(b) We could determine the best combination of water and seed variety
by examining the difference in the plant height in the final week of
the experiment.
(c) The variability in growth among plants of the same variety that
received the same amount of water was constant over time.
(d) The growth of a particular plant in week 3 is likely to be independent
(unrelated) of the growth of the same plant in week 2.
(e) The growth of the plants was linear over time.
Solution: b
Past performance 1997 Jul - 39% (30%-a; 11%-c; 11%-d; 7%-e)
17. The following numbers are extracted from a table of random digits:
18. We wish to draw a sample of size 5 without replacement from a population
of 50 households. Suppose the households are numbered 01, 02, . . . , 50, and
suppose that the relevant line of the random number table is:
(c) Slight changes in the wording of questions can make a measurable
difference to survey results.
(d) People will sometimes answer a question differently for different in-
terviewers.
(e) Sophisticated statistical methods can always correct the results if the
population you are sampling from is different from the population of
interest, e.g. due to under-coverage.
Solution: e
Past performance 1998 Oct - 87%
22. An experiment was conducted by the Schwarz family to look at the yield
of popcorn (total grams that popped when 15 g of popcorn were heated)
when two variables, the type of popcorn (gourmet or plain) and the
amount of oil (little or lots), were varied. A profile plot of the results is
below:
Which of the following is NOT CORRECT:
(a) Because the lines are not parallel, there appears to be evidence of
interaction between the two variables.
(b) The two explanatory factors are the amount of oil and the type of
popcorn. The response variable is the yield of popcorn.
(c) The difference in yield between gourmet and plain popcorn is esti-
mated to increase by about 6 g when lots of oil were used.
(d) There was little change in the yield for plain popcorn when either
little or lots of oil were used.
(e) An interaction would exist if the increase in yield from going from
little to lots of oil were the same for both types of popcorn.
Solution: e
Past performance 1998 Nov - 63% (16% a; 13% c)
Solution: a
Past performance 1998 Dec - 90%
(e) About 37%=150/400 said yes to having used marijuana in the last
year.
Solution: e
Past performance 2006 Oct - 71% (12%-d)
26. Recall in one assignment, you conducted a two factor experiment to com-
pare the flying distances of paper airplanes. One factor was wing length
with two levels; the second factor was wing depth, also with two levels.
Which of the following is CORRECT?
(a) A good experiment would fly all four copies of the different airplanes
in sequential order.
(b) A good experiment would control for the person launching the planes
by having the same person do all the launches.
(c) A good experiment would make a single copy of each treatment com-
bination and test each copy 10 times.
(d) A good experiment would examine the effect of paper weight on flying
by making all planes of the same weight of paper.
(e) A good experiment would order the planes by weight while running
the experiment.
Solution: b
Past performance 2006 Nov - 70%; 12% chose (c); 14% chose (d)
27. Recall in one assignment you surveyed cars in a parking lot to estimate
the proportion that were red or the proportion that were from a Japanese
manufacturer. Which of the following is NOT CORRECT?
(a) A convenience sample of the cars closest to the Applied Science build-
ing may give a biased estimate of the proportion of cars which are
from a Japanese manufacturer.
(b) Different students may get different answers for the proportion of
cars that are red.
(c) The sample proportion of cars that are red is an unbiased estimate of
the population proportion if the sampling is a simple random sample.
(d) A sample of 100 cars in a convenience sample is always better than
a sample of 20 cars from a proper random sample.
(e) A sample of 100 cars from a proper random sample will give more
precise estimates of the proportion of cars that are red than a sample
of 20 cars from a proper random sample.
Solution: d
Past performance 2006 Nov - 92%
28. Consider an experiment to investigate the efficacy of different insecticides
in controlling pests and their effects on subsequent yield. What is the best
reason for randomly assigning treatment levels (spraying or not spraying)
to the experimental units (farms)?
(a) Randomization makes the experiment easier to conduct because we
can apply the insecticide in any pattern rather than in a systematic
fashion.
(b) Randomization makes the analysis easier because the data can be
collected and entered into the computer in any order.
(c) Randomization is required by statistical consultants before they will
help you analyze the experiment.
(d) Randomization implies that it is not necessary to be careful during
the experiment, during data collection, and during data analysis.
(e) Randomization will tend to average out all other uncontrolled fac-
tors such as soil fertility so that they are not confounded with the
treatment effects.
Solution: e
Past performance 1990 Feb - 97%
Past performance 1993 Feb - 98%
Past performance 1996 Dec - 100%
Past performance 2006 Dec - 99%
Multiple Choice Questions
Inference - Paired samples on means
2. Trace metals in drinking water wells affect the flavor of the water and un-
usually high concentrations can pose a health hazard. Furthermore, the
water in a well may vary in the concentration of the trace metals depending
on where it is drawn. In the paper, “Trace Metals of South Indian
River Region” (Environmental Studies, 1982, 62-6), trace metal concen-
trations (mg/L) on zinc were found from water drawn from the bottom
and the top of each of 6 wells. The data follows:
A 95% confidence interval for the mean difference in the zinc concentra-
tions in this area between water drawn from the top and bottom of wells
is:
Solution: c
Past performance 1990 Dec - 64%
Past performance 1992 Dec - 75% (20%a)
Multiple Choice Questions
Inference - Single sample on means
Solution: a
(a) If we keep the sample size fixed, the confidence interval gets wider as
we increase the confidence coefficient.
(b) A confidence interval for a mean always contains the sample mean.
(c) If we keep the confidence coefficient fixed, the confidence interval gets
narrower as we increase the sample size.
(d) If the population standard deviation increases, the confidence interval
decreases in width.
(e) If the confidence intervals for two means do not overlap very much,
there is evidence that the two population means are different.
Solution: d
Past performance 1990 Dec - 72%
Past performance 1996 Nov - 76%
3. You have measured the systolic blood pressure of a random sample of 25
employees of a company. A 95% confidence interval for the mean systolic
blood pressure for the employees is computed to be (122,138). Which of
the following statements gives a valid interpretation of this interval?
(a) About 95% of the sample of employees have a systolic blood pressure
between 122 and 138.
(b) About 95% of the employees in the company have a systolic blood
pressure between 122 and 138.
(c) If the sampling procedure were repeated many times, then approx-
imately 95% of the resulting confidence intervals would contain the
mean systolic blood pressure for employees in the company.
(d) If the sampling procedure were repeated many times, then approxi-
mately 95% of the sample means would be between 122 and 138.
(e) The probability that the sample mean falls between 122 and 138 is
equal to 0.95.
Solution: c
Past performance 1997 Aug - 40% (40%-d; 15%-e)
Past performance 1998 Nov - 57% (15%-d; 15%-b)
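The repeated-sampling interpretation in (c) can be checked by simulation. A rough sketch, assuming (purely for illustration) a normal population with true mean 130 and standard deviation 16; the multiplier 2.064 is the t table value for 24 degrees of freedom and is valid only for n = 25:

```python
import random
import statistics

def coverage(true_mean=130.0, sd=16.0, n=25, reps=2000, seed=1):
    """Fraction of 95% t-intervals (n = 25) that contain the true mean."""
    rng = random.Random(seed)
    t_crit = 2.064                      # t table, 97.5th percentile, 24 df
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        xbar = statistics.mean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        if xbar - t_crit * se <= true_mean <= xbar + t_crit * se:
            hits += 1
    return hits / reps

rate = coverage()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The share should come out close to 0.95. Each sample produces a different interval, so the statement is about the procedure and not about where sample means or individual employees fall.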
(a) if the study were to be repeated many times, there is a 95% prob-
ability that the true average summer earnings is not $4500 as the
government claims.
(b) because our specific confidence interval does not contain the value
$4500 there is a 95% probability that the true average summer earn-
ings is not $4500.
(c) if we were to repeat our survey many times, then about 95% of all
the confidence intervals will contain the value $4500.
(d) if we repeat our survey many times, then about 95% of our confi-
dence intervals will contain the true value of the average earnings of
students.
(e) there is a 95% probability that the true average earnings are between
$3525 and $4425 for all students.
5. Does playing music to dairy cattle increase their milk production? An
experiment was conducted where a group of dairy cattle was divided into
two groups. Music was played to one group; the control group did not
have music played. The average increase in production was 2.5 L/cow over
the time period in question. A 95% confidence interval for the difference
(treatment-control) in the mean production was computed to be (1.5,3.5)
L/cow. This means:
(a) 95% of the cows increased their production by between 1.5 and 3.5
L.
(b) We are 95% confident that the average increase in production in the
sample is 2.5 L/cow.
(c) Because the confidence interval does not contain zero, we are 95%
confident that there was no effect of playing music.
(d) We don’t know the true increase in production, but we are 95% con-
fident that the increase in the mean production is in this interval.
(e) Because the confidence interval does not include zero, we are 95% con-
fident that the true increase in production for all cows is 2.5 L/cow.
Solution: d
Past performance 1992 Dec - 76% (10%e)
Past performance 1996 Dec - 86%
(a) We are sure the true mean yield of this new variety is between 2.48
and 3.32 t/ha.
(b) We are 95% confident that the true mean yield of this variety is 2.9
t/ha.
(c) About 95% of the yields of the new variety will be between 2.48 and
3.32 t/ha.
(d) We are 95% confident that the true mean yield of this variety is
between 2.48 and 3.32 t/ha.
(e) We are 95% confident that the mean yield of 2.9 t/hectare is between
2.48 and 3.32 t/ha.
Solution: d
Past performance 1990 Dec - 87%
7. A 95 percent confidence interval for the mean time taken to process new
insurance policies is (11, 12) days. This interval can be interpreted to
mean that:
(a) only 5 percent of all policies take less than 11 or more than 12 days
to process
(b) only 5 percent of all policies take between 11 and 12 days to process
(c) about 95 out of every 100 such intervals constructed from random
samples of the same size will contain the population mean processing
time
(d) the probability is .95 that all policies take between 11 and 12 days
to process
(e) none of the above
Solution: c
9. A turkey producer knows from previous experience that profits are maxi-
mized by selling turkeys when their average weight is 12 kilograms. Before
determining whether to put all their full grown turkeys on the market this
month, the producer wishes to estimate their mean weight. Prior knowl-
edge indicates that turkey weights have a standard deviation of around 1.5
kilograms. The number of turkeys that must be sampled in order to esti-
mate their true mean weight to within 0.5 kilograms with 95% confidence
is:
(a) 35
(b) 5
(c) 65
(d) 10
(e) 150
Solution: a
Past performance 1992 Dec - 85%
Past performance 1998 Nov - 85%
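The arithmetic behind question 9 is the usual sample-size formula n = (z * sd / margin)^2, rounded up. A minimal sketch; the helper name n_for_mean is hypothetical, and the prior standard deviation of 1.5 kg is treated as known:

```python
import math
from statistics import NormalDist

def n_for_mean(sd, margin, conf=0.95):
    """Smallest n for which z * sd / sqrt(n) is at most the margin."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # about 1.96 for 95% confidence
    return math.ceil((z * sd / margin) ** 2)

n_turkeys = n_for_mean(sd=1.5, margin=0.5)
print(n_turkeys)   # 35
```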
10. A random sample of 4 Herefords, each with a frame size of three (on a
one-to-seven scale), gave a sample mean weight of 452 kg and a sample
standard deviation of 12 kg. A 95% confidence interval for the average
weight of all Herefords of this frame size is (using an “exact” confidence
interval):
(a) (435.3, 468.7)
(b) (432.9, 471.1)
(c) (440.2, 463.8)
(d) (428.5, 475.5)
(e) (436.6, 467.4)
Solution: b
Past performance 1990 Dec - 75%
Past performance 1997 Jul - 75%
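The "exact" interval in question 10 uses a t multiplier with n - 1 = 3 degrees of freedom instead of 1.96. A short sketch; the value 3.182 is copied from a t table rather than computed:

```python
import math

n, xbar, s = 4, 452.0, 12.0
t_crit = 3.182                       # t table, 97.5th percentile, 3 df
margin = t_crit * s / math.sqrt(n)   # 3.182 * 6 = 19.092
lower, upper = xbar - margin, xbar + margin
print(f"({lower:.1f}, {upper:.1f})")   # (432.9, 471.1)
```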
11. Referring to the previous question, about how many animals should be
sampled (in total) in order to be 95% confident of determining the true
mean weight WITHIN 2 kg?
(a) 140
(b) 170
(c) 550
(d) 100
(e) 190
Solution: a
Past performance 1990 Dec - 72%
Past performance 1997 Jul - 60%
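For question 11, a sketch that treats the sample standard deviation of 12 kg as a planning value and uses a normal multiplier (both are simplifying assumptions); the exact and the rule-of-thumb multipliers both land near 140:

```python
import math

s, margin = 12.0, 2.0
n_exact = (1.96 * s / margin) ** 2   # about 138.3
n_rough = (2 * s / margin) ** 2      # 144
print(math.ceil(n_exact), math.ceil(n_rough))
```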
(c) (132.8, 167.2)
(d) (134.5, 165.5)
(e) (145.7, 154.4)
Solution: d
Past performance 1989 Dec - 61% ( 22% -b)
Solution: b
Solution: c
Past performance 1989 Dec - 93%
1.2, .8, .6, 1.1, 1.2, .8.
A 95% confidence interval for the mean amount of toxic substances is:
Solution: a
Past performance 1989 Dec - 57% (18% - c, 11% -b,d)
Past performance 1990 Dec - 58% (14% - c, 14% - b)
Past performance 1990 Dec - 63% (11% - d, 21% - c)
16. The effect of acid rain upon the yield of crops is of concern in many places.
In order to determine baseline yields, a sample of 13 fields was selected,
and the yield of barley (g/400 m²) was determined. The output from SAS
appears below:
                                     QUANTILES(DEF=4)            EXTREMES
N        13       SUM WGTS  13       100% MAX  392   99%  392    LOW   HIGH
MEAN     220.231  SUM       2863     75% Q3    234   95%  392    161   225
STD DEV  58.5721  VAR       3430.69  50% MED   221   90%  330    168   232
SKEW     2.21591  KURT      6.61979  25% Q1    174   10%  163    169   236
USS      671689   CSS       41168.3  0% MIN    161   5%   161    179   239
CV       26.5958  STD MEAN  16.245                   1%   161    205   392
Solution: d
Past performance 1989 Dec - 60% (25% - b)
17. The effect of salinity upon the growth of grasses is of concern in many
places where excess irrigation is causing salt to rise to the surface. In
order to determine baseline yields, a sample of 24 fields was selected, and
the biomass of grasses in a standard sized plot was measured (kg). The
output from SAS appears below:
                                      QUANTILES(DEF=4)              EXTREMES
N         24      SUM WGTS  24       100% MAX  22.6    99%  22.6    LOW   HIGH
MEAN      9.09    SUM       218.3    75% Q3    11.45   95%  22.52   0.7   15.1
STD DEV   6.64    VARIANCE  44.0     50% MED   8.15    90%  21.8    1     19.8
SKEWNE    0.924   KURTO     -0.0209  25% Q1    3.775   10%  1.6     2.2   21.3
USS       2998    CSS       1012.73  0% MIN    0.7     5%   0.77    2.2   22.3
CV        72      STD MEAN  1.35                       1%   0.7     2.8   22.6
T:MEAN=0  6.7153  PROb>|T|  0.0001   RANGE     21.9
Solution: d
Past performance 1990 Dec - 65%
Past performance 1996 Nov - 82%
(a) 7
(b) 44
(c) 8
(d) 62
(e) 87
Solution: b
Past performance 1989 Dec - 70%
(a) 24.1 ± 8.32
(b) 24.1 ± 3.92
(c) 24.1 ± 2.77
(d) 24.1 ± 3.26
(e) 24.1 ± 9.78
Solution: c
20. Consider the following graph of the mean yield of barley in 1980, 1984,
and 1988 along with a 95% confidence interval.
21. A researcher in biochemistry is attempting to summarize the results of an
experiment. The experiment involved measuring enzyme activity under a
variety of conditions. The analysis has yielded the following statistics:
n 10
Median 157.00
Mean 163.50
Variance 45.29
Std. Deviation 6.73
Range 38.00
Solution: d
Past performance 1991 Dec - 95%
22. The United States Golf Association (USGA) tests new brands of golf balls
to assure that they meet USGA specifications. One test involves measuring
the average distance traveled when the ball is hit by a machine called
“Iron Byron”. Past tests have indicated that the standard deviation of the
distances “Iron Byron” hits golf balls is 10 meters. How many golf balls
should be hit by “Iron Byron” in order to estimate the mean distance for
a new brand with a 90% confidence interval of WIDTH 2 meters?
(a) 17
(b) 9
(c) 384
(d) 68
(e) 271
Solution: e
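Note that question 22 specifies the WIDTH of the interval, so the margin of error is half of it (1 meter). A minimal sketch of the calculation, treating the standard deviation of 10 meters as known:

```python
import math
from statistics import NormalDist

sigma, width, conf = 10.0, 2.0, 0.90
z = NormalDist().inv_cdf(0.5 + conf / 2)        # about 1.645 for 90% confidence
n_balls = math.ceil((z * sigma / (width / 2)) ** 2)
print(n_balls)   # 271
```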
(a) 183
(b) 253
(c) 64
(d) 359
(e) 90
Solution: e
24. Recently, a price war has developed among retailers selling Brand X denim
jeans. A major chain buyer wishes to estimate the mean price of these
jeans during this period to compare it to the normal selling price of $20.00.
A random sample of 7 major retailers produces a mean retail price of
$13.50 with a standard deviation of $3.50. An 80% confidence interval for
the true mean retail price of Brand X jeans during the price war is:
(a) This interval will contain the true value of µ approximately 95 times
out of one hundred.
(b) This interval is an approximate 95% confidence interval for µ
(c) This interval is too narrow to be a useful interval estimator for µ.
(d) This interval will contain the true value of µ 997 times out of 1000.
(e) Both (a) and (b) are true.
Solution: e
(a) 15
(b) 30
(c) 50
(d) 80
(e) 325
Solution: d - Note that if you use a multiplier of 3 for a 99% c.i. you will
get an answer near 110. The exact multiplier for a 99% confidence interval
is 2.57 (look for the 99.5th percentile on a normal curve), which gives you
an answer of 81.
(a) Auditor A’s estimate will be about 10 times more accurate than
Auditor B’s estimate.
(b) Auditor B’s estimate will be about 10 times more accurate than Au-
ditor A’s estimate.
(c) Auditor A’s estimate will be about 3.16 times more accurate than
Auditor B’s estimate.
(d) Auditor B’s estimate will be about 3.16 times more accurate than
Auditor A’s estimate.
(e) the accuracy of the two estimates will be about the same.
Solution: e
Past performance 1991 Dec - 95%
28. You wish to estimate µ, the average lifetime of a particular type of battery.
You are planning to select n batteries of this type and to operate them
continuously until they fail. You have some feeling that the standard
deviation of the lifetimes should be around 20 hours, and you wish your
estimate of µ to be within 1 hour of µ with probability 0.95. How many
batteries should you select?
(a) 1537
(b) 784
(c) 40
(d) 77
(e) 1083
Solution: a - The exact answer of 1537 is found using the exact multi-
plier of 1.96 (the 97.5th percentile of the normal curve) rather than the
approximate multiplier of 2.
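A short sketch comparing the two multipliers for question 28, with the guessed standard deviation of 20 hours used as the planning value:

```python
import math

sd, margin = 20.0, 1.0
n_exact = math.ceil((1.96 * sd / margin) ** 2)   # 1536.64 rounds up to 1537
n_approx = math.ceil((2 * sd / margin) ** 2)     # a multiplier of 2 gives 1600
print(n_exact, n_approx)
```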
29. A statistical procedure to estimate the mean shell thickness of eggs from
chickens contaminated with PCBs obtains a point estimate of 0.70 mm
and an estimated standard error of .05 mm. This means:
(a) The standard deviation of actual shell thickness in the sample was
.05 mm.
(b) We are 95% confident that the sample mean shell thickness is accurate
to within .05 mm.
(c) An estimate of the standard deviation of the sample mean shell thick-
ness over repeated samples is .05 mm.
(d) The standard deviation of the population mean over all eggs is about
.05 mm.
(e) An approximate 95% confidence interval for the sample mean shell
thickness is .70mm ± .10mm.
Solution: c - note that e refers to “sample mean”
Past performance 1996 Dec - 34% (13%-d; 45%-e)
Multiple Choice Questions
Inference - Single sample on proportions
Solution: b
2. Some scientists believe that a new drug would benefit about half of all peo-
ple with a certain blood disorder. To estimate the proportion of patients
who would benefit from taking the drug, the scientists will administer it to
a random sample of patients who have the blood disorder. What sample
size is needed so that the 95% confidence interval will have a width of
0.06?
(a) 748
(b) 1,068
(c) 1,503
(d) 2,056
(e) 2,401
Solution: b
Past performance 1989 Dec - 74%
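For a proportion the sample-size formula is n = z^2 * p(1 - p) / margin^2, and a width of 0.06 means a margin of 0.03. A minimal sketch using the scientists' guess of p = 0.5; the helper name n_for_proportion is hypothetical:

```python
import math

def n_for_proportion(p, margin, z=1.96):
    """Smallest n for which z * sqrt(p(1-p)/n) is at most the margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_patients = n_for_proportion(p=0.5, margin=0.06 / 2)
print(n_patients)   # 1068
```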
3. In a random sample of 800 Winnipeg automobile owners, it was found
that 480 would like to see the size of the cars reduced. A 95% confidence
interval for the proportion of all Winnipeg car owners who would like to
see smaller cars is:
Solution: a
Past performance 1991 Dec - 92%
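The answer choices for question 3 are not reproduced here, but the interval itself follows from the large-sample formula p-hat ± 1.96 * sqrt(p-hat(1 - p-hat)/n). A sketch, with proportion_ci as a hypothetical helper name:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

low, high = proportion_ci(480, 800)
print(f"({low:.3f}, {high:.3f})")   # (0.566, 0.634)
```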
4. A random sample of 900 individuals has been selected from a large pop-
ulation. It was found that 180 are regular users of vitamins. Thus, the
proportion of the regular users of vitamins in the population is estimated
to be 0.20. An estimate of the standard error of this estimate is:
(a) 0.1600
(b) 0.0002
(c) 0.4000
(d) 0.0133
(e) 0.0267
Solution: d
Past performance 1996 Dec - 86%
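The standard error in question 4 is sqrt(p-hat(1 - p-hat)/n), which can be checked directly:

```python
import math

p_hat = 180 / 900                              # 0.20
se = math.sqrt(p_hat * (1 - p_hat) / 900)
print(round(se, 4))   # 0.0133
```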
5. A Gallup poll of 1089 adults found 326 supported the policies of a particu-
lar political party. A 95% confidence interval for the true level of support
in the entire Canadian population is:
Solution: e
Past performance 1989 Dec - 81%
Past performance 1990 Dec - 68%
Past performance 1992 Dec - 77% (12%a)
Past performance 1993 Apr - 80% (a-10%)
6. Refer to the previous question. What sample size would be needed in
order to be 95% confident that the true level of support is within .01 of
the estimated proportion, assuming that the previous poll provides us with
a reasonable estimate of the true support?
(a) 5047
(b) 9604
(c) 1089
(d) 3458
(e) 8068
Solution: e
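Questions 5 and 6 can be sketched together. The sample-size step below rounds the poll result to 0.30 before planning, which is an assumption about how the keyed answer of 8068 was obtained; using the unrounded proportion gives a slightly smaller number that is still closest to (e).

```python
import math

z = 1.96
p_hat = 326 / 1089                               # about 0.299
se = math.sqrt(p_hat * (1 - p_hat) / 1089)
print(f"Question 5: {p_hat:.3f} +/- {z * se:.3f}")

p_plan = round(p_hat, 2)                         # 0.30, planning value
n_needed = math.ceil(z ** 2 * p_plan * (1 - p_plan) / 0.01 ** 2)
print("Question 6:", n_needed)                   # 8068
```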
(a) 225
(b) 1068
(c) 267
(d) 897
(e) 683
Solution: d
(a) 6147
(b) 24587
(c) 38416
(d) 4330
(e) 1537
Solution: e
Solution: c
11. Many television viewers express doubts about the validity of certain com-
mercials. In an attempt to answer their critics, the Timex Corporation
wishes to estimate the proportion of consumers who believe what is shown
in Timex television commercials. Let p represent the true proportion of
consumers who believe what is shown in Timex television commercials. If
Timex has no prior information regarding the true value of p, how many
consumers should be included in their sample so that they will be 85%
confident that their estimate is within 0.03 of the true value of p ?
(a) 400
(b) 12
(c) 576
(d) 384
(e) 544
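No solution is listed for question 11, so the following is only a sketch of the standard approach: with no prior information use p = 0.5, and take the multiplier for 85% confidence from the 92.5th percentile of the normal curve.

```python
import math
from statistics import NormalDist

conf, margin = 0.85, 0.03
z = NormalDist().inv_cdf(0.5 + conf / 2)         # about 1.44
n_consumers = math.ceil(z ** 2 * 0.25 / margin ** 2)
print(n_consumers)   # 576
```

Under these assumptions the result corresponds to choice (c).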
12. The 3-M company started a new recreation program for its employees in
the hope that a little recreation would improve an employee’s performance
at work. To determine whether the high cost of the program is justified,
the president of the company wishes to estimate the proportion of the
employees who participate in the recreational activities. In a random
sample of 200 employees, 60 were found to regularly participate in the
recreation program. A 95% confidence interval for the true proportion of
3-M employees who participate in the new recreation program is:
Solution: e
13. A random sample of married people were asked “Would you remarry your
spouse if you were given the opportunity for a second time?”; Of the
150 people surveyed, 127 of them said that they would do so. Find a
95% confidence interval for the proportion of married people who would
remarry their spouse.
(c) 0.847 ± 0.048
(d) 0.847 ± 0.058
(e) 0.847 ± 0.113
Solution: d
Past performance 1990 Dec - 83%
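The margin of error in question 13 can be verified with the large-sample formula:

```python
import math

p_hat = 127 / 150
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / 150)
print(f"{p_hat:.3f} +/- {margin:.3f}")   # 0.847 +/- 0.058
```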
14. A music buff wants to estimate the percentage of students at the University
of Manitoba who believe that Elvis is still alive. How many students should
he include in a random sample if he wants a 90% confidence interval that
is less than 10 percentage points wide? Choose the sample size that is
closest to your solution
(a) 68
(b) 97
(c) 269
(d) 385
(e) 1022
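No solution is listed for question 14. A sketch under the usual assumptions: no prior estimate, so p = 0.5, and an interval 10 percentage points wide means a margin of 0.05.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.95)             # about 1.645 for a 90% interval
half_width = 0.10 / 2
n_students = z ** 2 * 0.25 / half_width ** 2
print(round(n_students, 1))                # roughly 270
```

Among the listed choices, 269 is the closest to this value.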
15. You would like to estimate the percentage of “regular users of vitamins”
in a large population and you would like your estimate to be accurate to
within 4 percentage points, 19 times out of 20. Approximately how large
should your sample size be?
(a) 600
(b) 2400
(c) 400
(d) 1000
(e) 150
Solution: a
Past performance 1990 Dec - 37% (14% - b, 14% -c, 27% - c)
Past performance 1992 Dec - 78% (13%-b)
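"19 times out of 20" is 95% confidence. A sketch assuming the worst case p = 0.5, shown with both the exact multiplier and the rule-of-thumb multiplier of 2:

```python
margin = 0.04
n_exact = 1.96 ** 2 * 0.25 / margin ** 2   # about 600
n_rough = 2 ** 2 * 0.25 / margin ** 2      # about 625
print(round(n_exact), round(n_rough))
```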
16. In order for the confidence interval in the previous question to be valid:
(a) we must assume that we have a random sample from a normal pop-
ulation.
(b) we must assume that we have a random sample from some population
(but it need not be a normal population because of the Central Limit
Theorem).
(c) we must assume that the population is normal (but we do not require
a random sample because of the Central Limit Theorem).
(d) we do not need to assume that the population is normal nor that the
sample is random (because of the Central Limit Theorem).
(e) we must assume that we have a random sample from a dichotomous
population.
Solution: b - the Wonderful CLT (it will change your life) strikes again.
18. A 95% confidence interval for p the proportion of Canadian beer drinkers
who prefer Lion Red was found to be (0.236 to 0.282). Which of the
following is correct?
(a) About 95% of beer drinkers have between a 23.6% and a 28.2% chance
of drinking Lion Red.
(b) There is a 95% probability that the sample proportion lies between
0.236 and 0.282.
(c) If a second sample was taken, there is a 95% chance that its confidence
interval would contain 0.25.
(d) This confidence interval indicates that we would likely reject the
hypothesis H: p=0.25.
(e) we are reasonably certain that the true proportion of beer drinkers
who prefer Lion Red is between 24% and 28%.
Solution: e
Past performance 1998 Dec - 71% (15% c)
19. Refer to the previous question. Suppose that the same poll was repeated
in the United States (whose population is 10 times larger than Canada's),
but in this new poll, four times the number of people were interviewed.
The resulting 95% confidence interval will be:
(a) about 1/2 as wide as the Canadian interval
(b) about 1/4 as wide as the Canadian interval
(c) about 1/10 as wide as the Canadian interval
(d) about 4/10 times as wide as the Canadian interval
(e) the same size as the Canadian interval
Solution: a
Past performance 1998 Dec - 38% (30% b; 20% e)
If you increase the sample size by a factor of x, the width of the confidence
interval decreases by a factor of sqrt(x).
The easiest way to see this is to simply compute the two se.
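Computing the two standard errors, as suggested above, might look like the following Python sketch. The Canadian sample size is not given in the source, so n_canada = 1400 is a purely hypothetical value, and the sketch assumes the same sample proportion (the midpoint of the reported interval) in both polls.

```python
import math

def se_proportion(p_hat, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.259      # midpoint of the interval (0.236, 0.282)
n_canada = 1400    # hypothetical; the question does not state the sample size

se_canada = se_proportion(p_hat, n_canada)
se_usa = se_proportion(p_hat, 4 * n_canada)   # four times as many interviews

# The ratio depends only on the sample sizes, not on the population sizes.
print(se_usa / se_canada)
```

The ratio comes out at one half for any choice of n_canada, which is why the population size of each country does not matter here.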
20. Suppose that we wish to estimate the proportion of Canadians who
actually understand the Constitution of Canada. What is the approximate
number of Canadians who need to be sampled so that the 95% confidence
interval has a width of 2 percentage points?
(a) about 500
(b) about 1,000
(c) about 2,500
(d) about 5,000
(e) about 10,000
Solution: e
Past performance 1998 Dec - 42% (15% b; 28% c)
Multiple Choice Questions
Inference - Two independent samples on means
Solution: d
Past performance 1989 Dec - 64% (14% a,c)
Past performance 1990 Dec - 73%
(b) We can conclude that the drug was ineffective because those taking
the drug lived, on average, 1.04 years less.
(c) We can conclude that there is no evidence the drug was effective
because the 95% confidence interval covers zero.
(d) We can conclude that there is evidence the drug was effective because
the 95% confidence interval does not cover zero.
(e) We can make no conclusions because we do not know the sample size
nor the actual mean survival of each group.
Solution: c
Past performance 1990 Dec - 79%
Past performance 1998 Dec - 77%
Past performance 2006 Dec - 85%
            Outlet 1   Outlet 2
n               5         10
mean          10.3       10.7    percent
std.dev        1.6        2.3    percent
(a) 1.95
(b) 2.08
(c) 4.38
(d) 2.09
(e) 2.11
Solution: e
Past performance 1989 Dec - 72%
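The question stem is missing from this extract. Assuming it asks for the pooled estimate of the standard deviation (as the next question implies), the calculation from the outlet summary table can be sketched in Python. The helper name pooled_sd is illustrative only.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation and its degrees of freedom for two samples."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    return math.sqrt(pooled_var), df

# Outlet 1: n = 5, sd = 1.6; Outlet 2: n = 10, sd = 2.3
sd, df = pooled_sd(5, 1.6, 10, 2.3)
print(round(sd, 2), df)
```

Each sample variance is weighted by its own degrees of freedom, so the larger sample pulls the pooled value toward its own standard deviation.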
4. The degrees of freedom of the pooled estimate in the previous question is:
(a) 15
(b) 13
(c) 7.5
(d) 5
(e) 10
Solution: b
Past performance 1989 Dec - 90%
(a) There is evidence that doing assignments improves the average grade
because the difference in the population means is less than zero.
(b) There is little evidence that doing assignments improves the average
grade because the 95% confidence interval does not cover 0.
(c) There is evidence that doing assignments improves the average grade
because the 95% confidence interval does not cover 0.
(d) There is evidence that doing assignments does not improve the
average grade because the 95% confidence interval does not cover 0.
(e) There is little evidence that doing assignments does not improve the
average grade because the 95% confidence interval does cover 0.
Solution: c
Past performance 1989 Dec - 73%
(d) $(6.41 - 5.20) \pm 2.07 \sqrt{\frac{146}{10} + \frac{146}{15}}$
(e) $(6.41 - 5.20) \pm 1.96 \sqrt{\frac{146}{10} + \frac{146}{15}}$
Solution: b
Past performance 1990 Dec - 55%
Solution: d
Past performance 1989 Dec - 43% (27% -a)
8. A researcher wants to see if birds that build larger nests lay larger eggs.
She selects two random samples of nests: one of small nests and the other
of large nests. She weighs one egg from each nest. The data are
summarized below.
A 95% confidence interval for the difference between the average mass of
eggs in small and large nests.
(e) 1.6 ± 7.31 = (−5.71, 8.91)
Solution: c
Past performance 1992 Dec - 82%
(a) 240
(b) 60
(c) 8000
(d) 2000
(e) 125
Solution: a
Past performance 1992 Dec - 79%
10. Refer to the 95% confidence interval circled on the output. This means:
(a) We are 95% confident that the sample mean egg size in large nests is
between 37 and 40 mm if the survey was repeated.
(b) If the survey was repeated, we are 95% confident that egg sizes in
large nests are between 37 and 40 mm.
(c) We are 95% confident that nests will have large eggs between 37
and 40 mm if the survey was repeated.
(d) We are 95% confident that the true mean egg size for large nests is
between 37 and 40 mm.
(e) We are 95% confident that repeated surveys will have population
means between 37 and 40 mm.
Solution: d
Past performance 2006 Dec - 61% (19%-a; 12%-b)
(a) Because the 95% confidence interval for the difference in means
includes zero, there is no evidence of a difference in the mean egg size.
(b) Because the one-sided p-value is .18, there is no evidence of a
difference in mean egg sizes.
(c) Because the confidence intervals for the two groups have a great deal
of overlap, there is no evidence of a difference in the mean egg size.
(d) Because the individual values of the egg sizes for the two groups
have a great deal of overlap, there is no evidence of a difference in
the means.
(e) Because the 95% confidence intervals for the mean egg sizes are
approximately equal in width, the two estimates are about equally
precise.
Solution: d
Past performance 2006 Dec - 58% (14%-a; 15%-b; 19%-d)
Multiple Choice Questions
Inference - Two independent samples on
proportions
1. Two surveys were conducted before and after the recent Autopac rate
increases to find the proportion of voters who state they would vote for
the current government. The results were as follows:
Solution: a
Past performance 1992 Dec - 97%
2. The above confidence intervals are of the order of ±6 percentage points. What
sample size for each poll would be needed so that we are 95% confident
of being within 2 percentage points of the true difference assuming that
the above proportions are reasonable estimates of the proportions in the
population?
(a) 6,000
(b) 1,000
(c) 15,000
(d) 2,000
(e) 4,000
Solution: e
Past performance 1992 Dec - 73%
3. Two surgical procedures are widely used to treat a certain type of cancer.
To compare the success rates of the two procedures, a random sample
from each type of procedure is obtained, and the number of patients with
no reoccurrence of the disease after 1 year was recorded. Here is the data.
               n    No reoccurrence
Procedure A   100         78
Procedure B   120        102
Solution: c
Past performance 1989 Dec - 78%
4. There may be a cure for male pattern baldness (at least millions of males
hope there will be) using the blood pressure drug Minoxidil. A group of
males was randomly assigned to two groups. One group received topical
applications of the drug; the other group received applications of an
identical looking placebo. The summary data
                  Sample Size   Number with New Hair Growth
Minoxidil group       310                 100
Placebo group         100                  25
(b) .073 ± .048
(c) .073 ± .024
(d) .073 ± .051
(e) .073 ± .099
Solution: e
                               SPRAY A   SPRAY B
Total number of insects          200       200
Total number of dead insects     140       100
A 90% confidence interval for the difference in the rates of kill for the two
sprays, is:
(a) $.2 \pm 1.645 \sqrt{\frac{.46}{200}}$
(b) $.2 \pm 1.645 \sqrt{\frac{.48}{200}}$
(c) $.2 \pm 1.96 \sqrt{\frac{.46}{200}}$
(d) $.2 \pm 1.96 \sqrt{\frac{.48}{200}}$
(e) $.2 \pm 2.326 \sqrt{\frac{.48}{200}}$
Solution: a
Past performance 1990 Dec - 78%
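The interval for a difference of two proportions, used in the insect spray and Minoxidil questions above, can be sketched as follows. This is a minimal Python illustration using the unpooled standard error; the function name diff_proportion_ci is hypothetical, and the 95% level for the Minoxidil data is an assumption because that question's stem is incomplete in this extract.

```python
import math

def diff_proportion_ci(x1, n1, x2, n2, z):
    """Return (difference, margin) for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2, z * se

# Insect sprays: 140/200 vs 100/200 dead, 90% confidence (z ~ 1.645).
spray_diff, spray_margin = diff_proportion_ci(140, 200, 100, 200, z=1.645)

# Minoxidil vs placebo: 100/310 vs 25/100 with new hair growth,
# assuming a 95% interval (z ~ 1.96).
hair_diff, hair_margin = diff_proportion_ci(100, 310, 25, 100, z=1.96)

print(round(spray_diff, 3), round(spray_margin, 3))
print(round(hair_diff, 3), round(hair_margin, 3))
```

For the sprays the two variance terms are .7(.3) = .21 and .5(.5) = .25, which sum to the .46 that appears under the square root in option (a).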
(b) $.02 = 1.96 \sqrt{\frac{.8(.2)}{n} + \frac{.8(.2)}{n}}$
(c) $.01 = 1.96 \sqrt{\frac{.5(.5)}{n} + \frac{.5(.5)}{n}}$
(d) $.02 = 1.96 \sqrt{\frac{.5(.5)}{n} + \frac{.5(.5)}{n}}$
Solution: a
Past performance 1989 Dec - 80%
Solution: e
Past performance 1990 Dec - 32% ( 12% - b, 14% - c, 36% - d, 31% - e)
Multiple Choice Questions
Probability - Binomial
2. Experience has shown that a certain lie detector will show a positive
reading (indicates a lie) 10% of the time when a person is telling the truth and
95% of the time when a person is lying. Suppose that a random sample of
5 suspects is subjected to a lie detector test regarding a recent one-person
crime. Then the probability of observing no positive reading if all suspects
plead innocent and are telling the truth is
(a) 0.409
(b) 0.735
(c) 0.00001
(d) 0.591
(e) 0.99999
Solution: d
1 PROBABILITY - BINOMIAL DISTRIBUTION
3. It has been estimated that about 30% of frozen chicken contain enough
salmonella bacteria to cause illness if improperly cooked. A consumer
purchases 12 frozen chickens. What is the probability that the consumer
will have more than 6 contaminated chickens?
(a) .961
(b) .118
(c) .882
(d) .039
(e) .079
Solution: d
Past performance 1989 Dec - 74%
Past performance 1990 Oct - 68%
Past performance 1992 Oct - 93%
Past performance 1997 Aug - 91%
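Binomial probabilities such as those in questions 2 and 3 can be checked without tables. The Python sketch below is one way to do it, assuming independent trials with a constant success probability; binom_pmf and binom_cdf are illustrative helper names.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Question 3: more than 6 contaminated chickens out of 12, with p = 0.30.
p_more_than_6 = 1 - binom_cdf(6, 12, 0.30)

# Question 2: no positive reading among 5 truthful suspects, with p = 0.10.
p_no_positive = binom_pmf(0, 5, 0.10)

print(round(p_more_than_6, 3), round(p_no_positive, 3))
```

"More than 6" is the complement of "6 or fewer", which is why the cumulative probability is evaluated at 6 rather than 7.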
6. It has been estimated that as many as 70% of the fish caught in certain
areas of the Great Lakes have liver cancer due to the pollutants present.
Find an approximate 95% range for the number of fish with liver cancer
present in a sample of 130 fish.
(a) (80, 102)
(b) (86, 97)
(c) (63, 119)
(d) (36, 146)
(e) (75, 107)
Solution: a
Past performance 1989 Dec - 83%
Past performance 1991 Oct - 56% (11%d, 20% e)
Past performance 1992 Oct - 78%
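The approximate 95% range in question 6 follows the "mean plus or minus two standard deviations" rule for a binomial count. A minimal Python sketch, with binomial_two_sd_range as a hypothetical helper name:

```python
import math

def binomial_two_sd_range(n, p):
    """Approximate 95% range for a Binomial(n, p) count: mean +/- 2 sd."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean - 2 * sd, mean + 2 * sd

# Question 6: 130 fish, 70% with liver cancer.
low, high = binomial_two_sd_range(130, 0.70)
print(round(low, 1), round(high, 1))
```

The same helper applies to the later questions that ask for a likely 95% range of a count.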
Solution: e
10. Suppose 60% of a herd of cattle is infected with a particular disease. Let Y
= the number of non-diseased cattle in a sample of size 5. The distribution
of Y is
(a) binomial with n = 5 and p = 0.6
(b) binomial with n = 5 and p = 0.4
(c) binomial with n = 5 and p = 0.5
(d) the same as the distribution of X, the number of infected cattle.
(e) Poisson with λ = .6
Solution: b
11. Fifteen percent of new residential central air conditioning units installed
by a supplier need additional adjustments requiring a service call. Assume
that a recent sample of seven such units constitutes a Bernoulli process.
Interest centers on X, the number of units among these seven that need
additional adjustments. The mean and variance of X are, respectively
(a) .15; .85
(b) .15; 1.05
(c) .15; .8925
(d) 1.05; .1275
(e) 1.05; .8925
Solution: e - remember variance = (std dev) squared
12. If you buy one ticket in the Provincial Lottery, then the probability that
you will win a prize is 0.11. If you buy one ticket each month for five
months, what is the probability that you will win at least one prize?
(a) 0.55
(b) 0.50
(c) 0.44
(d) 0.45
(e) 0.56
Solution: c
13. Suppose that the probability that a cross between two varieties will express
a particular gene is 0.20. What is the probability that in 8 progeny plants,
two or fewer plants will express the gene?
(a) .2936
(b) .3355
(c) .1678
(d) .6291
(e) .7969
Solution: e
Past performance 1989 Oct - 95%
14. Refer to the previous question. Suppose that 120 crosses are bred. Find
a likely 95% range for the number of progeny that will express the gene.
(a) 24 ± 19.2
(b) 24 ± 4.4
(c) 24 ± 8.8
(d) 24 ± 4.9
(e) 24 ± 9.8
Solution: c
Past performance 1989 Oct - 65%
15. Seventeen people have been exposed to a particular disease. Each one
independently has a 40% chance of contracting the disease. A hospital
has the capacity to handle 10 cases of the disease. What is the probability
that the hospital’s capacity will be exceeded?
(a) .965
(b) .035
(c) .989
(d) .011
(e) .736
Solution: b
Past performance 1991 Oct - 75%
Past performance 1993 Feb - 59% (c-14%; d-14%)
Past performance 1993 Apr - 70%
Past performance 1996 Nov - 90%
Past performance 1998 Nov - 88%
16. Refer to the previous problem. Planners need to have enough beds avail-
able to handle a proportion of all outbreaks. Suppose a typical outbreak
has 100 people exposed, each with a 40% chance of coming down with the
disease. Which is not correct:
(a) This experiment satisfies the assumptions of a binomial distribution.
(b) About 95% of the time, between 30 and 50 people will contract the
disease.
(c) Almost all of the time, between 25 and 55 people will contract the
disease.
(d) On average, about 40 people will contract the disease.
(e) Almost all of the time, fewer than 40 people will be infected.
Solution: e
Past performance 1993 Feb - 73% (d-13%)
Past performance 1996 Nov - 80% (d- 8%)
Past performance 1998 Nov - 87%
17. There are 10 patients on the Neo-Natal Ward of a local hospital who are
monitored by 2 staff members. If the probability (at any one time) of a
patient requiring emergency attention by a staff member is .3, assuming
the patients behave independently, what is the probability at any
one time that there will not be sufficient staff to attend all emergencies?
(a) .3828
(b) .3000
(c) .0900
(d) .9100
(e) .6172
Solution: e
18. A newborn baby whose Apgar score is over 6 is classified as normal and
this happens in 80% of births. As a quality control check, an auditor
examined the records of 100 births. He would be suspicious if the number
of normal births in the sample of 100 births fell above the upper limit of
a “95%-normal-range”. What is this upper limit?
(a) 112
(b) 72
(c) 88
(d) 8
(e) none of these
Solution: c
Past performance ???? 73% (18% -e)
19. Refer to the previous question. Babies that have Apgar scores of 6 or lower
require more expensive medical care. What is the probability that in the
next 10 births, 3 or more babies will have Apgar scores of 6 or lower?
(a) .2013
(b) .3222
(c) .9999
(d) .0001
(e) .1536
Solution: b
Past performance ???? 48% (19%-c; 11%-d; 14%-e)
20. Newsweek in 1989 reported that 60% of young children have blood lead
levels that could impair their neurological development. Assuming that a
class in a school is a random sample from the population of all children at
risk, the probability that at least 5 children out of 10 in a sample taken
from a school may have a blood level that may impair development is:
(a) about .25
(b) about .20
(c) about .84
(d) about .16
(e) about .64
Solution: c
Past performance 1998 Dec - 80%
21. Refer to the previous problem. The total number of children in the school
is about 400. In order to estimate the cost of treating all the children at
one school, the health board wishes to be reasonably sure of the upper
limit on the number of children affected. This upper limit is:
22. Consider 8 blood donors chosen randomly from a population. The
probability that a donor has type A blood is .40. Which of the following is
CORRECT?
(a) The probability of 1 or fewer donors having type A blood is about
.11.
(b) The probability of 7 or more donors NOT having type A blood is
about .0087.
(c) The probability of exactly 5 donors having type A blood is about .28.
(d) The probability of exactly 5 donors NOT having type A blood is
about .12.
(e) The probability that between 3 and 5 donors (inclusive) will have
type A blood is about .37.
Solution: a
Past performance 2006 Nov - 84%
Past performance 2006 Dec - 79%
23. Consider 100 blood donors chosen randomly from a population where the
probability of type A is 0.40. What is the approximate probability that
at least 43 donors will have type A blood?
(a) about .43
(b) about .62
(c) about .73
(d) about .27
(e) about .38
Solution: d
Past performance 2006 Nov - 64%
Past performance 2006 Dec - 58% (27%-c)
Multiple Choice Questions
Probability - Expected Value
1. Cans of soft drinks cost $0.30 in a certain vending machine. What is the
expected value and variance of daily revenue (Y) from the machine, if X,
the number of cans sold per day has E(X) = 125 and Var(X) = 50?
(a) E(Y) = 37.5, Var(Y) = 50
(b) E(Y) = 37.5, Var(Y) = 4.5
(c) E(Y) = 37.5, Var(Y) = 15
(d) E(Y) = 37.5, Var(Y) = 15
(e) E(Y) = 125, Var(Y) = 4.5
Solution: b - remember variance = (std dev)^2
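The rule behind this answer is that multiplying a random variable by a constant a multiplies its mean by a and its variance by a squared. A short Python sketch of that rule, with scale_mean_var as an illustrative name:

```python
def scale_mean_var(a, mean_x, var_x):
    """Mean and variance of Y = a * X, given the mean and variance of X."""
    return a * mean_x, a ** 2 * var_x

# Revenue Y = 0.30 * X, where X is the number of cans sold per day.
e_y, var_y = scale_mean_var(0.30, mean_x=125, var_x=50)
print(round(e_y, 2), round(var_y, 2))
```

The standard deviation scales by 0.30, so the variance scales by 0.30 squared, which is 0.09.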
2. A crop insurance company establishes the following loss table based upon
previous claims
3. A rock concert producer has scheduled an outdoor concert. If it is warm
that day, she expects to make a $20,000 profit. If it is cool that day, she
expects to make a $5,000 profit. If it is very cold that day, she expects to
suffer a $12,000 loss. Based upon historical records, the weather office has
estimated the chances of a warm day to be .60; the chances of a cool day
to be .25. What is the producer’s expected profit?
(a) $5,000
(b) $13,000
(c) $15,050
(d) $13,250
(e) $11,450
Solution: e
Past performance 1989 Apr - 92%
Past performance 1997 Aug - 93%
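An expected value is a probability-weighted sum of outcomes. The Python sketch below works through the concert question, assuming warm, cool and very cold are the only possible kinds of day, so that the probability of very cold is the remainder. The function name expected_value is illustrative.

```python
def expected_value(outcomes):
    """Expected value of (value, probability) pairs whose probabilities sum to 1."""
    total_p = sum(p for _, p in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in outcomes)

# Assumption: the three kinds of day are exhaustive.
p_very_cold = 1 - 0.60 - 0.25
concert = [(20_000, 0.60), (5_000, 0.25), (-12_000, p_very_cold)]
print(round(expected_value(concert)))
```

Losses enter as negative values, so the very cold day pulls the expected profit down.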
Annual Cash Flow   $10,000   $30,000   $70,000   $90,000   $100,000
Probability          0.10      0.15      0.50      0.15       ?
(b) $595
(c) $875
(d) $645
(e) $495
Solution: b
Past performance 1989 Oct - 91%
Past performance 1991 Oct - 90%
Past performance 1993 Feb - 96%
Past performance 1996 Dec - 96%
6. Before planting a crop for the next year, a producer does a risk
assessment. According to her assessment, she concludes that there are three
possible net outcomes: a $7,000 gain, a $4,000 gain, or a $10,000 loss with
probabilities 0.55, 0.20 and 0.25 respectively. The expected profit is:
(a) $3,850
(b) $0
(c) $2,150
(d) $2,500
(e) $800
Solution: c
Past performance 1992 Dec - 97%
Days    2     3     4     5     6
Prob   .05   .20   .40   .20    ?
Solution: e
Past performance 1993 Apr - 74% (a-13%)
Past performance 1996 Dec - 92%
Past performance 1998 Dec - 95%
Multiple Choice Questions
Probability - General
1. The probability that the Red River will flood in any given year has been
estimated from 200 years of historical data to be one in four. This means:
(a) The Red River will flood every four years.
(b) In the next 100 years, the Red River will flood exactly 25 times.
(c) In the last 100 years, the Red River flooded exactly 25 times.
(d) In the next 100 years, the Red River will flood about 25 times.
(e) In the next 100 years, it is very likely that the Red River will flood
exactly 25 times.
Solution: d
Past performance 1989 Oct - 90%
Past performance 1990 Dec - 99%
2. The chances that you will be ticketed for illegal parking on campus are about
1/3. During the last nine days, you have illegally parked every day and
have NOT been ticketed (you lucky person)! Today, on the 10th day, you
again decide to park illegally. The chances that you will be caught are:
(a) greater than 1/3 because you were not caught in the last nine days.
(b) less than 1/3 because you were not caught in the last nine days.
(c) still equal to 1/3 because the last nine days do not affect the
probability.
(d) equal to 1/10 because you were not caught in the last nine days.
(e) equal to 9/10 because you were not caught in the last nine days.
Solution: c
Past performance 1989 Oct - 96%
3. The chance that a person will contract AIDS after a sexual contact with
an infected partner has been estimated to be 1/4. This means:
(a) A person will be infected after exactly 4 sexual contacts with infected
partners.
(b) Of 1000 people having sexual contacts with infected partners, exactly
250 will become infected.
(c) Of 200 people having sexual contacts with infected partners, about
50 will become infected.
(d) In exactly 25% of all sexual contacts with infected partners, the
infection will spread.
(e) Of 20 people having sexual contacts with infected partners, it is very
likely that exactly 5 people will become infected.
Solution: c
Past performance 1989 Dec - 88%
Past performance 1990 Oct - 94%
Past performance 1991 Oct - 95%
4. A random variable Y has the following distribution:
  Y    |  -1    0    1    2
  P(Y) |  3C   2C   0.4  0.1
(a) 0.10
(b) 0.15
(c) 0.20
(d) 0.25
(e) 0.75
Solution: a
  r      |  0    1    2    3
  P(R=r) |  2k   3k   13k  2k
(e) 1.00
Solution: b
6. Suppose that the allele for tallness (T) is dominant over shortness (t); that
for Yellow (Y) is dominant over green (y); and that for roundness (W) is
dominant over wrinkled(w). Suppose we cross two plants with genotypes
TTYyWw and TtYyWw. The probability of a Tall, Yellow, Round plant
is:
(a) 9/16
(b) 3/32
(c) 1/16
(d) 9/32
(e) 3/16
Solution: a
Past performance 1992 Oct 78%
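The genetics cross in question 6 can be checked by enumerating the allele combinations at each gene and multiplying the results. The Python sketch below assumes the three genes assort independently, which the question implies but does not state. The helper name prob_dominant is illustrative.

```python
from fractions import Fraction
from itertools import product

def prob_dominant(parent1, parent2):
    """Probability that offspring show the dominant trait at one gene.

    Each parent is a two-letter genotype; an uppercase letter is the
    dominant allele. Each parent passes on one allele with equal chance.
    """
    offspring = list(product(parent1, parent2))
    shows = sum(1 for a, b in offspring if a.isupper() or b.isupper())
    return Fraction(shows, len(offspring))

# Cross TTYyWw x TtYyWw, one gene at a time (independence assumed).
p_tall = prob_dominant("TT", "Tt")
p_yellow = prob_dominant("Yy", "Yy")
p_round = prob_dominant("Ww", "Ww")
print(p_tall * p_yellow * p_round)
```

Because one parent is TT, every offspring is tall, so only the yellow and round genes reduce the probability.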
7. It has been estimated that about 20% of people between the ages of 18
and 25 have used marijuana in the last year. Which of the following is
CORRECT about this statement?
(a) Five people of this age group were randomly selected. This means
that exactly one of them must have used marijuana in the last year.
(b) Twenty people were randomly selected from this age group. Eighteen
of them used marijuana in the last year. The next person selected at
random will have a lower probability of using marijuana.
(c) Ten people were randomly selected from this age group. None of
them have used marijuana in the last year. The next person selected
must have a higher probability of using marijuana in the last year.
(d) A thousand people from this age group were randomly selected. It is
not unusual to find that 217 of them have used marijuana in the last
year.
(e) A million people from this age group were randomly selected. There
must be exactly 200,000 of them that have used marijuana in the last
year.
Solution: d
Past performance 2006 Nov - 91%
The following two questions refer to the following situation.
All human blood can be “ABO” typed as belonging to one of A, B, O, or
AB types. The actual distribution varies slightly among different groups
of people, but for a randomly chosen person from North America, the
following are the approximate probabilities:
Blood type     O     A     B     AB
Probability   .45   .40   .11   .04
8. Consider an accident victim with type B blood. She can only receive a
transfusion from a person with type B or type O blood. What is the
probability that a randomly chosen person will be a suitable donor?
9. What is the probability that both people in a couple will have the SAME
blood type if matings are random with respect to blood type, i.e. one
partner’s blood type does not influence the blood type of the other partner.
(a) about .21
(b) about .16
(c) about .002
(d) about .01
(e) about .38
Solution: e
Past performance 2006 Nov - 73%
Past performance 2006 Dec - 85%
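Both blood-type questions reduce to adding or multiplying the probabilities in the table. A minimal Python sketch, assuming the two partners' blood types are independent as question 9 states:

```python
# Approximate North American ABO probabilities from the table above.
blood = {"O": 0.45, "A": 0.40, "B": 0.11, "AB": 0.04}

# Question 8: a type B victim can receive blood from type B or type O donors.
# The two events are mutually exclusive, so their probabilities add.
p_donor_for_b = blood["B"] + blood["O"]

# Question 9: both partners share a type. For each type the chance is p * p
# (independence), and the four cases are mutually exclusive.
p_same_type = sum(p * p for p in blood.values())

print(round(p_donor_for_b, 2), round(p_same_type, 4))
```

The addition rule handles "B or O", while the multiplication rule handles "both partners have this type".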
Multiple Choice Questions
Normal approximations to discrete distributions
1. The National Broomball League claims to have a balanced league; that is,
for any given game each team has an equal chance of winning or losing with
no ties. Assuming the claim is true, what is the approximate probability
that a given team will lose more than 61 games out of the 100 played?
(a) 0.0500
(b) 0.4918
(c) 0.0107
(d) 0.0082
(e) 0.0164
Solution: c
2. The probability of getting a parking ticket when not paying for a 2-hour
period is 0.3. What is the probability of getting at least 60 tickets if you
park on 250 occasions for a 2-hour period and don’t pay?
(a) 0.016
(b) 0.019
(c) 0.98
(d) 0.93
(e) 0.072
Solution: c
3. A professional basketball player sinks 80% of his foul shots, in the long
run. If he gets 100 tries during a season, then the probability that he sinks
between 75 and 90 shots (inclusive) is approximately equal to:
(a) Pr(−1.25 ≤ Z ≤ 2.5)
(b) Pr(−1.125 ≤ Z ≤ 2.625)
(c) Pr(−1.125 ≤ Z ≤ 2.375)
(d) Pr(−1.375 ≤ Z ≤ 2.375)
(e) Pr(−1.375 ≤ Z ≤ 2.625)
Solution: e
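The z bounds in question 3 come from the normal approximation with a continuity correction: an inclusive range of counts is widened by half a unit at each end before standardizing. The Python sketch below is one way to reproduce them; binom_normal_approx is a hypothetical helper name.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_normal_approx(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) for X ~ Binomial(n, p).

    Uses a continuity correction; returns (z_lo, z_hi, probability).
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z_lo = (lo - 0.5 - mean) / sd
    z_hi = (hi + 0.5 - mean) / sd
    return z_lo, z_hi, normal_cdf(z_hi) - normal_cdf(z_lo)

# Question 3: between 75 and 90 shots (inclusive) out of 100, with p = 0.80.
z_lo, z_hi, prob = binom_normal_approx(75, 90, 100, 0.80)
print(round(z_lo, 3), round(z_hi, 3))

# Question 1: more than 61 losses out of 100 means 62 or more, with p = 0.50.
_, _, p_losses = binom_normal_approx(62, 100, 100, 0.50)
print(round(p_losses, 4))
```

Translating "more than 61" into "62 or more" before applying the correction is the step most often missed.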
(a) .9167
(b) .9298
(c) .9390
(d) .9268
(e) .9208
Solution: c
(a) 0.6552
(b) 0.6429
(c) 0.6078
(d) 0.6201
(e) 0.6320
Solution: c
(a) 0.4
(b) larger than that in the previous question
(c) smaller than that in the previous question
(d) equal to that in the previous question
(e) may be smaller or larger than that in the previous question
Solution: b
2008
c Carl James Schwarz 2
7. Companies are interested in the demographics of those who listen to the
radio programs they sponsor. A radio station has determined that only
20% of listeners phoning in to a morning talk program are male. During
a particular week, 200 calls are received by this program. What is the
approximate probability that at least 50 of the callers are male?
(a) .0466
(b) .0212
(c) .1168
(d) .1402
(e) Not within ± .01 of any of the above.
Solution: a
9. A politician has targeted 100 homes to visit during a week. From past
experience, 50 percent of the households answer the bell and invite him
in. Of this, 80 percent will agree with his policies. The approximate
probability that the politician will get support from at least 45 households
during a week is:
(a) 0.1991
(b) 0.3212
(c) 0.8643
(d) 0.1376
(e) 0.1788
Solution: d
10. People who have been in contact with a carrier of a disease have a 40%
chance of contracting the disease. Suppose that the carrier of the
disease may have infected a school with 500 people. Find the approximate
probability that at least 215 people will contract the disease.
(a) .09
(b) .91
(c) between .05 and .34
(d) 1.37
(e) between 2.5% and 17%
Solution: a
Past performance 1993 Apr - 40% (b-22%, c-22%)
Multiple Choice Questions
Probability - Normal distribution
1. One of the side effects of flooding a lake in northern boreal forest areas
(e.g. for a hydro-electric project) is that mercury is leached from the soil,
enters the food chain, and eventually contaminates the fish. The
concentration in fish will vary among individual fish because of differences in
eating patterns, movements around the lake, etc. Suppose that the
concentrations of mercury in individual fish follow an approximate normal
distribution with a mean of 0.25 ppm and a standard deviation of 0.08
ppm. Fish are safe to eat if the mercury level is below 0.30 ppm. What
proportion of fish are safe to eat?
(a) 63%
(b) 23%
(c) 73%
(d) 27%
(e) 37%
Solution: c
Past performance 1992 Dec - 45% (16%a, 22%b, 15%d)
Past performance 1993 Apr - 57% (a-17%; d-17%)
Past performance 1996 Nov - 93%
Past performance 1997 Aug - 84%
Past performance 2006 Dec - 91%
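The proportion of safe fish in question 1 is the area under the normal curve below 0.30 ppm. A short Python sketch using the standard library's NormalDist class (Python 3.8 or later):

```python
from statistics import NormalDist

# Mercury concentration in individual fish, in ppm.
mercury = NormalDist(mu=0.25, sigma=0.08)

# Standardize the cutoff, then take the area to its left.
z = (0.30 - 0.25) / 0.08
p_safe = mercury.cdf(0.30)
print(round(z, 3), round(p_safe, 3))
```

With a table, the same answer comes from looking up the z value (about 0.63 after rounding) in the standard normal table.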
(e) 20th percentile has a value of 0.07 ppm
Solution: c
Past performance 1992 Dec - 46% (28%-b, 15%-d)
Past performance 1997 Aug - 77% (13%-d)
Past performance 2006 Dec - 84% (11%-c)
3. The following graph is a normal probability plot for the amount of rainfall
in acre-feet obtained from 26 randomly selected clouds that were seeded
with silver oxide:
(a) The data appear to show exponential growth; that is, the amount
of rainfall increases exponentially as the amount of silver oxide
increases.
(b) The pattern suggests that the measurement is not normally
distributed.
(c) A least squares regression line should be fitted to the rainfall variable.
(d) It can be expected that the histogram of rainfall amount will look
like the normal curve.
(e) The shape of the curve suggests that rainfall is caused by seeding the
clouds with silver oxide.
(d) 18%
(e) 39%
Solution: a
Solution: e
Past performance 1989 Dec - 52% (18% c,d)
Past performance 1989 Apr - 50% (C-23%, D-18%)
Past performance 1991 Dec - 80% (c-13%)
8. In some courses (but certainly not in an intro stats course!), students are
graded on a “normal curve”. For example, students within ± 0.5 standard
deviations of the mean receive a C; between 0.5 and 1.0 standard
deviations above the mean receive a C+; between 1.0 and 1.5 standard
deviations above the mean receive a B; between 1.5 and 2.0 standard
deviations above the mean receive a B+, etc. The class average in an exam
was 60 with a standard deviation of 10. The bounds for a B grade and the
percentage of students who will receive a B grade if the marks are actually
normally distributed are:
(a) (65, 75), 24.17%
(b) (70, 75), 18.38%
(c) (70, 75), 9.19%
(d) (65, 75), 12.08%
(e) (70, 75), 6.68%
Solution: c
Past performance 1997 Jul - 85%
Refer to the previous question. Another instructor decides that the lower
B cutoff should be the 70th percentile. The lower cutoff for a B grade is:
(a) 70
(b) 65
(c) 60
(d) 75
(e) 80
Solution: b
Past performance 1997 Jul - 71% (14%-a)
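The two grading questions above use the normal curve in opposite directions: one converts mark bounds to a percentage, the other converts a percentile back to a mark. A minimal Python sketch using the standard library's NormalDist class, assuming marks are exactly normal with mean 60 and standard deviation 10:

```python
from statistics import NormalDist

marks = NormalDist(mu=60, sigma=10)

# B grade: between 1.0 and 1.5 standard deviations above the mean.
b_low = 60 + 1.0 * 10
b_high = 60 + 1.5 * 10
p_b = marks.cdf(b_high) - marks.cdf(b_low)

# Alternative rule: the lower B cutoff is the 70th percentile of marks.
cutoff_70th = marks.inv_cdf(0.70)

print(b_low, b_high, round(p_b, 4))
print(round(cutoff_70th, 1))
```

The cdf method goes from a mark to a cumulative proportion, and inv_cdf goes from a cumulative proportion back to a mark.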
(d) .0228
(e) .4920
Solution: d
10. Suppose the test scores of 600 students are normally distributed with a
mean of 76 and standard deviation of 8. The number of students scoring
between 70 and 82 is:
(a) 272
(b) 164
(c) 260
(d) 136
(e) 328
Solution: e
11. Bolts that are used in the construction of an electric transformer are
supposed to be 0.060 inches in diameter, and any bolt with diameter less than
0.058 inches or greater than 0.062 inches must be scrapped. The machine
that makes these bolts is set to produce bolts of 0.060 inches in diameter,
but it actually produces bolts with diameters following a normal
distribution with µ = 0.060 inches and σ = 0.001 inches. The proportion of bolts
that must be scrapped is equal to:
(a) 0.0456
(b) 0.0228
(c) 0.9772
(d) 0.3333
(e) 0.1667
Solution: a
12. The cost of treatment per patient for a certain medical problem was
modeled by one insurance company as a normal random variable with mean
$775 and standard deviation $150. What is the probability that the
treatment cost of a patient is less than $1,000, based on this model?
(a) .5000
(b) .6826
(c) .8531
(d) .9332
(e) Cannot be computed without knowledge of additional parameters
Solution: d
13. The time that a skier takes on a downhill course has a normal distribution
with a mean of 12.3 minutes and standard deviation of 0.4 minutes. The
probability that on a random run the skier takes between 12.1 and 12.5
minutes is:
(a) 0.1915
(b) 0.3830
(c) 0.3085
(d) 0.6170
(e) 0.6826
Solution: b
16. Heights of males are approximately normally distributed with a mean of
170 cm and a standard deviation of 8 cm. What fraction of males are
taller than 176 cm?
(a) .7500
(b) .6000
(c) .2734
(d) .2500
(e) .2266
Solution: e
Past performance 1990 Oct - 68%
Past performance 1993 Feb - 87%
Past performance 1998 Dec - 92%
18. The heights of students at a college are normally distributed with a mean
of 175 cm and a standard deviation of 6 cm. One might expect in a sample
of 1000 students that the number with heights less than 163 cm is:
(a) 997
(b) 23
(c) 477
(d) 228
(e) 456
Solution: b
Past performance 1991 Oct - 62% (12% c, 20% d)
Past performance 1996 Dec - 83% (11% d)
Past performance 2006 Nov - 84%
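An expected count is the sample size times the tail probability. The sketch below assumes Python's statistics.NormalDist and uses illustrative variable names.

```python
from statistics import NormalDist

# Student heights modelled as Normal(175, 6) cm, as in the question.
heights = NormalDist(mu=175, sigma=6)

n_students = 1000
p_short = heights.cdf(163)            # 163 cm is two sd below the mean
expected_short = n_students * p_short

print(round(expected_short))
```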
19. The height of an adult male is known to be normally distributed with a
mean of 69 inches and a standard deviation of 2.5 inches. The height of
the doorway such that 96 percent of the adult males can pass through it
without having to bend is:
(a) 1.8
(b) about 65
(c) about 74
(d) about 80
(e) about 58
Solution: c
Past performance 2006 Nov - 96%
(a) .41
(b) .09
(c) .38
(d) .12
(e) .62
Solution: d
Past performance 1990 Dec - 66%
23. Refer to the previous question. The producer is concerned when the milk
production of a cow falls below the 5th percentile because the animal
may be ill. The 5th percentile (in kg) of the daily milk production is
approximately:
(a) 1.645
(b) -1.645
(c) 33.36
(d) 25.13
(e) 44.87
Solution: d
Past performance 1990 Dec - 64%
24. Which of the following is NOT CORRECT about a standard normal dis-
tribution?
(a) P (0 ≤ Z ≤ 1.50) = .4332
(b) P (Z ≤ −1.0) = .1587
(c) P (Z ≥ 2.0) = .0228
(d) P (Z ≤ 1.5) = .9332
(e) P (Z ≥ −2.5) = .4938
Solution: e
Past performance 1989 Dec - 78%
Past performance 1990 Oct - 76%
25. The measurement of the width of the index finger of a human right hand
is a normally distributed variable with a mean of 6 cm. and a standard
deviation of 0.5 cm. What is the probability that the finger width of a
randomly selected person will be between 5 cm. and 7.5 cm.?
(a) .9759
(b) .0241
(c) .9500
(d) 1.000
(e) not within ± 0.001 of these
Solution: a
26. Lice are a pesky problem for school-aged children and are unrelated to
cleanliness. The lifetimes of lice that have fallen off the scalp onto bedding
are approximately normally distributed with a mean of 2.2 days and a
standard deviation of 0.4 days. We would expect that approximately 90%
of the lice would die within:
(a) about 2.6 days
(b) about 3.9 days
(c) about 2.5 days
(d) about 2.7 days
(e) about 3.0 days
Solution: d
Past performance 1998 Nov - 67% (23% e)
Multiple Choice Questions
Probability - Poisson
2. Suppose flaws (cracks, chips, specks, etc.) occur on the surface of glass
with density of 3 per square metre. What is the probability of there being
exactly 4 flaws on a sheet of glass of area 0.5 square metre?
(a) 0.047
(b) 0.168
(c) 0.981
(d) 0.815
(e) 0.647
Solution: a
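The key step in the glass question is rescaling the rate to the area actually examined before applying the Poisson formula. A minimal sketch, assuming only Python's math module; the helper poisson_pmf is a hypothetical name introduced for illustration.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson count with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

rate_per_m2 = 3      # flaws per square metre, from the question
area_m2 = 0.5        # size of the sheet
mean_flaws = rate_per_m2 * area_m2   # expected flaws on this sheet

p_four = poisson_pmf(4, mean_flaws)
print(round(p_four, 3))
```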
1 PROBABILITY - POISSON DISTRIBUTION
3. The rate at which a particular defect occurs in lengths of plastic film being
produced by a stable manufacturing process is 4.2 defects per 75 metre
length. A random sample of the film is selected and it was found that the
length of the film in the sample was 25 metres. What is the probability
that there will be at most 2 defects found in the sample?
(a) .2102
(b) .2417
(c) .8335
(d) .1323
(e) .1665
Solution: c
Past performance 1997 Jul - 86%
4. The number of traffic accidents per week in a small city has a Poisson
distribution with mean equal to 1.3. What is the probability of at least
two accidents in 2 weeks?
(a) 0.2510
(b) 0.3732
(c) 0.5184
(d) 0.7326
(e) 0.4816
Solution: d
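"At least two" is most easily handled through the complement, after scaling the weekly mean to the two-week window. The sketch below assumes only Python's math module; poisson_cdf is a hypothetical helper name.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson count with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i)
               for i in range(k + 1))

weekly_mean = 1.3
weeks = 2
mean_two_weeks = weekly_mean * weeks     # Poisson means add over time

# P(X >= 2) = 1 - P(X <= 1)
p_at_least_two = 1 - poisson_cdf(1, mean_two_weeks)
print(round(p_at_least_two, 4))
```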
5. The number of traffic accidents per week in a small city has Poisson dis-
tribution with mean equal to 3. What is the probability of at least one
accident in 2 weeks?
(a) 0.0174
(b) 0.9502
(c) 0.9975
(d) 0.1991
(e) 0.0025
Solution: c
6. Significant birth defects occur at a rate of about 4 per 1000 births in human
populations. After a nuclear accident, there were 10 defects observed in
the next 1500 births. Find the probability of observing at least 10 defects
in this sample if the rate had not changed after the accident.
(a) .008
(b) .003
(c) .041
(d) .084
(e) .042
Solution: d
Past performance 1990 Oct - 58%
Past performance 1991 Dec - 66% (c-17%)
Past performance 1996 Nov - 79% (c-12%)
(a) (4, 8)
(b) (2, 10)
(c) (2, 6)
(d) (0, 8)
(e) (0, 12)
Solution: b
Past performance 1990 Oct - 78%
Past performance 1996 Dec - 77% (10%-a)
(a) 0.950
(b) 0.262
(c) 0.738
(d) 0.199
(e) 0.801
Solution: e
(a) .2222
(b) .7408
(c) .9603
(d) .1494
(e) .1992
Solution: e
Past performance 1989 Oct - 89%
Past performance 1991 Oct - 84%
Past performance 1997 Aug - 92%
10. Refer to the previous question. A 95% range for the likely number of
bacteria present in a 100 g sample is:
(a) 30 ± 30.0
(b) 30 ± 5.5
(c) 30 ± 11.0
(d) 30 ± 16.4
(e) 30 ± 2.8
Solution: c
Past performance 1989 Oct - 77%
Past performance 1991 Oct - 71% (19% b)
Past performance 1997 Aug - 85%
11. The number of bacteria in a drop of water from a lake has a Poisson
distribution with an average of 0.5 bacteria/drop. A small dish containing
four drops of water from the lake is placed under a microscope. The
probability of observing at most one bacterium in the sample is:
(a) 0.910
(b) 0.406
(c) 0.271
(d) 0.135
(e) 0.303
Solution: b
Past performance 1989 Dec - 75%
Past performance 1992 Oct - 82%
Past performance 2006 Dec - 74% (11%-a;)
12. Refer to the previous question. An approximate 95% range for the number
of bacteria present in 400 drops of water is:
(a) (171,229)
(b) (361,439)
(c) (185,215)
(d) (157,243)
(e) (0,400)
Solution: a
Past performance 1989 Dec - 70%
Past performance 1992 Oct - 87%
Past performance 2006 Dec - 75% (16%-c)
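The approximate 95% range used in these questions is the mean plus or minus two standard deviations, where the standard deviation of a Poisson count is the square root of its mean. A sketch under that assumption, using only Python's math module with illustrative names:

```python
import math

mean_per_drop = 0.5
drops = 400
mean_total = mean_per_drop * drops      # expected count in 400 drops
sd_total = math.sqrt(mean_total)        # Poisson: variance equals the mean

low = mean_total - 2 * sd_total
high = mean_total + 2 * sd_total
print(round(low, 1), round(high, 1))
```

Rounding outward to whole bacteria gives roughly (171, 229).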
14. In a biological cell the average number of genes that will change into
mutant genes, when treated radioactively, is 2.4. Assuming a Poisson
probability distribution, find the probability that there are at most 3 mutant
genes in a biological cell after the radioactive treatment.
(a) .2090
(b) .7576
(c) .5697
(d) .7787
(e) 1.000
Solution: d
15. The number of telephone calls that pass through a switchboard has a
Poisson distribution with mean equal to 2 per minute. The probability
that no telephone calls pass through the switchboard in two consecutive
minutes is:
(a) 0.2707
(b) 0.0517
(c) 0.0183
(d) 0.0366
(e) 0.1353
Solution: c
16. The distribution of phone calls arriving in one minute periods at a switch-
board is assumed to be Poisson with the parameter λ. During 100 periods,
the following distribution was obtained:
# (calls) 0 1 2 3 4 or more
Frequency 30 43 21 6 0
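The text of this question breaks off after the table, so what it asks is not shown. One natural first step with such data, sketched here as an assumption rather than as the intended answer, is to estimate λ by the sample mean and compare observed with expected Poisson frequencies. Only Python's math module is used, and the variable names are illustrative.

```python
import math

calls = [0, 1, 2, 3]          # the "4 or more" class has frequency 0
freq = [30, 43, 21, 6]

n_periods = sum(freq)
# Sample mean number of calls per one-minute period estimates lambda.
lam_hat = sum(c * f for c, f in zip(calls, freq)) / n_periods

# Expected frequencies under a Poisson model with mean lam_hat.
expected = [n_periods * math.exp(-lam_hat) * lam_hat**c / math.factorial(c)
            for c in calls]

print(lam_hat)
print([round(e, 1) for e in expected])
```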
17. A can company reports that the number of breakdowns per 8-hour shift
on its machine-operated assembly line follows a Poisson distribution with
a mean of 1.5. Assuming that the machine operates independently across
shifts, what is the probability of no breakdowns during three consecutive
8-hour shifts?
(a) .0744
(b) .0498
(c) .6065
(d) .2231
(e) .0111
Solution: e
18. A fisherman arrives at his favorite fishing spot. From past experience
he knows that the number of fish he catches per hour follows a Poisson
distribution at 0.5 fish/hour. The probability that he catches at least 3
fish in four hours is:
(a) .0126
(b) .0144
(c) .1804
(d) .3233
(e) .8571
Solution: d
19. The number of arrivals per hour at an automatic teller machine is Poisson
distributed with a mean of 3.5 arrivals/hour. What is the probability that
more than three arrivals occur in an hour?
(a) .3209
(b) .4633
(c) .5367
(d) .6791
(e) .7246
Solution: b
20. The marketing manager of a company has noted that she usually receives
10 complaint calls during a week (consisting of five working days), and
that the calls occur at random. Let us suppose that the number of calls
during a week follows the Poisson distribution. The probability that she
gets five such calls in one day is:
(a) .0361
(b) .0378
(c) .9834
(d) .2000
(e) .5
Solution: a
21. Cataracts are a very rare birth defect. In Canada, they occur at a rate
of approximately 3 babies in every 100,000 births. In 1989, there were
approximately 57,000 births in BC. The probability that more than 5
babies will be born with cataracts is approximately:
(a) about .1080
(b) about .0295
(c) about .0216
(d) about .0080
(e) about .0839
Solution: d
Past performance 1998 Nov - 78% (13% a)
Past performance 2006 Nov - 82% (10% b)
22. The number of deaths due to stroke in the Vancouver region each year
varies randomly with a mean of about 555 deaths per year. Assuming
that the number of deaths has an approximate Poisson distribution, then
the probability that there will be at least 600 deaths due to stroke in any
one year is:
(a) about 1%
(b) about 32%
(c) about 16%
(d) about 5%
(e) about 2.5%
Solution: e
Past performance 1998 Nov - 41% (10% a; 14% b; 20% c; 15% d)
Past performance 2006 Nov - 84%
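With a mean this large, the Poisson count is usually approximated by a normal distribution with the same mean and with variance equal to the mean. The sketch below assumes that approximation (without a continuity correction) and uses Python's statistics.NormalDist; the names are illustrative.

```python
import math
from statistics import NormalDist

mean_deaths = 555
sd_deaths = math.sqrt(mean_deaths)      # Poisson: sd = sqrt(mean)

# Standardise 600 deaths and take the upper tail of the standard normal.
z = (600 - mean_deaths) / sd_deaths
p_tail = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_tail, 3))
```

The z-score is a little under 2, so the tail area is close to the 2.5% that the two-standard-deviation rule of thumb suggests.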
23. The number of babies born with a particular severe eye defect each year
varies randomly, but at a rate of about 30/10,000 live births. Last year
there were about 15,000 live births. The approximate probability that
there will be more than 58 babies born with this eye defect is:
(a) about 16%
(b) about 5%
(c) about 1%
(d) about 0.5%
(e) about 2.5%
Solution: e
Past performance 1998 Dec - 65% (12% d)
Multiple Choice Questions
Correlation
2. If the correlation between body weight and annual income were high and
positive, we could conclude that:
Solution: d
Past performance 1991 Dec - 70% (c-25%)
Past performance 1993 Apr - 75% (c-25%)
3. A study found a correlation of r = −0.61 between the sex of a worker and
his or her income. You conclude that:
Solution: d
Past performance 1993 Feb - 60% (e-33%)
4. A study examined the relationship between the sepal length and sepal
width for two varieties of an exotic tropical plant. Varieties A and B are
represented by x’s and o’s, respectively, in the following plot:
Solution: d
5. From tax records, it is relatively easy to determine the amount of liquor
consumed per capita and the number of cigarettes consumed per capita
for each of the 10 provinces of Canada. These are plotted on a scatter
plot and a high positive correlation is found. Which of the following is
correct?
(a) This implies that heavy smoking causes people to drink more.
(b) This implies that heavy drinking causes people to smoke more.
(c) We cannot conclude cause and effect, but this also implies that there
is a high positive correlation between cigarette smoking and alcohol
consumption for individuals.
(d) This could be an example of a correlation caused by a common cause
because both activities are highly correlated with average family in-
come and average income varies widely among the provinces.
(e) We cannot conclude cause and effect, but this also implies that the
same individuals both smoke and consume liquor.
Solution: d
Past performance 1993 Feb - 44% (c-44%; e-10%)
Solution: d
7. On May 11th, 50 randomly selected subjects had their systolic blood pres-
sure (SBP) recorded twice – the first time at about 9:00 a.m. and the
second time at about 2:00 p.m. If one were to examine the relationship
between the morning and afternoon readings, then one might expect:
(a) the correlation to be near zero, as the morning and afternoon readings
should be independent of one another.
(b) the correlation to be high and positive, as those with relatively high
readings in the morning will tend to have relatively high readings in
the afternoon.
(c) the correlation to be high and negative, as those with relatively high
readings in the morning will tend to have relatively low readings in
the afternoon.
(d) the correlation to be near zero, as correlation measures the strength
of the linear association.
(e) the correlation to be near zero, as blood pressure readings should
follow approximately a normal distribution.
Solution: b
Past performance 1996 Dec - 62% (23%-d)
Past performance 1998 Oct - 68%
8. Men tend to marry women who are slightly younger than themselves.
Suppose that every man married a woman who was exactly .5 of a year
younger than themselves. Which of the following is CORRECT?
(a) The correlation is −.5.
(b) The correlation is .5.
(c) The correlation is 1.
(d) The correlation is −1.
(e) The correlation is 0
Solution: c - Draw a scatterplot of various aged men and their wives
Past performance 2006 Oct - 75% (10%-e)
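The scatterplot suggested in the solution can be mimicked numerically. The sketch below uses a hypothetical list of husbands' ages (not data from the question) and a hand-written Pearson correlation, relying only on Python's math module.

```python
import math

# Hypothetical husbands' ages; each wife is exactly 0.5 years younger.
husbands = [22.0, 27.5, 31.0, 40.0, 55.5, 63.0]
wives = [age - 0.5 for age in husbands]

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(husbands, wives)
print(round(r, 6))
```

Subtracting a constant shifts every point by the same amount, so the points stay on a straight line with positive slope and the correlation is 1, not −0.5 or 0.5.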
Multiple Choice Questions
Least squares
Solution: b
3. The following information was obtained from the manager of a city water
department for predicting the consumption of water (in gallons) from the
size of household:
Household    Water
Size (x)     Used (y)
    2           650
    7          1200
    9          1300
    4           430
   12          1400
    6           900
    9          1800
    3           640
    3           793
    2           925
Here are the summary statistics:
ΣX = 57, ΣY = 10,038, ΣX² = 433, ΣY² = 11,641,474, ΣXY = 67,669
4. For children between the ages of 18 months and 29 months, there is
approximately a linear relationship between “height” and “age”. The relationship
can be represented by: Ŷ = 64.93 + 0.63(X), where Y represents height
(in centimetres) and X represents age (in months). Joseph is 22.5 months
old and is 80 centimetres tall. What is Joseph’s residual?
(a) 79.1
(b) -0.9
(c) +0.9
(d) 56.6
(e) 64.93
Solution: c
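A residual is the observed value minus the fitted value. The arithmetic for Joseph can be sketched as follows, using illustrative variable names and the coefficients given in the question.

```python
# Fitted line from the question: height = 64.93 + 0.63 * age (months).
intercept = 64.93
slope = 0.63

age_months = 22.5
observed_height = 80.0

predicted_height = intercept + slope * age_months
residual = observed_height - predicted_height   # observed minus predicted

print(round(predicted_height, 1), round(residual, 1))
```

The sign is positive because Joseph is taller than the line predicts for his age.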
Solution: a
Past performance 1993 Feb - 72% (b-16%)
Past performance 1996 Oct - 96%
(a) The estimated slope is 6.01 which implies that children increase by
about 6 cm for each year they grow older.
(b) The estimated height of a child who is 10 years old is about 110 cm.
(c) The estimated intercept is 50.3 cm which implies that children reach
this height when they are 50.3/6.01=8.4 years old.
(d) The average height of children when they are 5 years old is about
50% of the average height when they are 18 years old.
(e) My niece is about 8 years old and is about 115 cm tall. She is taller
than average.
Solution: c
Past performance 1993 Apr - 83%
Past performance 1997 Jun - 96%
7. A study was conducted to examine the quality of fish after seven days in
ice storage. For this study:
Y = measurement of fish quality (on a 10 point scale with 10 = BEST.)
X = # of hours after being caught that the fish were packed in ice.
The sample linear regression line is: Ŷ = 8.5 − .5X. From this we can say
that:
(a) A one hour delay in packing the fish in ice decreases the estimated
quality by .5
(b) A one hour delay in packing the fish in ice increases the estimated
quality by .5
(c) If the estimated quality increases by 1 then the fish have been packed
in ice one hour sooner.
(d) If the estimated quality increases by 1 the fish have been packed in
ice two hours later.
(e) Can’t really say until we see a plot of the data.
Solution: a
predicted yield = 4.85 + .05(fertilizer)
Solution: e
Past performance 1991 Apr - 96%
dose) and the subsequent weight gain was recorded. An experimenter
plots the data and finds that a linear relationship appears to hold. The
output from SAS follows:
Solution: d
Past performance 1989 Apr - 83%
Past performance 1990 Dec - 97%
Past performance 1996 Dec - 84%
Solution: c
Past performance 1989 Apr - 50% (A-32%)
Past performance 1990 Dec - 90%
Past performance 1996 Dec - 86%
11. It is suspected that weight gain should increase with dose. An appropriate
null and alternate hypothesis to test the slope, the test statistic, and the
p-value are:
(a) H: β1 = 0 A: β1 ≠ 0; T* = 2.85; p-value = .0069
(b) H: β0 = 0 A: β0 ≠ 0; T* = 3.23; p-value = .0066
(c) H: β1 = 0 A: β1 > 0; T* = 2.85; p-value = .0137
(d) H: β0 = 0 A: β0 > 0; T* = 3.23; p-value = .0033
(e) H: β1 = 0 A: β1 > 0; T* = 2.85; p-value = .0069
Solution: e
Past performance 1989 Apr - 49% (C-31%)
Past performance 1996 Dec - 82%
Solution: b
Past performance 1996 Dec - 86%
14. It is suspected that the weight gain should increase with dose. An appro-
priate null and alternate hypothesis to test the slope, the test statistic,
and the p-value are:
(a) H: β1 = 0, A: β1 ≠ 0; T* = 7.37; p-value < .0001.
(b) H: β0 = 0, A: β0 ≠ 0; T* = 4.75; p-value = .0004.
(c) H: b1 = 0 A:b1 > 0 T* = 7.37; p-value = .0002.
(d) H: b0 = 0 A:b0 > 0 T* = 4.75; p-value = .0002.
(e) H: β1 = 0, A: β1 > 0; T* = 4.75; p-value = .0002.
Solution: e
Past performance 1996 Dec - 82%
16. Refer to the previous question. If the number of weeks after planting
ranged from 2 to 8, what is the predicted height for a seedling after 12
weeks?
Solution: a
17. A research group was interested in predicting the number of bus riders per
capita in census districts. They felt that the ridership per capita, Y, could
be predicted using the average income, X, for the census district. A sample
of 29 census districts was taken and the observations on the samples were
used to obtain n = 29, Ȳ = 62.1429, X̄ = 3452.178; Σ(X − X̄)(Y − Ȳ) =
189,312.0; Σ(X − X̄)² = 19,910,691.0; Σ(Y − Ȳ)² = 13,369.381;
MSE = 428.5. Based on this data, a 98% confidence interval for β1 is:
Solution: e
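The answer options for this question are not reproduced, but the calculation behind a confidence interval for the slope can be sketched from the summary statistics. The critical value 2.473 is taken from a t-table for 27 degrees of freedom with 1% in each tail; it is an assumption of this sketch, not something stated in the question, and the variable names are illustrative.

```python
import math

n = 29
sxy = 189_312.0        # sum of (X - Xbar)(Y - Ybar)
sxx = 19_910_691.0     # sum of (X - Xbar)^2
mse = 428.5            # mean squared error from the fit

b1 = sxy / sxx                      # least squares slope
se_b1 = math.sqrt(mse / sxx)        # standard error of the slope

t_crit = 2.473                      # assumed t-table value, df = n - 2 = 27
lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1

print(round(b1, 5), round(se_b1, 5))
print(round(lower, 4), round(upper, 4))
```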
(c) Ŷ = 3.28 − 2.34X
(d) Ŷ = 7.56 − 10.27X
(e) Ŷ = −1.03 + 75.64X
Solution: b
Past performance 1990 Apr - 89%
Past performance 1991 Dec - 93%
Solution: e
Past performance 1990 Apr - 72%(D-14%)
Past performance 1991 Dec - 88%
20. An appropriate null and alternate hypothesis to test the slope, the test
statistic, and the p-value are:
Solution: e
Past performance 1990 Apr - 48% (A-24%, C-18%)
(b) In order to obtain an estimated time to distress of 25 minutes, the
log(concentration) of the pollutant should be 1.30.
(c) A ten-fold increase in pollution (represented by an increase of one
unit on the log scale) decreases the time to distress by 20.3 minutes.
(d) It would be inadvisable to extrapolate the line outside of the
observed values of the pollutant concentration.
(e) The method of least squares is often used to obtain the estimates of
the slope and intercept.
Solution: c
Past performance 1990 Apr - 70% (A-11%, B-12%)
Past performance 1991 Dec - 56% (a-17%, b-17%)
Solution: a
Past performance 1990 Apr - 32% (B-12%, C-28%, E-23%)
Past performance 1991 Dec - 38% (b-13%, c-31%, e-11%)
greenhouse where soybean plants were exposed to varying UV levels,
measured in Dobson units. At the end of the experiment the
yield (kg) was measured. A regression analysis was performed with the
following results:
Here is some output:
26. A 95% confidence interval for the slope will be centered on the estimated
slope and:
(a) ±0.011
(b) ±0.108
(c) ±0.054
(d) ±0.046
(e) ±0.021
Solution: e
Past performance 1993 Apr - 37% (a-18%; c-18%; d-20%)
Past performance 1997 Aug - 87%
27. The null and alternate hypothesis for a test of the slope, the test statistic,
and the p-value are:
Solution: c
Past performance 1993 Apr - 72% (d-18%)
Past performance 1997 Aug - 74% (d-18%)
28. A 95% confidence interval for the mean yield when the UV reading is 20
Dobson units is:
(a) 3.3 ± 0.86
(b) 3.3 ± 2.12
(c) 3.3 ± 0.40
(d) 3.3 ± 0.98
(e) 3.3 ± 0.71
Solution: a
Past performance 1993 Apr - 23% (b-25%; c-22%; d-21%; e-10%)
29. Another experiment was conducted where the plants were sprayed with a
chemical that acts like a sun-screen. The following plot was obtained:
Which of the following provides the most reasonable approximation to the
least squares regression line?
(a) Ŷ = 50 + 10X
(b) Ŷ = 50 + X
(c) Ŷ = 10 + 50X
(d) Ŷ = 1 + 50X
(e) Ŷ = 10 + X
Solution: a
Past performance 1990 Dec - 80%
31. In simple linear regression the model that is being assumed relates the
Dependent Variable, Y, to the Independent Variable, X, according to the
following relationship: Yi = β0 + β1 Xi + εi, i = 1, 2, ..., n. For setting
up confidence interval statements for the parameter β1 based on the least
squares estimates, it is necessary to make the following assumption(s)
about the εi’s:
(e) least squares is purely a mathematical technique so no assumptions
are required.
32. A marine biologist wants to test the effect of water temperature on the
average dive duration for sea otters. Several otters are available for an
experiment. The biologist collects the following data:
          Water       Dive
          Temp (C)    Duration (sec)
Otter     X           Y
J2         4           63
J1         8           75
B7         8           84
B9        12           91
M3        12          101
D4        16          110
B8        20          115
The summary statistics are: ΣX = 80, ΣY = 639, ΣX² = 1088,
ΣY² = 60457, ΣXY = 7888
The least squares regression line is equal to:
(a) Ŷ = 3.4 + 52X
(b) Ŷ = 8.4 + 7.3X
(c) Ŷ = 4.7 + 21X
(d) Ŷ = 53 + 3.4X
(e) Ŷ = 50 − 3.3X
Solution: not available
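The key lists no solution here. As an unofficial check, the least squares line can be computed directly from the otter data with the usual formulas, slope = Sxy / Sxx and intercept = Ȳ − slope · X̄. The sketch below uses plain Python with illustrative variable names.

```python
# Water temperature (C) and dive duration (sec) for the seven otters.
temp = [4, 8, 8, 12, 12, 16, 20]
dive = [63, 75, 84, 91, 101, 110, 115]

n = len(temp)
sum_x = sum(temp)
sum_y = sum(dive)
sum_xx = sum(x * x for x in temp)
sum_xy = sum(x * y for x, y in zip(temp, dive))

# Corrected sums of squares and cross-products.
sxx = sum_xx - sum_x ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

slope = sxy / sxx
intercept = sum_y / n - slope * sum_x / n

print(round(intercept, 1), round(slope, 2))
```

Under these formulas the fitted line has an intercept near 53 and a slope near 3.4, which is closest to option (d).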
(c) “Calibration” refers to the process where the relationship between the
guessed and real areas is used to correct future guesses.
(d) If the fitted regression line tends to fall below the “45° line”, then this
student tends to underestimate real areas.
(e) The fitted straight line was fit using “least squares”. This line mini-
mizes the sum of the square of the deviations between the actual and
predicted values.
Solution: a
Past performance 1997 Jun - 76%
(a) It is estimated that for every additional gram of fat in the cereal, the
number of calories increases by about 9.
(b) It is estimated that in cereals with no fat, the total amount of calories
is about 97.
(c) If a cereal has 2 g of fat, then it is estimated that the total number
of calories is about 115.
(d) If a cereal has about 145 calories, then this equation indicates that
it has about 5 grams of fat.
(e) One cereal has 140 calories and 5 g of fat. Its residual is about 5 cal.
35. A selection of cereals was sampled and the number of calories was plotted
against the number of grams of protein with the following results:
(b) It is estimated that cereals with no protein would have just over 100
calories/serving.
(c) The observed regression line is Y = 106.0 + .339(protein)
(d) One plausible reason that the confidence interval for the slope is so
wide is that confounding variables may cloud the relationship be-
tween calories and grams of protein.
(e) The standard error for the slope indicates how much the calories may
vary among different cereals in the sample.
Solution: e
Past performance 1998 Nov - 53% (15% a)
36. Which of the following is NOT CORRECT?
(a) We are about 95% confident that the slope for this data is between
-4.0 and -2.5.
(b) The fitted regression line is approximately Ŷ = 82.42 − 3.31(runtime)
(c) There is good evidence that there is a relationship between oxygen
consumption and the run time.
(d) A person who runs 1500 m in 10 minutes would have an estimated
oxygen consumption rate of about 50.
(e) The se of .36 measures how much the estimated slope would vary if
another sample of people were measured.
Solution: a
Past performance 1998 Dec - 39% (16% c; 39% e)
38. In the above graph, both males and females appear to have the same
relationship. However, this is, in general, not true. If the relationship
for each group was not the same, then which of the following is NOT
CORRECT?
(a) The slope for the combined data could be substantially different than
either group’s slope.
(b) The intercept for the combined data could be substantially different
than either group’s intercept.
(c) The sample correlation in the combined group could be substantially
different than either group’s correlation.
(d) The combined results may be influenced by a lurking variable, in this
case gender.
(e) The median oxygen consumption for the combined group will be the
average of the medians of each group.
Solution: e
Past performance 1998 Dec - 82%
Multiple Choice Questions
Regression, Correlation, Trends
(a) plotting the variable against time and looking for a straight-line pat-
tern.
(b) calculating the least squares regression line of the variable against
time and examining the residuals.
(c) plotting the logarithm of the variable against time and looking for a
straight line pattern.
(d) smoothing the time series by running medians of three or five.
(e) smoothing the scatter plot by median trace.
Solution: c
3. The following data come from a time series of yearly sales of equipment
by a large manufacturer:
In order to smooth this series a running median of 3 is calculated. The
smoothed series for the years 1969 to 1973 respectively is:
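The sales table for this question is not reproduced, so the figures below are purely hypothetical. The sketch only illustrates how a running median of 3 is formed: each interior year is replaced by the median of its own value and its two neighbours. It assumes Python's statistics.median and leaves the two end points out.

```python
from statistics import median

# Hypothetical yearly sales (not the data from the question).
years = [1968, 1969, 1970, 1971, 1972, 1973, 1974]
sales = [12, 15, 11, 18, 30, 17, 20]

# Running median of 3 for every year that has a neighbour on both sides.
smoothed = {
    years[i]: median(sales[i - 1:i + 2])
    for i in range(1, len(sales) - 1)
}

for year, value in smoothed.items():
    print(year, value)
```

Note how the isolated spike of 30 in 1972 disappears from the smoothed series, which is the reason medians are preferred to means for this kind of smoothing.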
4. The following plot is the net sales (billions of dollars) for Eastman Kodak
Ltd. for the years 1970 through 1989 (1970 is coded as 0):
(b) well represented by a straight line.
(c) approximately exponential growth.
(d) difficult to determine without detailed statistical analysis.
(e) regular with large residuals.
Multiple Choice Questions
Sampling Distributions
1. The Gallup Poll has decided to increase the size of its random sample of
Canadian voters from about 1500 people to about 4000 people. The effect
of this increase is to:
(a) reduce the bias of the estimate.
(b) increase the standard error of the estimate.
(c) reduce the variability of the estimate.
(d) increase the confidence interval width for the parameter.
(e) have no effect because the population size is the same.
Solution: c
Past performance 1992 Dec - 65% (11%a, 16%e)
Past performance 1997 Jul - 92%
3. Government regulations indicate that the total weight of cargo in a certain
kind of airplane cannot exceed 330 kg. On a particular day a plane is
loaded with 100 boxes of goods. If the weight distribution for individual
boxes is normal with mean 3.2 kg and standard deviation 0.7 kg, what is
the probability that the regulations will NOT be met:
(a) 1.5%
(b) 92%
(c) 8%
(d) 15%
(e) 85%
Solution: c
Past performance 1997 Jul - 75%
Past performance 2006 Nov - 78%
(a) .2514
(b) .2486
(c) .4772
(d) .0228
(e) .0013
Solution: d
(a) .1915
(b) .0125
(c) .3085
(d) .0228
(e) .4875
Solution: b
6. A random sample of 100 observations is to be drawn from a population
with a mean of 40 and a standard deviation of 25. The probability that
the mean of the sample will exceed 45 is:
(a) 0.4772
(b) 0.4207
(c) 0.0793
(d) 0.0228
(e) not possible to compute, based on the information provided.
Solution: d
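The calculation rests on the standard error of the mean, σ/√n, and on treating the sample mean as approximately normal for n = 100. A minimal sketch under that assumption, using Python's statistics.NormalDist with illustrative names:

```python
import math
from statistics import NormalDist

pop_mean = 40
pop_sd = 25
n = 100

std_error = pop_sd / math.sqrt(n)            # sd of the sample mean
sample_mean_dist = NormalDist(mu=pop_mean, sigma=std_error)

p_exceeds_45 = 1 - sample_mean_dist.cdf(45)
print(std_error, round(p_exceeds_45, 4))
```

The population itself need not be normal for this to work reasonably well, since the central limit theorem applies to the mean of a sample this large.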
(a) The standard error of the sample mean will decrease as the sample
size increases.
(b) The standard error of the sample mean is a measure of the variability
of the sample mean among repeated samples.
(c) The sample mean is unbiased for the true (unknown) population
mean.
(d) The sampling distribution shows how the sample mean will vary
among repeated samples.
(e) The sampling distribution shows how the sample was distributed
around the sample mean.
Solution: e
Past performance 1990 Dec - 40% (c-18%, d-24%)
Past performance 1991 Dec - 41% (a-10%, c-27%, d-18%)
8. The sample mean is an unbiased estimator for the population mean. This
means:
Solution: b
Past performance 1989 Dec - 77%
9. Which of the following statements is NOT CORRECT?
Solution: d
Past performance 1989 Dec - 92%
Past performance 1990 Dec - 90%
(a) the distribution of the various sample sizes which might be used in a
given study
(b) the distribution of the different possible values of the sample mean
together with their respective probabilities of occurrence
(c) the distribution of the values of the items in the population
(d) the distribution of the values of the items actually selected in a given
sample
(e) none of the above
Solution: b
11. The average monthly mortgage payment for recent home buyers in Winnipeg
is µ = $732, with standard deviation of σ = $421. A random sample
of 125 recent home buyers is selected. The approximate probability that
their average monthly mortgage payment will be more than $782 is:
(a) 0.9082
(b) 0.4522
(c) 0.4082
(d) 0.0478
(e) 0.0918
Solution: e
12. Cans of salmon have a nominal net weight of 250 g. However, due to
variation in the canning process, the actual net weight has an approximate
normal distribution with a mean of 255 g and a standard deviation of 10
g. According to Consumer Affairs, a sample of 16 tins should have less
than a 5% chance that the mean weight is less than 250 g. What is the
actual probability that a sample of 16 tins will have a mean weight less
than 250 g?
(a) .1915
(b) .3085
(c) .0228
(d) .4772
(e) .0500
Solution: c
Past performance 1993 Apr - 58% (b-32%)
Past performance 1996 Nov - 77% (b-19%)
Solution: c
(c) µ is an estimate of X̄; s is an estimate of the standard deviation of
the sample mean.
(d) X̄ is an estimate of µ; s is an estimate of the standard deviation of
the sample mean.
(e) X̄ is an estimate of µ; s is the standard error of the sample mean.
Solution: b
15. The central limit theorem tells us that the sampling distribution of X̄ is
approximately normal. Which of the following conditions are necessary
for the theorem to be valid:
Solution: a
(a) provided that the population is normally distributed and the sample
size is reasonably large.
(b) provided that the population is normally distributed (for any sample
size).
(c) provided that the sample size is reasonably large (for any population).
(d) provided that the population is normally distributed and the popu-
lation variance is known (for any sample size).
(e) provided that the population size is reasonably large (whether the
population distribution is known or not).
Solution: c
(c) it enables reasonably accurate probabilities to be determined for
events involving the sample average when the sample size is large
regardless of the distribution of the variable
(d) it tells us that if several samples have produced sample averages
which seem to be different than expected, the next sample average
will likely be close to its expected value.
(e) it is the basis for much of the theory that has been developed in the
area of discrete random variables and their probability distributions.
Solution: c
18. One class decided to estimate the proportion of cars that are red in a
parking lot. They took a random sample of the cars in the closest parking
lot to the class. Which of the following is NOT correct?
(a) Even though the sample was a random sample of cars in the parking
lot, the sample may not be representative of the population of cars
driven by SFU students because the decision to park in B-lot is a
self-selected sample.
(b) If another sample of cars was taken, it is likely that a different propor-
tion for Japanese made cars would be found. The set of all possible
values for the proportion is known as the sampling distribution.
(c) The confidence interval computed refers to the proportion of cars in
the sample that were red.
(d) The sample was a simple random sample from cars parked. This
means that every car in the lot had an equal chance of being selected.
(e) A convenience sample could be chosen by selecting the first 25 cars
in the parking lot that are closest to the Applied Science Building.
Solution: c
Past performance 1996 Nov - 82%
19. Recall in one assignment you surveyed cars in a parking lot to estimate
the proportion that were red or the proportion that were from a Japanese
manufacturer. Which of the following is NOT CORRECT?
(a) A convenience sample of the cars closest to the Applied Science build-
ing may give a biased estimate of the proportion of cars which are
from a Japanese manufacturer.
(b) Different students may get different answers for the proportion of
cars that are red.
(c) The sample proportion of cars that are red is an unbiased estimate of
the population proportion if the sampling is a simple random sample.
(d) A sample of 100 cars in a convenience sample is always better than
a sample of 20 cars from a proper random sample.
(e) A sample of 100 cars from a proper random sample will give more
precise estimates of the proportion of cars that are red than a sample
of 20 cars from a proper random sample.
Solution: d
Past performance 2006 Nov - 92%
Multiple Choice Questions
Hypothesis Testing - Introduction
1 Testing - Introduction
1. To determine the reliability of experts used in interpreting the results of
polygraph examinations in criminal investigations, 280 cases were studied.
The results were:
                        True Status
                        Innocent   Guilty
Examiner’s   Innocent      131       15
Decision     Guilty          9      125
(a) 15/280
(b) 9/280
(c) 15/140
(d) 9/140
(e) 15/146
Solution: c
The second column percentage is the probability that the examiner
concludes a person is innocent or guilty, given that the person is guilty.
This is what is required for a Type II error, i.e. conditional upon the
person really being guilty.
Past performance 1993 Feb - 13% (a-65%; e-13%)
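The conditioning described in this solution can be checked numerically. The sketch below is illustrative only: it assumes the null hypothesis is that the person is innocent, so a Type II error means the examiner decides "innocent" for a person who is really guilty. The names `counts` and `conditional_rate` are made up for this example.

```python
# Counts from the polygraph table, keyed by
# (examiner's decision, true status).
counts = {
    ("innocent", "innocent"): 131,
    ("innocent", "guilty"): 15,
    ("guilty", "innocent"): 9,
    ("guilty", "guilty"): 125,
}


def conditional_rate(decision, truth):
    """P(examiner's decision | true status), using the column total."""
    column_total = sum(n for (_, t), n in counts.items() if t == truth)
    return counts[(decision, truth)] / column_total


# Assumed null hypothesis: the person is innocent.
type_ii = conditional_rate("innocent", "guilty")  # 15 / 140
type_i = conditional_rate("guilty", "innocent")   # 9 / 140

print(f"Type II error rate: {type_ii:.4f}")
print(f"Type I error rate:  {type_i:.4f}")
```

Dividing by the column total (140 truly guilty cases) rather than the grand total (280) is what separates option (c) from option (a).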
Solution: a
Solution: b
Solution: d
Past performance 1991 Apr - 55%
(a) The critical region is the values of the test statistic for which we
reject the null hypothesis.
(b) The level of significance is the probability of type I error.
(c) For testing H0 : µ = µ0 , HA : µ > µ0 , we reject H0 for high values of
the sample mean X.
(d) In testing H0 : µ = µ0 , HA : µ ≠ µ0 , the critical region is two sided.
(e) The p-value measures the probability that the null hypothesis is true.
Solution: e
Solution: e
Past performance 1991 Feb - 66% (a-12%, c-12%)
(a) the null hypothesis will not be rejected unless the data are not un-
usual (given that the hypothesis is true).
(b) the null hypothesis will not be rejected unless the p-value indicates
the data are very unusual (given that the hypothesis is true).
(c) the null hypothesis will not be rejected only if the probability of
observing the data provide convincing evidence that it is true.
(d) the null hypothesis is also called the research hypothesis; the alter-
native hypothesis often represents the status quo.
(e) the null hypothesis is the hypothesis that we would like to prove; the
alternative hypothesis is also called the research hypothesis.
Solution: b
Past performance 1993 Apr - 59% (c-26%; e-10%)
Past performance 1997 Aug - 93%
(d) if this experiment were repeated 3 per cent of the time we would get
this same result.
(e) the sample is so small that little confidence can be placed on the
result.
Solution: c
Past performance 1996 Dec - 82%
Past performance 1998 Nov - 80%
Multiple Choice Questions
Hypothesis Testing - Multinomial proportions
from a single sample
There are extensive breeding programs for salmon on the West Coast of
Canada to enhance the salmon fishery. One question of interest is whether
inbreeding affects subsequent fitness of the fish. An experiment was conducted
where released salmon were classified as unrelated if the parents were unrelated,
half-sibs if one of the parents was in common, and full-sibs if both parents
were in common. In one release, 25% of the fish were half-sibs, 40% were
unrelated, and 35% were full-sibs. Of 237 returning adult salmon, 45% were
unrelated, 25% were full-sibs, and 30% were half-sibs.
(d) The return rates are 40%, 35%, and 25% for unrelated, full-sibs, and
half-sibs respectively.
(e) The release percentages are different from the return percentages.
Solution: d
(d) is preferred over (a) because the hypothesis of independence
is only applicable when there are two classification variables. Here
there is only one variable - the sibship. Also, the proportions that should
return when H is true are known exactly. In the contingency table
analysis, you test if the proportions are the same for all the groups,
but the actual proportions are unknown.
Past performance 1993 Apr - 33% (a-54%)
Past performance 1997 Aug - 82% (a-11%)
(a) 13.1
(b) 4.5
(c) 5.4
(d) 10.8
(e) 6.0
Solution: d
Past performance 1993 Apr - 73% (b-10%; c-10%)
(e) 59.25
Solution: e
Past performance 1997 Aug - 84%
Phenotype   Tall, Cut leaf   Tall, Pot leaf   Dwarf, Cut leaf   Dwarf, Pot leaf
Frequency        926              288               293               104
(a) 7.81
(b) 5.99
(c) 1.18
(d) 1.47
(e) 964.01
Solution: d
Past performance 1991 Apr - 90%
(a) 7.81
(b) 5.99
(c) 3.84
(d) 9.49
(e) 11.07
Solution: a
Past performance 1991 Apr - 94%
Number of spots | 1 2 3 4 5 6
Frequency | 1 4 9 9 2 5
If a chi-square goodness of fit test is used to test the hypothesis that the
die is fair at a significance level of α = 0.05, then the value of the chi-square
statistic and the decision reached are:
Solution: a
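The die question can be worked through in a few lines. This is a minimal sketch, not the course's own solution: the critical value 11.07 is the upper 5% point of the chi-square distribution with 5 degrees of freedom taken from standard tables, and the variable names are invented here.

```python
# Observed frequencies for faces 1..6 of the die.
freq = [1, 4, 9, 9, 2, 5]
n = sum(freq)                 # 30 rolls in total
expected = n / len(freq)      # a fair die gives 30 / 6 = 5 per face

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.
stat = sum((f - expected) ** 2 / expected for f in freq)

# Upper 5% point of chi-square with 6 - 1 = 5 degrees of freedom
# (value from standard tables).
CRITICAL_05_DF5 = 11.07
reject = stat > CRITICAL_05_DF5

print(f"chi-square = {stat:.2f}, reject fairness at 0.05: {reject}")
```

With these counts the statistic falls just above the tabled critical value, so the decision is sensitive to small changes in the data.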
answer Frequency
A 68
B 53
C 61
D 75
E 43
(a) 11.60
(b) 10.47
(c) 190.76
(d) 310.47
(e) 48
Solution: b
Past performance 1989 Apr - 87%
10. The following table gives the number of wins for each of the first four post
positions at Assiniboine Downs for 80 races during the 1978 horse-racing
season.
Post Position 1 2 3 4
Number of wins 24 17 19 20
For testing the hypothesis that the probability of winning is the same for
all four post positions, the calculated value of the test statistic is:
(a) 26.00
(b) 1.25
(c) 1.30
(d) 0.40
(e) 20.00
Solution: c
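As a check on the arithmetic behind this answer, the statistic can be computed directly. The helper `chi_square_gof` is a hypothetical name used only for this sketch; it implements the usual sum of (observed - expected)^2 / expected.

```python
def chi_square_gof(observed, probs):
    """Goodness-of-fit statistic and expected counts for hypothesized probs."""
    total = sum(observed)
    expected = [total * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


wins = [24, 17, 19, 20]  # wins for post positions 1 to 4
# Null hypothesis: each of the four positions wins with probability 1/4.
stat, expected = chi_square_gof(wins, [0.25] * 4)

print("expected wins per position:", expected)
print(f"chi-square = {stat:.2f}")
```

Each expected count is 80 / 4 = 20, so the statistic is (16 + 9 + 1 + 0) / 20.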
Gasoline Selected
Regular   Unleaded   Super Unleaded
   51        261           88
11. The expected cell counts assuming the distributor’s claim is correct are:
(e) 20%, 60%, 20%
Solution: c
12. If α=0.05, then the value of the appropriate test statistic and the critical
value respectively are:
Solution: b
Past performance 1990 Apr - 82%
Gasoline Selected
Regular   Unleaded   Super Unleaded
   51        261           88
(b) p_regular = .200; p_unleaded = .600; p_super = .200
(c) p̂_regular = .200; p̂_unleaded = .600; p̂_super = .200
(d) gasoline selected is independent of the type of car
(e) the probability of each type of gasoline is equal
Solution: b
(d) is not valid because there is no classification by type of car in this
survey
Past performance 1996 Dec - 71% (12%-c)
14. The expected cell counts assuming the distributor’s claim is correct are:
(a) 100, 200, 100
(b) 51, 261, 88
(c) 80, 240, 80
(d) 133, 133, 133
(e) 20%, 60%, 20%
Solution: c
Past performance 1996 Dec - 93%
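The expected counts come from applying the claimed shares to the total number of customers surveyed. The sketch below assumes the distributor's claim is the 20% / 60% / 20% split that appears in the answer options; the variable names are illustrative.

```python
observed = {"regular": 51, "unleaded": 261, "super unleaded": 88}
# Assumed claim, taken from the answer options: 20%, 60%, 20%.
claimed_share = {"regular": 0.20, "unleaded": 0.60, "super unleaded": 0.20}

total = sum(observed.values())  # 400 customers
expected = {k: total * p for k, p in claimed_share.items()}

# Contribution of each cell to the chi-square statistic.
contrib = {k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed}
stat = sum(contrib.values())

for k in observed:
    print(f"{k:15s} observed={observed[k]:4d} expected={expected[k]:6.1f}")
print(f"chi-square = {stat:.2f} on {len(observed) - 1} degrees of freedom")
```

Note that expected counts are counts, not percentages, which is why the 20%, 60%, 20% option is a distractor.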
15. The value of the appropriate test statistic and approximate p-value,
respectively, are:
Solution: b
Past performance 1996 Dec - 73% (15%-d)
A test of the hypothesis that the nonconforming parts are uniformly dis-
tributed among the three shifts can be based upon which of the following
values of the test statistic?
(a) 5.78 with 3 degrees of freedom.
(b) 5.78 with 2 degrees of freedom.
(c) 5.48 with 2 degrees of freedom.
(d) 5.48 with 3 degrees of freedom.
(e) 5.48 with 1 degree of freedom.
Solution: b
(d) H: pn = 0.215, ps =0.538, pv = 0.247
(e) The observed proportions of the three feather types occur with prob-
abilities of 0.25, 0.50, and 0.25 respectively.
Solution: c
Past performance 1998 Dec - 85%
19. What is the null hypothesis being tested?
(a) H : pweekend = .50; pweekday = .50
(b) H : µweekend = .22; µweekday = .78
(c) H : µweekend = 2/7; µweekday = 5/7
(d) H : pweekend = .22; pweekday = .78
(e) H : pweekend = 2/7; pweekday = 5/7
Solution: e.
Past performance 2006 Dec - 56% (26% c)
Solution: b
Past performance 2006 Dec - 82%
21. The test-statistic is 13.6 with a p-value that is very small. Which is COR-
RECT?
(a) There is strong evidence that the proportion of births on weekends
is different from 2/7.
(b) There is strong evidence that the mean number of births is the same
between weekends and weekdays.
(c) There is strong evidence that the mean number of births differs be-
tween weekends and weekdays.
(d) There is strong evidence that the proportion of births on weekends
is different from that on weekdays.
(e) There is strong evidence that the proportion of births on weekends
is the same as that on weekdays.
Solution: a
Past performance 2006 Dec - 45% (19% c; 31% d)
Multiple Choice Questions
Hypothesis Testing - Population means from
paired experiments
2. The infamous researcher, Dr. Gnirips, claims to have found a drug that
causes people to grow taller. The coach of the Basketball team at Brandon
University has expressed interest but demands evidence. Ten people are
randomly selected from students at Brandon, their heights measured, the
drug administered, and 2 hours later their heights remeasured. The results
were as follows:
Person      1  2  3  4  5  6  7  8  9 10
Pre-Drug   68 69 74 78 70 66 71 70 71 65
Post-Drug  70 69 75 78 73 69 72 73 72 66
Using the proper test statistic, an appropriate decision rule for the
hypotheses H: Drug has no effect versus A: Drug increases height (at α =
.05) will be
3. A group of 10 men were given a special diet for two weeks to test weight
loss in pounds. The observed data was:
4. A manufacturer wished to compare the wearing qualities of two different
types of automobile tires, A and B, and he had 5 cars available for use in
an experiment. To make the comparison, one tire of Type A and one of
Type B were mounted on the rear wheels of each of the five automobiles.
(For each car, a coin was flipped to decide which tire would be mounted on
the left side and which would be mounted on the right.). The automobiles
were then operated for a specified number of miles and the amount of wear
was recorded for each tire. These measurements appear below:
(a) 12.83
(b) 0.57
(c) 8.35
(d) 10.72
(e) 9.45
Test for any difference in the length of dives using a non-parametric pro-
cedure:
(a) Rank-sum procedure, Wcold = 25;p−value > .111
(b) Rank-sum procedure, Wcold = 25;p−value > .222
(c) Signed-rank procedure, W − = 1;p−value = .062
(d) Signed-rank procedure, W − = 1;p−value = .124
(e) Sign-test, S = 4;p−value = .187
Solution: d
Past performance 1991 Apr - 38% (C-52%)
Pair 1 2 3 4 5
Male 25.9 20.0 28.7 13.5 18.8
Female 24.9 18.5 27.7 13.0 17.8
To test whether the mean starting salary for males is less than that of
females with α= 0.05, the absolute value of the test statistic is:
(a) 1
(b) 0.125
(c) 0.3535
(d) 5.658
(e) 6.3246
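Because the salaries are matched by pair, the statistic is built from the within-pair differences. The following is a minimal sketch under that assumption; `paired_t` is a hypothetical helper, not a function from any particular package.

```python
import math
import statistics

male = [25.9, 20.0, 28.7, 13.5, 18.8]
female = [24.9, 18.5, 27.7, 13.0, 17.8]


def paired_t(x, y):
    """Paired t statistic: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n))


t = paired_t(male, female)
print(f"|t| = {abs(t):.4f} on {len(male) - 1} degrees of freedom")
```

The differences are 1.0, 1.5, 1.0, 0.5 and 1.0, giving a mean of 1.0 and a sample variance of 0.125, so the computed value corresponds to option (e).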
7. Consider the differences computed by taking the mother’s height - the
daughter’s height. The value of the Signed-Rank test statistic is:
(a) 36
(b) 19
(c) 16
(d) 6
(e) 20
Solution: c
Past performance 1990 Apr - 61%
8. No longer used
The next three questions refer to the following situation:
All of us non-smokers can rejoice - the mosaic tobacco virus that affects
and injures tobacco plants is spreading! Meanwhile, a tobacco company is
investigating if a new treatment is effective in reducing the damage caused
by the virus. Eleven plants were randomly chosen. On each plant, one
leaf was randomly selected, and one half of the leaf (randomly chosen)
was coated with the treatment - the other half was left untouched (con-
trol). After two weeks, the amount of damage to each half of the leaf was
assessed. The output from SAS follows:
9. What is the best reason for performing a paired experiment rather than a
two independent sample experiment?
(a) It is easier to do because we need fewer experimental units and each
unit receives more than one treatment.
(b) It allows us to remove variation in the results caused by other factors
because we can compare both treatments within the same experi-
mental unit.
(c) The computer program is more accurate because we work only with
the differences.
(d) It requires fewer assumptions because we are only interested in the
difference between treatments
(e) It allows us to do more experiments because we use each experimental
unit twice.
Solution: b
Past performance 1991 Feb - 98%
Past performance 1997 Aug - 95%
10. What is the rejection region (α=.05) and p-value for the paired t-test?
(a) Reject if T ∗ 1.812; p-value =.040
(b) Reject if T ∗ 1.812; p-value =.020
(c) Reject if T ∗ 2.358; p-value =.040
(d) Reject if T ∗ 2.358; p-value =.020
(e) Reject if T ∗ 1.645; p-value =.020
Solution: b
Past performance 1991 Feb - 56% (a-13%, e-20%)
Solution: a
12. A group of 10 men were put on a weight reduction diet. The weights
before (b) and after (a) the diet were measured on each individual. The
differences di = ai -bi, were analyzed, yielding the following results.
- values are not given for some reason?
We wish to test if the diet has reduced the average weight. The test
statistic and critical value (α=.05) are:
The absolute value of the test statistic calculated from the data for testing
the null hypothesis that there is no difference in the average wear for the
two types of tires is:
(a) 12.83
(b) 5.7
(c) 8.35
(d) 10.72
(e) 9.45
Solution: b
14. A statistics professor would like to determine whether students in his class
showed improved performance on the final examination as compared to the
mid-term examination. A random sample of 4 students selected from a
large class revealed the following mid-term and final scores:
Student #1 #2 #3 #4
Mid-term 70 62 57 68
Final 80 79 87 88
Making the appropriate assumptions, the value of the test statistic is:
(a) 19.25/8.30
(b) 19.25/(8.30/2)
(c) 19.25/$\sqrt{28.295/4 + 28.295/4}$
(d) 19.25/$\sqrt{34.92/4 + 21.67/4}$
(e) 19.25/(2/8.30)
Solution: b
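The numbers in these options can be reproduced from the four pairs of scores. The sketch below is illustrative: it computes the paired statistic and, for contrast, the standard error that results if the two columns are wrongly treated as independent samples. Variable names are invented for this example.

```python
import math
import statistics

midterm = [70, 62, 57, 68]
final = [80, 79, 87, 88]

# Within-student differences (final - mid-term): 10, 17, 30, 20.
diffs = [f - m for f, m in zip(final, midterm)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)

# Paired statistic: mean difference over s_d / sqrt(n), and sqrt(4) = 2.
t_paired = mean_diff / (sd_diff / math.sqrt(n))

# Standard error if the pairing is ignored and the two columns are treated
# as independent samples with separate variances.
se_unpaired = math.sqrt(
    statistics.variance(midterm) / n + statistics.variance(final) / n
)

print(f"mean difference = {mean_diff}, sd of differences = {sd_diff:.2f}")
print(f"paired t = {t_paired:.2f}")
print(f"unpaired standard error = {se_unpaired:.2f}")
```

The paired denominator uses only the variability of the differences, which is why the statistic takes the form 19.25/(8.30/2).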
15. A sample of 8 patients had their lung capacity measured before and after
a certain treatment with the following results:
The Sign Test is used to test the hypothesis that the treatment provides
no increase in lung capacity. The probability, under H0 , of obtaining the
observed result or a more extreme one (i.e. the p-value or observed level
of significance) is:
(a) .0352
(b) .1094
(c) .0498
(d) .1445
(e) .2980
Solution: d
16. Seven sets of identical twins are given psychological tests to determine
whether the firstborn of the twins tends to be more aggressive than the
second born. The results are shown in the following table, where the
higher score represents greater aggressiveness.
Set Firstborn Second born Difference
1 86 88 -2
2 77 65 12
3 91 90 1
4 70 65 5
5 75 80 -5
6 88 81 7
7 87 72 15
Solution: d
17. The following data give uric acid levels (in milligrams per 100 milliliters)
for 5 subjects before and after a special diet.
To test the hypothesis that the diet reduces the uric acid level, we might
use
(a) a two sample t-test because the uric acid levels before and after the
diet can be assumed independent.
(b) a sign test
(c) a paired t-test
(d) a and b
(e) b and c
Solution: e
An agricultural field station is investigating the differences between the
mean yields of two varieties of corn. Because of fertility differences, both
varieties were planted in each of seven farms across the province. At
harvest time, the plots were harvested and the yield recorded. The output
from SAS appears below.
(a) H: $\overline{X}_d$ = 0 A: $\overline{X}_d$ ≠ 0
(b) H: µd = 0 A: µd ≠ 0
(c) H: µd ≠ 0 A: µd = 0
(d) H: µd = 0 A: µd < 0
(e) H: $\overline{X}_d$ = 0 A: $\overline{X}_d$ < 0
Solution: b
Past performance 1990 Feb - 97%
19. The test statistic, rejection region (α = .05), and p-value are:
Solution: b
Past performance 1990 Feb - 56% (A-22%)
(a) There is evidence to believe that the two varieties have a different
mean yield.
(b) There is insufficient evidence to believe that the two varieties have a
different mean yield.
(c) There is evidence to believe that the two varieties have the same
mean yield.
(d) There is insufficient evidence to believe that the two varieties do not
have a difference in their mean yields.
(e) There is sufficient evidence to believe that the two varieties are paired
on each farm.
Solution: a
Past performance 1990 Feb - 83%
(c) 2.810 .0584
(d) 1.204 .9708
(e) 2.333 .0292
Solution: e
Past performance 1996 Dec - 89%
23. Suppose that the p-value had been .0093. This would mean:
(a) There is strong evidence against the null hypothesis of equal mean
yields.
(b) There is no evidence to believe that the two varieties have a different
mean yield.
(c) There is strong evidence to believe that the two varieties have the
same mean yield.
(d) There is no evidence to believe that the two varieties do not have a
difference in their mean yields.
(e) There is sufficient evidence to believe that the two varieties are paired
on each farm
Solution: a
Past performance 1996 Dec - 87%
24. The null and alternate hypotheses are:
(a) H: $\overline{X}_{diff}$ = 0 A: $\overline{X}_{diff}$ ≠ 0
(b) H: µdiff = 0 A: µdiff > 0
(c) H: µdiff ≠ 0 A: µdiff = 0
(d) H: µdiff = 0 A: µdiff < 0
(e) H: $\overline{X}_{diff}$ = 0 A: $\overline{X}_{diff}$ < 0
Solution: d - Notice that diff = before − after, so if the drug is effective
in reducing blood pressure, the average before should be greater than the
average after.
Past performance 1997 Aug - 73%
Past performance 2006 Dec - 73% (11% c; 12% e)
25. The estimated difference and the p-value are:
(a) 2.00; .1672
(b) 1.23; .0836
(c) 1.62; .0836
(d) 2.00; .9164
(e) 2.00; .0836
Solution: e
Past performance 1997 Aug - 87%
Past performance 2006 Dec - 79% (10% a)
Multiple Choice Questions
Hypothesis Testing - Population mean from a
single sample
Solution: d - you always try and collect evidence against the null
(a) 0.20
(b) 0.40
(c) 0.29
(d) 0.42
(e) 0.21
Solution: d
The one-sided p-value is P (Z > .8) = .21. Because the alternative hy-
pothesis is two-sided, the two-sided p-value is found as 2 × .21 = .42.
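The tail area quoted in this solution can be reproduced without tables. This is a small sketch using only the standard library; `upper_tail` is a hypothetical helper name, and it relies on the identity P(Z > z) = erfc(z / sqrt(2)) / 2 for a standard normal Z.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = 0.8
one_sided = upper_tail(z)
two_sided = 2 * one_sided  # double the tail for a two-sided alternative

print(f"one-sided p-value = {one_sided:.2f}")
print(f"two-sided p-value = {two_sided:.2f}")
```

Doubling is appropriate here because the normal distribution is symmetric, so the two tails beyond -0.8 and +0.8 have equal area.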
The next 2 questions refer to the following situation
A Canadian railway company claims that its trains block crossings no
more than 8 minutes per train on the average. The actual times (minutes)
that 10 randomly selected trains block crossings were recorded:
10.1 9.5 6.5 8.0 8.8 >12 7.2 10.5 8.2 9.3
(a) .101
(b) .053
(c) .248
(d) .049
(e) .064
Solution: d
Past performance 1993 Apr - 72%
(d) H: X = 0.19 A: X = 0
(e) H: µ = 0.2 A: µ ≠ 0.2
Solution: a
Past performance 1990 Apr - 98%
Past performance 1991 Dec - 84% (11%-e)
Past performance 1993 Feb - 99%
(a) -1.00
(b) -4.00
(c) 0.01
(d) 1.96
(e) 1.75
Solution: b
Past performance 1990 Apr - 95%
Past performance 1993 Feb - 99%
7. The null hypothesis will be rejected (α=0.05) if the test statistic is less
than: (note that if the rejection region is two sided, only one side has been
shown)
(a) -2.1314
(b) -1.7530
(c) -1.9600
(d) -1.6450
(e) -1.7459
Solution: b
Past performance 1990 Apr - 74%
Past performance 1993 Feb - 92%
(a) 8
(b) > 128
(c) 34
(d) 27
(e) > 101
Solution: d
Past performance 1993 Feb - 63%
Solution: e
Past performance 1991 Feb - 98%
10. The value of the test statistic, the rejection region (α=0.05), and the
p-value (computed by a computer) are:
Solution: d
Past performance 1991 Feb - 80%
11. The average time it takes for a person to experience pain relief from aspirin
is 25 minutes. A new ingredient is added to help speed up relief. Let µ
denote the average time to obtain pain relief with the new product. An
experiment is conducted to verify if the new product is better. What are
the null and alternative hypotheses?
(a) H0 : µ = 25 vs HA : µ ≠ 25
(b) H0 : µ = 25 vs HA : µ < 25
(c) H0 : µ < 25 vs HA : µ = 25
(d) H0 : µ < 25 vs HA : µ > 25
(e) H0 : µ = 25 vs HA : µ > 25
Solution: b
12. We wish to test H0 that the average family income of Manitoba families
is at least $15,000 at level of significance α = .05. In order to test the null
hypothesis a sample of size 1000 is selected from the population, and the
p-value of the test is determined to be .02. We then:
(a) reject H0 because the data are sufficiently unusual if the null hypoth-
esis were false.
(b) reject H0 because the data are sufficiently unusual if the null hypoth-
esis were true .
(c) fail to reject H0 because the data are not sufficiently unusual if the
null hypothesis were true
(d) fail to reject H0 because the data are not sufficiently unusual if the
null hypothesis were false
(e) reject H0 because the data are sufficiently unusual
Solution: b
13. The profit per new car sold by a Winnipeg automobile dealer varies from
car to car. The average profit per sale tabulated for the past 6 days was
$368 with a standard deviation of $190. To test if there is sufficient evidence
to indicate that average profit per sale is less than $480, the appropriate
null and alternative hypotheses for the test are:
Solution: b
14. In order to study the amounts owed to the city, a city clerk takes a random
sample of 16 files from a cabinet containing a large number of delinquent
accounts and finds the average amount X owed to the city to be $230
with a sample standard deviation of $36. It has been claimed that the
true mean amount owed on accounts of this type is greater than $250. If
it is appropriate to assume that the amount owed is a normally distributed
random variable, the value of the test statistic appropriate for testing the
claim is:
(a) -3.33
(b) -1.96
(c) -2.22
(d) -0.55
(e) -2.1314
Solution: c
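With the figures as stated in the question (n = 16, sample mean $230, sample standard deviation $36, hypothesized mean $250), the one-sample t statistic works out to about -2.22. The short sketch below shows the calculation; the variable names are illustrative.

```python
import math

n = 16          # number of files sampled
xbar = 230.0    # sample mean amount owed
s = 36.0        # sample standard deviation
mu0 = 250.0     # hypothesized mean from the claim

# Standard error of the mean: s / sqrt(n) = 36 / 4 = 9.
se = s / math.sqrt(n)
t = (xbar - mu0) / se

print(f"standard error = {se:.2f}")
print(f"t = {t:.2f} on {n - 1} degrees of freedom")
```

Dividing by sqrt(36) = 6 instead of by the standard error 36 / sqrt(16) = 9 is a common slip that produces -3.33.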
What assumption(s) do we have to make in order to carry out a legitimate
statistical test of the nutritionists’ claim?
Solution: a
17. Refer to the previous question. What are the appropriate statistical hy-
potheses and the observed value of the corresponding test statistic?
Solution: b
Solution: b
(a) -0.532
(b) 0.460
(c) 0.504
(d) -0.504
(e) -0.460
Solution: c
20. In the previous question, the appropriate critical region and conclusion
when testing at α = .05 are:
Solution: a
21. A Canadian railway company claims that its trains block crossings no
more than 5 minutes per train on the average. The actual times (minutes)
that 10 randomly selected trains block crossings were:
10.4 9.7 6.5 9.5 8.8 11.2 7.2 10.5 8.2 9.3
Solution: b
(b) The value of the test statistic is in the acceptance region.
(c) The p-value is less than 0.10.
(d) The p-value is greater than 0.10.
(e) If the sample mean is not equal to 100.
Solution: c
(a) .4772
(b) .0228
(c) .9772
(d) .1915
(e) .3085
Solution: a
Solution: c
Past performance 1991 Apr - 98%
(a) rejected because the calculated value of the test statistic is less than
the appropriate critical value 1.711.
(b) rejected because the calculated value of the test statistic is greater
than the appropriate critical value 1.645.
(c) accepted because the calculated value of the test statistic is less than
the appropriate critical value 1.711.
(d) accepted because the calculated value of the test statistic is less than
the appropriate critical value 1.708.
(e) accepted because the calculated value of the test statistic is less than
the appropriate critical value 2.064.
Solution: c
Past performance 1991 Apr - 77%
28. The p-value for the previous test is computed to be:
Solution: e
Past performance 1991 Apr - 75% (D-12%)
(a) H: µ = 72 A: µ < 72
(b) H: X = 72 A: X < 72
(c) H: µ = 80 A: µ = 72
(d) H: X = 80 A: X > 72
(e) H: µ = 72 A: µ > 72
Solution: e
Past performance 1990 Feb - 88%
Past performance 1993 Apr - 80% (a-17%)
Past performance 1996 Dec - 92%
(a) .32
(b) 2.00
(c) -.32
(d) 1.64
(e) 2.88
Solution: b
Past performance 1990 Feb - 99%
Past performance 1993 Apr - 71% (d-10%)
Past performance 1996 Dec - 96%
31. The null hypothesis will be rejected at α= 0.05 if the test statistic exceeds:
(a) 1.9600
(b) 1.6450
(c) 1.7109
(d) 2.0639
(e) 1.7081
Solution: c
Past performance 1990 Feb - 62% (A-10%, B-18%)
Solution: a
Past performance 1993 Apr - 74% (c-10%)
Past performance 1996 Dec - 92%
(a) Conclude that the students are less fit (on average) than the general
population when in fact they have equal fitness on average.
(b) Conclude that the students have the same fitness (on average) as the
general population when in fact they are less fit on average.
(c) Conclude that the students have the same fitness (on average) as the
general population when in fact they are the same fitness level on
average.
(d) Conclude that the students are less fit (on average) than the general
population, when, in fact, they are less fit on average.
(e) Conclude that the students have the same fitness (on average) when
in fact they are more fit on average.
Solution: b
Past performance 1990 Feb - 79% (A-15%)
Past performance 1993 Apr - 80% (a-10%)
Multiple Choice Questions
Hypothesis Testing - Population proportion
from a single sample
(a) 0.90
(b) 0.40
(c) 0.05
(d) 0.20
(e) 0.10
Solution: d
The one-sided p-value is P (Z > 1.28) = .10. Because the alternative is a
two-sided alternative, the two-sided p-value is 2 × .1 = .2.
2. The power takeoff driveline on tractors used in agriculture is a potentially
serious hazard to operators of farm equipment. The driveline is covered
by a shield in new tractors, but for a variety of reasons, the shield is often
missing on older tractors. Two types of shields are the bolt-on and the
flip-up. It was believed that the bolt-on shield was perceived as a nuisance
by the operators and deliberately removed, but the flip-up shield is easily
lifted for inspection and maintenance and may be left in place. In a study
initiated by the National Safety Council of the U.S., a sample of older
tractors with both types of shields was taken to see what proportion were
removed. Of 183 tractors designed to have bolt-on shields, 35 had been
removed. Of the 136 tractors with flip-up shields, 15 were removed. We
wish to test the hypothesis H: pb = pf vs A: pb ≠ pf where pb and pf are
the proportion of tractors with the bolt-on and flip-up shields removed,
respectively. The test-statistic is computed to be 1.97. The p-value is:
(a) .025
(b) .049
(c) .012
(d) .975
(e) .475
Solution: b
Past performance 1991 Feb - 65% (a-27%)
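Both the quoted test statistic and the p-value can be recovered from the counts in the question. The sketch below assumes the statistic uses the pooled proportion under H: pb = pf, which is the version that reproduces the quoted value of 1.97; `upper_tail` is a hypothetical helper.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


removed_bolt, n_bolt = 35, 183   # bolt-on shields removed / tractors
removed_flip, n_flip = 15, 136   # flip-up shields removed / tractors

p_bolt = removed_bolt / n_bolt
p_flip = removed_flip / n_flip

# Assumption: pooled estimate of the common proportion under the null.
p_pooled = (removed_bolt + removed_flip) / (n_bolt + n_flip)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_bolt + 1 / n_flip))

z = (p_bolt - p_flip) / se
p_value = 2 * upper_tail(abs(z))  # two-sided alternative

print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

Reporting only one tail here would give about .025, which matches the most popular wrong answer, option (a).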
(a) .00192
(b) .9933
(c) .0096
(d) .0067
(e) .9936
Solution: d
(a) 1.80
(b) 1.90
(c) 1.83
(d) 1.28
(e) 1.75
Solution: a
(c) Fail to reject H0 because the calculated value of the test statistic is
1.0204 which is less than 1.96.
(d) Fail to reject H0 because the calculated value of the test statistic is
1.0204 which is less than 1.645.
(e) No need to test because everyone knows that FTA is good.
Solution: d
(a) 0.004
(b) 0.035
(c) 0.050
(d) 0.127
(e) 0.965
Solution: d
7. A seed company claims that 80% of the seeds of a certain variety of tomato
will germinate if sown under normal growing conditions. A government
inspector is interested in whether or not the proportion of seeds germi-
nating is living up to the company’s claim. He randomly selects a sample
of 200 seeds from a large shipment and tests the sample for percentage
germination. If 155 of the 200 seeds germinate, then the calculated value
of the test statistic used to test the hypothesis of interest is:
(a) −.847
(b) −.884
(c) −.897
(d) −.825
(e) −.858
Solution: b
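The test statistic for the germination claim follows the usual one-sample z form for a proportion, with the standard error evaluated at the hypothesized value. A minimal sketch, with illustrative variable names:

```python
import math

p0 = 0.80          # company's claimed germination rate
n = 200            # seeds sampled
germinated = 155

p_hat = germinated / n  # 0.775

# Standard error under the null uses p0, not p_hat.
se0 = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se0

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}")
```

Using the sample proportion 0.775 in the standard error instead of 0.80 gives a slightly different value, which is how some of the nearby distractors arise.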
(a) 0.0500
(b) .0750
(c) .0375
(d) .0448
(e) .0228
Solution: e
(a) 0.4772
(b) 0.94772
(c) 0.0456
(d) 0.0114
(e) 0.0228
Solution: e
(a) 0.0348
(b) 0.0500
(c) .0700
(d) 0.0436
(e) 0.0218
Solution: ***
(a) H0 : p = 0.10 H1 : p > 0.15
(b) H0 : p = 0.10 H1 : p > 0.10
(c) H0 : p = 0.15 H1 : p ≠ 0.15
(d) H0 : p = 0.15 H1 : p < 0.15
(e) H0 : p = 0.15 H1 : p > 0.15
Solution: d
Past performance 1991 Feb - 90%
(a) 1.83
(b) −1.10
(c) 1.53
(d) −1.83
(e) −1.53
Solution: e
Past performance 1991 Feb - 55% (a-13%, d-18%)
13. A method currently used by doctors to screen women for possible breast
cancer fails to detect cancer in 15% of the women who actually have the
disease. A new method has been developed that researchers hope will be
able to detect cancer more accurately. A random sample of 80 women
known to have breast cancer are to be screened using the new method. At
the 0.05 level of significance, the researchers will be able to conclude that
the new method is better than the one currently in use if the appropriate
test statistic has a value:
Solution: ***
14. Refer to the previous question. After the experiment was performed it
was discovered that the new method failed to detect the breast cancer in
8 of the 80 randomly selected women. The value of the test statistic is
equal to:
(a) 0.10
(b) −1.25
(c) 1.50
(d) 0.15
(e) −0.14
Solution: ***
Multiple Choice Questions
Hypothesis Testing - Populations Means from
two independent samples
(a) there is a large difference between the effects of the treatment and
the placebo.
(b) there is strong evidence that the treatment is very effective.
(c) there is strong evidence that there is some difference in effect between
the treatment and the placebo.
(d) there is little evidence that the treatment has any effect.
(e) there is evidence of a strong treatment effect.
Solution: c
Not (a), (b), or (e) because there is nothing in the question
about the size of the effect - it may be statistically significant, but
of no practical importance - refer to notes.
2. Herbicide A has been used for years in order to kill a particular type of
weed, but an experiment is to be conducted in order to see whether a new
herbicide, Herbicide B, is more effective than Herbicide A. Herbicide A
will continue to be used unless there is sufficient evidence that Herbicide
B is more effective. The alternative hypothesis in this problem is that
(e) Herbicides A and B differ in effectiveness.
Solution: b
                $\overline{X}$   $s^2$
Excellent (E)        8.4          4.2
Simple (S)           8.9          4.6
(a) H: µE − µS = 0 A: µE − µS > 0
(b) H: µE − µS = 0 A: µE − µS 6= 0
(c) H: µE − µS = 0 A: µE − µS < 0
(d) H: µE − µS < 0 A: µE − µS = 0
(e) H: µE − µS > 0 A: µE − µS = 0
Solution: c
4. Absolute value of the calculated value of the appropriate test statistic is:
(a) 1.61
(b) 2.33
(c) 0.65
(d) 1.24
(e) 0.85
(a) 1.960
(b) 1.701
(c) 2.048
(d) 2.145
(e) 1.645
OLD 32 <25 40 31 35 29
NEW 45 32 >48 34 37 27 35 >48
One patient died before twenty five months, but it was not known when.
Two patients were still alive after four years when the study was termi-
nated.
6. The value of the test statistic (computed on the OLD drug) for testing if
the new drug gave an increased life span is:
(a) 75
(b) 71
(c) 32
(d) 34
(e) 33
Solution: e
Past performance 1990 Apr - 84%
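The rank-sum statistic only needs the ordering of the pooled observations, so the censored values can still be ranked. In the sketch below, "<25" and ">48" are replaced by the stand-ins 0 and 999, which is an assumption that works only because every other survival time lies between 27 and 45. The helper `midranks` is hypothetical and assigns tied values the average of their ranks.

```python
def midranks(values):
    """Rank each value from smallest to largest, averaging ranks across ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


LOW, HIGH = 0, 999  # stand-ins for the censored values "<25" and ">48"
old = [32, LOW, 40, 31, 35, 29]
new = [45, 32, HIGH, 34, 37, 27, 35, HIGH]

ranks = midranks(old + new)
w_old = sum(ranks[:len(old)])  # rank sum for the OLD drug group

print("ranks of OLD observations:", ranks[:len(old)])
print("rank sum for OLD:", w_old)
```

The tied pairs (32, 32), (35, 35) and the two ">48" values share the ranks 5.5, 8.5 and 13.5 respectively.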
(e) The assumption of independence is not important for non-parametric
procedures.
Solution: e
Past performance 1990 Apr - 78% (C-11%)
10. We wish to test if a new feed increases the mean weight gain compared
to an old feed. At the conclusion of the experiment it was found that the
new feed gave a 10 kg bigger gain than the old feed. A two-sample t-test
with the proper one-sided alternative was done and the resulting p-value
was .082. This means:
Solution: b
Past performance 1991 Feb - 50% (20%-a; 12%-d; 11%-e)
Past performance 1993 Feb - 86%
Past performance 1993 Apr - 81%
Past performance 1997 Aug - 74% (14%-d)
Past performance 2006 Dec - 77% (11%-a)
11. Following the analysis of some data on two samples drawn from populations
in which the variable of interest is normally distributed, the p-value
for the comparison of the two sample means under the null hypothesis that
the two population means are equal (H0: µ1 = µ2) against HA: µ1 ≠ µ2
was found to be .0063. This p-value indicates that:
(a) there is very little evidence in the data for a conclusion to be reached.
(b) there is rather strong evidence against the null hypothesis.
(c) the evidence against the null hypothesis is not strong.
(d) the null hypothesis should be accepted.
(e) there is rather strong evidence against the alternative hypothesis.
Solution: b
FOR H0: VAR ARE EQUAL, F’= 7.55 WITH 6 AND 4 DF PROB > F’= 0.0707
12. We wish to test if the two varieties are significantly different in their mean
carbohydrate content. The null and alternative hypotheses are:
(a) H: µ1 = µ2 A: µ1 < µ2
(b) H: µ1 = µ2 A: µ1 > µ2
(c) H: µ1 = µ2 A: µ1 ≠ µ2
(d) H: X̄1 = X̄2 A: X̄1 < X̄2
(e) H: X̄1 = X̄2 A: X̄1 ≠ X̄2
Solution: c
Past performance 1990 Apr - 97%
Past performance 1990 Dec - 86%
13. The test statistic, absolute critical value (at α=.05), and p-value are:
Solution: c
Past performance 1990 Apr - 44% ( a=41%, e=12%)
Solution: e
Past performance 1990 Apr - 91%
15. These findings were submitted to a journal, and one reviewer questioned
the results because she believed that the data within each group were
not normally distributed. Consequently, a non-parametric procedure was
used, and the output follows:
(b) S=45.5 p-value=.0208
(c) Z=2.0371 p-value=.0208
(d) Z=2.0371 p-value=.0664
(e) S=45.5 p-value=.0664
Solution: a
Past performance 1990 Apr - 64% (b=23%)
16. We wish to test if the two varieties are significantly different in their mean
carbohydrate content. The null and alternative hypotheses are:
(a) H: µ1 = µ2 A: µ1 < µ2
(b) H: µ1 = µ2 A: µ1 > µ2
(c) H: µ1 = µ2 A: µ1 ≠ µ2
(d) H: X̄1 = X̄2 A: X̄1 < X̄2
(e) H: X̄1 = X̄2 A: X̄1 ≠ X̄2
Solution: c
Past performance 1996 Dec - 96%
(e) -2.725 .1020
Solution: a
Past performance 1996 Dec - 95%
18. The following are percentages of fat found in 5 samples of each of two
brands of ice cream:
Solution: b
19. The life, in months of service, before a failure of the color television picture
tube in a random sample of 6 television sets manufactured by Company
A and 8 television sets manufactured by Company B are as follows:
The calculated value of the Rank-Sum test statistic for testing the null
hypothesis that the life, in months of service, before failure of the picture
tube is the same for both companies is:
(a) 75
(b) 71
(c) 32
(d) 34
(e) 33
Solution: e
Feed A: 8.0 7.4 5.8 6.2 8.8 9.5
Feed B: 12.0 18.2 8.0 9.6 8.2 9.9 10.3
We wish to test the hypothesis that Feed B gives rise to larger weight
gains. The output from SAS is as follows:
Variances T DF Prob>|T|
---------------------------------------
Unequal -2.4048 7.9 0.0431
Equal -2.2596 11.0 0.0451
Solution: b
Past performance 1991 Apr - 56% (A-20%)
21. The results were written up in a report, but a reviewer of the report
thought that some of the assumptions necessary for a two-sample t-test
might be violated. Consequently, a non-parametric procedure was also
done. The rank-sum test statistic computed for Feed A and the corresponding
p-value are:
Solution: a
Past performance 1991 Apr - 82%
(a) Reject H if WA ≤ 29
(b) Reject H if WA ≥ 55
(c) Reject H if WA ≤ 36
(d) Reject H if WA ≤ 27
(e) Reject H if WA ≤ 34
Solution: a
Past performance 1991 Apr - 77%
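The cutoff of 29 in option (a) can be checked by brute force rather than from a table. The sketch below assumes sample sizes of 6 (Feed A) and 7 (Feed B) taken from the data listed earlier, a one-sided test at α = 0.05, and no ties; under those assumptions every set of 6 ranks out of 13 is equally likely for Feed A when H is true.

```python
from itertools import combinations

# Assumptions: n_A = 6 and n_B = 7 (the Feed A / Feed B sample sizes),
# a one-sided test at alpha = 0.05, and no ties among the 13 observations.
n_a, n_b, alpha = 6, 7, 0.05
ranks = range(1, n_a + n_b + 1)

# Under H every set of n_a ranks is equally likely for Feed A.
sums = [sum(c) for c in combinations(ranks, n_a)]
total = len(sums)


def lower_tail(c):
    """Exact P(W_A <= c) under the null hypothesis."""
    return sum(s <= c for s in sums) / total


# Largest cutoff whose lower-tail probability does not exceed alpha.
critical = max(c for c in range(min(sums), max(sums) + 1)
               if lower_tail(c) <= alpha)
print(critical, round(lower_tail(critical), 4))  # 29 0.0367
```

Small rank sums for Feed A support the alternative that Feed B gives larger gains, so the rejection region sits in the lower tail. A cutoff of 30 would push the tail probability just above 0.05.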
We wish to test the hypothesis that Feed B gives rise to larger weight
gains. The output from JMP is as follows:
Solution: d
Past performance 1997 Aug - 90%
FOR H0: VAR ARE EQUAL, F’= 1.16 WITH 9 AND 6 DF PROB > F’= 0.8868
26. We wish to test if the mean level of nitric oxide from device I is greater
than that of device II. The null and alternate hypotheses are:
(a) H: µ1 − µ2 = 0 A: µ1 − µ2 ≠ 0
(b) H: X̄1 − X̄2 = 0 A: X̄1 − X̄2 < 0
(c) H: µ1 − µ2 = 0 A: µ1 − µ2 < 0
(d) H: X̄1 − X̄2 = 0 A: X̄1 − X̄2 > 0
(e) H: µ1 − µ2 =0 A: µ1 − µ2 < 0.
Solution: c
27. The test statistic, rejection region (α=.05), and the p-value are:
Solution: c
Solution: e
29. These findings were submitted to a journal, and one reviewer questioned
the results because she believed that the data within each group were
not normally distributed. Consequently, a non-parametric procedure was
used, and the output follows:
WILCOXON 2-SAMPLE TEST (NORMAL APPROXIMATION)
(WITH CONTINUITY CORRECTION OF .5)
S= 51.00 Z=-1.1278 PROB >|Z|=0.2594
Solution: b
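The Wilcoxon output above can be partly retraced by hand. This is a rough sketch under two assumptions that the output does not state: the sample sizes are 10 and 7 (inferred from the 9 and 6 DF of the F test quoted for this data set), and S is the rank sum of the smaller sample. No tie correction is applied, so the computed Z is only expected to land near the reported value.

```python
import math

# Assumptions: sample sizes 10 and 7 (from the 9 and 6 DF of the F test),
# with S the rank sum of the smaller sample.
n_small, n_large, s = 7, 10, 51.0
n = n_small + n_large

mean_s = n_small * (n + 1) / 2                        # E[S] under H
sd_s = math.sqrt(n_small * n_large * (n + 1) / 12)    # no tie correction

# The continuity correction of 0.5 moves S toward its mean.
z_approx = (s - mean_s + 0.5) / sd_s


def normal_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


z_reported = -1.1278                                  # Z from the output
p_two_sided = 2 * normal_cdf(-abs(z_reported))
p_one_sided = p_two_sided / 2                         # if the sign matches A

print(round(z_approx, 3), round(p_two_sided, 4), round(p_one_sided, 4))
```

The two-sided value agrees with the PROB >|Z| figure in the output. A one-sided alternative uses half of it only when the sign of Z points in the direction of the alternative.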
FOR H0: VARIANCES EQUAL, F’=3.05 WITH 9 AND 6 DF PROB > F’= 0.1882
(a) H: µ1 − µ2 > 0 A: µ1 − µ2 = 0
(b) H: X̄1 − X̄2 > 0 A: X̄1 − X̄2 = 0
(c) H: X̄1 − X̄2 = 0 A: X̄1 − X̄2 > 0
(d) H: µ1 − µ2 = 0 A: µ1 − µ2 < 0
(e) H: µ1 − µ2 = 0 A: µ1 − µ2 > 0
Solution: e
Past performance 1990 Feb - 97%
31. The value of the proper test statistic and rejection region (α= 0.05) are:
Solution: b
Past performance 1990 Feb - 92%
(a) .1882
(b) .1871
(c) .2273
(d) .0936
(e) .1136
Solution: e
Past performance 1990 Feb - 65% (C-26%)
Solution: b
Past performance 1990 Feb - 56% (A-27%)
A sheep producer wishes to investigate if the mean number of tapeworms
in the stomachs of Suffolk sheep is less if they have been treated with a
drug compared to sheep not treated. He obtains the following sample data
to conduct a 5% significance test:
(a) H0: µ1 − µ2 = 0; H1: µ1 − µ2 < 0
(b) H0: µ1 − µ2 = 0; H1: µ1 − µ2 > 0
(c) H0: X̄1 − X̄2 = 0; H1: X̄1 − X̄2 < 0
(d) H0: X̄1 − X̄2 = 0; H1: X̄1 − X̄2 > 0
(e) H0: µ1 − µ2 = 0; H1: µ1 − µ2 ≠ 0
Solution: b
(a) 1.54
(b) 1.28
(c) 1.75
(d) 2.1
(e) 4.41
(a) 1.8946
(b) 1.7709
(c) 1.9432
(d) 1.7823
(e) 2.1788
37. Calculate the observed value of the test statistic for the test of H0: µ1 − µ2 = 0
versus Ha: µ1 − µ2 < 0 on the basis of the following information.
Test the hypotheses at the 5% level of significance.
(a) she concludes that the drugs are equal in effectiveness when in fact
the new drug is better.
(b) she concludes that the drugs are equal in effectiveness when in fact
the old drug is better.
(c) she concludes that the old drug is better when in fact the new drug
is better.
(d) she concludes that the new drug is better when in fact the drugs are
equal in effectiveness.
(e) she concludes that the old drug is better when in fact the drugs are
equal in effectiveness.
Solution: d
Past performance 1990 Dec - 83%
Past performance 1991 Feb - 83% (a-10%)
t-Tests
separate estimates of sigma_1, sigma_2
------------------------------------
t-Test, paired samples
not spray - spray:
(a) 1.896, 0.033
(b) 1.896, 0.131
(c) 1.896, 0.065
(d) 1.887, 0.059
(e) 1.887, 0.118
Solution: c
Past performance 1993 Feb - 38% (a-53%)
Solution: a
Past performance 1993 Feb - 83% (b-17%)
Solution: c
Past performance 1993 Feb - 66% (a-10%; e-15%)
(a) Randomization makes the experiment easier to conduct because we
can apply the insecticide in any pattern rather than in a systematic
fashion.
(b) Randomization will tend to average out all other uncontrolled factors
such as soil fertility so that they are not confounded with the
treatment effects.
(c) Randomization makes the analysis easier because the data can be
collected and entered into the computer in any order.
(d) Randomization is required by statistical consultants before they will
help you analyze the experiment.
(e) Randomization implies that it is not necessary to be careful during
the experiment, during data collection, and during data analysis.
Solution: b
Past performance 1990 Feb - 97%
Past performance 1993 Feb - 98%
Past performance 1996 Dec - 100%
Past performance 2006 Dec - 99%
Here is some output from JMP (the differences are computed as control − poisoned):
(b) H: X̄c = X̄p A: X̄c < X̄p
(c) H: pc = pp A: pc < pp A: βc < βp
(d) H: X̄c = X̄p A: X̄c ≠ X̄p
Solution: a
Past performance 1998 Dec - 95%
Solution: d
Past performance 1998 Dec - 23% (20% e; 53% c)
Note: (c) refers to SAMPLE means not population means.
Solution: a
Past performance 2006 Dec - 87%
Multiple Choice Questions
Testing - Two independent samples on proportions
(a) .025
(b) .049
(c) .012
(d) .975
(e) .475
Solution: b
Past performance 1991 Feb - 65% (a-27%)
(a) 3.29
(b) 2.47
(c) 8.56
(d) 12.32
(e) 3.41
Solution: e
3. Two different medical procedures are widely used to treat a disease. One
hundred patients were randomly selected for each procedure in a recent
clinical trial, with the following results:
What is the absolute value of the test statistic calculated from the data for
testing the null hypothesis that there is no difference between the success
rates between procedure 1 and procedure 2?
(a) +0.658
(b) +1.675
(c) +2.385
(d) +2.575
(e) +31.610
Solution: b
The appropriate test statistic for testing whether the traditional method
has a lower passing rate than the audio visual methods:
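(a) $\dfrac{.63 - .70}{\sqrt{\dfrac{.672 \times .328}{100} + \dfrac{.672 \times .328}{150}}}$
(b) $\dfrac{.63 - .70}{\sqrt{\dfrac{.630 \times .370}{100} + \dfrac{.700 \times .300}{150}}}$
(c) $\dfrac{.63 - .70}{\sqrt{\dfrac{.667 \times .333}{100} + \dfrac{.667 \times .333}{150}}}$
(d) $\dfrac{(63 - 67.2)^2}{67.2} + \dfrac{(37 - 32.8)^2}{32.8} + \dfrac{(105 - 100.8)^2}{100.8} + \dfrac{(45 - 49.2)^2}{49.2}$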
(e) none of the above
Solution: a
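A short numerical sketch shows how options (a) and (d) relate. The counts are inferred from the observed values in option (d): 63 of 100 passing under the traditional method and 105 of 150 under the audio-visual method. The variable names are illustrative only.

```python
import math

# Counts inferred from option (d): 63 of 100 passed under the traditional
# method and 105 of 150 under the audio-visual method.
x1, n1 = 63, 100
x2, n2 = 105, 150

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)            # 0.672, as used in option (a)

# Option (a): pooled two-sample z statistic.
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Option (d): Pearson chi-square over the four cells of the 2 x 2 table.
observed = [x1, n1 - x1, x2, n2 - x2]
expected = [n1 * pooled, n1 * (1 - pooled), n2 * pooled, n2 * (1 - pooled)]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(z, 3), round(chi_sq, 3), round(z ** 2, 3))
```

The chi-square value equals the square of z, so it loses the sign of the difference. That is one reason the z form suits a one-sided question such as whether the traditional method has the lower passing rate.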
Solution: a
Past performance 1990 Feb - 83%
Past performance 1990 Apr - 92%
Past performance 1990 Dec - 68% (22% - e)
7. The test statistic would be computed as:
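(a) $.09 \Big/ \sqrt{\dfrac{.423(1-.423)}{350} + \dfrac{.334(1-.334)}{488}}$
(b) $.09 \Big/ \sqrt{\dfrac{.423(1-.423)}{838} + \dfrac{.334(1-.334)}{838}}$
(c) $.09 \Big/ \sqrt{\dfrac{.371(1-.371)}{838}}$
(d) $.09 \Big/ \sqrt{\dfrac{.371(1-.371)}{350} + \dfrac{.371(1-.371)}{488}}$
(e) $.09 \Big/ \sqrt{\dfrac{.423(1-.423)}{350} + \dfrac{.370(1-.370)}{488}}$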
Solution: d
Past performance 1990 Feb - 65% (A-32%)
Past performance 1990 Apr - 83%
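The pooled value .371 in the keyed answer can be traced with a short worked calculation. It assumes the sample proportions .423 (n = 350) and .334 (n = 488) that appear in the options, and it uses the unrounded difference .089 rather than the .09 shown there.

```latex
\hat{p} = \frac{350(.423) + 488(.334)}{350 + 488}
        = \frac{148.05 + 162.99}{838} \approx .371

z = \frac{.423 - .334}
         {\sqrt{.371(1 - .371)\left(\dfrac{1}{350} + \dfrac{1}{488}\right)}}
  \approx \frac{.089}{.0338} \approx 2.63
```

Under H the two population proportions are equal, so a single pooled estimate is used in both variance terms. A z of about 2.63 leaves an upper-tail area of roughly .004.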
(a) 2.63
(b) .004
(c) .009
(d) .496
(e) .089
Solution: b
Past performance 1990 Feb - 80%
Past performance 1990 Apr - 83%
(a) The probability that the proportion of smokers has not changed is
.053.
(b) The proportion of smokers has definitely decreased.
(c) There is some, but not overwhelming evidence, that the proportion
of smokers has decreased.
(d) There is no evidence that the proportion of smokers is the same in
both years.
(e) There is overwhelming evidence that the proportion of smokers has
stayed the same.
Solution: c
Past performance 1990 Dec - 61%
10. In a similar study of adult males, the p-value was found to be .053. This
means:
(a) The probability that the proportion of male smokers has not changed
is .053.
(b) The proportion of male smokers has definitely decreased.
(c) If the proportion of male smokers has not changed, then there is only
a .053 chance of seeing the observed drop in the smoking rate in the
survey.
(d) If the proportion of male smokers has changed, then there is only a
.053 chance of detecting a difference.
(e) If the proportion of smokers has changed, then there is only a .053
chance of seeing the observed drop in the smoking rate in the survey.
Solution: c
Past performance 1990 Feb - 38% (A-14%, B-38%, C-38%, D-29%, E-17%)
Past performance 1990 Apr - 64%(C-64%, D-11%, E-21%)
the slash on the ground. It is believed that mulching will cause the material
to break down sooner and release the nutrients to the seedlings. A
total of 500 seedlings were randomly assigned to the two treatments and
the two-year survival rate was measured. Of the 250 seedlings receiving
the “mulching” treatment, 75 survived; of the 250 seedlings receiving the
“control” treatment, 55 survived.
12. The null and alternate hypotheses are: (m=mulch, c=control)
Solution: c
Past performance 1993 Feb - 82% (d=19%)
13. The value of the test statistic and the p-value are:
(a) 2.76, .003
(b) 2.05, .042
(c) 2.76, .006
(d) 2.05, .021
(e) 2.05, .011
Solution: d
Past performance 1993 Feb - 84%
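The keyed answer can be approximately reproduced from the survival counts in the problem statement. The sketch assumes a pooled two-sample z test with a one-sided alternative (mulching improves survival). The result comes out near 2.04, slightly below the 2.05 printed in the options, presumably because of rounding in the original.

```python
import math

# Survival counts from the problem statement.
x_m, n_m = 75, 250   # mulching
x_c, n_c = 55, 250   # control

p_m, p_c = x_m / n_m, x_c / n_c
pooled = (x_m + x_c) / (n_m + n_c)       # 130 / 500 = 0.26

se = math.sqrt(pooled * (1 - pooled) * (1 / n_m + 1 / n_c))
z = (p_m - p_c) / se

# Assumption: the alternative is one-sided (mulching improves survival),
# so the p-value is the upper-tail area beyond z.
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))

print(round(z, 2), round(p_one_sided, 3))
```

Doubling the one-sided value gives the two-sided p-value of about .042 that appears in option (b), which is the distractor for readers who ignore the direction of the alternative.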