1. A random sample X1, X2, ..., Xn is taken from a population with unknown mean μ and unknown
variance σ². A statistic Y is based on this sample.

(a) Explain what you understand by the statistic Y.


(b) Explain what you understand by the sampling distribution of Y.


(c) State, giving a reason, which of the following is not a statistic based on this sample.

(i) $\sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n}$

(ii) $\sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2$

(iii) $\sum_{i=1}^{n} X_i$

(Total 5 marks)

2. (a) Explain what you understand by a census.


Each cooker produced at GT Engineering is stamped with a unique serial number. GT Engineering
produces cookers in batches of 2000. Before selling them, they test a random sample of 5 to see
what electric current overload they will take before breaking down.

(b) Give one reason, other than to save time and cost, why a sample is taken rather than a
census.

(c) Suggest a suitable sampling frame from which to obtain this sample.

(d) Identify the sampling units.

(Total 4 marks)

3. Before introducing a new rule the secretary of a golf club decided to find out how members
might react to this rule.

(a) Explain why the secretary decided to take a random sample of club members rather than
ask all the members.

(b) Suggest a suitable sampling frame.


(c) Identify the sampling units.

(Total 3 marks)

4. A large dental practice wishes to investigate the level of satisfaction of its patients.

(a) Suggest a suitable sampling frame for the investigation.


(b) Identify the sampling units.


(c) State one advantage and one disadvantage of using a sample survey rather than a census.

(d) Suggest a problem that might arise with the sampling frame when selecting patients.
(Total 5 marks)

5. A magazine has a large number of subscribers who each pay a membership fee that is due on
January 1st each year. Not all subscribers pay their fee by the due date. Based on
correspondence from the subscribers, the editor of the magazine believes that 40% of
subscribers wish to change the name of the magazine. Before making this change the editor
decides to carry out a sample survey to obtain the opinions of the subscribers. He uses only
those members who have paid their fee on time.

(a) Define the population associated with the magazine.


(b) Suggest a suitable sampling frame for the survey.


(c) Identify the sampling units.


(d) Give one advantage and one disadvantage that would have resulted from the editor using
a census rather than a sample survey.

As a pilot study the editor took a random sample of 25 subscribers.

(e) Assuming that the editor’s belief is correct, find the probability that exactly 10 of these
subscribers agreed with changing the name.

In fact only 6 subscribers agreed to the name being changed.

(f) Stating your hypotheses clearly, test, at the 5% level of significance, whether or not the
percentage agreeing to the change is less than the editor believes.

The full survey is to be carried out using 200 randomly chosen subscribers.

(g) Again assuming the editor’s belief to be correct and using a suitable approximation, find
the probability that in this sample there will be at least 71 but fewer than 83 subscribers who
agree to the name being changed.
(Total 20 marks)

1. (a) A statistic is a function of X1,X2,...Xn B1
that does not contain any unknown parameters B1 2
Examples of other acceptable wording:
B1 e.g. is a function of the sample or the
data / is a quantity calculated from the sample
or the data / is a random variable calculated
from the sample or the data
B1 e.g. does not contain any unknown
parameters/quantities / contains only known
parameters/quantities / only contains values
of the sample
Y is a function of X1,X2,...Xn that does not
contain any unknown parameters B1B1
is a function of the values of a sample with
no unknowns B1B1
is a function of the sample values B1B0
is a function of all the data values B1B0
A random variable calculated from the sample B1B0
A random variable consisting of any function B0B0
A function of a value of the sample B1B0
A function of the sample which contains no other
values/ parameters B1B0

(b) The probability distribution of Y or the
distribution of all possible values of Y (o.e.) B1 1
Examples of other acceptable wording
All possible values of the statistic together
with their associated probabilities

(c) Identify (ii) as not a statistic B1

Since it contains unknown parameters μ and σ. dB1 2
1st B1 for selecting only (ii)
2nd B1 for a reason. This is dependent upon the
first B1. Need to mention at least one of mu (mean)
or sigma (standard deviation or variance) or
unknown parameters.
since it contains mu B1
since it contains sigma B1
since it contains unknown parameters/quantities B1
since it contains unknowns B0
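
The distinction in part (c) can be illustrated with a minimal Python sketch. The sample values and the function names `expression_i` and `expression_ii` are hypothetical, chosen only for illustration: expression (i) can be evaluated from the observations alone, whereas expression (ii) cannot be evaluated without supplying μ and σ, which are unknown.

```python
# Hypothetical sample values, used only for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 4.8]

def expression_i(xs):
    """Sum of (x - sample mean)^2 / n: needs only the observed data."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def expression_ii(xs, mu, sigma):
    """Sum of ((x - mu) / sigma)^2: needs the population parameters too."""
    return sum(((x - mu) / sigma) ** 2 for x in xs)

# (i) is a statistic: it is computable from the sample alone.
value_i = expression_i(sample)
print(value_i)

# (ii) is not a statistic: with mu and sigma unknown there is nothing
# to pass for the second and third arguments.
```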

2. (a) A census is when every member of the population is investigated. B1

B1 Need one word from each group
(1) Every member /all items / entire /oe
(2) population/collection of individuals/sampling frame/oe
enumerating the population on its own gets B0

(b) There would be no cookers left to sell. B1

B1 Idea of Tests to destruction. Do not accept cheap or quick

(c) A list of the unique identification numbers of the cookers. B1

B1 Idea of list/ register/database of cookers/serial numbers

(d) A cooker B1 4
B1 cooker(s) / serial number(s)
The sample of 5 cookers or every 400th cooker gets B1

3. (a) Saves time / cheaper / easier B1 1

any one
A census / asking all members takes a long time or is
expensive or difficult to carry out

(b) List, register or database of all club members / golfers B1 1

Full membership list

(c) Club member(s) B1 1


4. (a) List of patients registered with the practice.

Require ‘list’ or ‘register’ or database or similar B1 1
(b) The patient(s) B1 1

(c) Adv: Quicker, cheaper, easier, used when testing results in destruction of item,
quality of info about each sampling unit is often better. Any one B1
Disadv: Uncertainty due to natural variation, uncertainty due to bias, possible bias as
sampling frame incomplete, bias due to subjective choice of sample, bias due to
non-response. Any one B1 2

(d) Non-response due to patients registered with the practice but
who have left the area B1 1

5. (a) All subscribers to the magazine B1 1

(b) A list of all members that had paid their subscriptions B1 1
(c) Members who have paid B1 1
(d) Advantage: total accuracy B1
Disadvantage: time consuming to obtain data and analyse it B1 2

(e) Let X represent the number agreeing to change the name

∴ X ∼ B(25, 0.4) B1
P(X = 10) = P(X ≤ 10) − P(X ≤ 9) = 0.1612 M1 A1 3
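
As a check on the table lookup, the probability can also be computed directly from the binomial formula. This is a minimal sketch using only the Python standard library; the helper name `binom_pmf` is an assumption for illustration, not part of the mark scheme.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# X ~ B(25, 0.4): probability that exactly 10 subscribers agree.
p_exactly_10 = binom_pmf(10, 25, 0.4)
print(round(p_exactly_10, 4))
```

The direct calculation agrees with the difference of cumulative table values to four decimal places.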

(f) H0: p = 0.40, H1: p < 0.40 B1, B1

P(X ≤ 6) = 0.0736 > 0.05 ⇒ not significant M1 A1
No reason to reject H0: there is insufficient evidence to conclude that the % is less than the editor believes A1 5
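
The one-tailed test can be reproduced with a short sketch that sums the binomial probabilities in the lower tail. The helper `binom_cdf` and the variable names are assumptions made for this illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

n, p0, observed = 25, 0.4, 6   # sample size, value of p under H0, observed count
alpha = 0.05                   # significance level

# Lower-tail probability under H0: p = 0.40 (H1: p < 0.40).
p_value = binom_cdf(observed, n, p0)
reject_h0 = p_value < alpha
print(f"P(X <= {observed}) = {p_value:.4f}; reject H0: {reject_h0}")
```

Because the tail probability exceeds 0.05, the observed value 6 does not lie in the critical region.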

(g) Let X represent the number agreeing to change the name

∴ X ∼ B(200, 0.4)
P(71 ≤ X < 83) ≈ P(70.5 ≤ Y < 82.5) where Y ∼ N(80, 48) B1 B1
≈ P((70.5 − 80)/√48 ≤ Z < (82.5 − 80)/√48) M1 M1
≈ P(−1.37 ≤ Z < 0.36) A1 A1
= Φ(0.36) − (1 − Φ(1.37)) = 0.6406 − 0.0853 = 0.5553 A1 7
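
The normal approximation with continuity correction can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library and, for comparison, sums the exact binomial probabilities for 71 to 82 inclusive; the variable names are assumptions for illustration. Using unrounded z-values, the approximation comes out at about 0.556, slightly different from a result based on z-values rounded to two decimal places.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.4
mean = n * p                   # np = 80
variance = n * p * (1 - p)     # np(1 - p) = 48

# Y ~ N(80, 48); continuity correction turns 71 <= X <= 82 into 70.5 <= Y <= 82.5.
approx_dist = NormalDist(mu=mean, sigma=sqrt(variance))
normal_approx = approx_dist.cdf(82.5) - approx_dist.cdf(70.5)

# Exact binomial probability P(71 <= X <= 82) for comparison.
exact = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(71, 83))

print(f"normal approximation: {normal_approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```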

