THIS PAPER IS NOT TO BE REMOVED FROM THE EXAMINATION HALL ST104A ZA
BSc DEGREES AND GRADUATE DIPLOMAS IN ECONOMICS, MANAGEMENT, FINANCE AND THE SOCIAL SCIENCES, THE DIPLOMA IN ECONOMICS AND SOCIAL SCIENCES AND THE CERTIFICATE IN EDUCATION IN SOCIAL SCIENCES Statistics 1
Monday 13 May 2019: 10.00 – 12.00
Time allowed: 2 hours
DO NOT TURN OVER UNTIL TOLD TO BEGIN
Candidates should answer THREE of the following FOUR questions: QUESTION 1 of Section A (50 marks) and TWO questions from Section B (25 marks each).
Candidates are strongly advised to divide their time accordingly.
A list of formulae and extracts from statistical tables are provided after the final question on this paper. Graph paper is provided at the end of this question paper. If used, it must be detached and fastened securely inside the answer book. A handheld calculator may be used when answering questions on this paper and it must comply in all respects with the specification given with your Admission Notice. The make and type of machine must be clearly stated on the front cover of the answer book.
Tables 4, 5, 7, 8, 9, 10, 13 & 14 (New Cambridge). Graph paper
© University of London 2019
UL19/0000p. 1 of 21
SECTION A
Answer all parts of question 1 (50 marks in total).
1. (a) Suppose that $x_1 = 3.2$, $x_2 = 0$, $x_3 = \sqrt{2}$, $x_4 = \sqrt{5}$, and $y_1 = -2.5$, $y_2 = -0.8$, $y_3 = \sqrt{6}$, $y_4 = \sqrt{100}$. Calculate the following quantities:
i. $\sum_{i=1}^{2} (x_i + y_i)$
ii. $\sum_{i=3}^{4} x_i^2 y_i^2$
iii. $y_3^4 + \sum_{i=1}^{2} x_i y_i^2$.
(6 marks)
(b) Classify each one of the following variables as either measurable (continuous) or categorical. If a variable is categorical, further classify it as either nominal or ordinal. Justify your answer. (No marks will be awarded without a justification.)
i. Clothing sizes of ‘small’, ‘medium’ and ‘large’.
ii. The inflation rate of a country.
iii. A passport’s country of issue.
(6 marks)
(c) State whether the following are true or false and give a brief explanation. (No marks will be awarded for a simple true/false answer.)
i. A boxplot is suitable for visualising a categorical variable.
ii. The events $A$ and $A^c$ are mutually exclusive.
iii. The median and the mean are the same value for a normal distribution.
iv. A $p$-value of 0.8 indicates highly significant evidence against the null hypothesis.
v. Correlation coefficients are asymmetric.
(10 marks)
(d) Three cards are drawn at random, without replacement, from a standard deck of 52 cards. What is the probability that they are all of the same suit (that is, all three cards are hearts, or all are diamonds, or all are spades, or all are clubs)?
(5 marks)
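One possible way to check an answer to part (d) is to count hands directly. The sketch below assumes the usual 13 cards per suit and treats every 3-card hand as equally likely; the variable names are illustrative and not part of the paper.

```python
from fractions import Fraction
from math import comb

# Favourable hands: choose 3 cards from one suit of 13, for each of 4 suits.
same_suit_hands = 4 * comb(13, 3)
# All possible 3-card hands from the 52-card deck.
all_hands = comb(52, 3)
p_same_suit = Fraction(same_suit_hands, all_hands)

# Cross-check by conditioning: the first card is unrestricted,
# and the second and third must each match its suit.
p_sequential = Fraction(12, 51) * Fraction(11, 50)

print(p_same_suit, float(p_same_suit))
```

Both routes give the same fraction, 22/425, which is roughly 0.052.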
(e) The random variable $X$ takes the values $-2$, $-1$ and $3$ according to the following probability distribution:

x         -2    -1    3
p_X(x)    3k    2k    3k

i. Explain why $k = 0.125$ and write down the probability distribution of $X$.
ii. Find $\mathrm{E}(X)$, the expected value of $X$.
iii. Find $\mathrm{Var}(X)$, the variance of $X$.
(6 marks)
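As a hedged illustration of the calculations part (e) asks for, the sketch below uses exact fractions. It assumes only that the probabilities must sum to 1 and uses the identity Var(X) = E(X^2) - [E(X)]^2; the names `values` and `weights` are illustrative.

```python
from fractions import Fraction

values = [-2, -1, 3]
weights = [3, 2, 3]  # multiples of k taken from the table

# The probabilities 3k + 2k + 3k must sum to 1, which fixes k.
k = Fraction(1, sum(weights))
probs = [w * k for w in weights]

# E(X) and E(X^2) as probability-weighted sums.
mean = sum(x * p for x, p in zip(values, probs))
second_moment = sum(x**2 * p for x, p in zip(values, probs))

# Var(X) = E(X^2) - [E(X)]^2
variance = second_moment - mean**2

print(k, mean, variance)
```

Under these assumptions k = 1/8, E(X) = 1/8 and Var(X) = 327/64, about 5.11.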
(f) Briefly explain two advantages of longitudinal surveys.
(4 marks)
(g) Seven students in a class received the following examination and project marks in a subject:

Examination mark    50  80  70  40  30  75  95
Project mark        75  60  55  40  50  80  65

You want to know if students who excel in examinations in the subject also had relatively high project marks.
i. Calculate the Spearman rank correlation.
ii. Based on your answer to part i., do you think students who score well in examinations are also likely to have the highest project marks? Briefly justify your answer.
(7 marks)
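The Spearman calculation in part (g) can be sketched as follows. This is a minimal illustration that assumes the no-ties formula r_s = 1 - 6 * sum(d_i^2) / (n(n^2 - 1)), which applies here because neither set of marks contains repeated values; the helper `ranks` is hypothetical.

```python
exam = [50, 80, 70, 40, 30, 75, 95]
project = [75, 60, 55, 40, 50, 80, 65]

def ranks(marks):
    """Rank from 1 (lowest) to n (highest); assumes no tied values."""
    order = sorted(marks)
    return [order.index(m) + 1 for m in marks]

# Squared differences between each student's two ranks.
d_squared = sum((a - b) ** 2 for a, b in zip(ranks(exam), ranks(project)))
n = len(exam)

# Spearman rank correlation, no-ties formula.
r_s = 1 - 6 * d_squared / (n * (n**2 - 1))

print(d_squared, round(r_s, 4))
```

With these data the squared rank differences sum to 24, giving r_s = 4/7, about 0.57. Ranking from highest to lowest instead would leave the result unchanged, provided both variables are ranked in the same direction.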
(h) In a simple linear regression model of the form $y = \alpha + \beta x + \varepsilon$, where the dependent variable is income (in dollars) and the independent variable is age (in years), the value of $\alpha$ was estimated to be $-203.56$.
i. Interpret this estimate of $\alpha$.
ii. Explain why such an estimate could occur if you are told that age and income are highly correlated in the sample data used to run the regression.
(6 marks)
SECTION B
Answer two out of the three questions from this section (25 marks each).
2. (a) i. The following data reflect monthly salaries of a group of people (income), measured in thousand pounds. Carefully construct a boxplot on the graph paper provided to display these data.

3  2  4  8  7  19  2  5  3  4  10  12
(8 marks)
ii. Describe the distribution of the data based on the boxplot you have drawn.
(2 marks)
iii. Name two other types of graphical displays that would be suitable to represent the data and their distribution. Provide a justification for your answers.
(3 marks)
(b) A new training programme is designed to improve the performance of 100-metre runners. A random sample of nine 100-metre runners were trained according to this programme and, in order to assess its effectiveness, they participated in a run before and after completing this training programme. The times (in seconds) for each runner were recorded and are shown below. The aim is to determine whether this training programme is effective in reducing the average times of the runners.

Before training    12.5   9.6  10.0  11.3   9.9  11.3  10.5  10.6  12.0
After training     12.3  10.0   9.8  11.0   9.9  11.4  10.8  10.3  12.1

i. Carry out an appropriate hypothesis test at two appropriate significance levels to determine whether this training programme is effective at reducing the average times of the runners. State the test hypotheses, and specify your test statistic and its distribution under the null hypothesis. Comment on your findings.
(6 marks)
ii. State any assumptions you made in part i.
(2 marks)
iii. Compute an 80% confidence interval for the difference in the means of the times.
(2 marks)
iv. On the basis of the data alone, would you recommend this training programme to a runner? Explain why or why not.
(2 marks)
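One way to approach the arithmetic in question 2(b) is a paired-samples t calculation on the before-minus-after differences. The sketch below is illustrative only. It assumes the differences are approximately normally distributed, tests H0: mean difference = 0 against H1: mean difference > 0, and takes the critical value 1.397 (upper 10% point of the t distribution with 8 degrees of freedom) from standard tables rather than computing it.

```python
from math import sqrt
from statistics import mean, stdev

before = [12.5, 9.6, 10.0, 11.3, 9.9, 11.3, 10.5, 10.6, 12.0]
after = [12.3, 10.0, 9.8, 11.0, 9.9, 11.4, 10.8, 10.3, 12.1]

# Paired differences: positive values mean the runner got faster.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

d_bar = mean(diffs)            # sample mean difference
se = stdev(diffs) / sqrt(n)    # standard error (stdev uses n - 1)
t_stat = d_bar / se            # compare with t on n - 1 = 8 degrees of freedom

# Assumed table value: t_{0.10, 8} = 1.397, for an 80% two-sided interval.
T_CRIT_80 = 1.397
ci_80 = (d_bar - T_CRIT_80 * se, d_bar + T_CRIT_80 * se)

print(round(d_bar, 4), round(t_stat, 3), ci_80)
```

Under these assumptions the test statistic is about 0.13, well below the usual one-sided critical values, and the 80% interval contains zero.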