Sheet10 Solution

1. The document describes two statistical tests performed on data from experiments. The first test compares the diameters of screw-nuts from two machines and fails to reject the hypothesis that machine 1's mean is greater than or equal to machine 2's. The second test compares the improvement from two javelin training methods and finds that method B is better than method A. 2. A study tested the effects of alcohol versus water on 10 subjects' motor skills. A paired t-test showed alcohol had a significant negative effect compared to water.
Course of Study Exercises Statistics


Bachelor Computer Science WS 2021/22
Sheet X - Solutions

Two Samples Tests


1. Two machines produce screw-nuts. The diameters of the screw-nuts
from machine 1 and machine 2 are normally distributed with standard
deviations σ1 = 0.5 and σ2 = 0.6 and unknown means µ1 and µ2. Two
samples are drawn:
M1: 5.46, 5.34, 4.34, 4.82, 4.4, 5.12, 5.69, 5.53, 4.77, 5.82
M2: 5.45, 5.31, 4.11, 4.69, 4.18, 5.05, 5.72, 5.54, 4.62, 5.89, 5.6, 5.19,
3.31, 4.43, 5.3, 4.09
Test the hypothesis H0: µ1 ≥ µ2 at level α = 0.05.
Answer: 2 sample Gauss test: no rejection since p-value = 0.848
##########################################################################
# Exercise: 2 sample Gauss test
#
# file: infstat2samplestestingscrewnuts.R
##########################################################################

# Aufgabensammlung Lehn 121

# Two machines produce screw-nuts. The diameters of the screw-nuts
# from machine 1 and machine 2 are normally distributed with standard
# deviations sigma1 = 0.5 and sigma2 = 0.6 and unknown means.

sx <- 0.5; sy <- 0.6
x <- c(5.46, 5.34, 4.34, 4.82, 4.40, 5.12, 5.69, 5.53, 4.77, 5.82)
y <- c(5.45, 5.31, 4.11, 4.69, 4.18, 5.05, 5.72, 5.54, 4.62, 5.89, 5.60,
       5.19, 3.31, 4.43, 5.30, 4.09)
nx <- length(x); ny <- length(y)
# standardized difference of the sample means (variances known)
test.stat <- (mean(x) - mean(y)) / sqrt(sx^2 / nx + sy^2 / ny)
alpha <- 0.05

# one sided: reject H0: mx >= my, if test.stat < quantile

quantile <- qnorm(alpha, 0, 1)
pvalue <- pnorm(test.stat, 0, 1)
test.stat; quantile; pvalue # 1.027782; -1.644854; 0.8479739
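
The calculation for exercise 1 can also be wrapped in a small helper. The following is only a sketch: the function name gauss.test.2s and its return fields are assumptions made for illustration and are not part of the course material. It assumes known standard deviations and returns the standardized difference of the sample means together with both one-sided p-values, so that the tail belonging to the stated H0 can be read off.

```r
# Hypothetical helper (name and interface are illustrative assumptions):
# two-sample Gauss test for known standard deviations sx and sy.
gauss.test.2s <- function(x, y, sx, sy) {
  se <- sqrt(sx^2 / length(x) + sy^2 / length(y))  # standard error of mean(x) - mean(y)
  z  <- (mean(x) - mean(y)) / se                   # N(0,1)-distributed if mu.x = mu.y
  list(z = z,
       p.less    = pnorm(z),       # p-value for H1: mu.x < mu.y
       p.greater = 1 - pnorm(z))   # p-value for H1: mu.x > mu.y
}

x <- c(5.46, 5.34, 4.34, 4.82, 4.40, 5.12, 5.69, 5.53, 4.77, 5.82)
y <- c(5.45, 5.31, 4.11, 4.69, 4.18, 5.05, 5.72, 5.54, 4.62, 5.89, 5.60,
       5.19, 3.31, 4.43, 5.30, 4.09)
res <- gauss.test.2s(x, y, sx = 0.5, sy = 0.6)
res$z; res$p.less; res$p.greater
```

For H0: µ1 ≥ µ2 the relevant value is p.less, because only a clearly negative difference of the sample means speaks against H0.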

Prof. Dr. Falkenberg, Faculty 2 Statistics WS 2021/22

2. To test two training methods A and B for javelin throw, 60 untrained
physical education students were randomly divided into two groups of
m=25 and n=35 students, respectively. Before the start of the training
phase, a performance test was conducted and for each student
the distance of the best of two throws was noted. After completion of
the training phase, during which the students in group 1 were trained
according to method A and the students in group 2 according to method B,
a corresponding performance test was performed. The
following results were obtained for the differences between the values
obtained in the second and the first performance test:

• Group 1:
7.06, 11.84, 9.28, 7.92, 13.5, 3.98, 3.82, 7.34, 8.7, 9.24, 4.86, 3.32,
12.78, 12, 5.24, 11.4, 6.56, 9.04, 7.72, 9.26, 7.88, 8.6, 9.3, 8.42, 8.54
• Group 2:
8.68, 6, 6.3, 10.24, 10.88, 5.36, 7.82, 4.7, 9.02, 9.78, 6.9, 5.8, 13.56,
10.32, 13.3, 11.38, 7.94, 10.74, 13.68, 14.92, 7.42, 10.36, 10.54,
5.22, 13.74, 12.98, 10.34, 10.02, 17.8, 13.04, 5.2, 9.4, 11.18, 12.68,
12.36

Which hypothesis do you have to test if you want to show that
method B is better than method A? Perform a corresponding test at
the level α = 0.05. Assume that the values obtained are realizations
of independent random variables that are N(µi, σi²)-distributed in
group i (i = 1, 2).
Answer: unpaired 2 sample t-test or Welch test: reject H0
##########################################################################
# Exercise: 2 sample t-test, Welch test
#
# file: infstat2samplestestingjavelinthrow.R
##########################################################################

# 2 independent random variables with unknown but equal or unequal variances

# Aufgabensammlung Lehn: Nr. 128

# To test two training methods A and B for javelin throw, 60
# untrained physical education students were randomly divided into
# two groups of m=25 and n=35 students, respectively. Before the
# start of the training phase, a performance test was conducted
# and for each student the distance of the best of two throws was
# noted. After completion of the training phase, during which the
# students in group 1 were trained according to method A and the
# students in group 2 according to method B, a corresponding
# performance test was performed. The following results were obtained
# for the differences between the values obtained in the second and
# the first performance test:
x <- c(7.06, 11.84, 9.28, 7.92, 13.5, 3.98, 3.82, 7.34, 8.7, 9.24, 4.86, 3.32,
       12.78, 12, 5.24, 11.4, 6.56, 9.04, 7.72, 9.26, 7.88, 8.6, 9.3, 8.42, 8.54)
y <- c(8.68, 6, 6.3, 10.24, 10.88, 5.36, 7.82, 4.7, 9.02, 9.78, 6.9,
       5.8, 13.56, 10.32, 13.3, 11.38, 7.94, 10.74, 13.68, 14.92, 7.42, 10.36,
       10.54, 5.22, 13.74, 12.98, 10.34, 10.02, 17.8, 13.04, 5.2, 9.4, 11.18,
       12.68, 12.36)
# Which hypothesis do you have to test if you want to show that
# method B is better than method A? Perform a corresponding test at
# the level alpha = 0.05.

# H0: mu.x >= mu.y, H1: mu.x < mu.y

##########################################################################
# case: equal variances
alpha <- 0.05
t.test(x, y, alternative = "less", mu = 0, paired = FALSE, var.equal = TRUE,
       conf.level = 1 - alpha)
# reject H0, since p-value = 0.0181

##########################################################################
# case: unequal variances (Welch test)
alpha <- 0.05
t.test(x, y, alternative = "less", mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 1 - alpha)
# reject H0, since p-value = 0.01596

3. A (hypothetical) experiment is conducted on the effect of alcohol on
perceptual motor ability. Ten subjects are each tested twice, once after
having two drinks and once after having two glasses of water. The two
tests were on two different days to give the alcohol a chance to wear
off. Half of the subjects were given alcohol first and half were given
water first. The scores of the 10 subjects are shown below. The first
number for each subject is their performance in the "water" condition.
Higher scores reflect better performance. Test to see if alcohol had a
significant effect. Report the t and p values.

water alcohol
16 13
15 13
11 10
20 18
19 17
14 11
13 10
15 15
14 11
16 16

Answer: two paired sample t-test: reject H0


##########################################################################
# Exercise: 2 paired sample t-test
#
# file: infstat2samplestestingeffectalc.R
##########################################################################

# 2 dependent random variables with unknown but equal variances
# A (hypothetical) experiment is conducted on the effect of
# alcohol on perceptual motor ability. Ten subjects are each tested
# twice, once after having two drinks and once after having two
# glasses of water. The two tests were on two different days to give
# the alcohol a chance to wear off. Half of the subjects were given
# alcohol first and half were given water first. The scores of the 10
# subjects are shown below. The first number for each subject is their
# performance in the "water" condition. Higher scores reflect better
# performance. Test to see if alcohol had a significant effect. Report
# the t and p values.

water <- c(16, 15, 11, 20, 19, 14, 13, 15, 14, 16)
alc <- c(13, 13, 10, 18, 17, 11, 10, 15, 11, 16)

# H0: mu = 0, H1: mu <> 0

alpha <- 0.05
t.test(water, alc, alternative = "two.sided", paired = TRUE, conf.level = 1 - alpha)
# reject H0, since p-value = 0.0007205
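
The exercise asks for the t value as well as the p value. A paired t-test is a one-sample t-test on the within-subject differences, so both can be computed directly. The following is a minimal sketch; the variable names d, t.stat and p.val are chosen here for illustration.

```r
water <- c(16, 15, 11, 20, 19, 14, 13, 15, 14, 16)
alc   <- c(13, 13, 10, 18, 17, 11, 10, 15, 11, 16)

d <- water - alc                              # paired differences per subject
n <- length(d)
t.stat <- mean(d) / (sd(d) / sqrt(n))         # one-sample t statistic for H0: mean(d) = 0
p.val  <- 2 * pt(-abs(t.stat), df = n - 1)    # two-sided p-value, n - 1 degrees of freedom
t.stat; p.val

# cross-check against the built-in paired test
res <- t.test(water, alc, paired = TRUE)
all.equal(unname(res$statistic), t.stat)
```

The manual statistic agrees with the value reported by t.test, which makes the connection between the two formulations explicit.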

4. A company manager is considering whether to purchase a new type B
scale on the market. A new acquisition should be made only if the
type B scale is better than the type A scale used up to now. For
the evaluation of the quality of a scale, the scatter of weighing results
should be used. In weight measurements of one and the same weight,
the following measured values were obtained for
the individual scales:

No.   A=X     B=Y
1     102.4   98.4
2     101.3   101.7
3     97.6    100.5
4     98.2    99.3
5     102.3   100.6
6     99.1    99.6
7     97.8    102.2
8     103.9   101.1
9     101.6   99.9
10    100.1   101

Under suitable normal distribution assumptions, check with a test at
level α = 0.05 whether a new acquisition makes sense.
Answer: F-test: rejection of H0 : σA ≤ σB
##########################################################################
# Exercise: F-test
#
# file: infstat2samplestestingscale.R
##########################################################################
# Aufgabensammlung Lehn, Wegmann 132

# A company manager is considering whether to purchase a new type B
# scale on the market. A new acquisition should be made only if the
# type B scale is better than the type A scale used up to now. For
# the evaluation of the quality of a scale, the scatter of weighing
# results should be used. In weight measurements of one and the same
# weight, the following measured values were obtained for the
# individual scales:
x <- c(102.4, 101.3, 97.6, 98.2, 102.3, 99.1, 97.8, 103.9, 101.6, 100.1)
y <- c(98.4, 101.7, 100.5, 99.3, 100.6, 99.6, 102.2, 101.1, 99.9, 101.0)
# Under suitable normal distribution assumptions, check with a test
# at level alpha = 0.05 whether a new acquisition makes sense.

# F-test: H0: sigma.x <= sigma.y
var.test(x, y, alternative = "greater", conf.level = 0.95)
# reject H0, since p-value = 0.03404
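
To see what var.test evaluates here, the F statistic can be formed by hand as the ratio of the two sample variances. This is a sketch for illustration only; the names F.stat, crit and p.val are not from the original script.

```r
x <- c(102.4, 101.3, 97.6, 98.2, 102.3, 99.1, 97.8, 103.9, 101.6, 100.1)
y <- c(98.4, 101.7, 100.5, 99.3, 100.6, 99.6, 102.2, 101.1, 99.9, 101.0)

F.stat <- var(x) / var(y)                  # ratio of the sample variances
df1 <- length(x) - 1                       # numerator degrees of freedom
df2 <- length(y) - 1                       # denominator degrees of freedom
crit  <- qf(0.95, df1, df2)                # critical value for alpha = 0.05
p.val <- 1 - pf(F.stat, df1, df2)          # upper tail, H1: sigma.x > sigma.y
F.stat; crit; p.val
```

H0 is rejected when F.stat exceeds crit, which is equivalent to p.val being below α.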

5. On one farm, 10 cattle (group 1) were fed concentrates of
composition A, and the remaining 10 cattle (group 2) were fed the
conventional feed of composition B. After a certain time, the
weight gain was noted in both groups:

No.   Group 1   Group 2
1     7.2       5.3
2     4.1       4.4
3     5.5       5
4     4.5       3.5
5     5.7       3.9
6     3.8       4.9
7     4.6       5.6
8     6         2.5
9     5.2       4
10    5.4       3.6

(a) Assuming that weight gain can be described by independent random
variables that are identically normally distributed in both
cases, use an appropriate test at level α = 0.1 to test whether
to reject the assumption that weight gain from administration of
composition A concentrate has the same dispersion as weight gain
from administration of the conventional composition B diet.
(b) Assuming that weight gain can be described by independent random
variables with equal variance, identically normally distributed
in each of the two cases, use an appropriate test at the level
α = 0.025 to test whether the hypothesis that weight gain with
administration of concentrate of composition A is not greater than
weight gain with administration of the conventional diet of
composition B is correct.

Answer: a) F-test: no rejection of H0 : σ1 = σ2 , b) unpaired two
sample t-test: rejection of H0 : µ1 ≤ µ2
##########################################################################
# Exercise: 2 unpaired sample t-test and F-test
#
# file: infstat2samplestestingcattlefeed.R
##########################################################################

# Aufgabensammlung Lehn, Wegmann Aufgabe 130

# On one farm, 10 cattle (group 1) were fed concentrates of
# composition A, and the remaining 10 cattle (group 2) were fed the
# conventional feed of composition B. After a certain time, the
# weight gain was noted in both groups:

x <- c(7.2, 4.1, 5.5, 4.5, 5.7, 3.8, 4.6, 6.0, 5.2, 5.4)
y <- c(5.3, 4.4, 5.0, 3.5, 3.9, 4.9, 5.6, 2.5, 4.0, 3.6)

# Assuming that weight gain can be described by independent
# random variables that are identically normally distributed in
# both cases, use an appropriate test at level alpha = 0.1 to
# test whether to reject the assumption that weight gain from
# administration of composition A concentrate has the same
# dispersion as weight gain from administration of the
# conventional composition B diet.

mean(x); var(x)
mean(y); var(y)
alpha <- 0.1
var.test(x, y, alternative = "two.sided", conf.level = 1 - alpha)
# do not reject H0 that the variances are equal

# Assuming that weight gain can be described by independent
# random variables with equal variance, identically normally
# distributed in each of the two cases, use an appropriate test at
# the level alpha = 0.025 to test whether the hypothesis that
# weight gain with administration of concentrate of composition A
# is not greater than weight gain with administration of the
# conventional diet of composition B is correct.

# H0: mu.x <= mu.y, H1: mu.x > mu.y
alpha <- 0.025
t.test(x, y, alternative = "greater", paired = FALSE, var.equal = TRUE,
       conf.level = 1 - alpha)
# reject H0: mu.x <= mu.y, since the p-value is below alpha
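
Part (b) tests H0: µ1 ≤ µ2 against H1: µ1 > µ2, so large values of the pooled t statistic speak against H0. The following sketch computes the statistic by hand and compares it with t.test using alternative = "greater"; the names sp2, t.stat and p.val are illustrative assumptions.

```r
x <- c(7.2, 4.1, 5.5, 4.5, 5.7, 3.8, 4.6, 6.0, 5.2, 5.4)
y <- c(5.3, 4.4, 5.0, 3.5, 3.9, 4.9, 5.6, 2.5, 4.0, 3.6)
nx <- length(x); ny <- length(y)

# pooled variance estimate under the equal-variance assumption
sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
t.stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
p.val  <- 1 - pt(t.stat, df = nx + ny - 2)   # upper tail for H1: mu.x > mu.y
t.stat; p.val

# cross-check against the built-in test
res <- t.test(x, y, alternative = "greater", var.equal = TRUE)
all.equal(res$p.value, p.val)
```

Since p.val lies below α = 0.025, the decision matches the stated answer (rejection of H0 : µ1 ≤ µ2).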

6. Heumann, Schomaker Exercise 10.5

A company producing clothing often finds deficient T-shirts among its
production.

(a) The company's controlling department decides that the production
is no longer profitable when there are more than 10% deficient
T-shirts. A sample of 230 shirts yields 32 shirts which contain
deficiencies. Use the approximate binomial test and the exact
binomial test to decide whether the shirt production is profitable or
not (α = 0.05).
(b) The company is offered a new cutting machine. To test whether
the change of machine helps to improve the production quality, 115
sample T-shirts are evaluated, 7 of which have deficiencies. Use
the 2 sample binomial test to decide whether the new machine
yields improvement or not (α = 0.05).

Answer: one sample binomial test and its approximation;
two sample tests on p: Fisher's exact test and its approximation
##########################################################################
# Exercise: exact Fisher test and normal approximation
#
# file: infstat2samplestestingtshirts.R
##########################################################################
# Heumann, Schomaker Exercise 10.5

# A company producing clothing often finds deficient T-shirts among
# its production.
# The company's controlling department decides that the production is no longer
# profitable when there are more than 10% deficient T-shirts. A sample of 230
# shirts yields 32 shirts which contain deficiencies. Use the approximate
# binomial test and the exact binomial test to decide whether the shirt
# production is profitable or not (alpha = 0.05).
# Note: the calculation below uses def = 31; all reported results refer to
# this value.
n <- 230; def <- 31

# H0: p <= 0.1, H1: p > 0.1
p0 <- 0.1; r <- def / n
# one sample test for p: normal approximation
# test statistic
t.x <- (r - p0) / sqrt(p0 * (1 - p0) / n)
# decision: reject H0, if t.x > (1 - alpha) quantile of the N(0,1) distribution
alpha <- 0.05
t.x > qnorm(1 - alpha)
# rejection of H0

# one sample test for p: exact binomial test
binom.test(x = def, n = n, p = p0,
           alternative = "greater", conf.level = 1 - alpha)
qbinom(1 - alpha, size = n, prob = p0) # 95% quantile of B(n, p0) is 31
# no rejection of H0, since p-value = 0.05414
# Since the exact binomial test is more precise than the approximate binomial
# test, we follow the result of the exact binomial test.

# The company is offered a new cutting machine. To test whether the change of
# machine helps to improve the production quality, 115 sample T-shirts are
# evaluated, 7 of which have deficiencies. Use the 2 sample binomial test to
# decide whether the new machine yields improvement or not (alpha = 0.05).
# Note: the calculation below uses n2 = 116; all reported results refer to
# this value.

n1 <- 230; def1 <- 31
n2 <- 116; def2 <- 7
phat <- (def1 + def2) / (n1 + n2)

# H0: p2 >= p1, H1: p2 < p1

# two sample test for p: normal approximation
# test statistic
T <- (def2 / n2 - def1 / n1) / sqrt(phat * (1 - phat) * (1 / n1 + 1 / n2))
# decision: reject H0, if T < alpha quantile of N(0,1)
T < qnorm(alpha) # rejection of H0

# two sample test for p: Fisher's exact test
# contingency table:   machine2   machine1
#   def                    7         31
#   ok                   109        199
cont.tab <- matrix(c(def2, def1, n2 - def2, n1 - def1),
                   nrow = 2, ncol = 2, byrow = TRUE)
fisher.test(cont.tab, alternative = "less", conf.level = 1 - alpha)
# rejection of H0, since p-value = 0.02465

7. In 380 randomly selected families with four children each, it is
investigated how many of the children are girls. The result is the
following:

Number of girls   Families
0                 25
1                 95
2                 150
3                 80
4                 30

Does this finding correspond to the hypothesis that the variable "number
of girls in families with four children each" follows a Binomial
distribution with n = 4 and p = 0.5? Test this hypothesis at a significance
level of 0.1!
Answer: χ2 goodness of fit test: do not reject H0
##########################################################################
# Exercise: chi^2 goodness of fit test: discrete distribution
#
# file: infstat2samplestestinggirlsfamily.R
##########################################################################

# In 380 randomly selected families with four children each
# it is investigated how many of the children are girls. The result is
# the following:
girls <- 0:4
fam <- c(25, 95, 150, 80, 30)
# Does this finding correspond to the hypothesis that the variable
# "number of girls in families with four children each" follows a
# Binomial distribution with n = 4 and p = 0.5? Test this
# hypothesis at a significance level of 0.1!

# H0: fam ~ B(n = 4, p = 0.5)
alpha <- 0.1
chisq.test(fam, p = dbinom(0:4, size = 4, prob = 0.5))
# do not reject H0, since p-value = 0.3457
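
The statistic behind chisq.test can be reproduced step by step: expected counts under B(4, 0.5), the sum of the standardized squared deviations, and the upper tail of the χ2 distribution. This is a sketch with illustrative variable names (p0, expected, chi2, p.val).

```r
fam <- c(25, 95, 150, 80, 30)                 # observed families with 0, ..., 4 girls
p0  <- dbinom(0:4, size = 4, prob = 0.5)      # 1/16, 4/16, 6/16, 4/16, 1/16
expected <- sum(fam) * p0                     # expected counts under H0

chi2  <- sum((fam - expected)^2 / expected)   # Pearson chi^2 statistic
df    <- length(fam) - 1                      # no parameter is estimated from the data
p.val <- 1 - pchisq(chi2, df)
expected; chi2; p.val
```

Because p is fixed at 0.5 rather than estimated, the degrees of freedom are the number of classes minus one.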

8. The hypothesis is to be tested that the height of adult German
men is normally distributed (significance level 10%). For this purpose,
a random sample is collected, which leads to the following findings:

class frequency
150 to 155 20
155 to 160 30
160 to 165 55
165 to 170 60
170 to 175 85
175 to 180 80
180 to 185 50
185 to 190 40
190 to 195 30
195 to 200 15
200 to 205 10

What is the test decision?


Answer: χ2 goodness of fit test: rejection
##########################################################################
# Exercise: chi^2 goodness of fit test: normal distribution
#
# file: infstat2samplestestingheightmean.R
##########################################################################
library(tidyverse)
# The hypothesis is to be tested that the height of adult German men
# is normally distributed (significance level 10%). For this purpose,
# a random sample is collected, which leads to the following findings:
results <- tibble(
  from = seq(from = 150, to = 200, by = 5),
  to = seq(from = 155, to = 205, by = 5),
  no = c(20, 30, 55, 60, 85, 80, 50, 40, 30, 15, 10),
  mid = 0.5 * (from + to)
)
results

# estimation of mean and sd from the class midpoints
est.mean <- mean(rep(x = results$mid, times = results$no))
est.sd <- sd(rep(x = results$mid, times = results$no))

# expected number of observations per class
est.p <- pnorm(results$to, mean = est.mean, sd = est.sd) -
  pnorm(results$from, mean = est.mean, sd = est.sd)
est.no <- est.p * sum(results$no)

# H0: the heights are normally distributed
chi2 <- sum((results$no - est.no)^2 / est.no)
# decision: reject H0, if chi^2 > (1 - alpha) quantile of the chi^2
# distribution with k = number of classes - 2 - 1 degrees of freedom
chi2 > qchisq(1 - 0.1, df = length(results$from) - 2 - 1)
1 - pchisq(chi2, df = length(results$from) - 2 - 1)
# reject H0, since p-value = 0.0598

9. Some parents of the West Bay little leaguers think that they are
noticing a pattern. There seems to be a relationship between the number
on the kids' jerseys and their position. These parents decide to record
what they see. The hypothetical data appear below. Conduct a Chi
Square test to determine if the parents' suspicion that there is a
relationship between jersey number and position is right.

        Infield   Outfield   Pitcher   Total
0-9     12        5          5         22
10-19   5         10         2         17
20+     4         4          7         15
Total   21        19         14        54

Answer: χ2 test: association of two qualitative variables, reject at the
5% level
##########################################################################
# Exercise: chi^2 test: association of two qualitative variables
#
# file: infstat2samplestestingnumberposition.R
##########################################################################
library(tidyverse)

# Some parents of the West Bay little leaguers think that they
# are noticing a pattern. There seems to be a relationship between
# the number on the kids' jerseys and their position. These parents
# decide to record what they see. The hypothetical data appear
# below. Conduct a Chi Square test to determine if the parents'
# suspicion that there is a relationship between jersey number and
# position is right.

conttab <- matrix(c(12, 5, 5, 5, 10, 2, 4, 4, 7), nrow = 3, ncol = 3, byrow = TRUE)

conttab %>% addmargins()

# chi^2 test: association of two qualitative variables
# H0: no association
res <- chisq.test(x = conttab)
res
# contingency table (observed counts)
res$observed %>% addmargins()
# indifference table (expected counts under H0)
res$expected %>% addmargins()
# chi^2
res$statistic
# decision: reject H0, if chi^2 > (1 - alpha) quantile of the chi^2
# distribution with k = (3 - 1)(3 - 1) = 4 degrees of freedom
res$statistic > qchisq(0.95, df = 4)
# reject H0, since p-value = 0.03679
1 - pchisq(10.22573, df = 4)
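
The expected counts under independence are the products of the row and column totals divided by the sample size. The following sketch rebuilds the indifference table and the χ2 statistic in base R; the dimnames and the variable names are added here for illustration only.

```r
conttab <- matrix(c(12, 5, 5, 5, 10, 2, 4, 4, 7), nrow = 3, byrow = TRUE,
                  dimnames = list(number = c("0-9", "10-19", "20+"),
                                  position = c("Infield", "Outfield", "Pitcher")))

n <- sum(conttab)
# expected count in cell (i, j): row total i * column total j / n
expected <- outer(rowSums(conttab), colSums(conttab)) / n

chi2  <- sum((conttab - expected)^2 / expected)
df    <- (nrow(conttab) - 1) * (ncol(conttab) - 1)
p.val <- 1 - pchisq(chi2, df)
round(expected, 2); chi2; df; p.val
```

Several expected counts are below 5, so chisq.test may warn that the χ2 approximation could be inaccurate for these data.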

10. Two therapies for a specific febrile illness are to be compared. For
this purpose, 4 and 6 randomly selected patients, respectively, were
treated with therapy 1 and therapy 2, and the duration of treatment in
hours required for the patient to be free of fever was recorded.

Therapy 1   Therapy 2
X           Y
89.75       89
94.5        91
98.75       94
101.5       96.75
            99.5
            101.25

It is assumed that the given measured values are a realization of
independent random variables X1, ..., X4, Y1, ..., Y6 and that these random
variables have the continuous distribution
function F and G, respectively. Test the hypothesis H0 : F = G at the
level α = 0.05 by applying an appropriate nonparametric test.
Answer: Wilcoxon-Mann-Whitney U Test: no rejection
##########################################################################
# Exercise: Wilcoxon-Mann-Whitney U test
#
# file: infstat2samplestestingcomptherapies.R
##########################################################################
library(tidyverse)
# Two therapies for a specific febrile illness are to be compared.
# For this purpose, 4 and 6 randomly selected patients, respectively,
# were treated, and the duration of treatment in hours required for
# the patient to be free of fever was recorded.
T1 <- c(89.75, 94.5, 98.75, 101.5)
T2 <- c(89, 91, 94, 96.75, 99.5, 101.25)
n1 <- length(T1)
n2 <- length(T2)
# It is assumed that the given measured values are a realization of
# independent random variables X_1, ..., X_4, Y_1, ..., Y_6 and that
# these random variables have the continuous distribution function
# F and G, respectively. Test the hypothesis H0: F = G at the level
# alpha = 0.05 by applying an appropriate nonparametric test.

# Determining the ranks
sample <- tibble(
  grp = c(rep("T1", n1), rep("T2", n2)),
  dur = c(T1, T2),
  rang = rank(dur)
)

# Determining R.T1 and R.T2 (rank sums of the two groups)
sample %>% filter(grp == "T1") %>% summarise(sum(rang)) %>%
  as.numeric() -> R.T1 # 24
sample %>% filter(grp == "T2") %>% summarise(sum(rang)) %>%
  as.numeric() -> R.T2 # 31

# test statistic
U.T1 <- n1 * n2 + n1 * (1 + n1) * 0.5 - R.T1 # 10
U.T2 <- n1 * n2 + n2 * (1 + n2) * 0.5 - R.T2 # 14
# standardized statistic for the normal approximation, based on U.T2
t.xy <- (U.T2 - n1 * n2 * 0.5) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
t.xy

alpha <- 0.05
# Distribution of the Wilcoxon Rank Sum Statistic
qwilcox(c(alpha / 2, 1 - alpha / 2), n1, n2) # 3; 21

# decision: H0: location shift = 0
(qwilcox(alpha / 2, n1, n2) < U.T2) & (U.T2 < qwilcox(1 - alpha / 2, n1, n2)) # TRUE -> no rejection
wilcox.test(T1, T2, alternative = "two.sided", paired = FALSE,
            conf.level = 1 - alpha) # p-value = 0.7612
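
The rank sums and U statistics can also be obtained without the tidyverse. The sketch below uses base R only and relates the result to the statistic W reported by wilcox.test; the variable names r and W are chosen here for illustration.

```r
T1 <- c(89.75, 94.5, 98.75, 101.5)
T2 <- c(89, 91, 94, 96.75, 99.5, 101.25)
n1 <- length(T1); n2 <- length(T2)

r <- rank(c(T1, T2))                   # joint ranks (these data contain no ties)
R.T1 <- sum(r[seq_len(n1)])            # rank sum of therapy 1
R.T2 <- sum(r[n1 + seq_len(n2)])       # rank sum of therapy 2

U.T1 <- n1 * n2 + n1 * (n1 + 1) / 2 - R.T1
U.T2 <- n1 * n2 + n2 * (n2 + 1) / 2 - R.T2
U.T1 + U.T2 == n1 * n2                 # the two U statistics always add up to n1 * n2

# wilcox.test reports W = R.T1 - n1 * (n1 + 1) / 2, which equals U.T2 here
W <- unname(wilcox.test(T1, T2)$statistic)
c(R.T1 = R.T1, R.T2 = R.T2, U.T1 = U.T1, U.T2 = U.T2, W = W)
```

Since W coincides with U.T2, comparing U.T2 with the qwilcox quantiles is consistent with the p-value from wilcox.test.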

11. Heumann, Schoemaker Aufgabe 10.3 Christian decide to purchase


the new CD Bruce Springsteen. His first thought is to buy it online, via
an online auction . He discovers that he can also buy the CD, without
bidding at an auction, from the same online store. He also looks at the
price at an internet book staore which was recommended to him by a
friend. He notes down the follwoing prices in Euro.

• Internet book store: 16.95


• Online store, no auction:

Prof. Dr. Falkenberg, Faculty 2 Statistics WS 2021/22


11
18.19, 16.98, 19.97, 16.98, 18.19, 15.99, 13.79, 15.90, 15.90, 15.90,
15.90, 19.97, 17.72
• Online store, auction:
10.50, 12.00, 9.54, 10.55, 11.99, 9.30, 10.59, 10.50, 10.01, 11.89,
11.03, 9.52, 15.49, 11.02

(a) Calculate and interpret the arithmetic mean, variance, standard
deviation and the coefficient of variation for the online store, both
for the auction and non-auction prices.
(b) Test the hypothesis that the mean price at the online store (no
auction) is not equal to 16.95 Euro (α = 0.05).
(c) Calculate a confidence interval for the mean price at the online
store (no auction) and interpret your findings in the light of the
hypothesis in b).
(d) Test the hypothesis that the mean price at the online store (auction)
is less than 16.95 Euro (α = 0.05).
(e) Test the hypothesis that the mean non-auction price is higher than
the mean auction price. Assume (i) that the variances are equal
in both samples and (ii) that the variances are unequal.
(f) Test the hypothesis that the variance of the non-auction price is
unequal to the variance of the auction price (α = 0.05).
(g) Use the Wilcoxon-Mann-Whitney U-test to compare the location
of the auction and non-auction prices.

Answer: 1 sample t-test, 2 sample t-test, Welch test, F-test,
Wilcoxon-Mann-Whitney U-test
##########################################################################
# Exercise: 1 sample t-test, 2 sample t-test, Welch test, F-test,
# Wilcoxon-Mann-Whitney U-test
#
# file: infstat2samplestestingbookstore.R
##########################################################################
library(tidyverse)

# Heumann, Schoemaker Exercise 10.3
# Christian decides to purchase the new CD by Bruce Springsteen. His
# first thought is to buy it online, via an online auction. He
# discovers that he can also buy the CD, without bidding at an
# auction, from the same online store. He also looks at the price at
# an internet book store which was recommended to him by a
# friend. He notes down the following prices in Euro.

ibs <- 16.95
os <- tibble(noa = c(18.19, 16.98, 19.97, 16.98, 18.19, 15.99, 13.79, 15.90,
                     15.90, 15.90, 15.90, 15.90, 19.97, 17.72),
             auc = c(10.50, 12.00, 9.54, 10.55, 11.99, 9.30, 10.59, 10.50,
                     10.01, 11.89, 11.03, 9.52, 15.49, 11.02)
)
# Calculate and interpret the arithmetic mean, variance,
# standard deviation and the coefficient of variation for the
# online store, both for the auction and non-auction prices.
os %>% summarise(
  mean.noa = mean(noa), mean.auc = mean(auc),
  var.noa = var(noa), var.auc = var(auc),
  sd.noa = sd(noa), sd.auc = sd(auc),
  var.coeff.noa = sd(noa) / mean(noa), var.coeff.auc = sd(auc) / mean(auc))
# It seems evident that the mean auction price is lower than the mean
# non-auction price. However, the auction prices show a higher variability
# relative to their mean (coefficient of variation) than the non-auction prices.

# Test the hypothesis that the mean price at the online store
# (no auction) is not equal to 16.95 Euro (alpha = 0.05).
alpha <- 0.05
t.test(x = os$noa, alternative = "two.sided", mu = ibs, conf.level = 1 - alpha)
# One Sample t-test: t = -0.0031129, df = 13, p-value = 0.9976
# conclusion: there is no evidence that the mean price at the online store
# differs from 16.95

# Calculate a confidence interval for the mean price at the
# online store (no auction) and interpret your findings in the
# light of the hypothesis in b).
# 95 percent confidence interval (from the t.test output above): 15.95714 17.94001
# The value 16.95 lies inside the interval, in line with the non-rejection in b).

# Test the hypothesis that the mean price at the online store
# (auction) is less than 16.95 Euro (alpha = 0.05).
t.test(x = os$auc, alternative = "less", mu = ibs, conf.level = 1 - alpha)
# One Sample t-test: t = -14.203, df = 13, p-value = 1.352e-09
# conclusion: the mean auction price is lower than the price from the
# book store

# Test the hypothesis that the mean non-auction price is
# higher than the mean auction price. Assume (i) that the
# variances are equal in both samples and (ii) that the variances
# are unequal.
# i) equal variances
t.test(x = os$noa, y = os$auc, alternative = "greater", paired = FALSE,
       var.equal = TRUE, conf.level = 1 - alpha)
# Two Sample t-test: t = 9.5775, df = 26, p-value = 2.584e-10
# conclusion: mean auction prices are lower than the mean non-auction prices
# ii) unequal variances
t.test(x = os$noa, y = os$auc, alternative = "greater", paired = FALSE,
       var.equal = FALSE, conf.level = 1 - alpha)
# Welch Two Sample t-test: t = 9.5775, df = 25.791, p-value = 2.799e-10
# conclusion: same as before

# Test the hypothesis that the variance of the non-auction price
# is unequal to the variance of the auction price
# (alpha = 0.05).
var.test(x = os$noa, y = os$auc, alternative = "two.sided", conf.level = 1 - alpha)
# F test to compare two variances: F = 1.2577, num df = 12, denom df = 13,
# p-value = 0.6856
# (reported output; num df = 12 indicates that it was obtained with 13 non-auction
# prices, so the figures differ slightly for the 14 values stored in os)
# conclusion: no evidence that the variances are different; this justifies the
# use of the 2 sample t-test with equal variances. In practice, it is best to use
# the Welch test rather than the t-test

# Use the Wilcoxon-Mann-Whitney U-test to compare the location
# of the auction and non-auction prices.
wilcox.test(x = os$noa, y = os$auc, conf.level = 1 - alpha)
# Wilcoxon rank sum test with continuity correction: W = 195, p-value = 8.644e-06
# conclusion: the locations are shifted
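
Part (c) has no code of its own in the listing above; the interval is read off the t.test output of part (b). As a minimal sketch (not part of the original solution), the test statistic and the confidence interval can also be computed by hand as mean ± t-quantile × standard error. The names noa, n, se, t.stat and ci are assumptions of this sketch.

```r
# non-auction prices as entered in the tibble `os` above
noa <- c(18.19, 16.98, 19.97, 16.98, 18.19, 15.99, 13.79, 15.90,
         15.90, 15.90, 15.90, 15.90, 19.97, 17.72)
ibs   <- 16.95   # price at the internet book store
alpha <- 0.05

n  <- length(noa)
se <- sd(noa) / sqrt(n)   # standard error of the mean

# test statistic of the one sample t-test
t.stat <- (mean(noa) - ibs) / se

# two-sided (1 - alpha) confidence interval for the mean
ci <- mean(noa) + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * se

t.stat
ci
# H0: mu = 16.95 is not rejected at level alpha exactly when ibs lies inside ci
(ci[1] <= ibs) & (ibs <= ci[2])
```

The two-sided test in (b) and the confidence interval in (c) therefore always lead to the same decision.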

12. Heumann, Schoemaker Exercise 10.6
Two friends play a computer game and each of them repeats the same
level 10 times. The scores obtained are:

            1    2    3    4    5    6    7    8    9   10
Player 1   91  101  112   99  108   88   99  105  111  104
Player 2  261   47   40   29   64    6   87   47   98  351

(a) Player 2 insists that he is a better player and suggests comparing
their performance. Use an appropriate test (α = 0.05) to test this
hypothesis.
(b) Player 1 insists that he is a better player. He proposes not to
focus on the mean and to use the Wilcoxon-Mann-Whitney U-test
for comparison (α = 0.05). What are the advantages and
disadvantages of using this test compared with a)?

Answer: Welch test, Wilcoxon-Mann-Whitney U-test


##########################################################################
# Exercise: Welch test, Wilcoxon-Mann-Whitney U-test
#
# file: infstat2samplestestingcompgame.R
##########################################################################
library(tidyverse)

# Heumann, Schoemaker Exercise 10.6
# Two friends play a computer game and each of them repeats the same
# level 10 times. The scores obtained are:

P1 <- c(91, 101, 112, 99, 108, 88, 99, 105, 111, 104)
P2 <- c(261, 47, 40, 29, 64, 6, 87, 47, 98, 351)

# Player 2 insists that he is a better player and suggests comparing
# their performance. Use an appropriate test
# (alpha = 0.05) to test this hypothesis.
# 2 sample Welch test
alpha <- 0.05
t.test(x = P1, y = P2, alternative = "less", paired = FALSE,
       var.equal = FALSE, conf.level = 1 - alpha)
# p-value = 0.4869
# conclusion: no evidence that P2 is better than P1

# Player 1 insists that he is a better player. He proposes not to
# focus on the mean and to use the Wilcoxon-Mann-Whitney
# U-test for comparison (alpha = 0.05). What are the advantages
# and disadvantages of using this test compared with a)?
wilcox.test(x = P1, y = P2, alternative = "greater", conf.level = 1 - alpha)
# p-value = 0.01875
# conclusion: evidence that P1 is better than P2
# The U-test has the advantage of not being focused on the mean. The two
# samples are clearly different: P2 scores with much more variability and
# his distribution is neither symmetric nor normal. Since the
# sample is small and the assumption of a normal distribution is likely
# not met, it makes no sense to use a t-test. Moreover, because the
# distribution is skewed, the mean may not be a sensible measure of
# comparison. A drawback of the U-test is that it uses only the ranks and
# not the raw data: it thus uses less information than the t-test,
# which would be preferred when comparing means of a reasonably sized sample.
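
Why the two tests disagree can be seen directly from the data. The sketch below (not part of the original solution; the names gt, eq, U and W are assumptions) compares means and medians and obtains the U statistic by counting the pairs in which Player 1 scores higher than Player 2.

```r
P1 <- c(91, 101, 112, 99, 108, 88, 99, 105, 111, 104)
P2 <- c(261, 47, 40, 29, 64, 6, 87, 47, 98, 351)

# the means are close, the medians are not
c(mean.P1 = mean(P1), mean.P2 = mean(P2))
c(median.P1 = median(P1), median.P2 = median(P2))

# U statistic by counting: pairs in which P1 scores higher than P2
# (ties between the two samples would count one half)
gt <- outer(P1, P2, ">")
eq <- outer(P1, P2, "==")
U  <- sum(gt) + 0.5 * sum(eq)
U

# the same value is reported as W by wilcox.test; the warning about ties
# within the samples (99 and 47 each occur twice) is suppressed here
W <- unname(suppressWarnings(
  wilcox.test(x = P1, y = P2, alternative = "greater")$statistic))
U == W
```

The two very high scores of Player 2 (261 and 351) pull his mean up to the level of Player 1, while they enter the U statistic only through their ranks.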

13. Heumann, Schoemaker Exercise 10.8
The number of passengers rescued from the Titanic, depending on the
travel class, is given in the following table:

             1st class  2nd class  3rd class  Staff  Total
Rescued            202        125        180    211    718
Not rescued        135        160        541    674   1510

Check with an appropriate test whether the "rescue status" and the
"travel class" are independent and whether the conditional probabilities
of "rescue status" given "travel class" differ by "travel class".
Answer: χ² homogeneity and independence test
##########################################################################
# Exercise: chi^2 homogeneity and independence test
#
# file: infstat2samplestestingtitanic.R
##########################################################################
library(tidyverse)

# The number of passengers rescued from the Titanic, depending on the travel
# class, is given in the following table
cont.tab <- matrix(c(202, 125, 180, 211, 135, 160, 541, 674),
                   nrow = 2, ncol = 4, byrow = TRUE)
cont.tab %>% addmargins

# Check with an appropriate test whether the "rescue status" and the
# "travel class" are independent and whether the conditional
# probabilities of "rescue status" given "travel class" differ
# by "travel class".

# chi^2 independence and homogeneity test
res <- chisq.test(cont.tab)
# contingency table
res$observed
# indifference table (expected frequencies under independence)
res$expected
# chi^2
res$statistic
# p-value
res$p.value

# The chi^2 independence test and the chi^2 homogeneity test are technically
# identical. Only the null hypotheses are formulated differently:
# chi^2 independence test: H0: "rescue status" and "travel class" are
# independent
# chi^2 homogeneity test: H0: the proportion of passengers rescued is identical
# for the different travel classes
# conclusion: since the p-value is rather low, both null hypotheses can be
# rejected
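
To make the "indifference table" and the test statistic concrete, the following sketch recomputes them by hand and compares the result with chisq.test. It is an illustration, not part of the original solution: the dimnames and the names n, expected, chi2, df and p are assumptions of this sketch.

```r
# observed counts from the table above; the dimnames are added for readability
cont.tab <- matrix(c(202, 125, 180, 211, 135, 160, 541, 674),
                   nrow = 2, ncol = 4, byrow = TRUE,
                   dimnames = list(status = c("rescued", "not rescued"),
                                   class  = c("1st", "2nd", "3rd", "staff")))

n <- sum(cont.tab)
# expected counts under independence: row total * column total / n
expected <- outer(rowSums(cont.tab), colSums(cont.tab)) / n

# chi^2 statistic, degrees of freedom and p-value
chi2 <- sum((cont.tab - expected)^2 / expected)
df   <- (nrow(cont.tab) - 1) * (ncol(cont.tab) - 1)
p    <- pchisq(chi2, df = df, lower.tail = FALSE)

# homogeneity view: proportions rescued / not rescued within each travel class
prop.table(cont.tab, margin = 2)

# agreement with chisq.test (no continuity correction for tables larger than 2 x 2)
res <- chisq.test(cont.tab)
all.equal(chi2, unname(res$statistic))
all.equal(expected, res$expected, check.attributes = FALSE)
```

The column-wise proportions from prop.table correspond to the homogeneity formulation of the null hypothesis, while the expected counts correspond to the independence formulation; both lead to the same statistic.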
