Happ 2018
DOI: 10.1002/sim.7983

RESEARCH ARTICLE

1 Department of Mathematics, University of Salzburg, Salzburg, Austria
2 Department of Statistics, University of Kentucky, Lexington, Kentucky
3 Department of Medical Statistics, University of Göttingen, Göttingen, Germany

Correspondence
Arne C. Bathke, Department of Mathematics, University of Salzburg, 5020 Salzburg, Austria.
Email: [email protected]

Present Address
Arne C. Bathke, University of Salzburg, Hellbrunnerstrasse 34, 5020 Salzburg, Austria.

Funding information
Austrian Science Fund, Grant/Award Number: I 2697-N31

Abstract
There are many different proposed procedures for sample size planning for the Wilcoxon-Mann-Whitney test at given type-I and type-II error rates α and β, respectively. Most methods assume very specific models or types of data to simplify calculations (eg, ordered categorical or metric data, location shift alternatives, etc). We present a unified approach that covers metric data with and without ties, count data, ordered categorical data, and even dichotomous data. For that, we calculate the unknown theoretical quantities, such as the variances under the null and relevant alternative hypothesis, by considering the following "synthetic data" approach: we evaluate data whose empirical distribution functions match the theoretical distribution functions involved in the computations of the unknown theoretical quantities. Then, well-known relations for the ranks of the data are used for the calculations.
In addition to computing the necessary sample size N for a fixed allocation proportion t = n1/N, where n1 is the sample size in the first group and N = n1 + n2 is the total sample size, we provide an interval for the optimal allocation rate t, which minimizes the total sample size N. It turns out that, for certain distributions, a balanced design is optimal. We give a characterization of such distributions. Furthermore, we show that the optimal choice of t depends on the ratio of the two variances, which determine the variance of the Wilcoxon-Mann-Whitney statistic under the alternative. This is different from an optimal sample size allocation in case of the normal distribution model.

KEYWORDS
nonparametric relative effect, nonparametric statistics, optimal design, rank-based inference, sample size planning, Wilcoxon-Mann-Whitney test
1 INTRODUCTION
The comparison of two independent samples is widespread in medicine, the life sciences in general, and other fields of
research. Arguably, the most popular method is the unpaired t-test for two sample comparisons. However, its application
is limited. For heavy-tailed or very skewed distributions, use of the t-test is not recommended, especially for small sample
sizes. For ordered categorical data, comparing averages by means of t-tests is not appropriate at all. For those situations,
a nonparametric test such as the Wilcoxon-Mann-Whitney (WMW) test is much preferred.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
© 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
In order to plan a study for this type of two-sample comparison, we need to know how many subjects are needed to detect a prespecified effect with probability at least 1 − β, where β denotes the type-II error probability. If the underlying
distributions are normal, a prespecified effect might be formulated as a difference of means. Within a general nonpara-
metric framework, the relative effect (see Section 2) is very often used. However, for a statistics practitioner, it is sometimes
difficult to state a relevant effect size to be detected in terms of the nonparametric relative effect. Therefore, we will be
using a slightly different approach. Based on prior information F1 regarding one group, eg, the standard treatment or the
control group, one can derive the distribution F2 under a conjectured (relevant) alternative in cooperation with a subject
matter expert. This distribution is established in such a way that it features what the subject matter expert would quantify
as a relevant effect. In other words, the expert may, but does not necessarily have to, provide a (standardized) difference
of means, or a relevant value for the nonparametric relative effect on which the WMW test is based. Or, alternatively,
the subject matter expert may simply provide information on a configuration that the expert would consider relevant in
terms of providing evidence in favor of the research hypothesis. This information will then be translated into a relevant
nonparametric effect. More details on deriving F2 based on an interpretable effect to compute the nonparametric effect
and the variances involved in the sample size planning are given in Section 4.
For the WMW test, there already exist many sample size formulas. However, most of them require special situations,
eg, either continuous data as used in the works of Bürkner et al,1 Wang et al,2 or Noether,3 or they require ordered cate-
gorical data as in the works of Fan,4 Tang,5 Lachin,6 Hilton and Mehta,7 or Whitehead.8 For a review of different methods,
we refer to the work of Rahardja et al.9 A rather well-known method for sample size calculation in case of continuous data is given by Noether,3 who approximated the variance under the alternative by the variance under the null hypothesis. A similar approximation was also used by Zhao et al,10 who generalized Noether's formula to allow for ties. For practical application, however, this approximation may not always be appropriate because the variances under the null hypothesis and under the alternative can be very different, thus potentially leading to an underpowered or overpowered study. See, eg, the work of Shieh et al11 for a comparison of Noether's formula with different alternative methods.
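For continuous data, Noether's approximation has a simple closed form: with the null variance σ0² = 1/(12 t(1 − t)) replacing the variance under the alternative, the total sample size becomes N ≈ (u1−α/2 + u1−β)² / (12 t(1 − t)(p − 1/2)²). A minimal sketch (illustrative Python; the function name is ours):

```python
from statistics import NormalDist

def noether_total_n(p, alpha=0.05, beta=0.2, t=0.5):
    """Noether-type approximate total sample size for the two-sided WMW
    test with continuous data: the variance under the alternative is
    replaced by the null variance sigma0^2 = 1 / (12 t (1 - t))."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(1 - beta)) ** 2 / (
        12 * t * (1 - t) * (p - 0.5) ** 2)

# Example: relative effect p = 0.65, balanced design, 80% power.
n_approx = noether_total_n(0.65)
```

For p = 0.65, α = 0.05, and 1 − β = 0.8 in a balanced design, this yields roughly 117 subjects in total; note that the formula ignores how different the variance under the alternative may be, which is exactly the limitation discussed above.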
In some other approaches, the sample size is only calculated under the assumption of a proportional odds model for
ordered categorical data (eg, the works of Kolassa12 or Whitehead8 ), or considering only location shift models for contin-
uous metric data (see, eg, the works of Rosner and Glynn,13 Chakraborti et al,14 Lesaffre et al,15 Hamilton and Collings,16
or Collings and Hamilton,17 among others). An advantage of our formula (9) in Section 2 for the sample size calculation
is its generality and practicality. It can be used for metric data as well as for ordered categorical data, and it even works
very well for dichotomous data. Furthermore, our formula does not assume any special model for the alternatives.
Within the published literature, the sample size formulas bearing most similarity to ours are those by Wang et al.2
However, their approach is limited to continuous distributions, whereas our approach is based on a unified approach
allowing for discrete, as well as continuous data.
A completely different way to approach optimality of WMW tests has been pursued by Matsouaka et al.18 They use a
weighted sum of multiple WMW tests and determine the optimal weight for each test. Their aim is not an optimal sample
size planning including optimization of the ratio of sample sizes, but instead they try to optimally combine a primary
endpoint with mortality.
In a two-sample setting, we sometimes can choose the proportion of subjects in the first group. That is, we can choose
t = n1 ∕N, where n1 is the number of subjects in the first group and N is the total number of subjects. The question that
arises is how to choose t in an optimal way. In the work of Bürkner et al,1 the optimal t is chosen such that the power of
the WMW test is maximized for a given sample size N. On the other hand, in practice, we prefer to choose t in such a way
that the total sample size N is minimized for a specified power 1 − 𝛽. For the two-sample t-test with unequal variances,
Dette and O'Brien19 showed that the optimal t to maximize the power of the test is approximately

t ≈ 1 / (1 + τ),

where τ = σ1/σ0 is the ratio of the standard deviations of the two groups under the hypothesis and under the alternative, respectively.
respectively. This means that, when applying the t-test, more subjects should be allocated to the group with the higher
variance. Bürkner et al1 showed for symmetric continuous distributions under a location shift model that a balanced
design is optimal for the WMW test. For general distributions, they observed in simulation studies that, in many situations,
the difference between using the optimal t and using a balanced design is negligible.
In most publications, the generation of the alternative from the reference group is not discussed, and instead, the dis-
tribution under the alternative is assumed to be known. Here, however, we want to discuss also how we can generate the
distribution under the alternative based on the distribution in the reference group and an interpretable relevant effect.
TABLE 1 Number of seizures for 28 subjects from the advance information X1,k ∼ F1(x), k = 1, …, 28, and for the relevant effect F2(x) = F1(x/q), where q = 0.5 denotes the percentage of the relevant reduction of seizures to be detected. This means X2,k = [q · X1,k] ∼ F2(x), where [u] denotes the largest integer ≤ u

Advance information, X1,1, …, X1,28 ∼ F1(x):
3, 3, 5, 4, 21, 7, 2, 12, 5, 0, 22, 4, 2, 12,
9, 5, 3, 29, 5, 7, 4, 4, 5, 8, 25, 1, 2, 12

Relevant alternative, X2,k ∼ F2(x) = F1(x/q):
1, 1, 2, 2, 10, 3, 1, 6, 2, 0, 11, 2, 1, 6,
4, 2, 1, 14, 2, 3, 2, 2, 2, 4, 12, 0, 1, 6
In order to motivate the method derived in this paper, let us consider an example with count data, as it appears that most publications on sample size planning focus on ordered categorical or continuous metric data. In Table 1, the data of the advance information F1 on a placebo in an epilepsy trial are given, where the outcome variable is the number of seizures. We
would like to base sample size planning for a new drug on the data X1,1 , … , X1,28 of the advance information F1 , which
comes from a study published by Leppik et al,20 as well as Thall and Vail.21 For these data, we cannot assume a location
shift model, as an absolute reduction of two seizures would be very good for someone with three seizures, but not really
helpful for someone with 20 or more seizures. More appropriate would probably be a reduction of the number of seizures
by some percentage q, for example q = 50%. Based on this specified relevant effect F2 (x) = F1 (x∕q), we artificially gen-
erate a new data set X2,1 , … , X2,28 whose empirical distribution function F ̂2 (x) is exactly equal to F2 (x). Basically, the
number n2 of the artificially generated data is arbitrary (here, n2 = 28) as long as F ̂2 (x) = F2 (x) = F1 (x∕q). We will refer
to such data as “synthetic” data.
Most of the methods mentioned before cannot be applied to data such as these as they have been derived under different
restrictive assumptions. In particular, methods assuming a location-shift model cannot be used here. However, application
of the method proposed in the present paper does not require specific types of data or a specific alternative because it is
based on the observed data and the generated synthetic data, which do not need to follow any particular model. See also
the chapter “Keeping Observed Data as a Theoretical Distribution” in the work of Puntanen et al22 for a similar approach
in the parametric case. More details regarding this data set and the sample size calculation can be found in Section 4.
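The construction of Table 1 can be reproduced in a few lines of code (illustrative Python; the variable names are ours): each prior observation is multiplied by q = 0.5 and rounded down, which realizes the relevant alternative F2(x) = F1(x/q) for the empirical prior distribution.

```python
import math

# Advance information: seizure counts from the placebo group (Table 1).
x1 = [3, 3, 5, 4, 21, 7, 2, 12, 5, 0, 22, 4, 2, 12,
      9, 5, 3, 29, 5, 7, 4, 4, 5, 8, 25, 1, 2, 12]

q = 0.5  # conjectured relevant effect: a 50% reduction in seizures

# Synthetic data for the alternative F2(x) = F1(x/q): X2,k = [q * X1,k].
x2 = [math.floor(q * x) for x in x1]
```

The size n2 of the synthetic sample is arbitrary (here n2 = 28, one synthetic value per prior observation); only the empirical distribution function of the synthetic data matters.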
The rest of this paper is now organized as follows. We first derive a general sample size formula and investigate the
behavior of the optimal t. That is, we show in which cases more subjects should be allocated to the first or second group.
Then, we apply this method to several data examples with different types of data and provide power simulations to show
that, with the sample size calculated by our method, the simulated power is at least 1 − 𝛽. Furthermore, we simulate how
the chosen type-I and type-II error rates affect the value of the optimal allocation rate t.
2 SAMPLE SIZE FORMULA

Let X1i ∼ F1 and X2j ∼ F2, i = 1, …, n1, j = 1, …, n2, be independent random samples obtained on N different subjects,
with N = n1 + n2 . The cumulative distribution functions (cdfs) F1 and F2 are understood as their normalized versions,
ie, Fi (x) = 12 (Fi+ (x) + Fi− (x)), where Fi+ denotes the right-continuous cdf and Fi− denotes the left-continuous cdf. By
using the normalized version, we can pursue a unified approach for continuous and discrete data; no separate formulas
“correcting for ties” are necessary. This unified approach results naturally in the usage of midranks in the formulas for
the test statistics; see the works of Ruymgaart,23 Akritas et al,24 and Akritas and Brunner25 for details. We denote by t the
proportion of the N subjects that is allocated to the first group. That is, n1 = tN and n2 = (1 − t)N. Without loss of
generality, X1i may be regarded as the reference group and the second group X2i as the (experimental) treatment group.
The WMW test is based on the nonparametric relative treatment effect

p = ∫ F1 dF2 = P(X11 < X21) + (1/2) P(X11 = X21),   (1)
which can be estimated in a natural way by its empirical analog p̂ = ∫ F̂1 dF̂2. Here, F̂i = (1/2)(F̂i− + F̂i+) is the normalized empirical cdf, with F̂i−(x) = ni^(−1) Σ_{j=1}^{ni} 1{Xij < x} and F̂i+(x) = ni^(−1) Σ_{j=1}^{ni} 1{Xij ≤ x} the left- and right-continuous empirical cdfs for i = 1, 2, respectively. Finally, 1{Xij < x} denotes the indicator function of the set {Xij < x}. Using the relation of the so-called placement P2k = n1 F̂1(X2k) to the overall rank R2k of X2k among all N = n1 + n2 observations and the internal rank R(2)2k of X2k only among the n2 observations within sample 2, it follows from the asymptotic equivalence theorem (see, eg, theorem 1.3 in the work of Brunner and Puri26) that
TN = √N (p̂ − p) = √N [ (1/n1) ( R̄2· − (n2 + 1)/2 ) − p ]   (2)

is asymptotically normal under slight regularity assumptions. Here, R̄2· = n2^(−1) Σ_{k=1}^{n2} R2k denotes the mean of the overall ranks R2k in the second sample. For a derivation, we refer, eg, to the works of Brunner and Munzel27 or Brunner and Puri,26 while the placements P2k are considered in more detail at the end of this section in (10). From this theorem, it follows that, asymptotically, the statistic
UN = √N ( n2^(−1) Σ_{j=1}^{n2} F1(X2j) − n1^(−1) Σ_{j=1}^{n1} F2(X1j) + 1 − 2p ),   (3)
which is based on independent random variables, has the same distribution as TN . Then, under the null hypothesis H0 ∶
F1 = F2 , the variance of UN can be written as
σ0² = (N² / (n1 n2)) σ² = σ² / (t(1 − t)),   (4)

where σ² = ∫ F1² dF1 − 1/4. This means that TN/σ0 has asymptotically the same distribution as UN/σ0, but the distribution of the latter is asymptotically standard normal. To compute the variance of TN in general, we again take advantage of the asymptotically equivalent statistic in (3) and obtain the asymptotic variance
σN² = (N / (n1 n2)) ( n2 σ1² + n1 σ2² ),   (5)

where

σ1² = ∫ F2² dF1 − (1 − p)²,   (6)

σ2² = ∫ F1² dF2 − p².   (7)

Clearly, the variance σN² under the alternative is a weighted sum of the two components σ1² and σ2². Both of these components are important for minimizing the sample size, as performed in Section 3, unlike in the parametric case for the t-test, where only the two variances σ0² under the null and σ1² under the alternative hypotheses are considered.
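The rank representation in (2) can be verified numerically: with midranks, the estimator p̂ = ∫ F̂1 dF̂2 coincides with (R̄2· − (n2 + 1)/2)/n1, also in the presence of ties. A small sketch (illustrative Python with made-up data; the function names are ours):

```python
def midranks(pool):
    # Midrank of x in pool: (# values < x) + ((# values = x) + 1) / 2,
    # so tied observations share the average of their ranks.
    return [sum(v < x for v in pool) + (sum(v == x for v in pool) + 1) / 2
            for x in pool]

def p_hat_counting(x1, x2):
    """p_hat = ∫ F̂1 dF̂2 via direct counting; the normalized cdfs give
    ties the weight 1/2."""
    n1, n2 = len(x1), len(x2)
    return sum(sum(a < b for a in x1) + 0.5 * sum(a == b for a in x1)
               for b in x2) / (n1 * n2)

def p_hat_ranks(x1, x2):
    """The same estimator via overall midranks, as in the rank form of (2):
    p_hat = (mean overall rank of sample 2 - (n2 + 1)/2) / n1."""
    n1, n2 = len(x1), len(x2)
    ranks = midranks(list(x1) + list(x2))
    r2_bar = sum(ranks[n1:]) / n2
    return (r2_bar - (n2 + 1) / 2) / n1

# Made-up data with ties across and within groups.
x1, x2 = [1, 2, 2, 5], [2, 3, 5]
```

Both functions return the same value here (17/24), illustrating that no separate tie correction is needed once midranks and normalized cdfs are used.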
Based on these considerations, an approximate sample size formula for the WMW test can be obtained similar to the one calculated by Wang et al2 for continuous data. Namely, we obtain

N = ( σ0 u_{1−α/2} + σN u_{1−β} )² / ( p − 1/2 )²,   (8)

where α and β denote the type-I and type-II error rates, respectively, and u_{1−α/2} is the 1 − α/2 quantile of the standard normal distribution.
The quantities p, 𝜎 0 , and 𝜎 N in Equation (8) are unknown in general. Moreover, 𝜎N2 is a linear combination of the two
unknown variances 𝜎12 and 𝜎22 in Equations (6) and (7). To compute these quantities from the distribution F1 of the prior
information in the reference group and the distribution F2 generated by an intuitive and easy to interpret relevant effect,
we proceed as follows.
We interpret the distributions of the data as fixed theoretical distributions, similar to the parametric case in the works of Seber28(p433) and Puntanen et al.22(pp27-28) Therefore, we denote the data from the prior information by X*11, …, X*1n1 and the synthetic data for the treatment group by X*21, …, X*2n2. The corresponding cdfs are denoted by F1*(x) = F̂1(x) and F2*(x) = F̂2(x), respectively. Here, F̂1(x) denotes the empirical distribution function of the available data X*11, …, X*1n1 in the reference group and F̂2(x) the empirical distribution function of the synthetic data X*21, …, X*2n2 in the treatment group. In this context, "synthetic" means that the data for F2 are artificially generated based on the prior information F1 and some interpretable relevant effect. We can generate data sets of arbitrary size for F1 and F2, as long as the relative frequencies or probabilities remain unchanged. Because we assume that our synthetic data represent fixed distributions and not a sample, we can calculate the variances σ1², σ2², and σ², as well as the relative effect p, exactly. To emphasize that these quantities are not estimators but rather the true parameters based on the synthetic data, we will denote these quantities by σ²*, σ1²*, σ2²*, and p*.
By using the relations Nt = n1 and N(1 − t) = n2, the sample size formula from Equation (8) is then rewritten as

N = ( σ* u_{1−α/2} + u_{1−β} √( t σ2²* + (1 − t) σ1²* ) )² / ( t(1 − t) ( p* − 1/2 )² ).   (9)
The variances and the relative effect can be easily calculated by using a simple relation between ranks and the so-called placements P1k = n2 F̂2(X1k) and P2k = n1 F̂1(X2k), which were introduced by Orban and Wolfe.29,30 The placements were first defined only for continuous distributions but were later generalized to include discrete distributions. For details, see, eg, the work of Brunner and Munzel.27 To this end, let R*ik denote the overall rank of X*ik among all n1 + n2 = N synthetic data, and R*(i)ik the rank of X*ik within the ith group, i = 1, 2. Furthermore, let R̄*i· = ni^(−1) Σ_{k=1}^{ni} R*ik, i = 1, 2, denote the rank means. Then, the placements P*ik can be represented by these ranks as

P*ik = R*ik − R*(i)ik,   (10)
i = 1, 2; k = 1, …, ni. Finally, by letting Fi*(x) = F̂i(x), the quantities in the sample size formula (9) can be calculated directly as follows:

p* = ∫ F1* dF2* = (1/N) ( R̄*2· − R̄*1· ) + 1/2,   (11)

σ²* = ∫ (F*)² dF* − 1/4 = (1/N³) Σ_{i=1}^{2} Σ_{k=1}^{ni} ( R*ik − (N + 1)/2 )²,   (12)

σ1²* = ∫ (F2*)² dF1* − (1 − p*)² = (1/(n1 n2²)) Σ_{k=1}^{n1} ( P*1k − P̄*1· )²,   (13)

σ2²* = ∫ (F1*)² dF2* − (p*)² = (1/(n1² n2)) Σ_{k=1}^{n2} ( P*2k − P̄*2· )².   (14)

The cdf F* is the distribution function of the combined synthetic data from both groups. Note that, for computing the variances, we do not divide by N − 1 or ni − 1, but rather by N or ni, i = 1, 2, because the distributions of the synthetic data are considered as fixed theoretical distributions, similar to the parametric case in the work of Puntanen et al.22(pp27-28)
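Equations (10) to (14) and formula (9) translate directly into code. The following illustrative Python re-implementation (not the WMWssp package; all names are ours) applies them to the epilepsy counts of Table 1:

```python
from statistics import NormalDist

def midranks(pool):
    # Midranks: tied values receive the average of their ranks.
    return [sum(v < x for v in pool) + (sum(v == x for v in pool) + 1) / 2
            for x in pool]

def synthetic_quantities(x1, x2):
    """p*, sigma^2*, sigma1^2*, sigma2^2* from Equations (10)-(14),
    treating x1 (prior data) and x2 (synthetic data) as fixed distributions."""
    n1, n2 = len(x1), len(x2)
    N = n1 + n2
    r = midranks(list(x1) + list(x2))                    # overall ranks R*_ik
    r1, r2 = r[:n1], r[n1:]
    place1 = [a - b for a, b in zip(r1, midranks(x1))]   # P*_1k, Eq. (10)
    place2 = [a - b for a, b in zip(r2, midranks(x2))]   # P*_2k
    p = (sum(r2) / n2 - sum(r1) / n1) / N + 0.5          # Eq. (11)
    s2 = sum((v - (N + 1) / 2) ** 2 for v in r) / N ** 3  # Eq. (12)
    m1, m2 = sum(place1) / n1, sum(place2) / n2
    s1_sq = sum((v - m1) ** 2 for v in place1) / (n1 * n2 ** 2)  # Eq. (13)
    s2_sq = sum((v - m2) ** 2 for v in place2) / (n1 ** 2 * n2)  # Eq. (14)
    return p, s2, s1_sq, s2_sq

def wmw_total_n(x1, x2, alpha=0.05, beta=0.2, t=0.5):
    """Total sample size N according to formula (9)."""
    p, s2, s1_sq, s2_sq = synthetic_quantities(x1, x2)
    z = NormalDist().inv_cdf
    num = (z(1 - alpha / 2) * s2 ** 0.5
           + z(1 - beta) * (t * s2_sq + (1 - t) * s1_sq) ** 0.5) ** 2
    return num / (t * (1 - t) * (p - 0.5) ** 2)

# Table 1: placebo counts and synthetic data for a 50% seizure reduction.
x1 = [3, 3, 5, 4, 21, 7, 2, 12, 5, 0, 22, 4, 2, 12,
      9, 5, 3, 29, 5, 7, 4, 4, 5, 8, 25, 1, 2, 12]
x2 = [x // 2 for x in x1]
N_balanced = wmw_total_n(x1, x2)
```

Rounding Nt and N(1 − t) up to the next integers then gives the two group sizes; the results for these data are discussed in Section 4.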
3 MINIMIZING N
Now, regarding the case 𝜎 1 = 𝜎 2 , it is clear from formula (9) that the optimal allocation rate is t0 = 1∕2 because the
numerator of N(t) does not depend on t, and t(1 − t) is maximized at t = 1∕2. For the case 𝜎 1 ≠ 𝜎 2 , we consider first
0 < 𝜎 1 < 𝜎 2 . Then, it is possible to show (see Supplementary Material, Result 2) that the sample size is minimized by a
t0 ∈ [I1 , I2 ] with I1 ≤ I2 < 1∕2. The minimizer is unique in the interval (0, 1), and the bounds I1 and I2 are given by
I1 = 1 / (κ + 1),   (15)

I2 = √z / ( √z + u_{1−α/2} σ √q + u_{1−β} σ2 ),   (16)

where κ = σ2/σ1, σ² = ∫ F1² dF1 − 1/4 as in (4), q = p(1 − p), and

z = ( u_{1−α/2} σ √q + u_{1−β} σ1 ) ( u_{1−α/2} σ √q + u_{1−β} σ2 ).

Furthermore, we have the equivalence

t0 < 1/2 ⟺ σ1 < σ2.   (17)
In the case 0 < 𝜎 2 < 𝜎 1 , we obtain an analogous result for the minimizer t0 ∈ [I2 , I1 ], where the bounds are the same
as before. Moreover, we have a similar equivalence, namely,
t0 > 1/2 ⟺ σ1 > σ2.   (18)
The derivation of these two equivalences can be found in the Supplementary Material in Results 2 and 3.
From the form of the interval [I1 , I2 ], we can see that, if 𝜅 ≈ 1, then t0 ≈ 1∕2. In most cases, this means that the minimum
total sample size N is obtained for allocation rates close to 1∕2, or the allocation rate is 1∕2 because of rounding. Larger
values for the type-I error rate 𝛼 or the power 1 − 𝛽 lead in general to more extreme values for t0 , ie, |1∕2 − t0 | gets larger.
This can be seen from the upper bound I2 . By increasing 𝛼 or the power 1 − 𝛽, the bound I2 decreases (or increases for
𝜎 1 > 𝜎 2 ). Typically, this means that the difference |1∕2 − t0 | tends to get larger. Note that I2 is bounded from below
(above), ie, t0 cannot become arbitrarily small (or large). The impact of 𝛼 and 𝛽 is demonstrated in simulations in Section 5.
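These statements can be probed numerically: for fixed quantities (p, σ, σ1², σ2²), one can evaluate N(t) from formula (9) on a grid of allocation rates and locate the minimizer, comparing it with the lower bound I1 = 1/(κ + 1) from (15). A sketch with hypothetical values (illustrative Python; all names and numbers are ours):

```python
from statistics import NormalDist

def n_of_t(t, p, sigma, s1_sq, s2_sq, alpha=0.05, beta=0.2):
    """N(t) according to formula (9) for fixed synthetic-data quantities."""
    z = NormalDist().inv_cdf
    num = (z(1 - alpha / 2) * sigma
           + z(1 - beta) * (t * s2_sq + (1 - t) * s1_sq) ** 0.5) ** 2
    return num / (t * (1 - t) * (p - 0.5) ** 2)

def t_opt(p, sigma, s1_sq, s2_sq, steps=999):
    """Grid search for the allocation rate minimizing N(t) on (0, 1)."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return min(grid, key=lambda t: n_of_t(t, p, sigma, s1_sq, s2_sq))

# Hypothetical quantities with sigma1 < sigma2 (kappa = sqrt(2) > 1).
sigma = (1 / 12) ** 0.5               # continuous case, sigma^2 = 1/12
t0 = t_opt(p=0.65, sigma=sigma, s1_sq=0.03, s2_sq=0.06)
i1 = 1 / (1 + (0.06 / 0.03) ** 0.5)   # lower bound I1 = 1/(kappa + 1), Eq. (15)
```

Consistent with equivalence (17), the grid minimizer here falls below 1/2 (since σ1 < σ2) but not below I1; for σ1 = σ2, the numerator of N(t) no longer depends on t, and the grid minimum is exactly t = 1/2.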
Next, we consider the case 0 = 𝜎 1 < 𝜎 2 . In the same way as before, it is possible to construct an interval for the optimal
allocation rate t0 , which is given by [I1(0) , I2 ], where the lower bound is
I1(0) = u_{1−α/2} σ / ( 2 u_{1−α/2} σ + u_{1−β} σ2 ),   (19)
and the upper bound is the same as in the case 0 < 𝜎 1 . More details are given in the Supplementary Material in Result 4.
An analogous result can be obtained for 0 = 𝜎 2 < 𝜎 1 .
Therefore, the value of t0 is mainly determined by 𝜅, which is the ratio of the standard deviations 𝜎 1 and 𝜎 2 under the
alternative hypothesis. This is qualitatively different from the result of the work of Dette and O'Brien19 for the t-test in
a parametric location-scale model, where the optimal allocation value is determined by the ratio of standard deviations
under the null and under the alternative hypothesis. For the WMW test, the variance under the null hypothesis is not really important for determining t0; in case of continuous distributions, eg, it is determined by σ² = ∫ F1² dF1 − 1/4 = 1/12 (see (4)). Combining the equivalences (17) and (18), we obtain

t0 = 1/2 ⟺ σ1 = σ2.   (20)
Bürkner et al1 showed analytically that, for symmetric and continuous distributions with F2(x) = F1(x + a) and a ≠ 0, the minimal sample size is attained at t0 = 1/2. By (20), such distributions satisfy the integral equation

∫ F2² dF1 − (1 − p)² = ∫ F1² dF2 − p²,   (22)

ie, σ1² = σ2². However, the class of distributions satisfying Equation (22) is actually larger. Consider normalized cdfs F1, F2 for which an a ∈ ℝ exists such that, for all x ∈ ℝ, the following equality holds:

F1(a + x) = 1 − F2(a − x).   (23)

Furthermore, let us assume 1 − β > 0.5. Then, the minimum for N(t), t ∈ (0, 1), is attained at t0 = 1/2. This means that (23) is a sufficient but not necessary condition for t0 = 1/2. As an example of distributions that satisfy Equation (22) but not (23), consider F1 = F2 to be a nonsymmetric distribution.
Note that we do not assume for (23) that the distributions are stochastically ordered or symmetric. If we assume finite
third moments, then Equation (23) only implies that both distributions have the same variance and their skewness has
opposite signs, ie, 𝜈F1 = −𝜈F2 , if we denote with 𝜈Fi the skewness of the distribution with cdf Fi , i = 1, 2.
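Condition (23) can be checked empirically: if the second distribution is the mirror image of the first about some point a (here a = 0, ie, X2 has the distribution of −X1), then (23) holds, and the placement-based variances from (13) and (14) coincide, so t0 = 1/2 by (20). An illustrative Python check with made-up counts (all names are ours):

```python
def midranks(pool):
    # Midranks: ties receive the average of their ranks.
    return [sum(v < x for v in pool) + (sum(v == x for v in pool) + 1) / 2
            for x in pool]

def placement_variances(x1, x2):
    """sigma1^2* and sigma2^2* from Equations (13) and (14) via placements."""
    n1, n2 = len(x1), len(x2)
    r = midranks(list(x1) + list(x2))
    place1 = [a - b for a, b in zip(r[:n1], midranks(x1))]  # P*_1k, Eq. (10)
    place2 = [a - b for a, b in zip(r[n1:], midranks(x2))]  # P*_2k
    m1, m2 = sum(place1) / n1, sum(place2) / n2
    s1_sq = sum((v - m1) ** 2 for v in place1) / (n1 * n2 ** 2)
    s2_sq = sum((v - m2) ** 2 for v in place2) / (n1 ** 2 * n2)
    return s1_sq, s2_sq

# A skewed sample and its reflection about a = 0, so F1(x) = 1 - F2(-x):
x1 = [0, 1, 1, 3, 7]
x2 = [-v for v in x1]
s1_sq, s2_sq = placement_variances(x1, x2)
```

The two distributions are neither symmetric nor stochastically equal, yet the two variances agree exactly, in line with the discussion of (23).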
Obviously, for a large class of distributions, the optimal allocation rate is exactly 1∕2. Bürkner et al1 already noticed
the robustness of the WMW test regarding the optimal allocation rate. When the optimal t0 is not equal to 1∕2, it is often
close to 1∕2. Furthermore, the exact choice of t typically only has a small influence on the required total sample size. This
applies not only to continuous and symmetric distributions but in general to arbitrary distributions.
4 DATA EXAMPLES
The generality of the approach proposed in this paper is demonstrated using different data examples with continuous
metric, discrete metric, and ordered categorical data. In this section, we first describe the data sets. Then, the calculated
sample sizes along with the actual achieved power in comparison with other sample size calculation methods are given.
For all data sets, we used the prior information from one group (eg, from a previous study or from literature) to gener-
ate synthetic data for the second group based on an interpretable effect specified by a subject matter expert. For ordered
categorical data, such an effect might be that a certain percentage of subjects in each category are moved to a better or
worse category. For metric data, it is possible to simply use a location shift as the effect of interest. Regardless of how the effects are chosen, in the end they are all translated into the so-called nonparametric relative effect, which itself provides another interpretable effect quantification that might be useful for practitioners, in addition to, eg, a location shift effect.
For all examples, we used 𝛼 = 0.05 as the type-I error rate and provide the output from an R function, which shows
the optimal t, the sample size determined for each group, and the ratio 𝜅 = 𝜎 2 ∕𝜎 1 . Furthermore, we provide simula-
tion results to assess the actual achieved power. The R Code is given in the Supplementary Material. For calculating the
asymptotic WMW test, we used the function rank.two.samples from the R package rankFD.31 For all simulations
performed with the statistical software R, we generated 104 data sets and used 0 as our starting seed value for drawing data
sets from the synthetic data. To compute the optimal allocation rate t0 and the sample sizes for each group, the function
WMWssp_Minimize from the R package WMWssp can be used.
written in terms of 𝜎 ∗ (see formula (4)). Then, for this sample size formula (9), we still need to calculate the variances
𝜎 ∗ , 𝜎1∗ , and 𝜎2∗ . We can do that by first calculating the placements for the data according to Equation (10). Then, we use
(12), (13), and (14) to obtain the quantities needed for the sample size formula.
In order to have a power of at least 80%, we need 24 subjects in each group, according to our method. When using
the optimal t0 ≈ 0.49, we need n1 = 23 and n2 = 24 subjects. In this case, the optimal allocation only reduces the
total number of subjects needed by one, in comparison with a balanced design. Applying Noether's formula in this case
yields sample sizes n1 = n2 = 26. Table 2 presents results from a power simulation regarding the different sample size
recommendations. Here, Noether's formula would lead to a slightly overpowered study.
unbalanced design. Tang5 derived a sample size formula for ordered categorical data. If we use his method, we obtain that
86 rats per group are needed. The closeness of his result to ours may be taken as confirmation that our unified approach
produces appropriate results also in the case of ordered categorical data.
In the aforementioned four data examples, we have used 𝛼 = 0.05 and 1 − 𝛽 = 0.8 or 0.9 for the sample size calculation
and power simulation according to the examples from the literature. By formula (9) and the intervals for t0 (Equations (15)
and (16)) in Section 3.1, the choice of 𝛼 and 𝛽 has an influence not only on the total sample size N but also on the optimal
allocation rate t0 . In order to study the behavior of these two parameters, we have performed two simulation studies,
which are described in Section 5.
5 SIMULATIONS

In this section, we assess in different simulations the behavior of the optimal allocation rate t0 when changing the nominal type-I error rate α, the power 1 − β, and the ratio of standard deviations κ = σ2/σ1.
For simulating the influence of 𝛼, we used Beta(5, 5) and Beta(3, i) distributed random numbers in the first and second
group for i = 1, 2, 3. For each 𝛼 = 0.01, 0.02, … , 0.1, we generated 106 random numbers for each group and calculated
the optimal allocation rate t0 and the total sample sizes N(t0 ) and N(1∕2) (corresponding to a balanced design) to achieve
at least 80% power. From the formula for the upper bound I2 of t0 , we already saw (Section 3.1) that larger values for the
type-I error rate 𝛼 would lead to a larger difference |I2 − 1∕2|. While we cannot conclude from this directly that t0 will
be more extreme, the optimal allocation rate will more likely tend to more extreme values, ie, the difference |t0 − 1∕2|
tends to become larger. We can see this behavior confirmed in Figure 1. In this simulation, we had p ≈ 0.5 and κ = 1.35, implying t0 < 1/2, for the case i = 3 (red curve); p = 0.657 and κ = 1.53 for i = 2 (green curve); and p = 0.84 and κ = 1.98 for i = 1 (blue curve). Note that an effect of p ≈ 0.5 makes no sense in a realistic scenario, as the calculated sample size would be much too large to be of practical relevance, but we use this setting regardless just to demonstrate the behavior of t0 with regard to the effect p. The ratio κ = σ2/σ1 also has an influence on the value of t0. Hence, we chose the alternatives in such a way that κ > 1. This means that t0 < 1/2, and if we increase p, then κ also increases. From that, we saw that more extreme effects (or larger values of κ) led to larger differences |t0 − 1/2|. This can also be seen from the upper bound I2.
In the data examples, we already found very little difference between using a balanced design or the optimal design.
The simulation study yielded a similar observation where the maximal difference was at most 1 for the medium and large
FIGURE 1 The graphic shows the values of the optimal allocation rate t0 for different values of the type-I error rate α where the goal is to detect a relevant effect with at least 80% power. For the reference group, we used Beta(5, 5) distributions, and for the treatment group, we assumed Beta(3, i), where i = 1, 2, 3. The red line represents i = 3 (relative effect p ≈ 0.5); for the green curve, we have used i = 2 (p ≈ 0.65); and for the blue curve, i = 1 (p ≈ 0.84) [Colour figure can be viewed at wileyonlinelibrary.com]
FIGURE 2 The graphic shows the values of the optimal allocation rate t0 for different values of the power for α = 0.05. For the reference group, we used Beta(5, 5) distributions, and for the treatment group, we assumed Beta(3, i), where i = 1, 2, 3. The red line represents i = 3 (relative effect p ≈ 0.5); for the green curve, we have used i = 2 (p ≈ 0.65); and for the blue curve, i = 1 (p ≈ 0.84) [Colour figure can be viewed at wileyonlinelibrary.com]
relative effect p, ie, max |N(t) − N(1∕2)| = 1. For the small effect p ≈ 0.5, the maximal difference was larger but still negli-
gible because the total sample size was very large for this setting. The detailed results are provided in the Supplementary
Material.
In a second simulation, we investigated the behavior of t0 for increasing power (or decreasing 𝛽). We used 𝛼 = 0.05 and
the same distributions as before. Therefore, p and 𝜅 were the same as aforementioned for the three different alternatives.
As values for the power, we chose 1 − 𝛽 = 0.5, … , 0.95 and generated 106 random numbers for each 𝛽 to calculate the
optimal allocation rate t0 . The results are displayed in Figure 2. Obviously, for 1 − 𝛽 = 0.5, we had t0 = 1∕2 in all cases.
A larger power led to more extreme values for t0 , but the difference in required sample sizes between the balanced and
optimal design was again negligible. The difference was again at most 1 for the medium and large relative effect p. Similar
to the simulation from before, more extreme values of the relative effect led to larger differences |t0 − 1∕2|.
6 DISCUSSION
In this paper, we have proposed a unified approach to sample size determination for the WMW two-sample rank sum test.
Our approach does not assume any specific type of data or a specific alternative hypothesis. In particular, data distributions
may be discrete or continuous. Based on the general formula, we have also derived an optimal allocation rate to both
groups, ie, to choose a value for t = n1 ∕N such that N is minimized. The value of this optimal allocation rate t0 mainly
depends on the ratio 𝜅 = 𝜎2∕𝜎1 (see (13) and (14) for a definition of these variances) and on 𝛽. The variance under the
null hypothesis has no influence on t0. For 𝜅 > 1, we have t0 < 1∕2; for 𝜅 < 1, we have t0 > 1∕2; and for 𝜅 = 1, we
have exactly t0 = 1∕2, assuming u_{1−𝛽} > 0. The nominal type-I error rate 𝛼 has only a small impact on the value of t0:
the larger 𝛼 is, the larger the difference |t0 − 1∕2|.
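The qualitative dependence of t0 on 𝜅 can be illustrated numerically with a stylized sample-size function of the common two-term structure, ie, a null-variance term proportional to 1∕(t(1 − t)) plus an alternative-variance term 𝜎1²∕t + 𝜎2²∕(1 − t). This is only a hypothetical stand-in with illustrative constants, not the formula from (13) and (14):

```python
import numpy as np

# Stylized sample-size function (illustrative constants, not the paper's formula);
# z_a and z_b stand for the normal quantiles u_{1-alpha/2} and u_{1-beta}.
def N_of_t(t, s1, s2, s0=0.2887, delta=0.15, z_a=1.959964, z_b=0.841621):
    null_sd = s0 * np.sqrt(1.0 / (t * (1.0 - t)))
    alt_sd = np.sqrt(s1**2 / t + s2**2 / (1.0 - t))
    return (z_a * null_sd + z_b * alt_sd) ** 2 / delta**2

def optimal_t(s1, s2):
    """Grid search for the allocation rate t0 that minimizes N(t)."""
    t = np.linspace(0.01, 0.99, 9801)
    return float(t[np.argmin(N_of_t(t, s1, s2))])

t0 = optimal_t(0.30, 0.60)  # kappa = sigma2/sigma1 = 2 > 1, so t0 < 1/2
print(t0, N_of_t(0.5, 0.30, 0.60) / N_of_t(t0, 0.30, 0.60))
```

With 𝜅 = 2 the minimizer lies below 1∕2 and with 𝜅 = 1 it is exactly 1∕2, while the saving of the optimal over the balanced design stays within a few percent, consistent with the findings above.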
We can see from the interval [I1 , I2 ] for the optimal allocation rate t0 derived in Section 3.1 that t0 will typically be close
to 1∕2. This was also confirmed in some illustrative data examples in Section 4. Furthermore, the difference in required
sample size between using a balanced design and using the optimal allocation design appears practically negligible.
In other words, in most cases, a balanced design can be recommended for the WMW test. In extensive simulations,
we have confirmed that the new procedure actually meets the power at the calculated sample sizes quite well. In special
cases, our sample size formula yields basically the same results as those by Lachin6 and Tang5 for ordinal data or Noether3
for continuous data (see Section 4). Matching the established results in these special cases is a desirable property for a
generally valid sample size formula. However, note that, for Noether's formula, the variance under the alternative hypothesis is approximated by the variance under the null hypothesis; hence, a difference from our formula is to be expected even for continuous data (see, eg, Table 6). The advantage of our new sample size formula is that it can be used universally
for different types of data. We also provide details on how to generate synthetic data based on an interpretable effect.
The new procedure has been implemented in the R package WMWssp.
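As a minimal illustration of the rank relations our approach builds on (the relative effect computed from pooled midranks, as in the nonparametric Behrens-Fisher setting27), consider the following Python sketch; the data are made up for illustration, and this is not code from the WMWssp package:

```python
import numpy as np
from scipy.stats import rankdata  # default method="average" yields midranks

def relative_effect_ranks(x, y):
    """Estimate the relative effect p = P(X < Y) + 0.5 * P(X = Y) from
    pooled midranks via p_hat = (Rbar_2 - (n2 + 1)/2) / n1, where Rbar_2
    is the mean pooled midrank of the second sample."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    r = rankdata(np.concatenate([x, y]))  # midranks handle ties
    return float((r[n1:].mean() - (n2 + 1) / 2.0) / n1)

# Illustrative data with ties:
print(relative_effect_ranks([1, 2, 2, 3, 5], [2, 3, 3, 4]))  # 0.65
```

The same value results from counting the pairs with x < y plus half the tied pairs, which is exactly how ties enter the relative effect for discrete data.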
ACKNOWLEDGEMENT
This research was supported by the Austrian Science Fund (FWF), grant I 2697-N31.
REFERENCES
1. Bürkner P-C, Doebler P, Holling H. Optimal design of the Wilcoxon–Mann–Whitney-test. Biom J. 2017;59(1):25-40.
2. Wang H, Chen B, Chow SC. Sample size determination based on rank tests in clinical trials. J Biopharm Stat. 2003;13(4):735-751.
3. Noether GE. Sample size determination for some common nonparametric tests. J Am Stat Assoc. 1987;82(398):645-647.
4. Fan C, Zhang D. A note on power and sample size calculations for the Kruskal–Wallis test for ordered categorical data. J Biopharm Stat.
2012;22(6):1162-1173.
5. Tang Y. Size and power estimation for the Wilcoxon–Mann–Whitney test for ordered categorical data. Statist Med. 2011;30(29):3461-3470.
6. Lachin JM. Power and sample size evaluation for the Cochran–Mantel–Haenszel mean score (Wilcoxon rank sum test) and the
Cochran–Armitage test for trend. Statist Med. 2011;30(25):3057-3066.
7. Hilton JF, Mehta CR. Power and sample size calculations for exact conditional tests with ordered categorical data. Biometrics.
1993;49(2):609-616.
8. Whitehead J. Sample size calculations for ordered categorical data. Statist Med. 1993;12(24):2257-2271.
9. Rahardja D, Zhao YD, Qu Y. Sample size determinations for the Wilcoxon–Mann–Whitney test: a comprehensive review. Stat Biopharm
Res. 2009;1(3):317-322.
10. Zhao YD, Rahardja D, Qu Y. Sample size calculation for the Wilcoxon–Mann–Whitney test adjusting for ties. Statist Med.
2008;27(3):462-468.
11. Shieh G, Jan SL, Randles RH. On power and sample size determinations for the Wilcoxon–Mann–Whitney test. J Nonparametric Stat.
2006;18(1):33-43.
12. Kolassa JE. A comparison of size and power calculations for the Wilcoxon statistic for ordered categorical data. Statist Med.
1995;14(14):1577-1581.
13. Rosner B, Glynn RJ. Power and sample size estimation for the Wilcoxon rank sum test with application to comparisons of C statistics from
alternative prediction models. Biometrics. 2009;65(1):188-197.
14. Chakraborti S, Hong B, van de Wiel MA. A note on sample size determination for a nonparametric test of location. Technometrics.
2006;48(1):88-94.
15. Lesaffre E, Scheys I, Fröhlich J, Bluhmki E. Calculation of power and sample size with bounded outcome scores. Statist Med.
1993;12(11):1063-1078.
16. Hamilton MA, Collings BJ. Determining the appropriate sample size for nonparametric tests for location shift. Technometrics.
1991;33(3):327-337.
17. Collings BJ, Hamilton MA. Estimating the power of the two-sample Wilcoxon test for location shift. Biometrics. 1988;44:847-860.
18. Matsouaka RA, Singhal AB, Betensky RA. An optimal Wilcoxon–Mann–Whitney test of mortality and a continuous outcome. Stat Methods
Med Res. 2016;27(8):2384-2400.
19. Dette H, O'Brien TE. Efficient experimental design for the Behrens-Fisher problem with application to bioassay. Am Stat.
2004;58(2):138-143.
20. Leppik IE, Dreifuss FE, Bowman T, et al. A double-blind crossover evaluation of progabide in partial seizures. Neurology. 1985;35(4):285.
21. Thall PF, Vail SC. Some covariance models for longitudinal count data with overdispersion. Biometrics. 1990;46:657-671.
22. Puntanen S, Styan GPH, Isotalo J. Matrix Tricks for Linear Statistical Models: Our Personal Top Twenty. Berlin, Germany: Springer; 2011.
23. Ruymgaart FH. A unified approach to the asymptotic distribution theory of certain midrank statistics. In: Raoult JP, ed. Statistique non
Parametrique Asymptotique. Berlin, Germany: Springer; 1980:1-18.
24. Akritas MG, Arnold SF, Brunner E. Nonparametric hypotheses and rank statistics for unbalanced factorial designs. J Am Stat Assoc.
1997;92(437):258-265.
25. Akritas MG, Brunner E. A unified approach to rank tests for mixed models. J Stat Plan Inference. 1997;61(2):249-277.
26. Brunner E, Puri ML. Nonparametric methods in factorial designs. Stat Pap. 2001;42(1):1-52.
27. Brunner E, Munzel U. The nonparametric Behrens-Fisher problem: asymptotic theory and a small-sample approximation. Biom J.
2000;42(1):17-25.
28. Seber GAF. A Matrix Handbook for Statisticians. Hoboken, NJ: John Wiley & Sons; 2008.
29. Orban J, Wolfe DA. A class of distribution-free two-sample tests based on placements. J Am Stat Assoc. 1982;77(379):666-672.
30. Orban J, Wolfe DA. Distribution-free partially sequential placement procedures. Commun Stat Theory Methods. 1980;9(9):883-904.
31. Konietschke F, Friedrich S, Brunner E, Pauly M. rankFD: rank-based tests for general factorial designs. 2016. R package version 0.0.1.
SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of the article.
How to cite this article: Happ M, Bathke AC, Brunner E. Optimal sample size planning for the
Wilcoxon-Mann-Whitney test. Statistics in Medicine. 2018;1–13. https://doi.org/10.1002/sim.7983