Yamamura 1999 Transformaciones
ORIGINAL ARTICLE
Kohji Yamamura
K. Yamamura
Laboratory of Population Ecology, National Institute of Agro-Environmental Sciences, 3-1-1 Kannondai, Tsukuba 305-8604, Japan
Tel. +81-298-38-8313; Fax +81-298-38-8199
e-mail: [email protected]

Abstract Transformation is required to achieve homoscedasticity when we perform ANOVA to test the effect of factors on population abundance. The effectiveness of transformations decreases when the data contain zeros. In particular, the logarithmic transformation and the Box–Cox transformation are not applicable in such a case. For the logarithmic transformation, 1 is traditionally added to avoid such problems. However, there is no concrete foundation as to why 1 is added rather than other constants, such as 0.5 or 2, although the result of ANOVA is much influenced by the added constant. In this paper, I suggest that 0.5 is preferable to 1 as an added constant, because a discrete distribution defined in $\{0, 1, 2, \ldots\}$ is approximately described by a corresponding continuous distribution defined in $(0, \infty)$ if we add 0.5. Numerical investigation confirms this prediction.

Key words ANOVA · Box–Cox transformation · Heteroscedasticity · Iwao's $m^* - m$ regression · Taylor's power law

Introduction

Many studies have addressed which measure should be used to describe the variability of populations (Williamson 1984; McArdle et al. 1990; Gaston and McArdle 1993; Leps 1993; McArdle and Gaston 1993, 1995). If we want to analyze the cause of population dynamics, a logarithmic scale is preferable in many cases, because mortality, as well as reproduction, is a multiplicative process. Life table analyses such as key-factor/key-stage analyses adopted a logarithmic scale for this reason (Yamamura 1999). One of the difficulties of the logarithmic scale is that we cannot calculate the logarithm if the data contain zeros. In such a case, $\log_e(x + 1)$ or $\log_{10}(x + 1)$ is traditionally used, where $x$ is the number of individuals. However, the variability of $\log_e(x + 1)$ is much different from that of $\log_e(x)$ when $x$ is small. Hence, McArdle and Gaston (1992) recommended the coefficient of variation (CV) as a measurement of population variability, because CV performs the same way as the standard deviation of $\log_e(x)$ and is unaffected by zeros.

The logarithmic transformation is also frequently used to achieve homoscedasticity, or stability of variance, when we perform ANOVA to test the effect of factors on population abundance. If the data contain zeros, 1 is traditionally added to each datum or only to the zeros. However, there is no concrete foundation as to why 1 is added rather than another constant, such as 0.5 or 2, although the result of ANOVA is much influenced by the added constant. In this article, I first summarize the procedure to determine the transformation formulae that stabilize the variance of population counts. Then, I suggest that 0.5 is a reasonable choice as the added constant. Numerical investigation is also conducted to determine an appropriate constant.

Heteroscedasticity in populations

The variance of a population increases with increasing mean. Bliss (1941) suggested two equations to describe the heteroscedasticity:

$$s^2 = gm + hm^2 \qquad (1)$$

$$s^2 = am^b \qquad (2)$$

where $m$ and $s^2$ are the mean and variance of the number of individuals in a sample, and $a$, $b$, $g$, and $h$ are constants. The general applicability of Eqs. 1 and 2 was first shown by Iwao (1968), based on his $m^* - m$ regression, and by Taylor (1961), based on his power law, respectively. There has been considerable controversy about which of the two is superior as an ecological model (Iwao and Kuno 1971; Taylor et al. 1978; Taylor 1984; Itô and Kitching 1986; Kuno 1991; Routledge and Swartz 1991; Perry and Woiwod 1992). For the practical purpose of description, however,
both equations fit the data equally well in most cases, and hence I later use both equations to investigate the effect of the added constant on the stabilization of variance.

Derivation of the transformation formula

Taylor series expansions

Let us assume that the variance of a variable $x$ is given by a function of the mean, $d(m)$. Let $f(x)$ be a function of $x$. Using Taylor series expansions around the mean, $m$, we obtain:

$$f(x) = f(m) + f'(m)(x - m) + \cdots \qquad (3)$$

where $f'(m)$ is the first derivative of $f(x)$ evaluated at $x = m$. By moving $f(m)$ to the left-hand side, squaring, and taking expectations, we obtain an approximation of the variance of $f(x)$:

$$V[f(x)] = E\{[f(x) - E[f(x)]]^2\} \approx E\{[f(x) - f(m)]^2\} \approx [f'(m)]^2 E\{(x - m)^2\} = [f'(m)]^2 d(m) \qquad (4)$$

where $E$ and $V$ indicate the expectation and variance, respectively. This method to obtain the variance is usually called the delta method because of the reliance upon first derivatives (Stuart and Ord 1994, p 350). Our present concern is to find the function $f(x)$ that yields a constant variance irrespective of $m$. Then, we obtain from Eq. 4:

$$f(x) = \int \frac{1}{\sqrt{d(x)}}\,dx \qquad (5)$$

Equation 7 includes the square root transformation as its special case of $b = 1$.

Box–Cox transformation

Box and Cox (1964) proposed a procedure for determining a transformation formula, which is applicable when we do not know the form of $d(x)$ beforehand. They used a modified form of Eq. 7:

$$f(x) = \begin{cases} \dfrac{x^{\lambda} - 1}{\lambda} & (\lambda \neq 0) \\ \log_e(x) & (\lambda = 0) \end{cases} \qquad (8)$$

This transformation is continuous around $\lambda = 0$, although Eq. 7 is discontinuous around $b = 2$. Hence, we can obtain a series of transformations, including $\sqrt{x}$ and $\log_e(x)$, by changing $\lambda$ continuously. They estimated the parameter $\lambda$ by the maximum-likelihood method based on the assumption that the distribution after the transformation follows a normal distribution. To obtain the estimate, the working variable, $y$, is first calculated:

$$y = \frac{x^{\lambda} - 1}{\lambda G^{\lambda - 1}} \qquad (9)$$

where $G$ is the geometric mean of $x$. ANOVA is performed for this working variable. Box and Cox (1964) showed that the maximum-likelihood estimate of $\lambda$ coincides with the $\lambda$ that minimizes the residual sum of squares in this ANOVA. Hence, we can easily find the maximum-likelihood estimate by comparing the residual sum of squares for various $\lambda$.
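The search over $\lambda$ described for the Box–Cox procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ANOVA is taken to be a one-way layout, the counts in `groups` are hypothetical positive values invented for the example (the procedure is not applicable when the data contain zeros), and the function names `working_variable`, `residual_ss`, and `best_lambda` are not from the original article. For $\lambda = 0$ the code uses the limit of Eq. 9, $y = G \log_e(x)$.

```python
import math

def working_variable(xs, lam, g):
    """Eq. 9: y = (x**lam - 1) / (lam * g**(lam - 1)); as lam -> 0 this tends to g * log(x)."""
    if abs(lam) < 1e-12:
        return [g * math.log(x) for x in xs]
    return [(x ** lam - 1.0) / (lam * g ** (lam - 1.0)) for x in xs]

def residual_ss(groups, lam):
    """Within-group (residual) sum of squares of a one-way ANOVA on the working variable."""
    pooled = [x for grp in groups for x in grp]
    # G: geometric mean of all observations (requires every x > 0).
    g = math.exp(sum(math.log(x) for x in pooled) / len(pooled))
    rss = 0.0
    for grp in groups:
        ys = working_variable(grp, lam, g)
        mean_y = sum(ys) / len(ys)
        rss += sum((y - mean_y) ** 2 for y in ys)
    return rss

def best_lambda(groups, grid):
    """Grid value of lambda with the smallest residual sum of squares."""
    return min(grid, key=lambda lam: residual_ss(groups, lam))

# Hypothetical counts for three treatments (illustrative only, not data from the article).
groups = [[1, 2, 4, 3], [8, 15, 30, 12], [60, 110, 240, 95]]
grid = [i / 10 for i in range(-20, 21)]  # lambda from -2.0 to 2.0 in steps of 0.1
lam_hat = best_lambda(groups, grid)
print("lambda with minimum residual SS:", lam_hat)
```

Dividing by $G^{\lambda - 1}$ is what makes the residual sums of squares comparable across different values of $\lambda$; a finer grid or a one-dimensional optimizer could replace the coarse grid used here.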
A general approximation
Numerical evaluation of approximation

To evaluate the effectiveness of using $c = 0.5$, I conducted numerical calculations for several combinations of parameters. It is known that the distribution of individuals can be approximately described by a negative binomial distribution in most cases. McArdle et al. (1990) used a negative binomial distribution, whose parameters are subjected to the constraint of Eq. 2, to evaluate the stabilization effect of the logarithmic transformation. In a similar way, I use a negative binomial distribution defined by

$$P(x) = \binom{k + x - 1}{x} \left(1 + \frac{m}{k}\right)^{-k} \left(\frac{m}{m + k}\right)^{x} = \frac{\Gamma(k + x)}{\Gamma(k)\,\Gamma(x + 1)} \left(1 + \frac{m}{k}\right)^{-k} \left(1 + \frac{k}{m}\right)^{-x} \qquad (11)$$

whose parameter $k$ is subjected to the constraint of Eq. 1 or 2. We should also calculate the effect of transformation for a corresponding continuous distribution to evaluate how a discrete distribution approaches a continuous distribution by adding 0.5. Eq. 11 can be approximately described by a gamma distribution with a shape parameter $k$ and a rate parameter $k/m$ (that is, a scale of $m/k$), if $m$ is large:

$$P(x) = \frac{1}{\Gamma(k)}\, x^{k - 1} \left(\frac{k}{m}\right)^{k} \exp\left(-\frac{k}{m}x\right) \qquad (12)$$

Hence, we use Eq. 12 whose parameters have the same constraint in its mean and variance. Bartlett (1936) used a similar approach to evaluate the effect of the square root transformation. I calculated only the transformation for a realistic range, $s^2 \geq m$, for each combination of parameters.

When we have a relation $s^2 = m^2$, which corresponds to $g = 0$ and $h = 1$ in Eq. 1 or $a = 1$ and $b = 2$ in Eq. 2, Eqs. 6 and 7 recommend a logarithmic transformation, $\log_e(x)$. In this case, the transformed variable of a gamma distribution showed a conspicuous homoscedasticity, as indicated by the horizontality of the dotted line in Fig. 2. The variance of the transformed variable of a negative binomial distribution converges to that of the gamma distribution as the mean increases. The convergence is much influenced by the value of $c$. Among the calculated values of $c$, $c = 0.2$ seems to be most preferable in this situation, because it shows superior horizontality. $c = 0.5$ is not the best choice in this case, but it is preferable to $c = 1$.
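A calculation of this kind can be sketched as follows for the case $s^2 = m^2$. This is not the author's program, only a minimal illustration under stated assumptions: it relies on the standard negative binomial variance $m + m^2/k$, so the constraint $s^2 = m^2$ gives $k = m/(m - 1)$ for $m > 1$; for the gamma distribution of Eq. 12 the variance is $m^2/k$, so the same constraint gives $k = 1$, for which the variance of $\log_e(x)$ is the standard value $\pi^2/6$. The function names and the chosen means are hypothetical.

```python
import math

def nb_log_pmf(x, m, k):
    """Natural log of Eq. 11 for count x, mean m, and parameter k."""
    return (math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
            - k * math.log1p(m / k) - x * math.log1p(k / m))

def k_for_square_law(m):
    """k such that the negative binomial variance m + m**2/k equals m**2 (needs m > 1)."""
    return m / (m - 1.0)

def var_log_x_plus_c(m, k, c, tol=1e-12, max_x=1_000_000):
    """Variance of log_e(x + c) under Eq. 11, summing over x = 0, 1, 2, ..."""
    s0 = s1 = s2 = 0.0
    for x in range(max_x):
        p = math.exp(nb_log_pmf(x, m, k))
        y = math.log(x + c)
        s0 += p
        s1 += p * y
        s2 += p * y * y
        if x > m and p < tol:  # beyond the mean with negligible probability left
            break
    mean = s1 / s0
    return s2 / s0 - mean * mean

# Variance of log_e(x) for the gamma distribution of Eq. 12 with k = 1.
GAMMA_LIMIT = math.pi ** 2 / 6

for m in (2.0, 5.0, 20.0, 100.0, 200.0):
    k = k_for_square_law(m)
    row = [round(var_log_x_plus_c(m, k, c), 3) for c in (0.2, 0.5, 1.0)]
    print(m, row, round(GAMMA_LIMIT, 3))
```

Each printed row gives the variance of $\log_e(x + c)$ for $c = 0.2$, $0.5$, and $1$ at one mean, next to the gamma reference value; how flat each column is across means corresponds to the horizontality discussed for Fig. 2.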
Fig. 4. Effect of adding constant ($c$) on the stabilization of variance of a negative binomial distribution with a constraint $s^2 = m^{1.5}$ that corresponds to $a = 1$ and $b = 1.5$ in Eq. 2. A power transformation, $(x + c)^{0.25}$, is used. Meaning of each curve is the same as in Fig. 2

Fig. 5. Effect of adding constant ($c$) on the stabilization of variance of a negative binomial distribution with a constraint $s^2 = 0.5(m + m^2)$ that corresponds to $g = 0.5$ and $h = 0.5$ in Eq. 1. An arc-hyperbolic transformation, $\log_e(\sqrt{x + c} + \sqrt{x + c + 1})$, is used. Meaning of each curve is the same as in Fig. 2
not guarantee the assumption of the F-test. This transformation may also be unsuitable for describing the population dynamics. Notice that the population dynamics should be defined for the total population in an area, not for the number of observed individuals, because the latter is influenced by sampling variability that is not of interest to us. In most cases, a zero observation indicates that the population density is very low but does not indicate that the population is truly zero. If the sample size is extremely large, zero will not occur in many cases. In that sense, zero data are artifacts that derive from the deficiency in sampling effort. It is then preferable to use an approximation for the dynamics of the logarithm of the true total population. Let us imagine that the sample size becomes $r$ times larger to obtain the total population $N$. Then, the frequency distribution of $N/r$ is a continuous alternative to the distribution of $x$, if $r$ is extremely large. By the same argument shown in Fig. 1, therefore, the $\log_e(x + 0.5)$ transformation yields an approximation for the distribution of $\log_e(N/r)$, i.e., the distribution of $\log_e(N) - \log_e(r)$. Hence, the $\log_e(x + 0.5)$ transformation seems to be preferable to $\log_e(x + 1)$ for evaluating the population dynamics as well as for performing ANOVA.

One of the possible misuses of the logarithmic transformation is to add a constant to the mean population instead of the total population before transformation. As an illustration, let us consider that 100 plants are examined and $x$ individuals are observed on them. In this case, if we use the logarithmic transformation after adding 0.5 to the mean population ($x/100$), the transformation corresponds to the logarithmic transformation using $c = 50$, because we have $\log_e(x/100 + 0.5) = \log_e(x + 50) - \log_e(100)$. Taylor series expansions about $x$ around 0 yield:

$$\log_e(x + c) \approx \log_e(c) + \frac{x}{c} - \frac{x^2}{2c^2} + \cdots \qquad (13)$$

If $c/x$ is large, therefore, the transformation formula approaches $f(x) = \log_e(c) + x/c$, i.e., no transformation. Thus, the logarithmic transformation may become meaningless if 0.5 is added to the mean population in this case. I recommended $c = 0.5$ because it is half of the discrete unit (see Fig. 1). When we analyze the mean population, $x/100$, however, the discrete unit is 1/100, and hence we should add 0.5/100 before the logarithmic transformation in such a case. Thus, we should select the constant more carefully when we analyze the mean density instead of the total population.

Acknowledgments I thank Dr. K. Kiritani for his comments on the manuscript.

References

Anscombe FJ (1948) The transformation of Poisson, binomial and negative-binomial data. Biometrika 35:246–254
Bartlett MS (1936) The square root transformation in analysis of variance. J R Stat Soc Suppl 3:68–78
Bartlett MS (1947) The use of transformations. Biometrics 3:39–52
Beall G (1942) The transformation of data from entomological field experiments so that the analysis of variance becomes applicable. Biometrika 32:243–262
Berry DA (1987) Logarithmic transformations in ANOVA. Biometrics 43:439–456
Bliss CI (1941) Statistical problems in estimating populations of Japanese beetle larvae. J Econ Entomol 34:221–232
Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc B Met 26:211–252
Gaston KJ, McArdle BH (1993) Measurement of variation in the size of populations in space and time: some points of clarification. Oikos 68:357–360
Griffiths DA (1980) Interval estimation for the three-parameter lognormal distribution via the likelihood function. Appl Stat 29:58–68
Hill BM (1963) The three-parameter lognormal distribution and Bayesian analysis of a point-source epidemic. J Am Stat Assoc 58:72–84
Itô Y, Kitching RL (1986) The importance of non-linearity: a comment on the view of Taylor. Res Popul Ecol 28:39–42
Iwao S (1968) A new regression method for analyzing the aggregation pattern of animal populations. Res Popul Ecol 10:1–20
Iwao S, Kuno E (1968) Use of the regression of mean crowding on mean density for estimating sample size and the transformation of data for the analysis of variance. Res Popul Ecol 10:210–214
Iwao S, Kuno E (1971) An approach to the analysis of aggregation pattern in biological populations. In: Patil GP, Pielou EC, Waters WE (eds) Statistical ecology, vol 1. Spatial patterns and statistical distributions. Pennsylvania State University Press, London, pp 461–513
Kuno E (1991) Sampling and analysis of insect populations. Annu Rev Entomol 36:285–304
Leps J (1993) Taylor's power law and the measurement of variation in the size of population in space and time. Oikos 68:349–356
McArdle BH, Gaston KJ (1992) Comparing population variabilities. Oikos 64:610–612
McArdle BH, Gaston KJ (1993) The temporal variability of populations. Oikos 67:187–191
McArdle BH, Gaston KJ (1995) The temporal variability of densities: back to basics. Oikos 74:165–171
McArdle BH, Gaston KJ, Lawton JH (1990) Variation in the size of animal populations: patterns, problems and artefacts. J Anim Ecol 59:439–454
Perry JN, Woiwod IP (1992) Fitting Taylor's power law. Oikos 65:538–542
Routledge RD, Swartz TB (1991) Taylor's power law re-examined. Oikos 60:107–112
Stuart A, Ord JK (1994) Kendall's advanced theory of statistics, vol 1. Distribution theory, 6th edn. Arnold, London
Taylor LR (1961) Aggregation, variance and the mean. Nature (Lond) 189:732–735
Taylor LR (1984) Assessing and interpreting the spatial distribution of insect populations. Annu Rev Entomol 29:321–357
Taylor LR, Woiwod IP, Perry JN (1978) The density-dependence of spatial behaviour and the rarity of randomness. J Anim Ecol 47:383–406
Williamson M (1984) The measurement of population variability. Ecol Entomol 9:239–241
Yamamura K (1999) Key-factor/key-stage analysis for life table data. Ecology 80:533–537
Yates F (1934) Contingency tables involving small numbers and the χ² test. J Roy Stat Soc Suppl 1:217–235