Lognormal Distribution and Using L-Moment Method For Estimating Its Parameters


INTERNATIONAL JOURNAL OF MATHEMATICAL MODELS AND METHODS IN APPLIED SCIENCES

Lognormal distribution and using L-moment method for estimating its parameters
Diana Bílková

Abstract—L-moments are based on linear combinations of order statistics. The theory of L-moments covers the summarization and description of sample data sets, the summarization and description of theoretical distributions, the estimation of parameters of probability distributions and hypothesis testing for parameters of probability distributions. L-moments can be defined for any random variable whose mean exists. Within the scope of modeling income or wage distribution we currently use the method of conventional moments, the quantile method or the maximum likelihood method. The theory of L-moments parallels these other theories, and the main advantage of the method of L-moments over these methods is that L-moments suffer less from the impact of sampling variability. L-moments are more robust and they provide more reliable results, mainly in the case of small samples.
Common statistical methodology for the description of statistical samples is based on conventional moments or cumulants. An alternative approach is based on different characteristics, which are called L-moments. L-moments are analogous to conventional moments, but they are based on linear combinations of the rank statistics, i.e. the L-statistics. Using L-moments is theoretically more appropriate than using conventional moments because L-moments characterize a wider range of distributions. When estimating from a sample, L-moments are more robust to the existence of outliers in the data. Experience shows that, in comparison with conventional moments, L-moments are more difficult to distort and in finite samples they converge faster to the asymptotic normal distribution. Parameter estimates obtained using L-moments are, especially in the case of small samples, often more precise than estimates calculated using the maximum likelihood method.
This text is concerned with the application of L-moments in the case of larger samples and with the comparison of the precision of the method of L-moments with the precision of other methods (moment, quantile and maximum likelihood method) of parameter estimation in the case of larger samples. The three-parametric lognormal distribution is the basis of these analyses.

Keywords—Income distribution, L-moments, lognormal distribution, wage distribution.

Fig. 1 Basic information about the Czech Republic

Manuscript received October xx, 2011: Revised version received March xx, 2011. This work was supported by grant project IGS 24/2010 called “Analysis of the Development of Income Distribution in the Czech Republic since 1990 to the Financial Crisis and Comparison of This Development with the Development of the Income Distribution in Times of Financial Crisis − According to Sociological Groups, Gender, Age, Education, Profession Field and Region” from the University of Economics in Prague.
D. Bílková is with the University of Economics in Prague, Faculty of Informatics and Statistics, Department of Statistics and Probability, Sq. W. Churchill 1938/4, 130 67 Prague 3, Czech Republic (corresponding author to provide phone: +420 224 095 484; e-mail: [email protected]).

I. INTRODUCTION

The question of income and wage models is extensively covered in the statistical literature, see for example [8] − [9]. The database for these calculations is composed of two parts: firstly, the individual data on net annual household income per capita in the Czech Republic (in CZK), and secondly, the interval frequency distribution of gross monthly wage in the Czech Republic (in CZK). The aim of this work is to compare the accuracy of using the L-moment method

Issue 1, Volume 6, 2012 30



of parameter estimation on the individual data with the accuracy of using this method on data arranged in the form of an interval frequency distribution. Another aim of this paper is to compare the accuracy of different methods of parameter estimation with the accuracy of the method of L-moments. The three-parametric lognormal distribution was the fundamental theoretical distribution for these calculations. Individual data on net annual household income per capita come from the statistical survey Microcensus (years 1992, 1996, 2002) and from the statistical survey EU-SILC − European Union Statistics on Income and Living Conditions (years 2005, 2006, 2007, 2008) organized by the Czech Statistical Office. The data in the form of interval frequency distribution come from the website of the Czech Statistical Office. Fig. 1 presents current basic information about the location of the Czech Republic in Europe and about the Czech Republic itself.

II. METHODS

A. Three-Parametric Lognormal Distribution

The essence of the lognormal distribution is treated in detail for example in [2]. Use of the lognormal distribution in connection with income or wage distributions is described in [1] or [2].

Random variable X has three-parametric lognormal distribution LN(µ,σ²,θ) with parameters µ, σ² and θ, where −∞ < µ < ∞, σ² > 0 and −∞ < θ < ∞, if its probability density function f(x; µ,σ²,θ) has the form

$$ f(x; \mu, \sigma^2, \theta) = \begin{cases} \dfrac{1}{\sigma (x - \theta) \sqrt{2\pi}} \exp\left\{ -\dfrac{[\ln (x - \theta) - \mu]^2}{2\sigma^2} \right\}, & x > \theta, \\ 0, & \text{else.} \end{cases} \tag{1} $$

Random variable

$$ Y = \ln (X - \theta) \tag{2} $$

has a normal distribution N(µ,σ²) and random variable

$$ U = \frac{\ln (X - \theta) - \mu}{\sigma} \tag{3} $$

has a standardized normal distribution N(0, 1). The parameter µ is the expected value of random variable (2) and the parameter σ² is the variance of this random variable. Parameter θ is the theoretical minimum of random variable X. Figs. 2 and 3 represent the probability density functions of three-parametric lognormal curves depending on the values of their parameters.

The expected value (4) is the basic moment location characteristic of a random variable X having three-parametric lognormal distribution

$$ E(X) = \theta + e^{\mu + \sigma^2/2}. \tag{4} $$

The 100·P% quantile is the basic quantile location characteristic of a random variable X

$$ x_P = \theta + e^{\mu + \sigma u_P}, \tag{5} $$

where 0 < P < 1 and $u_P$ is the 100·P% quantile of the standardized normal distribution. Substituting P = 0.5 into relation (5), we get the 50% quantile of the three-parametric lognormal distribution, which is called the median

$$ \tilde{x} = \theta + e^{\mu}. \tag{6} $$

Fig. 2 Probability density function for the values of parameters σ = 2, θ = −2 (curves for µ = 1, 2, 3, 4, 5)

Fig. 3 Probability density function for the values of parameters µ = 3, θ = −2 (curves for σ = 1, 2, 3, 4, 5)

The median (6) divides the range of values of random variable X into two equally likely parts. The mode (7) of random variable X is another often used location characteristic of three-parametric lognormal distribution

$$ \hat{x} = \theta + e^{\mu - \sigma^2}. \tag{7} $$

The variance (8) of random variable X is a basic variability characteristic of three-parametric lognormal distribution


$$ D(X) = e^{2\mu + \sigma^2} (e^{\sigma^2} - 1). \tag{8} $$

Standard deviation (9) is the square root of the variance and it represents another moment variability characteristic of the considered theoretical distribution

$$ \sqrt{D(X)} = e^{\mu + \sigma^2/2} \sqrt{e^{\sigma^2} - 1}. \tag{9} $$

The coefficient of variation (10) is a characteristic of relative variability of this distribution and we get it by dividing the standard deviation by the expected value of the distribution

$$ V(X) = \frac{e^{\mu + \sigma^2/2} \sqrt{e^{\sigma^2} - 1}}{\theta + e^{\mu + \sigma^2/2}}. \tag{10} $$

It is a dimensionless characteristic of variability.

The coefficient of skewness (11) and the coefficient of kurtosis (12) belong to the basic moment shape characteristics of the distribution

$$ \beta_1(X) = (e^{\sigma^2} + 2) \sqrt{e^{\sigma^2} - 1}, \tag{11} $$

$$ \beta_2(X) = e^{4\sigma^2} + 2 e^{3\sigma^2} + 3 e^{2\sigma^2} - 3. \tag{12} $$

B. Methods of Point Parameter Estimation

The question of parameter estimation of three-parametric lognormal distribution is already well developed in statistical literature, see for example [3]. We can use various methods to estimate the parameters of three-parametric lognormal distribution. Examples are the moment method, quantile method, maximum likelihood method, method of L-moments, Kemsley's method, Cohen's method or graphical method.

Moment method

The essence of the moment method of parameter estimation lies in the fact that we set the sample moments equal to the corresponding theoretical moments. We can combine the general and the central moments. This method of estimating parameters is indeed very easy to use, but it is very inaccurate. In particular, the estimate of theoretical variance by its sample counterpart is very inaccurate. However, in the case of income and wage distribution we work with large sample sizes, and therefore the use of the moment method of parameter estimation may not be a hindrance in terms of efficiency of estimators.

In the case of the moment method of parameter estimation of three-parametric lognormal distribution we set the sample arithmetic mean $\bar{x}$ equal to the expected value of random variable X and we set the sample second central moment $m_2$ equal to the variance of random variable X. Furthermore, we set the sample third central moment $m_3$ equal to the theoretical third central moment of random variable X and we get the third equation. We obtain a system of moment equations

$$ \bar{x} = \tilde{\theta} + e^{\tilde{\mu} + \tilde{\sigma}^2/2}, \tag{13} $$

$$ m_2 = e^{2\tilde{\mu} + \tilde{\sigma}^2} (e^{\tilde{\sigma}^2} - 1), \tag{14} $$

$$ m_3 = e^{3\tilde{\mu} + \frac{3}{2}\tilde{\sigma}^2} (e^{\tilde{\sigma}^2} - 1)^2 (e^{\tilde{\sigma}^2} + 2). \tag{15} $$

We obtain from equations (14) and (15)

$$ b_1^2 = m_3^2 \cdot m_2^{-3} = (e^{\tilde{\sigma}^2} - 1)(e^{\tilde{\sigma}^2} + 2)^2, \tag{16} $$

and therefore we also obtain the moment parameter estimates of three-parametric lognormal distribution from the system of equations (13) to (15)

$$ \tilde{\sigma}^2 = \ln \left[ \sqrt[3]{1 + \tfrac{1}{2} b_1^2 + \sqrt{\left(1 + \tfrac{1}{2} b_1^2\right)^2 - 1}} + \sqrt[3]{1 + \tfrac{1}{2} b_1^2 - \sqrt{\left(1 + \tfrac{1}{2} b_1^2\right)^2 - 1}} - 1 \right], \tag{17} $$

$$ \tilde{\mu} = \frac{1}{2} \ln \frac{m_2}{e^{\tilde{\sigma}^2} (e^{\tilde{\sigma}^2} - 1)}, \tag{18} $$

$$ \tilde{\theta} = \bar{x} - e^{\tilde{\mu} + \tilde{\sigma}^2/2}. \tag{19} $$

Quantile method and Kemsley's method

The quantile method of parameter estimation of three-parametric lognormal distribution is based on the use of three sample quantiles, namely the 100⋅P1% quantile, the 100⋅P2% quantile and the 100⋅P3% quantile, where P2 = 0.5 and P3 = 1 − P1, and thus $u_{P_2} = 0$ and $u_{P_3} = -u_{P_1}$. We create a system of quantile equations by substituting into (5)

$$ x^{V}_{P_1} = \theta^{*} + e^{\mu^{*} + \sigma^{*} u_{P_1}}, \tag{20} $$

$$ x^{V}_{0.5} = \theta^{*} + e^{\mu^{*}}, \tag{21} $$

$$ x^{V}_{1 - P_1} = \theta^{*} + e^{\mu^{*} - \sigma^{*} u_{P_1}}, \tag{22} $$

where $x^{V}_{P_1}$, $x^{V}_{0.5}$ and $x^{V}_{1 - P_1}$ are the corresponding sample quantiles. We obtain quantile parameter estimates of three-parametric lognormal distribution from the system of quantile equations (20) to (22)


$$ \sigma^{*2} = \left[ \frac{\ln \dfrac{x^{V}_{P_1} - x^{V}_{0.5}}{x^{V}_{0.5} - x^{V}_{1 - P_1}}}{u_{P_1}} \right]^2, \tag{23} $$

$$ \mu^{*} = \ln \frac{x^{V}_{P_1} - x^{V}_{1 - P_1}}{e^{\sigma^{*} u_{P_1}} - e^{-\sigma^{*} u_{P_1}}}, \tag{24} $$

$$ \theta^{*} = x^{V}_{0.5} - e^{\mu^{*}}. \tag{25} $$

The sample median can be replaced by the sample arithmetic mean. Then we solve a similar system of equations as in the case of the quantile method. This method is called Kemsley's method.

Maximum likelihood method and Cohen's method

If the value of the parameter θ is known, the likelihood function is maximized when the likelihood parameter estimates of three-parametric lognormal distribution have the form

$$ \hat{\mu} = \frac{\sum_{i=1}^{n} \ln (x_i - \theta)}{n}, \tag{26} $$

$$ \hat{\sigma}^2 = \frac{\sum_{i=1}^{n} [\ln (x_i - \theta) - \hat{\mu}]^2}{n}. \tag{27} $$

If the value of parameter θ is not known, this problem is considerably more complicated. If the parameter θ is estimated by the sample minimum

$$ \hat{\theta} = x^{V}_{\min}, \tag{28} $$

the likelihood function is unbounded. The maximum likelihood method is therefore sometimes combined with Cohen's method. In this procedure, we set the smallest sample value equal to the 100 ⋅ (n + 1)⁻¹ % quantile

$$ x^{V}_{\min} = \hat{\theta} + e^{\hat{\mu} + \hat{\sigma} u_{(n + 1)^{-1}}}. \tag{29} $$

Equation (29) is then combined with the system of equations (26) and (27).

L-moment method

The question of L-moments is described in detail for example in [10]. We will assume that X is a real random variable with the distribution function F(x) and quantile function x(F) and that $X_{1:n} \le X_{2:n} \le \dots \le X_{n:n}$ are the rank statistics of a random sample of size n selected from the distribution of X. Then the r-th L-moment of the random variable X is defined as

$$ \lambda_r = \frac{1}{r} \sum_{k=0}^{r-1} (-1)^k \binom{r-1}{k} E X_{r-k:r}, \quad r = 1, 2, 3, \dots. \tag{30} $$

The letter ‘L’ in the name ‘L-moments’ stresses the fact that the r-th L-moment λr is a linear function of the expected rank statistics. The natural estimate of the L-moment λr based on the observed sample is furthermore a linear combination of the ordered values, i.e. a so-called L-statistic. The expected value of the rank statistic is of the form

$$ E X_{j:r} = \frac{r!}{(j - 1)! \, (r - j)!} \int x \, [F(x)]^{j-1} [1 - F(x)]^{r-j} \, \mathrm{d}F(x). \tag{31} $$

If we substitute equation (31) into equation (30), we get after some operations

$$ \lambda_r = \int_0^1 x(F) \, P^{*}_{r-1}(F) \, \mathrm{d}F, \quad r = 1, 2, 3, \dots, \tag{32} $$

where

$$ P^{*}_{r}(F) = \sum_{k=0}^{r} p^{*}_{r,k} F^k \tag{33} $$

and

$$ p^{*}_{r,k} = (-1)^{r-k} \binom{r}{k} \binom{r+k}{k}, \tag{34} $$

where $P^{*}_{r}(F)$ represents the r-th shifted Legendre polynomial, which is related to the usual Legendre polynomials. Shifted Legendre polynomials are orthogonal on the interval (0, 1) with a constant weight function. The first four L-moments are of the form

$$ \lambda_1 = E X = \int_0^1 x(F) \, \mathrm{d}F, \tag{35} $$

$$ \lambda_2 = \frac{1}{2} E (X_{2:2} - X_{1:2}) = \int_0^1 x(F) \, (2F - 1) \, \mathrm{d}F, \tag{36} $$

$$ \lambda_3 = \frac{1}{3} E (X_{3:3} - 2 X_{2:3} + X_{1:3}) = \int_0^1 x(F) \, (6F^2 - 6F + 1) \, \mathrm{d}F, \tag{37} $$

$$ \lambda_4 = \frac{1}{4} E (X_{4:4} - 3 X_{3:4} + 3 X_{2:4} - X_{1:4}) = \int_0^1 x(F) \, (20F^3 - 30F^2 + 12F - 1) \, \mathrm{d}F. \tag{38} $$
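Equations (32) to (34) can be checked numerically. The following is a minimal sketch, not the author's implementation: it builds the shifted Legendre coefficients from (34) and approximates the integral (32) with a simple midpoint rule. The function names, the number of integration steps and the example parameter values are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def shifted_legendre_coeffs(r):
    """Coefficients p*_{r,k} for k = 0..r, as in equation (34)."""
    return [(-1) ** (r - k) * math.comb(r, k) * math.comb(r + k, k)
            for k in range(r + 1)]


def shifted_legendre(r, F):
    """Value of the shifted Legendre polynomial P*_r(F), equation (33)."""
    return sum(p * F ** k for k, p in enumerate(shifted_legendre_coeffs(r)))


def l_moment(quantile, r, steps=20000):
    """Midpoint-rule approximation of lambda_r from equation (32).

    `quantile` is the quantile function x(F) on the open interval (0, 1).
    The step count is an arbitrary illustrative choice.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        F = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += quantile(F) * shifted_legendre(r - 1, F)
    return total * h


def lognormal3_quantile(mu, sigma, theta):
    """Quantile function (5) of the three-parametric lognormal distribution."""
    std_normal = NormalDist()
    return lambda F: theta + math.exp(mu + sigma * std_normal.inv_cdf(F))


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    x_of_F = lognormal3_quantile(mu=1.0, sigma=0.5, theta=-2.0)
    for r in range(1, 5):
        print(r, l_moment(x_of_F, r))
```

For r = 3 and r = 4 the coefficient lists reproduce the polynomials 6F² − 6F + 1 and 20F³ − 30F² + 12F − 1 that appear in (37) and (38). The midpoint rule is used here because it never evaluates the quantile function at F = 0 or F = 1, where it is unbounded for the lognormal distribution; a more careful quadrature would be preferable for heavy upper tails.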


Details about the L-moments can be found in [4] or [5]. The coefficients of the L-moments are defined as

$$ \tau_r = \frac{\lambda_r}{\lambda_2}, \quad r = 3, 4, 5, \dots. \tag{39} $$

L-moments λ1, λ2, λ3, …, λr and coefficients of L-moments τ3, τ4, …, τr can be used as characteristics of the distribution. L-moments are in a way similar to the conventional central moments and coefficients of L-moments are similar to the moment ratios. In particular, the L-moments λ1 and λ2 and the coefficients of L-moments τ3 and τ4 are considered to be characteristics of location, variability, skewness and kurtosis, respectively.

Using the equations (35) to (37) and the equation (39), we get the first three L-moments of the three-parametric lognormal distribution LN(µ, σ², ξ), which is described e.g. in [5]. The following relations are valid for these L-moments

$$ \lambda_1 = \xi + \exp\left( \mu + \frac{\sigma^2}{2} \right), \tag{40} $$

$$ \lambda_2 = \exp\left( \mu + \frac{\sigma^2}{2} \right) \cdot \operatorname{erf}\left( \frac{\sigma}{2} \right), \tag{41} $$

$$ \tau_3 = \frac{6 \pi^{-1/2}}{\operatorname{erf}\left( \dfrac{\sigma}{2} \right)} \cdot \int_0^{\sigma/2} \operatorname{erf}\left( \frac{x}{\sqrt{3}} \right) \cdot \exp(-x^2) \, \mathrm{d}x, \tag{42} $$

where erf(z) is the so-called error function defined as

$$ \operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \cdot \int_0^z e^{-t^2} \, \mathrm{d}t. \tag{43} $$

Now we will assume that x1, x2, …, xn is a random sample and $x_{1:n} \le x_{2:n} \le \dots \le x_{n:n}$ is the ordered sample. The r-th sample L-moment is defined as

$$ l_r = \binom{n}{r}^{-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \frac{1}{r} \sum_{k=0}^{r-1} (-1)^k \binom{r-1}{k} x_{i_{r-k}:n}, \quad r = 1, 2, \dots, n. \tag{44} $$

We can write specifically for the first four sample L-moments

$$ l_1 = n^{-1} \sum_{i} x_i, \tag{45} $$

$$ l_2 = \frac{1}{2} \binom{n}{2}^{-1} \sum_{i > j} (x_{i:n} - x_{j:n}), \tag{46} $$

$$ l_3 = \frac{1}{3} \binom{n}{3}^{-1} \sum_{i > j > k} (x_{i:n} - 2 x_{j:n} + x_{k:n}), \tag{47} $$

$$ l_4 = \frac{1}{4} \binom{n}{4}^{-1} \sum_{i > j > k > l} (x_{i:n} - 3 x_{j:n} + 3 x_{k:n} - x_{l:n}). \tag{48} $$

Sample L-moments can be used similarly to the conventional sample moments because they characterize basic properties of the sample distribution and estimate the corresponding properties of the distribution from which the data were sampled. They can also be used to estimate the parameters of this distribution. In these cases L-moments are often used instead of the conventional moments because, as linear functions of the data, they are less sensitive than the conventional moments to sample variability and to the size of errors when extreme values are present in the data. Therefore it is assumed that the L-moments provide more precise and robust estimates of the characteristics or parameters of the population probability distribution.

Let us denote the distribution function of the standard normal distribution as Φ; then Φ⁻¹ represents the quantile function of the standard normal distribution. The following relation holds for the distribution function of the three-parametric lognormal distribution LN(µ, σ², ξ)

$$ F = \Phi\left( \frac{\ln (x - \xi) - \mu}{\sigma} \right). \tag{49} $$

The coefficients of L-moments (39) are then commonly estimated using the following estimates

$$ t_r = \frac{l_r}{l_2}, \quad r = 3, 4, 5, \dots. \tag{50} $$

The parameter estimates of the three-parametric lognormal distribution can then be calculated as

$$ z = \sqrt{\frac{8}{3}} \cdot \Phi^{-1}\left( \frac{1 + t_3}{2} \right), \tag{51} $$

$$ \hat{\sigma} \approx 0.999281 \, z - 0.006118 \, z^3 + 0.000127 \, z^5, \tag{52} $$

$$ \hat{\mu} = \ln \left[ \frac{l_2}{\operatorname{erf}\left( \dfrac{\hat{\sigma}}{2} \right)} \right] - \frac{\hat{\sigma}^2}{2}, \tag{53} $$

$$ \hat{\theta} = l_1 - \exp\left( \hat{\mu} + \frac{\hat{\sigma}^2}{2} \right). \tag{54} $$

More on L-moments can be found for example in [6], [11] or [12].
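The estimation procedure in equations (45) to (47) and (50) to (54) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's code: the function names are hypothetical, and the sample L-moments are computed by brute-force enumeration of pairs and triples exactly as written in (46) and (47), which is only practical for small samples (the cost grows with n³).

```python
import math
from itertools import combinations
from statistics import NormalDist


def sample_l_moments(data):
    """First three sample L-moments, equations (45) to (47).

    Brute-force enumeration; needs at least three observations.
    """
    x = sorted(data)
    n = len(x)
    l1 = sum(x) / n
    # combinations() of a sorted list yields tuples in ascending order,
    # so `lo <= hi` and `lo <= mid <= hi` below.
    l2 = sum(hi - lo for lo, hi in combinations(x, 2)) / (2 * math.comb(n, 2))
    l3 = sum(hi - 2 * mid + lo
             for lo, mid, hi in combinations(x, 3)) / (3 * math.comb(n, 3))
    return l1, l2, l3


def lognormal3_from_l_moments(l1, l2, l3):
    """L-moment estimates (mu, sigma^2, theta), equations (50) to (54)."""
    t3 = l3 / l2                                                     # (50)
    z = math.sqrt(8.0 / 3.0) * NormalDist().inv_cdf((1 + t3) / 2)    # (51)
    sigma = 0.999281 * z - 0.006118 * z ** 3 + 0.000127 * z ** 5     # (52)
    mu = math.log(l2 / math.erf(sigma / 2)) - sigma ** 2 / 2         # (53)
    theta = l1 - math.exp(mu + sigma ** 2 / 2)                       # (54)
    return mu, sigma ** 2, theta


if __name__ == "__main__":
    # Tiny made-up sample, used only to show the call sequence.
    l1, l2, l3 = sample_l_moments([1.0, 2.0, 4.0, 8.0, 16.0])
    print(lognormal3_from_l_moments(l1, l2, l3))
```

The approximation (52) assumes a positive sample L-skewness t3, so that z and the estimated σ are positive; for a symmetric or left-skewed sample the logarithm in (53) is not defined.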


C. Appropriateness of the Model

It is also necessary to assess the suitability of the constructed model or to choose a model from several alternatives. This is done using some criterion, which can be the sum of absolute deviations of the observed and theoretical frequencies for all intervals

$$ S = \sum_{i=1}^{k} \left| n_i - n \pi_i \right| \tag{55} $$

or the well-known criterion χ²

$$ \chi^2 = \sum_{i=1}^{k} \frac{(n_i - n \pi_i)^2}{n \pi_i}, \tag{56} $$

where ni are the observed frequencies in individual intervals, πi are the theoretical probabilities that a statistical unit belongs to the i-th interval, n is the total sample size of the corresponding statistical file, n ⋅ πi are the theoretical frequencies in individual intervals, i = 1, 2, ..., k, and k is the number of intervals.

The question of the appropriateness of a given curve as a model of the distribution of income and wage is not an entirely conventional mathematical-statistical problem in which we test the null hypothesis “H0: The sample comes from the supposed theoretical distribution” against the alternative hypothesis “H1: non H0”. In goodness of fit tests for income and wage distributions we frequently work with large sample sizes, and therefore the test would almost always lead to the rejection of the null hypothesis. This results not only from the fact that with such large sample sizes the power of the test is so high at the chosen significance level that the test uncovers even the slightest deviations between the actual income or wage distribution and the model, but also from the principle of the construction of the test. In practice, however, we are not interested in such small deviations, so only rough agreement of the model with reality is sufficient and we so to speak “borrow” the model (curve). The test criterion χ² can be used in that direction only tentatively. When evaluating the suitability of the model we proceed to a large extent subjectively and we rely on experience and logical analysis. More can be found for example in [2].

D. Other Characteristics of Differentiation

There are various characteristics of the variability of incomes and wages (or differentiation of incomes and wages): variance, standard deviation, coefficient of variation or Gini index. In this article, only the variance, standard deviation and coefficient of variation are used. As L-moments are of interest, we give a few comments on the relation between the two- and three-parametric lognormal distribution and characteristics of differentiation.

If we substitute θ = 0 into the formulas of three-parametric lognormal distribution, we obtain the two-parametric lognormal distribution. It follows from the formula (10) that the coefficient of variation depends only on the single parameter σ² in the case of two-parametric lognormal distribution

$$ V(X) = \sqrt{e^{\sigma^2} - 1}. \tag{57} $$

Formulas for the Gini coefficient can be found in the form

$$ G = \operatorname{erf}\left( \frac{\sigma}{2} \right) \tag{58} $$

or equivalently in the form

$$ G = 2 \, \Phi\left( \frac{\sigma}{\sqrt{2}} \right) - 1. \tag{59} $$

Unfortunately, this is not true in the case of the three-parametric lognormal distribution, where both characteristics depend on all three parameters, see (10) for the case of the coefficient of variation. We substitute r = 2 into the formula (30) and we obtain

$$ \lambda_2 = \frac{1}{2} E (X_{2:2} - X_{1:2}) = \frac{1}{2} E \left| X_1 - X_2 \right| \tag{60} $$

and we conclude that the Gini mean difference equals 2λ2 (see [5]). The Gini coefficient can be evaluated as λ2/λ1. We obtain for the Gini coefficient G of the three-parametric lognormal distribution the formula

$$ G = \frac{e^{\mu + \sigma^2/2} \cdot \operatorname{erf}\left( \dfrac{\sigma}{2} \right)}{\theta + e^{\mu + \sigma^2/2}}. \tag{61} $$

In this text, Gini coefficients are not included, but from the previous considerations the usefulness of L-moments in evaluating these characteristics is clear.

E. Four-Parametric Lognormal Distribution

Random variable X has four-parametric lognormal distribution LN(µ,σ²,θ,τ) with parameters µ, σ², θ and τ, where −∞ < µ < ∞, σ² > 0 and −∞ < θ < τ < ∞, if its probability density function f(x; µ,σ²,θ,τ) has the form

$$ f(x; \mu, \sigma^2, \theta, \tau) = \begin{cases} \dfrac{\tau - \theta}{\sigma (x - \theta)(\tau - x) \sqrt{2\pi}} \exp\left\{ -\dfrac{\left[ \ln \dfrac{x - \theta}{\tau - x} - \mu \right]^2}{2\sigma^2} \right\}, & \theta < x < \tau, \\ 0, & \text{else.} \end{cases} \tag{62} $$

Random variable


X −θ III. ANALYSIS AND RESULTS


Y = ln (63)
τ− X Tabs. 1 to 14 present the estimated parameters of three-
parametric lognormal curves using various methods of point
has a normal distribution N(µ,σ2) and random variable parameter estimation (method of L-moments, moment method,
quantile method and maximum likelihood method) and the
X −θ sample characteristics on the basis of these the parameters
ln −µ were estimated. We can see from Tables 7, 11 and 13 that the
τ− X (64)
U= value of the parameter θ (theoretical beginning of the
σ
distribution) is negative in many cases. This means that
has a standardized normal distribution N (0, 1). The parameter
µ is the expected value of random variable (63) and the Tab. 1 Sample L-moments − Income
parameter σ2 is the variance of this random variable. Sample L-moments
Year l1 l2 l3
Parameter θ is the theoretical minimum of random variable X
1992 35,246.51 7,874.26 2,622.14
and parameter τ is the theoretical maximum of this variable.
1996 66,121.92 16,237.54 5,685.46
Figs. 4 and 5 represent the probability density functions
2002 105,029.89 27,978.40 10,229.62
of four-parametric lognormal curves depending on the values
2005 111,023.71 28,340.18 9,113.57
of their parameters.
2006 114,945.08 28,800.68 9,286.18
2007 123,806.49 30,126.11 9,530.57
1,2 2008 132,877.19 31,078.96 9,702.45
µ = -2
1 Tab. 2 Parameter estimations of three-parametric lognormal
µ = -1,5
µ = -1 distribution obtained using the L-moment method − Income
0,8 µ = -0,5 Parameter estimation
µ=0 Year µ σ2 θ
f(x)

0,6 1992 9.696 0.490 14,491.687


1996 10.343 0.545 25,362.753
0,4 2002 10.819 0.598 37,685.637
2005 11.028 0.455 33,738.911
0,2 2006 11.040 0.458 36,606.903
2007 11.112 0.440 40,327.610
0 2008 11.163 0.428 45,634.578
2

6
4
8
2
6

4
8
2
6

4
8
2
6
0
0,
0,
1,
1,

2,
2,
3,
3,

4,
4,
5,
5,
-1

Tab. 3 Sample characteristics (arithmetic mean x , standard


1E

x
deviation s and coefficient of skewness b1) − Income
Fig. 4 Probability density function for the values of parameters Sample characteristics
σ = 0,8, θ = 0,5, τ = 6 Year x s b1
1992 35,247 19,364 7.815
0,6 1996 68,286 51,102 17.606
2002 105,030 83,598 17.142
0,5 σ^2 = 0,49 2005 111,024 77,676 14.907
σ^2 = 1,24 2006 114,945 74,503 10.395
σ^2 = 2 2007 123,806 74,578 7.727
0,4
σ^2 = 3,24 2008 132,877 73,982 6.979
σ^2 = 5
f(x)

0,3 Tab. 4 Parameter estimations of three-parametric lognormal


distribution obtained using the moment method − Income
0,2 Parameter estimation
Year µ σ2 θ
0,1 1992 8.883 1.173 22,284.335
1996 9.154 1.780 45,269.967
0 2002 9.668 1.760 66,925.879
0 0,4 0,8 1,2 1,6 2 2,4 2,8 3,2 3,6 4 4,4 4,8 5,2 5,6 6 2005 9.710 1.656 73,299.950
x
2006 9.976 1.386 71,936.249
Fig. 5 Probability density function for the values of parameters 2007 10.242 1.165 73,575.417
µ = 0,5, θ = 0,5, τ = 6 2008 10.328 1.089 80,180.795

Issue 1, Volume 6, 2012 36


INTERNATIONAL JOURNAL OF MATHEMATICAL MODELS AND METHODS IN APPLIED SCIENCES

Tab. 5 Sample quartiles − Income Tab. 9 Parameter estimations of three-parametric lognormal


Sample quartiles distribution obtained using the L-moment method − Wage
Year ~ ~
x 0 ,50 ~ Parameter estimation
x 0 ,25 x 0 ,75
Year µ σ2 θ
1992 25,900 31,000 39,298
1996 47,550 57,700 76,550 2002 9.238 0.388 4,952.259
2002 73,464 89,204 115,966 2003 9.402 0.332 4,364.869
2005 79,600 97,050 124,068 2004 9.313 0.442 5,872.138
2006 82,998 100,640 128,000 2005 9.392 0.424 5,908.390
2007 90,000 108,744 138,000 2006 9.393 0.447 6,795.207
2008 97,160 117,497 148,937 2007 9.222 0.724 9,349.280
2008 9.319 0.693 9,719.297
Tab. 6 Parameter estimations of three-parametric lognormal
distribution obtained using the quantile method − Income Tab. 10 Sample characteristics (arithmetic mean x , standard
Parameter estimation deviation s and coefficient of skewness b1) − Wage
Year µ σ2 θ Sample characteristics
1992 9.490 0.521 17,766.792 Year x s b1
1996 9.998 0.842 35,708.333 2002 17,437 8,321 1.817
2002 10.551 0.619 50,986.446 2003 18,663 8,657 1.354
2005 10.805 0.420 47,774.906 2004 19,698 9,804 1.614
2006 10.813 0.423 50,970.817 2005 20,738 10,180 1.481
2007 10.862 0.436 56,577.479 2006 21,803 10,477 1.419
2008 10.961 0.417 59,909.386 2007 23,883 13,776 2.338
2008 25,478 14,485 2.191
Tab. 7 Parameter estimations of three-parametric lognormal
distribution obtained using the maximum likelihood method − Tab. 11 Parameter estimations of three-parametric lognormal
Income distribution obtained using the moment method − Wage
Parameter estimation Parameter estimation
Year Year µ σ2 θ
µ σ2 θ 2002 9.492 0.264 2,311.688
1992 10.384 0.152 -0.342 2003 9.837 0.166 -1,681.293
1996 10.995 0.180 52.236 2004 9.779 0.221 -25.695
2002 11.438 0.211 73.525 2005 9.906 0.193 -1,339.601
2005 11.503 0.206 -2.050 2006 9.979 0.180 -1,805.527
2006 11.542 0.199 -8.805 2007 9.734 0.377 3,509.924
2007 11.623 0.190 -42.288 2008 9.851 0.345 2,920.381
2008 11.703 0.177 -171.167
Tab. 12 Sample quartiles − Wage
Tab. 8 Sample L-moments − Wage Sample quartiles
Sample quartiles Year ~ ~ ~
Year l1 l2 l3 x0 ,25 x 0 ,50 x 0 ,75
2002 17,437.49 4,251.48 1,267.44 2002 11,944 15,545 20,215
2003 18,663.18 4,524.95 1,251.90 2003 12,728 16,735 22,224
2004 19,697.57 5,001.34 1,586.09 2004 13,416 17,709 23,077
2005 20,738.14 5,262.93 1,636.67 2005 14,063 18,597 24,470
2006 21,803.28 5,454.74 1,738.23 2006 14,717 19,514 25,675
2007 23,882.83 6,577.65 2,627.93 2007 15,769 20,910 27,545
2008 25,477.59 6,993.72 2,737.94 2008 16,853 22,225 29,404

Issue 1, Volume 6, 2012 37


INTERNATIONAL JOURNAL OF MATHEMATICAL MODELS AND METHODS IN APPLIED SCIENCES

Tab. 13 Parameter estimations of three-parametric lognormal Figs. 14 to 20 represent the histograms of observed interval
distribution obtained using the quantile method − Wage frequency distribution of net annual household income per
Parameter estimation capita in 1992, 1996, 2002, 2005 to 2008. Histograms
Year of observed interval frequency distribution of gross monthly
µ σ2 θ
wage in 2002 to 2008 could not be constructed due to non-
2002 9.663 0.149 -185.316
uniform width of the individual intervals. The interval
2003 9.605 0.218 1,899.151
frequency distributions with unequal wide of intervals were
2004 9.974 0.110 -3,742.702 taken from the official website of the Czech Statistical Office
2005 9.897 0.147 -1,283.306 and the frequency distribution histogram would lose any visual
2006 9.983 0.138 -2,144.719 informative about the shape of the frequency distribution
2007 10.036 0.143 -1,919.373 in this case. Figs. 21 to 24 also provide approximate
2008 9.968 0.185 887.792 information about the accuracy of the used methods
of parameter estimation. Figs. 21 and 23 represent the
Tab. 14 Parameter estimations of three-parametric lognormal development of the sample arithmetic mean and the
distribution obtained using the maximum likelihood method − development of theoretical expected values of three-parametric
Wage lognormal distribution with parameters estimated using
Parameter estimation different methods of parameter estimation. Figs. 22 and 24
Year µ σ2 θ represent the development of the sample median and the
2002 8.977 0.828 6,364.635 development of theoretical medians of three-parametric
2003 9.024 0.615 6,679.910 lognormal distribution with parameters estimated using
2004 9.363 0.306 3,090.038 different methods of parameter estimation. It is important to
2005 9.400 0.329 4,134.624 note, however, that Figs. 21 and 23 give nothing about the
accuracy of moment method of parameter estimation, because
2006 9.159 0.742 8,070.167
equality of the sample arithmetic mean and theoretical
2007 9.487 0.369 2,586.616
expected value represents one of three moment equations.
2008 9.593 0.341 3,324.455
In this case, the course of development of sample arithmetic
mean coincides with the course of development of theoretical
lognormal curve gets into negative territory at the beginning
expected value of three-parametric lognormal distribution with
of its course.
Because of a very tight contact of the lower tail of the
0,00005
lognormal curve with the horizontal axes, this fact does not
probability density function

0,000045 Year 1992


have to be a problem for a good fit of the model. 0,00004
Year 1996
The advantage of the lognormal models is that the parameters 0,000035
Year 2002
0,00003
have an easy interpretation. Also some parametric functions 0,000025 Year 2005
of these models have straight interpretation. In the case that the 0,00002 Year 2006

estimated value of parameter θ is negative, we cannot really interpret this value.

Figs. 6 to 13 show the probability density functions of three-parametric lognormal curves whose parameters were estimated using different methods of parameter estimation. These figures also show the development of the theoretical income distribution in the years 1992, 1996, 2002 and 2005 to 2008 (Figs. 6 to 9) and the development of the theoretical wage distribution in the years 2002 to 2008 (Figs. 10 to 13).

Although the shapes of the probability density functions of three-parametric lognormal curves differ considerably between the methods of point parameter estimation used, we can observe certain trends in their development. We can see from Figs. 6 to 13 that, for both the income and the wage distribution, the characteristics of the level of these distributions increase gradually, and the characteristics of income and wage differentiation increase gradually, too. Therefore, the data cannot be considered homoskedastic, i.e. of constant variability across these distributions, because the characteristics of absolute variability grow over time. We also see from Figs. 6 to 13 the gradual decline of the characteristics of the shape of the distribution (skewness and kurtosis).

Fig. 6 Probability density function of net annual household income per capita − L-moment method (horizontal axis: net annual income per capita in CZK; vertical axis: probability density function)

Fig. 7 Probability density function of net annual household income per capita − Moment method (horizontal axis: net annual income per capita in CZK; vertical axis: probability density function; curves for the years 1992, 1996, 2002 and 2005 to 2008)
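The curves in Figs. 6 to 13 are densities of the three-parametric lognormal distribution. The following minimal sketch shows how such a density and its basic level and shape characteristics can be evaluated. It assumes the usual parametrization in which ln(X − θ) is normal with mean µ and variance σ², and it uses, purely as an illustration, the 2009 income parameters reported in Tab. 18; the function names are ours and do not come from the paper.

```python
import math

def ln3_pdf(x, mu, sigma2, theta):
    """Density of the three-parametric lognormal distribution (zero for x <= theta)."""
    if x <= theta:
        return 0.0
    sigma = math.sqrt(sigma2)
    y = x - theta
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))

def ln3_median(mu, theta):
    """Theoretical median: theta + exp(mu)."""
    return theta + math.exp(mu)

def ln3_mean(mu, sigma2, theta):
    """Theoretical expected value: theta + exp(mu + sigma^2 / 2)."""
    return theta + math.exp(mu + sigma2 / 2.0)

def ln3_skewness(sigma2):
    """Moment skewness of a lognormal curve; it depends on sigma^2 only."""
    w = math.exp(sigma2)
    return (w + 2.0) * math.sqrt(w - 1.0)

# Illustrative parameters (assumption: the 2009 income row of Tab. 18).
MU, SIGMA2, THETA = 11.176, 0.467, 42913.996

def integrate_pdf(step=100.0, upper=5_000_000.0):
    """Trapezoidal check that the density integrates to approximately 1."""
    n = int((upper - THETA) / step)
    total = 0.0
    prev = ln3_pdf(THETA, MU, SIGMA2, THETA)
    for i in range(1, n + 1):
        cur = ln3_pdf(THETA + i * step, MU, SIGMA2, THETA)
        total += 0.5 * (prev + cur) * step
        prev = cur
    return total

if __name__ == "__main__":
    print("median:", round(ln3_median(MU, THETA)))
    print("mean:  ", round(ln3_mean(MU, SIGMA2, THETA)))
    print("skewness:", round(ln3_skewness(SIGMA2), 3))
    print("integral of density:", round(integrate_pdf(), 6))
```

Because the moment skewness depends on σ² alone, a decline of σ² over time would show up as a decline of skewness, while θ only shifts the curve along the horizontal axis.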

Issue 1, Volume 6, 2012 38



Fig. 8 Probability density function of net annual household income per capita − Quantile method
Fig. 9 Probability density function of net annual household income per capita − Maximum likelihood method
(Figs. 8 and 9: horizontal axis: net annual income per capita in CZK; vertical axis: probability density function; curves for the years 1992, 1996, 2002 and 2005 to 2008)

Fig. 10 Probability density function of gross monthly wage − L-moment method
Fig. 11 Probability density function of gross monthly wage − Moment method
Fig. 12 Probability density function of gross monthly wage − Quantile method
Fig. 13 Probability density function of gross monthly wage − Maximum likelihood method
(Figs. 10 to 13: horizontal axis: gross monthly wage in CZK; vertical axis: probability density function; curves for the years 2002 to 2008)

Fig. 14 Interval frequency distribution of net annual household income per capita in 1992
Fig. 15 Interval frequency distribution of net annual household income per capita in 1996
(Figs. 14 and 15: horizontal axis: interval middle; vertical axis: absolute frequency)


Fig. 16 Interval frequency distribution of net annual household income per capita in 2002
Fig. 17 Interval frequency distribution of net annual household income per capita in 2005
Fig. 18 Interval frequency distribution of net annual household income per capita in 2006
Fig. 19 Interval frequency distribution of net annual household income per capita in 2007
Fig. 20 Interval frequency distribution of net annual household income per capita in 2008
(Figs. 16 to 20: horizontal axis: interval middle; vertical axis: absolute frequency)

Fig. 21 Development of sample average net annual income per capita and the theoretical expected value (in CZK)
Fig. 22 Development of sample median of net annual income per capita and the theoretical median (in CZK)
(Figs. 21 and 22: horizontal axis: year, 1992 to 2008; series: L-moment, Moment, Quantile, Maximum likelihood, Sample)

parameters estimated using the moment method of parameter estimation. The situation is similar for Figs. 22 and 24 in the case of the quantile method of parameter estimation, where the equality of the sample and theoretical median is one of the three quantile equations, and so the development of the sample median coincides with the development of the theoretical median of the three-parametric lognormal distribution with the parameters estimated using the quantile method. Figs. 21 to 24 show a high accuracy of all four methods used to estimate parameters on these data.

Using moment parameter estimation has some unpleasant specifics in the case of the income and wage distributions. The moments of higher order, including the moment characteristic of skewness, are very sensitive to inaccuracies on


Fig. 23 Development of sample average gross monthly wage and the theoretical expected value (in CZK)
Fig. 24 Development of sample median of gross monthly wage and the theoretical median (in CZK)
(Figs. 23 and 24: horizontal axis: year, 2002 to 2008; series: L-moment, Moment, Quantile, Maximum likelihood, Sample)

Tab. 15 Sum of absolute deviations of the observed and theoretical frequencies for all intervals − net annual household income per capita

Year   L-moment      Moment        Quantile      Maximum likelihood
1992   2,661.636     5,256.970     3,880.846     2,933.275
1996   5,996.435     15,673.846    9,677.446     7,181.322
2002   2,181.635     3,888.523     3,206.585     2,236.348
2005   1,158.556     2,261.200     1,331.944     1,237.170
2006   2,197.016     3,375.662     2,984.503     2,217.975
2007   2,359.258     3,654.637     2,995.680     2,585.448
2008   2,251.531     4,282.314     3,277.620     2,889.890

Tab. 16 Sum of absolute deviations of the observed and theoretical frequencies for all intervals − gross monthly wage

Year   L-moment      Moment        Quantile      Maximum likelihood
1992   134,846.633   314,497.134   292,479.483   289,279.267
1996   135,772.928   356,423.157   303,335.493   283,469.483
2002   252,042.801   357,087.483   335,019.202   295,900.939
2005   260,527.847   426,062.444   345,954.758   306,785.789
2006   277,661.535   448,632.374   372,420.681   357,828.202
2007   229,525.420   432,745.341   338,552.122   250,114.480
2008   255,510.389   441,371.539   372,924.579   289,621.287

Fig. 25 The trend function in the development of the first sample L-moment of net annual household income per capita; S-curve trend = exp(11.9932 − 1.55335/t) (forecasts: 133,122.0; 136,026.0)
Fig. 26 The trend function in the development of the second sample L-moment of net annual household income per capita; S-curve trend = exp(10.626 − 1.65801/t) (forecasts: 33,482.6; 34,262.6)
Fig. 27 The trend function in the development of the third sample L-moment of net annual household income per capita; S-curve trend = exp(9.49536 − 1.58932/t) (forecasts: 10,901.9; 11,145.3)
Fig. 28 The trend function in the development of the first sample L-moment of gross monthly wage; quadratic trend = 16879.3 + 631.353 t + 84.7654 t^2 (forecasts: 27,355.1; 29,427.5)
(Figs. 25 to 28: time sequence plots showing actual values, forecasts and 95.0% limits against the time index t)
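The trend equations printed with Figs. 25 and 28 can be evaluated directly. The sketch below does so under two assumptions of ours that are not stated in this section: the time index t counts observations (so that both series have seven observations and the forecasts for 2009 and 2010 correspond to t = 8 and t = 9), and the parameters of the three-parametric lognormal curve are obtained from the L-moments by the approximation usually attributed to Hosking [5]. With these assumptions the sketch reproduces the printed forecasts and comes close to the 2009 income parameters given in Tab. 18; it is an illustration, not a statement of the exact computation behind the tables.

```python
import math
from statistics import NormalDist

def s_curve(t, a, b):
    """S-curve trend exp(a + b / t)."""
    return math.exp(a + b / t)

def quadratic(t, c0, c1, c2):
    """Quadratic trend c0 + c1 * t + c2 * t^2."""
    return c0 + c1 * t + c2 * t * t

def income_l1(t):
    """First sample L-moment of income, coefficients as printed with Fig. 25."""
    return s_curve(t, 11.9932, -1.55335)

def wage_l1(t):
    """First sample L-moment of wage, coefficients as printed with Fig. 28."""
    return quadratic(t, 16879.3, 631.353, 84.7654)

def ln3_from_l_moments(l1, l2, l3):
    """L-moment estimates (mu, sigma^2, theta) of a three-parametric lognormal curve.

    Assumed formulas (Hosking-style approximation), not quoted from this section.
    """
    t3 = l3 / l2  # sample L-skewness
    z = math.sqrt(8.0 / 3.0) * NormalDist().inv_cdf((1.0 + t3) / 2.0)
    sigma = 0.999281 * z - 0.006118 * z**3 + 0.000127 * z**5
    mu = math.log(l2 / math.erf(sigma / 2.0)) - sigma**2 / 2.0
    theta = l1 - math.exp(mu + sigma**2 / 2.0)
    return mu, sigma**2, theta

if __name__ == "__main__":
    for t in (8, 9):  # assumed indices of the years 2009 and 2010
        print(t, round(income_l1(t), 1), round(wage_l1(t), 1))
    # Extrapolated 2009 income L-moments as listed in Tab. 17
    print(ln3_from_l_moments(133122.0, 33483.0, 10902.0))
```

Small differences from the printed values are to be expected, because the trend coefficients and the L-moments are themselves rounded.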


both ends of the distribution. Registration errors, from which these inaccuracies arise, are typical of income and wage surveys. The moment method of parameter estimation does not guarantee maximum efficiency of the estimation; nevertheless, this need not be a hindrance when working with income and wage distributions, because the sample size is usually large.

Tabs. 15 and 16 provide more accurate information about the parameter estimation methods used. These tables contain the sum of absolute deviations of the observed and theoretical frequencies for all intervals, and therefore they serve as an objective criterion for evaluating the accuracy of the parameter estimation methods used. It should be noted here that, for the income distribution on the one hand and the wage distribution on the other, we used the same number of intervals, whose width expands over time due to the rising level of the distributions. We can see from Tabs. 15 and 16 that the method of L-moments provides the most accurate results, which are even more accurate than the results obtained using the maximum likelihood method. The maximum likelihood method ranked second in terms of accuracy of the estimates. The quantile method of parameter estimation follows as the third best (second worst). As expected, the moment method of parameter estimation provides the least accurate results.

Values of the test criterion (56) were also calculated for each income distribution and each wage distribution. As already mentioned, the tested hypothesis on the expected shape of the distribution is rejected even at the 1% significance level for each income or wage distribution. This is caused by the large sample sizes with which we work in the case of income and wage distributions. The values of the test criterion χ² are therefore not listed.

In addition, Figs. 25 to 30 show the trend functions for the development of the sample L-moments over the monitored periods, including their forecasts for the years 2009 and 2010 (in parentheses). Tab. 17 presents the extrapolations of the sample L-moments based on the trend functions from Figs. 25 to 30. Tab. 18 shows the extrapolations of the parameter estimates of three-parametric lognormal curves obtained using the L-moment method from the values in Tab. 17. Tabs. 19 and 20 and Figs. 31 and 32 show the extrapolations of the income and wage distributions for the years 2009 and 2010 based on the parameter values from Tab. 18.

Fig. 29 The trend function in the development of the second sample L-moment of gross monthly wage; quadratic trend = 4155.33 + 94.1457 t + 45.31 t^2 (forecasts: 7,808.34; 8,672.75)
Fig. 30 The trend function in the development of the third sample L-moment of gross monthly wage; quadratic trend = 1291.11 − 72.7498 t + 41.7531 t^2 (forecasts: 3,381.31; 4,018.36)
(Figs. 29 and 30: time sequence plots showing actual values, forecasts and 95.0% limits against the time index t)

Tab. 17 Extrapolations of sample L-moments

Set      Year   l1        l2       l3
Income   2009   133,122   33,483   10,902
Income   2010   136,026   34,263   11,145
Wage     2009   27,355    7,808    3,381
Wage     2010   29,428    8,673    4,018

Tab. 18 Extrapolations of parameter estimations of three-parametric lognormal distribution obtained using the L-moment method

Set      Year   µ        σ²      θ
Income   2009   11.176   0.467   42,913.996
Income   2010   11.201   0.466   43,631.177
Wage     2009   9.247    0.864   11,384.492
Wage     2010   9.217    1.004   12,794.380

Tab. 19 Extrapolations of the interval distribution of relative frequencies (in %) of net annual household income per capita for 2009 and 2010

Interval (in CZK)      2009     2010
0 − 20,000             0.00     0.00
20,001 − 40,000        0.00     0.00
40,001 − 60,000        1.82     1.42
60,001 − 80,000        15.07    13.88
80,001 − 100,000       20.27    19.82
100,001 − 120,000      17.29    17.37
120,001 − 140,000      12.89    13.16
140,001 − 160,000      9.19     9.49
160,001 − 180,000      6.47     6.75
180,001 − 200,000      4.56     4.79
200,001 − 220,000      3.24     3.42
220,001 − 240,000      2.32     2.47
240,001 − 260,000      1.68     1.80
260,001 − 280,000      1.23     1.32
280,001 − 300,000      0.91     0.98
300,001 − 320,000      0.68     0.74
320,001 − 340,000      0.51     0.56
340,001 − 360,000      0.39     0.42
360,001 − 380,000      0.30     0.33
380,001 − 400,000      0.25     0.26
400,001 − ∞            0.93     1.02
Total                  100.00   100.00

Tab. 20 Extrapolations of the interval distribution of relative frequencies (in %) of gross monthly wage for 2009 and 2010

Interval (in CZK)      2009      2010
0 − 5,000              0.00      0.00
5,001 − 10,000         0.00      0.00
10,001 − 15,000        12.84     6.49
15,001 − 20,000        29.25     30.44
20,001 − 25,000        19.43     20.69
25,001 − 30,000        12.03     12.74
30,001 − 35,000        7.66      8.14
35,001 − 40,000        5.06      5.44
40,001 − 45,000        3.45      3.76
45,001 − 50,000        2.43      2.69
50,001 − 55,000        1.75      1.97
55,001 − 60,000        1.29      1.48
60,001 − 65,000        0.97      1.13
65,001 − 70,000        0.74      0.88
70,001 − 75,000        0.57      0.69
75,001 − 80,000        0.45      0.55
80,001 − 85,000        0.35      0.44
85,001 − 90,000        0.28      0.36
90,001 − 95,000        0.23      0.30
95,001 − 100,000       0.17      0.25
100,001 − ∞            1.05      1.56
Total                  100.000   100.000

Fig. 31 Extrapolations of the interval distribution of relative frequencies (in %) of net annual household income per capita for 2009 and 2010
Fig. 32 Extrapolations of the interval distribution of relative frequencies (in %) of gross monthly wage for 2009 and 2010
(Figs. 31 and 32: horizontal axis: interval middle; vertical axis: relative frequency in %; series for 2009 and 2010)

IV. CONCLUSION

The importance of the lognormal curve as a model for empirical distributions is indisputable, and it has found application in many areas, from sociology to astronomy. The characteristic features of the process described by this model are: successive appearance of interdependent factors; a tendency to develop in a geometric sequence; and the growth of random variability into systematic variability, i.e. differentiation. Incomes and wages are among the many economic phenomena that the lognormal model allows us to interpret, which is confirmed by numerous practical experiences.

The three-parametric lognormal distribution (Johnson's curve of the type SL) was used for modelling incomes and wages in this study. Various methods were used to estimate the parameters of this distribution: the moment method, the quantile method, the maximum likelihood method and finally the method of L-moments. In the case of a small sample size, the L-moment method usually provides markedly more accurate results than other methods of parameter estimation, including the maximum likelihood method, see for example [5]. However, it appears that even in the case of large samples the L-moment method gives more accurate results than the other methods of parameter estimation (again including the maximum likelihood method). When calculating the sum of the absolute deviations of the observed and theoretical frequencies, and also when calculating the value of the test criterion χ², it turned out that in the case of the L-moment method the inaccuracies arise especially at both ends of the distribution. If we abstracted from the inaccuracies at both ends of the distribution, the results based on the L-moment method would be much more accurate than those of the other methods of parameter estimation in the case of large samples, too.

In addressing the question of which method of parameter estimation of the three-parametric lognormal distribution is most suitable, it was necessary to take into account the strong dependence of the value of the χ² criterion on the sample size. As is usual with such large sample sizes, all tests led to the rejection of the null hypothesis on the expected distribution. From the results it is clear that all four methods of parameter estimation yielded relatively accurate results for the large samples that were used in this research and that are typical of income and wage distributions. Nevertheless, some differences in the accuracy of the parameter estimation methods used were found. As is evident from the outputs, the L-moment method again gives the most accurate results of parameter estimation. The method of maximum likelihood follows as the second most accurate. The quantile method of parameter estimation follows, and the method of moments gave the least accurate results, as expected. Notwithstanding the foregoing, the differences in accuracy between the parameter estimation methods used are relatively small for such large sample sizes, see the outputs above.
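The two accuracy criteria discussed above, the sum of absolute deviations of the observed and theoretical frequencies and the χ² criterion, can be sketched as follows. The sketch assumes that theoretical interval probabilities are differences of the three-parametric lognormal distribution function at the interval bounds; the function names and the observed counts are hypothetical and serve only to illustrate the calculation. The interval bounds and parameters are the 2009 income values from Tabs. 18 and 19.

```python
import math
from statistics import NormalDist

def ln3_cdf(x, mu, sigma2, theta):
    """Distribution function of the three-parametric lognormal distribution."""
    if x <= theta:
        return 0.0
    return NormalDist().cdf((math.log(x - theta) - mu) / math.sqrt(sigma2))

def interval_probs(upper_bounds, mu, sigma2, theta):
    """Probabilities of the intervals (0, b1], (b1, b2], ..., (bk, infinity)."""
    cdf = [ln3_cdf(b, mu, sigma2, theta) for b in upper_bounds]
    edges = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

def sum_abs_dev(observed, probs):
    """Sum of absolute deviations of observed and theoretical frequencies."""
    n = sum(observed)
    return sum(abs(o - n * p) for o, p in zip(observed, probs))

def chi_square(observed, probs):
    """Chi-square criterion; intervals with zero theoretical probability are skipped."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs) if p > 0.0)

# 2009 income parameters from Tab. 18 and a few interval bounds from Tab. 19
MU, SIGMA2, THETA = 11.176, 0.467, 42913.996
BOUNDS = [60000.0, 80000.0, 100000.0, 120000.0, 140000.0]
PROBS = interval_probs(BOUNDS, MU, SIGMA2, THETA)

# Hypothetical observed counts (not data from the paper), once for a small
# sample and once for a sample one hundred times larger with the same shares.
OBSERVED_SMALL = [2, 14, 22, 18, 12, 32]
OBSERVED_LARGE = [100 * o for o in OBSERVED_SMALL]

if __name__ == "__main__":
    print([round(100.0 * p, 2) for p in PROBS])
    print(sum_abs_dev(OBSERVED_SMALL, PROBS), chi_square(OBSERVED_SMALL, PROBS))
    print(sum_abs_dev(OBSERVED_LARGE, PROBS), chi_square(OBSERVED_LARGE, PROBS))
```

With the relative discrepancies held fixed, both criteria grow in proportion to the sample size. This illustrates why, with samples as large as those typical of income and wage data, the χ² test rejects the hypothesized distribution even when the fit is visually close, and why the sum of absolute deviations is compared between methods rather than against a critical value.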

REFERENCES
[1] J. Bartošová, “Logarithmic-Normal Model of Income Distribution in the Czech Republic”, Austrian Journal of Statistics, vol. 35, no. 23, 2006, pp. 215–222.
[2] D. Bílková, “Application of Lognormal Curves in Modelling of Wage Distributions”, Journal of Applied Mathematics, vol. 1, no. 2, 2008, pp. 341–352.
[3] A.C. Cohen and J.B. Whitten, “Estimation in the Three-parameter Lognormal Distribution”, Journal of American Statistical Association, vol. 75, 1980, pp. 399–404.
[4] N.B. Guttman, “The Use of L-moments in the Determination of Regional Precipitation Climates”, Journal of Climate, vol. 6, 1993, pp. 2309–2325.
[5] J.R.M. Hosking, “L-moments: Analysis and Estimation of Distributions Using Linear Combinations of Order Statistics”, Journal of the Royal Statistical Society (Series B), vol. 52, no. 1, 1990, pp. 105–124.
[6] J. Kyselý and J. Picek, “Regional Growth Curves and Improved Design Value Estimates of Extreme Precipitation Events in the Czech Republic”, Climate Research, vol. 33, 2007, pp. 243–255.
[7] I. Malá, “Conditional Distributions of Incomes and their Characteristics”, in ISI 2011, Dublin, Ireland, 2011, pp. 1–6.
[8] I. Malá, “Distribution of Incomes Per Capita of the Czech Households from 2005 to 2008”, in Aplimat 2011 [CD-ROM], Bratislava, Slovakia, 2011, pp. 1583–1588.
[9] I. Malá, “Generalized Linear Model and Finite Mixture Distributions”, in AMSE 2010 [CD-ROM], Demänovská Dolina, Slovakia, 2010, pp. 225–234.
[10] I. Malá, “L-momenty a jejich použití pro rozdělení příjmů domácností”, in MSED 2010 [CD-ROM], Prague, Czech Republic, 2010, pp. 1–9.
[11] J.C. Smithers and R.E. Schulze, “A Methodology for the Estimation of Short Duration Design Storms in South Africa Using a Regional Approach Based on L-moments”, J. Hydrol., vol. 241, 2001, pp. 42–52.
[12] T.J. Ulrych, D.R. Velis, A.D. Woodbury and M.D. Sacchi, “L-moments and C-moments”, Stoch. Environ. Res. Risk Asses, vol. 14, 2000, pp. 50–68.
[13] http://www.czso.cz/
[14] http://puzzle.heureka.cz
[15] http://search.seznam.cz/
