For the purpose of identification one of the outcomes must be taken as the “baseline”; it is usually
assumed that β0 = 0, in which case
P(yi = k|xi) = exp(xi βk) / ( 1 + ∑_{j=1}^{p} exp(xi βj) )

and

P(yi = 0|xi) = 1 / ( 1 + ∑_{j=1}^{p} exp(xi βj) ).
Listing 38.4 reproduces Table 15.2 in Wooldridge (2002a), based on data on career choice from
Keane and Wolpin (1997). The dependent variable is the occupational status of an individual (0 = in
school; 1 = not in school and not working; 2 = working), and the explanatory variables are education
and work experience (linear and square) plus a “black” binary variable. The full data set is a panel;
here the analysis is confined to a cross-section for 1987.
y∗1,i = ∑_{j=1}^{k1} xij βj + ε1,i ,    y1,i = 1 ⟺ y∗1,i > 0        (38.9)

y∗2,i = ∑_{j=1}^{k2} zij γj + ε2,i ,    y2,i = 1 ⟺ y∗2,i > 0        (38.10)

(ε1,i , ε2,i)′ ∼ N( 0 , [ 1  ρ ; ρ  1 ] )        (38.11)
biprobit y1 y2 X ; Z
Output from estimation includes a Likelihood Ratio test for the hypothesis ρ = 0.¹ This can be
retrieved in the form of a bundle named independence_test under the $model accessor, as in
? eval $model.independence_test
bundle:
dfn = 1
test = 204.066
pvalue = 2.70739e-46
Since biprobit estimates a two-equation system, the $uhat and $yhat accessors provide ma-
trices rather than series as usual. Specifically, $uhat gives a two-column matrix containing the
generalized residuals, while $yhat contains four columns holding the estimated probabilities of
the possible joint outcomes: (y1,i , y2,i ) = (1, 1) in column 1, (y1,i , y2,i ) = (1, 0) in column 2,
(y1,i , y2,i ) = (0, 1) in column 3 and (y1,i , y2,i ) = (0, 0) in column 4.
Gretl provides FE logit via the function package felogit,² and RE probit natively. Provided your dataset
has a panel structure, the latter option can be obtained by adding the --random option to the
probit command:
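For instance, a minimal sketch with hypothetical variable names (the option is spelled as in the text above; some gretl versions document it as --random-effects, so check the probit help for your version):

probit y const x1 x2 --random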
as exemplified in the reprobit.inp sample script. The numerical technique used for this particular
estimator is Gauss-Hermite quadrature, which we’ll now briefly describe. Generalizing equation
(38.5) to a panel context, we get
y∗i,t = ∑_{j=1}^{k} xijt βj + αi + εi,t = zi,t + ωi,t        (38.12)
in which we assume that the individual effect, αi , and the disturbance term, εi,t , are mutually
independent zero-mean Gaussian random variables. The composite error term, ωi,t = αi + εi,t , is
therefore a normal r. v. with mean zero and variance 1 + σα2 . Because of the individual effect, αi ,
observations for the same unit are not independent; the likelihood therefore has to be evaluated
on a per-unit basis, as
ℓi = log P( yi,1 , yi,2 , . . . , yi,T )
and there’s no way to write the above as a product of individual terms.
However, the above probability could be written as a product if we were to treat αi as a constant;
in that case we would have
ℓi |αi = ∑_{t=1}^{T} log Φ[ (2yi,t − 1) ( ∑_{j=1}^{k} xijt βj + αi ) / √(1 + σα²) ]
The technique known as Gauss–Hermite quadrature is simply a way of approximating the above
integral via a sum of carefully chosen terms:3
ℓi ≃ ∑_{k=1}^{m} (ℓi |αi = nk ) wk
where the numbers nk and wk are known as quadrature points and weights, respectively. Of course,
accuracy improves with higher values of m, but so does CPU usage. Note that this technique can
also be used in more general cases by using the quadtable() function and the mle command via
the apparatus described in chapter 26. Here, however, the calculations were hard-coded in C for
maximal speed and efficiency.
Experience shows that a reasonable compromise can be achieved in most cases by choosing m in
the order of 20 or so; gretl uses 32 as a default value, but this can be changed via the --quadpoints
option, as in
2 See http://gretl.sourceforge.net/current_fnfiles/felogit.gfn.
3 Some have suggested using a more refined method called adaptive Gauss-Hermite quadrature; this is not imple-
mented in gretl.
where εi ∼ N(0, σ²). If yi∗ were observable, the model's parameters could be estimated via ordinary
least squares. Suppose, however, that we observe yi , defined as

  yi = a     for yi∗ ≤ a
  yi = yi∗   for a < yi∗ < b        (38.13)
  yi = b     for yi∗ ≥ b
In most cases found in the applied literature, a = 0 and b = ∞, so in practice negative values of yi∗
are not observed and are replaced by zeros.
In this case, regressing yi on the xi 's does not yield consistent estimates of the parameters β,
because the conditional mean E(yi |xi ) is not equal to ∑_{j=1}^{k} xij βj . It can be shown that restricting
the sample to non-zero observations would not yield consistent estimates either. The solution is to
estimate the parameters via maximum likelihood. The syntax is simply
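A minimal sketch with hypothetical variable names:

tobit y const x1 x2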
As usual, progress of the maximization algorithm can be tracked via the --verbose switch, while
$uhat returns the generalized residuals. Note that in this case the generalized residual is defined
as ûi = E(εi |yi = 0) for censored observations, so the familiar equality ûi = yi − ŷi only holds for
uncensored observations, that is, when yi > 0.
An important difference between the Tobit estimator and OLS is that the consequences of non-
normality of the disturbance term are much more severe: non-normality implies inconsistency for
the Tobit estimator. For this reason, the output for the Tobit model includes the Chesher and Irish
(1987) normality test by default.
The general case in which a is nonzero and/or b is finite can be handled by using the options
--llimit and --rlimit. So, for example,
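a hedged sketch with illustrative censoring limits:

tobit y const x1 x2 --llimit=10 --rlimit=100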
yi∗ = xi β + ϵi
but we only know that mi ≤ yi∗ ≤ Mi , where the interval may be left- or right-unbounded (but
not both). If mi = Mi , we effectively observe yi∗ and no information loss occurs. In practice, each
observation belongs to one of four categories:
1. left-unbounded, when mi = −∞,
2. right-unbounded, when Mi = ∞,
3. bounded, when both mi and Mi are finite, with mi < Mi ,
4. point observations, when mi = Mi .
It is interesting to note that this model bears similarities to other models in several special cases:
• When all observations are point observations the model trivially reduces to the ordinary linear
regression model.
• The interval model could be thought of as an ordered probit model (see 38.2) in which the cut
points (the αj coefficients in eq. 38.8) are observed and don’t need to be estimated.
• The Tobit model (see 38.6) is a special case of the interval model in which mi and Mi do not
depend on i, that is, the censoring limits are the same for all observations. As a matter of
fact, gretl’s tobit command is handled internally as a special case of the interval model.
The gretl command intreg estimates interval models by maximum likelihood, assuming normality
of the disturbance term ϵi . Its syntax is
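A minimal sketch (regressor names hypothetical):

intreg minvar maxvar const x1 x2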
where minvar contains the mi series, with NAs for left-unbounded observations, and maxvar con-
tains Mi , with NAs for right-unbounded observations. By default, standard errors are computed
using the negative inverse of the Hessian. If the --robust flag is given, then QML or Huber–White
standard errors are calculated instead. In this case the estimated covariance matrix is a “sandwich”
of the inverse of the estimated Hessian and the outer product of the gradient.
If the model specification contains regressors other than just a constant, the output includes a
chi-square statistic for testing the joint null hypothesis that none of these regressors has any
effect on the outcome. This is a Wald statistic based on the estimated covariance matrix. If you
wish to construct a likelihood ratio test, this is easily done by estimating both the full model
and the null model (containing only the constant), saving the log-likelihood in both cases via the
$lnl accessor, and then referring twice the difference between the two log-likelihoods to the chi-
square distribution with k degrees of freedom, where k is the number of additional regressors (see
the pvalue command in the Gretl Command Reference). Also included is a conditional moment
normality test, similar to those provided for the probit, ordered probit and Tobit models (see
above). An example is contained in the sample script wtp.inp, provided with the gretl distribution.
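A hedged sketch of the likelihood-ratio procedure just described (series names hypothetical; here k = 2 additional regressors):

intreg lo hi const x1 x2     # full model
scalar ll_full = $lnl
intreg lo hi const           # null model: constant only
scalar ll_null = $lnl
scalar LR = 2 * (ll_full - ll_null)
pvalue X 2 LR                # chi-square with k = 2 degrees of freedom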
As with the probit and Tobit models, after a model has been estimated the $uhat accessor re-
turns the generalized residual, which is an estimate of ϵi : more precisely, it equals yi − xi β̂ for
point observations and E(ϵi |mi , Mi , xi ) otherwise. Note that it is possible to compute an unbiased
predictor of yi∗ by adding this estimate to xi β̂. Listing 38.5 shows an example. As a further
similarity with Tobit, the interval regression model may deliver inconsistent estimates if the dis-
turbances are non-normal; hence, the Chesher and Irish (1987) test for normality is included by
default here too.
yi∗ = ∑_{j=1}^{k} xij βj + εi        (38.14)

si∗ = ∑_{j=1}^{p} zij γj + ηi        (38.15)
# estimate ystar
gen_resid = $uhat
yhat = $yhat + gen_resid
corr ystar yhat
sigma = 0.223273
Left-unbounded observations: 0
Right-unbounded observations: 0
Bounded observations: 100
Point observations: 0
...
In this context, the ♦ symbol indicates that for some observations we simply do not have data on
y: yi may be 0, or missing, or anything else. A dummy variable di is normally used to set censored
observations apart.
One of the most popular applications of this model in econometrics is a wage equation coupled
with a labor force participation equation: we only observe the wage for the employed. If yi∗ and si∗
were (conditionally) independent, there would be no reason not to use OLS for estimating equation
(38.14); otherwise, OLS does not yield consistent estimates of the parameters βj .
Since conditional independence between yi∗ and si∗ is equivalent to conditional independence be-
tween εi and ηi , one may model the co-dependence between εi and ηi as
εi = ληi + vi ;
substituting the above expression in (38.14), you obtain the model that is actually estimated:
yi = ∑_{j=1}^{k} xij βj + λ η̂i + vi ,
so the hypothesis that censoring does not matter is equivalent to the hypothesis H0 : λ = 0, which
can be easily tested.
The parameters can be estimated via maximum likelihood under the assumption of joint normality
of εi and ηi ; however, a widely used alternative method yields the so-called Heckit estimator, named
after Heckman (1979). The procedure can be briefly outlined as follows: first, a probit model is fit
on equation (38.15); next, the generalized residuals are inserted in equation (38.14) to correct for
the effect of sample selection.
Gretl provides the heckit command to carry out estimation; its syntax is
heckit y X ; d Z
where y is the dependent variable, X is a list of regressors, d is a dummy variable holding 1 for
uncensored observations and Z is a list of explanatory variables for the censoring equation.
Since in most cases maximum likelihood is the method of choice, by default gretl computes ML
estimates. The 2-step Heckit estimates can be obtained by using the --two-step option. After
estimation, the $uhat accessor contains the generalized residuals. As in the ordinary Tobit model,
the residuals equal the difference between actual and fitted yi only for uncensored observations
(those for which di = 1).
Listing 38.6 shows two estimates from the dataset used in Mroz (1987): the first one replicates
Table 22.7 in Greene (2003),4 while the second one replicates Table 17.1 in Wooldridge (2002a).
P(Y = y) = e^{−λ} λ^y / y! ,    y = 0, 1, 2, . . .
where the single parameter λ is both the mean and the variance of Y . In an econometric context
we generally want to treat λ as specific to the observation, i, and driven by covariates Xi via a
parameter vector β. The standard way of allowing for this is the exponential mean function,
λi ≡ exp(Xi β)
hence leading to
P(Yi = y) = exp(−exp(Xi β)) (exp(Xi β))^y / y!
4 Note that the estimates given by gretl do not coincide with those found in the printed volume. They do, however,
match those found on the errata web page for Greene’s book: http://pages.stern.nyu.edu/~wgreene/Text/Errata/
ERRATA5.htm.
open mroz87.gdt
# Greene’s specification
# Wooldridge’s specification
Maximization of this quantity is quite straightforward, and is carried out in gretl using the syntax
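A minimal sketch with hypothetical variable names:

poisson y const x1 x2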
In some cases, an “offset” variable is needed: the count of occurrences of the outcome of interest
in a given time is assumed to be strictly proportional to the offset variable ti . In the epidemiology
literature, the offset is known as “population at risk”. In this case λ is modeled as
λi = ti exp(Xi β)
The log-likelihood is not greatly complicated thereby. Here’s another way of thinking about the
offset variable: its natural log is just another explanatory variable whose coefficient is constrained
to equal 1.
If an offset variable is needed, it should be specified at the end of the command, separated from
the list of explanatory variables by a semicolon, as in
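For instance, with an offset series t (variable names hypothetical):

poisson y const x1 x2 ; t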
Overdispersion
As mentioned above, in the Poisson model E(Yi |Xi ) = V (Yi |Xi ) = λi , that is, the conditional mean
equals the conditional variance by construction. In many cases this feature is at odds with the data;
the conditional variance is often larger than the mean, a phenomenon known as overdispersion.
The output from the poisson command includes a conditional moment test for overdispersion (as
per Davidson and MacKinnon (2004), section 11.5), which is printed automatically after estimation.
Overdispersion can be attributed to unmodeled heterogeneity between individuals. Two data points
with the same observable characteristics Xi = Xj may differ because of some unobserved scale
factor si ̸= sj so that
E(Yi |Xi , si ) = λi si ̸= λj sj = E(Yi |Xj , sj )
even though λi = λj . In other words, Yi is a Poisson random variable conditional on both Xi and
si , but since si is unobservable, the only thing we can use, P (Yi |Xi ), will not conform to the
Poisson distribution.
It is often assumed that si can be represented as a gamma random variable with mean 1 and
variance α. The parameter α, which measures the degree of heterogeneity between individuals, is
then estimated jointly with the vector β.
In this case, the conditional probability that Yi = y given Xi can be shown to be
P(Yi = y|Xi ) = [ Γ(y + α⁻¹) / ( Γ(α⁻¹) Γ(y + 1) ) ] · [ λi / (λi + α⁻¹) ]^y · [ α⁻¹ / (λi + α⁻¹) ]^{α⁻¹}        (38.17)
which is known as the Negative Binomial Model. The conditional mean is still E(Yi |Xi ) = λi , but the
variance is V (Yi |Xi ) = λi (1 + λi α).
To estimate the Negative Binomial model in gretl, just substitute the keyword negbin for poisson
in the commands shown above.
To be precise, the model 38.17 is that labeled NEGBIN2 by Cameron and Trivedi (1986). There’s
also a lesser-used NEGBIN1 variant, in which the conditional variance is a scalar multiple of the
conditional mean; that is, V (Yi |Xi ) = λi (1 + γ). This can be invoked in gretl by appending the
option --model1 to the negbin command.5
The two accessors $yhat and $uhat return the predicted values and generalized residuals, respec-
tively. Note that $uhat is not equal to the difference between the dependent variable and $yhat.
Examples
Among the sample scripts supplied with gretl you can find camtriv.inp. This exemplifies the
count-data estimators described above, based on a dataset analysed by Cameron and Trivedi (1998).
The gretl package also contains a relevant dataset used by McCullagh and Nelder (1983), namely
mccullagh.gdt, on which the Poisson and Negative Binomial estimators may be tried.
• From engineering, the “time to failure” of electronic or mechanical components: how long do,
say, computer hard drives last until they malfunction?
• From the medical realm: how does a new treatment affect the time from diagnosis of a certain
condition to exit from that condition (where “exit” might mean death or full recovery)?
In each case we may be interested in how the durations are distributed, and how they are affected
by relevant covariates. There are several approaches to this problem; the one we discuss here —
which is currently the only one supported by gretl — is estimation of a parametric model by means
5 The “1” and “2” in these labels indicate the power to which λi is raised in the conditional variance expression.
of Maximum Likelihood. In this approach we hypothesize that the durations follow some definite
probability law and we seek to estimate the parameters of that law, factoring in the influence of
covariates.
We may express the density of the durations as f (t, X, θ), where t is the length of time in the state
in question, X is a matrix of covariates, and θ is a vector of parameters. The likelihood for a sample
of n observations indexed by i is then
L = ∏_{i=1}^{n} f(ti , xi , θ)
Rather than working with the density directly, however, it is standard practice to factor f (·) into
two components, namely a hazard function, λ, and a survivor function, S. The survivor function
gives the probability that a state lasts at least as long as t; it is therefore 1 − F (t, X, θ) where F
is the CDF corresponding to the density f (·). The hazard function addresses this question: given
that a state has persisted as long as t, what is the likelihood that it ends within a short increment
of time beyond t —that is, it ends between t and t + ∆? Taking the limit as ∆ goes to zero, we end
up with the ratio of the density to the survivor function:6
λ(t, X, θ) = f(t, X, θ) / S(t, X, θ)        (38.18)
so the log-likelihood can be written as
ℓ = ∑_{i=1}^{n} log f(ti , xi , θ) = ∑_{i=1}^{n} [ log λ(ti , xi , θ) + log S(ti , xi , θ) ]        (38.19)
One point of interest is the shape of the hazard function, in particular its dependence (or not) on
time since the state began. If λ does not depend on t we say the process in question exhibits du-
ration independence: the probability of exiting the state at any given moment neither increases nor
decreases based simply on how long the state has persisted to date. The alternatives are positive
duration dependence (the likelihood of exiting the state rises, the longer the state has persisted)
or negative duration dependence (exit becomes less likely, the longer it has persisted). Finally, the
behavior of the hazard with respect to time need not be monotonic; some parameterizations allow
for this possibility and some do not.
Since durations are inherently positive the probability distribution used in modeling must respect
this requirement, giving a density of zero for t ≤ 0. Four common candidates are the exponential,
Weibull, log-logistic and log-normal, the Weibull being the most common choice. The table below
displays the density and the hazard function for each of these distributions as they are commonly
parameterized, written as functions of t alone. (φ and Φ denote, respectively, the Gaussian PDF
and CDF.)
The hazard is constant for the exponential distribution. For the Weibull, it is monotone increasing
in t if α > 1, or monotone decreasing for α < 1. (If α = 1 the Weibull collapses to the exponential.)
6 For a fuller discussion see, for example, Davidson and MacKinnon (2004).
The log-logistic and log-normal distributions allow the hazard to vary with t in a non-monotonic
fashion.
Covariates are brought into the picture by allowing them to govern one of the parameters of the
density, so that durations are not identically distributed across cases. For example, when using
the log-normal distribution it is natural to make µ, the expected value of log t, depend on the
covariates, X. This is typically done via a linear index function: µ = Xβ.
Note that the expressions for the log-normal density and hazard contain the term (log t − µ)/σ .
Replacing µ with Xβ this becomes (log t − Xβ)/σ . As in Kalbfleisch and Prentice (2002), we define
a shorthand label for this term:
wi ≡ (log ti − xi β)/σ (38.20)
It turns out that this constitutes a useful simplifying change of variables for all of the distributions
discussed here. The interpretation of the scale factor, σ , in the expression above depends on the
distribution. For the log-normal, σ represents the standard deviation of log t; for the Weibull and
the log-logistic it corresponds to 1/α; and for the exponential it is fixed at unity. For distributions
other than the log-normal, Xβ corresponds to − log γ, or in other words γ = exp(−Xβ).
With this change of variables, the density and survivor functions may be written compactly as
follows (the exponential is the same as the Weibull).
In light of the above we may think of the generic parameter vector θ, as in f (t, X, θ), as composed of
the coefficients on the covariates, β, plus (in all cases but the exponential) the additional parameter
σ.
A complication in estimation of θ is posed by “incomplete spells”. That is, in some cases the state
in question may not have ended at the time the observation is made (e.g. some workers remain
unemployed, some components have not yet failed). If we use ti to denote the time from entering
the state to either (a) exiting the state or (b) the observation window closing, whichever comes first,
then all we know of the “right-censored” cases (b) is that the duration was at least as long as ti .
This can be handled by rewriting the log-likelihood (compare 38.19) as

ℓ = ∑_{i=1}^{n} { δi log S(wi ) + (1 − δi ) [ −log σ + log f(wi ) ] }        (38.21)
where δi equals 1 for censored cases (incomplete spells), and 0 for complete observations. The
rationale for this is that the log-density equals the sum of the log hazard and the log survivor
function, but for the incomplete spells only the survivor function contributes to the likelihood. So
in (38.21) we are adding up the log survivor function alone for the incomplete cases, plus the full
log density for the completed cases.
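The command line that the following description refers to presumably takes the form used in Listing 38.7 below:

duration durat 0 X ; cens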
where durat measures durations, 0 represents the constant (which is required for such models), X
is a named list of regressors, and cens is the censoring dummy.
By default the Weibull distribution is used; you can substitute any of the other three distribu-
tions discussed here by appending one of the option flags --exponential, --loglogistic or
--lognormal.
Interpreting the coefficients in a duration model requires some care, and we will work through
an illustrative case. The example comes from section 20.3 of Wooldridge (2002a) and it concerns
criminal recidivism.7 The data (filename recid.gdt) pertain to a sample of 1,445 convicts released
from prison between July 1, 1977 and June 30, 1978. The dependent variable is the time in months
until they are again arrested. The information was gathered retrospectively by examining records
in April 1984; the maximum possible length of observation is 81 months. Right-censoring is impor-
tant: when the data were compiled about 62 percent had not been rearrested. The dataset contains
several covariates, which are described in the data file; we will focus below on interpretation of the
married variable, a dummy which equals 1 if the respondent was married when imprisoned.
Listing 38.7 shows the gretl commands for Weibull and log-normal models along with most of the
output. Consider first the Weibull scale factor, σ . The estimate is 1.241 with a standard error of
0.048. (We don’t print a z score and p-value for this term since H0 : σ = 0 is not of interest.)
Recall that σ corresponds to 1/α; we can be confident that α is less than 1, so recidivism displays
negative duration dependence. This makes sense: it is plausible that if a past offender manages
to stay out of trouble for an extended period his risk of engaging in crime again diminishes. (The
exponential model would therefore not be appropriate in this case.)
On a priori grounds, however, we may doubt the monotonic decline in hazard that is implied by
the Weibull specification. Even if a person is liable to return to crime, it seems relatively unlikely
that he would do so straight out of prison. In the data, we find that only 2.6 percent of those
followed were rearrested within 3 months. The log-normal specification, which allows the hazard
to rise and then fall, may be more appropriate. Using the duration command again with the same
covariates but the --lognormal flag, we get a log-likelihood of −1597 as against −1633 for the
Weibull, confirming that the log-normal gives a better fit.
Let us now focus on the married coefficient, which is positive in both specifications but larger and
more sharply estimated in the log-normal variant. The first thing is to get the interpretation of the
sign right. Recall that Xβ enters negatively into the intermediate variable w (equation 38.20). The
Weibull hazard is λ(wi ) = ewi , so being married reduces the hazard of re-offending, or in other
words lengthens the expected duration out of prison. The same qualitative interpretation applies
for the log-normal.
To get a better sense of the married effect, it is useful to show its impact on the hazard across time.
We can do this by plotting the hazard for two values of the index function Xβ: in each case the
values of all the covariates other than married are set to their means (or some chosen values) while
married is set first to 0 then to 1. Listing 38.8 provides a script that does this, and the resulting
plots are shown in Figure 38.1. Note that when computing the hazards we need to multiply by the
Jacobian of the transformation from ti to wi = (log ti − xi β)/σ , namely 1/t. Note also that the
estimate of σ is available via the accessor $sigma, but it is also present as the last element in the
coefficient vector obtained via $coeff.
A further difference between the Weibull and log-normal specifications is illustrated in the plots.
The Weibull is an instance of a proportional hazard model. This means that for any sets of values of
the covariates, xi and xj , the ratio of the associated hazards is invariant with respect to duration. In
this example the Weibull hazard for unmarried individuals is always 1.1637 times that for married.
In the log-normal variant, on the other hand, this ratio gradually declines from 1.6703 at one month
to 1.1766 at 100 months.
7 Germán Rodríguez of Princeton University has a page discussing this example and displaying estimates from Stata
at http://data.princeton.edu/pop509/recid1.html.
Partial output:
Model 1: Duration (Weibull), using observations 1-1445
Dependent variable: durat
open recid.gdt -q
# Weibull variant
duration durat 0 X married ; cens
# coefficients on all Xs apart from married
matrix beta_w = $coeff[1:$ncoeff-2]
# married coefficient
scalar mc_w = $coeff[$ncoeff-1]
scalar s_w = $sigma
# Log-normal variant
duration durat 0 X married ; cens --lognormal
matrix beta_n = $coeff[1:$ncoeff-2]
scalar mc_n = $coeff[$ncoeff-1]
scalar s_n = $sigma
list allX = 0 X
# evaluate X\beta at means of all variables except marriage
scalar Xb_w = meanc({allX}) * beta_w
scalar Xb_n = meanc({allX}) * beta_n
# matrices to hold the results (assumed: these declarations are elided from the printed listing)
matrix mat_w = zeros(100, 3)
matrix mat_n = zeros(100, 3)
loop t=1..100
# first column, duration
mat_w[t, 1] = t
mat_n[t, 1] = t
wi_w = (log(t) - Xb_w)/s_w
wi_n = (log(t) - Xb_n)/s_n
# second col: hazard with married = 0
mat_w[t, 2] = (1/t) * exp(wi_w)
mat_n[t, 2] = (1/t) * pdf(z, wi_n) / cdf(z, -wi_n)
wi_w = (log(t) - (Xb_w + mc_w))/s_w
wi_n = (log(t) - (Xb_n + mc_n))/s_n
# third col: hazard with married = 1
mat_w[t, 3] = (1/t) * exp(wi_w)
mat_n[t, 3] = (1/t) * pdf(z, wi_n) / cdf(z, -wi_n)
endloop
[Figure 38.1 contains two panels, Weibull and Log-normal, each plotting the estimated hazard (roughly 0.006 to 0.020) against months since release (0 to 100) for unmarried and married individuals.]
Figure 38.1: Recidivism hazard estimates for married and unmarried ex-convicts
The expression given for the log-logistic mean, however, is valid only for σ < 1; otherwise the
expectation is undefined, a point that is not noted in all software.8
Alternatively, if the --medians option is given, gretl’s duration command will produce conditional
medians as the content of $yhat. For the Weibull the median is exp(Xβ)(log 2)^σ ; for the log-logistic
and log-normal it is just exp(Xβ).
The values we give for the accessor $uhat are generalized (Cox–Snell) residuals, computed as the
integrated hazard function, which equals the negative log of the survivor function: ε̂i = −log S(ti , xi , θ̂).
Under the null of correct specification of the model these generalized residuals should follow the
unit exponential distribution, which has mean and variance both equal to 1 and density exp(−ϵ).
See chapter 18 of Cameron and Trivedi (2005) for further discussion.
8 The predict adjunct to the streg command in Stata 10, for example, gaily produces large negative values for the
Quantile regression
39.1 Introduction
In Ordinary Least Squares (OLS) regression, the fitted values, ŷi = Xi β̂, represent the conditional
mean of the dependent variable—conditional, that is, on the regression function and the values
of the independent variables. In median regression, by contrast and as the name implies, fitted
values represent the conditional median of the dependent variable. It turns out that the principle of
estimation for median regression is easily stated (though not so easily computed), namely, choose
β̂ so as to minimize the sum of absolute residuals. Hence the method is known as Least Absolute
Deviations or LAD. While the OLS problem has a straightforward analytical solution, LAD is a linear
programming problem.
Quantile regression is a generalization of median regression: the regression function predicts the
conditional τ-quantile of the dependent variable — for example the first quartile (τ = .25) or the
ninth decile (τ = .90).
If the classical conditions for the validity of OLS are satisfied — that is, if the error term is indepen-
dently and identically distributed, conditional on X — then quantile regression is redundant: all the
conditional quantiles of the dependent variable will march in lockstep with the conditional mean.
Conversely, if quantile regression reveals that the conditional quantiles behave in a manner quite
distinct from the conditional mean, this suggests that OLS estimation is problematic.
Gretl has offered quantile regression functionality since version 1.7.5 (in addition to basic LAD
regression, which has been available since early in gretl’s history via the lad command).1
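The command's basic form, sketched here from the description that follows, is

quantreg tau reglist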
where
• reglist is a standard gretl regression list (dependent variable followed by regressors, including
the constant if an intercept is wanted); and
• tau is the desired conditional quantile, in the range 0.01 to 0.99, given either as a numerical
value or the name of a pre-defined scalar variable (but see below for a further option).
Estimation is via the Frisch–Newton interior point solver (Portnoy and Koenker, 1997), which is sub-
stantially faster than the “traditional” Barrodale–Roberts (1974) simplex approach for large prob-
lems.
1 We gratefully acknowledge our borrowing from the quantreg package for GNU R (version 4.17). The core of the
package is composed of Fortran code written by Roger Koenker; this is accompanied by various driver and auxiliary
functions written in the R language by Koenker and Martin Mächler. The latter functions have been re-worked in C for
gretl. We have added some guards against potential numerical problems in small samples.
By default, standard errors are computed according to the asymptotic formula given by Koenker
and Bassett (1978). Alternatively, if the --robust option is given, we use the sandwich estimator
developed in Koenker and Zhao (1994).2
When the confidence intervals option is selected, the parameter estimates are calculated using
the Barrodale–Roberts method. This is simply because the Frisch–Newton code does not currently
support the calculation of confidence intervals.
Two further details. First, the mechanisms for generating confidence intervals for quantile esti-
mates require that the model has at least two regressors (including the constant). If the --intervals
option is given for a model containing only one regressor, an error is flagged. Second, when a model
is estimated in this mode, you can retrieve the confidence intervals using the accessor $coeff_ci.
This produces a k × 2 matrix, where k is the number of regressors. The lower bounds are in the
first column, the upper bounds in the second. See also section 39.5 below.
2 These correspond to the iid and nid options in R’s quantreg package, respectively.
[Figure 39.1: the coefficient on income plotted against tau, showing the quantile estimates with 90% band alongside the OLS estimate with 90% band.]
The gretl GUI has an entry for Quantile Regression (under /Model/Robust estimation), and you can
select multiple quantiles there too. In that context, just give space-separated numerical values (as
per the predefined options, shown in a drop-down list).
When you estimate a model in this way most of the standard menu items in the model window
are disabled, but one extra item is available — graphs showing the τ sequence for a given coeffi-
cient in comparison with the OLS coefficient. An example is shown in Figure 39.1. This sort of
graph provides a simple means of judging whether quantile regression is redundant (OLS is fine) or
informative.
In the example shown—based on data on household income and food expenditure gathered by
Ernst Engel (1821–1896)—it seems clear that simple OLS regression is potentially misleading. The
“crossing” of the OLS estimate by the quantile estimates is very marked.
However, it is not always clear what implications should be drawn from this sort of conflict. With
the Engel data there are two issues to consider. First, Engel’s famous “law” claims an income-
elasticity of food consumption that is less than one, and talk of elasticities suggests a logarithmic
formulation of the model. Second, there are two apparently anomalous observations in the data
set: household 105 has the third-highest income but unexpectedly low expenditure on food (as
judged from a simple scatter plot), while household 138 (which also has unexpectedly low food
consumption) has much the highest income, almost twice that of the next highest.
With n = 235 it seems reasonable to consider dropping these observations. If we do so, and adopt
a log–log formulation, we get the plot shown in Figure 39.2. The quantile estimates still cross the
OLS estimate, but the “evidence against OLS” is much less compelling: the 90 percent confidence
bands of the respective estimates overlap at all the quantiles considered.
A script to produce the results discussed above is presented in listing 39.1.
[Figure 39.2: the coefficient on log(income) plotted against tau, showing the quantile estimates with 90% band alongside the OLS estimate with 90% band.]
Figure 39.2: Log–log regression; 2 observations dropped from full Engel data set.
The script saves the two models “as icons”. Double-clicking on a model’s icon opens a window to
display the results, and the Graph menu in this window gives access to a tau-sequence plot.
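A hedged sketch of the sort of command the following sentence refers to (variable names hypothetical; the 0.90 level matches the text):

quantreg 0.25 y 0 x --intervals=0.90
matrix ci = $coeff_ci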
The matrix ci will contain the lower and upper bounds of the (symmetrical) 90 percent confidence
intervals.
To avoid a situation where gretl becomes unresponsive for a very long time we have set the maxi-
mum number of iterations for the Barrodale–Roberts algorithm to the (somewhat arbitrary) value
of 1000. We will experiment further with this, but in the meantime, if you really want to use this
method on a large dataset, and don't mind waiting for the results, you can increase the limit using
the set command with parameter rq_maxiter, as in
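for example (the value is illustrative):

set rq_maxiter 5000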
Nonparametric methods
The main focus of gretl is on parametric estimation, but we offer a selection of nonparametric
methods. The most basic of these are:
• various tests for difference in distribution (Sign test, Wilcoxon rank-sum test, Wilcoxon signed-rank test);
• the runs test for randomness; and
• measures of rank correlation (Spearman's ρ and Kendall's τ).
Details on the above can be found by consulting the help for the commands difftest, runs, corr
and spearman. In the GUI program these items are found under the Tools menu and the Robust
estimation item under the Model menu.
In this chapter we concentrate on two relatively complex methods for nonparametric curve-fitting
and prediction, namely William Cleveland’s “loess” (also known as “lowess”) and the Nadaraya–
Watson estimator.
wk (xi ) = W( h_i^{−1} (xk − xi ) )

where hi is the distance between xi and its r-th nearest neighbor, and W(·) is the tricube function,

  W(x) = (1 − |x|³)³   for |x| < 1
  W(x) = 0             for |x| ≥ 1
The local regression can be made robust via an adjustment based on the residuals, ei = yi − ŷi .
Robustness weights, δk , are defined by
δk = B(ek /6s)
where s is the median of the |ei | and B(·) is the bisquare function,
  B(x) = (1 − x²)²   for |x| < 1
  B(x) = 0           for |x| ≥ 1
An illustration of loess is provided in Listing 40.1: we generate a series that has a deterministic
sine wave component overlaid with noise uniformly distributed on (−1, 1). Loess is then used to
retrieve a good approximation to the sine function. The resulting graph is shown in Figure 40.1.
nulldata 120
series x = index
scalar n = $nobs
series y = sin(2*$pi*x/n) + uniform(-1, 1)
series yh = loess(y, x, 2, 0.75, 0)
gnuplot y yh x --output=display --with-lines=yh
[Figure 40.1: the noisy series y and the loess fit plotted against x.]
The Nadaraya–Watson estimator of the conditional mean m(x) is

m(xi ) = ∑_j yj · Kh (xi − xj ) / ∑_j Kh (xi − xj )

where Kh (·) is the so-called kernel function, which is usually some simple transform of a density
function that depends on a scalar, h, known as the bandwidth. The one used by gretl is

Kh (x) = exp( −x² / 2h )
for |x| < τ and zero otherwise. Larger values of h produce a smoother function. The scalar τ,
known as the trim parameter, is used to prevent numerical problems when the kernel function is
evaluated too far away from zero.
A common variant of Nadaraya–Watson is the so-called “leave-one-out” estimator, which omits the
i-th observation when evaluating m(xi ). The formula therefore becomes
m(xi ) = ∑_{j≠i} yj · Kh (xi − xj ) / ∑_{j≠i} Kh (xi − xj )
This makes the estimator more robust numerically and its usage is often advised for inference
purposes.
The nadarwat() function in gretl takes up to five arguments as follows: the dependent series y,
the independent series x, the bandwidth h, a Boolean switch to turn on “leave-one-out”, and a value
for the trim parameter τ, expressed as a multiple of h. The last three arguments are optional; if
they are omitted the default values are, respectively, an automatic data-determined value for h (see
below), leave-one-out not activated, and τ = 4. The default value of τ offers a relatively safe guard
against numerical problems; in some cases a larger τ may produce more sensible values in regions
of X with sparse support.
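A hedged sketch of typical calls, assuming series y and x are already defined (the explicit bandwidth value is illustrative):

series m0 = nadarwat(y, x)          # automatic bandwidth, no leave-one-out, default trim
series m1 = nadarwat(y, x, 2.0)     # explicit bandwidth
series m2 = nadarwat(y, x, 0, 1)    # automatic bandwidth, leave-one-out enabled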
Choice of bandwidth
As mentioned above, larger values of h lead to a smoother m(·) function; smaller values make
the m(·) function follow the yi values more closely, so that the function appears more “jagged”.
In fact, as h → ∞, m(xi ) → Ȳ ; on the contrary, if h → 0, observations for which xi ̸= X are not
taken into account at all when computing m(X). Also, the statistical properties of m(·) vary with
h: its variance can be shown to be decreasing in h, while its squared bias is increasing in h. It
can be shown that choosing h ∼ n−1/5 minimizes the RMSE, so that value is customarily taken as a
reference point.
If the argument h is omitted or set to 0, gretl uses the following data-determined value:

h = 0.9 · min( s , r/1.349 ) · n^{−1/5}

where s is the sample standard deviation of the explanatory variable and r its interquartile range.
[Figure 40.2: HA plotted against WA, together with three Nadaraya–Watson fits, m0, m1 and m2, corresponding to different bandwidth choices.]
Figure 40.2: Nadaraya–Watson example for several choices of the bandwidth parameter
If you need a point estimate of m(X) for some value of X which is not present among the valid
observations of your dependent variable, you may want to add some “fake” observations to your
dataset in which y is missing and x contains the values you want m(x) evaluated at. For example,
the following script evaluates m(x) at regular intervals between −2.0 and 2.0:
nulldata 120
set seed 120496
x m
MIDAS models
The acronym MIDAS stands for “Mixed Data Sampling”. MIDAS models can essentially be described
as models where one or more independent variables are observed at a higher frequency than the
dependent variable, and possibly an ad-hoc parsimonious parameterization is adopted. See Ghysels
et al., 2004; Ghysels, 2015; Armesto et al., 2010 for a fuller introduction. Naturally, these models
require easy handling of multiple-frequency data. The way this is done in gretl is explained in
Chapter 20; in this chapter, we concentrate on the numerical aspects of estimation.
where τ represents the reference point of the sequence of high-frequency lags in “high-frequency
time”.1 Obvious generalizations of this specification include a higher AR order for y and inclusion
of additional low- and/or high-frequency regressors.
Estimation of (41.1) can be accomplished via OLS. However, it is more common to enforce parsi-
mony by making the individual coefficients on lagged high-frequency terms a function of a relatively
small number of hyperparameters, as in
where W (·) is the weighting function associated with a given parameterization and θ is a k-vector
of hyperparameters, k < p.
This presents a couple of computational questions: how to calculate the per-lag coefficients given
the values of the hyperparameters, and how best to estimate the value of the hyperparameters?
Gretl can handle natively four commonly used parameterizations: normalized exponential Almon,
normalized beta (with or without a zero last coefficient), and plain (non-normalized) Almon poly-
nomial. The Almon variants take one or more parameters (two being a common choice). The beta
variants take either two or three parameters. Full details on the forms taken by the W (·) function
are provided in section 41.3.
All variants are handled by the functions mweights and mgradient, which work as follows (a short usage sketch appears after this list).
• mweights takes three arguments: the number of lags required (p), the k-vector of hyperpa-
rameters (θ), and an integer code or string indicating the method (see Table 41.1). It returns
a p-vector containing the coefficients.
• mgradient takes three arguments, just like mweights. However, this function returns a p × k
matrix holding the (analytical) gradient of the p coefficients or weights with respect to the k
elements of θ.
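A minimal sketch, assuming the string "nealmon" selects the normalized exponential Almon method and using illustrative hyperparameter values:

matrix theta = {1, -0.1}
matrix w = mweights(10, theta, "nealmon")     # 10 x 1 vector of lag coefficients
matrix G = mgradient(10, theta, "nealmon")    # 10 x 2 matrix of derivatives of w w.r.t. theta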
1 For discussion of the placement of this reference point relative to low-frequency time, see section 20.3 above.
In the case of the non-normalized Almon polynomial the γ coefficient in (41.2) is identically 1.0
and is omitted. The "beta1" case is the the same as the two-parameter "beta0" except that θ1 is
constrained to equal 1, leaving θ2 as the only free parameter. Ghysels and Qian (2016) make a case
for use of this particularly parsimonious version.2
An additional function is provided for convenience: it is named mlincomb and it combines mweights
with the lincomb function, which takes a list (of series) argument followed by a vector of coeffi-
cients and produces a series result, namely a linear combination of the elements of the list. If we
have a suitable list X available, we can call mlincomb to construct such a combination directly; this
is equivalent to calling lincomb on X with the coefficient vector produced by mweights, as in the sketch below.
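A hedged sketch, assuming a MIDAS list X and a hyperparameter vector theta are defined and that "nealmon" is the chosen method:

series hf = mlincomb(X, theta, "nealmon")
# equivalent to
series hf = lincomb(X, mweights(nelem(X), theta, "nealmon"))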
The final theta argument is optional in most cases (implying an automatic initialization of the
hyperparameters). If this argument is given it must take one of the following forms:
1. The name of a matrix (vector) holding initial values for the hyperparameters, or a simple
expression which defines a matrix using scalars, such as {1, 5}.
2. The keyword null, indicating that an automatic initialization should be used (as happens
when this argument is omitted).
3. An integer value (in numerical form), indicating how many hyperparameters should be used
(which again calls for automatic initialization).
The third of these forms is required if you want automatic initialization in the Almon polynomial
case, since we need to know how many terms you wish to include. (In the normalized exponential
Almon case we default to the usual two hyperparameters if theta is omitted or given as null.)
The midasreg syntax allows the user to specify multiple high-frequency predictors, if wanted: these
can have different lag specifications, different parameterizations and/or different frequencies.
The options accepted by midasreg include --quiet (suppress printed output), --verbose (show
detail of iterations, if applicable) and --robust (use a HAC estimator of the Newey–West type in
computing standard errors). Two additional specialized options are described below.
Examples of usage
Suppose we have a dependent variable named dy and a MIDAS list named dX, and we wish to run
a MIDAS regression using one lag of the dependent variable and high-frequency lags 1 to 10 of the
series in dX. The following will produce U-MIDAS estimates:
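A hedged sketch, assuming the mds() term takes the MIDAS list, the minimum and maximum high-frequency lags and a type code, with 0 denoting the unrestricted (U-MIDAS) case:

midasreg dy 0 dy(-1) ; mds(dX, 1, 10, 0)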
The next lines will produce estimates for the normalized exponential Almon parameterization with
two coefficients, both initialized to zero:
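A hedged sketch, assuming type code 1 selects the normalized exponential Almon weights:

midasreg dy 0 dy(-1) ; mds(dX, 1, 10, 1, {0,0})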
In the examples above, the required lags will be added to the dataset automatically then deleted
after use. If you are estimating several models using a single set of MIDAS lags it is more efficient to
create the lags once and use the mdsl specifier. For example, the following estimates three variant
parameterizations (exponential Almon, beta with zero last lag, and beta with non-zero last lag) on
the same data:
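A hedged sketch, assuming hflags() (see chapter 20) builds the lag list once and that type codes 1, 2 and 3 select the exponential Almon, beta0 and betan parameterizations respectively:

list dXL = hflags(1, 10, dX)
midasreg dy 0 dy(-1) ; mdsl(dXL, 1)
midasreg dy 0 dy(-1) ; mdsl(dXL, 2)
midasreg dy 0 dy(-1) ; mdsl(dXL, 3)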
Replication exercise
We give a substantive illustration of midasreg in Listing 41.1. This replicates the first practical
example discussed by Ghysels in the user's guide titled MIDAS Matlab Toolbox.3 The dependent
3 See Ghysels (2015). This document announces itself as Version 2.0 of the guide and is dated November 1, 2015.
The example we’re looking at appears on pages 24–26; the associated Matlab code can be found in the program
appADLMIDAS1.m.
variable is the quarterly log-difference of real GDP, named dy in our script. The independent vari-
ables are the first lag of dy and monthly lags 3 to 11 of the monthly log-difference of non-farm
payroll employment (named dXL in our script). Therefore, in this case equation (41.2) becomes
The script exercises all five of the parameterizations mentioned above,4 and in each case the results
of 9 pseudo-out-of-sample forecasts are recorded so that their Root Mean Square Errors can be
compared.
The data file used in the replication, gdp_midas.gdt, was constructed as described in section 20.1
(and as noted there, it is included in the current gretl package). Part of the output from the replica-
tion script is shown in Listing 41.2. The γ coefficient is labeled HF_slope in the gretl output.
For reference, output from Matlab (version R2016a for Linux) is available at http://gretl.sourceforge.
net/midas/matlab_output.txt. For the most part (in respect of regression coefficients and aux-
iliary statistics such as R 2 and forecast RMSEs), gretl’s output agrees with that of Matlab to the
extent that one can reasonably expect on nonlinear problems — that is, to at least 4 significant dig-
its in all but a few instances.5 Standard errors are not quite so close across the two programs,
particularly for the hyperparameters of the beta and exponential Almon functions. We show these
in Table 41.2.
Differences of this order are not unexpected, however, when different methods are used to calcu-
late the covariance matrix for a nonlinear regression. The Matlab standard errors are based on a
numerical approximation to the Hessian at convergence, while those produced by gretl are based
on a Gauss–Newton Regression, as discussed and recommended in Davidson and MacKinnon (2004,
chapter 6).
Underlying methods
The midasreg command calls one of several possible estimation methods in the background, de-
pending on the MIDAS specification(s). As shown in Listing 41.2, this is flagged in a line of output
immediately preceding the “Dependent variable” line. If the only specification type is U-MIDAS,
the method is OLS. Otherwise it is one of three variants of Nonlinear Least Squares.
# estimation sample
smpl 1985:1 2009:1
print "=== normalized beta with zero last lag (beta0) ==="
midasreg dy 0 dy(-1) ; mdsl(dXL, 2, {1,5})
fcast --out-of-sample --static --quiet
FC ~= $fcast
Forecast RMSEs:
umidas 0.5424
beta0 0.5650
betan 0.5210
nealmon 0.5642
almonp 0.5329
• L-BFGS-B with conditional OLS. L-BFGS is a “limited memory” version of the BFGS optimizer
and the trailing “-B” means that it supports bounds on the parameters, which is useful for
reasons given below.
• Golden Section search with conditional OLS. This is a line search method, used only when
there is a just a single hyperparameter to estimate.
Levenberg–Marquardt is the default NLS method, but if the MIDAS specifications include any of
the beta variants or normalized exponential Almon we switch to L-BFGS-B, unless the user gives the
--levenberg option. The ability to set bounds on the hyperparameters via L-BFGS-B is helpful, first
because the beta parameters (other than the third one, if applicable) must be non-negative but also
because one is liable to run into numerical problems (in calculating the weights and/or gradient) if
their values become too extreme. For example, we have found it useful to place bounds of −2 and
+2 on the exponential Almon parameters.
Here’s what we mean by “conditional OLS” in the context of L-BFGS-B and line search: the search
algorithm itself is only responsible for optimizing the MIDAS hyperparameters, and when the algo-
rithm calls for calculation of the sum of squared residuals given a certain hyperparameter vector we
optimize the remaining parameters (coefficients on base-frequency regressors, slopes with respect
to MIDAS terms) via OLS.
Despite the strong evidence for a structural break, in this case the nonlinear estimator appears to
converge successfully. But one might wonder if a shorter estimation period could provide better
out-of-sample forecasts.
Listing 41.3 presents a more ambitious example: we use GSSmin (Golden Section minimizer) to es-
timate a MIDAS model with the “one-parameter beta” specification (that is, the two-parameter beta
with θ1 clamped at 1). Note that while the function named beta1_SSR is specialized to the given
parameterization, midas_GNR is a fairly general means of calculating the Gauss–Newton regression
for an ADL(1) MIDAS model, and it could be generalized further without much difficulty.
Plot of coefficients
At times, it may be useful to plot the “gross” coefficients on the lags of the high-frequency series
in a MIDAS regression—that is, the normalized weights multiplied by the HF_slope coefficient
(the γ in 41.2). After estimation of a MIDAS model in the gretl GUI this is available via the item
MIDAS coefficients under the Graphs menu in the model window. It is also easily generated via
script, since the $model bundle that becomes available following the midasreg command contains
a matrix, midas_coeffs, holding these coefficients. So the following is sufficient to display the
plot:
matrix m = $model.midas_coeffs
plot m
options with-lp fit=none
literal set title "MIDAS coefficients"
literal set ylabel ''
end plot --output=display
Caveat: this feature is at present available only for models with a single MIDAS specification.
wi = f(i, θ) / ∑_{k=1}^{p} f(k, θ)        (41.3)
/* main */
# estimation sample
smpl 1985:1 2009:1
In the normalized exponential Almon case with m parameters the function f (·) is
f(i, θ) = exp( ∑_{j=1}^{m} θj i^j )        (41.4)

so that, with the standard choice of two hyperparameters,

wi = exp(θ1 i + θ2 i²) / ∑_{k=1}^{p} exp(θ1 k + θ2 k²)
wi^(3) = ( wi + θ3 ) / ( 1 + p θ3 )

That is, we add θ3 to each weight then renormalize so that the wi^(3) values again sum to unity.
In Eric Ghysels’ Matlab code the two beta variants are labeled “normalized beta density with a zero
last lag” and “normalized beta density with a non-zero last lag” respectively. Note that while the
two basic beta parameters must be positive, the third additive parameter may be positive, negative
or zero.
Note that no normalization is applied in this case, so no additional coefficient should be placed
before the MIDAS lags term in the context of a regression.
Analytical gradients
Here we set out the expressions for the analytical gradients produced by the mgradient function,
and also used internally by the midasreg command. In these expressions f (i, θ) should be un-
derstood as referring back to the specific forms noted above for the exponential Almon and beta
distributions. The summation ∑_k should be understood as running from 1 to p.
For the normalized exponential Almon case, differentiating (41.3), with f(·) as in (41.4), gives

dwi /dθj = wi ( i^j − ∑_k k^j wk )
Part III
Technical details
Chapter 42
Gretl and ODBC
Gretl provides a method for retrieving data from databases which support the Open Database
Connectivity (ODBC) standard. Most users won’t be interested in this, but there may be some for
whom this feature matters a lot—typically, those who work in an environment where huge data
collections are accessible via a Data Base Management System (DBMS).
In the following section we explain what is needed for ODBC support in gretl. We provide some
background information on how ODBC works in section 42.2, and explain the details of getting gretl
to retrieve data from a database in section 42.3. Section 42.4 provides some example of usage, and
section 42.5 gives some details on the management of ODBC connections.
[Diagram: the gretl client sends an SQL query via ODBC to the DBMS, and the data are returned by the same route.]
For the above mechanism to work, it is necessary that the relevant ODBC software is installed
and working on the client machine (contact your DB administrator for details). At this point, the
database (or databases) that the server provides will be accessible to the client as a data source
with a specific identifier (a Data Source Name or DSN); in most cases, a username and a password
are required to connect to the data source.
Once the connection is established, the user sends a query to ODBC, which contacts the database
manager, collects the results and sends them back to the user. The query is almost invariably
formulated in a special language used for the purpose, namely SQL.1 We will not provide here an
SQL tutorial: there are many such tutorials on the Net; besides, each database manager tends to
support its own SQL dialect so the precise form of an SQL query may vary slightly if the DBMS on
the other end is Oracle, MySQL, PostgreSQL or something else.
Suffice it to say that the main statement for retrieving data is the SELECT statement. Within a DBMS,
data are organized in tables, which are roughly equivalent to spreadsheets. The SELECT statement
returns a subset of a table, which is itself a table. For example, imagine that the database holds a
table called “NatAccounts”, containing the data shown in Table 42.1.
Gretl provides a mechanism for forwarding your query to the DBMS via ODBC and including the
results in your currently open dataset.
42.3 Syntax
At present we do not offer a graphical interface for ODBC import; this must be done via the com-
mand line interface. The two commands used for fetching data via an ODBC connection are open
and data.
The open command is used for connecting to a DBMS: its syntax is
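Based on the description below and the examples in section 42.4, the command presumably takes the form

open dsn=database [user=username] [password=password] --odbc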
The user and password items are optional; the effect of this command is to initiate an ODBC
connection. It is assumed that the machine gretl runs on has a working ODBC client installed.
1 See http://en.wikipedia.org/wiki/SQL.
In order to actually retrieve the data, the data command is used. Its syntax is:
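A hedged sketch of the command's form, pieced together from the description below (the trailing --odbc flag is an assumption and may depend on the gretl version):

data series [obs-format=format-string] query=query-string --odbc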
where:
series is a list of names of gretl series to contain the incoming data, separated by spaces. Note that
these series need not exist prior to the ODBC import.
query-string is a string containing the SQL statement used to extract the data.
There should be no spaces around the equals signs in the obs-format and query fields in the data
command.
The query-string can, in principle, contain any valid SQL statement which results in a table. This
string may be specified directly within the command, as in
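for instance, a sketch consistent with the sentence that follows (the --odbc flag is assumed):

data x query="SELECT foo FROM bar" --odbc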
which will store into the gretl variable x the content of the column foo from the table bar. However,
since in a real-life situation the string containing the SQL statement may be rather long, it may be
best to store it in a string variable. For example:
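A hedged sketch (table and column names hypothetical; the --odbc flag is assumed):

string SqlQry = "SELECT foo1, foo2 FROM bar"
data x y query=SqlQry --odbc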
(The series named index is automatically added to a dataset created via the nulldata command.)
The format specifiers available for use with obs-format are as follows:
In addition the format can include literal characters to be passed through, such as slashes or colons,
to make the resulting string compatible with gretl’s observation identifiers.
For example, consider the following fictitious case: we have a 5-days-per-week dataset, to which we
want to add the stock index for the Verdurian market;2 it so happens that in Verduria Saturdays
are working days but Wednesdays are not. We want a column which does not contain data on
Saturdays, because we wouldn’t know where to put them, but at the same time we want to place
missing values on all the Wednesdays.
In this case, the following syntax could be used
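# the table name AlmeaIndex is illustrative
string Qry = "SELECT year, month, day, VerdSE FROM AlmeaIndex"
data y obs-format="%d-%02d-%02d" query=Qry --odbc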
The column VerdSE holds the data to be fetched, which will go into the gretl series y. The first
three columns are used to construct a string which identifies the day. Daily dates take the form
YYYY-MM-DD in gretl. If a row from the DBMS produces the observation string 2008-04-01 this will
match OK (it’s a Tuesday), but 2008-04-05 will not match since it is a Saturday; the corresponding
row will therefore be discarded. On the other hand, since no string 2008-04-23 will be found in
the data coming from the DBMS (it’s a Wednesday), that entry is left blank in our series y.
42.4 Examples
In the following examples, we will assume that access is available to a database known to ODBC
with the data source name “AWM”, with username “Otto” and password “Bingo”. The database
“AWM” contains quarterly data in two tables (see Tables 42.3 and 42.4):
2 See http://www.almeopedia.com/index.php/Verduria.
The table Consump is the classic “rectangular” dataset; that is, its internal organization is the same
as in a spreadsheet or econometrics package: each row is a data point and each column is a variable.
The structure of the DATA table is different: each record is one figure, stored in the column xval,
and the other fields keep track of which variable it belongs to, for which date.
nulldata 160
setobs 4 1970:1 --time-series
open dsn=AWM user=Otto password=Bingo --odbc
Listing 42.1 shows a query for two series: first we set up an empty quarterly dataset. Then we
connect to the database using the open statement. Once the connection is established we retrieve
two columns from the Consump table. No observation string is required because the data already
have a suitable structure; we need only import the relevant columns.
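The retrieval step of Listing 42.1 is a single data command along these lines (the column names are
illustrative, since the full listing is not reproduced here):
data y_ea y_us query="SELECT C_EA, C_US FROM Consump" --odbc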
In Listing 42.2, by contrast, we make use of the observation string since we are drawing from the
DATA table, which is not rectangular. The SQL statement stored in the string S produces a table with
three columns. The ORDER BY clause ensures that the rows will be in chronological order, although
this is not strictly necessary in this case.
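The core of that listing might look something like this (the name of the column identifying the
variable, and the value it is matched against, are illustrative):
string S = "SELECT year, qtr, xval FROM DATA WHERE varname = 'WLN' ORDER BY year, qtr"
data y obs-format="%d:%d" query=S --odbc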
Listing 42.3 shows what happens if the rows in the outcome from the SELECT statement do not
match the observations in the currently open gretl dataset. The query includes a condition which
filters out all the data from the first quarter. The query result (invisible to the user) would be
something like
+------+------+---------------+
| year | qtr | xval |
+------+------+---------------+
| 1970 | 2 | 7.8705000000 |
| 1970 | 3 | 7.5600000000 |
| 1970 | 4 | 7.1892000000 |
| 1971 | 2 | 5.8679000000 |
| 1971 | 3 | 6.2442000000 |
| 1971 | 4 | 5.9811000000 |
| 1972 | 2 | 4.6883000000 |
| 1972 | 3 | 4.6302000000 |
...
Internally, gretl fills the variable bar with the corresponding value if it finds a match; otherwise, NA
is used. Printing out the variable bar thus produces
Obs bar
1970:1
1970:2 7.8705
1970:3 7.5600
1970:4 7.1892
1971:1
1971:2 5.8679
1971:3 6.2442
1971:4 5.9811
1972:1
1972:2 4.6883
1972:3 4.6302
...
Chapter 43
Gretl and TEX
43.1 Introduction
TEX — initially developed by Donald Knuth of Stanford University and since enhanced by hundreds
of contributors around the world — is the gold standard of scientific typesetting. Gretl provides
various hooks that enable you to preview and print econometric results using the TEX engine, and
to save output in a form suitable for further processing with TEX.
This chapter explains the finer points of gretl’s TEX-related functionality. The next section describes
the relevant menu items; section 43.3 discusses ways of fine-tuning TEX output; and section 43.4
gives some pointers on installing (and learning) TEX if you do not already have it on your computer.
(Just to be clear: TEX is not included with the gretl distribution; it is a separate package, including
several programs and a large number of supporting files.)
Before proceeding, however, it may be useful to set out briefly the stages of production of a final
document using TEX. For the most part you don’t have to worry about these details, since, in regard
to previewing at any rate, gretl handles them for you. But having some grasp of what is going on
behind the scenes will enable you to understand your options better.
The first step is the creation of a plain text “source” file, containing the text or mathematics to
be typeset, interspersed with mark-up that defines how it should be formatted. The second step
is to run the source through a processing engine that does the actual formatting. Typically this is
a program called pdflatex that generates PDF output.1 (In times gone by it was a program called latex
that generated so-called DVI (device-independent) output.)
So gretl calls pdflatex to process the source file. On MS Windows and Mac OS X, gretl expects the
operating system to find the default viewer for PDF output. On GNU/Linux you can specify your
preferred PDF viewer via the menu item “Tools, Preferences, General,” under the “Programs” tab.
\[
\widehat{\mathrm{ENROLL}} = \underset{(0.066022)}{0.241105}
  + \underset{(0.04597)}{0.223530}\,\mathrm{CATHOL}
  - \underset{(0.0027196)}{0.00338200}\,\mathrm{PUPIL}
  - \underset{(0.040706)}{0.152643}\,\mathrm{WHITE}
\]
The distinction between the “Copy” and “Save” options (for both tabular and equation) is twofold.
First, “Copy” puts the TEX source on the clipboard while with “Save” you are prompted for the name
of a file into which the source should be saved. Second, with “Copy” the material is copied as a
“fragment” while with “Save” it is written as a complete file. The point is that a well-formed TEX
source file must have a header that defines the documentclass (article, report, book or whatever)
and tags that say \begin{document} and \end{document}. This material is included when you do
“Save” but not when you do “Copy”, since in the latter case the expectation is that you will paste
the data into an existing TEX source file that already has the relevant apparatus in place.
The items under “Equation options” should be self-explanatory: when printing the model in equa-
tion form, do you want standard errors or t-ratios displayed in parentheses under the parameter
estimates? The default is to show standard errors; if you want t-ratios, select that item.
Other windows
Several other sorts of output windows also have TEX preview, copy and save enabled. In the case of
windows having a graphical toolbar, look for the TEX button. Figure 43.2 shows this icon (second
from the right on the toolbar) along with the dialog that appears when you press the button.
One aspect of gretl’s TEX support that is likely to be particularly useful for publication purposes is
the ability to produce a typeset version of the “model table” (see section 3.4). An example of this is
shown in Table 43.2.
[Table 43.2 about here: a typeset model table of three OLS estimates with dependent variable
ENROLL. The surviving fragment shows the ADMEXP coefficient −0.1551 (0.1342), n = 51 in each
column, R̄² values of 0.4502, 0.4462 and 0.2956, and log-likelihoods ℓ of 96.09, 95.36 and 88.69.]

The default value of the preamble is as follows:

\documentclass[11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage{dcolumn,longtable}
\begin{document}
\thispagestyle{empty}
Note that the amsmath and dcolumn packages are required. (For some sorts of output the longtable
package is also needed.) Beyond that you can, for instance, change the type size or the font by
altering the documentclass declaration or including an alternative font package.
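For instance, a customized preamble that switches to 12-point Times fonts might look like this (a
sketch; mathptmx is just one of several possible font packages):
\documentclass[12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage{dcolumn,longtable}
\usepackage{mathptmx}
\begin{document}
\thispagestyle{empty}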
In addition, if you wish to typeset gretl output in more than one language, you can set up per-
language preamble files. A “localized” preamble file is identified by a name of the form gretlpre_xx.tex,
where xx is replaced by the first two letters of the current setting of the LANG environment vari-
able. For example, if you are running the program in Polish, using LANG=pl_PL, then gretl will do
the following when writing the preamble for a TEX source file.
1. Look for a file named gretlpre_pl.tex in the gretl user directory. If this is not found, then
2. look for a file named gretlpre.tex in the gretl user directory. If this is not found, then
3. use the default preamble.
Conversely, suppose you usually run gretl in a language other than English, and have a suitable
gretlpre.tex file in place for your native language. If on some occasions you want to produce TEX
output in English, then you could create an additional file gretlpre_en.tex: this file will be used
for the preamble when gretl is run with a language setting of, say, en_US.
Command-line options
After estimating a model via a script, or interactively via the gretl console or the command-line
program gretlcli, you can use the commands tabprint or eqnprint to print the model to
file in tabular format or equation format respectively. These options are explained in the Gretl
Command Reference.
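For example, assuming an --output option for directing the result to a named file (see the Command
Reference for the exact option syntax), a script might run:
ols ENROLL 0 CATHOL PUPIL WHITE
eqnprint --output="enroll_eq.tex"
tabprint --output="enroll_tab.tex"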
If you wish to alter the appearance of gretl’s tabular output for models in the context of the tabprint
command, you can specify a custom row format using the --format flag. The format string must
be enclosed in double quotes and must be tied to the flag with an equals sign. The pattern for the
format string is as follows. There are four fields, representing the coefficient, standard error, t-
ratio and p-value respectively. These fields should be separated by vertical bars; they may contain
a printf-type specification for the formatting of the numeric value in question, or may be left
blank to suppress the printing of that column (subject to the constraint that you can’t leave all the
columns blank). Here are a few examples:
--format="%.4f|%.4f|%.4f|%.4f"
--format="%.4f|%.4f|%.3f|"
--format="%.5f|%.4f||%.4f"
--format="%.8g|%.8g||%.4f"
The first of these specifications prints the values in all columns using 4 decimal places. The second
suppresses the p-value and prints the t-ratio to 3 places. The third omits the t-ratio. The last one
again omits the t, and prints both coefficient and standard error to 8 significant figures.
Once you set a custom format in this way, it is remembered and used for the duration of the gretl
session. To revert to the default formatting you can use the special variant --format=default.
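For instance, to set a format that omits the t-ratio column and later restore the default layout:
tabprint --format="%.5f|%.4f||%.4f"
# later calls to tabprint use this format until it is reset
tabprint --format=default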
Further editing
Once you have pasted gretl’s TEX output into your own document, or saved it to file and opened it
in an editor, you can of course modify the material in any way you wish. In some cases, machine-
generated TEX is hard to understand, but gretl’s output is intended to be human-readable and
-editable. In addition, it does not use any non-standard style packages. Besides the standard LATEX
document classes, the only files needed are, as noted above, the amsmath, dcolumn and longtable
packages. These should be included in any reasonably full TEX implementation.