LR With ANNs
www.elsevier.com/locate/yjbin
Methodological Review
Department of Software Engineering for Medicine, Upper Austria University of Applied Sciences, Hagenberg, Austria
Decision Systems Group, Brigham and Women's Hospital, Division of Health Sciences and Technology, Harvard Medical School and Massachusetts Institute of Technology, Boston, MA, USA
Received 7 February 2003
Abstract
Logistic regression and artificial neural networks are the models of choice in many medical data classification tasks. In this review, we summarize the differences and similarities of these models from a technical point of view, and compare them with other machine learning algorithms. We provide considerations useful for critically assessing the quality of the models and the results based on these models. Finally, we summarize our findings on how quality criteria for logistic regression and artificial neural network models are met in a sample of papers from the medical literature.
© 2003 Elsevier Science (USA). All rights reserved.
Keywords: Artificial neural networks; Logistic regression; Classification; Model comparison; Model evaluation; Medical data analysis
1. Introduction
Predictive models are used in a variety of medical
domains for diagnostic and prognostic tasks. These
models are built from experience, which constitutes data acquired from actual cases. The data can be preprocessed and expressed as a set of rules, as is often the case in knowledge-based expert systems, or serve as training data for statistical and machine learning models. Among the options in the latter category, the most popular models in medicine are logistic regression (LR) and artificial neural networks (ANN). These models have their origins in two different communities (statistics and computer science), but share many similarities.
In this article, we show that logistic regression and
artificial neural networks share common roots in statistical pattern recognition, and how the latter model can be seen as a generalization of the former. We briefly compare these two methods with other popular classification algorithms from the machine learning field, such as decision trees, k-nearest neighbors, and support vector machines.
* Corresponding author. Fax: +43-7236-3888-2099.
E-mail address: [email protected] (S. Dreiseitl).
1532-0464/02/$ - see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S1532-0464(03)00034-0
f(x) = 1 / (1 + e^(-ax)),

P(D | x) = 1 / (1 + e^(-(β^T x + β₀))),

P(a | D) = P(D | a) P(a) / P(D).
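As a minimal sketch, the logistic regression posterior P(D | x) = 1 / (1 + e^(-(β^T x + β₀))) can be computed as follows (the coefficient values here are hypothetical, for illustration only):

```python
import math

def logistic(z):
    """Logistic (sigmoid) function f(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def lr_posterior(x, beta, beta0):
    """Posterior P(D | x) of a logistic regression model:
    P(D | x) = 1 / (1 + e^{-(beta^T x + beta0)})."""
    z = sum(b * xi for b, xi in zip(beta, x)) + beta0
    return logistic(z)

# Hypothetical coefficients and feature vector, for illustration only.
p = lr_posterior([1.2, 0.5], beta=[0.8, -1.1], beta0=-0.3)
```

The output is a probability in (0, 1), which is what allows logistic regression outputs to be interpreted directly as class membership probabilities.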
Table 1
Percentage of papers (out of 72) satisfying five quality criteria

                 Details given (%)    Details not given (%)
Criterion 1             76                    24
Criterion 2             51                    49
Criterion 3             89                    11
Criterion 4             61                    39
Criterion 5             25                    75
Table 2
Summary of comparing the discriminatory power of artificial neural networks with logistic regression models, as percentage of 72 papers

                     ANN better (%)    LR better (%)    No difference (%)
Stat. testing              18                1                 42
No stat. testing           33                6                  0
5. Discussion
An increasingly large number of data items are collected routinely, and often automatically, in many areas
of medicine. It is a challenge for the fields of machine learning and statistics to extract useful information and knowledge from this wealth of data.
Mistakes in model building and evaluation can have
disastrous consequences in some medical applications.
Special care must therefore be taken to ensure that the
models are validated, preferably by using an external
data set and checking the model's plausibility by surveying a panel of experts in the domain [24,25].
The latter is possible only for so-called white-box
models that allow an interpretation of model parameters. Examples of such algorithms are decision trees
(which may be expressed as a set of rules), k-nearest neighbors (which provides exemplars similar to the cases to be classified), and logistic regression (where the coefficients' sizes determine their relative importance for the classification result).
Black-box models, such as support vector machines or
artificial neural networks, do not allow such an interpretation, and can only be verified externally. Contrasting views on the role of artificial neural networks as predictive models are given in [26,27]. Nevertheless, their discriminatory power is often significantly better than that of white-box models, which may explain their popularity in domains where classification performance is more important than model interpretation.
Most of the papers summarized in Section 4 have
shown logistic regression and artificial neural networks to work well on a wide variety of data sets. Their performance is generally better, at least on continuous data, than that of decision trees and k-nearest neighbors. This may be explained by the fact that the decision tree algorithm does not construct a decision boundary between classes per se, but rather splits the data set optimally at each tree node. As explained in Section 2, this may result in suboptimal classification results. The performance of k-nearest neighbors is generally worse on high-dimensional data because, when the relative importance of the dimensions is not weighted, the data from spurious and irrelevant dimensions may negatively influence the distance calculation [28].
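A small sketch of this effect, assuming an unweighted Euclidean distance: two points that are close in the single informative dimension end up far apart once irrelevant noise dimensions are appended.

```python
import math
import random

random.seed(0)

def euclidean(a, b):
    """Unweighted Euclidean distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Two points that are close in the single informative dimension...
x = [0.0]
y = [0.1]
d_low = euclidean(x, y)

# ...but once 50 irrelevant noise dimensions are appended, the
# unweighted distance is dominated by the noise, so nearest-neighbor
# rankings no longer reflect the informative dimension.
noise_x = [random.gauss(0, 1) for _ in range(50)]
noise_y = [random.gauss(0, 1) for _ in range(50)]
d_high = euclidean(x + noise_x, y + noise_y)
```

Weighting or removing the irrelevant dimensions (e.g. by feature selection) is the usual remedy.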
Support vector machines, on the other hand, have
shown comparable performance in the few studies on medical data sets [29,30]. They are not yet as widely used as logistic regression and artificial neural networks, in part because few easy-to-use software implementations are available, and because the kernel functions and kernel parameter settings have to be estimated from the data (mostly by cross-validation or bootstrapping).
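A minimal sketch of the cross-validation approach to choosing such a kernel parameter (the SVM fitting function in the commented line is a hypothetical stand-in; only the fold logic is shown):

```python
import statistics

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, labels, train_and_score, k=5):
    """Mean held-out score of a model over k folds."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [(d, l) for i, (d, l) in enumerate(zip(data, labels)) if i not in held_out]
        test = [(d, l) for i, (d, l) in enumerate(zip(data, labels)) if i in held_out]
        scores.append(train_and_score(train, test))
    return statistics.mean(scores)

# Hypothetical use: pick the kernel width gamma with the best CV score,
# where fit_svm_and_score is a stand-in for any SVM implementation:
# best_gamma = max(gammas, key=lambda g: cross_validate(
#     X, y, lambda tr, te: fit_svm_and_score(tr, te, g)))
```

In practice a stratified split (preserving class proportions per fold) is preferred for classification tasks; the contiguous split above is the simplest variant.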
In short, the widespread use of logistic regression and artificial neural network models seems to be motivated by the fact that they have lower generalization error than the alternatives discussed above.
6. Conclusion
In this methodology review, we explained the use of
logistic regression and artificial neural network models for biomedical data classification. We outlined the common foundations of both models in statistical pattern recognition, and briefly compared these models with other classification algorithms. We showed how to build logistic regression and artificial neural network models, how to evaluate them, and which performance indices to report.
We surveyed papers that compare both models to
determine the current level of publication standards, and noticed that the information relevant for measuring the methodological soundness of a paper is reported more often for logistic regression models. We conjecture that this is because the model-building process is easier for logistic regression, and that for artificial neural networks it may be considered too detailed and not worthy of publication. This greatly limits the reader's ability to reproduce the reported results.
We discussed the application areas, relative merits
and common pitfalls of classification algorithms in biomedicine. So far, there is no single algorithm that performs better than all other algorithms on every data set and in every application area. For logistic regression, the popularity may be attributed to the interpretability of its model parameters and its ease of use; for artificial neural networks, it may be due to the fact that these models can be seen as nonlinear generalizations of logistic regression, and are thus at least as powerful as that model. The evidence summarized in Section 4 shows that, of the tasks where performance was compared statistically, neural networks were not significantly better in a 5:2 ratio of cases. It remains to be seen whether newer machine learning algorithms, such as support vector machines and other kernel-based methods, can prove to be significantly better than both logistic regression and artificial neural networks.
Until further studies are conducted and guidelines for evaluating predictive models are adopted, there may continue to be a publication bias in favor of the newer machine learning methods, often with disregard for proper evaluation of the results. This may mislead readers into thinking that the new methods are not subject to the pervasive trade-offs between flexibility and overfitting that are typical of classical models such as logistic regression and artificial neural networks.
References
[1] Duda R, Hart P, Stork D. Pattern classification. 2nd ed. New York: Wiley/Interscience; 2000.
[2] Vapnik V. The nature of statistical learning theory. 2nd ed. New York: Springer; 2000.