IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 42, NO. 8, AUGUST 2004

Classification of Hyperspectral Remote Sensing Images With Support Vector Machines

Farid Melgani and Lorenzo Bruzzone
Abstract—This paper addresses the problem of the classification of hyperspectral remote sensing images by support vector machines (SVMs). First, we propose a theoretical discussion and experimental analysis aimed at understanding and assessing the potentialities of SVM classifiers in hyperdimensional feature spaces. Then, we assess the effectiveness of SVMs with respect to conventional feature-reduction-based approaches and their performances in hypersubspaces of various dimensionalities. To sustain such an analysis, the performances of SVMs are compared with those of two other nonparametric classifiers (i.e., radial basis function neural networks and the K-nearest neighbor classifier). Finally, we study the potentially critical issue of applying binary SVMs to multiclass problems in hyperspectral data. In particular, four different multiclass strategies are analyzed and compared: the one-against-all, the one-against-one, and two hierarchical tree-based strategies. Different performance indicators have been used to support our experimental studies in a detailed and accurate way, i.e., the classification accuracy, the computational time, the stability to parameter setting, and the complexity of the multiclass architecture. The results obtained on a real Airborne Visible/Infrared Imaging Spectroradiometer hyperspectral dataset allow us to conclude that, whatever the multiclass strategy adopted, SVMs are a valid and effective alternative to conventional pattern recognition approaches (feature-reduction procedures combined with a classification method) for the classification of hyperspectral remote sensing data.

Index Terms—Classification, feature reduction, Hughes phenomenon, hyperspectral images, multiclass problems, remote sensing, support vector machines (SVMs).

Manuscript received November 4, 2003; revised May 16, 2004. This work was supported by the Italian Ministry of Education, Research and University (MIUR).
The authors are with the Department of Information and Communication Technologies, University of Trento, I-38050 Trento, Italy (e-mail: melgani@dit.unitn.it; lorenzo.bruzzone@ing.unitn.it).
Digital Object Identifier 10.1109/TGRS.2004.831865

I. INTRODUCTION

REMOTE sensing images acquired by multispectral sensors, such as the widely used Landsat Thematic Mapper (TM) sensor, have shown their usefulness in numerous earth observation (EO) applications. In general, the relatively small number of acquisition channels that characterizes multispectral sensors may be sufficient to discriminate among different land-cover classes (e.g., forestry, water, crops, urban areas, etc.). However, their discrimination capability is very limited when different types (or conditions) of the same species (e.g., different types of forest) are to be recognized. Hyperspectral sensors can be used to deal with this problem. These sensors are characterized by a very high spectral resolution that usually results in hundreds of observation channels. Thanks to these channels, it is possible to address various additional applications requiring very high discrimination capabilities in the spectral domain (including material quantification and target detection). From a methodological viewpoint, the automatic analysis of hyperspectral data is not a trivial task. In particular, it is made complex by many factors, such as: 1) the large spatial variability of the hyperspectral signature of each land-cover class; 2) atmospheric effects; and 3) the curse of dimensionality. In the context of supervised classification, one of the main difficulties is related to the small ratio between the number of available training samples and the number of features. This makes it impossible to obtain reasonable estimates of the class-conditional hyperdimensional probability density functions used in standard statistical classifiers. As a consequence, when the number of features given as input to the classifier increases beyond a given threshold (which depends on the number of training samples and the kind of classifier adopted), the classification accuracy decreases (this behavior is known as the Hughes phenomenon [1]).

Much work has been carried out in the literature to overcome this methodological issue. Four main approaches can be identified: 1) regularization of the sample covariance matrix; 2) adaptive statistics estimation by the exploitation of the classified (semilabeled) samples; 3) preprocessing techniques based on feature selection/extraction, aimed at reducing/transforming the original feature space into another space of a lower dimensionality; and 4) analysis of the spectral signatures to model the classes.

The first approach uses the multivariate normal (Gaussian) probability density model, which is a widely accepted statistical model for optically remotely sensed data. For each information class, such a model requires the correct estimation of first- and second-order statistics. In the presence of an unfavorable ratio between the number of available training samples and features, the common way of estimating the covariance matrix may lead to inaccurate estimations (that may make it impossible to invert the covariance matrix in maximum-likelihood (ML) classifiers). Several alternative and improved covariance matrix estimators have been proposed to reduce the variance of the estimate for limited training samples [2], [3]. The main problem with improved covariance estimators is the risk that the estimated covariance matrices overfit the few available training samples and lead to a poor approximation of statistics for the whole image to be classified.

The second approach to overcome the Hughes phenomenon proposes to use in an iterative way the semilabeled samples obtained after classification in order to enhance statistics estimation and to improve classification accuracy. Samples are initially
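As an aside, the regularization idea of the first approach can be sketched in a few lines of Python. The shrinkage target and weight below are generic illustrative choices, not the specific estimators proposed in [2] and [3]; the point is only that blending the sample covariance with a positive-definite target restores invertibility when there are fewer training samples than spectral bands:

```python
import numpy as np

def shrinkage_covariance(X, alpha):
    """Blend the sample covariance with a scaled identity matrix.

    A generic ridge-style regularizer (an assumption for illustration,
    not the estimators of [2], [3]): alpha=0 gives the plain sample
    estimate (singular when samples < bands), alpha>0 a full-rank one.
    """
    S = np.cov(X, rowvar=False)            # sample covariance, d x d
    d = S.shape[0]
    target = np.eye(d) * np.trace(S) / d   # identity with the same average variance
    return (1.0 - alpha) * S + alpha * target

# 10 training samples in a 50-band space: the sample covariance is
# rank-deficient, the shrunk estimate is invertible.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))
S = shrinkage_covariance(X, 0.0)
S_reg = shrinkage_covariance(X, 0.1)
print(np.linalg.matrix_rank(S))      # < 50: cannot be inverted in an ML classifier
print(np.linalg.matrix_rank(S_reg))  # 50: full rank
```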
MELGANI AND BRUZZONE: CLASSIFICATION OF HYPERSPECTRAL REMOTE SENSING IMAGES WITH SVMs 1779
classified by using the available training samples. Then, the classified samples, together with the training ones, are exploited iteratively to update the class statistics and, accordingly, the results of the classification up to convergence [4], [5]. The process of integration between these two typologies of samples (i.e., the training and the semilabeled samples) is carried out by the expectation–maximization (EM) algorithm, which represents a general and powerful solution to the problem of ML estimation of statistics in the presence of incomplete data [6], [7]. The main advantage of this approach is that it fits the true class distributions better, since a larger portion of the image (available with no extra cost) contributes to the estimation process. The main problems related to this second approach are two: 1) it is demanding from the computational point of view and 2) it requires that the initial class model estimated from the training samples should match the unlabeled samples well enough to avoid divergence of the estimation process and, accordingly, to improve the accuracy of the model parameter estimation.

In order to overcome the problem of the curse of dimensionality, the third approach proposes to reduce the dimensionality of the feature space by means of feature selection or extraction techniques. Feature-selection techniques perform a reduction of spectral channels by selecting a representative subset of original features. This can be done following: 1) a selection criterion and 2) a search strategy. The former aims at assessing the discrimination capabilities of a given subset of features according to statistical distance measures among classes (e.g., the Bhattacharyya distance, the Jeffries–Matusita distance, and the transformed divergence measure [8], [9]). The latter plays a crucial role in hyperdimensional spaces, since it defines the optimization approach necessary to identify the best (or a good) subset of features according to the used selection criterion. Since the identification of the optimal solution is computationally unfeasible, techniques that lead to suboptimal solutions are normally used. Among the search strategies proposed in the literature, it is worth mentioning the basic sequential forward selection (SFS) [10], the more effective sequential forward floating selection [11], and the steepest ascent (SA) techniques [12]. The feature-extraction approach addresses the problem of feature reduction by transforming the original feature space into a space of a lower dimensionality, which contains most of the original information. In this context, the decision boundary feature extraction (DBFE) method [13] has proved to be a very effective method, capable of providing a minimum number of transformed features that achieve good classification accuracy. However, this feature-extraction technique suffers from high computational complexity, which often makes it unpractical. This problem can be overcome by coupling it with the projection pursuit (PP) algorithm [14], which plays the role of a preprocessor to the DBFE by applying a preliminary limited reduction of the feature space with (hopefully) an almost negligible information loss. An alternative feature-extraction method, whose class-specific nature makes it particularly attractive, was proposed by Kumar et al. [15]. It is based on a combination of subsets of (highly correlated) adjacent bands into fewer features by means of top-down and bottom-up algorithms. In general, it is evident that even if feature-reduction techniques take care of limiting the loss of information, this loss is often unavoidable and may have a negative impact on classification accuracy.

Finally, the approach inherited from spectroscopic methods in analytical chemistry to deal with hyperspectral data is worth mentioning. The idea behind this approach is that of looking at the response from each pixel in the hyperspectral image as a one-dimensional spectral signal (signature). Each information class is modeled by some descriptors of the shape of its spectra [16], [17]. The merit of this approach is that it significantly simplifies the formulation of the hyperspectral data classification problem. However, additional work is required to find appropriate shape descriptors capable of accurately capturing the spectral shape variability related to each information class.

Other methods also exist that are not included in the group of the four main approaches discussed above. In particular, it is interesting to mention the method based on the combination of different classifiers [18] and that based on cluster-space representation [19].

Recently, particular attention has been dedicated to support vector machines (SVMs) for the classification of multispectral remote sensing images [20]–[22]. SVMs have often been found to provide higher classification accuracies than other widely used pattern recognition techniques, such as the maximum-likelihood and the multilayer perceptron neural network classifiers. Furthermore, SVMs appear to be especially advantageous in the presence of heterogeneous classes for which only few training samples are available. In the context of hyperspectral image classification, some pioneering experimental investigations preliminarily pointed out the effectiveness of SVMs in analyzing hyperspectral data directly in the hyperdimensional feature space, without the need of any feature-reduction procedure [23]–[26]. In particular, in [24], the authors found that a significant improvement of classification accuracy can be obtained by SVMs with respect to the results achieved by the basic minimal-distance-to-means classifier and those reported in [3]. In order to show its relatively low sensitivity to the number of training samples, the accuracy of the SVM classifier was estimated on the basis of different proportions between the number of training and test samples. As will be explained in the following section, this mainly depends on the fact that SVMs implement a classification strategy that exploits a margin-based "geometrical" criterion rather than a purely "statistical" criterion. In other words, SVMs do not require an estimation of the statistical distributions of classes to carry out the classification task, but they define the classification model by exploiting the concept of margin maximization. The growing interest in SVMs [27]–[30] is confirmed by their successful implementation in numerous other pattern recognition applications such as biomedical imaging [31], image compression [32], and three-dimensional object recognition [33]. Such an interest is justified by three main general reasons: 1) their intrinsic effectiveness with respect to traditional classifiers, which results in high classification accuracies and very good generalization capabilities; 2) the limited effort required for architecture design (i.e., they involve few control parameters); and 3) the possibility of solving the learning problem according to linearly constrained quadratic programming (QP) methods (which have been studied intensely in the scientific literature). However, a major drawback of SVMs is that, from a theoretical point of view, they were originally developed to solve binary
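The greedy search underlying SFS can be sketched compactly. The criterion below is a toy stand-in; a real implementation would plug in an interclass distance such as the Jeffries–Matusita measure computed on the training samples:

```python
def sequential_forward_selection(features, criterion, k):
    """Greedy SFS: grow the feature subset one band at a time, each time
    adding the band that maximizes the separability criterion evaluated
    on the enlarged subset. `criterion(subset)` is assumed to return a
    score to maximize (e.g., an interclass distance measure)."""
    selected = []
    remaining = list(features)
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda f: criterion(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy criterion rewarding low-indexed bands, used only to exercise the
# search; it is an illustrative assumption, not a separability measure.
score = lambda subset: -sum(subset)
print(sequential_forward_selection(range(6), score, 3))  # [0, 1, 2]
```

The suboptimality mentioned above is visible in the structure: once a band is added it is never removed, which is precisely what the floating variant [11] relaxes.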
classification problems. This drawback becomes even more evident when dealing with data acquired from hyperspectral sensors, since they are intrinsically designed to discriminate among a broad range of land-cover classes that may be very similar from a spectral viewpoint. The implementation of SVMs in multiclass classification problems can be approached in two ways [23], [24], [34], [35]. The first consists of defining an architecture made up of an ensemble of binary classifiers. The decision is then taken by combining the partial decisions of the single members of the ensemble. The second is represented by SVMs formulated directly as a multiclass optimization problem. Because of the number of classes that are to be discriminated simultaneously, the number of parameters to be estimated increases considerably in a multiclass optimization formulation. This renders the method less stable and, accordingly, affects the classification performances in terms of accuracy. For this reason, multiclass optimization has not been as successful as the approach based on two-class optimization.

In this paper, we present a theoretical discussion and an accurate experimental analysis that aim: 1) at assessing the properties of SVM classifiers in hyperdimensional feature spaces and 2) at evaluating the impact of the multiclass problem involved by SVM classifiers when applied to hyperspectral data by comparing different multiclass strategies. With regard to the experimental part of the first objective, the assessment of SVM effectiveness is carried out through two different experiments. In the first, we compare the performances of SVMs with those of two other nonparametric classifiers applied directly to the original hyperdimensional feature space: the radial basis function neural network, which is another kernel-based classification method (like SVMs) that uses a different classification strategy based on a "statistical" rather than a "geometrical" criterion; and the K-nearest neighbors classifier, which is widely used in pattern recognition as a reference classification method. The second experiment consists of a comparison of SVMs with the classical classification approach adopted for hyperspectral data, i.e., a conventional classifier combined with a feature-reduction technique. This also allows us to assess the performances of SVMs in hypersubspaces of various dimensionalities. As regards the second objective of this work, four different multiclass strategies are analyzed and compared. In particular, the widely used one-against-all and one-against-one strategies are considered. In addition, two strategies based on the hierarchical tree approach are investigated. The experimental studies were carried out on the basis of hyperspectral images acquired by the Airborne Visible/Infrared Imaging Spectroradiometer (AVIRIS) sensor in June 1992 on the Indian Pines area (Indiana) [36]. Different performance indicators are used to support our experimental analysis, namely, the classification accuracy, the computational time, the stability to parameter setting, and the complexity of the multiclass architecture adopted. Experimental results confirm the significant superiority of the SVM classifiers over conventional classification methodologies in the context of hyperspectral data classification, whatever the multiclass strategy adopted to face the multiclass dilemma.

The rest of this paper is organized in four sections. Section II recalls the mathematical formulation of SVMs and discusses their potential properties in hyperspectral feature spaces. Section III describes different strategies that can be used to solve multiclass problems with binary SVMs and that are adopted in the experiments to assess the impact of the multiclass problem in a hyperdimensional context. Section IV deals with the experimental phase of the work. Finally, Section V summarizes the observations and concluding remarks that complete this paper.

II. SVM CLASSIFICATION APPROACH

A. SVM Mathematical Formulation

1) Linear SVM: Linearly Separable Case: Let us consider a supervised binary classification problem. Let us assume that the training set consists of $N$ vectors $\mathbf{x}_i$ ($i = 1, \ldots, N$) from the $d$-dimensional feature space $\mathbb{R}^d$. A target $y_i \in \{-1, +1\}$ is associated to each vector $\mathbf{x}_i$. Let us assume that the two classes are linearly separable. This means that it is possible to find at least one hyperplane (linear surface) defined by a vector $\mathbf{w}$ (normal to the hyperplane) and a bias $b$ that can separate the two classes without errors. The membership decision rule can be based on the function $\mathrm{sgn}[f(\mathbf{x})]$, where $f(\mathbf{x})$ is the discriminant function associated with the hyperplane and defined as

$$f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b. \qquad (1)$$

In order to find such a hyperplane, one should estimate $\mathbf{w}$ and $b$ so that

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) > 0, \quad \text{with } i = 1, \ldots, N. \qquad (2)$$

The SVM approach consists in finding the optimal hyperplane that maximizes the distance between the closest training sample and the separating hyperplane. It is possible to express this distance as equal to $1/\|\mathbf{w}\|$ with a simple rescaling of the hyperplane parameters $\mathbf{w}$ and $b$ such that

$$\min_{i=1,\ldots,N} y_i(\mathbf{w} \cdot \mathbf{x}_i + b) = 1. \qquad (3)$$

The geometrical margin between the two classes is given by the quantity $2/\|\mathbf{w}\|$. The concept of margin is central in the SVM approach, since it is a measure of its generalization capability. The larger the margin, the higher the expected generalization [27].

Accordingly, it turns out that the optimal hyperplane can be determined as the solution of the following convex quadratic programming problem:

$$\text{minimize:} \quad \frac{1}{2}\|\mathbf{w}\|^2$$
$$\text{subject to:} \quad y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1, \quad i = 1, \ldots, N. \qquad (4)$$

This classical linearly constrained optimization problem can be translated (using a Lagrangian formulation) into the following dual problem:

$$\text{maximize:} \quad \sum_{i=1}^{N} \alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i \cdot \mathbf{x}_j)$$
$$\text{subject to:} \quad \alpha_i \geq 0, \; i = 1, \ldots, N, \quad \text{and} \quad \sum_{i=1}^{N} \alpha_i y_i = 0. \qquad (5)$$
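The linearly separable formulation can be checked numerically. The sketch below uses scikit-learn's `SVC` with a large `C` to approximate the hard-margin solution of (4); the toy data and the choice of library are illustrative assumptions, not the authors' setup:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2-D.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
y = np.array([-1, -1, 1, 1])

# A very large C makes the soft-margin solver behave as a hard-margin SVM.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Every training sample should satisfy y_i (w . x_i + b) >= 1, with the
# support vectors attaining equality (up to solver tolerance).
margins = y * (X @ w + b)
print(margins.min())               # close to 1
print(2.0 / np.linalg.norm(w))     # geometrical margin 2/||w||
```

For hard-margin SVMs the quantity `2/||w||` equals the distance between the convex hulls of the two classes, which for this toy set is the distance between (0, 1) and (3, 3).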
Fig. 1. Optimal separating hyperplane in SVMs for a linearly nonseparable case. White and black circles refer to the classes "+1" and "−1," respectively. Support vectors are indicated by an extra circle.
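The decision for a new sample is driven entirely by the support vectors of the kind highlighted in Fig. 1. A minimal Python sketch of evaluating such a discriminant with an RBF kernel follows; the multipliers, labels, and bias are hand-set illustrative values, not the output of an actual QP solution:

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def svm_discriminant(x, support_vectors, alphas, labels, b, gamma):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b, summed over the
    support vectors only (the samples with nonzero alpha_i)."""
    return sum(a * y * rbf_kernel(sv, x, gamma)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b

# Two assumed support vectors, one per class.
svs = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
alphas, labels, b, gamma = [1.0, 1.0], [+1, -1], 0.0, 0.5

# A sample near the first support vector is assigned to class "+1".
x = np.array([0.1, 0.0])
print(np.sign(svm_discriminant(x, svs, alphas, labels, b, gamma)))  # 1.0
```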
The Lagrange multipliers $\alpha_i$ ($i = 1, \ldots, N$) expressed in (5) can be estimated using quadratic programming (QP) methods [27]. The discriminant function associated with the optimal hyperplane becomes an equation depending both on the Lagrange multipliers and on the training samples, i.e.,

$$f(\mathbf{x}) = \sum_{i \in S} \alpha_i y_i (\mathbf{x}_i \cdot \mathbf{x}) + b \qquad (6)$$

where $S$ is the subset of training samples associated with nonzero Lagrange multipliers (the support vectors).

In the linearly nonseparable case (Fig. 1), the minimization of the cost function expressed in (7) is subject to the following constraints:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 - \xi_i, \quad i = 1, \ldots, N \qquad (8)$$
$$\xi_i \geq 0, \quad i = 1, \ldots, N \qquad (9)$$

where the $\xi_i$'s are slack variables that account for samples violating the margin.

In the nonlinear case, the dual problem can be written in terms of the inner products in the transformed space $\Phi(\cdot)$, i.e., as in

$$\text{maximize:} \quad \sum_{i=1}^{N} \alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j) \qquad (11)$$
$$\text{subject to:} \quad \sum_{i=1}^{N} \alpha_i y_i = 0 \quad \text{and} \quad 0 \leq \alpha_i \leq C, \; i = 1, \ldots, N$$

where $K(\mathbf{x}_i, \mathbf{x}_j) = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)$ denotes the kernel function. The final result is a discriminant function conveniently expressed as a function of the data in the original (lower) dimensional feature space.

First, in a hyperspectral space, normally distributed samples (a reasonable assumption for optically remotely sensed data) tend to fall toward the tails of the density function, with virtually no samples falling in the central region [39]. This can be illustrated by a simple geometric example [40]. Let us consider the ratio between the volume of a sphere of radius $r$ and that of a cube defined in the interval $[-r, r]$ in the $d$-dimensional space. It is equal to

$$\frac{V_{\mathrm{sphere}}(r)}{V_{\mathrm{cube}}(r)} = \frac{\pi^{d/2}}{d\,2^{d-1}\,\Gamma(d/2)}. \qquad (15)$$
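The volume ratio in (15) is easy to evaluate; the sketch below derives it from $V_{\mathrm{sphere}} = \pi^{d/2} r^d / \Gamma(d/2 + 1)$ and $V_{\mathrm{cube}} = (2r)^d$ (the radius cancels) and shows how quickly it collapses with dimensionality:

```python
import math

def sphere_to_cube_volume_ratio(d):
    """Ratio of the volume of a d-sphere of radius r to that of the
    enclosing cube [-r, r]^d; equivalent to pi^(d/2) / (d 2^(d-1) Gamma(d/2))."""
    return math.pi ** (d / 2) / (2 ** d * math.gamma(d / 2 + 1))

for d in (1, 2, 3, 10, 50):
    print(d, sphere_to_cube_volume_ratio(d))
# The ratio tends to zero as d grows: in a hyperdimensional cube, almost
# all of the volume lies in the corners, far from the central region.
```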
• If Card(…) > …, divide … into two groups … and … such that … and …

Step 2: Stop Condition
—If … or … such that Card(…) > … or Card(…) > … with …, go to Step 1. Otherwise, Stop.

2) BHT-One Against All Strategy: The second binary tree-based hierarchy, called BHT-one against all (BHT-OAA), represents a simplification of the OAA strategy obtained through its implementation in a hierarchical context. To this end, we propose to define the tree in such a way that each node discriminates between two groups of classes … and …, where … represents the information class with the highest prior probability among those belonging to …. This kind of hierarchy leads to a tree with only one single branch, as depicted in Fig. 3(b). The algorithm of the BHT-OAA strategy is drawn up in the following:

Step 0: Root Node
—Set level index k = 0
—Set …
—Divide … into two groups … and … such that … and …
Step 1: k-Level Branching
—Divide … into two groups … and … such that … and …
—Set …
Step 2: Stop Condition
—If Card(…) > 1, go to Step 1. Otherwise, Stop.

It is worth noting that both BHT strategies make it possible to reduce the number of required SVMs from T and T(T − 1)/2 (T being the number of information classes), respectively, for the OAA and OAO strategies, to T − 1. Since the classification time depends linearly on the number of SVMs and since
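The architecture sizes just compared can be captured in a few lines; the class count T = 9 used below is only an illustrative value:

```python
def num_binary_svms(T):
    """Number of binary SVMs each multiclass strategy requires for T classes.
    OAA trains one machine per class, OAO one per unordered pair of classes,
    and a binary hierarchical tree (either BHT variant) one per internal
    node of a tree with T leaves, i.e., T - 1."""
    return {"OAA": T, "OAO": T * (T - 1) // 2, "BHT": T - 1}

print(num_binary_svms(9))  # {'OAA': 9, 'OAO': 36, 'BHT': 8}
```

Because classification time grows linearly with the number of machines traversed, the gap between OAO and the tree-based strategies widens quadratically with T.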
TABLE II
BEST OVERALL AND CLASS-BY-CLASS ACCURACIES, AND COMPUTATIONAL TIMES ACHIEVED ON THE TEST SET BY THE DIFFERENT CLASSIFIERS IN THE ORIGINAL HYPERSPECTRAL SPACE

TABLE V
CLASSIFICATION ACCURACIES YIELDED ON THE TEST SET BY THE DIFFERENT CLASSIFIERS WITH THE SUBSET OF THE BEST 30 FEATURES SELECTED ACCORDING TO THE SA-BASED FEATURE-SELECTION PROCEDURE. THE DIFFERENCE IN OVERALL ACCURACY (DIFF-OA) FOR EACH CLASSIFIER WITH RESPECT TO THE ACCURACY ACHIEVED IN THE ORIGINAL HYPERDIMENSIONAL SPACE IS ALSO GIVEN

TABLE VI
OVERALL AND CLASS-BY-CLASS ACCURACIES OBTAINED ON THE TEST SET BY SVMS WITH THE DIFFERENT MULTICLASS STRATEGIES CONSIDERED

TABLE VII
COMPUTATIONAL TIME AND CLASSIFICATION COMPLEXITY ASSOCIATED WITH THE DIFFERENT SVM MULTICLASS STRATEGIES CONSIDERED

TABLE VIII
OVERALL ACCURACY YIELDED ON THE TEST SET BY EACH SINGLE SVM OF THE BHT-BB AND BHT-OAA STRATEGIES
[29] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining Knowl. Discov., vol. 2, pp. 121–167, 1998.
[30] Set of tutorials on SVMs and kernel methods [Online]. Available: http://www.kernel-machines.org/tutorial.html
[31] I. El-Naqa, Y. Yongyi, M. N. Wernick, N. P. Galatsanos, and R. M. Nishikawa, "A support vector machine approach for detection of microcalcifications," IEEE Trans. Med. Imag., vol. 21, pp. 1552–1563, Dec. 2002.
[32] J. Robinson and V. Kecman, "Combining support vector machine learning with the discrete cosine transform in image compression," IEEE Trans. Neural Networks, vol. 14, pp. 950–958, July 2003.
[33] M. Pontil and A. Verri, "Support vector machines for 3D object recognition," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, pp. 637–646, June 1998.
[34] D. J. Sebald and J. A. Bucklew, "Support vector machines and the multiple hypothesis test problem," IEEE Trans. Signal Processing, vol. 49, pp. 2865–2872, Nov. 2001.
[35] C.-W. Hsu and C.-J. Lin, "A comparison of methods for multiclass support vector machines," IEEE Trans. Neural Networks, vol. 13, pp. 415–425, Mar. 2002.
[36] AVIRIS NW Indiana's Indian Pines 1992 data set [Online]. Available: ftp://ftp.ecn.purdue.edu/biehl/MultiSpec/92AV3C (original files) and ftp://ftp.ecn.purdue.edu/biehl/PC_MultiSpec/ThyFiles.zip (ground truth).
[37] O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, "Choosing multiple parameters for support vector machines," Mach. Learn., vol. 46, pp. 131–159, 2002.
[38] K.-M. Chung, W.-C. Kao, T. Sun, L.-L. Wang, and C.-J. Lin, "Radius margin bounds for support vector machines with the RBF kernel," Neural Comput., vol. 15, pp. 2643–2681, 2003.
[39] L. O. Jimenez and D. A. Landgrebe, "Supervised classification in high-dimensional space: Geometrical, statistical, and asymptotic properties of multivariate data," IEEE Trans. Syst., Man, Cybern. C, vol. 28, pp. 39–54, Jan. 1998.
[40] M. G. Kendall, A Course in the Geometry of n-Dimensions. New York: Hafner, 1961.
[41] K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed. New York: Academic, 1990.
[42] L. Bottou, C. Cortes, J. Denker, H. Drucker, I. Guyon, L. Jackel, Y. LeCun, U. Muller, E. Sackinger, P. Simard, and V. Vapnik, "Comparison of classifier methods: A case study in handwriting digit recognition," in Proc. Int. Conf. Pattern Recognition, 1994, pp. 77–87.
[43] U. H.-G. Kreßel, "Pairwise classification and support vector machines," in Advances in Kernel Methods: Support Vector Learning, B. Schölkopf, C. J. C. Burges, and A. J. Smola, Eds. Cambridge, MA: MIT Press, 1999, pp. 255–268.
[44] P. H. Swain and H. Hauska, "The decision tree classifier: Design and potential," IEEE Trans. Geosci. Electron., vol. GE-15, pp. 142–147, 1977.
[45] B. Kim and D. A. Landgrebe, "Hierarchical classifier design in high-dimensional, numerous class cases," IEEE Trans. Geosci. Remote Sensing, vol. 29, pp. 518–528, July 1991.
[46] J. T. Morgan, A. Henneguelle, M. M. Crawford, J. Ghosh, and A. Neuenschwander, "Adaptive feature spaces for land cover classification with limited ground truth data," in Proc. 3rd Int. Workshop on Multiple Classifier Systems—MCS 2002, Cagliari, Italy, June 2002, pp. 189–200.
[47] M. Datcu, F. Melgani, A. Piardi, and S. B. Serpico, "Multisource data classification with dependence trees," IEEE Trans. Geosci. Remote Sensing, vol. 40, pp. 609–617, Mar. 2002.
[48] L. Bruzzone and D. F. Prieto, "A technique for the selection of kernel-function parameters in RBF neural networks for classification of remote-sensing images," IEEE Trans. Geosci. Remote Sensing, vol. 37, pp. 1179–1184, Mar. 1999.

Farid Melgani (M'04) received the State Engineer degree in electronics from the University of Batna, Batna, Algeria, in 1994, the M.Sc. degree in electrical engineering from the University of Baghdad, Baghdad, Iraq, in 1999, and the Ph.D. degree in electronic and computer engineering from the University of Genoa, Genoa, Italy, in 2003.
From 1999 to 2002, he cooperated with the Signal Processing and Telecommunications Group, Department of Biophysical and Electronic Engineering, University of Genoa. He is currently an Assistant Professor of telecommunications at the University of Trento, Trento, Italy, where he teaches pattern recognition, radar remote sensing systems, and digital transmission. His research interests are in the area of processing and pattern recognition techniques applied to remote sensing images (classification, multitemporal analysis, and data fusion). He is coauthor of more than 30 scientific publications.
Dr. Melgani served on the Scientific Committee of the SPIE International Conferences on Signal and Image Processing for Remote Sensing VI (Barcelona, Spain, 2000), VII (Toulouse, France, 2001), VIII (Crete, 2002), and IX (Barcelona, Spain, 2003) and is a referee for the IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING.

Lorenzo Bruzzone (S'95–M'99–SM'03) received the laurea (M.S.) degree in electronic engineering (summa cum laude) and the Ph.D. degree in telecommunications, both from the University of Genoa, Genoa, Italy, in 1993 and 1998, respectively.
He is currently Head of the Remote Sensing Laboratory in the Department of Information and Communication Technologies at the University of Trento, Trento, Italy. From 1998 to 2000, he was a Postdoctoral Researcher at the University of Genoa. From 2000 to 2001, he was an Assistant Professor at the University of Trento, where he has been an Associate Professor of telecommunications since November 2001. He currently teaches remote sensing, pattern recognition, and electrical communications. His current research interests are in the area of remote sensing image processing and recognition (analysis of multitemporal data, feature selection, classification, data fusion, and neural networks). He conducts and supervises research on these topics within the frameworks of several national and international projects. He is the author (or coauthor) of more than 100 scientific publications, including journals, book chapters, and conference proceedings. He is a referee for many international journals and has served on the Scientific Committees of several international conferences.
Dr. Bruzzone ranked first place in the Student Prize Paper Competition of the 1998 IEEE International Geoscience and Remote Sensing Symposium (Seattle, July 1998). He is the Delegate in the scientific board for the University of Trento of the Italian Consortium for Telecommunications (CNIT) and a member of the Scientific Committee of the India–Italy Center for Advanced Research. He was a recipient of the Recognition of IEEE Transactions on Geoscience and Remote Sensing Best Reviewers in 1999 and was a Guest Editor of a Special Issue of the IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING on the subject of the analysis of multitemporal remote sensing images (November 2003). He was the General Co-chair of the First and Second IEEE International Workshop on the Analysis of Multi-temporal Remote-Sensing Images (Trento, Italy, September 2001; Ispra, Italy, July 2003). Since 2003, he has been the Chair of the SPIE Conference on Image and Signal Processing for Remote Sensing (Barcelona, Spain, September 2003; Maspalomas, Gran Canaria, September 2004). He is an Associate Editor of the IEEE GEOSCIENCE AND REMOTE SENSING LETTERS. He is a member of the International Association for Pattern Recognition (IAPR) and of the Italian Association for Remote Sensing (AIT).