Anal. Chem. 1994, 66, 23-31

Curve Fitting Using Natural Computation
A. P. De Weijer,† C. B. Lucasius, L. Buydens,* and G. Kateman
Department of Analytical Chemistry, Catholic University of Nijmegen, Toernooiveld 1,
NL-6525 ED Nijmegen, The Netherlands

H. M. Heuvel and H. Mannee
Akzo Research Laboratories Arnhem, Arnhem, The Netherlands

In curve fitting the most commonly used technique is an iterative hill-climbing procedure that makes use of partial derivatives to calculate the steepest path to an optimum in solution space. However, reliable and accurate initial estimates of the number of peaks, individual peak positions, heights, and widths are necessary to find acceptable solutions. One of the main drawbacks involved is that as the number of overlapping peaks increases, the problem becomes progressively more ill-conditioned. Consequently, small errors in the data (e.g., noise or baseline distortions), errors in the mathematical model, or errors in the estimates can be magnified, leading to large errors in the parameters of the final model. In addition to this, more overlapping peaks can lead to ambiguous fitting results. Ambiguous fitting is a general problem in curve fitting and is not limited to the steepest hill-climbing methods only. In this article we present a method for peak detection using artificial neural networks and a global search technique for curve fitting based on evolutionary search strategies which does not need accurate estimates and is less sensitive to local optima than steepest descent procedures. These statements are corroborated in our comparative case study, which involves the fitting of a series of spectra with strongly overlapping peaks: X-ray equator diffractometer scans of poly(ethylene naphthalate) yarns.

If the underlying mathematical model of the peak pattern is not known, or proper estimates of the parameters in the model assumed cannot be obtained, curve fitting can be a long and complicated task. Apart from this, a good fit of an experimental spectrum does not always lead to parameters with a valid physical meaning because many overlapping peaks may lead to many sets of parameters that can give a close fit of the profile. Vandeginste and De Galan evaluated curve fitting in infrared spectrometry (ref 1). They investigated the influences of the degree of overlap, the number of unresolved bands in the profile, and the determination of the baseline position on the fitting results of theoretical and experimental spectra. They formulated conditions to be fulfilled in order to obtain reliable results from the fit of infrared data.

Pierce et al. formulated the main sources of errors in curve fitting (ref 2). These sources are used as a guideline throughout this article: (1) The exact number of peaks is not known, leading to poor convergence or to ambiguous solutions. (2) There is uncertainty about the baseline position. (3) Initial estimates of the parameters in the model are insufficiently accurate. This may cause the problem that the fitting procedure ends in a local optimum.

Error sources 1 and 2 originate from the experimental data, including the way the latter are collected and prepared for the subsequent optimization method used for curve fitting. In contrast, error source 3 originates from the optimization method. Traditionally, local searching optimization methods are most widely applied, e.g., the Gauss-Newton minimization method for nonlinear models. In contrast, globally searching optimization methods are more robust, i.e., less sensitive to initial estimates. Admittedly, globally searching optimization methods are computationally more intensive as well, but with the advent of increasingly faster and cheaper computers, computation times within practical limits become feasible. In this paper we present the results of our contributions to research concerning these three sources of errors.

THE NEED FOR PEAK DETECTION METHODS

A method is required that allows an estimate of the number of bands and band positions to prevent the problems that were described previously as the first source of error. A high signal-to-noise ratio (SNR) is needed in the original spectra if high-order derivative techniques are used for signal tracing. A popular, advanced method to improve SNR is due to Savitzky and Golay (ref 4). Jackson and Griffiths compared Fourier self-deconvolution (FSD), a method that constitutes a linear method of deconvolution based on measured data in the time domain (ref 3). This technique seems to perform better than techniques needing high-order numerical derivatives, but FSD needs estimates of the peak shapes and widths at half-height for proper deconvolution. For the sake of brevity, we use the term "half-width" to denote "full width at half-maximum height" further on. Wythoff et al. described an artificial neural network for peak verification in noisy infrared spectra (ref 5). They concluded that a great deal of potential in applying artificial neural networks for peak recognition exists. Recently, a cerebellar model arithmetic computer (CMAC) neural network was developed for deconvolution of a system of two overlapping chromatographic peaks. This CMAC network is able to provide rapid deconvolutions of simulated and actual chromatographic peaks (ref 6). We present a neural network that estimates the number of peaks, peak position, and half-width of spectra, diffractograms, or chromatograms that are composed of bell-shaped peaks.

† Permanent address: Akzo Research Laboratories Arnhem, P.O. Box 9300, 6800 SB Arnhem, The Netherlands.
(1) Vandeginste, B. G. M.; De Galan, L. Critical evaluation of curve fitting in infrared spectrometry. Anal. Chem. 1975, 47, 2124-2132.
(2) Pierce, J. A.; Jackson, R. S.; Van Every, K. W.; Griffiths, P. R.; Hongjin, G. Combined deconvolution and curve fitting for quantitative analysis of unresolved spectral bands. Anal. Chem. 1990, 62, 477-484.
(3) Jackson, R. S.; Griffiths, P. R. Comparison of Fourier self-deconvolution and maximum likelihood restoration for curve-fitting. Anal. Chem. 1991, 63, 2557-2563.
(4) Savitzky, A.; Golay, M. J. E. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 1964, 36, 1627-1639.
(5) Wythoff, B. J.; Levine, S. E.; Tomellini, S. A. Spectral peak verification and recognition using an artificial neural network. Anal. Chem. 1990, 62, 2702-2709.

Analytical Chemistry, Vol. 66, No. 1, January 1, 1994. © 1993 American Chemical Society

Peak Detection Using Artificial Neural Networks. We present an alternative method, based on artificial neural networks (ANN), that is capable of detecting peaks in a spectrum. ANNs are now frequently applied to the calibration of nonlinear relations and to pattern recognition with great success (refs 7-9). ANNs differ from other approaches in that they learn from examples. The ANN algorithm iteratively samples the examples and learns from the mistakes made in previous trials. This process continues until all examples are mapped in an acceptable way. For further reading, the authors recommend ref 10. The ANN discussed here has been trained to estimate the number of bands, peak position(s), and bandwidth(s) of experimental spectra. A series of synthetic patterns of peaks varying from Lorentzian to Gaussian shapes with different degrees of overlap is used to train the ANN. As a mathematical peak model, a sum of several Pearson VII lines was used (eq 1). The Pearson VII function provides the possibility of using a wide variety of line shapes from Gaussian to Lorentzian and beyond (ref 11).

$$A(\nu) = \sum_{i=1}^{n} \frac{A_{i,0}}{\left[1 + 4 Z_i^{2}\left(2^{1/m_i} - 1\right)\right]^{m_i}} \qquad (1)$$

where $A_{i,0}$ is the absorbance in the center of peak i, $Z_i = (\nu - \nu_{i,0})/H_i$, $\nu_{i,0}$ is the peak position of peak i, $H_i$ is the half-width of peak i, $m_i$ is the tailing factor of peak i, and n is the number of peaks.

The ANN, trained to recognize $\nu_0$ in overlapping peak patterns, uses the first and second derivatives of $A(\nu)$ to predict the presence or absence of a peak in $\nu$. However, the relation between these parameters and the presence or absence of a peak was learned by an ANN and not derived from mathematical formulations.

Figure 1. A trained ANN used to detect peaks by scanning the profile. Since this ANN depends on numerical derivatives, it is sensitive to noise. We merely applied a moving average filter to the original profile. (Panel labels: synthetic spectrum, moving average filter N = 5, ANN prediction; SNR = inf., SNR = 1000, SNR = 250.)

EXPERIMENTAL SECTION

The source code of the ANN program is written in Turbo Pascal and runs on an 80486 processor under MS-DOS. Error back-propagation is used as a learning rule. A multilayered perceptron feed-forward network with four input nodes, one hidden layer with two nodes, and one output node was used for this task. More hidden layers as well as more hidden units did not improve the results. The input nodes represent the following factors.
(Node 1) sign change of $dA(\nu)/d\nu$: 0, no sign change; -1, sign change from - to +; 1, sign change from + to -.
(Node 2) value of $d^2A(\nu)/d\nu^2$: 1, positive; -1, negative or zero.
(Node 3) sign change of $d^2A(\nu)/d\nu^2$: 0, no sign change; -1, sign change from - to +; 1, sign change from + to -.
(Node 4) value of $dA(\nu)/d\nu$: 1, positive; -1, negative or zero.
(Output node 1) peak present within half-width: 1, $(\nu_{i,0} - 0.5H_i) < \nu < (\nu_{i,0} + 0.5H_i)$; 0, $\nu$ outside $[(\nu_{i,0} - 0.5H_i), (\nu_{i,0} + 0.5H_i)]$.

The representation of the input patterns allows $2^2 \cdot 3^2 = 36$ realization possibilities (two two-valued nodes and two three-valued nodes). Therefore the training set consisted of 36 input patterns and their corresponding binary output patterns. Since the ANN depends on high-order derivatives, the method is sensitive to the SNR. This ANN was trained under ideal circumstances: no baseline distortions or noise was present. For the learning rate ($\eta$), a typical value of 0.6 was taken.

Figure 2. Scanned profile of an experimental X-ray equator diffractogram. (x axis: 2 theta (degrees).)

(6) Gallant, S. R.; Fraleigh, S. P.; Cramer, S. M. Deconvolution of overlapping chromatographic peaks using a cerebellar model arithmetic computer neural network. Chemom. Intell. Lab. Syst. 1993, 18, 41-57.
(7) Zupan, J.; Gasteiger, J. Neural networks: A new method for solving chemical problems or just a passing phase? Anal. Chim. Acta 1991, 248, 1-30 (review article).
(8) Smits, J. R. M.; Breedveld, L. W.; Derksen, M. W. J.; Kateman, G. Pattern classification with artificial neural networks: classification of algae, based upon flow cytometer data. Anal. Chim. Acta 1992, 258, 11-25.
(9) de Weijer, A. P.; Buydens, L.; Kateman, G.; Heuvel, H. M. Artificial neural network used as a soft-modelling technique for quantitative description of the relation between physical structure and mechanical properties of poly(ethylene terephthalate) yarns. Chemom. Intell. Lab. Syst. 1992, 16, 77-86.
(10) Rumelhart, D.; McClelland, J. Parallel Distributed Processing; M.I.T. Press: Cambridge, MA, 1986; Vol. 1.
(11) Pearson, K. Phil. Trans. A 1895, 186, 343.

Validation. Due to the sigmoid transfer function, the values of the output units of the neural network are usually not exactly equal to 1 or 0. The following arbitrary thresholds were used to force estimation of $H_i$ and $\hat{\nu}_{i,0}$ from the neural network output $O(\nu)$.

if $O(\nu) < 0.7$ and $O(\nu + 1) \geq 0.7$ then $\nu_a = \nu$
if $O(\nu) \geq 0.7$ and $O(\nu + 1) < 0.7$ then $\nu_b = \nu$

$$H_i = \nu_b - \nu_a$$

$$\hat{\nu}_{i,0} = (\nu_a + \nu_b)/2$$

Since this ANN only triggers on sign changes of first and second derivatives, it reacts irrespective of the level and therefore irrespective of the SNR, as is shown as an example in Figure 1. Although the training set was set up using synthetic spectra without noise, the ANN was capable of detecting peaks in noisy synthetic peak patterns (Figure 1) and in experimental peak patterns, provided that some noise was removed; to that end, a simple filter sufficed. To test the reliability more systematically, the performance of this ANN was tested in comparison to peak detection by looking at the numerical second-derivative method only (2ND) and human experts (HE) working in the field of applied spectroscopy.
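The threshold rule quoted above for recovering a half-width and a peak position from the network output can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `peaks_from_output`, the representation of O(ν) as a list sampled at consecutive scan points, and the example values are hypothetical and are not taken from the authors' Turbo Pascal program.

```python
from typing import List, Tuple

THRESHOLD = 0.7  # the arbitrary cut-off quoted in the text


def peaks_from_output(output: List[float],
                      threshold: float = THRESHOLD) -> List[Tuple[float, int]]:
    """Return (peak_position, half_width) pairs in units of scan-point index.

    A rising crossing of the threshold sets v_a, the next falling crossing
    sets v_b; then H = v_b - v_a and the position is (v_a + v_b) / 2.
    """
    peaks = []
    v_a = None  # no rising edge seen yet
    for v in range(len(output) - 1):
        if output[v] < threshold <= output[v + 1]:
            v_a = v  # O(v) < 0.7 and O(v + 1) >= 0.7
        elif output[v] >= threshold > output[v + 1] and v_a is not None:
            v_b = v  # O(v) >= 0.7 and O(v + 1) < 0.7
            peaks.append(((v_a + v_b) / 2, v_b - v_a))
            v_a = None  # wait for the next rising edge
    return peaks


# Hypothetical network output with two regions above the threshold.
example_output = [0.1, 0.2, 0.9, 0.95, 0.9, 0.3, 0.1, 0.8, 0.85, 0.2]
print(peaks_from_output(example_output))
```

Positions and widths come out in scan-point units; converting them to 2 theta would require the step size of the scan. A falling edge that is not preceded by a rising edge (a peak cut off at the start of the scan) is ignored in this sketch, which is a design choice of the example rather than something the text specifies.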
Several synthetic peak patterns with different degrees of SNR and overlap were analyzed. The validation set consisted of 720 synthetic spectra based on a full factorial design. The factors involved were as follows: resolution (four levels), variance of peak amplitude (four levels), number of peaks (five levels), and SNR (three levels). Each design point was represented three times in the data set (4 x 4 x 5 x 3 x 3 = 720). A detected peak was classified as correct when the predicted peak position $\hat{\nu}_{i,0}$ fell within $[\nu_{i,0} - 0.5H_i, \nu_{i,0} + 0.5H_i]$. Once a peak has been assigned, other detected peaks in the interval $[\nu_{i,0} - 0.5H_i, \nu_{i,0} + 0.5H_i]$ were classified as false positives.

Since ANN and 2ND are sensitive to noise, the data set was pretreated by applying a moving average filter of n = 3 on the original spectra. In Figure 4 the influence of SNR on the percentage of correctly classified and false positive peaks for HE, ANN, and 2ND is shown. Irrespective of the filtering, there is an effect of SNR on the number of false positive detected peaks, especially with poor SNR. As stated previously, error source 1 originates from errors in the underlying mathematical model. Therefore it is essential that the number of false positive peaks is as low as possible and that the number of correct peaks is as high as possible. Undoubtedly, the human expert performs better than the ANN and the second-derivative method. However, for practical SNR ranges [∞, 100], this ANN is a reasonable and fast alternative, while the 2ND method exhibits an unacceptable number of false positive peaks. More advanced filter methods may enlarge the SNR range for which ANN peak detection is feasible.

As a "real-world" example, a prediction of the peak position of an experimental X-ray diffractometer scan of poly(ethylene naphthalene-2,6-dicarboxylate) is shown in Figure 2. To increase the SNR, a moving average block filter was used with a window size of five data points. The diffractometer scans consisted of 470 data points. Note that the shoulder on the left-hand side of the rightmost curve was detected as a peak. The presence of this peak was confirmed by unit cell studies as a β(200) reflection (ref 12). Estimates of m values of the Pearson VII line cannot be obtained using this technique. For $A_{i,0}$, only an upper limit, $A(\nu)$, can be given.

The performance of the ANN as a peak detection technique has not yet been studied intensively in relation to other methods. Combinations with alternative techniques such as MEM and FSD may improve the performance, as well as combination with the CMAC neural network approach (ref 6) to obtain accurate estimates for peak heights and tailing factors. Research in this area is continuing and will be presented in future papers.

Figure 3. SD fit of a PET X-ray equator diffractogram. The fitted curve closely matches the profile. (x axis: 2 theta.)

IGNORANCE OF THE BASELINE POSITION

The severity of baseline detection is strongly related to the number of peaks and the proportion of peak overlap in a spectrum. If there are data points in which no contribution from peaks is present, the baseline position can be fairly well estimated by means of fitting if pure baseline points are situated at several positions in the ordinate of the spectrum. In complex spectra, such as infrared spectra, many overlapping bands often prohibit the estimation of the baseline position. If the number of bands or peaks is not known, it is impossible to determine the baseline position beforehand. A possible strategy is to fit the baseline together with the peak profile. Unfortunately, this can give rise to ambiguous fitting results or to ending in local optimal points since baselines strongly interfere with the tailing factors of the peaks. In our experiments done so far, baseline estimates for experimental X-ray diffractometer scans could be obtained easily since the peaks do not interfere with the baseline. There were no diffraction patterns present at low angles (<9°) and at high angles (>30°) on the equator diffraction patterns. An exponential baseline is assumed to compensate for the scattered light from the unrefracted X-ray beam. Research on more complicated spectra to determine baseline positions, like infrared spectra, will be done in the near future.

(12) Rumelhart, D.; Hinton, G.; Williams, R. Learning representations by back-propagating errors. Nature 1986, 323, 533-536.

CURVE FITTING WITH STEEPEST DESCENT METHODS

In our laboratory, a steepest descent based curve-fitting module with the Gauss-Newton optimization method is
successfully applied to a series of curve-fitting problems. For brevity, we will refer to "SD" whenever this method is meant. Curve fitting of X-ray diffractometer scans of poly(ethylene terephthalate) (PET) yarns, nylon 6, and infrared spectra of PET was based on SD for estimating the parameters θ = (θ1, θ2, ..., θp) in nonlinear least-squares problems, extended with a Marquardt technique for forcing convergence and a technique for handling constraints. The technical details concerning SD are provided in the Appendix.

Figure 4. (Above) Fraction false positive detected peaks in total of 2880 peaks for human expert (HE), artificial neural network (ANN), and second derivative (2ND). (Below) False negative detected peaks. (x axis: SNR, with values 250, 100, and 20.)

Figure 5. SD fits of PEN X-ray equator diffractograms. The scans originate from measured duplicates of the same sample. Although the initial estimates were the same, SD fits ended in different optimal points. (x axis: 2 theta (degrees).)

(13) Buchner, S.; Wiswe, D.; Zachmann, H. G. Kinetics of crystallization and melting behavior of poly(ethylene naphthalene-2,6-dicarboxylate). Polymer 1989, 30, 480-488.
(14) Draper, N. R.; Smith, H. Applied Regression Analysis; John Wiley and Sons Inc.: New York, 1967.
(15) Heuvel, H. M.; Huisman, R. Five-line model for the description of radial X-ray diffractometer scans of Nylon 6 yarns. J. Polym. Sci. 1981, 19, 121-134.
(16) Heuvel, H. M.; Huisman, R. Infrared spectra of poly(ethylene terephthalate) yarns. Fitting of spectra, evaluations of parameters, and applications. J. Appl. Polym. Sci. 1985, 30, 3069-3093.

From X-ray diffractometer scans, physically relevant parameters such as apparent crystallite sizes, lattice parameters, and amounts of crystalline material of a sample can be directly calculated from peak positions, half-widths, and peak intensities. To obtain sufficiently accurate estimates of the various spectral parameters, a nonlinear curve-fitting routine is necessary. For relatively simple diffractometer scans, the SD method works without problems. It operates fast and is not very sensitive to a priori estimates. As an illustration, a fitting result of a PET X-ray equator diffractometer scan, consisting of four Pearson VII lines, is shown in Figure 3. The residual error is of the order of magnitude of the experimental noise. As stated before, in complex peak patterns, initial estimates of the parameters in the model need to be accurate in order to prevent the fitting procedure from ending in a local optimum when SD optimization is used. Moreover, we noticed that more time has to be spent on the development of a generally applicable model as the complexity of the spectra increases. The development of a six-line model of nylon 6 yarns, containing large γ crystals in the presence of α crystals, showed that constraints on form factors and constant ratio of totally diffracted radiation had to be incorporated into the computer model for a correct convergence of the SD optimization (ref 16). Recent studies on the morphology of poly(ethylene naphthalate) (PEN) showed that curve fitting with SD needs even more accurate initial estimates. It was not possible to formulate constraints to decrease the dimensionality of the optimization problem. In Figure 5, two X-ray equator scans of PEN are shown. These two scans originate from measured duplicates of the same sample, so they only differ in the constitution of the noise. The dashed line represents the fitting with SD of the diffractometer scans with a seven-line model. Six lines were detected by ANN peak tracing over a series of PEN yarns, and one extra line was added to compensate for amorphous scattering. The initial setting parameters are shown in Table 1. The optimization criterion was a minimum residual variance. Both curve-fitting experiments reached convergence. The first sample, however, ended in a fit (in terms of half-widths, peak position, etc.) that was different from that of the second sample, although the initial estimates were the same. This illustrates that small errors in the data caused by noise can be magnified to give large errors in the
parameters of the final model. Since both fits have reached convergence, we may conclude that the SD ended in two different local optimal points or that there is a significant model error which caused ambiguous fitting.

Table 1. Initial Values for Each Parameter in a Seven-Line Pearson VII Model after Prehybridization for SD

peak          A0      θ0       H       2 - (1/m)
1 β(100)      0.04    13.47    0.85    1.98
2 α(010)      0.10    15.52    1.09    1.38
3 β(010)      0.60    18.68    1.41    1.61
4 α(100)      0.08    23.97    1.99    1.69
5 β(200)      0.88    26.20    0.86    1.82
6 α(-110)     0.20    26.50    1.41    1.17
7 amorph      0.10    20.60    11.0    1.98

A NEW APPROACH: CURVE FITTING BASED ON GENETIC ALGORITHMS

It is undesirable to force the optimization method to a solution by imposing constraints if the problem is actually not overdimensionalized, i.e., if there is one combination of parameters that leads to a residual fit in the order of magnitude of the noise. Such an undesirable decision, to which we were forced in the fitting of nylon 6 diffractograms, is inevitable when steepest descent methods are used in high-dimensional problems. Therefore, a fit procedure that is less sensitive to local optimal points has been developed.

This procedure is based on so-called genetic algorithms (ref 18). Genetic algorithms comprise a powerful and increasingly adopted search methodology which embraces principles of Darwinian evolution. They are especially suitable for complex, large-scale optimization; for more details about their concepts and operation, the reader is referred to refs 18-22.

Genetic algorithms are generally praised for their insensitivity to ending up in local optima, i.e., for their tendency to approach the globally optimal solution irrespective of diverse starting conditions. We therefore say that genetic algorithms are robust, i.e., feature a good search accuracy. By contrast, the search precision of genetic algorithms is poor: upon replicating runs, there is a considerable spread in the end solutions, despite the good search accuracy; the average end solution is a reliable estimate of the global solution in general. Importantly, opposite properties often apply to traditional local optimization techniques, such as the aforementioned steepest descent method: a good search precision and a poor search accuracy, as pointed out above. Therefore, better overall performance can be attained in a sequential combination, where the genetic algorithm generates a "best guess" that serves as a starting point for subsequent improvement (refinement) by the local technique. Such (and other) hybrid techniques are important in that the constituent methods mutually supplement each other for enhanced overall performance.

(17) van den Heuvel, C. J. M.; Heuvel, H. M.; Faassen, W. A.; Veurink, J.; Lucas, L. J. Molecular changes of PET yarns during stretching measured with rheo-optical infrared spectroscopy and other techniques. J. Appl. Polym. Sci. 1993, 49, 925-934.
(18) Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine Learning; Addison-Wesley: Reading, MA, 1989.
(19) Lucasius, C. B.; Kateman, G. Understanding and using genetic algorithms. Part 1: Concepts, properties and context. Chemom. Intell. Lab. Syst. 1993, 19, 1-33.
(20) Lucasius, C. B.; Kateman, G. Understanding and using genetic algorithms. Part 2: Representation, configuration and hybridization. Chemom. Intell. Lab. Syst., in press.
(21) Lucasius, C. B.; Kateman, G. GATES: genetic algorithm toolbox for evolutionary search. Software library in ANSI C, Laboratory for Analytical Chemistry, Catholic University of Nijmegen, January 1991.
(22) Lucasius, C. B.; Kateman, G. GATES towards evolutionary large-scale optimization: a software-oriented approach to genetic algorithms. (a) Part 1: General perspective; (b) Part 2: Toolbox description. Comput. Chem., in press.

Hybridization for Enhanced Search. We consider three basic strategies to hybridization of genetic algorithms: prehybridization, posthybridization, and self-hybridization.

Prehybridization is concerned with finding an initial estimate for a genetic algorithm: in our case, initial values for the Pearson VII model parameters. Even when this estimate is considerably inaccurate, which is mostly the case, the genetic algorithm generally still performs well, as we will show. In our application, the rough initial estimate is obtained from the aforementioned ANN for peak detection. Technically, an initial estimate for a genetic algorithm is called a working point. It forms the center of the real space in which the genetic algorithm searches. This space is also called the search volume, and its dimensions are specified by the user. The value ranges for the model parameters thus define the dimensions of the search volume in our application, i.e., their low bounds and high bounds. For each model parameter, only a limited number of values, or levels, specified by the user, is considered by a genetic algorithm. For convenience, these levels are chosen equidistant. They amount to a search grid in the search volume. The mesh sizes that characterize the search grid dictate the search precision that can be attained by the genetic algorithm concerned. Randomly selected nodes on the search grid comprise the starting point of the search.

Posthybridization is concerned with the improvement (refinement) of the end solution found by a genetic algorithm. To that end, a local search method is used; e.g., we used the steepest descent method in our application. In this way, the local search comprises a "remedy" for the poor search precision as a shortcoming of a genetic algorithm.

Self-hybridization is a strategy wherein a genetic algorithm is pre- or posthybridized with another. We implemented this approach as a chain of genetic algorithms wherein each passes its result as the working point of the next in the chain. Thereby, in each step the search grid was kept invariant and the search volume was pruned (Figure 6) in order to accomplish significant reductions in overall convergence time, to gain search precision, and to preserve good search accuracy, simultaneously. The first few genetic algorithms in the chain are deliberately not run up until convergence, as this is not needed for the next.

Representation and Search Heuristics. Any search technique produces candidate solutions in an iterative way. These are guesses at the true solution of the problem concerned. A candidate solution is represented as a string (vector) of proposed values for the unknown parameters. Any string is evaluated by an objective function for assessment of its quality, i.e., of its likeliness to represent the true solution. A string may be modified by the search heuristics in an attempt to represent a better candidate solution.

A distinguishing feature of genetic algorithms is that their evolutionary search heuristics manipulate a population of strings. In each iteration, or generation, the population is replaced by a new population of equal size. The new population


Table 2. Search Volumes of Each Parameter in a Seven-Line Pearson VII Model Obtained after Prehybridization for GA Fit

peak                      A0      θ0       H       2 - (1/m)
1 β(100) low bound        0.00    12.63    0.00    1.00
1 β(100) high bound       0.13    14.33    2.71    1.98
2 α(010) low bound        0.00    14.63    0.17    1.00
2 α(010) high bound       0.28    16.33    1.87    1.98
3 β(010) low bound        0.21    17.82    0.54    1.00
3 β(010) high bound       0.61    19.52    2.24    1.98
4 α(100) low bound        0.00    23.29    1.39    1.00
4 α(100) high bound       0.33    24.14    3.09    1.98
5 β(200) low bound        0.57    25.82    0.33    1.00
5 β(200) high bound       1.17    26.88    1.52    1.98
6 α(-110) low bound       0.00    26.45    1.10    1.00
6 α(-110) high bound      0.55    27.06    1.86    1.98
7 amorph low bound        0.00    20.30    4.25    1.98
7 amorph high bound       0.15    23.70    12.8    1.98

subdivided into $2^B$ levels; thus, larger B values amount to a finer meshed search grid.
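As a rough illustration of how B bits per parameter can index an equidistant search grid of $2^B$ levels between a low bound and a high bound, consider the Python sketch below. The plain binary coding (most significant bit first), the choice to place the outermost levels exactly on the bounds, and the names `decode_parameter` and `decode_string` are assumptions made for this example; the paper defers its actual decoding procedure to ref 22.

```python
from typing import List, Sequence, Tuple


def decode_parameter(bits: Sequence[int], low: float, high: float) -> float:
    """Map B bits onto one of 2**B equidistant levels in [low, high]."""
    n_levels = 2 ** len(bits)
    level = 0
    for bit in bits:  # most significant bit first (assumed coding)
        level = 2 * level + bit
    mesh = (high - low) / (n_levels - 1)  # mesh size of the search grid
    return low + level * mesh


def decode_string(bitstring: Sequence[int],
                  bounds: Sequence[Tuple[float, float]],
                  bits_per_parameter: int) -> List[float]:
    """Split a bitstring into equal chunks and decode one value per parameter."""
    values = []
    for k, (low, high) in enumerate(bounds):
        start = k * bits_per_parameter
        chunk = bitstring[start:start + bits_per_parameter]
        values.append(decode_parameter(chunk, low, high))
    return values


# Position bounds of peak 1 and peak 2 from Table 2, with B = 4 (16 levels each).
bounds = [(12.63, 14.33), (14.63, 16.33)]
print(decode_string([0, 0, 0, 0, 1, 0, 0, 0], bounds, 4))
```

With B = 4 the mesh size for these two position parameters is 1.70/15, roughly 0.11 degrees; each extra bit roughly halves it, which is the sense in which larger B gives a finer grid.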
Many genetic operators exist for the modification of binary strings. We used B-UX (uniform binary recombination) and B-M (uniform binary mutation). B-UX is applied with probability Pr to the twosomes of bitstrings obtained after randomly pairing the strings in the population; for each successful trial, it swaps bits between the two strings of the pair, positionwise and with probability Ps. In brief, B-UX is parameterized by Pr (recombination probability) and Ps (swap probability); important advantages of B-UX are maximum exploratory power and positional unbiasedness (ref 22). B-M inverts the bits in the population with probability Pm; in brief, B-M is parameterized by Pm (mutation probability).
ObjectiveFunction. The evaluation of a bitstring proceeds
Fburo 8. Evolutionof the search volume and search grid in sequentlal as follows. First, the bitstring is decoded into a string of real
self-hybridization by pruning the search volume. values for the unknown parameters; a detailed description of
this procedure can be found in ref 22. The real values are
then passed to the objective function. It calculates a spectrum
is created by the evolutionary search heuristics in two steps. according to the Pearson VI1model and subsequentlycompares
In the first step, strings in the current population are selected this spectrum with the known experimental spectrum to derive
and copied at rates proportional to their quality, until the new a measure of (dis)similarity. Two criteria that seem sensible
population thus created is completed. The rationale for this are root mean square (RMS, as dissimilarity) and correlation
is to obtain a new population that is expected to be better on coefficient (CORR,as similarity). Both criteria have short-
average. Next, in order to reach potentially better candidate comings, though. For instance, CORR is not fully consistent
solutions, the strings in the new population are modified to with the purposes of curve fitting because it quantifies ratios
some controlled extent. The generation cycle is closed when rather than differences between spectra. RMS, on the other
the new population becomes the current population, followed hand, is consistent but tends toward “plateau” optima on the
by evaluation of all strings therein. RMS landscape; this followsfrom observed “indecisive”search
behavior near optima and may also be appreciated intuitively
An appropriate representation must be chosen, Le., one by considering small lateral perturbations in a spectrum. In
which enables the search heuristics to impose modifications order to obtain a criterion that is both consistent and leads
in such a way that the search efficiency is high. For our to a structurally more pronounced landscape, we defined the
application (and numerous other applications), bitstrings-
binary strings-may constitute the best representation for
+
bicriterion ERROR = RMS/( 1 CORR)Z(dissimilarity).
Experimentally, minimization of ERROR led to a significantly
theoretical reasons;20we adopted such a representation. A better performance of our genetic algorithm.
bitstring may be regarded as a concatenation of bitfields-
juxtaposed segmental bitstring parts that correspond with the Configuration. The configuration is specified in a separate
respective unknown parameters of the problem. When an input file. For reasons of limited space, we do not provide a
unknown parameter is encoded by a bitfield of B bits, then transcript of this file here, but merely summarize its key entries
the value range of the unknown parameter is effectively as follows:

population size          100
selective reproduction   fitness proportionate
fitness scaling mode     sigmoid
S                        3.00
fraction elitism         0.05
recombination mode       B-UX
Pr                       0.85
Ps                       0.30
mutation mode            B-M
Pm                       0.01
encoding resolution      9 bits (512 levels)
binary decoding mode     Gray
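
As a rough illustration of what the encoding resolution entry implies, the following Python sketch computes the bitstring length and the number of points on the search grid. It assumes four encoded parameters for each of the seven lines (the four columns of Table 2); the text does not state this explicitly, and for the amorphous line the shape bounds in Table 2 coincide, so the effective count may be lower. All variable names are illustrative.

```python
# Assumed problem size: seven lines, four Pearson VII parameters per line.
N_PEAKS = 7
PARAMS_PER_PEAK = 4      # assumption: A0, position, H and the shape term
BITS_PER_PARAM = 9       # "encoding resolution" entry of the configuration

n_params = N_PEAKS * PARAMS_PER_PEAK          # number of bitfields
string_length = n_params * BITS_PER_PARAM     # total bits in one candidate
levels = 2 ** BITS_PER_PARAM                  # grid levels per parameter
grid_points = levels ** n_params              # size of the full search grid

print(n_params, string_length, levels)        # 28 252 512
print(f"{grid_points:.2e}")                   # 7.24e+75
```

Under these assumptions the grid holds on the order of 10^75 candidate solutions, which gives a feel for why visiting every grid point is not an option.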

For a more detailed explanation of terminology related to configuration, the reader is referred to ref 22.
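
The representation and the two operators described above can be sketched in a few lines of Python. This is a minimal illustration and not the GATES or CFIT code: the function names, the most-significant-bit-first Gray convention, and the linear mapping of the 512 levels onto the closed interval between the low and high bounds are all assumptions.

```python
import random

def gray_to_int(bits):
    """Decode a Gray-coded bit list (most significant bit first) to an integer."""
    value, acc = 0, 0
    for b in bits:
        acc ^= b                       # running XOR recovers the plain binary bit
        value = (value << 1) | acc
    return value

def decode(bitstring, bounds, n_bits=9):
    """Map each n_bits-wide bitfield onto its (low, high) search interval."""
    params = []
    for k, (low, high) in enumerate(bounds):
        field = bitstring[k * n_bits:(k + 1) * n_bits]
        level = gray_to_int(field)     # 0 .. 2**n_bits - 1
        params.append(low + (high - low) * level / (2 ** n_bits - 1))
    return params

def b_ux(a, b, pr=0.85, ps=0.30, rng=random):
    """Uniform recombination: with probability pr, swap each bit pair with probability ps."""
    a, b = a[:], b[:]
    if rng.random() < pr:
        for i in range(len(a)):
            if rng.random() < ps:
                a[i], b[i] = b[i], a[i]
    return a, b

def b_m(bits, pm=0.01, rng=random):
    """Uniform mutation: invert each bit independently with probability pm."""
    return [1 - x if rng.random() < pm else x for x in bits]

# Two bitfields using the peak 1 bounds for position and H from Table 2.
bounds = [(12.63, 14.33), (0.00, 2.71)]
candidate = [0] * 9 + [1] + [0] * 8    # lowest level, then highest level in Gray code
print(decode(candidate, bounds))
```

The example candidate decodes to the low bound of the first interval and the high bound of the second, since the Gray string 100000000 corresponds to level 511.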


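The bicriterion ERROR = RMS/(1 + CORR)^2 can be illustrated with a small, self-contained Python sketch. The synthetic Gaussian-shaped test peak, the 2 theta grid, and the function names are assumptions made for the example only; the paper's objective function evaluates a Pearson VII spectrum instead. Note that the expression is undefined when CORR equals -1.

```python
import math

def rms(calc, obs):
    """Root mean square difference between two spectra (dissimilarity)."""
    return math.sqrt(sum((c - o) ** 2 for c, o in zip(calc, obs)) / len(obs))

def corr(calc, obs):
    """Pearson correlation coefficient between two spectra (similarity)."""
    n = len(obs)
    mc, mo = sum(calc) / n, sum(obs) / n
    cov = sum((c - mc) * (o - mo) for c, o in zip(calc, obs))
    var_c = sum((c - mc) ** 2 for c in calc)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_c * var_o)

def error(calc, obs):
    """Bicriterion from the text: RMS divided by (1 + CORR) squared."""
    return rms(calc, obs) / (1.0 + corr(calc, obs)) ** 2

# Hypothetical test spectrum on a 2 theta grid from 10 to 34 degrees.
x = [10 + 0.1 * i for i in range(241)]
obs = [math.exp(-((xi - 20.0) / 2.0) ** 2) for xi in x]
scaled = [2.0 * o for o in obs]                                 # same shape, doubled
shifted = [math.exp(-((xi - 20.5) / 2.0) ** 2) for xi in x]     # small lateral shift

print(corr(scaled, obs), rms(scaled, obs), error(scaled, obs))
print(corr(shifted, obs), rms(shifted, obs), error(shifted, obs))
```

The scaled spectrum shows why CORR alone is not consistent with curve fitting: it stays at 1 although the spectra differ, whereas RMS and ERROR are both nonzero.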
Software and Hardware. The routines that comprise a genetic algorithm consist of two segments: a domain-dependent part (concerning the problem representation and the objective function) and a domain-independent part (concerning the evolutionary search heuristics). By "domain-independent" routines are meant routines that can be used for other domains as well; for our application these were obtained from the software library GATES.21,22 In agreement with this library, the domain-dependent routines and data structures were programmed in C (ANSI standard) for speed and portability. The integration of all routines resulted in CFIT: the executable application program for genetic curve fitting.23 CFIT is presently available for MSDOS and UNIX systems. The CFIT fitting procedure used in this application was executed under MSDOS with an 80486 processor.

Figure 7. GA fit of X-ray equator profiles of PEN. Both fits closely match the diffractograms.

COMPARISON OF FITTING RESULTS GA AND SD

The same two diffractograms that were fitted with the SD procedure were now fitted with CFIT, using the same seven-line mathematical model. Prehybridization on the original data to determine the search volumes was carried out as described in the previous section. The low and high bounds for each parameter are given in Table 2. The dimensions of the initial search volume were too small for effective self-hybridization. With the ANN peak detection procedure it was possible to reduce the search volume in such a way that self-hybridization was not useful. We expect that ANN preprocessing alone will not reduce the search volume enough for curve fitting of infrared spectra. Refinement of the end solution found by the genetic algorithm by a steepest descent method did not reduce the residual variance of the sums of squares significantly, so posthybridization was not necessary. In Figure 7, the fitting results of a genetic algorithm are shown for both experimental diffractograms. Since the residual variance is of the order of magnitude of the experimental noise, we may conclude that the Gauss-Newton steepest descent approach (Figure 5) ended in a local optimal point and that the possibility of a significant model error can be ruled out. For the poorly resolved peaks 1 and 6, the SD gave significantly different solutions for the half-widths. The standard deviations were calculated from the generalized inverse of the Jacobian matrix multiplied by the residual variance (see Appendix).24 Curve fitting with GA showed compatible values for all half-widths in both diffractograms (Figure 8). So the GA approach is independent of the noise in this special case. To show that this is also true of a more general case, the following experiment was set up.

A series of PEN yarns has been produced resulting from a variation of one process parameter in 10 discrete levels. For all these yarns X-ray equator diffractograms were recorded. These scans were all fitted with the seven-line model with SD and GA, as described previously. As an illustration of the fitting results, the half-width of the first poorly resolved peak is plotted against the process parameter for SD and GA (Figure 9). Since this is a univariate relation, a smooth relation between the process parameter and half-width is expected. The determination of half-widths using SD showed severe scattering, however. Although the variance of each separate fit was rather low, it was not possible to follow the half-width of the first peak as a function of the process parameter, due to the sensitivity of SD to local optima. As expected, the GA performed better.

(23) Lucasius, C. B.; de Weijer, A. P.; Buydens, L.; Kateman, G. CFIT: A genetic algorithm for survival of the fitting. Software description. Chemom. Intell. Lab. Syst. 1993, 19, 337-341.
(24) Beale, E. M. L. Confidence regions in non-linear estimation. J. R. Stat. Soc. 1960, B22, 41-76.

In Table 3, some properties of three optimization procedures, namely, steepest descent (SD), genetic algorithms (GA), and exhaustive search (ES), for this seven-line diffractogram of PEN are shown. Any optimization technique applied to ill-conditioned problems has the disadvantage that there is no certainty about convergence into a global optimum. Since the calculation time for exhaustive search is exponentially proportional to the number of peak parameters, it therefore exceeds reasonable time limits in most situations. In general,



nonlocal search requires more computation than local search, but the result is more accurate and less sensitive to the choice of initial estimates. Lucasius estimated empirically that the calculation time of this GA increases roughly as the third power of the number of peaks. In this case, it resulted in an acceptable solution in a real-time of approximately 35 min on a MSDOS 80486 platform and in a real-time of 8 min under UNIX on SUN Sparc workstations. Computational time of the SD was only 2 min per run under MSDOS, but it should be added that at least 30 different runs were performed to achieve this best-ever solution, which was still unacceptable.

Table 3. Comparison of Genetic Fit (GA), Steepest Descent Fit (SD), and Exhaustive Search (ES) for This Application
columns: residual SS (fit 1, fit 2); converged soln; CPU time (min); optimum type
ES minimal minimal ,1060 global
SD 9.1 6.4 - GA 0.92 0.89 Unknown
Yes 35 2 Unknown local

[Figure 8 plot: panels SD and GA; y-axis: calculated halfwidth; x-axis: H1 to H6, the 2sigma interval of the halfwidth Hn of peak number n of the X-ray diffraction scan of PEN (duplicate); lower panel x-axis: 2 theta, 10 to 34]
Figure 8. Comparison of estimation of half-widths at half height (H) of two repeated measurements with GA and SD. Peaks which were not very well separated gave different results with SD and identical results with the new GA method.

[Figure 9 plot: halfwidth of peak 1 (H1) versus process parameter, for SD and GA]
Figure 9. Estimation of half-width of the β(100) reflex in a univariate relation with SD and GA curve fit method.

FINAL REMARKS AND OUTLOOK

In fitting complex spectra, in which many bands display large overlaps, the steepest descent approach can fail because very accurate initial estimates of the fit parameters have to be given to optimize these initial estimates to the global optimum. We have shown that genetic algorithms are less sensitive to local optima in our experiments in which we successfully revealed underlying peaks in highly overlapping peak patterns. Our research is now aimed at fitting infrared spectra using genetic algorithms. The steepest descent method has been successfully applied to the fitting of poly(ethylene terephthalate) (PET) yarns, but worked only under highly constrained conditions. Many constraints were imposed without any acceptable physical background and were only used to force the optimization to convergence. The formulation of the constraints was a time-consuming task. Approximately 100 band parameters are involved in the fitting of infrared spectra of PET. Our results indicate that genetic algorithms are useful or even essential in the determination of band parameters of such a high-dimensional problem within reasonable time.

ACKNOWLEDGMENT

The authors acknowledge Akzo Research Laboratories Arnhem for their financial support to this work. Furthermore, we thank D. Nijland for the fruitful discussions concerning neural network testing and C. J. M. van den Heuvel for providing the X-ray diffractograms of PEN.

APPENDIX

Estimation of the unknown parameters $\theta = (\theta_1, \theta_2, ..., \theta_n)$ in a mathematical model of the form

$y = f(x; \theta)$

where y is the dependent variable or response and x is the independent variable or factor. The model is nonlinear in the parameters. The parameters must be bound on both sides by constants:

$a_j \le \theta_j \le b_j, \quad j = 1, 2, ..., n$

The parameter estimates are obtained on the basis of a least-squares fit to m data points $(x_i, y_i)$; i.e., a set of parameters $\theta^*$ is determined such that the sum of squares $F(\theta)$ is minimal:

$F(\theta^*) = \min_{\theta} \sum_{i=1}^{m} (y_i - y(x_i; \theta))^2$

This method is based on Gauss-Newton for minimizing



functions of this type, extended with a Marquardt technique for forcing convergence and a technique for handling constraints.14 It is an iterative process which starts with a feasible initial estimate between the boundary constants:

$\theta_0 = (\theta_1, \theta_2, ..., \theta_n)$

A sequence of new estimates is determined iteratively such that the sum of squares diminishes

$F(\theta_{k+1}) < F(\theta_k), \quad k = 0, 1, 2, ...$

This process stops by convergency if at least one of the two following conditions is satisfied: (1) Both the parameters and the sum of squares have been determined with a given accuracy. (2) The sum of squares does not change significantly (in connection with the relative machine precision).

The variance/covariance matrix of the model parameters is calculated from the Jacobian matrix as follows:

$(J^T J)^{-1} s_r^2$

where $s_r^2 = SS_{res}/(n - p)$, n is the number of data points, and p is the number of model parameters.24

Received for review May 22, 1993. Accepted September 22, 1993.
Abstract published in Advance ACS Abstracts, November 1, 1993.

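The variance/covariance expression of the Appendix can be tried out on a small example. The Python sketch below is only an illustration under stated assumptions: it uses a straight-line model with two parameters (so that the Jacobian is exact and a closed-form 2x2 inverse suffices) rather than the nonlinear Pearson VII model, and the data values and function names are invented for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def covariance_2x2(J, residuals):
    """Return (J^T J)^-1 * s_r^2 for a model with p = 2 parameters."""
    n, p = len(J), 2
    a = sum(row[0] * row[0] for row in J)      # elements of J^T J
    b = sum(row[0] * row[1] for row in J)
    d = sum(row[1] * row[1] for row in J)
    det = a * d - b * b
    s2 = sum(e * e for e in residuals) / (n - p)   # residual variance
    return [[d * s2 / det, -b * s2 / det],
            [-b * s2 / det, a * s2 / det]]

# Invented data for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 0.9, 2.1, 2.9]
intercept, slope = fit_line(x, y)
J = [[1.0, xi] for xi in x]                    # derivatives w.r.t. intercept and slope
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
cov = covariance_2x2(J, residuals)
print(cov)
```

The square roots of the diagonal elements are the parameter standard deviations; for a nonlinear model the same expression would be evaluated with the Jacobian at the fitted parameters.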
