Applied Parameter Estimation For Chemical Engineers
Applied Parameter Estimation for Chemical Engineers

Peter Englezos
University of British Columbia
Vancouver, Canada

Nicolas Kalogerakis
Technical University of Crete
Chania, Greece

Marcel Dekker, Inc.    New York - Basel
ISBN: 0-8247-9561-X

This book is printed on acid-free paper.

Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212-696-9000; fax: 212-685-4540

Eastern Hemisphere Distribution
Marcel Dekker AG
Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland
tel: 41-61-261-8482; fax: 41-61-261-8896

World Wide Web
http://www.dekker.com

The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above.

Copyright © 2001 by Marcel Dekker, Inc. All Rights Reserved.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

Current printing (last digit): 10 9 8 7 6 5 4 3 2 1

PRINTED IN THE UNITED STATES OF AMERICA
Dedicated to
Vangie, Chris, Kelly, Gina & Manos
Preface
Engineering sciences state relations among measurable properties so that a tech-
nological system or process can be analyzed mathematically (Ferguson, 1992).
The term model is adopted here to refer to the ensemble of equations that describes
and interrelates the variables and parameters of a system or process (Basmadjian,
1999). In chemical, biochemical, environmental and petroleum engineering these
models are based on the principles of chemistry, physics, thermodynamics, kinet-
ics and transport phenomena. As most engineering calculations cannot be based on
quantum mechanics as of yet, the models contain a number of quantities the value
of which is not known a priori. It is customary to call these quantities adjustable
parameters. The determination of suitable values for these adjustable parameters
is the objective of parameter estimation, also known as data regression. A classic
example of parameter estimation is the determination of kinetic parameters from a
set of data.
Parameter estimation is essentially an optimization problem whereby the
unknown parameters are obtained by minimizing a suitable objective function.
The structure of this objective function has led to the development of particularly
efficient and robust methods. The aim of this book is to provide students and
practicing engineers with straightforward tools that can be used directly for the
solution of parameter estimation problems. The emphasis is on applications rather
than on formal development of the theories. Students who study chemical, bio-
chemical, environmental or petroleum engineering and practicing engineers in
these fields will find the book useful. The following table summarizes how the
book can be used:
Subject: Chapters from this book
Regression Analysis & Applications: All chapters
Biochemical Engineering: 1, 2, 3, 4, 6, 7, 8, 11, 12, 17
Chemical Kinetics & Reactor Design: 1, 2, 3, 4, 6, 8, 10, 11, 12, 16
Petroleum Reservoir Engineering: 1, 2, 3, 6, 8, 10, 11, 18
Computational Thermodynamics: 1, 2, 4, 8, 9, 11, 12, 14, 15
Optimization Methods: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
With this book the reader can expect to learn how to formulate and solve pa-
rameter estimation problems, compute the statistical properties of the parameters,
perform model adequacy tests, and design experiments for parameter estimation or
model discrimination.
A number of books address parameter estimation (Bard, 1974; Bates and
Watts, 1988; Beck and Arnold, 1977; Draper and Smith, 1981; Gans, 1992; Koch,
1987; Lawson and Hanson, 1974; Seber and Wild, 1989; Seinfeld and Lapidus,
1974; Sorenson, 1980). However, the majority of these books emphasize statistics
and mathematics or system identification and signal processing. Furthermore,
most of the existing books pay considerable attention to linear and nonlinear re-
gression for models described by algebraic equations. This book was conceived
with the idea of focusing primarily on chemical engineering applications and on
systems described by nonlinear algebraic and ordinary differential equations with
a particular emphasis on the latter.
In Chapter 1, the main areas where chemical engineers encounter parameter
estimation problems are introduced. Examples from chemical kinetics, biochemi-
cal engineering, petroleum engineering, and thermodynamics are briefly de-
scribed. In Chapter 2, the parameter estimation problem is formulated mathemati-
cally with emphasis on the choice of a suitable objective function. The subject of
linear regression is described in a succinct manner in Chapter 3. Methodologies
for solving linear regression problems with readily available software such as Mi-
crosoft Excel™ and SigmaPlot™ for Windows™ are presented with examples.
In Chapter 4 the Gauss-Newton method for systems described by algebraic
equations is developed. The method is illustrated by examples with actual data
from the literature. Other methods (indirect, such as Newton, Quasi-Newton, etc.,
and direct, such as the Luus-Jaakola optimization procedure) are presented in
Chapter 5.
In Chapter 6, the Gauss-Newton method for systems described by ordinary
differential equations (ODE) is developed and is illustrated with three examples
formulated with data from the literature. Simpler methods for estimating parame-
ters in systems described by ordinary differential equations known as shortcut
methods are presented in Chapter 7. Such methods are particularly suitable for
systems in the field of biochemical engineering.
Chapter 8 provides practical guidelines for the implementation of the Gauss-
Newton method. Issues such as generating initial guesses and tackling the issues
of overstepping and matrix ill-conditioning are presented. In addition, guidelines
are provided on how to utilize "prior" information and select a suitable
weighting matrix. The models described by ODE require special attention to deal
with stiffness and enlargement of the region of convergence.
Chapter 9 deals with estimation of parameters subject to equality and ine-
quality constraints whereas Chapter 10 examines systems described by partial
differential equations (PDE). Examples are provided in Chapters 14 and 18.
Procedures on how to make inferences on the parameters and the response
variables are introduced in Chapter 11. The design of experiments has a direct
impact on the quality of the estimated parameters and is presented in Chapter 12.
The emphasis is on sequential experimental design for parameter estimation and
for model discrimination. Recursive least squares estimation, used for on-line data
analysis, is briefly covered in Chapter 13.
Chapters 14 to 18 are entirely devoted to applications. Examples and prob-
lems for solution by the reader are also included. In Chapter 14 several applica-
tions of the Gauss-Newton method are presented for the estimation of adjustable
parameters in cubic equations of state. Parameter estimation in activity coefficient
models is presented in Chapter 15. Chemical kinetics has traditionally been the
main domain for parameter estimation studies. Examples formulated with models
described by algebraic equations or ODE are presented in Chapter 16. The in-
creasing involvement of chemical engineers in biotechnology motivated us to de-
vote a chapter to such applications. Thus Chapter 17 includes examples from en-
zyme kinetics and mass transfer coefficient determination in bioreactors. The last
chapter (Chapter 18) is devoted to applications in petroleum engineering. Thus the
modeling of drilling data is a linear regression problem whereas oil reservoir
simulation presents an opportunity to demonstrate the application of the Gauss-
Newton method for systems described by partial differential equations.
It is a pleasure to acknowledge those individuals who helped us
indirectly in preparing this book: our colleagues Professors L.A. Behie, P.R.
Bishnoi, R.A. Heidemann and R.G. Moore and our graduate students who over
the years as part of their MSc. and Ph.D. thesis have gathered and analyzed
data.
We sincerely thank Professor Hoffman of the Institute of Technical
Chemistry, Friedrich-Alexander University, Germany for providing us with
the raw data for the hydrogenation of 3-hydroxypropanal.
Professor Englezos acknowledges the support of the University of British
Columbia for a sabbatical leave during which a major part of this book was
completed. Professor Englezos also acknowledges the support from the
Technical University of Crete and Keio University where he spent parts of his
leave.
Professor Kalogerakis acknowledges the support of the Technical
University of Crete in completing this book; Professor Luus for his
encouragement and help with direct search procedures; and all his colleagues
at the University of Calgary for the many discussions and help he received
over the years.
Finally, both of us would like to sincerely thank our wives Kalliroy
Kalogerakis and Evangeline Englezos for their patience and understanding
while we devoted many hours to completing this book.
Peter Englezos
Vancouver, Canada
Nicolas Kalogerakis
Chania, Crete
Contents
Preface

1 Introduction
2 Formulation of the Parameter Estimation Problem
2.1 Structure of the Mathematical Model
2.1.1 Algebraic Equation Models
2.1.2 Differential Equation Models
2.2 The Objective Function
2.2.1 Explicit Estimation
2.2.1.1 Simple or Unweighted Least Squares (LS) Estimation
2.2.1.2 Weighted Least Squares (WLS) Estimation
2.2.1.3 Generalized Least Squares (GLS) Estimation
2.2.1.4 Maximum Likelihood (ML) Estimation
2.2.1.5 The Determinant Criterion
2.2.1.6 Incorporation of Prior Information About the Parameters
2.2.2 Implicit Estimation
2.3 Parameter Estimation Subject to Constraints
3 Computation of Parameters in Linear Models - Linear Regression
3.1 The Linear Regression Model
3.2 The Linear Least Squares Objective Function
3.3 Linear Least Squares Estimation
3.4 Polynomial Curve Fitting
3.5 Statistical Inferences
3.5.1 Inference on the Parameters
3.5.2 Inference on the Expected Response Variables
3.6 Solution of Multiple Linear Regression Problems
3.6.1 Procedure for Using Microsoft Excel™ for Windows
3.6.2 Procedure for Using SigmaPlot™ for Windows
3.7 Solution of Multiresponse Linear Regression Problems
3.8 Problems on Linear Regression
3.8.1 Vapor Pressure Data for Pyridine and Piperidine
3.8.2 Vapor Pressure Data for R142b and R152a
4 Gauss-Newton Method for Algebraic Models
4.1 Formulation of the Problem
4.2 The Gauss-Newton Method
4.2.1 Bisection Rule
4.2.2 Convergence Criteria
4.2.3 Formulation of the Solution Steps for the Gauss-Newton Method: Two Consecutive Chemical Reactions
4.2.4 Notes on the Gauss-Newton Method
4.3 Examples
4.3.1 Chemical Kinetics: Catalytic Oxidation of 3-Hexanol
4.3.2 Biological Oxygen Demand (BOD)
4.3.3 Numerical Example 1
4.3.4 Chemical Kinetics: Isomerization of Bicyclo [2,1,1] Hexane
4.3.5 Enzyme Kinetics
4.3.6 Catalytic Reduction of Nitric Oxide
4.3.7 Numerical Example 2
4.4 Solutions
4.4.1 Numerical Example 1
4.4.2 Numerical Example 2

5 Other Nonlinear Regression Methods for Algebraic Models
5.1 Gradient Minimization Methods
5.1.1 Steepest Descent Method
5.1.2 Newton's Method
5.1.3 Modified Newton's Method
5.1.4 Conjugate Gradient Methods
5.1.5 Quasi-Newton or Variable Metric or Secant Methods
5.2 Direct Search or Derivative Free Methods
5.2.1 LJ Optimization Procedure
5.2.2 Simplex Method
5.3 Exercises
6 Gauss-Newton Method for Ordinary Differential Equation (ODE) Models
6.1 Formulation of the Problem
6.2 The Gauss-Newton Method
6.2.1 Gauss-Newton Algorithm for ODE Models
6.2.2 Implementation Guidelines for ODE Models
6.3 The Gauss-Newton Method - Nonlinear Output Relationship
6.4 The Gauss-Newton Method - Systems with Unknown Initial Conditions
6.5 Examples
6.5.1 A Homogeneous Gas Phase Reaction
6.5.2 Pyrolytic Dehydrogenation of Benzene to Diphenyl and Triphenyl
6.5.3 Catalytic Hydrogenation of 3-Hydroxypropanal (HPA) to 1,3-Propanediol (PD)
6.6 Equivalence of Gauss-Newton with Quasilinearization Method
6.6.1 The Quasilinearization Method and its Simplification
6.6.2 Equivalence to Gauss-Newton Method
6.6.3 Nonlinear Output Relationship

7 Shortcut Estimation Methods for ODE Models
7.1 ODE Models with Linear Dependence on the Parameters
7.1.1 Derivative Approach
7.1.2 Integral Approach
7.2 Generalization to ODE Models with Nonlinear Dependence on the Parameters
7.3 Estimation of Apparent Rates in Biological Systems
7.3.1 Derivative Approach
7.3.2 Integral Approach
7.4 Examples
7.4.1 Derivative Approach - Pyrolytic Dehydrogenation of Benzene
8 Practical Guidelines for Algorithm Implementation
8.1 Inspection of the Data
8.2 Generation of Initial Guesses
8.2.1 Nature and Structure of the Model
8.2.2 Asymptotic Behavior of the Model Equations
8.2.3 Transformation of the Model Equations
8.2.4 Conditionally Linear Systems
8.2.5 Direct Search Approach
8.3 Overstepping
8.3.1 An Optimal Step-Size Policy
8.4 Ill-Conditioning of Matrix A and Partial Remedies
8.4.1 Pseudoinverse
8.4.2 Marquardt's Modification
8.4.3 Scaling of Matrix A
8.5 Use of "Prior" Information
8.6 Selection of Weighting Matrix Q in Least Squares Estimation
8.7 Implementation Guidelines for ODE Models
8.7.1 Stiff ODE Models
8.7.2 Increasing the Region of Convergence
8.7.2.1 An Optimal Step-Size Policy
8.7.2.2 Use of the Information Index
8.7.2.3 Use of Direct Search Methods
8.8 Autocorrelation in Dynamic Systems

9 Constrained Parameter Estimation
9.1 Equality Constraints
9.1.1 Lagrange Multipliers
9.2 Inequality Constraints
9.2.1 Optimum Is Internal Point
9.2.1.1 Reparameterization
9.2.1.2 Penalty Function
9.2.1.3 Bisection Rule
9.2.2 The Kuhn-Tucker Conditions
10 Gauss-Newton Method for Partial Differential Equation (PDE) Models
10.1 Formulation of the Problem
10.2 The Gauss-Newton Method for PDE Models
10.3 The Gauss-Newton Method for Discretized PDE Models
10.3.1 Efficient Computation of the Sensitivity Coefficients

11 Statistical Inferences
11.1 Inferences on the Parameters
11.2 Inferences on the Expected Response Variables
11.3 Model Adequacy Tests
11.3.1 Single Response Models
11.3.2 Multivariate Models

12 Design of Experiments
12.1 Preliminary Experimental Design
12.2 Sequential Experimental Design for Precise Parameter Estimation
12.2.1 The Volume Design Criterion
12.2.2 The Shape Design Criterion
12.2.3 Implementation Steps
12.3 Sequential Experimental Design for Model Discrimination
12.3.1 The Divergence Design Criterion
12.3.2 Model Adequacy Tests for Model Discrimination
12.3.3 Implementation Steps for Model Discrimination
12.4 Sequential Experimental Design for ODE Systems
12.4.1 Selection of Optimal Sampling Interval and Initial State for Precise Parameter Estimation
12.4.2 Selection of Optimal Sampling Interval and Initial State for Model Discrimination
12.4.3 Determination of Optimal Inputs for Precise Parameter Estimation and Model Discrimination
12.5 Examples
12.5.1 Consecutive Chemical Reactions
12.5.2 Fed-batch Bioreactor
12.5.3 Chemostat Growth Kinetics

13 Recursive Parameter Estimation
13.1 Discrete Input-Output Models
13.2 Recursive Least Squares (RLS)
13.3 Recursive Extended Least Squares (RELS)
13.4 Recursive Generalized Least Squares (RGLS)
14 Parameter Estimation in Nonlinear Thermodynamic Models: Cubic Equations of State
14.1 Equations of State
14.1.1 Cubic Equations of State
14.1.2 Estimation of Interaction Parameters
14.1.3 Fugacity Expressions Using the Peng-Robinson EoS
14.1.4 Fugacity Expressions Using the Trebble-Bishnoi EoS
14.2 Parameter Estimation Using Binary VLE Data
14.2.1 Maximum Likelihood Parameter and State Estimation
14.2.2 Explicit Least Squares Estimation
14.2.3 Implicit Maximum Likelihood Parameter Estimation
14.2.4 Implicit Least Squares Estimation
14.2.5 Constrained Least Squares Estimation
14.2.5.1 Simplified Constrained Least Squares Estimation
14.2.5.2 A Potential Problem with Sparse or Not Well Distributed Data
14.2.5.3 Constrained Gauss-Newton Method for Regression of Binary VLE Data
14.2.6 A Systematic Approach for Regression of Binary VLE Data
14.2.7 Numerical Results
14.2.7.1 The n-Pentane-Acetone System
14.2.7.2 The Methane-Acetone System
14.2.7.3 The Nitrogen-Ethane System
14.2.7.4 The Methane-Methanol System
14.2.7.5 The Carbon Dioxide-Methanol System
14.2.7.6 The Carbon Dioxide-n-Hexane System
14.2.7.7 The Propane-Methanol System
14.2.7.8 The Diethylamine-Water System
14.3 Parameter Estimation Using the Entire Binary Phase Equilibrium Data
14.3.1 The Objective Function
14.3.2 Covariance Matrix of the Parameters
14.3.3 Numerical Results
14.3.3.1 The Hydrogen Sulfide-Water System
14.3.3.2 The Methane-n-Hexane System
14.4 Parameter Estimation Using Binary Critical Point Data
14.4.1 The Objective Function
14.4.2 Numerical Results
14.5 Problems
14.5.1 Data for the Methanol-Isobutane System
14.5.2 Data for the Carbon Dioxide-Cyclohexane System
15 Parameter Estimation in Nonlinear Thermodynamic Models: Activity Coefficients
15.1 Electrolyte Solutions
15.1.1 Pitzer's Model Parameters for Aqueous Na2SiO3 Solutions
15.1.2 Pitzer's Model Parameters for Aqueous Na2SiO3 - NaOH Solutions
15.1.3 Numerical Results
15.2 Non-Electrolyte Solutions
15.2.1 The Two-Parameter Wilson Model
15.2.2 The Three-Parameter NRTL Model
15.2.3 The Two-Parameter UNIQUAC Model
15.2.4 Parameter Estimation: The Objective Function
15.3 Problems
15.3.1 Osmotic Coefficients for Aqueous Solutions of KCl Obtained by the Isopiestic Method
15.3.2 Osmotic Coefficients for Aqueous Solutions of High-Purity NiCl2
15.3.3 The Benzene (1)-i-Propyl Alcohol (2) System
15.3.4 Vapor-Liquid Equilibria of Coal-Derived Liquids: Binary Systems with Tetralin
15.3.5 Vapor-Liquid Equilibria of Ethylbenzene (1)-o-Xylene (2) at 26.66 kPa
16 Parameter Estimation in Chemical Reaction Kinetic Models
16.1 Algebraic Equation Models
16.1.1 Chemical Kinetics: Catalytic Oxidation of 3-Hexanol
16.1.2 Chemical Kinetics: Isomerization of Bicyclo [2,1,1] Hexane
16.1.3 Catalytic Reduction of Nitric Oxide
16.2 Problems with Algebraic Models
16.2.1 Catalytic Dehydrogenation of sec-butyl Alcohol
16.2.2 Oxidation of Propylene
16.2.3 Model Reduction Through Parameter Estimation in the s-Domain
16.3 Ordinary Differential Equation Models
16.3.1 A Homogeneous Gas Phase Reaction
16.3.2 Pyrolytic Dehydrogenation of Benzene to Diphenyl and Triphenyl
16.3.3 Catalytic Hydrogenation of 3-Hydroxypropanal (HPA) to 1,3-Propanediol (PD)
16.3.4 Gas Hydrate Formation Kinetics
16.4 Problems with ODE Models
16.4.1 Toluene Hydrogenation
16.4.2 Methylester Hydrogenation
16.4.3 Catalytic Hydrogenation of 3-Hydroxypropanal (HPA) to 1,3-Propanediol (PD) - Nonisothermal Data

17 Parameter Estimation in Biochemical Engineering Models
17.1 Algebraic Equation Models
17.1.1 Biological Oxygen Demand
17.1.2 Enzyme Kinetics
17.1.3 Determination of Mass Transfer Coefficient (kLa) in a Municipal Wastewater Treatment Plant (with PULSAR aerators)
17.1.4 Determination of Monoclonal Antibody Productivity in a Dialyzed Chemostat
17.2 Problems with Algebraic Equation Models
17.2.1 Effect of Glucose to Glutamine Ratio on MAb Productivity in a Chemostat
17.2.2 Enzyme Inhibition Kinetics
17.2.3 Determination of kLa in Bubble-free Bioreactors
17.3 Ordinary Differential Equation Models
17.3.1 Contact Inhibition in Microcarrier Cultures of MRC-5 Cells
17.4 Problems with ODE Models
17.4.1 Vero Cells Grown on Microcarriers (Contact Inhibition)
17.4.2 Effect of Temperature on Insect Cell Growth Kinetics
18 Parameter Estimation in Petroleum Engineering
18.1 Modeling of Drilling Rate Using Canadian Offshore Well Data
18.1.1 Application to Canadian Offshore Well Data
18.2 Modeling of Bitumen Oxidation and Cracking Kinetics Using Data from Alberta Oil Sands
18.2.1 Two-Component Models
18.2.2 Three-Component Models
18.2.3 Four-Component Models
18.2.4 Results and Discussion
18.3 Automatic History Matching in Reservoir Engineering
18.3.1 A Fully Implicit, Three-Dimensional, Three-Phase Simulator with Automatic History-Matching Capability
18.3.2 Application to a Radial Coning Problem (Second SPE Comparative Solution Problem)
18.3.2.1 Matching Reservoir Pressure
18.3.2.2 Matching Water-Oil Ratio, Gas-Oil Ratio or Bottom Hole Pressure
18.3.2.3 Matching All Observed Data
18.3.3 A Three-Dimensional, Three-Phase Automatic History-Matching Model: Reliability of Parameter Estimates
18.3.3.1 Implementation and Numerical Results
18.3.4 Improved Reservoir Characterization Through Automatic History Matching
18.3.4.1 Incorporation of Prior Information and Constraints on the Parameters
18.3.4.2 Reservoir Characterization Using Automatic History Matching
18.3.5 Reliability of Predicted Well Performance Through Automatic History Matching
18.3.5.1 Quantification of Risk
18.3.5.2 Multiple Reservoir Descriptions
18.3.5.3 Case Study - Reliability of a Horizontal Well Performance

References

Appendix 1
A.1.1 The Trebble-Bishnoi Equation of State
A.1.2 Derivation of the Fugacity Expression
A.1.3 Derivation of the Expression for (∂lnf_j/∂x_j)

Appendix 2
A.2.1 Listings of Computer Programs
A.2.2 Contents of Accompanying CD
A.2.3 Computer Program for Example 16.1.2
A.2.4 Computer Program for Example 16.3.2

Index
1 Introduction
During an experiment, measurement of certain variables, e.g., concentrations, pressures, temperatures, etc., is conducted. Let y = [y_1, y_2, ..., y_m]^T be the m-dimensional vector of measured variables during an experiment. In addition, during each experiment, certain conditions are set or fixed by the experimentalist, e.g., substrate concentration in an enzyme kinetics experiment, time and temperature in a kinetics experiment, etc. Let x = [x_1, x_2, ..., x_n]^T be the n-dimensional vector of these input variables that can be assumed to be known precisely.
The experimentalist often formulates a mathematical model in order to describe the observed behavior. In general, the model consists of a set of equations based on the principles of chemistry, physics, thermodynamics, kinetics and transport phenomena and attempts to predict the variables, y, that are being measured. In general, the measured variables y are a function of x. Thus, the model has the following form

y = Function of (x, k) + Random Error    (1.1)

The random error arises from the measurement of y, the true value of which is not known. The measurements are assumed to be free of systematic errors. The modeling equations contain adjustable parameters to account for the fact that the models are phenomenological. For example, kinetic rate expressions contain rate constants (parameters) whose value is unknown and cannot be obtained from fundamental principles.
Parameter estimation is one of the steps involved in the formulation and validation of a mathematical model that describes a process of interest. Parameter estimation refers to the process of obtaining values of the parameters from the matching of the model-based calculated values to the set of measurements (data). This is the classic parameter estimation or model fitting problem and it should be distinguished from the identification problem. The latter involves the development of a model from input/output data only. This case arises when there is no a priori information about the form of the model, i.e., it is a black box.
When the model equations are linear functions of the parameters the problem is called linear estimation. Nonlinear estimation refers to the more general and most frequently encountered situation where the model equations are nonlinear functions of the parameters.
Parameter estimation and identification are an essential step in the development of mathematical models that describe the behavior of physical processes (Seinfeld and Lapidus, 1974; Aris, 1994). The reader is strongly advised to consult the above references for discussions on what is a model, types of models, model formulation and evaluation. The paper by Plackett that presents the history of the discovery of the least squares method is also recommended (Plackett, 1972).
A smooth function f(x) is used quite often to describe a set of data, (x_1, y_1), (x_2, y_2), ..., (x_N, y_N). Fitting of a smooth curve, f(x), through these points is usually done for interpolation or visual purposes (Sutton and MacGregor, 1977). This is called curve fitting. The parameters defining the curve are calculated by minimizing a measure of the closeness of fit such as the function S(k) = Σ_i [y_i − f(x_i)]². The parameters have no physical significance.
The scope of this book deals primarily with the parameter estimation problem. Our focus will be on the estimation of adjustable parameters in nonlinear models described by algebraic or ordinary differential equations. The models describe processes and thus explain the behavior of the observed data. It is assumed that the structure of the model is known. The best parameters are estimated in order to be used in the model for predictive purposes at other conditions where the model is called to describe process behavior.
The unknown model parameters will be obtained by minimizing a suitable objective function. The objective function is a measure of the discrepancy or the departure of the data from the model, i.e., the lack of fit (Bard, 1974; Seinfeld and Lapidus, 1974). Thus, our problem can also be viewed as an optimization problem and one can in principle employ a variety of solution methods available for such problems (Edgar and Himmelblau, 1988; Gill et al., 1981; Reklaitis, 1983; Scales, 1985). Finally, it should be noted that engineers use the term parameter estimation whereas statisticians use such terms as nonlinear or linear regression analysis to describe the subject presented in this book.
In parameter estimation, the general problem we have to solve is:
Given the structure of the model (i.e., the governing model
equations) and a set of measured data points, the problem is
to find the unknown model parameters so that the values
calculated by the model match the data in some optimal
manner (e.g., by minimizing the sum of squares of errors).
The specific issues that we have tried to address in this book are:
Structure of the Model ("What kind of models can be used? Linear or
nonlinear? Algebraic or differential equation models?")
Selection of the Objective Function ("What do we minimize to estimate
the parameters?")
Solution Techniques ("How do we minimize the objective function?")
Statistical Properties of Parameter Estimates ("How accurate are the
estimated parameters?")
Statistical Properties of Model-Based Calculated Values ("Given the
uncertainty in the model parameters, what is the uncertainty in the cal-
culated values?")
Tests for Model Adequacy ("Is the model good enough?")
Tests for Model Discrimination ("Among several rival models, which is
the best one?")
Factorial Experimental Design ("What is the first set of experiments I
should run?")
Sequential Experimental Design ("What should my next experiment be
so that I gather maximum information?") for model discrimination (to
select the best model among several ones that fit the data) or for pre-
cise parameter estimation (to minimize further the parameter uncer-
tainty in a particular model).
Issues such as the ones above are of paramount importance and interest to
practicing engineers, researchers and graduate students. In the next paragraphs
we mention several examples that cover many important areas of chemical, bio-
chemical and petroleum engineering.
Chemical kinetics is an area that received perhaps most of the attention of
chemical engineers from a parameter estimation point of view. Chemical engi-
neers need mathematical expressions for the intrinsic rate of chemical reactions
in order to design reactors that produce chemicals on an industrial scale. For example, let us consider the following consecutive first order reactions that take place (Smith, 1981; Froment and Bischoff, 1990)

A --k_1--> B --k_2--> D    (1.2)

During an experiment, the concentrations of A and B are usually measured. If it is assumed that only component A is present in the reactor initially (C_A0 ≠ 0, C_B0 = 0, C_D0 = 0), we do not need to measure the concentration of chemical D because C_A + C_B + C_D = C_A0. The rate equations for A and B are given by

dC_A/dt = −k_1 C_A    (1.3a)

dC_B/dt = k_1 C_A − k_2 C_B    (1.3b)

where C_A and C_B are the concentrations of A and B, t is reaction time, and k_1, k_2 are the rate constants.
The above rate equations can be easily integrated to yield the following algebraic equations

C_A = C_A0 e^(−k_1 t)    (1.4a)

C_B = C_A0 [k_1/(k_2 − k_1)] (e^(−k_1 t) − e^(−k_2 t))    (1.4b)

We can then obtain C_D from the concentration invariant

C_D = C_A0 − C_A − C_B    (1.4c)

The values of the rate constants are estimated by fitting equations 1.4a and 1.4b to the concentration versus time data.
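To make the fitting step concrete, here is a minimal sketch (our own illustration, not one of the book's accompanying listings) of estimating k_1 and k_2 by fitting Equations 1.4a and 1.4b with SciPy's least-squares solver; all time and concentration values are hypothetical placeholders.

# Sketch (not from the book) of fitting k1, k2 in Eqs. 1.4a-1.4b.
import numpy as np
from scipy.optimize import least_squares

CA0 = 1.0                                        # assumed known C_A0
t = np.array([0.5, 1.0, 2.0, 4.0])               # hypothetical times
CA_meas = np.array([0.61, 0.37, 0.14, 0.02])     # hypothetical C_A data
CB_meas = np.array([0.30, 0.41, 0.38, 0.20])     # hypothetical C_B data

def residuals(k):
    k1, k2 = k
    CA = CA0 * np.exp(-k1 * t)                                        # Eq. 1.4a
    CB = CA0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))   # Eq. 1.4b
    return np.concatenate([CA - CA_meas, CB - CB_meas])

sol = least_squares(residuals, x0=[1.0, 0.5])    # initial guesses for k1, k2
print(sol.x)                                     # estimated rate constants

Stacking the residuals of both measured species, as done here, is the unweighted form of the least squares objective discussed in Chapter 2.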
It should be noted that there are kinetic models that are more complex, and integration of the rate equations can only be done numerically. We shall see such models in Chapter 6. An example is given next. Consider the gas phase reaction of NO with O_2 (Bellman et al., 1967):
2NO + O_2 ↔ 2NO_2    (1.5)

The model is given by the following kinetic rate equation

dC_NO2/dt = k_1 (α − C_NO2)(β − C_NO2)² − k_2 C_NO2²    (1.6)

where α = 126.2, β = 91.9, and k_1, k_2 are the rate constants to be estimated. The data consist of a set of measurements of the concentration of NO_2 versus time. In this case, fitting of the data requires the integration of the governing differential equation.
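A sketch of this approach (ours, not from the book) combines a numerical ODE integrator with a least-squares solver; the time and concentration values below are hypothetical placeholders.

# Sketch of fitting k1, k2 when the rate equation (Eq. 1.6) must be
# integrated numerically. Data values are hypothetical.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

alpha, beta = 126.2, 91.9
t_meas = np.array([1.0, 2.0, 4.0, 6.0])          # hypothetical times
x_meas = np.array([10.0, 18.0, 28.0, 34.0])      # hypothetical NO2 data

def model(t, x, k1, k2):                         # right-hand side of Eq. 1.6
    return k1 * (alpha - x) * (beta - x) ** 2 - k2 * x ** 2

def residuals(k):
    sol = solve_ivp(model, (0.0, t_meas[-1]), [0.0],
                    t_eval=t_meas, args=(k[0], k[1]), rtol=1e-8)
    return sol.y[0] - x_meas

fit = least_squares(residuals, x0=[1e-5, 1e-3])  # rough initial guesses
print(fit.x)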
In biochemical engineering one often employs parametric models to describe enzyme-catalyzed reactions, microbial growth, nutrient transport and metabolic rates in biological systems. The parameters in these models can be readily obtained through the methods described in this book. In addition, we present shortcut estimation methods for the estimation of the average specific uptake or production rates during cell cultivation in batch, fed-batch, chemostat or perfusion cultures. For example, such an analysis is particularly important to scientists screening cell lines to identify high producers.
Petroleum and chemical engineers perform oil reservoir simulation to optimize the production of oil and gas. Black-oil, compositional or thermal oil reservoir models are described by sets of differential equations. The measurements consist of the pressure at the wells, water-oil ratios, gas-oil ratios etc. The objective is to estimate, through history matching of the reservoir, unknown reservoir properties such as porosity and permeability.
Volumetric equations of state (EoS) are employed for the calculation of fluid phase equilibrium and thermo-physical properties required in the design of processes involving non-ideal fluid mixtures in the oil, gas and chemical industries. Mathematically, a volumetric EoS expresses the relationship among pressure, volume, temperature, and composition for a fluid mixture. The next equation gives the Peng-Robinson equation of state, which is perhaps the most widely used EoS in industrial practice (Peng and Robinson, 1976).

P = RT/(v − b) − a/[v(v + b) + b(v − b)]    (1.7)

where P is the pressure, T the temperature and v the molar volume. The mixture parameters a and b are defined as follows

a = Σ_i Σ_j x_i x_j (1 − k_ij) (a_i a_j)^(1/2)    (1.8a)

b = Σ_i x_i b_i    (1.8b)

where a_i, b_i and a_j are parameters specific to the individual components i and j, and k_ij is an empirical interaction parameter characterizing the binary formed by component i and component j. It is well known that the introduction of empirical parameters, k_ij, in the equations of state enhances their ability as tools for process design. The interaction parameters are usually estimated from the regression of binary vapor-liquid equilibrium (VLE) data. Such data consist of sets of temperature, pressure and liquid and vapor phase mole fractions for one of the components.
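A minimal sketch (our own, not the book's code) of evaluating Equations 1.7-1.8b for a binary mixture follows; the pure-component a_i, b_i values and the k_ij value are hypothetical placeholders (in practice a_i and b_i are computed from critical properties and the acentric factor).

# Sketch of the Peng-Robinson pressure for a mixture (Eqs. 1.7-1.8b).
import numpy as np

R = 8.314  # J/(mol K)

def pr_pressure(T, v, x, a_pure, b_pure, kij):
    n = len(x)
    a = sum(x[i] * x[j] * (1.0 - kij[i][j]) * np.sqrt(a_pure[i] * a_pure[j])
            for i in range(n) for j in range(n))           # Eq. 1.8a
    b = sum(x[i] * b_pure[i] for i in range(n))            # Eq. 1.8b
    return R * T / (v - b) - a / (v * (v + b) + b * (v - b))  # Eq. 1.7

# Hypothetical numbers, for illustration only
x = [0.4, 0.6]
a_pure = [0.45, 1.20]        # Pa m^6/mol^2
b_pure = [2.7e-5, 9.0e-5]    # m^3/mol
kij = [[0.0, 0.05], [0.05, 0.0]]
print(pr_pressure(300.0, 4.0e-4, x, a_pure, b_pure, kij))

In a regression setting, k_ij would be the adjustable parameter and a routine like this would sit inside the objective function evaluated against binary VLE data, as developed in Chapter 14.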
Typical examples such as the ones mentioned above are used throughout this book and they cover most of the applications chemical engineers are faced with. In addition to the problem definition, the mathematical development and the numerical results, the implementation of each algorithm is presented in detail and computer listings of selected problems are given in the attached CD.
Finally, it is noted that even though we are not concerned with model development in this book, we should always keep in mind that:

(i) The development of models is not necessarily in the direction of greater complexity or an increasing number of adjustable parameters.
(ii) A higher degree of confidence is gained if the adjustable parameters are found from independent sources.
2 Formulation of the Parameter Estimation Problem
The formulation of the parameter estimation problem is equally impor-
tant to the actual solution of the problem (i.e., the determination of the unknown
parameters). In the formulation of the parameter estimation problem we must
answer two questions: (a) what type of mathematical model do we have? and (b)
what type of objective function should we minimize? In this chapter we address
both these questions. Although the primary focus of this book is the treatment
of mathematical models that are nonlinear with respect to the parameters (non-
linear regression), consideration to linear models (linear regression) will also be
given.
2.1 STRUCTURE OF THE MATHEMATICAL MODEL
The primary classification that is employed throughout this book is alge-
braic versus differential equation models. Namely, the mathematical model is
comprised of a set of algebraic equations or of a set of ordinary (ODE) or partial
differential equations (PDE). The majority of mathematical models for physical or
engineered systems can be classified in one of these two categories.
2.1.1 Algebraic Equation Models
In mathematical terms these models are of the form

y = f(x; k)    (2.1)

where
k = [k_1, k_2, ..., k_p]^T is a p-dimensional vector of parameters whose numerical values are unknown;
x = [x_1, x_2, ..., x_n]^T is an n-dimensional vector of independent variables (also called regressor or input variables) which are either fixed for each experiment by the experimentalist or which are measured. In the statistical treatment of the problem, it is often assumed that these variables are known precisely (i.e., there is no uncertainty in their value even if they are measured experimentally);
y = [y_1, y_2, ..., y_m]^T is an m-dimensional vector of dependent variables (also often described as response variables or the output vector); these are the model variables which are actually measured in the experiments; note that y does not necessarily represent the entire set of measured variables in each experiment, rather it represents the set of dependent variables;
f = [f_1, f_2, ..., f_m]^T is an m-dimensional vector function of known form (these are the actual model equations).
Equation 2.1, the mathematical model of the process, is very general and it covers many cases, namely,

(i) The single response with a single independent variable model (i.e., m=1, n=1)

y = f(x; k_1, k_2, ..., k_p)    (2.2a)

(ii) The single response with several independent variables model (i.e., m=1, n>1)

y = f(x_1, x_2, ..., x_n; k_1, k_2, ..., k_p)    (2.2b)

(iii) The multi-response with several independent variables model (i.e., m>1, n>1)

y_1 = f_1(x_1, x_2, ..., x_n; k_1, k_2, ..., k_p)
y_2 = f_2(x_1, x_2, ..., x_n; k_1, k_2, ..., k_p)
...
y_m = f_m(x_1, x_2, ..., x_n; k_1, k_2, ..., k_p)    (2.2c)
A single experiment consists of the measurement of each of the m response variables for a given set of values of the n independent variables. For each experiment, the measured output vector, which can be viewed as a random variable, is comprised of the deterministic part calculated by the model (Equation 2.1) and the stochastic part represented by the error term, i.e.,

ŷ_i = f(x_i, k) + ε_i ;  i = 1,2,...,N    (2.3)

If the mathematical model represents adequately the physical system, the error term in Equation 2.3 represents only measurement errors. As such, it can often be assumed to be normally distributed with zero mean (assuming there is no bias present in the measurement). In real life the vector ε_i incorporates not only the experimental error but also any inaccuracy of the mathematical model.
A special case of Equation 2.3 corresponds to the celebrated linear systems. Linearity is assumed with respect to the unknown parameters rather than the independent variables. Hence, Equation 2.3 becomes

ŷ_i = F(x_i) k + ε_i ;  i = 1,2,...,N    (2.4)

where F(x_i) is an m×p dimensional matrix which depends only on x_i and is independent of the parameters. Quite often the elements of matrix F are simply the independent variables, and hence, the model can be further reduced to the well known linear regression model,

ŷ_i = X_i k + ε_i ;  i = 1,2,...,N    (2.5)

A brief review of linear regression analysis is presented in Chapter 3.
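As a brief illustration of Equation 2.5, the sketch below (our own, with hypothetical numbers) stacks the data from all experiments into one matrix and obtains the linear least squares estimates in one step; the full treatment is given in Chapter 3.

# Sketch of linear least squares for y = X k + error (hypothetical data).
import numpy as np

X = np.array([[1.0, 0.5],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 4.0]])              # each row: regressors for one experiment
y = np.array([1.9, 2.6, 4.1, 7.2])      # measured responses

k_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # solves min ||y - X k||^2
print(k_hat)                                    # parameter estimates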
In algebraic equation models we also have the special situation of conditionally linear systems which arise quite often in engineering (e.g., chemical kinetic models, biological systems, etc.). In these models some of the parameters enter in a linear fashion, namely, the model is of the form,

ŷ_i = F(x_i, k_2) k_1 + ε_i ;  i = 1,2,...,N    (2.6)

where the parameter vector has been partitioned into two groups,

k = [k_1^T, k_2^T]^T    (2.7)

The structure of such models can be exploited in reducing the dimensionality of the nonlinear parameter estimation problem since the conditionally linear parameters, k_1, can be obtained by linear least squares in one step and without the need for initial estimates. Further details are provided in Chapter 8 where we exploit the structure of the model either to reduce the dimensionality of the nonlinear regression problem or to arrive at consistent initial guesses for any iterative parameter search algorithm.
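The following sketch (ours, using a hypothetical model y = k_1 exp(−k_2 x) with hypothetical data) illustrates the idea: for each trial value of k_2 the conditionally linear parameter k_1 follows in one step from linear least squares, so only k_2 is searched iteratively.

# Sketch of exploiting conditional linearity (Eq. 2.6).
import numpy as np
from scipy.optimize import minimize_scalar

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.1, 1.2, 0.75, 0.41])    # hypothetical measurements

def k1_given_k2(k2):
    F = np.exp(-k2 * x)                 # F(x, k2) in Equation 2.6
    return F.dot(y) / F.dot(F)          # one-step linear LS solution for k1

def sse(k2):
    k1 = k1_given_k2(k2)
    return np.sum((y - k1 * np.exp(-k2 * x)) ** 2)

res = minimize_scalar(sse, bounds=(0.01, 5.0), method='bounded')
print(res.x, k1_given_k2(res.x))        # estimated k2 and corresponding k1

This is precisely the reduction in dimensionality referred to above: the nonlinear search runs over k_2 only.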
In certain circumstances, the model equations may not have an explicit expression for the measured variables. Namely, the model can only be represented implicitly. In such cases, the distinction between dependent and independent variables becomes rather fuzzy, particularly when all the variables are subject to experimental error. As a result, it is preferable to consider an augmented vector of measured variables, y, that contains both regressor and response variables (Box, 1970; Britt and Luecke, 1973). The model can then be written as

φ(y; k) = 0    (2.8)

where y = [y_1, y_2, ..., y_R]^T is an R-dimensional vector of measured variables and φ = [φ_1, φ_2, ..., φ_m]^T is an m-dimensional vector function of known form.
If we substitute the actual measurement for each experiment, the model equation becomes,

φ(ŷ_i; k) = ε_i ;  i = 1,2,...,N    (2.9)

As we shall discuss later, in many instances implicit estimation provides the easiest and computationally the most efficient solution to this class of problems.
The category of algebraic equation models is quite general and it encompasses many types of engineering models. For example, any discrete dynamic model described by a set of difference equations falls in this category for parameter estimation purposes. These models could be either deterministic or stochastic in nature or even combinations of the two. Although on-line techniques are available for the estimation of parameters in sampled data systems, off-line techniques
such as the ones emphasized in this book are very useful and they can be em-
ployed when all the data are used together after the experiment is over.
Finally, we should refer to situations where both independent and response variables are subject to experimental error regardless of the structure of the model. In this case, the experimental data are described by the set {(ŷ_i, x̂_i), i = 1,2,...,N} as opposed to {(ŷ_i, x_i), i = 1,2,...,N}. The deterministic part of the model is the same as before; however, we now have to consider, besides Equation 2.3, the error in x_i, i.e., x̂_i = x_i + ε_x,i. These situations in nonlinear regression can be handled very efficiently using an implicit formulation of the problem as shown later in Section 2.2.2.
2.1.2 Differential Equation Models
Let us first concentrate on dynamic systems described by a set of ordinary
differential equations (ODEs). On certain occasions the governing ordinary differential equations can be solved analytically, and as far as parameter estimation is concerned, the problem is described by a set of algebraic equations. If however, the ODEs cannot be solved analytically, the mathematical model is more complex. In general, the model equations can be written in the form

dx(t)/dt = f(x(t), u, k) ;  x(t_0) = x_0    (2.10)

y(t) = C x(t)    (2.11)

or

y(t) = h(x(t))    (2.12)

where
k = [k_1, k_2, ..., k_p]^T is a p-dimensional vector of parameters whose numerical values are unknown;
x = [x_1, x_2, ..., x_n]^T is an n-dimensional vector of state variables;
x_0 = [x_10, x_20, ..., x_n0]^T is an n-dimensional vector of initial conditions for the state variables and they are assumed to be known precisely (in Section 6.4 we consider the situation where several elements of the initial state vector could be unknown);
u = [u_1, u_2, ..., u_r]^T is an r-dimensional vector of manipulated variables which are either set by the experimentalist and their numerical values are precisely known or they have been measured;
f = [f_1, f_2, ..., f_n]^T is an n-dimensional vector function of known form (the differential equations);
y = [y_1, y_2, ..., y_m]^T is the m-dimensional output vector, i.e., the set of variables that are measured experimentally;
C is the m×n observation matrix which indicates the state variables (or linear combinations of state variables) that are measured experimentally; and
h = [h_1, h_2, ..., h_m]^T is an m-dimensional vector function of known form that relates in a nonlinear fashion the state vector to the output vector.
The state variables are the minimal set of dependent variables that are needed in order to describe fully the state of the system. The output vector represents normally a subset of the state variables or combinations of them that are measured. For example, if we consider the dynamics of a distillation column, in order to describe the condition of the column at any point in time we need to know the prevailing temperature and concentrations at each tray (the state variables). On the other hand, typically very few variables are measured, e.g., the concentration at the top and bottom of the column, the temperature in a few trays and in some occasions the concentrations at a particular tray where a side stream is taken. In other words, for this case the observation matrix C will have zeros everywhere except in very few locations where there will be 1s indicating which state variables are being measured.
In the distillation column example, the manipulated variables correspond to all the process parameters that affect its dynamic behavior and they are normally set by the operator, for example, reflux ratio, column pressure, feed rate, etc. These variables could be constant or time varying. In both cases however, it is assumed that their values are known precisely.
As another example let us consider a complex batch reactor. We may have to consider the concentration of many intermediates and final products in order to describe the system over time. However, it is quite plausible that only very few species are measured. In several cases, the measurements could even be pools of several species present in the reactor. For example, this would reflect an output variable which is the summation of two state variables, i.e., y_1 = x_1 + x_2.
The measurements of the output vector are taken at distinct points in time, t_i, with i = 1, ..., N. The initial condition x_0 is also chosen by the experimentalist and it is assumed to be precisely known. It represents a very important variable from an experimental design point of view.
Again, the measured output vector at time t_i, denoted as ŷ_i, is related to the value calculated by the mathematical model (using the true parameter values) through the error term,

ŷ_i = y(t_i) + ε_i ;  i = 1,2,...,N    (2.13)

As in algebraic models, the error term accounts for the measurement error as well as for all model inadequacies. In dynamic systems we have the additional complexity that the error terms may be autocorrelated and in such cases several modifications to the objective function should be performed. Details are provided in Chapter 8.
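The sketch below (our own illustration, with hypothetical values) shows how Equations 2.10-2.13 fit together for a simple two-state system with the pooled output y_1 = x_1 + x_2 mentioned above: the states are integrated and the model output is then formed at the sampling times.

# Sketch of Eqs. 2.10-2.13 for a hypothetical two-state system.
import numpy as np
from scipy.integrate import solve_ivp

k = [0.7, 0.3]                          # hypothetical parameter values
C = np.array([[1.0, 1.0]])              # single output: y = x1 + x2 (a "pool")

def f(t, xvec):                         # state equations, Eq. 2.10
    x1, x2 = xvec
    return [-k[0] * x1, k[0] * x1 - k[1] * x2]

t_i = np.array([0.5, 1.0, 2.0, 4.0])    # sampling times
sol = solve_ivp(f, (0.0, t_i[-1]), [1.0, 0.0], t_eval=t_i)
y_model = C @ sol.y                     # model outputs y(t_i), Eq. 2.11
print(y_model)                          # compared with y_hat_i in Eq. 2.13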
In dynamic systems we may have the situation where a series of runs have been conducted and we wish to estimate the parameters using all the data simultaneously. For example, in a study of isothermal decomposition kinetics, measurements are often taken over time for each run which is carried out at a fixed temperature.
Estimation of parameters present in partial differential equations is a very complex issue. Quite often by proper discretization of the spatial derivatives we transform the governing PDEs into a large number of ODEs. Hence, the problem can be transformed into one described by ODEs and be tackled with similar techniques. However, the fact that in such cases we have a system of high dimensionality requires particular attention. Parameter estimation for systems described by PDEs is examined in Chapter 10.
2.2 THE OBJECTIVE FUNCTION
What type of objective function should we minimize? This is the question that we are always faced with before we can even start the search for the parameter values. In general, the unknown parameter vector k is found by minimizing a scalar function often referred to as the objective function. We shall denote this function as S(k) to indicate the dependence on the chosen parameters.
The objective function is a suitable measure of the overall departure of the model calculated values from the measurements. For an individual measurement the departure from the model calculated value is represented by the residual e_i. For example, the ith residual of an explicit algebraic model is

e_i = ŷ_i − f(x_i, k)    (2.14)

where the model based value f(x_i, k) is calculated using the estimated parameter values. It should be noted that the residual (e_i) is not the same as the error term (ε_i) in Equation 2.13. The error term (ε_i) corresponds to the true parameter values that
are never known exactly, whereas the residual (e_i) corresponds to the estimated parameter values.
The choice of the objective function is very important, as it dictates not only the values of the parameters but also their statistical properties. We may encounter two broad estimation cases. Explicit estimation refers to situations where the output vector is expressed as an explicit function of the input vector and the parameters. Implicit estimation refers to algebraic models in which output and input vector are related through an implicit function.
2.2.1 Explicit Estimation
Given N measurements of the output vector, the parameters can be obtained by minimizing the Least Squares (LS) objective function which is given below as the weighted sum of squares of the residuals, namely,

S(k) = Σ_{i=1}^{N} e_i^T Q_i e_i    (2.15a)

where e_i is the vector of residuals from the ith experiment. The LS objective function for algebraic systems takes the form,

S(k) = Σ_{i=1}^{N} [ŷ_i − f(x_i, k)]^T Q_i [ŷ_i − f(x_i, k)]    (2.15b)

where in both cases Q_i is an m×m user-supplied weighting matrix.
For systems described by ODEs the LS objective function becomes,

S(k) = Σ_{i=1}^{N} [ŷ_i − y(t_i, k)]^T Q_i [ŷ_i − y(t_i, k)]    (2.15c)

As we mentioned earlier, a further complication of systems described by ODEs is that instead of a single run, a series of runs may have been conducted. If we wish to estimate the parameters using all the data simultaneously we must consider the following objective function

S(k) = Σ_{j=1}^{NR} Σ_{i=1}^{N_j} [ŷ_ij − y(t_ij, k)]^T Q_ij [ŷ_ij − y(t_ij, k)]    (2.15d)
where NR is the number of runs. Such a case is considered in Chapter 6 where three isobaric runs are presented for the catalytic hydrogenation of 3-hydroxypropanal to 1,3-propanediol at 318 and 353 K. In order to maintain simplicity and clarity in the presentation of the material we shall only consider the case of a single run in the following sections, i.e., the LS objective function given by Equation 2.15c. Depending on our choice of the weighting matrix Q_i in the objective function we have the following cases:
2.2.1.1 Simple or Unweighted Least Squares (LS) Estimation
In this case we minimize the sum of squares of errors (SSE) without any weighting factor, i.e., we use Q_i = I and Equation 2.15 reduces to

S_{LS}(k) = \sum_{i=1}^{N} e_i^T e_i        (2.16)
2.2.1.2 Weighted Least Squares (WLS) Estimation
In this case we minimize a weighted SSE with constant weights, i.e., the
user-supplied weighting matrix is kept the same for all experiments, Q_i = Q for all i=1,...,N.
2.2.1.3 Generalized Least Squares (GLS) Estimation
In this case we minimize a weighted SSE with non-constant weights. The
user-supplied weighting matrices differ from experiment to experiment.
Of course, it is not at all clear how one should select the weighting matrices Q_i, i=1,...,N, even for cases where a constant weighting matrix Q is used. Practical guidelines for the selection of Q can be derived from Maximum Likelihood (ML) considerations.
2.2.1.4 Maximum Likelihood (ML) Estimation
If the mathematical model of the process under consideration is adequate, it is very reasonable to assume that the measured responses from the ith experiment are normally distributed. In particular, the joint probability density function conditional on the value of the parameters (k and Σ_i) is of the form,

p(\hat{y}_i | k, \Sigma_i) = (2\pi)^{-m/2} |\Sigma_i|^{-1/2} exp\{-\tfrac{1}{2} e_i^T \Sigma_i^{-1} e_i\}        (2.17)

where Σ_i is the covariance matrix of the response variables y at the ith experiment and hence, of the residuals e_i too.
If we now further assume that measurements from different experiments are independent, the joint probability density function for all the measured responses is simply the product,

p(\hat{y}_1, ..., \hat{y}_N | k, \Sigma_1, ..., \Sigma_N) = \prod_{i=1}^{N} p(\hat{y}_i | k, \Sigma_i)        (2.18)
Grouping together similar terms we have,

p(\hat{y}_1, ..., \hat{y}_N | k, \Sigma_1, ..., \Sigma_N) = (2\pi)^{-Nm/2} \left[\prod_{i=1}^{N}|\Sigma_i|\right]^{-1/2} exp\{-\tfrac{1}{2}\sum_{i=1}^{N} e_i^T \Sigma_i^{-1} e_i\}        (2.19)

The Loglikelihood function is the log of the joint probability density function and is regarded as a function of the parameters conditional on the observed responses. Hence, we have

L(k; \hat{y}) = A - \tfrac{1}{2}\sum_{i=1}^{N} ln|\Sigma_i| - \tfrac{1}{2}\sum_{i=1}^{N} e_i^T \Sigma_i^{-1} e_i        (2.20)

where A is a constant quantity. The maximum likelihood estimates of the unknown parameters are obtained by maximizing the Loglikelihood function.
At this point let us assume that the covariance matrices (Σ_i) of the measured responses (and hence of the error terms) during each experiment are known precisely. Obviously, in such a case the ML parameter estimates are obtained by minimizing the following objective function

S_{ML}(k) = \sum_{i=1}^{N} e_i^T \Sigma_i^{-1} e_i        (2.21)

Therefore, on statistical grounds, if the error terms (ε_i) are normally distributed with zero mean and with a known covariance matrix, then Q_i should be the inverse of this covariance matrix, i.e.,

Q_i = [COV(ε_i)]^{-1} = \Sigma_i^{-1} ;  i=1,2,...,N        (2.22)
However, the requirement of exact knowledge of all covariance matrices (Σ_i, i=1,2,...,N) is rather unrealistic. Fortunately, in many situations of practical importance, we can make certain quite reasonable assumptions about the structure of Σ_i that allow us to obtain the ML estimates using Equation 2.21. This approach can actually aid us in establishing guidelines for the selection of the weighting matrices Q_i in least squares estimation.
Case I: Let us consider the stringent assumption that the error terms in each response variable and for each experiment (ε_{ij}, i=1,...,N; j=1,...,m) are all identically and independently distributed (i.i.d.) normally with zero mean and variance σ_ε². Namely,

\Sigma_i = \sigma_\varepsilon^2 I ;  i=1,2,...,N        (2.23)

where I is the m×m identity matrix. Substitution of Σ_i into Equation 2.21 yields

S_{ML}(k) = \frac{1}{\sigma_\varepsilon^2}\sum_{i=1}^{N} e_i^T e_i        (2.24)

Obviously, minimization of S_{ML}(k) in the above equation does not require prior knowledge of the common factor σ_ε². Therefore, under these conditions the ML estimation is equivalent to simple LS estimation (Q_i = I).
Case II: Next let us consider the more realistic assumption that the variance of a particular response variable is constant from experiment to experiment; however, different response variables have different variances, i.e.,

\Sigma_i = diag(\sigma_{\varepsilon_1}^2, \sigma_{\varepsilon_2}^2, ..., \sigma_{\varepsilon_m}^2) ;  i=1,2,...,N        (2.25)

Although we may not know the elements of the diagonal matrix given by Equation 2.25, we assume that we do know their relative values. Namely, we assume that we know the following ratios (v_j, j=1,2,...,m),

v_1 = \sigma_{\varepsilon_1}^2 / \sigma^2        (2.26a)

v_2 = \sigma_{\varepsilon_2}^2 / \sigma^2        (2.26b)

...

v_m = \sigma_{\varepsilon_m}^2 / \sigma^2        (2.26c)

where σ² is an unknown scaling factor. Therefore Σ_i can be written as

\Sigma_i = \sigma^2 diag(v_1, v_2, ..., v_m) ;  i=1,2,...,N        (2.27)

Upon substitution of Σ_i into Equation 2.21 it becomes apparent that the ML parameter estimates are the same as the weighted LS estimates when the following weighting matrices are used,

Q_i = diag(v_1^{-1}, v_2^{-1}, ..., v_m^{-1}) ;  i=1,2,...,N        (2.28)
If the variances in Equation 2.25 are totally unknown, the ML parameter estimates can only be obtained by the determinant criterion presented later in this chapter.
Case III: Generalized LS estimation will yield ML estimates whenever the errors are distributed with variances that change from experiment to experiment. Therefore, in this case our choice for Q_i should be [COV(ε_i)]^{-1} for i=1,...,N.
An interesting situation that arises often in engineering practice is when the errors have a constant, yet unknown, percent error for each variable, i.e.,

\sigma_{\varepsilon_{ij}}^2 = \sigma^2 y_{ij}^2 ;  i=1,2,...,N and j=1,2,...,m        (2.29)

where σ² is an unknown scaling factor. Again, upon substitution of Σ_i into Equation 2.21 it is readily seen that the ML parameter estimates are the same as the generalized LS estimates when the following weighting matrices are used,

Q_i = diag(y_1^{-2}(x_i,k), y_2^{-2}(x_i,k), ..., y_m^{-2}(x_i,k)) ;  i=1,2,...,N        (2.30)

The above choice has the computational disadvantage that the weights are a function of the unknown parameters. If the magnitude of the errors is not excessive, we could use the measurements in the weighting matrix with equally good results, namely

Q_i = diag(\hat{y}_{1i}^{-2}, \hat{y}_{2i}^{-2}, ..., \hat{y}_{mi}^{-2}) ;  i=1,2,...,N        (2.31)
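As a brief illustration, a minimal Python/NumPy sketch of how the weighting matrices of Equation 2.31 could be assembled from the raw measurements; the function name is ours and is purely illustrative:

import numpy as np

# Sketch of Equation 2.31: Q_i is the inverse of a diagonal matrix built
# from the squared measurements of experiment i (constant percent error).
def percent_error_weights(y_hat):
    # y_hat is an (N x m) array of measured responses
    return [np.diag(1.0 / yi**2) for yi in np.asarray(y_hat, float)]

# Example: Q = percent_error_weights([[1.2, 0.8], [1.5, 0.9]])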
2.2.1.5 The Determinant Criterion
If the covariance matrices of the response variables are unknown, the maximum likelihood parameter estimates are obtained by maximizing the Loglikelihood function (Equation 2.20) over k and the unknown variances. Following the distributional assumptions of Box and Draper (1965), i.e., assuming that Σ_1 = Σ_2 = ... = Σ_N = Σ, it can be shown that the ML parameter estimates can be obtained by minimizing the determinant (Bard, 1974)

S(k) = det\left(\sum_{i=1}^{N} e_i e_i^T\right)        (2.32)

In addition, the corresponding estimate of the unknown covariance is

\hat{\Sigma} = \frac{1}{N}\sum_{i=1}^{N} e_i e_i^T        (2.33)

It is worthwhile noting that Box and Draper (1965) arrived at the same determinant criterion following a Bayesian argument and assuming that Σ is unknown and that the prior distribution of the parameters is noninformative.
The determinant criterion is very powerful and it should be used to refine the parameter estimates obtained with least squares estimation if our assumptions about the covariance matrix are suspect.
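A minimal sketch of how the determinant objective of Equation 2.32 might be evaluated numerically (Python/NumPy); the residuals callable and its signature are our assumptions, not something defined in the text:

import numpy as np

# Sketch of the Box-Draper determinant criterion (Equation 2.32).
# residuals(k) is assumed to return an (N x m) array with rows e_i.
def determinant_objective(k, residuals):
    E = residuals(k)            # (N x m) residual array
    M = E.T @ E                 # sum over i of e_i e_i^T, an (m x m) matrix
    return np.linalg.det(M)     # the quantity to be minimized over k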
2.2.1.6 Incorporation of Prior Information About The Parameters
Any prior information that is available about the parameter values can fa-
cilitate the estimation of the parameter values. The methodology to incorporate
such information in the estimation procedure and the resulting benefits are dis-
cussed in Chapter 8.
2.2.2 Implicit Estimation
Now we turn our attention to algebraic models that can only be represented implicitly through an equation of the form,

\varphi(y_i, x_i; k) = 0 ;  i=1,2,...,N        (2.34)

In implicit estimation, rather than minimizing a weighted sum of squares of the residuals in the response variables, we minimize a suitable implicit function of the measured variables dictated by the model equations. Namely, if we substitute the actual measured variables in Equation 2.8, an error term arises always, even if the mathematical model is exact,

\varphi(\hat{y}_i, \hat{x}_i; k) = \varepsilon_i ;  i=1,2,...,N        (2.35)

The residual is not equal to zero because of the experimental error in the measured variables, i.e.,

\hat{y}_i = y_i + \varepsilon_{y,i} ;  i=1,2,...,N        (2.36)
Even if we make the stringent assumption that the errors in the measurement of each variable (ε_{y,ij}, i=1,2,...,N, j=1,2,...,R) are independently and identically distributed (i.i.d.) normally with zero mean and constant variance, it is rather difficult to establish the exact distribution of the error term ε_i in Equation 2.35. This is particularly true when the expression is highly nonlinear. For example, this situation arises in the estimation of parameters for nonlinear thermodynamic models and in the treatment of potentiometric titration data (Sutton and MacGregor, 1977; Sachs, 1976; Englezos et al., 1990a, 1990b).
If we assume that the residuals in Equation 2.35 (ε_i) are normally distributed, their covariance matrix (Σ_i) can be related to the covariance matrix of the measured variables (COV(ε_{y,i}) = Σ_{y,i}) through the error propagation law. Hence, if for example we consider the case of independent measurements with a constant variance, i.e.,

\Sigma_{y,i} = \Sigma_y = \sigma_y^2 I        (2.37)

the diagonal elements of the covariance matrix (Σ_i) can be obtained from,

\sigma_{\varepsilon_{ij}}^2 = \sigma_y^2 \sum_{h=1}^{R}\left(\frac{\partial\varphi_j}{\partial y_h}\right)^2        (2.38)

where the partial derivatives are evaluated at the conditions of the ith experiment. As a result, the elements of the covariance matrix (Σ_i) change from experiment to experiment even if we assume that the variance of the measurement error is constant.
Similarly, we compute the off-diagonal terms of Σ_i from,

cov(\varepsilon_{ih}, \varepsilon_{ij}) = \sigma_y^2 \sum_{l=1}^{R}\left(\frac{\partial\varphi_h}{\partial y_l}\right)\left(\frac{\partial\varphi_j}{\partial y_l}\right) ;  i=1,2,...,N, h=1,2,...,R, j=1,2,...,R        (2.39)
Having an estimate (through the error propagation law) of the covariance matrix Σ_i, we can obtain the ML parameter estimates by minimizing the objective function,

S_{ML}(k) = \sum_{i=1}^{N} \varphi(\hat{y}_i; k)^T \Sigma_i^{-1} \varphi(\hat{y}_i; k)        (2.40)
The above implicit formulation of maximum likelihood estimation is valid
only under the assumption that the residuals are normally distributed and the
model is adequate. From our own experience we have found that implicit estima-
tion provides the easiest and computationally the most efficient solution to many
parameter estimation problems.
Furthermore, as a first approximation one can use implicit least squares estimation to obtain very good estimates of the parameters (Englezos et al., 1990). Namely, the parameters are obtained by minimizing the following Implicit Least Squares (ILS) objective function,

S_{ILS}(k) = \sum_{i=1}^{N} \varphi(\hat{y}_i; k)^T \varphi(\hat{y}_i; k)        (2.41)

If the assumption of normality is grossly violated, ML estimates of the parameters can only be obtained using the error-in-variables method where, besides the parameters, we also estimate the true (error-free) values of the measured variables. In particular, assuming that Σ_{y,i} is known, the parameters are obtained by minimizing the following objective function

S_{ML}(k, y_1, ..., y_N) = \sum_{i=1}^{N} (\hat{y}_i - y_i)^T \Sigma_{y,i}^{-1} (\hat{y}_i - y_i)        (2.42)

subject to

\varphi(y_i; k) = 0 ;  i=1,2,...,N        (2.43)
The method has been analyzed in detail by several researchers, for example, Schwetlick and Tiller (1985), Seber and Wild (1989), Reilly and Patino-Leal (1981), Patino-Leal and Reilly (1982) and Duever et al. (1987) among others.
2.3 PARAMETER ESTIMATION SUBJECT TO CONSTRAINTS
In parameter estimation we are occasionally faced with an additional complication. Besides the minimization of the objective function (a weighted sum of squares of errors), the mathematical model of the physical process includes a set of constraints that must also be satisfied. In general these are either equality or inequality constraints. In order to avoid unnecessary complications in the presentation of the material, constrained parameter estimation is presented exclusively in Chapter 9.
3
Computation of Parameters in Linear
Models - Linear Regression
Linear models with respect to the parameters represent the simplest case of parameter estimation from a computational point of view because there is no need for iterative computations. Unfortunately, the majority of process models encountered in chemical engineering practice are nonlinear. Linear regression has received considerable attention due to its significance as a tool in a variety of disciplines. Hence, there is a plethora of books on the subject (e.g., Draper and Smith, 1998; Freund and Minton, 1979; Hocking, 1996; Montgomery and Peck, 1992; Seber, 1977). The majority of these books have been written by statisticians.
The objectives in this chapter are two. The first one is to briefly review the essentials of linear regression and to present them in a form that is consistent with our notation and approach followed in subsequent chapters addressing nonlinear regression problems. The second objective is to show that a large number of linear regression problems can now be handled with readily available software such as Microsoft Excel™ and SigmaPlot™.
3.1 THE LINEAR REGRESSION MODEL
As we have already mentioned in Chapter 2, assuming linearity with respect to the unknown parameters, the general algebraic model can be reduced to the following form

\hat{y}_i = F(x_i) k + \varepsilon_i ;  i=1,2,...,N        (3.1)

which can be readily handled by linear least squares estimation. In this model, k is the p-dimensional parameter vector, x is the n-dimensional vector of independent variables (regressor variables), y is the m-dimensional vector of dependent variables (response variables) and F(x_i) is an m×p dimensional matrix which depends only on x_i and is independent of the parameters. This matrix is also the sensitivity coefficients matrix that plays a very important role in nonlinear regression. Quite often the elements of matrix F are simply the independent variables themselves, and hence, the model can be further reduced to the well known linear regression model,

\hat{y}_i = X_i k + \varepsilon_i ;  i=1,2,...,N        (3.2)

where N is the number of measurements. The above model is very general and it covers the following cases:
(i) The simple linear regression model, which has a single response variable, a single independent variable and two unknown parameters,

\hat{y}_i = k_1 x_i + k_2 + \varepsilon_i        (3.3a)

or in matrix notation

\hat{y}_i = [x_i \; 1]\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} + \varepsilon_i        (3.3b)

(ii) The multiple linear regression model, which has a single response variable, n independent variables and p (=n+1) unknown parameters,

\hat{y}_i = k_1 x_{1i} + k_2 x_{2i} + ... + k_n x_{ni} + k_{n+1} + \varepsilon_i        (3.4a)

or in matrix notation

\hat{y}_i = [x_{1i} \; x_{2i} \; ... \; x_{ni} \; 1]\, k + \varepsilon_i        (3.4b)

or more compactly as

\hat{y}_i = x_i^T k + \varepsilon_i        (3.4c)

where x_i = [x_{1i}, x_{2i}, ..., x_{p-1,i}, 1]^T is the augmented p-dimensional vector of independent variables (p = n+1).
(iii) The multiresponse linear regression model, which has m response variables, (m×n) independent variables and p (=n+1) unknown parameters,

\hat{y}_{ji} = k_1 x_{j1,i} + k_2 x_{j2,i} + ... + k_n x_{jn,i} + k_{n+1} + \varepsilon_{ji} ;  j=1,...,m        (3.5a)

The above equation can be written more compactly as,

\hat{y}_i = X_i k + \varepsilon_i        (3.5b)

where the matrix X_i is defined as

X_i = \begin{bmatrix} x_{11,i} & x_{12,i} & \cdots & x_{1n,i} & 1 \\ \vdots & & & & \vdots \\ x_{m1,i} & x_{m2,i} & \cdots & x_{mn,i} & 1 \end{bmatrix}        (3.5c)

It should be noted that the above definition of X_i is different from the one often found in linear regression books. There, X is defined for the simple or multiple linear regression model and it contains all the measurements. In our case, index i explicitly denotes the ith measurement and we do not group our measurements. Matrix X_i represents the values of the independent variables from the ith experiment.
3.2 THE LINEAR LEAST SQUARES OBJECTIVE FUNCTION
Given N measurements of the response variables (output vector), the parameters are obtained by minimizing the Linear Least Squares (LS) objective function which is given below as the weighted sum of squares of the residuals, namely,

S_{LS}(k) = \sum_{i=1}^{N} e_i^T Q_i e_i        (3.6)

where e_i is the m-dimensional vector of residuals from the ith experiment. It is noted that the residuals e_i are obtained from Equation 3.2 using the estimated parameter values instead of their true values that yield the error terms ε_i. The LS objective function takes the form,

S_{LS}(k) = \sum_{i=1}^{N} [\hat{y}_i - X_i k]^T Q_i [\hat{y}_i - X_i k]        (3.7)

where Q_i is an m×m user-supplied weighting matrix. Depending on our choice of Q_i, we have the following cases:

Simple Linear Least Squares
In this case we minimize the sum of squares of errors (SSE) without any weighting factor, i.e., we use Q_i = I and Equation 3.6 reduces to

S_{LS}(k) = \sum_{i=1}^{N} e_i^T e_i

This choice of Q_i yields maximum likelihood estimates of the parameters if the error terms in each response variable and for each experiment (ε_{ij}, i=1,...,N; j=1,...,m) are all identically and independently distributed (i.i.d.) normally with zero mean and variance σ_ε². Namely, E(ε_i) = 0 and COV(ε_i) = σ_ε² I where I is the m×m identity matrix.
Weighted Least Squares (WLS) Estimation
In this case we minimize a weighted sum of squares of residuals with constant weights, i.e., the user-supplied weighting matrix is kept the same for all experiments, Q_i = Q for all i=1,...,N, and Equation 3.7 reduces to

S_{LS}(k) = \sum_{i=1}^{N} [\hat{y}_i - X_i k]^T Q [\hat{y}_i - X_i k]        (3.10)

This choice of Q_i yields ML estimates of the parameters if the error terms in each response variable and for each experiment (ε_{ij}, i=1,...,N; j=1,...,m) are independently distributed normally with zero mean and constant variance. Namely, the variance of a particular response variable is constant from experiment to experiment; however, different response variables have different variances, i.e.,

COV(\varepsilon_i) = diag(\sigma_1^2, \sigma_2^2, ..., \sigma_m^2) ;  i=1,2,...,N        (3.11)

which can be written as

COV(\varepsilon_i) = \sigma^2 diag(v_1, v_2, ..., v_m) ;  i=1,2,...,N        (3.12)

where σ² is an unknown scaling factor and v_1, v_2,...,v_m are known constants. ML estimates are obtained if the constant weighting matrix Q has been chosen as

Q = diag(v_1^{-1}, v_2^{-1}, ..., v_m^{-1})        (3.13)

Generalized Least Squares (GLS) Estimation
In this case we minimize a weighted SSE with non-constant weights. The user-supplied weighting matrices differ from experiment to experiment. ML estimates of the parameters are obtained if we choose

Q_i = [COV(\varepsilon_i)]^{-1} ;  i=1,2,...,N        (3.14)
3.3 LINEAR LEAST SQUARES ESTIMATION
Let us consider first the most general case of the multiresponse linear regression model represented by Equation 3.2. Namely, we assume that we have N measurements of the m-dimensional output vector (response variables), \hat{y}_i, i=1,...,N.
The computation of the parameter estimates is accomplished by minimizing the least squares (LS) objective function, which is shown next

S_{LS}(k) = \sum_{i=1}^{N} [\hat{y}_i - X_i k]^T Q_i [\hat{y}_i - X_i k]        (3.8)

Use of the stationary criterion

\frac{\partial S_{LS}(k)}{\partial k} = 0        (3.15)

yields a linear equation of the form

A k^* = b        (3.16)

where the (p×p) dimensional matrix A is given by

A = \sum_{i=1}^{N} X_i^T Q_i X_i        (3.17a)

and the p-dimensional vector b is given by

b = \sum_{i=1}^{N} X_i^T Q_i \hat{y}_i        (3.17b)

Solution of the above linear equation yields the least squares estimates of the parameter vector, k*,

k^* = A^{-1} b        (3.18)

For the single response linear regression model (m=1), Equations 3.17a and 3.17b reduce to

A = \sum_{i=1}^{N} x_i x_i^T Q_i        (3.19a)

and

b = \sum_{i=1}^{N} x_i \hat{y}_i Q_i        (3.19b)

where Q_i is a scalar weighting factor and x_i is the augmented p-dimensional vector of independent variables [x_{1i}, x_{2i}, ..., x_{p-1,i}, 1]^T. The optimal parameter estimates are obtained from

k^* = A^{-1} b        (3.20)

In practice, the solution of Equation 3.16 for the estimation of the parameters is not done by computing the inverse of matrix A. Instead, any good linear equation solver should be employed. Our preference is to perform first an eigenvalue decomposition of the real symmetric matrix A which provides significant additional information about potential ill-conditioning of the parameter estimation problem (see Chapter 8).
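As an illustration of this preference, a minimal Python/NumPy sketch (the function name is ours, not from the text) that forms A and b from Equations 3.17a and 3.17b and solves Equation 3.16 through an eigenvalue decomposition of the real symmetric matrix A:

import numpy as np

# Sketch: linear LS estimate via Equations 3.16-3.17, solved through an
# eigenvalue decomposition of A. X is a list of the N matrices X_i, Q a
# list of the weighting matrices Q_i, y_hat a list of measured vectors.
def linear_ls_estimate(X, Q, y_hat):
    p = X[0].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for Xi, Qi, yi in zip(X, Q, y_hat):
        A += Xi.T @ Qi @ Xi          # Equation 3.17a
        b += Xi.T @ Qi @ yi          # Equation 3.17b
    lam, V = np.linalg.eigh(A)       # eigenvalues flag ill-conditioning
    print("condition number:", lam.max() / lam.min())
    return V @ ((V.T @ b) / lam)     # k* = V diag(1/lambda) V^T b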
3.4 POLYNOMIAL CURVE FITTING
In engineering practice we are often faced with the task of fitting a low-order polynomial curve to a set of data. Namely, given a set of N paired data, (y_i, x_i), i=1,...,N, we are interested in the following cases:
(i) Fitting a straight line to the data,

y = k_1 x + k_2        (3.21)

This problem corresponds to the simple linear regression model (m=1, n=1, p=2). Taking Q_i = 1 (all data points are weighed equally), Equations 3.19a and 3.19b become

A = \begin{bmatrix} \sum_{i=1}^{N} x_i^2 & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & N \end{bmatrix}        (3.22a)

and

b = \begin{bmatrix} \sum_{i=1}^{N} x_i y_i \\ \sum_{i=1}^{N} y_i \end{bmatrix}        (3.22b)

The parameter estimates are now obtained from

\begin{bmatrix} k_1^* \\ k_2^* \end{bmatrix} = A^{-1} b        (3.23)
(ii) Fitting a quadratic polynomial to the data,

y = k_1 x^2 + k_2 x + k_3        (3.24)

This problem corresponds to the multiple linear regression model with m=1, n=2 and p=3. In this case we take x_1 = x^2, x_2 = x and with Q_i = 1 (all data points are weighed equally) Equations 3.19a and 3.19b become

A = \begin{bmatrix} \sum x_i^4 & \sum x_i^3 & \sum x_i^2 \\ \sum x_i^3 & \sum x_i^2 & \sum x_i \\ \sum x_i^2 & \sum x_i & N \end{bmatrix}        (3.25a)

and

b = \begin{bmatrix} \sum x_i^2 y_i \\ \sum x_i y_i \\ \sum y_i \end{bmatrix}        (3.25b)

where all summations run from i=1 to N. The parameter estimates are now obtained from

\begin{bmatrix} k_1^* \\ k_2^* \\ k_3^* \end{bmatrix} = A^{-1} b        (3.26)
(iii) Fitting a cubic polynomial to the data,

y = k_1 x^3 + k_2 x^2 + k_3 x + k_4        (3.27)

This problem corresponds to the multiple linear regression model with m=1, n=3 and p=4. In this case we take x_1 = x^3, x_2 = x^2, x_3 = x and with Q_i = 1 (all data points are weighed equally) Equations 3.19a and 3.19b become

A = \begin{bmatrix} \sum x_i^6 & \sum x_i^5 & \sum x_i^4 & \sum x_i^3 \\ \sum x_i^5 & \sum x_i^4 & \sum x_i^3 & \sum x_i^2 \\ \sum x_i^4 & \sum x_i^3 & \sum x_i^2 & \sum x_i \\ \sum x_i^3 & \sum x_i^2 & \sum x_i & N \end{bmatrix}        (3.28a)

and

b = \begin{bmatrix} \sum x_i^3 y_i \\ \sum x_i^2 y_i \\ \sum x_i y_i \\ \sum y_i \end{bmatrix}        (3.28b)

where all summations run from i=1 to N. The parameter estimates are now obtained from

\begin{bmatrix} k_1^* \\ k_2^* \\ k_3^* \\ k_4^* \end{bmatrix} = A^{-1} b        (3.29)
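For instance, the quadratic case above can be coded directly as a short sketch (Python/NumPy; the function name and the toy data in the usage line are ours, for illustration only):

import numpy as np

# Sketch: fitting the quadratic of Equation 3.24 by forming the normal
# equations (3.25a, 3.25b) explicitly, with Q_i = 1 for every point.
def fit_quadratic(x, y):
    x = np.asarray(x, float); y = np.asarray(y, float)
    F = np.column_stack([x**2, x, np.ones_like(x)])  # rows are [x_i^2, x_i, 1]
    A = F.T @ F                                      # matrix of sums in (3.25a)
    b = F.T @ y                                      # vector of sums in (3.25b)
    return np.linalg.solve(A, b)                     # k* = [k1, k2, k3]

# Example usage: k = fit_quadratic([0, 1, 2, 3], [1.1, 2.0, 5.2, 9.8])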
3.5 STATISTICAL INFERENCES
Once we have estimated the unknown parameter values in a linear regres-
sion model and the underlying assumptions appear to be reasonable, we can pro-
ceed and make statistical inferences about the parameter estimates and the re-
sponse variables.
3.5.1 Inference on the Parameters
The least squares estimator has several desirable properties. Namely, the parameter estimates are normally distributed, unbiased (i.e., E(k*) = k) and their covariance matrix is given by

COV(k^*) = \hat{\sigma}_\varepsilon^2 A^{-1}        (3.30)

where matrix A is given by Equation 3.17a or 3.19a. An estimate, \hat{\sigma}_\varepsilon^2, of the variance \sigma_\varepsilon^2 is given by

\hat{\sigma}_\varepsilon^2 = \frac{S_{LS}(k^*)}{(d.f.)}        (3.31)

where (d.f.) = (Nm − p) are the degrees of freedom, namely the total number of measurements minus the number of unknown parameters.
Given all the above, it can be shown that the (1−α)100% joint confidence region for the parameter vector k is an ellipsoid given by the equation:

[k - k^*]^T A [k - k^*] = p\, \hat{\sigma}_\varepsilon^2\, F^\alpha_{p,\,Nm-p}        (3.32a)

or

[k - k^*]^T A [k - k^*] = \frac{p\, S_{LS}(k^*)}{Nm - p}\, F^\alpha_{p,\,Nm-p}        (3.32b)

where α is the selected probability level in Fisher's F-distribution and F^α_{p,Nm−p} is obtained from the F-distribution tables with ν_1 = p and ν_2 = (Nm − p) degrees of freedom.
The corresponding (1−α)100% marginal confidence interval for each parameter, k_i, i=1,2,...,p, is

k_i^* - t^{\alpha/2}_\nu \hat{\sigma}_{k_i} \le k_i \le k_i^* + t^{\alpha/2}_\nu \hat{\sigma}_{k_i}        (3.33)

where t^{α/2}_ν is obtained from the tables of Student's T-distribution with ν = (Nm − p) degrees of freedom.
The standard error of parameter k_i, \hat{\sigma}_{k_i}, is obtained as the square root of the corresponding diagonal element of the inverse of matrix A multiplied by \hat{\sigma}_\varepsilon, i.e.,

\hat{\sigma}_{k_i} = \hat{\sigma}_\varepsilon \sqrt{(A^{-1})_{ii}}        (3.34)

Practically, for ν ≥ 30 we can use the approximation t^{α/2}_ν ≈ z_{α/2} where z_{α/2} is obtained from the tables of the standard normal distribution. That is why, when the degrees of freedom are high, the 95% confidence intervals are simply taken as twice the standard error (recall that z_{0.025} = 1.96 and t^{0.025}_{30} = 2.042).
3.5.2 Inference on the Expected Response Variables
A valuable inference that can be made to assess the quality of the model predictions is the (1−α)100% confidence interval of the predicted mean response at x_0. It should be noted that the predicted mean response of the linear regression model at x_0 is y_0 = F(x_0)k* or simply y_0 = X_0 k*. Although the error term is not included, there is some uncertainty in the predicted mean response due to the uncertainty in k*. Under the usual assumptions of normality and independence, the covariance matrix of the predicted mean response is given by

COV(y_0) = F(x_0)\, COV(k^*)\, F(x_0)^T        (3.35a)

or for the standard multiresponse linear regression model where F(x_0) = X_0,

COV(y_0) = X_0\, COV(k^*)\, X_0^T        (3.35b)

The covariance matrix COV(k*) is obtained by Equation 3.30. Let us now concentrate on the expected mean response of a particular response variable. The (1−α)100% confidence interval of y_{i0} (i=1,...,m), the ith element of the response vector y_0 at x_0, is given below

y_{i0} - t^{\alpha/2}_\nu \hat{\sigma}_{y_{i0}} \le E(y_{i0}) \le y_{i0} + t^{\alpha/2}_\nu \hat{\sigma}_{y_{i0}}        (3.36)

The standard error of y_{i0}, \hat{\sigma}_{y_{i0}}, is the square root of the ith diagonal element of COV(y_0), namely,

\hat{\sigma}_{y_{i0}} = \sqrt{\left[F(x_0)\, COV(k^*)\, F(x_0)^T\right]_{ii}}        (3.37a)

or for the standard multiresponse linear regression model,

\hat{\sigma}_{y_{i0}} = \sqrt{\left[X_0\, COV(k^*)\, X_0^T\right]_{ii}}        (3.37b)

For the single response y_0 in the case of simple or multiple linear regression (i.e., m=1), the (1−α)100% confidence interval of y_0 is,

y_0 - t^{\alpha/2}_\nu \hat{\sigma}_{y_0} \le \mu_{y_0} \le y_0 + t^{\alpha/2}_\nu \hat{\sigma}_{y_0}        (3.38a)

or equivalently

x_0^T k^* - t^{\alpha/2}_\nu \hat{\sigma}_{y_0} \le \mu_{y_0} \le x_0^T k^* + t^{\alpha/2}_\nu \hat{\sigma}_{y_0}        (3.38b)

where t^{α/2}_ν is obtained from the tables of Student's T-distribution with ν = (N−p) degrees of freedom and \hat{\sigma}_{y_0} is the standard error of prediction at x_0. This quantity usually appears in the standard output of many regression computer packages. It is computed by

\hat{\sigma}_{y_0} = \hat{\sigma}_\varepsilon \sqrt{x_0^T A^{-1} x_0}        (3.39)

In all the above cases we presented confidence intervals for the mean expected response rather than a future observation (future measurement) of the response variable, \hat{y}_0. In this case, besides the uncertainty in the estimated parameters, we must include the uncertainty due to the measurement error (ε_0).
The corresponding (1−α)100% confidence interval for the multiresponse linear model is

\hat{y}_{i0} - t^{\alpha/2}_\nu \hat{\sigma}_{\hat{y}_{i0}} \le \hat{y}_{i0}^{new} \le \hat{y}_{i0} + t^{\alpha/2}_\nu \hat{\sigma}_{\hat{y}_{i0}}        (3.40)

where the corresponding standard error of \hat{y}_{i0} is given by

\hat{\sigma}_{\hat{y}_{i0}} = \sqrt{\left[COV(\varepsilon_0) + X_0\, COV(k^*)\, X_0^T\right]_{ii}}        (3.41)

For the case of a single response model (i.e., m=1), the (1−α)100% confidence interval of \hat{y}_0 is,

\hat{y}_0 - t^{\alpha/2}_\nu \hat{\sigma}_{\hat{y}_0} \le \hat{y}_0^{new} \le \hat{y}_0 + t^{\alpha/2}_\nu \hat{\sigma}_{\hat{y}_0}        (3.42)

where the corresponding standard error of \hat{y}_0 is given by

\hat{\sigma}_{\hat{y}_0} = \hat{\sigma}_\varepsilon \sqrt{1 + x_0^T A^{-1} x_0}        (3.43)
3.6 SOLUTION OF MULTIPLE LINEAR REGRESSION PROBLEMS
Problems that can be described by a multiple linear regression model (i.e., they have a single response variable, m=1) can be readily solved by available software. We will demonstrate how such problems can be solved by using Microsoft Excel™ and SigmaPlot™.
3.6.1 Procedure for Using Microsoft Excel™ for Windows
Step 1. First the data are entered in columns. The single dependent variable is designated by y whereas the independent ones by x1, x2, x3 etc.
Step 2. Select cells below the entered data to form a rectangle [5 × p] where p is the number of parameters sought; e.g., in the equation y=k1x1+k2x2+k3 you are looking for k1, k2 and k3, therefore p would be equal to 3.
Note: Excel casts the p-parameter model in the following form: y=m1x1+m2x2+...+mp-1xp-1+b.
Step 3. Now that you have selected an area [5 × p] on the spreadsheet, go to the Paste Function button (fx) and click.
Step 4. Click on Statistical on the left scroll menu and click on LINEST on the right scroll menu; then hit OK. A box will now appear asking for the following:
Known_y's
Known_x's
Const
Stats
Step 5. Click in the text box for the known values of the single response variable y; then go to the Excel sheet and highlight the y values.
Step 6. Repeat Step 5 for the known values of the independent variables x1, x2, etc. by clicking on the box containing these values. This way the program lets you highlight the area that encloses all the x values (x1, x2, x3, etc.).
Step 7. Set the logical value Const=true if you wish to calculate a y-intercept value.
Step 8. Set the logical value Stats=true if you wish the program to return additional regression statistics.
Step 9. Now that you have entered all the data, do not hit the OK button but instead press Control-Shift-Enter. This command allows all the elements in the array to be displayed; if you hit OK you will only see one element in the array.
Once the above steps have been followed, the program returns the following information on the worksheet. The information is displayed in a [5 × p] table where p is the number of parameters.
1st row: parameter values
    mp-1  mp-2  ...  m2  m1  b    (i.e., kp-1  kp-2  ...  k2  k1  kp)
2nd row: standard errors for the estimated parameter values
3rd row: coefficient of determination and standard error of the y value
    R2  se_y
Note: The coefficient of determination ranges in value from 0 to 1. If it is equal to one then there is a perfect correlation in the sample. If on the other hand it has a value of zero then the model is not useful in calculating a y-value.
4th row: F statistic and the number of degrees of freedom (d.f.)
5th row: information about the regression
    ssreg (regression sum of squares)  ssresid (residual sum of squares)
The above procedure will be followed in the following three examples with two-, three- and four-parameter linear single response models.
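For readers working outside Excel, the same computation can be sketched in a few lines of Python/NumPy (the function name is ours; the commented usage line refers to the three-parameter fit of Example 1 below):

import numpy as np

# Sketch: a LINEST-like multiple linear regression using numpy.
# Returns the coefficients [k1, ..., kp-1, intercept], cf. Excel's m's and b.
def linest_like(y, *xcols):
    X = np.column_stack(list(xcols) + [np.ones(len(y))])  # append intercept
    k, ssresid, rank, sv = np.linalg.lstsq(X, y, rcond=None)
    return k

# Example 1 in this notation (T, P taken as the columns of Table 3.1):
# k = linest_like(np.log(P), T, T**(-2.0))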
Example 1
Table 3.1 gives a set of pressure versus temperature data of equilibrium CO2 hydrate formation conditions that were obtained from a series of experiments in a 20 wt % aqueous glycerol solution. The objective is to fit a function of the form

lnP = A + BT + CT^{-2}        (3.44a)

or

lnP = A + BT        (3.44b)
Solution of Example 1
Following our notation, Equations 3.44a and 3.44b are written in the following form

y = k_1 x_1 + k_2 x_2 + k_3        (3.45a)

y = k_1 x_1 + k_2        (3.45b)

where y=lnP, k_1=B, k_2=C, k_3=A, x_1=T and x_2=T^{-2} in Equation 3.45a, and y=lnP, k_1=B, k_2=A and x_1=T in Equation 3.45b.
Excel casts the models in the following forms

y = m_1 x_1 + m_2 x_2 + b        (3.46a)

y = m_1 x_1 + b        (3.46b)

where y=lnP, m_1=k_1=B, m_2=k_2=C, b=k_3=A, x_1=T and x_2=T^{-2} in Equation 3.46a, and y=lnP, m_1=k_1=B, b=k_2=A and x_1=T in Equation 3.46b.
Excel calculates the following parameters for the three-parameter model

    13087246      1.402919     -557.928
    2168476.8     0.211819       86.92737
    0.9984569     0.01456        #N/A
    1617.6016     5              #N/A
    0.685894      0.00106        #N/A

and the following parameters for the two-parameter model

    0.1246       -33.3
    0.00579        1.585
    0.98722        0.038
    463.323        6
    0.67817        0.009
In Table 3.1 the experimentally determined pressures are shown together with the ones calculated using the two models. As seen, the three-parameter model represents the data better than the two-parameter one.
Table 3.1 Incipient Equilibrium Data on CO2 Hydrate Formation in 20 (wt%) Aqueous Glycerol Solutions

Temperature    Experimental     Calculated Pressure with     Calculated Pressure with
(K)            Pressure (MPa)   three-parameter model (MPa)  two-parameter model (MPa)
270.4          1.502            1.513                        1.812
270.6          1.556            1.537                        1.858
272.3          1.776            1.804                        2.296
273.6          2.096            2.097                        2.700
274.1          2.281            2.236                        2.874
275.5          2.721            2.727                        3.421
276.2          3.001            3.041                        3.733
277.1          3.556            3.534                        4.176

Source: Brelland and Englezos (1996).
Example 2
In Table 3.2 a set of data that relate the pH with the charge on wood fibers is provided. They were obtained from potentiometric titrations in wood fiber suspensions.
We seek to establish a correlation of the form given by Equation 3.47 by fitting the equation to the charge (Q) versus pH data given in Table 3.2,

Q = C_1 + C_2(pH) + C_3(pH)^2 + C_4(pH)^3        (3.47)

Solution of Example 2
Following our notation, Equation 3.47 is written in the following form

y = k_1 x_1 + k_2 x_2 + k_3 x_3 + b        (3.48a)

where y=Q, k_1=C_2, k_2=C_3, k_3=C_4, b=C_1, x_1=pH, x_2=(pH)^2 and x_3=(pH)^3.
Table 3.2 Charge on Wood Fibers

pH         Q (measured)    Q (calculated)
2.8535     19.0            16.9
3.2003     32.6            34.5
3.6347     52.8            54.0
4.0910     71.6            71.4
4.5283     86.3            86.2
5.0390     100.7           99.6
5.6107     114.0           115.4
6.3183     127.2           130.7
7.0748     138.4           138.4
7.7353     147.0           144.1
8.2385     153.3           151.6
8.8961     162.1           159.9
9.5342     172.2           171.9
10.0733    181.9           183.9
10.4700    190.5           193.1
10.9921    200.7           203.9

Source: Bygrave (1997).
Excel casts the model in the following form

y = m_1 x_1 + m_2 x_2 + m_3 x_3 + b        (3.48b)

where y=Q, m_1=k_1=C_2, m_2=k_2=C_3, m_3=k_3=C_4, x_1=pH, x_2=(pH)^2 and x_3=(pH)^3.
The program returns the following results

    0.540448    -12.793     113.4188    -215.213
    0.047329      0.9852      6.380755    12.61603
    0.998721      2.27446     #N/A        #N/A
    3122.612     12           #N/A        #N/A
    48461.41     62.07804     #N/A        #N/A
In Table 3.2 the charge values calculated by the model are also shown.
Example 3
Table 3.3 gives another set of equilibrium CO2 hydrate formation pressure versus temperature data that were obtained from a series of experiments in pure water. Again, the objective is to fit a function of the form of Equation 3.44a.
Table 3.3 Incipient Equilibrium Data on CO2 Hydrate Formation in Pure Water

Temperature    Experimental      Calculated
(K)            Pressure (MPa)    Pressure (MPa)
275.1          1.542             1.486
275.5          1.651             1.562
276.8          1.936             1.863
277.1          1.954             1.939
277.7          2.126             2.100
279.2          2.601             2.564
280.2          3.021             2.929
281.6          3.488             3.529
282.7          4.155             4.084

Source: Brelland and Englezos (1996), Englezos and Hull (1994).
Following the notation of Example 1 we have the following results from Excel. The calculated values are also shown in Table 3.3.

    0.097679    0.133029    -36.199
    0.193309    0.009196      2.600791
    0.998128    0.017101      #N/A
    1599.801    6             #N/A
    0.935707    0.001755      #N/A
3.6.2 Procedure for Using SigmaPlot™ for Windows
The use of this program is illustrated by solving Examples 1 and 2.
Step 1. A file called EX-1.FIT that contains the data is created. The data are given in a table form with column 1 for temperature (T), column 2 for pressure (P) and column 3 for lnP.
Step 2. Open the application SigmaPlot V2.0 for Windows. A window at the menu bar (Menu → Math → Curve Fit) is edited as follows:

[Parameters]
A=1
B=1
C=1
[Variables]
T=col(1)
lnP=col(3)
[Equations]
f=A+B*T+C/T**2
fit f to lnP

The program returns the following results
Parameter    Value        Error       CV(%)       Dependencies
A            -5.578e+2    8.693e+1    1.558e+1    1.0000000
B            1.403e+0     2.118e-1    1.510e+1    1.0000000
C            1.308e+7     2.168e+6    1.657e+1    1.0000000
Step 3. Use the estimated parameters to calculate formation pressures and input the results to file EX-1.XFM.
Step 4. A window at the menu bar (Menu → Math → Transforms) is edited as follows:

T=data(270.3,277.1,0.1)
col(5)=T
A=-557.8
B=1.403
C=1.308E+7
lnP=A+B*T+C*(T^(-2))
col(7)=lnP
col(6)=exp(lnP)
Step 5. A file to generate a graph showing the experimental data and the calculated values is then created (File name: EX-1.SPW).
Figure 3.1 Experimental and calculated hydrate formation pressures.
Example 2
Step 1. A file called EX-2.FIT that contains the data given in Table 3.2 is created. The data are given in a table form with column 1 for pH and column 2 for charge (Q).
Step 2. Open the application SigmaPlot V2.0 for Windows. A window at the menu bar (Menu → Math → Curve Fit) is edited as follows:

[Parameters]
C1=1
C2=1
C3=1
C4=1
[Variables]
pH=col(1)
Q=col(2)
[Equations]
f=C1+C2*pH+C3*(pH^2)+C4*(pH^3)
fit f to Q

The program returns the following results
Parameter    Value     Standard Error    CV(%)    Dependencies
C1           -215.2    12.6              5.862    0.9979686
C2           113.4     6.4               5.626    0.9998497
C3           -12.8     0.98              7.701    0.9999197
C4           0.54      0.05              8.757    0.9996247
Step 3. Use the estimated parameters to calculate the charge and input the results to file EX-2.XFM.
Step 4. A window at the menu bar (Menu → Math → Transforms) is edited as follows:

pH=data(2.8,11.0,0.1)
col(5)=pH
C1=-215.2
C2=113.4
C3=-12.8
C4=0.54
Q=C1+C2*pH+C3*(pH^2)+C4*(pH^3)
col(6)=Q
Step 5. A file to generate a graph showing the experimental data and the calculated values is then created (File name: EX-2.SPW).
Figure 3.2 Experimental and calculated charge. The line represents calculated values.
3.7 SOLUTION OF MULTIRESPONSE LINEAR REGRESSION PROBLEMS
These problems refer to models that have more than one (m > 1) response variables, (m×n) independent variables and p (=n+1) unknown parameters. These problems cannot be solved with the readily available software that was used in the previous three examples. They can, however, be solved by using Equation 3.18. We often use our nonlinear parameter estimation computer program. Obviously, since it is a linear estimation problem, convergence occurs in one iteration.
3.8 PROBLEMS ON LINEAR REGRESSION
3.8.1 Vapor Pressure Data for Pyridine and Piperidine
Blanco et al. (1994) reported measurements of the vapor pressure (Psat) for p-xylene, γ-picoline, piperidine, pyridine and tetralin. The data for piperidine and pyridine are given in Table 3.4. A suitable equation to correlate these data is Antoine's relationship given next

log(P^{sat}) = A - \frac{B}{t + C}        (3.49)

where Psat is the vapor pressure in mmHg and t is the temperature in °C. Determine the values of the parameters A, B and C in Antoine's equation for piperidine and pyridine.
Table 3.4 Vapor Pressure Data for Piperidine and Pyridine

Temperature    Psat (Pyridine)    Temperature    Psat (Piperidine)
(K)            (kPa)              (K)            (kPa)
345.45         23.89              339.35         26.44
347.20         25.54              339.70         26.66
348.20         26.66              344.50         32.17
350.25         28.03              349.60         39.57
351.65         30.09              352.50         42.84
353.55         32.35              354.90         46.61
355.05         34.14              357.50         50.75
356.75         36.27              359.05         53.55
359.40         39.87              360.55         56.22
362.50         44.26

Source: Blanco et al. (1994).
3.8.2 Vapor Pressure Data for R142b and R152a
Silva and Weber (1993) reported vapor pressure measurements for the 1-chloro-1,1-difluoroethane (R142b) and 1,1-difluoroethane (R152a) refrigerants. The data are given in Tables 3.5 and 3.6 respectively. Use Antoine's equation to correlate the data for R142b and the following equation for R152a (Silva and Weber, 1993)

log\left(\frac{P^{sat}}{P_c}\right) = \frac{1}{T_r}\left(k_1\tau + k_2\tau^{1.5} + k_3\tau^{2.5} + k_4\tau^5\right)        (3.50)

where

\tau = 1 - T_r ,   T_r = T/T_c        (3.51)

and where Pc = 4514.73 kPa and Tc = 386.41 K are the critical pressure and temperature (for R152a) respectively. You are asked to estimate Antoine's parameters A, B and C for R142b and parameters k_1, k_2, k_3 and k_4 for R152a by using the given data.
Table 3.5 Vapor Pressure Data for R142b

Temperature (K)    Psat (kPa)    Temperature (K)    Psat (kPa)
251.993            60.525        280.776            190.553
253.837            65.744        282.535            202.653
255.544            70.871        284.688            218.216

Source: Silva and Weber (1993).
Table 3.6 Vapor Pressure Data for R152a

Temperature (K)    Psat (kPa)    Temperature (K)    Psat (kPa)
219.921            22.723        247.489            94.123
223.082            27.322        249.262            101.864
224.670            29.885        250.126            105.788
225.959            32.125        250.770            108.836
227.850            35.645        251.534            112.496
229.278            38.501        253.219            —
230.924            42.018        254.768            —
232.567            45.792        256.519            138.939
234.081            49.497        258.110            148.355
234.902            51.588        259.745            158.511
235.540            53.303        261.310            168.744
236.059            54.700        262.679            178.124
236.496            55.911        264.757            193.084
237.938            60.052        266.603            207.226
239.528            64.912        268.206            220.076
241.295            70.666        268.321            221.084
242.822            75.955        269.918            234.549
244.149            80.807        271.573            249.218
245.844            87.351        273.139            263.726

Source: Silva and Weber (1993).
4
Gauss-Newton Method for Algebraic Models

As seen in Chapter 2, a suitable measure of the discrepancy between a model and a set of data is the objective function, S(k), and hence, the parameter values are obtained by minimizing this function. Therefore, the estimation of the parameters can be viewed as an optimization problem whereby any of the available general purpose optimization methods can be utilized. In particular, it was found that the Gauss-Newton method is the most efficient method for estimating parameters in nonlinear models (Bard, 1970). As we strongly believe that this is indeed the best method to use for nonlinear regression problems, the Gauss-Newton method is presented in detail in this chapter. It is assumed that the parameters are free to take any values.
4.1 FORMULATION OF THE PROBLEM
In this chapter we are focusing on a particular technique, the Gauss-Newton method, for the estimation of the unknown parameters that appear in a model described by a set of algebraic equations. Namely, it is assumed that both the structure of the mathematical model and the objective function to be minimized are known. In mathematical terms, we are given the model

y = f(x, k)        (4.1)

where k = [k_1, k_2, ..., k_p]^T is a p-dimensional vector of parameters whose numerical values are unknown, x = [x_1, x_2, ..., x_n]^T is an n-dimensional vector of independent variables (which are often set by the experimentalist and their numerical values are either known precisely or have been measured), f is an m-dimensional vector function of known form (the algebraic equations) and y = [y_1, y_2, ..., y_m]^T is the m-dimensional vector of dependent variables which are measured experimentally (output vector).
Furthermore, we are given a set of experimental data, [\hat{y}_i, x_i], i=1,...,N, that we need to match to the values calculated by the model in some optimal fashion. Based on the statistical properties of the experimental error involved in the measurement of the output vector y (and possibly in the measurement of some of the independent variables x) we generate the objective function to be minimized as discussed in detail in Chapter 2. In most cases the objective function can be written as

S(k) = \sum_{i=1}^{N} e_i^T Q_i e_i        (4.2a)

where e_i = [\hat{y}_i - f(x_i, k)] are the residuals and the weighting matrices Q_i, i=1,...,N, are chosen as described in Chapter 2. Equation 4.2a can also be written as

S(k) = \sum_{i=1}^{N} [\hat{y}_i - f(x_i, k)]^T Q_i [\hat{y}_i - f(x_i, k)]        (4.2b)

Finally, the above equation can also be written as follows

S(k) = \sum_{i=1}^{N}\sum_{j=1}^{m}\sum_{l=1}^{m} Q_{jl,i}\, e_{j,i}\, e_{l,i}        (4.2c)

Minimization of S(k) can be accomplished by using almost any technique available from optimization theory. Next we shall present the Gauss-Newton method as we have found it to be overall the best one (Bard, 1970).
4.2 THE GAUSS-NEWTON METHOD
Let us assume that an estimate k^(j) is available at the jth iteration. We shall try to obtain a better estimate, k^(j+1). Linearization of the model equations around k^(j) yields,

f(x_i, k^{(j+1)}) = f(x_i, k^{(j)}) + \left(\frac{\partial f^T}{\partial k}\right)^T_{i} \Delta k^{(j+1)} + H.O.T. ;  i=1,...,N        (4.3)

Neglecting all higher order terms (H.O.T.), the model output at k^(j+1) can be approximated by

y(x_i, k^{(j+1)}) = f(x_i, k^{(j)}) + G_i \Delta k^{(j+1)} ;  i=1,...,N        (4.4)

where G_i is the (m×p) sensitivity matrix (∂f^T/∂k)^T = (∇f)^T evaluated at x_i and k^(j). It is noted that G is also the Jacobian matrix of the vector function f(x,k). Substitution of y(x_i, k^(j+1)), as approximated by Equation 4.4, into the LS objective function and use of the critical point criterion

\frac{\partial S(k^{(j+1)})}{\partial \Delta k^{(j+1)}} = 0        (4.5)

yields a linear equation of the form

A \Delta k^{(j+1)} = b        (4.6)

where

A = \sum_{i=1}^{N} G_i^T Q_i G_i        (4.7)

and

b = \sum_{i=1}^{N} G_i^T Q_i [\hat{y}_i - f(x_i, k^{(j)})]        (4.8)

Solution of the above equation using any standard linear equation solver yields Δk^(j+1). The next estimate of the parameter vector, k^(j+1), is obtained as

k^{(j+1)} = k^{(j)} + \mu \Delta k^{(j+1)}        (4.9)

where a stepping parameter, μ (0 < μ ≤ 1), has been introduced to avoid the problem of overstepping. There are several techniques to arrive at an optimal value for μ; however, the simplest and most widely used is the bisection rule described below.
4.2.1 Bisection Rule
The bisection rule constitutes the simplest and most robust way available to determine an acceptable value for the stepping parameter μ. Normally, one starts with μ=1 and keeps on halving μ until the objective function becomes less than that obtained in the previous iteration (Hartley, 1961). Namely, we "accept" the first value of μ that satisfies the inequality

S(k^{(j)} + \mu \Delta k^{(j+1)}) < S(k^{(j)})        (4.10)

More elaborate techniques have been published in the literature to obtain optimal or near optimal stepping parameter values. Essentially one performs a univariate search to determine the minimum value of the objective function along the direction (Δk^(j+1)) chosen by the Gauss-Newton method.

4.2.2 Convergence Criteria
A typical test for convergence is ||Δk^(j+1)|| ≤ TOL where TOL is a user-specified tolerance. This test is suitable only when the unknown parameters are of the same order of magnitude. A more general convergence criterion is

\max_{i=1,...,p}\left|\frac{\Delta k_i^{(j+1)}}{k_i^{(j+1)}}\right| \le 10^{-NSIG}        (4.11)

where p is the number of parameters and NSIG is the number of significant digits desired in the parameter estimates. Although this is not guaranteed, the above convergence criterion yields consistent results, assuming of course that no parameter converges to zero!
Algorithm - Implementation Steps:
1. Input the initial guess for the parameters, k^(0), and NSIG.
2. For j=0,1,2,..., repeat the following.
3. Compute y(x_i, k^(j)) and G_i for each i=1,...,N, and set up matrix A and vector b.
4. Solve the linear equation AΔk^(j+1) = b and obtain Δk^(j+1).
5. Determine μ using the bisection rule and obtain k^(j+1) = k^(j) + μΔk^(j+1).
6. Continue until the maximum number of iterations is reached or convergence is achieved.
7. Compute statistical properties of the parameter estimates (see Chapter 11).
In summary, at each iteration of the estimation method we compute the model output, y(x_i, k^(j)), and the sensitivity coefficients, G_i, for each data point i=1,...,N, which are used to set up matrix A and vector b. Subsequent solution of the linear equation yields Δk^(j+1) and hence k^(j+1) is obtained.
The converged parameter values represent the Least Squares (LS), Weighted LS or Generalized LS estimates depending on the choice of the weighting matrices Q_i. Furthermore, if certain assumptions regarding the statistical distribution of the residuals hold, these parameter values could also be the Maximum Likelihood (ML) estimates.
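To make the above steps concrete, here is a minimal sketch in Python/NumPy of the iteration just described, under the simplifying assumption Q_i = I; the names model and sens (the user-supplied model response and sensitivity matrix) are ours, not from the text:

import numpy as np

# Sketch of the Gauss-Newton iteration with the bisection rule (Q_i = I).
# model(x, k) returns f(x,k) as an (m,) array; sens(x, k) returns G (m x p).
def gauss_newton(model, sens, x_data, y_data, k0, nsig=5, max_iter=50):
    k = np.asarray(k0, float)
    def S(kk):   # LS objective, Equation 4.2a with Q_i = I
        return sum(float(e @ e) for e in
                   (y - model(x, kk) for x, y in zip(x_data, y_data)))
    for _ in range(max_iter):
        p = len(k)
        A = np.zeros((p, p)); b = np.zeros(p)
        for x, y in zip(x_data, y_data):
            G = sens(x, k)
            e = y - model(x, k)
            A += G.T @ G                   # Equation 4.7
            b += G.T @ e                   # Equation 4.8
        dk = np.linalg.solve(A, b)         # Equation 4.6
        S_old, mu = S(k), 1.0
        while S(k + mu * dk) >= S_old and mu > 1e-10:
            mu *= 0.5                      # bisection rule, Equation 4.10
        k = k + mu * dk                    # Equation 4.9
        if np.max(np.abs(dk) / np.maximum(np.abs(k), 1e-30)) < 10.0**(-nsig):
            break                          # convergence test, Equation 4.11
    return k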
4.2.3 Formulation of the Solution Steps for the Gauss-Newton Method: Two Consecutive Chemical Reactions
Let us consider a batch reactor where the following consecutive reactions take place (Smith, 1981)

A --k1--> B --k2--> D        (4.12)

Taking into account the concentration invariant C_A + C_B + C_D = C_{A0}, i.e. that there is no change in the total number of moles, the integrated forms of the isothermal rate equations are

C_A = C_{A0}\, exp(-k_1 t)        (4.13a)

C_B = C_{A0}\, \frac{k_1}{k_2 - k_1}\left[exp(-k_1 t) - exp(-k_2 t)\right]        (4.13b)

where C_A, C_B and C_D are the concentrations of A, B and D respectively, t is the reaction time, and k_1, k_2 are the unknown rate constants. During a typical experiment, the concentrations of A and B are only measured as a function of time. Namely, a typical dataset is of the form [t_i, C_{A,i}, C_{B,i}], i=1,...,N.
The variables, the parameters and the governing equations for this problem can be rewritten in our standard notation as follows:
Parameter vector: k = [k_1, k_2]^T
Vector of independent variables: x = [x_1] where x_1 = t
Output vector (dependent variables): y = [y_1, y_2]^T where y_1 = C_A, y_2 = C_B
Model equations: f = [f_1, f_2]^T
where

f_1 = C_{A0}\, exp(-k_1 x_1)        (4.14a)

f_2 = C_{A0}\, \frac{k_1}{k_2 - k_1}\left[exp(-k_1 x_1) - exp(-k_2 x_1)\right]        (4.14b)

The elements of the (2×2) sensitivity coefficient matrix G are obtained as follows:

G_{11} = \left(\frac{\partial f_1}{\partial k_1}\right) = -C_{A0}\, x_1\, exp(-k_1 x_1)        (4.15a)

G_{12} = \left(\frac{\partial f_1}{\partial k_2}\right) = 0        (4.15b)

with the elements G_{21} and G_{22} obtained similarly by differentiating f_2 with respect to k_1 and k_2.
Equations 4.14 and 4.15 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
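In the form expected by the gauss_newton sketch given after Section 4.2.2, this example could be coded as follows (a sketch; the value of C_{A0} and the analytic expressions for G_{21} and G_{22}, derived by us from Equation 4.14b, are assumptions for illustration):

import numpy as np

CA0 = 1.0   # assumed initial concentration, for illustration only

def model(x, k):
    # Equations 4.14a and 4.14b
    t = x[0]; k1, k2 = k
    CA = CA0 * np.exp(-k1 * t)
    CB = CA0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    return np.array([CA, CB])

def sens(x, k):
    # Sensitivity matrix G of Equations 4.15 (rows: f1, f2; cols: k1, k2)
    t = x[0]; k1, k2 = k
    d = k2 - k1
    e1, e2 = np.exp(-k1 * t), np.exp(-k2 * t)
    G = np.zeros((2, 2))
    G[0, 0] = -CA0 * t * e1                                      # Eq. 4.15a
    G[0, 1] = 0.0                                                # Eq. 4.15b
    G[1, 0] = CA0 * (k2 / d**2 * (e1 - e2) - k1 * t * e1 / d)    # dCB/dk1
    G[1, 1] = CA0 * (-k1 / d**2 * (e1 - e2) + k1 * t * e2 / d)   # dCB/dk2
    return G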
4.2.4 Notes on the Gauss-Newton Method
This is the well-known Gauss-Newton method which exhibits quadratic
convergence to the optimum parameter values when the initial guess is sufficiently
close. The Gauss-Newton method can also be looked at as a procedure that con-
verts the nonlinear regression problem into a series of linear regressions by line-
arizing the nonlinear algebraic equations. It is worth noting that when the model
equations are linear with respect to the parameters, there are no higher order terms
(HOT) and the linearization is exact. As expected, the optimum solution is ob-
tained in a single iteration since the sensitivity coefficients do not depend on k.
In order to enlarge the region of convergence of the Gauss-Newton method
and at the same time make it much more robust, a stepping parameter is used to
avoid the problem of overstepping particularly when the parameter estimates are
away from the optimum. This modification makes the convergence to the opti-
mum monotonic (i.e., the objective function is always reduced from one iteration
to the next) and the overall estimation method becomes essentially globally con-
vergent. When the parameters are close to the optimum the bisection rule can be
omitted without any problem.
Finally, an important advantage of the Gauss-Newton method is that at the end of the estimation, besides the best parameter estimates, their covariance matrix is also readily available without any additional computations. Details will be given in Chapter 11.
4.3 EXAMPLES
4.3.1 Chemical Kinetics: Catalytic Oxidation of 3-Hexanol
Gallot et al. (1998) studied the catalytic oxidation of 3-hexanol with hydro-
gen peroxide. The data on the effect of the solvent (CH30H) on the partial conver-
sion, y, of hydrogen peroxide were read from Figure l a of the paper by Gallot et
al. (1998) and are also given here in Table 4.1. They proposed a model which is
given by Equation 4.16.
(4.16)
In this case, the unknown parameter vector k is the 2-dimensional vector
[kl,k2lT. There is only one independent variable (xl=t) and only one output vari-
able. Therefore, the model in our standard notation is
56
Chapter 4
Table 4.1 Catalytic Oxidation of 3-Hexanol

Modified Reaction    Partial Conversion (y)
Time (t)             0.75 g of CH3OH    1.30 g of CH3OH
(h kg/kmol)
3                    0.040              0.055
6                    0.070              0.090
13                   0.100              0.120
18                   0.130              0.150
26                   0.165              0.150
28                   0.175              0.160

Source: Gallot et al. (1998).
The (1×2) dimensional sensitivity coefficient matrix G = [G_{11}, G_{12}] is given by

G_{11} = \left(\frac{\partial f}{\partial k_1}\right) = 1 - exp(-k_2 x_1)        (4.18a)

G_{12} = \left(\frac{\partial f}{\partial k_2}\right) = k_1 x_1\, exp(-k_2 x_1)        (4.18b)

Equations 4.17 and 4.18 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
4.3.2 Biological Oxygen Demand (BOD)
Data on biological oxygen demand versus time are usually modeled by the following equation

y = k_1\left[1 - exp(-k_2 t)\right]        (4.19)

where k_1 is the ultimate carbonaceous oxygen demand (mg/L) and k_2 is the BOD reaction rate constant (d^{-1}). A set of BOD data were obtained by 3rd year Environmental Engineering students at the Technical University of Crete and are given in Table 4.2.
Gauss-Newton Method for Algebraic Models
Table 4.2 A Set of BOD Data
Time (days)
1
BOD (rng/L)
230 3
180 2
110
4
280 5
260
290 6
57
7
330 8
310
As seen, the model for the BOD is identical in mathematical form with the model given by Equation 4.17.
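Assuming the gauss_newton sketch given after Section 4.2.2 is available, the BOD data of Table 4.2 could be fitted as in the following sketch (the initial guess is ours, chosen for illustration):

import numpy as np

# Model of Equation 4.19 and its sensitivities (same form as Equations 4.18)
t = np.arange(1.0, 9.0)                            # days, from Table 4.2
bod = np.array([110., 180., 230., 260., 280., 290., 310., 330.])

def model(x, k):
    return np.array([k[0] * (1.0 - np.exp(-k[1] * x[0]))])

def sens(x, k):
    e = np.exp(-k[1] * x[0])
    return np.array([[1.0 - e, k[0] * x[0] * e]])  # [df/dk1, df/dk2]

x_data = [np.array([ti]) for ti in t]
y_data = [np.array([yi]) for yi in bod]
# k_hat = gauss_newton(model, sens, x_data, y_data, k0=[350.0, 0.3])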
4.3.3 Numerical Example 1
Let us consider the following nonlinear model (Bard, 1970). Data for the model are given in Table 4.3.

y = k_1 + \frac{x_1}{k_2 x_2 + k_3 x_3}        (4.20)

This model is assumed to be able to fit the data given in Table 4.3. Using our standard notation [y = f(x,k)] we have,
Parameter vector: k = [k_1, k_2, k_3]^T
Vector of independent variables: x = [x_1, x_2, x_3]^T
Output vector: y = [y_1]
Model Equation: f = [f_1]
where

f_1 = k_1 + \frac{x_1}{k_2 x_2 + k_3 x_3}        (4.21)

The elements of the (1×3)-dimensional sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G_{11} = \left(\frac{\partial f_1}{\partial k_1}\right) = 1        (4.22a)
Table 4.3 Data for Numerical Example 1

Run    y       x1    x2    x3    ycalc
1      0.14    1     15    1     0.1341
2      0.18    2     14    2     0.1797
3      0.22    3     13    3     0.2203
4      0.25    4     12    4     0.2565
5      0.29    5     11    5     0.2892
6      0.32    6     10    6     0.3187
7      0.35    7     9     7     0.3455
8      0.39    8     8     8     0.3700

Source: Bard (1970).
G_{12} = \left(\frac{\partial f_1}{\partial k_2}\right) = -\frac{x_1 x_2}{(k_2 x_2 + k_3 x_3)^2}        (4.22b)

G_{13} = \left(\frac{\partial f_1}{\partial k_3}\right) = -\frac{x_1 x_3}{(k_2 x_2 + k_3 x_3)^2}        (4.22c)

Equations 4.21 and 4.22 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
4.3.4 Chemical Kinetics: Isomerization of Bicyclo [2,1,1] Hexane
Data on the thermal isomerization of bicyclo [2,1,1] hexane were measured by Srinivasan and Levi (1963). The data are given in Table 4.4. The following nonlinear model was proposed to describe the fraction of original material remaining (y) as a function of time (x_1) and temperature (x_2). The model was reproduced from Draper and Smith (1998)

y = exp\left\{-k_1 x_1\, exp\left[-k_2\left(\frac{1}{x_2} - \frac{1}{620}\right)\right]\right\}        (4.23)

Using our standard notation [y = f(x,k)] we have,
Parameter vector: k = [k_1, k_2]^T
Vector of independent variables: x = [x_1, x_2]^T
Output vector: y = [y_1]
Model Equation: f = [f_1]
where

f_1 = exp\left\{-k_1 x_1\, exp\left[-k_2\left(\frac{1}{x_2} - \frac{1}{620}\right)\right]\right\}        (4.24)
Table 4.4 Isomerization of Bicyclo (2,1,1) Hexane

Run    x1 (time)    x2 (temperature)    y
...    ...          ...                 ...
18     90.0         620                 0.712
19     150.0        620                 0.576
20     90.4         620                 0.715
...    ...          ...                 ...
38     60.0         639                 0.425
39     30.0         639                 0.638
40     30.0         639                 0.659
41     60.0         639                 0.449

Source: Srinivasan and Levi (1963).
The elements of the (1×2)-dimensional sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G_{11} = \left(\frac{\partial f_1}{\partial k_1}\right) = -x_1\, exp\left[-k_2\left(\frac{1}{x_2} - \frac{1}{620}\right)\right] f_1        (4.25a)

G_{12} = \left(\frac{\partial f_1}{\partial k_2}\right) = k_1 x_1\left(\frac{1}{x_2} - \frac{1}{620}\right) exp\left[-k_2\left(\frac{1}{x_2} - \frac{1}{620}\right)\right] f_1        (4.25b)

Equations 4.24 and 4.25 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
4.3.5 Enzyme Kinetics
Let us consider the determination of two parameters, the maximum reaction rate (r_max) and the saturation constant (K_m), in an enzyme-catalyzed reaction following Michaelis-Menten kinetics. The Michaelis-Menten kinetic rate equation relates the reaction rate (r) to the substrate concentration (S) by

r = \frac{r_{max}\, S}{K_m + S}        (4.26)

The parameters are usually obtained from a series of initial rate experiments performed at various substrate concentrations. Data for the hydrolysis of benzoyl-L-tyrosine ethyl ester (BTEE) by trypsin at 30°C and pH 7.5 are given below:

S (μM)        2.5    5.0    10     15     20
r (μM/min)    110    220    260    300    330

Source: Blanch and Clark (1996).

In this case, the unknown parameter vector k is the 2-dimensional vector [r_max, K_m]^T, there is only one independent variable, x = [S], and similarly for the output vector, y = [r]. Therefore, the model in our standard notation is

f_1 = \frac{k_1 x_1}{k_2 + x_1}        (4.27)

The (1×2) dimensional sensitivity coefficient matrix G = [G_{11}, G_{12}] is given by

G_{11} = \left(\frac{\partial f_1}{\partial k_1}\right) = \frac{x_1}{k_2 + x_1}        (4.28a)

G_{12} = \left(\frac{\partial f_1}{\partial k_2}\right) = -\frac{k_1 x_1}{(k_2 + x_1)^2}        (4.28b)

Equations 4.27 and 4.28 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
4.3.6 Catalytic Reduction of Nitric Oxide
As another example from chemical kinetics, we consider the catalytic reduction of nitric oxide (NO) by hydrogen which was studied using a flow reactor operated differentially at atmospheric pressure (Ayen and Peters, 1962). The following reaction was considered to be important

NO + H_2 ⇌ H_2O + ½N_2        (4.29)

Data were taken at 375, 400, and 425°C using nitrogen as the diluent. The reaction rate in gmol/(min·g-catalyst) and the total NO conversion were measured at different partial pressures for H2 and NO.
A Langmuir-Hinshelwood reaction rate model for the reaction between an adsorbed nitric oxide molecule and one adjacently adsorbed hydrogen molecule is described by:

r = \frac{k\, K_{NO}\, K_{H_2}\, p_{NO}\, p_{H_2}}{(1 + K_{NO}\, p_{NO} + K_{H_2}\, p_{H_2})^2}        (4.30)

where r is the reaction rate in gmol/(min·g-catalyst), p_{H2} is the partial pressure of hydrogen (atm), p_{NO} is the partial pressure of NO (atm), K_{NO} = A_2 exp(-E_2/RT) atm^{-1} is the adsorption equilibrium constant for NO, K_{H2} = A_3 exp(-E_3/RT) atm^{-1} is the adsorption equilibrium constant for H2 and k = A_1 exp(-E_1/RT) gmol/(min·g-catalyst) is the forward reaction rate constant for the surface reaction. The data for the above problem are given in Table 4.5.
The objective of the estimation procedure is to determine the parameters k, K_{H2} and K_{NO} (if data from only one isotherm are considered) or the parameters A_1, A_2, A_3, E_1, E_2, E_3 (when all data are regressed together). The units of E_1, E_2, E_3 are cal/mol and R is the universal gas constant (1.987 cal/(mol·K)).
For the isothermal regression of the data, using our standard notation [y = f(x,k)] we have,
Parameter vector: k = [k_1, k_2, k_3]^T
Independent variables: x = [x_1, x_2]^T
Output vector: y = [y_1]
Model Equation: f = [f_1]
where

f_1 = \frac{k_1 k_2 k_3 x_1 x_2}{(1 + k_2 x_1 + k_3 x_2)^2}        (4.31)

The elements of the (1×3)-dimensional sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G_{11} = \left(\frac{\partial f_1}{\partial k_1}\right) = \frac{k_2 k_3 x_1 x_2}{(1 + k_2 x_1 + k_3 x_2)^2}        (4.32a)

G_{12} = \left(\frac{\partial f_1}{\partial k_2}\right) = \frac{k_1 k_3 x_1 x_2}{(1 + k_2 x_1 + k_3 x_2)^2} - \frac{2 k_1 k_2 k_3 x_1^2 x_2}{(1 + k_2 x_1 + k_3 x_2)^3}        (4.32b)

G_{13} = \left(\frac{\partial f_1}{\partial k_3}\right) = \frac{k_1 k_2 x_1 x_2}{(1 + k_2 x_1 + k_3 x_2)^2} - \frac{2 k_1 k_2 k_3 x_1 x_2^2}{(1 + k_2 x_1 + k_3 x_2)^3}        (4.32c)

Equations 4.31 and 4.32 are used to evaluate the model response and the sensitivity coefficients that are required for setting up matrix A and vector b at each iteration of the Gauss-Newton method.
4.3.7 Numerical Example 2
Let us consider the following nonlinear model (Hartley, 1961),

y = k_1 + k_2\, exp(k_3 x)        (4.33)
Table 4.5 Experimental Data for the Catalytic Reduction of Nitric Oxide

T=375°C, Weight of catalyst = 2.39 g
p_H2 (atm)    p_NO (atm)    Reaction rate    NO conversion (%)
0.00922       0.0500        1.60             1.96
0.0136        0.0500        2.56             2.36
0.0197        0.0500        3.27             2.99
0.0280        0.0500        3.64             3.54
0.0291        0.0500        3.48             3.41
0.0389        0.0500        4.46             4.23

T=400°C, Weight of catalyst = 1.066 g
p_H2 (atm)    p_NO (atm)    Reaction rate    NO conversion (%)
0.00659       0.0500        2.52             0.59
0.0113        0.0500        4.21             1.05
0.0228        0.0500        5.41             1.44
0.0311        0.0500        6.61             1.76
0.0402        0.0500        6.86             1.91
0.0500        0.0500        8.79             2.57
0.0500        0.0100        3.64             8.83
0.0500        0.0153        4.77             6.05
0.0500        0.0270        6.61             4.06
0.0500        0.0361        7.94             3.20
0.0500        0.0432        7.82             2.70

T=425°C, Weight of catalyst = 1.066 g
p_H2 (atm)    p_NO (atm)    Reaction rate    NO conversion (%)
0.00474       0.0500        5.02             2.62
0.0136        0.0500        7.23             4.17
0.0290        0.0500        11.35            6.84
0.0400        0.0500        13.00            8.19
0.0500        0.0500        13.91            8.53
0.0500        0.0269        9.29             13.3
0.0500        0.0302        9.75             12.3
0.0500        0.0387        11.89            10.4

Source: Ayen and Peters (1962).
Data for the model are given below in Table 4.6. The variable y represents yields of wheat corresponding to six rates of application of fertilizer, x, on a coded scale. The model equation is often called Mitscherlich's law of diminishing returns.
According to our standard notation the model equation is written as follows

f_1 = k_1 + k_2\, exp(k_3 x_1)        (4.34)

Table 4.6 Data for Numerical Example 2

x     y
-5    127
-3    151
-1    379
1     421
3     460
5     426

Source: Hartley (1961).

The elements of the (1×3)-dimensional sensitivity coefficient matrix G are obtained by evaluating the partial derivatives:

G_{11} = \left(\frac{\partial f_1}{\partial k_1}\right) = 1        (4.35a)

G_{12} = \left(\frac{\partial f_1}{\partial k_2}\right) = exp(k_3 x_1)        (4.35b)

G_{13} = \left(\frac{\partial f_1}{\partial k_3}\right) = k_2 x_1\, exp(k_3 x_1)        (4.35c)
4.4 SOLUTIONS
The solutions to Numerical Examples 1 and 2 will be given here. The rest of the solutions will be presented in Chapter 16 where applications in chemical reaction engineering are illustrated.
4.4.1 Numerical Example 1

Starting with the initial guess k(0) = [1, 1, 1]^T, the Gauss-Newton method eas-
ily converged to the parameter estimates within 4 iterations, as shown in Table 4.7.
In the same table the standard error (%) in the estimation of each parameter is also
shown. Bard (1970) also reported the same parameter estimates [0.08241, 1.1330,
2.3437]^T starting from the same initial guess.
The structure of the model characterizes the shape of the region of conver-
gence. For example, if we change the initial guess for k1 substantially, the algo-
rithm converges very quickly since k1 enters the model in a linear fashion. This is
clearly shown in Table 4.8 where we have used k(0) = [100000, 1, 1]^T. On the other
hand, if we use for k2 a value which is just within one order of magnitude away
from the optimum, the Gauss-Newton method fails to converge. For example, if
k(0) = [1, 2, 1]^T is used, the method converges within 3 iterations. If, however,
k(0) = [1, 8, 1]^T or k(0) = [1, 10, 1]^T is used, the Gauss-Newton method fails to con-
verge. The actual shape of the region of convergence can be fairly irregular. For
example, if we use k(0) = [1, 14, 1]^T or k(0) = [1, 15, 1]^T the Gauss-Newton method
converges within 8 iterations in both cases. But again, when k(0) = [1, 16, 1]^T is
used, the Gauss-Newton method fails to converge.
Table 4.7 Parameter Estimates at Each Iteration of the Gauss-Newton
Method for Numerical Example 1 with Initial Guess [1, 1, 1]

Iteration   LS Objective Function   k1        k2      k3
0           41.6817                 1         1       1
1           1.26470                 0.08265   1.183   1.666
...         (converged within 4 iterations to [0.08241, 1.1330, 2.3437])

Table 4.8 Parameter Estimates at Each Iteration of the Gauss-Newton
Method for Numerical Example 1 with Initial Guess [100000, 1, 1]
4.4.2 Numerical Example 2

Starting with the initial guess k(0) = [100, -200, -1]^T, the Gauss-Newton method
converged to the optimal parameter estimates given in Table 4.9 in 12 iterations.
The number of iterations depends heavily on the chosen initial guess. If, for exam-
ple, we use k(0) = [1000, -200, -0.2]^T as the initial guess, the Gauss-Newton method con-
verges to the optimum within 3 iterations, as shown in Table 4.10. At the bottom of
Table 4.10 we also report the standard error (%) in the parameter estimates. As
expected, the uncertainty is quite high since we are estimating 3 parameters from
only 6 data points and the structure of the model naturally leads to a high correla-
tion between k2 and k3.

Hartley (1961) also reported convergence to the same parameter values
k* = [523.3, -156.9, -0.1997]^T by using as initial guess k(0) = [500, -140, -0.18]^T.
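For readers who wish to reproduce this behavior, the following self-contained Python sketch (our own illustration, not part of the original text) applies the Gauss-Newton iteration of this chapter to Equation 4.33 with the Table 4.6 data and unit weighting (Qi = 1):

import numpy as np

# Data from Table 4.6 (Hartley, 1961)
x = np.array([-5.0, -3.0, -1.0, 1.0, 3.0, 5.0])
y = np.array([127.0, 151.0, 379.0, 421.0, 460.0, 426.0])

def residuals_and_jacobian(k):
    k1, k2, k3 = k
    f = k1 + k2 * np.exp(k3 * x)                      # model, Equation 4.33
    G = np.column_stack([np.ones_like(x),             # df/dk1, Equation 4.35a
                         np.exp(k3 * x),              # df/dk2, Equation 4.35b
                         k2 * x * np.exp(k3 * x)])    # df/dk3, Equation 4.35c
    return y - f, G

k = np.array([1000.0, -200.0, -0.2])                  # initial guess
for _ in range(20):
    e, G = residuals_and_jacobian(k)
    dk = np.linalg.solve(G.T @ G, G.T @ e)            # Gauss-Newton step
    S_old, mu = e @ e, 1.0
    while True:                                       # bisection rule for mu
        e_new, _ = residuals_and_jacobian(k + mu * dk)
        if e_new @ e_new < S_old or mu < 1e-12:
            break
        mu *= 0.5
    k = k + mu * dk

print(k)    # approaches k* = [523.3, -156.9, -0.1997]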
Table 4.9 Parameter Estimates at Each Iteration of the Gauss-Newton
Method for Numerical Example 2 with Initial Guess [100, -200, -1]

Table 4.10 Parameter Estimates at Each Iteration of the Gauss-Newton
Method for Numerical Example 2 with Initial Guess [1000, -200, -0.2]

Iteration   LS Objective Function   k1       k2        k3
0           1.826×10⁷               1000     -200      -0.2
1           1.339×10⁴               523.4    -157.1    -0.1993
2           1.339×10⁴               523.1    -156.8    -0.1998
3           1.339×10⁴               523.3    -156.9    -0.1997
Standard Error (%)                  15.2     85.2      30.4
5
Other Nonlinear Regression Methods for
Algebraic Models
There is a variety of general purpose unconstrained optimization methods
that can be used to estimate unknown parameters. These methods are broadly clas-
sified into two categories: direct search methods and gradient methods (Edgar and
Himmelblau, 1988; Gill et al. 1981; Kowalik and Osborne, 1968; Sargent, 1980;
Reklaitis, 1983; Scales, 1985).
A brief overview of this relatively vast subject is presented here, and several of
these methods are discussed in the following sections. Over the years, many
comparisons of the performance of these methods have been carried out and re-
ported in the literature. For example, Box (1966) evaluated eight unconstrained
optimization methods using a set of problems with up to twenty variables.
5.1 GRADIENT MINIMIZATION METHODS

The gradient search methods require derivatives of the objective function,
whereas the direct methods are derivative-free. The derivatives may be available
analytically, or otherwise they are approximated in some way. It is assumed that
the objective function has continuous second derivatives, whether or not these are
explicitly available. Gradient methods remain efficient if there are some disconti-
nuities in the derivatives. On the other hand, direct search techniques, which use
only function values, are more efficient for highly discontinuous functions.
The basic problem is to search for the parameter vector k that minimizes
S(k) by following an iterative scheme, i.e.,
Minimize S(k) = Σ_{i=1}^{N} e_i^T Q_i e_i    (5.1)

where k = [k1, k2, ..., kp]^T is the p-dimensional vector of parameters,
e_i = [ŷ_i − f(x_i, k)] is the m-dimensional vector of residuals and Q_i is a user-
specified positive definite weighting matrix.
The need to utilize an iterative scheme stems from the fact that it is usually
impossible to find the exact solution of the equation that gives the stationary
points of S(k) (Peressini et al. 1988),

∇S(k) ≡ ∂S(k)/∂k = 0    (5.2a)

where the operator ∇ = [∂/∂k1, ∂/∂k2, ..., ∂/∂kp]^T is applied to the scalar function
S(k), yielding the column vector

∇S(k) = [∂S(k)/∂k1, ∂S(k)/∂k2, ..., ∂S(k)/∂kp]^T    (5.2b)

Vector ∇S(k) contains the first partial derivatives of the objective function
S(k) with respect to k and is often called the gradient vector. For simplicity, we
denote it as g(k) in this chapter.
In order to find the solution, one starts from an initial guess for the parame-
ters, k(0) = [k1(0), k2(0), ..., kp(0)]^T. There is no general rule to follow in order to ob-
tain an initial guess. However, some heuristic rules can be used and they are dis-
cussed in Chapter 8.

At the start of the jth iteration we denote by k(j) the current estimate of the
parameters. The jth iteration consists of the computation of a search vector Δk(j+1)
from which we obtain the new estimate k(j+1) according to the following equation:

k(j+1) = k(j) + μ(j)·Δk(j+1)    (5.3)
where μ(j) is the step-size, also known as the damping or relaxation factor. It is ob-
tained by univariate minimization or from prior knowledge based upon the theory of the
method (Edgar and Himmelblau, 1988). As seen from Equation 5.3, our main con-
cern is to determine the search vector Δk(j+1). Based on the chosen method to cal-
culate the search vector, different solution methods to the minimization problem
arise.
The iterative procedure stops when the convergence criterion for termination
is satisfied. When the unknown parameters are of the same order of magnitude,
then a typical test for convergence is ||Δk(j+1)|| < TOL, where TOL is a user-
specified tolerance. A more general convergence criterion is

√[ (1/p) Σ_{i=1}^{p} ( Δk_i(j+1) / k_i(j+1) )² ] ≤ 10^(−NSIG)    (5.4)

where p is the number of parameters and NSIG is the number of desired signifi-
cant digits in the parameter values. It is assumed that no parameter converges to
zero.
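A minimal sketch of this test in Python (our own illustration; as in the text, the relative-change form assumes no parameter converges to zero):

import numpy as np

def converged(dk, k, nsig):
    """Root-mean-square relative parameter change below 10**(-nsig)."""
    rel = np.asarray(dk) / np.asarray(k)
    return np.sqrt(np.mean(rel**2)) <= 10.0**(-nsig)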
The minimization method must be computationally efficient and robust (Ed-
gar and Himmelblau, 1988). Robustness refers to the ability to arrive at a solution.
Computational efficiency is important since iterative procedures are employed.
The speed with which convergence to the optimal parameter values, k*, is reached
is defined by the asymptotic rate of convergence (Scales, 1985). An algorithm is
considered to have a θth order rate of convergence when θ is the largest integer for
which the following limit exists:

lim_{j→∞} ||k(j+1) − k*|| / ||k(j) − k*||^θ < ∞    (5.5)

In the above equation, the norm ||·|| is usually the Euclidean norm. We
have a linear convergence rate when θ is equal to 1. A superlinear convergence rate
refers to the case where θ = 1 and the limit is equal to zero. When θ = 2, the conver-
gence rate is called quadratic. In general, the value of θ depends on the algorithm
while the value of the limit depends upon the function that is being minimized.
5.1.1 Steepest Descent Method

In this method the search vector is the negative of the gradient of the objec-
tive function and is given by

Δk(j+1) = −∇S(k(j)) = −g(k(j))
Based on Equation 5.1, the search vector is related to the residuals
e_i = [ŷ_i − f(x_i,k)] as follows:

Δk(j+1) = −2 Σ_{i=1}^{N} (∇e_i^T) Q_i e_i = 2 Σ_{i=1}^{N} (∇f_i^T) Q_i e_i    (5.6)

where

∇e_i^T = ∂e_i^T/∂k    (5.7)

and

∇f_i^T = ∂f^T(x_i,k)/∂k    (5.8)

As seen from the above equations, the (m×p) matrix (∇e^T)^T is the Jacobean
matrix, J, of the vector function e, and the (m×p) matrix (∇f^T)^T is the Jacobean
matrix, G, of the vector function f(x,k). The sr-th element of the Jacobean matrix J
is given by

J_sr = (∂e_s/∂k_r) ; s = 1,2,...,m , r = 1,2,...,p    (5.9a)
Similarly, the sr-th element of the Jacobean matrix G is given by

G_sr = (∂f_s/∂k_r) ; s = 1,2,...,m , r = 1,2,...,p    (5.9b)
The rate of convergence of the steepest descent method is first order. The
basic difficulty with steepest descent is that the method is too sensitive to the
scaling of S(k), so that convergence is very slow and oscillations in the k-space
can easily occur. In general, a well-scaled problem is one in which similar changes
in the variables lead to similar changes in the objective function (Kowalik and
Osborne, 1968). For these reasons, steepest descent/ascent is not a viable method
for the general purpose minimization of nonlinear functions. It is of interest only
for historical and theoretical reasons.
Algorithm - Implementation Steps

1. Input the initial guess for the parameters, k(0), and NSIG or TOL.
2. Specify the weighting matrices Q_i for i = 1,2,...,N.
3. For j = 0, 1, 2,..., repeat.
4. Compute Δk(j+1) using Equation 5.6.
5. Determine μ(j) using the bisection rule and obtain k(j+1) = k(j) + μ(j)·Δk(j+1).
6. Continue until the maximum number of iterations is reached or convergence
   is achieved.
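A minimal steepest-descent sketch in Python (our own illustration, assuming a single-response model with unit weights; the helper `residuals_and_jacobian` is the same hypothetical interface used in the Gauss-Newton sketch of Chapter 4):

import numpy as np

def steepest_descent(residuals_and_jacobian, k0, max_iter=200, tol=1e-6):
    """Steepest descent with the bisection rule for the step size.
    residuals_and_jacobian(k) must return (e, G) with e = y - f(x,k)
    and G the Jacobean of f with respect to k."""
    k = np.asarray(k0, dtype=float)
    for _ in range(max_iter):
        e, G = residuals_and_jacobian(k)
        g = -2.0 * G.T @ e            # gradient of S(k) (unit weights)
        dk = -g                       # search vector (Equation 5.6)
        S_old, mu = e @ e, 1.0
        while True:                   # bisection rule
            e_new, _ = residuals_and_jacobian(k + mu * dk)
            if e_new @ e_new < S_old or mu < 1e-12:
                break
            mu *= 0.5
        k = k + mu * dk
        if np.linalg.norm(mu * dk) < tol:
            break
    return k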
5.1.2 Newton's Method

In this method the step-size parameter μ(j) is taken equal to 1 and the search vector
is obtained from

Δk(j+1) = −[∇²S(k(j))]⁻¹ ∇S(k(j))    (5.10)

where ∇²S(k) is the Hessian matrix of S(k) evaluated at k(j). Denoting it as H(k),
Equation 5.10 can be written as
Δk(j+1) = −H⁻¹(k(j))·g(k(j))    (5.11)
The above formula is obtained by differentiating the quadratic approxima-
tion of S(k) with respect to each of the components of k and equating the resulting
expression to zero (Edgar and Himmelblau, 1988; Gill et al. 1981; Scales, 1985).
It should be noted that in practice there is no need to obtain the inverse of the Hes-
sian matrix, because it is better to solve the following linear system of equations
(Peressini et al. 1988):

[∇²S(k)] Δk(j+1) = −∇S(k)    (5.12a)

or equivalently

H(k(j)) Δk(j+1) = −g(k(j))    (5.12b)

As seen by comparing Equations 5.6 and 5.12, the steepest-descent method
arises from Newton's method if we assume that the Hessian matrix of S(k) is ap-
proximated by the identity matrix.
Newton's method is not a satisfactory general-purpose algorithm for func-
tion minimization, even when a stepping parameter μ is introduced. Fortunately, it
can be modified to provide extremely reliable algorithms with the same asymp-
totic rate of convergence. There is an extensive literature on Newton's method
and the main points of interest are summarized below (Edgar and Himmelblau,
1988; Gill et al. 1981; Peressini et al. 1988; Scales, 1985):
(i) It is the most rapidly convergent method when the Hessian matrix of
S(k) is available.
(ii) There is no guarantee that it will converge to a minimum from an ar-
bitrary starting point.
(iii) Problems arise when the Hessian matrix is indefinite or singular.
(iv) The method requires analytical first and second order derivatives,
which may not be practical to obtain. In that case, finite difference
techniques may be employed.
The ratio of the largest to the smallest eigenvalue of the Hessian matrix at
the minimum is defined as the condition number. For most algorithms, the larger
the condition number, the larger the limit in Equation 5.5 and the more difficult it
is for the minimization to converge (Scales, 1985).

One approach to solving the linear Equations 5.12 is the method of Gill and
Murray, which uses the Cholesky factorization of H as follows (Gill and
Murray, 1974; Scales, 1985):
H = L D L^T    (5.13)

In the above equation, D is a diagonal matrix and L is a lower triangular matrix
with diagonal elements of unity.
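As a sketch (our own illustration), the Newton step of Equation 5.12b can be computed with such a factorization. The LDL^T form avoids square roots; for a positive definite H, the closely related Cholesky route available in SciPy gives the same result:

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def newton_step(H, g):
    """Solve H dk = -g (Equation 5.12b) via a Cholesky factorization,
    assuming H is positive definite."""
    c, low = cho_factor(H)
    return cho_solve((c, low), -g)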
As shown previously, the negative of the gradient of the objective function is
related to the residuals by Equation 5.6. Therefore, the gradient of the objective
function is given by

g(k) = ∇S(k) = 2 Σ_{i=1}^{N} (∇e_i^T) Q_i e_i    (5.14)

where e_i = [ŷ_i − f(x_i,k)] and ∇e_i^T = (J_i)^T is the transpose of the Jacobean ma-
trix of the vector function e_i. The sr-th element of this Jacobean was defined by
Equation 5.9a. Equation 5.14 can now be written in an expanded form as follows:

∇S(k) = 2 Σ_{i=1}^{N} [ −∂f^T(x_i,k)/∂k ] Q_i [ŷ_i − f(x_i,k)]    (5.15)

After completing the matrix multiplication operations, we obtain

g(k) = ∇S(k) = −2 Σ_{i=1}^{N} (∂f^T(x_i,k)/∂k) Q_i [ŷ_i − f(x_i,k)]    (5.16)

Thus, the ξth element of the gradient vector g(k) of the objective function
S(k) is given by the following equation:

g_ξ(k) = −2 Σ_{i=1}^{N} (∂f^T(x_i,k)/∂k_ξ) Q_i [ŷ_i − f(x_i,k)] ; ξ = 1,2,...,p    (5.17)
We are now able to obtain the Hessian matrix of the objective function S(k),
which is denoted by H and is given by the following equation:

∇²S(k) = ∇∇^T S(k) =

[ ∂²S/∂k1∂k1   ∂²S/∂k1∂k2   ...   ∂²S/∂k1∂kp ]
[ ∂²S/∂k2∂k1   ∂²S/∂k2∂k2   ...   ∂²S/∂k2∂kp ]
[     ...           ...      ...       ...     ]
[ ∂²S/∂kp∂k1   ∂²S/∂kp∂k2   ...   ∂²S/∂kp∂kp ]    (5.18)

where we use the notation g_ξ = ∂S(k)/∂k_ξ.

Thus, the ξρ-th element of the Hessian matrix is defined by

H_ξρ = ∂²S(k)/∂k_ξ∂k_ρ = ∂g_ξ/∂k_ρ    (5.19)

and this element can be calculated by taking into account Equation 5.16 as follows:

H_ξρ = 2 Σ_{i=1}^{N} { (∂f^T/∂k_ξ) Q_i (∂f^T/∂k_ρ)^T − (∂²f^T/∂k_ξ∂k_ρ) Q_i [ŷ_i − f(x_i,k)] }    (5.20)

where ξ = 1,2,...,p and ρ = 1,2,...,p.
The Gauss-Newton method arises when the second order terms on the right
hand side of Equation 5.20 are ignored. As seen, the Hessian matrix used in Equa-
tion 5.11 then contains only first derivatives of the model equations f(x,k). Leaving out
the second-derivative terms may be justified by the fact that these terms
contain the residuals e_i as factors, and the residuals are expected to be small quanti-
ties.
The Gauss-Newton method is directly related to Newton's method. The
main difference between the two is that Newton's method requires the computa-
tion of second order derivatives as they arise from the direct differentiation of the
objective function with respect to k. These second order terms are avoided when
the Gauss-Newton method is used since the model equations are first linearized
and then substituted into the objective function. The latter constitutes a key ad-
vantage of the Gauss-Newton method compared to Newton's method, which also
exhibits quadratic convergence.
Algorithm - Implementation Steps

1. Input the initial guess for the parameters, k(0), and NSIG or TOL.
2. Specify the weighting matrices Q_i for i = 1,2,...,N.
3. For j = 0, 1, 2,..., repeat.
4. Compute Δk(j+1) by solving Equation 5.12b.
5. Determine μ(j) using the bisection rule and obtain k(j+1) = k(j) + μ(j)·Δk(j+1).
6. Continue until the maximum number of iterations is reached or convergence
   is achieved.
According to Scales (1985) the best way to solve Equation 5.12b is by per-
forming a Cholesky factorization of the Hessian matrix. One may also use
Gauss-Jordan elimination (Press et al., 1992). An excellent user-oriented
presentation of solution methods is provided by Lawson and Hanson (1974). We
prefer to perform an eigenvalue decomposition as discussed in Chapter 8.
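A sketch of the eigenvalue-decomposition approach in Python (our own illustration; thresholding small eigenvalues is one common way to guard against an ill-conditioned Hessian):

import numpy as np

def solve_via_eig(H, g, eps=1e-10):
    """Solve H dk = -g using an eigenvalue decomposition of the
    symmetric matrix H, discarding near-zero eigenvalues."""
    lam, V = np.linalg.eigh(H)
    inv_lam = np.where(lam > eps * lam.max(), 1.0 / lam, 0.0)
    return -(V * inv_lam) @ (V.T @ g)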
5.1.3 Modified Newton's Method

Modified Newton methods attempt to alleviate the deficiencies of Newton's
method (Bard, 1970). The basic problem arises if the Hessian matrix, H, is not
positive definite. That can be checked by examining whether all the eigenvalues of H are
positive numbers. If any of the eigenvalues are not positive, then a procedure pro-
posed by Marquardt (1963), based on earlier work by Levenberg (1944), should be
followed. A positive value γ can be added to all the eigenvalues such that the re-
sulting positive quantities, λ_i + γ, i = 1,2,...,p, are the eigenvalues of a positive definite ma-
trix H_LM given by

H_LM = H + γI    (5.21)

where I is the identity matrix.
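A minimal sketch (our own illustration) of this eigenvalue check and shift:

import numpy as np

def marquardt_modified_hessian(H, margin=1e-8):
    """Return H + gamma*I with gamma chosen so that all eigenvalues of
    the result are positive (Equation 5.21)."""
    lam_min = np.linalg.eigvalsh(H).min()
    gamma = 0.0 if lam_min > 0 else -lam_min + margin
    return H + gamma * np.eye(H.shape[0])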
Algorithm - Implementation Steps

1. Input the initial guess for the parameters, k(0), γ, and NSIG or TOL.
2. Specify the weighting matrices Q_i for i = 1,2,...,N.
3. For j = 0, 1, 2,..., repeat.
4. Compute Δk(j+1) by solving Equation 5.12b, but with the Hessian
   matrix H(k) replaced by H_LM given by Equation 5.21.
5. Determine μ(j) using the bisection rule and obtain k(j+1) = k(j) + μ(j)·Δk(j+1).
6. Continue until the maximum number of iterations is reached or convergence
   is achieved.
The Gill-Murray modified Newton's method uses a Cholesky factorization
of the Hessian matrix (Gill and Murray, 1974). The method is described in detail
by Scales (1985).
5.1.4 Conjugate Gradient Methods

Modified Newton methods require the calculation of second derivatives. There
might be cases where these derivatives are not available analytically. One may
then calculate them by finite differences (Edgar and Himmelblau, 1988; Gill et al.
1981; Press et al. 1992). The latter, however, requires a considerable number of
gradient evaluations if the number of parameters, p, is large. In addition, finite
difference approximations of derivatives are prone to truncation and round-off
errors (Bard, 1974; Edgar and Himmelblau, 1988; Gill et al. 1981).
Conjugate gradient-type methods form a class of minimization procedures
that accomplish two objectives:
(a) There is no need for calculation of second order derivatives.
(b) They have relatively small computer storage requirements.
Thus, these methods are suitable for problems with a very large number of
parameters. They are essential in circumstances when methods based on matrix
factorization are not viable because the relevant matrix is too large or too dense
(Gill et al. 1981).
Two versions of the method have been formulated (Scales, 1986):

(a) the Fletcher-Reeves version;
(b) the Polak-Ribiere version.

Scales (1986) recommends the Polak-Ribiere version because it has slightly
better convergence properties. Scales also gives a single algorithm that covers
both methods, which differ only in the formula for updating the search vector.

It is noted that the Rosenbrock function given by the next equation has been
used to test the performance of various algorithms, including modified Newton
and conjugate gradient methods (Scales, 1986):

f(x) = 100(x2 − x1²)² + (1 − x1)²    (5.22)
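As a quick sketch (our own illustration), SciPy's nonlinear conjugate gradient routine (a Polak-Ribiere variant) can be applied to this test function:

import numpy as np
from scipy.optimize import minimize

def rosenbrock(x):
    # Rosenbrock test function (Equation 5.22)
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def rosenbrock_grad(x):
    # analytical gradient; avoids finite-difference round-off errors
    return np.array([-400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
                     200.0 * (x[1] - x[0]**2)])

result = minimize(rosenbrock, x0=[-1.2, 1.0], jac=rosenbrock_grad, method="CG")
print(result.x)   # should approach the minimum at [1, 1]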
5.1.5 Quasi-Newton or Variable Metric or Secant Methods

These methods utilize only values of the objective function, S(k), and values
of the first derivatives of the objective function. Thus, they avoid calculation of the
elements of the (p×p) Hessian matrix. The quasi-Newton methods rely on formu-
las that approximate the Hessian and its inverse. Two algorithms have been devel-
oped:

(a) the Davidon-Fletcher-Powell formula (DFP);
(b) the Broyden-Fletcher-Goldfarb-Shanno formula (BFGS).

The DFP and BFGS methods exhibit superlinear convergence on suitably
smooth functions. They are in general more rapidly convergent, robust and eco-
nomical than conjugate gradient methods. However, they require much more stor-
age and are not suitable for large problems, i.e., problems with many parameters.
Their storage requirements are equivalent to those of Newton's method.

The BFGS method is considered to be superior to DFP in most cases be-
cause (a) it is less prone to loss of positive definiteness or to singularity problems
through round-off errors and (b) it has better theoretical convergence properties
(Scales, 1985; Gill et al. 1981; Edgar and Himmelblau, 1988).
Algorithms are not given here because they are readily available elsewhere
(Gill and Murray, 1972, 1975; Goldfarb, 1976; Scales, 1985; Edgar and Himmel-
blau, 1988; Gill et al. 1981).
5.2 DIRECT SEARCH OR DERIVATIVE FREE METHODS

Direct search methods use only function evaluations. They search for the
minimum of an objective function without calculating derivatives analytically or
numerically. Direct methods are based upon heuristic rules which make no a pri-
ori assumptions about the objective function. They tend to have much poorer con-
vergence rates than gradient methods when applied to smooth functions. Several
authors claim that direct search methods are not as efficient and robust as the indi-
rect or gradient search methods (Bard, 1974; Edgar and Himmelblau, 1988;
Scales, 1986). However, in many instances direct search methods have proved to
be robust and reliable, particularly for systems that exhibit local minima or have
complex nonlinear constraints (Wang and Luus, 1978).

The Simplex algorithm and that of Powell are examples of derivative-free
methods (Edgar and Himmelblau, 1988; Seber and Wild, 1989; Powell, 1965). In
this chapter only two algorithms will be presented: (1) the LJ optimization proce-
dure and (2) the Simplex method. The well-known golden section and Fibonacci
methods for minimizing a function along a line will not be presented. Kowalik and
Osborne (1968) and Press et al. (1992), among others, discuss these methods in
detail.
In an effort to address the problem of combinatorial explosion in optimi-
zation, several new global optimization methods have been introduced. These
methods include: (i) neural networks (Bishop, 1995), (ii) genetic algorithms (Hol-
land, 1975), (iii) simulated annealing techniques (Kirkpatrick et al., 1983, Cerny,
1985; Otten and Ginneken, 1989), (iv) target analysis (Glover 1986) and (v)
threshold accepting (Dueck and Scheuer, 1990), to name a few. These methods
have attracted significant attention as being the most appropriate ones for large
scale optimization problems whose objective functions exhibit a plethora of local
optima.
The simulated annealing technique is probably the most popular one. It tries
to mimic the physical process of annealing, whereby a material starts in a melted
state and its temperature is gradually lowered until it reaches its minimum energy
state. In the physical system the temperature should not be lowered rapidly be-
cause a sub-optimal structure may be formed in the crystallized system and lead to
quenching. In an analogous fashion, we consider the minimization of the objective
function in a series of steps. A slow reduction in temperature corresponds to al-
lowing non-improving steps to be taken with a certain probability, which is higher
in the beginning of the algorithm when the temperature is high. Simulated an-
nealing is essentially a probabilistic hill-climbing algorithm and hence the method
has the capability to move away from a local minimum. The probability used in
simulated annealing algorithms is the Gibbs-Boltzmann distribution encountered
in statistical mechanics. One of its characteristics is that for very high temperatures
each state has almost an equal chance of being chosen to be the current state. For
low temperatures only states with low energies have a high probability of becom-
ing the current state. In practice, simulated annealing is implemented using the
Metropolis et al. (1953) algorithm. Simulated annealing has solved the famous
travelling salesman problem of finding the shortest itinerary for a salesman who
visits N cities. The method has also been successfully used to determine the ar-
rangement of several hundred thousand circuit elements on a small silicon wafer
by minimizing the interference between their connecting wires.
Usually the space over which the objective function is minimized is not de-
fined as the p-dimensional space of p continuously variable parameters. Instead it
is a discrete configuration space of very high dimensionality. In general the num-
ber of elements in the configuration space is exceptionally large so that they can-
not be fully explored with a reasonable computation time.
For parameter estimation purposes, simulated annealing can be implemented
by discretizing the parameter space. Alternatively, we can specify minimum and
maximum values for each unknown parameter, and by using a random number
uniformly distributed in the range [0,1], we can specify randomly the potential
parameter values as

k_i = k_{i,min} + R·(k_{i,max} − k_{i,min}) ; i = 1,2,...,p    (5.23)

where R is a random number.
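A one-line sketch of this sampling rule in Python (our own illustration):

import numpy as np

def random_parameters(k_min, k_max, rng=None):
    """Draw one trial parameter vector per Equation 5.23."""
    rng = rng or np.random.default_rng()
    R = rng.random(len(k_min))                 # uniform random numbers in [0, 1]
    return np.asarray(k_min) + R * (np.asarray(k_max) - np.asarray(k_min))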
Another interesting implementation of simulated annealing for continuous
minimization (like a typical parameter estimation problem) utilizes a modification
of the downhill simplex method. Press et al. (1992) provide a brief overview of
simulated annealing techniques, accompanied by listings of computer programs
that cover all the above cases.

A detailed presentation of simulated annealing techniques can be found
in The Annealing Algorithm by Otten and Ginneken (1989).
5.2.1 LJ Optimization Procedure

One of the most reliable direct search methods is the LJ optimization proce-
dure (Luus and Jaakola, 1973). This procedure uses random search points and
systematic contraction of the search region. The method is easy to program and
handles the problem of multiple optima with high reliability (Wang and Luus,
1977, 1978). An important advantage of the method is its ability to handle multiple
nonlinear constraints.
The adaptation of the original LJ optimization procedure to parameter esti-
mation problems for algebraic equation models is given next (a Python sketch is
provided at the end of this subsection).

(i) Choose an initial guess for the p-dimensional unknown parameter vector,
k(0); the region contraction coefficient, δ (typically δ = 0.95 is used); the
number of random evaluations of the objective function, N_R (typically
N_R = 100 is used) within an iteration; the maximum number of iterations,
j_max (typically j_max = 200 is used); and an initial search region, r(0) (a typical
choice is r(0) = k_max − k_min).

(ii) Set the iteration index j = 1, k(j−1) = k(0) and r(j−1) = r(0).

(iii) Generate or read from a file N_R×p random numbers (R_nl) uniformly dis-
tributed in [−0.5, 0.5].

(iv) For n = 1,2,...,N_R, generate the corresponding random trial parameter vec-
tors from

k_n = k(j−1) + R_n·r(j−1)    (5.24)

where R_n = diag(R_n1, R_n2, ..., R_np).

(v) Find the parameter vector among the N_R trial ones that minimizes the LS
objective function

S(k) = Σ_{i=1}^{N} [ŷ_i − f(x_i,k)]^T Q_i [ŷ_i − f(x_i,k)]    (5.25)

(vi) Keep the best trial parameter vector, k*, found up to now and the corresponding
minimum value of the objective function, S*.

(vii) Set k(j) = k* and compute the search region for the next iteration as

r(j) = δ×r(j−1)    (5.26)

(viii) If j < j_max, increment j by 1 and go to Step (iii); else STOP.
Given the fact that in parameter estimation we normally have a relatively
smooth LS objective function, we do not need to be exceptionally concerned about
local optima (although this may not be the case for ill-conditioned estimation
problems). This is particularly true if we have a good idea of the range where the
parameter values should be. As a result, it may be more efficient to use
a value for N_R which is a function of the number of unknown parameters. For ex-
ample, we may consider

N_R = 50 + 10·p    (5.27)
Typical values would be N_R = 60 when p = 1, N_R = 110 when p = 5 and N_R = 160
when p = 10.

At the same time we may wish to consider a slower contraction of the search
region as the dimensionality of the parameter space increases. For example, we
could use a constant reduction of the volume (say 10%) of the search region rather
than a constant reduction in the search region of each parameter. Namely, we could
use

δ = (0.90)^(1/p)    (5.28)

Typical values would be δ = 0.90 when p = 1, δ = 0.949 when p = 2, δ = 0.974
when p = 4, δ = 0.987 when p = 8 and δ = 0.993 when p = 16.
Since we have a minimization problem, significant computational savings
can be realized by noting in the implementation of the LJ optimization procedure
that, for each trial parameter vector, we do not need to complete the summation in
Equation 5.25: once the LS objective function exceeds the smallest value found
up to that point (S*), a new trial parameter vector can be selected.
Finally, we may wish to consider a multi-pass approach whereby the search
region for each unknown parameter is determined by the maximum change of the
parameter during the last pass (Luus, 1998).
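The following Python sketch (our own illustration, using unit weights and the N_R and δ heuristics of Equations 5.27 and 5.28) implements steps (i)-(viii):

import numpy as np

def lj_optimize(objective, k0, r0, j_max=200, rng=None):
    """LJ (Luus-Jaakola) random search with systematic region contraction."""
    rng = rng or np.random.default_rng()
    p = len(k0)
    n_r = 50 + 10 * p                 # Equation 5.27
    delta = 0.90 ** (1.0 / p)         # Equation 5.28
    k_best = np.asarray(k0, dtype=float)
    s_best = objective(k_best)
    r = np.asarray(r0, dtype=float)
    for _ in range(j_max):
        # steps (iii)-(iv): random trial vectors around the current best
        R = rng.uniform(-0.5, 0.5, size=(n_r, p))
        trials = k_best + R * r
        # steps (v)-(vi): keep the best trial found so far
        for k_trial in trials:
            s = objective(k_trial)
            if s < s_best:
                s_best, k_best = s, k_trial
        r = delta * r                 # step (vii): contract the search region
    return k_best, s_best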
5.2.2 Simplex Method

The Sequential Simplex, or simply Simplex, method relies on geometry to
create a heuristic rule for finding the minimum of a function. It is noted that the
Simplex method of linear programming is a different method.

Kowalik and Osborne (1968) define a simplex as follows:

A set of N+1 points in the N-dimensional space forms a simplex.
When the points are equidistant the simplex is said to be regular.

For a function of N variables one needs an (N+1)-vertex geometric
figure, or simplex, and selects points on the vertices at which to evaluate the function to
be minimized. Thus, for a function of two variables an equilateral triangle is used,
whereas for a function of three variables a regular tetrahedron is used.
Edgar and Himmelblau (1988) demonstrate the use of the method for a
function of two variables. Nelder and Mead (1965) presented the method for a
function of N variables as a flow diagram. They demonstrated its use by applying
it to minimize Rosenbrock's function (Equation 5.22) as well as to the following
functions:

(5.29)

(5.30)

where

(5.31)
In general, for a function of N variables the Simplex method proceeds as
follows:

Step 1. Form an initial simplex, e.g., an equilateral triangle for a function of
two variables.

Step 2. Evaluate the function at each of the vertices.

Step 3. Reject the vertex where the function has the largest value. This point
is replaced by another one found in the direction away from the
rejected vertex and through the centroid of the simplex. The distance
from the rejected vertex is always constant at each search step.
In the case of a function of two variables the direction is from the re-
jected vertex through the middle of the side of the triangle that is op-
posite to this point. The new point, together with the previous two
points, defines a new equilateral triangle.

Step 4. Proceed until a simplex that encloses the minimum is found. Stop
when the difference between two consecutive function evaluations is
less than a preset value (tolerance).
It is noted that Press et al. (1992) give a subroutine that implements the sim-
plex method of Nelder and Mead. They also recommend restarting the minimiza-
tion routine at a point where it claims to have found a minimum.
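In Python, a readily available implementation is SciPy's Nelder-Mead option; the sketch below (our own illustration) also applies the restart recommendation:

import numpy as np
from scipy.optimize import minimize

def rosenbrock(x):
    # Rosenbrock test function (Equation 5.22)
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

res = minimize(rosenbrock, x0=[-1.2, 1.0], method="Nelder-Mead")
# restart from the claimed minimum, as recommended by Press et al. (1992)
res = minimize(rosenbrock, x0=res.x, method="Nelder-Mead")
print(res.x)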
The Simplex optimization method can also be used in the search for optimal
experimental conditions (Walters et al. 1991). A starting simplex is usually formed
from existing experimental information. Subsequently, the response that plays the
role of the objective function is evaluated and a new vertex is found to replace the
worst one. The experimenter then performs an experiment at this new vertex to
determine the response, and a new vertex is found as before. Thus, one sequentially
forms a new simplex and stops when the response remains practically the
same. At that point the experimenter may switch to a factorial experimental design
to further optimize the experimental conditions.
For example, Kurniawan (1998) investigated the in-situ electrochemical
brightening of thermo-mechanical pulp using sodium carbonate as the brightening
chemical. The electrochemical brightening process was optimized by performing a
simplex optimization. In particular, she performed two sequential simplex optimi-
zations. The objective of the first was to maximize the brightness gain and mini-
mize the yellowness (or maximize the absolute yellowness gain), whereas that of
the second was to maximize the brightness gain only. Four factors were consid-
ered: current (Amp), anode area (cm²), temperature (K) and pH. Thus, the simplex
was a pentahedron.

Kurniawan noticed that the first vertex was the same in both optimizations.
This was due to the fact that in both cases the worst vertex was the same.
Kurniawan also noticed that the search for the optimal conditions was more effec-
tive when two responses were optimized. Finally, she noticed that for the Simplex
method to perform well, the initial vertices should define extreme ranges of the
factors.
5.3 EXERCISES
You are asked to estimate the unknown parameters in the examples given
in Chapter 4 by employing methods presented in this chapter.
6
Gauss-Newton Method for Ordinary Differential Equation (ODE) Models
In this chapter we concentrate on the Gauss-Newton method for the
estimation of unknown parameters in models described by a set of ordinary dif-
ferential equations (ODEs).

6.1 FORMULATION OF THE PROBLEM

As mentioned in Chapter 2, the mathematical models are of the form

dx(t)/dt = f(x(t), u, k) ; x(t0) = x0    (6.1)

y(t) = Cx(t)    (6.2)

or more generally

y(t) = h(x(t), k)    (6.3)

where

k = [k1, k2, ..., kp]^T is a p-dimensional vector of parameters whose numerical
values are unknown;
x = [x1, x2, ..., xn]^T is an n-dimensional vector of state variables;
x0 is an n-dimensional vector of initial conditions for the state variables, which
are assumed to be known precisely;
u = [u1, u2, ..., ur]^T is an r-dimensional vector of manipulated variables which
are either set by the experimentalist or have been measured, and whose
numerical values are assumed to be precisely known;
f = [f1, f2, ..., fn]^T is an n-dimensional vector function of known form (the differ-
ential equations);
y = [y1, y2, ..., ym]^T is the m-dimensional output vector, i.e., the set of variables
that are measured experimentally; and
C is the m×n observation matrix, which indicates the state variables (or lin-
ear combinations of state variables) that are measured experimentally.
Experimental data are available as measurements of the output vector as
a function of time, i.e., [ŷ_i, t_i], i = 1,...,N, where ŷ_i denotes the meas-
urement of the output vector at time t_i. These are to be matched to the values
calculated by the model at the same time, y(t_i), in some optimal fashion. Based
on the statistical properties of the experimental error involved in the measure-
ment of the output vector, we determine the weighting matrices Q_i (i = 1,...,N)
that should be used in the objective function to be minimized, as mentioned ear-
lier in Chapter 2. The objective function is of the form

S(k) = Σ_{i=1}^{N} [ŷ_i − y(t_i,k)]^T Q_i [ŷ_i − y(t_i,k)]    (6.4)

Minimization of S(k) can be accomplished by using almost any technique
available from optimization theory. However, since each objective function
evaluation requires the integration of the state equations, the use of quadratically
convergent algorithms is highly recommended. The Gauss-Newton method is
the most appropriate one for ODE models (Bard, 1970) and it is presented in detail
below.
6.2 THE GAUSS-NEWTON METHOD

Again, let us assume that an estimate k(j) of the unknown parameters is
available at the jth iteration. Linearization of the output vector around k(j) and
retention of first order terms yields

y(t_i, k(j+1)) = y(t_i, k(j)) + (∂y^T/∂k)^T Δk(j+1)    (6.5)
Assuming a linear relationship between the output vector and the state
variables (y = Cx), the above equation becomes

y(t_i, k(j+1)) = Cx(t_i, k(j)) + CG(t_i) Δk(j+1)    (6.6)
In the case of ODE models, the sensitivity matrix G(t) = (∂x^T/∂k)^T can-
not be obtained by simple differentiation. However, we can find a differential
equation that G(t) satisfies, and hence the sensitivity matrix G(t) can be deter-
mined as a function of time by solving, simultaneously with the state ODEs, an-
other set of differential equations. This set of ODEs is obtained by differentiat-
ing both sides of Equation 6.1 (the state equations) with respect to k, namely

∂/∂k (dx(t)/dt) = ∂/∂k (f(x, u, k))    (6.7)

Reversing the order of differentiation on the left-hand side of Equation
6.7 and performing the implicit differentiation on the right-hand side, we obtain

d/dt (∂x^T/∂k)^T = (∂f^T/∂x)^T (∂x^T/∂k)^T + (∂f^T/∂k)^T    (6.8)

or better

dG(t)/dt = (∂f^T/∂x)^T G(t) + (∂f^T/∂k)^T    (6.9)

The initial condition G(t0) is obtained by differentiating the initial condi-
tion, x(t0) = x0, with respect to k, and since the initial state is independent of the
parameters, we have:

G(t0) = 0    (6.10)
Equation 6.9 is a matrix differential equation and represents a set of n×p
ODEs. Once the sensitivity coefficients are obtained by solving the
above ODEs numerically, the output vector, y(t_i,k(j+1)), can be computed.

Substitution of the latter into the objective function and use of the sta-
tionary condition ∂S(k(j+1))/∂k(j+1) = 0 yields a linear equation for Δk(j+1):

A Δk(j+1) = b    (6.11)

where

A = Σ_{i=1}^{N} G^T(t_i) C^T Q_i C G(t_i)    (6.12)

and

b = Σ_{i=1}^{N} G^T(t_i) C^T Q_i [ŷ_i − Cx(t_i,k(j))]    (6.13)
Solution of the above equation yields Δk(j+1) and hence k(j+1) is obtained
from

k(j+1) = k(j) + μ Δk(j+1)    (6.14)

where μ is a stepping parameter (0 < μ ≤ 1) to be determined by the bisection rule.
The simple bisection rule is presented later in this chapter, whereas optimal step-
size determination procedures are presented in detail in Chapter 8.
In summary, at each iteration, given the current estimate of the parame-
ters, k(j), we obtain x(t) and G(t) by integrating the state and sensitivity differen-
tial equations. Using these values we compute the model output, y(t_i,k(j)), and
the sensitivity coefficients, G(t_i), for each data point i = 1,...,N, which are subse-
quently used to set up matrix A and vector b. Solution of the linear equation
yields Δk(j+1) and hence k(j+1) is obtained.

Thus, a sequence of parameter estimates is generated, k(1), k(2),..., which
often converges to the optimum, k*, if the initial guess, k(0), is sufficiently close.
The converged parameter values represent the Least Squares (LS), Weighted
Least Squares (WLS) or Generalized Least Squares (GLS) estimates depending
on the choice of the weighting matrices Q_i. Furthermore, if certain assumptions
regarding the statistical distribution of the residuals hold, these parameter values
could also be the Maximum Likelihood (ML) estimates.
6.2.1 Gauss-Newton Algorithm for ODE Models

1. Input the initial guess for the parameters, k(0), and NSIG.
2. For j = 0, 1, 2,..., repeat.
3. Integrate the state and sensitivity equations to obtain x(t) and G(t). At each
   sampling time, t_i, i = 1,...,N, compute y(t_i,k(j)) and G(t_i) to set up ma-
   trix A and vector b.
4. Solve the linear equation AΔk(j+1) = b and obtain Δk(j+1).
5. Determine μ using the bisection rule and obtain k(j+1) = k(j) + μΔk(j+1).
6. Continue until the maximum number of iterations is reached or conver-
   gence is achieved.
7. Compute statistical properties of the parameter estimates (see Chapter 11).
The above method is the well-known Gauss-Newton method for differen-
tial equation systems and it exhibits quadratic convergence to the optimum.
Computational modifications to the above algorithm for the incorporation of
prior knowledge about the parameters (Bayesian estimation) are discussed in
detail in Chapter 8.
6.2.2 Implementation Guidelines for ODE Models

1. Use of a Differential Equation Solver

If the dimensionality of the problem is not excessively high, simultane-
ous integration of the state and sensitivity equations is the easiest approach to
implementing the Gauss-Newton method, without the need to store x(t) as a function
of time. The latter is required in the evaluation of the Jacobeans in Equation 6.9
during the solution of this differential equation to obtain G(t).

Let us rewrite G(t) as

G(t) = [g1(t), g2(t), ..., gp(t)]    (6.15)

In this case the n-dimensional vector g1 represents the sensitivity coeffi-
cients of the state variables with respect to parameter k1 and satisfies the fol-
lowing ODE,
dg1/dt = (∂f^T/∂x)^T g1 + (∂f/∂k1) ; g1(t0) = 0    (6.16a)
Similarly, the n-dimensional vector g2 represents the sensitivity coeffi-
cients of the state variables with respect to parameter k2 and satisfies the fol-
lowing ODE,

dg2/dt = (∂f^T/∂x)^T g2 + (∂f/∂k2) ; g2(t0) = 0    (6.16b)

Finally, for the last parameter, kp, we have the corresponding sensitivity
vector gp,

dgp/dt = (∂f^T/∂x)^T gp + (∂f/∂kp) ; gp(t0) = 0    (6.16c)
Since most numerical differential equation solvers require the
equations to be integrated to be of the form

dz/dt = φ(z) ; z(t0) = given    (6.17)

we generate the following n×(p+1)-dimensional vector z:

z = [x^T, g1^T, g2^T, ..., gp^T]^T    (6.18)
and the corresponding n×(p+1)-dimensional vector function φ(z):

φ(z) = [ f(x,u,k)
         (∂f^T/∂x)^T g1 + (∂f/∂k1)
         (∂f^T/∂x)^T g2 + (∂f/∂k2)
         ...
         (∂f^T/∂x)^T gp + (∂f/∂kp) ]    (6.19)

If the equation solver permits it, information can also be provided about
the Jacobean of φ(z), particularly when we are dealing with stiff differential
equations. The Jacobean is of the form

(∂φ^T/∂z)^T =
[ (∂f^T/∂x)^T   0             0            ...  0            ]
[ *             (∂f^T/∂x)^T   0            ...  0            ]
[ *             0             (∂f^T/∂x)^T  ...  0            ]
[ ...                                                        ]
[ *             0             0            ...  (∂f^T/∂x)^T  ]    (6.20)

where the "*" in the first column represents terms that contain second order de-
rivatives of f with respect to x. In most practical situations these terms can be
neglected and hence this Jacobean can be considered a block diagonal matrix
as far as the ODE solver is concerned. This results in significant savings in
terms of memory requirements and robustness of the numerical integration.
In a typical implementation, the numerical integration routine is re-
quested to provide z(t) at each sampling point, t_i, i = 1,...,N, and hence x(t_i) and
G(t_i) become available for the computation of y(t_i,k(j)) as well as for adding the
appropriate terms to matrix A and vector b.
2. Implementation of the Bisection Rule

As mentioned in Chapter 4, an acceptable value for the stepping parame-
ter μ is obtained by starting with μ = 1 and halving μ until the objective function
becomes less than that obtained in the previous iteration, namely, the first value
of μ that satisfies the following inequality is accepted:

S(k(j) + μΔk(j+1)) < S(k(j))    (6.21)

In the case of ODE models, evaluation of the objective function,
S(k(j) + μΔk(j+1)), for a particular value of μ implies the integration of the state
equations. It should be emphasized here that it is unnecessary to integrate the
state equations over the entire data length [t0, tN] for each trial value of μ. Once
the objective function becomes greater than S(k(j)), a smaller value of μ can be
chosen. By this procedure, besides the savings in computation time, numerical
instability is also avoided since the objective function becomes large quickly
and the integration is often stopped well before computer overflow is threatened
(Kalogerakis and Luus, 1983a).
The importance of using a good integration routine should also be em-
phasized. When Δk(j+1) is excessively large (severe overstepping), numerical
instability during the determination of an acceptable value for μ may cause com-
puter overflow well before we have a chance to compute the output vector at the
first data point and compare the objective functions. In this case, the use of a
good integration routine is of great importance to provide a message indicating
that the tolerance requirements cannot be met. At that moment we can stop the
integration, simply halve μ and start the integration of the state equations again.
Several restarts may be necessary before an acceptable value for μ is obtained.

Furthermore, when k(j) + μΔk(j+1) is used at the next iteration as the current
estimate, we do not anticipate any problems in the integration of both the state
and sensitivity equations. This is simply due to the fact that the eigenvalues of
the Jacobean of the sensitivity equations (inversely related to the governing time
constants) are the same as those of the state equations, where the integration was
performed successfully. These considerations are of particular importance when
the model is described by a set of stiff differential equations, where the wide
range of prevailing time constants creates additional numerical difficulties
that tend to shrink the region of convergence (Kalogerakis and Luus, 1983a).
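A minimal sketch of the bisection rule with early termination of the integration (our own illustration; `integrate_objective` is a hypothetical routine that integrates the state equations, returns the running value of S as soon as it exceeds the supplied bound, and returns infinity if the integrator fails):

def bisection_step(integrate_objective, k, dk, S_current, mu_min=1e-8):
    """Return the first mu = 1, 1/2, 1/4, ... for which
    S(k + mu*dk) < S(k(j)); k and dk are NumPy arrays."""
    mu = 1.0
    while mu >= mu_min:
        S_trial = integrate_objective(k + mu * dk, bound=S_current)
        if S_trial < S_current:
            return mu, S_trial
        mu *= 0.5                 # halve mu and restart the integration
    raise RuntimeError("no acceptable step size found")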
6.3 THE GAUSS-NEWTON METHOD - NONLINEAR OUTPUT
RELATIONSHIP

When the output vector (measured variables) is related to the state vari-
ables (and possibly to the parameters) through a nonlinear relationship of the form
y(t) = h(x(t),k), we need to make some additional minor modifications. The sensi-
tivity of the output vector to the parameters can be obtained by performing the
implicit differentiation to yield:

(∂y^T/∂k)^T = (∂h^T/∂x)^T (∂x^T/∂k)^T + (∂h^T/∂k)^T    (6.22)

Substitution into the linearized output vector (Equation 6.5) yields

y(t_i, k(j+1)) = h(x(t_i,k(j)), k(j)) + W(t_i) Δk(j+1)    (6.23)

where

W(t_i) = (∂h^T/∂x)^T G(t_i) + (∂h^T/∂k)^T    (6.24)

and hence the corresponding normal equations are obtained, i.e.,

A Δk(j+1) = b    (6.25)

where

A = Σ_{i=1}^{N} W^T(t_i) Q_i W(t_i)    (6.26)

and

b = Σ_{i=1}^{N} W^T(t_i) Q_i [ŷ_i − h(x(t_i,k(j)), k(j))]    (6.27)
If the nonlinear output relationship is independent of the parameters, i.e., it
is of the form

y(t) = h(x(t))    (6.28)

then W(t_i) simplifies to

W(t_i) = (∂h^T/∂x)^T G(t_i)    (6.29)

and the corresponding matrix A and vector b become

A = Σ_{i=1}^{N} G^T(t_i) (∂h^T/∂x) Q_i (∂h^T/∂x)^T G(t_i)    (6.30)

and

b = Σ_{i=1}^{N} G^T(t_i) (∂h^T/∂x) Q_i [ŷ_i − h(x(t_i,k(j)))]    (6.31)

In other words, the observation matrix C from the case of a linear output
relationship is substituted with the Jacobean matrix (∂h^T/∂x)^T in setting up matrix
A and vector b.
6.4 THE GAUSS-NEWTON METHOD - SYSTEMS WITH UNKNOWN
INITIAL CONDITIONS

Let us consider a system described by a set of ODEs as in Section 6.1:

dx(t)/dt = f(x(t), u, k) ; x(t0) = x0    (6.32)

The only difference here is that it is further assumed that some or all of the
components of the initial state vector x0 are unknown. Let the q-dimensional vec-
tor p (0 < q ≤ n) denote the unknown components of the vector x0. In this class of
parameter estimation problems, the objective is to determine not only the parame-
ter vector k but also the unknown vector p containing the unknown elements of
the initial state vector x(t0).

Again, we assume that experimental data are available as measurements of
the output vector at various points in time, i.e., [ŷ_i, t_i], i = 1,...,N. The objective
function that should be minimized is the same as before. The only difference is
that the minimization is carried out over k and p, namely the objective function is
viewed as

S(k,p) = Σ_{i=1}^{N} [ŷ_i − y(t_i;k,p)]^T Q_i [ŷ_i − y(t_i;k,p)]    (6.33)
Let us suppose that an estimate k(j) and p(j) of the unknown parameter and
initial state vectors is available at the jth iteration. Linearization of the output vec-
tor around k(j) and p(j) yields

y(t_i, k(j+1), p(j+1)) = y(t_i, k(j), p(j)) + (∂y^T/∂k)^T Δk(j+1) + (∂y^T/∂p)^T Δp(j+1)    (6.35)

Assuming a linear output relationship (i.e., y(t) = Cx(t)), the above equation
becomes

y(t_i, k(j+1), p(j+1)) = Cx(t_i, k(j), p(j)) + CG(t_i) Δk(j+1) + CP(t_i) Δp(j+1)    (6.36)

where G(t) is the usual n×p parameter sensitivity matrix (∂x^T/∂k)^T and P(t) is the
n×q initial state sensitivity matrix (∂x^T/∂p)^T.
The parameter sensitivity matrix G(t) can be obtained as shown in the pre-
vious section by solving the matrix differential equation

dG(t)/dt = (∂f^T/∂x)^T G(t) + (∂f^T/∂k)^T    (6.37)

with the initial condition

G(t0) = 0    (6.38)

Similar to the parameter sensitivity matrix, the initial state sensitivity matrix,
P(t), cannot be obtained by simple differentiation. P(t) is determined by solving
a matrix differential equation that is obtained by differentiating both sides of
Equation 6.1 (the state equation) with respect to p.
Reversing the order of differentiation and performing implicit differentiation
on the right-hand side, we arrive at

d/dt (∂x^T/∂p)^T = (∂f^T/∂x)^T (∂x^T/∂p)^T    (6.39)

or better

dP(t)/dt = (∂f^T/∂x)^T P(t)    (6.40)

The initial condition is obtained by differentiating both sides of the initial
condition, x(t0) = x0, with respect to p, yielding

P(t0) = [ I_q ; 0 ]    (6.41)

where I_q is the q×q identity matrix.
Without any loss of generality, it has been assumed that the unknown initial
states correspond to state variables that are placed as the first elements of the state
vector x(t); hence the structure of the initial condition in Equation 6.41.

Thus, integrating the state and sensitivity equations (Equations 6.1, 6.9 and
6.40), a total of n×(p+q+1) differential equations, the output vector, y(t,k(j+1),p(j+1)),
is obtained as a linear function of k(j+1) and p(j+1). Next, substitution of
y(t_i,k(j+1),p(j+1)) into the objective function and use of the stationary criteria

∂S(k,p)/∂k(j+1) = 0    (6.42a)

and

∂S(k,p)/∂p(j+1) = 0    (6.42b)

yields the following linear equation:
[ Σ G^T(t_i) C^T Q_i C G(t_i)   Σ G^T(t_i) C^T Q_i C P(t_i) ] [ Δk(j+1) ]   [ Σ G^T(t_i) C^T Q_i (ŷ_i − Cx(t_i)) ]
[ Σ P^T(t_i) C^T Q_i C G(t_i)   Σ P^T(t_i) C^T Q_i C P(t_i) ] [ Δp(j+1) ] = [ Σ P^T(t_i) C^T Q_i (ŷ_i − Cx(t_i)) ]    (6.43)

where all sums run over i = 1,...,N.

Solution of the above equation yields Δk(j+1) and Δp(j+1). The estimates
k(j+1) and p(j+1) are obtained next from

k(j+1) = k(j) + μ Δk(j+1) ; p(j+1) = p(j) + μ Δp(j+1)    (6.44)

where a stepping parameter μ (to be determined by the bisection rule) is also used.
If the initial guess k(0), p(0) is sufficiently close to the optimum, this pro-
cedure yields quadratic convergence to the optimum. However, the same diffi-
culties as those discussed earlier arise whenever the initial estimates are far from
the optimum.

If we consider the limiting case where p = 0 and q ≠ 0, i.e., the case where
there are no unknown parameters and only some of the initial states are to be esti-
mated, the previously outlined procedure represents a quadratically convergent
method for the solution of two-point boundary value problems. Obviously, in this
case we need to compute only the sensitivity matrix P(t). It can be shown that
under these conditions the Gauss-Newton method is a typical quadratically con-
vergent "shooting method." As such it can be used to solve optimal control prob-
lems using the Boundary Condition Iteration approach (Kalogerakis, 1983).
6.5 EXAMPLES

6.5.1 A Homogeneous Gas Phase Reaction

Bellman et al. (1967) have considered the estimation of the two rate con-
stants k1 and k2 in the Bodenstein-Linder model for the homogeneous gas phase
reaction of NO with O2:

2NO + O2 ⇌ 2NO2

The model is described by the following equation:

dx/dt = k1(α − x)(β − x)² − k2x² ; x(0) = 0    (6.45)

where α = 126.2, β = 91.9 and x is the concentration of NO2. The concentration of
NO2 was measured experimentally as a function of time and the data are given in
Table 6.1.
The model is of the form dx/dt = f(x,k1,k2), where f(x,k1,k2) = k1(α − x)(β − x)² −
k2x². The single state variable x is also the measured variable (i.e., y(t) = x(t)). The
sensitivity matrix, G(t), is a (1×2)-dimensional matrix with elements

G(t) = [ (∂x/∂k1), (∂x/∂k2) ] = [G1(t), G2(t)]    (6.46)
Table 6.1 Data for the Homogeneous Gas Phase Reaction of NO with O2

Time    Concentration of NO2
0       0
1       1.4
2       6.3
3       10.5
4       14.2
5       17.6
6       21.4
7       23.0
9       27.0
11      30.5
14      34.4
19      38.8
24      41.6
29      43.5
39      45.3

Source: Bellman et al. (1967).
In this case, Equation 6.16 simply becomes

dG1/dt = (∂f/∂x) G1 + (∂f/∂k1) ; G1(0) = 0    (6.47a)
and similarly for G2(t),

dG2/dt = (∂f/∂x) G2 + (∂f/∂k2) ; G2(0) = 0    (6.47b)
where

(∂f/∂x) = −k1(β − x)² − 2k1(α − x)(β − x) − 2k2x    (6.48a)

(∂f/∂k1) = (α − x)(β − x)²    (6.48b)

(∂f/∂k2) = −x²    (6.48c)
Equations 6.47a and 6.47b should be solved simultaneously with the state
equation (Equation 6.45). The three ODEs are put into the standard form (dz/dt =
φ(z)) used by differential equation solvers by setting

z = [x, G1, G2]^T    (6.49a)

and

φ(z) = [ k1(α−x)(β−x)² − k2x²
         −[k1(β−x)² + 2k1(α−x)(β−x) + 2k2x]·G1 + (α−x)(β−x)²
         −[k1(β−x)² + 2k1(α−x)(β−x) + 2k2x]·G2 − x² ]    (6.49b)

Integration of the above equations yields x(t) and G(t), which are used in set-
ting up matrix A and vector b at each iteration of the Gauss-Newton method.
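The following Python sketch (our own illustration, not part of the original text) integrates Equation 6.49 with SciPy and accumulates A and b for one Gauss-Newton iteration using the Table 6.1 data; here C = [1] and Q_i = 1 since the single state is measured directly, and the starting values for [k1, k2] are a hypothetical guess:

import numpy as np
from scipy.integrate import solve_ivp

alpha, beta = 126.2, 91.9
t_data = np.array([0, 1, 2, 3, 4, 5, 6, 7, 9, 11, 14, 19, 24, 29, 39], float)
y_data = np.array([0, 1.4, 6.3, 10.5, 14.2, 17.6, 21.4, 23.0, 27.0,
                   30.5, 34.4, 38.8, 41.6, 43.5, 45.3])

def phi(t, z, k1, k2):
    x, G1, G2 = z
    dfdx = -k1 * (beta - x)**2 - 2*k1*(alpha - x)*(beta - x) - 2*k2*x
    return [k1*(alpha - x)*(beta - x)**2 - k2*x**2,   # state equation (6.45)
            dfdx * G1 + (alpha - x)*(beta - x)**2,    # sensitivity w.r.t. k1
            dfdx * G2 - x**2]                         # sensitivity w.r.t. k2

def gauss_newton_step(k):
    sol = solve_ivp(phi, (0.0, t_data[-1]), [0.0, 0.0, 0.0], args=tuple(k),
                    t_eval=t_data, rtol=1e-8, atol=1e-10)
    x, G = sol.y[0], sol.y[1:].T                      # G has one row per data point
    A = G.T @ G                                       # Equation 6.12 (C = Q = 1)
    b = G.T @ (y_data - x)                            # Equation 6.13
    return np.linalg.solve(A, b)                      # Equation 6.11

k = np.array([1e-5, 1e-5])    # hypothetical initial guess for [k1, k2]
dk = gauss_newton_step(k)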
6.5.2 Pyrolytic Dehydrogenation of Benzene to Diphenyl and Triphenyl

Let us now consider the pyrolytic dehydrogenation of benzene to diphenyl
and triphenyl (Seinfeld and Gavalas, 1970; Hougen and Watson, 1948).
The following kinetic model has been proposed:

dx1/dt = −r1 − r2    (6.50a)

dx2/dt = r1/2 − r2    (6.50b)

where

r1 = k1[x1² − x2(2 − 2x1 − x2)/3K1]    (6.51a)

r2 = k2[x1x2 − (1 − x1 − 2x2)(2 − 2x1 − x2)/9K2]    (6.51b)

where x1 denotes lb-mole of benzene per lb-mole of pure benzene feed and x2 de-
notes lb-mole of diphenyl per lb-mole of pure benzene feed. The parameters k1 and
k2 are unknown reaction rate constants, whereas K1 and K2 are equilibrium con-
stants. The data consist of measurements of x1 and x2 in a flow reactor at eight
values of the reciprocal space velocity t and are given in Table 6.2. The feed to the re-
actor was pure benzene.
Table 6.2 Data for the Pyrolytic Dehydrogenation of Benzene

Reciprocal Space Velocity (t)×10⁴    x1       x2
5.63                                  0.828    0.0737
11.32                                 0.704    0.113
16.97                                 0.622    0.1322
22.62                                 0.565    0.1400
34.0                                  0.499    0.1468
39.7                                  0.482    0.1477
45.2                                  0.470    0.1477
169.7                                 0.443    0.1476

Source: Seinfeld and Gavalas (1970); Hougen and Watson (1948).
As both state variables are measured, the output vector is the same as the
state vector, i.e., y1 = x1 and y2 = x2. The equilibrium constants K1 and K2 were
determined from the run at the lowest space velocity to be 0.242 and 0.428,
respectively.
Using our standard notation, the above problem is written as follows:

dx/dt = f(x, k) ; x(0) = [1, 0]^T    (6.52)

where k = [k1, k2]^T, x = [x1, x2]^T, f1 = −r1 − r2 and f2 = r1/2 − r2.

The sensitivity matrix, G(t), is a (2×2)-dimensional matrix with elements

G(t) = [ G11  G12 ] = [ (∂x1/∂k1)  (∂x1/∂k2) ]
       [ G21  G22 ]   [ (∂x2/∂k1)  (∂x2/∂k2) ]    (6.53)

Equations 6.16 then become

dg1/dt = (∂f^T/∂x)^T g1 + (∂f/∂k1) ; g1(t0) = 0    (6.54a)

and

dg2/dt = (∂f^T/∂x)^T g2 + (∂f/∂k2) ; g2(t0) = 0    (6.54b)

Taking into account Equation 6.53, the above equations can also be written
as follows:

d/dt [G11 ; G21] = (∂f^T/∂x)^T [G11 ; G21] + (∂f/∂k1) ; G11(t0) = 0, G21(t0) = 0    (6.55a)

and

d/dt [G12 ; G22] = (∂f^T/∂x)^T [G12 ; G22] + (∂f/∂k2) ; G12(t0) = 0, G22(t0) = 0    (6.55b)
Finally, we obtain the following equations:

dG11/dt = (∂f1/∂x1)G11 + (∂f1/∂x2)G21 + (∂f1/∂k1) ; G11(t0) = 0    (6.56a)

dG21/dt = (∂f2/∂x1)G11 + (∂f2/∂x2)G21 + (∂f2/∂k1) ; G21(t0) = 0    (6.56b)

dG12/dt = (∂f1/∂x1)G12 + (∂f1/∂x2)G22 + (∂f1/∂k2) ; G12(t0) = 0    (6.56c)

dG22/dt = (∂f2/∂x1)G12 + (∂f2/∂x2)G22 + (∂f2/∂k2) ; G22(t0) = 0    (6.56d)

where

(∂f1/∂x1) = −k1(2x1 + 2x2/3K1) − k2[x2 + (4 − 4x1 − 5x2)/9K2]    (6.57a)

(∂f1/∂x2) = −(k1/3K1)(2x2 + 2x1 − 2) − k2[x1 + (5 − 5x1 − 4x2)/9K2]    (6.57b)

(∂f2/∂x1) = (k1/2)(2x1 + 2x2/3K1) − k2[x2 + (4 − 4x1 − 5x2)/9K2]    (6.57c)

(∂f2/∂x2) = (k1/6K1)(2x2 + 2x1 − 2) − k2[x1 + (5 − 5x1 − 4x2)/9K2]    (6.57d)

(∂f1/∂k1) = −r1/k1 ; (∂f2/∂k1) = r1/2k1    (6.57e)

(∂f1/∂k2) = −r2/k2 ; (∂f2/∂k2) = −r2/k2    (6.57f)
The four sensitivity equations (Equations 6.56a-d) should be solved simul-
taneously with the two state equations (Equation 6.52). Integration of these six
[= n×(p+1) = 2×(2+1)] equations yields x(t) and G(t), which are used in setting up
matrix A and vector b at each iteration of the Gauss-Newton method.

The ordinary differential equation that a particular element, G_ij, of the (n×p)-
dimensional sensitivity matrix satisfies can be written directly using the following
expression:

dG_ij/dt = Σ_{l=1}^{n} (∂f_i/∂x_l)·G_lj + (∂f_i/∂k_j) ; G_ij(t0) = 0    (6.58)
6.5.3 Catalytic Hydrogenation of 3-Hydroxypropanal (HPA) to
1,3-Propanediol (PD)

The hydrogenation of 3-hydroxypropanal (HPA) to 1,3-propanediol (PD)
over Ni/SiO2/Al2O3 catalyst powder was studied by Professor Hoffman's group at
the Friedrich-Alexander University in Erlangen, Germany (Zhu et al., 1997). PD is
a potentially attractive monomer for polymers like polypropylene terephthalate.
They used a batch stirred autoclave. The experimental data were kindly provided
by Professor Hoffman and consist of measurements of the concentrations of HPA
and PD (C_HPA, C_PD) versus time at various operating temperatures and pressures.
The complete data set will be given in the case studies section. In this chapter, we
will discuss how we set up the equations for the regression of an isothermal data
set given in Table 6.3 or 6.4.
The same group also proposed a reaction scheme and a mathematical model
that describe the rates of HPA consumption, PD formation, as well as the forma-
tion of acrolein (Ac). The model is as follows:

dC_HPA/dt = −(r1 + r2)·Ck − (r3 + r4 − r−3)    (6.59a)

dC_PD/dt = (r1 − r2)·Ck    (6.59b)

dC_Ac/dt = r3 − r−3 − r4    (6.59c)

where Ck is the concentration of the catalyst (10 g/L). The reaction rates are given
below:

(6.60a)

(6.60b)

r3 = k3·C_HPA    (6.60c)

r−3 = k−3·C_Ac    (6.60d)

r4 = k4·C_Ac·C_HPA    (6.60e)

In the above equations, k_j (j = 1, 2, 3, −3, 4) are rate constants (L/(mol·min·g)),
K1 and K2 are the adsorption equilibrium constants (L/mol) for H2 and HPA re-
spectively. P is the hydrogen pressure (MPa) in the reactor and H is the Henry's
law constant with a value equal to 1379 (L·bar/mol) at 298 K. The seven parame-
ters (k1, k2, k3, k−3, k4, K1 and K2) are to be determined from the measured concen-
trations of HPA and PD.
Table 6.3 Data for the Catalytic Hydrogenation of 3-Hydroxypropanal
(HPA) to 1,3-Propanediol (PD) at 5.15 MPa and 45°C

t (min)    C_HPA (mol/L)    C_PD (mol/L)
0.0        1.34953          0.0
10         1.36324          0.00262812
20         1.25882          0.0700394
30         1.17918          0.184363
40         0.972102         0.354008
50         0.825203         0.469777
60         0.697109         0.607359
80         0.421451         0.852431
100        0.232296         1.03535
120        0.128095         1.16413
140        0.0289817        1.30053
160        0.00962368       1.31971

Source: Zhu et al. (1997).

Table 6.4 Data for the Catalytic Hydrogenation of 3-Hydroxypropanal
(HPA) to 1,3-Propanediol (PD) at 5.15 MPa and 80°C

t (min)    C_HPA (mol/L)    C_PD (mol/L)
0.0        1.34953          0.0
5          0.873513         0.388568
10         0.44727          0.816032
15         0.140925         0.967017
20         0.0350076        1.05125
25         0.0130859        1.08239
30         0.00581597       1.12024

Source: Zhu et al. (1997).
In order to use our standard notation we introduce the following vectors:

x = [x1, x2, x3]^T = [C_HPA, C_PD, C_Ac]^T    (6.61a)

k = [k1, k2, k3, k4, k5, K1, K2]^T    (6.61b)

where k4 and k5 denote the rate constants k−3 and k4 of Equations 6.60, respectively, and

u = [u1, u2]^T = [Ck, P]^T    (6.61c)

Hence, the differential equation model takes the form

dx/dt = f(x, u, k) ; x(0) = x0

and the observation matrix is simply

C = [ 1  0  0 ]
    [ 0  1  0 ]    (6.62)
In Equations 6.61, u1 denotes the concentration of catalyst present in the re-
actor (Ck) and u2 the hydrogen pressure (P). As far as the estimation problem is
concerned, both these variables are assumed to be known precisely. Actually, as
will be discussed later under experimental design (Chapter 12), the values of such
variables are chosen by the experimentalist and can have a paramount effect on the
quality of the parameter estimates. The model equations (6.59) are rewritten as follows:

dx1/dt = −u1(r1 + r2) − (k3x1 + k5x3x1 − k4x3)    (6.63a)

dx2/dt = u1(r1 − r2)    (6.63b)

dx3/dt = k3x1 − k5x3x1 − k4x3    (6.63c)

where

(6.64a)

(6.64b)

r3 = k3x1    (6.64c)

r−3 = k4x3    (6.64d)

r4 = k5x3x1    (6.64e)
The sensitivity matrix, G(t), is a (3×7)-dimensional matrix with elements

G(t) = [ (∂x1/∂k1)  (∂x1/∂k2)  ...  (∂x1/∂k7) ]
       [ (∂x2/∂k1)  (∂x2/∂k2)  ...  (∂x2/∂k7) ]
       [ (∂x3/∂k1)  (∂x3/∂k2)  ...  (∂x3/∂k7) ]    (6.65)

Equations 6.16 then become

dg_j/dt = (∂f^T/∂x)^T g_j + (∂f/∂k_j) ; g_j(t0) = 0 , j = 1,2,...,7    (6.66)

where

(∂f^T/∂x)^T = [ (∂f1/∂x1)  (∂f1/∂x2)  (∂f1/∂x3) ]
              [ (∂f2/∂x1)  (∂f2/∂x2)  (∂f2/∂x3) ]
              [ (∂f3/∂x1)  (∂f3/∂x2)  (∂f3/∂x3) ]    (6.67a)

and

(∂f/∂k_j) = [ (∂f1/∂k_j), (∂f2/∂k_j), (∂f3/∂k_j) ]^T ; j = 1,2,...,7    (6.67b)

Taking into account the above equations, we obtain the matrix differential
equation

dG(t)/dt = (∂f^T/∂x)^T G(t) + (∂f^T/∂k)^T ; G(t0) = 0    (6.68)
The partial derivatives with respect to the state variables in Equation 6.67a that are
needed in the above ODEs are given by Equations 6.69a through 6.69i, one for
each element (∂f_i/∂x_j) of the (3×3) Jacobean; they follow directly by differentiat-
ing Equations 6.63 and 6.64. For instance, from Equation 6.63c,

(∂f3/∂x1) = k3 − k5x3 , (∂f3/∂x2) = 0 , (∂f3/∂x3) = −k4 − k5x1
The partial derivatives with respect to the parameters in Equation 6.67b that
are needed in the above ODEs are given by Equations 6.70a through 6.70f, obtained
by differentiating Equations 6.63 and 6.64 with respect to each of the seven
parameters. For instance, from Equation 6.63c,

(∂f3/∂k3) = x1 , (∂f3/∂k4) = −x3 , (∂f3/∂k5) = −x3x1

with the remaining elements involving the derivatives of r1 and r2 with respect to
the parameters.
The 21 sensitivity equations (given as Equation 6.68) should be solved simultaneously with the three state equations (Equations 6.63). Integration of these 24 equations yields x(t) and G(t), which are used in setting up matrix A and vector b at each iteration of the Gauss-Newton method. Given the complexity of the ODEs when the dimensionality of the problem increases, it is quite helpful to have a general purpose computer program that sets up the sensitivity equations automatically. Furthermore, since analytical derivatives are subject to user input error, numerical evaluation of the derivatives can also be used in a typical computer implementation of the Gauss-Newton method. Details for a successful implementation of the method are given in Chapter 8.
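To make the mechanics concrete, the following sketch (ours, not the code accompanying this book) sets up one Gauss-Newton iteration for an ODE model in Python, generating the sensitivity matrix G(t) by central finite differences on the parameters instead of the analytically derived sensitivity ODEs. The model function f(t, x, k), the observation matrix C and the weighting matrix Q are assumed to be user-supplied; all names are hypothetical.

    # Sketch of one Gauss-Newton iteration for an ODE model, with the
    # sensitivity matrix G(t) built by central finite differences on k
    # instead of integrating the analytically derived sensitivity ODEs.
    import numpy as np
    from scipy.integrate import solve_ivp

    def integrate_states(f, x0, times, k):
        """Integrate dx/dt = f(t, x, k) and return x at the sampling times."""
        sol = solve_ivp(f, (times[0], times[-1]), x0, t_eval=times,
                        args=(k,), method="LSODA", rtol=1e-8, atol=1e-10)
        return sol.y.T                       # shape (N, n)

    def sensitivities_fd(f, x0, times, k, rel_step=1e-5):
        """G(t_i) = dx/dk by central differences; returns shape (N, n, p)."""
        n, p = len(x0), len(k)
        G = np.zeros((len(times), n, p))
        for j in range(p):
            dk = rel_step * max(abs(k[j]), 1e-10)
            kp, km = k.copy(), k.copy()
            kp[j] += dk
            km[j] -= dk
            G[:, :, j] = (integrate_states(f, x0, times, kp)
                          - integrate_states(f, x0, times, km)) / (2.0 * dk)
        return G

    def gauss_newton_step(f, x0, times, ymeas, k, C, Q):
        """Assemble matrix A and vector b and return the increment dk."""
        X = integrate_states(f, x0, times, k)
        G = sensitivities_fd(f, x0, times, k)
        p = len(k)
        A, b = np.zeros((p, p)), np.zeros(p)
        for i in range(len(times)):
            CG = C @ G[i]                    # sensitivity of the outputs
            r = ymeas[i] - C @ X[i]          # output residual at t_i
            A += CG.T @ Q @ CG
            b += CG.T @ Q @ r
        return np.linalg.solve(A, b)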
6.6 EQUIVALENCE OF GAUSS-NEWTON WITH THE QUASILINEARIZATION METHOD
The quasilinearization method (QM) is another method for solving off-line
parameter estimation problems described by Equations 6.1, 6.2 and 6.3 (Bellman
and Kalaba, 1965). Quasilinearization converges quadratically to the optimum but
has a small region of convergence (Seinfeld and Gavalas, 1970). Kalogerakis and Luus (1983b) presented an alternative development of the QM that enables a more
efficient implementation of the algorithm.
Furthermore, they showed that this simplified QM is very similar to the
Gauss-Newton method. Next the quasilinearization method as well as the simpli-
fied quasilinearization method are described and the equivalence of QM to the
Gauss-Newton method is demonstrated.
6.6.1 The Quasilinearization Method and its Simplification

An estimate k^(j) of the unknown parameter vector is available at the jth iteration. Equation 6.1 then becomes

\frac{dx^{(j)}(t)}{dt} = f(x^{(j)}(t), k^{(j)})     (6.71)

Using the parameter estimate k^(j+1) from the next iteration we obtain from Equation 6.1

\frac{dx^{(j+1)}(t)}{dt} = f(x^{(j+1)}(t), k^{(j+1)})     (6.72)

By using a Taylor series expansion on the right hand side of Equation 6.72 and keeping only the linear terms we obtain the following equation

\frac{dx^{(j+1)}(t)}{dt} = f(x^{(j)}, k^{(j)}) + \left(\frac{\partial f}{\partial x}\right)^T \left[ x^{(j+1)}(t) - x^{(j)}(t) \right] + \left(\frac{\partial f}{\partial k}\right)^T \left[ k^{(j+1)} - k^{(j)} \right]     (6.73)

where the partial derivatives are evaluated at x^(j)(t). The above equation is linear in x^(j+1) and k^(j+1). Integration of Equation 6.73 will result in the following equation
x^{(j+1)}(t) = g(t) + G(t)\, k^{(j+1)}     (6.74)

where g(t) is an n-dimensional vector and G(t) is an n×p matrix. Equation 6.74 is differentiated and the RHS of the resultant equation is equated with the RHS of Equation 6.73 to yield

\frac{dg(t)}{dt} = f(x^{(j)}, k^{(j)}) + \left(\frac{\partial f}{\partial x}\right)^T \left[ g(t) - x^{(j)}(t) \right] - \left(\frac{\partial f}{\partial k}\right)^T k^{(j)}     (6.75)

and

\frac{dG(t)}{dt} = \left(\frac{\partial f}{\partial x}\right)^T G(t) + \left(\frac{\partial f}{\partial k}\right)^T     (6.76)

The initial conditions for Equations 6.75 and 6.76 are as follows

g(t_0) = x_0     (6.77a)

G(t_0) = 0     (6.77b)
Equations 6.71, 6.75 and 6.76 can be solved simultaneously to yield g(t) and G(t) when the initial state vector x0 and the parameter estimate vector k^(j) are given. In order to determine k^(j+1), the output vector (given by Equation 6.3) is inserted into the objective function (Equation 6.4) and the stationary condition yields,

\frac{\partial S}{\partial k^{(j+1)}} = 0     (6.78)

The case of a nonlinear observational relationship (Equation 6.3) will be examined later. Equation 6.78 yields the following linear equation which is solved by LU decomposition (or any other technique) to obtain k^(j+1)

\left[ \sum_{i=1}^{N} G^T(t_i)\, C^T Q_i C\, G(t_i) \right] k^{(j+1)} = \sum_{i=1}^{N} G^T(t_i)\, C^T Q_i \left[ \hat{y}_i - C\, g(t_i) \right]     (6.79)

As the coefficient matrix is positive definite, the above equation gives the minimum of the objective function.
Since linearization of the differential Equation 6.1 around the trajectory x^(j)(t), resulting from the choice of k^(j), has been used, the above method gives k^(j+1) which is an approximation to the best parameter vector. Using this value as k^(j), a new k^(j+1) can be obtained and thus a sequence of vectors k^(0), k^(1), k^(2),... is obtained. This sequence converges rapidly to the optimum provided that the initial guess is sufficiently good. The above described methodology constitutes the Quasilinearization Method (QM). The total number of differential equations which must be integrated at each iteration step is n×(p+2).
Kalogerakis and Luus (1983b) noticed that Equation 6.75 is redundant. Since Equation 6.74 is obtained by linearization around the nominal trajectory x^(j)(t) resulting from k^(j), if we let k^(j+1) be k^(j) then Equation 6.74 becomes

x^{(j)}(t) = g(t) + G(t)\, k^{(j)}     (6.80)

Equation 6.80 is exact rather than a first order approximation as Equation 6.74 is. This is simply because Equation 6.80 is Equation 6.74 evaluated at the point of linearization, k^(j). Thus Equation 6.80 can be used to compute g(t) as

g(t) = x^{(j)}(t) - G(t)\, k^{(j)}     (6.81)

It is obvious that the use of Equation 6.81 leads to a simplification because the number of differential equations that now need to be integrated is n×(p+1). Kalogerakis and Luus (1983b) then proposed the following algorithm for the QM.
Step 1. Select an initial guess k^(0). Hence j=0.

Step 2. Integrate Equations 6.71 and 6.76 simultaneously to obtain x^(j)(t) and G(t).

Step 3. Use Equation 6.81 to obtain g(t_i), i=1,2,...,N and set up matrix A and vector b in Equation 6.79.

Step 4. Solve Equation 6.79 to obtain k^(j+1).

Step 5. Continue until

||k^(j+1) - k^(j)|| ≤ TOL     (6.82)

where TOL is a preset small number to ensure termination of the iterations. If the above inequality is not satisfied then we set k^(j) = k^(j+1), increase j by one and go to Step 2 to repeat the calculations.
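A minimal sketch of the above five steps follows, assuming two hypothetical helper routines: one that integrates Equations 6.71 and 6.76 (Step 2) and one that applies Equations 6.81 and 6.79 (Steps 3 and 4).

    # Minimal sketch of the simplified quasilinearization loop (Steps 1-5).
    # integrate_x_and_G and solve_normal_equations are hypothetical helpers:
    # the former integrates Equations 6.71 and 6.76, the latter forms g(t_i)
    # from Equation 6.81 and solves the linear system of Equation 6.79.
    import numpy as np

    def simplified_qm(k0, integrate_x_and_G, solve_normal_equations,
                      tol=1e-6, max_iter=50):
        k = np.asarray(k0, dtype=float)              # Step 1: initial guess
        for _ in range(max_iter):
            x, G = integrate_x_and_G(k)              # Step 2
            k_new = solve_normal_equations(x, G, k)  # Steps 3 and 4
            if np.linalg.norm(k_new - k) <= tol:     # Step 5: Equation 6.82
                return k_new
            k = k_new
        return k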
6.6.2 Equivalence to Gauss-Newton Method
If we compare Equations 6.79 and 6.11 we notice that the only difference between the quasilinearization method and the Gauss-Newton method is the nature of the equation that yields the parameter estimate vector k^(j+1). If one substitutes Equation 6.81 into Equation 6.79 one obtains the following equation

\left[ \sum_{i=1}^{N} G^T(t_i)\, C^T Q_i C\, G(t_i) \right] k^{(j+1)} = \sum_{i=1}^{N} G^T(t_i)\, C^T Q_i \left[ \hat{y}_i - C\, x^{(j)}(t_i) + C\, G(t_i)\, k^{(j)} \right]     (6.83)

By taking the last term on the right hand side of Equation 6.83 to the left hand side one obtains Equation 6.11 that is used for the Gauss-Newton method.
Hence, when the output vector is linearly related to the state vector (Equation 6.2)
then the simplified quasilinearization method is computationally identical to the
Gauss-Newton method.
Kalogerakis and Luus (1983b) compared the computational effort required
by Gauss-Newton, simplified quasilinearization and standard quasilinearization
methods. They found that all methods produced the same new estimates at each
iteration as expected. Furthermore, the required computational time for the Gauss-
Newton and the simplified quasilinearization was the same and about 90% of that
required by the standard quasilinearization method.
6.6.3 Nonlinear Output Relationship
When the output vector is nonlinearly related to the state vector (Equation 6.3), substitution of x^(j+1)(t) from Equation 6.74 into Equation 6.3, followed by substitution of the resulting equation into the objective function (Equation 6.4) and application of the stationary condition (Equation 6.78), yields a set of p nonlinear equations which can be solved to obtain k^(j+1). The solution of this set of equations can be accomplished by two methods: by employing Newton's method or, alternatively, by linearizing the output vector around the trajectory x^(j)(t). Kalogerakis and Luus (1983b) showed that when linearization of the output vector is used, the quasilinearization computational algorithm and the Gauss-Newton method yield the same results.
7
Shortcut Estimation Methods for
Ordinary Differential Equation (ODE)
Models
Whenever the whole state vector is measured, i.e., when y(t_i) = x(t_i), i=1,...,N, we can employ approximations of the time-derivatives of the state variables or make use of suitable integrals and thus reduce the parameter estimation problem from an ODE system to one for an algebraic equation system. We shall present two approaches: one approximating the time derivatives of the state variables (the derivative approach) and the other approximating suitable integrals of the state variables (the integral approach). In addition, in this chapter we present the method-
ology for estimating average kinetic rates (e.g., specific growth rates, specific up-
take rates, or specific secretion rates) in biological systems operating in the batch,
fed-batch, continuous or perfusion mode. These estimates are routinely used by
analysts to compare the productivity or growth characteristics among different
microorganism populations or cell lines.
7.1 ODE MODELS WITH LINEAR DEPENDENCE ON THE
PARAMETERS
Let us consider the special class of problems where all state variables are
measured and the parameters enter in a linear fashion into the governing differen-
tial equations. As usual, we assume that x is the n-dimensional vector of state vari-
ables and k is the p-dimensional vector of unknown parameters. The structure of
the ODE model is of the form

\frac{dx_i(t)}{dt} = \sum_{j=1}^{p} \varphi_{ij}(x)\, k_j ;   i=1,...,n     (7.1)

where φ_ij(x), i=1,...,n; j=1,...,p are known functions of the state variables only.
Quite often these functions are suitable products of the state variables especially
when the model is that of a homogeneous reacting system. In a more compact form Equation 7.1 can be written as

\frac{dx}{dt} = \Phi(x)\, k     (7.2)

where the n×p dimensional matrix Φ(x) has as elements the functions φ_ij(x), i=1,...,n; j=1,...,p. It is also assumed that the initial condition x(t_0) = x_0 is known precisely.
7.1.1 Derivative Approach

In this case we approximate the time derivatives on the left hand side of Equation 7.1 numerically. In order to minimize the potential numerical errors in the evaluation of the derivatives, it is strongly recommended to smooth the data first by polynomial fitting. The order of the polynomial should be the lowest possible that fits the measurements satisfactorily.

The time-derivatives can be estimated analytically from the smoothed data as

\eta_i \equiv \left. \frac{d\hat{x}(t)}{dt} \right|_{t=t_i}     (7.3)

where x̂(t) is the fitted polynomial. If for example a second or third order polynomial has been used, x̂(t) will be given respectively by

\hat{x}(t) = b_0 + b_1 t + b_2 t^2     (7.4)

\hat{x}(t) = b_0 + b_1 t + b_2 t^2 + b_3 t^3     (7.5)
The best and easiest way to smooth the data and avoid misuse of the polynomial curve fitting is by employing smooth cubic splines. IMSL provides two routines for this purpose: CSSCV and CSSMH. The latter is more versatile as it gives the user the option to apply different levels of smoothing by controlling a single parameter. Furthermore, IMSL routines CSVAL and CSDER can be used, once the coefficients of the cubic splines have been computed by CSSMH, to calculate the smoothed values of the state variables and their derivatives respectively.

Having the smoothed values of the state variables at each sampling point and the derivatives, η_i, we have essentially transformed the problem to a "usual" linear regression problem. The parameter vector is obtained by minimizing the following LS objective function

S(k) = \sum_{i=1}^{N} \left[ \eta_i - \Phi(\hat{x}_i)\, k \right]^T Q_i \left[ \eta_i - \Phi(\hat{x}_i)\, k \right]     (7.6)

where x̂_i is the smoothed value of the measured state variables at t=t_i. Any good linear regression package can be used to solve this problem.
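The same computations can be scripted with any smoothing-spline library; the sketch below substitutes SciPy's UnivariateSpline for the IMSL routines (a substitution on our part, not the book's code) to smooth the data, evaluate the derivatives η_i and solve the linear least squares problem of Equation 7.6. The function phi, returning the n×p matrix Φ(x̂), is a hypothetical user-supplied routine.

    # Derivative approach for dx/dt = Phi(x) k: smooth each measured state
    # with a cubic smoothing spline, differentiate the spline analytically,
    # and solve the resulting linear least squares problem for k.
    import numpy as np
    from scipy.interpolate import UnivariateSpline

    def estimate_k_derivative(t, xdata, phi, smoothing=None):
        """t: (N,), xdata: (N, n) raw measurements, phi(x) -> (n, p)."""
        N, n = xdata.shape
        splines = [UnivariateSpline(t, xdata[:, j], k=3, s=smoothing)
                   for j in range(n)]
        rows, rhs = [], []
        for ti in t:
            xhat = np.array([s(ti) for s in splines])              # smoothed states
            eta = np.array([s.derivative()(ti) for s in splines])  # dx/dt
            rows.append(phi(xhat))                                 # (n, p) block
            rhs.append(eta)
        A = np.vstack(rows)                                        # (N*n, p)
        y = np.concatenate(rhs)
        k, *_ = np.linalg.lstsq(A, y, rcond=None)
        return k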
However, an important question that needs to be answered is "what constitutes a satisfactory polynomial fit?" An answer can come from the following simple reasoning.
ple reasoning. The purpose of the polynomial fit is to smooth the data, namely, to
remove only the measurement error (noise) from the data. If the mathematical
(ODE) model under consideration is indeed the true model (or simply an adequate
one) then the calculated values of the output vector based on the ODE model
should correspond to the error-free measurements. Obviously, these model-
calculated values should ideally be the same as the smoothed data assuming that
the correct amount of data-filtering has taken place.
Therefore, in our view polynomial fitting for data smoothing should be an
iterative process. First we start with the visually best choice of polynomial order or
smooth cubic splines. Subsequently, we estimate the unknown parameters in the
ODE model and generate model calculated values of the output vector. Plotting of
the raw data, the smoothed data and the ODE model calculated values in the same
graph enables visual inspection. If the smoothed data are reasonably close to the model calculated values and the residuals appear to be normal, the polynomial fitting was done correctly. If not, we should go back and redo the polynomial fitting. In addition, we should make sure that the ODE model is adequate. If it is not, the above procedure fails. In this case, the data smoothing should be based simply on the requirement that the differences between the raw data and the smoothed
Finally, the user should always be aware of the danger in getting numerical
estimates of the derivatives from the data. Different smoothing cubic splines or
polynomials can result in similar values for the state variables and at the same time
have widely different estimates of the derivatives. This problem can be controlled
by the previously mentioned iterative procedure if we pay attention not only to the
values of the state variables but to their derivatives as well.
7.1.2 Integral Approach

In this case instead of approximating the time derivatives on the left hand side of Equation 7.1, we integrate both sides with respect to time. Namely, integration between t0 and ti of Equation 7.1 yields,

\int_{t_0}^{t_i} \frac{dx(t)}{dt}\, dt = \int_{t_0}^{t_i} \Phi(x(t))\, k\, dt     (7.7)

which upon expansion of the integrals yields

x(t_i) - x(t_0) = \left[ \int_{t_0}^{t_i} \Phi(x(t))\, dt \right] k     (7.8)

Noting that the initial conditions are known (x(t_0) = x_0), the above equations can be rewritten as

x_j(t_i) = x_{j,0} + \sum_{r=1}^{p} k_r \int_{t_0}^{t_i} \varphi_{jr}(x)\, dt ;   j=1,...,n     (7.9)

or more compactly as

x(t_i) = x_0 + \Psi(t_i)\, k     (7.10)

where

\psi_{jr}(t_i) = \int_{t_0}^{t_i} \varphi_{jr}(x)\, dt ;   j=1,...,n & r=1,...,p     (7.11)

The above integrals can be calculated since we have measurements of all the state variables as a function of time. In particular, to obtain a good estimate of these integrals it is strongly recommended to smooth the data first using a polynomial fit of a suitable order. Therefore, the integrals ψ_jr(t_i) should be calculated as

\psi_{jr}(t_i) = \int_{t_0}^{t_i} \varphi_{jr}(\hat{x})\, dt ;   j=1,...,n & r=1,...,p     (7.12)

where x̂(t) is the fitted polynomial. The same guidelines for the selection of the smoothing polynomials apply.
Having the smoothed values of the state variables at each sampling point, x̂_i, and the integrals, Ψ(t_i), we have essentially transformed the problem to a "usual" linear regression problem. The parameter vector is obtained by minimizing the following LS objective function

S(k) = \sum_{i=1}^{N} \left[ \hat{x}(t_i) - x_0 - \Psi(t_i)\, k \right]^T Q_i \left[ \hat{x}(t_i) - x_0 - \Psi(t_i)\, k \right]     (7.13)

The above linear regression problem can be readily solved using any standard linear regression package.
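A corresponding sketch for the integral approach, again with hypothetical names; the integrals of Equation 7.12 are approximated here with the trapezoid rule applied to the smoothed trajectories.

    # Integral approach for dx/dt = Phi(x) k: integrate the known functions
    # phi_jr along the smoothed trajectory and regress x(t_i) - x0 on Psi(t_i).
    import numpy as np

    def estimate_k_integral(t, xsmooth, phi):
        """t: (N,), xsmooth: (N, n) smoothed states, phi(x) -> (n, p)."""
        N, n = xsmooth.shape
        vals = np.stack([phi(xsmooth[i]) for i in range(N)])   # (N, n, p)
        rows, rhs = [], []
        Psi = np.zeros_like(vals[0])
        for i in range(1, N):
            # cumulative trapezoidal integration of each phi_jr from t0 to t_i
            Psi = Psi + 0.5 * (t[i] - t[i - 1]) * (vals[i] + vals[i - 1])
            rows.append(Psi.copy())                 # Psi(t_i), Equation 7.11
            rhs.append(xsmooth[i] - xsmooth[0])     # x(t_i) - x0
        A = np.vstack(rows)                         # (N-1)*n rows, p columns
        y = np.concatenate(rhs)
        k, *_ = np.linalg.lstsq(A, y, rcond=None)
        return k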
7.2 GENERALIZATION TO ODE MODELS WITH NONLINEAR
DEPENDENCE ON THE PARAMETERS
The derivative approach described previously can be readily extended to ODE models where the unknown parameters enter in a nonlinear fashion. The exact same procedure to obtain good estimates of the time derivatives of the state variables at each sampling point, t_i, can be followed. Thus the governing ODE

\frac{dx(t)}{dt} = f(x(t), k)     (7.14)

can be written at each sampling point as

\eta_i = f(\hat{x}_i, k)     (7.15)

where again η_i is the derivative of the fitted polynomial evaluated at t=t_i. Having the smoothed values of the state variables at each sampling point and having estimated analytically the time derivatives, η_i, we have transformed the problem to a usual nonlinear regression problem for algebraic models. The parameter vector is obtained by minimizing the following LS objective function

S(k) = \sum_{i=1}^{N} \left[ \eta_i - f(\hat{x}_i, k) \right]^T Q_i \left[ \eta_i - f(\hat{x}_i, k) \right]     (7.16)

where x̂_i is the smoothed value of the measured state variables at t=t_i.
The above parameter estimation problem can now be solved with any estimation method for algebraic models. Again, our preference is to use the Gauss-Newton method as described in Chapter 4.

The only drawback in using this method is that any numerical errors introduced in the estimation of the time derivatives of the state variables have a direct effect on the estimated parameter values. Furthermore, by this approach we cannot readily calculate confidence intervals for the unknown parameters. This method is the standard procedure used by the General Algebraic Modeling System (GAMS) for the estimation of parameters in ODE models when all state variables are observed.
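Minimization of Equation 7.16 can be carried out with any nonlinear regression routine; the sketch below uses SciPy's least_squares in place of the Gauss-Newton code of Chapter 4 (our substitution), with hypothetical arrays eta and xhat holding the estimated derivatives and smoothed states.

    # Derivative approach when k enters nonlinearly: minimize Equation 7.16
    # with a standard nonlinear least squares routine. eta (N, n) holds the
    # spline-estimated derivatives, xhat (N, n) the smoothed states, and
    # f(x, k) the model right-hand side; all names are hypothetical.
    import numpy as np
    from scipy.optimize import least_squares

    def residuals(k, xhat, eta, f):
        return np.concatenate([eta[i] - f(xhat[i], k)
                               for i in range(len(eta))])

    # Example use, given f, xhat, eta and an initial guess k0:
    # result = least_squares(residuals, k0, args=(xhat, eta, f), method="lm")
    # k_est = result.x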
7.3 ESTIMATION OF APPARENT RATES IN BIOLOGICAL SYSTEMS
In biochemical engineering we are often faced with the problem of estimat-
ing average apparent growth or uptake/secretion rates. Such estimates are particu-
larly useful when we compare the productivity of a culture under different operat-
ing conditions or modes of operation. Such computations are routinely done by
analysts well before any attempt is made to estimate true kinetic parameters like those appearing in the Monod growth model, for example.
In this section we shall use the standard notation employed by biochemical engineers and industrial microbiologists in presenting the material. Thus if we denote by Xv the viable cell (cells/L) or biomass (mg/L) concentration, S the limiting substrate concentration (mmol/L) and P the product concentration (mmol/L) in the bioreactor, the dynamic component mass balances yield the following ODEs for each mode of operation:

Batch Experiments

\frac{dX_v}{dt} = \mu X_v     (7.17)

\frac{dS}{dt} = -q_s X_v     (7.18)

\frac{dP}{dt} = q_p X_v     (7.19)

where μ is the apparent specific growth rate (1/h), qp is the apparent specific secretion/production rate (mmol/(mg·h)) and qs is the apparent specific uptake rate (mmol/(mg·h)).
Fed-Batch Experiments

In this case there is a continuous addition of nutrients to the culture by the feed stream. There is no effluent stream. The governing equations are:

\frac{dX_v}{dt} = (\mu - D) X_v     (7.20)

\frac{dS}{dt} = -q_s X_v + D(S_f - S)     (7.21)

\frac{dP}{dt} = q_p X_v - D P     (7.22)

where D is the dilution factor (1/h) defined as the feed flowrate over volume (F/V) and Sf (mmol/L) is the substrate concentration in the feed. For fed-batch cultures the volume V is increasing continuously until the end of the culture period according to the feeding rate as follows

\frac{dV}{dt} = F     (7.23)
Continuous (Chemostat) Experiments

In this case, there is a continuous supply of nutrients and a continuous withdrawal of the culture broth including the submerged free cells. The governing equations for continuous cultures are the same as the ones for fed-batch cultures (Equations 7.20-7.22). The only difference is that the feed flowrate is normally equal to the effluent flowrate (Fin = Fout = F) and hence the volume, V, stays constant throughout the culture.
Perfusion Experiments

Perfusion cultures of submerged free cells are essentially continuous cultures with a cell retention device so that no cells exit in the effluent stream. The governing ODEs are

\frac{dX_v}{dt} = \mu X_v     (7.24)

\frac{dS}{dt} = -q_s X_v + D(S_f - S)     (7.25)

\frac{dP}{dt} = q_p X_v - D P     (7.26)
Formulation of the Problem

Having measurements of Xv, S and P over time, determine the apparent specific growth rate (μ), the apparent specific uptake rate (qs) and the apparent specific secretion rate (qp) at any particular point during the culture, or obtain an average value over a user-specified time period.
7.3.1 Derivative Approach

This approach is based on the rearrangement of the governing ODEs to yield expressions for the different specific rates, namely,

\mu = \begin{cases} \dfrac{1}{X_v}\dfrac{dX_v}{dt} & \text{batch & perfusion} \\ \dfrac{1}{X_v}\dfrac{dX_v}{dt} + D & \text{fed-batch & continuous} \end{cases}     (7.27)

The specific uptake and secretion rates are given by

q_s = \begin{cases} -\dfrac{1}{X_v}\dfrac{dS}{dt} & \text{batch} \\ -\dfrac{1}{X_v}\left[\dfrac{dS}{dt} - D(S_f - S)\right] & \text{fed-batch, continuous & perfusion} \end{cases}     (7.28)

q_p = \begin{cases} \dfrac{1}{X_v}\dfrac{dP}{dt} & \text{batch} \\ \dfrac{1}{X_v}\left[\dfrac{dP}{dt} + DP\right] & \text{fed-batch, continuous & perfusion} \end{cases}     (7.29)
Based on the above expressions, it is obvious that by this approach one needs to estimate numerically the time derivatives of Xv, S and P as a function of time. This can be accomplished by following the same steps as described in Section 7.1.1. Namely, we must first smooth the data by a polynomial fitting and then estimate the derivatives. However, cell culture data are often very noisy and hence, the numerical estimation of time derivatives may be subject to large estimation errors.
7.3.2 Integral Approach
A much better approach for the estimation of specific rates in biological
systems is through the use of the integral method. Let us first start with the analy-
sis of data from batch experiments.
Batch Experiments

Integration of the ODE for biomass from t0 to ti yields

\ln X_v(t_i) - \ln X_v(t_0) = \int_{t_0}^{t_i} \mu\, dt     (7.31)

Similarly the ODEs for S and P yield,

S(t_i) - S(t_0) = -\int_{t_0}^{t_i} q_s X_v(t)\, dt     (7.32)

and

P(t_i) - P(t_0) = \int_{t_0}^{t_i} q_p X_v(t)\, dt     (7.33)

Let us assume at this point that the specific rates are constant in the interval [t0, ti]. Under this assumption the above equations become,

\ln X_v(t_i) = \ln X_v(t_0) + \mu\,(t_i - t_0)     (7.34)

S(t_i) = S(t_0) - q_s \int_{t_0}^{t_i} X_v(t)\, dt     (7.35)

P(t_i) = P(t_0) + q_p \int_{t_0}^{t_i} X_v(t)\, dt     (7.36)
Normally a series of measurements, Xv(t_i), S(t_i) and P(t_i), i=1,...,N, are available. Equation 7.34 suggests that the specific growth rate (μ) can be obtained as the slope in a plot of ln Xv(t_i) versus t_i. Equation 7.35 suggests that the specific substrate uptake rate (qs) can also be obtained as the negative slope in a plot of S(t_i) versus ∫Xv(t)dt. Similarly, Equation 7.36 suggests that the specific secretion rate (qp) can be obtained as the slope in a plot of P(t_i) versus ∫Xv(t)dt.

By constructing a plot of S(t_i) versus ∫Xv dt, we can visually identify distinct time periods during the culture where the specific uptake rate (qs) is "constant" and estimates of qs are to be determined. Thus, by using the linear least squares estimation capabilities of any spreadsheet calculation program, we can readily estimate the specific uptake rate over any user-specified time period. The estimated specific uptake rate is essentially its average value over the estimation period. This average value is particularly useful to the analyst who can quickly compare the metabolic characteristics of different microbial populations or cell lines.
Similarly we can estimate the specific secretion rate. It is obvious from the previous analysis that an accurate estimation of the average specific rates can only be done if the integral ∫Xv dt is estimated accurately. If measurements of biomass or cell concentrations have been taken very frequently, simple use of the trapezoid rule for the computation of ∫Xv dt may suffice. If however the measurements are very noisy or they have been infrequently collected, the data must first be smoothed through polynomial fitting and then the integrals can be obtained analytically using the fitted polynomial.
Estimation Procedure for Batch Experiments

(i) Import measurements into a spreadsheet and plot the data versus time to spot any gross errors or outliers.

(ii) Smooth the data (Xv(t_i), S(t_i) and P(t_i), i=1,...,N) by performing a polynomial fit to each variable.

(iii) Compute the required integrals ∫_{t0}^{ti} Xv(t)dt, i=1,...,N using the fitted polynomial for Xv(t).

(iv) Plot ln Xv(t_i) versus t_i.

(v) Plot S(t_i) and P(t_i) versus ∫_{t0}^{ti} Xv(t)dt.

(vi) Identify the segments of the plotted curves where the specific growth, uptake and secretion rates are to be estimated (e.g., exclude the lag phase, concentrate on the exponential growth phase or on the stationary phase or on the death phase of the batch culture).

(vii) Using the linear least-squares function of any spreadsheet program, compute the specific uptake/secretion rates within the desired time periods as the slope of the corresponding regression lines.

It is noted that the initial time (t0), where the computation of all integrals begins, does not affect the determination of the slope at any later time segment. It affects only the estimation of the constant in the linear model.
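Although a spreadsheet is all the above procedure requires, the same steps are easily scripted; the following sketch (hypothetical data arrays, SciPy routines in place of the spreadsheet regression function) estimates μ, qs and qp over a user-selected window of data points.

    # Batch-culture specific rates by the integral method (Equations 7.34-7.36):
    # mu from the slope of ln Xv versus t, qs and qp from the slopes of S and P
    # versus the cumulative integral of Xv. Data arrays are hypothetical.
    import numpy as np
    from scipy.integrate import cumulative_trapezoid
    from scipy.stats import linregress

    def batch_specific_rates(t, Xv, S, P, i0, i1):
        """Estimate average rates over the index window [i0, i1]."""
        IXv = np.concatenate(([0.0], cumulative_trapezoid(Xv, t)))
        mu = linregress(t[i0:i1], np.log(Xv[i0:i1])).slope
        qs = -linregress(IXv[i0:i1], S[i0:i1]).slope
        qp = linregress(IXv[i0:i1], P[i0:i1]).slope
        return mu, qs, qp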
The major disadvantage of the integral method is the difficulty in computing an estimate of the standard error in the estimation of the specific rates. Obviously, all linear least squares estimation routines provide automatically the standard error of estimate and other statistical information. However, the computed statistics are based on the assumption that there is no error present in the independent variable. This assumption can be relaxed when the experimental error in the independent variable is much smaller compared to the error present in the measurements of the dependent variable. In our case the assumption of simple linear least squares implies that ∫Xv dt is known precisely. Although we do know that there are errors in the measurement of Xv, the polynomial fitting and the subsequent integration provide a certain amount of data filtering which allows us to assume that the experimental error in ∫Xv dt is negligible compared to that present in S(t_i) or P(t_i).

Nonetheless, the value of the standard error computed by the linear least squares calculations provides an estimate of the uncertainty which is valid for comparison purposes among different experiments.
Fed-Batch & Continuous (Chemostat) Experiments

The integral approach presented for the analysis of batch cultures can be readily extended to fed-batch and continuous cultures. Integration of the ODE for biomass from t0 to ti yields

\ln X_v(t_i) - \ln X_v(t_0) = \int_{t_0}^{t_i} [\mu - D]\, dt     (7.37)

Similarly the ODEs for S and P yield,

S(t_i) - S(t_0) = -\int_{t_0}^{t_i} q_s X_v(t)\, dt + \int_{t_0}^{t_i} D[S_f - S(t)]\, dt     (7.38)

P(t_i) - P(t_0) = \int_{t_0}^{t_i} q_p X_v(t)\, dt - \int_{t_0}^{t_i} D P(t)\, dt     (7.39)

In this case we assume that we know the dilution rate (D=F/V) precisely as a function of time. In a chemostat D is often constant since the feed flowrate and the volume are kept constant. In a fed-batch culture the volume is continuously increasing. The dilution rate generally varies with respect to time although it could also be kept constant if the operator provides an exponentially varying feeding rate.

Again assuming that the specific rates remain constant in the period [t0, ti], the above equations become,

\ln X_v(t_i) + \int_{t_0}^{t_i} D\, dt = \ln X_v(t_0) + \mu\,(t_i - t_0)     (7.40)
Similarly the ODEs for S and P yield,

S(t_i) - \int_{t_0}^{t_i} D[S_f - S(t)]\, dt = -q_s \int_{t_0}^{t_i} X_v(t)\, dt + S(t_0)     (7.41)

and

P(t_i) + \int_{t_0}^{t_i} D P(t)\, dt = q_p \int_{t_0}^{t_i} X_v(t)\, dt + P(t_0)     (7.42)
Equation 7.40 suggests that the specific growth rate (μ) can be obtained as the slope in a plot of {ln Xv(t_i) + ∫D dt} versus t_i. Equation 7.41 suggests that the specific substrate uptake rate (qs) can also be obtained as the negative slope in a plot of {S(t_i) - ∫D[Sf - S(t)]dt} versus ∫Xv(t)dt. Similarly, Equation 7.42 suggests that the specific secretion rate (qp) can be obtained as the slope in a plot of {P(t_i) + ∫DP(t)dt} versus ∫Xv(t)dt.

By constructing the above plots, we can visually identify distinct time periods during the culture where the specific rates are to be estimated. If the dilution rate (D) is taken equal to zero, the usual estimation equations for batch cultures are obtained. This leads to the interpretation of the left hand side of Equations 7.41 and 7.42 as "effective batch culture concentrations" of the substrate (S) and the product (P). The computational procedure is given next.
Estimation Procedure for Fed-Batch & Continuous (Chemostat) Experiments
(i) Import measurements (Xv, S and P) and user-specified variables (such as D and Sf as a function of time) into a spreadsheet and plot the data versus time to spot any gross errors or outliers.

(ii) Smooth the data (Xv(t_i), S(t_i) and P(t_i), i=1,...,N) by performing a polynomial fit to each variable.

(iii) Compute the required integrals ∫_{t0}^{ti} Xv(t)dt, i=1,...,N using the fitted polynomial for Xv(t).

(iv) Compute also the integrals ∫_{t0}^{ti} D dt, ∫_{t0}^{ti} D[Sf - S(t)]dt and ∫_{t0}^{ti} DP(t)dt, i=1,...,N using the fitted polynomials for S(t), P(t) and the user-specified feeding variables D and Sf.
(v) Plot {ln Xv(t_i) + ∫_{t0}^{ti} D dt} versus t_i.

(vi) Plot {S(t_i) - ∫_{t0}^{ti} D[Sf - S(t)]dt} and {P(t_i) + ∫_{t0}^{ti} DP(t)dt} versus ∫_{t0}^{ti} Xv(t)dt.

(vii) Identify the segments of the plotted curves where the specific growth, uptake and secretion rates are to be estimated (e.g., exclude any lag phase or concentrate on two different segments of the culture to compare the productivity or growth characteristics, etc.).

(viii) Using the linear least-squares function of any spreadsheet program, compute the specific uptake/secretion rates within the desired time periods as the slope of the corresponding regression lines.
The main advantage of the above estimation procedure is that there is no
need to assume steady-state operation. Since practice has shown that steady state
operation is not easily established for prolonged periods of time, this approach
enables the determination of average specific rates taking into account accumula-
tion terms.
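The additional integrals of steps (iii)-(iv) change little in a script; a sketch for the fed-batch/chemostat case follows (hypothetical names, with D and Sf given as arrays at the sampling times), forming the "effective batch culture concentrations" and regressing exactly as in the batch case.

    # Fed-batch/chemostat specific rates (Equations 7.40-7.42): build the
    # effective batch culture concentrations and regress their slopes.
    import numpy as np
    from scipy.integrate import cumulative_trapezoid
    from scipy.stats import linregress

    def fedbatch_specific_rates(t, Xv, S, P, D, Sf, i0, i1):
        IXv = np.concatenate(([0.0], cumulative_trapezoid(Xv, t)))
        ID = np.concatenate(([0.0], cumulative_trapezoid(D, t)))
        IDS = np.concatenate(([0.0], cumulative_trapezoid(D * (Sf - S), t)))
        IDP = np.concatenate(([0.0], cumulative_trapezoid(D * P, t)))
        mu = linregress(t[i0:i1], np.log(Xv[i0:i1]) + ID[i0:i1]).slope
        qs = -linregress(IXv[i0:i1], S[i0:i1] - IDS[i0:i1]).slope
        qp = linregress(IXv[i0:i1], P[i0:i1] + IDP[i0:i1]).slope
        return mu, qs, qp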
Perfusion Experiments

The procedure for the estimation of qs and qp is identical to the one presented for fed-batch and continuous cultures. The only difference is in the estimation of the specific growth rate (μ). Since perfusion cultures behave as batch cultures as far as the biomass is concerned, μ can be obtained as described earlier for batch systems. Namely, μ is obtained as the slope in the plot of ln Xv(t_i) versus t_i.

The above procedure can be easily modified to account for the case of a small "bleeding rate", i.e., when there is a known (i.e., measured) small withdrawal of cells (much less than what it would be in a chemostat).
We finish this section with a final note on the estimation of the specific death rate (kd). The procedures presented earlier (for batch, fed-batch, continuous and perfusion cultures) provide an estimate of the apparent growth rate (μa) rather than the true specific growth rate (μ). The true growth rate can only be obtained as (μa + kd) where kd is the specific death rate.

The specific death rate can be obtained by considering the corresponding mass balance for nonviable (dead) cells. Normally the nonviable cell concentration is measured at the same time the measurement for viable cell concentration is made. If viability (ξ) data are available, the nonviable cell concentration can be obtained from the viable one as Xd = Xv(1 - ξ)/ξ.
The dynamic mass balance for nonviable cells in a batch or perfusion culture yields,

\frac{dX_d}{dt} = k_d X_v     (7.43)

This leads to the following integral equation for kd

X_d(t_i) = X_d(t_0) + k_d \int_{t_0}^{t_i} X_v(t)\, dt     (7.44)

Hence, in a batch or perfusion culture, kd can be obtained as the slope in a plot of Xd(t_i) versus ∫_{t0}^{ti} Xv(t)dt.

The dynamic mass balance for nonviable cells in a fed-batch or continuous culture is

\frac{dX_d}{dt} = k_d X_v - D X_d     (7.45)

The above equation yields the following integral equation for kd,

X_d(t_i) + \int_{t_0}^{t_i} D X_d(t)\, dt = k_d \int_{t_0}^{t_i} X_v(t)\, dt + X_d(t_0)     (7.46)

Thus, in a fed-batch or continuous culture, kd can be obtained as the slope in a plot of {Xd(t_i) + ∫_{t0}^{ti} D Xd(t)dt} versus ∫_{t0}^{ti} Xv(t)dt.
7.4 EXAMPLES
In Chapter 17, where there is a special section with biochemical engineering
problems, examples on shortcut methods are presented.
7.4.1 Derivative Approach - Pyrolytic Dehydrogenation of Benzene
In this section we shall only present the derivative approach for the solution
of the pyrolytic dehydrogenation of benzene to diphenyl and triphenyl regression
problem. This problem, which was already presented in Chapter 6, is also used
here to illustrate the use of shortcut methods. As discussed earlier, both state vari-
ables are measured and the two unknown parameters appear linearly in the gov-
erning ODEs, which are also given below for the ease of the reader.
\frac{dx_1}{dt} = -r_1 - r_2     (7.47a)

\frac{dx_2}{dt} = \frac{r_1}{2} - r_2     (7.47b)

where

r_1 = k_1 \left[ x_1^2 - x_2(2 - 2x_1 - x_2)/3K_1 \right]     (7.48a)

r_2 = k_2 \left[ x_1 x_2 - (1 - x_1 - 2x_2)(2 - 2x_1 - x_2)/9K_2 \right]     (7.48b)
Here x1 denotes lb-moles of benzene per lb-mole of pure benzene feed and x2 denotes lb-moles of diphenyl per lb-mole of pure benzene feed. The parameters k1 and k2 are unknown reaction rate constants whereas K1 and K2 are known equilibrium constants. The data consist of measurements of x1 and x2 in a flow reactor at eight values of the reciprocal space velocity t. The feed to the reactor was pure benzene. The experimental data are given in Table 6.2 (in Chapter 6). The governing ODEs can also be written as

\frac{dx_1}{dt} = \varphi_{11}(x)\, k_1 + \varphi_{12}(x)\, k_2     (7.49a)

\frac{dx_2}{dt} = \varphi_{21}(x)\, k_1 + \varphi_{22}(x)\, k_2     (7.49b)

where

\varphi_{11}(x) = -\left[ x_1^2 - x_2(2 - 2x_1 - x_2)/3K_1 \right]     (7.50a)

\varphi_{12}(x) = -\left[ x_1 x_2 - (1 - x_1 - 2x_2)(2 - 2x_1 - x_2)/9K_2 \right]     (7.50b)

\varphi_{21}(x) = \frac{1}{2}\left[ x_1^2 - x_2(2 - 2x_1 - x_2)/3K_1 \right]     (7.50c)

\varphi_{22}(x) = -\left[ x_1 x_2 - (1 - x_1 - 2x_2)(2 - 2x_1 - x_2)/9K_2 \right]     (7.50d)
As we mentioned, the first and probably most crucial step is the computation of the time derivatives of the state variables from smoothed data. The best and easiest way to smooth the data is using smooth cubic splines with the IMSL routines CSSMH, CSVAL and CSDER. The latter two are used, once the cubic spline coefficients and break points have been computed by CSSMH, to generate the values of the smoothed measurements and their derivatives (η1 and η2).
Table 7.1: Estimated parameter values with shortcut methods for different values of the smoothing parameter (s/N) in IMSL routine CSSMH

* Used in IMSL routine CSSMH (s/N).
Figure 7.1: Smoothed data for variables x1 and x2 using a smooth cubic spline approximation (s/N = 0.01, 0.1 and 1).
Once we have the smoothed values of the state variables, we can proceed and compute φ11, φ12, φ21 and φ22. All these computed quantities (η1, η2, φ11, φ12, φ21 and φ22) constitute the data set for the linear least squares regression. In Figure 7.1 the original data and their smoothed values are shown for 3 different values of the smoothing parameter "s" required by CSSMH. A one percent (1%) standard error in the measurements has been assumed and used as weights in data smoothing by CSSMH. IMSL recommends this smoothing parameter to be in the range [N-(2N)^0.5, N+(2N)^0.5]. For this problem this means 0.5 ≤ s/N ≤ 1.5. As seen in Figure 7.1 the smoothed data do not differ significantly for s/N equal to 0.01, 0.1 and 1. However, this is not the case with the estimated derivatives of the state variables. This is clearly seen in Figure 7.2 where the estimated values of -dx1/dt are quite different in the beginning of the transient.
Subsequent use of linear regression yields the two unknown parameters k1 and k2. The results are shown in Table 7.1 for the three different values of the smoothing parameter.

As seen, there is significant variability in the estimates. This is the reason why we should avoid using this technique if possible (unless we wish to generate initial guesses for the Gauss-Newton method for ODE systems). As it was mentioned earlier, the numerical computation of derivatives from noisy data is a risky business!
Figure 7.2: Estimated values of the time derivatives -dx1/dt and -dx2/dt from the smoothed data for s/N = 0.01, 0.1 and 1.
8

Practical Guidelines for Algorithm Implementation

Besides the basic estimation algorithm, one should be aware of several details that should be taken care of for a successful and meaningful estimation of unknown parameters. Items such as the quality of the data at hand (detection of outliers), generation of initial guesses, overstepping and potential remedies to overcome possible ill-conditioning of the problem are discussed in the following sections. Particular emphasis is given to implementation guidelines for differential equation models. Finally, we touch upon the problem of autocorrelation.
8.1 INSPECTION OF THE DATA
One of the very first things that the analyst should do prior to attempting to fit a model to the data is to inspect the data at hand. Visual inspection is a very powerful tool as the eye can often pick up inconsistent behavior. The primary goal of this data inspection is to spot potential outliers.

Barnett et al. (1994) define outliers as the observations in a sample which appear to be inconsistent with the remainder of the sample. In engineering applications, an outlier is often the result of gross measurement error. This includes a mistake in the calculations or in data coding or even a copying error. An outlier could also be the result of inherent variability of the process although chances are it is not!
Besides visual inspection, there are several statistical procedures for detecting outliers (Barnett and Lewis, 1978). These tests are based on the examination of the residuals. Practically this means that if a residual is bigger than 3 or 4 standard deviations, it should be considered a potential outlier since the probability of occurrence of such a data point (due to the inherent variability of the process only) is less than 0.1%.
A potential outlier should always be examined carefully. First we check if there was a simple mistake during the recording or the early manipulation of the data. If this is not the case, we proceed to a careful examination of the experimental circumstances during the particular experiment. We should be very careful not to discard ("reject") an outlier unless we have strong non-statistical reasons for doing so. In the worst case, we should report both results, one with the outlier and the other without it.

Instead of a detailed presentation of the effect of extreme values and outliers on least squares estimation, the following common sense approach is recommended in the analysis of engineering data:
- Start with careful visual observation of the data provided in tabular and graphical form.
- Identify a "first" set of potential outliers.
- Examine whether they should be discarded due to non-statistical reasons. Eliminate those from the data set.
- Estimate the parameters, obtain the values of the response variables and calculate the residuals.
- Plot the residuals and examine whether any data points lie beyond 3 standard deviations around the mean response estimated by the mathematical model.
- Having identified the second set of potential outliers, examine whether they should be discarded due to non-statistical reasons and eliminate them from the data set.
- For all remaining "outliers" examine their effect on the model response and the estimated parameter values. This is done by performing the parameter estimation calculations twice, once with the outlier included in the data set and once without it.
- Determine which of the outliers are highly informative compared to the rest of the data points. The remaining outliers should have little or no effect on the model parameters or the mean estimated response of the model and hence, they can be ignored.
- Prepare a final report where the effect of highly informative outliers is clearly presented.
- Perform replicate experiments at the experimental conditions where the outliers were detected, if of course, time, money and circumstances allow it.
In summary, the possibility of detecting and correcting an outlier in the data is of paramount importance, particularly if it is spotted by an early inspection. Therefore, careful inspection of the data is highly advocated prior to using any parameter estimation software.
8.2 GENERATION OF INITIAL GUESSES
One of the most important tasks prior to employing an iterative parameter
estimation method for the solution of the nonlinear least squares problem is the
generation of initial guesses (starting values) for the unknown parameters. A good
initial guess facilitates quick convergence of any iterative method and particularly
of the Gauss-Newton method to the optimum parameter values. There are several
approaches that can be followed to arrive at reasonably good starting values for the
parameters and they are briefly discussed below.
8.2.1 Nature and Structure of the Model
The nature of the mathematical model that describes a physical system may
dictate a range of acceptable values for the unknown parameters. Furthermore,
repeated computations of the response variables for various values of the parame-
ters and subsequent plotting of the results provides valuable experience to the
analyst about the behavior of the model and its dependency on the parameters. As
a result of this exercise, we often come up with fairly good initial guesses for the
parameters. The only disadvantage of this approach is that it could be time con-
suming. This is counterbalanced by the fact that one learns a lot about the structure
and behavior of the model at hand.
8.2.2 Asymptotic Behavior of the Model Equations

Quite often the asymptotic behavior of the model can aid us in determining sufficiently good initial guesses. For example, let us consider the Michaelis-Menten kinetics for enzyme catalyzed reactions,

y_i = \frac{k_1 x_i}{k_2 + x_i}     (8.1)

When x_i tends to zero, the model equation reduces to

y_i = \frac{k_1}{k_2} x_i     (8.2)

and hence, from the very first data points at small values of x_i, we can obtain k1/k2 as the slope of the straight line y_i = βx_i. Furthermore, as x_i tends to infinity, the model equation reduces to

y_i = k_1     (8.3)

and k1 is obtained as the asymptotic value of y_i at large values of x_i.

Let us also consider the following exponential decay model, often encountered in analyzing environmental samples,

y_i = k_1 + k_2\, exp(-k_3 x_i)     (8.4)

At large values of x_i (i.e., as x_i → ∞) the model reduces to

y_i = k_1     (8.5)

whereas at values of x_i near zero (i.e., as x_i → 0) the model reduces to

y_i = k_1 + k_2 (1 - k_3 x_i)     (8.6a)

or

y_i = (k_1 + k_2) - k_2 k_3 x_i     (8.6b)

which is of the form y_i = β0 + β1 x_i and hence, by performing a linear regression with the values of x_i near zero we obtain estimates of k1+k2 and k2k3. Combined with our estimate of k1 we obtain starting values for all the unknown parameters.

In some cases we may not be able to obtain estimates of all the parameters by examining the asymptotic behavior of the model. However, it can still be used to obtain estimates of some of the parameters or even establish an approximate relationship between some of the parameters as functions of the rest.
8.2.3 Transformation of the Model Equations

A suitable transformation of the model equations can simplify the structure of the model considerably and thus, initial guess generation becomes a trivial task. The most interesting case, which is also often encountered in engineering applications, is that of transformably linear models. These are nonlinear models that reduce to simple linear models after a suitable transformation is performed. These models have been extensively used in engineering, particularly before the wide availability of computers, so that the parameters could easily be obtained with linear least squares estimation. Even today such models are also used to reveal characteristics of the behavior of the model in a graphical form.
If we have a transformably linear system, our chances are outstanding to get quickly very good initial guesses for the parameters. Let us consider a few typical examples.

The integrated form of substrate utilization in an enzyme catalyzed batch bioreactor is given by the implicit equation

k_1 x_i = y_0 - y_i + k_2\, ln(y_0 / y_i)     (8.7)

where x_i is time and y_i is the substrate concentration. The initial conditions (x0, y0) are assumed to be known precisely. The above model is implicit in y_i and we should use the implicit formulation discussed in Chapter 2 to solve it by the Gauss-Newton method. Initial guesses for the two parameters can be obtained by noticing that this model is transformably linear since Equation 8.7 can be written as

\frac{y_0 - y_i}{ln(y_0 / y_i)} = -k_2 + k_1 \frac{x_i}{ln(y_0 / y_i)}     (8.8)

which is of the form Y_i = β0 + β1 X_i where

Y_i = \frac{y_0 - y_i}{ln(y_0 / y_i)}     (8.9a)

and

X_i = \frac{x_i}{ln(y_0 / y_i)}     (8.9b)

Initial estimates for the parameters can be readily obtained using linear least squares estimation with the transformed model.

The famous Michaelis-Menten kinetics expression shown below

y_i = \frac{k_1 x_i}{k_2 + x_i}     (8.10)

can become linear by the following transformations, also known as

Lineweaver-Burk plot:   \frac{1}{y_i} = \frac{1}{k_1} + \frac{k_2}{k_1}\left(\frac{1}{x_i}\right)     (8.11)

Eadie-Hofstee plot:   y_i = k_1 - k_2\left(\frac{y_i}{x_i}\right)     (8.12)

and Hanes plot:   \frac{x_i}{y_i} = \frac{k_2}{k_1} + \frac{1}{k_1} x_i     (8.13)
All the above transformations can readily produce initial parameter estimates for the kinetic parameters k1 and k2 by performing a simple linear regression. Another class of models that are often transformably linear arise in heterogeneous catalysis. For example, the rate of dehydrogenation of ethanol into acetaldehyde over a Cu-Co catalyst is given by the following expression, assuming that the reaction on two adjacent sites is the rate controlling step (Franckaerts and Froment, 1964; Froment and Bischoff, 1990).

y_i = \frac{k_1 k_2 (x_1 - x_2 x_3 / K)}{\left[ 1 + k_2 x_1 + k_3 x_2 + k_4 x_3 + k_5 x_4 \right]^2}     (8.14)

where y_i is the measured overall reaction rate and x1, x2, x3 and x4 are the partial pressures of the chemical species. Good initial guesses for the unknown parameters, k1,...,k5, can be obtained by linear least squares estimation of the transformed equation,

\sqrt{\frac{x_1 - x_2 x_3 / K}{y_i}} = \frac{1}{\sqrt{k_1 k_2}}\left( 1 + k_2 x_1 + k_3 x_2 + k_4 x_3 + k_5 x_4 \right)     (8.15)

which is of the form Y = β0 + β1x1 + β2x2 + β3x3 + β4x4.
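As an illustration, initial guesses for the Michaelis-Menten parameters can be generated in a few lines from the Lineweaver-Burk transformation (Equation 8.11); a sketch with hypothetical data arrays follows.

    # Initial guesses for Michaelis-Menten kinetics (Equation 8.10) from the
    # Lineweaver-Burk transformation (Equation 8.11): regress 1/y on 1/x.
    import numpy as np

    def michaelis_menten_guess(x, y):
        slope, intercept = np.polyfit(1.0 / x, 1.0 / y, 1)
        k1 = 1.0 / intercept          # 1/y = 1/k1 + (k2/k1)(1/x)
        k2 = slope * k1
        return k1, k2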
8.2.4 Conditionally Linear Systems

In engineering we often encounter conditionally linear systems. These were defined in Chapter 2 and it was indicated that special algorithms can be used which exploit their conditional linearity (see Bates and Watts, 1988). In general, we need to provide initial guesses only for the nonlinear parameters since the conditionally linear parameters can be obtained through linear least squares estimation.

For example let us consider the three-parameter model,

y_i = k_1 + k_2\, exp(-k_3 x_i)     (8.16)

where both k1 and k2 are the conditionally linear parameters. These can be readily estimated by linear least squares once the value of k3 is known. Conditionally linear parameters arise naturally in reaction rates with an Arrhenius temperature dependence.
8.2.5 Direct Search Approach
If we have very little information about the parameters, direct search meth-
ods, like the LJ optimization technique presented in Chapter 5, present an excel-
lent way to generate very good initial estimates for the Gauss-Newton method.
Actually, for algebraic equation models, direct search methods can be used to de-
termine the optimum parameter estimates quite efficiently. However, if estimates
of the uncertainty in the parameters are required, use of the Gauss-Newton method
is strongly recommended, even if it is only for a couple of iterations.
8.3 OVERSTEPPING
Quite often the direction determined by the Gauss-Newton method, or any
other gradient method for that matter, is towards the optimum, however, the length
of the suggested increment of the parameters could be too large. As a result, the
value of the objective function at the new parameter estimates could actually be
higher than its value at the previous iteration.
Figure 8.1: Contours of the objective function in the vicinity of the optimum. Potential problems with overstepping are shown for a two-parameter problem.
The classical solution to this problem is by limiting the step-length through the introduction of a stepping parameter, μ (0 < μ ≤ 1), namely

k^{(j+1)} = k^{(j)} + \mu \Delta k^{(j+1)}     (8.17)

The easiest way to arrive at an acceptable value of μ, μa, is by employing the bisection rule as previously described. Namely, we start with μ=1 and we keep on halving μ until the objective function at the new parameter values becomes less than that obtained in the previous iteration, i.e., we reduce μ until

S(k^{(j)} + \mu_a \Delta k^{(j+1)}) < S(k^{(j)})     (8.18)

Normally, we stop the step-size determination here and we proceed to perform another iteration of the Gauss-Newton method. This is what has been implemented in the computer programs accompanying this book.
8.3.1 An Optimal Step-Size Policy

Once an acceptable value for the step-size has been determined, we can continue and, with only one additional evaluation of the objective function, we can obtain the optimal step-size that should be used along the direction suggested by the Gauss-Newton method.

Essentially we need to perform a simple line search along the direction of Δk^(j+1). The simplest way to do this is by approximating the objective function by a quadratic along this direction. Namely,

S(\mu) = S(k^{(j)} + \mu \Delta k^{(j+1)}) = \beta_0 + \beta_1 \mu + \beta_2 \mu^2     (8.19)

The first coefficient is readily obtained at μ=0 as

\beta_0 = S_0     (8.20)

where S0 = S(k^(j)) which has already been calculated. The other two coefficients, β1 and β2, can be computed from information already available at μ=μa and with an additional evaluation at μ=μa/2. If we denote by S1 and S2 the values of the objective function at μ=μa/2 and μ=μa respectively, we have

S_1 = \beta_0 + \beta_1 \frac{\mu_a}{2} + \beta_2 \frac{\mu_a^2}{4}     (8.21a)

and

S_2 = \beta_0 + \beta_1 \mu_a + \beta_2 \mu_a^2     (8.21b)

Solution of the above two equations yields,

\beta_1 = \frac{4S_1 - S_2 - 3S_0}{\mu_a}     (8.22a)

and

\beta_2 = \frac{2(S_0 + S_2 - 2S_1)}{\mu_a^2}     (8.22b)

Having estimated β1 and β2, we can proceed and obtain the optimum step-size by using the stationary criterion

\frac{dS(k^{(j)} + \mu \Delta k^{(j+1)})}{d\mu} = 0     (8.23)

which yields

\mu_{opt} = -\frac{\beta_1}{2\beta_2}     (8.24)

Substitution of the expressions for β1 and β2 yields

\mu_{opt} = \frac{\mu_a}{4} \left[ \frac{3S_0 + S_2 - 4S_1}{S_0 + S_2 - 2S_1} \right]     (8.25)

The above expression for the optimal step-size is used in the calculation of the next estimate of the parameters to be used in the next iteration of the Gauss-Newton method,

k^{(j+1)} = k^{(j)} + \mu_{opt} \Delta k^{(j+1)}     (8.26)
If we wish to avoid the additional objective function evaluation at μ=μa/2, we can use the extra information that is available at μ=0. This approach is preferable for differential equation models where evaluation of the objective function requires the integration of the state equations. It is presented later in Section 8.7 where we discuss the implementation of the Gauss-Newton method for ODE models.
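The resulting computation costs one extra objective function evaluation; a sketch implementing Equations 8.20 to 8.25 follows (the objective function handle and the arrays k and dk are hypothetical).

    # Optimal step size along dk by a quadratic fit (Equations 8.19-8.25):
    # uses S at mu = 0, mu_a/2 and mu_a, where mu_a was found by bisection.
    # objective(k) returns S(k); k and dk are numpy arrays.
    def optimal_step(objective, k, dk, mu_a):
        S0 = objective(k)                        # beta0, Equation 8.20
        S1 = objective(k + 0.5 * mu_a * dk)
        S2 = objective(k + mu_a * dk)
        beta1 = (4.0 * S1 - S2 - 3.0 * S0) / mu_a        # Equation 8.22a
        beta2 = 2.0 * (S0 + S2 - 2.0 * S1) / mu_a ** 2   # Equation 8.22b
        if beta2 <= 0.0:               # no interior minimum; keep mu_a
            return mu_a
        return -beta1 / (2.0 * beta2)  # Equations 8.24 and 8.25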
8.4 ILL-CONDITIONING OF MATRIX A AND PARTIAL REMEDIES

If two or more of the unknown parameters are highly correlated, or one of the parameters does not have a measurable effect on the response variables, matrix A may become singular or near-singular. In such a case we have a so-called ill-posed problem and matrix A is ill-conditioned.

A measure of the degree of ill-conditioning of a nonsingular square matrix is through the condition number, which is defined as

cond(A) = \|A\| \cdot \|A^{-1}\|     (8.27)

The condition number is always greater than one and it represents the maximum amplification of the errors in the right hand side in the solution vector. The condition number is also equal to the square root of the ratio of the largest to the smallest eigenvalue of A^T A. In parameter estimation applications, A is a positive definite symmetric matrix and hence, cond(A) is also equal to the ratio of the largest to the smallest eigenvalue of A, i.e.,

cond(A) = \frac{\lambda_{max}}{\lambda_{min}}     (8.28)
Generally speaking, for condition numbers less than 10^8 the parameter estimation problem is well-posed. For condition numbers greater than 10^8 the problem is relatively ill-conditioned whereas for condition numbers of 10^12 or greater the problem is very ill-conditioned and we may encounter computer overflow problems.

The condition number of a matrix A is intimately connected with the sensitivity of the solution of the linear system of equations Ax = b. When solving this equation, the error in the solution can be magnified by an amount as large as cond(A) times the norm of the error in A and b due to the presence of the error in the data,

\frac{\|\Delta x\|}{\|x\|} \le cond(A) \left[ \frac{\|\Delta A\|}{\|A\|} + \frac{\|\Delta b\|}{\|b\|} \right]     (8.29)
Thus, the error in the solution vector is expected to be large for an ill-conditioned problem and small for a well-conditioned one. In parameter estimation, vector b is comprised of a linear combination of the response variables (measurements) which contain the error terms. Matrix A does not depend explicitly on the response variables; it depends only on the parameter sensitivity coefficients, which depend only on the independent variables (assumed to be known precisely) and on the estimated parameter vector k which incorporates the uncertainty in the data. As a result, we expect most of the uncertainty in Equation 8.29 to be present in Δb.
If matrix A is ill-conditioned at the optimum (i.e., at k = k*), there is not much we can do. We are faced with a "truly" ill-conditioned problem and the estimated parameters will have highly questionable values with unacceptably large estimated variances. Probably, the most productive thing to do is to reexamine the structure and dependencies of the mathematical model and try to reformulate a better posed problem. Sequential experimental design techniques can also aid us in this direction to see whether additional experiments will improve the parameter estimates significantly (see Chapter 12).

If however matrix A is reasonably well-conditioned at the optimum, A could easily be ill-conditioned when the parameters are away from their optimal values. This is quite often the case in parameter estimation and it is particularly true for highly nonlinear systems. In such cases, we would like to have the means to move the parameter estimates from the initial guess to the optimum even if the condition number of matrix A is excessively high for these initial iterations.

In general there are two remedies to this problem: (i) use a pseudoinverse and/or (ii) use Levenberg-Marquardt's modification.
8.4.1 Pseudoinverse

The eigenvalue decomposition of the positive definite symmetric matrix A yields

A = V^T \Lambda V     (8.30)

where V is orthogonal (i.e., V^{-1} = V^T) and Λ = diag(λ1, λ2,..., λp). Furthermore, it is assumed that the eigenvalues are in descending order (λ1 > λ2 > ... > λp).

Having decomposed matrix A, we can readily compute the inverse from

A^{-1} = V^T \Lambda^{-1} V     (8.31)

If matrix A is well-conditioned, the above equation should be used. If however A is ill-conditioned, we have the option, without any additional computation effort, to use instead the pseudoinverse of A. Essentially, instead of Λ^{-1} in Equation 8.31, we use the pseudoinverse of Λ, Λ^#.

The pseudoinverse, Λ^#, is obtained by inverting only the eigenvalues which are greater than a user-specified small number, δ, and setting the rest equal to zero. Namely,

\Lambda^{\#} = diag(1/\lambda_1, 1/\lambda_2,..., 1/\lambda_n, 0,..., 0)     (8.32)

where λ_j ≥ δ, j=1,2,...,n and λ_j < δ, j=n+1, n+2,...,p. In other words, by using the pseudoinverse of A, we are neglecting the contribution of the eigenvalues that are very close to zero. Typically, we have used for δ a value of when λ1 is of order 1.

Finally, it should be noted that this approach yields an approximate solution to A Δk^(j+1) = b that has the additional advantage of keeping ||Δk^(j+1)|| as small as possible.
8.4.2 Marquardt's Modification

In order to improve the convergence characteristics and robustness of the Gauss-Newton method, Levenberg in 1944 and later Marquardt (1963) proposed to modify the normal equations by adding a small positive number, γ², to the diagonal elements of A. Namely, at each iteration the increment in the parameter vector is obtained by solving the following equation

(A + \gamma^2 I)\, \Delta k^{(j+1)} = b     (8.33)

Levenberg-Marquardt's modification is often given a geometric interpretation as being a compromise between the direction of the Gauss-Newton method and that of steepest descent. The increment vector, Δk^(j+1), is a weighted average between that by the Gauss-Newton method and the one by steepest descent.

A more interesting interpretation of Levenberg-Marquardt's modification can be obtained by examining the eigenvalues of the modified matrix (A + γ²I). If we consider the eigenvalue decomposition of A, V^T Λ V, we have,

A + \gamma^2 I = V^T \Lambda V + \gamma^2 V^T V = V^T (\Lambda + \gamma^2 I) V = V^T \Lambda^{\gamma} V     (8.34)

where

\Lambda^{\gamma} = diag(\lambda_1 + \gamma^2, \lambda_2 + \gamma^2,..., \lambda_p + \gamma^2)     (8.35)

The net result is that all the eigenvalues of matrix A are increased by γ², i.e., the eigenvalues are now λ1+γ², λ2+γ²,..., λp+γ². Obviously, the large eigenvalues will be hardly changed whereas the small ones that are much smaller than γ² become essentially equal to γ² and hence, the condition number of matrix A is reduced from λmax/λmin to

cond(A + \gamma^2 I) = \frac{\lambda_{max} + \gamma^2}{\lambda_{min} + \gamma^2}     (8.36)

There are some concerns about the implementation of the method (Bates and Watts, 1988) since one must decide how to manipulate both Marquardt's γ² and the step-size μ. In our experience, a good implementation strategy is to decompose matrix A and examine the magnitude of the small eigenvalues. If some of them are excessively small, we use a γ² which is larger compared to the small eigenvalues and we compute ||Δk^(j+1)||. Subsequently, using the bisection rule we obtain an acceptable or even an optimal step-size along the direction of Δk^(j+1).
8.4.3 Scaling of Matrix A

When the parameters differ by more than one order of magnitude, matrix A may appear to be ill-conditioned even if the parameter estimation problem is well-posed. The best way to overcome this problem is by introducing the reduced sensitivity coefficients, defined as

G_{R,ij} = \left( \frac{\partial x_i}{\partial k_j} \right) k_j     (8.37)

and hence, the reduced parameter sensitivity matrix, G_R, is related to our usual sensitivity matrix, G, as follows

G_R = G K     (8.38)

where

K = diag(k_1, k_2,..., k_p)     (8.39)

As a result of this scaling, the normal equations A Δk^(j+1) = b become

A_R\, \Delta k_R^{(j+1)} = b_R     (8.40)

where

A_R = K \left[ \sum_{i=1}^{N} G^T(t_i)\, C^T Q_i C\, G(t_i) \right] K = K A K     (8.41)

and

b_R = K b     (8.42)

with

\Delta k^{(j+1)} = K\, \Delta k_R^{(j+1)}     (8.43)

The parameter estimates for the next iteration are now obtained (using a step-size μ, 0 < μ ≤ 1) as follows

k^{(j+1)} = k^{(j)} + \mu K\, \Delta k_R^{(j+1)}     (8.44)

or equivalently

k_i^{(j+1)} = k_i^{(j)} \left( 1 + \mu\, \Delta k_{R,i}^{(j+1)} \right) ;   i=1,...,p     (8.45)

With this modification the conditioning of matrix A is significantly improved and cond(A_R) gives a more reliable measure of the ill-conditioning of the parameter estimation problem. This modification has been implemented in all computer programs provided with this book.
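A sketch of the scaled update (Equations 8.39 to 8.45), assuming matrix A and vector b have already been formed as in Chapter 4; names are hypothetical.

    # Scaling of the normal equations with reduced sensitivities (Equations
    # 8.39-8.45): solve (K A K) dkR = K b, then update k elementwise.
    import numpy as np

    def scaled_gn_update(A, b, k, mu=1.0):
        K = np.diag(k)                   # Equation 8.39
        AR = K @ A @ K                   # Equation 8.41
        bR = K @ b                       # Equation 8.42
        dkR = np.linalg.solve(AR, bR)    # Equation 8.40
        return k * (1.0 + mu * dkR)      # Equation 8.45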
8.5 USE OF "PRIOR" INFORMATION

Under certain conditions we may have some prior information about the parameter values. This information is often summarized by assuming that each parameter is distributed normally with a given mean and a small or large variance depending on how trustworthy our prior estimate is. The Bayesian objective function, S_B(k), that should be minimized for algebraic equation models is

S_B(k) = \sum_{i=1}^{N} \left[ \hat{y}_i - f(x_i, k) \right]^T Q_i \left[ \hat{y}_i - f(x_i, k) \right] + (k - k_B)^T V_B^{-1} (k - k_B)     (8.46)

and for differential equation models it takes the form,

S_B(k) = \sum_{i=1}^{N} \left[ \hat{y}_i - y(t_i, k) \right]^T Q_i \left[ \hat{y}_i - y(t_i, k) \right] + (k - k_B)^T V_B^{-1} (k - k_B)     (8.47)

We have assumed that the prior information can be described by the multivariate normal distribution, i.e., k is normally distributed with mean k_B and covariance matrix V_B.

The required modifications to the Gauss-Newton algorithm presented in Chapter 4 are rather minimal. At each iteration, we just need to add the following terms to matrix A and vector b,

A = A_{GN} + V_B^{-1}     (8.48)

and

b = b_{GN} + V_B^{-1} (k_B - k^{(j)})     (8.49)

where A_GN and b_GN are matrix A and vector b for the Gauss-Newton method as given in Chapter 4 for algebraic or differential equation models. The prior covariance matrix of the parameters (V_B) is often a diagonal matrix and, since in the solution of the problem only the inverse of V_B is used, it is preferable to use as input to the program the inverse itself.
From a computer implementation point of view, this provides some extra
flexibility to handle simultaneously parameters for which we have some prior
knowledge and others for which no information is available. For the latter we sim-
ply need to input zero as the inverse of their prior variance.
Practical experience has shown that (i) if we have a relatively large number of data points, the prior has an insignificant effect on the parameter estimates; (ii) if the parameter estimation problem is ill-posed, use of "prior" information has a stabilizing effect. As seen from Equation 8.48, all the eigenvalues of matrix A are increased by the addition of positive terms in its diagonal. It acts almost like Marquardt's modification as far as convergence characteristics are concerned.
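A sketch of these two modifications (Python/NumPy; the names are illustrative):

```python
import numpy as np

def add_prior(A_GN, b_GN, k_current, k_B, VB_inv):
    """Augment the Gauss-Newton normal equations with prior information
    (Equations 8.48 and 8.49). VB_inv is the inverse of the prior
    covariance matrix; a zero diagonal entry encodes 'no prior
    information' for that parameter."""
    A = A_GN + VB_inv
    b = b_GN - VB_inv @ (k_current - k_B)
    return A, b
```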
8.6 SELECTION OF WEIGHTING MATRIX Q IN LEAST SQUARES
ESTIMATION
As we mentioned in Chapter 2, the user-specified matrix Q_i should be equal to the inverse of COV(ε_i). However, on many occasions we have very little information about the nature of the error in the measurements. In such cases, we have found it very useful to use Q_i as a normalization matrix to make the measured responses of the same order of magnitude. If the measurements do not change substantially from data point to data point, we can use a constant Q. The simplest form of Q that we have found adequate is a diagonal matrix whose jth element in the diagonal is the inverse of the squared mean response of the jth variable,
Q_jj = 1/(ȳ_j)² ; ȳ_j = (1/N) Σ_{i=1}^{N} ŷ_ji (8.50)
This is equivalent to assuming a constant standard error in the measurement of the jth response variable, and at the same time the standard errors of different response variables are proportional to the average value of the variables. This is a "safe" assumption when no other information is available, and least squares estimation pays equal attention to the errors from different response variables (e.g., concentration versus pressure or temperature measurements).
If however the measurements of a response variable change over several orders of magnitude, it is better to use the non-constant diagonal weighting matrix Q_i given below
Q_i,jj = 1/(ŷ_ji)² (8.51)
This is equivalent to assuming that the standard error in the ith measurement of the jth response variable is proportional to its value, again a rather "safe" assumption as it forces least squares to pay equal attention to all data points.
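Both choices of weighting matrix are trivial to construct; a sketch follows (Python/NumPy; Y is an N×m array of measured responses, and the interface is our own):

```python
import numpy as np

def weighting_matrices(Y, per_point=False):
    """Diagonal weighting matrices from measured responses.

    per_point=False: one constant Q built from the mean response of
    each variable (Equation 8.50); per_point=True: a separate Q_i
    built from each measurement (Equation 8.51)."""
    if not per_point:
        ybar = Y.mean(axis=0)                # mean of each response variable
        return np.diag(1.0 / ybar**2)
    return [np.diag(1.0 / yi**2) for yi in Y]
```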
8.7 IMPLEMENTATION GUIDELINES FOR ODE MODELS
For models described by a set of ordinary differential equations there are a
few modifications we may consider implementing that enhance the performance
(robustness) of the Gauss-Newton method. The issues that one needs to address
more carefully are (i) numerical instability during the integration of the state and
sensitivity equations, (ii) ways to enlarge the region of convergence.
8.7.1 Stiff ODE Models
Stiff differential equations appear quite often in engineering. The wide range of the prevailing time constants creates additional numerical difficulties which tend to shrink even further the region of convergence. It is noted that if the state equations are stiff, so are the sensitivity differential equations since both of them have the same Jacobian. It has been pointed out by many numerical analysts who deal with the stability and efficiency of integration algorithms (e.g., Edsberg, 1976) that the scaling of the problem is crucial for the numerical solver to behave at its best. Therefore, it is preferable to scale the state equations according to the order of magnitude of the given measurements, whenever possible. In addition, to normalize the sensitivity coefficients we introduce the reduced sensitivity coefficients,
G_R,ji(t) = (∂x_j/∂k_i) k_i (8.52)
and hence, the reduced parameter sensitivity matrix, GR, is related to our usual
sensitivity matrix, G, as follows
GR(t) = G(t) K (8.53)
where K = diag(k₁, k₂, …, k_p). By this transformation the sensitivity differential equations shown below
dG(t)/dt = (∂f/∂x)ᵀ G(t) + (∂f/∂k)ᵀ ; G(t₀) = 0 (8.54)
become
dG_R(t)/dt = (∂f/∂x)ᵀ G_R(t) + (∂f/∂k)ᵀ K ; G_R(t₀) = 0 (8.55)
With this transformation, the normal equations now become,
A_R Δk_R(j+1) = b_R (8.56)
where
A_R = Σ_{i=1}^{N} G_R(t_i)ᵀ Cᵀ Q_i C G_R(t_i) (8.57)
or
A_R = K A K (8.58)
and
b_R = Σ_{i=1}^{N} G_R(t_i)ᵀ Cᵀ Q_i [ŷ_i − C x(t_i)] (8.59)
or
b_R = K b (8.60)
and
Δk(j+1) = K Δk_R(j+1) (8.61)
As we have already pointed out in this chapter for systems described by al-
gebraic equations, the introduction of the reduced sensitivity coefficients results in
a reduction of cond(A). Therefore, the use of the reduced sensitivity coefficients
should also be beneficial to non-stiff systems.
We strongly suggest the use of the reduced sensitivity coefficients whenever we are dealing with differential equation models. Even if the system of differential equa-
tions is non-stiff at the optimum (when k=k*), when the parameters are far from
their optimal values, the equations may become stiff temporarily for a few itera-
tions of the Gauss-Newton method. Furthermore, since this transformation also
results in better conditioning of the normal equations, we propose its use at all
times. This transformation has been implemented in the program for ODE systems
provided with this book.
8.7.2 Increasing the Region of Convergence
A well known problem of the Gauss-Newton method is its relatively small region of convergence. Unless the initial guess of the unknown parameters is in the vicinity of the optimum, divergence may occur. This problem has received much attention in the literature. For example, Ramaker et al. (1970) suggested the incorporation of Marquardt's modification to expand the region of convergence. Donnelly and Quon (1970) proposed a procedure whereby the measurements are perturbed if divergence occurs and a series of problems are solved until the model trajectories "match" the original data. Nieman and Fisher (1972) incorporated linear programming into the parameter estimation procedure and suggested the solution of a series of constrained parameter estimation problems where the search is restricted to a small parameter space around the chosen initial guess. Wang and Luus (1980) showed that the use of a shorter data-length can enlarge substantially the region of convergence of Newton's method. Similar results were obtained by Kalogerakis and Luus (1980) using the quasilinearization method.
In this section we first present an efficient step-size policy for differential equation systems and then two approaches to increase the region of convergence of the Gauss-Newton method: one through the use of the Information Index and the other through a two-step procedure that involves direct search optimization.
8.7.2.1 An Optimal Step-Size Policy
The proposed step-size policy for differential equation systems is fairly similar to our approach for algebraic equation models. First we start with the bisection rule. We start with μ=1 and we keep on halving it until an acceptable value, μₐ, has been found, i.e., we reduce μ until
S(k(j) + μ Δk(j+1)) < S(k(j))
It should be emphasized here that it is unnecessary to integrate the state equations for the entire data length for each value of μ. Once the objective function becomes greater than S(k(j)), a smaller value for μ can be chosen. By this procedure, besides the savings in computation time, numerical instability is also avoided since the objective function often becomes large very quickly and integration is stopped well before computer overflow is threatened.
Having an acceptable value, μₐ, we can either stop here and proceed to perform another iteration of the Gauss-Newton method or we can attempt to locate the optimal step-size along the direction suggested by the Gauss-Newton method.
The main difference with differential equation systems is that every evaluation of the objective function requires the integration of the state equations. In this section we present an optimal step-size policy proposed by Kalogerakis and Luus (1983b) which uses information only at μ=0 (i.e., at k(j)) and at μ=μₐ (i.e., at k(j) + μₐ Δk(j+1)). Let us consider a Taylor series expansion of the state vector with respect to μ, retaining up to second order terms,
x(t, k(j) + μ Δk(j+1)) = x(t, k(j)) + μ G(t) Δk(j+1) + μ² r(t) (8.62)
where the n-dimensional residual vector r(t) is not a function of μ. Vector r(t) is easily computed from the already performed integration at μₐ, using the state and sensitivity coefficients already computed at k(j); namely,
r(t) = (1/μₐ²) [x(t, k(j) + μₐ Δk(j+1)) − x(t, k(j)) − μₐ G(t) Δk(j+1)] (8.63)
Thus, having r(t_i), i=1,2,…,N, the objective function S(μ) becomes
S(μ) = Σ_{i=1}^{N} [ŷ_i − C x(t_i,k(j)) − μ C G(t_i) Δk(j+1) − μ² C r(t_i)]ᵀ Q_i [ŷ_i − C x(t_i,k(j)) − μ C G(t_i) Δk(j+1) − μ² C r(t_i)] (8.64)
Subsequent use of the stationary criterion dS/dμ = 0 yields the following 3rd order equation for μ,
β₃μ³ + β₂μ² + β₁μ + β₀ = 0 (8.65)
where
β₃ = 2 Σ_{i=1}^{N} r(t_i)ᵀ Cᵀ Q_i C r(t_i) (8.66a)
β₂ = 3 Σ_{i=1}^{N} r(t_i)ᵀ Cᵀ Q_i C G(t_i) Δk(j+1) (8.66b)
β₁ = Σ_{i=1}^{N} [C G(t_i) Δk(j+1)]ᵀ Q_i [C G(t_i) Δk(j+1)] − 2 Σ_{i=1}^{N} r(t_i)ᵀ Cᵀ Q_i [ŷ_i − C x(t_i,k(j))] (8.66c)
β₀ = − Σ_{i=1}^{N} [ŷ_i − C x(t_i,k(j))]ᵀ Q_i C G(t_i) Δk(j+1) (8.66d)
Solution of Equation 8.65 yields the optimum value for the step-size. The solution can be readily obtained by Newton's method within 3 or 4 iterations using μₐ as a starting value. This optimal step-size policy was found to yield very good results. The only problem is that one needs to store the values of the state and sensitivity equations at each iteration. For high dimensional systems this is not advisable.
Finally it is noted that in the above equations we can substitute G(t) with G_R(t) and Δk(j+1) with Δk_R(j+1) in case we wish to use the reduced sensitivity coefficient formulation.
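The cubic of Equation 8.65 can be solved exactly as described; a minimal sketch (Python/NumPy; the interface is ours) is:

```python
import numpy as np

def optimal_step_size(beta3, beta2, beta1, beta0, mu_a, iters=4):
    """Solve beta3*mu^3 + beta2*mu^2 + beta1*mu + beta0 = 0
    (Equation 8.65) by Newton's method, starting from the accepted
    step-size mu_a; 3-4 iterations normally suffice."""
    p = np.poly1d([beta3, beta2, beta1, beta0])
    dp = p.deriv()
    mu = mu_a
    for _ in range(iters):
        mu -= p(mu) / dp(mu)    # Newton update
    return mu
```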
8.7.2.2 Use of the Information Index
The remedies to increase the region of convergence include the use of a pseudoinverse or Marquardt's modification to overcome the ill-conditioning of matrix A. However, if the basic sensitivity information is not there, the estimated direction Δk(j+1) cannot be obtained reliably.
A careful examination of matrix A or A_R shows that A depends not only on the sensitivity coefficients but also on their values at the times when the given measurements have been taken. When the parameters are far from their optimum values, some of the sensitivity coefficients may be excited and become large in magnitude at times which fall outside the range of the given data points. This means that the output vector will be insensitive to these parameters at the given measurement times, resulting in loss of sensitivity information captured in matrix A. The normal equation, AΔk(j+1)=b, becomes ill-conditioned and any attempt to change the equation is rather self-defeating since the parameter estimates will be unreliable.
Therefore, instead of modifying the normal equations, we propose a direct approach whereby the conditioning of matrix A can be significantly improved by using an appropriate section of the data so that most of the available sensitivity information is captured for the current parameter values. To be able to determine the proper section of the data where sensitivity information is available, Kalogerakis and Luus (1983b) introduced the Information Index for each parameter, defined as
I_j(t) = k_j (∂y/∂k_j)ᵀ Q (∂y/∂k_j) k_j ; j=1,…,p (8.67)
Or equivalently, using the sensitivity coefficient matrix,
I_j(t) = k_j δ_jᵀ Gᵀ(t) Cᵀ Q C G(t) δ_j k_j ; j=1,…,p (8.68)
where δ_j is a p-dimensional vector with 1 in the jth element and zeros elsewhere. The scalar I_j(t) should be viewed as an index measuring the overall sensitivity of the output vector to parameter k_j at time t.
Thus given an initial guess for the parameters, we can integrate the state and sensitivity equations and compute the Information Indices, I_j(t), j=1,…,p, as functions of time. Subsequently, by plotting I_j(t) versus time, preferably on a semi-log scale, we can spot immediately where they become excited and large in magnitude. If observations are not available within this time interval, artificial data can be generated by data smoothing and interpolation to provide the missing sensitivity information. In general, the best section of the data could be determined at each iteration, as during the course of the iterations different sections of the data may become the most appropriate. However, it is also expected that during the iterations the most appropriate sections of the data will lie somewhere between the range of the given measurements and the section determined by the initial parameter estimates. It is therefore suggested that the section selected based on the initial parameter estimates be combined with the section of the given measurements and that the entire data length be used for all iterations. At the last iteration all artificial data are dropped and only the given data are used.
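For illustration, Equation 8.68 can be evaluated along a computed trajectory as in the following sketch (Python/NumPy; the interface is ours, and G_traj stands for the sequence of n×p sensitivity matrices produced by the integrator):

```python
import numpy as np

def information_indices(G_traj, C, Q, k):
    """Information Index I_j(t) = k_j^2 [G^T C^T Q C G]_jj (Equation 8.68)
    evaluated at each point of a trajectory; plot each column versus
    time on a semi-log scale to locate the informative data section."""
    k2 = np.asarray(k)**2
    I = np.empty((len(G_traj), len(k)))
    for n, G in enumerate(G_traj):
        M = G.T @ C.T @ Q @ C @ G    # p x p information matrix at time t_n
        I[n, :] = k2 * np.diag(M)
    return I
```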
The use of the Information Index and the optimal step-size policy are illustrated in Chapter 16 where we present parameter estimation examples for ODE models. For illustration purposes we present here the information indices for the two-parameter model describing the pyrolytic dehydrogenation of benzene to diphenyl and triphenyl (introduced first in Section 6.5.2). In Figure 8.2 the Information Indices are presented in graphical form with parameter values (k₁=355,400 and k₂=403,300) three orders of magnitude away from the optimum. With this initial guess the Gauss-Newton method fails to converge as all the available sensitivity information falls outside the range of the given measurements. By generating artificial data by interpolation, subsequent use of the Gauss-Newton method brings the parameters to the optimum (k₁=355.4 and k₂=403.3) in nine iterations. For the sake of comparison the Information Indices are also presented when the parameters have their optimal values. As seen in Figure 8.3, in terms of experimental design the given measurements have been taken at proper times, although some extra information could have been gained by having a few extra data points in the interval [10⁻⁴, 5.83×10⁻].
When the dynamic system is described by a set of stiff ODEs and observations during the fast transients are not available, generation of artificial data by interpolation at times close to the origin may be very risky. If however, we observe all state variables (i.e., C=I), we can overcome this difficulty by redefining the initial state (Kalogerakis and Luus, 1983b). Instead of restricting the use of data close to the initial state, the data can be shifted to allow any of the early data points to constitute the initial state where the fast transients have died out. At the shifted origin, generation of artificial data is considerably easier. Illustrative examples of this technique are provided by Kalogerakis and Luus (1983b).
8.7.2.3 Use of Direct Search Methods
A simple procedure to overcome the problem of the small region of convergence is to use a two-step procedure whereby direct search optimization is used initially to bring the parameters into the vicinity of the optimum, followed by the Gauss-Newton method to obtain the best parameter values and estimates of the uncertainty in the parameters (Kalogerakis and Luus, 1982).
For example, let us consider the estimation of the two kinetic parameters in the Bodenstein-Linder model for the homogeneous gas phase reaction of NO with O₂ (first presented in Section 6.5.1). In Figure 8.4 we see that the use of direct search (LJ optimization) can increase the overall size of the region of convergence by at least two orders of magnitude.
Figure 8.4 Use of the LJ optimization procedure to bring the first parameter estimates inside the region of convergence of the Gauss-Newton method (denoted by the solid line). All test points are denoted by +. The actual path of some typical runs is shown by the dotted line.
8.8 AUTOCORRELATION IN DYNAMIC SYSTEMS
When experimental data are collected over time or distance there is always a chance of having autocorrelated residuals. Box et al. (1994) provide an extensive treatment of correlated disturbances in discrete time models. The structure of the disturbance term is often a moving average or autoregressive model. Detection of autocorrelation in the residuals can be established either from a time series plot of the residuals versus time (or experiment number) or from a lag plot. If we can see a pattern in the residuals over time, it probably means that there is correlation between the disturbances.
An autoregressive (AR) model of order 1 for the residuals has the form
e_i = ρ e_{i−1} + ε_i (8.69)
where e_i is the residual of the ith measurement, ρ is an unknown constant and ε_i are independent random disturbances with zero mean and constant variance (also known as white noise). In engineering, when we are dealing with parameter estimation of mechanistic models, we do not encounter highly correlated residuals. This is not the case however when we use the black box approach (Box et al., 1994).
If we have data collected at a constant sampling interval, any autocorrelation in the residuals can be readily detected by plotting the residual autocorrelation function versus lag number. The latter is defined as,
r_l = [Σ_{i=l+1}^{N} e_i e_{i−l}] / [Σ_{i=1}^{N} e_i²] (8.70)
The 95% confidence interval of the autocorrelation function beyond lag 2 is simply given by ±2/√N. If the estimated autocorrelation function has values greater than 2/√N, we should correct the data for autocorrelation.
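A sketch of the computation (Python/NumPy; the residuals are assumed to have essentially zero mean, and the interface is ours):

```python
import numpy as np

def residual_acf(e, max_lag=20):
    """Residual autocorrelation function r_l (Equation 8.70) together
    with the approximate 95% white-noise limit 2/sqrt(N)."""
    e = np.asarray(e)                 # residuals, assumed ~zero mean
    N = len(e)
    denom = np.sum(e**2)
    r = np.array([np.sum(e[l:] * e[:N - l]) / denom
                  for l in range(1, max_lag + 1)])
    return r, 2.0 / np.sqrt(N)
```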
If we assume that the residuals satisfy Equation 8.69, we must transform the data so that the new residuals are not correlated. Substituting the definition of the residual (for the single-response case) into Equation 8.69 we obtain upon rearrangement,
ŷ_i − ρ ŷ_{i−1} = f(x_i,k) − ρ f(x_{i−1},k) + ε_i (8.71)
The above model equation now satisfies all the criteria for least squares estimation. As an initial guess for k we can use our estimated parameter values when it was assumed that no correlation was present. Of course, in the second step we have to include ρ in the list of the parameters to be determined.
Finally it is noted that the above equations can be readily extended to the
multi-response case especially if we assume that there is no cross-correlation be-
tween different response variables.
9
Constrained Parameter Estimation
In some cases besides the governing algebraic or differential equations, the mathematical model that describes the physical system under investigation is accompanied by a set of constraints. These are either equality or inequality constraints that must be satisfied when the parameters converge to their best values. The constraints may be simply on the parameter values, e.g., a reaction rate constant must be positive, or on the response variables. The latter are often encountered in thermodynamic problems where the parameters should be such that the calculated thermophysical properties satisfy all constraints imposed by thermodynamic laws. We shall first consider equality constraints and subsequently inequality constraints.
9.1 EQUALITY CONSTRAINTS
Equality constraints are rather seldom encountered in parameter estimation. If there is an equality constraint among the parameters, one should first attempt to eliminate one of the unknown parameters simply by solving explicitly for one of the parameters and then substituting that relationship into the model equations. Such an action reduces the dimensionality of the parameter estimation problem, which aids significantly in achieving convergence.
If the equality constraint involves independent variables and parameters in an algebraic model, i.e., it is of the form φ(x, k) = 0, and if we can solve explicitly for one of the unknown parameters, simple substitution of the expression into the model equations reduces the number of unknown parameters by one.
If the equality constraint involves the response variables, then we have the option to either substitute the experimental measurement of the response variables
into the constraint or to use the error-in-variables method to obtain besides the
parameters the noise-free value of the response variables. The use of the error-in-
variables method is discussed in Chapter 14 (parameter estimation with equations
of state). The general approach to handle any equality constraint is through the use
of Lagrange multipliers discussed next.
9.1.1 Lagrange Multipliers
Let us consider constrained least squares estimation of unknown parameters
in algebraic equation models first. The problem can be formulated as follows:
Given a set of data points {(x_i, ŷ_i), i=1,…,N} and a mathematical model of the form, y = f(x,k), the objective is to determine the unknown parameter vector k by minimizing the least squares objective function subject to the equality constraint, namely
minimize S(k) = Σ_{i=1}^{N} [ŷ_i − f(x_i,k)]ᵀ Q_i [ŷ_i − f(x_i,k)] (9.1)
subject to φ(x₀, y₀, k) = 0 (9.2)
The point where the constraint is satisfied, (x₀,y₀), may or may not belong to the data set {(x_i, ŷ_i), i=1,…,N}. The above constrained minimization problem can be transformed into an unconstrained one by introducing the Lagrange multiplier, ω, and augmenting the least squares objective function to form the Lagrangian,
S_LG(k, ω) = S(k) + ω φ(x₀, y₀, k) (9.3)
The above unconstrained estimation problem can be solved by a small modification of the Gauss-Newton method. Let us assume that we have an estimate k(j) of the parameters at the jth iteration. Linearization of the model equation and the constraint around k(j) yields,
f(x_i, k(j+1)) ≅ f(x_i, k(j)) + G_i Δk(j+1) (9.4)
and
φ(x₀, y₀, k(j+1)) ≅ φ(x₀, y₀, k(j)) + cᵀ Δk(j+1) (9.5)
Substitution of Equations 9.4 and 9.5 into the Lagrangian given by Equation 9.3 and use of the stationary conditions
(∂S_LG/∂Δk(j+1)) = 0 (9.6)
and
(∂S_LG/∂ω) = 0 (9.7)
yield the following system of linear equations
A Δk(j+1) = b (9.8a)
and
cᵀ Δk(j+1) = −φ₀ (9.8b)
where
A = Σ_{i=1}^{N} G_iᵀ Q_i G_i (9.9a)
b = b_GN − (ω/2) c (9.9b)
b_GN = Σ_{i=1}^{N} G_iᵀ Q_i [ŷ_i − f(x_i,k(j))] (9.9c)
c = (∂φ/∂k) (9.9d)
φ₀ = φ(x₀, y₀, k(j)) (9.9e)
Equation 9.8a is solved with respect to Δk(j+1) to yield
Δk(j+1) = A⁻¹ b_GN − (ω/2) A⁻¹ c (9.10)
Subsequent substitution into Equation 9.8b yields
cᵀ A⁻¹ b_GN − (ω/2) cᵀ A⁻¹ c = −φ₀ (9.11)
which upon rearrangement results in
ω = 2 (φ₀ + cᵀ A⁻¹ b_GN) / (cᵀ A⁻¹ c) (9.12)
Substituting the above expression for the Lagrange multiplier into Equation 9.8a we arrive at the following linear equation for Δk(j+1),
A Δk(j+1) = b_GN − [(φ₀ + cᵀ A⁻¹ b_GN)/(cᵀ A⁻¹ c)] c (9.13)
The above equation leads to the following steps of the modified Gauss-Newton method.
Modified Gauss-Newton Algorithm for Constrained Least Squares
The implementation of the modified Gauss-Newton method is accomplished by following the steps given below:
Step 1. Generate/assume an initial guess for the parameter vector k.
Step 2. Given the current estimate of the parameters, k(j), compute the parameter sensitivity matrix, G_i, the response variables f(x_i,k(j)), and the constraint, φ₀.
Step 3. Set up matrix A, vector b_GN and compute vector c.
Step 4. Perform an eigenvalue decomposition of A = VᵀΛV and compute A⁻¹ = VᵀΛ⁻¹V.
Step 5. Compute the right hand side of Equation 9.13 and then solve Equation 9.13 with respect to Δk(j+1).
Step 6. Use the bisection rule to determine an acceptable step-size and then update the parameter estimates.
Step 7. Check for convergence. If converged, estimate COV(k*) and stop; else go back to Step 2.
The above constrained parameter estimation problem becomes much more challenging if the location where the constraint must be satisfied, (x₀,y₀), is not known a priori. This situation arises naturally in the estimation of binary interaction parameters in cubic equations of state (see Chapter 14). Furthermore, the above development can be readily extended to several constraints by introducing an equal number of Lagrange multipliers.
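A minimal sketch of the constrained increment of Equations 9.12 and 9.13 (Python/NumPy; the names are ours):

```python
import numpy as np

def constrained_gn_step(A, b_GN, c, phi0):
    """Modified Gauss-Newton increment with a single equality constraint.

    Returns dk satisfying c^T dk = -phi0 (Equation 9.8b) together with
    the Lagrange multiplier omega (Equation 9.12)."""
    Ainv_b = np.linalg.solve(A, b_GN)
    Ainv_c = np.linalg.solve(A, c)
    omega = 2.0 * (phi0 + c @ Ainv_b) / (c @ Ainv_c)   # Equation 9.12
    dk = Ainv_b - 0.5 * omega * Ainv_c                 # Equation 9.10
    return dk, omega
```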
If the mathematical model is described by a set of differential equations, constrained parameter estimation becomes a fairly complicated problem. Normally we may have two different types of constraints on the state variables: (i) isoperimetric constraints, expressed as an integral (or summation) constraint on the state variables, and/or (ii) simple constraints on the state variables. In the first case, we generate the Lagrangian by introducing the Lagrange multiplier ω which is obtained in a similar fashion as described previously. If however, we have a simple constraint on the state variables, the Lagrange multiplier ω becomes an unknown functional, ω(t), that must be determined. This problem can be tackled using elements from the calculus of variations but it is beyond the scope of this book and it is not considered here.
9.2 INEQUALITY CONSTRAINTS
There are two types of inequality constraints. Those that involve only the
parameters (e.g., the parameters must be positive and less than one) and those that
involve not only the parameters but also the dependent variables (e.g., the pre-
dicted concentrations of all species must always be positive or zero and the un-
known reaction rate constants must all be positive).
We shall examine each case independently.
9.2.1 Optimum Is Internal Point
Most of the constrained parameter estimation problems belong to this case. Based on scientific considerations, we arrive quite often at constraints that the parameters of the mathematical model should satisfy. Most of the time these are of the form,
k_min,i ≤ k_i ≤ k_max,i ; i=1,…,p (9.14)
and our objective is to ensure that the optimal parameter estimates satisfy the above constraints. If our initial guesses are very poor, during the early iterations of the Gauss-Newton method the parameters may reach beyond these boundaries where the parameter values may have no physical meaning or the mathematical model breaks down, which in turn may lead to severe convergence problems. If the optimum is indeed an internal point, we can readily solve the problem using one of the following three approaches.
9.2.1.1 Reparameterization
The simplest way to deal with constraints on the parameters is to ignore them! We use the unconstrained algorithm of our choice and if the converged parameter estimates satisfy the constraints, no further action is required. Unfortunately we cannot always use this approach. A smart way of imposing simple constraints on the parameters is through reparameterization. For example, if we have the simple constraint (often encountered in engineering problems),
k_i > 0 (9.15)
the following reparameterization always enforces the above constraint (Bates and Watts, 1988),
k_i = exp(κ_i) (9.16)
By conducting our search over κ_i regardless of its value, exp(κ_i) and hence k_i is always positive. For the more general case of the interval constraint on the parameters given by Equation 9.14, we can perform the following transformation,
k_i = k_min,i + (k_max,i − k_min,i)/(1 + exp(κ_i)) (9.17)
Using the above transformation, we are able to perform an unconstrained search over κ_i. For any value of κ_i, the original parameter k_i remains within its limits. When κ_i approaches very large values (tends to infinity), k_i approaches its lower limit, k_min,i, whereas when κ_i approaches very large negative values (tends to minus infinity), k_i approaches its upper limit, k_max,i. Obviously, the above transformation increases the complexity of the mathematical model; however, there are no constraints on the parameters.
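Both transformations are one-liners; a sketch (Python/NumPy, assuming Equation 9.17 in the form given above):

```python
import numpy as np

def k_positive(kappa):
    """Equation 9.16: for any real kappa, k = exp(kappa) > 0."""
    return np.exp(kappa)

def k_interval(kappa, k_min, k_max):
    """Equation 9.17: for any real kappa, k stays in (k_min, k_max);
    kappa -> +inf gives k_min and kappa -> -inf gives k_max."""
    return k_min + (k_max - k_min) / (1.0 + np.exp(kappa))
```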
9.2.1.2 Penalty Function
In this case instead of reparameterizing the problem, we augment the objective function by adding extra terms that tend to explode when the parameters approach the boundary and become negligible when the parameters are far from it. One can easily construct such functions.
One of the simplest and yet very effective penalty functions that keeps the parameters in the interval (k_min,i, k_max,i) is
P(k_i) = k_min,i/k_i + k_i/k_max,i (9.18)
The functions essentially place an equally weighted penalty for small- or large-valued parameters on the overall objective function. If penalty functions for constraints on all unknown parameters are added to the objective function, we obtain,
S_p(k) = Σ_{i=1}^{N} [ŷ_i − f(x_i,k)]ᵀ Q_i [ŷ_i − f(x_i,k)] + ξ Σ_{i=1}^{p} P(k_i) (9.19)
or
S_p(k) = S(k) + ξ Σ_{i=1}^{p} P(k_i) (9.20)
The user supplied weighting constant, ξ (>0), should have a large value during the early iterations of the Gauss-Newton method when the parameters are away from their optimal values. As the parameters approach the optimum, ξ should be reduced so that the contribution of the penalty function is essentially negligible (so that no bias is introduced in the parameter estimates).
With a few minor modifications, the Gauss-Newton method presented in Chapter 4 can be used to obtain the unknown parameters. If we consider a Taylor series expansion of the penalty function around the current estimate of the parameters we have,
P(k_i(j+1)) ≅ P(k_i(j)) + (∂P/∂k_i) Δk_i(j+1) + ½ (∂²P/∂k_i²) [Δk_i(j+1)]² (9.21)
where
∂P/∂k_i = −k_min,i/k_i² + 1/k_max,i (9.22)
and
∂²P/∂k_i² = 2 k_min,i/k_i³ (9.23)
Subsequent use of the stationary condition (∂S_p/∂Δk(j+1)) = 0 yields the normal equations
A Δk(j+1) = b (9.24)
The diagonal elements of matrix A are given by
A_ii = A_GN,ii + ξ (∂²P/∂k_i²) (9.25)
and the elements of vector b are given by
b_i = b_GN,i − ξ (∂P/∂k_i) (9.26)
where matrix A_GN and vector b_GN are those given in Chapter 4 for the Gauss-Newton method. The above equations apply equally well to differential equation models. In this case A_GN and b_GN are those given in Chapter 6.
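Under the penalty form assumed in Equation 9.18 above, the extra terms are easily evaluated; a sketch (Python/NumPy; treat the functional form as our reading of the text):

```python
import numpy as np

def penalty_terms(k, k_min, k_max, xi):
    """Penalty contributions for interval constraints (Equations 9.18,
    9.25 and 9.26): returns the add-ons to S, to the diagonal of A
    and to vector b, for all parameters at once."""
    P = k_min / k + k / k_max           # penalty value (Equation 9.18)
    dP = -k_min / k**2 + 1.0 / k_max    # dP/dk (Equation 9.22)
    d2P = 2.0 * k_min / k**3            # d2P/dk2 (Equation 9.23)
    return xi * np.sum(P), xi * d2P, -xi * dP
```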
9.2.1.3 Bisection Rule
If we are certain that the optimum parameter estimates lie well within the constraint boundaries, the simplest way to ensure that the parameters stay within the boundaries is through the use of the bisection rule. Namely, during each iteration of the Gauss-Newton method, if any one of the new parameter estimates lies beyond its boundaries, then vector Δk(j+1) is halved, until all the parameter constraints are satisfied. Once the constraints are satisfied, we proceed with the determination of the step-size that will yield a reduction in the objective function as already discussed in Chapters 4 and 6.
Our experience with algebraic and differential equation models has shown that this is indeed the easiest and most effective approach to use. It has been implemented in the computer programs provided with this book.
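A sketch of the rule (Python/NumPy; assumes the current estimate k is itself feasible):

```python
import numpy as np

def feasible_update(k, dk, k_min, k_max):
    """Halve the increment until all bounds are satisfied (bisection
    rule of Section 9.2.1.3); k itself is assumed strictly feasible."""
    mu = 1.0
    while np.any(k + mu * dk <= k_min) or np.any(k + mu * dk >= k_max):
        mu *= 0.5
    return k + mu * dk, mu
```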
9.2.2 The Kuhn-Tucker Conditions
The most general case is covered by the well-known Kuhn-Tucker conditions for optimality. Let us assume the most general case where we seek the unknown parameter vector k that will
Minimize S(k) (9.27a)
subject to
φ_i(k) = 0 ; i=1,2,…,n_φ (9.27b)
and
ψ_i(k) ≥ 0 ; i=n_φ+1, n_φ+2,…,n_φ+n_ψ (9.27c)
This constrained minimization problem is again solved by introducing n_φ Lagrange multipliers for the equality constraints and n_ψ Lagrange multipliers for the inequality constraints, and by forming the augmented objective function (Lagrangian)
S_LG(k, ω) = S(k) + Σ_{i=1}^{n_φ} ω_i φ_i(k) − Σ_{i=n_φ+1}^{n_φ+n_ψ} ω_i ψ_i(k) (9.28)
The necessary conditions for k* to be the optimal parameter values corresponding to a minimum of the augmented objective function S_LG(k, ω) are given by Edgar and Himmelblau (1988) and Gill et al. (1981) and are briefly presented here.
The Lagrangian function must be at a stationary point, i.e.,
(∂S_LG/∂k) = 0 (9.29)
The constraints are satisfied at k*, i.e.,
φ_i(k*) = 0 ; i=1,2,…,n_φ (9.27b)
ψ_i(k*) ≥ 0 ; i=n_φ+1, n_φ+2,…,n_φ+n_ψ (9.27c)
and the Lagrange multipliers corresponding to the inequality constraints (ω_i, i=n_φ+1,…,n_φ+n_ψ) are non-negative; in particular,
ω_i > 0 ; for all active inequality constraints (when ψ_i(k*)=0) (9.30a)
ω_i = 0 ; for all inactive inequality constraints (when ψ_i(k*)>0) (9.30b)
Based on the above, we can develop an "adaptive" Gauss-Newton method for parameter estimation with equality constraints whereby the set of active constraints (which are all equalities) is updated at each iteration. An example is provided in Chapter 14 where we examine the estimation of binary interaction parameters in cubic equations of state subject to predicting the correct phase behavior (i.e., avoiding erroneous two-phase split predictions under certain conditions).
10
Gauss-Newton Method for Partial
Differential Equation (PDE) Models
In this chapter we concentrate on dynamic, distributed systems described by
partial differential equations. Under certain conditions, some of these systems,
particularly those described by linear PDEs, have analytical solutions. If such a
solution does exist and the unknown parameters appear in the solution expression,
the estimation problem can often be reduced to that for systems described by alge-
braic equations. However, most of the time, an analytical solution cannot be found
and the PDEs have to be solved numerically. This case is of interest here. Our
general approach is to convert the partial differential equations (PDEs) to a set of
ordinary differential equations (ODEs) and then employ the techniques presented
in Chapter 6 taking into consideration the high dimensionality of the problem.
10.1 FORMULATION OF THE PROBLEM
When dealing with systems described by PDEs two additional complications arise. First, the unknown parameters may appear in the PDE or in the boundary conditions or both. Second, the measurements are in general a function of time and space. However, we could also have measurements which are integrals (average values) over space or time or both.
Let us consider the general class of systems described by a system of n nonlinear parabolic or hyperbolic partial differential equations. For simplicity we assume that we have only one spatial independent variable, z,
∂w_j/∂t = f_j(t, z, w, ∂w/∂z, ∂²w/∂z² ; k) ; j=1,…,n (10.1)
with the given initial conditions,
w_j(t₀, z) = w_j0(z) ; j=1,…,n (10.2)
where k is the p-dimensional vector of unknown parameters. Furthermore, the appropriate boundary conditions are also given (Equation 10.3).
The distributed state variables w_j(t,z), j=1,…,n, are generally not all measured. Furthermore, the measurements could be taken at certain points in space and time or they could be averages over space or time. If we define y as the m-dimensional output vector, each measured variable, y_j(t), j=1,…,m, is related to the state vector w(t,z) by any of the following relationships (Seinfeld and Lapidus, 1974):
(a) Measurements taken at a particular point in space, z_j, at the sampling times, t₁, t₂,…,t_N, i.e.,
y_j(t_i) = c_jᵀ w(t_i, z_j) ; i=1,…,N (10.4)
(b) Measurements taken as an average over particular subspaces, Ω_j, at the sampling times, t₁, t₂,…,t_N, i.e.,
y_j(t_i) = ∫_{Ω_j} c_jᵀ w(t_i, z) dz ; i=1,…,N (10.5)
(c) Measurements taken at a particular point in space, z_j, as a time average over successive sampling times, t₁, t₂,…,t_N, i.e.,
y_j(t_i) = (1/(t_i − t_{i−1})) ∫_{t_{i−1}}^{t_i} c_jᵀ w(t, z_j) dt ; i=1,…,N (10.6)
(d) Measurements taken as an average over particular subspaces, Ω_j, and as a time average over successive sampling times, t₁, t₂,…,t_N, i.e.,
y_j(t_i) = (1/(t_i − t_{i−1})) ∫_{t_{i−1}}^{t_i} ∫_{Ω_j} c_jᵀ w(t, z) dz dt ; i=1,…,N (10.7)
where c_j is a known constant vector relating the jth measured variable to the state vector.
As usual, it is assumed that the actual measurements of the output vector are related to the model calculated values by
ŷ_j(t_i) = y_j(t_i) + ε_j,i ; i=1,…,N, j=1,…,m (10.8a)
or in vector form
ŷ(t_i) = y(t_i) + ε_i ; i=1,…,N (10.8b)
The unknown parameter vector k is obtained by minimizing the corresponding least squares objective function where the weighting matrix Q_i is chosen based on the statistical characteristics of the error term ε_i as already discussed in Chapter 2.
10.2 THE GAUSS-NEWTON METHOD FOR PDE MODELS
Following the same approach as in Chapter 6 for ODE models, we linearize the output vector around the current estimate of the parameter vector k(j) to yield
y(t_i, k(j+1)) = y(t_i, k(j)) + (∂yᵀ/∂k)ᵀ Δk(j+1) (10.9)
The output sensitivity matrix (∂yᵀ/∂k)ᵀ is related to the sensitivity coefficient matrix defined as
G(t,z) = (∂wᵀ/∂k)ᵀ (10.10a)
or equivalently
G_ji(t,z) = (∂w_j/∂k_i) ; j=1,…,n, i=1,…,p (10.10b)
The relationship between (∂y_j/∂k_l) and G_jl is obtained by implicit differentiation of Equations 10.4 − 10.7 depending on the type of measurements we have. Namely,
(a) if the output relationship is given by Equation 10.4, then
(∂y_jᵀ/∂k)ᵀ = c_jᵀ G(t_i, z_j) (10.11a)
or
(∂y_j/∂k_l) = c_jᵀ g_l(t_i, z_j) ; l=1,…,p (10.11b)
(b) if the output relationship is given by Equation 10.5, then
(∂y_jᵀ/∂k)ᵀ = ∫_{Ω_j} c_jᵀ G(t_i, z) dz (10.12a)
or
(∂y_j/∂k_l) = ∫_{Ω_j} c_jᵀ g_l(t_i, z) dz ; l=1,…,p (10.12b)
(c) if the output relationship is given by Equation 10.6, then
(∂y_jᵀ/∂k)ᵀ = (1/(t_i − t_{i−1})) ∫_{t_{i−1}}^{t_i} c_jᵀ G(t, z_j) dt (10.13a)
or
(∂y_j/∂k_l) = (1/(t_i − t_{i−1})) ∫_{t_{i−1}}^{t_i} c_jᵀ g_l(t, z_j) dt ; l=1,…,p (10.13b)
(d) if the output relationship is given by Equation 10.7, then
(∂y_jᵀ/∂k)ᵀ = (1/(t_i − t_{i−1})) ∫_{t_{i−1}}^{t_i} ∫_{Ω_j} c_jᵀ G(t, z) dz dt (10.14a)
or
(∂y_j/∂k_l) = (1/(t_i − t_{i−1})) ∫_{t_{i−1}}^{t_i} ∫_{Ω_j} c_jᵀ g_l(t, z) dz dt ; l=1,…,p (10.14b)
where g_l(t,z) denotes the lth column of the sensitivity matrix G(t,z).
Obviously in order to implement the Gauss-Newton method we have to compute the sensitivity coefficients G_ji(t,z). In this case however, the sensitivity coefficients are obtained by solving a set of PDEs rather than ODEs.
The governing partial differential equation for G(t,z) is obtained by differentiating both sides of Equation 10.1 with respect to k and reversing the order of differentiation. The resulting PDE for G_ji(t,z) is given by (Seinfeld and Lapidus, 1974),
∂G_ji/∂t = Σ_{s=1}^{n} (∂f_j/∂w_s) G_si + Σ_{s=1}^{n} [∂f_j/∂(∂w_s/∂z)] (∂G_si/∂z) + Σ_{s=1}^{n} [∂f_j/∂(∂²w_s/∂z²)] (∂²G_si/∂z²) + (∂f_j/∂k_i) ; j=1,…,n, i=1,…,p (10.15)
The initial condition for G_ji(t,z) is obtained by differentiating both sides of Equation 10.2 with respect to k_i. Equivalently we can simply argue that since the initial condition for w_j(t,z) is known, it does not depend on the parameter values and hence,
G_ji(t₀,z) = 0 ; j=1,…,n, i=1,…,p (10.16)
The boundary condition for G_ji(t,z) is obtained similarly, by differentiating both sides of Equation 10.3 with respect to k_i (Equation 10.17).
Equations 10.15 to 10.17 define a set of (n×p) partial differential equations for the sensitivity coefficients that need to be solved at each iteration of the Gauss-Newton method together with the n PDEs for the state variables.
Having computed (∂yᵀ/∂k)ᵀ we can proceed and obtain a linear equation for Δk(j+1) by substituting Equation 10.9 into the least squares objective function and using the stationary criterion (∂S/∂Δk(j+1)) = 0. The resulting equation is of the form
A Δk(j+1) = b (10.18)
where
A = Σ_{i=1}^{N} (∂yᵀ/∂k) Q_i (∂yᵀ/∂k)ᵀ (10.19)
and
b = Σ_{i=1}^{N} (∂yᵀ/∂k) Q_i [ŷ(t_i) − y(t_i, k(j))] (10.20)
At this point we can summarize the steps required to implement the Gauss-Newton method for PDE models. At each iteration, given the current estimate of the parameters, k(j), we obtain w(t,z) and G(t,z) by solving numerically the state and sensitivity partial differential equations. Using these values we compute the model output, y(t_i,k(j)), and the output sensitivity matrix, (∂yᵀ/∂k)ᵀ, for each data point i=1,…,N. Subsequently, these are used to set up matrix A and vector b. Solution of the linear equation yields Δk(j+1) and hence k(j+1) is obtained. The bisection rule to yield an acceptable step-size at each iteration of the Gauss-Newton method should also be used.
This approach is useful when dealing with relatively simple partial differential equation models. Seinfeld and Lapidus (1974) have provided a couple of numerical examples for the estimation of a single parameter by the steepest descent algorithm for systems described by one or two simultaneous PDEs with simple boundary conditions.
In our opinion the above formulation does not provide any computational
advantage over the approach described next since the PDEs (state and sensitivity
equations) need to be solved numerically.
10.3 THE GAUSS-NEWTON METHOD FOR DISCRETIZED PDE
MODELS
Rather than discretizing the PDEs for the state variables and the sensitivity
coefficients in order to solve them numerically at each iteration of the Gauss-
Newton method, it is preferable to discretize first the governing PDEs and then
estimate the unknown parameters. Essentially, instead of a PDE model, we are now dealing with an ODE model of high dimensionality.
If for example we discretize the region over which the PDE is to be solved into M grid blocks, use of finite differences (or any other discretization scheme) to approximate the spatial derivatives in Equation 10.1 yields the following system of ODEs:
dx/dt = φ(t, x; k) (10.21)
where the new nM-dimensional state vector x is defined as
x = [wᵀ(t,z₁), wᵀ(t,z₂), …, wᵀ(t,z_M)]ᵀ (10.22)
where w(t,z_i) is the n-dimensional vector of the original state variables at the ith grid block. The solution of this parameter estimation problem can readily be obtained by the Gauss-Newton method for ODE models presented in Chapter 6 as long as the high dimensionality of the problem is taken into consideration.
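As a minimal illustration of this conversion (Python/NumPy; the one-dimensional diffusion model, the Dirichlet boundaries and the grid are our own example, not one from the book), the PDE ∂w/∂t = k ∂²w/∂z² becomes M ODEs:

```python
import numpy as np

def discretized_rhs(t, x, k, dz):
    """Method-of-lines right hand side for dw/dt = k d2w/dz2 on M
    interior grid blocks, using central differences and zero
    boundary values."""
    xp = np.concatenate(([0.0], x, [0.0]))   # append Dirichlet boundaries
    return k * (xp[2:] - 2.0 * xp[1:-1] + xp[:-2]) / dz**2
```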
10.3.1 Efficient Computation of the Sensitivity Coefficients
The computation of the sensitivity coefficients in PDE models is a highly demanding operation. For example if we consider a typical three-dimensional three-phase reservoir simulation model, we could easily have 1,000 to 10,000 grid blocks with 20 unknown parameters representing unknown average porosities and permeabilities in a ten-zone reservoir structure. This means that the state vector (Equation 10.22) will be comprised of 3,000 to 30,000 state variables since for each grid block we have three state variables (pressure and two saturations). The corresponding number of sensitivity coefficients will be 60,000 to 600,000! Hence, if one implements the Gauss-Newton method the way it was suggested in Chapter 6, 63,000 to 630,000 ODEs would have to be solved simultaneously. A rather formidable task.
Therefore, efficient computation schemes for the state and sensitivity equations are of paramount importance. One such scheme can be developed based on the sequential integration of the sensitivity coefficients. The idea of decoupling the direct calculation of the sensitivity coefficients from the solution of the model equations was first introduced by Dunker (1984) for stiff chemical mechanisms such as the oxidation of hydrocarbons in the atmosphere, the pyrolysis of ethane, etc. Leis and Kramer (1988) presented implementation guidelines and an error control strategy that ensures the independent satisfaction of local error criteria by both numerical solutions of the model and the sensitivity equations. The procedure has been adapted and successfully used for automatic history matching (i.e., parameter estimation) in reservoir engineering (Tan and Kalogerakis, 1991; Tan, 1991).
Due to the high dimensionality of 3-D models, Euler's method is often used for the integration of the spatially discretized model equations. There are several implementations of Euler's method (e.g., semi-implicit, fully implicit or adaptive implicit). Fully implicit methods require much more extensive coding and significant computational effort in matrix operations with large storage requirements. The work required during each time-step is considerably more than for other solution methods. However, these disadvantages are fully compensated by the stability of the solution method which allows the use of much larger time-steps in situations that exhibit large pore volume throughputs, well coning, gas percolation or high transmissibility variation (Tan, 1991). Obviously, in parameter estimation fully implicit formulations are preferable since stable integration of the state and sensitivity ODEs for a wide range of parameter values is highly desirable.
As we have already pointed out in Chapter 6, the Jacobian matrix of the state and sensitivity equations is the same. As a result, we can safely assume that the maximum allowable time-step that has been determined during the implicit integration of the state equations should also be acceptable for the integration of the sensitivity equations. If the state and sensitivity equations were simultaneously integrated, for each reduction in the time-step any work performed in the integration of the sensitivity equations would have been in vain. Therefore, it is proposed to integrate the sensitivity equations only after the integration of the model equations has converged for each time-step. This is shown schematically in Figure 10.1.
The integration of the state equations (Equation 10.21) by the fully implicit Euler's method is based on the iterative determination of x(t_{i+1}). Thus, having x(t_i) we solve the following difference equation for x(t_{i+1}),
x(t_{i+1}) = x(t_i) + Δt φ(t_{i+1}, x(t_{i+1}); k) (10.23)
where Δt is t_{i+1} − t_i.
If we denote by x(j)(t_{i+1}), x(j+1)(t_{i+1}), x(j+2)(t_{i+1}), … the iterates for the determination of x(t_{i+1}), linearization of the right hand side of Equation 10.23 around x(j)(t_{i+1}) yields
x(j+1)(t_{i+1}) = x(t_i) + Δt φ(t_{i+1}, x(j)(t_{i+1}); k) + Δt (∂φᵀ/∂x)ᵀ Δx(j+1) (10.24)
which upon rearrangement yields
[I − Δt (∂φᵀ/∂x)ᵀ] Δx(j+1) = x(t_i) − x(j)(t_{i+1}) + Δt φ(t_{i+1}, x(j)(t_{i+1}); k) (10.25)
where Δx(j+1) = x(j+1)(t_{i+1}) − x(j)(t_{i+1}) and the Jacobian matrix (∂φᵀ/∂x)ᵀ is evaluated at x(j)(t_{i+1}).
Equation 10.25 is of the form AΔx = b, which can be solved for Δx(j+1) and thus x(j+1)(t_{i+1}) is obtained. Normally, we converge to x(t_{i+1}) in very few iterations. If however, convergence is not achieved or the integration error tolerances are not satisfied, the time-step is reduced and the computations are repeated.
[Figure 10.1: staggered time-lines showing the model ODEs solved first over each time-step, followed by the sensitivity ODEs for k₁, k₂ and k₃ between t_{i−1}, t_i, t_{i+1} and t_{i+2}.]
Figure 10.1 Schematic diagram of the sequential solution of model and sensitivity equations. The order is shown for a three-parameter problem. Steps 1, 5 and 9 involve iterative solution that requires a matrix inversion at each iteration of the fully implicit Euler's method. All other steps (i.e., the integration of the sensitivity equations) involve only one matrix multiplication each.
For the solution of Equation 10.25 the inverse of matrix A is computed by iterative techniques as opposed to direct methods often employed for matrices of low order. Since matrix A is normally very large, its inverse is more economically found by an iterative method. Many iterative methods have been published, such as successive over-relaxation (SOR) and its variants, the strongly implicit procedure (SIP) and its variants, Orthomin and its variants (Stone, 1968), nested factorization (Appleyard and Chesire, 1983) and iterative D4 with minimization (Tan and Letkeman, 1982) to name a few.
Once x(t_{i+1}) has been computed, we can proceed with the computation of the sensitivity coefficients at t_{i+1}. Namely, if we denote as g_r(t_{i+1}) the rth column of the sensitivity matrix G(t_{i+1}), g_r(t_{i+1}) is to be obtained by solving the linear ODE
dg_r(t)/dt = (∂φᵀ/∂x)ᵀ g_r(t) + (∂φ/∂k_r) (10.26)
which for the fully implicit Euler's method yields the following difference equation
g_r(t_{i+1}) = g_r(t_i) + Δt [(∂φᵀ/∂x)ᵀ g_r(t_{i+1}) + (∂φ/∂k_r)] (10.27)
where the Jacobian matrix (∂φᵀ/∂x)ᵀ and vector (∂φ/∂k_r) are evaluated at x(t_{i+1}). Upon rearrangement we have,
[I − Δt (∂φᵀ/∂x)ᵀ] g_r(t_{i+1}) = g_r(t_i) + Δt (∂φ/∂k_r) (10.28)
The solution of Equation 10.28 is obtained in one step by performing a simple matrix multiplication, since the inverse of the matrix on the left hand side of Equation 10.28 is already available from the integration of the state equations. Equation 10.28 is solved for r=1,…,p and thus the whole sensitivity matrix G(t_{i+1}) is obtained as [g₁(t_{i+1}), g₂(t_{i+1}),…,g_p(t_{i+1})]. The computational savings that are realized by the above procedure are substantial, especially when the number of unknown parameters is large (Tan and Kalogerakis, 1991). With this modification the computational requirements of the Gauss-Newton method for PDE models become reasonable and hence, the estimation method becomes implementable.
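A sketch of the sensitivity step of Equation 10.28 (Python with NumPy/SciPy; dense LU factorization stands in for the iterative solvers mentioned above, and all names are ours):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def sensitivity_step(G, J, dphi_dk, dt):
    """Advance all sensitivity columns over one fully implicit Euler
    step (Equation 10.28). J = (d phi/d x) is the Jacobian at the
    converged new state, so the factorization below is the same one
    needed by the last Newton iteration of the state solve."""
    n, p = G.shape
    lu = lu_factor(np.eye(n) - dt * J)          # factor (I - dt J) once
    cols = [lu_solve(lu, G[:, r] + dt * dphi_dk[:, r]) for r in range(p)]
    return np.column_stack(cols)                # G(t_{i+1})
```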
A numerical example for the estimation of unknown parameters in PDE models is provided in Chapter 18 where we discuss automatic history matching of reservoir simulation models.
11
Statistical Inferences
Once we have estimated the unknown parameters that appear in an algebraic
or ODE model, it is quite important to perform a few additional calculations to
establish estimates of the standard error in the parameters and in the expected re-
sponse variables. These additional computational steps are very valuable as they
provide us with a quantitative measure of the quality of the overall fit and inform
us how trustworthy the parameter estimates are.
11.1 INFERENCES ON THE PARAMETERS
When the Gauss-Newton method is used to estimate the unknown parameters, we linearize the model equations and at each iteration we solve the corresponding linear least squares problem. As a result, the estimated parameter values have linear least squares properties. Namely, the parameter estimates are normally distributed, unbiased (i.e., E(k*)=k) and their covariance matrix is given by
COV(k*) = σ̂_ε² [A*]⁻¹ (11.1)
where A* is matrix A evaluated at k*. It should be noted that for linear least squares matrix A is independent of the parameters, while this is clearly not the case for nonlinear least squares problems. The required estimate of the variance σ̂_ε² is obtained from
σ̂_ε² = S(k*)/(d.f.) = S(k*)/(Nm − p) (11.2)
where (d.f.) = Nm − p are the degrees of freedom, namely the total number of measurements minus the number of unknown parameters.
The above expressions for COV(k*) and σ̂_ε² are valid if the statistically correct choice of the weighting matrix Q_i (i=1,…,N) is used in the formulation of the problem. Namely, if the errors in the response variables (ε_i, i=1,…,N) are normally distributed with zero mean and covariance matrix,
COV(ε_i) = σ_ε² M_i (11.3)
we should use [M_i]⁻¹ as the weighting matrix Q_i, where the matrices M_i, i=1,…,N are known whereas the scaling factor, σ_ε², could be unknown. Based on the structure of M_i we arrive at the various cases of least squares estimation (Simple LS, Weighted LS or Generalized LS) as described in detail in Chapter 2.
Although the computation of COV(k*) is a simple extra step after convergence of the Gauss-Newton method, we are not obliged to use the Gauss-Newton method for the search of the best parameter values. Once k* has been obtained using any search method, one can proceed and compute the sensitivity coefficients by setting up matrix A and thus quantify the uncertainty in the estimated parameter values by estimating COV(k*).
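These computations amount to a few lines; a sketch (Python/NumPy; the interface is ours):

```python
import numpy as np

def parameter_statistics(A_star, S_star, N, m, p):
    """Covariance matrix and standard errors of the parameter estimates
    (Equations 11.1, 11.2 and 11.6)."""
    dof = N * m - p                        # degrees of freedom
    sigma2 = S_star / dof                  # estimated error variance
    cov_k = sigma2 * np.linalg.inv(A_star)
    return cov_k, np.sqrt(np.diag(cov_k))  # COV(k*) and standard errors
```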
Approximate inference regions for nonlinear models are defined by analogy to the linear models. In particular, the (1−α)100% joint confidence region for the parameter vector k is described by the ellipsoid,
[k − k*]ᵀ A* [k − k*] = p σ̂_ε² F^α_{p, Nm−p} (11.4a)
or
[k − k*]ᵀ [COV(k*)]⁻¹ [k − k*] = p F^α_{p, Nm−p} (11.4b)
where α is the selected probability level in Fisher's F-distribution and F^α_{p, Nm−p} is obtained from the F-distribution tables with ν₁=p and ν₂=(Nm−p) degrees of freedom. The corresponding (1−α)100% marginal confidence interval for each parameter, k_i, i=1,2,…,p, is given by
k_i* − t^{α/2}_ν σ̂_{k_i} ≤ k_i ≤ k_i* + t^{α/2}_ν σ̂_{k_i} (11.5)
where t^{α/2}_ν is obtained from the tables of Student's T-distribution with ν=(Nm−p) degrees of freedom. The standard error of parameter k_i, σ̂_{k_i}, is obtained as the
square root of the corresponding diagonal element of the inverse of matrix A* multiplied by σ̂_ε, i.e.,
σ̂_{k_i} = σ̂_ε √([A*]⁻¹)_ii (11.6)
It is reminded that for ν≥30 the approximation t^{α/2}_ν ≈ z_{α/2} can be used, where z_{α/2} is obtained from the standard normal distribution tables. Simply put, when we have many data points, we can simply take as the 95% confidence interval twice the standard error (recall that z₀.₀₂₅ = 1.96 whereas t^{0.025}₃₀ = 2.042).
The linear approximation employed by the Gauss-Newton method in solving the nonlinear least squares problem enables us to obtain inference regions for the parameters very easily. However, these regions are only approximate and practice has shown that in many cases they can be very misleading (Watts and Bates, 1988). Nonetheless, even if the regions are not exact, we obtain a very good idea of the correlation among the parameters.
We can compute the exact (1−α)100% joint parameter likelihood region using the equation given below
S(k) = S(k*) [1 + p/(Nm−p) · F^α_{p, Nm−p}] (11.7)
The computation of the above surface in the parameter space is not trivial. For the two-parameter case (p=2), the joint confidence region on the k₁−k₂ plane can be determined by using any contouring method. The contour line is approximated from many function evaluations of S(k) over a dense grid of (k₁, k₂) values.
11.2 INFERENCES ON THE EXPECTED RESPONSE VARIABLES
Having determined the uncertainty in the parameter estimates, we can proceed and obtain confidence intervals for the expected mean response. Let us first consider models described by a set of nonlinear algebraic equations, y = f(x,k). The (1−α)100% confidence interval of the expected mean response of the variable y_j at x₀ is given by
y_j(x₀,k*) − t^{α/2}_ν σ̂_{y_j0} ≤ E(y_j(x₀)) ≤ y_j(x₀,k*) + t^{α/2}_ν σ̂_{y_j0} (11.8)
where t^{α/2}_ν is obtained from the tables of Student's T-distribution with ν=(Nm−p) degrees of freedom. Based on the linear approximation of the model equations, we have
y(x₀,k) ≈ f(x₀,k*) + (∂fᵀ/∂k)ᵀ (k − k*) (11.9)
with the partial derivative (∂fᵀ/∂k)ᵀ evaluated at x₀ and k*. Taking variances of both sides we have
COV(y₀) = (∂fᵀ/∂k)ᵀ COV(k*) (∂fᵀ/∂k) (11.10a)
Substitution of the expression for COV(k*) yields,
COV(y₀) = σ̂_ε² (∂fᵀ/∂k)ᵀ [A*]⁻¹ (∂fᵀ/∂k) (11.10b)
The standard prediction error of y_j0, σ̂_{y_j0}, is the square root of the jth diagonal element of COV(y₀), namely,
σ̂_{y_j0} = σ̂_ε √{[(∂fᵀ/∂k)ᵀ [A*]⁻¹ (∂fᵀ/∂k)]_jj} (11.11)
Equation 11.8 represents the confidence interval for the mean expected response rather than a future observation (future measurement) of the response variable, ŷ₀. In this case, besides the uncertainty in the estimated parameters, we must include the uncertainty due to the measurement error (ε₀). The (1−α)100% confidence interval of ŷ_j0 is
y_j(x₀,k*) − t^{α/2}_ν σ̂_{ŷ_j0} ≤ ŷ_j0 ≤ y_j(x₀,k*) + t^{α/2}_ν σ̂_{ŷ_j0} (11.12)
where the standard prediction error of ŷ_j0 is given by
σ̂_{ŷ_j0} = σ̂_ε √{1 + [(∂fᵀ/∂k)ᵀ [A*]⁻¹ (∂fᵀ/∂k)]_jj} (11.13)
Next let us turn our attention to models described by a set of ordinary differential equations. We are interested in establishing confidence intervals for each of the response variables y_j, j=1,…,m at any time t=t₀. The linear approximation of the output vector at time t₀,
y(t₀,k) = C x(t₀,k*) + C G(t₀) [k − k*] (11.14)
yields the expression for COV(y(t₀)),
COV(y(t₀)) = C G(t₀) COV(k*) Gᵀ(t₀) Cᵀ (11.15)
which can be rewritten as
COV(y₀) = σ̂_ε² C G(t₀) [A*]⁻¹ Gᵀ(t₀) Cᵀ (11.16)
with the sensitivity coefficient matrix G(t₀) evaluated at k*. The estimated standard prediction error of y_j(t₀) is obtained as the square root of the jth diagonal element of COV(y(t₀)),
σ̂_{y_j0} = σ̂_ε √{[C G(t₀) [A*]⁻¹ Gᵀ(t₀) Cᵀ]_jj} (11.17)
Based on the latter, we can compute the (1−α)100% confidence interval of the expected mean response of y_j at t=t₀,
y_j(t₀,k*) − t^{α/2}_ν σ̂_{y_j0} ≤ E(y_j(t₀)) ≤ y_j(t₀,k*) + t^{α/2}_ν σ̂_{y_j0} (11.18)
If on the other hand we wish to compute the (1−α)100% confidence interval of the response of ŷ_j at t=t₀, we must include the error term (ε₀) in the calculation of the standard error, namely we have
y_j(t₀,k*) − t^{α/2}_ν σ̂_{ŷ_j0} ≤ ŷ_j0 ≤ y_j(t₀,k*) + t^{α/2}_ν σ̂_{ŷ_j0} (11.19)
where
σ̂_{ŷ_j0} = σ̂_ε √{1 + [C G(t₀) [A*]⁻¹ Gᵀ(t₀) Cᵀ]_jj} (11.20)
11.3 MODEL ADEQUACY TESTS
There is a plethora of model adequacy tests that the user can employ to de-
cide whether the assumed mathematical model is indeed adequate. Generally
speaking these tests are based on the comparison of the experimental error vari-
ance estimated by the model to that obtained experimentally or through other
means.
11.3.1 Single Response Models
Let us first consider models that have only one measured variable (m=1). We shall consider two cases: one where we know precisely the value of the experimental error variance and the other where we have only an estimate of it, i.e., there is quantifiable uncertainty in our estimate of the experimental error variance.

CASE 1: σ_ε² is known precisely:
In this case we assume that we know precisely the value of the standard experimental error in the measurements (σ_ε). Using Equation 11.2 we obtain an estimate of the experimental error variance under the assumption that the model is adequate. Therefore, to test whether the model is adequate we simply need to test the hypothesis
H₀: σ²_model = σ_ε²
at any desirable level of significance, e.g., α=0.05. Here with σ²_model we denote the error variance estimated by the model equations (Equation 11.2); namely, σ̂_ε² is an estimate of σ²_model.
Since σ_ε² is known exactly (i.e., there is no uncertainty in its value, it is a given number) the above hypothesis test is done through a χ²-test. Namely,
If χ²_data > χ²_{ν=(Nm−p), 1−α} → Reject H₀
where
χ²_data = (Nm − p) σ̂_ε² / σ_ε² = S(k*) / σ_ε² (11.21)
and χ²_{ν=(Nm−p), 1−α} is obtained from the tables of the χ²-distribution with degrees of freedom ν=(Nm−p).
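A sketch of this test (Python with SciPy; the interface is ours):

```python
from scipy.stats import chi2

def chi_square_adequacy(S_star, sigma_eps2, N, m, p, alpha=0.05):
    """Chi-square adequacy test for known error variance (Equation
    11.21): returns True if H0 (model adequate) is rejected."""
    chi2_data = S_star / sigma_eps2
    return chi2_data > chi2.ppf(1.0 - alpha, N * m - p)
```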
CASE 2: σ_ε² is known approximately:
Let us assume that σ_ε² is not known exactly; however, we have performed n repeated measurements of the response variable. From this small sample of multiple measurements we can determine the sample mean and sample variance. If s_E² is the sample estimate of σ_ε² estimated from the n repeated measurements, it is given by
s_E² = (1/(n−1)) Σ_{i=1}^{n} (ŷ_i − ȳ)² (11.22)
where the sample mean is obtained from
ȳ = (1/n) Σ_{i=1}^{n} ŷ_i (11.23)
Again, we test the hypothesis at any desirable level of significance, for example α=0.05,
H₀: σ²_model = σ_ε²
In this case, since σ_ε² is known only approximately, the above hypothesis is tested using an F-test, i.e.,
If F_data > F^{1−α}_{ν₁=(Nm−p), ν₂=n−1} → Reject H₀
where
F_data = σ̂_ε² / s_E² (11.24)
and F^{1−α}_{ν₁=(Nm−p), ν₂=n−1} is obtained from the tables of the F-distribution.
11.3.2 Multivariate Models
Let us now consider models that have more than one measured variable (m>1). The previously described model adequacy tests have multivariate extensions that can be found in several advanced statistics textbooks. For example, the book Introduction to Applied Multivariate Statistics by Srivastava and Carter (1983) presents several tests on covariance matrices.
In many engineering applications, however, we can easily reduce the problem to the univariate tests presented in the previous section by assuming that the covariance matrix of the errors can be written as
COV(ε_i) = σ_ε² M_i ; i=1,…,N (11.25)
where M_i are known matrices. Actually quite often we can further assume that the matrices M_i, i=1,…,N are the same and equal to matrix M.
An independent estimate of COV(ε), Σ̂_E, that is required for the adequacy tests can be obtained by performing N_R repeated experiments as
Σ̂_E = (1/(N_R − 1)) Σ_{i=1}^{N_R} [ŷ_i − ȳ] [ŷ_i − ȳ]ᵀ (11.26)
or, for the case of univariate tests, s_E², the sample estimate of σ_ε² in Equation 11.25, can be obtained from
s_E² = (1/(m(N_R − 1))) Σ_{i=1}^{N_R} [ŷ_i − ȳ]ᵀ M⁻¹ [ŷ_i − ȳ] (11.27)
12
Design of Experiments
It is quite obvious by now that the quality of the parameter estimates that have been obtained with any of the previously described techniques ultimately depends on "how good" the data at hand is. It is thus very important, when we do have the option, to design our experiments in such a way that the information content of the data is the highest possible. Generally speaking, there are two approaches to experimental design: (1) factorial design and (2) sequential design. In sequential experimental design we attempt to satisfy one of the following two objectives: (i) estimate the parameters as accurately as possible or (ii) discriminate among several rival models and select the best one. In this chapter we briefly discuss the design of preliminary experiments (factorial designs), and then we focus our attention on sequential experimental design, where we examine algebraic and ODE models separately.
12.1 PRELIMINARY EXPERIMENTAL DESIGN
There are many books that address experimental design and present factorial experimental design in detail (for example, Design and Analysis of Experiments by Montgomery (1997) or Design of Experiments by Anderson and McLean (1974)). As engineers, we are quite often faced with the need to design a set of preliminary experiments for a process for which very little or essentially no information is available.

Factorial design, and in particular 2^k designs, represent a generally sound strategy. The independent variables (also called factors in experimental design) are assigned two values (a high and a low value) and experiments are conducted in all possible combinations of the independent variables (also called treatments). These experiments are easy to design and, if the levels are chosen appropriately, very valuable information about the model can be gathered.
For example, if we have two independent variables (x₁ and x₂) the following four (2²) experiments can be readily designed:

Run 1: (x₁-Low, x₂-Low)
Run 2: (x₁-Low, x₂-High)
Run 3: (x₁-High, x₂-Low)
Run 4: (x₁-High, x₂-High)

If we have three independent variables to vary, the complete 2³ factorial design corresponds to the following 8 experiments:

Run 1: (x₁-Low, x₂-Low, x₃-Low)
Run 2: (x₁-Low, x₂-Low, x₃-High)
Run 3: (x₁-Low, x₂-High, x₃-Low)
Run 4: (x₁-High, x₂-Low, x₃-Low)
Run 5: (x₁-Low, x₂-High, x₃-High)
Run 6: (x₁-High, x₂-Low, x₃-High)
Run 7: (x₁-High, x₂-High, x₃-Low)
Run 8: (x₁-High, x₂-High, x₃-High)
These designs are extremely powerful (from a statistical point of view) if we do not have a mathematical model of the system under investigation and we simply wish to establish the effect of each of these three independent variables (or their interaction) on the measured response variables.

This is rarely the case in engineering. Most of the time we do have some form of a mathematical model (simple or complex) that has several unknown parameters that we wish to estimate. In these cases the above designs are very straightforward to implement; however, the information may be inadequate if the mathematical model is nonlinear and comprised of several unknown parameters. In such cases, multilevel factorial designs (for example, 3^k or 4^k designs) may be more appropriate.

A typical 3² design for the case of two independent variables (x₁ and x₂) that can each assume three values (Low, Medium and High) takes the form:
Run 1: (x₁-Low, x₂-Low)
Run 2: (x₁-Low, x₂-Medium)
Run 3: (x₁-Low, x₂-High)
Run 4: (x₁-Medium, x₂-Low)
Run 5: (x₁-Medium, x₂-Medium)
Run 6: (x₁-Medium, x₂-High)
Run 7: (x₁-High, x₂-Low)
Run 8: (x₁-High, x₂-Medium)
Run 9: (x₁-High, x₂-High)
The above experimental design constitutes an excellent set of "preliminary experiments" for nonlinear models with several unknown parameters. Based on the analysis of these experiments we obtain estimates of the unknown parameters that we can use to design subsequent experiments in a rational manner, taking advantage of all information gathered up to that point.
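As a small illustration of how little machinery such designs require, the following sketch enumerates the treatments of full 2^k and 3^k factorial designs (the level labels are placeholders for the actual values chosen by the experimenter).

```python
# Sketch: enumerate all treatments of a full factorial design.
from itertools import product

def factorial_design(levels_per_factor):
    """Return every combination of factor levels (one tuple per run)."""
    return list(product(*levels_per_factor))

runs_2k = factorial_design([("Low", "High")] * 3)            # 2^3 design: 8 runs
runs_3k = factorial_design([("Low", "Medium", "High")] * 2)  # 3^2 design: 9 runs
```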
12.2 SEQUENTIAL EXPERIMENTAL DESIGN FOR PRECISE PARAMETER ESTIMATION

Let us assume that N experiments have been conducted up to now and, given an estimate of the parameter vector based on the experiments performed so far, we wish to design the next experiment so that we maximize the information that shall be obtained. In other words, the problem we are attempting to solve is:

What are the best conditions to perform the next experiment so that the variance of the estimated parameters is minimized?

Let us consider the case of an algebraic equation model (i.e., y = f(x,k)). The problem can be restated as "find the best experimental conditions (i.e., x_N+1) where the next experiment should be performed so that the variance of the parameters is minimized."

It was shown earlier that if N experiments have been performed, the covariance matrix of the parameters is estimated by

$$COV(\mathbf{k}^*) = \hat{\sigma}^2_\varepsilon\,\mathbf{A}^{-1} \qquad (12.1)$$
where

$$\mathbf{A} = \sum_{i=1}^{N}\mathbf{G}_i^T\mathbf{Q}_i\mathbf{G}_i \qquad (12.2)$$

and where Gᵢ is the sensitivity coefficients matrix, (∂f/∂k)ᵢ, for the i-th experiment and Qᵢ is a suitably selected weighting matrix based on the distribution of the errors in the measurement of the response variables. Obviously, the sensitivity matrix Gᵢ is only a function of xᵢ and the current estimate of k.

If for a moment we assume that the next experiment at conditions x_N+1 has been performed, the new matrix A would be:

$$\mathbf{A}^{new} = \sum_{i=1}^{N}\mathbf{G}_i^T\mathbf{Q}_i\mathbf{G}_i + \mathbf{G}_{N+1}^T\mathbf{Q}_{N+1}\mathbf{G}_{N+1} = \mathbf{A}^{old} + \mathbf{G}_{N+1}^T\mathbf{Q}_{N+1}\mathbf{G}_{N+1} \qquad (12.3)$$

and the resulting parameter covariance matrix would be

$$COV(\mathbf{k}^*) = \hat{\sigma}^2_\varepsilon\,[\mathbf{A}^{new}]^{-1} \qquad (12.4)$$
The implication here is that if the parameter values do not change significantly from their current estimates when the additional measurements are included in the estimation, we can quantify the effect of each additional experiment before it has been carried out! Hence, we can search all over the operability region (i.e., over all the potential values of x_N+1) to find the best conditions for the next experiment. The operability region is defined as the set of feasible experimental conditions and can usually be adequately represented by a small number of grid points. The size and form of the operability region are dictated by the feasibility of attaining these conditions. For example, the thermodynamic equilibrium surface at a given experimental temperature limits the potential values of the partial pressures of butene, butadiene and hydrogen (the three independent variables) in a butene dehydrogenation reactor (Dumez and Froment, 1976).

Next, we shall discuss the actual optimality criteria that can be used in determining the conditions for the next experiment.
12.2.1 The Volume Design Criterion

If we do not have any particular preference for a specific parameter or a particular subset of the parameter vector, we can minimize the variance of all parameters simultaneously by minimizing the volume of the joint 95% confidence region. Obviously, a small joint confidence region is highly desirable.

Minimization of the volume of the ellipsoid

$$[\mathbf{k}-\mathbf{k}^*]^T\mathbf{A}^{new}[\mathbf{k}-\mathbf{k}^*] = p\,\hat{\sigma}^2_\varepsilon\,F^{\alpha}_{p,(N+1)m-p} \qquad (12.5)$$

is equivalent to maximization of det(A^new), which in turn is equivalent to maximization of the product

$$\prod_{i=1}^{p}\lambda_i \qquad (12.6)$$

where λᵢ, i=1,...,p are the eigenvalues of matrix A^new. Using any eigenvalue decomposition routine for real symmetric matrices, we can calculate the eigenvalues, and hence the determinant of A^new, for each potential value of x_N+1 in the operability region. The conditions that yield the maximum value of det(A^new) should be used to conduct the next experiment.
12.2.2 The Shape Design Criterion

On certain occasions the volume criterion is not appropriate. In particular, when we have an ill-conditioned problem, use of the volume criterion results in an elongated ellipsoid (like a cucumber) for the joint confidence region that has a small volume; however, the variance of the individual parameters can be very high. We can determine the shape of the joint confidence region by examining cond(A), which is equal to λ_max/λ_min and represents the ratio of the principal axes of the ellipsoid.

In this case, it is best to choose the experimental conditions which will yield the minimum length for the largest principal axis of the ellipsoid. This is equivalent to

$$\max_{\mathbf{x}_{N+1}}\ \lambda_{min}(\mathbf{A}^{new})$$

Again, we can determine the condition number and λ_min of matrix A^new using any eigenvalue decomposition routine that computes the eigenvalues of a real symmetric matrix, and use the conditions (x_N+1) that correspond to a maximum of λ_min.
When the parameters differ from each other by several orders of magnitude, the joint confidence region will have a long and narrow shape even if the parameter estimation problem is well-posed. To avoid unnecessary use of the shape criterion, instead of investigating the properties of matrix A given by Equation 12.2, it is better to use the normalized form of matrix A given below (Kalogerakis and Luus, 1984) as A_R,

$$\mathbf{A}_R = \mathbf{K}\mathbf{A}\mathbf{K} \qquad (12.7)$$

where K = diag(k₁, k₂,..., k_p). Therefore, A^new should be determined from

$$\mathbf{A}_R^{new} = \mathbf{K}\mathbf{A}^{new}\mathbf{K} \qquad (12.8)$$

or

$$\mathbf{A}_R^{new} = \mathbf{A}_R^{old} + \mathbf{K}\mathbf{G}_{N+1}^T\mathbf{Q}_{N+1}\mathbf{G}_{N+1}\mathbf{K} \qquad (12.9)$$

Essentially this is equivalent to using (∂fᵢ/∂k_j)k_j instead of (∂fᵢ/∂k_j) for the sensitivity coefficients. By this transformation the sensitivity coefficients are normalized with respect to the parameters and hence the covariance matrix calculated using Equation 12.4 yields the standard deviation of each parameter as a percentage of its current value.
12.2.3 Implementation Steps

Based on the material presented up to now, the steps that need to be followed to design the next experiment for the precise estimation of the model parameters are given below:

Step 1. Perform a series of initial experiments (based on a factorial design) to obtain initial estimates for the parameters and their covariance matrix.

Step 2. For each grid point of the operability region, compute the sensitivity coefficients and generate A^new given by Equation 12.9.

Step 3. Perform an eigenvalue decomposition of matrix A^new to determine its condition number, determinant and λ_min.

Step 4. Select the experimental conditions that correspond to a maximum det(A^new) or maximum λ_min when the volume or the shape criterion is used, respectively. The computed condition number indicates whether the volume or the shape criterion is the most appropriate to use.

Step 5. Perform the experiment at the selected experimental conditions.

Step 6. Based on the additional measurement of the response variables, estimate the parameter vector and its covariance matrix.

Step 7. If the obtained accuracy is satisfactory, stop; else go back to Step 2 and select the conditions for an additional experiment.
In general, the search for the optimal x_N+1 is made all over the operability region. Experience has shown, however, that the best conditions are always found on the boundary of the operability region (Froment and Bischoff, 1990). This simplification can significantly reduce the computational effort required to determine the best experimental conditions for the next experiment. A short sketch of Steps 2 to 4 in code form is given below.
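In the sketch, sensitivity_matrix(x, k) is an assumed user-supplied routine returning G at candidate conditions x, and A_old, Q and k are assumed available from the experiments performed so far; this is an illustration under those assumptions, not a definitive implementation.

```python
# Sketch of Steps 2-4: scan the operability region and score each candidate
# experiment by the volume (det) or shape (lambda_min) criterion, Eq. 12.9.
import numpy as np

def next_experiment(grid, A_old, Q, k, sensitivity_matrix, use_shape=False):
    K = np.diag(k)                          # normalization matrix, Eq. 12.7
    best_x, best_score = None, -np.inf
    for x in grid:                          # grid points of the operability region
        G = sensitivity_matrix(x, k)        # assumed helper: G at conditions x
        A_new = K @ A_old @ K + K @ (G.T @ Q @ G) @ K     # Eq. 12.9
        lam = np.linalg.eigvalsh(A_new)     # eigenvalues of a real symmetric matrix
        score = lam.min() if use_shape else np.prod(lam)  # lambda_min or det(A_new)
        if score > best_score:
            best_x, best_score = x, score
    return best_x
```

In line with the remark above, the grid can often be restricted to the boundary of the operability region to reduce the computational effort.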
12.3 SEQUENTIAL EXPERIMENTAL DESIGN FOR MODEL DISCRIMINATION

Based on alternative assumptions about the mechanism of the process under investigation, one often comes up with a set of alternative mathematical models that could potentially describe the behavior of the system. Of course, it is expected that only one of them is the correct model and the rest should prove to be inadequate under certain operating conditions. Let us assume that we have conducted a set of preliminary experiments and we have fitted several rival models that could potentially describe the system. The problem we are attempting to solve is:

What are the best conditions to perform the next experiment so that with the additional information we will maximize our ability to discriminate among the rival models?

Let us assume that we have r rival models

$$\mathbf{y}^{(i)} = \mathbf{f}^{(i)}(\mathbf{x},\mathbf{k}^{(i)})\ ; \quad i=1,\dots,r \qquad (12.10)$$

that could potentially describe the behavior of the system. Practically this means that if we perform a model adequacy test (as described in Chapter 11) based on the experimental data at hand, none of these models gets rejected. Therefore, we must perform additional experiments in an attempt to determine which one of the rival models is the correct one by rejecting all the rest.
Obviously, it is very important that the next experiment has maximum discriminating power. Let us illustrate this point with a very simple example where simple common sense arguments can lead us to a satisfactory design. Assume that we have the following two rival single-response models, each with two parameters and one independent variable:

Model 1:  (12.11a)

Model 2:  (12.11b)

Design of the next experiment simply means selection of the best value of x for the next run. Let us assume that, based on information up to now, we have estimated the parameters in each of the two rival models and computed their predicted responses shown in Figure 12.1. It is apparent that if the current parameter estimates do not change significantly, both models will be able to fit satisfactorily data taken in the vicinity of x₂ or x₄, where the predicted response of the two models is about the same. On the other hand, if we conduct the next experiment near x₅, only one of the two models will most likely be able to fit it satisfactorily. The vicinity of x₅, as shown in Figure 12.1, corresponds to the area of the operability region (defined as the interval [x_min, x_max]) where the divergence between the two models is maximized.
12.3.1 The Divergence Design Criterion

If the structure of the models is more complex, and we have more than one independent variable or more than two rival models, selection of the best experimental conditions may not be as obvious as in the above example. A straightforward design to obtain the best experimental conditions is based on the divergence criterion.

Hunter and Reimer (1965) proposed the simple divergence criterion that can readily be extended to multi-response situations. In general, if we have performed N experiments, the experimental conditions for the next one are obtained by maximizing the weighted divergence between the rival models, defined as

$$D(\mathbf{x}_{N+1}) = \sum_{i=1}^{r-1}\sum_{j=i+1}^{r}\left[\mathbf{y}^{(i)}-\mathbf{y}^{(j)}\right]^T\mathbf{Q}\left[\mathbf{y}^{(i)}-\mathbf{y}^{(j)}\right] \qquad (12.12)$$

where r is the number of rival models and Q is a user-supplied weighting matrix. The role of Q is to normalize the contribution of the response variables, which may be of different orders of magnitude (e.g., the elements of the output vector, y, could be comprised of the process temperature in degrees K, pressure in Pa and concentration as mole fraction).
Box and Hill (1967) proposed a criterion that incorporates the uncertainties associated with the model predictions. For two rival single-response models the proposed divergence expression takes the form

(12.13)

where σ_ε² is the variance of the experimental error in the measurement (obtained through replicate experiments) and σ²_y1, σ²_y2 are the variances of y as calculated based on models 1 and 2 respectively. P₀,₁ is the prior probability (based on the N experiments already performed) of model 1 being the correct model. Similarly, P₀,₂ is the prior probability of model 2. Box and Hill (1967) expressed the adequacy in terms of posterior probabilities, which serve as prior probabilities for the design of the next experiment.

Buzzi-Ferraris et al. (1983, 1984) proposed the use of a more powerful statistic for the discrimination among rival models; however, it is computationally more intensive, as it requires the calculation of the sensitivity coefficients at each grid point of the operability region. Thus, the simple divergence criterion of Hunter and Reimer (1965) appears to be the most attractive.
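A sketch of the simple divergence criterion over a candidate grid is shown below; models is a list of prediction functions y(x) for the rival models (illustrative names) and Q the user-supplied weighting matrix.

```python
# Sketch: weighted divergence of Eq. 12.12 and its maximization over a grid.
import numpy as np

def weighted_divergence(x, models, Q):
    y = [np.asarray(m(x)) for m in models]  # predicted responses of the r rival models
    D = 0.0
    for i in range(len(y) - 1):
        for j in range(i + 1, len(y)):
            d = y[i] - y[j]
            D += d @ Q @ d                  # [y_i - y_j]^T Q [y_i - y_j]
    return D

# best conditions for the next experiment:
# x_next = max(grid, key=lambda x: weighted_divergence(x, models, Q))
```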
12.3.2 Model Adequacy Tests for Model Discrimination

Once the new experiment has been conducted, all models are tested for adequacy. Depending on our knowledge of σ_ε², we may perform a χ²-test, an F-test or Bartlett's χ²-test (when σ_ε² is completely unknown). The first two tests have been discussed previously in Chapter 11 as Cases I and II. Bartlett's χ²-test is presented next.

Case III: Bartlett's χ²-test

Let us assume that σ̂ᵢ² is the estimate of σ_ε² from the i-th model. The associated degrees of freedom for the i-th model are denoted with (d.f.)ᵢ and are equal to the total number of measurements minus the number of parameters that appear in the i-th model. The idea is that the minimum sum of squares of residuals divided by the appropriate degrees of freedom is an unbiased estimate of the experimental error variance only for the correct mathematical model. All other estimates based on the other models are biased due to lack of fit. Hence, one simply needs to examine the homogeneity of the variances estimated by the rival models. Namely, Bartlett's χ²-test is based on testing the hypothesis

$$H_0:\ \sigma_1^2 = \sigma_2^2 = \dots = \sigma_r^2$$

with a user-selected level of significance (usually α=0.01 or α=0.05 is employed).

The above hypothesis is tested using a χ²-test. Namely,

If χ²_data > χ²_ν=r-1,1-α, reject H₀

where χ²_ν=r-1,1-α is obtained from the tables and χ²_data is computed from the data as follows. First, we generate the pooled variance, σ̂_p²,

$$\hat{\sigma}_p^2 = \frac{\displaystyle\sum_{i=1}^{r}(d.f.)_i\,\hat{\sigma}_i^2}{\displaystyle\sum_{i=1}^{r}(d.f.)_i} \qquad (12.14)$$

which contains lack of fit, and then we compute χ²_data from
$$\chi^2_{data} = \frac{\left(\displaystyle\sum_{i=1}^{r}(d.f.)_i\right)\ln\hat{\sigma}_p^2 \;-\; \displaystyle\sum_{i=1}^{r}(d.f.)_i\,\ln\hat{\sigma}_i^2}{1 + \dfrac{1}{3(r-1)}\left[\displaystyle\sum_{i=1}^{r}\dfrac{1}{(d.f.)_i} - \dfrac{1}{\displaystyle\sum_{i=1}^{r}(d.f.)_i}\right]} \qquad (12.15)$$
When the hypothesis H₀ is rejected, we drop the model with the highest σ̂ᵢ² and we repeat the test with one less model. We keep on removing models until H₀ cannot be rejected any more. These models are then used in the determination of the overall divergence for the selection of the experimental conditions for the next experiment.

Although all the underlying assumptions (local linearity, statistical independence, etc.) are rarely satisfied, Bartlett's χ²-test procedure has been found adequate in both simulated and experimental applications (Dumez et al., 1977; Froment, 1975). However, it should be emphasized that only the χ²-test and the F-test are true model adequacy tests. Consequently, they may eliminate all rival models if none of them is truly adequate. On the other hand, Bartlett's χ²-test does not guarantee that the retained model is truly adequate. It simply suggests that it is the best one among a set of inadequate models!
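A compact sketch of Bartlett's test as described above (Equations 12.14 and 12.15) is given below; s2 holds the variance estimates σ̂ᵢ² of the rival models and df their degrees of freedom.

```python
# Sketch of Bartlett's chi-square test for homogeneity of the r estimated variances.
import numpy as np
from scipy.stats import chi2

def bartlett_chi2(s2, df, alpha=0.05):
    s2, df = np.asarray(s2, float), np.asarray(df, float)
    r = len(s2)
    pooled = np.sum(df * s2) / np.sum(df)                        # Eq. 12.14
    num = np.sum(df) * np.log(pooled) - np.sum(df * np.log(s2))
    den = 1.0 + (np.sum(1.0 / df) - 1.0 / np.sum(df)) / (3.0 * (r - 1))
    chi2_data = num / den                                        # Eq. 12.15
    return chi2_data, chi2_data > chi2.ppf(1.0 - alpha, r - 1)   # True -> reject H0
```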
12.3.3 Implementation Steps for Model Discrimination

Based on the material presented above, the implementation steps to design the next experiment for the discrimination among r rival models are:

Step 1. Perform a series of initial experiments (based on a factorial design) to obtain initial estimates for the parameters and their covariance matrix for each of the r rival models.

Step 2. For each grid point of the operability region, compute the weighted divergence, D, given by Equation 12.12. In the computations we consider only the models which are adequate at the present time (not all the rival models).

Step 3. Select the experimental conditions that correspond to a maximum D.

Step 4. Perform the experiment at the selected experimental conditions.

Step 5. Based on the new measurement of the response variables, estimate the parameter vector and its covariance matrix for all rival models.

Step 6. Perform the appropriate model adequacy test (χ²-test, F-test or Bartlett's χ²-test) for all rival models (just in case one of the models was rejected prematurely). If more than one model remains adequate, go back to Step 2 to select the conditions for another experiment.
The idea behind the computation of the divergence only among adequate models in Step 2 is to base the decision for the next experiment on the models that are still competing. Models that have been found inadequate should not be included in the divergence computations. However, once the new data point becomes available, it is good practice to update the parameter estimates for all models (adequate or inadequate ones). Practice has shown that, under conditions of high experimental error, as additional information becomes available some models may become adequate again!
12.4 SEQUENTIAL EXPERIMENTAL DESIGN FOR ODE SYSTEMS

Let us now turn our attention to systems described by ordinary differential equations (ODEs). Namely, the mathematical model is of the form,

$$\frac{d\mathbf{x}(t)}{dt} = \mathbf{f}(\mathbf{x}(t),\mathbf{u},\mathbf{k})\ ; \quad \mathbf{x}(t_0)=\mathbf{x}_0 \qquad (12.16)$$

$$\mathbf{y}(t) = \mathbf{C}\mathbf{x}(t) \qquad (12.17)$$

where k = [k₁,k₂,...,k_p]ᵀ is a p-dimensional vector of unknown parameters; x = [x₁,x₂,...,x_n]ᵀ is an n-dimensional vector of state variables; x₀ = [x₁,₀,...,x_n,₀]ᵀ is an n-dimensional vector of initial conditions for the state variables; and u = [u₁,u₂,...,u_r]ᵀ is an r-dimensional vector of manipulated variables which are normally set by the operator/experimentalist.

Design of the next experiment in this case requires the following:

(i) Selection of the initial conditions, x₀.
(ii) Selection of the sampling rate and the final time.
(iii) Selection of the manipulated variables, u, which could be kept either constant or even be optimally manipulated over time.

The sequential experimental design can be made either for precise parameter estimation or for model discrimination purposes.
12.4.1 Selection of Optimal Sampling Interval and Initial State for Precise Parameter Estimation

The optimality criteria based on which the conditions for the next experiment are determined are the same for dynamic and algebraic systems. However, for a dynamic system we determine the conditions not just of the next measurement, but rather the conditions under which a complete experimental run will be carried out. For precise parameter estimation we examine the determinant and the eigenvalues of matrix A^new,

$$\mathbf{A}^{new} = \mathbf{A}^{old} + \mathbf{K}\left[\sum_{i=1}^{N_P}\mathbf{G}^T(t_i)\mathbf{C}^T\mathbf{Q}\mathbf{C}\mathbf{G}(t_i)\right]\mathbf{K} \qquad (12.18)$$

where N_P is the number of data points to be collected, K = diag(k₁,k₂,...,k_p), G(t) is the parameter sensitivity matrix, (∂xᵀ/∂k)ᵀ, and matrix A^old is our usual matrix A of the normal equations based on the experimental runs performed up to now. If the number of runs is N_R and the number of data points gathered in each run is N_P, A^old is of the form,

$$\mathbf{A}^{old} = \sum_{j=1}^{N_R}\mathbf{K}\left[\sum_{i=1}^{N_P}\mathbf{G}^T(t_i)\mathbf{C}^T\mathbf{Q}\mathbf{C}\mathbf{G}(t_i)\right]_j\mathbf{K} \qquad (12.19)$$
Our selection of the initial state, x₀, and the value of the manipulated variables vector, u(t), determines a particular experiment. Here we shall assume that the input variables u(t) are kept constant throughout an experimental run. Therefore, the operability region is defined as a closed region in the [x₀,₁, x₀,₂,..., x₀,ₙ, u₁, u₂,..., u_r]ᵀ-space. Due to physical constraints these independent variables are limited to a very narrow range, and hence the operability region can usually be described with a small number of grid points.

In addition to the selection of the experimental conditions for the next experiment (i.e., selection of the best grid point in the operability region according to the volume or the shape criterion), the sampling rate and the final time must also be specified. In general, it is expected that for each particular experiment there is a corresponding optimal time interval over which the measurements should be obtained. Furthermore, given the total number of data points (N_P) that will be gathered (i.e., for a given experimental effort), there is a corresponding optimal sampling rate which should be employed.
I. Time Interval Determination

The main consideration for the choice of the most appropriate time interval is the availability of parameter sensitivity information. Kalogerakis and Luus (1984) proposed to choose the time interval over which the output vector (measured variables) is most sensitive to the parameters. In order to obtain a measure of the available sensitivity information with respect to time, they proposed the use of the Information Index introduced earlier by Kalogerakis and Luus (1983b). The Information Index for parameter k_j is defined as

$$I_j(t) = k_j\left[\frac{\partial\mathbf{y}(t)}{\partial k_j}\right]^T\mathbf{Q}\left[\frac{\partial\mathbf{y}(t)}{\partial k_j}\right]k_j\ ; \quad j=1,\dots,p \qquad (12.20)$$

where Q is a suitably chosen weighting matrix. As indicated by Equation 12.20, the scalar I_j(t) is simply a weighted sum of squares of the sensitivity coefficients of the output vector with respect to parameter k_j at time t. Hence, I_j(t) can be viewed as an index measuring the overall sensitivity of the output vector with respect to parameter k_j at time t. Using Equation 12.17 and the definition of the parameter sensitivity matrix, the above equation becomes

$$I_j(t) = k_j^2\,\boldsymbol{\delta}_j^T\mathbf{G}^T(t)\mathbf{C}^T\mathbf{Q}\mathbf{C}\mathbf{G}(t)\boldsymbol{\delta}_j\ ; \quad j=1,\dots,p \qquad (12.21)$$

where δ_j is a p-dimensional vector with 1 in the j-th element and zeros elsewhere.

The procedure for the selection of the most appropriate time interval requires the integration of the state and sensitivity equations and the computation of I_j(t), j=1,...,p, at each grid point of the operability region. Next, by plotting I_j(t), j=1,...,p versus time (preferably on a log scale), the time interval [t₁, t_NP] where the information indices are excited and become large in magnitude is determined. This is the time period over which measurements of the output vector should be obtained.
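Since δⱼᵀHδⱼ simply picks the j-th diagonal element of H = GᵀCᵀQCG, all p indices at a time point can be computed at once, as the following sketch shows (G_t, C and Q are assumed available from the integration of the state and sensitivity equations).

```python
# Sketch of Eq. 12.21: all Information Indices I_j(t) at one time point.
import numpy as np

def information_indices(G_t, C, Q, k):
    H = G_t.T @ C.T @ Q @ C @ G_t            # p x p matrix G^T C^T Q C G at time t
    return np.asarray(k) ** 2 * np.diag(H)   # I_j(t) = k_j^2 [H]_jj, j = 1,...,p
```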
II. Sampling Rate Determination

Once the best time interval has been obtained, the sampling times within this interval should be selected. Let us assume that a total number of N_P data points are to be obtained during the run. In general, the selected sampling rate should be small compared to the governing time constants of the system. However, for highly nonlinear systems where the governing time constants may change significantly over time, it is difficult to determine an appropriate constant sampling rate. The same difficulty arises when the system is governed by a set of stiff differential equations. By considering several numerical examples, Kalogerakis and Luus (1984) found that if N_P is relatively large, in order to cover the entire time interval of interest it is practical to choose t₂, t₃,..., t_NP-1 by a log-linear interpolation between t₁ (t₁>0) and t_NP, namely,

$$t_i = t_1\left(\frac{t_{N_P}}{t_1}\right)^{\frac{i-1}{N_P-1}}\ ; \quad i=2,\dots,N_P-1 \qquad (12.22)$$

When t₁ and t_NP do not differ significantly (i.e., by more than one order of magnitude), the proposed measurement scheme reduces practically to choosing a constant sampling rate. If, on the other hand, t₁ and t_NP differ by several orders of magnitude, use of the log-linear interpolation formula ensures that sensitivity information is gathered from widely different parts of the transient response, covering a wide range of time scales.
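Equation 12.22 is just logarithmically even spacing between t₁ and t_NP, so a one-line helper suffices; a sketch using numpy's logspace:

```python
# Sketch of the log-linear sampling times of Eq. 12.22 (requires t1 > 0).
import numpy as np

def sampling_times(t1, t_NP, NP):
    # t_i = t1 * (t_NP / t1) ** ((i - 1) / (NP - 1)), i = 1,...,NP
    return np.logspace(np.log10(t1), np.log10(t_NP), NP)

print(sampling_times(0.03, 2.0, 20))   # e.g. 20 points spanning 0.03 to 2 h
```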
The steps that should be followed to determine the best grid point in the operability region for precise parameter estimation of a dynamic system are given below:

Step 1. Compute the Information Indices, I_j(t), j=1,...,p, at each grid point of the operability region.

Step 2. By plotting I_j(t), j=1,...,p versus time, determine the optimal time interval [t₁, t_NP] for each grid point.

Step 3. Determine the sampling times t₂, t₃,... using Equation 12.22 for each grid point.

Step 4. Integrate the state and sensitivity equations and compute matrix A^new.

Step 5. Using the desired optimality criterion (volume or shape), determine the best experimental conditions for the next experiment.
III. Simplification of the Procedure

The above procedure can be modified somewhat to minimize the overall computational effort. For example, during the computation of the Information Indices in Step 1, matrix A^new can also be computed, and thus an early estimate of the best grid point can be established. In addition, as is the case with algebraic systems, the optimum conditions are expected to lie on the boundary of the operability region. Thus, in Steps 2, 3 and 4 the investigation can be restricted only to the grid points which lie on the boundary surface indicated by the preliminary estimate. As a result, the computational effort can be kept fairly small.

Another significant simplification of the above procedure is the use of a constant sampling rate when t₁ and t_NP are within one or two orders of magnitude, and use of this sampling rate uniformly for all grid points of the operability region. Essentially, we use the Information Indices to see whether the chosen sampling rate is adequate for all grid points of the operability region.

If a constant sampling rate is not appropriate because of widely different time constants, a convenient simplification is the use of two sampling rates: a fast one for the beginning of the experiment, followed by the slow sampling rate when the fast dynamics have died out. The selection of the fast sampling rate is more difficult and should be guided by the Information Indices and satisfy the constraints imposed by the physical limitations of the employed measuring devices. Kalogerakis and Luus (1983b) have shown that if the Information Indices suggest that there is available sensitivity information for all the parameters during the slow dynamics, we may not need to employ a fast sampling rate.
12.4.2 Selection of Optimal Sampling Interval and Initial State for Model Discrimination

If, instead of precise parameter estimation, we are designing experiments for model discrimination, the best grid point of the operability region is chosen by maximizing the overall divergence, defined for dynamic systems as

$$D = \sum_{i=1}^{N_P}\sum_{j=1}^{r-1}\sum_{l=j+1}^{r}\left[\mathbf{y}^{(j)}(t_i)-\mathbf{y}^{(l)}(t_i)\right]^T\mathbf{Q}\left[\mathbf{y}^{(j)}(t_i)-\mathbf{y}^{(l)}(t_i)\right] \qquad (12.23)$$

Again, we consider that u is kept constant throughout an experimental run. The design procedure is the same as for algebraic systems. Of course, the time interval and sampling times must also be specified. In this case, our selection should be based on what is appropriate for all competing models, since the information gathered from these experiments will also be used to estimate more precisely the parameters in each model. Again, the Information Index for each parameter and for each model can be used to guide us in selecting an overall suitable sampling rate.
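The dynamic divergence can be sketched in the same way as its algebraic counterpart; here simulate(model, x0, u, times) is an assumed helper that integrates a rival model and returns its output vector at each sampling time.

```python
# Sketch of Eq. 12.23: pairwise weighted divergence accumulated over the
# sampling times of a candidate run (x0, u).
import numpy as np

def dynamic_divergence(x0, u, times, models, Q, simulate):
    Y = [simulate(m, x0, u, times) for m in models]   # one (NP x m) trajectory per model
    D = 0.0
    for i in range(len(times)):
        for j in range(len(Y) - 1):
            for l in range(j + 1, len(Y)):
                d = Y[j][i] - Y[l][i]
                D += d @ Q @ d
    return D
```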
12.4.3 Determination of Optimal Inputs for Precise Parameter Estimation and Model Discrimination

As mentioned previously, the independent variables which determine a particular experiment and are set by the experimentalist are the initial state, x₀, and the vector of the manipulated variables (also known as control input), u. In the previous section we considered the case where u is kept constant throughout an experimental run. The case where u is allowed to vary as a function of time constitutes the problem of optimal inputs. In general, it is expected that by allowing some or all of the manipulated variables to change over time, a greater amount of information will be gathered from a run. The design of optimal inputs has been studied only for the one-parameter model case. This is fairly straightforward, since maximization of the determinant of the (1×1) matrix A is trivial and the optimal control problem reduces to maximization of the integral (or better, summation) of the sensitivity coefficient squared. In a similar fashion, researchers have also looked at maximizing the sensitivity of one parameter out of all the unknown parameters present in a dynamic model.
Murray and Reiff (1984) showed that the use of an optimally selected square wave for the input variables can offer considerable improvement in parameter estimation. Of course, it should be noted that the use of constant inputs is often more attractive to the experimentalist, mainly because of the ease of experimentation. In addition, the mathematical model representing the physical system may remain simpler if the inputs are kept constant. For example, when experiments are designed for the estimation of kinetic parameters, the incorporation of the energy equation can be avoided if the experiments are carried out isothermally.

For precise parameter estimation, we need to solve several optimal control problems, each corresponding to a grid point of the operability region. The operability region is defined now by the potential values of the initial state (x₀). A particular optimal control problem can be formulated as follows:

Given the governing differential equations (the model) and the initial conditions, x₀, determine the optimal inputs, u(t), so that the determinant (for the volume criterion) or the smallest eigenvalue (for the shape criterion) of matrix A^new (given by Equation 12.18) is maximized.

Obviously this optimal control problem is not a typical "textbook problem". It is not even clear whether the optimal solution can be readily computed. As a result, one should consider suboptimal solutions.

One such design was proposed by Murray and Reiff (1984), where they used an optimally selected square wave for the input variables. Instead of computing an optimal profile for u(t), they simply investigated the magnitude and frequency of the square wave for precise parameter estimation purposes. It should also be kept in mind that the physical constraints often limit the values of the manipulated variables to a very narrow range, which suggests that the optimal solution will most likely be bang-bang.
The use of time stages of varying lengths in iterative dynamic programming (Luus, 2000) may indeed provide a computationally acceptable solution. Actually, such an approach may prove to be feasible, particularly for model discrimination purposes. In model discrimination we seek the optimal inputs, u(t), that will maximize the overall divergence among the r rival models given by Equation 12.23.

The original optimal control problem can also be simplified (by reducing its dimensionality) by partitioning the manipulated variables u(t) into two groups, u₁ and u₂. One group, u₁, could be kept constant throughout the experiment, and hence only the optimal inputs for subgroup u₂(t) are determined.

Kalogerakis (1984), following the approach of Murray and Reiff (1984), suggested the use of a square wave for u(t) whose frequency is optimally chosen for model discrimination purposes. The rationale behind this suggestion is based on the fact that the optimal control policy is expected to be bang-bang. Thus, instead of attempting to determine the optimal switching times, one simply assumes a square wave and optimally selects its period by maximizing the divergence among the r rival models.
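A square-wave input of this kind is trivial to parameterize, which is what makes the period search attractive; a sketch with illustrative bounds is:

```python
# Sketch: square-wave input u(t) between its low and high bounds; the period
# is the single design variable to be optimized.
import numpy as np

def square_wave(t, period, u_low, u_high):
    t = np.asarray(t, float)
    return np.where((t % period) < 0.5 * period, u_high, u_low)

# e.g. scan candidate periods, keeping the one that maximizes the divergence
# (divergence_of_run is an assumed helper that simulates the rival models):
# best = max(periods, key=lambda p: divergence_of_run(lambda t: square_wave(t, p, 5.0, 35.0)))
```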
12.5 EXAMPLES

12.5.1 Consecutive Chemical Reactions

As an example for precise parameter estimation of dynamic systems, we consider the simple consecutive chemical reactions in a batch reactor used by Hosten and Emig (1975) and Kalogerakis and Luus (1984) for the evaluation of sequential experimental design procedures for dynamic systems. The reactions are

A → B → C

where both steps are irreversible and kinetically of first order. The governing differential equations are

$$\frac{dx_1}{dt} = -k_1 e^{-k_2/T}x_1\ ; \quad x_1(0) = x_{0,1} \qquad (12.24a)$$

$$\frac{dx_2}{dt} = k_1 e^{-k_2/T}x_1 - k_3 e^{-k_4/T}x_2\ ; \quad x_2(0) = x_{0,2} \qquad (12.24b)$$
where x₁ and x₂ are the concentrations of A and B (g/L) and T is the reaction temperature (K). For simplicity it is assumed that the reaction is carried out isothermally, so that there is no need to add the energy equation. Both state variables x₁ and x₂ are assumed to be measured, and the standard error in the measurement (σ_ε) in both variables is equal to 0.02 (g/L). Consequently, the statistically correct choice for the weighting matrix Q is the identity matrix.

The experimental conditions which can be set by the operator are the initial concentrations of A and B and the reaction temperature T. For simplicity the initial concentration of B is assumed to be zero, i.e., we always start with pure A. The operability region is shown in Figure 12.2 on the x₀,₁-T plane, where x₀,₁ takes values from 1 to 4 g/L and T from 370 to 430 K.

Let us now assume that two preliminary experiments were performed at the grid points (1, 370) and (4, 430), yielding the following estimates for the parameters: k₁ = 0.61175×10⁸, k₂ = 10000, k₃ = 0.62155×10⁸ and k₄ = 7500, from 20 measurements taken with a constant sampling rate in the interval 0 to 10 h. With these preliminary parameter values the objective is to determine the best experimental conditions for the next run.
Figure 12.2 The operability region for the consecutive chemical reactions example [reprinted from the Canadian Journal of Chemical Engineering with permission].
For comparison purposes, we first determine the best grid point in the operability region using the same final time and constant sampling rate for all feasible experiments. To show the importance of the final time (or better, of the time interval), several values in the range 1 to 40 h were used, and the results are shown in Table 12.1. A total of 20 data points were used, and the best grid point was selected based on the volume criterion. As seen, the results depend heavily on the chosen final time. Furthermore, cases 4 and 5 or 6 and 7 show that for the same grid point the chosen final time has a strong effect on det(A^new); hence, there must indeed be an optimal choice of the final time for each grid point in the operability region.

This observed dependence can be readily explained by examining the behavior of the Information Indices. For example, it is seen in Figure 12.3 that most of the available sensitivity information is contained in the time interval 0.03 to 2 h for the grid point (4, 430). Therefore, when the final time was increased, other grid points gave better results (i.e., a higher value for det(A^new)). On the other hand, as seen in Figure 12.4, most of the sensitivity information is available in the interval 2 to 40 h for the grid point (4, 370). Hence, only when the final time was increased to 12 h or more was this grid point selected as the best.

The above results indicate the need to assign to each grid point an appropriate time interval over which the measurements will be made. The Information Index provides the means to locate this time interval so that the maximum amount of information is gathered from each experiment.
Next, by employing the volume criterion, the best grid point in the operability region was determined using the time intervals indicated by the Information Indices and the log-linear formula for the selection of the sampling times. A total of 20, 40 and 80 data points were used, and the results are shown in Table 12.2. As seen, the grid point (4, 370) was consistently selected as best.

At this point it is worthwhile making a couple of comments about the Information Index. One can readily see from Figures 12.3 and 12.4 that the steady state values of the Information Indices are zero for this example. This simply means that the steady state values of the state variables do not depend on the parameters. This is expected, as both reactions are irreversible and hence, regardless of their rate, at steady state all of components A and B must be converted to C.

From the same figures it is also seen that the normalized Information Indices of k₁ and k₂, as well as of k₃ and k₄, overlap completely. This behavior is also not surprising, since the experiments have been conducted isothermally and hence the product k₁exp(-k₂/T) behaves as one parameter, which implies that (∂xᵢ/∂k₁) and (∂xᵢ/∂k₂), and consequently I₁(t) and I₂(t), differ only by a constant factor.

Finally, it is pointed out that the time interval determined by the Information Indices is subject to experimental constraints. For example, it may not be possible to collect data as early as suggested by the Information Indices. If k₁ in this example was 100 times larger, i.e., k₁ = 0.61175×10¹⁰, the experimentalist could be unable to start collecting data from 0.0004 h (i.e., about 1.5 s), as the Information Indices would indicate for grid point (4, 430), shown in Figure 12.5.
Table 12.1 Consecutive Chemical Reactions: Effect of Final Time on the Selection of the Best Grid Point Using the Volume Criterion

Case   Final Time (h)   Best Grid Point   det(A^new)
1      1                (4, 430)          3.801×10³
2      2                (4, 415)          4.235×10³
3      4                (4, 400)          7.178×10³
4      6                (4, 385)          7.642×10⁴
5      8                (4, 385)          1.055×10⁵
6      12               (4, 370)          1.083×10⁵
7      40               (4, 370)          1.755×10⁵

Source: Kalogerakis and Luus (1984).
Figure 12.3 Information Indices versus time for the grid point (4, 430) [reprinted from the Canadian Journal of Chemical Engineering with permission].

Figure 12.4 Information Indices versus time for the grid point (4, 370) [reprinted from the Canadian Journal of Chemical Engineering with permission].
Table 12.2 Consecutive Chemical Reactions: Selection of the Best Grid Point Based on the Volume Criterion and Use of the Information Index

No. of data                              Standard Deviation (%)
points      λ_min     det(A^new)     k₁      k₂      k₃      k₄
20          0.0171    1.476×10⁵      15.14   0.562   8.044   0.414
40          0.0173    5.831×10⁵      15.06   0.558   7.513   0.380
80          0.0174    2.316×10⁶      15.02   0.556   7.236   0.362

Best grid point for all cases: (4, 370).
Source: Kalogerakis and Luus (1984).
Finally, when the selected experiment has been performed, the parameter estimates will be updated based on the new information, and the predicted parameter variances will probably be somewhat different, since our estimate of the parameter values will most likely be different. If the variance of the estimated parameters is acceptable we stop; or else we go back and design another experiment.
12.5.2 Fed-batch Bioreactor

As a second example, let us consider the fed-batch bioreactor used by Kalogerakis and Luus (1984) to illustrate sequential experimental design methods for dynamic systems. The governing differential equations are (Lim et al., 1977):

$$\frac{dx_1}{dt} = \left(\frac{k_1 x_2}{k_2 + x_2} - D\right)x_1 - k_4 x_1\ ; \quad x_1(0) = x_{1,0} \qquad (12.25a)$$

$$\frac{dx_2}{dt} = D\,(c_F - x_2) - \frac{1}{k_3}\,\frac{k_1 x_2}{k_2 + x_2}\,x_1\ ; \quad x_2(0) = x_{2,0} \qquad (12.25b)$$
where x₁ and x₂ are the biomass and limiting substrate (glucose) concentrations (g/L) in the bioreactor, c_F is the substrate concentration in the feed stream (g/L) and D is the dilution factor (h⁻¹), defined as the feed flowrate over the volume of the liquid phase in the bioreactor. The dilution factor is kept constant with respect to time to allow x₁ and x₂ to reach steady state values while the volume in the bioreactor increases exponentially (Lim et al., 1977).

It is assumed that both state variables x₁ and x₂ are measured with respect to time and that the standard experimental error (σ_ε) is 0.1 (g/L) for both variables. The independent variables that determine a particular experiment are (i) the inoculation density (initial biomass concentration in the bioreactor), x₁,₀, with range 1 to 10 g/L, (ii) the dilution factor, D, with range 0.05 to 0.20 h⁻¹, and (iii) the substrate concentration in the feed, c_F, with range 5 to 35 g/L.

In this case the operability region can be visualized as a rectangular prism in the 3-dimensional x₁,₀-D-c_F space, as seen in Figure 12.6. A grid of four points in each variable has been used.

Let us now assume that from a preliminary experiment performed at x₁,₀ = 7 g/L, D = 0.10 h⁻¹ and c_F = 25 g/L it was found that k₁ = 0.31, k₂ = 0.18, k₃ = 0.55 and k₄ = 0.05 from 20 measurements taken at a constant sampling rate in the interval 0 to 20 h. Using Equation 12.1, the standard deviation (%), also known as the coefficient of variation, was computed and found to be 49.86, 111.4, 8.526 and 27.75 for k₁, k₂, k₃ and k₄ respectively. With these preliminary parameter estimates, the conditions of the next experiment will be determined so that the uncertainty in the parameter estimates is minimized.
208
Chapter 12
Table 12.3 Fed-hatch Biorencfor: Effect of Find Time on h e Selection of
the Best Grid Point Using the I/blwne Criterion
Case Final Time Best Grid del( Aew)
(h) Point
1 20
(7, 0.20,35)
I .06x 1 Ox
2 40
(4, 0.15, 35)
I I 0
3 80
( I , 0.20,35)
m X I o9
Source: Kalogerakis and Luus (1 984).
For comparison purposes, we first determine the best grid point in the operability region using the same final time and constant sampling rate for all feasible experiments. In Table 12.3 the best experimental conditions chosen by the volume criterion are shown when a final time of 20, 40 or 80 h is used. A total of 20 data points is used in all experiments. As seen, the selected conditions depend on our choice of the final time.

Again, this observed dependence can be readily explained by examining the behavior of the Information Indices. For example, in Figure 12.7 it is shown that most of the available sensitivity information for grid point (1, 0.20, 35) is in the interval 25 to 75 h. Consequently, when a final time of 20 or 40 h is used, other grid points in the operability region gave better results. On the other hand, it is seen in Figure 12.8 that 20 h as final time is sufficient to capture most of the available sensitivity information (although 35 h would be the best choice) for the grid point (7, 0.20, 35). By allowing the final time to be 80 h, no significant improvement in the results is expected.
Figure 12.7 Fed-batch Bioreactor: Information Indices versus time for the grid point (1, 0.20, 35) [reprinted from the Canadian Journal of Chemical Engineering with permission].
To illustrate the usefulness of the Information Index in determining the best time interval, let us consider the grid point (1, 0.20, 35). From Figure 12.7 we deduce that the best time interval is 25 to 75 h. In Table 12.4 the standard deviation of each parameter is shown for 7 different time intervals. From cases 1 to 4 it is seen that measurements taken before 25 h do not contribute significantly to the reduction of the uncertainty in the parameter estimates. From cases 4 to 7 it is seen that it is preferable to obtain data points within [25, 75] rather than after the steady state has been reached and the Information Indices have leveled off. Measurements taken after 75 h provide information only about the steady state behavior of the system.

The above results indicate again the need to assign to each grid point a different time interval over which the measurements of the output vector should be made. The Information Index can provide the means to determine this time interval. It is particularly useful when we have completely different time scales, as shown by the Information Indices (shown in Figure 12.9) for the grid point (4, 0.25, 35). The log-linear interpolation for the determination of the sampling times is very useful in such cases.
Figure 12.8 Fed-batch Bioreactor: Information Indices versus time for the grid point (7, 0.20, 35).
Table 12.4 Fed-batch Bioreactor: Standard Deviation of Parameter Estimates Versus Time Interval Used for the Grid Point (1, 0.20, 35)

       Time         Number of        Standard Deviation (%)
Case   Interval     Data Points   k₁      k₂      k₃      k₄
1      [0.01, 75]   155           1.702   4.972   1.917   10.51
2      [1, 75]      75            1.704   4.972   1.919   10.52
3      [15, 75]     30            1.737   5.065   1.967   10.72
4      [25, 75]     20            1.836   5.065   2.092   11.30
5      [25, 150]    20            1.956   4.626   2.316   12.02
6      [50, 150]    20            3.316   5.262   4.057   20.53
7      [60, 150]    20            278.3   279.5   332.4   1662.

Source: Kalogerakis and Luus (1984).
Figure 12.9 Fed-batch Bioreactor: Information Indices versus time for the grid point (4, 0.25, 35) [reprinted from the Canadian Journal of Chemical Engineering with permission].
Table 12.5 Fed-batch Bioreactor: Selection of the Best Grid Point Based on the Volume Criterion and Use of the Information Index

No. of data                              Standard Deviation (%)
points      λ_min     det(A^new)     k₁      k₂      k₃      k₄
20          0.709     6.61×10⁹       1.836   5.065   2.092   11.30
40          1.405     1.07×10¹¹      1.302   3.647   1.481   8.020
80          2.816     1.74×10¹²      0.920   2.574   1.046   5.667

Best grid point for all cases: (1, 0.20, 35).
Source: Kalogerakis and Luus (1984).
Employing the volume criterion, the best grid point in the operability region was determined using the Information Indices. A total of 20, 40 and 80 data points were used, and the results are shown in Table 12.5.

The condition number of matrix A^new can be used to indicate which of the optimization criteria (volume or shape) is more appropriate. In this example the condition number is rather large (~10⁷) and hence the shape criterion is expected to yield better results. Indeed, by using the shape criterion the condition number was reduced by approximately two orders of magnitude, and the overall results are improved, as shown in Table 12.6. As seen, the uncertainty in k₁, k₃ and k₄ was decreased, while the uncertainty in k₂ was increased somewhat. The Information Indices for the selected grid point (4, 0.15, 35) are shown in Figure 12.10.
Table 12.6 Fed-batch Bioreactor: Selection of the Best Grid Point Based on the Shape Criterion and Use of the Information Index

No. of data                              Standard Deviation (%)
points      λ_min     det(A^new)     k₁      k₂      k₃      k₄
20          1.408     4.71×10⁸       0.954   8.357   1.175   5.248
40          2.782     7.56×10⁹       0.674   5.943   0.831   3.717
80          5.604     1.22×10¹¹      0.477   4.189   0.587   2.628

Best grid point for all cases: (4, 0.15, 35).
Source: Kalogerakis and Luus (1984).
Figure 12.10 Fed-batch Bioreactor: Information Indices versus time for the grid point (4, 0.15, 35).
It is interesting to note that with both design criteria the selected best grid points lie on the boundary of the operability region. This was also observed in the previous example. The same has also been observed for systems described by algebraic equations (Rippin et al., 1980).
12.5.3 Chemostat Growth Kinetics

As a third example, let us consider the growth kinetics in a chemostat used by Kalogerakis (1984) to evaluate sequential design procedures for model discrimination in dynamic systems. We consider the following four kinetic models for biomass growth and substrate utilization in the continuous baker's yeast fermentation.

Model 1 (Monod kinetics with constant specific death rate):

$$\frac{dx_1}{dt} = \frac{k_1 x_2}{k_2 + x_2}x_1 - k_4 x_1 - D\,x_1 \qquad (12.26a)$$

$$\frac{dx_2}{dt} = D\,(c_F - x_2) - \frac{1}{k_3}\,\frac{k_1 x_2}{k_2 + x_2}\,x_1 \qquad (12.26b)$$

Model 2 (Contois kinetics with constant specific death rate):

$$\frac{dx_1}{dt} = \frac{k_1 x_2}{k_2 x_1 + x_2}x_1 - k_4 x_1 - D\,x_1 \qquad (12.27a)$$

$$\frac{dx_2}{dt} = D\,(c_F - x_2) - \frac{1}{k_3}\,\frac{k_1 x_2}{k_2 x_1 + x_2}\,x_1 \qquad (12.27b)$$

Model 3 (linear specific growth rate):

$$\frac{dx_1}{dt} = k_1 x_2 x_1 - k_3 x_1 - D\,x_1 \qquad (12.28a)$$

$$\frac{dx_2}{dt} = D\,(c_F - x_2) - \frac{k_1 x_2 x_1}{k_2} \qquad (12.28b)$$

Model 4:

(12.29a)

(12.29b)
In the above ODEs, x₁ and x₂ represent the biomass and substrate concentrations in the chemostat, c_F is the substrate concentration in the feed stream (g/L) and D is the dilution factor (h⁻¹), defined as the feed flowrate over the volume of the liquid phase in the chemostat. It is assumed that both state variables, x₁ and x₂, are observed.

The experimental conditions that can be set by the operator are the initial conditions x₁(0) and x₂(0), and the manipulated variables (control inputs) u₁(t) = D and u₂(t) = c_F. The range of the manipulated variables is 0.05 ≤ D ≤ 0.20 (h⁻¹) and 5 ≤ c_F ≤ 35 (g/L). For simplicity, it is assumed that the initial substrate concentration in the chemostat, x₂(0), is always 0.01 g/L. The initial biomass concentration (inoculation) is assumed to take values in the range 1 to 10 g/L, i.e., 1 ≤ x₁(0) ≤ 10 (g/L).

We have several options in terms of designing the next experiment that will have the maximum discriminating power among all the above four rival models:
(i) Consider both inputs as constants (i.e., D and c_F are kept constant throughout the run). In this case we have a 3-dimensional operability region (x₁(0), D, c_F) that can be visualized as a rectangular prism. The best grid point is selected by maximizing the weighted divergence given by Equation 12.23.

(ii) Consider one input constant and the other one able to vary with respect to time. For example, we can allow the feed concentration to vary optimally with respect to time. The operability region is now a rectangular region on the (x₁(0)-D) plane. For each grid point of the operability region we can solve the optimal control problem and determine the optimal input sequence, c_F(t), that will maximize the divergence given by Equation 12.23. We can greatly simplify the problem by searching for the optimal period of a square wave sequence for c_F.

(iii) Consider both input variables (D and c_F) able to vary with respect to time. In this case the operability region is the one-dimensional segment on the x₁(0)-axis. Namely, for each grid point (for example, x₁(0) could take the values 1, 4, 7 and 10 g/L) we solve the optimal control problem whereby we determine the optimal time sequences for the dilution factor (D) and the feed concentration (c_F) by maximizing the weighted divergence given by Equation 12.23. Again, the optimal control problem can be greatly simplified by considering square waves for both input variables with optimally selected periods.
Let us illustrate the methodology by considering cases (i) and (ii). We assume that from a preliminary experiment, or from some other prior information, we have the following parameter estimates for each model:

Model 1: k⁽¹⁾ = [0.30, 0.25, 0.56, 0.02]ᵀ
Model 2: k⁽²⁾ = [0.30, 0.03, 0.55, 0.03]ᵀ
Model 3: k⁽³⁾ = [0.12, 0.56, 0.02]ᵀ
Model 4: k⁽⁴⁾ = [0.30, 0.25, 0.56, 0.02]ᵀ

Using as final time 72 h and a constant sampling rate of 0.75 h, the best grid point in the operability region was determined assuming constant inputs. The grid point (1, 0.20, 35) was found to be best, and the corresponding value of the weighted divergence was D = 71759.

On the other hand, if we assume that c_F can change with time as a square wave while the dilution factor is kept constant throughout the run, we find as best the grid point (x₁(0), D) = (1, 0.20), with an optimal period of 27 h when c_F is allowed to change as a square wave. In this case a maximum weighted divergence of D = 84080 was computed. The dependence of the weighted divergence on the switching period of the input variable (c_F) is shown for the grid points (1, 0.15) and (1, 0.20) in Figures 12.11 and 12.12 respectively.
The ability of the sequential design to discriminate among the rival models should be examined as a function of the standard error in the measurements (σ_ε). For this reason, artificial data were generated by integrating the governing ODEs for Model 1 with "true" parameter values k₁=0.31, k₂=0.18, k₃=0.55 and k₄=0.03, and by adding noise to the noise-free data. The error terms are taken from independent normal distributions with zero mean and constant standard deviation (σ_ε).

In Tables 12.7 and 12.8 the results of the χ²-adequacy test (here σ_ε is assumed to be known) and Bartlett's χ²-adequacy test (here σ_ε is assumed to be unknown) are shown for both designed experiments (constant inputs and square wave for c_F(t)).

As seen in Tables 12.7 and 12.8, when σ_ε is in the range 0.001 to 0.2 g/L the experiment with the square wave input has a higher discriminating power. As expected, the χ²-test is overall more powerful, since σ_ε is assumed to be known.

For the design of subsequent experiments, we proceed in a similar fashion; however, in the weighted divergence only the models that still remain adequate are included.
Figure 12.11 Weighted divergence versus switching period (h) of c_F for the grid point (1, 0.15).

Figure 12.12 Weighted divergence versus switching period (h) of c_F for the grid point (1, 0.20).
Table 12.7 Chemostat Kinetics: Results from Model Adequacy Tests Assuming σ_ε is Known (χ²-test) Performed at α=0.01 Level of Significance

[Adequacy of Models 1-4 (R = model is rejected, A = model remains adequate) under the constant-input and square-wave-input experiments, for σ_ε levels ranging from 1 to 0.001 g/L.]

Source: Kalogerakis (1984).

Table 12.8 Chemostat Kinetics: Results from Model Adequacy Tests Assuming σ_ε is Unknown (Bartlett's χ²-test) Performed at α=0.01 Level of Significance

[Adequacy of Models 1-4 (R = model is rejected, A = model remains adequate) under the constant-input and square-wave-input experiments, for σ_ε levels ranging from 1 to 0.001 g/L.]

Source: Kalogerakis (1984).
Recursive Parameter Estimation

In this chapter we present very briefly the basic algorithm for recursive least squares estimation and some of its variations for single input - single output systems. These techniques are routinely used for on-line parameter estimation in data acquisition systems. They are presented in this chapter without any proof, for the sake of completeness and with the aim of providing the reader with a quick overview. For a thorough presentation of the material the reader may look at any of the following references: Söderström et al. (1978), Ljung and Söderström (1983), Shanmugan and Breipohl (1988), Wellstead and Zarrop (1991). The notation that will be used in this chapter is different from the one we have used up to now. Instead, we shall follow the notation typically encountered in the analysis and control of sampled data systems.
13.1 DISCRETE INPUT-OUTPUT MODELS

Recursive estimation methods are routinely used in many applications where process measurements become available continuously and we wish to re-estimate, or better, update on-line the various process or controller parameters as the data become available. Let us consider the linear discrete-time model having the general structure:

$$A(z^{-1})\,y_n = B(z^{-1})\,u_{n-k} + e_n \qquad (13.1)$$

where z⁻¹ is the backward shift operator (i.e., y_n-1 = z⁻¹yₙ, y_n-2 = z⁻²yₙ, etc.) and A(·) and B(·) are polynomials of z⁻¹. The input variable is uₙ ≡ u(tₙ) and the output variable is yₙ ≡ y(tₙ). The system has a delay of k sampling intervals (k≥1). In expanded form the system equation becomes

$$(1 + a_1 z^{-1} + a_2 z^{-2} + \dots + a_p z^{-p})\,y_n = (b_0 + b_1 z^{-1} + b_2 z^{-2} + \dots + b_q z^{-q})\,u_{n-k} + e_n \qquad (13.2)$$

or

$$y_n = -a_1 y_{n-1} - \dots - a_p y_{n-p} + b_0 u_{n-k} + b_1 u_{n-k-1} + \dots + b_q u_{n-k-q} + e_n \qquad (13.3)$$

We shall present three recursive estimation methods for the estimation of the process parameters (a₁,..., a_p, b₀, b₁,..., b_q) that should be employed according to the statistical characteristics of the error term sequence eₙ (the stochastic disturbance).
13.2 RECURSIVE LEAST SQUARES (RLS)
In this case we assume that e_n is white noise, ξ_n; i.e., the e_n's are identically and independently normally distributed with zero mean and a constant variance σ². Thus, the model equation can be rewritten as

y_n = -a_1 y_{n-1} - ... - a_p y_{n-p} + b_0 u_{n-k} + ... + b_q u_{n-k-q} + ξ_n    (13.4)

which is of the form

y_n = ψ_{n-1}^T θ + ξ_n    (13.5)

where

ψ_{n-1} = [-y_{n-1}, -y_{n-2}, ..., -y_{n-p}, u_{n-k}, u_{n-k-1}, ..., u_{n-k-q}]^T    (13.6a)

and

θ = [a_1, a_2, ..., a_p, b_0, b_1, ..., b_q]^T    (13.6b)

Whenever a new measurement, y_n, becomes available, the parameter vector is updated to θ̂_n by the formula

θ̂_n = θ̂_{n-1} + K_n (y_n - ψ_{n-1}^T θ̂_{n-1})    (13.7)
where

K_n = P_{n-1} ψ_{n-1} / (1 + ψ_{n-1}^T P_{n-1} ψ_{n-1})    (13.8)

The new estimate of the normalized parameter covariance matrix, P_n, is obtained from

P_n = [I - K_n ψ_{n-1}^T] P_{n-1}    (13.9)
The updated quantities θ̂_n and P_n represent our best estimates of the unknown parameters and their covariance matrix with information up to and including time t_n. Matrix P_n represents an estimate of the parameter covariance matrix since

cov(θ̂_n) = σ² P_n    (13.10)

The above equations are developed using the theory of least squares and making use of the matrix inversion lemma

(A + x x^T)^{-1} = A^{-1} - A^{-1} x x^T A^{-1} / (1 + x^T A^{-1} x)    (13.11)

where A^{-1} = P_{n-1} and x = ψ_{n-1} is used.

To initialize the algorithm we start with our best initial guess of the parameters, θ̂_0. Our initial estimate of the covariance matrix P_0 is often set proportional to the identity matrix (i.e., P_0 = γ²I). If very little information is available about the parameter values, a large value for γ² should be chosen.
Of course, as time proceeds and data keep on accumulating, P_n becomes smaller and smaller and hence the algorithm becomes insensitive to process parameter variations.

The standard way of eliminating this problem is by introducing a forgetting factor λ. Namely, instead of minimizing

S(θ) = Σ_{i=1}^{n} e_i²    (13.12)

we estimate the parameters by minimizing the weighted least squares objective function

S(θ) = Σ_{i=1}^{n} λ^{n-i} e_i²    (13.13)

with 0 < λ ≤ 1. Typically, our selection for λ is in the range 0.90 ≤ λ ≤ 0.9999. This choice of weighting factors ensures that more attention is paid to recently collected data. It can be shown (Ljung and Söderström, 1983) that in this case the parameters are obtained recursively by the following equations:

θ̂_n = θ̂_{n-1} + K_n (y_n - ψ_{n-1}^T θ̂_{n-1})    (13.14)

P_n = (1/λ) [I - K_n ψ_{n-1}^T] P_{n-1}    (13.15)

K_n = P_{n-1} ψ_{n-1} / (λ + ψ_{n-1}^T P_{n-1} ψ_{n-1})    (13.16)
Equations 13.14 to 13.16 constitute the well known recursive least squares (RLS) algorithm. It is the simplest and most widely used recursive estimation method. It should be noted that it is computationally very efficient as it does not require a matrix inversion at each sampling interval. Several researchers have introduced a variable forgetting factor to allow a more precise estimation of θ when the process is not "sensed" to change.
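To make the updating formulas concrete, the following is a minimal Python/NumPy sketch of Equations 13.14 to 13.16 (with λ = 1 it reduces to Equations 13.7 to 13.9); the function and variable names are our own and not from any particular data acquisition package.

    import numpy as np

    def rls_update(theta, P, psi, y, lam=1.0):
        """One RLS step (Equations 13.14-13.16); lam = 1 gives ordinary RLS."""
        P_psi = P @ psi
        K = P_psi / (lam + psi @ P_psi)          # gain vector, Equation 13.16
        theta = theta + K * (y - psi @ theta)    # parameter update, Equation 13.14
        P = (P - np.outer(K, psi @ P)) / lam     # covariance update, Equation 13.15
        return theta, P

    # Initialization as described above: theta0 is the best initial guess and
    # P0 = gamma^2 * I with a large gamma when little prior information exists.
    n_par = 3                                    # e.g., p = 2 and q = 0 (b0 only)
    theta = np.zeros(n_par)
    P = 1.0e6 * np.eye(n_par)

At each sampling instant the regressor ψ_{n-1} = [-y_{n-1}, ..., -y_{n-p}, u_{n-k}, ..., u_{n-k-q}]^T is assembled from stored past values and rls_update is called once; note that no matrix inversion is required.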
13.3 RECURSIVE EXTENDED LEAST SQUARES (RELS)
In this case we assume the disturbance term, e_n, is not white noise; rather it is related to ξ_n through the following transfer function (noise filter)

e_n = C(z^{-1}) ξ_n    (13.17)

where

C(z^{-1}) = 1 + c_1 z^{-1} + c_2 z^{-2} + ... + c_r z^{-r}

Thus, the model equation can be rewritten as

A(z^{-1}) y_n = B(z^{-1}) u_{n-k} + C(z^{-1}) ξ_n    (13.18)

or

y_n = -a_1 y_{n-1} - ... - a_p y_{n-p} + b_0 u_{n-k} + ... + b_q u_{n-k-q} + c_1 ξ_{n-1} + ... + c_r ξ_{n-r} + ξ_n    (13.19)
which is again of the form

y_n = ψ_{n-1}^T θ + ξ_n    (13.20)

where

ψ_{n-1} = [-y_{n-1}, ..., -y_{n-p}, u_{n-k}, ..., u_{n-k-q}, ξ_{n-1}, ..., ξ_{n-r}]^T    (13.21a)

and

θ = [a_1, ..., a_p, b_0, ..., b_q, c_1, ..., c_r]^T    (13.21b)
This model is of the same form as the one used for RLS and hence the updating equations for θ̂_n and P_n are the same as in the previous section. However, the disturbance terms ξ_{n-1}, ξ_{n-2}, ..., ξ_{n-r} are not measured and hence the updating equations cannot be implemented. The usual approach to overcome this difficulty is to use the one-step prediction error, namely,

ξ̂_n = y_n - ŷ_{n|n-1}    (13.22)

where

ŷ_{n|n-1} = ψ̂_{n-1}^T θ̂_{n-1}    (13.23)

where we have denoted with ŷ_{n|n-1} the one-step ahead prediction of y_n, and with ψ̂_{n-1} the regressor vector of Equation 13.21a with the unknown disturbance terms replaced by their estimates ξ̂_{n-1}, ..., ξ̂_{n-r}. Therefore the recursive extended least squares (RELS) algorithm is given by the following equations:

θ̂_n = θ̂_{n-1} + K_n (y_n - ψ̂_{n-1}^T θ̂_{n-1})    (13.24)

K_n = P_{n-1} ψ̂_{n-1} / (1 + ψ̂_{n-1}^T P_{n-1} ψ̂_{n-1})    (13.25)

P_n = [I - K_n ψ̂_{n-1}^T] P_{n-1}    (13.26)
13.4 RECURSIVE GENERALIZED LEAST SQUARES (RGLS)
In this case we assume again that the disturbance term, e_n, is not white noise; rather it is related to ξ_n through the following transfer function (noise filter)

e_n = ξ_n / C(z^{-1})    (13.27)

where

C(z^{-1}) = 1 + c_1 z^{-1} + c_2 z^{-2} + ... + c_r z^{-r}    (13.28)

The model equations now become

A(z^{-1}) C(z^{-1}) y_n = B(z^{-1}) C(z^{-1}) u_{n-k} + ξ_n    (13.29)

or

A(z^{-1}) ỹ_n = B(z^{-1}) ũ_{n-k} + ξ_n    (13.30)

where the transformed variables are defined as

ỹ_n = C(z^{-1}) y_n    (13.31a)

or

ỹ_n = y_n + c_1 y_{n-1} + ... + c_r y_{n-r}    (13.31b)

and

ũ_{n-k} = C(z^{-1}) u_{n-k}    (13.32a)

or

ũ_{n-k} = u_{n-k} + c_1 u_{n-k-1} + ... + c_r u_{n-k-r}    (13.32b)

The transformed model is again of the form

ỹ_n = ψ̃_{n-1}^T θ + ξ_n    (13.33)

where

ψ̃_{n-1} = [-ỹ_{n-1}, ..., -ỹ_{n-p}, ũ_{n-k}, ..., ũ_{n-k-q}]^T    (13.34a)

and

θ = [a_1, ..., a_p, b_0, ..., b_q]^T    (13.34b)
The above equations suggest that the unknown parameters in polynomials A(·) and B(·) can be estimated with RLS using the transformed variables ỹ_n and ũ_{n-k}. Having estimates of polynomials A(·) and B(·) we can go back to Equation 13.1 and obtain an estimate of the error term, e_n, as

ê_n = Â(z^{-1}) y_n - B̂(z^{-1}) u_{n-k}    (13.35)

where we have denoted with Â(·) and B̂(·) the current estimates of polynomials A(·) and B(·). The unknown parameters in polynomial C(·) can be obtained next by considering the noise filter transfer function. Namely, Equation 13.27 is written as

C(z^{-1}) e_n = ξ_n    (13.36)
The above equation cannot be used directly for RLS estimation. Instead of the true error terms e_n, we must use the estimated values from Equation 13.35. Therefore, the recursive generalized least squares (RGLS) algorithm can be implemented as a two-step estimation procedure:

Step 1. Compute the transformed variables ỹ_n and ũ_{n-k} based on our knowledge of polynomial C(·) at time t_{n-1}. Apply RLS on equation A(z^{-1}) ỹ_n = B(z^{-1}) ũ_{n-k} + ξ_n based on information up to time t_n. Namely, obtain the updated value for the parameter vector (i.e., the coefficients of the polynomials A and B), θ̂_n.

Step 2. Having θ̂_n, estimate ê_n as Â(z^{-1}) y_n - B̂(z^{-1}) u_{n-k}. Apply RLS to equation C(z^{-1}) ê_n = ξ_n to get the coefficients of C(z^{-1}), which are used for the computation of the transformed variables ỹ_n and ũ_{n-k} at the next sampling interval.
Essentially, the idea behind the above RGLS algorithm is to apply the ordinary RLS algorithm twice, as sketched below. The method is easy to apply; however, it may have multiple convergence points (Ljung and Söderström, 1983).
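As an illustration of the two-step idea, the following sketch alternates the two RLS problems at each sampling instant for the first-order case (p = 1, q = 0, r = 1, k = 1). It reuses the rls_update function from the RLS sketch given earlier; the notation is our own.

    import numpy as np

    def rgls_step(th_ab, P_ab, th_c, P_c, y, u, e_hat, n):
        """One two-step RGLS pass for (1 + a1 z^-1) y_n = b0 u_{n-1} + e_n
        with noise filter C(z^-1) e_n = xi_n, where C(z^-1) = 1 + c1 z^-1."""
        c1 = th_c[0]
        # Step 1: transformed variables via the current C(.) (Eqs. 13.31, 13.32),
        # then RLS on the coefficients of A(.) and B(.) (Equation 13.33)
        y_t  = y[n]     + c1 * y[n - 1]
        y_t1 = y[n - 1] + c1 * y[n - 2]
        u_t1 = u[n - 1] + c1 * u[n - 2]
        th_ab, P_ab = rls_update(th_ab, P_ab, np.array([-y_t1, u_t1]), y_t)
        # Step 2: error estimate from Equation 13.35, then RLS on C(.)
        a1, b0 = th_ab
        e_hat[n] = y[n] + a1 * y[n - 1] - b0 * u[n - 1]
        th_c, P_c = rls_update(th_c, P_c, np.array([-e_hat[n - 1]]), e_hat[n])
        return th_ab, P_ab, th_c, P_c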
The previously presented recursive algorithms (RLS, RELS and RGLS) utilize the same algorithm (i.e., the same subroutine can be used), the only difference being the setup of the state and parameter vectors. That is the primary reason for our selection of the recursive algorithms presented here. Other recursive algorithms (including recursive maximum likelihood) as well as an exhaustive presentation of the subject material, including convergence characteristics of the above algorithms, can be found in any standard on-line identification book.
Parameter Estimation in Nonlinear
Thermodynamic Models:
Cubic Equations of State
Thermodynamic models are widely used for the calculation of equilibrium and thermophysical properties of fluid mixtures. Two types of such models will be examined: cubic equations of state and activity coefficient models. In this chapter cubic equations of state models are used. Volumetric equations of state (EoS) are employed for the calculation of fluid phase equilibrium and thermophysical properties required in the design of processes involving non-ideal fluid mixtures in the oil and gas and chemical industries. It is well known that the introduction of empirical parameters in equation of state mixing rules enhances the ability of a given EoS as a tool for process design, although the number of interaction parameters should be as small as possible. In general, the phase equilibrium calculations with an EoS are very sensitive to the values of the binary interaction parameters.
14.1 EQUATIONS OF STATE
An equation of state is an algebraic relationship that relates the intensive thermodynamic variables (T, P, v, x_1, x_2, ..., x_{N_C}) for systems in thermodynamic equilibrium. In this relationship, T is the temperature, P is the pressure, v is the molar volume and x_i are the mole fractions of component i with i=1,...,N_C, N_C being the number of species present in the system.

The EoS may be written in many forms. It may be implicit,

F(T, P, v, x_1, x_2, ..., x_{N_C}) = 0    (14.1)

or explicit in any of the variables, such as pressure. In a system of one component (pure fluid) the above equation becomes

F(T, P, v) = 0    or    P = P(T, v)    (14.2)
The above equations describe a three-dimensional T-P-v surface which is a representation of all the experimental or statistical mechanical information concerning the EoS of the system. It should also be noted that it is observed experimentally that

lim_{P→0} Pv/(RT) = 1

The quantity Pv/RT is called the compressibility factor. The PvT or volumetric behavior of a system depends on the intermolecular forces. Sizes, shapes and structures of molecules determine the forces between them (Tassios, 1993).
Volumetric data are needed to calculate thermodynamic properties
(enthalpy, entropy). They are also used for the metering of fluids, the sizing of
vessels and in natural gas and oil reservoir calculations.
14.1.1 Cubic Equations of State
These equations originate from van der Waals' cubic equation of state, proposed more than 100 years ago. That EoS has the form (P + a/v²)(v - b) = RT, where the constants a and b are specific for each substance. This was a significant improvement over the ideal gas law because it could describe the volumetric behavior of both gases and liquids. Subsequently, improvements in the accuracy of the calculations, especially for liquids, were made. One of the most widely used cubic equations of state in industry is the Peng-Robinson EoS (Peng and Robinson, 1976). The mathematical form of this equation was given in Chapter 1 as Equation 1.7 and it is given again below:

P = RT/(v - b) - a/[v(v + b) + b(v - b)]    (14.3)

where P is the pressure, T the temperature and v the molar volume. When the above equation is applied to a single fluid, for example gaseous or liquid CO₂, the parameters a and b are specific to CO₂. Values for these parameters can be
readily computed using Equations 14.5a - 14.5d. Whenever the EoS (Equation 14.3) is applied to a mixture, parameters a and b are now mixture parameters. The way these mixture parameters are defined is important because it affects the ability of the EoS to represent fluid phase behavior. The so-called van der Waals combining rules given below have been widely used. The reader is advised to consult the thermodynamics literature for other combining rules.

a = Σ_{i=1}^{N_C} Σ_{j=1}^{N_C} x_i x_j (a_i a_j)^{1/2} (1 - k_{ij})    (14.4a)

b = Σ_{i=1}^{N_C} x_i b_i    (14.4b)

where a_i and b_i are parameters specific to the individual component i and k_{ij} is an empirical interaction parameter characterizing the binary formed by component i and component j. The individual parameters for component i are given by
a_i = 0.45724 (R² T_{c,i}² / P_{c,i}) α_i    (14.5a)

α_i = [1 + κ_i (1 - (T/T_{c,i})^{1/2})]²    (14.5b)

b_i = 0.07780 (R T_{c,i} / P_{c,i})    (14.5c)

κ_i = 0.37464 + 1.54226 ω_i - 0.26992 ω_i²    (14.5d)

In the above equations, ω is the acentric factor, and T_c, P_c are the critical temperature and pressure respectively. These quantities are readily available for most components.
The Trebble-Bishnoi EoS is a cubic equation that may utilize up to four binary interaction parameters, k = [k_a, k_b, k_c, k_d]^T. This equation with its quadratic combining rules is presented next (Trebble and Bishnoi, 1987; 1988).
where

(14.6)

(14.7a)

(14.7b)

(14.7c)

(14.7d)

and N_C is the number of components in the fluid mixture. The quantities a_i, b_i, c_i, d_i, i=1,...,N_C are the individual component parameters.
14.1.2 Estimation of Interaction Parameters

As we mentioned earlier, a volumetric EoS expresses the relationship among pressure, P, molar volume, v, temperature, T and composition z for a fluid mixture. This relationship for a pressure-explicit EoS is of the form

P = P(v, T; z; u; k)    (14.8)

where the p-dimensional vector k represents the unknown binary interaction parameters and z is the composition (mole fractions) vector. The vector u is the set of EoS parameters which are assumed to be precisely known, e.g., pure component critical properties.

Given an EoS, the objective of the parameter estimation problem is to compute optimal values for the interaction parameter vector, k, in a statistically correct and computationally efficient manner. Those values are expected to enhance the correlational ability of the EoS without compromising its ability to predict the correct phase behavior.
14.1.3 Fugacity Expressions Using the Peng-Robinson EoS

Any cubic equation of state can give an expression for the fugacity of species i in a gaseous or in a liquid mixture. For example, the expression for the fugacity of a component i in a gas mixture, f_i^V, based on the Peng-Robinson EoS is the following

ln[f_i^V/(y_i P)] = (b_i/b)(z^V - 1) - ln(z^V - B) - [A/(2√2 B)] [(2 Σ_j y_j a_{ij})/a - b_i/b] ln[(z^V + (1+√2)B)/(z^V + (1-√2)B)]    (14.9)

where a_{ij} = (a_i a_j)^{1/2}(1 - k_{ij}). To calculate the fugacity of each species in a gaseous mixture using the above equation at specified T, P, and mole fractions of all components y_1, y_2, ..., the following procedure is used:
1. Obtain the parameters a_i and b_i for each pure species (component) of the mixture from Equations 14.5a - 14.5d.
2. Compute the a and b parameters for the mixture from Equations 14.4a and 14.4b.
3. Write the Peng-Robinson equation as follows

z³ - (1 - B) z² + (A - 3B² - 2B) z - (AB - B² - B³) = 0    (14.10)

where

A = aP/(RT)²    (14.11a)

B = bP/(RT)    (14.11b)

z = Pv/(RT)    (14.11c)
4. Solve the above cubic equation for the compressibility factor of the vapor phase, z^V (corresponding to the largest root of the cubic EoS).
5. Use the value of z^V to compute the vapor phase fugacity for each species from Equation 14.9.
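Steps 1 to 4 are simple enough to sketch in a few lines of Python; this is our own illustrative code (SI units assumed), not code from the original text.

    import numpy as np

    R = 8.314  # J/(mol K)

    def pr_pure(Tc, Pc, omega, T):
        """Pure component parameters from Equations 14.5a-14.5d."""
        kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega**2
        alpha = (1.0 + kappa * (1.0 - np.sqrt(T / Tc)))**2
        return 0.45724 * (R * Tc)**2 / Pc * alpha, 0.07780 * R * Tc / Pc

    def vdw_mix(y, a, b, kij):
        """van der Waals combining rules, Equations 14.4a and 14.4b."""
        a_mix = sum(y[i] * y[j] * np.sqrt(a[i] * a[j]) * (1.0 - kij[i][j])
                    for i in range(len(y)) for j in range(len(y)))
        return a_mix, float(np.dot(y, b))

    def z_factors(a_mix, b_mix, T, P):
        """Solve the cubic of Equation 14.10; returns (z_vapor, z_liquid)."""
        A = a_mix * P / (R * T)**2
        B = b_mix * P / (R * T)
        roots = np.roots([1.0, -(1.0 - B), A - 3.0*B**2 - 2.0*B,
                          -(A*B - B**2 - B**3)])
        z = roots[np.abs(roots.imag) < 1e-9].real
        z = z[z > B]                 # keep only physically meaningful roots
        return z.max(), z.min()      # equal when a single real root exists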
The expression for the fugacity of component i in a liquid mixture is as follows

ln[f_i^L/(x_i P)] = (b_i/b)(z^L - 1) - ln(z^L - B) - [A/(2√2 B)] [(2 Σ_j x_j a_{ij})/a - b_i/b] ln[(z^L + (1+√2)B)/(z^L + (1-√2)B)]    (14.12)

where A and B are as before and z^L is the compressibility factor of the liquid phase (corresponding to the smallest root of the cubic EoS). It is also noted that in Equations 14.4a and 14.4b the mixture parameters a and b are computed using the liquid phase mole fractions.
14.1.4 Fugacity Expressions Using the Trebble-Bishnoi EoS

The expression for the fugacity of a component j in a gas or liquid mixture, f_j, based on the Trebble-Bishnoi EoS is available in the literature (Trebble and Bishnoi, 1988). This expression is given in Appendix 1. In addition, the partial derivative of ln f_j with respect to composition for a binary mixture is also provided. This expression is very useful in the parameter estimation methods that will be presented in this chapter.
14.2 PARAMETER ESTIMATION USING BINARY VLE DATA
Traditionally, the binary interaction parameters such as k_a, k_b, k_c in the Trebble-Bishnoi EoS have been estimated from the regression of binary vapor-liquid equilibrium (VLE) data. It is assumed that a set of N experiments have been performed and that at each of these experiments four state variables were measured. These variables are the temperature (T), pressure (P), liquid (x) and vapor (y) phase mole fractions of one of the components. The measurements of these variables are related to the "true" but unknown values of the state variables by the equations given next

T̂_i = T_i + e_{T,i}    i = 1,2,...,N    (14.13a)

P̂_i = P_i + e_{P,i}    i = 1,2,...,N    (14.13b)

x̂_i = x_i + e_{x,i}    i = 1,2,...,N    (14.13c)

ŷ_i = y_i + e_{y,i}    i = 1,2,...,N    (14.13d)
where e_{T,i}, e_{P,i}, e_{x,i} and e_{y,i} are the corresponding errors in the measurements. Given the model described by Equation 14.8 as well as the above experimental information, the objective is to estimate the parameters (k_a, k_b, k_c and k_d) by matching the data with the EoS-based calculated values.
Least squares (LS) or maximum likelihood (ML) estimation methods can be used for the estimation of the parameters. Both methods involve the minimization of an objective function which consists of a weighted sum of squares of deviations (residuals). The objective function is a measure of the correlational ability of the EoS. Depending on how the residuals are formulated we have explicit or implicit estimation methods (Englezos et al. 1990a). In explicit formulations, the differences between the measured values and the EoS (model) based predictions constitute the residuals. At each iteration of the minimization algorithm, explicit formulations involve phase equilibrium calculations at each experimental point. Explicit methods often fail to converge at "difficult" points (e.g., at high pressures). As a consequence, these data points are usually ignored in the regression, with resulting inferior matching ability of the model (Michelsen, 1993). On the other hand, implicit estimation has the advantage that one avoids the iterative phase equilibrium calculations and thus has a parameter estimation method which is robust and computationally efficient (Englezos et al. 1990a; Peneloux et al. 1990).

It should be kept in mind that an objective function which does not require any phase equilibrium calculations during each minimization step is the basis for a robust and efficient estimation method. The development of implicit objective functions is based on the phase equilibrium criteria (Englezos et al. 1990a). Finally, it should be noted that one important underlying assumption in applying ML estimation is that the model is capable of representing the data without any systematic deviation. Cubic equations of state compute equilibrium properties of fluid mixtures with a variable degree of success and hence the ML method should be used with caution.
14.2.1 Maximum Likelihood Parameter and State Estimation

Over the years two ML estimation approaches have evolved: (a) parameter estimation based on an implicit formulation of the objective function; and (b) parameter and state estimation or "error in variables" method based on an explicit formulation of the objective function. In the first approach only the parameters are estimated, whereas in the second the true values of the state variables as well as the values of the parameters are estimated. In this section, we are concerned with the latter approach.

Only two of the four state variables measured in a binary VLE experiment are independent. Hence, one can arbitrarily select two as the independent variables and use the EoS and the phase equilibrium criteria to calculate values for the other two (dependent variables). Let ξ_{ij} (i=1,2,...,N and j=1,2) be the independent variables. Then the dependent ones, η_{ij}, can be obtained from the phase equilibrium relationships (Modell and Reid, 1983) using the EoS. The relationship between the independent and dependent variables is nonlinear and is written as follows

η_{ij} = h_j(ξ_i; k; u)    j = 1,2 and i = 1,2,...,N    (14.14)

In this case, the ML parameter estimates are obtained by minimizing the following quadratic optimality criterion (Anderson et al., 1978; Salazar-Sotelo et al. 1986)

S(k) = Σ_{i=1}^{N} [(T̂_i - T_i)²/σ_{T,i}² + (P̂_i - P_i)²/σ_{P,i}² + (x̂_i - x_i)²/σ_{x,i}² + (ŷ_i - y_i)²/σ_{y,i}²]    (14.15)
This is the so-called error in variables method. The formulation of the above optimality criterion was based on the following assumptions:

(i) The EoS is capable of representing the data without any systematic deviation.
(ii) The experiments are independent.
(iii) The errors in the measurements are normally distributed with zero mean and a known covariance matrix diag(σ_{T,i}², σ_{P,i}², σ_{x,i}², σ_{y,i}²).

Unless very few experimental data are available, the dimensionality of the problem is extremely large and hence difficult to treat with standard nonlinear minimization methods. Schwetlick and Tiller (1985), Salazar-Sotelo et al. (1986) and Valko and Vajda (1987) have exploited the structure of the problem and proposed computationally efficient algorithms.
14.2.2 Explicit Least Squares Estimation

The error in variables method can be simplified to weighted least squares estimation if the independent variables are assumed to be known precisely or if they have a negligible error variance compared to those of the dependent variables. In practice, however, the VLE behavior of the binary system dictates the choice of the pairs (T,x) or (T,P) as independent variables. In systems with a sparingly soluble component in the liquid phase the (T,P) pair should be chosen as independent variables. On the other hand, in systems with azeotropic behavior the (T,x) pair should be chosen.

Assuming that the variance of the errors in the measurement of each dependent variable is known, the following explicit LS objective functions may be formulated:

S_LS(k) = Σ_{i=1}^{N} [(P̂_i - P_i)²/σ_{P,i}² + (ŷ_i - y_i)²/σ_{y,i}²]    (14.16a)

S_LS(k) = Σ_{i=1}^{N} [(x̂_i - x_i)²/σ_{x,i}² + (ŷ_i - y_i)²/σ_{y,i}²]    (14.16b)
The calculation of y and P in Equation 14.16a is achieved by bubble point pressure-type calculations, whereas that of x and y in Equation 14.16b is by isothermal-isobaric flash-type calculations. These calculations have to be performed during each iteration of the minimization procedure using the current estimates of the parameters. Given that both the bubble point and the flash calculations are iterative in nature, the overall computational requirements are significant. Furthermore, convergence problems in the thermodynamic calculations could also be encountered when the parameter values are away from their optimal values.
14.2.3 Implicit Maximum Likelihood Parameter Estimation

Implicit estimation offers the opportunity to avoid the computationally demanding state estimation by formulating a suitable optimality criterion. The penalty one pays is that additional distributional assumptions must be made. Implicit formulation is based on residuals that are implicit functions of the state variables, as opposed to the explicit estimation where the residuals are the errors in the state variables. The assumptions that are made are the following:

(i) The EoS is capable of representing the data without any systematic deviation.
(ii) The experiments are independent.
(iii) The covariance matrix of the errors in the measurements is known and it is often of the form diag(σ_{T,i}², σ_{P,i}², σ_{x,i}², σ_{y,i}²).
(iv) The residuals employed in the optimality criterion are normally distributed.
The formulation of the residuals to be used in the objective function is based on the phase equilibrium criterion

f_{ij}^V = f_{ij}^L    i = 1,2,...,N and j = 1,2    (14.17)

where f is the fugacity of component 1 or 2 in the vapor or liquid phase. The above equation may be written in the following form

ln f_{ij}^V - ln f_{ij}^L = 0    i = 1,2,...,N and j = 1,2    (14.18)

The above equations hold at equilibrium. However, when the measurements of the temperature, pressure and mole fractions are introduced into these expressions the resulting values are not zero even if the EoS were perfect. The reason is the random experimental error associated with each measurement of the state variables. Thus, Equation 14.18 is written as follows

r_{ij} = ln f̂_{ij}^V - ln f̂_{ij}^L    i = 1,2,...,N and j = 1,2    (14.19)

The estimation of the parameters is now accomplished by minimizing the implicit ML objective function

S_ML(k) = Σ_{i=1}^{N} Σ_{j=1}^{2} r_{ij}²/σ_{ij}²    (14.20)

where σ_{ij}² (j=1,2) is computed by a first order variance approximation, also known as the error propagation law (Bevington and Robinson, 1992), as follows

σ_{ij}² = (∂r_{ij}/∂T)² σ_{T,i}² + (∂r_{ij}/∂P)² σ_{P,i}² + (∂r_{ij}/∂x)² σ_{x,i}² + (∂r_{ij}/∂y)² σ_{y,i}²    (14.22)
where all the derivatives are evaluated at T̂_i, P̂_i, x̂_i, ŷ_i. Thus, the errors in the measurement of all four state variables are taken into account.

The above form of the residuals was selected based on the following considerations:

(i) The residuals r_{ij} should be approximately normally distributed.
(ii) The residuals should be such that no iterative calculations such as bubble point or flash-type calculations are needed during each step of the minimization.
(iii) The residuals are chosen in a way to avoid numerical instabilities, e.g., exponent overflow.
Because the calculation of these residuals does not require any iterative calculations, the overall computational requirements are significantly less than for the explicit ML estimation method using Equation 14.15 and the explicit LS estimation method using Equations 14.16a and 14.16b (Englezos et al. 1990a).
14.2.4 Implicit Least Squares Estimation

It is well known that cubic equations of state have inherent limitations in describing accurately the fluid phase behavior. Thus our objective is often restricted to the determination of a set of interaction parameters that will yield an "acceptable fit" of the binary VLE data. The following implicit least squares objective function is suitable for this purpose

S_LS(k) = Σ_{i=1}^{N} Σ_{j=1}^{2} Q_{ij} r_{ij}²    (14.23)

where Q_{ij} are user-defined weighting factors chosen so that equal attention is paid to all data points. If the residuals are within the same order of magnitude, Q_{ij} can be set equal to one.
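In code, Equation 14.23 takes the following form. This is a hypothetical sketch in which ln_fug stands for whatever routine evaluates the logarithm of the species fugacities (e.g., Equations 14.9 and 14.12 for the Peng-Robinson EoS) at the measured conditions; the names are ours.

    import numpy as np

    def implicit_ls_objective(k, data, ln_fug, Q=None):
        """Implicit LS objective, Equation 14.23.  `data` holds the measured
        (T, P, x, y) for each of the N experiments; `ln_fug` returns the array
        [ln f_1, ln f_2] for the requested phase at the measured conditions."""
        S = 0.0
        for i, (T, P, x, y) in enumerate(data):
            r = (ln_fug(T, P, np.array([y, 1.0 - y]), 'vapor', k)
                 - ln_fug(T, P, np.array([x, 1.0 - x]), 'liquid', k))
            w = np.ones(2) if Q is None else Q[i]
            S += float(np.sum(w * r**2))  # no iterative flash or bubble point
        return S

Note that the measured compositions are used directly, which is precisely why no phase equilibrium calculation is required at any step of the minimization.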
14.2.5 Constrained Least Squares Estimation

It is well known that cubic equations of state may predict erroneous binary vapor-liquid equilibria when using interaction parameter estimates from an unconstrained regression of binary VLE data (Schwartzentruber et al., 1987; Englezos et al. 1989). In other words, the liquid phase stability criterion is violated. Modell and Reid (1983) discuss extensively the phase stability criteria. A general method to alleviate the problem is to perform the least squares estimation subject to satisfying the liquid phase stability criterion. In other words, parameters which yield the optimal fit of the VLE data subject to the liquid phase stability criterion are sought. One such method which ensures that the stability constraint is satisfied over the entire T-P-x surface has been proposed by Englezos et al. (1989).

Given a set of N binary VLE (T-P-x-y) data and an EoS, an efficient method to estimate the EoS interaction parameters subject to the liquid phase stability criterion is accomplished by solving the following problem

minimize  S_LS(k) = Σ_{i=1}^{N} e_i^T Q_i e_i,  where e_i = [ln f_{1i}^L - ln f_{1i}^V, ln f_{2i}^L - ln f_{2i}^V]^T    (14.24)

subject to  φ(T,P,x; k) > 0,  (T,P,x) ∈ Ω    (14.25)
where Q_i is a user-defined weighting matrix (the identity matrix is often used), k is the unknown parameter vector, f_i^L is the fugacity of component i in the liquid phase, f_i^V is the fugacity of component i in the vapor phase, φ is the stability function and Ω is the EoS-computed T-P-x surface over which the stability constraint should be satisfied. This is because the EoS user expects and demands from the equation of state to calculate the correct phase behavior at any specified condition. This is not a typical constrained minimization problem because the constraint is a function not only of the unknown parameters, k, but also of the state variables. In other words, although the objective function is calculated at a finite number of data points, the constraint should be satisfied over the entire feasible range of T, P and x.
14.2.5.1 Simplified Constrained Least Squares Estimation

Solution of the above constrained least squares problem requires the repeated computation of the equilibrium surface at each iteration of the parameter search. This can be avoided by using the equilibrium surface defined by the experimental VLE data points rather than the EoS-computed ones in the calculation of the stability function. The above minimization problem can be further simplified by satisfying the constraint only at the given experimental data points (Englezos et al. 1989). In this case, the constraint (Equation 14.25) is replaced by

φ(T̂_i, P̂_i, x̂_i; k) > δ    i = 1,2,...,N    (14.26)

where δ is a small positive number given by

δ = |∂φ/∂T| σ_T + |∂φ/∂P| σ_P + |∂φ/∂x| σ_x    (14.27)

In Equation 14.27, σ_T, σ_P and σ_x are the standard deviations of the measurements of T, P and x respectively. All the derivatives are evaluated at the point where the stability function φ has its lowest value. We call the minimization of Equation 14.24 subject to the above constraint simplified Constrained Least Squares (simplified CLS) estimation.
14.2.5.2 A Potential Problem with Sparse or Not Well Distributed Data

The problem with the above simplified procedure is that it may yield parameters that result in erroneous phase separation at conditions other than the given experimental ones. This problem arises when the given data are sparse or not well distributed. Therefore, we need a procedure that extends the region over which the stability constraint is satisfied.

The objective here is to construct the equilibrium surface in the T-P-x space from a set of available experimental VLE data. In general, this can be accomplished by using a suitable three-dimensional interpolation method. However, if a sufficient number of well distributed data is not available, this interpolation should be avoided as it may misrepresent the real phase behavior of the system.

In practice, VLE data are available as sets of isothermal measurements. The number of isotherms is usually small (typically 1 to 5). Hence, we are often left with limited information to perform interpolation with respect to temperature. On the contrary, one can readily interpolate within an isotherm (two-dimensional interpolation). In particular, for systems with a sparingly soluble component, at each isotherm one interpolates the liquid mole fraction values for a desired pressure range. For any other binary system (e.g., azeotropic), at each isotherm one interpolates the pressure for a given range of liquid phase mole fraction, typically 0 to 1.

For simplicity and in order to avoid potential misrepresentation of the experimental equilibrium surface, we recommend the use of 2-D interpolation. Extrapolation of the experimental data should generally be avoided. It should be kept in mind that, if prediction of complete miscibility is demanded from the EoS at conditions where no data points are available, a strong prior is imposed on the parameter estimation from a Bayesian point of view.
Location of φ_min
The set of points over which the minimum of φ is sought has now been expanded with the addition of the interpolated points to the experimental ones. In this expanded set, instead of using any gradient method for the location of the minimum of the stability function, we advocate the use of direct search. The rationale behind this choice is that first we avoid any local minima and second the computational requirements for a direct search over the interpolated and the given experimental data are rather negligible. Hence, the minimization of Equation 14.24 should be performed subject to the following constraint

φ(T_0, P_0, x_0; k) > ξ    (14.28)

where (T_0, P_0, x_0) is the point on the experimental equilibrium surface (interpolated or experimental) where the stability function is minimum. We call the minimization of Equation 14.24 subject to the above constraint (Equation 14.28) Constrained Least Squares (CLS) estimation.
It is noted that ξ should be a small positive constant with a value approximately equal to the difference in the value of φ when evaluated at the experimental equilibrium surface and the EoS-predicted one in the vicinity of (T_0, P_0, x_0). Namely, if we have a system with a sparingly soluble component, then ξ is given by

ξ = |φ(T_0, P_0, x_0^exp; k) - φ(T_0, P_0, x_0^EoS; k)|    (14.29a)

For all other systems, ξ can be computed by

ξ = |φ(T_0, P_0^exp, x_0; k) - φ(T_0, P_0^EoS, x_0; k)|    (14.29b)

In the above two equations, the superscripts exp and EoS indicate that the state variable has been obtained from the experimental equilibrium surface or by EoS calculations respectively. The value of ξ which is used is the maximum between Equations 14.27 and 14.29a, or 14.27 and 14.29b. Equation 14.27 accounts for the uncertainty in the measurement of the experimental data, whereas Equations 14.29a and 14.29b account for the deviation between the model prediction and the experimental equilibrium surface. The minimum of φ is computed by using the current estimates of the parameters during the minimization.
The problem of minimizing Equation 14.24 subject to the constraint given by Equation 14.26 or 14.28 is transformed into an unconstrained one by introducing the Lagrange multiplier, ω, and augmenting the LS objective function, S_LS(k), to yield

S_LG(k, ω) = S_LS(k) + ω [φ(T_0, P_0, x_0; k) - δ]    (14.30)

The minimization of S_LG(k, ω) can now be accomplished by applying the Gauss-Newton method with Marquardt's modification and a step-size policy as described in earlier chapters.
14.2.5.3 Constrained Gauss-Newton Method for Regression of Binary VLE Data

As we mentioned earlier, this is not a typical constrained minimization problem, although the development of the solution method is very similar to the material presented in Chapter 9. If we assume that an estimate k^(j) is available at the j-th iteration, a better estimate, k^(j+1), of the parameter vector is obtained as follows.

Linearization of the residual vector e = [ln f_1^L - ln f_1^V, ln f_2^L - ln f_2^V]^T around k^(j) at the i-th data point yields

e_i(k^(j+1)) = e_i(k^(j)) + G_i Δk^(j+1)    (14.31a)

Furthermore, linearization of the stability function yields

φ(k^(j+1)) = φ(k^(j)) + c^T Δk^(j+1)    (14.31b)

Substitution of Equations 14.31a and 14.31b into the objective function S_LG(k, ω) and use of the stationarity criteria ∂S_LG(k^(j+1), ω)/∂k^(j+1) = 0 and ∂S_LG(k^(j+1), ω)/∂ω = 0 yield the following system of linear equations:
A Δk^(j+1) = b - ω c    (14.32a)

and

c^T Δk^(j+1) = δ - φ_0    (14.32b)

where

A = Σ_{i=1}^{N} G_i^T Q_i G_i    (14.33a)

and

b = - Σ_{i=1}^{N} G_i^T Q_i e_i    (14.33b)

where Q_i is a user-defined weighting matrix (often taken as the identity matrix), G_i is the (2×p) sensitivity matrix (∂e^T/∂k)^T evaluated at x_i and k^(j), and c is the gradient (∂φ/∂k) evaluated at (T_0, P_0, x_0) and k^(j).
Equation 14.32a is solved for Δk^(j+1) and the result is substituted into Equation 14.32b, which is then solved for ω to yield

ω = (φ_0 - δ + c^T A^{-1} b) / (c^T A^{-1} c)    (14.34a)

Substituting the above expression for the Lagrange multiplier into Equation 14.32a, we arrive at the following linear equation for Δk^(j+1):

Δk^(j+1) = A^{-1} (b - ω c)    (14.34b)
Assuming that an estimate of the parameter vector, k^(j), is available at the j-th iteration, one can obtain the estimate for the next iteration by following the steps below (a code sketch follows the list):

(i) Compute the sensitivity matrix G_i, the residual vector e_i, and the stability function φ at each data point (experimental or interpolated).
(ii) Using direct search, determine the experimental or interpolated point (T_0, P_0, x_0) where φ is minimum.
(iii) Set up matrix A, vector b and compute vector c at (T_0, P_0, x_0).
(iv) Decompose matrix A using eigenvalue decomposition and compute ω from Equation 14.34a.
(v) Compute Δk^(j+1) from Equation 14.34b and update the parameter vector by setting k^(j+1) = k^(j) + μ Δk^(j+1), where a stepping parameter μ (0 < μ ≤ 1) is used to avoid the problem of overstepping. The bisection rule can be used to arrive at an acceptable value for μ.
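A compact sketch of one iteration of steps (iii)-(v) is given below. This is our own code; a direct linear solve is used in place of the eigenvalue decomposition mentioned in step (iv), which is a legitimate substitution when A is well conditioned.

    import numpy as np

    def constrained_gn_step(G_list, e_list, Q_list, c, phi0, delta, mu=1.0):
        """One constrained Gauss-Newton step, Equations 14.32-14.34.
        G_list, e_list, Q_list hold the sensitivity matrix, residual vector
        and weighting matrix per data point; c = (d phi/d k) and phi0 = phi,
        both at the point (T0, P0, x0) located by direct search."""
        A = sum(G.T @ Q @ G for G, Q in zip(G_list, Q_list))          # Eq. 14.33a
        b = -sum(G.T @ Q @ e for G, Q, e in zip(G_list, Q_list, e_list))  # Eq. 14.33b
        Ainv_b = np.linalg.solve(A, b)
        Ainv_c = np.linalg.solve(A, c)
        omega = (phi0 - delta + c @ Ainv_b) / (c @ Ainv_c)            # Eq. 14.34a
        dk = Ainv_b - omega * Ainv_c                                  # Eq. 14.34b
        return mu * dk, omega   # mu in (0, 1] guards against overstepping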
14.2.6 A Systematic Approach for Regression of Binary VLE Data

As explained earlier, the use of the objective function described by Equation 14.15 is not advocated due to the heavy computational requirements expected and the potential convergence problems. Calculations using the explicit LS objective functions given by Equations 14.16a and 14.16b are not advocated for the same reasons. In fact, Englezos et al. (1990a) have shown that the computational time can be as high as two orders of magnitude compared to implicit LS estimation using Equation 14.23. The computational time was also found to be at least twice the time required for ML estimation using Equation 14.20.
The implicit LS, ML and Constrained LS (CLS) estimation methods are now used to synthesize a systematic approach for the parameter estimation problem when no prior knowledge regarding the adequacy of the thermodynamic model is available. Given the availability of methods to estimate the interaction parameters in equations of state, there is a need to follow a systematic and computationally efficient approach to deal with all possible cases that could be encountered during the regression of binary VLE data. The following step by step systematic approach is proposed (Englezos et al. 1993):

Step 1. Data Compilation and EoS Selection:
A set of N VLE experimental data points have been made available. These data are the measurements of the state variables (T, P, x, y) at each of the N performed experiments. Prior to the estimation, one should plot the data and look for potential outliers as discussed in Chapter 8. In addition, a suitable EoS with the corresponding mixing rules should be selected.
Step 2. Best Set of Interaction Parameters:
The first task of the estimation procedure is to quickly and efficiently screen all possible sets of interaction parameters that could be used. For example, if the Trebble-Bishnoi EoS were to be employed, which can utilize up to four binary interaction parameters, the number of possible combinations that should be examined is 15 (see the sketch after this step). The implicit LS estimation procedure provides the most efficient means to determine the best set of interaction parameters. The best set is the one that results in the smallest value of the LS objective function after convergence of the minimization algorithm has been achieved. One should not readily accept a set that corresponds to a marginally smaller LS function if it utilizes more interaction parameters.
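The count of 15 follows from the 2^4 - 1 nonempty subsets of {k_a, k_b, k_c, k_d}. A trivial sketch of the screening loop is given below; the fitting call itself is whatever implicit LS routine is in use, so it is left as a comment.

    from itertools import combinations

    names = ('ka', 'kb', 'kc', 'kd')
    candidate_sets = [c for r in range(1, 5) for c in combinations(names, r)]
    assert len(candidate_sets) == 15

    # for subset in candidate_sets:
    #     fit the parameters in `subset` (others fixed at zero) by implicit LS
    #     (Equation 14.23) and record the converged objective value; prefer
    #     the smallest subset whose objective is not just marginally smaller.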
Step 3. Computation of VLE Phase Equilibria:
Once the best set of interaction parameters has been found, these parameters should be used with the EoS to perform the VLE calculations. The computed values should be plotted together with the data. A comparison of the data with the EoS-based calculated phase behavior reveals whether correct or incorrect phase behavior (erroneous liquid phase splitting) is obtained.

CASE A: Correct Phase Behavior:
If the correct phase behavior, i.e., absence of erroneous liquid phase splits, is predicted by the EoS, then the overall fit should be examined and it should be judged whether it is excellent. If the fit is simply acceptable rather than excellent, then the previously computed LS parameter estimates should suffice. This was found to be the case for the n-pentane-acetone and the methane-acetone systems presented later in this chapter.
Step 4a. Implicit ML Estimation:
When the fit is judged to be excellent, the statistically best interaction parameters can be efficiently obtained by performing implicit ML estimation. This was found to be the case with the methane-methanol and the nitrogen-ethane systems presented later in this chapter.

CASE B: Erroneous Phase Behavior:

Step 4b. CLS Estimation:
If incorrect phase behavior is predicted by the EoS, then constrained least squares (CLS) estimation should be performed and new parameter estimates be obtained. Subsequently, the phase behavior should be computed again and, if the fit is found to be acceptable for the intended applications, then the CLS estimates should suffice. This was found to be the case for the carbon dioxide-n-hexane system presented later in this chapter.

Step 5b. Modify EoS/Mixing Rules:
On several occasions the overall fit obtained by the CLS estimation could simply be found unacceptable despite the fact that the predictions of erroneous phase separation have been suppressed. In such cases, one should proceed and either modify the employed mixing rules or use a different EoS altogether. Of course, the estimation should start from Step 1 once the new thermodynamic model has been chosen. Calculations using the data for the propane-methanol system illustrate this case as discussed next.
14.2.7 Numerical Results

In this section we consider typical examples. They cover all possible cases that could be encountered during the regression of binary VLE data. Illustration of the methods is done with the Trebble-Bishnoi (Trebble and Bishnoi, 1988) EoS with quadratic mixing rules and temperature-independent interaction parameters. It is noted, however, that the methods are not restricted to any particular EoS/mixing rule.
[Figure 14.1 Vapor-liquid equilibrium data and calculated values for the n-pentane-acetone system at 397.7 K and 422.6 K (pressure versus liquid and vapor phase mole fractions).]
14.2.7.1 The n-Pentane-Acetone System

Experimental data are available for this system at three temperatures from Campbell et al. (1986). Interaction parameters were estimated by Englezos et al. (1993). It was found that the Trebble-Bishnoi EoS is able to represent the correct phase behavior with an accuracy that does not warrant subsequent use of ML estimation. The best set of interaction parameters was found by implicit LS estimation to be (k_a=0.0977, k_b=0, k_c=0, k_d=0). The standard deviation for k_a was found to be equal to 0.0020. For this system, the use of more than one interaction parameter did not result in any improvement of the overall fit. The deviations between the calculated and the experimental values were found to be larger than the experimental errors. This is attributed to systematic deviations due to model inadequacy. Hence, one should not attempt to perform ML estimation. In this case, the LS parameter estimates suffice. Figure 14.1 shows the calculated phase behavior using the LS parameter estimate. The plot shows the pressure versus the liquid and vapor phase mole fractions. This plot is known as a partial phase diagram.
14.2.7.2 The Methane-Acetone System

Data for the methane-acetone system are available from Yokoyama et al. (1985). The implicit LS estimates were computed and found to be sufficient to describe the phase behavior. These estimates are (k_a=0.0447, k_b=0, k_c=0, k_d=0). The standard deviation for k_a was found to be equal to 0.0079.
Figure 14.2 Vapor-liquid equilibrium data and calculated values for the nitrogen-ethane system [reprinted from the Canadian Journal of Chemical Engineering with permission].
14.2.7.3 The Nitrogen-Ethane System

Data at two temperatures were obtained from Zeck and Knapp (1986) for the nitrogen-ethane system. The implicit LS estimates of the binary interaction parameters are k_a=0, k_b=0, k_c=0 and k_d=0.0460. The standard deviation of k_d was found to be equal to 0.0040. The vapor-liquid phase equilibrium was computed and the fit was found to be excellent (Englezos et al. 1993). Subsequently, implicit ML calculations were performed and a parameter value of k_d=0.0493 with a standard deviation equal to 0.0070 was computed. Figure 14.2 shows the experimental phase diagram as well as the calculated one using the implicit ML parameter estimate.
14.2.7.4 The Methane-Methanol System

The methane-methanol binary is another system where the EoS is also capable of matching the experimental data very well and hence use of ML estimation to obtain the statistically best estimates of the parameters is justified. Data for this system are available from Hong et al. (1987). Using these data, the binary interaction parameters were estimated and, together with their standard deviations, are shown in Table 14.1. The values of the parameters not shown in the table (i.e., k_a, k_b, k_c) are zero.
Table 14.1 Parameter Estimates for the Methane-Methanol System

Parameter Value    Standard Deviation    Objective Function
k_d = -0.1903      0.0284                Implicit LS (Equation 14.23)
k_d = -0.2317      0.0070                Implicit ML (Equation 14.20)
k_d = -0.2515      0.0025                Explicit LS (T,P) (Equation 14.16b)

Source: Englezos et al. (1990a).
14.2.7.5 The Carbon Dioxide-Methanol System

Data for the carbon dioxide-methanol binary are available from Hong and Kobayashi (1988). The parameter values and their standard deviations estimated from the regression of these data are shown in Table 14.2.

Table 14.2 Parameter Estimates for the Carbon Dioxide-Methanol System

Note: Explicit LS (Equation 14.16a) did not converge with a zero value for Marquardt's directional parameter.
Source: Englezos et al. (1990a).
Table 14.3 Parameter Estimates for the Carbon Dioxide-n-Hexane System

Parameter Value    Standard Deviation    Objective Function
k_b = 0.1345       0.0375                Implicit LS (Equation 14.23)
k_c = -1.0699      0.1648
k_b = 0.1977       0.0334                Simplified CLS (Eq. 14.24 & 14.26)
k_c = -0.7686      0.1775
14.2.7.6 The Carbon Dioxide-n-Hexane System

This system illustrates the use of simplified constrained least squares (CLS) estimation. In Figure 14.3, the experimental data by Li et al. (1981) together with the calculated phase diagram for the carbon dioxide-n-hexane system are shown. The calculations were done by using the best set of interaction parameter values obtained by implicit LS estimation. These parameter values together with their standard deviations are given in Table 14.3. The values of the other parameters (k_a, k_d) were equal to zero. As seen from Figure 14.3, erroneous liquid phase separation is predicted by the EoS in the high pressure region. Subsequently, constrained least squares (CLS) estimation was performed by minimizing Equation 14.24 subject to the constraint of Equation 14.26, in other words by satisfying the liquid phase stability criterion at all experimental points. The new parameter estimates are also given in Table 14.3. These estimates were used for the re-calculation of the phase diagram, which is also shown in Figure 14.3. As seen, the EoS no longer predicts erroneous liquid phase splitting in the high pressure region. The parameter estimates obtained by the constrained LS estimation should suffice for engineering-type calculations since the EoS can now adequately represent the correct phase behavior of the system.
[Figure 14.3 Vapor-liquid equilibrium data and calculated values for the carbon dioxide-n-hexane system: calculations with LS parameters, calculations with CLS parameters, and data by Li et al. (1981); pressure (MPa) versus x, y carbon dioxide.]
14.2.7.7 The Propane-Methanol System

This system represents the case where the structural inadequacy of the thermodynamic model (EoS) is such that the overall fit is simply unacceptable when the EoS is forced to predict the correct phase behavior by using the CLS parameter estimates. In particular, it has been reported by Englezos et al. (1990a) that when the Trebble-Bishnoi EoS is utilized to model the phase behavior of the propane-methanol system (data by Galivel-Solastiuk et al. 1986) erroneous liquid phase splitting is predicted. Those calculations were performed using the best set of LS parameter estimates, which are given in Table 14.4. Following the steps outlined in the systematic approach presented in an earlier section, simplified CLS estimation was performed (Englezos et al. 1993; Englezos and Kalogerakis, 1993). The CLS estimates are also given in Table 14.4.
Table 14.4 Parameter Estimates for the Propane-Methanol System

Parameter Value    Standard Deviation    Objective Function
k_a = 0.1531       0.0113                Implicit LS (Equation 14.23)
k_d = -0.2994      0.0280
k_a = 0.1193       0.0305                Simplified CLS (Equation 14.24 & 14.26)
k_d = -0.3196      0.0731

Source: Englezos et al. (1993).
Figure 14.4 Vapor-liquid equilibrium data and calculated values for the propane-methanol system at T = 343.1 K (computations with LS parameters and with CLS parameters; data by Galivel-Solastiuk et al. 1986; pressure versus x, y propane) [reprinted from the Canadian Journal of Chemical Engineering with permission].
Next, the VLE was calculated using these parameters and the results, together with the experimental data, are shown in Figure 14.4. The erroneous phase behavior has been suppressed. However, the deviations between the experimental data and the EoS-based calculated phase behavior are excessively large. In this case, the overall fit is judged to be unacceptable and one should proceed and search for more suitable mixing rules. Schwartzentruber et al. (1987) also modeled this system and encountered the same problem.
Table 14.5 Parameter Estimates for the Diethylamine-Water System

Parameter Values    Standard Deviation    Covariance    Objective Function
k_a = -0.2094       0.0183                0.0005        Implicit LS (Equation 14.23)
k_d = -0.2665       0.0279
k_a = -0.8744       0.0364                0.0814        Simplified CLS (Equation 14.24 & 14.26)
k_d = 0.9808        0.4846
k_a = -0.7626       0.0551                0.1143        Constrained LS (Equation 14.24 & 14.28)
k_d = 0.6952        0.4923

Source: Englezos and Kalogerakis (1993).
14.2.7.8 The Diethylamine-Water System

Copp and Everett (1953) have presented 33 experimental VLE data points at three temperatures. The diethylamine-water system demonstrates the problem that may arise when using the simplified constrained least squares estimation due to an inadequate number of data. In such a case there is a need to interpolate the data points and to perform the minimization subject to the constraint of Equation 14.28 instead of Equation 14.26 (Englezos and Kalogerakis, 1993). First, unconstrained LS estimation was performed by using the objective function defined by Equation 14.23. The parameter values together with their standard deviations that were obtained are shown in Table 14.5. The covariances are also given in the table. The other parameter values are zero.

Using the values (-0.2094, -0.2665) for the parameters k_a and k_d in the EoS, the stability function, φ, was calculated at each experimental VLE point. The minima of the stability function at each isotherm are shown in Figure 14.5. The stability function at 311.5 K is shown in Figure 14.6. As seen, φ becomes negative. This indicates that the EoS predicts the existence of two liquid phases. This is also evident in Figure 14.7, where the EoS-based VLE calculations at 311.5 K are shown.
Subsequently, simplified constrained LS estimation was performed by minimizing the objective function given by Equation 14.24 subject to the constraint described in Equation 14.26. The calculated parameters are also shown in Table 14.5. The minima of the stability function were also calculated and they are shown in Figure 14.8. As seen, φ is positive at all the experimental conditions. However, the new VLE calculations, shown in Figure 14.9, indicate that the EoS predicts erroneous liquid phase separation. This result prompted the calculation of the stability function at pressures near the vapor pressure of water, where it was found that φ becomes negative.
Figure 14.5 The minima of the stability function at the experimental temperatures for the diethylamine-water system [reprinted from Computers & Chemical Engineering with permission from Elsevier Science].
Figure 14.6 The stability function calculated with interaction parameters from unconstrained least squares estimation (the region labeled "Unstable Liquid" corresponds to negative values of φ).
[Figures 14.7 and 14.8: plot residue only; recoverable labels: Temperature = 311.5 K; x-axis: Diethylamine Mole Fraction; legend: Calculated by EoS using simplified CLS parameters.]
Figure 14.9 Vapor-liquid equilibrium data and calculated values for the diethylamine-water system. Calculations were done using parameters from simplified CLS estimation [reprinted from Computers & Chemical Engineering with permission from Elsevier Science].
Figure 14.10 The stability function at 311.5 K versus pressure (MPa), calculated by the EoS using parameters from constrained LS estimation [reprinted from Computers & Chemical Engineering with permission from Elsevier Science].
Figure 14.11 Vapor-liquid equilibrium data and calculated values for the diethylamine-water system at 311.5 K. Calculations were done using parameters from CLS estimation [reprinted from Computers & Chemical Engineering with permission from Elsevier Science].
Therefore, although the stability function was found to be positive at all the experimental conditions, it becomes negative at mole fractions between 0 and the first measured data point. Obviously, if there were additional data available in this region, the simplified constrained LS method that was followed above would have yielded interaction parameters that do not result in the prediction of false liquid phase splitting.

Following the procedure described in Section 14.2.5.2, the region over which the stability function is examined is extended as follows. At each experimental temperature, a large number of mole fraction values are selected and then at each (T, x) point the values of the pressure, P, corresponding to VLE are calculated. This calculation is performed by using cubic spline interpolation of the experimental pressures. For this purpose the subroutine ICSSCU from the IMSL library was used. The interpolation was performed only once and prior to the parameter estimation, by using 100 mole fraction values selected between 0 and 0.05 to ensure a close examination of the stability function. During the minimization, these values of T, P and x together with the given VLE data were used to compute the stability function and find its minimum value. The current estimates of the parameters were employed at each minimization step.
New interaction parameter values were obtained and they are also shown in Table 14.5. These values are used in the EoS to calculate the stability function and the calculated results at 311.5 K are shown in Figure 14.10. As seen from the figure, the stability function does not become negative at any pressure when the diethylamine mole fraction lies anywhere between 0 and 1. The phase diagram calculations at 311.5 K are shown in Figure 14.11. As seen, the correct phase behavior is now predicted by the EoS.
The improved method guarantees that the EoS will calculate the correct VLE not only at the experimental data but also at any other point that belongs to the same isotherm. The question that arises is what happens at temperatures different from the experimental ones. As seen in Figure 14.10, the minima of the stability function increase monotonically with respect to temperature. Hence, it is safe to assume that at any temperature between the lowest and the highest one, the EoS predicts the correct behavior of the system. However, below the minimum experimental temperature, it is likely that the EoS will predict erroneous liquid phase separation.

The above observation provides useful information to the experimenter when investigating systems that exhibit vapor-liquid-liquid equilibrium. In particular, it is desirable to obtain VLE measurements at a temperature near the one where the third phase appears. Then, by performing CLS estimation, it is guaranteed that the EoS predicts complete miscibility everywhere in the actual two-phase region. It should be noted, however, that in general the minima of the stability function at each temperature might not change monotonically. This is the case with the CO₂-n-hexane system, where it is risky to interpolate for intermediate temperatures. Hence, VLE data should also be collected at intermediate temperatures.
14.3 PARAMETER ESTIMATION USING THE ENTIRE BINARY PHASE EQUILIBRIUM DATA

It is known that in five of the six principal types of binary fluid phase equilibrium diagrams, data other than VLE may also be available for a particular binary (van Konynenburg and Scott, 1980). Thus, the entire database may also contain VL₂E, VL₁E, VL₁L₂E, and L₁L₂E data. In this section, a systematic approach to utilize the entire phase equilibrium database is presented. The material is based on the work of Englezos et al. (1990b; 1998).
14.3.1 The Objective Function

Let us assume that for a binary system there are available N₁ VL₁E, N₂ VL₂E, N₃ L₁L₂E and N₄ VL₁L₂E data points. The light liquid phase is L₁ and L₂ is the heavy one. Thus, the total number of available data is N = N₁ + N₂ + N₃ + N₄. Gas-gas equilibrium type of data are not included in the analysis because they are beyond the range of most practical applications. An implicit objective function that provides an appropriate measure of the ability of the EoS to represent all these types of equilibrium data is the following

S(k) = Σ_{i=1}^{N} r_i^T R_i r_i    (14.35)
where the vectors r_i are the residuals and R_i are suitable weighting matrices. The residuals are based on the iso-fugacity phase equilibrium criterion. When two-phase data (VL₁E, or VL₂E or L₁L₂E) are considered, the residual vector takes the form

r_i = [ln f_1^α - ln f_1^β,  ln f_2^α - ln f_2^β]_i^T    (14.36)

where f_j^α and f_j^β are the fugacities of component j (j=1 or 2) in phase α (vapor or liquid 1) and phase β (liquid 1 or liquid 2) respectively for the i-th experiment. When three-phase data (VL₁L₂E) are used, the residuals become

r_i = [ln f_1^V - ln f_1^{L1},  ln f_2^V - ln f_2^{L1},  ln f_1^V - ln f_1^{L2},  ln f_2^V - ln f_2^{L2}]_i^T    (14.37)
The residuals are functions of temperature, pressure, composition and the interaction parameters. These functions can easily be derived analytically for any equation of state. At equilibrium the value of these residuals should be equal to zero. However, when the measurements of the temperature, pressure and mole fractions are introduced into these expressions the resulting values are not zero even if the EoS were perfect. The reason is the random experimental error associated with each measurement of the state variables.

The values of the elements of the weighting matrices R_i depend on the type of estimation method being used. When the residuals in the above equations can be assumed to be independent, normally distributed with zero mean and the same constant variance, least squares (LS) estimation should be performed. In this case, the weighting matrices in Equation 14.35 are replaced by the identity matrix I. Maximum likelihood (ML) estimation should be applied when the EoS is capable of calculating the correct phase behavior of the system within the experimental error. Its application requires the knowledge of the measurement
errors for each variable (i.e., temperature, pressure, and mole fractions) at each experimental data point. Each of the weighting matrices is replaced by the inverse of the covariance matrix of the residuals. The elements of the covariance matrices are computed by a first order variance approximation of the residuals and from the variances of the errors in the measurements of T, P and the composition (z) in the coexisting phases (Englezos et al. 1990a; b).
The optimal parameter values are found by the Gauss-Newton method
using Marquardt's modification. The estimation problem is of considerable
magnitude, especially for a multi-parameter EoS, when the entire fluid phase
equilibrium database is used. Usually, there is no prior information about the
ability of the EoS to represent the phase behavior of a particular system. It is
possible that convergence of the minimization of the objective function given by
Eq. 14.35 will be difficult to achieve due to structural inadequacies of the EoS.
Based on the systematic approach for the treatment of VLE data of Englezos et
al. (1993), the following step-wise methodology is advocated for the treatment of
the entire phase database (a computational sketch is given after the list):
1. Consider each type of data separately and estimate the best set of interaction
   parameters by Least Squares.
2. If the estimated best set of interaction parameters is found to be different for
   each type of data, then it is rather meaningless to correlate the entire
   database simultaneously. One may proceed, however, to find the parameter
   set that correlates the maximum number of data types.
3. If the estimated best set of interaction parameters is found to be the same
   for each type of data, then use the entire database and perform least squares
   estimation.
4. Compute the phase behavior of the system and compare with the data. If the
   fit is excellent, then proceed to maximum likelihood parameter estimation.
   If the fit is simply acceptable, then the LS estimates suffice.
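A minimal computational sketch of this stepwise strategy follows; residual_fn is a hypothetical routine that stacks the implicit residuals of Equations 14.36 and 14.37 for a given data set, and scipy's general-purpose least-squares solver stands in for the Gauss-Newton-Marquardt procedure described above.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_by_type(datasets, residual_fn, k0):
    """Step 1: implicit LS (R_i = I) for each data type separately.

    datasets maps a label such as 'VL2E' or 'L1L2E' to its tie-line data;
    residual_fn(k, data) is assumed to stack the residual vectors of
    Eq. 14.36/14.37 for every point in data.
    """
    return {label: least_squares(residual_fn, k0, args=(data,)).x
            for label, data in datasets.items()}

def fit_all(datasets, residual_fn, k0):
    """Steps 3-4: pool every data type and repeat the LS estimation."""
    def stacked(k):
        return np.concatenate([residual_fn(k, d) for d in datasets.values()])
    return least_squares(stacked, k0).x
```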
14.3.2 Covariance Matrix of the Parameters
The elements of the covariance matrix of the parameter estimates are
calculated when the minimization algorithm has converged with a zero value for
Marquardt's directional parameter. The covariance matrix of the parameters
COV(k) is determined using Equation 11.1 where the degrees of freedom used
in the calculation of \hat{\sigma}_{\varepsilon}^2 are

    (d.f.) = 2(N_1 + N_2 + N_3) + 4N_4 - p                (14.38)
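The computation can be sketched as follows, assuming the Jacobian J of the stacked residuals is available at convergence; this mirrors the usual Gauss-Newton covariance approximation of Equation 11.1 with the degrees of freedom of Equation 14.38.

```python
import numpy as np

def parameter_covariance(J, r, n_two_phase, n_three_phase):
    """COV(k) from Equation 11.1 with the degrees of freedom of Eq. 14.38.

    J is the Jacobian of the stacked residuals r at the converged estimates
    (Marquardt parameter equal to zero); n_two_phase = N1 + N2 + N3 and
    n_three_phase = N4.
    """
    p = J.shape[1]
    dof = 2 * n_two_phase + 4 * n_three_phase - p
    sigma2 = float(r @ r) / dof                # estimate of the error variance
    return sigma2 * np.linalg.inv(J.T @ J)     # Gauss-Newton approximation
```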
14.3.3 Numerical Results
Data for the hydrogen sulfide-water and the methane-n-hexane binary
systems were considered. The first is a type III system in the binary phase
diagram classification scheme of van Konynenburg and Scott. Experimental data
from Selleck et al. (1952) were used. Carroll and Mather (1989a; b) presented a
new interpretation of these data and also new three-phase data. In this work, only
those VLE data from Selleck et al. (1952) that are consistent with the new data
were used. Data for the methane-n-hexane system are available from Poston and
McKetta (1966) and Lin et al. (1977). This is a type V system.
It was shown by Englezos et al. (1998) that use of the entire database can be
a stringent test of the correlational ability of the EoS and/or the mixing rules. An
additional benefit of using all types of phase equilibrium data in the parameter
estimation database is the fact that the statistical properties of the estimated
parameter values are usually improved in terms of their standard deviation.
14.3.3.1 The Hydrogen Sulfide-Water System
The same two interaction parameters (ka, kd) were found to be adequate to
correlate the VLE, LLE and VLLE data of the H2S-H2O system. Each data set
was used separately to estimate the parameters by implicit least squares (LS).
Table 14.6 Parameter Estimates for the Hydrogen Sulfide-Water System

Parameter          Standard     Used Data        Objective Function
                   Deviation
ka =  0.2209       0.0120       VL2E             Implicit LS
kd = -0.5862       0.0306                        (Eqn 14.35 with Ri=I)
ka =  0.1929       0.0149       L1L2E            Implicit LS
kd = -0.5455       0.0366                        (Eqn 14.35 with Ri=I)
ka =  0.2324       0.0232       VL1L2E           Implicit LS
kd = -0.6054       0.0581                        (Eqn 14.35 with Ri=I)
ka =  0.2340       0.0132       VL2E & L1L2E     Implicit LS
kd = -0.6179       0.0333       & VL1L2E         (Eqn 14.35 with Ri=I)
ka =  0.2108       0.0003       VL2E & L1L2E     Implicit ML
kd = -0.5633       0.0013       & VL1L2E         (Equation 14.35 with Ri=COV^{-1}(ei))

Note: L1 is the light liquid phase and L2 the heavy.
Source: Englezos et al. (1998).
The values of these parameter estimates together with the standard
deviations are shown in Table 14.6. Subsequently, these two interaction
parameters were recalculated using the entire database. The algorithm easily
converged with a zero value for Marquardt's directional parameter and the
estimated values are also shown in the table. The phase behavior of the system
was then computed using these interaction parameter values and it was found to
agree well with the experimental data. This allows maximum likelihood (ML)
estimation to be performed. The ML estimates that are also shown in Table 14.6
were utilized in the EoS to illustrate the computed phase behavior. The
calculated pressure-composition diagrams corresponding to VLE and LLE data
at 344.26 K are shown in Figures 14.12 and 14.13. There is an excellent
agreement between the values computed by the EoS and the experimental data.
The Trebble-Bishnoi EoS was also found capable of representing well the three
phase equilibrium data (Englezos et al. 1998).
14.3.3.2 The Methane-n-Hexane System
The proposed methodology was also followed for the methane-n-hexane
system, and the best parameter estimates for the various types of data are shown
in Table 14.7. As seen, the parameter set (ka, kd) was found to be the best to
correlate the VL2E, the L1L2E and the VL1L2E data and another (ka, kb) the
VL1E data.
Table 14.7 Parameter Estimates for the Methane-n-Hexane System

Parameter          Standard     Used Data        Objective Function
                   Deviation
ka = -0.0793       0.0102       VL2E             Implicit LS
kd =  0.0695       0.0120                        (Eqn 14.35 with Ri=I)
ka = -0.1503       0.0277       VL1E             Implicit LS
kb = -0.2385       0.0403                        (Eqn 14.35 with Ri=I)
ka = -0.0061       0.0018       L1L2E            Implicit LS
kd =  0.2236       0.0116                        (Eqn 14.35 with Ri=I)
ka =  0.0113       0.0084       VL1L2E           Implicit LS
kd =  0.3185       0.0254                        (Eqn 14.35 with Ri=I)
ka = -0.0587       0.0075       VL2E & L1L2E     Implicit LS
kd =  0.0632       0.0104       & VL1L2E         (Eqn 14.35 with Ri=I)

Source: Englezos et al. (1998).
Figure 14.12 VLE data and calculated phase diagram for the hydrogen
sulfide-water system at 344.26 K (pressure versus hydrogen sulfide mole
fraction).

Figure 14.13 LLE data and calculated phase diagram for the hydrogen
sulfide-water system at 344.26 K; data by Selleck et al. (1952) [reprinted
from Industrial & Engineering Chemistry Research with permission from
the American Chemical Society].
Phase equilibrium calculations using the best set of interaction parameter
estimates for each type of data revealed that the EoS is able to represent well
only the VL2E data. The EoS was not found to be able to represent the L1L2E
and the VL1L2E data. Although the EoS is capable of representing only part of
the fluid phase equilibrium diagram, a single set of values for the best set of
interaction parameters was found by using all but the VL1E data. The values for
this set (ka, kd) were calculated by least squares and they are also given in Table
14.7.
Using the estimated interaction parameters, phase equilibrium computations
were performed. It was found that the EoS is able to represent the VL2E
behavior of the methane-n-hexane system in the temperature range of 198.05 to
444.25 K reasonably well. Typical results together with the experimental data at
273.16 and 444.25 K are shown in Figures 14.14 and 14.15 respectively.
However, the EoS was found to be unable to correlate the entire phase behavior
in the temperature range between 195.91 K (Upper Critical Solution Temperature)
and 182.46 K (Lower Critical Solution Temperature).
14.4 PARAMETER ESTIMATION USING BINARY CRITICAL POINT
DATA
Given the inherent robustness and efficiency of implicit methods, their use
was extended to include critical point data (Englezos et al. 1998). Illustration of
the method was done with the Trebble-Bishnoi EoS (Trebble and Bishnoi, 1988)
and the Peng-Robinson EoS (Peng and Robinson, 1976) with quadratic mixing
rules and temperature-independent interaction parameters. The methodology,
however, is not restricted to any particular EoS/mixing rule.
Prior work on the use of critical point data to estimate binary interaction
parameters employed the minimization of a summation of squared differences
between experimental and calculated critical temperature and/or pressure
(Equation 14.39). During that minimization the EoS uses the current parameter
estimates in order to compute the critical pressure and/or the critical
temperature. However, the initial estimates are often away from the optimum
and as a consequence, such iterative computations are difficult to converge and
the overall computational requirements are significant.
14.4.1 The Objective Function
It is assumed that there are available N_CP experimental binary critical point
data. These data include values of the pressure, Pc, the temperature, Tc, and the
mole fraction, xc, of one of the components at each of the critical points for the
binary mixture. The vector k of interaction parameters is determined by fitting
the EoS to the critical data. In explicit formulations the interaction parameters
are obtained by the minimization of the following least squares objective
function:
    S(\mathbf{k}) = \sum_{i=1}^{N_{CP}} \left[ q_1 \left(T_c^{exp} - T_c^{calc}\right)_i^2 + q_2 \left(P_c^{exp} - P_c^{calc}\right)_i^2 \right]                (14.39)

Figure 14.14 VLE data and calculated phase diagram for the methane-n-hexane
system at 273.16 K (pressure versus methane mole fraction).

Figure 14.15 VLE data and calculated phase diagram for the methane-n-hexane
system at 444.25 K; data by Poston and McKetta (1966).
The values of q1 and q2 are usually taken equal to one. Hissong and Kay
(1970) assigned a value of four for the ratio of q2 over q1.
The critical temperature and/or the critical pressure are calculated by
solving the equations that define the critical point, given below (Modell and
Reid, 1983)

    \left(\frac{\partial^2 G}{\partial x^2}\right)_{T,P} = 0                (14.40a)

    \left(\frac{\partial^3 G}{\partial x^3}\right)_{T,P} = 0                (14.40b)
Because the current estimates of the interaction parameters are used when
solving the above equations, convergence problems are often encountered when
these estimates are far from their optimal values. It is therefore desirable to
have, especially for multi-parameter equations of state, an efficient and robust
estimation procedure. Such a procedure is presented next.
At each point on the critical locus Equations 14.40a and b are satisfied when
the true values of the binary interaction parameters and the state variables, Tc, Pc
and xc, are used. As a result, following an implicit formulation, one may attempt
to minimize the following residuals:

    \left(\frac{\partial^2 G}{\partial x^2}\right)_{T,P} = r_1                (14.41a)

    \left(\frac{\partial^3 G}{\partial x^3}\right)_{T,P} = r_2                (14.41b)
Furthermore, in order to avoid any iterative computations for each critical
point, we use the experimental measurements of the state variables instead of
their unknown true values. In the above equations r1 and r2 are residuals which
can be easily calculated for any equation of state.
Based on the above residuals the following objective function is
formulated:
    S(\mathbf{k}) = \sum_{i=1}^{N_{CP}} \mathbf{r}_i^T \mathbf{R}_i \mathbf{r}_i                (14.42)
where r = (r1, r2)^T. The optimal parameter values are found by using the Gauss-
Newton-Marquardt minimization algorithm. For LS estimation, Ri = I is used;
however, for ML estimation, the weighting matrices need to be computed. This
is done using a first order variance approximation.
The elements of the covariance matrix of the parameter estimates are
calculated when the minimization algorithm has converged with a zero value for
Marquardt's directional parameter. The covariance matrix of the parameters
COV(k) is determined using Equation 11.1 where the degrees of freedom used
in the calculation of \hat{\sigma}_{\varepsilon}^2 are

    (d.f.) = 2N_{CP} - p                (14.43)
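The following sketch illustrates the implicit formulation, with gibbs a hypothetical EoS routine for the molar Gibbs energy of the mixture; the composition derivatives of Equations 14.40a-b are approximated by central finite differences evaluated directly at the measured critical point, so no iterative solution for Tc or Pc is required.

```python
import numpy as np

def critical_point_residuals(Tc, Pc, xc, k, gibbs):
    """Residuals r1, r2 of Eqs. 14.41a-b at one measured critical point.

    gibbs(T, P, x, k) is assumed to return the molar Gibbs energy of the
    mixture from the EoS with interaction parameters k.
    """
    h = 1.0e-4   # composition step for the finite differences (assumed)
    g = np.array([gibbs(Tc, Pc, xc + i * h, k) for i in (-2, -1, 0, 1, 2)])
    r1 = (g[1] - 2.0 * g[2] + g[3]) / h**2                        # d2G/dx2
    r2 = (g[4] - 2.0 * g[3] + 2.0 * g[1] - g[0]) / (2.0 * h**3)   # d3G/dx3
    return r1, r2
```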
14.4.2 Numerical Results
Five critical points for the methane-n-hexane system in the temperature
range of 198 to 273 K measured by Lin et al. (1977) are available. By
employing the Trebble-Bishnoi EoS in our critical point regression least squares
estimation method, the parameter set (ka, kb) was found to be the optimal one.
Convergence from an initial guess of (ka, kb) = (0.001, -0.001) was achieved in six
iterations. The estimated values are given in Table 14.8.
As seen, coincidentally, this is the set that was optimal for the VL1E data.
The same data were also used with the implicit least squares optimization
procedure but by using the Peng-Robinson EoS. The results are also shown in
Table 14.8 together with results for the CH4-C3H8 system. One interaction
parameter in the quadratic mixing rule for the attractive parameter was used.
Temperature-dependent values estimated from VLE data have also been
reported by Ohe (1990) and are also given in the table. As seen from the table,
the optimal parameter values are not the same as those obtained from the
phase equilibrium data. This is expected because the EoS is a semi-empirical
model and different data are used for the regression. A perfect thermodynamic
model would be expected to correlate both sets of data equally well. Thus,
regression of the critical point data can provide additional information about the
correlational and predictive capability of the EoS/mixing rule.
In principle, one may combine equilibrium and critical data in one database
for the parameter estimation. From a numerical implementation point of view
this can easily be done with the proposed estimation methods. However, it was
not done because it puts a tremendous demand on the correlational ability of the
EoS to describe all the data and it would be simply a computational exercise.
Table 14.8 Interaction Parameter Values from Binary Critical Point Data

Parameter Values Using      Parameter Values Using     EoS
Critical Point Data         VLE Data
kij = 0.1062                kij = 0.0249               Peng-Robinson
kij = -0.013                kij = 0.0575b              Peng-Robinson

Source: Englezos et al. (1998).
Table 14.9 Vapor-Liquid Equilibrium Data for the Methanol (1)-Isobutane
(2) System at 133.0 °C

Pressure (MPa)   Liquid Phase Mole     Vapor Phase Mole
                 Fraction (x2)         Fraction (y2)
3.529            1.0000                1.0000
3.591            0.9833                0.9897
3.651            0.9711                0.9782
3.742            0.9369                0.9452
3.816            0.9130                0.9130
3.951            0.8465                0.8465b
3.959            0.8048a               0.8048
3.920            0.7868                0.7244

a This is an azeotropic composition.
b This is a critical point composition.
Source: Leu and Robinson (1992).
14.5 PROBLEMS
There are numerous sources of phase equilibrium data available that serve
as a database to those developing or improving equations of state. References to
these databases are widely available. In addition, new data are added mainly
through the Journal of Chemical and Engineering Data and the journal Fluid
Phase Equilibria. Next, we give data for two systems so that the reader may
practice the estimation methods discussed in this chapter.
14.5.1 Data for the Methanol-Isobutane System
Leu and Robinson (1992) reported data for this binary system. The data
were obtained at temperatures of 0.0, 50.0, 100.0, 125.0, 133.0 and 150.0 °C. At
each temperature the vapor and liquid phase mole fractions of isobutane were
measured at different pressures. The data at 133.0 and 150.0 °C are given in Tables
14.9 and 14.10 respectively. The reader should test if the Peng-Robinson and the
Trebble-Bishnoi equations of state are capable of describing the observed phase
behavior. First, each isothermal data set should be examined separately.
Table 14.10 Vapor-Liquid Equilibrium Data for the Methanol (1)-Isobutane
(2) System at 150.0 °C

Pressure (MPa)   Liquid Phase Mole     Vapor Phase Mole
                 Fraction (x2)         Fraction (y2)
1.336            0.0000                0.0000
1.520            0.0092                0.1209
1.982            0.0285                0.2819
2.448            0.0566                0.4169
…                0.1550                0.6196
4.542            0.3279                0.6601
4.623            0.4876                0.6489
…                0.5724                0.6400
4.679            0.6254                0.6254a

a This is a critical point composition.
Source: Leu and Robinson (1992).
14.5.2 Data for the Carbon Dioxide-Cyclohexane System
Shibata and Sandler (1989) reported VLE data for the CO2-cyclohexane
and N2-cyclohexane systems. The data for the CO2-cyclohexane system
are given in Tables 14.11 and 14.12. The reader should test if the Peng-Robinson
and the Trebble-Bishnoi equations of state are capable of describing
the observed phase behavior. First, each isothermal data set should be examined
separately.
Table 14.11 Vapor-Liquid Equilibrium Data for the Carbon Dioxide (1)
-Cyclohexane (2) System at 366.5 K

Pressure (MPa)   Liquid Phase Mole     Vapor Phase Mole
                 Fraction (x1)         Fraction (y1)
0.171            0.0000                0.0000
1.752            0.0959                0.8984
3.469            0.1681                0.9374
5.191            0.2657                0.9495
6.886            0.3484                0.9515
8.637            0.4598                0.9500
10.360           0.5698                0.9402
12.511           0.7526                0.8977
12.800           0.7901                0.8732

Source: Shibata and Sandler (1989).
Table 14.12 Vapor-Liquid Equilibrium Data for the Carbon Dioxide (1)
-Cyclohexane (2) System at 410.9 K

Pressure (MPa)   Liquid Phase Mole     Vapor Phase Mole
                 Fraction (x1)         Fraction (y1)
…                0.0000                0.0000
1.718            0.0671                0.7166
…                …                     0.8295
…                …                     0.8593
6.928            0.2887                0.8850
8.651            0.3578                0.8861
10.402           0.4329                0.8777
12.056           0.5165                0.8636
13.814           0.6107                0.8320
14.441           0.6850                0.7846
…                …                     0.7671

Source: Shibata and Sandler (1989).
15
Parameter Estimation in Nonlinear
Thermodynamic Models:
Activity Coefficients

Activity coefficient models offer an alternative approach to equations of
state for the calculation of fugacities in liquid solutions (Prausnitz et al. 1986; Tassios,
1993). These models are also mechanistic and contain adjustable parameters
to enhance their correlational ability. The parameters are estimated by matching
the thermodynamic model to available equilibrium data. In this chapter, we consider
the estimation of parameters in activity coefficient models for electrolyte and
non-electrolyte solutions.
15.1 ELECTROLYTE SOLUTIONS
We consider Pitzer's model for the calculation of activity coefficients in
aqueous electrolyte solutions (Pitzer, 1991). It is the most widely used thermodynamic
model for electrolyte solutions.

15.1.1 Pitzer's Model Parameters for Aqueous Na2SiO3 Solutions

Osmotic coefficient data measured by Park (Park and Englezos, 1998; Park,
1999) are used for the estimation of the model parameters. There are 16 osmotic
coefficient data available for the Na2SiO3 aqueous solution. The data are given in
Table 15.1. Based on these measurements the following parameters in Pitzer's
activity coefficient model can be calculated: β(0), β(1), and Cφ for Na2SiO3. The
three binary parameters may be determined by minimizing the following least
squares objective function (Park and Englezos, 1998):

    S(\mathbf{k}) = \sum_{i=1}^{N} \left( \frac{\varphi_i^{calc} - \varphi_i^{exp}}{\sigma_{\varphi,i}} \right)^2                (15.1)

where φcalc is the calculated and φexp is the measured osmotic coefficient, and σφ is
the uncertainty.
Table 15.1 Osmotic Coefficient Data for the Aqueous Na2SiO3 Solution

Molality   Osmotic Coefficient (φexp)   Standard Deviation (σφ)
0.0603     0.8923                       0.0096
0.0603     0.8926                       0.0096
0.3674     0.8339                       0.0091
0.3690     0.8304                       0.0090
0.5313     0.8188                       0.0088
0.5313     0.8188                       0.0088
0.8629     0.7797                       0.0083
0.8637     0.7790                       0.0083
1.2059     0.7617                       0.0080
1.2063     0.7614                       0.0080
1.4927     0.7608                       0.0078
1.4928     0.7607                       0.0078
1.8213     0.7685                       0.0078
1.8241     0.7674                       0.0077
2.3725     0.8028                       0.0078
2.3745     0.8021                       0.0078

Source: Park and Englezos (1998).
The calculated osmotic coefficient is obtained by the next equation

    \varphi^{calc} = 1 + |z_M z_X| f^{\varphi} + m \frac{2\nu_M \nu_X}{\nu} B_{MX}^{\varphi} + m^2 \frac{2(\nu_M \nu_X)^{3/2}}{\nu} C_{MX}^{\varphi}                (15.2)

where
z       is the charge
M       denotes a cation
X       denotes an anion
f^φ     is equal to -A_φ I^{1/2}/(1 + b I^{1/2})
A_φ     is the Debye-Hückel osmotic coefficient parameter
m       is the molality of the solute
I       is the ionic strength (I = ½ Σ_i m_i z_i^2)
b       is a universal parameter with the value of 1.2 (kg·mol^{-1})^{1/2}
ν       is the number of ions produced by 1 mole of the solute
B^φ_MX  is equal to β(0) + β(1) exp(-α1 I^{1/2}) + β(2) exp(-α2 I^{1/2})

The parameters β(0)MX, β(1)MX, CφMX are tabulated binary parameters specific
to the electrolyte MX. The β(2)MX is a parameter to account for the ion pairing
effect of 2-2 electrolytes. When either cation M or anion X is univalent, α1 = 2.0.
For 2-2, or higher valence pairs, α1 = 1.4. The constant α2 is equal to 12. The
parameter vector to be estimated is k = [β(0), β(1), Cφ]^T.
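A minimal Python sketch of this estimation is given below; the value used for the Debye-Hückel parameter at 25 °C and the initial guesses are assumptions for illustration, and scipy's least-squares solver stands in for the Gauss-Newton procedure.

```python
import numpy as np
from scipy.optimize import least_squares

A_PHI, B = 0.392, 1.2      # Debye-Hueckel parameter at 25 C (assumed) and b
ALPHA1 = 2.0               # univalent cation (Na+), so alpha1 = 2.0
NU_M, NU_X, ZM, ZX = 2, 1, 1, -2   # Na2SiO3 stoichiometry and ion charges

def osmotic(m, k):
    """Pitzer osmotic coefficient (Eq. 15.2) for a single electrolyte."""
    b0, b1, Cphi = k
    nu = NU_M + NU_X
    I = 0.5 * m * (NU_M * ZM**2 + NU_X * ZX**2)      # ionic strength
    f = -A_PHI * np.sqrt(I) / (1.0 + B * np.sqrt(I))
    Bphi = b0 + b1 * np.exp(-ALPHA1 * np.sqrt(I))
    return (1.0 + abs(ZM * ZX) * f
            + m * 2.0 * NU_M * NU_X / nu * Bphi
            + m**2 * 2.0 * (NU_M * NU_X)**1.5 / nu * Cphi)

def fit(m, phi_exp, sigma):
    """Minimize Eq. 15.1, weighting each residual by its uncertainty."""
    res = lambda k: (osmotic(m, k) - phi_exp) / sigma
    return least_squares(res, x0=np.array([0.05, 2.0, 0.01])).x
```

The arrays m, phi_exp and sigma would be filled with the three columns of Table 15.1.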
15.1.2 Pitzer's Model Parameters for Aqueous Na2SiO3-NaOH Solutions

There are 26 experimental osmotic coefficient data and they are given in
Table 15.2 (Park and Englezos, 1999; Park, 1999). Two sets of the binary
parameters for the NaOH and Na2SiO3 systems and two mixing parameters,
θ(OH-SiO3) and Ψ(Na-OH-SiO3), are required to model this system. The binary
parameters for the Na2SiO3 solution were obtained previously and those for the
NaOH system, β(0)NaOH = 0.0864, β(1)NaOH = 0.253, and CφNaOH = 0.0044 at 298.15
K, are available in the literature. The remaining two Pitzer's mixing parameters,
θ(OH-SiO3) and Ψ(Na-OH-SiO3), were determined by the least squares optimization
method using the 26 osmotic coefficient data of Table 15.2.
The Gauss-Newton method may be used to minimize the following LS objective
function (Park and Englezos, 1999):

    S(\mathbf{k}) = \sum_{i=1}^{N} \left( \frac{\varphi_i^{calc} - \varphi_i^{exp}}{\sigma_{\varphi,i}} \right)^2                (15.3)

where φexp is the measured osmotic coefficient, σφ is the uncertainty computed
using the error propagation law (Park, 1999) and φcalc is calculated by the following
equation,

    \varphi^{calc} = 1 + \frac{2}{\sum_i m_i} \Big[ f^{\varphi} I + \sum_c \sum_a m_c m_a \left(B_{ca}^{\varphi} + Z C_{ca}\right) + \sum_{c<c'} m_c m_{c'} \Big(\Phi_{cc'}^{\varphi} + \sum_a m_a \Psi_{cc'a}\Big) + \sum_{a<a'} m_a m_{a'} \Big(\Phi_{aa'}^{\varphi} + \sum_c m_c \Psi_{aa'c}\Big) \Big]                (15.4)
where
m       is the molality of the solute
f^φ     is equal to -A_φ I^{1/2}/(1 + b I^{1/2})
A_φ     is the Debye-Hückel osmotic coefficient parameter
b       is a universal parameter with the value of 1.2 (kg·mol^{-1})^{1/2}
I       is the ionic strength (I = ½ Σ_i m_i z_i^2)
c, c'   are cations
a, a'   are anions
B^φ_ca  is equal to β(0)_ca + β(1)_ca exp(-α1 I^{1/2}) + β(2)_ca exp(-α2 I^{1/2})
Z       is equal to Σ_i m_i |z_i|
z       is the charge
C_ca    is equal to Cφ_ca / (2 |z_c z_a|^{1/2})
Φ^φ_ij  is equal to θ_ij + Eθ_ij(I) + I Eθ'_ij(I)

The θ_ij are tabulated mixing parameters specific to the cation-cation or anion-anion
pairs. The Ψ_ijk is also a tabulated mixing parameter specific to the
cation-anion-anion or anion-cation-cation pairs.
Table 15.2 Osmotic Coefficient Data for the Aqueous Na2SiO3-NaOH Solution

Na+ molality   SiO3^2- molality   OH- molality   (φexp)   (σφ)
0.1397         0.0466             0.0466         0.9625   0.0103
0.1405         0.0468             0.0468         0.9572   0.0101
0.2911         0.0970             0.0970         0.9267   0.0100
0.2920         0.0973             0.0973         0.9239   0.0096
0.4395         0.1465             0.1465         0.8910   0.0094
0.4442         0.1481             0.1481         0.8816   0.0097
0.7005         0.2335             0.2335         0.8624   0.0093
0.7015         0.2338             0.2338         0.8611   0.0090
1.0558         0.3519             0.3519         0.8440   0.0088
1.0670         0.3557             0.3557         0.8352   0.0091
1.4192         0.4731             0.4731         0.8253   0.0088
1.4221         0.4740             0.4740         0.8236   0.0087
1.5759         0.5263             0.5263         0.8216   0.0087
1.5868         0.5289             0.5289         0.8175   0.0085
2.3537         0.7846             0.7846         0.8236   0.0084
2.3577         0.7859             0.7859         0.8222   0.0085
2.8747         0.9582             0.9582         0.8338   0.0085
2.8797         0.9599             0.9599         0.8323   0.0084
3.0542         1.0181             1.0181         0.8412   0.0085
3.0557         1.0186             1.0186         0.8408   0.0085
3.9739         1.3246             1.3246         0.8858   0.0086
3.9807         1.3269             1.3269         0.8843   0.0085
4.4861         1.4954             1.4954         0.9027   0.0085
4.4901         1.4967             1.4967         0.9019   0.0085
4.9888         1.6629             1.6629         0.9329   0.0086
4.9904         1.6635             1.6635         0.9326   0.0086

Source: Park and Englezos (1998).
The parameters Eθ_ij(I) and Eθ'_ij(I) represent the effects of asymmetrical
mixing. These values are significant only for 3-1 or higher electrolytes (Pitzer,
1975). The two-dimensional parameter vector to be estimated is simply
k = [θ(OH-SiO3), Ψ(Na-OH-SiO3)]^T.
15.1.3 Numerical Results
First the values for the parameter vector k = [β(0), β(1), Cφ]^T were obtained by
using the Na2SiO3 data and minimizing Equation 15.1. The estimated parameter
values are shown in Table 15.3 together with their standard deviation.
Subsequently, the parameter vector k = [θ(OH-SiO3), Ψ(Na-OH-SiO3)]^T was
estimated by using the data for the Na2SiO3-NaOH system and minimizing Equation
15.3. The estimated parameter values and their standard errors of estimate are
also given in Table 15.3. It is noted that for the minimization of Equation 15.3
knowledge of the binary parameters for NaOH is needed. These parameter values
are available in the literature (Park and Englezos, 1998).
Table 15.3 Calculated Pitzer's Model Parameters for
Na2SiO3 and Na2SiO3-NaOH Systems

Parameter Value                Standard Deviation
β(0) = 0.0577                  0.0039
β(1) = 2.8965                  0.0559
Cφ = 0.00977                   0.00176
θ(OH-SiO3) = -0.2703           0.0384
Ψ(Na-OH-SiO3) = 0.0233         0.0095

Source: Park and Englezos (1998).
The calculated osmotic coefficients using Pitzer's parameters were compared
with the experimentally obtained values and found to have an average percent
error of 0.33 for Na2SiO3 and 1.74 for the Na2SiO3-NaOH system respectively
(Park, 1999; Park and Englezos, 1998). Figure 15.1 shows the experimental
and calculated osmotic coefficients of Na2SiO3 and Figure 15.2 those for the
Na2SiO3-NaOH system respectively. As seen, the agreement between calculated
and experimental values is excellent. There are some minor inflections on the calculated
curves at molalities below 0.1 mol/kg H2O. Similar inflections were also
observed in other systems in the literature (Park and Englezos, 1998). It is noted
that it is known that the isopiestic method does not give reliable results below 0.1
mol/kg H2O.
Park has also obtained osmotic coefficient data for the aqueous solutions of
NaOH-NaCl-NaAl(OH)4 at 25°C employing the isopiestic method (Park and
Englezos, 1999; Park, 1999). The solutions were prepared by dissolving
AlCl3·6H2O in aqueous NaOH solutions. The osmotic coefficient data were then
used to evaluate the unknown Pitzer's binary and mixing parameters for the
NaOH-NaCl-NaAl(OH)4-H2O system. The binary Pitzer's parameters, β(0), β(1),
and Cφ, for NaAl(OH)4 were found to be -0.0083, 0.0710, and 0.00184 respectively.
These binary parameters were obtained from the data on the ternary system
because it was not possible to prepare a single (NaAl(OH)4) solution.

Figure 15.1 Calculated and experimental osmotic coefficients for the Na2SiO3
system (molality in mol/kg H2O).
15.2 NON-ELECTROLYTE SOLUTIONS
Activity coefficient models are functions of temperature, composition and, to
a very small extent, pressure. They offer the possibility of expressing the fugacity
of a chemical j, \hat{f}_j, in a liquid solution as follows (Prausnitz et al. 1986; Tassios,
1993)

    \hat{f}_j = x_j \gamma_j f_j                (15.5)

where x_j is the mole fraction, γ_j is the activity coefficient and f_j is the fugacity of
chemical species j.
Figure 15.2 Calculated and experimental osmotic coefficients for the Na2SiO3
-NaOH system. The line represents the calculated values.
Several activity coefficient models are available for industrial use. They are
presented extensively in the thermodynamics literature (Prausnitz et al., 1986).
Here we will give the equations for the activity coefficients of each component in
a binary mixture. These equations can be used to regress binary parameters from
binary experimental vapor-liquid equilibrium data.
15.2.1 The Two-Parameter Wilson Model

The activity coefficients are given by the following equations

    \ln\gamma_1 = -\ln(x_1 + \Lambda_{12}x_2) + x_2\left(\frac{\Lambda_{12}}{x_1+\Lambda_{12}x_2} - \frac{\Lambda_{21}}{x_2+\Lambda_{21}x_1}\right)                (15.6a)

    \ln\gamma_2 = -\ln(x_2 + \Lambda_{21}x_1) - x_1\left(\frac{\Lambda_{12}}{x_1+\Lambda_{12}x_2} - \frac{\Lambda_{21}}{x_2+\Lambda_{21}x_1}\right)                (15.6b)

The adjustable parameters are related to pure component molar volumes and
to characteristic energy differences as follows

    \Lambda_{12} = \frac{v_2}{v_1}\exp\left(-\frac{\lambda_{12}-\lambda_{11}}{RT}\right)                (15.7a)

    \Lambda_{21} = \frac{v_1}{v_2}\exp\left(-\frac{\lambda_{12}-\lambda_{22}}{RT}\right)                (15.7b)

where v1 and v2 are the liquid molar volumes of components 1 and 2 and the λ's
are energies of interaction between the molecules designated in the subscripts. The
temperature dependence of the quantities (λ12-λ11) and (λ12-λ22) can be neglected
without serious error.
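For reference, a direct transcription of Equations 15.6a-b into Python might look as follows; Λ12 and Λ21 are the adjustable parameters.

```python
import numpy as np

def wilson_gammas(x1, L12, L21):
    """Activity coefficients from the two-parameter Wilson model (Eq. 15.6)."""
    x2 = 1.0 - x1
    # common bracketed term of Eqs. 15.6a and 15.6b
    d = L12 / (x1 + L12 * x2) - L21 / (x2 + L21 * x1)
    ln_g1 = -np.log(x1 + L12 * x2) + x2 * d
    ln_g2 = -np.log(x2 + L21 * x1) - x1 * d
    return np.exp(ln_g1), np.exp(ln_g2)
```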
15.2.2 The Three-Parameter NRTL Model

Renon used the concept of local composition to develop the non-random two-liquid
(NRTL) three-parameter (α12, τ12, τ21) equations given below (Prausnitz et
al., 1986).

    \ln\gamma_1 = x_2^2\left[\tau_{21}\left(\frac{G_{21}}{x_1+x_2G_{21}}\right)^2 + \frac{\tau_{12}G_{12}}{(x_2+x_1G_{12})^2}\right]                (15.8a)

    \ln\gamma_2 = x_1^2\left[\tau_{12}\left(\frac{G_{12}}{x_2+x_1G_{12}}\right)^2 + \frac{\tau_{21}G_{21}}{(x_1+x_2G_{21})^2}\right]                (15.8b)

where

    G_{12} = \exp(-\alpha_{12}\tau_{12})                (15.8c)

    G_{21} = \exp(-\alpha_{12}\tau_{21})                (15.8d)

    \tau_{12} = \frac{g_{12}-g_{22}}{RT}                (15.8e)

    \tau_{21} = \frac{g_{21}-g_{11}}{RT}                (15.8f)

The parameter g_ij is an energy parameter characteristic of the i-j interaction.
The parameter α12 is related to the non-randomness in the mixture. The NRTL
model contains three parameters which are independent of temperature and composition.
However, experience has shown that for a large number of binary systems
the parameter α12 varies from about 0.20 to 0.47. Typically, the value of 0.3
is set.
15.2.3 The Two-Parameter UNIQUAC Model

The Universal Quasichemical (UNIQUAC) is a two-parameter (τ12, τ21)
model based on statistical mechanical theory. Activity coefficients are obtained by

    \ln\gamma_1 = \ln\frac{\varphi_1}{x_1} + \frac{z}{2}q_1\ln\frac{\theta_1}{\varphi_1} + \varphi_2\left(l_1 - \frac{r_1}{r_2}l_2\right) - q_1'\ln(\theta_1' + \theta_2'\tau_{21}) + \theta_2'q_1'\left(\frac{\tau_{21}}{\theta_1'+\theta_2'\tau_{21}} - \frac{\tau_{12}}{\theta_2'+\theta_1'\tau_{12}}\right)                (15.9a)

    \ln\gamma_2 = \ln\frac{\varphi_2}{x_2} + \frac{z}{2}q_2\ln\frac{\theta_2}{\varphi_2} + \varphi_1\left(l_2 - \frac{r_2}{r_1}l_1\right) - q_2'\ln(\theta_2' + \theta_1'\tau_{12}) + \theta_1'q_2'\left(\frac{\tau_{12}}{\theta_2'+\theta_1'\tau_{12}} - \frac{\tau_{21}}{\theta_1'+\theta_2'\tau_{21}}\right)                (15.9b)

where z = 10 (the coordination number) and

    l_1 = \frac{z}{2}(r_1 - q_1) - (r_1 - 1)                (15.9c)

    l_2 = \frac{z}{2}(r_2 - q_2) - (r_2 - 1)                (15.9d)

Segment or volume fractions, φ, and area fractions, θ and θ', are given by

    \varphi_i = \frac{x_i r_i}{\sum_j x_j r_j}, \qquad \theta_i = \frac{x_i q_i}{\sum_j x_j q_j}                (15.9e)

    \theta_i' = \frac{x_i q_i'}{\sum_j x_j q_j'}                (15.9f)

Parameters r, q and q' are pure component molecular-structure constants
depending on molecular size and external surface areas. For fluids other than water
or lower alcohols, q = q'.
For each binary mixture there are two adjustable parameters, τ12 and τ21.
These, in turn, are given in terms of characteristic energies Δu12 = u12 - u22 and
Δu21 = u21 - u11 by

    \ln\tau_{12} = -\frac{\Delta u_{12}}{RT} = -\frac{u_{12}-u_{22}}{RT}                (15.10a)

    \ln\tau_{21} = -\frac{\Delta u_{21}}{RT} = -\frac{u_{21}-u_{11}}{RT}                (15.10b)

Characteristic energies, Δu12 and Δu21, are often only weakly dependent on
temperature. The UNIQUAC equation is applicable to a wide variety of nonelectrolyte
liquid mixtures containing nonpolar or polar fluids such as hydrocarbons,
alcohols, nitriles, ketones, aldehydes, organic acids, etc. and water, including partially
miscible mixtures. The main advantages are its relative simplicity using only
two adjustable parameters and its wide range of applicability.
15.2.4 Parameter Estimation: The Objective Function

According to Tassios (1993) a suitable objective function to be minimized in
such cases is the following

    S(\mathbf{k}) = \sum_{i=1}^{N}\sum_{j=1}^{2}\left(\frac{\gamma_{j,i}^{exp} - \gamma_{j,i}^{calc}}{\gamma_{j,i}^{exp}}\right)^2                (15.11)

This is equivalent to assuming that the standard error in the measurement of
γ_j is proportional to its value.
Experimental values for the activity coefficients for components 1 and 2 are
obtained from the vapor-liquid equilibrium data. During an experiment, the following
information is obtained: pressure (P), temperature (T), liquid phase mole
fraction (x1 and x2 = 1 - x1) and vapor phase mole fraction (y1 and y2 = 1 - y1).
The activity coefficients are evaluated from the above phase equilibrium
data by procedures widely available in the thermodynamics literature (Tassios,
1993; Prausnitz et al. 1986). Since the objective in this book is parameter
estimation we will provide evaluated values of the activity coefficients based on
the phase equilibrium data and we will call these values experimental. These γ_j^exp
values can then be employed in Equation 15.11.
Alternatively, one may use implicit LS estimation, e.g., minimize Equation
14.23 where liquid phase fugacities are computed by Equation 15.5 whereas
vapor phase fugacities are computed by an EoS or any other available method
(Prausnitz et al., 1986).
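A simple sketch of both steps is shown below; the low-pressure relation used for the "experimental" activity coefficients neglects vapor-phase nonideality and the Poynting correction, and psat is a hypothetical routine that returns the pure-component vapor pressures.

```python
import numpy as np

def gamma_exp(P, T, x1, y1, psat):
    """'Experimental' activity coefficients from one T-P-x-y point.

    Assumes the simple low-pressure relation gamma_j = y_j P / (x_j Pjsat);
    psat(T) is assumed to return (P1sat, P2sat).
    """
    p1, p2 = psat(T)
    return y1 * P / (x1 * p1), (1.0 - y1) * P / ((1.0 - x1) * p2)

def objective(g_exp, g_calc):
    """Relative-error LS objective of Equation 15.11."""
    g_exp, g_calc = np.asarray(g_exp), np.asarray(g_calc)
    return float(np.sum(((g_exp - g_calc) / g_exp) ** 2))
```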
15.3 PROBLEMS
A number of problems formulated with data from the literature are given
next as exercises. In addition to the objective function given by Equation 15.11,
the reader who is familiar with thermodynamic computations may explore the use
of implicit objective functions based on fugacity calculations.
15.3.1 Osmotic Coefficients for Aqueous Solutions of KCl Obtained by the
Isopiestic Method

Thiessen and Wilson (1987) presented a modified isopiestic apparatus and
obtained osmotic coefficient data for KCl solutions using NaCl as reference solution.
The data are given in Table 15.4. Subsequently, they employed Pitzer's
method to correlate the data. They obtained the following values for the three Pitzer's
model parameters: β(0) = 0.05041176, β(1) = 0.195522, Cφ = 0.001355442.
Using a constant error for the measurement of the osmotic coefficient, estimate
Pitzer's parameters as well as the standard error of the parameter estimates by
minimizing the objective function given by Equation 15.1 and compare the results
with the reported parameters.
Table 15.4 Osmotic Coefficient Data for Aqueous KCl Solutions

Molality of KCl   Osmotic Coefficient (φ)
0.09872           0.9325
0.09893           0.9265
0.5274            0.8946
0.9634            0.8944
1.043             0.8981
1.157             0.9009
2.919             0.9351

Source: Thiessen and Wilson (1987).
15.3.2 Osmotic Coefficients for Aqueous Solutions of High-Purity NiCl2

Rard (1992) reported the results of isopiestic vapor-pressure measurements
for the aqueous solution of high-purity NiCl2 from 1.4382 to 5.7199
mol/kg at 298.15 ± 0.005 K. Based on these measurements he calculated the osmotic
coefficient of aqueous NiCl2 solutions. He also evaluated other data from
the literature and finally presented a set of smoothed osmotic coefficient and activity
of water data (see Table IV in the original reference).
Rard also employed Pitzer's electrolyte activity coefficient model to correlate
the data. It was found that the quality of the fit depended on the range of molalities
that were used. In particular, the fit was very good when the molalities were less
than 3 mol/kg.
Estimate the parameters in Pitzer's electrolyte activity coefficient model by minimizing the objective
function given by Equation 15.1 and using the following osmotic coefficient
data from Rard (1992) given in Table 15.5. First, use the data for molalities
less than 3 mol/kg and then all the data together. Compare your estimated values
with those reported by Rard (1992). Use a constant value for σφ in Equation 15.1.
Table 15.5 Osmotic Coefficients for Aqueous NiCl2 Solutions at 298.15 K

Molality   φ        Molality   φ
0.1        0.8556   1.8        1.3659
0.2        0.8656   2.0        1.4415
0.3        0.8842   2.2        1.5171
0.4        0.9064   2.4        1.5919
0.5        0.9312   2.5        1.6288
0.6        0.9580   2.6        1.6653
0.7        0.9864   2.8        1.7364
0.8        1.0163   3.0        1.8048
0.9        1.0475   3.2        1.8700
1.0        1.0798   3.4        1.9316
1.2        1.1473   3.5        1.9610
1.4        1.2180   3.6        1.9894
1.5        1.2543   3.8        2.0433
1.6        1.2911   4.0        2.0933

Source: Rard (1992).
15.3.3 The Benzene (1)-i-Propyl Alcohol (2) System

Calculate the binary parameters for the UNIQUAC equation by using the
vapor-liquid equilibrium data for benzene (1)-i-propyl alcohol (2) at 760 mmHg
(Tassios, 1993). The following values for the other UNIQUAC parameters are available
from Tassios (1993): r1 = 3.19, q1 = 2.40, r2 = 2.78, q2 = 2.51. The data are given in
Table 15.6.
Tassios (1993) also reported the following parameter estimates

    \frac{\Delta u_{12}}{R} = -231.5 \ \mathrm{K}                (15.12a)

    \frac{\Delta u_{21}}{R} = 10.6 \ \mathrm{K}                (15.12b)

The objective function to be minimized is given by Equation 15.11. The experimental
values for the activity coefficients are also given in Table 15.6.
Table 15.6 Vapor-Liquid Equilibrium Data and Activity Coefficients for
Benzene (1)-i-Propyl Alcohol (2) at 760 mmHg

Temperature (°C)   x1      y1      γ1       γ2
79.9               0.053   0.140   2.7187   0.9944
78.5               0.084   0.208   2.6494   1.0009
77.1               0.126   0.276   2.4376   1.0145
75.3               0.199   0.371   2.1834   1.0356
74.4               0.240   0.410   2.0540   1.0628
73.6               0.291   0.451   1.9074   1.0965
73.0               0.357   0.493   1.7291   1.1458
72.4               0.440   0.535   1.5492   1.2386
72.2               0.556   0.583   1.3424   1.4152
72.0               0.624   0.612   1.2625   1.5698
72.1               0.685   0.638   1.1944   1.7429
72.4               0.762   0.673   1.1210   2.0614
73.8               0.887   0.760   1.0393   3.0212
77.5               0.972   0.901   1.0025   4.3630

Source: Tassios (1993).
15.3.4 Vapor-Liquid Equilibria of Coal-Derived Liquids: Binary Systems
with Tetralin

Blanco et al. (1994) presented VLE data at 26.66 ± 0.03 kPa for binary systems
of tetralin with p-xylene, γ-picoline, piperidine, and pyridine. The data for
the pyridine (1)-tetralin (2) binary are given in Table 15.7.
Blanco et al. have also correlated the results with the van Laar, Wilson,
NRTL and UNIQUAC activity coefficient models and found all of them able to
describe the observed phase behavior. The value of the parameter α12 in the NRTL
model was set equal to 0.3. The estimated parameters were reported in Table 10 of
the above reference. Using the data of Table 15.7 estimate the binary parameters
in the Wilson, NRTL and UNIQUAC models. The objective function to be minimized
is given by Equation 15.11.
Table 15.7 Vapor-Liquid Equilibrium Data and Activity Coefficients for
Pyridine (1)-Tetralin (2) at 26.66 kPa*

[Table of temperature, liquid and vapor phase mole fractions, and the
activity coefficients of pyridine (γ1) and tetralin (γ2).]

* The standard deviation of the measured compositions is 0.005. The temperature was
measured with a thermometer having 0.01 K-divisions (Blanco et al., 1994).
Source: Blanco et al. (1994).
15.3.5 Vapor-Liquid Equilibria of Ethylbenzene (1)-o-Xylene (2) at
26.66 kPa

Monton and Llopis (1994) presented VLE data at 6.66 and 26.66 kPa for binary
systems of ethylbenzene with m-xylene and o-xylene. The accuracy of the
temperature measurement was 0.1 K and that of the pressure was 0.01 kPa. The
standard deviations of the measured mole fractions were less than 0.001. The data
at 26.66 kPa for the ethylbenzene (1)-o-xylene (2) system are given in Table 15.8 and the
objective is to estimate the NRTL and UNIQUAC parameters based on these data.
The reader should refer to the original reference for further details and may also
use the additional data at 6.66 kPa to estimate the parameters.
Table 15.8 Vapor-Liquid Equilibrium Data for Ethylbenzene (1)-o-Xylene (2)
at 26.66 kPa

Temperature   Mole Fraction of   Mole Fraction of   Activity Coefficient   Activity Coefficient
(K)           Ethylbenzene,      Ethylbenzene,      of Ethylbenzene        of o-Xylene
              Liquid (x1)        Vapor (y1)         (γ1)                   (γ2)
371.75        0.171              0.214              1.020                  1.010
372.45        0.091              0.116              1.063                  1.004
372.85        0.044              0.057              1.109                  1.001
373.25        0.000              0.000

Source: Monton and Llopis (1994).
16
Parameter Estimation in Chemical
Engineering Kinetic Models
A number of examples have been presented in Chapters 4 and 6. The solutions
to all these problems are given here except for the two numerical problems
that were solved in Chapter 4. In addition, a number of problems have been included
for solution by the reader.
16.1 ALGEBRAIC EQUATION MODELS
16.1.1 Chemical Kinetics: Catalytic Oxidation of 3-Hexanol

Gallot et al. (1998) studied the catalytic oxidation of 3-hexanol with hydrogen
peroxide. The data on the effect of the solvent (CH3OH) on the partial conversion,
y, of hydrogen peroxide are given in Table 4.1. The proposed model is:

    y = k_1\left[1 - \exp(-k_2 t)\right]                (16.1)

As mentioned in Chapter 4, although this is a dynamic experiment where
data are collected over time, we consider it as a simple algebraic equation model
with two unknown parameters. The data were given for two different conditions:
(i) with 0.75 g and (ii) with 1.30 g of methanol as solvent. An initial guess of
k1 = 1.0 and k2 = 0.01 was used. The method converged in six and seven iterations
respectively without the need for Marquardt's modification. Actually, if Marquardt's
modification is used, the algorithm slows down somewhat. The estimated
parameters are given in Table 16.1. In addition, the model-calculated values are
compared with the experimental data in Table 16.2. As seen, the agreement is very
good in this case. The quadratic convergence of the Gauss-Newton method is
shown in Table 16.3 where the reduction of the LS objective function is shown
when an initial guess of k1 = 100 and k2 = 10 was used.
Table 16.1 Catalytic Oxidation of 3-Hexanol: Estimated Parameter
Values and Standard Deviations

Mass of CH3OH   k1       k2       σk1      σk2
0.75 g          0.1776   0.1055   0.0095   0.0158
1.30 g          0.1787   0.0726   0.0129   0.0116
Table 16.2 Catalytic Oxidation of 3-Hexanol: Experimental Data and
Model Calculations

Reaction   Partial Conversion
Time       Run with 0.75 g methanol   Run with 1.30 g methanol
           Data      Model            Data      Model
3          0.055     0.048            0.040     0.035
6          0.090     0.083            0.070     0.063
13         0.120     0.133            0.100     0.109
18         0.150     0.151            0.130     0.130
26         0.165     0.166            0.150     0.151
28         0.175     0.168            0.160     0.155
Table 16.3 Catalytic Oxidation of 3-Hexanol: Reduction of the
LS Objective Function (Data for 0.75 g CH3OH)

Iteration   Objective Function   k1       k2
1           59849.1              100.0    4.360
3           0.00035242           0.1769   0.1140
4           0.00029574           0.1769   0.1064
5           0.00029534           0.1776   0.1056
6           0.00029534           0.1776   0.1055
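A compact Gauss-Newton implementation for this two-parameter model is sketched below, using the 0.75 g methanol data of Table 16.2; the sensitivity matrix follows directly by differentiating Equation 16.1 with respect to k1 and k2.

```python
import numpy as np

def gauss_newton(t, y, k, n_iter=10):
    """Gauss-Newton iterations for y = k1*(1 - exp(-k2*t)) (Eq. 16.1)."""
    for _ in range(n_iter):
        e = np.exp(-k[1] * t)
        r = y - k[0] * (1.0 - e)                       # residuals
        J = np.column_stack((1.0 - e, k[0] * t * e))   # sensitivities dy/dk
        k = k + np.linalg.solve(J.T @ J, J.T @ r)      # Gauss-Newton step
    return k

# Data for the 0.75 g methanol run (Table 16.2)
t = np.array([3.0, 6.0, 13.0, 18.0, 26.0, 28.0])
y = np.array([0.055, 0.090, 0.120, 0.150, 0.165, 0.175])
print(gauss_newton(t, y, k=np.array([1.0, 0.01])))  # approaches (0.1776, 0.1055)
```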
16.1.2 Chemical Kinetics: Isomerization of Bicyclo [2,1,1] Hexane

Data on the thermal isomerization of bicyclo [2,1,1] hexane were measured
by Srinivasan and Levi (1963). The data are given in Table 4.4. The following
nonlinear model was proposed to describe the fraction of original material remaining
(y) as a function of time (x1) and temperature (x2):

    y = \exp\left\{-k_1 x_1 \exp\left[-k_2\left(\frac{1}{x_2} - \frac{1}{620}\right)\right]\right\}                (16.2)

This problem was described in Chapter 4 (Problem 4.3.4). An initial guess of
k(0) = (0.001, 10000) was used and convergence of the Gauss-Newton method
without the need for Marquardt's modification was achieved in five iterations. The
reduction in the LS objective function as the iterations proceed is shown in Table
16.4. In this case the initial guess was fairly close to the optimum, k*. As the initial
guess is moved further away from k*, the number of iterations increases. For example, if
we use as initial guess k(0) = (1, 1000000), convergence is achieved in eight iterations.
At the optimum, the following parameter values and standard deviations
were obtained: k1 = 0.0037838 ± 0.000057 and k2 = 27643 ± 461.
Table 16.4 Isomerization of Bicyclo [2,1,1] Hexane:
Reduction of the LS Objective Function

Iteration   Objective Function   k1 × 10^2   k2
0           2.2375               0.1         10000
1           0.206668             0.22663     38111
2           0.029579             0.36650     25038
3           0.0102817            0.37380     27677
4           0.0102817            0.37384     27643
5           0.0102817            0.37383     27643
Using the above parameter estimates the fraction of original material of bicyclo
[2,1,1] hexane was calculated and is shown together with the data in Table
16.5. As seen, the model matches the data well.
This problem was also solved with MATLAB (Student version 5, The MathWorks
Inc.). Using as initial guess the values (k1 = 0.001, k2 = 10000), convergence
of the Gauss-Newton method was achieved in 5 iterations to the values
(k1 = 0.003738, k2 = 27643). As expected, the parameter values that were obtained
are the same as those obtained with the Fortran program. In addition, the same
standard errors of the parameter estimates were computed.
Table 16.5 Isomerization of Bicyclo [2,1,1] Hexane: Experimental Data and
Model Calculated Values

x1      x2    y (data)  y (model)      x1      x2    y (data)  y (model)
120.0   600   .900      .9035          60.0    620   .802      .7991
60.0    600   .949      .9505          60.0    620   .802      .7991
60.0    612   .886      .8823          60.0    620   .804      .7991
120.0   612   .785      .7784          60.0    620   .794      .7991
120.0   612   .791      .7784          60.0    620   .804      .7991
60.0    612   .890      .8823          60.0    620   .799      .7991
60.0    620   .787      .7991          30.0    631   .764      .7835
30.0    620   .877      .8939          45.1    631   .688      .6930
15.0    620   .938      .9455          30.0    631   .717      .7835
60.0    620   .782      .7991          30.0    631   .802      .7835
45.1    620   .827      .8448          45.0    631   .695      .6935
90.0    620   .696      .7143          15.0    639   .808      .8097
150.0   620   .582      .5708          30.0    639   .655      .6556
60.0    620   .795      .7991          90.0    639   .309      .2818
60.0    620   .800      .7991          25.0    639   .689      .7034
60.0    620   .790      .7991          60.1    639   .437      .4292
30.0    620   .883      .8939          60.0    639   .425      .4298
90.0    620   .712      .7143          30.0    639   .638      .6556
150.0   620   .576      .5708          30.0    639   .659      .6556
90.4    620   .715      .7132          60.0    639   .449      .4298
120.0   620   .673      .6385
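The same fit can be reproduced with a general-purpose least-squares solver, as sketched below; the Levenberg-Marquardt option plays the role of Marquardt's modification discussed above.

```python
import numpy as np
from scipy.optimize import least_squares

def model(k, x1, x2):
    """Fraction of original material remaining, Equation 16.2."""
    return np.exp(-k[0] * x1 * np.exp(-k[1] * (1.0 / x2 - 1.0 / 620.0)))

def fit(x1, x2, y):
    """LS fit; from k0 = (0.001, 10000) the text reports convergence
    to k = (0.0037838, 27643)."""
    res = lambda k: model(k, x1, x2) - y
    return least_squares(res, x0=np.array([1.0e-3, 1.0e4]), method="lm").x
```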
16.1.3 Catalytic Reduction of Nitric Oxide
As another example from chemical kinetics, we consider the catalytic reduction
of nitric oxide (NO) by hydrogen which was studied using a flow reactor
operated differentially at atmospheric pressure (Ayen and Peters, 1962). The reduction
of NO by hydrogen was considered to be the important reaction.
Data were taken at 375°C, 400°C, and 425°C using nitrogen as the diluent.
The reaction rate in gmol/(min·g-catalyst) and the total NO conversion were
measured at different partial pressures for H2 and NO.
A Langmuir-Hinshelwood reaction rate model for the reaction between an
adsorbed nitric oxide molecule and one adjacently adsorbed hydrogen molecule is
described by:

    r = \frac{k\, K_{NO} K_{H_2}\, p_{NO}\, p_{H_2}}{\left(1 + K_{NO}\, p_{NO} + K_{H_2}\, p_{H_2}\right)^2}                (16.3)

where r is the reaction rate in gmol/(min·g-catalyst), p_H2 is the partial pressure of
hydrogen (atm), p_NO is the partial pressure of NO (atm), K_NO = A2 exp{-E2/RT}
atm^{-1} is the adsorption equilibrium constant for NO, K_H2 = A3 exp{-E3/RT} atm^{-1} is
the adsorption equilibrium constant for H2 and k = A1 exp{-E1/RT} gmol/(min·g-catalyst)
is the forward reaction rate constant for the surface reaction. The data for the
above problem are given in Table 4.5.
The objective of the estimation procedure is to determine the parameters k,
K_H2 and K_NO (if data from only one isotherm are considered) or the parameters A1,
A2, A3, E1, E2, E3 (when all data are regressed together). The units of E1, E2, E3 are
cal/mol and R is the universal gas constant (1.987 cal/(mol·K)).
Kittrell et al. (1965a) considered three models for the description of the reduction
of nitric oxide. The one given in Chapter 4 corresponds to a reaction between
one adsorbed molecule of nitric oxide and one adsorbed molecule of hydrogen.
This was done on the basis of the shape of the curves passing through the
plotted data.
In this work, we first regressed the isothermal data. The estimated parameters
from the treatment of the isothermal data are given in Table 16.6. An initial
guess of (k1=1.0, k2=1.0, k3=1.0) was used for all isotherms and convergence of
the Gauss-Newton method without the need for Marquardt's modification was
achieved in 13, 16 and 15 iterations for the data at 375, 400, and 425°C respectively.
Plotting of ln k_i (i=1,2,3) versus 1/T shows that only k1 exhibits Arrhenius
type of behavior. However, given the large standard deviations of the other two
estimated parameters, one cannot draw definite conclusions about these two parameters.
Table 16.6 Catalytic Reduction of NO: Estimated Model Parameters
by the Gauss-Newton Method Using Isothermal Data

Temperature (°C)   (k1 ± σk1) × 10^4   k2 ± σk2      k3 ± σk3
375                5.2 ± 1.2           18.5 ± 3.4    13.2 ± 3.4
400                5.5 ± 3.2           31.5 ± 13.0   35.9 ± 14.0
425                13.5 ± 8.0          25.9 ± 10.3   14.0 ± 8.9
Kittrell et al. (1965a) also performed two types of estimation. First the data
at each isotherm were used separately and subsequently all data were regressed
simultaneously. The regression of the isothermal data was also done with linear
least squares by linearizing the model equation. In Tables 16.7 and 16.8 the reported
parameter estimates are given together with the reported standard errors.
Ayen and Peters (1962) have also reported values for the unknown parameters and
they are given here in Table 16.9.
Table 16.7 Catalytic Reduction of NO: Estimated Model Parameters by
Linear Least Squares Using Isothermal Data

Temperature (°C)   (k1 ± σk1) × 10^4   k2 ± σk2      k3 ± σk3
375                4.9 ± 0.7           18.8 ± 4.6    14.6 ± 2.9
400                5.3 ± 8.5           38.6 ± 19.6   35.4 ± 11.3
425                8.8 ± 2.3           48.9 ± 31.3   30.9 ± 20.2

Source: Kittrell et al. (1965a).
Table 16.8 Catalytic Reduction of NO: Estimated Model Parameters by
Nonlinear Least Squares Using Isothermal Data

Temperature (°C)   (k1 ± σk1) × 10^4   k2 ± σk2      k3 ± σk3
375                5.19 ± 0.9          18.5 ± 3.4    13.2 ± 3.4
400                5.54 ± 1.2          31.6 ± 12.9   36.0 ± 13.9
425                10.1 ± 3.0          34.5 ± 15.2   23.1 ± 11.6

Source: Kittrell et al. (1965a).
Table 16.9 Catalytic Reduction of NO: Estimated Model Parameters by
Nonlinear Least Squares

Temperature (°C)   k1 × 10^4   k2      k3
375                4.94        19.00   14.64
400                7.08        30.45   20.96
425                8.79        48.55   30.95

Source: Ayen and Peters (1962).
Table 16.10 Catalytic Reduction of NO: Estimated Model Parameters by
Nonlinear Least Squares Using Nonisothermal Data

Temperature (°C)   (k1 ± σk1) × 10^4   k2 ± σk2      k3 ± σk3
375                4.92 ± 3.71         15.5 ± 13.4   17.5 ± 11.5
400                6.58 ± 3.94         26.3 ± 18.3   23.8 ± 12.8
425                8.63 ± 3.92         42.9 ± 23.6   31.7 ± 14.6

Source: Kittrell et al. (1965a).
Kittrell et al. (1965a) also used all the data simultaneously to compute the
parameter values. These parameter values are reported for each temperature and
are given in Table 16.10.
Writing Arrhenius-type expressions, k_j = A_j exp(-E_j/RT), for the kinetic constants,
the mathematical model with six unknown parameters (A1, A2, A3, E1, E2
and E3) becomes

    r = \frac{A_1 A_2 A_3\, e^{-(E_1+E_2+E_3)/RT}\, x_1 x_2}{\left(1 + A_2 e^{-E_2/RT} x_2 + A_3 e^{-E_3/RT} x_1\right)^2}                (16.4)

where x1 and x2 denote the partial pressures of H2 and NO respectively.
The elements of the sensitivity coefficient matrix G are
obtained by evaluating the partial derivatives:

    G_{11} = \frac{\partial r}{\partial A_1} = \frac{A_2 A_3\, e^{-(E_1+E_2+E_3)/RT}\, x_1 x_2}{Y^2}                (16.5)

    G_{12} = \frac{\partial r}{\partial A_2} = \frac{A_1 A_3\, e^{-(E_1+E_2+E_3)/RT}\, x_1 x_2}{Y^2} - \frac{2 A_1 A_2 A_3\, e^{-(E_1+2E_2+E_3)/RT}\, x_1 x_2^2}{Y^3}                (16.6)

    G_{13} = \frac{\partial r}{\partial A_3} = \frac{A_1 A_2\, e^{-(E_1+E_2+E_3)/RT}\, x_1 x_2}{Y^2} - \frac{2 A_1 A_2 A_3\, e^{-(E_1+E_2+2E_3)/RT}\, x_1^2 x_2}{Y^3}                (16.7)

    G_{14} = \frac{\partial r}{\partial E_1} = -\frac{A_1 A_2 A_3\, e^{-(E_1+E_2+E_3)/RT}\, x_1 x_2}{RT\, Y^2}                (16.8)

    G_{15} = \frac{\partial r}{\partial E_2} = -\frac{A_1 A_2 A_3\, e^{-(E_1+E_2+E_3)/RT}\, x_1 x_2}{RT\, Y^2} + \frac{2 A_1 A_2^2 A_3\, e^{-(E_1+2E_2+E_3)/RT}\, x_1 x_2^2}{RT\, Y^3}                (16.9)

    G_{16} = \frac{\partial r}{\partial E_3} = -\frac{A_1 A_2 A_3\, e^{-(E_1+E_2+E_3)/RT}\, x_1 x_2}{RT\, Y^2} + \frac{2 A_1 A_2 A_3^2\, e^{-(E_1+E_2+2E_3)/RT}\, x_1^2 x_2}{RT\, Y^3}                (16.10)

where

    Y = 1 + A_2 e^{-E_2/RT} x_2 + A_3 e^{-E_3/RT} x_1                (16.11)
The results were obtained using three different sets of initial guesses which
were given by Kittrell et al. (1965b). None of them was good enough to converge
to the global optimum. In particular, the first two converged to local optima and the
third diverged. The lowest LS objective function was obtained with the first initial
guess and it was 0.1464×10^-6. The corresponding estimated parameter values were
A1 = 0.8039 ± 0.3352, A2 = 1.371×10^5 ± 6.798×10^4, A3 = 1.768×10^7 ± 8.739×10^6,
E1 = 9520 ± 0.4×10^4, E2 = 11500 ± 0.9×10^4 and E3 = 17900 ± 0.9×10^4.
In this problem it is very difficult to obtain convergence to the global optimum
as the condition number of matrix A at the above local optimum is 3×10^13.
Even if this were the global optimum, a small change in the data would result in
widely different parameter estimates since this parameter estimation problem appears
to be fairly ill-conditioned.
At this point we should always try and see whether there is anything else
that could be done to reduce the ill-conditioning of the problem. Upon reexamination
of the structure of the model given by Equation 16.4 we can readily notice
that it can be rewritten as
    r = \frac{A_1^*\, e^{-E_1^*/RT}\, x_1 x_2}{\left(1 + A_2 e^{-E_2/RT} x_2 + A_3 e^{-E_3/RT} x_1\right)^2}                (16.12)

where

    A_1^* = A_1 A_2 A_3                (16.13)

and

    E_1^* = E_1 + E_2 + E_3                (16.14)
The reparameterized model has the same number of unknown parameters
(A1*, A2, A3, E1*, E2 and E3) as the original problem; however, it has a simpler
structure. This often results in much better convergence characteristics of the iterative
estimation algorithm. Indeed, convergence to the global optimum was obtained
after many iterations using Marquardt's modification. The value of Marquardt's
parameter was always kept one order of magnitude greater than the
smallest eigenvalue of matrix A. At the optimum, a value of zero for Marquardt's
parameter was used and convergence was maintained.
The LS objective function was found to be 0.7604×10^-9. This value is almost
three orders of magnitude smaller than the one found earlier at a local optimum.
The estimated parameter values were: A1 = 22.672, A2 = 132.4, A3 = 585320,
E1 = 13899, E2 = 2439.6 and E3 = 13506, where parameters A1 and E1 were estimated
back from A1* and E1*. With this reparameterization we were able to lessen the ill-conditioning
of the problem since the condition number of matrix A was now
5.6×10^8.
The model-calculated reaction rates are compared to the experimental data
in Table 16.11 where it can be seen that the match is quite satisfactory. Based on
the six estimated parameter values, the kinetic constants (k1, k2 and k3) were computed
at each temperature and they are shown in Table 16.12.
Having found the optimum, we returned to the original structure of the
problem and used an initial guess fairly close to the global optimum. In this case
the parameters converged very close to the optimum where the LS objective function
was 0.774×10^-9. The condition number of matrix A was found to be 1.7×10^13,
which is about 5 orders of magnitude higher than the one for the reparameterized
formulation calculated at the same point. In conclusion, reparameterization should
be seriously considered for hard to converge problems.
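A sketch of the reparameterized rate expression and of the conditioning check discussed above is given below; the Jacobian J is assumed to be evaluated at the converged estimates.

```python
import numpy as np

R = 1.987  # cal/(mol K)

def rate(k, x1, x2, T):
    """Reparameterized rate, Equation 16.12.
    k = (A1s, A2, A3, E1s, E2, E3) with A1s = A1*A2*A3, E1s = E1+E2+E3."""
    A1s, A2, A3, E1s, E2, E3 = k
    Y = 1.0 + A2 * np.exp(-E2 / (R * T)) * x2 + A3 * np.exp(-E3 / (R * T)) * x1
    return A1s * np.exp(-E1s / (R * T)) * x1 * x2 / Y**2

def condition_number(J):
    """cond(A) with A = J^T J, the Gauss-Newton normal-equation matrix."""
    return np.linalg.cond(J.T @ J)
```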
Table 16.11 Catalytic Reduction of NO: Experimental Data and Model
Calculated Reaction Rates

p_H2 (atm)   p_NO (atm)   r (data)   r (model)
0.0100       0.0500       3.64       3.062
0.0153       0.0500       4.77       4.275
0.0228       0.0500       5.41       5.954
0.0311       0.0500       6.61       7.011
0.0402       0.0500       6.86       7.803
0.0500       0.0500       8.79       8.357
0.0500       0.0270       6.61       6.249
0.0500       0.0361       7.94       7.298
0.0500       0.0432       7.82       7.907

T = 425°C, Weight of catalyst = 1.066 g
0.00474      0.0500       5.02       3.550
0.0136       0.0500       7.23       7.938
0.0269       0.0500       9.29       9.586
0.0302       0.0500       9.75       10.29
0.0500       0.0500       13.91      13.29
0.0500       0.0290       11.35      11.68
0.0500       0.0387       11.89      11.81
0.0500       0.0400       13.00      12.82
Table 16.12 Catalytic Reduction of NO: Estimated Model Parameters by
Nonlinear Least Squares Using All the Data

Temperature (°C)   k1 × 10^4   k2      k3
375                4.65        19.91   16.29
400                6.94        21.36   24.05
425                10.07       22.80   34.53
16.2 PROBLEMS WITH ALGEBRAIC MODELS
The following parameter estimation problems were formulated from research
papers available in the literature and are left as exercises.
16.2.1 Catalytic Dehydrogenation of sec-butyl Alcohol

Data for the initial reaction rate for the catalytic dehydrogenation of sec-butyl
alcohol to methyl ethyl ketone are given in Table 16.13 (Thaller and Thodos,
1960; Shah, 1965). The following two models were considered for the initial rate:

Model A

    r_A = R - \sqrt{R^2 - kh}                (16.15)

where

    (16.16)

Model B

    (16.17)

where h = -0.7, K_A is the adsorption equilibrium constant for sec-butyl alcohol, k_H
is the rate coefficient for the rate of hydrogen desorption controlling, k_R is the rate
coefficient for surface reaction controlling, and P_A is the partial pressure of sec-butyl
alcohol.

Table 16.13 Data for the Catalytic Dehydrogenation of sec-butyl Alcohol
Source: Thaller and Thodos (1960); Shah (1965).
Table 16.14 Parameter Estimates for Models A and B

Model   Temperature   kH × 10^2            kR × 10^2            KA × 10
        (°F)          (lbmoles alcohol/    (lbmoles alcohol/    (atm^-1)
                      (hr·lb catalyst))    (hr·lb catalyst))
A       575           7.65                 23.5                 44.4
A       600           9.50                 62.8                 51.5
B       575           11.5                 20.2                 40.7
B       600           7.89                 81.7                 53.5

Source: Shah (1965).
Using the initial rate data given above do the following: (a) Determine the
parameters kR, kH and KA for Model A and Model B and their 95% confidence
intervals; and (b) using the parameter estimates calculate the initial rate and compare
it with the data. Shah (1965) reported the parameter estimates given in Table
16.14.
16.2.2 Oxidation of Propylene

The following data given in Tables 16.15, 16.16 and 16.17 on the oxidation
of propylene over bismuth molybdate catalyst were obtained at three temperatures,
350, 375, and 390°C (Watts, 1994).
One model proposed for the rate of propylene disappearance, rp, as a function
of the oxygen concentration, Co, the propylene concentration, Cp, and the stoichiometric
number, n, is

    r_p = \frac{k_o k_p\, C_o^{1/2} C_p}{k_o C_o^{1/2} + n k_p C_p}                (16.18)

where ko and kp are the rate parameters.
Table 16.15 Data for the Oxidation of Propylene at 350°C

Cp     Co     rp      n
3.17   1.24   0.452   3.00
1.37   3.18   0.439   2.86
3.05   3.07   0.658   2.73
4.31   3.15   0.635   2.60
3.02   3.85   0.695   2.64
2.96   3.13   0.642   2.69
3.11   6.48   0.760   2.56
2.78   3.89   0.670   2.73
1.38   7.79   0.483   2.87
1.46   7.93   0.525   2.91
2.84   3.14   0.665   2.77
1.42   8.03   0.522   2.97
3.01   3.03   0.635   2.75
1.49   7.78   0.530   2.93
1.35   8.00   0.480   2.90
5.68   7.75   0.996   2.41
3.18   7.89   0.835   2.59
1.42   1.25   0.367   2.86
1.36   3.10   0.416   2.81
2.87   2.76   0.609   3.06

Source: Watts (1994).
Table 16.16 Data for the Oxidation of Propylene at 375°C

Cp     Co     rp      n
2.94   2.37   1.160   2.96
1.35   3.06   0.680   2.58
2.90   3.70   1.170   2.19
3.04   1.19   0.740   2.24
4.14   3.03   1.390   2.32
2.99   6.23   1.290   2.16
2.69   3.76   1.190   2.31
2.85   3.03   1.130   2.25
1.39   7.67   0.804   2.63
5.46   7.46   2.030   1.93
1.34   1.15   0.630   2.58
1.39   7.56   0.772   2.53
1.46   7.65   0.864   2.64
2.73   3.02   1.080   2.16
1.33   7.49   0.777   2.64
7.02   2.93   1.310   2.25
1.37   7.75   0.745   2.51
2.89   2.91   1.160   2.27
1.35   7.66   0.741   2.55
7.30   2.96   1.360   2.22
2.75   2.93   1.050   2.15
3.15   7.52   1.440   2.14

Source: Watts (1994).
The objective is to determine the parameters and their standard errors by the
Gauss-Newton method for each temperature and then check to see if the parameter
estimates obey Arrhenius type behavior.
Watts (1994) reported the following parameter estimates at 350°C:
ko = 1.334 ± 0.081 [(mmol/L)^0.5/(g·s)] and kp = 0.611 ± 0.055 [1/(g·s)]. Similar results
were found for the data at the other two temperatures.
The parameter values were then plotted versus the inverse temperature and
were found to follow an Arrhenius type relationship

    k_j = A_j \exp\left(-\frac{E_j}{RT}\right)                (16.19)

where Aj is the pre-exponential factor and Ej the activation energy. Both numbers
should be positive.
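A minimal sketch of the isothermal fit is given below; the initial guess is an assumption, and scipy's least-squares solver stands in for the Gauss-Newton procedure.

```python
import numpy as np
from scipy.optimize import least_squares

def rp(k, Co, Cp, n):
    """Propylene disappearance rate, Equation 16.18."""
    ko, kp = k
    return ko * kp * np.sqrt(Co) * Cp / (ko * np.sqrt(Co) + n * kp * Cp)

def fit_isothermal(Co, Cp, n, r_obs):
    """LS estimates of (ko, kp) at one temperature; Watts (1994) reports
    ko = 1.334 +/- 0.081 and kp = 0.611 +/- 0.055 at 350 C."""
    res = lambda k: rp(k, Co, Cp, n) - r_obs
    return least_squares(res, x0=np.array([1.0, 1.0])).x
```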
Table 16.17 Data for the Oxidation of Propylene at 390°C

Cp     Co     rp      n
3.66   1.48   1.950   2.62
3.02   6.12   1.800   1.92
2.79   2.96   1.510   2.00
3.07   7.32   1.900   1.96
1.31   1.12   0.805   2.33
1.36   7.52   0.990   2.36
1.42   7.47   0.991   2.26
6.86   2.86   2.210   2.06
2.72   3.48   1.520   1.93
7.13   2.89   2.300   2.10
7.09   3.27   2.430   2.16
1.32   7.48   0.936   2.36
2.88   3.76   1.640   1.85
1.37   7.89   0.996   2.39
7.14   3.22   2.300   2.10
1.33   7.84   0.975   2.38
1.31   2.90   0.823   2.28
5.39   7.25   2.760   1.76
2.74   3.54   1.530   1.84
5.29   7.23   2.760   1.75
2.89   7.48   1.790   1.83

Source: Watts (1994).
Subsequently, Watts performed a parameter estimation by using the data
from all temperatures simultaneously and by employing the formulation of the rate
constants as in Equation 16.19. The parameter values that were found, as well as
their standard errors, are reported in Table 16.18. It is noted that the
residuals from the fit were well behaved except for two at 375°C. These residuals
were found to account for 40% of the residual sum of squares of deviations between
experimental data and calculated values.
Watts (1994) dealt with the issue of confidence interval estimation when estimating
parameters in nonlinear models. He proceeded with the reformulation of
Equation 16.19 because the pre-exponential parameter estimates "behaved highly
nonlinearly." The rate constants were formulated as follows

    k_j = k_j^0 \exp\left[-\frac{E_j}{R}\left(\frac{1}{T} - \frac{1}{T_0}\right)\right]                (16.20)

with

    k_j^0 = A_j \exp\left(-\frac{E_j}{RT_0}\right)                (16.21)
Tnble 16.18 Pnrameter Estimates for the h4odel~for the 0.vidution of
Propylene (Noncentered Formzrlation)
A"
7.89 28. I
E,
209.4 145.1
A,,
5.9 105.6
Eo
9.77 8.94
-Sozrrce: Watts ( 1994).
AS
0.07 1 2.74
Eo
5.9 105.6
I 4 1 0.794 I 0.02s I
E,
7.89 28.1
-Source. Watts ( 1 994).
Thus, in order to improve the behavior of the parameter estimates, Watts
(1994) centers the temperature factor about a reference value To which was chosen
to be the middle temperature of 375C (648 A?. The paranleters estimates and
their standard errors are given in Table 16.19.
You are asked to veri@ the calculations of Watts (1994) using the Gauss-
Newton method. You are also asked to determine by how much the condition
number of matrix A is improved when the centered formulation is used.
16.2.3 Model Reduction Through Parameter Estimation in the +Domain
Quite often we are face with the task of reducing the order of a transfer
function without losing essential dynamic behavior of the system. Many methods
have been proposed for model reduction, however quite often with unsatisfactory
results. A reliable method has been suggested by LUUS (1980) where the devia-
tions between the reduced model and the original one in the Nyquist plot are
minimized.
Parameter Estimation in Chemical Engineering Kinetics Models 30 1
Consider the following 8* order system
194480 +482964s +5 1 181 2s2 + 278376~~ + 82402~~ +I 3285s5 + 1086s6 + 35s7
9600+ 28880s+37492s + 27470~~ + 1 1870s4 +301 7s5 +437s6 +33s7 + s8
f(s) =
( I 6.22)
The poles of the transfer function (roots of the denominator) are at - 1 , -lkj,
-3, -4, -5, -8 and - 1 0. Let us assume that we seek a third order system that follows
as closely as possible the behavior of the high order system. Namely, consider
20.2583( 1 + kls + k2s2)
g(s>=
I + k 3 ~ + kqs 2 + k5s 3
(1 6.23)
The constant in the numerator can always be chosen to preserve the steady
state gain of the transfer fhction. As suggested by Luus (1980) the 5 unknown
parameters can be obtained by minimizing the following quadratic objective func-
tion
N
S(k) = C[Re(fCiwi))-Re(gCjwi))l2 +[I m(fCj ~i ))-z~(gCj ~i ))]~ (16.24)
i=l
where j = 0 , ~,-+,=l .l xo,, ol=0.O1 and N=100.
In this problem you are asked to determine the unknown parameters using
the dominant zeros and poles of the original system as an initial guess. LJ optimi-
zation procedure can be used to obtain the best parameter estimates.
Redo the problem but take N= 1,000, 10,000 and 100,000.
After the parameters have been estimated, generate the Nyquist plots for the
reduced models and the original one. Comment on the result at high fiequencies.
Is N=lOO a wise choice?
Redo this problem. However, this time assume that the reduced model is a
fourth order one. Namely, it is of the form
20.2583( 1 + kl s + k2s2 + k3s3)
1 + k4s+ k5s2 + kgs 3 + k7s4
g(s) =
How important is the choice of wl=O.O1?
(1 6.25)
302 Chapter 16
16.3 ORDINARY DIFFERENTIAL EQUATION MODELS
The formulation for the next three problems of the parameter estimation
problem was given in Chapter 6. These examples were formulated with data from
the literature and hence the reader is strongly recommended to read the original
papers for a thorough understanding of the relevant physical and chemical phe-
nomena.
16.3.1 A Homogeneous Gas Phase Reaction
Bellman et al. (1967) have considered the estimation of the two rate con-
stants k, and k2 in the Bodenstein-Linder model for the homogeneous gas phase
reaction of NO with 02:
2NO+ 0 2 t) 2N02
The model is described by the following equation
dx 2 2
- = kl (a - x)@ - x) - k2x
dt
x(0) = 0 ( 1 6.26)
where a=126.2, p=91.9 and x is the concentration of NO2. The concentration of
NO2 was measured experimentally as a function of time and the data are given in
Table 6. I
Bellman et al. (1967) employed the quasilinearization technique and ob-
tained the following parameter estimates: kl=0.4577x I 0-5 and k2=0.2797x I O.
Bodenstein and Lidner who had obtained the kinetic data reported slightly differ-
ent values: k,=0.53~10-~ and k2=0.4 1 x10. The latter values were obtained by a
combination of chemical theory and the data. The residual sum of squares of de-
viations was found to be equal to 0.21 O X ~O - ~. The corresponding value reported by
Bodenstein and Lidner who had obtained the kinetic data is 0.555~ I O-2. Bellman et
al. (1967) stated that the difference does not reflect one set of parameters being
better than the other.
Using the computer program Bayes-ODE1 which is given i n Appendix 2 the
following parameter estimates were obtained: kl=0.4577x 1 0-5 and k2=0.2796x I O
using as initial guess kl=O.l x10m5 and k2=0.1 x ~O - ~. Convergence was achieved in
seven iterations as seen in Table 16.20. The calculated standard deviations for the
parameters kl and k2 were 3.3% and 18.8% respectively. Based on these parameter
values, the concentration of NO? was computed versus time and compared to the
experimental data as shown i n Figure 16.1. The overall fit is quite satisfactory.
Parameter Estimation in Chemical Engineering Kinetics Models
303
Table 16.20 Homogeneous Gas Phase Reaction: Convergence
of the Gauss-Newton Method
Iteration
k2 kl
LS objective function
0
o.1x10-5 0.1 x
401 1.7
1
0.40790~ 1 0-3 0.39707~ 1 Oms
123.35 3
0.1 1655~10-~ 0.35410~10-~
774.15 2
0.25876~10-* 0.33825~ 1 0-5
1739.9
4
0.27962~ 1 0-3 0.4577 1 x1 O-
2 1.867 7
0.280 17x1 0-3 0.45786~ 1 0-5
2 1 367 5
0.26482~ 10 0.4558 1 x l 0-5
22.020
6
0.27959~10-~ 0.45770~
2 1.867
0 10 20 30 40 50
Time
Figure 16.1 Homogeneous Gas Phase Reaction: Experimental dara and
model calcdated values of the NO2 concentration.
16.3.2 Pyrolytic Dehydrogenation of Benzene to Diphenyl and Triphenyl
Let us now consider the pyrolytic dehydrogenation of benzene to diphenyl
and triphenyl (Seinfeld and Gavalas, 1970; Hougen and Watson, 1948):
304
Chapter 16
The following kinetic model has been proposed
where
dx 1
- = -rl - r2
dt
dx2 - rl r2
""
dt 2
(1 6.27a)
( 16.27b)
(1 6.28a)
( I 6.28b)
and where xl denotes 16-moles of benzene per lb-ntok of pure benzene feed and x?
denotes lb-moles of diphenyl per Ih-ntole of pure benzene feed. The parameters kl
and k2 are unknown reaction rate constants whereas K1 and K2 are equilibrium
constants. The data consist of measurements of xI and x2 in a flow reactor at eight
values of the reciprocal space velocity t and are given in Table 6.2. The feed to the
reactor was pure benzene. The equilibrium constants K I and K2 were determined
from the run at the lowest space velocity to be 0.242 and 0.428, respectively.
Seinfeld and Gavalas (1970) employed the quasilinearization method to
estimate the parameters for the kinetic model that was proposed by Hougen and
Watson ( 1 948). Seinfeld and Gavalas (1 970) examined the significance of having
a good initial guess. As weighting matrix in the LS objective function they used
the identity matrix. I t was found that with an initial guess kI=k2=300 or kl=kz=SOO
convergence was achieved in four iterations. The corresponding estimated pa-
rameters were 347.4, 403.1. The quasilinearization algorithm diverged when the
initial guesses were kl=k2=100 or kl=k2=l 000. It is interesting to note that Hougen
and Watson (1948) reported the values of kl=348 and k2=404 [(lh-moles)/3.59
@~( hr) ( i d) J. Seinfeld and Gavalas pointed out that it was a coincidence that the
values estimated by Hougen and Watson in a rather "crude" manner were very
close to the values estimated by the quasilinearization method.
Subsequently, Seinfeld and Gavalas examined the role of the weighting
tnatrix and the role of using fewer data points. It was found that the estimated val-
ues at the global minimum are not affected appreciably by using different weight-
ing factors. They proposed this as a quick test to see whether the global minimum
has been reached and as a means to move away fi-om a local minimum of the LS
objective function. It was also found that there is some variation in the estimates of
k? as the number of data points used i n the regression i s reduced. The estimate of
kl was found to remain constant. The same problem was also studied by Kalo-
gerakis and Luus ( 1 983) to demonstrate the substantial enlargement of the region
of convergence for the Gauss-Newton method through the use of the information
index discussed in Chapter 8 (Section 8.7.2.2).
Parameter Estimation in Chemical Engineering Kinetics Models 305
Using the program provided in Appendix 2 and starting with an initial
guess far from the optimum (kl=k2=10000), the Gauss-Newton method converged
within nine iterations. Marquardts parameter was zero at all times. The reduction
in the LS objective function is shown in Table 16.2 1. As weighting matrix, the
identity matrix was used. The uncertainty in the parameter estimates is quite small,
namely, 0.18 1% and 0.857% for k, and k2 respectively. The corresponding match
between the experimental data and the model-calculated values is shown in Table
1 6.22.
Table 16.2 I Benzene Dehydrogenation: Convergence of the Gauss-
Newton Method
2
1479.5 1885.2 0.21317 3
1526.1 4254.5 0.2584 1
4
5 15.53 436.83
0.75509~ IO-
5
893.62 703.35
0.65848~ lo-
6 378.92 343.05
0.23063~ 1 0-3
7
400.23 354.61
0.70523 x 1 O- 9
400.24 3 54.60
0.70523~ I 0-5
8
399.62 354.32
0.7181 1~10- ~
Standard Deviation 3.43 0.642
Table 16.22 Benzene Dehydrogenation: Experimental Data and Model
Calculated Values
Reciprocal Space
Velocity x I o4
xI (data) x2 (model) x2(data) xI (model)
+
5.63
0.1477 0.1476 0.44330 0.443 169.7
0.1484 0.1477 0.46934 0.470 45.20
0.1481 0.1477 0.48 1 IO 0.482 39.70
0. I473 0.1468 0.49898 0.499 34.00
0.1407 0.1400 0.56433 0.565 22.62
0.1313 0.1322 0.62 147 0.622 16.97
0.1 122 0.1 130 0.70541 0.704 I I .32
0.0738 0.0737 0.82833 0.828
306
Chapter 16
If instead of the identity matrix, we use QI=n'iag(y1(tl)-', y2(t1)-?) as a time
varying weighting matrix, we arrive at k*=[355.55, 402.91IT which is quite close
to k*=[354.6 I , 400.23IT obtained with Q=I. This choice of the weighting matrix
assumes that the error in the measured concentration is proportional to the value of
the measured variable, whereas the choice of Q=I assumes a constant standard
error in the measurement. The parameter estimates are essentially the same be-
cause the measurements of x1 and x2 do not differ by more than one order of mag-
nitude. In general, the computed uncertainty in the parameters i s expected to have
a higher dependence on the choice of Q. The standard estimation error of k, al-
most doubled from 0.642 to l .17, while that of k2 increased marginally from 3.43
to 3.62.
Parameter Estimation in Chemical Engineering Kinetics Models 307
The region of convergence can be substantially enlarged by implementing
the simple bisection rule for step-size control or even better by implementing the
optimal step-size policy described in Section 8.7.2.1. Furthermore, it is quite im-
portant to use a robust integration routine. Kalogerakis and Luus (1983) found that
the use of a stiff differential equation solver (like IMSL routine DGEAR which
uses Gears method) resulted in a large expansion of the region of convergence
compared to IMSL routine DVERK (using J .H. Verners Runge-Kutta formulas of
5t and 6th order) or to the simple 4th order Runge-Kutta method. Although this
problem is nonstiff at the optimum, this may not be the case when the initial pa-
rameter estimates far from the optimum and hence, use of a stiff ODE solver is
generally beneficial. The effects of the step-size policy, use of a robust integration
routine and use of the information index on the region of convergence are shown
in Figure 16.2.
As seen in Figure 16.2, the effect of the information index and the integra-
tion routine on the region of is very significant. It should be noted that use of
DGEAR or any other stiff differential equation solver does not require extra pro-
gramming effort since an analytical expression for the jacobean (afT/ax)T is also
required by the Gauss-Newton method. Actually quite often this also results in
savings in computer time as the parameters change from iteration to iteration and
the system ODES could become stiff and their integration requires excessive com-
puter effort by nonstiff solvers (Kalogerakis and Luus, 1983). In all the computer
programs for ODE models provided in Appendix 2, we have used DIVPAG, the
latest IMSL integration routine that employs Gears method.
16.3.3 Catalytic Hydrogenation of 3-Hydroxypropanai (HPA) to
1,3-Propanediol (PD)
The hydrogenation of 3-hydroxypropanal (HPA) to 1,3-propanedioI (PD)
over Ni/SiO2/AI2Oj catalyst powder was studied by Professor Hoffmans group at
the Friedrich-Alexander University in Erlagen, Germany (Zhu et al., 1997). PD is
a potentially attractive monomer for polymers like polypropylene terephthalate.
They used a batch stirred autoclave. The experimental data were kindly provided
by Professor Hoffman and consist of measurements of the concentration of HPA
and PD (CHPA, CPD) versus time at various operating temperatures and pressures.
The same group also proposed a reaction scheme and a mathematical model
that describe the rates of HPA consumption, PD formation as well as the forma-
tion of acrolein (Ac). The model is as follows
(1 6.29a)
308
Chapter 16
" d C ~c -r3 - r4 -r-3
dt
( I 6.29b)
( I 6.29c)
where CI, is the concentration of the catalyst ( I O g/L). The initial conditions in all
experiments were CHPA(0)= 1.34953, Cpn(0) = 0 and CA,(0)=O.
The kinetic expressions for the reaction rates are given next,
1 HPA
r, =
+
r2 = k2Cl T)CHPA
0 5
1 +(Kg) + K2CFIpA
( 16.30a)
( I 6.30b)
r-3 = k-3CAc ( 16.30d)
In the above equations, kJ ( i = I , 2, 3, -3,4) are rate constants (L/(h~u/ miu g)).
KI and K2 are the adsorption equilibrium constants (L/n?ol) for H1 and HPA re-
spectively; P is the hydrogen pressure ( Mf a) in the reactor and H is the Henry's
law constant with a value equal to 1379 (L bnr/moZ) at 298 K. The seven parame-
ters (kl , kZ, k3, k3, k4, K1 and K2) are to be determined from the measured concen-
trations of HPA and PD versus time. I n this example, weshall consider only the
data gathered at one isotherm (3 18 K) and three pressures 2.6, 4.0 and 5.15 MPrr.
The experimental data are given in Table 16.23.
In this example the number of measured variables is less than the number of
state variables. Zhu et al. (1 997) minimized an unweighted sum of squares of de-
viations of calculated and experimental concentrations of HPA and PD. They used
Marquardt's modification of the Gauss-Newton method and reported the parameter
estimates shown in Table 16.24.
Parameter Estimation in Chemical Engineering Kinetics Models 309
Table 16.23 HPA Hydrogenation: Experimental Data Collected at 318 K
and Pressure 2.6, 4.0 and 5.15 MPa
40
50
0.136399 1.13292
0.304599 0.96 1339 60
0.238633 1.03556
2.6 I 80 I 0.734436 I 0.492378 I
7
100
0.002628 12 1.3295 10
1.26032 0.00530892 200
1.25769 0.0364 192 I80
I . I7306 0.100976 160
1.04284 0.2 1 4799 140
0.887254 0.374385 120
0.732326 0.56455 1
I 20 I 1.31157 I 0.0525624 I
30 1.22828 0.120736
40 1.087 0.24 1393
I 50 I 0.994539 I 0.384888 I
4.0
60
0.773 193 0.600962 80
0.4682 0.8 1 1825
10 1.36324 0.002628 12
20 I .25882 0.0700394
I 30 I 1.17918 I 0.184363 I
40 0.972 102 0.354008
50 0.825203 0.469777
5.15 60 0.697 109 0.607359
80 0.42 I45 I 0.85243 1
100 0.232296 1.03535
I20 0.128095 1.16413
I40 0.02898 I7 1.30053
160 0.00962368 1.31971
310 Chapter 16
Using the FORTRAN program given in Appendix 2 and starting with the
values given by Zhu et al. ( 1 997) as an initial guess, the LS objective function was
computed using the identity matrix as a weighting matrix. The LS ob-jective func-
tion was 0.26325 and the corresponding condition number of matrix A was
0.345~ I OI7. It should be noted that since the parameter values appear to differ by
several orders of magnitude, we used the formulation with the scaled matrix A
discussed in Section 8.1.3. Hence, the magnitude of the computed condition num-
ber of matrix A is solely due to ill-conditioning of the problem.
Indeed, the ill-conditioning of this problem is quite severe. Using program
Bayes-ODE3 and using as initial guess the parameter values reported by Zhu et al.
(1997) we were unable to converge. At this point it should be emphasized that a
tight test of convergence should be used (NSIG=5 in Equation 4.1 I ) , otherwise it
may appear that the algorithm has converged. In this test Marquat-dt's parameter
was zero and no prior information was used (Le., a zero was entered into the com-
puter program for the inverse of the variance of the prior parameter estimates ).
I n problems like this one which are very difficult to converge, we should use
Marquardt's modification first to reduce the LS objective hnction as much as pos-
sible. Then we can approach closer to the global minimum in a seqz,er?tinl wuy ~ . J J
letting ody one or two parameters to vary at a time. The estimated standard errors
for the parameters provide excellent information on what the next step should be.
For example if we use as an initial guess k,=IO" for all the parameters and a con-
stant Marquardt's parameter y=l 0-4, the Gauss-Newton iterates lead to k=[2.6866,
0.1 08x 1 0-6, 0.672xlO", 0.68~10-~, 0.0273, 35.56, 2.57IT. This correspotlds to a
significant reduction in the LS Objective function from 40.654 to 0.30452. Subse-
quently, using the last estimates as an initial guess and using a smaller value for
Marquardt's directional parameter (any value in the range to yielded
similar results) we arrive at k=[2.6866 , 0.236~ IO-', 0.672~10-~, 0.126~ I Om5,
0.0273, 35.56, 2.57IT. The corresponding reduction in the LS objective function
was rather marginal from 0.30452 to 0.30447. Any filrther attempt to get closer to
the global minimum using Marquardt's modification was unsuccessful. The values
for Marquardt's parameter were varied fiom a value equal to the smallest eigen-
vaIue all the way up to the third largest eigenvalue.
At this point we switched to our sequential approach. First we examine the
estimated standard errors in the parameters obtained using Marquardt's modifica-
tion. These were 22.2, O.37x1Os, 352., O.3x1O7, 1820., 30.3 and 14.5 (?4) for kl,
kz, k3, k3, k4, K1 and K2 respectively. Since K2 ,kl, K1 and k3 have the smallest
standard errors, these are the parameters we should try to optimize first. Thus,
letting only K2 change, the Gauss-Newton method converged in three iterations to
K2=2.5322 and the LS objective function was reduced from 0.30447 to 0.27807.
In these runs, Marquardt's parameter was set to zero. Next we let kl to vary.
Gauss-Newton method converged to kl = 2.7397 in three iterations and the LS
objective function was further reduced from 0.27807 to 0.26938. Next we opti-
mize K2 and kl together. This step yields K2=2.71 19 and kl=3.0436 with a corre-
sponding LS objective function of 0.26753. Next we optimize three parameters
Parameter Estimation in Chemical Engineering Kinetics Modeis
,_. , : ,;~ ' I
31 1
K2, kl and KI. Using as Marquardt's parameter y=0.5 (since the smallest eigen-
value was 0.362), the Gauss-Newton method yielded the following parameter val-
ues K2=172.25, k1=13.354 and Kl=4.5435 with a corresponding reduction of the
LS objective function from 0.26753 to 0.24357. The corresponding standard errors
of estimate were 9.3, 3.1 and 3.6 (%). Next starting from our current best parame-
ter values for KZ, kl and K1, we include another parameter in our search. Based on
our earlier calculations this should be k3. Optimizing simultaneously K2, kl, KI
and k3. we obtain K2=191.30, kl=13.502, K1=4.3531 and k3=.3922x10" reducing
further the LS objective function to 0.2 161 0. At this point we can try all seven
parameters to see whether we can reduce the objective fbnction any further. Using
Marquardt's modification with y taking values from 0 all the way up to 0.5, there
was no further improvement in the performance index. Similarly, there was no
improvement by optimizing five or six parameters. Even when four parameters
were tried with instead of k3, there was no further improvement. Thus, we con-
clude that we have reached the global minimum. All the above steps are summa-
rized in Table 16.24. The fact that we have reached the global minimum was veri-
fied by starting fiom widely different initial guesses. For example, starting with
the parameter values reported by Zhu et al. (1997) as initial guesses, we arrive at
the same value of the LS objective hnction.
At this point is worthwhile commenting on the computer standard estima-
tion errors of the parameters also shown in Table 16.24. As seen in the last four
estimation runs we are at the minimum of the LS objective function. The parame-
ter estimates in the run where we optimized four only parameters (K2, k,, KI & k3)
have the smallest standard error of estimate. This is due to the fact that in the com-
putation of the standard errors, it is assumed that all other parameters are known
precisely. In all subsequent runs by introducing additional parameters the overall
uncertainty increases and as a result the standard error of all the parameters in-
creases too.
Finally, in Figures 16.3a, 16.3b and 16.3~ we present the experimental data
in graphical form as well as the model calculations based on the parameter values
reported by Zhu et al. (1997) and f?omthe parameter estimates determined here,
namely, k*=[ 13.502, 0.236~ I O-', 0.3922~ I O", 0. I26x 1 0-5, 0.0273, 4.353 1,
I9 1 .30IT. As seen, the difference between the two model calculations is very small
and all the gains realized in the LS objective hnction (from 0.26325 to 0.2 161 0)
produce a slightly better match of the HPA and PD transients at 5.15 MPa.
Zhu et al. ( 1 997) have also reported data at two other temperatures, 333 K
and 353 K. The determination of the parameters at these temperatures is left as an
exercise for the reader (see Section 16.4.3 for details),
312
Chapt er 16
0
0
0 50 100 150 200
Time (min)
0 50 100 150 200
Time (min)
0 50 100 150 200
Time (rnin)
h
3
0
1
A
0
0
0
m
m
w
2
*
*
-
.
x
-
-
d
l
0
-
T
*
I
*
I
5
N
0
m
c
h
314 Chapt er 16
16.3.4 Cas Hydrate Formation Kinetics
Gas hydrates are non-stoichiometric crystals formed by the enclosure of
molecules like methane, carbon dioxide and hydrogen sulfide inside cages formed
by hydrogen-bonded water molecules. There are more than 100 cotnpounds
(guests) that can combine with water (host) and form hydrates. Formation of gas
hydrates is a problem in oil and gas operations because it causes plugging of the
pipelines and other facilities. On the other hand natural methane hydrate exists in
vast quantities in the earth's crust and is regarded as a future energy resource.
A mechanistic model for the kinetics of gas hydrate formation was proposed
by Englezos et al. (1 987). The model contains one adjustable parameter for each
gas hydrate forming substance. The parameters for methane and ethane were de-
termined fromexperimental data in a semi-batch agitated gas-liquid vessel. During
a typical experiment in such a vessel one monitors the rate of methane or ethane
gas consumption, the temperature and the pressure. Gas hydrate formation is a
crystallization process but the fact that it occurs from a gas-liquid system under
pressure makes it difficult to measure and monitor in situ the particle size and par-
ticle size distribution as well as the concentration of the methane or ethane in the
water phase.
The experiments were conducted at four different temperatures for each gas.
At each temperature experiments were performed at different pressures. A total of
14 and 1 1 experiments were performed for methane and ethane respectively.
Based on crystallization theory, and the two film theory for gas-liquid mass trans-
fer Englezos et al. (1987) formulated five differential equations to describe the
kinetics of hydrate formation in the vessel and the associate mass transfer rates.
The governing ODES are given next.
J fb(tO)=feq (16.31b)
( 16.3 1 d)
Parameter Estimation in Chemical Engineering knetics Models 315
(16.31e)
The first equation gives the rate of gas consumption as moles of gas (n) ver-
sus time. This is the only state variable that is measured. The initial number of
moles, nO is known. The intrinsic rate constant, K* is the only unknown model
parameter and it enters the first model equation through the Hatta number y. The
Hatta number is given by the following equation
y = yL,/4xK*pzH/ D* (1 6.32)
The other state variables are the fugacity of dissolved methane in the bulk of
the liquid water phase (fb) and the zero, first and second moment of the particle
size distribution (PO, pl , ~2 ) . The initial value for the fugacity, f : is equal to the
three phase equilibrium fugacity feq. The initial number of particles, p: , or nuclei
initially formed was calculated from a mass balance of the amount of gas con-
sumed at the turbidity point. The explanation of the other variables and parameters
as well as the initial conditions are described in detail in the reference. The equa-
tions are given to illustrate the nature of this parameter estimation problem with
five ODES, one kinetic parameter (K*) and only one measured state variable.
-8
E
3
cn
t
0
0
0.11s "
0.100
o.ms
0.070 --
0.025
--
?
0.010 t
I I 1 I 1 1
1. I t I I k
-5 10 25 40 55 70 85 fOO
time (min)
Figure 16.4: Experimental (-) and Jitted f- - - - ; ) curves for the methane
hydrate formation at 279 K [reprinted from Chemical Engineering
Science with pesnzission front Elsevier Science].
316
Chapter 16
The Gauss-Newton method with an optimal step-size policy and Mar-
quardt's modification to ensure rapid convergence was used to match the calcu-
lated gas consumption curve with the measured one. The five state and five sensi-
tivity equations were integrated using DGEAR (an IMSL routine for integration of
stiff differential equations). Initially, parameter estimation was performed for each
experiment but it was found that for each isotherm the pressure dependence was
not statistically significant. Consequently, each isothermal set of data for each gas
was treated simultaneously to obtain the optimal parameter value. The experi-
mental data and the corresponding model calculations for methane and ethane gas
hydrate formation are shown in Figures 16.4 and 16.5.
16.4 PROBLEMS WITH ODE MODELS
The following problems were formulated with data from the literature. Al -
though the information provided here is sufficient to solve the parameter estima-
tion problem, the reader is strongly recommended to see the papers in order to
fidly comprehend the relevant physical and chemical phenomena.
Parameter Estimation in Chemical Engineeking Kinetics Models
317
16.4.1 Toluene Hydrogenation
Consider the following reaction scheme
where A is toluene, B is 1 -methyl-cyclohexane, C is methyl-cyclohexane, rl is the
hydrogenation rate (forward reaction) and r-l the disproportionation rate (backward
reaction). Data are available from Belohlav et al. ( 1 997).
The proposed kinetic model describing the above system is given next (Be-
lohlav et al. 1997):
dCB
"
dt
- rl -r_l -r2 ; CB(0) = 0
" dCC
dt
-1-2 ; Cc(O)=O
The rate equations are as follows
(1 6.33a)
(16.33b)
(16.33~)
(1 6.34a)
(1 6.34b)
(I 6.34~)
where Ci (i=A, B, C) are the reactant concentrations and K''e' the relative adsorp-
tion coefficients.
The hydrogenation of toluene was performed at ambient temperature and
pressure in a semi-batch isothermal stirred reactor with commercial 5% Ru-act
catalyst. Hydrogen was automatically added to the system at the same rate at
which it was consumed. Particle size of the catalyst used and efficiency of stirring
318
Chapt er 16
were sufficient for carrying out the reaction in the kinetic regime. Under the ex-
perimental conditions validity of Henrys law was assumed. The data are given
below in Table 16.25.
You are asked to use the Gauss-Newton method and determine the parame-
ters k,.[, kD, k2, KA-Icl, and as well as their 95% confidence intervals.
For cornparison purposes, it is noted that Belohlav et al. (I 997) reported the
following parameter estimates: kH=0.023 miti, kD=0.005 min, k2=0.0 I I mhi,
KA-rel= 1.9, and Kc.,,,= 1.8.
Table 16.25 Data. for the HJldrogermtion qf Tollrerw
180
0.908 0.073 0.0 19 400
0.909 0.070 0.02 1 3 80
0.898 0.080 0.022 360
0.822 0. I46 0.03 1 320
0.747 0.21 I 0.04 1 240
0.580 0.362 0.056
Sozwce: Belohlav et al. ( 1 997).
16.4.2 Methylester Hydrogenation
Consider the following reaction scheme
where A, B, C and D are the methyl esters of linolenic, linoleic, oleic and stearic
acids and rl, r2 and r3 are the hydrogenation rates. The proposed kinetic model
describing the above system is given next (Belohlav et at. 1997).
Parameter Estimation in Chemical Engineering Kinetics Models
dCA
-=
dt
-q ; CA(0)= 0.101
-= dCB
dt '1 -9 ; CB(0)=0.221
dCC
"
dt
- r2 -r3 ; Cc (0) = 0.657
dCD
"
dt
- r3 ; CD(0) = 0.0208
The rate equations are as follows
319
(1 6.35a)
(16.35b)
(I 6.35~)
(1 6.35d)
(16.36a)
(I 6.36b)
(16.36~)
Table 16.26 Data for the Hydrogenation of Methylesters
'c (min)
CA CD CC CB
1
0 0.0208 0.6570 0.22 10 0.1012
I 10 I 0.0150 I 0.1064 I 0.6941 I 0.1977 I
14
0.9982 0.000 I 0.0002 0.000 I I24
0.9680 0.0299 0.0004 0.0003 69
0.7808 0.2 1 88 0.0005 0.00 17 34
0.6055 0.3956 0.001 5 0.0029 24
0.4444 0.536 1 0.0242 0.0028 19
0.3058 0.6386 0.0488 0.0044
Source: Belohlav et al. (1 997).
320 Chapter 16
where C, (i=A, B, C, D) are the reactant concentrations and Krel the relative ad-
sorption coefficients.
The experiments were performed in an autoclave at elevated pressure and
temperature. The Ni catalyst DM2 was used. The data are given below in Table
16.26 (Belohlav et at., 1997).
The objective is to determine the parameters kl, kz. kg, KA-rel, Ku-,el , Kc.-lel-
KD-rel as well as their standard errors. It is noted that KA-lcl=I. Belohlav et ai.
( I 997) reported the following parameter estimates: kl = 1.44 m h - , kz=0.03 mi l t ,
k3=0.09 min, KB-,,1=28.0, Kc-,l=1.8 and KD.1,1=2.9.
16.4.3 Catalytic Hydrogenation o f 3-Hydroxypropanal (HPA) to
1,3-Propanediol (PD) - Nonisothermal Data
Let us reconsider the hydrogenation of 3-hydroxypropanal (HPA) to 1.3-
propanediol (PD) over Ni/SiO2/AI2Og catalyst powder that used as an example
earlier. For the sanle mathematical model of the system you are asked to regress
simultaneously the data provided in Table 16.23 as well as the additional data
given here in Table 16.28 for experiments performed at 60C (333 A) and 80C
(353 K). Obviously an Arrhenius type relationship must be used in this case. Zhu
et a]. (1997) reported parameters for the above conditions and they are shown in
Table 16.28.
Parameter Estimates -1- Standard Error
T = 333 K T = 353 K
kl
k2
kz
k-3
kr
19.54
I .475
k0.40 1 +o. 102
52.84
1.537~ 1 0-3
3.058~ J OS4
5.3 15x IO- 1.788~ 1 0-3
2.066~ 10
k0.27~ IO-^
- +O.O36x 1 O? - +0.75x 1 0-5
kO.12x Io-?
k0.025~ 1 O3 *o. 1 20x 10
7.5 15x IO-^ 2. mX I 0
k0.90~ 1 o- ~ k0.023 x 1 0.
Kl
Kz
120.0
+ 1.73 k 1.05
160.0
k0.02 I kO.0 I3
2.767 3.029
Solwce: Zhu et al. ( 1 997).
Parameter Estimation i n Chemical hgiiieering Kidtics Models
., > J 2 - * I
32 1
Table 16.28 HPA Hydrogenation: Experimental Data Collected at 333 K
and 353 K
Experimental CHPA
Time
0.0 I .34953
10
0.663043 20
1 .OS54
Conditions
(mol/L) (rn in)
5.15 MPa
333 K
& 0.3 I6554 1 30
40 0.00982225
CPD
(m ol/L)
0.0
0.196452
0.602365
0.851 117
1.16938
I 50 I 0.00515874 I 1.24704
60
0.0 1.34953 0.0
1.24836 0.0020635
5 0.388568 0.8735 I3
5.15 MPa
353 K
10
0.9670 1 7 0.140925 15 &
0.8 16032 0.44727
20
1 .OS239 0.0 130859 25
1.05 125 0.0350076
30
0.0 1.34953 0.0
1 .12024 0.0058 1597
10
0.867 148 0.448402 30 &
0.558344 0.78552 1 20
0.151380 1. I3876
4.0 MPa
333 K
40
1.32 0.0199173 60
I .29 0.0530074 50
1.12536 0.191058
Source: Zhu et al. (I 997).
Parameter Estimation in Biochemical
Engineering Models
A number of examples from biochemical engineering are presented in this
chapter. The mathematical models are either algebraic or differential and they
cover a wide area of topics. These models are often employed in biochemical en-
gineering for the dcvelopment of bioreactor models for the production of bio-
pharmaceuticals or in the environmental engineering field. In this chapter we have
also included an example dealing with the determination of the average specific
production rate from batch and continuous runs.
17.1 ALGEBRAIC EQUATION MODELS
17.1.1 Biological Oxygcn Demand
Data on biological oxygen demand versus time are modeled by the follow-
ing equation
where kl is the ultimate carbonaceous oxygen demand (mg/L) and k? is the BOD
reaction rate constant ('d'). A set of BOD data were obtained by 3'd year Environ-
mental Engineering students at the Technical University of Crete and are given in
Table 4.2.
Although this is a dynamic experiment where data are collected over time, it
is considered as a simple algebraic equation model with two unknown parameters.
322
Parameter Estimation in Biochemical Engineering 323
Using an initial guess of kI=350 and k2=l the Gauss-Newton method converged in
five iterations without the need for Marquardt's modification. The estimated pa-
rameters are kt= 334.27+2.10% and k2=0.38075+5.78%. The model-calculated
values are compared with the experimental data in Table 17.1, As seen the agree-
ment is very good in this case.
The quadratic convergence of the Gauss-Newton method is shown in Table
17.2 where the reduction of the LS objective function is shown for an initial guess
of k,=lOO and k2=0. 1.
Table 17. I BOD Data: Experimental Data and Model
Calculated Values
Time
mental Data Calculations
105.8
178.2
I 3 1 230 I 227.6 I
4
284.5 280 5
26 1.4 260
I 6 I 290 I 300.2 I
I 7 I 3 10 I 31 1.0 I
8 3 18.4 330
Table 17.2 BOD Data: Reduction of the LS Objective Function
Iteration
0.1897 70.78 380140 1
0.1 100 3903 84 0
k2 kl
Objective function
2
0.3686 33 1.8 475.7 4
0.4454 27 1.9 12160.3 3
1.169 249.5 190 17.3
5
0.3807 334.3 288.9 6
0.3803 334.3 289.0
17.1.2 Enzyme Kinetics
Let us consider the determination of two parameters, the maximum reaction
rate (rmau) and the saturation constant (K,") in an enzyme-catalyzed reaction fol-
lowing Michaelis-Menten kinetics. The Michaelis-Menten kinetic rate equation
relates the reaction rate (r) to the substrate concentrations (S) by
324
Chapter 17
( I 7.2)
The parameters are usually obtained from a series of initial rate experiments
performed at various substrate concentrations. Data for the hydrolysis of benzoyl-
L-tyrosine ethyl ester (BTEE) by trypsin at 30C and pH 7.5 are given in Table
17.3
S
~ P W
20 2.5 5.0 10 15
r
64M/n1 in)
330 I 10 220 260 300
Source: Blanch and Clark (1 996).
As we have discussed in Chapter 8, this is a typical transfomlably linear
system. Using the well-known Lineweaver-Burk transformation, Equation 17.2
becomes
( 17.3)
The results from the simple linear regression of (1 /r) versus ( I /S) are shown
graphically in Figure 17.1.
Lineweawr-Burk Plot
0.01
1
0,004 y =0,0169~ +0,002
0,002 R2 =0,9632
0
0,000 0,100 0,200 0,300 0,400 0,500
1 IS
Parameter Estimation in Biochemical Engineering 325
If instead we use the Eadie-Hofstee transformation, Equation 17.2 becomes
( 1 7.4)
The results from the simple linear regression of (r) versus (r/S) are shown
graphically in Figure 17.2.
Eadie-Hofstee Plot
60
20
10
y =-0,1363~
R2 =0,i
0 - -
. .
0 50 100 150 200 250 300 350
rl s
Figure 17.2 Enqnze Kinetics: Reszdtsfiom the Endie-Hojitee
transformation.
Similarly we can use the Hanes transformation whereby Equation 17.2 be-
comes
( 1 7.5)
The results from the simple linear regression of (S/r) versus (S) are shown
graphically in Figure 17.3. The original parameters r,, and K, can be obtained
from the slopes and intercepts determined by the simple linear least squares for all
three cases and are shown in Table 17.4. As seen there is significant variation in
the estimates and in general they are not very reliable. Nonetheless, these esti-
mates provide excellent initial guesses for the Gauss-Newton method.
Table I 7.4 Enzynle Kinetics: Estimated Parameter Values
Estimation Method
8.45 500
Lineweaver-Burk plot
k2 kl
326
Chapter 17
Hanes Plot
0,08
$j 0,04
y =0,0023~ +0,0146
0,02 R2 =0,9835
0,oo
0 5 10 15 20
S
25
Indeed, using the Gauss-Newton method with an initial estimate of
k(')=(450, 7) convergence to the optimum was achieved in three iterations with no
need to employ Marquardt's modification. The optimal parameter estimates are
kl = 420.2+8.68% and k2= 5.705+24.58%. It should be noted however that this
type of a model can often lead to ill-conditioned estimation problems if the data
have not been collected both at low and high values of the independent variable.
The convergence to the optimum is shown in Table 17.5 starting with the initial
guess k'O'=( 1, 1).
Table 17.5 Erqrne Kinetics: Reduction of the LS Objectitye Functiot7
Iteration
1 1 3248 I6 0
kz kl
Objective Function
I
5.685 41 7.8 98 1 260 7
4.356 359. I 3026. I2 6
12.65 386.7 39131.7 5
I A93 71 -92 196487 4
7.696 23.50 295058 3
43.38 36.52 308393 2
740.3 370.8 3 I2388
8 5.706 420.2 974.582
9 5.705 420.2 974.582
Parameter Estimation in Biochemkal Engineering 327
17.1.3 Determination of Mass Transfer Coefficient (kIAa) in a Municipal
Wastewater Treatment Plant (with PULSAR aerators)
The PULSAR units are high efficiency static aerators that have been devel-
oped for municipal wastewater treatment plants and have successfdly been used
over extended periods of time without any operational problems such as unstable
operation or plugging up during intermittent operation of the air pumps (Chourda-
kis, 1999). Data have been collected from a pilot plant unit at the Wastewater
Treatment plant of the Industrial Park (Herakleion, Crete). A series of experiments
were conducted for the determination of the mass transfer coefficient (k,a) and are
shown in Figure 17.4. The data are also available in tabular formas part of the
parameter estimation input files provided with the enclosed CD.
In a typical experiment by the dynamic gassing-idgassing-out method dur-
ing the normal operation of the plant, the air supply is shut off and the dissolved
oxygen (DO) concentration is monitored as the DO is depleted. The dissolved
oxygen dynamics during the gassing-out part of the experiment are described by
dCo2
dt
- - q02xv
( 1 7.6)
where Co2 is the dissolved oxygen concentration, x, is the viable cell concentra-
tion and qO2 is the specific oxygen uptake rate by the cells. For the very short pe-
riod of this experiment we can assume that xv is constant. In addition, when the
dissolved oxygen concentration is above a critical value (about 1.5 rng/L), the spe-
cific oxygen uptake rate is essentially constant and hence Equation 17.6 becomes,
0 50 100 150 200 250
Time (min)
Figure 17.4: PULSAR: A4easurements of Dissolved Oxygen (DO) Concentra-
tion DtlriIlg a Dynamic Gassing-in/Gassing-out Experiment.
328
Chapter 17
(1 7.7)
Equation 17.7 suggests that the oxygen uptake rate (q0.x.) can be obtained
by simple linear regression of Co2(t) versus time. This is shown in Figure 17.5
where the oxygen uptake rate has been estimated to be 0.08 I3 mg/L.mir7.
10
9
$ 8
E
- 7
0
c
~6
n
g 5
8 2
Q,
0
2 4
0 3
1
I 1
0 1
100 120 140 160 180 200
Time (min)
Subsequently. during the gassing-in part of the experiment, we can deter-
mine kLa. In this case, the dissolved oxygen dynamics are described by
(1 7.8)
where C$, is the dissolved oxygen concentration in equilibrium with the oxygen
in the air supply at the operating pressure and temperature of the system. This
value can be obtained fiom the partial pressure of oxygen in the air supply and
Henry's constant. However, Henry's constant values for wastewater are not read-
Parameter Estimation in Biochemical Engineering 329
ily available and hence, we shall consider the equilibrium concentration as an ad-
ditional unknown parameter.
Analytical solution of Equation 17.8 yields
(1 7.9)
or equivalently,
Equation 17.10 can now be used to obtain the two unknown parameters (kLa
and CL2 ) by fitting the data from the gassing-in period of the experiment. Indeed,
using the Gauss-Newton method with an initial guess of (1 0, 10) convergence is
achieved in 7 iterations as shown in Table 17.6. There was no need to employ
Marquardt's modification. The FORTRAN program used for the above calcula-
tions is also provided in Appendix 2.
Table 17.6 PULSAR: Reduction of the LS Objective Function and
Convergence to the Optimzum (Two Parameters)
Iteration
cb
kLa
Objective Function
I I I
0 10 10 383.422
1
8.0977 0.3058 9.0 I289 3
8.5 173 0.7346 72.6579 2
9.8 173 1 377 1 306.050
While using Equation 17.10 for the above computations, it has been as-
sumed that Co2(tf)) = 4.25 was known precisely from the measurement of the dis-
330 Chapter 17
solved oxygen concentration taken at to = 207 nzin. We can relax this assumption
by treating CO2(t0) as a third parameter. Indeed, by doing so a smaller value for the
least squares ob-jective function could be achieved as seen in Table 17.7 by a small
adjustment in CO2(b) from 4.25 to 4.284.
Table 17.7 PULSAR: Reductiorl of the LS Objective Fwzfion nrsd
Convergence to the Opfimtatl (Three Parnnleters)
Iteration
c,>, (to) c t 2 k13
Objective Function
0 3 83.422
4.283 9.2 173 0.1259 0.1 3 8029 7
4.283 9.1794 0.1299 0.24607 1 6
4.285 9.0923 0.1375 0.508239 5
4.303 8.8860 0. I529 0.944376 4
4.428 8.3528 0. I880 1.92088 3
4.702 7.6867 0.3087 9.17805 2
4.250 9.629 1 0.6643 194.529 I
4.25 10 I O
8
4.283 9.2458 0.1223 0.086898 10
4.283 9.2420 0.1228 0.090090 9
4.283 9.2342 0.1239 0.101272
1 1 4.283 9.2420 0. I228 0.0860 I8
12 4.284 9.2486 0.1219 0.085780
13
4.284 9.2494 0.1218 0.085699 14
4.284 9.2490 0.1218 0.0857 16
r
I Standard Deviation (%) I 1.45 I 0.637 1 0.667 I
In both cases the agreement between the experimental data and the model
calculated values is very good.
17.1.4 Determination of Monoclonal Antibody Productivity in a Dialyzed
Chemostat
Linardos et al. (1992) investigated the growth of an SP2/0 derived mouse-
mouse hybridoma cell line and the production of an anti-Lewis' IgM immuno-
globulin in a dialyzed continuous suspension culture using an 1.5 L Celligen bio-
reactor. Growth medium supplemented with 1.5?/0 serum was fed directly into the
bioreactor at a dilution rate of 0.45 d' . Dialysis tubing with a molecular weight
Parameter Estimation in Biochemicai Engineering 33 1
cut-off of 1000 was coiled inside the bioreactor. Fresh medium containing no se-
rum or serum substitutes was passed through the dialysis tubing at flow rates of 2
to 5 L/d. The objective was to remove low molecular weight inhibitors such as
lactic acid and ammonia while retaining high molecular weight components such
as growth factors and antibody molecules. At the same time essential nutrients
such as glucose, glutamine and other aminoacids are replenished by the same
mechanism.
Jn the dialyzed batch start-up phase and the subsequent continuous operation
a substantial increase in viable cell density and monoclonal antibody (MAb) titer
was observed compared to a conventional suspension culture. The raw data, pro-
files of the viable cell density, viability and monoclonal antibody titer during the
batch start-up and the continuous operation with a dialysis flow rate of 5 L/d are
shown in Figures 17.6 and 17.7. The raw data are also available in tabular form in
the corresponding input file for the FORTRAN program on data smoothing for
short cut methods provided with the enclosed CD.
. . . . . - . ~. . -
o Raw Data + Smoothed (1 0%) x Smoothed (5%)
. . " . . . . "
200 . f . "111; "---~"=". -. ". . "-. " - "
0 - -
6
++
6
Batch
Start-up
continuous operation
I I
0 100 200 300 400 500 600
Time (h)
Figure I 7.6: Diolvzed Chenzostat: Monoclonal antibodv concentration kaw and
smoothed nzeasuren2ent.Y) dur-ing initial batch start-up and subsequent
dialyzed continuous operation with a dialysis pow rate of 5 Lid. [re-
printed from the Journal of Biotechnology & Bioengineering with per-
missiotz fi-om J. IViIeyl.
The objective of this exercise is to use the techniques developed in Section
7.3 of this book to determine the specific monoclonal antibody production rate
(qM) during the batch start-up and the subsequent continuous operation.
332 Chapter 17
Derivntive .4ppronch:
The operation of the bioreactor was in the batch mode up to time t=2 12 h.
The dialysis flow rate was kept at 2 L/d up to time t-9 1.5 h when a sharp drop in
the viability was observed. I n order to increase further the viable cell density, the
dialysis flow rate was increased to 4 L/d and at 180 h it was further increased to 5
L/dand kept at this value for the rest of the experiment.
o Raw Data + Smoothed (10%) x Smoothed (5%)
8
0
.-
I
II
8
Batch
Start-up
100 200 300 400 500 600
Time (h)
As described in Section 7.3.1 when the derivative method is used, the spe-
cific MAb production rate at any time t during the batch start-up period is deter-
mined by
(17.1 1 )
where X,(t) and M(t) are the smoothed (filtered) values of viable cell density and
monoclonal antibody titer at time t.
Parameter Estimation in Biochemical Engineering 333
At time t=212 h the continuous feeding was initiated at 5 L/d corresponding
to a dilution rate of 0.45 d. Soon after continuous feeding started, a sharp in-
crease in the viability was observed as a result of physically removing dead cells
that had accumulated in the bioreactor. The viable cell density also increased as a
result of the initiation of direct feeding. At time tz550 h a steady state appeared to
have been reached as judged by the stability of the viable cell density and viability
for a period of at least 4 days. Linardos et al. ( I 992) used the steady state meas-
urements to analyze the dialyzed chemostat. Our objective here is to use the tech-
niques developed in Chapter 7 to determine the specific monoclonal antibody pro-
duction rate in the period 212 to 570 h where an oscillatory behavior of the MAb
titer is observed and examine whether it differs from the value computed during
the start-up phase.
During the continuous operation of the bioreactor the specific MAb produc-
tion rate at time t is determined by the derivative method as
(17.12)
In Figures 17.6 and 17.7 besides the raw data, the filtered values of MAb
titer and viable cell density are also given. The smoothed values have been ob-
tained using the IMSL routine CSSMH assuming either a 10% or a 5% standard
error in the measurements and a value of 0.465 for the smoothing parameter s/N.
This value corresponds to a value of -2 of the input parameter SLEVEL in the
FORTRAN program provided in Appendix 2 for data smoothing. The computed
derivatives from the smoothed MAb data are given in Figure 17.8. In Figure 17.9
the corresponding estimates of the specific MAb production rate (qM) versus time
are given.
Despite the differences between the estimated derivatives values, the com-
puted profiles of the specific MAb production rate are quite similar. Upon inspec-
tion of the data, it is seen that during the batch period (up to t=212 h), qkl is de-
creasing almost monotonically. It has a mean value of about 0.5 pg/(Z06 cells-h).
Throughout the dialyzed continuous operation of the bioreactor, the average q M is
about 0.6 pg/(106 cells.h) and it stays constant during the steady state around time
tx550 h.
In general, the computation of an instantaneous specific production rate is
not particularly useful. Quite often the average production rate over specific peri-
ods of time is a more usefid quantity to the experimentalist/analyst. Average rates
are better determined by the integral approach which is illustrated next.
334
Chapter 17
1.5 "
1 .o
0.5
0.0 .
-0.5 8
-1 .o
-1.5
Q
A Smoothed (10%) Smoothed (5%)
@@A
00
0
" .
4*Q
[ol
0
A
19
A
A
I
0
0 100 200 300 400 500 600
Time (h)
1.4
3
.I 1.0
3 0.8
0.6
fz 0.4
E
0 0.2
E
$ 0.0
a -0.2
2 1.2
.U
0
W
e
0 100 200 300 400 500 600
Parameter Estimation in Biochemical Engineering 335
Integral Approach:
As described in Section 7.3.2 when the integral method is employed, the average
specific MAb production rate during any time interval of the batch start-up period
is estimated as the slope in a plot of M(t,) versus jt: X (t)dt .
In order to implement the integral method, we must compute numerically
the integrals [: X (t)dt, ti=l,. . .,N where X, is the smoothed value of the viable
cell density. An efficient and robust way to perform the integration is through the
use of the IMSL routine QDAGS. The necessary function calls to provide X, at
selected points in time are done by calling the IMSL routine CSVAL. Of course
the latter two are used once the cubic splines coefficients and break points have
been computed by CSSMH to smooth the raw data. The program that performs all
the above calculations is also included in the enclosed CD. Two different values
for the weighting factors (1 0% and 5%) have been used in the data smoothing.
In Figures 17.10 and 17.1 1 the plots of M(t,) versus X (t)dt are shown
for the batch start-up period.
0 100 200 300 400 500
Figwe I 7. I O: Dialyed Chemostat: Estimated values of spec$c MAb production
rate vesstls time during the initial batch start-up period. A 10%
standard error i n raw data was assumed for data smoothing.
336 Chapt er 17
The computation of an avernge specific production rate is particularly useful
to the analyst if the instantaneous rate is approximately constant during the time
segment of interest. By simple visual inspection of the plot of MAb titer versus the
integral of X,, one can readily identify segments of the data where the data points
are essentially on a straight line and hence. an average rate describes the data satis-
factorily. For example upon inspection of Figure 17. I O or 17.1 1, three periods can
be identified. The first five data points (corresponding to the time period 0 to 91 h )
yield an average qM of 0.7165 ,ug/(106 cells h) when a 10% weighting in used in
data smoothing or 0.72 I5 ,ug/(1O6 cells h) when a 2% weighting is used. The next
period where a lower rate can be identified corresponds to the time period 91 to
140 17 (i.e., 5th, 6'", ..., gth data point). In this segment of the data the average spe-
cific MAb production rate is 0.447 1 or 0.504 &(lo6 cells 17) when a 10% or a 5%
weighting is used respectively. The near unity values of the computed correlation
coefficient (R2>0.99) suggests that the selection of the data segment where the
slope is determined by linear least squares estimation is most likely appropriate.
The third period corresponds to the last five data points where it is obvious
that the assumption of a nearly constant qM is not valid as the slope changes es-
sentially from point to point. Such a segment can still be used and the integral
method will provide an average qM, however, it would not be representative of the
behavior of the culture during this time interval. It would simply be a mathemati-
cal average of a time varying quantity.
200
2 175
150
.- 125
Y
CI
g 100
a9
C
0 75
50
3 25
0
a
R2 =0 9784
R2 =0.9918
0 100 200 300 400 500
Parameter Estimation i n Biochemical Engineering 337
Let us now turn our attention to the dialyzed continuous operation (212 to
570 h). By the integral method, the specific MAb production rate can be estimated
as the slope in a plot of M(ti) + D M(t)dt versus X (t)dt .
{ 1; ; } f;
In this case besides the integral of X,, the integral of MAb titer must also be
computed. Obviously the same FORTRAN program used for the integration of X,
can also be used for the computations (see Appendix 2). The results by the integral
method are shown in Figures 17.12 and 17.13.
Figure 17.12:
1400
1200
1000
800
600
400
I v =0 5869x +333 49
R2= 09953
"-
400 600 800 1000 1200 1400 1600
fti X , (t)dt
J t O
Dialyzed Chenzostat: Estimated values of specific MAb production
rate vemus time during the period of continuous Operation. A 10%
standard error in the raw data was assumed for data smoothing.
1400
1200
M(ti)+ 1 1000
600
I I
y =0 6171x +291.52
- R2 =0.9976 \ ,- -,-&@
P- R2 =O 9975
400 I J
400 600 800 1000 1200 1400 1600
Figure 17.13: Dialyzed Clzenzostat: Estimated values of specific MAb production
rate versus time during the period of contimom operation. A 5%
standard error in the raw data was assumed for data snzoothing.
338 Chapter 17
In both cases, one can readily identi@ two segments of the data. The first
one is comprised by the first 8 data points that yielded a qh,l equal to 0.5233 or
0.4906 pg/(lO' cells 12) when the weighting factors for data smoothing are 10% or
5% respectively. The second segment is comprised of the last 5 data points corre-
sponding to the steady state conditions used by Linardos et al. (1992) to estimate
the specific MAb production rate. Using the integral method on this data segment
q M was estimated at 0.5869 or 0.6 I7 1 pg/(106 cells h) as seen in Figure 17.13.
This is the integral approach for the estimation of specific rates in biological
systems. Generally speaking it is a simple and robust technique that allows a vis-
ual conformation of the computations by the analyst.
17.2 PROBLEMS WITH ALGEBRAlC EQUATION MODELS
17.2.1 Effect of Glucose to Glutamine Ratio on MAb Productivity in a
Chemostat
At the Pharmaceutical Production Research Facility of the University of
Calgary experitnental data have been collected (Linardos, 1991) to investigate the
effect of glucose to glutamine ratio on monoclonal antibody (anti-Lewis' IgM)
productivity in a chemostat and they are reproduced here in Tables 17.8, 17.9 and
17.10. Data are provided for a 5 : 1 (standard for cell culture media), 5:2 and 5:3
glucose to glutamine ratio in the feed. The dilution rate was kept constant at 0.45
For each data set you are asked to use the integral approach to estimate the
dl.
following rates:
(a) Specific monoclonal antibody production rate
(b) Specific glucose uptake rate
(c) Specific glutamine uptake rate
(d) Specific lactate production rate, and
(e) Specific ammonia production rate
Does the analysis of data suggest that there is a significant effect of glucose
to glutamine ratio on MAb productivity?
By computing the appropriate integrals with filtered data and generating the
corresponding plots, you must determine first which section of the data is best
suited for the estimation of the specific uptake and production rates.
In the next three tables, the following notation has been used:
t Elapsed Time ( h)
x,. Viable Cell Density ( IO6 cells/ml)
1'1, Viability (%)
Lac Lactate (mrztol/L)
MAh Monoclonal Antibody Concentration (177gX)
Gls Glucose (wmol/L)
A r m Ammonia (mntol/L)
Parameter Estimation in Biochemical Engineering 339
Gfm Glutamine (mmol/L)
Glt Glutamate (mmo//L)
Xd Nonviable Cell Density ( IO6 cells/mL)
Table 27.8: Glucose/Glutamine Ratio: Experimental Data from a Chemostat
Run with a Glucose to Glutamine Ratio in the Feed of 5:l
I 365.0 I 2.4 I 0.82 I 67.6 I 34.71 I 5.92 I 1 051 I 0.051 I 0.131 I 0.527 I
Source: Linardos (I 99 1).
Table 17.9: Glucose/Glutamine Ratio: Experimental Datafrom a Chemostat
Run with a Glucose to Glutamine Ratio in the Feed of 5. 2
t
xd
Glt Glnz Amm Gls Lac A4'4b
vh XI.
189.8
0.640 0.09 2.03 2.32 5.49 41.01 55.90 0.75 1.92 212.0
0.647 0.09 1.86 2.50 5.34 37.53 53.65 0.76 2.05 203.0
0.802 0.09 1.90 1.64 4.92 37.12 55.50 0.65 1.49
Source: Linardos ( I99 1).
340 Chapt er 17
Table 17. IO: Glt~cosdGln~tamine Ratio: Eyperirnentnl Data.fj'om n Chemostnt
Rtrn with a Glucose to Glutamine Ratio iu the Feed sf 5.3.
;ource: Linardos ( I 99 1).
24.27 I 1.56 I 6.66 I 6.26 I 0.66
-%I
0.399
0.450
0.463
0.5 I3
0.467
0.623
0.572
0.584
0.854
0.83 I
1 0 1 I
17.2.2 Enzyme inhibition Kinetics
Blanch and Clark ( I 996) reported the following data on an enzyme cata-
lyzed reaction in the presence of inhibitors A or B. The data are shown in Table
17.1 I .
Table 17. I 1 Inhibition Kinetics: Initial Renction Rntes in the
Presence cf Inhibitors A or B at Dfferent S r h -
strafe Concentrations
I Substrate I Reaction Rate (mWnzirl)
Concentration
( mM)
No Inhibitor B at 25 p A . I A at 5 phl
I 0.2 1 8.34 I 3.15 I 5.32 1
0.33
9.60 28.9 40.0 4.0
9.45 26.2 36.2 2.5
8.56 13.3 25.0 1 .o
7.07 7.12 16.67 0.5
6.26 5.06 12.48
5.0 9.75 31.8 42.6
Sozwce: Blanch and Clark ( I 996).
In this problem you are asked to do the following:
Parameter Estimation in Biochemical Engineering
: I
341
(a) Use the Michaelis-Menten kinetic model and the data from the inhibitor-
free experiments to estimate the two unknown parameters (rmax and Km)

r = rmax S / (Km + S)    (17.13)

where r is the measured reaction rate and S is the substrate concentration.
(b) For each data set with inhibitor A or B, estimate the inhibition parameter Ki
for each alternative inhibition model given below:

Competitive Inhibition      r = rmax S / [Km (1 + I/Ki) + S]    (17.14)

Uncompetitive Inhibition    r = rmax S / [Km + S (1 + I/Ki)]    (17.15)

Noncompetitive Inhibition   r = rmax S / [(Km + S)(1 + I/Ki)]    (17.16)

where I is the concentration of inhibitor (A or B) and Ki is the inhibition ki-
netic constant.
(c) Use the appropriate model adequacy tests to determine which one of the
inhibition models is the best.
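As an illustration of how parts (a) and (b) can be tackled with a general-purpose
nonlinear least squares routine, the following Python sketch fits the Michaelis-
Menten parameters to the inhibitor-free column of Table 17.11 and then, with rmax
and Km fixed, estimates Ki for each candidate inhibition model. The use of
scipy.optimize.curve_fit is an assumption of this sketch (any Gauss-Newton type
minimizer would do), and the inhibitor-A column follows the reconstruction of
Table 17.11 given above.

    # Sketch: parts (a) and (b) of the inhibition kinetics problem.
    import numpy as np
    from scipy.optimize import curve_fit

    S  = np.array([0.2, 0.33, 0.5, 1.0, 2.5, 4.0, 5.0])          # substrate, mM
    r0 = np.array([8.34, 12.48, 16.67, 25.0, 36.2, 40.0, 42.6])  # no inhibitor, mM/min
    rA = np.array([5.32, 6.26, 7.07, 8.56, 9.45, 9.60, 9.75])    # inhibitor A at 5 uM
    I  = 0.005                                                   # 5 uM expressed in mM

    # (a) Michaelis-Menten fit of the inhibitor-free data
    mm = lambda S, rmax, Km: rmax * S / (Km + S)
    (rmax, Km), _ = curve_fit(mm, S, r0, p0=[40.0, 1.0])
    print("rmax = %.2f  Km = %.3f" % (rmax, Km))

    # (b) with rmax and Km fixed, fit Ki for each candidate inhibition model
    models = {
        "competitive":    lambda S, Ki: rmax * S / (Km * (1 + I / Ki) + S),
        "uncompetitive":  lambda S, Ki: rmax * S / (Km + S * (1 + I / Ki)),
        "noncompetitive": lambda S, Ki: rmax * S / ((Km + S) * (1 + I / Ki)),
    }
    for name, f in models.items():
        (Ki,), _ = curve_fit(f, S, rA, p0=[0.01])
        ssr = np.sum((rA - f(S, Ki)) ** 2)       # residual sum of squares
        print("%-15s Ki = %.4g mM  SSR = %.3g" % (name, Ki, ssr))

The residual sums of squares computed this way feed directly into the model
adequacy tests of part (c).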
17.2.3 Determination of kLa in Bubble-free Bioreactors
Kalogerakis and Behie (1997) have reported experimental data from dy-
namic gassing-in experiments in three identical 1000 L bioreactors designed for
the cultivation of anchorage dependent animal cells. The bioreactors have an in-
ternal conical aeration area while the remaining bubble-free region contains the
microcarriers. A compartmental mathematical model was developed by Kalo-
gerakis and Behie (1997) that relates the apparent kLa to operational and bioreactor
design parameters. The raw data from a typical run are shown in Table 17.12.
In the absence of viable cells in the bioreactor, an effective mass transfer
coefficient can be obtained from

dC_O2/dt = kLa (C*_O2 − C_O2)    (17.17)

where C_O2 is the dissolved oxygen concentration in the bubble-free region and
C*_O2 is the dissolved oxygen concentration in equilibrium with the oxygen in the
air supply at the operating pressure and temperature of the system. Since the dis-
solved oxygen concentration is measured as percent of saturation, DO (%), the
above equation can be rewritten as

dDO/dt = kLa [DO_100% − DO]    (17.18)
where the following linear calibration curve has been assumed for the probe

DO = DO_0% + (DO_100% − DO_0%) (C_O2 / C*_O2)    (17.19)

The constants DO_100% and DO_0% are the DO values measured by the probe
at 100% and 0% saturation respectively. With this formulation one can readily ana-
lyze the operation with oxygen-enriched air.
Upon integration of Equation 17.18 we obtain

DO(t) = DO_100% − [DO_100% − DO(t₀)] exp(−kLa (t − t₀))    (17.20)
Using the data shown in Table 17.12 you are asked to determine the effec-
tive kLa of the bioreactor employing different assumptions:
(a) Plot the data (DO(t) versus time) and determine DO_100%, i.e., the
steady state value of the DO transient. Generate the differences
[DO_100% − DO(t)] and plot them on a semi-log scale with respect to time.
In this case kLa can be simply obtained as the slope of the line formed by
the transformed data points.
(b) Besides kLa, consider DO_100% as an additional parameter and estimate
simultaneously both parameters using nonlinear least squares estima-
tion. In this case assume that DO(t₀) is known precisely and its value
is given in the first data entry of Table 17.12.
(c) Redo part (b) above; however, consider that DO(t₀) is also an un-
known parameter.
Finally, based on your findings discuss the reliability of each solution ap-
proach.
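A minimal Python sketch of approaches (a) and (b) is given below. The time/DO
arrays are synthetic placeholders generated from Equation 17.20 (they are not the
Table 17.12 values), so that the two estimation routes can be compared on the
same transient; scipy.optimize.curve_fit stands in for the nonlinear least squares
step.

    # Sketch: graphical (a) versus nonlinear least squares (b) estimation of kLa.
    import numpy as np
    from scipy.optimize import curve_fit

    t  = np.arange(0.0, 70.0, 5.0)                    # time (placeholder units)
    DO = 90.0 - (90.0 - 5.0) * np.exp(-0.05 * t)      # synthetic DO(t) transient, %

    # (a) take DO_100% as the observed plateau and regress
    # ln[DO_100% - DO(t)] on t; the slope of the line is -kLa.
    DO100 = DO[-1] + 0.5                              # crude plateau estimate
    slope, _ = np.polyfit(t, np.log(DO100 - DO), 1)
    print("kLa from semilog plot: %.4f" % -slope)

    # (b) treat DO_100% as an additional parameter; DO(t0) is fixed
    # to the first data entry, as the problem statement suggests.
    DO0 = DO[0]
    model = lambda t, kLa, DO100: DO100 - (DO100 - DO0) * np.exp(-kLa * t)
    (kLa, DO100_hat), _ = curve_fit(model, t, DO, p0=[0.1, DO.max()])
    print("kLa = %.4f  DO_100%% = %.2f" % (kLa, DO100_hat))

Comparing how sensitive the part (a) estimate is to the assumed plateau value
against the part (b) estimate is exactly the reliability discussion the problem asks
for.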
Table 17.12: Bubble-free Oxygenation: Experimental Data from a Typical
Gassing-in Experiment†

t | DO (%)
10 | 6.81
13 | 8.81
25 | 14.31
33 | 16.71
34 | 16.90
35 | 17.21
36 | 17.40
37 | 17.61
38 | 17.80
39 | 18.00
40 | 18.11
44 | 18.90
53 | 20.00
54 | 20.11
67 | 21.21

† Bioreactor No. 1, working volume 500 L, 40 RPM, air flow 0.08 vvm.
Source: Kalogerakis and Behie (1997).
17.3 ORDINARY DIFFERENTIAL EQUATION MODELS
17.3.1 Contact Inhibition in Microcarrier Cultures of MRC-5 Cells
Contact inhibition is a characteristic of the growth of anchorage dependent
cells grown on microcarriers as a monolayer. Hawboldt et al. (1994) reported data
on MRC-5 cells grown on Cytodex 1 microcarriers and they are reproduced here in
Table 17.13.
Growth inhibition in microcarrier cultures can be best quantified by cellu-
lar automata (Hawboldt et al., 1994; Zygourakis et al., 1991) but simpler models
have also been proposed. For example, Frame and Hu (1988) proposed the fol-
lowing model

dx/dt = …    (17.21)
where x is the average cell density in the culture and μmax and C are adjustable
parameters. The constant x∞ represents the maximum cell density that can be
achieved at confluence. For microcarrier cultures this constant depends on the
inoculation level since different inocula result in different percentages of beads
with no cells attached. This portion of the beads does not reach confluence since at
a low bead loading there is no cell transfer from bead to bead (Forestell et al.,
1992; Hawboldt et al., 1994). The maximum specific growth rate, μmax, and the
constant C are expected to be the same for the two experiments presented here as
the only difference is the inoculation level. The other parameter, x∞, depends on
the inoculation level (Forestell et al., 1992) and hence it cannot be considered the
same. As initial condition we shall consider the first measurement of each data set.
A rather good estimate of the maximum growth rate, μmax, can be readily
obtained from the first few data points in each experiment as the slope of ln(x)
versus time. During the early measurements the contact inhibition effects are
rather negligible and hence μmax is estimated satisfactorily. In particular, for the
first experiment inoculated with 2.26 cells/bead, μmax was found to be 0.026 (h⁻¹)
whereas for the second experiment inoculated with 4.76 cells/bead μmax was esti-
mated to be 0.0253 (h⁻¹). Estimates of the unknown parameter, x∞, can be obtained
directly from the data as the maximum cell density achieved in each experiment;
namely, 1.8 and 2.1 (10⁶ cells/mL). To aid convergence and simplify programming
we have used 1/x∞ rather than x∞ as the parameter to be estimated.
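The slope-based estimate of μmax described above amounts to a linear regression
of ln(x) on time over the early measurements, where contact inhibition is still
negligible. A minimal sketch, with placeholder data standing in for the first few
entries of Table 17.13:

    # Sketch: mu_max from the initial slope of ln(x) versus t.
    import numpy as np

    t = np.array([0.0, 20.0, 42.0, 66.0])       # h (placeholder values)
    x = np.array([0.24, 0.40, 0.72, 1.33])      # 10^6 cells/mL (placeholder values)

    mu_max, ln_x0 = np.polyfit(t, np.log(x), 1) # slope = mu_max, intercept = ln x(0)
    print("mu_max = %.4f 1/h" % mu_max)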
Let us consider first the regression of the data from the first experiment (in-
oculation level = 2.26). Using as an initial guess [1/x∞, C, μmax]⁽⁰⁾ = [0.55, 5, 0.026],
the Gauss-Newton method converged to the optimum within 13 iterations. The
optimal parameter values were [0.6142 ± 2.8%, 2.86 ± 41.9%, 0.0280 ± 12.1%]
corresponding to an LS objective function value of 0.091621. The raw data together
with the model calculated values are shown in Figure 17.14.
Table 17.13: Growth of MRC-5 Cells: Cell Density versus Time for
MRC-5 Cells Grown on Cytodex 1 Microcarriers in 250 mL
Spinner Flasks Inoculated at 2.26 and 4.76 cells/bead

Source: Hawboldt et al. (1994).
Next, using the converged parameter values as an initial guess for the second
set of data (inoculation level = 4.76), we encounter convergence problems as the
parameter estimation problem is severely ill-conditioned. Convergence was only
achieved with a nonzero value for Marquardt's parameter. Two of the
parameters, x∞ and μmax, were estimated quite accurately; however, parameter C
was very poorly estimated. The best parameter values were [0.5308 ± 2.5%, 119.7
± 4.1×10⁵ %, 0.0265 ± 3.1%] corresponding to an LS objective function value of
0.079807. The raw data together with the model calculated values are shown in
Figure 17.15. As seen, the overall match is quite acceptable.
Finally, we consider the simultaneous regression of both data sets. Using the
converged parameter values from the first data set as an initial guess, convergence
was obtained in 11 iterations. In this case parameters C and μmax were common to
both data sets; however, two different values for x∞ were used, one for each data
set. The best parameter values were 1/x∞ = 0.6310 ± 2.9% for the first data set and
1/x∞ = 0.5233 ± 2.6% for the second one. Parameters C and μmax were found to
have values equal to 4.23 ± 40.3% and 0.0265 ± 6.5% respectively. The value of
the LS objective function was found to be equal to 0.21890. The model calculated
values using the parameters from the simultaneous regression of both data sets are
also shown in Figures 17.14 and 17.15.
Figure 17.14 Cell density (10⁶ cells/mL) versus time for MRC-5 cells inoculated
at 2.26 cells/bead: raw data and model calculated values.
Figure 17.15 Cell density (10⁶ cells/mL) versus time for MRC-5 cells inoculated
at 4.76 cells/bead: raw data together with model calculations using
parameters from the regression of this data set only and from the
simultaneous regression of both data sets.
17.4 PROBLEMS WITH ODE MODELS
17.4.1 Vero Cells Grown on Microcarriers (Contact Inhibition)
Hawboldt et al. (1994) have also reported data on anchorage dependent
Vero cells grown on Cytodex I microcarriers in 250 mL spinner flasks and they are
reproduced here in Table 17.14.
You are asked to determine the adjustable parameters in the growth model
proposed by Frame and Hu (1988)

dx/dt = …    (17.22)
Table 17.14: Growth of Vero Cells: Cell Density versus Time for
Vero Cells Grown on Cytodex I Microcarriers in 250
mL Spinner Flasks at Two Inoculation Levels (2.00 and
9.72 cells/bead)

Time (h) | Cell Density (10⁶ cells/mL)
 | Inoculation level: 2.00 (cells/bead) | Inoculation level: 9.72 (cells/bead)
0.0   | 0.07 | 0.33
9.9   | 0.12 | 0.37
24.1  | 0.18 | 0.52
36.8  | 0.21 | 0.56
51.0  | 0.24 | 0.65
76.5  | 0.46 | 1.15
85.0  | 0.54 | 1.49
99.2  | 0.74 | 1.84
108.4 | 0.99 | 2.07
119.0 | 1.24 | 2.60
144.5 | 1.98 | 3.64
155.9 | 2.19 | 3.53
168.6 | 2.64 | 3.64
212.6 | 3.68 | 4.36
223.9 | 3.43 | 4.42
236.6 | 3.50 | 4.48
260.7 | 3.97 | 4.77
286.9 | 3.99 | 4.63

Source: Hawboldt et al. (1994)
The parameter estimation should be performed for the following two cases:
(1) x(0) is known precisely and it is equal to x₀ (i.e., the value given in
Table 17.14 at time t = 0).
(2) x(0) is not known and hence x₀ is to be considered just as an ad-
ditional measurement (see the sketch following this list).
Furthermore, you are asked to determine:
(3) What is the effect of assuming that x₀ is unknown on the standard
error of the model parameters?
(4) Is parameter C independent of the inoculation level?
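The following sketch shows the mechanics of case (2): the initial condition x(0) is
carried as an extra parameter and, at the same time, contributes an extra residual
against the measured value at t = 0. Because Equation 17.22 did not reproduce
here, a simple logistic law dx/dt = μ x (1 − x/x∞) is used as a stand-in for the
Frame and Hu model; the data are a subset of the 2.00 cells/bead column of Table
17.14.

    # Sketch: estimating (mu, xinf, x0) with x0 treated as an additional measurement.
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import least_squares

    t_obs = np.array([0.0, 24.1, 51.0, 99.2, 155.9, 223.9, 286.9])   # h
    x_obs = np.array([0.07, 0.18, 0.24, 0.74, 2.19, 3.43, 3.99])     # 10^6 cells/mL

    def residuals(p):
        mu, xinf, x0 = p
        sol = solve_ivp(lambda t, x: mu * x * (1.0 - x / xinf),
                        (t_obs[0], t_obs[-1]), [x0], t_eval=t_obs)
        # x0 enters both as initial condition and as an extra "measurement"
        return np.append(sol.y[0] - x_obs, x0 - x_obs[0])

    fit = least_squares(residuals, [0.03, 4.0, 0.1], bounds=(0.0, np.inf))
    print("mu, xinf, x0 =", fit.x)

Dropping the last residual (and fixing x0 = x_obs[0]) gives case (1); comparing
the two covariance matrices answers question (3).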
17.4.2 Effect of Temperature on Insect Cell Growth Kinetics
Andersen et al. (1996) and Andersen (1995) have studied the effect of tem-
perature on recombinant protein production using a baculovirus/insect cell ex-
pression system. In Tables 17.15, 17.16, 17.17, 17.18 and 17.19 we reproduce the
growth data obtained in spinner flasks (batch cultures) using Bombyx mori (Bm5)
cells adapted to serum-free media (Ex-Cell 400). The working volume was 125
mL and samples were taken twice daily. The cultures were carried out at five dif-
ferent incubation temperatures (22, 26, 28, 30 and 32°C).

Table 17.15: Growth of Bm5 Cells: Growth Data taken from a Batch
Culture of Bm5 Cells Incubated at 22°C

Source: Andersen et al. (1996) & Andersen (1995).
Table 17.16: Growth of Bm5 Cells: Growth Data taken from a Batch
Culture of Bm5 Cells Incubated at 26°C

t | xv | xd
257.0 | 29.00 | 30.40
283.0 | 25.20 | 29.05

Source: Andersen et al. (1996) & Andersen (1995)
Table 17.17: Growth of Bm5 Cells: Growth Data taken from a Batch
Culture of Bm5 Cells Incubated at 28°C

t | xv | xd | Gls | Lac
190.0 | 29.685 | 30.438 | 0.087 | 0.020
216.7 | 30.452 | 32.030 | 0.080 | 0.027
237.0 | 25.890 | 33.000 | 0.080 | 0.040
259.0 | 18.900 | 31.500 | |

Source: Andersen et al. (1996) & Andersen (1995).
Table 17.18: Growth of Bm5 Cells: Growth Data taken from a Batch
Culture of Bm5 Cells Incubated at 30°C

t | xv | xd | Gls | Lac
0.0   | 2.800  | 2.950  | 2.50 | 0.030
216.0 | 24.290 | 26.708 | 0.08 | 0.053
262.0 | 15.000 | 26.600 | 0.11 | 0.073

Source: Andersen et al. (1996) & Andersen (1995).
Table 17.19: Growth of Bm5 Cells: Growth Data taken from a Batch
Culture of Bm5 Cells Incubated at 32°C

t | xv | xd | Gls | Lac
0.0   | 1.98  | 2.18  | 2.260 | 0.020
24.0  | 3.12  | 3.45  | 2.100 | 0.020
40.0  | 3.64  | 4.11  | 1.950 | 0.020
67.0  | 7.19  | 7.58  | 1.610 | 0.020
91.0  | 11.57 | 11.95 | 1.060 | 0.020
137.0 | 20.50 | 21.09 | 0.087 | 0.073
166.8 | 23.63 | 24.92 | 0.093 | 0.020
192.3 | 24.06 | 26.03 | 0.090 | 0.040
212.3 | 24.31 | 26.83 | 0.140 | 0.020
235.0 | 22.75 | 28.13 | 0.140 | 0.040
261.0 | 12.00 | 26.63 | 0.126 | 0.220

Source: Andersen et al. (1996) & Andersen (1995).
In the above tables the following notation has been used:
t    Elapsed Time (h)
xv   Viable Cell Density (10⁵ cells/mL)
xd   Nonviable Cell Density (10⁵ cells/mL)
Gls  Glucose (g/L)
Lac  Lactate (g/L)
At each temperature the simple Monod kinetic model can be used, com-
bined with material balances, to arrive at the following unstructured model

dxv/dt = (μ − kd) xv    (17.23a)

dxd/dt = kd xv    (17.23b)

dS/dt = −(μ/Y) xv    (17.23c)

where

μ = μmax S / (Ks + S)    (17.23d)

The limiting substrate (glucose) concentration is denoted by S. There are
four parameters: μmax is the maximum specific growth rate, Ks is the saturation
constant for S, kd is the specific death rate and Y is the average yield coefficient
(assumed constant).
In this problem you are asked to:
(1) Estimate the parameters (μmax, Ks, kd and Y) for each operating tempera-
ture. Use the portion of the data where glucose is above the threshold
value of 0.1 g/L, which corresponds approximately to the exponential
growth period of the batch cultures.
(2) Examine whether any of the estimated parameters follow an Arrhenius-
type relationship. If they do, re-estimate these parameters simultaneously.
A better way to numerically evaluate Arrhenius-type constants is through
the use of a reference value. For example, if we consider the death rate, kd,
as a function of temperature we have

kd = A exp(−E/(R T))    (17.24)

At the reference temperature, T₀ (usually taken close to the mean operat-
ing temperature), the previous equation becomes

kd0 = A exp(−E/(R T₀))    (17.25)

By dividing the two expressions, we arrive at

kd = kd0 exp[−(E/R)(1/T − 1/T₀)]    (17.26)

Equation 17.26 behaves much better numerically than the standard Ar-
rhenius equation and it is particularly suited for parameter estimation
and/or simulation purposes. In this case instead of A and E/R we estimate
kd0 and E/R. In this example you may choose T₀ = 28°C.
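A small numerical illustration of this reparameterization (with arbitrary illustra-
tive values for A and E) shows that the two forms are algebraically identical while
being very differently scaled:

    # Sketch: standard versus reference-value Arrhenius parameterization (Eq. 17.26).
    import numpy as np

    R = 8.314                                    # J/(mol K)

    def kd_standard(T, A, E):
        return A * np.exp(-E / (R * T))

    def kd_reparam(T, kd0, E, T0=301.15):        # T0 = 28 C = 301.15 K
        return kd0 * np.exp(-(E / R) * (1.0 / T - 1.0 / T0))

    A, E = 1.0e12, 8.0e4                         # illustrative values only
    T = np.array([295.15, 299.15, 301.15, 303.15, 305.15])
    kd0 = kd_standard(301.15, A, E)
    print(np.allclose(kd_standard(T, A, E), kd_reparam(T, kd0, E)))   # True

The estimated pair (kd0, E/R) is far better scaled than (A, E/R): kd0 is of the same
order as the measured rates, whereas A is astronomically large and almost per-
fectly correlated with E.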
18
Parameter Estimation in Petroleum Engineering
Parameter estimation is routinely used in many areas of petroleum engi-
neering. In this chapter, we present several such applications. First, we demon-
strate how multiple linear regression is employed to estimate parameters for a
drilling penetration rate model. Second, we use kinetic data to estimate parameters
from simple models that are used to describe the complex kinetics of bitumen low
temperature oxidation and high temperature cracking reactions of Alberta oil
sands. Finally, we describe an application of the Gauss-Newton method to PDE
systems. In particular we present the development of an efficient automatic history
matching simulator for reservoir engineering analysis.
18.1 MODELING OF DRILLING RATE USING CANADIAN
OFFSHORE WELL DATA
Offshore drilling costs may exceed similar land operations by 30 to 40 times
and hence it is important to be able to minimize the overall drilling time. This is
accomplished through mathematical modeling of the drilling penetration rate and
operation (Wee and Kalogerakis, 1989).
The processes involved in rotary drilling are complex and our current under-
standing is far from complete. Nonetheless, a basic understanding has come from
field and laboratory experience over the years. The most comprehensive model is
the one developed by Bourgoyne and Young (1986) that relates the penetration
rate (dD/dt) to eight process variables. The model is transformably linear with
respect to its eight adjustable parameters. The model for the penetration rate is
given by the exponential relationship

dD/dt = exp(a1 + Σ_{j=2..8} aj xj)    (18.1)

which can be transformed to a linear model by taking natural logarithms of both
sides to yield

ln(dD/dt) = a1 + Σ_{j=2..8} aj xj    (18.2)
The explanation of each drilling parameter (aj) related to the corresponding
drilling variable (xj) is given below:
a1  formation strength constant
a2  normal compaction trend constant
a3  undercompaction constant
a4  pressure differential constant
a5  bit weight constant
a6  rotary speed constant
a7  tooth wear constant
a8  hydraulics constant
The above model is referred to as the Bourgoyne-Young model. After careful
manipulation of a set of raw drilling data for a given formation type, a set of pene-
tration rate data is usually obtained. The drilling variables are also measured and
the measurements become part of the raw data set. The objective of the regression
is then to estimate the parameters aj by matching the model to the drilling penetra-
tion data.
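Since Equation 18.2 is linear in the parameters, the estimation reduces to ordinary
multiple linear regression. A minimal sketch with randomly generated placeholder
data (standing in for n processed observations of ln(dD/dt) and the drilling vari-
ables x2,…,x8):

    # Sketch: linear least squares for the Bourgoyne-Young model, Eq. 18.2.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 50
    X = rng.normal(size=(n, 7))                    # columns: x2 ... x8 (placeholders)
    a_true = np.array([0.9, 0.5, -0.3, 0.2, 0.4, 0.25, 0.1, 0.05])
    y = a_true[0] + X @ a_true[1:] + rng.normal(scale=0.05, size=n)   # ln(dD/dt)

    A = np.column_stack([np.ones(n), X])           # constant column carries a1
    a_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    print("estimated a1..a8:", np.round(a_hat, 3))

For the extended model of Equation 18.3, one simply appends the 0/1 indicator
columns x9 and x10 to the design matrix.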
Parameter a1 in Equation 18.1 accounts for the lumped effects of factors
other than those described by the drilling variables x2, x3,…,x8. Hence, its value is
expected to be different from well to well whereas the parameter values for
a2, a3,…,a8 are expected to be similar. Thus, when data from two wells (well A and
well B) are simultaneously analyzed, the model takes the form

dD/dt = exp(Σ_{j=2..8} aj xj + a9 x9 + a10 x10)    (18.3)
where the following additional definitions apply.
a9   formation strength constant (parameter a1) for well A
a10  formation strength constant (parameter a1) for well B
x9   = 1 for data from well A and 0 for all other data
x10  = 1 for data from well B and 0 for all other data
The above model is referred to as the Extended Bourgoyne-Young model.
Similarly, analysis of combined data from more than two wells is straightforward
and it can be done by adding a new variable and parameter for each additional
well. A large amount of field data from the drilling process is needed to reliably
estimate all model parameters.
Wee and Kalogerakis (1989) have also considered the simple three-
parameter model given next

dD/dt = exp(a1 + a5 x5 + a6 x6)    (18.4)
18.1.1 Application to Canadian Offshore Well Data
Wee and Kalogerakis (1989) tested the above models using Canadian off-
shore well penetration data (offshore drill operated by Husky Oil). Considerable
effort was required to convert the raw data into a set of data suitable for regression.
The complete dataset is given in the above reference.
Figure 18.1 Observed versus calculated penetration rate and 95% confidence inter-
vals for well A using the Bourgoyne-Young model [reprinted from the
Journal of Canadian Petroleum Technology with permission].
Figure 18.2 Observed versus calculated penetration rate and 95% confidence inter-
vals for well B using the Bourgoyne-Young model [reprinted from the
Journal of Canadian Petroleum Technology with permission].
Figure 18.3 Observed versus calculated penetration rate and 95% confidence inter-
vals for well A using the 5-parameter model [reprinted from the
Journal of Canadian Petroleum Technology with permission].
Figure 18.4 Observed versus calculated penetration rate and 95% confidence inter-
vals for well B using the 5-parameter model [reprinted from the
Journal of Canadian Petroleum Technology with permission].
As expected, the authors found that the eight-parameter model was suffi-
cient to model the data; however, they questioned the need to have eight parame-
ters. Figure 18.1 shows a plot of the logarithm of the observed penetration rate
versus the logarithm of the calculated rate using the Bourgoyne-Young model.
The 95% confidence intervals are also shown. The results for well B are shown in
Figure 18.2.
Results with the three-parameter model showed that the fit was poorer than
that obtained by the Bourgoyne-Young model. In addition the dispersion about the
45 degree line was more significant. The authors concluded that even though the
Bourgoyne-Young model gave good results it was worthwhile to eliminate possi-
ble redundant parameters. This would reduce the data requirements. Indeed, by
using appropriate statistical procedures it was demonstrated that for the data ex-
amined five out of the eight parameters were adequate to calculate the penetration
rate and match the data sufficiently well. The five parameters were a1, a2, a4, a6 and
a7 and the corresponding five-parameter model is given by

dD/dt = exp(a1 + a2 x2 + a4 x4 + a6 x6 + a7 x7)    (18.5)
Figures 18.3 and 18.4 show the observed versus calculated penetration rates
for wells A and B using the five-parameter model. As seen, the results have not
changed significantly by eliminating three of the parameters (a3, a5 and a8). The
elimination of each parameter was done sequentially through hypothesis testing.
Obviously, the fact that only the above five variables significantly affect the pene-
tration rate means that the three-parameter model is actually inadequate even
though it might be able to fit the data to some extent.
18.2 MODELING OF BITUMEN OXIDATION AND CRACKING
KINETICS USING DATA FROM ALBERTA OIL SANDS
In the laboratory of Professor R.G. Moore at the University of Calgary, ki-
netic data were obtained using bitumen samples of the North Bodo and Athabasca
oil sands of northern Alberta. Low temperature oxidation data were taken at 50,
75, 100, 125 and 150°C whereas the high temperature thermal cracking data were
taken at 360, 397 and 420°C.
Preliminary work showed that first order reaction models are adequate for
the description of these phenomena even though the actual reaction mechanisms
are extremely complex and hence difficult to determine. This simplification is a
desired feature of the models since such simple models are to be used in numerical
simulators of in situ combustion processes. The bitumen is divided into five major
pseudo-components: coke (COK), asphaltene (ASP), heavy oil (HO), light oil
(LO) and gas (GAS). These pseudo-components were lumped together as needed
to produce the two-, three- and four-component models considered to describe
these complicated reactions (Hanson and Kalogerakis, 1984).
18.2.1 Two-Component Models
In this class of models, the five bitumen pseudo-components are lumped
into two in an effort to describe the following reaction

R → P    (18.6)

where R (reactant) and P (product) are the two lumped pseudo-components. Of all
possible combinations, the four mathematical models that are of interest here are
shown in Table 18.1. An Arrhenius-type temperature dependence of the unknown
rate constants is always assumed.
Table 18.1 Bitumen Oxidation and Cracking: Formulation of Two-Component
Models

Model | ODE | Reactant | Product
A | dC_P/dt = k_A exp(−E_A/RT) (1 − C_R) | R = HO+LO | P = COK+ASP
B | dC_P/dt = k_B exp(−E_B/RT) (1 − C_R)^N_B | R = HO+LO | P = COK+ASP
D | dC_P/dt = k_D exp(−E_D/RT) (1 − C_R) | R = HO+LO+ASP | P = COK
E | dC_P/dt = k_E exp(−E_E/RT) (1 − C_R)^N_E | R = HO+LO+ASP | P = COK

In Table 18.1 the following variables are used in the model equations:
C_P  Product concentration (weight %)
C_R  Reactant concentration (weight %)
k_j  Reaction rate constant in model j = A, B, D, E
E_j  Energy of activation in model j = A, B, D, E
N_B  Exponent used in model B
N_E  Exponent used in model E
18.2.2 Three-Component Models
By lumping pseudo-components, we can formulate five three-component
models of interest. Pseudo-components shown together in a circle are treated as
one pseudo-component for the corresponding kinetic model.
Figure 18.5 Schematic of reaction network for model F.
Model F is depicted schematically in Figure 18.5 and the corresponding
mathematical model is given by the following two ODEs:

dC_ASP/dt = k1 (1 − C_ASP − C_COK) − k2 C_ASP    (18.7a)

Model G is depicted schematically in Figure 18.6 and the corresponding mathe-
matical model is given by the following two ODEs:

dC_ASP/dt = −k1 C_ASP    (18.8a)

(18.8b)
Model I is depicted schematically in Figure 18.7 and the corresponding mathe-
matical model is given by the following two ODEs:

dC_ASP/dt = k1 (1 − C_ASP − C_COK) − k2 C_ASP    (18.9a)

dC_COK/dt = k2 C_ASP    (18.9b)
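Model I is convenient to experiment with because its two ODEs are simple
enough to integrate directly. The sketch below fits (k1, k2) at a single temperature
by least squares; the concentration-versus-time values are invented placeholders,
not the Moore laboratory data.

    # Sketch: fitting model I (Eqs. 18.9a-b) at one temperature.
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import least_squares

    t_obs = np.array([0.0, 5.0, 10.0, 20.0, 40.0, 60.0])          # h (placeholders)
    C_obs = np.array([[0.18, 0.24, 0.28, 0.31, 0.30, 0.27],       # C_ASP (wt frac)
                      [0.00, 0.02, 0.05, 0.10, 0.19, 0.26]])      # C_COK (wt frac)

    def rhs(t, C, k1, k2):
        C_asp, C_cok = C
        return [k1 * (1.0 - C_asp - C_cok) - k2 * C_asp,          # Eq. 18.9a
                k2 * C_asp]                                       # Eq. 18.9b

    def residuals(k):
        sol = solve_ivp(rhs, (t_obs[0], t_obs[-1]), C_obs[:, 0],
                        t_eval=t_obs, args=tuple(k))
        return (sol.y - C_obs).ravel()

    fit = least_squares(residuals, [0.02, 0.02], bounds=(0.0, np.inf))
    print("k1, k2 =", fit.x)

Repeating the fit at each temperature and regressing ln(k) on 1/T then yields the
Arrhenius constants assumed for all kinetic parameters.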
Figure 18.8 Schematic of reaction network for model K.
Model K is depicted schematically in Figure 18.8 and the corresponding mathe-
matical model is given by the following two ODEs:

dC_COK/dt = k3 C_ASP+HO    (18.10b)

Figure 18.9 Schematic of reaction network for model J.
Model J is depicted schematically in Figure 18.9 and the corresponding mathe-
matical model is given by the following two ODEs:

dC_COK/dt = k3 C_ASP    (18.11a)

(18.11b)
In all the above three-component models as well as in the four-component
models presented next, an Arrhenius-type temperature dependence is assumed for
all the kinetic parameters. Namely, each parameter k_j is of the form A_j exp(−E_j/RT).
18.2.3 Four-Component Models
We consider the following four-component models. Model N is depicted
schematically in Figure 18.10 and the corresponding mathematical model is given
by the following three ODEs:

dC_HO/dt = −k3 C_HO − k1 C_HO    (18.12a)

dC_ASP/dt = k1 C_HO − k2 C_ASP    (18.12b)

dC_COK/dt = k2 C_ASP    (18.12c)
Figure 18.11 Schematic of reaction network for model M.
Model M is depicted schematically in Figure 18.11 and the corresponding mathe-
matical model is given by the following three ODEs:

dC_COK/dt = k4 C_ASP    (18.13b)

(18.13c)
Figure 18.12 Schematic of reaction network for model O.
Model O is depicted schematically in Figure 18.12 and the corresponding mathe-
matical model is given by the following three ODEs:

dC_ASP/dt = −k1 C_ASP − k2 C_ASP + k3 C_HO − k5 C_ASP    (18.14a)

(18.14b)

(18.14c)
18.2.4 Results and Discussion
The two-component models are "too simple" to be able to describe the
complex reactions taking place. Only model D was found to describe early coke
(COK) production adequately. For Low Temperature Oxidation (LTO) conditions
the model was adequate only up to 45 h and for cracking conditions up to 25 h.
The three-component models were found to fit the experimental data better
than the two-component ones. Model I was found to be able to fit both LTO and
cracking data very well. This model was considered the best of all models even
though it is unable to calculate the HO/LO split (Hanson and Kalogerakis, 1984).
Four-component models were found very difficult or impossible to con-
verge. Models K, M and O are more complicated and have more reaction paths
compared to models I or N. Whenever the parameter with the highest variance was
eliminated in any of these three models, it would revert back to the simpler ones:
model I or N. Model N was the only four pseudo-component model that con-
verged. This model also provides an estimate of the HO/LO split. This model to-
gether with model I were recommended for use in in situ combustion simulators
(Hanson and Kalogerakis, 1984). Typical results are presented next for model I.
Figures 18.13 through 18.17 show the experimental data and the calcula-
tions based on model I for the low temperature oxidation at 50, 75, 100, 125 and
150°C of a North Bodo oil sands bitumen with a 5% oxygen gas. As seen, there is
generally good agreement between the experimental data and the results obtained
by the simple three pseudo-component model at all temperatures except the run at
125°C. The only drawback of the model is that it cannot calculate the HO/LO
split. The estimated parameter values for models I and N are shown in Table 18.2.
The observed large standard deviations in the parameter estimates are rather typical
for Arrhenius-type expressions.
In Figures 18.18, 18.19 and 18.20 the experimental data and the calcula-
tions based on model I are shown for the high temperature cracking at 360, 397
and 420°C of an Athabasca oil sands bitumen (Drum 20). Similar results are seen
in Figures 18.21, 18.22 and 18.23 for another Athabasca oil sands bitumen (Drum
433). The estimated parameter values for model I are shown in Table 18.3 for
Drums 20 and 433.
Figure 18.13 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 50°C for the low temperature oxidation of North Bodo oil
sands bitumen using model I.
Figure 18.14 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 75°C for the low temperature oxidation of North Bodo oil
sands bitumen using model I.
Figure 18.15 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 100°C for the low temperature oxidation of North Bodo oil
sands bitumen using model I.
Figure 18.16 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 125°C for the low temperature oxidation of North Bodo oil
sands bitumen using model I.
Figure 18.17 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 150°C for the low temperature oxidation of North Bodo oil
sands bitumen using model I.
Table 18.2 Estimated Parameter Values for Models I and N for the Low
Temperature Oxidation of North Bodo Oil Sands Bitumen

Source: Hanson and Kalogerakis (1984).
Table 18.3 Estimated Parameter Values for Model I for the High Temperature
Cracking of Athabasca Oil Sands Bitumen (Drums 20 and 433); columns:
Bitumen Code, Parameter, Parameter Values, Standard Deviation (%)

Source: Hanson and Kalogerakis (1984).
Figure 18.18 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 360°C for the high temperature cracking of Athabasca oil
sands bitumen (Drum 20) using model I.
Figure 18.19 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 397°C for the high temperature cracking of Athabasca oil
sands bitumen (Drum 20) using model I.
Figure 18.20 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 420°C for the high temperature cracking of Athabasca oil
sands bitumen (Drum 20) using model I.
Figure 18.21 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 360°C for the high temperature cracking of Athabasca oil
sands bitumen (Drum 433) using model I.
Figure 18.22 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 397°C for the high temperature cracking of Athabasca oil
sands bitumen (Drum 433) using model I.
Figure 18.23 Experimental and calculated concentrations of Coke (COK),
Asphaltene (ASP) and Heavy Oil + Light Oil (HO+LO)
at 420°C for the high temperature cracking of Athabasca oil
sands bitumen (Drum 433) using model I.
18.3 AUTOMATIC HISTORY MATCHING IN RESERVOIR
ENGINEERING
History matching in reservoir engineering refers to the process of estimating
hydrocarbon reservoir parameters (like porosity and permeability distributions) so
that the reservoir simulator matches the observed field data in some optimal fash-
ion. The intention is to use the history-matched model to forecast future behavior
of the reservoir under different depletion plans and thus optimize production.
18.3.1 A Fully Implicit, Three-Dimensional, Three-Phase Simulator with
Automatic History-Matching Capability
The mathematical model for a hydrocarbon reservoir consists of a number of
partial differential equations (PDEs) as well as algebraic equations. The number of
equations depends on the scope/capabilities of the model. The set of PDEs is often
reduced to a set of ODEs by grid discretization. The estimation of the reservoir
parameters of each grid cell is the essence of history matching.
The discretized reservoir model can be written in the general form presented
in Section 10.3. The state variables are the pressure and the oil, water and gas satu-
rations at each grid cell in a "black oil" reservoir model. The vector of calculated
variables y is related to the state variables through a nonlinear relationship. For
example, the computed water to oil ratio from a production oil well is a nonlinear
function of pressure, saturation levels and fluid composition of the grid cells the
well is completed in. In some cases we may also observe a very limited number of
the state variables. Normally measurements are available from production or ob-
servation wells placed throughout the reservoir. In general, function evaluations
are time consuming and consequently the number of function evaluations required
by the iterative algorithm should be minimized (Tan and Kalogerakis, 1991).
The parameters are estimated by minimizing the usual LS objective function
where the weighting matrix is often chosen as a diagonal matrix that normalizes
the data and makes all measurements be of the same order of magnitude. Statisti-
cally, this is the correct approach if the error in the measurements is proportional
to the magnitude of the measured variable.
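A two-line illustration of this weighting choice: with Q_ii = 1/ŷ_i², each weighted
residual becomes a relative error, so pressures in the thousands of psi and produc-
tion ratios of order one contribute comparably to the objective function. The
numbers below are placeholders.

    # Sketch: diagonal weighting that normalizes measurements of mixed magnitude.
    import numpy as np

    y_obs  = np.array([3.2e3, 2.9e3, 1.5, 1.2, 4.0e2])   # mixed magnitudes
    y_calc = np.array([3.1e3, 3.0e3, 1.4, 1.3, 3.7e2])

    Q = np.diag(1.0 / y_obs**2)          # diagonal weighting matrix
    r = y_obs - y_calc
    S_LS = r @ Q @ r                     # every term is now O(relative error^2)
    print("weighted LS objective:", S_LS)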
Tan and Kalogerakis (1991) modified a three-dimensional, three-phase (oil,
water, gas) multi-component (N_c components) industrially available simulator.
The reservoir simulation model consisted of a set of ODEs that described the
component material balances subject to constraints for the saturations, S, and the
mole fractions, x_ip, i = 1,…,N_c, in each of the three phases (oil, gas, water). Each
component could be found in any one of the three phases. The state variables were
pressure, saturations (S_gas, S_water, S_oil) and the master mole fractions at each grid
cell of the reservoir. The master mole fractions are related to the actual mole frac-
tions through the equilibrium distribution ratios also known as "K-values." It is
known from phase equilibrium thermodynamics that the "K-values" are a con-
venient way to describe phase behavior. The parameters to be estimated were po-
rosity (φ) and horizontal and vertical permeability (k_h and k_v) in each grid cell. The
observed variables that could be matched were the pressures of the grid cells, wa-
ter-oil ratios and gas-oil ratios of the individual production wells, and flowing
bottom hole pressures. The Gauss-Newton algorithm was implemented with an
optimal step size policy to avoid overstepping. An automatic timestep selector
adjusted the timesteps in order to ensure that the simulator performs the calcula-
tions of the pressures and the producing ratios at the corresponding observation
times.
Furthermore, the implementation of the Gauss-Newton method also incorpo-
rated the use of the pseudo-inverse method to avoid instabilities caused by the ill-
conditioning of matrix A as discussed in Chapter 8. In reservoir simulation this
may occur for example when a parameter zone is outside the drainage radius of a
well and is therefore not observable from the well data. Most importantly, in order
to realize substantial savings in computation time, the sequential computation of
the sensitivity coefficients discussed in detail in Section 10.3.1 was implemented.
Finally, the numerical integration procedure that was used was a fully implicit one
to ensure stability and convergence over a wide range of parameter estimates.
18.3.2 Application to a Radial Coning Problem (Second SPE Comparative
Solution Problem)
For illustration purposes, Tan and Kalogerakis (1991) applied their history
matching approach to the Second SPE Comparative Solution Problem. The prob-
lem is described in detail by Chappelear and Nolen (1986). The reservoir consists
of ten concentric rings and fifteen layers. A gas cap, an oil zone and a water zone
are all present. Starting with an arbitrary initial guess of the reservoir porosity and
permeability distribution it was attempted to match: (a) the observed reservoir
pressure; (b) the water-oil ratio; (c) the gas-oil ratio; (d) the bottom hole pressure;
and (e) combinations of observed data. It is noted that the simulator with its built-
in parameter estimation capability can match reservoir pressures, water-oil ratios,
gas-oil ratios and flowing bottom hole pressures either individually or all of them
simultaneously. The reservoir model was used to generate artificial observations
using the original values of the reservoir parameters and by adding a random noise
term. Subsequently, starting with an arbitrary initial guess of the reservoir pa-
rameters, it was attempted to recover the original reservoir parameters by match-
ing the "observed" data.
The various simulation runs revealed that the Gauss-Newton implementa-
tion by Tan and Kalogerakis (1991) was extremely efficient compared to other
reservoir history matching methods reported earlier in the literature.
18.3.2.1 Matching Reservoir Pressure
The fifteen layers of constant permeability and porosity were taken as the
reservoir zones for which these parameters would be estimated. The reservoir
pressure is a state variable and hence in this case the relationship between the out-
put vector (observed variables) and the state variables is of the form y(t_i) = C x(t_i).
Tan and Kalogerakis (1991) performed four different runs. In the 1st run,
layers 3 to 12 were selected as the zones whose horizontal permeabilities were the
unknown parameters. The initial guess for all the permeabilities was 200 md. It
was found that the original permeability values were obtained in nine iterations of
the Gauss-Newton method by matching the pressure profile. In Figures 18.24a and
18.24b the parameter values and the LS objective function are shown during the
course of the iterations. The cpu time required for the nine iterations was equiva-
lent to 19.51 model-runs by the simulator. Therefore, the average cpu time for one
iteration was 2.17 model-runs. This is the time required for one model-run (com-
putation of all state variables) and for the computation of the sensitivity coeffi-
cients of the ten parameters, which was obviously 1.17 model-runs. This compares
exceptionally well to 10 model-runs that would have normally been required by
the standard formulation of the Gauss-Newton method. These savings in compu-
tation time are solely due to the efficient computation of the sensitivity coefficients
as described in Section 10.3.1. This result represents an order of magnitude reduc-
tion in computational requirements and makes automatic history matching through
nonlinear regression practical and economically feasible for the first time in reser-
voir simulation (Tan and Kalogerakis, 1991).
Next, in the 2nd run the horizontal permeabilities of all 15 layers were esti-
mated by using the value of 200 md as initial guess. It required 12 iterations to
converge to the optimal permeability values.
In the 3rd run the porosity of the ten zones was estimated by using an initial
guess of 0.1. Finally, in the 4th run the porosity of all fifteen zones was estimated
by using the same initial guess (0.1) as above. In this case, matrix A was found to
be extremely ill-conditioned and the pseudo-inverse option had to be used.
Unlike the permeability runs, the results showed that the observed data
were not sufficient to distinguish all fifteen values of the porosity. The ill-
conditioning of the problem was mainly due to the limited observability and it
could be overcome by supplying more information such as additional data or by a
re-parameterization of the reservoir model itself (rezoning the reservoir).
18.3.2.2 Matching Water-Oil Ratio, Gas-Oil Ratio or Bottom Hole Pressure
The water-oil ratio is a complex time-dependent function of the state vari-
ables since a well can produce oil from several grid cells at the same time. In this
case the relationship of the output vector and the state variables is nonlinear of the
form y(t_i) = h(x(t_i)).
The centrally located well is completed in layers 7 and 8. The set of obser-
vations consisted of water-oil ratios at 16 timesteps that were obtained from a base
run. These data were then used in the next three runs to be matched with calcu-
lated values. In the 1st run the horizontal permeabilities of layers 7 and 8 were es-
timated using an initial guess of 200 md. The optimum was reached within five
iterations of the Gauss-Newton method. In the 2nd run the objective was to esti-
mate the unknown permeabilities in layers 6 to 9 (four zones) using an initial
guess of 200 md. In spite of the fact that the calculated water-oil ratios agreed with
the observed values very well, the calculated permeabilities were not found to be
close to the reservoir values. Finally, the porosity of layers 6 to 9 was estimated in
the 3rd run in five Gauss-Newton iterations and the estimated values were found to
be in good agreement with the reservoir porosity values.
Similar findings were observed for the gas-oil ratio or the bottom hole pres-
sure of each well, which is also a state variable when the well production rate is
capacity restricted (Tan and Kalogerakis, 1991).
18.3.2.3 Matching All Observed Data
In this case the observed data consisted of the water-oil ratios, gas-oil ratios,
flowing bottom hole pressure measurements and the reservoir pressures at two
locations of the well (layers 7 and 8). In the first run, the horizontal permeabilities
of layers 6 to 9 were estimated by using the value of 200 md as the initial guess.
As expected, the estimated values were found to be closer to the correct ones
compared with the estimated values when the water-oil ratios only are matched. In
the 2nd run, the horizontal permeabilities of layers 5 to 10 (6 zones) were estimated
using the value of 200 md as initial guess. It was found necessary to use the
pseudo-inverse option in this case to ensure convergence of the computations. The
initial and converged profiles generated by the model are compared to the ob-
served data in Figures 18.25a and 18.25b.
It was also attempted to estimate permeability values for eight zones but it
was not successful. It was concluded that in order to extend the portion of the reservoir
that can be identified from measurements one needs observation data over a longer history.
Finally, in another run, it was shown that the porosities of layers 5 to 10 could be
readily estimated within 10 iterations. However, it was not possible to estimate the
porosity values for eight layers due to the same reason as for the permeabilities.
Having performed all the above computer runs, a simple linear correlation
was found relating the required cpu time as model equivalent runs (MER) to the
number of unknown parameters regressed. The correlation was

MER = 0.319 + 0.07 p    (18.15)

This indicates that after an initial overhead of 0.319 model-runs to set up the
algorithm, an additional 0.07 of a model-run was required for the computation of
the sensitivity coefficients for each additional parameter. This is about 14 times
less compared to the one additional model-run required by the standard imple-
mentation of the Gauss-Newton method. Obviously these numbers serve only as a
guideline; however, the computational savings realized through the efficient inte-
gration of the sensitivity ODEs are expected to be very significant whenever an
implicit or semi-implicit reservoir simulator is involved.
Figure 18.24 2nd SPE problem with 15 permeability zones and using all measure-
ments: (a) reduction of LS objective function and (b) parameter
values during the course of the iterations [reprinted with permission
from the Society of Petroleum Engineers].
Figure 18.25 2nd SPE problem: (a) observed and calculated water-oil ratio and
(b) observed and calculated reservoir pressures at layers 7 and 8,
showing the initial and converged profiles [reprinted with permission
from the Society of Petroleum Engineers].
18.3.3 A Three-Dimensional, Three-Phase Automatic History-Matching
Model: Reliability of Parameter Estimates
An important benefit of using a three-dimensional, three-phase automatic
history matching simulator is that besides the estimation of the unknown parame-
ters, it is possible to analyze the quality of the fit that has been achieved between
the model and the field data. In the history matching phase of a reservoir simula-
tion study, the practicing engineer would like to know how accurately the pa-
rameters have been estimated, whether a particular parameter has a significant
influence on the history match and to what degree. The traditional procedure of
history matching by varying different reservoir parameters based on engineering
judgment may provide an adequate match; however, the engineer has no measure
of the reliability of the fitted parameters.
As already discussed in Chapter 11, matrix A calculated during each itera-
tion of the Gauss-Newton method can be used to determine the covariance matrix
of the estimated parameters, which in turn provides a measure of the accuracy of
the parameter estimates (Tan and Kalogerakis, 1992).
The covariance matrix, COV(k*), of the parameters is given by Equation
11.1, i.e., COV(k*) = [A*]⁻¹ S_LS(k*)/(d.f.), and the variances of the estimated pa-
rameters are simply obtained as the diagonal elements of COV(k*). It is reminded
that the square roots of the variances are the standard deviations of the parameter
estimates (also known as the standard estimation error). The magnitude of the
standard deviation of a particular parameter estimate indicates the degree of confi-
dence that should be placed on that parameter value.
The covariances between the parameters are the off-diagonal elements of the
covariance matrix. The covariance indicates how closely two parameters are cor-
related. A large value for the covariance between two parameter estimates indi-
cates a very close correlation. Practically, this means that these two parameters
may not be possible to estimate separately. This is shown better through the
correlation matrix. The correlation matrix, R, is obtained by transforming the co-
variance matrix as follows

R = D⁻¹ [COV(k*)] D⁻¹    (18.16)

where D is a diagonal matrix with elements the square roots of the diagonal ele-
ments of COV(k*). The diagonal elements of the correlation matrix R will all be
equal to one whereas the off-diagonal ones will have values between −1 and 1. If
an off-diagonal element has an absolute value very close to one then the corre-
sponding parameters are highly correlated. Hence, the off-diagonal elements of
matrix R provide a direct indication of two-parameter correlation.
Correlation between three or more parameters is very difficult to detect un-
less an eigenvalue decomposition of matrix A* is performed. As already discussed
in Chapter 8, matrix A* is symmetric and hence an eigenvalue decomposition is
also an orthogonal decomposition

A* = V Λ Vᵀ    (18.17)

where Λ is a diagonal matrix with positive elements that are the eigenvalues of
matrix A*, and V is an orthogonal matrix whose columns are the normalized eigen-
vectors of matrix A*; hence, VᵀV = I. Furthermore, as shown in Chapter 11, the
(1−α)100% joint confidence region for the parameter vector k is described by the
ellipsoid

[k − k*]ᵀ [A*] [k − k*] = (p S_LS(k*)/(d.f.)) F^α_{p,Nm−p}    (18.18)

If this ellipsoid is highly elongated in one direction then the uncertainty in
the parameter values in that direction is significant. The p eigenvectors of A* form
the directions of the p principal axes of the ellipsoid, where each eigenvalue is equal
to the inverse of the length of the corresponding principal axis. The longest axis
corresponds to the smallest eigenvalue whereas the shortest axis corresponds to the
largest eigenvalue. The ratio of the largest to the smallest eigenvalue is the condition
number of matrix A. As discussed in Chapter 8, the condition number can be calcu-
lated during the minimization steps to provide an indication of the ill-conditioning
of matrix A. If some parameters are correlated then the value of the condition
number will be very large. One should be careful, however, to note that the eigen-
values are not in a one-to-one correspondence with the parameters. It requires in-
spection of the eigenvectors corresponding to the smallest eigenvalues to know
which parameters are highly correlated.
The elements of each eigenvector are the cosines of the angles the eigen-
vector makes with the axes corresponding to the p parameters. If some of the pa-
rameters are highly correlated then their axes will lie in the same direction and the
angle between them will be very small. Hence, the eigenvector corresponding to
the smallest eigenvalue will have significant contributions from the correlated
parameters and the cosines of the angles will tend to 1 since the angles tend to 0.
Hence, the larger elements of the eigenvector will enable identification of the pa-
rameters which are correlated.
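These diagnostics are easy to compute once matrix A is available from the Gauss-
Newton iteration. The sketch below uses a small invented 3×3 matrix in which the
first two parameters are strongly correlated:

    # Sketch: correlation matrix and eigendecomposition diagnostics.
    import numpy as np

    A = np.array([[4.0, 3.9, 0.1],
                  [3.9, 4.0, 0.2],
                  [0.1, 0.2, 2.0]])          # invented Gauss-Newton matrix A*
    sigma2 = 0.5                             # S_LS(k*)/(d.f.)

    cov = sigma2 * np.linalg.inv(A)          # covariance of the estimates
    d = np.sqrt(np.diag(cov))                # standard estimation errors
    R = cov / np.outer(d, d)                 # R = D^-1 COV D^-1 (Eq. 18.16)
    print("correlation matrix:", np.round(R, 3))

    lam, V = np.linalg.eigh(A)               # A = V Lambda V^T (Eq. 18.17)
    print("condition number:", lam[-1] / lam[0])
    print("eigenvector of smallest eigenvalue:", np.round(V[:, 0], 3))
    # the large first two entries of V[:, 0] flag the correlated parameter pair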
18.3.3.1 Implementation and Numerical Results
Tan and Kalogerakis (1992) illustrated the above points using two simple
reservoir problems representing a uniform areal 3×3 model with a producing well
at grid block (1,1). In this section only their results from the SPE second comparative
solution problem are reported. This is the well-known three-phase radial
coning problem described by Chappelear and Nolen (1986). The reservoir has 15
layers, each of a constant horizontal permeability. The well is completed at layers 7
and 8. The objective here is to estimate the permeability of each layer using the
measurements made at the wells. The measurements include water-oil ratio, gas-
oil ratio, flowing bottom-hole pressure and reservoir pressures at the well loca-
tions.
Observations were generated by running the model with the original description
and by adding noise to the model calculated values as follows

ŷ_j(t_i) = y_j(t_i)(1 + σ z_ij)    (18.19)

where ŷ_j(t_i) are the noisy observations of variable j at time t_i, σ is the normal-
ized standard measurement error and z_ij is a random variable distributed normally
with zero mean and standard deviation one. A value of 0.001 was used for σ.
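Assuming the multiplicative noise form of Equation 18.19 as reconstructed
above, the observation-generation step is a one-liner:

    # Sketch: generating noisy "observations" from model calculated values.
    import numpy as np

    rng = np.random.default_rng(0)
    sigma = 0.001                                       # normalized std. error
    y_model = np.array([3250.0, 3180.0, 0.42, 0.55])    # placeholder model values
    z = rng.standard_normal(y_model.shape)              # z ~ N(0, 1)
    y_obs = y_model * (1.0 + sigma * z)
    print(y_obs)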
In the first attempt to characterize the reservoir, all 15 layers were treated as
separate parameter zones. An initial guess of 300 md was used for the permeability
of each zone. A diagonal weighting matrix Q with elements ŷ_j(t_i)⁻¹ was used.
After 10 iterations of the Gauss-Newton method the LS objective function was
reduced to 0.0147. The estimation problem as defined is severely ill-conditioned.
Although the algorithm did not converge, the estimation part of the program pro-
vided estimates of the standard deviation in the parameter values obtained thus far.
Table 18.4 2nd SPE Problem: Permeability Estimates of Original Zonation

Layer | True Permeability | Estimated Permeability | Standard Deviation (%) | Eigenvector of Smallest Eigenvalue
1  | 35    | 237.5 | 867.7 | 0.1280
2  | 47.5  | 329.2 | 1416  | −0.2501
3  | 148   | 22.4  | 725.2 | 0.1037
4  | 202   | 289.2 | 40.7  | −0.0041
5  | 90    | 240.8 | 69.1  | 0.0108
6  | 418.5 | 324.6 | 40.3  | −0.0056
7  | 775   | 676.2 | 5.33  | 0.0003
8  | 60    | 52.6  | 20.9  | 0.0001
9  | 682   | 636   | 116.8 | −0.0194
10 | 472   | 411.4 | 121.0 | −0.0074
11 | 125   | 821.1 | 261.4 | −0.0428
12 | 300   | 323.7 | 1284  | −0.2848
13 | 137.5 | 159.9 | 3392  | 0.8613
14 | 191   | 246.1 | 1241  | 0.1693
15 | 350   | 256.4 | 1256  | −0.2373

Source: Tan and Kalogerakis (1992).
The estimated permeabilities together with the true permeabilities of each
layer are shown in Table 18.4. In Table 18.4, the eigenvector corresponding to the
smallest eigenvalue of matrix A is also shown. The largest elements of the eigen-
vector correspond to zones 1, 2, 3, 12, 13, 14 and 15, suggesting that these zones are
correlated. Based on the above and the fact that the eigenvector of the second
smallest eigenvalue suggests that layers 9, 10 and 11 are also correlated, it was decided
to rezone the reservoir in the following manner: layers 1, 2 and 3 are lumped into
one zone, layers 4, 5, 6, 7 and 8 remain as distinct zones and layers 9, 10, 11, 12, 13, 14
and 15 are lumped into one zone. With the revised zonation there are only 7 un-
known parameters.
With the revised zonation the LS objective function was reduced to 0.00392.
The final match is shown in Figures 18.26 and 18.27. The estimated parameters
and their standard deviations are shown in Table 18.5. As seen, the estimated per-
meability for layer 7 has the lowest standard error. This is simply due to the fact
that the well is completed in layers 7 and 8. It is interesting to note that, contrary to
expectation, the estimated permeability corresponding to layer 5 has a higher un-
certainty compared to zones even farther away from it. The true permeability of
layer 5 is low and most likely its effects have been missed in the timing of the 16
observations taken.
Table 18.5 2nd SPE Problem: Permeability Estimates of Revised Zonation

Source: Tan and Kalogerakis (1992).
18.3.4 Improved Reservoir Characterization Through Automatic History
Matching
In the history matching phase of a reservoir simulation study, the reservoir
engineer is faced with two problems: first, a grid cell model that represents the
geological structure of the underground reservoir must be developed, and second,
the porosity and permeability distributions across the reservoir must be determined
so that the reservoir simulation model matches the field data satisfactorily. In an
actual field study, the postulated grid cell model may not accurately represent the
reservoir description since the geological interpretation of seismic data can easily
be in error. Furthermore, the variation in rock properties could be such that the
postulated grid cell model may not have enough detail. Unfortunately, even with
current computer hardware one cannot use very fine grids with hundreds of thou-
sands of cells to model reservoir heterogeneity in porosity or permeability distri-
bution. The simplest way to address the reservoir description problem is to use a
zonation approach whereby the reservoir is divided into a relatively small number
of zones of constant porosity and permeability (each zone may have many grid
cells dictated by accuracy and stability considerations). Chances are that the size
and shape of these zones in the postulated model will be quite different than the
true distribution in the reservoir. In this section it is shown how automatic history
matching can be used to guide the practicing engineer towards a reservoir charac-
terization that is more representative of the actual underground reservoir (Tan and
Kalogerakis, 1993).
In practice, when reservoir parameters such as porosities and permeabilities
are estimated by matching reservoir model calculated values to field data, one has
some prior information about the parameter values. For example, porosity and
permeability values may be available from core data analysis and well test analy-
sis. In addition, the parameter values are known to be within certain bounds for a
particular area. All this information can be incorporated in the estimation method
of the simulator by introducing prior parameter distributions and by imposing con-
straints on the parameters (Tan and Kalogerakis, 1993).
Figure 18.26 Observed and calculated bottom-hole pressure and reservoir pres-
sures at layers 7 and 8 for the 2nd SPE problem using 7 perme-
ability zones [reprinted from the Journal of the Canadian Petro-
leum Technology with permission].
Figure 18.27 Observed and calculated production data for the 2nd SPE problem
using 7 permeability zones [reprinted from the Journal of the
Canadian Petroleum Technology with permission].
18.3.4.1 Incorporation of Prior Information and Constraints on the
Parameters
It is reasonable to assume that the most probable values of the parameters
have normal distributions with means equal to the values that were obtained from
well test and core data analyses. These are the prior estimates. Each one of these
most probable parameter values (k_Bj, j = 1,…,p) also has a corresponding standard
deviation σ_kj which is a measure of the uncertainty of the prior parameter estimate.
As already discussed in Chapter 8 (Section 8.5), using maximum likelihood argu-
ments the prior information is introduced by augmenting the LS objective function
to include

S_prior = (k − k_B)ᵀ V_B⁻¹ (k − k_B)    (18.20)

The prior covariance matrix of the parameters (V_B) is a diagonal matrix, and
since in the solution of the problem only the inverse of V_B is used, it is preferable
to use as input to the program the inverse itself. Namely, the information is entered
into the program as V_B⁻¹ = diag(σ_k1⁻², σ_k2⁻², …, σ_kp⁻²).
Several authors have suggested the use of a weighting factor for S_prior in the
overall objective function. As there is really no rigorous way to assign an opti-
mal weighting factor, it is better left up to the user (Tan and Kalogerakis,
1993). If there is a significant amount of historical observation data available, the
objective function will be dominated by the least squares term, whereas the signifi-
cance of the prior term increases as the amount of such data decreases. If the esti-
mated parameter values differ significantly from the prior estimates, then the user
should either revise the prior information or assign it a greater significance by in-
creasing the value of the weighting factor or, equivalently, by decreasing the value
of the standard deviations σ_kBj of the prior estimates. As pointed out in Section 8.5,
the crucial role of the prior term should be emphasized when the estimation problem
is severely ill-conditioned. In such cases, the parameters that are not affected by the
data essentially maintain their prior estimates (k_Bj and σ_kBj) and, at the same time,
the condition number of matrix A does not become prohibitively large.
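To make this stabilizing effect concrete, the following is a minimal sketch (in standard Gauss-Newton notation consistent with Section 8.5, not quoted from it) of how the normal equations are modified when the prior term is included:

$$\left[\mathbf{A} + \mathbf{V}_B^{-1}\right] \Delta\mathbf{k}^{(j+1)} = \mathbf{b} + \mathbf{V}_B^{-1}\left(\mathbf{k}_B - \mathbf{k}^{(j)}\right)$$

Since V_B^{-1} adds σ_kBj^{-2} to the j-th diagonal element of A, parameters to which the data are insensitive remain anchored near their prior values while the condition number of the augmented matrix stays bounded.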
While prior information may be used to influence the parameter estimates
towards realistic values, there is no guarantee that the final estimates will not reach
extreme values, particularly when the postulated grid cell model is incorrect and
there is a large amount of data available. A simple way to impose inequality con-
straints on the parameters is through the incorporation of a penalty function, as
already discussed in Chapter 9 (Section 9.2.1.2). By this approach, extra terms are
added to the objective function that tend to explode when the parameters approach
the boundary and become negligible when the parameters are far from it. One can
easily construct such penalty functions. For example, a simple and yet very effec-
tive penalty function that keeps the parameters in the interval (k_min,i, k_max,i) is
given by Equation 18.21.
These functions are also multiplied by a user-supplied weighting constant, ω
(≥ 0), which should have a large value during the early iterations of the Gauss-
Newton method when the parameters are away from their optimal values. In gen-
eral, ω should be reduced as the parameters approach the optimum so that the
contribution of the penalty function becomes essentially negligible (so that no bias
is introduced in the parameter estimates). If p penalty functions are incorporated,
then the overall objective function becomes
$$S_{total}(\mathbf{k}) = S(\mathbf{k}) + \sum_{i=1}^{p} \phi_i(k_i) \qquad (18.22)$$

where the penalty functions φ_i(k_i) have the form given by Equation 18.23.
As seen above, the objective function is modified in such a way that it increases
quite significantly as the solution approaches the constraints, but remains practi-
cally unchanged inside the feasible region. Inside the feasible region, φ_i(k_i) is
small and results in a small contribution to the main diagonal element of matrix A.
Consequently, Δk^(j+1) will not be affected significantly. If, on the other hand, the
parameter values are near the boundary, φ_i(k_i) becomes large and dominates the
diagonal element of the parameter to which the constraint is applied. As a result,
the corresponding element of Δk^(j+1) is very small and k_i is not allowed to cross
that boundary.
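As an illustration, the FORTRAN 77 fragment below evaluates one such penalty term. The reciprocal-distance form used here is our own choice for the sketch and is not necessarily the form of Equation 18.21; the names PHI, OMEGA, PKMIN and PKMAX are likewise hypothetical.

      DOUBLE PRECISION FUNCTION PHI(PK, PKMIN, PKMAX, OMEGA)
C     Penalty term that grows without bound as PK approaches either
C     end of the interval (PKMIN, PKMAX) and is negligible well
C     inside the feasible region.
      DOUBLE PRECISION PK, PKMIN, PKMAX, OMEGA
      PHI = OMEGA*(1.0D0/(PK - PKMIN) + 1.0D0/(PKMAX - PK))
      RETURN
      END

In the spirit of the text, OMEGA would be assigned a large value in the early Gauss-Newton iterations and reduced as the optimum is approached, so that no bias is introduced in the final parameter estimates.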
18.3.4.2 Reservoir Characterization Using Automatic History Matching
As it was pointed out earlier, there are two major problems associated with
the history matching of a reservoir. The first is the correct representation of the
reservoir with a grid cell model with a limited number of zones, and the second is
the regression analysis to obtain the optimal parameter values. While the second
problem has been addressed in this chapter, there is no rigorous method available
that someone may follow to address the first problem. Engineers rely on informa-
tion from other sources such as seismic mapping and geological data. However,
Tan and Kalogerakis (1993) have shown that automatic history matching can also
be useful in determining the existence of impermeable boundaries, assessing the
possibility of reservoir extensions, providing estimates of the volumes of oil, gas
and water in place while at the same time limiting the parameter estimates to stay
within realistic limits.
They performed an extensive case study to demonstrate the use of automatic
history matching for reservoir characterization. For example, if the estimated per-
meability of a particular zone is unrealistically small compared to geological in-
formation, there is a good chance that an impermeable barrier is present. Similarly
if the estimated porosity of a zone approaches unrealistically high values, chances
are the zone of the reservoir should be expanded beyond its current boundary.
Based on the analysis of the case study, Tan and Kalogerakis (1993) rec-
ommended the following practical guidelines.
Step 1. Construct a postulated grid cell model of the reservoir by using all the
        available information.

Step 2. Subdivide the model into zones by following any geological zonation
        as closely as possible. The zones for porosity and permeability need
        not be identical and the cells allocated to a zone need not be contigu-
        ous.

Step 3. Provide estimates of the most probable values of the parameters and
        their standard deviations (k_Bj, σ_kBj, j=1,...,p).

Step 4. Provide the boundaries (minimum and maximum values) within which
        each parameter should be constrained.

Step 5. Run the automatic history matching model to obtain estimates of the
        unknown parameters.

Step 6. When a converged solution has been achieved, or at the minimum of
        the objective function, check the variances of the parameters and the
        eigenvectors corresponding to the smallest eigenvalues to identify any
        highly correlated zones. Combine any adjacent zones of high variance.

Step 7. In order to modify and improve the postulated grid cell representation
        of the reservoir, analyze any zones with parameter values close to the
        constraints.

Step 8. Go to Step 3 if you have made any changes to the model. Otherwise
        proceed with the study to predict future reservoir performance, etc.
It should be noted that the nature of the problem is such that it is practically
impossible to obtain a postulated model which is able to uniquely represent the
reservoir. As a result, it is required to continuously update the match when addi-
tional information becomes available and possibly also change the grid cell de-
scription of the reservoir.
By using automatic history matching, the reservoir engineer is not faced
with the usual dilemma whether to reject a particular grid cell model because it is
not a good approximation to the reservoir or to proceed with the parameter search
because the best set of parameters has not been determined yet.
18.3.5 Reliability of Predicted Well Performance Through Automatic
History Matching
It is of interest in a reservoir simulation study to compute future produc-
tion levels of the history matched reservoir under alternative depletion plans. In
addition, the sensitivity of the anticipated performance to different reservoir de-
scriptions is also evaluated. Such studies contribute towards assessing the risk
associated with a particular depletion plan.
Kalogerakis and Tomos (1995) presented an efficient procedure for the
determination of the 95 percent confidence intervals of forecasted horizontal well
production rates. Such computations are usually the final task of a reservoir simu-
lation study. The procedure is based on the Gauss-Newton method for the estima-
tion of the reservoir parameters, as was described earlier in this chapter. The esti-
mation of the 95% confidence intervals of well production rates was done as fol-
lows.
The production rates for oil (Q_oj), water (Q_wj) and gas (Q_gj) from a vertical
or horizontal well completed in layer j are given by Equation 18.24, where WPI_j
is the well productivity index given by Equation 18.25,
and M is defined for each phase as the ratio of the relative permeability to the vis-
cosity of the oil, water or gas; P_bh is the flowing bottom-hole pressure; H is the
hydrostatic pressure (assumed zero for the top layer); B_o and B_w are the formation
volume factors for oil and water; R_s is the solution gas-oil ratio; E_g is the gas ex-
pansion factor; k_h and k_v are the horizontal and vertical permeabilities; and h is the
block thickness. The determination of r_o, the effective well radius, and F_skin, the
skin factor, for horizontal wells has been the subject of considerable discussion in
the literature.
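Since the exact form of Equation 18.25 is not reproduced here, a standard Peaceman-type well index built from the same quantities serves as an illustrative stand-in (the wellbore radius r_w is introduced only for the sketch; the expression actually used by Kalogerakis and Tomos (1995) may differ in detail):

$$WPI_j \sim \frac{2\pi \sqrt{k_h k_v}\; h}{\ln(r_o/r_w) + F_{skin}}$$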
The total oil production rate is obtained as the sum over all producing layers
of the well, i.e.,

$$Q_o = \sum_{j=1}^{N_L} Q_{oj} \qquad (18.26)$$

Similar expressions are obtained for the total gas (Q_g) and water (Q_w) pro-
duction rates by summing the Q_gj and Q_wj terms over all N_L layers.
In order to derive estimates of the uncertainty of the computed well pro-
duction rates, the expression for the well production rate Q_c (c = o, w, g) is line-
arized around the optimal reservoir parameters to arrive at an expression for the
dependence of Q_c on the parameters (k) at any point in time t, namely,

$$Q_c(\mathbf{k}, t) \cong Q_c(\mathbf{k}^*, t) + \left(\frac{dQ_c}{d\mathbf{k}}\right)^T (\mathbf{k} - \mathbf{k}^*) \qquad (18.27)$$

where k* represents the converged values of the parameter vector k.
The overall sensitivity of the well production rate can now be estimated
through implicit differentiation from the sensitivity coefficients already computed
as part of the Gauss-Newton method computations as follows
$$\frac{dQ_c}{d\mathbf{k}} = \mathbf{G}(t)^T \left(\frac{\partial Q_c}{\partial \mathbf{y}}\right) + \left(\frac{\partial Q_c}{\partial \mathbf{k}}\right) \qquad (18.28)$$
where G(t) is the parameter sensitivity matrix described in Chapter 6. The evalua-
tion of the partial derivatives in the above equation is performed at k=k* and at
time t.
Once dQ_c/dk has been computed, the variance σ²_Qc of the well produc-
tion rate can be readily obtained by taking variances of both sides of Equation
18.27 to yield

$$\sigma_{Q_c}^2 = \left(\frac{dQ_c}{d\mathbf{k}}\right)^T COV(\mathbf{k}) \left(\frac{dQ_c}{d\mathbf{k}}\right) \qquad (18.29)$$
Having estimated σ²_Qc, the (1−α)100% confidence interval for Q_c at any
point in time is given by

$$Q_c(\mathbf{k}^*, t) - t_{\alpha/2}^{\nu}\,\sigma_{Q_c} \;\le\; Q_c(t) \;\le\; Q_c(\mathbf{k}^*, t) + t_{\alpha/2}^{\nu}\,\sigma_{Q_c} \qquad (18.30)$$

where t_{α/2}^ν is obtained from the statistical tables for the t-distribution with ν de-
grees of freedom, equal to the total number of observations minus the number of
unknown parameters.
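A minimal sketch of this computation at a single time instant is given below, in the FORTRAN 77 style of Appendix 2. GQ holds dQ_c/dk as obtained from Equation 18.28, COVK the parameter covariance matrix, and TV the tabulated t-value; all names in the argument list are hypothetical.

      SUBROUTINE QCONF(NPAR, NPD, GQ, COVK, QHAT, TV, QLO, QHI)
C     Confidence interval for a well production rate:
C     var(Qc) = GQ**T * COVK * GQ   (cf. Equation 18.29)
      INTEGER NPAR, NPD, I, J
      DOUBLE PRECISION GQ(NPD), COVK(NPD,NPD), QHAT, TV, QLO, QHI
      DOUBLE PRECISION VARQ
      VARQ = 0.0D0
      DO 20 I = 1, NPAR
         DO 10 J = 1, NPAR
            VARQ = VARQ + GQ(I)*COVK(I,J)*GQ(J)
   10    CONTINUE
   20 CONTINUE
C     Interval of Equation 18.30
      QLO = QHAT - TV*DSQRT(VARQ)
      QHI = QHAT + TV*DSQRT(VARQ)
      RETURN
      END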
If there are N_w production wells, then the (1−α)100% confidence interval
for the total production from the reservoir is given by

$$Q_{tot}(\mathbf{k}^*, t) - t_{\alpha/2}^{\nu}\,\sigma_{Q_{tot}} \;\le\; Q_{tot}(t) \;\le\; Q_{tot}(\mathbf{k}^*, t) + t_{\alpha/2}^{\nu}\,\sigma_{Q_{tot}} \qquad (18.31)$$

where

$$Q_{tot}(\mathbf{k}^*, t) = \sum_{i=1}^{N_w} Q^{(i)}(\mathbf{k}^*, t) \qquad (18.32)$$

and σ_Qtot is the standard error of estimate of the total reservoir production rate,
which is given by Equation 18.33.
It is noted that the partial derivatives in the above variance expressions
depend on time t, and therefore the variances should be computed simultaneously
with the state variables and sensitivity coefficients. Finally, the confidence inter-
vals of the cumulative production of each well and of the total reservoir are calcu-
lated by integration with respect to time (Kalogerakis and Tomos, 1995).
18.3.5.1  Quantification of Risk
Once the standard error of estimate of the mean forecasted response has
been estimated, i.e., the uncertainty in the total production rate, one can compute
the probability level, α, at which the minimum total production rate falls below
some pre-determined value based on a previously conducted economic analysis.
Such calculations can be performed as part of the post-processing calculations.
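One way to carry out such a post-processing calculation is sketched below. It approximates the t-distribution by a standard normal, which is adequate for the large number of degrees of freedom typical of history matching; QHAT, SD and QCRIT are hypothetical names, and DERFC is assumed to be available from the compiler or the IMSL library used elsewhere in this book.

      DOUBLE PRECISION FUNCTION PRISK(QHAT, SD, QCRIT)
C     Probability that the true total production rate falls below
C     QCRIT, assuming Q is normally distributed with mean QHAT and
C     standard error SD.  Uses Phi(z) = 0.5*erfc(-z/sqrt(2)).
      DOUBLE PRECISION QHAT, SD, QCRIT, Z, DERFC
      Z = (QCRIT - QHAT)/(SD*DSQRT(2.0D0))
      PRISK = 0.5D0*DERFC(-Z)
      RETURN
      END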
18.3.5.2 Multiple Reservoir Descriptions
With the help of automatic history matching, the reservoir engineer can
arrive at several plausible history matched descriptions of the reservoir. These
descriptions may differ in the grid block representation of the reservoir, the exis-
tence of sealing and non-sealing faults, or simply in different zonations of constant
porosity or permeability (Kalogerakis and Tomos, 1995). In addition, we may
assign to each one of these reservoir models a probability of being the correct one.
This probability can be based on additional geological information about the res-
ervoir as well as on the plausibility of the values of the estimated reservoir pa-
rameters.
Thus, for the r-th description of the reservoir, based on the material presented
earlier, one can compute the expected total oil production rate as well as the mini-
mum and maximum production rates corresponding to a desired confidence level
(1−α). The minimum and maximum total oil production rates for the r-th reservoir
description are given by
$$Q_{o,tot,min}^{(r)}(\mathbf{k}, t) = Q_{o,tot}^{(r)}(\mathbf{k}^*, t) - t_{\alpha/2}^{\nu}\,\sigma_{Q_{o,tot}}^{(r)} \qquad (18.34)$$

and

$$Q_{o,tot,max}^{(r)}(\mathbf{k}, t) = Q_{o,tot}^{(r)}(\mathbf{k}^*, t) + t_{\alpha/2}^{\nu}\,\sigma_{Q_{o,tot}}^{(r)} \qquad (18.35)$$

Next, the probability, P_b(r), that the r-th model is the correct one can be used
to compute the expected overall field production rate based on the production data
from N_R different reservoir models, namely

$$E\!\left[Q_{o,tot}(t)\right] = \sum_{r=1}^{N_R} P_b(r)\, Q_{o,tot}^{(r)}(\mathbf{k}^*, t) \qquad (18.36)$$

Similarly, the minimum and maximum bounds at the (1−α)100% confidence
level are computed as

$$Q_{o,tot,min}(t) = \sum_{r=1}^{N_R} P_b(r)\, Q_{o,tot,min}^{(r)}(\mathbf{k}, t) \qquad (18.37)$$

and

$$Q_{o,tot,max}(t) = \sum_{r=1}^{N_R} P_b(r)\, Q_{o,tot,max}^{(r)}(\mathbf{k}, t) \qquad (18.38)$$

Again, the above limits can be used to compute the risk level of meeting a
desired production rate from the reservoir.
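The averaging over models in Equations 18.36 to 18.38 is a simple probability-weighted sum, as the following sketch shows (NR is the number of reservoir descriptions; all array names are hypothetical):

      SUBROUTINE QEXPCT(NR, PB, QTOT, QMIN, QMAX, QE, QEMIN, QEMAX)
C     Expected field production rate and its confidence bounds from
C     NR plausible reservoir descriptions (cf. Equations 18.36-18.38).
      INTEGER NR, IR
      DOUBLE PRECISION PB(NR), QTOT(NR), QMIN(NR), QMAX(NR)
      DOUBLE PRECISION QE, QEMIN, QEMAX
      QE = 0.0D0
      QEMIN = 0.0D0
      QEMAX = 0.0D0
      DO 10 IR = 1, NR
         QE    = QE    + PB(IR)*QTOT(IR)
         QEMIN = QEMIN + PB(IR)*QMIN(IR)
         QEMAX = QEMAX + PB(IR)*QMAX(IR)
   10 CONTINUE
      RETURN
      END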
18.3.5.3 Case Study - Reliability of a Horizontal Well Performance
Kalogerakis and Tomos (1995) demonstrated the above methodology for
a horizontal well by adapting the example used by Collins et al. (1992). They as-
sumed a period of 1,250 days for history matching purposes and postulated three
different descriptions of the reservoir. All three models matched the history of the
reservoir with practically the same success. However, the forecasted performance
from 1,250 to 3,000 days was quite different. This is shown in Figures 18.28a and
18.28b, where the predictions by Models B and C are compared to the actual reser-
voir performance (given by Model A). In addition, the 95% confidence intervals
for Model A are shown in Figures 18.29a and 18.29b. From these plots it appears
that the uncertainty is not very high; however, it should be kept in mind that these
computations were made under the null hypothesis that all other parameters in the
reservoir model (i.e., relative permeabilities) and the grid cell description of the
reservoir are all known precisely.
[Figures 18.28a,b and 18.29a,b (plots not reproduced): forecasted performance of Models A, B and C, and the 95% confidence intervals for Model A, plotted against time (days).]
References
Andersen, J.N., "Temperature Effect on Recombinant Protein Production Using a Baculovirus/Insect Cell Expression System", Diploma Thesis, University of Calgary and Technical University of Denmark, 1995.
Andersen, J.N., P.G. Sriram, N. Kalogerakis and L.A. Behie, "Effect of Temperature on Recombinant Protein Production Using the Bm5/Bm5.NPV Expression System", Can. J. Chem. Eng., 74, 511-517 (1996).
Anderson, T.F., D.S. Abrams and E.A. Grens II, "Evaluation of Parameters for Nonlinear Thermodynamic Models", AIChE J., 24, 20-29 (1978).
Anderson, V.L. and R.A. McLean, Design of Experiments, M. Dekker, New York, NY, 1974.
Appleyard, J.R. and I.M. Cheshire, "Nested Factorization", paper SPE 12264 presented at the 1983 SPE Symposium on Reservoir Simulation, San Francisco, CA (1983).
Aris, R., Mathematical Modelling Techniques, Dover Publications, Inc., New York, NY, 1994.
Ayen, R.J. and M.S. Peters, "Catalytic Reduction of Nitric Oxide", Ind. Eng. Chem. Proc. Des. Dev., 1, 204-207 (1962).
Bard, Y., "Comparison of Gradient Methods for the Solution of Nonlinear Parameter Estimation Problems", SIAM J. Numer. Anal., 7, 157-186 (1970).
Bard, Y., Nonlinear Parameter Estimation, Academic Press, New York, NY, 1974.
Barnett, V., T. Lewis and V. Rothamsted, Outliers in Statistical Data, 3rd edition, Wiley, New York, NY, 1994.
Basmadjian, D., The Art of Modeling in Science and Engineering, Chapman & Hall/CRC, Boca Raton, FL, 1999.
Bates, D.M. and D.G. Watts, Nonlinear Regression Analysis and its Applications, J. Wiley, New York, NY, 1988.
Beck, J.V. and K.J. Arnold, Parameter Estimation in Engineering and Science, J. Wiley, New York, NY, 1977.
Bellman, R. and R. Kalaba, Quasilinearization and Nonlinear Boundary Value Problems, American Elsevier, New York, NY, 1965.
Bellman, R., J. Jacquez, R. Kalaba and S. Schwimmer, "Quasilinearization and the Estimation of Chemical Rate Constants from Raw Kinetic Data", Math. Biosci., 1, 71-76 (1967).
Belohlav, Z., P. Zamostny, P. Kluson and J. Volf, "Application of Random-Search Algorithm for Regression Analysis of Catalytic Hydrogenations", Can. J. Chem. Eng., 75, 735-742 (1997).
Bevington, P.R. and D.K. Robinson, Data Reduction and Error Analysis for the Physical Sciences, 2nd ed., McGraw Hill, New York, NY, 1992.
Bishop, C.M., Neural Networks for Pattern Recognition, Oxford Univ. Press, UK, 1995.
Blanch, H.W. and D.S. Clark, Biochemical Engineering, Marcel Dekker, New York, NY, 1996.
Blanco, B., S. Beltran, J.L. Cabezas and J. Coca, "Vapor-Liquid Equilibria of Coal-Derived Liquids. 3. Binary Systems with Tetralin at 200 mmHg", J. Chem. Eng. Data, 39, 23-26 (1994).
Bourgoyne, A.T., Jr., K.K. Millheim, M.E. Chenevert and F.S. Young, Jr., Applied Drilling Engineering, SPE Textbook Series Vol. 2, 1986.
Box, G.E.P. and W.J. Hill, "Discrimination Among Mechanistic Models", Technometrics, 9, 57-71 (1967).
Box, G.E.P., G.M. Jenkins, G.C. Reinsel and G. Jenkins, Time Series Analysis: Forecasting & Control, 3rd Ed., Prentice Hall, Englewood Cliffs, NJ, 1994.
Box, M.J., "A Comparison of Several Current Optimization Methods, and the Use of Transformations in Constrained Problems", Computer J., 9, 67-77 (1966).
Box, M.J., "Improved Parameter Estimation", Technometrics, 12, 219-229 (1970).
Breland, E. and P. Englezos, "Equilibrium Hydrate Formation Data for Carbon Dioxide in Aqueous Glycerol Solutions", J. Chem. Eng. Data, 41, 11-13 (1996).
Britt, H.I. and R.H. Luecke, "The Estimation of Parameters in Nonlinear, Implicit Models", Technometrics, 15, 233-247 (1973).
Buzzi-Ferraris, G. and P. Forzatti, "A New Sequential Experimental Design Procedure for Discriminating Among Rival Models", Chem. Eng. Sci., 38, 225-232 (1983).
Buzzi-Ferraris, G., P. Forzatti, G. Emig and H. Hofmann, "Sequential Experimental Design for Model Discrimination in the Case of Multiple Responses", Chem. Eng. Sci., 39(1), 81-85 (1984).
Bygrave, G., "Thermodynamics of Metal Partitioning in Kraft Pulps", M.A.Sc. Thesis, Dept. of Chemical and Bio-Resource Engineering, Univ. of British Columbia, BC, Canada, 1997.
Campbell, S.W., R.A. Wilsak and G. Thodos, "Isothermal Vapor-Liquid Equilibrium Measurements for the n-Pentane-Acetone System at 327.7, 397.7 and 422.6 K", J. Chem. Eng. Data, 31, 424-430 (1986).
Carroll, J.J. and A.E. Mather, "Phase Equilibrium in the System Water-Hydrogen Sulfide: Experimental Determination of the LLV Locus", Can. J. Chem. Eng., 67, 468-470 (1989a).
Carroll, J.J. and A.E. Mather, "Phase Equilibrium in the System Water-Hydrogen Sulfide: Modelling the Phase Behavior with an Equation of State", Can. J. Chem. Eng., 67, 999-1003 (1989b).
Cerny, V., "Thermodynamic Approach to the Travelling Salesman Problem: An Efficient Simulation Algorithm", J. Opt. Theory Applic., 45, 41-51 (1985).
Chappelear, J.E. and J.S. Nolen, "Second Comparative Project: A Three Phase Coning Study", Journal of Petroleum Technology, 345-353, March (1986).
Chourdakis, A., "PULSAR Oxygenation System", Operation Manual, Herakleion, Crete (1999).
Collins, D., L. Nghiem, R. Sharma, Y.K. Li and K. Jha, "Field-scale Simulation of Horizontal Wells", J. Can. Petr. Technology, 31(1), 14-21 (1992).
Copp, J.L. and D.H. Everett, "Thermodynamics of Binary Mixtures Containing Amines", Disc. Faraday Soc., 15, 174-181 (1953).
Donnelly, J.K. and D. Quon, "Identification of Parameters in Systems of Ordinary Differential Equations Using Quasilinearization and Data Perturbation", Can. J. Chem. Eng., 48, 114 (1970).
Draper, N. and H. Smith, Applied Regression Analysis, 3rd ed., Wiley, New York, NY, 1998.
Dueck, G. and T. Scheuer, "Threshold Accepting: A General Purpose Optimization Algorithm Appearing Superior to Simulated Annealing", J. Computational Physics, 90, 161 (1990).
Duever, T.A., S.E. Keeler, P.M. Reilly, J.H. Vera and P.A. Williams, "An Application of the Error-In-Variables Model: Parameter Estimation from Van Ness-type Vapour-Liquid Equilibrium Experiments", Chem. Eng. Sci., 42, 403-412 (1987).
Dumez, F.J. and G.F. Froment, "Dehydrogenation of 1-Butene into Butadiene. Kinetics, Catalyst Coking, and Reactor Design", Ind. Eng. Chem. Proc. Des. Devt., 15, 291-301 (1976).
Dumez, F.J., L.H. Hosten and G.F. Froment, "The Use of Sequential Discrimination in the Kinetic Study of 1-Butene Dehydrogenation", Ind. Eng. Chem. Fundam., 16, 298-301 (1977).
Dunker, A.M., "The Decoupled Direct Method for Calculating Sensitivity Coefficients in Chemical Kinetics", J. Chem. Phys., 81, 2385-2393 (1984).
Edgar, T.F. and D.M. Himmelblau, Optimization of Chemical Processes, McGraw-Hill, New York, NY, 1988.
Edsberg, L., "Numerical Methods for Mass Action Kinetics", in Numerical Methods for Differential Systems, Lapidus, L. and Schiesser, W.E., Eds., Academic Press, New York, NY, 1976.
Englezos, P. and N. Kalogerakis, "Constrained Least Squares Estimation of Binary Interaction Parameters in Equations of State", Computers Chem. Eng., 17, 117-121 (1993).
Englezos, P. and S. Hull, "Phase Equilibrium Data on Carbon Dioxide Hydrate in the Presence of Electrolytes, Water Soluble Polymers and Montmorillonite", Can. J. Chem. Eng., 72, 887-893 (1994).
Englezos, P., G. Bygrave and N. Kalogerakis, "Interaction Parameter Estimation in Cubic Equations of State Using Binary Phase Equilibrium & Critical Point Data", Ind. Eng. Chem. Res., 37(5), 1613-1618 (1998).
Englezos, P., N. Kalogerakis and P.R. Bishnoi, "A Systematic Approach for the Efficient Estimation of Interaction Parameters in Equations of State Using Binary VLE Data", Can. J. Chem. Eng., 71, 322-326 (1993).
Englezos, P., N. Kalogerakis and P.R. Bishnoi, "Estimation of Binary Interaction Parameters for Equations of State Subject to Liquid Phase Stability Requirements", Fluid Phase Equilibria, 53, 81-88 (1989).
Englezos, P., N. Kalogerakis and P.R. Bishnoi, "Simultaneous Regression of Binary VLE and VLLE Data", Fluid Phase Equilibria, 61, 1-15 (1990b).
Englezos, P., N. Kalogerakis, M.A. Trebble and P.R. Bishnoi, "Estimation of Multiple Binary Interaction Parameters in Equations of State Using VLE Data: Application to the Trebble-Bishnoi EOS", Fluid Phase Equilibria, 58, 117-132 (1990a).
Englezos, P., N. Kalogerakis, P.D. Dholabhai and P.R. Bishnoi, "Kinetics of Formation of Methane and Ethane Gas Hydrates", Chem. Eng. Sci., 42, 2647-2658 (1987).
Ferguson, E.S., Engineering and the Mind's Eye, MIT Press, Boston, MA, 1992.
Forestell, S.P., N. Kalogerakis, L.A. Behie and D.F. Gerson, "Development of the Optimal Inoculation Conditions for Microcarrier Cultures", Biotechnol. Bioeng., 39, 305-313 (1992).
Frame, K.K. and W.S. Hu, "A Model for Density-Dependent Growth of Anchorage Dependent Mammalian Cells", Biotechnol. Bioeng., 32, 1061-1066 (1988).
Franckaerts, J. and G.F. Froment, "Kinetic Study of the Dehydrogenation of Ethanol", Chem. Eng. Sci., 19, 807-818 (1964).
Freund, R.J. and P.D. Minton, Regression Methods, Marcel Dekker, New York, NY, 1979.
Froment, G.F., "Model Discrimination and Parameter Estimation in Heterogeneous Catalysis", AIChE J., 21, 1041 (1975).
Froment, G.F. and K.B. Bischoff, Chemical Reactor Analysis and Design, 2nd ed., J. Wiley, New York, NY, 1990.
Galivel-Solastiuk, F., S. Laugier and D. Richon, "Vapor-Liquid Equilibrium Data for the Propane-Methanol-CO2 System", Fluid Phase Equilibria, 28, 73-85 (1986).
Gallot, J.E., M.P. Kapoor and S. Kaliaguine, "Kinetics of 2-Hexanol and 3-Hexanol Oxidation Reaction over TS-1 Catalysts", AIChE Journal, 44(6), 1438-1454 (1998).
Gans, P., Data Fitting in the Chemical Sciences by the Method of Least Squares, Wiley, New York, NY, 1992.
Gill, P.E. and W. Murray, "Newton-type Methods for Unconstrained and Linearly Constrained Optimization", Mathematical Programming, 7, 311-350 (1974).
Gill, P.E. and W. Murray, "Nonlinear Least Squares and Nonlinearly Constrained Optimization", Lecture Notes in Mathematics No. 506, G.A. Watson, Ed., Springer-Verlag, Berlin and Heidelberg, pp. 134-147 (1975).
Gill, P.E. and W. Murray, "Quasi-Newton Methods for Unconstrained Optimization", J. Inst. Maths Applics, 9, 91-108 (1972).
Gill, P.E., W. Murray and M.H. Wright, Practical Optimization, Academic Press, London, UK, 1981.
Glover, L.K., "A New Simulated Annealing Algorithm for Standard Cell Placement", Proc. IEEE Int. Conf. on Computer-Aided Design, Santa Clara, 378-380 (1986).
Goldfarb, D., "Factorized Variable Metric Methods for Unconstrained Optimization", Mathematics of Computation, 30(136), 796-811 (1976).
Hanson, K. and N. Kalogerakis, "Kinetic Reaction Models for Low Temperature Oxidation and High Temperature Cracking of Athabasca and North Bodo Oil Sands Bitumen", NSERC Report, University of Calgary, AB, Canada, 1984.
Hartley, H.O., "The Modified Gauss-Newton Method for the Fitting of Non-Linear Regression Functions by Least Squares", Technometrics, 3(2), 269-280 (1961).
Hawboldt, K.A., N. Kalogerakis and L.A. Behie, "A Cellular Automaton Model for Microcarrier Cultures", Biotechnol. Bioeng., 43, 90-100 (1994).
Hissong, D.W. and W.B. Kay, "The Calculation of the Critical Locus Curve of Binary Hydrocarbon Systems", AIChE J., 16, 580 (1970).
Hocking, R.R., Methods and Applications of Linear Models, J. Wiley, New York, NY, 1996.
Holland, J.H., Adaptation in Natural and Artificial Systems, Univ. of Michigan Press, Ann Arbor, MI, 1975.
Hong, J.H. and R. Kobayashi, "Vapor-Liquid Equilibrium Studies for the Carbon Dioxide-Methanol System", Fluid Phase Equilibria, 41, 269-276 (1988).
Hong, J.H., P.V. Malone, M.D. Jett and R. Kobayashi, "The Measurement and Interpretation of the Fluid Phase Equilibria of a Normal Fluid in a Hydrogen Bonding Solvent: The Methane-Methanol System", Fluid Phase Equilibria, 38, 83-86 (1987).
Hosten, L.H. and G. Emig, "Sequential Experimental Design Procedures for Precise Parameter Estimation in Ordinary Differential Equations", Chem. Eng. Sci., 30, 1357 (1975).
Hougen, O. and K.M. Watson, Chemical Process Principles, Vol. 3, J. Wiley, New York, NY, 1948.
Hunter, W.G. and A.M. Reiner, "Designs for Discriminating Between Two Rival Models", Technometrics, 7, 307-323 (1965).
Kalogerakis, N. and L.A. Behie, "Oxygenation Capabilities of New Basket-Type Bioreactors for Microcarrier Cultures of Anchorage Dependent Cells", Bioprocess Eng., 17, 151-156 (1997).
Kalogerakis, N. and R. Luus, "Increasing the Size of the Region of Convergence in Parameter Estimation", Proc. American Control Conference, Arlington, Virginia, 1, 358-364 (1982).
Kalogerakis, N., "Parameter Estimation of Systems Described by Ordinary Differential Equations", Ph.D. thesis, Dept. of Chemical Engineering and Applied Chemistry, University of Toronto, ON, Canada, 1983.
Kalogerakis, N., "Sequential Experimental Design for Model Discrimination in Dynamic Systems", Proc. 34th Can. Chem. Eng. Conference, Quebec City, Oct. 3-6 (1984).
Kalogerakis, N. and C. Tomos, "Reliability of Horizontal Well Performance on a Field Scale Through Automatic History Matching", J. Can. Petr. Technology, 34(9), 47-55 (1995).
Kalogerakis, N. and R. Luus, "Effect of Data-Length on the Region of Convergence in Parameter Estimation Using Quasilinearization", AIChE J., 26, 670-673 (1980).
Kalogerakis, N. and R. Luus, "Simplification of Quasilinearization Method for Parameter Estimation", AIChE J., 29, 858-864 (1983a).
Kalogerakis, N. and R. Luus, "Improvement of Gauss-Newton Method for Parameter Estimation through the Use of Information Index", Ind. Eng. Chem. Fundam., 22, 436-445 (1983b).
Kalogerakis, N. and R. Luus, "Sequential Experimental Design of Dynamic Systems Through the Use of Information Index", Can. J. Chem. Eng., 62, 730-737 (1984).
Kirkpatrick, S., C.D. Gelatt, Jr. and M.P. Vecchi, "Optimization by Simulated Annealing", Science, 220, 671-680 (1983).
Kittrell, J.R., R. Mezaki and C.C. Watson, "Estimation of Parameters for Nonlinear Least Squares Analysis", Ind. Eng. Chem., 57(12), 18-27 (1965b).
Kittrell, J.R., W.G. Hunter and C.C. Watson, "Nonlinear Least Squares Analysis of Catalytic Rate Models", AIChE J., 11(6), 1051-1057 (1965a).
Koch, K.R., Parameter Estimation and Hypothesis Testing in Linear Models, Springer-Verlag, New York, NY, 1987.
Kowalik, J. and M.R. Osborne, Methods of Unconstrained Optimization Problems, Elsevier, New York, NY, 1968.
Kurniawan, P., "A Study of In-Situ Brightening of Mechanical Pulp via the Electro-Oxidation of Sodium Carbonate", M.A.Sc. Thesis, Dept. of Chemical and Bio-Resource Engineering, The University of British Columbia, 1998.
Lawson, C.L. and R.J. Hanson, Solving Least Squares Problems, Prentice Hall, Englewood Cliffs, NJ, 1974.
Leis, J.R. and M.A. Kramer, "The Simultaneous Solution and Sensitivity Analysis of Systems Described by Ordinary Differential Equations", ACM Transactions on Mathematical Software, 14, 45-60 (1988).
Leu, A-D. and D.B. Robinson, "Equilibrium Properties of the Methanol-Isobutane Binary System", J. Chem. Eng. Data, 37, 10-13 (1992).
Levenberg, K., "A Method for the Solution of Certain Non-linear Problems in Least Squares", Quart. Appl. Math., 2, 164-168 (1944).
Li, Y-H., K.H. Dillard and R.L. Robinson, "Vapor-Liquid Phase Equilibrium for the CO2-n-Hexane System at 40, 80, 120°C", J. Chem. Eng. Data, 26, 53-58 (1981).
Lim, H.C., B.J. Chen and C.C. Creagan, "Analysis of Extended and Exponentially Fed Batch Cultures", Biotechnol. Bioeng., 19, 425-435 (1977).
Lin, Y-N., R.J.J. Chen, P.S. Chappelear and R. Kobayashi, "Vapor-Liquid Equilibrium of the Methane-n-Hexane System at Low Temperature", J. Chem. Eng. Data, 22, 402-408 (1977).
Linardos, T., "Kinetics of Monoclonal Antibody Production in Chemostat Hybridoma Cultures", Ph.D. thesis, Dept. of Chemical & Petroleum Engineering, University of Calgary, AB, Canada, 1991.
Linardos, T.I., N. Kalogerakis, L.A. Behie and L.R. Lamontagne, "Monoclonal Antibody Production in Dialyzed Continuous Suspension Culture", Biotechnol. Bioeng., 39, 504-510 (1992).
Ljung, L. and T. Soderstrom, Theory and Practice of Recursive Identification, MIT Press, Boston, MA, 1983.
Luus, R., "Determination of the Region Sizes for LJ Optimization", Hung. J. Ind. Chem., 26, 281-286 (1998).
Luus, R., "Optimization in Model Reduction", Int. J. Control, 32, 741-747 (1980).
Luus, R. and T.H.I. Jaakola, "Optimization by Direct Search and Systematic Reduction of the Search Region", AIChE J., 19, 760 (1973).
Luus, R., Iterative Dynamic Programming, Chapman & Hall/CRC, London, UK, 2000.
Marquardt, D.W., "An Algorithm for Least-Squares Estimation of Nonlinear Parameters", J. Soc. Indust. Appl. Math., 11(2), 431-441 (1963).
Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller, "Equations of State Calculations by Fast Computing Machines", J. Chem. Physics, 21, 1087-1092 (1953).
Michelsen, M.L., "Phase Equilibrium Calculations. What is Easy and What is Difficult?", Computers Chem. Eng., 17, 431-439 (1993).
Modell, M. and R.C. Reid, Thermodynamics and its Applications, 2nd ed., Prentice-Hall, Englewood Cliffs, NJ, 1983.
Montgomery, D.C., Design and Analysis of Experiments, 4th ed., J. Wiley, New York, NY, 1997.
Montgomery, D.C. and E.A. Peck, Introduction to Linear Regression Analysis, J. Wiley, New York, NY, 1992.
Monton, J.B. and F.J. Llopis, "Isobaric Vapor-Liquid Equilibria of Ethylbenzene + m-Xylene and Ethylbenzene + o-Xylene Systems at 6.66 and 26.66 kPa", J. Chem. Eng. Data, 39, 50-52 (1994).
Murray, L.E. and E.K. Reiff, Jr., "Design of Transient Experiments for Identification of Fixed Bed Thermal Transport Properties", Can. J. Chem. Eng., 62, 55-61 (1984).
Nelder, J.A. and R. Mead, "A Simplex Method for Function Minimization", Comp. J., 7, 308-313 (1965).
Nieman, R.E. and D.G. Fisher, "Parameter Estimation Using Linear Programming and Quasilinearization", Can. J. Chem. Eng., 50, 802 (1972).
Ohe, S., Vapor-Liquid Equilibrium Data at High Pressure, Elsevier Science Publishers, Amsterdam, NL, 1990.
Otten, R.H.J.M. and L.P.P.P. Van Ginneken, The Annealing Algorithm, Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, NL, 1989.
Park, H., "Thermodynamics of Sodium Aluminosilicate Formation in Aqueous Alkaline Solutions Relevant to Closed-Cycle Kraft Pulp Mills", Ph.D. Thesis, Dept. of Chemical and Bio-Resource Engineering, University of British Columbia, Vancouver, BC, Canada, 1999.
Park, H. and P. Englezos, "Osmotic Coefficient Data for Na2SiO3 and Na2SiO3-NaOH by an Isopiestic Method and Modelling Using Pitzer's Model", Fluid Phase Equilibria, 153, 87-104 (1998).
Park, H. and P. Englezos, "Osmotic Coefficient Data for the NaOH-NaCl-NaAl(OH)4 System Measured by an Isopiestic Method and Modelling Using Pitzer's Model at 298.15 K", Fluid Phase Equilibria, 155, 251-260 (1999).
Patino-Leal, H. and P.M. Reilly, "Statistical Estimation of Parameters in Vapor-Liquid Equilibrium", AIChE J., 28(4), 580-587 (1982).
Peneloux, A., E. Neau and A. Gramajo, "Variance Analysis Fifteen Years Ago and Now", Fluid Phase Equilibria, 56, 1-16 (1990).
Peng, D-Y. and D.B. Robinson, "A New Two-Constant Equation of State", Ind. Eng. Chem. Fundam., 15, 59-64 (1976).
Peressini, A.L., F.E. Sullivan and J.J. Uhl, Jr., The Mathematics of Nonlinear Programming, Springer-Verlag, New York, NY, 1988.
Pitzer, K.S., Activity Coefficients in Electrolyte Solutions, 2nd Ed., CRC Press, Boca Raton, FL, 1991.
Plackett, R.L., "Studies in the History of Probability and Statistics. XXIX. The Discovery of the Method of Least Squares", Biometrika, 59(2), 239-251 (1972).
Poston, R.S. and J. McKetta, "Vapor-Liquid Equilibrium in the Methane-n-Hexane System", J. Chem. Eng. Data, 11, 362-363 (1966).
Powell, M.J.D., "A Method of Minimizing a Sum of Squares of Non-linear Functions without Calculating Derivatives", Computer J., 7, 303-307 (1965).
Prausnitz, J.M., R.N. Lichtenthaler and E.G. de Azevedo, Molecular Thermodynamics of Fluid Phase Equilibria, 2nd ed., Prentice Hall, Englewood Cliffs, NJ, 1986.
Press, W.H., S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Numerical Recipes in Fortran: The Art of Scientific Computing, Cambridge University Press, Cambridge, UK, 1992.
Ramaker, B.L., C.L. Smith and P.W. Murrill, "Determination of Dynamic Model Parameters Using Quasilinearization", Ind. Eng. Chem. Fundam., 9, 28 (1970).
Rard, J.A., "Isopiestic Investigation of Water Activities of Aqueous NiCl2 and CuCl2 Solutions and the Thermodynamic Solubility Product of NiCl2-6H2O at 298.15 K", J. Chem. Eng. Data, 37, 433-442 (1992).
Ratkowsky, D.A., Nonlinear Regression Modelling, Marcel Dekker, New York, NY, 1983.
Reamer, H.H., B.H. Sage and W.N. Lacey, "Phase Equilibria in Hydrocarbon Systems", Ind. Eng. Chem., 42, 534 (1950).
Reilly, P.M. and H. Patino-Leal, "A Bayesian Study of the Error in Variables Model", Technometrics, 23, 221-231 (1981).
Reklaitis, G.V., A. Ravindran and K.M. Ragsdell, Engineering Optimization: Methods and Applications, J. Wiley, New York, NY, 1983.
Rippin, D.W.T., L.M. Rose and C. Schifferli, "Nonlinear Experimental Design with Approximate Models in Reactor Studies for Process Development", Chem. Eng. Sci., 35, 356 (1980).
Sachs, W.H., "Implicit Multifunctional Nonlinear Regression Analysis", Technometrics, 18, 161-173 (1976).
Salazar-Sotelo, D., A. Boireaut and H. Renon, "Computer Calculations of the Optimal Parameters of a Model for the Simultaneous Representation of Experimental, Binary and Ternary Data", Fluid Phase Equilibria, 27, 383-403 (1986).
Sargent, R.W.H., "A Review of Optimization Methods for Nonlinear Problems", in Computer Applications in Chemical Engineering, R.G. Squires and G.V. Reklaitis, Eds., ACS Symposium Series 124, 37-52, 1980.
Scales, L.E., Introduction to Non-linear Optimization, Springer-Verlag, New York, NY, 1985.
Schwartzentruber, J., F. Galivel-Solastiuk and H. Renon, "Representation of the Vapor-Liquid Equilibrium of the Ternary System Carbon Dioxide-Propane-Methanol and its Binaries with a Cubic Equation of State. A New Mixing Rule", Fluid Phase Equilibria, 38, 217-226 (1987).
Schwetlick, H. and V. Tiller, "Numerical Methods for Estimating Parameters in Nonlinear Models with Errors in the Variables", Technometrics, 27, 17-24 (1985).
Seber, G.A.F. and C.J. Wild, Nonlinear Regression, J. Wiley, New York, NY, 1989.
Seber, G.A.F., Linear Regression Analysis, J. Wiley, New York, NY, 1977.
Seinfeld, J.H. and G.R. Gavalas, "Analysis of Kinetic Parameters from Batch and Integral Reaction Experiments", AIChE J., 16, 644-647 (1970).
Seinfeld, J.H. and L. Lapidus, Mathematical Models in Chemical Engineering, Vol. 3, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1974.
Selleck, F.T., L.T. Carmichael and B.H. Sage, "Phase Behavior in the Hydrogen Sulfide-Water System", Ind. Eng. Chem., 44, 2219-2226 (1952).
Shah, M.J., "Kinetic Models for Consecutive Heterogeneous Reactions", Industrial and Engineering Chemistry, 57, 18-23 (1965).
Shanmugan, K.S. and A.M. Breipohl, Random Signals: Detection, Estimation and Data Analysis, J. Wiley, New York, NY, 1988.
Shibata, S.K. and S.I. Sandler, "High-Pressure Vapor-Liquid Equilibria of Mixtures of Nitrogen, Carbon Dioxide, and Cyclohexane", J. Chem. Eng. Data, 34, 419-424 (1989).
Silva, A.M. and L.A. Weber, "Ebulliometric Measurement of the Vapor Pressure of 1-Chloro-1,1-Difluoroethane and 1,1-Difluoroethane", J. Chem. Eng. Data, 38, 644-646 (1993).
Smith, J.M., Chemical Engineering Kinetics, 3rd ed., McGraw-Hill, New York, NY, 1981.
Soderstrom, T., L. Ljung and I. Gustavsson, "A Theoretical Analysis of Recursive Identification Methods", Automatica, 14, 231-244 (1978).
Sorenson, H.W., Parameter Estimation, Marcel Dekker, New York, NY, 1980.
Spiegel, M.R., Theory and Problems of Advanced Mathematics for Engineers and Scientists, McGraw Hill, New York, NY, 1983.
Srinivasan, R. and A.A. Levi, "Kinetics of the Thermal Isomerization of Bicyclo[2.1.1]hexane", J. Amer. Chem. Soc., 85, 3363-3365 (1963).
Srivastava, M.S. and E.M. Carter, Introduction to Applied Multivariate Statistics, North-Holland, Elsevier Science Publ., New York, NY, 1983.
Stone, H.L., "Iterative Solution of Implicit Approximations of Multi-Dimensional Partial Differential Equations", SIAM J. Numerical Analysis, 5, 530-558 (1968).
Sutton, T.L. and J.F. MacGregor, "The Analysis and Design of Binary Vapour-Liquid Equilibrium Experiments. Part I: Parameter Estimation and Consistency Tests", Can. J. Chem. Eng., 55, 602-608 (1977).
Tan, T.B. and J.P. Letkeman, "Application of D4 Ordering and Minimization in an Effective Partial Matrix Inverse Iterative Method", paper SPE 10493 presented at the 1982 SPE Symposium on Reservoir Simulation, San Antonio, TX (1982).
Tan, T.B. and N. Kalogerakis, "A Three-dimensional Three-phase Automatic History Matching Model: Reliability of Parameter Estimates", J. Can. Petr. Technology, 31(3), 34-41 (1992).
Tan, T.B. and N. Kalogerakis, "Improved Reservoir Characterization Using Automatic History Matching Procedures", J. Can. Petr. Technology, 32(6), 26-33 (1993).
Tan, T.B., "Parameter Estimation in Reservoir Engineering", Ph.D. Thesis, Dept. of Chemical & Petroleum Engineering, University of Calgary, AB, Canada, 1991.
Tassios, D.P., Applied Chemical Engineering Thermodynamics, Springer-Verlag, Berlin, 1993.
Thaller, L.H. and G. Thodos, "The Dual Nature of a Catalytic Reaction: The Dehydrogenation of sec-Butyl Alcohol to Methyl Ethyl Ketone at Elevated Pressures", AIChE Journal, 6(3), 369-373 (1960).
Thiessen, D.B. and Wilson, "An Isopiestic Method for Measurement of Electrolyte Activity Coefficients", AIChE J., 33(11), 1926-1929 (1987).
Trebble, M.A. and P.R. Bishnoi, "Development of a New Four-Parameter Equation of State", Fluid Phase Equilibria, 35, 1-18 (1987).
Trebble, M.A. and P.R. Bishnoi, "Extension of the Trebble-Bishnoi Equation of State to Fluid Mixtures", Fluid Phase Equilibria, 40, 1-21 (1988).
Valko, P. and S. Vajda, "An Extended Marquardt-type Procedure for Fitting Error-In-Variables Models", Computers Chem. Eng., 11, 37-43 (1987).
van Konynenburg, P.H. and R.L. Scott, "Critical Lines and Phase Equilibria in Binary van der Waals Mixtures", Philos. Trans. R. Soc. London, 298, 495-540 (1980).
Walters, F.H., J. Parker, R. Lloyd, S.L. Morgan and S.N. Deming, Sequential Simplex Optimization: A Technique for Improving Quality and Productivity in Research, Development, and Manufacturing, CRC Press Inc., Boca Raton, Florida, 1991.
Wang, B.C. and R. Luus, "Increasing the Size of the Region of Convergence for Parameter Estimation Through the Use of Shorter Data-Length", Int. J. Control, 31, 947-957 (1980).
Wang, B.C. and R. Luus, "Optimization of Non-Unimodal Systems", Int. J. Numer. Meth. Eng., 11, 1235-1250 (1977).
Wang, B.C. and R. Luus, "Reliability of Optimization Procedures for Obtaining the Global Optimum", AIChE J., 24, 619-626 (1978).
Watts, D.G., "Estimating Parameters in Nonlinear Rate Equations", Can. J. Chem. Eng., 72, 701-710 (1994).
Wee, W. and N. Kalogerakis, "Modelling of Drilling Rate for Canadian Offshore Well Data", J. Can. Petr. Technology, 28(6), 33-48 (1989).
Wellstead, P.E. and M.B. Zarrop, Self-Tuning Systems: Control and Signal Processing, J. Wiley, New York, NY, 1991.
Yokoyama, C., H. Masuoka, K. Arai and S. Saito, "Vapor-Liquid Equilibria for the Methane-Acetone and Ethylene-Acetone Systems at 25 and 50°C", J. Chem. Eng. Data, 30, 177-179 (1985).
Zatsepina, O.Y. and B.A. Buffett, "Phase Equilibrium of Gas Hydrate: Implications for the Formation of Hydrate in the Deep Sea Floor", Geophys. Res. Letters, 24, 1567-1570 (1997).
Zeck, S. and H. Knapp, "Vapor-Liquid and Vapor-Liquid-Liquid Phase Equilibria for Binary and Ternary Systems of Nitrogen, Ethane and Methanol: Experiments and Data Reduction", Fluid Phase Equilibria, 25, 303-322 (1986).
Zhu, X.D., G. Valerius and H. Hofmann, "Intrinsic Kinetics of 3-Hydroxypropanal Hydrogenation over Ni/SiO2/Al2O3 Catalyst", Ind. Eng. Chem. Res., 36, 3897-3902 (1997).
Zygourakis, K., R. Bizios and P. Markenscoff, "Proliferation of Anchorage Dependent Contact Inhibited Cells: Development of Theoretical Models Based on Cellular Automata", Biotechnol. Bioeng., 38, 459-470 (1991).
Appendix 1
A.1.1 THE TREBBLE-BISHNOI EQUATION OF STATE
We consider the Trebble-Bishnoi EoS for a fluid mixture that consists of
N_c components. The equation was given in Chapter 14.
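For reference, the mixture form of the equation (Trebble and Bishnoi, 1988) is

$$P = \frac{RT}{v - b_m} - \frac{a_m}{v^2 + (b_m + c_m)\,v - (b_m c_m + d_m^2)}$$

with the mixture parameters a_m, b_m, c_m and d_m given by Equations 14.7a-d.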
A.1.2  Derivation of the Fugacity Expression

The Trebble-Bishnoi EoS (Trebble and Bishnoi, 1988a,b) for a fluid
mixture is given by Equation 14.6. The fugacity, f_j, of component j in a mixture
is given by the following expression for case 1 (τ ≥ 0)

$$\ln f_j = \ln(x_j P) + \frac{b_d}{b_m}\,(Z - 1) - \ln(Z - B_m) + \cdots \qquad (A.1.1)$$

where

(A.1.2)

$$\Lambda = Z^2 + (B_m + C_m)\,Z - (B_m C_m + D_m^2) \qquad (A.1.3)$$

$$\theta = \tau^{0.5} \qquad (A.1.4)$$

$$u_d = Q \quad \text{and} \quad \lambda = \ln\!\left[\frac{2Z + B_m(u - \theta)}{2Z + B_m(u + \theta)}\right] \ \text{if } \tau \ge 0 \qquad (A.1.5,\ A.1.6a)$$

(A.1.6b)   if τ < 0

(A.1.7)

(A.1.8)

$$Q = -\frac{1}{20\,b_m^2}\left[6 b_m c_d - 6 b_d c_m + 2 c_m c_d + 8 d_m d_d - \frac{2 b_d c_m^2}{b_m} - \frac{8 b_d d_m^2}{b_m}\right] \qquad (A.1.9)$$

$$A_m = \frac{a_m P}{R^2 T^2} \qquad (A.1.10a)$$

$$B_m = \frac{b_m P}{R T} \qquad (A.1.10b)$$

$$C_m = \frac{c_m P}{R T} \qquad (A.1.10c)$$

$$D_m = \frac{d_m P}{R T} \qquad (A.1.10d)$$

where a_m, b_m, c_m and d_m are given by Equations 14.7a, b, c and d respectively. The
quantities a_d, b_d, c_d, d_d and u_d are given by the following equations

(A.1.11a)

(A.1.11b)

(A.1.11c)

(A.1.11d)

(A.1.11e)

(A.1.11f)

Also

(A.1.12)
A.1.3  Derivation of the Expression for (∂ln f_j/∂x_j) at constant T, P, x_i

Differentiation of Equation A.1.1 with respect to x_j at constant T, P and x_i
for i=1,2,...,N_c and i≠j gives

(A.1.13)

where

(A.1.14)

A number of partial derivatives are needed and they are given next.

(A.1.15)

(A.1.16)

(A.1.17)

(A.1.18)

$$\frac{\partial \Lambda}{\partial x_j} = 2Z\,\frac{\partial Z}{\partial x_j} + (B_m + C_m)\,\frac{\partial Z}{\partial x_j} + Z\,\cdots \qquad (A.1.19)$$

If τ > 0 then λ is given by Equation A.1.6a and the derivative is given by

(A.1.20a)

where g = 2Z + B_m u, h = B_m θ and

(A.1.20b)

(A.1.21a)

(A.1.21b)

(A.1.21c)

If τ is negative then λ is given by Equation A.1.6b and the derivative is given by

(A.1.21d)

In addition to the above derivatives we need the following

(A.1.22)

(A.1.23)

where

$$Q_B = \frac{2 b_d c_d}{b_m} + \frac{8 b_d d_d}{b_m} \qquad (A.1.24)$$

$$Q_C = 20\, b_m^2 \qquad (A.1.25a)$$

(A.1.25b)

(A.1.25c)

(A.1.25d)

(A.1.25e)

(A.1.25f)

(A.1.25g)

$$\frac{\partial \theta}{\partial x_j} = \frac{1}{2\theta}\,\frac{\partial \tau}{\partial x_j}, \quad \text{or} \quad \frac{\partial \theta}{\partial x_j} = -\frac{1}{2\theta}\,\frac{\partial \tau}{\partial x_j} \ \text{if } \tau \text{ is negative} \qquad (A.1.25h,\ A.1.25i)$$
Appendix 2
A.2.1 LISTINGS OF COMPUTER PROGRAMS
In this appendix we provide listings of two computer programs with the corre-
sponding input files. One is for a typical algebraic equation model (Example
16.1.2) and the other for an ODE model (Example 16.3.2). These programs can be
used with a few minor changes to solve other parameter estimation problems
posed by the user. As our intention was not to produce a fool-proof commercial
software package, a few things are required from the user.

In order to run the given examples, no modifications need be made.
If, however, the user wants to use the Gauss-Newton method for another algebraic or
ODE model, basic FORTRAN 77 programming skills are required. The user must
change the subroutine MODEL and the subroutine JX (only for ODE models)
using the existing ones as a template. In addition, the user must select the values of a
few program parameters, which are detailed in the first set of comment cards in
each program. Subsequently, the program needs to be compiled.
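As an illustration, a user-supplied MODEL for a one-response, two-input, two-parameter algebraic model might look as follows. The argument list mirrors the call model(nvar,nparmx,y,x,p,dydp) in the listing of Section A.2.3, but the rate expression in the body is purely a placeholder for the user's own equations, and the sensitivity convention (absolute versus relative derivatives) should be copied from the examples on the CD.

      SUBROUTINE MODEL(NVAR, NPARMX, Y, X, P, DYDP)
C     User-supplied model: computes Y = f(X,P) and the sensitivity
C     coefficients dY/dP.  The expression below is only a placeholder.
      INTEGER NVAR, NPARMX
      DOUBLE PRECISION Y(NVAR), X(NVAR), P(NPARMX), DYDP(NVAR,NPARMX)
      Y(1) = P(1)*X(1)*DEXP(-P(2)/X(2))
C     Analytical derivatives with respect to each parameter:
      DYDP(1,1) = X(1)*DEXP(-P(2)/X(2))
      DYDP(1,2) = -Y(1)/X(2)
      RETURN
      END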
The input data file should also be constructed using any of the existing ones
as a template.
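Based on the read statements in the listing of Section A.2.3 (a header line precedes each group: initial guesses, minimum bounds, maximum bounds, iteration controls, column count and the data table), a hypothetical input file could look like the following; all numbers are purely illustrative.

Initial guess for the parameters
  1.0   1.0
Minimum parameter bounds
  0.0   0.0
Maximum parameter bounds
  1000.0   1000.0
NITER, IPRINT, EPS_Marquardt
  20   0   0.0
Number of columns in the data table
  3
Data (x1, x2, y)
  10.0   600.0   0.90
  20.0   600.0   0.81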
All the programs have been written in FORTRAN 77. The programs
have been tested using the MICROSOFT DEVELOPER STUDIO - FORTRAN
POWER STATION. The programs call one or more IMSL MATH library routines,
which are available with the above FORTRAN compiler. In general, any FORTRAN
compiler accompanied by the IMSL MATH library should be capable of running the
enclosed programs without any difficulty.
These programs are provided as an aid to the user with the following dis-
claimer.
We make no warranties that the programs contained in this book are free
of errors. The programs in this book are intended to be used to
demonstrate parameter estimation procedures. If the user wishes to use
the programs for solving problems related to the design of processes and
products we make no warranties that the programs will meet the user's
requirements for the applications. The user employs the programs to
solve problems and use the results at his/her own risk. The authors and
publisher disclaim all liability for direct or indirect injuries to
persons, damages or loss of property resulting from the use of the
programs.
A.2.2 CONTENTS OF ACCOMPANYING CD
In the enclosed CD the computer programs used for the solution of selected
problems presented throughout the book are provided as an aid to the user. Each
example is provided in a different folder together with a typical input and output
file. The *.exe file is also provided in case one wishes to run the particular exam-
ples and has no access to a FORTRAN compiler.
Programs are provided for the following examples dealing with algebraic
equation models:
1. Example 4.4.1
2. Example 4.4.2
3. Example 14.2.7.1
4. Example 16.1.1
5. Example 16.1.2
6. Example 16.1.3
8. Example 17.1.1
9. Example 17.1.2
Programs are also provided for the following examples dealing with ODE models:
1 . ODE-Example 16.3.1
2. ODE-Example 16.3.2
3. ODE-Example 16.3.3
4. ODE-Example 17.3.1
In addition, the program used for data smoothing with cubic splines for the short-
cut methods are given for the example:
1. SC-Example 17.1.4
A.2.3 COMPUTER PROGRAM FOR EXAMPLE 16.1.2
C
C      EXAMPLE 16.1.2 (Isomerization of Bicyclo Hexane)
C      in the book:
C      APPLIED PARAMETER ESTIMATION FOR CHEMICAL ENGINEERS
C      by Englezos & Kalogerakis / Marcel Dekker Inc. (2000)
C      COPYRIGHT: Marcel Dekker 2000
C
C-----------------------------------------------------------------------
C
C      PARAMETER ESTIMATION ROUTINE FOR ALGEBRAIC EQUATION MODELS
C      Based on Gauss-Newton method with PseudoInverse and Marquardt's
C      Modification. Hard BOUNDARIES on parameters can be imposed.
C      GENERALIZED LEAST SQUARES Formulation (when IGLS=1)
C        with the DIAGONAL WEIGHTING MATRIX, Q(i)/Ydata(i)**2, i=1,NY
C      or
C      WEIGHTED LEAST SQUARES Formulation (when IGLS=0)
C        with the DIAGONAL WEIGHTING MATRIX, Q(i), i=1,NY
C
C      Written in FORTRAN 77 by Dr. Nicolas Kalogerakis, Aug. 1999
C      Compiler: MS-FORTRAN using the IMSL library (routine: DEVCSF)
C-----------------------------------------------------------------------
C      The User must provide Subroutine MODEL [that describes the
C      mathematical model Y=f(X,P)]
C
C      The Following Variables MUST be Specified in Main Program:
C      IGLS   = Flag to use Generalized LS (IGLS=1) or Weighted LS (IGLS=0)
C      NX     = Number of independent variables in the model
C      NY     = Number of dependent variables in the model
C      NPAR   = Number of unknown parameters
C      NXX(i), i=1,NX = columns in the data corresponding to x(1), x(2),...
C      NYY(i), i=1,NY = columns in the data corresponding to y(1), y(2),...
C      FILEIN = Name of Input file
C      FILOUT = Name of Output file
C      Q(i), i=1,NY = Constant weights for each measured variable
C               (Use 1 for LS)
C      IBOUND = 1 to enable hard constraints on the parameters (0 otherwise)
C      NSTEP  = Maximum number of reductions allowed for Bisection rule
C               (default is 10)
C      KIF, KOS, KOF = Input file unit, Screen output unit, Output file unit
C      NSIG   = Approx. Number of Significant digits
C               (Tolerence EPS = 10**(-NSIG))
C      EPSPSI = Minimum Ratio of eigenvalues before the Pseudo-inverse
C               approximation is used
C      EPSMIN = A very small number (used to avoid division by zero)
C
C      For Space Allocation Purposes Use:
C      NVAR   = Greater or equal to the maximum number of columns in Datafile
C      NROWS  = Greater or equal to the maximum number of experiments
C               (=rows) in Datafile
C      NPARMX = Greater or equal to the number of unknown parameters in
C               the model
C-----------------------------------------------------------------------
USE MSI MSLMD
PARAM?ZTER ( nvar=15, nrows=300 , npar mx=15)
DOUBLE PRECI SI ON ydata(nvar,nrows),xdata(nvar,nrows)
DOUBLE PRECI SI ON al l dat a( nvar , nr ows)
DOUBLE PRECI SI ON y ( nvar) , q ( nvar) , x ( nvar) , dydp ( nvar , npar mx)
DOUBLE PRECI SI ON PO ( npar mx) , dp ( npar mx) , p ( npar mx)
DOUBLE PRECI SI ON pmi n( nparmx) , pmax( nparmx) , v( npar mx, npam)
DOUBLE PRECI SI ON a ( npar mx, npar mx) , b ( npar mx) , s ( npamkx)
DOUBLE PRECI SI ON st dev ( npar mx) , bv(nparmx1 , bvs ( nparmx)
I NTEGER nyy ( nvar) , nxx ( nvar)
CHARACTER*4 dmyl l ne
CHARACTER*24 f i l ei n, f i l out
C
Appendix 2
413
COMMON /gn/imode,ny,nx,npar
C
C-------------------- Set convergence tolerence & other parameters
C
      DATA kos/6/, kif/2/, kof/7/, epsmin/1.e-80/, igls/0/
      DATA nsig/5/, epspsi/1.e-30/, nstep/10/, q/nvar*1.0/, ibound/1/
      eps = 10.**(-nsig)
      npar = 2
      ny = 1
      nx = 2
C
C-------------------- Set database pointers:
C-------------------- Select which columns of the data correspond
C-------------------- to x(1), x(2),... y(1), y(2),... in the file
      nxx(1) = 1
      nxx(2) = 2
      nyy(1) = 3
C
C-------------------- Set Filenames and Open the files for I/O
      filein = 'DataIN-Ex16.1.2.txt'
      filout = 'DataOUT-Ex16.1.2.txt'
      open(kif, file=filein)
      open(kof, file=filout)
      write(kof,71)
      write(kos,71)
   71 format(//20x,'PARAMETER ESTIMATION PROGRAM FOR ALGEBRAIC MODELS',
     & /20x,'Gauss-Newton Method with PseudoInverse/Marquardt/Bisection'
     & ,/20x,'Bounded parameter search/Weighted LS formulation...',/)
      if (igls .eq. 1) then
         write(kos,72)
         write(kof,72)
   72    format(//20x,'GENERALIZED LEAST SQUARES Formulation...')
      else
         write(kos,73)
         write(kof,73)
   73    format(//20x,'WEIGHTED LEAST SQUARES Formulation...')
      end if
      write(kos,74) filein
      write(kof,74) filein
   74 format(/40x,'Data Input Filename: ',a24)
      write(kos,77) filout
      write(kof,77) filout
   77 format(40x,'Program Output File: ',a24)
C-------------------- Read Initial Guess for Parameters
      read(kif,*) dummyline
      read(kif,*) (p(j), j=1,npar)
C
C-------------------- Read MIN & MAX parameter BOUNDS
      read(kif,*) dummyline
      read(kif,*) (pmin(j), j=1,npar)
      read(kif,*) dummyline
      read(kif,*) (pmax(j), j=1,npar)
      read(kif,*) dummyline
C
C-------------------- Read NITER, IPRINT & EPS_Marquardt
      read(kif,*) niter, iprint, epsmrq
      write(kos,83) (p(j), j=1,npar)
      write(kof,83) (p(j), j=1,npar)
      write(kof,87) (pmin(j), j=1,npar)
      write(kof,88) (pmax(j), j=1,npar)
   83 format(//5x,'Initial-Guess P(j) =',6g12.4//)
   87 format(5x,'MIN Bounds Pmin(j) =',6g12.4//)
   88 format(5x,'MAX Bounds Pmax(j) =',6g12.4//)
C
C-------------------- Read Input Data (Experimental Data Points)
      read(kif,*) dummyline
      read(kif,*) ncolr
      if (ncolr .lt. nx+ny) then
         ncolr = nx + ny
         write(kof,91) ncolr
   91    format(/40x,'Number of Columns CANNOT be less than =',i3,
     &          /40x,'Check your Model Equations again...')
         stop
      end if
      write(kof,92) ncolr
   92 format(/40x,'Number of Columns =',i3)
      read(kif,*) dummyline
      np = 0
      do 100 i=1,nrows
         read(kif,*,err=105,end=108) (alldata(j,i), j=1,ncolr)
         np = np + 1
  100 continue
  105 continue
      write(kof,107)
  107 format(/1x,' Input STOPPED when CHARACTERS were encountered...',/)
  108 continue
      write(kof,109) np
      write(kos,109) np
      close(kif)
  109 format(40x,'Data Points Entered =',i3)
C-------------------- Assign variables to Ydata and Xdata & print them
      do 120 i=1,ny
      do 120 j=1,np
         ydata(i,j) = alldata(nyy(i),j)
  120 continue
      do 125 i=1,nx
      do 125 j=1,np
         xdata(i,j) = alldata(nxx(i),j)
  125 continue
      write(kof,131) ny, nx
  131 format(/1x,'Each Column corresponds to: Y(1),...,Y(',i2,
     &       '), X(1),...,X(',i2,'):')
      do 130 j=1,np
         write(kof,132) (ydata(i,j),i=1,ny), (xdata(i,j),i=1,nx)
  130 continue
  132 format(1x,8g14.5)
      dfr = ny*np - npar
C
C-------------------- Main iteration loop
      do 700 loop=1,niter
C-------------------- Initialize matrix A, b and Objective Function
      do 220 i=1,npar
         b(i) = 0.0
         do 220 l=1,npar
            a(i,l) = 0.0
  220 continue
      SSE = 0.0
      if (iprint .eq. 1) write(kof,222)
  222 format(/1x,'Current MODEL output vs. DATA',
     &       /1x,'Data#  Y-data(i) & Y-model(i), i=1,NY')
C
C-------------------- Scan through all data points
      do 300 i=1,np
C
C-------------------- Compute model output Y(i) & dydp(i,j)
lmode=l
do 230 j=l,nx
x ( 7 ) =xdata (j , I)
230 continue
call model(nvar,nparmx,y,x,p,dydp)
If (iprint .ge. 1) then
238 format(lx,i3,2x,8gl2.4)
write(kof,238) i, (ydata(k,i) ,y(k) ,k=l,ny)
end if
do 240 j=l,ny
C " " " " " " " " " " " " " " "
Compute LS Ob]. functlon and matrlx A, b
C " " " " " " " " " " " " " " " Select weighting factor (GLS vs WLS)
qyj= q(j)/(ydata(j,i)**2 + epsmin)
if (igls .eq. 1) then
else
end if
SSE=SSE + qyj*(ydata(J,i) - y(j))**2
b(l)=b(l) + qyj*dydp(j,l)*(ydata(j,i) - y(j))
a(l,k)=a(l,k) + qyi*dydp(~,l)*dydp(j,k)
w~= s(j)
do 240 l=l,npar
do 240 k=l,npar
240 continue
300 continue
C
C " " " " " " " " " " " " " " " Keep current value of p(i)
do 310 i=l,npar
pO(i)=p(i)
310 continue
C
C " " " " " " " " " " " " " " " Decompose matrix A (using DEVCSF from IMSL)
call devcsf(npar,a,nparmx~s,v,nparmx)
C
C " " " " " " " " " " " " " " " Compute condition number 6 (V**T) *b
conda=s (1) /s (npar)
do 312 l=l,npar
bv(i)=O.O
do 312 j=l,npar
bv(i)=bv(i) + v(J,i)*b(j)
312 continue
315 continue
C
C-------------------------------- Use pseudoinverse (if Cond(A) > 1/epspsi)
      ipsi=0
      do 320 k=1,npar
      if (s(k)/s(1) .lt. epspsi) then
         bvs(k)=0.0
         ipsi=ipsi + 1
      else
C-------------------------------- Include MARQUARDT'S modification
         bvs(k)=bv(k)/(s(k) + epsmrq)
      end if
 320  continue
      write(kos,325) loop,epsmrq,epspsi,ipsi,conda
      write(kof,325) loop,epsmrq,epspsi,ipsi,conda
 325  format(//1x,'ITERATION=',i3,/1x,'EPS_Marq.=',g10.4,4x,
     & 'EPS_PseudoInv.=',g10.4,4x,'No. PseudoInv. Apprx.=',i2,
     & /1x,'Cond(A)=',e12.4)
      write(kof,326) (s(j), j=1,npar)
 326  format(1x,'Eigenvalues=',8g12.4)
c " " " " " " " " " " " " " " " Compute dp = (V**T) *b
do 330 i=l ,npar
dp(i)=O.O
do 330 j=l,npar
416
Appendi x 2
dp(l)=dp(l) + v(1, j)*bvs(j)
330 continue
c " " " " " " " " " " " " " " " Compute new vector p and I ldpl I
dpnorm=O . 0
do 340
i=l,npar
dpnorm=dpnorm + dabs (dp (i) )
p(i)=pO(l)*(l. + dp(i))
340 contlnue
dpnorm =dpnorm/npar
write (kos, 345) SSE
write (kof, 345) SSE
wrlte(kos,346) (p(j), j=l,npar)
wrlte(kof1346) (p(j), j=l,npar)
346 format(lx, 'P(J) G-N =',8g12.5)
345 format(lx, 'LS-SSE =I ,g15.6/)
c " " " " " " " " " " " " " " " Test for convergence
if (dpnorm .It, eps ) then
if (ips1 .eq. 0) then
if (epsmrq .gt. s(npar)*0.01) then
write(kof ,347) epsmrq
wrlte (kos, 347) epsmrq
epsmrq=epsmrq*O.Ol
goto 700
else
if (epsmrq .gt. 0.0) then
write(kos, 347) epsmrq
write(kos,349)
write (kof, 347) epsmrq
wrl te (kof ,34 9)
epsmrq=O . 0
goto 700
end If
end if
sigma=sqrt (SSE/dfr)
write(kos,385) sigma,dfr
write(kof, 385) sigma,dfr
c " " " " " " " " " " " " " " " Check If Marquardt's EPS is nonzero
347 format(//lx,'>>>>>> Converged with EPS-Marquardt =',g14.5)
349 format(//lx,'>>>>>> From now on EPS-Marquardt =0.0')
c""""""""""""""" If CONVERGED go and compute standard devlatmn
385 format(///5xl'++++++ CONVERGED ++++++',/5x,'Sqma=',
E4 g14.5,/5x,'Degrees of Freedom=I,f4.0,/)
wrlte(kos,386) (p(]), ]=l,npar)
wrlte(kof ,386) (p(~), j=l,npar)
386 format(5x, 'Best P(i) =',8g12.5)
goto 800
else
more eigenvalue
C " " " " " " " " " " " " " " " If converged with ipsi nonzero, Include one
epspsi=epspsl*l.e-1
goto 315
end if
end if
C
C-------------------------------- Enforce HARD MIN & MAX parameter boundaries
c-------------------------------- Use bisection to enforce: Pmin(i)<P(i)<Pmax(i)
      if (ibound .eq. 1) then
         do 460 j=1,npar
 448     continue
         if (p(j).le.pmin(j) .or. p(j).ge.pmax(j)) then
            do 450 i=1,npar
            dp(i)=0.50*dp(i)
            p(i)=p0(i)*(1. + dp(i))
 450        continue
         end if
         if (p(j).le.pmin(j) .or. p(j).ge.pmax(j)) goto 448
 460     continue
         write(kof,465) (p(j), j=1,npar)
 465     format(1x,'P(j) Bounded =',8g12.5)
      end if
c " " " " " " " " " " " " " " " STEP-SIZE COMPUTATIONS:
C " " " " " " " " " " " " " " " Use full step if not needed or nstep=O
if (dpnorm .le. 0.010 .or. nstep .eq. 0) then
end if
do 600 kks=l,nstep
SSEnew=O.O
Imode=O
do 580 ii=l,np
do 530 ]=l,nx
goto 700
C " " " " " " " " " " " " " " " Compute step-size by the BISECTION RULE
c " " " " " " " " " " " " " " " Compute Model Output at new parameter values
x(])=xdata(~,z~)
530 contlnue
call model(nvar,nparmx,y,x,p,dydp)
do 540 j=1 ,ny
c " " " " " " " " " " " " " " " Compute NEW value of LS Obj. function
c " " " " " " " " " " " " " " " Select welghting factor (GLS vs WLS)
if (igls .eq. 1) then
else
end if
SSEnew=SSEnew + qyj*(ydata(j,iz) - y(j))**2
g y ~ = q(~)/(ydata(j,il)**2 + epsmin)
=I= q(j)
540 continue
c " " " " " " " " " " " " - zf LS Ob]. function is not improved, half step-size
if (SSEnew .ge. SSE) then
do 550 i=l,npar
dp (i) =O .50*dp (i)
p(l)'pO(i) * (1. + dp(i))
550 continue
goto 600
end If
write(kof,696) (p(Jj) ,~g=l,npar)
wrlte (kof ,585) kkk, SSE I SSEnew
580 continue
kkk=kks - 1
585 format(/2x,'Stepsize= (0.5)**',i2,3x,'LS_SSE~old=',~~4.~,4~,
& 'LS-SSE-new=',g14.6)
goto 700
write(kof,696) ( ~ ( ~ 1 1 , jj=l,npar)
600 continue
696 format(lx, 'P(j) Stepped =' ,8g12.5)
 700  continue
c-------------------------------- Alert user that G-N did not converge...
      sigma=sqrt(SSE/dfr)
      write(kof,750) sigma,dfr
 750  format(///5x,'***** Did NOT converge yet *****',/5x,
     & 'LS_Sigma=',g11.5,/5x,'Degrees of freedom=',f4.0)
      write(kof,755) (p(j), j=1,npar)
 755  format(/5x,'Last P(j) =',8g12.5)
 800  continue
c-------------------- Compute sigma & stand. dev. for current parameters
      sigma=sqrt(SSE/dfr)
      do 870 i=1,npar
      stdev(i)=0.0
      do 870 j=1,npar
      stdev(i)=stdev(i) + v(i,j)**2/s(j)
 870  continue
      do 880 i=1,npar
      stdev(i)=sqrt(stdev(i))*sigma
 880  continue
      write(kos,886) (stdev(j)*100, j=1,npar)
      write(kof,886) (stdev(j)*100, j=1,npar)
 886  format(//5x,'St.Dev. (%) =',8g12.4)
C
C-------------------------------- Alert User whether GLS or WLS was used.
      if (igls .eq. 1) then
         write(kof,887)
 887     format(//1x,50('-'),/5x,
     &   'GENERALIZED LEAST SQUARES Formulation Used',/1x,50('-'),/)
      else
         write(kof,888)
 888     format(//1x,50('-'),/5x,'WEIGHTED LEAST SQUARES Formulation Used'
     &   ,/1x,50('-'),/)
      end if
      write(kof,890)
 890  format(/1x,'----------------- PROGRAM END -----------------',/)
      write(kos,891) filout
 891  format(//1x,60('-'),/5x,'Program OUTPUT stored in file: ',
     & a24,/1x,60('-'),/)
      close(kof,status='keep')
      pause
      stop 100
      end
c " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " -
C
MODEL
c Algebraic Equatlon Model of the form Y=f (X,P)
c where X(i) , i=l , N X is the vector of independent varlables
C
C Y ( i ) , i=l,NY is the vector of dependent variables
C P(l), i=l,NPAR 1s the vector of unknown parameters
C
c"""""""""""""""""""""""""""""""""""""-
C
c The User must specify the MODEL EQUATIONS
C Y(1) =fl(X(1) ,... X(NX); P ( 1 ) ,... P(NPAR))
C Y(2) =f2(X(1), . . .X(NX); P(l), . . .P(NPAR))
C Y(3) =f3(X(1) ,... X(NX); P(1) ,... P(NPAR)) ,... etc
C
c and the Jacobean matrix dYDP(1, ~)=[dY(i)/dP(j)]
C
C
c"""""""""""""""""""""""-""""""""""""
MODEL
SUBROUTINE model(nvar,nparmx,y,x,p,dydp)
DOUBLE PRECISION y (nvar) , dydp (nvar , nparmx) ,x (nvar) ,p (nparmx)
common /gn/lmode,ny,nx,npar
C
C
C " " " " " " " " " " Compute model output
y(1) =dexp(-p(l)*x(l)*dexp(-p(2)*(1./~(2)-1./620.)))
C
C-------------------------------- Compute Jacobean dydp(i,j)
      if (imode .eq. 0) return
      dydp(1,1)=-x(1)*dexp(-p(2)*(1./x(2)-1./620.))*y(1)
      dydp(1,2)=-(1./x(2)-1./620.)*dydp(1,1)*p(1)
C
C-------------------------------- Normalise sensitivity coefficients
      do 200 i=1,ny
      do 200 j=1,npar
      dydp(i,j)=dydp(i,j)*p(j)
 200  continue
C
      return
      end
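For reference, the model coded in subroutine MODEL above (apparently reaction time x1 and temperature x2, with 620 as the reference temperature) is

\[
y = \exp\!\left[-p_1 x_1 \exp\!\left(-p_2\left(\frac{1}{x_2}-\frac{1}{620}\right)\right)\right],
\]

with the analytic sensitivities implemented in dydp,

\[
\frac{\partial y}{\partial p_1} = -x_1\, e^{-p_2(1/x_2-1/620)}\, y,
\qquad
\frac{\partial y}{\partial p_2} = -\left(\frac{1}{x_2}-\frac{1}{620}\right) p_1\, \frac{\partial y}{\partial p_1}.
\]

Each dydp(i,j) is then multiplied by p(j), so the Gauss-Newton step dp is obtained in terms of relative parameter changes, consistent with the update p(i) = p0(i)*(1 + dp(i)) in the main program.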
INPUT FILE
Enter Initial Guess for the parameters...
1. 100000.
Enter Min values for the parameters... (Pmin(i))
0. 0.
Enter MAX values for the parameters... (Pmax(i))
1000000. 1000000.
Enter NITER (Max No. Iter.), IPRINT (=1 for complete Output), EPSMRQ
20 0 0.0
Enter number of columns of DATA
3
Enter DATA points (one line per experiment)
120.   600   .900
60.0   600   .949
60.0   612   .886
120.   612   .785
120.   612   .791
60.0   612   .890
60.0   620   .787
30.0   620   .877
15.0   620   .938
60.0   620   .782
45.1   620   .827
90.0   620   .696
150.   620   .582
60.0   620   .795
60.0   620   .800
60.0   620   .790
30.0   620   .883
90.0   620   .712
150.   620   .576
90.4   620   .715
120.   620   .673
60.0   620   .802
60.0   620   .802
60.0   620   .804
60.0   620   .794
60.0   620   .804
60.0   620   .799
30.0   631   .764
45.1   631   .688
30.0   631   .717
30.0   631   .802
45.0   631   .695
15.0   639   .808
30.0   639   .655
90.0   639   .309
25.0   639   .689
60.1   639   .437
60.0   639   .425
30.0   639   .638
30.0   639   .659
60.0   639   .449
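Only subroutine MODEL needs to be rewritten to fit a different algebraic model with this program. As a minimal sketch (not part of the original listing), a hypothetical single-response Michaelis-Menten model y = p(1)*x(1)/(p(2)+x(1)) with its analytic Jacobean could be coded against the same interface as follows:

      SUBROUTINE model(nvar,nparmx,y,x,p,dydp)
      DOUBLE PRECISION y(nvar),dydp(nvar,nparmx),x(nvar),p(nparmx)
      common /gn/imode,ny,nx,npar
C-------------------------------- Hypothetical model output (NX=1, NY=1)
      y(1)=p(1)*x(1)/(p(2)+x(1))
C-------------------------------- Analytic Jacobean dY(1)/dP(j)
      if (imode .eq. 0) return
      dydp(1,1)=x(1)/(p(2)+x(1))
      dydp(1,2)=-p(1)*x(1)/(p(2)+x(1))**2
C-------------------------------- Normalise sensitivity coefficients
      do 200 j=1,npar
      dydp(1,j)=dydp(1,j)*p(j)
 200  continue
      return
      end

The data file would then carry NX+NY = 2 columns per experiment, and the column pointers NYY/NXX in the main program must be set accordingly.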
A.2.4   COMPUTER PROGRAM FOR EXAMPLE 16.3.2
c----------------------------------------------------------------------
C
C     EXAMPLE 16.3.2 (Pyrolytic Dehydrogenation of Benzene to ...)
C     in the book:
C     APPLIED PARAMETER ESTIMATION FOR CHEMICAL ENGINEERS
C     by Englezos & Kalogerakis / Marcel Dekker Inc. (2000)
C     COPYRIGHT: Marcel Dekker 2000
C
c----------------------------------------------------------------------
C
C     PARAMETER ESTIMATION ROUTINE FOR ODE MODELS
C     Based on Gauss-Newton method with PseudoInverse and Marquardt's
C     Modification. Hard BOUNDARIES on parameters can be imposed.
C     Bayesian parameter priors can be used as an option.
C     GENERALIZED LEAST SQUARES Formulation (when IGLS=1)
C        with the DIAGONAL WEIGHTING MATRIX, Q(i)/Ydata(i)**2, i=1,NY
C     or
C     WEIGHTED LEAST SQUARES Formulation (when IGLS=0)
C        with the DIAGONAL WEIGHTING MATRIX, Q(i), i=1,NY
C
C     Written in FORTRAN 77 by Dr. Nicolas Kalogerakis, Aug. 1999
C     Compiler: MS-FORTRAN using the IMSL library (routines: DEVCSF,
C     DIVPAG, SETPAR)
C
c----------------------------------------------------------------------
C
C     The User must provide Subroutine MODEL [that describes the
C     mathematical model dx/dt=f(x,p)]
C     and subroutine JX [computes the Jacobeans (df/dx) & (df/dp)]
C     where x(i), i=1,NX is the State vector.
C     The Output vector y(i), i=1,NY is assumed to correspond to
C     the first NY elements of the State vector.
C
C     The Following Variables MUST be Specified in PARAMETER statement:
C       NXZ (=NX)     = Number of state variables in the model
C       NYZ (=NY)     = Number of measured variables in the model
C       NPARZ (=NPAR) = Number of unknown parameters
C       NRUNZ = Greater or equal to the maximum number of runs
C               (experiments) to be regressed simultaneously
C       NPZ   = Greater or equal to the maximum number of measurements
C               per run
C
C     The Following Variables MUST be Specified in Main Program:
C       IGLS   = Flag to use Generalized Least Squares (IGLS=1) or
C                Weighted LS (IGLS=0)
C       FILEIN = Name of Input file
C       FILOUT = Name of Output file
C       Q(i), i=1,NY = Constant weights for each measured variable
C                (Use 1.0 for Least Squares).
C       IBOUND = 1 to enable hard constraints on the parameters
C                (0 otherwise)
C       NSTEP  = Maximum number of reductions allowed for Bisection rule
C                (default is 10)
C       KIF, KOS, KOF = Input file unit, Screen output unit, Output file unit.
C       NSIG   = Approx. Number of Significant digits
C                (Tolerance EPS = 10**(-NSIG))
C       EPSPSI = Minimum Ratio of eigenvalues before the Pseudo-inverse
C                approximation is used
C       EPSMIN = A very small number (used to avoid division by zero).
C       TOL    = Tolerance for local error required by ODE solver
C                (default is 1.0e-6)
C
c----------------------------------------------------------------------
      USE MSIMSL
      EXTERNAL jx, model
      PARAMETER (nxz=2, nyz=2, nparz=2, ntot=nxz*(nparz+1), npz=400,
     &           nrunz=5)
      DOUBLE PRECISION ydata(nrunz,nyz,npz), tp(nrunz,npz), p(nparz)
      DOUBLE PRECISION x0(nrunz,nxz), q(nyz), t0(nrunz), nobs(nrunz)
      DOUBLE PRECISION p0(nparz), dp(nparz), x(ntot), v(nparz,nparz)
      DOUBLE PRECISION a(nparz,nparz), b(nparz), s(nparz), ag(1,1)
      DOUBLE PRECISION stdev(nparz), bv(nparz), bvs(nparz), pivpag(50)
      DOUBLE PRECISION pmin(nparz), pmax(nparz)
      DOUBLE PRECISION pprior(nparz), vprior(nparz)
      DOUBLE PRECISION t, tend, tol
      COMMON /gn/imode, nx, npar, p
      CHARACTER*4 dummyline
      CHARACTER*24 filein, filout
C
c-------------------------------- Set convergence tolerance & other parameters
C
      DATA kos/6/,kif/2/,kof/7/,epsmin/1.e-80/,tol/1.d-6/,igls/0/
      DATA nsig/5/,epspsi/1.e-30/,nstep/10/,q/nyz*1.0/,ibound/1/
      eps = 10.**(-nsig)
C
      nx=nxz
      ny=nyz
      npar=nparz
      nxx=nx + 1
      n=nx*(npar + 1)
C
C-------------------------------- Set Filenames and Open the files for I/O
C
      filein = 'DataIN-Ex16.3.2.txt'
      filout = 'DataOUT-Ex16.3.2.txt'
      open(kif, file=filein)
      open(kof, file=filout)
C-------------------------------- Print header information
      write(kos,60)
      write(kof,60)
 60   format(/20x,'PARAMETER ESTIMATION PROGRAM FOR ODE MODELS using',
     & /20x,'the GAUSS-NEWTON Method with Marquardt''s Modification,',
     & /20x,'Pseudo-Inverse Approximation, Bisection Rule,',
     & /20x,'Bounded Parameter Search, Bayesian Parameter Priors')
      if (igls .eq. 1) then
         write(kos,65)
         write(kof,65)
 65      format(//20x,'GENERALIZED LEAST SQUARES Formulation...')
      else
         write(kos,67)
         write(kof,67)
 67      format(//20x,'WEIGHTED LEAST SQUARES Formulation...')
      end if
      write(kos,70) nx,ny,npar
      write(kof,70) nx,ny,npar
 70   format(//20x,'Number of State Variables  (NX)   = ',i3,
     & /20x,'Number of Output Variables (NY)   = ',i3,
     & /20x,'Number of Parameters       (NPAR) = ',i3)
      write(kos,71) filein
      write(kof,71) filein
 71   format(//20x,'Data Input Filename: ',a24)
      write(kos,72) filout
      write(kof,72) filout
 72   format(40x,'Program Output File: ',a24)
C
c-------------------------------- Read PRIOR, MIN & MAX parameter values
C
      read(kif,*) dummyline
      read(kif,*) (pprior(j), j=1,npar)
      read(kif,*) dummyline
      read(kif,*) (vprior(j), j=1,npar)
      read(kif,*) dummyline
      read(kif,*) (pmin(j), j=1,npar)
      read(kif,*) dummyline
      read(kif,*) (pmax(j), j=1,npar)
C
c-------------------------------- Read NITER, IPRINT & EPS_Marquardt
C
      read(kif,*) dummyline
      read(kif,*) niter, iprint, epsmrq
C
C-------------------------------- Read Number of Runs to be Regressed
C
      read(kif,*) dummyline
      read(kif,*) nrun
      if (nrun .gt. nrunz) then
         write(kos,74) nrun
 74      format(///20x,'NRUNZ in PARAMETER must be at least',i3//)
         stop 111
      end if
      write(kos,75) nrun
      write(kof,75) nrun
      dfr = -npar
 75   format(//20x,'Input Data for',i3,' Runs',//)
C
C-------------------------------- Read Measurements for each Run
C
      do 90 irun=1,nrun
      read(kif,*) dummyline
      read(kif,*) nobs(irun), t0(irun), (x0(irun,i), i=1,nx)
      np=nobs(irun)
      dfr=dfr + ny*np
      write(kos,85) irun,np,t0(irun), (x0(irun,j), j=1,nx)
      write(kof,85) irun,np,t0(irun), (x0(irun,j), j=1,nx)
 85   format(//5x,'Run:',i2,4x,'NP=',i3,4x,
     & 'T0 =',g12.4,4x,'X(0) =',10g11.4/)
      read(kif,*) dummyline
      do 90 j=1,np
      read(kif,*) tp(irun,j), (ydata(irun,i,j), i=1,ny)
      write(kof,91) tp(irun,j), (ydata(irun,i,j), i=1,ny)
 91   format(5x,'Time=',g12.4,10x,'Y-data(i)=',10g11.4)
 90   continue
C
      do 92 j=1,npar
      p(j)=pprior(j)
 92   continue
      write(kof,93) (p(j), j=1,npar)
 93   format(///2x,'Pprior(j)=',8g12.5)
      write(kos,94) (p(j), j=1,npar)
 94   format(//2x,'Pprior(j)=',8g12.5)
      write(kos,96) (vprior(j), j=1,npar)
      write(kof,96) (vprior(j), j=1,npar)
 96   format(1x,'1/VARprior=',8g12.5//)
      write(kof,97) (pmin(j), j=1,npar)
      write(kos,97) (pmin(j), j=1,npar)
      write(kof,98) (pmax(j), j=1,npar)
      write(kos,98) (pmax(j), j=1,npar)
      close(kif)
 97   format(1x,'  Pmin(j) =',8g12.5//)
 98   format(1x,'  Pmax(j) =',8g12.5//)
C
c-------------------------------- Main iteration loop
C
      do 500 nloop=1,niter
C-------------------------------- Initialize matrix A, b and Objective function
      SSEprior=0.0
      do 104 i=1,npar
      SSEprior=SSEprior + vprior(i)*(pprior(i)-p(i))**2
      b(i)=vprior(i)*(pprior(i)-p(i))
      do 104 l=1,npar
      a(i,l)=0.0
 104  continue
      do 106 i=1,npar
      a(i,i)=vprior(i)
 106  continue
      SSEtotal=SSEprior
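C-------------------------------- Note on the objective function: the loops
C-------------------------------- below accumulate
C        SSEtotal = SUM qyj*( Ydata(irun,j,i) - X(j) )**2
C                 + SUM Vprior(k)*( Pprior(k) - P(k) )**2
C-------------------------------- (over all runs, points and outputs, and
C-------------------------------- over k=1,NPAR); the Bayesian prior thus
C-------------------------------- enters simply as a quadratic penalty on
C-------------------------------- each parameter.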
C
c-------------------------------- Go through each Run
      do 210 irun=1,nrun
      if (iprint .eq. 1) write(kof,108) irun
 108  format(/4x,'Current MODEL output vs. DATA for RUN#',i2,
     & /5x,'  Time    Y-data(i) & Y-model(i), i=1,NY')
C
c-------------------------------- Initialize ODEs for current Run
      do 112 j=1,nx
      x(j)=x0(irun,j)
 112  continue
      do 114 j=nxx,n
      x(j)=0.0
 114  continue
C
c-------------------------------- Initialize ODE solver: DIVPAG
      np=nobs(irun)
      call SETPAR(50,pivpag)
      index=1
      t=t0(irun)
      imode=1
C
c-------------------------------- Integrate ODEs for this Run
      do 200 i=1,np
      tend=tp(irun,i)
      call DIVPAG(index,n,model,jx,ag,t,tend,tol,pivpag,x)
      if (iprint .eq. 1) then
         write(kof,135) t, (ydata(irun,k,i),x(k),k=1,ny)
 135     format(1x,g11.4,3x,8g12.5)
      end if
C
c-------------------------------- Update Objective function and matrix A, b
c-------------------------------- Select weighting factor (GLS vs WLS)
      do 140 j=1,ny
      if (igls .eq. 1) then
         qyj= q(j)/(ydata(irun,j,i)**2 + epsmin)
      else
         qyj= q(j)
      end if
      SSEtotal=SSEtotal + qyj*(ydata(irun,j,i) - x(j))**2
      do 140 l=1,npar
      ll=l*nx
      b(l)=b(l) + qyj*x(j+ll)*(ydata(irun,j,i) - x(j))
      do 140 k=1,npar
      kk=k*nx
      a(l,k)=a(l,k) + qyj*x(j+ll)*x(j+kk)
 140  continue
c---------- Call DIVPAG again with INDEX=3 to release MEMORY from IMSL
      if (i .eq. np) then
         index=3
         call DIVPAG(index,n,model,jx,ag,t,tend,tol,pivpag,x)
      end if
 200  continue
 210  continue
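C-------------------------------- Note: element x(j+l*nx) integrated above is
C-------------------------------- the normalized sensitivity P(l)*dX(j)/dP(l),
C-------------------------------- so the 140 loops build the Gauss-Newton
C-------------------------------- normal equations A*dp = b in terms of
C-------------------------------- relative parameter changes dp.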
C
C-------------------------------- Keep current value of p(i)
      do 211 i=1,npar
      p0(i)=p(i)
 211  continue
c-------------------------------- Add Bayes influence on A & b
      do 212 i=1,npar
      b(i)=b(i) + vprior(i)*(pprior(i)-p0(i))
      a(i,i)=a(i,i) + vprior(i)
 212  continue
C
C-------------------------------- Print matrix A & b
      if (iprint .eq. 1) then
         write(kof,214)
 214     format(/1x,' Matrix A and vector b....')
         do 218 j1=1,npar
         write(kof,216) (a(j1,j2), j2=1,npar), b(j1)
 216     format(11g12.4)
 218     continue
      end if
C
C-------------------------------- Decompose matrix A (using DEVCSF from IMSL)
C
      call devcsf(npar,a,nparz,s,v,nparz)
C
C-------------------------------- Compute condition number & (V**T)*b
      conda=s(1)/s(npar)
      do 220 i=1,npar
      bv(i)=0.0
      do 220 j=1,npar
      bv(i)=bv(i) + v(j,i)*b(j)
 220  continue
 225  continue
C
C-------------------------------- Use Pseudo-inverse (if Cond(A) > 1/epspsi)
      ipsi=0
      do 230 k=1,npar
      if (s(k)/s(1) .lt. epspsi) then
         bvs(k)=0.0
         ipsi=ipsi + 1
      else
C-------------------------------- Include MARQUARDT'S modification
         bvs(k)=bv(k)/(s(k) + epsmrq)
      end if
 230  continue
C
      write(kos,235) nloop,epsmrq,epspsi,ipsi,conda
      write(kof,235) nloop,epsmrq,epspsi,ipsi,conda
 235  format(//1x,'ITERATION=',i3,/1x,'EPS_Marq.=',g10.4,4x,
     & 'EPS_PseudoInv.=',g10.4,4x,'No. PseudoInv. Apprx.=',i2,
     & /1x,'Cond(A)=',e12.4)
      write(kof,237) (s(j), j=1,npar)
 237  format(1x,'Eigenvalues=',10g11.4)
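C-------------------------------- Note: with A = V*diag(s)*V**T from DEVCSF,
C-------------------------------- the step assembled in bvs corresponds to
C        dp = SUM over retained k of  v(k)*(v(k)**T * b)/(s(k) + EPSMRQ)
C-------------------------------- eigendirections with s(k)/s(1) < EPSPSI are
C-------------------------------- dropped (pseudoinverse approximation) and
C-------------------------------- EPSMRQ shifts the eigenvalues (Marquardt's
C-------------------------------- modification).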
C
C-------------------------------- Compute dp = (V**T)*b
      do 240 i=1,npar
      dp(i)=0.0
      do 240 j=1,npar
      dp(i)=dp(i) + v(i,j)*bvs(j)
 240  continue
C
C-------------------------------- Compute new vector p and ||dp||
      dpnorm=0.0
      do 250 i=1,npar
      dpnorm=dpnorm + abs(dp(i))
      p(i)=p0(i)*(1. + dp(i))
 250  continue
      dpnorm = dpnorm/npar
      if (dpnorm .le. 10*eps*npar) then
         istep=0
      else
         istep=1
      end if
      write(kos,263) SSEtotal, SSEprior/SSEtotal
      write(kof,264) SSEtotal, SSEprior
      write(kos,265) (p(j), j=1,npar)
      write(kof,265) (p(j), j=1,npar)
 263  format(1x,'SSE_total =',g11.5,5x,'SSEprior/SSEtot =',f14.4,/)
 264  format(1x,'SSE_total =',g11.5,3x,'Prior_SSE =',g11.5,/)
 265  format(1x,'P(j) by G-N =',8g12.5)
C
C-------------------------------- Enforce HARD MIN & MAX parameter boundaries
C-------------------------------- Use bisection to enforce: Pmin(i)<P(i)<Pmax(i)
      if (ibound .eq. 1) then
         do 460 j=1,npar
 440     continue
         if (p(j).le.pmin(j) .or. p(j).ge.pmax(j)) then
            do 450 i=1,npar
            dp(i)=0.50*dp(i)
            p(i)=p0(i)*(1. + dp(i))
 450        continue
         end if
         if (p(j).le.pmin(j) .or. p(j).ge.pmax(j)) goto 440
 460     continue
         write(kos,465) (p(j), j=1,npar)
         write(kof,465) (p(j), j=1,npar)
 465     format(1x,'P(j) Bounded=',8g12.5)
      end if
C""""""""""""""" Test for convergence
if (dpnorm .It. eps ) then
If (ipsi .eq. 0) then
if (epsmrq .gt. s (npar) t0.01) then
wrlte (kof I 347) epsmrq
write (kos I 347) epsmrq
epsmrq=epsmrq*0.01
goto 500
else
if (epsmrq .gt. 0 .O) then
write ( ko s , 347) epsmrq
write (kos, 349)
wrlte (kof I 347) epsmrq
write (kof, 349)
C " " " " " " " " " " " " " " " Check if Marquardt's EPS is nonzero
347 format(//lx,'>>>>>> Converged wath EPS-Marquardt =',g14.5)
Appendix 2
426
34 9
C
285
&
286
C
format(//lx,'>>>>>> From now on EPS-Marquardt =0.0')
epsmrp0.0
goto 500
end if
end if
sigma=sqrt ( (SSEtotal-SSEprlor) /dfr)
write(kos,285) slgma,dfr
write(kof ,285) sigma,dfr
format(///5xI'++++++ CONVERGED ++++++',/5x,'LS-Slgma=',gll.5,
/5x,'Degrees of Freedom=',f4.0,//)
write (kos , 286) (p (j) , j=l ,npar)
wrlte(kof,286) (p(j) ,3=l,npar)
format(5x, 'Best P(J) =',8g12.5)
c " " " " " " " " " " " " " " " Go and get Standard Devlatlons & Slgma
goto 600
else
C
c " " " " " " " " " " " " " " " Include one more singular values
epspsi=epspsl*l.e-1
goto 225
end lf
end If
c " " " " " " " " " " " " " " " STEP-SIZE COMPUTATIONS:
c " " " " " " " " " " " " " " " Use full step if it is not needed or NSTEP=O
If (istep .eq. 0 .or. nstep .eq. 0) then
end if
do 400 kks=l,nstep
SSEprior=O.O
do 301 i=l,npar
goto 500
c " " " " " " " " " " " " " " " Compute step-slze by the blsectlon rule
C " " " " " " " " " " " " " " " Initialize Objective functlon, ODES and DIVPAG
SSEprior=SSEprior + vprior (i) * (pprlor (i) -p (i) ) **2
301 continue
SSEnew=SSEprior
do 390 irun=l,nrun
do 300 i=l ,nx
x(i)=xO(lrun,i)
300 continue
np=nobs (lrun)
call SETPAR (50 ,pivpag)
index=l
t=tO (irun)
Imode=O
do 340 ii=l ,np
call DIVPAG(~ndex,nx,model,jx,ag~t,tend,tol,pivpag,x)
tend=tp(irun,il)
C
C " " " " " " " " " " " " " " " Compute NEW value of the Ob~ectlve functlon
do 320 ~=l,ny
C
C " " " " " " " " " " " " " " " Select welghtlng factor (GLS vs WLS)
if (191s .eq. 1) then
else
end I
SSEnew=SSEnew + q y ~ * (ydata (irun, 3,11) - x ( J ) ) **2
q y ~ = q(~)/(ydata(lrun, j,il)**2 + epsmln)
w3= q(1)
320 continue
C
C " " " " " " " " " " " " If LS Ob]. functlon 1 s not improved, half step-slze
if (SSEnew .ge. SSEtotal) then
Appendix 2 427
do 330 i=l,npar
dp (i) =O .50*dp (i)
p(i)=PO(x) * (1. + dp(i))
330 continue
C " " " " " " " " " " " " " " " Release MEMORY from ISML (call with index=3)
index=3
call DIVPAG(index,nx,model,jx,ag,t,tend,tollpivpag,x)
end xf
If (ii.eq.np) then
goto 400
index=3
call DIVPAG(index,nx,model,jx,ag,t,tend,tol,pag,x)
end if
340 continue
390 continue
write(kof,394) (P(J) , j=l,npar)
write(kos,394) ( p( j ) ,J=l,npar)
394 fomat(lx,'P(j) stepped=',8gl2.5/)
kkk=kks - 1
write (kof ,
395) kkk , SSEtotal I SSEnew
wrlte (kos ,
395) kkk, SSEtotal , SSEnew
395 format(/2x,'Stepslze= (0.5)**',i2,4x,'SSE~total~old=~,gll.5,3xl
& 'SSE-total new=',g11.5/)
goto 500-
400 continue
496 fomat(/lxI'P(j)laststep=',8gl2.5)
500 continue
write(kof ,496) (P(J J ) , jJ=l,npar)
C
C-------------------------------- Alert user that it did not converge...
      sigma=sqrt((SSEtotal-SSEprior)/dfr)
      write(kof,585) sigma,dfr
 585  format(///5x,'***** Did NOT converge yet *****',/5x,
     & 'LS_Sigma=',g11.5,/5x,'Degrees of freedom=',f4.0)
      write(kof,586) (p(j), j=1,npar)
 586  format(/5x,'Last P(j) =',8g12.5)
C
C------------ Compute Sigma & Stand. Deviations of the Parameters
C------------ CALCULATIONS ARE BASED ON THE DATA ONLY (Prior info ignored)
 600  continue
      do 670 i=1,npar
      stdev(i)=0.0
      do 670 j=1,npar
      stdev(i)=stdev(i) + v(i,j)**2/s(j)
 670  continue
      do 680 i=1,npar
      stdev(i)=sqrt(stdev(i))*sigma
 680  continue
      write(kos,686) (stdev(j)*100, j=1,npar)
      write(kof,686) (stdev(j)*100, j=1,npar)
 686  format(/5x,'St.Dev. (%)=',8g12.5)
C
C-------------------------------- Alert user whether GLS or WLS was used...
      if (igls .eq. 1) then
         write(kof,688)
 688     format(//1x,50('-'),/5x,
     &   'GENERALIZED LEAST SQUARES Formulation Used',/1x,50('-'),/)
      else
         write(kof,689)
 689     format(//1x,50('-'),/5x,'WEIGHTED LEAST SQUARES Formulation Used'
     &   ,/1x,50('-'),/)
      end if
      write(kof,890)
 890  format(/1x,'----------------- PROGRAM END -----------------',/)
      write(kos,891) filout
 891  format(//1x,60('-'),/5x,'Program OUTPUT stored in file: ',
     & a24,/1x,60('-'),/)
      close(kof,status='keep')
      stop
      end
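In compact notation, each iteration of the program above computes the Gauss-Newton step from the normal equations and then updates the parameters multiplicatively,

\[
\mathbf{A}\,\Delta\mathbf{p} = \mathbf{b},
\qquad
p_i^{(k+1)} = p_i^{(k)}\bigl(1 + \mu\,\Delta p_i\bigr),
\qquad
\mu = (0.5)^m ,
\]

where the bisection exponent m is increased (up to NSTEP times) until SSEtotal decreases, and A and b are built from the weighted, p-normalized output sensitivities plus the optional Bayesian prior terms.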
C
c----------------------------------------------------------------------
C                               MODEL
C     Ordinary Differential Equation (ODE) Model of the form dX/dt=f(X,P)
C     where X(i), i=1,NX   is the vector of state variables
C           P(i), i=1,NPAR is the vector of unknown parameters
C
c----------------------------------------------------------------------
C
C     The User must specify the ODEs to be solved
C           dX(1)/dt = f1(X(1),...X(NX); P(1),...P(NPAR))
C           dX(2)/dt = f2(X(1),...X(NX); P(1),...P(NPAR))
C           dX(3)/dt = f3(X(1),...X(NX); P(1),...P(NPAR)), ... etc.
C
C     and the Jacobean matrices: dfdx(i,j)=[df(i)/dX(j)]  i=1,NX & j=1,NX
C                                dfdp(i,j)=[df(i)/dP(j)]  i=1,NX & j=1,NPAR
C
c----------------------------------------------------------------------
      SUBROUTINE model(n,t,x,dx)
      DOUBLE PRECISION x(n),dx(n),fx(2,2),fp(2,2),p(2)
      DOUBLE PRECISION t,r1,r2,dr1dx1,dr1dx2,dr2dx1,dr2dx2
      COMMON /gn/imode,nx,npar,p
C
C-------------------------------- Model Equations (dx/dt)
      r1=(x(1)**2 - x(2)*(2-2*x(1)-x(2))/0.726)
      r2=(x(1)*x(2) - (1-x(1)-2*x(2))*(2-2*x(1)-x(2))/3.852)
      dx(1)=-p(1)*r1 - p(2)*r2
      dx(2)=0.5*p(1)*r1 - p(2)*r2
C
C-------------------------------- Jacobean (df/dx)
      if (imode .eq. 0) return
      dr1dx1=2*x(1) + 2*x(2)/0.726
      dr1dx2=-(2-2*x(1)-x(2))/0.726 + x(2)/0.726
      dr2dx1=x(2) + (2-2*x(1)-x(2))/3.852 + 2*(1-x(1)-2*x(2))/3.852
      dr2dx2=x(1) + 2*(2-2*x(1)-x(2))/3.852 + (1-x(1)-2*x(2))/3.852
      fx(1,1)=-p(1)*dr1dx1 - p(2)*dr2dx1
      fx(1,2)=-p(1)*dr1dx2 - p(2)*dr2dx2
      fx(2,1)=0.5*p(1)*dr1dx1 - p(2)*dr2dx1
      fx(2,2)=0.5*p(1)*dr1dx2 - p(2)*dr2dx2
C
C-------------------------------- Jacobean (df/dp)
      fp(1,1)=-r1
      fp(1,2)=-r2
      fp(2,1)=0.5*r1
      fp(2,2)=-r2
C
C-------------------------------- Set up the Sensitivity Equations
      do 10 k=1,npar
      kk=k*nx
      do 10 j=1,nx
      dx(j+kk)=fp(j,k)*p(k)
      do 10 l=1,nx
      dx(j+kk)=dx(j+kk) + fx(j,l)*x(l+kk)
 10   continue
      return
      end
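The do 10 loop at the end of MODEL appends to the two state equations the sensitivity equations obtained by differentiating dx/dt = f(x,p) with respect to each parameter:

\[
\frac{d}{dt}\!\left(\frac{\partial \mathbf{x}}{\partial p_k}\right)
= \frac{\partial \mathbf{f}}{\partial \mathbf{x}}\,
  \frac{\partial \mathbf{x}}{\partial p_k}
+ \frac{\partial \mathbf{f}}{\partial p_k},
\qquad k = 1,\dots,NPAR.
\]

The integrated quantities are stored in normalized form, x(j+k*nx) = p(k)*dX(j)/dP(k), which is why fp(j,k) is multiplied by p(k) before being added to dx(j+kk).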
C
c---------------------------------------------------------------------- JX
C     JACOBEAN of the
C     Ordinary Differential Equation (ODE) Model
C     required by the ODE solver (version for stiff systems)
C
c----------------------------------------------------------------------
C
C     The User must specify the Jacobean matrix:
C           dfdx(i,j)=[df(i)/dX(j)]  i=1,NX & j=1,NX
C
c----------------------------------------------------------------------
      SUBROUTINE jx(n,t,x,fx)
      DOUBLE PRECISION x(n),fx(n,n),p(2)
      DOUBLE PRECISION t,r1,r2,dr1dx1,dr1dx2,dr2dx1,dr2dx2
      common /gn/imode,nx,npar,p
C
C-------------------------------- Initialize Jacobian matrix Fx
      do 20 i=1,n
      do 20 j=1,n
      fx(i,j)=0.0
 20   continue
C
C-------------------------------- Jacobian matrix Fx
      r1=(x(1)**2 - x(2)*(2-2*x(1)-x(2))/0.726)
      r2=(x(1)*x(2) - (1-x(1)-2*x(2))*(2-2*x(1)-x(2))/3.852)
      dr1dx1=2*x(1) + 2*x(2)/0.726
      dr1dx2=-(2-2*x(1)-x(2))/0.726 + x(2)/0.726
      dr2dx1=x(2) + (2-2*x(1)-x(2))/3.852 + 2*(1-x(1)-2*x(2))/3.852
      dr2dx2=x(1) + 2*(2-2*x(1)-x(2))/3.852 + (1-x(1)-2*x(2))/3.852
      fx(1,1)=-p(1)*dr1dx1 - p(2)*dr2dx1
      fx(1,2)=-p(1)*dr1dx2 - p(2)*dr2dx2
      fx(2,1)=0.5*p(1)*dr1dx1 - p(2)*dr2dx1
      fx(2,2)=0.5*p(1)*dr1dx2 - p(2)*dr2dx2
C
C-------------------------------- Set up the expanded Jacobian
      if (imode .eq. 0) return
      ll=nx
      do 60 kk=1,npar
      do 50 j=1,nx
      jj=ll+j
      do 50 i=1,nx
      ii=ll+i
      fx(ii,jj)=fx(i,j)
 50   continue
      ll=ll+nx
 60   continue
      return
      end
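Since each block of sensitivity equations is linear in the sensitivities with coefficient matrix (df/dx), subroutine JX hands DIVPAG a block-diagonal Jacobian for the expanded (1+NPAR)*NX system,

\[
\frac{\partial \dot{\mathbf{X}}}{\partial \mathbf{X}}
\approx
\mathrm{diag}\!\left(
\frac{\partial \mathbf{f}}{\partial \mathbf{x}},\,
\frac{\partial \mathbf{f}}{\partial \mathbf{x}},\dots,
\frac{\partial \mathbf{f}}{\partial \mathbf{x}}
\right),
\]

which the do 60 loop constructs by copying fx into each diagonal block. The (generally nonzero) coupling of the sensitivities back to the states is neglected here; this only affects the convergence rate of the implicit integrator, not the accuracy of the solution.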
C
c---------------------------------------------------------------------- SETPAR
C
      subroutine SETPAR(nn,param)
C
C-------------------------------- Initialize vector PARAM for DIVPAG
C
      double precision param(nn)
      do 10 i=1,nn
      param(i)=0
 10   continue
      param(1)=1.0d-8
      param(3)=1000.
      param(4)=500
      param(5)=50000
      param(6)=5
      param(9)=1
      param(11)=1
      param(12)=2
      param(13)=1
      param(18)=1.0d-15
      param(20)=1
      return
      end
INPUT FILE
Enter Prior Values for the Parameters (=Initial Guess) (Pprior(i))...
10000. 10000. 355.55 402.91
Enter Prior Values for the INVERSE of each Parameter VARIANCE (Vprior(i))...
0.1d-18 0.1d-18
Enter MIN values for the Parameters (Pmin(i))...
0.0 0.0
Enter MAX values for the Parameters (Pmax(i))...
1.d+8 1.d+8
Enter NITER (Max No. Iterations), IPRINT (=1 to print Output vector), EPSMRQ
40 0 0.0
Enter Number of Runs in this dataset (NRUN)...
1
Enter for this Run: No. of Datapoints (NP), Init. Time (t0), Init. State (X0(i))
8 0.0 1.0 0.0
Enter NP rows of Measurements for this Run: Time (TP) and OUTPUT (ym(i))...
0.000563 0.828 0.0737
0.001132 0.704 0.113
0.001697 0.622 0.1322
0.002262 0.565 0.1400
0.00340 0.499 0.1468
0.00397 0.482 0.1477
0.00452 0.470 0.1477
0.01697 0.443 0.1476
COMMENTS regarding Datafile (not read by the program):
If one wishes to use NO PRIOR INFORMATION on a particular parameter,
simply use Vprior = 0 (i.e., variance is infinity - no prior info).
Index

A
activity coefficient 268
adequacy of model 3, 182
algebraic model 7, 49, 285, 323
annealing algorithm 79
apparent rate 120
asymptotic behavior 135
asymptotic rate of convergence 69
autocorrelation, test for 156

B
Bartlett's test 192
Bayesian estimation 88, 146
benzene 98, 129, 303
BFGS formula 77
bicyclo[2.1.1]hexane 58, 287
binary VLE data 6, 231
binary critical point data 261
biological oxygen demand 56, 323
bisection rule 52, 91, 165
bitumen oxidation 359
Bourgoyne-Young 355
bubble-free bioreactor 342

C
chemical kinetics 3, 55, 58, 285
chi-squared test 192
Cholesky factorization 72
computational efficiency 69
conditionally linear system 9, 138
condition number 72, 141, 189
confidence interval 33, 178
confidence region 33, 178
conjugate gradient method 76
   Fletcher-Reeves 77
   Polak-Ribiere 77
constrained estimation 158
constraints
   equality 158
   inequality 162
convergence 51
   criteria 52
   quadratic 55
correlation matrix 178
covariance matrix 32, 177, 257
critical point
cubic splines 117, 130

D
degrees of freedom 32, 178, 257
dependent variable 8
derivative approach 116, 333
derivative free methods 78
design of experiments 3, 185
determinant criterion 19
DFP formula 77
dialyzed chemostat 336
differential equation model 11, 84
diphenyl 98, 303
direct search methods 78, 139, 155
divergence criterion 192, 200
drilling rate 354
dynamic systems 11, 13, 156

E
Eadie-Hofstee plot 138, 326
efficiency 69
eigenvalue 75, 91, 142, 189
   decomposition 75, 144, 241
enzyme kinetics 60, 324, 341
equation of state 5, 226
error propagation law 235
error, random 1
errors in variables model 21, 233
estimation
   explicit 14
   implicit 19
   linear 2, 23
   maximum likelihood 15
   nonlinear 2
   shortcut 5, 115
experimental design
   factorial 3, 185
   preliminary 185
   sequential 3, 187
experiments
   batch 121
   chemostat 122, 213, 331
   continuous 122
   fed-batch 121, 207
   perfusion 122, 128

F
factorial design 3, 185
F-distribution 33, 178
F-test 183
Fletcher-Powell function 82
Fletcher-Reeves 77

G
GAMS 120
gas hydrates 315
Gauss-Newton 49, 55, 169
generalized least squares 27
Gill-Murray method 72
gradient methods 67
gradient vector 68

H
Hanes plot 138, 327
Hessian matrix 71, 74
3-hexane 55, 285
history matching 5, 372
3-hydroxypropanal 102, 307, 321
hypothesis, null 182
hypothesis testing 182

I
identification problem 2
ill-conditioning 141
implicit estimation 10, 19
IMSL library 117, 130
independent variables 8
information index 152, 205
initial
   conditions 93
   guess 85, 135
integral approach 118, 326
interaction parameters 6, 228

J
Jacobean 70, 73, 90
joint
   confidence region 33, 178
   likelihood region 179

K
Kuhn-Tucker 165

L
Lagrange multiplier 159
least squares
   constrained 158, 161, 236
   explicit 14, 233
   generalized 15, 27
   implicit 19, 236
   linear 26, 27
   nonlinear 2
   recursive 219
   recursive extended 221
   recursive generalized 223
   simple 15
   simple linear 26
   simplified constrained 237
   unconstrained 236, 250
   unweighted 15
   weighted 15, 26
   weighting matrix 147
linear
   least squares 26, 27
   regression 23, 29
linearization 50, 85, 159, 169
Lineweaver-Burk plot 137, 325
LJ optimization 79
log likelihood function 16

M
marginal interval 33, 178
Marquardt-Levenberg method 144
Marquardt's modification 144
matrix
   covariance 32, 177, 257
   ill-conditioning 141
   sensitivity 51, 86, 94
maximum likelihood 15, 232
Michaelis-Menten 60, 137, 324
Microsoft Excel(TM) 35
model 1
   algebraic 7, 49, 285, 323
   autoregressive 156
   biochemical engineering 323
   chemical reaction kinetic 285
   discrimination 3, 191
   equivalent run 376
   linear regression 23
   Monod 120
   multiple linear 24, 354
   multiresponse linear 25
   ODE 11, 84, 302, 345
   PDE 13, 167
   simple linear 24
model adequacy 3, 182
modified Newton method 76
monoclonal antibody 331
multiple linear regression 24, 35
multiresponse linear regression 25

N
Nelder-Mead algorithm 82
Newton's method 71
nitric oxide 61, 288
nonlinear regression 2
null hypothesis 182

O
objective function 3, 13
ODE 11, 84, 302, 345
offshore well data 354
oil sands 359
optimal step-size policy 140, 150
osmotic coefficients 269
outliers 133
overstepping 139

P
parameter
   conditionally linear 9, 138
   confidence interval 33, 177
   estimation 2
PDE 13, 167
penalty function 163, 384
Peng-Robinson 5
penicillin fermentation
Pitzer's model 268
Polak-Ribiere method 77
polynomial fitting 29
pooled variance 194
Powell function 82
prior information 146, 383
1,3-propanediol 102, 307, 321
pseudoinverse 143
pulsar aerator 328

Q
quadratic convergence 55
quasilinearization method 111
quasi-Newton methods 77

R
radial coning problem 374
random error 1
rate constant 4
recursive
   extended least squares 221
   generalized least squares 223
   least squares 219
region of convergence 150, 306
regression
   linear 11, 223
   nonlinear 2
reliability 377, 386, 390
reparameterization 162
reservoir
   characterization 381, 385
   engineering 5, 372
   simulation 372
residuals 13
response variable 8, 179
risk 389
robustness 69
Rosenbrock function 77

S
sampling rate 198
scaling 71, 145
secant methods 77
sensitivity matrix 51, 86, 145, 173
sequential design 3, 187, 198
shape criterion 189
shortcut methods 5, 115
SigmaPlot(TM) 42
simplex method 81
simulated annealing 78
smoothing 117, 130
splines 117, 130
stability
   criterion 237
   function 237, 250
standard
   deviation 179
   error 179
state variables 8, 11, 50, 85
stationary criterion 28, 68, 87
statistical inferences 32, 177
steepest descent method 69
stiff ODE models 148, 307

T
t-distribution 33, 178, 388
time interval 197
transformably linear model 136
Trebble-Bishnoi 228
triphenyl 98, 303

U
unconstrained least squares 250

V
variable metric methods 77
variance 32, 177, 328
volume criterion 188

W
weighted least squares 15
weighting matrix 14, 147
well productivity index 387
well scaled 71
white noise 156