
Journal of Applied Mathematics and Computational Mechanics 2017, 16(2), 29-41
www.amcm.pcz.pl
DOI: 10.17512/jamcm.2017.2.03
p-ISSN 2299-9965, e-ISSN 2353-0588

FIXED-POINT ITERATION BASED ALGORITHM FOR A CLASS OF NONLINEAR PROGRAMMING PROBLEMS

Ashok D. Belegundu
Department of Mechanical and Nuclear Engineering
The Pennsylvania State University, University Park, PA 16802 USA
[email protected]

Received: 9 February 2017; accepted: 22 May 2017

Abstract. A fixed-point algorithm is presented for a class of singly constrained nonlinear programming (NLP) problems with bounds. Setting the gradient of the Lagrangian equal to zero yields a set of optimality conditions; however, a direct solution on general problems may yield non-KKT points. Under the assumptions that the gradient of the objective function is negative, the gradient of the constraint function is positive, and the variables are positive, it is shown that the fixed-point iterations can converge to a KKT point. An active set strategy is used to handle lower and upper bounds. While fixed-point iteration algorithms can be found in the structural optimization literature, these are presented without clearly stating the assumptions under which convergence may be achieved, and they are problem specific rather than working with general functions f, g. Here, the algorithm targets general functions which satisfy the stated assumptions. Further, within this general context, the fixed-point variable update formula is given physical significance. Unlike NLP descent methods, no line search, which involves many function calls or simulations, is needed to determine a step size. Thus, the resulting algorithm is vastly superior for the subclass of problems considered. Moreover, the number of function evaluations remains independent of the number of variables, allowing the efficient solution of problems with a large number of variables. Applications and numerical examples are presented.

MSC 2010: 47N10, 49M05, 90C30, 90C06, 74P05


Keywords: fixed-point iteration, resource allocation, nonlinear programming, optimality criteria

1. Introduction

A fixed-point algorithm is presented for a class of singly constrained NLP problems with bounds. Setting the gradient of the Lagrangian equal to zero yields a set of optimality conditions. However, a direct solution may yield non-KKT points. Under the assumption that the gradient of the objective function is negative while the gradient of the constraint function is positive, and that the variables are positive, it is shown that the fixed-point iteration can be made to converge to a KKT point. Opposite signs on the derivatives, viz. a positive objective function gradient with a negative constraint function gradient, can be handled via inverse variables. An active set strategy is used to handle lower and upper bounds. While fixed-point iteration algorithms can be found in the structural optimization literature [1-8], these are presented without clearly stating the assumptions under which convergence may be achieved, and in fact have been observed to not converge in certain situations without an explanation [5]. On the other hand, these algorithms have built-in bells and whistles relating to step size control and Lagrange multiplier updates that render them efficient on a wider variety of problems than targeted herein. Here, the algorithm targets general functions f and g within a subspace of functions, as opposed to problem-specific applications in structural optimization. Further, within this general context, the fixed-point variable update formula is given physical significance.

For the subclass of problems considered, the algorithm presented is vastly superior to descent-based NLP methods such as sequential quadratic programming, generalized reduced gradient or feasible directions. In the latter category of methods, a line search needs to be performed along a search direction, which is computationally expensive as it involves multiple simulations in order to evaluate the functions. Further, in fixed-point iterative methods, the number of function evaluations involved in reaching the optimum is very weakly dependent on the number of variables [7], allowing an efficient solution of problems with a large number of variables. Fixed-point iterations have been applied extensively in the area of equation solving and in market equilibrium in economics, while hardly at all in nonlinear programming. The algorithm may be applied to a subclass of problems in resource allocation and inventory models, in addition to allocation of material in structures [9, 10].

2. Fixed-point iteration recurrence formulas

Fixed-point iterations have been used widely in equation solving. Two prevalent recurrence formulas will be illustrated in finding the root of a one-variable equation, x = g(x). This will pave the way for the discussion of fixed-point iterations for solving optimization problems in the next section. Convergence is established by the following theorem [11, 12]:

Theorem 2.1. Assume α is a solution of x = g(x) and suppose that g(x) is continuously differentiable in some neighboring interval about α with $|g'(\alpha)| < 1$, where $g' \equiv dg/dx$. Then, provided the starting point $x_0$ is chosen sufficiently close to α, the iteration $x_{k+1} = g(x_k)$, $k \ge 0$, converges with $\lim_{k \to \infty} x_k = \alpha$.

When $x_0$ is sufficiently close, there exists an interval I containing α where g(I) ⊂ I, i.e. g is a contraction on the interval. For systems of equations with n variables, the above holds true with α and x being vectors and with the condition that all eigenvalues of the Jacobian $G(\alpha) = \left[ \partial g_j / \partial x_i (\alpha) \right]$ are less than one in magnitude [12].
Two recurrence formulas are now discussed with the help of an example. Consider the solution of x = 10 cos x. Rather than generating a sequence as $x_{k+1} = 10\cos(x_k)$, which is unstable, we use one of two techniques. Treating $10\cos(x_k)$ as the 'new guess' and $x_k$ as the current value, we write

$$x_{k+1} = w\,g(x_k) + (1-w)\,x_k = 10\,w\cos(x_k) + (1-w)\,x_k \qquad (1)$$

where w = 1/2 represents an average of the two. The recurrence relation in Eq. (1) converges to the root 1.4276 provided w ≤ 1/11 for a sufficiently close starting guess, using the theorem above. Generally, w ∈ (0,1) and is reduced if the iterations oscillate. If convergence is stable but slow, then w may be increased. w = 0.25 has been taken as the default value for the examples in this paper.
Alternatively, we may multiply both sides of x = 10 cos x by $x^{p-1}$ and take the p-th root, to obtain the recurrence relation

$$x_{k+1} = \left( (x_k)^{p-1}\, g(x_k) \right)^{1/p} = \left( (x_k)^{p-1}\, 10\cos(x_k) \right)^{1/p} \qquad (2)$$

which, for this example, converges for p ≥ 6 provided the starting guess is close enough. The formulas in Eqs. (1) or (2) will be used for optimization problems as discussed subsequently.
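As an illustration, both recurrences can be coded in a few lines. The sketch below is ours, not from the paper (the function names `averaged` and `rooted` and the iteration count are illustrative); it reproduces the behaviour described above for x = 10 cos x with a starting guess of 1.0.

```python
import math

def averaged(g, x0, w, iters=200):
    """Damped recurrence of Eq. (1): x_{k+1} = w*g(x_k) + (1-w)*x_k."""
    x = x0
    for _ in range(iters):
        x = w * g(x) + (1.0 - w) * x
    return x

def rooted(g, x0, p, iters=200):
    """Recurrence of Eq. (2): x_{k+1} = (x_k^(p-1) * g(x_k))^(1/p).
    Requires g(x_k) > 0 along the iteration (true here from x0 = 1.0)."""
    x = x0
    for _ in range(iters):
        x = (x ** (p - 1) * g(x)) ** (1.0 / p)
    return x

g = lambda x: 10.0 * math.cos(x)
print(averaged(g, x0=1.0, w=1.0 / 11.0))  # -> 1.4276..., converges for w <= 1/11
print(rooted(g, x0=1.0, p=6))             # -> 1.4276..., converges for p >= 6
```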

3. Assumptions and algorithm

The assumptions made, algorithm development, procedural steps, extensions and the physical interpretation of the fixed-point scheme are discussed in this section.

3.1. Assumptions

The subclass of NLP problems considered here is of the form

minimize $f(x)$
subject to $g(x) \le 0$ (single constraint) and $x \ge x^L > 0$ (3)

where $x^L$ are lower bounds. Descent-based methods like sequential quadratic programming and others referred to above update the variables as

$$x_{k+1} = x_k + \beta_k\, d_k \qquad (4)$$

where $\beta_k$ is a step size chosen to ensure reduction of a descent or merit function, $d_k$ is a direction vector and k is an iteration index. In fixed-point methods, on the other hand, iterations are based on the form

$$x_{k+1} = \varphi(x_k) \qquad (5)$$


where $x \in \mathbb{R}^n_+$. Three assumptions are made:

Assumption 3.1. All functions are $C^1$ continuous, i.e. continuously differentiable.

Assumption 3.2. $\partial f(x)/\partial x_i \le 0$ and $\partial g(x)/\partial x_i > 0$ for $i = 1, \dots, n$.

Assumption 3.3. $x \in \mathbb{R}^n_+$, that is, the variables are real and positive.

Generally speaking, Assumption 3.2 restricts the class of problems considered to those where increasing the available resource reduces the objective function value. For instance, more material reduces deflection (but not necessarily stress), or greater effort increases the probability of finding the treasure or target. Problems where $\partial f(x)/\partial x_r \ge 0$ and $\partial g(x)/\partial x_r < 0$ for some r can be handled by working with the reciprocal variable $y_r = 1/x_r$.

3.2. Development of the fixed-point algorithm

The optimality conditions reduce to minimizing the Lagrangian $L = f + \mu g$ subject to the lower bounds on $x_j$. For $x_j > x_j^L$ we solve the optimality condition

$$\frac{\partial f}{\partial x_j} + \mu \frac{\partial g}{\partial x_j} = 0$$

The main issue in obtaining a minimum point is ensuring µ ≥ 0. Here, our assumptions on the signs of the derivatives (i.e. opposite monotonicity of the functions) guarantee, from the optimality condition above, that µ ≥ 0. With lower bounds, the same result is true since $\partial f/\partial x_j + \mu\, \partial g/\partial x_j \ge 0$ at points where $x_j = x_j^L$, which again requires µ ≥ 0. Further, it readily follows that the single constraint is active (or essential) at the optimum, since increasing a variable $x_j$ reduces the objective while increasing the constraint value until it reaches its limit. Thus, with the stated assumptions, the solution of (3) using optimality conditions yields a KKT point.
The fixed-point iteration is derived as follows. The constraint g(x) ≤ 0 is linearized as $c^T x \le c_0$, where

$$c_j = \frac{\partial g}{\partial x_j} > 0, \qquad c_0 = -g(x^k) + \sum_{i=1}^{n} \frac{\partial g}{\partial x_i}\, x_i^k$$

with the derivatives evaluated at the current point $x^k$. Working with the linearized constraint, the local optimization problem may be stated as $\{\min f(x) : c^T x \le c_0,\ x \ge x^L\}$. We define a Lagrangian function $L = f + \mu\,(c^T x - c_0)$ and an active set $I = \{\, i : x_i = x_i^L,\ 1 \le i \le n \,\}$. The optimal point x is obtainable from $x \in \arg\min_{x \ge x^L} L(x, \mu)$. The optimality conditions can be stated as

$$j \in I: \quad x_j = x_j^L, \quad \frac{\partial L}{\partial x_j} \ge 0$$
$$j \notin I: \quad x_j > x_j^L, \quad \frac{\partial L}{\partial x_j} = \frac{\partial f}{\partial x_j} + \mu\, c_j = 0 \qquad (6)$$

From Eq. (6) we have

$$j \in I: \quad x_j = x_j^L$$
$$j \notin I: \quad c_j\, x_j^{k+1} = \max\!\left( \frac{x_j \left( -\partial f/\partial x_j \right)}{\mu},\ c_j\, x_j^L \right) \qquad (7)$$

Substituting Eq. (7) into $c^T x = c_0$ gives an expression for µ which, when substituted back into Eq. (7), gives

$$j \in I: \quad x_j = x_j^L$$
$$j \notin I: \quad c_j\, x_j^{k+1} = \max\!\left( E_j \left( c_0 - \sum_{r \in I} c_r x_r^L \right),\ c_j\, x_j^L \right) \qquad (8)$$

where

$$E_j = \frac{x_j \left( -\partial f/\partial x_j \right)}{\sum_{i=1,\, i \notin I}^{n} x_i \left( -\partial f/\partial x_i \right)} \qquad (9)$$

A starting point $x^0 > x^L$ is used, which may be feasible or infeasible with respect to the constraint. We use the fixed-point recurrence relation or 're-sizing' formula

$$j \in I: \quad x_j = x_j^L$$
$$j \notin I: \quad c_j\, y = \max\!\left( E_j \left( c_0 - \sum_{r \in I} c_r x_r^L \right),\ c_j\, x_j^L \right) \qquad (10)$$

As per Eq. (1), we use for $j \notin I$

$$x_j^{k+1} = w\, y + (1 - w)\, x_j^k, \qquad w \in (0,1) \qquad (11a)$$

or, as per Eq. (2),

$$x_j^{k+1} = \left( (x_j^k)^{p-1}\, y \right)^{1/p} \qquad (11b)$$
Basic steps are given in the algorithm below.

3.3. Algorithm steps

Input data: $x^0$, $x^L$, w (or p), $g_{\max}$, move limit $\Delta_{\max}$, iteration limit, $tol_{abs}$, $tol_{rel}$.
Set the initial active set $I^0 = \emptyset$.
Typical iteration:
i. Evaluate the derivatives $\partial f/\partial x_j$ and $\partial g/\partial x_j$, and then evaluate $c$, $c_0$ and $E_j$.
ii. Compute $x^{trial} \equiv x^{k+1}$ from Eq. (11a) or (11b).
iii. Reset $x^{k+1}$ to satisfy the move limits $|x_j^{trial} - x_j^k| \le \Delta_{\max}\, x_j^k$ as well as the bounds.
iv. Based on $x^{k+1}$, update the active set from $I^k$ to $I^{k+1}$. A free variable that has reached a bound is included in the active set based on whether $1 - x_j^{k+1}/x_j^L \ge -g_{\max}$ for each $j \notin I^k$. Variables in $I^k$ are also tested to determine whether they should become free.
v. The stopping criteria are based on (i) the iteration limit (= number of function calls) and (ii) whether

$$|f^k - f^{k-1}| \le tol_{rel}\, |f^{k-1}| + tol_{abs} \quad \text{and} \quad g(x^k) \le g_{\max}$$

which should hold for, say, 5 consecutive iterations. A sketch of these steps in code is given below.
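The following is a minimal sketch of these steps in Python. It is our code, not the author's: the name `fixed_point_solve`, the default tolerances, and the simplified active-set test (the bound test of step iv is simply re-applied to every variable each iteration, which also frees variables) are assumptions. It implements Eqs. (9)-(11a) under Assumptions 3.1-3.3.

```python
import numpy as np

def fixed_point_solve(f, grad_f, g, grad_g, x0, xL, w=0.25, gmax=1e-3,
                      dmax=0.5, max_iter=100, tolabs=1e-8, tolrel=1e-6):
    """Sketch of the Section 3.3 algorithm: min f(x) s.t. g(x) <= 0, x >= xL.
    Assumes grad_f <= 0, grad_g > 0 and x > 0 (Assumptions 3.1-3.3)."""
    x = np.array(x0, float)
    xL = np.array(xL, float)
    active = np.zeros(x.size, dtype=bool)      # active set I, initially empty
    f_prev, hits = np.inf, 0
    for k in range(max_iter):
        df, c = np.asarray(grad_f(x)), np.asarray(grad_g(x))
        c0 = -g(x) + c @ x                     # linearization: c^T x <= c0
        free = ~active
        E = x[free] * (-df[free])
        E = E / E.sum()                        # fractions E_j of Eq. (9)
        avail = c0 - c[active] @ xL[active]    # resource left for free variables
        y = np.maximum(E * avail, c[free] * xL[free]) / c[free]   # Eq. (10)
        x_new = np.where(active, xL, x)
        x_new[free] = w * y + (1.0 - w) * x[free]                 # Eq. (11a)
        # step iii: enforce move limits and lower bounds
        x_new = np.clip(x_new, x * (1.0 - dmax), x * (1.0 + dmax))
        x_new = np.maximum(x_new, xL)
        # step iv: variables at (or within gmax of) a bound become active;
        # re-testing all variables also lets active ones become free
        active = 1.0 - x_new / xL >= -gmax
        x = x_new
        # step v: stopping test, required for 5 consecutive iterations
        fk = f(x)
        ok = abs(fk - f_prev) <= tolrel * abs(f_prev) + tolabs and g(x) <= gmax
        hits = hits + 1 if ok else 0
        f_prev = fk
        if hits >= 5:
            break
    return x
```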



3.4. Extensions of the algorithm

The aforementioned fixed-point algorithm can be applied to problems which are extensions of problem (3), as discussed below.
If the objective is to minimize f where $\partial f/\partial x_i \ge 0$, $i = 1, \dots, n$, subject to $g(x) \le 0$ where $\partial g/\partial x_i < 0$, $i = 1, \dots, n$, then we define reciprocal variables $y_i = 1/x_i$, $i = 1, \dots, n$. Defining $f(x) \to f(1/y) \equiv f^{new}(y)$, and similarly $g^{new}(y)$, we obtain the problem:

minimize $f^{new}(y)$
subject to $g^{new}(y) \le 0$, and $y > 0$ (12)

The switch in the signs of the derivatives allows the algorithm to be applied. For example, the algorithm can be applied to minimize the compliance subject to a mass limit, or, with reciprocal variables, to minimize the mass subject to a compliance limit. A sketch of this transformation is given below.
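As a sketch in code (our helper, not from the paper), the reciprocal-variable transformation can be implemented as a wrapper around the original function and its gradient; the chain rule flips the gradient sign as required by Assumption 3.2.

```python
def reciprocal(fun, grad):
    """Wrap f(x) and grad f(x) as f_new(y) = f(1/y) with its gradient.
    Chain rule: d f(1/y)/dy_j = -(1/y_j^2) * (df/dx_j)(1/y),
    so nonnegative gradients of f become nonpositive gradients of f_new."""
    f_new = lambda y: fun(1.0 / y)
    g_new = lambda y: -grad(1.0 / y) / y**2
    return f_new, g_new
```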
Upper bounds are present in some problems. In topology optimization, for example, the bounds $0 \le x_i \le 1$ are imposed on the pseudo-densities. The observation made in Section 3.5 below generalizes: $\left( c_0 - \sum_{r \in J_L} c_r x_r^L - \sum_{r \in J_U} c_r x_r^U \right)$ now represents the available resource after accounting for variables at their bounds. If a variable $x_i \equiv \theta$ represents an angle and crosses zero within $0 \le \theta \le 2\pi$, it can be substituted by $2\pi \le \theta \le 4\pi$.

3.5. Physical interpretation of variable update formula

Importantly, the fixed-point variable update in Eq. (10) can be physically interpreted as follows. Firstly, the derivative $\partial f/\partial x_j = \lim_{\Delta x_j \to 0} \Delta f / \Delta x_j$ has often been termed a 'sensitivity coefficient', since it represents, to within a linear approximation, the change in f due to a unit change in $x_j$. Similarly, the term $x_j\, \partial f/\partial x_j = \lim_{\Delta x_j \to 0} \Delta f / (\Delta x_j / x_j)$ represents, within a linear approximation, the change in f due to a unit percentage change in $x_j$. Thus, as $x \to x^*$, the contribution of the j-th term $(c_j x_j)$ in the constraint

$$c_1 x_1 + c_2 x_2 + \dots + (c_j x_j) + \dots + c_n x_n = c_0$$

is 'strengthened' or 'trimmed' based on the relative impact, or 'relative return on investment', the variable has on the objective. In structural optimization, this update is referred to as 're-sizing' since the $\{x_j\}$ refer to member sizes. The contribution to the literature here is that a physical interpretation has been provided for general functions f and g, as opposed to specific functions pertaining to structural optimization applications. Alternatively, dividing through by $c_0$, we may interpret the update as splitting unity into different parts, each part reflecting the impact on the objective. Lastly, $\left( c_0 - \sum_{r \in I} c_r x_r^L \right)$ represents the available resource after accounting for variables at their bounds; thus the variable updates for the free $x_j$ compete only for the resource that is available. The re-sizing of variables can thus be expressed as: re-allocated term $(c_j x_j)$ = (fractional reduction in objective) × (available resource). Interestingly, the term $x_j\, \partial f/\partial x_j$ has only the units of f and is independent of the units of $x_j$. Also, the iterations are driven by sensitivity information only, whereas in NLP-based descent methods the function f is also monitored during the line search phase.

4. Applications
4.1. Searching for a missing vessel

Consider the problem of searching for a lost object, where it is to be determined how a resource b, measured in time units, should be spent to find, with the largest probability, an object among n subdivisions of the area [9]:

maximize $\sum_{j=1}^{n} a_j \left( 1 - e^{-b_j x_j} \right)$
subject to $\sum_{i=1}^{n} x_i = b$, with $a_j, b_j, b > 0$, $x \in \mathbb{R}^n_+$

Defining $f = -\sum_{j=1}^{n} a_j (1 - e^{-b_j x_j})$ for minimization, we have $\partial f/\partial x_j = -a_j b_j\, e^{-b_j x_j} < 0$, and the derivative of the constraint function g ≤ 0 is 1. Further, the variables are positive. The fixed-point algorithm can thus be applied, as sketched below.
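For instance, using the hypothetical `fixed_point_solve` sketched in Section 3.3 and made-up data (the values of a_j, b_j and b below are illustrative assumptions, not from the paper), the set-up is as follows; the equality constraint is posed as g(x) ≤ 0, which is active at the optimum (Section 3.2).

```python
import numpy as np

a  = np.array([2.0, 1.0, 3.0])    # a_j: values of the subdivisions (assumed)
bj = np.array([1.0, 0.5, 2.0])    # b_j: search effectiveness (assumed)
b  = 4.0                          # total search time available (assumed)

f      = lambda x: -np.sum(a * (1.0 - np.exp(-bj * x)))
grad_f = lambda x: -a * bj * np.exp(-bj * x)    # negative everywhere
g      = lambda x: np.sum(x) - b                # active at the optimum
grad_g = lambda x: np.ones_like(x)              # derivative is 1

x_star = fixed_point_solve(f, grad_f, g, grad_g,
                           x0=np.full(3, 1.0), xL=np.full(3, 1e-6))
```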

4.2. Optimum allocation in stratified sampling

Consider the problem of determining the average of a certain quantity in a large population M. The population is stratified into n strata, each having a population $M_j$ from which $x_j$ samples are chosen. The total sample is to be allocated to the strata so as to obtain a minimum variance of the global estimate [9]:

minimize $\sum_{j=1}^{n} \omega_j^2\, \frac{(M_j - x_j)\, \sigma_j^2}{(M_j - 1)\, x_j}$
subject to $\sum_{i=1}^{n} x_i = b$, $x_j \ge 1$, $j = 1, \dots, n$

where $\omega_j = M_j / M$, $\sigma_j^2$ is an appropriate estimate of the variance in each stratum, and b is the total sample size. We have $\partial f/\partial x_j = -\omega_j^2\, \sigma_j^2\, M_j / \left( (M_j - 1)\, x_j^2 \right) < 0$, and the derivative of the constraint function g ≤ 0 is 1. Further, the variables are positive. The fixed-point algorithm can thus be applied, as sketched below.
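Again with the solver sketch of Section 3.3 and illustrative strata data (the numbers below are assumptions, not taken from [9]):

```python
import numpy as np

Mj   = np.array([5000.0, 3000.0, 2000.0])   # stratum populations (assumed)
sig2 = np.array([4.0, 9.0, 1.0])            # stratum variance estimates (assumed)
b    = 300.0                                # total sample size (assumed)
wj   = Mj / Mj.sum()                        # omega_j = M_j / M

f      = lambda x: np.sum(wj**2 * (Mj - x) * sig2 / ((Mj - 1.0) * x))
grad_f = lambda x: -wj**2 * sig2 * Mj / ((Mj - 1.0) * x**2)   # negative
g      = lambda x: np.sum(x) - b
grad_g = lambda x: np.ones_like(x)

x_star = fixed_point_solve(f, grad_f, g, grad_g,
                           x0=np.full(3, 50.0), xL=np.full(3, 1.0))
```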

4.3. Structural optimization for minimum compliance

The problem of allocating material within a structural domain to minimize compliance (i.e. maximize stiffness) satisfies the assumption $\partial f(x)/\partial x_i \le 0$ required to use the fixed-point algorithm. Specifically, $f(x) = F^T U$, with $K(x)\, U = F$. Here, K = global stiffness matrix, U = global displacement vector, F = global force vector, and x represents either a cross-sectional area or an element density. Using the adjoint method [13], it can be shown that if the element stiffness matrix $k \propto x_i^r$ with parameter r > 0, then

$$\frac{\partial f}{\partial x_i} = -r\, \frac{\varepsilon_i}{x_i}$$

where $\varepsilon_i = q^T k\, q$ is the element strain energy, q is the element displacement vector and k is the element stiffness matrix. Since $\varepsilon_i \ge 0$ owing to k being positive semi-definite, we have $\partial f/\partial x_i \le 0$. Further, the assumption $\partial g/\partial x_i > 0$ readily follows when g represents the total volume of material. In view of $\partial f/\partial x_i = -r\, \varepsilon_i / x_i$, the fraction $E_j$ in Eq. (9) takes the form

$$E_j = \frac{\varepsilon_j}{\sum_{i=1,\, i \notin I}^{n} \varepsilon_i} \qquad (13)$$

That is, variable update or re-sizing is governed by the energy in the j-th element as a fraction of the total energy. The many energy-based optimality criteria methods in the structural optimization literature are thus shown here to be special cases stemming from a general formula.

4.4. Real time allocation of resources

In view of the assumptions behind this algorithm, it can be expected to be useful in allocating a fixed amount of resources relating to security. This follows from the assumption that more security than needed in certain areas will not be harmful. Similarly, allocating a fixed amount of funds to school districts is a possible application. Since the algorithm is fast, a dynamic allocation is possible based on, say, 30-day moving averages of the data. These applications remain to be explored.

5. Numerical examples

Examples are presented to show that the algorithm is far superior to a general
nonlinear programming method for problems that satisfy Assumptions 3.1 and 3.2.
The fixed-point algorithm is compared to MATLAB’s constrained optimizers
(sequential quadratic programming and interior point optimizers under ‘fmincon’).

5.1. Cantilever beam

Consider the simple problem of allocating material to minimize the tip deflection of a cantilever beam with two length segments, with certain assumed problem data:

minimize $f = \dfrac{32}{x_1^2} + \dfrac{1}{x_2^2}$
subject to $2x_1 + x_2 \le 1$, $x > 0$

From Eq. (9), $E_1 = \dfrac{64/x_1^2}{64/x_1^2 + 2/x_2^2}$ and $E_2 = \dfrac{2/x_2^2}{64/x_1^2 + 2/x_2^2}$. The recurrence formula in Eq. (11a) becomes

$$x_j^{k+1} = w\, E_j\, \frac{c_0}{c_j} + (1 - w)\, x_j^k$$

where $\{c_j\} = [2\ \ 1]$ and $c_0 = 1$. The optimum is $x^* \approx [0.417,\ 0.166]$. Based on Theorem 2.1, it can be readily shown that the fixed-point iteration converges for $w \in (0,1)$.
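Using the solver sketch of Section 3.3, this two-variable problem can be reproduced directly (our code; the starting point is arbitrary):

```python
import numpy as np

f      = lambda x: 32.0 / x[0]**2 + 1.0 / x[1]**2
grad_f = lambda x: np.array([-64.0 / x[0]**3, -2.0 / x[1]**3])
g      = lambda x: 2.0 * x[0] + x[1] - 1.0
grad_g = lambda x: np.array([2.0, 1.0])

x_star = fixed_point_solve(f, grad_f, g, grad_g,
                           x0=np.array([0.2, 0.2]), xL=np.full(2, 1e-6))
# iterates move toward x* ~ [0.417, 0.166], where 2*x1 + x2 = 1 is active
```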

5.2. Searching for a missing vessel

Table 1. Searching for a missing vessel

n       NFE¹, fixed-point algorithm²    NFE, NLP algorithm²    f*_fixed-point, f*_NLP
10      37                              209                    -2.338, -2.339
100     67                              4,041                  -5.733, -5.750
1000³   93                              20,021³                -9.378, -9.240

¹ NFE = number of function evaluations
² max. constraint violation = 0.001, max NFE = 100, in the fixed-point code
³ the NLP routine took 19 iterations and 20,021 function calls

5.3. Randomized test problem

A set of randomized test problems of the form

minimize $f(x) = \sum_{i=1}^{t_0} C_{0i} \prod_{j=1}^{n} x_j^{a_{0ij}}$
subject to $g_1(x) \equiv \sum_{i=1}^{t_1} C_{1i} \prod_{j=1}^{n} x_j^{a_{1ij}} \le 1$, $0 < x^L \le x \le x^U$

is considered, with the restriction that all coefficients $C_{ki} > 0$, $a_{0ij} < 0$ and $a_{1ij} > 0$. Here $C_{ki}$ is the coefficient of the k-th function, i-th term, while $a_{kij}$ is the exponent corresponding to the k-th function, i-th term and j-th variable. Derivatives are calculated analytically.

Data: $x^L = 10^{-6}$, $x^U = 1$, $C_{0i} \in (0,1)$, $C_{1i} \in (0.2, 1)$, $a_{0ij} \in (-1, 0)$, $a_{1ij} \in (0, 1)$, $t_0 = t_1 = 8$, $x^0 = 0.5$. To ensure that the constraint is active, $\sum_i C_{1i} > 1$. A sketch of such a problem generator is given below.
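The sketch below (our code; the seed is arbitrary) generates one such instance with analytic gradients. Note that the basic solver sketch of Section 3.3 handles only lower bounds; the upper bound x ≤ x^U would need the extension of Section 3.4 or an extra clip in step iii.

```python
import numpy as np

rng = np.random.default_rng(0)           # illustrative instance, seed arbitrary
n, t0, t1 = 10, 8, 8
C0 = rng.uniform(0.0, 1.0, t0)           # C_0i in (0, 1)
C1 = rng.uniform(0.2, 1.0, t1)           # C_1i in (0.2, 1)
a0 = rng.uniform(-1.0, 0.0, (t0, n))     # a_0ij in (-1, 0)
a1 = rng.uniform(0.0, 1.0, (t1, n))      # a_1ij in (0, 1)
assert C1.sum() > 1.0                    # ensures the constraint is active

def f(x):
    return np.sum(C0 * np.prod(x ** a0, axis=1))

def grad_f(x):                           # per term: d/dx_j = term * a_ij / x_j
    t = C0 * np.prod(x ** a0, axis=1)
    return (t[:, None] * a0 / x).sum(axis=0)   # < 0 since a0 < 0

def g(x):
    return np.sum(C1 * np.prod(x ** a1, axis=1)) - 1.0

def grad_g(x):
    t = C1 * np.prod(x ** a1, axis=1)
    return (t[:, None] * a1 / x).sum(axis=0)   # > 0 since a1 > 0

# x_star = fixed_point_solve(f, grad_f, g, grad_g,
#                            x0=np.full(n, 0.5), xL=np.full(n, 1e-6))
```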
Table 2 shows the results for various randomly generated trials. The fixed-point iteration method far outperforms the NLP routine for the class of problems considered. The total computation in NLP methods increases with n, while fixed-point iteration methods are insensitive to it, as also noted in the structural optimization literature.

Table 2. Randomized test problem

n      NFE⁴, fixed-point algorithm¹    NFE, NLP algorithm²    f*_fixed-point / f*_NLP²
10     50                              363                    1.027
20     50                              1,131                  1.030
40³    50                              3,008                  1.003

¹ fixed value
² averaged over 5 random trials
³ larger values of n cannot be solved in reasonable time by the NLP routine
⁴ NFE = number of function evaluations

6. Conclusions

In this paper, an optimality criteria based fixed-point iteration is developed for a class of nonlinear programming problems. The class of problems requires that the variables are positive, the derivatives of f are negative, and the derivatives of g are positive. Certain extensions of this subclass of problems have been given. Hitherto, fixed-point methods were developed only in a problem-specific manner in the field of structural optimization. Convergence aspects are discussed. The fixed-point iteration algorithms found in the structural optimization literature, albeit targeting more general problems than considered here, are presented without clearly stating the assumptions under which convergence may be achieved, and in fact have been observed to not converge in certain situations. Moreover, these have been developed for problem-specific applications in structural optimization and have not targeted general functions f and g, even within a subspace of functions. Importantly, the fixed-point update, within this general context, is given physical significance in this paper.

The results show that for the subclass of problems considered, fixed-point iterations far outperform an NLP method. This is a general result, because the fixed-point iteration method does not involve a line search, a main step in NLP methods, which requires computationally expensive multiple function evaluations even with the use of approximations. A fixed-point algorithm is insensitive to n, whereas in NLP methods the number of iterations increases significantly with n. In the algorithm presented, the value of w (or p) requires, at most, a one-time adjustment based on a simple rule, viz. if oscillations in the maximum constraint violation are noticed during the iterations, then w is reduced (or p is increased); a smaller value of w than needed also works, except that more iterations are required, as there is more emphasis on the previous point. A default value of w = 0.25 has worked well on the examples here. New applications for the algorithm remain to be explored.

References
[1] Venkayya V.B., Design of optimum structures, Computers and Structures 1971, 1, 265-309.
[2] Berke L., An efficient approach to the minimum weight design of deflection limited structures, AFFDL-TM-70-4-FDTR, Flight Dynamics Laboratory, Wright-Patterson AFB, OH 1970.
[3] Khot N.S., Berke L., Venkayya V.B., Comparison of optimality criteria algorithms for minimum weight design of structures, AIAA Journal 1979, 17(2), 182-190.
[4] Dobbs M.W., Nelson R.B., Application of optimality criteria to automated structural design, AIAA Journal 1976, 14(10), 1436-1443.
[5] Khan M.R., Willmert K.D., Thornton W.A., Optimality criterion method for large-scale structures, AIAA Journal 1979, 17(7), 753-761.
[6] McGee O.G., Phan K.F., A robust optimality criteria procedure for cross-sectional optimization of frame structures with multiple frequency limits, Computers and Structures 1991, 38(5/6), 485-500.
[7] Yin L., Yang W., Optimality criteria method for topology optimization under multiple constraints, Computers and Structures 2001, 79, 1839-1850.
[8] Belegundu A.D., A general optimality criteria algorithm for a class of engineering optimization problems, Engineering Optimization 2015, 47(5), 674-688.
[9] Patriksson M., A survey on the continuous nonlinear resource allocation problem, European Journal of Operational Research 2008, 185(1), 1-46.
[10] Jose J.A., Klein C.A., A note on multi-item inventory systems with limited capacity, Operations Research Letters 1988, 7(2), 71-75.
[11] Atkinson K.E., An Introduction to Numerical Analysis, John Wiley, 1978.
[12] Bryant V., Metric Spaces: Iteration and Application, Cambridge University Press, 1985.
[13] Belegundu A.D., Chandrupatla T.R., Optimization Concepts and Applications in Engineering, 2nd edition, Cambridge University Press, 2011.
