Lecture 09 - Sequential Quadratic Programming

School of Computer Science and Applied Mathematics
APPM 3017: Optimization III

Lecture 09 : Sequential Quadratic Programming
Lecturer: Matthews Sejeso Date: October 2021
By the completion of this lecture you should be able to:

1. Describe the sequential quadratic programming method for solving constrained optimization problems.
Reference:
• Chapter 18, Jorge Nocedal and Stephen J. Wright, ‘Numerical Optimization’.
9.1 Introduction
Penalty function methods are an indirect way of solving the constrained optimization
problem. A more direct and efficient approach is to iterate on a quadratic approximation to the
objective function and a linear approximation to the constraints. Each iteration requires the solution of a
quadratic programming problem, hence the name 'sequential quadratic programming'. The sequential quadratic
programming (SQP) approach can be viewed as a generalization of Newton's method to constrained
optimization. The SQP approach can be used in both line search and trust-region frameworks. Unlike the
linearly constrained Lagrangian methods from the last lecture, which are effective when most of the
constraints are linear, SQP methods show their strength when solving problems with significant
nonlinearities in the constraints.
9.2 Sequential Quadratic Programming Idea
For simplicity, let us consider the equality constrained problem

\[ \min f(x) \quad \text{subject to} \quad c(x) = 0, \tag{9.1} \]
where $f : \mathbb{R}^n \to \mathbb{R}$ and $c : \mathbb{R}^n \to \mathbb{R}^m$ are smooth functions. The idea behind the SQP approach is to model
problem (9.1) at the current iterate $x^k$ by a quadratic programming subproblem, then use the minimizer
of the subproblem to define a new iterate $x^{k+1}$. The challenge is to design the quadratic subproblem to
yield a good step for the nonlinear optimization problem. The simplest derivation of SQP methods can be
viewed as an application of Newton's method to the KKT optimality conditions for problem (9.1).
Recall that the Lagrangian function for this problem is $L(x, \lambda) = f(x) - \lambda^T c(x)$, where $\lambda \in \mathbb{R}^m$ is a vector of
Lagrange multipliers. We use $J(x)$ to denote the Jacobian matrix of the constraints, that is
\[ J(x)^T = [\nabla c_1(x), \nabla c_2(x), \ldots, \nabla c_m(x)], \tag{9.2} \]
where $c_i$ is the $i$-th component of the vector $c(x)$ (the $i$-th constraint). The first-order (KKT) optimality
conditions of the equality-constrained problem (9.1) can be written as a system of $n + m$ equations in the
$n + m$ unknowns $x$ and $\lambda$:
\[ \nabla L(x, \lambda) = \begin{bmatrix} \nabla_x L(x, \lambda) \\ \nabla_\lambda L(x, \lambda) \end{bmatrix} = \begin{bmatrix} \nabla f(x) - J(x)^T \lambda \\ c(x) \end{bmatrix} = 0. \tag{9.3} \]
Any solution (x∗ , λ∗ ) of the equality-constrained problem (9.1) for which J(x∗ ) has full rank satisfies (9.3).
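As a rough illustration of (9.3), the sketch below evaluates the stacked KKT residual for a small test problem. The problem (min x1 + x2 subject to x1^2 + x2^2 - 2 = 0), its solution pair, and the helper names `grad_f`, `c`, `jac_c` and `kkt_residual` are assumptions made for this example; they do not come from the lecture.

```python
import numpy as np

# Hypothetical test problem (an assumption, not from the lecture):
#   min x1 + x2  subject to  x1^2 + x2^2 - 2 = 0.
def grad_f(x):
    return np.array([1.0, 1.0])

def c(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 2.0])

def jac_c(x):
    # J(x) has one row per constraint: row i is grad c_i(x)^T, as in (9.2).
    return np.array([[2.0 * x[0], 2.0 * x[1]]])

def kkt_residual(x, lam):
    """Stack the two blocks of (9.3): grad f(x) - J(x)^T lam, and c(x)."""
    return np.concatenate([grad_f(x) - jac_c(x).T @ lam, c(x)])

# Candidate solution pair for the test problem.
x_star = np.array([-1.0, -1.0])
lam_star = np.array([-0.5])
residual = kkt_residual(x_star, lam_star)
print(residual)
```

A residual of zero only certifies a KKT point; it does not by itself distinguish a minimizer from other stationary points.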
One approach is to solve the nonlinear equations (9.3) by Newton's method. The Jacobian of the system (9.3)
with respect to $x$ and $\lambda$ is given by
\[ H(x, \lambda) = \begin{bmatrix} \nabla^2_{xx} L(x, \lambda) & -J(x)^T \\ J(x) & 0 \end{bmatrix}. \tag{9.4} \]
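The Newton step from the iterate $(x^k, \lambda^k)$ is thus given by

\[ \begin{bmatrix} x^{k+1} \\ \lambda^{k+1} \end{bmatrix} = \begin{bmatrix} x^k \\ \lambda^k \end{bmatrix} + \begin{bmatrix} d_x^k \\ d_\lambda^k \end{bmatrix}, \tag{9.5} \]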
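where $d_x^k$ and $d_\lambda^k$ solve the Newton-KKT system

\[ \begin{bmatrix} \nabla^2_{xx} L(x^k, \lambda^k) & -J(x^k)^T \\ J(x^k) & 0 \end{bmatrix} \begin{bmatrix} d_x^k \\ d_\lambda^k \end{bmatrix} = \begin{bmatrix} -\nabla f(x^k) + J(x^k)^T \lambda^k \\ -c(x^k) \end{bmatrix}. \tag{9.6} \]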
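One Newton step of the form (9.5), (9.6) can be sketched with a dense linear solve. The test problem (min x1 + x2 subject to x1^2 + x2^2 - 2 = 0), the starting pair, and all function names below are assumptions chosen for illustration, not part of the lecture.

```python
import numpy as np

# Hypothetical test problem (an assumption, not from the lecture):
#   min x1 + x2  subject to  x1^2 + x2^2 - 2 = 0.
def grad_f(x):
    return np.array([1.0, 1.0])

def c(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 2.0])

def jac_c(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]]])

def hess_lag(x, lam):
    # Hessian in x of L = f - lam^T c: f is linear and c has Hessian 2I.
    return -lam[0] * 2.0 * np.eye(2)

def kkt_residual(x, lam):
    return np.concatenate([grad_f(x) - jac_c(x).T @ lam, c(x)])

def newton_kkt_step(x, lam):
    """Assemble and solve the Newton-KKT system (9.6), then update as in (9.5)."""
    n, m = x.size, lam.size
    J = jac_c(x)
    K = np.block([[hess_lag(x, lam), -J.T], [J, np.zeros((m, m))]])
    rhs = np.concatenate([-grad_f(x) + J.T @ lam, -c(x)])
    d = np.linalg.solve(K, rhs)
    return x + d[:n], lam + d[n:]

x0, lam0 = np.array([-1.2, -0.9]), np.array([-0.4])
x1, lam1 = newton_kkt_step(x0, lam0)
res_before = np.linalg.norm(kkt_residual(x0, lam0))
res_after = np.linalg.norm(kkt_residual(x1, lam1))
print(res_before, res_after)
```

The second block row of (9.6) forces the step to satisfy the linearised constraint exactly, which can be checked with `jac_c(x0) @ (x1 - x0) + c(x0)`.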
This Newton iteration is well defined when the KKT matrix in (9.6) is nonsingular. This matrix is
nonsingular if the following assumption holds at $(x, \lambda) = (x^k, \lambda^k)$.
Assumption 9.1.
(a) The constraint Jacobian $J(x)$ has full row rank.
(b) The matrix $\nabla^2_{xx} L(x, \lambda)$ is positive definite on the tangent space of the constraints, that is
$d^T \nabla^2_{xx} L(x, \lambda) d > 0$ for all $d \neq 0$ such that $J(x) d = 0$.
Condition (a) is the linear independence constraint qualification (LICQ), which we assume throughout
this lecture. Condition (b) holds whenever $(x, \lambda)$ is close to the optimum $(x^*, \lambda^*)$ and the second-order
sufficient condition is satisfied at the solution. The Newton iteration (9.5), (9.6) can be shown to be
quadratically convergent under these assumptions and constitutes an excellent algorithm for solving equality
constrained problems, provided that the starting point is close enough to $x^*$.
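The two parts of Assumption 9.1 can be checked numerically at a given point. The sketch below is one possible approach, assuming dense matrices: it tests the rank of J and the eigenvalues of the reduced Hessian Z^T W Z, where the columns of Z span the null space of J. The function name `check_assumption_9_1`, the tolerance, and the sample matrices are hypothetical.

```python
import numpy as np

def check_assumption_9_1(J, W, tol=1e-10):
    """Return (full_row_rank, positive_definite_on_null_space_of_J)."""
    m, n = J.shape
    # (a) full row rank of the constraint Jacobian.
    _, s, Vt = np.linalg.svd(J)
    rank = int(np.sum(s > tol))
    full_row_rank = rank == m
    # (b) columns of Z form an orthonormal basis of {d : J d = 0}.
    Z = Vt[rank:].T
    if Z.shape[1] == 0:
        return full_row_rank, True  # the tangent space is {0}
    reduced = Z.T @ W @ Z
    pos_def = bool(np.all(np.linalg.eigvalsh(reduced) > tol))
    return full_row_rank, pos_def

J = np.array([[-2.0, -2.0]])
W_indefinite = np.array([[1.0, -2.0], [-2.0, 1.0]])  # eigenvalues 3 and -1
W_bad = np.diag([1.0, -3.0])
print(check_assumption_9_1(J, W_indefinite))
print(check_assumption_9_1(J, W_bad))
```

The first sample matrix is indefinite on the whole space yet positive definite on the tangent space, which is all that condition (b) requires.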
The SQP Algorithmic Framework
There is an alternative way to view iteration (9.5), (9.6). Suppose that at the iterate $(x^k, \lambda^k)$ we model
problem (9.1) using the quadratic program
\[ \min_d \; f(x^k) + \nabla f(x^k)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} L(x^k, \lambda^k) d \tag{9.7} \]
\[ \text{subject to} \quad J(x^k) d + c(x^k) = 0. \tag{9.8} \]
If Assumption 9.1 holds, this problem has a unique solution $(d_x^k, l^k)$ that satisfies
\[ \nabla^2_{xx} L(x^k, \lambda^k) d_x^k + \nabla f(x^k) - J(x^k)^T l^k = 0, \tag{9.9} \]
\[ J(x^k) d_x^k + c(x^k) = 0. \tag{9.10} \]
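The vectors $d_x^k$ and $l^k$ can be identified with the solution of the Newton equations (9.6). If we subtract
$J(x^k)^T \lambda^k$ from both sides of the first equation in (9.6) and use $\lambda^{k+1} = \lambda^k + d_\lambda^k$ from (9.5), we obtain
\[ \begin{bmatrix} \nabla^2_{xx} L(x^k, \lambda^k) & -J(x^k)^T \\ J(x^k) & 0 \end{bmatrix} \begin{bmatrix} d_x^k \\ \lambda^{k+1} \end{bmatrix} = \begin{bmatrix} -\nabla f(x^k) \\ -c(x^k) \end{bmatrix}. \tag{9.11} \]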
Hence, by nonsingularity of the coefficient matrix, we have that $\lambda^{k+1} = l^k$ and that $d_x^k$ solves both
(9.7), (9.8) and (9.6).
The new iterate $(x^{k+1}, \lambda^{k+1})$ can therefore be defined either as the solution of the quadratic program (9.7),
(9.8) or as the iterate generated by Newton's method (9.5), (9.6) applied to the optimality conditions of
the problem.
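The equivalence of the two viewpoints can be checked numerically by solving the same KKT matrix with the two right-hand sides from (9.6) and (9.11). The matrices and vectors below are arbitrary illustrative data (an assumption), standing in for the Hessian of the Lagrangian, the Jacobian, the gradient and the constraint value at some iterate.

```python
import numpy as np

# Arbitrary illustrative data at a hypothetical iterate (not from the lecture).
W = np.array([[2.0, 0.5], [0.5, 1.0]])   # stands in for the Hessian of the Lagrangian
J = np.array([[1.0, 2.0]])               # constraint Jacobian, full row rank
g = np.array([1.0, -1.0])                # gradient of f
c_val = np.array([0.3])                  # constraint value
lam = np.array([0.7])                    # current multiplier estimate

n, m = g.size, c_val.size
K = np.block([[W, -J.T], [J, np.zeros((m, m))]])

# Newton form (9.6): unknowns are the steps in x and in lambda.
sol_newton = np.linalg.solve(K, np.concatenate([-g + J.T @ lam, -c_val]))
d_newton, d_lam = sol_newton[:n], sol_newton[n:]

# QP form (9.11): unknowns are the step in x and the new multiplier.
sol_qp = np.linalg.solve(K, np.concatenate([-g, -c_val]))
d_qp, l_qp = sol_qp[:n], sol_qp[n:]

print(d_newton, d_qp)
print(lam + d_lam, l_qp)
```

Both solves share the coefficient matrix, so in practice only one of the two forms needs to be computed.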
We now state the SQP method in its simplest form.
Algorithm 9.1 SQP Algorithm

1: Choose an initial pair $(x^0, \lambda^0)$ and set $k \leftarrow 0$.
2: while not converged do
3: Evaluate $f(x^k)$, $\nabla f(x^k)$, $\nabla^2_{xx} L(x^k, \lambda^k)$, $c(x^k)$ and $J(x^k)$;
4: Solve problem (9.7), (9.8) for $d^k$ and $l^k$;
5: Set $x^{k+1} \leftarrow x^k + d^k$, $\lambda^{k+1} \leftarrow l^k$ and $k \leftarrow k + 1$;
6: end while
7: Stop with approximate solution $x^k$.
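A minimal sketch of Algorithm 9.1 for equality constraints is given below. It solves each subproblem through the linear system (9.11) rather than a general QP solver. The convergence test on the KKT residual norm, the tolerance, the iteration cap, and the test problem (min x1 + x2 subject to x1^2 + x2^2 - 2 = 0) are assumptions, since the algorithm leaves them unspecified. No globalization is included, so the starting point has to be close to a solution.

```python
import numpy as np

def sqp_equality(x0, lam0, grad_f, c, jac_c, hess_lag, tol=1e-10, max_iter=50):
    """Local SQP iteration for min f(x) subject to c(x) = 0."""
    x = np.asarray(x0, dtype=float)
    lam = np.asarray(lam0, dtype=float)
    n = x.size
    for k in range(max_iter):
        g, J, c_val = grad_f(x), jac_c(x), c(x)
        # Assumed stopping rule: small residual of the KKT conditions (9.3).
        if np.linalg.norm(np.concatenate([g - J.T @ lam, c_val])) <= tol:
            break
        m = c_val.size
        K = np.block([[hess_lag(x, lam), -J.T], [J, np.zeros((m, m))]])
        sol = np.linalg.solve(K, np.concatenate([-g, -c_val]))  # system (9.11)
        x, lam = x + sol[:n], sol[n:]  # step in x, new multiplier l^k
    return x, lam, k

# Hypothetical test problem (an assumption, not from the lecture).
grad_f = lambda x: np.array([1.0, 1.0])
c = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 2.0])
jac_c = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]]])
hess_lag = lambda x, lam: -lam[0] * 2.0 * np.eye(2)

x_sol, lam_sol, iters = sqp_equality([-1.2, -0.9], [-0.4], grad_f, c, jac_c, hess_lag)
print(x_sol, lam_sol, iters)
```

Practical implementations add a line search or trust region and a safeguard for a Hessian that is not positive definite on the tangent space; neither is attempted here.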
In the objective (9.7) of the quadratic program, we could replace the linear term $\nabla f(x^k)^T d$ by $\nabla_x L(x^k, \lambda^k)^T d$,
since the constraint (9.8) makes the two choices equivalent. In this case, (9.7) is a quadratic approximation
of the Lagrangian function. This fact provides a motivation for the choice of the quadratic model: we first
replace the nonlinear program (9.1) by the problem of minimizing the Lagrangian subject to the equality
constraints, then make a quadratic approximation to the Lagrangian and a linear approximation to the
constraints to obtain (9.7), (9.8).
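The equivalence of the two linear terms can be made explicit with a short calculation. The following is a sketch using only the definition of the Lagrangian and the constraint (9.8); the symbol $\mu$ for the multiplier of the modified subproblem is introduced here for illustration.

```latex
% For any d that is feasible for (9.8), J(x^k) d = -c(x^k), so
\begin{align*}
\nabla_x L(x^k, \lambda^k)^T d
  &= \nabla f(x^k)^T d - (\lambda^k)^T J(x^k) d \\
  &= \nabla f(x^k)^T d + (\lambda^k)^T c(x^k).
\end{align*}
% The last term does not depend on d, so both objectives have the same
% minimizer d_x^k over the feasible set. Only the multiplier changes:
% the modified subproblem has multiplier \mu = l^k - \lambda^k = d_\lambda^k.
```

With the Lagrangian gradient in the linear term, the optimality conditions of the subproblem therefore reproduce the Newton-KKT system (9.6) directly.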
Inequality Constraints
The SQP framework can be extended easily to the general nonlinear programming problem
\[ \min f(x) \quad \text{subject to} \quad \begin{cases} c_i(x) = 0, & i \in \mathcal{E}, \\ c_i(x) \geq 0, & i \in \mathcal{I}. \end{cases} \tag{9.12} \]
To model this problem we linearise both the equality and inequality constraints to obtain
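\[ \min_d \; f(x^k) + \nabla f(x^k)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} L(x^k, \lambda^k) d \tag{9.13} \]
\[ \text{subject to} \quad \nabla c_i(x^k)^T d + c_i(x^k) = 0, \quad i \in \mathcal{E}, \tag{9.14} \]
\[ \nabla c_i(x^k)^T d + c_i(x^k) \geq 0, \quad i \in \mathcal{I}. \tag{9.15} \]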
The new iterate is given by $(x^k + d^k, \lambda^{k+1})$, where $d^k$ and $\lambda^{k+1}$ are the solution and the corresponding
Lagrange multiplier of (9.13), (9.14), (9.15). An SQP method for solving (9.12) is thus given by Algorithm
9.1 with the modification that the step is computed from (9.13), (9.14), (9.15).
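For the special case of a single linearised inequality and a positive definite Hessian, the subproblem (9.13), (9.15) can be solved by checking the two possible active sets. The sketch below makes exactly those assumptions and is not a general QP solver; the function name `solve_qp_one_inequality` and the sample data are hypothetical.

```python
import numpy as np

def solve_qp_one_inequality(W, g, a, c_val):
    """Solve min g^T d + 0.5 d^T W d  subject to  a^T d + c_val >= 0.

    Assumes W is symmetric positive definite and a is nonzero.
    Returns (d, multiplier, constraint_is_active).
    """
    # Case 1: the constraint is inactive, so the unconstrained minimizer is optimal.
    d_free = np.linalg.solve(W, -g)
    if a @ d_free + c_val >= 0.0:
        return d_free, 0.0, False
    # Case 2: the constraint is active, so solve the equality-constrained KKT system.
    n = g.size
    K = np.block([[W, -a.reshape(n, 1)], [a.reshape(1, n), np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.concatenate([-g, [-c_val]]))
    return sol[:n], sol[n], True

W = np.eye(2)
a = np.array([1.0, 0.0])  # linearised constraint d1 >= 0
d_act, l_act, active = solve_qp_one_inequality(W, np.array([1.0, 1.0]), a, 0.0)
d_in, l_in, inactive = solve_qp_one_inequality(W, np.array([-1.0, 1.0]), a, 0.0)
print(d_act, l_act, active)
print(d_in, l_in, inactive)
```

With several inequalities the number of candidate active sets grows combinatorially, which is why dedicated active-set or interior-point QP solvers are used in practice.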
In this approach the set of active constraints $A(x^k)$ at the solution of (9.13), (9.14), (9.15) constitutes our
guess of the active set at the solution of the nonlinear program. If the SQP method is able to correctly
identify this optimal active set then it will act like a Newton method for equality-constrained optimization
and will converge rapidly. The following result gives conditions under which this desirable behaviour takes
place. We say that strict complementarity holds at the solution pair $(x^*, \lambda^*)$ if there is no index $i \in \mathcal{I}$ such
that $\lambda_i^* = c_i(x^*) = 0$.
Theorem 9.1. Suppose that $x^*$ is a local solution of (9.12) at which the KKT conditions are satisfied for
some $\lambda^*$. Suppose, too, that the linear independence constraint qualification (LICQ), the strict complementarity
condition, and the second-order sufficient conditions hold at $(x^*, \lambda^*)$. Then if $(x^k, \lambda^k)$ is sufficiently close to
$(x^*, \lambda^*)$, there is a local solution of the subproblem (9.13), (9.14), (9.15) whose active set $A(x^k)$ is the
same as the active set $A(x^*)$ of the nonlinear program (9.12) at $x^*$.
Far from the solution, the SQP approach is usually able to improve the estimate of the active set and guide
the iterates towards a solution.
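The notions of active set and strict complementarity used above can be expressed as two small checks. This is a sketch with hypothetical function names, index sets, tolerance and sample values; in floating point the comparisons with zero have to be made up to a tolerance.

```python
def active_set(c_vals, equality_idx, tol=1e-8):
    """Indices of constraints with c_i = 0; equality constraints are always included."""
    near_zero = {i for i, ci in enumerate(c_vals) if abs(ci) <= tol}
    return sorted(set(equality_idx) | near_zero)

def strict_complementarity(c_vals, lam, inequality_idx, tol=1e-8):
    """True if no inequality index has both lam_i = 0 and c_i = 0."""
    return not any(
        abs(c_vals[i]) <= tol and abs(lam[i]) <= tol for i in inequality_idx
    )

# Illustrative values: constraint 0 is an equality, constraints 1 and 2 are inequalities.
c_vals = [0.0, 0.0, 1.5]
E, I = [0], [1, 2]
print(active_set(c_vals, E))
print(strict_complementarity(c_vals, [0.3, 0.0, 0.0], I))
print(strict_complementarity(c_vals, [0.3, 0.7, 0.0], I))
```

In the first multiplier vector, constraint 1 is active with a zero multiplier, which is precisely the degenerate case excluded by the hypotheses of Theorem 9.1.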
PROBLEM SET IX
1. Show that in the quadratic program (9.7), (9.8) we can replace the linear term $\nabla f(x^k)^T d$ by $\nabla_x L(x^k, \lambda^k)^T d$
without changing the solution.
2. Consider the constraint $x_1^2 + x_2^2 = 1$. Write the linearised constraints (9.8) at the following points:
$[0, 0]^T$, $[0, 1]^T$, $[0.1, 0.02]^T$, $-[0.1, 0.02]^T$.
3. Consider the problem
\[ \min \; x_1 - x_2 \tag{9.16} \]
\[ \text{subject to} \quad x_1^2 + x_2^2 \leq 1. \tag{9.17} \]
Starting from the point $x = [-1, 0]^T$ with the initial value of the Lagrange multiplier $\lambda = 1$,
carry out two iterations of the sequential quadratic programming method.
4. Write a program that implements Algorithm 9.1. Use it to solve the problem
\[ \min \; e^{x_1 x_2 x_3 x_4 x_5} - \tfrac{1}{2} (x_1^3 + x_2^3 + 1)^2 \tag{9.18} \]
\[ \text{subject to} \quad x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 - 10 = 0, \tag{9.19} \]
\[ x_2 x_3 - 5 x_4 x_5 = 0, \tag{9.20} \]
\[ x_1^3 + x_2^3 + 1 = 0. \tag{9.21} \]