Numerical Optimization: Numerical Geometry of Non-Rigid Shapes


Alexander Bronstein, Michael Bronstein 2008 All rights reserved. Web:

Fastest Largest

Common denominator: optimization problems

Optimization problems
Generic unconstrained minimization problem

$\min_{x \in X} f(x)$

where

Vector space $X$ is the search space

$f : X \to \mathbb{R}$ is a cost (or objective) function

A solution $x^* = \arg\min_{x \in X} f(x)$ is the minimizer of $f(x)$

The value $f(x^*)$ is the minimum

Local vs. global minimum

Find minimum by analyzing the local behavior of the cost function

Local minimum

Global minimum

Local vs. global in real life

False summit 8,030 m

Main summit 8,047 m

Broad Peak (K3), 12th highest mountain on Earth

Convex functions
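A function $f$ defined on a convex set $C$ is called convex if

$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$

for any $x, y \in C$ and $\lambda \in [0, 1]$

For a convex function, local minimum = global minimum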



One-dimensional optimality conditions

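Point $x^*$ is the local minimizer of a $C^2$-function $f$ if $f'(x^*) = 0$ and $f''(x^*) > 0$

Approximate a function around $x^*$ as a parabola using Taylor expansion

$f(x^* + dx) \approx f(x^*) + f'(x^*) \, dx + \frac{1}{2} f''(x^*) \, dx^2$

$f'(x^*) = 0$ guarantees the minimum at $x^*$

$f''(x^*) > 0$ guarantees the parabola is convex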
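The two one-dimensional conditions can be checked numerically. The sketch below estimates the first and second derivatives by central differences; the helper names, the step sizes `h`, and the test parabola are assumptions made for illustration.

```python
def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def is_local_minimizer(f, x, tol=1e-6):
    """Check f'(x) ~ 0 (stationary point) and f''(x) > 0 (convex parabola)."""
    return abs(derivative(f, x)) < tol and second_derivative(f, x) > tol


# Hypothetical test function: a parabola with its minimum at x = 2.
parabola = lambda x: (x - 2.0) ** 2 + 1.0
print(is_local_minimizer(parabola, 2.0))  # stationary point with positive curvature
print(is_local_minimizer(parabola, 0.0))  # not a stationary point
```

Finite differences only approximate the derivatives, so the tolerance has to be chosen with the scale of the function in mind.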

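In multidimensional case, linearization of the function according to Taylor

$f(x + dx) = f(x) + \langle \nabla f(x), dx \rangle + O(\|dx\|^2)$

gives a multidimensional analogy of the derivative.

The function mapping $X$ to $X$ in the linear term, denoted as $\nabla f(x)$, is called the gradient of $f$

In one-dimensional case, it reduces to standard definition of derivative

$f'(x) = \lim_{dx \to 0} \frac{f(x + dx) - f(x)}{dx}$

In Euclidean space ($X = \mathbb{R}^n$), $\nabla f$ can be represented in standard basis

$e_i = (0, \dots, 0, 1, 0, \dots, 0)^T$ (1 in the $i$-th place)

in the following way:

$\langle \nabla f(x), e_i \rangle = \lim_{\epsilon \to 0} \frac{f(x + \epsilon e_i) - f(x)}{\epsilon} = \frac{\partial f}{\partial x_i}$

which gives

$\nabla f(x) = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right)^T$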
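The coordinate representation suggests a simple numerical check: perturb one coordinate at a time and take a difference quotient. A minimal sketch, assuming the point is a plain list of floats; `numerical_gradient` and the test function are hypothetical names chosen here.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x (a list of floats).

    Component i is a central difference along the i-th standard basis
    vector e_i, i.e. an estimate of the partial derivative df/dx_i.
    """
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


# Hypothetical example: f(x) = x0^2 + 3 x0 x1 with gradient (2 x0 + 3 x1, 3 x0).
quadratic = lambda x: x[0] ** 2 + 3.0 * x[0] * x[1]
print(numerical_gradient(quadratic, [1.0, 2.0]))
```

Such a check is mainly useful for validating hand-derived gradients before using them in an optimization loop.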

Example 1: gradient of a matrix function

Given the space of real matrices with the standard inner product $\langle A, B \rangle = \mathrm{tr}(A^T B)$, compute the gradient of a function defined on this space in terms of a fixed matrix

For square matrices

Example 2: gradient of a matrix function

Compute the gradient of a function of a matrix argument defined in terms of a fixed matrix
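To illustrate how a gradient with respect to a matrix argument is read off from the linearization, assume two concrete functions: $f_1(X) = \mathrm{tr}(A^T X)$ with $A$ of the same size as $X$, and $f_2(X) = \mathrm{tr}(X^T A X)$ with $A$ square. Both choices are assumptions made for this sketch. The inner product is $\langle A, B \rangle = \mathrm{tr}(A^T B)$.

```latex
\begin{aligned}
f_1(X + dX) &= \mathrm{tr}\big(A^T (X + dX)\big)
             = f_1(X) + \langle A, dX \rangle
  &&\Longrightarrow\; \nabla f_1(X) = A \\[4pt]
f_2(X + dX) &= \mathrm{tr}\big((X + dX)^T A (X + dX)\big) \\
            &= f_2(X) + \mathrm{tr}(dX^T A X) + \mathrm{tr}(X^T A \, dX) + O(\|dX\|^2) \\
            &= f_2(X) + \langle (A + A^T) X, dX \rangle + O(\|dX\|^2)
  &&\Longrightarrow\; \nabla f_2(X) = (A + A^T) X
\end{aligned}
```

In both cases the gradient is the matrix that multiplies $dX$ inside the inner product in the first-order term.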

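Linearization of the gradient

$\nabla f(x + dx) = \nabla f(x) + \nabla^2 f(x) \, dx + O(\|dx\|^2)$

gives a multidimensional analogy of the second-order derivative.

The function in the linear term, denoted as $\nabla^2 f(x)$, is called the Hessian of $f$

Ludwig Otto Hesse (1811-1874)

In the standard basis, Hessian is a symmetric matrix of mixed second-order derivatives

$\nabla^2 f(x) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j} \right)_{i,j = 1}^{n}$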


Optimality conditions, bis

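Point $x^*$ is the local minimizer of a $C^2$-function $f$ if $\nabla f(x^*) = 0$ and $d^T \nabla^2 f(x^*) \, d > 0$ for all $d \ne 0$, i.e., the Hessian is a positive definite matrix (denoted $\nabla^2 f(x^*) \succ 0$)

Approximate a function around $x^*$ as a parabola using Taylor expansion

$f(x^* + dx) \approx f(x^*) + \nabla f(x^*)^T dx + \frac{1}{2} dx^T \nabla^2 f(x^*) \, dx$

$\nabla f(x^*) = 0$ guarantees the minimum at $x^*$

$\nabla^2 f(x^*) \succ 0$ guarantees the parabola is convex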

Optimization algorithms
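$x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)}$

where $d^{(k)}$ is the descent direction and $\alpha^{(k)}$ is the step size

Generic optimization algorithm

Start with some $x^{(0)}$, $k = 0$
Repeat
  Determine descent direction $d^{(k)}$
  Choose step size $\alpha^{(k)}$ such that $f(x^{(k)} + \alpha^{(k)} d^{(k)}) < f(x^{(k)})$
  Update iterate $x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)}$
  Increment iteration counter $k \leftarrow k + 1$
Until convergence
Solution $x^* \approx x^{(k)}$

The design choices are the descent direction, the step size, and the stopping criterion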
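The generic loop can be sketched with the descent direction and the step size passed in as functions, so that different choices can be swapped without touching the loop. The function `descend`, its signature, the gradient-norm stopping test, and the test problem are assumptions made for illustration.

```python
import math


def descend(f, grad, x0, direction, step_size, tol=1e-8, max_iter=10000):
    """Generic descent loop: x <- x + alpha * d until the gradient norm is small.

    direction(x, g)       returns a descent direction d (list of floats)
    step_size(f, x, d, g) returns a step size alpha > 0
    """
    x, k = list(x0), 0
    while k < max_iter:
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) <= tol:   # stopping criterion
            break
        d = direction(x, g)                               # descent direction
        alpha = step_size(f, x, d, g)                     # step size
        x = [xi + alpha * di for xi, di in zip(x, d)]     # update iterate
        k += 1                                            # increment counter
    return x, k


# Hypothetical test problem: f(x) = (x0 - 1)^2 + (x1 + 2)^2, minimized at (1, -2).
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]
negative_gradient = lambda x, g: [-gi for gi in g]
fixed_step = lambda f, x, d, g: 0.25   # small enough to decrease f on this problem

solution, iterations = descend(f, grad, [0.0, 0.0], negative_gradient, fixed_step)
print(solution, iterations)
```

A fixed step is used here only to keep the sketch short; in practice the step size would come from a line search.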

Stopping criteria
Near local minimum, $\nabla f(x) \approx 0$ (or equivalently $\|\nabla f(x)\| \approx 0$)

Stop when gradient norm becomes small

Stop when step size becomes small

Stop when relative objective change becomes small

Line search
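Optimal step size can be found by solving a one-dimensional optimization problem

$\alpha^{(k)} = \arg\min_{\alpha \ge 0} f(x^{(k)} + \alpha d^{(k)})$

where $x^{(k)}$ is the current iterate and $d^{(k)}$ the descent direction

One-dimensional optimization algorithms for finding the optimal step size are generically called exact line search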

Armijo [ar-mi-xo] rule

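The function sufficiently decreases if

$f(x + \alpha d) \le f(x) + \sigma \alpha \nabla f(x)^T d$ for some fixed $\sigma \in (0, 1)$

Armijo rule (Larry Armijo, 1966): start with $\alpha = \alpha_0$ and decrease it by multiplying by some $\beta \in (0, 1)$ until the function sufficiently decreases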
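The backtracking procedure fits in a few lines. This is a sketch under assumed parameter names (`alpha0` for the initial step, `beta` for the shrink factor, `sigma` for the sufficient-decrease constant); the default values and the test function are illustrative choices, not prescribed by the rule itself.

```python
def armijo_step(f, x, d, g, alpha0=1.0, beta=0.5, sigma=1e-4, max_halvings=50):
    """Backtracking line search with the Armijo sufficient-decrease test.

    x, d, g are lists: current point, descent direction, gradient at x.
    Shrinks alpha until f(x + alpha d) <= f(x) + sigma * alpha * <g, d>.
    """
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(g, d))  # directional derivative, negative
    alpha = alpha0
    for _ in range(max_halvings):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(trial) <= fx + sigma * alpha * slope:
            break
        alpha *= beta
    return alpha


# Hypothetical example: f(x) = x0^2 + 10 x1^2, searched along the negative gradient.
bowl = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
point = [1.0, 1.0]
gradient = [2.0, 20.0]
step = armijo_step(bowl, point, [-2.0, -20.0], gradient)
print(step)
```

The loop cap `max_halvings` is a safeguard added for the sketch, so that a direction that is not actually a descent direction cannot cause an endless loop.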

Descent direction
How to descend in the fastest way? Go in the direction in which the height lines are the densest

Devils Tower

Topographic map

Steepest descent

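Directional derivative: how much $f$ changes in the direction $d$ (negative for a descent direction)

$f'_d(x) = \lim_{\epsilon \to 0} \frac{f(x + \epsilon d) - f(x)}{\epsilon} = \nabla f(x)^T d$

Find a unit-length direction minimizing directional derivative

$d = \arg\min_{\|d\| = 1} \nabla f(x)^T d$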


Steepest descent

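L2 norm: $d = -\frac{\nabla f(x)}{\|\nabla f(x)\|_2}$

Normalized steepest descent

L1 norm: $d = -\mathrm{sign}\left( \frac{\partial f}{\partial x_i} \right) e_i$ with $i = \arg\max_j \left| \frac{\partial f}{\partial x_j} \right|$

Coordinate descent (coordinate axis in which descent is maximal)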
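Both directions can be computed directly from the gradient. A minimal sketch, assuming a nonzero gradient given as a list of floats; the function names are hypothetical.

```python
import math


def steepest_descent_l2(g):
    """Unit L2-norm direction minimizing <g, d>: the normalized negative gradient."""
    norm = math.sqrt(sum(gi * gi for gi in g))
    return [-gi / norm for gi in g]


def steepest_descent_l1(g):
    """Unit L1-norm direction minimizing <g, d>: a step along the single
    coordinate axis with the largest absolute partial derivative."""
    i = max(range(len(g)), key=lambda j: abs(g[j]))
    d = [0.0] * len(g)
    d[i] = -1.0 if g[i] > 0 else 1.0
    return d


gradient = [3.0, -4.0]
print(steepest_descent_l2(gradient))
print(steepest_descent_l1(gradient))
```

For this gradient the L1 direction moves only along the second coordinate, because that is where the partial derivative is largest in magnitude.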

Steepest descent algorithm

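Start with some $x^{(0)}$, $k = 0$
Repeat
  Compute steepest descent direction $d^{(k)} = -\nabla f(x^{(k)})$
  Choose step size $\alpha^{(k)}$ using line search
  Update iterate $x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)}$
  Increment iteration counter $k \leftarrow k + 1$
Until convergence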


Condition number
Condition number is the ratio of maximal and minimal eigenvalues of the Hessian

$\kappa = \frac{\lambda_{\max}(\nabla^2 f)}{\lambda_{\min}(\nabla^2 f)}$









Problem with large condition number is called ill-conditioned

Steepest descent convergence rate is slow for ill-conditioned problems
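The effect of the condition number can be observed on a two-dimensional quadratic whose Hessian is diag(1, kappa). The sketch below counts steepest descent iterations with exact line search; the test function, the starting point, and the tolerance are assumptions chosen to make the slowdown visible.

```python
import math


def steepest_descent_iterations(kappa, tol=1e-6, max_iter=100000):
    """Count steepest descent iterations on f(x) = 0.5 * (x0^2 + kappa * x1^2).

    The Hessian is diag(1, kappa), so its condition number is kappa.
    For a quadratic the exact line search step is alpha = <g, g> / <g, H g>.
    """
    x = [kappa, 1.0]                      # a deliberately unfavourable start
    for k in range(max_iter):
        g = [x[0], kappa * x[1]]          # gradient H x
        gg = g[0] ** 2 + g[1] ** 2
        if math.sqrt(gg) <= tol:
            return k
        gHg = g[0] ** 2 + kappa * g[1] ** 2
        alpha = gg / gHg                  # exact minimizer along -g
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
    return max_iter


for kappa in (1.0, 10.0, 100.0):
    print(kappa, steepest_descent_iterations(kappa))
```

With kappa = 1 the level sets are circles and a single step reaches the minimum; as kappa grows the iterates zigzag and the iteration count increases.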

Change of coordinates


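Steepest descent in the Q-norm $\|d\|_Q = (d^T Q d)^{1/2}$, with $Q$ symmetric positive definite, is steepest descent in the L2 norm after the change of coordinates $\hat{x} = Q^{1/2} x$

Function: $\hat{f}(\hat{x}) = f(Q^{-1/2} \hat{x})$

Gradient: $\nabla \hat{f}(\hat{x}) = Q^{-1/2} \nabla f(x)$

Descent direction: $\hat{d} = -Q^{-1/2} \nabla f(x)$, i.e. $d = Q^{-1/2} \hat{d} = -Q^{-1} \nabla f(x)$ in the original coordinates

Using Q-norm for steepest descent can be regarded as a change of coordinates, called preconditioning

Preconditioner $Q$ should be chosen to improve the condition number of the Hessian in the proximity of the solution

In the $\hat{x}$ system of coordinates, the Hessian at the solution is $Q^{-1/2} \nabla^2 f(x^*) Q^{-1/2}$, ideally the identity (a dream)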

Newton method as optimal preconditioner

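Best theoretically possible preconditioner $Q = \nabla^2 f(x^*)$, giving descent direction $d = -(\nabla^2 f(x^*))^{-1} \nabla f(x)$

Ideal condition number $\kappa = 1$

Problem: the solution $x^*$ is unknown in advance

Newton direction: use Hessian as a preconditioner at each iteration

$d^{(k)} = -(\nabla^2 f(x^{(k)}))^{-1} \nabla f(x^{(k)})$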

Another derivation of the Newton method

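Approximate the function as a quadratic function using second-order Taylor expansion

$f(x + d) \approx f(x) + \nabla f(x)^T d + \frac{1}{2} d^T \nabla^2 f(x) \, d$ (quadratic function in $d$)

Setting the gradient with respect to $d$ to zero gives $\nabla f(x) + \nabla^2 f(x) \, d = 0$, i.e. the Newton direction $d = -(\nabla^2 f(x))^{-1} \nabla f(x)$

Close to solution the function looks like a quadratic function; the Newton method converges fast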

Newton method
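Start with some $x^{(0)}$, $k = 0$
Repeat
  Compute Newton direction $d^{(k)} = -(\nabla^2 f(x^{(k)}))^{-1} \nabla f(x^{(k)})$
  Choose step size $\alpha^{(k)}$ using line search
  Update iterate $x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)}$
  Increment iteration counter $k \leftarrow k + 1$
Until convergence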
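The steps above can be sketched for a function of two variables, where the Newton system is small enough to solve by Cramer's rule. The function `newton_2d`, the backtracking parameters, and the convex test function are assumptions made for illustration; the Hessian is assumed positive definite at every iterate.

```python
import math


def newton_2d(f, grad, hess, x0, tol=1e-6, max_iter=50, beta=0.5, sigma=1e-4):
    """Newton's method with Armijo backtracking for a function of two variables.

    grad(x) returns [g0, g1]; hess(x) returns [[h00, h01], [h10, h11]].
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.hypot(g[0], g[1]) <= tol:
            break
        (a, b), (c, e) = hess(x)
        det = a * e - b * c
        # Newton direction: solve H d = -g with Cramer's rule.
        d = [(-g[0] * e + b * g[1]) / det, (c * g[0] - a * g[1]) / det]
        slope = g[0] * d[0] + g[1] * d[1]
        fx, alpha = f(x), 1.0
        for _halving in range(40):        # backtracking line search
            trial = [x[0] + alpha * d[0], x[1] + alpha * d[1]]
            if f(trial) <= fx + sigma * alpha * slope:
                break
            alpha *= beta
        x = trial
    return x


# Hypothetical convex test function: f(x) = exp(x0 + x1) + x0^2 + x1^2.
f = lambda x: math.exp(x[0] + x[1]) + x[0] ** 2 + x[1] ** 2
grad = lambda x: [math.exp(x[0] + x[1]) + 2.0 * x[0],
                  math.exp(x[0] + x[1]) + 2.0 * x[1]]


def hess(x):
    s = math.exp(x[0] + x[1])
    return [[s + 2.0, s], [s, s + 2.0]]


minimizer = newton_2d(f, grad, hess, [1.0, 1.0])
print(minimizer)
```

For larger problems the explicit 2x2 solve would be replaced by a factorization or an iterative linear solver.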

Frozen Hessian
Observation: close to the optimum, the Hessian does not change significantly

Reduce the number of Hessian inversions by keeping the Hessian from previous iterations and updating it once in a few iterations

Such a method is called Newton with frozen Hessian

Cholesky factorization
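Decompose the Hessian

$\nabla^2 f = L L^T$

where $L$ is a lower triangular matrix

Andre Louis Cholesky (1875-1918)

Solve the Newton system $L L^T d = -\nabla f$ in two steps

Forward substitution: $L y = -\nabla f$

Backward substitution: $L^T d = y$

Complexity: about $n^3/3$ operations for an $n \times n$ Hessian, better than straightforward matrix inversion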
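The factorization and the two triangular solves can be written out in plain Python. This is a sketch for small dense matrices stored as lists of lists, assuming a symmetric positive definite matrix; the function names and the 2x2 example are hypothetical.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_cholesky(A, b):
    """Solve A x = b via L y = b (forward) and L^T x = y (backward)."""
    n = len(A)
    L = cholesky(A)
    y = [0.0] * n
    for i in range(n):                       # forward substitution
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):             # backward substitution
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Hypothetical Hessian and gradient; the Newton direction solves H d = -g.
H = [[4.0, 2.0], [2.0, 3.0]]
g = [2.0, 1.0]
newton_direction = solve_cholesky(H, [-gi for gi in g])
print(newton_direction)
```

If the square root receives a negative argument, the matrix is not positive definite, which in the Newton setting signals that the Hessian needs to be modified before use.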

Truncated Newton
Solve the Newton system $\nabla^2 f(x) \, d = -\nabla f(x)$ approximately

A few iterations of conjugate gradients or another algorithm for the solution of linear systems can be used

Such a method is called truncated or inexact Newton
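A sketch of the inner solver: conjugate gradients stopped after a fixed number of iterations. The routine only needs products of the Hessian with a vector, supplied through the hypothetical callback `matvec`; the 3x3 matrix and gradient are illustrative values.

```python
def conjugate_gradient(matvec, b, num_iters):
    """Run at most num_iters CG iterations on A x = b, starting from x = 0.

    matvec(v) returns A v for a symmetric positive definite A.
    Stopping early gives an approximate (truncated) solution.
    """
    x = [0.0] * len(b)
    r = list(b)                  # residual b - A x for x = 0
    p = list(r)                  # first search direction
    rr = sum(ri * ri for ri in r)
    for _ in range(num_iters):
        if rr == 0.0:
            break
        Ap = matvec(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Hypothetical Hessian and gradient for the Newton system H d = -g.
H = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
g = [1.0, 2.0, 3.0]
matvec = lambda v: [sum(hij * vj for hij, vj in zip(row, v)) for row in H]
rhs = [-gi for gi in g]

rough_direction = conjugate_gradient(matvec, rhs, 1)   # truncated solve
full_direction = conjugate_gradient(matvec, rhs, 3)    # n iterations for n = 3
print(rough_direction, full_direction)
```

Even a single iteration returns a scaled negative gradient, so the truncated direction is still a descent direction.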

Non-convex optimization
Using convex optimization methods with non-convex functions does not guarantee global convergence!

There is no theoretically guaranteed global optimization, just heuristics

Local minimum
Global minimum

Good initialization


Iterative majorization
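Construct a majorizing function $g(x, x_0)$ satisfying

$g(x_0, x_0) = f(x_0)$

Majorizing inequality: $f(x) \le g(x, x_0)$ for all $x$

$g(x, x_0)$ is convex or easier to optimize w.r.t. $x$

Iterative majorization

Start with some $x^{(0)}$, $k = 0$
Repeat
  Find $x^{(k+1)}$ such that $g(x^{(k+1)}, x^{(k)}) \le g(x^{(k)}, x^{(k)})$
  Update iterate
  Increment iteration counter $k \leftarrow k + 1$
Until convergence
Solution $x^* \approx x^{(k)}$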
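As a concrete instance, consider minimizing a sum of absolute values in one variable, a problem chosen here only for illustration. Each term $|x - a_i|$ is majorized at the current iterate by a quadratic, and the sum of quadratics has a closed-form minimizer. The function name and the guard `eps` are assumptions of this sketch.

```python
def minimize_abs_sum(a, x0, num_iters=100, eps=1e-12):
    """Minimize f(x) = sum_i |x - a_i| by iterative majorization.

    At the current iterate x0 each term is majorized by the quadratic
        |x - a_i| <= (x - a_i)^2 / (2 |x0 - a_i|) + |x0 - a_i| / 2,
    with equality at x = x0. The sum of these quadratics is minimized
    in closed form by a weighted mean with weights 1 / |x0 - a_i|.
    """
    x = x0
    for _ in range(num_iters):
        w = [1.0 / max(abs(x - ai), eps) for ai in a]   # eps avoids division by zero
        x = sum(wi * ai for wi, ai in zip(w, a)) / sum(w)
    return x


cost = lambda x, a: sum(abs(x - ai) for ai in a)
data = [1.0, 2.0, 10.0]
estimate = minimize_abs_sum(data, 5.0)
print(estimate, cost(estimate, data))
```

The original function is not differentiable at the data points, yet every majorizer is a smooth parabola, which is the kind of simplification iterative majorization is meant to provide.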

Constrained optimization


Constrained optimization problems

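Generic constrained minimization problem

$\min_{x \in X} f(x)$ subject to $g_i(x) \le 0$, $i = 1, \dots, m$, and $h_j(x) = 0$, $j = 1, \dots, l$

where

$g_i(x) \le 0$ are inequality constraints

$h_j(x) = 0$ are equality constraints

A subset of the search space in which the constraints hold is called feasible set

A point belonging to the feasible set is called a feasible solution

A minimizer of the unconstrained problem $\min_{x \in X} f(x)$ may be infeasible!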

An example
Equality constraint Inequality constraint

Feasible set

Inequality constraint $g_i(x) \le 0$ is active at point $x$ if $g_i(x) = 0$, inactive otherwise

A point $x$ is regular if the gradients of equality constraints $\nabla h_j(x)$ and of active inequality constraints $\nabla g_i(x)$ are linearly independent

Lagrange multipliers
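Main idea to solve constrained problems: arrange the objective and constraints into a single function

$L(x, \lambda, \mu) = f(x) + \sum_i \mu_i g_i(x) + \sum_j \lambda_j h_j(x)$

and minimize it as an unconstrained problem

$L$ is called Lagrangian; $\mu_i$ (for inequality constraints $g_i$) and $\lambda_j$ (for equality constraints $h_j$) are called Lagrange multipliers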

KKT conditions
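If $x^*$ is a regular point and a local minimum, there exist Lagrange multipliers $\mu_i^*$ (for the inequality constraints $g_i(x) \le 0$) and $\lambda_j^*$ (for the equality constraints $h_j(x) = 0$) such that

$\nabla f(x^*) + \sum_i \mu_i^* \nabla g_i(x^*) + \sum_j \lambda_j^* \nabla h_j(x^*) = 0$

$g_i(x^*) \le 0$ for all $i$ and $h_j(x^*) = 0$ for all $j$

$\mu_i^* \ge 0$ for active constraints and zero for inactive constraints

Known as Karush-Kuhn-Tucker conditions

Necessary but not sufficient!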

KKT conditions
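Sufficient conditions:

If the objective $f$ is convex, the inequality constraints $g_i$ are convex and the equality constraints $h_j$ are affine, and

$\nabla f(x^*) + \sum_i \mu_i^* \nabla g_i(x^*) + \sum_j \lambda_j^* \nabla h_j(x^*) = 0$

$g_i(x^*) \le 0$ for all $i$ and $h_j(x^*) = 0$ for all $j$

$\mu_i^* \ge 0$ for active constraints and zero for inactive constraints

then $x^*$ is the solution of the constrained problem (global constrained minimizer)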


Geometric interpretation
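Consider a simpler problem: $\min_x f(x)$ subject to $h(x) = 0$ (equality constraint)

The gradient of objective and constraint must line up at the solution: $\nabla f(x^*) + \lambda \nabla h(x^*) = 0$ for some multiplier $\lambda$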
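A small worked instance makes the alignment concrete. Assume the problem of minimizing $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 1 = 0$; the problem is chosen here for illustration, and $\lambda$ denotes the multiplier of the equality constraint.

```latex
\begin{aligned}
L(x, \lambda) &= x_1^2 + x_2^2 + \lambda (x_1 + x_2 - 1) \\
\nabla_x L = 0 &: \quad 2 x_1 + \lambda = 0, \qquad 2 x_2 + \lambda = 0
   \;\Longrightarrow\; x_1 = x_2 = -\tfrac{\lambda}{2} \\
h(x) = 0 &: \quad x_1 + x_2 = 1 \;\Longrightarrow\; \lambda = -1, \qquad
   x^* = \left( \tfrac{1}{2}, \tfrac{1}{2} \right) \\
\nabla f(x^*) &= (1, 1)^T = -\lambda \, \nabla h(x^*), \qquad \nabla h(x^*) = (1, 1)^T
\end{aligned}
```

At the solution the two gradients are parallel: the level circle of the objective touches the constraint line.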

Penalty methods
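Define a penalty aggregate

$F_p(x) = f(x) + \sum_i \varphi_p(g_i(x)) + \sum_j \psi_p(h_j(x))$

where $\varphi_p$ (inequality penalty) and $\psi_p$ (equality penalty) are parametric penalty functions

For larger values of the parameter $p$, the penalty on the constraint violation is stronger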

Penalty methods

Inequality penalty

Equality penalty

Penalty methods
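Start with some $x^{(0)}$ and initial value of the penalty parameter $p^{(0)}$, $k = 0$
Repeat
  Find $x^{(k+1)}$ minimizing the penalty aggregate for the parameter value $p^{(k)}$ by solving an unconstrained optimization problem initialized with $x^{(k)}$
  Set a larger penalty parameter $p^{(k+1)} > p^{(k)}$
  Update $k \leftarrow k + 1$
Until convergence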
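The outer loop can be sketched on a one-dimensional problem: minimize $(x - 2)^2$ subject to $x \le 1$. The quadratic penalty $p \max(0, x - 1)^2$, the growth factor for $p$, and the fixed-step inner solver are all assumptions made for this illustration.

```python
def penalty_method(p0=1.0, growth=10.0, outer_iters=7, tol=1e-8):
    """Quadratic-penalty sketch for: minimize (x - 2)^2 subject to x <= 1.

    Penalty aggregate: F_p(x) = (x - 2)^2 + p * max(0, x - 1)^2.
    Each inner problem is solved by gradient descent with the fixed
    step 1 / (2 + 2p), warm-started from the previous solution.
    """
    x, p = 0.0, p0
    history = []
    for _outer in range(outer_iters):
        step = 1.0 / (2.0 + 2.0 * p)      # 2 + 2p bounds the curvature of F_p
        for _inner in range(100000):
            grad = 2.0 * (x - 2.0) + 2.0 * p * max(0.0, x - 1.0)
            if abs(grad) <= tol:
                break
            x -= step * grad
        history.append((p, x))
        p *= growth                       # strengthen the penalty
    return x, history


solution, history = penalty_method()
for p, x in history:
    print(p, x)
```

The iterates approach the constrained minimizer $x = 1$ from the infeasible side, with a constraint violation that shrinks as the penalty parameter grows.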


