In mathematics mirror descent descent is an iterative optimization algorithm for finding a local minimum of a differentiable function.
It generalizes algorithms such as gradient descent and multiplicative weights.
History
Mirror descent was originally proposed by Nemirovski and Yudin in 1983.[1]
Motivation
In gradient descent applied to a differentiable function , one starts with a guess for a local minimum of , and considers the sequence such that
(assuming a constant learning rate)
We have a monotonic sequence
so, hopefully, the sequence converges to the desired local minimum.
This can be reformulated by noting that
In other words, minimizes the first-order approximation to at with added proximity term .
This Euclidean distance term is a particular example of a Bregman distance. Using other Bregman distances will yield other algorithms such as Hedge which may be more suited to optimization over particular geometries.
Formulation
We are given convex function to optimize over convex set , and given some norm on .
We are also given differentiable convex function , -strongly convex with respect to the given norm. This is called the distance-generating function, and its gradient is known as the mirror map.
Starting from initial , in each iteration of Mirror Descent:
- Map to the dual space:
- Update in the dual space using a gradient step:
- Map pack to the primal space:
- Project back to the feasible region : , where is the Bregman divergence
Extensions
Mirror descent in the online optimization setting is known as Online Mirror Descent (OMD).[2]
See algo
References
- ^ Arkadi Nemirovsky and David Yudin. Problem Complexity and Method Efficiency in Optimization. John Wiley & Sons, 1983
- ^ Fang, Huang; Harvey, Nicholas J. A.; Portella, Victor S.; Friedlander, Michael P. (2021-09-03). "Online mirror descent and dual averaging: keeping pace in the dynamic case". arXiv:2006.02585 [cs, math, stat].