On the Last Iterate Convergence of Momentum Methods

Li, Xiaoyu; Liu, Mingrui; Orabona, Francesco

arXiv:2102.07002 (cs)

[Submitted on 13 Feb 2021 (v1), last revised 24 Jul 2022 (this version, v3)]

Abstract: SGD with Momentum (SGDM) is a widely used family of algorithms for large-scale optimization of machine learning problems. Yet, when optimizing generic convex functions, no advantage is known for any SGDM algorithm over plain SGD. Moreover, even the most recent results require changes to the SGDM algorithms, like averaging of the iterates and a projection onto a bounded domain, which are rarely used in practice. In this paper, we focus on the convergence rate of the last iterate of SGDM. For the first time, we prove that for any constant momentum factor, there exists a Lipschitz and convex function for which the last iterate of SGDM suffers from a suboptimal convergence rate of $\Omega(\frac{\ln T}{\sqrt{T}})$ after $T$ iterations. Based on this fact, we study a class of (both adaptive and non-adaptive) Follow-The-Regularized-Leader-based SGDM algorithms with increasing momentum and shrinking updates. For these algorithms, we show that the last iterate has the optimal convergence rate $O(\frac{1}{\sqrt{T}})$ for unconstrained convex stochastic optimization problems, without requiring projections onto bounded domains or knowledge of $T$. Further, we show a variety of results for FTRL-based SGDM when used with adaptive stepsizes. Empirical results are shown as well.
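To make the setting concrete, here is a minimal sketch of constant-momentum SGDM that returns the last iterate, with no averaging and no projection. It assumes the common heavy-ball form $m_t = \beta m_{t-1} + g_t$, $x_{t+1} = x_t - \eta_t m_t$ with $\eta_t \propto 1/\sqrt{t}$; the paper's exact update rule and its FTRL-based variants may differ. The objective $f(x) = |x|$, the uniform noise model, and every function and parameter name below are hypothetical choices made for illustration only.

```python
import random


def sgdm_last_iterate(grad, x0, steps, lr, beta, seed=0):
    """Run 1-D heavy-ball SGDM and return the last iterate (no averaging)."""
    rng = random.Random(seed)
    x, m = x0, 0.0
    for t in range(1, steps + 1):
        g = grad(x, rng)       # stochastic (sub)gradient at the current point
        m = beta * m + g       # constant momentum factor beta (assumed form)
        x = x - lr(t) * m      # unconstrained step: no projection
    return x


def noisy_abs_subgradient(x, rng):
    """Subgradient of f(x) = |x| plus zero-mean noise (illustrative only)."""
    sign = (x > 0) - (x < 0)   # sign(0) is taken to be 0
    return sign + rng.uniform(-0.5, 0.5)


T = 10_000
x_T = sgdm_last_iterate(
    noisy_abs_subgradient,
    x0=5.0,
    steps=T,
    lr=lambda t: 0.1 / t ** 0.5,  # stepsize shrinking as 1/sqrt(t)
    beta=0.9,
)
print("suboptimality of last iterate:", abs(x_T))
```

Since the minimizer of $|x|$ is $0$, the printed value is the suboptimality $f(x_T) - f(x^\star)$ of the last iterate. A single run on a toy function does not exhibit the lower bound from the abstract; it only shows which quantity the bound refers to.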

Comments:	Differences with ALT'22 camera ready: Clarified the statement of the lower bound
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:2102.07002 [cs.LG]
	(or arXiv:2102.07002v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2102.07002

Submission history

From: Francesco Orabona
[v1] Sat, 13 Feb 2021 21:16:16 UTC (522 KB)
[v2] Fri, 4 Mar 2022 19:31:46 UTC (748 KB)
[v3] Sun, 24 Jul 2022 21:41:05 UTC (461 KB)
