Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced

Du, Simon S.; Hu, Wei; Lee, Jason D.

arXiv:1806.00900 (cs)

[Submitted on 4 Jun 2018 (v1), last revised 31 Oct 2018 (this version, v2)]

Abstract: We study the implicit regularization imposed by gradient descent for learning multi-layer homogeneous functions including feed-forward fully connected and convolutional deep neural networks with linear, ReLU or Leaky ReLU activation. We rigorously prove that gradient flow (i.e., gradient descent with infinitesimal step size) effectively enforces the differences between squared norms across different layers to remain invariant without any explicit regularization. This result implies that if the weights are initially small, gradient flow automatically balances the magnitudes of all layers. Using a discretization argument, we analyze gradient descent with positive step size for the non-convex low-rank asymmetric matrix factorization problem without any regularization. Inspired by our findings for gradient flow, we prove that gradient descent with step sizes $\eta_t = O\left(t^{-\left( \frac12+\delta\right)} \right)$ ($0<\delta\le\frac12$) automatically balances two low-rank factors and converges to a bounded global optimum. Furthermore, for rank-$1$ asymmetric matrix factorization we give a finer analysis showing gradient descent with constant step size converges to the global minimum at a globally linear rate. We believe that the idea of examining the invariance imposed by first order algorithms in learning homogeneous models could serve as a fundamental building block for studying optimization for learning deep models.
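
The balancing effect described in the abstract can be checked numerically on a toy asymmetric matrix factorization. The following is a minimal sketch, not the authors' code or experimental setup: the function name `balance_demo`, the dimensions, the target's singular values, the initialization scale, and the step size are all assumptions chosen for illustration. It uses a small constant step size as a rough stand-in for gradient flow, rather than the decaying schedule $\eta_t$ analyzed in the paper, and tracks the gap $\|U\|_F^2 - \|V\|_F^2$ between the two factors.

```python
import numpy as np


def balance_demo(d_out=5, d_in=4, singular_values=(3.0, 2.0),
                 init_scale=1e-3, step_size=1e-3, num_steps=20000, seed=0):
    """Gradient descent on 0.5 * ||U V^T - M||_F^2 from a small random start.

    Returns the squared-norm gap ||U||_F^2 - ||V||_F^2 before and after
    training, together with the final loss and the final squared norms.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    rank = len(singular_values)

    # Hypothetical low-rank target M = P diag(s) Q^T with orthonormal P, Q.
    P, _ = np.linalg.qr(rng.standard_normal((d_out, rank)))
    Q, _ = np.linalg.qr(rng.standard_normal((d_in, rank)))
    M = P @ np.diag(singular_values) @ Q.T

    # Small initialization, so the initial gap is close to zero.
    U = init_scale * rng.standard_normal((d_out, rank))
    V = init_scale * rng.standard_normal((d_in, rank))
    init_gap = np.sum(U ** 2) - np.sum(V ** 2)

    for _ in range(num_steps):
        residual = U @ V.T - M      # E = U V^T - M
        grad_U = residual @ V       # gradient of the loss w.r.t. U
        grad_V = residual.T @ U     # gradient of the loss w.r.t. V
        # Simultaneous update of both factors, no explicit regularizer.
        U, V = U - step_size * grad_U, V - step_size * grad_V

    residual = U @ V.T - M
    return {
        "init_gap": init_gap,
        "final_gap": np.sum(U ** 2) - np.sum(V ** 2),
        "loss": 0.5 * np.sum(residual ** 2),
        "norm_U_sq": np.sum(U ** 2),
        "norm_V_sq": np.sum(V ** 2),
    }


if __name__ == "__main__":
    for key, value in balance_demo().items():
        print(f"{key}: {value:.6g}")
```

Under gradient flow the gap would stay exactly constant; with a positive step size each update perturbs it by a term of order `step_size ** 2`, so in this sketch one would expect the gap to drift only slightly while both squared norms grow by several orders of magnitude.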

Comments:	In NIPS 2018
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:1806.00900 [cs.LG]
	(or arXiv:1806.00900v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1806.00900

Submission history

From: Wei Hu
[v1] Mon, 4 Jun 2018 00:07:38 UTC (593 KB)
[v2] Wed, 31 Oct 2018 17:02:49 UTC (482 KB)
