A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

E, Weinan; Ma, Chao; Wu, Lei

doi:10.1007/s11425-019-1628-5

arXiv:1904.04326 (cs)

[Submitted on 8 Apr 2019 (v1), last revised 21 Feb 2020 (this version, v2)]

Abstract: A fairly comprehensive analysis is presented of the gradient descent dynamics for training two-layer neural network models when the parameters in both layers are updated. General initialization schemes as well as general regimes for the network width and training data size are considered. In the over-parametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels. In addition, it is proved that throughout the training process the functions represented by the neural network model are uniformly close to those of a kernel method. For general values of the network width and training data size, sharp estimates of the generalization error are established for target functions in the appropriate reproducing kernel Hilbert space.
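
The two training setups compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. The ReLU activation, the unit-sphere initialization of the inner weights, the zero initialization of the outer weights, the unscaled sum over neurons, the step size, and all function names are assumptions chosen for illustration. With `train_inner=True` both layers are updated (the two-layer network case); with `train_inner=False` the inner weights stay frozen, which is the random feature model.

```python
import numpy as np

def init_params(m, d, rng):
    # Assumed initialization: inner weights uniform on the unit sphere,
    # outer weights zero. The abstract only says "general initialization schemes".
    B = rng.normal(size=(m, d))
    B /= np.linalg.norm(B, axis=1, keepdims=True)
    a = np.zeros(m)
    return a, B

def predict(a, B, X):
    # f(x) = sum_k a_k * relu(b_k . x); ReLU is an assumption.
    return np.maximum(X @ B.T, 0.0) @ a

def loss(a, B, X, y):
    # Empirical squared loss (1 / 2n) * sum_i (f(x_i) - y_i)^2.
    r = predict(a, B, X) - y
    return 0.5 * np.mean(r ** 2)

def gd_step(a, B, X, y, lr, train_inner=True):
    n = X.shape[0]
    pre = X @ B.T                  # pre-activations, shape (n, m)
    act = np.maximum(pre, 0.0)     # ReLU features, shape (n, m)
    r = act @ a - y                # residuals, shape (n,)
    grad_a = act.T @ r / n         # gradient w.r.t. outer weights
    if train_inner:
        # Gradient w.r.t. b_k: (1/n) * sum_i r_i * a_k * 1[b_k . x_i > 0] * x_i
        grad_B = ((r[:, None] * (pre > 0)) * a[None, :]).T @ X / n
        B = B - lr * grad_B
    a = a - lr * grad_a
    return a, B

def train(X, y, m, steps, lr, train_inner, seed=0):
    rng = np.random.default_rng(seed)
    a, B = init_params(m, X.shape[1], rng)
    for _ in range(steps):
        a, B = gd_step(a, B, X, y, lr, train_inner)
    return a, B

if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    X = data_rng.normal(size=(20, 5))
    X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
    y = data_rng.normal(size=20)                    # arbitrary labels
    m = 200                                         # width much larger than n
    for name, flag in [("two-layer network", True), ("random features", False)]:
        a, B = train(X, y, m, steps=500, lr=1.0 / m, train_inner=flag)
        print(name, "training loss:", loss(a, B, X, y))
```

The step size `1.0 / m` is a conservative choice: with unit-norm inputs and unit-norm inner weights every feature is bounded by 1, so the random feature loss, which is quadratic in the outer weights, decreases monotonically at this step size. The sketch shows only the mechanics of the two update rules and does not reproduce the convergence rates or generalization bounds stated in the abstract.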

Comments:	Published version
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
MSC classes:	41A99, 49M99
Cite as:	arXiv:1904.04326 [cs.LG]
	(or arXiv:1904.04326v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1904.04326
Journal reference:	Science China Mathematics (2020)
Related DOI:	https://doi.org/10.1007/s11425-019-1628-5

Submission history

From: Lei Wu
[v1] Mon, 8 Apr 2019 19:43:09 UTC (568 KB)
[v2] Fri, 21 Feb 2020 04:48:07 UTC (567 KB)
