Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks

Domingo-Enrich, Carles; Bietti, Alberto; Gabrié, Marylou; Bruna, Joan; Vanden-Eijnden, Eric

arXiv:2107.05134 (cs)

[Submitted on 11 Jul 2021 (v1), last revised 15 Feb 2022 (this version, v2)]

Abstract: Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation. This approach becomes challenging in generic situations where the trained energy is non-convex, due to the need to sample the Gibbs distribution associated with this energy. Using general Fenchel duality results, we derive variational principles dual to maximum likelihood EBMs with shallow overparametrized neural network energies, both in the feature-learning and lazy linearized regimes. In the feature-learning regime, this dual formulation justifies using a two time-scale gradient ascent-descent (GDA) training algorithm in which one updates concurrently the particles in the sample space and the neurons in the parameter space of the energy. We also consider a variant of this algorithm in which the particles are sometimes restarted at random samples drawn from the data set, and show that performing these restarts at every iteration step corresponds to score matching training. These results are illustrated in simple numerical experiments, which indicate that GDA performs best when features and particles are updated using similar time scales.
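To make the two time-scale idea concrete, the following is a minimal NumPy sketch of concurrent particle and neuron updates with optional restarts from the data. It is an illustration under stated assumptions, not the authors' implementation: the tanh activation, the quadratic confinement term in the energy, the Langevin form of the particle update, the scaling of the parameter step by the width m, and all names (`energy_and_grads`, `gda_train`, `lr_params`, `lr_particles`, `restart_prob`) and default values are hypothetical choices made here for the example.

```python
import numpy as np


def energy_and_grads(x, W, a):
    """Shallow-network energy E(x) = 0.5*|x|^2 + (1/m) * sum_j a_j * tanh(<W_j, x>).

    The quadratic term is an assumed confinement that keeps the Gibbs
    density exp(-E) normalizable on R^d. For a batch x of shape (n, d),
    returns E (n,), dE/dx (n, d), and batch-averaged dE/dW (m, d), dE/da (m,).
    """
    n = x.shape[0]
    m = W.shape[0]
    h = np.tanh(x @ W.T)              # neuron activations, shape (n, m)
    dh = 1.0 - h ** 2                 # derivative of tanh
    E = 0.5 * np.sum(x ** 2, axis=1) + h @ a / m
    dE_dx = x + (dh * a) @ W / m
    dE_dW = (dh * a).T @ x / (m * n)
    dE_da = h.mean(axis=0) / m
    return E, dE_dx, dE_dW, dE_da


def gda_train(data, m=64, n_particles=256, steps=500,
              lr_params=0.05, lr_particles=0.05, restart_prob=0.0, seed=0):
    """Concurrent updates of particles (sample space) and neurons (parameter space).

    The ratio lr_particles / lr_params sets the separation of time scales.
    """
    rng = np.random.default_rng(seed)
    n, d = data.shape
    W = rng.normal(size=(m, d))
    a = rng.normal(size=m)
    X = rng.normal(size=(n_particles, d))   # particles approximating model samples

    for _ in range(steps):
        # Particle step: one unadjusted Langevin move on the current energy.
        _, gx, _, _ = energy_and_grads(X, W, a)
        noise = rng.normal(size=X.shape)
        X = X - lr_particles * gx + np.sqrt(2.0 * lr_particles) * noise

        # Optional restarts: replace some particles by random data points.
        if restart_prob > 0.0:
            mask = rng.random(n_particles) < restart_prob
            X[mask] = data[rng.integers(n, size=int(mask.sum()))]

        # Neuron step: stochastic maximum-likelihood gradient, i.e. the
        # data average of the parameter gradient minus the particle average.
        _, _, gW_data, ga_data = energy_and_grads(data, W, a)
        _, _, gW_model, ga_model = energy_and_grads(X, W, a)
        # Multiplying by m compensates the 1/m factor in the energy (assumed scaling).
        W = W - lr_params * m * (gW_data - gW_model)
        a = a - lr_params * m * (ga_data - ga_model)

    return W, a, X
```

Calling `gda_train(data)` with `data` of shape (n, d) returns the trained neurons and the final particles. Setting `restart_prob=1.0` restarts every particle at a data point at each iteration, which is the regime the abstract relates to score matching; comparable values of `lr_particles` and `lr_params` correspond to the similar-time-scale setting that the abstract reports as performing best.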

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:2107.05134 [cs.LG]
	(or arXiv:2107.05134v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2107.05134

Submission history

From: Carles Domingo-Enrich
[v1] Sun, 11 Jul 2021 21:43:18 UTC (5,094 KB)
[v2] Tue, 15 Feb 2022 15:35:39 UTC (9,345 KB)
