Optimizing Variational Representations of Divergences and Accelerating their Statistical Estimation

Birrell, Jeremiah; Katsoulakis, Markos A.; Pantazis, Yannis

arXiv:2006.08781 (cs)

[Submitted on 15 Jun 2020 (v1), last revised 23 Mar 2022 (this version, v3)]

Abstract: Variational representations of divergences and distances between high-dimensional probability distributions offer significant theoretical insights and practical advantages in numerous research areas. Recently, they have gained popularity in machine learning as a tractable and scalable approach for training probabilistic models and for statistically differentiating between data distributions. Their advantages include: 1) They can be estimated from data as statistical averages. 2) Such representations can leverage the ability of neural networks to efficiently approximate optimal solutions in function spaces. However, a systematic and practical approach to improving the tightness of such variational formulas, and accordingly accelerating statistical learning and estimation from data, is currently lacking. Here we develop such a methodology for building new, tighter variational representations of divergences. Our approach relies on improved objective functionals constructed via an auxiliary optimization problem. Furthermore, the calculation of the functional Hessian of objective functionals unveils the local curvature differences around the common optimal variational solution; this quantifies and orders the tightness gains between different variational representations. Finally, numerical simulations utilizing neural network optimization demonstrate that tighter representations can result in significantly faster learning and more accurate estimation of divergences on both synthetic and real datasets (of more than 1000 dimensions), often accelerated by nearly an order of magnitude.
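As a rough illustration of what "tighter" can mean here, the sketch below compares two standard variational lower bounds on the KL divergence, both estimated as sample averages: the Legendre-transform bound E_P[g] - E_Q[exp(g - 1)] and the Donsker-Varadhan bound E_P[g] - log E_Q[exp(g)]. Maximizing the first over constant shifts g + c is a simple auxiliary optimization whose value is the second. This is not the authors' code. The Gaussian example, the test function `g`, and all function names are assumptions chosen for illustration.

```python
import math
import random


def mean(values):
    """Plain sample average."""
    values = list(values)
    return sum(values) / len(values)


def legendre_objective(g, xs_p, xs_q):
    """Sample estimate of E_P[g] - E_Q[exp(g - 1)]."""
    return mean(g(x) for x in xs_p) - mean(math.exp(g(x) - 1.0) for x in xs_q)


def donsker_varadhan_objective(g, xs_p, xs_q):
    """Sample estimate of E_P[g] - log E_Q[exp(g)]."""
    return mean(g(x) for x in xs_p) - math.log(mean(math.exp(g(x)) for x in xs_q))


# Illustrative setup (an assumption): P = N(mu, 1), Q = N(0, 1),
# for which KL(P || Q) = mu^2 / 2 in closed form.
random.seed(0)
mu = 1.0
n = 20000
xs_p = [random.gauss(mu, 1.0) for _ in range(n)]
xs_q = [random.gauss(0.0, 1.0) for _ in range(n)]
true_kl = mu ** 2 / 2.0


def g(x):
    """A deliberately suboptimal test function."""
    return 0.5 * x


lt = legendre_objective(g, xs_p, xs_q)
dv = donsker_varadhan_objective(g, xs_p, xs_q)

# Auxiliary optimization over a constant shift c: the maximizer of the
# Legendre objective at g + c is c* = 1 - log E_Q[exp(g)].
c_star = 1.0 - math.log(mean(math.exp(g(x)) for x in xs_q))
lt_shifted = legendre_objective(lambda x: g(x) + c_star, xs_p, xs_q)

print(f"true KL                  : {true_kl:.4f}")
print(f"Legendre bound at g      : {lt:.4f}")
print(f"Donsker-Varadhan at g    : {dv:.4f}")
print(f"Legendre bound at g + c* : {lt_shifted:.4f}")
```

For the same suboptimal `g`, the Donsker-Varadhan value is never below the Legendre value, and the shifted Legendre value coincides with it. A larger objective at a non-optimal `g` is the sense in which one representation is tighter than another in this toy case.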

Comments:	48 pages, 6 figures
Subjects:	Machine Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
Cite as:	arXiv:2006.08781 [cs.LG]
	(or arXiv:2006.08781v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2006.08781

Submission history

From: Jeremiah Birrell [view email]
[v1] Mon, 15 Jun 2020 21:32:21 UTC (767 KB)
[v2] Wed, 5 May 2021 23:11:14 UTC (390 KB)
[v3] Wed, 23 Mar 2022 18:32:59 UTC (918 KB)
