How Tight Can PAC-Bayes be in the Small Data Regime?

Foong, Andrew Y. K.; Bruinsma, Wessel P.; Burt, David R.; Turner, Richard E.

arXiv:2106.03542 (stat)

[Submitted on 7 Jun 2021 (v1), last revised 13 Jan 2022 (this version, v4)]

Abstract: In this paper, we investigate the question: Given a small number of datapoints, for example N = 30, how tight can PAC-Bayes and test set bounds be made? For such small datasets, test set bounds adversely affect generalisation performance by withholding data from the training procedure. In this setting, PAC-Bayes bounds are especially attractive, due to their ability to use all the data to simultaneously learn a posterior and bound its generalisation risk. We focus on the case of i.i.d. data with a bounded loss and consider the generic PAC-Bayes theorem of Germain et al. While their theorem is known to recover many existing PAC-Bayes bounds, it is unclear what the tightest bound derivable from their framework is. For a fixed learning algorithm and dataset, we show that the tightest possible bound coincides with a bound considered by Catoni; and, in the more natural case of distributions over datasets, we establish a lower bound on the best bound achievable in expectation. Interestingly, this lower bound recovers the Chernoff test set bound if the posterior is equal to the prior. Moreover, to illustrate how tight these bounds can be, we study synthetic one-dimensional classification tasks in which it is feasible to meta-learn both the prior and the form of the bound to numerically optimise for the tightest bounds possible. We find that in this simple, controlled scenario, PAC-Bayes bounds are competitive with comparable, commonly used Chernoff test set bounds. However, the sharpest test set bounds still lead to better guarantees on the generalisation error than the PAC-Bayes bounds we consider.
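To give a sense of the scale of these guarantees at N = 30, the following is a minimal sketch of two standard bounds for a loss in [0, 1]. It is not the paper's code or its new results. It assumes the usual relative-entropy form of the Chernoff test set bound, kl(test error || risk) <= log(1/delta)/n, and the well-known PAC-Bayes-kl bound in Maurer's form, kl(train error || risk) <= (KL(Q||P) + log(2*sqrt(n)/delta))/n, valid for n >= 8. The function names and the bisection tolerance are illustrative choices.

```python
import math


def binary_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p), in nats."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)  # avoid log(0) at the endpoints
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total


def kl_inverse_upper(q, c, tol=1e-10):
    """Largest p in [q, 1] with binary_kl(q, p) <= c, found by bisection."""
    lo, hi = q, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_kl(q, mid) <= c:
            lo = mid  # mid still satisfies the constraint, move up
        else:
            hi = mid
    return lo


def chernoff_test_bound(test_error, n_test, delta):
    """Upper bound on the risk from an error rate on n_test held-out points."""
    return kl_inverse_upper(test_error, math.log(1 / delta) / n_test)


def pac_bayes_kl_bound(train_error, kl_qp, n, delta):
    """PAC-Bayes-kl bound (Maurer's form, assumed here) using all n points."""
    c = (kl_qp + math.log(2 * math.sqrt(n) / delta)) / n
    return kl_inverse_upper(train_error, c)


if __name__ == "__main__":
    # Zero observed errors on 30 points, 95% confidence.
    print(chernoff_test_bound(0.0, 30, 0.05))
    # Same data with posterior equal to prior, so KL(Q||P) = 0.
    print(pac_bayes_kl_bound(0.0, 0.0, 30, 0.05))
```

With the posterior equal to the prior, this particular PAC-Bayes-kl bound differs from the Chernoff bound only by the extra log(2*sqrt(n)) term, which is one way to see why comparing the two is natural in the small-data setting.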

Comments:	Published at Neural Information Processing Systems 2021
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as:	arXiv:2106.03542 [stat.ML]
	(or arXiv:2106.03542v4 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2106.03542

Submission history

From: Andrew Y. K. Foong
[v1] Mon, 7 Jun 2021 12:11:32 UTC (4,272 KB)
[v2] Wed, 27 Oct 2021 16:46:46 UTC (4,287 KB)
[v3] Wed, 10 Nov 2021 16:17:13 UTC (4,287 KB)
[v4] Thu, 13 Jan 2022 12:57:48 UTC (4,294 KB)
