Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary $\beta$-Mixing Processes

Ralaivola, Liva; Szafranski, Marie; Stempfel, Guillaume

arXiv:0909.1933 (cs)

[Submitted on 10 Sep 2009 (v1), last revised 4 Jun 2010 (this version, v2)]

Title: Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary $\beta$-Mixing Processes

Authors: Liva Ralaivola (LIF), Marie Szafranski (IBISC), Guillaume Stempfel (LIF)

Abstract: PAC-Bayes bounds are among the most accurate generalization bounds for classifiers learned from independently and identically distributed (IID) data, particularly for margin classifiers: recent contributions have shown how practical these bounds can be, either to perform model selection (Ambroladze et al., 2007) or to directly guide the learning of linear classifiers (Germain et al., 2009). However, there are many practical situations where the training data show some dependencies and the traditional IID assumption does not hold. Stating generalization bounds for such frameworks is therefore of the utmost interest, from both theoretical and practical standpoints. In this work, we propose what are, to the best of our knowledge, the first PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies. Our results rest on decomposing a so-called dependency graph, which encodes the dependencies within the data, into sets of independent data by means of graph fractional covers. Our bounds are very general, since an upper bound on the fractional chromatic number of the dependency graph is sufficient to obtain new PAC-Bayes bounds for specific settings. We show how our results can be used to derive bounds for ranking statistics (such as AUC) and for classifiers trained on data distributed according to a stationary $\beta$-mixing process. Along the way, we show how our approach seamlessly allows us to deal with U-processes. As a side note, we also provide a PAC-Bayes generalization bound for classifiers learned on data from stationary $\varphi$-mixing distributions.
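
The decomposition idea in the abstract can be illustrated on a toy ranking setup. The sketch below is not taken from the paper: it assumes that each data point is a (positive, negative) index pair, as in AUC-style statistics, and that two pairs are dependent whenever they share an example. The function names `pair_dependency_coloring` and `is_proper` are hypothetical. A proper coloring of this dependency graph splits the pairs into classes of mutually independent pairs, and the number of colors used is an upper bound on the fractional chromatic number, since the fractional chromatic number of a graph never exceeds its chromatic number.

```python
from itertools import combinations, product


def pair_dependency_coloring(n_pos, n_neg):
    """Assign a color to every (positive, negative) index pair.

    Assumption for this sketch: two pairs are dependent iff they share
    a positive index or a negative index. The diagonal rule
    (i + j) mod max(n_pos, n_neg) gives distinct colors to any two
    pairs that share an index.
    """
    n_colors = max(n_pos, n_neg)
    return {
        (i, j): (i + j) % n_colors
        for i, j in product(range(n_pos), range(n_neg))
    }


def is_proper(coloring):
    """Check that no two dependent pairs received the same color."""
    for a, b in combinations(coloring, 2):
        shares_example = a[0] == b[0] or a[1] == b[1]
        if shares_example and coloring[a] == coloring[b]:
            return False
    return True


# Toy instance: 3 positive and 4 negative examples, hence 12 pairs.
coloring = pair_dependency_coloring(3, 4)
n_colors_used = len(set(coloring.values()))
```

Each color class here contains only pairs with no example in common, so under the stated assumption it can be treated as a set of independent data points, which is the kind of decomposition the chromatic bounds build on.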

Comments:	Long version of the AISTATS 09 paper: this http URL
Subjects:	Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:0909.1933 [cs.LG]
	(or arXiv:0909.1933v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.0909.1933

Submission history

From: Liva Ralaivola [via CCSD proxy]
[v1] Thu, 10 Sep 2009 11:51:10 UTC (99 KB)
[v2] Fri, 4 Jun 2010 08:43:38 UTC (128 KB)
