Nested bandits

Martin, Matthieu; Mertikopoulos, Panayotis; Rahier, Thibaud; Zenati, Houssam

arXiv:2206.09348 (cs)

[Submitted on 19 Jun 2022]

Abstract: In many online decision processes, the optimizing agent is called to choose between large numbers of alternatives with many inherent similarities; in turn, these similarities imply closely correlated losses that may confound standard discrete choice models and bandit algorithms. We study this question in the context of nested bandits, a class of adversarial multi-armed bandit problems where the learner seeks to minimize their regret in the presence of a large number of distinct alternatives with a hierarchy of embedded (non-combinatorial) similarities. In this setting, optimal algorithms based on the exponential weights blueprint (like Hedge, EXP3, and their variants) may incur significant regret because they tend to spend excessive amounts of time exploring irrelevant alternatives with similar, suboptimal costs. To account for this, we propose a nested exponential weights (NEW) algorithm that performs a layered exploration of the learner's set of alternatives based on a nested, step-by-step selection method. In so doing, we obtain a series of tight bounds for the learner's regret showing that online learning problems with a high degree of similarity between alternatives can be resolved efficiently, without a red bus / blue bus paradox occurring.
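The red bus / blue bus effect mentioned in the abstract can be illustrated with a small sketch. The code below is not the paper's NEW algorithm; it only contrasts a flat exponential-weights (logit) choice with the textbook two-level nested logit rule, which picks a class of alternatives first and then an alternative within that class. The function names, the two-level structure, and the temperature parameters `mu_top` and `mu_within` are assumptions made for illustration.

```python
import math

def softmax(scores, temperature):
    """Logit (exponential-weights) choice probabilities over a dict of scores."""
    m = max(scores.values())  # subtract the max for numerical stability
    weights = {k: math.exp((v - m) / temperature) for k, v in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def log_sum_exp(values, temperature):
    """Aggregate score of a class: temperature * log(sum(exp(v / temperature)))."""
    m = max(values)
    return m + temperature * math.log(
        sum(math.exp((v - m) / temperature) for v in values)
    )

def nested_choice(nests, scores, mu_top, mu_within):
    """Two-level nested logit: choose a class, then an alternative inside it.

    Hypothetical helper for illustration; assumes mu_top >= mu_within > 0.
    """
    # Score each class by the aggregate of its members' scores.
    nest_scores = {
        name: log_sum_exp([scores[a] for a in members], mu_within)
        for name, members in nests.items()
    }
    nest_probs = softmax(nest_scores, mu_top)
    probs = {}
    for name, members in nests.items():
        within = softmax({a: scores[a] for a in members}, mu_within)
        for a, p in within.items():
            probs[a] = nest_probs[name] * p
    return probs

# Three alternatives with identical scores; the two buses are near-duplicates.
scores = {"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0}
nests = {"car": ["car"], "bus": ["red_bus", "blue_bus"]}

flat = softmax(scores, 1.0)                      # treats all three as unrelated
nested = nested_choice(nests, scores, 1.0, 0.01) # treats the buses as one class
print(flat)
print(nested)
```

Under these assumptions, the flat rule gives each alternative probability 1/3, so two thirds of the mass goes to what is effectively a single option. With a small within-class temperature, the nested rule puts close to 1/2 on the car and close to 1/4 on each bus; setting `mu_within` equal to `mu_top` recovers the flat rule.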

Comments:	35 pages, 14 figures; to appear in ICML 2022
Subjects:	Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT); Optimization and Control (math.OC)
MSC classes:	Primary 68Q32, secondary 91B06
Cite as:	arXiv:2206.09348 [cs.LG]
	(or arXiv:2206.09348v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.09348

Submission history

From: Panayotis Mertikopoulos
[v1] Sun, 19 Jun 2022 08:08:38 UTC (825 KB)
