Directional Optimism for Safe Linear Bandits

Hutchinson, Spencer; Turan, Berkay; Alizadeh, Mahnoosh


arXiv:2308.15006v2 (cs)

[Submitted on 29 Aug 2023 (v1), last revised 11 Mar 2024 (this version, v2)]

Title: Directional Optimism for Safe Linear Bandits

Authors: Spencer Hutchinson, Berkay Turan, Mahnoosh Alizadeh


Abstract: The safe linear bandit problem is a version of the classical stochastic linear bandit problem where the learner's actions must satisfy an uncertain constraint at all rounds. Due to its applicability to many real-world settings, this problem has received considerable attention in recent years. By leveraging a novel approach that we call directional optimism, we find that it is possible to achieve improved regret guarantees for both well-separated problem instances and action sets that are finite star convex sets. Furthermore, we propose a novel algorithm for this setting that improves on existing algorithms in terms of empirical performance, while enjoying matching regret guarantees. Lastly, we introduce a generalization of the safe linear bandit setting where the constraints are convex and adapt our algorithms and analyses to this setting by leveraging a novel convex-analysis-based approach.
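
For readers new to the setting, the problem described in the abstract can be sketched as follows. This is a common formulation from the safe linear bandit literature, not necessarily the paper's own notation: the symbols $\mathcal{X}$, $\theta$, $a$, $b$, $\epsilon_t$, $\eta_t$, and the single linear constraint are assumptions made here for illustration.

```latex
% Illustrative sketch only; all symbols are assumed, not taken from the abstract.
\begin{align*}
&\text{At each round } t = 1, \dots, T: \\
&\quad \text{choose an action } x_t \in \mathcal{X} \subseteq \mathbb{R}^d, \\
&\quad \text{observe the reward } y_t = \theta^\top x_t + \epsilon_t, \\
&\quad \text{observe the constraint feedback } z_t = a^\top x_t + \eta_t, \\
&\quad \text{while ensuring } a^\top x_t \le b \text{ at every round (typically with high probability)}. \\[4pt]
&\text{Regret: } R_T = \sum_{t=1}^{T} \theta^\top \left( x^\star - x_t \right),
\qquad x^\star \in \arg\max_{x \in \mathcal{X},\; a^\top x \le b} \theta^\top x .
\end{align*}
```

Here $\theta$ and $a$ are unknown to the learner and $b$ is a known threshold, so the learner must estimate the constraint from noisy feedback while never violating it. The convex generalization mentioned in the abstract would replace the linear constraint with a convex one.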

Comments:	37 pages, 4 figures
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2308.15006 [cs.LG]
	(or arXiv:2308.15006v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2308.15006

Submission history

From: Spencer Hutchinson
[v1] Tue, 29 Aug 2023 03:54:53 UTC (742 KB)
[v2] Mon, 11 Mar 2024 23:32:33 UTC (318 KB)
