Regret-optimal measurement-feedback control

Gautam Goel, Babak Hassibi
Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:1270-1280, 2021.

Abstract

We consider measurement-feedback control in linear dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which competes with the optimal dynamic sequence of control actions selected in hindsight, instead of the best controller in some specific class of controllers. This formulation of regret is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We show that in the measurement-feedback setting, unlike in the full-information setting, there is no single offline controller which outperforms every other offline controller on every disturbance, and propose a new H2-optimal offline controller as a benchmark for the online controller to compete against. We show that the corresponding regret-optimal online controller can be found via a novel reduction to the classical Nehari problem from robust control and present a tight data-dependent bound on its regret.

Cite this Paper


BibTeX
@InProceedings{pmlr-v144-goel21a,
  title     = {Regret-optimal measurement-feedback control},
  author    = {Goel, Gautam and Hassibi, Babak},
  booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
  pages     = {1270--1280},
  year      = {2021},
  editor    = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
  volume    = {144},
  series    = {Proceedings of Machine Learning Research},
  month     = {07 -- 08 June},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v144/goel21a/goel21a.pdf},
  url       = {https://proceedings.mlr.press/v144/goel21a.html},
  abstract  = {We consider measurement-feedback control in linear dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which competes with the optimal dynamic sequence of control actions selected in hindsight, instead of the best controller in some specific class of controllers. This formulation of regret is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We show that in the measurement-feedback setting, unlike in the full-information setting, there is no single offline controller which outperforms every other offline controller on every disturbance, and propose a new H2-optimal offline controller as a benchmark for the online controller to compete against. We show that the corresponding regret-optimal online controller can be found via a novel reduction to the classical Nehari problem from robust control and present a tight data-dependent bound on its regret.}
}
Endnote
%0 Conference Paper
%T Regret-optimal measurement-feedback control
%A Gautam Goel
%A Babak Hassibi
%B Proceedings of the 3rd Conference on Learning for Dynamics and Control
%C Proceedings of Machine Learning Research
%D 2021
%E Ali Jadbabaie
%E John Lygeros
%E George J. Pappas
%E Pablo A. Parrilo
%E Benjamin Recht
%E Claire J. Tomlin
%E Melanie N. Zeilinger
%F pmlr-v144-goel21a
%I PMLR
%P 1270--1280
%U https://proceedings.mlr.press/v144/goel21a.html
%V 144
%X We consider measurement-feedback control in linear dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which competes with the optimal dynamic sequence of control actions selected in hindsight, instead of the best controller in some specific class of controllers. This formulation of regret is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We show that in the measurement-feedback setting, unlike in the full-information setting, there is no single offline controller which outperforms every other offline controller on every disturbance, and propose a new H2-optimal offline controller as a benchmark for the online controller to compete against. We show that the corresponding regret-optimal online controller can be found via a novel reduction to the classical Nehari problem from robust control and present a tight data-dependent bound on its regret.
APA
Goel, G. & Hassibi, B. (2021). Regret-optimal measurement-feedback control. Proceedings of the 3rd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 144:1270-1280. Available from https://proceedings.mlr.press/v144/goel21a.html.
