Ian Osband
2020 – today
- 2023
- [j4] Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen: Reinforcement Learning, Bit by Bit. Found. Trends Mach. Learn. 16(6): 733-865 (2023)
- [j3] Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, Seyed Mohammad Asghari, Benjamin Van Roy: Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping. Trans. Mach. Learn. Res. 2023 (2023)
- [c20] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy: Epistemic Neural Networks. NeurIPS 2023
- [c19] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy: Approximate Thompson Sampling via Epistemic Neural Networks. UAI 2023: 1586-1595
- [i32] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy: Approximate Thompson Sampling via Epistemic Neural Networks. CoRR abs/2302.09205 (2023)
- 2022
- [c18] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Dieterich Lawson, Botao Hao, Brendan O'Donoghue, Benjamin Van Roy: The Neural Testbed: Evaluating Joint Predictions. NeurIPS 2022
- [c17] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Benjamin Van Roy: Evaluating high-order predictive distributions in deep learning. UAI 2022: 1552-1560
- [i31] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Benjamin Van Roy: Evaluating High-Order Predictive Distributions in Deep Learning. CoRR abs/2202.13509 (2022)
- [i30] Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, Seyed Mohammad Asghari, Benjamin Van Roy: Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping. CoRR abs/2206.03633 (2022)
- [i29] Xiuyuan Lu, Ian Osband, Seyed Mohammad Asghari, Sven Gowal, Vikranth Dwaracherla, Zheng Wen, Benjamin Van Roy: Robustness of Epinets against Distributional Shifts. CoRR abs/2207.00137 (2022)
- [i28] Ian Osband, Seyed Mohammad Asghari, Benjamin Van Roy, Nat McAleese, John Aslanides, Geoffrey Irving: Fine-Tuning Language Models via Epistemic Neural Networks. CoRR abs/2211.01568 (2022)
- 2021
- [c16] Brendan O'Donoghue, Tor Lattimore, Ian Osband: Matrix games with bandit feedback. UAI 2021: 279-289
- [i27] Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen: Reinforcement Learning, Bit by Bit. CoRR abs/2103.04047 (2021)
- [i26] Ian Osband, Zheng Wen, Mohammad Asghari, Morteza Ibrahimi, Xiyuan Lu, Benjamin Van Roy: Epistemic Neural Networks. CoRR abs/2107.08924 (2021)
- [i25] Xiuyuan Lu, Ian Osband, Benjamin Van Roy, Zheng Wen: Evaluating Probabilistic Inference in Deep Learning: Beyond Marginal Predictions. CoRR abs/2107.09224 (2021)
- [i24] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Botao Hao, Morteza Ibrahimi, Dieterich Lawson, Xiuyuan Lu, Brendan O'Donoghue, Benjamin Van Roy: Evaluating Predictive Distributions: Does Bayesian Deep Learning Work? CoRR abs/2110.04629 (2021)
- 2020
- [c15] Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy: Hypermodels for Exploration. ICLR 2020
- [c14] Brendan O'Donoghue, Ian Osband, Catalin Ionescu: Making Sense of Reinforcement Learning and Probabilistic Inference. ICLR 2020
- [c13] Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt: Behaviour Suite for Reinforcement Learning. ICLR 2020
- [i23] Brendan O'Donoghue, Ian Osband, Catalin Ionescu: Making Sense of Reinforcement Learning and Probabilistic Inference. CoRR abs/2001.00805 (2020)
- [i22] Brendan O'Donoghue, Tor Lattimore, Ian Osband: Stochastic matrix games with bandit feedback. CoRR abs/2006.05145 (2020)
- [i21] Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy: Hypermodels for Exploration. CoRR abs/2006.07464 (2020)
2010 – 2019
- 2019
- [j2] Ian Osband, Benjamin Van Roy, Daniel J. Russo, Zheng Wen: Deep Exploration via Randomized Value Functions. J. Mach. Learn. Res. 20: 124:1-124:62 (2019)
- [i20] Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alexander Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin J. Miller, Mohammad Gheshlaghi Azar, Ian Osband, Neil C. Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew M. Botvinick, Shane Legg: Meta-learning of Sequential Strategies. CoRR abs/1905.03030 (2019)
- [i19] Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt: Behaviour Suite for Reinforcement Learning. CoRR abs/1908.03568 (2019)
- 2018
- [j1] Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen: A Tutorial on Thompson Sampling. Found. Trends Mach. Learn. 11(1): 1-96 (2018)
- [c12] Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys: Deep Q-learning From Demonstrations. AAAI 2018: 3223-3230
- [c11] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Matteo Hessel, Ian Osband, Alex Graves, Volodymyr Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg: Noisy Networks For Exploration. ICLR (Poster) 2018
- [c10] Brendan O'Donoghue, Ian Osband, Rémi Munos, Volodymyr Mnih: The Uncertainty Bellman Equation and Exploration. ICML 2018: 3836-3845
- [c9] Maria Dimakopoulou, Ian Osband, Benjamin Van Roy: Scalable Coordinated Exploration in Concurrent Reinforcement Learning. NeurIPS 2018: 4223-4232
- [c8] Ian Osband, John Aslanides, Albin Cassirer: Randomized Prior Functions for Deep Reinforcement Learning. NeurIPS 2018: 8626-8638
- [i18] Maria Dimakopoulou, Ian Osband, Benjamin Van Roy: Scalable Coordinated Exploration in Concurrent Reinforcement Learning. CoRR abs/1805.08948 (2018)
- [i17] Ian Osband, John Aslanides, Albin Cassirer: Randomized Prior Functions for Deep Reinforcement Learning. CoRR abs/1806.03335 (2018)
- 2017
- [c7] Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos: Minimax Regret Bounds for Reinforcement Learning. ICML 2017: 263-272
- [c6] Ian Osband, Benjamin Van Roy: Why is Posterior Sampling Better than Optimism for Reinforcement Learning? ICML 2017: 2701-2710
- [i16] Ian Osband, Benjamin Van Roy: Gaussian-Dirichlet Posterior Dominance in Sequential Learning. CoRR abs/1702.04126 (2017)
- [i15] Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos: Minimax Regret Bounds for Reinforcement Learning. CoRR abs/1703.05449 (2017)
- [i14] Ian Osband, Daniel Russo, Zheng Wen, Benjamin Van Roy: Deep Exploration via Randomized Value Functions. CoRR abs/1703.07608 (2017)
- [i13] Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys: Learning from Demonstrations for Real World Reinforcement Learning. CoRR abs/1704.03732 (2017)
- [i12] Ian Osband, Benjamin Van Roy: On Optimistic versus Randomized Exploration in Reinforcement Learning. CoRR abs/1706.04241 (2017)
- [i11] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg: Noisy Networks for Exploration. CoRR abs/1706.10295 (2017)
- [i10] Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband: A Tutorial on Thompson Sampling. CoRR abs/1707.02038 (2017)
- [i9] Brendan O'Donoghue, Ian Osband, Rémi Munos, Volodymyr Mnih: The Uncertainty Bellman Equation and Exploration. CoRR abs/1709.05380 (2017)
- 2016
- [c5] Ian Osband, Benjamin Van Roy, Zheng Wen: Generalization and Exploration via Randomized Value Functions. ICML 2016: 2377-2386
- [c4] Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy: Deep Exploration via Bootstrapped DQN. NIPS 2016: 4026-4034
- [i8] Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy: Deep Exploration via Bootstrapped DQN. CoRR abs/1602.04621 (2016)
- [i7] Ian Osband, Benjamin Van Roy: Why is Posterior Sampling Better than Optimism for Reinforcement Learning. CoRR abs/1607.00215 (2016)
- [i6] Ian Osband, Benjamin Van Roy: Posterior Sampling for Reinforcement Learning Without Episodes. CoRR abs/1608.02731 (2016)
- [i5] Ian Osband, Benjamin Van Roy: On Lower Bounds for Regret in Reinforcement Learning. CoRR abs/1608.02732 (2016)
- 2015
- [i4] Ian Osband, Benjamin Van Roy: Bootstrapped Thompson Sampling and Deep Exploration. CoRR abs/1507.00300 (2015)
- 2014
- [c3] Ian Osband, Benjamin Van Roy: Near-optimal Reinforcement Learning in Factored MDPs. NIPS 2014: 604-612
- [c2] Ian Osband, Benjamin Van Roy: Model-based Reinforcement Learning and the Eluder Dimension. NIPS 2014: 1466-1474
- [i3] Ian Osband, Benjamin Van Roy: Near-optimal Regret Bounds for Reinforcement Learning in Factored MDPs. CoRR abs/1403.3741 (2014)
- [i2] Ian Osband, Benjamin Van Roy: Model-based Reinforcement Learning and the Eluder Dimension. CoRR abs/1406.1853 (2014)
- 2013
- [c1] Ian Osband, Daniel Russo, Benjamin Van Roy: (More) Efficient Reinforcement Learning via Posterior Sampling. NIPS 2013: 3003-3011
- [i1] Ian Osband, Daniel Russo, Benjamin Van Roy: (More) Efficient Reinforcement Learning via Posterior Sampling. CoRR abs/1306.0940 (2013)
last updated on 2024-10-04 21:00 CEST by the dblp team
all metadata released as open data under CC0 1.0 license