Alessandro Lazaric
Person information
- affiliation: Meta AI, France
2020 – today
- 2024
- [c108] Arpit Agarwal, Nicolas Usunier, Alessandro Lazaric, Maximilian Nickel: System-2 Recommenders: Disentangling Utility and Engagement in Recommendation Systems via Temporal Point-Processes. FAccT 2024: 1763-1773
- [c107] Matteo Pirotta, Andrea Tirinzoni, Ahmed Touati, Alessandro Lazaric, Yann Ollivier: Fast Imitation via Behavior Foundation Models. ICLR 2024
- [c106] Edoardo Cetin, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric, Yann Ollivier, Ahmed Touati: Simple Ingredients for Offline Reinforcement Learning. ICML 2024
- [i72] Ayoub Ghriss, Masashi Sugiyama, Alessandro Lazaric: Reinforcement Learning with Options and State Representation. CoRR abs/2403.10855 (2024)
- [i71] Edoardo Cetin, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric, Yann Ollivier, Ahmed Touati: Simple Ingredients for Offline Reinforcement Learning. CoRR abs/2403.13097 (2024)
- [i70] Arpit Agarwal, Nicolas Usunier, Alessandro Lazaric, Maximilian Nickel: System-2 Recommenders: Disentangling Utility and Engagement in Recommendation Systems via Temporal Point-Processes. CoRR abs/2406.01611 (2024)
- 2023
- [j8] Harsh Satija, Alessandro Lazaric, Matteo Pirotta, Joelle Pineau: Group Fairness in Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [c105] Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric: On the Complexity of Representation Learning in Contextual Linear Bandits. AISTATS 2023: 7871-7896
- [c104] Liyu Chen, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric: Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path. ALT 2023: 310-357
- [c103] Virginie Do, Elvis Dohmatob, Matteo Pirotta, Alessandro Lazaric, Nicolas Usunier: Contextual bandits with concave rewards, and an application to fair ranking. ICLR 2023
- [c102] Rui Yuan, Simon Shaolei Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao: Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies. ICLR 2023
- [c101] Liyu Chen, Andrea Tirinzoni, Alessandro Lazaric, Matteo Pirotta: Layered State Discovery for Incremental Autonomous Exploration. ICML 2023: 4953-5001
- [i69] Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Alessandro Lazaric, Karteek Alahari: Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping. CoRR abs/2301.02099 (2023)
- [i68] Liyu Chen, Andrea Tirinzoni, Alessandro Lazaric, Matteo Pirotta: Layered State Discovery for Incremental Autonomous Exploration. CoRR abs/2302.03789 (2023)
- 2022
- [j7] Rui Yuan, Alessandro Lazaric, Robert M. Gower: Sketched Newton-Raphson. SIAM J. Optim. 32(3): 1555-1583 (2022)
- [c100] Rui Yuan, Robert M. Gower, Alessandro Lazaric: A general sample complexity analysis of vanilla policy gradient. AISTATS 2022: 3332-3380
- [c99] Evrard Garcelon, Vashist Avadhanula, Alessandro Lazaric, Matteo Pirotta: Top K Ranking for Multi-Armed Bandit with Noisy Evaluations. AISTATS 2022: 6242-6269
- [c98] Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Adaptive Multi-Goal Exploration. AISTATS 2022: 7349-7383
- [c97] Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Alessandro Lazaric, Karteek Alahari: Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping. CoRL 2022: 1401-1410
- [c96] Pierre-Alexandre Kamienny, Jean Tarbouriech, Sylvain Lamprier, Alessandro Lazaric, Ludovic Denoyer: Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching. ICLR 2022
- [c95] Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon Shaolei Du: A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning. ICLR 2022
- [c94] Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto: Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning. ICLR 2022
- [c93] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times. ICML 2022: 2523-2541
- [c92] Andrea Tirinzoni, Matteo Papini, Ahmed Touati, Alessandro Lazaric, Matteo Pirotta: Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees. NeurIPS 2022
- [c91] Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio: Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL. UAI 2022: 641-651
- [i67] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times. CoRR abs/2201.12909 (2022)
- [i66] Denis Yarats, David Brandfonbrener, Hao Liu, Michael Laskin, Pieter Abbeel, Alessandro Lazaric, Lerrel Pinto: Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning. CoRR abs/2201.13425 (2022)
- [i65] Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio: Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL. CoRR abs/2203.11369 (2022)
- [i64] Rui Yuan, Simon S. Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao: Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies. CoRR abs/2210.01400 (2022)
- [i63] Liyu Chen, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric: Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path. CoRR abs/2210.04946 (2022)
- [i62] Virginie Do, Elvis Dohmatob, Matteo Pirotta, Alessandro Lazaric, Nicolas Usunier: Contextual bandits with concave rewards, and an application to fair ranking. CoRR abs/2210.09957 (2022)
- [i61] Andrea Tirinzoni, Matteo Papini, Ahmed Touati, Alessandro Lazaric, Matteo Pirotta: Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees. CoRR abs/2210.13083 (2022)
- [i60] Yifang Chen, Karthik Abinav Sankararaman, Alessandro Lazaric, Matteo Pirotta, Dmytro Karamshuk, Qifan Wang, Karishma Mandyam, Sinong Wang, Han Fang: Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler. CoRR abs/2211.02233 (2022)
- [i59] Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric: On the Complexity of Representation Learning in Contextual Linear Bandits. CoRR abs/2212.09429 (2022)
- 2021
- [c90] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model. ALT 2021: 1157-1178
- [c89] Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta: Leveraging Good Representations in Linear Contextual Bandits. ICML 2021: 8371-8380
- [c88] Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto: Reinforcement Learning with Prototypical Representations. ICML 2021: 11920-11931
- [c87] Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret. NeurIPS 2021: 6843-6855
- [c86] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: A Provably Efficient Sample Collection Strategy for Reinforcement Learning. NeurIPS 2021: 7611-7624
- [c85] Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta: Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection. NeurIPS 2021: 16371-16383
- [i58] Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto: Reinforcement Learning with Prototypical Representations. CoRR abs/2102.11271 (2021)
- [i57] Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta: Leveraging Good Representations in Linear Contextual Bandits. CoRR abs/2104.03781 (2021)
- [i56] Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret. CoRR abs/2104.11186 (2021)
- [i55] Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon S. Du: A Unified Framework for Conservative Exploration. CoRR abs/2106.11692 (2021)
- [i54] Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric: A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs. CoRR abs/2106.13013 (2021)
- [i53] Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto: Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning. CoRR abs/2107.09645 (2021)
- [i52] Rui Yuan, Robert M. Gower, Alessandro Lazaric: A general sample complexity analysis of vanilla policy gradient. CoRR abs/2107.11433 (2021)
- [i51] Pierre-Alexandre Kamienny, Jean Tarbouriech, Alessandro Lazaric, Ludovic Denoyer: Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching. CoRR abs/2110.14457 (2021)
- [i50] Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta: Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection. CoRR abs/2110.14798 (2021)
- [i49] Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Adaptive Multi-Goal Exploration. CoRR abs/2111.12045 (2021)
- [i48] Paul Luyo, Evrard Garcelon, Alessandro Lazaric, Matteo Pirotta: Differentially Private Exploration in Reinforcement Learning with Linear Representation. CoRR abs/2112.01585 (2021)
- [i47] Evrard Garcelon, Vashist Avadhanula, Alessandro Lazaric, Matteo Pirotta: Top K Ranking for Multi-Armed Bandit with Noisy Evaluations. CoRR abs/2112.06517 (2021)
- 2020
- [c84] Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, Matteo Pirotta: Improved Algorithms for Conservative Exploration in Bandits. AAAI 2020: 3962-3969
- [c83] Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, Matteo Pirotta: Conservative Exploration in Reinforcement Learning. AISTATS 2020: 1431-1441
- [c82] Andrea Zanette, David Brandfonbrener, Emma Brunskill, Matteo Pirotta, Alessandro Lazaric: Frequentist Regret Bounds for Randomized Least-Squares Value Iteration. AISTATS 2020: 1954-1964
- [c81] Andrea Tirinzoni, Alessandro Lazaric, Marcello Restelli: A Novel Confidence-Based Algorithm for Structured Bandits. AISTATS 2020: 3175-3185
- [c80] Julien Seznec, Pierre Ménard, Alessandro Lazaric, Michal Valko: A single algorithm for both restless and rested rotting bandits. AISTATS 2020: 3784-3794
- [c79] Marc Abeille, Alessandro Lazaric: Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation. ICML 2020: 23-31
- [c78] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Near-linear time Gaussian process optimization with adaptive batching and resparsification. ICML 2020: 1295-1305
- [c77] Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil: Meta-learning with Stochastic Linear Bandits. ICML 2020: 1360-1370
- [c76] Jean Tarbouriech, Evrard Garcelon, Michal Valko, Matteo Pirotta, Alessandro Lazaric: No-Regret Exploration in Goal-Oriented Reinforcement Learning. ICML 2020: 9428-9437
- [c75] Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill: Learning Near Optimal Policies with Low Inherent Bellman Error. ICML 2020: 10978-10989
- [c74] Evrard Garcelon, Baptiste Rozière, Laurent Meunier, Jean Tarbouriech, Olivier Teytaud, Alessandro Lazaric, Matteo Pirotta: Adversarial Attacks on Linear Contextual Bandits. NeurIPS 2020
- [c73] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Improved Sample Complexity for Incremental Autonomous Exploration in MDPs. NeurIPS 2020
- [c72] Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric: An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits. NeurIPS 2020
- [c71] Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill: Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration. NeurIPS 2020
- [c70] Jean Tarbouriech, Shubhanshu Shekhar, Matteo Pirotta, Mohammad Ghavamzadeh, Alessandro Lazaric: Active Model Estimation in Markov Decision Processes. UAI 2020: 1019-1028
- [i46] Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric: Concentration Inequalities for Multinoulli Random Variables. CoRR abs/2001.11595 (2020)
- [i45] Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, Matteo Pirotta: Conservative Exploration in Reinforcement Learning. CoRR abs/2002.03218 (2020)
- [i44] Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, Matteo Pirotta: Improved Algorithms for Conservative Exploration in Bandits. CoRR abs/2002.03221 (2020)
- [i43] Evrard Garcelon, Baptiste Rozière, Laurent Meunier, Jean Tarbouriech, Olivier Teytaud, Alessandro Lazaric, Matteo Pirotta: Adversarial Attacks on Linear Contextual Bandits. CoRR abs/2002.03839 (2020)
- [i42] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification. CoRR abs/2002.09954 (2020)
- [i41] Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill: Learning Near Optimal Policies with Low Inherent Bellman Error. CoRR abs/2003.00153 (2020)
- [i40] Jean Tarbouriech, Shubhanshu Shekhar, Matteo Pirotta, Mohammad Ghavamzadeh, Alessandro Lazaric: Active Model Estimation in Markov Decision Processes. CoRR abs/2003.03297 (2020)
- [i39] Pierre-Alexandre Kamienny, Matteo Pirotta, Alessandro Lazaric, Thibault Lavril, Nicolas Usunier, Ludovic Denoyer: Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization. CoRR abs/2005.02934 (2020)
- [i38] Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil: Meta-learning with Stochastic Linear Bandits. CoRR abs/2005.08531 (2020)
- [i37] Andrea Tirinzoni, Alessandro Lazaric, Marcello Restelli: A Novel Confidence-Based Algorithm for Structured Bandits. CoRR abs/2005.11593 (2020)
- [i36] Rui Yuan, Alessandro Lazaric, Robert M. Gower: Sketched Newton-Raphson. CoRR abs/2006.12120 (2020)
- [i35] Ronan Fruit, Matteo Pirotta, Alessandro Lazaric: Improved Analysis of UCRL2 with Empirical Bernstein Inequality. CoRR abs/2007.05456 (2020)
- [i34] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: A Provably Efficient Sample Collection Strategy for Reinforcement Learning. CoRR abs/2007.06437 (2020)
- [i33] Marc Abeille, Alessandro Lazaric: Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation. CoRR abs/2007.06482 (2020)
- [i32] Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill: Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration. CoRR abs/2008.07737 (2020)
- [i31] Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric: An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits. CoRR abs/2010.12247 (2020)
- [i30] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Improved Sample Complexity for Incremental Autonomous Exploration in MDPs. CoRR abs/2012.14755 (2020)
2010 – 2019
- 2019
- [c69] Rahma Chaabouni, Eugene Kharitonov, Alessandro Lazaric, Emmanuel Dupoux, Marco Baroni: Word-order Biases in Deep-agent Emergent Communication. ACL (1) 2019: 5166-5175
- [c68] Jean Tarbouriech, Alessandro Lazaric: Active Exploration in Markov Decision Processes. AISTATS 2019: 974-982
- [c67] Julien Seznec, Andrea Locatelli, Alexandra Carpentier, Alessandro Lazaric, Michal Valko: Rotting bandits are no harder than stochastic ones. AISTATS 2019: 2564-2572
- [c66] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret. COLT 2019: 533-557
- [c65] Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric: Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs. NeurIPS 2019: 4891-4900
- [c64] Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill: Limiting Extrapolation in Linear Approximate Value Iteration. NeurIPS 2019: 5616-5625
- [c63] Nicolas Carion, Nicolas Usunier, Gabriel Synnaeve, Alessandro Lazaric: A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning. NeurIPS 2019: 8128-8138
- [c62] Ronald Ortner, Matteo Pirotta, Alessandro Lazaric, Ronan Fruit, Odalric-Ambrym Maillard: Regret Bounds for Learning State Representations in Reinforcement Learning. NeurIPS 2019: 12717-12727
- [i29] Jean Tarbouriech, Alessandro Lazaric: Active Exploration in Markov Decision Processes. CoRR abs/1902.11199 (2019)
- [i28] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret. CoRR abs/1903.05594 (2019)
- [i27] Rahma Chaabouni, Eugene Kharitonov, Alessandro Lazaric, Emmanuel Dupoux, Marco Baroni: Word-order biases in deep-agent emergent communication. CoRR abs/1905.12330 (2019)
- [i26] Nicolas Carion, Gabriel Synnaeve, Alessandro Lazaric, Nicolas Usunier: A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning. CoRR abs/1910.08809 (2019)
- [i25] Andrea Zanette, David Brandfonbrener, Matteo Pirotta, Alessandro Lazaric: Frequentist Regret Bounds for Randomized Least-Squares Value Iteration. CoRR abs/1911.00567 (2019)
- [i24] Jean Tarbouriech, Evrard Garcelon, Michal Valko, Matteo Pirotta, Alessandro Lazaric: No-Regret Exploration in Goal-Oriented Reinforcement Learning. CoRR abs/1912.03517 (2019)
- 2018
- [c61] Marc Abeille, Alessandro Lazaric: Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems. ICML 2018: 1-9
- [c60] Daniele Calandriello, Ioannis Koutis, Alessandro Lazaric, Michal Valko: Improved Large-Scale Graph Learning through Ridge Spectral Sparsification. ICML 2018: 687-696
- [c59] Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Ronald Ortner: Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning. ICML 2018: 1573-1581
- [c58] Romain Warlop, Alessandro Lazaric, Jérémie Mary: Fighting Boredom in Recommender Systems with Linear Reinforcement Learning. NeurIPS 2018: 1764-1773
- [c57] Ronan Fruit, Matteo Pirotta, Alessandro Lazaric: Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes. NeurIPS 2018: 2998-3008
- [i23] Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Ronald Ortner: Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning. CoRR abs/1802.04020 (2018)
- [i22] Daniele Calandriello, Alessandro Lazaric, Michal Valko: Distributed Adaptive Sampling for Kernel Matrix Approximation. CoRR abs/1803.10172 (2018)
- [i21] Ronan Fruit, Matteo Pirotta, Alessandro Lazaric: Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes. CoRR abs/1807.02373 (2018)
- [i20] Julien Seznec, Andrea Locatelli, Alexandra Carpentier, Alessandro Lazaric, Michal Valko: Rotting bandits are no harder than stochastic ones. CoRR abs/1811.11043 (2018)
- [i19] Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric: Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes. CoRR abs/1812.04363 (2018)
- 2017
- [c56] Romain Warlop, Alessandro Lazaric, Jérémie Mary: Parallel Higher Order Alternating Least Square for Tensor Recommender System. AAAI Workshops 2017
- [c55] Marc Abeille, Alessandro Lazaric: Linear Thompson Sampling Revisited. AISTATS 2017: 176-184
- [c54] Ronan Fruit, Alessandro Lazaric: Exploration-Exploitation in MDPs with Options. AISTATS 2017: 576-584
- [c53] Akram Erraqabi, Alessandro Lazaric, Michal Valko, Emma Brunskill, Yun-En Liu: Trading off Rewards and Errors in Multi-Armed Bandits. AISTATS 2017: 709-717
- [c52] Marc Abeille, Alessandro Lazaric: Thompson Sampling for Linear-Quadratic Control Problems. AISTATS 2017: 1246-1254
- [c51] Daniele Calandriello, Alessandro Lazaric, Michal Valko: Distributed Adaptive Sampling for Kernel Matrix Approximation. AISTATS 2017: 1421-1429
- [c50] Daniele Calandriello, Alessandro Lazaric, Michal Valko: Second-Order Kernel Online Convex Optimization with Adaptive Sketching. ICML 2017: 645-653
- [c49] Carlos Riquelme, Mohammad Ghavamzadeh, Alessandro Lazaric: Active Learning for Accurate Estimation of Linear Models. ICML 2017: 2931-2939
- [c48] Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Emma Brunskill: Regret Minimization in MDPs with Options without Prior Knowledge. NIPS 2017: 3166-3176
- [c47] Daniele Calandriello, Alessandro Lazaric, Michal Valko: Efficient Second-Order Online Kernel Learning with Adaptive Embedding. NIPS 2017: 6140-6150
- [i18] Carlos Riquelme, Mohammad Ghavamzadeh, Alessandro Lazaric: Active Learning for Accurate Estimation of Linear Models. CoRR abs/1703.00579 (2017)
- [i17] Ronan Fruit, Alessandro Lazaric: Exploration-Exploitation in MDPs with Options. CoRR abs/1703.08667 (2017)
- [i16] Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar: Experimental results : Reinforcement Learning of POMDPs using Spectral Methods. CoRR abs/1705.02553 (2017)
- [i15] Daniele Calandriello, Alessandro Lazaric, Michal Valko: Second-Order Kernel Online Convex Optimization with Adaptive Sketching. CoRR abs/1706.04892 (2017)
- 2016
- [j6] Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Analysis of Classification-based Policy Iteration Algorithms. J. Mach. Learn. Res. 17: 19:1-19:30 (2016)
- [c46] Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Ronald Ortner, Peter L. Bartlett: Improved Learning Complexity in Combinatorial Pure Exploration Bandits. AISTATS 2016: 1004-1012
- [c45] Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar: Reinforcement Learning of POMDPs using Spectral Methods. COLT 2016: 193-256
- [c44] Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar: Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies. COLT 2016: 1639-1642
- [c43] Daniele Calandriello, Alessandro Lazaric, Michal Valko: Analysis of Nyström method with sequential ridge leverage scores. UAI 2016
- [i14] Daniele Calandriello, Alessandro Lazaric, Michal Valko, Ioannis Koutis: Incremental Spectral Sparsification for Large-Scale Graph-Based Semi-Supervised Learning. CoRR abs/1601.05675 (2016)
- [i13] Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar: Reinforcement Learning of POMDP's using Spectral Methods. CoRR abs/1602.07764 (2016)
- [i12] Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar: Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies. CoRR abs/1608.04996 (2016)
- [i11] Daniele Calandriello, Alessandro Lazaric, Michal Valko: Analysis of Kelner and Levin graph sparsification algorithm for a streaming setting. CoRR abs/1609.03769 (2016)
- [i10] Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar: Reinforcement Learning of Contextual MDPs using Spectral Methods. CoRR abs/1611.03907 (2016)
- [i9] Marc Abeille, Alessandro Lazaric: Linear Thompson Sampling Revisited. CoRR abs/1611.06534 (2016)
- 2015
- [j5] Nicola Gatti, Alessandro Lazaric, Marco Rocco, Francesco Trovò: Truthful learning mechanisms for multi-slot sponsored search auctions with externalities. Artif. Intell. 227: 93-139 (2015)
- [j4] Daniele Calandriello, Alessandro Lazaric, Marcello Restelli: Sparse multi-task reinforcement learning. Intelligenza Artificiale 9(1): 5-20 (2015)
- [c42] Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh: Maximum Entropy Semi-Supervised Inverse Reinforcement Learning. IJCAI 2015: 3315-3321
- [c41] Jessica Chemali, Alessandro Lazaric: Direct Policy Iteration with Demonstrations. IJCAI 2015: 3380-3386
- [c40] Amir Sani, Alessandro Lazaric, Daniil Ryabko: The replacement bootstrap for dependent data. ISIT 2015: 1194-1198
- [i8] Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer, András Antos: Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits. CoRR abs/1507.04523 (2015)
- 2014
- [c39] Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill: Online Stochastic Optimization under Correlated Bandit Feedback. ICML 2014: 1557-1565
- [c38] Amir Sani, Gergely Neu, Alessandro Lazaric: Exploiting easy data in online optimization. NIPS 2014: 810-818
- [c37] Daniele Calandriello, Alessandro Lazaric, Marcello Restelli: Sparse Multi-Task Reinforcement Learning. NIPS 2014: 819-827
- [c36] Marta Soare, Alessandro Lazaric, Rémi Munos: Best-Arm Identification in Linear Bandits. NIPS 2014: 828-836
- [i7] Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill: Stochastic Optimization of a Locally Smooth Function under Correlated Bandit Feedback. CoRR abs/1402.0562 (2014)
- [i6] Nicola Gatti, Alessandro Lazaric, Marco Rocco, Francesco Trovò: Truthful Learning Mechanisms for Multi-Slot Sponsored Search Auctions with Externalities. CoRR abs/1405.2484 (2014)
- [i5] Marta Soare, Alessandro Lazaric, Rémi Munos: Best-Arm Identification in Linear Bandits. CoRR abs/1409.6110 (2014)
- 2013
- [c35] Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill: Sequential Transfer in Multi-armed Bandit with Finite Set of Models. NIPS 2013: 2220-2228
- [c34] Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill: Regret Bounds for Reinforcement Learning with Policy Advice. ECML/PKDD (1) 2013: 97-112
- [i4] Amir Sani, Alessandro Lazaric, Rémi Munos: Risk-Aversion in Multi-armed Bandits. CoRR abs/1301.1936 (2013)
- [i3] Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill: Regret Bounds for Reinforcement Learning with Policy Advice. CoRR abs/1305.1027 (2013)
- [i2] Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill: Sequential Transfer in Multi-armed Bandit with Finite Set of Models. CoRR abs/1307.6887 (2013)
- 2012
- [j3] Alessandro Lazaric, Rémi Munos: Learning with stochastic inputs and adversarial outputs. J. Comput. Syst. Sci. 78(5): 1516-1537 (2012)
- [j2] Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Finite-sample analysis of least-squares policy iteration. J. Mach. Learn. Res. 13: 3041-3074 (2012)
- [c33] Mohammad Ghavamzadeh, Alessandro Lazaric: Conservative and Greedy Approaches to Classification-Based Policy Iteration. AAAI 2012: 914-920
- [c32] Nicola Gatti, Alessandro Lazaric, Francesco Trovò: A truthful learning mechanism for multi-slot sponsored search auctions with externalities. AAMAS 2012: 1325-1326
- [c31] Michal Valko, Mohammad Ghavamzadeh, Alessandro Lazaric: Semi-Supervised Apprenticeship Learning. EWRL 2012: 131-142
- [c30] Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh: A Dantzig Selector Approach to Temporal Difference Learning. ICML 2012
- [c29] Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric: Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence. NIPS 2012: 3221-3229
- [c28] Amir Sani, Alessandro Lazaric, Rémi Munos: Risk-Aversion in Multi-armed Bandits. NIPS 2012: 3284-3292
- [c27] Nicola Gatti, Alessandro Lazaric, Francesco Trovò: A truthful learning mechanism for contextual multi-slot sponsored search auctions with externalities. EC 2012: 605-622
- [p2] Lucian Busoniu, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Robert Babuska, Bart De Schutter: Least-Squares Methods for Policy Iteration. Reinforcement Learning 2012: 75-109
- [p1] Alessandro Lazaric: Transfer in Reinforcement Learning: A Framework and a Survey. Reinforcement Learning 2012: 143-173
- 2011
- [c26] Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer: Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits. ALT 2011: 189-203
- [c25] Matthew W. Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization. EWRL 2011: 102-114
- [c24] Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Bruno Scherrer: Classification-based Policy Iteration with a Critic. ICML 2011: 1049-1056
- [c23] Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, Matthew W. Hoffman: Finite-Sample Analysis of Lasso-TD. ICML 2011: 1177-1184
- [c22] Alessandro Lazaric, Marcello Restelli: Transfer from Multiple MDPs. NIPS 2011: 1746-1754
- [c21] Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, Sébastien Bubeck: Multi-Bandit Best Arm Identification. NIPS 2011: 2222-2230
- [i1] Alessandro Lazaric, Marcello Restelli: Transfer from Multiple MDPs. CoRR abs/1108.6211 (2011)
- 2010
- [c20] Alessandro Lazaric, Mohammad Ghavamzadeh: Bayesian Multi-Task Reinforcement Learning. ICML 2010: 599-606
- [c19] Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Analysis of a Classification-based Policy Iteration Algorithm. ICML 2010: 607-614
- [c18] Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Finite-Sample Analysis of LSTD. ICML 2010: 615-622
- [c17] Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard, Rémi Munos: LSTD with Random Projections. NIPS 2010: 721-729
- [c16] Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh: Finite-sample Analysis of Bellman Residual Minimization. ACML 2010: 299-314
2000 – 2009
- 2009
- [j1] Andrea Bonarini, Alessandro Lazaric, Francesco Montrone, Marcello Restelli: Reinforcement distribution in fuzzy Q-learning. Fuzzy Sets Syst. 160(10): 1420-1443 (2009)
- [c15] Alessandro Lazaric, Rémi Munos: Hybrid Stochastic-Adversarial On-line Learning. COLT 2009
- [c14] Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári: Workshop summary: On-line learning with limited feedback. ICML 2009: 8
- 2008
- [c13] Nicola Gatti, Alessandro Lazaric, Marcello Restelli: Towards Automated Bargaining in Electronic Markets: A Partially Two-Sided Competition Model. AMEC/TADA 2008: 117-130
- [c12] Eliseo Ferrante, Alessandro Lazaric, Marcello Restelli: Transfer of task representation in reinforcement learning using policy-based proto-value functions. AAMAS (3) 2008: 1329-1332
- [c11] Alessandro Lazaric, Mario Quaresimale, Marcello Restelli: On the usefulness of opponent modeling: the Kuhn Poker case study. AAMAS (3) 2008: 1345-1348
- [c10] Alessandro Lazaric, Marcello Restelli, Andrea Bonarini: Transfer of samples in batch reinforcement learning. ICML 2008: 544-551
- [c9] Andrea Bonarini, Claudio Caccia, Alessandro Lazaric, Marcello Restelli: Batch Reinforcement Learning for Controlling a Mobile Wheeled Pendulum Robot. IFIP AI 2008: 151-160
- [c8] Alessandro Lazaric, Marcello Restelli, Andrea Bonarini: Improving Batch Reinforcement Learning Performance through Transfer of Samples. STAIRS 2008: 106-117
- 2007
- [c7] Alessandro Lazaric, Enrique Munoz de Cote, Fabio Dercole, Marcello Restelli: Bifurcation Analysis of Reinforcement Learning Agents in the Selten's Horse Game. Adaptive Agents and Multi-Agents Systems 2007: 129-144
- [c6] Andrea Bonarini, Alessandro Lazaric, Marcello Restelli: Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions. AI*IA 2007: 531-542
- [c5] Alessandro Lazaric, Enrique Munoz de Cote, Nicola Gatti: Reinforcement learning in extensive form games with incomplete information: the bargaining case study. AAMAS 2007: 46
- [c4] Andrea Bonarini, Alessandro Lazaric, Marcello Restelli: Piecewise constant reinforcement learning for robotic applications. ICINCO-ICSO 2007: 214-221
- [c3] Alessandro Lazaric, Marcello Restelli, Andrea Bonarini: Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods. NIPS 2007: 833-840
- 2006
- [c2] Enrique Munoz de Cote, Alessandro Lazaric, Marcello Restelli: Learning to cooperate in multi-agent social dilemmas. AAMAS 2006: 783-785
- [c1] Andrea Bonarini, Alessandro Lazaric, Marcello Restelli: Incremental Skill Acquisition for Self-motivated Learning Animats. SAB 2006: 357-368