Marc Lanctot
2020 – today
- 2024
- [c52] David Sychrovsky, Michal Sustr, Elnaz Davoodi, Michael Bowling, Marc Lanctot, Martin Schmid: Learning Not to Regret. AAAI 2024: 15202-15210
- [c51] Ian Gemp, Marc Lanctot, Luke Marris, Yiran Mao, Edgar A. Duéñez-Guzmán, Sarah Perrin, Andras Gyorgy, Romuald Elie, Georgios Piliouras, Michael Kaisers, Daniel Hennes, Kalesha Bullard, Kate Larson, Yoram Bachrach: Approximating the Core via Iterative Coalition Sampling. AAMAS 2024: 669-678
- [c50] Siqi Liu, Luke Marris, Marc Lanctot, Georgios Piliouras, Joel Z. Leibo, Nicolas Heess: Neural Population Learning beyond Symmetric Zero-Sum Games. AAMAS 2024: 1247-1255
- [i54] Siqi Liu, Luke Marris, Marc Lanctot, Georgios Piliouras, Joel Z. Leibo, Nicolas Heess: Neural Population Learning beyond Symmetric Zero-sum Games. CoRR abs/2401.05133 (2024)
- [i53] Ian Gemp, Yoram Bachrach, Marc Lanctot, Roma Patel, Vibhavari Dasagi, Luke Marris, Georgios Piliouras, Siqi Liu, Karl Tuyls: States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers. CoRR abs/2402.01704 (2024)
- [i52] Ian Gemp, Marc Lanctot, Luke Marris, Yiran Mao, Edgar A. Duéñez-Guzmán, Sarah Perrin, Andras Gyorgy, Romuald Elie, Georgios Piliouras, Michael Kaisers, Daniel Hennes, Kalesha Bullard, Kate Larson, Yoram Bachrach: Approximating the Core via Iterative Coalition Sampling. CoRR abs/2402.03928 (2024)
- [i51] Luca D'Amico-Wong, Hugh Zhang, Marc Lanctot, David C. Parkes: Easy as ABCs: Unifying Boltzmann Q-Learning and Counterfactual Regret Minimization. CoRR abs/2402.11835 (2024)
- [i50] Heymann Benjamin, Marc Lanctot: Learning in Games with progressive hiding. CoRR abs/2409.03875 (2024)
- 2023
- [j12] Marc Lanctot, John Schultz, Neil Burch, Max Olan Smith, Daniel Hennes, Thomas Anthony, Julien Pérolat: Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [c49] Zun Li, Marc Lanctot, Kevin R. McKee, Luke Marris, Ian Gemp, Daniel Hennes, Kate Larson, Yoram Bachrach, Michael P. Wellman, Paul Muller: Search-Improved Game-Theoretic Multiagent Reinforcement Learning in General and Negotiation Games. AAMAS 2023: 2445-2447
- [c48] Stephen Marcus McAleer, Gabriele Farina, Marc Lanctot, Tuomas Sandholm: ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret. ICLR 2023
- [c47] Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer: A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games. ICLR 2023
- [i49] Zun Li, Marc Lanctot, Kevin R. McKee, Luke Marris, Ian Gemp, Daniel Hennes, Paul Muller, Kate Larson, Yoram Bachrach, Michael P. Wellman: Combining Tree-Search, Generative Models, and Nash Bargaining Concepts in Game-Theoretic Reinforcement Learning. CoRR abs/2302.00797 (2023)
- [i48] David Sychrovsky, Michal Sustr, Elnaz Davoodi, Marc Lanctot, Martin Schmid: Learning not to Regret. CoRR abs/2303.01074 (2023)
- [i47] Marc Lanctot, John Schultz, Neil Burch, Max Olan Smith, Daniel Hennes, Thomas W. Anthony, Julien Pérolat: Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning. CoRR abs/2303.03196 (2023)
- [i46] Marc Lanctot, Kate Larson, Yoram Bachrach, Luke Marris, Zun Li, Avishkar Bhoopchand, Thomas W. Anthony, Brian Tanner, Anna Koop: Evaluating Agents using Social Choice Theory. CoRR abs/2312.03121 (2023)
- 2022
- [j11] Ian Gemp, Thomas W. Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome T. Connor, Vibhavari Dasagi, Bart De Vylder, Edgar A. Duéñez-Guzmán, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, Siqi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Pérolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls: Developing, evaluating and scaling learning agents in multi-agent environments. AI Commun. 35(4): 271-284 (2022)
- [c46] Ian Gemp, Rahul Savani, Marc Lanctot, Yoram Bachrach, Thomas W. Anthony, Richard Everett, Andrea Tacchetti, Tom Eccles, János Kramár: Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent. AAMAS 2022: 507-515
- [c45] Siqi Liu, Marc Lanctot, Luke Marris, Nicolas Heess: Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games. ICML 2022: 13793-13806
- [c44] Finbarr Timbers, Nolan Bard, Edward Lockhart, Marc Lanctot, Martin Schmid, Neil Burch, Julian Schrittwieser, Thomas Hubert, Michael Bowling: Approximate Exploitability: Learning a Best Response. IJCAI 2022: 3487-3493
- [d1] Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls: Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning". Zenodo, 2022
- [i45] Stephen McAleer, Kevin Wang, John B. Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, Roy Fox: Anytime PSRO for Two-Player Zero-Sum Games. CoRR abs/2201.07700 (2022)
- [i44] Siqi Liu, Marc Lanctot, Luke Marris, Nicolas Heess: Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games. CoRR abs/2205.15879 (2022)
- [i43] Stephen McAleer, Gabriele Farina, Marc Lanctot, Tuomas Sandholm: ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret. CoRR abs/2206.04122 (2022)
- [i42] Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer: A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games. CoRR abs/2206.05825 (2022)
- [i41] Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls: Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022)
- [i40] Ian Gemp, Thomas W. Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome T. Connor, Vibhavari Dasagi, Bart De Vylder, Edgar A. Duéñez-Guzmán, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, Siqi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Pérolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls: Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments. CoRR abs/2209.10958 (2022)
- [i39] Luke Marris, Marc Lanctot, Ian Gemp, Shayegan Omidshafiei, Stephen McAleer, Jerome T. Connor, Karl Tuyls, Thore Graepel: Game Theoretic Rating in N-player general-sum games with Equilibria. CoRR abs/2210.02205 (2022)
- 2021
- [c43] Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R. Wright, Amy R. Greenwald, Michael Bowling: Hindsight and Sequential Rationality of Correlated Play. AAAI 2021: 5584-5594
- [c42] Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot: Solving Common-Payoff Games with Approximate Policy Iteration. AAAI 2021: 9695-9703
- [c41] Michal Sustr, Martin Schmid, Matej Moravcík, Neil Burch, Marc Lanctot, Michael Bowling: Sound Algorithms in Imperfect Information Games. AAMAS 2021: 1674-1676
- [c40] Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel: Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers. ICML 2021: 7480-7491
- [c39] Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy R. Greenwald: Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games. ICML 2021: 7818-7828
- [c38] Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls: From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. ICML 2021: 8525-8535
- [c37] Abhinav Gupta, Marc Lanctot, Angeliki Lazaridou: Dynamic population-based meta-learning for multi-agent communication with natural language. NeurIPS 2021: 16899-16912
- [i38] Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot: Solving Common-Payoff Games with Approximate Policy Iteration. CoRR abs/2101.04237 (2021)
- [i37] Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy Greenwald: Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games. CoRR abs/2102.06973 (2021)
- [i36] Ian Gemp, Rahul Savani, Marc Lanctot, Yoram Bachrach, Thomas W. Anthony, Richard Everett, Andrea Tacchetti, Tom Eccles, János Kramár: Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent. CoRR abs/2106.01285 (2021)
- [i35] Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel: Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers. CoRR abs/2106.09435 (2021)
- [i34] Abhinav Gupta, Marc Lanctot, Angeliki Lazaridou: Dynamic population-based meta-learning for multi-agent communication with natural language. CoRR abs/2110.14241 (2021)
- [i33] Martin Schmid, Matej Moravcik, Neil Burch, Rudolf Kadlec, Joshua Davidson, Kevin Waugh, Nolan Bard, Finbarr Timbers, Marc Lanctot, G. Zacharias Holland, Elnaz Davoodi, Alden Christianson, Michael Bowling: Player of Games. CoRR abs/2112.03178 (2021)
- 2020
- [j10] Karl Tuyls, Julien Pérolat, Marc Lanctot, Edward Hughes, Richard Everett, Joel Z. Leibo, Csaba Szepesvári, Thore Graepel: Bounds and dynamics for empirical game theoretic analysis. Auton. Agents Multi Agent Syst. 34(1): 7 (2020)
- [j9] Nolan Bard, Jakob N. Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H. Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, Iain Dunning, Shibl Mourad, Hugo Larochelle, Marc G. Bellemare, Michael Bowling: The Hanabi challenge: A new frontier for AI research. Artif. Intell. 280: 103216 (2020)
- [j8] Yoram Bachrach, Richard Everett, Edward Hughes, Angeliki Lazaridou, Joel Z. Leibo, Marc Lanctot, Michael Johanson, Wojciech M. Czarnecki, Thore Graepel: Negotiating team formation using deep reinforcement learning. Artif. Intell. 288: 103356 (2020)
- [c36] Daniel Hennes, Dustin Morrill, Shayegan Omidshafiei, Rémi Munos, Julien Pérolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Paavo Parmas, Edgar A. Duéñez-Guzmán, Karl Tuyls: Neural Replicator Dynamics: Multiagent Learning via Hedging Policy Gradients. AAMAS 2020: 492-501
- [c35] Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos: A Generalized Training Approach for Multiagent Learning. ICLR 2020
- [c34] Rémi Munos, Julien Pérolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls: Fast computation of Nash Equilibria in Imperfect Information Games. ICML 2020: 7119-7129
- [c33] Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach: Learning to Play No-Press Diplomacy with Best Response Policy Iteration. NeurIPS 2020
- [i32] Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls: From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. CoRR abs/2002.08456 (2020)
- [i31] Finbarr Timbers, Edward Lockhart, Martin Schmid, Marc Lanctot, Michael Bowling: Approximate exploitability: Learning a best response in large games. CoRR abs/2004.09677 (2020)
- [i30] Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach: Learning to Play No-Press Diplomacy with Best Response Policy Iteration. CoRR abs/2006.04635 (2020)
- [i29] Michal Sustr, Martin Schmid, Matej Moravcík, Neil Burch, Marc Lanctot, Michael Bowling: Sound Search in Imperfect Information Games. CoRR abs/2006.08740 (2020)
- [i28] Audrunas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Pérolat, Dustin Morrill, Vinícius Flores Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls: The Advantage Regret-Matching Actor-Critic. CoRR abs/2008.12234 (2020)
- [i27] Yoram Bachrach, Richard Everett, Edward Hughes, Angeliki Lazaridou, Joel Z. Leibo, Marc Lanctot, Michael Johanson, Wojciech M. Czarnecki, Thore Graepel: Negotiating Team Formation Using Deep Reinforcement Learning. CoRR abs/2010.10380 (2020)
- [i26] Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R. Wright, Amy Greenwald, Michael Bowling: Hindsight and Sequential Rationality of Correlated Play. CoRR abs/2012.05874 (2020)
2010 – 2019
- 2019
- [j7] Guy Barash, Mauricio Castillo-Effen, Niyati Chhaya, Peter Clark, Huáscar Espinoza, Eitan Farchi, Christopher W. Geib, Odd Erik Gundersen, Seán Ó hÉigeartaigh, José Hernández-Orallo, Chiori Hori, Xiaowei Huang, Kokil Jaidka, Pavan Kapanipathi, Sarah Keren, Seokhwan Kim, Marc Lanctot, Danny Lange, Julian J. McAuley, David R. Martinez, Marwan Mattar, Mausam, Martin Michalowski, Reuth Mirsky, Roozbeh Mottaghi, Joseph C. Osborn, Julien Pérolat, Martin Schmid, Arash Shaban-Nejad, Onn Shehory, Biplav Srivastava, William W. Streilein, Kartik Talamadupula, Julian Togelius, Koichiro Yoshino, Quanshi Zhang, Imed Zitouni: Reports of the Workshops Held at the 2019 AAAI Conference on Artificial Intelligence. AI Mag. 40(3): 67-78 (2019)
- [c32] Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael Bowling: Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games Using Baselines. AAAI 2019: 2157-2164
- [c31] Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls: Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent. IJCAI 2019: 464-470
- [i25] Nolan Bard, Jakob N. Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H. Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, Iain Dunning, Shibl Mourad, Hugo Larochelle, Marc G. Bellemare, Michael Bowling: The Hanabi Challenge: A New Frontier for AI Research. CoRR abs/1902.00506 (2019)
- [i24] Joel Z. Leibo, Edward Hughes, Marc Lanctot, Thore Graepel: Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research. CoRR abs/1903.00742 (2019)
- [i23] Shayegan Omidshafiei, Christos H. Papadimitriou, Georgios Piliouras, Karl Tuyls, Mark Rowland, Jean-Baptiste Lespiau, Wojciech M. Czarnecki, Marc Lanctot, Julien Pérolat, Rémi Munos: α-Rank: Multi-Agent Evaluation by Evolution. CoRR abs/1903.01373 (2019)
- [i22] Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls: Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent. CoRR abs/1903.05614 (2019)
- [i21] Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Rémi Munos, Julien Pérolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Karl Tuyls: Neural Replicator Dynamics. CoRR abs/1906.00190 (2019)
- [i20] Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinícius Flores Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas W. Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis: OpenSpiel: A Framework for Reinforcement Learning in Games. CoRR abs/1908.09453 (2019)
- [i19] Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos: A Generalized Training Approach for Multiagent Learning. CoRR abs/1909.12823 (2019)
- 2018
- [c30] Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys: Deep Q-learning From Demonstrations. AAAI 2018: 3223-3230
- [c29] Karl Tuyls, Julien Pérolat, Marc Lanctot, Joel Z. Leibo, Thore Graepel: A Generalised Method for Empirical Game Theoretic Analysis. AAMAS 2018: 77-85
- [c28] Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinícius Flores Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel: Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward. AAMAS 2018: 2085-2087
- [c27] Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z. Leibo, Karl Tuyls, Stephen Clark: Emergent Communication through Negotiation. ICLR (Poster) 2018
- [c26] Sriram Srinivasan, Marc Lanctot, Vinícius Flores Zambaldi, Julien Pérolat, Karl Tuyls, Rémi Munos, Michael Bowling: Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. NeurIPS 2018: 3426-3439
- [i18] Karl Tuyls, Julien Pérolat, Marc Lanctot, Joel Z. Leibo, Thore Graepel: A Generalised Method for Empirical Game Theoretic Analysis. CoRR abs/1803.06376 (2018)
- [i17] Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z. Leibo, Karl Tuyls, Stephen Clark: Emergent Communication through Negotiation. CoRR abs/1804.03980 (2018)
- [i16] Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael Bowling: Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines. CoRR abs/1809.03057 (2018)
- [i15] Sriram Srinivasan, Marc Lanctot, Vinícius Flores Zambaldi, Julien Pérolat, Karl Tuyls, Rémi Munos, Michael Bowling: Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. CoRR abs/1810.09026 (2018)
- 2017
- [c25] Joel Z. Leibo, Vinícius Flores Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel: Multi-agent Reinforcement Learning in Sequential Social Dilemmas. AAMAS 2017: 464-473
- [c24] Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel: A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. NIPS 2017: 4190-4203
- [i14] Joel Z. Leibo, Vinícius Flores Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel: Multi-agent Reinforcement Learning in Sequential Social Dilemmas. CoRR abs/1702.03037 (2017)
- [i13] Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys: Learning from Demonstrations for Real World Reinforcement Learning. CoRR abs/1704.03732 (2017)
- [i12] Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinícius Flores Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel: Value-Decomposition Networks For Cooperative Multi-Agent Learning. CoRR abs/1706.05296 (2017)
- [i11] Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel: A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. CoRR abs/1711.00832 (2017)
- [i10] Karl Tuyls, Julien Pérolat, Marc Lanctot, Georg Ostrovski, Rahul Savani, Joel Z. Leibo, Toby Ord, Thore Graepel, Shane Legg: Symmetric Decomposition of Asymmetric Games. CoRR abs/1711.05074 (2017)
- [i9] David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy P. Lillicrap, Karen Simonyan, Demis Hassabis: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. CoRR abs/1712.01815 (2017)
- 2016
- [j6] Branislav Bosanský, Viliam Lisý, Marc Lanctot, Jirí Cermák, Mark H. M. Winands: Algorithms for computing strategies in two-player simultaneous move games. Artif. Intell. 237: 1-40 (2016)
- [j5] David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Vedavyas Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis: Mastering the game of Go with deep neural networks and tree search. Nat. 529(7587): 484-489 (2016)
- [c23] Chrisantha Fernando, Dylan Banarse, Malcolm Reynolds, Frederic Besse, David Pfau, Max Jaderberg, Marc Lanctot, Daan Wierstra: Convolution by Evolution: Differentiable Pattern Producing Networks. GECCO 2016: 109-116
- [c22] Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas: Dueling Network Architectures for Deep Reinforcement Learning. ICML 2016: 1995-2003
- [c21] Audrunas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves: Memory-Efficient Backpropagation Through Time. NIPS 2016: 4125-4133
- [i8] Chrisantha Fernando, Dylan Banarse, Malcolm Reynolds, Frederic Besse, David Pfau, Max Jaderberg, Marc Lanctot, Daan Wierstra: Convolution by Evolution: Differentiable Pattern Producing Networks. CoRR abs/1606.02580 (2016)
- [i7] Audrunas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves: Memory-Efficient Backpropagation Through Time. CoRR abs/1606.03401 (2016)
- 2015
- [c20] Viliam Lisý, Marc Lanctot, Michael H. Bowling: Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games. AAMAS 2015: 27-36
- [c19] Johannes Heinrich, Marc Lanctot, David Silver: Fictitious Self-Play in Extensive-Form Games. ICML 2015: 805-813
- [i6] Ziyu Wang, Nando de Freitas, Marc Lanctot: Dueling Network Architectures for Deep Reinforcement Learning. CoRR abs/1511.06581 (2015)
- 2014
- [j4] Tom Pepels, Mark H. M. Winands, Marc Lanctot: Real-Time Monte Carlo Tree Search in Ms Pac-Man. IEEE Trans. Comput. Intell. AI Games 6(3): 245-257 (2014)
- [c18] Marc Lanctot: Further developments of extensive-form replicator dynamics using the sequence-form representation. AAMAS 2014: 1257-1264
- [c17] Marc Lanctot, Mark H. M. Winands, Tom Pepels, Nathan R. Sturtevant: Monte Carlo Tree Search with heuristic evaluations using implicit minimax backups. CIG 2014: 1-8
- [c16] Mandy J. W. Tak, Marc Lanctot, Mark H. M. Winands: Monte Carlo Tree Search variants for simultaneous move games. CIG 2014: 1-8
- [c15] Tom Pepels, Tristan Cazenave, Mark H. M. Winands, Marc Lanctot: Minimizing Simple and Cumulative Regret in Monte-Carlo Tree Search. CGW@ECAI 2014: 1-15
- [c14] Tom Pepels, Mandy J. W. Tak, Marc Lanctot, Mark H. M. Winands: Quality-based Rewards for Monte-Carlo Tree Search Simulations. ECAI 2014: 705-710
- [i5] Marc J. V. Ponsen, Steven de Jong, Marc Lanctot: Computing Approximate Nash Equilibria and Robust Best-Responses Using Sampling. CoRR abs/1401.4591 (2014)
- [i4] Marc Lanctot, Mark H. M. Winands, Tom Pepels, Nathan R. Sturtevant: Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups. CoRR abs/1406.0486 (2014)
- 2013
- [j3] Marc Lanctot, Mark H. M. Winands: LOA Wins Lines of Action Tournament. J. Int. Comput. Games Assoc. 36(4): 239-240 (2013)
- [j2] Marc Lanctot, Mark H. M. Winands: SIA Wins Surakarta Tournament. J. Int. Comput. Games Assoc. 36(4): 241 (2013)
- [c13] Markus Esser, Michael Gras, Mark H. M. Winands, Maarten P. D. Schadd, Marc Lanctot: Improving Best-Reply Search. Computers and Games 2013: 125-137
- [c12] Todd W. Neller, Marc Lanctot, Devika Subramanian, Stephanie E. August: Model AI Assignments 2013. EAAI 2013
- [c11] Marc Lanctot, Viliam Lisý, Mark H. M. Winands: Monte Carlo Tree Search in Simultaneous Move Games with Applications to Goofspiel. CGW@IJCAI 2013: 28-43
- [c10] Marc Lanctot, Abdallah Saffidine, Joel Veness, Christopher Archibald, Mark H. M. Winands: Monte Carlo *-Minimax Search. IJCAI 2013: 580-586
- [c9] Viliam Lisý, Vojtech Kovarík, Marc Lanctot, Branislav Bosanský: Convergence of Monte Carlo Tree Search in Simultaneous Move Games. NIPS 2013: 2112-2120
- [i3] Marc Lanctot, Abdallah Saffidine, Joel Veness, Christopher Archibald, Mark H. M. Winands: Monte Carlo *-Minimax Search. CoRR abs/1304.6057 (2013)
- [i2] Viliam Lisý, Vojtech Kovarík, Marc Lanctot, Branislav Bosanský: Convergence of Monte Carlo Tree Search in Simultaneous Move Games. CoRR abs/1310.8613 (2013)
- 2012
- [c8] Richard G. Gibson, Marc Lanctot, Neil Burch, Duane Szafron, Michael Bowling: Generalized Sampling and Variance in Counterfactual Regret Minimization. AAAI 2012: 1355-1361
- [c7] Michael Johanson, Nolan Bard, Marc Lanctot, Richard G. Gibson, Michael Bowling: Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization. AAMAS 2012: 837-846
- [c6] Marc Lanctot, Richard G. Gibson, Neil Burch, Michael Bowling: No-Regret Learning in Extensive-Form Games with Imperfect Recall. ICML 2012
- [c5] Richard G. Gibson, Neil Burch, Marc Lanctot, Duane Szafron: Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions. NIPS 2012: 1889-1897
- [i1] Marc Lanctot, Richard G. Gibson, Neil Burch, Martin Zinkevich, Michael H. Bowling: No-Regret Learning in Extensive-Form Games with Imperfect Recall. CoRR abs/1205.0622 (2012)
- 2011
- [j1] Marc J. V. Ponsen, Steven de Jong, Marc Lanctot: Computing Approximate Nash Equilibria and Robust Best-Responses Using Sampling. J. Artif. Intell. Res. 42: 575-605 (2011)
- [c4] Joel Veness, Marc Lanctot, Michael H. Bowling: Variance Reduction in Monte-Carlo Tree Search. NIPS 2011: 1836-1844
- 2010
- [c3] Marc J. V. Ponsen, Marc Lanctot, Steven de Jong: MCRNR: Fast Computing of Restricted Nash Responses by Means of Sampling. Interactive Decision Theory and Game Theory 2010
2000 – 2009
- 2009
- [c2] Marc Lanctot, Kevin Waugh, Martin Zinkevich, Michael H. Bowling: Monte Carlo Sampling for Regret Minimization in Extensive Games. NIPS 2009: 1078-1086
- 2007
- [c1] Franisek Sailer, Michael Buro, Marc Lanctot: Adversarial Planning Through Strategy Simulation. CIG 2007: 80-87
last updated on 2024-10-30 21:29 CET by the dblp team
all metadata released as open data under CC0 1.0 license