Mastering basketball with deep reinforcement learning: An integrated curriculum training approach

H Jia, C Ren, Y Hu, Y Chen, T Lv, C Fan… - Proceedings of the …, 2020 - aamas.csc.liv.ac.uk
Proceedings of the 19th International Conference on Autonomous …, 2020 - aamas.csc.liv.ac.uk
Abstract
Despite the success of deep reinforcement learning in a variety of game genres, such as board games, RTS, FPS, and MOBA games, sports games (SPG) like basketball have seldom been studied. Basketball is one of the most popular and challenging sports games because of its long time horizon, sparse rewards, complex game rules, and multiple roles with different capabilities. Although these problems can be partially alleviated by common methods such as hierarchical reinforcement learning, which decomposes the whole game into several subtasks based on game rules (such as attack and defense), these methods tend to ignore the strong correlations between the subtasks and can have difficulty generating reasonable policies across a whole basketball match. In addition, the presence of multiple agents adds further challenges to such games. In this work, we propose an integrated curriculum training approach (ICTA), which is composed of two parts. The first part handles the correlated subtasks from the perspective of a single player. It contains several weighted cascading curriculum learners that smoothly unify the base curriculum training of the corresponding subtasks using a Q-value backup mechanism with a weight factor. The second part enhances the cooperation ability of the basketball team. It is a curriculum switcher that learns when to switch the cooperative curriculum within one team by taking over collaborative actions, such as passing, from a single player’s action space. Our method is then applied to a commercial online basketball game named Fever Basketball (FB). Results show that ICTA significantly outperforms the built-in AI and reaches a win rate of up to around 70% against online human players during a 300-day evaluation period.
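The abstract does not give the exact form of the Q-value backup, so the following is only an illustrative sketch of one way such a weighted backup across subtask boundaries could look. It assumes a standard one-step Q-learning target, and it assumes that when one subtask (for example attack) ends, the target bootstraps from the Q-values of the following subtask (for example defense), scaled by a weight in [0, 1]. The function name `cascaded_td_target` and all of its parameters are hypothetical and do not come from the paper.

```python
from typing import Sequence


def cascaded_td_target(
    reward: float,
    gamma: float,
    next_q_same_task: Sequence[float],
    next_q_following_task: Sequence[float],
    subtask_done: bool,
    weight: float,
) -> float:
    """One-step TD target with a weighted backup across a subtask boundary.

    Assumption (not taken from the paper): inside a subtask the target is the
    usual Q-learning target; at the boundary it bootstraps from the following
    subtask's Q-values, scaled by `weight`.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if subtask_done:
        # Boundary step: back up value from the following subtask's learner.
        return reward + weight * gamma * max(next_q_following_task)
    # Ordinary step: bootstrap from the current subtask's own learner.
    return reward + gamma * max(next_q_same_task)
```

Under these assumptions, a weight of 0 treats each subtask as an isolated episode, while a weight of 1 treats the match as one continuous task. Raising the weight gradually during training is one plausible way to couple the subtasks smoothly.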