Learning phase competition for traffic signal control

G Zheng, Y Xiong, X Zang, J Feng, H Wei… - Proceedings of the 28th …, 2019 - dl.acm.org
Proceedings of the 28th ACM international conference on information and …, 2019 - dl.acm.org
Increasingly available city data and advanced learning techniques have empowered people to improve the efficiency of city functions. Among these efforts, improving urban transportation efficiency is one of the most prominent topics. Recent studies have proposed using reinforcement learning (RL) for traffic signal control. Unlike traditional transportation approaches, which rely heavily on prior knowledge, RL can learn directly from feedback. However, without careful model design, existing RL methods typically take a long time to converge, and the learned models may fail to adapt to new scenarios. For example, a model trained well on morning traffic may not work for afternoon traffic because the traffic flow could be reversed, resulting in a very different state representation. In this paper, we propose a novel design called FRAP, which is based on the intuitive principle of phase competition in traffic signal control: when two traffic signals conflict, priority should be given to the one with the larger traffic movement (i.e., higher demand). Through phase competition modeling, our model achieves invariance to symmetrical cases such as flipping and rotation in traffic flow. Through comprehensive experiments, we demonstrate that our model finds better solutions than existing RL methods in the complicated all-phase selection problem, converges much faster during training, and generalizes better across different road structures and traffic conditions.
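The phase competition principle and the symmetry claim in the abstract can be illustrated with a small toy example. The sketch below is not the FRAP network and makes no claim about the paper's architecture: the movement names, the four-phase table, the demand numbers, and the helper functions are all hypothetical, chosen only to show that a rule which scores each phase by the demand of its own movements gives the same decision when the flow is flipped and the correspondingly rotated decision when the flow is rotated.

```python
# Toy illustration only; NOT the FRAP model from the paper.
# Movement names, the phase table, and all demand values are hypothetical.

# Each phase serves two non-conflicting movements (assumed layout).
PHASES = {
    "NS_through": ("N_through", "S_through"),
    "NS_left": ("N_left", "S_left"),
    "EW_through": ("E_through", "W_through"),
    "EW_left": ("E_left", "W_left"),
}


def phase_demand(demand, phase):
    """Total demand (e.g. queued vehicles) on the movements a phase serves."""
    return sum(demand[m] for m in PHASES[phase])


def pick_phase(demand):
    """Phase competition in its simplest form: the highest-demand phase wins."""
    return max(PHASES, key=lambda p: phase_demand(demand, p))


def relabel(demand, mapping):
    """Move each approach's demand to another approach, e.g. N -> S."""
    return {mapping.get(m[0], m[0]) + m[1:]: v for m, v in demand.items()}


# Hypothetical morning demand: heavy flow entering from the north.
morning = {
    "N_through": 12, "S_through": 3, "N_left": 2, "S_left": 1,
    "E_through": 4, "W_through": 5, "E_left": 0, "W_left": 1,
}

# Flip: the heavy flow now enters from the south (reversed traffic).
flipped = relabel(morning, {"N": "S", "S": "N"})

# Rotation by 90 degrees: N -> E, E -> S, S -> W, W -> N.
rotated = relabel(morning, {"N": "E", "E": "S", "S": "W", "W": "N"})

print(pick_phase(morning))   # NS_through
print(pick_phase(flipped))   # NS_through
print(pick_phase(rotated))   # EW_through
```

Because the rule looks only at the demand of the movements inside each phase, relabeling the approaches cannot change which pair of movements wins. This is the kind of symmetry the abstract describes, shown here with a fixed rule rather than a learned model.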