Authors: Sebastian Pol¹; Schirin Baer²; Danielle Turner¹; Vladimir Samsonov² and Tobias Meisen³
Affiliations:
¹ Siemens AG, Digital Industries, Nuremberg, Germany
² Institute of Information Management in Mechanical Engineering, RWTH Aachen University, Aachen, Germany
³ Institute of Technologies and Management of the Digital Transformation, University of Wuppertal, Wuppertal, Germany
Keyword(s):
Cooperative Agents, Deep Reinforcement Learning, Flexible Manufacturing System, Global Optimization, Job Shop Scheduling, Reactive Scheduling, Reward Design.
Abstract:
In flexible manufacturing, efficient production requires reactive control. We present a solution for practical, flexible job shop scheduling problems, focusing on minimizing total makespan while handling many product variants and unseen production scenarios. In our system, each product is controlled by an independent reinforcement learning agent responsible for resource allocation and transportation. A significant challenge in multi-agent solutions is collaboration between agents toward a common optimization objective. We implement and compare two global reward designs that enable cooperation between the agents during production: dense local rewards augmented with global reward factors, and a sparse global reward design. The agents are trained on randomized product combinations, and we validate the results on unseen scheduling scenarios to evaluate generalization. Our goal is not to outperform existing domain-specific heuristics for total makespan, but to generate comparably good schedules with the advantage of being able to react instantaneously to unforeseen events. Both implemented reward designs show very promising results: the dense reward design performs slightly better, while the sparse reward design is much more intuitive to implement. We benchmark our results against simulated annealing on total makespan and computation time, showing that we achieve comparable results with reactive behavior.
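Since the abstract contrasts the two reward designs only in prose, the following is a minimal sketch of how they might look in practice. All names, fields, and formulas here are hypothetical illustrations under our own assumptions, not the paper's actual implementation.

```python
# Illustrative sketch only: the paper does not publish code, so the signal
# definitions below are stand-ins for the two reward designs it compares.
from dataclasses import dataclass

@dataclass
class StepInfo:
    op_duration: float      # processing time of the operation just completed
    agent_idle_time: float  # time this product's agent spent waiting
    episode_done: bool      # True once every product has finished

def dense_local_reward(step: StepInfo, global_factor: float) -> float:
    """Dense design: a per-step local reward (here, penalizing processing and
    idle time) scaled by a global factor reflecting the shared makespan goal."""
    local = -(step.op_duration + step.agent_idle_time)
    return local * global_factor

def sparse_global_reward(step: StepInfo, total_makespan: float) -> float:
    """Sparse design: all agents share a single reward at episode end,
    derived from the total makespan; intermediate steps yield nothing."""
    if not step.episode_done:
        return 0.0
    return -total_makespan
```

The sketch reflects the trade-off stated in the abstract: the sparse design is trivial to specify (one terminal signal shared by all agents), while the dense design requires choosing per-step terms and a global scaling factor but gives agents feedback at every decision.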