Multi-UAV Redeployment Optimization Based on Multi-Agent Deep Reinforcement Learning Oriented to Swarm Performance Restoration
Abstract
1. Introduction
2. Problem Formulation
2.1. Mission, Destruction, and Reconfiguration
2.1.1. Mission
2.1.2. Destruction
2.1.3. Reconfiguration
2.2. Objective, Constraints, and Variables
3. MADRL-Based DR Method
3.1. Reconfiguration Decision Process
3.1.1. Mission and Destruction Features
3.1.2. Reconfiguration Action Generation
3.1.3. Renewal Features
3.2. Deep Q-Learning for Reconfiguration
3.3. QMIX for Multi-Agent Strategy
4. Case Study
4.1. UAV Swarm Reconfiguration
4.1.1. Mission
4.1.2. Destruction
4.1.3. Reconfiguration
4.2. Discussion
4.2.1. Different Algorithms
4.2.2. Different Destruction Cases
4.2.3. Different Swarm Scales
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Location | (3, 74.48) | (3, 77.94) | (3, 81.41) | (6, 76.21) | (6, 79.67) | (9, 77.94) |
Location | (12, 76.21) | (12, 79.67) | (12, 83.14) | (15, 77.94) | (15, 81.41) | (18, 79.67) |
Location | (33, 53.69) | (33, 57.16) | (33, 60.62) | (36, 55.43) | (36, 58.89) | (39, 57.16) |
…… | | | | | | |
Location | (15, 12.12) | (15, 15.59) | (15, 19.05) | (18, 13.86) | (18, 17.32) | (21, 15.59) |
Location | (33, 5.20) | (33, 12.12) | (33, 8.66) | (36, 6.93) | (36, 10.39) | (39, 8.66) |
Destruction | Parameter | Parameter Value |
---|---|---|
Local destruction 1 | Destruction center | (18, 17.32) |
Local destruction 1 | Destruction radius | 11 |
Local destruction 1 | Destroyed UAVs | , , , , , |
Local destruction 2 | Destruction center | (5, 83.13) |
Local destruction 2 | Destruction radius | 4 |
Local destruction 2 | Destroyed UAVs | , |
Random destruction | Destroyed UAVs | , , |
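The local destruction parameters above can be checked against the listed UAV locations with a short sketch. It assumes, as an illustration only, that a local destruction removes every UAV whose Euclidean distance from the destruction center is at most the destruction radius; the source tables do not state this rule. The function name `destroyed_by_local_destruction` is hypothetical, and only the location rows listed above are used (the elided rows are not reconstructed).

```python
import math

# Locations copied from the location rows listed above; elided rows ("……") are omitted.
uav_locations = [
    (3, 74.48), (3, 77.94), (3, 81.41), (6, 76.21), (6, 79.67), (9, 77.94),
    (12, 76.21), (12, 79.67), (12, 83.14), (15, 77.94), (15, 81.41), (18, 79.67),
    (33, 53.69), (33, 57.16), (33, 60.62), (36, 55.43), (36, 58.89), (39, 57.16),
    (15, 12.12), (15, 15.59), (15, 19.05), (18, 13.86), (18, 17.32), (21, 15.59),
    (33, 5.20), (33, 12.12), (33, 8.66), (36, 6.93), (36, 10.39), (39, 8.66),
]


def destroyed_by_local_destruction(locations, center, radius):
    """Return the locations within `radius` of `center`.

    Assumption (not from the source): destruction is a Euclidean disc.
    """
    return [p for p in locations if math.dist(p, center) <= radius]


# Parameters taken from the destruction table.
local_1 = destroyed_by_local_destruction(uav_locations, (18, 17.32), 11)
local_2 = destroyed_by_local_destruction(uav_locations, (5, 83.13), 4)

print(len(local_1), local_1)
print(len(local_2), local_2)
```

Under this assumption, and restricted to the listed rows, local destruction 1 covers six locations and local destruction 2 covers two.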
Swarm Scale | UAV | Agent | Reconfiguration Action: Initial Location | Reconfiguration Action: Final Location |
---|---|---|---|---|
7 × 6 UAVs | | agent 1 | (9, 77.94) | (6, 79.67) |
7 × 6 UAVs | | agent 2 | (15, 81.41) | (15, 15.59) |
7 × 6 UAVs | | agent 3 | (33, 60.62) | (18, 17.32) |
7 × 6 UAVs | | agent 3 | (39, 57.16) | (33, 57.16) |
…… | | | | |
7 × 6 UAVs | | agent 4 | (82, 36.37) | (21, 15.59) |
7 × 6 UAVs | | agent 7 | (36, 6.93) | (18, 13.86) |
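As a rough way to read the reconfiguration actions, the straight-line displacement of each move can be computed from the initial and final locations. This is a minimal sketch: using Euclidean distance as the movement measure is an assumption made here for illustration, not a cost defined in the source, and `travel_distance` is a hypothetical helper. Agent labels and coordinates are copied from the listed rows as they appear.

```python
import math

# (agent label, initial location, final location) from the listed rows;
# elided rows ("……") are omitted.
actions = [
    ("agent 1", (9, 77.94), (6, 79.67)),
    ("agent 2", (15, 81.41), (15, 15.59)),
    ("agent 3", (33, 60.62), (18, 17.32)),
    ("agent 3", (39, 57.16), (33, 57.16)),
    ("agent 4", (82, 36.37), (21, 15.59)),
    ("agent 7", (36, 6.93), (18, 13.86)),
]


def travel_distance(initial, final):
    """Straight-line distance of one move (assumed measure, not from the source)."""
    return math.dist(initial, final)


distances = [(agent, round(travel_distance(a, b), 2)) for agent, a, b in actions]
for agent, d in distances:
    print(f"{agent}: {d}")
```

Short moves, such as the first listed action, stay within one neighbourhood of the swarm, while the long moves relocate a UAV across the deployment area.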
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wu, Q.; Geng, Z.; Ren, Y.; Feng, Q.; Zhong, J. Multi-UAV Redeployment Optimization Based on Multi-Agent Deep Reinforcement Learning Oriented to Swarm Performance Restoration. Sensors 2023, 23, 9484. https://doi.org/10.3390/s23239484