MA-YOLO: a multi-attention object detection network for remote sensing images

Q Song, M Hou, Y Xue, J Yu - Journal of Electronic Imaging, 2024 - spiedigitallibrary.org
Abstract
In recent years, deep learning-based object detection algorithms have demonstrated exceptional performance in natural environments. These algorithms have been used extensively in remote sensing applications, including the detection of structures and roads as well as flood and earthquake disasters. In these applications, remote sensing images may be captured by satellites, drones, and other equipment. Compared with conventional images, they often feature substantial occlusion, intricate backgrounds, and numerous small targets, which are difficult to detect because of the high resolution and large data volume. Existing algorithms tend to focus on either detection accuracy or speed and often fail to balance the two. To solve this problem, we propose MA-YOLO, a single-stage object detection algorithm based on YOLOv4. First, we design a backbone network that enhances feature extraction while maintaining inference speed. Second, we introduce a parallel attention mechanism to improve the detection of small targets. Finally, we apply an attention mechanism to the path aggregation network to enhance the fusion of multi-scale features for detecting multi-scale targets. To validate the approach, we evaluate MA-YOLO on three datasets: DIOR, RSOD, and NWPU VHR-10. The experimental results show that the proposed network achieves detection accuracies of 68.87%, 94.13%, and 93.77% on these datasets, respectively, at an inference speed of 28.4 frames per second, striking an effective balance between detection accuracy and speed.
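To put the reported speed in more familiar terms, the short sketch below converts the abstract's 28.4 frames per second into an approximate per-image latency. The helper name `fps_to_latency_ms` is hypothetical and chosen only for illustration. The conversion also assumes images are processed one at a time, which the abstract does not state.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert a throughput in frames per second to milliseconds per frame.

    Assumes frames are processed one at a time (no batching), which is an
    assumption made for this illustration, not a detail from the abstract.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Inference speed reported in the abstract.
REPORTED_FPS = 28.4

latency_ms = fps_to_latency_ms(REPORTED_FPS)
print(f"{REPORTED_FPS} FPS is about {latency_ms:.1f} ms per image")
# prints "28.4 FPS is about 35.2 ms per image"
```

The figure says nothing about the hardware or input resolution behind the measurement, so it is only comparable to other results obtained under the same conditions.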