MA-YOLO: a multi-attention object detection network for remote sensing images

Q Song, M Hou, Y Xue, J Yu - Journal of Electronic Imaging, 2024 - spiedigitallibrary.org

Abstract

In recent years, deep learning-based object detection algorithms have demonstrated exceptional performance in natural environments. These algorithms have been used extensively in remote sensing applications, including the detection of structures and roads as well as flood and earthquake disasters. In these applications, remote sensing images may be captured by satellites, drones, and other equipment. Compared with conventional images, they often feature substantial occlusion, intricate backgrounds, and numerous small targets, which are difficult to detect because of the high resolution and large data volume. Existing algorithms tend to focus on either detection accuracy or speed and often fail to balance the two. To solve this problem, we propose MA-YOLO, a single-stage object detection algorithm based on YOLOv4. First, we design a backbone network that aims to enhance feature extraction capability while maintaining inference speed. Second, we introduce a parallel attention mechanism to improve the detection of small targets. Finally, we apply an attention mechanism to the path aggregation network to enhance the fusion of multi-scale features for detecting multi-scale targets. To validate the efficacy of the proposed approach, we evaluate MA-YOLO on three datasets: DIOR, RSOD, and NWPU VHR-10. The experimental results show that the proposed network achieves detection accuracies of 68.87%, 94.13%, and 93.77% on these datasets, respectively, while maintaining an inference speed of 28.4 frames per second, striking an effective balance between detection accuracy and speed.

