A Modified YOLOv8 Detection Network for UAV Aerial Image Recognition
Abstract
1. Introduction
- Focusing on large-size feature maps and introducing the idea of Bi-PAN-FPN, this work improves the model's ability to detect small targets while increasing the number of multiscale feature-fusion opportunities, yielding better feature engineering. This addresses the false detections and missed detections of small targets that are common in aerial images (see the weighted-fusion sketch after this list);
- Optimizes the model's backbone network and loss function. The Ghostblock unit and the Wise-IoU bounding-box regression loss are integrated to improve generalization from three perspectives: feature diversity, long-distance capture of feature information, and avoidance of excessive penalties from geometric factors. The parameter count is suppressed while accuracy improves. This addresses the loss of long-range information and the balance among predicted anchors of different quality (see the Ghost module and Wise-IoU sketches after this list);
- The feasibility and effectiveness of the constructed model are verified through ablation experiments. Compared with the original benchmark network, the model's mAP on the test set of the international open-source VisDrone2019 dataset improves by 9.06%, the number of parameters falls by 13.21%, and the comprehensive capability improves significantly.
- The proposed model is compared with six of the most mainstream and advanced deep object-detection models to demonstrate its superiority. Furthermore, an interpretability comparison across three strong models illustrates why this method performs better.
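The Bi-PAN-FPN idea fuses features bidirectionally across scales, so a neck node may receive a backbone skip, a top-down map, and a bottom-up map at once. As a hedged illustration of one such fusion node, here is an EfficientDet-style fast normalized weighted fusion in PyTorch; this is a sketch, and the class name, channel sizes, and use of this exact fusion rule are assumptions rather than the paper's implementation:

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fast normalized fusion of same-resolution feature maps (EfficientDet-style).

    A Bi-PAN-FPN-like node merges several incoming paths; learnable non-negative
    weights decide how much each path contributes to the fused map.
    """

    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs: list[torch.Tensor]) -> torch.Tensor:
        w = torch.relu(self.weights)          # keep fusion weights non-negative
        w = w / (w.sum() + self.eps)          # fast normalized fusion
        return sum(wi * x for wi, x in zip(w, inputs))

# Example: fuse a backbone skip, a top-down map, and a bottom-up map at one node.
fuse = WeightedFusion(num_inputs=3)
p4 = fuse([torch.randn(1, 256, 40, 40) for _ in range(3)])
```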
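The backbone improvement builds Ghostblock units on GhostNet's Ghost module (GhostNetV2 additionally adds long-range attention on top of it). The following is a minimal PyTorch sketch of the underlying Ghost module only; the activation, kernel sizes, and ratio are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Ghost module (Han et al., 2020): a cheap stand-in for a full convolution.

    Part of the output channels come from an ordinary 1x1 convolution; the rest
    ("ghost" features) are generated from them by a cheap depthwise 3x3 conv.
    """

    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        primary_ch = out_ch // ratio  # assumes out_ch divisible by ratio
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, 1, bias=False),
            nn.BatchNorm2d(primary_ch),
            nn.SiLU(),
        )
        self.cheap = nn.Sequential(  # depthwise conv = the "cheap operation"
            nn.Conv2d(primary_ch, out_ch - primary_ch, 3, padding=1,
                      groups=primary_ch, bias=False),
            nn.BatchNorm2d(out_ch - primary_ch),
            nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```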
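For the loss function, below is a minimal PyTorch sketch of Wise-IoU v3 following Tong et al. (arXiv:2301.10051). The function name and running-mean bookkeeping are our own simplifications; in practice `mean_iou_loss` would be maintained as an exponential running mean of the IoU loss over training batches. The values α = 1.9 and δ = 3 match the training-parameter table in Section 4.1:

```python
import torch

def wise_iou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Wise-IoU v3 loss for boxes given as (x1, y1, x2, y2) tensors."""
    # Plain IoU loss from intersection and union.
    lt = torch.max(pred[..., :2], target[..., :2])
    rb = torch.min(pred[..., 2:], target[..., 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou_loss = 1 - inter / (area_p + area_t - inter + 1e-7)

    # R_WIoU: center distance normalized by the smallest enclosing box; the
    # denominator is detached so it does not produce harmful gradients.
    cx_p = (pred[..., 0] + pred[..., 2]) / 2
    cy_p = (pred[..., 1] + pred[..., 3]) / 2
    cx_t = (target[..., 0] + target[..., 2]) / 2
    cy_t = (target[..., 1] + target[..., 3]) / 2
    enc_w = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    enc_h = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    r_wiou = torch.exp(((cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2)
                       / (enc_w ** 2 + enc_h ** 2).detach())

    # Dynamic focusing: beta is the outlier degree of each anchor box; both
    # very low- and very high-quality anchors receive smaller gradient gain.
    beta = iou_loss.detach() / mean_iou_loss
    r = beta / (delta * alpha ** (beta - delta))
    return r * r_wiou * iou_loss
```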
2. Related Work
3. Improved Aerial Image Detection Model
3.1. Improvement of the Neck
3.2. Improvement of the Backbone
3.3. Improvement of the Loss Function
4. Results
4.1. Dataset and Its Preprocessing
4.2. Ablation Experiment
- The A model (i.e., the benchmark model) performed poorly. Its accuracy indicators were the lowest, but its FPS ranked first, reaching 182 f·s⁻¹. This indicates that although the improved model has fewer parameters (only 9.659 million), its additional network layers add some inference time. Even so, the improved model reaches 167 f·s⁻¹, which still meets real-time requirements in actual deployment.
- After integrating the B, C, and D structures, the model improved in several respects: attention to small-target features, reuse of multiscale features, suppression of information loss during long-range feature transmission, and handling of anchor boxes of different quality. Feature engineering was thus significantly strengthened, as the per-category indicators show. In most cases, each added structure improved the model's P, R, and AP to some extent. However, after incorporating the D structure, the scores in some categories dropped; that is, A + B + C occasionally outperformed A + B + C + D, but this did not affect the overall trend.
- Overall, the sequential addition of the three structures steadily improved accuracy on the VisDrone dataset while gradually reducing the parameter count; the final model size is only 19.157 MB. This shows that the improvements to the baseline model are feasible and effective, balancing the accuracy and speed needed in edge-device detection scenarios. The detection results for some scenes are shown in Figure 8.
4.3. Performance Comparison Experiment of the Deep Learning Model
- MobileNetv2-SSD had the worst overall performance. It had the fewest parameters, only 3.94 million. In the detection task, its R was the lowest on both the validation and test sets, meaning it missed a large number of objects. However, its P was high, so the objects it did detect were mostly identified correctly. This is mainly because the VisDrone dataset places heavy demands on a detector in terms of shooting angle, target size, and environmental complexity; MobileNetv2-SSD is often well suited to simpler tasks, but not to this kind of complex one.
- YOLOX-s shared these problems: its R on this dataset was relatively low and its missed-detection rate high. Its P, however, was the best on both the validation and test sets, giving YOLOX-s better overall results than YOLOv4-s, though it had the worst FPS. YOLOv4-s performed relatively modestly and beat only MobileNetv2-SSD in the detection task, but its R improved greatly over the previous two models and its missed-detection rate fell.
- The two lightweight models, YOLOv5-s and YOLOv7-tiny, both achieved excellent performance on the test set. YOLOv5-s in particular improved greatly after multiple official iterations. The P and R of both models were relatively balanced, so their detection rate and precision were well coordinated. YOLOv7-tiny had the smallest model size. On the test set, YOLOv5-s and YOLOv7-tiny were second only to the lightweight model proposed in this paper, and both are also suitable for target detection in UAV aerial images.
- The mAP of the model proposed in this paper was the best on both the validation and test sets, and its P, R, and mAP were comparable to or better than those of the non-lightweight YOLOv5-m. In FPS, parameter count, and model size, it sat in the first tier, giving it the best comprehensive performance. For the UAV aerial-image detection task considered here, the proposed model meets the needs of real production scenarios in both detection accuracy and ease of deployment, with considerable robustness and practicality.
4.4. Interpretability Experiments
4.5. Self-Built Dataset Experiment
- YOLOv4-s and YOLOv7-tiny performed similarly on the self-built dataset, both obtaining relatively low mAP values on the test set. Although YOLOv7-tiny had the smallest model size and the fewest parameters, its general performance was unremarkable. Even so, both models remain usable where precision requirements are not critical: each exceeded 150 f·s⁻¹ and can be deployed on edge devices;
- YOLOv5-s, YOLOX-s, and YOLOv8-s all achieved excellent results at roughly the same level of detection accuracy, with YOLOv8-s the most accurate. In FPS, YOLOv5-s and YOLOv8-s both exceeded 300 f·s⁻¹, whereas YOLOX-s did not reach 100, so the former two hold clear advantages in both detection accuracy and speed;
- The model in this paper led the original YOLOv8-s by nearly two percentage points in test-set mAP and led the worst-performing YOLOv7-tiny by 18.4 percentage points. Meanwhile, its FPS reached 294 f·s⁻¹, striking a good balance between detection accuracy and speed. This indicates that the proposed model achieves the best detection performance across scenarios and datasets and generalizes well. Part of its detection output on the test set is shown in Figure 15: small targets are essentially never missed, although redundant detection boxes occasionally appear and, in a few cases, similar backgrounds are mistaken for targets.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Adaimi, G.; Kreiss, S.; Alahi, A. Perceiving Traffic from Aerial Images. arXiv 2020, arXiv:2009.07611.
- Bouguettaya, A.; Zarzour, H.; Kechida, A.; Taberkit, A.M. Vehicle Detection from UAV Imagery with Deep Learning: A Review. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6047–6067.
- Byun, S.; Shin, I.-K.; Moon, J.; Kang, J.; Choi, S.-I. Road Traffic Monitoring from UAV Images Using Deep Learning Networks. Remote Sens. 2021, 13, 4027.
- Chang, Y.-C.; Chen, H.-T.; Chuang, J.-H.; Liao, I.-C. Pedestrian Detection in Aerial Images Using Vanishing Point Transformation and Deep Learning. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 1917–1921.
- Božić-Štulić, D.; Marušić, Ž.; Gotovac, S. Deep Learning Approach in Aerial Imagery for Supporting Land Search and Rescue Missions. Int. J. Comput. Vis. 2019, 127, 1256–1278.
- Chen, C.; Zhang, Y.; Lv, Q.; Wei, S.; Wang, X.; Sun, X.; Dong, J. RRNet: A Hybrid Detector for Object Detection in Drone-Captured Images. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019; pp. 100–108.
- Chen, Y.; Lee, W.S.; Gan, H.; Peres, N.; Fraisse, C.; Zhang, Y.; He, Y. Strawberry Yield Prediction Based on a Deep Neural Network Using High-Resolution Aerial Orthoimages. Remote Sens. 2019, 11, 1584.
- Chen, Y.; Li, J.; Niu, Y.; He, J. Small Object Detection Networks Based on Classification-Oriented Super-Resolution GAN for UAV Aerial Imagery. In Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China, 3–5 June 2019; pp. 4610–4615.
- Cai, W.; Wei, Z. Remote Sensing Image Classification Based on a Cross-Attention Mechanism and Graph Convolution. IEEE Geosci. Remote Sens. Lett. 2020, 19, 1–5.
- Deng, S.; Li, S.; Xie, K.; Song, W.; Liao, X.; Hao, A.; Qin, H. A Global-Local Self-Adaptive Network for Drone-View Object Detection. IEEE Trans. Image Process. 2020, 30, 1556–1569.
- Domozi, Z.; Stojcsics, D.; Benhamida, A.; Kozlovszky, M.; Molnar, A. Real Time Object Detection for Aerial Search and Rescue Missions for Missing Persons. In Proceedings of the 2020 IEEE 15th International Conference of System of Systems Engineering (SoSE), Budapest, Hungary, 2–4 June 2020; pp. 519–524.
- Hong, S.; Kang, S.; Cho, D. Patch-Level Augmentation for Object Detection in Aerial Images. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019; pp. 127–134.
- Dong, J.; Ota, K.; Dong, M. UAV-Based Real-Time Survivor Detection System in Post-Disaster Search and Rescue Operations. IEEE J. Miniat. Air Space Syst. 2021, 2, 209–219.
- Hsieh, M.-R.; Lin, Y.-L.; Hsu, W.H. Drone-Based Object Counting by Spatially Regularized Regional Proposal Network. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4145–4153.
- Liao, J.; Piao, Y.; Su, J.; Cai, G.; Huang, X.; Chen, L.; Huang, Z.; Wu, Y. Unsupervised Cluster Guided Object Detection in Aerial Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 11204–11216.
- Huang, Y.; Chen, J.; Huang, D. UFPMP-Det: Toward Accurate and Efficient Object Detection on Drone Imagery. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 22 February–1 March 2022; Volume 36, pp. 1026–1033.
- Liu, Z.; Gao, G.; Sun, L.; Fang, Z. HRDNet: High-Resolution Detection Network for Small Objects. In Proceedings of the 2021 IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, China, 5–9 July 2021; pp. 1–6.
- Qingyun, F.; Dapeng, H.; Zhaokui, W. Cross-Modality Fusion Transformer for Multispectral Object Detection. arXiv 2021, arXiv:2111.00273.
- Yang, F.; Fan, H.; Chu, P.; Blasch, E.; Ling, H. Clustered Object Detection in Aerial Images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8311–8320.
- Liang, X.; Zhang, J.; Zhuo, L.; Li, Y.; Tian, Q. Small Object Detection in Unmanned Aerial Vehicle Images Using Feature Fusion and Scaling-Based Single Shot Detector with Spatial Context Analysis. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 1758–1770.
- Jiao, L.; Zhang, F.; Liu, F.; Yang, S.; Li, L.; Feng, Z.; Qu, R. A Survey of Deep Learning-Based Object Detection. IEEE Access 2019, 7, 128837–128868.
- Cai, W.; Ning, X.; Zhou, G.; Bai, X.; Jiang, Y.; Li, W.; Qian, P. A Novel Hyperspectral Image Classification Model Using Bole Convolution with Three-Direction Attention Mechanism: Small Sample and Unbalanced Learning. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–17.
- Li, J.; Li, B.; Jiang, Y.; Tian, L.; Cai, W. MrFDDGAN: Multireceptive Field Feature Transfer and Dual Discriminator-Driven Generative Adversarial Network for Infrared and Color Visible Image Fusion. IEEE Trans. Instrum. Meas. 2023, 72, 1–28.
- Zou, Z.; Shi, Z.; Guo, Y.; Ye, J. Object Detection in 20 Years: A Survey. Proc. IEEE 2019, 111, 257–276.
- Chen, Y.; Li, R.; Li, R. HRCP: High-Ratio Channel Pruning for Real-Time Object Detection on Resource-Limited Platform. Neurocomputing 2021, 463, 155–167.
- Ringwald, T.; Sommer, L.; Schumann, A.; Beyerer, J.; Stiefelhagen, R. UAV-Net: A Fast Aerial Vehicle Detector for Mobile Platforms. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019; pp. 544–552.
- Li, Y.; Yuan, H.; Wang, Y.; Xiao, C. GGT-YOLO: A Novel Object Detection Algorithm for Drone-Based Maritime Cruising. Drones 2022, 6, 335.
- Deng, F.; Xie, Z.; Mao, W.; Li, B.; Shan, Y.; Wei, B.; Zeng, H. Research on Edge Intelligent Recognition Method Oriented to Transmission Line Insulator Fault Detection. Int. J. Electr. Power Energy Syst. 2022, 139, 108054.
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and Efficient Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
- Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. GhostNet: More Features from Cheap Operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 1580–1589.
- Tang, Y.; Han, K.; Guo, J.; Xu, C.; Xu, C.; Wang, Y. GhostNetV2: Enhance Cheap Operation with Long-Range Attention. arXiv 2022, arXiv:2211.12905.
- Tong, Z.; Chen, Y.; Xu, Z.; Yu, R. Wise-IoU: Bounding Box Regression Loss with Dynamic Focusing Mechanism. arXiv 2023, arXiv:2301.10051.
- Cao, Y.; He, Z.; Wang, L.; Wang, W.; Yuan, Y.; Zhang, D.; Zhang, J.; Zhu, P.; Van Gool, L.; Han, J. VisDrone-DET2021: The Vision Meets Drone Object Detection Challenge Results. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 2847–2854.
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
- Fang, Y.; Guo, X.; Chen, K.; Zhou, Z.; Ye, Q. Accurate and Automated Detection of Surface Knots on Sawn Timbers Using YOLO-V5 Model. BioResources 2021, 16, 5390–5406.
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430.
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. arXiv 2022, arXiv:2207.02696.
- Howard, A.; Sandler, M.; Chu, G.; Chen, L.-C.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V. Searching for MobileNetV3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626.
| Parameters | Setup |
|---|---|
| Epochs | 150 |
| Batch Size | 8 |
| Optimizer | SGD |
| NMS IoU | 0.7 |
| Initial Learning Rate | 1 × 10⁻² |
| Final Learning Rate | 1 × 10⁻⁴ |
| Momentum | 0.937 |
| Weight Decay | 5 × 10⁻⁴ |
| Image Scale | 0.5 |
| Image Flip Left–Right | 0.5 |
| Mosaic | 1.0 |
| Image Translation | 0.1 |
| α (Wise-IoU) | 1.9 |
| δ (Wise-IoU) | 3 |
| Close Mosaic | Last 10 epochs |
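For orientation, most of these settings correspond directly to arguments of the Ultralytics YOLOv8 training API. The sketch below shows that mapping under the assumption that the modified network is packaged as a model YAML; the file names are placeholders, and the Wise-IoU α/δ live inside the customized loss rather than in this interface:

```python
from ultralytics import YOLO

# Placeholder paths: the modified-network definition and the VisDrone data YAML.
model = YOLO("modified-yolov8s.yaml")

model.train(
    data="VisDrone.yaml",
    epochs=150,
    batch=8,
    optimizer="SGD",
    lr0=0.01,           # initial learning rate, 1e-2
    lrf=0.01,           # final LR factor: 0.01 * 0.01 = 1e-4 final LR
    momentum=0.937,
    weight_decay=5e-4,
    iou=0.7,            # NMS IoU threshold used at validation
    scale=0.5,          # image scale augmentation
    fliplr=0.5,         # left-right flip probability
    mosaic=1.0,         # mosaic augmentation probability
    translate=0.1,      # image translation augmentation
    close_mosaic=10,    # disable mosaic for the last 10 epochs
)
```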
| Classification | Data Set | Indicators | A | A + B | A + B + C | A + B + C + D |
|---|---|---|---|---|---|---|
| Pedestrian | Val | P | 0.492 | 0.544 | 0.560 | 0.574 |
| Pedestrian | Val | R | 0.403 | 0.414 | 0.424 | 0.424 |
| Pedestrian | Val | AP | 0.416 | 0.444 | 0.459 | 0.467 |
| Pedestrian | Test | P | 0.478 | 0.514 | 0.519 | 0.546 |
| Pedestrian | Test | R | 0.251 | 0.269 | 0.279 | 0.270 |
| Pedestrian | Test | AP | 0.265 | 0.287 | 0.299 | 0.300 |
| People | Val | P | 0.525 | 0.579 | 0.582 | 0.586 |
| People | Val | R | 0.281 | 0.268 | 0.295 | 0.305 |
| People | Val | AP | 0.315 | 0.330 | 0.350 | 0.360 |
| People | Test | P | 0.468 | 0.486 | 0.488 | 0.518 |
| People | Test | R | 0.099 | 0.109 | 0.120 | 0.117 |
| People | Test | AP | 0.134 | 0.147 | 0.152 | 0.158 |
| Bicycle | Val | P | 0.255 | 0.273 | 0.310 | 0.286 |
| Bicycle | Val | R | 0.134 | 0.146 | 0.176 | 0.160 |
| Bicycle | Val | AP | 0.113 | 0.133 | 0.151 | 0.140 |
| Bicycle | Test | P | 0.271 | 0.275 | 0.278 | 0.277 |
| Bicycle | Test | R | 0.101 | 0.106 | 0.138 | 0.114 |
| Bicycle | Test | AP | 0.092 | 0.095 | 0.110 | 0.106 |
| Car | Val | P | 0.713 | 0.738 | 0.754 | 0.761 |
| Car | Val | R | 0.773 | 0.772 | 0.782 | 0.780 |
| Car | Val | AP | 0.795 | 0.803 | 0.814 | 0.814 |
| Car | Test | P | 0.674 | 0.688 | 0.692 | 0.707 |
| Car | Test | R | 0.715 | 0.723 | 0.735 | 0.724 |
| Car | Test | AP | 0.715 | 0.728 | 0.735 | 0.734 |
| Van | Val | P | 0.527 | 0.514 | 0.525 | 0.536 |
| Van | Val | R | 0.449 | 0.457 | 0.479 | 0.490 |
| Van | Val | AP | 0.458 | 0.464 | 0.477 | 0.487 |
| Van | Test | P | 0.387 | 0.404 | 0.390 | 0.414 |
| Van | Test | R | 0.416 | 0.437 | 0.457 | 0.443 |
| Van | Test | AP | 0.366 | 0.395 | 0.393 | 0.391 |
| Truck | Val | P | 0.481 | 0.514 | 0.503 | 0.522 |
| Truck | Val | R | 0.372 | 0.377 | 0.383 | 0.385 |
| Truck | Val | AP | 0.363 | 0.382 | 0.380 | 0.391 |
| Truck | Test | P | 0.402 | 0.383 | 0.397 | 0.422 |
| Truck | Test | R | 0.411 | 0.419 | 0.437 | 0.439 |
| Truck | Test | AP | 0.367 | 0.369 | 0.378 | 0.385 |
| Tricycle | Val | P | 0.406 | 0.441 | 0.451 | 0.446 |
| Tricycle | Val | R | 0.314 | 0.297 | 0.314 | 0.316 |
| Tricycle | Val | AP | 0.278 | 0.279 | 0.306 | 0.315 |
| Tricycle | Test | P | 0.229 | 0.249 | 0.264 | 0.277 |
| Tricycle | Test | R | 0.253 | 0.283 | 0.308 | 0.291 |
| Tricycle | Test | AP | 0.146 | 0.178 | 0.201 | 0.187 |
| Awning-tricycle | Val | P | 0.324 | 0.322 | 0.302 | 0.310 |
| Awning-tricycle | Val | R | 0.192 | 0.188 | 0.173 | 0.207 |
| Awning-tricycle | Val | AP | 0.153 | 0.161 | 0.167 | 0.182 |
| Awning-tricycle | Test | P | 0.355 | 0.412 | 0.359 | 0.382 |
| Awning-tricycle | Test | R | 0.207 | 0.237 | 0.222 | 0.231 |
| Awning-tricycle | Test | AP | 0.166 | 0.204 | 0.189 | 0.214 |
| Bus | Val | P | 0.637 | 0.647 | 0.672 | 0.661 |
| Bus | Val | R | 0.486 | 0.558 | 0.574 | 0.574 |
| Bus | Val | AP | 0.569 | 0.577 | 0.602 | 0.584 |
| Bus | Test | P | 0.639 | 0.663 | 0.662 | 0.669 |
| Bus | Test | R | 0.535 | 0.549 | 0.556 | 0.538 |
| Bus | Test | AP | 0.558 | 0.585 | 0.588 | 0.580 |
| Motor | Val | P | 0.529 | 0.554 | 0.560 | 0.566 |
| Motor | Val | R | 0.435 | 0.429 | 0.456 | 0.463 |
| Motor | Val | AP | 0.446 | 0.452 | 0.473 | 0.481 |
| Motor | Test | P | 0.415 | 0.435 | 0.442 | 0.462 |
| Motor | Test | R | 0.315 | 0.325 | 0.346 | 0.350 |
| Motor | Test | AP | 0.276 | 0.293 | 0.310 | 0.316 |
| Data Set | Indicators | A | A + B | A + B + C | A + B + C + D |
|---|---|---|---|---|---|
| Val | P | 0.489 | 0.513 | 0.522 | 0.525 |
| Val | R | 0.384 | 0.391 | 0.406 | 0.410 |
| Val | mAP | 0.391 | 0.402 | 0.418 | 0.422 |
| Test | P | 0.432 | 0.451 | 0.449 | 0.467 |
| Test | R | 0.330 | 0.346 | 0.360 | 0.351 |
| Test | mAP | 0.309 | 0.328 | 0.335 | 0.337 |
|  | FPS (f·s⁻¹) | 182 | 161 | 167 | 167 |
|  | Parameters (million) | 11.129 | 11.409 | 9.659 | 9.659 |
|  | Model size (MB) | 21.972 | 22.542 | 19.157 | 19.157 |
Model key (per the discussion in Section 4.3): A = MobileNetv2-SSD; B = YOLOv4-s; C = YOLOv5-s; D = YOLOv5-m; E = YOLOX-s; F = YOLOv7-tiny; G = proposed model.

| Data Set | Indicators | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|---|
| Val | P | 0.495 | 0.376 | 0.495 | 0.534 | 0.817 | 0.395 | 0.525 |
| Val | R | 0.016 | 0.349 | 0.378 | 0.397 | 0.182 | 0.348 | 0.410 |
| Val | mAP | 0.066 | 0.313 | 0.385 | 0.415 | 0.325 | 0.307 | 0.422 |
| Test | P | 0.331 | 0.362 | 0.427 | 0.458 | 0.678 | 0.355 | 0.467 |
| Test | R | 0.098 | 0.309 | 0.338 | 0.358 | 0.160 | 0.325 | 0.351 |
| Test | mAP | 0.042 | 0.265 | 0.310 | 0.337 | 0.263 | 0.268 | 0.337 |
|  | FPS (f·s⁻¹) | 113 | 115 | 185 | 143 | 54 | 154 | 167 |
|  | Parameters (million) | 3.940 | 9.119 | 9.115 | 25.051 | 8.942 | 6.015 | 9.659 |
|  | Model size (MB) | 19.320 | 18.178 | 18.063 | 49.295 | 35.183 | 12.011 | 19.157 |
| Technical Parameter | Value |
|---|---|
| Maximum range (km) | 30 |
| Maximum wind resistance speed (m/s) | 8 |
| Maximum tilt angle (°) | 35 |
| Vertical hover accuracy (m) | ±0.1 |
| Horizontal hover accuracy (m) | ±0.3 |
| Onboard memory (GB) | 8 |
| Cameras | Hasselblad, Telephoto |
| Angle of view (°) | 84, 15 |
| Equivalent focal length (mm) | 24, 162 |
| Aperture | f/2.8–f/11, f/4.4 |
| Effective pixels (×10⁴) | 2000, 1200 |

Paired values refer to the Hasselblad and telephoto cameras, respectively.
Model key (per the discussion in Section 4.5): A = YOLOv4-s; B = YOLOv5-s; C = YOLOX-s; D = YOLOv7-tiny; E = YOLOv8-s; F = proposed model.

| Data Set | Indicators | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|
| Test | P | 0.765 | 0.794 | 0.665 | 0.893 | 0.772 | 0.854 |
| Test | R | 0.575 | 0.872 | 0.900 | 0.645 | 0.881 | 0.883 |
| Test | mAP | 0.742 | 0.880 | 0.852 | 0.733 | 0.898 | 0.917 |
|  | FPS (f·s⁻¹) | 185 | 333 | 87 | 159 | 357 | 294 |
|  | Parameters (million) | 9.119 | 9.115 | 8.942 | 6.015 | 11.129 | 9.659 |
|  | Model size (MB) | 18.178 | 18.063 | 35.183 | 12.011 | 21.972 | 19.157 |