A Lightweight and High-Accuracy Deep Learning Method for Grassland Grazing Livestock Detection Using UAV Imagery
Abstract
1. Introduction
- A grazing livestock dataset was established from UAV imagery of the Hulunbuir grassland. We describe the dataset in detail and show examples in Section 2.1.
- The proposed model is elaborated in Section 2.2.
- Comparison, ablation, and multi-scale adaptation experiments were conducted on the model. The details and results of these experiments are presented in Section 3.
- Finally, we discuss our findings and conclude the paper in Sections 4 and 5.
2. Materials and Methods
2.1. Materials
2.1.1. Study Area and the UAV Imagery
2.1.2. Data Preparation
2.2. Proposed Method
- An enhanced CSPDarknet (ECSP) was proposed as the backbone network of our model, with three improvements: a cascaded hybrid dilated convolutional module, stage compute ratio optimization, and input size optimization. The new backbone introduced context-related features and improved the feature extraction ability, maximizing performance, especially recall, with as few parameters as possible.
- A weighted aggregation feature re-extraction pyramid module (WAFR) was also proposed as the neck part of our model, which made better use of the shallow features in the network and achieved effective multi-scale feature fusion.
2.2.1. Enhanced CSPDarknet
1. Cascade Hybrid Dilated Convolution Module (sketched below)
2. Stage Compute Ratio Optimization
3. Input Size Optimization
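Of the three, only the CHDC module adds new structure; the stage compute ratio (the 1:1:3:1 block rate in the ablation tables) and the 1024 × 1024 input are configuration changes. The paper's layer definitions are not reproduced here, so the following is a minimal PyTorch sketch of the cascade hybrid dilated convolution idea only: stacking 3 × 3 dilated convolutions whose rates share no common factor (e.g., 1, 2, 5, following the hybrid dilated convolution design of Wang et al. cited in the references) enlarges the receptive field without gridding artifacts. The class name, rates, channel width, and activation are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CHDCBlock(nn.Module):
    """Cascaded 3x3 convolutions with hybrid dilation rates (a sketch).

    Rates such as (1, 2, 5) share no common factor, which avoids the
    gridding artifact of repeated equal dilations while enlarging the
    receptive field. Rates, channel width, and activation are assumed.
    """
    def __init__(self, channels: int, rates=(1, 2, 5)):
        super().__init__()
        layers = []
        for r in rates:
            # padding == dilation keeps the spatial size unchanged.
            layers += [
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(channels),
                nn.SiLU(inplace=True),
            ]
        self.cascade = nn.Sequential(*layers)

    def forward(self, x):
        # Residual connection preserves the original local features.
        return x + self.cascade(x)

x = torch.randn(1, 64, 128, 128)
print(CHDCBlock(64)(x).shape)  # torch.Size([1, 64, 128, 128])
```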
2.2.2. Weighted Aggregation Feature Re-Extraction Pyramid
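The WAFR design itself is specified in this section of the paper; as an illustration of the weighted-aggregation idea only, here is a minimal PyTorch sketch in which multi-scale features are fused with learnable normalized weights and then "re-extracted" by a convolution. The module name, normalization scheme, and channel counts are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fuse same-channel feature maps with learnable, normalized weights (a sketch)."""
    def __init__(self, num_inputs: int, channels: int):
        super().__init__()
        # One learnable scalar per input branch, kept non-negative via ReLU.
        self.weights = nn.Parameter(torch.ones(num_inputs))
        # A 3x3 conv "re-extracts" features from the fused map.
        self.refine = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, feats):
        # Resize every branch to the spatial size of the first (shallowest) one.
        target = feats[0].shape[-2:]
        feats = [F.interpolate(f, size=target, mode="nearest") for f in feats]
        w = F.relu(self.weights)
        w = w / (w.sum() + 1e-6)  # normalize so the weights sum to 1
        fused = sum(wi * fi for wi, fi in zip(w, feats))
        return self.refine(fused)

# Usage: fuse a shallow 64x64 map with a deep 32x32 map (both 128 channels).
shallow = torch.randn(1, 128, 64, 64)
deep = torch.randn(1, 128, 32, 32)
out = WeightedFusion(num_inputs=2, channels=128)([shallow, deep])
print(out.shape)  # torch.Size([1, 128, 64, 64])
```

Keeping the shallow map's resolution in this sketch reflects the stated goal of making better use of shallow features, where small animals such as sheep are still resolvable.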
2.2.3. Standard of Performance Evaluation
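The experiments report the standard detection metrics AP, F1, recall, precision, and mAP. As a reference for how the tabulated values relate, here is a small sketch of the precision/recall/F1 arithmetic (standard definitions, not code from the paper). The example counts are the cattle numbers from the accounting table in Section 3.3.2, and they reproduce GLDM's cattle precision and recall in the comparison table.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard detection metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Cattle counts from the accounting table: 412 TP, 77 FP, 59 FN.
p, r, f1 = precision_recall_f1(412, 77, 59)
print(f"precision={p:.2%} recall={r:.2%} F1={f1:.2f}")
# precision=84.25% recall=87.47% F1=0.86
```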
3. Results
3.1. Algorithm Performance Comparison
3.2. Ablation Experiments
3.2.1. Use of Enhanced CSPDarknet
3.2.2. Use of Weighted Aggregation Feature Re-Extraction Pyramid
3.2.3. Visualization of Results
3.3. Model Application
3.3.1. Model Scale Adaptability
3.3.2. Large-Size Remote Sensing Image Inference and Grassland Livestock Accounting
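Large remote sensing scenes exceed the network's input size, so inference is typically tiled: the scene is split into overlapping windows, detections are run per window, and the per-window boxes are shifted back to scene coordinates and merged with global non-maximum suppression. The sketch below shows only the tiling step; the 1024 px tile (matching the model's optimized input size) and 128 px overlap are illustrative assumptions rather than the paper's stated settings.

```python
def iter_tiles(width: int, height: int, tile: int = 1024, overlap: int = 128):
    """Yield (x0, y0, x1, y1) windows covering a large image with overlap.

    Overlap keeps animals near tile borders from being cut in half;
    detections from all tiles can then be merged with global NMS.
    """
    stride = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    # Make sure the right and bottom edges are fully covered.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    for y0 in ys:
        for x0 in xs:
            yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)

# Example: tile a hypothetical 5000 x 3000 px UAV mosaic.
tiles = list(iter_tiles(5000, 3000))
print(len(tiles), tiles[0], tiles[-1])  # 24 (0, 0, 1024, 1024) (3976, 1976, 5000, 3000)
```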
4. Discussion
- Increase the number of labeled samples and add other object categories to explore possible long-tail object detection. This is a complex problem in the field of object detection that warrants further investigation.
- Collect multi-angle and multi-scale data to expand the model’s application scenarios. This will make the model more flexible and allow it to be applied to tasks such as target tracking in the future, as well as enabling the description of objects at ultra-high resolution.
- Study the domain adaptation problem in transfer learning and explore the knowledge transfer of the model. This is critical for the inheritance and evolution of the model, as well as the reduction of data labeling costs.
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Wang, D.; Liao, X.; Zhang, Y.; Cong, N.; Ye, H.; Shao, Q.; Xin, X. Grassland Livestock Real-Time Detection and Weight Estimation Based on Unmanned Aircraft System Video Streams. Chin. J. Ecol. 2021, 40, 4099–4108.
- Wang, D.; Shao, Q.; Yue, H. Surveying Wild Animals from Satellites, Manned Aircraft and Unmanned Aerial Systems (UASs): A Review. Remote Sens. 2019, 11, 1308.
- Fretwell, P.T.; Trathan, P.N. Penguins from Space: Faecal Stains Reveal the Location of Emperor Penguin Colonies. Glob. Ecol. Biogeogr. 2009, 18, 543–552.
- Schwaller, M.R.; Southwell, C.J.; Emmerson, L.M. Continental-Scale Mapping of Adélie Penguin Colonies from Landsat Imagery. Remote Sens. Environ. 2013, 139, 353–364.
- Schwaller, M.R.; Olson, C.E.; Ma, Z.; Zhu, Z.; Dahmer, P. A Remote Sensing Analysis of Adélie Penguin Rookeries. Remote Sens. Environ. 1989, 28, 199–206.
- Löffler, E.; Margules, C. Wombats Detected from Space. Remote Sens. Environ. 1980, 9, 47–56.
- Wilschut, L.I.; Heesterbeek, J.A.P.; Begon, M.; de Jong, S.M.; Ageyev, V.; Laudisoit, A.; Addink, E.A. Detecting Plague-Host Abundance from Space: Using a Spectral Vegetation Index to Identify Occupancy of Great Gerbil Burrows. Int. J. Appl. Earth Obs. Geoinf. 2018, 64, 249–255.
- Caughley, G.; Sinclair, R.; Scott-Kemmis, D. Experiments in Aerial Survey. J. Wildl. Manag. 1976, 40, 290–300.
- Stapleton, S.; Peacock, E.; Garshelis, D. Aerial Surveys Suggest Long-Term Stability in the Seasonally Ice-Free Foxe Basin (Nunavut) Polar Bear Population. Mar. Mammal Sci. 2016, 32, 181–201.
- Rey, N.; Volpi, M.; Joost, S.; Tuia, D. Detecting Animals in African Savanna with UAVs and the Crowds. Remote Sens. Environ. 2017, 200, 341–351.
- Corcoran, E.; Denman, S.; Hamilton, G. Evaluating New Technology for Biodiversity Monitoring: Are Drone Surveys Biased? Ecol. Evol. 2021, 11, 6649–6656.
- Gonzalez, L.F.; Montes, G.A.; Puig, E.; Johnson, S.; Mengersen, K.; Gaston, K.J. Unmanned Aerial Vehicles (UAVs) and Artificial Intelligence Revolutionizing Wildlife Monitoring and Conservation. Sensors 2016, 16, 97.
- Xue, Y.; Wang, T.; Skidmore, A.K. Automatic Counting of Large Mammals from Very High Resolution Panchromatic Satellite Imagery. Remote Sens. 2017, 9, 878.
- Torney, C.J.; Dobson, A.P.; Borner, F.; Lloyd-Jones, D.J.; Moyer, D.; Maliti, H.T.; Mwita, M.; Fredrick, H.; Borner, M.; Hopcraft, J.G.C. Assessing Rotation-Invariant Feature Classification for Automated Wildebeest Population Counts. PLoS ONE 2016, 11, e0156342.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Curran Associates, Inc.: Red Hook, NY, USA, 2015; Volume 28.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. arXiv 2022, arXiv:2207.02696.
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9627–9636.
- Zhou, X.; Wang, D.; Krähenbühl, P. Objects as Points. arXiv 2019, arXiv:1904.07850.
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430.
- Kellenberger, B.; Volpi, M.; Tuia, D. Fast Animal Detection in UAV Images Using Convolutional Neural Networks. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 866–869.
- Kellenberger, B.; Marcos, D.; Tuia, D. Detecting Mammals in UAV Images: Best Practices to Address a Substantially Imbalanced Dataset with Deep Learning. Remote Sens. Environ. 2018, 216, 139–153.
- Roosjen, P.P.; Kellenberger, B.; Kooistra, L.; Green, D.R.; Fahrentrapp, J. Deep Learning for Automated Detection of Drosophila Suzukii: Potential for UAV-Based Monitoring. Pest Manag. Sci. 2020, 76, 2994–3002.
- Peng, J.; Wang, D.; Liao, X.; Shao, Q.; Sun, Z.; Yue, H.; Ye, H. Wild Animal Survey Using UAS Imagery and Deep Learning: Modified Faster R-CNN for Kiang Detection in Tibetan Plateau. ISPRS J. Photogramm. Remote Sens. 2020, 169, 364–376.
- Wada, K. Labelme: Image Polygonal Annotation with Python. Available online: https://github.com/wkentaro/labelme (accessed on 19 January 2023).
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
- Wang, C.-Y.; Liao, H.-Y.M.; Wu, Y.-H.; Chen, P.-Y.; Hsieh, J.-W.; Yeh, I.-H. CSPNet: A New Backbone That Can Enhance Learning Capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 390–391.
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
- Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions. arXiv 2016, arXiv:1511.07122.
- Wang, P.; Chen, P.; Yuan, Y.; Liu, D.; Huang, Z.; Hou, X.; Cottrell, G. Understanding Convolution for Semantic Segmentation. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 1451–1460.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. arXiv 2022, arXiv:2201.03545.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. arXiv 2021, arXiv:2103.14030.
- Liu, J.-J.; Hou, Q.; Cheng, M.-M.; Feng, J.; Jiang, J. A Simple Pooling-Based Design for Real-Time Salient Object Detection. arXiv 2019, arXiv:1904.09569.
- Lin, T.-Y.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. arXiv 2017, arXiv:1612.03144.
- Xu, M. A Review of Grassland Carrying Capacity: Perspective and Dilemma for Research in China on "Forage—Livestock Balance". Acta Prataculturae Sin. 2014, 23, 321–329.
| Datasets | Animal Patches | Cattle Instances | Horse Instances | Sheep Instances |
|---|---|---|---|---|
| Training | 344 | 1511 | 1149 | 5354 |
| Validation | 39 | 169 | 87 | 1495 |
| Testing | 86 | 471 | 323 | 2342 |
| Total | 469 | 2151 | 1559 | 9191 |
| Category | Min Width (px) | Max Width (px) | Average Width (px) | Min Height (px) | Max Height (px) | Average Height (px) | MAS (px) | MRS | Number |
|---|---|---|---|---|---|---|---|---|---|
| cattle | 11 | 73 | 34.87 | 9 | 87 | 33.37 | 33.99 | 0.11% | 2151 |
| horse | 9 | 81 | 37.17 | 11 | 82 | 36.90 | 36.45 | 0.13% | 1559 |
| sheep | 5 | 42 | 17.40 | 6 | 39 | 19.60 | 18.42 | 0.03% | 9191 |
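The MAS and MRS columns are consistent with MAS being the mean of √(w·h) over all boxes, in pixels, and MRS being the mean box area as a fraction of a 1024 × 1024 patch (for cattle, 33.99² / 1024² ≈ 0.11%). In the multi-scale table below, MRS instead matches MAS divided by the 1024 px side length (40.59 / 1024 ≈ 3.96% for cattle at scale 1). Both definitions are inferred from the tabulated numbers, not quoted from the paper; a sketch computing all three quantities under those assumptions:

```python
import math

def size_stats(boxes, image_side: int = 1024):
    """Mean absolute size and two candidate relative-size ratios (assumed definitions).

    mas        -- mean of sqrt(w * h) over all boxes, in pixels
    area_ratio -- mean box area / patch area (matches this table's MRS)
    side_ratio -- mas / image_side (matches the multi-scale table's MRS)
    `boxes` is a list of (width, height) tuples in pixels.
    """
    sizes = [math.sqrt(w * h) for w, h in boxes]
    mas = sum(sizes) / len(sizes)
    area_ratio = sum(w * h for w, h in boxes) / len(boxes) / image_side ** 2
    side_ratio = mas / image_side
    return mas, area_ratio, side_ratio

# Example: a single box with the average sheep width and height from the table.
mas, area_ratio, side_ratio = size_stats([(17.40, 19.60)])
print(f"MAS={mas:.2f} px, area ratio={area_ratio:.2%}, side ratio={side_ratio:.2%}")
# MAS=18.47 px, area ratio=0.03%, side ratio=1.80%
```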
| Scale | Category | MAS (px) | MRS | Number |
|---|---|---|---|---|
| 0.2 | cattle | 8.12 | 0.79% | 108 |
| 0.2 | horse | 6.41 | 0.63% | 107 |
| 0.2 | sheep | 3.03 | 0.30% | 605 |
| 0.25 | cattle | 10.15 | 0.99% | 108 |
| 0.25 | horse | 8.03 | 0.78% | 107 |
| 0.25 | sheep | 3.82 | 0.37% | 605 |
| 0.33 | cattle | 13.38 | 1.31% | 108 |
| 0.33 | horse | 10.55 | 1.03% | 107 |
| 0.33 | sheep | 5.01 | 0.49% | 605 |
| 0.5 | cattle | 20.31 | 1.98% | 108 |
| 0.5 | horse | 16.04 | 1.56% | 107 |
| 0.5 | sheep | 7.6 | 0.74% | 605 |
| 1 | cattle | 40.59 | 3.96% | 108 |
| 1 | horse | 32.07 | 3.13% | 107 |
| 1 | sheep | 15.21 | 1.49% | 605 |
| 2 | cattle | 79.55 | 7.77% | 61 |
| 2 | horse | 63.87 | 6.24% | 92 |
| 2 | sheep | 29.68 | 2.90% | 347 |
| 3 | cattle | 114.73 | 11.20% | 36 |
| 3 | horse | 94.45 | 9.22% | 78 |
| 3 | sheep | 43.79 | 4.28% | 195 |
| 4 | cattle | 150.94 | 14.74% | 28 |
| 4 | horse | 122.43 | 11.96% | 64 |
| 4 | sheep | 59.23 | 5.78% | 153 |
| 5 | cattle | 179.24 | 17.50% | 23 |
| 5 | horse | 149.89 | 14.64% | 50 |
| 5 | sheep | 72.04 | 7.04% | 90 |
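The scale factors above (0.2 to 5) correspond to resampling the test imagery to emulate coarser or finer ground sampling distances, which is how the scale adaptability experiment in Section 3.3.1 varies object size. Below is a hedged sketch of producing such rescaled test images with OpenCV; the interpolation choices are common practice, not the paper's stated procedure, and box labels would be scaled by the same factor.

```python
import cv2  # OpenCV, assumed available

def rescale_image(img, scale: float):
    """Resample an image by `scale` to emulate a different ground sampling distance.

    scale < 1 mimics flying higher (coarser GSD, smaller animals);
    scale > 1 mimics flying lower. INTER_AREA reduces aliasing when
    shrinking; INTER_CUBIC is a common choice when enlarging.
    """
    h, w = img.shape[:2]
    interp = cv2.INTER_AREA if scale < 1 else cv2.INTER_CUBIC
    new_size = (max(1, round(w * scale)), max(1, round(h * scale)))
    return cv2.resize(img, new_size, interpolation=interp)

# Example: emulate the 0.2x and 2x test conditions from the table above.
# img = cv2.imread("patch.png")
# small, big = rescale_image(img, 0.2), rescale_image(img, 2.0)
```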
| Model | Time | Category | AP | F1 | Recall | Precision | mAP | Parameters |
|---|---|---|---|---|---|---|---|---|
| Faster R-CNN | 2016 | cattle | 39.34% | 0.48 | 54.99% | 42.88% | 21.75% | 28.3M |
| | | horse | 25.37% | 0.36 | 42.72% | 31.15% | | |
| | | sheep | 0.53% | 0.53 | 0.60% | 60.87% | | |
| RetinaNet | 2017 | cattle | 60.09% | 0.52 | 36.73% | 91.05% | 35.78% | 36.4M |
| | | horse | 47.26% | 0.42 | 27.86% | 85.71% | | |
| | | sheep | 0.00% | 0 | 0.00% | 0.00% | | |
| YOLOv3 | 2018 | cattle | 82.43% | 0.78 | 71.76% | 85.79% | 74.69% | 61.5M |
| | | horse | 69.06% | 0.67 | 60.06% | 74.62% | | |
| | | sheep | 72.59% | 0.75 | 74.72% | 75.82% | | |
| FCOS | 2019 | cattle | 84.95% | 0.85 | 83.23% | 87.31% | 78.06% | 32.1M |
| | | horse | 82.06% | 0.80 | 78.95% | 81.99% | | |
| | | sheep | 67.17% | 0.55 | 40.73% | 86.73% | | |
| YOLOX-nano | 2021 | cattle | 84.69% | 0.83 | 82.80% | 82.80% | 77.73% | 0.9M |
| | | horse | 75.72% | 0.76 | 74.30% | 77.67% | | |
| | | sheep | 72.78% | 0.77 | 71.69% | 82.87% | | |
| YOLOX-x | 2021 | cattle | 89.61% | 0.87 | 90.23% | 83.66% | 87.19% | 99.0M |
| | | horse | 86.63% | 0.85 | 87.93% | 82.08% | | |
| | | sheep | 85.32% | 0.88 | 86.68% | 88.53% | | |
| GLDM (ours) | 2022 | cattle | 88.52% | 0.86 | 87.47% | 84.25% | 86.47% | 5.7M |
| | | horse | 85.87% | 0.84 | 83.90% | 84.16% | | |
| | | sheep | 85.03% | 0.86 | 83.99% | 87.42% | | |
| Baseline | Backbone | Neck | Average Recall | Average Precision | mAP | Parameters |
|---|---|---|---|---|---|---|
| nano | | | 76.26% | 81.11% | 77.73% | 0.9M |
| nano | ECSP | | 84.74% | 84.93% | 85.98% | 3.8M |
| nano | ECSP | WAFR | 85.12% | 85.28% | 86.47% | 5.7M |
| Baseline | Bottleneck | Image Input | Block Rate | mAP |
|---|---|---|---|---|
| nano | | | | 77.73% |
| nano | CHDC | | | 78.17% |
| nano | CHDC | 1024 × 1024 | | 85.04% |
| nano | CHDC | 1024 × 1024 | 1:1:3:1 | 85.98% |
| Baseline | Bottleneck | Image Input | Block Rate | mAP | Parameters |
|---|---|---|---|---|---|
| nano | CHDC | 1024 × 1024 | 1:3:3:1 | 85.04% | 3.949M |
| nano | CHDC | 1024 × 1024 | 1:1:3:3 | 84.96% | 5.642M |
| nano | CHDC | 1024 × 1024 | 1:1:4:2 | 85.19% | 4.965M |
| nano | CHDC | 1024 × 1024 | 1:1:5:1 | 85.88% | 4.288M |
| nano | CHDC | 1024 × 1024 | 1:1:3:1 | 85.98% | 3.836M |
| Baseline | Backbone | Neck | mAP |
|---|---|---|---|
| nano | CSPDarknet | PAN | 77.73% |
| nano | CSPDarknet | WAFR | 78.63% |
| nano | ECSP | PAN | 85.98% |
| nano | ECSP | WAFR | 86.47% |
| Scale | 0.2 | 0.25 | 0.33 | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|---|---|
| cattle AP | 0.5047 | 0.6361 | 0.7955 | 0.9035 | 0.9142 | 0.8467 | 0.2450 | 0.1038 | 0.0290 |
| horse AP | 0.1479 | 0.4022 | 0.5168 | 0.7929 | 0.8919 | 0.9159 | 0.7309 | 0.3634 | 0.0890 |
| sheep AP | 0.0255 | 0.1187 | 0.2602 | 0.6222 | 0.9074 | 0.9457 | 0.9127 | 0.7319 | 0.4122 |
| mAP | 0.2260 | 0.3857 | 0.5242 | 0.7728 | 0.9045 | 0.9028 | 0.6296 | 0.3997 | 0.1767 |
| Category | Truth | TP | FP | FN | LAC |
|---|---|---|---|---|---|
| cattle | 471 | 412 | 77 | 59 | 0.99 |
| horse | 323 | 271 | 51 | 52 | |
| sheep | 2342 | 1967 | 294 | 375 | |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).