WoodenCube: An Innovative Dataset for Object Detection in Concealed Industrial Environments
Abstract
1. Introduction
- This paper addresses the challenge industrial robots face in recognizing and localizing objects in complex scenes. We introduce the WoodenCube dataset, which targets the detection of objects whose textures closely resemble the background in industrial scenarios. The dataset comprises 5113 densely annotated images of 10 different types of blocks; a semi-automatic annotation method, CS-SAM, significantly improves annotation efficiency and accuracy.
- Standard mAP cannot effectively evaluate rotation accuracy for nearly square objects, because the Gaussian covariance of such boxes degenerates: its off-diagonal terms vanish at every angle, leaving the metric blind to orientation (see the sketch after this list). To address this, we propose G/2-ProbIoU and, based on it, define Cube-mAP to more accurately assess detection performance on cube-like objects.
- Traditional convolutional methods struggle to distinguish objects whose textures resemble their surroundings. We therefore design CS-SKNet, a multi-scale, pyramid-like large-kernel convolutional network that enlarges the receptive field to capture strong texture cues across the scene while retaining the multi-scale feature-extraction advantages of FPN, further improving performance in complex scenarios.
- Extensive experiments on the WoodenCube and DOTAv1.0 datasets demonstrate the competitive rotated-object-detection accuracy of the proposed CS-SKNet. On WoodenCube, it achieves a Cube-mAP of 72.64%; on DOTAv1.0, it achieves an mAP of 79.17% while maintaining a low parameter count and FLOPs.
2. Related Works
2.1. Grasp Detection
2.2. Bounding Box Regression
2.3. Attention Mechanisms
3. WoodenCube Dataset
3.1. Data Collection
3.2. Data Annotation
3.3. Cube-mAP Evaluation Method
Algorithm 1. G/2-ProbIoU
4. CS-SKNet Architecture
4.1. Multi-Scale Pyramid-like Large-Kernel Convolution Block
4.2. A Plug-and-Play MLP Based on Group Convolution
5. Experiments
5.1. Datasets and Implementation Details
5.2. Main Results
5.3. Ablation Study
5.3.1. The Rationality of Cube-mAP
5.3.2. Large Kernel Decomposition
5.3.3. The Feasibility of CS-SKNet
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Chen, Y.; Wu, J.; He, C.; Zhang, S. Intelligent warehouse robot path planning based on improved ant colony algorithm. IEEE Access 2023, 11, 12360–12367.
- Yang, C.; Yuan, B.; Zhai, P. Actor-Hybrid-Attention-Critic for Multi-Logistic Robots Path Planning. IEEE Robot. Autom. Lett. 2024, 9, 5559–5566.
- Li, F.; Kim, Y.C.; Lyu, Z.; Zhan, H. Research on Path Planning for Robot Based on Improved Design of Non-standard Environment Map with Ant Colony Algorithm. IEEE Access 2023, 11, 99776–99791.
- Zhang, Y.; Ren, J.; Jin, Q.; Zhu, Y.; Mo, Z.; Chen, Y. Design of Control System for Handling, Sorting, and Warehousing Robot Based on Machine Vision. In Proceedings of the 2023 5th International Symposium on Robotics & Intelligent Manufacturing Technology (ISRIMT), Changzhou, China, 22–24 September 2023; pp. 375–383.
- Prawira, I.F.A.; Habbe, A.H.; Muda, I.; Hasibuan, R.M.; Umbrajkaar, A. Robot as Staff: Robot for Alibaba E-Commerce Warehouse Process. In Proceedings of the 2023 International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 26–28 April 2023; pp. 1619–1623.
- Mnyusiwalla, H.; Triantafyllou, P.; Sotiropoulos, P.; Roa, M.A.; Friedl, W.; Sundaram, A.M.; Russell, D.; Deacon, G. A bin-picking benchmark for systematic evaluation of robotic pick-and-place systems. IEEE Robot. Autom. Lett. 2020, 5, 1389–1396.
- Wong, C.C.; Tsai, C.Y.; Chen, R.J.; Chien, S.Y.; Yang, Y.H.; Wong, S.W.; Yeh, C.A. Generic development of bin pick-and-place system based on robot operating system. IEEE Access 2022, 10, 65257–65270.
- Surati, S.; Hedaoo, S.; Rotti, T.; Ahuja, V.; Patel, N. Pick-and-place robotic arm: A review paper. Int. Res. J. Eng. Technol. 2021, 8, 2121–2129.
- Yu, F.; Kong, X.; Yao, W.; Zhang, J.; Cai, S.; Lin, H.; Jin, J. Dynamics analysis, synchronization and FPGA implementation of multiscroll Hopfield neural networks with non-polynomial memristor. Chaos Solitons Fractals 2024, 179, 114440.
- Zheng, Z.; Ma, Y.; Zheng, H.; Gu, Y.; Lin, M. Industrial part localization and grasping using a robotic arm guided by 2D monocular vision. Ind. Robot. Int. J. 2018, 45, 794–804.
- Abdullah-Al-Noman, M.; Eva, A.N.; Yeahyea, T.B.; Khan, R. Computer vision-based robotic arm for object color, shape, and size detection. J. Robot. Control 2022, 3, 180–186.
- Gao, M.; Jiang, J.; Zou, G.; John, V.; Liu, Z. RGB-D-based object recognition using multimodal convolutional neural networks: A survey. IEEE Access 2019, 7, 43110–43136.
- Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276.
- Luo, Z.; Tang, B.; Jiang, S.; Pang, M.; Xiang, K. Grasp detection based on Faster Region CNN. In Proceedings of the 2020 5th International Conference on Advanced Robotics and Mechatronics (ICARM), Shenzhen, China, 18–21 December 2020; pp. 323–328.
- Yu, Y.; Cao, Z.; Liu, Z.; Geng, W.; Yu, J.; Zhang, W. A two-stream CNN with simultaneous detection and segmentation for robotic grasping. IEEE Trans. Syst. Man Cybern. Syst. 2020, 52, 1167–1181.
- Jiang, J.; Luo, X.; Luo, Q.; Qiao, L.; Li, M. An overview of hand–eye calibration. Int. J. Adv. Manuf. Technol. 2022, 119, 77–97.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Jiang, P.; Ergu, D.; Liu, F.; Cai, Y.; Ma, B. A review of YOLO algorithm developments. Procedia Comput. Sci. 2022, 199, 1066–1073.
- Song, J.; Gao, S.; Zhu, Y.; Ma, C. A survey of remote sensing image classification based on CNNs. Big Earth Data 2019, 3, 232–254.
- Fan, D.P.; Ji, G.P.; Cheng, M.M.; Shao, L. Concealed object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6024–6042.
- Mei, H.; Ji, G.P.; Wei, Z.; Yang, X.; Wei, X.; Fan, D.P. Camouflaged object segmentation with distraction mining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8772–8781.
- He, C.; Li, K.; Zhang, Y.; Tang, L.; Zhang, Y.; Guo, Z.; Li, X. Camouflaged object detection with feature decomposition and edge reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 22046–22055.
- Bi, H.; Zhang, C.; Wang, K.; Tong, J.; Zheng, F. Rethinking camouflaged object detection: Models and datasets. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 5708–5724.
- Lyu, Y.; Zhang, J.; Dai, Y.; Li, A.; Liu, B.; Barnes, N.; Fan, D.P. Simultaneously Localize, Segment and Rank the Camouflaged Objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 11586–11596.
- Le, T.N.; Nguyen, T.V.; Nie, Z.; Tran, M.T.; Sugimoto, A. Anabranch Network for Camouflaged Object Segmentation. J. Comput. Vis. Image Underst. 2019, 184, 45–56.
- Yan, J.; Le, T.N.; Nguyen, K.D.; Tran, M.T.; Do, T.T.; Nguyen, T.V. MirrorNet: Bio-Inspired Camouflaged Object Segmentation. IEEE Access 2021, 9, 43290–43300.
- Yu, J.; Jiang, Y.; Wang, Z.; Cao, Z.; Huang, T. UnitBox: An advanced object detection network. In Proceedings of the 24th ACM International Conference on Multimedia, Amsterdam, The Netherlands, 15–19 October 2016; pp. 516–520.
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 658–666.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. Proc. AAAI Conf. Artif. Intell. 2020, 34, 12993–13000.
- Zhang, Y.F.; Ren, W.; Zhang, Z.; Jia, Z.; Wang, L.; Tan, T. Focal and efficient IOU loss for accurate bounding box regression. Neurocomputing 2022, 506, 146–157.
- Yang, X.; Yan, J.; Ming, Q.; Wang, W.; Zhang, X.; Tian, Q. Rethinking rotated object detection with Gaussian Wasserstein distance loss. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual Event, 18–24 July 2021; pp. 11830–11841.
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 4–6 October 2023; pp. 3992–4003.
- Llerena, J.M.; Zeni, L.F.; Kristen, L.N.; Jung, C. Gaussian bounding boxes and probabilistic intersection-over-union for object detection. arXiv 2021, arXiv:2106.06072.
- Xu, Y.; Fu, M.; Wang, Q.; Wang, Y.; Chen, K.; Xia, G.S.; Bai, X. Gliding vertex on the horizontal bounding box for multi-oriented object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 1452–1459.
- Li, Z.; Hou, B.; Wu, Z.; Ren, B.; Yang, C. FCOSR: A simple anchor-free rotated detector for aerial object detection. Remote Sens. 2023, 15, 5499.
- Yang, X.; Yan, J.; Feng, Z.; He, T. R3Det: Refined single-stage detector with feature refinement for rotating object. Proc. AAAI Conf. Artif. Intell. 2021, 35, 3163–3171.
- Li, Y.; Hou, Q.; Zheng, Z.; Cheng, M.M.; Yang, J.; Li, X. Large selective kernel network for remote sensing object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 4–6 October 2023; pp. 16794–16805.
- Kober, J.; Peters, J. Imitation and reinforcement learning. IEEE Robot. Autom. Mag. 2010, 17, 55–62.
- Lenz, I.; Lee, H.; Saxena, A. Deep learning for detecting robotic grasps. Int. J. Robot. Res. 2015, 34, 705–724.
- Redmon, J.; Angelova, A. Real-time grasp detection using convolutional neural networks. In Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 1316–1322.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 22–25 July 2017; pp. 77–85.
- Qin, Y.; Chen, R.; Zhu, H.; Song, M.; Xu, J.; Su, H. S4G: Amodal single-view single-shot SE(3) grasp detection in cluttered scenes. In Proceedings of the Conference on Robot Learning, PMLR, Cambridge, MA, USA, 16–18 November 2020; pp. 53–65.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
- Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P.; Nasrin, M.S.; Van Esesn, B.C.; Awwal, A.A.S.; Asari, V.K. The history began from AlexNet: A comprehensive survey on deep learning approaches. arXiv 2018, arXiv:1803.01164.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Liu, D.; Tao, X.; Yuan, L.; Du, Y.; Cong, M. Robotic objects detection and grasping in clutter based on cascaded deep convolutional neural network. IEEE Trans. Instrum. Meas. 2021, 71, 1–10.
- Karaoguz, H.; Jensfelt, P. Object detection approach for robot grasp detection. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 4953–4959.
- Margolin, R.; Zelnik-Manor, L.; Tal, A. How to evaluate foreground maps? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 248–255.
- Chen, G.; Liu, S.J.; Sun, Y.J.; Ji, G.P.; Wu, Y.F.; Zhou, T. Camouflaged object detection via context-aware cross-level fusion. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 6981–6993.
- Bhajantri, N.U.; Nagabhushan, P. Camouflage defect identification: A novel approach. In Proceedings of the 9th International Conference on Information Technology (ICIT'06), Bhubaneswar, India, 18–21 December 2006; pp. 145–148.
- Skurowski, P.; Abdulameer, H.; Błaszczyk, J.; Depta, T.; Kornacki, A.; Kozieł, P. Animal camouflage analysis: Chameleon database. Unpubl. Manuscr. 2018, 2, 7.
- Felzenszwalb, P.F.; Girshick, R.B.; McAllester, D.; Ramanan, D. Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 1627–1645.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Beery, S.; Wu, G.; Rathod, V.; Votel, R.; Huang, J. Context R-CNN: Long term temporal context for per-camera object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 13075–13085.
- Wu, Y.; Chen, Y.; Yuan, L.; Liu, Z.; Wang, L.; Li, H.; Fu, Y. Rethinking classification and localization for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10186–10195.
- Zheng, Z.; Wang, P.; Ren, D.; Liu, W.; Ye, R.; Hu, Q.; Zuo, W. Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Trans. Cybern. 2021, 52, 8574–8586.
- Guo, M.H.; Xu, T.X.; Liu, J.J.; Liu, Z.N.; Jiang, P.T.; Mu, T.J.; Zhang, S.H.; Martin, R.R.; Cheng, M.M.; Hu, S.M. Attention mechanisms in computer vision: A survey. Comput. Vis. Media 2022, 8, 331–368.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
- Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Vedaldi, A. Gather-excite: Exploiting feature context in convolutional neural networks. Adv. Neural Inf. Process. Syst. 2018, 31, 9423–9433.
- Cao, Y.; Xu, J.; Lin, S.; Wei, F.; Hu, H. GCNet: Non-local networks meet squeeze-excitation networks and beyond. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019; pp. 1971–1980.
- Li, Y.; Li, X.; Yang, J. Spatial group-wise enhance: Enhancing semantic feature learning in CNN. In Proceedings of the Asian Conference on Computer Vision, Macao, China, 4–8 December 2022; pp. 316–332.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Park, J.; Woo, S.; Lee, J.Y.; Kweon, I.S. BAM: Bottleneck attention module. arXiv 2018, arXiv:1807.06514.
- Yan, H.; Li, Z.; Li, W.; Wang, C.; Wu, M.; Zhang, C. ConTNet: Why not use convolution and transformer at the same time? arXiv 2021, arXiv:2104.13497.
- Chen, Y.; Dai, X.; Liu, M.; Chen, D.; Yuan, L.; Liu, Z. Dynamic convolution: Attention over convolution kernels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11027–11036.
- Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 510–519.
- Liu, J.J.; Hou, Q.; Cheng, M.M.; Wang, C.; Feng, J. Improving convolutional networks with self-calibrated convolutions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10093–10102.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Ding, X.; Zhang, X.; Han, J.; Ding, G. Scaling up your kernels to 31×31: Revisiting large kernel design in CNNs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11953–11965.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
- Stephan, P.; Heck, I.; Krauß, P.; Frey, G. Evaluation of Indoor Positioning Technologies under industrial application conditions in the SmartFactoryKL based on EN ISO 9283. IFAC Proc. Vol. 2009, 42, 870–875.
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3974–3983.
- Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 3500–3509.
- Zhou, Y.; Yang, X.; Zhang, G.; Wang, J.; Liu, Y.; Hou, L.; Jiang, X.; Liu, X.; Yan, J.; Lyu, C.; et al. MMRotate: A rotated object detection benchmark using PyTorch. In Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, Portugal, 10–14 October 2022; pp. 7331–7334.
- Han, J.; Ding, J.; Li, J.; Xia, G.S. Align deep features for oriented object detection. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–11.
- Guo, Z.; Liu, C.; Zhang, X.; Jiao, J.; Ji, X.; Ye, Q. Beyond bounding-box: Convex-hull feature adaptation for oriented and densely packed object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 14–19 June 2021; pp. 8788–8797.
- Lang, S.; Ventola, F.; Kersting, K. DAFNe: A one-stage anchor-free approach for oriented object detection. arXiv 2021, arXiv:2109.06148.
- Yang, X.; Yang, J.; Yan, J.; Zhang, Y.; Zhang, T.; Guo, Z.; Sun, X.; Fu, K. SCRDet: Towards more robust detection for small, cluttered and rotated objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8231–8240.
- Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q. Learning RoI transformer for oriented object detection in aerial images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2844–2853.
- Wang, J.; Yang, W.; Li, H.C.; Zhang, H.; Xia, G.S. Learning center probability map for detecting objects in aerial images. IEEE Trans. Geosci. Remote Sens. 2020, 59, 4307–4323.
- Yang, X.; Yan, J. Arbitrary-oriented object detection with circular smooth label. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part VIII; Springer: Berlin/Heidelberg, Germany, 2020; pp. 677–694.
Equipment | Details |
---|---|
Industrial camera | The MV-CS050-10UC is a second-generation industrial area-scan RGB camera built around Sony's IMX264 CMOS sensor, with a resolution of 2448 × 2048. It transmits uncompressed images in real time over a USB 3.0 interface at frame rates of up to 60 fps. |
Robot | The KUKA KR6 R900 sixx six-axis robot weighs approximately 52 kg and has a maximum payload of 6 kg, a maximum reach of 901.5 mm, and a pose repeatability (ISO 9283 [73]) of ±0.03 mm. |
Frameworks | Backbone | Cube-mAP | Params (M) | FLOPs (G)
---|---|---|---|---
Oriented RCNN [75] | ResNet-50 | 75.38 | 41.14 | 211.4
Oriented RCNN [75] | ResNet-101 | 72.89 | 60.13 | 289.32
Oriented RCNN [75] | CS-SKNet | 74.13 | 51.59 | 292.9
Rotated Faster RCNN [46] | ResNet-50 | 72.90 | 41.13 | 211.3
Rotated Faster RCNN [46] | ResNet-101 | 71.92 | 60.13 | 289.19
Rotated Faster RCNN [46] | CS-SKNet | 70.44 | 51.59 | 294.6
R3Det [36] | ResNet-50 | 75.69 | 41.90 | 335.7
R3Det [36] | ResNet-101 | 79.12 | 60.78 | 411.12
R3Det [36] | CS-SKNet | 79.60 | 48.61 | 419.5
Gliding Vertex [34] | ResNet-50 | 70.87 | 41.14 | 211.3
Gliding Vertex [34] | ResNet-101 | 70.96 | 60.13 | 289.19
Gliding Vertex [34] | CS-SKNet | 71.53 | 51.59 | 294.6
Rotated FCOS [35] | ResNet-50 | 47.92 | 31.91 | 206.7
Rotated FCOS [35] | ResNet-101 | 44.52 | 50.9 | 284.55
Rotated FCOS [35] | CS-SKNet | 60.75 | 42.41 | 293.2
S2A-Net [77] | ResNet-50 | 59.24 | 38.60 | 197.6
S2A-Net [77] | ResNet-101 | 56.86 | 57.57 | 275.01
S2A-Net [77] | CS-SKNet | 57.26 | 45.31 | 281.4
Method | mAP | Params (M) | FLOPs (G) | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
R3Det [36] | 76.47 | 41.9 | 336 | 89.8 | 83.77 | 48.11 | 66.77 | 78.76 | 83.27 | 87.84 | 90.82 | 85.38 | 85.51 | 65.57 | 62.68 | 67.53 | 78.56 | 72.62
CFA [78] | 76.67 | - | - | 89.08 | 83.20 | 54.37 | 66.87 | 81.23 | 80.96 | 87.17 | 90.21 | 84.32 | 86.09 | 52.34 | 69.94 | 75.52 | 80.76 | 67.96
DAFNe [79] | 76.95 | - | - | 89.4 | 86.27 | 53.70 | 60.51 | 82.04 | 81.17 | 88.66 | 90.37 | 83.81 | 87.27 | 53.93 | 69.38 | 75.61 | 81.26 | 70.86
S2A-Net * [77] | 71.41 | 38.6 | 198 | 88.32 | 74.50 | 49.39 | 72.84 | 78.36 | 80.48 | 87.04 | 90.83 | 74.66 | 83.17 | 49.64 | 60.50 | 70.27 | 66.00 | 45.17
SCRDet [80] | 72.61 | - | - | 89.98 | 80.65 | 52.09 | 68.36 | 68.36 | 60.32 | 72.41 | 90.85 | 87.94 | 86.86 | 65.02 | 66.68 | 66.25 | 68.24 | 65.21
RoI Trans * [81] | 75.57 | 55.1 | 225 | 87.60 | 82.12 | 55.18 | 73.93 | 77.05 | 82.35 | 87.50 | 90.85 | 78.62 | 84.36 | 60.86 | 58.48 | 76.78 | 74.98 | 62.80
Gliding Vertex [34] | 75.02 | 41.1 | 198 | 89.64 | 85.00 | 52.26 | 77.34 | 73.01 | 73.14 | 86.82 | 90.74 | 79.02 | 86.81 | 59.55 | 70.91 | 72.94 | 70.86 | 57.32
Oriented RCNN * [75] | 75.05 | 41.1 | 211 | 88.49 | 79.14 | 53.32 | 77.34 | 76.93 | 82.67 | 87.98 | 90.85 | 77.41 | 83.35 | 59.56 | 64.26 | 75.48 | 68.83 | 60.14
CenterMap [82] | 76.03 | 41.1 | 198 | 89.83 | 84.41 | 54.60 | 70.25 | 77.66 | 78.32 | 87.19 | 90.66 | 84.89 | 85.27 | 56.46 | 69.23 | 74.13 | 71.56 | 66.06
CSL [83] | 76.17 | 37.4 | 236 | 90.25 | 85.53 | 54.64 | 75.31 | 70.44 | 73.51 | 77.62 | 90.84 | 86.15 | 86.69 | 69.6 | 68.04 | 73.83 | 71.10 | 68.93
LSKNet * [37] | 78.96 | 31.0 | 174 | 89.03 | 83.12 | 56.76 | 79.32 | 78.40 | 84.66 | 87.97 | 90.91 | 85.62 | 85.04 | 63.68 | 65.89 | 77.94 | 79.31 | 76.73
CS-SKNet (Ours) | 79.17 | 51.3 | 293 | 88.77 | 83.56 | 56.87 | 80.71 | 78.93 | 84.68 | 87.93 | 90.88 | 85.29 | 87.25 | 64.22 | 66.09 | 77.96 | 79.33 | 75.15

Class abbreviations follow DOTAv1.0: PL plane, BD baseball diamond, BR bridge, GTF ground track field, SV small vehicle, LV large vehicle, SH ship, TC tennis court, BC basketball court, ST storage tank, SBF soccer-ball field, RA roundabout, HA harbor, SP swimming pool, HC helicopter.
(k, d) Sequence | Num. of Convs | RF | FPS | Cube-mAP
---|---|---|---|---
(47, 1) | 1 | 47 | 23.1 | 72.30
(7, 1), (9, 5) | 2 | 47 | 38.3 | 72.26
(5, 1), (7, 3), (9, 3) | 3 | 47 | 33.7 | 72.62
(k1, d1) | (k2, d2) | (k3, d3) | RF | FPS | Cube-mAP
---|---|---|---|---|---
(3, 1) | (5, 2) | (7, 3) | 29 | 34.7 | 71.58
(3, 1) | (5, 3) | (7, 4) | 39 | 34.7 | 72.65
(5, 1) | (7, 3) | (9, 3) | 47 | 33.7 | 72.62
(5, 1) | (7, 4) | (9, 3) | 53 | 32.9 | 73.26
(7, 1) | (9, 2) | (11, 4) | 63 | 33.3 | 74.13
(7, 1) | (9, 3) | (11, 4) | 71 | 33.0 | 72.80
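The RF column in both decomposition tables follows directly from stacking dilated convolutions: a sequence of kernels with sizes kᵢ and dilations dᵢ yields a receptive field of RF = 1 + Σᵢ dᵢ(kᵢ − 1). A one-line sketch (our illustrative helper, not paper code) reproduces every RF entry above:

```python
def receptive_field(seq):
    """Receptive field of sequentially stacked dilated convolutions:
    RF = 1 + sum(d * (k - 1)) over the (kernel, dilation) pairs."""
    return 1 + sum(d * (k - 1) for k, d in seq)

# A single 47x47 kernel and its two- and three-conv decompositions
# all reach the same 47-pixel field, as the first table shows.
assert receptive_field([(47, 1)]) == 47
assert receptive_field([(7, 1), (9, 5)]) == 47
assert receptive_field([(5, 1), (7, 3), (9, 3)]) == 47
assert receptive_field([(7, 1), (9, 2), (11, 4)]) == 63  # best Cube-mAP row
```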
Frameworks | Backbone | mAP | Params (M) | FLOPs (G)
---|---|---|---|---
Oriented RCNN [75] | ResNet-50 | 75.05 | 41.14 | 211.4
Oriented RCNN [75] | CS-SKNet | 79.17 | 51.33 | 292.9
RoI Trans [81] | ResNet-50 | 75.57 | 55.13 | 225.3
RoI Trans [81] | CS-SKNet | 78.74 | 76.32 | 306.8
S2A-Net [77] | ResNet-50 | 71.41 | 38.60 | 197.6
S2A-Net [77] | CS-SKNet | 76.84 | 45.31 | 281.4
R3Det [36] | ResNet-50 | 69.55 | 41.90 | 335.7
R3Det [36] | CS-SKNet | 75.08 | 48.61 | 419.5
Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).