Exploring Misclassification Information for Fine-Grained Image Classification
Abstract
1. Introduction
- First, instead of using all the images, we select a subset of images by exploring the misclassification information. This helps to remove noisy information and improves the discriminative power of the learned classifiers.
- Second, we construct new image representations for fine-grained image classification by combining the outputs of classifiers. In this way, a number of prelearned models can be used to boost the classification accuracy.
- Third, the proposed method generalizes well because it uses prelearned classification models both to extract the misclassification information and to classify images.
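The second contribution can be illustrated with a minimal sketch. It assumes that each prelearned model returns one score per class for every image and that the new representation is the plain concatenation of these scores. The function name `concat_outputs` and the toy scores are hypothetical and are not taken from the paper.

```python
import numpy as np

def concat_outputs(score_list):
    """Concatenate per-model class scores, each of shape (N, C), into shape (N, M*C)."""
    return np.hstack(score_list)

# Hypothetical scores for N=3 images from M=2 prelearned models over C=2 classes.
scores_a = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
scores_b = np.array([[0.7, 0.3], [0.55, 0.45], [0.1, 0.9]])

# New representation: one row per image, M*C columns.
Z = concat_outputs([scores_a, scores_b])

# Predictions of each model on its own, to show where the models disagree.
per_model_pred = [s.argmax(axis=1) for s in (scores_a, scores_b)]
print(Z.shape, per_model_pred)
```

In this toy example the two models disagree on the second image. The concatenated row keeps both sets of scores, so a classifier trained on `Z` can weigh the models against each other instead of committing to one of them.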
2. Related Work
3. Fine-Grained Image Classification with Misclassification Information
3.1. Exploring the Misclassification Information
3.2. Confusion Information Based Image Representations and Classifications
Algorithm 1. Procedures of the proposed fine-grained image classification with misclassification information method.
Input: Training images and labels, prelearned classifier, K, testing images.
Output: The predicted classes of testing images.
Training phase
1: Predict the classes of images with prelearned classifiers using Equation (1);
2: Calculate the misclassification information using Equation (2);
3: Train misclassification classifiers using Equation (4);
4: Concatenate the results for new image representation using Equations (4) and (5);
5: Train the final classifiers using Equations (6)–(8).
Testing phase
6: Calculate the misclassification information with prelearned classifiers using Equations (1) and (2);
7: Concatenate the predicted results of testing images using Equations (4) and (5);
8: Predict the classes of testing images using Equations (6) and (8).
9: return the predicted classes of testing images.
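The steps of Algorithm 1 can be sketched end to end on toy data. This is a simplified illustration under stated assumptions, not the authors' implementation: closed-form ridge regression stands in for the hinge-loss classifiers, a single weak model stands in for the prelearned classifier, misclassification information is taken to be a confusion-count matrix, and the testing phase simply reuses the classifiers learned during training. All names (`make_data`, `fit_ridge`, `scores`, `W_pre`, `W_mis`, `W_final`) and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
C, D, K = 4, 8, 2  # classes, feature dimension, confusable classes kept per class

def make_data(n_per_class):
    """Toy Gaussian blobs, one per class (hypothetical data)."""
    centers = 3.0 * np.eye(C, D)
    X = np.vstack([centers[c] + rng.normal(size=(n_per_class, D)) for c in range(C)])
    return X, np.repeat(np.arange(C), n_per_class)

def fit_ridge(X, targets, lam=1.0):
    """Closed-form ridge regression with a bias term (stand-in linear classifier)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ targets)

def scores(X, W):
    """Linear scores of X under weights W (bias term included in W)."""
    return np.hstack([X, np.ones((len(X), 1))]) @ W

X_train, y_train = make_data(30)
X_test, y_test = make_data(10)

# Stand-in for a prelearned classifier: a weak model that sees only 3 features.
W_pre = fit_ridge(X_train[:, :3], np.eye(C)[y_train])

# Training phase
# Step 1: predict the classes of the training images with the prelearned classifier.
pred_train = scores(X_train[:, :3], W_pre).argmax(axis=1)
# Step 2: misclassification information, here a confusion-count matrix.
conf = np.zeros((C, C), dtype=int)
np.add.at(conf, (y_train, pred_train), 1)
# Step 3: one classifier per class, trained only on that class and the K classes
# it is most often confused with (ties are broken by class index).
columns = []
for c in range(C):
    row = conf[c].copy()
    row[c] = -1  # ignore correct predictions when ranking
    confusable = np.argsort(-row, kind="stable")[:K]
    subset = np.isin(y_train, np.append(confusable, c))
    targets = np.where(y_train[subset] == c, 1.0, -1.0)
    columns.append(fit_ridge(X_train[subset], targets))
W_mis = np.column_stack(columns)
# Step 4: concatenate the classifier outputs into the new representation.
Z_train = scores(X_train, W_mis)
# Step 5: train the final classifier on the new representation.
W_final = fit_ridge(Z_train, np.eye(C)[y_train])

# Testing phase: build the same representation, then apply the final classifier.
Z_test = scores(X_test, W_mis)
y_hat = scores(Z_test, W_final).argmax(axis=1)
print("test accuracy:", (y_hat == y_test).mean())
```

Restricting each per-class classifier to the most confusable classes is the point of step 3: the classifier spends its capacity on the distinctions that the prelearned model gets wrong rather than on classes that are already easy to separate.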
4. Experiments
4.1. Experimental Setup
4.2. The Flower-102 Dataset
4.3. The CUB-200-2011 Dataset
4.4. The Cars-196 Dataset
4.5. Influences of Parameters
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Nilsback, M.; Zisserman, A. Automated Flower Classification over a Large Number of Classes. In Proceedings of the Sixth Indian Conference on Computer Vision, Graphics & Image Processing, ICVGIP 2008, Bhubaneswar, India, 16–19 December 2008; pp. 722–729.
- Zhang, C.; Liang, C.; Li, L.; Liu, J.; Huang, Q.; Tian, Q. Fine-Grained Image Classification via Low-Rank Sparse Coding with General and Class-Specific Codebooks. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 1550–1559.
- Lin, T.; RoyChowdhury, A.; Maji, S. Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1309–1322.
- Lazebnik, S.; Schmid, C.; Ponce, J. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), New York, NY, USA, 17–22 June 2006; pp. 2169–2178.
- Yang, J.; Yu, K.; Gong, Y.; Huang, T.S. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, FL, USA, 20–25 June 2009; pp. 1794–1801.
- Zhang, C.; Huang, Q.; Tian, Q. Contextual Exemplar Classifier-Based Image Representation for Classification. IEEE Trans. Circuits Syst. Video Technol. 2017, 27, 1691–1699.
- Zhang, C.; Liu, J.; Tian, Q.; Xu, C.; Lu, H.; Ma, S. Image classification by non-negative sparse coding, low-rank and sparse decomposition. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2011, Colorado Springs, CO, USA, 20–25 June 2011; pp. 1673–1680.
- Zhang, C.; Liu, J.; Liang, C.; Xue, Z.; Pang, J.; Huang, Q. Image classification by non-negative sparse coding, correlation constrained low-rank and sparse decomposition. Comput. Vis. Image Underst. 2014, 123, 14–22.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1106–1114.
- Zhang, C.; Cheng, J.; Tian, Q. Image-level classification by hierarchical structure learning with visual and semantic similarities. Inf. Sci. 2018, 422, 271–281.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Zhang, C.; Cheng, J.; Tian, Q. Structured Weak Semantic Space Construction for Visual Categorization. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 3442–3451.
- Zhang, C.; Cheng, J.; Tian, Q. Incremental Codebook Adaptation for Visual Representation and Categorization. IEEE Trans. Cybern. 2018, 48, 2012–2023.
- Chai, Y.; Rahtu, E.; Lempitsky, V.S.; Gool, L.V.; Zisserman, A. TriCoS: A Tri-level Class-Discriminative Co-Segmentation Method for Image Classification. In Proceedings of the 12th European Conference on Computer Vision (ECCV 2012), Florence, Italy, 7–13 October 2012; Proceedings Part I. Volume 7572, pp. 794–807.
- Yang, L.; Luo, P.; Loy, C.C.; Tang, X. A large-scale car dataset for fine-grained categorization and verification. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, 7–12 June 2015; pp. 3973–3981.
- Wah, C.; Branson, S.; Welinder, P.; Perona, P.; Belongie, S. The Caltech-UCSD Birds-200-2011 Dataset; California Institute of Technology: Pasadena, CA, USA, 2011; Available online: https://resolver.caltech.edu/CaltechAUTHORS:20111026-120541847 (accessed on 15 June 2021).
- Zhang, N.; Donahue, J.; Girshick, R.B.; Darrell, T. Part-based R-CNNs for Fine-grained Category Detection. arXiv 2014, arXiv:1407.3867.
- Cui, Y.; Zhou, F.; Lin, Y.; Belongie, S.J. Fine-Grained Categorization and Dataset Bootstrapping Using Deep Metric Learning with Humans in the Loop. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 1153–1162.
- Zhang, C.; Zhu, G.; Liang, C.; Zhang, Y.; Huang, Q.; Tian, Q. Image Class Prediction by Joint Object, Context, and Background Modeling. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 428–438.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
- Russakovsky, O.; Lin, Y.; Yu, K.; Li, F. Object-Centric Spatial Pooling for Image Classification. In Proceedings of the 12th European Conference on Computer Vision (ECCV 2012), Florence, Italy, 7–13 October 2012; Proceedings Part II. Volume 7573, pp. 1–15.
- Chen, Q.; Song, Z.; Dong, J.; Huang, Z.; Hua, Y.; Yan, S. Contextualizing Object Detection and Classification. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 13–27.
- Zhang, C.; Sang, J.; Zhu, G.; Tian, Q. Bundled Local Features for Image Representation. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 1719–1726.
- Zhang, C.; Cheng, J.; Li, C.; Tian, Q. Image-Specific Classification With Local and Global Discriminations. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 4479–4486.
- Angelova, A.; Zhu, S. Efficient Object Detection and Segmentation for Fine-Grained Recognition. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 811–818.
- Lin, D.; Lu, C.; Liao, R.; Jia, J. Learning Important Spatial Pooling Regions for Scene Classification. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, Columbus, OH, USA, 23–28 June 2014; pp. 3726–3733.
- Xie, L.; Tian, Q.; Hong, R.; Yan, S.; Zhang, B. Hierarchical Part Matching for Fine-Grained Visual Categorization. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2013, Sydney, Australia, 1–8 December 2013; pp. 1641–1648.
- Farrell, R.; Oza, O.; Zhang, N.; Morariu, V.I.; Darrell, T.; Davis, L.S. Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2011, Barcelona, Spain, 6–13 November 2011; pp. 161–168.
- Gao, Y.; Beijbom, O.; Zhang, N.; Darrell, T. Compact Bilinear Pooling. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 317–326.
- Torresani, L.; Szummer, M.; Fitzgibbon, A.W. Efficient Object Category Recognition Using Classemes. In Proceedings of the 11th European Conference on Computer Vision (ECCV 2010), Heraklion, Crete, Greece, 5–11 September 2010; Proceedings Part I. Volume 6311, pp. 776–789.
- Yang, Y.; Zha, Z.; Gao, Y.; Zhu, X.; Chua, T. Exploiting Web Images for Semantic Video Indexing Via Robust Sample-Specific Loss. IEEE Trans. Multim. 2014, 16, 1677–1689.
- Farhadi, A.; Endres, I.; Hoiem, D.; Forsyth, D.A. Describing objects by their attributes. In Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, FL, USA, 20–25 June 2009; pp. 1778–1785.
- Li, L.; Su, H.; Xing, E.P.; Li, F. Object Bank: A High-Level Image Representation for Scene Classification Semantic Feature Sparsification. In Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, Vancouver, BC, Canada, 6–11 December 2010; pp. 1378–1386.
- Tang, J.; Chen, Q.; Wang, M.; Yan, S.; Chua, T.; Jain, R.C. Towards optimizing human labeling for interactive image tagging. ACM Trans. Multim. Comput. Commun. Appl. 2013, 9, 29:1–29:18.
- Zhang, C.; Zhu, G.; Huang, Q.; Tian, Q. Image classification by search with explicitly and implicitly semantic representations. Inf. Sci. 2017, 376, 125–135.
- Zhang, C.; Cheng, J.; Li, L.; Li, C.; Tian, Q. Object Categorization Using Class-Specific Representations. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 4528–4534.
- Wei, Y.; Xia, W.; Lin, M.; Huang, J.; Ni, B.; Dong, J.; Zhao, Y.; Yan, S. HCP: A Flexible CNN Framework for Multi-Label Image Classification. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 1901–1907.
- Zhang, X.; Zhang, H.; Zhang, Y.; Yang, Y.; Wang, M.; Luan, H.; Li, J.; Chua, T. Deep Fusion of Multiple Semantic Cues for Complex Event Recognition. IEEE Trans. Image Process. 2016, 25, 1033–1046.
- Wu, Y.; Ji, Q. Constrained Deep Transfer Feature Learning and Its Applications. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 5101–5109.
- Zhang, C.; Cheng, J.; Tian, Q. Multiview Label Sharing for Visual Representations and Classifications. IEEE Trans. Multim. 2018, 20, 903–913.
- Krause, J.; Sapp, B.; Howard, A.; Zhou, H.; Toshev, A.; Duerig, T.; Philbin, J.; Fei-Fei, L. The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition. In Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 8–16 October 2016; Proceedings Part III. Volume 9907, pp. 301–320.
- Lin, Y.; Morariu, V.I.; Hsu, W.H.; Davis, L.S. Jointly Optimizing 3D Model Fitting and Fine-Grained Classification. In Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Proceedings Part IV. Volume 8692, pp. 466–480.
- Ren, S.; He, K.; Girshick, R.B.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
- Gemert, J.C.; Veenman, C.J.; Smeulders, A.W.M.; Geusebroek, J. Visual Word Ambiguity. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1271–1283.
- Qi, G.; Liu, W.; Aggarwal, C.C.; Huang, T.S. Joint Intermodal and Intramodal Label Transfers for Extremely Rare or Unseen Classes. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1360–1373.
- Wang, B.; Tu, Z.; Tsotsos, J.K. Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2013, Sydney, Australia, 1–8 December 2013; pp. 425–432.
- Li, J.; Wu, Y.; Zhao, J.; Lu, K. Low-Rank Discriminant Embedding for Multiview Learning. IEEE Trans. Cybern. 2017, 47, 3516–3529.
- Rasiwasia, N.; Vasconcelos, N. Holistic Context Models for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 902–917.
- Wang, J.; Yang, J.; Yu, K.; Lv, F.; Huang, T.S.; Gong, Y. Locality-constrained Linear Coding for image classification. In Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, USA, 13–18 June 2010; pp. 3360–3367.
- Zhang, C.; Li, C.; Lu, D.; Cheng, J.; Tian, Q. Birds of a feather flock together: Visual representation with scale and class consistency. Inf. Sci. 2018, 460–461, 115–127.
- Sánchez, J.; Perronnin, F.; Mensink, T.; Verbeek, J.J. Image Classification with the Fisher Vector: Theory and Practice. Int. J. Comput. Vis. 2013, 105, 222–245.
- Liu, L.; Shen, C.; van den Hengel, A. Cross-Convolutional-Layer Pooling for Image Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2305–2313.
- Krause, J.; Stark, M.; Deng, J.; Fei-Fei, L. 3D Object Representations for Fine-Grained Categorization. In Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, ICCV Workshops 2013, Sydney, Australia, 2–8 December 2013; pp. 554–561.
- Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
- Xie, N.; Ling, H.; Hu, W.; Zhang, X. Use bin-ratio information for category and scene classification. In Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, USA, 13–18 June 2010; pp. 2313–2319.
- Zhang, C.; Cheng, J.; Liu, J.; Pang, J.; Liang, C.; Huang, Q.; Tian, Q. Object categorization in sub-semantic space. Neurocomputing 2014, 142, 248–255.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.E.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- Kong, S.; Fowlkes, C.C. Low-Rank Bilinear Pooling for Fine-Grained Classification. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 7025–7034.
- He, X.; Peng, Y.; Zhao, J. Fast Fine-Grained Image Classification via Weakly Supervised Discriminative Localization. IEEE Trans. Circuits Syst. Video Technol. 2019, 29, 1394–1407.
- Xu, Z.; Huang, S.; Zhang, Y.; Tao, D. Webly-Supervised Fine-Grained Visual Categorization via Deep Domain Adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1100–1113.
- Huang, S.; Xu, Z.; Tao, D.; Zhang, Y. Part-Stacked CNN for Fine-Grained Visual Categorization. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 1173–1182.
- Branson, S.; Horn, G.V.; Belongie, S.J.; Perona, P. Bird Species Categorization Using Pose Normalized Deep Convolutional Nets. arXiv 2014, arXiv:1406.2952.
- Jaderberg, M.; Simonyan, K.; Zisserman, A.; Kavukcuoglu, K. Spatial Transformer Networks. In Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, Montreal, QC, Canada, 7–12 December 2015; pp. 2017–2025.
- Moghimi, M.; Belongie, S.J.; Saberian, M.J.; Yang, J.; Vasconcelos, N.; Li, L. Boosted Convolutional Neural Networks. In Proceedings of the British Machine Vision Conference 2016, BMVC 2016, York, UK, 19–22 September 2016.
- Girshick, R.B.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Xie, S.; Yang, T.; Wang, X.; Lin, Y. Hyper-class augmented and regularized deep learning for fine-grained image classification. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, 7–12 June 2015; pp. 2645–2654.
- Zhang, C.; Cheng, J.; Tian, Q. Semantically Modeling of Object and Context for Categorization. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 1013–1024.
- Zhang, C.; Cheng, J.; Tian, Q. Multiview, Few-Labeled Object Categorization by Predicting Labels With View Consistency. IEEE Trans. Cybern. 2019, 49, 3834–3843.
- Zhang, C.; Cheng, J.; Tian, Q. Multiview Semantic Representation for Visual Recognition. IEEE Trans. Cybern. 2020, 50, 2038–2049.
- Lam, M.; Mahasseni, B.; Todorovic, S. Fine-Grained Recognition as HSnet Search for Informative Image Parts. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 6497–6506.
- He, X.; Peng, Y. Fine-graind Image Classification via Combining Vision and Language. arXiv 2017, arXiv:1704.02792.
- Zhang, L.; Huang, S.; Liu, W.; Tao, D. Learning a Mixture of Granularity-Specific Experts for Fine-Grained Categorization. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea, 27 October–2 November 2019; pp. 8330–8339.
- Zheng, H.; Fu, J.; Mei, T.; Luo, J. Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition. In Proceedings of the 2017 IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017; pp. 5219–5227.
- Wang, Y.; Morariu, V.I.; Davis, L.S. Learning a Discriminative Filter Bank Within a CNN for Fine-Grained Recognition. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4148–4157.
- Chen, Y.; Bai, Y.; Zhang, W.; Mei, T. Destruction and Construction Learning for Fine-Grained Image Recognition. In Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, 16–20 June 2019; pp. 5157–5166.
- Yang, Z.; Luo, T.; Wang, D.; Hu, Z.; Gao, J.; Wang, L. Learning to Navigate for Fine-grained Classification. arXiv 2018, arXiv:1809.00287.
- Luo, W.; Yang, X.; Mo, X.; Lu, Y.; Davis, L.; Li, J.; Yang, J.; Lim, S. Cross-X Learning for Fine-Grained Visual Categorization. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea, 27 October–2 November 2019; pp. 8241–8250.
- Zhang, C.; Cheng, J.; Tian, Q. Unsupervised and Semi-Supervised Image Classification With Weak Semantic Consistency. IEEE Trans. Multim. 2019, 21, 2482–2491.
- Liu, C.; Xie, H.; Zha, Z.; Yu, L.; Chen, Z.; Zhang, Y. Bidirectional Attention-Recognition Model for Fine-Grained Object Classification. IEEE Trans. Multim. 2020, 22, 1785–1795.
- Rodríguez, P.; Dorta, D.V.; Cucurull, G.; Gonfaus, J.M.; Roca, F.X.; Gonzàlez, J. Pay Attention to the Activations: A Modular Attention Mechanism for Fine-Grained Image Recognition. IEEE Trans. Multim. 2020, 22, 502–514.
Symbol | Description |
---|---|
 | visual features of the n-th image used for the m-th prelearned model |
N | number of images |
M | number of prelearned models |
 | label of |
C | number of classes |
 | m-th prelearned model with the c-th class |
 | predicted class for the n-th image using the m-th model |
 | c-th dimension of |
 | c-th dimension of |
 | sorted class distribution of for all the images of the c-th class with m-th model |
K | number of selected classes |
 | selected subset of images corresponding to the c-th class and m-th model |
 | number of selected images corresponding to the c-th class and m-th model |
 | learned classifier corresponding to the c-th class with the m-th pretrained model |
 | predicted value of using |
 | concatenated new representation of with the m-th model |
 | linear classifier parameter |
 | parameter for controlling the influence of the regularization term |
 | binary label of the n-th image with the m-th prelearned model |
 | hinge loss function |
 | linear combination parameter |
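The last rows of the notation table describe a linear classifier trained with a hinge loss and a regularization term. As a minimal sketch, assuming the common form in which the objective is the regularization weight times the squared norm of the classifier plus the summed hinge losses, the objective can be evaluated as follows. The names `w`, `lam`, `X`, `y` and the toy values are illustrative and are not the paper's notation.

```python
import numpy as np

def hinge_loss(y, s):
    """Standard hinge loss max(0, 1 - y*s) for labels y in {-1, +1} and scores s."""
    return np.maximum(0.0, 1.0 - y * s)

def objective(w, X, y, lam):
    """Regularized hinge-loss objective: lam * ||w||^2 + sum of hinge losses."""
    return lam * np.dot(w, w) + hinge_loss(y, X @ w).sum()

# Toy example: three 2-D samples with binary labels.
w = np.array([1.0, -1.0])
X = np.array([[2.0, 0.0], [0.0, 2.0], [0.5, 0.0]])
y = np.array([1.0, -1.0, -1.0])
value = objective(w, X, y, lam=0.5)
print(value)
```

Only the third sample contributes to the loss here, because it is a negative sample that receives a positive score. The first two samples lie outside the margin on the correct side and cost nothing.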
Methods | Acc (%) |
---|---|
AFC [1] | 72.8 |
LR-GCC [2] | 75.7 |
OCB [20] | 91.3 |
ICAC [14] | 76.4 |
BR [56] | 86.8 |
R [8] | 85.3 |
ResNet-50 [73] | 92.4 |
ResNet-101 [74] | 92.3 |
FGMI-LF-AFC | 77.5 |
FGMI-LF-LR-GCC | 78.3 |
FGMI-LF-OCB | 93.6 |
FGMI-LF-ICAC | 79.2 |
FGMI-LF-BR | 89.4 |
FGMI-LF-R | 88.7 |
FGMI-CNN-ResNet-50 | 94.8 |
FGMI-CNN-ResNet-101 | 94.2 |
FGMI-LF-Combined | 95.4 |
FGMI-CNN-Combined | 95.9 |
FGMI-LF-CNN-Combined | 96.5 |
Methods | EA | Acc (%) | Network |
---|---|---|---|
FC-VGG [11] | no | 70.4 | VGG |
bilinear CNN [3] | no | 84.1 | VGG |
LRBP [59] | no | 84.2 | VGG |
WSDL [60] | no | 85.7 | VGG |
PR-CNN [18] | yes | 73.5 | AlexNet |
WS [61] | yes | 78.6 | AlexNet |
PS-CNN [62] | yes | 76.2 | AlexNet |
PN-CNN [63] | yes | 75.7 | AlexNet |
Triplet-A [19] | yes | 80.7 | GoogLeNet |
STN [64] | no | 84.1 | GoogLeNet |
BoostCNN [65] | no | 86.2 | VGG |
HSnet [71] | yes | 87.5 | GoogLeNet |
CVL [72] | yes | 85.6 | VGG + GoogLeNet |
MA-CNN [74] | no | 86.5 | VGG-19 |
DFL-CNN [75] | no | 87.4 | ResNet-50 |
DCL-VGG-16 [76] | no | 86.9 | VGG-16 |
NTS-Net [77] | no | 87.5 | ResNet-50 |
DFB-CNN [75] | no | 87.4 | VGG-16 |
Cross-X (ResNet) [78] | no | 87.7 | ResNet |
FGMI-CNN-AlexNet | no | 73.4 | AlexNet |
FGMI-CNN-VGG | no | 75.8 | VGG |
FGMI-CNN-GoogleNet | no | 83.1 | GoogLeNet |
FGMI-CNN-BCNN | no | 86.7 | VGG |
FGMI-CNN-Combined | no | 88.2 | All |
Methods | EA | Acc (%) | Network |
---|---|---|---|
bilinear CNN [3] | no | 91.3 | VGG |
BoostCNN [65] | no | 92.1 | VGG |
BoT [66] | yes | 92.5 | VGG |
FC-VGG [11] | no | 76.8 | VGG |
WSDL [60] | no | 92.3 | VGG |
RCNN [66] | no | 57.4 | AlexNet |
FT-HAR-CNN [67] | no | 86.3 | AlexNet |
MA-CNN [74] | no | 92.8 | VGG-19 |
DFL-CNN [75] | no | 93.1 | ResNet-50 |
DCL-VGG-16 [76] | no | 94.1 | VGG-16 |
FGMI-CNN-AlexNet | no | 63.4 | AlexNet |
FGMI-CNN-VGG | no | 84.7 | VGG |
FGMI-CNN-GoogleNet | no | 87.3 | GoogLeNet |
FGMI-CNN-BCNN | no | 93.1 | VGG |
FGMI-CNN-Combined | no | 95.7 | All |
Dataset | No MI | No NIR | Proposed Method |
---|---|---|---|
Flower-102 | 93.2 | 91.6 | 96.5 |
CUB-200-2011 | 85.7 | 84.5 | 88.2 |
Cars-196 | 93.2 | 92.4 | 95.7 |
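The size of each ablation effect follows from simple subtraction. The snippet below is a small worked example that uses only the accuracies in the table above and reports how many percentage points the proposed method gains over the "No MI" and "No NIR" variants. The variable names are hypothetical.

```python
# Accuracies (%) copied from the ablation table: (No MI, No NIR, Proposed Method).
ablation = {
    "Flower-102": (93.2, 91.6, 96.5),
    "CUB-200-2011": (85.7, 84.5, 88.2),
    "Cars-196": (93.2, 92.4, 95.7),
}

# Gain of the proposed method over each ablated variant, in percentage points.
gains = {
    name: (round(full - no_mi, 1), round(full - no_nir, 1))
    for name, (no_mi, no_nir, full) in ablation.items()
}

for name, (gain_mi, gain_nir) in gains.items():
    print(f"{name}: +{gain_mi} over No MI, +{gain_nir} over No NIR")
```

On all three datasets the gap to "No NIR" is larger than the gap to "No MI", so in this table removing the component labeled NIR costs more accuracy than removing the component labeled MI.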
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, D.-H.; Zhou, W.; Li, J.; Wu, Y.; Zhu, S. Exploring Misclassification Information for Fine-Grained Image Classification. Sensors 2021, 21, 4176. https://doi.org/10.3390/s21124176