ODNet: A High Real-Time Network Using Orthogonal Decomposition for Few-Shot Strip Steel Surface Defect Classification

Zhang, He; Liu, Han; Guo, Runyuan; Liang, Lili; Liu, Qing; Ma, Wenlu

doi:10.3390/s24144630


by He Zhang ¹, Han Liu ¹,*, Runyuan Guo ¹, Lili Liang ¹, Qing Liu ¹ and Wenlu Ma ²

¹ School of Automation and Information Engineering, Xi’an University of Technology, Xi’an 710048, China
² School of Information Engineering, Shaanxi Xueqian Normal University, Xi’an 710100, China
* Author to whom correspondence should be addressed.

Sensors 2024, 24(14), 4630; https://doi.org/10.3390/s24144630

Submission received: 1 July 2024 / Revised: 15 July 2024 / Accepted: 16 July 2024 / Published: 17 July 2024

(This article belongs to the Special Issue Artificial Intelligence and Smart Sensor-Based Industrial Advanced Technology)


Abstract:

Strip steel plays a crucial role in modern industrial production, where enhancing the accuracy and real-time capabilities of surface defect classification is essential. However, acquiring and annotating defect samples for training deep learning models is challenging, and the task is further complicated by the redundant information in these samples. These issues hinder the classification of strip steel surface defects. To address these challenges, this paper introduces a high real-time network, ODNet (Orthogonal Decomposition Network), designed for few-shot strip steel surface defect classification. ODNet uses ResNet as its backbone and incorporates orthogonal decomposition to reduce feature redundancy. Furthermore, it integrates a skip connection to preserve essential correlation information in the samples, preventing excessive elimination. The model improves parameter efficiency by employing the Euclidean distance as the classifier. The orthogonal decomposition not only helps reduce redundant image information but also ensures compatibility with the Euclidean distance requirement for orthogonal input. Extensive experiments conducted on the FSC-20 benchmark demonstrate that ODNet achieves superior real-time performance, accuracy, and generalization compared to alternative methods, effectively addressing the challenges of few-shot strip steel surface defect classification.

Keywords:

real time; orthogonal decomposition; skip connection; few-shot defect classification; Euclidean distance

1. Introduction

In the industrial age, the demand for strip steel across various industries is increasing. However, due to the influence of temperature and manufacturing processes, surface defects such as water spots, creases, and patches frequently occur during production. These defects can seriously affect both the quality and safety of the final product [1,2]. If such defects are not identified in a timely manner, they can lead to significant losses in subsequent production stages. Therefore, it is crucial to quickly and accurately classify surface defects in strip steel production [3,4].

With the rise of deep learning, industrial production has shifted from traditional to intelligent manufacturing, giving artificial intelligence new momentum [5,6,7]. Many researchers have applied deep learning techniques to classify strip steel surface defects. A fusion matrix based on Fisher’s criterion and correlation analysis was introduced in [8] to integrate global and local dimensions. Multi-label techniques were used in [9] to improve classification performance while reducing model complexity and latency on small datasets. The correlation between pixel-level segmentation masks, object-level bounding boxes, and global image-level classification labels was considered in [10], where the features of these related tasks were learned jointly to improve performance. A scheme based on ResNet50 with FcaNet and the Convolutional Block Attention Module (CBAM) was proposed in [11] for strip defect classification. The CutPaste-Mix data augmentation strategy and Gaussian Density Estimation were used in [12] to classify abnormal regions.

However, in actual industrial production, surface defects on strip steel are rare and challenging to acquire. Therefore, directly applying traditional deep learning methods to classify these defects often leads to overfitting issues [13,14]. Moreover, industrial images contain significant redundant information, further complicating the task of classification [15,16,17].

Inspired by the human ability to quickly learn from a small number of examples, few-shot learning emerged [18]. Its goal is to train a classifier using a limited number of samples that can then efficiently detect new defects with a small number of samples [19,20]. This necessitates high precision and robust generalization from the model, aligning more closely with the practical demands of industrial defect classification [21].

Few-shot strip steel surface defect classification methods fall mainly into three categories: data augmentation-based, optimization-based, and metric-based [22,23].

Data augmentation-based. This is the most direct approach to the few-shot strip steel surface defect classification problem: the dataset is extended through affine transformations such as rotation and cropping, or through online enhancements such as Generative Adversarial Networks (GAN) [24] and CutMix [25,26,27]. Data augmentation methods such as those proposed in [28] accumulate richly featured data that incorporate expert knowledge of abnormalities, including diverse features, positions, sizes, and backgrounds. A residual discriminator network structure within a dual-discriminator GAN framework was introduced in [29] to enhance generation diversity while preserving image features. A meta-augmentation method proposed in [30] improves recognition generalization across meta-tasks through joint parameter updating from the original and augmented domains.

Optimization-based. Gradient optimization enables rapid adaptation to new tasks [31,32]. MAML [33] is recognized as one of the most influential methods, with iterative models updated by amalgamating gradients, thereby influencing numerous subsequent methodologies. A hyperparametric adaptive strategy based on gradient descent (HASGD) is introduced in [34] to enhance the stability and scalability of the training process. The framework and neural network models are refined in [35] based on MAML [33].

Metric-based. This is one of the most common solutions for few-shot strip steel surface defect classification. It primarily comprises a feature extractor and a classifier, which categorize samples through nonlinear mappings into an embedding space [36,37,38]. A novel dual-stream neural network is proposed in [39], in which numerous defect samples are generated for classifier pretraining and real steel strip surface defects are then classified using transfer learning. A transductive learning algorithm was designed and presented in [40], where a new classifier was trained during the test phase to accommodate the needs of unknown samples. A depth metric-based classification method is proposed in [41] to identify a sample-matching feature space with effective similarity measures using cosine distance. A transductive few-shot surface defect classification method is introduced in [42], leveraging both instance-level and distribution-level relations within each few-shot learning task. ResMSNet, a novel backbone network presented in [43], draws on the idea of multi-scale feature extraction for small discriminative regions in defect samples and provides classification via linking prototype distances and nonlinear relation scores. CPANet, proposed in [44], effectively aggregates long-range relationships of discrete defects and introduces a space squeeze attention module to aggregate multiscale context information of defect features. An attention-guided recognition network is presented in [45], featuring channel and position attention modules and a dual-metric function that learns classification boundaries by controlling intraclass and interclass sample distances in the feature space. Because metric-based methods are simple, efficient, and easy to design, the model proposed in this work also falls into this category.

With the advances in deep learning technology, mainstream models are becoming increasingly complex. However, as the model complexity grows, real-time performance is adversely affected, which fails to meet the demands of industrial production. Conversely, simpler models lack the capability to extract intricate discriminative features, thereby compromising classification performance. Moreover, strip surface defects typically occupy a small portion of the overall image, with the majority consisting of redundant information. In the few-shot setting, the model’s effectiveness is already hindered by insufficient sample data, so the interference of redundant information must be minimized for the model to make better use of the pertinent data.

Before the widespread adoption of deep learning, earlier studies utilized Singular Value Decomposition (SVD) to address redundancy in the classification of strip surface defects. Ref. [46] presents a technique for the detection of local defects in cold rolled strips. In their approach, principal component analysis is employed with SVD to reduce the dimensionality of the extracted feature vector. Subsequently, the defects in the steel strips are detected using a feed-forward neural network. An approach is proposed in [47], where the gray level matrix of a digital image is projected onto its singular vectors obtained through SVD. Defects are identified by abrupt changes in these projections, allowing for the determination and rough localization of the defects. The effectiveness of traditional machine learning also provides inspiration for this work. The combination of traditional methods with deep learning can yield improved results.

To tackle the above challenges, this study introduces ODNet, a high real-time network that utilizes orthogonal methods to mitigate the influence of redundant information on the model and maximize the utility of the limited available data. ODNet achieves de-redundancy via the orthogonal decomposition of fully connected layer parameters, ensuring orthogonal feature projection. The model incorporates a skip connection to safeguard against the loss of useful information during the orthogonal decomposition operation. This orthogonal embedding of features enhances its suitability for Euclidean distance inputs. Experiments were conducted on the FSC-20 benchmark, specifically designed to validate few-shot strip steel surface defect classification models. ODNet demonstrates superior classification accuracy, high real-time performance, and strong generalization compared to other methods. Additionally, extensive ablation experiments were conducted to assess the influence of the model parameters and modules on the performance.

Accordingly, this paper makes the following four major contributions:

(1) A high real-time network for few-shot strip steel surface defect classification is proposed.
(2) ODNet employs orthogonal decomposition to derive orthogonal features, thereby minimizing the impact of redundant information on the model. The inclusion of a skip connection ensures that the valuable correlation information remains intact, especially after orthogonal decomposition.
(3) The features extracted by the model with orthogonality also adhere more closely to the orthogonality requirement of the Euclidean distance on input, thereby enhancing the classifier performance.
(4) Compared to alternative methods, ODNet exhibits superior real-time performance, precision, and generalization, aligning more closely with the specific demands of industrial production.

The proposed method is described in detail in Section 2. Section 3 provides the details of a series of experiments that verify the performance of the model. Finally, Section 4 and Section 5 discuss and summarize the proposed method.

2. Methodology

The problem definition of few-shot strip steel surface defect classification is explained in Section 2.1, and the proposed network and its loss function are detailed in Section 2.2.

2.1. Problem Definition

Given a dataset $D$, it is divided into a mutually exclusive training set $D_{train}$ and testing set $D_{test}$ by class. Mini-batches are randomly selected from the training set. Each mini-batch includes N classes, with K samples from each class forming the support set and C samples from each class forming the query set. Multiple mini-batches are selected to iteratively update the model. The same steps are repeated during the testing phase. This unique training process is referred to as episodes, designed to simulate the few-shot strip steel surface defect classification scenario and to objectively evaluate the model’s performance. Figure 1 illustrates this process.
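The episode construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `sample_episode`, the dictionary layout of the dataset, and the toy sample names are all assumptions made for the example.

```python
import random

def sample_episode(data_by_class, n_way, k_shot, c_query, rng=None):
    """Sample one N-way K-shot episode.

    data_by_class: dict mapping a class label to a list of samples
    (a hypothetical layout chosen for this sketch).
    Returns (support, query), each a list of (sample, label) pairs.
    """
    rng = rng or random.Random()
    # Pick N classes for this episode.
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label in classes:
        # Draw K + C distinct samples so support and query never overlap.
        picked = rng.sample(data_by_class[label], k_shot + c_query)
        support += [(s, label) for s in picked[:k_shot]]
        query += [(s, label) for s in picked[k_shot:]]
    return support, query


# Toy data: 8 classes with 50 placeholder samples each.
toy = {c: [f"img_{c}_{i}" for i in range(50)] for c in range(8)}
support, query = sample_episode(toy, n_way=5, k_shot=1, c_query=15,
                                rng=random.Random(0))
print(len(support), len(query))  # 5 75
```

Drawing the support and query samples of a class in a single call keeps the two sets disjoint, which is what lets the query set act as held-out data within each episode.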

2.2. ODNet

The largest limitation of few-shot steel strip defect classification is that the number of samples is small, the usefulness of the samples is limited, and there is a lot of redundant information. This makes it easy for the model to learn too much redundant information, reducing the impact of the useful information on the model and leading to overfitting. The ODNet proposed in this paper alleviates the negative impact of redundant information on model training through orthogonal decomposition operations, while better meeting the requirement of the Euclidean distance for orthogonal input features and improving the classification performance of the model. As shown in Figure 2, the model uses ResNet [48] as the backbone to perform orthogonal decomposition on the input features, while adding a skip connection to ensure that the orthogonal decomposition operation does not erroneously filter out useful information in the samples. Finally, a classifier is used to obtain the predicted labels.

2.2.1. Feature Extractor

ODNet utilizes ResNet18 as the backbone feature extractor, known for its ability to effectively extract deep sample features while mitigating overfitting. The feature $f_{ϕ}(x)$ is derived by passing sample $x$ through the feature extractor $f_{ϕ}(\cdot)$. Figure 3 illustrates the structure of the feature extractor, and Table 1 details its parameters, including FC_1, an orthogonal decomposition layer that produces orthogonal features. Additionally, FC_2 is introduced to align with the feature size post skip connection.

2.2.2. Orthogonal Decomposition

To mitigate the impact of redundant information in industrial defect images, orthogonal decomposition is employed for feature processing. Specifically, following ResNet18, a fully connected layer FC_1 is introduced. As demonstrated in Equation (1), the projection of X onto the fully connected layer yields $A^{'}$. Notably, feature $A^{'}$ at this stage exhibits non-orthogonality, featuring strong correlations and substantial redundant information.

$$A^{'} = W X, \tag{1}$$

where W denotes the parameters of the fully connected layer FC_1. To reduce the influence of redundant information on the model performance, the parameters of FC_1 are subjected to SVD in this paper, as depicted in Equation (2). SVD is a matrix decomposition technique that is widely applied in data analysis and machine learning. It can achieve dimensionality reduction by retaining the main singular values and their corresponding singular vectors, which helps eliminate redundant information, reduce computational complexity, and preserve the key features of the data.

$$W = U S V^{T}, \tag{2}$$

where S is a diagonal matrix, and U and V are unitary matrices. Since the columns of the unitary matrix U are orthogonal, right-multiplying U by the diagonal matrix S only rescales each column, so the columns of the resulting matrix $W^{'}$ remain orthogonal, as shown in Equation (3).

$$W^{'} = U S \tag{3}$$

$W^{'}$, whose columns are mutually orthogonal, replaces the weight $W$ of the fully connected layer FC_1. Equation (4) demonstrates that the projection of feature X onto FC_1 with weight $W^{'}$ results in the orthogonal feature A, which helps diminish the impact of redundant information on the model. To visually illustrate the orthogonal decomposition process, Figure 4 is included in this paper.

$$A = W^{'} X = U S X \tag{4}$$
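The orthogonality of the columns of $W^{'}$ can be checked directly from Equations (2) and (3), using only the fact that $U^{T} U = I$ for a real unitary matrix. The symbols $w^{'}_{i}$ (the $i$-th column of $W^{'}$) and $s_{i}$ (the $i$-th singular value, taken as zero for any column beyond the number of singular values) are notation introduced here for illustration.

```latex
{W^{'}}^{T} W^{'} = (U S)^{T} (U S) = S^{T} U^{T} U S = S^{T} S
\quad\Longrightarrow\quad
{w^{'}_{i}}^{T} w^{'}_{j} = 0 \ \ (i \neq j), \qquad \| w^{'}_{i} \|_{2} = s_{i}
```

Because $S^{T} S$ is diagonal, distinct columns of $W^{'}$ are orthogonal, but they are generally not of unit length: each column keeps the scale of its singular value.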

However, excessive orthogonal decomposition may eliminate useful correlation information. To address this concern, as illustrated in Figure 2 and Figure 3, this study introduces a skip connection to reintroduce some correlations into the final feature $A_{final}$. Because the output of ResNet18 and the output of the orthogonal decomposition layer differ in size, they cannot be added directly. To address this issue, this paper introduces a fully connected layer, FC_2, into the skip path. Equation (5) outlines the operation of the skip connection.

$$A_{final} = A + X, \tag{5}$$

where A is the orthogonal feature, and X is the non-orthogonal feature after it has been resized by FC_2 (written as $f_{θ}(X)$ in Algorithm 1).

To facilitate a comprehensive understanding of the method proposed in this paper, Algorithm 1 delineates the specific steps involved in orthogonal decomposition and the skip connection operation. For more specific parameter settings, refer to the experimental section.

Algorithm 1: Sample $x$ is passed through ResNet18 to obtain $X$. The two fully connected layers are FC_1, $f_{W}(\cdot)$, and FC_2, $f_{θ}(\cdot)$. $A_{final}$ is the feature obtained from $x$ after the orthogonal decomposition and skip connection operations.

Input: $f_{Re}(x)$
Output: $A_{final}$
Step 1: SVD: $W = U S V^{T}$
Step 2: Replace the FC_1 parameter $W$ with $W^{'}$: $W^{'} = U S$
Step 3: $A_{final} = f_{W^{'}}(X) + f_{θ}(X)$

The integration of the orthogonal decomposition operation and the skip connection enables the model to autonomously discern between valuable and redundant information during the training phase. This approach effectively leverages the limited sample data to mitigate the detrimental impact of redundant information on the model performance.
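The three steps of Algorithm 1 can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the feature sizes (512 and 64), the random weights standing in for trained parameters, and the variable names are all hypothetical. The sketch also stores one weight vector per column and applies each layer as `X @ W` (row-vector convention), which differs in layout from the column-vector form of Equation (1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 512-d backbone feature, 64-d embedding.
in_dim, out_dim = 512, 64
W = rng.standard_normal((in_dim, out_dim))      # FC_1 weight (random stand-in)
theta = rng.standard_normal((in_dim, out_dim))  # FC_2 weight on the skip path
X = rng.standard_normal((4, in_dim))            # 4 backbone features, one per row

# Step 1: SVD of the FC_1 weight, W = U S V^T.
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Step 2: replace W with W' = U S (each column of U scaled by its singular value).
W_prime = U * s

# Step 3: orthogonal branch plus the skip branch resized by FC_2.
A = X @ W_prime
A_final = A + X @ theta

# Check: the columns of W' are mutually orthogonal, so W'^T W' is diagonal.
gram = W_prime.T @ W_prime
off_diag = gram - np.diag(np.diag(gram))
print(np.allclose(off_diag, 0.0, atol=1e-8))
print(A_final.shape)
```

The sketch applies the decomposition once to a fixed weight matrix; how often the replacement is performed during training is not specified here and is left out of the example.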

2.2.3. Classifier

Real-time performance is a crucial metric for industrial defect classification models. To enhance the model efficiency, the Euclidean distance is employed as the classifier because it is parameter-free and adapts well to different data distributions. Additionally, the Euclidean distance is a special case of the Mahalanobis distance under orthogonal inputs. Although the Euclidean distance strictly requires orthogonal inputs, this requirement is usually overlooked in practice because the distance generalizes well. The final extracted features in this study are not strictly orthogonal, but they undergo orthogonalization during feature extraction, which brings them closer to the assumptions of the Euclidean distance and contributes to improved classifier performance.
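The relation between the two distances can be written out explicitly. In the block below, $\Sigma$ denotes the covariance matrix of the features and $u$, $v$ are two feature vectors; these symbols are introduced here for illustration, while the formula itself is the standard definition of the Mahalanobis distance.

```latex
d_{M}(u, v) = \sqrt{(u - v)^{T} \Sigma^{-1} (u - v)}
\quad\xrightarrow{\ \Sigma = I\ }\quad
d_{M}(u, v) = \sqrt{(u - v)^{T} (u - v)} = \| u - v \|_{2}
```

When the feature dimensions are uncorrelated and have unit variance ($\Sigma = I$), the Mahalanobis distance coincides with the Euclidean distance, which is why decorrelated features suit a Euclidean classifier.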

As depicted in Equation (6), the mean of the samples $x^{i}$ belonging to the $K^{th}$ class in the support set is computed as the centroid of the $K^{th}$ class in the metric space, referred to as the prototype $c_{K}$.

$$c_{K} = \frac{1}{N} \sum_{i=1}^{N} f_{ϕ}(x^{i}), \tag{6}$$

where N is the number of $K^{th}$ class samples in the support set. The Euclidean distance d between the query sample $\hat{x}$ and the class prototype $c_{K}$ is calculated as shown in Equation (7). The probability distribution based on Softmax in metric space is shown in Equation (8).

$$d(f_{ϕ}(\hat{x}), c_{K}) = \| f_{ϕ}(\hat{x}) - c_{K} \|_{2} \tag{7}$$

$$p_{ϕ}(y = k \mid \hat{x}) = \frac{\exp(- d(f_{ϕ}(\hat{x}), c_{k}))}{\sum_{k^{'}} \exp(- d(f_{ϕ}(\hat{x}), c_{k^{'}}))} \tag{8}$$

The label of the class prototype with the largest probability is the predicted label of the query sample.
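Equations (6) to (8) and the prediction rule can be sketched in plain Python. This is a minimal sketch, not the authors' code: the 2-D vectors stand in for embedded features $f_{ϕ}(x)$, and the class names and all numeric values are made up for illustration.

```python
import math

def prototype(features):
    """Mean of a list of equal-length feature vectors (Equation (6))."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]

def euclidean(u, v):
    """Euclidean distance between two vectors (Equation (7))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def class_probabilities(query_feature, prototypes):
    """Softmax over negative distances to each prototype (Equation (8))."""
    logits = [-euclidean(query_feature, c) for c in prototypes]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embedded support set: two hypothetical classes, two shots each.
support = {
    "scratch": [[0.0, 0.0], [0.2, 0.0]],
    "patch": [[2.0, 2.0], [2.0, 2.2]],
}
labels = list(support)
protos = [prototype(support[name]) for name in labels]

# Classify one embedded query sample by its most probable prototype.
probs = class_probabilities([0.3, 0.1], protos)
predicted = labels[probs.index(max(probs))]
print(predicted)  # scratch
```

The classifier itself has no trainable parameters: everything it needs is computed from the support set of the current episode.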

ODNet utilizes log loss with Adam for iterative updates, as illustrated in Equation (9). The model incorporates an L2 regularizer to constrain the parameter space and expedite convergence.

$$\min J(ϕ) = - \log p_{ϕ}(y = k \mid x) + λ \|ϕ\|_{2}, \tag{9}$$

where $λ \|ϕ\|_{2}$ represents the L2 regularizer, and $λ$ represents the regularization constant.

Based on the aforementioned model and loss function, Algorithm 2 outlines the procedure. Additionally, to enhance the comprehension of ODNet’s data flow, this study illustrates the data flow encompassing both training and testing phases in Figure 5.

Algorithm 2: For an episode, $N_{C}$ is the number of classes, shared by the support set and the query set; $N_{S}$ is the number of samples in each class of the support set; $N_{Q}$ is the number of samples of each class in the query set; $S_{k}$ is the set of support samples of the $k$-th class; $Q_{k}$ is the set of query samples of the $k$-th class; and J is the loss function. $f_{ϕ}(\cdot)$ denotes the feature extractor, and d denotes the Euclidean distance.

Input: Training set $D_{tr} = \{(x_{1}, y_{1}), \dots, (x_{N}, y_{N})\}$, $y_{i} \in \{1, \dots, K\}$, where $x_{i}$ denotes the $i^{th}$ example feature, $y_{i}$ denotes the label of example $x_{i}$, and $\hat{x}$ denotes an example of the query set.
Output: J
Step 1: Class prototype: $c_{k} = {(N_{S})}^{-1} \times \sum_{(x_{i}, y_{i}) \in S_{k}} f_{ϕ}(x_{i})$
Step 2: Initialization: $J \leftarrow 0$
Step 3: for $k$ in $\{1, \dots, N_{C}\}$ do
Step 4: for $(\hat{x}, y)$ in $Q_{k}$ do
Step 5: $J \leftarrow J + {(N_{C} N_{Q})}^{-1} \left[ d(f_{ϕ}(\hat{x}), c_{k}) + \log \sum_{k^{'}} \exp(- d(f_{ϕ}(\hat{x}), c_{k^{'}})) \right]$
Step 6: end for
Step 7: end for
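The loss accumulation in Algorithm 2 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `episode_loss` and the toy prototypes are hypothetical, features are assumed to be already embedded, and the L2 regularization term of Equation (9) is omitted. Each term uses the identity $-\log p_{ϕ}(y = k \mid \hat{x}) = d_{k} + \log \sum_{k^{'}} \exp(- d_{k^{'}})$, which follows from Equation (8).

```python
import math

def episode_loss(query_features, query_labels, prototypes):
    """Average negative log-probability over all query samples.

    query_features: list of embedded query vectors
    query_labels:   index of the true class for each query
    prototypes:     list of class prototypes, one per class
    """
    total = 0.0
    for feat, k in zip(query_features, query_labels):
        dists = [math.dist(feat, c) for c in prototypes]
        # log sum_k' exp(-d_k'), computed stably by factoring out the
        # smallest distance (the largest exponent).
        m = min(dists)
        log_sum = -m + math.log(sum(math.exp(-(d - m)) for d in dists))
        # -log p(y = k | x) = d_k + log sum_k' exp(-d_k')
        total += dists[k] + log_sum
    # Dividing by the number of queries equals the 1 / (N_C * N_Q) factor.
    return total / len(query_features)


# Toy episode: two prototypes, one query sitting exactly on each.
protos = [[0.0, 0.0], [3.0, 4.0]]
loss = episode_loss([[0.0, 0.0], [3.0, 4.0]], [0, 1], protos)
print(round(loss, 4))
```

`math.dist` requires Python 3.8 or later; on older versions the distance can be computed with an explicit sum of squared differences.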

To comprehensively verify the performance of the proposed model, we conducted an extensive array of experiments. In addition to accuracy and time assessments, various experiments were performed to evaluate the influence of different classifiers, feature extractors, modules, and parameters on the model’s performance.

3. Experiment

This section describes the multiple experiments conducted to validate the proposed model’s performance. Section 3.1 details the datasets utilized and provides the experimental specifics. Section 3.2 presents results showcasing intra-domain and cross-domain accuracy. Section 3.3 outlines a series of ablation experiments examining the influence of various modules or parameters.

3.1. Dataset and Implementation

Dataset. FSC-20 is a dataset introduced in Song et al. [49] for few-shot strip steel surface defect classification. Figure 6 displays a partial sample of this dataset, which comprises 10 hot-rolled defects (6 types are sourced from the NEU-CLS dataset [50] and 4 types are sourced from the X-SDD dataset [51]) and 10 cold-rolled defects (sourced from the GC10-DET dataset [52]), each with 50 samples. All images were resized to $224 \times 224$ pixels. This study enhanced the dataset diversity by rotating the samples several times by 90 degrees. As outlined in Table 2, this work followed Song et al.’s methodology to partition the dataset into training, testing, and validation sets based on class.
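The rotation-based augmentation mentioned above can be sketched in plain Python. This is only an illustration: the function names are hypothetical, images are represented as nested lists, and the clockwise direction and the choice of keeping all four orientations are assumptions, since the text does not specify them.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def rotations(image):
    """Return the image rotated by 0, 90, 180 and 270 degrees."""
    out = [image]
    for _ in range(3):
        out.append(rotate90(out[-1]))
    return out


# A tiny 2 x 2 "image" makes the effect easy to read.
image = [[1, 2],
         [3, 4]]
for rotated in rotations(image):
    print(rotated)
```

Square inputs such as the resized $224 \times 224$ images keep their shape under these rotations, so the augmented samples need no further resizing.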

Implementation. All experiments were conducted in the same setting and environment: a server running the Microsoft Windows 10 Pro 64-bit operating system, with an Intel(R) Core(TM) i9-10900K CPU at 3.70 GHz and an NVIDIA GeForce RTX 3070 Ti GPU. The code was written in Python 3.6. The hyperparameter settings are shown in Table 3.

3.2. Precision

In this section, intra-domain and cross-domain experiments are described, including the verification of the real-time performance of the model and a comparison with other methods.

3.2.1. Intra-Domain Results

The experimental results of ODNet on the FSC-20 dataset are presented in Table 4, with the optimal result highlighted in red, the second best in blue, and the third best in green. It is observed that the proposed method outperformed others in both the 5-way 1-shot and 5-way 5-shot scenarios. Specifically, in the 1-shot case, ODNet achieved an accuracy 1% higher than LaplacianShot [53], while in the 5-shot case, it achieved a 5% higher accuracy. These findings demonstrate that the proposed model exhibits strong classification performance on intra-domain tasks and effective discrimination against untrained categories.

3.2.2. Cross-Domain Results

According to the rolling temperature, the rolling process of strip steel is categorized as hot-rolled or cold-rolled. As illustrated in Figure 6, the surface defects corresponding to hot-rolled and cold-rolled processes are distinctly different. Hot-rolled defects are often irregular, while cold-rolled defects typically manifest as points or lines. In the intra-domain experiments, the dataset partition did not entirely isolate these two types of defects; instead, both hot-rolled and cold-rolled defects were jointly used for training. Because hot-rolled defects are irregular, using them for testing increases the classification difficulty and better evaluates the generality of the model. Therefore, this work trained the model on the cold-rolled defects and tested it on the hot-rolled defects.

Table 5 displays the results of the cross-domain experiments, highlighting the best, second-best, and third-best outcomes in red, blue, and green, respectively. It is evident that the proposed model performed well in both scenarios. Specifically, in the 1-shot case, ODNet achieved an accuracy 3% higher than GTNet [59]. In the 5-shot case, ODNet’s accuracy was slightly lower than that of GTNet [59]. These experimental findings underscore ODNet’s robust generalization ability, demonstrating strong classification performance even across significantly different training and testing categories.

3.2.3. Real-Time Results

In industrial production, aside from accurate defect classification, time is crucial. Timely defect classification enables factories to promptly identify issues and adjust production processes accordingly. To evaluate the real-time performance of the proposed model, this study measured the time taken for an episode (a classification task). Table 6 reports the results, using the same color scheme as in the preceding section. ODNet achieved the second-best results in both cases, outperforming the majority of methods. These findings highlight ODNet’s high real-time performance and its ability to swiftly classify defects.
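A per-episode timing measurement of this kind can be sketched as follows. The text does not describe the measurement procedure, so this is only one plausible setup: the function name `mean_episode_time`, the warm-up calls, the number of repetitions, and the stand-in workload are all assumptions.

```python
import time

def mean_episode_time(run_episode, n_episodes=100, warmup=5):
    """Average wall-clock seconds per call of run_episode()."""
    for _ in range(warmup):
        run_episode()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(n_episodes):
        run_episode()
    return (time.perf_counter() - start) / n_episodes


# Stand-in workload; a real run would classify one episode's query set.
def dummy_episode():
    return sum(i * i for i in range(10_000))

print(f"{mean_episode_time(dummy_episode):.6f} s per episode")
```

When the model runs on a GPU, kernels execute asynchronously, so the device would need to be synchronized before each clock reading for the measurement to be meaningful.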

To visualize the performance of the proposed model, Figure 7 shows the intra-domain accuracy, cross-domain accuracy, and real-time performance. The figure demonstrates that ODNet excelled in both precision and real-time performance compared to other methods, establishing it as the optimal choice for addressing the few-shot strip steel surface defect classification problem.

3.3. Ablation

To assess the impact of each module on performance, this section describes a series of ablation experiments exploring the influence of the model parameters and included modules. These experiments aim to further elucidate the model’s performance.

3.3.1. Module Results

To investigate the influence of the backbone, orthogonal decomposition operation, and skip connection on the model performance, we conducted ablation experiments on these three modules. The experimental results are presented in Table 7. It is observed that compared to using the backbone alone, integrating the orthogonal decomposition operation significantly enhanced the model performance. Furthermore, the addition of skip connections following orthogonal decomposition further improved the performance substantially. These effects were validated through experiments: orthogonal decomposition effectively mitigated the impact of redundant information while amplifying the role of pertinent data. The skip connection prevented essential information from being filtered out by orthogonal decomposition. The synergy between these components notably enhanced the model’s classification performance.

To further validate the improvement achieved by the proposed method, additional significance tests were conducted. t-tests were employed to analyze significant differences between pairwise combinations across three scenarios, as detailed in Table 7. A value of $p < 0.05$ indicates a significant difference between the compared pairs. The results indicate that all combinations had $p < 0.0001$, highlighting significant variability in each module’s impact on model performance. The performance improvement of the orthogonal decomposition and the skip connection was notably significant.

We also evaluated the performance of ResNet12 and ResNet34 as feature extractors to determine the optimal choice. The experimental results are presented in Table 8. It was observed that ResNet12’s feature extraction capability was insufficient, resulting in decreased model classification performance. ResNet34 did not significantly enhance the classification performance, and its increased network complexity extended the model’s inference time, thereby reducing the real-time performance. Therefore, ODNet adopted ResNet18 as the backbone to achieve a balance between the real-time performance and the precision of the model.

3.3.2. Classifier Results

The orthogonal method proposed in this paper not only eliminates redundant sample information but also satisfies the orthogonality requirements of Euclidean distance on input features. To assess the impact of different classifiers on model performance, this study also evaluated the cosine distance as a classifier. The experimental results are presented in Table 9. It is evident that in both scenarios, the Euclidean distance outperformed the cosine distance, underscoring the beneficial effect of feature orthogonality on the Euclidean distance performance. Significance testing using t-tests was also conducted for both Euclidean and cosine distances. The calculations revealed $p < 0.0001$, indicating a significant improvement in the effectiveness of the Euclidean distance.

3.3.3. N and K Results

To observe the effect of N and K on few-shot strip steel surface defect classification, experiments were conducted with $N = \{3, 5\}$ and $K = \{1, 5, 10, 15\}$, and the results are shown in Table 10. For a clearer view of the parameter effects on the performance, the line plot depicted in Figure 8 reveals the following insights:

(1) With the increase in the number of samples, the model’s performance tends to saturate;
(2) The increase in the number of classes increases the classification challenge for the model. However, as the sample number increases, this difficulty becomes negligible.

Therefore, when evaluating model performance, it is appropriate to consider the 5-way 1-shot and 5-way 5-shot scenarios, which reflect the small-scale nature of few-shot learning. The experiments also demonstrate the feasibility of episodes to simulate real-world environments in few-shot strip steel surface defect classification, providing an objective evaluation of model performance.

4. Discussion

We verified the performance of ODNet for few-shot strip steel surface defect classification. As depicted in Figure 9, the proposed method exhibits high precision and real-time performance.

In industrial defect samples, redundant information is often prevalent. Valuable information in few-shot learning is limited and precious, and an excess of redundant information can impede the model’s training direction. This interference hinders the model’s ability to effectively discern useful information from redundancy and amplify the importance of the pertinent features. ODNet addresses this issue by subjecting features containing redundant information to orthogonal decomposition. This operation rapidly mitigates the impact of redundant information on the model’s training direction, consequently enhancing the classification performance. Moreover, the skip connection prevents the removal of useful information by the orthogonal decomposition process and fortifies the model’s capacity to distinguish between helpful and redundant information. The efficacy of the skip connection is also evident in the experimental results presented in Table 7. ODNet is a metric-based method, and its orthogonal features partly fulfill the input requirements of Euclidean distance. This enhances the alignment between the feature extractor and the classifier, thus contributing to the performance improvement of the model.

To enhance the real-time performance of the model, this study intentionally simplified its architecture. The orthogonal decomposition and skip connection essentially add only two fully connected layers (Table 1), which is considerably less complex than the feature extractors in other mainstream models. Additionally, the model employs Euclidean distance as the classifier, which offers stable classification performance without additional parameters. The experimental results support this design: Table 6 reports the per-episode test time, and Table 9 compares the Euclidean classifier with a cosine alternative.
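A parameter-free Euclidean classifier of the kind described here can be sketched as below. The sketch assumes that each class is represented by the mean of its support features, as in Prototypical Networks [56]; the text above does not spell out this detail, and the function names and toy 2-D features are hypothetical.

```python
import math


def class_prototypes(support_feats, support_labels):
    """Mean feature vector per class, computed from the support set."""
    sums, counts = {}, {}
    for feat, lab in zip(support_feats, support_labels):
        acc = sums.setdefault(lab, [0.0] * len(feat))
        for j, v in enumerate(feat):
            acc[j] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def classify(query_feat, prototypes):
    """Return the label of the prototype closest in Euclidean distance."""
    return min(prototypes, key=lambda lab: math.dist(query_feat, prototypes[lab]))


# Toy 2-way 2-shot support set with hypothetical 2-D features.
support_feats = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
support_labels = [0, 0, 1, 1]
protos = class_prototypes(support_feats, support_labels)
print(classify([0.1, 0.1], protos))
print(classify([0.9, 1.0], protos))
```

Nothing in this classifier is trained, so it adds no parameters on top of the feature extractor.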

However, as depicted in Figure 9, the cross-domain accuracy of the model in the 5-shot scenario is lower than its intra-domain accuracy (86.52% versus 93.41%; Tables 4 and 5). We posit that cross-domain tasks necessitate knowledge transfer supported by the introduction of prior knowledge, whereas ODNet’s orthogonal operation only manages the knowledge of the current task and does not provide prior knowledge to aid learning. Consequently, the proposed model is constrained in its performance on cross-domain tasks.

ODNet theoretically fulfills the requirements of industrial production. In the future, it holds the potential to enable swift and precise detection and classification of surface defects in strip steel on the production line, ensuring that product quality meets standards, reducing defect rates, and enhancing production efficiency. Nonetheless, its real-world industrial application may encounter challenges, particularly regarding the model’s generalization across diverse industrial environments. To address this, techniques such as model pre-training to expedite convergence or data augmentation to broaden the training dataset could enhance the model’s classification performance in practical industrial settings.

5. Conclusions

In this paper, a high real-time orthogonal decomposition network, ODNet, is proposed for few-shot strip steel surface defect classification. ODNet uses SVD to reduce the impact of redundant information on the model. The skip connection prevents useful information from being eliminated by the orthogonal operation. Euclidean distance is used as the classifier to limit the overall number of model parameters, and orthogonal features are also more in line with the input requirements of Euclidean distance. Extensive experiments show that ODNet combines high precision with high real-time performance, which matches the practical requirements of industrial production. However, compared with intra-domain tasks, the performance of the model in cross-domain tasks needs to be improved. In the future, prior knowledge can be introduced to assist model training and improve performance on cross-domain tasks.

Author Contributions

Conceptualization, H.Z. and H.L.; methodology, H.Z., R.G. and H.L.; software, H.Z., L.L. and Q.L.; validation, H.Z., W.M. and R.G.; formal analysis, H.Z.; investigation, H.Z. and W.M.; resources, H.L.; data curation, Q.L. and W.M.; writing—original draft preparation, H.Z.; writing—review and editing, H.Z. and H.L.; visualization, H.Z., W.M. and L.L.; supervision, H.L. and W.M.; project administration, H.L.; funding acquisition, H.L., R.G., L.L. and W.M. All authors have read and agreed to the published version of the manuscript.

Funding

The authors gratefully acknowledge the support from the following foundations: the Major Research Program of National Natural Science Foundation of China under Grant 92270117, the Major Instrument Project of National Natural Science Foundation of China under Grant 62127809, the National Science Foundation of China under Grants U2034209 and 62376214, the Natural Science Basic Research Program of Shaanxi under Grants 2023-JC-YB-533 and 2024JC-YBQN-0697, the 2023 General Special Scientific Research Program of the Department of Education of Shaanxi Province under Grant 23JK0387, and the Doctoral Scientific Research Startup Foundation of Xi’an University of Technology under Grant 103-451123015.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data will be made publicly available on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Cumbajin, E.; Rodrigues, N.; Costa, P.; Miragaia, R.; Frazão, L.; Costa, N.; Fernández-Caballero, A.; Carneiro, J.; Buruberri, L.H.; Pereira, A. A systematic review on deep learning with CNNs applied to surface defect detection. J. Imaging 2023, 9, 193.
2. Ma, J.; Zhang, T.; Yang, C.; Cao, Y.; Xie, L.; Tian, H.; Li, X. Review of Wafer Surface Defect Detection Methods. Electronics 2023, 12, 1787.
3. Liu, Y.; Zhang, C.; Dong, X. A survey of real-time surface defect inspection methods based on deep learning. Artif. Intell. Rev. 2023, 56, 12131–12170.
4. Tang, B.; Chen, L.; Sun, W.; Lin, Z.K. Review of surface defect detection of steel products based on machine vision. IET Image Processing 2023, 17, 303–322.
5. Shen, K.; Zhou, X.; Liu, Z. MINet: Multiscale Interactive Network for Real-Time Salient Object Detection of Strip Steel Surface Defects. IEEE Trans. Ind. Inform. 2024, 20, 7842–7852.
6. Dong, Y.; Xie, C.; Xu, L.; Cai, H.; Shen, W.; Tang, H. Generative and Contrastive Combined Support Sample Synthesis Model for Few/Zero-Shot Surface Defect Recognition. IEEE Trans. Instrum. Meas. 2023, 73, 5010515.
7. Hu, K.; Chen, Z.; Kang, H.; Tang, Y. 3D vision technologies for a self-developed structural external crack damage recognition robot. Autom. Constr. 2024, 159, 105262.
8. Lei, B.; Yi, P.; Xiang, J. A new defect classification approach based on the fusion matrix of multi-eigenvalue. IEEE Sensors J. 2020, 21, 3398–3407.
9. Liu, Y.; Yuan, Y.; Liu, J. Deep learning model for imbalanced multi-label surface defect classification. Meas. Sci. Technol. 2021, 33, 035601.
10. Sampath, V.; Maurtua, I.; Martín, J.J.A.; Rivera, A.; Molina, J.; Gutierrez, A. Attention guided multi-task learning for surface defect identification. IEEE Trans. Ind. Inform. 2023, 99, 1–9.
11. Feng, X.; Gao, X.; Luo, L. A ResNet50-based method for classifying surface defects in hot-rolled strip steel. Mathematics 2021, 9, 2359.
12. Zhang, H.; Sun, Q.; Xu, K. A Self-Supervised Model Based on CutPaste-Mix for Ductile Cast Iron Pipe Surface Defect Classification. Sensors 2023, 23, 8243.
13. Chen, G.; Yu, H.; Jiang, L.; Shang, H. Few-Shot Learning on 3D Surface Defect Detection with PM Networks. In Proceedings of the 2021 3rd International Conference on Artificial Intelligence and Advanced Manufacture, Manchester, UK, 23–25 October 2021; pp. 104–108.
14. Ma, Z.; Li, Y.; Huang, M.; Deng, N. Online visual end-to-end detection monitoring on surface defect of aluminum strip under the industrial few-shot condition. J. Manuf. Syst. 2023, 70, 31–47.
15. Li, J.; Su, Z.; Geng, J.; Yin, Y. Real-time detection of steel strip surface defects based on improved YOLO detection network. IFAC-PapersOnLine 2018, 51, 76–81.
16. Wu, S.; Zhao, S.; Zhang, Q.; Chen, L.; Wu, C. Steel Surface defect classification based on small sample learning. Appl. Sci. 2021, 11, 11459.
17. Wan, S.; Guan, S.; Tang, Y. Advancing bridge structural health monitoring: Insights into knowledge-driven and data-driven approaches. J. Data Sci. Intell. Syst. 2023.
18. Guo, R.; Chen, Q.; Liu, H.; Wang, W. Adversarial Robustness Enhancement for Deep Learning-Based Soft Sensors: An Adversarial Training Strategy Using Historical Gradients and Domain Adaptation. Sensors 2024, 24, 3909.
19. Li, Z.; Gao, L.; Gao, Y.; Li, X.; Li, H. Zero-shot surface defect recognition with class knowledge graph. Adv. Eng. Inform. 2022, 54, 101813.
20. Ma, S.; Song, K.; Niu, M.; Tian, H.; Wang, Y.; Yan, Y. Shape-Consistent One-Shot Unsupervised Domain Adaptation for Rail Surface Defect Segmentation. IEEE Trans. Ind. Inform. 2023, 19, 9667–9679.
21. Yu, R.; Guo, B.; Yang, K. Selective prototype network for few-shot metal surface defect segmentation. IEEE Trans. Instrum. Meas. 2022, 71, 5020010.
22. Liu, Z.; Guo, Z.; Li, C.; Gao, C.; Huang, N. Few-shot Steel Surface Defect Detection Based on Meta Learning. In Proceedings of the 2021 10th International Conference on Computing and Pattern Recognition, Shanghai, China, 15–17 October 2021; pp. 113–119.
23. Liu, Z.; Song, Y.; Tang, R.; Duan, G.; Tan, J. Few-shot defect recognition of metal surfaces via attention-embedding and self-supervised learning. J. Intell. Manuf. 2023, 34, 3507–3521.
24. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
25. Yun, S.; Han, D.; Oh, S.J.; Chun, S.; Choe, J.; Yoo, Y. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. arXiv 2019, arXiv:1905.04899. Available online: http://arxiv.org/abs/1905.04899 (accessed on 30 June 2024).
26. Liu, X.; Teng, W.; Liu, Y. A Model-Agnostic Meta-Baseline Method for Few-Shot Fault Diagnosis of Wind Turbines. Sensors 2022, 22, 3288.
27. Gomes, J.C.; Borges, L.d.A.B.; Borges, D.L. A Multi-Layer Feature Fusion Method for Few-Shot Image Classification. Sensors 2023, 23, 6880.
28. Gong, Y.; Wang, X.; Zhou, C.; Ge, M.; Liu, C.; Zhang, X. Human-machine knowledge hybrid augmentation method for surface defect detection based few-data learning. J. Intell. Manuf. 2024, 1–20.
29. Zhang, K.; Xiao, Y.; Wang, J.; Du, M.; Guo, X.; Zhou, R.; Shi, C.; Zhao, Z. DP-GAN: A Transmission Line Bolt Defects Generation Network Based on Dual Discriminator Architecture and Pseudo-Enhancement Strategy. IEEE Trans. Power Deliv. 2024, 39, 1622–1633.
30. Duan, G.; Song, Y.; Liu, Z.; Ling, S.; Tan, J. Cross-domain few-shot defect recognition for metal surfaces. Meas. Sci. Technol. 2023, 34, 015202.
31. Zhang, H.; Liu, H.; Liang, L.; Ma, W.; Liu, D. BiLSTM-TANet: An adaptive diverse scenes model with context embeddings for few-shot learning. Appl. Intell. 2024, 54, 5097–5116.
32. Deshpande, A.M.; Minai, A.A.; Kumar, M. One-shot recognition of manufacturing defects in steel surfaces. Procedia Manuf. 2020, 48, 1064–1071.
33. Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, NSW, Australia, 6–11 August 2017; pp. 1126–1135.
34. Pang, S.; Zhang, L.; Yuan, Y.; Zhao, W.; Wang, S.; Wang, S. Adaptive-MAML: Few-shot metal surface defects diagnosis based on model-agnostic meta-learning. Measurement 2023, 223, 113612.
35. Pang, S.; Zhao, W.; Wang, S.; Zhang, L.; Wang, S. Permute-MAML: Exploring industrial surface defect detection algorithms for few-shot learning. Complex Intell. Syst. 2024, 10, 1473–1482.
36. Guo, R.; Liu, H.; Xie, G.; Zhang, Y.; Liu, D. A self-interpretable soft sensor based on deep learning and multiple attention mechanism: From data selection to sensor modeling. IEEE Trans. Ind. Inform. 2023, 19, 6859–6871.
37. Bao, Y.; Song, K.; Liu, J.; Wang, Y.; Yan, Y.; Yu, H.; Li, X. Triplet-graph reasoning network for few-shot metal generic surface defect segmentation. IEEE Trans. Instrum. Meas. 2021, 70, 5011111.
38. Shan, D.; Zhang, Y.; Coleman, S.; Kerr, D.; Liu, S.; Hu, Z. Unseen-material few-shot defect segmentation with optimal bilateral feature transport network. IEEE Trans. Ind. Inform. 2022, 19, 8072–8082.
39. Zhang, J.; Li, S.; Yan, Y.; Ni, Z.; Ni, H. Surface Defect Classification of Steel Strip with Few Samples Based on Dual-Stream Neural Network. Steel Res. Int. 2022, 93, 2100554.
40. Wang, W.; Wu, Z.; Lu, K.; Long, H.; Li, D.; Zhang, J.; Chen, P.; Wang, B. Surface Defects Classification of Hot Rolled Strip Based on Few-shot Learning. ISIJ Int. 2022, 62, 1222–1226.
41. Yu, J.; Liu, K.; Qin, L.; Li, Q.; Zhao, F.; Wang, Q.; Liu, H.; Li, B.; Wang, J.; Li, K. DMnet: A New Few-Shot Framework for Wind Turbine Surface Defect Detection. Machines 2022, 10, 487.
42. Zhang, P.; Zheng, P.; Guo, X.; Chen, E. Few-shot defect classification via feature aggregation based on graph neural network. J. Vis. Commun. Image Represent. 2024, 101, 104172.
43. Zhao, J.; Qian, X.; Zhang, Y.; Shan, D.; Liu, X.; Coleman, S.; Kerr, D. A knowledge distillation-based multi-scale relation-prototypical network for cross-domain few-shot defect classification. J. Intell. Manuf. 2024, 35, 841–857.
44. Feng, H.; Song, K.; Cui, W.; Zhang, Y.; Yan, Y. Cross Position Aggregation Network for Few-Shot Strip Steel Surface Defect Segmentation. IEEE Trans. Instrum. Meas. 2023, 72, 5007410.
45. Gao, P.; Wang, J.; Xia, M.; Qin, Z.; Zhang, J. Dual-Metric Neural Network with Attention Guidance for Surface Defect Few-Shot Detection in Smart Manufacturing. J. Manuf. Sci. Eng. Trans. ASME 2023, 145, 121010.
46. Kang, G.W.; Liu, H.B. Surface defects inspection of cold rolled strips based on neural network. In Proceedings of the 2005 International Conference on Machine Learning and Cybernetics, Guangzhou, China, 18–21 August 2005.
47. Sun, Q.; Cai, J.; Sun, Z. Detection of surface defects on steel strips based on singular value decomposition of digital image. Math. Probl. Eng. 2016, 2016, 5797654.
48. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
49. Zhao, W.; Song, K.; Wang, Y.; Liang, S.; Yan, Y. FaNet: Feature-aware network for few shot classification of strip steel surface defects. Measurement 2023, 208, 112446.
50. Song, K.; Yan, Y. A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl. Surf. Sci. 2013, 285, 858–864.
51. Feng, X.; Gao, X.; Luo, L. X-SDD: A new benchmark for hot rolled steel strip surface defects detection. Symmetry 2021, 13, 706.
52. Lv, X.; Duan, F.; Jiang, J.J.; Fu, X.; Gan, L. Deep metallic surface defect detection: The new benchmark and detection network. Sensors 2020, 20, 1562.
53. Ziko, I.; Dolz, J.; Granger, E.; Ayed, I.B. Laplacian regularized few-shot learning. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 13–18 July 2020; pp. 11660–11670.
54. Wang, Y.; Xu, C.; Liu, C.; Zhang, L.; Fu, Y. Instance credibility inference for few-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 12836–12845.
55. Zhang, C.; Cai, Y.; Lin, G.; Shen, C. DeepEMD: Few-shot image classification with differentiable earth mover’s distance and structured classifiers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 12203–12213.
56. Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. Adv. Neural Inf. Process. Syst. 2017, 30.
57. Boudiaf, M.; Masud, Z.; Rony, J.; Dolz, J.; Piantanida, P.; Ayed, I. Transductive information maximization for few-shot learning. arXiv 2020, arXiv:2008.11297.
58. Chen, W.Y.; Liu, Y.C.; Kira, Z.; Wang, Y.C.F.; Huang, J.B. A closer look at few-shot classification. arXiv 2019, arXiv:1904.04232.
59. Xiao, W.; Song, K.; Liu, J.; Yan, Y. Graph embedding and optimal transport for few-shot classification of metal surface defect. IEEE Trans. Instrum. Meas. 2022, 71, 5010310.

Figure 1. Episodes’ training process. The testing phase presents a few-shot classification task.

Figure 2. ODNet architecture, where colored boxes represent orthogonal features.

Figure 3. The pipeline of the feature extractor. Layer1, Layer2, Layer3, and Layer4 have the same structure.

Figure 4. Orthogonal decomposition process.

Figure 5. Data flow of ODNet in the 5-way 5-shot case. The black dotted line indicates the training stage, and the red dotted line represents the concrete steps of the orthogonal decomposition operation.

Figure 6. Examples for each class in the FSC-20. The classes in the blue box are cold-rolled defects, and the classes in the red box are hot-rolled defects.

Figure 7. Dual Y-axis histogram–line chart of the run time (s) and accuracy (%) of each model. The histogram shows the time each model takes to run an episode (left Y-axis). The blue and red dots in the line chart show the intra-domain and cross-domain accuracy of each model (right Y-axis), respectively. (a) Run time and accuracy in the 5-way 1-shot case. (b) Run time and accuracy in the 5-way 5-shot case.

Figure 8. Comparison showing the effect of N and K for the ODNet. The blue line and red line represent the 3-way and 5-way, respectively. The X-axis indicates values of the K-shot. The Y-axis indicates the test accuracy.

Figure 9. ODNet’s accuracy and real-time statistics.

Table 1. Feature extractor details.

Feature Extractor	Stage	Detail
ResNet18	Pre-processing	Conv2d ($5 \times 5$, stride 2, pad 1), BatchNorm, ReLU
	Each block	Conv2d ($3 \times 3$, stride 2, pad 1)
		BatchNorm
		ReLU
		Conv2d ($3 \times 3$, stride 2, pad 1)
		BatchNorm
		Sum with Input
		ReLU
	Post-processing	AvgPool, Flatten
Orthogonal decomposition layer	Orthogonal decomposition	Fully connected layer ( $1000 \times 512$ )
	Skip connection	Fully connected layer ( $1000 \times 512$ )
	Skip connection	Sum with Input

Table 2. Detailed types of FSC-20.

Training Set (50%) ^[1]	Validation Set (25%) ^[1]	Testing Set (25%) ^[1]
Crescent gap	Welding line	One inclusion
Oil spot	Water spot	Waist folding
Rolled pit	Silk spot	Crazing
Crease	Rolled in scale	Patches
Punching hole	Iron sheet ash	Red iron
Scratches	-	-
Pitted surface	-	-
Two inclusion	-	-
Oxide scale	-	-
Slag inclusions	-	-

^[1] The values in parentheses represent the proportion of the number of samples (or classes) of this set to the total number of samples (or classes) of the dataset.

Table 3. ODNet’s hyperparameter details.

Training Epochs	Query Samples (Train) ^[1]	Query Samples (Test) ^[1]	Regularizer Constant	Learning Rate	Decay Rate	Decay Episodes	Test Episodes ^[2]
150	5	15	0.5	$10^{- 3}$	0.5	2000	1000

^[1] The number of query samples used in each episode. ^[2] The classification accuracy of each model is computed as the average over 1000 episodes randomly generated from the testing set.
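The "mean ± 95% confidence interval over 1000 test episodes" format used in the accuracy tables can be computed as sketched below. This is a generic illustration under a normal approximation (half-width 1.96·s/√n); the source does not state which interval formula was used, and the per-episode accuracies here are synthetic placeholders, not results from the paper.

```python
import math
import random
import statistics


def mean_with_ci95(accuracies):
    """Mean accuracy and 95% confidence half-width (normal approximation)."""
    n = len(accuracies)
    mean = statistics.fmean(accuracies)
    # Sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(accuracies)
    half_width = 1.96 * std / math.sqrt(n)
    return mean, half_width


# Synthetic per-episode accuracies (%), NOT results from the paper.
rng = random.Random(0)
episode_acc = [min(100.0, max(0.0, rng.gauss(90.0, 5.0))) for _ in range(1000)]
mean, hw = mean_with_ci95(episode_acc)
print(f"{mean:.2f} ± {hw:.2f}")
```

Because the half-width shrinks with the square root of the number of episodes, averaging over 1000 episodes gives an interval roughly ten times narrower than averaging over 10.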

Table 4. The intra-domain classification accuracies (%) with 95% confidence intervals on FSC-20. All accuracy results are averaged over 1000 test episodes.

Method	5-Way 1-Shot	5-Way 5-Shot
LaplacianShot [53]	79.86 ± 0.11	87.83 ± 0.08
ICI [54]	63.50 ± 0.66	72.86 ± 0.51
DeepEMD [55]	62.62 ± 0.67	71.10 ± 0.45
Prototypical Nets [56]	43.31 ± 0.34	80.29 ± 0.31
TIM [57]	71.72 ± 0.13	81.27 ± 0.09
Baseline [58]	67.72 ± 0.13	81.97 ± 0.10
GTNet [59]	76.76 ± 0.19	85.56 ± 0.08
ODNet (proposed)	80.45 ± 0.47	93.41 ± 0.26

Table 5. The cross-domain classification accuracies (%) with 95% confidence intervals on FSC-20. All accuracy results are averaged over 1000 test episodes.

Method	5-Way 1-Shot (Cold-Rolled → Hot-Rolled)	5-Way 5-Shot (Cold-Rolled → Hot-Rolled)
LaplacianShot [53]	59.90 ± 0.19	71.05 ± 0.13
ICI [54]	49.64 ± 0.32	71.62 ± 0.18
Prototypical Networks [56]	49.59 ± 0.37	75.13 ± 0.56
DeepEMD [55]	66.98 ± 0.56	80.80 ± 0.45
TIM [57]	70.11 ± 0.17	86.05 ± 0.08
Baseline [58]	67.12 ± 0.13	81.97 ± 0.10
GTNet [59]	77.61 ± 0.21	87.95 ± 0.08
ODNet (proposed)	80.32 ± 0.22	86.52 ± 0.31

Table 6. Test time (s) for an episode.

Method	1-Shot Time (s)	5-Shot Time (s)
LaplacianShot [53]	0.3581	2.5784
ICI [54]	1.1528	1.5622
DeepEMD [55]	11.8745	12.6195
TIM [57]	2.7006	5.6421
Baseline [58]	4.1243	4.3781
GTNet [59]	5.3617	11.5875
ODNet (proposed)	1.0358	2.3467

Table 7. Different module classification accuracies (%) with 95% confidence intervals on FSC-20. All accuracy results are averaged over 1000 test episodes.

ResNet18	Orthogonal Decomposition	Skip Connection	5-Way 1-Shot	5-Way 5-Shot
✓			65.27 ± 0.23	81.13 ± 0.34
✓	✓		74.08 ± 0.31	87.65 ± 0.42
✓	✓	✓	80.45 ± 0.47	93.41 ± 0.26

Table 8. Different backbone classification accuracies (%) with 95% confidence intervals on FSC-20. All accuracy results are averaged over 1000 test episodes.

Backbone	5-Way 1-Shot	5-Way 5-Shot
ResNet12	77.76 ± 0.62	89.98 ± 0.29
ResNet18	80.45 ± 0.47	93.41 ± 0.26
ResNet34	80.02 ± 0.24	94.35 ± 0.31

Table 9. Different classifier classification accuracies (%) with 95% confidence intervals on FSC-20. All accuracy results are averaged over 1000 test episodes.

Classifier	5-Way 1-Shot	5-Way 5-Shot
Euclidean	80.45 ± 0.47	93.41 ± 0.26
Cosine	48.61 ± 0.42	61.14 ± 0.39

Table 10. Different N and K classification accuracies (%) with 95% confidence intervals on FSC-20. All accuracy results are averaged over 1000 test episodes.

N-Way	K-Shot	Accuracy
3	1	83.43 ± 0.25
	5	93.87 ± 0.31
	10	95.70 ± 0.42
	15	95.38 ± 0.43
5	1	80.45 ± 0.47
	5	93.41 ± 0.26
	10	95.37 ± 0.29
	15	95.24 ± 0.36

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

