1. Introduction
Synthetic aperture radar (SAR) [1] offers continuous monitoring of local scenes and is not affected by external environmental factors such as light. Over the past 60 years, SAR technology has matured and found widespread applications in both civil and military fields [2,3,4,5]. In civil applications, SAR is utilized for geological surveys, forest and crop censuses, and emergency rescues. In the military domain, the reliable interpretation of massive SAR images to extract valuable intelligence has become crucial, leading to the emergence of automatic target recognition (ATR) technology [6]. ATR technology has evolved from theoretical research to systematic development and application worldwide. Typical SAR ATR systems include semi-automatic IMINT processing (SAIP) and the moving and stationary target acquisition and recognition (MSTAR) program [7,8,9,10]. These systems describe target characteristics based on templates and models, respectively. The release of the MSTAR database has provided ample experimental material for researchers and triggered a surge in research on SAR ATR technology.
ATR aims to achieve the automatic detection, discrimination, and recognition of potential regions of interest (ROIs) and to obtain the category and number of objects [11]. This paper mainly focuses on the recognition stage in the SAR target recognition process, that is, judging the target category. In realistic scenes, many uncertain operation conditions (OCs) can hide in the ROI; these can be divided into standard operation conditions (SOCs) and extended operation conditions (EOCs) [12]. SOCs are the OCs included in the target feature library, while those not included are called EOCs. Limited by the size and accuracy of the target feature library, the measured samples to be recognized are mostly derived from EOCs, such as strong noise, partial occlusion, depression angle differences, etc. [13], making the ATR system unstable.
To enhance the robustness of the ATR system under various EOCs, it is necessary to extract and select the features carefully. SAR image ATR methods based on traditional features in recent studies can be roughly divided into three categories. The first category is based on geometry features, such as edge features [14] and region moment features [15]. These can intuitively describe a target; however, due to the existence of speckle noise it is difficult to accurately extract these features. The second category is based on linear or nonlinear feature projection. For linear projection, principal component analysis (PCA), linear discriminant analysis (LDA) [16], and non-negative matrix factorization [17] are typical representatives. Nonlinear projection is based on kernel methods, e.g., kernel principal component analysis (KPCA) [18], and nonlinear manifold learning methods, e.g., local discriminant embedding (LDE) [19]. While these features are convenient to extract, they are unable to represent the local characteristics of the target, making it difficult to cope with occluded targets. The third category is based on scattering center (SC) features [20], which can reflect the target’s global and local electromagnetic scattering characteristics. After decades of development, SC models have become increasingly capable of describing the target’s characteristics [8,20,21,22,23,24,25]. Classical SC models include the ideal point SC model, GTD model, Attribute Scattering Center (ASC) model, etc. [26,27,28]. Among these, the ASC model proposed by Potter and Moses has been successfully applied in SAR image feature extraction and ATR [29]. Thus, the ASC feature is a good candidate for the SAR image ATR task under EOCs.
The extracted ASC list is usually unordered, and may contain false alarms and missed alarms. Therefore, it is rarely advisable to feed it directly into classifiers such as support vector machines (SVMs) [30] or neural networks [15]. Instead, researchers typically choose to determine target categories by comparing the differences between ASC sets. A critical problem is therefore to find an efficient and robust way to measure the similarity between two sets. Current strategies for tackling this problem can be broadly divided into two categories. The first evaluates the similarity via a one-to-one correspondence between the two sets. In [8], the authors adopted this correspondence and evaluated the similarity using the Bayesian posterior probability. In [20], the authors constructed the Karhunen–Loeve (KL) decomposition and adopted the result-matching method to match two ASC sets. Tian et al. [21] reconstructed ASC features by using the World View Vector (WVV), then matched the feature set with the template through the weighted bipartite graph model (WBGM) to identify the target. Dungan et al. [22] carried out the SAR image recognition task through one-to-one matching using the Hausdorff distance. However, when noise pollution or occlusion is present in the image, the Hausdorff distance can easily cause false matching. Ding et al. [23] adopted a one-to-one ASC matching method based on the Hungarian algorithm, and improved the recognition performance using information on false alarms and missed alarms. Although these methods have achieved good results, they remain complicated and cumbersome. The second category directly evaluates the difference between the two sets, thereby avoiding complex one-to-one matching. Such methods are based on the recognition of the ASC point set, and good results can be achieved only when the number of point set elements is determined and consistent. When encountering issues with resolution, noise, etc., these methods display certain drawbacks.
Deep features extracted by deep learning methods have been increasingly used for SAR ATR in recent years. Chen et al. [31] first applied convolutional neural networks (CNNs) to the SAR image target recognition field, obtaining excellent recognition accuracy. Li et al. [32] proposed a multiscale CNN based on component analysis for SAR ATR and improved the recognition accuracy. Furthermore, in our previous work we proposed a combined CNN and support vector machine (SVM) method to improve the robustness of the ATR system under limited data [33]. Guo et al. [34] proposed a target recognition method based on an SAR capsule network, which combines a traditional CNN with a capsule network. However, the capsule network used in this method is complex in design, leading to low generalization and a large number of parameters, making training difficult. Although they introduced vectorized fully-connected operations and variable folding crossover mechanisms to improve accuracy and robustness, this further increases the computational complexity and training time. Moreover, parameter selection for recovery and variable fold-crossing mechanisms poses challenges, and may have a slight impact on network performance. Although these deep learning-based methods can achieve excellent recognition results, they do not cope well with the EOC problem. Therefore, scholars have combined neural networks with ASCs to improve the robustness of the recognition system. Feng et al. [35] adopted partial convolution and an improved bidirectional convolution recurrent network to extract local features of the target through a partial model based on ASC. However, they only targeted partial occlusion, a single type of EOC, and did not extend the method to other EOCs. In [36], the authors proposed a target partial attention network (PAN) based on the ASC model, combining electromagnetic properties with a deep learning framework. However, the design of this approach is complex and requires multiple CNN models, each of which needs to be trained well enough to extract useful features. Although the deep learning approach performs relatively well, in general these approaches are black box models with poor interpretability. Moreover, they tend to have a large number of adjustable parameters, requiring a large amount of data in order to complete training. However, it is impractical to obtain large amounts of labeled SAR image data, especially in the context of military applications.
Therefore, it is essential to explore a more suitable SAR image ATR method under EOCs. Taking into account the advantages and disadvantages of the above-mentioned methods, in this paper we propose an SAR image ATR method based on SC-GMM to accommodate different EOCs. The main contributions of this paper are as follows:
- (1)
A robust SAR image ATR method is proposed that can adapt to different EOCs.
- (2)
The method utilizes the Gaussian probability density function (PDF) to describe the statistical characteristics of the position and scattering coefficients of ASCs, ensuring resilience against noise and resolution. The PDFs of multiple ASC parameters are integrated using a Gaussian mixture model (GMM) to effectively represent the statistical characteristics of the SC sets.
- (3)
To enhance calculation efficiency and robustness against noise and outliers, the Gaussian quadratic form distance (GQFD) is modified to a weighted GQFD (WGQFD). The WGQFD assigns a higher weight to position, facilitating similarity evaluation between GMMs and completing the recognition task.
- (4)
To reduce the size of the feature template library and improve recognition efficiency, this paper proposes an adaptive aspect–frame division algorithm. While it is ideal to consider the orientation sensitivity of SAR images in the template library, including SC information for all attitude angles, this is impractical. Therefore, in this paper we divide the GMM into multiple aspect frames, achieving both high recognition accuracy and improved efficiency.
The remainder of the paper is organized as follows. Section 2 introduces related works on ASCs. Section 3 introduces the proposed ATR method in detail. In Section 4, experiments under EOCs are presented to verify the effectiveness and robustness of the proposed method. Finally, Section 5 provides the conclusion of the paper.
2. Related Works
In general, the electromagnetic scattering characteristics of radar targets in the high-frequency region can be regarded as the superposition of several local phenomena. These local phenomena are known as SCs [13]. As a parametric model based on physical optics and the geometric theory of diffraction, the ASC model [29] describes the electromagnetic scattering characteristics of complex targets in the high-frequency region. The specific ASC model can be expressed as follows:
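$$E(f, \phi; \Theta) = \sum_{i=1}^{p} E_i(f, \phi; \theta_i) \quad (1)$$
In Equation (1), $E(f, \phi; \Theta)$ represents the overall backscattering at frequency $f$ and azimuth $\phi$; $p$ is the model order, namely the ASC number of the target; and $\Theta = \{\theta_i\}_{i=1}^{p}$. The formula for calculating a single ASC is as follows:
$$E_i(f, \phi; \theta_i) = A_i \left( j \frac{f}{f_c} \right)^{\alpha_i} \exp\left( -j \frac{4 \pi f}{c} \left( x_i \cos\phi + y_i \sin\phi \right) \right) \mathrm{sinc}\left( \frac{2 \pi f}{c} L_i \sin(\phi - \bar{\phi}_i) \right) \exp\left( -2 \pi f \gamma_i \sin\phi \right) \quad (2)$$
where $f_c$ represents the center frequency, $c$ is the speed of light, $\theta_i = [A_i, \alpha_i, x_i, y_i, L_i, \bar{\phi}_i, \gamma_i]$ represents the ASC parameter set, $A_i$ represents the amplitude, $\alpha_i$ represents the frequency dependence factor, which is a discrete variable with values [−1, −0.5, 0, 0.5, 1], $(x_i, y_i)$ represents the physical position of the SC in the scene, $L_i$ and $\bar{\phi}_i$ represent the length and direction angle of the distributed SC, respectively, and $\gamma_i$ represents the dependence factor of the SC with respect to $\phi$. When $L_i = 0$, the SC is local, while when $L_i > 0$ it is distributed.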
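As a minimal sketch of how the ASC model above can be evaluated numerically, the following Python snippet computes the response of a single SC and sums several of them. It assumes the standard Potter and Moses form of the model; the function and argument names (`asc_response`, `total_response`, `phi_bar`) are illustrative and not taken from the original implementation.

```python
import cmath
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity at 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def asc_response(f, phi, fc, A, alpha, x, y, L, phi_bar, gamma):
    """Response of one ASC at frequency f (Hz) and azimuth phi (rad)."""
    k = 2.0 * math.pi * f / C_LIGHT
    freq_term = (1j * f / fc) ** alpha                      # frequency dependence
    pos_term = cmath.exp(-2j * k * (x * math.cos(phi) + y * math.sin(phi)))
    len_term = sinc(k * L * math.sin(phi - phi_bar))        # distributed SCs (L > 0)
    asp_term = math.exp(-2.0 * math.pi * f * gamma * math.sin(phi))
    return A * freq_term * pos_term * len_term * asp_term


def total_response(f, phi, fc, asc_list):
    """Sum of the individual ASC responses (the model order is len(asc_list))."""
    return sum(asc_response(f, phi, fc, *theta) for theta in asc_list)
```

For a local SC with `alpha = 0`, `L = 0`, and `gamma = 0`, the response magnitude equals the amplitude `A` at every frequency and azimuth, which is a convenient sanity check.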
The ASC parameters contain rich physical meaning and are closely related to the local characteristics of the target. Different combinations of $\alpha$ and $L$ represent different geometric scattering types [37]. Therefore, the ASC can sense changes in the local physical structure of the target. The ASC model provides good physical reasoning ability, making it a feasible candidate to improve recognition performance under EOCs.
3. The Proposed ATR Method
Figure 1 shows the proposed SAR image ATR method, which consists of two stages: the offline training stage and the online test stage. In the first stage, the SAR image training set undergoes preprocessing followed by extraction of the ASCs for the construction of GMMs; the GMMs are then divided by aspect–frame division to obtain the training template library. In the second stage, a GMM is built for the test sample in the same way, but the aspect–frame division step is not needed; instead, the WGQFD between the test GMM and each template in the training template library is computed, and the category of the template with the minimum WGQFD is taken as the recognition result. The key steps in the approach are described below.
3.1. Preprocessing
To improve the extraction accuracy of SCs, data need to be processed in advance. Here, preprocessing is composed of amplitude normalization, target segmentation, and alignment.
Due to the diversity of the electromagnetic wave propagation environment, the sensor type, and the target distance, the echo intensity from the same target may be diverse, giving rise to certain amplitude differences in the SAR image. Therefore, maximum amplitude normalization is adopted to weaken the amplitude sensitivity.
Another necessary operation is target segmentation. The constant false-alarm rate (CFAR) method is usually used to detect and segment the target region from the background [38]. Nevertheless, the method often encounters modeling difficulties in non-uniform clutter regions, and the detection time is long. Moreover, the CFAR method relies heavily on the target–clutter contrast. Li et al. [39] proposed using the active contour model (ACM) to segment the target from the background. They introduced a modified region-scalable fitting (RSF) model that incorporates a statistical dissimilarity measurement to overcome the speckle noise in SAR images. They then integrated this modified RSF model into the global minimization active contour (GMAC) framework to achieve robust and accurate segmentation. We found that the ACM is more robust against noise than CFAR, as shown in Figure 2; thus, we employed the ACM to conduct target segmentation in this paper.
It can be seen that while both segmentation methods can detect and extract the main target contour in the absence of added noise, the contour segmented by the CFAR method is comparatively rough. Additionally, as the signal-to-noise ratio (SNR) decreases, the target segmented by the CFAR method loses target information, indicating that the CFAR method is not robust to noise. This would lead to inaccurate ASC feature extraction and thereby impair the recognition performance.
Lastly, data alignment preprocessing cannot be ignored. Because the target position is not always in the center, it is necessary to align the input samples such that the position deviation can be diminished. The centroid alignment is adopted here. The target centroid after alignment is located in the center of the chip.
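The normalization and alignment steps can be sketched as follows. This is only an illustration under simplifying assumptions: the chip is a list of rows of non-negative amplitudes, the shift is rounded to whole pixels, and pixels shifted outside the chip are discarded. The function names are hypothetical.

```python
def normalize_max_amplitude(chip):
    """Scale the chip so that its largest amplitude equals 1."""
    peak = max(max(row) for row in chip)
    if peak <= 0:
        return [row[:] for row in chip]
    return [[v / peak for v in row] for row in chip]


def centroid(chip):
    """Amplitude-weighted centroid (row, column) of the chip."""
    total = sum(sum(row) for row in chip)
    r = sum(i * v for i, row in enumerate(chip) for v in row) / total
    c = sum(j * v for row in chip for j, v in enumerate(row)) / total
    return r, c


def align_to_center(chip):
    """Shift the chip by whole pixels so that the centroid lands on the chip center."""
    rows, cols = len(chip), len(chip[0])
    r, c = centroid(chip)
    dr = round((rows - 1) / 2 - r)
    dc = round((cols - 1) / 2 - c)
    out = [[0.0] * cols for _ in range(rows)]
    for i, row in enumerate(chip):
        for j, v in enumerate(row):
            ni, nj = i + dr, j + dc
            if 0 <= ni < rows and 0 <= nj < cols:
                out[ni][nj] = v  # pixels shifted outside the chip are dropped
    return out
```

In practice, the centroid would be computed on the segmented target region only, so that background clutter does not bias the shift.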
3.2. ASC Extraction
In essence, the extraction of ASCs is a parameter estimation process. Here, the classical approximate maximum likelihood (AML) algorithm [23] is used for the parameter estimation. As shown in Equation (3), measured data can be expressed as the sum of real data and noise:
$$D(f, \phi) = E(f, \phi; \Theta) + N(f, \phi) \quad (3)$$
where $D(f, \phi)$ represents the data obtained by the actual measurement, $E(f, \phi; \Theta)$ represents the real scattering field of the target, and $N(f, \phi)$ represents the error caused by noise and model mismatch, which is modeled by the zero-mean Gaussian distribution. The purpose of estimating parameters is to determine the parameter set $\Theta$ on the basis of $D(f, \phi)$ and $E(f, \phi; \Theta)$. This estimation process of the ASC parameter can be expressed as follows:
$$\hat{\Theta} = \arg\min_{\Theta} \left\| D(f, \phi) - E(f, \phi; \Theta) \right\|^2 \quad (4)$$
Equation (4) provides the basic idea of parameter estimation. However, in actual operation a single SAR image may contain multiple SCs; thus, the number of parameters to be estimated can be very large. To solve this problem, in this paper we adopt the same image domain decoupling strategy as in [29], in which the SCs are separated one-by-one through image segmentation operations, then AML estimation is performed for the individual SCs.
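To illustrate the estimation idea for one decoupled SC, the sketch below fits a single ideal point scatterer to frequency-domain data. Under zero-mean Gaussian noise, maximum likelihood reduces to least squares, so the snippet grid-searches the position and solves the amplitude in closed form for each candidate. This is a deliberately reduced one-dimensional toy problem, not the AML algorithm of [23]; the frequency band, grid, and function names are assumptions made for the example.

```python
import cmath
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def steering(freqs, x):
    """Unit-amplitude frequency response of an ideal point scatterer at range x (m)."""
    return [cmath.exp(-4j * math.pi * f * x / C_LIGHT) for f in freqs]


def fit_point_scatterer(freqs, data, x_grid):
    """Grid search over position; the amplitude is solved in closed form per candidate."""
    best = None
    for x in x_grid:
        s = steering(freqs, x)
        # Least-squares amplitude: <s, data> / <s, s>, with <s, s> = len(s).
        amp = sum(d * sk.conjugate() for d, sk in zip(data, s)) / len(s)
        resid = sum(abs(d - amp * sk) ** 2 for d, sk in zip(data, s))
        if best is None or resid < best[0]:
            best = (resid, x, amp)
    return best[1], best[2]


# Noise-free toy data: one scatterer of amplitude 2 located 1 m from the scene center.
freqs = [9.3e9 + k * 0.6e9 / 31 for k in range(32)]
data = [2.0 * s for s in steering(freqs, 1.0)]
x_hat, a_hat = fit_point_scatterer(freqs, data, [i * 0.05 for i in range(-60, 61)])
```

With noisy data, the same search returns the position whose model best explains the measurement in the least-squares sense; a full ASC estimator would search over the remaining parameters as well.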
Figure 3 displays the ASC extraction results for a vehicle. The extracted ASC set reflects the distribution of strong scattering points in the original SAR target image, and includes a description of the geometric size and structure. The reconstructed SAR target image is very close to the original one. Moreover, the residual is essentially the error caused by the clutter and the algorithm itself, and contains only a small amount of energy. This further indicates that the extraction algorithm has high accuracy and can facilitate subsequent recognition.
As mentioned above, the ASC parameter set $\theta$ consists of seven items. Among these, the local SCs have five items, while the distributed SCs have six. The most common items are $A$, $\alpha$, $x$, and $y$; thus, it is reasonable to take $[A, \alpha, x, y]$ as the final parameter set. Moreover, the frequency dependence factor $\alpha$ is associated with the specific structure of the SC; however, at the current technical level it is very difficult to obtain a relatively accurate $\alpha$ value. In addition, considering that $\alpha$ is a discrete variable, its estimation error can easily have a large impact on the similarity measure of the SC set. Previous researchers have decided to abandon it [23,40,41], and in this paper we do the same, obtaining the final SC parameter $\theta = [A, x, y]$. Our subsequent experiments are conducted based on these three common parameters.
3.3. Scattering Parameter Gaussian Mixture Model
Due to the effects of resolution and noise, the estimated values of ASCs tend to have errors and fluctuate around a certain value. Here, we attempt to handle this problem through statistical methods. We set a point target in the SAR image scene center and performed 10,000 Monte Carlo experiments with a resolution of 0.3 m and white Gaussian noise of 5 dB, obtaining the scattering center estimates shown in Figure 4. As can be seen, these estimated values fluctuate around certain fixed values. Through fitting experiments, we found that the values of $A$, $x$, and $y$ fit well with the Gaussian distribution.
The Gaussian Mixture Model (GMM) is a statistical model used to represent the probability distribution of a set of datapoints that are believed to be generated from a mixture of several Gaussian distributions. In the GMM, the datapoints are modeled as a mixture of multiple Gaussian distributions, each with its own mean and covariance matrix. The advantages of using the GMM include its flexibility in modeling complex data distributions, its ability to handle missing data, and its ability to estimate the number of components in the mixture model using information criteria. Several studies have utilized the GMM for a variety of applications, including image segmentation, speech recognition, and anomaly detection [42,43,44].
Based on the above experimental results, we used a three-dimensional Gaussian distribution to describe the relative position of individual ASCs and the statistical properties of the scattering coefficient. Then, we combined multiple ASCs to construct a GMM to characterize an SAR image. In this way, the discrete SC point set can be transformed into a continuous high-dimensional PDF, which is a key step in avoiding the need for one-to-one matching to recognize SC points. The GMM can be expressed as
$$p(\mathbf{x}) = \sum_{k=1}^{K} w_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \quad (5)$$
where $K$ represents the number of Gaussian components, that is, the number of SCs, and $\boldsymbol{\mu}_k$ is determined by the scattering coefficient and the relative position, i.e., $\boldsymbol{\mu}_k = [A_k, x_k, y_k]^T$, which is the estimated three-dimensional mean vector of the $k$-th SC; moreover, $w_k$ represents the weight of the $k$-th Gaussian distribution, with the weights summing to one, while $\boldsymbol{\Sigma}_k$ is the diagonal variance matrix, whose diagonal elements are the estimated variances of $A_k$, $x_k$, and $y_k$, i.e., $\boldsymbol{\Sigma}_k = \mathrm{diag}(\sigma_{A_k}^2, \sigma_{x_k}^2, \sigma_{y_k}^2)$.
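The construction of a GMM from an extracted SC set can be sketched as follows. The snippet assumes equal component weights and a single shared diagonal variance vector, both of which are simplifications chosen for illustration; each SC is a tuple ordered as (amplitude, x, y), and the function names are hypothetical.

```python
import math


def gaussian_diag_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at point x."""
    norm = 1.0
    expo = 0.0
    for xi, mi, vi in zip(x, mean, var):
        norm *= math.sqrt(2.0 * math.pi * vi)
        expo += (xi - mi) ** 2 / (2.0 * vi)
    return math.exp(-expo) / norm


def build_gmm(ascs, var):
    """One Gaussian component per SC; equal weights are an assumption of this sketch."""
    k = len(ascs)
    return [(1.0 / k, tuple(a), tuple(var)) for a in ascs]


def gmm_pdf(gmm, x):
    """Weighted sum of the component densities."""
    return sum(w * gaussian_diag_pdf(x, m, v) for w, m, v in gmm)
```

Each component is stored as a `(weight, mean, variance)` triple, so two images with different numbers of extracted SCs simply yield mixtures with different numbers of components.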
To estimate the diagonal matrix of the variance more accurately, in this paper we set the point target SAR images at diverse SNRs and resolutions and perform ASC extraction under each condition. As shown in Figure 5, 10,000 Monte Carlo experiments were used to calculate the estimated standard deviations of the individual SCs.
With reference to
Figure 5, the following points require further discussion: (1) the higher the SNR, the smaller the estimated standard deviations of the scattering coefficient
,
, and
, while
,
, and
change linearly with the SNR; (2) the lower the resolution, the smaller the estimated standard deviations of the scattering coefficient
,
, and
,
, while
,
, and
decrease more or less exponentially as the resolution decreases. After experimental verification, we fit the expressions of
for
,
, and
:
where the parameters
can be calculated by the least square method [
43], while
is the amplitude ratio of the SC and the noise and can be obtained from Equation (
8)
where
indicates the noise power.
3.4. Weighted GQFD
After establishing the GMMs, the problem of discrete ASC set matching recognition can be transformed into the similarity measurement between two GMMs. Usually, measures such as the Kullback–Leibler divergence, Mahalanobis distance, etc. [45,46] require the same number of components in the distributions being compared. However, the number of SCs extracted for each SAR image in this paper is different, resulting in GMMs with inconsistent numbers of components; thus, these measures cannot be used directly.
Fortunately, the GQFD is the characteristic quadratic distance between the two GMMs and can be used for similarity measurement [43,44]. It has the following advantages: (1) there is no need to establish point-to-point connection between two point sets or compensate for the presence of excess points (false alarms) and missing points (missed alarms); (2) it can consider the overall structures of the respective point sets, rather than simply matching the isolated points; (3) the GQFD computation between two point sets uses analytic expressions and has high computational efficiency; and (4) it can calculate the distance between point sets at different dimensions. It is not affected by differences in amplitude or relative position, or by the number of target SCs extracted in the SAR image.
Therefore, considering the practical situation of this paper, we use the GQFD to measure the similarity between GMMs. The specific definition of the GQFD is as follows:
Assuming that
and
represent the GMMs of two ASC sets, their PDFs can be expressed by the following Equations (
9) and (
10) [
43]:
Then, the expected similarity between the two GMMs is defined as follows [
44]:
where
is a similarity function, defined as
where
and
are the
dimensional elements of
and
, respectively, and
d is the dimension of
and
.
Due to the integral in Equation (
11), it is difficult to make use of this similarity distance in numerical calculations. In [
44], it was shown that the expected similarity can be simplified as follows:
where
and
are the
dimensional element of
and
. Accordingly,
and
are the
diagonal element of
and
respectively.
Based on the expected similarity between two Gaussian distributions, the similarity matrix
, where
is the element on the
row and
column of
, can be defined as follows:
where
and
are self-expectation similarity values of
and
, respectively, and
and
are co-expectation similarity values for
and
, respectively.
Therefore, the GQFD value of these two ASC sets can be calculated using Equation (
15) [
43]:
where
represents the series of
For a stable SC, its relative position changes slowly with the target azimuth, while its amplitude varies dramatically. The position information is therefore more reliable for target recognition than the amplitude. Thus, we assign different weight values
to the relative position and amplitude in the formula
, and propose the weighted GQFD (WGQFD):
where
is obtained from the search method.
Substituting Equation (
16) into Equation (
11), we obtain the analytic expression of
:
Similarly, the other analytical expressions in Equation (
14) can be obtained, and by substituting these into Equation (
15), the WGQFD is obtained.
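The following sketch shows one way a weighted quadratic form distance between two GMMs with different numbers of components could be computed. It rests on two assumptions that may differ from the expressions in this section: the expected similarity of two diagonal Gaussians is taken to be their overlap integral, which has a simple closed form, and the per-dimension weights `dim_weights` scale the squared mean differences. Components are `(weight, mean, variance)` triples with dimensions ordered as (amplitude, x, y); all names are hypothetical.

```python
import math


def expected_similarity(mean_a, var_a, mean_b, var_b, dim_weights):
    """Closed-form overlap of two diagonal Gaussians with weighted mean differences."""
    value = 1.0
    for ma, va, mb, vb, lam in zip(mean_a, var_a, mean_b, var_b, dim_weights):
        s = va + vb
        value *= math.exp(-lam * (ma - mb) ** 2 / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)
    return value


def cross_term(g1, g2, dim_weights):
    """Sum of w_i * w_j * similarity over all component pairs of two mixtures."""
    return sum(
        w1 * w2 * expected_similarity(m1, v1, m2, v2, dim_weights)
        for w1, m1, v1 in g1
        for w2, m2, v2 in g2
    )


def wgqfd(g1, g2, dim_weights=(1.0, 1.0, 1.0)):
    """Quadratic form distance: sqrt(self(g1) + self(g2) - 2 * cross(g1, g2))."""
    q = (
        cross_term(g1, g1, dim_weights)
        + cross_term(g2, g2, dim_weights)
        - 2.0 * cross_term(g1, g2, dim_weights)
    )
    return math.sqrt(max(q, 0.0))  # guard against tiny negative rounding errors
```

Because every term is a sum over component pairs, no one-to-one matching is needed, and extra or missing SCs only change the number of summands. Setting the position entries of `dim_weights` higher than the amplitude entry emphasizes position, in the spirit of the weighting described above.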
3.5. Aspect–Frame Division
Radar ATR based on SCs suffers from both amplitude sensitivity and attitude sensitivity. The amplitude sensitivity, as described in Section 3.1, can be reduced by normalizing the maximum amplitude value.
Attitude sensitivity here refers specifically to sensitivity to the target azimuth. A common approach to this problem is to enrich the template library as far as possible; as long as the target ASC information at each azimuth is contained, the recognition performance can be further improved. However, too many templates reduce the efficiency of target recognition. Therefore, in this paper we design an adaptive aspect–frame division algorithm to reduce the number of stored templates and increase recognition efficiency. First, depending on the WGQFD of the training sample ASC set, the training samples are divided into several aspect-frames; then, in each aspect-frame, the SC set with the smallest WGQFD from the other samples is selected as the template of the current aspect-frame. Through the same method, the aspect-frames of each target can be acquired. The detailed flow of the adaptive aspect–frame division algorithm is shown in Algorithm 1.
Algorithm 1 Adaptive aspect–frame division. |
Input: the Gaussian mixture samples constructed by ASCs set of classes target SAR training sample and the aspect-frame threshold , is the number of the training samples for the class target.
|
Initialization: the total aspect-frame number of each category sample. |
- 1:
Let the Gaussian mixture index . - 2:
Let the aspect-frame index . - 3:
Assign to the aspect-frame. - 4:
Calculate the average WGQFD between the sample and all Gaussian mixture samples in the aspect-frame. - 5:
If the average , this sample is drawn into the current aspect-frame, set , go to step 2. Otherwise, this sample doesn’t belong to the current aspect-frame, set . If , add a new aspect-frame, set , and , go to step 2; if , set , and then go to step 3. - 6:
All Gaussian mixture samples of classes targets were traversed through steps 1–5.
|
|
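A simplified reading of the division procedure can be sketched as follows. It assumes that the samples of one class are already ordered by azimuth and that a sample joins the current aspect-frame when its average distance to the samples already in that frame does not exceed the threshold; otherwise, a new frame is opened. The `distance` argument stands in for the WGQFD, and the function names are hypothetical. This is an interpretation for illustration, not a line-by-line transcription of Algorithm 1.

```python
def divide_aspect_frames(samples, distance, threshold):
    """Greedily group azimuth-ordered samples into aspect-frames."""
    frames = []
    for s in samples:
        if frames:
            current = frames[-1]
            avg = sum(distance(s, t) for t in current) / len(current)
            if avg <= threshold:
                current.append(s)  # close enough to the current frame
                continue
        frames.append([s])  # first sample, or too far away: open a new frame
    return frames


def pick_template(frame, distance):
    """Return the sample with the smallest mean distance to the other samples."""
    if len(frame) == 1:
        return frame[0]

    def mean_to_others(i):
        others = [distance(frame[i], frame[j]) for j in range(len(frame)) if j != i]
        return sum(others) / len(others)

    return frame[min(range(len(frame)), key=mean_to_others)]
```

Keeping one template per frame reduces the library from one entry per training image to one entry per aspect-frame, which is where the efficiency gain comes from.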
3.6. The Training and Testing Processes
The training process includes the following steps: first, ASC extraction is performed for each class of preprocessed SAR image samples, then GMMs are constructed for each sample separately, and finally, all GMMs in each category are divided using Algorithm 1; in this way, a library of training templates is obtained. The specific details are shown in Algorithm 2.
Algorithm 2 Training process of the recognition method based on the GMM. |
Input: the target SAR training sample , where is the number of the target class of interest.
|
- 1:
for to do - 2:
Extract the ASC set of the class samples and construct the corresponding GMM. - 3:
for to do - 4:
Extract the ASC set by using Algorithm 1 for the input SAR image samples . - 5:
Construct the GMM on the extracted ASC set . - 6:
end for - 7:
Divide the GMM set into aspect-frames by using Algorithm 1. - 8:
for to do - 9:
In the aspect-frame, choose the GMM with the smallest WGQFD from the other GMMs as the template. - 10:
end for - 11:
end for
|
|
Testing is performed by constructing GMMs for the ASCs of the test samples and then finding the minimum WGQFD through calculation using the GMMs in the template library to obtain the final recognition results. The testing process used for specific recognition is shown in Algorithm 3.
Algorithm 3 Test process of the recognition method based on the GMM. |
Input: the target SAR test sample , the training template library .
|
- 1:
Preprocess the test sample . - 2:
Extract the ASCs from the preprocessed test sample and form the ASC set . - 3:
Construct a GMM by using the ASC set . - 4:
Look for the smallest WGQFD , the class to which the template with belongs, i.e.,
|
|
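The decision rule of the test stage amounts to a nearest-template search, which can be sketched as below. The library is assumed to be a dictionary mapping each class label to its list of aspect-frame templates, and `distance` stands in for the WGQFD; both are illustrative choices.

```python
def classify(test_gmm, template_library, distance):
    """Return the label of the closest template and the corresponding distance."""
    best_label, best_dist = None, float("inf")
    for label, templates in template_library.items():
        for template in templates:
            d = distance(test_gmm, template)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist
```

The cost of one recognition is proportional to the total number of stored templates, which is why reducing the library through aspect–frame division speeds up testing.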
4. Experimental Results
4.1. Data Introduction and Experimental Platform
In this paper, the proposed method is verified on the MSTAR dataset, which has been extensively used internationally for verifying SAR target recognition algorithms since the 1990s. This dataset collects X-band airborne SAR images of various static ground military vehicle targets in horizontal polarization mode, with an azimuth and range resolution of 0.3 m, an azimuth interval of approximately 0.03°, and depression angles of 15°, 17°, 30°, and 45°. Figure 6 displays the optical images of the ten categories of targets in the MSTAR dataset, in the order of 2S1, BMP2, BRDM2, BTR60, BTR70, D7, T62, T72, ZIL131, and ZSU23_4. Other vehicle images of different models for certain categories are included in the MSTAR database as well.
In order to test the recognition performance of the proposed ATR method, the experiments described below constructed different EOCs based on these ten categories of vehicle targets and selected suitable target classes. Meanwhile, several existing SAR ATR methods were selected for comparison with the proposed method. The specific implementation of each method was as follows:
- (1)
KNN: The PCA feature vector was input to the k-Nearest Neighbor (KNN) classifier to describe the original SAR image. The Euclidean distance is employed to measure the distance between the PCA feature vectors [
16].
- (2)
SVM: The PCA feature vector was input into the SVM classifier to describe the original SAR image. The Gaussian kernel was used as the kernel function of the SVM [
47].
- (3)
SRC: Random projection features were used to describe the original SAR image, and the sparse representation coefficient vector was solved using the orthogonal matching pursuit (OMP) algorithm [
48].
- (4)
Resnet: A neural network was constructed using a deep residual structure. In this paper, we chose Resnet18, which has an 18-layer network structure, for the comparison experiment [
49].
- (5)
VGG: All the convolution kernels were of size 3 × 3. Its network can be very deep, but has many parameters. In this paper, we chose VGG16, which has a network layer of 16, for the comparison [
50].
- (6)
Densenet: Compared to Resnet, it has a smaller number of parameters and a higher network depth. A 121-layer network structure was used here [
51].
- (7)
HD-ASC: The Hausdorff distance proposed in [
52] was used to match and recognize the ASC set.
- (8)
G-ASC: The recognition method proposed in this paper.
- (9)
G-ASC1: The same recognition method as in G-ASC except without aspect–frame division.
All experiments were performed on a Dell Precision 5820 workstation (CPU: Intel i9-10920X, GPU: GeForce RTX 3090, RAM: 64 GB). The software was MATLAB R2022a under Windows 10. All of the deep learning algorithms were implemented using the PyTorch library in Python 3.8.
4.2. Noise Experiments and Results
Due to the effects of the target background and radar sensor changes, the acquired SAR images are inevitably disturbed by different levels of noise. Fortunately, the chosen SAR target images from the MSTAR dataset possess a high SNR, which can reduce the difficulty of target recognition to an extent [53]. To verify the robustness of our method under different noise levels, we adopted the same method as in [23] to add Gaussian white noise to the original SAR target images, resulting in SAR images being constructed at different SNR levels. Considering the inability to completely remove the noise, we assume that the raw SAR images are noise-free.
As shown in
Figure 7, the raw images were first processed by two-dimensional fast Fourier transform (2D FFT), then the zero padding and window were removed to obtain the frequency domain image. Next, Gaussian white noise was added to obtain the transformed image, following the formula
. Finally, the transformed image was transmitted to the image domain by means of the reverse process, thereby obtaining the final SAR image with noise. The results can be found in
Figure 8. As the SNR decreases, the target becomes progressively harder to distinguish from the noisy background; at an SNR of 0 dB in particular, the target area is almost wholly submerged in the noise.
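The noise-adding step can be sketched as follows for a flat list of complex frequency-domain samples. The snippet assumes that the SNR is defined as the ratio of the mean signal power to the noise power, expressed in dB, which may differ in detail from the definition used here; the FFT, zero-padding removal, and inverse transform are omitted, and the function name is hypothetical.

```python
import math
import random


def add_white_noise(samples, snr_db, rng):
    """Add complex white Gaussian noise so that the result has the requested SNR."""
    signal_power = sum(abs(s) ** 2 for s in samples) / len(samples)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power / 2.0)  # split equally between real and imaginary parts
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for s in samples]
```

Passing a seeded `random.Random` instance as `rng` makes the noisy test sets reproducible across runs.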
Here, the training set consisted of SAR images of the BMP2(SN_9566), BTR70(SN_C71), and T72(SN_132) at a 17° depression angle, and the test set was constituted by the same target images with noise of 10 dB, 5 dB, and 0 dB. The recognition results are shown in
Table 1.
The recognition rates of all methods decreased to different degrees as the SNR decreased, while our method maintained the highest rate. In addition, the four deep learning-based methods among the comparison methods did not adapt well to the interference caused by noise. A potential reason is that the first three methods are based on global features and the next three are directly based on pixels. After strong noise is added, the global intensity distribution of the SAR image changes notably, resulting in a large discrepancy between the test and training samples and causing the recognition performance to deteriorate. This difference due to noise can be visualized in
Figure 8. By contrast, the latter three methods are very robust to noise. This is because the extracted ASCs retain the rich parametric features of the target, which benefits target recognition. In addition, the proposed method considers noise during GMM construction and uses the WGQFD as the similarity measure in the recognition process, which effectively handles the variation in the number of SCs caused by false alarms and missed detections. As a result, its recognition performance is more robust. Note that the recognition method with aspect–frame division performs slightly worse than the one without it, although it still outperforms the other comparison methods.
4.3. Resolution Change Experiments and Results
In practice, the training samples often cover only one or a few resolutions. For instance, the resolution of all SAR images in the MSTAR dataset is 0.3 m. However, the sample image to be recognized is likely to differ from the training templates in resolution, which will inevitably affect the recognition results. To test the recognition performance of our method at various resolutions, the test set was composed of images with different resolutions constructed according to [
3]. The range and azimuth resolutions of SAR images are determined by the radar sensor bandwidth and the synthetic aperture size, respectively. Referring to the method used in
Figure 7, the raw SAR target image is transformed to the frequency domain. According to the SAR imaging mechanism, the proportion of data corresponding to the set resolution is extracted from the center of the frequency domain. Then, the extracted frequency domain data are converted back to the image domain by the reverse process, yielding SAR images at different resolutions. Note that this extraction operation can only produce images of lower resolution than the original.
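The extraction of central frequency-domain data can be sketched as follows. This is a simplified illustration under stated assumptions rather than the procedure of the cited reference: the function name `reduce_resolution` is hypothetical, the centered 2D FFT of the image chip stands in for the frequency domain data, the retained fraction is assumed to be the ratio of the original to the target resolution (for example, a factor of 2 for 0.3 m to 0.6 m), and the samples outside the retained band are zeroed so that the image size is unchanged.

```python
import numpy as np

def reduce_resolution(image, factor):
    """Keep the central 1/factor of the centered 2D spectrum and transform back.

    Assumption: factor = target resolution / original resolution (>= 1).
    """
    if factor < 1:
        raise ValueError("factor must be >= 1; only lower resolutions can be produced")
    rows, cols = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    keep_r = max(1, int(round(rows / factor)))
    keep_c = max(1, int(round(cols / factor)))
    # After fftshift the zero-frequency sample sits at index n // 2,
    # so the retained band is centered on that index.
    r0 = rows // 2 - keep_r // 2
    c0 = cols // 2 - keep_c // 2
    mask = np.zeros((rows, cols), dtype=bool)
    mask[r0:r0 + keep_r, c0:c0 + keep_c] = True
    # Zero everything outside the retained band.
    low = np.where(mask, spectrum, 0)
    return np.abs(np.fft.ifft2(np.fft.ifftshift(low)))
```

Because the function only discards frequency samples, it cannot increase the resolution, which matches the restriction noted above.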
As shown in
Figure 9, the target outline and details of the SAR image gradually become more blurred as the resolution declines, which is likely to cause difficulties during recognition.
The training samples were the same for this experiment as in
Section 4.2, and the test samples were the same images at reduced resolutions (0.6 m, 0.9 m, and 1.2 m). The recognition performance results at distinct resolutions are shown in
Table 2.
The recognition accuracy of the other nine methods declined to various degrees, while the G-ASC1 and G-ASC methods remained more robust throughout. The decrease in resolution changes the pixel distribution across the whole SAR image, so the performance of the methods based on global features and pixels inevitably worsened. By considering the influence of image resolution in the ASC estimation and GMM construction process, the proposed ASC-based method adapts better to resolution changes than the others. Moreover, the proposed method uses the WGQFD-based matching method to assign different weights to the ASC parameters, further guaranteeing robust recognition performance.
4.4. Target Model Change Experiments and Results
In the target recognition field, in most cases the test targets are non-cooperative. Compared to the library targets, these have the same category but different models. Usually, they are homologous to the library targets in terms of their geometrical shapes, but have several differences in their local structure [
54]. The MSTAR dataset contains SAR images of multiple models for a given target category; for instance, BMP2 includes SN_9563, SN_9566, and SN_C21; T72 includes SN_132, SN_812, SN_S7, SN_A04, SN_A05, SN_A07, and SN_A10; etc. Here, we only selected the first three models of the above-mentioned targets. In order to ensure the rationality of the experiment, we added an extra BTR70 (SN_C71) target to the training set. Specific samples from the training set and the test set are shown in
Table 3.
Figure 10 shows T72 optical images of different models. It can be seen that while the overall structure of these targets is very similar, there are subtle differences locally, such as the fuel tank, the gun barrel, and the front armor.
Table 4 presents the recognition–confusion matrix for our method, while
Table 5 presents the average recognition rates of the other methods. In general, the recognition rate of all methods exceeded 94%, indicating that the subtle changes in detail caused by the model changes did not have a great impact on the overall recognition results. Compared with the HD-ASC method, our approach has more robust and competitive recognition performance, which indicates that the WGQFD-based matching method is good at perceiving changes in local details.
4.5. Depression Angle Change Experiments and Results
SAR images have strong sensitivity to the depression angle; in general, the greater the discrepancy in the depression angle, the greater the difference between SAR images [
55]. The MSTAR dataset contains multiple target SAR images at different depression angles. Usually, images at 17° are used as the training set and images at 30° and 45° are used as the test set. In this paper, we do the same in order to verify the recognition performance under different depression angles. The selected training set and test set data are displayed in
Table 6. The SAR images at different depression angles are shown in
Figure 11. It can be seen that the image at 30° is relatively similar to the image at 17°, while the image at 45° is quite unlike it.
Table 7 depicts the recognition results of our method for three selected targets; the average recognition rates are 96.41% and 77.56% at 30° and 45°, respectively.
Table 8 shows the average recognition rates of the various methods at 30° and 45°. The recognition rates of all methods are above 90% at 30°, indicating that a small difference in the depression angle has little effect on the different recognition methods. However, the recognition accuracy decreased sharply at a depression angle of 45°. This shows that an excessive depression angle disparity causes the test samples to differ substantially from the training samples. This characteristic can be intuitively seen in
Figure 11. In comparison, the proposed method maintains a high recognition rate at both depression angles and is only slightly lower than the HD-ASC method at 30°.
This phenomenon may be due to the sensitivity of SAR imaging to the viewing angle; the recognition performance of the global feature-based methods (e.g., KNN, SVM, SRC) and pixel-based deep learning methods (e.g., Resnet18, VGG18, Densenet121) decreased significantly when the difference in depression angles was large. However, local features (strong SCs) can be retained and used; thus, the recognition accuracy of our method remains highly robust to changes in the depression angle. This advantage is more obvious at 45°. The results for the proposed method (after aspect–frame division) are slightly lower than those of the HD-ASC method at 30°, though the model without aspect–frame division outperforms every comparison method. This shows that although aspect–frame division reduces the number of stored templates and improves recognition efficiency, it reduces the recognition accuracy to an extent. Nevertheless, the accuracy remains within an acceptable range.
4.6. Partial Occlusion Experiments and Results
Another very important EOC is partial occlusion. Whether artificial or not, occlusion blocks the radar line of sight and seriously degrades target recognition. However, the targets in the existing MSTAR measured dataset are usually completely exposed, making it necessary to construct partially occluded target images. The construction method for occluded SAR target images follows [
56], as shown in
Figure 12. The SAR target segmentation method in
Section 3.1 is used to obtain the target region; then, part of the target region is removed from a given direction at a set proportion. Finally, randomly selected pixels from the original background are used to fill the removed area.
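The occlusion construction described above can be sketched as follows. The function name `occlude_target`, the representation of the occlusion direction as an angle in degrees, and the rule that target pixels lying furthest along that direction are removed first are all assumptions made for illustration; the target mask is taken as given, and at least one background pixel is assumed to exist.

```python
import numpy as np

def occlude_target(image, target_mask, ratio, angle_deg, rng=None):
    """Replace a fraction of the target pixels, taken from one side, with background pixels.

    Assumptions: target_mask is a boolean array marking the target region,
    ratio is the fraction of target pixels to remove, and angle_deg gives the
    side from which the occlusion starts (0 = right, 90 = top).
    """
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = np.nonzero(target_mask)
    n_remove = int(round(ratio * rows.size))
    out = image.copy()
    if n_remove == 0:
        return out
    theta = np.deg2rad(angle_deg)
    # Project each target pixel onto the occlusion direction
    # (x to the right, y upward, hence the minus sign on the row index).
    proj = cols * np.cos(theta) - rows * np.sin(theta)
    # Pixels with the largest projection are occluded first.
    order = np.argsort(proj)[::-1][:n_remove]
    # Fill the removed area with randomly drawn original background pixels.
    background = image[~target_mask]
    out[rows[order], cols[order]] = rng.choice(background, size=n_remove)
    return out
```

Under this angle convention, eight occlusion directions would correspond to `angle_deg` values in steps of 45°, and the occlusion proportion maps directly to `ratio`.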
Figure 13 shows eight different occlusion directions, and
Figure 14 shows the complete image and the images occluded at a 30% ratio from different directions. The same training set was used here as in
Section 4.2 and
Section 4.3. The test set comprised SAR images at different occlusion proportions (10%–50%) from eight occlusion directions. The experimental results are shown in
Figure 15.
As the occlusion ratio increases, the recognition performance of each method declines significantly. The performance of the ASC-based methods, however, decreases more slowly, which can be attributed to their ability to accurately describe the target’s characteristics using the ASCs in non-occluded regions, resulting in robust recognition even at high occlusion ratios. As expected, the recognition accuracy of the proposed method consistently exceeds that of all the comparison methods, remaining above 80%. This shows that when some of the SCs are missing, our method can make the best use of the remaining SC information to ensure recognition accuracy, and therefore remains robust under occlusion EOCs. Consistent with the results for the previous EOCs, the recognition accuracy decreases slightly after aspect–frame division; nevertheless, the proposed method maintains high recognition performance throughout.
4.7. Computational Time Analysis
To further demonstrate the advantages of the proposed method, we compared the average time required by each method to recognize a sample, with the results shown in
Figure 16. The two classical methods, SVM and KNN, take the longest time; the proposed method without aspect–frame division takes less time than these two but more than the remaining comparison methods. By contrast, the proposed method with aspect–frame division achieves significant savings in computation time. Without aspect–frame division, the samples in the template library need to be matched one by one, which is time-consuming; after division, the number of required matches is significantly reduced, resulting in a shorter matching time.
In combination with the recognition results for each EOC, our proposed algorithm is shown to save recognition time through aspect–frame division while maintaining a consistently strong level of robustness. While the HD-ASC method produces comparable results to our proposed algorithm in certain cases, it incurs a longer processing time overall. To summarize, our proposed method (after aspect–frame division) achieves a high recognition rate with a lower time cost, making it an efficient and effective solution that is able to maintain good robustness.
5. Conclusions
This paper proposes an SAR image ATR method based on the Scattering Parameter GMM. This approach takes into account the effects of resolution and noise by adopting GMMs to model the extracted ASC set. To address the issue of inconsistent numbers of scattering points, the proposed method uses the WGQFD to measure similarity and enable SAR image ATR. In addition, in order to improve the efficiency of the recognition, an adaptive frame division operation is used to reduce the number of templates, which reduces the time spent on recognition while ensuring that there is no excessive decrease in the recognition accuracy. Various EOCs are considered, including noise, resolution changes, model changes, depression angle changes, and partial occlusions; our recognition experiments show that the proposed method outperforms others in terms of both robustness and computation time. Moreover, our proposed method has strong engineering realizability.
However, because the ASC estimation method relies on the AML algorithm, which is computationally expensive, future research could focus on improving the efficiency and accuracy of ASC extraction. This could involve exploring ways to optimize the AML algorithm or introducing deep learning techniques to enhance the process. An additional next step is to combine the scattering center parameters with deep learning methods, integrating the physical properties of SAR images into the learning process, which could further enhance the robustness and accuracy of SAR ATR.