Article

A Novel Method for CSAR Multi-Focus Image Fusion

by Jinxing Li, Leping Chen *, Daoxiang An, Dong Feng and Yongping Song
College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(15), 2797; https://doi.org/10.3390/rs16152797
Submission received: 5 June 2024 / Revised: 20 July 2024 / Accepted: 25 July 2024 / Published: 30 July 2024
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

Circular synthetic aperture radar (CSAR) has recently attracted considerable interest for its excellent performance in civilian and military applications. However, in CSAR imaging, the result is defocused when the height of an object deviates from the reference height. Existing approaches to this problem rely on digital elevation models (DEMs) for error compensation. Collecting a DEM with dedicated equipment is difficult and costly, while inverting a DEM from the echo is computationally intensive and yields unsatisfactory accuracy. Inspired by multi-focus image fusion of optical images, a spatial-domain fusion method based on the sum of modified Laplacian (SML) and a guided filter is proposed. After obtaining a stack of CSAR images at different reference heights, an all-in-focus image can be computed by the proposed method. First, the SMLs of all source images are calculated. Second, initial decision maps are acquired by selecting the maximum SML value pixel by pixel. Third, a guided filter is utilized to correct the initial decision maps. Finally, the source images and decision maps are fused to obtain the result. A comparative experiment verifies the exceptional performance of the proposed method, and the processing result of real-measured CSAR data demonstrates that the method is effective and practical.

1. Introduction

Circular synthetic aperture radar (CSAR) is a special mode of synthetic aperture radar (SAR) that provides long-time illumination of a region of interest (ROI) over a wide accumulation angle [1]. Unlike linear SAR, this geometry provides wide-angle information about the reflectivity of the scattering centers in the ROI and a means of obtaining high-resolution images. Furthermore, it makes 3D imaging possible with single-baseline, single-channel CSAR [2]. Owing to these advantages, CSAR has attracted a lot of attention and has been widely applied in civilian and military applications.
However, CSAR is sensitive to terrain undulation because of its curved synthetic aperture. Traditional CSAR imaging algorithms, such as the back projection (BP) algorithm [3,4] and the polar format algorithm (PFA) [1,5], rely on the flat-Earth assumption that all objects lie in the same height plane. In practice, the terrain of the ROI is rugged, especially in urban areas. The imaging results are defocused when an object's height deviates from the reference height, which seriously degrades the quality of the CSAR images.
A similar phenomenon is observed with digital cameras. Limited by the depth of field (DOF), when a lens focuses on an object at a certain distance, only objects within the DOF are focused, while objects outside the DOF are defocused [6]. This means a digital camera cannot directly acquire an image with all objects in focus. Multi-focus image fusion was therefore proposed to solve this problem: a series of images focused at different distances is acquired and fused to produce an all-in-focus image. Many multi-focus image fusion methods have been developed to handle this challenge, and they can be roughly classified into three main groups [7]: spatial domain methods [8,9,10,11,12,13,14], transform domain methods [15,16,17,18,19,20,21,22,23], and deep learning methods [24,25,26,27,28].
Spatial domain methods estimate decision maps pixel-wise [8,9], block-wise [10,11], or region-wise [12,13] according to focus measures. These methods are simple and efficient and can retain more spatial information from the source images. However, pixel-wise methods usually produce inaccurate decision maps, especially near focused/defocused boundaries (FDBs). The effectiveness of block-wise methods depends on the block size, and they are susceptible to blocking effects [14]. Region-wise methods discriminate focused and defocused regions by image segmentation, which can alleviate the blocking effect.
Transform domain methods transform the source images into a feature domain and select feature coefficients according to a fusion strategy; the fused image is then reconstructed from the selected coefficients. The discrete wavelet transform (DWT) [15,16] is one of the classical methods and has good time-frequency localization properties, but it cannot effectively distinguish FDBs. The nonsubsampled contourlet transform (NSCT) [17,18] can make up for the defects of the DWT, but it supports only a finite number of directions. Easley et al. [19] proposed the nonsubsampled shearlet transform (NSST), which can extract features varying in arbitrary directions. Sparse representation (SR) has also been used for multi-focus image fusion [20]. In addition, multi-scale transform methods derived from the methods above have been applied to extract multi-scale features of the source images [21,22].
Spatial domain and transform domain methods inevitably introduce artifacts. Recently, owing to their excellent feature selection capability, deep learning methods have become the mainstream for multi-focus image fusion, and a series of convolutional neural networks (CNNs) has been proposed to meet the requirements of image fusion with satisfactory results [24,25,26,27,28]. Their main shortcoming is that a large number of samples is needed to train the CNNs, whereas CSAR data collection is difficult and costly. Therefore, these methods are not applicable to CSAR image fusion.
Based on the analysis above and considering the differences between CSAR images and optical images, which are detailed in Section 2, a spatial-domain CSAR multi-focus image fusion method is proposed in this article. First, multi-layer images, i.e., full-aperture images at different reference heights, are obtained by a high-resolution imaging algorithm. Second, the sum of modified Laplacian (SML) is used as a focus measure to estimate initial decision maps. Third, the initial decision maps are corrected by a guided filter [29], since they are discontinuous near FDBs. Finally, the source images and decision maps are fused to obtain an all-in-focus image. The proposed method does not rely on a priori information such as a DEM, so it is highly adaptable in practical applications. We validate the advantage of the proposed method through a comparative experiment, and its correctness and effectiveness are verified by processing real-measured, single-baseline, single-channel CSAR data.
The rest of this article is organized as follows. Section 2 details the complete framework for the proposed method. The experiments and conclusions are presented in Section 3 and Section 4, respectively.

2. Method and Application

2.1. Characteristics of CSAR Images

Figure 1 shows the geometric model of an optical lens. Suppose there are three point-like objects whose distances from the film plane of the lens are, from far to near, B, A, and C. When the lens is focused on A, it can be concluded from Figure 1 that B is inside the DOF and C is outside it. In the imaging result, A is focused to a point, while B and C appear as circles, called diffuse circles. The radius of the diffuse circle of B is less than one system resolution cell, while that of C is larger; therefore, B appears focused, while C appears defocused.
Similar to Figure 1, an energy mapping model of CSAR imaging is established, as shown in Figure 2. Unlike Figure 1, the solid lines in Figure 2 represent the mapping of the objects' scattered echo energy to image energy. Points D, E, and F are point-like objects with heights of 0, $\Delta h_1$, and $\Delta h_2$, where $\Delta h_1 > 0$, $\Delta h_2 < 0$, and $|\Delta h_1| > |\Delta h_2|$. Assuming that the height of the imaging plane is 0, D is focused to a point in the imaging result, while E and F are defocused into rings. These rings differ from the diffuse circles in optical images because they have clear edges, like focused regions. If the rings of E and F are smaller than the smallest resolving unit of the system, E and F are considered focused; otherwise, they are defocused. It can also be seen from Figure 2 that E and F focus to points when the reference heights are $\Delta h_1$ and $\Delta h_2$, respectively. Therefore, if E and F are defocused at a reference height of 0, they can be brought into focus by adjusting the imaging plane. Imaging results of the same real-measured CSAR data at reference heights of 0 m and 4.5 m are shown in Figure 3. For a clear comparison, the regions labeled with red rectangles in Figure 3a are zoomed in, as shown in Figure 4. Objects in Regions 1 and 2 and the upper part of Region 3 are focused at a reference height of 4.5 m, while Region 4 and the lower part of Region 3 are focused at a reference height of 0 m. This indicates that multi-layer imaging focuses each object in the image whose reference height is closest to its own height.
As stated above, the characteristics of the defocused regions differ between optical and CSAR images. In optical images, the defocused regions are blurred, so their spatial information, such as edge information, is poor. In contrast, the energy of the defocused regions in CSAR images is concentrated, so they behave similarly to the focused regions, especially in the high-frequency coefficients of the transform domain. A comparison is shown in Figure 5 and Figure 6. Figure 5 shows the high-frequency coefficients of Figure 3 in the vertical direction, produced by one-layer NSST with eight directions. The two images of different reference heights both have rich and clear detail in the focused and defocused regions. Therefore, for transform domain methods, defocused regions may be extracted as focused regions and fused into the result, which seriously degrades the quality of the fusion. In Figure 6, only the focused regions of the optical images have clear edges, which makes optical images suitable for transform domain methods to separate focused and defocused regions. In conclusion, transform domain methods cannot meet the requirements of CSAR image fusion.
Spatial domain methods recognize the focused and defocused regions directly in the image domain, which makes them less likely to lose information during processing and able to retain more original information from the source images. Moreover, they are adaptable to different types of images, including CSAR images, and do not require complex transformations or training, resulting in a small amount of computation and a fast processing speed. Based on the analysis above, spatial domain methods are adopted in this article.

2.2. Spatial Domain Method Based on SML

Spatial frequency (SF), average gradient (AG), and the SML are commonly used as focus measures in spatial domain methods. The SML is locally adaptive and preserves image edge information well, which makes the fusion result clearer and richer in detail. At the same time, the SML is highly interpretable because the weights, which are closely related to pixel position and gradient, have a clear physical meaning. The definition of the SML is as follows.
Compared with the traditional sum of Laplacian, the SML considers the contributions of pixels in eight directions (horizontal, vertical, and diagonal) to the modified Laplacian (ML) of the center pixel. The ML is strongly directional and effective in preserving detailed information such as edges and textures. The SML is defined as
$SML(i,j) = \sum_{m=-M}^{M} \sum_{n=-N}^{N} ML(i+m, j+n),$
with $(i,j)$ as the coordinates of a pixel. $M$ and $N$ determine the size of the accumulating window. $ML(i,j)$ denotes the ML of pixel $(i,j)$, which is described as
$ML(i,j) = \left| 2I(i,j) - I(i-step,j) - I(i+step,j) \right| + \left| 2I(i,j) - I(i,j-step) - I(i,j+step) \right| + \left| 1.4I(i,j) - 0.7I(i+step,j-step) - 0.7I(i-step,j+step) \right| + \left| 1.4I(i,j) - 0.7I(i-step,j-step) - 0.7I(i+step,j+step) \right|,$
where $I(i,j)$ denotes the pixel value and $step$ denotes the variable pitch. The effect of variable-step pixels in eight directions is thus considered in the ML. For the center pixel's ML, closer pixels contribute more; therefore, the contribution weight is set to 1 for horizontal and vertical pixels and 0.7 for diagonal pixels. The computation above involves redundant matrix summation operations, so we adopt the modified SML computation method proposed in [30] to improve computational efficiency.
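To make the focus measure concrete, the following is a minimal NumPy sketch of the direct SML computation described above (not the accelerated variant of [30]); the function names, the use of np.roll for the shifted pixel accesses, and the boundary handling are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def modified_laplacian(img, step=15):
    """ML of each pixel: absolute second differences in the horizontal,
    vertical, and two diagonal directions, with weights 1 and 0.7."""
    I = img.astype(np.float64)

    def shifted(dy, dx):
        # circular shift as a simple stand-in for boundary handling
        return np.roll(I, shift=(dy, dx), axis=(0, 1))

    return (np.abs(2 * I - shifted(0, step) - shifted(0, -step))                       # horizontal
            + np.abs(2 * I - shifted(step, 0) - shifted(-step, 0))                     # vertical
            + np.abs(1.4 * I - 0.7 * shifted(-step, step) - 0.7 * shifted(step, -step))   # diagonal 1
            + np.abs(1.4 * I - 0.7 * shifted(step, step) - 0.7 * shifted(-step, -step)))  # diagonal 2

def sml(img, step=15, win=30):
    """SML: the ML accumulated over a (2*win+1) x (2*win+1) window."""
    size = 2 * win + 1
    # uniform_filter returns a local mean; multiplying by the window area gives the local sum
    return uniform_filter(modified_laplacian(img, step), size=size, mode='nearest') * size * size
```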

2.3. Concept of Guided Filter

On the one hand, both focused and defocused regions are selected in the initial decision maps, in which the valid values are 1. If the source images and the initial decision maps were fused directly, the defocused regions would be fused entirely into the result. We observe that the energy of the defocused regions is lower than that of the focused regions. According to [29], the guided filter can smooth an input image under the guidance of a guided image. Therefore, we use the source images as guided images to smooth the initial decision maps, which reduces the valid values of the defocused regions according to the energy of the pixels. As a result, in the fused image, the energy of the defocused regions is weakened, while the focused regions are fused in full.
On the other hand, the initial decision maps are blurred near the FDBs, as shown in Figure 7, which may lead to faults in the fusion result. The guided filter is a local linear filter with edge-preserving and smoothing properties that also suppresses noise. Guided by the source images, the outputs of the guided filter are better organized than the input images.
According to [29], the guided filter is based on the assumption that the relationship between the guided image $G$ and the output image $O$ is locally linear. Within a window $\omega_k$ of radius $r$ centered at pixel $k$, this local linear model can be expressed as
$O_i = a_k G_i + b_k, \quad \forall i \in \omega_k,$
where $a_k$ and $b_k$ are linear coefficients that are constant in the window $\omega_k$. They are determined by constraints from the input image $I$, namely by minimizing the difference between $O$ and $I$. The cost function in the window $\omega_k$ can be expressed as
$E(a_k, b_k) = \sum_{i \in \omega_k} \left[ \left( a_k G_i + b_k - I_i \right)^2 + \varepsilon a_k^2 \right],$
with $\varepsilon$ as a regularization parameter penalizing large $a_k$. The solution for $a_k$ and $b_k$ is given by the linear ridge regression model:
$a_k = \dfrac{\frac{1}{|\omega|} \sum_{i \in \omega_k} G_i I_i - \mu_k \bar{I}_k}{\sigma_k^2 + \varepsilon},$
$b_k = \bar{I}_k - a_k \mu_k,$
where $\mu_k$ and $\sigma_k^2$ are the mean and variance of $G$ in window $\omega_k$, $|\omega|$ is the number of pixels in $\omega_k$, and $\bar{I}_k = \frac{1}{|\omega|} \sum_{i \in \omega_k} I_i$ is the mean of $I$ in $\omega_k$. Having obtained $a_k$ and $b_k$, $O$ can be computed by (3). However, a pixel $i$ is covered by several different windows, so the values of $O_i$ computed in them may differ. In this article, the average of all possible values of $O_i$ is used, so the filtering output is computed as
$O_i = \bar{a}_i G_i + \bar{b}_i,$
with $\bar{a}_i = \frac{1}{|\omega|} \sum_{k \in \omega_i} a_k$ and $\bar{b}_i = \frac{1}{|\omega|} \sum_{k \in \omega_i} b_k$ as the average coefficients of all windows overlapping pixel $i$.
The performance of the filter is mainly governed by two parameters: the radius $r$ of $\omega_k$ and the regularization parameter $\varepsilon$. These two parameters should therefore be adjusted as needed when utilizing the guided filter.
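The local linear model and its ridge-regression solution above admit a very compact box-filter implementation. The following NumPy sketch assumes that all local means are computed with a simple uniform (box) filter; the function name and boundary handling are illustrative, not the authors' code.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(G, I, r, eps):
    """Guided filter of [29]: G is the guidance image, I the input image,
    r the window radius, eps the regularization parameter."""
    G = G.astype(np.float64)
    I = I.astype(np.float64)
    size = 2 * r + 1

    def box_mean(x):
        # local mean over the (2r+1) x (2r+1) window
        return uniform_filter(x, size=size, mode='nearest')

    mu_G, mu_I = box_mean(G), box_mean(I)
    var_G = box_mean(G * G) - mu_G ** 2          # variance of the guidance
    cov_GI = box_mean(G * I) - mu_G * mu_I       # covariance of guidance and input

    a = cov_GI / (var_G + eps)                   # ridge-regression coefficient a_k
    b = mu_I - a * mu_G                          # offset b_k

    # average the coefficients of all windows covering each pixel, then apply the linear model
    return box_mean(a) * G + box_mean(b)
```

In the fusion framework of Section 2.4, this filter is called once per layer, with the source image as guidance and the corresponding initial decision map as input.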

2.4. Data Process

When an object's height differs from the reference height, the imaging result is defocused, and the degree of defocus is positively correlated with the difference between the object's height and the reference height. Inspired by multi-focus image fusion of optical images, we use a spatial-domain fusion method based on the SML and a guided filter to obtain an all-in-focus image of an ROI with terrain undulation. The real-measured data processing diagram is shown in Figure 8 and has been detailed in [31]. This article focuses on the proposed image fusion method, whose framework briefly comprises four steps, as shown in Figure 9. First, a high-resolution imaging algorithm is used to obtain full-aperture CSAR images at different reference heights. Second, all images obtained in the first step are taken as source images, their SMLs are calculated, and initial decision maps are estimated by taking the largest SML pixel by pixel. Third, the initial decision maps are corrected using the guided filter. Finally, the decision maps and the source images are fused to obtain the final image.

2.4.1. Multi-Layer Imaging

The focus quality of an object is closely related to the selection of the reference height: the closer the object's height is to the reference height, the better the focus quality of the imaging result. Since the heights of objects within the same region differ, multi-layer imaging is necessary; its schematic is shown in Figure 10a. The range of reference heights $H$ is determined by roughly estimating the range of object heights within the ROI. Then, the height interval $\Delta H$ is determined according to the resolution of the CSAR system. The displacement of the imaging result is related to the height mismatch; when the displacement is less than the CSAR system's resolution, it can be ignored. Therefore, $\Delta H$ can be calculated as [31]
Δ H < 2 ρ tan θ 0 ,
where $\rho$ is the resolution of the CSAR system, and $\theta_0$ is the pitch angle of the center point of the scene, as shown in Figure 10b.
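As a hedged worked example, assume the Section 3 geometry: a theoretical resolution of $\rho = 1$ m and, with a 2000 m flight radius at a 2000 m altitude, a center pitch angle of $\theta_0 = 45^\circ$. The bound then evaluates to
$\Delta H < 2\rho \tan\theta_0 = 2 \times 1\,\mathrm{m} \times \tan 45^\circ = 2\,\mathrm{m},$
so the 0.2 m layer interval used in the experiment of Section 3 satisfies this bound with a comfortable margin.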
Finally, the BP algorithm [4] is utilized to process the echo.

2.4.2. Focus Region Detection

The focused and defocused regions of CSAR images both have clear edge information and appear sharp, so distinguishing them is a challenge. Unlike transform domain methods, spatial domain methods are sensitive to spatial information. In this article, the SML is selected as the focus measure. Theoretically, a larger SML indicates a pixel with better focus quality. After calculating the SMLs of the source images, focused and defocused regions are distinguished by choosing the largest SML pixel by pixel. The process of obtaining the initial decision maps can be represented as
$bdd_x(i,j) = \begin{cases} 1, & \text{if } sml_x(i,j) = \max(\mathbf{SML}) \\ 0, & \text{otherwise}, \end{cases}$
where $\mathbf{SML} = \{sml_1, sml_2, \ldots, sml_N\}$ and $sml_x$ is the SML of the source image $I_x$.
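A sketch of this pixel-wise maximum-SML rule in NumPy, assuming the source images are supplied as a list of equally sized magnitude arrays; the sml() helper refers to the illustrative sketch in Section 2.2 and is not the authors' code.

```python
import numpy as np

def initial_decision_maps(source_images, step=15, win=30):
    """One binary map per source image: 1 where that image attains the
    largest SML among all layers, 0 otherwise."""
    smls = np.stack([sml(img, step, win) for img in source_images])  # shape (N, H, W)
    winner = np.argmax(smls, axis=0)                                 # index of the max-SML layer per pixel
    return [(winner == x).astype(np.float64) for x in range(len(source_images))]
```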

2.4.3. Guided Filter

Since the initial decision maps are blurred near the FDBs, the guided filter introduced in Section 2.3 is utilized to optimize them. The process can be described as
$BDD_x = \mathrm{guidedfilter}(I_x, bdd_x, r, \varepsilon),$
with $BDD_x$ as the final decision map of source image $I_x$; $r$ denotes the radius of the window $\omega_k$, and $\varepsilon$ is the regularization parameter, both detailed in Section 2.3.

2.4.4. Fused Result

After obtaining the final decision maps, the fused image $I_F$ is computed by fusing the source images and the final decision maps, which is represented as
$I_F = \sum_{x=1}^{N} I_x \cdot BDD_x.$
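Putting Section 2.4.2, Section 2.4.3, and Section 2.4.4 together, the following sketch chains the earlier illustrative helpers (initial_decision_maps and guided_filter) into a minimal end-to-end fusion; the default parameter values follow Section 3.2, and the whole routine is an assumption-laden outline rather than the authors' implementation.

```python
def fuse_multilayer(source_images, step=15, win=30, r=8, eps=0.4 ** 2):
    """Fuse a stack of CSAR magnitude images of different reference heights
    into a single all-in-focus image."""
    # Section 2.4.2: initial decision maps from the pixel-wise maximum SML
    bdd = initial_decision_maps(source_images, step, win)
    # Section 2.4.3: correct each map with the guided filter, guided by its source image
    BDD = [guided_filter(I_x, bdd_x, r, eps) for I_x, bdd_x in zip(source_images, bdd)]
    # Section 2.4.4: weighted sum of source images and final decision maps
    return sum(I_x * BDD_x for I_x, BDD_x in zip(source_images, BDD))
```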

3. Experiments and Results

Data were collected by an L-band CSAR system in Shanxi Province, China; further system parameters are listed in Table 1. The aircraft flew along a circular trajectory with a radius of 2000 m at a height of 2000 m, spotlighting a circular ROI with a radius of 300 m. As shown in Figure 11, the ROI is centered on an island with objects of different heights, such as buildings and marinas, whose optical images are shown in Figure 12. There is a square in the northeast and streetlamps along a road in the north. The size of the imaging result is 4000 pixels × 4000 pixels with a 0.15 m × 0.15 m sampling interval and a theoretical resolution of 1 m by incoherent imaging.
By estimating the height range of objects in the ROI, we set 24 reference heights from −1.8 m to 2.8 m at a 0.2 m interval. In total, 24 full-aperture images with different reference heights were obtained using the BP algorithm; the images with reference heights of −1.6 m and 1.4 m are shown in Figure 13. The focused regions differ between images of different reference heights, which is especially obvious in the regions marked with red rectangles in Figure 13a, whose zoomed-in results are shown in Figure 14. Objects in the lower parts of Regions 1 and 2 are focused when the reference height is −1.6 m because they are below-grade marinas whose heights are closer to −1.6 m than to 1.4 m. Meanwhile, Region 3 and the upper parts of Regions 1 and 2 are focused when the reference height is 1.4 m. Moreover, the streetlamps along the road marked with blue rectangles are focused as points in Figure 13b because they are higher than the ground. This conclusion also applies to the other images. It indicates that multi-layer imaging enables objects of different heights to be focused in the image whose reference height is closest to their own.
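For reference, the 24 reference heights can be generated as below; this tiny check merely confirms that −1.8 m to 2.8 m at a 0.2 m interval yields 24 layers and is not part of the processing chain.

```python
import numpy as np

# 24 reference heights from -1.8 m to 2.8 m at a 0.2 m interval
heights = np.round(np.arange(-1.8, 2.8 + 1e-9, 0.2), 1)
assert heights.size == 24
```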
To verify the excellent performance of spatial domain methods in CSAR image fusion, the proposed method is compared with three representative methods: an AG-based method, the NSST [23], and the pulse-coupled neural network (PCNN) [32]. The AG-based method is a spatial domain method whose focus measure is the AG, a classical focus measure in image fusion. The NSST is a classical transform domain method. The PCNN can be categorized as a deep learning method, but it does not require a training data set for training and prediction, so it can be used for CSAR image fusion.

3.1. Evaluation Metrics

A well-performing image fusion method needs to fulfill the following requirements. First, it should accurately extract the complementary information between the fused images. Second, a high degree of robustness and reliability is necessary. Third, it should avoid introducing information and errors that are incompatible with the human visual system.
Subjective evaluation relies on human judgment, and the reliability of its results is closely tied to the observer's experience, which makes the evaluation time consuming and the results unrepeatable. In contrast, supported by a complete theoretical derivation, objective evaluation is highly interpretable and can objectively assess the similarity between the fused image and the source images. Therefore, both subjective and objective evaluation should be included in the image fusion evaluation system [33]. The objective evaluation metrics used in this article are defined as follows.
1. All Cross-Entropy (ACE)
Cross-entropy (CE) describes the difference between two images by comparing the distribution probabilities of their gray values. $CE_{AF}$ is the CE between the source image $A$ and the fused image $F$, which can be described as
$CE_{AF} = \sum_{k} P_A(k) \log_2 \frac{P_A(k)}{P_F(k)},$
with $P_A(k)$ and $P_F(k)$ as the distribution probabilities of the gray values of $A$ and $F$, respectively, and $k \in [0, 255]$ the range of gray values.
ACE is defined as the average of the CEs between the source images $A$ and $B$ and the fused image $F$:
$ACE = \frac{CE_{AF} + CE_{BF}}{2}.$
A smaller $ACE$ indicates that the overall difference between the source images and the fused image is smaller.
2. Structural Similarity (SSIM)
The SSIM [34] reflects the similarity of two images. $SSIM_{AF}$ is the SSIM between $A$ and $F$:
$SSIM_{AF} = \frac{\left( 2\mu_A \mu_F + c_1 \right)\left( 2\sigma_{AF} + c_2 \right)}{\left( \mu_A^2 + \mu_F^2 + c_1 \right)\left( \sigma_A^2 + \sigma_F^2 + c_2 \right)},$
where $\mu_A$ and $\mu_F$ are the means of the pixels of $A$ and $F$, respectively, and $\sigma_A^2$ and $\sigma_F^2$ are their variances. The covariance of $A$ and $F$ is $\sigma_{AF}$. $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are constants used to maintain stability, where $L$ is the dynamic range of pixel values, $k_1 = 0.01$, and $k_2 = 0.03$. $SSIM_{BF}$ is defined in the same way as $SSIM_{AF}$. $SSIM_{avg}$ is calculated as
$SSIM_{avg} = \frac{SSIM_{AF} + SSIM_{BF}}{2}.$
A smaller $SSIM$ value means better performance.
3. Sum Mutual Information (SMI)
The SMI [35] measures the image feature information, such as gradient and edge information, shared between the source images and the fused image. The SMI between $A$, $B$, and $F$ is defined as follows:
$SMI = MI_{AF} + MI_{BF},$
$MI_{AF} = \sum_{a,f} p_{AF}(a,f) \log \frac{p_{AF}(a,f)}{p_A(a)\, p_F(f)},$
where $a$ and $f$ are the normalized gray values of $A$ and $F$, respectively, $p_A(a)$ and $p_F(f)$ denote the normalized grayscale histogram probability distribution functions, and $p_{AF}(a,f)$ is the probability distribution function of the joint grayscale histogram of $A$ and $F$. A larger $SMI$ indicates that the information in the fused image is more similar to that of the source images.
4. Edge Retention (ER)
The ER metric, denoted $Q^{AB/F}$, evaluates the amount of edge information transferred from the source images to the fused image [36]. $Q^{AB/F}$ is defined as
$Q^{AB/F} = \frac{\sum_{i=1}^{R} \sum_{j=1}^{C} \left[ Q^{AF}(i,j)\, w_A(i,j) + Q^{BF}(i,j)\, w_B(i,j) \right]}{\sum_{i=1}^{R} \sum_{j=1}^{C} \left[ w_A(i,j) + w_B(i,j) \right]},$
where $w_A(i,j)$ is the weight of $Q^{AF}$ in $Q^{AB/F}$.
$Q^{AF}(i,j) = Q_g^{AF}(i,j)\, Q_\alpha^{AF}(i,j)$ is the edge information preservation value, with $Q_g^{AF}(i,j)$ and $Q_\alpha^{AF}(i,j)$ as the preservation of the edge intensity and of the edge gradient direction between $A$ and $F$, respectively. The calculation process is as follows.
First, the edge strengths $g_A$ and $g_F$ and the edge directions $\alpha_A$ and $\alpha_F$ of $A$ and $F$ are calculated as
$g_A(i,j) = \sqrt{s_A^x(i,j)^2 + s_A^y(i,j)^2}, \quad g_F(i,j) = \sqrt{s_F^x(i,j)^2 + s_F^y(i,j)^2},$
$\alpha_A(i,j) = \tan^{-1}\!\left( \frac{s_A^y(i,j)}{s_A^x(i,j)} \right), \quad \alpha_F(i,j) = \tan^{-1}\!\left( \frac{s_F^y(i,j)}{s_F^x(i,j)} \right),$
with $s_A^x(i,j)$ and $s_A^y(i,j)$ as the edge intensities of $A$ in the $x$ and $y$ directions, respectively; $s_F^x(i,j)$ and $s_F^y(i,j)$ are defined similarly.
Second, the relative edge intensity $g^{AF}(i,j)$, the relative edge direction $\alpha^{AF}(i,j)$, and the corresponding preservation values $Q_g^{AF}(i,j)$ and $Q_\alpha^{AF}(i,j)$ can be represented as
$g^{AF}(i,j) = \begin{cases} \dfrac{g_F(i,j)}{g_A(i,j)}, & \text{if } g_A(i,j) > g_F(i,j) \\ \dfrac{g_A(i,j)}{g_F(i,j)}, & \text{otherwise}, \end{cases}$
$\alpha^{AF}(i,j) = 1 - \frac{\left| \alpha_A(i,j) - \alpha_F(i,j) \right|}{\pi/2},$
$Q_g^{AF}(i,j) = \frac{\Gamma_g}{1 + \exp\!\left[ K_g \left( g^{AF}(i,j) - \sigma_g \right) \right]},$
$Q_\alpha^{AF}(i,j) = \frac{\Gamma_\alpha}{1 + \exp\!\left[ K_\alpha \left( \alpha^{AF}(i,j) - \sigma_\alpha \right) \right]},$
where $L$, $\Gamma_g$, $K_g$, $\sigma_g$, $\Gamma_\alpha$, $K_\alpha$, and $\sigma_\alpha$ are hyperparameters, set as $L = 1$, $\Gamma_g = 0.9879$, $K_g = 15$, $\sigma_g = 0.5$, $\Gamma_\alpha = 0.9994$, $K_\alpha = 15$, and $\sigma_\alpha = 0.8$ according to [36].
Finally, $w_A(i,j)$ is calculated as
$w_A(i,j) = \left[ g_A(i,j) \right]^L.$
$Q^{AB/F}$ describes the edge strength and orientation information obtained by the Sobel operator at each pixel of $A$, $B$, and $F$. The larger the value of $Q^{AB/F}$, the more edge information is transferred from the source images to the fused image. It is also used as a metric for evaluating image sharpness in this article.
5. Equivalent Number of Looks (ENL)
The ENL reflects the intensity of the noise contained in a SAR image. It is the square of the ratio of the mean to the standard deviation of the SAR magnitude image; the ENL of image $F$ is defined as
$ENL_F = \left( \frac{\mu}{\sigma} \right)^2,$
where $\mu$ and $\sigma$ are the mean and standard deviation of image $F$. The smaller the ENL, the less noise the image contains and the better the image quality.
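For the histogram-based metrics, a minimal NumPy sketch of CE/ACE, MI/SMI, and ENL is given below (SSIM and $Q^{AB/F}$ are omitted because standard implementations exist, e.g., scikit-image for SSIM). Images are assumed to be quantized to 8-bit gray values, and the small epsilon guard and the base-2 logarithm for MI are implementation assumptions rather than choices stated in the text.

```python
import numpy as np

def _gray_hist(img, bins=256, eps=1e-12):
    """Normalized gray-value histogram of an 8-bit image."""
    h = np.histogram(img, bins=bins, range=(0, 256))[0].astype(np.float64) + eps
    return h / h.sum()

def ace(A, B, F):
    """All cross-entropy: average of CE(A, F) and CE(B, F)."""
    def ce(X, Y):
        pX, pY = _gray_hist(X), _gray_hist(Y)
        return np.sum(pX * np.log2(pX / pY))
    return 0.5 * (ce(A, F) + ce(B, F))

def smi(A, B, F, bins=256, eps=1e-12):
    """Sum mutual information: MI(A, F) + MI(B, F) from joint gray histograms."""
    def mi(X, Y):
        joint = np.histogram2d(X.ravel(), Y.ravel(), bins=bins,
                               range=[[0, 256], [0, 256]])[0] + eps
        p_xy = joint / joint.sum()
        p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X
        p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y
        return np.sum(p_xy * np.log2(p_xy / (p_x * p_y)))
    return mi(A, F) + mi(B, F)

def enl(F):
    """Equivalent number of looks: squared mean-to-standard-deviation ratio."""
    F = F.astype(np.float64)
    return (F.mean() / F.std()) ** 2
```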

3.2. Parameter Setting

In the proposed method, the parameters for calculating the SML are set as $step = 15$ and $N = M = 30$. For the guided filter, the radius of the window $\omega_k$ is $r = 8$, and the regularization parameter is $\varepsilon = 0.4^2$. For the NSST, the fusion rule for the low-frequency coefficients is pixel-by-pixel averaging, which is the only difference from [23]; the high-frequency coefficients are decomposed into four layers with eight directions each, and the fusion rule is taking the largest SML pixel by pixel. The processing of the PCNN is consistent with [32]. However, the computation of the PCNN is time consuming. To conduct an efficient and fair comparative experiment, only the two images shown in Figure 13 are used as source images to test the performance of the proposed and comparative methods.
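A hypothetical usage of the sketches above with the Section 3.2 settings on the two Figure 13 layers; the file names are placeholders, since the real-measured data are not publicly distributed.

```python
import numpy as np

# Placeholder file names; the actual CSAR magnitude images are not public.
img_low = np.load('csar_ref_minus1p6m.npy')    # reference height -1.6 m
img_high = np.load('csar_ref_plus1p4m.npy')    # reference height  1.4 m

fused = fuse_multilayer([img_low, img_high], step=15, win=30, r=8, eps=0.4 ** 2)
```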

3.3. Experimental Results and Discussion

The experimental results are shown in Figure 15; they differ in the details. For a clear comparison, we zoomed in on the same regions, labeled with red rectangles in Figure 15a, as shown in Figure 16, where Figure 16a–d correspond to the methods in Figure 15 and the images in each group are the zoomed-in results of Regions 1–3 from top to bottom, respectively. The results of the AG-based method, NSST, and PCNN are blurred near the FDBs because defocused regions are fused in. In contrast, the proposed method distinguishes the focused and defocused regions correctly and suppresses the energy of the defocused regions. With the help of the guided filter, the fused result of the proposed method is continuous and smooth near the FDBs, which is more compatible with the human visual system.
The results of the objective evaluation metrics are listed in Table 2, where the best results are shown in bold. $Q^{AB/F}$ describes the edge strength and orientation information. The AG-based method distinguishes the focused and defocused regions by analyzing the AG value of each pixel, so it preserves edge strength and orientation information better than the proposed method. Meanwhile, a small $Q^{AB/F}$ value indicates that the grayscale values of the image change smoothly, i.e., the image is relatively blurry and of poor quality; therefore, the NSST and PCNN perform worse than the other methods. In the subsequent discussion, we focus on comparing the proposed method with the AG-based method.
$ACE$ measures the similarity between the probability distributions of grayscale values in two images, $SMI$ describes the total amount of information the fused image obtains from the source images, and $SSIM$ measures the degree of structural similarity between the fused image and the source images. In summary, these metrics evaluate the similarity between images from different aspects and are used together to comprehensively measure the performance of the proposed method. In image fusion, the more defocused regions are fused into the result, the higher the similarity between the source images and the result, and the worse the fusion. Therefore, a large $ACE$ value and small $SMI$ and $SSIM$ values indicate that a fusion method performs well. Combining the evaluation results of $ACE$, $Q^{AB/F}$, and $SMI$, the proposed method is better than the AG-based method.
$ENL$ measures the noise intensity of a single SAR image. As listed in Table 2, the proposed method achieves the smallest $ENL$ value, indicating that it has the best noise suppression effect. In terms of computational time, the AG-based method, NSST, and PCNN take about 13, 34, and 180 times as long as the proposed method, respectively, so the proposed method has the highest computational efficiency of the four. In summary, the objective evaluation metrics show that the proposed method has the best performance.
For a more intuitive comparison, zoomed-in results of the marked regions are shown in Figure 16. The proposed method accurately recognizes and extracts the focused regions of the source images, and the outline of each object in the fused image is clear and readable. The other methods yield poorly readable results because they blend in the defocused regions, resulting in blurring around the objects.
Considering the results of the subjective and objective evaluations, the proposed method performs best. It distinguishes the focused and defocused regions correctly and produces a highly readable all-in-focus image.
Finally, the 24 full-aperture images with different reference heights were processed using the proposed method, and the resulting all-in-focus image is shown in Figure 17. To assess the quality of the imaging results, Figure 18 shows image slices of the marked target, which is best focused when the imaging reference height is 1.4 m. Subjectively, the imaging quality of the fused result is similar to that of Figure 18b. For an objective analysis, the X and Y slices of the marked target in these four images are compared in Figure 19. The spike of the blue slice in Figure 19b is the peak energy of the defocused region. It is clear that the best-focused result is fused into the fusion image, which fully verifies the correctness and effectiveness of the proposed method.
However, when the energy of a defocused region is similar to that of the focused regions, the defocused region will be fused into the final result. This raises the question of how to set the parameters to achieve better fusion results, which we will investigate in future work.

4. Conclusions

Multi-focus image fusion provides a new approach to the problem of CSAR imaging over areas with terrain undulation. This article focused on finding an image fusion method suitable for CSAR images. The proposed method is a spatial domain method based on the SML, used as the focus measure, and the guided filter, used to optimize the decision maps. An all-in-focus CSAR image of an ROI with terrain undulation can be obtained by the proposed method. A comparative experiment was conducted with the AG-based method, NSST, and PCNN in addition to the proposed method. The subjective and objective evaluations illustrated the superior performance of the proposed method compared with the other methods. Furthermore, the processing result of the real-measured data verified the correctness and effectiveness of the proposed method. However, it should be noted that the settings of the reference heights and fusion parameters directly affect the quality of image fusion, so appropriate parameter selection deserves further research.

Author Contributions

All the authors made significant contributions to this work. J.L. and L.C. proposed the theoretical framework; J.L., L.C., D.A., D.F. and Y.S. designed the experiments; L.C., D.A., D.F. and Y.S. collected the real-measured data; J.L. and L.C. carried out the experiments; validation, D.A., D.F. and Y.S.; formal analysis, J.L. and L.C.; writing—original draft preparation, J.L.; writing—review and editing, J.L., L.C., D.A., D.F. and Y.S.; funding acquisition, L.C., D.A., D.F. and Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grants 62101566, 62271492, and 62101562; and in part by the Natural Science Foundation for Distinguished Young Scholars of Hunan Province under Grant 2022JJ10062.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy restrictions.

Acknowledgments

We thank the authors of all references for their theoretical contributions to the work of this article, as well as the editors and reviewers for their helpful suggestions and comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chen, J.W.; An, D.X.; Wang, W.; Luo, Y.X.; Zhou, Z.M. Extended Polar Format Algorithm for Large-Scene High-Resolution WAS-SAR Imaging. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2021, 14, 5326–5338. [Google Scholar] [CrossRef]
  2. Zhang, H.; Lin, Y.; Teng, F.; Feng, S.S.; Hong, W. Holographic SAR Volumetric Imaging Strategy for 3-D Imaging with Single-Pass Circular InSAR Data. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–16. [Google Scholar] [CrossRef]
  3. Ponce, O.; Prats-Iraola, P.; Pinheiro, M.; Rodriguez-Cassola, M.; Scheiber, R.; Reigber, A.; Moreira, A. Fully Polarimetric High-Resolution 3-D Imaging with Circular SAR at L-Band. IEEE Trans. Geosci. Remote Sens. 2014, 52, 3074–3090. [Google Scholar]
  4. Chen, L.P.; An, D.X.; Huang, X.T. A Backprojection Based Imaging for Circular Synthetic Aperture Radar. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2017, 10, 3547–3555. [Google Scholar] [CrossRef]
  5. Fan, B.; Qin, Y.L.; You, P.; Wang, H.Q. An Improved PFA with Aperture Accommodation for Widefield Spotlight SAR Imaging. IEEE Geosci. Remote Sens. Lett. 2015, 12, 3–7. [Google Scholar] [CrossRef]
  6. Yang, Y.; Tong, S.; Huang, S.Y.; Lin, P. Multifocus Image Fusion Based on NSCT and Focused Area Detection. IEEE Sens. J. 2015, 15, 2824–2838. [Google Scholar]
  7. Karacan, L. Multi-image transformer for multi-focus image fusion. Signal Process. Image Commun. 2023, 119, 117058. [Google Scholar] [CrossRef]
  8. Li, S.T.; Kang, X.D.; Hu, J.W. Image fusion with guided filtering. IEEE Trans. Image Process. 2013, 22, 2864–2875. [Google Scholar]
  9. Liu, Y.; Liu, S.P.; Wang, Z.F. Multi-focus image fusion with dense SIFT. Inf. Fusion 2015, 23, 139–155. [Google Scholar] [CrossRef]
  10. Bouzos, O.; Andreadis, I.; Mitianoudis, N. Conditional random field model for robust multi-focus image fusion. IEEE Trans. Image Process. 2019, 28, 5636–5648. [Google Scholar] [CrossRef]
  11. Chen, Y.B.; Guan, J.W.; Cham, W.K. Robust multi-focus image fusion using edge model and multi-matting. IEEE Trans. Image Process. 2018, 27, 1526–1541. [Google Scholar] [CrossRef]
  12. Xiao, B.; Ou, G.; Tang, H.; Bi, X.L.; Li, W.S. Multi-focus image fusion by hessian matrix based decomposition. IEEE Trans. Multimed. 2020, 22, 285–297. [Google Scholar] [CrossRef]
  13. Zhang, L.X.; Zeng, G.P.; Wei, J.J. Adaptive region-segmentation multi-focus image fusion based on differential evolution. Int. J. Pattern Recognit. Artif. Intell. 2019, 33, 1954010. [Google Scholar] [CrossRef]
  14. Li, M.; Cai, W.; Tan, Z. A region-based multi-sensor image fusion scheme using pulse-coupled neural network. Pattern Recognit. Lett. 2006, 27, 1948–1956. [Google Scholar] [CrossRef]
  15. Ranchin, T.; Wald, L. The wavelet transform for the analysis of remotely sensed images. Int. J. Remote Sens. 1993, 14, 615–619. [Google Scholar] [CrossRef]
  16. Shi, Q.; Li, J.W.; Yang, W.; Zeng, H.C.; Zhang, H.J. Multi-aspect SAR image fusion method based on wavelet transform. J. Beijing Univ. Aeronaut. Astronaut. 2017, 43, 2135–2142. [Google Scholar]
  17. Cunha, A.; Zhou, J.P.; Do, M.N. The Nonsubsampled Contourlet Transform: Theory, Design, and Applications. IEEE Trans. Image Process. 2006, 15, 3089–3101. [Google Scholar] [CrossRef] [PubMed]
  18. Li, X.S.; Zhou, F.Q.; Tan, H.S.; Chen, Y.Z.; Zou, W.X. Multi-focus image fusion based on nonsubsampled contourlet transform and residual removal. Signal Process. 2021, 184, 108062. [Google Scholar] [CrossRef]
  19. Easley, G.; Labate, D.; Lim, W.Q. Sparse directional image representations using the discrete shearlet transform. Appl. Comput. Harmon. Anal. 2008, 25, 25–46. [Google Scholar] [CrossRef]
  20. Yang, B.; Li, S.T. Multifocus image fusion and restoration with sparse representation. IEEE Trans. Instrum. Meas. 2010, 59, 884–892. [Google Scholar] [CrossRef]
  21. Zhou, Z.Q.; Li, S.; Wang, B. Multi-scale weighted gradient-based fusion for multi-focus images. Inf. Fusion 2014, 20, 60–72. [Google Scholar] [CrossRef]
  22. Liu, Z.D.; Chai, Y.; Yin, H.P.; Zhou, J.Y.; Zhu, Z.Q. A novel multi-focus image fusion approach based on image decomposition. Inf. Fusion 2017, 35, 102–116. [Google Scholar] [CrossRef]
  23. An, D.X.; Huang, J.N.; Chen, L.P.; Feng, D.; Zhou, Z.M. A NSST-Based Fusion Method for Airborne Dual-Frequency, High-Spatial-Resolution SAR Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 4362–4370. [Google Scholar]
  24. Liu, Y.; Chen, X.; Peng, H.; Wang, Z.F. Multi-focus image fusion with a deep convolutional neural network. Inf. Fusion 2017, 36, 191–207. [Google Scholar] [CrossRef]
  25. Du, C.B.; Gao, S.S. Image segmentation-based multi-focus image fusion through multi-scale convolutional neural network. IEEE Access 2017, 5, 15750–15761. [Google Scholar] [CrossRef]
  26. Ma, B.; Zhu, Y.; Yin, X.; Ban, X.J.; Huang, H.Y.; Mukeshimana, M. Sesf-fuse: An unsupervised deep model for multi-focus image fusion. Neural Comput. Appl. 2021, 33, 5793–5804. [Google Scholar] [CrossRef]
  27. Xiao, B.; Xu, B.C.; Bi, X.L.; Li, W.S. Global-feature encoding U-Net (GEU-Net) for multi-focus image fusion. IEEE Trans. Image Process. 2021, 30, 163–175. [Google Scholar] [CrossRef]
  28. Ma, B.Y.; Yin, X.; Wu, D.; Shen, H.K.; Ban, X.J.; Wang, Y. End-to-end learning for simultaneously generating decision map and multi-focus image fusion result. Neurocomputing 2022, 470, 204–216. [Google Scholar] [CrossRef]
  29. He, K.M.; Sun, J.; Tang, X.O. Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1397–1409. [Google Scholar] [CrossRef]
  30. Sun, X.L.; Wang, Z.Y.; Fu, Y.Q.; Yi, Y.; He, X.H. Fast image fusion based on sum of modified Laplacian. Comput. Eng. Appl. 2015, 5, 193–197. [Google Scholar]
  31. Li, J.X.; Chen, L.P.; An, D.X.; Feng, D.; Song, Y.P. CSAR Multilayer Focusing Imaging Method. IEEE Geosci. Remote Sens. Lett. 2024, 21, 1–5. [Google Scholar] [CrossRef]
  32. Li, J.F.; Galdran, A. Multi-focus Microscopic Image Fusion Algorithm Based on Sparse Representation and Pulse Coupled Neural Network. Acta Microsc. 2020, 29, 1816–1823. [Google Scholar]
  33. Piella, G.; Heijmans, H. A new quality metric for image fusion. In Proceedings of the 2003 International Conference on Image Processing (Cat. No.03CH37429), Barcelona, Spain, 14–17 September 2003. [Google Scholar]
  34. Wang, Z.; Bovik, A.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
  35. Qu, G.H.; Zhang, D.L.; Yan, P.F. Information measure for performance of image fusion. Electron. Lett. 2002, 38, 313–315. [Google Scholar] [CrossRef]
  36. Xydeas, C.S.; Petrovic, V. Objective image fusion performance measure. Electron. Lett. 2000, 36, 308–309. [Google Scholar] [CrossRef]
Figure 1. Geometric optics model. Object A focuses on the film plane as a point a , while b and c are blurred circles of B and C . b and c are the position of film plane when B and C are focused as points.
Figure 2. Energy mapping modeling in CSAR imaging. Δ h 1 and Δ h 2 are the heights of E and F , respectively. The imaging result of D is point d , while e and f are defocused rings of E and F . e and f are the position of the image plane when the imaging results of E and F are focused as points.
Figure 3. The reference heights of (a,b) are 0 m and 4.5 m, respectively. The real-measured data are collected by setting a trajectory over a road intersection with buildings and factories of different heights. There are streetlamps higher than the ground along both sides of roads. The focus quality of regions marked with red rectangles, whose zoomed-in results are shown in Figure 4, is especially different.
Figure 4. Zoomed-in results of the regions marked with red rectangles in Figure 3a. (ad) correspond to rectangles 1–4, respectively. The left results in the four groups of images correspond to Figure 3a, while the others correspond to Figure 3b.
Figure 5. (a) The high-frequency coefficients in the vertical direction produced by one-layer NSST in eight directions of Figure 3a. (b) The high-frequency coefficients of Figure 3b.
Figure 6. (a) An optical image with the foreground focused. (c) An optical image with the background focused. (b,d) are high-frequency coefficients in the vertical direction for (a,c), respectively, produced by one-layer NSST in eight directions.
Figure 7. Initial decision maps of source images in Figure 3. (a,b) correspond to that of Figure 3.
Figure 8. The real-measured data processing diagram. The detailed process of the proposed image fusion method is shown in Figure 9.
Figure 9. Schematic diagram of the proposed method, consisting of multi-layer imaging and multi-focus image fusion. The box labeled NSML is the SML calculation of the source images, while the boxes labeled MAX, GF, and IF represent the focus measure, the guided filter, and image fusion, respectively. $I_1, \ldots, I_N$ are the source images of different reference heights, and $SML_1, \ldots, SML_N$ are their SMLs. IDM, FDM, and $I_F$ are the initial decision maps, the final decision maps, and the fused image, respectively.
Figure 10. (a) Schematic of multi-layer imaging, with $H$ as the range of reference heights and $\Delta H$ as the interval; the regions marked with red boxes are focused. (b) The CSAR imaging geometry.
Figure 11. The circular trajectory along which a data set was collected over an island whose optical image is shown in Figure 12. $V$ represents the velocity of the platform. The blue triangle represents the radar beam width.
Figure 12. The optical image of the ROI. There are marinas and buildings of different heights. Moreover, there is a road with streetlamps to the north of the island and a square to the northeast.
Figure 13. The reference heights of (a,b) are −1.6 m and 1.4 m, respectively. The focused quality of imaging results of the same objects is different in (a,b), especially in the regions marked with a blue rectangle. The zoomed-in results of the regions marked with red rectangles labeled 1–3 are shown in Figure 14.
Figure 14. Zoomed-in results of the regions marked with red rectangles in Figure 13a. (ac) correspond to Regions 1–3, respectively. The left images are optical images of Regions 1–3, which are a square with several objects of different heights, a marina lower than the ground, and a tower, respectively. The middle results in the three groups of images correspond to Figure 13a, while the others correspond to (b).
Figure 15. Fusion results of different methods. (a) Proposed method. (b) AG-based method. (c) NSST. (d) PCNN. The zoomed-in results of regions marked with rectangles are shown in Figure 16.
Figure 16. Zoomed-in results of the regions marked with red rectangles in Figure 15a. The images from top to bottom correspond to Regions 1–3. Vertical images are a group and (ad) correspond to that of Figure 15.
Figure 17. The fused result of 24 source images of different reference heights was processed by the proposed method. The target marked with a red circle is a point-like one that is used to analyze the performance of the proposed method.
Figure 18. The image slices of the marked target in multi-layer images. (a) The imaging reference height is 0 m. (b) The imaging reference height is 1.4 m. (c) The imaging reference height is 2.8 m. (d) The fusion result.
Figure 19. The X and Y slices of the marked target. (a) The X slice. (b) The Y slice.
Table 1. Basic system parameters.

Parameter | Value
Carrier frequency $f_c$ | L band
Velocity of the platform $v_a$ | 52 m/s
Flight radius of the platform $R$ | 2000 m
Altitude of the platform $H$ | 2000 m
Table 2. Objective evaluation metrics of different methods.

Method | $Q^{AB/F}$ | ACE | SMI | SSIM | ENL | Time (s)
Our | 0.4979 | 0.0064 | 4.8886 | 0.8748 | 2.7061 | 51
AG | 0.5243 | 0.0050 | 5.5499 | 0.8894 | 2.7686 | 647
NSST | 0.3652 | - | - | - | - | 1726
PCNN | 0.3291 | - | - | - | - | 9186

Note: The best results are shown in bold.
