1. Introduction
Crop area classification is essential for yield monitoring and food security [1]. Because the characteristics of agricultural fields can change rapidly due to land use and natural events, cloud-penetrating Synthetic Aperture Radar (SAR) and high-resolution optical data have become very useful, especially when the full growth cycle of crops must be captured.
Optical images from current and past satellite missions have been widely used for monitoring crop properties [2,3,4,5,6]. The Sentinel-2 and Landsat missions have made valuable contributions to this end [7,8,9,10]. Unfortunately, optical sensors are affected by weather and sunlight conditions, which significantly limits their use when frequent monitoring is required. Commonly distributed satellite-derived indices such as the Normalized Difference Vegetation Index (NDVI) have too coarse a resolution for monitoring agricultural fields of relatively small size (~1–2 ha). Other sources of crop maps include the Cropland Data Layer (CDL) [11] and the European Space Agency (ESA) WorldCover [12], which are updated yearly with a delay of at least one year.
SAR data, such as those collected by the C-band Sentinel-1 and the L-band ALOS-2 satellites, can map the Earth regardless of day, night or cloud presence, with a frequent revisit time (i.e., 12 to 14 days) and high spatial resolution (up to 3 m for ALOS-2). Different polarizations are important for studying how the radar signal interacts with crop structure and land cover classes. SAR polarimetry can reveal how different scattering mechanisms are affected by diverse crop types and vegetation covers [13,14]. Studies on the integration of SAR and optical data have also been carried out [15,16]. The upcoming NASA ISRO SAR (NISAR) mission will collect L-band and S-band data with a revisit time of 12 days (6 days including ascending and descending passes) [17]. NISAR science data will have high spatial resolution (up to 6 m) and will be freely distributed to address various scientific needs. The L-band NISAR cropland products will be released at 100 m resolution, targeting the overall accuracy requirement of 80% on a seasonal interval.
Previous studies have presented results obtained by applying the NISAR science algorithm for crop area classification [18,19,20,21]. The algorithm is based on the Coefficient of Variation (CV) computed over time from a dense time-series of data. These studies emphasized the effectiveness of cross-polarization in distinguishing between crop and non-crop areas. S. Kraatz et al. used C-band Sentinel-1 and L-band ALOS-2 data to classify crop/non-crop areas over an agricultural region in Manitoba (Canada) [20]. The reported overall accuracy values were 83.5% and 73.2% for Sentinel-1 and ALOS-2 data, respectively, showing the superior performance of C-band. Using Sentinel-1 data acquired over the contiguous United States, the overall accuracy was greater than 80% [18]. L-band Uninhabited Aerial Vehicle SAR (UAVSAR) airborne data were used to simulate NISAR products by modifying the spatial resolution and frequency band [19]. The accuracy values exceeded 80%, except for the 10 m simulated data. These past studies raise the question of what the optimal frequency, spatial resolution, and polarization are, especially in the context of preparing an operational algorithm for the upcoming NISAR mission.
Another important element in achieving operational quality is the source of the ground truth data used for calibrating and validating the crop area classification algorithm. Although land cover products such as the Cropland Data Layer (CDL) are widely used as reference datasets for calibrating and evaluating the algorithm, they are released with a delay of at least one year. With this latency, such data are not useful for operational validation at the seasonal interval of the NISAR products. In this study, we explored whether alternative contemporaneous optical data could provide a reliable ground reference.
The objectives of this study are: (1) to continue verifying the performance of the NISAR science algorithm for classifying crop and non-crop areas at a new location; (2) to compare the algorithm’s performance between L-band and C-band inputs and analyze how the results depend on frequency; (3) to introduce an operationally feasible ground reference for training and validating the algorithm instead of CDL or other land cover products; (4) to evaluate the classification results for different polarizations (co-pol vs. cross-pol) and spatial resolutions (10 m vs. 100 m); (5) to investigate the capability of a high-resolution optical dataset to classify crop areas as an alternative to the radar data input. These investigations will support the development of the NISAR cropland products. The novelty of this study lies in: (1) the provision of contemporaneous ground truth prepared by analyzing Planet optical images for training and validation; (2) the finding that only L-band HV at 100 m resolution satisfied the accuracy goal; (3) the documentation that optical NDVI is not helpful as a ground truth dataset for crop area classification.
Section 2 describes the datasets used for the analysis and the methodology followed for achieving the objectives of the study.
Section 3 reports and discusses the results.
3. Results and Discussion
In this section, the results are separately reported for the three input datasets: L-band ALOS-2 PALSAR-2 NISAR simulated (3.1), C-band Sentinel-1A (3.2), and NDVI from PlanetScope (3.3) data.
3.1. L-Band ALOS-2 PALSAR-2 NISAR Simulated Data (L2 GCOV)
The algorithm was applied to CV computed from the L-band NISAR L2 GCOV products for two polarizations (HH, HV) and resolutions (10 m, 100 m).
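As a minimal sketch of the quantity the algorithm works on, the temporal CV of a pixel can be computed as the standard deviation of its backscatter time-series divided by its mean. The function name, the use of linear-power backscatter values, and the choice of the population standard deviation are assumptions made for illustration; the definition used for the NISAR products may differ in detail. The sample values are hypothetical.

```python
from statistics import fmean, pstdev


def temporal_cv(series):
    """Coefficient of variation (std / mean) of one pixel's time-series.

    `series` holds backscatter values in linear power units (assumed),
    one value per acquisition date.
    """
    mean = fmean(series)
    if mean == 0:
        raise ValueError("mean backscatter is zero; CV is undefined")
    return pstdev(series) / mean


# Hypothetical time-series, for illustration only (not data from the study).
forest_like = [0.050, 0.052, 0.049, 0.051, 0.050]  # stable over the season
crop_like = [0.010, 0.020, 0.045, 0.060, 0.015]    # changes with growth and harvest

cv_forest = temporal_cv(forest_like)
cv_crop = temporal_cv(crop_like)
```

In this toy case the stable series yields a CV far below that of the strongly varying series, which is the contrast the classification relies on.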
Figure 5 and Figure 6 show pixel-based CV values over the site of interest for HH and HV at 10 m (Figure 5) and at 100 m (Figure 6).
CV from 10 m resolution images is affected by speckle noise (Figure 5), which obscures details. The resulting high variability is not suitable for crop area classification. As the resolution becomes coarser, speckle effects are mitigated, allowing a clearer delineation of agricultural field boundaries and other land cover classes such as forests. The mean CV values averaged over the entire image were 0.51 and 0.44 for HH and HV at 10 m resolution, respectively. At 100 m, the mean values were 0.28 and 0.24 for HH and HV, respectively. Additionally, non-crop pixels (such as forests) are characterized by low CV values (typically below 0.2), which is expected because they are less likely than cropland to vary significantly over time. The ROC curve approach was then applied to the CV images to derive the optimal CV threshold values using the training set of 39 crop/non-crop polygons. Figure 7 shows the ROC curves obtained for each polarization and resolution. The red points on the curves correspond to the optimal CV thresholds reported in Table 4. In the 10 m resolution case, the curve is very close to the 1:1 line, indicating that the classification results are very poor (Figure 7). At 100 m, threshold values of 0.14 and 0.15 were derived for HH and HV, respectively.
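The threshold selection can be sketched as follows. This section does not state how the optimal point on the ROC curve is chosen, so the sketch assumes Youden's J statistic (the threshold maximizing TPR minus FPR) and assumes that a pixel is labeled crop when its CV is at or above the threshold. Function names and training values are hypothetical.

```python
def roc_points(cv_values, is_crop, thresholds):
    """Return (threshold, FPR, TPR) for each candidate CV threshold.

    A pixel is labeled crop when its CV is >= threshold (assumed rule).
    """
    n_crop = sum(is_crop)
    n_noncrop = len(is_crop) - n_crop
    points = []
    for thr in thresholds:
        tp = sum(1 for cv, crop in zip(cv_values, is_crop) if crop and cv >= thr)
        fp = sum(1 for cv, crop in zip(cv_values, is_crop) if not crop and cv >= thr)
        points.append((thr, fp / n_noncrop, tp / n_crop))
    return points


def optimal_threshold(points):
    """Pick the threshold maximizing Youden's J = TPR - FPR (assumed criterion)."""
    return max(points, key=lambda p: p[2] - p[1])[0]


# Hypothetical training pixels, for illustration only.
cv_values = [0.05, 0.08, 0.10, 0.12, 0.18, 0.22, 0.30, 0.35]
is_crop = [False, False, False, True, True, True, True, True]
thresholds = [i / 100 for i in range(0, 51)]

best = optimal_threshold(roc_points(cv_values, is_crop, thresholds))
```

Under this criterion, a ROC curve lying close to the 1:1 line has TPR roughly equal to FPR at every threshold, so no threshold separates the two classes well.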
Figure 8 shows the 100 m binary crop/non-crop maps obtained from the HH and HV CV inputs with the derived thresholds. HV polarization appears to distinguish crop from non-crop areas better than HH does. For example, compared with HH, most forest pixels (shown in purple at the center right) are classified as non-crop with HV, likely because cross-polarization is more affected by vegetation than co-polarization.
The set of 39 crop/non-crop validation polygons was used to build the confusion matrix defined in Table 3 and to assess the overall accuracy of the classification according to Equation (7). The results are reported in Table 5. HH 10 m and HV 10 m do not meet the 80% NISAR science accuracy requirement, likely because of speckle noise. HH 100 m and HV 100 m yield higher overall accuracy values; still, only HV exceeds 80%. These results differ from the findings reported in [19], where L-band airborne UAVSAR (Uninhabited Aerial Vehicle Synthetic Aperture Radar) data (10 m) and simulated NISAR products generated from the UAVSAR data (10 m, 30 m, and 100 m) achieved 80% accuracy in all cases except for the 10 m NISAR simulated data. A possible explanation is that the UAVSAR and simulated NISAR data have a lower noise level than the PALSAR-2 data. In summary, 100 m resolution and HV input are necessary to meet the NISAR science requirement.
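The accuracy assessment can be illustrated with a short sketch. Equation (7) is not reproduced in this section, so the code assumes the standard definition of overall accuracy, (TP + TN) / (TP + TN + FP + FN), with crop as the positive class and a pixel predicted as crop when its CV is at or above the threshold. All names and values are hypothetical.

```python
def confusion_counts(cv_values, is_crop, threshold):
    """Count TP, FP, TN, FN with crop as the positive class.

    A pixel is predicted crop when its CV is >= threshold (assumed rule).
    """
    tp = fp = tn = fn = 0
    for cv, crop in zip(cv_values, is_crop):
        predicted_crop = cv >= threshold
        if predicted_crop and crop:
            tp += 1
        elif predicted_crop and not crop:
            fp += 1
        elif not predicted_crop and not crop:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def overall_accuracy(tp, fp, tn, fn):
    """Fraction of correctly classified pixels (standard definition, assumed)."""
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical validation pixels, for illustration only.
cv_values = [0.05, 0.09, 0.16, 0.13, 0.21, 0.33, 0.40, 0.11, 0.27, 0.19]
is_crop = [False, False, False, True, True, True, True, True, True, True]

counts = confusion_counts(cv_values, is_crop, threshold=0.15)
accuracy = overall_accuracy(*counts)
meets_requirement = accuracy > 0.80  # 80% requirement quoted in the text
```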
3.2. C-Band Sentinel-1A Data
C-band performance was found to be worse than that of L-band, in contrast to the results obtained in [20] with C-band Sentinel-1B and L-band ALOS-2 inputs. The classification performance most likely depends on the site of interest, dominant crop types, and crop structures. The algorithm was applied to the CV computed from C-band Sentinel-1A images for two polarizations (VH, VV) and resolutions (10 m, 100 m).
Figure 9 and Figure 10 show pixel-based CV values over the site of interest for VH and VV at 10 m (Figure 9) and at 100 m (Figure 10).
The CV values averaged over the scene were 0.46, 0.46, 0.19, and 0.18 for VV 10 m, VH 10 m, VV 100 m, and VH 100 m, respectively. The VH 10 m and VV 10 m values were comparable with those obtained for the L-band 10 m images, while the C-band values at 100 m spatial resolution were smaller. The C-band 10 m images were affected by speckle noise, while the 100 m images improved the delineation of various features within the site of interest. L-band was found to discriminate those features better, likely because it penetrates the canopy layers more deeply and interacts with the soil, whereas C-band has weaker penetration. As also reported in [20], C-band data can reach saturation earlier in the crop season, especially when crop fields approach the mature stage, resulting in lower CV values during this period.
Classification results obtained with 10 m images were not reliable, since the CV threshold values were very low and caused entire images to be classified as crop (Table 6). This is also confirmed by the ROC curves in Figure 11 being very close to the 1:1 line. For VV 100 m and VH 100 m, the threshold values are 0.15 and 0.12, respectively. Unlike for L-band, the optimal CV threshold for C-band increases when the resolution becomes coarser. Also, for VH 100 m the threshold value is 0.12, while for L-band HV 100 m the value is 0.15: this result reflects the generally larger CV values at L-band.
Co-polarization VV delineates non-crop pixels related to forests better than VH does (Figure 12). The overall accuracy at 10 m resolution is poor, as explained earlier (Table 7). The overall accuracy of 73% exactly matches the percentage of pixels within the validation polygons that correspond to crop. As also reported in [19], in a situation where the CV threshold is 0 or 1 and all the pixels are classified as either crop or non-crop, the overall accuracy becomes equal to the crop/non-crop breakdown. Unlike for L-band, the results for VV 100 m were found to be slightly better than those for VH 100 m, which is attributed to VH saturation at C-band.
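The degenerate case described above can be checked with a small sketch: when the threshold pushes every pixel into one class, the overall accuracy collapses to the share of that class in the validation set. The decision rule (crop when CV is at or above the threshold) is an assumption, and the 100 synthetic pixels are hypothetical, chosen only to mirror the 73% crop share quoted in the text.

```python
def accuracy_at_threshold(cv_values, is_crop, threshold):
    """Overall accuracy when a pixel is predicted crop if CV >= threshold."""
    correct = sum((cv >= threshold) == crop for cv, crop in zip(cv_values, is_crop))
    return correct / len(is_crop)


# Hypothetical validation set: 73 crop and 27 non-crop pixels.
is_crop = [True] * 73 + [False] * 27
# Arbitrary positive CV values, all below 1 (illustration only).
cv_values = [0.02 + 0.004 * i for i in range(100)]

all_crop = accuracy_at_threshold(cv_values, is_crop, threshold=0.0)     # every pixel -> crop
all_noncrop = accuracy_at_threshold(cv_values, is_crop, threshold=1.0)  # every pixel -> non-crop
```

Here `all_crop` equals the crop fraction and `all_noncrop` equals the non-crop fraction, regardless of the CV values themselves, which is why such an accuracy figure carries no information about the classifier.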
3.3. NDVI from PlanetScope Data
In this section, optical NDVI was used to derive CV and a crop/non-crop classification with two motivations: (1) to determine whether optical or radar data classify better; (2) to evaluate whether an NDVI-based crop/non-crop map can be generated automatically and accurately enough to serve in lieu of ground truth. The algorithm was applied to the NDVI CV computed from the PlanetScope images after downsampling to 10 m and 100 m. NDVI was linearly interpolated over time to the acquisition dates of ALOS-2, and non-significant pixels (i.e., clouds, shadow, light haze, heavy haze) were masked out.
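The NDVI preparation step can be sketched for a single pixel as follows. The sketch uses the standard NDVI definition, (NIR - Red) / (NIR + Red), and plain linear interpolation between the two nearest cloud-free optical dates. The function names, the day-of-year time axis, and the reflectance values are hypothetical, and the masking of cloudy or hazy pixels is assumed to have been done beforehand.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def interpolate_linear(days, values, target_day):
    """Linearly interpolate a time-series at `target_day`.

    `days` must be sorted in increasing order; targets outside the observed
    range raise an error rather than being extrapolated (a choice made here).
    """
    if not days[0] <= target_day <= days[-1]:
        raise ValueError("target day outside the observed range")
    for (d0, v0), (d1, v1) in zip(zip(days, values), zip(days[1:], values[1:])):
        if d0 <= target_day <= d1:
            weight = (target_day - d0) / (d1 - d0)
            return v0 + weight * (v1 - v0)


# Hypothetical cloud-free optical acquisitions: (day of year, NIR, red).
optical = [(150, 0.30, 0.10), (160, 0.40, 0.08), (180, 0.50, 0.05)]
days = [d for d, _, _ in optical]
ndvi_series = [ndvi(nir, red) for _, nir, red in optical]

# Hypothetical SAR acquisition date falling between two optical dates.
ndvi_at_sar_date = interpolate_linear(days, ndvi_series, target_day=170)
```

Interpolating to the SAR dates gives the optical and radar CV the same temporal sampling, so differences in CV are less likely to come from the acquisition schedule alone.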
CV values in Figure 13 are lower than those derived from the L- and C-band radar data. When vegetation approaches a mature growth stage, NDVI is expected to saturate, resulting in little or no change over time. Consequently, the NDVI CV cannot reach high values for crop fields. This is in contrast to the radar observations, in which the scattered power can keep increasing with the growing canopy and structures even after the NDVI has saturated. The saturation of NDVI affects the overall accuracy, since many pixels were classified as False Positive or False Negative.
The ROC curves at 10 m and 100 m are very similar to each other (Figure 14), suggesting that resolution is not a key factor for the NDVI results. This finding differs from what was observed for L-band and C-band data, where spatial averaging reduced speckle noise.
As shown in Figure 15, many cropland pixels are misclassified as non-crop, due to the low threshold values (Table 8) and the inability of NDVI CV to produce high values for crop fields when crops reach the mature stage. Results in Table 9 show that the NDVI-based classification does not meet the 80% NISAR science requirement. This is reflected in the high percentage of crop pixels incorrectly classified as non-crop (False Negative), which was equal to 22% and 34% for 10 m and 100 m, respectively. The percentage of non-crop pixels incorrectly classified as crop (False Positive) was also not negligible, being equal to 10% for 10 m and 8% for 100 m. The results also suggest that the NDVI-based crop/non-crop map cannot be used as ground truth, since the overall accuracy of the classification was not high enough (68% and 58% for 10 m and 100 m, respectively).
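The reported error percentages are consistent with the reported overall accuracies if both error rates are read as fractions of all validation pixels (an assumption, since the denominator is not stated in this section). Under that reading, the overall accuracy (OA) is what remains after subtracting the two error rates:

```latex
\begin{align}
\mathrm{OA}_{10\,\mathrm{m}}  &= 100\% - \mathrm{FN} - \mathrm{FP} = 100\% - 22\% - 10\% = 68\% \\
\mathrm{OA}_{100\,\mathrm{m}} &= 100\% - \mathrm{FN} - \mathrm{FP} = 100\% - 34\% - 8\% = 58\%
\end{align}
```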
Table 8.
Optimal CV thresholds for NDVI.
Resolution | NDVI CV threshold |
---|---|
10 m | 0.07 |
100 m | 0.08 |
Figure 15.
NDVI-based crop/non-crop map.
Table 9.
Overall accuracy for NDVI crop map.
4. Conclusions
The algorithm that uses the Coefficient of Variation (CV) for classifying crop and non-crop areas was validated in preparation for the L-band NISAR crop area products. The NISAR science requirement for the cropland products specifies that the overall accuracy should exceed 80% on a seasonal time interval. The algorithm was applied to the time-series of L-band NISAR simulated products (L2 GCOV) generated from ALOS-2 PALSAR-2 images. Its performance was compared with that obtained from C-band Sentinel-1A data and from NDVI derived from high-resolution PlanetScope optical data. The classification results were analyzed for different polarizations (HH and HV for L-band; VH and VV for C-band) and spatial resolutions (10 m, 100 m).
The first novel aspect of this paper is that crop and non-crop polygons were manually delineated on a Planet image to train the classification threshold. The polygons were divided into training and validation sets, which were used to derive the optimal CV threshold with the ROC curve approach and to validate the classification accuracy, respectively. In comparison, previous studies [18,19,26] adopted the CDL as ground truth. A shortcoming of that approach is that the CDL is a yearly map released with a one-year delay. We also tested the Planet-based NDVI to compare its classification performance with that of the radar products and to explore its potential as ground truth. The results revealed that NDVI was not sufficiently effective, since the optical data suffered from saturation when the crops were mature. NDVI itself (not its CV) was not useful either, in that forests and crop fields were both classified as crop due to their high values.
Second, our analysis showed that L-band PALSAR-2 data at HV polarization and 100 m resolution was the only case in which the 80% NISAR science requirement was satisfied. The overall accuracy and optimal CV threshold were 86% and 0.15, respectively. In comparison, L-band HH did not meet this condition at either 10 m or 100 m resolution (overall accuracy of 60% and 76%, respectively), likely because cross-polarization varied more over time than co-polarization did. The speckle noise was too strong at 10 m for the images to be useful. The optimal CV threshold was found to depend on the spatial resolution, with values increasing as the resolution became finer.
Third, we found that a coarse resolution (i.e., 100 m) was necessary to meet the 80% requirement. In comparison, a past study [19] using L-band airborne UAVSAR data and NISAR products simulated from the UAVSAR data found that the UAVSAR data met the 80% accuracy requirement at 10 m resolution and that the simulated data were effective at 30 m and 100 m. The difference between our work and the previous one is attributed to the UAVSAR and simulated data having a lower noise level than the PALSAR-2 data. The PALSAR-2 data were strongly affected by speckle at 10 m but not at 100 m.
Fourth, the classification results for C-band Sentinel-1A at 10 m resolution were found to be unreliable because of the very low CV threshold values. In this case, pixels were almost entirely classified as crop. The 100 m products did not meet the 80% requirement. Unlike the previous study [20], we found that L-band performed better than C-band, suggesting that classification results likely depend on the site of interest, dominant crop types, and crop structures.
Lastly, to demonstrate that a reliable estimate of the classification accuracy can be achieved with the selected 39 validation polygons, we followed the approach proposed in [27] to compute the confidence interval for the estimated accuracy. Given the overall accuracy requirement of 80%, we derived, for each case investigated in the paper, the limits of the interval at the 95% confidence level. Using 39 validation polygons, the best result was obtained for L-band HV 100 m: our accuracy estimate (i.e., 86%) was bounded between 83% and 89%. It is worth noting that the lower limit for L-band HV 100 m (83%) was greater than the upper limit derived for each of the other analyzed cases. This confirms that the validation set was large enough to support our conclusion, i.e., L-band HV polarized data at 100 m resolution were necessary to satisfy the 80% requirement.
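To give a sense of how such an interval behaves, the sketch below computes a normal-approximation (Wald) confidence interval for a proportion. This is a stand-in, not necessarily the method of [27], which is not described in this section. The sample size is a placeholder and is not a value taken from the study.

```python
from math import sqrt


def wald_interval(accuracy, n_samples, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    z = 1.96 corresponds to a 95% confidence level.
    """
    half_width = z * sqrt(accuracy * (1.0 - accuracy) / n_samples)
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# n_samples is a placeholder: the number of independent validation samples
# is not stated in this section.
lower, upper = wald_interval(accuracy=0.86, n_samples=500)
requirement_met = lower > 0.80  # the whole interval lies above the 80% requirement
```

The interval narrows as the number of samples grows, so a lower limit above 80% indicates that the conclusion does not hinge on sampling variability alone.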