1. Introduction
Vegetation is a major component of terrestrial ecosystems and plays an important role in material circulation and energy exchange at the land surface. Different vegetation types affect ecosystems in different ways, so monitoring them is of great significance for understanding their current state and changes and for promoting the sustainable development of resources and the environment [1,2]. The application of remote sensing technology to the investigation and monitoring of vegetation types is a focus of current research, and in recent years a large number of vegetation classification studies have been carried out with remote sensing [3,4]. However, because spectral confusion is widespread, both the number of identifiable types and the classification accuracy still need to be improved, especially in tropical and subtropical areas where vegetation distribution is fragmented, types are numerous and planting structures are diverse [5,6]. Remote sensing images with high spatial resolution significantly enhance the structural information of vegetation, making it possible to exploit the spatial combination features of ground objects; however, owing to the limited number of bands, their ability to identify vegetation types is still restricted [7,8]. Hyperspectral sensors acquire continuous spectral information within a specific range and can therefore capture subtle spectral characteristics of ground objects, which makes accurate identification of vegetation types possible [9]. In recent years, the application of Unmanned Aerial Vehicles (UAVs) in remote sensing has expanded rapidly owing to their academic and commercial success [10,11]. Because of the low flying height and limited ground coverage, the resolution of images obtained by UAV-based hyperspectral cameras can reach 2–5 cm or finer [12]. UAV hyperspectral images therefore combine hyperspectral and high-spatial-resolution characteristics and provide an important data source for the classification of vegetation types in highly fragmented planting areas [13].
There are many difficulties in the interpretation of centimeter-level UAV hyperspectral images. On the one hand, as image spatial resolution improves, ground-object information becomes highly detailed: the spectral characteristics of similar objects diverge while the spectra of different objects overlap, so intra-class variance increases and inter-class variance decreases. Spectral confusion occurs frequently, which weakens the statistical separability of the image spectral domain and makes identification results uncertain [14,15]. On the other hand, UAV hyperspectral images with centimeter-level resolution provide rich spatial features; even blade shapes can be clearly visible. Faced with the ultra-fine structural information and mixed spectral information of UAV hyperspectral images, the problem of balancing spatial and spectral information to effectively identify vegetation types remains to be solved. Traditional spectral interpretation methods designed for medium- and low-resolution imagery face great difficulties with the complex features of high-resolution data. Object-based image analysis (OBIA) has gained popularity in recent years along with the increasing availability of high-resolution images. It takes both spectral and spatial (texture, shape and spatial combination relations) information into account to characterize the landscape and generally outperforms pixel-based methods for vegetation classification, particularly with high-resolution images [16,17,18,19].
Scholars have worked to enhance the spatial resolution of images to improve the accuracy of clustering, detection and classification; this usually succeeds when the resolution suits the targets of the specific scene, which is related to their actual morphological and structural characteristics [20,21,22]. In fact, variation in the spatial resolution of remote sensing images changes how information content is expressed, producing scale effects in the derived results. In particular, scholars have argued that the appropriate spatial resolution is related to the scales of spatial variation in the property of interest [23,24,25]. Because of the complicated combinations of leaf structure and biochemistry, crown architecture and canopy structure, spectral separability is altered by the scale at which observations are made (i.e., the spatial resolution), and this spatial scale problem is central to the complexity of type identification [26]. Owing to the lack of synchronous multi-scale observations, the question of the appropriate scale for vegetation type identification has not been effectively resolved [27]. Scholars have generally studied the spatial scale effect of vegetation type identification using multi-scale images generated by resampling [28,29,30]. For example, Meddens et al. resampled 0.3 m digital aerial images to 1.2, 2.4 and 4.2 m with the Pixel Aggregate method and classified healthy and insect-attacked trees using a Maximum Likelihood classifier; the identification accuracy of the 2.4 m images was the highest [31]. Roth et al. spatially aggregated fine-resolution (3–18 m) airborne AVIRIS imagery to coarser resolutions (20–60 m) for plant species mapping and found that the best classification accuracy occurred at a coarser resolution, not in the original image [32]. Despite these achievements, most research has been based on images at meter-level and sub-meter-level resolutions. For more fragmented planting structures and finer spatial resolutions (e.g., centimeter level), how to balance the relationship between monitoring target scale and image scale, and at which resolution vegetation types can be accurately identified, remain open questions. In view of the limited ground coverage of UAV images, it is of great significance to fly at a proper spatial scale that maximizes coverage while ensuring classification accuracy.
This study mainly aims to: (1) make full use of the hyperspectral and high-spatial-resolution characteristics of UAV hyperspectral images to achieve fine classification of vegetation types in highly fragmented planting areas; and (2) characterize how vegetation type identification varies with scale in UAV images and explore the appropriate scale range, providing a reference for UAV flight experiment design, remote sensing image selection and the application of UAV images to vegetation classification in similar areas.
3. Methods
Based on UAV hyperspectral images of 7 different scales, sugarcane, eucalyptus, citrus and other vegetation were classified by using object-based image analysis (OBIA). Specifically, it included: (1) selecting appropriate scale for multi-resolution segmentation; (2) using the mean decrease accuracy (MDA) method for feature evaluation and selection; (3) selecting the classifiers of Support Vector Machines (SVM) and Random Forest (RF) to classify vegetation types of multi-scale images and comparing their accuracy differences; (4) analyzing the variation of appropriate segmentation parameters and feature space of multi-scale images, and discussing the influence of spatial scale variation on vegetation classification of UAV images.
3.1. Multi-Resolution Segmentation
Image segmentation is the first step of OBIA. It divides an image into discrete image objects (IOs) with unique properties according to certain criteria [47], and its accuracy significantly affects the accuracy of OBIA [48]. A bottom-up region-merging technique based on the fractal net evolution algorithm (FNEA) proposed by Baatz and Schäpe was used for multi-resolution segmentation [49]. As the most widely used method, it generates highly homogeneous segmentation regions, separating and representing ground objects at the optimal scale [50]. In this study, two segmentations were performed, creating two levels of IOs. A series of interactive "trial and error" tests were used to determine proper segmentation parameters [51,52]. Six spectral bands (blue, green, red, red edge I, red edge II and near-infrared) were used as inputs. A small scale factor was used for the first segmentation to maximize the separation of vegetation and non-vegetation. In the second segmentation, three scale factors were tested to prevent the objects from becoming too fragmented, and the most appropriate of the three was determined by classification accuracy.
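The study itself used the FNEA-based multi-resolution segmentation described above; as a rough, hedged illustration only, a conceptually similar bottom-up, graph-based region-merging segmentation can be sketched with scikit-image's `felzenszwalb` (this is not the FNEA algorithm, and the band stack and scale values below are hypothetical):

```python
import numpy as np
from skimage.segmentation import felzenszwalb

rng = np.random.default_rng(0)
# Hypothetical 6-band reflectance chip (blue, green, red, red edge I/II, NIR).
img = rng.random((64, 64, 6)).astype(np.float32)

# Bottom-up, graph-based region merging; `scale` plays a role analogous to
# the multi-resolution segmentation scale factor: larger values merge
# pixels into fewer, larger image objects.
level1 = felzenszwalb(img, scale=50, sigma=0.5, min_size=10)    # finer IOs
level2 = felzenszwalb(img, scale=200, sigma=0.5, min_size=50)   # coarser IOs
```

Running the segmentation twice with different scale parameters mirrors the two-level IO hierarchy used in the study.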
3.2. Feature Extraction
In the process of OBIA, the features related to the IOs can be extracted from the UAV image. The ideal features should reflect the differences between the target types.
Sugarcane, citrus and eucalyptus are broadleaf vegetation and show different spatial distribution features in high-spatial-resolution images. Sugarcane stalks are 3–5 m high, with clumps of leaves. The leaves are about 1 m long and 4–6 cm wide, with serrated, rough edges. The rows are 80–100 cm apart, giving a fine and compact distribution pattern. Citrus is an evergreen small tree with a height of about 2 m and a round crown of less than 2 m; the plant spacing is 1.5–2 m. The leaves are ovate-lanceolate, with large size variation and a length of 4–8 cm. It shows a regular, sparse pattern of circular crowns. Eucalyptus is an evergreen, dense-shade tall tree with a height of up to 20 m. Its crown is a triangular spire, small, with opposite leaves in heart or broadly lanceolate shape, giving a dense distribution pattern.
In this study, as shown in Table 2, 149 features associated with the IOs derived from the second segmentation were extracted in 4 categories: spectrum, vegetation index, texture and shape. (1) Commonly used parameters such as reflectance, hue, intensity, brightness, maximum difference (Max.diff), standard deviation (StdDev) and ratio were used to analyze the spectral features of vegetation. (2) Eighteen vegetation indices were selected (Table 3), including not only broadband indices that can be calculated from traditional multi-spectral images but also red edge indices that are more sensitive to vegetation types. (3) Two shape features, shape index and compactness, were used to represent the shape of vegetation. (4) The widely used gray-level co-occurrence matrix (GLCM) [53] and gray-level difference vector (GLDV) [54] were used to extract textural features.
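As a hedged illustration of what these texture measures compute (not the exact implementation used in the study), a minimal GLCM/GLDV sketch for a single horizontal offset might look like:

```python
import numpy as np

def glcm_gldv_features(gray, levels=8):
    """GLCM and GLDV texture features for one horizontal offset (0, 1)."""
    # Quantise the grey values into a small number of levels, as is
    # standard practice before building a co-occurrence matrix.
    q = np.floor(levels * (gray - gray.min()) / (np.ptp(gray) + 1e-12))
    q = np.clip(q, 0, levels - 1).astype(int)
    # Count co-occurring horizontal neighbour pairs and symmetrise.
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    P = np.zeros((levels, levels))
    np.add.at(P, (left, right), 1.0)
    P = P + P.T
    P /= P.sum()
    i, j = np.indices(P.shape)
    nz = P > 0
    feats = {
        "glcm_contrast": float(np.sum(P * (i - j) ** 2)),
        "glcm_homogeneity": float(np.sum(P / (1.0 + (i - j) ** 2))),
        "glcm_entropy": float(-np.sum(P[nz] * np.log(P[nz]))),
    }
    # GLDV features use only the distribution of grey-level differences.
    d = np.abs(left - right)
    hist = np.bincount(d, minlength=levels) / d.size
    hnz = hist > 0
    feats["gldv_entropy"] = float(-np.sum(hist[hnz] * np.log(hist[hnz])))
    return feats
```

In an OBIA workflow these statistics would be computed per image object rather than per image tile; the single-offset version here is only meant to show the construction of the co-occurrence matrix and difference vector.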
3.3. Feature Evaluation and Reduction
High-dimensional data usually require feature selection before machine learning. Feature selection reduces data redundancy, strengthens the understanding of features, enhances the generalization ability of models and improves processing efficiency. Random Forest (RF), a machine learning algorithm composed of multiple classification and regression trees (CART) proposed by Breiman [71], can effectively reduce data dimensionality while maintaining classification accuracy, and it is widely used in image classification and identification and in the selection of high-dimensional features [72,73]. The mean decrease accuracy (MDA) method of RF was adopted for feature importance evaluation: it permutes the values of each feature in turn and evaluates the feature's importance by measuring the effect of this permutation on model accuracy. If a feature is important, permuting it will significantly reduce accuracy.
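The permute-and-measure principle behind MDA can be sketched with scikit-learn's permutation importance (the data here are synthetic, and feature 0 is constructed to be the only informative one):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
# Synthetic object-level samples: 200 image objects x 10 features,
# where only feature 0 carries class information.
X = rng.normal(size=(200, 10))
y = (X[:, 0] > 0).astype(int)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Permute each feature in turn and measure the resulting drop in
# accuracy -- the mean decrease accuracy idea described above.
mda = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(mda.importances_mean)[::-1]  # most important first
```

On this toy data the informative feature should dominate the ranking; in the study the same ranking would be computed over the 149 object features.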
On the basis of the MDA results, all features are ranked in descending order of importance, and different numbers of features are then used successively to classify vegetation types. To eliminate the feature redundancy that might be caused by the repeated participation of the 101 adjacent reflectance bands, the following feature reduction principle, based on the importance evaluation, is adopted:
The 101 bands are first ranked in descending order of feature importance. When a band is retained, the two adjacent bands on either side of it are deleted.
An interval of 3 between a band to be retained and a band already retained is also acceptable. However, if the interval is 4, the band is deleted and its upper or lower neighbor is retained instead, so that the retained band still has relatively strong importance and the number of bands deleted between retained bands is 2 or 3.
For example, if the most important band is band 64, it is retained first, and bands 62, 63, 65 and 66 are deleted. If the second most important band is band 61 or band 60, either case is acceptable because the number of deleted bands between retained bands remains 2 or 3. However, if the second most important band is band 59, it is deleted, and the more important of bands 58 and 60 is retained instead.
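The core of this thinning rule can be sketched as a greedy pass over the importance ranking. The sketch below is a simplified version of the stated principle: it enforces a minimum index gap between retained bands but omits the substitution step in which a dropped band's more important neighbor is retained in its place.

```python
def reduce_bands(importance_ranking, min_gap=3):
    """Greedy band thinning: visit bands from most to least important and
    keep a band only if it lies at least `min_gap` positions away from
    every band already kept, so that at least two adjacent bands are
    dropped between any pair of retained bands."""
    kept = []
    for band in importance_ranking:
        if all(abs(band - k) >= min_gap for k in kept):
            kept.append(band)
    return sorted(kept)

# Bands ranked by MDA importance, e.g. band 64 first as in the text.
print(reduce_bands([64, 61, 59, 67, 63]))  # [61, 64, 67]
```

Here band 59 is dropped because it sits within two positions of the already retained band 61, matching the spirit of the example in the text.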
3.4. Classifier
Two supervised classifiers were considered in this paper: Support Vector Machines (SVM) and Random Forest (RF).
SVM is a non-parametric classifier [74]. Adopting the principle of structural risk minimization, it automatically selects the important data points that support decision-making and provides an efficient solution for classifying ground objects in high-resolution images. In this study, we used the Radial Basis Function (RBF) kernel because of its strong classification performance [75]. Two classifier parameters need to be determined: the RBF kernel parameter gamma and the penalty factor c. A moderate error penalty (c = 100) and a gamma equal to the inverse of the feature dimension were used to facilitate comparison between the classification results obtained with different numbers of features [76,77].
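In scikit-learn terms, this configuration corresponds to the following sketch (the data are synthetic; `gamma="auto"` is exactly 1 / n_features):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 30))              # e.g. 30 selected object features
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # synthetic two-class labels

# RBF kernel with a moderate penalty and gamma = 1 / n_features,
# matching the configuration described in the text.
svm = SVC(kernel="rbf", C=100, gamma="auto").fit(X, y)
```

Keeping c fixed and tying gamma to the feature dimension means the classifier setup changes automatically and consistently as the number of selected features varies, which is the point of the configuration above.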
As mentioned above, RF is a non-parametric ensemble learning algorithm [71] composed of multiple decision trees. When the trees are built, each split is determined by the Gini impurity criterion to realize the best variable split. RF learns quickly, is robust, generalizes well and can analyze and select complex interacting features, and it is widely used in computer vision, human identification, image processing and other fields. Two important parameters need to be determined: ntree (the number of decision trees used for classification) and mtry (the number of input variables considered at each node) [78]. Different values of ntree were tested from 50 to 150 at intervals of 10 trees; classification accuracy changed little as the number varied. A fixed ntree (ntree = 50) and an mtry equal to the square root of the feature dimension were therefore used to facilitate comparison between the classification results obtained with different numbers of features.
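In scikit-learn, ntree and mtry map onto `n_estimators` and `max_features`; a minimal sketch of this configuration on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 36))          # e.g. 36 selected object features
y = rng.integers(0, 3, size=150)        # synthetic labels for 3 classes

# ntree = 50 and mtry = sqrt(n_features) as in the text; Gini impurity
# is the default splitting criterion, stated here for clarity.
rf = RandomForestClassifier(n_estimators=50, max_features="sqrt",
                            criterion="gini", random_state=0).fit(X, y)
```

As with the SVM, tying mtry to the feature dimension lets the same configuration be reused as the number of selected features changes.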
6. Conclusions
This study evaluates the impact of spatial resolution on the classification of vegetation types in highly fragmented planting areas based on UAV hyperspectral images. By aggregating the 0.025 m UAV hyperspectral image to coarser spatial resolutions (0.05, 0.1, 0.25, 0.5, 1 and 2.5 m), we simulated the centimeter-to-meter level resolution images obtainable by UAV systems and evaluated the accuracy of the fine classification of vegetation types such as sugarcane, citrus and eucalyptus in southern China across these scales. The results show that the classification accuracy of vegetation types is closely related to image scale. For this study area, as spatial resolution decreases, the overall accuracy (OA) first fluctuates slightly around a stable level and then gradually decreases; the best classification accuracy occurs not in the original image but at an intermediate resolution. These results are consistent with similar studies on image scale: the best resolution occurs where spectral intra-class variance is smallest and the classes have not yet begun to mix spatially. The ideal spatial resolution therefore varies with the diversity and distribution of species in the ecosystem, and parcel size and distribution are the key factors determining accuracy at a given resolution. Because of small and fragmented parcels, coarse-resolution images no longer contain some original categories, such as citrus in this study, which reduces the classification accuracy of the 1 and 2.5 m images. It is therefore important to select images of appropriate spatial scale according to the spatial distribution and parcel size of the study area so as to obtain better classification accuracy in UAV flight experiments, data processing and applications.
In the process of OBIA, guided by the results of multi-feature evaluation and analysis, vegetation types were successfully classified in images of different scales using different numbers of features and segmentation parameters. We find that, as spatial resolution decreases, the importance of vegetation index features increases while that of textural features decreases; the appropriate segmentation scale decreases gradually, and the appropriate number of features is 30–40. The feature parameters thus vary across multi-scale images, and appropriate parameters need to be selected for images at each scale to ensure classification accuracy.
There are several clear directions for future research. First, more realistic simulation and scaling of fine-spatial-resolution images would help to improve the evaluation of potential applications of similar data at coarse resolution. Studies on the up-scaling of remote sensing images also show that the spectral information of resampled images depends strongly on the original images, causing differences from actual observations at a given scale [93]. The results of this study should be compared with classifications of actually observed images to further understand the potential impact of resolution on vegetation type classification. In addition, increasing spatial resolution leads to greater intra-class difference and inter-class similarity, which usually causes classification errors [94,95]. In view of the challenges and potential of ultrahigh-resolution UAV images for the classification of vegetation types, advanced data analysis technologies developed in computer vision and machine learning, such as deep learning [96], should be comprehensively explored to improve the application capability of UAV images.