1. Introduction
The 3D information of vegetation is an important part of digital forestry, which provides important data guarantees for vegetation carbon storage, biomass and ecological assessment. Under the background of rapid development of big data and artificial intelligence, active remote sensing technology represented by LiDAR has gradually become an important means for vegetation 3D information acquisition and forest resources investigation. It provides technical support for forest resource investigation and ecological process detection and analysis at different scales [
1,
2,
3,
4,
5,
6]. LiDAR technology is characterized by the fast acquisition of spatial data, a high degree of automation, high precision and a large amount of data [
7]. With the help of this technology, high-precision spatial information of individual trees and small-scale forests can be quickly obtained, and then the segmentation of individual tree canopy structures [
8,
9,
10] and tree structural reconstructions [
4,
11], and the extraction of individual tree vegetation information [
12,
13] can be performed. For example, Terrestrial Laser Scanning (TLS) can quickly extract vegetation parameters, such as tree volume, leaf area index and gap fraction, providing key parameters for forest ecological evaluation investigation [
14]. In addition, TLS can also be used to evaluate the height of individual trees and leaf area density [
15]. In addition to individual tree/forest vegetation parameter extraction, high-precision and high-density tree point cloud is also widely used in high-precision extraction and reconstruction of branch structure [
16], which can provide important data and model support for digital forestry and digital city construction.
However, restricted by factors, such as laser scanning mode and environmental occlusion, the LiDAR point cloud of individual trees/forests is often missing, to varying degrees. For example, (1) limited by the scanning resolution, the complex layered branch structure inside the canopy often has the situation that the thin branches in the center of the canopy are blocked by the thin branches in the outer layer; (2) affected by the scanning point cloud density and scanning distance, different trees/different parts of the same tree due to its distance inconsistency, often result in a point cloud density that is not uniform, which to a certain extent, can be seen as an incomplete point cloud. In addition, the point cloud near the missing area also has problems, such as drastic density changes; (3) the thin branch point cloud inside the canopy is usually missing in a large range due to the self-occlusion of branches/leaves and the external occlusion of pedestrians/billboards during the scanning process; (4) compared with the time-consuming and labor-intensive multi-station scanning, single-station scanning is simple and efficient, but the data loss of a point cloud obtained by single-station scanning is more serious. Usually, only the tree point cloud facing the scanning device can be obtained, and a large number of missing point clouds exist on the side away from the scanner.
The absence of point clouds usually has an adverse impact on the extraction of vegetation parameters and the abstract expression of branches. For example, the individual tree point cloud based on airborne LiDAR, due to its lack of twigs and trunks inside the tree canopy, often produces large data deviations in large-scale ecological studies of forest biomass and volume, which brings difficulties to practical analysis and application. In addition, in the practical application of individual tree 3D reconstruction, the lack of regional point clouds will further aggravate the difficulty of high-precision 3D reconstruction of individual trees. If the incomplete point cloud is ignored and the model is directly constructed, problems, such as the wrong connection of branch topology and inconsistency of branch radius with the true value will often occur. In order to ensure the realism and fidelity of the model, point cloud repair for the missing area is one of the simplest and most direct strategies to solving the problem of high-precision model reconstruction in the case of missing parts of point clouds.
Figure 1 shows the scenarios that may occur when there are missing data in the bifurcated branch structure region of an individual tree. It can be seen that when the branch point cloud is complete, the branch skeleton can be extracted correctly and completely, and the branch topological connection is correct, as shown in
Figure 1a,c. However, when the point cloud inside the branch, especially the point cloud at the junction, is missing, the extracted skeleton may have topological connection errors, as shown in
Figure 1b,d. In this case, if the skeleton/tree 3D model is constructed directly without data repair, problems, such as branch topology errors and radius calculation errors will often occur, affecting the accuracy of tree skeleton extraction and model reconstruction. Therefore, it is important to construct a branch point cloud enhancement algorithm based on the acquired tree point clouds to maintain the tree skeleton topology and tree model realism.
To solve the problem of missing parts of a tree point cloud, many authors have proposed data enhancement and recovery methods for tree point clouds, which can be roughly divided into the following three categories: (1) Repairing missing parts of point clouds based on point cloud features, usually using point cloud local features, such as normal vector [
17], curvature factor [
18], reflection intensity [
19], point cloud density and normal information [
20]. For example, repairing the incomplete point cloud based on structure-aware global optimization algorithms [
21], based on point cloud local density information [
22], and based on point cloud normal information [
23]. The branch point cloud repaired by this method is often more consistent with the natural growth pattern of vegetation, but the time cost is high. (2) Multi-source data fusion for missing parts of point cloud data enhancement, such as improving point cloud data based on backpack LiDAR scanning [
24] or multi echo-recording mobile laser scanning [
25]. Although these methods are direct and convenient, it brings problems, such as multi-source data fusion and registration. (3) Based on prior knowledge or modeling algorithms, such as methods based on first reconstructing the branch model/fitting the branch cylinder and then completing the missing point cloud [
7,
26]. These methods can obtain relatively complete branch detail information, but the branch radius will violate the natural growth rules, and the subsequent completion needs further prior constraints to ensure the realism of the branch point cloud reconstruction model.
Therefore, in order to ensure the reliability of vegetation parameter extraction and ecological analysis and the accuracy of branch reconstruction, it is important to explore a high-precision point cloud restoration algorithm that conforms to the tree structure rules. Considering the disadvantages of long recovery time of algorithms based on point cloud density and normal information, and the disadvantages of the lack of realism based on an a priori strategy, this paper adds constraints on tree structure direction and proposes an iterative point cloud optimization algorithm based on local point cloud weight density and skeleton point dominant direction. The algorithm realizes the repair and enhancement of the incomplete point cloud of an individual tree through iteration and obtains the point cloud of an individual tree branch that matches the geometry of the real tree. The repaired and enhanced tree point cloud can lay an important data foundation for the subsequent extraction of vegetation parameters and ecological analysis, as well as the reconstruction of branch structure with high accuracy and correct topological connection.
2. Tree Point Cloud Restoration
The process framework of the repair algorithm in this paper is shown in
Figure 2. Firstly, the L
1-Median algorithm [
27] is used to extract the median points of the individual tree point cloud as the initial skeleton key points. Then, the initial skeleton key points are combined with the original individual tree point cloud, which is used to calculate the local weight density of the incomplete point cloud and the dominant direction of each skeleton key point. They are then used as a reference for point cloud repair optimization. The repair process draws on the point cloud movement strategy of the local awareness global optimization algorithm; the force and contraction constraints of each point are added to make the adjacent points in the missing area move to the missing branches along the constraint distance. Since there is no heterogenous point cloud involved in the restoration process, the total amount of point cloud of repaired branches remains unchanged, so the point cloud density will gradually decrease with the extension of branches, and the density change is closer to the natural rules of vegetation growth. Finally, we iterate the process of “skeleton point extraction—point cloud superposition—local point cloud feature calculation—input point cloud spatial position optimization”. When the force and contractive constraint of each point near the missing area reach equilibrium, the iterative process stops and the missing repair is completed.
2.1. Extraction of Initial Skeleton Key Points
Studies on tree skeleton extraction based on LiDAR point cloud can be roughly divided into the following three categories: (1) Skeleton extraction based on clustering algorithms, such as the clustering algorithm based on horizontal data sets [
28,
29], i.e., making horizontal slices at a certain distance in the Z dimension, followed by clustering the point clouds within the horizontal slices to form skeleton points. There are also extraction methods based on K-means clustering of tree skeleton points [
30,
31]. (2) Skeleton extraction based on graph theory methods, such as tree skeleton extraction based on octree structure [
32,
33]. (3) Laplacian operator-based skeleton extraction methods [
5,
34].
2.1.1. Random Sampling of Original Point Cloud
Considering that the skeleton extraction involved in this paper only serves to calculate the local features and repair of a point cloud, and does not require high spatial geometric accuracy, the L
1-Median algorithm proposed in reference [
27] is selected as the skeleton key point extraction method. Since there is often a discrepancy in cloud density between twigs and main branches, direct skeleton extraction based on the L
1-Median algorithm is prone to shrinkage inconsistency, that is, skeleton extraction is completed with a large number of skeleton points with higher density, while skeleton points of twig are not effectively extracted. In order to prevent such a situation, this paper refers to the iterative shrinkage method in reference [
27], which firstly identifies the labeled skeleton branch points, then selects suitable bridging points from the non-branch points to connect the branch points with the non-branch points, and finally gradually expands the neighborhood values to achieve the growth and merging of branches. Specifically, the original individual tree point cloud is first randomly sampled to obtain a set of sampled points, and the set of points is marked as non-branching points.
2.1.2. Point Cloud Skeleton Extraction
After random sampling of the original point set, the tree skeleton key points are extracted based on the L
1-Median algorithm with iterative shrinkage. Specifically, a scattered individual tree point cloud that is undirected, unevenly distributed, and contains noise and outlier points is used as input, denoted as
Q = {
qj}
j∈J, and the output of the algorithm is a one-dimensional curve skeleton point. In order to extract the initial skeleton point of an individual tree point cloud, this paper transforms the initial skeleton point location problem into a location problem of finding a set of optimal point sets
X = {
xi}
i∈I, where point set
X is a set of point with the minimum Euclidean distance from the point cloud in its local neighborhood. The formula for a point
xi in the point set
X is as in Equation (1) [
22,
27]:
where the first term is to calculate the spatial position of the optimal point set
X in the input individual tree point set
Q; the second term
R(
X) is a regular term with conditions attached, which mainly serves to generate a repulsive force when the local branch skeleton is formed and imposes a penalty on the position of point
xi to ensure the uniform distribution of the skeleton point positions.
I is the index point set of point set
X and
J is the index point set of point set
Q.
θ is a fast decaying function, whose definition is shown in Equation (2):
where
h is the local support radius and it defines the size of the supporting local neighborhood for L
1-medial skeleton construction.
In order to prevent the appearance of non-uniform distribution situations, such as point clusters, it is proposed to add a conditional regular term R(X) to apply a repulsive force during the generation of local skeleton points, so as to avoid point offset due to iteration when the initial skeleton points extracted by the L1-Median algorithm are already at the appropriate positions, and to ensure the uniform distribution of the initial skeleton points.
The classical weighted Principal Component Analysis (PCA) is used to detect the distribution of the point cloud near the individual tree branch skeleton structure. For a point
xi in the point set
X, the eigenvalues and eigenvectors of a 3 × 3 weighted covariance matrix are calculated as shown in Equation (3):
To conditionally apply the repulsion force, we define our regularization function as in Equation (4) [
27]:
where
γi is the balancing constant of the optimal point set
X,
σi is the directionality degree of
xi within a local neighborhood, and its calculation formula is as in Equation (5):
where
are the eigenvalues of point
xi, where
forms an orthogonal system, which is the principal component of the point set. The closer
σi is to 1, the smaller
and
are compared to
; and hence, the more points around
xi are aligned along the direction of the tree skeleton.
After determining the regular term
R(
X), this paper calculates the L
1-median point
xi,
,
. When the energy gradient value is 0, the fixed coefficient at each point should satisfy Equation (6) [
27]:
At this point, the parameter
μ is defined, and
μ satisfies Equation (7) [
27]:
In order to avoid the
xi coefficient matrix being singular, let 0 ≤
μσi < 1/2, and the L
1 median point
xi is solved iteratively at the same time. Note, that the median point set in the current iterate
,
then the point set
generated in the next iterate is as shown in Equation (8):
where
.
Since can adaptively adjust the repulsive force based on the point dominant direction. In this paper, only the control parameter 0 ≤ μ < 1/2, can control the penalty strength of the overall individual tree point cloud during the iterative shrinkage process.
This paper iteratively shrinks the set of sampling points according to the initial neighborhood value. The initial contraction radius is set according to the initial neighborhood radius, as shown in Equation (9) [
22,
27]:
where
h0 is the initial neighborhood value,
is the diagonal length of the input
Q’s bounding box and
is the number of points in
Q.
2.1.3. Searching Key Points in Skeletons
To identify branch points in the set of labeled sampling points, this paper calculates the directional metric
σi for all non-branch points and eliminates outlying points based on the
k-nearest neighbor algorithm (the default
k value set is 6). Meanwhile, the corresponding threshold value is set for
σi. Based on the experience of reference [
22], it is set that when
σi > 0.9, the point
xi is identified as a candidate branch point, and it is determined that all points in the neighborhood of point
xi have the same directional distribution at this time. Then the labeled branch points are further identified based on the candidate branch points. In this paper, the point corresponding to the maximum
σ value is labeled as seed point
x0, and all candidate branch points in its neighborhood are traversed from point
x0. Specifically, this paper calculates the distribution direction of point
xi based on PCA and searches for candidate branch points in the vicinity of point
xi along the direction. The search process for branch points is terminated when there are no points within the neighborhood that satisfy condition (10):
The new iteration process selects the largest σi value from the remaining candidate branch points as the new seed point and iterates until all candidates have been processed.
In the process of searching for branch points, if a fixed value
h0 is used as the neighborhood radius for searching, there are often some branch points that are incorrectly marked as non-branch points, thus causing some regions of branch skeletons to be missing. Therefore, an adaptive neighborhood radius value
h is needed to better adapt to the variation of the different individual tree structures and avoid the phenomenon of over-shrinkage and under-shrinkage. Based on the assumption that the individual tree point cloud gradually becomes less dense along the skeleton structure from the root to the end of the branch, we gradually increase the size of the
h value during the shrinkage iterations, while eliminating the branch points that have been marked. At each iteration, the neighborhood value is to be increased by ∆
h, and the equations are as follows:
2.2. Local Feature Calculation of Point Cloud (Calculation of Dominant Direction and Local Point Cloud Weight Density)
Based on the extracted L1-Median initial skeleton key points, the dominant direction and local point cloud weight density of skeleton key points in the individual tree point cloud were defined, and the adjacent correlation points of the missing area were guided to move along the dominant direction, and the individual tree incomplete point cloud was gradually repaired.
Specifically, based on the assumption that each branch extends in a unique direction, the extension direction of the branch to which each point belongs is defined as the dominant direction of the point, and the dominant direction of the initial skeleton point is used to represent the dominant direction of the points in the neighborhood of the skeleton point. Firstly, the
k nearest neighbors of the initial skeleton point
i are obtained based on the
k-nearest neighbor algorithm [
35], and the dominant direction of these
k nearest neighbors are defined to be the same as the dominant direction of the initial skeleton point
i. The sub-nodes of the initial skeleton point can be divided into three cases: (1) contains only a single sub-node, i.e., this initial skeleton point is the internal point of the branch; (2) contains two or more sub-nodes, i.e., this initial skeleton point is the branch bifurcation point; (3) does not contain sub-nodes, i.e., it is a branch end skeleton point. When the initial skeleton point is located inside the branch, the dominant direction of the initial skeleton point
i is calculated schematically as shown in
Figure 3a. Where the red nodes
i,
j, and
k represent the initial skeleton points, the yellow, blue, and green nodes represent the original point cloud (different colors represent the nearest neighbors of different skeleton points), and the green line represents the dominant direction of skeleton points
i and
j. The dominant direction of the skeleton point is the direction that the initial skeleton point
i points to its unique child node
j. This direction also represents the dominant direction of the
k points in the neighborhood of point
i (the points indicated in yellow in the figure). When the initial skeleton point is located at the branch bifurcation point, at this time, based on the single child node dominant direction calculation method, the dominant direction is calculated for each initial skeleton point of the branch route in turn, and the calculation schematic is shown in
Figure 3b, where the red node is the initial skeleton point and the green line represents the dominant direction of the point. In the case of
Figure 3b, this paper first calculates the dominant direction of all skeleton points on branching route a, and then calculates the dominant direction of all skeleton points on branching route b based on the bifurcated skeleton point
i. When the initial skeleton point is located at the end of the branch, the traversal iteration ends at this point, and the dominant direction of the end skeleton point is set to be the same as the dominant direction of its parent node.
After determining the dominant direction of the initial skeleton point, the point cloud is guided to move according to the dominant direction of the key point. Drawing on the idea of structure-aware global optimization [
21], this paper analogizes point clouds as particles with electric power. Then these particles move freely according to the dominant direction, and the repair of the point cloud in the missing region is completed when the particles reach force equilibrium and stop moving. For the above purpose, it is assumed that each point in these discrete point clouds is a particle with the same kind of charge. They repel each other, so the particle at the end of the branch point cloud will be repelled by its forward particle and move. However, these particles cannot be moved arbitrarily, because the branches have directions, so it is necessary to obtain the direction of each branch, and then let these particles move along the direction of this branch extension. Next, it is necessary to define the force between the particles to control the range of motion of these particles, and
Fr is defined to express the repulsive force of each particle by the surrounding particles. The direction of this force follows the extension direction of the branch, which is the projection of the repulsive force of the surrounding particles on this particle in the extension direction of the branch, either in the same direction as the extension of the branch or in the opposite direction. However, if there is only a repulsive force, the particle at the end will always move in the direction of the branch extension and will not stop, so a binding force
Fs is needed to prevent the point from deviating significantly from the original position of the point. The direction of this force should be from the current position to the original position of the point. The farther the point deviates from its original position, the stronger the force should be, similar to a spring force. A particle is subjected to the joint action of these two forces, and when the particle finally stops moving, the two forces should be equal in magnitude and opposite in direction, so that the particle is in a state of force equilibrium. When all the particles stop moving, the missing part is repaired by these particles.
To quantify the force equilibrium state of the particles (point cloud), a constraint is imposed on the moving distance of the points, and the local point cloud weight density
dj of the key points of the skeleton is defined, and the calculation formula is as in Equation (13):
where
vi is the average distance of all points connected to the skeleton point
i of the key point.
The local point cloud weight density can ensure that the tree point cloud moves within a small range, following the vegetation growth law conditions. For example, in the area with a larger local point cloud weight density, it indicates that the point cloud is closer to its skeleton key point, the point cloud density is higher, and the distance to be moved becomes larger accordingly. Accordingly, in the region where the local point cloud weights are less dense, the neighboring points are farther away from the skeleton key points, the point cloud is sparser, and the distance to be moved becomes smaller accordingly.
2.3. Iterative Repair Optimization
After obtaining the dominant direction and local weight density of the point, in order to further quantify the balancing condition, the idea of gravitational and repulsive force construction is borrowed from reference [
21] to define the action force and contractive constraint of the method point in this paper, where the formula of the action force
Fr(
i) is shown in Equation (14):
where
di and
dj are the local weight density of point
I and point
j,
Pi and
Pj are, respectively, the space coordinates of point
i and point
j after iteration contraction,
Oi is the dominant direction of point
i, and
T is the transpose operator.
The contractive constraint
Fs(
i) can be calculated as Equation (15):
where
Ui is the initial spatial position of point
i,
, which is an adjustment factor to prevent the point cloud at the end of the branch from moving widely due to low local weight density, so the
φ value will be larger at the missing location and smaller at the end of the branch.
m is the number of all L
1-Median skeleton key points, and
ci is the sum of Euclidean distances of all child nodes of point
i in the L
1-Meidan skeleton structure.
Through the above optimization process, the incomplete individual tree point cloud is iteratively repaired and optimized. When the point force and contraction constraints reach balance, the iterative process terminates and the incomplete point cloud is repaired.
2.4. Repair Effect Evaluation
To further evaluate the restoration effect of tree point clouds, this paper quantitatively evaluates the point cloud repair optimization results with the help of the final modeling effect based on the quantitative analysis model in the literature [
36]. Specifically, an individual tree with good point cloud integrity was selected as the study object, and a 3D tree model was constructed based on this point cloud data, which was recorded as the reference individual tree model. Then, some branch point clouds were manually removed, and the point cloud data of the individual tree after each iteration was output through the above iterative restoration process. To quantify the repair effect of each iteration, the individual tree point cloud generated after each iteration was constructed as a 3D individual tree model and recorded as a validation individual tree model. For the constructed 3D model, the TLS point cloud generation process was simulated: specifically, the laser beam of the 3D laser scan was simulated using PBRT (Physically-Based Ray Tracing) software [
37], and the field of view was set to the same value as that of the scanner to obtain the point cloud, which was set to −60°–90°. The reconstructed individual tree model was placed at a distance so that the simulated beam could scan the entire tree. Three laser scans were simulated at azimuths of 0°, 120°, and 240° around the individual tree, and the distance between the camera and the tree model was kept constant. Each simulated scan produces a raster image. For each pixel of the raster image, the simulated scan emits a laser beam to ensure the relative integrity of the simulated point cloud generation, and finally, the simulated point cloud is aligned to generate a 3D point cloud of an individual tree. For the simulated generated individual tree point cloud, this paper continues to compare the differences between the point clouds in the two forms to verify the differences between the reference individual tree model and the validated individual tree model. In other words, the simulated individual tree point cloud is rasterized and a 3D raster (voxel) of size 0.2 m is created in the point cloud space enclosing the box, and then the simulated individual tree point cloud is put into the voxel and the number of points in each voxel is recorded. Finally, the number of simulated transformed points in the voxels with points is counted to obtain the difference between the individual tree model to be validated and the reference individual tree model. To facilitate comparison, the differences between the number of points are further normalized and the mean and standard deviation of these differences are counted.