4.1. The non-rigid transformation model
A non-rigid transformation model can be described by a
continuity model f and a
local transformation modelg, c.f. equation
11. We propose in this work the usage of
piece-wise tricubic polynomials (PTCP) as the
continuity modelf
and the
local translation as the
local transformation model g
In the following, a formal description of this transformation model will be given. Afterwards, we will motivate in detail the choice of this specific model.
The idea of using PTCP to model the transformation parameters is borrowed from the tricubic interpolation (TCI) method [
60]. TCI is the extension of the popular and highly efficient bicubic interpolation (used e.g. for image resampling) to the third dimension. It is a three-dimensional interpolation method used to derive smooth values from a given set of sparse and irregularly-spaced data points. TCI uses a grid-based 3D
control structure, i.e. a voxel structure. The voxel size is the main parameter to adjust the resolution of the transformation field
. The interpolated values change continuously (
continuity) and smoothly (
continuity) across the entire voxel structure, i.e. not only within a single voxel, but also across the voxel faces. The overall model is composed by PTCP and accordingly the values in each voxel are defined by a cubic polynomial with an individual set of 64 coefficients.
We use PTCP to model the space-varying values of the transformation parameters. More specifically, the values of each transformation parameter are represented by an individual scalar field, cf.
Figure 6. Accordingly, three scalar fields are used to model the components of
, namely for
,
, and
. The transformation field
, i.e. the translations
in form of a vector field (cf. equation (
10)), is obtained by combining these three individual scalar fields.
In the following, we describe the definition of a single scalar field, namely
; the scalar fields
and
are defined analogously. The translation
is defined at the position
by the cubic polynomial:
where
are the 64 coefficients corresponding to the voxel in which the point
is located and
,
,
are the reduced and normalized point coordinates of
. These coordinates are defined in a local voxel coordinate system (VCS) by
where
(
) is the local origin of the voxel and
s is the voxel size.
In order to achieve global
and
continuity, the coefficients
of each voxel can not be estimated independently. Instead, one must ensure that the values and its derivatives are continuous at the contact faces of neighboring voxels. Lekien and Marsden [
60] present an elegant and efficient solution to this problem by relating the coefficients of a voxel to the values and its derivatives at the 8 corners (
, …,
) of this voxel. For this, first, the tricubic polynomial (
21) is expressed as the scalar product
Thereby, the column vector
(
) contains the 64 coefficients
of the tricubic polynomial (
21) – the elements are defined as:
Similarly, the column vector
(
) contains the products of the exponentiated coordinates
,
,
– the elements are defined as:
Now, a new column vector
(
) is introduced which is composed by the values and the first, second, and third derivatives of the scalar field
at the 8 corners of the voxel:
The relationship between
and
can now be formulated using a matrix
(
) by
where the elements of
are defined by:
The matrix
is rather sparse (46.9% sparsity) and its elements are integer numbers. These numbers do not depend on the actual values of the coefficients
. Consequently,
is a constant matrix whose elements are known in advance. The determinant of
equals 1 and as a consequence
is invertible. We provide the matrices
and
in our public repository
here. The inverse matrix can be used to compute the coefficients
from
with
With this, finally, the tricubic polynomial (
23) can be written in the elegant form
With this form, the scalar field can now be defined through the values of
(instead of using the coefficients
), which means by 8 parameters at each voxel corner. Accordingly, the elements of
for the entire voxel structure correspond to the unknown parameters to be estimated in the optimization process (
15). Notably, Lekien and Marsden [
60] have proven that continuity of
at the corners of neighboring voxels is sufficient to achieve global
and
continuity of the scalar field. In other words, continuity of the values and derivatives at the voxel corners is sufficient to achieve also continuity at the contact faces of the voxels.
There are three important advantages of form (
30) over form (
21). The first advantage is that the scalar field can be defined by a significantly smaller number of parameters – we’d like to illustrate this with an example. For this we assume to have 3 rather small scalar fields with 5×4×2=40 voxels as the ones depicted in
Figure 6. With form (
21) one would need for each voxel an individual set of 64 coefficients to represent a single scalar field; this leads in sum to 40×64×3=7680 parameters for all three scalar fields. Additionally, one must define 8 continuity constraints (for the values and its derivatives) at the adjoining corners of the voxel structure; this leads in sum to 5520 additional constraints. In contrast, with form (
30) a single scalar field is defined through the 8 values and derivatives at the voxel corners; this leads in sum for all three scalar fields to only 6×5×3×8×3=2160 parameters, where 6×5×3 is the number of the voxel corners. Additional constraints are not needed. Summarizing, with form (
30) the number of parameters can be reduced by approximately 72% in this case.
The second important advantage of form (
30) is the efficient evaluation of the scalar fields for a large amount of points. This is particularly important when applying an estimated transformation to the entire point cloud which potentially consist of hundreds of millions of points. For this, we assume to have a set of points
in a single voxel. The scalar field can then be evaluated efficiently for all points at once with
where the matrix
(
) is defined as
The evaluation of the scalar field through equation (
31) is particularly advantageous when used in interpreted programming languages like Python or Matlab. This is because performing matrix multiplications for a large set of points is much more efficient than iterating through each point one by one.
Finally, a third advantage of form (
30) is that it is much easier and intuitive to manipulate the transformation field
by manipulating
instead of
. For example, one can easily adjust the smoothness of the transformation field by directly manipulating the derivatives of
at the voxel corners, e.g. by defining regularizing observations (see next section), constraints, or upper limits for the corresponding parameters in
(
26). Specifically, such additional observations or constraints can be useful to mitigate large unmotivated oscillations of the transformation values, e.g. in regions with only few correspondences.
Summarizing, the main motivations for the proposed non-rigid transformation model are:
Continuity: The transformation field is and continuous, i.e. transformation values change smoothly over the entire voxel structure.
Flexibility: The domain of corresponds to the extents of the voxel structure. Thus, it can easily be defined by the user, e.g. to match exactly the extents of point cloud tiles. Moreover, the resolution of can easily be adjusted through the voxel size.
Efficiency: The transformation field
can efficiently be estimated for two reasons. First, the number of unknown parameters is relatively low. Second, the transformation is a linear function of the parameters in
. In other words, the parameters of
can be estimated through a closed-form solution which does not require an iterative solution or initial values for the parameters. Moreover, the transformation of very large point clouds can efficiently be implemented using equation (
31).
Intuitivity: The parameters of the transformation field can easily be interpreted as they directly correspond to the translation values and its derivatives. Thus, it is also rather easy to manipulate these parameters by introducing additional parameter observations, constraints, or upper limits to the optimization.
4.3. A synthetic 2D example
In this section, we will discuss various aspects of the proposed non-rigid transformation model on the basis of an example. In order to better visualize scalar and vector fields, the example takes place in the two-dimensional Euclidean space . The main differences to the previously-described transformation in are: the bicubic polynomial has only 16 coefficients (instead of the 64 coefficients of the tricubic polynomial), the transformation field is obtained by combining the scalar fields and (instead of combining , , ), and the control structure is composed of two-dimensional cells (instead of voxels).
The two point clouds to be registered are visualized in the upper left image of
Figure 7. The fixed point cloud
is synthetically generated and consists of two axially parallel lines, four simple geometric forms, and a dense point raster. The transformed point cloud
is generated from
by applying two consecutive transformations. First, a rigid-body transformation with
=
,
=
, and
=
is applied. Then, an additional sinusoidal translation (amplitude=2, period=15) is added in
y direction. The goal of this example is to estimate the combination of these two transformations using the non-rigid transformation model presented in the previous sections.
For this, 632 correspondences
3 between the point clouds
and
are used. The point-to-point distance (
7) is minimized between these correspondences. The
control structure of
consists of 17×24=408 cells with a cell size of 5. Considering that
has 4 elements in
(e.g. for the scalar field
:
,
,
, c.f. equation
26), the number of unknown parameters for both scalar fields
and
equals to 18×25×4×2=3600. These parameters are estimated by solving an overdetermined linear equation system according to the least squares principle. The equation system consists of 4232 condition equations: 632 point-to-point distance observations and 3600 regularizing observations. Consequently, the redundancy of the equation system is 632. The weights of the regularizing observations
,
, and
are set to 0.02, 0.01, and 0.01, respectively.
In the upper right image of
Figure 7 the adjusted state of the point clouds is visualized. One can see that the two point clouds match very well after adding the estimated transformation field
to
– mean and standard deviation of the distance residuals are
. The vector field shows the estimated translations
at selected points in scaled form. The lower part of
Figure 7 shows a comparison between the estimated scalar fields
and
and their ground-truth values. Additionally,
Figure 8 and
Figure 9 show the effects on the estimated scalar fields
and
when varying the weights
,
, and the cell size – the main results from
Figure 7 are thereby located in the middle of each parameter variation. For each variant, the condition number
C of the normal matrix and the goodness-of-fit (GoF), defined as the sum of squared distance residuals, is specified.
These results lead us to the subsequent observations:
In areas with dense correspondences, the transformation can be well estimated, i.e. the differences between the estimated scalar fields and their ground-truth fields are nearly zero in these areas. In correspondence-free areas the transformation tends towards zero due to a lack of information.
The locality of the transformation depends mainly on the cell size. Minor adjustments of the locality can be made by modifying the weight . The cell size needs to be adjusted to the variability of the transformation to be modelled.
The scalar fields tend to oscillate if the ratio is large – in such cases the scalar fields have relatively steep slopes at the cell corners.
The GoF is better for lower weights and smaller cell sizes. However, in case of correspondences with even small random errors, a small cell size also increases the risk of overfitting.
The condition number C decreases with higher weights, i.e. the stability and efficiency of the parameter estimation increases.