ICP registration with DCA descriptor for 3D point clouds

Ying He; Ying He; Jun Yang; Xingming Hou; Shiyan Pang; Jia Chen

doi:10.1364/OE.425622

1. Introduction

In recent years, interactive motion sensing games and virtual reality (VR) have become part of people’s lives, with the advent of low-cost 3D data acquisition devices such as lidar, Microsoft’s Kinect, Google’s Tango, Intel’s Realsense, etc. These devices can obtain the target 3D point cloud with high precision and little influence from changes in light and scale, which can be further used for 3D modeling and pose tracking. The widespread application of these devices facilitates the rapid development of the studies on 3D point cloud technology.

The process of point cloud registration is an essential process of 3D shape modeling and pose tracking [1]. As the current most commonly used algorithm for point cloud registration, the Iterative Closest Point (ICP) algorithm [2] is used to iteratively solve the transformation parameters until they converge to the specified accuracy based on the assumption that the closest point between the two point clouds is the corresponding point, and finally obtain the transformation relationship between the two point sets. The ICP algorithm has the advantage of very high registration accuracy, but requires a good initial value, not too large transformation between the two point clouds, and not too much occlusion; otherwise the iterative convergence process is time-consuming or even unable to converge to a correct result [3,4].

Therefore, this paper proposes an ICP registration algorithm based on the local features of point clouds. This method uses the geometrical features of point clouds to be registered to establish the feature description (DCA), searches the correspondence between two point clouds via feature description, and realizes the accurate registration of two point clouds with the ICP algorithm. Without necessity to set a good initial value, the method can avoid the ICP algorithm against falling into local extremum and has a high convergence speed.

In summary, the main achievements of this paper are as follows:

(1)In view of the complicated and time-consuming calculation of the existing 3D point cloud feature descriptors, this paper proposes a simple, efficient and robust 3D point cloud feature descriptor (DCA) based on the average distance, curvature change and normal angle of the neighborhood points.
(2)The performance of feature descriptor DCA and three commonly used feature descriptors are compared on 5 public point cloud data sets. The advantages and disadvantages of each feature descriptor are summarized to provide reference for the reasonable selection of 3D feature descriptor.
(3)Aiming at the problem that ICP algorithm depends on the initial value, this paper proposes an ICP algorithm based on DCA features of point cloud. This method first calculates the feature descriptor (DCA) of point cloud, then finds the correspondence between point clouds based on the feature description and obtains the initial registration result, and finally, the result is used as the initial value of ICP to achieve accurate registration.
(4)The accuracy and speed of the ICP registration algorithm based on DCA feature of point cloud are evaluated on 5 public data sets. It can robustly complete the accurate registration of point cloud in real time.

The rest of this paper is organized as follows. In Section 2, the work on coarse registration algorithms and fine registration algorithms of point cloud is reviewed. Section 3 introduces the method of calculating three dimensional feature descriptors based on the geometric features of point cloud such as curvature, surface normal, density, etc. Section 4 details the method to achieve ICP registration based on this feature descriptor. Section 5 verifies the method hereunder through experiments. Conclusions are in Section 6.

2. Related work

Based on the “coarse to fine” strategy of point cloud registration, the work is done in two steps. First, establish the correspondence between the two point clouds to estimate a coarse registration; second, carry out fine tuning through the fine registration algorithm [5,6,7].

2.1 Coarse registration algorithms

Coarse registration algorithms can be divided into two categories [8]: one is based on global features [9], and the other one is based on local features [10].

Global feature descriptors such as shape contexts [11] and extended Gaussian images (EGI) [12] have rotation invariance. They can be used to construct sparse feature correspondence, and then use algorithms of random sampling [13] and Hough transforms [14] to obtain point cloud transformation relationship. The method based on global features requires offline training and online point cloud segmentation in advance, and is sensitive to interference such as self-occlusion and cluttered information. The method based on local features is more robust to interference such as partial overlap. Therefore, the point cloud registration method based on local features is mainly considered in this paper.

SI (Spin Image) [15] is the most commonly used method in the field of 3D feature description. Firstly, it uses the normal vector of the specified point as the local reference axis (LRA), calculates the projection distance of the vector from the specified point to the local neighborhood point on the normal and tangent plane of the point, then divides the neighborhood points into different distance intervals according to their distance, and finally counts the number of points in each interval as the SI characteristic value of the point. The 3D feature descriptor FPFH (Fast Point Feature Histograms) [16] calculates the normal angle between different neighborhood points and a specified point (Point Feature Histogram, PFH characteristics [17]), before weighting and summing these features. In the open source point cloud library PCL [18], FPFH currently occupies the smallest memory. The feature descriptor SHOT (Signature of Histograms of Orientations) [19] first establishes local reference coordinates according to the distribution of the local point cloud of the specified point, fixes the local point cloud to the reference coordinate system, and then divides the local point cloud into different blocks. For each block, count up the histograms of the normal angle distribution. Finally, all the histograms are concatenated. At present, SHOT has achieved good performance in terms of calculation efficiency, discrimination and robustness [20]. The feature descriptor RoPS (Rotational Projection Statistics) [21] rotates and projects local point clouds around a specified point onto a local reference coordinate system, then obtains the density distribution matrix through discretization, and finally calculates the Shannon entropy of the density distribution matrix. As for feature matching, RoPS is the best performing feature descriptor [20]. As a relatively new feature descriptor, TOLDI (Triple Orthogonal Local Depth Images) [22] obtains three orthogonal depth maps of a specified point through the projection discretization operation, and directly uses the pixel values of the depth maps to concatenate as the description of the specified point. Its dimension has reached 1,200, but this method has not been publicly implemented up to now. Please refer to the literature [20] for the detailed progress and comparison of the local feature descriptors for 3D point cloud. Choi [23] used local geometric features to match 2.5D scenes for the first time, but they only tested the performance of FPFH, and the accuracy was not high.

In recent years, many scholars begin to apply deep learning to the feature matching of 3D point cloud. Xiao [24] proposed 3DMatch to apply deep learning to 3D scene splicing and global reconstruction. He used CAD data as training set, employed convolution DBN network to build a 3DShapeNets, and realized the splicing and reconstruction of 3D data. Zeng [25] proposed self supervised feature learning based on 3DMatch to get local feature descriptors. Guo [26] used the feature descriptor ConvNet of two-dimensional structure to match the geometric features of local point clouds, but this method is only suitable for relatively simple scenes and still needs to be manually labeled. Li [27] used the self supervised deep learning method of 3D convolution neural network to build a model that can describe the geometric characteristics of 3D data, thus realizing the registration of point clouds. However, this method still needs to manually provide a small number of two tags: matched pairs and unmatched pairs.

2.2 Fine registration algorithms

Fine registration algorithms can be divided into 3 categories: iterative closest point algorithms, normal distribution transform algorithms and random sample consensus algorithms [4].

The advantage of Iterative Closest Point (ICP) algorithm is that there is no need to estimate the position or extract features, but ICP algorithm have some shortcomings. Therefore, many researchers have adjusted and improved the four steps of the original ICP algorithm. Mitra [28] designed an error measurement function. When the distance between corresponding point pairs approaches infinity, the measurement function is the distance from point to point; when the distance between corresponding point pairs is close to 0, the measurement function is the distance from point to surface. Chen [29] used the sum of squared distances from the point to the tangent plane of the corresponding point on CAD model surface as an error function and mathematically deduced a equivalent solution of a linear matrix equation. Segal [30] used the point-to-point distance, point-to-plane or plane-to-plane distance as the registration feature metric in their G-ICP algorithm. On the basis of simplicity and rapidity of the original ICP algorithm, it improved the robustness and accuracy of the algorithm and avoided complicated parameter selection. In addition, in order to speed up the registration process and avoid falling into local minimum, some researchers replace the calculation of point pairs by calculating features [31,32]. This method can effectively reduce the influence of noise.

The normal distribution transform (NDT) first divides the 3D point cloud into small 3D cells, converts the point cloud data in each 3D cell into a continuously differentiable probability density distribution function, and then uses the Hessian matrix method to solve the matching between other point clouds [33]. The application of NDT algorithm in mobile robots is very common, mainly because the robot can obtain the positional relationship between two points during measurement. Directly through the initial transformation, the point cloud registration can be realized quickly and simply by using NDT algorithm. Hong [34] and Miao [35] proposed effective methods for the construction of point cloud cell size. In the process of point cloud registration, we can use feature-based coarse registration method, and then use NDT algorithm to achieve accurate registration. However, in large-scale complex geographical environment, there is still a lack of applied research.

With the development of lidar technology in recent years, the application of RANSAC [36] method in 3D point cloud registration has become an important research field. The RANSAC algorithm consists of three steps. Firstly, some points are randomly selected from the point cloud to calculate the transformation relationship. Secondly, the transformation relationship is used to remove some external points in point cloud and calculate the registration degree of point cloud. Finally, we use iteration to find the data set with the maximum registration degree, and use the data set to calculate the transformation relationship. This process is similar to ICP algorithm, but it can avoid the iteration of the whole point cloud. Combined with the point cloud feature operator, RANSAC can effectively solve the problem of 3D point cloud registration and improve the registration efficiency [37,38].

3. DCA Feature description calculation

In this section, a new point cloud local feature descriptor (DCA) is proposed based on the normal vector, curvature and density of point cloud. Based on the neighborhood of the feature points of the point cloud, the average distance, curvature change and normal angle of the neighborhood points corresponding to the point are calculated respectively, and finally the 3D feature descriptor of the point is obtained. This feature description is mainly divided into three steps: 1) calculating the average distance between the neighborhood points; 2) calculating the curvature; and 3) calculating the normal angle. The flow of the whole algorithm is shown in Fig. 1.

Fig. 1. DCA feature description process: (a) A feature point of the target point cloud (b) calculate the K-Neighborhood point around the feature point (c) calculate the average neighborhood distance, curvature change and normal angle of the neighborhood point (d) combine the three features to get the final 3D DCA feature.

Download Full Size | PDF

3.1 Average distance between neighborhood points

For the $ k$ neighborhood, the number of neighborhood points is consistent in the same number of k neighborhoods, no matter whether the sampling surface is regular or not. It is assumed that the point cloud data set is $G = \{g_i\},i = 1, \cdots ,N$, where $({{x_i},{y_i},{z_i}} )$ are the 3D coordinates of the point cloud ${g_i}$, and N is the number of point cloud data points. $M({g_i} )= \{g_{ij}\},1 \le j \le k$ is the k neighborhood of point ${g_i}$, that is, the k points closest to point ${g_i}$ in space.

The average distance between a point and its neighboring points reflects the feature of the point cloud. When the average distance between the point and its surrounding neighborhood points is small, the point cloud distribution is relatively dense and forms generally the feature area of the point cloud; on the other hand, when that is large, the point cloud distribution is relatively sparse and forms generally the smooth area of the point cloud. The average distance ${\omega _{nb}}({g_i})$ between the point and the neighborhood point is used as a distance parameter to distinguish the features of the point cloud.

The average distance of neighborhood points from a point can be expressed with the following formula:

(1)$${\omega _{nb}}({g_i} )= \frac{1}{k}\sum\limits_{{g_j} \in M({g_i} )} {|{{g_i} - {g_j}} |}$$

where ${g_j}$ is the neighborhood point of ${g_i}$.

3.2 Curvature change

The normal and curvature information of the surface is an important geometric feature of surface recognition. Therefore, the normal direction and curvature of each data point should be calculated [39].

First, the normal direction of point cloud data ${g_i}$ is calculated. Finding the plane equation is transformed into finding $a,b,c,d$ four parameters. Assuming that the set of plane points to be fitted is $({{x_i},{y_i},{z_i}} ),i = 1,2, \cdots ,n$, then the distance from any point to the plane is

(2)$${d_i} = |{ax + by + cz - d} |$$

To get the best plane fit, Lagrange method is needed to solve the extreme value

(3)$$f = \mathop \sum \nolimits_{i = 1}^n d_i^2 - \lambda ({{a^2} + {b^2} + {c^2} - 1} )$$

By calculation

(4)$$C{e_m} = {\lambda _m}{e_m},m \in \{1,2,3 \}$$

It is transformed into solving the problem of eigenvalues and eigenvectors of matrix C, where matrix C is a covariance matrix of n points, and ${({a,b,c} )^T}$ is an eigenvector of the matrix. Corresponding to each data point ${g_i}$ of the point cloud, the corresponding matrix C is

(5)$$C = \frac{1}{k}\mathop \sum \nolimits_{i = 1}^k \left( {{g_i} - {{\mathop g\limits^ - }_i}} \right) \cdot {\left( {{g_i} - {{\mathop g\limits^ - }_i}} \right)^T}$$

where k is the number of neighborhood points and ${\mathop g\limits^ - _i}$ is the center of the neighborhood point set $M({g_i} )$ of point ${g_i}$. Therefore, the eigenvector corresponding to the smallest eigenvalue is the normal vector of point ${g_i}$.

It may be assumed that ${\lambda _1} \le {\lambda _2} \le {\lambda _3}$, ${\lambda _1}$ describes the variation of the curved surface in the normal direction, and ${\lambda _2}$ and ${\lambda _3}$ indicate the distribution of data point ${g_i}$ on the tangent plane. The surface variation of the data point ${g_i}$ in the k neighborhood is

(6)$${\tau _i}\textrm{ = }\frac{\lambda _1}{({{\lambda_1} + {\lambda_2} + {\lambda_3}} )}$$

The curvature ${H_i}$ [40] of the point cloud model at the data point ${g_i}$ can be approximated as the surface variation ${\tau _i}$ at that point.

In Fig. 2, the data points on the model are represented by red dots, the neighbor points of the corresponding data points are represented by hollow dots, and the straight line represents the tangent plane at that point. The distance between the neighborhood point of point ${g_1}$ and the tangent plane is big, indicating that the point is located in a curve with a large curvature, so the point is located in the feature area. The distance between the neighborhood point of point ${g_2}$ and the tangent plane is smaller, indicating that the point is located on a curve with a smaller curvature, so the point is located in the non-featured area.

Fig. 2. Curvature of feature point and non-feature point.

Download Full Size | PDF

3.3 Angle between data point normal direction and neighborhood point normal direction

As an important geometric feature for surface recognition, the change of the normal angle reflects the degree of curvature or flatness of the point cloud surface. If it is assumed that the data point ${g_i}$ is one of any points in the point cloud model G, ${g_j}$ is the neighborhood point of ${g_i}$, and their normal directions are ${n_{g_i}}$ and ${n_{g_j}}$, respectively, then the normal cosine of ${g_i}$ and ${g_j}$ can be expressed with the following formula:

(7)$$\cos {\theta _{{g_i}{g_j}}} = \frac{{n_{g_i}} \cdot {n_{{g_j}}}}{|{{n_{g_i}}} ||{{n_{{g_j}}}} |}$$

where the value range of ${\theta _{{g_i}{g_j}}}$ is $[{0,\pi } ]$.

In order to obtain the normal angle parameter ${\omega _a}({g_i} )$ between the data point ${g_i}$ and its neighborhood point, the normal angles between ${g_i}$ and all neighborhood points are summed as follows

(8)$${\omega _a}({g_i} )= \sum\limits_{g_{j \in M({g_i} )}} {\theta _{{g_i}{g_j}}}$$

The normal angle between the neighborhood points in the feature area and the non-feature area is shown in Fig. 3, where the black points are the data points on the model, the hollow ones are the neighborhood points of the corresponding data points, and the number of neighborhood points is $k = 4$. Point ${g_3}$ is located on a curve with a greater degree of curvature, and its normal angle with the neighborhood point is also larger, the value of the normal angle parameter ${\omega _a}({g_i} )$ is also large, so the point is located in the feature area; on the other hand, the point ${g_8}$ is located on a curve with a small degree of curvature, and its normal angle with the neighborhood point is also small, the value of the normal angle parameter ${\omega _a}({g_i} )$ is also small, so the point is located in the non-featured area.

Fig. 3. The normal angle between data point and neighbor point.

Download Full Size | PDF

3.4 Parameter analysis

After analyzing the different data, we know that the quantity of neighborhood points should depend on the density of the point cloud data and the uniformity of their distribution. When the point cloud density is large, the quantity may be smaller and generally is 10-30. When the point cloud contains noise, the quantity needs to be larger and generally is 50-100. Please refer to the section below on experimental for the comparison of the quantity of neighborhood points.

4. ICP registration based on DCA feature description

As an iterative algorithm, the ICP algorithm requires a good initial value, not too large transformation between the two point clouds, and not too much occlusion; otherwise, the iteration cannot converge to a correct result. To solve this problem, this paper uses the 3D local feature descriptor, DCA, based on the density, curvature, and normal information of the point cloud mentioned above to find the corresponding relationship of the point cloud, obtain the initial registration result, and achieve fine-tuning of registration results with the result above as the initial value of ICP. The algorithm framework is shown in Fig. 4.

Fig. 4. ICP registration process based on point cloud DCA features.

Download Full Size | PDF

4.1 Feature point detection

Feature point detection is a prerequisite for subsequent point cloud registration. Sparse sampling of point clouds is the most commonly used feature detection method. Therefore, this paper uses point cloud sparse sampling to achieve feature point detection.

4.2 Feature description and matching

The purpose of feature description is to extract the feature information of each point in the point cloud to make it distinguishable. In this paper, 3D floating-point DCA feature descriptors are used to describe the feature information of each point in point cloud. Then the feature matching problem turns into the problem of searching the corresponding point of high-dimensional data. The feature description information of each point cloud is N 3-dimensional floating-point data. Use FLANN (Fast Library for Approximate Nearest Neighbors) to search for the closest point of high-dimensional Euclidean distance. If the distance is less than the threshold $\tau $, then this point is regarded as its corresponding pairing point. Herein, $\tau $ is set to 0.9.

4.3 Mismatch elimination

Most of the mismatches have been eliminated by matching based on point cloud DCA features. In order to increase the registration accuracy of the two point clouds, this paper uses the RANSAC algorithm to remove a small number of existing mismatches. The RANSAC algorithm divides the data into “inner points” and “noise”, and then iteratively calculates the model parameters for a small amount of sampled data to obtain the best results for data fitting. Assuming that n pairs of matching points are obtained after the aforementioned feature matching, the process of using the RANSAC algorithm to achieve mismatch elimination and obtaining the position transformation relationship between point clouds is as follows:

(1) In order to determine the 3D transformation relationship, randomly select 3 of the n pairing results and calculate the initial transformation relationship ${T_{i0}}$ of the point cloud;
(2) Verify whether the remaining n-3 pairing results satisfy the transformation relationship ${T_{i0}}$. If the error between the transformed point and its original registration point is less than a certain threshold, the pairing is considered to be an inner point, otherwise it is noise;
(3) When the number of inner points reaches a certain number, the position transformation relationship between the point clouds can be expressed by ${T_{i0}}$;
(4) Recalculate the position transformation relationship ${T_{i1}}$ between point clouds through the pairing of all inner points, and count the error under the transformation relationship;
(5) Repeat the above process continuously, and each time you can get a position transformation relationship between point clouds. The transformation relationship with the smallest error is retained as the result of point cloud registration.

The RANSAC algorithm is also robust to obvious outliers, but the time required is uncertain. For the above process, RANSAC needs to iterate $C_n^3$ times. When the feature matching result n is large, it will take a long time to obtain the optimal result. In practical applications, for efficiency reasons, the upper limit of the number of iterations is usually set, so that the results obtained may not be optimal or even wrong. Higher precision of the feature matching result will make it easier for RANSAC to converge to the optimal result, otherwise the convergence speed is slow, and the result is difficult to be guaranteed. In this case, the correctness of the feature matching results is very important.

4.4 ICP pose fine-tuning

The standard ICP algorithm [2] boasts high accuracy and strong applicability. However, the algorithm is easy to fall into local extremum. When the initial value is poor, the algorithm convergence speed is slow or even unable to converge. In this paper, the position transformation relationship between point clouds obtained by RANSAC algorithm is used as the initial value of ICP registration, and the result of point cloud registration is further fine-tuned and optimized.

In this paper, the G-ICP [30] algorithm is used to fine-tune the registration of point clouds based on DCA features. The G-ICP algorithm uses the point-to-point distance, point-to-plane or plane-to-plane distance as the registration feature metric. It improves the robustness and accuracy of the algorithm on the basis of the simplicity and speed of the original ICP algorithm and avoids complex parameter selection. The general form to solve T by G-ICP is

(9)$$T = \mathop {\textrm{argmin}}\limits_T \mathop \sum \nolimits_i d_i^{{(T )}^T}{({C_i^Q + TC_i^P{T^T}} )^{ - 1}}d_i^{(T )}$$

where $d_i^{(T )} = {q_i} - T{p_i}$, $C_i^P$ and $C_i^Q$ are the covariance matrix of the point cloud.

5. Experimental result

This section mainly analyzes the performance of the ICP algorithm based on DCA feature description. The experiments were conducted on five public data sets and compared with the existing typical 3D feature descriptors. Section 5.1 introduces the five data sets and the evaluation indexes involved in the experiment. Section 5.2 compares the registration accuracy of matching algorithms based on different feature descriptors on different public data sets. Section 5.3 compares the robustness of the feature descriptors to neighborhood radius in different data sets and the robustness of different feature descriptors to Gaussian noise. Section 5.4 compares the operating efficiency of different feature descriptors. Finally, Section 5.5 summarizes the performance of the ICP algorithm based on DCA feature description designed in this paper.

5.1 Experimental setup

1) Experimental Data Set
The experiment is conducted on 5 public data sets. The first data set is the Bologna object recognition data set [41], obtained by random rotation and translation of the model in the Stanford 3D scanning warehouse [42]. The second one is the Random View data set [43], a synthetic 2.5D scene obtained by viewing the model randomly in the Stanford 3D scanning warehouse. The third one is the UWA data set [44], acquired using a Minolta Vivid 910 laser scanner. The fourth one is the Kinect data set [45], collected using Microsoft’s first-generation Kinect. The fifth one is the ToF data set [46], collected using MESA-SR4000. These data sets are obtained with different acquisition equipment and represent different qualities of 3D point cloud data, so they can comprehensively evaluate the performance of the ICP algorithm based on DCA feature description. The related information on the data sets used herein are shown in Table 1, and the models of all data sets are shown in Fig. 5.
The 3D feature descriptors involved in the comparison are the FPFH [16], RoPS [21], SHOT [19] and standard ICP registration algorithms. These comparison algorithms are currently widely used and have excellent performance. All experiments were performed on a laptop with Intel i5-8250U@1.6 GHz processor, 8G memory, and running a 64-bit Win 10 system and VS2017 development software.
2) Evaluation Index
In this paper, the precision and recall are used to evaluate the feature descriptors, and the deviation between point clouds is used as the index to evaluate the quality of point cloud matching [20,47]. Precision is the ratio of the number of correct matches to that of all matches. Recall is the ratio of the number of correct matches to that of corresponding features. The deviation between point clouds is the square sum of the corresponding points’ distances between two registered point clouds.
$(10)$$Precision = \frac{\# \textrm{number of correct matches}}{\# \textrm{all matches}}$$$ $(11)$$Recall = \frac{\# \textrm{number of correct matches}}{\# \textrm{corresponding feature total}}$$$ $(12)$$error = \mathop \sum \nolimits_{i = 1}^N {({M_i} - S_i^k)^2}$$$ where ${S^k} = RS + T$, M and S are two point clouds to be registered, N is the number of point clouds, and R and T are rotation and translation matrices.

Fig. 5. Data set used in the experiment.

Download Full Size | PDF

Table 1. Data set used in the experiment

View Table | View all tables in this article

5.2 Accuracy comparison

In order to evaluate the accuracy of the ICP algorithm based on DCA feature description, this section compares the performance of this algorithm with standard ICP algorithm and the ICP algorithm based on FPFH, RoPS and SHOT feature descriptors. We use the 5 public data sets shown in Table 1 to compare and evaluate the performance of the above algorithms. In this paper, the point cloud data are processed by down-sampling in the comparative experiment. Table 2 shows the precision and recall results of various algorithms in different data sets.

Table 2. Precision and recall of various algorithms in different data sets^a

View Table | View all tables in this article

It can be seen from the results above that because the Bologna data set is synthesized point cloud with high quality and can be considered to not contain noise, so the 4 descriptors have high registration precision. When the precision is basically the same, SHOT has the best performance and the highest recall, followed by RoPS, and the algorithm proposed herein is the third. The Random View data set contains occlusion, Gaussian noise, etc. Among them, the result of SHOT is the best, followed by RoPS, and the algorithm proposed herein is the third. Compared with Bologna data set, the performance of the algorithm mentioned herein and FPFH is reduced, which is caused by the influence of Gaussian noise. The UWA data set is less cluttered than the Random View data set, but the occlusion is more serious. It can be seen that the algorithm proposed herein and FPFH are affected by occlusion, and their performance is reduced. The data of the Kinect data set is collected by low- cost Kinect. Compared with the other three data sets, the point cloud quality of the Kinect data set is the worst. On this data set, SHOT and RoPS perform best, while FPFH no longer match. FPFH performance degradation is mainly due to sensitivity to grid resolution. With 3D point cloud, the ToF data set is improved in terms of its quality compared with the Kinect data set. On this data set, SHOT and RoPS still perform best. It is worth noting that FPFH and the algorithm proposed herein have improved performance compared with Kinect data set. This is mainly because the Kinect point cloud is ordered and has similar density, which has a great impact on the algorithm and FPFH.

Table 3 shows the registration results of the Bologna data set, UWA data set, and Kinect data set after the coarse registration and fine tuning of the DCA feature description. From the results in the table, it can be seen that after the initial matching of the DCA features, most of the erroneous matching point pairs have been eliminated, and the two point clouds have initially achieved registration. Partial point cloud registration results are shown in Fig. 6(a, c). After RANSAC further eliminates the wrong matching point pairs and uses the ICP algorithm to achieve registration fine-tuning, the registration error of the point cloud is further reduced. At this time, the two point clouds have achieved accurate registration. Partial point cloud registration results are shown in Fig. 6(b, d).

Fig. 6. ICP registration results based on DCA features.

Download Full Size | PDF

Table 3. Point cloud registration error

View Table | View all tables in this article

In Fig. 6, the left column is the coarse registration result of the point cloud, the right column is the result of fine tuning of the point cloud registration, the red is the model point cloud, and the blue is the scene point cloud.

Table 4 shows the point cloud deviation results of various algorithms in different data sets. It can be seen from Table 4 that the ICP registration results based on different feature descriptors are consistent with the matching accuracy results of various descriptors shown in Table 2. This shows that the correctness of the feature matching result has an impact on the final result of point cloud registration. Higher accuracy of the feature matching result will make it easier for the point cloud registration to converge to the optimal result. While the standard ICP algorithm is more seriously affected by the initial value of the point cloud (Bologna data set), noise (Kinect data set) and occlusion (UWA data set) than the algorithm based on feature description.

Table 4. Point cloud deviation of various algorithms on different data sets^a

View Table | View all tables in this article

5.3 Robustness comparison

In order to verify the robustness of the ICP algorithm based on DCA feature description to the number of k neighborhoods and Gaussian noise interference, this section uses the precision to evaluate the matching performance of the algorithm.

1) Robustness to k Neighborhoods
Similar to the concept of “scale” in the 2D feature descriptor domain, the neighborhood radius is very important for most 3D feature descriptors. Because this paper adopts the k neighborhood approach, we use different numbers of k neighborhoods to evaluate the robustness of our feature descriptors to the changes of k neighborhood numbers on different data sets. The curves for the different numbers of k neighborhoods are shown in Fig. 7. It can be seen from the figure that the precision of DCA feature descriptors increases with the number of k neighborhoods on all data sets until it reaches 1. On the Random View and Kinect data sets, because the density of point cloud is small and there is noise, the number of neighborhood points k is large. The larger the k value is, the stronger the anti-noise capability will be, however the longer the algorithm operation will take.
2) Robustness to Gaussian Noise
In order to evaluate the robustness of feature descriptors to Gaussian noise, we add Gaussian white noise with different standard deviations to the scene point cloud of the Bologna data set. Here we only give the results on the Bologna data set, because the remaining four data sets already contain occlusion and real noise. The results under different Gaussian noises are shown in Fig. 8. It can be seen from the results that with the improvement of the Gaussian noise level, the precision of the SHOT and RoPS descriptors are less affected and they show strong robustness. FPFH is seriously affected by Gaussian noise, but the algorithm proposed herein is robust to Gaussian noise on the whole.

Fig. 7. Robustness for the number of different k neighborhoods.

Download Full Size | PDF

Fig. 8. Robustness for different Gaussian noise.

Download Full Size | PDF

5.4 Speed comparison

Speed is another important factor in practical application. After all, the amount of 3D point cloud data is huge. For example, point cloud matching algorithm is needed to achieve real-time registration in driverless applications. This section mainly compares the efficiency of different feature descriptors from the perspectives of the time spent in feature description, matching and ICP registration. Here we only give the results on the Bologna, UWA and Kinect data sets. Because these three data sets represent different point cloud quality and are also the most commonly used data collection methods at present [48,49].

The average time consumption of point cloud registration based on feature description is summarized in Tables 5, 6, and 7. The calculations of running time in the tables are obtained with similar accuracy. It can be seen from the results that the feature description process of the algorithm proposed herein is the fastest among all the descriptors in different data sets, mainly because the algorithm has a low calculation dimension. FPFH is the second fastest in the feature description process because its feature dimension is 33. With the increase of feature dimension, the time of feature description grows longer. However, it should be noted that the feature dimension of RoPS is lower than of that of SHOT, but its feature description time is longer than that of SHOT. This is mainly because the 3D point cloud needs to be triangulated before the RoPS feature description, which takes some time. The average time of the feature matching process is directly proportional to the feature dimension of the descriptor. The larger the feature dimension is, the longer the matching takes, because feature matching uses FLANN search. Thanks to the accuracy of the feature matching results, the ICP algorithm based on the four feature descriptors converges in a short time. It can also be seen from the three tables that the process of feature description and feature matching takes more than 90% of the total time in the point cloud matching process based on feature description.

Table 5. Average operation time of various algorithms on the bologna data set^a

View Table | View all tables in this article

Table 6. Average operation time of various algorithms on the uwa data set^a

View Table | View all tables in this article

Table 7. Average operation time of various algorithms on the Kinect data set^a

View Table | View all tables in this article

In general, the algorithm proposed herein is the fastest, and FPFH is the second. On the contrary, SHOT and RoPS have the lowest matching efficiency because of their high feature dimensions.

5.5 Performance summary

According to the experimental results in Sections 5.2–5-4, the following conclusions can be drawn:

1) The performance of point cloud registration algorithms based on different feature descriptors under different evaluation indexes (accuracy, robustness, and speed) depends on the tested data set and the choice of algorithm parameters. In practical application, we should select the appropriate algorithm and set the reasonable parameters according to the characteristics of the data in order to obtain the best performance.
2) In terms of precision and robustness, RoPS and SHOT are the best and represent the top level in the field of 3D feature description [50,51]. The algorithm proposed herein still has room for improvement in terms of precision and robustness. But it is the complexity of RoPS and SHOT in the feature description that leads to the too long algorithm running. The algorithm proposed herein also has low feature dimension and fast calculation speed, and can be applied to some real-time systems.

6. Conclusions

As the most widely used point cloud registration, the ICP algorithm requires a good initial value, not too large transformation between the two point clouds, and not too much occlusion; otherwise, the iteration cannot converge to a correct result. To solve this problem, this paper proposes an ICP registration algorithm based on the local features of point clouds. With this method, a robust and efficient 3D local feature descriptor (DCA) is firstly designed by combining the density, curvature, and normal information of the point clouds, then based on the feature description, the correspondence between the point clouds and also the initial registration result are found, and finally, the aforementioned result is used as the initial value of ICP to achieve fine tuning of the registration result. By comparing with the most widely used ICP algorithm based on three feature descriptors on different data sets, we can know that the algorithm has faster speed under the same registration accuracy, and also boosts stronger robustness, so it is more suitable for point cloud registration systems.

Funding

National Natural Science Foundation of China (61605054, 61772012); Fundamental Research Funds for the Central Universities (CCNU18TS041, CCNU19TD007); Wuhan Application Foundation Frontier Project (2020010601012190).

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J. Chen, X. Wu, M. Y. Wang, and F. Deng, “Human body shape and motion tracking by hierarchical weighted ICP,” in International Symposium on Visual Computing (ISVC), Las Vegas, USA, (2011).

2. P. J. Besl and N. D. Mckay, “A method for registration of 3-D shapes,” IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 239–256 (1992). [CrossRef]

3. J. Wang and M. Wang, “3D reconstruction of shoe-last based on binocular stereo vision,” Computer Technology and Development 19(4), 224–230 (2009).

4. L. Cheng, C. Song, X. Liu, H. Xu, and Y. Chen, “Registration of laser scanning point clouds: a review,” Sensors 18(5), 1641–1665 (2018). [CrossRef]

5. J. Yang, H. Li, D. Campbell, and Y. Jia, “Go-ICP: A globally optimal solution to 3D ICP point-set registration,” IEEE Trans. Pattern Anal. Mach. Intell. 38(11), 2241–2254 (2016). [CrossRef]

6. X. Huang, J. Zhang, Q. Wu, L. Fan, and C. Yuan, “A coarse-to-fine algorithm for registration in 3D street-view cross-source point clouds,” International Conference on Digital Image Computing Techniques and Applications, Gold Coast, Australia, (2016).

7. Y. He, B. Liang, J. Yang, S. Li, and J. He, “An iterative closest points algorithm for registration of 3D laser scanner point clouds with geometric features,” Sensors 17(8), 1862–1877 (2017). [CrossRef]

8. A. Aldoma, Z. C. Marton, F. Tombari, W. Wohlkinger, and M. Vincze, “Tutorial: Point cloud library: Three-dimensional object recognition and 6 DOF pose estimation,” IEEE Robot. Automat. Mag. 19(3), 80–91 (2012). [CrossRef]

9. A. Rhodes, E. Kim, J. A. Christian, and T. Evans, “Lidar-based relative navigation of non-cooperative object suing point cloud descriptors,” AIAA/AAS Astrodynamics Specialist Conference, Long Beach, California, (2016).

10. J. Yang, Y. Xiao, and Z. Cao, “Aligning 2.5D scene fragments with distinctive local geometric features and voting-based correspondences,” IEEE Trans. Circuits Syst. Video Technol. 29(3), 714–729 (2019). [CrossRef]

11. S. Belongie, J. Malik, and J. Puzicha, “Shape matching and object recognition using shape contexts,” IEEE Trans. Pattern Anal. Machine Intell. 24(4), 509–522 (2002). [CrossRef]

12. A. Makadia, A. Patterson, and K. Daniilidis, “Fully automatic registration of 3d point clouds,” IEEE Conference on Computer Vision and Pattern Recognition1297–1304 (2006).

13. R. B. Rusu, N. Blodow, and M. Beetz, “Fast point feature histograms (FPFH) for 3D registration,” IEEE International Conference on Robotics and Automation, 3212–3217, (2009).

14. O. J. Woodford, M. T. Pham, A. Maki, F. Perbet, and B. Stengere, “Demisting the Hough transform for 3D shape recognition and registration,” Int. J. Comput. Vis. 106(3), 332–341 (2014). [CrossRef]

15. A. E. Johnson and M. Hebert, “Using spin images for eﬃcient object recognition in cluttered 3D scenes,” IEEE Trans. Pattern Anal. Mach. Intell. 21(5), 433–449 (1999). [CrossRef]

16. R. B. Rusu, N. Blodow, and M. Beetz, “Fast point feature histograms (FPFH) for 3D registration,” IEEE International Conference on Robotics and Automation, 3212–3217, (2009).

17. R. B. Rusu, N. Blodow, Z. C. Marton, and M. Beetz, “Aligning point cloud views using persistent feature histograms,” IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 3384–3391, (2008).

18. R. B. Rusu and S. Cousins, “3D is Here: Point Cloud Library (PCL),” IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, (2011).

19. F. Tombari, S. Salti, and L. D. Stefano, “Unique signatures of histograms for local surface description,” European Conference on Computer Vision Conference on Computer Vision, 356–369, (2010).

20. Y. Guo, M. Bennamoun, F. Sohel, L. Min, J. Wan, and N. M. Kwok, “A comprehensive performance evaluation of 3D local feature descriptors,” Int. J. Comput. Vis. 116(1), 66–89 (2016). [CrossRef]

21. Y. Guo, F. Sohel, M. Bennamoun, M. Lu, and J. Wan, “Rotational projection statistics for 3D local surface description and object recognition,” Int. J. Comput. Vis. 105(1), 63–86 (2013). [CrossRef]

22. J. Yang, Q. Zhang, Y. Xiao, and Z. Cao, “TOLDI: An eﬀective and robust approach for 3D local shape description,” Pattern Recognition 65, 175–187 (2017). [CrossRef]

23. S. Choi, Q. Y. Zhou, and V. Koltun, “Robust reconstruction of indoor scenes,” IEEE Conference on Computer Vision Pattern Recognition, Boston, USA, 5556–5565, (2015).

24. Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, “3D shapeNets: a deep representation for volumetric shapes,” IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, (2014).

25. A. Zeng, S. Song, M. Niebner, M. Fisher, J. Xiao, and T. Funkhouser, “3D Match: learning local geometric descriptors from RGB-D reconstruction,” International Conference on Computer Vision and Pattern Recognition, 199–208, (2017).

26. K. Guo, D. Zou, and X. Chen, “3D mesh labeling via deep convolutional neural networks,” ACM Trans. Graph. 35(1), 1–12 (2015). [CrossRef]

27. J. Li, X. Yang, and B. He, “Geometric features matching with deep learning,” Computer Sci. 46(7), 274–279 (2019). [CrossRef]

28. N. J. Mitra, “Algorithms for Comparing and Analyzing 3D Geometry,” California: Stanford University, 13–20, (2007).

29. J. Chen, X. Wu, M. Wang, and X. Li, “3D Shape Modeling Using a Self-developed Hand-held 3D Laser Scanner and an Efficient HT-ICP Point Cloud Registration Algorithm,” Opt. Laser Technol. 45, 414–423 (2013). [CrossRef]

30. A. V. Segal, D. Haehnel, and S. Thrun, “Generalized-icp,” Robotics: Science and Systems, Seattle, USA, (2009).

31. F. Pomerleau, F. Colas, and R. Siegwart, “A review of point cloud registration algorithms for mobile robotics,” FNT in Robotics 4(1), 1–104 (2015). [CrossRef]

32. Y. Díez, F. Roure, X. Lladó, and J. Salvi, “A qualitative review on 3d coarse registration methods,” ACM Comput. Surv. 47(3), 1–36 (2015). [CrossRef]

33. A. Das and S. Waslander, “Scan registration using segmented region growing NDT,” Int. J. Robotic. Res. 33(13), 1645–1663 (2014). [CrossRef]

34. H. Hong and B. H. Lee, “Key-layered normal distributions transform for point cloud registration,” Electron. Lett. 51(24), 1986–1988 (2015). [CrossRef]

35. Y. Miao, Y. Liu, H. Ma, and H. Jin, “The pose estimation of mobile robot based on improved point cloud registration,” J. Int. Robotic Sys. 13(2), 52 (2016). [CrossRef]

36. M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM 24(6), 381–395 (1981). [CrossRef]

37. X. Wang, L. Yang, H. Wei, and L. Feng, “An ASIFT-based local registration method for satellite imagery,” Remote Sensors 7(6), 7044–7061 (2015). [CrossRef]

38. X. Bo, W. Jiang, S. Jie, J. Zhang, and L. Li, “Investigation on the weighted ransac approaches for building roof plane segmentation from LiDAR point clouds,” Remote Sensors 8(1), 5 (2015). [CrossRef]

39. Y. He, B. Liang, J. He, and S. Li, “Non-cooperative Spacecraft Pose Tracking Based on Point Cloud Feature,” Acta Astronaut. 139, 213–221 (2017). [CrossRef]

40. M. Pauly, M. Gross, and L. Kobbelt, “Efficient simplification of point-sampled surfaces,” Visualization IEEE Computer Society 1(4), 163–170 (2002).

41. http://vision.deis.unibo.it/research/80-shot.

42. http://graphics.stanford.edu/data/3Dscanrep/

43. http://vision.desi.unibo.it/keypoints3d

44. http://staﬀhome.ecm.uwa.edu.au/00053650/recognition.html

45. http://rgbd-dataset.cs.washington.edu/index.html

46. http://www.umiacs.umd.edu/research/POETICON/umd_complex_activities/

47. A. G. Buch, H. G. Petersen, and N. Krüger, “Local shape feature fusion for improved matching, pose estimation and 3D object recognition,” SpringerPlus 5(1), 297 (2016). [CrossRef]

48. F. Tombari, S. Salti, and L. D. Stefano, “Performance evaluation of 3D keypoint detectors,” Int. J. Comput. Vis. 102(1-3), 198–220 (2013). [CrossRef]

49. L. Wolf, T. Hassner, and Y. Taigman, “Eﬀective unconstrained face recognition by combining multiple descriptors and learned background statistics,” IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 1978–1990 (2011). [CrossRef]

50. Y. Zou, X. Q. Wang, T. Zhang, B. Liang, and J. Song, “BRoPH: An eﬃcient and compact binary descriptor for 3D point clouds,” Pattern Recognition 76(4), 522–536 (2018). [CrossRef]

51. Y. Zou, T. Zhang, X. Q. Wang, Y. He, and J. Song, “BRoPH: A compact and eﬃcient binary 3D feature descriptor,” IEEE International Conference on Robotics and Biomimetics, China, 1093–1098, (2016).

Data set	Feature Points	DCA		FPFH		RoPS		SHOT
Data set	Feature Points	Precision	Recall	Precision	Recall	Precision	Recall	Precision	Recall
Bologna	3951	0.994	0.51	0.97	0.24	0.926	0.92	1	0.95
Random View	3983	0.968	0.81	0.724	0.016	0.998	0.99	1	0.99
UWA	3611	1	0.19	1	0.01	1	1	1	0.99
Kinect	717	0.976	0.06	—	—	1	0.94	1	0.95
ToF	2535	0.994	0.54	1	0.92	1	0.97	1	0.99

Data Set	Feature Points	Coarse Registration		Fine Tuning
Data Set	Feature Points	Match Point Pair	Point Cloud error	Match Point Pair	Point Cloud error
Bologna	3951	3571	0.2253	3565	4.84e-05
UWA	3611	684	253.92	682	0.068
Kinect	717	43	3.86e-08	43	2.83e-08

Data Set	DCA	FPFH	RoPS	SHOT	ICP
Bologna	0.946	26.249	38.692	6.749e-08	258.135
Random View	2.056e-04	0.016	6.883e-04	3.509e-11	0.178
UWA	3.919e-04	9.761e-06	0.01	0.012	—
Kinect	0.646	—	2.941e-06	2.943e-06	41.594
ToF	3.769e-03	2.347e-08	1.301e-09	1.3e-09	3.18

Descriptor	Dimension	Feature Description (ms)	Feature Matching (ms)	RANSAC (ms)	ICP (ms)	Total (ms)
DCA	3	838	136	65	25	1064
SHOT	352	19968	5456	49	59	25532
RoPS	135	41426	2134	32	32	43677
FPFH	33	3161	755	14	14	3986

Descriptor	Dimension	Feature Description (ms)	Feature Matching (ms)	RANSAC (ms)	ICP (ms)	Total (ms)
DCA	3	1335	111	9	5	1460
SHOT	352	30546	3855	6	78	34485
RoPS	135	40652	2098	62	28	42840
FPFH	33	2023	315	44	15	2397

ICP registration with DCA descriptor for 3D point clouds

Abstract

1. Introduction

2. Related work

2.1 Coarse registration algorithms

2.2 Fine registration algorithms

3. DCA Feature description calculation

3.1 Average distance between neighborhood points

3.2 Curvature change

3.3 Angle between data point normal direction and neighborhood point normal direction

3.4 Parameter analysis

4. ICP registration based on DCA feature description

4.1 Feature point detection

4.2 Feature description and matching

4.3 Mismatch elimination

4.4 ICP pose fine-tuning

5. Experimental result

5.1 Experimental setup

5.2 Accuracy comparison

5.3 Robustness comparison

5.4 Speed comparison

5.5 Performance summary

6. Conclusions

Funding

Disclosures

Data availability

References

Data availability

Cited By

Figures (8)

Tables (7)

Equations (12)

Optics Express

No	Data Set	Collection Method	Point Cloud Quality	Dimension	Model
1	Bologna	synthesis	high	3D	Happy Buddha
2	Random View	synthesis	high	3D	Armadillo
3	UWA	Minolta Vivid 910	medium	3D	Cheff
4	Kinect	First generation Kinect	low	2.5D	Pliers
5	ToF	MESA SR4000	low	3D	Human

Descriptor	Dimension	Feature Description (ms)	Feature Matching (ms)	RANSAC (ms)	ICP (ms)	Total (ms)
DCA	3	1034	19	4	4	1061
SHOT	352	1446	838	13	14	2311
RoPS	135	30829	777	12	13	31631
FPFH	33	—	—	—	—	—