Accurate three dimensional body scanning system based on structured light

Haosong Yue; Yue Yu; Weihai Chen; Xingming Wu

doi:10.1364/OE.26.028544

1. Introduction

In recent years, three dimensional (3D) body scanning has gained a lot of interest [1–4]. This technology reconstructs different parts of a human body from multi-views and then integrates these parts into one 3D model. It can be used in a variety of fields, such as virtual reality, animation production, anthropometry, and medical plastic.

There are many approaches to reconstruct 3D objects or scenes. For example, Davison et al. [5] reconstructed dense scenes using a single uncontrolled camera, which is known as structure from motion method. Pons et al. [6] proposed a reconstruction algorithm using stereo cameras. These camera based algorithms extract and match feature points from images. Therefore they require the objects or scenes to have a certain amount of textures. There are also researchers utilize laser range finders to reconstruct the world [7, 8]. Laser sensors are quite accurate, but they are expensive and their resolutions are low.

Recently, structured light based reconstruction algorithms have been developed rapidly. They project one or several pattern images onto the objects and capture the deformed patterns using one or two cameras. Then 3D points can be calculated using the triangulation method. The key of these algorithms is phase estimation on every single pixel. Many researchers have designed different kinds of gray-scale or color patterns and only one-shot was needed to reconstruct 3D objects [9–12]. These algorithms can achieve high frame rates, however their spatial resolutions are not high enough. In order to get a better resolution, phase shifting methods are usually used [13–15]. Gray-code and other period cues can be combined to unwrap high frequency phase data into a single period [16–18]. Structured light based algorithms do not require objects to have complex textures. Therefore they are quite suitable for human body scanning.

To obtain the complete model of the scanned object, three strategies are commonly used to gather data from different views. The first strategy is fixing the object and moving the sensor [19]. The second strategy is fixing the sensor and rotating the scanned object. For example, Cai et al. [20] put the object to be scanned on a turning disc. The last one is using multiple sensors and all the sensors and the object are fixed [21]. The first two strategies are quite suitable for small objects. For whole human body scanning, the last configuration is more desirable.

No matter which scanning strategy is adopted, registration of points reconstructed from different views is inevitable. Barone et al. [22] used fiducial markers to align different range maps. Instead of artificial markers, point features can be used for data registration, such as SIFT (scale-invariant feature transform) [23], SURF (speeded-up robust features) [24], FAST (features from accelerated segment test) [25], and Shi-Tomasi [26] features. There are also algorithms that use plane features [27] and line features [28]. The most commonly used points registration algorithms are based on ICP (iterative closest point) [29, 30]. These algorithms can produce accurate 3D models by minimizing an error function between different frames. However there are still registration errors, especially between the first and the last data frames.

In this paper, we propose an accurate 3D body scanning system based on structured light technology. The system consists of eight projectors and eight cameras to scan human body from different views. Each projector and camera pair is calibrated using a red-and-blue checkerboard. A four-step phase shifting combined with Gray-code method is used to match pixels in projector and camera planes. The relations between pixels and 3D point coordinates are derived. After reconstruction from each view, an improved global ICP based algorithm is proposed to align different point clouds. We also propose a graph optimization method to further minimize registration errors. A series of experiments have been done and the results demonstrate that our algorithm can produce accurate 3D body models conveniently.

2. Principle

In this section, our proposed 3D body scanning system is discussed in detail. The configuration and topology of the system are introduced first. Then we present the absolute phase calculation and system calibration methods. Next, relations between pixels and 3D point coordinates are derived. Finally, our proposed global registration and optimization algorithms are presented.

2.1. System overview

The configuration of the proposed 3D body scanning system is illustrated in Fig. 1. There are four poles in the system, which are placed clockwise in the bird eye view. The poles are located at the corners of a square, whose side length is approximately 1.4 meters. Each pole has two modules mounting vertically, as surrounded by dashed boxes in Fig. 1. Each of the modules is consisted of a projector, a camera, and a computer. The computer controls the projector to cast a series of fringe patterns onto the object to be scanned and capture the reflection images according to the camera. The cameras of the lower and the upper modules are approximately 0.3 and 1.0 meters high, respectively. The distance between the projector lens and the camera lens is approximately 0.4 meters. These modules point at the center of the square and therefore can scan different parts of a human body. In our implementation, the computer has a 2 Core 1.6GHz CPU (i5-4200U), 2G RAM, and 32G solid state drive. The projector has the resolution of 1920×1080 and its focal length can be adjusted between 10.2 and 10.22 millimeters. The camera has the resolution of 1292×964 with the frame rate of 30fps (frames per second). Its vertical and horizontal angle of views are 59.4 and 79.0 degrees, respectively.

Fig. 1 Configuration of the proposed 3D body scanning system.

Download Full Size | PDF

The topology of the system is shown in Fig. 2. A user computer is utilized to run user programs including module calibration and body scanning. It communicates with module computers via a router. The module computers are regarded as servers, while the user computer is a client. Control commands are sent from the client to these servers and reflection images are transmitted in the opposite direction. In order to speed up the scanning process, modules at the diagonal positions scan simultaneously. For example, the scanning order can be: module 1 and module 6, module 2 and module 5, module 3 and module 8, and finally module 4 and module 7.

Fig. 2 Topology of the proposed 3D body scanning system.

Download Full Size | PDF

2.2. Phase calculation

Four-step phase-shifting (least-squares) algorithm has been widely used in structured light-based profilometry. It projects four grey scale images with sinusoidal fringe patterns onto the object. The difference of initial phases between two adjacent images is 90 degrees. The intensity of the ith image on pixel (x, y) can be expressed as

I_{i} (x, y) = I^{'} + I^{″} \sin [ϕ (x, y) + \frac{π}{2} (i - 1)], i = 1, 2, \dots, 4,

where I′ is the average intensity of the image, I″ is the amplitude modulation, and ϕ(x, y) is the phase value of the pixel. The phase value can be calculated by

ϕ (x, y) = \frac{2 π N x}{W},

where W is the width of the projected image and N is the number of fringes.

When these pattern images are projected onto one object, they will be modulated by the surface of the object. The reflection images captured by the camera have similar patterns with the projected ones. In the most elementary case that the camera and the projector have collimated illumination, the intensity of the ith reflection image on pixel (u, v) is

R_{i} (u, v) = R^{'} + R^{″} \sin [u_{0} u + u_{0} \tan (θ_{0}) O (u, v) + \frac{π}{2} (i - 1)], i = 1, 2, \dots, 4,

where R′ is the average intensity of the reflection image, R″ is the amplitude modulation, u₀ is the spatial frequency of the fringes at the camera plane in radians/pixels, θ₀ is the angle between the camera and the projector, and O (u, v) is the physical object height. The term u₀u represents the standard carrier frequency component and u₀tan(θ₀)O (u, v) represents phase modulation caused by object surface. Denotes the phase value of pixel (u, v) in camera plane as

φ (u, v) = u_{0} u + u_{0} \tan (θ_{0}) O (u, v) .

Then we can get the following general expression

R_{i} (u, v) = R^{'} + R^{″} \sin [φ (u, v) + \frac{π}{2} (i - 1)], i = 1, 2, \dots, 4 .

When the camera and the projector have divergent optical systems, the model in Eq. (3) can be found in [31]. In our implementation, projectors are regarded as reverse cameras. Then the developed binocular stereo vision technology can be used to calculate 3D point coordinates. Therefore the key problem turns to be matching pixels in camera and projector planes. The following equation can be easily deduced from Eq. (5).

φ (u, v) = \arctan (\frac{R_{4} (u, v) - R_{2} (u, v)}{R_{1} (u, v) - R_{3} (u, v)}) .

Note that the phase value calculated according to Eq. (6) is wrapped in the range of [0, 2π). Because there are N periods in each projected pattern image, φ(u, v) should be unwrapped to establish a one-to-one mapping between pixels in camera and projector planes. In our implementation, each projected phase shifting image has N = 64 fringes. Therefore six binary images with Gray-code patterns are needed to unwrap the phase value φ(x, y). Assume the value of Gray-code on pixel (u, v) is B(u, v), then the absolute phase on this pixel can be calculated as

Φ (u, v) = φ (u, v) + 2 π B (u, v) .

The phase is in the range of [0, 2Nπ) and only one period exists now. Finally the x coordinate of pixel (x, y) in the projector plane corresponding to pixel (u, v) in the camera plane is

x = \frac{W Φ (u, v)}{2 N π} .

The process of absolute phase calculation is illustrated in Fig. 3.

Fig. 3 Process of absolute phase calculation. Left: four-step phase shifting images. Bottom middle: phase map within [0, 2π). Top middle: six Gray-code images. Right: absolute phase map within [0, 2Nπ).

Download Full Size | PDF

2.3. System calibration

System calibration is crucial for structured light based systems. It contains two tasks: calibration between modules and calibration in every module. The former is to obtain the transform matrices of different modules, which are initial guesses of cloud poses in the procedure of cloud registration. The latter is to get the intrinsic and extrinsic parameters of the camera and projector, which will be used in 3D reconstruction. Because the guesses do not have to be very exact, the approximate physical parameters obtained in system building process can be used directly. Therefore only the latter is presented in this paper.

Calibration of a binocular stereo camera system is quite mature and can be referenced. The most widely used calibration algorithm is proposed by Zhang [32]. It uses a black-and-white checkerboard as calibration target. Images are captured by cameras when the checkerboard are adjusted in several poses. Then the coordinates of the checkerboard corners are extracted and the intrinsic and extrinsic parameters can be calculated. However, in a camera-projector system, the projector cannot capture checkerboard images. We need to project fringe patterns using the projector and capture reflection images by the camera. Then the coordinates of the corners in projector plane are determined according to their corresponding pixels in camera plane. Moreover, the black blocks of the checkerboard greatly affect the Gray-code and fringe patterns. Therefore we use a red-and-blue checkerboard as calibration target instead, as shown in Fig. 4(a). The reflection image captured by camera under structured light is shown in Fig. 4(b). It can be seen that the fringes are not affected by the red and blue blocks. To extract corners in camera plane, red light is projected onto the checkerboard. Figure 4(c) shows the captured reflection image, which is similar with the black-and-white checkerboard under natural lighting conditions.

Fig. 4 Checkerboard images captured by camera under different lighting conditions: (a) natural light, (b) fringe pattern, and (c) red light.

Download Full Size | PDF

In order to calculate both the x and y coordinates of the checkerboard corners in projector plane, we need to project both vertical and horizontal patterns. Eq. (2) depicts the situation of projecting vertical patterns. The phase value ϕ(x, y) in horizontal patterns can be calculated by

ϕ (x, y) = \frac{2 π N y}{H},

where H is the height of the projected image and N is the number of horizontal fringes.

The process of module calibration is summarized as follows.

Step 1: Adjust the pose of the checkerboard.
Step 2: Project one red image and capture the reflection image. Then extract coordinates of checkerboard corners in camera plane.
Step 3: Project six vertical Gray-code images and four vertical fringe pattern images. At the same time, capture the reflection images. Then calculate absolute phase map and x coordinates of checkerboard corners in projector plane.
Step 4: Project six horizontal Gray-code images and four horizontal fringe pattern images. At the same time, capture the reflection images. Then calculate absolute phase map and y coordinates of checkerboard corners in projector plane.
Step 5: Repeat step 1 to step 4 more than ten times.
Step 6: Calculate the intrinsic and extrinsic parameters of the camera and projector using the algorithm proposed by Zhang [32].

2.4. 3D reconstruction

With the intrinsic and extrinsic parameters, 3D reconstruction can be carried out now. Assume P_W = [X_W,Y_W, Z_W, 1]^T (in millimeter) is the homogenous coordinate of a 3D point in the world coordinate frame, P_C = [X_C,Y_C, Z_C, 1]^T (in millimeter) is its coordinate in camera coordinate frame, and [u_c, v_c, 1]^T (in pixel) is its projection on the image plane. Then we have

Z_{C} [\begin{matrix} u_{c} \\ v_{c} \\ 1 \end{matrix}] = [\begin{matrix} a_{x} & 0 & u_{0} \\ 0 & a_{y} & v_{0} \\ 0 & 0 & 1 \end{matrix}] [\begin{matrix} X_{C} \\ Y_{C} \\ Z_{C} \end{matrix}]

where a_x and a_y are focal lengths of the camera in pixel unit, u₀ and v₀ are coordinates of the principle point. All the four parameters are the intrinsic parameters of the camera and can be obtained by module calibration.

Let x_c = X_C/Z_C and y_c = Y_C/Z_C, it can be derived from Eq. (10) that

{\begin{matrix} x_{c} = \frac{X_{C}}{Z_{C}} = \frac{u_{c} - u_{0}}{a_{x}} \\ y_{c} = \frac{Y_{C}}{Z_{C}} = \frac{v_{c} - u_{0}}{a_{y}} \end{matrix} .

Similarly, there is

{\begin{matrix} x_{p} = \frac{X_{P}}{Z_{P}} = \frac{u_{p} - u_{0}^{'}}{a_{x}^{'}} \\ y_{p} = \frac{Y_{P}}{Z_{P}} = \frac{v_{p} - v_{0}^{'}}{a_{y}^{'}} \end{matrix},

where

a_{x}^{'}

,

a_{y}^{'}

,

u_{0}^{'}

and

v_{0}^{'}

are the intrinsic parameters of the projector, [u_p, v_p, 1]^T is the corresponding pixel in projector plane, P_P = [X_P,Y_P, Z_P, 1]^T is the coordinate of the 3D point in projector coordinate frame.

Additionally, we have

[\begin{matrix} X_{P} \\ Y_{P} \\ Z_{P} \\ 1 \end{matrix}] = T_{P} [\begin{matrix} X_{C} \\ Y_{C} \\ Z_{C} \\ 1 \end{matrix}] = [\begin{matrix} r_{11} & r_{12} & r_{13} & t_{1} \\ r_{21} & r_{22} & r_{23} & t_{2} \\ r_{31} & r_{32} & r_{33} & t_{3} \\ 0 & 0 & 0 & 1 \end{matrix}] [\begin{matrix} X_{C} \\ Y_{C} \\ Z_{C} \\ 1 \end{matrix}],

where T_P is the transform matrix between the camera and projector coordinate frames. It is the extrinsic parameter in module calibration.

According to Eqs. (11) – (13), we can get the z coordinate of the 3D point in camera coordinate frame

\begin{matrix} Z_{C} = \frac{t_{1} - t_{3} x_{p}}{(r_{31} x_{c} + r_{32} y_{c} + r_{33}) x_{p} - (r_{11} x_{c} + r_{12} y_{c} + r_{13})} \\ = \frac{t_{2} - t_{3} y_{p}}{(r_{31} x_{c} + r_{32} y_{c} + r_{33}) y_{p} - (r_{21} x_{c} + r_{22} y_{c} + r_{23})} . \end{matrix}

Then the x and y coordinates of this point in camera coordinate frame can be calculated as X_C = Z_C x_c and Y_C = Z_C y_c. Further, the coordinate of the point in world coordinate frame is

P_{W} = T \cdot P_{C},

where T is the transform matrix of the module, which is obtained in system building process.

It can be seen from Eq. (14) that, only one of x_p and y_p is needed to calculate Z_c. Therefore, only the vertical pattern images are used in our implementation and the matching between u_p and (u_c, v_c) is determined.

2.5. Registration and optimization

In our system, there are eight modules to scan different parts of the human body. Therefore we can get eight point clouds, which should be aligned into one model. Traditionally, the transform matrix between two adjacent point clouds is estimated using ICP algorithm. Then the point clouds can be registered incrementally. However, registration errors are accumulated in the whole process. In order to decrease these errors, we improve the ICP algorithm by registering each point cloud with all its neighbors instead of just its former one.

The layout of the modules can be seen in Fig. 1. For example, module 1, module 4, and module 5 are neighbors of module 3. Assume Ωⁱ is the point cloud in world coordinate frame reconstructed by the ith module. When registering the third cloud Ω³, Ω¹, Ω⁴, and Ω⁵ are combined and sampled into a temporary point cloud Δ. Then the points in Ω³ and Δ are matched according to their distances. Assume U = {u₁, u₂, ⋯, u_n} and V = {v₁, v₂, ⋯, v_n} are matched point sets from Ω³ and Δ, respectively. Fixing Δ, the following error function can be calculated by adjusting the rotation matrix R and translation vector t of Ω³.

E (R, t) = \arg \min_{R, t} \frac{1}{n} \sum_{i = 1}^{n} (1 - \frac{D (u_{i}, v_{i})}{D_{\max}}) | | u_{i} - R v_{i} - t | |^{2},

where D(u_i, v_i) is the distance between point u_i and v_i. D_max is the largest distance of the matching point pairs. If E(R, t) is smaller than a given threshold, the pose of Ω³ is found, i.e. R and t. Otherwise, Ω³ will be transformed using R and t and the above process will be executed again until E(R, t) is small enough.

In order to further minimize registration errors, a graph optimization algorithm is proposed in this paper. The nodes of the graph are poses of the modules. The edges of the graph are transform matrices between these modules. Assume node x_i is the pose of the ith module, edge T_i, _j is the transform matrix between x_i and x_j. Ideally, there should be

x_{i} = T_{i, j} \cdot x_{j} .

However, because of registration errors, there is

e_{i} = x_{i} - T_{i, j} \cdot x_{j} .

Then we define an error function as

E_{o} (x) = \sum_{i = 1}^{8} e_{i}^{T} \cdot e_{i} .

The final poses of the modules can be obtained by minimizing the error function in Eq. (19) using the solving method proposed by Kummerle et al. [33].

3. Experiments

In this section, experimental results are presented. In the experiments, we found that the camera’s response to projected images of different intensities is nonlinear. When the intensities of the projected images are close to 0 or 255, the intensities of the reflection images captured by camera are almost unchanged. Therefore the intensities of the projected fringe pattern images in our implementation are set between 80 and 200, i.e. I′ = 140 and I″ = 60 in Eq. (1).

3.1. Module calibration

Here we take the first module as the example to present module calibration results. In calibration, the checkerboard was adjusted under twelve different poses. At each pose, a red image was projected and the reflection image was captured to extract checkerboard corners on camera image plane. Ten vertical and ten horizontal pattern images were projected and corresponding reflection images were captured to match checkerboard corners on projector image plane. Therefore twelve image pairs indicating checkerboard corners can be obtained to calculate module parameters. The calculated intrinsic matrices of the camera and projector are

K_{C} = [\begin{matrix} a_{x} & 0 & u_{0} \\ 0 & a_{y} & v_{0} \\ 0 & 0 & 1 \end{matrix}] = [\begin{matrix} 967.9 & 0 & 623.7 \\ 0 & 971.7 & 439.8 \\ 0 & 0 & 1 \end{matrix}],

and

K_{P} = [\begin{matrix} a_{x}^{'} & 0 & u_{0}^{'} \\ 0 & a_{y}^{'} & v_{0}^{'} \\ 0 & 0 & 1 \end{matrix}] = [\begin{matrix} 1364.3 & 0 & 961.7 \\ 0 & 1353.9 & 1097.5 \\ 0 & 0 & 1 \end{matrix}] .

The transform matrix of the projector is

T_{P} = [\begin{matrix} r_{11} & r_{12} & r_{13} & t_{1} \\ r_{21} & r_{22} & r_{23} & t_{2} \\ r_{31} & r_{32} & r_{33} & t_{3} \\ 0 & 0 & 0 & 1 \end{matrix}] = [\begin{matrix} 0.9463 & - 0.0671 & 0.3164 & 409.82 \\ - 0.0570 & 0.9284 & 0.3672 & - 19.27 \\ - 0.3184 & - 0.3655 & 0.8747 & 70.54 \\ 0 & 0 & 0 & 1 \end{matrix}] .

The mean reprojection errors are shown in Fig. 5. The errors on camera image planes are represented by blue stripes and are in the range of 0.1816 to 0.2681 pixels. The errors on projector image planes are represented by brown stripes and are in the range of 0.6036 to 0.9598 pixels. The overall mean error of these image pairs is 0.4715, as illustrated by blue dashed line.

Fig. 5 Mean reprojection errors of module calibration.

Download Full Size | PDF

The visualization of extrinsic parameters are shown in Fig. 6, where colored planes represent checkerboards under different poses. From Figs. 5 and 6, it can be seen that the calibration of this module is accurate.

Fig. 6 Extrinsic parameters visualization. Colored planes represent checkerboards under different poses.

Download Full Size | PDF

3.2. 3D reconstruction by one module

In order to verify the accuracy of our 3D reconstruction algorithm, a precise flat board was reconstructed by one module. The board was about one meter far away from the module. Figure 7(a) shows the calculated relative phase map by four-step phase shifting method. As can be seen in this figure, there are many phase periods in the map. Figure 7(b) presents the absolute phase map after applying Gray-code decoding, where only one phase period exists. The reconstructed point cloud is shown in Fig. 7(c). These 3D points are fitted into a plane and the standard error is 1.03 mm, which meets the requirement of 3D body scanning.

Fig. 7 3D reconstruction of a flat board. (a) Relative phase map obtained by four-step phase shifting method. (b) Absolute phase map with Gray-code decoding. (c) 3D point cloud of the board.

Download Full Size | PDF

Figure 8 shows the reconstruction results of a mannequin by one module. The relative and absolute phase maps are shown in Figs. 8(a) and 8(b), respectively. Note that the modules are mounted on the poles vertically to scan more parts of the body (see Fig. 1). Therefore Figs. 8(a) and 8(b) have been rotated 90 degrees and the fringe patterns seem to be horizontal now. Figures 8(c) and 8(d) show two different views of the reconstructed point cloud. The reconstruction results of a human body by one module are shown in Fig. 9. From these figures we can see that our 3D reconstruction algorithm is feasible and can produce sophisticated models.

Fig. 8 3D reconstruction of a mannequin. (a) Relative phase map obtained by four-step phase shifting method. (b) Absolute phase map with Gray-code decoding. (c, d) Two views of the reconstructed 3D point cloud.

Download Full Size | PDF

Fig. 9 3D reconstruction of a human body. (a) Absolute phase map. (b, c) Two views of the reconstructed 3D point cloud.

Download Full Size | PDF

3.3. Body scanning and optimization

With different parts reconstructed by different modules, the entire model can be obtained by point cloud registration. Then the graph optimization algorithm is used to minimize registration errors. Figure 10 shows the entire models before (left) and after (right) graph optimization, which are presented from the same perspective. Points reconstructed by different modules are marked with different colors. From the distributions of these colored points, we can see that the poses of the modules are different and the points are better overlapped after graph optimization, although the difference is not obvious enough.

Fig. 10 3D model of a human body. Points in different colors represent different parts reconstructed by different modules. Left: Before optimization. Right: After optimization.

Download Full Size | PDF

To observe the improvement of performance by graph optimization more intuitively, the poses of the modules are drawn in Fig. 11 and are marked as numbered circles. Circle 1′ represents the pose of the first module after point cloud registration. It should be coincide with circle 1 ideally. However they are not coincide with each other because of accumulate errors, as shown in Fig. 11(a). After optimization, circle 1′ and circle 1 are overlapped, as shown in Fig. 11(b).

Fig. 11 Pose graphs of the modules (a) before and (b) after optimization. Numbered circles represent the poses of the modules. Circle 1′ represents the pose of the first module after point cloud registration.

Download Full Size | PDF

Table 1 presents the (x, y, z) positions of the modules in world coordinate frame in meter. It can be seen that module 1 and module 1′ are in the same position after optimization. Positions of other modules are also changed slightly. This means that the accumulate errors have been evenly distributed among these modules. Therefore the accuracy of the scanned model is improved, as well as the visual effects. Different views of the final scanned results after combining all point clouds and removing noise points are presented in Fig. 12.

Table 1. Positions of modules in world coordinate frame (in meter).

View Table | View all tables in this article

Fig. 12 Different views of the final model after combining all point clouds.

Download Full Size | PDF

In order to present more quantitative comparing results, digitizing of a geometrical object has been performed. For all ICP based registration algorithms, the point clouds should have a certain of complex shapes. For example, points on a plane or a sphere cannot be registered using only geometric information. Therefore we designed an object as shown in Fig. 13(a). The base of the object is a cuboid, whose length (L1), width (W1), and height (H1) are 300, 300, and 400 millimeters, respectively. On top of the cuboid are four prismoids with 12 sides. Two prismoids with 8 sides are located at the left and right of the base cuboid. All these prismoids have the height of 170mm and the basal diameter of 300mm. In Fig. 13(a), H2 is the height of the entire object. C1, C2, and C3 represent the basal polygons’ circumferences of the prismods. A photograph of the real object is shown in Fig. 13(b). The reconstructed 3D model using our algorithm is shown in Fig. 13(c). It can be seen that the geometrical object was digitized correctly.

Fig. 13 (a) A geometrical object to be measured. (b) Photograph of the real object. (c) Digitalized model using our scanning system.

Download Full Size | PDF

After 3D modelling, we measured the dimensions indicated in Fig. 13(a). The circumferences C1, C2, and C3 were calculated according to the contour points of the basal polygons. Other parameters were computed by searching the boundary points in corresponding directions. Table 2 presents the quantitative results. The ground truths were measured manually, which had a little bit differences with our designed values because of machining errors. We also registered the point clouds using the well-known ICP method and measured the dimensions according to the reconstructed model. From Tab. 2, we can calculate that the mean absolute error of our system is 1.94mm, while the mean absolute error of ICP is 4.84mm. Considering the scanning distance (approximately 1.2 meters), the result of our system is satisfied. Similar with Fig. 10, the visual difference between point clouds reconstructed by these two algorithms is not obvious. Therefore only the model obtained by our algorithm is presented in Fig. 13.

Table 2. Measurement results of a geometrical object (in millimeter).

View Table | View all tables in this article

More results are shown in Fig. 14, including two mannequins and six human bodies. All the results demonstrate that our 3D body scanning system can produce accurate and sophisticated models.

Fig. 14 Final results of two mannequins (a, b) and six human bodies (c-h).

Download Full Size | PDF

4. Conclusion

In this research, we present an accurate 3D body scanning system based on structured light technology. An improved ICP algorithm has been proposed to align different point clouds. We also propose a graph optimization algorithm to further minimize registration errors. These two contributions make our system more accurate than the standard ICP algorithm. In our implementation, eight modules are used to scan the body from different views. All the sensors, as well as the scanned body, need not to move in the whole scanning process. Each module is consisted of a projector and a camera. The projector projects periodic fringe patterns onto the scene and the reflection images are captured by the camera. We use a four-step phase shifting combined with Gray-code method to match pixels in projector and camera planes. Then the modules are calibrated using a red-and-blue checkerboard, which will not affect the projected stripes and is easy to extract corners at the same time. We have also derived the calculation of 3D point coordinates according to matched pixels. Experimental results show that our calibration and reconstruction algorithms are accurate. The proposed system can generate complete and accurate 3D human body models conveniently.

With modern projectors, digital cameras, and high-speed computers, one can easily obtain ten or more phase-stepping images. As analyzed in [34], utilizing higher phase step number fringes can improve both signal-to-noise ration and harmonics rejection, which will produce more accurate 3D body models. Therefore it is worth to increase the number of phase-steps although the capture and computing time will increase slightly. In the future, we would like to re-run our experiments with different number of phase-stepping images and further improve the accuracy of our 3D scanning system.

Funding

National Natural Science Foundation of China (NSFC) (61603020, 61620106012, 61573048).

References

1. J. Siebert and S. Marshall, “Human body 3D imaging by speckle texture projection photogrammetry,” Sensor Rev. 20(3), 218–226 (2000). [CrossRef]

2. B. Allen, B. Curless, and Z. Popovic, “The space of human body shapes: reconstruction and parameterization from range scans,” ACM Trans. Graph. 22(3), 587–594 (2003). [CrossRef]

3. J. Lu and M. Wang, “Automated anthropometric data collection using 3D whole body scanners,” Expert Syst. Appl. 35(1), 407–414 (2008). [CrossRef]

4. J. Guo, X. Peng, A. Li, X. Liu, and J. Yu, “Automatic and rapid whole-body 3D shape measurement based on multinode 3D sensing and speckle projection,” Appl. Opt. 56(31), 8759–8768 (2017). [CrossRef] [PubMed]

5. R. Newcombe and A. Davison, “Live dense reconstruction with a single moving camera,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2010), pp. 1498–1505.

6. J. Pons, R. Keriven, and O. Faugeras, “Multi-view stereo reconstruction and scene flow estimation with a global image-based matching score,” Int. J. Comput. Vis. 72(2), 179–193 (2007). [CrossRef]

7. H. Ji, K. An, J. Kang, M. Chung, and W. Yu, “3D environment reconstruction using modified color ICP algorithm by fusion of a camera and a 3D laser range finder,” in Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IEEE, 2009), pp. 3082–3088.

8. D. Paulo, M. Miguel, and S. Vítor, “3D reconstruction of real world scenes using a low-cost 3D range scanner,” Comput.-Aided Civ. Inf. Eng. 21(7), 486–497 (2010).

9. S. Chen, Y. Li, and J. Zhang, “Vision processing for realtime 3-D data acquisition based on coded structured light,” IEEE Trans. Image Process. 17(2), 167–176 (2008). [CrossRef] [PubMed]

10. R. Sagawa, R. Furukawa, and H. Kawasaki, “Dense 3D reconstruction from high frame-rate video using a static grid pattern,” IEEE Trans. Pattern Anal. Mach. Intell. 36(9), 1733–1747 (2014). [CrossRef]

11. T. Petkovic, T. Pribanic, and M. Donlic, “Single-shot dense 3D reconstruction using self-equalizing De Bruijn sequence,” IEEE Trans. Image Process. 25(11), 5131–5144 (2016). [CrossRef] [PubMed]

12. H. Jing, J. Zhu, and P. Zhou, “Optical 3-D surface reconstruction with color binary speckle pattern encoding,” Opt. Express 26(3), 3452–3465 (2018). [CrossRef] [PubMed]

13. J. Xu, S. Liu, A. Wan, B. Gao, Q. Yi, D. Zhao, R. Luo, and K. Chen, “An absolute phase technique for 3D profile measurement using four-step structured light pattern,” Opt. Lasers Eng. 50(9), 1274–1280 (2012). [CrossRef]

14. C. Jiang, B. Li, and S. Zhang, “Pixel-by-pixel absolute phase retrieval using three phase-shifted fringe patterns without markers,” Opt. Lasers Eng. 91232–241 (2017). [CrossRef]

15. Z. Cai, X. Liu, X. Peng, and B. Gao, “Ray calibration and phase mapping for structured-light-field 3D reconstruction,” Opt. Express 26(6), 7598–7613 (2018). [CrossRef] [PubMed]

16. G. Sansoni, S. Corini, S. Lazzari, R. Rodella, and F. Docchio, “Three-dimensional imaging based on Gray-code light projection: characterization of the measuring algorithm and development of a measuring system for industrial applications,” Appl. Opt. 36(19), 4463–4472 (1997). [CrossRef] [PubMed]

17. Y. Wang, K. Liu, Q. Hao, D. Lau, and L. Hassebrook, “Period coded phase shifting strategy for real-time 3-D structured light illumination,” IEEE Trans. Image Process. 20(11), 3001–3013 (2011). [CrossRef] [PubMed]

18. X. Huang, J. Bai, K. Wang, Q. Liu, Y. Luo, K. Yang, and X. Zhang, “Target enhanced 3D reconstruction based on polarization-coded structured light,” Opt. Express 25(2), 1173–1184 (2017). [CrossRef] [PubMed]

19. M. Wheeler, “Automatic modeling and localization for object recognition,” PhD Thesis, School of Computer Science, Carnegie Mellon University, 1996.

20. Z. Cai, X. Liu, A. Li, Q. Tang, X. Peng, and B. Gao, “Phase-3D mapping method developed from back-projection stereovision model for fringe projection profilometry,” Opt. Express 25(2), 1262–1277 (2017). [CrossRef]

21. Q. Wu, B. Zhang, J. Huang, Z. Wu, and Z. Zeng, “Flexible 3D reconstruction method based on phase-matching in multi-sensor system,” Opt. Express 24(7), 7299–7318 (2016). [CrossRef] [PubMed]

22. S. Barone, A. Paoli, and A. Razionale, “Multiple alignments of range maps by active stereo imaging and global marker framing,” Opt. Lasers Eng. 51(2), 116–127 (2013). [CrossRef]

23. V. Pradeep, G. Medioni, and J. Weiland, “Visual loop closing using multi-resolution SIFT grids in metric-topological SLAM,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2009), pp. 1438–1445.

24. Y. Wang, D. Hung, and C. Sun, “Improving data association in robot SLAM with monocular vision,” J. Inf. Sci. Eng. 27(6), 1823–1837 (2011).

25. P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, “RGB-D mapping: using depth cameras for dense 3D modeling of indoor environments,” Int. J. Robot. Res. 31(5), 647–663 (2014). [CrossRef]

26. I. Dryanovski, R. Valenti, and J. Xiao, “Fast visual odometry and mapping from RGB-D data,” in Proceedings of IEEE International Conference on Robotics and Automation (IEEE, 2013), pp. 2305–2310.

27. T. Lee, S. Lim, S. Lee, S. An, and S. Oh, “Indoor mapping using planes extracted from noisy RGB-D sensors,” in Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IEEE, 2012), pp. 1727–1733.

28. A. Bartoli and P. Sturm, “Structure-from-motion using lines: representation, triangulation, and bundle adjustment,” Comput. Vis. Image Underst. 100(3), 416–441 (2005). [CrossRef]

29. P. Besl and N McKay, “A method for registration of 3-D shapes,” IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 239–256 (1992). [CrossRef]

30. H. Yue, W. Chen, X. Wu, and J. Liu, “Fast 3D modeling in complex environments using a single Kinect sensor,” Opt. Lasers Eng. 53, 104–111 (2014). [CrossRef]

31. K. Gasvik, Optical Metrology, 3 Edition, (Wiley, 2002). [CrossRef]

32. Z. Zhang, “A flexible new technique for camera calibration”, IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1330–1334 (2000). [CrossRef]

33. R. Kummerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, “G2o: a general framework for graph optimization,” in Proceedings of IEEE International Conference on Robotics and Automation (IEEE, 2011), pp. 3607–3613.

34. M. Servin, J. Quiroga, and M. Padilla, Fringe Pattern Analysis for Optical Metrology: Theory, Algorithms, and Applications, (Wiley-VCH, 2014).

Module index	Position before optimization	Position after optimization
Module 1	(0.300, 0.817, −0.736)	(0.294, 0.810, −0.752)
Module 2	(0.862, 0.844, −0.741)	(0.856, 0.840, −0.757)
Module 3	(0.287, −0.662, −0.778)	(0.289, −0.669, −0.778)
Module 4	(1.024, −0.645, −0.774)	(1.026, −0.648, −0.773)
Module 5	(0.371, −0.612, 0.717)	(0.371, −0.613, 0.718)
Module 6	(1.101, −0.665, 0.646)	(1.101, −0.665, 0.648)
Module 7	(0.324, 0.911, 0.666)	(0.326, 0.912, 0.675)
Module 8	(0.881, 0.934, 0.605)	(0.883, 0.935, 0.617)
Module 1′	(0.288, 0.788, −0.760)	(0.294, 0.810, −0.752)

Parameter	Ground truth	Results of our system	Results of ICP
L1	300.4	302.0	304.9
W1	300.7	302.1	305.1
H1	399.1	400.8	402.8
H2	1080.8	1082.9	1086.5
C1	932.6	934.6	937.2
C2	921.0	918.5	915.3
C3	920.8	918.5	915.5

Module index	Position before optimization	Position after optimization
Module 1	(0.300, 0.817, −0.736)	(0.294, 0.810, −0.752)
Module 2	(0.862, 0.844, −0.741)	(0.856, 0.840, −0.757)
Module 3	(0.287, −0.662, −0.778)	(0.289, −0.669, −0.778)
Module 4	(1.024, −0.645, −0.774)	(1.026, −0.648, −0.773)
Module 5	(0.371, −0.612, 0.717)	(0.371, −0.613, 0.718)
Module 6	(1.101, −0.665, 0.646)	(1.101, −0.665, 0.648)
Module 7	(0.324, 0.911, 0.666)	(0.326, 0.912, 0.675)
Module 8	(0.881, 0.934, 0.605)	(0.883, 0.935, 0.617)
Module 1′	(0.288, 0.788, −0.760)	(0.294, 0.810, −0.752)

Parameter	Ground truth	Results of our system	Results of ICP
L1	300.4	302.0	304.9
W1	300.7	302.1	305.1
H1	399.1	400.8	402.8
H2	1080.8	1082.9	1086.5
C1	932.6	934.6	937.2
C2	921.0	918.5	915.3
C3	920.8	918.5	915.5

Accurate three dimensional body scanning system based on structured light

Abstract

1. Introduction

2. Principle

2.1. System overview

2.2. Phase calculation

2.3. System calibration

2.4. 3D reconstruction

2.5. Registration and optimization

3. Experiments

3.1. Module calibration

3.2. 3D reconstruction by one module

3.3. Body scanning and optimization

4. Conclusion

Funding

References

Cited By

Figures (14)

Tables (2)

Equations (22)

Optics Express