
Large depth-of-field 3D shape measurement using an electrically tunable lens

Open Access

Abstract

State-of-the-art 3D shape measurement systems have a rather shallow working volume due to the limited depth-of-field (DOF) of conventional lenses. In this paper, we propose to use an electrically tunable lens to substantially enlarge the DOF. Specifically, we capture always-in-focus phase-shifted fringe patterns by precisely synchronizing the tunable lens attached to the camera with the image acquisition and the pattern projection; we develop a phase unwrapping framework that fully utilizes the geometric constraint from the camera focal length setting; and we pre-calibrate the system at different focal distances to reconstruct 3D shape from the unwrapped phase map. To validate the proposed idea, we developed a prototype system that can perform high-quality measurement over a depth range of approximately 1,000 mm (400 mm – 1400 mm) with a measurement error of 0.05%. Furthermore, we demonstrated that the technique can be used for real-time 3D shape measurement by experimentally measuring moving objects.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Three-dimensional (3D) shape measurement (or imaging) has been extensively employed in many areas such as human-computer interaction (HCI), industrial measurement, and intelligent robotics. Among existing methods, the digital fringe projection (DFP) technique has been exhaustively studied and widely used due to its high performance, easy implementation and broad applicability, especially for near-range scenes [1].

DFP methods often require large apertures to maximize the light throughput from the projector to the camera, especially for high-speed applications. This is because 1) DFP systems typically adopt off-the-shelf projectors and cameras; 2) these projectors are not specifically designed for DFP applications and can only emit limited radiant power; and 3) a generic imaging sensor’s quantum efficiency (QE) is much lower than that of human vision. As a result, the working volume of a standard DFP system is shallow. This fundamental limitation makes it difficult to use DFP systems to measure scenes with large depth variation.

In this paper, we define the depth-of-field (DOF) of a DFP system as the intersection of the projector’s in-focus volume and the camera’s in-focus volume. State-of-the-art large-DOF methods can be classified into virtual DOF extension and real DOF extension. The former increases the tolerance to lens blur, and the latter increases the physical DOF by modifying the hardware configuration.

The virtual DOF extension methods include optimal pattern design, blur kernel estimation, and adaptive patterns. Gupta and Nayar [2] proposed using sinusoidal patterns with frequencies limited to a narrow, high-frequency band so that the projector lens defocusing effect is attenuated as much as possible. However, the accuracy of this method relies on the user’s selection of the pattern frequency, and the working-range extension is quite limited because the pattern information degrades rapidly when the projector is severely defocused. Wang and Zhang [3] developed a binary dithering technique that can achieve high-speed and high-quality 3D shape measurement. Sun et al. [4] developed an improved optimization method that works under different amounts of defocusing because the ($3k$)-th harmonics do not induce any phase errors for a three-step phase-shifting algorithm; however, this restricts its usage to a three-step phase-shifting algorithm with a fringe period that is a multiple of 3. Zhang and Nayar [5] developed a temporal defocus analysis method to estimate the projection defocus kernel in the frequency domain, and then applied a spatially varying, depth-dependent defocus compensation to increase the DOF of the projector. Garcia and Zakhor [6] improved measurement performance by optimally ordering temporally dithered codes to increase the tolerable blur size. Zhang et al. [7] used speckle patterns to change the illumination pattern frequency and intensity based on prior object depth information; however, using speckle patterns lowers the depth measurement accuracy. Rao et al. [8] presented an online, depth-driven frequency optimization method; even though it improved measurement quality, the improvement is not significant if the depth variation is large.

The real DOF extension methods physically change the optical system of the projector. Achar and Narasimhan [9] proposed extending the working volume by using multiple projector focus settings and carefully characterizing the projector blur. However, this method requires a relatively large number of images (typically 25-30 for each focal length setting), and the mechanically tunable lens is slow and unstable. The use of a light field projector in a structured light system was first studied by Kawasaki et al. [10]. In this method, the projector is constructed by attaching a coded aperture with a high-frequency mask in front of the lens; the convolution of the aperture and the projector pattern forms a light field, realizing depth-dependent pattern projection. However, the coded aperture attenuates the light throughput significantly, and the method requires heavy computational resources.

Unlike camera defocusing, projector defocusing depends only on the distance to the projector lens and not on the neighboring surface geometry; that is, projector defocusing is scene-independent whereas camera defocusing depends on the scene being captured [5]. As a result, compared with camera defocusing, projector defocusing has less effect on the accuracy and detail of 3D reconstruction using the DFP technique. Therefore, for high-quality 3D shape measurement of large depth range scenes, it is more important to extend the DOF of the camera. Unfortunately, the aforementioned projector DOF extension methods assume that the camera has an infinite DOF, which does not hold for scenes with large depth variation. Although numerous studies have worked on extending the DOF of the camera, such as all-in-focus imaging [11–14] and depth-from-defocus [15–17], they are primarily designed for producing a focused 2D image without exactly knowing the changed physical parameters (e.g., focal length). For any DOF extension method to be adopted in 3D shape measurement, any change of a physical parameter has to be precisely known, so those methods cannot be directly employed for large depth range 3D shape measurement.

In traditional optical designs, extending the DOF of 2D imaging is typically achieved by translating the optical element(s) of the lens system, which is complicated, slow, power-consuming and imprecise. The rapid development of the electrically tunable lens (ETL) in the past few years has provided an additional degree of freedom for designing elegant optical systems, and ETLs have been widely used in applications such as auto-focus imaging, displays, laser processing, and microscopy [18]. As described before, unlike 2D imaging, where the DOF can be extended without knowing the focal length precisely, a 3D measurement system must know the exact focal length and keep it unchanged after calibration. Thus extending the working range of a DFP system using an ETL requires highly stable optical performance, especially good reproducibility of the ETL.

This paper presents a method to enlarge the 3D shape measurement range using the ETL. First, we pre-calibrated the system at discrete focal length settings and verified the reproducibility of the 3D reconstruction results under different focus settings. Then we developed an efficient phase unwrapping method. Specifically, we 1) capture always-in-focus phase-shifted patterns by precisely synchronizing the ETL attached to the camera with the image acquisition and pattern projection; 2) project fringe patterns of different periods at different focal distances; and 3) fully utilize the in-focus phase, the out-of-focus phase, and the geometric constraints from the focal length setting to unwrap the phase. This framework effectively reduces the number of required patterns and improves the robustness of the system for large depth range measurement. Our prototype hardware system achieves a 3D frame rate of 20 Hz over a working range from 400 mm to 1400 mm with a measurement error of 0.05%.

Section 2 explains the principle of the proposed method. Section 3 presents experimental results to verify the performance of the proposed method, and Section 4 summarizes the paper.

2. Principle

2.1 Multi-step phase-shifting algorithm

Phase-shifting methods have been extensively used in optical metrology due to their advantages of accuracy, speed and resolution [19]. The intensity function of the n-th pattern for an N-step phase-shifting algorithm can be described as,

$$I_n(x,y)=A(x,y)+B(x,y)\cos[\phi(x,y)-\delta_n(x,y)],$$
where $A(x,y)$ denotes the average intensity, $B(x,y)$ is the intensity modulation amplitude, $\phi (x,y)$ is the phase to be solved for, and $\delta _n(x,y)$ is the phase-shift value corresponding to the n-th step. If $N \geq 3$, the unknown phase $\phi (x,y)$ can be calculated by
$$\phi(x,y)={-}\tan^{{-}1}\left[\frac{\sum_{n=1}^{N}I_n(x,y)\sin\delta_n}{\sum_{n=1}^{N}I_n(x,y)\cos\delta_n}\right].$$
The value returned by the arctangent function ranges from $-\pi$ to $\pi$ with $2\pi$ discontinuities and is therefore called the wrapped phase. To remove the $2\pi$ discontinuities, a spatial or temporal phase unwrapping algorithm is usually required to determine the fringe order $k(x,y)$. Once $k(x,y)$ is determined, the absolute phase $\Phi (x,y)$ can be obtained by
$$\Phi(x,y)=\phi(x,y)+k(x,y)\times2\pi.$$
The unwrapped phase $\Phi (x,y)$ can be converted to 3D coordinates once the system is calibrated. Meanwhile, the average intensity $A(x,y)$ can be obtained by
$$A(x,y)=\left[\frac{\sum_{n=1}^{N}I_n(x,y)}{N}\right].$$
$A(x,y)$ is often regarded as the texture image, which can be used for visualization or for providing cues for visual analysis.
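
For concreteness, a minimal numerical sketch of Eqs. (2)–(4) is given below. It assumes equally spaced phase shifts $\delta_n = 2\pi n/N$ and NumPy image stacks as input; the function names are illustrative only and not part of the original work.

```python
import numpy as np

def phase_shifting(images):
    """Wrapped phase and texture from N phase-shifted fringe images, Eqs. (2) and (4).

    images : sequence of N fringe images I_n; equally spaced phase shifts
             delta_n = 2*pi*n/N are assumed (an assumption of this sketch).
    """
    I = np.asarray(images, dtype=np.float64)            # shape (N, H, W)
    N = I.shape[0]
    delta = 2.0 * np.pi * np.arange(N) / N              # phase-shift values
    num = np.tensordot(np.sin(delta), I, axes=(0, 0))   # sum_n I_n sin(delta_n)
    den = np.tensordot(np.cos(delta), I, axes=(0, 0))   # sum_n I_n cos(delta_n)
    phi = -np.arctan2(num, den)                         # Eq. (2): wrapped phase
    A = I.mean(axis=0)                                  # Eq. (4): average intensity (texture)
    return phi, A

def absolute_phase(phi, k):
    """Eq. (3): absolute phase from the wrapped phase and fringe order k."""
    return phi + 2.0 * np.pi * k
```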

2.2 Depth-of-field extending using an electrically tunable lens

2.2.1 Basic principle of the ETL

There are two main approaches to implementing an ETL [18]. The first is based on local variations in refractive index [20], and the most popular technology uses liquid crystals. It has the advantages of low driving voltage, low power consumption, and easy miniaturization, but it is polarization sensitive and slow in response. The second approach is to control the shape of a lens, including electrowetting [21,22] and shape-changing polymers [23]. Electrowetting-based liquid lenses consist of two liquids with nominally the same density but different refractive indices; the curvature of their interface is controlled by applying a voltage to an insulated metal substrate. The main drawbacks are the limited aperture size (e.g., smaller than 3 mm) and sensitivity to gravity when the densities of the two liquids are not perfectly matched. The shape-changing liquid lens is based on a combination of an optical fluid and a polymer membrane: a container filled with optical liquid is sealed with a thin elastomeric polymer, and by pushing a ring toward the membrane or applying pressure to its outside, the deflection of the membrane and thus the radius of the lens can be changed. The pressure difference between the two chambers can be controlled in many ways: mechanically, electromechanically, pneumatically or electrostatically. One issue associated with shape-changing polymer lenses is coma, although this effect is negligible when the lens is in the horizontal position; coma is not temperature dependent and is typically small, so it can be further reduced by using stiffer and thicker membranes. Thermal expansion also influences the lens shape but can be handled in closed-loop control with temperature sensing. The advantages of such liquid lenses are less mechanical motion, a more compact and robust design, lower weight, lower power consumption, and fast response time (e.g., milliseconds). Therefore, this technology has a wide range of applications.

2.2.2 Proposed large DOF 3D shape measurement

For high-accuracy 3D shape measurement with DFP techniques, it is well known that the lens cannot be changed after calibration. However, as described in Sec. 2.2.1, with the advancement of manufacturing technology in the past few years, the imaging quality, linearity and reproducibility of ETLs have greatly improved. In this research, we explore the potential of employing an ETL to change the focal length of a lens after calibration for DFP techniques.

Figure 1 shows the framework of the proposed large DOF 3D shape measurement technique. The system includes a projector, a camera with an attached ETL, and a control board for synchronization. $f_p$ is the projector focal distance and $f_c^i$ denotes the $i^{th}$ focal distance of the camera. For simplicity, we adopted three discrete camera focal distances: $f_c^1=450$ mm, $f_c^2=800$ mm and $f_c^3=1300$ mm. Since lens blur of the projector has little effect on the reconstruction accuracy [5], we experimentally found that fixing the projector focal distance $f_p$ at 800 mm is sufficient to cover the entire focal range of the camera. Each small image near the camera focal plane (denoted by dotted rectangles of different colors) is the sampled image for a different focal distance setting. The corresponding close-up images show that each object can be brought clearly into focus by setting different focal distances through the ETL. We calibrate each focal length setting separately while keeping the other settings unchanged, perform 3D reconstruction for each focal length, and then merge the entire 3D scene by selecting the measurement points that are in focus (see the sketch below).
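
The merging step can be viewed as keeping, for each focal setting, only the reconstructed points that fall inside that setting's in-focus depth band. The sketch below illustrates one possible implementation; the depth bands passed as z_ranges are assumptions for illustration, not values prescribed by the method.

```python
import numpy as np

def merge_scenes(point_clouds, z_ranges):
    """Merge per-focal-setting reconstructions into a single scene.

    point_clouds : list of (K_i, 3) arrays, one 3D reconstruction per focal
                   setting, all in the same (projector) coordinate system.
    z_ranges     : list of (z_near, z_far) tuples giving the in-focus depth
                   band assumed for each focal setting.
    """
    kept = []
    for pts, (z_near, z_far) in zip(point_clouds, z_ranges):
        in_focus = (pts[:, 2] >= z_near) & (pts[:, 2] <= z_far)  # keep in-focus points only
        kept.append(pts[in_focus])
    return np.vstack(kept)

# Example with three focal settings (the depth bands here are assumptions):
# merged = merge_scenes([pc1, pc2, pc3], [(400, 620), (620, 1050), (1050, 1400)])
```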

Fig. 1. Schematic diagram of the proposed large DOF 3D shape measurement system.

To keep the focal length of the ETL stable during 3D shape measurement, a controller board is utilized to precisely synchronize the camera, projector and ETL. Figure 2 shows the corresponding timing chart. For focal distance $i$ ($i=1\cdots M$), the ETL control signal is first generated to adjust the ETL; after the settling time $t_{i-1}^i$, during which the focal length reaches the set value and becomes steady, the controller board simultaneously sends trigger signals to the camera and projector to capture $L_i$ images. After the image acquisition for focal distance $i$ is completed, the above steps are repeated for the next focal distance $i + 1$. If $M$ focal distances are set to cover the entire depth range, the reconstruction frame rate of the entire volume $F$ can be calculated by

$$F=\frac{1}{T}=\begin{cases} \;\frac{1}{L_1\times T_c}\; , \qquad \qquad \qquad \quad \quad \;\; M=1\\ \;\frac{1}{t_{M}^{1}+L_1\times T_c+\sum_{i=2}^{M} (t_{i-1}^{i}+L_i\times T_c)} \; , \quad M \geqslant 2,\\ \end{cases}$$
where $T$ is the reconstruction period of the entire volume, $t_{i-1}^{i}$ is the ETL settling time when changing from focal distance $f_c^{i-1}$ to $f_c^i$, $L_i$ is the number of images captured at focal distance $f_c^i$, and $T_c$ is the period of the trigger signal. For example, if there are three focal distances, a two-frequency, three-step phase-shifting algorithm is adopted, the ETL settling times $t_3^1$, $t_1^2$ and $t_2^3$ all equal 10 ms, and the trigger signal period $T_c$ is 3 ms, then the reconstruction period $T$ for the entire volume is 84 ms and the corresponding 3D data acquisition rate is approximately $F \approx$ 11.9 Hz. This timing diagram illustrates the importance of precise synchronization when using an ETL in DFP systems, especially for high-speed, large depth range 3D shape measurement applications.
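
As a sanity check of Eq. (5), the following sketch reproduces the worked example above (three focal distances, six patterns per distance, 10 ms settling times, 3 ms trigger period); the function name is illustrative.

```python
def reconstruction_rate(settle_times, images_per_focus, T_c):
    """Eq. (5): 3D reconstruction rate F for the entire volume.

    settle_times     : [t_M^1, t_1^2, ..., t_{M-1}^M], ETL settling times (s)
    images_per_focus : [L_1, ..., L_M], images captured at each focal distance
    T_c              : trigger signal period (s)
    """
    M = len(images_per_focus)
    if M == 1:
        T = images_per_focus[0] * T_c
    else:
        T = settle_times[0] + images_per_focus[0] * T_c          # t_M^1 + L_1*T_c
        for i in range(1, M):
            T += settle_times[i] + images_per_focus[i] * T_c     # t_{i-1}^i + L_i*T_c
    return 1.0 / T

# Worked example from the text: T = 84 ms, F ~ 11.9 Hz
print(reconstruction_rate([0.010, 0.010, 0.010], [6, 6, 6], 0.003))
```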

Fig. 2. Timing chart of the proposed large DOF 3D shape measurement system. Here $f_c^i$ represents the $i^{th} (1\leq i\leq M)$ focal distance, $t_{i-1}^{i}$ represents the ETL settling time when changing from focal distance $f_c^{i-1}$ to $f_c^i$, $t_c^{exp}$ represents the exposure time for both the camera and projector, $T_c$ represents the period of the trigger signal, $L_i$ represents the number of captured images at focal distance $i$, and $T$ is the total 3D reconstruction time for the entire depth range.

2.3 Efficient phase unwrapping using focal length constraints

For a typical DFP system, higher-frequency fringe patterns are preferable for high-accuracy 3D shape measurement. However, when the frequency becomes too high, the digital representation of the sinusoidal signal may not be precise, and even slight lens blur can drastically attenuate the modulation amplitude, leading to a lower signal-to-noise ratio (SNR) and thus poor measurement quality. Since the depth range of the proposed method is very large, we found it necessary to use fringe patterns of different frequencies to measure objects at different focal distances. At the same time, the absolute phase must be recovered to reconstruct the absolute shape of the entire scene. The simplest approach is to project fringe patterns of several frequencies and employ a multi-frequency phase unwrapping algorithm at each focal distance. However, this approach slows down the measurement because it requires projecting many fringe patterns.

In this research, we propose to project patterns of only one frequency for each focal length setting, and then unwrap the phase using the equivalent phase map generated from the two different-frequency patterns of adjacent focal distances. However, since the frequencies of all the fringe patterns are quite high and close to each other, the conventional temporal phase unwrapping algorithm does not work for the entire scene. To solve this problem, we propose an efficient absolute phase unwrapping method that takes advantage of the geometric-constraint-based phase unwrapping algorithm [24,25] along with the available focal length information.

Figure 3 illustrates the computational framework we developed for phase unwrapping. For each focal distance, the camera captures images of the fringe patterns whose period is $\lambda ^i$, from which the wrapped phase $\phi ^{i}$ can be generated by applying Eq. (2). For objects placed at the focal distance $f_c^i$, $\phi ^{i}$ is called the in-focus phase, and $\phi ^{i-1}$ or $\phi ^{i+1}$ is the out-of-focus phase. High-quality reconstruction can only be obtained from the in-focus phase (i.e., $\phi ^{i}$ ). The equivalent phase map of the focal distance $f_c^i$ can be computed as

$$\phi_{eq}^{i}=\begin{cases} \;(\phi^i-\phi^{i+1})\:\: \mod\:\: 2 \pi \ , \quad i=1\\ \;(\phi^i-\phi^{i-1})\:\: \mod\:\: 2 \pi \ , \quad 1 < i \leq M,\\ \end{cases}$$
where $\textrm{mod}$ is the modulus operation. The corresponding equivalent fringe period used to generate the equivalent phase map is
$$\lambda_{eq}^{i}=\begin{cases} \;\left|\: \frac{\lambda^i\lambda^{i+1}}{\lambda^i-\lambda^{i+1}}\:\right| \ , \quad i=1 \\ \;\left|\: \frac{\lambda^i\lambda^{i-1}}{\lambda^i-\lambda^{i-1}}\:\right| \ , \quad 1 < i \leq M \\ \end{cases}$$
where $|\cdot |$ is the absolute value operator. Since the equivalent fringe period $\lambda _{eq}^{i}$ is not large enough to cover the whole projector range, we introduce the geometric-constraint-based phase unwrapping method along with the available focal length information. Briefly, for a point $(x,y)$ at depth $z_0$, its artificial phase for a fringe pattern with period $\lambda ^0$ can be obtained by
$$\Phi^0(x,y)=g(\textbf{P}_\textrm{c}, \textbf{P}_\textrm{p},\lambda^0, z_{0}),$$
where $\textbf {P}_\textrm {c}$ and $\textbf {P}_\textrm {p}$ are the known camera and projector projection matrices. The mapping function from the focal distance $f_c^i$ to the depth $z$ is defined as
$$z^i=h(f_c^i),$$
which is discrete and fixed once the focal distances are chosen. Let $z_{\min }^i=h_{\min }(f_c^i)$ be the minimum measurement depth when using the focal distance $f_c^i$; its virtual minimum phase map is then
$$\Phi_{\min}^i(x,y)=g(\textbf{P}_\textrm{c}, \textbf{P}_\textrm{p}, \lambda_{eq}^i, h_{\min}(f_c^i)).$$
Then $\Phi _{\min }^i(x,y)$ can be used to unwrap the equivalent phase $\phi _{eq}^{i}$ by
$$K_{eq}^i(x,y)= \textrm{ceil}\left[\frac{\Phi_{\min}^i(x,y)-\phi_{eq}^i(x,y)}{2\pi}\right],$$
$$\Phi_{eq}^i(x,y)= \phi_{eq}^i(x,y) + K_{eq}^i(x,y) \times 2\pi.$$
Here ceil() is the operator that returns the smallest integer not less than its argument, and $\Phi _{eq}^i(x,y)$ is the coarse absolute phase that can be used to unwrap the high-frequency phase $\phi ^i$ by
$$\Phi^i(x,y)= \phi^i(x,y) + K^i(x,y) \times 2\pi,$$
where
$$K^i(x,y)= \textrm{Round}\left[\frac{\Phi_{eq}^i(x,y)\times \lambda_{eq}^i/\lambda^i -\phi^i(x,y)}{2\pi}\right].$$
Here Round() is the operator that returns the integer closest to a floating-point value, and $\Phi ^i(x,y)$ is the in-focus unwrapped absolute phase map used for 3D reconstruction.
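
To make the data flow of Eqs. (6)–(14) concrete, the sketch below strings the steps together for one focal distance $i>1$, assuming the wrapped phases have already been computed with Eq. (2) and the minimum phase map $\Phi_{\min}^i$ of Eq. (10) has been pre-computed from the calibration; all function and variable names are illustrative.

```python
import numpy as np

def unwrap_in_focus_phase(phi_i, phi_adj, lam_i, lam_adj, Phi_min):
    """Absolute in-focus phase for focal distance i, following Eqs. (6)-(14).

    phi_i, phi_adj : wrapped phases (Eq. (2)) of the in-focus pattern and of
                     the pattern from the adjacent focal distance
    lam_i, lam_adj : corresponding fringe periods (pixels)
    Phi_min        : artificial minimum phase map of Eq. (10), pre-computed
                     from the calibration and z_min = h_min(f_c^i)
    """
    # Eq. (6): equivalent (beat) phase of the two neighboring frequencies
    phi_eq = np.mod(phi_i - phi_adj, 2.0 * np.pi)
    # Eq. (7): equivalent fringe period
    lam_eq = abs(lam_i * lam_adj / (lam_i - lam_adj))
    # Eqs. (11)-(12): unwrap the equivalent phase with the geometric constraint
    K_eq = np.ceil((Phi_min - phi_eq) / (2.0 * np.pi))
    Phi_eq = phi_eq + 2.0 * np.pi * K_eq
    # Eqs. (13)-(14): use the coarse phase to unwrap the high-frequency in-focus phase
    K = np.round((Phi_eq * lam_eq / lam_i - phi_i) / (2.0 * np.pi))
    return phi_i + 2.0 * np.pi * K
```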

Fig. 3. Computational framework of our proposed phase unwrapping algorithm.

3. Experiment

We developed a prototype system to evaluate the performance of the proposed method. The system includes a CMOS camera (model: PointGrey Grasshopper GS3-U3-23S6M). The camera lens combination consists of a 16 mm focal length lens (model: Ricoh FL-CC1614-2M) and a liquid lens (model: Optotune EL-10-30-Ci). A DLP projector (model: DLP LightCrafter 4500) was used for 3D shape measurement. The focal length of the ETL can be tuned from +100 mm to +200 mm with a 10 mm clear aperture. A high-precision lens driver with a current resolution of 0.1 mA (model: Optotune Lens Driver 4i) was adopted to control the liquid lens. We used a microcontroller board (model: Arduino Uno) to generate the trigger signals for the projector and camera, and to send UART commands to the liquid lens driver so as to synchronize the lens with the camera and projector. The camera resolution was set to 736$\times$736 pixels and the projector resolution was 912$\times$1140 pixels. The camera aperture was set to its maximum of $f/1.4$. For all experiments, we adopted three focal positions at approximately 450 mm, 800 mm and 1300 mm, with corresponding driver currents of 130.01 mA, 110.06 mA and 95.04 mA. For each focal position, we calibrated the system individually using the method described in [26]. Since the projector lens was not changed during the whole process, its intrinsic parameters were fixed for all focal positions; we therefore set the projector coordinate system as the world coordinate system to avoid misalignment of the 3D data generated at different focal positions.

To determine whether an ETL can be used for high-accuracy 3D shape measurement, we first investigated its optical performance, including reproducibility and dynamic response. Unlike a mechanically tunable lens with inner friction, the liquid lens exhibits no hysteresis. The absolute reproducibility of the latest liquid lenses is typically 0.1 diopters over a temperature range of 10 to 50 °C [27], which is in theory quite small for DFP systems. To verify the reproducibility of the system based on the liquid lens, we placed one sphere at each of the three measurement locations, performed 12 repeated measurements, and fitted the measured data with an ideal sphere model (see the sketch below). To maximize any hysteresis effect, we performed the experiments in a cyclic sequence: $f_c^1$, $f_c^2$, $f_c^3$, $f_c^2$, $f_c^1$, $\dots$. Figure 4 shows the fitted sphere center location and radius for each measurement. This experiment demonstrated that, among those four parameters, $z_0$ changes the most, but the standard deviations are quite small: $\sigma _{z_0}$ for each sphere is 0.166 mm ($z_0^1$ = 455 mm), 0.392 mm ($z_0^2$ = 847.5 mm) and 0.328 mm ($z_0^3$ = 1378 mm), respectively. The corresponding relative errors are 0.036%, 0.046% and 0.024%, which are as accurate as those of DFP systems using mechanical lenses.
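
The reproducibility analysis relies on fitting an ideal sphere to each measured point cloud. The paper does not specify the fitting procedure; the sketch below uses the standard linearized algebraic least-squares formulation as one plausible choice.

```python
import numpy as np

def fit_sphere(points):
    """Least-squares sphere fit to a (K, 3) point cloud.

    Uses the linearized algebraic model
        x^2 + y^2 + z^2 = 2*a*x + 2*b*y + 2*c*z + d,
    where (a, b, c) is the sphere center and r = sqrt(d + a^2 + b^2 + c^2).
    Returns (center, radius, rms_radial_error).
    """
    P = np.asarray(points, dtype=np.float64)
    A = np.hstack([2.0 * P, np.ones((P.shape[0], 1))])   # design matrix [2x 2y 2z 1]
    b = np.sum(P**2, axis=1)                             # right-hand side x^2 + y^2 + z^2
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + np.sum(center**2))
    residuals = np.linalg.norm(P - center, axis=1) - radius   # radial fitting errors
    return center, radius, np.sqrt(np.mean(residuals**2))
```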

Fig. 4. 12 repeated sphere measurements. (a) and (d) sphere center for the sphere at $z_0^1 = 455$ mm ( $\sigma _{x_0}$ = 0.043 mm, $\sigma _{y_0}$ = 0.033 mm, $\sigma _{z_0}$ = 0.166 mm and $\sigma _{r_0}$ = 0.099 mm); (b) and (e) sphere center for the sphere at $z_0^2 = 847.5$ mm ($\sigma _{x_0}$ = 0.071 mm, $\sigma _{y_0}$ = 0.055 mm, $\sigma _{z_0}$ = 0.392 mm and $\sigma _{r_0}$ = 0.216 mm); (c) and (f) sphere center for the sphere at $z_0^3 = 1378$ mm ($\sigma _{x_0}$ =0.102 mm, $\sigma _{y_0}$ = 0.207 mm, $\sigma _{z_0}$ = 0.328 mm and $\sigma _{r_0}$ = 0.202 mm).

Next, we investigated the response speed of the ETL. Since the liquid lens deforms on the millisecond level, we replaced the previous camera with a high-speed camera (model: Phantom 340L) to observe its dynamic characteristics. We set the camera resolution to 256$\times$256 pixels with a frame rate of 8,000 Hz; the focal length and projector settings remained the same. During the experiment, we kept a whiteboard at the focal distance $f_c^1$ and projected one binary pattern with a long period. We then switched the liquid lens between the three focal distances, collected images and calculated their blur metrics. We adopted the metric from [28] to describe the degree of lens blur, denoted $\sigma$, which ranges from 0 to 1 with 0 being the best (i.e., not blurred) and 1 being the worst quality (i.e., severely blurred). Figures 5(a)–5(c) show the captured images of the projected binary pattern and selected cross-sections at different focal length settings. Due to the lens blur, the binary fringes gradually become sinusoidal from Figs. 5(a)–5(c), and $\sigma$ changes from 0.456 to 0.723 as expected, indicating that $\sigma$ describes the degree of defocus well, although it is not fully equivalent to the focal length. The dynamic response curves of $\sigma$ when the focal length is switched from $f_c^1$ to $f_c^2$, $f_c^2$ to $f_c^3$ and $f_c^3$ to $f_c^1$ are shown in Figs. 5(d)–5(f). The starting and ending points of each step response are indicated by black ticks. The settling times corresponding to the three step responses are about 15 ms, 8 ms and 9 ms, respectively. Even though 15 ms is already relatively short, the settling time could be further reduced to below 5 ms with a better driving signal [27].
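
The blur metric $\sigma$ of [28] is based on re-blurring the image and measuring how much neighboring-pixel variation the re-blur removes: a sharp image loses much of its variation, an already blurred one loses little. The sketch below is a strongly simplified, one-direction version of that idea, not the reference implementation used in the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def blur_metric(img, size=9):
    """Simplified no-reference blur estimate in [0, 1] (0 = sharp, 1 = blurred).

    Re-blur the image and measure how much of the neighboring-pixel variation
    the re-blur removes; a sharp image loses much of it, an already blurred
    image loses little.  This is a simplified sketch in the spirit of [28],
    not the exact metric.
    """
    f = np.asarray(img, dtype=np.float64)
    b = uniform_filter1d(f, size=size, axis=1)       # horizontal re-blur
    d_f = np.abs(np.diff(f, axis=1))                 # variation of the original
    d_b = np.abs(np.diff(b, axis=1))                 # variation of the re-blurred image
    v = np.maximum(0.0, d_f - d_b)                   # variation destroyed by the re-blur
    s_f, s_v = d_f.sum(), v.sum()
    return (s_f - s_v) / s_f if s_f > 0 else 0.0
```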

Fig. 5. Dynamic properties of the liquid lens. (a) Photograph of the projected binary pattern taken at $f_c^1$ and selected cross-section, $\sigma$ = 0.456; (b) photograph of the projected binary pattern taken at $f_c^2$ and selected cross-section, $\sigma$ = 0.657; (c) photograph of the projected binary pattern taken at $f_c^3$ and selected cross-section, $\sigma$ = 0.723; (d) dynamic response curve of $\sigma$ when liquid lens switched from (a) to (b), target axial position was reached after 15 ms (indicated by black ticks, associated Visualization 1); (e) dynamic response curve of $\sigma$ when liquid lens switched from (b) to (c), target axial position was reached after 8 ms (indicated by black ticks, associated Visualization 2); (f) dynamic response curve of $\sigma$ when liquid lens switched from (c) to (a), target axial position was reached after 9 ms (indicated by black ticks, associated Visualization 3).

Having demonstrated that the liquid lens responds rapidly and that its repeatability is sufficient for 3D shape measurement, we then measured three static objects positioned at three different depths. In this experiment, we used a three-step phase-shifting algorithm with a fringe width of $\lambda =16$ pixels and an 11-bit Gray-code algorithm for temporal phase unwrapping. Figure 6 shows the experimental results. Figure 6(a) shows the scene when the camera is focused at a distance of 450 mm (i.e., $f_c^1$ = 450 mm), where the bottom statue is in focus. Figures 6(b)–6(d) respectively show the 3D reconstruction of each statue; clearly the bottom statue gives the highest quality. Similarly, the second row shows the results when the camera is focused on the middle statue with $f_c^2$ = 800 mm, and the third row shows the results when the camera is focused on the top statue with $f_c^3$ = 1300 mm. These results clearly show that the best 3D quality is obtained when the camera is in focus, and that the measurement quality drops with increasing defocus.

Fig. 6. Photograph of the scene with different focal settings. (a) The camera focal distance at $f_c^1$ = 450 mm; (b)–(d) the close-up views of each individual statue for the focal settings used in (a); (e) the camera focal distance at $f_c^2$ = 800 mm; (f)–(h) the close-up views of each individual statue for the focal settings used in (e); (i) the camera focal distance at $f_c^3$ = 1300 mm; (j)–(l) the close-up views of each individual statue for the focal settings used in (i). Matlab surf() function was used and the color represents depth with red being closer and blue being further away. The corresponding depth range is [1220, 1450] mm for the second column images, and [820, 890] mm for the third column images, and [470, 510] mm for the fourth column images.

We further quantitatively evaluated the measurement error by measuring the spheres with the sphere fitting method described above. Figure 7 shows the measurement results and error maps. Each column represents a different camera focus setting, and each row shows the results for the same sphere. Table 1 lists the corresponding mean error and RMSE for Fig. 7. Figures 7(b), 7(j) and 7(r) are the results when the camera is in focus, which are also the most accurate. These experiments clearly demonstrated that the reconstruction error increases as the target moves away from the camera focal plane, especially in regions with larger shape gradients (see the edge areas of Figs. 7(b) and 7(r)). The static experiments presented so far demonstrate that camera lens blur has a great influence on measurement quality (both accuracy and detail), and that high-quality 3D shape measurement can be achieved using an ETL.

Fig. 7. Measurement results of three spheres at different locations. The first row shows the results of the first sphere (diameter = 40 mm at 450 mm) when the camera changes its focal distance from $f_c^1$ to $f_c^3$; The second row shows the corresponding results of the second sphere (diameter = 80 mm at 800 mm); and the third row shows the results for the third sphere (diameter = 200 mm at 1300 mm). The first, third, and fifth column shows the 3D reconstruction, and the second, fourth and sixth column shows the corresponding error map of the 3D result on the previous column. Similar to Fig. 6, the color represents depth information. The corresponding depth ranges are [415, 460], [790, 870], [1150, 1400] mm for the first, second and third row images, respectively.


Table 1. Measurement mean error and RMSE of sphere shown in Fig. 7 (mm)

Lastly, we evaluated the proposed method by measuring dynamically moving objects. In this experiment, we moved the nearest and farthest objects along the horizontal direction while keeping the intermediate object still. The exposure time of both the camera and the projector was set to 2 ms; three-step phase-shifted dithered patterns were used, and the fringe periods $\lambda ^1$, $\lambda ^2$ and $\lambda ^3$ corresponding to the three focus positions were 30, 33 and 37 pixels, respectively. The liquid lens settling times were $t_1^2$ = 15 ms, $t_2^3$ = 8 ms and $t_3^1$ = 9 ms. According to Eq. (5), it takes 50 ms to capture one frame of the entire volume, and thus the equivalent 3D shape measurement speed is $F$ = 20 Hz. Figure 8 shows one representative 3D frame, and the associated Visualization 4–Visualization 7 show the video sequences. Figure 8(a) shows the result when the camera is focused at $f_c^1$, Fig. 8(b) corresponds to $f_c^2$, and Fig. 8(c) corresponds to $f_c^3$. Figure 8(d) shows the result obtained with our proposed method. These results clearly demonstrate that the proposed method can effectively alleviate artifacts induced by lens blur and achieve high-quality measurement of a dynamic scene with a large depth range.
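
For reference, substituting these values into Eq. (5), under the assumption (not stated explicitly in the text) that the trigger period $T_c$ equals the 2 ms exposure time and that $L_i = 3$ patterns are captured at each focal distance, reproduces the stated timing:
$$T = t_3^1 + L_1 T_c + (t_1^2 + L_2 T_c) + (t_2^3 + L_3 T_c) = 9 + 6 + 15 + 6 + 8 + 6 = 50\ \textrm{ms}, \qquad F = 1/T = 20\ \textrm{Hz}.$$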

Fig. 8. Measurement results of moving objects. (a) 3D result when camera is focused at position 1 (associated Visualization 4); (b) 3D result when camera is focused at position 2 (associated Visualization 5); (c) 3D result when camera is focused at position 3 (associated Visualization 6); (d) 3D result obtained from our proposed method (associated Visualization 7).

4. Summary

This paper has presented a method to enlarge the 3D shape measurement range using an electrically tunable lens. Different from virtual extension methods that increase the tolerance to blur, we physically change the focal length through an electrical signal, which is stable, accurate and fast while preserving the highest light transfer efficiency. We also developed an efficient phase unwrapping algorithm that leverages the geometric-constraint-based phase unwrapping method and the available focal length information. A prototype hardware system verified that the proposed method can work over a depth range from 400 mm to 1400 mm with a measurement error of 0.05% and a 3D frame rate of 20 Hz.

Funding

Directorate for Computer and Information Science and Engineering (IIS-637961).

Disclosures

XWH: The author declares no conflicts of interest; GW: The author declares no conflicts of interest; YZ: The author declares no conflicts of interest; HY: The author declares no conflicts of interest; SZ: Vision Express Optics LLC (I,E,P), Orbbec (C).

References

1. S. Zhang, High-Speed 3D Imaging with Digital Fringe Projection Techniques (CRC Press, 2016).

2. M. Gupta and S. K. Nayar, “Micro phase shifting,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2012), pp. 813–820.

3. Y. Wang and S. Zhang, “Three-dimensional shape measurement with binary dithered patterns,” Appl. Opt. 51(27), 6631–6636 (2012). [CrossRef]  

4. J. Sun, C. Zuo, S. Feng, S. Yu, Y. Zhang, and Q. Chen, “Improved intensity-optimized dithering technique for 3d shape measurement,” Opt. Lasers Eng. 66, 158–164 (2015). [CrossRef]  

5. L. Zhang and S. Nayar, “Projection defocus analysis for scene capture and image display,” ACM Trans. Graph. 25(3), 907–915 (2006). [CrossRef]  

6. R. R. Garcia and A. Zakhor, “Selection of temporally dithered codes for increasing virtual depth of field in structured light systems,” in IEEE Conference on Computer Vision and Pattern Recognition-Workshops, (IEEE, 2010), pp. 88–95.

7. Y. Zhang, Z. Xiong, P. Cong, and F. Wu, “Robust depth sensing with adaptive structured light illumination,” J. Vis. Commun. Image Represent. 25(4), 649–658 (2014). [CrossRef]  

8. G. Rao, L. Song, S. Zhang, X. Yang, K. Chen, and J. Xu, “Depth-driven variable-frequency sinusoidal fringe pattern for accuracy improvement in fringe projection profilometry,” Opt. Express 26(16), 19986–20008 (2018). [CrossRef]  

9. S. Achar and S. G. Narasimhan, “Multi focus structured light for recovering scene shape and global illumination,” in European Conference on Computer Vision, (Springer, 2014), pp. 205–219.

10. H. Kawasaki, S. Ono, Y. Horita, Y. Shiba, R. Furukawa, and S. Hiura, “Active one-shot scan for wide depth range using a light field projector based on coded aperture,” in Proceedings of the IEEE International Conference on Computer Vision, (IEEE, 2015), pp. 3568–3576.

11. R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, “Light field photography with a hand-held plenoptic camera,” Comput. Sci. Tech. Rep. 2, 1–11 (2005).

12. A. Levin, R. Fergus, F. Durand, and W. T. Freeman, “Image and depth from a conventional camera with a coded aperture,” ACM Trans. Graph. 26(3), 70 (2007). [CrossRef]  

13. J. R. Alonso, A. Fernández, G. A. Ayubi, and J. A. Ferrari, “All-in-focus image reconstruction under severe defocus,” Opt. Lett. 40(8), 1671–1674 (2015). [CrossRef]  

14. G. Wang, W. Li, X. Yin, and H. Yang, “All-in-focus with directional-max-gradient flow and labeled iterative depth propagation,” Pattern Recognit. 77, 173–187 (2018). [CrossRef]  

15. S.-H. Lai, C.-W. Fu, and S. Chang, “A generalized depth estimation algorithm with a single image,” IEEE Trans. Pattern Anal. Machine Intell. 14(4), 405–411 (1992). [CrossRef]  

16. M. Subbarao and G. Surya, “Depth from defocus: a spatial domain approach,” Int. J. Comput. Vis. 13(3), 271–294 (1994). [CrossRef]  

17. V. Aslantas and D. Pham, “Depth from automatic defocusing,” Opt. Express 15(3), 1011–1023 (2007). [CrossRef]  

18. M. Blum, M. Büeler, C. Grätzel, and M. Aschwanden, “Compact optical design solutions using focus tunable lenses,” in Optical Design and Engineering IV, vol. 8167 (International Society for Optics and Photonics, 2011), p. 81670W.

19. D. Malacara, Optical shop testing (John Wiley & Sons, 2007).

20. H.-C. Lin, M.-S. Chen, and Y.-H. Lin, “A review of electrically tunable focusing liquid crystal lenses,” Transactions on Electr. Electron. Mater. 12(6), 234–240 (2011). [CrossRef]  

21. B. Berge and J. Peseux, “Variable focal lens controlled by an external voltage: An application of electrowetting,” Eur. Phys. J. E: Soft Matter Biol. Phys. 3(2), 159–163 (2000). [CrossRef]  

22. R. Shamai, D. Andelman, B. Berge, and R. Hayes, “Water, electricity, and between$\cdots$ on electrowetting and its applications,” Soft Matter 4(1), 38–45 (2008). [CrossRef]  

23. H. Ren and S.-T. Wu, Introduction to adaptive lenses, vol. 75 (John Wiley & Sons, 2012).

24. Y. An, J.-S. Hyun, and S. Zhang, “Pixel-wise absolute phase unwrapping using geometric constraints of structured light system,” Opt. Express 24(16), 18445–18459 (2016). [CrossRef]  

25. J.-S. Hyun and S. Zhang, “Enhanced two-frequency phase-shifting method,” Appl. Opt. 55(16), 4395–4401 (2016). [CrossRef]  

26. B. Li, N. Karpinsky, and S. Zhang, “Novel calibration method for structured-light system with an out-of-focus projector,” Appl. Opt. 53(16), 3415–3426 (2014). [CrossRef]  

27. Optotune, “EL-10-30 datasheet,” https://www.optotune.com/images/products/Optotune%20EL-10-30.pdf//.

28. F. Crete, T. Dolmiere, P. Ladret, and M. Nicolas, “The blur effect: perception and estimation with a new no-reference perceptual blur metric,” in Human Vision and Electronic Imaging (International Society for Optics and Photonics, 2007), p. 64920I.

Supplementary Material (7)

Visualization 1
Visualization 2
Visualization 3
Visualization 4
Visualization 5
Visualization 6
Visualization 7



