
Three-dimensional computational ghost imaging using a dynamic virtual projection unit generated by Risley prisms

Open Access

Abstract

Computational ghost imaging (CGI) using stereo vision is able to achieve three-dimensional (3D) imaging by using multiple projection units or multiple bucket detectors that are spatially separated. We present a compact 3D CGI system that consists of Risley prisms, a stationary projection unit and a bucket detector. By rotating the double prisms to various angles, speckle patterns appear to be projected by a dynamic virtual projection unit at different positions, and multi-view ghost images are obtained for 3D imaging. In the reconstruction process, a convolutional neural network (CNN) for super-resolution (SR) is adopted to enhance the angular resolution of the reconstructed images. Moreover, an optimized 3D CNN is implemented for disparity estimation and 3D reconstruction. The experimental results validate the effectiveness of the method and indicate that the compact, flexible system has potential in applications such as navigation and detection.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Computational ghost imaging (CGI) proposed by Shapiro [1] is a promising technology that achieves imaging by measuring the correlated intensity from a bucket detector (a single-pixel detector without spatial resolution [2]). In contrast with traditional cameras featuring abundant pixels, CGI using spatial light modulators only requires bucket detectors, which avoids the limitations of array detectors. Recently, there have been many developments in CGI for spectral imaging [3–5], microscopy [6–8], x-ray imaging [9–11] and image encryption [12–14].

Several studies on obtaining high-dimensional information have demonstrated that three-dimensional (3D) CGI can be achieved by using time-resolved systems [15–18] or adopting multi-device schemes [19–22]. With a photomultiplier tube to collect reflected photons, a ghost imaging lidar system has been proposed for remote sensing [15,16]. A single-pixel system using compressed sensing and short-pulsed structured illumination enables 3D imaging by a time-of-flight approach [17]. Improved by single-photon detection, a photon-limited imaging technique for single-pixel 3D imaging achieves better performance [18]. Moreover, current CGI systems with multi-device schemes enable 3D imaging by stereo vision methods. Systems composed of multiple bucket detectors have been demonstrated for 3D imaging via photometric stereo algorithms [19–21]. Based on binocular vision, a system using multiple projection units also has 3D imaging capability [22]. Compared with time-resolved systems, the multi-device scheme is a low-cost approach that does not require measuring photon arrival times. However, both multiple bucket detectors and multiple projection units are spatially separated, which limits the integration of such systems. A compact 3D CGI system consisting of a projector and a single photodetector can achieve stereo vision by adding a prism reflector [23], but the pixels of the patterns are divided into two parts, lowering the spatial resolution. In stereo vision methods, not only spatial resolution but also angular resolution affects the performance of 3D reconstruction [24–26]. Higher angular resolution without sacrificing spatial resolution facilitates the acquisition of ghost images and 3D images with better quality.

As an effective approach to enhance the performance of CGI systems, machine learning techniques have been employed to improve the quality of 2D results and minimize the sampling number [27–29]. By using deep learning (DL), CGI with a deep neural network (DNN) model has been proposed to improve the quality of ghost images under a low measurement ratio [27]. A method combining the compressed sensing (CS) algorithm and a convolutional neural network (CNN) outperforms conventional CS and DL algorithms [28]. CGI with an extremely low sampling rate has been achieved by using a DNN framework with convolution layers and pink noise patterns [29]. The CNN is well suited to extracting spatial features from images through convolutional operations, which improves the performance of the reconstruction.

In this paper, we present a compact 3D CGI system using stereo vision. The proposed system consists of a stationary projection unit, a bucket detector and Risley prisms with two independently rotating prisms. In our scheme, a dynamic virtual projection unit is generated by the stationary projection unit under predetermined rotation angles of the prisms. Based on randomly encoded speckle patterns from the virtual projection unit, multi-view ghost images are obtained from the correlated intensity signal. During reconstruction, a CNN for super-resolution (SR) in the angular domain is implemented to enhance the angular resolution, which is closely related to the quality of the 3D images. Moreover, an optimized 3D CNN is adopted to demonstrate the 3D imaging capability of our system. The experimental results validate the effectiveness of the method, which provides greater compactness and better performance.

2. Theoretical methodology

2.1 Risley-prism-based (RP) 3D CGI system

The schematic diagram of our proposed 3D CGI system is illustrated in Fig. 1. The emitted light from the stationary projection unit Pr is deflected after passing through Risley prisms, which enables illumination from a virtual projection unit Prv that differs from the stationary one.


Fig. 1. Schematic diagram of the proposed 3D CGI system. The system mainly consists of Risley prisms, a stationary projection unit, a bucket detector, a data acquisition device and an additional lens. The stationary and virtual projection units are composed of components with the same parameters, including a light source, a lens and a digital micromirror device (DMD) for modulating the incident light. The Risley prisms are placed coaxially with the optical axis of the stationary projection unit Pr. Based on the Risley prisms with different rotation angles, the O1-O2-O3-O4-O5 trajectory is implemented in the experiment, where Ok (1 ≤ k ≤ 5) represents the optical center of the dynamic virtual projection unit at different positions. The data acquisition device collects the intensity signal from the bucket detector with a lens and transmits the signal to the computer for post-processing.


Based on the refraction associated with the rotation angles of the prisms, the speckle patterns appear to be projected from dynamic virtual views. The light reflected from objects is collected by the bucket detector, and the ghost image under each combination of rotation angles is obtained by correlating the intensity signal with the patterns. According to Helmholtz reciprocity, the projection unit in a CGI system can be regarded as a camera in a traditional imaging system. Based on the rotatable Risley prisms, multi-view images are obtained by using the virtual illumination source moving along the specified trajectory, which provides the angular domain for 3D reconstruction. A more detailed analysis of the method of the proposed 3D CGI system is presented in Supplement 1.
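As a minimal illustration of the correlation step described above, the following sketch reconstructs a single-view ghost image from synthetic bucket measurements with the conventional intensity-correlation estimator; the scene, pattern size and noise level are placeholders rather than the experimental values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic scene and random binary speckle patterns (illustrative sizes only).
H = W = 32                      # pattern resolution
M = 4000                        # number of projected patterns (sampling number)
scene = np.zeros((H, W))
scene[8:24, 8:24] = 1.0         # toy reflectivity map

patterns = rng.integers(0, 2, size=(M, H, W)).astype(float)

# Bucket signal: total reflected intensity for each pattern, plus detector noise.
bucket = patterns.reshape(M, -1) @ scene.ravel()
bucket += rng.normal(0.0, 0.01 * bucket.std(), size=M)

# Conventional CGI correlation: <B * P> - <B><P>, evaluated per pixel.
ghost = (bucket[:, None, None] * patterns).mean(axis=0) \
        - bucket.mean() * patterns.mean(axis=0)

print("reconstruction shape:", ghost.shape)   # (32, 32) ghost image of one view
```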

2.2 Reconstruction process

In the process of angular SR and 3D imaging, CNNs are adopted to improve the performance of the RP 3D CGI system. The implementation process of reconstruction is illustrated in Fig. 2. The ghost images GCk corresponding to the virtual projection unit Prv at different positions are used as the input of the process, where k is set to 5 in the following experiment. Assuming that each ghost image GCk has τ2 × τ2 pixels, the epipolar plane image (EPI) EPIGC has a size of 5 × τ2 and EPIGA has a size of 9 × τ2. After a total of τ2 operations that input the successive EPIs, the optimized images with a size of τ2 × 2τ2 × 9 are restored from EPIGA. The ghost images with a size of τ2 × τ2 × 9 used for disparity estimation are then obtained by resizing these optimization results. By using a CNN for disparity estimation, we obtain the corresponding disparity maps of the scene. Based on the disparity map, the 3D profile is finally reconstructed by Zs = f·ς/ΔU, where f, ς, and ΔU are the focal length, the interval between adjacent positions and the disparity, respectively.
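The EPI rearrangement and the final depth conversion can be summarized in a short sketch with assumed array sizes; the 400 mm focal length and 0.3 mm view interval below are taken from Section 3.1, while the disparity value is only an example.

```python
import numpy as np

# Five ghost images (views), each tau x tau pixels, stacked along the view axis.
tau, k = 76, 5
views = np.random.rand(k, tau, tau)          # stands in for GC1 ... GC5

# An EPI gathers the same image row from every view: shape (k, tau).
def epi_of_row(views, row):
    return views[:, row, :]

epi_gc = epi_of_row(views, row=10)           # 5 x tau, input to the angular SR network
# After angular SR, 2k - 1 = 9 rows are kept, giving EPI_GA (9 x 2*tau before the
# resizing step, because bicubic upsampling also doubles the spatial axis).

# Depth from disparity: Zs = f * interval / disparity (all lengths in mm).
def depth_from_disparity(disparity_mm, f_mm=400.0, interval_mm=0.3):
    return f_mm * interval_mm / disparity_mm

print(depth_from_disparity(0.5))             # -> 240.0 mm, near the scene depths in Sec. 3.2
```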


Fig. 2. Implementation process of reconstruction in the RP 3D CGI system. To improve the computational efficiency, we select the suitable image areas by cropping original images. The cropped images are imported to an angular SR network in the form of the EPI that is obtained by gathering samples with a fixed angular coordinate q and a fixed spatial coordinate u. We sequentially input various EPIGC composed of different rows of multi-view images and finally obtain EPIGA by selecting 2k-1 rows of the corresponding output. Based on EPIGA, the enhanced images are obtained for disparity estimation and 3D reconstruction.


As depicted in Fig. 3(a), the angular SR network is mainly composed of an upsampling stage and a deep recursive residual network (DRRN) [30] optimized by the convolutional block attention module (CBAM) [31]. Bicubic interpolation is performed on the input, and the upsampled results are blurred with Gaussian kernels. The pre-processed data are then fed into a network composed of sub-units made up of DRRN and CBAM. In the DRRN, global residual learning is used in the identity branch and the network is formed by stacking residual units. Moreover, the rectified linear unit (ReLU) is employed as the activation function in each convolution layer. As an attention mechanism, CBAM is used to emphasize meaningful features along the channel axis and the spatial axis. The output is obtained after several sub-units and additional convolution layers. In the following experiment, the main network is formed by 22 sub-units ConvA. We use the mean squared error (MSE) as the loss function of the angular SR network for training, which is expressed as

$$\mathcal{L}(\sigma) = \frac{1}{\varGamma}\sum {\left\| x^{(u)} - X^{(u)} \right\|^2},$$
where Γ, σ, X(u) and x(u) denote the number of training samples, the set of network parameters, the ground-truth patches and the predicted values, respectively. The loss function is minimized via stochastic gradient descent.
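A compact PyTorch sketch of one such sub-unit (a residual convolution pair refined by CBAM-style channel and spatial attention), together with the MSE loss of Eq. (1) and an SGD optimizer, is given below; the channel width, the number of sub-units and the tensor shapes are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention followed by spatial attention (after Woo et al. [31])."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))                 # channel attention map Qc
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))          # spatial attention map Qs

class ConvA(nn.Module):
    """One sub-unit: pre-activated residual convolution pair refined by CBAM."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReLU(inplace=True), nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True), nn.Conv2d(channels, channels, 3, padding=1))
        self.cbam = CBAM(channels)

    def forward(self, x):
        return x + self.cbam(self.body(x))                 # identity (residual) branch

# Illustrative network and training step (the paper stacks 22 ConvA sub-units).
net = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1),
                    *[ConvA(64) for _ in range(4)],
                    nn.Conv2d(64, 1, 3, padding=1))
optimizer = torch.optim.SGD(net.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.MSELoss()                                     # Eq. (1)

epi_in = torch.rand(8, 1, 10, 152)      # bicubically upsampled, blurred EPI patches
epi_gt = torch.rand(8, 1, 10, 152)      # high-resolution ground-truth EPIs
loss = loss_fn(net(epi_in), epi_gt)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```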


Fig. 3. Structure of the adopted network. (a) Angular SR network. The block CA represents channel attention module and the block SA represents spatial attention module. ⊗ denotes the element-wise multiplication and ⊕ denotes the element-wise addition. Qc and Qs represent the channel attention map and the spatial attention map, respectively. Each 2D convolution layer is expressed as sh2 × sw2 @ C2, where the first two parameters correspond to the kernel size and C2 is the number of channels. (b) Disparity estimation network for 9 views. The network is mainly made up of 3D convolution layers, sub-units ConvD and additional 2D convolution layers. Each 3D convolution layer is expressed as sh3 × sw3 × sd3 @ C3, where the first three parameters correspond to the kernel size and C3 is the number of channels.


A 3D CNN [32] optimized by CBAM is adopted for disparity estimation, as illustrated in Fig. 3(b). Before the 3D convolution layers, padding is applied so that the feature maps, which would otherwise shrink after the 3D convolutions, keep the size of the original input. The 3D convolution layers are used to learn from EPIs containing the rows of the multi-view images. In the sub-units ConvD with the attention mechanism, there are two parallel convolution layers with different kernel sizes. After the execution of this network, the disparity map is obtained as the output and used to generate the final 3D reconstruction results. The loss function of the disparity estimation network is defined as

$$\hbar(\eta) = \frac{1}{\varGamma}\sum {\left| y^{(u)} - Y^{(u)} \right|},$$
where Γ, η, Y(u) and y(u) are the number of training samples, the set of network parameters, the ground-truth disparity maps and the predicted results, respectively. In the disparity estimation network, adaptive moment estimation (Adam) is adopted for optimization.
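A minimal PyTorch sketch in the same spirit is shown below: a padded 3D-convolutional front end over the view stack, a ConvD-style block with two parallel kernels, the L1 loss of Eq. (2) and Adam. All kernel sizes, channel counts and the single-layer output head are illustrative assumptions, and the CBAM attention shown in the previous sketch is omitted here for brevity.

```python
import torch
import torch.nn as nn

class ConvD(nn.Module):
    """Sub-unit with two parallel 2D convolutions of different kernel sizes."""
    def __init__(self, channels=64):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, 5, padding=2)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.branch3(x) + self.branch5(x))

class DisparityNet(nn.Module):
    def __init__(self, views=9):
        super().__init__()
        # 3D convolutions learn from the stack of views; spatial padding keeps the
        # feature-map size equal to the input, while the view axis is collapsed.
        self.feat3d = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=(0, 1, 1)), nn.ReLU(inplace=True),
            nn.Conv3d(16, 64, kernel_size=(views - 2, 3, 3), padding=(0, 1, 1)),
            nn.ReLU(inplace=True))
        self.body = nn.Sequential(ConvD(64), ConvD(64))
        self.head = nn.Conv2d(64, 1, 3, padding=1)       # disparity map output

    def forward(self, x):                                # x: (B, 1, views, H, W)
        f = self.feat3d(x).squeeze(2)                    # -> (B, 64, H, W)
        return self.head(self.body(f))

net = DisparityNet(views=9)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
loss_fn = nn.L1Loss()                                    # Eq. (2)

stack = torch.rand(2, 1, 9, 76, 76)                      # 9-view ghost-image stack
gt = torch.rand(2, 1, 76, 76)                            # ground-truth disparity maps
loss = loss_fn(net(stack), gt)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```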

3. Experiments and discussion

3.1 Experimental setup

In the light-emitting part, the light source is a light-emitting diode operating at 400-760 nm (@ 20 W). The incident light is modulated by the DMD (Texas Instruments DLP Discovery 4100) and the focal length of the additional lens is 400 mm. The wedge angle of the prisms is 11.4° and their diameter is 25.4 mm. The thinnest-end thickness of each prism is 3 mm and the prism material is BK7. The distance D1 between the double prisms is 33 mm and the length D0 from Prism 1 to the optical center O of the stationary projection unit is approximately 67 mm. In the transmission device, the minimal step interval of the motors is 0.058°; with a transmission ratio of 3, the minimal rotation angle of the prisms is 0.0193°. In the experiment, the interval ς between adjacent virtual projection positions is set to 0.3 mm, and the theoretical rotation angles $\varTheta_k^1$ and $\varTheta_k^2$ for moving the dynamic virtual projection unit to the 5 positions are listed in Table 1. Tolerable angle deviations exist in the actual rotation of the prisms.

Table 1. Theoretical angles of prisms
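For intuition about how the prism rotations steer the projection, a first-order (thin-prism) estimate of the beam deviation can be sketched as below. This small-angle model, the assumed BK7 index value and the example rotation angles are rough illustrations only; the exact non-paraxial model used for the angles in Table 1 is given in Supplement 1.

```python
import numpy as np

# First-order Risley model (thin-prism approximation): each prism deflects the beam
# by delta = (n - 1) * alpha toward its own rotation direction, and the net deviation
# is approximately the vector sum of the two deflections.
n_bk7 = 1.5168                      # assumed refractive index of BK7 in the visible band
alpha = np.deg2rad(11.4)            # wedge angle of each prism (Section 3.1)
delta = (n_bk7 - 1.0) * alpha       # single-prism deviation, about 5.9 degrees

def net_deviation(theta1_deg, theta2_deg):
    """Approximate total deviation and its azimuth (both in degrees)."""
    t1, t2 = np.deg2rad(theta1_deg), np.deg2rad(theta2_deg)
    dx = delta * (np.cos(t1) + np.cos(t2))
    dy = delta * (np.sin(t1) + np.sin(t2))
    return np.rad2deg(np.hypot(dx, dy)), np.rad2deg(np.arctan2(dy, dx))

min_prism_step_deg = 0.058 / 3.0    # motor step / transmission ratio ~= 0.0193 deg

print(net_deviation(0.0, 0.0)[0])   # ~11.8 deg: prisms aligned, maximum deviation
print(net_deviation(0.0, 180.0)[0]) # ~0 deg: prisms opposed, boresight undeviated
print(min_prism_step_deg)
```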

In the signal-receiving part, the bucket detector (Thorlabs PDA100A) with a lens (focal length, 15 mm) receives the reflected light. The data acquisition device (PICO6404E) is used to collect the signal from the detector. Moreover, a platform with an Intel Xeon Gold 6226R CPU @ 2.90 GHz and an NVIDIA GeForce RTX 2080 Ti GPU is employed for algorithm implementation. For training the angular SR network, we select 200 images from the Berkeley Segmentation Dataset [33] and 100 images from Urban100 [34] as original images. To better match the experimental data, the low-resolution images are passed through a simulation of CGI with compressed sensing, yielding processed datasets similar to the experimental results. The angular SR network is trained with the pre-processed images as inputs and the high-resolution images as the ground truth, which takes around 30 epochs. In addition, 16 scenes from the 4D light field benchmark [35] are selected as the dataset for training the disparity estimation network, whose training takes around 300 epochs. The training samples, validation samples, testing samples and experimental objects are independent of each other.
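The simulated-CGI pre-processing of the training images might look like the sketch below. The paper does not specify the compressed-sensing solver, so scikit-learn's Lasso (l1 minimization in the pixel basis, without the sparsifying transform a practical CS pipeline would normally use) is substituted purely for illustration, and the image size, sampling ratio and noise level are placeholders.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

def simulate_cgi_cs(image, sampling_ratio=0.25, noise=0.01, reg=1e-4):
    """Simulate bucket measurements of `image` and recover it by l1 minimization."""
    h, w = image.shape
    n = h * w
    m = int(sampling_ratio * n)
    A = rng.integers(0, 2, size=(m, n)).astype(float)     # random binary patterns
    y = A @ image.ravel()
    y += rng.normal(0.0, noise * y.std(), size=m)         # detector noise
    rec = Lasso(alpha=reg, max_iter=5000).fit(A, y).coef_ # illustrative sparse recovery
    return rec.reshape(h, w)

# Each dataset image (downsampled to the pattern resolution) is replaced by its
# simulated CGI reconstruction; the clean image remains the training target.
img = rng.random((32, 32))
noisy_ghost = simulate_cgi_cs(img)
```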

3.2 Results

In the experimental scene, two objects are placed 228 mm and 252 mm away from the origin, respectively. Speckle patterns with 128 × 128 pixels are used for modulation, and each pixel of the speckle patterns is composed of 2 × 2 pixels on the DMD coding plane. All the following experiments are based on the ghost images gathered from 5 views, and all the original images are obtained by using the RP 3D CGI system. Based on the signal from the bucket detector, the 2D image of the central view is shown in Fig. 4(a) and the 3D profile of the experimental scene under a sampling number of 7000 is reconstructed in Figs. 4(b) and 4(c).


Fig. 4. Experimental results under the sampling number of 7000. (a) 2D image. (b) 3D results. (c) Rendered view.


To further demonstrate the performance of the system, we conduct separate experiments with sampling numbers of 3000, 4000, 5000 and 6000. In all reconstructed images, cropped regions with 76 × 76-pixel resolution are selected to obtain a common field of view. To compare the performance of the system under different processing pipelines, we add two reference conditions: running the disparity estimation network on the 5 views of the original images, and running it on 9 views obtained from the angular SR stage with only bicubic interpolation and blurring. The experimental results are illustrated in Fig. 5.


Fig. 5. Experimental results under different sampling numbers and different reconstruction processes. A measurement range with the maximum and minimum values is set in the presented 3D results, which excludes erroneous values caused by noise disturbance. The points with depth values out of the range are considered invalid.


The quantitative results, including the root mean square error (RMSE) and the ratio of valid points Rvp, are depicted in Fig. 6. The RMSE is calculated by

$$RMSE = \sqrt{\frac{1}{\mu}\sum {(val_i - tru_i)^2}},$$
where µ, vali and trui are the number of valid points, the calculated depth value of the i-th valid point and the true depth of the i-th valid point, respectively.


Fig. 6. Results of quantitative evaluation. (a) RMSE varying with sampling numbers. (b) Ratio of valid points varying with sampling numbers.


The ratio of valid points Rvp is represented as

$${R_{vp}} = \frac{num_{valid}}{num_{total}} \times 100\%,$$
where numvalid and numtotal are the numbers of valid points and total points, respectively.
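The two metrics can be evaluated from a reconstructed depth map with a validity range, as in the sketch below; the range bounds and the synthetic depth map are illustrative, not the values used in the experiment.

```python
import numpy as np

def evaluate_depth(depth, truth, z_min=200.0, z_max=280.0):
    """RMSE over valid points and the ratio of valid points (Eqs. (3)-(4)), depths in mm."""
    valid = (depth >= z_min) & (depth <= z_max)   # measurement-range mask
    rmse = np.sqrt(np.mean((depth[valid] - truth[valid]) ** 2))
    r_vp = 100.0 * valid.sum() / valid.size
    return rmse, r_vp

depth = np.full((76, 76), 230.0) + np.random.normal(0.0, 4.0, (76, 76))
truth = np.full((76, 76), 228.0)
print(evaluate_depth(depth, truth))               # e.g. (~4.5 mm, ~100 %)
```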

According to the ratio of valid points in the presented results, the quality of the 3D results improves as the sampling number increases. Under the same sampling number, the results generated by the disparity estimation network together with the angular SR stage using only bicubic interpolation and blurring contain more valid points and a lower RMSE than those from the original 5-view images. Moreover, the results generated by the complete angular SR network with convolution layers and the disparity estimation network for 9 views show the best quality at every sampling number. When the sampling number is 6000, the RMSE is ∼4.1 mm and the ratio of valid points is ∼98.6%. Notably, the results generated by the complete angular SR network under a sampling number of 4000 are better than those generated from the original images under a sampling number of 6000. These results indicate that adopting the complete angular SR network in the proposed 3D CGI system can reduce the sampling number and improve the quality of the results.

3.3 Discussion

In the RP 3D CGI system, the 3D profiles are generated by the algorithm using ghost images from multiple views. Risley prisms, optical devices widely used in laser detection systems [36–38] and vision systems [39–41], consist of two rotating prisms that dynamically deflect the boresight of the system, which enables the speckle patterns to be projected from virtual positions. Because the refraction angle varies with wavelength, chromatic aberration in multi-wavelength vision systems causes measurement deviation, which can be fully corrected by an achromatic prism design [42,43]. When the wedge angle of the prisms is small, the influence of chromatic aberration is negligible. Moreover, the position of the virtual projection unit is determined by the rotation angles of the prisms, which can be adjusted flexibly. Our proposed system can therefore achieve long-range measurements by enlarging the moving path of the virtual projection unit without modifying the hardware. According to the presented 2D results, although the spatial resolution is not enhanced in our proposed system, the quality of the 2D images is clearly improved by the angular SR network.

4. Conclusions

To summarize, we have experimentally demonstrated a compact 3D CGI system using Risley prisms. Unlike existing stereo-vision 3D CGI methods, the proposed system does not employ multiple projection units or multiple bucket detectors. By rotating the prisms, speckle patterns are projected by a dynamic virtual projection unit at different positions, and multi-view ghost images are obtained for 3D imaging without sacrificing spatial resolution. The angular SR network is adopted to enhance the angular resolution of the directly obtained results. Based on the optimized images, the 3D profiles of the scenes are reconstructed by the disparity estimation network. The experimental results indicate that our proposed system is able to achieve 3D CGI compactly and flexibly. The compact system has promising applications such as navigation and detection.

Funding

National Natural Science Foundation of China (61871031, 61875012, 61905014); Beijing Municipal Natural Science Foundation (4222017); Funding of foundation enhancement program under Grant (2019-JCJQ-JJ-273).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. J. H. Shapiro, “Computational ghost imaging,” Phys. Rev. A 78(6), 061802 (2008). [CrossRef]  

2. Y. Bromberg, O. Katz, and Y. Silberberg, “Ghost imaging with a single detector,” Phys. Rev. A 79(5), 053840 (2009). [CrossRef]  

3. Z. Liu, S. Tan, J. Wu, E. Li, X. Shen, and S. Han, “Spectral Camera based on Ghost Imaging via Sparsity Constraints,” Sci. Rep. 6(1), 25718 (2016). [CrossRef]  

4. J. Huang and D. Shi, “Multispectral computational ghost imaging with multiplexed illumination,” J. Opt. 19(7), 075701 (2017). [CrossRef]  

5. S. Liu, Z. Liu, C. Hu, E. Li, X. Shen, and S. Han, “Spectral ghost imaging camera with super-Rayleigh modulator,” Opt. Commun. 472, 126017 (2020). [CrossRef]  

6. W. Li, Z. Tong, K. Xiao, Z. Liu, Q. Gao, J. Sun, S. Liu, S. Han, and Z. Wang, “Single-frame wide-field nanoscopy based on ghost imaging via sparsity constraints,” Optica 6(12), 1515–1523 (2019). [CrossRef]  

7. S. C. Chen, Z. Feng, J. Li, W. Tan, L. H. Du, J. Cai, Y. Ma, K. He, H. Ding, Z. H. Zhai, Z. R. Li, C. W. Qiu, X. C. Zhang, and L. G. Zhu, “Ghost spintronic THz-emitter-array microscope,” Light: Sci. Appl. 9(1), 99 (2020). [CrossRef]  

8. N. Radwell, K. J. Mitchell, G. M. Gibson, M. P. Edgar, R. Bowman, and M. J. Padgett, “Single-pixel infrared and visible microscope,” Optica 1(5), 285–289 (2014). [CrossRef]  

9. M. P. Olbinado, D. M. Paganin, Y. Cheng, and A. Rack, “X-ray phase-contrast ghost imaging using a single-pixel camera,” Optica 8(12), 1538–1544 (2021). [CrossRef]  

10. Y. Klein, A. Schori, I. P. Dolbnya, K. Sawhney, and S. Shwartz, “X-ray computational ghost imaging with single-pixel detector,” Opt. Express 27(3), 3284–3293 (2019). [CrossRef]  

11. D. Ceddia and D. M. Paganin, “Random-matrix bases, ghost imaging, and x-ray phase contrast computational ghost imaging,” Phys. Rev. A 97(6), 062119 (2018). [CrossRef]  

12. P. Clemente, V. Duran, V. Torres-Company, E. Tajahuerce, and J. Lancis, “Optical encryption based on computational ghost imaging,” Opt. Lett. 35(14), 2391–2393 (2010). [CrossRef]  

13. S. Zhao, X. Yu, L. Wang, W. Li, and B. Zheng, “Secure optical encryption based on ghost imaging with fractional Fourier transform,” Opt. Commun. 474, 126086 (2020). [CrossRef]  

14. P. Zheng, Z. Ye, J. Xiong, and H.-C. Liu, “Computational ghost imaging encryption with a pattern compression from 3D to 0D,” Opt. Express 30(12), 21866–21875 (2022). [CrossRef]  

15. C. Zhao, W. Gong, M. Chen, E. Li, H. Wang, W. Xu, and S. Han, “Ghost imaging lidar via sparsity constraints,” Appl. Phys. Lett. 101(14), 141123 (2012). [CrossRef]  

16. W. Gong, C. Zhao, H. Yu, M. Chen, W. Xu, and S. Han, “Three-dimensional ghost imaging lidar via sparsity constraint,” Sci. Rep. 6(1), 26133 (2016). [CrossRef]  

17. M. J. Sun, M. P. Edgar, G. M. Gibson, B. Sun, N. Radwell, R. Lamb, and M. J. Padgett, “Single-pixel three-dimensional imaging with time-based depth resolution,” Nat. Commun. 7(1), 12010 (2016). [CrossRef]  

18. X. Liu, J. Shi, L. Sun, Y. Li, J. Fan, and G. Zeng, “Photon-limited single-pixel imaging,” Opt. Express 28(6), 8132–8144 (2020). [CrossRef]  

19. B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. J. Padgett, “3D computational imaging with single-pixel detectors,” Science 340(6134), 844–847 (2013). [CrossRef]  

20. Y. Qian, R. He, Q. Chen, G. Gu, F. Shi, and W. Zhang, “Adaptive compressed 3D ghost imaging based on the variation of surface normals,” Opt. Express 27(20), 27862–27872 (2019). [CrossRef]  

21. L. Zhang, Z. Lin, R. He, Y. Qian, Q. Chen, and W. Zhang, “Improving the noise immunity of 3D computational ghost imaging,” Opt. Express 27(3), 2344–2353 (2019). [CrossRef]  

22. Z. Yang, G. Li, R. Yan, Y. Sun, L.-A. Wu, and A.-X. Zhang, “3-D Computational Ghost Imaging With Extended Depth of Field for Measurement,” IEEE Trans. Instrum. Meas. 68(12), 4906–4912 (2019). [CrossRef]  

23. E. Salvador-Balaguer, P. Clemente, E. Tajahuerce, F. Pla, and J. Lancis, “Full-color stereoscopic imaging with a single-pixel photodetector,” J. Disp. Technol. 12, 417–422 (2016). [CrossRef]  

24. Y. Wang, J. Yang, L. Wang, X. Ying, T. Wu, W. An, and Y. Guo, “Light Field Image Super-Resolution Using Deformable Convolution,” IEEE Trans. on Image Process. 30, 1057–1071 (2021). [CrossRef]  

25. S. Zhang, H. Sheng, C. Li, J. Zhang, and Z. Xiong, “Robust depth estimation for light field via spinning parallelogram operator,” Comput. Vis. Image Underst. 145, 148–159 (2016). [CrossRef]  

26. G. Wu, Y. Liu, L. Fang, Q. Dai, and T. Chai, “Light Field Reconstruction Using Convolutional Network on EPI and Extended Applications,” IEEE Trans. Pattern Anal. Mach. Intell. 41(7), 1681–1694 (2019). [CrossRef]  

27. M. Lyu, W. Wang, H. Wang, H. Wang, G. Li, N. Chen, and G. Situ, “Deep-learning-based ghost imaging,” Sci. Rep. 7(1), 17865 (2017). [CrossRef]  

28. H. Zhang and D. Duan, “Computational ghost imaging with compressed sensing based on a convolutional neural network,” Chin. Opt. Lett. 19(10), 101101 (2021). [CrossRef]  

29. H. Song, X. Nie, H. Su, H. Chen, Y. Zhou, X. Zhao, T. Peng, and M. O. Scully, “0.8% Nyquist computational ghost imaging via non-experimental deep learning,” Opt. Commun. 520, 128450 (2022). [CrossRef]  

30. Y. Tai, J. Yang, and X. Liu, “Image Super-Resolution via Deep Recursive Residual Network,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (IEEE, 2017), pp. 2790–2798.

31. S. H. Woo, J. Park, J. Y. Lee, and I. S. Kweon, “CBAM: Convolutional Block Attention Module,” in 15th European Conference on Computer Vision (ECCV), (Springer, 2018), pp. 3–19.

32. A. Faluvegi, Q. Bolsee, S. Nedevschi, V. T. Dadarlat, and A. Munteanu, “A 3D convolutional neural network for light field depth estimation,” in 9th International Conference on 3D Immersion (IC3D), (IEEE, 2019), pp. 1–5.

33. D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics,” in 8th IEEE International Conference on Computer Vision (ICCV 2001), (IEEE, 2001), pp. 416–423.

34. J. B. Huang, A. Singh, and N. Ahuja, “Single Image Super-resolution from Transformed Self-Exemplars,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (IEEE, 2015), pp. 5197–5206.

35. K. Honauer, O. Johannsen, D. Kondermann, and B. Goldluecke, “A Dataset and Evaluation Methodology for Depth Estimation on 4D Light Fields,” in 13th Asian Conference on Computer Vision (ACCV), (Springer, 2016), pp. 19–34.

36. A. Li, X. Liu, J. Sun, and Z. Lu, “Risley-prism-based multi-beam scanning LiDAR for high-resolution three-dimensional imaging,” Opt. Lasers Eng. 150, 106836 (2022). [CrossRef]  

37. J. F. Sun, L. R. Liu, M. J. Yun, L. Y. Wan, and M. L. Zhang, “Distortion of beam shape by a rotating double-prism wide-angle laser beam scanner,” Opt. Eng. 45(4), 043004 (2006). [CrossRef]  

38. S. Alajlouni, “Solution to the Control Problem of Laser Path Tracking Using Risley Prisms,” IEEE ASME Trans. Mechatron. 21(4), 1892–1899 (2016). [CrossRef]  

39. F. Huang, H. Ren, X. Wu, and P. Wang, “Flexible foveated imaging using a single Risley-prism imaging system,” Opt. Express 29(24), 40072–40090 (2021). [CrossRef]  

40. H. Zhang, J. Cao, H. Cui, D. Zhou, and Q. Hao, “Virtual image array generated by Risley prisms for three-dimensional imaging,” Opt. Commun. 517, 128309 (2022). [CrossRef]  

41. A. Li, Z. Zhao, X. Liu, and Z. Deng, “Risley-prism-based tracking model for fast locating a target using imaging feedback,” Opt. Express 28(4), 5378–5392 (2020). [CrossRef]  

42. P. J. Bos, H. Garcia, and V. Sergan, “Wide-angle achromatic prism beam steering for infrared countermeasures and imaging applications: solving the singularity problem in the two-prism design,” Opt. Eng. 46(11), 113001 (2007). [CrossRef]  

43. B. D. Duncan, P. J. Bos, and V. Sergan, “Wide-angle achromatic prism beam steering for infrared countermeasure applications,” Opt. Eng. 42(4), 1038–1047 (2003). [CrossRef]  

Supplementary Material (1)

Supplement 1: Supplemental document with details of the theoretical analysis.
