
Complete plenoptic imaging using a single detector

Open Access

Abstract

Multi-dimensional imaging is a powerful technique for many applications, such as biological analysis, remote sensing, and object recognition. Most existing multi-dimensional imaging systems rely on scanning or camera arrays, which make the systems bulky and unstable. To some extent, these problems can be mitigated by employing compressed sensing algorithms. However, such algorithms are computationally expensive and rely heavily on the ill-posed assumption that the information is sparse in a given domain. Here, we propose a snapshot spectral-volumetric imaging (SSVI) system by introducing the paradigm of light-field imaging into Fourier transform imaging spectroscopy. We demonstrate that SSVI can reconstruct the complete plenoptic function, P(x,y,z,θ,φ,λ,t), of the incoming light rays using a single detector. Compared with other multi-dimensional imagers, SSVI features prominent advantages in compactness, robustness, and low cost.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

In an imaging system, each incoming light ray carries abundant information, which is described as the plenoptic function by Adelson and Bergen [1]. This seven-dimensional (7D) plenoptic function, P(x,y,z,θ,φ,λ,t), gives the spatial (x,y,z) and angular (θ,φ) coordinates, wavelength (λ), and time (t) of an incoming light ray. Gao and Wang [2] further expanded the function to nine dimensions by including the polarization orientation and ellipticity angles (ψ,χ). Conventional imaging systems, which record information in only two dimensions (x,y), miss much of the information delivered by the light rays.

Spectral information in the wavelength dimension (λ) reveals the chemical and molecular characteristics of the object [3–6]. Spectral imaging has been widely used in many fields, such as biology and biomedicine [3,7], remote sensing [4,8], and food quality control [5,6], due to its three-dimensional (3D) (x,y,λ) imaging capability. On the other hand, the morphological and functional information of the object, carried by the volumetric spatial data (x,y,z), can be very useful in biomedical imaging [9–12], photography [13], object recognition [14,15], and particle image velocimetry [16,17]. In the past few decades, numerous efforts have been made to push the limits of imaging systems and capture light information in more dimensions. In recent years, spectral-volumetric imaging, which is capable of capturing a four-dimensional (4D) datacube (x,y,z,λ) of the incoming light rays, has drawn considerable interest with applications in biomedicine [18,19,29], remote sensing [20,21], object recognition [22], and multimedia [23]. Most spectral-volumetric imaging systems rely on scanning [18–25] or multi-shot acquisition [26,27], which requires prolonged acquisition times and restricts their use in dynamic imaging. Fusing data from multiple sensors [28–31] is a feasible strategy to capture a 4D datacube (x,y,z,λ) in a single snapshot, but such systems are bulky and suffer from alignment errors. Feng et al. have reported on compressed sensing to achieve snapshot 4D spectral-volumetric imaging [32]. However, this technique is computationally expensive and relies heavily on the ill-posed assumption that the 4D datacube (x,y,z,λ) is sparse in a given domain.

Here, we demonstrate a snapshot spectral-volumetric imaging (SSVI) system by introducing the paradigm of light-field imaging into Fourier transform imaging spectroscopy. In the SSVI system, we employ a single detector to record the light-field image coupled with the interference from a birefringent polarization interferometer. A convolutional neural network has been developed to decouple the light-field image and the interference. We derive the 3D volumetric datacube (x,y,z) and the 3D spectral datacube (x,y,λ) from the light-field image and the interference, respectively. Combining these two datacubes with the 4D light-field datacube (x,y,θ,φ) gives the complete 7D plenoptic function, P(x,y,z,θ,φ,λ,t). To our knowledge, this is the first time a complete plenoptic function of the light field has been recorded by a single detector. Compared with multi-sensor-based systems [28–31], SSVI is compact, robust, and inexpensive. In addition, SSVI preserves the high-frequency details of the object in both the spatial and spectral domains.

2. Methods

2.1 Snapshot spectral-volumetric imaging system

The configuration of the SSVI system is shown in Fig. 1. The incident light from an object is first imaged by an objective lens, forming the first intermediate image at a field stop. This intermediate image is relayed by relay lens I to a virtual intermediate image, indicated as the second intermediate image in Fig. 1(a). After that, an elemental image (EI) array, denoted as the third intermediate image, is formed through a microlens array (MLA). Finally, the third intermediate image is relayed by relay lens II onto a charge-coupled device (CCD).

Fig. 1 Snapshot spectral-volumetric imaging system configuration. (a) System sketch. MLA, microlens array; CCD, charge-coupled device; EI, elemental image. (b) Configuration of the birefringent polarization interferometer (BPI). Components between the polarizer I and the Wollaston prism have been omitted for clarity. The red arrow and circle indicate the polarization eigenmodes of the Wollaston prism, while the black ones denote the polarization of light rays. The transmission axes of the two polarizers are both oriented at 45° with respect to the polarization eigenmodes of the Wollaston prism. (c) The rotated BPI and CCD. Other components have been omitted.

The SSVI system incorporates a birefringent polarization interferometer (BPI) which contains two polarizers and a Wollaston prism, as shown in Fig. 1(b). The Wollaston prism was custom-made from quartz. Based on the birefringence of the prism, the BPI separates the incident light rays into two paths which are then converged by relay lens II and interfere on the CCD. As depicted in Fig. 1(c), the BPI is rotated about the z-axis by a small angle (δ) with respect to the y-axis. Thus, the theoretical optical path difference (OPD) on the CCD can be derived as [33,34]

$$\mathrm{OPD}(x,y) = \frac{2B\tan(\alpha)}{M_{R2}}\left[(x - x_0)\cos(\delta) - y\sin(\delta)\right], \tag{1}$$
where B is the birefringence of quartz, α is the wedge angle of the Wollaston prism, M_R2 is the magnification of relay lens II, and x_0 is the x offset of the zero-OPD reference position. A virtual interference plane is located inside the Wollaston prism, as indicated by the blue dashed line in Fig. 1(b). In SSVI, we adopt a focused light-field configuration: the second intermediate image is located behind the MLA [35]. To record both the light field and the interference of the object, we set the third intermediate image and the virtual interference plane within the depth of field (DOF) of relay lens II.

The MLA design affects the resolution in the lateral (x,y), spectral (λ), and angular (θ,φ) dimensions. In a focused light-field camera, the lateral resolution is determined by the pitch (microlens diameter) and numerical aperture (NA) of the MLA, while the angular resolution is controlled by the number of microlenses in the MLA [35]. On the other hand, the spectral resolution of a BPI-based imaging spectrometer is related to both the pitch and the number of microlenses in the MLA [33,34]. Here, we employ an MLA that contains 15 × 15 microlenses with a pitch of 1 mm × 1 mm. The focal length of each microlens is 10.9 mm, which gives NA = 0.0459. A bi-telecentric lens with 0.44× magnification is used to relay the third intermediate image onto the CCD. In the BPI, we employ a customized Wollaston prism with a wedge angle α = 7.6° and an x offset x_0 ≈ 1.5 mm, which makes the largest OPD around 28 μm.
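As a rough numerical cross-check of Eq. (1) and the quoted ~28 μm maximum OPD, the minimal sketch below evaluates the OPD map over the detector. The quartz birefringence value and the sensor pixel geometry are illustrative assumptions rather than values given in the paper, so the result agrees with the quoted figure only in order of magnitude.

```python
import numpy as np

# Minimal sketch of Eq. (1): OPD map on the detector.
# B (quartz birefringence) and the sensor geometry are assumed, not taken
# from the paper.
B = 0.0091                    # birefringence of quartz (assumed, visible range)
alpha = np.deg2rad(7.6)       # Wollaston wedge angle
delta = np.arctan(1 / 15)     # BPI rotation angle, tan^-1(1/Q) with Q = 15
M_R2 = 0.44                   # magnification of relay lens II
x0 = 1.5e-3                   # zero-OPD offset (m), assumed sign/plane

# assumed 2/3" sensor: 2456 x 2058 pixels, 3.45 um pitch, centred coordinates
x = (np.arange(2456) - 2456 / 2) * 3.45e-6
y = (np.arange(2058) - 2058 / 2) * 3.45e-6
X, Y = np.meshgrid(x, y)

opd = 2 * B * np.tan(alpha) * ((X - x0) * np.cos(delta) - Y * np.sin(delta)) / M_R2
print(f"max |OPD| = {np.abs(opd).max() * 1e6:.1f} um")  # a few tens of um with these assumptions
```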

2.2 Reconstruction algorithm

Mathematically, we denote the plenoptic function of an incoming light ray from the object as P(x,y,z,θ,φ,λ,t). This can be reduced to P(x,y,θ,φ,λ) under the assumptions that the function does not vary within a single integration time of the detector and that the radiance along a ray is constant [36]. Without loss of generality, we can consider only the rays in the y–z plane, and the function can be further simplified to P(y,φ,λ).

For a centered lens system, the output ray vector [y, φ]ᵀ at the image point can be derived as [37]

$$\begin{bmatrix} y \\ \varphi \end{bmatrix} = \begin{bmatrix} M & 0 \\ -1/f & 1/M \end{bmatrix}\begin{bmatrix} y' \\ \varphi' \end{bmatrix}, \tag{2}$$
where [y′, φ′]ᵀ is the input ray vector at the object point, and M and f are the magnification and focal length of the system, respectively. For the SSVI system with an MLA, we shift the optical axis to the center of each microlens and shift it back to the z-axis after applying the transformation given in Eq. (2). For a light ray passing through the nth microlens along the y-axis, the output ray vector [y, φ]ᵀ at the image point can be calculated as
$$\begin{bmatrix} y \\ \varphi \end{bmatrix} = \begin{bmatrix} M & 0 \\ -1/f & 1/M \end{bmatrix}\begin{bmatrix} y' - d_n \\ \varphi' \end{bmatrix} + \begin{bmatrix} d_n \\ 0 \end{bmatrix}, \tag{3}$$
where d_n is the distance from the optical axis to the center of the nth microlens. Here, we define distances pointing up and down as positive and negative, respectively. Including the interference introduced by the BPI, we can derive the radiance of the light rays on the CCD as
$$P(y,\varphi,\lambda) = P'\!\left(\frac{y - d_n}{M} + d_n,\; \frac{y - d_n}{f} + M\varphi,\; \lambda\right)\frac{1}{2}\left[1 + \cos\!\left(\frac{2\pi}{\lambda}\,\mathrm{OPD}\right)\right], \tag{4}$$
where P′ denotes the radiance of light rays from the object. Herein, we consider the BPI as an integrated component that modulates the input light rays by the OPD, as indicated in Eq. (4); the detailed light propagation in the BPI, e.g., the separation of the light rays, is omitted. Extending Eq. (4) to 3D space and integrating over all directions and wavelengths at each location (x,y) gives the intensity of the theoretical raw image captured by the CCD:
$$I(x,y) = \iiint_{\Theta,\Phi,\Lambda} P'(x',y',\theta',\varphi',\lambda)\,\frac{1}{2}\left\{1 + \cos\!\left[\frac{2\pi}{\lambda}\,\mathrm{OPD}(x,y)\right]\right\}\,d\theta\,d\varphi\,d\lambda, \tag{5}$$
where

$$x' = \frac{x - d_m}{M} + d_m,\qquad y' = \frac{y - d_n}{M} + d_n,\qquad \theta' = \frac{x - d_m}{f} + M\theta,\qquad \varphi' = \frac{y - d_n}{f} + M\varphi. \tag{6}$$

Θ, Φ, and Λ are the detection ranges of the SSVI system in the two-dimensional propagation-angle domain and the spectral domain, respectively. d_m and d_n are the distances from the optical axis to the center of the microlens through which the light rays pass, along the x- and y-axis, respectively. If we assume a Lambertian surface, the spectrum (λ) of the light rays is independent of the propagation angles (θ,φ), and Eq. (5) can be decomposed as

$$I(x,y) = \iint_{\Theta,\Phi} l(x',y',\theta',\varphi')\,d\theta\,d\varphi \int_{\Lambda} s(x',y',\lambda)\,\frac{1}{2}\left\{1 + \cos\!\left[\frac{2\pi}{\lambda}\,\mathrm{OPD}(x,y)\right]\right\}\,d\lambda, \tag{7}$$
where l(x′,y′,θ′,φ′) and s(x′,y′,λ) denote the 4D light-field datacube and the 3D spectral datacube of the object, respectively.
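To make the image-formation model of Eq. (7) concrete, the following minimal sketch synthesizes a toy raw frame as the product of an angularly integrated light field and a spectrally integrated fringe term. The MLA coordinate remapping of Eq. (6) is omitted for brevity, and all array sizes and values are illustrative.

```python
import numpy as np

# Sketch of Eq. (7): a raw frame is the angularly integrated light field
# multiplied by a spectrally integrated interference term.  The coordinate
# remapping of Eq. (6) is ignored; all values are illustrative.
H, W, n_lambda = 64, 64, 31
lam = np.linspace(400e-9, 700e-9, n_lambda)            # wavelength axis (m)

rng = np.random.default_rng(0)
I_lf = rng.uniform(0.5, 1.0, (H, W))                   # light field integrated over (theta, phi)
s = rng.uniform(0.0, 1.0, (H, W, n_lambda))            # spectral datacube s(x, y, lambda)

# a linear OPD ramp standing in for Eq. (1)
opd = np.tile(np.linspace(0.0, 28e-6, W), (H, 1))

# spectrally integrated fringe term: sum over lambda of s * 0.5 * (1 + cos(2*pi*OPD/lambda))
fringe = 0.5 * (1.0 + np.cos(2 * np.pi * opd[..., None] / lam))
interference = (s * fringe).sum(axis=-1) * (lam[1] - lam[0])

raw = I_lf * interference                              # multiplicatively coupled raw image
print(raw.shape)
```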

Figure 2 shows the image processing pipeline, which consists of four steps: (I) light-field–interference decoupling, (II) depth reconstruction, (III) interferogram extraction, and (IV) spectral-datacube reconstruction. As indicated in Eq. (7), the light field and the interference are multiplicatively coupled in the raw image I(x,y). Therefore, the first step is to decouple the light field and the interference, or in other words, to remove the interference from the raw image. A convolutional neural network (CNN) is a powerful tool that has been successfully used to remove image patterns, e.g., rain streaks [38] and dirt on a window [39]. Moreover, the interference in the raw image captured by SSVI is much simpler than rain streaks or dirt on a window, because it tilts at a specific angle and its frequency is limited by the spectral response of the CCD. Herein, we develop a CNN, as shown in Fig. 2(a), to decouple the light field and the interference. We refer to the proposed decoupling convolutional neural network as DC-CNN. To convert the multiplicative coupling into an additive one, we take the logarithm of the raw image, i.e.,

Fig. 2 Flowchart of image processing pipeline. (a) Framework of the light-field-interferogram decoupling convolutional neural network (DC-CNN). Log, logarithm; Conv, convolution; ReLU, rectified linear unit; BN, batch normalization. (b) Light-field image. (c) An epipolar plane image extracted from the light-field image. (d) Disparity map. (e) Depth map. (f) An epipolar plane image extracted from the raw image. (g) Interferogram of a single pixel. (h) Spectrum of a single pixel. FFT, fast Fourier transform. (i) Spectral datacube.

$$\log[I(x,y)] = \log\!\left[\iint_{\Theta,\Phi} l(x',y',\theta',\varphi')\,d\theta\,d\varphi\right] + \log\!\left[\int_{\Lambda} s(x',y',\lambda)\,\frac{1}{2}\left\{1 + \cos\!\left[\frac{2\pi}{\lambda}\,\mathrm{OPD}(x,y)\right]\right\}\,d\lambda\right]. \tag{8}$$

Inspired by the residual learning strategy [40], we set the interference, rather than the light-field image of the object, as the learning target, because the light-field image is much more complex than the interference. As shown in Fig. 2(a), the DC-CNN architecture consists of two hidden layers and an output layer. The forward propagation of the network can be represented by three operations:

$$H_1(I_i) = \mathrm{ReLU}\{W_1 * \log[H_0(I_i)] + b_1\}, \tag{9}$$
$$H_2(I_i) = \mathrm{ReLU}[W_2 * H_1(I_i) + b_2], \tag{10}$$
$$H_3(I_i) = \exp[W_3 * H_2(I_i) + b_3], \tag{11}$$
where W_l and b_l (l = 1, 2, 3) are the weight and bias parameters to be learned through training, ReLU(·) denotes the rectified linear unit, ReLU(x) = max(0, x), and * indicates convolution. I_i is the ith image in the training set. H_l(I_i) (l = 1, 2) is the output of the lth hidden layer, while H_0(I_i) and H_3(I_i) are the input raw image and the output interference, respectively.
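For illustration, a minimal sketch of the forward pass of Eqs. (9)–(11) is given below, written in PyTorch rather than the MatConvNet framework used by the authors. The channel counts, 3 × 3 kernels, and the placement of the batch normalization shown in Fig. 2(a) are assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn

class DCCNN(nn.Module):
    """Sketch of the decoupling CNN of Eqs. (9)-(11): two conv+ReLU hidden
    layers applied to the log of the raw image, and an exp() output layer
    that predicts the interference pattern.  Layer widths and kernel sizes
    are illustrative assumptions."""
    def __init__(self, hidden=64):
        super().__init__()
        self.h1 = nn.Sequential(nn.Conv2d(1, hidden, 3, padding=1),
                                nn.BatchNorm2d(hidden), nn.ReLU())
        self.h2 = nn.Sequential(nn.Conv2d(hidden, hidden, 3, padding=1),
                                nn.BatchNorm2d(hidden), nn.ReLU())
        self.out = nn.Conv2d(hidden, 1, 3, padding=1)

    def forward(self, raw):
        x = torch.log(raw.clamp_min(1e-6))    # Eq. (8): work in the log domain
        x = self.h1(x)                         # Eq. (9)
        x = self.h2(x)                         # Eq. (10)
        return torch.exp(self.out(x))          # Eq. (11): predicted interference

net = DCCNN()
raw = torch.rand(1, 1, 128, 128) + 0.1         # dummy raw image
interference = net(raw)
light_field = raw / interference               # decoupling: divide out the fringes
```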

To generate the training set, we used the spectral images from the ICVL spectral data set released by Arad and Ben-Shahar [41]. These images were captured by a line-scan camera (Specim PS Kappa DX4 hyperspectral) with a lateral resolution of 1392 × 1300 over 519 spectral bands and then downsampled to 31 spectral channels from 400 nm to 700 nm. We further downsampled the spectral images to the size of a single elemental image captured by the SSVI system and replicated each downsampled image into a 5 × 5 elemental-image array. A synthesized elemental image array with interference, I, was then derived through Eqs. (1) and (7), while the corresponding elemental image array without interference, E, was calculated simply by accumulating the slice images over all spectral channels. We define the loss function of the DC-CNN as

$$L = \frac{1}{N}\sum_{i=1}^{N}\left\| H_3(I_i) - \frac{I_i}{E_i} \right\|_2^2, \tag{12}$$
where N is the number of images in the training set. The network was implemented in MatConvNet [42]. Training and evaluation were run on a workstation with an Intel Xeon E5-2650 CPU (2.0 GHz), 128 GB of RAM, and an NVIDIA GeForce GTX 1080 Ti GPU. It took about 20 hours to train the full network. An example input raw image and the corresponding output interference are shown in Fig. 2(a). Dividing the input image by the interference gives the light-field image [Fig. 2(b)], which can be expressed mathematically as

$$I_{LF}(x,y) = \iint_{\Theta,\Phi} l\!\left(\frac{x - d_m}{M} + d_m,\; \frac{y - d_n}{M} + d_n,\; \frac{x - d_m}{f} + M\theta,\; \frac{y - d_n}{f} + M\varphi\right)d\theta\,d\varphi. \tag{13}$$
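Before moving on to step II, the following self-contained sketch shows how one training step against the loss of Eq. (12) might look, assuming the learning target is the fringe ratio I_i/E_i (consistent with the multiplicative coupling model). The stand-in network, optimizer settings, and random tensors are illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of one training step with the loss of Eq. (12).  `net` is a stand-in
# for the DC-CNN (e.g. the DCCNN class sketched above); `raw` (I_i) and
# `clean` (E_i) stand in for a synthesized elemental-image array with and
# without interference.
net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 1, 3, padding=1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

raw = torch.rand(4, 1, 128, 128) + 0.1     # I_i: with interference
clean = torch.rand(4, 1, 128, 128) + 0.1   # E_i: without interference
target = raw / clean                       # interference pattern (residual learning target)

pred = net(raw)                            # H_3(I_i)
loss = F.mse_loss(pred, target)            # (1/N) * sum ||H_3(I_i) - I_i/E_i||_2^2
opt.zero_grad()
loss.backward()
opt.step()
```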

In step II, we consider the MLA as an array of stereo cameras and arrange the elemental image array into a four-dimensional datacube LF(x,y,m,n), where m and n index the elemental images along the x- and y-axis, respectively. A 2D y–n slice of the datacube, dubbed an epipolar plane image (EPI), is shown in Fig. 2(c). The tilting angle β of the 'line' structure in the EPI corresponds to a certain depth of that point in object space [43]. The disparity between the top and bottom elemental images can be derived as D = h·tan(β), where h is the width of the EPI along the n-axis. We adopt a disparity estimation algorithm based on the scale-depth space transform [44] to reconstruct the disparity map. Finally, the depth map of the object can be derived through a calibration procedure [45].
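A minimal sketch of step II, under stated assumptions: the light-field datacube, the tilt estimate β (the paper uses the scale-depth space transform [44]), and the disparity-to-depth calibration are all hypothetical stand-ins.

```python
import numpy as np

# Step II sketch: slice an EPI out of LF(x, y, m, n) and turn an estimated
# line tilt into a disparity and then a depth.  Everything numeric here is an
# illustrative stand-in.
P = Q = 15                      # elemental images along m and n
h_ei, w_ei = 110, 110           # pixels per elemental image (assumed)
LF = np.random.rand(h_ei, w_ei, P, Q)   # ordered as (y, x, m, n)

x_fixed, m_fixed = 55, 7
epi = LF[:, x_fixed, m_fixed, :]        # 2D y-n slice: shape (h_ei, Q)

beta = np.deg2rad(20.0)                 # tilt of the 'line' structure (stand-in)
h = Q                                   # EPI width along the n-axis
D = h * np.tan(beta)                    # disparity between top and bottom EIs
D_a = D / (Q - 1)                       # disparity between adjacent EIs

# hypothetical disparity-to-depth calibration z = a / D_a + b (Ref. [45] would
# supply the real mapping)
a, b = 900.0, 100.0
z = a / D_a + b
print(f"disparity per adjacent EI = {D_a:.2f} px, depth ~ {z:.0f} mm")
```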

In step III, we extract interference vectors from the raw image under the guidance of the tilting angles derived in the last step. Figure 2(f) shows an example of the EPIs reconstructed from the raw image. The red dashed line indicates the tilting angle at that position. The vector extracted along the red dashed line from bottom to top is a portion of the corresponding interference vector. Concatenating all the vectors extracted from the EPIs along the m-axis gives the complete interference vector.

Mathematically, we denote the extracted interference vector as $[LF_{raw}(x_i, y_i, m_i, n_i)]_{i=1,2,\ldots,PQ}$, where $LF_{raw}(x,y,m,n)$ is the light-field datacube reconstructed from the raw image, and P and Q are the numbers of elemental images along the m- and n-axis, respectively. Here, we consider the central elemental image as the reference. According to the extraction strategy described above, the coordinates $(x_i, y_i, m_i, n_i)$ can be calculated by

$$m_i = \lceil i/Q \rceil, \tag{14}$$
$$n_i = Q - \mathrm{mod}(i-1,\,Q), \tag{15}$$
$$x_i = x_c + D_a(m_c - m_i), \tag{16}$$
$$y_i = y_c + D_a(n_c - n_i), \tag{17}$$
where $\lceil\cdot\rceil$ denotes the ceiling operation and $\mathrm{mod}(x,y)$ gives the remainder after dividing x by y. The central elemental image is indexed as $(m_c, n_c)$, and $(x_c, y_c)$ are the coordinates of the corresponding point in the central elemental image. $D_a$ denotes the disparity between adjacent elemental images. An OPD vector corresponding to the interference vector can be derived as $[\mathrm{OPD}(x_i^{raw}, y_i^{raw})]_{i=1,2,\ldots,PQ}$, where $(x_i^{raw}, y_i^{raw})$ are the coordinates of the point in the raw image corresponding to the point $(x_i, y_i, m_i, n_i)$ in the light-field datacube. These coordinates can be calculated by
$$x_i^{raw} = x_i + (m_i - 1)d, \tag{18}$$
$$y_i^{raw} = y_i + (n_i - 1)d, \tag{19}$$
where d is the distance between the centers of adjacent elemental images. Combining Eq. (1) and Eqs. (16)-(19) gives
$$\mathrm{OPD}(x_i^{raw}, y_i^{raw}) = \frac{2B\tan(\alpha)}{M_{R2}}\left[(x_i^{raw} - x_0)\cos(\delta) - y_i^{raw}\sin(\delta)\right], \tag{20}$$
where

$$x_i^{raw} = x_c + m_c D_a - d + m_i(d - D_a), \tag{21}$$
$$y_i^{raw} = y_c + n_c D_a - d + n_i(d - D_a). \tag{22}$$

We can obtain an OPD vector with equal intervals by setting $\delta = \tan^{-1}(1/Q)$ [33,34]. Figure 2(g) plots an example interference vector at different OPD values.
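The coordinate bookkeeping of Eqs. (14)–(22) is compact enough to sketch directly. The sketch below builds the sampling coordinates and OPD values for one reference point and verifies that δ = tan⁻¹(1/Q) indeed yields uniformly spaced OPD samples; the pixel pitch, disparity, and prism parameters are illustrative stand-ins.

```python
import numpy as np

# Step III sketch: sampling coordinates of Eqs. (14)-(19) and OPD values of
# Eq. (20) for one point (x_c, y_c) referenced to the central elemental image.
# A real implementation would convert pixel coordinates to physical units
# before applying Eq. (20); here everything is illustrative.
P = Q = 15
d = 110.0                    # distance between adjacent EI centres (pixels, assumed)
D_a = 1.8                    # disparity between adjacent EIs (from step II, stand-in)
m_c = n_c = 8                # index of the central elemental image
x_c, y_c = 55.0, 55.0        # point of interest in the central EI

i = np.arange(1, P * Q + 1)
m_i = np.ceil(i / Q)                          # Eq. (14)
n_i = Q - np.mod(i - 1, Q)                    # Eq. (15)
x_i = x_c + D_a * (m_c - m_i)                 # Eq. (16)
y_i = y_c + D_a * (n_c - n_i)                 # Eq. (17)
x_raw = x_i + (m_i - 1) * d                   # Eq. (18)
y_raw = y_i + (n_i - 1) * d                   # Eq. (19)

B, alpha, delta, M_R2, x0 = 0.0091, np.deg2rad(7.6), np.arctan(1 / Q), 0.44, 0.0
opd = 2 * B * np.tan(alpha) / M_R2 * ((x_raw - x0) * np.cos(delta)
                                      - y_raw * np.sin(delta))     # Eq. (20)

steps = np.diff(opd)
print(f"uniform OPD sampling: {np.allclose(steps, steps[0])}")      # True for delta = atan(1/Q)
```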

In step IV, we derive the spectrum at point $(x_c, y_c)$ by taking the Fourier transform of the interference vector $[LF_{raw}(x_i, y_i, m_i, n_i)]_{i=1,2,\ldots,PQ}$ along the OPD vector $[\mathrm{OPD}(x_i^{raw}, y_i^{raw})]_{i=1,2,\ldots,PQ}$. Figure 2(h) plots the derived spectrum corresponding to the interferogram in Fig. 2(g). Finally, applying this procedure pixel by pixel yields the reconstructed spectral datacube S(x,y,λ) [Fig. 2(i)].
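As an illustration of step IV, the sketch below recovers a single monochromatic line from a synthetic, uniformly sampled interferogram with an FFT. The test spectrum and sampling are illustrative; the apodization, phase correction, and wavelength calibration used in the actual pipeline are not reproduced.

```python
import numpy as np

# Step IV sketch: interferogram -> spectrum by FFT over a uniform OPD axis.
N = 225                              # P*Q interferogram samples
opd_max = 28e-6                      # maximum OPD (m), as quoted in Section 2.1
opd = np.linspace(0.0, opd_max, N)

sigma0 = 1.0 / 532e-9                # wavenumber of a 532 nm test line (1/m)
interferogram = 0.5 * (1.0 + np.cos(2 * np.pi * sigma0 * opd))

# remove the DC pedestal, then FFT; the wavenumber axis follows from the OPD
# sampling interval
signal = interferogram - interferogram.mean()
spectrum = np.abs(np.fft.rfft(signal))
sigma = np.fft.rfftfreq(N, d=opd[1] - opd[0])   # wavenumber axis (1/m)

peak = sigma[np.argmax(spectrum)]
print(f"recovered line at ~{1e9 / peak:.0f} nm")  # close to 532 nm
```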

3. Experiment

To demonstrate the concept of SSVI, we built a prototype (Fig. 3) based on the description in Section 2. We employed two commercial lenses (Canon EF 50 mm f/1.8 and EF 50 mm f/1.4) as the objective lens and relay lens I, respectively. A monochromatic CCD (BM-500GE, JAI) with 2058 × 2456 pixels was used as the detector. Before the following experiments, we calibrated the system in both the spectral and depth dimensions using the methods described in Refs. [34] and [45], respectively.

Fig. 3 Prototype of the snapshot spectral-volumetric imaging system.

3.1 Real-time spectral-volumetric video

To highlight SSVI's single-shot multi-dimensional imaging capability, we visualized a dynamic scene consisting of a static white board with black letters and a green leaf swinging along the line of sight with a period of ~2 s [Fig. 4(a)]. The scene was located at a distance of ~0.85 m from the SSVI system, and a halogen lamp (MI-150, Edmund) was used for illumination. Using the reconstruction algorithm described in Section 2, we derived a real-time 4D (x,y,z,λ) video with a frame rate of 15 Hz (Visualization 1). Figures 4(b)–4(i) show the depth map and reconstructed image of eight example frames from the video. Here, the reconstructed images are derived by accumulating the datacube over all spectral channels. As a result, we are able to visualize the real-time motion of the leaf in 3D (x,y,z). To demonstrate the refocusing ability of the SSVI system, we reconstructed two videos with the system focused at 450 mm and 850 mm (Visualization 2). Figures 4(j)–4(m) illustrate four example frames from each reconstructed video.

Fig. 4 Real-time spectral-volumetric video. (a) Illustration of the scene including a static paper with letters and a swinging leaf. (b-i) Samples of consecutive depth and image frames reconstructed at 15 Hz frame-rate with a lateral resolution of 110 × 110 pixels. (j-m) Samples of image frames reconstructed when the system focuses at 450mm (upper row) and 850mm (lower row). Note: the full video can be seen in Visualization 1 and Visualization 2.

In this experiment, we attached an L-shaped green paper to the swinging leaf. Figure 5(c) shows a close-up of the leaf captured by a commercial RGB camera. A frame of the reconstructed-image video at t = 1.6 s is depicted in Fig. 5(a). Since the color of the green paper is quite close to that of the leaf, the contrast between the L-shaped paper and the leaf is low in Figs. 5(a) and 5(c). Figure 5(b) plots the spectra of points A and B, which are located on the green paper and the leaf [Fig. 5(a)], respectively. The spectrum of point B increases dramatically around 700 nm due to the absorption properties of chlorophyll. By contrast, the spectrum of point A varies smoothly from 500 nm to 800 nm. This phenomenon, whereby different spectra appear as the same color to RGB cameras and human eyes, is referred to as metamerism. Unlike the monochromatic [Fig. 5(a)] and RGB images [Fig. 5(c)], the spectral slice at 728.6 nm [Fig. 5(d)] shows high contrast between the L-shaped paper and the leaf. The image of the leaf can be further enhanced by taking the difference between the spectral slices at 728.6 nm [Fig. 5(d)] and 673.9 nm [Fig. 5(e)].

Fig. 5 Response to metamerism. (a) Reconstructed image at t=1.6s. (b) Spectra of points A and B indicated in (a). (c) High-resolution RGB image of the leaf captured by a commercial camera. (d) Spectral slice at 728.6nm. (e) Spectral slice at 673.9nm.

3.2 Lateral resolution

We assessed the lateral resolution of SSVI by imaging a 1951 USAF resolution test target at ten steps along the optical axis. The lateral resolution was derived from the reconstructed image at each step. Figure 6(a) shows an example of the reconstructed images when the resolution target was located 840 mm from the imaging system, and Fig. 6(b) plots the experimentally measured resolution at the ten steps with blue squares. The red dashed line indicates the diffraction limit, which predicts a finer resolution than the experimentally measured values. We attribute this discrepancy to two factors. First, the aberrations introduced by the optical lenses, especially by the MLA, blur the image. Second, the depth reconstruction error also degrades the imaging performance.

Fig. 6 Lateral resolution of SSVI. (a) Experimentally measured resolution at different working distance. (b) Reconstructed image when the working distance is 840 mm.

3.3 Spectral resolution and accuracy

To evaluate the spectral resolution of the SSVI system, we imaged a black board illuminated by three lasers [Melles Griot 25-LHP-925-230, 632.8 nm (red); Oxxius 532S-100-COL-PP, 532 nm (green); Coherent OBIS 488-60 LS, 488 nm (blue)]. The board was located at a distance of ~0.9 m from the imaging system. Figure 7(a) shows the reconstructed image, where the three laser points are labeled 'Red', 'Green', and 'Blue', respectively. The spectra at the three laser-point areas are plotted in Figs. 7(b)–7(d). The solid line in each figure is a fitted Gaussian curve. The insets depict the spectral slices at 632.6 nm, 531.6 nm, and 487.1 nm, respectively. We consider the spectra of these three lasers as delta functions and use the full width at half maximum (FWHM) of the measured spectra to characterize the spectral resolution of SSVI. As indicated in Figs. 7(b)–7(d), the spectral resolution of the system is 15.4 nm, 10.7 nm, and 8.4 nm, corresponding to 384.6 cm⁻¹, 378.1 cm⁻¹, and 352.8 cm⁻¹ in the wavenumber domain, at 632.8 nm, 532 nm, and 488 nm, respectively.
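These measured values are consistent with the ~28 μm maximum OPD quoted in Section 2.1 under the standard Fourier-transform-spectroscopy estimate Δσ ≈ 1/OPD_max (a textbook approximation that ignores apodization and sampling details), as the following short check shows:

```python
# Rough consistency check of the measured spectral resolution against the
# ~28 um maximum OPD, using the standard estimate dsigma ~ 1/OPD_max.
opd_max_cm = 28e-4                       # 28 um expressed in cm
d_sigma = 1.0 / opd_max_cm               # ~357 cm^-1
for lam_nm in (632.8, 532.0, 488.0):
    lam_cm = lam_nm * 1e-7
    d_lambda_nm = lam_cm ** 2 * d_sigma * 1e7
    print(f"{lam_nm:.1f} nm: ~{d_lambda_nm:.1f} nm ({d_sigma:.0f} cm^-1)")
# predicts ~14.3, ~10.1, ~8.5 nm, close to the measured 15.4, 10.7, 8.4 nm
```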

Fig. 7 Spectral resolution of SSVI. (a) Reconstructed image. (b-d) Spectra at the red, green, and blue laser point. The insets depict the spectral slice at 632.6nm, 531.6nm, and 487.1nm, respectively.

To evaluate the accuracy and precision of the spectral-datacube reconstruction, we visualized a ColorChecker (X-Rite, MSCCPPCC0616) that was located ~0.78 m from the imaging system and illuminated by a halogen lamp (MI-150, Edmund). Figure 8(a) shows a photo of the ColorChecker captured by a commercial camera. The color blocks are labeled '1' to '16'. We employed a commercial fiber spectrometer (Avantes, AvaSpec-ULS 2048-USB2) with a spectral resolution of 1.15 nm to measure the spectrum of each block. The comparisons between the spectra derived from the Avantes spectrometer and from SSVI in each block are shown in Figs. 8(c)–8(r). Considering the measurements from the Avantes spectrometer as references, we took 10 × 10 pixels in each block area and calculated the normalized root mean square (RMS) error of the corresponding spectra. The blue line in Fig. 8(b) plots the average and standard deviation of the normalized RMS errors in each block. The average normalized RMS error over all blocks is 6.87%. To compare SSVI with a snapshot spectral imaging modality, we converted the SSVI system into the spectral imager described in [34] by extracting the interference vectors according to a registration procedure [46] instead of the disparity map. A spectral datacube, S(x,y,λ), was then reconstructed by taking the Fourier transform of the interference vectors. The red line in Fig. 8(b) plots the normalized RMS errors of the spectra derived by this spectral imager, where the average normalized RMS error over all blocks is 6.57%. Compared with the original spectral imager, SSVI maintains the spectral accuracy well while introducing the paradigm of light-field imaging.
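For reference, a sketch of how such a normalized RMS error might be computed over a 10 × 10 pixel block is given below; the normalization by the range of the reference spectrum is an assumption, since the paper does not state its exact convention.

```python
import numpy as np

# Sketch of the spectral-accuracy metric: normalized RMS error between
# reconstructed spectra and a reference spectrum, averaged over a block of
# pixels.  The normalization convention and all data are illustrative.
def normalized_rms_error(recon, ref):
    rms = np.sqrt(np.mean((recon - ref) ** 2))
    return rms / (ref.max() - ref.min())

rng = np.random.default_rng(1)
ref = rng.uniform(0.2, 1.0, 31)                         # reference spectrum (31 bands)
block = ref + 0.03 * rng.standard_normal((10, 10, 31))  # 10 x 10 reconstructed spectra
errors = [normalized_rms_error(block[i, j], ref)
          for i in range(10) for j in range(10)]
print(f"mean = {np.mean(errors):.2%}, std = {np.std(errors):.2%}")
```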

Fig. 8 Quantitative evaluation of the spectral-datacube reconstruction. (a) Photo of the ColorChecker captured by a commercial camera. (b) Average normalized RMS errors of the reconstructed spectra in color blocks. (c-r) Reconstructed spectra from the SSVI system (Blue dots) and the Avantes spectrometer (Red line) in color blocks.

3.4 Depth accuracy

To evaluate the depth accuracy of SSVI, we imaged a scene containing two white boards with black letters [Fig. 9(a)]. The back board is perpendicular to the z-axis, and the front board is tilted by 45° with respect to the x-axis. As shown in Fig. 9(a), the yellow shadowed area on the back board is occluded by the front board. The distances from the boards to the imaging system were measured and taken as the ground truth [Fig. 9(b)]. We reconstructed the depth map [Fig. 9(c)] using the algorithm described in Section 2. We then calculated an error map representing the absolute difference between the reconstructed depth and the ground truth [Fig. 9(d)], where the RMS error was found to be 7.7 mm. To compare SSVI with a light-field imaging modality, we derived a light-field image without interference by summing the two raw images captured when the transmission axis of polarizer I was oriented at 45° and 135° with respect to the x'-axis [Fig. 1(b)]. We employed the same algorithm [44] used in SSVI to reconstruct the depth map. Figure 9(e) shows the error map of the depth reconstructed from the light-field image without interference, where the RMS error was found to be 7.4 mm. These results show that integrating spectral imaging into a light-field camera, as in SSVI, maintains the light-field camera's original depth accuracy.

Fig. 9 Depth accuracy of SSVI. (a) Schematic of the experimental setup. (b) Ground-truth depth of the scene. (c) Reconstructed depth. (d) Error map of the reconstructed depth from SSVI. (e) Error map of the reconstructed depth from the light-field image without interference.

4. Discussion and conclusion

We have introduced a snapshot spectral-volumetric imaging system that can capture the 4D light-field datacube (x,y,θ,φ) coupled with the interference of the incoming light rays in a single snapshot of a single detector. We also proposed a post-processing algorithm to reconstruct the corresponding 4D spectral-volumetric datacube (x,y,z,λ) from the raw image. From these experiments, we have reconstructed a spectral-volumetric video (x,y,z,λ,t) of a dynamic scene. Combining the video with the light-field datacubes, we derived the complete plenoptic function, P(x,y,z,θ,φ,λ,t), of the incoming light rays. To our knowledge, this is the first single-detector imaging system to record the complete plenoptic function of the light rays from an object. Additionally, we compared SSVI with two separate imaging modalities, i.e., the spectral and light-field imaging modalities. The comparison results indicate that the SSVI system maintains both spectral and depth accuracy when integrating the two modalities.

Compared to multi-camera systems [28–31], SSVI is compact, robust, and inexpensive. Moreover, camera-array-based systems [29,30] capture the multispectral light field by placing different broadband color filters in front of a camera array. This directly limits the optical throughput of the system to the inverse of the number of spectral channels. SSVI does not suffer this trade-off, as it relies on the Fellgett multiplex advantage of Fourier transform imaging spectroscopy. Additionally, the lateral, depth, and spectral resolution can be easily adjusted by using different Wollaston prisms and MLAs in our system.

That being said, SSVI has a relatively low throughput because 75% of the light is lost in the BPI. One future avenue of development is replacing the first polarizer in the BPI with a polarizing beam splitter (PBS) and a half-wave plate (HWP). Setting the fast axis of the HWP at 22.5° with respect to the x'-axis makes the polarization of the light rays passing through the PBS and HWP orient at 45° with respect to the x'-axis. We can use these light rays to reconstruct the 7D plenoptic datacube and employ the light rays reflected by the PBS to obtain a high-resolution monochromatic image [46]. The energy loss in this configuration is reduced to 25%. Moreover, the captured high-resolution monochromatic image can be used to improve the lateral resolution of the 7D plenoptic datacube by image fusion [46].
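A quick Jones-calculus check of this proposed PBS + HWP front end (using standard textbook conventions, not anything specified in the paper) confirms the 45° output polarization:

```python
import numpy as np

# Light transmitted by the PBS (assumed horizontally polarized along x')
# passes a half-wave plate with its fast axis at 22.5 degrees and emerges
# linearly polarized at 45 degrees.
def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def hwp(theta):
    # Jones matrix of a half-wave plate with fast axis at angle theta
    return rot(theta) @ np.diag([1.0, -1.0]) @ rot(-theta)

e_in = np.array([1.0, 0.0])                     # horizontal, straight out of the PBS
e_out = hwp(np.deg2rad(22.5)) @ e_in
angle = np.degrees(np.arctan2(e_out[1], e_out[0]))
print(f"output polarization angle: {angle:.1f} deg")   # 45.0
```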

Currently, the post-processing of SSVI is implemented in Matlab 2015 on an Intel Core i7-8700 CPU. It takes less than ten minutes to reconstruct the complete 7D plenoptic datacube from a raw image. Since most computations in the post-processing algorithm can be run in parallel, we believe that the computing time can be reduced dramatically by using a graphics processing unit (GPU).

It is worth noting that the reconstruction algorithm of SSVI is based on two assumptions. First, the radiance along a light ray is constant; equivalently, the object scene is free of occlusions. Second, we consider the object to be Lambertian. In recent years, several algorithms [47–49] have been developed to handle object scenes with occlusions and specular surfaces in light-field imaging. Adopting these algorithms in the reconstruction process of SSVI is our future work.

The experimental results presented here demonstrate that SSVI can be used to derive the complete plenoptic function, P(x,y,z,θ,φ,λ,t), of the light rays from a dynamic scene, with information in the lateral, depth, angular, spectral, and time domains. In light of its unprecedented 7D imaging capability, we anticipate that SSVI will facilitate a wide range of applications in biological analysis, robotic vision, and many other fields.

Funding

National Natural Science Foundation of China (61675056); National High Technology Research and Development Program of China (2015AA042401); Applied Technology Research and Development Program of Heilongjiang Province (GX16C013); The National Science Foundation (NSF) CAREER Award (1652150).

Acknowledgment

The authors thank Faraz Arastu for useful discussions in the writing of this paper.

References

1. E. H. Adelson and J. R. Bergen, “The plenoptic function and the elements of early vision,” in Computational Models of Visual Processing (MIT, 1991), pp. 3–20.

2. L. Gao and L. V. Wang, “A review of snapshot multidimensional optical imaging: measuring photon tags in parallel,” Phys. Rep. 616, 1–37 (2016). [CrossRef]   [PubMed]  

3. G. Lu and B. Fei, “Medical hyperspectral imaging: a review,” J. Biomed. Opt. 19(1), 010901 (2014). [CrossRef]   [PubMed]  

4. F. D. van der Meer, H. M. A. van der Werff, F. J. A. van Ruitenbeek, C. A. Hecker, W. H. Bakker, M. F. Noomen, M. van der Meijde, E. J. M. Carranza, J. B. de Smeth, and T. Woldai, “Multi- and hyperspectral geologic remote sensing: A review,” Int. J. Appl. Earth Obs. 14(1), 112–128 (2012). [CrossRef]  

5. B. M. Nicolaï, K. Beullens, E. Bobelyn, A. Peirs, W. Saeys, K. I. Theron, and J. Lammertyn, “Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: A review,” Postharvest Biol. Technol. 46(2), 99–118 (2007). [CrossRef]  

6. A. A. Gowen, C. P. O’Donnell, P. J. Cullen, G. Downey, and J. M. Frias, “Hyperspectral imaging – an emerging process analytical tool for food quality and safety control,” Trends Food Sci. Technol. 18(12), 590–598 (2007). [CrossRef]  

7. R. M. Levenson and J. R. Mansfield, “Multispectral imaging in biology and medicine: slices of life,” Cytometry A 69(8), 748–758 (2006). [CrossRef]   [PubMed]  

8. A. F. H. Goetz, G. Vane, J. E. Solomon, and B. N. Rock, “Imaging spectrometry for Earth remote sensing,” Science 228(4704), 1147–1153 (1985). [CrossRef]   [PubMed]  

9. R. Prevedel, Y. G. Yoon, M. Hoffmann, N. Pak, G. Wetzstein, S. Kato, T. Schrödel, R. Raskar, M. Zimmer, E. S. Boyden, and A. Vaziri, “Simultaneous whole-animal 3D imaging of neuronal activity using light-field microscopy,” Nat. Methods 11(7), 727–730 (2014). [CrossRef]   [PubMed]  

10. N. C. Pégard, H. Y. Liu, N. Antipa, M. Gerlock, H. Adesnik, and L. Waller, “Compressive light-field microscopy for 3D neural activity recording,” Optica 3(5), 517–524 (2016). [CrossRef]  

11. N. Bedard, T. Shope, A. Hoberman, M. A. Haralam, N. Shaikh, J. Kovačević, N. Balram, and I. Tošić, “Light field otoscope design for 3D in vivo imaging of the middle ear,” Biomed. Opt. Express 8(1), 260–272 (2017). [CrossRef]   [PubMed]  

12. S. Zhu, P. Jin, R. Liang, and L. Gao, “Optical design and development of a snapshot light-field laryngoscope,” Opt. Eng. 57(2), 023110 (2018). [CrossRef]  

13. R. Ng, “Digital light field photography,” Ph.D. dissertation (Stanford University, 2006).

14. R. Raghavendra, K. B. Raja, and C. Busch, “Presentation attack detection for face recognition using light field camera,” IEEE Trans. Image Process. 24(3), 1060–1075 (2015). [CrossRef]   [PubMed]  

15. K. Maeno, H. Nagahara, A. Shimada, and R. I. Taniguchi, “Light field distortion feature for transparent object recognition,” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2013), pp. 122–135.

16. J. Belden, T. T. Truscott, M. C. Axiak, and A. H. Techet, “Three dimensional synthetic aperture particle image velocimetry,” Meas. Sci. Technol. 21(12), 125403 (2010). [CrossRef]  

17. K. Lynch, T. Fahringer, and B. Thurow, “Three-dimensional particle image velocimetry using a plenoptic camera,” in 50th AIAA Aerospace Sciences Meeting (AIAA, 2012), pp. 1–14.

18. C. Li, G. S. Mitchell, J. Dutta, S. Ahn, R. M. Leahy, and S. R. Cherry, “A three-dimensional multispectral fluorescence optical tomography imaging system for small animals based on a conical mirror design,” Opt. Express 17(9), 7571–7585 (2009). [CrossRef]   [PubMed]  

19. W. Jahr, B. Schmid, C. Schmied, F. O. Fahrbach, and J. Huisken, “Hyperspectral light sheet microscopy,” Nat. Commun. 6(1), 7990 (2015). [CrossRef]   [PubMed]  

20. F. Morsdorf, C. Nichol, T. Malthus, and I. H. Woodhouse, “Assessing forest structural and physiological information content of multi-spectral LiDAR waveforms by radiative transfer modelling,” Remote Sens. Environ. 113(10), 2152–2163 (2009). [CrossRef]  

21. A. Wallace, C. Nichol, and I. Woodhouse, “Recovery of forest canopy parameters by inversion of multispectral LiDAR data,” Remote Sens. 4(2), 509–531 (2012). [CrossRef]  

22. A. D. Gleckler, A. Gelbart, and J. M. Bowden, “Multispectral and hyperspectral 3D imaging lidar based upon the multiple-slit streak tube imaging lidar,” in Aerospace/Defense Sensing, Simulation, and Controls (SPIE, 2001), pp. 328–335.

23. A. Mansouri, A. Lathuiliere, F. S. Marzani, Y. Voisin, and P. Gouton, “Toward a 3D multispectral scanner: an application to multimedia,” IEEE Multimed. 14(1), 40–47 (2007). [CrossRef]  

24. P. Latorre-Carmona, E. Sánchez-Ortiga, X. Xiao, F. Pla, M. Martínez-Corral, H. Navarro, G. Saavedra, and B. Javidi, “Multispectral integral imaging acquisition and processing using a monochrome camera and a liquid crystal tunable filter,” Opt. Express 20(23), 25960–25969 (2012). [CrossRef]   [PubMed]  

25. V. Farber, Y. Oiknine, I. August, and A. Stern, “Compressive 4D spectro-volumetric imaging,” Opt. Lett. 41(22), 5174–5177 (2016). [CrossRef]   [PubMed]  

26. L. Gao, N. Bedard, N. Hagen, R. T. Kester, and T. S. Tkaczyk, “Depth-resolved image mapping spectrometer (IMS) with structured illumination,” Opt. Express 19(18), 17439–17452 (2011). [CrossRef]   [PubMed]  

27. H. Rueda, C. Fu, D. L. Lau, and G. R. Arce, “Single aperture spectral+ToF compressive camera: toward hyperspectral+depth imagery,” IEEE J. Sel. Top. Signal Process. 11(7), 992–1003 (2017). [CrossRef]  

28. P. Latorre-Carmona, F. Pla, A. Stern, I. Moon, and B. Javidi, “Three-dimensional imaging with multiple degrees of freedom using data fusion,” Proc. IEEE 103(9), 1654–1671 (2015). [CrossRef]  

29. J. Wu, B. Xiong, X. Lin, J. He, J. Suo, and Q. Dai, “Snapshot hyperspectral volumetric microscopy,” Sci. Rep. 6(1), 24624 (2016). [CrossRef]   [PubMed]  

30. Y. Zhao, T. Yue, L. Chen, H. Wang, Z. Ma, D. J. Brady, and X. Cao, “Heterogeneous camera array for multispectral light field imaging,” Opt. Express 25(13), 14008–14022 (2017). [CrossRef]   [PubMed]  

31. Z. Xiong, L. Wang, H. Li, D. Liu, and F. Wu, “Snapshot hyperspectral light field imaging,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2017), pp. 3270–3278.

32. W. Feng, H. Rueda, C. Fu, G. R. Arce, W. He, and Q. Chen, “3D compressive spectral integral imaging,” Opt. Express 24(22), 24859–24871 (2016). [CrossRef]   [PubMed]  

33. M. W. Kudenov and E. L. Dereniak, “Compact snapshot birefringent imaging Fourier transform spectrometer,” in Imaging Spectrometry XV, (SPIE, 2010), paper 781206.

34. M. W. Kudenov and E. L. Dereniak, “Compact real-time birefringent imaging spectrometer,” Opt. Express 20(16), 17973–17986 (2012). [CrossRef]   [PubMed]  

35. S. Zhu, A. Lai, K. Eaton, P. Jin, and L. Gao, “On the fundamental comparison between unfocused and focused light field cameras,” Appl. Opt. 57(1), A1–A11 (2018). [CrossRef]   [PubMed]  

36. E. Y. Lam, “Computational photography with plenoptic camera and light field capture: tutorial,” J. Opt. Soc. Am. A 32(11), 2021–2032 (2015). [CrossRef]   [PubMed]  

37. A. Gerrard and J. M. Burch, Introduction to Matrix Methods in Optics (John Wiley & Sons, 1994), Chap. 2.

38. X. Fu, J. Huang, X. Ding, Y. Liao, and J. Paisley, “Clearing the Skies: A deep network architecture for single-image rain streaks removal,” IEEE Trans. Image Process. 26(6), 2944–2956 (2017). [CrossRef]   [PubMed]  

39. D. Eigen, D. Krishnan, and R. Fergus, “Restoring an image taken through a window covered with dirt or rain,” in 2013 IEEE International Conference on Computer Vision (IEEE, 2013), pp. 633–640. [CrossRef]  

40. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2016), pp. 770–778.

41. B. Arad and O. Ben-Shahar, “Sparse recovery of hyperspectral signal from natural RGB images,” in Computer Vision – ECCV 2016 (Springer International Publishing, 2016), pp. 19–34.

42. A. Vedaldi and K. Lenc, “MatConvNet: convolutional neural networks for MATLAB,” in Proceedings of the 23rd ACM International Conference on Multimedia (ACM, 2015), pp. 689–692. [CrossRef]  

43. R. C. Bolles, H. H. Baker, and D. H. Marimont, “Epipolar-plane image analysis: an approach to determining structure from motion,” Int. J. Comput. Vis. 1(1), 7–55 (1987). [CrossRef]  

44. I. Tosic and K. Berkner, “Light field scale-depth space transform for dense depth estimation,” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2014), pp. 435–442.

45. L. Gao, N. Bedard, and I. Tosic, “Disparity-to-depth calibration in light field imaging,” in Imaging and Applied Optics, OSA Technical Digest, (Optical Society of America, 2016), paper CW3D.2.

46. S. Zhu, Y. Zhang, J. Lin, L. Zhao, Y. Shen, and P. Jin, “High resolution snapshot imaging spectrometer using a fusion algorithm based on grouping principal component analysis,” Opt. Express 24(21), 24624–24640 (2016). [CrossRef]   [PubMed]  

47. G. Wu, B. Masia, A. Jarabo, Y. Zhang, L. Wang, Q. Dai, T. Chai, and Y. Liu, “Light field image processing: an overview,” IEEE J. Sel. Top. Signal Process. 11(7), 926–954 (2017). [CrossRef]  

48. T. C. Wang, A. A. Efros, and R. Ramamoorthi, “Depth estimation with occlusion modeling using light-field cameras,” IEEE Trans. Pattern Anal. Mach. Intell. 38(11), 2170–2181 (2016). [CrossRef]   [PubMed]  

49. M. W. Tao, J. C. Su, T. C. Wang, J. Malik, and R. Ramamoorthi, “Depth estimation and specular removal for glossy surfaces using point and line consistency with light-field cameras,” IEEE Trans. Pattern Anal. Mach. Intell. 38(6), 1155–1169 (2016). [CrossRef]   [PubMed]  

Supplementary Material (2)

Visualization 1: Real-time four-dimensional video.
Visualization 2: Videos with reconstructed images from SSVI when the system focuses at 450 mm and 850 mm.
