Optica Publishing Group

Coded aperture snapshot hyperspectral light field tomography

Open Access

Abstract

Multidimensional imaging has emerged as a powerful technology capable of simultaneously acquiring spatial, spectral, and depth information about a scene. However, existing approaches often rely on mechanical scanning or multi-modal sensing configurations, leading to prolonged acquisition times and increased system complexity. Coded aperture snapshot spectral imaging (CASSI) has introduced compressed sensing to recover three-dimensional (3D) spatial-spectral datacubes from single snapshot two-dimensional (2D) measurements. Despite its advantages, the reconstruction problem remains severely underdetermined due to the high compression ratio, resulting in limited spatial and spectral reconstruction quality. To overcome this challenge, we developed a novel two-stage cascaded compressed sensing scheme called coded aperture snapshot hyperspectral light field tomography (CASH-LIFT). By appropriately distributing the computation load to each stage, this method utilizes the compressibility of natural scenes in multiple domains, reducing the ill-posed nature of datacube recovery and achieving enhanced spatial resolution, suppressed aliasing artifacts, and improved spectral fidelity. Additionally, leveraging the snapshot 3D imaging capability of LIFT, our approach efficiently records a five-dimensional (5D) plenoptic function in a single snapshot.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Over the past few decades, computational imaging has made remarkable progress thanks to advancements in optical instruments and the exponential growth of computing power. One ultimate objective of computational imaging is to simultaneously capture the complete plenoptic function [1], a seven-dimensional (7D) function encompassing 3D space (x, y, z), 1D time (t), 1D spectrum (λ), and 2D angle (u, v) [2,3]. However, traditional optical imaging systems typically only measure light intensity over a 2D spatial (x-y) grid, overlooking a significant portion of the photon's information and leading to inefficient data acquisition.

To capture a high-dimensional (high-D) plenoptic function, most systems acquire a 1D or 2D subset of a datacube at a time, necessitating tedious scanning of the remaining dimension(s) to obtain a complete datacube. For instance, hyperspectral imagers, which capture a 3D plenoptic datacube (x, y, λ), often perform scanning either in the spatial domain [4,5] or the spectral domain [6–8], resulting in low light collection efficiency and prolonged acquisition times. Snapshot techniques like image mapping spectrometry (IMS) [9–12], computed tomography imaging spectrometry (CTIS) [13,14], and coded aperture snapshot spectral imaging (CASSI) [15–17] optically map a 3D plenoptic datacube to a 2D plane while maximizing the light throughput [18,19]. Among these techniques, CASSI stands out with its significant advantage in data acquisition efficiency, achieved by leveraging compressed sensing (CS) [20] to minimize the number of necessary measurements. The compression ratio, defined as $cr = \textrm{Nyquist sampling number}/\textrm{actual sampling number}$, plays a crucial role in reducing the data load of high-D measurements. However, the substantial data compression ratio inherent in CASSI can often render the reconstruction problem ill-posed, resulting in compromised image quality when capturing complex scenes. Recent efforts have been made to enhance the reconstruction quality by reducing the compression ratio [21–28]. Two major approaches include (i) capturing multiple snapshots with different coded apertures [21–23] and (ii) utilizing a non-diffracted image as a prior in the reconstruction [24–28]. While these methods improve reconstruction quality, they either require multiple shots or additional optics and cameras, thereby increasing system cost and complexity.

Capturing a high-D plenoptic datacube, such as hyperspectral volumetric imaging (x, y, u, v, $\lambda$), in a single snapshot while preserving high image quality poses a formidable challenge. Existing approaches often rely on Nyquist sampling with a unit compression ratio [29–32] or adopt complex multi-modal systems [29,33,34]. For instance, non-compressed IMS-based hyperspectral volumetric imagers [32] face a significant trade-off among sampling along various plenoptic dimensions due to limited detector array size. Multi-modal systems [29,33,34] integrate multiple cameras with distinct modalities and necessitate intricate calibration procedures. Acquiring a high-D datacube with a substantial compression ratio using a single camera remains demanding. In our previous research [35], we sought to address this challenge by employing compressed sampling of the light field through sparsely spaced angles. This approach allowed us to achieve a relatively high compression ratio, albeit with a trade-off of reduced light throughput due to the use of narrow slits.

To surpass the limitations of the previously mentioned techniques, herein we present coded aperture snapshot hyperspectral light field tomography (CASH-LIFT), a novel approach that employs a highly efficient cascaded compressed sensing scheme to achieve superior imaging results. Light field tomography (LIFT), which acquires en-face projections of perspective images [36,37], serves as the first compressed stage to undersample spatial and angular dimensions using a line spread function. The subsequent CASSI system functions as the second compressed module, further encoding the spectral dimension of the compressed light field image from the previous stage. By incorporating 1D optical blurring along the dispersion direction from the LIFT stage, we effectively distribute the computational load across each stage, thereby mitigating the ill-posed nature of the image recovery problem. This approach results in enhanced spatial resolution, suppressed aliasing artifacts, and improved spectral reconstruction fidelity. Furthermore, by leveraging the compressibility of natural scenes in multiple dimensions, CASH-LIFT enables highly efficient data acquisition to capture the 5D plenoptic function (x, y, u, v, $\lambda$) in a snapshot while maintaining a high light throughput.

2. Method

2.1 Cascaded compressed image formation model

CASH-LIFT introduces a two-stage cascaded compression process that transforms a high-dimensional (high-D) plenoptic function into a 2D representation. The approach combines LIFT and CASSI techniques to sequentially perform compressed measurements across angular, spatial, and spectral domains. In the first stage, a Dove prism array and a cylindrical lens array capture compressed light field information, enabling synthetic refocusing using the resulting sub-pupil disparities. The second stage uses a coded aperture and a dispersive element, such as a wedge prism, to encode and disperse the compressed light field [Fig. 1(a,b)]. Accordingly, we can separate the forward image formation model into two steps. The first step describes the light field compression by the LIFT process, and it can be formulated as:

$$b(\lambda) = \boldsymbol{T}_{\boldsymbol{x}} \boldsymbol{R}_{\boldsymbol{\theta}}\, g(\lambda), \tag{1}$$
where $g(\lambda)$ is the vectorized object image at wavelength $\lambda$, $\boldsymbol{R}_{\boldsymbol{\theta}}$ is the collection of $N_\theta$ ($N_\theta = 16$) rotation operators, which can be expressed as $\boldsymbol{R}_{\boldsymbol{\theta}} = [R^{\theta_1} \cdots R^{\theta_{N_\theta}}]^T$, $\boldsymbol{T}_{\boldsymbol{x}}$ represents the 1D convolution of the rotated object image with the line-spread function (LSF) of the cylindrical lens along its invariant axis (i.e., the axis without optical power), and $b(\lambda)$ is the compressed light field at wavelength $\lambda$. The convolution operation in LIFT downsamples the light field information of a 2D spatial slice ($N_x \times N_y$) taken from the 3D hyperspectral scene ($N_x \times N_y \times N_\lambda$). The downsampled light field has size $N_x/n \times N_y \times N_\theta$, where $n$ is the length of the cylindrical lens' LSF counted in pixels. Consequently, the compression ratio of the LIFT stage can be written as $cr_1 = N_x \times N_y/(N_x/n \times N_y \times N_\theta) = n/N_\theta$.
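
The first-stage model in Eq. (1) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes a box-shaped LSF of length `n`, nearest-neighbour rotation with zero fill, and a blur that keeps the image size (so the factor-of-`n` reduction shows up as reduced information content along x, not as fewer pixels). The function names `rotate_nearest`, `blur_x`, and `lift_forward` are hypothetical.

```python
import math

def rotate_nearest(img, theta_deg):
    """Rotate a 2D list about its centre (nearest neighbour, zero fill)."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: source coordinates of output pixel (x, y).
            xs = c * (x - cx) + s * (y - cy) + cx
            ys = -s * (x - cx) + c * (y - cy) + cy
            xi, yi = round(xs), round(ys)
            if 0 <= xi < w and 0 <= yi < h:
                out[y][x] = float(img[yi][xi])
    return out

def blur_x(img, n):
    """Convolve each row with a length-n box LSF (same size, zero padding)."""
    w = len(img[0])
    half = n // 2
    return [[sum(row[j] for j in range(x - half, x - half + n) if 0 <= j < w)
             for x in range(w)] for row in img]

def lift_forward(g, thetas_deg, n):
    """b = T_x R_theta g: one rotated, x-blurred copy of g per angle."""
    return [blur_x(rotate_nearest(g, t), n) for t in thetas_deg]

# A single bright pixel at the centre of a 5 x 5 scene (assumed test object).
g = [[0] * 5 for _ in range(5)]
g[2][2] = 1
b = lift_forward(g, [0, 45, 90], n=3)
```

Each entry of `b` is one sub-image of the compressed light field; a real system would use the measured LSF and the actual Dove prism angles in place of these assumed values.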

Fig. 1. Optical schematic and image formation model. (a) Comparison of the working principles between CASSI and CASH-LIFT. ${C_{xy}}$, spatial encoding operator; ${S_x}$, spectral shearing operator; I, spatial-spectral integration; ${R_\theta }$, angular rotation operator; ${T_x}$, spatial integration operator. (b) The Dove prism array and cylindrical lens array are located at the Fourier plane of the main lens. The 2D light field sub-image is downsampled along the invariant axis of the cylindrical lens. A binary mask and a wedge prism modulate the LIFT output image and encode high-dimensional spectral data into 2D space. The dispersion direction of the wedge prism is parallel to the non-power axis of the cylindrical lens. (c) The reconstruction pipeline of our imaging method. Leveraging the reduced information content along the spectral dispersion direction, the spectral projection cube can be efficiently reconstructed in the CASSI stage, reducing computation load. The object can then be reconstructed in five dimensions ($x, y, \theta, \varphi, \lambda$) with higher light throughput and spatial resolution by combining coded aperture imaging, computational tomography, and light field imaging.

The subsequent stage involves the spatial-spectral scene encoding performed by CASSI:

$$m = \boldsymbol{I}\boldsymbol{S}_{\boldsymbol{x}}\boldsymbol{C}_{\boldsymbol{xy}}\, b(\lambda) + e = \mathbf{\Phi}\, b(\lambda) + e. \tag{2}$$

Here ${\boldsymbol{C}_{\boldsymbol{xy}}}$ is a spatial encoding operator representing the random binary mask. ${\boldsymbol{S}_{\boldsymbol{x}}}$ is a spectral shearing operator along the dispersion direction of the dispersive element [$x$ axis in Fig. 1(a)]. $\boldsymbol{I}$ is a spatial-spectral integration operator. $m$ is the raw measurement acquired by a monochrome camera, and $e$ denotes the noise. The spatial-spectral encoding mechanism ($\boldsymbol{I}{\boldsymbol{S}_{\boldsymbol{x}}}{\boldsymbol{C}_{\boldsymbol{xy}}}$) can also be characterized by a sensing matrix $\mathbf{\Phi }$ (or transport matrix from the light transport perspective) [see Supplement 1]. In a conventional CASSI system, a compressed measurement of size $({N_x} + {N_\lambda }) \times {N_y}$ is used to reconstruct the complete hyperspectral datacube ${N_x} \times {N_y} \times {N_\lambda }$, resulting in a significant compression ratio ${N_x} \times {N_y} \times {N_\lambda }/[({N_x} + {N_\lambda }) \times {N_y}]$ [Fig. 1(a)]. In contrast, in our method, the convolution operation in the LIFT stage reduces the spatial information content along the spectral dispersion direction. As a result, the effective information that requires recovery in the CASSI stage is diminished by a factor of $n$, leading to a lower compression ratio $c{r_2} = {N_x}/n \times {N_y} \times {N_\lambda }/[({N_x} + {N_\lambda }) \times {N_y}]$. The reduction in effective information not only mitigates the computational burden of the CASSI system but also enables improved reconstruction quality.
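
The encode, shear, and integrate sequence of the CASSI stage can be illustrated with a short noiseless sketch. It assumes a dispersion of exactly one pixel per spectral channel, so the measurement is $N_x + N_\lambda - 1$ pixels wide (close to the $N_x + N_\lambda$ quoted above); the function name `cassi_forward` and the toy values are hypothetical.

```python
def cassi_forward(cube, mask):
    """Noiseless m = I S_x C_xy b for a cube indexed as cube[k][y][x].

    C_xy: multiply every channel by the binary mask.
    S_x:  shift channel k by k pixels along x (assumed 1 pixel per channel).
    I:    sum all sheared channels on the detector.
    """
    n_lambda, n_y, n_x = len(cube), len(cube[0]), len(cube[0][0])
    m = [[0.0] * (n_x + n_lambda - 1) for _ in range(n_y)]
    for k in range(n_lambda):
        for y in range(n_y):
            for x in range(n_x):
                m[y][x + k] += mask[y][x] * cube[k][y][x]
    return m

# Two spectral channels of a 1 x 2 scene (toy values).
cube = [[[1, 2]], [[3, 4]]]
open_mask = [[1, 1]]
half_mask = [[1, 0]]
m_open = cassi_forward(cube, open_mask)
m_half = cassi_forward(cube, half_mask)
```

With the fully open mask the two channels simply overlap after shearing; blocking the second mask pixel removes its contribution from every channel, which is what makes the channels separable in reconstruction.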

Therefore, the image formation model of the entire system can be formulated as:

$$m = \boldsymbol{I}\boldsymbol{S}_{\boldsymbol{x}}\boldsymbol{C}_{\boldsymbol{xy}}\boldsymbol{T}_{\boldsymbol{x}}\boldsymbol{R}_{\boldsymbol{\theta}}\, g(\lambda) + e. \tag{3}$$

The overall compression ratio is the product of the compression ratios of the two stages:

$$cr = \frac{N_x \times N_y}{\frac{N_x}{n} \times N_y \times N_\theta} \cdot \frac{\frac{N_x}{n} \times N_y \times N_\lambda}{(N_x + N_\lambda) \times N_y} = \frac{N_x \times N_\lambda}{(N_x + N_\lambda) \times N_\theta}. \tag{4}$$

There are two special cases: (i) $n = 1$. The system is equivalent to a combination of a conventional light field system and CASSI. The PSF of each sub-image has a circular shape, uniformly convolving the input scene in both x and y directions. The compression ratio of the CASSI stage remains the same as that of the original CASSI system. (ii) $n = {N_x}$. The resultant system has a long LSF and assigns most of the compression to the LIFT stage. To balance the computational burden between the stages, both $c{r_1}$ and $c{r_2}$ should be greater than one. Consequently, the parameter $n$ must be chosen between ${N_\theta }$ and ${N_x} \times {N_\lambda }/({N_x} + {N_\lambda })$.
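
The per-stage ratios and the admissible range of $n$ follow directly from the expressions above. The sketch below evaluates them with exact fractions; the function names and the example sizes ($N_x = 256$, $N_\lambda = 20$) are assumptions chosen for illustration, while $N_\theta = 16$ is the value stated in the text.

```python
from fractions import Fraction

def lift_cr(n, n_theta):
    """cr1 = Nx*Ny / (Nx/n * Ny * N_theta), which simplifies to n / N_theta."""
    return Fraction(n, n_theta)

def cassi_cr(n_x, n_lambda, n=1):
    """cr2 = (Nx/n * Ny * N_lambda) / ((Nx + N_lambda) * Ny); Ny cancels."""
    return Fraction(n_x * n_lambda, n * (n_x + n_lambda))

def total_cr(n_x, n_lambda, n_theta):
    """Overall ratio Nx*N_lambda / ((Nx + N_lambda) * N_theta); n cancels."""
    return Fraction(n_x * n_lambda, (n_x + n_lambda) * n_theta)

def balanced_n_range(n_x, n_lambda, n_theta):
    """Open interval (lower, upper) of n for which cr1 > 1 and cr2 > 1."""
    return Fraction(n_theta), Fraction(n_x * n_lambda, n_x + n_lambda)

# Assumed example sizes; only N_theta = 16 comes from the text.
N_X, N_LAMBDA, N_THETA = 256, 20, 16
lower, upper = balanced_n_range(N_X, N_LAMBDA, N_THETA)
print(f"n must lie in ({float(lower):.2f}, {float(upper):.2f})")
print("cr1 at n = 17:", float(lift_cr(17, N_THETA)))
print("cr2 at n = 17:", float(cassi_cr(N_X, N_LAMBDA, 17)))
print("overall cr:   ", float(total_cr(N_X, N_LAMBDA, N_THETA)))
```

Setting `n=1` in `cassi_cr` recovers the conventional CASSI ratio of special case (i). Because $n$ cancels in the product, the choice of $n$ redistributes compression between the stages without changing the overall ratio.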

2.2 Image reconstruction

The reconstruction of a hyperspectral cube involves solving two optimization problems sequentially, which can be formulated as:

$$\hat{b} = \mathop{\textrm{argmin}}\limits_b \left\| m - \mathbf{\Phi} b \right\|_2^2 + \mu \varphi(b), \;\; \textrm{s.t.} \;\; b \ge 0, \tag{5}$$
$$\hat{g} = \mathop{\textrm{argmin}}\limits_g \left\| b - \boldsymbol{T}_{\boldsymbol{x}} \boldsymbol{R}_{\boldsymbol{\theta}} g \right\|_2^2 + \rho \varphi(g), \tag{6}$$
where $\varphi $ is the total variation regularization term, and $\mu $ and $\rho $ are the regularization weights. We use the augmented Lagrangian and alternating direction algorithms [38,39] to solve Eq. (5) and the fast iterative shrinkage-thresholding algorithm (FISTA) [40] to solve Eq. (6). The reconstruction takes ∼10 min for each spectral channel and ∼20 min for recovering the light field information on a personal computer equipped with an AMD 5700G CPU.
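
To make the structure of the FISTA iteration concrete, the sketch below solves a small problem of the same form as Eq. (6). It is an illustration under stated assumptions, not the authors' solver: the total variation term is swapped for an $\ell_1$ penalty because its proximal step is a simple soft threshold, the operator is a small dense matrix `A` standing in for $\boldsymbol{T}_{\boldsymbol{x}}\boldsymbol{R}_{\boldsymbol{\theta}}$, and the caller supplies a Lipschitz bound. All names are hypothetical.

```python
import math

def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def soft(v, t):
    """Soft threshold: proximal operator of t * ||.||_1."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]

def fista_l1(A, b, rho, lipschitz, n_iter=100):
    """Minimise ||b - A g||_2^2 + rho * ||g||_1 with FISTA.

    lipschitz must be at least 2 * lambda_max(A^T A).
    """
    x = [0.0] * len(A[0])
    y, t = x[:], 1.0
    for _ in range(n_iter):
        resid = [p - q for p, q in zip(matvec(A, y), b)]
        grad = [2.0 * gi for gi in rmatvec(A, resid)]
        # Gradient step on the data term, then proximal step on the penalty.
        x_new = soft([yi - gi / lipschitz for yi, gi in zip(y, grad)],
                     rho / lipschitz)
        # Momentum update that distinguishes FISTA from plain ISTA.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [xn + (t - 1.0) / t_new * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x

# Identity operator: the minimiser is the soft threshold of b at rho / 2.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
g_hat = fista_l1(identity, [3.0, -1.0, 0.2], rho=1.0, lipschitz=2.0, n_iter=50)
```

For the identity operator the result can be checked by hand, which makes this a convenient sanity test before substituting a realistic operator and regularizer.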

To extend our method to a 3D scene, we introduce a translation step for each sub-pupil image before passing it to the iterative reconstruction [35–37]. This translation is performed with respect to the spatial coordinates ($u,v$) of each Dove prism in the array. We translate the sub-pupil image by the vector $s \cdot (u,v)$, where $s$ is a parameter that depends on the axial location of the synthetic focal plane. Provided that a Dove prism rotates the image counterclockwise by angle $\theta$, and the non-power direction of the cylindrical lens aligns with the sub-pupil coordinate axis $v$, we apply a corresponding shift to each sub-pupil projection [35]:

$$s\, u \cos(\theta) + s\, v \sin(\theta). \tag{7}$$
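
The shift above is a one-line computation per sub-pupil. The sketch below evaluates it for given sub-pupil coordinates and Dove prism angle; the function name `refocus_shift`, the use of degrees, and the example values are assumptions for illustration.

```python
import math

def refocus_shift(s, u, v, theta_deg):
    """Shift s*u*cos(theta) + s*v*sin(theta) for one sub-pupil projection.

    s:         refocusing parameter set by the synthetic focal plane depth.
    u, v:      sub-pupil coordinates of the Dove prism.
    theta_deg: image rotation angle of that Dove prism, in degrees.
    """
    theta = math.radians(theta_deg)
    return s * u * math.cos(theta) + s * v * math.sin(theta)

# Example values (assumed): three prisms along u with different angles.
shifts = [refocus_shift(2.0, u, 0.0, theta)
          for u, theta in [(-1.0, 0.0), (0.0, 45.0), (1.0, 0.0)]]
```

Sweeping `s` and re-running the reconstruction with the shifted projections yields a focal stack; `s = 0` leaves every projection in place and corresponds to the nominal focal plane.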

The quality of the estimated hyperspectral datacube obtained through the optimization algorithm heavily relies on the accurate characterization of sensing matrix $\mathbf{\Phi }$. System calibration plays a critical role in mitigating the disparity between the theoretical system operator $\boldsymbol{I}{\boldsymbol{S}_{\boldsymbol{x}}}{\boldsymbol{C}_{\boldsymbol{xy}}}$ and the experimental system-specific operator $\mathbf{\Phi }$. For our system, the calibration of dispersion shift was obtained by illuminating the mask using a broadband light source with an optical tunable bandpass filter. The spectral channel resolution can be calculated from the pixel shift (dispersion direction) between adjacent channels [see Supplement 1]. In the current setup, the spectral range and average resolution are 150 nm and 7.5 nm, respectively. It is important to note that the spectral range and resolution can be adjusted by employing different dispersion elements or utilizing a different focal length relay system. This allows for tailoring the system's imaging performance for specific applications.

2.3 Optical setup

The system schematic is shown in Fig. S1 in Supplement 1. The beam is collimated by a main lens (AC254-300, f = 300 mm, Thorlabs, Inc.), which is located 300 mm from the object plane. A custom array of 16 Dove prisms (height = 2 mm, spacing between adjacent prisms = 2.9 mm) is placed at the Fourier plane of the main lens. The 5 × 1 cylindrical lens array (f = 75 mm, height = 2 mm) covers all Dove prisms with an extended length of 17 mm. The LIFT output image is focused onto a coded mask and encoded by a random binary pattern. The mask was fabricated on a quartz substrate with chrome coating by photolithography (Frontrange-photomask, Inc.). The smallest coded feature of the pattern (15 µm) is sampled by approximately 2 × 2 pixels on the camera. We use two photographic lenses (50 mm, f/1.8, Yongnuo, Inc.) as a 4f system to relay the coded image to a monochromatic camera (16 bits, 7.4 µm pixel size, 4864 × 3232, Lt16059H, Lumenera, Inc.). To disperse the image, we position a round wedge prism (PS814-A, 10° beam deviation, Thorlabs, Inc.) at the Fourier plane of the 4f relay system. For the illumination source, we use a halogen lamp (HL250-AY, AmScope, Inc.) bandlimited by a combination of a 450 nm long-pass filter (FELH0450, Thorlabs, Inc.) and a 600 nm short-pass filter (FESH0600, Thorlabs, Inc.). The field of view (FOV) of the system is approximately 8 mm × 8 mm, with a system magnification of 0.25×.
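
As a rough consistency check on the figures quoted above (a sketch, not a calibration), assume that the two identical 50 mm relay lenses give unit magnification between the mask and the camera, and that the quoted 0.25× magnification applies between the object plane and the mask. The mask sampling, the image size at the mask, and the number of coded features across the field of view then work out as

$$\frac{15\ \mu\textrm{m}}{7.4\ \mu\textrm{m/pixel}} \approx 2.03\ \textrm{pixels}, \qquad 8\ \textrm{mm} \times 0.25 = 2\ \textrm{mm}, \qquad \frac{2\ \textrm{mm}}{15\ \mu\textrm{m}} \approx 133\ \textrm{features}.$$

The first value agrees with the stated sampling of approximately 2 × 2 pixels per coded feature. The 0.25× figure also equals the ratio of the cylindrical lens and main lens focal lengths, 75 mm / 300 mm.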

3. Simulation

To accurately evaluate the imaging performance of our CASH-LIFT system, we maintain consistency between the simulation parameters and the real experimental optical system. The mask size and spectrum used in the simulation were aligned with those employed in the physical setup. As depicted in Fig. 2(a), CASH-LIFT effectively mitigates the spatial blurring effect commonly encountered in CASSI systems, particularly for scenes with lower spectral sparsity. Additionally, it resolves line pairs with spacing as small as approximately 44 µm (3 times the minimal mask feature size). This indicates the system's ability to capture fine details and resolve high-frequency features, thereby enabling high-resolution imaging. Given that the LIFT stage utilizes multiple sub-images to capture light field information, we further compared CASH-LIFT with CASSI employing a 16-sub-image sampling scheme, referred to as Light Field CASSI [Fig. 2(a), middle column]. In Light Field CASSI, the PSF of each sub-image has a circular shape and is uniformly convolved in both x and y directions. In contrast, CASH-LIFT efficiently downsamples light field information using a cylindrical lens with a line-shaped PSF, significantly reducing the burden on the subsequent CASSI reconstruction and thereby improving the image quality. Figure 2(b) shows the reconstructed spectral curves, with a mean squared error of the normalized spectral difference of 0.04 (compared to 0.06 in CASSI).

Fig. 2. Simulation results of USAF target/planar object reconstruction. (a) Comparison of spatial resolution between CASSI, light field CASSI, and CASH-LIFT at a representative spectral channel. (b) Evaluation of spectral reconstruction quality between CASSI and CASH-LIFT. (c) Comparison of reconstruction quality between Hyper-LIFT and CASH-LIFT under different SNR (Signal-to-Noise Ratio), characterized with PSNR (Peak Signal-to-Noise Ratio), and SSIM (Structural Similarity Index Measure) as metrics.

In Fig. 2(c), we compare the robustness of Hyper-LIFT and CASH-LIFT by adding Gaussian and Poisson noise to the simulated raw image. We scaled the signal intensity and assessed the reconstruction quality accordingly. In Hyper-LIFT, a slit array is employed for 1D sampling. However, this approach sacrifices light throughput, reducing it to less than 1% and consequently leading to poor reconstruction quality, particularly in low signal-to-noise ratio (SNR) scenarios. On the other hand, CASH-LIFT benefits from its higher light throughput design, achieving 50% light efficiency through the random binary mask. This larger photon budget enables CASH-LIFT to achieve enhanced reconstruction quality even in challenging low SNR conditions.

4. Experimental results

4.1 Enhanced resolution demonstration on a USAF target

The effectiveness of CASSI relies on the inherent sparsity observed in the spectral data of natural scenes. However, verifying the necessary conditions for high-fidelity compressed sensing reconstruction in practical coded aperture designs can be challenging, leading to limitations in the quality of the reconstructed spectral data [22]. In our experiments, we compared the performance of the CASSI system with our CASH-LIFT system using a United States Air Force (USAF) resolution target. The CASSI reconstruction suffered from spatial blurring, resulting in excessively smooth recovered images [Fig. 3(a)]. In contrast, the two-stage compression in the CASH-LIFT system significantly reduces the reconstruction burden in each stage, thereby improving image resolution and mitigating spatial blurring effects. The contrast level, defined as $({I_{max}} - {I_{min}})/({I_{max}} + {I_{min}})$, is measured [Fig. 3(b)]. For line pair groups 1-4 to 1-6, the CASH-LIFT system demonstrates contrast levels of 1, 0.89, and 0.62, while the contrast levels of the CASSI system are 0.55, 0.23, and 0.12.
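
The contrast metric defined above is straightforward to evaluate on a pixel intensity profile drawn across a line-pair group. The sketch below is illustrative only; the function name `michelson_contrast` and the sample profiles are assumptions, not measured data.

```python
def michelson_contrast(profile):
    """Contrast (I_max - I_min) / (I_max + I_min) of a 1D intensity profile."""
    i_max, i_min = max(profile), min(profile)
    if i_max + i_min == 0:
        raise ValueError("contrast is undefined for an all-zero profile")
    return (i_max - i_min) / (i_max + i_min)

# Assumed example profiles across three bars of a line-pair group.
sharp = [1.0, 0.0, 1.0, 0.0, 1.0]      # fully resolved bars
blurred = [0.8, 0.2, 0.8, 0.2, 0.8]    # partially resolved bars
c_sharp = michelson_contrast(sharp)
c_blurred = michelson_contrast(blurred)
```

A contrast of 1 means the dark bars reach zero intensity, while values approaching 0 indicate that the bars are no longer resolved.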

Fig. 3. Enhancement of spatial resolution by cascading LIFT and CASSI. (a) Comparison of reconstruction results between the CASH-LIFT system and CASSI system when imaging a USAF resolution target. The reconstruction result of a representative spectral channel is presented. (b) Pixel intensity profiles of sampling lines i, ii, and iii in (a).

4.2 Hyperspectral imaging of planar objects

To further illustrate the resolution enhancement, we compared our method with CASSI on a planar object, using the Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR) as metrics to assess the reconstruction quality [Fig. 4(a)]. The broadband light spectrum was captured using both our system and a reference fiber spectrometer (O STS-VIS-L-25-400-SMA, Ocean Optics Inc.). The reconstructed spectral intensity distribution was normalized and compared to the ground truth provided by the spectrometer [Fig. 4(b)].
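
For reference, PSNR and a normalized spectral error of the kind used in these comparisons can be computed as sketched below. The text does not state how the spectra are normalized, so peak normalization is an assumption here, as are the function names and the unit peak value; SSIM is omitted because it needs windowed statistics that would not fit a short example.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D lists."""
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def normalised_spectral_mse(spectrum, reference):
    """Mean squared error between two spectra after peak normalisation."""
    s_peak, r_peak = max(spectrum), max(reference)
    return sum((s / s_peak - r / r_peak) ** 2
               for s, r in zip(spectrum, reference)) / len(reference)

# Assumed toy data: a uniform error of 0.1 on a unit-peak image gives 20 dB.
psnr_db = psnr([[0.0, 0.0]], [[0.1, 0.1]])
# A spectrum that differs from the reference only by scale has zero error.
spec_err = normalised_spectral_mse([2.0, 4.0], [1.0, 2.0])
```

Normalizing before comparison removes the overall intensity scale, which differs between the imaging system and the fiber spectrometer.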

Fig. 4. Validation of reconstruction quality and synthetic refocus. (a,b) Comparison of reconstruction results between the CASH-LIFT system and CASSI system when imaging a sparse object. The reconstruction quality is assessed using SSIM and PSNR metrics. The spatial recovery performance is related to spectral sparsity, as indicated by the red dashed box representing single-wavelength illumination (BW = 1 nm), while the remaining image corresponds to broadband illumination (BW = 150 nm). We observe significant spatial performance improvement while maintaining similar spectral performance compared to CASSI. (c) The resolving capability gains robustness in dealing with samples of different orientations. (d) Reconstructed image of an artificial eye phantom depicting a complex scene with sparse features. (e) Validation of synthetic refocusing. The reconstruction result of a representative spectral channel is presented. The plot shows how shearing the sub-images according to their positions restores sharp focus to a previously defocused object, and how the required degree of shearing depends on the object's depth.

As demonstrated in Fig. 4(a), the fidelity of CASSI reconstruction is highly influenced by the sparsity in both the spatial and spectral domains. Under sparse spectrum conditions, such as the scenario with 1 nm bandwidth (indicated by the red dashed box), CASSI can resolve high-spatial-frequency components (sharp features and edges) of the object in the reconstruction. However, the reconstructed image exhibits more artifacts and spatial blurring under broadband spectrum conditions. This is due to the correspondence between spatial blurring in CASSI and the sparsity in the spectral domain. In other words, CASSI encounters limitations in image quality when reconstructing less spectrally sparse scenes. By effectively distributing the computational burden to both stages, CASH-LIFT relaxes the sparsity requirement in the spectral domain for CASSI reconstruction, leading to improved reconstruction quality. Furthermore, the resolving capability of CASSI is also influenced by the relative orientation of the object with respect to the dispersion direction. In Fig. 4(c), the CASSI reconstruction quality of the vertical and oblique bars is lower than that of the horizontal bar. This discrepancy is attributed to the spatial blurring introduced by spectral shearing, which predominantly affects the dispersion direction (horizontal). In contrast, CASH-LIFT achieved better image quality and improved resolving ability for objects with different orientations.

We also tested our system by imaging a standard eye phantom (wide-field model eye, Rowe Technical Design), which features a realistic eye lens and a vasculature-like pattern painted on the back surface [Fig. 4(d)]. We observed that sparse objects with fine spatial details and complex spectra benefit the most from our method. Moreover, CASH-LIFT captures light field information, enabling 3D reconstruction. We evaluated the synthetic refocusing performance by placing a planar object at different depths relative to the imaging system's object plane. For each depth, we captured an image and reconstructed a series of focal stack images for each spectral channel, as depicted in Fig. 4(e). By shifting each projection accordingly during post-processing, the object comes into focus at its specific depth.

4.3 Dynamic hyperspectral volumetric imaging

CASH-LIFT's snapshot 3D hyperspectral imaging capability enables simultaneous recording of a hyperspectral datacube at different depths. We demonstrated this capability by imaging the letters ‘USAF’ on a resolution target (R3L3S1N, Thorlabs Inc) illuminated with rainbow light [Fig. 5(a)]. To achieve a broad-spectrum projection, we utilized a linear variable visible bandpass filter (LVF) (88365, Edmund Optics Inc) placed in front of a broadband halogen light source (HL250-AY, AmScope Inc). The LVF has a coating thickness that linearly varies along one dimension, resulting in continuous variation in spectral transmission.

Fig. 5. Dynamic 3D hyperspectral imaging experiments. (a) An illumination system was designed to capture 3D hyperspectral video. The linear variable filter creates rainbow illumination, while the tunable filter linearly scans the center wavelength. The object was mounted on a motorized stage controlled by a computer, and the tunable filter, motorized stage, and camera were synchronized. (b) The upper row displays representative frames of the letters, while the lower row shows the letters placed at another depth. Through synthetic refocus in post-processing, clear images of dynamic scenes at various depths can be reconstructed.

To project the broad spectrum onto the field of view, we used a lens pair with a focal length ratio of 3.3:1 (MAP1030100-A, Thorlabs Inc) to demagnify the linear filter and relay it to an intermediate plane where the letter object was positioned. The tunable filter temporally sweeps the center wavelength through the entire spectrum band with a speed of 10 nm/s. This configuration ensured that at each frame, only one lateral location of the object exhibited a distinct color. Furthermore, the letter object was mounted on a motorized stage (PT1-Z8, Thorlabs Inc) that moved at an approximate speed of 2 mm/s along the depth direction. We reconstructed a sharp image sequence (12 Hz) for each spectral channel and depth using synthetic refocusing in post-processing [Fig. 5(b), Visualization 1]. This approach allowed us to obtain dynamic imaging results capturing the variation of objects or illumination over different depths in the scene.

5. Discussion and conclusion

Cascaded compressed imaging offers a highly efficient approach for capturing high-dimensional light datacubes. In conventional multidimensional imaging, acquiring optical information along an additional dimension can significantly increase the data load, presenting challenges in terms of data transfer, storage, and processing. However, cascaded compression addresses this issue by compressing the optical information along the desired axis of interest and mapping it onto a lower-dimensional spatial axis. This approach enables efficient data acquisition by capturing and representing only the essential information rather than measuring the complete dataset. For instance, in the CASH-LIFT system, instead of capturing duplicated sub-aperture images with disparity information as done in conventional light field systems, LIFT provides an efficient means to capture light field information using cylindrical lenses to reduce the data dimensions. The snapshot capability and high data acquisition efficiency of CASSI distinguish it from traditional scanning-based spectral imagers.

Our two-stage compressed imaging scheme can be extended to measure other plenoptic dimensions by adapting different coded aperture systems as the second stage, incorporating spatial encoding, shearing, and integration operators. The first stage can utilize the LIFT module to obtain the projection that downsamples spatial information along the desired shearing direction, leveraging optical blurring as a prior during reconstruction to reduce the computational burden. While currently combining LIFT and CASSI for (x, y, u, v, λ) imaging, the cascaded compressed framework can be expanded to capture additional dimensions of light, such as the time (t) dimension using a streak camera [41], and the polarization (p) dimension using a birefringent crystal [42,43].

Moreover, the competitive advantage of distributing the computational burden in cascaded compressed imaging extends beyond the CASSI stage (encoding, shearing, and integration stage) and also applies to the reconstruction of the LIFT stage. Similar to sparse-view computed tomography (CT), LIFT reconstruction is prone to artifacts and anisotropic resolution due to the limited number of projection measurements [35]. However, in the case of CASH-LIFT, instead of utilizing a narrow slit to select a single line of the projection image as in Hyper-LIFT, the entire frame is preserved as the first stage measurement. Although this results in a reduced spectral resolution, it broadens the power spectrum of each sub-aperture image captured. Under the same number of projection conditions, the additional sampling information in the Fourier domain contributes to improved reconstruction quality [Supplement 1].

Further improvement is possible by implementing appropriate regularizations and adjusting weighting coefficients to adapt to different scene types. In our method, the use of total variation (TV) regularization has proven effective in preserving image quality during optimization, with first-order TV regularization preserving sharp edges and second-order TV regularization resolving smooth transitions [44]. By synergistically combining these regularizers, we can elevate the image reconstruction process to achieve a level of superior image quality beyond what each regularization method can accomplish individually. Additionally, integrating deep learning techniques holds great promise for enhancing snapshot compressive imaging and efficiently solving ill-posed inverse problems while reducing computation time [45,46]. Embracing these advancements will lead to even better results in various applications.

In conclusion, we have developed a novel cascaded compressed imaging scheme, CASH-LIFT, which efficiently maps high-dimensional datacubes to a 2D sensor by cascading LIFT and CASSI techniques. This approach enables the simultaneous capture of high-dimensional information of the plenoptic function from a single 2D acquisition with a single camera, setting it apart from other hybrid and multi-modal hyperspectral light field imaging systems. Our CASH-LIFT system demonstrated 5D hyperspectral volumetric dynamic imaging over a substantial spatial volume (∼8 mm × 8 mm × 12.5 mm) and a spectral range of 150 nm (25 spectral channels) with high reconstruction quality. Notably, the cascaded compressed system design, employing light field sub-image downsampling in the LIFT stage, significantly enhances the reconstruction quality of CASSI, especially in less sparse spectrum cases, while retaining its snapshot advantage. This advantage positions the cascaded compressed CASH-LIFT approach as a promising solution to the challenges in multidimensional optical imaging, with potential impact on broad applications in both basic and applied sciences, such as snapshot optical coherence tomography and 3D multispectral fluorescence microscopy.

Funding

National Institutes of Health (R01HL165318, RF1NS128488).

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. Gordon Wetzstein, Ivo Ihrke, Douglas Lanman, et al., “Computational plenoptic imaging,” Computer Graphics Forum 30(8) (Blackwell Publishing Ltd, Oxford, UK, 2011).

2. Edward H. Adelson and James R. Bergen, “The plenoptic function and the elements of early vision,” Computational models of visual processing 1(2), 3–20 (1991).

3. L. Gao and L. V. Wang, “A review of snapshot multidimensional optical imaging: measuring photon tags in parallel,” Phys. Rep. 616, 1–37 (2016).

4. M. Abdo, V. Badilita, and J. Korvink, “Spatial scanning hyperspectral imaging combining a rotating slit with a Dove prism,” Opt. Express 27(15), 20290–20304 (2019).

5. Y. J. Hsu, C.-C. Chen, C.-H. Huang, C.-H. Yeh, L.-Y. Liu, and S.-Y. Chen, “Line-scanning hyperspectral imaging based on structured illumination optical sectioning,” Biomed. Opt. Express 8(6), 3005–3016 (2017).

6. P.-H. Cu-Nguyen, A. Grewe, M. Hillenbrand, S. Sinzinger, A. Seifert, and H. Zappe, “Tunable hyperchromatic lens system for confocal hyperspectral sensing,” Opt. Express 21(23), 27611–27621 (2013).

7. G. Di Caprio, D. Schaak, and E. Schonbrun, “Hyperspectral fluorescence microfluidic (HFM) microscopy,” Biomed. Opt. Express 4(8), 1486–1493 (2013).

8. M. C. Phillips and N. Hô, “Infrared hyperspectral imaging using a broadly tunable external cavity quantum cascade laser and microbolometer focal plane array,” Opt. Express 16(3), 1836–1845 (2008).

9. L. Gao, R. T. Kester, and T. S. Tkaczyk, “Compact image slicing spectrometer (ISS) for hyperspectral fluorescence microscopy,” Opt. Express 17(15), 12293–12308 (2009).

10. L. Gao, R. T. Smith, and T. S. Tkaczyk, “Snapshot hyperspectral retinal camera with the image mapping spectrometer (IMS),” Biomed. Opt. Express 3(1), 48–54 (2012).

11. L. Gao, R. T. Kester, N. Hagen, and T. S. Tkaczyk, “Snapshot image mapping spectrometer (IMS) with high sampling density for hyperspectral microscopy,” Opt. Express 18(14), 14330–14344 (2010).

12. M. E. Pawlowski, J. G. Dwight, T.-U. Nguyen, and T. S. Tkaczyk, “High performance image mapping spectrometer (IMS) for snapshot hyperspectral imaging applications,” Opt. Express 27(2), 1597–1612 (2019).

13. B. K. Ford, M. R. Descour, and R. M. Lynch, “Large-image-format computed tomography imaging spectrometer for fluorescence microscopy,” Opt. Express 9(9), 444–453 (2001).

14. M. Descour and E. Dereniak, “Computed-tomography imaging spectrometer: experimental calibration and reconstruction results,” Appl. Opt. 34(22), 4817–4826 (1995).

15. M. Gehm, R. John, D. Brady, R. Willett, and T. Schulz, “Single-shot compressive spectral imaging with a dual-disperser architecture,” Opt. Express 15(21), 14013–14027 (2007).

16. A. Wagadarikar, R. John, R. Willett, and D. Brady, “Single disperser design for coded aperture snapshot spectral imaging,” Appl. Opt. 47(10), B44–B51 (2008).

17. A. A. Wagadarikar, N. P. Pitsianis, X. Sun, and D. J. Brady, “Video rate spectral imaging using a coded aperture snapshot spectral imager,” Opt. Express 17(8), 6368–6388 (2009).

18. N. Hagen and M. Kudenov, “Review of snapshot spectral imaging technologies,” Opt. Eng. 52(9), 090901 (2013).

19. N. Hagen, L. Gao, T. Tkaczyk, and R. Kester, “Snapshot advantage: a review of the light collection improvement for parallel high-dimensional measurement systems,” Opt. Eng. 51(11), 111702 (2012).

20. Emmanuel J. Candès, “Compressive sampling,” in Proceedings of the International Congress of Mathematicians, Vol. 3 (2006).

21. David Kittle, Kerkil Choi, Ashwin Wagadarikar, and David J. Brady, “Multiframe image estimation for coded aperture snapshot spectral imagers,” Appl. Opt. 49(36), 6824–6833 (2010).

22. Yuehao Wu, Iftekhar O. Mirza, Gonzalo R. Arce, and Dennis W. Prather, “Development of a digital-micromirror-device-based multishot snapshot spectral imaging system,” Opt. Lett. 36(14), 2692–2694 (2011).

23. Henry Arguello and Gonzalo R. Arce, “Code aperture optimization for spectrally agile compressive imaging,” J. Opt. Soc. Am. A 28(11), 2400–2413 (2011).

24. Lizhi Wang, Zhiwei Xiong, Dahua Gao, Guangming Shi, and Feng Wu, “Dual-camera design for coded aperture snapshot spectral imaging,” Appl. Opt. 54(4), 848–858 (2015).

25. Lizhi Wang, Zhiwei Xiong, Guangming Shi, et al., “Compressive hyperspectral imaging with complementary RGB measurements,” 2016 Visual Communications and Image Processing (VCIP). IEEE, 2016.

26. Hoover Rueda, Henry Arguello, and Gonzalo R. Arce, “Dual-ARM VIS/NIR compressive spectral imager,” 2015 IEEE International Conference on Image Processing (ICIP). IEEE, 2015.

27. Lizhi Wang, Zhiwei Xiong, Dahua Gao, et al., “High-speed hyperspectral video acquisition with a dual-camera architecture,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015).

28. Xin Yuan, Tsung-Han Tsai, Ruoyu Zhu, et al., “Compressive hyperspectral imaging with side information,” IEEE J. Sel. Top. Signal Process. 9(6), 964–976 (2015).

29. K. Zhu, Y. Xue, Q. Fu, S. B. Kang, X. Chen, and J. Yu, “Hyperspectral light field stereo matching,” IEEE Trans. Pattern Anal. Mach. Intell. 41(5), 1131–1143 (2019).

30. S. Zhu, L. Gao, Y. Zhang, J. Lin, and P. Jin, “Complete plenoptic imaging using a single detector,” Opt. Express 26(20), 26495–26510 (2018).

31. X. Lv, Y. Li, S. Zhu, X. Guo, J. Zhang, J. Lin, and P. Jin, “Snapshot spectral polarimetric light field imaging using a single detector,” Opt. Lett. 45(23), 6522–6525 (2020).

32. Q. Cui, J. Park, R. T. Smith, and L. Gao, “Snapshot hyperspectral light field imaging using image mapping spectrometry,” Opt. Lett. 45(3), 772–775 (2020).

33. Z. Xiong, L. Wang, H. Li, D. Liu, and F. Wu, “Snapshot hyperspectral light field imaging,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2017), pp. 3270–3278.

34. J. Holloway, K. Mitra, S. Koppal, and A. Veeraraghavan, “Generalized assorted camera arrays: robust cross-channel registration and applications,” IEEE Trans. on Image Process. 24(3), 823–835 (2015).

35. Qi Cui, Jongchan Park, Yayao Ma, and Liang Gao, “Snapshot hyperspectral light field tomography,” Optica 8(12), 1552–1558 (2021).

36. X. Feng and L. Gao, “Ultrafast light field tomography for snapshot transient and non-line-of-sight imaging,” Nat. Commun. 12(1), 2179 (2021).

37. Zhaoqiang Wang, Tzung K. Hsiai, and Liang Gao, “Augmented light field tomography through parallel spectral encoding,” Optica 10(1), 62–65 (2023).

38. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn. 3(1), 1–122 (2011).

39. Chengbo Li, Wotao Yin, and Yin Zhang, “User's guide for TVAL3: TV minimization by augmented lagrangian and alternating direction algorithms,” CAAM report 20, 46–47 (2009).

40. A. Beck and M. Teboulle, “A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems,” SIAM J. Imaging Sci. 2(1), 183–202 (2009).

41. Liang Gao, Jinyang Liang, Chiye Li, et al., “Single-shot compressed ultrafast photography at one hundred billion frames per second,” Nature 516(7529), 74–77 (2014).

42. Tsung-Han Tsai and David J. Brady, “Coded aperture snapshot spectral polarization imaging,” Appl. Opt. 52(10), 2153–2161 (2013).

43. Jianglan Ning, Zhilong Xu, Dan Wu, et al., “Compressive circular polarization snapshot spectral imaging,” Opt. Commun. 491, 126946 (2021).

44. Stephan Didas, Simon Setzer, and Gabriele Steidl, “Combined ℓ2 data and gradient fitting in conjunction with ℓ1 regularization,” Adv. Comput. Math. 30(1), 79–99 (2009).

45. Lishun Wang, Zongliang Wu, Yong Zhong, et al., “Snapshot spectral compressive imaging reconstruction using convolution and contextual transformer,” Photonics Res. 10(8), 1848–1858 (2022).

46. Xiaowan Hu, Yuanhao Cai, Jing Lin, et al., “HDNet: High-resolution dual-domain learning for spectral compressive imaging,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022).

Supplementary Material (2)

Supplement 1: Supplement
Visualization 1: Hyperspectral volumetric imaging results of CASH-LIFT system

Figures (5)

Fig. 1. Optical schematic and image formation model. (a) Comparison of the working principles between CASSI and CASH-LIFT. ${C_{xy}}$, spatial encoding operator; ${S_x}$, spectral shearing operator; $I$, spatial-spectral integration; ${R_\theta }$, angular rotation operator; ${T_x}$, spatial integration operator. (b) The Dove prism array and cylindrical array are located at the Fourier plane of the main lens. The 2D light field sub-image is downsampled along the invariant axis of the cylindrical lens. A binary mask and a wedge prism modulate the LIFT output image and encode high-dimensional spectral data into 2D space. The wedge dispersed direction is parallel to the non-power axis of the cylindrical lens. (c) The reconstruction pipeline of our imaging method. Leveraging the reduced information content along the spectral dispersion direction, the spectral projection cube can be efficiently reconstructed in the CASSI stage, reducing computation load. The object can then be reconstructed in five dimensions ($x, y, \theta, \varphi, \lambda$) with higher light throughput and spatial resolution by combining coded aperture imaging, computational tomography, and light field imaging.
Fig. 2. Simulation results of USAF target/planar object reconstruction. (a) Comparison of spatial resolution between CASSI, light field CASSI, and CASH-LIFT at a representative spectral channel. (b) Evaluation of spectral reconstruction quality between CASSI and CASH-LIFT. (c) Comparison of reconstruction quality between Hyper-LIFT and CASH-LIFT under different SNR (Signal-to-Noise Ratio), characterized with PSNR (Peak Signal-to-Noise Ratio), and SSIM (Structural Similarity Index Measure) as metrics.
Fig. 3. Enhancement of spatial resolution by cascading LIFT and CASSI. (a) Comparison of reconstruction results between the CASH-LIFT system and CASSI system when imaging a USAF resolution target. The reconstruction result of a representative spectral channel is presented. (b) Pixel intensity profiles of sampling lines i, ii, and iii in (a).
Fig. 4. Validation of reconstruction quality and synthetic refocus. (a,b) Comparison of reconstruction results between the CASH-LIFT system and CASSI system when imaging a sparse object. The reconstruction quality is assessed using SSIM and PSNR metrics. The spatial recovery performance is related to spectral sparsity, as indicated by the red dashed box representing single-wavelength illumination (BW = 1 nm), while the remaining image corresponds to broadband illumination (BW = 150 nm). We observe significant spatial performance improvement while maintaining similar spectral performance compared to CASSI. (c) The resolving capability is robust to samples of different orientations. (d) Reconstructed image of an artificial eye phantom depicting a complex scene with sparse features. (e) Validation of synthetic refocusing. The reconstruction result of a representative spectral channel is presented. Shearing the sub-images according to their positions restores sharp focus to a previously defocused object; the plot shows the dependence of the degree of shearing on the object's depth.
Fig. 5. Dynamic 3-D hyperspectral imaging experiments. (a) An illumination system was designed to capture 3-D hyperspectral video. The linear variable filter creates rainbow illumination, while the tunable filter linearly scans the center wavelength. The object was mounted on a motorized stage controlled by a computer, and the tunable filter, motorized stage, and camera were synchronized. (b) The upper row displays representative frames of the letters, while the lower row shows the letters placed at another depth. Through synthetic refocus in post-processing, clear images of dynamic scenes at various depths can be reconstructed.

Equations (7)

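$$b(\lambda) = T_x R_\theta\, g(\lambda),$$

$$m = I S_x C_{xy}\, b(\lambda) = \Phi\, b(\lambda) + e.$$

$$m = I S_x C_{xy} T_x R_\theta\, g(\lambda).$$

$$c_r = \frac{N_x \times N_y}{\frac{N_x}{n} \times N_y \times N_\theta}\;\frac{\frac{N_x}{n} \times N_y \times N_\lambda}{(N_x + N_\lambda) \times N_y} = \frac{N_x \times N_\lambda}{(N_x + N_\lambda) \times N_\theta}.$$

$$\hat{b} = \mathop{\mathrm{argmin}}_{b}\; \|m - \Phi b\|_2^2 + \mu\,\varphi(b), \quad \text{s.t. } b \ge 0$$

$$\hat{g} = \mathop{\mathrm{argmin}}_{g}\; \|b - T_x R_\theta\, g\|_2^2 + \rho\,\varphi(g),$$

$$s u \cos(\theta) + s v \sin(\theta).$$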
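As a worked check of the compression ratio expression listed above, the following sketch multiplies its two factors, $(N_x N_y)/((N_x/n) N_y N_\theta)$ and $((N_x/n) N_y N_\lambda)/((N_x + N_\lambda) N_y)$, and compares the product with the closed form $N_x N_\lambda / ((N_x + N_\lambda) N_\theta)$ using exact rational arithmetic. The function names are hypothetical, and apart from the 25 spectral channels reported in the conclusion, the numerical values ($N_x = N_y = 256$, $N_\theta = 7$, $n = 32$) are illustrative assumptions rather than the system's actual parameters.

```python
from fractions import Fraction

def first_factor(n_x, n_y, n_theta, n):
    """First fraction: N_x*N_y over (N_x/n)*N_y*N_theta."""
    return Fraction(n_x * n_y) / (Fraction(n_x, n) * n_y * n_theta)

def second_factor(n_x, n_y, n_lambda, n):
    """Second fraction: (N_x/n)*N_y*N_lambda over (N_x + N_lambda)*N_y."""
    return (Fraction(n_x, n) * n_y * n_lambda) / ((n_x + n_lambda) * n_y)

def closed_form(n_x, n_lambda, n_theta):
    """Simplified expression: N_x*N_lambda over (N_x + N_lambda)*N_theta."""
    return Fraction(n_x * n_lambda, (n_x + n_lambda) * n_theta)

# Illustrative (assumed) sizes; only N_LAMBDA = 25 comes from the text.
N_X, N_Y, N_THETA, N_LAMBDA, N = 256, 256, 7, 25, 32

c_r = first_factor(N_X, N_Y, N_THETA, N) * second_factor(N_X, N_Y, N_LAMBDA, N)

# The downsampling factor n and N_y cancel in the product.
assert c_r == closed_form(N_X, N_LAMBDA, N_THETA)
print(c_r, float(c_r))
```

The cancellation shows that, under this expression, the overall ratio depends only on $N_x$, $N_\lambda$, and $N_\theta$, not on the downsampling factor $n$ or on $N_y$.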