
LED-based compressive spectral-temporal imaging

Open Access

Abstract

A compressive spectral-temporal imaging system is reported. A multi-spectral light-emitting diode (LED) array is used for target illumination and spectral modulation, while a digital micro-mirror device (DMD) encodes the spatial and temporal frames. Several encoded video frames are captured in a single snapshot of an integrating focal plane array (FPA). A high-frame-rate spectral video is then reconstructed from the sequence of compressed measurements captured by the low-frame-rate grayscale camera. The imaging system is optimized through the design of the DMD patterns based on the forward model. A laboratory implementation validates the performance of the proposed imaging system. We experimentally demonstrate video acquisition with eight spectral bands and six temporal frames per FPA snapshot; thus a 256 × 256 × 8 × 6 4D cube is reconstructed from a single 2D measurement.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

High frame-rate video recording poses challenging hardware requirements such as faster acquisition, larger memory buffers, and broader bandwidths. These demands are only amplified if, in addition, the spectral content of the video scenes is of interest. Multi-spectral sensing usually requires spatial or spectral scanning, so temporal resolution is often sacrificed. Compressive sensing (CS) has been used to overcome these limitations without increasing the system volume or power consumption [1]. Compressive sampling is based on the assumption that the underlying signals are sparse or compressible in some transform domain, allowing the entire signal to be reconstructed from relatively few measurements. Spectral video streams exhibit strong inter-voxel correlation not only in space and time but along the spectral dimension as well, and are thus amenable to compressive sampling. In this paper, we aim to develop a new imaging system to capture high-resolution, high-speed spectral video streams using CS. Fundamental to the principles of CS are coded projections, where high-dimensional data streams are coded and projected onto detectors spanning lower dimensions. Coding strategies, as such, play a key role in any CS imaging system. Several coding mechanisms have been proposed to sample video signals, including spatial light modulators [2] that enable dynamic coded apertures, strobe shutters [3] to temporally code the incoming optical field, and dispersive elements [4–7] used in concert with coded apertures to code the spectral data-cube.

Compressive measurements for dynamic imaging record a coded dynamic scene into a sequence of detector array snapshots, from which many more video frames can be recovered. Coded aperture compressive temporal imaging (CACTI) [8,9] was recently introduced; it uses a harmonically driven binary coded aperture during the exposure of a video capture. In this way, each temporal frame in the video sequence is modulated by a shifted version of the code. Decoding of the signal is subsequently done using one of the many reconstruction algorithms available in CS, whereby the multiplexed temporal frames are separated from the compressed measurements [5,10–21]. CACTI can be generalized by using a dispersive element to spectrally modulate the optical source. Spectral coding is done by a coded aperture and a dispersive element placed after the coded aperture [22]. The detection integrates the coded spectral planes. The video stream can be recovered by isolating each spectral plane based on its local coding structure. This line of work has been termed snapshot compressive imaging and is summarized in [23].

Given that optical coding is at the heart of CACTI and its multi-spectral camera extension, one may ask if there are other efficient approaches to realize optical coded projections of spectral video streams, in addition to mechanical movement of coded apertures. This is, in fact, the case if one exploits high-speed switching light-emitting diode (LED) illumination arrays. In particular, we develop coding strategies that build on the concepts used in LED-illumination multi-spectral cameras. In such cameras, the sensor sequentially captures images of a scene under $N_l$ different color LED lights, thus producing an $N_l$-band spectral image of the scene. The multi-spectral image cube is thus built by sequentially scanning the cube along the spectral dimension. Multi-spectral LEDs have been successfully implemented in spectral imaging systems for static scenes [24–27].

In this paper, a novel LED-based compressive spectral-temporal imaging (LeSTI) system is proposed, optimized, and implemented for the acquisition of high-speed spectral-temporal video. Different from traditional compressive spectral imaging systems [5,28–30], the proposed LeSTI system employs an active multi-spectral LED array without the use of any dispersive elements or spatially varying color filters. It consists of an active multi-spectral LED illuminator, a digital micro-mirror device (DMD) on the focal plane, an objective lens, an imaging lens, and a grayscale sensor array. The experimental test-bed for this system is shown in Fig. 1. During a measurement snapshot, different types of multi-spectral LEDs are sequentially turned on and off for spectral modulation, while the DMD rapidly alters its patterns for spatial modulation. The sensor then compresses multiple coded spectral-temporal images into a single 2D projection. CS snapshot algorithms are then used to recover a high-frame-rate spectral video from a set of successively collected low-frame-rate measurement frames. The inverse problem of the CS system is ill-posed, so several types of priors are applied in the reconstruction process to test the capability of the system. Specifically, we apply gradient projection for sparse reconstruction (GPSR) [12] with a sparsity prior, generalized alternating projection with total variation (GAP-TV) [31], and decompress snapshot compressive imaging (DeSCI) [15], which uses the alternating direction method of multipliers with weighted nuclear norm minimization.

Fig. 1. LeSTI testbed. Spectral LED illumination is used for spectral scene modulation. An imaging lens focuses the spectrally modulated scene onto the DMD imaging plane, where spatial modulations are imposed. A second objective lens focuses the spatially modulated scene onto the imaging plane, where a grayscale detector captures multiple temporally modulated video frames.

This sensing process is characterized through the development of a forward model. Based on the forward model, computer simulations are performed with a laboratory-measured video data-cube, where random block-unblock DMD patterns and LED scanning are applied. DMD patterns are then designed to attain good reconstruction quality. Several block-unblock patterns are compared in [2] for their performance in compressive temporal imaging systems. We further propose to use multi-frame blue noise coding [32] for the DMD patterns in the proposed spectral-temporal imaging system to improve the reconstruction quality. With the proposed imaging system, high-temporal-resolution spectral videos can be recovered with an off-the-shelf grayscale video-rate camera, aided by the DMD and the multi-spectral LED array. Our system relies on active LED illumination, and ambient light can introduce noise to the system. We believe that the proposed system has numerous potential applications in life sciences and industrial imaging where controlled illumination is often the norm [7,26,33,34]. The main contributions of this paper are summarized below:

  • 1) A novel compressive spectral-temporal imaging system is proposed to acquire 4D spectral-temporal videos.
  • 2) Different system configurations are compared under the proposed imaging design, e.g. modulation pattern transmittance, pattern schematics (random/blue-noise patterns), and temporal compression ratio.
  • 3) Several CS reconstruction algorithms are tested with our imaging system. From the reconstruction quality perspective, DeSCI [15] provides the best results; however, its running time is on the order of hours. GAP-TV [31], on the other hand, provides a good trade-off between speed and quality.
  • 4) A field-programmable gate array (FPGA) controller is built to synchronize the multi-spectral LED array, the DMD, and a grayscale camera for capturing continuously moving targets. With this synchronization system, the proposed imaging system is implemented in an optical test-bed to experimentally capture high-speed spectral videos.

2. Imaging system

The proposed optical system architecture is shown in Fig. 2(a). The multi-spectral LEDs are used for spectral modulation and the DMD for spatial modulation, while the fast alternation of the LEDs and DMD patterns provides temporal modulation; thus the 4D spatio-spectral-temporal data is encoded prior to measurement. Specifically, the LEDs act like optical filters, which are used to modulate the spectra. The DMD adds binary block-unblock spatial coding by switching the micro-mirror tiles between two directions. Synchronized LEDs and DMD modulate the 4D data-cube in space, spectrum, and time; the overall coding makes it possible to compress a 4D data-cube onto a single 2D snapshot. The system relies on the high switching speed of the LEDs and the DMD. An off-the-shelf low-speed grayscale detector is used for sensing on the focal plane array (FPA). The principles of the LeSTI system are described next.

Fig. 2. (a) LeSTI system architecture. Multi-spectral LEDs provide the illumination and spectral modulation; a DMD is used for imposing spatial binary patterns; a grayscale sensor is utilized for image capture. (b) Spectra of the eight LED types shown in the 400-700 nm wavelength range. (c) The designed LED illuminator with the multi-spectral LEDs used in our system. Eight different types of LEDs are used and evenly placed on a circuit board with three elements of each type.

2.1 Imaging system model

A multi-spectral LED array is used as the illumination source in our system to spectrally modulate the scene. The various low-cost LEDs have different spectra and can switch on and off independently at high speed; LED switching speeds can run from megahertz up to gigahertz [35]. Figure 2(b) shows the spectra of the eight types of LEDs (from LUMILEDS [36]) covering 450 nm - 690 nm used in our prototype, as measured by a broadband Ocean Optics USB2000+ spectrometer. Figure 2(c) shows the LED board used in our system.

Let $f(x,y,\lambda , t)$ denote the spatio-spectral-temporal 4D scene. In order to compress the 4D scene onto a 2D sensor, a 4D modulation pattern, $m(x,y,\lambda , t)$ is applied to the scene, leading to the following forward model

$$g(x', y') = \iint f(x,y,\lambda,t) m(x,y,\lambda,t) d \lambda d t,$$
where $g(x', y')$ is the measurement captured by the sensor. In this paper, a grayscale sensor is used. Since the pixels in the DMD and the grayscale sensor share the same resolution and are perfectly aligned, the sampling coordinate $(x', y')$ is the same as $(x,y)$.

To implement the 4D modulation $m(x,y,\lambda ,t)$, the LED illumination array and the DMD are used. The 4D modulation process can be interpreted as applying temporal modulation on top of a 3D spectral modulation. For the 3D spectral modulation, $N_l$ types of LEDs sequentially illuminate a 3D spatio-spectral scene within a time interval $\delta _T$, where each LED type illuminates for an interval $\delta _t = \delta _T/N_l$. Within each illumination period $\delta _t$, the DMD imposes a unique 2D spatial pattern on the LED-projected scene, in such a way that the 3D spatio-spectral scene is modulated and can be compressed for sensing. In order to modulate the 4D video scene in time, the 3D modulation process is repeated $N_t$ times, modulating $N_t$ temporal frames by illuminating the $N_l$ types of LEDs with $N_l$ new spatial patterns in each temporal frame, as shown in Fig. 3(a). This assumes that within a temporal frame period $\delta _T$ little movement occurs in the scene, so the same 3D spatio-spectral scene is assumed to be modulated in that period. The video scene is captured with a snapshot every $\Delta _T$ seconds spanning $N_t$ frame intervals, where $\Delta _T = N_t \delta _T$ is the exposure time.
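To make the timing concrete, the short sketch below computes the modulation intervals from the system parameters. The LED illumination interval is an illustrative assumption, chosen here so that the numbers match the frame rates reported for the experimental set-up in Section 5; it is not a measured specification.

```python
# Timing of the LeSTI modulation (illustrative values, not measured specs).
N_l = 8            # number of LED types (spectral channels)
N_t = 6            # temporal frames compressed per snapshot
delta_t = 0.625e-3 # assumed LED interval [s]; one DMD pattern per interval

delta_T = N_l * delta_t   # one temporal frame: all N_l LEDs scanned once
Delta_T = N_t * delta_T   # sensor exposure: N_t temporal frames per snapshot

print(f"frame interval delta_T = {delta_T*1e3:.1f} ms "
      f"-> reconstructed video rate = {1/delta_T:.0f} fps")   # 200 fps
print(f"exposure Delta_T = {Delta_T*1e3:.1f} ms "
      f"-> measurement rate = {1/Delta_T:.1f} fps")           # 33.3 fps
```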

Fig. 3. (a) Timing sequence in the LeSTI sensing. (b) LeSTI system sensing process (bottom row). A pixel on the scene shows the spectral modulation in the system (middle row).

As mentioned before, the system uses $N_l$ different types of LEDs, each with a different spectral profile, and thus $N_\lambda = N_l$ spectral bands can be recovered from the measurement frames. However, the LEDs could also be modulated in intensity, which would enable the LeSTI system to recover a larger number of spectral bands [27]; this is left for future work. The 4D scene can be discretized as the 4D data-cube ${{\textbf {F}}} \in {\mathbb {R}}^{N_x\times N_y \times N_{\lambda } \times N_t}$, where $N_x$ and $N_y$ denote the number of pixels in the 2D space, and $N_\lambda$ and $N_t$ denote the number of spectral bands and temporal frames, respectively. In the following, we use $n_{\lambda }$ and $n_t$ to index the spectral band and temporal frame, respectively. The modulation is carried out in two parts: i) the $N_l$ different LEDs spectrally code the data-cube, and ii) the DMD imposes spatial binary patterns on the coded 4D data-cube, as depicted in Fig. 3(b). In the first part, the spectrum of the data-cube, $\boldsymbol{f}_{n_x,n_y,n_t} \in {\mathbb {R}}^{ N_{\lambda }}$, at pixel $(n_x,n_y)$ of the $n_t$-th temporal frame is projected as $\boldsymbol{f}^{*}_{n_x,n_y,n_t} \in {\mathbb {R}}^{N_{l}}$, where $\boldsymbol{f}^{*}$ denotes the spectral projections by the $N_l$ types of LED illumination at pixel $(n_x,n_y)$ of the $n_t$-th temporal frame of the 4D cube ${{\textbf {F}}}^{*}\in {\mathbb {R}}^{N_x\times N_y \times N_{l} \times N_t}$. Let ${{\textbf {S}}} = (s_{n_l,n_\lambda }) \in {\mathbb {R}}^{ N_{l} \times N_{\lambda }}$, where $s_{n_l,n_\lambda }$ denotes the $n_\lambda$-th sample of the $n_l$-th LED spectral curve. The projection can be described as

$$\boldsymbol{f}^{*}_{n_x,n_y,n_t} = {{\textbf{S}}} \boldsymbol{f}_{n_x,n_y,n_t}.$$

In the second part, spatial patterns are imposed based on the LED projected scene ${{\textbf {F}}}^{*}$. $N_{l} N_t$ spatially different DMD patterns, denoted by ${\textbf {C}} \in {\mathbb {R}}^{N_x\times N_y \times N_{l} \times N_t}$, are applied within one exposure time ($\Delta _T$) of the sensor. Let ${{\textbf {G}}} \in {\mathbb {R}}^{N_x \times N_y}$ denote the discrete measurement captured by the sensor, then the forward model of the system is,

$${{\textbf{G}}} =\textstyle \sum_{n_l=1}^{N_l} \sum_{n_t=1}^{N_t} {\textbf{C}}_{n_l, n_t} \odot {{\textbf{F}}}^{*}_{n_l, n_t} + {{\textbf{E}}},$$
where $\odot$ denotes the element-wise (Hadamard) product, ${{\textbf {E}}} \in {\mathbb {R}}^{N_x \times N_y}$ denotes the measurement noise of the system, and ${{\textbf {F}}}^{*}_{n_l, n_t}\in {\mathbb {R}}^{N_x \times N_y}, {\textbf {C}}_{n_l, n_t}\in {\mathbb {R}}^{N_x \times N_y}$ denote the $(n_l, n_t)$-th frame of ${{\textbf {F}}}^{*}$ and ${\textbf {C}}$, respectively.

The 4D data-cube is formed by a sequence of 2D images varying spectrally and in time. To combine the two parts above, we need to consider how these 2D images in the 4D data-cube are modulated in the system. As the LED illumination gives different spectral weights to the different spectral images in the first part, and the DMD imposes the spatial patterns on the LED-projected images in the second part, each 2D image in the 4D data-cube is modulated by a combination of multiple patterns. The modulation of each 2D image ${{\textbf {M}}}_{n_\lambda , n_t}\in {\mathbb {R}}^{N_x \times N_y}$ of the $n_\lambda$-th spectral band at the $n_t$-th frame can be represented as

$${{\textbf{M}}}_{n_\lambda,n_t} = \textstyle \sum_{n_l=1}^{N_l} s_{n_l,n_\lambda} \cdot {\textbf{C}}_{n_l, n_t}.$$

The modulation sums the $N_l$ binary patterns with different weights. As long as these grayscale patterns are different from each other, the spectral information can be recovered during the inversion. Equation (3) can thus be written as

$${{\textbf{G}}} = \textstyle\sum_{n_\lambda=1}^{N_\lambda} \sum_{n_t=1}^{N_t} {{\textbf{M}}}_{n_\lambda, n_t} \odot {{\textbf{F}}}_{n_\lambda, n_t} + {{\textbf{E}}}.$$
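As a sanity check of Eqs. (2)-(5), the following sketch simulates the discrete forward model with NumPy. All array contents are synthetic placeholders; the variable names (F, S, C, M, G) mirror the notation above, and the shapes are our assumed configuration.

```python
import numpy as np

# Discrete LeSTI forward model, Eqs. (2)-(5): a minimal sketch with synthetic inputs.
Nx, Ny, Nlam, Nt = 256, 256, 8, 6
Nl = 8  # number of LED types; here N_lambda = N_l

rng = np.random.default_rng(0)
F = rng.random((Nx, Ny, Nlam, Nt))          # 4D scene (placeholder data)
S = rng.random((Nl, Nlam))                  # LED spectra matrix, s_{n_l, n_lambda}
C = (rng.random((Nx, Ny, Nl, Nt)) < 0.25).astype(float)  # binary DMD patterns, 25% transmittance

# Eq. (4): the grayscale modulation of band n_lambda at frame n_t is an
# S-weighted sum of the N_l binary DMD patterns of that frame.
M = np.einsum('lk,xylt->xykt', S, C)        # shape (Nx, Ny, Nlam, Nt)

# Eq. (5): the snapshot integrates all coded spectral-temporal frames, plus noise.
E = 0.01 * rng.standard_normal((Nx, Ny))
G = np.einsum('xykt,xykt->xy', M, F) + E
print(G.shape)                               # (256, 256)
```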

The inverse problem is then to recover ${{\textbf {F}}}$ from ${{\textbf {G}}}$. First, we write the forward model in vectorized notation as ${{\boldsymbol{g}}} = \textrm {Vec}({{\textbf {G}}}) \in {\mathbb {R}}^{N_x N_y}$, with $\textrm {Vec}(\cdot )$ denoting vectorization of its matrix argument. Similarly, ${{\boldsymbol{e}}} = \textrm {Vec}({{\textbf {E}}}) \in {\mathbb {R}}^{N_x N_y}$ and $\boldsymbol{f} = \textrm {Vec}({{\textbf {F}}}) \in {\mathbb {R}}^{N_x N_yN_\lambda N_t}$. We denote the sensing matrix as ${{\textbf {H}}} \in {\mathbb {R}}^{N_xN_y \times N_xN_yN_\lambda N_t}$:

$${{\textbf{H}}} = \left[ diag({{\textbf{M}}}_{1,1}), diag({{\textbf{M}}}_{2,1}), \dots, diag({{\textbf{M}}}_{N_{\lambda},1}), diag({{\textbf{M}}}_{1,2}), \dots, diag({{\textbf{M}}}_{N_{\lambda},N_t})\right],$$
where $diag({{\textbf {M}}}_{n_{\lambda }, n_t})$ denotes a diagonal matrix whose diagonal elements are the vectorized matrix ${{\textbf {M}}}_{n_\lambda , n_t}$. The forward model is then written as
$${{\boldsymbol{g}}} = {{\textbf{H}}} \boldsymbol{f} + {{\boldsymbol{e}}}.$$

Several reconstruction algorithms can be used to solve Eq. (7), as discussed in Section 2.3. The compression rate of the system is thus $\frac {1}{N_\lambda N_t}$, and recent theoretical analysis [37] has shown that high-quality reconstruction is possible thanks to the special structure of the sensing matrix in Eq. (6).
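This special structure is what makes the inversion tractable in practice: because ${{\textbf {H}}}$ is a row of diagonal blocks, ${{\textbf {H}}}$ and ${{\textbf {H}}}^{\top}$ can be applied as element-wise operations on the 4D cube, and ${{\textbf {H}}}{{\textbf {H}}}^{\top}$ is itself diagonal. A minimal sketch, reusing the M array from the forward-model snippet above:

```python
import numpy as np

def Hf(M, F):
    """Apply H: collapse a 4D cube into a 2D measurement, Eq. (5) without noise."""
    return np.einsum('xykt,xykt->xy', M, F)

def Ht(M, G):
    """Apply H^T: spread a 2D measurement back into the 4D cube."""
    return M * G[:, :, None, None]

def diag_HHt(M):
    """Diagonal of H H^T (sum of M**2 over bands and frames), as used in Eq. (13)."""
    return np.einsum('xykt,xykt->xy', M, M)
```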

2.2 DMD pattern design

As mentioned above, the DMD modulations play an important role in the LeSTI system. Random binary patterns with elements in $\{0,1\}$ are widely used in CS, in which ‘0’ denotes blocking the light and ‘1’ denotes passing the light unaltered [16]. In LeSTI, we use random binary codes, as well as blue noise codes, which have been shown to improve the reconstructed image quality in spectral imaging [38]. Blue noise was introduced in digital halftoning to distribute stochastic dither patterns as homogeneously as possible [39–41]. Blue noise binary patterns preserve both randomness and high-frequency properties, making blue noise coding a good alternative to random binary coding. Such blue noise binary patterns were applied in the design of spatial-polarization and spatial-spectral coding patterns [38,42], and outperformed random (white noise) coding in different compressive spectral imaging systems. Comparisons of reconstruction results using random and blue noise codings are shown in Section 4.
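For illustration, the sketch below generates a blue-noise-like binary pattern at a target transmittance with a simple void-and-cluster style swap heuristic; it is a hedged stand-in for, not a reproduction of, the multitone dithering method of [32]. The Gaussian width and iteration count are arbitrary choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blue_noise_pattern(shape=(256, 256), transmittance=0.25, iters=2000, seed=0):
    """Binary blue-noise-like pattern via iterative cluster/void swaps (a sketch)."""
    rng = np.random.default_rng(seed)
    p = np.zeros(shape, dtype=bool)
    idx = rng.choice(p.size, int(transmittance * p.size), replace=False)
    p.flat[idx] = True                       # start from white noise at the target density
    for _ in range(iters):
        d = gaussian_filter(p.astype(float), sigma=1.5, mode='wrap').ravel()
        ones, zeros = np.flatnonzero(p), np.flatnonzero(~p)
        tight = ones[np.argmax(d[ones])]     # '1' in the tightest cluster
        void = zeros[np.argmin(d[zeros])]    # '0' in the largest void
        p.flat[tight] = False                # swap the pair to homogenize the pattern,
        p.flat[void] = True                  # pushing energy to high spatial frequencies
    return p.astype(float)
```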

2.3 Reconstruction

Estimating $\boldsymbol{f}$ given ${{\textbf {H}}}$ and ${{\boldsymbol{g}}}$ is an ill-posed linear inverse problem that has been addressed extensively in CS [1,43]. It has been shown that sub-Nyquist sampling with reliable recovery can be achieved by imposing constraints on the sampling/sensing matrix and proper priors on the signal. The performance bound of snapshot CS has been proved in [37]. More recently, denoisers using deep neural networks have been used as the prior of natural images with certain constraints on the network training process [44]. For natural or spectral images, the prior distribution of the images is needed for a good recovery. From the statistical inference perspective, we can use the maximum a posteriori (MAP) estimate given the measurement ${{\boldsymbol{g}}}$ and the forward model (likelihood function $p_{{{\boldsymbol{g}}} \mid \boldsymbol{f}}$) to estimate the unknown signal $\boldsymbol{f}$ in Eq. (7), that is $ \hat{\boldsymbol{f}}=\arg\;\max_{\boldsymbol{f}}p(\boldsymbol{f}\mid{{\boldsymbol{g}}})=\arg\;\max_{\boldsymbol{f}}\frac{p({{\boldsymbol{g}}} \mid\boldsymbol{f}) p(\boldsymbol{f})} {p({{\boldsymbol{g}}})} = \arg\;\max_{\boldsymbol{f}}{ p({{\boldsymbol{g}}}\mid \boldsymbol{f} ) p(\boldsymbol{f})}.$ Hereby, we assume additive white Gaussian noise (AWGN) in the measurements, ${{\boldsymbol{e}}}\sim \mathcal {N}(0,\sigma ^{2}_{e})$, and the MAP estimate can be rewritten as

$$\hat{\boldsymbol{f}}= \textstyle \arg\;\max_{\boldsymbol{f}}\exp\Big\{ -\frac{1}{2\sigma^{2}_{e}} \|{{\boldsymbol{g}}}-{{\textbf{H}}} \boldsymbol{f}\|_2^{2} + \log p (\boldsymbol{f}) \Big\} = \arg\;\min_{\boldsymbol{f}} \frac{1}{2} \|{{\boldsymbol{g}}} - {{\textbf{H}}}\boldsymbol{f}\|_2^{2} - \sigma^{2}_{e} \log p(\boldsymbol{f}).$$

By replacing the unknown noise variance $\sigma ^{2}_{e}$ with a noise-balancing factor $\lambda$ and the negative log-prior $-\log p(\boldsymbol{f})$ with a regularization term $R(\boldsymbol{f})$, Eq. (8) can be written as

$$\hat{\boldsymbol{f}} = \arg\;\min\nolimits_{\boldsymbol{f}} \frac{1}{2} \|{{\boldsymbol{g}}} - {{\textbf{H}}}\boldsymbol{f}\|_2^{2} + \lambda R(\boldsymbol{f})\,.$$

Different solvers for Eq. (9) have been developed, including the iterative shrinkage/thresholding algorithm (ISTA) [45], fast ISTA (FISTA) [13], two-step iterative shrinkage/thresholding (TwIST) [46], the alternating direction method of multipliers (ADMM) [47], gradient projection for sparse reconstruction (GPSR) [12], and generalized alternating projection (GAP) [48]. In addition, priors including sparsity [9] and total variation (TV) [31] have been used to solve the snapshot compressive imaging problem. It has been recently recognized that these methods can be categorized under the plug-and-play (PnP) framework [49] by using different solvers and different denoisers [15]; the framework basically consists of a gradient descent step and a denoising step, performed iteratively [16]. Most recently, deep denoisers have also been adapted into the PnP framework to provide a balance of speed, accuracy, and flexibility [44,50].

In the following, we follow the ADMM framework to derive the main steps in PnP; these steps can be generalized to most of the other algorithms. The ADMM solution to the optimization problem in Eq. (9) can be written as

$$\boldsymbol{f}^{k+1} = \arg\;\min\nolimits_{\boldsymbol{f}} \frac{1}{2} \|{{\boldsymbol{g}}} -{{\textbf{H}}}\boldsymbol{f}\|_2^{2}+\frac{\rho}{2}\|\boldsymbol{f}-(\boldsymbol{z}^{k}-{{\boldsymbol{u}}}^{k})\|_2^{2},$$
$$\boldsymbol{z}^{k+1} = \arg\;\min\nolimits_{\boldsymbol{z}} \lambda R(\boldsymbol{z}) + \frac{\rho}{2}\|\boldsymbol{z}-(\boldsymbol{f}^{k+1}+{{\boldsymbol{u}}}^{k})\|_2^{2},$$
$${{\boldsymbol{u}}}^{k+1} = {{\boldsymbol{u}}}^{k} + (\boldsymbol{f}^{k+1} - \boldsymbol{z}^{k+1}),$$
where $\boldsymbol{z}$ is an auxiliary variable, ${{\boldsymbol{u}}}$ is the multiplier, $\rho$ is a penalty factor, and $k$ is the index of iterations. It has been derived that the PnP-ADMM solution to the SCI problem [15,31,44,50] is
$$\boldsymbol{f}^{k+1} = (\boldsymbol{z}^{k}-{{\boldsymbol{u}}}^{k}) + {{\textbf{H}}}{{}^{\top}} \big({{\boldsymbol{g}}}-{{\textbf{H}}}(\boldsymbol{z}^{k}-{{\boldsymbol{u}}}^{k})\big)\oslash(\textrm{Diag}({{\textbf{H}}}{{\textbf{H}}}{{}^{\top}})+\rho),$$
$$\boldsymbol{z}^{k+1} = \mathcal{D}_{\hat{\sigma}_k} (\boldsymbol{f}^{k+1}+{{\boldsymbol{u}}}^{k}),$$
$${{\boldsymbol{u}}}^{k+1} = {{\boldsymbol{u}}}^{k} + (\boldsymbol{f}^{k+1} - \boldsymbol{z}^{k+1}),$$
where $\textrm {Diag}(\cdot )$ extracts the diagonal elements of the matrix inside the parentheses, $\oslash$ denotes element-wise (Hadamard) division, and $\hat {\sigma }_k$ is the estimated noise standard deviation for the current ($k$-th) iteration. Here, the noise penalty factor $\rho$ is tuned to match the measurement noise. $\mathcal {D}_\sigma (\cdot )$ is a denoising operation with $\sigma$ as the estimated noise standard deviation. Note that different denoising methods can be applied in Eq. (14). For TV regularization in the noiseless case of $\rho =0$, PnP-ADMM reduces to GAP-TV [31], which gives a fast baseline. When weighted nuclear norm minimization (WNNM) is applied, PnP-ADMM is equivalent to DeSCI developed in [15], which gives state-of-the-art results. In the following experiments, we use GAP-TV and DeSCI to perform the video reconstructions. We modified the DeSCI algorithm for 4D reconstruction by replacing the non-local search over 2D images with a search over the 3D spectral image cube.
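The three PnP-ADMM steps of Eqs. (13)-(15) map to a compact loop. The sketch below reuses the Hf, Ht, and diag_HHt operators from Section 2.1 and takes the denoiser as a plug-in argument (a TV denoiser yields GAP-TV-like behavior; WNNM yields DeSCI-like behavior); the initialization and the decreasing noise schedule are our assumptions, not the authors' exact settings.

```python
import numpy as np

def pnp_admm(G, M, denoise, rho=1e-2, iters=50):
    """PnP-ADMM for the LeSTI inverse problem, Eqs. (13)-(15) (a sketch).
    `denoise(x, sigma)` is any plug-in denoiser operating on the 4D cube."""
    R = diag_HHt(M) + rho                               # Diag(H H^T) + rho, Eq. (13)
    f = Ht(M, G) / R[:, :, None, None]                  # crude initialization (assumed)
    z, u = f.copy(), np.zeros_like(f)
    for k in range(iters):
        v = z - u
        f = v + Ht(M, (G - Hf(M, v)) / R)               # Euclidean projection, Eq. (13)
        sigma_k = 1.0 / np.sqrt(k + 1)                  # assumed decreasing noise schedule
        z = denoise(f + u, sigma_k)                     # denoising step, Eq. (14)
        u = u + (f - z)                                 # multiplier update, Eq. (15)
    return z
```

Note that the division by R acts in measurement space before the back-projection Ht, exploiting the diagonal structure of ${{\textbf {H}}}{{\textbf {H}}}^{\top}$ so that no large matrix is ever formed or inverted.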

3. Hardware design

A key step in the LeSTI system is the fast switching of the LEDs and the DMD, so it is important to design a high-speed hardware control system to synchronize the devices. The experimental hardware set-up and the device synchronization mechanism are detailed below, and the calibration process is described.

3.1 System setup

We designed and built the LED illumination array shown in Fig. 2(c). Ten types of LUXEON Rebel Color LEDs are evenly placed on the illumination board with three units of each type. Each type of LED can be turned on and off independently through an LED driver by a logic signal. The switching speed of the LEDs can exceed one megahertz [35], but due to the LED driver limitation, the actual switching speed is 10 kHz in our application, in which eight types of LEDs were needed to cover the 450-670 nm spectral range. The experimental optical set-up of LeSTI is depicted in Fig. 1. The LED illuminator is placed near the scene for even illumination. The imaging lens images the spectrally modulated scene onto an intermediate plane, where a DMD (TI, DLP4500) is placed to impose spatial modulation. An objective lens images the modulated images onto the FPA. A grayscale sensor (Ximea, MQ042xG-CM) is placed on the FPA to integrate the 4D-modulated data-cube into a single 2D snapshot within $\Delta _T$. The square micro-mirrors of the DMD tilt +12° or −12° about their diagonal axes relative to the DMD panel; the $912 \times 1140$ micro-mirrors are arranged like fish scales with the diagonal hinge axes oriented vertically. In order to match the placement of the micro-mirrors, the sensor is tilted 45° about the optical axis of the objective lens. $512 \times 512$ out of the $912 \times 1140$ DMD pixels are utilized to display $256 \times 256$ spatial patterns, i.e., $2 \times 2$ DMD pixels are binned to one pattern pixel. The objective lens projects one 7.6 $\mu$m DMD pixel onto $2 \times 2$ 2.2 $\mu$m sensor pixels. Then, $1024 \times 1024$ out of the $2064 \times 3088$ sensor pixels are used to sense $256 \times 256$-resolution images.

A high-speed FPGA micro-controller unit (MCU) is adopted to synchronize all electrical devices. Eight logic signals are connected to the LED drivers to control the on-off LED switching. A rising-edge trigger signal is sent to the DMD controller to trigger the DMD to load a new pre-stored spatial pattern. A trigger signal is used to control the sensor shutter (exposure time); the shutter remains open as long as the trigger signal is held high. A rising-edge signal is used to initiate the object movement in our experiments.

3.2 Calibration

The elements of the optical imaging system are not ideal. For instance, the modulation patterns are not perfectly binary, i.e., ‘0’ for blocking the light or ‘1’ for transmitting the light. The actual patterns projected on the sensor can be affected by lens distortions, diffraction, and wavelength-dependent effects. A calibration process is thus required to measure the DMD spatial patterns under the corresponding LED illuminations. To this end, a white board is placed as the scene, so the scene $\mathbf {F}_{n_{\lambda },n_t}$ in Eq. (5) can be assumed to equal $\mathbf {1}$. The goal of calibration is to capture the spatial patterns ${\textbf {C}}$ in Eq. (4), as the LED spectral information $s_{n_l,n_\lambda }$ is already measured by the spectrometer. The calibration proceeds by i) imposing a specific pattern on the DMD with the corresponding LED illumination, ii) capturing the image of the DMD pattern on the sensor, and iii) normalizing the captured pattern. After these steps, we obtain the pattern ${\textbf {C}}_{n_l,n_t}$. By performing this $N_l N_t$ times, we obtain all the masks ${\textbf {C}}$ used in the reconstruction, and thus ${{\textbf {M}}}$ to formulate the forward matrix ${{\textbf {H}}}$. Finally, the reconstruction algorithms are employed to recover the desired 4D data-cube.
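A minimal sketch of normalization step iii) follows, assuming the $N_l N_t$ calibration captures are stacked into one array; the optional dark-frame subtraction and per-mask maximum normalization are our assumptions, not a documented procedure.

```python
import numpy as np

def calibrate_masks(captures, darks=None):
    """captures: array (N_l, N_t, Nx, Ny) of white-board sensor images, one per
    (LED, DMD pattern) pair; since F = 1, each image is the mask C up to gain.
    Returns masks C normalized to [0, 1]."""
    C = captures.astype(float)
    if darks is not None:                       # optional dark-frame subtraction (assumed)
        C = C - darks
    # Normalize each mask by its own maximum so '1' maps to full transmittance.
    C = C / C.max(axis=(2, 3), keepdims=True)
    return np.clip(C, 0.0, 1.0)
```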

4. Simulation results

In this section, we perform simulations to verify the hardware implementation and compare different system configurations and reconstruction algorithms. A 4D input data-cube is first introduced as the ground truth ${{{\textbf {F}}}}^{0}$ for the simulation of the LeSTI system. The simulation process is based on the discrete model, and the reconstruction results are presented, analyzed, and compared. To study the performance of the LeSTI system, a spectral video was acquired in our laboratory with an SP DK240 1/4 m monochromator and a $2048 \times 2048$ 8-bit grayscale sensor (BOBCAT B2021M). Specifically, 26 spectral images were captured at central wavelengths from 450 to 690 nm at 10 nm intervals by varying the monochromatic illumination. In the temporal domain, the dynamics of the scene were attained by controlled motion of a LEGO figure from left to right with a nano-positioner for 32 frames. In this way, a $256\times 256 \times 26 \times 32$ 4D data-cube ${{{\textbf {F}}}}^{0}$ is obtained, with $256\times 256$ pixels of spatial resolution, 26 spectral bands, and 32 temporal frames. There are two phases in the simulation: i) the forward-modeling phase, where the 4D input data-cube is used to generate a measurement snapshot, and ii) the reconstruction phase, where a 4D data-cube is reconstructed from the measurement and the modulation patterns by the algorithms. For the forward process, according to Eqs. (4) and (5), a measurement $\mathbf {G}$ is generated with the full $N_\lambda =26$ bands of the input data-cube. In the reconstruction process, $N_\lambda = 8$ bands can be reconstructed, limited by the number of LED types. Eight-bit grayscale measurements are used at the FPA, so we quantize the simulated measurement to 256 gray levels to simulate the limited dynamic range of the sensor.
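The quantization step might look as follows; the scaling convention (normalizing by the measurement maximum) is an assumption.

```python
import numpy as np

def quantize_8bit(G):
    """Quantize a simulated measurement to 256 gray levels, mimicking the 8-bit
    FPA used in the simulations (scaling convention is an assumption)."""
    scale = G.max()
    G8 = np.round(255.0 * G / scale).astype(np.uint8)
    return G8.astype(float) / 255.0 * scale   # back to physical scale for inversion
```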

We employ GAP-TV [31] to conduct most of the experiments that verify the proposed system. Peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) are chosen to quantitatively evaluate the reconstruction quality. In Fig. 4, the PSNR and SSIM are plotted versus the temporal compression $N_t = 2,4,6,8,10$ for different spatial patterns and different pattern transmittances of 12.5%, 25%, and 50%. Transmittance denotes the fraction of ‘1’s among all pixels of a 2D binary pattern. As can be seen, 12.5% and 25% transmittance show similar performance, and both are superior to 50% transmittance. We attribute this to the following. Recalling Eq. (4), multiple binary patterns are superimposed on each 2D image in the desired 4D data-cube. As the transmittance goes up, there is a higher chance of overlap among the different patterns, which tends to make the modulations uniform or similar to each other, thus degrading the results. By reducing the transmittance, the coding overlap is reduced and, as a result, the performance is improved. Following this, we use 25% transmittance blue noise for the rest of the reconstructions.

Selected reconstruction results for $N_t = 2$ are shown in Fig. 5, where, in addition to selected frames, we also plot the reconstructed spectra compared with the ground truth along with their correlation. As can be seen, blue noise coding outperforms random coding and provides better spatial reconstruction of details, especially for high-frequency content. We further conduct the simulation at different temporal compression rates $N_t$. It can be seen from Fig. 6 that as $N_t$ increases, the reconstruction quality degrades, as expected.

We also run three sets of simulations to test how the LeSTI system responds to fast object motion. In the first simulation, shown in Fig. 7(a), the object moves slowly with respect to the speed of the DMD and LED modulation; the spectral-temporal information is well preserved in the reconstructions. In the second simulation, shown in Fig. 7(b), the objects move fast with respect to the inter-frame interval $\delta _T$ but slower than the LED scan interval $\delta _t$; in this scenario, the spatial and spectral information is still preserved accurately. In the third simulation, shown in Fig. 7(c), the objects move very fast with respect to both intervals $\delta _t$ and $\delta _T$. In this scenario, our assumption that the moving objects remain still while the LEDs are scanning is not satisfied. This causes blurring artifacts in space and spectrum, and objects of different colors are not aligned with each other. It should be pointed out that in the proposed system the LEDs can be modulated at gigahertz rates, while the DMD can only be modulated at a few kilohertz; therefore, the speed of our system is limited by the DMD.

We now compare the results of different reconstruction algorithms at $N_t = 8$, including GPSR4D, GAP-TV4D, and the modified DeSCI4D. The results are shown in Fig. 8, where we can see that DeSCI leads to the best visual results; the reconstructed frames are clearer, particularly at the boundaries and in fine details. Quantitatively, DeSCI outperforms GAP-TV by 12.2 dB in PSNR and GPSR by 8 dB in PSNR. From the reconstructed frames, we can also observe that GPSR leads to some artifacts, while the results of GAP-TV are blurry. Interestingly, though GPSR leads to a higher PSNR than GAP-TV, its SSIM is lower, and from the spectral curves we notice that GAP-TV provides better spectra. Regarding the reconstruction complexity at $N_t=8$, GAP-TV took 206 s and GPSR took 346 s, while DeSCI took 151,380 s to reconstruct the 4D video data-cube. GAP-TV thus provides a good balance between speed and quality of reconstruction.
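For reference, the sketch below shows how such PSNR/SSIM scores can be computed over a reconstructed 4D cube with scikit-image; averaging over all bands and frames is our assumption about the evaluation protocol.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_4d(F_hat, F_true):
    """Average PSNR/SSIM over all spectral bands and temporal frames of a
    reconstructed cube F_hat against the ground truth F_true (a sketch)."""
    psnrs, ssims = [], []
    for t in range(F_true.shape[3]):
        for k in range(F_true.shape[2]):
            ref, rec = F_true[:, :, k, t], F_hat[:, :, k, t]
            rng = ref.max() - ref.min()
            psnrs.append(peak_signal_noise_ratio(ref, rec, data_range=rng))
            ssims.append(structural_similarity(ref, rec, data_range=rng))
    return float(np.mean(psnrs)), float(np.mean(ssims))
```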

Fig. 4. PSNR and SSIM plots for different system configurations. Notice that the GAP-TV reconstruction algorithm is applied for the reconstructions. Here ‘white’ refers to random binary patterns used on the DMD and ‘blue’ means blue noise patterns are used; the percentage numbers denote the pattern transmittance.

Fig. 5. Original image cube (row 1, Visualization 1) and simulation reconstruction results of $N_t = 2$ with random (white noise) coding transmittance 12.5% (row 2, Visualization 2), random coding 50% (row 3, Visualization 3), blue noise coding 12.5% (row 4, Visualization 4), and blue noise coding 50% (row 5, Visualization 5). In the first row, original data is presented with spectral images, reproduced RGB images and zoomed information. Reconstruction results with different configurations (rows 2-5) are shown with the same layout. The compressed measurements are shown at the bottom left (last row). Spectral plots of selected regions are shown in the bottom right.

Fig. 6. Simulation reconstruction results of 25% transmittance blue noise with different temporal compression, i.e. $N_t = 6$ (row 2, Visualization 6) and $N_t = 10$ (row 3, Visualization 7). Notice that the P3 interest point of the $N_t = 6$ reconstruction is on frame 4, and for the $N_t = 10$ reconstruction it is on frame 6. The coordinates of P3 on these two frames move with the LEGO figure.

Fig. 7. Simulation reconstruction with moving LEGO and static background shown in reproduced RGB images. Simulation measurement and the reconstruction of 4 frames from a snapshot when the objects have (a) small motion, (b) fast motion, and (c) very fast motion with respect to the LED scanning. Notice that the GAP-TV algorithm is applied to generate the reconstruction results.

Fig. 8. Simulation reconstruction results with different reconstruction algorithms, i.e. GPSR4D (row 2, Visualization 8), GAP-TV4D (row 3, Visualization 9), DeSCI4D (row 4, Visualization 10). The results are based on 25% transmittance blue noise with $N_t = 8$ frames. Four reproduced RGB images are selected to be shown.

5. Experimental results

The imaging system prototype described in Section 3 was built and used to continuously capture high-speed spectral scenes. We used the GAPTV4D and DeSCI4D methods to reconstruct the 4D data-cube. Based on our experimental set-up, the reconstruction frame rate can reach 200 fps with 33.3 fps measurements, i.e., $N_t=6$. Three scenes, shown in Fig. 9, are captured by our set-up. The objects are driven by a motorized positioner to implement the motion along the horizontal axis. All reconstruction results are based on 25% transmittance blue noise coding. Due to the limit of the DMD controller, only 48 patterns can be used in an experiment; thus at most $N_t = 6$ temporal frames with $N_{\lambda } = 8$ spectral bands can be compressed. Figure 10 compares the experimental reconstruction results of the GAPTV4D and DeSCI4D algorithms. We show the measurement and the results for the 8 spectral bands of the reconstructed frames, as well as the reproduced RGB images based on the CIE 1931 color space. It can be seen that DeSCI4D significantly improves the reconstruction quality.

Fig. 9. Three imaging scenes: (a) color checker, (b) Super Mario and background, and (c) natural plants.

Fig. 10. Experimental reconstruction spectral video excerpts with $N_t = 4$ using the GAPTV4D (top-right, Visualization 11) and DeSCI4D (bottom-right, Visualization 12) algorithms.

Figure 11 shows the results for different dynamic scenes; all results are based on the DeSCI4D reconstruction algorithm. The first scene we captured is a color checker, which is used to verify the spectral reconstruction performance of our system. It can be seen that both spectral and temporal frames are well reconstructed with the correct colors and fine details. We do notice that, similar to the simulation results, the reconstructed frames at $N_t = 6$ are blurrier than those at $N_t = 4$. The second scene is Super Mario, composed of cartoon logos with rich colors and shapes printed on paper, shown in the middle of Fig. 9. As with the color checker, we captured two measurements at different temporal compression rates. The reconstructed results are shown in Fig. 11 (middle), where we can see that the reconstructed frames at $N_t = 4$ are very clean; the images degrade at $N_t =6$ with some undesired artifacts.

Fig. 11. Real data reconstruction spectral video excerpts with different temporal compression ratios, i.e. Color checker with $N_t = 4$ (Visualization 12) and $N_t = 6$ (Visualization 13), Super Mario with $N_t = 4$ (Visualization 14) and $N_t = 6$ (Visualization 15), Natural plants with $N_t = 4$ (Visualization 16) and $N_t = 6$ (Visualization 17).

As the goal of our research is to capture high-speed multi-spectral scenes, we captured natural plants as our last scene, consisting of the leaves and flowers shown in the right part of Fig. 9. Again, the reconstruction results of two different measurements are shown in Fig. 11 (bottom). As with the other two scenes, the spatial, temporal, and spectral information can all be reconstructed from a single measurement. As $N_t$ increases, some fine details of the objects are lost.

6. Conclusion and discussion

In this work, we reported a new compressive spectral-temporal video imaging system based on spectrally coded illumination. The system captures coded projections of a 4D scene as 2D coded measurements. The scenes are spectrally modulated by LED illumination and spatio-temporally modulated by a DMD light modulator. A detailed mathematical model of the imaging system was presented. A comparison between conventional random noise coding and blue noise coding at different coding transmittances showed that blue noise with 25% transmittance provides the best image reconstruction results. Several regularized CS algorithms were applied for reconstruction, and their performance was compared. Real data measurements and reconstructions were reported.

Some limitations should be noted. First, our work relies on active LED illumination, which limits its application to indoor or controlled-illumination settings. Nevertheless, a number of real applications in medicine and industry are well suited to this type of imaging system; for instance, the plastics and glass recycling industries rely on spectral imaging with controlled illumination in controlled-motion environments [34]. Second, our system is based on the assumption that objects remain still while the LEDs are scanning. In real life, objects usually move continuously, while LED scanning operates at megahertz rates; color artifacts may occur in the results if the object moves at very high speed. In future work, we will focus on spectral super-resolution by intensity modulation of the LED array [27], and on using deep learning to speed up the reconstruction and improve the spectral-temporal resolution [51].

Disclaimer

The target objects presented in the experimental results were not endorsed by the trademark owners and are used here as fair use to illustrate the reconstruction quality of the LeSTI system. LEGO is a trademark of the LEGO Group, which does not sponsor, authorize, or endorse the images in this paper. All Rights Reserved. https://www.lego.com/en-us/legal/notices-and-policies/fair-play/. Super Mario is a trademark of Nintendo Co., Ltd, which does not sponsor, authorize, or endorse the images in this paper. All Rights Reserved. https://www.nintendo.co.jp/networkservice_guideline/en/.

Disclosures

X. Y., Nokia (E).

References

1. D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006). [CrossRef]  

2. R. Koller, L. Schmid, N. Matsuda, T. Niederberger, L. Spinoulas, O. Cossairt, G. Schuster, and A. K. Katsaggelos, “High spatio-temporal resolution video with compressed sensing,” Opt. Express 23(12), 15992–16007 (2015). [CrossRef]  

3. M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-pixel imaging via compressive sampling,” IEEE Signal Process. Mag. 25(2), 83–91 (2008). [CrossRef]  

4. A. Wagadarikar, R. John, R. Willett, and D. Brady, “Single disperser design for coded aperture snapshot spectral imaging,” Appl. Opt. 47(10), B44–B51 (2008). [CrossRef]  

5. G. R. Arce, D. J. Brady, L. Carin, H. Arguello, and D. S. Kittle, “Compressive coded aperture spectral imaging: An introduction,” IEEE Signal Process. Mag. 31(1), 105–115 (2014). [CrossRef]  

6. Z. Meng, J. Ma, and X. Yuan, “End-to-end low cost compressive spectral imaging with spatial-spectral self-attention,” in European Conference on Computer Vision (ECCV), (2020).

7. Z. Meng, M. Qiao, J. Ma, Z. Yu, K. Xu, and X. Yuan, “Snapshot multispectral endomicroscopy,” Opt. Lett. 45(14), 3897–3900 (2020). [CrossRef]  

8. P. Llull, X. Liao, X. Yuan, J. Yang, D. Kittle, L. Carin, G. Sapiro, and D. J. Brady, “Coded aperture compressive temporal imaging,” Opt. Express 21(9), 10526–10545 (2013). [CrossRef]  

9. X. Yuan, P. Llull, X. Liao, J. Yang, D. J. Brady, G. Sapiro, and L. Carin, “Low-cost compressive sensing for color video and depth,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2014), pp. 3318–3325.

10. E. J. Candès and M. B. Wakin, “An introduction to compressive sampling,” IEEE Signal Process. Mag. 25(2), 21–30 (2008). [CrossRef]  

11. J. Tan, Y. Ma, H. Rueda, D. Baron, and G. R. Arce, “Compressive hyperspectral imaging via approximate message passing,” IEEE J. Sel. Top. Signal Process. 10(2), 389–401 (2016). [CrossRef]  

12. M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright, “Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,” IEEE J. Sel. Top. Signal Process. 1(4), 586–597 (2007). [CrossRef]  

13. A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci. 2(1), 183–202 (2009). [CrossRef]  

14. X. Yuan, Y. Sun, and S. Pang, “Compressive video sensing with side information,” Appl. Opt. 56(10), 2697–2704 (2017). [CrossRef]  

15. Y. Liu, X. Yuan, J. Suo, D. J. Brady, and Q. Dai, “Rank minimization for snapshot compressive imaging,” IEEE Trans. Pattern Anal. Mach. Intell. 41(12), 2990–3006 (2019). [CrossRef]  

16. X. Yuan, H. Jiang, G. Huang, and P. Wilford, “SLOPE: Shrinkage of local overlapping patches estimator for lensless compressive imaging,” IEEE Sens. J. 16(22), 8091–8102 (2016). [CrossRef]  

17. X. Yuan, “Adaptive step-size iterative algorithm for sparse signal recovery,” Signal Process. 152, 273–285 (2018). [CrossRef]  

18. M. Qiao, Z. Meng, J. Ma, and X. Yuan, “Deep learning for video compressive sensing,” APL Photonics 5(3), 030801 (2020). [CrossRef]  

19. M. Qiao, X. Liu, and X. Yuan, “Snapshot spatial–temporal compressive imaging,” Opt. Lett. 45(7), 1659–1662 (2020). [CrossRef]  

20. Z. Cheng, B. Chen, G. Liu, H. Zhang, R. Lu, Z. Wang, and X. Yuan, “Memory-efficient network for large-scale video compressive sensing,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021).

21. Z. Wang, H. Zhang, Z. Cheng, B. Chen, and X. Yuan, “Metasci: Scalable and adaptive reconstruction for video compressive sensing,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021).

22. T. H. Tsai, P. Llull, X. Yuan, L. Carin, and D. J. Brady, “Spectral-temporal compressive imaging,” Opt. Lett. 40(17), 4054–4057 (2015). [CrossRef]  

23. X. Yuan, D. J. Brady, and A. K. Katsaggelos, “Snapshot compressive imaging: Theory, algorithms, and applications,” IEEE Signal Process. Mag. 38(2), 65–88 (2021). [CrossRef]  

24. M. Parmar, S. Lansel, and J. Farrell, “An LED-based lighting system for acquiring multispectral scenes,” Digit. Photogr. VIII 8299, 82990P (2012). [CrossRef]  

25. J.-I. Park, M.-H. Lee, M. D. Grossberg, and S. K. Nayar, “Multispectral imaging using multiplexed illumination,” in 2007 IEEE 11th International Conference on Computer Vision, (2007), pp. 1–8.

26. J. H. G. M. Klaessens, M. Nelisse, R. M. Verdaasdonk, and H. J. Noordmans, “Non-contact tissue perfusion and oxygenation imaging using an LED-based multispectral and a thermal imaging system, first results of clinical intervention studies,” Adv. Biomed. Clin. Diagn. Syst. XI 8572, 857207 (2013). [CrossRef]  

27. J. Tschannerl, J. Ren, H. Zhao, F.-J. Kao, S. Marshall, and P. Yuen, “Hyperspectral image reconstruction using multi-colour and time-multiplexed LED illumination,” Opt. Lasers Eng. 121, 352–357 (2019). [CrossRef]  

28. C. Fu, M. L. Don, and G. R. Arce, “Compressive spectral imaging via polar coded aperture,” IEEE Transactions on Computational Imaging (2017).

29. X. Lin, Y. Liu, J. Wu, and Q. Dai, “Spatial-spectral encoded compressive hyperspectral imaging,” ACM Trans. Graph. 33(6), 1–11 (2014). [CrossRef]  

30. X. Yuan, T.-H. Tsai, R. Zhu, P. Llull, D. Brady, and L. Carin, “Compressive hyperspectral imaging with side information,” IEEE J. Sel. Top. Signal Process. 9(6), 964–976 (2015). [CrossRef]  

31. X. Yuan, “Generalized alternating projection based total variation minimization for compressive sensing,” in 2016 IEEE International Conference on Image Processing (ICIP), (2016), pp. 2539–2543.

32. J. B. Rodriguez, G. R. Arce, and D. L. Lau, “Blue-noise multitone dithering,” IEEE Trans. Image Process. 17(8), 1368–1382 (2008). [CrossRef]  

33. C. M. U. Neale and B. G. Crowther, “An airborne multispectral video/radiometer remote sensing system: Development and calibration,” Remote. Sens. Environ. 49(3), 187–194 (1994). [CrossRef]  

34. M. F. Carlsohn, B. H. Menze, B. M. Kelm, F. A. Hamprecht, A. Kercek, R. Leitner, and G. Polder, “Spectral imaging and applications,” in Color image processing: methods and applications, R. Lukac and K. N. Plataniotis, eds. (CRC Press, 2006), chap. 17.

35. D. Karunatilaka, F. Zafar, V. Kalavally, and R. Parthiban, “LED-based indoor visible light communications: State of the art,” IEEE Commun. Surv. Tutorials 17(3), 1649–1678 (2015). [CrossRef]  

36. LUMILEDS, “LUXEON Rebel Color LEDs,” https://www.lumileds.com/products/color-leds/luxeon-rebel-color.

37. S. Jalali and X. Yuan, “Snapshot compressed sensing: Performance bounds and algorithms,” IEEE Trans. Inf. Theory 65(12), 8005–8024 (2019). [CrossRef]  

38. C. V. Correa, H. Arguello, and G. R. Arce, “Spatiotemporal blue noise coded aperture design for multi-shot compressive spectral imaging,” J. Opt. Soc. Am. A 33(12), 2312–2322 (2016). [CrossRef]  

39. D. L. Lau and G. R. Arce, Modern digital halftoning (CRC Press, 2008).

40. D. L. Lau, G. R. Arce, and N. C. Gallagher, “Green-noise digital halftoning,” Proc. IEEE 86(12), 2424–2444 (1998). [CrossRef]  

41. D. L. Lau, R. Ulichney, and G. R. Arce, “Blue and green noise halftoning models,” IEEE Signal Process. Mag. 20(4), 28–38 (2003). [CrossRef]  

42. C. Fu, H. Arguello, B. M. Sadler, and G. R. Arce, “Compressive spectral polarization imaging by pixelized polarizer and colored patterned detector,” J. Opt. Soc. Am. A 32(11), 2178–2188 (2015). [CrossRef]  

43. E. J. Candes, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory 52(2), 489–509 (2006). [CrossRef]  

44. X. Yuan, Y. Liu, J. Suo, and Q. Dai, “Plug-and-play algorithms for large-scale snapshot compressive imaging,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2020), pp. 1447–1457.

45. I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Comm. Pure Appl. Math. 57(11), 1413–1457 (2004). [CrossRef]  

46. J. Bioucas-Dias and M. Figueiredo, “A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Trans. Image Process. 16(12), 2992–3004 (2007). [CrossRef]  

47. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations Trends Mach. Learn. 3(1), 1–122 (2010). [CrossRef]  

48. X. Liao, H. Li, and L. Carin, “Generalized alternating projection for weighted-ℓ2,1 minimization with applications to model-based compressive sensing,” SIAM J. Imaging Sci. 7(2), 797–823 (2014). [CrossRef]  

49. S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in 2013 IEEE Global Conference on Signal and Information Processing, (2013), pp. 945–948.

50. S. Zheng, Y. Liu, Z. Meng, M. Qiao, Z. Tong, X. Yang, S. Han, and X. Yuan, “Deep plug-and-play priors for spectral snapshot compressive imaging,” Photonics Res. 9(2), B18–B29 (2021). [CrossRef]  

51. D. J. Brady, L. Fang, and Z. Ma, “Deep learning for camera data acquisition, control, and image estimation,” Adv. Opt. Photonics 12(4), 787–846 (2020). [CrossRef]  

Supplementary Material (17)

NameDescription
Visualization 1       Video of the simulation input 4D data-cube with size 256 by 256 by 8 by 32.
Visualization 2       Video of the simulation reconstruction result with 12.5% transmittance random coding.
Visualization 3       Video of the simulation reconstruction result with 50% transmittance random coding.
Visualization 4       Video of the simulation reconstruction result with 12.5% transmittance blue noise coding.
Visualization 5       Video of the simulation reconstruction result with 50% transmittance blue noise coding.
Visualization 6       Video of the simulation reconstruction result with 6 temporal frames compression.
Visualization 7       Video of the simulation reconstruction result with 10 temporal frames compression.
Visualization 8       Video of the simulation reconstruction result with GPSR4D algorithm.
Visualization 9       Video of the simulation reconstruction result with GAPTV4D algorithm.
Visualization 10       Video of the simulation reconstruction result with DeSCI4D algorithm.
Visualization 11       Video of the experimental reconstruction result of natural plants with the GAPTV4D algorithm.
Visualization 12       Video of the experimental reconstruction result of color checker with 4 temporal frames compression and the DeSCI4D algorithm.
Visualization 13       Video of the experimental reconstruction result of color checker with 6 temporal frames compression.
Visualization 14       Video of the experimental reconstruction result of Super Mario with 4 temporal frames compression.
Visualization 15       Video of the experimental reconstruction result of Super Mario with 6 temporal frames compression.
Visualization 16       Video of the experimental reconstruction result of natural plants with 4 temporal frames compression.
Visualization 17       Video of the experimental reconstruction result of natural plants with 6 temporal frames compression.
