
Visible and NIR microscopic hyperspectrum reconstruction from RGB images with deep convolutional neural networks


Abstract

We investigate microscopic hyperspectral reconstruction from RGB images with a deep convolutional neural network (DCNN). Based on a microscopic hyperspectral imaging system, a homemade dataset consisting of microscopic hyperspectral and RGB image pairs is constructed. Considering the importance of the spectral correlation between neighboring spectral bands in microscopic hyperspectrum reconstruction, the 2D convolutions in the DCNN framework are replaced by 3D convolutions, and a metric (weight factor) that evaluates the quality of the reconstructed hyperspectrum is introduced into the loss function used during training. The effects of the convolution kernel dimension and of the weight factor in the loss function on the performance of the reconstruction model are studied. The overall results indicate that our model outperforms traditional DCNN-based hyperspectral reconstruction models on both the public dataset and the homemade microscopic dataset. In addition, we explore microscopic hyperspectrum reconstruction from RGB images in the infrared region, and the results show that the proposed model has great potential to expand the reconstructed wavelength range from the visible to the near-infrared bands.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

As one of the non-invasive and label-free spectral microscopy techniques, microscopic hyperspectral imaging, which combines the advantages of microscopy and hyperspectral imaging, can simultaneously provide spatial and spectral information about samples [1–3]. More importantly, compared with conventional hyperspectral imaging, microscopic hyperspectral imaging techniques have high signal sensitivity and spatial resolutions that can reach the micrometer or even nanometer level, which enables potential applications including disease diagnosis [4,5], nanomaterial characterization [6], microorganism detection [7–9], and microscopic contaminant analysis [10]. For example, Verebes et al. identified different morphological characteristics of blood cells using visible/near-infrared hyperspectral microscopy [11]. With the same technique, Xu et al. classified different freshness levels of pork by collecting microscopic images and spectral information of pork tissue [12]. However, similar to conventional hyperspectral imaging, the acquisition of high-quality microscopic hyperspectral images (MHSIs) requires capturing a three-dimensional signal with a two-dimensional sensor, and the acquisition time of a single image can range from tens of minutes to hours, depending on the desired spatial/spectral resolutions and the image size. This severely limits the application of microscopic hyperspectral imaging systems to real-time imaging and to transient biological, chemical, and physical processes. To overcome these difficulties and enable acquisition in dynamic scenes, Elizabeth A. Holman et al. developed a grid-less autonomous adaptive sampling (AAS) method to replace the conventional uniform grid sampling in scanning microscopic hyperspectral imaging systems, and the results showed that this method can substantially decrease the image acquisition time by increasing the sampling density in regions of steeper physicochemical gradients [13]. Still, the whole microscopic hyperspectral imaging system and the AAS algorithm are highly complex.

In recent years, the reconstruction of hyperspectral images (HSIs) from RGB images, as one of the computational spectral imaging technologies, has proved to be an alternative solution to the trade-off between temporal, spatial, and spectral resolution in real-time HSI acquisition [14]. Existing HSI recovery methods mainly apply sophisticated schemes to project the 3D data cube onto a 2D detector and then reconstruct the 3D HSIs with various algorithms. For example, the coded aperture snapshot spectral imaging (CASSI) system compresses snapshot information along the spectral dimension into a single 2D measurement [15], which is flexible in design and provides prior knowledge for the later reconstruction. Recently, based on CASSI, a large number of reconstruction algorithms and upgraded settings, such as low rank [16], Gaussian mixture models [17], deep learning [18,19], and multi-frame CASSI [20], have been proposed to recover the 3D HSI cube from the 2D measurement. However, the decoding process of CASSI, unlike the hardware-based encoding, requires additional computation by specially designed algorithms, which makes the HSI reconstruction computationally expensive and time consuming. Furthermore, the degradation that occurs when fewer measurements are used also limits the application of CASSI in resource-constrained environments.

Inspired by the remarkable success of deep convolutional neural networks (DCNNs) in many computer vision tasks [21–24], a series of DCNN-based HSI reconstruction models have demonstrated impressive abilities in solving this highly ill-posed recovery problem [25–30]. However, all these models have to be trained with perfectly matched RGB and hyperspectral images [31]. Unfortunately, it is difficult to collect a large number of such paired data without specially designed devices, e.g., well-calibrated dual cameras [32,33]. Although one can inversely synthesize RGB images from available HSIs to form paired training data [34,35], the huge gap between synthetic and real RGB images may degrade the performance of the trained model when it is applied to recover HSIs from real RGB images.

The studies on HSI reconstruction mentioned above mainly concentrate on conventional hyperspectral imaging systems, which are typically applied in macroscopic scenarios, and there are few reports on HSI reconstruction in the microscopic domain. Inspired by these pioneering works, we explore the reconstruction of MHSIs and microscopic hyperspectra (MHS) from microscopic RGB (MRGB) images based on DCNNs with a homemade MHSI system. With this system, perfectly matched RGB images captured with a digital CCD camera and MHSIs captured with an HSI camera can be collected conveniently via a beam path switching device, and all the matched RGB and MHSI pairs from various samples are then used as the dataset for training the DCNN. DCNN models with 2D and 3D convolutional kernels are adopted to reconstruct the MHSIs and MHS from the RGB images, and the reconstruction performances of both kinds of DCNNs are evaluated and compared. Furthermore, considering the wide application of HSI detection in the near-infrared wavelength range, we explore the possibility of recovering MHSIs from RGB images in the near-infrared region. The proposed method can potentially facilitate the application of MHSI systems to investigating fast-changing phenomena in the microscopic field.

2. Experimental details and methods

2.1 Collection of MRGB and MHSI pairs with MHSI system

The dataset of matched MHSI and MRGB image pairs is acquired with a homemade MHSI system based on an inverted fluorescence microscope (Nikon, Eclipse Ti-U), as shown in Fig. 1. A 10× objective lens collects the reflected light from the sample, which can then be captured by the digital RGB camera (Nikon, DS-Fi3) or the HSI camera (Dualix, GaiaField-Pro-V10) by steering the beam path selection device. The detection wavelength range of the HSI camera is 400-1000 nm with a wavelength resolution of 3.2 nm. A push-broom data collection method is used to capture the reflected MHSI, and it takes about 30 seconds to capture one MHSI.


Fig. 1. Structural schematic diagram of the MHSI system based on an inverted fluorescence microscope.


In order to save computational cost, a total of 36 MHSI bands are extracted at an interval of 10 nm in the wavelength range of 450-800 nm. The image resolutions of the HSI and RGB cameras are 991 × 960 and 2880 × 2048, respectively. The reflectance images collected by the RGB and HSI cameras are from 28 kinds of samples, which include a color chart, the skins of different kinds of fruits such as apples, grapes, and cherry tomatoes, and some natural plants including flowers and leaves with different colors. Some representative samples are shown in Fig. 2, and further details of the samples can be found in Fig. S1 of Supplement 1. A total of 315 MRGB and MHSI pairs have been obtained. All these images are divided into small patches with a spatial size of 64 × 64 to form the sample set, which is then randomly split into training and testing sets at a ratio of 8:2, as sketched below.
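As a concrete illustration, the following Python sketch shows how a registered MRGB/MHSI pair could be cut into 64 × 64 patches and split 8:2. The array shapes and function names are our own illustrative choices, not the authors' published code.

```python
# Minimal sketch of the patch extraction and 8:2 split described above.
import numpy as np

PATCH = 64  # spatial patch size used in the paper

def extract_patches(rgb, hsi, patch=PATCH):
    """Cut a registered MRGB (H, W, 3) / MHSI (H, W, 36) pair into
    non-overlapping patch pairs."""
    h, w = hsi.shape[:2]
    pairs = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            pairs.append((rgb[y:y + patch, x:x + patch, :],   # (64, 64, 3)
                          hsi[y:y + patch, x:x + patch, :]))  # (64, 64, 36)
    return pairs

def split_dataset(pairs, train_ratio=0.8, seed=0):
    """Randomly split the patch pairs into training and testing sets (8:2)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(pairs))
    cut = int(train_ratio * len(pairs))
    return [pairs[i] for i in idx[:cut]], [pairs[i] for i in idx[cut:]]
```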


Fig. 2. Representative experimental samples and the corresponding MRGB images of the regions of interest (ROIs). (a) Color chart, (b) the skin of a grape, (c) the leaf of Ficus microcarpa, and (d) the flower of Ruellia brittoniana. The squares in the figure mark the ROIs.


2.2 Image calibration and pixel registration

In order to extract the reflectance spectrum, reduce the influence of environmental factors, and improve the signal-to-noise ratio (SNR), the MRGB images and MHSIs are calibrated by black and white correction using Eq. (1):

$$R = ({R_o} - {R_d})/({R_w} - {R_d})$$
where $R$ is the corrected MHSI or MRGB image expressed as relative reflectance, $R_o$ is the originally acquired MHSI or MRGB image, $R_d$ is the dark image (approximately 0% reflectance), and $R_w$ is the white reference image obtained with a white calibration board (approximately 100% reflectance). After image calibration, ENVI 4.2 software (Research Systems, Inc., Boulder, CO) is used to extract the microscopic reflectance hyperspectral data.
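A minimal implementation of Eq. (1), assuming the raw, dark, and white frames are float arrays of the same shape; the small epsilon guard is our addition:

```python
import numpy as np

def black_white_correct(raw, dark, white, eps=1e-6):
    """Eq. (1): relative reflectance from raw, dark, and white reference frames."""
    raw, dark, white = (np.asarray(a, dtype=np.float32) for a in (raw, dark, white))
    return (raw - dark) / (white - dark + eps)  # eps avoids division by zero
```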

Although both cameras can capture images from the same field of view via the beam path selection knob, their receptive fields and image resolutions are completely different, so the original images captured by the HSI and RGB cameras have different geometric sizes and resolutions. To solve this problem, we first select the same ROIs from the original MHSI and MRGB images to ensure that the image pairs used for training come from the same field of view. Then, the resolution of the MRGB image is adjusted to that of the MHSI using the built-in image processing software of the RGB camera. Finally, a method based on perspective projection [36,37] is used to realize the pixel registration between the MHSI and MRGB images, as sketched below.
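The sketch below illustrates the idea of perspective-projection registration using OpenCV. The point correspondences and the function name are assumptions for illustration, not the authors' actual pipeline.

```python
import cv2
import numpy as np

def register_rgb_to_hsi(rgb, src_pts, dst_pts, hsi_shape):
    """Warp the resized MRGB image onto the MHSI pixel grid using a
    perspective (homography) transform estimated from point matches.
    `src_pts`/`dst_pts` are (N, 2) arrays of corresponding points, N >= 4."""
    H, _ = cv2.findHomography(np.float32(src_pts), np.float32(dst_pts),
                              cv2.RANSAC, 5.0)  # robust to a few bad matches
    h, w = hsi_shape[:2]
    return cv2.warpPerspective(rgb, H, (w, h))
```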

2.3 DCNN architecture and evaluation metrics

In the DCNN framework, a dense structure is adopted as the backbone of our model (as shown in Fig. 3), which learns an end-to-end mapping from the pairs of MRGB images and MHSIs [38]. We first employ a single convolutional layer to extract shallow features from the RGB images, and then stack two dense blocks to form a deep network for deep feature extraction. The dense blocks can effectively alleviate the problems of gradient vanishing and explosion in deep networks. The dense connection enables the $i$-th layer to receive the features of all preceding layers from $l_0$ to $l_{i-1}$, which can be expressed as Eq. (2):

$${l_i} = {f_i}([{l_0},{l_1}, \cdots ,{l_{i - 1}}])$$
where ${f_i}(\cdot)$ denotes the $i$-th convolutional layer in the dense block, and $[{l_0},{l_1}, \cdots ,{l_{i - 1}}]$ denotes the concatenation of the feature outputs of the preceding layers. Finally, a single 3D convolutional layer fuses and maps the shallow and deep features, from which the final MHSI is reconstructed. Additionally, to minimize the loss of features, no pooling layers are used in the proposed model. Instead, batch normalization (BN) layers are inserted after each convolutional layer. The rectified linear unit (ReLU) [39] is used as the activation function to introduce non-linearity and accelerate convergence. It should be mentioned that, when optimizing the DCNN, we find that removing the BN layer after the output convolutional layer and choosing a sigmoid function rather than ReLU as its activation significantly improves the network's ability to learn the correlations between the MRGB and MHSI image pairs. Therefore, in the final optimized model, the BN layer behind the output convolutional layer is removed and the ReLU activation is replaced by a sigmoid function. Additionally, a global average pooling layer is appended after the reconstructed MHSI output block to directly extract the reconstructed MHS.
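The following Keras sketch captures the structure described above: a shallow feature-extraction convolution, two 3D-convolutional dense blocks, a sigmoid output convolution without BN, and a global-average-pooling branch for the MHS. The filter counts, the number of layers per dense block, and the way the spectral axis is arranged for the 3D convolutions are our assumptions; the paper does not publish these hyperparameters.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    """Conv3D -> BatchNorm -> ReLU (BN after every conv, no pooling)."""
    x = layers.Conv3D(filters, (3, 3, 3), padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def dense_block(x, filters, n_layers=3):
    """Layer i receives the concatenation of features from layers 0..i-1 (Eq. (2))."""
    feats = [x]
    for _ in range(n_layers):
        inp = layers.Concatenate(axis=-1)(feats) if len(feats) > 1 else feats[0]
        feats.append(conv_block(inp, filters))
    return layers.Concatenate(axis=-1)(feats)

def build_model(patch=64, bands=36, filters=8):
    inp = tf.keras.Input(shape=(patch, patch, 3))                        # MRGB patch
    x = layers.Conv2D(bands, 3, padding="same", activation="relu")(inp)  # shallow features
    x = layers.Reshape((patch, patch, bands, 1))(x)   # spectral axis for 3D convolutions
    x = dense_block(x, filters)
    x = dense_block(x, filters)
    # Output conv: no BN here, sigmoid activation (the optimization noted above).
    x = layers.Conv3D(1, (3, 3, 3), padding="same", activation="sigmoid")(x)
    mhsi = layers.Reshape((patch, patch, bands))(x)   # reconstructed MHSI cube
    mhs = layers.GlobalAveragePooling2D(name="mhs")(mhsi)  # spatially averaged MHS curve
    return Model(inp, [mhsi, mhs])
```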


Fig. 3. Proposed 3D DCNN architecture used to reconstruct the MHSI and MHS from MRGB images (a), and the frameworks of the convolutional (b) and dense (c) blocks. Different blocks are presented in different colors.


The mean square error (MSE), which has been widely applied to hyperspectral reconstruction tasks [40], is used as the loss function during training; however, the loss from the reconstructed hyperspectral curves is usually neglected. Hence, in this paper, the loss from the reconstructed MHS is introduced into the loss function, and the loss function $L$ used during training can be expressed as Eq. (3):

$$L = \lambda {L_I} + ({1 - \lambda } ){L_S}$$
where $\lambda$ is the weight factor, and ${L_I}$ and ${L_S}$ denote the losses from the reconstructed MHSI and MHS, respectively; both are calculated as the mean square error (MSE) of Eq. (4):
$$MSE = \frac{1}{n}\sum_{i = 1}^{n} {\left( {x_{gt}^{(i)} - x_{re}^{(i)}} \right)^2}$$
where $x_{re}^{(i)}$ and $x_{gt}^{(i)}$ represent the reconstructed and ground truth intensities of the $i$-th pixel of the MHSI or the $i$-th channel of the MHS, and $n$ is the number of pixels or channels.
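A minimal TensorFlow version of this combined loss, assuming both terms are plain MSEs as in Eq. (4):

```python
import tensorflow as tf

def combined_loss(lam=0.9):
    """Eq. (3): L = lam * L_I + (1 - lam) * L_S, with both terms the MSE of Eq. (4)."""
    mse = tf.keras.losses.MeanSquaredError()
    def loss(mhsi_gt, mhsi_re, mhs_gt, mhs_re):
        return lam * mse(mhsi_gt, mhsi_re) + (1.0 - lam) * mse(mhs_gt, mhs_re)
    return loss
```

With a two-output model like the one sketched above, the same weighting can equivalently be obtained by compiling with `loss=["mse", "mse"]` and `loss_weights=[lam, 1 - lam]`.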

The performance of the proposed model is evaluated by the root mean square error (RMSE), the structural similarity index (SSIM), and the peak signal-to-noise ratio (PSNR). The RMSE and PSNR measure the spatial fidelity of the reconstruction results, and the SSIM measures the spatial structure similarity between two images. As for the spectral similarity, the RMSE between the reconstructed and ground truth spectral curves (L-RMSE) is used to evaluate the performance of our model. Generally, a larger PSNR or SSIM and a smaller RMSE (or L-RMSE) indicate better performance of the reconstruction model. An illustrative implementation of these metrics is given below.
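As an illustration, these metrics could be computed as follows. We assume reflectance values in [0, 1], inputs shaped (N, H, W, bands) as float32 arrays, and define L-RMSE as the RMSE between the spatially averaged spectra, matching how the global-average-pooling branch extracts the MHS; the exact definitions used by the authors are not published.

```python
import numpy as np
import tensorflow as tf

def evaluate(gt, re, max_val=1.0):
    """RMSE/PSNR/SSIM on the image cubes and L-RMSE on the averaged spectra."""
    rmse = float(np.sqrt(np.mean((gt - re) ** 2)))
    psnr = float(tf.reduce_mean(tf.image.psnr(gt, re, max_val)))
    ssim = float(tf.reduce_mean(tf.image.ssim(gt, re, max_val)))
    # Spatially averaged spectra: (N, bands) -> RMSE between the curves.
    l_rmse = float(np.sqrt(np.mean((gt.mean(axis=(1, 2)) - re.mean(axis=(1, 2))) ** 2)))
    return {"RMSE": rmse, "PSNR": psnr, "SSIM": ssim, "L-RMSE": l_rmse}
```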

3. Experiment results and discussion

3.1 Model testing

In order to verify the effectiveness of the proposed DCNN model, we first conducted a simulation on the benchmark dataset CAVE, which is widely used as a public dataset for reconstructing HSIs from RGB images [41]. The CAVE dataset contains 32 pairs of HSIs and corresponding RGB images. Each HSI contains 31 spectral bands with a spatial resolution of 512 × 512. We compare our model with three typical HSI reconstruction models: HSCNN+ [29], AWAN [42], and HDNet [43]. HDNet is a spectral-encoding imaging reconstruction method, while HSCNN+ and AWAN are recently proposed DCNN-based HSI reconstruction models.

We implement these models in TensorFlow. The Adam optimizer [44] is used during training, with optimizer parameters ${\beta _1} = 0.9$, ${\beta _2} = 0.999$, and $\varepsilon = 10^{-8}$. The learning rate is set to 8.5 × 10−4. We train the model for 200 epochs with a batch size of 400. As mentioned in Section 2, the images in the CAVE dataset are first divided into patches with a spatial size of 64 × 64, and all these patches are fed into the networks. The sample set is divided into training and testing sets at a ratio of 8:2. We use early stopping to improve training efficiency and prevent over-fitting: if the loss does not decrease below a predefined threshold within 10 epochs, training stops automatically. Training takes about 8 hours on an RTX A4000 GPU.
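A sketch of this training configuration with the model from Section 2.3. The dataset tensors (`x_train`, `y_mhsi_train`, `y_mhs_train`), the validation split, and the early-stopping `min_delta` are placeholders, since the paper only states a patience of 10 epochs and a predefined threshold.

```python
import tensorflow as tf

model = build_model()  # from the architecture sketch above
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=8.5e-4,
                                       beta_1=0.9, beta_2=0.999, epsilon=1e-8),
    loss=["mse", "mse"],       # L_I on the MHSI output, L_S on the MHS output
    loss_weights=[0.9, 0.1],   # lambda = 0.9 (Eq. (3))
)
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10,
    min_delta=1e-4,            # assumed value of the "predefined threshold"
    restore_best_weights=True)
# x_train: (N, 64, 64, 3) MRGB patches; y_* are the matching MHSI cubes / MHS curves.
model.fit(x_train, [y_mhsi_train, y_mhs_train],
          validation_split=0.1, epochs=200, batch_size=400, callbacks=[early_stop])
```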

Under the same experimental settings, we evaluate the performance of all the methods mentioned above on the public CAVE dataset. The numerical results are shown in the left panel of Table 1. Our model performs much better than the three reported reconstruction models on the CAVE dataset while using the fewest training parameters (0.79 M). For example, our model achieves the smallest RMSE of 0.0224 compared with HSCNN+, HDNet, and AWAN. The SSIM of our model (93.09) is lower than that of AWAN (94.57) but much higher than those of HSCNN+ (81.84) and HDNet (88.47). In addition, our model obtains the highest PSNR (36.2) among all four models. To further support these conclusions, we also present the reconstructed hyperspectra and HSI error maps in Fig. 4. Comparing the HSI error maps from HSCNN+ (Fig. 4(b)), HDNet (Fig. 4(c)), AWAN (Fig. 4(d)), and our model (Fig. 4(e)) shows that the model proposed in this paper not only recovers more details of the HSIs but also induces smaller reconstruction errors than the three previously reported models, which further demonstrates that our model achieves the highest-fidelity reconstruction. On the other hand, comparing the reconstructed reflectance spectra shown in Fig. 4(f), the hyperspectrum reconstructed by our model clearly deviates less from the ground truth than those of HSCNN+, HDNet, and AWAN.


Fig. 4. Reconstructed HSIs from RGB images with different HSI reconstruction models on the public dataset (CAVE). The example RGB image and the 10th-band HSI (a); the reconstructed HSI and the error map (RMSE) with HSCNN+ (b), HDNet (c), AWAN (d), and our model (e). Reconstructed hyperspectral curves from RGB images with the different reconstruction models (f).



Table 1. Numerical results of different models on CAVE and Microscopic data. The best results are in bold.

3.2 Reconstruction performance of the DCNN model with microscopic dataset

Now let us investigate the reconstruction performance of our model on the microscopic dataset. The detailed numerical results are presented in the right part of Table 1. For comparison, the performances of HSCNN+, HDNet, and AWAN on the microscopic dataset are also shown in the same table. Similar to the CAVE case, our model achieves the best reconstruction performance among the four models. Meanwhile, comparing against the evaluation metrics obtained on the CAVE dataset, the reconstruction capabilities of all models degrade to some degree on the MHSI dataset. In particular, for the metric used to evaluate the reconstructed hyperspectrum, the L-RMSE values of all models are obviously higher than those obtained on the CAVE dataset. These results indicate that some pixel mismatch remains in the MHSI-RGB image pairs even though we have adopted the perspective projection method to align them. Fortunately, the errors induced by this pixel mismatch hardly affect the MHS reconstruction for the ROI. To confirm this, we present the reconstructed MHS curves from the MRGB images (different ROIs) of a color chart in Fig. 5. The reconstructed MHS curves are extremely close to the ground truth curves over the considered wavelength range. More importantly, our model reconstructs well at the peak and valley wavelengths, which are often used as feature wavelengths to characterize the optical properties of different samples in practical applications of MHSI technology. Besides, our model takes 215.2 ms to reconstruct an MHSI with a size of 576 × 576 × 3 on an RTX A4000 GPU, whereas HSCNN+, HDNet, and AWAN need 453.8 ms, 406.1 ms, and 424.1 ms, respectively, for the same MHSI on the same GPU. This shows that our model can reconstruct the MHSI from the MRGB image at high speed.
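A simple way to reproduce such a latency measurement; the full-frame model variant, the random stand-in input, and the warm-up policy are our assumptions:

```python
import time
import numpy as np

model = build_model(patch=576)  # full-frame variant of the architecture sketch above
rgb = np.random.rand(1, 576, 576, 3).astype(np.float32)  # stand-in for a real MRGB frame
model(rgb)                      # warm-up: builds the graph and initializes the GPU
t0 = time.perf_counter()
mhsi, mhs = model(rgb)
print(f"reconstruction latency: {(time.perf_counter() - t0) * 1e3:.1f} ms")
```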


Fig. 5. Reconstructed MHS of 4 ROIs with different colors in the printed color chart. The insets show the ROIs with different colors. The reconstructed MHS (MHSRe) and the ground truth MHS (MHSGT) are represented by the black and red dotted lines, respectively.


In previous reports on HSI reconstruction based on deep-learning algorithms, the MSE is used as the loss function, and the similarity between the reconstructed hyperspectrum and the ground truth is usually neglected during training. Hence, in this part, we investigate the effect of the weight factor $\lambda$ in the loss function on the reconstruction performance of our model. The quantitative reconstruction results for different values of $\lambda$ are presented in Table 2. The performance of the proposed model improves to some degree when the loss of the reconstructed hyperspectrum is introduced into the loss function: the RMSE and L-RMSE decrease from 0.0541 and 0.0178 to 0.0533 and 0.0171, respectively, when $\lambda$ is reduced from 1 to 0.9. However, the performance degrades when the weight of the spectral reconstruction loss is increased further.
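The sweep behind Table 2 could be scripted along these lines; `train_and_evaluate` is a hypothetical wrapper around the compile/fit/evaluate steps shown earlier, and the list of $\lambda$ values beyond 1 and 0.9 is illustrative.

```python
# Hypothetical lambda sweep: retrain the model for each weight factor and
# compare the image error (RMSE) against the spectral error (L-RMSE).
for lam in (1.0, 0.9, 0.8, 0.7):
    metrics = train_and_evaluate(lam=lam)  # assumed helper: compile, fit, evaluate
    print(f"lambda={lam:.1f}  RMSE={metrics['RMSE']:.4f}  L-RMSE={metrics['L-RMSE']:.4f}")
```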


Table 2. Numerical results of our model on Microscopic dataset by tuning the value of λ.

Besides, most conventional DCNN-based reconstruction models adopt 2D convolution kernels to extract spatial features, and such 2D convolutions, applied to the band images separately, cannot exploit the spectral information encoded in contiguous bands, resulting in spectral distortion. Therefore, in this study, instead of 2D convolution kernels, 3D convolution kernels are applied to convolve the spatial and spectral dimensions simultaneously and capture joint spatial-spectral features. The numerical results are presented in Table 3, together with the results of the model with 2D convolution kernels of different sizes for comparison. The reconstruction model with the 3D convolution kernel produces more accurate results in both MHSI and MHS reconstruction than the models based on 2D convolution kernels. Notably, introducing the spectral dimension into the convolution improves the MHS reconstruction much more than the MHSI reconstruction, as reflected by the L-RMSE in Table 3, which decreases from 0.0204 (2D convolution kernel: 3 × 3) to 0.0171 (3D convolution kernel: 3 × 3 × 3). The kernel-shape comparison sketched below illustrates this difference.
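The difference is visible directly in the kernel shapes: a 2D kernel mixes all bands in a single step, while a 3D kernel also slides along the spectral axis, so each output value only mixes neighboring bands and local spectral correlation is preserved. A minimal comparison (the feature shapes and filter counts are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

feat2d = tf.zeros((1, 64, 64, 36))     # (batch, H, W, bands): bands as channels
feat3d = tf.zeros((1, 64, 64, 36, 1))  # (batch, H, W, bands, 1): bands as a depth axis

# 2D conv: each 3x3 kernel mixes all 36 bands at once; no locality along the spectrum.
y2d = layers.Conv2D(8, (3, 3), padding="same")(feat2d)     # -> (1, 64, 64, 8)

# 3D conv: the 3x3x3 kernel slides along the spectral axis too, so each output
# value only mixes neighboring bands.
y3d = layers.Conv3D(8, (3, 3, 3), padding="same")(feat3d)  # -> (1, 64, 64, 36, 8)
print(y2d.shape, y3d.shape)
```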


Table 3. Numerical results of our model on Microscopic dataset with 2D and 3D convolutional kernels.

3.3 Reconstruction performance of the DCNN model in NIR region

The MHSI and MHS reconstruction results discussed in the sections above are obtained in the visible region. In this section, we explore the MHSI and MHS reconstruction performance in the NIR region, and the detailed results are shown in Fig. 6. For comparison, the reconstructed MHSI and MHS in the visible range are also presented in the same figure. In this experiment, we adopt a piece of green leaf as the sample because green leaves show obvious reflection in the considered NIR region, as shown in Fig. 6(a). The image in the middle of the figure is the MRGB image captured by the digital camera, and the reconstructed and ground truth spectra of a part of the leaf are presented in the left subfigure of Fig. 6(a). To show the advantages of microscopic reconstruction in practical applications, we also recover the reflection spectrum of a small segment of leaf vein in the RGB image, as shown in the right subfigure of Fig. 6(a). Our model also shows excellent MHS reconstruction performance in the NIR region. For the whole image, the reconstructed reflection spectrum follows the ground truth spectrum across the NIR region, although the reflectance errors are slightly larger than those of the MHS reconstruction in the visible range. The main reason is that the MRGB camera has a weak spectral response in the NIR region, so there is no strong correlation between the MRGB and MHSI image pairs there. For the MHS reconstruction of the leaf vein, however, the reconstructed reflection spectrum is very close to the ground truth in the NIR region, and the error is almost negligible. All these results are further confirmed by the reconstructed MHSIs at different wavelength bands shown in Fig. 6(b). The reconstructed MHSIs closely match the ground truth MHSIs at 470 nm, 530 nm, 610 nm, and 690 nm. Comparing the reconstructed and ground truth MHSIs at 780 nm (NIR), the reconstructed MHSI still clearly shows the overall structural information of the leaf, although some pixels differ. The overall results indicate that the reconstruction DCNN model proposed in this paper can effectively predict the features of the MHSI from the corresponding MRGB image not only in the visible region but also in the NIR region.


Fig. 6. Reconstructed MHSIs and MHS curves from the MRGB image of part of a leaf in the NIR region. (a) Reconstructed and ground truth MHS of the whole ROI (left) and of the segment of leaf vein (right). (b) Reconstructed MHSIs (top) and ground truth MHSIs (bottom) at 470 nm, 530 nm, 610 nm, 690 nm, and 780 nm.


4. Summary

In summary, MHS and MHSIs have been reconstructed from MRGB images based on a homemade DCNN framework. The reconstruction performance of the proposed model is investigated on the public CAVE dataset and on a homemade microscopic dataset formed by MHSI and MRGB image pairs. To avoid mismatch between the MHSI and the corresponding MRGB image, the perspective projection method is used to register the MHSI and MRGB image pairs. In addition, unlike previously reported models for recovering HSIs from RGB images, which mainly focus on the spatial correlation between the HSI and RGB images, the spectral dimension is also considered when constructing the DCNN model: a metric that evaluates the MHS reconstruction performance is introduced into the loss function, and the 3D convolution kernel replaces the 2D convolution kernel conventionally used in traditional DCNN-based HSI reconstruction models. The reconstruction results demonstrate that our model achieves excellent performance in reconstructing MHSIs and MHS from MRGB images. Furthermore, we explore the possibility of recovering the MHSI and MHS in the NIR region from MRGB images and find that the proposed model also shows excellent reconstruction performance there, even though the correlation between the NIR MHSI and the MRGB image is low. The overall results demonstrate that the proposed Vis-NIR MHSI and MHS reconstruction from MRGB images provides an effective and low-cost way to apply MHSI to real-time imaging and to transient biological, chemical, and physical processes.

Funding

College Students Innovative and Entrepreneurial Training Program (S202310564020); Laboratory of Lingnan Modern Agriculture project (NT2021009); Overseas Expertise Introduction Project for Discipline Innovation (D18019); Basic and Applied Basic Research Foundation of Guangdong Province (2021A1515012112, 2022A1515010411); National Natural Science Foundation of China (11774099, 12174120); Special Project for Research and Development in Key areas of Guangdong Province (2019B020219002).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. M. Eady and B. Park, “Classification of Salmonella enterica with selective bands using visible/NIR hyperspectral microscope images,” J. Microsc. 263(1), 10–19 (2016). [CrossRef]  

2. Q. Li, X. He, Y. Wang, et al., “Review of spectral imaging technology in biomedical engineering: achievements and challenges,” J. Biomed. Opt. 18(10), 100901 (2013). [CrossRef]  

3. J. Wang, X. Mao, Y. Wang, et al., “Automatic generation of pathological benchmark dataset from hyperspectral images of double stained tissues,” Opt. Laser Technol. 163, 109331 (2023). [CrossRef]  

4. T. Vo-Dinh, “A hyperspectral imaging system for vivo optical diagnostics,” IEEE Eng. Med. Biol. Mag. 23(5), 40–49 (2004). [CrossRef]  

5. S. J. Leavesley, M. Walters, C. Lopez, et al., “Hyperspectral imaging fluorescence excitation scanning for colon cancer detection,” J. Biomed. Opt. 21(10), 104003 (2016). [CrossRef]  

6. J. E. Batey, M. Yang, H. Giang, et al., “Ultrahigh-throughput single-particle hyperspectral imaging of gold nanoparticles,” Anal. Chem. 95(13), 5479–5483 (2023). [CrossRef]  

7. R. Kang, B. Park, M. Eady, et al., “Classification of foodborne bacteria using hyperspectral microscope imaging technology coupled with convolutional neural networks,” Appl. Microbiol. Biotechnol. 104(7), 3157–3166 (2020). [CrossRef]  

8. V. Studer, J. Bobin, M. Chahid, et al., “Compressive fluorescence microscopy for biological and hyperspectral imaging,” Proc. Natl. Acad. Sci. U. S. A. 109(26), E1679–1687 (2012). [CrossRef]  

9. J. Wu, B. Xiong, X. Lin, et al., “Snapshot hyperspectral volumetric microscopy,” Sci. Rep. 6(1), 24624 (2016). [CrossRef]  

10. R. Fakhrullin, L. Nigamatzyanova, and G. Fakhrullina, “Dark-field/hyperspectral microscopy for detecting nanoscale particles in environmental nanotoxicology research,” Sci. Total Environ. 772, 145478 (2021). [CrossRef]  

11. G. S. Verebes, M. Melchiorre, A. Garcia-Leis, et al., “Hyperspectral enhanced dark field microscopy for imaging blood cells,” J. Biophotonics 6(11-12), 960–967 (2013). [CrossRef]  

12. Y. Xu, Q. S. Chen, Y. Liu, et al., "A novel hyperspectral microscopic imaging system for evaluating fresh degree of pork," Korean J. Food Sci. Anim. Resour. 38(2), 362–375 (2018). [CrossRef]

13. E. Holman, Y. S. Fang, L. Chen, et al., “Autonomous adaptive data acquisition for scanning hyperspectral imaging,” Commun. Biol. 3(1), 684 (2020). [CrossRef]  

14. X. Lin, Y. Liu, J. Wu, et al., “Spatial-spectral encoded compressive hyperspectral imaging,” ACM Trans. Graph. 33(6), 1–11 (2014). [CrossRef]  

15. M. E. Gehm, R. John, D. J. Brady, et al., “Single-shot compressive spectral imaging with a dual-disperser architecture,” Opt. Express 15(21), 14013–14027 (2007). [CrossRef]  

16. Y. Liu, X. Yuan, J. Suo, et al., “Rank minimization for snapshot compressive imaging,” IEEE Trans. Pattern Anal. Mach. Intell. 41(12), 2990–3006 (2019). [CrossRef]  

17. T. Huang, W. Dong, X. Yuan, et al., "Deep Gaussian scale mixture prior for spectral compressive imaging," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16216–16225 (2021).

18. L. Wang, C. Sun, M. Zhang, et al., "DNU: Deep non-local unrolling for computational spectral imaging," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1661–1671 (2020).

19. L. Wang, T. Zhang, Y. Fu, et al., “HyperReconNet: Joint coded aperture optimization and image reconstruction for compressive hyperspectral imaging,” IEEE Trans. Image Process. 28(5), 2257–2270 (2019). [CrossRef]  

20. D. Kittle, K. Choi, A. Wagadarikar, et al., “Multiframe image estimation for coded aperture snapshot spectral imagers,” Appl. Opt. 49(36), 6824–6833 (2010). [CrossRef]  

21. S. Ji, W. Xu, M. Yang, et al., “3D convolutional neural networks for human action recognition,” IEEE Trans. Pattern Anal. Mach. Intel. 35(1), 221–231 (2013). [CrossRef]  

22. C. Dong, C. C. Loy, K. He, et al., “Image superresolution using deep convolutional networks,” IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 295–307 (2016). [CrossRef]  

23. J. Kim, J. Kwon Lee, and K. Mu Lee, "Accurate image super-resolution using very deep convolutional networks," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1646–1654 (2016).

24. J. Kim, J. Kwon Lee, and K. Mu Lee, "Deeply-recursive convolutional network for image super-resolution," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1637–1645 (2016).

25. B. Kaya, Y. B. Can, and R. Timofte, "Towards spectral estimation from a single RGB image in the wild," In IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 2546–3555 (2019).

26. Y. Yan, L. Zhang, J. Li, et al., "Accurate spectral super-resolution from single RGB image using multi-scale CNN," In Chinese Conference on Pattern Recognition and Computer Vision (PRCV), pp. 206–217 (2018).

27. Z. Xiong, Z. Shi, H. Li, et al., "HSCNN: CNN-based hyperspectral image recovery from spectrally undersampled projections," In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 518–525 (2017).

28. T. Stiebel, S. Koppers, P. Seltsam, et al., "Reconstructing spectral images from RGB-images using a convolutional neural network," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 948–953 (2018).

29. Z. Shi, C. Chen, Z. Xiong, et al., "HSCNN+: Advanced CNN-based hyperspectral recovery from RGB images," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 939–947 (2018).

30. S. Koundinya, H. Sharma, M. Sharma, et al., "2D-3D CNN based architectures for spectral reconstruction from RGB images," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 844–851 (2018).

31. B. Arad, O. Ben-Shahar, and R. Timofte, "NTIRE 2018 challenge on spectral reconstruction from RGB images," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 929–938 (2018).

32. Y. Sun, J. Zhang, and R. Liang, “pHSCNN: CNN-based hyperspectral recovery from a pair of RGB images,” Opt. Express 30(14), 24862–24873 (2022). [CrossRef]  

33. Y. Ji, S. M. Park, S. Kwon, et al., “mHealth hyperspectral learning for instantaneous spatiospectral imaging of hemodynamics,” PNAS Nexus 2(4), 1–15 (2023). [CrossRef]  

34. L. Zhang, Z. Lang, P. Wang, et al., "Pixel-aware deep function-mixture network for spectral super-resolution," In Proceedings of the AAAI Conference on Artificial Intelligence 34(07), 12821–12828 (2020). [CrossRef]

35. L. Zhang, W. Wei, C. Bai, et al., “Exploiting Clustering Manifold Structure for Hyperspectral Imagery Super-Resolution,” IEEE Trans. Image Process. 27(12), 5969–5982 (2018). [CrossRef]  

36. G. Wolberg and S. Zokai, “Image registration for perspective deformation recovery,” Autom. Target Recognit. X 4050, 259–270 (2000). [CrossRef]  

37. W. Liu, Y. Wang, J. Chen, et al., “A completely affine invariant image-matching method based on perspective projection,” Mach. Vision Appl. 23(2), 231–242 (2012). [CrossRef]  

38. G. Huang, Z. Liu, L. Van der Maaten, et al., "Densely connected convolutional networks," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4700–4708 (2017).

39. V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” In Proceedings of the 27th International Conference on Machine Learning (2010).

40. L. Yan, X. Wang, M. Zhao, et al., “Reconstruction of Hyperspectral Data from RGB Images with Prior Category Information,” IEEE Trans. Comput. Imaging 6, 1070–1081 (2020). [CrossRef]  

41. J.-I. Park, M. H. Lee, M. D. Grossberg, et al., "Multispectral imaging using multiplexed illumination," In IEEE 11th International Conference on Computer Vision (ICCV) (2007).

42. J. Li, C. Wu, R. Song, et al., "Adaptive weighted attention network with camera spectral sensitivity prior for spectral reconstruction from RGB images," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 462–463 (2020).

43. X. Hu, Y. Cai, J. Lin, et al., "HDNet: High-resolution dual-domain learning for spectral compressive imaging," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 17542–17551 (2022).

44. D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," In International Conference on Learning Representations (ICLR) (2015).

Supplementary Material

Supplement 1: Pictures of samples used in the experiment.
