
Self super-resolution of optical coherence tomography images based on deep learning

Open Access

Abstract

As a medical imaging modality, optical coherence tomography (OCT) has been the subject of much research aimed at improving its resolution. We developed a deep-learning-based OCT self super-resolution (OCT-SSR) pipeline to improve the axial resolution of OCT images based on the high-resolution and low-resolution spectral data collected by the OCT system itself. In this pipeline, enhanced super-resolution asymmetric generative adversarial networks were built to improve the network outputs without increasing the network complexity. The feasibility and effectiveness of the approach were demonstrated by experimental results on images of biological samples collected by the home-made spectral-domain OCT and swept-source OCT systems. More importantly, we found that the sidelobes in the original images can be obviously suppressed while the resolution is improved by the OCT-SSR method, which helps to reduce pseudo-signals in OCT imaging when a non-Gaussian-spectrum light source is used. We believe that the OCT-SSR method has broad prospects for breaking the limitation imposed by the source bandwidth on the axial resolution of OCT systems.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Optical coherence tomography (OCT) is a non-invasive medical imaging method that can provide micron-scale resolution and millimeter-scale anatomical information in depth. To detect subtle structural changes, higher resolution has been pursued continuously for OCT. The most direct way to improve the axial resolution is to use a light source with a wide spectral bandwidth, such as the Ti:Sapphire laser [1,2] or the supercontinuum (SC) light source [3,4].

The deconvolution approach has been widely used to improve resolution since it was reported in 1997 by M.D. Kulkarni et al. [5]. After that, Y. Liu et al. [6] and E. Bousi et al. [7] improved the deconvolution method to achieve better image reconstruction. In addition to the deconvolution methods, other digital processing technologies, such as spectral shaping [8,9] and spectral estimation [10], can also improve the axial resolution.

With the development of machine learning, sparse representation [11,12] and deep learning methods [13,14] have attracted more and more attention in super-resolution (SR) research. Fang et al. constructed an OCT clinical dataset and studied the reconstruction of OCT images using sparse representation [15,16]. Deep learning strategies, such as enhanced super-resolution generative adversarial networks (ESRGAN) [17], super-resolution optimized using a perceptual-tuned generative adversarial network [18], the pyramid attention enhanced deep super-resolution network (PAEDSR) [19], and the efficient super-resolution transformer (ESRT) [20], have been proposed to perform image super-resolution and have achieved great progress. In addition, GANs have also been used to generate high-resolution (HR) images from low-resolution (LR) OCT images [21–27]. Among these studies, the intrinsically co-registered pairs of low and high axial resolution images were produced by windowing the interference spectra [25–27]. However, the available methods are supervised and require prior knowledge from higher-resolution systems, which may limit the practical application of SR methods. Self super-resolution (SSR), as a self-supervised deep learning method, has been used in medical imaging [28,29] and optical imaging [30–32]. By learning the mapping from the LR image to the HR image and then applying this mapping to the original HR image, the SSR method can estimate an image with still higher resolution [33,34].

In this paper, we present a deep-learning-based SSR pipeline to super-resolve OCT images based on the HR and LR images collected by our home-made OCT systems. In this framework, we propose enhanced super-resolution asymmetric generative adversarial networks (ESRAMGAN) to improve the network performance without increasing the network complexity. The quantitative results demonstrate the improvement of resolution and the recovery of delicate speckles for the collected swept-source OCT (SS-OCT) and spectral-domain OCT (SD-OCT) images, proving the feasibility of the OCT self super-resolution (OCT-SSR) approach, which has great potential for improving the axial resolution of OCT systems and breaking the limitation imposed by the source bandwidth.

The proposed method relies only on the collected images and does not depend on any external training data. More importantly, we found that the OCT-SSR method can suppress the sidelobes while improving the resolution, which will benefit biological sample imaging.

2. Methods and materials

2.1 Framework of OCT-SSR method

The SSR method is a self-supervised learning method in which the data itself provides the supervisory signals. The flowchart of our proposed OCT-SSR method is shown in Fig. 1 and includes three steps: construction of datasets, training of neural networks, and generation of OCT-SSR images.

Fig. 1. The workflow of the OCT-SSR method. (a) Data preparation. (b) Training. (c) The workflow of using the trained generator to super-resolve the image.

The construction of datasets, or data preparation, is shown in Fig. 1(a). The B-scan spectral data are collected by the OCT system. The original B-scan image of the sample is obtained by the inverse Fourier transform (IFT) of the interference spectrum and regarded as the HR image; during its generation, a Gaussian window is used to suppress sidelobes and improve image quality. The corresponding LR image is generated by the Gaussian spectral shaping method and the IFT [25], with its resolution controlled by the width of the Gaussian window. After removing the background spectrum, the interferogram is multiplied by Gaussian windows to generate LR signals. These signals are then mapped to the wave-number domain, and the IFT is performed to obtain axially low-resolution intensity images. Finally, the co-registered pairs of HR and LR OCT images are used to construct the datasets.
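The spectral-shaping step above can be sketched numerically. The following is a minimal, self-contained illustration on a toy one-reflector interference spectrum; the signal, window width, and helper names are our own assumptions, not the authors' code. Narrowing the effective spectral bandwidth with a Gaussian window broadens the reconstructed axial point spread function, turning an HR A-line into its co-registered LR counterpart.

```python
import numpy as np

def gaussian_window(n_samples, width_fraction):
    """Gaussian spectral window; a narrower window (smaller width_fraction)
    means a smaller effective bandwidth and hence lower axial resolution."""
    k = np.arange(n_samples)
    center = (n_samples - 1) / 2.0
    sigma = width_fraction * n_samples / 2.355  # FWHM = 2.355 * sigma
    return np.exp(-0.5 * ((k - center) / sigma) ** 2)

def spectrum_to_aline(spectrum, window=None):
    """Inverse Fourier transform of a (windowed) interference spectrum
    to an intensity A-line, as in the data-preparation step."""
    if window is not None:
        spectrum = spectrum * window
    return np.abs(np.fft.ifft(spectrum))

# Toy interference spectrum: a DC term plus the fringe of a single reflector.
n = 2048
k = np.arange(n)
spectrum = 1.0 + 0.5 * np.cos(2 * np.pi * 200 * k / n)
spectrum = spectrum - spectrum.mean()  # remove the background (DC) spectrum

hr_aline = spectrum_to_aline(spectrum)                           # full bandwidth -> HR
lr_aline = spectrum_to_aline(spectrum, gaussian_window(n, 0.3))  # shaped -> LR
```

Running this on the toy spectrum leaves the reflector at the same depth bin while the half-maximum width of the LR peak grows, which is the co-registration property the dataset construction relies on.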

As shown in Fig. 1(b), LR images are input to the generator to output images with improved resolution, and the discriminator is used to distinguish the output image of the generator from the real HR image. During the training process, the discriminator and the generator continuously update their parameters through back propagation. Training is complete when the discriminator can no longer distinguish between the real HR image and the generated image.

After training, the generator has learned the mapping relationship between LR and HR images. This mapping is then applied to the HR image itself to estimate an image with higher resolution. During the test phase, as shown in Fig. 1(c), the HR image is first resampled axially by the bicubic interpolation method with factor t (t is the ratio of the axial resolution of the LR image to that of the HR image, so t > 1), and the resampled HR image is input to the generator of the trained super-resolution network (SRNet) to generate the SR image. Finally, the SR image is resampled by factor 1/t to give the final output image.
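The resample–infer–resample test phase can be summarized in a few lines of PyTorch. This is a sketch of the data flow only: `generator` stands for any trained LR-to-HR network (an identity module is used below as a placeholder), and the choice of bicubic interpolation follows the text.

```python
import torch
import torch.nn.functional as F

def self_super_resolve(hr_image, generator, t):
    """OCT-SSR test phase: axially stretch the HR image by factor t with
    bicubic interpolation, apply the trained LR->HR generator, then
    resample the SR output back by 1/t to the original grid."""
    x = hr_image.unsqueeze(0).unsqueeze(0)  # -> (batch, channel, z, x)
    z, w = x.shape[-2], x.shape[-1]
    x = F.interpolate(x, size=(int(round(z * t)), w),
                      mode="bicubic", align_corners=False)
    with torch.no_grad():
        sr = generator(x)
    sr = F.interpolate(sr, size=(z, w), mode="bicubic", align_corners=False)
    return sr.squeeze(0).squeeze(0)

# Data-flow check with a placeholder generator (identity network):
generator = torch.nn.Identity()
img = torch.rand(128, 256)  # (z, x) B-scan
out = self_super_resolve(img, generator, t=1.5)
```

Note that only the axial (depth) dimension is stretched; the lateral size is left unchanged, consistent with the asymmetric resolution mismatch discussed later.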

2.2 Super-resolution network structure

SRNet structures are shown in Fig. 2. The networks adopt the basic structure of enhanced super resolution generative adversarial networks (ESRGAN) [17]. ESRGAN is composed of a generator and a discriminator, and they learn to play against each other to produce reasonable SR images.

Fig. 2. The architecture of GAN. (a) The structure of generator. (b) The structure of discriminator.

The generator, as shown in Fig. 2(a), introduces the residual-in-residual dense block (RRDB), which combines dense connections with a residual structure. Following ESRGAN, the generator contains 23 RRDBs; within each RRDB, the dense blocks are combined through skip connections, and each dense block consists of five convolutional layers and four Leaky ReLU (LReLU) layers. The input image first passes through a convolutional layer (64 feature channels) and is then fed into the chain of RRDBs to extract features. The extracted features are reconstructed into the output image by two convolutional layers and an LReLU layer. The discriminator consists of eight convolutional blocks and two linear layers, as shown in Fig. 2(b). Except for the first convolutional layer, each convolutional layer is followed by a batch-normalization (BN) layer and an LReLU layer.

There have been many innovative forms of convolution blocks, such as gated convolutions [35], attention mechanisms [36], and the asymmetric convolution block (ACB) [37]. In our study, the lateral resolution of the input image is the same as that of the ground-truth image, but their axial resolutions are different. To address this asymmetric feature, we propose ESRAMGAN, in which the convolution blocks in the generator are replaced by ACBs. The basic convolution block in ESRAMGAN is shown in Fig. 3. An ACB is a set of three convolution kernels, 1 × 3, 3 × 1, and 3 × 3, as shown in Fig. 3(a). The asymmetric convolution kernels strengthen the asymmetric features of the image, and the outputs of the three kernels are summed as the output of a single asymmetric convolution block. Each convolution kernel in the generator of ESRGAN is replaced by an ACB when training the SRNet. As shown in Fig. 3(b), the three convolution kernels are merged into one kernel when super-resolving images: the parameters of the 1 × 3 and 3 × 1 kernels in each ACB are added to the corresponding 3 × 3 kernel, thus simplifying the structure. In this way, the network capacity is increased without increasing the network complexity at inference.
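The train-time/deploy-time equivalence of the ACB can be verified directly: because convolution is linear, summing the outputs of the 1 × 3, 3 × 1, and 3 × 3 branches equals a single 3 × 3 convolution whose kernel absorbs the asymmetric weights. The sketch below (channel count and naming are illustrative, not from the paper) checks this numerically.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ACB(nn.Module):
    """Asymmetric convolution block (training form): parallel 3x3, 1x3,
    and 3x1 convolutions whose outputs are summed."""
    def __init__(self, ch=8):
        super().__init__()
        self.k33 = nn.Conv2d(ch, ch, (3, 3), padding=(1, 1), bias=False)
        self.k13 = nn.Conv2d(ch, ch, (1, 3), padding=(0, 1), bias=False)
        self.k31 = nn.Conv2d(ch, ch, (3, 1), padding=(1, 0), bias=False)

    def forward(self, x):
        return self.k33(x) + self.k13(x) + self.k31(x)

    def merged_kernel(self):
        """Deployment form: fold the asymmetric weights into the 3x3 kernel."""
        w = self.k33.weight.clone()
        w[:, :, 1:2, :] += self.k13.weight  # 1x3 adds to the middle row
        w[:, :, :, 1:2] += self.k31.weight  # 3x1 adds to the middle column
        return w

acb = ACB()
x = torch.randn(1, 8, 16, 16)
with torch.no_grad():
    y_train = acb(x)                                        # three-branch form
    y_merged = F.conv2d(x, acb.merged_kernel(), padding=1)  # single-kernel form
```

The two forms agree up to floating-point error, which is why the merged network has the same inference cost as a plain 3 × 3 convolution.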

Fig. 3. (a) Convolution block in ESRAMGAN. (b) Convolution blocks are merged in ESRAMGAN when super-resolving images.

2.3 Sample preparations

OCT images were collected from plant, animal, and human samples. Plant tissues include cucumber and orange pulp. Animal tissues include in vivo zebrafish and ex vivo mouse ears. Human tissues include in vivo fingers and ex vivo thyroid tissue.

For zebrafish imaging, adult wild-type zebrafish (AB strain) were used. First, the zebrafish was pre-anesthetized with a 0.016% tricaine solution. Then, the zebrafish was kept anesthetized with our home-made anesthesia system during the imaging process [38]. After imaging, the zebrafish was perfused with fresh water to recover from anesthesia. The excised ex vivo mouse ears were immersed in 4% paraformaldehyde for 24 hours and then placed in a glass-bottomed petri dish for imaging. All animal experimental protocols were approved by the Institutional Animal Care Committee of Nankai University.

Ex vivo human thyroid tissues were provided by the Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, China, in accordance with protocols and guidelines approved by the institutional review board of Tianjin Cancer Hospital.

2.4 Systems and data preprocessing

A home-made SD-OCT system [39] was used to collect OCT images. Its light source is a multiplexed super-luminescent diode in the near-infrared (central wavelength 840 nm, bandwidth 100 nm), achieving an axial resolution of ∼3.4 µm in air. Because of the rich structure of zebrafish, images with different resolutions differ greatly in visual appearance; therefore, the zebrafish was selected to generate multiple-resolution images for evaluating OCT-SSR. After shaping the spectrum of the light source with different Gaussian windows, a series of images with axial resolutions of about 3.4 µm, 5.3 µm, 8 µm, 12 µm, and 18 µm were obtained to evaluate the feasibility of the OCT-SSR method. The SD-OCT image size is 1000 pixels × 2048 pixels, corresponding to a field of view of 6 mm (x) × 2.3 mm (z). The images were divided into patches of 80 × 80 pixels to generate the dataset, and data augmentation, such as random vertical and horizontal flips, was applied.
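The patch extraction and augmentation described above can be sketched as follows; the non-overlapping tiling layout is our assumption, since the text specifies only the 80 × 80 patch size and the random flips.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_patches(image, patch=80):
    """Tile a B-scan into non-overlapping patch x patch pieces
    (any remainder at the edges is dropped)."""
    h, w = image.shape
    return [image[i:i + patch, j:j + patch]
            for i in range(0, h - patch + 1, patch)
            for j in range(0, w - patch + 1, patch)]

def augment(p):
    """Random vertical and horizontal flips, as in the augmentation step."""
    if rng.random() < 0.5:
        p = p[::-1, :]
    if rng.random() < 0.5:
        p = p[:, ::-1]
    return p

bscan = rng.random((2048, 1000))  # (z, x), matching the SD-OCT image size
patches = [augment(p) for p in extract_patches(bscan)]
```

In practice the same tiling would be applied identically to each HR/LR pair so that the patches stay co-registered.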

To evaluate the applicability of OCT-SSR to SS-OCT images, two SS-OCT systems with different axial resolutions were also used to collect OCT images. One SS-OCT system, whose light source (Santec HSL-20-100-B) has a central wavelength of 1310 nm and a bandwidth of 87 nm, has an axial resolution of ∼14 µm [40]; its images are regarded as the HR images. After shaping the spectrum of the light source with Gaussian windows, simulated LR images with 20 µm axial resolution were obtained. Another SS-OCT system, whose light source (Santec HSL-20-100-B) has a central wavelength of 1310 nm and a bandwidth of 107 nm, has an axial resolution of ∼10 µm; its images are regarded as the ground truth (GT) images to verify the reliability of SS-OCT-SSR.

Our network models were implemented in PyTorch on a server with 64 GB of RAM and an NVIDIA TITAN RTX graphics processing unit (GPU). All networks were optimized using the Adam algorithm, with β1 and β2 set empirically to 0.9 and 0.99. The learning rate was initially set to 0.0001 and dropped by step decay. The number of training iterations was set to 150,000. The remaining training parameters are the same as in Ref. [26].
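The optimization setup maps directly onto a few lines of PyTorch. The model and loss below are placeholders, and the step-decay interval and decay factor are illustrative assumptions (the text specifies only the initial learning rate, the Adam betas, and the iteration count).

```python
import torch

# A small stand-in model replaces the ESRAMGAN generator; the optimizer
# settings follow the text (Adam with beta1 = 0.9, beta2 = 0.99, initial
# learning rate 1e-4, step-decay schedule). The decay interval and factor
# are illustrative assumptions, as are the placeholder model and loss.
model = torch.nn.Conv2d(1, 1, 3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.99))
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50_000, gamma=0.5)

for step in range(3):  # 150,000 iterations in the actual training
    optimizer.zero_grad()
    batch = torch.randn(4, 1, 80, 80)  # 80 x 80 patches, as in the dataset
    loss = model(batch).pow(2).mean()  # placeholder loss
    loss.backward()
    optimizer.step()
    scheduler.step()
```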

3. Results

3.1 Quantitative evaluations of ESRAMGAN

Because of the rich information in its tiny structures, the mouse ear was used to quantitatively evaluate ESRAMGAN on image super-resolution in our study. When training the SRNet for mouse ear images, images with 18 µm and 8 µm resolutions were set as the LR and HR images, respectively. The images with 18 µm resolution were input into the SRNet, and the network outputs were compared with the images of 8 µm axial resolution. A dataset consisting of 4407 pairs of image patches was built to train ESRAMGAN, ESRGAN, PAEDSR, and ESRT for mouse ear image SR.

SR results and quantitative evaluations of the test images are shown in Fig. 4. Figures 4(a) and 4(f) are mouse ear OCT images with 18 µm and 8 µm axial resolutions, respectively. Figures 4(b)–4(e) show SR images of Fig. 4(a) generated by ESRAMGAN, ESRGAN, PAEDSR, and ESRT, respectively. As shown in Fig. 4, the SR images are closer to the HR image in texture structure and speckle features, demonstrating that all networks can produce axially super-resolved images.

Fig. 4. Quantitative evaluations of different networks. (a) is an OCT image of mouse ear with 18 µm axial resolution. (b) is the SR result of (a) by ESRAMGAN. (c) is the SR result of (a) by ESRGAN. (d) is the SR image of PAEDSR. (e) is the SR image of ESRT. (f) is the corresponding mouse ear OCT image with 8 µm axial resolution. Scale bar in the image is 250 µm.

To evaluate the performance of the networks, three quantitative metrics were used: peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and learned perceptual image patch similarity (Lpips) [41]. PSNR is based on the error between corresponding image pixels, and SSIM quantifies the structural similarity between two images. In the following calculations of PSNR and SSIM, the images with the higher resolution level were used as the GT images. Lpips, as a measure of perceptual similarity, is more consistent with human perception than PSNR and SSIM; the lower the value of Lpips, the more similar the two images. As shown in Fig. 4, the best result for each metric is indicated in bold. More intuitive results, boxplots of PSNR, SSIM, and Lpips, are plotted in Fig. 5. PAEDSR and ESRT are better than ESRAMGAN in PSNR and SSIM but worse in the perceptual metric Lpips. The same conclusion is illustrated by the enlarged regions in Fig. 4, where Figs. 4(d) and 4(e) look slightly blurry compared to Fig. 4(b). For ESRGAN, Lpips is improved, but PSNR and SSIM are degraded. The quantitative metrics of ESRAMGAN are better than those of ESRGAN, indicating that the ACB strengthens the feature extraction ability of the network and is an effective alternative to the plain convolution kernel. In summary, ESRAMGAN has the best visual effect and is effective for super-resolution.
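For reference, PSNR and a simplified single-window SSIM can be computed as below (the full SSIM averages local windows, and Lpips requires a learned network, so it is omitted). This is a generic sketch of the metrics, not the authors' evaluation code.

```python
import numpy as np

def psnr(img, ref, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher means pixel-wise closer."""
    mse = np.mean((np.asarray(img, float) - np.asarray(ref, float)) ** 2)
    return np.inf if mse == 0 else 10 * np.log10(data_range ** 2 / mse)

def global_ssim(img, ref, data_range=1.0):
    """SSIM computed over the whole image as a single window;
    the standard metric averages this quantity over local windows."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = img.mean(), ref.mean()
    vx, vy = img.var(), ref.var()
    cov = ((img - mx) * (ref - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

ref = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
noise = 0.01 * np.random.default_rng(0).standard_normal(ref.shape)
noisy = np.clip(ref + noise, 0.0, 1.0)
```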

Fig. 5. Boxplots of quantitative evaluation parameters of SR networks. IQR: Interquartile range.

3.2 Quantitative evaluations of OCT-SSR

For evaluating OCT-SSR, zebrafish images with different resolution levels (5.3 µm, 8 µm, 12 µm, and 18 µm) were obtained by the spectral shaping method and used as the test dataset for quantitative evaluation. When training the SRNet, the 18 µm images were set as the LR images input to the SRNet, and the network outputs were compared with HR images of 12 µm axial resolution. A dataset consisting of 3625 pairs of image patches was used to train ESRAMGAN; the resolution ratio between the LR and HR image pairs is 1.5. Images with resolutions of 12 µm, 8 µm, and 5.3 µm were resampled by bicubic interpolation with factors of 1.5, 2.25, and 3.39, respectively, and input into the trained network to obtain SSR outputs, while images with 18 µm resolution were input into the network directly. In the experiments, the image at the next higher resolution level was regarded as the corresponding GT image; for example, the image with 8 µm resolution is the GT image for the SR image generated from the 12 µm image.

The evaluation results are demonstrated in Fig. 6. Figures 6(a1), 6(b1), 6(c1), 6(d1), and 6(e1) are OCT images generated by Gaussian spectral shaping with different widths, with axial resolutions of about 18 µm, 12 µm, 8 µm, 5.3 µm, and 3.4 µm, respectively. Figures 6(a2), 6(b2), 6(c2), and 6(d2) are the SR results of Figs. 6(a1), 6(b1), 6(c1), and 6(d1) by the trained ESRAMGAN, respectively. The comparison between the SR images and their corresponding GT images (the images at the next higher resolution level) visually illustrates the improvement in axial resolution. The sub-images in the lower part of Fig. 6 are magnified views of the regions framed by the rectangles. Figures 6(f) and 6(g) are the A-line profiles at the positions marked by the blue dashed lines in Figs. 6(a1)–(e1). With the improvement of the axial resolution, the distributed texture in the box becomes gradually clearer (Figs. 6(a1), (b1), (c1), (d1), and (e1)). As shown in Figs. 6(f) and 6(g), the peaks of the SR image curves are closer to those of their corresponding GT image curves, and both are sharper than the peaks of the corresponding LR image curves, demonstrating that the proposed OCT-SSR method improves the image resolution and reconstructs speckle patterns and texture distributions similar to those of the GT images. Quantitative analyses of the experimental data are given in Table 1. The quantitative evaluation parameters of the SR images from the 18 µm and 12 µm resolutions are improved. However, the PSNR and SSIM of the SR images from the 8 µm and 5.3 µm resolutions decreased slightly, which may be caused by the slight differences in pixel intensity values between the SR and GT images. Based on the statistical results of Lpips together with the visual effect, we think that the texture feature distributions of the SR images are closer to those of the GT images.

Fig. 6. Evaluations of OCT-SSR. (a1), (b1), (c1), (d1), (e1) are 18 µm, 12 µm, 8 µm, 5.3 µm, 3.4 µm images, respectively. (a2), (b2), (c2), and (d2) are SR images of 18 µm, 12 µm, 8 µm and 5.3 µm images, respectively. (f) is A-line profiles pointed out by the blue dashed lines in (a1), (a2), (b1), (b2), and (c1). (g) is A-line profiles pointed out by the blue dashed lines in (c1), (c2), (d1), (d2), and (e1). Scale bar in the image is 250 µm.

Table 1. Quantitative evaluation parameters of ESRAMGAN on different axial resolution images.

3.3 OCT-SSR on OCT images with a larger difference in resolution

Further experiments were performed on SD-OCT images with a larger resolution difference between the LR and HR image pairs to demonstrate the resolution-enhancement ability of OCT-SSR. The ESRAMGAN was trained on a simulated SD-OCT image dataset of 3110 pairs of 8 µm HR and 18 µm LR zebrafish image patches, giving a resolution ratio of 2.25. The test results are shown in Fig. 7. Figure 7(a) is the LR image with 18 µm resolution, Fig. 7(b) is the HR image, and Fig. 7(c) is the corresponding SR result. Figure 7(d) is the B-scan image with 3.4 µm axial resolution, which serves as the GT image. The sub-images in the lower part of Fig. 7 are magnified views of the regions framed by the squares.

Fig. 7. Quantitative evaluation of OCT-SSR on OCT images with a larger difference in resolution. (a) is the image with 18 µm resolution. (b) is the image with 8 µm resolution. (c) is the SR image of (b). (d) is the image with 3.4 µm resolution. Scale bar in the image is 250 µm.

As the resolution was enhanced, the image quality gradually improved (Figs. 7(a)–7(c)). Compared with Fig. 7(b), the speckle is more delicate and the zebrafish scale texture is narrowed longitudinally in the sub-image of Fig. 7(c). The quantitative evaluation results are also shown in Fig. 7. The improvements in PSNR, SSIM, and Lpips indicate that the SR image is closer to the GT image in texture feature distribution, reflecting the improvement in visual effect and image resolution. These experiments show that the OCT-SSR method is still effective on images with a larger resolution difference, meaning that a larger improvement in resolution can be obtained.

3.4 Experiments of OCT-SSR on SS-OCT images

Further experiments were carried out on collected SS-OCT images. HR images and GT images were collected by SS-OCT systems with 14 µm and 10 µm axial resolutions, respectively. The SRNet was trained on the dataset consisting of 4438 pairs of collected 14 µm HR and simulated 20 µm LR image patches. Then, HR images were resampled and further input to the trained network to generate SR images.

The results are given in Fig. 8. Figures 8(a) and 8(c) are collected cucumber SS-OCT images with 14 µm and 10 µm resolution, respectively. The SR image of Fig. 8(a) is shown in Fig. 8(b). The sub-images in the lower part of Fig. 8 are magnified views of the regions framed by the squares. Intensity profiles along the blue dashed lines in the red and yellow sub-images of Fig. 8 are plotted in Figs. 8(d1) and 8(d2), respectively. The honeycomb-like internal structure of the cucumber is shown in Figs. 8(a)–8(c). As can be seen from the intensity profiles (Figs. 8(d1) and 8(d2)), the intensity curves of the SR image are closer to those of the GT image, and their peaks are narrower at half maximum than those of the HR image, indicating the improvement of the image resolution and the effectiveness of OCT-SSR.
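The peak-width comparison made from the A-line profiles can be quantified with a simple full-width-at-half-maximum estimate; the profiles below are synthetic Gaussians standing in for measured HR and SR peaks, not data from the paper.

```python
import numpy as np

def fwhm(profile):
    """Full width at half maximum (in samples) of the dominant peak of an
    A-line profile, by simple half-maximum thresholding; assumes one peak."""
    p = np.asarray(profile, dtype=float)
    above = np.where(p >= p.max() / 2.0)[0]
    return int(above[-1] - above[0] + 1)

z = np.arange(200)
hr_peak = np.exp(-0.5 * ((z - 100) / 8.0) ** 2)  # broader, HR-like peak
sr_peak = np.exp(-0.5 * ((z - 100) / 4.0) ** 2)  # sharper, SR-like peak
```

A narrower FWHM of the SR profile relative to the HR profile is the quantitative counterpart of the sharper peaks seen in Figs. 8(d1) and 8(d2).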

Fig. 8. OCT-SSR on SS-OCT cucumber images. (a) is image with 14 µm resolution, and (b) is super-resolved image of (a). (c) is GT image with 10 µm resolution. (d1) - (d2) are intensity profiles of the lateral position pointed out by the blue dashed lines in red and yellow boxes in (a)-(c), respectively. Scale bar in the image is 250 µm.

Furthermore, experiments with the same setup were performed on zebrafish and human thyroid tissues. The results are given in Fig. 9. Figures 9(a1) and 9(c1) are collected zebrafish SS-OCT images with 14 µm and 10 µm resolution, respectively. The SR image of Fig. 9(a1) is shown in Fig. 9(b1). The SRNet was trained on a dataset consisting of 5324 pairs of zebrafish image patches. The muscle texture of the zebrafish is shown in the red boxes, and it is clearer in Fig. 9(b1) than in Fig. 9(a1) after OCT-SSR. The details in the deeper regions of the sample are given in the yellow boxes, demonstrating that the fine speckle is recovered with the improvement of the axial resolution.

Fig. 9. OCT-SSR results of SS-OCT zebrafish and human thyroid images. (a1) - (c1) are the HR zebrafish image, super-resolved image of (a1), and the GT image respectively. (a2) - (c2) are human thyroid HR, SR, and GT images, respectively. Scale bar in images is 250 µm.

The results for ex vivo human thyroid tissues are given in Figs. 9(a2)–9(c2), where the SRNet was trained on a dataset consisting of 5223 pairs of human thyroid image patches. Figures 9(a2) and 9(c2) are thyroid SS-OCT images with 14 µm and 10 µm resolution, respectively. The SR image of Fig. 9(a2) is shown in Fig. 9(b2). Similarly, resolution enhancement and speckle recovery after OCT-SSR can be clearly seen in Fig. 9(b2). The experiments on cucumber, zebrafish, and human thyroid tissues demonstrated the adaptability of the method to SS-OCT images of different samples.

4. Discussions

In this paper, we proposed an OCT-SSR pipeline for super-resolving OCT images based on deep learning. The ESRAMGAN was constructed by combining the ACB with ESRGAN, which improved the performance without increasing the network complexity. The experimental results on images of different resolutions showed the feasibility of the method. The SR results on images collected by the home-made SD-OCT and SS-OCT systems indicate that prior knowledge of the OCT system can be used to improve its axial resolution, and that self-supervised approaches can, to a certain extent, break the limitation imposed by the source bandwidth on the axial resolution of the OCT system.

In order to super-resolve the HR image, it is necessary to ensure the correlation between test data and training data. Therefore, in the test phase, the HR image should be resampled so that it has similar speckle distribution to the LR image, which is different from the available SR algorithms. For evaluating the OCT-SSR methods in depth, we further studied the following issues.

4.1 Analysis of image resampling in OCT-SSR

To analyze the necessity of image resampling in OCT-SSR, an ablation study was performed. The original un-resampled and the resampled test images were super-resolved to generate SR images, and the results are displayed in Fig. 10 and Table 2. The ESRAMGAN trained on zebrafish images in Section 3.2 was again used here for image super-resolution. Figures 10(a1), 10(b1), and 10(c1) are SD-OCT images with 12 µm, 8 µm, and 5.3 µm axial resolutions, respectively. Figures 10(a2), 10(b2), and 10(c2) are SR images whose inputs are the un-resampled 12 µm, 8 µm, and 5.3 µm images, respectively, while Figs. 10(a3), 10(b3), and 10(c3) are the SR outputs of the resampled images. The blue box shows the texture of the zebrafish myotome. Compared with Figs. 10(a3), 10(b3), and 10(c3), the images without resampling (Figs. 10(a2), 10(b2), and 10(c2)) look sharper in detail but show increasingly repetitive and incorrect texture distributions as the resolution increases, which is also verified quantitatively: most of the results in Table 2 are worse than the corresponding results in Table 1. The degradation of SR image quality occurs because the un-resampled input images have low similarity to the LR images in the training dataset. These experimental results confirm the importance of the resampling step.

Fig. 10. Ablation study on image resampling in OCT-SSR. (a1) - (c1) are zebrafish dorsal images with axial resolutions of 12 µm, 8 µm, and 5.3 µm, respectively. (a2) - (c2) are SR images of (a1) – (c1), respectively. (a3) - (c3) are SR images of axial resampled images of (a1) – (c1), respectively. Scale bar in the image is 250 µm.

Table 2. Quantitative evaluation parameters of ESRAMGAN on different axial resolution images without resampling.

4.2 Suppression of sidelobes by OCT-SSR

More importantly, we found that the OCT-SSR method can also suppress the sidelobes caused by the non-Gaussian spectrum of the light source while super-resolving the images. To clearly demonstrate this advantage, we used orange pulp with hole-like structures as the test sample. The SRNet trained in Section 3.3 was used for orange pulp OCT image super-resolution, and the result is shown in Fig. 11. Figure 11(a) is an orange HR image with 8 µm resolution, and Fig. 11(b) is the corresponding SR result. Figure 11(c) is the B-scan image with 3.4 µm axial resolution collected by the SD-OCT system, which serves as the GT image. The sub-images in the lower part of Fig. 11 are magnified views of the regions framed by the squares. The quantitative evaluation results are also shown in Fig. 11.

Fig. 11. OCT images of orange pulp. (a) 8 µm. (b) Super-resolved image of (a). (c) 3.4 µm. Scale bar in the image is 250 µm.

As shown in Figs. 11(a) and (b), the cell walls of the orange pulp are more clearly visible in the SR image, and the boundaries are more prominent than in the HR image. Meanwhile, the sidelobes in the GT image (yellow sub-image in Fig. 11(c)) are not visible in the SR image (Fig. 11(b)), which means that the method can suppress most of the sidelobes when strong reflections exist at the sample surface.

Sidelobes exist in the collected GT images because of the irregular spectral shape of the light source. Since the HR image is obtained by spectral shaping, and the Gaussian spectral shaping method suppresses the sidelobes of the original signal, the generated SR image has no sidelobes and better image quality than the GT image. The decrease of PSNR in the SR images may be due to the differences in sidelobes between the SR and GT images; on the contrary, the improvements in SSIM and Lpips reflect the visual enhancement.

Moreover, the SRNet used here was trained on zebrafish images, and the experimental results show that a network trained on a single sample type can be used for OCT-SSR of other biological samples, indicating that the OCT-SSR method has good generalization performance.

4.3 Continuous application of super resolution network

To study the super-resolving capability of the SR networks, we also conducted a continuation experiment in which the SR outputs were used as inputs to the SRNet again for further super-resolution. The experimental results are shown in Fig. 12. The ESRAMGAN trained on zebrafish images in Section 3.2 was again used here for image super-resolution.

Fig. 12. Continuous experimental results of the super-resolution network. (a) is the zebrafish dorsal image with 18 µm axial resolution. (b) is the SR image of (a). (c) is the SR image of (b). (d) is the SR image of (c). (e) is the SR image of (d). (f) is the image with 12 µm axial resolution. Scale bar in each image is 250 µm.

Figures 12(a) and 12(f) are OCT images with 18 µm and 12 µm axial resolutions, respectively. Figure 12(b) is the SR result of Fig. 12(a). Figures 12(c), 12(d), and 12(e), labeled SR2, SR3, and SR4, are SR images of the resampled images of Figs. 12(b), 12(c), and 12(d), respectively. The muscle textures in Figs. 12(b) and 12(c) are narrowed in the depth direction, reflecting the improvement of resolution. However, incorrect structures clearly appear in the third and fourth consecutive SR images (SR3 and SR4), which means that a continuous improvement in resolution cannot be achieved by feeding the SR image into the network again and again. Training on LR and HR image pairs with a larger resolution ratio may be a more effective way to further improve the image resolution.

4.4 Recovery of image speckle by OCT-SSR

In addition, to further demonstrate the image speckle recovery of this method, we conducted experiments on biological tissues without strong internal backscattering, such as human fingers and mouse ears. The results are shown in Fig. 13, where the SRNets were trained with 18 µm LR and 8 µm HR images of the respective samples. Figures 13(a1)–(c1) are human finger SD-OCT images, where Figs. 13(a1) and (c1) are the 8 µm and 3.4 µm images and Fig. 13(b1) is the SR image of (a1). The magnified details of the finger surface are shown in the red boxes. After super-resolution, the skin surface texture is reconstructed, and compared with the GT image [Fig. 13(c1)], the sidelobe is suppressed while the details are retained [Fig. 13(b1)]. The yellow box in each image shows a region of interior detail with high signal intensity; even though the backscattering is not strong, the speckle is recovered and the sample structure is clearer after SSR [yellow box in Fig. 13(b1)]. Figures 13(a2)–(c2) are SD-OCT images of the mouse ear, where Figs. 13(a2) and (c2) are the 8 µm and 3.4 µm images and Fig. 13(b2) is the SR image of (a2). Similarly, while the skin surface is reconstructed, the sidelobe is suppressed [red box in Fig. 13(b2)]. In addition, the method achieves speckle restoration in the low signal-to-noise ratio (SNR) region [yellow box in Fig. 13(b2)], which shows that the method is also applicable to low-SNR images.


Fig. 13. Experiments on biological tissues without strong internal backscattering. (a1) and (c1) are human finger images with axial resolutions of 8 µm and 3.4 µm, respectively. (a2) and (c2) are mouse ear images with axial resolutions of 8 µm and 3.4 µm, respectively. (b1) and (b2) are SR images of (a1) and (a2), respectively. Scale bar in each image is 250 µm.


Though this preliminary study shows that prior knowledge of the OCT system can be used to improve its axial resolution, more comprehensive and in-depth research is needed. For example, an extremely important step in the OCT-SSR method is image resampling. We adopted bicubic interpolation and achieved good results, but this interpolation is not consistent with the actual imaging process, which may lead to inaccuracies in continuous image reconstruction. We will focus on a more appropriate resampling method to optimize OCT-SSR in follow-up research. Besides, considering the importance of speckle in super-resolution reconstruction, it is worth studying how to control the speckle size experimentally [42,43], which may bring new ideas for OCT image super-resolution.
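As a concrete illustration of the resampling step discussed above, the sketch below resamples a B-scan along the depth axis only with cubic interpolation (`scipy.ndimage.zoom` with `order=3`, a close stand-in for bicubic). The function name, signature, and image sizes are our own and purely illustrative.

```python
import numpy as np
from scipy.ndimage import zoom

def axial_bicubic_resample(bscan, target_depth):
    """Resample a B-scan to `target_depth` pixels along the axial
    (depth) axis with cubic interpolation, leaving the lateral axis
    untouched -- the kind of resampling used to match a different
    axial pixel grid before super-resolution."""
    factor = target_depth / bscan.shape[0]
    return zoom(bscan, (factor, 1.0), order=3)

bscan = np.random.rand(256, 500)          # depth x lateral
resampled = axial_bicubic_resample(bscan, 512)
print(resampled.shape)                    # (512, 500)
```

Note that such interpolation only redistributes the existing pixel values on a finer grid; it does not model the physical point-spread function, which is why it can diverge from the true imaging process in repeated reconstruction.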

5. Conclusions

In this paper, we developed a deep-learning based OCT-SSR pipeline to improve the axial resolution of OCT images based on the HR and LR spectral data collected by the OCT system itself. Based on our proposed ESRAMGAN, the evaluation metrics of the network outputs can be improved without increasing the complexity of the SRNet. Experimental results from the home-made SD-OCT and SS-OCT systems showed that the OCT-SSR method can suppress sidelobes and improve the axial resolution of the OCT system. We believe it has great potential for breaking the limitation that the source bandwidth imposes on the axial resolution of the OCT system.

Funding

National Natural Science Foundation of China (61875092); Tianjin Foundation of Natural Science (21JCYBJC00260); Beijing-Tianjin-Hebei Basic Research Cooperation Special Program (19JCZDJC65300).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J. Xi, A. Zhang, Z. Liu, W. Liang, L. Y. Lin, S. Yu, and X. D. Li, “Diffractive catheter for ultrahigh-resolution spectral-domain volumetric OCT imaging,” Opt. Lett. 39(7), 2016–2019 (2014). [CrossRef]  

2. W. Yuan, R. Brown, W. Mitzner, L. Yarmus, and X. D. Li, “Super-achromatic monolithic microprobe for ultrahigh-resolution endoscopic optical coherence tomography at 800 nm,” Nat. Commun. 8(1), 1531 (2017). [CrossRef]  

3. W. Yuan, J. Mavadia-Shukla, J. Xi, W. Liang, X. Yu, S. Yu, and X. D. Li, “Optimal operational conditions for supercontinuum-based ultrahigh-resolution endoscopic OCT imaging,” Opt. Lett. 41(2), 250–253 (2016). [CrossRef]  

4. Y.-J. You, C. Wang, Y.-L. Lin, A. Zaytsev, P. Xue, and C.-L. Pan, “Ultrahigh-resolution optical coherence tomography at 1.3 µm central wavelength by using a supercontinuum source pumped by noise-like pulses,” Laser Phys. Lett. 13(2), 025101 (2016). [CrossRef]  

5. M. D. Kulkarni, J. A. Izatt, and M. V. Sivak, “Image enhancement in optical coherence tomography using deconvolution,” Electron. Lett. 33(16), 1365–1367 (1997). [CrossRef]  

6. Y. Liu, Y. Liang, G. Mu, and X. Zhu, “Deconvolution methods for image deblurring in optical coherence tomography,” J. Opt. Soc. Am. A 26(1), 72–77 (2009). [CrossRef]  

7. E. Bousi and C. Pitris, “Axial resolution improvement by modulated deconvolution in Fourier domain optical coherence tomography,” J. Biomed. Opt. 17(7), 071307 (2012). [CrossRef]  

8. J. Gong, B. Liu, Y. L. Kim, Y. Liu, X. Li, and V. Backman, “Optimal spectral reshaping for resolution improvement in optical coherence tomography,” Opt. Express 14(13), 5909–5915 (2006). [CrossRef]  

9. Y. Chen, J. Fingler, and S. E. Fraser, “Multi-shaping technique reduces sidelobe magnitude in optical coherence tomography,” Biomed. Opt. Express 8(11), 5267–5281 (2017). [CrossRef]  

10. X. Liu, S. Chen, D. Cui, X. Yu, and L. Liu, “Spectral estimation optical coherence tomography for axial super-resolution,” Opt. Express 23(20), 26521–26532 (2015). [CrossRef]  

11. M. Asif, M. U. Akram, T. Hassan, A. Shaukat, and R. Waqar, “High resolution OCT image generation using super resolution via sparse representation,” in Eighth International Conference on Graphic and Image Processing (ICGIP) (2017).

12. Q. Xie and N. Sang, “Super-resolution for medical image via sparse representation and adaptive M-estimator,” West Indian Med. J. 65(2), 271–276 (2015). [CrossRef]  

13. C. Zhao, A. Carass, B. E. Dewey, and J. L. Prince, “Self super-resolution for magnetic resonance images using deep networks,” in International Symposium on Biomedical Imaging (ISBI) (2018), pp. 365–368.

14. N. Zhao, Q. Wei, A. Basarab, D. Kouamé, and J.-Y. Tourneret, “Single image super-resolution of medical ultrasound images using a fast algorithm,” in International Symposium on Biomedical Imaging (ISBI) (2016), pp. 473–476.

15. L. Fang, S. Li, R. P. McNabb, Q. Nie, A. N. Kuo, C. A. Toth, J. A. Izatt, and S. Farsiu, “Fast acquisition and reconstruction of optical coherence tomography images via sparse representation,” IEEE Trans. Med. Imaging 32(11), 2034–2049 (2013). [CrossRef]  

16. L. Fang, S. Li, D. Cunefare, and S. Farsiu, “Segmentation based sparse reconstruction of optical coherence tomography images,” IEEE Trans. Med. Imaging 36(2), 407–421 (2017). [CrossRef]  

17. X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, and C. C. Loy, “ESRGAN: enhanced super-resolution generative adversarial networks,” in European Conference on Computer Vision (2018), pp. 63–79.

18. K. Zhang, H. Hu, K. Philbrick, G. M. Conte, J. D. Sobek, P. Rouzrokh, and B. J. Erickson, “SOUP-GAN: super-resolution MRI using generative adversarial networks,” Tomography 8(2), 905–919 (2022). [CrossRef]

19. Y. Mei, Y. Fan, Y. Zhang, J. Yu, Y. Zhou, D. Liu, Y. Fu, T. S. Huang, and H. Shi, “Pyramid attention networks for image restoration,” arXiv, arXiv:2004.13824 (2020). [CrossRef]  

20. Z. Lu, J. Li, H. Liu, C. Huang, L. Zhang, and T. Zeng, “Transformer for single image super-resolution,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022), pp. 457–466.

21. Y. Xu, B. M. Williams, B. Al-Bander, Z. Yan, Y.-C. Shen, and Y. Zheng, “Improving the resolution of retinal OCT with deep learning,” in Medical Image Understanding and Analysis (MIUA) (2018), pp. 325–332.

22. Y. Huang, Z. Lu, Z. Shao, M. Ran, J. Zhou, L. Fang, and Y. Zhang, “Simultaneous denoising and super-resolution of optical coherence tomography images based on generative adversarial network,” Opt. Express 27(9), 12289–12307 (2019). [CrossRef]  

23. V. Das, S. Dandapat, and P. K. Bora, “Unsupervised super-resolution of OCT images using generative adversarial network for improved age-related macular degeneration diagnosis,” IEEE Sens. J. 20(15), 8746–8756 (2020). [CrossRef]  

24. S. Cao, X. Yao, N. Koirala, B. Brott, S. Litovsky, Y. Ling, and Y. Gan, “Super-resolution technology to simultaneously improve optical & digital resolution of optical coherence tomography via deep learning,” in 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) (2020), pp. 1879–1882.

25. H. Pan, D. Yang, Z. Yuan, and Y. Liang, “More realistic low-resolution OCT image generation approach for training deep neural networks,” OSA Continuum 3(11), 3197–3205 (2020). [CrossRef]  

26. Z. Yuan, D. Yang, H. Pan, and Y. Liang, “Axial super-resolution study for optical coherence tomography images via deep learning,” IEEE Access 8, 204941–204950 (2020). [CrossRef]  

27. K. Liang, X. Liu, S. Chen, J. Xie, W. Q. Lee, L. Liu, and H. K. Lee, “Resolution enhancement and realistic speckle recovery with generative adversarial modeling of micro-optical coherence tomography,” Biomed. Opt. Express 11(12), 7236–7252 (2020). [CrossRef]  

28. A. Jog, A. Carass, and J.L. Prince, “Self super-resolution for magnetic resonance images,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2016), pp. 553–560.

29. C. Zhao, A. Carass, B. E. Dewey, J. Woo, J. Oh, P. A. Calabresi, D. S. Reich, P. Sati, D. Pham, and J. L. Prince, “A deep learning based anti-aliasing self super-resolution algorithm for MRI,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2018), pp. 100–108.

30. Y. He, J. Yao, L. Liu, Y. Gao, J. Yu, S. Ye, H. Li, and W. Zheng, “Self-supervised deep-learning two-photon microscopy,” Photonics Res. 11(1), 1–11 (2023). [CrossRef]  

31. X. Li, G. Zhang, J. Wu, Y. Zhang, Z. Zhao, X. Lin, H. Qiao, H. Xie, H. Wang, L. Fang, and Q. Dai, “Reinforcing neuron extraction and spike inference in calcium imaging using deep self-supervised denoising,” Nat. Methods 18(11), 1395–1400 (2021). [CrossRef]  

32. C. Qiao, D. Li, Y. Liu, S. Zhang, K. Liu, C. Liu, Y. Guo, T. Jiang, C. Fang, N. Li, Y. Zeng, K. He, X. Zhu, J. Lippincott-Schwartz, Q. Dai, and D. Li, “Rationalized deep learning super-resolution microscopy for sustained live imaging of rapid subcellular processes,” Nat. Biotechnol. 41(3), 367–377 (2023). [CrossRef]  

33. A. Shocher, N. Cohen, and M. Irani, “Zero-shot super-resolution using deep internal learning,” in Computer Vision and Pattern Recognition (2018), pp. 3118–3126.

34. H. Liu, J. Liu, S. Hou, T. Tao, and J. Han, “Perception consistency ultrasound image super-resolution via self-supervised CycleGAN,” Neural Comput. Appl. 35(17), 12331–12341 (2023). [CrossRef]  

35. J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu, and T. Huang, “Free-form image inpainting with gated convolution,” in IEEE/CVF International Conference on Computer Vision (2019), pp. 4471–4480.

36. I. Bello, B. Zoph, A. Vaswani, J. Shlens, and Q. V. Le, “Attention augmented convolutional networks,” in IEEE/CVF International Conference on Computer Vision (2019), pp. 3286–3295.

37. X. Ding, Y. Guo, G. Ding, and J. Han, “ACNet: strengthening the kernel skeletons for powerful CNN via asymmetric convolution blocks,” in International Conference on Computer Vision (2019), pp. 1911–1920.

38. D. Yang, Z. Yuan, Z. Yang, M. Hu, and Y. Liang, “High-resolution polarization-sensitive optical coherence tomography and optical coherence tomography angiography for zebrafish skin imaging,” J. Innov. Opt. Health Sci. 14(06), 2150022 (2021). [CrossRef]  

39. D. Yang, M. Hu, M. Zhang, and Y. Liang, “High-resolution polarization-sensitive optical coherence tomography for zebrafish muscle imaging,” Biomed. Opt. Express 11(10), 5618–5632 (2020). [CrossRef]  

40. Z. Yang, J. Shang, C. Liu, J. Zhang, F. Hou, and Y. Liang, “Intraoperative imaging of oral-maxillofacial lesions using optical coherence tomography,” J. Innov. Opt. Health Sci. 13(02), 2050010 (2020). [CrossRef]  

41. R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018), pp. 586–595.

42. O. Liba, M. D. Lew, E. D. SoRelle, R. Dutta, D. Sen, D. M. Moshfeghi, S. Chu, and A. de la Zerda, “Speckle-modulating optical coherence tomography in living mice and humans,” Nat. Commun. 8(1), 15845 (2017). [CrossRef]  

43. Z. Sun, F. Tuitje, and C. Spielmann, “Toward high contrast and high-resolution microscopic ghost imaging,” Opt. Express 27(23), 33652–33661 (2019). [CrossRef]

