GPU-accelerated iterative method for FD-OCT image reconstruction with an image-level cross-domain regularizer

Abstract

The image reconstruction for Fourier-domain optical coherence tomography (FD-OCT) can be achieved by iterative methods, which offer a more accurate estimation than the traditional inverse discrete Fourier transform (IDFT) reconstruction. However, the existing iterative methods are mostly A-line-based and developed on the CPU, which makes reconstruction slow. Besides, A-line-based reconstruction makes the iterative methods incompatible with most existing image-level image processing techniques. In this paper, we propose an iterative method that enables B-scan-based OCT image reconstruction, which has three major advantages: (1) Large-scale parallelism of the OCT dataset is achieved by using GPU acceleration. (2) A novel image-level cross-domain regularizer was developed, such that image processing can be performed simultaneously with image reconstruction; an enhanced image can be generated directly from the OCT interferogram. (3) The scalability of the proposed method was demonstrated for 3D OCT image reconstruction. Compared with state-of-the-art (SOTA) iterative approaches, the proposed method achieves higher image quality with computational time reduced by orders of magnitude. To further show the image enhancement ability, a comparison was conducted between the proposed method and the conventional workflow, in which an IDFT-reconstructed OCT image is subsequently processed by a total variation-regularized denoising algorithm. The proposed method achieves better performance as evaluated by metrics such as signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR), while the speed is improved by more than 30 times. Real-time image reconstruction at more than 20 B-scans per second was realized with a frame size of 4096 (axial) × 1000 (lateral), which showcases the great potential of the proposed method in real-world applications.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Optical coherence tomography (OCT) is a widely used imaging modality that can provide high-resolution 3D images of biological samples noninvasively [1–5], which makes it suitable for clinical diagnosis and biomedical research [6]. In recent years, there has been renewed interest in using iterative methods to solve the image reconstruction problem in Fourier-domain optical coherence tomography (FD-OCT) [7–24]. Compared with the common inverse discrete Fourier transform (IDFT), iterative methods can produce a more accurate estimation of the sample reflectivity, which offers higher image quality than the conventional IDFT.

Iterative methods have been extensively used for FD-OCT image reconstruction. They can be broadly categorized as non-optimization-based and optimization-based. Among non-optimization-based iterative methods, implementations such as deconvolution [7–10] and phase retrieval [11,12] for FD-OCT have been proposed. One kind of non-optimization-based iterative method is based on the spectral estimation (SE) technique [13–15]. Liu et al. [14] proposed SE-OCT for image reconstruction; high-precision results were obtained with suppressed side lobes, which makes it a possible alternative to the traditional IDFT. A modified SE technique [15] applied an iterative adaptive approach (IAA) to OCT image reconstruction, achieving considerable speed improvement with high image quality and accuracy. For optimization-based iterative methods, image reconstruction is achieved by solving a regularized inverse problem [16–24]. Deconvolution [24] and phase retrieval [25,26] using such methods have also been proposed. Specifically, different regularizers are chosen depending on the prior. For instance, the $l1$-norm regularizer is used when the object is considered sparse [16–18,22,23], while the total variation (TV)-norm, which is suitable for image enhancement and denoising [27–29], is often used for preserving continuous structure. Zhang et al. [19] showed that combining the two regularizers, the $l1$-norm and the TV-norm, could further improve the image quality of FD-OCT images. Some recent algorithmic progress includes iterative projection between the frequency and spatial domains. Zhou et al. [21] proposed an iterative framework in which a forward model is designed to aid the optimization; their framework uses B-scan predictions from the forward model to solve the inverse problem, generating a high-resolution reconstruction together with the refractive index distribution of the sample.

However, the existing iterative methods are mostly A-line-based, which has at least two limitations: (1) The time-sequential reconstruction of A-lines is slow. Taking the optimization-based method as an example, the numerical calculation may slow down the reconstruction when solving the ill-posed inverse problem, especially when the inverse problem is more under-determined. Sequentially reconstructing one B-scan can easily cost several minutes or even hours [23], not to mention reconstructing large-scale 3D volumetric datasets. Moreover, the computational load of large-scale datasets can far exceed the computational capability of the central processing unit (CPU) [30]. (2) Only an A-line-based regularizer can be imposed in the optimization. Although a compressed-sensing (CS) algorithm for FD-OCT image reconstruction has been proposed that uses the GPU to parallelize B-scan images [31], the A-lines are still optimized independently. This processing paradigm is incompatible with most existing image-level image processing techniques.

Herein, we propose an iterative method that performs B-scan-based FD-OCT image reconstruction to address the above issues. In contrast to conventional iterative methods, the proposed method has at least three distinctive advantages, as listed below:

(1) GPU-accelerated large-scale parallelism. Figure 1 compares the computation architectures of the conventional and proposed iterative methods. As shown in Fig. 1(a), the conventional iterative method sequentially solves the inverse problem based on A-lines; it is worth noting that such reconstruction is typically implemented on the CPU. Figure 1(b) shows the proposed architecture. We utilize GPU acceleration, which enables large-scale parallelism for the B-scan-based method. The entire iterative scheme, including a forward model and an inverse solver, is parallelized at the B-scan level. In this way, the computational time can be significantly reduced.

Fig. 1. (a) Conventional reconstruction architecture. The reconstruction is sequential and A-line-based on the CPU. (b) The proposed B-scan-based architecture with the image-level cross-domain regularizer. (c) An example of naive data parallelization.

(2) An image-level cross-domain regularizer. An example of naive data parallelism is shown in Fig. 1(c). In this case, we cannot impose an image-level prior if a per-A-line regularizer is employed. We designed a novel image-level cross-domain regularizer in the inverse solver to tackle this problem. The B-scan-based reconstruction offers an image-level property, which inspires us to optimize the image in the spatial and frequency domains at the same time. As shown in Fig. 1(b), an image-level prior is imposed when solving the inverse problem, which generates an enhanced B-scan image directly from the OCT interferogram. Therefore, additional image processing techniques are no longer needed, and the computational time of both reconstruction and processing can be further reduced. To the best of our knowledge, this is the first time that image processing is achieved simultaneously with image reconstruction.

(3) Scalability for 3D OCT image reconstruction. The proposed method can be scaled from B-scan reconstruction to 3D image reconstruction. The regularizer could also be imposed on the volumetric data. We developed a multiple-GPU architecture that allows large-scale 3D OCT data parallelization. B-scans in the volumetric image can be reconstructed in real time with high quality, which is highly desired in the OCT community.

We designed three studies to demonstrate the above advantages of the proposed method. First, the speed improvement of the proposed method was validated by using simulated air wedge data. We compared the computational time of the proposed method with other state-of-the-art (SOTA) iterative approaches, including the traditional inverse fast Fourier transform (IFFT), auto-regression (AR) [14], recursive fast IAA (RFIAA) [15], and the alternating direction method of multipliers (ADMM) [23]. The image quality and computational time were also evaluated on ex vivo (onion) and in vivo (monkey cornea) datasets. The results show that the proposed method is a good replacement for ADMM in terms of image quality while improving the speed by orders of magnitude.

Second, we examined the image enhancement capability of the proposed method. A comparison was conducted between the proposed method and the conventional workflow, in which an IDFT-reconstructed image is subsequently processed by a TV-regularized denoising algorithm. Here, we used the in vivo (human skin, human retina) datasets. Metrics including the signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), and edge-preserving index (EPI) were evaluated. The results show that the enhanced images produced by the proposed method have better quantitative and qualitative performance: the proposed method shows SNR and CNR advantages over the conventional workflow for both the skin and retina datasets. Furthermore, we can integrate deconvolution into the reconstruction, which further improves the image quality. In addition, the computational time is reduced by more than 48 and 30 times for the skin and retina datasets, respectively.

Third, we showed the scalability of our method. We developed a multiple-GPU architecture to reconstruct the skin and retina volumetric data. As a result, real-time image reconstruction has been realized: the proposed method achieved a frame rate of 20 frames per second (fps) with a frame size of 4096 (axial)$\times$1000 (lateral) for the skin and 21 fps with a frame size of 2048 (axial)$\times$1000 (lateral) for the retina. In summary, the proposed method provides high-quality OCT images with high-speed reconstruction.

2. Theory

Before the FD-OCT acquisition, the interferogram is an analog signal on the finite support [$-z_0-\Delta z,z_0+\Delta z$] [32], which could be written as Eq. (1) if we neglect the DC and auto-correlation terms:

$$I(k)=S(k)\int_{{-}z_0-\Delta z}^{z_0+\Delta z} r_{s}(z)\exp ({-}j2kz)\, dz$$
where $S(k)$ is the emission spectrum of the light source and $r_{s}(z)$ is the reflectivity profile of the sample. After digitization, the measured interferogram, which has been sampled $M$ times in the $k$-domain and truncated to a uniform distribution within the range [$k_0, k_{M-1}$], could be written as:
$$I[k_m]=S(k_m)\int_{{-}z_0-\Delta z}^{z_0+\Delta z} r_{s}(z)\exp ({-}j2k_{m}z)\,dz, \quad m=0,1,\ldots, M-1$$
where $m$ denotes the wavenumber index. In a regular FD-OCT, image reconstruction is performed by using the IDFT. However, the IDFT has been proven to have an axial resolution limitation [23,32]. The reconstructed grid size of the IDFT is $\delta _z=\frac {1}{\Delta k}$, where $\Delta k$ is the spectral bandwidth. This means that the reconstructed grid of the IDFT, which is the digital axial resolution, is determined by the wavenumber range. Besides, the imaging range $\Delta z$ of the IDFT is also pre-determined by the relationship $\Delta z = \frac {1}{\delta _k}$, where $\delta _k$ is the spectral sampling interval [23].

However, iterative methods based on optimization can overcome these limitations of the IDFT. We will discuss the iterative method in the next section.
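As a concrete illustration of these grid relations, the short sketch below evaluates $\delta_z=1/\Delta k$ and $\Delta z=1/\delta_k$ for a hypothetical spectrometer (the numbers are illustrative and not taken from the systems in Table 1; physical prefactors such as $\pi$ and the round-trip factor are omitted, following the conventions of the text):

```python
import numpy as np

# Hypothetical spectrometer parameters, for illustration only.
M = 2048                              # number of spectral samples
k = np.linspace(7.0e6, 8.0e6, M)      # wavenumber samples (rad/m)

Delta_k = k[-1] - k[0]                # total spectral bandwidth
delta_k = k[1] - k[0]                 # spectral sampling interval

delta_z = 1.0 / Delta_k               # IDFT grid size, fixed by the bandwidth
Delta_z = 1.0 / delta_k               # imaging range, fixed by the sampling

print(f"grid size: {delta_z*1e6:.2f} um, imaging range: {Delta_z*1e3:.2f} mm")
# grid size: 1.00 um, imaging range: 2.05 mm
```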

2.1 Iterative methods in FD-OCT image reconstruction

Iterative methods based on optimization are often used to solve a regularized inverse problem, written as:

$$\mathop{\arg\min}_{r}\Vert i-\hat{i} \Vert_2^2+\lambda f_A (r)$$
where $f_A (r)$ is the A-line-based regularizer, which imposes the prior knowledge of the object, and $\lambda$ determines the weight of the regularizer. $i$ is the $M\times 1$ vector of the measured interferogram, and $r$ is the $N\times 1$ vector of the A-line. The notation $\hat {\bullet }$ denotes the estimation. Let $S$ denote the vector of the emission spectrum; $\boldsymbol {C}$ is an $M\times N$ matrix, written as:
$$\begin{pmatrix} \exp (j2 k_0 z_0) & \dots & \exp (j2 k_0 z_{N-1})\\ \vdots & \ddots & \vdots\\ \exp (j2k_{M-1} z_0 ) & \dots & \exp (j2 k_{M-1}z_{N-1}) \end{pmatrix}$$

The estimated interferogram could be written as:

$$\hat{i} = S\otimes C \cdot \hat{r}$$
where the notation $\otimes$ denotes point-wise multiplication.

In Eq. (3), inadequate sampling ($M\leq N$) could result in an under-determined problem. The solution of the inverse problem is not always feasible if we intend to interpret a sub-sampled A-line profile with $N$ points from an FD-OCT measurement with $M$ points [33]. Solving the under-determined problem on a self-defined grid [23] can improve the axial resolution. The reconstruction grid size $\delta _z$ and the grid length $T$ can be chosen to cover the targeted range of the object. By solving Eq. (3) on the self-defined grid, we could obtain the image with arbitrary resolution and imaging depth.
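A minimal sketch of the transformation matrix in Eq. (4) and the forward model of Eq. (5) on a self-defined grid (all names and numbers are illustrative assumptions, not the released implementation):

```python
import numpy as np

def build_C(k, z):
    # Transformation matrix of Eq. (4): C[m, n] = exp(j * 2 * k_m * z_n).
    return np.exp(1j * 2.0 * np.outer(k, z))          # shape (M, N)

def forward_model(S, C, r_hat):
    # Estimated interferogram of Eq. (5): point-wise product of S with C @ r_hat.
    return S * (C @ r_hat)

M, N = 1024, 512
k = np.linspace(7.0e6, 8.0e6, M)                      # wavenumber samples (rad/m)
z = np.arange(N) * 0.5e-6                             # self-defined grid, 0.5 um spacing
C = build_C(k, z)
S = np.exp(-(((k - k.mean()) / (0.2 * (k[-1] - k[0]))) ** 2))  # Gaussian spectrum
r_hat = np.zeros(N, dtype=complex)
r_hat[100] = 1.0                                      # toy object: a single reflector
i_hat = forward_model(S, C, r_hat)                    # length-M estimated interferogram
```

The spacing and length of z are free parameters; choosing a spacing finer than the IDFT grid is what enables the super-resolving reconstruction used in Sec. 3.2.2.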

During the iterative process, the object estimation $r^{p_{i+1}}$ is calculated as:

$$r^{p_{i+1}}=\mathop{\min}_{r} \Vert i-\hat{i}^{p_{i}} \Vert_2^2+\lambda f_A (r^{p_{i}})$$

We should notice that the estimation is based on A-lines. Therefore, only an A-line-based regularizer could be applied. In this case, the computational time to reconstruct a single B-scan image is determined by $M$, $N$, and $p$. $M$ and $N$ are the row and column numbers of the transformation matrix $\boldsymbol {C}$, respectively. $p$ is the iteration number.

2.2 Iterative method with image-level cross-domain regularization

The schematic of the proposed iterative method is shown in Fig. 2(a). The proposed algorithm contains an inverse solver and a forward model. After the OCT acquisition, the measured interferogram is pre-processed by resampling and numerical dispersion compensation. The interferogram is first resampled to a uniform grid in wavenumber. After the resampling, the dispersion compensation is conducted, which generates a complex interferogram. If the degree of the dispersion mismatch is small, the dispersion compensation is not required; therefore, the input interferogram can be either real or complex. In Fig. 2, we use the complex interferogram as an example. In the first iteration, an initial guess of the estimated image should be given; one option is to use the IDFT reconstruction of the input interferogram. After that, the image is sent to a forward model, which generates the estimated interferogram for the next update. Equation (5) serves as the forward model of the proposed method.

Fig. 2. (a) The schematic of our iterative method. The loss function in the inverse solver, represented by the red dashed arrow, consists of two parts: $\mathcal {L}_{fringe}$ is the fringe loss between the measured and estimated interferograms, both of which are complex and separated into real and imaginary parts; $\mathcal {L}_{image}$ is the image loss, which provides the image-level regularization. (b) The gradient descent during the backward propagation. The forward process is represented by the solid arrows, where $\hat {r}$ is the image estimation, $S$ is the emission spectrum, and $\boldsymbol {C}$ is the $M\times N$ matrix in Eq. (4). The estimated interferogram contains the real part $I_{re}$ and the imaginary part $I_{im}$. The backward process is represented by the black dashed arrows. The red dashed arrow represents the loss function $\mathcal {L}$. The derivatives of the loss function with respect to the variable $\boldsymbol {r}$ are shown by the formulas below the black dashed arrows, where $\bar {\bullet }$ denotes the derivative $\frac {\partial {\mathcal {L}}}{\partial {\bullet }}$.

In contrast to the per-A-line reconstruction in Eq. (3), the proposed method is B-scan-based. In the inverse solver, we solve the ill-posed problem based on B-scan to obtain an image estimation at each iteration:

$$\begin{aligned} \mathcal{L}&=\mathcal{L}_{fringe}+\mathcal{L}_{image} \\ &= \mathop{\arg\min}_{\boldsymbol{r}}\Vert \boldsymbol{i} -\boldsymbol{\hat{i}} \Vert_2^2+\beta f_B(\boldsymbol{r}) \end{aligned}$$
where $\beta$ controls the weight of the image-level cross-domain regularizer $f_B(\boldsymbol {r})$. Here, $\boldsymbol {r}$ denotes the OCT B-scan image, or even the 3D image, instead of one A-line, and $\boldsymbol {i}$ denotes the 2D or 3D interferogram. The loss function of the proposed method follows Eq. (7): the loss $\mathcal {L}$ is the combination of two parts. One is the fringe loss $\mathcal {L}_{fringe}$, which is the mean square error between the input and estimated interferograms in the frequency domain. The other is the image loss $\mathcal {L}_{image}$, which is the image-level regularization from the spatial domain. The loss function is represented by the red dashed arrow in Fig. 2.
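A minimal PyTorch sketch of the loss in Eq. (7), with the real and imaginary parts of the interferograms compared separately as described for Fig. 2; $f_B$ and $\beta$ are placeholders for the regularizer and its weight (function names are ours, not from the released code):

```python
import torch

def fringe_loss(i_meas, i_est):
    # L_fringe: mean-square error between measured and estimated interferograms,
    # with real and imaginary parts compared separately. For an uncompensated
    # (real-valued) measurement, only the real part is used (see Sec. 2.2).
    loss = torch.mean((i_meas.real - i_est.real) ** 2)
    if torch.is_complex(i_meas):
        loss = loss + torch.mean((i_meas.imag - i_est.imag) ** 2)
    return loss

def total_loss(i_meas, i_est, r_est, f_B, beta):
    # L = L_fringe + L_image, following Eq. (7).
    return fringe_loss(i_meas, i_est) + beta * f_B(r_est)
```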

We have adopted two image-level cross-domain regularizers in this paper: the $l1$-norm and TV-norm. The use of TV-norm regularization in B-scan-based reconstruction has an advantage over that in A-line-based reconstruction. In contrast, the advantage of the $l1$-norm regularization for B-scan-based reconstruction is not obvious. Assume the B-scan image has $M$ rows and $N$ columns. According to Eq. (7), the loss function with the $l1$-norm regularizer could be written as:

$$\begin{aligned} &\mathop{\arg\min}_{\boldsymbol{r}}\Vert \boldsymbol{i} -\boldsymbol{\hat{i}} \Vert_2^2+\beta \Vert \boldsymbol{r}\Vert_{1} \\& =\mathop{\arg\min}_{\boldsymbol{r}}\Vert \boldsymbol{i} -\boldsymbol{\hat{i}} \Vert_2^2+\beta \sum_{n=1}^{N}\sum_{m=1}^{M} |r_{m n}| \end{aligned}$$

The advantage of the regularization on the B-scan is that the sparsity can be incorporated over both the axial and lateral directions. Considering the $l1$-norm regularization over the lateral direction, Eq. (8) could be further written as:

$$\begin{aligned} &=\mathop{\arg\min}_{\boldsymbol{r}}\Vert \boldsymbol{i} -\boldsymbol{\hat{i}} \Vert_2^2+\beta (\sum_{m=1}^{M}|r_{m 1}|+\sum_{m=1}^{M}|r_{m 2}|+\cdots+\sum_{m=1}^{M}|r_{m N}|) \\&=\mathop{\arg\min}_{\boldsymbol{r}}\Vert \boldsymbol{i} -\boldsymbol{\hat{i}} \Vert_2^2+\beta (\Vert r_{x_1} \Vert_1 +\Vert r_{x_2} \Vert_1+\cdots+\Vert r_{x_N} \Vert_1) \end{aligned}$$
where $r_{x_1} , r_{x_2},\ldots, r_{x_N}$ are the columns of the B-scan image $\boldsymbol {r}$. Similarly, if the $l1$-norm regularization is over the axial direction, Eq. (8) could be written as:
$$\begin{aligned} &=\mathop{\arg\min}_{\boldsymbol{r}}\Vert \boldsymbol{i} -\boldsymbol{\hat{i}} \Vert_2^2+\beta (\sum_{n=1}^{N}|r_{1 n}|+\sum_{n=1}^{N}|r_{2 n}|+\cdots+\sum_{n=1}^{N}|r_{M n}|) \\&=\mathop{\arg\min}_{\boldsymbol{r}}\Vert \boldsymbol{i} -\boldsymbol{\hat{i}} \Vert_2^2+\beta (\Vert r_{z_1} \Vert_1 +\Vert r_{z_2} \Vert_1+\cdots+\Vert r_{z_M} \Vert_1) \end{aligned}$$
where $r_{z_1} , r_{z_2},\ldots, r_{z_M}$ are the rows of the B-scan image $\boldsymbol {r}$.
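Equations (9) and (10) are simply regroupings of the entry-wise sum in Eq. (8), so the B-scan $l1$-norm encodes the sparsity in both directions at once; a quick numerical check of this identity (illustrative only):

```python
import torch

r = torch.randn(64, 32, dtype=torch.complex64)   # toy B-scan with M=64, N=32

l1_full = r.abs().sum()                                          # Eq. (8)
l1_cols = sum(r[:, n].abs().sum() for n in range(r.shape[1]))    # Eq. (9), lateral
l1_rows = sum(r[m, :].abs().sum() for m in range(r.shape[0]))    # Eq. (10), axial

assert torch.allclose(l1_full, l1_cols) and torch.allclose(l1_full, l1_rows)
```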

The A-line-based $l1$-norm regularization can also incorporate the sparsity in the axial direction. However, in contrast to Eq. (10), the A-line-based $l1$-norm regularization is imposed independently on each A-line. Following Eq. (3), we have:

$$\begin{aligned}& \mathop{\arg\min}_{r_{x_1}}\Vert i_{x_1}-\hat{i}_{x_1} \Vert_2^2+\lambda \Vert r_{x_1}\Vert_1 \\&\mathop{\arg\min}_{r_{x_2}}\Vert i_{x_2}-\hat{i}_{x_2} \Vert_2^2+\lambda \Vert r_{x_2} \Vert_1 \\&\cdots\cdots \\&\mathop{\arg\min}_{r_{x_N}}\Vert i_{x_N}-\hat{i}_{x_N} \Vert_2^2+\lambda \Vert r_{x_N} \Vert_1 \end{aligned}$$
where $i_{x_1} , i_{x_2},\ldots, i_{x_N}$ are the columns of the measured interferogram, and $\hat {i}_{x_1} , \hat {i}_{x_2},\ldots, \hat {i}_{x_N}$ are the columns of the estimated interferogram.

Figure 2(b) illustrates the gradient descent algorithm via the chain rule. This backward process calculates the derivatives of the loss function with respect to the input $\boldsymbol {r}$ to update the subsequent image estimation, which is then sent to the forward model to generate the estimated interferogram. In both the forward and backward propagation, the real and imaginary parts are calculated separately, and the fringe loss $\mathcal {L}_{fringe}$ is generated by summing the losses of the real and imaginary parts. Note that if the interferogram is not dispersion compensated, we only consider the real part. The algorithm finally produces processed, high-quality OCT images after the iterations. One choice to stop the iteration is to set a termination criterion based on the difference between the losses of two successive iterations: once the loss difference is smaller than a preset value, the iteration is terminated. To avoid an abrupt stop before convergence, we also set a minimum iteration number within which the algorithm will not be terminated.
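Putting the pieces together, a condensed sketch of the whole scheme under the above rules; it reuses the loss helpers sketched earlier, leaves the regularizer f_B abstract, and assumes a PyTorch version whose Adam optimizer accepts complex parameters (in the paper, the real and imaginary components are optimized separately):

```python
import torch

def reconstruct(i_meas, S, C, f_B, beta, lr=1e-2,
                max_iter=500, min_iter=50, tol=1e-6):
    # B-scan reconstruction sketch: gradient descent (Adam) on Eq. (7).
    # i_meas: (M, L) complex interferogram; S: (M,) spectrum; C: (M, N) matrix.
    N = C.shape[1]
    # Initial guess: the IDFT image of the input interferogram (assumes N <= M here).
    r = torch.fft.ifft(i_meas, dim=0)[:N].clone().requires_grad_(True)
    opt = torch.optim.Adam([r], lr=lr)
    prev = None
    for it in range(max_iter):
        opt.zero_grad()
        i_est = S[:, None] * (C @ r)          # forward model, Eq. (5), per column
        loss = fringe_loss(i_meas, i_est) + beta * f_B(r)
        loss.backward()                       # autograd provides Fig. 2(b)'s gradients
        opt.step()
        # Stopping rule: loss difference below tol, but never before min_iter.
        if prev is not None and it >= min_iter and abs(prev - loss.item()) < tol:
            break
        prev = loss.item()
    return r.detach()
```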

3. Methods

To validate the effectiveness of the proposed method, we compared the computational time and image quality of the proposed method and other iterative methods [14,15,23]. We used both simulated and experimental datasets. Simulated air wedge data were used to demonstrate the advantages of the proposed method in processing speed and axial resolving capability. To demonstrate the feasibility of reconstructing biological tissues, we applied the proposed method to ex vivo (onion) and in vivo (monkey cornea, human skin, human retina) datasets, which were taken by four different FD-OCT systems. We adopted a different image-level cross-domain regularizer in each case according to the sample.

The comparison study was implemented in the MATLAB environment on the CPU, while the proposed method was implemented on both the CPU and GPUs. The CPU is an Intel Core i7-8700 with 16 GB RAM, and the GPU is an Nvidia GeForce RTX 3090 graphics card with 10496 CUDA cores. The PyTorch toolkit enables a convenient GPU implementation of the proposed method. We built both the inverse and forward models in parallel on GPUs to maximize the acceleration. Following the loss function in Eq. (7), the inverse problem was solved by utilizing Adam [34], a variant of the stochastic gradient descent (SGD) optimizer that is extensively used in deep learning.

To demonstrate the scalability for 3D image reconstruction, we used the experimental 3D datasets of the skin and retina. We developed a multiple-GPU architecture to parallelize the volumetric data instead of individual B-scans, using PyTorch's $torch.nn.DataParallel$ facility to parallelize across four GPU devices, as sketched below. The rest of this section gives the details of our implementations.
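A sketch of this multiple-GPU parallelization: the forward model of Eq. (5) is wrapped in an nn.Module so that torch.nn.DataParallel can scatter the B-scans of a volume across the devices (module and variable names are our assumptions, not the released code; S and C are the spectrum and matrix of Eq. (5) as torch tensors):

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    # Eq. (5) applied to a batch of B-scans: input r of shape (B, N, L),
    # output estimated interferograms of shape (B, M, L).
    def __init__(self, S, C):
        super().__init__()
        self.register_buffer("S", S)   # (M,) emission spectrum
        self.register_buffer("C", C)   # (M, N) transformation matrix of Eq. (4)

    def forward(self, r):
        return self.S[None, :, None] * torch.einsum("mn,bnl->bml", self.C, r)

# Scatter a 100-B-scan volume across four GPUs; each device handles its shard.
model = nn.DataParallel(ForwardModel(S, C), device_ids=[0, 1, 2, 3]).cuda()
i_est = model(r_volume)   # r_volume: (100, N, L) tensor resident on GPU 0
```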

3.1 OCT dataset collection and pre-processing

We used raw data taken from four different FD-OCT systems to verify the generalization of the proposed method. Table 1 gives the configuration of the OCT systems, including the central wavelength, bandwidth, axial resolution, line rate, and incident power on the sample. The onion dataset was collected by using a commercial spectral-domain OCT (SD-OCT) system (GAN620C1, Thorlabs), and the cornea dataset was collected by using an ultrahigh-resolution (UHR) SD-OCT [35]. We also built two swept-source optical coherence tomography (SS-OCT) systems: a microscopic system and an eye system, which provided the human skin and retina datasets, respectively. We used two-axis galvo mirrors (GVS002/M, Thorlabs) to steer the collimated light for scanning in the two custom-built systems. The interferogram was detected with a balanced photoreceiver (PDB48xC-AC, Thorlabs) and digitized (ATS9371, Alazartech) at 1 GS/s with 4096 data points for each sweep. During the in vivo experiments, the power of the light incident on the sample was attenuated to conform to the ANSI safety standard for light exposure.

Table 1. Configuration of OCT systems used for data acquisition.

The raw interferogram was pre-processed in each case, which includes background subtraction of the source spectrum and wavenumber-linear resampling. Dispersion compensation was performed on the cornea and retina datasets, which makes their interferograms complex. The onion and human skin datasets were not compensated, so their interferograms remain real. After the pre-processing, the interferograms of the B-scan images were used as the input of the proposed method. Note that the emission spectrum of each system was measured and employed in the reconstruction.
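A sketch of this pre-processing chain for one spectral fringe, assuming a calibrated pixel-to-wavenumber mapping k_of_pixel and second- and third-order dispersion coefficients a2 and a3 (all names and the phase model are illustrative assumptions, not the paper's calibration):

```python
import numpy as np
from scipy.signal import hilbert

def preprocess(fringe, background, k_of_pixel, a2=0.0, a3=0.0):
    # Background subtraction of the source spectrum.
    fringe = fringe - background
    # Resample onto a uniform wavenumber grid (k_of_pixel must be increasing).
    k_lin = np.linspace(k_of_pixel[0], k_of_pixel[-1], fringe.size)
    fringe = np.interp(k_lin, k_of_pixel, fringe)
    if a2 == 0.0 and a3 == 0.0:
        return fringe                  # no compensation: stays real (onion, skin)
    # Numerical dispersion compensation: phase-correct the analytic signal,
    # which makes the interferogram complex (cornea, retina).
    dk = k_lin - k_lin.mean()
    return hilbert(fringe) * np.exp(-1j * (a2 * dk**2 + a3 * dk**3))
```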

3.2 Image reconstruction of simulated and experimental datasets

3.2.1 Speed improvement: simulated air wedge reconstruction

First, simulated air wedge data were designed to demonstrate the axial resolution and speed improvement in a controllable situation. We simulated two reflectors whose separation ranges from 0 to 1.6 $\mu$m along the lateral direction. The axial grid size of the ground-truth air wedge was chosen as 0.5 $\mu$m, which corresponds to the object grid size. The spectrum of the light source $S(k)$ was simulated as a Gaussian shape. 1024 spectral samples were used to form the interferogram of each of 150 A-lines, which generates a simulated interferogram with a data size of 1024$\times$150.

We compared the proposed method with other iterative methods, including the IFFT, AR, RFIAA, and ADMM [14,15,23]. Here, we use the optimization-based iterative method ADMM as the image-quality benchmark for the proposed method, while the AR and RFIAA methods both improve the axial resolution with fast computational times. The comparison with these methods is therefore helpful in evaluating the proposed method.

For AR and RFIAA, we keep the order at 1/3 of the spectrum length; the parameters for RFIAA are all set to the default values in [15]. For the proposed method, the $l1$-norm regularizer was used. The regularization weight controls the sparsity; it was set to 10 for ADMM and 50 for the proposed method to achieve the best performance. The reconstruction grids of ADMM and the proposed method were both set to 1 $\mu$m with 512 grid points. Since GPU implementations of the compared methods (AR, RFIAA, and ADMM) are unavailable, these methods were run on the CPU in our study. We expanded the data by increasing the number of A-lines to 100k to demonstrate parallelism for larger data sizes.

3.2.2 Image quality: onion and cornea reconstruction

We employed the experimental datasets of the onion and monkey cornea to demonstrate the image quality and computational time advantages of our method. The interferograms of the onion and cornea have data sizes of 2048$\times$1999 and 1700$\times$600, respectively.

Computational time and image quality were evaluated for the proposed method and the other iterative methods [14,15,23], with ADMM as the image-quality benchmark. We implemented both sequential and parallel versions to evaluate the time improvement. For the onion data, we used the full and half bandwidth of the spectrum to better evaluate the influence of bandwidth truncation on the reconstructed image of each method.

For AR and RFIAA, the bandwidth and parameters are set to the default values in [15]. For ADMM and the proposed method, the $l1$-norm regularizer is imposed to promote sparsity, since the onion and cornea are considered sparse objects. The weight $\beta$ was set to 1000. The reconstruction grid size $\delta _z$ was set to half of the IDFT grid size [23,32] for super-resolving. We chose the grid length T=1000 for the onion and T=600 for the cornea. The iteration numbers were set to 100 and 200 for the onion and cornea, respectively.

3.2.3 Image enhancement: skin and retina reconstruction

Since skin and retina samples are not ideally sparse objects, a TV-norm regularizer is suitable for improving their image quality, achieving effects such as edge preservation [36,37], artifact removal, and noise removal [38,39]. Here, we imposed the image-level cross-domain TV-norm regularization; a sketch of one such regularizer follows. The interferograms of both the skin and retina were acquired with a data size of 4096$\times$1000. The weight $\beta$ of the TV-norm regularizer was set to 0.01 for the skin image and 0.05 for the retina image. The grid size was set the same as the IDFT grid size, with the grid length T=600 for both the skin and retina images. The iteration numbers for the skin and retina are 200 and 500, respectively. The performance of the proposed method was evaluated with and without deconvolution.
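The paper does not spell out the exact TV variant, so the following anisotropic TV-norm on the B-scan magnitude is one common choice; it can be passed as $f_B$ to the reconstruction sketch in Sec. 2.2 with the weights given above:

```python
import torch

def tv_norm(r):
    # Anisotropic total variation of a B-scan: sum of absolute finite
    # differences along the axial (rows) and lateral (columns) directions.
    a = r.abs() if torch.is_complex(r) else r
    tv_axial = (a[1:, :] - a[:-1, :]).abs().sum()
    tv_lateral = (a[:, 1:] - a[:, :-1]).abs().sum()
    return tv_axial + tv_lateral
```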

For comparison, images after IFFT reconstruction were processed by a typical TV-based denoising algorithm [40]. We evaluated the images using widely-used metrics, including SNR, CNR, EPI, and computational time, to quantify the effects of each method.

The SNR indicates the ratio of the OCT signal level to the noise background level. Similar to the SNR, the CNR measures the contrast between the features in the regions of interest (ROIs) and the noisy background. The SNR and CNR are defined as follows:

$$SNR=10\cdot \log_{10}\left(\frac{\mu_s^2}{\sigma_b^2}\right)$$
$$CNR=10\cdot \log_{10}\left(\frac{\mu_s-\mu_b}{0.5\cdot \sqrt{\sigma_s^2+\sigma_b^2}}\right)$$
where $\mu _s$ and $\sigma _s$ are the mean and standard deviation of the ROIs, marked as the shaded red rectangles in Fig. 6 and Fig. 7, and $\mu _b$ and $\sigma _b$ are the mean and standard deviation of the background noise.
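Both metrics can be computed directly from the linear-intensity image and the marked ROIs; a sketch following Eqs. (12) and (13) (the ROI handling is our assumption):

```python
import numpy as np

def snr_cnr(img, signal_roi, background_roi):
    # SNR and CNR of Eqs. (12)-(13); ROIs are boolean masks or index slices.
    s, b = img[signal_roi], img[background_roi]
    mu_s, sigma_s = s.mean(), s.std()
    mu_b, sigma_b = b.mean(), b.std()
    snr = 10 * np.log10(mu_s**2 / sigma_b**2)
    cnr = 10 * np.log10((mu_s - mu_b) / (0.5 * np.sqrt(sigma_s**2 + sigma_b**2)))
    return snr, cnr    # assumes mu_s > mu_b so the CNR logarithm is defined
```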

The EPI measures the local correlation between two edge regions selected from the denoised image and the original reference image [41]. It could be used to describe the degree of edge preservation in a denoised image, defined as:

$$EPI=\frac{\Gamma(s-\bar{s},\hat{s}-\bar{\hat{s}})}{\sqrt{\Gamma(s-\bar{s},s-\bar{s}) \cdot \Gamma(\hat{s}-\bar{\hat{s}},\hat{s}-\bar{\hat{s}})}}$$
where $s$ and $\hat {s}$ are the reference image and the estimated image. The expression of $\Gamma$ is:
$$\Gamma(s_1,s_2)=\sum_{i,j \in ROIs} s_1(i,j) \cdot s_2(i,j)$$
$\bar {s}$ and $\bar {\hat {s}}$ are the mean values of $s(i,j)$ and $\hat {s}(i,j)$ in the ROIs, respectively. The ROIs contain edge structure, marked as the shaded blue rectangles in Fig. 6 and Fig. 7. Here, the IFFT reconstruction image was used as the reference, with an EPI of 1.
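A corresponding EPI sketch, written as the normalized correlation of Eqs. (14) and (15) over an edge-containing ROI, with the IFFT reconstruction as the reference:

```python
import numpy as np

def epi(reference, denoised, roi):
    # Edge-preserving index of Eq. (14); roi selects the edge region.
    s = reference[roi] - reference[roi].mean()
    s_hat = denoised[roi] - denoised[roi].mean()
    gamma = lambda a, b: np.sum(a * b)     # correlation of Eq. (15)
    return gamma(s, s_hat) / np.sqrt(gamma(s, s) * gamma(s_hat, s_hat))
```

By construction, epi(reference, reference, roi) equals 1, consistent with the IFFT reference above.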

3.2.4 Scalability: 3D image reconstruction

Most existing iterative methods cannot reconstruct FD-OCT images in real time. The proposed method can be rescaled to 3D OCT image reconstruction, which enables real-time reconstruction of B-scan images. Here, B-scans are not reconstructed sequentially; instead, the 3D image is reconstructed in parallel, which allows real-time reconstruction of each B-scan. The computational time can be further reduced by implementing a multiple-GPU architecture, which uses more than one GPU in parallel. Four GPU devices were used to parallelize 100 B-scans of the skin and retina volumetric interferograms (4096$\times$1000$\times$100). Since the data size greatly influences the computational time, we evaluated the computational time using different interferogram data sizes: the interferogram was down-sampled along the first (axial) dimension to 2048 and 1024 points, which are typical values in OCT signal acquisition.

4. Results

4.1 Results of simulated air wedge reconstruction

Table 2 shows the computational time for a B-scan with a size of 1024$\times$150. With CPU parallelization, the proposed method consumes 0.27 s on average, which is at the same level as RFIAA but around 100 times faster than ADMM. After GPU acceleration, the proposed method reaches 0.09 s per B-scan. Moreover, the proposed method provides a real-time reconstruction of 0.22 s for a 1024$\times$100$\times$1000 dataset by exploiting the massive parallelism of the GPU. In other words, for a typical 100 kHz SS-OCT system, we only need 0.22 s to process the data generated in 1 second.

Table 2. Computational time of different methods for the air wedge B-scan image.

The reconstructed results of the air wedge are shown in Fig. 3. All the iterative methods manifest a resolution improvement. Although there are some unresolved parts at the tip region in the results of ADMM and the proposed method, their point spread functions are narrower than those of the other iterative methods. We can infer from the results that the proposed method achieves an axial resolution comparable to ADMM.

Fig. 3. Results of the air wedge reconstruction. (a) The ground truth. (b) IFFT reconstruction, (c) AR reconstruction, (d) RFIAA reconstruction, (e) ADMM reconstruction, (f) the proposed method.

4.2 Results of onion and cornea reconstruction

The time comparison for the onion and monkey cornea reconstruction is given in Table 3. With the help of GPU acceleration, the proposed method has an average computational time of 0.86 s and 1.15 s for the onion and cornea, respectively, which outperforms the other methods. From the results, we can see that the CPU implementation cannot compete with the GPU implementation except for IFFT reconstruction, as also shown in Table 2. The reason is that the GPU has overhead aside from the computation itself; for example, the memory overhead is the data-transfer time between the CPU and GPU. Besides, because the data size of one B-scan is small, IFFT reconstruction is fast on the GPU, so most of its time is spent on the GPU overhead instead of the actual reconstruction.

Table 3. Computational time of different methods for the onion and monkey cornea reconstruction; s denotes seconds and h denotes hours.

Figure 4 and Fig. 5 show the qualitative results of the onion and monkey cornea reconstruction. The reconstruction results using the half-truncated spectrum are provided for the onion sample for a more pronounced visual effect. The layer structure of the onion can be resolved even if we use half the spectral range, as shown in Figs. 4(g)-(j). The layers of the onion sample appear thinner with AR and RFIAA, but the proposed method shows fewer artifacts and discontinuities. The two-layer structure of the cornea can also be resolved by all the iterative methods. From the results, we can infer that the proposed method achieves the same results as its ADMM counterparts but is faster by orders of magnitude.

Fig. 4. Results of the onion sample reconstructed by IFFT, AR, RFIAA, ADMM, and the proposed method. (a)-(e): using the full spectral bandwidth. (f)-(j): using half of the spectral bandwidth. Zoomed images are provided in the red boxes.

Fig. 5. Results of the cornea reconstructed by: (a) IFFT, (b) AR, (c) RFIAA, (d) ADMM, and (e) the proposed algorithm. Zoomed images are provided in the red boxes.

4.3 Results of skin and retina reconstruction

The quantitative results of the skin and retina reconstruction are shown in Table 4. The results show that the images generated by the proposed method have better noise performance, with higher SNR and CNR than the images generated by the conventional workflow. The improvement in SNR and CNR is more evident in the retina images than in the skin images: because the signal power of the skin image is relatively high, there is only a subtle improvement in its SNR and CNR.

Table 4. Comparison of different metrics for the skin and retina images.

We also implemented deconvolution in the compared methods. The proposed iterative method with deconvolution outperforms the other methods in SNR and CNR, with a slight drop in EPI, as shown in Table 4. Furthermore, the computational times for the skin and retina reconstruction with deconvolution are 1.42 s and 2.40 s on average, approximately 48 and 30 times faster than the conventional workflow, respectively. In summary, the proposed method outperforms the others in detail preservation, noise removal, and computational time.

In Fig. 6 and Fig. 7, the proposed method with deconvolution shows more detailed structures. In Fig. 6, the boundary between the epidermis and dermis is enhanced. The red arrows mark the shadows of vessels, where we can observe a contrast enhancement with the proposed method. The proposed method with deconvolution can also remove artifacts, highlighted with white arrows. Figure 7(e) shows a clear visualization of retinal structures, such as the boundary between the outer segment layer (OS) and the retinal pigmented epithelium (RPE) (red arrows), and smaller choroidal vessels (white arrows).

Fig. 6. Results of the human skin. (a) IFFT reconstruction, (b) IFFT reconstruction processed by a TV-based denoising algorithm, (c) the proposed method, (d) IFFT reconstruction with deconvolution, processed by a TV-based denoising algorithm, (e) the proposed method with deconvolution. The shaded red rectangles mark the ROIs for SNR and CNR, and the shaded blue rectangles mark the ROIs for EPI. Zoomed images are given in the green and yellow boxes. The red arrows in the green boxes show a contrast enhancement of structures. The white arrows in the yellow boxes mark the artifacts.

Fig. 7. Results of the human retina. (a) IFFT reconstruction, (b) IFFT reconstruction processed by a TV-based denoising algorithm, (c) the proposed method, (d) IFFT reconstruction with deconvolution, processed by a TV-based denoising algorithm, (e) the proposed method with deconvolution. The shaded red rectangles mark the ROIs for SNR and CNR, and the shaded blue rectangles mark the ROIs for EPI. Zoomed images are given in the green and yellow boxes. The red arrows in the green boxes mark the boundary of the OS and RPE layers, and the white arrows in the yellow boxes mark the choroidal vessels.

4.4 Result of real-time image reconstruction

We implemented the multiple-GPU architecture to accelerate the proposed iterative method for rendering the skin and retina 3D volumetric data. Table 5 shows how the computational time varies with the number of GPU devices.

Table 5. Time comparison of the multiple-GPU implementation for 3D image reconstruction.

Real-time B-scan image reconstruction has been realized. The average computational time is 4.93 s per volume with a B-scan size of 4096$\times$1000 for the skin and 4.7 s per volume with a B-scan size of 2048$\times$1000 for the retina, which corresponds to 20 fps and 21 fps, respectively. We should note that when the 3D data are parallelized on one GPU, the computational power of the GPU is fully utilized, so the average computational time per B-scan drops effectively. In contrast, sending one B-scan at a time into the GPU may not fully harvest the computational power of the GPU, which leads to a longer average computational time. Therefore, the computational time for one B-scan in the 3D volume parallelization (0.1072 s) can be much faster than that shown in Table 5 (1.5 s).
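The reported frame rates follow directly from the per-volume times, with 100 B-scans per volume (Sec. 3.2.4):

$$\frac{100\ \text{B-scans}}{4.93\ \text{s}} \approx 20\ \text{fps (skin)}, \qquad \frac{100\ \text{B-scans}}{4.7\ \text{s}} \approx 21\ \text{fps (retina)}$$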

The skin 3D image in Fig. 8 shows clear epithelium structures such as the epidermis (ED), dermis (D), stratum corneum (SC), and hair (H). Figure 9 shows the 3D retina images. The white arrows point out retinal structures such as the optic disk (OD), fovea (F), and choroid (C). We also demonstrated the characteristic retinal layers (RNFL, ONL, and RPE) that could be resolved by the proposed method.

Fig. 8. 3D image of the human skin. D dermis; ED epidermis; SC stratum corneum; H hairs.

Fig. 9. 3D image of the human retina. OD optic disk; F fovea; C choroid; RNFL retinal nerve fiber layer; ONL outer nuclear layer; RPE retinal pigmented epithelium.

5. Conclusion and discussion

In conclusion, we proposed an iterative method for FD-OCT image reconstruction. In contrast to the conventional sequential A-line-based iterative methods, the proposed method performs B-scan-based optimization. We applied GPU acceleration to parallelize the OCT image reconstruction. Besides, an image-level cross-domain regularizer was developed, and we have demonstrated that image processing can be performed simultaneously with image reconstruction. Real-time image reconstruction has been achieved by parallelizing the 3D volumetric data with the multiple-GPU architecture.

Results on simulated and experimental data have shown that the proposed method provides better image quality with more features preserved, while the computational time is significantly reduced. A stronger denoising effect can be achieved, as shown in Table 4. Instead of implementing the conventional workflow, in which the image is reconstructed by the IFFT and then processed by a one-shot TV-based denoising algorithm, we solve a constrained inverse problem in which the TV regularizer is added to the loss function at each iteration, leading to a more stable solution [42]. However, the iterative solution tends to smooth out small-scale structures with sharper edges [40,42], which also explains the EPI drop of the proposed method. The EPI of all methods remains above 90$\%$, but the smoothing induced by our iterative method can lead to a slightly lower EPI.

It is important to note that using a TV-norm regularizer smooths the speckles in the estimated image. This smoothing is valid if the speckle is regarded as noise. If the speckle is regarded as the coherent addition of small scatterers, smoothing the speckles may generate an interferogram that does not match the measured interferogram. However, the origin of speckle is complicated, and distinguishing small structures from noise is out of the scope of this paper. Here, we treated the speckle as noise.

Reconstructing a complex interferogram takes more computational time in our implementation because the real and imaginary components are optimized separately. For example, the interferogram of the retina is complex due to the dispersion compensation, which causes a longer reconstruction time (Table 4 and Table 5). We used different compensation strategies in the two custom-built systems to tackle the dispersion problem. In the microscopic system, we used a dispersion-compensating block (LSM03DC, Thorlabs) in the reference arm to deal with the dispersion introduced by the sample arm. In the eye system, the dispersion cannot be easily compensated due to the complicated optical design; besides, the incident light is strongly dispersed by the anterior eye. Therefore, digital dispersion compensation was applied to the retina data, which makes its interferogram complex.

The choice of parameters, such as the step size and the regularization weight $\beta$, influences the result. For the $l1$-norm, take the onion image as an example: we use a step size of 100, and increasing it to 1000 significantly decreases the image quality. The weight $\beta$ of the $l1$-norm for the onion image is set to 1000; using a large weight such as 10000 leads to missing structures, while a smaller $\beta$ of 50 does not visibly change the image quality but generates some artifacts. The weight and step size vary if a different regularizer is used; for example, the step size is 0.00005 and the TV-norm weight is 0.01 for the skin image. A large weight such as $\beta$ equal to 1 should be avoided, as it leads to over-smoothing of the image.

Further improvement of our method is possible. First, the step size for B-scans should be carefully chosen: the initial step size should be larger over the first several iterations to search globally for the optimum. Second, although a significant speed-up has been achieved, the computational time cannot compete with IFFT reconstruction; reducing the number of iterations by adjusting the step size for faster convergence is therefore needed. Besides, a possible direction for further reducing the computational time is to use deep-learning-based approaches to improve the speed with comparable image quality: training a network by unsupervised learning to perform image reconstruction has been shown to be much faster than iterative methods [43,44]. Finally, we could consider reconstructing larger-scale datasets using CUDA programming for GPU parallelism. In summary, the proposed method can provide high-quality images within a short period of time. It could motivate the development of intraoperative OCT for high-quality and real-time visualization, aiding diagnosis during operations.

Funding

National Natural Science Foundation of China (61905141, 61875123); Open Research Fund Program of the State Key Laboratory of Low-Dimensional Quantum Physics (KF202107).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. E. A. Swanson, J. A. Izatt, M. R. Hee, D. Huang, C. P. Lin, J. S. Schuman, C. A. Puliafito, and J. G. Fujimoto, “In vivo retinal imaging by optical coherence tomography,” Opt. Lett. 18(21), 1864–1866 (1993). [CrossRef]  

2. J. G. Fujimoto, “Optical coherence tomography for ultrahigh resolution in vivo imaging,” Nat. Biotechnol. 21(11), 1361–1367 (2003). [CrossRef]  

3. P.-L. Hsiung, L. Pantanowitz, A. D. Aguirre, Y. Chen, D. Phatak, T. H. Ko, S. Bourquin, S. J. Schnitt, S. Raza, J. L. Connolly, H. Mashimo, and J. G. Fujimoto, “Ultrahigh-resolution and 3-dimensional optical coherence tomography ex vivo imaging of the large and small intestines,” Gastrointest. Endosc. 62(4), 561–574 (2005). [CrossRef]  

4. K. K. Bizheva, A. Unterhuber, B. M. Hermann, B. Povazay, H. Sattmann, A. F. Fercher, W. Drexler, M. Preusser, H. Budka, A. Stingl, and T. M. Le, “Imaging ex vivo healthy and pathological human brain tissue with ultra-high-resolution optical coherence tomography,” J. Biomed. Opt. 10(1), 011006 (2005). [CrossRef]  

5. H. Pahlevaninezhad, M. Khorasaninejad, Y.-W. Huang, Z. Shi, L. P. Hariri, D. C. Adams, V. Ding, A. Zhu, C.-W. Qiu, F. Capasso, and M. J. Suter, “Nano-optic endoscope for high-resolution optical coherence tomography in vivo,” Nat. Photonics 12(9), 540–547 (2018). [CrossRef]  

6. J. Fujimoto and E. Swanson, “The development, commercialization, and impact of optical coherence tomography,” Invest. Ophthalmol. Vis. Sci. 57(9), OCT1–OCT13 (2016). [CrossRef]  

7. D. A. Fish, A. M. Brinicombe, E. R. Pike, and J. G. Walker, “Blind deconvolution by means of the Richardson–Lucy algorithm,” J. Opt. Soc. Am. A 12(1), 58–65 (1995). [CrossRef]  

8. M. D. Kulkarni, C. W. Thomas, and J. A. Izatt, “Image enhancement in optical coherence tomography using deconvolution,” Electron. Lett. 33(16), 1365–1367 (1997). [CrossRef]  

9. Y. Liu, Y. Liang, G. Mu, and X. Zhu, “Deconvolution methods for image deblurring in optical coherence tomography,” J. Opt. Soc. Am. A 26(1), 72–77 (2009). [CrossRef]  

10. S. A. Hojjatoleslami, M. R. N. Avanaki, and A. G. Podoleanu, “Image quality improvement in optical coherence tomography using Lucy-Richardson deconvolution algorithm,” Appl. Opt. 52(23), 5663–5670 (2013). [CrossRef]  

11. J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. 21(15), 2758–2769 (1982). [CrossRef]  

12. A. Ozcan, M. J. F. Digonnet, and G. S. Kino, “Minimum-phase-function-based processing in frequency-domain optical coherence tomography systems,” J. Opt. Soc. Am. A 23(7), 1669–1677 (2006). [CrossRef]  

13. C. S. Seelamantula and S. Mulleti, “Super-Resolution Reconstruction in Frequency-Domain Optical-Coherence Tomography Using the Finite-Rate-of-Innovation Principle,” IEEE Trans. Signal Process. 62(19), 5020–5029 (2014). [CrossRef]  

14. X. Liu, S. Chen, D. Cui, X. Yu, and L. Liu, “Spectral estimation optical coherence tomography for axial super-resolution,” Opt. Express 23(20), 26521–26532 (2015). [CrossRef]  

15. J. de Wit, K. Angelopoulos, J. Kalkman, and G.-O. Glentis, “Fast and accurate spectral-estimation axial super-resolution optical coherence tomography,” Opt. Express 29(24), 39946–39966 (2021). [CrossRef]  

16. X. Liu and J. U. Kang, “Compressive SD-OCT: the application of compressed sensing in spectral domain optical coherence tomography,” Opt. Express 18(21), 22010–22019 (2010). [CrossRef]  

17. H. L. Seck, Y. Zhang, and Y. C. Soh, “High resolution optical coherence tomography by l1-optimization,” Opt. Commun. 284(7), 1752–1759 (2011). [CrossRef]  

18. R. Nayak and C. S. Seelamantula, “Optimal sparsifying bases for frequency-domain optical-coherence tomography,” Opt. Lett. 37(23), 4907–4909 (2012). [CrossRef]  

19. N. Zhang, T. Huo, C. Wang, T. Chen, J.-g. Zheng, and P. Xue, “Compressed sensing with linear-in-wavenumber sampling in spectral-domain optical coherence tomography,” Opt. Lett. 37(15), 3075–3077 (2012). [CrossRef]  

20. M. Mousavi, L. Duan, T. Javidi, and A. K. E. Bowden, “Iterative re-weighted approach to high-resolution optical coherence tomography with narrow-band sources,” Opt. Express 24(2), 1781–1793 (2016). [CrossRef]  

21. K. C. Zhou, R. Qian, S. Degan, S. Farsiu, and J. A. Izatt, “Optical coherence refraction tomography,” Nat. Photonics 13(11), 794–802 (2019). [CrossRef]  

22. M. Wang, Y. Ling, S. Shao, and Y. Su, “An iterative algorithm for artifacts removal in Fourier-domain optical coherence tomography,” in Optical Coherence Tomography and Coherence Domain Optical Methods in Biomedicine XXV, vol. 11630, J. A. Izatt and J. G. Fujimoto, eds. (SPIE, 2021), pp. 124–128.

23. Y. Ling, M. Wang, Y. Gan, X. Yao, L. Schmetterer, C. Zhou, and Y. Su, “Beyond Fourier transform: super-resolving optical coherence tomography,” arXiv, arXiv:2001.03129 (2020). [CrossRef]  

24. J. Wang, B. Wohlberg, and R. B. A. Adamson, “Convolutional dictionary learning for blind deconvolution of optical coherence tomography images,” Biomed. Opt. Express 13(4), 1834–1854 (2022). [CrossRef]  

25. S. Mukherjee and C. S. Seelamantula, “An iterative algorithm for phase retrieval with sparsity constraints: application to frequency domain optical coherence tomography,” in International Conference on Acoustics, Speech and Signal Processing (IEEE, 2012), pp. 553–556.

26. S. Mukherjee and C. S. Seelamantula, “Fienup algorithm with sparsity constraints: Application to frequency-domain optical-coherence tomography,” IEEE Trans. Signal Process. 62(18), 4659–4672 (2014). [CrossRef]  

27. X. Zhang and E. Y. Lam, “Edge-preserving sectional image reconstruction in optical scanning holography,” J. Opt. Soc. Am. A 27(7), 1630–1637 (2010). [CrossRef]  

28. J. Ke and E. Y. Lam, “Image reconstruction using spectroscopic and hyperspectral information for compressive terahertz imaging,” J. Opt. Soc. Am. A 27(7), 1638–1646 (2010). [CrossRef]  

29. J. Ke and E. Y. Lam, “Image reconstruction from nonuniformly spaced samples in spectral-domain optical coherence tomography,” Biomed. Opt. Express 3(4), 741–752 (2012). [CrossRef]  

30. D. Xu, Y. Huang, and J. U. Kang, “GPU-accelerated non-uniform fast Fourier transform-based compressive sensing spectral domain optical coherence tomography,” Opt. Express 22(12), 14871–14884 (2014). [CrossRef]  

31. D. Xu, Y. Huang, and J. U. Kang, “Real-time compressive sensing spectral domain optical coherence tomography,” Opt. Lett. 39(1), 76–79 (2014). [CrossRef]  

32. Y. Ling, M. Wang, X. Yao, Y. Gan, L. Schmetterer, C. Zhou, and Y. Su, “Effect of spectral leakage on the image formation of Fourier-domain optical coherence tomography,” Opt. Lett. 45(23), 6394–6397 (2020). [CrossRef]  

33. M. Bertero, P. Boccacci, and C. De Mol, Introduction to Inverse Problems in Imaging (CRC Press, 2021).

34. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv, arXiv:1412.6980 (2014). [CrossRef]  

35. X. Yao, K. Devarajan, R. M. Werkmeister, V. A. dos Santos, M. Ang, A. Kuo, D. W. K. Wong, J. Chua, B. Tan, V. A. Barathi, and L. Schmetterer, “In vivo corneal endothelium imaging using ultrahigh resolution OCT,” Biomed. Opt. Express 10(11), 5675–5686 (2019). [CrossRef]  

36. J. Yang, W. Yin, Y. Zhang, and Y. Wang, “A Fast Algorithm for Edge-Preserving Variational Multichannel Image Restoration,” SIAM J. Imaging Sci. 2(2), 569–592 (2009). [CrossRef]  

37. Z. Tian, X. Jia, K. Yuan, T. Pan, and S. B. Jiang, “Low-dose CT reconstruction via edge-preserving total variation regularization,” Phys. Med. Biol. 56(18), 5949–5967 (2011). [CrossRef]  

38. M. Lustig, D. Donoho, and J. M. Pauly, “Sparse MRI: The application of compressed sensing for rapid MR imaging,” Magn. Reson. Med. 58(6), 1182–1195 (2007). [CrossRef]  

39. G. Gong, H. Zhang, and M. Yao, “Speckle noise reduction algorithm with total variation regularization in optical coherence tomography,” Opt. Express 23(19), 24699–24712 (2015). [CrossRef]  

40. A. Chambolle, V. Caselles, D. Cremers, M. Novaga, and T. Pock, An Introduction to Total Variation for Image Analysis (Walter de Gruyter, 2010).

41. F. Sattar, L. Floreby, G. Salomonsson, and B. Lovstrom, “Image enhancement based on a nonlinear multiscale method,” IEEE Trans. Image Process. 6(6), 888–895 (1997). [CrossRef]  

42. J. I. Sperl, D. Bequé, G. P. Kudielka, K. Mahdi, P. M. Edic, and C. Cozzini, “A Fourier-domain algorithm for total-variation regularized phase retrieval in differential X-ray phase contrast imaging,” Opt. Express 22(1), 450–462 (2014). [CrossRef]  

43. Y. Peng, S. Choi, N. Padmanaban, and G. Wetzstein, “Neural holography with camera-in-the-loop training,” ACM Trans. Graph. 39(6), 1–14 (2020). [CrossRef]  

44. J. Wu, K. Liu, X. Sui, and L. Cao, “High-speed computer-generated holography using an autoencoder-based deep neural network,” Opt. Lett. 46(12), 2908–2911 (2021). [CrossRef]  
