
Three-dimensional refractive index estimation based on deep-inverse non-interferometric optical diffraction tomography (ODT-Deep)

Open Access

Abstract

Optical diffraction tomography (ODT) solves an inverse scattering problem to obtain label-free, 3D refractive index (RI) estimation of biological specimens. This work demonstrates 3D RI retrieval methods suitable for partially-coherent ODT systems supported by intensity-only measurements consisting of axial and angular illumination scanning. This framework allows for access to 3D quantitative RI contrast using a simplified non-interferometric technique. We consider a traditional iterative tomographic solver based on a multiple in-plane representation of the optical scattering process and gradient descent optimization adapted for focus-scanning systems, as well as an approach that relies solely on 3D convolutional neural networks (CNNs) to invert the scattering process. The approaches are validated using simulations of the 3D scattering potential for weak phase 3D biological samples.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Optical diffraction tomography (ODT) provides a 3D reconstruction of the refractive index (RI) for nearly optically transparent samples exhibiting weak absorption and scattering properties. Traditionally, ODT approaches use specialized holographic imaging that exploits light sources with high degrees of spatial and temporal coherence, along with complex hardware to achieve angular scanning of the sample illumination projection [1–4]. Interferometric approaches are subject to coherent speckle and phase instabilities. Non-interferometric ODT has been demonstrated using partially-coherent sources and traditional wide-field transmission microscopes [5–8]. This approach achieves a simplified hardware solution by utilizing an electronically tunable lens (ETL) to collect a series of through-focus intensity images as the axial focal depth is scanned. For samples that meet the criteria of the first Born approximation, the well-known transport of intensity equation (TIE) provides non-interferometric access to quantitative phase and RI contrast [9]. ODT based solely on axial scanning suffers from an incomplete representation of the 3D optical transfer function (OTF), particularly in the axial dimension. Hardware and software solutions to recover this missing resolution, commonly referred to as the "missing cone" problem, are at the core of almost all inverse scattering applications. In addition, choosing the optimal defocusing distance presents an inherent tradeoff between sensitivity and spatial resolution. While these systems have been shown to provide estimates of the transverse phase [10–15], 3D wavefield inversion is severely under-represented by the measurement space. We propose to improve upon non-interferometric ODT approaches by considering a system comprising fast axial scanning using an ETL as well as fast angular illumination scanning using a fully or partially coherent source. Angular scanning can be achieved using beam steering hardware including digital micromirror devices (DMDs), spatial light modulators (SLMs), and programmable galvo mirrors. Angular diversity can also be achieved using a 2D LED array, where individual diodes sequentially provide illumination from a discrete and well-calibrated direction [15–21].

Traditional ODT reconstruction techniques rely on a formal definition of the forward scattering model which is used directly to solve the corresponding inverse problem. These approaches are subject to the limitations of a rigid mathematical construct, including boundary artifacts and first-order scattering approximations resulting in phase biases, low-frequency blurring, and reduced axial resolution. Deep learning (DL) approaches can deduce the complex nonlinear transformation between the input measurements and desired output without the constraints imposed by a direct definition of an optical scattering model. We propose an inversion scheme based on representing the scattering process using a 3D convolutional neural network (CNN) for improved refractive index and phase retrieval. System performance is evaluated using simulations of the optical scattering and measurement acquisition processes for an ODT imaging system utilizing illumination and defocus scanning. Reconstruction performance is evaluated relative to an optimization technique based on an iterative back-projection gradient descent method using a multiple in-plane scattering model.

2. Optical diffraction tomography

Tomographic reconstruction can be realized by adopting the mathematical framework of an inverse problem. An inverse problem attempts to solve for an unknown sample $s$ using measurements $m$ generated from a forward model $H$ operating on $s$, i.e. $m = H(s)$. Applications of inverse problems in imaging include deconvolution, interpolation, denoising, super-resolution, non-linear wavefield inversion, inverse scattering, and time-reversal [22]. A tomographic inverse problem is depicted in Fig. 1, where the forward model represents the entire tomographic imaging system sampling the measurement space.
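
As a concrete illustration of the $m = H(s)$ framework, the following is a minimal sketch that inverts a toy linear forward model by gradient descent on the data-fidelity term; the blur operator, sizes, and noise level are illustrative choices for this sketch, not taken from this work.

```python
import numpy as np

# Toy inverse problem: recover s from m = H(s) + noise by gradient
# descent on 0.5*||H s - m||^2. H here is a simple Gaussian blur matrix.
rng = np.random.default_rng(0)
n = 64
d = np.arange(n)[:, None] - np.arange(n)[None, :]
H = np.exp(-0.5 * (d / 2.0) ** 2)
H /= H.sum(axis=1, keepdims=True)               # row-normalized blur operator
s_true = (rng.random(n) > 0.9).astype(float)    # sparse "sample"
m = H @ s_true + 0.01 * rng.standard_normal(n)  # noisy measurements

s_hat = np.zeros(n)
step = 1.0 / np.linalg.norm(H, 2) ** 2          # safe step size for convergence
for _ in range(500):
    s_hat -= step * H.T @ (H @ s_hat - m)       # gradient of 0.5*||H s - m||^2
print("relative error:", np.linalg.norm(s_hat - s_true) / np.linalg.norm(s_true))
```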

Fig. 1. Depiction of solving a typical tomographic inversion problem.

Optical diffraction tomography is an ill-posed inverse problem that entails computing the 3D refractive index from 2D scattering fields. The scattering problem can be defined by the inhomogeneous Helmholtz equation given by [23]

$$(\nabla^2 + k_0^2)\psi(r,z) = -V(r,z)\psi(r,z),$$
where ${k_0} = 2\pi /\lambda $ is the optical wavenumber, $r = ({x,y} )$ are the transverse spatial coordinates, z is the axial dimension, $\psi ({r,z} )= {\psi _i}({r,z} )+ {\psi _s}({r,z} )$ is the total field with contributions from the on-axis incident and scattered components, and $V({r,z} )$ is the complex scattering potential defined as [5,24]
$$V(r,z) = k_0^2[n^2(r,z) - n_m^2] = \phi(r,z) + iA(r,z).$$

The scattering potential is a solution to the Helmholtz equation and is a function of the 3D complex refractive index $n = {n_{re}} + i{n_{im}}$ with amplitude and phase components given by A and $\phi $. Using the first-order Born approximation, $|{{\psi_s}} |\ll |{{\psi_i}} |$, the total field can be solved for without explicit knowledge of the scattered component using the Green’s function solution $G({r,z} )$ to the homogeneous equation for undiffracted light given by [25]

$$\begin{array}{c}\psi(r,z) = \psi_i(r,z) + [V(r,z)\psi_i(r,z)] \ast G(r,z)\\ G(r,z) = -\exp(ik_0|(r,z)|)/(4\pi|(r,z)|),\end{array}$$
where (*) denotes the convolution operation. The Born approximation neglects multiple-scattering and the total field is proportional to the interference of first-order diffraction with undiffracted light [24]. More simply put, the Born approximation is valid when in-plane scattering dominates out-of-plane scattering, expressed in terms of field derivatives as ${\nabla _r}\psi \ll {\nabla _z}\psi $. The Fourier transform of the intensity of the total field can then be expressed as [5,10]
$$\mathcal{F}\{I\} = B\delta(\rho,\eta) + \tilde{A}(\rho,\eta)H_A(\rho,\eta) + \tilde{\phi}(\rho,\eta)H_\phi(\rho,\eta),$$
where $\tilde{A}$ and $\tilde{\phi}$ are the frequency-domain representations of the amplitude and phase components of the scattered optical field, $H_{A,\phi}$ are the amplitude and phase components of the 3D OTFs, and $B\delta(\rho,\eta)$ is the DC component of the corresponding background intensity $B$. We adopt the notation $\rho = (k_x, k_y)$ corresponding to the transverse spatial frequency components and $\eta = k_z$ corresponding to the axial spatial frequency component. The amplitude and phase OTFs are given by [11,17]
$$H_A(\rho,\eta) = \frac{\lambda}{4\pi}[P(\rho,\eta) + P(\rho,-\eta)]$$
$$H_\phi(\rho,\eta) = \frac{i\lambda}{4\pi}[P(\rho,\eta) - P(\rho,-\eta)],$$
where $P$ is the 3D complex aperture function of the source and imaging optics. Under the assumption of weak diffraction and scattering, given by $(\nabla\phi/k_0)^2 \ll \Delta n$, the inverse problem can be solved assuming a phase-only contribution to the scattering potential, yielding the real component of the refractive index $n_{re}$, which determines the phase velocity in the sample; the imaginary component $n_{im}$, representing the absorption coefficient, is ignored. The scattering inversion of Eq. (4) can then be represented as a 3D deconvolution, given in inverse notation as $\tilde{\phi} = \mathcal{F}\{I\}/H_\phi$. In practice, the inverse problem can be solved using the following least-squares energy minimization in the spatial domain [9,16,26],
$$\hat{n}(r,z) = \mathop{\mathrm{argmin}}\limits_{n(r,z)} \sum\limits_m \left\|\sqrt{I_m(r)} - |H_m(n(r,z))|\right\|^2 + \gamma R(n(r,z)),$$
where $n(r,z)$ is the scattering potential represented as the 3D complex RI, $I_m(r)$ are the 2D intensity measurements, $H_m$ describes the $\mathbb{C}^3 \to \mathbb{C}^2$ forward model that predicts the 2D field at the recording plane for measurement $m$, and $R$ is a regularization function with tuning parameter $\gamma$. By inspecting Eq. (6), we can see that an accurate inversion requires the 3D RI to be adequately encoded into 2D intensity measurements by a predictable model $H$. In addition, the measurements should provide an adequate representation of the scattering kernel for robust inversion of the scattering process. Measurement support for ODT systems has been broadly demonstrated, including sample rotation, coaxial rotation, optical tweezing, axial defocusing, mechanical and optical beam rotation, and multiplexed tomography [4]. In this work, we consider combining axial defocusing and optical beam steering to provide measurements for tomographic inversion techniques with improved axial resolution.
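
For the weak-phase case, the direct deconvolution $\tilde{\phi} = \mathcal{F}\{I\}/H_\phi$ can be stabilized near zeros of the OTF with a small regularization term. The following is a minimal sketch assuming a through-focus intensity stack and a phase OTF sampled on the same 3D frequency grid; the function name and the Tikhonov constant are illustrative, not part of this work's implementation.

```python
import numpy as np

def phase_deconvolve(I_stack, H_phi, eps=1e-3):
    """Wiener-style regularized 3D deconvolution implementing the direct
    inversion phi_tilde = F{I} / H_phi from Eq. (4).
    I_stack : 3D through-focus intensity stack (x, y, z)
    H_phi   : 3D phase OTF sampled on the same (unshifted) frequency grid
    eps     : Tikhonov constant taming the division near OTF zeros
    """
    I_f = np.fft.fftn(I_stack)
    I_f[0, 0, 0] = 0.0  # drop the DC term, i.e. the background B*delta
    phi_f = I_f * np.conj(H_phi) / (np.abs(H_phi) ** 2 + eps)
    return np.real(np.fft.ifftn(phi_f))
```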

3. Non-interferometric ODT system

Coherent ODT has been demonstrated based on digital holographic microscopy and angular illumination diversity achieved using beam steering devices [2]. The interferometric phase is encoded in the coherent spatial mixing of the path-length-matched object and reference beams and is trivially deduced in reconstruction. We consider a simpler non-interferometric ODT realization, suitable for fully or partially-coherent sources which generate spatially incoherent quasi-monochromatic light. Partial coherence can be achieved directly using a filtered halogen lamp, an LED array, or by diffusing the beam of a spatially coherent laser source. The system provides phase sensitivity without issues associated with interferometric optics, including coherent optical speckle, mechanical instabilities, aberrations, and phase wrapping. Figure 2 shows the proposed non-interferometric ODT system [15,18–21,27]. In this configuration, angular illumination is achieved by diversifying single-diode illumination in a structured LED array. The partially-coherent angular illumination is focused on the object using transmission microscope optics consisting of a condenser and objective lens. Without the need for a reference arm, a simpler optical refocusing module placed after interaction with the sample provides access to quantitative phase information by solving the TIE at axially displaced planes. The refocusing optics consist of a divergent offset lens and electronically tunable lens (OL/ETL) combination placed at the focal plane of a relay lens in a 4$f$ imaging system. Altering the effective focal length of the OL/ETL pair is equivalent to, but much faster than, axially translating either the recording plane of the focal plane array (FPA) or the sample stage, resulting in a re-focusing at the object plane [28].

Fig. 2. Proposed non-interferometric ODT imaging system for measurement acquisition and 3D RI reconstruction based on through-focus axial scanning and illumination scanning.

We consider intensity measurements recorded as the effective focal length of the OL/ETL is varied to provide a focal scan across a variety of illumination projections provided by the illumination scanning module. We define the total number of measurements $M$ as the number of defocus planes $M_z$ times the number of angles $M_\theta$, i.e. $M = M_z \cdot M_\theta$. Adequate sampling to support both scanning modalities comes at the cost of increased time and computational complexity in both hardware and reconstruction implementations. Through-focus scanning using an ETL and angular scanning typically achieve 10–20 ms update rates [29–31]. Measurement acquisition rates will also be limited by the integration time required to achieve high-contrast intensity images collected through an objective lens, typically on the order of 100–400 ms for partially coherent sources [16,17]. Measurement acquisition rates of 2.5–10 Hz are therefore expected, providing 100 measurements of axial and illumination diversity in 10–40 s. To improve effective measurement throughput, multi-diode coded illumination patterns can be used to improve reconstruction quality for a limited number of LED scans, drastically reducing the time required to collect measurement support for sample inversion [32].

4. Simulation of optical scattering

Transmission phase-sensitive imaging of biological samples is a well-suited application for ODT as such samples are often highly optically transparent with valuable information contained in the optical path length induced via propagation through the specimen [16,33,34]. In this section we present simulations of the 3D optical scattering potential and forward scattering process for samples that meet the requirements for ODT inversion.

4.1 Object scattering potential

We consider simulated phantoms representing the phase-only scattering potential of nearly-spherical cells, shown as a 3D distribution of refractive index in Fig. 3(a). The simulated samples are stochastically generated using over-resolved object-plane coordinates. Stochastic sampling is guided by measured optical properties of cells [35,36] and contains random distributions of sub-cellular organelles modeled as 3D ellipsoids. The complex object field is computed using plane wave illumination at 532 nm. The imaging aperture of an objective lens is modeled using NA = 1.4. Figure 3(b) shows 16 image-plane realizations of the simulated cells convolved with the incoherent detector point spread function (PSF) and subsampled on the FPA; these will serve as ground truth in subsequent processing and analysis, whereas the object-plane representations will be used to generate image-plane intensity measurements through the application of an optical scattering forward model. Transverse image sizes of 96×96 are considered, representing the FOV for a 10 micron sample assuming a pixel pitch of 6.5 microns and 60× magnification. Approximating the optical scattering process in the cell phantoms provides data for validation of 3D inverse scattering techniques.
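
A minimal sketch of such a phantom generator is given below, assuming a spherical cytoplasm with randomly placed ellipsoidal organelles; the RI values, organelle count, and size ranges are illustrative placeholders rather than the calibrated cell statistics of [35,36].

```python
import numpy as np

def cell_phantom(n=96, n_medium=1.33, n_cyto=1.37, n_org=1.41,
                 n_organelles=8, seed=0):
    """Stochastic phase-only cell phantom on an n x n x n grid:
    spherical cytoplasm plus random ellipsoidal organelles."""
    rng = np.random.default_rng(seed)
    ax = np.arange(n) - n / 2
    x, y, z = np.meshgrid(ax, ax, ax, indexing="ij")
    ri = np.full((n, n, n), n_medium)
    ri[x**2 + y**2 + z**2 < (0.4 * n) ** 2] = n_cyto        # cell body
    for _ in range(n_organelles):
        c = rng.uniform(-0.2 * n, 0.2 * n, size=3)          # organelle center
        r = rng.uniform(0.03 * n, 0.08 * n, size=3)         # ellipsoid semi-axes
        mask = (((x - c[0]) / r[0]) ** 2 + ((y - c[1]) / r[1]) ** 2
                + ((z - c[2]) / r[2]) ** 2) < 1.0
        ri[mask] = n_org
    return ri
```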

Fig. 3. (a) 3D rendering and $z = 0$ axial slice for the 96×96×96 object-plane representation of the 3D scattering potential for a simulated cell phantom. (b) 3D renderings for the image-plane representation of 3D scattering potentials, representing the multi-realizations used for training and testing support for algorithm analysis.

4.2 Forward scattering model

Light propagation through homogeneous objects can be represented as the 3D convolution of the object scattering potential with the PSF for diffraction. By the mathematical nature of the convolution, this model only accounts for first-order scattering effects or, equivalently, light scattered once in any direction. To compute optical scattering through inhomogeneous objects, we consider a multi-slice beam propagation model (MSBP) implementation of the tomographic forward operator that accounts for multiple-scattering along the axis of propagation. The model approximates 3D objects as thin layers and computes the optical field as sequential layer-to-layer propagation [16,18,37,38]. While the model does not account for out-of-plane scattering or plane-to-plane absorption, it does represent multiple in-plane scattering by computing the scattered field iteratively at each plane and is a suitable representation of the optical scattering process in biological samples. Figure 4 describes the mathematical form of the MSBP algorithm. We represent the axial dimension in the plane-to-plane propagation scheme using the $z$-plane index $p$, e.g. $n(r,z) = n_{1:p}(r)$. A plane wave is incident on the object at an angle $\theta = (\theta_x,\theta_y)$ relative to the transverse coordinates. The scattered field $\psi_p(r)$ is computed by applying the scattering potential in the form of a transmittance function at each slice in a plane-to-plane propagation scheme. The composite scattered field is then propagated forward a distance $z_f$ to the final defocusing plane, where a generalized pupil function is applied corresponding to both the imaging and detection optics. Figure 5 presents a visual representation of the forward scattering process. In general, the forward operator provides the complex $\mathbb{C}^3 \to \mathbb{C}^2$ scattering transformation. However, for weak phase objects only the real-valued refractive index contributes to the scattering potential, and the magnitude of the scattered field encodes phase-diffraction information. The modulus-squared operation is applied to represent intensity measurements from a photosensitive FPA.
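
The following is a minimal sketch of the MSBP operator under these assumptions: a plane-wave input tilted by $\theta$, per-slice phase screens, angular spectrum propagation between slices, defocus to $z_f$, and intensity detection. The generalized pupil of the imaging optics is omitted for brevity, and all parameter names are illustrative.

```python
import numpy as np

def msbp_forward(ri, n_medium, wavelength, dx, dz, theta=(0.0, 0.0), z_f=0.0):
    """Multi-slice beam propagation sketch: returns a defocused 2D
    intensity image from a 3D RI volume ri of shape (n, n, n_slices)."""
    n = ri.shape[0]
    k0 = 2 * np.pi / wavelength
    fx = np.fft.fftfreq(n, dx)
    fxx, fyy = np.meshgrid(fx, fx, indexing="ij")
    kz = 2 * np.pi * np.sqrt(np.maximum((n_medium / wavelength) ** 2
                                        - fxx**2 - fyy**2, 0.0))
    prop = np.exp(1j * kz * dz)                      # slice-to-slice propagator
    ax = (np.arange(n) - n / 2) * dx
    x, y = np.meshgrid(ax, ax, indexing="ij")
    # tilted plane-wave illumination at angle theta = (theta_x, theta_y)
    psi = np.exp(1j * k0 * n_medium * (np.sin(theta[0]) * x
                                       + np.sin(theta[1]) * y))
    for p in range(ri.shape[2]):                     # plane-to-plane scattering
        psi *= np.exp(1j * k0 * (ri[:, :, p] - n_medium) * dz)  # phase screen
        psi = np.fft.ifft2(np.fft.fft2(psi) * prop)
    psi = np.fft.ifft2(np.fft.fft2(psi) * np.exp(1j * kz * z_f))  # defocus
    return np.abs(psi) ** 2                          # intensity on the FPA
```

Fed with the phantom sketch above, e.g. msbp_forward(cell_phantom(), 1.33, 0.532, 0.108, 0.108, theta=(0.1, 0.0)), this would return one 96×96 intensity image; the 0.108 micron grid spacing corresponds to the 6.5 micron pixel pitch at 60× magnification stated earlier.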

Fig. 4. MSBP forward model operates on the scattering potential of an object to return an intensity measurement de-focused to a plane ${z_f}$ from an incident wavefront at an angle $\theta = ({{\theta_x},{\theta_y}} )$. Sequential plane-to-plane propagation allows for the inclusion of multiple in-line scattering effects.

Fig. 5. Depiction of the forward scattering process, which transforms 3D scattering potentials to 2D scattered fields; the phase of the scattered field is shown to demonstrate plane-to-plane phase scattering effects.

5. ODT-Gradient inverse

The minimization problem in Eq. (6) describes a scattering inversion framework that can be solved using an iterative approach based on plane-to-plane forward-backward projection and gradient descent optimization [16,17,38]. We refer to this approach as ODT-Gradient inverse, outlined in Fig. 6. The model comprises a forward component that predicts the 3D complex field $\hat{\psi}(r,z,\theta)$ and 2D complex measurements $\hat{\psi}(r,z_f,\theta)$ from the 3D reconstructed RI $\hat{n}(r,z)$, and an inverse solver which re-estimates the scattering potential from the 3D complex field predictions supported by the through-focus/angle intensity measurements $I(r,z_f,\theta)$. In reconstruction we denote an estimate using the hat operator. The forward and backward scattering projections are modeled using the multi-slice beam propagation model described in Section 4.2; in reconstruction, however, the forward operator acts on the estimated scattering potential to provide access to the 3D complex scattered field and complex measurements, as opposed to intensity-only measurements. During each iteration, the estimate of the refractive index is successively updated in the direction of minimization for each measurement. After each iteration, total-variation (TV) regularization using the $L^2$ norm of the 3D refractive index, corresponding to $R$ in Eq. (6), is implemented using a proximal gradient method to update the object reconstruction [8,17,39,40]. This allows for a momentum-based acceleration of the gradient in convex regularization problems.
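
Schematically, the outer loop can be sketched as below; grad_update (the per-measurement step detailed in Fig. 7) and tv_prox (the proximal TV operator) are assumed helper functions named here for illustration only.

```python
def odt_gradient_inverse(n_init, measurements, grad_update, tv_prox, n_iters=25):
    """Sketch of the outer ODT-Gradient loop (Fig. 6): sequential
    per-measurement gradient updates, then a proximal TV step after each
    full sweep. `measurements` yields (I_m, theta, z_f) tuples; `grad_update`
    is assumed to run the MSBP forward pass internally and apply the
    Fig. 7 back-projection update."""
    n_hat = n_init.copy()
    for _ in range(n_iters):
        for I_m, theta, z_f in measurements:
            # update the reconstruction in the direction of this measurement
            n_hat = grad_update(n_hat, I_m, theta, z_f)
        # TV regularization via a proximal-gradient update (R in Eq. (6))
        n_hat = tv_prox(n_hat)
    return n_hat
```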

Fig. 6. ODT-Gradient Inverse consisting of a forward model and gradient based object update. During each iteration the reconstruction is sequentially updated in the direction of measurements using a gradient-descent method. TV regularization is implemented after each iteration using a proximal-gradient update to the reconstruction.

The scattering estimates from the forward model are used in conjunction with the intensity measurements to update the complex object reconstruction during the back-projection gradient update described in Fig. 7. For a forward prediction at each illumination angle $\theta$ and defocus plane $z_f$, the residual is initialized as the difference between the complex measurements as predicted by the forward model and the observed measurements. In lieu of complex observations, the residual is initialized by enforcing the phase as predicted by the forward model applied to the square-root of the intensity measurements. For each axial plane $p$ in a plane-by-plane back-propagation scheme, the complex field predicted by the forward model is 'inverse-scattered' according to the current estimate of the scattering potential. The gradient is then computed using the residual and subtracted from the previous estimate of the object reconstruction, so as to update the object against the gradient in the direction of minimization using the gradient step-size $\varepsilon$. Finally, the residual is back-projected to the previous plane and the algorithm repeats.
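
A minimal sketch of one possible body for this per-measurement update follows. The residual initialization comes directly from the text; the one-slice back-propagation (backprop) and the per-plane gradient of the data term (plane_grad) are assumed helpers, since their exact form depends on the MSBP discretization.

```python
import numpy as np

def backprojection_update(n_hat, I_m, psi_planes, psi_pred,
                          backprop, plane_grad, eps=1e-2):
    """Sketch of the Fig. 7 back-projection gradient update.
    psi_planes : list of per-plane complex fields from the forward pass
    psi_pred   : predicted complex field at the measurement plane
    backprop   : assumed helper propagating a field back one slice
    plane_grad : assumed helper returning the data-term gradient at a plane"""
    # residual: keep the phase predicted by the forward model, enforce the
    # square root of the measured intensity as the magnitude
    residual = psi_pred - np.sqrt(I_m) * np.exp(1j * np.angle(psi_pred))
    for p in reversed(range(n_hat.shape[2])):  # plane-by-plane back-propagation
        residual = backprop(residual)          # residual at plane p
        # step against the gradient in the direction of minimization
        n_hat[:, :, p] -= eps * plane_grad(psi_planes[p], residual)
    return n_hat
```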

Fig. 7. Back-projection gradient update consisting of a plane-by-plane update of the reconstructed scattering potential in the direction of an intensity measurement.

6. ODT-Deep inverse

Deep learning realized using CNNs has been broadly applied to optical microscopy applications [35,41–44] and to solving inverse problems [45,46], including 3D ODT inverse scattering problems applied to holographic imaging and angular diversity [24]. In the context of Section 5, CNNs have been demonstrated to provide a deep regularization solution to replace the proximal-gradient update method in Fig. 6. In this implementation the network is trained to map the low-to-high resolution transformation provided by regularization routines [47]. Similarly, iterative tomographic inversion has been demonstrated using a CNN architecture that is trained in situ to learn minimization using a "deep physics prior" loss function as an alternative to solving a regularized optimization problem directly [48]. Both solutions address the missing cone problem by alleviating rigid dependence on scattering model approximations. We propose to address the shortcomings of traditional optimization routines by using a CNN architecture to directly map intensity measurements to refractive index, thus providing a full representation of scattering inversion from intensity-only measurements.

To acquire RI slices from a stack of intensity images we adopted the pix2pix conditional generative adversarial network (cGAN) architecture [49], a well-known model for high-resolution image-to-image translation. Figure 8(a) shows the proposed architecture for non-interferometric ODT imaging and reconstruction applied to a simulated cell phantom. The complex optical field de-focused to a plane $z_{m_z,m_\theta}$ is generated and measured as intensity by the $\mathbb{C}^3 \to \mathbb{R}^2$ forward model ($H_{m_z}$) operating on the 3D object scattering potential, described by the multi-slice beam propagation forward model. The composite forward model $H = (H_0, \ldots, H_{M = M_z \cdot M_\theta})$ provides a series of 2D measurements as the object is scanned through focus and angle, i.e. $H: \mathbb{C}^3 \to \mathbb{R}^2_{m \in M}$, where $M$ is the measurement set. The non-linear inverse model $\tilde{H}_{n_z \in N_z}: \mathbb{R}^2_{m \in M} \to \mathbb{R}^3$ is implemented as a fully convolutional cGAN based on 3D convolution/deconvolution to achieve stack-to-volume translation for a set of z-plane reconstructions $N_z$, shown in Fig. 8(b).

Fig. 8. Schematic illustration of the ODT-Deep inverse approach (a) and training procedure for the pix2pix cGAN (b) used in tomographic reconstruction. The forward model generates a series of through-focus and angle intensity measurements, which are translated to a 3D RI volume in the inverse model. Each node in the forward model is an MSBP forward operator which provides an intensity measurement defocused to a plane ${z_{{m_z},{m_\theta }}}$ from incident illumination at an angle ${\theta _{{m_\theta }}}$. The inverse model is implemented as a trained cGAN with a generator network (U-Net) to provide volumetric RI prediction for a set of reconstruction planes ${z_{{n_z}\in {N_z}}}$. Training is optimized using a discriminator network (PatchGAN) and conditioned using image labels.

6.1 cGAN network architecture

GANs consist of two adversarial models: a generative model $G$ learns the data distribution and is trained to produce outputs that cannot be distinguished from "real" images by an adversarially trained discriminator model $D$. The discriminator is trained to identify "fake" images produced by the generator and estimates the likelihood that a sample belongs to the training data. GANs learn a mapping from a random noise vector $\tilde{z}$ to an output image $\tilde{y}$, i.e. $G: \tilde{z} \to \tilde{y}$ [50].

Conditional GANs use auxiliary information to condition the generator and discriminator to continuously optimize network parameters and learning; cGANs learn a mapping from an observed image $\tilde{x}$ and a random noise vector $\tilde{z}$ to an output image $\tilde{y}$, i.e. $G: \{\tilde{x},\tilde{z}\} \to \tilde{y}$ [51]. Figure 8(b) depicts the training procedure implemented in the cGAN architecture. The generator output $G(\tilde{x})$ and the input images $\tilde{x}$ are concatenated and input to the discriminator network to generate a mapping function $D_1$. The input images and the conditional images $\tilde{y}$, in this case the label images or desired generator outputs, are concatenated and input to the discriminator network to generate a mapping $D_2$. The feature mappings $D_1$ and $D_2$ are used to construct the loss function of the discriminator. The generator output, the conditional images, and the feature mapping $D_1$ are used to construct the loss function of the generator. The composite objective of a conditional GAN can be expressed as [49,51]

$$\mathcal{L}_{cGAN}(G,D) = \mathbb{E}_{\tilde{x},\tilde{y}}[\log D(\tilde{x},\tilde{y})] + \mathbb{E}_{\tilde{x},\tilde{z}}[\log(1 - D(\tilde{x},G(\tilde{x},\tilde{z})))],$$
where $\mathbb{E}$ is the expectation operator. The generator attempts to minimize the objective against an adversarial discriminator that tries to maximize it, i.e.
$$G^\ast = \arg\mathop{\min}\limits_G \mathop{\max}\limits_D \mathcal{L}_{cGAN}(G,D).$$

The use of the discriminator in the min-max objective function provides an excellent model for image-to-image prediction, as the conditioned network can represent sharp features, high-resolution/high-frequency content, and even degenerate distributions, whereas simpler models with commonly used pixel-wise loss functions tend to produce blurry outputs. It has proven beneficial to regularize the objective function using the mean absolute error (MAE), as the L1 norm further discourages blurring [49]. The final objective is given by

$$G^\ast = \arg\mathop{\min}\limits_G \mathop{\max}\limits_D \mathcal{L}_{cGAN}(G,D) + \gamma_G \mathcal{L}_{MAE}(G),$$
where the L1 loss function is given by
$$\mathcal{L}_{MAE}(G) = \mathbb{E}_{\tilde{x},\tilde{y},\tilde{z}}[\|\tilde{y} - G(\tilde{x},\tilde{z})\|_1].$$

The regularization parameter $\gamma_G = 100$ is used to penalize generator mismatch with the conditional labeled data. The cGAN network trained using L1 minimization can learn to create a solution that resembles realistic high-resolution images with high-frequency detail. Network prediction is achieved by applying the trained generator to testing images. The networks are trained using 240 input/output image-stack pairs and tested using 64 hold-out samples. Each training example consists of the intensity measurement stack and the 3D RI. For each training epoch, i.e. after each training sample has propagated through the network at least once, the model parameters are updated for each batch of training images using gradient descent optimization and back-propagation. The Adam algorithm is used for stochastic optimization with a learning rate of 0.0002 and momentum parameters $\beta_1 = 0.5$, $\beta_2 = 0.999$ [52].
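
A minimal sketch of this loss and optimizer configuration, written for TensorFlow (which the implementation section reports using), is shown below. The logits-based binary cross-entropy and the function names are assumptions of this sketch; generator and discriminator construction is deferred to the following subsections.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
gamma_G = 100.0                                    # L1 regularization weight

def generator_loss(d_fake, y_fake, y_true):
    adv = bce(tf.ones_like(d_fake), d_fake)        # try to fool the discriminator
    mae = tf.reduce_mean(tf.abs(y_true - y_fake))  # L1 term discourages blurring
    return adv + gamma_G * mae

def discriminator_loss(d_real, d_fake):
    # real pairs scored as 1, generated pairs scored as 0
    return bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)

# Adam with the stated learning rate and momentum parameters
g_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5, beta_2=0.999)
d_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5, beta_2=0.999)
```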

6.2 cGAN failure mechanisms

While the cGAN is a powerful mathematical construct for image generation, there are common convergence failure mechanisms that can degrade the quality of the generated images. The fundamental GAN failure mechanism is mode collapse. Mode collapse arises from the adversarial training strategy when the generator is over-optimized to produce only a small subspace of plausible outcomes, or dominant modes, that the discriminator has learned to easily identify as "fake". Consequently, the discriminator lacks training support to adequately learn the full mapping of plausible input/output translations. In training, the discriminator converges to a local minimum and does not incentivize the generator to produce the non-dominant modes required to adequately train the network. Conditional GANs attempt to minimize this effect by processing the generator output and conditional images through the discriminator independently and defining a composite loss function for optimization. While the issue of mode collapse has not been fully solved, more advanced mitigation strategies have received much attention in the machine learning community, including unrolled GANs, which additionally include future discriminator outputs in the loss function to prevent over-optimization.

A distinct challenge is feature hallucination, which occurs when the generator has learned numerical mappings that are inconsistent with the true bounds of physics-based forward and inverse scattering. Feature hallucinations can arise when training datasets contain large sample-to-sample variations or even small biases and noise. Increasing the regularization parameter $\gamma_G$ in Eq. (9) would further penalize prediction mismatch with the labeled data, effectively desensitizing the learning process and discouraging feature hallucinations, perhaps at the cost of robust network prediction for samples that are not well represented in the training data.

6.3 U-net generator network

The generator network implementation is the well-known U-Net architecture based on the fully convolutional network [49,53], shown in Fig. 9. It consists of a contracting path (encoder) and an expansive path (decoder), which follow typical CNN architectures. The contracting path consists of repeated applications of 4×4×4 3D spatial convolutions with 3D downsampling, each followed by a leaky rectified linear unit (LReLU) activation function to alleviate the vanishing gradient problem and improve learning efficiency [54]. At each layer in the contracting path the data is downsampled using stride 2 in the transverse dimensions with a variable stride $s_z$ in the axial dimension depending on the number of measurements. At each downsampling step the number of feature channels (i.e. the number of independent 3D spatial filters in the network node) is doubled from 16 to 128. The size of the data at each layer in the network is then $(96/2^l, 96/2^l, M/s_z^l)$, with the final layer in the encoder producing 3×3×1 transverse image patches. Every step in the expansive path consists of an up-sampling of the feature map followed by a 4×4×4 convolution ("up-convolution") with single-pixel stride that halves the number of feature channels from 128 to 16. At layer interfaces in the expansive path ReLU activation functions are enforced with a 50% dropout rate. After the last decoder layer, a 3D convolution is applied to map to the number of output channels, and a Tanh activation function is applied to acquire the volume prediction. The network consists of 10 total layers; batch normalization is performed at each layer interface in the encoder and decoder paths to standardize the data and to stabilize and improve network training [55]. The operations performed at each layer interface are detailed in Fig. 9. While classical CNN solutions require information to flow through the entire encoder/decoder structure, the U-Net implementation provides skip connections between mirror-symmetric layers in the encoder/decoder. Skip connections protect information from loss during transport through the network and provide access to low-level information that is crucial for image representation [49,53].
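
A minimal Keras sketch of this generator follows. For simplicity it assumes an input stack whose axial size equals the 96 output planes (the paper varies the axial stride $s_z$ to map $M$ measurement planes to $N_z$ reconstruction planes), uses strided transpose convolutions in place of the up-sample plus single-stride convolution pairs, and omits batch normalization; these simplifications and the final transpose-conv output head are illustrative choices.

```python
import tensorflow as tf
from tensorflow.keras import layers

def unet3d_generator(base=16):
    """Sketch of the 3D U-Net generator: five 4x4x4 strided-conv encoder
    levels with LReLU, a mirrored decoder with 50% dropout and skip
    connections, and a Tanh output producing the RI volume."""
    inp = layers.Input((96, 96, 96, 1))               # measurement stack
    skips, x = [], inp
    for l in range(5):                                # contracting path
        x = layers.Conv3D(min(base * 2**l, 128), 4,   # channels 16 -> 128
                          strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)                  # alleviates vanishing gradients
        skips.append(x)
    for l in range(3, -1, -1):                        # expansive path, 128 -> 16
        x = layers.Conv3DTranspose(min(base * 2**l, 128), 4,
                                   strides=2, padding="same")(x)
        x = layers.ReLU()(layers.Dropout(0.5)(x))     # 50% dropout at interfaces
        x = layers.Concatenate()([x, skips[l]])       # skip connection
    x = layers.Conv3DTranspose(1, 4, strides=2, padding="same",
                               activation="tanh")(x)  # RI volume prediction
    return tf.keras.Model(inp, x)
```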

Fig. 9. 3D U-Net generator network consisting of convolutional encoder layers and de-convolutional decoder layers; skip connections between mirrored layers, depicted as black arrows, provide access to additional low-level information.

6.4 PatchGAN discriminator network

To further enforce high spatial frequency representations, pix2pix adopts the well-known PatchGAN discriminator framework [49], which only penalizes structure at the scale of local image patches. The discriminator is run over a series of convolution patches across the image, averaging all responses to provide an aggregate likelihood that each 3×3×1 patch is real rather than produced by the generator. By assuming independence between pixel separations that are large relative to the patch size, the network will learn higher-order image texture [49]. The discriminator network, shown in Fig. 10, mimics the first five convolutional layers in the contracting path of the generator network; the number of feature channels doubles from 16 to 128. The final layer in the discriminator network applies a Tanh activation function to 3×3×1 patches to generate class likelihood scores that feed the cGAN loss function described in Section 6.1 and train the generator network described in Section 6.3.
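
A matching Keras sketch of the discriminator is given below. It emits per-patch scores as logits to pair with the loss sketch in Section 6.1, whereas the text describes a final Tanh, so the output head here is an illustrative choice.

```python
import tensorflow as tf
from tensorflow.keras import layers

def patchgan_discriminator(base=16):
    """Sketch of the 3D PatchGAN: the concatenated (input, candidate)
    pair passes through five strided 4x4x4 conv layers mirroring the
    generator encoder, ending in a map of per-patch scores."""
    x_in = layers.Input((96, 96, 96, 1))   # measurement stack (condition)
    y_in = layers.Input((96, 96, 96, 1))   # generated or labeled RI volume
    x = layers.Concatenate()([x_in, y_in])
    for l in range(5):                     # channels double from 16 to 128
        x = layers.Conv3D(min(base * 2**l, 128), 4,
                          strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
    scores = layers.Conv3D(1, 3, padding="same")(x)   # per-patch logits
    return tf.keras.Model([x_in, y_in], scores)
```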

Fig. 10. 3D PatchGAN discriminator network mimics the decoder path in the generator network. The discriminator is run for a series of convolution patches across the image and only penalizes structure at the scale of local image patches.

7. Results

7.1 Implementation and timing

All results were generated using a Linux-based system with an Intel Xeon W-2295 18-core (36-thread) processor operating at a 3.00 GHz base frequency with 64 GB of RAM. To accelerate computations, both ODT inversion algorithms were executed using an NVIDIA RTX A5000 GPU with 24 GB RAM and 8192 CUDA cores. The algorithms were developed using Python 3.9.7 and CUDA Toolkit 11.6. ODT-Gradient inverse utilizes CuPy 9.6.0 for GPU-accelerated computing. The cGAN architecture for the ODT-Deep inverse framework was implemented using the TensorFlow (v2.8.0) framework; matrix operations were implemented using single-precision (32-bit) floating-point operations evaluated across CUDA cores.

The time complexity of ODT-Gradient inverse and ODT-Deep inverse implemented on the GPU is shown in Fig. 11. Algorithm timing is performed for 3D RI profiles with $N_z = 96$ axial planes and $N$ root-pixels per image; performance is shown for an increasing number of measurements and image sizes. The reconstruction time per iteration for ODT-Gradient inverse is shown in Fig. 11(a). The time complexity of ODT-Deep inverse is presented in Fig. 11(b), showing the training time per epoch (black) using 240 input/output image-stack pairs and the testing time per sample (red). For $N = 96$, network training takes 1–10 min per epoch, and measurement-stack-to-RI-volume prediction is performed in less than 100 ms for up to several hundred measurements. The total time to reconstruct a full 3D object is then significantly decreased relative to ODT-Gradient inverse, which takes 225 s to iterate and converge to a 96×96×96 3D reconstruction using 400 total measurements and 25 iterations. For moderate image sizes ($N > 128$), ODT-Gradient reconstruction time is dominated by the series of plane-to-plane 2D spatial FFTs exhibiting $(N^2 \cdot N_z)\log(N^2)$ complexity. ODT-Deep inverse is dominated by layers of 3D convolutions, with the most intensive exhibiting $([N+k]^2 \cdot [N_z+k])\log([N+k]^2 \cdot [N_z+k])$ complexity for filter size $k$. Computational limitations are reached for $N > 512$ due to extensive training times and limited GPU memory to store training measurements and volumes, highlighting a main disadvantage of pre-trained 3D neural networks.

Fig. 11. (a) Reconstruction time per iteration for ODT-Gradient inverse implemented on GPU, shown for increasing measurement support and root-pixels per image. (b) Training time per epoch (red) and testing time per sample (black) for ODT-Deep inverse implemented on GPU, shown for increasing measurement support and root-pixels per image.

7.2 ODT inverse for 3D RI estimation

We consider 3D refractive index reconstruction and prediction using through-focus and angle measurement simulations of the optical system described in Fig. 2 and the optical scattering described in Section 4. Figure 12(a–c) shows 2D slices through the true RI of a simulated cell phantom. The forward model is applied to the 3D scattering potential to generate intensity measurements, which are used as support for the tomographic inversion processes. As a representative example, we will consider 252 measurements consisting of $M_z = 7$ defocus planes spanning ±3.3 microns for each of $M_\theta = 36$ illumination angles spanning (−60, 60) degrees in both transverse spatial dimensions, i.e. 6 angles relative to the x and y coordinates, respectively. For alternate measurement sets, the defocus and angular scanning ranges remained fixed at ±3.3 microns and (−60, 60) degrees, respectively, with increasing resolution for increasing measurements.
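
As a quick check of the counting, the representative measurement grid can be enumerated as follows; the linspace endpoints simply mirror the stated scan ranges.

```python
import numpy as np

# Representative grid: M = M_z * M_theta with 7 defocus planes over
# +/-3.3 microns and 6 x 6 illumination tilts over (-60, 60) degrees.
z_planes = np.linspace(-3.3, 3.3, 7)                 # microns, M_z = 7
tilts = np.deg2rad(np.linspace(-60, 60, 6))          # 6 tilts per axis
angles = [(tx, ty) for tx in tilts for ty in tilts]  # M_theta = 36
grid = [(zf, th) for zf in z_planes for th in angles]
print(len(grid))                                     # 252 measurements
```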

Fig. 12. 3D RI used as scattering potential. (a) Transverse (x-y) slice, (b) axial (x-z) slice, (c) axial (y-z) slice.

Figure 13(a–c) shows 2D slices through the reconstructed RI using ODT-Gradient inverse after 25 optimization iterations. The gradient solver required 6 s per iteration to perform 3D object reconstruction with $N_z = 96$ axial planes. Figure 14(a–c) shows 2D slices through the reconstructed RI predicted by ODT-Deep inverse after 100 training epochs and the same measurements as considered in Fig. 13. Network training was executed at a rate of 165 s per epoch; however, the mean time to test a sample and predict a volume with $N_z = 96$ axial planes was just 68 ms.

Fig. 13. 3D reconstruction of RI using ODT-Gradient inverse. (a) Transverse (x-y) slice, (b) axial (x-z) slice, (c) axial (y-z) slice.

Fig. 14. 3D reconstruction of RI using ODT-Deep inverse. (a) Transverse (x-y) slice, (b) axial (x-z) slice, (c) axial (y-z) slice.

Figures 12–14 show that ODT-Gradient inverse suffers from blurring and reconstruction biases that are not present in the ODT-Deep inverse reconstruction. Figure 15 provides a 3D rendering of the forward scattering process (a) and the scattering inversion based on the two approaches discussed (b–c). The 3D projection further demonstrates that ODT-Deep inverse reconstructs the original 3D refractive index with minimal blurring and less overall bias relative to ODT-Gradient inverse. Furthermore, many inhomogeneities that are missing from (b) are fully resolved using the network-based approach (c). Deep-learning-based reconstruction is not without artifacts, as seen in the faint vertical striping in Fig. 14(b–c). The artifacts were found to become less pronounced with increasing training examples and training iterations, at the cost of increased computational resources and extended training times.

Fig. 15. 3D rendering of ODT forward and inverse processes for 7 de-focus planes and 36 illumination angles. Measurements are generated using the multiple-scattering forward model described in Section 4.2 applied to the 3D scattering potential (a); 3D RI is reconstructed using ODT-Gradient inverse (b) and ODT-Deep inverse (c).

Figure 16 shows the root-mean-square error (RMSE) of the 3D RI reconstruction relative to the true scattering potential, aggregated over 64 testing samples, for ODT-Gradient inverse as a function of optimization iteration (a) and ODT-Deep inverse as a function of training epochs (b). Increased axial and illumination diversity independently improved the quality of reconstruction for both approaches. Increased axial-scanning support for ODT-Gradient inverse improved the convergence for sufficiently high illumination diversity. ODT-Gradient inverse ultimately becomes limited by the numerical inaccuracies of the non-deterministic optimization routine for sufficiently large measurement sets, achieving minimum RI errors on the order of $2\times10^{-2}$. The minimum achievable reconstruction error for ODT-Deep inverse was found to be 4× below this bound. All results assume a microscope objective with NA = 1.4, demonstrating robust reconstruction performance for high-resolution imaging.
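
For reference, the aggregate metric is a per-volume RMSE averaged over the held-out set; a minimal sketch follows, where reconstruct and test_pairs are hypothetical placeholders standing in for either inversion pipeline and the 64 (measurement stack, true RI) test pairs.

```python
import numpy as np

def rmse(n_hat, n_true):
    # root-mean-square error between reconstructed and true 3D RI volumes
    return np.sqrt(np.mean((n_hat - n_true) ** 2))

# `reconstruct` and `test_pairs` are hypothetical placeholders for an
# inversion pipeline and the held-out (measurement stack, RI) pairs.
errors = [rmse(reconstruct(stack), ri) for stack, ri in test_pairs]
print(f"mean RMSE over {len(errors)} samples: {np.mean(errors):.2e}")
```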

Fig. 16. RMSE of the 3D RI computed over 64 test samples for ODT-Gradient inverse as a function of optimization iteration (a) and ODT-Deep inverse as a function of training epoch (b). The individual line plots correspond to various measurement sets.

8. Conclusions and future work

Non-interferometric phase microscopy significantly decreases the complexity of 2D/3D quantitative imaging systems. Measuring a stack of through-focus intensity images can provide high-fidelity phase contrast images for samples that meet the requirements of the first Born approximation. 3D RI retrieval methods exist for focal-scanning systems, albeit with drastically reduced axial resolution relative to systems that measure illumination diversity. We considered advanced tomographic RI retrieval approaches designed to meet the unique challenges of non-interferometric ODT systems. To provide support for scattering inversion techniques we considered through-focus and angle scanning diversity to encode a more complete 3D representation into the 2D scattered fields, with improved spatial resolution relative to illumination-scanning-only systems. High-fidelity 3D RI images were acquired using advanced software solutions for tomographic reconstruction of large datasets consisting of 3D input/output images.

This work demonstrates, in simulation, accurate 3D refractive index retrieval achieved by inverting the optical scattering process using non-interferometric measurements and an ODT-Deep inverse model based entirely on DL/CNN. The model consists of a pre-trained 3D cGAN network which translates 2D intensity measurements to a 3D RI volume. Achieving a fully-learned representation of optical scattering inversion reduces the errors associated with an incomplete sampling of the forward scattering kernel, scattering approximations, and the numerical instabilities which arise from solving a non-deterministic constrained optimization problem. As is often the case with DL solutions, performance will be bounded by the quality of the output images used in training. Using simulations of the phase-only scattering potential for biological samples, we have demonstrated improved performance relative to a constrained optimization approach based on a multiple in-plane scattering model and regularized gradient descent. While training the ODT-Deep inverse model requires significant time and computational resources, we demonstrated a significant decrease in reconstruction error and a three-order-of-magnitude reduction in post-training reconstruction time relative to the ODT-Gradient inverse model. However, DL-generated data can exhibit inaccuracies and feature hallucinations caused by the presence of noise and over-optimization. Our results show that increased axial and illumination diversity both significantly improve the quality of reconstruction for both inversion techniques. Additionally, we showed that increased axial scanning improves convergence for sufficiently high illumination diversity.

This work serves as a proof of concept for 3D optical scattering inversion implemented using a single 3D convolutional neural network. As a next step we will seek to eliminate the training complexity by adopting a self-supervised learning approach based on an intelligent regularization routine that exploits knowledge of the optical scattering process [19,21]. A meaningful extension would consider scattering inversion for opaque and highly diffractive samples. For such samples, we would expect a significantly improved reconstruction quality for increasing optical sectioning provided by axial scanning diversity.

Acknowledgments

We would like to thank The Johns Hopkins Applied Physics Laboratory for supporting this research effort. Portions of this work were presented at the Optica Digital Holography and 3-D Imaging Meeting in 2022: "3D Refractive Index Estimation Based on Partially-Coherent Optical Diffraction Tomography (PC-ODT) and Deep Learning."

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper may be obtained from the authors upon reasonable request.

References

1. T. Latychevskaia, F. Gehri, and H. Fink, “Depth-resolved holographic reconstructions by three-dimensional deconvolution,” Opt. Express 18(21), 22527–22544 (2010). [CrossRef]  

2. A. Kuś, W. Krauze, and M. Kujawińska, “Active limited-angle tomographic phase microscope,” J. Biomed. Opt. 20(11), 111216 (2015). [CrossRef]  

3. Y. Lin, H. Chen, H. Tu, C. Liu, and C. Cheng, “Optically driven full-angle sample rotation for tomographic imaging in digital holographic microscopy,” Opt. Lett. 42(7), 1321–1324 (2017). [CrossRef]  

4. V. Balasubramani, A. Kuś, H. Tu, C. Cheng, M. Baczewska, W. Krauze, and M. Kujawińska, “Holographic tomography: techniques and biomedical applications,” Appl. Opt. 60(10), B65–B80 (2021). [CrossRef]  

5. J. Soto, J. Rodrigo, and T. Alieva, “Optical diffraction tomography with fully and partially coherent illumination in high numerical aperture label-free microscopy,” Appl. Opt. 57(1), A205–A214 (2018). [CrossRef]  

6. J. Soto, J. Rodrigo, and T. Alieva, “Partially coherent illumination engineering for enhanced refractive index tomography,” Opt. Lett. 43(19), 4699–4702 (2018). [CrossRef]  

7. J. Soto, J. Rodrigo, and T. Alieva, “Partially coherent optical diffraction tomography for label-free quantitative microscopy,” 16th Workshop on Information Optics (WIO), 1–3 (2017).

8. J. Soto, J. Rodrigo, and T. Alieva, “Partially coherent optical diffraction tomography toward practical cell study,” Front. Phys. 9, 666256 (2021). [CrossRef]  

9. C. Zuo, L. Jiaji, J. Sun, Y. Fan, J. Zhang, L. Lu, R. Zhang, B. Wang, L. Huang, and Q. Chen, “Transport of intensity equation: a tutorial,” Opt. Lasers Eng. 135, 106187 (2020). [CrossRef]  

10. M. Jenkins, J. Long, and T. Gaylord, “Multifilter phase imaging with partially coherent light,” Appl. Opt. 53(16), D29–D39 (2014). [CrossRef]  

11. Y. Bao and T. Gaylord, “Quantitative phase imaging method based on an analytical nonparaxial partially coherent phase optical transfer function,” J. Opt. Soc. Am. A 33(11), 2125–2136 (2016). [CrossRef]  

12. Z. Jingshan, R. Claus, J. Dauwels, L. Tian, and L. Waller, “Transport of Intensity phase imaging by intensity spectrum fitting of exponentially spaced defocus planes,” Opt. Express 22(9), 10661–10674 (2014). [CrossRef]  

13. L. Waller, L. Tian, and G. Barbastathis, “Transport of Intensity phase-amplitude imaging with higher order intensity derivatives,” Opt. Express 18(12), 12552–12561 (2010). [CrossRef]  

14. Z. Jingshan, R. A. Claus, J. Dauwels, L. Tian, and L. Waller, “Non-uniform sampling and Gaussian process regression in transport of intensity phase imaging,” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 7784–7788 (2014).

15. L. Tian and L. Waller, “Quantitative differential phase contrast imaging in an LED array microscope,” Opt. Express 23(9), 11394–11403 (2015). [CrossRef]  

16. S. Chowdhury, M. Chen, R. Eckert, D. Ren, F. Wu, N. Repina, and L. Waller, “High-resolution 3D refractive index microscopy of multiple-scattering samples from intensity images,” Optica 6(9), 1211–1219 (2019). [CrossRef]  

17. J. Li, N. Zhou, J. Sun, S. Zhou, Z. Bai, L. Lu, Q. Chen, and C. Zuo, “Transport of intensity diffraction tomography with non-interferometric synthetic aperture for three-dimensional label-free microscopy,” Light: Sci. Appl. 11, 154 (2022). [CrossRef]  

18. L. Tian and L. Waller, “3D intensity and phase imaging from light field measurements in an LED array microscope,” Optica 2(2), 104–111 (2015). [CrossRef]  

19. R. Eckert, “Robust 3D Quantitative Phase Imaging,” PhD Thesis (Department of Electrical Engineering and Computer Science, University of California, Berkeley, 2022).

20. R. Eckert, Z. Phillips, and L. Waller, “Efficient illumination angle self-calibration in Fourier ptychography,” Appl. Opt. 57(19), 5434–5442 (2018). [CrossRef]  

21. R. Cao, M. Kellman, D. Ren, R. Eckert, and L. Waller, “Self-calibrated 3D differential phase contrast microscopy with optimized illumination,” Biomed. Opt. Express 13(3), 1671–1684 (2022). [CrossRef]  

22. E. Soubies, F. Soulez, M. McCann, T. Pham, L. Donati, T. Debarre, D. Sage, and M. Unser, “Pocket guide to solve inverse problems with GlobalBioIm,” Inverse Problems 35(10), 104006 (2019). [CrossRef]  

23. A. Devaney, Mathematical Foundations of Imaging, Tomography and Wavefield Inversion (Cambridge University, 2012), Chap. 6.

24. N. Streibl, “Three-dimensional imaging by a microscope,” J. Opt. Soc. Am. A 2(2), 121–127 (1985). [CrossRef]  

25. D. Jin, R. Zhou, Z. Yaqoob, and P. So, “Tomographic phase microscopy: principles and applications in bioimaging,” J. Opt. Soc. Am. B 34(5), B64–B77 (2017). [CrossRef]  

26. T. Nguyen, V. Bui, and G. Nehmetallah, “Computational Optical Tomography Using 3D Deep Convolutional Neural Networks (3D-DCNNs),” Opt. Eng. 57(04), 1 (2018). [CrossRef]  

27. T. Nguyen, Y. Xue, Y. Li, L. Tian, and G. Nehmetallah, “A deep learning approach for Fourier ptychography microscopy,” Opt. Express 26(20), 26470–26484 (2018). [CrossRef]  

28. T. Nguyen, G. Nehmetallah, D. Tran, A. Darudi, and P. Soltani, “Fully automated, high speed, tomographic phase object reconstruction using the transport of intensity equation in transmission and reflection configurations,” Appl. Opt. 54(35), 10443–10453 (2015). [CrossRef]  

29. J. Jabbour, B. Malik, C. Olsovsky, R. Cuenca, S. Cheng, J. Jo, Y. Cheng, J. Wright, and K. Maitland, “Optical axial scanning in confocal microscopy using an electrically tunable lens,” Biomed. Opt. Express 5(2), 645–652 (2014). [CrossRef]  

30. B. Samanta, F. Pardo, T. Salamon, R. Kopf, and M. Eggleston, “Low-cost electrothermally actuated MEMS mirrors for high-speed linear raster scanning,” Optica 9(2), 251–257 (2022). [CrossRef]  

31. S. Shin, K. Kim, J. Yoon, and Y. Park, “Active illumination using a digital micromirror device for quantitative phase imaging,” Opt. Lett. 40(22), 5407–5410 (2015). [CrossRef]  

32. A. Matlock and L. Tian, “High-throughput, volumetric quantitative phase imaging with multiplexed intensity diffraction tomography,” Biomed. Opt. Express 10(12), 6432–6448 (2019). [CrossRef]  

33. R. Guo, I. Barnea, and N. Shaked, “Limited-angle tomographic phase microscopy utilizing confocal scanning fluorescence microscopy,” Biomed. Opt. Express 12(4), 1869–1881 (2021). [CrossRef]  

34. Y. Sung, W. Choi, C. Fang-Yen, K. Badizadegan, R. Dasari, and M. Feld, “Optical diffraction tomography for high resolution live cell imaging,” Opt. Express 17(1), 266–277 (2009). [CrossRef]  

35. B. Bazow, T. Phan, V. Lam, C. Raub, and G. Nehmetallah, “Synthetic training of machine learning algorithms for holographic cell imaging,” Proc. SPIE 11731, 1173102 (2021). [CrossRef]  

36. B. Bazow, T. Phan, C.B. Raub, and G. Nehmetallah, “Computational multi-wavelength phase synthesis using convolutional neural networks,” Appl. Opt. 61(5), B132–B146 (2022). [CrossRef]  

37. Y. Suzuki, S. Watanabe, M. Odaira, T. Okamura, and Y. Kawata, “Numerical simulations of partially coherent illumination for multi-scattering phase objects via a beam propagation method,” Appl. Opt. 58(4), 954–962 (2019). [CrossRef]  

38. U. Kamilov, I. Papadopoulos, M. Shoreh, A. Goy, C. Vonesch, M. Unser, and D. Psaltis, “Optical Tomographic Image Reconstruction Based on Beam Propagation and Sparse Regularization,” IEEE Trans. Comput. Imaging 2(1), 59–70 (2016). [CrossRef]  

39. X. Jin, L. Li, Z. Chen, L. Zhang, and Y. Xing, “Anisotropic total variation for limited-angle CT reconstruction,” IEEE Nuclear Science Symposuim & Medical Imaging Conference, 2232–2238 (2010).

40. W. Krauze, A. Kuś, and M. Kujawinska, “Limited-angle hybrid optical diffraction tomography system with total-variation-minimization-based reconstruction,” Opt. Eng. 54(5), 054104 (2015). [CrossRef]  

41. M. Rubin, O. Stein, N. Turko, Y. Nygate, D. Roitshtain, L. Karako, I. Barnea, R. Giryes, and N. Shaked, “TOP-GAN: Stain-free cancer cell classification using deep learning with a small training set,” Med. Image Anal. 57, 176–185 (2019). [CrossRef]  

42. M. Rubin, O. Stein, R. Giryes, D. Roitshtain, and N. T. Shaked, “Quantitative Phase Maps of Live Cells Classified By Transfer Learning and Generative Adversarial Network (GAN),” Computational Optical Sensing and Imaging Conference (COSI), 25–28 (2018).

43. I. Moon and K. Jaferzadeh, “Automated digital holographic image reconstruction with deep convolutional neural networks,” Proc. SPIE 11402, 114020A (2020). [CrossRef]  

44. X. Li, H. Qi, S. Jiang, P. Song, G. Zheng, and Y. Zhang, “Quantitative phase imaging via a cGAN network with dual intensity images captured under centrosymmetric illumination,” Opt. Lett. 44(11), 2879–2882 (2019). [CrossRef]  

45. A. Lucas, M. Iliadis, R. Molina, and A. Katsaggelos, “Using deep neural networks for inverse problems in imaging: beyond analytical methods,” IEEE Signal Process. Mag. 35(1), 20–36 (2018). [CrossRef]  

46. M. McCann, K. Jin, and M. Unser, “Convolutional neural networks for inverse problems in imaging: a review,” IEEE Signal Process. Mag. 34(6), 85–95 (2017). [CrossRef]  

47. D. Ryu, D. Ryu, Y. Baek, H. Cho, G. Kim, Y. Kim, Y. Lee, Y. Kim, J. Ye, H. Min, and Y. Park, “DeepRegularizer: rapid resolution enhancement of tomographic imaging using deep learning,” arXiv, arXiv:2009.13777 (2020).

48. K. Zhou and R. Horstmeyer, “Diffraction tomography with a deep image prior,” Opt. Express 28(9), 12872–12896 (2020). [CrossRef]  

49. P. Isola, J. Zhu, T. Zhou, and A. Efros, “Image-to-Image Translation with conditional adversarial networks,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5967–5976 (2017).

50. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” arXiv, arXiv:1406.2661 (2014).

51. M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv, arXiv:1411.1784 (2014). [CrossRef]  

52. D. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” arXiv, arXiv:1412.6980 (2015). [CrossRef]  

53. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” arXiv, arXiv:1505.04597 (2015).

54. D. Clevert, T. Unterthiner, and S. Hochreiter, “Fast and accurate deep network learning by exponential linear units (ELUs),” arXiv, arXiv:1511.07289 (2016). [CrossRef]  

55. S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” arXiv, arXiv:1502.03167 (2015). [CrossRef]  
