
Coherent modulation imaging using a physics-driven neural network


Abstract

Coherent modulation imaging (CMI) is a lensless diffraction imaging technique that uses an iterative algorithm to reconstruct a complex field from a single intensity diffraction pattern. Deep learning, as a powerful optimization method, can be used to solve highly ill-conditioned problems, including complex-field phase retrieval. In this study, a physics-driven neural network for CMI, termed CMINet, is developed to reconstruct the complex-valued object from a single diffraction pattern. The developed approach optimizes the network’s weights with a customized physical-model-based loss function, instead of training beforehand on any ground truth of the reconstructed object. Simulation results show that CMINet achieves high reconstruction quality with less noise and is robust to the physical parameters. Besides, a trained CMINet can reconstruct a dynamic process at high speed instead of iterating frame by frame. Biological experiment results show that CMINet reconstructs high-quality amplitude and phase images with sharper details, which makes it practical for biological imaging applications.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Coherent modulation imaging (CMI) is a newly developed lensless diffractive imaging method that requires only one frame of the diffraction pattern to reconstruct the complex-valued object [1,2]. Relying on a support constraint and a modulation constraint, CMI converges quickly to the complex-valued object and, to some extent, eliminates the inherent ambiguities of coherent diffraction imaging (CDI) [3]. As a single-shot imaging method, CMI has higher temporal resolution than ptychographic CDI [4] and is a powerful tool for observing dynamic processes [5]. So far, CMI has been demonstrated with visible light and X-rays, and it is currently of high interest and developing rapidly [6–9].

In CMI, the reconstruction problem is traditionally solved with an iterative projection algorithm that updates the object guess by propagating the wavefield estimate between the support plane and the detector plane, via the modulator, and applying constraints [9]. Alternatively, the reconstruction problem in CMI can be framed as a nonlinear minimization problem, in which an error metric is minimized with a deep-learning-based approach. Recently, deep learning has shown excellent performance in solving inverse problems in coherent diffraction imaging, ghost imaging, and holography [10–21]. As a data-driven gradient descent method, deep learning provides a new framework for designing powerful computational imaging algorithms [22]. Sinha et al. [23] demonstrated for the first time that an end-to-end deep neural network can be trained to solve lensless diffraction imaging for phase-only objects. To improve the noise resilience of the phase-from-intensity imaging problem, Kang et al. [24] proposed a deep learning method combining the phase extraction neural network (PhENN) with a Gerchberg–Saxton–Fienup (GSF) approximant based on CMI. Nevertheless, those methods remain data-dependent, relying on a large training dataset with ground truth images, which are difficult to obtain in many biomedical applications [25]. In a recent study, Wang et al. [26] demonstrated an unsupervised learning method, termed PhysenNet, that combines a neural network with a real-world physical model to reconstruct a phase-only object without any labeled data. However, for a complex-valued object there are more unknowns than for a phase-only object in the phase retrieval problem from a single intensity measurement. Therefore, more constraints are needed to reconstruct a complex-valued object from a single diffraction pattern.

In this paper, we develop a CMI-physical-model-based method to reconstruct a complex-valued object for CMI using a physics-driven neural network, termed CMINet, which needs only the single corresponding diffraction pattern as input. According to the CMI physical model, we design loss functions that combine the modulator constraint and the support constraint to optimize the parameters of the network; the phase retrieval problem thereby becomes an optimization problem. In addition, a trained CMINet can reconstruct a dynamic process at high speed instead of iterating frame by frame. Simulation and experimental results show that the developed method reconstructs complex-valued objects with high image quality and offers less noise, better convergence, robustness to physical parameters, and better performance on biological samples than the traditional iterative algorithm.

2. Theory

2.1 Physical model of CMI

Figure 1 shows the schematic of the CMI experiment in the optical near field. A coherent laser beam passes through a pinhole that is placed against the sample plane. The sample exit wave then propagates a distance $z_1$ to a modulator. The diffraction plane is set $z_2$ downstream of the modulator. The forward process of CMI can be expressed as:

$$I(x,y)=|{{\mathcal{F}}_{{{z}_{2}}}}\{{{\mathcal{F}}_{{{z}_{1}}}}\{S(r)\cdot {{U}_{obj}}(x,y)\}\cdot T(x,y)\}{{|}^{2}},$$
$$S(r) = \begin{cases} 1, & r\in \text{ support regions}\\ 0, & \text{otherwise} \end{cases},$$
where ’$\cdot$’ denotes element-wise multiplication; $S(r)$ denotes the support; $U_{obj}(x,y)=A(x,y){{e}^{(i\varphi (x,y))}}$ denotes the field at the exit plane of the complex-valued sample; $T(x,y)$ is the complex transmission function of the modulator; and ${{\mathcal {F}}_{z}}$ denotes the propagation operator over a distance $z$. Specifically, the angular spectrum propagator is used as the propagation operator in this paper.

Fig. 1. Schematic diagram of coherent modulation imaging.
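For concreteness, a minimal sketch of this forward model, written with PyTorch so it can also be used inside a training loop, is given below. The function names (prop_asm, cmi_forward), the band-limiting of evanescent components, and the unit conventions are our own illustrative choices, not the authors' code.

```python
import math
import torch

def prop_asm(u, z, wavelength, pixel_size):
    """Angular-spectrum propagation of a complex field u over a distance z (same units as wavelength)."""
    fy = torch.fft.fftfreq(u.shape[-2], d=pixel_size, device=u.device)
    fx = torch.fft.fftfreq(u.shape[-1], d=pixel_size, device=u.device)
    FY, FX = torch.meshgrid(fy, fx, indexing="ij")
    k2 = (1.0 / wavelength) ** 2 - FX ** 2 - FY ** 2
    kz = 2 * math.pi * torch.sqrt(torch.clamp(k2, min=0.0))  # evanescent components suppressed
    H = torch.exp(1j * kz * z)                                # free-space transfer function
    return torch.fft.ifft2(torch.fft.fft2(u) * H)

def cmi_forward(u_obj, support, modulator, z1, z2, wavelength, pixel_size):
    """I = |F_z2{ F_z1{ S(r) * U_obj } * T }|^2, i.e. Eq. (1)."""
    u = prop_asm(support * u_obj, z1, wavelength, pixel_size)  # sample exit wave -> modulator
    u = prop_asm(u * modulator, z2, wavelength, pixel_size)    # modulator -> detector
    return torch.abs(u) ** 2
```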

2.2 CMINet algorithm

The flowchart of CMINet is shown in Fig. 2(a). The reconstruction process starts by feeding the diffraction pattern $I(x,y)$ into the artificial neural network. The output of the artificial neural network, ${U}'_{obj}(x,y)$, can be described by the mapping function $H_{\theta }\{\cdot \}$:

$${U}'_{obj}(x,y)=H_{\theta}\{I(x,y)\},$$
where ${U}'_{obj}(x,y)={A}'(x,y){{e}^{(i{\varphi }'(x,y))}}$, ${A}'(x,y)$ and ${\varphi }'(x,y)$ are the predicted amplitude and phase, respectively.

Fig. 2. (a) Flowchart of the CMINet algorithm. (b) Architecture of the artificial neural network.

The physics-driven neural network means that the network’s weights are not pre-trained with any ground truth but are optimized with customized loss functions based on physical constraints. CMINet utilizes two customized loss functions based on CMI’s physical model. The first is calculated at the support plane and drives the pixel values outside the support to zero. The second is calculated at the detector plane and takes the forward model of CMI into account. The two losses are calculated separately as:

$$Loss1=MSE\{{A}'(x,y)\cdot (1-S(r)),{{0}_{N\times N}}\},$$
$$Loss2=MSE\{I(x,y),{I}'(x,y)\},$$
where $MSE$ denotes the mean squared error, ${I}'(x,y)=|{{\mathcal {F}}_{{z}_{2}}}\{{{\mathcal {F}}_{{{z}_{1}}}}\{S(r)\cdot {{{U}'}_{obj}}(x,y)\}\cdot T(x,y)\}{{|}^{2}}$ is the estimated diffraction pattern, and $0_{N\times N}$ is the $N\times N$ matrix of zeros. To associate the two constraints, the sum of $Loss1$ and $Loss2$ is back-propagated through the artificial neural network:
$$L = Loss1 + Loss2.$$

Consequently, the phase retrieval problem becomes an optimization problem, namely finding the parameters $\theta$ that minimize the loss function $L$:

$$\begin{aligned} &\quad\underset{\theta}{\operatorname{argmin}}L\left(H_{\theta}\left(I(x,y)\right),\ I(x,y)\right)\\ &=\underset{\theta}{\operatorname{argmin}}L\left(\ A^{\prime}(x,\ y)\ e^{\left(i\ \varphi^{\prime}(x,\ y)\right)},\ I(x,y)\right)\\ &=\underset{\theta}{\operatorname{argmin}}\ \Big\{\ Loss1(A^{\prime}(x,\ y)\ \cdot(1-S(r)),\ 0_{N\ \times\ N})\ +\ Loss2(I(x,\ y),\ I^{\prime}(x,\ y))\Big\}. \end{aligned}$$
In theory, by minimizing Eq. (7) we obtain parameters $\theta$ that make the network map the diffraction pattern to the complex-valued object.
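As an illustration, the objective of Eqs. (4)–(7) can be written as a single differentiable loss, reusing the cmi_forward sketch above. The assumption that the network returns the predicted amplitude and phase as two outputs, and all variable names, are illustrative; this is a sketch of the idea, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def cmi_loss(net, I_meas, support, modulator, z1, z2, wavelength, pixel_size):
    amp, phase = net(I_meas)                        # predicted A'(x,y) and phi'(x,y)
    u_obj = amp * torch.exp(1j * phase)             # U'_obj = A' exp(i phi')
    I_est = cmi_forward(u_obj, support, modulator, z1, z2, wavelength, pixel_size)
    loss1 = F.mse_loss(amp * (1 - support), torch.zeros_like(amp))  # support-plane constraint, Eq. (4)
    loss2 = F.mse_loss(I_est, I_meas)                               # detector-plane constraint, Eq. (5)
    return loss1 + loss2                                            # L = Loss1 + Loss2, Eq. (6)

# Typical optimization step (Adam, as used in Section 2.3):
# optimizer = torch.optim.Adam(net.parameters(), lr=5e-3)
# loss = cmi_loss(net, I_meas, support, modulator, z1, z2, wl, px); loss.backward(); optimizer.step()
```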

2.3 Algorithm implementation

In this study, we implement CMINet with the general encoder-decoder U-Net structure [27], which is widely used in image restoration [28,29] and image segmentation [30,31]. The network architecture is shown in Fig. 2(b); it consists of double convolution layers, transposed convolutions, max pooling, and skip connections with concatenation. The double convolution layer contains [Conv2d $+$ BatchNorm2d $+$ ReLU] $\times$ 2 with kernel size=3, padding=1, and stride=1. The transposed convolution is performed with kernel size=2 and stride=2. CMINet is implemented in Python (version 3.7.9) with the PyTorch framework (version 1.10) on a desktop workstation (E5-2630 @ 2.20GHz CPU and NVIDIA GTX 2080Ti GPU). In the training process, the Adam method is used to optimize the network with an initial learning rate of $5\times 10^{-3}$. The iterative CMI algorithm is implemented based on Ref. [9] and also uses the GPU to accelerate the iteration process.
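A minimal sketch of the building blocks just described is given below; channel widths and helper names are assumptions for illustration and do not reproduce the exact CMINet architecture.

```python
import torch.nn as nn

def double_conv(in_ch, out_ch):
    """[Conv2d + BatchNorm2d + ReLU] * 2 with kernel_size=3, padding=1, stride=1."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, stride=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, stride=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

down = nn.MaxPool2d(kernel_size=2)                         # encoder downsampling
up = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)  # decoder upsampling (example widths)
# In the decoder, upsampled features are concatenated with the matching encoder
# features (skip connection) before being passed to the next double_conv block.
```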

3. Simulation

3.1 Phase range of modulator & sample-to-modulator distance

Simulations are performed to compare the performance of CMINet with the iterative algorithm. We first analyze the effects of the phase range of the modulator and the sample-to-modulator distance $z_1$. The input diffraction pattern is $512\times 512$ pixels with a pixel size of $5.5\;\mu m$, and the support radius is 164 pixels (~0.9 mm). The ground truths of the amplitude and phase are shown in Figs. 3(a1) and (a2). The complex-valued modulators obtained in our previous work [5] are used, as shown in Figs. 3(b1), (b2), (g1), and (g2); they have the same amplitude distribution and different phase depths, $[0,{\pi }/{4}]$ and $[0,{\pi }/{2}]$. The reconstruction results are shown in Figs. 3(c1)-(f2) and (h1)-(k2), and we calculate the peak signal-to-noise ratio (PSNR) [32] of each reconstruction result for quantitative evaluation. One can see from the results and the PSNR values that, for the iterative algorithm, the reconstruction quality improves as $z_1$ increases, especially for the phase. The PSNR values also reveal that higher reconstruction quality can be obtained with a stronger modulator. However, the noise is not effectively suppressed in the reconstructed phase images. The results obtained with the iterative algorithm therefore indicate that a larger $z_1$ and a stronger modulator help improve the reconstruction quality. Compared to the iterative algorithm, CMINet maintains a stable, high reconstruction quality under different distances and modulators. There is a significant improvement in PSNR for the reconstructed phase images, as shown in Figs. 3(d2), (f2), (i2), and (k2), from which we can see that the noise is effectively suppressed. The cross sections highlighted by the red dashed lines in the reconstructed phase results, shown in Figs. 3(c3), (d3), (e3), (f3), (h3), (i3), (j3), and (k3), indicate that the results reconstructed by CMINet are closer to the ground truth.
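For reference, the PSNR values quoted here are computed on the central $200\times 200$-pixel region (red box in Fig. 3(a1)). A minimal sketch of such a metric is given below; the crop size comes from the figure caption, while the assumed data range of the normalized images is an illustrative choice.

```python
import numpy as np

def psnr_central(recon, truth, crop=200, data_range=1.0):
    """PSNR evaluated on a central crop of the reconstruction."""
    h, w = truth.shape
    top, left = (h - crop) // 2, (w - crop) // 2
    r = recon[top:top + crop, left:left + crop]
    t = truth[top:top + crop, left:left + crop]
    mse = np.mean((r - t) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```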

Fig. 3. The numerical simulation results with different modulators and diffraction distances. (a1) and (a2) are the ground truth of amplitude and phase, respectively. The red box region in (a1) (central $200\times 200$ pixels) is used to calculate the PSNR value. (b1) and (g1) are the modulator amplitude. (b2) and (g2) are the modulator phase. (c1)-(f2) and (h1)-(k2) are the reconstruction results from the iterative algorithm and CMINet with different modulators and distances. The PSNR value is inserted at the bottom of each reconstruction result. (c3)-(k3) are the cross sections of each reconstructed phase image compared with the ground truth. The blue line is the ground truth as shown in (a2), and the red dashed line is the reconstructed phase in each result.

3.2 Support constraint

Another important factor affecting CMI reconstruction quality is the support constraint, which has been widely used in iterative algorithms [33–35]. Five different support radii, $r=136\;pix$ (~0.75 mm), $r=164\;pix$ (~0.9 mm), $r=181\;pix$ (~1.0 mm), $r=200\;pix$ (~1.1 mm), and $r=\infty$ (without support), are analyzed to examine the performance of CMINet and the iterative algorithm. In this simulation, we use the complex-valued modulator with a phase range of $[0,{\pi }/{2}]$, as shown in Figs. 3(g1) and (g2). The diffraction distances are $z_1=30\;mm$ and $z_2=30\;mm$. The reconstruction results are presented in Fig. 4. As the support radius increases, the reconstruction quality of the iterative algorithm deteriorates for both the amplitude and phase, as shown in Figs. 4(b1, 2), (d1, 2), (f1, 2), (h1, 2), and (j1, 2). With a fixed-size diffraction pattern, a small support radius helps collect more high-angle diffraction information, so the high-frequency components of the diffraction pattern can be used in the iterative process to reconstruct the complex-valued object with high quality [36]. The reconstruction quality of both methods is affected by the high-angle information in the diffraction patterns, as shown in the last row of Fig. 4. From the results in Figs. 4(c1, 2), (e1, 2), (g1, 2), (i1, 2), and (k1, 2), it is apparent that CMINet performs well under different support radii. CMINet is a learning-based method that stacks many linear and nonlinear layers with a large number of parameters and can therefore fit a complex mapping. Moreover, CMINet uses advanced optimizers [37–39], such as Momentum, AdaGrad, and Adam, which have been shown to improve reconstruction quality in iterative algorithms [40]. The quantitative PSNR evaluation indicates that CMINet is robust to the support radius.

Fig. 4. The reconstruction results of different support radii. The PSNR value is calculated on the central 200${\times }$200 pixels of each image. The last row shows the diffraction patterns with different support radii.

3.3 Computation time

The computation time of an algorithm is an important feature for practical applications. As a single-shot imaging technique, CMI can be used to reconstruct a dynamic process that generates a large amount of data in a short time. The iterative algorithm must restart the iteration for each diffraction pattern, which is a time-consuming process. Fortunately, a learning-based method can process massive amounts of data in a far shorter time than the iterative algorithm: the network can ’learn’ and ’remember’ the mapping through a training process on a training set, and the trained network generalizes to similar data. Thus, we can use a small part of the data from a dynamic process to train a network and then use the trained network to predict the whole dynamic process. Here, we take the flow of chloroplasts in black algae as a typical biological dynamic process to explain the principle more explicitly and to compare the reconstruction time of both methods.

A biological dynamic process, the flow of chloroplasts in black algae, is recorded over 180 seconds at 24 FPS, giving a total of 4320 frames of simulated diffraction patterns. The collected data form a continuously changing time series, which means that adjacent frames are similar. Thus, the whole dynamic process can be reconstructed by a network trained on only part of the collected data, thanks to the generalization ability of the network. The collected data are divided into a training set and a testing set, and the training set is used to train CMINet. One diffraction pattern is randomly selected in each equal time interval to form the training set. We take three time intervals, i.e., 1 s, 2 s, and 4 s, to train CMINet, which yields three training sets with 180, 90, and 45 frames, respectively. After training, all collected data are fed into the trained CMINet to reconstruct the dynamic process. Part of the reconstruction results from CMINet with different training sets are shown in Figs. 5(a) and (b), and the average PSNR of the training sets and testing sets is listed in Table 1. One can clearly see that the average PSNR values of the different training sets are very close, while the average PSNR values of the testing sets increase as the training set grows. Increasing the size of the training set helps improve the generalization ability of the network on similar data. Besides, when the training set size equals 180, the PSNR values of the training set and testing set are close, which means that the trained network has enough generalization ability to reconstruct the dynamic process (please see Visualization 1 for the whole dynamic process reconstructed by CMINet trained with the 180-frame training set). We also observe that the training time increases sublinearly and the number of training epochs decreases as the training set size increases. Because the frames of the continuously changing dynamic process are similar, the loss value changes only slightly when the network is trained with similar data, so the training time increases sublinearly while the training set grows linearly, as the loss curves in Fig. 6 show. Although the number of epochs decreases, the number of updates of the parameters $\theta$ increases, because the number of updates per epoch equals the training set size divided by the batch size (in this study, batch size = 15). After training, all diffraction patterns are input into the trained network, and the average reconstruction speed is 0.004 s/frame, so reconstructing the whole dynamic process takes about 17 s. For the iterative algorithm, one frame takes about 8 s with 500 iterations in this study, so reconstructing the whole dynamic process frame by frame takes about 34560 s ($\sim$9.6 hours). The average PSNR values of the amplitude and phase reconstructed by iterative CMI are 27.5463 dB and 12.5467 dB, respectively. In this case, the sum of the training time and reconstruction time of CMINet is far less than that of the iterative method. These results make it clear that the learning method has a computation-time advantage when dealing with the huge amounts of data generated by a dynamic process.
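The training-set construction described above (one randomly chosen diffraction pattern per fixed time interval) could be sketched as follows; the frame rate and interval values come from the text, while the function and variable names are illustrative.

```python
import random

def sample_training_frames(n_frames=4320, fps=24, interval_s=1.0):
    """Pick one frame index at random from every interval of length interval_s seconds."""
    step = int(fps * interval_s)
    return [random.randrange(start, min(start + step, n_frames))
            for start in range(0, n_frames, step)]

train_idx = sample_training_frames(interval_s=4.0)              # 45 frames (1 s -> 180, 2 s -> 90)
test_idx = [i for i in range(4320) if i not in set(train_idx)]  # remaining frames form the testing set
```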

Fig. 5. Partial reconstruction results of the dynamic process from CMINet and the iterative algorithm. (a) Reconstruction results of CMINet with different training set sizes, evaluated on frames from the training set. (b) Reconstruction results of CMINet on the testing set. (c) Reconstruction results of the iterative algorithm.

Fig. 6. MSE loss curves of CMINet with different training set sizes.

Table 1. Average PSNR values of CMINet with different training set sizes and training times.

4. Experimental setup

The experimental setup is shown in Fig. 7. A collimated 637 nm laser beam (Coherent, OBIS 637LX) is incident on a beam splitter and then illuminates the pinhole and sample. The modulator is placed at $z_1=24\;mm$ behind the sample. The diffraction pattern is recorded at a distance $z_2=8.6\;mm$ downstream of the modulator by an 8-bit charge-coupled device (CCD) (IMPERX 6620B, $5.5\;\mu m \times 5.5\;\mu m$ pixel size).

Fig. 7. (a) Experimental setup. CCD: charge-coupled device. The (b) amplitude and (c) phase transmission function of the modulator calibrated by ptychography.

5. Experiment results and analysis

5.1 USAF resolution target

In our experiment, the USAF resolution target is used as the object, and two pinholes of different sizes, ~2 mm ($r=181\;pix$) and ~1.5 mm ($r=136\;pix$), are used. The diffraction patterns are shown in Figs. 8(c) and (g), and the reconstruction results in Figs. 8(a1), (a2), (b1), (b2) and (e1), (e2), (f1), (f2). For $r=181\;pix$, one can clearly see that CMINet’s results have lower noise than those of the iterative algorithm in both the amplitude and phase images, as shown in Figs. 8(a1)-(b2). For $r = 136\;pix$, the amplitude and phase reconstructed by the two methods are basically the same, and the improvement in reconstruction quality for the iterative algorithm can be observed visually. These results are consistent with the simulation results in Fig. 4. To quantitatively investigate the improvement in SNR, the relative standard deviation (RSD) is calculated within the background:

$$RSD=\frac{\sqrt{\sum_{i=1}^n (x_i-\bar{x})^2/(n-1)}}{\bar{x}} {\times 100\%},$$
where $x_i$ is the pixel value, $n$ is the total number of pixels, and $\bar {x}$ is the corresponding average value. The background regions marked in yellow are illustrated in Figs. 8(d) and (h) for $r=181\;pix$ and $r=136\;pix$, respectively. For $r=181\;pix$ (Figs. 8(a1)-(b2)), the RSD values of the amplitude and phase reconstructed by the iterative algorithm are 9.38% and 4.87%, while those of CMINet are 4.40% and 1.27%, i.e., about 46.9% and 26.1% of the iterative values. These results indicate that CMINet achieves higher reconstruction quality on both the amplitude and phase than the iterative algorithm in the case of a large support radius. Therefore, compared to the iterative algorithm, CMINet can obtain a larger field of view and higher reconstruction quality at the same time.
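A minimal sketch of the RSD metric of Eq. (8), evaluated over a boolean background mask such as the yellow regions in Figs. 8(d) and (h), is shown below; the mask itself is assumed to be given.

```python
import numpy as np

def rsd(image, background_mask):
    """Relative standard deviation (percent) of the pixel values inside the background mask."""
    vals = image[background_mask]
    return np.std(vals, ddof=1) / np.mean(vals) * 100.0  # sample std (n-1 denominator) over mean, Eq. (8)
```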

Fig. 8. Experimental reconstruction results of the USAF resolution target. (a1), (a2), (b1), (b2) and (e1), (e2), (f1), (f2) are the reconstruction results with the iterative algorithm and CMINet with different support radii, $r=181\;pix$ and $r=136\;pix$. (c) and (g) are the input diffraction patterns. (d) and (h) illustrate the background region (yellow part) for RSD calculation. (i) MSE curves.

We further study the convergence of CMINet and the iterative algorithm by calculating the MSE between the recorded diffraction pattern and the estimated diffraction pattern after each epoch or iteration. Figure 8(i) shows the convergence curves of both methods with experimental data for the two support radii. One can see that the MSE values of both methods drop rapidly to ~$1\times 10^{-2}$ within the first 20 epochs/iterations. However, the iterative algorithm stagnates at an error of ~$1\times 10^{-2}$ under both conditions, while for CMINet the MSE values slowly decrease to ~$1\times 10^{-3}$ over the next 300 epochs. Compared to the iterative algorithm, CMINet achieves better convergence.

5.2 Spatial resolution limit of the experimental setup

In this study, we use a plane laser wave to illuminate the samples and collect the diffraction pattern with a pixel size of $5.5\;\mu m$. Theoretically, the resolution $R$ of our lensless diffraction imaging system [41] can be determined by:

$$R=PixelSize\times 2.3,$$

where the factor 2.3 compensates for the Nyquist limit. Thus, the resolution limit of our system is $12.65\;\mu m$, which is close to element 3 of group 5 $(12.40\;\mu m)$. Figure 9 shows the results reconstructed by the iterative algorithm and CMINet for groups 4 and 5 of the resolution target. From the reconstructed amplitudes in Figs. 9(a2) and (c2), both methods approach the resolution limit of the current experimental setup. The cross sections of the reconstructed amplitude show that the curve of CMINet is sharper than that of the iterative algorithm, especially near the resolution limit at pixel positions 40 to 70 in Fig. 9(e1). Similar observations hold for the reconstructed phase results in Fig. 9(e2). Besides, one can clearly see that the phase image reconstructed by CMINet has less noise than that of the iterative algorithm, as shown in Figs. 9(b2) and (d2).

Fig. 9. Experimental reconstruction results for groups 4 and 5 of the USAF resolution target by the iterative algorithm and CMINet, with cross sections of the reconstructed amplitude and phase.

5.3 Biological sample

To further verify the performance of CMINet for biological applications, we conduct an experiment using a flower of a Capsella bursa-pastoris slide as the sample, shown in Fig. 10(a), and compare the reconstruction results of the iterative algorithm and CMINet in Fig. 10. Figure 10(b) is the recorded diffraction pattern. Figures 10(c1) and (c2) show the amplitude and phase, respectively, reconstructed by the iterative algorithm from Fig. 10(b). In the reconstructed amplitude image in Fig. 10(c1), noise seriously affects the image quality, leaving the outline of the sample ambiguous, as shown in the zoom-in view of the red box in Fig. 10(c1). In the reconstructed phase image (Fig. 10(c2)), the internal structures of the sample are vague. The amplitude and phase reconstructed by CMINet are shown in Figs. 10(d1) and (d2), respectively. The sample edge is clear and easily identifiable in the CMINet-reconstructed amplitude in Fig. 10(d1), and the internal structures of the sample can be clearly distinguished in the wrapped phase image in Fig. 10(d2). Besides, the unwrapped phase results of the iterative algorithm and CMINet are shown in Figs. 10(c3) and (d3), respectively, obtained with the TIE-FD-based iteration method [42]. These results show that CMINet achieves high reconstruction quality with less noise and is practical for biological applications.

Fig. 10. The reconstruction of the flower of Capsella bursa-pastoris sample from a single measurement by the iterative algorithm and CMINet. (a) The flower of Capsella bursa-pastoris sample image taken by a conventional microscope with a ${\times }$10 objective lens. (b) The recorded diffraction pattern. (c1)-(c3) Reconstruction results of the iterative algorithm. (d1)-(d3) Reconstruction results of CMINet.

6. Conclusion

In conclusion, we propose CMINet, a physics-driven neural network to reconstruct the complex-valued object in the CMI configuration. We design loss functions based on physical constraints to optimize the network’s weights, and the trained CMINet can directly map the diffraction patterns of a dynamic process to the complex-valued objects instead of iterating frame by frame. The developed method is validated with numerical simulations and experiments. The results show that CMINet has higher reconstruction quality, less noise, and better convergence. Compared to the iterative algorithm, CMINet is robust to physical parameters such as the phase range of the modulator, the support radius, and the diffraction distance. Moreover, CMINet performs well on biological samples. In summary, CMINet improves on the iterative algorithm and provides a new, powerful phase retrieval strategy for visible-light, X-ray, and high-energy electron imaging in biological and materials science applications.

Appendix A

This appendix shows the reconstruction results of CMINet on the Labeled Faces in the Wild (LFW) [43] dataset to assess the reconstruction performance with different inputs. 200 images from LFW are used to generate 100 complex-valued samples. In this simulation, the diffraction patterns are generated with the modulator in Figs. 3(g1) and (g2) and distances $z_1=30\;mm$ and $z_2=30\;mm$. Each complex-valued sample is reconstructed by a separate network, and two reconstruction results are shown in Fig. 11. The average PSNR is 37.1260 dB over all reconstructed amplitudes and 25.8854 dB over all reconstructed phases.
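The exact construction of the complex-valued LFW samples is not detailed; a plausible sketch, assuming each sample pairs one image as amplitude and a second image as phase (200 images giving 100 samples), is shown below. The normalization and the phase range are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def make_complex_sample(img_amp, img_phase, phase_range=np.pi / 2):
    """Combine two grayscale images into one complex-valued sample (assumed scheme)."""
    amp = img_amp / img_amp.max()                      # amplitude normalized to [0, 1]
    phase = img_phase / img_phase.max() * phase_range  # phase mapped to [0, phase_range]
    return amp * np.exp(1j * phase)
```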

Fig. 11. Two examples reconstructed by CMINet on the LFW dataset.

Funding

National Natural Science Foundation of China (61975205, 6207521, 62131011); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2017489); Natural Science Foundation of Hebei Province (F2018402285); Hebei Province Innovation Capability Improvement Plan (No. 20540302D); Fundamental Research Funds for the Central Universities; Fusion Foundation of Research and Education of CAS.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. F. Zhang, G. Pedrini, and W. Osten, “Phase retrieval of arbitrary complex-valued fields through aperture-plane modulation,” Phys. Rev. A 75(4), 043805 (2007). [CrossRef]  

2. F. Zhang and J. M. Rodenburg, “Phase retrieval based on wave-front relay and modulation,” Phys. Rev. B 82(12), 121104 (2010). [CrossRef]  

3. Z. He, B. Wang, J. Bai, G. Barbastathis, and F. Zhang, “High-quality reconstruction of coherent modulation imaging using weak cascade modulators,” Ultramicroscopy 214, 112990 (2020). [CrossRef]  

4. J. M. Rodenburg and H. M. L. Faulkner, “A phase retrieval algorithm for shifting illumination,” Appl. Phys. Lett. 85(20), 4795–4797 (2004). [CrossRef]  

5. J. Zhang, D. Yang, Y. Tao, Y. Zhu, W. Lv, D. Miao, C. Ke, H. Wang, and Y. Shi, “Spatiotemporal coherent modulation imaging for dynamic quantitative phase and amplitude microscopy,” Opt. Express 29(23), 38451 (2021). [CrossRef]  

6. B. Wang, Q. Wang, W. Lyu, and F. Zhang, “Modulator refinement algorithm for coherent modulation imaging,” Ultramicroscopy 216, 113034 (2020). [CrossRef]  

7. X. Pan, C. Liu, and J. Zhu, “Phase retrieval with extended field of view based on continuous phase modulation,” Ultramicroscopy 204, 10–17 (2019). [CrossRef]  

8. X. Dong, X. Pan, C. Liu, and J. Zhu, “Single shot multi-wavelength phase retrieval with coherent modulation imaging,” Opt. Lett. 43(8), 1762 (2018). [CrossRef]  

9. F. Zhang, B. Chen, G. R. Morrison, J. Vila-Comamala, M. Guizar-Sicairos, and I. K. Robinson, “Phase retrieval by coherent modulation imaging,” Nat. Commun. 7(1), 13367 (2016). [CrossRef]  

10. K. Wang, J. Dou, Q. Kemao, J. Di, and J. Zhao, “Y-Net: a one-to-two deep learning framework for digital holographic reconstruction,” Opt. Lett. 44(19), 4765 (2019). [CrossRef]  

11. K. Wang, Q. Kemao, J. Di, and J. Zhao, “Y4-Net: a deep learning solution to one-shot dual-wavelength digital holographic reconstruction,” Opt. Lett. 45(15), 4220 (2020). [CrossRef]  

12. T. Nguyen, Y. Xue, Y. Li, L. Tian, and G. Nehmetallah, “Deep learning approach for Fourier ptychography microscopy,” Opt. Express 26(20), 26470 (2018). [CrossRef]  

13. Y. Wu, Y. Rivenson, Y. Zhang, Z. Wei, H. Günaydin, X. Lin, and A. Ozcan, “Extended depth-of-field in holographic imaging using deep-learning-based autofocusing and phase recovery,” Optica 5(6), 704 (2018). [CrossRef]  

14. O. Wengrowicz, O. Peleg, T. Zahavy, B. Loevsky, and O. Cohen, “Deep neural networks in single-shot ptychography,” Opt. Express 28(12), 17511 (2020). [CrossRef]  

15. D. Yang, J. Zhang, Y. Tao, W. Lv, S. Lu, H. Chen, W. Xu, and Y. Shi, “Dynamic coherent diffractive imaging with a physics-driven untrained learning method,” Opt. Express 29(20), 31426 (2021). [CrossRef]  

16. S. Li, M. Deng, J. Lee, A. Sinha, and G. Barbastathis, “Imaging through glass diffusers using densely connected convolutional networks,” Optica 5(7), 803 (2018). [CrossRef]  

17. Z. Ren, Z. Xu, and E. Y. Lam, “Learning-based nonparametric autofocusing for digital holography,” Optica 5(4), 337 (2018). [CrossRef]  

18. Y. Li, Y. Xue, and L. Tian, “Deep speckle correlation: a deep learning approach toward scalable imaging through scattering media,” Optica 5(10), 1181 (2018). [CrossRef]  

19. F. Wang, C. Wang, M. Chen, W. Gong, Y. Zhang, S. Han, and G. Situ, “Far-field super-resolution ghost imaging with a deep neural network constraint,” Light: Sci. Appl. 11(1), 1 (2022). [CrossRef]  

20. F. Wang, H. Wang, H. Wang, G. Li, and G. Situ, “Learning from simulation: An end-to-end deep-learning approach for computational ghost imaging,” Opt. Express 27(18), 25560 (2019). [CrossRef]  

21. M. Lyu, W. Wang, H. Wang, H. Wang, G. Li, N. Chen, and G. Situ, “Deep-learning-based ghost imaging,” Sci. Rep. 7(1), 17865 (2017). [CrossRef]  

22. G. Barbastathis, A. Ozcan, and G. Situ, “On the use of deep learning for computational imaging,” Optica 6(8), 921 (2019). [CrossRef]  

23. A. Sinha, J. Lee, S. Li, and G. Barbastathis, “Lensless computational imaging through deep learning,” Optica 4(9), 1117 (2017). [CrossRef]  

24. I. Kang, F. Zhang, and G. Barbastathis, “Phase extraction neural network (PhENN) with coherent modulation imaging (CMI) for phase retrieval at low photon counts,” Opt. Express 28(15), 21578 (2020). [CrossRef]  

25. G. Wang, J. C. Ye, and B. De Man, “Deep learning for tomographic image reconstruction,” Nat. Mach. Intell. 2(12), 737–748 (2020). [CrossRef]  

26. F. Wang, Y. Bian, H. Wang, M. Lyu, G. Pedrini, W. Osten, G. Barbastathis, and G. Situ, “Phase imaging with an untrained neural network,” Light: Sci. Appl. 9(1), 77 (2020). [CrossRef]  

27. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical image computing and computer-assisted intervention, (Springer, 2015), pp. 234–241.

28. Y. Rivenson, Z. Göröcs, H. Günaydin, Y. Zhang, H. Wang, and A. Ozcan, “Deep learning microscopy,” Optica 4(11), 1437 (2017). [CrossRef]  

29. X. Mao, C. Shen, and Y. B. Yang, “Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections,” in Advances in Neural Information Processing Systems, vol. 29, D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, eds. (Curran Associates, Inc., 2016).

30. O. Cicek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, “3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, vol. 9901, S. Ourselin, L. Joskowicz, M. R. Sabuncu, G. Unal, and W. Wells, eds., Lecture Notes in Computer Science (Springer International Publishing, Cham, 2016), pp. 424–432.

31. E. Shelhamer, J. Long, and T. Darrell, “Fully Convolutional Networks for Semantic Segmentation,” IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017). [CrossRef]  

32. P. Gupta, P. Srivastava, S. Bhardwaj, and V. Bhateja, “A modified PSNR metric based on HVS for quality assessment of color images,” in 2011 International Conference on Communication and Industrial Application, (IEEE, Kolkata, West Bengal, India, 2011), pp. 1–4.

33. J. R. Fienup, “Reconstruction of a complex-valued object from the modulus of its fourier transform using a support constraint,” J. Opt. Soc. Am. A 4(1), 118–123 (1987). [CrossRef]  

34. B. Abbey, K. A. Nugent, G. J. Williams, J. N. Clark, A. G. Peele, M. A. Pfeifer, M. De Jonge, and I. McNulty, “Keyhole coherent diffractive imaging,” Nat. Phys. 4(5), 394–398 (2008). [CrossRef]  

35. J. M. Rodenburg, “Ptychography and related diffractive imaging methods,” Adv. Imaging Electron Phys. 150, 87–184 (2008). [CrossRef]  

36. M. Humphry, B. Kraus, A. Hurst, A. Maiden, and J. Rodenburg, “Ptychographic electron microscopy using high-angle dark-field scattering for sub-nanometre resolution imaging,” Nat. Commun. 3(1), 730 (2012). [CrossRef]  

37. I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the importance of initialization and momentum in deep learning,” in International conference on machine learning, (PMLR, 2013), pp. 1139–1147.

38. J. Duchi, E. Hazan, and Y. Singer, “Adaptive subgradient methods for online learning and stochastic optimization,” J. Mach. Learn Res. 12, 2121–2159 (2011). [CrossRef]  

39. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).

40. A. Maiden, D. Johnson, and P. Li, “Further improvements to the ptychographical iterative engine,” Optica 4(7), 736–745 (2017). [CrossRef]  

41. B. R. Masters, “Handbook of biological confocal microscopy,” J. Biomed. Opt. 13(2), 029902 (2008). [CrossRef]  

42. Z. Zhao, H. Zhang, C. Ma, C. Fan, and H. Zhao, “Comparative study of phase unwrapping algorithms based on solving the Poisson equation,” Meas. Sci. Technol. 31(6), 065004 (2020). [CrossRef]  

43. G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” in Workshop on Faces in ‘Real-Life’ Images: Detection, Alignment, and Recognition (2008).

Supplementary Material

Visualization 1: The dynamic process of the flow of chloroplasts in black algae reconstructed by the trained CMINet (training set size = 180).
