
Quantitative phase imaging based on model transfer learning


Abstract

Convolutional neural networks have been widely used in optical information processing, and the generalization ability of a network depends greatly on the scale and diversity of its datasets; however, acquiring and annotating massive datasets has become a common obstacle to further progress. In this study, a model-transfer-based quantitative phase imaging (QPI) method is proposed, which fine-tunes the network parameters by loading a pre-trained base model and applying transfer learning, endowing the network with good generalization ability. Most importantly, a feature fusion method based on moment reconstruction is proposed for training dataset generation, which can construct datasets rich enough to cover most situations and accurately annotated; it fundamentally addresses the problems of dataset scale and representational ability. In addition, a feature distribution distance scoring (FDDS) rule is proposed to evaluate the rationality of the constructed datasets. The experimental results show that this method is suitable for different types of samples and achieves fast, high-accuracy phase imaging, greatly relieving the pressures of data acquisition, tagging, and generalization in data-driven methods.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Quantitative phase imaging (QPI) [1,2] is an effective way to address the loss of phase information in optical imaging: the phase can be obtained by quantitatively measuring the changes in the diffraction pattern or interference fringes caused by refractive-index differences, making QPI a powerful, label-free imaging method for cells and tissues. Owing to its non-contact, high-accuracy, full-field, and fast nature, QPI has been widely used in the field of microscopic imaging [3–8].

As an end-to-end method, deep learning (DL) [9] can be trained on large amounts of data, extract feature maps from the input image through convolution kernels, and establish a mapping from input to output. It has been widely shown that DL can be combined with QPI to perform phase shifting [10] and phase unwrapping [11–13] and to suppress, to a certain extent, system noise and self-interference spatial artifacts [14–16]. Moreover, owing to its powerful nonlinear fitting ability, a neural network can extract the optical path difference of a measured sample from the intensity distribution of an interferogram and recover the phase directly [17–21], providing a new approach to QPI. However, DL also has shortcomings in QPI applications, mainly embodied in two points. First, as a supervised learning method, a convolutional neural network (CNN) requires high-quality training datasets; regardless of the imaging device, one has to record a large number of interferograms and recover the target phases to obtain the training data and corresponding labels, which inevitably leads to complexity, long acquisition times, and extreme tagging pressure. Second, a CNN's generalization ability is closely tied to its datasets: when the feature distribution of the measured sample differs from that of the training samples, the phase recovery precision of the network drops rapidly, which severely restricts wider application.

Datasets are usually composed of two types of data: simulated and experimental. A common way to obtain simulated data is to build phase images from special functions or open datasets and then generate the corresponding interferograms. Simulated datasets have certain advantages for standardization and generalization, but there is still no representative method that guarantees a scientific and reasonable construction of simulated data, and the gap between simulated and experimental data strongly affects the accuracy of the network output. Experimental data are better suited to solving specific problems, but acquiring them at scale is difficult, and computing and annotating the corresponding labels demands considerable time and labor. To ease the difficulty of obtaining large amounts of experimental data, data augmentation can be adopted, i.e., creating multiple copies of the same image with slight changes such as mirroring, random cropping, and shearing; while this enlarges the dataset, it also risks introducing irrelevant data. Therefore, if we can design a scientific, reasonable, and effective method to build simulated datasets, solving the primary problem that every neural network must face, and then use transfer learning to complete the transfer from the simulated dataset to experimental datasets and diverse samples, the neural network can be endowed with generalization ability. This is undoubtedly a beneficial exploration toward solving the common problems of DL in QPI applications.

Therefore, we propose a QPI method based on deep transfer learning. The network parameters can be fine-tuned to recover the phase of various samples through transfer learning [22]. To perform transfer learning successfully, we also propose a feature fusion method based on moment reconstruction for training set generation, which greatly reduces the cost of acquiring and tagging datasets. More importantly, a class-difference evaluation method based on the global and local features of images is proposed to assess the feature differences between different types of samples, which helps select the sample with high task commonality to train the base model. A neural network fine-tuned from this base model shows a significant improvement in phase recovery accuracy, can be used to recover the phases of multiple kinds of samples at the same time, and exhibits strong generalization ability.

2. Methods

In this section, a QPI method based on model transfer is proposed, as detailed in Fig. 1. First, we propose a sample reconstruction algorithm based on Tchebichef moments to construct a feature-rich and correctly labeled dataset. Then, features are extracted from the reconstructed samples and the other samples, and the commonality of each task is calculated according to the FDDS rule; the task with the highest commonality is selected as the dataset for obtaining the pre-training model, which at the same time verifies the rationality of the reconstructed samples. Lastly, a conditional generative adversarial network (cGAN) [23] is selected as the base model to establish the mapping between the interferogram and the target phase image. Through transfer learning, the network parameters of the source domain are applied to the tasks of the target domain, realizing QPI with small sample datasets.

Fig. 1. Flowchart of the proposed method for QPI based on transfer learning.

2.1 Transfer learning based on cGAN

In data-driven QPI methods, model performance improves steadily as the amount of data increases. However, rich and accurately labeled datasets are hard to obtain, and collecting them is time-consuming and expensive. The difficulty of collecting and labeling datasets and the scarcity of sample categories limit the generalization ability of the trained network, making it impossible to achieve good robustness on new samples outside the training set. Therefore, a model-based transfer learning method is introduced to transfer the parameters of a network trained on a similar task to a new sample. Information from the source domain can be utilized by the target domain, reducing the need for large amounts of training data and avoiding learning the new task from scratch. By learning the features of a small number of target samples and fine-tuning the model parameters, the network can achieve rapid and high-precision phase recovery for different applications.
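As a concrete illustration of this strategy, a minimal PyTorch sketch follows; the stand-in network, checkpoint path, learning rate, and `target_loader` are all placeholders rather than the authors' actual code.

```python
import torch
import torch.nn as nn

# Stand-in for the cGAN generator of Section 2.1 (placeholder architecture).
generator = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 1, 3, padding=1))

# 1. Load source-domain parameters (the base model); the path is a placeholder.
generator.load_state_dict(torch.load("base_model.pth"))

# 2. Fine-tune on the small target-domain dataset with a reduced learning rate.
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-5)
generator.train()
for interferogram, target_phase in target_loader:   # placeholder DataLoader
    optimizer.zero_grad()
    # RMSE between predicted and target phase, as in Eq. (1) below.
    loss = torch.sqrt(torch.mean((generator(interferogram) - target_phase) ** 2))
    loss.backward()
    optimizer.step()
```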

The cGAN is selected as the base model for the proposed method; it introduces a conditional variable y on top of the generative adversarial network (GAN) [24]. The additional information y and the prior input form a joint hidden-layer representation that supervises the model and guides the subnetworks' outputs. For the generation task, we design the generator with reference to U-Net [25]: the random noise input is replaced by the interferogram, and the target phase serves as the conditional variable y. The architecture of the generator is shown in Fig. 2. In the encoder, convolution operations (kernel size 3×3, stride 2) extract and enrich the features of the interferogram while reducing the resolution of the feature maps. At the same time, each layer's activations are normalized to zero mean and unit variance to stabilize gradient propagation and accelerate network convergence. The decoder is essentially symmetric with the encoder, and skip connections splice encoder and decoder feature maps of the same resolution, better fusing low-order and high-order feature information; deconvolutions (kernel size 5×5, stride 2) finally output a phase image with the same resolution as the input, completing the direct mapping from interferogram to phase image. The root mean square error (RMSE) of the phase image is employed as the loss function of the generator:

$$Loss = \sqrt{\frac{1}{N}\sum\limits_{i = 1}^{N} {({y_i} - {\hat{y}_i})^2}}$$
where N is the number of pixels in the input image, ŷ is the phase image obtained by the network, and y is the ground truth of the target phase.
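To make the encoder-decoder description above concrete, the following is a minimal PyTorch sketch of a U-Net-style generator with the stated kernel sizes and strides; the depth and channel counts are our assumptions, not the paper's exact architecture (Fig. 2 shows the full network).

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 3x3 convolutions with stride 2 halve the resolution;
        # batch normalization keeps activations near zero mean, unit variance.
        self.enc1 = nn.Sequential(nn.Conv2d(1, 32, 3, stride=2, padding=1),
                                  nn.BatchNorm2d(32), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1),
                                  nn.BatchNorm2d(64), nn.ReLU())
        # Decoder: 5x5 transposed convolutions with stride 2 restore resolution.
        self.dec2 = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 5, stride=2, padding=2, output_padding=1),
            nn.BatchNorm2d(32), nn.ReLU())
        self.dec1 = nn.ConvTranspose2d(64, 1, 5, stride=2, padding=2,
                                       output_padding=1)

    def forward(self, interferogram):
        e1 = self.enc1(interferogram)                  # 1/2 resolution
        e2 = self.enc2(e1)                             # 1/4 resolution
        d2 = self.dec2(e2)                             # back to 1/2 resolution
        # Skip connection: splice decoder features with same-resolution
        # encoder features to fuse low-order and high-order information.
        return self.dec1(torch.cat([d2, e1], dim=1))   # full-resolution phase
```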

Fig. 2. Architecture of the generator.

In the discriminator, a CNN is used to construct a two-class network: the interferogram is spliced with the target phase and with the phase image produced by the generator to form the network inputs, and convolution operations (kernel size 3×3, stride 1) again extract the features. The detailed architecture of the discriminator is given in Fig. 3. The network outputs a confidence scalar between 0 and 1 that identifies the source of the input images, where 0 indicates an image generated by the generator and 1 indicates that the input comes from the real distribution. Cross entropy measures the error between the network output and the label and acts as the loss function of the discriminator:

$$CrossEntropyLoss = -[y\log(\hat{y}) + (1 - y)\log(1 - \hat{y})]$$

During training, the generator and discriminator compete against each other to improve their own performance. The two networks reach a balance when the discriminator can no longer distinguish whether the current input phase image comes from the generator or from the real distribution, indicating that the generator has learned the target distribution.
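As an illustration of this adversarial scheme, a simplified PyTorch training step is sketched below. Combining the RMSE loss of Eq. (1) with a standard adversarial term is our assumption about how the two losses interact; `generator`, `discriminator`, and the optimizers are assumed to be defined elsewhere, and the discriminator is assumed to end in a sigmoid, matching the 0-to-1 confidence output described above.

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, g_opt, d_opt,
               interferogram, target_phase):
    # Discriminator step: real pairs labeled 1, generated pairs 0 (Eq. (2)).
    fake_phase = generator(interferogram).detach()
    real_score = discriminator(torch.cat([interferogram, target_phase], dim=1))
    fake_score = discriminator(torch.cat([interferogram, fake_phase], dim=1))
    d_loss = F.binary_cross_entropy(real_score, torch.ones_like(real_score)) + \
             F.binary_cross_entropy(fake_score, torch.zeros_like(fake_score))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: RMSE to the target phase (Eq. (1)) plus an adversarial
    # term rewarding outputs the discriminator scores as real.
    fake_phase = generator(interferogram)
    fake_score = discriminator(torch.cat([interferogram, fake_phase], dim=1))
    g_loss = torch.sqrt(torch.mean((fake_phase - target_phase) ** 2)) + \
             F.binary_cross_entropy(fake_score, torch.ones_like(fake_score))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```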

Fig. 3. Architecture of the discriminator.

2.2 Acquisition of datasets

In transfer learning tasks, we need to measure the similarity of two domains and apply the model learned in the source domain to the target domain. Therefore, the source-domain dataset should possess two attributes: (1) its task is similar to that of the target domain, and (2) it is rich. The advantage of a simulated dataset is that it can contain various features and the ground truth of the phase image can be obtained conveniently. To verify the feasibility of our method, we produced 14 kinds of simulated datasets by collecting samples from the public datasets MNIST [26] and the Cell Image Library [27]: Clothing, Number, Saccharomyces cerevisiae (S. cerevisiae), drug-resistant Streptococcus pneumoniae (DRSP), Caenorhabditis elegans (CE), fertilized egg of Xenopus, Tetrahymena pyriformis (TETPY), Mus musculus (MMU), osteosarcoma (OS), Micrasterias, Vibrio cholerae (VIBCH), Caulobacter vibrioides (CAUVC), Mycoplasma (PPLO), and Aggregatibacter actinomycetemcomitans (AGGAC). Partial datasets are shown in Fig. 4, in which the simulated interferogram is formulated as follows:

$${I_s} = {I_r}^2 + {I_o}^2 + 2{I_r}{I_o}\cos [{\varphi + \phi } ]$$
where Ir and Io represent the amplitudes of the reference and object beams, respectively, φ is the background phase distribution collected in the experiment, and ϕ is the target phase.
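For reference, Eq. (3) can be simulated in a few lines of NumPy; the linear carrier fringe standing in for the experimentally collected background phase φ and the Gaussian "cell-like" target phase are illustrative assumptions.

```python
import numpy as np

def simulate_interferogram(target_phase, I_r=1.0, I_o=1.0, fringe_freq=0.02):
    """target_phase: 2-D array in radians; returns the intensity of Eq. (3)."""
    N = target_phase.shape[0]
    y, x = np.mgrid[0:N, 0:N]
    background = 2 * np.pi * fringe_freq * x   # stand-in for the measured background phase
    return I_r**2 + I_o**2 + 2 * I_r * I_o * np.cos(background + target_phase)

# Example: a Gaussian phase bump on a 256x256 grid.
N = 256
yy, xx = np.mgrid[0:N, 0:N]
phase = 2.0 * np.exp(-((xx - N / 2)**2 + (yy - N / 2)**2) / (2 * 30**2))
interferogram = simulate_interferogram(phase)
```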

Fig. 4. Presentation of partial datasets. S1: Clothing, S2: Number, S3: S. cerevisiae, S4: DRSP, S5: CE, S6: Fertilized egg of Xenopus, S7: TETPY, S8: MMU, S9: OS, S10: Micrasterias, S11: VIBCH, S12: CAUVC, S13: PPLO, S14: AGGAC.

Fig. 5. Flowchart of moment extraction, fusion and reconstruction.

In addition, we propose a sample reconstruction algorithm based on Tchebichef moments [28,29]. For an image of size N×N, the Tchebichef moments of order p + q are defined as:

$${T_{pq}} = \frac{1}{{\tilde{p}({p,N} )\tilde{p}({q,N} )}}\sum\limits_{x = 0}^{N - 1} {\sum\limits_{y = 0}^{N - 1} {{{\tilde{t}}_p}(x )} } {\tilde{t}_q}(x )f({x,y} ),p,q = 0,1,\ldots N - 1$$
where
$$\tilde{p}(n,N) = \frac{N\left(1 - \frac{1}{N^2}\right)\left(1 - \frac{4}{N^2}\right)\cdots\left(1 - \frac{n^2}{N^2}\right)}{2n + 1},\quad n = 0,1,\ldots,N - 1$$
$${\tilde{t}_n}(x )= \frac{{{t_n}(x )}}{{{N^n}}}$$
${t_n}(x )$ represents the discrete orthogonal Tchebichef polynomials, which can be defined as follows:
$${t_n}(x )= n!\sum\limits_{k = 0}^n {{{({ - 1} )}^{n - k}}} \left( {\begin{array}{c} {N - 1 - k}\\ {n - k} \end{array}} \right)\left( {\begin{array}{c} {n + k}\\ {n - k} \end{array}} \right)\left( {\begin{array}{c} x\\ k \end{array}} \right)$$

Moments of different orders represent different information in the image. By fusing the Tchebichef moments of different samples, we obtain a new matrix that contains features of each sample, and a new class of samples can be obtained by reconstructing this fused feature matrix through the inverse transformation below. Meanwhile, massive reconstructed datasets can be generated by permuting and combining the Tchebichef moments of different categories of samples, as shown in Fig. 5.

$$\dot{f}(x,y) = \sum\limits_{p = 0}^{N - 1}\sum\limits_{q = 0}^{N - 1} T_{pq}\,\tilde{t}_p(x)\,\tilde{t}_q(y),\quad x,y = 0,1,\ldots,N - 1$$
where ḟ(x,y) represents the gray-level function of the reconstructed image.
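The following NumPy sketch implements Eqs. (4)-(8) directly from the definitions. It is written for clarity at the low orders used in this paper (p, q ≤ 10) rather than for speed, and the block-splicing fusion shown in the final comment is an illustrative stand-in for the authors' permutation-and-combination scheme.

```python
import numpy as np
from scipy.special import comb, factorial

def scaled_tchebichef_polys(N, max_order):
    """Rows: t~_n(x) for n = 0..max_order at x = 0..N-1 (Eqs. (6)-(7))."""
    x = np.arange(N)
    P = np.zeros((max_order + 1, N))
    for n in range(max_order + 1):
        t = np.zeros(N)
        for k in range(n + 1):
            t += ((-1) ** (n - k) * comb(N - 1 - k, n - k)
                  * comb(n + k, n - k) * comb(x, k))
        P[n] = factorial(n) * t / N ** n          # scaling of Eq. (6)
    return P

def tchebichef_moments(img, max_order):
    """Moment matrix T_pq for p, q = 0..max_order (Eq. (4))."""
    N = img.shape[0]
    P = scaled_tchebichef_polys(N, max_order)
    rho = np.array([N * np.prod(1 - np.arange(1, n + 1) ** 2 / N ** 2)
                    / (2 * n + 1) for n in range(max_order + 1)])  # Eq. (5)
    return (P @ img @ P.T) / np.outer(rho, rho)

def reconstruct(Tpq, N):
    """Inverse transform of Eq. (8) from a (possibly fused) moment matrix."""
    P = scaled_tchebichef_polys(N, Tpq.shape[0] - 1)
    return P.T @ Tpq @ P

# Illustrative fusion of two samples' moments into a new sample:
# T_a, T_b = tchebichef_moments(img_a, 10), tchebichef_moments(img_b, 10)
# T_fused = T_a.copy(); T_fused[5:, :] = T_b[5:, :]   # splice moment blocks
# new_sample = reconstruct(T_fused, img_a.shape[0])
```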

2.3 Feature distribution distance scoring rule

In transfer learning, it is very important to select a source domain that represents the feature distribution of the target domain: transfer performance improves as the correlation between the source and target domains increases, and a base model that best learns multiple feature distributions will have higher phase recovery accuracy and better generalization ability. Therefore, we propose an FDDS rule to evaluate, through feature expression and fusion, the rationality of the constructed training sets and to help choose the pre-training dataset. In this method, Tchebichef moments [28–30] reflect the global features of the image and describe its rich geometric information, while the histogram of oriented gradients (HOG) [31] focuses on local areas of the image, characterizing it by computing and accumulating histograms of local gradient directions. Thus, Tchebichef moments and HOG are selected to extract the sample features, ensuring that the evaluated feature matrix describes the sample completely. In addition, the maximum mean discrepancy (MMD) [32] is introduced to evaluate the difference between the feature distributions of different samples.

Moments are commonly used in statistics to characterize the distribution of random variables; if we regard a binary or gray-level image segment as a two-dimensional density distribution function, image features can be analyzed through moments straightforwardly. As discrete orthogonal moments, Tchebichef moments perform excellently in image description: in addition to translation, rotation, and scaling invariance and multiresolution capability, data redundancy and discretization errors are reduced in their calculation, making them more accurate and robust. Owing to this powerful feature extraction capability, moments are widely used in visual pattern recognition [33,34], object classification [35], pose estimation [36], and other scenarios. The refinement of the extracted features is determined by the orders p and q. Low-order moments represent the low-frequency, fundamental geometric properties of a distribution and provide a certain robustness to noise; specifically, the first three orders describe the center of mass, the size and orientation, and the skewness of the given image, respectively, while higher-order moments capture the details of the image. In this paper, the moments {Tp,q} from order 0 to order 10 are selected, and the moment values of multiple images in each sample set are assembled into the total feature matrix of the sample.

HOG is a common feature extraction method in machine learning with a strong ability to describe image features. The appearance and shape of the image are characterized by counting and accumulating the gradients of local image areas, and normalization removes the influence of brightness gradients on the description, improving the robustness of feature extraction for images with different gray levels. In this paper, the detection window (cell) is 32×32 pixels, and the pixels in each cell are allocated to channels in 9 directions, each channel covering 20 degrees of gradient direction. Blocks of 2×2 cells are normalized together, with a block stride of 32 pixels. Since the image size is 256×256, a block moves 7 steps in each direction to cover the whole image; thus, the HOG features extracted from each image have 7×7×4×9 = 1764 dimensions.
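These settings map directly onto a standard HOG implementation. The snippet below reproduces the stated parameters with scikit-image (assumed comparable to the authors' implementation, whose library is not specified) and confirms the 1764-dimensional feature count.

```python
import numpy as np
from skimage.feature import hog

image = np.random.rand(256, 256)        # stand-in for a phase image or interferogram
features = hog(image,
               orientations=9,          # 9 channels of 20 degrees each
               pixels_per_cell=(32, 32),
               cells_per_block=(2, 2),  # blocks step by one cell, i.e. 32 px
               block_norm='L2-Hys',
               feature_vector=True)
print(features.shape)                   # (1764,) = 7 x 7 blocks x 4 cells x 9 bins
```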

Based on the sample features extracted above, the difference between the feature distributions of different kinds of samples can be expressed as follows:

$$MM{D^2}(X,Y) = \left\| \frac{1}{m}\sum\limits_{i = 1}^{m} \Phi(x_i) - \frac{1}{n}\sum\limits_{i = 1}^{n} \Phi(y_i) \right\|_{\rm H}^2$$
where xi and yi represent source-domain and target-domain data, m and n represent the amounts of data in the two domains, and Φ(·) maps the original variables into a high-dimensional reproducing kernel Hilbert space (RKHS), in which previously linearly inseparable points become linearly separable. Using the reproducing property of the RKHS, a kernel function completes the calculation in the low-dimensional space and avoids constructing an explicit formula for the high-dimensional transformation. The kernel k used in this paper is the Gaussian radial basis function kernel, which achieves relatively high accuracy with few parameters:
$$k_\sigma ^{rbf}({x,y} )= {e^{ - \frac{1}{{2{\sigma ^2}}}{{||{x - y} ||}^2}}}$$
where σ is a hyperparameter specifying the width of the Gaussian kernel. The MMD is then calculated as follows:
$$MM{D^2}(X,Y) = \frac{1}{m^2}\sum\limits_{i,j = 1}^{m} k(x_i,x_j) - \frac{2}{mn}\sum\limits_{i = 1}^{m}\sum\limits_{j = 1}^{n} k(x_i,y_j) + \frac{1}{n^2}\sum\limits_{i,j = 1}^{n} k(y_i,y_j)$$

The MMD is 0 if and only if the feature-matrix distributions of the two input samples are exactly the same.
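A direct NumPy implementation of Eqs. (9)-(11) is sketched below; the median-heuristic bandwidth is our assumption, since the paper does not report the value of σ.

```python
import numpy as np
from scipy.spatial.distance import cdist

def rbf_kernel(A, B, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for all row pairs (Eq. (10))."""
    return np.exp(-cdist(A, B, 'sqeuclidean') / (2 * sigma ** 2))

def mmd2(X, Y, sigma=None):
    """Biased MMD^2 estimate between feature sets X (m, d) and Y (n, d), Eq. (11)."""
    if sigma is None:                       # median heuristic (our assumption)
        Z = np.vstack([X, Y])
        d = cdist(Z, Z)
        sigma = np.median(d[d > 0])
    m, n = len(X), len(Y)
    return (rbf_kernel(X, X, sigma).sum() / m ** 2
            - 2 * rbf_kernel(X, Y, sigma).sum() / (m * n)
            + rbf_kernel(Y, Y, sigma).sum() / n ** 2)
```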

The feature extraction objects include the interferogram and the phase image. The phase image directly represents the contour and morphology of the sample, while the fringes in the interferogram are deformed by the modulation of the sample phase, so the interferogram also contains the contour and morphology information and is the direct processing object of the network. After extracting Tchebichef moments and HOG features from both kinds of images, the feature matrix of each sample type was constructed, and the MMD between the feature matrices of two samples was calculated as their feature distribution difference. The MMD results of Tchebichef moments and HOG for different phase images are given in Tables 1 and 2, and the corresponding results for different interferograms in Tables 3 and 4. All the MMD results between one sample and the others are summed as the feature distribution distance of that sample; a smaller sum indicates higher task commonality with the other tasks, so we can select the proper task to train the base model.

Table 1. MMD calculation results of Tchebichef moments of different phase images. S1: Clothing, S2: Number, S3: S. cerevisiae, S4: DRSP, S5: CE, S6: Fertilized egg of Xenopus, S7: TETPY, S8: MMU, S9: OS, S10: Micrasterias, S11: VIBCH, S12: CAUVC, S13: PPLO, S14: AGGAC, S15: Moment-reconstruction samples.

Table 2. MMD calculation results of HOG of different phase images.

Table 3. MMD calculation results of Tchebichef moments of different interferograms.

Table 4. MMD calculation results of HOG of different interferograms.

Tchebichef moments and HOG describe sample features in different dimensions. To comprehensively evaluate the feature distribution distance between samples, we combine the MMD results of the two feature extraction methods as follows:

$${D_n} = \frac{{Tchebiche{f_n}}}{{\sum\limits_{n = 1}^N {Tchebiche{f_n}} }} + \frac{{HO{G_n}}}{{\sum\limits_{n = 1}^N {HO{G_n}} }}$$
where N represents the total number of samples.
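Eq. (12) amounts to a two-term normalized sum, as in the following sketch; `tche_sums` and `hog_sums` stand for the per-sample sums of the MMD results in Tables 1-4.

```python
import numpy as np

def fdds_scores(tche_sums, hog_sums):
    """Inputs: per-sample sums of MMD results against all other samples."""
    tche_sums = np.asarray(tche_sums, dtype=float)
    hog_sums = np.asarray(hog_sums, dtype=float)
    return tche_sums / tche_sums.sum() + hog_sums / hog_sums.sum()

# The dataset with the smallest score has the highest task commonality and is
# the preferred pre-training set:
# best = np.argmin(fdds_scores(tche_sums, hog_sums))
```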

The combined feature distribution difference is shown in Fig. 6. The results indicate that the collected interferogram still retains the feature information of the target phase; the OS sample has the smallest FDDS with the other tasks, while the Clothing sample has the largest. In addition, the feature fusion sample based on moment reconstruction shows great task commonality in this evaluation, second only to the OS sample, and can be selected as the pre-training dataset for obtaining the base model.

Fig. 6. The combined feature distribution difference.

3. Results analysis and discussion

According to the above analysis of the combined inter-class differences, we generated 10000 sets each of Clothing samples and moment-reconstruction samples, representing categories of low (Clothing) and high (moment reconstruction) task commonality, respectively. Through full training we obtained two networks as base models. After training, we evaluated the phase recovery accuracy of the various tasks under different training strategies using the root mean squared error (RMSE) and structural similarity (SSIM); the results verify the reliability of the feature-distribution-distance evaluation method. In addition, the roles of the base model and the fine-tuning strategy in training are discussed, and it is shown that a base model trained on simulated datasets can be used for quantitative phase imaging of experimental interferograms. The RMSE and SSIM are calculated by:

$$RMSE = \sqrt{\frac{1}{N}\sum\limits_{i = 1}^{N} {(Predicte{d_i} - Actua{l_i})^2}}$$
$$SSIM = \frac{{({2{\mu_{Predicted}}{\mu_{Actual}} + {c_1}} )({2{\sigma_{Predicted,Actual}} + {c_2}} )}}{{({\mu_{Predicted}^2 + \mu_{Actual}^2 + {c_1}} )({\sigma_{Predicted}^2 + \sigma_{Actual}^2 + {c_2}} )}}$$
where μ is the mean value of an image, σ² is its variance, σPredicted,Actual is the covariance between the images Predicted and Actual, and c1 and c2 are two constants that avoid division by zero.
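For concreteness, both metrics can be computed as follows; using scikit-image's SSIM with the ground-truth phase range as `data_range` is our assumption.

```python
import numpy as np
from skimage.metrics import structural_similarity

def rmse(predicted, actual):
    # Eq. (13): root mean squared error over all N pixels.
    return np.sqrt(np.mean((predicted - actual) ** 2))

def ssim(predicted, actual):
    # Eq. (14): SSIM with the dynamic range taken from the ground-truth phase.
    rng = actual.max() - actual.min()
    return structural_similarity(predicted, actual, data_range=rng)
```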

3.1 Simulation results

3.1.1 Effect of base model and fine-tuning strategy on phase recovery accuracy

The phase recovery performance of the network can be evaluated by calculating the difference between the network output and the reference phase when an interferogram is fed into the network. The network performance obtained under three different training strategies is shown and compared in this section. The first network loaded the parameters trained with the moment-reconstruction samples as the pre-training dataset and then fine-tuned them by transfer learning on the target-domain samples. The second network performed phase recovery on the target-domain samples directly after loading the same pre-trained parameters, without fine-tuning. The third network was initialized with random parameters and trained on the target-domain samples only. Each kind of sample contained 200 pairs of interferograms and reference phases: 100 for training, 20 for validation, and 80 for testing. The purpose is to verify the effect of the pre-training model and transfer learning in network training tasks with small datasets. Table 5 and Fig. 7 show the average phase recovery accuracy of the various samples under the three training strategies.

Fig. 7. Phase recovery accuracy of various samples under different training strategies: (a) RMSE, (b) SSIM.

Table 5. RMSE and SSIM of various samples under different training strategies. Strategy 1: loading the base model with transfer learning; Strategy 2: loading the base model without transfer learning; Strategy 3: training with randomly initialized parameters.

From Table 5 and Fig. 7, it can be seen that loading pre-trained network parameters alone cannot stably improve the phase recovery accuracy on the target sample; the accuracy is similar to that obtained by training on the small dataset from randomly initialized parameters. However, after loading the pre-trained parameters, the phase recovery accuracy of the target sample can be significantly improved by training on the target-domain dataset and adaptively fine-tuning the network parameters. Taking the Mycoplasma sample as an example, we can understand the roles of loading pre-trained parameters and of transfer learning in phase recovery tasks.

The phase recovery results of Mycoplasma under the different training strategies are shown in Fig. 8. Although the latter two strategies both give poor phase recovery accuracy on small datasets, their error sources differ. The network of Strategy 3 cannot completely filter out the background fringes, which can still be seen in the recovered image; moreover, without the pre-trained parameters, the recovered phase contour is blurred and morphological information is missing. In Strategy 2, the result of loading only the pre-trained parameters no longer contains the background fringes and captures the contour of the target phase well. However, the special "hollow" phase characteristic of the Mycoplasma samples was not contained in the pre-training dataset, so the recovered phase is "solid" and lacks morphological information, which is the main source of error in this strategy.

Fig. 8. Phase recovery results of Mycoplasma under different training strategies. Strategy 1: loading the base model with transfer learning; Strategy 2: loading the base model without transfer learning; Strategy 3: training with randomly initialized parameters.

It can be concluded that, with a small dataset and without loading the base model of a similar task, the network cannot build the mapping from interferogram to target phase, so neither can the background fringes be filtered out nor can accurate phase contour and morphology information be obtained. The base model, obtained from a sufficient pre-training dataset, has already completed this mapping, which helps identify the region of the interferogram where the target sample exists and perform phase recovery and background filtering; however, the feature distribution difference between the two domains, combined with the limited generalization ability of the network, leads to the loss of morphological details in the final recovery results. Through transfer learning, fine-tuning the parameters of the pre-trained network lets it learn the feature distribution of the target domain, so that only a small dataset is needed to effectively improve the generalization ability of the network. Finally, the network not only retains the pre-trained ability to filter the background fringes but also significantly improves the phase recovery accuracy of the target sample.

3.1.2 Network phase recovery accuracy of different base models

This section discusses the phase recovery accuracy obtained by transfer learning after loading different base models, to verify the validity of the FDDS rule proposed in this paper and the feasibility of the moment-reconstruction samples as a pre-training dataset. The first base model was trained with the moment-reconstruction samples as the pre-training dataset and the second with the Clothing samples. In addition, we set up a third network as a control group that did not load a base model but had randomly initialized parameters. Each kind of sample contains 200 pairs of interferograms and reference phases: 100 for training, 20 for validation, and 80 for testing.

Comparing the phase recovery accuracy of the different base models in Table 6 and Fig. 9, we conclude that transfer learning after loading the base model of a similar task effectively improves the final accuracy of phase recovery tasks with small datasets: both RMSE and SSIM are significantly improved. Contrasting the phase recovery results of the various samples in Fig. 10, for different pre-training models, the fine-tuned network gains stronger generalization ability as the task commonality between the source and target domains increases, and it can reproduce finer phase details. As reflected in our results, the pre-training model trained with the moment-reconstruction samples achieves higher phase recovery accuracy after transfer than the one trained with the Clothing samples, in line with the FDDS results above.

Fig. 9. Phase recovery accuracy of various samples under different base models: (a) RMSE, (b) SSIM.

Fig. 10. Phase recovery results of various samples.

Table 6. RMSE and SSIM of various samples under different base models. Strategy a: moment-reconstruction samples as pre-training dataset with transfer learning; Strategy b: Clothing samples as pre-training dataset with transfer learning; Strategy c: training with randomly initialized parameters.

3.1.3 Network phase recovery accuracy under different data volumes

The experimental results in the previous section show that when the dataset is not sufficient to train a network with strong robustness and excellent generalization ability, fine-tuning the parameters of a base model with a similar task can effectively improve the phase recovery accuracy. To further illustrate that transfer learning relieves the data pressure required for training, we compare the phase recovery accuracy of two networks: the first loads the base model trained with the moment-reconstruction samples as the pre-training dataset, and the second initializes its parameters randomly. In this section we are no longer limited to a particular kind of sample but consider the overall phase recovery accuracy under the two strategies. Figure 11 shows the trend of the phase recovery accuracy as the number of training sets increases from 20 to 1000.

Fig. 11. Phase recovery accuracy of the two training strategies under different data volumes: (a) RMSE, (b) SSIM.

Comparing the curves of the two strategies in Fig. 11, the accuracy of both training strategies improves steadily as the data volume increases, but the phase recovery accuracy of the network using transfer learning is consistently better than that of training without a pre-training model. To reach a given phase recovery accuracy, the strategy without the base model requires a larger amount of data, which visually indicates that loading a base model with high task commonality provides the new task with a training starting point closer to the global optimum and reduces the data required to establish the input-output mapping.

3.2 Experimental results

To further verify the feasibility of the proposed method in real application scenarios, the digital holographic microscopy system shown in Fig. 12 was built to record interferograms. A He-Ne laser with a wavelength of 632.8 nm is used as the light source; a pinhole performs spatial filtering to improve the beam quality, and lenses expand and collimate the beam. The intensity ratio of the two polarized beams generated by the polarizing beam splitter can be adjusted by rotating the half-wave plate to obtain well-contrasted interferograms. The object beam is phase-modulated by the sample under test and interferes with the reference beam at the beam splitter. An objective lens images the light field carrying the target phase onto the camera. The polarizer is oriented at 45° to the two beams and, combined with the quarter-wave plate, acts as a phase shifter: different phase shifts are introduced by rotating the quarter-wave plate. A multi-step phase-shifting algorithm calculates the phase from the interferograms with different phase shifts, which serves as the reference phase of the experimental sample and is further made into the labels for network training. Polystyrene spheres and HeLa cells were recorded as two groups of datasets.
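As an illustration of the phase-shifting step, the classic four-step algorithm is sketched below; the paper specifies only a "multi-step" algorithm, so four frames at π/2 intervals are our assumption.

```python
import numpy as np

def four_step_phase(I0, I1, I2, I3):
    """I_k: interferograms recorded with phase shifts of 0, pi/2, pi, 3pi/2."""
    # For I_k = A + B*cos(phi + k*pi/2): I3 - I1 = 2B*sin(phi), I0 - I2 = 2B*cos(phi).
    wrapped = np.arctan2(I3 - I1, I0 - I2)   # wrapped phase in (-pi, pi]
    return wrapped                           # a 2-D unwrapping step follows in practice
```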

Fig. 12. The digital holographic microscopy system. L1 and L2: lenses; PH: pinhole; HWP: half-wave plate; PBS: polarizing beam splitter; M1 and M2: mirrors; Obj: objective; BE: beam expander; BS: beam splitter; QWP: quarter-wave plate; P: polarizer.

First, we calculated the feature distribution differences between the two groups of experimental samples and the two pre-training sample sets; the results are shown in Table 7. Tchebichef moments and HOG features were extracted from the polystyrene and HeLa cell samples, and the MMD results show that, compared with the Clothing samples, the moment-reconstruction samples generated by the proposed method have higher task commonality with the experimental samples.

Table 7. Feature distribution differences between experimental samples and pre-training datasets.

Similarly, we compared the phase recovery accuracy of the two experimental samples under different training strategies. The first network loaded the parameters trained with the moment-reconstruction samples as the pre-training dataset, the second network loaded the parameters trained with the Clothing samples, and the parameters of the third network were initialized randomly. After training, the average RMSE and SSIM between the network outputs and the labels were calculated; the phase recovery accuracy of the networks under the different base models is shown in Table 8 and Fig. 13. For the experimental samples, the networks that loaded a pre-training model obtain higher phase recovery accuracy than the network with randomly initialized parameters, which cannot completely filter out the background fringes and also misses details of the target phase. For the different pre-training models, the phase recovery accuracy improves as the task commonality between the pre-training dataset and the experimental sample increases. The experimental results are consistent with the simulation results, proving that a network trained on simulated datasets with high task commonality still performs well in the QPI of experimental interferograms and can effectively improve the accuracy of DL-based phase imaging under limited data volume.

Fig. 13. Phase recovery results of experimental samples. Strategy I: moment-reconstruction samples as pre-training dataset with transfer learning; Strategy II: Clothing samples as pre-training dataset with transfer learning; Strategy III: training with randomly initialized parameters.

Table 8. Phase recovery accuracy of experimental samples under different base models.

4. Conclusion

We propose a quantitative phase imaging (QPI) method based on model transfer and apply it to phase recovery tasks on experimental samples with small datasets. By comparing different training strategies, we verify that fine-tuning the parameters of a base model with a similar task helps the new task build the target mapping and improves the generalization ability of the network, ultimately achieving fast and precise QPI from a single experimental interferogram. First, a feature fusion method based on moment reconstruction is proposed for training set generation; by extracting the features of open datasets and applying the inverse transformation formula, it constructs a dataset rich enough to cover most situations and accurately annotated, relieving the pressure of data acquisition and tagging and improving the generalization ability of the data-driven method. Then, a feature distribution distance measure based on global and local image features is proposed to evaluate the feature differences between samples and to help select the sample with high task commonality as the pre-training dataset; specifically, tasks with smaller scores achieve better generalization performance, and the feasibility of the proposed rule is verified by the experimental results. Furthermore, the base model obtained from simulated data is applied to the experimental data through transfer learning, and the results show that the fine-tuned network significantly improves the accuracy of phase recovery and is capable of fast, high-accuracy phase recovery for different samples, which is convenient for practical operation.

Funding

National Natural Science Foundation of China (62075140, 61727814, 61875059).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. Y. Park, C. Depeursinge, and G. Popescu, “Quantitative phase imaging in biomedicine,” Nat. Photonics 12(10), 578–589 (2018). [CrossRef]  

2. Y. Zhang, X. B. Tian, and R. G. Liang, “Accurate and fast two-step phase shifting algorithm based on principle component analysis and Lissajous ellipse fitting with random phase shift and no pre-filtering,” Opt. Express 27(14), 20047–20063 (2019). [CrossRef]  

3. K. Ishikawa, R. Tanigawa, K. Yatabe, Y. Oikawa, T. Onuma, and H. Niwa, “Simultaneous imaging of flow and sound using high-speed parallel phase-shifting interferometry,” Opt. Lett. 43(5), 991–994 (2018). [CrossRef]  

4. J. S. Li, X. X. Lu, Q. N. Zhang, B. B. Li, J. D. Tian, and L. Y. Zhong, "Dual-channel simultaneous spatial and temporal polarization phase-shifting interferometry," Opt. Express 26(4), 4392–4400 (2018). [CrossRef]

5. P. Sun, L. Y. Zhong, C. S. Luo, W. H. Niu, and X. X. Lu, “Visual measurement of the evaporation process of a sessile droplet by dual-channel simultaneous phase-shifting interferometry,” Sci. Rep. 5(1), 12053 (2015). [CrossRef]  

6. M. R. Miller, I. Mohammed, and P. S. Ho, “Quantitative strain analysis of flip-chip electronic packages using phase-shifting moiré interferometry,” Opt. Lasers Eng. 36(2), 127–139 (2001). [CrossRef]  

7. P. Müller, M. Schürmann, S. Girardo, G. Cojoc, and J. Guck, “Accurate evaluation of size and refractive index for spherical objects in quantitative phase imaging,” Opt. Express 26(8), 10729–10743 (2018). [CrossRef]  

8. Q. N. Zhang, L. Y. Zhong, P. Tang, Y. J. Yuan, S. D. Liu, J. D. Tian, and X. X. Lu, “Quantitative refractive index distribution of single cell by combining phase-shifting interferometry and AFM imaging,” Sci. Rep. 7(1), 2532 (2017). [CrossRef]  

9. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015). [CrossRef]  

10. Q. N. Zhang, S. D. Lu, J. S. Li, D. Li, X. X. Lu, L. Y. Zhong, and J. D. Tian, "Phase-shifting interferometry from single frame in-line interferogram using deep learning phase-shifting technology," Opt. Commun. 498, 127226 (2021). [CrossRef]

11. G. E. Spoorthi, S. Gorthi, and R. K. S. S. Gorthi, "PhaseNet: A deep convolutional neural network for two-dimensional phase unwrapping," IEEE Signal Process. Lett. 26(1), 54–58 (2019). [CrossRef]

12. K. Wang, Y. Li, Q. Kemao, J. Di, and L. Peng, “One-step robust deep learning phase unwrapping,” Opt. Express 27(10), 15100–15115 (2019). [CrossRef]  

13. J. C. Zhang, X. Tian, J. Shao, H. Luo, and R. Liang, “Phase unwrapping in optical metrology via denoised and convolutional segmentation networks,” Opt. Express 27(10), 14903–14912 (2019). [CrossRef]  

14. Y. Rivenson, Y. Zhang, H. Günaydın, T. Da, and A. Ozcan, “Phase recovery and holographic image reconstruction using deep learning in neural networks,” Light: Sci. Appl. 7(2), 17141 (2018). [CrossRef]  

15. K. Q. Wang, J. Z. Dou, Q. Kemao, J. L. Di, and J. L. Zhao, “Y-Net: a one-to-two deep learning framework for digital holographic reconstruction,” Opt. Lett. 44(19), 4765 (2019). [CrossRef]  

16. X. Li, H. Y. Qi, S. W. Jiang, P. M. Song, G. A. Zheng, and Y. B. Zhang, “Quantitative phase imaging via a cGAN network with dual intensity images captured under centrosymmetric illumination,” Opt. Lett. 44(11), 2879 (2019). [CrossRef]  

17. H. Wang, M. Lyu, and G. H. Situ, “eHoloNet: A learning-based end-to-end approach for in-line digital holographic reconstruction,” Opt. Express 26(18), 22603–22614 (2018). [CrossRef]  

18. S. Y. Lu, Y. Tian, Q. N. Zhang, X. X. Lu, and J. D. Tian, “Dynamic quantitative phase imaging based on Ynet-ConvLSTM neural network,” Opt. Lasers Eng. 150, 106833 (2022). [CrossRef]  

19. F. Wang, Y. M. Bian, H. C. Wang, M. Lyu, G. Pedrini, W. Osten, G. Barbastathis, and G. H. Situ, “Phase imaging with an untrained neural network,” Light: Sci. Appl. 9(1), 77 (2020). [CrossRef]  

20. Z. Ren, Z. Xu, and E. Y. M. Lam, “End-to-end deep learning framework for digital holographic reconstruction,” Adv. Photonics 1, 016004 (2019). [CrossRef]  

21. Y. J. Xue, S. Y. Cheng, Y. Z. Li, and L. Tian, “Reliable deep-learning-based phase imaging with uncertainty quantification,” Optica 6(5), 618–629 (2019). [CrossRef]  

22. S. J. Pan and Q. Yang, "A survey on transfer learning," IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010). [CrossRef]

23. M. Mirza and S. Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784 (2014).

24. I. Goodfellow, J. Pouget-Abadie, and M. Mirza, "Generative adversarial nets," Advances in Neural Information Processing Systems 27 (2014).

25. O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, Cham, 2015), pp. 234–241.

26. H. Xiao, K. Rasul, and R. Vollgraf, “Fashion-mnist: A novel image dataset for benchmarking machine learning algorithms,” arXiv preprint arXiv:1708.07747 (2017).

27. S. M. Gustafsdottir, V. Ljosa, and K. L. Sokolnicki, “Human U2OS cells-compound cell-painting experiment,” The Cell Image Library (2015).

28. R. Mukundan, S. H. Ong, and P. A. Lee, “Image analysis by Tchebichef moments,” IEEE Trans. on Image Process. 10(9), 1357–1364 (2001). [CrossRef]  

29. W. J. Li, Q. N. Zhang, L. Y. Zhong, X. X. Lu, and J. D. Tian, “Image definition assessment based on Tchebichef moments for micro-imaging,” Opt. Express 27(24), 34888–34900 (2019). [CrossRef]  

30. M. K. Hu, “Visual pattern recognition by moment invariants,” IEEE Trans. Inform. Theory 8, 179–187 (1962). [CrossRef]  

31. N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in IEEE Conference on Computer Vision and Pattern Recognition (2005), pp. 886–893.

32. A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola, “A kernel two-sample test,” The Journal of Machine Learning Research 13, 723–773 (2012).

33. S. O. Belkasim, M. Shridhar, and M. Ahmadi, “Pattern recognition with moment invariants: a comparative study and new results,” Pattern Recogn. 24(12), 1117–1138 (1991). [CrossRef]  

34. J. Flusser and T. Suk, "Pattern recognition by affine moment invariants," Pattern Recogn. 26(1), 167–174 (1993). [CrossRef]

35. M. I. Heywood, “Fractional central moment method for moment-invariant object classification,” Proc. Inst. Elect. Eng. 142, 213–219 (1995).

36. R. Mukundan, “Estimation of quaternion parameters from two-dimensional image moments,” CVGIP: Graphical Models and Image Processing 54(4), 345–350 (1992). [CrossRef]  
