Deep learning for denoising in a Mueller matrix microscope

Xiongjie Yang; Xiongjie Yang; Qianhao Zhao; Qianhao Zhao; Tongyu Huang; Tongyu Huang; Zheng Hu; Tongjun Bu; Honghui He; Anli Hou; Anli Hou; Migao Li; Yucheng Xiao; Hui Ma; Hui Ma; Hui Ma

doi:10.1364/BOE.457219

1. Introduction

Mueller matrix (MM) can provide complete polarization properties of a sample, which are sensitive to microstructural features down to subwavelength scales. Among the available MM equipment, the Mueller matrix microscope (MMM) has shown promising application prospects in biomedical research and clinical practices [1–3] . Various MMM have been developed for different applications, such as transmission MMM for pathological diagnosis of thin tissue slices [4] and collinear reflection MMM for studying bulk tissue samples [5]. Since individual MM elements often lack clear physical meanings, polarimetry basis parameters (PBP) [6–8] derived from MM elements are used for quantitative microstructural features characterization and diagnosis of diseases [9–12]. In all these applications, the accuracy of MM is always a top concern because precise MM imaging can provide more detailed polarization features for microstructure characterization and pathological diagnosis.

The accuracy of MM imaging is often limited by two factors - errors in the state of polarization (SOP) and noise in the instrument. The effects of these two factors need to be addressed by calibration and noise reduction, respectively. Calibration procedures of a MM polarimeter are required to calibrate the instrument matrix of polarization state generator (PSG) and polarization state analyzer (PSA), in order to reduce the influence of SOP errors. There are many proven calibration methods, such as model-based calibration method (MCM) [13], numerical calibration method (NCM) [14], and eigenvalue calibration method (ECM) [15] for different applications. However, the calibration procedure will not have an effect on acquisition time, but noise reduction will. Therefore, we focus more on the effects of instrument noise in this paper. There are two ways to suppress the effects of noises: (1) Measure the light intensity matrix multiple times and take the average to calculate the MM [16]. (2) Choose the correct instrument matrix of PSG and PSA and expand the dimensions of the instrument matrix appropriately to reduce the estimated variances caused by the noise in the MMM [17,18]. While these methods are effective in improving accuracy, they also place higher requirements on acquisition time. Deep learning can help us break the conflict between accuracy and acquisition time.

Deep learning [19] methods, such as convolutional neural networks (CNNs), has proved their power in computer vision [20–22]. Deep learning can learn to perform complex tasks by employing multi-layered neural networks trained on a large amount of example data. As a powerful tool for everyone, its applications in biophotonics have gained a tremendous amount of attention in recent years. Applying deep learning methods to optical technology can help solve problems that traditionally require hardware upgrades, additional measurements, or limit imaging frame rates to improve imaging quality. For example, learning-based image super-resolution techniques can be used to address the trade-off between imaging speed, spatial resolution, and exposure in microscopy [23], and deep learning image reconstruction methods are used for complex-valued image reconstruction without twin images in on-axis holography [24], elimination of phase affected by background inhomogeneity in diffractive phase microscopy [25], and imaging frame rate improvement with minimal loss in image quality for in vivo two-photon imaging [26]. The implementation of the above application benefits from one of the most exciting advantages of learning-based methods, that is, the inference is very fast to compute once the model is trained. Obtaining high signal-to-noise Mueller matrix images is very time-consuming, which hinders the development of Mueller matrix microscopy and its application. Therefore, employing deep neural networks to quickly reduce the noise in MM images and enhance the PBP image quality is promising. Previously, learning based denoising method has been successfully applied in Stokes images by denoising the intensity images captured by DoFP camera [27]. However, the complexities of Stokes framework and Mueller framework are quite different. Performing denoising on intensity images to help enhance the quality of MM images is difficult to some extent. To our best knowledge, there is no deep learning-based denoising method in MM imaging.

In this paper, we trained a deep residual U-Net incorporating channel attention named Mueller denoising U-Net (MDU-Net) with a large number of paired low-SNR (signal-noise ratio) and high-SNR MM images, which are captured by a MMM with rotating polarizer and rotating waveplate (RPRQ) PSG. This network can achieve impressive denoising and recover polarimetric information from low-quality MM images captured in a non-ideal environment. In the evaluation of root mean square error (RMSE), peak signal to noise ratio (PSNR), and structural similarity (SSIM) index, the MDU-Net shows excellent results.

2. Experimental setup

The MM images were obtained by a fast MMM based on dual division of focal plane (DoFP) polarimeters (DoFPs-MMM) which contains an RPRQ-based PSG [28] and a dual DoFP polarimeter based PSA [29]. This PSA can obtain full Stokes images from a snapshot with low equally weighted variance (EWV) [30] independent of the input Stokes vector, resulting in excellent noise suppression performance [31].

RPRQ-based PSG can improve the performance of DoFPs-MMM in the presence of Gaussian noise and Poisson shot noise [28]. The combination of its optimal frames and dual DoFP polarimeters minimizes the EWV of DoFP-MMM and makes it sample-independent. For the optimal frames of RPRQ based PSG, EWV can be expressed by [28]:

(1)$$EW{V_{optimal}} = \frac{{220}}{N}\left( {{\sigma^2} + \frac{{{M_{11}}}}{4}} \right)$$

where ${\sigma ^2}$ is the variance of Gaussian noise, ${M_{11}}$ is the first element of MM, and N denotes the number of SOPs by RPRQ-based PSG. EWV of the entire system will be proportional to $1/{\rm{N}}$. One can obtain higher SNR in MM imaging by taking bigger N with increasing cost in acquisition time. Based on this method, we obtained noisy MM images with N = 4 as the network input and clean MM images with N = 100 as the ground truth, corresponding to acquisition time 2.5s and 69s, respectively, in experiments. The optimal frames of RPRQ-based PSG with N = 4 and 100 are expressed on the Poincaré sphere, as shown in Fig. 1.

Fig. 1. Optimal frames of RPRQ based PSG for (a) N = 4 (b)N = 100.

Download Full Size | PDF

According to the above optimal configuration of DoFPs-MMM, we built a transmission RPRQ-based DoFPs-MMM with a multi-axis cage system (RayCage Photoelectric Technology Co., Ltd, China) to obtain the dataset. The schematic is shown in Fig. 2. LED light source (3W, 633nm, ${\Delta\rm{\lambda }} = 20{\rm{nm}}$) with Lens1, Lens2, Lens3 and aperture stop together form illumination part. The unpolarized incident light is modulated to different polarization states by an RPRQ- based PSG consisting of rotating linear polarizer P and rotating quarter-waveplate R1. P and R1 are installed in two motorized rotation stages (DDR25/M, Thorlabs Inc., USA), respectively, which rotate at 1800 deg/s speed. Scattered light from the sample is collected by a microscope objective and imaged on the DoFP polarimeter through the tube lens. The dual DoFP polarimeter based PSA consists of a 50:50 beam splitter prism NPBS, a quarter-waveplate R2, and two DoFP cameras. We define the $4{\rm{ \times N}}$ instrument matrix of PSG and $8{\rm{ \times 4}}$ instrument matrix of PSA as W and A, respectively, and the $8{\rm{ \times N}}$ intensity matrix as I. The MM of the sample can be estimated by:

(2)$$M = {A^{\dagger} }I{W^{\dagger} }$$

where † denotes pseudoinverse. Before the experiment, the instrument matrix W of PSG is pre-calibrated by a commercial polarimeter (PAX1000VIS/M, Thorlabs Inc., USA). After that, by measuring the intensity matrix ${I_{air}}$ when the sample is air, the instrument matrix A of the PSA can be calculated pixel by pixel as:

(3a)$$A = {I_{air}}{W^{\dagger} }$$

Fig. 2. Schematic of RPRQ based DoFPs-MMM. AS: aperture stop P:linear polarizer R1,R2: quarter-waveplate MO: microscope object.

Download Full Size | PDF

3. Method

U-Net is a very popular CNN architecture, which was first designed and used to perform biomedical image segmentation [22], it also has been demonstrated to show excellent results in various fields, including image restoration [32]. U-Net is a U-shaped encoder-decoder network architecture, which consists of several encoder blocks and decoder blocks. The encoder network (contracting path on the left side) halves the spatial dimensions and doubles the feature channels at each encoder block. Likewise, the decoder network (expanding path on the right side) doubles the spatial dimensions and half the number if feature channels at each decoder block. The overall structure of our proposed MDU-Net is based on a typical U-Net architecture. As shown in Fig. 3 (a), the network architecture is symmetric with four encoder stages and corresponding decoder stages. In each encoder stage, the feature maps are downsampled to $1/2{\rm{ \times }}$ scale using a ${\rm{4 \times 4}}$ convolutional layer with stride 2 (A convolutional layer contains a set of filters (or kernels) that are moved across the image. Stride is a parameter that controls how the filter convolves around the input volume. When the stride is 2, then we move the filters to 2 pixels at a time). The feature maps are also upsampled to ${\rm{2 \times }}$ scale by a ${\rm{2 \times 2}}$ deconvolution with stride 2 before each decoder stage. In order to assist the denoising process, skip connections are applied to pass large-scale low-level feature maps from each encoder stage to the corresponding decoder stage. The skip connection is a bridge that connects the encoder block and decoder block. Different from conventional U-Net-like architectures that directly fuse low-level and high-level features in each decoder stage, the major difference in our method is the interdependencies among channels of low-level feature maps are well exploited before fusion, resulting in channel attention mechanism [33]. The attention mechanism is a technique that mimics cognitive attention, which enhances some parts of the input data while diminishing other parts. Channel attention (CA) is one of the most influential attention mechanisms. CA uses scalar to represent and evaluate the importance of each feature channel. Channel attention allows our model to focus more on useful information. Channel attention block [34] (CAB) is a computational unit that exploits CA and adaptively rescales channel-wise features. As shown in Fig. 3 (b), let ${\rm{X = }}[{{{{x}}_1}{{,\;}}{{{x}}_2}{{,}}\ldots {{,}}{{{x}}_{{{C^{\prime}}}}}} ]$ be an input of channel attention, which has $C^{\prime}$ feature maps with a spatial resolution of ${{H^{\prime} \times W^{\prime}}}$. The channel attention can be formulated by:

(3b)$$\widetilde X = X \ast \sigma ({{f_2}({\delta ({{f_1}({{G_X}} )} )} )} )$$

Fig. 3. The architectures of (a) MDU-Net, (b) Channel Attention Module (CAM), (c) Channel Attention Block (CAB), (d) Convolution Block (Conv Block).

Download Full Size | PDF

The ${G_X} \in {\mathbb{R}^{C' \times 1 \times 1}}$ represents the channel-wise statistics by shrinking $X$ through spatial dimensions with global average pooling (GAP). Next, a gating mechanism with sigmoid function ${\rm{\sigma }}$ is introduced to fully capture the information from GAP. The function ${f_1}$ and ${f_2}$ refer to two 1×1 convolution layers act as channel-downscaling and channel-upscaling respectively, and $\delta $ denotes activation function. Then we obtain the final channel statistics S, and we use it to rescale the input $X$. CABs are stacked to form a channel attention module (CAM) (as shown in Fig. 3(b)) to further exploit the interdependencies among feature channels. In each skip connection, it implicitly applies feature selection to MM image and makes the network focus on informative features of MM image. Besides CAB, the convolution block is another important component in our model. The convolution blocks in the encoder, decoder, and CAM follow the same residual-convolution structure shown in Fig. 3 (c). Specifically, each convolution block applies two 3×3 convolutional layers followed by Parametric Rectified Linear Unit (PReLU) [35] to extract features, and a 1×1 convolutional layer acts as local residual learning (LRL) to further enhance the information flow. LRL utilizes a shortcut to jump over several layers, making deep model training easier. Finally, a 3×3 convolution layer is applied after the last decoder stage to adjust the features adaptively. Moreover, the original noisy MM image is undoubtedly similar to the high-quality one, and this fact indicates that they have a lot of shared information. Therefore, global residual learning (GRL) is introduced to learn the residual information between the corrupted input and denoised output. It is beneficial in better preserving the fine textual and structural details in the denoised MM image.

In this paper, we optimize the network with the following Charbonnier loss [36] for all experiments:

(4)$$L({{M_{dn}},{M_{gt}}} )= \sqrt {{{||{{M_{dn}} - {M_{gt}}} ||}^2} + {\varepsilon ^2}}$$

where $\parallel \cdot \parallel $ denotes ${L_2}$ norm, ${{{M}}_{{{dn}}}}$ denotes the restored MM image, ${{{M}}_{{{gt}}}}$ is the ground-truth, and $\varepsilon $ is a constant which we empirically set to ${10^{ - 6}}$ for all experiments. The $\varepsilon $ is added to make the loss function differentiable.

4. Experiments and analysis

4.1 Datasets

In fact, training a deep denoising neural network usually requires a dataset containing a large number of paired low-SNR and high-SNR MM images. In our work, we built a dataset containing 337 groups of paired MM images with a spatial resolution of 1961×2381. Out of these, 153 groups were selected for training, and the remaining 184 groups were used for validation and test. More details about the dataset are provided in Table 1.

Table 1. The details (regions/patches) of dataset in the experiments.

View Table | View all tables in this article

As shown in the Table 1, we trained the models on four types of very different samples from different hospitals including breast tissue (from Shenzhen Hospital of Traditional Chinese Medicine), liver fibrosis tissue (from Mengchao Hepatobiliary Hospital of Fujian Medical University), liver cancer tissue (from Fujian Medical University Cancer Hospital) and stained liquid-based cervical cytology smears (from University of Chinese Academy of Sciences Shenzhen Hospital). Besides, 30 groups of unstained liquid-based cytology smears (from University of Chinese Academy of Sciences Shenzhen Hospital) were used to further blindly test the generalization of models. Tissue samples are imaged with a 4×objective lens, and cervical smears are imaged with a 20×objective lens. MM images are normalized by ${M_{11}}$.To generate a large enough training dataset, the 153 groups of paired training data were further split into about 38000 patches in total. Specifically, patches of size 196×196 were sampled in each MM image with a step 128 pixels. The MM images used for testing are not cropped into patches. Thus the number of regions is equivalent to the patches. Next, based on the above MM images test dataset, we also derived the corresponding PBP images test dataset for further comparisons.

Figure 4 shows a MM image pair of liver cancer tissue to visualize the differences between the PBP images (including D, Δ, and δ) derived from low-SNR and high-SNR MM images respectively. The ${{{M}}_{{{22}}}}$, ${{{M}}_{{{33}}}}$ and ${{{M}}_{{{44}}}}$ are all subtracted by 1 for the convenience of presentation. The formulas and physical meanings of PBPs used in experiments are listed in Table 2 [6,8]. It should be noticed that the value of the Mueller matrix element ${{{M}}_{14}}$ related to the circular diattenuation is much smaller than the Mueller matrix elements ${{{M}}_{{{12}}}}$, ${{{M}}_{{{13}}}}$ related to the linear diattenuation. Therefore, we do not separate the circular diattenuation separately, only use the total diattenuation. These parameters have been used in microstructure characterization [9,12,29], e.g., retardance parameter is sensitive to fibrous structures [8]. It can be seen that PBP images can be very sensitive to noise, making some features hard to recognize, which proves the need to develop effective MM image noise removal algorithms.

Fig. 4. Visualization of differences between noisy images and ground truth from a liver cancer tissue sample. (a) MM images: Left one refers to the noisy MM image and the right one refers to ground truth. For convenience, a given MM element image ${M_{ij}}$ is represented at row i column j on a $4 \times 4$ grid. (b)PBP images: Top row refers to the noisy PBP images, and the bottom row to corresponding ground truth.

Download Full Size | PDF

Table 2. The formulas and physical meanings of PBPs used in experiments.

View Table | View all tables in this article

4.2 Implementation details

The proposed network architecture is end-to-end trainable and requires no pre-training. In this experiment, we insert 7, 5, 3, 1 CABs in skip connections from top to bottom, respectively. For all CABs, the reduction rate r is set to 15. The number of channels of shallow feature C is 45. The models are trained on 128×128 patches with a batch size of 10 for 100 training epochs. The networks are trained with the Adam optimizer [37]. The Adam optimizer is an extension of stochastic gradient descent, used to finetune the parameters in neural networks to minimize the loss. The parameters of the Adam optimizer are set to $\{{{{\rm{\beta }}_1}{\rm{ = 0}}{\rm{.9,\;}}{{\rm{\beta }}_2}{\rm{ = 0}}{\rm{.999}}} \}$. The initial learning rate is set to ${\rm{2 \times 1}}{{\rm{0}}^{{\rm{ - 4}}}}$. We applied the cosine annealing strategy [38] to steadily decrease the learning rate from the initial value to $1{\rm{ \times 1}}{{\rm{0}}^{{\rm{ - 6}}}}$ during training. For the training data, we perform the same data augmentation, including random rotations of 90, 180, 270, and horizontal or vertical flipping. We use PyTorch [39] 1.9.1 and Python 3.8.11 to implement models and train them on NVIDIA GeForce RTX 3090 GPU. The operating system is Ubuntu 20.0.4 with kernel 5.11.0.

4.3 Mueller matrix image denoising results

After training phase, the network is fixed and used to output the denoised MM images. In this section, we evaluate the quality of denoised MM images and PBP images in terms of quantified evaluations and visual effects. In order to further demonstrate the effectiveness of the proposed method, we also blindly tested it on entirely different samples, which were not used in the training process. The model can effectively denoise MM images of samples that are totally different from the training data. Specifically, as shown in Table 1, four types of samples are used to train and evaluate, including breast tissue, stained liquid-based cervical cytology smears, liver fibrosis tissue, and liver cancer tissue, while the unstained liquid-based cervical cytology smears are only used to test the generalization capability.

We compared the proposed denoising method with existing state-of-the-art deep-learning-based methods, including DnCNN [20] and MIRNet [40], which have achieved excellent performance in grayscale or color image denoising tasks and demonstrated the superiority of our method in MM image denoising.

4.3.1 Quantitative comparison

We use Root Mean Square Error (RMSE), peak signal-to-noise (PSNR), and structural similarity (SSIM) index [41] as quantitative criteria to evaluate MM images output by networks and the derived PBP images. RMSE expresses the differences between two images based on a calculation of error at each pixel coordinate, as shown in Eq. (5), where $\hat I$ and ${I_{gt}}$ are the tested image and reference, respectively, $({x,y} )$ represents a given pixel coordinate in an ${n_x} \times {n_y}$ image. PSNR expresses a logarithmic quantity of tested image quality with respect to reference using the decibel scale, as shown in Eq. (6), where ${\rm{\alpha }}$ is computed so that the maximum value of $\alpha \cdot {I_{gt}}$ is d. SSIM is a perceptual metric that quantifies the similarity between tested image and reference, as shown in Eq. (7), where ${\rm{\mu }}$ refers to the average, ${\sigma ^2}$ is the variance, ${\sigma _{{I_{gt}}\hat I}}$ denotes the covariance of tested image and reference, ${C_1}$ and ${C_2}$ are constants to avoid being divided by zero. All codes for computing the performance are implemented with Python.

(5)$$RMSE({\widehat I,{I_{gt}}} )= \sqrt {\frac{1}{{{n_x}{n_y}}} \cdot \sum\limits_{x = 1}^{{n_x}} {\sum\limits_{y = 1}^{{n_y}} {{{[{{I_{gt}}({x,y} )- \widehat I({x,y} )} ]}^2}} } }$$

(6)$$PSNR\left( {\widehat I,{I_{gt}}} \right) = 10 \cdot {\log _{10}}\left( {\frac{{{d^2}}}{{{\alpha ^2} \cdot \sum\limits_{x = 1}^{{n_x}} {\sum\limits_{y = 1}^{{n_y}} {{{\left[ {{I_{gt}}\left( {x,y} \right) - \widehat I\left( {x,y} \right)} \right]}^2}} } }}} \right)$$

(7)$$SSIM({\widehat I,{I_{gt}}} )= \frac{{({2{\mu_{{I_{gt}}}}{\mu_{\widehat I}} + {C_1}} )({2{\sigma_{{I_{gt}}\widehat I}} + {C_2}} )}}{{({{\mu_{{I_{gt}}}}^2 + {\mu_{\widehat I}}^2 + {C_1}} )({{\sigma_{{I_{gt}}}}^2 + {\sigma_{\widehat I}}^2 + {C_2}} )}}$$

The comparison of four types of stained sample test dataset is reported in Table 3. Mean RMSE (MRMSE), mean PSNR (MPSNR) and mean SSIM (MSSIM) are used to evaluate the performance on the MM image test set by averaging the 15 MM element images. Compared with the original MM and PBP images, image quality after the proposed denoising show significant improvement. The value of MRMSE decreases more than half in terms of MM image. Our approach also performs better in all indices over MIRNet, which is one of the latest best CNN methods in real color image denoising task. MIRNet has achieved excellent performance with many novel building blocks extracting, exchanging, and utilizing multi-scale feature information, but suffered from a heavy computational burden due to its sophisticated structure. Compared to MIRNet, our model is resource-efficient, thus efficient enough to process high-resolution MM images without dividing them into patches. After training, the proposed MDU-Net only requires 2.7s on average in the test set to get the denoised result, which is faster than MIRNet (about 6.8s). The improvement of acquisition time can make sense, especially for the development of polarization imaging-based digital pathology, which has a great need for high-quality data.

Table 3. Quantitative denoised results on four types of stained sample test dataset.^a

View Table | View all tables in this article

Next, we blindly test the proposed method on the unstained liquid-based cervical cytology smears test dataset. Quantitative comparisons are summarized in Table 4. The experimental results demonstrate that our model generalizes well to different MM image domains.

Table 4. Quantitative denoised results on unstained liquid-based cervical cytology smears which are entirely different from training data.^a

View Table | View all tables in this article

4.3.2 Visual comparison

First, we provide visualization of the noise reduction results on the MM images. For a clearer presentation, we select a region from a liver fibrosis sample imaged with a $4 \times $ objective lens to show the effectiveness of our method. The ${{{M}}_{{{22}}}}$, ${{{M}}_{{{33}}}}$ and ${{{M}}_{{{44}}}}$ are all subtracted by 1 for the convenience of presentation. The results for all 15 denoised MM images output by our model are shown in Fig. 5 (b). The corresponding low SNR Fig. 5 (a) and high SNR Fig. 5 (c) MM images are available for comparison. Compared with the ground truth, there are many apparent “pocks” in the noisy MM image, and a lot of useful information is degraded by this background noise. For example, details in both ${M_{23}}$ and ${M_{32}}$ images directly from experiments are hard to distinguish but can be easily recognized after denoising. It proves that our method is able to remove real noise in MM images while preserving the polarimetric texture details well.

Fig. 5. MM image denoising example from a liver fibrosis sample. (a) Noisy MM image, (b) denoised results, (c) Ground truth. For convenience, a given MM element image ${M_{ij}}$ is represented at row i column j on a $4 \times 4$ grid. Our method effectively removes the noise in every MM element image.

Download Full Size | PDF

In Fig. 6, we pick 3 regions in a liver fibrosis tissue and enlarge them to demonstrate the visual comparison on the effects of different denoising techniques for different PBP images, including D, $\Delta$ and $\delta $ which correspond to diattenuation, depolarization, and birefringence properties of the sample. The first column of the figure shows that the noisy MM images result in noisy PBP images, and the noise background looks stronger for the anisotropic polarization properties D and $\delta $ then for the isotropic polarization property $\Delta$. The three denoising techniques can effectively remove the background noise in $\Delta$ images, but MIRNet and our MDU-Net work more effectively than DnCNN for the anisotropic PBP images for D and $\delta \;{\rm{\;}},{\rm{\;}}$ which contain more microstructural features. For example, the birefringence properties on tissues are always affected by the existence of fibrous microstructures, the distribution and orientations of the fiber shown in denoised $\delta $ image are clearer. It seems that the MIRNet has a similar performance with the proposed MDU-Net from the visual effect. However, MDU-Net obtained performance gains of 0.38 dB on D, 0.69 dB on Δ, and 0.37 dB on $\delta $.

Fig. 6. Denoised PBP images for D, $\Delta$ and $\delta $ derived by different methods from a liver fibrosis tissue. The upper part refers to the grayscale image of the sample, the following part is the results. Specifically, D images refer to the brown bounding box, $\Delta$ to the red one, and $\delta $ to the blue one.

Download Full Size | PDF

MDU-Net works well on other types of samples, which are stained (breast tissue, liver cancer tissue, and liquid-based cervical cytology smear), as shown in Fig. 7, Fig. 8, and Fig. 9. The denoised PBP images are highly similar to the ground truth.

Fig. 7. The denoised PBP results of liver fibrosis tissue. The denoised PBP images are on the right, corresponding grayscale images are presented on the left side.

Download Full Size | PDF

Fig. 8. The denoised PBP results of liver cancer tissue. The denoised PBP images are on the right, corresponding grayscale images are presented on the left side.

Download Full Size | PDF

Fig. 9. The denoised PBP results of stained liquid-based cervical cytology smear. The denoised PBP images are on the right, corresponding grayscale images are presented on the left side.

Download Full Size | PDF

Figure 10 shows the results of the unstained liquid-based cervical cytology smear, which illustrates the model generalization on the samples that are completely different from the training data. PBP images derived from noisy MM images also show strong background noise. The previously trained MDU-Net has then applied to denoise the MM images of the unstained cervical cytology smear, which were not included in the training data set. The denoising is very effective that one can recognize the cell border and nuclei easily from the PBP images. Currently, the diagnosis of cervical cancer is usually determined by calculating the nucleocytoplasmic ratio of stained cervical cells, which requires a time-consuming sample preparation process. Since the nuclei and cellular extent of these unstained cells can be even clearer after post-processing, such fast and high-quality PBP microscopy technique for unstained cells may have many potential applications in rapid cervical cancer diagnosis or screening in clinical operations.

Fig. 10. Denoised PBP images correspond to an unstained liquid-based cervical cytology smear which is entirely different from training data. The right images show the denoised PBP results, while the left image corresponds to grayscale image.

Download Full Size | PDF

4.4 Model analysis

First, we present ablation experiments to analyze the contribution of design choices to the final performance. Specifically, we analyzed the impact of the GRL (a shortcut map to learn the residual information between the noisy input MM image and output denoised MM image), the LRL in convolution building block and the CAM in skip connections by adding them to our baseline U-Net. Evaluation is performed on the stained sample MM images dataset with the denoising models trained for 100 epochs (an epoch means training the network with training data for one cycle, and the internal model parameters are updated with each epoch). We implemented and trained five networks: the baseline (Net-1), the baseline-GRL (Net-2), the baseline-GRL-LRL (Net-3), the baseline-GRL-CAM (Net-4), the baseline-GRL-LRL-CAM (MDU-Net).

Figure 11 shows the convergence progress of the above five models on the validation set during the training stage. Firstly, we can observe that the GRL plays the most important role in the image denoising task, the absence of the GRL (Net-1) causes the largest performance drop. Compared with MDU-Net, the convergence progress of the baseline U-Net is not stable, and its convergence MRMSE value is higher than MDU-Net, which indicates it performs worse. The denoised output has a lot of shared information with noisy input. With GRL, the network can transmit the low-frequency information directly forward to output, which can help the model focus on high-frequency information and stabilize training. Next, according to Fig. 11, adding LRL (Net-3) or CAM (Net-4) also yields performance improvement. A benefit of the LRL is that it can solve the performance degradation problem, and make training deeper networks easier. For CAM, it implicitly applies feature selection to the MM element and makes the network focus more on informative features of MM image. Moreover, CAM mitigates the shortcomings, including limited receptive field. Indeed, MDU-Net adding both LRL and CAM performs best.

Fig. 11. The mean values of RMSE for 15 MM element images with increasing number of training epochs for Net-1, Net-2, Net-3, Net-4 and MDU-Net on the validation dataset.

Download Full Size | PDF

The quantitative evaluation results on test dataset are listed in the Table 5. As shown in Table 5, the proposed MDU-Net performs the best while the U-Net (Net-1) performs the worst, which is consistent with the analysis in Fig. 11.

Table 5. The mean values of RMSE, PSNR and SSIM on test dataset for Net1, Net2, Net3, Net4 and MDU-Net.

View Table | View all tables in this article

Next, we analyze how the number of CAB in skip connections affects the restoration quality. We fix the total number of CAB to 16, then allocate them to different level skip connections. The evaluation results of the test dataset are listed in Table 6. In particular, ‘1,3,5,7’ means that the numbers of CAB in skip connections from top to bottom are 1,3,5,7, respectively. It can be observed that the ‘7, 5, 3, 1’ achieves the best results, followed by ‘4,4,4,4’ and ‘1,3,5,7’. We found that more CABs in low-level skip connections result in better performance, which is in line with our expectations because the feature maps from low levels contain more detailed raw information.

Table 6. The mean values of RMSE, PSNR and SSIM on test dataset for different designs in skip connections.

View Table | View all tables in this article

5. Conclusion

In this paper, we have demonstrated that deep learning can significantly enhance the quality of MM and PBP images. We built a dataset containing low-SNR and high-SNR MM image pairs of various samples. Contrasts in both MM and PBP images can be low in some low-SNR data. We have proposed a deep learning approach based on a U-Net combining channel attention mechanism to perform MM image denoising. Experiments show that our method has achieved promising performance with strong generalization capability in terms of both visual effect and objective quality metrics. In particular, the proposed method can effectively reduce the noise in MM images with different magnification and generalize well for samples different from the training data. Denoising of the MM images also results in higher contrasts in PBP images, which seems more effective for anisotropic polarization properties. In fact, this method as a supervised approach can be also applied in other image restoration or enhancement cases as long as a large number of paired low-SNR and high-SNR images are available in training data. Since the acquisition of high-quality images is time-consuming and can be expensive for extra requirements in system stability, such a learning-based denoising method allows one to more quickly obtain higher measurement accuracy faster and with minimal expense.

Funding

Guangdong Development Project of Science and Technology (2020B1111040001); National Natural Science Foundation of China (41527901, 61527826).

Acknowledgments

We thank Wenming Yang, Ruqi Huang and Conghui Shao for helpful discussions.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. N. Ghosh and A. I. Vitkin, “Tissue polarimetry: concepts, challenges, applications, and outlook,” J. Biomed. Opt. 16(11), 110801 (2011). [CrossRef]

2. V. V. Tuchin, “Polarized light interaction with tissues,” J. Biomed. Opt. 21(7), 071114 (2016). [CrossRef]

3. C. He, H. He, J. Chang, B. Chen, H. Ma, and M. J. Booth, “Polarisation optics for biomedical and clinical applications: a review,” Light: Sci. Appl. 10(1), 1–20 (2021). [CrossRef]

4. Y. Wang, H. He, J. Chang, N. Zeng, S. Liu, M. Li, and H. Ma, “Differentiating characteristic microstructural features of cancerous tissues using Mueller matrix microscope,” Micron 79, 8–15 (2015). [CrossRef]

5. Z. Chen, R. Meng, Y. Zhu, and H. Ma, “A collinear reflection Mueller matrix microscope for backscattering Mueller matrix imaging,” Opt. Lasers Eng. 129, 106055 (2020). [CrossRef]

6. S.-Y. Lu and R. A. Chipman, “Interpretation of Mueller matrices based on polar decomposition,” J. Opt. Soc. Am. A 13(5), 1106–1113 (1996). [CrossRef]

7. H. He, N. Zeng, E. Du, Y. Guo, D. Li, R. Liao, and H. Ma, “A possible quantitative Mueller matrix transformation technique for anisotropic scattering media/Eine mögliche quantitative Müller-Matrix-Transformations-Technik für anisotrope streuende Medien,” Photonics Lasers Med. 2(2), 129–137 (2013). [CrossRef]

8. P. Li, Y. Dong, J. Wan, H. He, T. Aziz, and H. Ma, “Polaromics: deriving polarization parameters from a Mueller matrix for quantitative characterization of biomedical specimen,” J. Phys. D: Appl. Phys. 55(3), 034002 (2022). [CrossRef]

9. Y. Wang, H. He, J. Chang, C. He, S. Liu, M. Li, N. Zeng, J. Wu, and H. Ma, “Mueller matrix microscope: a quantitative tool to facilitate detections and fibrosis scorings of liver cirrhosis and cancer tissues,” J. Biomed. Opt. 21(7), 071112 (2016). [CrossRef]

10. T. Liu, M. Lu, B. Chen, Q. Zhong, J. Li, H. He, H. Mao, and H. Ma, “Distinguishing structural features between Crohn's disease and gastrointestinal luminal tuberculosis using Mueller matrix derived parameters,” J. Biophotonics 12(12), e201900151 (2019). [CrossRef]

11. Y. Dong, J. Wan, X. Wang, J.-H. Xue, J. Zou, H. He, P. Li, A. Hou, and H. Ma, “A polarization-imaging-based machine learning framework for quantitative pathological diagnosis of cervical precancerous lesions,” IEEE Trans. Med. Imaging 40(12), 3728–3738 (2021). [CrossRef]

12. Y. Dong, J. Wan, L. Si, Y. Meng, Y. Dong, S. Liu, H. He, and H. Ma, “Deriving polarimetry feature parameters to characterize microstructural features in histological sections of breast tissues,” IEEE Trans. Biomed. Eng. 68(3), 881–892 (2021). [CrossRef]

13. R. Collins and J. Koh, “Dual rotating-compensator multichannel ellipsometer: instrument design for real-time Mueller matrix spectroscopy of surfaces and films,” J. Opt. Soc. Am. A 16(8), 1997–2006 (1999). [CrossRef]

14. Z. Chen, Y. Yao, Y. Zhu, and H. Ma, “Removing the dichroism and retardance artifacts in a collinear backscattering Mueller matrix imaging system,” Opt. Express 26(22), 28288–28301 (2018). [CrossRef]

15. E. Compain, S. Poirier, and B. Drevillon, “General and self-consistent method for the calibration of polarization modulators, polarimeters, and Mueller-matrix ellipsometers,” Appl. Opt. 38(16), 3490–3502 (1999). [CrossRef]

16. J. Zhou, H. He, Z. Chen, Y. Wang, and H. Ma, “Modulus design multiwavelength polarization microscope for transmission Mueller matrix imaging,” J. Biomed. Opt. 23(01), 1 (2018). [CrossRef]

17. G. Anna and F. Goudail, “Optimal Mueller matrix estimation in the presence of Poisson shot noise,” Opt. Express 20(19), 21331–21340 (2012). [CrossRef]

18. F. Goudail, “Optimal Mueller matrix estimation in the presence of additive and Poisson noise for any number of illumination and analysis states,” Opt. Lett. 42(11), 2153–2156 (2017). [CrossRef]

19. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015). [CrossRef]

20. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778.

21. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: unified, real-time object detection,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (2016), pp. 779–788.

22. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-assisted Intervention (Springer, 2015), pp. 234–241.

23. Y. Rivenson, Z. Göröcs, H. Günaydin, Y. Zhang, H. Wang, and A. Ozcan, “Deep learning microscopy,” Optica 4(11), 1437–1443 (2017). [CrossRef]

24. Y. Rivenson, Y. Zhang, H. Günaydın, D. Teng, and A. Ozcan, “Phase recovery and holographic image reconstruction using deep learning in neural networks,” Light: Sci. Appl. 7(2), 17141 (2018). [CrossRef]

25. Y. Jiao, Y. R. He, M. E. Kandel, X. Liu, W. Lu, and G. Popescu, “Computational interference microscopy enabled by deep learning,” APL Photonics 6(4), 046103 (2021). [CrossRef]

26. H. Guan, D. Li, H.-c. Park, A. Li, Y. Yue, Y. A. Gau, M.-J. Li, D. E. Bergles, H. Lu, and X. Li, “Deep-learning two-photon fiberscopy for video-rate brain imaging in freely-behaving mice,” Nat. Commun. 13(1), 1–9 (2022). [CrossRef]

27. X. Li, H. Li, Y. Lin, J. Guo, J. Yang, H. Yue, K. Li, C. Li, Z. Cheng, and H. Hu, “Learning-based denoising for polarimetric images,” Opt. Express 28(11), 16309–16321 (2020). [CrossRef]

28. Q. Zhao, T. Huang, Z. Hu, T. Bu, S. Liu, R. Liao, and H. Ma, “Geometric optimization method for a polarization state generator of a Mueller matrix microscope,” Opt. Lett. 46(22), 5631–5634 (2021). [CrossRef]

29. T. Huang, R. Meng, J. Qi, Y. Liu, X. Wang, Y. Chen, R. Liao, and H. Ma, “Fast Mueller matrix microscope based on dual DoFP polarimeters,” Opt. Lett. 46(7), 1676–1679 (2021). [CrossRef]

30. D. Sabatke, M. Descour, E. Dereniak, W. Sweatt, S. Kemme, and G. Phipps, “Optimization of retardance for a complete Stokes polarimeter,” Opt. Lett. 25(11), 802–804 (2000). [CrossRef]

31. S. Roussel, M. Boffety, and F. Goudail, “On the optimal ways to perform full Stokes measurements with a linear division-of-focal-plane polarimetric imager and a retarder,” Opt. Lett. 44(11), 2927–2930 (2019). [CrossRef]

32. X. Mao, C. Shen, and Y.-B. Yang, “Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections,” Adv. Neural Inform. Process. Systems 29, 2802–2810 (2016). [CrossRef]

33. Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, and Y. Fu, “Image super-resolution using very deep residual channel attention networks,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018), 286–301.

34. S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, “Cbam: Convolutional block attention module,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018), 3–19.

35. K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: surpassing human-level performance on imagenet classification,” in Proceedings of the IEEE international Conference on Computer Vision (2015), pp. 1026–1034.

36. J. T. Barron, “A general and adaptive robust loss function,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 4331–4339.

37. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).

38. I. Loshchilov and F. Hutter, “Sgdr: Stochastic gradient descent with warm restarts,” arXiv preprint arXiv:1608.03983 (2016).

39. A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in pytorch,” (2017).

40. S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, M.-H. Yang, and L. Shao, “Learning enriched features for real image restoration and enhancement,” in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16 (Springer, 2020), pp. 492–511.

41. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing13(4), 600–612 (2004). [CrossRef]

	Breast Tissue	Stained cervical smear	Liver fibrosis tissue	Liver cancer tissue	Unstained cervical smear	Total
Training	47 / 11,844	23 / 5,796	38 / 9,576	45 / 11,340	0 / 0	153 / 38,556
Validation	28 / 7,056	15 / 3,780	15 / 3,780	19 / 4,788	0 / 0	77 / 19,404
Testing	27 / 27	12 / 12	19 / 19	19 / 19	30 / 30	107 / 107

Polarimetry Basis Parameters (PBPs)	Physical Meanings
$D = \sqrt{{M_{12}}^{2} + {M_{13}}^{2} + {M_{14}}^{2}}$	Diattenuation
$Δ = 1 - \frac{1}{3} \| t r (M_{Δ}) \|$	Depolarization
$δ = \cos^{- 1} (\sqrt{{(M_{R 22} + M_{R 33})}^{2} + {(M_{R 22} - M_{R 33})}^{2}} - 1)$	Linear Retardance

	MM Images			PBP Images ( $D / Δ / δ$ )
Method	MRMSE ( $\times 1 0^{- 3})$	MPSNR	MSSIM	RMSE ( $\times 1 0^{- 3})$	PSNR	SSIM ( $\times 1 0^{- 2})$
Noisy	6.589	30.07	0.6875	5.616/3.405/5.891	31.81/36.03/29.30	68.42/69.11/84.05
DnCNN	4.099	34.41	0.8097	4.154/1.989/4.154	34.14/40.89/32.01	76.35/87.44/91.35
MIRNet	2.740	37.90	0.8948	2.351/1.300/2.971	39.24/44.69/34.90	89.79/93.57/95.78
MDU-Net	2.654	38.12	0.8968	2.298/1.217/2.831	39.40/45.15/35.19	90.30/94.26/95.88

	MM Images			PBP Images ( $D / Δ / δ$ )
Method	MRMSE ( $\times 1 0^{- 3})$	MPSNR	MSSIM	RMSE ( $\times 1 0^{- 3})$	PSNR	SSIM( $\times 1 0^{- 2})$
Noisy	7.296	29.21	0.5118	5.696/4.099/6.508	31.31/34.36/30.10	57.15/72.87/61.51
MDU-Net	3.879	35.06	0.7870	3.825/2.914/3.525	35.60/38.02/36.14	79.57/87.53/87.27

	Net1	Net2	Net3	Net4	MDU-Net
GLR	✗	✓	✓	✓	✓
LRL	✗	✗	✓	✗	✓
CAM	✗	✗	✗	✓	✓
MRMSE $(\times 10^{- 3})$	3.748	2.913	2.838	2.768	2.654
MPSNR (dB)	36.01	37.38	37.63	37.75	38.12
MSSIM	0.8687	0.8866	0.8893	0.8919	0.8968

Deep learning for denoising in a Mueller matrix microscope

Abstract

1. Introduction

2. Experimental setup

3. Method

4. Experiments and analysis

4.1 Datasets

4.2 Implementation details

4.3 Mueller matrix image denoising results

4.3.1 Quantitative comparison

4.3.2 Visual comparison

4.4 Model analysis

5. Conclusion

Funding

Acknowledgments

Disclosures

Data availability

References

Data availability

Cited By

Figures (11)

Tables (6)

Equations (8)

Biomedical Optics Express

	1,3,5,7	4,4,4,4	7,3,5,1
MRMSE ( $\times 10^{- 3})$	2.795	2.720	2.654
MPSNR (dB)	37.74	37.93	38.12
MSSIM ( $\times 10^{- 2})$	89.17	89.30	89.68