
Mueller transform matrix neural network for underwater polarimetric dehazing imaging

Open Access

Abstract

Polarization dehazing imaging has been used to restore images degraded by scattering media, particularly in turbid water environments. While learning-based approaches have shown promise in improving the performance of underwater polarimetric dehazing, most current networks rely heavily on data-driven techniques without consideration of physics principles or real physical processes. This work proposes, what we believe to be, a novel Mueller transform matrix network (MTM-Net) for underwater polarimetric image recovery that considers the physical dehazing model adopting the Mueller matrix method, significantly improving the recovery performance. The network is trained with a loss function that combines content and pixel losses to facilitate detail recovery, and is sped up with the inverted residuals and channel attention structure without decreasing image recovery quality. A series of ablation experiments and comparative tests confirm the performance of this method, which achieves a better recovery effect than other methods. These results provide a deeper understanding of underwater polarimetric dehazing imaging and further expand the functionality of the polarimetric dehazing method.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Underwater imaging plays a significant role in many fields, especially in complex turbid environments such as underwater rescue, marine resource detection, and aquaculture [1–3]. However, the quality of underwater imaging is often compromised by the absorption and scattering effects of suspended particles in water, resulting in low contrast, blurred edges, and noisy images captured by detectors. Recently, restoring clear images from hazy ones, especially in complex turbid underwater environments, has become a research focus in the computational photography and vision community [4]. Deep learning techniques have been integrated with polarization imaging to make significant progress in various areas, including de-scattering [5–7], demosaicing [8,9], and denoising [10,11]. Hu et al. have proposed a polarization residual dense network for underwater polarimetric image recovery [12]. Zhu et al. have combined the image formation model with learning-based methods to effectively remove the veiling effect and restore a clear image [13]. Ren et al. have proposed a lightweight convolutional neural network combined with a physical dehazing model, which fuses the two key parameters of the polarization dehazing model into one parameter that the network learns to achieve clear defogging imaging [14]. Additionally, Xiang et al. have introduced a polarization residual dense network (PRDN) with four inputs to efficiently fuse four polarization features for effective polarization dehazing imaging in complex underwater environments with high turbidity [15].

In recent years, various methods have been proposed to improve image quality. Among these methods, the polarimetric dehazing method has attracted a lot of attention as one of the most effective techniques [16–20]. The Mueller matrix, which represents the change of polarization state as a coefficient matrix, can be adopted as an effective tool for image enhancement, owing to its ability to carry a large amount of polarization information. To address the challenges of underwater polarimetric imaging, a novel approach based on the Mueller matrix has been proposed; it synergistically modulates the polarization state of the illumination light and the polarization filter in front of the camera, intending to maximally filter the backscattered light and improve image quality [21]. Zhao et al. have employed the polar decomposition technique to estimate the Mueller matrix of the target over the entire field of view; this approach also involves calculating the illumination light with globally varying polarization states to suppress noise [22]. Liu et al. derived the depolarization index from the Mueller matrix to estimate the transmittance map and recover a clear view of the target [23].

Polarimetric dehazing imaging requires the network to capture both the environment's scattering information and the target objects' polarization information at multiple scales. It involves the fusion of multi-level polarization features and requires the network to be easily optimized and trained, ensuring efficient restoration performance for both global and local information. In this work, we propose a Mueller transform matrix network (MTM-Net) for underwater polarimetric dehazing imaging. It fuses the interpretability of a physical model based on the Mueller transform matrix with the feature extraction capability of a neural network. The U-shaped structure in this network allows multi-scale polarization feature extraction in the encoder and corresponding reconstruction in the decoder. The skip connections retrieve features from different levels in the encoder and fuse them with the corresponding levels in the decoder for better image restoration; they also propagate gradients through the network more effectively, alleviating gradient vanishing and exploding and making the network easier to optimize and train. The content loss and pixel loss jointly guide the training to take full advantage of the information in polarized images and further improve the performance. The network inference is sped up by using inverted residuals and a channel attention structure. The experimental results confirm the effectiveness and advantages of our approach compared to the state-of-the-art techniques in this field.

2. Theory model and structure of the neural network

The Mueller matrix can represent the polarization change characteristics of light with the conversion relationship between the Stokes vectors of the incident light and reflected light in an underwater environment:

$${{\boldsymbol S}^{haze}} = {[\begin{matrix} {s_0^{haze}}&{s_1^{haze}}&{s_2^{haze}}&{s_3^{haze}} \end{matrix}]^T} = {{\boldsymbol M}_{haze}} \cdot {{\boldsymbol S}^{in}} \tag{1}$$
$${{\boldsymbol S}^{clear}} = {[\begin{matrix} {s_0^{clear}}&{s_1^{clear}}&{s_2^{clear}}&{s_3^{clear}} \end{matrix}]^T} = {{\boldsymbol M}_{clear}} \cdot {{\boldsymbol S}^{in}} \tag{2}$$
where ${{\boldsymbol M}_{haze}}$ and ${{\boldsymbol M}_{clear}}$ are the Mueller matrices of the hazed water and clear water, and ${{\boldsymbol S}^{haze}}$ and ${{\boldsymbol S}^{clear}}$ are the Stokes vectors received by the detector after passing through the hazed water and clear water, respectively. ${{\boldsymbol S}^{in}}$ denotes the Stokes vector of the active incident light. Let a transformation matrix ${{\boldsymbol M}_{trans}}$ satisfy ${{\boldsymbol M}_{trans}} \cdot {{\boldsymbol M}_{haze}} = {{\boldsymbol M}_{clear}}$; then the polarimetric underwater imaging dehazing model can be obtained as:
$${{\boldsymbol M}_{trans}} \cdot {{\boldsymbol S}^{haze}} = {{\boldsymbol S}^{clear}} \tag{3}$$
From Eq. (3), we can see that the Stokes vector of the hazed image can be transformed into the Stokes vector of the clear image by the Mueller transform matrix. When the incident light does not contain a circularly polarized component, there is almost no circularly polarized component in the scattered light [22]. Therefore, the circular polarization component (Stokes component ${s_3}$) can be neglected, and the Mueller matrix and Stokes vector degrade to a third-order matrix and a three-component vector accordingly. In addition, only the first component $s_0^{clear}$ needs to be solved to obtain the clear intensity image from the Stokes vector of a hazed image, so the underwater dehazing model is simplified as:
$$s_0^{clear} = {\boldsymbol m}_t^T \cdot {[\begin{matrix} {s_0^{haze}}&{s_1^{haze}}&{s_2^{haze}} \end{matrix}]^T} \tag{4}$$
where ${{\boldsymbol m}_t} = {[\begin{matrix} {{M_{11}}}&{{M_{12}}}&{{M_{13}}} \end{matrix}]^T}$ is the convert vector, representing the first three elements of the first row of the matrix ${{\boldsymbol M}_{trans}}$. According to Eq. (4), the recovered intensity image $\hat{s}_0^{clear}$ can be calculated if ${{\boldsymbol m}_t}$ is given. Here ${{\boldsymbol m}_t}$ is estimated by deep learning with numerous pairs of ${{\boldsymbol S}^{haze}}$ and ${{\boldsymbol S}^{clear}}$. The architecture of the polarization dehazing network based on the above Mueller transform matrix is sketched in Fig. 1. Different from a conventional end-to-end network, the proposed network learns the convert vector ${{\boldsymbol m}_t}$ from the input Stokes vector ${{\boldsymbol S}^{haze}}$ instead of directly outputting the dehazed intensity image. The vector ${{\boldsymbol m}_t}$ is estimated from the training dataset with the U-shaped network. The loss between the predicted dehazed image $\hat{s}_0^{clear}$ and the corresponding ground truth is computed, and the network iteratively reduces the error according to the loss, enabling it to effectively estimate ${{\boldsymbol m}_t}$.
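As an illustration of how Eq. (4) turns the network output into the recovered image, the per-pixel dot product between the predicted convert-vector maps and the hazed Stokes maps can be sketched in PyTorch; the tensor layout is an assumption for illustration, not a detail stated in the text:

```python
import torch

def apply_mtm(m_t: torch.Tensor, s_haze: torch.Tensor) -> torch.Tensor:
    """Apply Eq. (4): s0_clear = m_t^T . [s0, s1, s2]^T at every pixel.

    m_t    : (B, 3, H, W) convert-vector maps predicted by the network
    s_haze : (B, 3, H, W) Stokes maps [s0, s1, s2] of the hazed scene
    returns: (B, 1, H, W) recovered intensity image s0_clear
    """
    # Per-pixel dot product over the Stokes/convert-vector dimension.
    return (m_t * s_haze).sum(dim=1, keepdim=True)
```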

Fig. 1. The structure of the proposed Mueller transform matrix neural network.

To reduce data pre-processing operations and floating-point calculations, a network with a U-shaped structure and skip connections is proposed to capture multiscale information hierarchically for underwater dehazing imaging. The linear polarization components 90°, 45°, 135°, and 0° are arranged in 2 × 2 macroblocks in an input mosaicked image of size 224 × 224 × 1. Stokes vectors ${{\boldsymbol S}^{haze}} = {[\begin{matrix} {s_0^{haze}}&{s_1^{haze}}&{s_2^{haze}} \end{matrix}]^T}$ are extracted from the input image in the polarization extraction (PE) block of the network, yielding an array of 224 × 224 × 3. The dimensions of the array are expanded to 224 × 224 × c (c is a hyperparameter depending on the number of convolution kernels) by a 3 × 3 convolution layer with LeakyReLU [24] in the Input Projection. Then, the feature maps pass through n (here n = 4) encoder stages in the U-shaped structure, and each stage contains an inverted residual polarization channel attention (IRPCA) block and one down-sampling layer. The most important polarization features are extracted with the channel-attention mechanism, and the computational cost is reduced by the inverted residual through depth-wise convolution [25] in the IRPCA block. The channels are doubled by a 3 × 3 convolution with stride 2 in the down-sampling layer, e.g., feature maps of ${2^l}C \times H/{2^l} \times W/{2^l}$ are generated in the l-th stage of the encoder from input feature maps of $C \times H \times W$. An IRPCA block is added at the end of the encoder as a bottleneck stage.
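A minimal PE-block sketch is given below. The [[90°, 45°], [135°, 0°]] macroblock layout and the nearest-neighbor upsampling back to the full 224 × 224 × 3 size are assumptions of this sketch (the layout is sensor-specific, and the text does not state how full resolution is retained); the Stokes formulas are those given in Section 3:

```python
import torch
import torch.nn.functional as F

def polarization_extraction(mosaic: torch.Tensor) -> torch.Tensor:
    """PE block sketch: split a DoFP mosaic (B, 1, H, W) into the four
    polarization channels and compute the linear Stokes maps."""
    # Assumed 2x2 macroblock layout [[90, 45], [135, 0]]; adapt the
    # offsets to the actual camera pattern.
    i90  = mosaic[:, :, 0::2, 0::2]
    i45  = mosaic[:, :, 0::2, 1::2]
    i135 = mosaic[:, :, 1::2, 0::2]
    i0   = mosaic[:, :, 1::2, 1::2]
    s0 = (i0 + i45 + i90 + i135) / 2   # total intensity (Section 3)
    s1 = i0 - i90                      # horizontal/vertical difference
    s2 = i45 - i135                    # diagonal difference
    s = torch.cat([s0, s1, s2], dim=1)  # (B, 3, H/2, W/2)
    # Upsample back to mosaic resolution so the PE output is H x W x 3,
    # matching the 224 x 224 x 3 array described above (our assumption).
    return F.interpolate(s, scale_factor=2, mode='nearest')
```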

The proposed decoder also contains n (n = 4 here) stages for feature reconstruction, and each stage includes an up-sampling layer and an IRPCA block similar to that of the encoder. The nearest neighbor interpolation algorithm [26] is adopted for up-sampling, by which the number of feature channels is halved and the size of the feature maps is doubled. The up-sampled features and the corresponding features from the encoder, passed through skip connections, are concatenated in the IRPCA block. A 224 × 224 × 3 feature map, the target convert vector ${{\boldsymbol m}_t}$ in Eq. (4), is obtained with a 3 × 3 convolution layer (not shown in the figure) after passing through the four decoder stages. The loss between $\hat{s}_0^{clear}$ and $s_0^{clear}$ is calculated, and all parameters of the network are continuously optimized by back-propagating this loss, so that ${{\boldsymbol m}_t}$ is learned during training.
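One possible realization of a decoder stage is sketched below; halving the channels with a 1 × 1 convolution is an assumption (the text does not specify the mechanism), and the IRPCA block, sketched after the next paragraph, is assumed to fuse the concatenated features:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderStage(nn.Module):
    """One decoder stage sketch: nearest-neighbor upsampling doubles the
    spatial size, a 1x1 convolution halves the channels, and the skip
    features from the matching encoder stage are concatenated before
    the IRPCA block."""
    def __init__(self, in_ch: int, irpca: nn.Module):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, in_ch // 2, kernel_size=1)
        # irpca must accept in_ch channels: in_ch//2 (upsampled) +
        # in_ch//2 (skip) after concatenation.
        self.irpca = irpca

    def forward(self, x, skip):
        x = F.interpolate(x, scale_factor=2, mode='nearest')  # double H, W
        x = self.reduce(x)                                    # halve channels
        return self.irpca(torch.cat([x, skip], dim=1))
```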

The IRPCA block in the proposed network is composed of the inverted residual structure and the channel-attention structure, as shown in Fig. 1. The computational cost is reduced in the inverted residual structure, whose parameter count is decreased by the depth-wise convolution. Firstly, the feature dimension is increased using a 1 × 1 convolution layer, followed by batch normalization and a ReLU activation function (BN-ReLU). The local information is extracted with a 3 × 3 depth-wise convolution, and the dimension is adjusted to match that of the input channels by a 1 × 1 convolution layer after BN-ReLU. Then, the output and input features are added up as ${F_{IR}}$ for the channel-attention operation. In the channel-attention structure, the features are compressed into a vector by global average pooling to generate initial channel weights, and the initial channel weights are adaptively adjusted by a one-dimensional convolution with an adaptively varying kernel size $k$ according to the number of channels $C$, where $k = |{({{\log }_2}C + b)/\gamma } |$ with $\gamma = 2$ and $b = 1$ [27]. The final features of the IRPCA block are output as the channel-wise multiplication of the adjusted channel weights and ${F_{IR}}$.
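A sketch of the IRPCA block following this description; the expansion ratio of the inverted residual and the sigmoid gating (as used in ECA-Net [27]) are assumptions not stated in the text:

```python
import math
import torch
import torch.nn as nn

class IRPCA(nn.Module):
    """IRPCA sketch: inverted residual (1x1 expand -> 3x3 depth-wise ->
    1x1 project, with a skip addition) followed by ECA-style channel
    attention whose 1-D kernel size is k = |(log2 C + b) / gamma|."""
    def __init__(self, channels: int, expansion: int = 4,
                 gamma: int = 2, b: int = 1):
        super().__init__()
        hidden = channels * expansion  # expansion ratio is an assumption
        self.inverted_residual = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
            # depth-wise 3x3 convolution extracts local information
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        k = int(abs((math.log2(channels) + b) / gamma))
        k = k if k % 2 else k + 1  # 1-D conv kernel must be odd
        self.attn = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        f_ir = x + self.inverted_residual(x)      # F_IR with skip addition
        w = f_ir.mean(dim=(2, 3))                 # global average pooling -> (B, C)
        w = self.attn(w.unsqueeze(1)).squeeze(1)  # adaptive 1-D conv over channels
        w = torch.sigmoid(w)[:, :, None, None]    # gating, as in ECA-Net
        return f_ir * w                           # channel-wise re-weighting
```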

The proposed IRPCA block enables the network to maintain high efficiency and performance for polarimetric dehazing imaging with fewer parameters and computational resources. The block is important for building an efficient deep learning-based polarimetric dehazing model of the Mueller matrix in this work. Polarimetric dehazing tasks involve extracting and fusing information from multiple polarimetric channels. The inverted residual structure enables the network to effectively process multi-channel features, while the channel attention mechanism enables the network to adaptively attend to specific polarimetric features based on their importance. Therefore, these techniques address the specific challenges of polarimetric dehazing, including reducing computational costs and improving feature extraction capabilities.

The loss function plays a crucial role in the training of neural networks as it reflects the optimization objective of the model. The edge loss function ${L^E}$ can make the outline of the image clearer, as defined by [28]:

$${L^E} = \frac{1}{{wh}}\sum\limits_{w,h} {{{||{{{({H_x}(G({x_{haze}})))}_{w,h}} - {{({H_x}({x_{label}}))}_{w,h}}} ||}_1}} + \frac{1}{{wh}}\sum\limits_{w,h} {{{||{{{({H_v}(G({x_{haze}})))}_{w,h}} - {{({H_v}({x_{label}}))}_{w,h}}} ||}_1}} \tag{5}$$
where ${||\cdot ||_1}$ represents the ${\ell _1}$ norm; $w$ and $h$ are the width and height of the image; ${H_x}({\cdot} )$ and ${H_v}({\cdot} )$ denote the horizontal and vertical gradient calculations, respectively; $G({\cdot} )$ denotes our proposed network; ${x_{haze}}$ and ${x_{label}}$ represent the hazed image and its corresponding clear label image, respectively.

In addition, the content loss function [29] is also introduced to restore feature details, with the VGG19 network trained on ImageNet for image classification [30]. Here, the last convolutional layer of the fourth convolutional block is adopted as the content layer, represented by ${\varphi _{25}}({\cdot} )$, and the content loss function ${L^C}$ can then be calculated by:

$${L^C} = \frac{1}{{wh}}\sum\limits_{w,h} {{{||{{{({\varphi_{25}}(G({x_{haze}})))}_{w,h}} - {{({\varphi_{25}}({x_{label}}))}_{w,h}}} ||}_2}} \tag{6}$$
Furthermore, the pixel loss function ${L^P}$ is introduced as a constraint to further restore the image, which can be calculated by:
$${L^P} = \frac{1}{{wh}}\sum\limits_{w,h} {{{||{{{(G({x_{haze}}))}_{w,h}} - {{({x_{label}})}_{w,h}}} ||}_2}} \tag{7}$$
where ${||\cdot ||_2}$ denotes the ${\ell _2}$ norm. Therefore, the complete customized loss function can be expressed as follows:
$$Loss = {\lambda _e}{L^E} + {\lambda _c}{L^C} + {\lambda _p}{L^P} \tag{8}$$
where ${\lambda _e}$, ${\lambda _c}$ and ${\lambda _p}$ are three weight coefficients of the loss function.
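The three terms can be sketched in PyTorch as follows. The reduction of the per-pixel norms to mean losses, the VGG19 feature slice (torchvision index 25 for the last convolution of the fourth block), the channel repetition for single-channel inputs, and the omission of ImageNet normalization are all implementation choices of this sketch, not details confirmed by the text:

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Content layer phi_25: index 25 of torchvision's vgg19().features is
# the last convolution of the fourth convolutional block.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features[:26].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def edge_loss(pred, label):
    # Eq. (5): l1 distance between horizontal and vertical gradients.
    dx = lambda t: t[..., :, 1:] - t[..., :, :-1]
    dy = lambda t: t[..., 1:, :] - t[..., :-1, :]
    return F.l1_loss(dx(pred), dx(label)) + F.l1_loss(dy(pred), dy(label))

def content_loss(pred, label):
    # Eq. (6): distance between phi_25 feature maps; single-channel
    # images are repeated to the 3 channels VGG expects.
    to3 = lambda t: t.repeat(1, 3, 1, 1)
    return F.mse_loss(vgg(to3(pred)), vgg(to3(label)))

def total_loss(pred, label, le=0.1, lc=1.0, lp=1.0):
    # Eq. (8) with the reported weights (0.1, 1, 1); the pixel term is
    # the plain l2 loss of Eq. (7).
    return le * edge_loss(pred, label) + lc * content_loss(pred, label) \
        + lp * F.mse_loss(pred, label)
```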

Here, the number of decoder and encoder stages is set to $n = 4$, the base-channel hyperparameter is set to $c = 32$, and the batch size is set to 64. The Adam optimizer with weight decay $10^{-4}$ is chosen to update the network parameters. The learning rate is initially set to $5 \times 10^{-3}$ with an exponential decay rate of 0.95. An Nvidia RTX 3090Ti GPU is used to train our model. In addition, it is found that the proposed model performs best with the weight coefficients ${\lambda _e}$, ${\lambda _c}$ and ${\lambda _p}$ set to 0.1, 1, and 1, respectively.
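A training-loop skeleton matching these reported settings might look as follows. Here `model`, `loader`, `apply_mtm`, and `total_loss` refer to the hypothetical pieces sketched earlier; stepping the exponential decay once per epoch and feeding pre-extracted Stokes maps from the loader are assumptions of this sketch:

```python
import torch

# Reported configuration: Adam, weight decay 1e-4, initial learning
# rate 5e-3, exponential decay 0.95; batch size 64 set in the loader.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

for epoch in range(100):
    for s_haze, s0_clear in loader:        # hypothetical DataLoader of pairs
        optimizer.zero_grad()
        m_t = model(s_haze)                # predicted convert-vector maps
        pred = apply_mtm(m_t, s_haze)      # Eq. (4), sketched earlier
        loss = total_loss(pred, s0_clear)  # Eq. (8)
        loss.backward()
        optimizer.step()
    scheduler.step()                       # decay assumed per epoch
```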

3. Experiment results and analysis

The underwater imaging experiments are carried out to acquire a real-world polarimetric dataset, and the schematic of the experimental setup is shown in Fig. 2. A 532 nm blue-green laser is used as the active light source, and linearly polarized light is generated by a linear polarizer. The target is placed in the water in a 65 × 30 × 40 cm transparent container, and scattering environments with different turbidity are simulated with different amounts of milk. The reflected light from the target is received by a commercial division-of-focal-plane (DoFP) polarization camera (LUCID, PHX050S-PC). The camera has 2048 × 2448 pixels, and its pixel array is composed of macroblocks with pixels of four different polarization orientations (90°, 45°, 135°, and 0°). A total of 120 image pairs are collected as the dataset, and all images are cropped horizontally and vertically with 224 pixels as the side length and 64 pixels as the stride. The image dataset is then expanded by data augmentation and randomly divided into a training set and a validation set in the ratio of 9:1. Finally, we obtain a training dataset of about 90,204 patches with a resolution of 224 × 224 pixels. The Stokes components $s_0$, $s_1$ and $s_2$ are computed in the Polarization Extraction (PE) block by $s_0 = (I_0 + I_{45} + I_{90} + I_{135})/2$, $s_1 = I_0 - I_{90}$, and $s_2 = I_{45} - I_{135}$. The turbidity of water is described by NTU (Nephelometric Turbidity Units) to quantify different levels of scattering in the underwater environment [23].
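For reference, the described sliding-window cropping (224-pixel side length, 64-pixel stride) can be sketched as follows; function and variable names are illustrative:

```python
import numpy as np

def crop_patches(image: np.ndarray, size: int = 224, stride: int = 64):
    """Slide a size x size window with the reported 64-pixel stride to
    build training patches from each captured image."""
    h, w = image.shape[:2]
    return [image[y:y + size, x:x + size]
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]
```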

Fig. 2. The schematic of the experiment setup.

The proposed networks with different configurations are examined to demonstrate the efficacy of our settings. The results of these ablation experiments with similar optimal weights are compared in Fig. 3 and Table 1. The raw image taken in turbid water without any processing has severely degraded information, as shown in the first plot in Fig. 3, compared with the ground truth (GT) image in the second plot of the first row. The images restored by the network without ${L^C}$ or without the MTM structure are shown in the first and second plots of the lower row in Fig. 3, respectively. There is a lack of sufficient resolution for the textual components in the image restored by the network without the content loss function (see the first plot in the lower row), and the image restored without the MTM structure exhibits obvious smears (see the second plot in the lower row). The third plot in the lower row of Fig. 3 shows the result of the network with the bilinear interpolation algorithm instead of the nearest neighbor interpolation algorithm in the up-sampling operation of the decoder stages; the image restored using bilinear interpolation has apparent noise and dark spots. The restored result with our proposed network structure (see the third plot in the first row of Fig. 3) reveals more details of the label. The corresponding PSNR and SSIM values are shown in the bottom left and bottom right corners of the plots. Our proposed MTM-Net performs best and achieves the highest PSNR and SSIM values. These results further demonstrate the superiority and effectiveness of our proposed method, which considers the underlying mechanisms of polarimetric dehazing imaging.

Fig. 3. Ablation experiment results. The corresponding PSNR and SSIM values for different methods are shown at the bottom of the plots.

Table 1. PSNR and SSIM for different methods

The total loss curve during training over the epoch iterations, along with corresponding intermediate recovered images, is illustrated in Fig. 4. There is a substantial decrease in the total loss within the first ten epochs, followed by a consistent downward trend from epoch 10 to epoch 100, reaching a plateau around 0.11. These results indicate that our network gradually learns and internalizes key features, improving output quality, as confirmed by the corresponding intermediate recovered images.

Fig. 4. Total loss with epoch iterations and intermediate recovered image.

In addition, the results of our method are compared with those of the traditional underwater polarization clear imaging model proposed by Schechner [16], the polarimetric image recovery method combining histogram stretching proposed by Huang [17], the learning-based polarimetric dense network (PDN) proposed by Hu [12], the physics-based dark channel prior (DCP) method proposed by He [3], and the image processing method CLAHE proposed by Reza [31]. The images recovered by these methods on the validation and test sets (the latter not appearing in the training set) are shown in Fig. 5.

Fig. 5. Comparison of the recovered images by different methods: (a) with a validation set; (b) with a test set (without appearing in the training set).

The images recovered by Schechner's and Huang's methods exhibit low contrast and high noise levels. The image generated by the DCP method is darkened compared to the GT image, whereas the CLAHE method outputs high-contrast but noisy images. These phenomena can be explained by the fact that DCP and CLAHE rely solely on intensity information and statistical rules for digital image processing. The images restored by the PDN method (trained with our custom dataset and recommended parameters) exhibit prevalent shadowing and smudging artifacts, especially on the scale part of the ruler. In contrast, our network can extract features at different scales from polarized images during the encoder stages and fuse these features during the decoder stages, leveraging a physically guided network architecture and a carefully designed loss function. The image restored with our approach shows superior visual quality in a highly turbid underwater environment of 40 NTU, as shown in Fig. 5(a). In addition, the generalization and robustness of our method are further examined with test sets that are not included in the training and validation sets. The comparison between our approach and the other methods at 60 NTU further indicates the superiority of our approach in generalization and robustness, as shown in Fig. 5(b). As the turbidity gradually increases, the restored images introduce noise in the background regions. The experimental results indicate that our method performs well in dehazing images when the turbidity is below 65 NTU. At 60 NTU, the images captured by the camera are barely discernible, while the images restored by our method remain clear, as shown in Fig. 5(b). However, when the turbidity exceeds 65 NTU, the restored image's details become blurred, and a significant amount of noise appears in the background regions. Our method becomes ineffective when the turbidity reaches 80 NTU.

To quantify the performance of our approach, we employ three metrics to assess the quality of the restored images, namely the mean square error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) [32]. These are full-reference evaluation metrics that involve the ground truth in their calculation. Table 2 presents the quantitative outcomes for the reconstructed images depicted in Fig. 5, with values displayed in bold signifying the superior result. As evidenced by the comparison, our approach outperforms all other methods for image restoration in terms of the MSE, PSNR, and SSIM metrics.

Table 2. MSE, PSNR, and SSIM values of recovered results with validation dataset (Fig. 5(a)) and test dataset (Fig. 5(b)).
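These three metrics can be computed, for instance, with scikit-image's full-reference implementations; the sketch assumes single-channel float images normalized to [0, 1]:

```python
from skimage.metrics import (mean_squared_error,
                             peak_signal_noise_ratio,
                             structural_similarity)

def evaluate(restored, ground_truth):
    """Full-reference metrics as used in Table 2; inputs are assumed to
    be single-channel float arrays in [0, 1]."""
    return {
        'MSE':  mean_squared_error(ground_truth, restored),
        'PSNR': peak_signal_noise_ratio(ground_truth, restored, data_range=1.0),
        'SSIM': structural_similarity(ground_truth, restored, data_range=1.0),
    }
```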

The inference time, number of model parameters, and number of floating-point operations (FLOPs) are employed as the key indicators for evaluating the efficiency of a deep network [33]. The inference time refers to the duration required to compute the output of a trained model given an input image. The number of model parameters refers to the total number of parameters that the model needs to learn and store. The floating-point operations refer to the total number of computations required for all floating-point arithmetic operations. Employing an Intel i5-11400 CPU, we computed these three key indicators on a 224 × 224 image for both our method and PDN. For inference time, our method takes 0.312 seconds compared to PDN's 4.784 seconds; in terms of model parameters, our approach has 3.190 M while PDN has 4.352 M; finally, regarding FLOPs, our method requires only 1.8 G, far below PDN's 218.5 G. The results show that our proposed method is more efficient.
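The first two indicators can be measured with plain PyTorch as sketched below; counting FLOPs additionally requires a profiler (e.g., thop or fvcore), which is omitted here, and the single-image timing protocol is an assumption of this sketch:

```python
import time
import torch

def efficiency_report(model: torch.nn.Module, size: int = 224):
    """Parameter count and CPU inference time on a size x size input."""
    params = sum(p.numel() for p in model.parameters()) / 1e6  # in millions
    x = torch.randn(1, 1, size, size)
    model.eval()
    with torch.no_grad():
        model(x)                          # warm-up pass
        t0 = time.perf_counter()
        model(x)                          # timed forward pass
        dt = time.perf_counter() - t0
    return {'params_M': params, 'inference_s': dt}
```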

4. Conclusion

The Mueller transform matrix is introduced for the underwater polarization image dehazing model, which yields better results than end-to-end data-driven dehazing methods. The IRPCA block, with efficient computation and feature extraction, is constructed according to this model. The MTM-Net is proposed for restoring underwater polarization images, bringing together the sophisticated feature extraction capabilities of deep learning and the distinct advantages of the Mueller matrix. The MTM-Net is constructed by considering the physical process of underwater polarization dehazing imaging with the Mueller matrix. To reduce data pre-processing operations and floating-point calculations, the U-shaped structure of the network with skip connections is designed to capture multiscale information hierarchically for underwater dehazing imaging. The IRPCA block in the proposed network is composed of the inverted residual structure through depth-wise convolution and the channel-attention structure. The loss function in this work, including the content loss, can effectively restore fine-grained details. The ablation experiment results indicate the superior outcomes of our approach, in both the quality of the restored images and the generalization and robustness. These results confirm the superior performance of this approach and provide a deeper understanding of underwater polarimetric dehazing imaging in turbid underwater environments by considering the physical model (the Mueller transform matrix model). This approach can be further extended to color images by applying it to the RGB channels, and to high-resolution real-time polarization dehazing imaging owing to its efficient computation and feature extraction.

Funding

Zhejiang Provincial Key Research and Development Program (2022C04007); National Natural Science Foundation of China (11874323).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. X. Guo, Y. Yang, C. Wang, and J. Ma, “Image dehazing via enhancement, restoration, and fusion: A survey,” Information Fusion 86-87, 146–170 (2022). [CrossRef]  

2. X. Li, Y. Han, H. Wang, T. Liu, S.-C. Chen, and H. Hu, “Polarimetric imaging through scattering media: A review,” Front. Phys. 10, 815296 (2022). [CrossRef]  

3. K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2341–2353 (2010). [CrossRef]  

4. X. Li, L. Yan, P. Qi, L. Zhang, F. Goudail, T. Liu, J. Zhai, and H. Hu, “Polarimetric Imaging via Deep Learning: A Review,” Remote Sens. 15(6), 1540 (2023). [CrossRef]  

5. A. Turpin, I. Vishniakou, and J. D. Seelig, “Light scattering control in transmission and reflection with neural networks,” Opt. Express 26(23), 30911–30929 (2018). [CrossRef]

6. D. Li, B. Lin, X. Wang, and Z. Guo, “High-performance polarization remote sensing with the modified U-net based deep-learning network,” IEEE Trans. Geosci. Remote Sensing 60, 1–10 (2022). [CrossRef]  

7. J. Zhang, J. Shao, J. Chen, D. Yang, B. Liang, and R. Liang, “PFNet: an unsupervised deep network for polarization image fusion,” Opt. Lett. 45(6), 1507–1510 (2020). [CrossRef]  

8. X. Zeng, Y. Luo, X. Zhao, and W. Ye, “An end-to-end fully-convolutional neural network for division of focal plane sensors to reconstruct S0, DoLP, and AoP,” Opt. Express 27(6), 8566–8577 (2019). [CrossRef]  

9. M. Pistellato, F. Bergamasco, T. Fatima, and A. Torsello, “Deep demosaicing for polarimetric filter array cameras,” IEEE Trans. on Image Process. 31, 2017–2026 (2022). [CrossRef]  

10. X. Li, H. Li, Y. Lin, J. Guo, J. Yang, H. Yue, K. Li, C. Li, Z. Cheng, and H. Hu, “Learning-based denoising for polarimetric images,” Opt. Express 28(11), 16309–16321 (2020). [CrossRef]  

11. H. Liu, Y. Zhang, Z. Cheng, J. Zhai, and H. Hu, “Attention-based neural network for polarimetric image denoising,” Opt. Lett. 47(11), 2726–2729 (2022). [CrossRef]  

12. H. Hu, Y. Zhang, X. Li, Y. Lin, Z. Cheng, and T. Liu, “Polarimetric underwater image recovery via deep learning,” Opt. Lasers Eng. 133, 106152 (2020). [CrossRef]  

13. Y. Zhu, T. Zeng, K. Liu, Z. Ren, and E. Y. Lam, “Full scene underwater imaging with polarization and an untrained network,” Opt. Express 29(25), 41865–41881 (2021). [CrossRef]  

14. Q. Ren, Y. Xiang, G. Wang, J. Gao, Y. Wu, and R.-P. Chen, “The underwater polarization dehazing imaging with a lightweight convolutional neural network,” Optik 251, 168381 (2022). [CrossRef]  

15. Y. Xiang, X. Yang, Q. Ren, G. Wang, J. Gao, K.-H. Chew, and R.-P. Chen, “Underwater polarization imaging recovery based on polarimetric residual dense network,” IEEE Photonics J. 14(6), 1–6 (2022). [CrossRef]  

16. Y. Y. Schechner and N. Karpel, “Recovery of underwater visibility and structure by polarization analysis,” IEEE J. Oceanic Eng. 30(3), 570–587 (2005). [CrossRef]  

17. X. Li, H. Hu, L. Zhao, H. Wang, Y. Yu, L. Wu, and T. Liu, “Polarimetric image recovery method combining histogram stretching for underwater imaging,” Sci. Rep. 8(1), 12430 (2018). [CrossRef]  

18. H. Jin, L. Qian, J. Gao, Z. Fan, and J. Chen, “Polarimetric calculation method of global pixel for underwater image restoration,” IEEE Photonics J. 13(1), 1–15 (2021). [CrossRef]  

19. J. Liang, L. Ren, H. Ju, W. Zhang, and E. Qu, “Polarimetric dehazing method for dense haze removal based on distribution analysis of angle of polarization,” Opt. Express 23(20), 26146–26157 (2015). [CrossRef]  

20. F. Liu, P. Han, Y. Wei, K. Yang, S. Huang, X. Li, G. Zhang, L. Bai, and X. Shao, “Deeply seeing through highly turbid water by active polarization imaging,” Opt. Lett. 43(20), 4903–4906 (2018). [CrossRef]  

21. H. Wang, J. Li, H. Hu, J. Jiang, X. Li, K. Zhao, Z. Cheng, M. Sang, and T. Liu, “Underwater imaging by suppressing the backscattered light based on Mueller matrix,” IEEE Photonics J. 13(4), 1–6 (2021). [CrossRef]  

22. Y. Zhao, W. He, H. Ren, Y. Zhang, and Y. Fu, “Polarization Descattering Imaging of Underwater Complex Targets Based on Mueller Matrix Decomposition,” IEEE Photonics J. 14(5), 1–6 (2022). [CrossRef]  

23. F. Liu, S. Zhang, P. Han, F. Chen, L. Zhao, Y. Fan, and X. Shao, “Depolarization index from Mueller matrix descatters imaging in turbid water,” Chin. Opt. Lett. 20(2), 022601 (2022). [CrossRef]  

24. B. Xu, N. Wang, T. Chen, and M. Li, “Empirical evaluation of rectified activations in convolutional network,” arXiv, arXiv:1505.00853 (2015). [CrossRef]  

25. M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “Mobilenetv2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE conference on computer vision and pattern recognition (IEEE, 2018), pp. 4510–4520.

26. O. Rukundo and H. Cao, “Nearest neighbor value interpolation,” arXiv, arXiv:1211.1768 (2012). [CrossRef]  

27. Q. Wang, B. Wu, P. Zhu, P. Li, W. Zuo, and Q. Hu, “ECA-Net: Efficient channel attention for deep convolutional neural networks,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (IEEE, 2020), pp. 11534–11542.

28. J. Zhang, J. Shao, H. Luo, X. Zhang, B. Hui, Z. Chang, and R. Liang, “Learning a convolutional demosaicing network for microgrid polarimeter imagery,” Opt. Lett. 43(18), 4534–4537 (2018). [CrossRef]  

29. L. A. Gatys, A. S. Ecker, and M. Bethge, “Image style transfer using convolutional neural networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition (IEEE, 2016), pp. 2414–2423.

30. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv, arXiv:1409.1556 (2014). [CrossRef]  

31. A. M. Reza, “Realization of the contrast limited adaptive histogram equalization (CLAHE) for real-time image enhancement,” Journal of VLSI Signal Processing Systems for Signal Image and Video Technology 38(1), 35–44 (2004). [CrossRef]  

32. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. on Image Process. 13(4), 600–612 (2004). [CrossRef]  

33. Y. Song, Y. Zhou, H. Qian, and X. Du, “Rethinking performance gains in image dehazing networks,” arXiv, arXiv:2209.11448 (2022). [CrossRef]  
