
Recovery for underwater image degradation with multi-stage progressive enhancement

Open Access

Abstract

Optical absorption and scattering degrade the quality of underwater images, which hampers the performance of underwater vision tasks. In practice, well-posed underwater image recovery requires a combination of scene specificity and adaptability. To this end, this paper breaks down the overall recovery process into in-situ enhancement and data-driven correction modules, and proposes a Multi-stage Underwater Image Enhancement (MUIE) method to cascade the modules. In the in-situ enhancement module, a channel compensation with scene-relevant supervision is designed to address different degrees of unbalanced attenuation, and a duality-based computation then inverts the result of running an enhancement on inverted intensities to recover the degraded textures. In response to different scenarios, a data-driven correction, which encodes corrected color-constancy information under data supervision, is performed to correct the improper color appearance of the in-situ enhanced results. Further, under the collaboration between scene and data information, the recovery of MUIE avoids ill-posed responses and reduces the dependence on scene-specific priors, resulting in robust performance in different underwater scenes. Comparison results confirm the superiority of MUIE in scene clarity, realistic color appearance, and evaluation scores. With the recovery of MUIE, the Underwater Image Quality Measurement (UIQM) scores of recovery-challenging images in the UIEB dataset were improved from 1.59 to 3.92.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

In recent years, underwater visual imagery has found wide application in underwater vision tasks [1,2], underwater environment monitoring, seabed resource exploration, and salvage [3,4]. However, underwater images suffer from low readability and illegibility due to light scattering and selective absorption in water (Underwater images of Fig. 1). Therefore, images with excellent perceptual quality are highly desired in practical applications.

Fig. 1. Demonstration of (A) underwater optical imaging, (B) degraded underwater images, (C) method overview of MUIE, and (D) the corresponding recovered results of MUIE. (Best viewed at 200% zoom).

It is well known that underwater image recovery aims to generate a clear image with corrected color from a degraded input [5,6]. However, as only the degraded input is available, underwater image recovery is an under-determined problem.

Early recovery algorithms focus on exploiting prior knowledge from natural-scene images, such as the Dark Channel Prior (DCP) [7], Red Channel Prior (RCP) [8], and Minimum Loss Prior (MLP) [9]. The goal of the prior-based methods is to obtain the theoretically non-degraded image, with alleviated blur and color shifting, by reversing the imaging equation [10] or solving color models (HSV [11] or RGB [12]). However, these methods typically involve non-convex optimization in their modeling, which limits recovery performance.

Recently, based on deep Generative Adversarial Networks (GANs) and deep Convolutional Neural Networks (CNNs), learning technologies for underwater image recovery have been developed to recover quality degradation with data supervision, such as UGAN [13], UWCNN [14], Ucolor [15], and the Underwater Denoising Autoencoder (UDAE) [16]. Generally, learning-based methods require underwater images and the corresponding ground-truth image pairs to train a universal recovery network. Due to the absence of real underwater images with ground-truth pairs, natural-scene images and synthetic underwater images are applied for training. However, because of the gap between the synthetic and real-world domains, the end-to-end optimization of learning-based methods forces the recovery networks to learn the specific patterns of synthetic degradation recovery, which results in limited recovery generalization on different underwater images. Thus, effectively recovering the degradation of underwater imagery remains a challenge.

In this paper, a Multi-stage Underwater Image Enhancement (MUIE) method is proposed, which breaks down the overall recovery process into manageable modules, including in-situ enhancement and data-driven correction, and cascades the modules in a unified framework. Specifically, by compensating for channel discrepancies under scene-relevant supervision, the in-situ enhancement performs a dual computation to enhance blurred textures at different levels of degradation, and introduces a multi-scale fusion into the dual computation to generate spatially accurate information. In light of the semantic unreliability of in-situ information, we perform a data-driven correction that learns to encode corrected color-constancy information under data supervision, and combine it with the in-situ enhancement to achieve an adaptive response to different types of degradation. Finally, with the combination of scene and data information, the recovery of MUIE shows well-posed performance and generalization in different underwater scenarios, especially in recovery-challenging scenes (Recovered results of MUIE of Fig. 1).

The contributions of this work are summarized as follows:

  • First, instead of handling the degradation of underwater images as an integrated whole, this work is devoted to generating well-posed recovery by exploring which solutions are favorable for improving recovery performance, with an eye toward explanations that suggest the necessity of those solutions. Holding that scene specificity and adaptability are critical to well-posed performance under different types of degradation, we break down the overall recovery process into in-situ enhancement and data-driven correction modules and cascade the modules together.
  • Second and specifically, we highlight the effectiveness of duality-based fusion in avoiding ill-posed performance in the effort of in-situ enhancement to generate spatially accurate results under different degrees of degradation, with a particular emphasis on the necessity of enhancing the channel compensation in recovery-challenging scenes. Subsequently, we focus on color-constancy correction under data supervision, and combine it with channel compensation and duality-based fusion in developing the framework of MUIE.
  • Third, we conducted qualitative and quantitative recovery experiments, in which the experimental results demonstrate the recovery effectiveness and necessity of MUIE in different underwater scenes. The recovery effectiveness of MUIE was statistically analyzed in terms of channel correlation, histogram distribution, and domain effect. The analysis results show that the combination of scene and data information in a unified framework is effective for well-posed recovery with generalization to different underwater images.

The rest of this paper is organized as follows. Section 2 presents the related work on underwater image recovery, and section 3 introduces the methodology and modules of MUIE. Section 4 presents the experimental results of underwater image recovery. In section 5, we analyze and discuss the recovery effectiveness of MUIE. Section 6 presents the conclusion of this paper.

2. Related work

The two approaches that have so far received the most attention from researchers performing recovery are prior-based recovery (section 2.1) and learning-based recovery (section 2.2).

2.1 Prior-based recovery

The solution of underwater image recovery usually resorts to the imaging equation [10]:

$$I_{c}(x)=J_{c}(x)\cdot t(x) + B_{c}\cdot(1-t(x)),$$
where $c\in \left \{ R,G,B \right \}$ is the color channel, $I_{c}$ and $J_{c}$ represent the raw underwater image and the recovered image, $B_{c}$ is the veiling light, and $t(x) = e^{-\beta _{c}d}$ is the transmission map, in which $\beta _{c}$ is the wide-band attenuation coefficient of channel $c$ and $d$ is the traveling distance in the water scene.
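For illustration, a minimal NumPy sketch of Eq. (1) is given below; it synthesizes a degraded image from a clean image and a distance map. The attenuation coefficients and veiling light are placeholder values chosen only for demonstration and are not taken from this paper.

```python
import numpy as np

def degrade(J, d, beta=(0.8, 0.35, 0.25), B=(0.15, 0.55, 0.70)):
    """Apply Eq. (1): I_c = J_c * t + B_c * (1 - t), with t = exp(-beta_c * d).

    J    : clean RGB image in [0, 1], shape (H, W, 3)
    d    : per-pixel distance map, shape (H, W)
    beta : placeholder wide-band attenuation coefficients (R, G, B)
    B    : placeholder veiling light (R, G, B)
    """
    beta = np.asarray(beta, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)
    t = np.exp(-beta[None, None, :] * d[..., None])  # transmission map t(x)
    return J * t + B[None, None, :] * (1.0 - t)
```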

Model-based restoration. Since only the underwater image $I_{c}(x)$ is available for recovery, solving Eq. (1) is an under-determined problem. Thus, the statistical properties of images (i.e., image priors) have been exploited to estimate the parameters. He et al. [7] proposed the Dark Channel Prior (DCP), which estimates the transmission map $t(x)$ and veiling light $B_{c}$ from the dark channel of a degraded image. Based on the dark channel prior, many variants have been developed to recover the degradation of underwater images, such as the Red Channel Prior (RCP) [8], Generalized Dark Channel Prior (GDCP) [17], low-complexity DCP [18], Minimum Dark Channel Prior (MDCP) [19], and Underwater Dark Channel Prior (UDCP) [20,21].
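As a point of reference, the dark channel and the DCP-style transmission estimate can be sketched as below; the window size and the constant $\omega$ follow common practice for DCP [7] and are assumptions made here rather than settings used in this paper.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(I, window=15):
    """Dark channel of an RGB image I in [0, 1]: per-pixel channel minimum
    followed by a local minimum filter over a (window x window) patch."""
    return minimum_filter(I.min(axis=2), size=window)

def estimate_transmission(I, B, omega=0.95, window=15):
    """DCP-style transmission estimate: t(x) = 1 - omega * dark_channel(I / B)."""
    normalized = I / np.asarray(B, dtype=np.float64)[None, None, :]
    return 1.0 - omega * dark_channel(normalized, window)
```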

Berman et al. [22] proposed the Haze-Lines (HL) prior based on the color cluster difference between natural images and haze images, and demonstrated that the recovery of underwater images could be simplified to the image dehazing with preset ratios of attenuation coefficients [23].

With specialized hardware devices [24], polarization imaging technologies have been utilized to suppress scattering in underwater scenarios [25–27]. Wei et al. [28] established an underwater polarimetric imaging model based on the Stokes vector to remove the backscattered light and recover the target information accurately. Liu et al. [29] considered light scattering and absorption in water, and developed an underwater polarization imaging model that exploits the polarization difference between the target scene and the backscattered light. Zhu et al. [30] proposed an underwater polarization imaging model optimized by an untrained network to correct the effect of imperfect parameter settings of Eq. (1).

The performance of model-based methods mainly depends on the settings of the global parameters in the application scenarios. However, Akkaynak et al. [31] demonstrated that the attenuation experienced by the wide-band color channels varies with the optical instrument, and then summarized another imaging equation to differentially estimate the wide-band coefficients of backscatter and direct transmission, which is defined as [32]:

$$I_{c}=J_{c}\cdot e^{-\beta^{D}_{c}\left ( v_{D}\right )\cdot z}+B^{\infty }_{c}\left ( 1-e^{-\beta^{B}_{c}\left ( v_{B}\right )\cdot z}\right ),$$
where $\beta ^{D}_{c}( v_{D})$ and $\beta ^{B}_{c}(v_{B})$ represent the wide-band direct transmission coefficient and backscatter coefficient, and $v_{D} =\left \{z,\rho,E,S_{c},\beta \right \}$ and $v_{B} =\left \{ E,S_{c} ,b,\beta \right \}$ express the dependencies of the coefficients $\beta ^{D}_{c}$ and $\beta ^{B}_{c}$ on the range $z$, reflectance $\rho$, ambient light spectrum $E$, camera spectral response $S_{c}$, and the physical scattering and beam attenuation coefficients $b$ and $\beta$ of the water medium; these parameters are functions of wavelength $\lambda$. Based on the revised formation, Akkaynak et al. [33] proposed the Sea-thru method to remove the degradation of RGB-D underwater images, and Zhou et al. [5] appended depth and illumination estimation to the revised formation.

Model-free enhancement. Ancuti et al. [34] proposed an optimization fusion and a multi-scale version, which fuse the gamma-corrected and white-balanced results for underwater image enhancement. Considering color correction, Ancuti et al. [35] proposed the Color Balance and Fusion (CBF) method to enhance underwater images with channel transformation. Further, Ancuti et al. developed the Color Channel Transfer (CCT) [36] and the Color Channel Compensation (3C) [37] to compensate for the loss of information in one color channel. CCT transfers information between opponent colors (blue-yellow, red-green) from source-reference pairs with aligned global features in the CIE $L*a*b*$ color space. The compensation of 3C is equivalent to that of CBF when compensating for the loss of the red channel [37]. Based on the color transformation of CBF, Liu et al. [38] proposed the Parameter-Adaptive Compensation (PAC) [38] for local color correction. Tao et al. [2] further proposed an effective approach to generate the contrast-enhanced and color-corrected reference of CCT. Liu et al. [4] employed CCT as a preprocessing operation to improve the dehazing performance of underwater image enhancement.

To avoid improper parameter estimation of Eq. (1), Galdran et al. [39] exploited the duality relation between the Retinex-based function $Ret(\cdot )$ and image dehazing:

$$J_{c}(x) = 1-Ret(1- I_{c}(x)).$$

Although it shows enhancement performance competitive with current fog-removal techniques, the equivalence cannot be directly applied to the recovery of underwater images due to the unique unbalanced attenuation of underwater optics.
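The duality of Eq. (3) amounts to enhancing the inverted image and inverting the result back; a minimal sketch, assuming an arbitrary enhancement operator passed in as a function, is:

```python
def dual_enhance(I, enhance):
    """Eq. (3): J = 1 - Ret(1 - I), for any intensity enhancement operator.

    I       : degraded image with values in [0, 1]
    enhance : callable playing the role of Ret(.), e.g. a Retinex-style operator
    """
    return 1.0 - enhance(1.0 - I)
```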

Since prior-based recovery methods, whether model-based restoration or model-free enhancement, involve non-convex optimization in their modeling, they usually produce ill-posed recovery in challenging underwater scenarios, as shown in the experiments (Section 4).

2.2 Learning-based recovery

Numerous deep learning models have been proposed to recover the degradation of underwater images with data supervision. For training recovery networks, learning-based methods require paired training data (i.e., degraded underwater images and the corresponding ground truth). Due to the absence of real ground truth, these methods rely on synthetic data to train the recovery networks.

Zhou et al. [40] introduced a domain-adaptive mechanism and a physics-model-constrained feedback control into an adversarial learning framework. Based on ten types of water parameters [41], Li et al. [14] trained a series of underwater image enhancement (UWCNN) models, each of which can be utilized to enhance the corresponding type of underwater image. Li et al. [15] proposed a medium transmission-guided multi-color space embedding (Ucolor), which explores the complementary merits between domain knowledge of underwater imaging and data information with multi-color space and discriminative features. The analysis-synthesis network (ANA-SYN) [42] was proposed to integrate underwater domain knowledge and distortion distribution in a universal framework. Wang et al. [43] designed a joint network with iterative optimization of color correction and dehazing, which is optimized with the same loss function as ANA-SYN.

On the other hand, existing algorithms [44–46] apply unpaired data to train the network based on GANs or cycle Generative Adversarial Networks (CycleGANs) [47], which aim to translate an image from a degraded domain to a clear domain without paired data supervision. However, recovery networks trained with unpaired data show less effectiveness than those trained with paired data [48].

Fabbri et al. [13] proposed the Underwater GAN (UGAN), which imitates a degradation process based on CycleGAN to generate paired training data, and utilizes the U-Net architecture [49] to improve the quality of underwater images. Liu et al. [50] set up a deep multi-scale feature fusion network based on a conditional GAN for underwater image color correction. Further, M. J. Islam et al. [51] combined paired and unpaired training in a conditional GAN model for fast underwater image enhancement.

Although the learning-based methods generate visually pleasing results, the end-to-end optimization strategy forces the recovery networks to learn the pattern of synthetic degradation recovery. Due to the gap between the synthetic and real underwater domains, the recovery of learning-based methods cannot generalize to different scenes, especially scenes beyond the training dataset. Moreover, the current loss functions of these methods pay little attention to the unbalanced channel attenuation caused by selective absorption, which results in residual color shifting, as shown in the experiments (Section 4).

Herein, we design a multi-stage enhancement method, MUIE, which cascades in-situ spatial information and data information in a unified framework to respond to different underwater scenes.

3. Proposed methods in this paper

The recovery of MUIE is organized into two modules (Fig. 2). MUIE begins with a channel transformation to mitigate unbalanced attenuation and reduce the channel discrepancy, and then enhances the compensation of the red channel using a channel-similarity loss between the green and the compensated red channels. Based on the channel compensation, the duality relation between channel normalization and the blurred input is established to ameliorate the impact of scattering, and multi-scale fusion is employed in the duality relation to preserve the textural details and scene appearance. Finally, the color autoencoder, effectively encoding color-constancy information, is supervised by data information to correct the color appearance and thus adapt the recovery to various underwater scenarios.

Fig. 2. The architecture of the MUIE for underwater image recovery.

3.1 Channel compensation

As a critical process of MUIE, the compensation aims to reduce the channel discrepancies and improve the channel similarity, especially the similarity between the red and the green channels. The compensation is established based on the following observations:

  • 1. Particles in water selectively absorb different wavelengths of light. Among the channels of underwater images, the red channel suffers from the most severe attenuation because of its long wavelength (absorbed within roughly $10$-$15\,ft$ of water), followed by the green ($20$-$25\,ft$) and the blue ($35$-$45\,ft$) channels [35]. Accordingly, the degraded red channel shows the lowest information entropy in most underwater scenes (the first line of Fig. 3). As presented, 88.3% of cases in the underwater dataset [42], 100% of cases in the Cognitive Autonomous Diving Buddy (CADDY) dataset [52], and 100% of cases in the Video Diver Detection (VDD) dataset [53] demonstrate the low entropy of the red channel of underwater images. In contrast, the channels of natural-scene images have relatively close entropy: merely 45.5% of red channels show lower entropy than the other channels (33.7% lower than the green channel, 35.8% lower than the blue channel). As the information loss negatively affects the recovery performance [37], compensating for the missing information of the red channel is necessary for well-posed underwater image recovery.
  • 2. The current compensation methods (CBF [35] and 3C [37]) employ the opponent-color assumption (green versus red, yellow versus blue) to establish the transformation from the green to the red channel. Our previous work [38] statistically proved that the correlation between the red and green channels is stronger than that between the red and blue channels. Herein, we report more cases to demonstrate the channel correlation of underwater images (the second line of Fig. 3). Across 1000, 2000, and 3000 cases of the underwater dataset [42], CADDY [52], and VDD [53], respectively, the correlation between the red and green channels is significantly larger than that between the red and blue channels in over 91.5%, 100%, and 91.1% of cases. Similarly, 96.3% of the red-green combinations of natural-scene images show a strong channel correlation. (A short sketch of these entropy and correlation statistics is given after this list.)
  • 3. Technically, a local window (kernel of size 3$\times$3, padding 1, stride 1) is introduced into the global channel transformation (Eq. (4)). To respond to severely unbalanced attenuation, we enhance the compensation of the red channel with a channel similarity loss, based on the guided filter [54], between the compensated red and the green channels (Eq. (5)). To adapt the effect to varying degrees of degradation, the channel compensation interpolates between the compensated red channel and the enhanced red compensation, where the interpolation ratio $\gamma$ is set to 0.5 in most computations (Eq. (9)). In addition, the blue channel compensation is optional for extreme cases [35] (Channel compensation of Fig. 2).
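The channel statistics reported above can be reproduced with a short sketch; the one below computes per-channel Shannon entropy and pairwise Pearson correlations with NumPy, assuming images scaled to [0, 1]. The histogram bin count is an assumption made for illustration.

```python
import numpy as np

def channel_entropy(channel, bins=256):
    """Shannon entropy (in bits) of one color channel with values in [0, 1]."""
    hist, _ = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def channel_correlation(I):
    """Pairwise Pearson correlation coefficients between the RGB channels of I."""
    r, g, b = (I[..., k].ravel() for k in range(3))
    return {"r-g": float(np.corrcoef(r, g)[0, 1]),
            "r-b": float(np.corrcoef(r, b)[0, 1]),
            "g-b": float(np.corrcoef(g, b)[0, 1])}
```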

Fig. 3. The statistics of channel entropy (the first line) and channel correlation (the second line) of degraded underwater images of the dataset [42], CADDY [52], VDD [53], and natural-scene images. The channel correlation is generated by the Pearson correlation coefficient. (Best viewed at 200% zoom).

Mathematically, the channel transformation in a local window is expressed as:

$$\begin{array}{l} \textbf{I}_{r\textbf{c}} = \textbf{I}_{r} + \sum\limits _{w} \left ( \bar{I}_{g_{w}} - \bar{I}_{r_{w}} \right )\cdot \left (\textbf{I}_{g}-\textbf{I}_{g \cdot r}\right ),\\ \textbf{I}_{b\textbf{c}} = \textbf{I}_{b} + \sum\limits _{w} \left ( \bar{I}_{g_{w}} - \bar{I}_{b_{w}} \right )\cdot \left (\textbf{I}_{g}-\textbf{I}_{g \cdot b}\right ), \end{array}$$
where $w$ denotes the local window of the transformation, $\textbf {I}_{c}$ denotes the color channels, $c\in \left \{ r,g,b \right \}$, $\textbf {I}_{c\textbf {c}}$ denotes the compensated channels, $\bar {I}_{c_{w}}$ is the average value of the color channel in window $w$, and $\textbf {I}_{c_{1} \cdot c_{2}}$ denotes the product $\textbf {I}_{c_{1}} \cdot \textbf {I}_{c_{2}}$.
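A minimal sketch of the local-window compensation of Eq. (4) is given below, using a sliding-window mean for the per-window averages; treating the window sum this way, and always compensating the blue channel, are simplifications made here for illustration.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def compensate_channels(I, window=3):
    """Local-window channel compensation in the spirit of Eq. (4).

    I : RGB image in [0, 1], shape (H, W, 3); the (window x window) means
    mirror the 3x3 kernel with stride 1 described in the text."""
    r, g, b = I[..., 0], I[..., 1], I[..., 2]
    r_mean = uniform_filter(r, size=window)
    g_mean = uniform_filter(g, size=window)
    b_mean = uniform_filter(b, size=window)
    r_c = r + (g_mean - r_mean) * (g - g * r)  # compensated red channel
    b_c = b + (g_mean - b_mean) * (g - g * b)  # optional blue compensation
    return np.clip(np.stack([r_c, g, b_c], axis=-1), 0.0, 1.0)
```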

Further, the channel similarity loss ${L}_{s}$ between the compensated red channel $\textbf {I}_{r\textbf {c}}$ and green channel $\textbf {I}_{g}$ is described as:

$${L}_{s} = \sum_{x \in p} \left\| I_{r\textbf{e}_{p}-g_{p}}(x) + \epsilon {a}_{p}(x) \right\|_{2}^{2},$$
where $I_{r\textbf {e}_{p}-g_{p}}(x)$ denotes the similarity difference between the predicted enhancement $I_{r\textbf {e}_{p}}(x)$ and the green channel $I_{g_{p}}(x)$ in patch $p$ (patch size set to 32), the predicted enhancement $I_{r\textbf {e}_{p}}(x)$ is given as $a_{p}(x) \cdot {I}_{ r{\textbf {c}_{p}}}(x)+ b_{p}(x)$, $a_{p}(x)$ and $b_{p}(x)$ denote the coefficients of pixel $x$ in $p$, and $\epsilon$ is a regularization parameter (set to 0.01). The linear regression solutions of $a_{p}(x)$ and $b_{p}(x)$ are given as [54]:
$$a_{p}(x)= \frac{\bar{I}_{r \textbf{c}_{p} \cdot g_{p}} - \bar{I}_{r \textbf{c}_{p}}\cdot\bar{I}_{g_{p}} }{\sigma_{g_{p}}^{2} +\epsilon },$$
$$b_{p}(x) = \bar{I}_{g_{p}}- a_{p}(x) \cdot \bar{I}_{r\textbf{c}_{p}},$$
where $\bar {I}_{r \textbf {c}_{p}}$ and $\bar {I}_{g_{p}}$ denote the average values of $\textbf {I}_{r \textbf {c}}$ and $\textbf {I}_{g}$ in patch $p$, $\bar {I}_{r \textbf {c}_{p} \cdot g_{p}}$ denotes the average product of $\textbf {I}_{r \textbf {c}}$ and $\textbf {I}_{g}$ in $p$, and $\sigma ^{2}_{g_{p}}$ denotes the variance of $\textbf {I}_{g}$ in $p$. With the coefficients, the enhanced compensation is defined as:
$$I_{r\textbf{e}}(x) = \bar{a}_{p} \cdot I_{r\textbf{c}}(x) + \bar{b}_{p},$$
where $\bar {a}_{p}$ and $\bar {b}_{p}$ denote the average values of ${a}_{p}(x)$ and $b_{p}(x)$ over the patches $p$ that contain the pixel $x$.

With $\textbf {I}_{r\textbf {c}} \rightarrow \textbf {I}_{g}$ in Eq. (4), the coefficient $a_{p}(x)$ is proportional to ${ \sigma _{g_{p}}^{2}} / ({\sigma _{g_{p}}^{2} +\epsilon })$ in Eq. (6), where $\sigma _{g_{p}}^{2} =E(I^{2}_{g_{p}})-E^{2}(I_{g_{p}})$. $a_{p}(x)$ shows a positive relation to gradient transitions in areas with high variance, as the variance $\sigma ^{2}_{g_{p}}$ is positively associated with gradient transitions (Fig. 4). Thus, based on the green channel, the compensated red channel obtains edge enhancement in high variance patches (Eq. (6)), while performing an average filter in flat patches (Eq. (7)).

With the compensated image $\textbf {I}_{\textbf {c}}$ and the corresponding enhanced compensation $\textbf {I}_{\textbf {e}}$, the result of channel compensation $\textbf {I}_{\textbf {ce}}$ is obtained by the interpolation:

$$\textbf{I}_{\textbf{ce}} = \gamma \cdot \textbf{I}_{\textbf{c}} +(1-\gamma) \cdot \textbf{I}_{\textbf{e}}.$$
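A compact sketch of Eqs. (5)-(9) is given below; it applies the patch statistics to a compensated red channel with the green channel as reference, using box filters for the patch averages. Applying the interpolation of Eq. (9) per channel, and the box-filter boundary handling, are simplifications made here.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def enhance_compensation(r_c, g, patch=32, eps=0.01, gamma=0.5):
    """Guided-filter-style enhancement of the compensated red channel r_c,
    with the green channel g as reference, followed by Eq. (9)."""
    mean = lambda x: uniform_filter(x, size=patch)          # patch averages
    mean_rc, mean_g = mean(r_c), mean(g)
    var_g = mean(g * g) - mean_g ** 2                       # sigma^2 of g per patch
    a = (mean(r_c * g) - mean_rc * mean_g) / (var_g + eps)  # Eq. (6)
    b = mean_g - a * mean_rc                                # Eq. (7)
    r_e = mean(a) * r_c + mean(b)                           # Eq. (8)
    return gamma * r_c + (1.0 - gamma) * r_e                # Eq. (9), per channel
```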

Fig. 4. Demonstration of gradient transition. The variance values $\sigma$ are as depicted.

Figure 5 shows the improvement of our channel compensation on severely unbalanced attenuation. The channel compensation is effective in revealing the blurred appearance and removing the color distortion under different quality degradations (CE-RCP of Fig. 5). In contrast, the compensations of CB and PAC are incomplete and retain color distortions in the challenging scenarios (Img-2 and Img-3 of CB-RCP and PAC-RCP in Fig. 5). Therefore, the improvement from Raw-, CB-, and PAC-RCP to CE-RCP illustrates the necessity and effectiveness of our channel compensation for well-posed recovery.

Fig. 5. (A) The compensation comparisons of CB [35], PAC [38], and the channel compensation of MUIE (i.e., CE). The recovery results of compensated images were generated by the RCP method [8]. The underwater images were selected from the VDD dataset [53]. (B) Histogram distribution of corresponding images. (Best viewed at 200% zoom).

3.2 Duality-based fusion

The duality-based fusion is established based on the following principles:

  • 1. The duality between Retinex enhancement and optical de-scattering (Eq. (3)) has been shown effective for image dehazing [33,37,39]. In general, the dehazing task assumes that all colors are preserved and modulated by the illuminant and attenuation [37], and the haze is homogeneous [48]. Actually, the channels of underwater images suffer from inhomogeneous degradation, information loss, and low illumination. Thus, MUIE performs the dual computation after channel compensation mitigates the channel discrepancies.
  • 2. The duality with Retinex enhancement produces favorable results in enhancing over-exposed images, while the degradation of underwater images always shows low illumination. Therefore, a channel normalization is utilized in the dual computation (Eq. (10)), and the Retinex enhancement $Ret(\cdot )$ based on Adaptive Local Tone Mapping (ALTM) [55] is performed after dual computation (Duality-based fusion of Fig. 2). Mathematically, the channel normalization $Norm(\cdot )$ is expressed as:
    $$Norm(\textbf{I})=\frac{I_{c}(x)-I_{c,min}}{I_{c,max}-I_{c,min}},$$
    where $c \in \left \{ r,g,b \right \}$, $I_{c,max}$ and $I_{c,min}$ are given as $(\bar {I}_{c}+2\cdot \sigma _{c})$ and $(\bar {I}_{c}-2\cdot \sigma _{c})$, respectively [56].
  • 3. The multi-scale fusion process [35] is introduced into the dual computation to preserve the gamma-corrected and sharp features (Duality-based fusion of Fig. 2). The multi-scale fusion $Fus(\cdot )$ of MUIE is expressed as
    $$Fus(\textbf{I}) = \sum_{k}\sum_{l}G_{l}\left \{ \bar{W}_{k}(x) \right \}L_{l} \left \{I_{kc}(x) \right \},$$
    where $k \in \left \{ G, S \right \}$ is the index that refers to the gamma-corrected and the sharpened versions, $l$ indexes the scale levels (set to 3), $W$ denotes the normalized weight maps, and $G$ and $L$ are the Gaussian and Laplacian pyramid decompositions, respectively. As illustrated in Fig. 2, by integrating the multi-scale fusion $Fus(\cdot )$ into the proposed dual computation, the output of duality-based fusion is defined as:
    $$Dua(\textbf{I})=Ret\left ( 1-Fus \left ( Norm \left ( 1-\textbf{I} \right ) \right ) \right ).$$

    Besides, the computations $Dua_{F}(\cdot )$ and $Dua_{R}(\cdot )$ are optional for underwater scenarios that suffer from forward scattering along the surface-camera path [57] and from artificial lighting, respectively (a sketch of these compositions follows this list):

    $$Dua_{F}(\textbf{I})=1-Fus \left ( Ret \left ( 1-\textbf{I} \right ) \right ),$$
    $$Dua_{R}(\textbf{I})=Ret\left ( 1-Fus \left ( Ret \left ( 1-\textbf{I} \right ) \right ) \right ).$$
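A minimal sketch of how the operators in Eqs. (10), (12), and (13) compose is given below; the multi-scale fusion $Fus(\cdot)$ and the ALTM-based Retinex enhancement $Ret(\cdot)$ are passed in as callables, since their full pyramid and tone-mapping implementations are beyond a short sketch.

```python
import numpy as np

def channel_norm(I):
    """Eq. (10): per-channel normalization with I_max/I_min = mean +/- 2*std."""
    out = np.empty_like(I, dtype=np.float64)
    for c in range(I.shape[-1]):
        m, s = I[..., c].mean(), I[..., c].std()
        out[..., c] = (I[..., c] - (m - 2.0 * s)) / (4.0 * s + 1e-6)
    return np.clip(out, 0.0, 1.0)

def duality_fusion(I, fuse, retinex):
    """Eq. (12): Dua(I) = Ret(1 - Fus(Norm(1 - I)))."""
    return retinex(1.0 - fuse(channel_norm(1.0 - I)))

def duality_fusion_forward(I, fuse, retinex):
    """Eq. (13): the optional variant for forward-scattering scenes."""
    return 1.0 - fuse(retinex(1.0 - I))
```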

Figure 6 shows the function of the duality-based fusion. Applying the duality-based fusion directly to the raw underwater images introduces annoying red artifacts (Fig. 6(B)), which further illustrates the necessity of using the channel compensation of MUIE for recovery. For the greenish images in deep water, the results of applying multi-scale fusion [35] with channel compensation show an unexpected pinkish appearance (Img-7 and Img-8 of Fig. 6(D)), while the proposed duality-based fusion with channel compensation generates significant color-correction gains robustly (Img-7 and Img-8 of Fig. 6(F)), illustrating the effectiveness of the duality-based fusion. In comparison, the dual computation with channel normalization only (i.e., without fusion) mitigates the global color distortion, but the results still suffer from blurry details, which shows the necessity of introducing the fusion technology into the dual computation (Fig. 6(E)).

Fig. 6. (A) Raw underwater images of VDD dataset [53], (B) the results of fusion [35] without compensation, (C) the results of channel compensation, (D) the results of fusion [35] with compensation, (E) the results of dual computation only based on normalization with channel compensation, and (F) the results of duality-based fusion with channel compensation (i.e., the in-situ enhancement of MUIE).

3.3 Color autoencoder

To robustly adapt the recovery to different degradations, we focus on a data-driven correction of the in-situ enhancement. The data-driven correction aims at automatically encoding low-level semantic information [58] to restore the color constancy, which is utilized to perceive the intrinsic semantic information of objects under different appearances.

The data-driven correction, achieved by the color autoencoder, is based on the standard U-Net (i.e., an encoder-decoder architecture with skip connections) [49], which has been shown effective in encoding broad semantic information [16,48,59–61]. The autoencoder employs four-scale encoders to extract multi-scale latent representations and four-scale decoders with skip connections from the corresponding encoders to fuse and reconstruct the representations, which can be regarded as a subnetwork of the multi-decoder mapping [61]. The components of the architecture are visualized in Fig. 7.

Fig. 7. The encoder-decoder architecture. The max-pooling layer (kernel of size 2$\times$2, stride 1) is employed for down-sampling, and the transposed convolutional layer (kernel of size 2$\times$2, stride 1) is utilized for up-sampling. In each encoder and decoder module, the convolutional layer before the ReLU layer adopts a kernel of size 2$\times$2 and stride 1. The convolutional layer before the output layer has the kernels of size 1$\times$1 and stride 1.

For training the autoencoder, the color constancy loss $L_{c}$ employs the $l_{1}$-norm as the objective function to optimize the color constancy difference between the ground truth $\textbf {I}_{gt}$ and predicted images $\textbf {I}_{p}$ [61]:

$${L}_{c}=\sum_{i}^{N} \left \| \textbf{I}_{p}^{(i)} - \textbf{I}_{gt}^{(i)} \right \|_{1},$$
where $i$ indexes the $i$th paired images in a training batch. The training of the autoencoder relies on a large dataset of standard RGB images [62], which contains images $\textbf {X}$ rendered with improper color-constancy settings and corresponding images $\textbf {Y}$ rendered with corrected constancy settings. 13333 paired images were randomly selected from this dataset as training input and ground truth to learn the mapping $\textbf {X} \rightarrow \textbf {Y}$. The training is performed with the Adam optimizer with $\beta _{1}=0.9$ and $\beta _{2}=0.999$ [63]. The autoencoder was trained on 128$\times$128 patches with a batch size of 32 training patches for 110 epochs. The initial learning rate is set to $10^{-4}$ and is decreased by a factor of 2 every 50 epochs.
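A minimal PyTorch sketch of this training setup is given below. `ColorAutoencoder` (standing in for the four-scale U-Net of Fig. 7) and `paired_loader` (yielding input/ground-truth patch pairs) are hypothetical names introduced only for illustration, and the step-decay scheduler reflects our reading of the schedule above.

```python
import torch
import torch.nn as nn

def train_color_autoencoder(model: nn.Module, paired_loader, epochs=110):
    """Train the color autoencoder with the l1 color-constancy loss L_c."""
    criterion = nn.L1Loss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)
    for _ in range(epochs):
        for x, y in paired_loader:          # improperly / correctly rendered pairs
            optimizer.zero_grad()
            loss = criterion(model(x), y)   # L_c between prediction and ground truth
            loss.backward()
            optimizer.step()
        scheduler.step()                    # halve the learning rate every 50 epochs
    return model
```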

Under the supervision of data information, the color autoencoder improves the scene appearance of the in-situ enhanced results, while preserving global structures in underwater scenarios with different types of degradation (Fig. 8).

Fig. 8. Function demonstration of the color autoencoder: (A) raw underwater images of the VDD dataset [53], (B) the results of in-situ enhancement (i.e., MUIE without data-driven correction), (C) the results of in-situ enhancement with data-driven correction (i.e., the final results of MUIE).

3.4 Visualized results

Figure 9 visualizes the intermediate representations of each MUIE stage. The channel compensation is effective in correcting the unbalanced attenuation (Enhanced compensation of Fig. 9). In the recovery-challenging cases, the normalization layer of the duality-based fusion is dominant and reveals the blurred structure (Normalized image of Fig. 9). The multi-scale fusion generates spatially accurate information in the dual computation, and the Retinex enhancement is utilized to improve the illumination uniformity (Inverted fusion and Toned image of Fig. 9). The color autoencoder, rendering the results of in-situ enhancement to a clear appearance, improves the scene adaptability of MUIE (Final output of Fig. 9).

Fig. 9. Visualized results of each MUIE stage. (Best viewed at 200% zoom).

4. Experiments and results

In this section, the recovery of MUIE on different underwater scenes, including recovery-challenging scenes, is evaluated to demonstrate the recovery effectiveness and superiority of the method qualitatively. Subsequently, a quantitative evaluation is presented based on standard metrics to illustrate the quality improvement and the recovery generalization of MUIE. Finally, an evaluation is conducted in terms of transmission estimation and edge detection to indicate the perceptual improvement of MUIE on underwater images.

Evaluation datasets. The recovery evaluation is conducted on the underwater images of UIEB-R [64], UIEB-C [64], and SQUID [23] datasets. The underwater images of the datasets show obvious quality degradation characteristics in various underwater scenes, such as color shifting, low contrast and blur. The UIEB-R [64] is employed for reference comparison, as each image in the dataset has a corresponding reference image selected from recovered results of different methods.

Comparison methods. The recovery performance of MUIE is compared with nine methods, including model-free enhancement methods (RE [55], CLAHE [65], FUSION [34], CBF [35]), model-based restoration methods (RCP [8], DU [39]), and learning-based methods (FUnIE [51], Ucolor [15], UWCNN [14]).

Evaluation metrics. Underwater Image Quality Measurement (UIQM) [66] and Underwater Color Image Quality Evaluation (UCIQE) [67] metrics are employed to quantitatively evaluate the recovery results of MUIE and other recovery methods.

Since the recovery of underwater images demands a complex balance between multiple aspects, both evaluation metrics, UIQM and UCIQE, integrate multiple indexes to evaluate the performance of recovery methods. The UIQM metric, based on the degradation and optical imaging mechanism, utilizes the color measurement component UICM, the clarity measurement component UISM, and the contrast measurement component UIConM as indexes to evaluate the underwater image quality:

$$UIQM(I) = \eta_{1} \times UICM + \eta_{2} \times UISM + \eta_{3} \times UIConM,$$
where the parameters $\eta _{1}$, $\eta _{2}$, and $\eta _{3}$ are set to 0.0282, 0.2953 and 3.5753. The UCIQE metric linearly combines the standard deviation of image chroma $\sigma _c$, the contrast of image brightness $conl$, and an average image saturation $\mu _s$ to quantify the quality evaluation of inhomogeneous color shifting, blurring, and low contrast of underwater images $I$:
$$UCIQE(I) = \kappa _{1} \times \sigma_c + \kappa _{2} \times conl + \kappa _{3} \times \mu_s,$$
where the parameters $\kappa _{1}$, $\kappa _{2}$, and $\kappa _{3}$ are set to 0.4680, 0.2745 and 0.2576. The UIQM and UCIQE scores of the recovery result show a positive relation with image quality.
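Both metrics are linear combinations of their components; a trivial sketch follows. Extracting the components themselves (UICM, UISM, UIConM, the chroma standard deviation, luminance contrast, and mean saturation) follows [66,67] and is not reproduced here.

```python
def uiqm(uicm, uism, uiconm, eta=(0.0282, 0.2953, 3.5753)):
    """Linear combination of the UIQM components with the weights given above."""
    return eta[0] * uicm + eta[1] * uism + eta[2] * uiconm

def uciqe(sigma_c, con_l, mu_s, kappa=(0.4680, 0.2745, 0.2576)):
    """Linear combination of the UCIQE components with the weights given above."""
    return kappa[0] * sigma_c + kappa[1] * con_l + kappa[2] * mu_s
```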

4.1 Qualitative Evaluation

4.1.1 Recovery evaluation with reference (Fig. 10)

As presented, raw underwater images of the UIEB-R dataset are characterized by a greenish or bluish appearance, low illumination, and blurry details (Raw of Fig. 10). Compared with other recovery methods, applying the recovery of MUIE to the underwater images removes color shifting, mitigates blur, and inhibits artifact colors, resulting in favorable recovery performance (MUIE of Fig. 10). Compared with the reference images (Reference of Fig. 10), the MUIE-recovered images still show a clear advantage in quality improvement, especially in the correction of the background appearance.

Fig. 10. Qualitative recovery evaluation on UIEB-R [64] of various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], reference provided by UIEB, and the MUIE. (Best viewed at 200% zoom).

4.1.2 Recovery evaluation on recovery-challenging images (Fig. 11 and Fig. 12)

Due to drastically unbalanced attenuation, underwater images of the UIEB-C dataset show severe blur and color shifting (Raw of Fig. 11). The current recovery methods cannot generalize well to these images, resulting in reddish or purplish artifacts (RE [55], CBF [35], and FUnIE [51] of Fig. 11), insufficient recovery (RCP [8], Ucolor [15], and UWCNN [14] of Fig. 11), and image noise (CLAHE [65] of Fig. 11). In comparison, the qualitative evaluation shows that the recovery of MUIE, in these challenging cases, generates clear scene appearances and enhances the degraded quality (MUIE of Fig. 11).

Fig. 11. Qualitative recovery evaluation on UIEB-C [64] of various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], and the MUIE. (Best viewed at 200% zoom).

Fig. 12. Qualitative recovery evaluation on SQUID [23] of various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], and the MUIE. (Best viewed at 200% zoom).

Similar to the underwater images of the UIEB-C dataset, those of the SQUID dataset also defeat the current methods, whether prior-based or learning-based (Fig. 12). In this evaluation, the recovery of CBF [35] and MUIE shows quality improvement, but CBF still fails to achieve satisfactory results, such as the purplish artifacts and the bluish appearance (CBF of Fig. 12). The recovery of MUIE achieves better performance than the other methods and prevents the emergence of artificial tones in these scenes (MUIE of Fig. 12).

Therefore, the results of the qualitative evaluations demonstrate that the recovery of MUIE shows well-posed performance and generalizes to different underwater images, even recovery-challenging scenarios.

4.2 Quantitative Evaluation

The recovery performance was evaluated quantitatively on 90 images of UIEB-R, 60 images of UIEB-C, and 12 images of SQUID datasets (Fig. 13 and Table 1).

  • First, MUIE obtains better performance gains in UIQM and UCIQE than the reference of UIEB-R dataset. Quantitatively, compared to the reference, the MUIE-recovered images achieve 16.18% improvement in UIQM and 3.61% improvement in UCIQE (UIEB-R of Fig. 13 and Table 1).
  • Second, the quantitative evaluation of UIEB-C demonstrates that the recovered results of MUIE show the improvement of 9.29% and 10.21% over the CLAHE [65], 11.03% and 8.22% over the FUnIE [51] in terms of UIQM and UCIQE, respectively (UIEB-C of Fig. 13 and Table 1).
  • Third, the recovery of CBF [35] and MUIE generates notable quantitative improvements in the evaluation of SQUID, which is consistent with the qualitative evaluation results (CBF and MUIE of Fig. 12). However, the recovery of CBF introduces undesired artificial colors, which cannot be quantified by the metrics.
  • Last, the average and variance of the UIQM scores of MUIE show considerable gains over the state-of-the-art approaches: 4.49% higher than CBF in the average and 35.15% lower than CBF in the variance. Although RE [55] and DU [39] obtain better performance than MUIE in terms of average UCIQE, the recovery of these methods yields only slight improvements in average UIQM (Table 1).

Fig. 13. Quantitative recovery evaluation on UIEB-R [64], UIEB-C [64], SQUID [23] of various methods, the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], and the MUIE. UIQM and UCIQE scores show a positive relationship with recovery performance.

Table 1. The recovery evaluations on Test-R [64], Test-C [64], and SQUID [23] datasets in terms of UIQM [66] and UCIQE [67]. The top three results are marked in red, blue, and green.

Therefore, the results of qualitative and quantitative evaluations confirm that the recovery of MUIE generates robust performance with generalization to different underwater images.

4.3 Perceptual Evaluation

4.3.1 Transmission appearance (Fig. 14)

As the medium transmission can illustrate the quality degradation of underwater images [15], we employ the transmission maps of recovered images to evaluate the perceptual improvement, especially the improvement of small objects.

Fig. 14. Transmission maps of raw underwater images and images recovered by RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], reference of UIEB [64], and the MUIE. The transmission maps were depicted by the DCP estimation [7]. The appearance of the transmission map is positively associated with the perceptual quality.

The recovery of MUIE yields significant improvement in estimating the transmission, where the transmission maps of MUIE-recovered images show a more legible and readable appearance than those of the others. Specifically, recovering salient objects in underwater scenes, such as the nearby fish and diver, is relatively easy. The recovery of tiny objects with low signal becomes more challenging, as seen on the ship and fishes in the zoom-ins. While the contrasts have been improved, the current recovery methods still suffer from color shifting, which adversely affects the perceptual performance of machine vision focused on semantic appearance [58], resulting in small objects being indistinguishable from background areas. In contrast, the recovery of MUIE corrects the distorted appearance and generates spatially accurate information, thus enhancing the perception of recovered objects (MUIE of Fig. 14).

4.3.2 Edge detection (Fig. 15)

As a fundamental task in image processing and machine vision, edge detection aims at removing irrelevant information and preserving the important structure of objects. Accordingly, edge maps of recovered underwater images are employed to illustrate the textural enhancement intuitively.
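As a point of reference, the edge maps of Fig. 15 can be reproduced with a Sobel detector and the stated threshold of 0.08; the sketch below normalizes the gradient magnitude to [0, 1] before thresholding, which is an assumption made here.

```python
import numpy as np
from scipy.ndimage import sobel

def edge_map(gray, threshold=0.08):
    """Binary Sobel edge map of a grayscale image with values in [0, 1]."""
    gx, gy = sobel(gray, axis=1), sobel(gray, axis=0)   # horizontal / vertical gradients
    magnitude = np.hypot(gx, gy)
    magnitude /= magnitude.max() + 1e-6                 # normalize to [0, 1]
    return magnitude > threshold
```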

Fig. 15. Edge maps of raw underwater images and images recovered by RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], reference of UIEB [64], and the MUIE. The edge maps were generated by Sobel detector with a threshold of 0.08. The appearance of the edge map is positively associated with the perceptual quality.

Raw underwater images suffer from information loss and perceptual degradation, thus generating poor detection performance (Raw of Fig. 15). Compared to other recovery methods, MUIE removes color distortions while generating spatially accurate information. As a result, the recovery of MUIE prevents the edge detector from responding to irrelevant features and thus outperforms the recovery competitors with excellent detection results and uninterrupted contours (MUIE of Fig. 15).

Therefore, the evaluation results demonstrate that the recovery of MUIE enhances the degraded appearance of the scene, with a notable improvement in the perceptual quality of objects.

5. Analysis and discussions

5.1 Multi-stage progressive enhancement

The recovery of MUIE employs a multi-stage enhancement instead of handling the degradation of underwater images as an integrated whole. As is well known, multi-stage propagation can induce error accumulation, whereas learning-based approaches avoid error accumulation through joint optimization. However, for underwater image recovery, the limited degradation information of the synthetic underwater images used for training induces a bias toward the training pattern. As a result, the recovery effectiveness of such methods cannot be well generalized, yielding insufficient performance (Fig. 11 and Fig. 12).

MUIE breaks down the overall recovery into multiple manageable modules and performs targeted supervision on the modules to mitigate the error-accumulation and bias-to-training-pattern issues simultaneously. Scene-relevant supervision on the channel compensation is designed to reduce channel discrepancies, and data supervision on the color autoencoder is intended for color correction. In this way, the recovery of MUIE positively responds to varying degrees and types of degradation, thereby improving the recovery generalization to different underwater scenes, even the recovery-challenging scenes (Fig. 5, Fig. 6, and Fig. 8).

5.2 Statistical analysis

To understand the recovery effectiveness and generalization of MUIE, the histogram distribution, channel correlation, and domain effect were statistically analyzed on the Biograd and the Genova sets of the CADDY dataset [52], in which the underwater images suffer from varying types of quality degradation and color distortion (Fig. 16). In this case, applying the recovery of MUIE improves the perception of objects and corrects the color distortion of scenes. The detector, highlighting the regions that contributed to detection, shows the best attention performance on the MUIE-recovered images compared to other recovery methods. As presented, the attention of the detector for the same object is considerably altered by well-posed recovery (MUIE of Fig. 16).

Fig. 16. Recovered results and corresponding attention map of CADDY [52] of various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. The attention map is obtained by pre-trained ResNet-18 [68]. The appearance of the attention map is positively associated with the perceptibility of recovered objects.

5.2.1 Average histogram distribution (Fig. 17)

The analysis is motivated by the discrepancy in histogram distribution between natural-scene and underwater images. Natural-scene images with high contrast and brightness always show wide and consistent histogram distributions across channels [9]. On the contrary, underwater images with blurring and color distortion have a convex distribution vertically and a clustered distribution horizontally (Raw of Fig. 17). Accordingly, a wide and consistent distribution of the recovered results shows a positive relationship with recovery quality.

Fig. 17. Average histogram distribution of raw images of CADDY [52] and recovered images of RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. The discrepant distribution of histogram is negatively correlated with the recovery performance.

The results of the prior-based and learning-based methods show significant improvement on the green and blue channels, while the red channels still retain the shifted and convex distribution, resulting in residual color shifting (RE, DU, and FUnIE of Fig. 17). The underwater image fusion enhancement reduces the discrepancies among color channels (FUSION and CBF of Fig. 17). By contrast, the channels of the MUIE-recovered images show the minimum distribution discrepancy (MUIE of Fig. 17), which results in high contrast, clear details, and realistic scene appearances (MUIE of Fig. 16).

5.2.2 Channel correlation (Fig. 18)

Further, the channel correlation is employed to quantify the channel discrepancy of recovered images. As presented, the recovered results of MUIE show consistent histogram distributions (MUIE of Fig. 17). Thus, the channels of the MUIE-recovered results show a stronger correlation than those of other methods (MUIE of Fig. 18), which suggests that MUIE can break the inherent color shifting of raw images and generate realistic colors without distortion.

Fig. 18. Channel correlation of raw images of CADDY [52] and recovered images of RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. The channel correlation is generated by the Pearson correlation coefficient. The convergence of distribution shows a positive relationship with channel correlation.

5.2.3 Domain effect (Fig. 19)

The domain shift between underwater images with different types of degradation hampers the recovery performance. Thus, the domain calibration of recovered results indicates a positive correlation with recovery generalization. In this part, we analyze the recovery generalization of MUIE based on the domain shift of underwater images between the Genova and the Biograd sets.

Fig. 19. Domain distribution of raw images of CADDY [52] and recovered images of RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. The domain distribution is visualized by clustering technology [69]. Homologous domain distribution suggests strong recovery generalization.

There is an obvious domain shift between the underwater images of the different sets (Raw of Fig. 19). Due to the intractability of adapting the recovery to various domains simultaneously, the recovered results of other methods still retain a shifted distribution. The multi-stage enhancement enables the recovery of MUIE to incorporate in-situ scene information into the data information, which optimizes model adaptation and improves recovery generalization. Under the recovery of MUIE, underwater images with different degradations show a homologous distribution, which confirms the strong recovery generalization of MUIE (MUIE of Fig. 19).

6. Conclusion

In this paper, the multi-stage enhancement method combines in-situ enhancement (channel compensation and duality-based fusion) and data-driven correction (color autoencoder) to address the factors that induce quality degradation of underwater images. The recovery of the method generalizes well to different underwater scenes, with qualitatively pleasing and quantitatively convincing recovered results. Based on the perceptual evaluation and statistical analysis, which indicate the extent to which the recovery is well posed, this work presents results from multiple aspects to demonstrate the recovery effectiveness of the method. The findings suggest that the recovery of the method reduces channel discrepancy, removes color distortion, and generates spatially accurate results, thus leading to well-posed responses to various types of degradation. Finally, as an effective approach, the multi-stage enhancement shows promising application in addressing other under-determined image degradation tasks.

Funding

Natural Science Foundation of Heilongjiang Province (E2017024); the Basic Scientific Research for National Defense (A0420132202); the Key Research and Development Projects of Ministry of Science and Technology (2020YFC1512200).

Acknowledgments

The authors acknowledge the financial funding of this work. We also thank the anonymous reviewers for their critical comments on the manuscript.

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. Z. Chen, N. Qiu, H. Song, L. Xu, and Y. Xiong, “Optically guided level set for underwater object segmentation,” Opt. Express 27(6), 8819–8837 (2019). [CrossRef]  

2. Y. Tao, L. Dong, L. Xu, and W. Xu, “Effective solution for underwater image enhancement,” Opt. Express 29(20), 32412–32438 (2021). [CrossRef]  

3. Z. Duan, Y. Yuan, J. C. Lu, J. L. Wang, Y. Li, S. Svanberg, and G. Y. Zhao, “Underwater spatially, spectrally, and temporally resolved optical monitoring of aquatic fauna,” Opt. Express 28(2), 2600–2610 (2020). [CrossRef]  

4. K. Liu and Y. Liang, “Underwater image enhancement method based on adaptive attenuation-curve prior,” Opt. Express 29(7), 10321–10345 (2021). [CrossRef]  

5. J. Zhou, Y. Wang, W. Zhang, and C. Li, “Underwater image restoration via feature priors to estimate background light and optimized transmission map,” Opt. Express 29(18), 28228–28245 (2021). [CrossRef]  

6. K. Liu and Y. Liang, “Enhancement of underwater optical images based on background light estimation and improved adaptive transmission fusion,” Opt. Express 29(18), 28307–28328 (2021). [CrossRef]  

7. K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2341–2353 (2011). [CrossRef]  

8. A. Galdran, D. Pardo, A. Picón, and A. Alvarez-Gila, “Automatic red-channel underwater image restoration,” J. Vis. Commun. Image Represent. 26, 132–145 (2015). [CrossRef]  

9. C.-Y. Li, J.-C. Guo, R.-M. Cong, Y.-W. Pang, and B. Wang, “Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior,” IEEE Trans. on Image Process. 25(12), 5664–5677 (2016). [CrossRef]  

10. Y. Schechner and N. Karpel, “Recovery of underwater visibility and structure by polarization analysis,” IEEE J. Oceanic Eng. 30(3), 570–587 (2005). [CrossRef]  

11. H. Lu, Y. Li, L. Zhang, and S. Serikawa, “Contrast enhancement for images in turbid water,” J. Opt. Soc. Am. A 32(5), 886–893 (2015). [CrossRef]  

12. H. Lu, Y. Li, and S. Serikawa, “Underwater image enhancement using guided trigonometric bilateral filter and fast automatic color correction,” in IEEE International Conference on Image Processing, (IEEE, 2013), pp. 3412–3416.

13. C. Fabbri, M. J. Islam, and J. Sattar, “Enhancing underwater imagery using generative adversarial networks,” in IEEE International Conference on Robotics and Automation, (IEEE, 2018), pp. 7159–7165.

14. C. Li, S. Anwar, and F. Porikli, “Underwater scene prior inspired deep underwater image and video enhancement,” Pattern Recognition 98, 107038 (2020). [CrossRef]  

15. C. Li, S. Anwar, J. Hou, R. Cong, C. Guo, and W. Ren, “Underwater image enhancement via medium transmission-guided multi-color space embedding,” IEEE Trans. on Image Process. 30, 4985–5000 (2021). [CrossRef]  

16. Y. Hashisho, M. Albadawi, T. Krause, and U. Lukas, “Underwater color restoration using U-Net denoising autoencoder,” https://arxiv.org/abs/1905.09000v1.

17. Y.-T. Peng, K. Cao, and P. C. Cosman, “Generalization of the dark channel prior for single image restoration,” IEEE Trans. on Image Process. 27(6), 2856–2868 (2018). [CrossRef]  

18. H.-Y. Yang, P.-Y. Chen, C.-C. Huang, Y.-Z. Zhuang, and Y.-H. Shiau, “Low complexity underwater image enhancement based on dark channel prior,” in Second International Conference on Innovations in Bio-inspired Computing and Applications, (2011), pp. 17–20.

19. K. B. Gibson, D. T. Vo, and T. Q. Nguyen, “An investigation of dehazing effects on image and video coding,” IEEE Trans. on Image Process. 21(2), 662–673 (2012). [CrossRef]  

20. P. Drews Jr, E. do Nascimento, F. Moraes, S. Botelho, and M. Campos, “Transmission estimation in underwater single images,” in IEEE International Conference on Computer Vision Workshops, (IEEE, 2013), pp. 825–830.

21. S. Emberton, L. Chittka, and A. Cavallaro, “Underwater image and video dehazing with pure haze region segmentation,” Comput. Vis. Image Underst. 168, 145–156 (2018). [CrossRef]  

22. D. Berman, T. Treibitz, and S. Avidan, “Non-local image dehazing,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2016), pp. 1674–1682.

23. D. Berman, D. Levy, S. Avidan, and T. Treibitz, “Underwater single image color restoration using haze-lines and a new quantitative dataset,” IEEE Trans. Pattern Anal. Mach. Intell. 43, 2822–2837 (2021). [CrossRef]  

24. H. Wang, H. Hu, J. Jiang, X. Li, W. Zhang, Z. Cheng, and T. Liu, “Automatic underwater polarization imaging without background region or any prior,” Opt. Express 29(20), 31283–31295 (2021). [CrossRef]  

25. K. O. Amer, M. Elbouz, A. Alfalou, C. Brosseau, and J. Hajjami, “Enhancing underwater optical imaging by using a low-pass polarization filter,” Opt. Express 27(2), 621–643 (2019). [CrossRef]  

26. S. Fang, X. Xia, X. Huo, and C. Chen, “Image dehazing using polarization effects of objects and airlight,” Opt. Express 22(16), 19523–19537 (2014). [CrossRef]  

27. F. Liu, P. Han, Y. Wei, K. Yang, S. Huang, X. Li, G. Zhang, L. Bai, and X. Shao, “Deeply seeing through highly turbid water by active polarization imaging,” Opt. Lett. 43(20), 4903–4906 (2018). [CrossRef]  

28. Y. Wei, P. Han, F. Liu, and X. Shao, “Enhancement of underwater vision by fully exploiting the polarization information from the stokes vector,” Opt. Express 29(14), 22275–22287 (2021). [CrossRef]  

29. F. Liu, Y. Wei, P. Han, K. Yang, L. Bai, and X. Shao, “Polarization-based exploration for clear underwater vision in natural illumination,” Opt. Express 27(3), 3629–3641 (2019). [CrossRef]  

30. Y. Zhu, T. Zeng, K. Liu, Z. Ren, and E. Y. Lam, “Full scene underwater imaging with polarization and an untrained network,” Opt. Express 29(25), 41865–41881 (2021). [CrossRef]  

31. D. Akkaynak, T. Treibitz, T. Shlesinger, Y. Loya, R. Tamir, and D. Iluz, “What is the space of attenuation coefficients in underwater computer vision?” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2017), pp. 568–577.

32. D. Akkaynak and T. Treibitz, “A revised underwater image formation model,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2018), pp. 6723–6732.

33. D. Akkaynak and T. Treibitz, “Sea-Thru: A method for removing water from underwater images,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2019), pp. 1682–1691.

34. C. Ancuti, C. O. Ancuti, T. Haber, and P. Bekaert, “Enhancing underwater images and videos by fusion,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2012), pp. 81–88.

35. C. O. Ancuti, C. Ancuti, C. De Vleeschouwer, and P. Bekaert, “Color balance and fusion for underwater image enhancement,” IEEE Trans. on Image Process. 27(1), 379–393 (2018). [CrossRef]  

36. C. O. Ancuti, C. Ancuti, C. De Vleeschouwer, and M. Sbetr, “Color channel transfer for image dehazing,” IEEE Signal Process. Lett. 26(9), 1413–1417 (2019). [CrossRef]  

37. C. O. Ancuti, C. Ancuti, C. De Vleeschouwer, and M. Sbert, “Color Channel Compensation (3C): A fundamental pre-processing step for image enhancement,” IEEE Trans. on Image Process. 29, 2653–2665 (2020). [CrossRef]  

38. J. Liu and X. Zhang, “Parameter-Adaptive Compensation (PAC) for processing underwater selective absorption,” IEEE Signal Process. Lett. 27, 2178–2182 (2020). [CrossRef]  

39. A. Galdran, A. Bria, A. Alvarez-Gila, J. Vazquez-Corral, and M. Bertalmío, “On the duality between Retinex and image dehazing,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2018), pp. 8212–8221.

40. Y. Zhou and K. Yan, “Domain adaptive adversarial learning based on physics model feedback for underwater image enhancement,” https://arxiv.org/abs/2002.09315.

41. A. Morel, “Marine optics,” Earth-Sci. Rev. 14(2), 170–171 (1978). [CrossRef]  

42. Z. Wang, L. Shen, M. Yu, Y. Lin, and Q. Zhu, “Single underwater image enhancement using an analysis-synthesis network,” https://arxiv.org/abs/2108.09023.

43. K. Wang, L. Shen, Y. Lin, M. Li, and Q. Zhao, “Joint iterative color correction and dehazing for underwater image enhancement,” IEEE Robot. Autom. Lett. 6(3), 5121–5128 (2021). [CrossRef]  

44. Y. Guo, H. Li, and P. Zhuang, “Underwater image enhancement using a multiscale dense generative adversarial network,” IEEE J. Oceanic Eng. 45(3), 862–870 (2020). [CrossRef]  

45. X. Chen, J. Yu, S. Kong, Z. Wu, X. Fang, and L. Wen, “Towards real-time advancement of underwater visual quality with GAN,” IEEE Trans. Ind. Electron. 66(12), 9350–9359 (2019). [CrossRef]  

46. J. Lu, N. Li, S. Zhang, Z. Yu, H. Zheng, and B. Zheng, “Multi-scale adversarial network for underwater image restoration,” Opt. Laser Technol. 110, 105–113 (2018). [CrossRef]  

47. J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in IEEE International Conference on Computer Vision, (IEEE, 2017), pp. 2242–2251.

48. L. Li, Y. Dong, W. Ren, J. Pan, C. Gao, N. Sang, and M.-H. Yang, “Semi-supervised image dehazing,” IEEE Trans. on Image Process. 29, 2766–2779 (2020). [CrossRef]  

49. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention, vol. 9351 (MICCAI, 2015), pp. 234–241.

50. X. Liu, Z. Gao, and B. M. Chen, “MLFcGAN: Multilevel feature fusion-based conditional GAN for underwater image color correction,” IEEE Geosci. Remote Sensing Lett. 17(9), 1488–1492 (2020). [CrossRef]  

51. M. J. Islam, Y. Xia, and J. Sattar, “Fast underwater image enhancement for improved visual perception,” IEEE Robot. Autom. Lett. 5(2), 3227–3234 (2020). [CrossRef]  

52. A. Gomez Chavez, A. Ranieri, D. Chiarella, E. Zereik, A. Babic, and A. Birk, “CADDY underwater stereo-vision dataset for human–robot interaction (HRI) in the context of diver activities,” J. Mar. Sci. Eng. 7(1), 16 (2019). [CrossRef]  

53. M. J. Islam, M. Fulton, and J. Sattar, “Toward a generic diver-following algorithm: Balancing robustness and efficiency in deep visual detection,” IEEE Robot. Autom. Lett. 4(1), 113–120 (2019). [CrossRef]  

54. K. He, J. Sun, and X. Tang, “Guided image filtering,” IEEE Trans. Pattern Anal. Mach. Intell. 35(6), 1397–1409 (2012). [CrossRef]  

55. H. Ahn, B. Keum, D. Kim, and H. S. Lee, “Adaptive local tone mapping based on retinex for high dynamic range images,” in IEEE International Conference on Consumer Electronics, (IEEE, 2013), pp. 153–156.

56. W. Zhang, G. Li, and Z. Ying, “A new underwater image enhancing method via color correction and illumination adjustment,” in IEEE Visual Communications and Image Processing, (IEEE, 2017), pp. 1–4.

57. B. Sun, R. Ramamoorthi, S. Narasimhan, and S. Nayar, “A practical analytic single scattering model for real time rendering,” ACM Trans. Graph. 24(3), 1040–1049 (2005). [CrossRef]  

58. M. Afifi and M. S. Brown, “What else can fool deep learning? Addressing color constancy errors on deep neural network performance,” in IEEE International Conference on Computer Vision, (IEEE, 2019), pp. 243–252.

59. S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, M.-H. Yang, and L. Shao, “Multi-stage progressive image restoration,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2021).

60. C. O. Ancuti, C. Ancuti, R. Timofte, L. Van Gool, L. Zhang, M.-H. Yang, T. Guo, X. Li, V. Cherukuri, V. Monga, H. Jiang, S. Yang, Y. Liu, X. Qu, P. Wan, D. Park, S. Y. Chun, M. Hong, J. Huang, Y. Chen, S. Chen, B. Wang, P. N. Michelini, H. Liu, D. Zhu, J. Liu, S. Santra, R. Mondal, B. Chanda, P. Morales, T. Klinghoffer, L. M. Quan, Y.-G. Kim, X. Liang, R. Li, J. Pan, J. Tang, K. Purohit, M. Suin, A. Rajagopalan, R. Schettini, S. Bianco, F. Piccoli, C. Cusano, L. Celona, S. Hwang, Y. S. Ma, H. Byun, S. Murala, A. Dudhane, H. Aulakh, T. Zheng, T. Zhang, W. Qin, R. Zhou, S. Wang, J.-P. Tarel, C. Wang, and J. Wu, “Ntire 2019 image dehazing challenge report,” in IEEE Conference on Computer Vision and Pattern Recognition Workshops, (IEEE, 2019), pp. 2241–2253.

61. M. Afifi and M. S. Brown, “Deep white-balance editing,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2020), pp. 1394–1403.

62. M. Afifi, B. Price, S. Cohen, and M. S. Brown, “When color constancy goes wrong: Correcting improperly white-balanced images,” in IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2019), pp. 1535–1544.

63. D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” https://arxiv.org/abs/1412.6980.

64. C. Li, C. Guo, W. Ren, R. Cong, J. Hou, S. Kwong, and D. Tao, “An underwater image enhancement benchmark dataset and beyond,” IEEE Trans. on Image Process. 29, 4376–4389 (2020). [CrossRef]  

65. A. Reza, “Realization of the Contrast Limited Adaptive Histogram Equalization (CLAHE) for real-time image enhancement,” VLSI Signal Processing 38(1), 35–44 (2004). [CrossRef]  

66. K. Panetta, C. Gao, and S. Agaian, “Human-visual-system-inspired underwater image quality measures,” IEEE J. Oceanic Eng. 41(3), 541–551 (2016). [CrossRef]  

67. M. Yang and A. Sowmya, “An underwater color image quality evaluation metric,” IEEE Trans. on Image Process. 24(12), 6062–6071 (2015). [CrossRef]  

68. Kenta, “Explainable AI: interpreting the classification using LIME,” https://github.com/KentaItakura.

69. H.-B. Han, “Image2palette: Simple K-means color clustering,” https://ww2.mathworks.cn/matlabcentral/fileexchange/69538.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Figures (19)

Fig. 1. Demonstration of (A) underwater optical imaging, (B) degraded underwater images, (C) method overview of MUIE, and (D) the corresponding recovered results of MUIE. (Best viewed at 200% zoom).
Fig. 2. The architecture of the MUIE for underwater image recovery.
Fig. 3. The statistics of channel entropy (first row) and channel correlation (second row) of degraded underwater images from the dataset of [42], CADDY [52], VDD [53], and natural-scene images. The channel correlation is computed with the Pearson correlation coefficient. (Best viewed at 200% zoom).
Fig. 4. Demonstration of gradient transition. The variances $\sigma$ are as depicted.
Fig. 5. (A) Compensation comparisons of CB [35], PAC [38], and the channel compensation of MUIE (i.e., CE). The recovery results of the compensated images were generated by the RCP method [8]. The underwater images were selected from the VDD dataset [53]. (B) Histogram distributions of the corresponding images. (Best viewed at 200% zoom).
Fig. 6. (A) Raw underwater images of the VDD dataset [53], (B) the results of fusion [35] without compensation, (C) the results of channel compensation, (D) the results of fusion [35] with compensation, (E) the results of dual computation based only on normalization with channel compensation, and (F) the results of duality-based fusion with channel compensation (i.e., the in-situ enhancement of MUIE).
Fig. 7. The encoder-decoder architecture. The max-pooling layer (kernel of size 2$\times$2, stride 1) is employed for down-sampling, and the transposed convolutional layer (kernel of size 2$\times$2, stride 1) is utilized for up-sampling. In each encoder and decoder module, the convolutional layer before the ReLU layer adopts a kernel of size 2$\times$2 and stride 1. The convolutional layer before the output layer uses a kernel of size 1$\times$1 and stride 1.
Fig. 8. Function demonstration of the color autoencoder: (A) raw underwater images of the VDD dataset [53], (B) the results of in-situ enhancement (i.e., MUIE without data-driven correction), and (C) the results of in-situ enhancement with data-driven correction (i.e., the final results of MUIE).
Fig. 9. Visualized results of each MUIE stage. (Best viewed at 200% zoom).
Fig. 10. Qualitative recovery evaluation on UIEB-R [64] for various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], the reference provided by UIEB, and the MUIE. (Best viewed at 200% zoom).
Fig. 11. Qualitative recovery evaluation on UIEB-C [64] for various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], and the MUIE. (Best viewed at 200% zoom).
Fig. 12. Qualitative recovery evaluation on SQUID [23] for various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], and the MUIE. (Best viewed at 200% zoom).
Fig. 13. Quantitative recovery evaluation on UIEB-R [64], UIEB-C [64], and SQUID [23] for various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], and the MUIE. UIQM and UCIQE scores show a positive relationship with recovery performance.
Fig. 14. Transmission maps of raw underwater images and images recovered by RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], the reference of UIEB [64], and the MUIE. The transmission maps were estimated by DCP [7]. The appearance of the transmission map is positively associated with the perceptual quality.
Fig. 15. Edge maps of raw underwater images and images recovered by RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], Ucolor [15], UWCNN [14], the reference of UIEB [64], and the MUIE. The edge maps were generated by the Sobel detector with a threshold of 0.08. The appearance of the edge map is positively associated with the perceptual quality.
Fig. 16. Recovered results and corresponding attention maps on CADDY [52] for various methods: the raw input, RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. The attention map is obtained by a pre-trained ResNet-18 [68]. The appearance of the attention map is positively associated with the perceptibility of recovered objects.
Fig. 17. Average histogram distribution of raw images of CADDY [52] and images recovered by RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. A discrepant histogram distribution is negatively correlated with the recovery performance.
Fig. 18. Channel correlation of raw images of CADDY [52] and images recovered by RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. The channel correlation is computed with the Pearson correlation coefficient. The convergence of the distribution shows a positive relationship with channel correlation.
Fig. 19. Domain distribution of raw images of CADDY [52] and images recovered by RCP [8], RE [55], DU [39], CLAHE [65], FUSION [34], CBF [35], FUnIE [51], UWCNN [14], and the MUIE. The domain distribution is visualized by the clustering technique of [69]. A homologous domain distribution suggests strong recovery generalization.
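
Two of the visualizations listed above rest on simple, reproducible measurements: the channel correlation in Figs. 3 and 18 is the Pearson correlation coefficient between color channels, and the edge maps in Fig. 15 come from a Sobel detector thresholded at 0.08. The following is a minimal NumPy/SciPy sketch of how such maps could be computed; the function names and the grayscale conversion are illustrative choices, not the paper's exact implementation.

import numpy as np
from scipy.ndimage import sobel

def channel_correlation(img):
    # Pairwise Pearson correlation between the flattened R, G, B channels (cf. Figs. 3 and 18).
    channels = np.stack([img[..., c].ravel().astype(np.float64) for c in range(3)])
    corr = np.corrcoef(channels)
    return {"rg": corr[0, 1], "rb": corr[0, 2], "gb": corr[1, 2]}

def edge_map(img, threshold=0.08):
    # Sobel gradient magnitude of a grayscale version of the image,
    # thresholded at 0.08 as in Fig. 15.
    gray = img.astype(np.float64).mean(axis=2)
    gray = (gray - gray.min()) / (gray.max() - gray.min() + 1e-8)
    gx, gy = sobel(gray, axis=1), sobel(gray, axis=0)
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-8) > threshold

Applied to a raw image and its recovered counterpart, higher channel correlations and denser edge maps for the recovered image are consistent with the trends reported in Figs. 15 and 18.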

Tables (1)

Table 1. The recovery evaluations on Test-R [64], Test-C [64], and SQUID [23] datasets in terms of UIQM [66] and UCIQE [67]. The top three results are marked in red, blue, and green.

Equations (17)

$I_c(x) = J_c(x)\, t(x) + B_c\,\big(1 - t(x)\big),$
$I_c = J_c\, e^{-\beta_c^{D}(\mathbf{v}_D)\, z} + B_c\,\big(1 - e^{-\beta_c^{B}(\mathbf{v}_B)\, z}\big),$
$J_c(x) = 1 - \mathrm{Ret}\big(1 - I_c(x)\big).$
$I_{rc} = I_r + w\,\big(\bar{I}_g^{w} - \bar{I}_r^{w}\big)\big(I_g - I_{gr}\big), \qquad I_{bc} = I_b + w\,\big(\bar{I}_g^{w} - \bar{I}_b^{w}\big)\big(I_g - I_{gb}\big),$
$L_s = \sum_{x \in p} \big\| I_{re}^{p} - g^{p}(x) \big\|_2^2 + \epsilon\, a(x)^2,$
$a^{p}(x) = \dfrac{\overline{I_{rc}^{p}\, g^{p}} - \bar{I}_{rc}^{p}\, \bar{I}_g^{p}}{\sigma_{g^{p}}^{2} + \epsilon},$
$b^{p}(x) = \bar{I}_g^{p} - a^{p}(x)\, \bar{I}_{rc}^{p},$
$I_{re}(x) = \bar{a}^{p}\, I_{rc}(x) + \bar{b}^{p},$
$I_{ce} = \gamma\, I_c + (1 - \gamma)\, I_e.$
$\mathrm{Norm}(I) = \dfrac{I_c(x) - I_{c,min}}{I_{c,max} - I_{c,min}},$
$\mathrm{Fus}(I) = \sum_{k=1}^{K} G_{l}\{\bar{W}_k(x)\}\, L_{l}\{I_k^{c}(x)\},$
$\mathrm{Dua}(I) = \mathrm{Ret}\big(1 - \mathrm{Fus}(\mathrm{Norm}(1 - I))\big).$
$\mathrm{Dua}_F(I) = 1 - \mathrm{Fus}\big(\mathrm{Ret}(1 - I)\big),$
$\mathrm{Dua}_R(I) = \mathrm{Ret}\big(1 - \mathrm{Fus}(\mathrm{Ret}(1 - I))\big).$
$L_c = \sum_{i=1}^{N} \big\| I_p(i) - I_{gt}(i) \big\|_1,$
$\mathrm{UIQM}(I) = \eta_1 \times \mathrm{UICM} + \eta_2 \times \mathrm{UISM} + \eta_3 \times \mathrm{UIConM},$
$\mathrm{UCIQE}(I) = \kappa_1 \times \sigma_c + \kappa_2 \times con_l + \kappa_3 \times \mu_s,$
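
For readers who want to trace the duality-based computation $\mathrm{Dua}(I) = \mathrm{Ret}\big(1 - \mathrm{Fus}(\mathrm{Norm}(1 - I))\big)$ numerically, a minimal Python sketch follows. It assumes a single-scale Retinex stand-in for $\mathrm{Ret}(\cdot)$ and an identity placeholder for the fusion operator $\mathrm{Fus}(\cdot)$; both are simplifications for illustration, and the function names are not the paper's implementation.

import numpy as np
from scipy.ndimage import gaussian_filter

def norm(img):
    # Per-channel min-max normalization, cf. Norm(I) above.
    out = np.empty_like(img, dtype=np.float64)
    for c in range(img.shape[2]):
        ch = img[..., c].astype(np.float64)
        out[..., c] = (ch - ch.min()) / (ch.max() - ch.min() + 1e-8)
    return out

def simple_retinex(img, sigma=15):
    # Single-scale Retinex stand-in for Ret(.): reflectance = log(image) - log(estimated illumination).
    illum = gaussian_filter(img, sigma=(sigma, sigma, 0)) + 1e-6
    return norm(np.log1p(img) - np.log1p(illum))

def dua(img, fus=lambda x: x):
    # img: float RGB image with values in [0, 1].
    # Dua(I) = Ret(1 - Fus(Norm(1 - I))); Fus(.) defaults to identity in this sketch.
    x = norm(1.0 - img)
    x = fus(x)
    return simple_retinex(1.0 - x)

For instance, out = dua(img) on an H$\times$W$\times$3 float image in [0, 1] returns an enhanced image of the same shape; substituting a multi-scale fusion operator for the identity placeholder would bring the sketch closer to the duality-based fusion described in the text.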