
Three-dimensional integral imaging-based image descattering and recovery using physics informed unsupervised CycleGAN

Open Access

Abstract

Image restoration and denoising have been challenging problems in optics and computer vision, and there has been active research in the optics and imaging communities to develop robust, data-efficient systems for image restoration tasks. Recently, physics-informed deep learning has received wide interest for scientific problems. In this paper, we introduce a three-dimensional integral imaging-based physics-informed unsupervised CycleGAN (Generative Adversarial Network) algorithm for underwater image descattering and recovery. The system consists of a forward and a backward pass. The base architecture consists of an encoder and a decoder. The encoder takes the clean image along with the depth map and the degradation parameters to produce the degraded image. The decoder takes the degraded image generated by the encoder along with the depth map and produces the clean image along with the degradation parameters. To give the input degradation parameter physical significance with respect to a physical degradation model, we also incorporate the physical model into the loss function. The proposed model has been assessed on a dataset curated through underwater experiments at various levels of turbidity. In addition to recovering the original image from the degraded image, the proposed algorithm also models the distribution from which the degraded images have been sampled. Furthermore, the proposed three-dimensional integral imaging approach is compared with a traditional deep learning-based approach and a 2D imaging approach under turbid and partially occluded environments. The results suggest the proposed approach is promising, especially under the above experimental conditions.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Image restoration and recovery aims to learn a mapping function that transforms degraded images into clean images. It has gained considerable interest in the optics and computer vision communities due to its applications in various areas, including video surveillance, autonomous driving, underwater imaging and signal detection, and medical imaging [1–6]. During image acquisition, degradation due to scattering and attenuation is inevitable; it often considerably affects the performance of these systems, and its removal is inherently an ill-posed problem. Image restoration algorithms can be broadly categorized as learning-based or model-based. In model-based approaches, we generally use a physical model for the degradation, and the image prior is obtained from the mathematical construction of the penalty functional. Within this class of algorithms, sparse coding and its variations are popular in the literature [1,2]. In contrast, learning-based approaches leverage the training data (degraded-clean image pairs) to learn the image prior. Of late, deep learning-based approaches [3–5] have been immensely successful owing to their ability to approximate arbitrary functions and the wide availability of large datasets. However, good performance on specific datasets does not always guarantee that the function learned by a trained deep network is consistent with the physical model of the problem; establishing such a guarantee requires extensive testing. To address this, physics-informed deep learning integrates the physical modeling of the problem into the training algorithms of deep learning models. This constrains the deep network to produce physically consistent outputs and lends interpretability to the results obtained.

Integral imaging is a passive three-dimensional imaging technique that captures the spatio-angular information of a 3D scene. Several integral imaging architectures are available, such as camera arrays or plenoptic imaging, for capturing 3D information. Although other 3D image acquisition methods, such as LiDAR, are available in the literature, integral imaging has the advantage of working in incoherent or ambient light. This makes it better suited than other contenders for problems like underwater communication, gesture recognition, and polarimetric imaging [6–12]. Several integral imaging and deep learning-based image restoration and classification algorithms have previously been proposed [4,13]; however, those approaches were purely data-driven. Such algorithms can generally be relied upon only within the bounds of the data used to train the model. While the performance of those approaches was commendable, the need for more interpretable and physically consistent predictions must be addressed. One possible approach is to use physics-based constraints in the loss function during neural network training. Additionally, in the physical sciences, the dataset available for training might be limited, and modeling the distribution of data generation may also produce valuable insights into the underlying physical processes.

This paper proposes 3D integral imaging-based underwater image descattering and restoration using a physics-informed unsupervised CycleGAN. Unlike traditional architectures that use paired training data, we use unpaired data to train the neural network. The network consists of an encoder and a decoder. First, we estimate the distribution of the degradation parameters using ground truth values, which can generate a degraded image from the clean image based on a physical model. In addition to the clean and degraded images used for training, the sampled degradation parameter is also fed as an input to the neural network. A physics-informed loss function is used, in addition to the other loss functions used for training the model, to achieve physical consistency in the predictions.

This paper is organized into four sections: Section 1 provides the introduction, a brief review of image restoration techniques, and the motivation behind the use of physics-informed deep learning for these tasks. Section 2 discusses the proposed approach in detail, and Section 3 presents the experimental results, including performance comparisons. Finally, Section 4 provides the conclusions of the paper.

2. Methodology

We propose a physics-informed deep learning-based system with an encoder, which generates a degraded image given a clean image, and a decoder, which recovers the clean image from the degraded image. In this way, we are able to model the distribution of the underlying physical degradation in addition to recovering the clean image from the degraded version. The block diagram of the proposed approach is shown in Fig. 1.

Fig. 1. Block diagram of the proposed 3D integral imaging (InIm) based image restoration under scattering medium.

In addition, the degraded image generation is constrained by a physical model enforced through the loss function, which makes the system physically consistent while retaining the advantages of a data-driven approach.

2.1 3D Integral Imaging based computational reconstruction

3D integral imaging captures the spatio-angular information of a 3D scene using a lenslet array, a camera array, or a moving camera framework [12–18]. Compared to 2D imaging, it has proved useful for 3D object sensing, visualization, and classification under various degradations such as partial occlusion, scattering media, and low illumination conditions [4,7,8–13]. The elemental images captured using the array of cameras can be used to reconstruct the object at the plane of interest. This is done by back-projecting the elemental images through a virtual array of pinholes and reconstructing at the depth of interest z as:

$$r(x,y,z) = \frac{1}{O(x,y)}\sum_{i = 0}^{P - 1}\sum_{j = 0}^{N - 1} EI_{i,j}\left(x - i\frac{r_x \times p_x}{M \times d_x},\; y - j\frac{r_y \times p_y}{M \times d_y}\right)$$
where $r(x,y,z)$ is the three-dimensional integral imaging reconstructed image and $EI_{i,j}$ denotes the (i,j)th elemental image, with P and N the numbers of elemental images in the horizontal and vertical directions. $r(x,y,z)$ is obtained by shifting and overlapping the elemental images at the desired reconstruction depth z. Here, x, y are the pixel indices, f is the focal length, and the magnification factor is $M = z/f$. In Eq. (1), the matrix O(x,y) holds the number of overlapping pixels. $p_x, p_y$ denote the pitch between adjacent image sensors, and $r_x, r_y$ and $d_x, d_y$ represent the resolution and physical size of the image sensor in the horizontal and vertical directions, respectively. Figure 2 depicts the 3D integral imaging-based camera pickup and reconstruction process.
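As a concrete illustration of Eq. (1), the following minimal NumPy sketch performs the shift-and-sum reconstruction at a chosen depth. The array layout and argument names are assumptions made for illustration, not the authors' implementation.

```python
# Minimal NumPy sketch of the shift-and-sum reconstruction of Eq. (1).
# Array layout and argument names are illustrative assumptions.
import numpy as np

def integral_imaging_reconstruct(elemental_images, z, f, pitch, res, sensor_size):
    """elemental_images: (P, N, H, W) array of elemental images;
    z: reconstruction depth; f: focal length; pitch = (p_x, p_y);
    res = (r_x, r_y); sensor_size = (d_x, d_y)."""
    P, N, H, W = elemental_images.shape
    M = z / f                                        # magnification factor M = z/f
    sx = pitch[0] * res[0] / (M * sensor_size[0])    # pixel shift per camera index, x
    sy = pitch[1] * res[1] / (M * sensor_size[1])    # pixel shift per camera index, y

    acc = np.zeros((H, W))                           # shifted-and-summed intensities
    overlap = np.zeros((H, W))                       # O(x, y): overlapping pixel counts
    for i in range(P):
        for j in range(N):
            dx = int(round(i * sx))
            dy = int(round(j * sy))
            if dy >= H or dx >= W:
                continue                             # shift exceeds frame; skip
            acc[dy:, dx:] += elemental_images[i, j, :H - dy, :W - dx]
            overlap[dy:, dx:] += 1.0
    return acc / np.maximum(overlap, 1.0)            # r(x, y, z) of Eq. (1)
```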

Fig. 2. (a) Integral imaging passive sensing and image pickup process, (b) computational 3D reconstruction process using integral imaging.

Fig. 3. Imaging in an underwater scenario: degradation due to scattering and absorption.

2.2 Physics Informed deep learning for image restoration

Here, we briefly discuss the details of our proposed approach. First, we consider a physical model for the degradation which fits the data well under both mildly and highly degraded scenarios. Then, we estimate the ground truth parameter values of the model using an appropriate prior under Bayesian optimization. Once we obtain suitable ground truth parameters, we estimate the distribution of those parameters so that we can reliably sample from it and provide the samples as input to the neural network. The network is trained in an unsupervised fashion using the ground truth parameter values and an experimentally collected dataset of clean and degraded images. Using the estimated ground truth parameter values and the physical model, we synthesize degraded images under mildly and highly degraded conditions. Figure 3 shows the underwater imaging scenario, illustrating degradations such as scattering and absorption. For the current work, we consider underwater turbidity as the degradation, but the approach is generalizable to other environmental degradations given a suitable physical model. The physical model is incorporated into the loss functions used for training in order to enforce physical consistency in the predictions.

2.2.1 Physical model and ground truth parameter estimation

Under a scattering environment, the inherent and apparent optical properties (IOPs and AOPs) of the scattering medium affect the propagation of light. The beam absorption $a(\lambda )$, beam scattering $b(\lambda )$, and beam attenuation $({c(\lambda )= a(\lambda )+ b(\lambda )} )$ constitute the IOPs. Thus, light propagation in a homogeneous, source-free scattering medium can be described according to the radiative transfer equation [19,20] as:

$$I(d;\zeta;\lambda) = I_0(d_0;\zeta;\lambda)\,e^{-c(\lambda)z} + \frac{I_*(d;\zeta;\lambda)}{c(\lambda) - K_d(\lambda)\cos(\theta)}\left(1 - e^{-(c(\lambda) - K_d(\lambda)\cos(\theta))z}\right)$$
where $I(\cdot)$, ${I_0}(\cdot)$ and ${I_\ast }(\cdot)$ are the captured radiance, the radiance leaving the object, and the path function describing radiance photons arriving from all directions, respectively. Here, c is the attenuation coefficient (experienced by the direct rays), $\lambda $ is the wavelength, z is the depth, and $\theta $ is the viewing direction. ${K_d}$ is the diffuse attenuation coefficient of downwelling light (the attenuation experienced by light penetrating vertically). Both c and ${K_d}$ are attenuation coefficients characterizing the effective attenuation, but the contribution of ${K_d}(\lambda )$ becomes evident especially when the viewing angle $\theta \ne 90^\circ$.

The above model can be simplified to [20–22]

$$I = I_0\,e^{-cz} + A\left(1 - e^{-cz}\right)$$
where we assume the viewing direction is horizontal, i.e., $\theta = 90^\circ$, and that the attenuation coefficients vary negligibly with wavelength. This assumption generally holds in air, but it may not always hold in underwater environments [20]. The physical model thus varies with the problem; since finding the optimal model for each scenario is beyond the scope of this manuscript, we adopt the simplified model for our case. Here, the parameters to be estimated are $cz$ (the scaled depth) and A, the veiling or atmospheric light. Using the training dataset, the ground truth parameter values are estimated sequentially; the scaled depth is obtained analytically using the dark channel prior approach [22] as:
$$t(x) = e^{-cz} = 1 - \min_c\left(\min_{y\in\Omega(x)}\left(\frac{I(y)}{A}\right)\right)$$

Here, A is initialized as the brightest image pixel among the top $0.1\%$ brightest pixels of the corresponding dark channel. The atmospheric light is then re-estimated using Bayesian optimization [23–25] with the mean square error between the synthetically degraded and the actual degraded image as the objective function. In the context of Bayesian optimization, suppose we need to minimize a function $f(\cdot)$ over the domain $X$, i.e., ${x^\ast } = \; argmi{n_{x\epsilon X}}\; f(x )$. In general, Bayesian optimization chooses an appropriate Gaussian process prior for the function f, i.e., the sample points (or observations) of f are drawn from a Gaussian process. $f(\cdot)$ may be expensive to evaluate.
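For illustration, a minimal sketch of this initialization step (the per-patch, per-channel minimum of Eq. (4) and the 0.1% rule for A) could look as follows; the function names and patch size are assumptions, not the authors' implementation.

```python
# Hedged sketch of the dark-channel-prior initialization described above.
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """img: HxWx3 float array in [0, 1]; min over color channels and a local
    patch Omega(x)."""
    return minimum_filter(img.min(axis=2), size=patch)

def estimate_A(img, patch=15):
    """Brightest image pixel among the top 0.1% brightest dark-channel pixels."""
    dc = dark_channel(img, patch)
    n = max(1, int(0.001 * dc.size))
    idx = np.unravel_index(np.argsort(dc, axis=None)[-n:], dc.shape)
    brightness = img[idx].sum(axis=1)               # brightness of the candidates
    y, x = idx[0][brightness.argmax()], idx[1][brightness.argmax()]
    return img[y, x]                                # A per color channel

def estimate_transmission(img, A, patch=15):
    """t(x) = exp(-cz) = 1 - min_c min_{y in Omega(x)} I(y)/A   (Eq. (4))."""
    return 1.0 - dark_channel(img / np.maximum(A, 1e-6), patch)
```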

Given an initial set of observations, we choose the next feasible point that best minimizes $f(\cdot)$. We use an inexpensive function $a(x )$, called the acquisition function, which measures how desirable evaluating f at x is for minimization. We optimize the acquisition function instead of $f(\cdot)$ to obtain the next feasible point. For our scenario, we choose a Gaussian process prior and Expected Improvement (EI) as the acquisition function. Expected Improvement evaluates ${f_{t + 1}}$ at the point that, on average, improves upon ${f_t}$ the most. For our case, the objective function $f(\cdot)$ is the mean square error between the original degraded image and the synthetically generated image (generated with the parameter value in the current iteration).
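A compact sketch of this refinement step using scikit-optimize's gp_minimize (Gaussian process surrogate with Expected Improvement) might look as follows; the search range for A and the number of evaluations are illustrative choices, and synthesize_degraded() simply applies the simplified physical model of Eq. (3).

```python
# Illustrative Bayesian-optimization sketch (GP prior, EI acquisition) for
# refining the atmospheric light A; names and ranges are assumptions.
import numpy as np
from skopt import gp_minimize

def refine_A(clean_img, degraded_img, transmission):
    """clean_img, degraded_img: HxWx3 arrays in [0, 1];
    transmission: HxW map t(x) = exp(-cz) from Eq. (4)."""
    def synthesize_degraded(A):
        # I = I0 * t + A * (1 - t), the simplified model of Eq. (3)
        return clean_img * transmission[..., None] + A * (1.0 - transmission[..., None])

    def objective(params):
        A = params[0]
        return float(np.mean((synthesize_degraded(A) - degraded_img) ** 2))  # MSE

    result = gp_minimize(objective,
                         dimensions=[(0.0, 1.0)],   # assumed search range for A
                         acq_func="EI",             # Expected Improvement
                         n_calls=30,
                         random_state=0)
    return result.x[0], result.fun                  # refined A and its MSE
```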

Once the ground truth parameter values are estimated, we assume a parametric form for their distribution so that we can sample from it and feed the samples to the neural network. To choose the best-fit distribution for this set of parameter values, we use the Akaike information criterion (AIC) [26,27], which evaluates goodness of fit as well as model complexity. The AIC uses the log likelihood while penalizing the number of model parameters and is calculated as $\textrm{AIC} = 2k - 2LL$, where k is the number of parameters of the model and $LL$ is the log likelihood. Once the AIC is calculated for all candidate models, the one with the lowest AIC is chosen as the best fit.
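The sketch below illustrates this AIC comparison for a few candidate fits; the candidate list is ours for illustration and is not the exhaustive set considered in the paper.

```python
# Hedged sketch of AIC-based model selection, AIC = 2k - 2*LL.
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

def aic_for_scipy_dist(dist, data):
    params = dist.fit(data)                      # maximum-likelihood fit
    ll = np.sum(dist.logpdf(data, *params))      # log likelihood LL
    return 2 * len(params) - 2 * ll              # AIC = 2k - 2LL

def select_distribution(data):
    data = np.asarray(data)
    scores = {name: aic_for_scipy_dist(dist, data)
              for name, dist in [("normal", stats.norm),
                                 ("gamma", stats.gamma),
                                 ("lognormal", stats.lognorm)]}
    gmm = GaussianMixture(n_components=2).fit(data.reshape(-1, 1))
    scores["gmm-2"] = gmm.aic(data.reshape(-1, 1))   # sklearn reports AIC directly
    return min(scores, key=scores.get), scores       # best model name and all scores
```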

We calculated the AIC for most common parametric distributions and found that the Gaussian Mixture Model (GMM) has the lowest AIC value. To find the parameters of the GMM, we use the Expectation Maximization (EM) algorithm [28,29], a generalization of maximum likelihood estimation for incomplete data. EM seeks the set of parameters that maximizes the log probability $\log P({X|\theta } )$ of the data, with X the data and $\theta $ the parameters. Let X be the observed variables and Z the latent (unobserved) variables. Starting from an initial set of parameter values ${\theta ^0}$, in the E-step we calculate the posterior distribution of Z, $P({Z|X,{\theta^t}} )$. In the M-step, we update ${\theta ^{t + 1}}$ as ${\theta ^{t + 1}} = \arg\max_\theta \sum_i \sum_k {\gamma _{ik}}\log P({{X_i} = {x_i},{Z_i} = k|\theta } )$, where ${\gamma _{ik}}$ are the responsibilities computed in the E-step. This process repeats until convergence. Once the distribution of the ground truth parameters is obtained, it is fed to the neural network along with the dataset during training so that the network predictions become consistent with the physical model.
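In practice, this step can be carried out with scikit-learn's GaussianMixture, which runs EM internally; the number of components and the name of the input array below are illustrative assumptions.

```python
# Hedged sketch: fit the GMM via EM and sample degradation parameters to feed
# to the network. The component count is an assumption, not from the paper.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_and_sample_degradation(estimated_A_values, n_samples=16):
    """estimated_A_values: 1D array of ground-truth A values from Sec. 2.2.1."""
    gmm = GaussianMixture(n_components=2, covariance_type="full",
                          max_iter=200, random_state=0)
    gmm.fit(np.asarray(estimated_A_values).reshape(-1, 1))   # EM fit
    samples, _ = gmm.sample(n_samples)                       # parameters for a batch
    return gmm, samples.ravel()
```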

2.2.2 CycleGAN-based physics informed neural network for image restoration

Here, we propose a CycleGAN-based [30,31] image restoration algorithm utilizing the physical degradation model. We propose a system that is able to 1) generate various degraded images given a clean image and parameters sampled from a learned distribution of degradation, and 2) recover the latent clean image given a degraded image. In this way, we model the degradation distribution in addition to performing image recovery. For these tasks, we propose an unsupervised learning algorithm utilizing the CycleGAN architecture. Traditionally, image-to-image translation models require many paired (clean-degraded) images for training. However, it is prohibitively expensive to curate a dataset with a sufficiently large number of candidate images, each of which undergoes all the degradations of interest. CycleGAN is a technique for training unsupervised translation models using the GAN architecture and an unpaired image dataset. The basic structure of our system is shown in Fig. 4.

Fig. 4. Overall system architecture. Here, T represents the ground truth depth map, and A is a value obtained by sampling from the distribution of atmospheric light parameter values, with the parameters of the distribution estimated using the procedure described in Section 2.2.1. For our problem, the depth map refers to the one estimated by the minimization part of Eq. (4). The parameter z accounts for possible model mismatch.

During training, we have a forward and a backward cycle. During the forward cycle, the encoder takes the clean image, the input depth map, the degradation parameter, and a latent vector to generate a synthetic degraded image. The decoder takes the generated degraded image along with the depth map to produce the clean image along with the latent vector and the degradation parameter. In general, we can incorporate multiple degradation parameters, but here we assume the degradation parameter to be the atmospheric light specified in Eq. (3). During the backward cycle, the actual degraded image along with the ground truth depth map is fed to the decoder to generate a clean image with the corresponding parameters, and those values are fed to the encoder to recover the input degraded image. Latent vectors are sampled from a standard Gaussian distribution and help account for possible model mismatch. The encoder architecture is shown in Fig. 5. Since both the encoder and the decoder architectures are GANs, they each have a generator and a discriminator. For all the generators and discriminators in this work, we have used the architectures used in SRGAN [32].
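The forward and backward cycles can be summarized by the PyTorch-style sketch below; f_enc and f_dec are placeholders for the encoder and decoder generators, and the latent dimension is an assumed value, so this does not reproduce the authors' exact interfaces.

```python
# Hedged PyTorch-style sketch of the forward and backward cycles described above.
import torch

def forward_cycle(f_enc, f_dec, x_clean, depth, A, z_latent):
    y_fake = f_enc(x_clean, depth, A, z_latent)      # clean -> synthetic degraded
    x_rec, A_rec, z_rec = f_dec(y_fake, depth)       # degraded -> clean + parameters
    return y_fake, x_rec, A_rec, z_rec

def backward_cycle(f_enc, f_dec, y_real, depth):
    x_fake, A_est, z_est = f_dec(y_real, depth)      # real degraded -> clean + parameters
    y_rec = f_enc(x_fake, depth, A_est, z_est)       # clean -> degraded again
    return x_fake, y_rec

batch_size, latent_dim = 10, 16                      # batch size from the text; latent_dim assumed
z_latent = torch.randn(batch_size, latent_dim)       # latent vector ~ N(0, I)
```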

Fig. 5. Encoder generator architecture for the CycleGAN-based physics-informed neural network. BN: batch normalization. A and N denote the degradation parameter and the latent vector, respectively.

Thus, the encoder (${f_{enc}}$) receives a clean image (x), the ground truth depth map ($t(x )$), and the degradation parameter with a latent vector (z), and produces $y = {f_{enc}}({x,t(x ),z} )$. The decoder (${f_{dec}}$) takes this degraded image and the depth map ($t(x )$) and extracts an estimate of the clean image ($\hat{x}$) and the degradation vector ($\hat{z}$), producing $[\hat{x}, \hat{z}] = {f_{dec}}({y,t(x )} )$. The network is trained as a CycleGAN, and mutual information between the latent vector and the degraded image is used to maintain a correlation between them. We utilize three loss functions in the neural network architecture: 1) a content loss between the synthetically degraded image and the generated degraded image, to ensure that the content of the image is preserved across the networks, and 2) a cycle consistency loss ${L_{cycle}}({G,F} )$, which ensures the two mappings $G:X \to Y$ and $F:Y \to X$ are inverses of one another:

$$L_{cycle}(G,F) = E_{x\sim p_{data}(x)}\left[\|F(G(x)) - x\|_1\right] + E_{y\sim p_{data}(y)}\left[\|G(F(y)) - y\|_1\right]$$
where $G({\cdot} )$ is the encoder ${f_{enc}}$ and $F({\cdot} )$ is the decoder ${f_{dec}}$. As the third term, we use a mutual information loss between the latent vector and the degraded image. To calculate the mutual information (MI) loss between the latent variable z and the degraded image y, we feed y to the discriminator network corresponding to the encoder ${f_{enc}}$ and extract the output of the penultimate layer of this network. By design, this output has the same dimensionality as z, making it possible to calculate the desired mutual information using widely available algorithms. Additionally, to ensure that the output of the encoder agrees with the physical model of degradation, we impose a mean squared error (MSE) loss between y and the output of the physical degradation model. This has been designated as ${C_{loss}}$ in Fig. 4. The decoder generator architecture is similar to the encoder generator architecture, except that only the input and depth images are fed to the generator.
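The loss terms described above can be sketched in PyTorch as follows; the relative weights and variable names are illustrative, and the mutual-information term is omitted here since it depends on the discriminator's penultimate features.

```python
# Hedged sketch of the cycle-consistency loss of Eq. (5), the content loss, and
# the physics-consistency term C_loss; weights and names are placeholders.
import torch
import torch.nn.functional as F

def cycle_consistency_loss(x, x_rec, y, y_rec):
    # L_cycle(G, F) = ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1   (Eq. (5))
    return F.l1_loss(x_rec, x) + F.l1_loss(y_rec, y)

def physics_consistency_loss(y_fake, x_clean, t_map, A):
    # C_loss: MSE between the generated degraded image and the physical model
    # I = I0 * exp(-cz) + A * (1 - exp(-cz)) of Eq. (3), with t_map = exp(-cz)
    y_phys = x_clean * t_map + A * (1.0 - t_map)
    return F.mse_loss(y_fake, y_phys)

# Total objective with illustrative weights (lambda_c, lambda_p):
# loss = cycle_consistency_loss(x, x_rec, y, y_rec)
#        + lambda_c * F.mse_loss(y_fake, y_synth)                  # content loss
#        + lambda_p * physics_consistency_loss(y_fake, x, t_map, A)
```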

3. Experimental results and discussions

In this section, we discuss the details of the underwater experiments conducted and analyze the performance of the proposed system for image recovery. Optical signal detection using integral imaging in underwater environments under degradations such as turbidity is challenging and has been widely studied in the literature [6,8]. The primary issue with underwater signal detection is that the signal is degraded by turbidity and/or partial occlusion. In addition, the presence of ambient light along with turbidity worsens system performance. The performance of such applications can be enhanced if the optical signal information deteriorated by environmental degradations can be retrieved. Therefore, we explore the utility of the proposed approach for underwater optical signal enhancement under several levels of turbidity and in the presence of external light. In addition, we compare the performance of the proposed approach with that of conventional deep learning architectures with integral imaging used in the literature. The experimental setup for the proposed approach is shown in Fig. 6. A light emitting diode (LED) operating at 630 nm is used to transmit the signal through a water tank of dimensions 500 ${\times} $ 250 ${\times} $ 250 mm. An external light source placed above the water tank is used to mimic ambient light. The turbidity of the water is created by adding antacid, and data were collected at different levels of turbidity to characterize the performance of the proposed approach. To quantify the level of turbidity, we use the Beer-Lambert law ($I = {I_0}{e^{ - \alpha d}}$, where $\alpha $ is the Beer coefficient, d is the propagation distance in the turbid medium, for which we used $d = 10\,mm$, and ${I_0}$ and I are the initial and final intensities). The data were captured using a $3 \times 3$ camera array with G-192 GigE cameras with a focal length of 20 mm. The spatial resolution of the camera is $1600\; (H )\times 1200(V )$ with a pixel size of $4.5\,\mu m \times 4.5\,\mu m$.
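As a quick worked example of the Beer-Lambert quantification above, the following sketch computes the turbidity coefficient from measured intensities; the intensity values shown are illustrative, not measured values from the paper.

```python
# Hedged sketch: turbidity level from the Beer-Lambert relation I = I0*exp(-alpha*d),
# with d = 10 mm as stated in the text.
import numpy as np

def beer_lambert_alpha(I0, I, d_mm=10.0):
    """Return alpha in mm^-1 from the initial and final intensities."""
    return -np.log(I / I0) / d_mm

alpha = beer_lambert_alpha(I0=1.00, I=0.67)   # illustrative intensities, alpha ~= 0.040 mm^-1
```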

Fig. 6. 3 × 3 camera array for integral imaging capture stage used for our underwater experiments. We collected data with and without partial occlusion and low as well as high turbidity.

3D integral imaging helps us to recover the signal information under degradations such as partial occlusion and improves visualization. The collected data are preprocessed by center-cropping and resizing to 400 × 400 pixels. Figure 7 shows the experimentally collected clear-water image without partial occlusion and the corresponding 2D and 3D reconstructed images with partial occlusion and turbidity. Figure 7 demonstrates that 3D integral imaging enables improved signal recovery and visualization under degradations such as partial occlusion.

Fig. 7. Experimentally collected a) Sample clean image (2D), b) 2D image (central perspective) under partial occlusion and turbidity, and c) 3D Integral imaging (InIm) based reconstructed image under turbidity and partial occlusion.

We have considered two scenarios for analyzing the utility of the proposed approach: 1) under turbidity and in the presence of an external light source, and 2) under turbidity, partial occlusion, and an external light source. For training the neural network model, we used the 2D images (central perspective), with the light source at different positions to improve the generalization capability of the network. We first partitioned the data into 320 training images and 60 test images, both under varying levels of turbidity (α = 0.0025–0.040 mm−1). Once the data were collected, the parameters of the physical model were estimated using the training data as explained in Section 2.2.1. Once the ground truth parameter values were obtained, the distribution of the ground truth parameter was estimated, sampled, and fed to the neural network. The encoder takes the input clean image, depth image, and the sampled ground truth parameter along with the latent vectors and produces the degraded image. The decoder transforms the input degraded image into the recovered image. During the test stage, the decoder is used to obtain the clean image. The recovered image is subjected to total variation denoising to reduce generator artifacts; currently this is implemented outside the neural network model, but it could also be included in the network as a loss function. To demonstrate the efficacy of the approach, Fig. 8 shows sample degraded and corresponding recovered images under high turbidity and in the presence of external light.
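A minimal sketch of the total variation post-processing step mentioned above, using scikit-image, is shown below; the weight value is an illustrative choice rather than the one used in the experiments.

```python
# Hedged sketch: total-variation denoising of the decoder output to suppress
# generator artifacts (scikit-image).
from skimage.restoration import denoise_tv_chambolle

def postprocess(recovered_image, weight=0.05):
    """recovered_image: HxWx3 float array from the decoder; weight is illustrative."""
    return denoise_tv_chambolle(recovered_image, weight=weight, channel_axis=-1)
```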

Fig. 8. Experimentally collected data demonstrating a) Sample clean image ($\alpha = 0.0025\; m{m^{ - 1}}$), b)-c) two sample degraded images at the highest turbidity considered $({\alpha = 0.040\; m{m^{ - 1}}} )$, d) recovered image from (b), e) recovered image from (c).

The network has been trained with initial learning rates of $5 \times 10^{-3}$ and $5 \times 10^{-4}$ for the generator and the discriminator, respectively. The learning rate is decayed by a factor of 0.5 every 1000 iterations. The batch size used is 10. We used the Adam optimizer with parameters ${\beta _1} = 0.9$ and ${\beta _2} = 0.999$. From Fig. 8, we can see that the decoder is able to recover the degraded images even at high turbidity levels and in the presence of external light noise, demonstrating the efficacy of the proposed approach. We then experimentally collected 60 additional test images under various conditions, such as different levels of turbidity (α = 0.0025–0.040 mm−1), partial occlusion, and external light noise. The experimentally collected data were reconstructed using 3D integral imaging at the depth of interest, where the depth estimate is obtained using the minimum variance approach [33]. In addition, we compare the performance of the proposed approach between 2D and 3D integral imaging. We also compared the proposed approach with the previously proposed DnCNN-based approach using 3D integral imaging [4] for image recovery under degradations.
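The training configuration quoted above can be sketched as follows; the placeholder networks stand in for the SRGAN-style generator and discriminator and are not the architectures used in the paper.

```python
# Hedged sketch of the optimizer and learning-rate schedule described above.
import torch
import torch.nn as nn

# Placeholder modules standing in for the SRGAN-style generator/discriminator.
generator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.BatchNorm2d(64), nn.PReLU())
discriminator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.LeakyReLU(0.2))

opt_G = torch.optim.Adam(generator.parameters(), lr=5e-3, betas=(0.9, 0.999))
opt_D = torch.optim.Adam(discriminator.parameters(), lr=5e-4, betas=(0.9, 0.999))
# Decay both learning rates by 0.5 every 1000 iterations (step the schedulers
# once per training iteration).
sched_G = torch.optim.lr_scheduler.StepLR(opt_G, step_size=1000, gamma=0.5)
sched_D = torch.optim.lr_scheduler.StepLR(opt_D, step_size=1000, gamma=0.5)
```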

We have used the mean structural similarity index (SSIM) and mean square error (MSE) between the recovered and ground truth clean images to compare the performance of the different approaches, where the mean is computed over the entire test set. The 3D integral imaging-based approach performs better than 2D imaging under the experimental conditions considered. The results also suggest the utility of the proposed approach compared to the previously proposed supervised approach, even with less training data and unsupervised training. Figure 9 shows the degraded, corresponding recovered, and ground truth clean images using the proposed physics-informed deep learning-based approach. Thus, incorporating the physical model into the deep learning algorithm yields better and physically consistent results. Table 1 summarizes the results using the proposed approach and provides a comparison among methodologies such as the 2D imaging-based approach and the 3D integral imaging-based DnCNN.
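For completeness, the evaluation described above can be sketched with scikit-image metrics as follows; this is an illustration under the assumption that images are floats in [0, 1], not the authors' evaluation script.

```python
# Hedged sketch of the test-set evaluation: mean SSIM and MSE between recovered
# and ground-truth clean images.
import numpy as np
from skimage.metrics import structural_similarity, mean_squared_error

def evaluate(recovered_list, clean_list):
    """Both arguments: lists of HxWx3 float arrays in [0, 1]."""
    ssims = [structural_similarity(r, c, channel_axis=-1, data_range=1.0)
             for r, c in zip(recovered_list, clean_list)]
    mses = [mean_squared_error(r, c) for r, c in zip(recovered_list, clean_list)]
    return float(np.mean(ssims)), float(np.mean(mses))
```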

Fig. 9. Experimentally collected a) sample clean image, b) 2D degraded image (central perspective) with partial occlusion, external light source and turbidity (0.008$\; m{m^{ - 1}}$), c) corresponding recovered image using proposed approach from 2D degraded image d) corresponding recovered image by proposed approach using 3D Integral imaging.

Table 1. Comparison of the proposed physics-informed deep learning with other approaches. InIm: integral imaging; MSE: mean square error; SSIM: structural similarity index.

As mentioned earlier, the proposed physics-informed CycleGAN with the 3D integral imaging-based approach provides better performance than 2D imaging under a wide range of degradations such as partial occlusion; thus, the proposed approach with 3D integral imaging can be promising for such applications.

4. Conclusion

In summary, we have presented a physics-informed CycleGAN combined with 3D integral imaging for image restoration under physical degradations. The results suggest that the proposed approach is promising and could be extended, together with the 3D integral imaging technique, to a wide range of degradations such as low illumination, fog, and partial occlusion, with different classes of objects and under challenging conditions. Here, we have incorporated and enforced the physical model in the deep learning algorithm, but a useful extension would be to estimate multiple unknown parameters of the physical model in addition to recovering the image. Therefore, our future work will focus on fine-tuning, improving the parameter estimation, analyzing the sensitivity of performance to these parameters, and extending the approach to other degradations such as fog and low illumination.

Funding

National Science Foundation (2141473); Air Force Office of Scientific Research (FA9550-21-1-0333); Office of Naval Research (N000142212349, N000142212375).

Acknowledgments

We wish to acknowledge support under The Office of Naval Research (ONR) (N00014-22-1-2349, N00014-22-1-2375); Air-Force Office of Scientific Research (AFOSR) (FA9550-21-1-0333); National Science Foundation (NSF) # 2141473.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. F. Wu, W. Dong, T. Huang, et al., “Hybrid sparsity learning for image restoration: An iterative and trainable approach,” Signal Processing 178, 107751 (2021). [CrossRef]  

2. W. Dong, L. Zhang, G. Shi, et al., “Nonlocally centralized sparse representation for image restoration,” IEEE Trans. Image Process. 22(4), 1620–1630 (2013). [CrossRef]  

3. M. Hui, Y. Wu, W. Li, et al., “Image restoration for synthetic aperture systems with a non-blind deconvolution algorithm via a deep convolutional neural network,” Opt. Express 28(7), 9929 (2020). [CrossRef]  

4. K. Usmani, T. O’Connor, and B. Javidi, “Three-dimensional polarimetric image restoration in low light with deep residual learning and integral imaging,” Opt. Express 29(18), 29505 (2021). [CrossRef]  

5. L. Zhai, Y. Wang, S. Cui, et al., “A comprehensive review of deep learning-based real-world image restoration,” IEEE Access 11, 21049–21067 (2023). [CrossRef]  

6. G. Krishnan, R. Joshi, T. O’Connor, et al., “Optical signal detection in turbid water using multidimensional integral imaging with deep learning,” Opt. Express 29(22), 35691–35701 (2021). [CrossRef]  

7. P. Wani, K. Usmani, G. Krishnan, et al., “Lowlight object recognition by deep learning with passive three-dimensional integral imaging in visible and long wave infrared wavelengths,” Opt. Express 30(2), 1205 (2022). [CrossRef]  

8. R. Joshi, T. O’Connor, X. Shen, et al., “Optical 4D signal detection in turbid water by multi-dimensional integral imaging using spatially distributed and temporally encoded multiple light sources,” Opt. Express 28(7), 10477 (2020). [CrossRef]  

9. X. Shen, A. Carnicer, and B. Javidi, “Three-dimensional polarimetric integral imaging under low illumination conditions,” Opt. Lett. 44(13), 3230 (2019). [CrossRef]  

10. A. Stern, D. Aloni, and B. Javidi, “Experiments with three-dimensional integral imaging under low light levels,” IEEE Photonics J. 4(4), 1188–1195 (2012). [CrossRef]  

11. I. Moon and B. Javidi, “Three-dimensional visualization of objects in scattering medium by use of computational integral imaging,” Opt. Express 16(17), 13080–13089 (2008). [CrossRef]  

12. M. Martínez-Corral and B. Javidi, “Fundamentals of 3D imaging and displays: a tutorial on integral imaging, light-field, and plenoptic systems,” Adv. Opt. Photonics 10(3), 512–566 (2018). [CrossRef]  

13. B. Javidi, F. Pla, J. M. Sotoca, et al., “Fundamentals of automated human gesture recognition using 3D integral imaging: a tutorial,” Adv. Opt. Photonics 12(4), 1237–1299 (2020). [CrossRef]  

14. S.-H. Hong, J.-S. Jang, and B. Javidi, “Three-dimensional volumetric object reconstruction using computational integral imaging,” Opt. Express 12(3), 483–491 (2004). [CrossRef]  

15. N. Davies, M. McCormick, and L. Yang, “Three-dimensional imaging systems: a new development,” Appl. Opt. 27(21), 4520–4528 (1988). [CrossRef]  

16. C. B. Burckhardt, “Optimum parameters and resolution limitation of integral photography,” J. Opt. Soc. Am. 58(1), 71–76 (1968). [CrossRef]  

17. B. Javidi, R. Ponce-Díaz, and S.-H. Hong, “Three-dimensional recognition of occluded objects by using computational integral imaging,” Opt. Lett. 31(8), 1106–1108 (2006). [CrossRef]  

18. G. Lippmann, “Epreuves reversibles donnant la sensation du relief,” J. Phys. 7, 821–825 (1908). [CrossRef]  

19. C. D. Mobley, Light and water: radiative transfer in natural waters (Academic Press, 1994).

20. D. Akkaynak and T. Treibitz, “A revised underwater image formation model,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. 1, 6723–6732 (2018).

21. R. T. Tan, “Visibility in bad weather from a single image,” 26th IEEE Conf. Comput. Vis. Pattern Recognition, CVPR (2008).

22. K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2341–2353 (2011). [CrossRef]  

23. B. Shahriari, K. Swersky, Z. Wang, et al., “Taking the human out of the loop: A review of Bayesian optimization,” Proc. IEEE 104(1), 148–175 (2016). [CrossRef]  

24. A. Candelieri, “A gentle introduction to Bayesian optimization,” Proc. Winter Simulation Conference (WSC), 1–16 (2021).

25. W. Gan, Z. Ji, and Y. Liang, “Acquisition functions in Bayesian optimization,” Proc. 2021 2nd Int. Conf. on Big Data, Artificial Intelligence and Software Engineering (ICBASE), 129–135 (2021).

26. H. Bozdogan, “Model selection and Akaike’s Information Criterion (AIC): The general theory and its analytical extensions,” Psychometrika 52(3), 345–370 (1987). [CrossRef]  

27. S. Konishi and G. Kitagawa, Information Criteria and Statistical Modeling, 1st ed. (Springer Publishing Company, Incorporated, 2007).

28. C. B. Do and S. Batzoglou, “What is the expectation maximization algorithm?” Nat. Biotechnol. 26(8), 897–899 (2008). [CrossRef]  

29. A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977). [CrossRef]  

30. J. Y. Zhu, T. Park, P. Isola, et al., “Unpaired image-to-image translation using cycle-consistent adversarial networks,” Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 2242–2251 (2017).

31. X. Chen, Y. Duan, R. Houthooft, et al., “InfoGAN: interpretable representation learning by information maximizing generative adversarial nets,” Adv. Neural Inf. Process. Syst. (NIPS) 29, 2172–2180 (2016).

32. C. Ledig, L. Theis, F. Huszár, et al., “Photo-realistic single image super-resolution using a generative adversarial network,” Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 4681–4690 (2017).

33. M. Daneshpanah and B. Javidi, “Profilometry and optical slicing by passive three-dimensional imaging,” Opt. Lett. 34(7), 1105–1107 (2009). [CrossRef]  
