Optica Publishing Group

Transcending conventional snapshot polarimeter performance via neuromorphically adaptive filters

Open Access

Abstract

A channeled Stokes polarimeter that recovers polarimetric signatures across the scene from the modulation-induced channels is preferable for many polarimetric sensing applications. Conventional channeled systems that isolate the intended channels with low-pass filters are sensitive to channel crosstalk, and the filters must be optimized against the bandwidth profile of the scene of interest before being applied to each particular scene to be measured. Here, we introduce a machine learning based channel filtering framework for channeled polarimeters. The machines are trained to adaptively predict anti-aliasing filters according to the distribution of the measured data. A conventional snapshot Stokes polarimeter is simulated to demonstrate our machine learning based channel filtering framework. Finally, we demonstrate the advantage of our filtering framework by comparing the reconstructed polarimetric images with those from the conventional image reconstruction procedure.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Optical polarimeters have emerged in recent years as tools that can provide valuable information about object shape, surface quality, and orientation in applications ranging from biomedical optics to target detection to astronomy [1,2]. More recently, snapshot polarimeters have been developed that modulate the irradiance of the optical field in a manner that depends on the polarization distribution as a function of space, wavelength, or angle of incidence. These modulations generate a set of channels in the corresponding frequency domain. By filtering the information in each of these channels and demodulating the image, the polarization properties can be reconstructed from a single measurement or frame of data.

The primary limitation of snapshot polarimetry is the inherent bandwidth reduction, since the full resolution of the system must be shared to measure multiple pieces of information simultaneously. Furthermore, wide-bandwidth image data creates the challenge of channel crosstalk, which leads to artifacts in the reconstructed data. Implementation of the snapshot strategy depends intimately on the choice of filter used to extract the channels in the frequency domain. Traditional low-pass filters are generally preferred to prevent this crosstalk, but they also result in poor high-frequency performance.

Recent studies have proposed neural network-based interpolation schemes for demosaicing DoFP image data that apply machine learning (ML) directly in the spatial domain [3,4]. Results from these studies show improved performance over conventional bicubic interpolation and low-pass filtering methods. In this paper, we introduce the concept of neuromorphically adaptive filtering for extraction of the information in the polarization-carrying channels that instead operates in the spatial frequency domain. We train a deep neural network (DNN) to estimate the ideal ratio filter (IRF) and the complex ideal ratio filter (cIRF) for any particular polarized image, both of which are preferable to conventional Planck-taper low-pass filters (PTFs) for image reconstruction. The predicted filters are tested on novel images and demonstrate image reconstruction performance that exceeds that of bicubic interpolation and of PTFs optimized on the same data sets, and even outperforms the specific PTFs optimized a posteriori for each individual test image. Bicubic interpolation and more sophisticated interpolation strategies have been discussed elsewhere [5] and will not be detailed in this paper; the classic bicubic interpolation is adopted here and its results are used as a benchmark for comparing our method with past and future studies.

2. Polarimeter background

Traditional photon-based optical detectors cannot directly measure polarization information. In order to construct an imaging polarimeter, one must design a system that makes multiple measurements of the optical irradiance through different polarization analyzers. The final reconstructed polarization scene is then computed through post-processing of these measurements. There are two broad strategies to accomplish these multiple measurements: 1) systems that divide the optical energy into multiple copies (wavefront division polarimeters), analyzing each copy through a separate set of polarization optics; and 2) systems that modulate the intensity of the optical information in a polarization-dependent manner, measure that modulated signal using a single detector or detector array, and reconstruct the data computationally. This modulation creates a set of channels in the Fourier domain, and we refer to these instruments as channeled polarimeters here.

2.1 Wavefront division polarimeters

There are several classic designs for wavefront division instruments. The most widely used designs are the Division of Amplitude (DoAMP) systems that use beamsplitters or some other method to divide the light and relay it to multiple detectors/cameras [6,7]. These instruments can be made to perform at the native frame rate and resolution of the underlying focal plane arrays (FPAs). However, they require precise image alignment at the sub-pixel level, and they are highly sensitive to vibrations. These factors, coupled with differential aberrations in the optical paths, often limit their overall performance.

More recently, it has been proposed to replace the multiple beamsplitters with a single metasurface that can divide the light into an arbitrary number of nearly-arbitrary polarization states [8]. This method shows great promise because the monolithic device reduces the volume of the system and some of the associated aberrations. However, current metasurface designs tend to be narrow-band, introducing significant chromatic variability; even so, they hold great potential, especially for laser-illuminated applications.

Related to the DoAMP system are polarimeters that perform this division at the pixel level on the FPA. Systems have been conceived and/or tested that accomplish this with birefringent crystals integrated with the FPA [9] and with multi-layer detector stacks using gratings [10]. Note that this class of instrument is different from the Division of Focal Plane (DoFP) polarimeter discussed below.

The second major class of wavefront division polarimeter is the Division of Aperture (DoAP) instrument [11]. These devices use a lens array to subdivide a pupil plane. These devices then relay these sub-apertures to different FPAs or, more commonly, different portions of a single FPA. DoAP systems are generally more compact than DoAMP systems, making them less susceptible to vibration, but they still suffer from alignment challenges. Further, differential optical aberrations arising from the use of different portions of the pupil usually prove to be the ultimate limit of performance.

2.2 Channeled polarimeters

Unlike wavefront division polarimeters, channeled polarimeters use a single detector array to measure the optical field. Depending on the system design, the polarimeter has resolution in some combination of the independent variables (e.g. space [12,13], wavelength [14], or angle of incidence [15]). These systems generally resolve the scene as a function of time as well. The optical components of the system are designed to have polarization properties that are themselves functions of these variables. The Stokes parameters are written in vector form as

$$\underline{\textbf{S}} = \begin{bmatrix} s_0\\s_1\\s_2\\s_3 \end{bmatrix} = \begin{bmatrix} I_H+I_V \\ I_H-I_V \\ I_{45^\circ}-I_{135^\circ} \\ I_{LCP}-I_{RCP} \end{bmatrix}$$
where $I_H, I_V, I_{45^\circ }$ and $I_{135^\circ }$ represent the irradiance of horizontal, vertical, $45^\circ$, and $135^\circ$ polarized light, and $I_{LCP}$ and $I_{RCP}$ represent the irradiance of left- and right-circularly polarized light. The measured irradiance after modulation by the system is
$$I_0(\vec{\theta})=\underline{\textbf{A}}^{\mathrm{T}}\cdot\underline{\textbf{S}}_{in}=\sum_{i=0}^{3}a_i(\vec{\theta})\cdot s_i(\vec{\theta}),$$
where $\underline {\textbf {A}}=[a_0,a_1,a_2,a_3]^{\mathrm{T}}$ is the system analyzer vector which describes the periodic modulation imposed on the incident light and $\vec {\theta }=\{x,y,t,etc.\}$ represents the set of independent variables over which the polarimeter modulates. By taking the Fourier transform, Eq. (2) becomes
$$\tilde{I}_0(\tilde{\vec{\theta}})=\tilde{\underline{\textbf{A}}}^{\mathrm{T}}\ast \tilde{\underline{\textbf{S}}}_{in}=\sum_{i=0}^{3}\tilde{a}_i(\tilde{\vec{\theta}})\ast \tilde{s}_i(\tilde{\vec{\theta}}),$$
where the tilde indicates the Fourier transform, the $*$ symbol denotes convolution, and $\tilde {\vec {\theta }}=\{\xi ,\eta ,\nu ,etc.\}$ is the frequency counterpart of $\vec {\theta }$. When the modulation functions are periodic, the analyzer vector becomes a set of $\delta$-functions after the Fourier transform. The complete set of $\delta$-functions composes the channel structure of the polarimeter. The Fourier-transformed Stokes parameters are assigned to these channels through convolution, and their distribution among the channels can be described with the matrix $\underline {\underline {\textbf {Q}}}$. Channeled polarimeters reconstruct the polarization images either by directly interpolating the encoded images or by taking the inverse Fourier transform of the Fourier-transformed Stokes parameters unmixed from the filtered channels using the $Q$-matrix formalism developed by Alenin and Tyo [16]. In a previous study [17], these two image reconstruction approaches were unified mathematically, and it was shown that the demosaicing approach is equivalent to the channel filtering approach, but with potential artifacts in the high-frequency parts of the reconstructed image due to the filters implied by that method. When the modulation occurs in domains other than time, the polarimeter is termed a snapshot polarimeter, because the desired polarization parameters can be measured from a single temporal frame or measurement. Eqs. (2) and (3) are written as continuous functions, but real systems invariably sample the data and process them using some variant of the discrete Fourier transform [17].
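Restricted to the linear Stokes parameters (the form used later with the database's four analyzer images), Eq. (1) can be sketched directly in Python:

```python
import numpy as np

def linear_stokes(i0, i45, i90, i135):
    """Linear Stokes parameters from four analyzer irradiances, per Eq. (1).
    s3 is omitted: the four linear measurements carry no circular information."""
    s0 = i0 + i90      # I_H + I_V
    s1 = i0 - i90      # I_H - I_V
    s2 = i45 - i135    # I_45 - I_135
    return np.stack([s0, s1, s2])

# Horizontally polarized light: all power passes the 0-degree analyzer,
# half passes the 45- and 135-degree analyzers.
s = linear_stokes(1.0, 0.5, 0.0, 0.5)  # -> [1.0, 1.0, 0.0]
```

The same function applies elementwise to full intensity images, yielding Stokes parameter images.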

In this paper we are primarily concerned with passive polarimeters that measure the Stokes parameters [1]. Instruments that measure only the linear Stokes parameters must make at least three measurements, and full-Stokes polarimeters must make at least four. However, the concepts we discuss here are directly applicable to active polarimeters that measure any number of the Mueller matrix elements. Active polarimeters have been made with a small number of measurements to maximize image contrast [20,21], but 16 or more measurements are needed to reconstruct the full Mueller matrix [22]. Partial Mueller polarimeters that reconstruct a particular subset of the Mueller matrix degrees of freedom based on the task at hand have also been considered [23–26]. Whereas the wavefront division and snapshot polarimeters are both viable options for passive instruments, the wavefront division system becomes increasingly impractical as the number of required measurements increases.

3. Channel filtering schemes

In this work, the neuromorphically adaptive filters are compared with the conventional Planck-taper filters (PTFs) and bicubic interpolation, and the comparison is carried out through simulation of a snapshot polarimeter. The truth images used in this study are presented in Fig. 1. The simulated snapshot system is a conventional $2\times 2$ MPA-based DoFP polarimeter. Recently other mosaic patterns have been proposed that are predicted to have superior performance [27], but we only consider the classic design of Chun [12] because it remains the only layout that is commercially available. An example of the simulated DoFP readout image is shown in Fig. 2(A) (see Sect. 4.1 below for details of data preparation).


Fig. 1. Polarized images used for the simulation work, from the public database published by Lapray, Gendre, Foulonneau, and Bigué [18]. The applied colormap, developed by A. Kruse [19], indicates the polarization signature across the images: the angle on the color wheel indicates the angle of polarization, and the radial distance indicates the degree of linear polarization.


Fig. 2. Example of a simulated DoFP raw image (A) and its Fourier transform (B)


The Fourier transform of the DoFP readout image is presented in Fig. 2(B), where the $s_0$ channel sits at the center of the Nyquist square, the $s_1+s_2$ channels sit at $\eta = \pm 0.5$ cycles per pixel, and the $s_1-s_2$ channels sit at $\xi = \pm 0.5$ cycles per pixel. The $s_0$ channel carries the strongest signal, and its high frequencies overlap with the signals in the $s_1\pm s_2$ channels. The key step in modulated polarimetry discussed in this paper is the choice of the filters used to extract these channels from the mixed data for reconstruction.
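The channel layout just described can be verified numerically. The sketch below uses a hypothetical flat scene (constant Stokes parameters), builds the $2\times2$ MPA modulation of Eq. (2) with the analyzer of Eq. (8), and shows the delta-like channels of Eq. (3) appearing at the center and edge midpoints of the Nyquist square:

```python
import numpy as np

# Hypothetical flat scene: constant Stokes parameters over a 16x16 patch.
N = 16
m, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
s0, s1, s2 = 1.0, 0.5, 0.2

# DoFP readout: I = (1/4)[2 s0 + cos(pi m)(s1+s2) + cos(pi n)(s1-s2)]
I = 0.25 * (2 * s0 + (np.cos(np.pi * m) + np.cos(np.pi * n)) * s1
            + (np.cos(np.pi * m) - np.cos(np.pi * n)) * s2)

F = np.fft.fft2(I) / I.size
# s0 channel at DC (~0.5), s1+s2 at 0.5 cyc/px along m (~0.175),
# s1-s2 at 0.5 cyc/px along n (~0.075); all other bins are ~0.
peaks = abs(F[0, 0]), abs(F[N // 2, 0]), abs(F[0, N // 2])
```

For a real scene the Stokes parameters vary spatially, so each delta function spreads into a channel of finite bandwidth, producing the overlap visible in Fig. 2(B).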

3.1 Conventional filtering

A key element of the demodulation steps needed to reconstruct the Stokes parameters from the Fourier representation in Eq. (3) is a filter that can isolate the information in each of the channels surrounding the locations of the delta-functions in the frequency domain. All systems in the literature known to us use some form of a conventional low-pass filter to accomplish this task. The earliest snapshot polarimeters were DoFP instruments that integrated a micropolarizer array (MPA) with the FPA [12], and they considered each $2\times 2$ set of pixels as a “super pixel.” These reconstruction strategies were inherently limited by instantaneous field-of-view (IFOV) errors, since the pixels measuring the different polarization properties also corresponded to different locations in the image. Later Tyo, LaCasse and Ratliff [28] made the link between DoFP polarimeters and other classes of spatially modulated systems. This connection brought in the concept of tailoring more sophisticated filters for reconstruction [17,29] and more sophisticated MPA tilings for both linear [27,30] and full-Stokes polarimeters [31].

There are many different specific filter shapes, but for reference here we will use the PTF family. PTFs are commonly used because they are infinitely differentiable, they are easily extendable to multiple dimensions, and the transition from 0 to 1 in each dimension is controlled by a single parameter [32]. The PTF transfer function is defined as

$$F(r,\epsilon) = \frac{1}{1+\exp(\epsilon W[\frac{1}{W-r}+\frac{1}{(1-\epsilon)W-r}])},$$
where $W$ is the width of the filter, $r$ is the distance to the filter center in the relevant frequency coordinate, and $\epsilon$ controls the slope. An example of a two-dimensional PTF is presented in Fig. 3(A). As Fig. 3(A) makes clear, this low-pass filter is used to control the bandwidth of each of the polarization-carrying channels around the delta functions in Eq. (3). The width and slope of the filter are generally chosen to maximize how much channel information is used in the reconstruction.
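Equation (4) describes the taper region only; away from the taper the filter is understood to be 1 (passband) or 0 (stopband). A minimal sketch under that reading, assuming the taper occupies $(1-\epsilon)W < r < W$:

```python
import numpy as np

def planck_taper(r, W, eps):
    """Planck-taper low-pass filter of Eq. (4): unity inside (1-eps)*W,
    zero beyond W, with a smooth C-infinity transition in between."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    F = np.zeros(r.shape)
    F[r <= (1 - eps) * W] = 1.0              # flat passband
    taper = ((1 - eps) * W < r) & (r < W)    # transition region of Eq. (4)
    rt = r[taper]
    z = eps * W * (1.0 / (W - rt) + 1.0 / ((1 - eps) * W - rt))
    F[taper] = 1.0 / (1.0 + np.exp(z))
    return F if F.size > 1 else F[0]
```

For the two-dimensional filters of Fig. 3(A), $r$ would be the radial distance $\sqrt{\xi^2+\eta^2}$ from the channel center in the frequency plane.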


Fig. 3. Examples of the filters used in this paper: (A) PTF, (B) IRF, (C) cIRF (real part), (D) cIRF (imaginary part).


3.2 Neuromorphically adaptive filtering

Drawing from ML-based signal separation techniques [33], a neuromorphically adaptive filtering scheme for channeled polarimetry is presented in Fig. 4. Here we adopt a deep neural network (DNN) to learn and adaptively predict filters that can be used for reconstruction of any particular image. The framework we adopt can be divided into two stages: the training stage and the application stage. As Fig. 4 shows, the Fourier-transformed DoFP readout images are fed into the DNN, and the DNN predicts filters for extracting the intended channels in the frequency domain. The DNN in this work is constructed on the UNet architecture, which is shown schematically in Fig. 5. Compared with other encoder-decoder structures, the UNet has additional skip connections between its encoder and decoder sides, allowing the transmission of detailed information for better image prediction [34].


Fig. 4. Neuromorphically adaptive channel filtering framework


Fig. 5. UNet architecture used in this study.


A key design choice for the neuromorphically adaptive channel filtering scheme is the choice of the ideal target filter. Two non-parameterized filters are adopted as the training target of the DNN in this study: the ideal ratio filter (IRF) and the complex ideal ratio filter (cIRF). The IRF is defined as

$$\textrm{IRF} (\tilde{\vec{\theta}}) = \left(\frac{|S(\tilde{\vec{\theta}})|^2}{|S(\tilde{\vec{\theta}})|^2+|N(\tilde{\vec{\theta}})|^2}\right)^\beta,$$
where $S$ denotes the spectrum of the desired channel ($s_0$, $s_1+s_2$, or $s_1-s_2$ accordingly), $N$ denotes the spectrum of the frequency domain excluding the desired channel, and $\beta$ is an empirical parameter adjusting the scale of the filter, commonly chosen as 0.5. Here, however, $\beta$ is set to 1 because the resulting filter is more Wiener-like and provides better results (data not shown). An example of the IRF is presented in Fig. 3(B).
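Equation (5) transcribes directly into code; the small guard constant against empty frequency bins is our addition:

```python
import numpy as np

def ideal_ratio_filter(S, N, beta=1.0):
    """Ideal ratio filter of Eq. (5): the fraction of power in each frequency
    bin that belongs to the desired channel, raised to the power beta."""
    Ps = np.abs(S) ** 2                    # power of the desired channel
    Pn = np.abs(N) ** 2                    # power of everything else
    return (Ps / (Ps + Pn + 1e-12)) ** beta
```

With $\beta=1$ this is the classic Wiener gain; $\beta=0.5$ would give the square-root (magnitude-ratio) form common in the source-separation literature.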

The complex ideal ratio filter (cIRF) is defined as

$$\textrm{cIRF} (\tilde{\vec{\theta}}) = \frac{C_r(\tilde{\vec{\theta}})S_r(\tilde{\vec{\theta}})+C_i(\tilde{\vec{\theta}})S_i(\tilde{\vec{\theta}})}{C_r(\tilde{\vec{\theta}})^2+C_i(\tilde{\vec{\theta}})^2} + \frac{C_r(\tilde{\vec{\theta}})S_i(\tilde{\vec{\theta}})-C_i(\tilde{\vec{\theta}})S_r(\tilde{\vec{\theta}})}{C_r(\tilde{\vec{\theta}})^2+C_i(\tilde{\vec{\theta}})^2}i$$
where $C$ denotes the spectrum of the complete channel structure including all channels, $S$ denotes the spectrum of the desired channel, and the subscripts $r$ and $i$ denote the real and imaginary parts of the signals.

Compared with IRFs, ideal cIRFs are able to reconstruct the filtered signal perfectly when the true signal is known. However, unlike the IRFs, the values of the cIRFs are not bounded to $[0,1]$. For the sake of network training, the cIRFs are compressed with the hyperbolic tangent function [35],

$$\textrm{CF}_x = K \frac{1-\exp({-}C\cdot F_x)}{1+\exp({-}C\cdot F_x)},$$
where $\textrm {CF}_x$ stands for the compressed filter, the subscript $x$ is $r$ or $i$ depending on whether the real or imaginary part of the filter is compressed, $K$ defines the interval over which the values of the compressed filter are distributed, and $C$ controls the shape of that distribution. An example of the cIRF is presented in Fig. 3(C,D), where both the real and imaginary parts of the filter are compressed to $[0,1]$.
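Expanding the fractions in Eq. (6) shows that the cIRF is simply the complex ratio $S/C$; together with the compression of Eq. (7), a sketch (the guard constant is our addition):

```python
import numpy as np

def cirf(S, C):
    """Complex ideal ratio filter of Eq. (6). Algebraically this is
    S * conj(C) / |C|^2, i.e. the complex ratio S/C."""
    return S * np.conj(C) / (np.abs(C) ** 2 + 1e-12)

def compress(Fx, K=1.0, C=1.0):
    """Hyperbolic-tangent compression of Eq. (7), applied separately to the
    real and imaginary parts; outputs lie in the open interval (-K, K)."""
    return K * (1.0 - np.exp(-C * Fx)) / (1.0 + np.exp(-C * Fx))
```

Note that Eq. (7) is algebraically $K\tanh(C F_x/2)$, which explains the paper's description of it as a hyperbolic-tangent compression.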

3.3 Filter training

Because of the small number of available images, we trained the DNN by subdividing the training images into overlapping 32-pixel $\times$ 32-pixel tiles (see Sect. 4.1). The ideal IRFs and cIRFs are computed for each tile, and these ideal filters are used as the targets to train the DNNs. The $32\times 32$ tiles overlap because only the central $16\times 16$ pixels of each reconstructed tile are used when the images are stitched back together, which reduces the edge artifacts from the frequency-domain deconvolution.
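The overlap-and-crop bookkeeping can be sketched as follows; the stride-16 grid and the border handling are our assumptions, as the paper does not specify them:

```python
import numpy as np

def extract_tiles(img, tile=32, keep=16):
    """Slide a tile-sized window over the image with stride `keep`,
    so neighboring tiles overlap by (tile - keep) pixels."""
    return {(r, c): img[r:r + tile, c:c + tile]
            for r in range(0, img.shape[0] - tile + 1, keep)
            for c in range(0, img.shape[1] - tile + 1, keep)}

def stitch_center_crops(tiles, shape, tile=32, keep=16):
    """Reassemble an image, keeping only the central keep x keep pixels of
    each (reconstructed) tile to suppress deconvolution edge artifacts."""
    out = np.zeros(shape)
    pad = (tile - keep) // 2
    for (r, c), t in tiles.items():           # (r, c): top-left tile corner
        out[r + pad:r + pad + keep,
            c + pad:c + pad + keep] = t[pad:pad + keep, pad:pad + keep]
    return out
```

A roundtrip through `extract_tiles` and `stitch_center_crops` reproduces the interior of the image exactly; in the actual pipeline each tile would be filtered and reconstructed between the two calls.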

In the past, PTFs have been optimized on simulated scenes modeled using power spectral densities (PSDs) based on the statistical properties of the actual scenes to be measured [36,37]. Any mismatch between the assumed polarimetric PSDs and the statistics of the scene actually measured therefore limits the performance of the resulting PTFs. In this work, to allow a fair comparison between the ML-based and conventional filtering approaches, the PTF is optimized with the same data set used to train the DNN; the resulting filter is referred to as the average PTF. The image quality metric (see Sect. 4.2 below) was optimized using particle swarm optimization to choose the filter parameters producing the best average results.

4. Simulation study

4.1 Polarimetric data

The data used in this study are obtained from a laboratory-based division of time (DoT) polarimeter [18]. The DoT-measured Stokes parameters are then used to simulate the output of a conventional DoFP polarimeter based on a $2\times 2$ linear MPA [12]. Though it has been shown recently that the $2\times 2$ tiling is significantly sub-optimal [27,30], we use it here because it remains the only MPA architecture that is commercially available.

The database is provided with open access online [18] and is composed of linear polarization intensity images ($I_{0^\circ }$, $I_{45^\circ }$, $I_{90^\circ }$, $I_{135^\circ }$) in six spectral bands from ten different scenes. In this work, the polarimetric data from the near infrared (NIR) band are chosen as the ground truth data, and the linear Stokes parameters across each scene are calculated using Eq. (1). Given the analyzer vector of the conventional MPA

$$\underline{\textbf{A}}(m,n)=\frac{1}{4}\begin{bmatrix} 2 & {\cos{\pi m} + \cos{\pi n}} & {\cos{\pi m} - \cos{\pi n}} & 0 \end{bmatrix}^\mathrm{T},$$
where $m,n$ are spatial indices of the micropolarizers, the readout of the DoFP can be calculated using Eq. (2). The ten simulated DoFP raw images are split into a training dataset of eight images and a testing dataset of two images. These images are then subdivided into overlapping $32\times 32$ tiles for DNN training and testing.
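Combining Eqs. (2) and (8), the simulation step reduces to a short function (a sketch, taking $m$ as the row index and $n$ as the column index; the ground-truth Stokes images are arrays of equal shape):

```python
import numpy as np

def dofp_readout(s0, s1, s2):
    """DoFP readout I(m,n) = A(m,n)^T . S(m,n) for the 2x2 MPA of Eq. (8);
    a linear MPA does not analyze s3, so it does not appear."""
    m, n = np.meshgrid(np.arange(s0.shape[0]), np.arange(s0.shape[1]),
                       indexing="ij")
    cm, cn = np.cos(np.pi * m), np.cos(np.pi * n)
    return 0.25 * (2.0 * s0 + (cm + cn) * s1 + (cm - cn) * s2)
```

On a uniform scene the four pixels of each super-pixel evaluate to $(s_0+s_1)/2$, $(s_0+s_2)/2$, $(s_0-s_2)/2$, and $(s_0-s_1)/2$, i.e. the $0^\circ$, $45^\circ$, $135^\circ$, and $90^\circ$ analyzer readings, confirming the mosaic layout.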

4.2 Image quality metric

To compare the predicted filters and the optimized PTFs, the quality of the reconstructed polarization images is used to evaluate the filters. Here we adopt the gradient magnitude similarity deviation (GMSD) model as the image quality metric. Compared with state-of-the-art full reference image quality metrics such as SSIM, the GMSD metric performs better in both accuracy and efficiency [38]. In addition, aside from providing a reliable image quality assessment index, the GMSD model also provides the gradient magnitude similarity (GMS) map to illustrate the portions of the image with better and worse reconstruction. The GMS is defined as

$$\textrm{GMS}(i)=\frac{2G_r(i)G_d(i)+c}{G_r(i)^2+G_d(i)^2+c} ,$$
where $c$ serves as a numerical stability term, $G_r$ and $G_d$ are the gradient magnitude maps, and the subscripts $r$ and $d$ indicate the ground truth and reconstructed images, respectively. $G_r$ and $G_d$ are calculated as:
$$\begin{aligned}G_r(i)&=\sqrt{G_{x,r}(i)^2+G_{y,r}(i)^2}\\G_d(i)&=\sqrt{G_{x,d}(i)^2+G_{y,d}(i)^2} , \end{aligned}$$
where $G_{x,\cdot}$ and $G_{y,\cdot}$ denote the image gradients in the $x$- and $y$-directions. The GMSD index is defined as the standard deviation of the GMS map:
$$\textrm{GMSD}=\sqrt{\frac{1}{N}\sum_{i=1}^N(\textrm{GMS}(i)-\overline{\textrm{GMS}})^2} ,$$
where $\overline {\textrm {GMS}}$ denotes the mean value of the GMS map of the image. Since the GMSD is a standard deviation score, a lower index indicates better image quality.
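Equations (9)–(11) translate directly into code. The sketch below uses `np.gradient` as the gradient operator and an illustrative stability constant; the original GMSD model uses $3\times3$ Prewitt kernels and a tuned $c$ [38]:

```python
import numpy as np

def gmsd(ref, dist, c=0.0026):
    """GMSD of Eqs. (9)-(11): the standard deviation of the gradient
    magnitude similarity (GMS) map between reference and distorted images."""
    gy_r, gx_r = np.gradient(ref.astype(float))
    gy_d, gx_d = np.gradient(dist.astype(float))
    Gr = np.hypot(gx_r, gy_r)                          # Eq. (10), reference
    Gd = np.hypot(gx_d, gy_d)                          # Eq. (10), distorted
    gms = (2.0 * Gr * Gd + c) / (Gr ** 2 + Gd ** 2 + c)  # Eq. (9)
    return float(np.std(gms))                          # Eq. (11)
```

For identical images the GMS map is 1 everywhere and the GMSD is 0; any spatially varying distortion makes the map nonuniform and the score positive.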

5. Results and discussion

The GMSD and PSNR results of the stitched images recovered from the ideal/average PTFs, the ideal/predicted IRFs and cIRFs, and bicubic interpolation are plotted in Fig. 6, including both the training set (images $\#$1–8) and the testing set (images $\#$9 and 10). The PSNR results are provided for convenient cross comparison between our methods and others, but we believe GMSD is a better metric than PSNR, as explained in Sect. 4.2. Although the results for images $\#$1–8 come from training images, they are plotted in Fig. 6 to demonstrate that our DNNs are not overfitting: there is no obvious gap between the performance on the training set and on the testing set reconstructed with the predicted IRFs and cIRFs. The PSNR and GMSD reconstruction accuracy results for the testing set are listed in Tables 1 and 2, respectively. In these tables, the performance of the four image reconstruction approaches is compared (the bicubic results serve as a baseline for further comparison with other image reconstruction methods such as the PDCNN method [3]) and the best results are highlighted. Here the ideal PTF results are obtained by optimizing a PTF for each subdivided image tile on that particular tile, the average PTF is a single PTF optimized on the whole training dataset to achieve the best performance on average, and the ideal IRFs/cIRFs are IRFs/cIRFs calculated from the ground truth data. The testing set reconstructed with the four approaches is presented in Figs. 7 and 8 (reconstructed $s_0$ and DoLP images), where the GMS maps are also presented to show the quality of the reconstructed images; dark areas in the GMS maps indicate distorted reconstruction. Four randomly picked sub-image tiles highlighted in Fig. 7 are magnified in Fig. 9 to provide a close-up examination of the image reconstruction process.


Fig. 6. PSNR (left column) and GMSD (right column) metrics of the stitched images reconstructed by the PTF (blue), IRF (red), cIRF (orange), and bicubic interpolation (green) for the Stokes parameters (rows 1–3) and DoLP (row 4). Image $\#$s are consistent with Fig. 1: images 1–8 are the training set, whereas images 9–10 are the testing set.


Fig. 7. Stitched $s_0$ images and the GMS maps of the test images. Dark areas in GMS maps have worse reconstruction. Rows A and B show Image 9, Rows C and D show Image 10. Small red boxes are the ROIs that are examined in detail in Fig. 9. ROI #1 (left) and 2 (right) are from Image 9, ROI #3 (left) and 4 (right) are from image 10.


Fig. 8. Stitched DoLP images and the corresponding GMS maps of the test images. Rows A and B show Image 9, Rows C and D show Image 10. Small red boxes are the ROIs that are examined in detail in Fig. 9.


Fig. 9. The ROIs from Fig. 7. A: Ground truth ($s_i(x,y)$); B: Demodulated (baseband) channel structure before filtering ($s_0$ channel below $s_0$ image, $s_1+s_2$ channel below $s_1$ image, $s_1-s_2$ channel below $s_2$ image; C/D/E/F: Ideal (a posteriori) filters (PTF/IRF/compressed cIRF real/imaginary); G/H/I/J: average or predicted filters from training data (PTF/IRF/compressed cIRF real/imaginary); K/L/M/N: Reconstructions using the filters in G/H/I+J/bicubic; O/P/Q/R: GMS error maps for K/L/M/N (dark areas indicate poor reconstruction); All image tiles are normalized to [0,1]; $s_1$ and $s_2$ have the same normalization.


Table 1. PSNR scores of the reconstructed test images


Table 2. GMSD scores of the reconstructed test images

From the results listed in Tables 1 and 2, it can first be concluded that the DNN-predicted IRFs and cIRFs show a clear superiority in image reconstruction compared with the optimized PTFs and bicubic interpolation, especially for the weak $s_1$ and $s_2$ signals. This advantage is demonstrated intuitively by the GMS maps in Figs. 7 and 8. As Fig. 7 shows, the bicubic interpolation method creates more artifacts in the $s_0$ images than the other three approaches. This is because bicubic interpolation is equivalent to a sub-optimal filter for the $s_0$ channel, as explained in previous work by LaCasse, Chipman and Tyo [17]. However, the $s_0$ images reconstructed by the optimized PTFs are not necessarily worse than those from the ML-based filters: because the $s_0$ signal is the dominant signal in the Fourier domain, the residual edge artifacts in the reconstructed image tiles impact the final results, and the DNNs may not be able to predict a favorable filter for every tile in the test dataset. As for reconstructing the linear Stokes parameters, the DNN-predicted filters perform better than both the PTFs and the bicubic interpolation, as illustrated by the reconstructed DoLP images in Fig. 8.

In order to understand how our framework produces better results, we compare the ideal and predicted filters for the highlighted tiles in Fig. 9. These results reveal two facts about the ML-based filters that make them superior to the PTFs. First, consider the ideal filters (Fig. 9 rows (C–F)) for the channel structures. The channel structures shown in Fig. 9 row (B) are the logarithmic magnitudes of the Fourier transforms of the simulated DoFP readout images of the ground truth images (Fig. 9 row (A)). IRFs (Fig. 9 row (D)) and cIRFs (Fig. 9 rows (E, F)) have the potential to outperform ideal PTFs (Fig. 9 row (C)) because they have much wider bandwidth. A PTF that attempts to use information more than 0.25 cycles/pixel away from the center of a channel runs a severe risk of channel crosstalk. However, the IRF and cIRF defined by Eqs. (5) and (6) can, in fact, avoid channel crosstalk because they can separate the (known) channel of interest from the (known) whole-channel structure. This “anti-aliasing” nature of the ML-based filters means that their upper bound on performance is necessarily superior to that of conventional low-pass filters.

The second important reason for the performance of our framework is its neuromorphically adaptive nature. Biological neural systems are able to use past experience to process novel stimuli rather efficiently. Examining the predicted filters (Fig. 9 rows (G–J)) for each of the test image tiles, the prediction demonstrates clear adaptivity (compare Fig. 9 row (D) with row (H) for the IRF, and rows (E, F) with rows (I, J) for the cIRF). The shapes and extents of the predicted filters, while not exactly matching the corresponding ideal filters, clearly adapt to the scene. This allows our framework to more closely approach the ideal performance in each case. In contrast, the same average PTF (Fig. 9 row (G)) optimized with the training data is applied to all the testing data. The reconstructed sub-image tiles are presented in Fig. 9. Of specific interest are rows K–N, which compare the PTF, IRF, cIRF, and bicubic interpolation results. The reconstruction accuracy of these sub-images is illustrated by the GMS tiles in rows O–R, where the proposed filters excel.

Another straightforward observation from the results (Tables 1 and 2) is that the predicted cIRFs clearly do not perform as well as the ideal cIRFs. Instead, they perform at a level similar to that of the IRFs. The ideal cIRFs are able to reconstruct the polarimetric images perfectly, which is achieved through the phase information affording the extra degree of freedom. It is reasonable to expect some improvement for the predicted cIRFs if the DNNs are trained with a larger data set; however, there may be an intrinsic limitation in terms of not being able to invent the necessary phase information. Nonetheless, the ideal cIRF represents the ultimate level of reconstruction accuracy. A likely vector for overall improvement may be found through a more elaborate cIRF compression than what is described in Sect. 3.2. Figure 10 presents the ideal IRF and the compressed cIRF for the same channel structure. The real part of the cIRF is unambiguously similar to the IRF (compare Fig. 10(A) and (C)), whereas the imaginary part of the cIRF affects mainly the high frequencies where the signal is weak (Fig. 10(B)). The proportion of the imaginary part of the cIRF is presented in Fig. 10(D) to demonstrate the coupling of the real and imaginary parts. Unlike the central part, the outskirts of the cIRF, where the imaginary part takes effect, lack any clear pattern, which may be why the trained DNN failed to predict the imaginary component to the level required. As a result, the performance of the predicted cIRFs remained at a level similar to that of the IRFs.


Fig. 10. Compressed ideal cIRF and IRF for the same channel structure. A: cIRF real part; B: cIRF imaginary part; C: IRF; D: proportion of the imaginary part in the cIRF ($\mathrm {Im}/\sqrt {\mathrm {Re}^2+\mathrm {Im}^2}$).


6. Conclusions

Burgeoning AI techniques in recent years have encouraged the polarization community to incorporate ML into polarimetric imagery. For instance, the recently proposed polarization demosaicing CNN (PDCNN) model [3] and the conditional GAN demosaicing model [4] both show promising results on direct-domain polarimetric image reconstruction. In contrast to these studies, we cast our attention to the frequency domain. In this paper, we have outlined a neuromorphically adaptive filtering framework for the reconstruction of snapshot polarimeter images. For a rough comparison with previous work, we also reconstructed the polarimetric images using bicubic interpolation and report results with the PSNR metric to provide a common baseline. Our reconstruction framework provides larger $\Delta$PSNR gains: +4.1 dB for $s_0$ and +4.9 dB for DoLP (IRF results). A similar analysis for the PDCNN method reveals gains of +3.7 dB for $s_0$ and +2.1 dB for DoLP. While this is an encouraging sign, it does not serve as a fair comparison between the two deep learning based image reconstruction approaches. The main caveat is that the databases used by each group are different and the specific trained parameters of the DNNs are difficult to share. Nonetheless, the larger impact on the reconstruction of the polarization information is consistent with the overarching premise of this study: determining filters that assign the correct information to the correct channels. Intuitively, it makes sense for the generally weaker polarization channels to benefit by a greater amount. We hope to resolve many of the remaining uncertainties through a combined study that fairly compares the performance of each of the recently proposed methods on a common data set.
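The $\Delta$PSNR figures quoted above follow the standard PSNR definition. A minimal sketch of how PSNR and the gain over a bicubic baseline could be computed is given below; the images here are random stand-ins for the $s_0$ or DoLP maps, not data from this study.

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return float(10.0 * np.log10(peak**2 / mse))

# Random stand-in images; "bicubic" carries more error than "filtered".
rng = np.random.default_rng(1)
truth = rng.random((64, 64))
bicubic = np.clip(truth + 0.05 * rng.standard_normal(truth.shape), 0, 1)
filtered = np.clip(truth + 0.02 * rng.standard_normal(truth.shape), 0, 1)

# Positive delta_psnr means the filtered reconstruction beats the baseline.
delta_psnr = psnr(truth, filtered) - psnr(truth, bicubic)
```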

The trained DNN in our system takes the Fourier transform of the measured data as input and outputs a predicted IRF/cIRF. The system is neuromorphically adaptive in that the trained DNN can predict proper filters for novel image data based on prior experience. We have statically trained the DNN here using supervised methods, but in future operation the training database could be continually augmented in an unsupervised fashion as more data are experienced.
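Abstracting away the network details, the inference path described above (Fourier transform of the measured frame in, predicted filter out, followed by filtering and an inverse transform) can be sketched as follows. Here `predict_filter` is a hypothetical stand-in for the trained DNN; it simply returns a fixed low-pass mask so that the pipeline runs end to end.

```python
import numpy as np

def predict_filter(spectrum_mag: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the trained DNN: given the magnitude of
    the measured spectrum, return a centered binary low-pass mask of the
    same shape. A real system would predict an adaptive IRF/cIRF here."""
    h, w = spectrum_mag.shape
    y, x = np.ogrid[:h, :w]
    r = np.hypot(y - h / 2, x - w / 2)
    return (r < min(h, w) / 4).astype(float)

measured = np.random.default_rng(2).random((64, 64))     # stand-in DoFP frame
spectrum = np.fft.fftshift(np.fft.fft2(measured))        # channels live here
filt = predict_filter(np.abs(spectrum))                  # filter prediction step
isolated = filt * spectrum                               # isolate a channel
recon = np.real(np.fft.ifft2(np.fft.ifftshift(isolated)))  # demodulated estimate
```

The same pattern applies per channel; a cIRF would use a complex-valued mask in place of the binary one.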

The work in this paper focuses on DoFP polarimeters using the conventional $2\times 2$ MPA of Chun [12]. However, the framework is immediately applicable to any category of snapshot polarimeter, such as those that modulate in space [13,22], wavelength [14], or angle of incidence [15]. Because the method scales to an arbitrary number of channels, it has the potential to be applied both to passive polarimeters, as done here, and to active polarimeters that measure the full Mueller matrix. Such systems may require many channels [16,22], each of which must be filtered separately to reconstruct the data. We are also working to apply these methods to recently developed multi-domain modulation methods that modulate in a combination of independent variables such as space and time, space and wavelength, or wavelength and time simultaneously [39-41]. To date, conventional low-pass filtering methods have been used in those studies, but the present results are directly applicable to those cases.

Funding

University of New South Wales Canberra; Asian Office of Aerospace Research and Development (FA2386-15-1-4098).

Acknowledgments

The authors thank A. Kruse for assistance with designing the color map used for presentation of the data, and G. Sargent and B. Ratliff for conversations about preparing data for the DNN training.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J. S. Tyo, D. L. Goldstein, D. B. Chenault, and J. A. Shaw, “Review of passive imaging polarimetry for remote sensing applications,” Appl. Opt. 45(22), 5453–5469 (2006).

2. F. Snik, J. Craven-Jones, M. Escuti, S. Fineschi, D. Harrington, A. De Martino, D. Mawet, J. Riedi, and J. S. Tyo, “An overview of polarimetric sensing techniques and technology with applications to different research fields,” in Polarization: Measurement, Analysis, and Remote Sensing XI, vol. 9099 (International Society for Optics and Photonics, 2014), p. 90990B.

3. J. Zhang, J. Shao, H. Luo, X. Zhang, B. Hui, Z. Chang, and R. Liang, “Learning a convolutional demosaicing network for microgrid polarimeter imagery,” Opt. Lett. 43(18), 4534–4537 (2018).

4. G. C. Sargent, B. M. Ratliff, and V. K. Asari, “Conditional generative adversarial network demosaicing strategy for division of focal plane polarimeters,” Opt. Express 28(25), 38419–38443 (2020).

5. B. M. Ratliff, C. F. LaCasse, and J. S. Tyo, “Interpolation strategies for reducing IFOV artifacts in microgrid polarimeter imagery,” Opt. Express 17(11), 9112–9125 (2009).

6. R. Azzam, “Arrangement of four photodetectors for measuring the state of polarization of light,” Opt. Lett. 10(7), 309–311 (1985).

7. J. D. Barter, P. H. Lee, H. Thompson, Jr., and T. Schneider, “Stokes parameter imaging of scattering surfaces,” in Polarization: Measurement, Analysis, and Remote Sensing, vol. 3121 (International Society for Optics and Photonics, 1997), pp. 314–320.

8. N. A. Rubin, A. Zaidi, M. Juhl, R. P. Li, J. B. Mueller, R. C. Devlin, K. Leósson, and F. Capasso, “Polarization state generation and measurement with a single metasurface,” Opt. Express 26(17), 21455–21478 (2018).

9. A. G. Andreou and Z. K. Kalayjian, “Polarization imaging: principles and integrated polarimeters,” IEEE Sens. J. 2(6), 566–576 (2002).

10. M. Serna, “Single-pixel polarimeter: dielectric-gratings model and fabrication progress,” Infrared Phys. Technol. 44(5-6), 457–464 (2003).

11. J. L. Pezzaniti and D. B. Chenault, “A division of aperture MWIR imaging polarimeter,” in Polarization Science and Remote Sensing II, vol. 5888 (International Society for Optics and Photonics, 2005), p. 58880V.

12. C. S. Chun, D. L. Fleming, and E. Torok, “Polarization-sensitive thermal imaging,” in Automatic Object Recognition IV, vol. 2234 (International Society for Optics and Photonics, 1994), pp. 275–286.

13. M. W. Kudenov, L. Pezzaniti, E. L. Dereniak, and G. R. Gerhart, “Prismatic imaging polarimeter calibration for the infrared spectral region,” Opt. Express 16(18), 13720–13737 (2008).

14. K. Oka and T. Kato, “Spectroscopic polarimetry with a channeled spectrum,” Opt. Lett. 24(21), 1475–1477 (1999).

15. Y. Otani, T. Wakayama, K. Oka, and N. Umeda, “Spectroscopic Mueller matrix polarimeter using four-channeled spectra,” Opt. Commun. 281(23), 5725–5730 (2008).

16. A. S. Alenin and J. S. Tyo, “Generalized channeled polarimetry,” J. Opt. Soc. Am. A 31(5), 1013–1022 (2014).

17. C. F. LaCasse, R. A. Chipman, and J. S. Tyo, “Band limited data reconstruction in modulated polarimeters,” Opt. Express 19(16), 14976–14989 (2011).

18. P.-J. Lapray, L. Gendre, A. Foulonneau, and L. Bigué, “Database of polarimetric and multispectral images in the visible and NIR regions,” in Unconventional Optical Imaging, vol. 10677 (International Society for Optics and Photonics, 2018), p. 1067738.

19. A. W. Kruse, A. S. Alenin, I. J. Vaughn, and J. S. Tyo, “Perceptually uniform color space for visualizing trivariate linear polarization imaging data,” Opt. Lett. 43(11), 2426–2429 (2018).

20. G. Anna, F. Goudail, and D. Dolfi, “General state contrast imaging: an optimized polarimetric imaging modality insensitive to spatial intensity fluctuations,” J. Opt. Soc. Am. A 29(6), 892–900 (2012).

21. G. Anna, F. Goudail, and D. Dolfi, “Optimal discrimination of multiple regions with an active polarimetric imager,” Opt. Express 19(25), 25367–25378 (2011).

22. M. W. Kudenov, M. J. Escuti, N. Hagen, E. L. Dereniak, and K. Oka, “Snapshot imaging Mueller matrix polarimeter using polarization gratings,” Opt. Lett. 37(8), 1367–1369 (2012).

23. A. S. Alenin and J. S. Tyo, “Structured decomposition design of partial Mueller matrix polarimeters,” J. Opt. Soc. Am. A 32(7), 1302–1312 (2015).

24. A. S. Alenin and J. S. Tyo, “Structured decomposition of a multi-snapshot nine-reconstructables Mueller matrix polarimeter,” J. Opt. Soc. Am. A 37(6), 890–902 (2020).

25. R. Ossikovski and O. Arteaga, “Complete Mueller matrix from a partial polarimetry experiment: the nine-element case,” J. Opt. Soc. Am. A 36(3), 403–415 (2019).

26. O. Arteaga and R. Ossikovski, “Complete Mueller matrix from a partial polarimetry experiment: the 12-element case,” J. Opt. Soc. Am. A 36(3), 416–427 (2019).

27. D. A. LeMaster and K. Hirakawa, “Improved microgrid arrangement for integrated imaging polarimeters,” Opt. Lett. 39(7), 1811–1814 (2014).

28. J. S. Tyo, C. F. LaCasse, and B. M. Ratliff, “Total elimination of sampling errors in polarization imagery obtained with integrated microgrid polarimeters,” Opt. Lett. 34(20), 3187–3189 (2009).

29. B. M. Ratliff, C. F. LaCasse, and J. S. Tyo, “Quantifying IFOV error and compensating its effects in DoFP polarimeters,” Opt. Express 17(11), 9112–9125 (2009).

30. A. S. Alenin, I. J. Vaughn, and J. S. Tyo, “Optimal bandwidth micropolarizer arrays,” Opt. Lett. 42(3), 458–461 (2017).

31. A. S. Alenin, I. J. Vaughn, and J. S. Tyo, “Optimal bandwidth and systematic error of full-Stokes micropolarizer arrays,” Appl. Opt. 57(9), 2327–2336 (2018).

32. I. J. Vaughn, “Bandwidth and noise in spatiotemporally modulated Mueller matrix polarimeters,” Ph.D. thesis, The University of Arizona (2016).

33. D. Wang and J. Chen, “Supervised speech separation based on deep learning: an overview,” IEEE/ACM Trans. Audio, Speech, Lang. Process. 26(10), 1702–1726 (2018).

34. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015), pp. 234–241.

35. D. S. Williamson, Y. Wang, and D. Wang, “Complex ratio masking for monaural speech separation,” IEEE/ACM Trans. Audio, Speech, Lang. Process. 24(3), 483–492 (2016).

36. I. J. Vaughn, A. S. Alenin, and J. S. Tyo, “Statistical scene generation for polarimetric imaging systems,” arXiv preprint arXiv:1707.02723 (2017).

37. C. F. LaCasse, O. G. Rodríguez-Herrera, R. A. Chipman, and J. S. Tyo, “Spectral density response functions for modulated polarimeters,” Appl. Opt. 54(32), 9490–9499 (2015).

38. W. Xue, L. Zhang, X. Mou, and A. C. Bovik, “Gradient magnitude similarity deviation: a highly efficient perceptual image quality index,” IEEE Trans. Image Process. 23(2), 684–695 (2014).

39. W. B. Sparks, T. A. Germer, and R. M. Sparks, “Classical polarimetry with a twist: a compact, geometric approach,” Publ. Astron. Soc. Pac. 131(1001), 075002 (2019).

40. J. Song, I. J. Vaughn, A. S. Alenin, and J. S. Tyo, “Imaging dynamic scenes with a spatio-temporally channeled polarimeter,” Opt. Express 27(20), 28423–28436 (2019).

41. Q. Li, A. S. Alenin, and J. S. Tyo, “Spectral–temporal hybrid modulation for channeled spectropolarimetry,” Appl. Opt. 59(30), 9359–9367 (2020).



Tables (2)

Tables Icon

Table 1. PSNR scores of the reconstructed test images

Tables Icon

Table 2. GMSD scores of the reconstructed test images
