
Strong non-uniformity correction algorithm based on spectral shaping statistics and LMS

Open Access

Abstract

The existence of non-uniformity in infrared detector output images is a widespread problem that significantly degrades image quality. Existing scene-based non-uniformity correction algorithms typically struggle to balance strong non-uniformity correction with scene adaptability. To address this issue, we propose a novel scene-based algorithm that leverages the frequency characteristics of the non-uniformity, combining and improving single-frame stripe removal, multi-scale statistics, and least mean square (LMS) methods. Following a “coarse-to-fine” correction process, the coarse correction stage introduces an adaptive progressive correction strategy based on Laplacian pyramids. By improving 1-D guided filtering and high-pass filtering to shape the high-frequency sub-bands, non-uniformity can be well separated from the scene, effectively suppressing ghosting. In the fine correction stage, we optimize the expected image estimation and the spatio-temporal adaptive learning rates based on the guided-filtering LMS method. To validate the efficacy of our algorithm, we conduct extensive simulation and real experiments, demonstrating its adaptability to various scene conditions and its effectiveness in correcting strong non-uniformity.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Infrared detectors often exhibit non-uniformity caused by various factors, including material properties, fabrication processes of the focal plane array, and variations within the readout electronics [1]. This non-uniformity leads to different output responses from each pixel for the same input radiation, resulting in what is commonly referred to as fixed pattern noise (FPN). Although non-uniformity gradually drifts over time and with changes in temperature, it is generally considered constant in the short term during the correction process. After calibration, the FPN is typically characterized by fine stripes and granular Gaussian noise with high spatial frequency features, along with minor deviations due to drift. However, in detectors that have not been calibrated for an extended period, strong noise with coarse stripes and significant deviations can also be present, particularly in uncooled infrared detectors. These types of noise severely degrade image quality, emphasizing the importance of non-uniformity correction (NUC) for front-end image restoration. NUC serves as the foundation for image enhancement, super-resolution, target recognition, multi-spectral fusion, and finds valuable applications in various infrared fields, including night vision, intelligent driving, and temperature monitoring.

To better understand the frequency characteristics of non-uniformity, we simulate an image containing stripe noise and divide it into four equal parts from left to right. These parts consist of small-deviation (20 DN) fine stripes with a width of one pixel, coarse stripes with a width of four pixels and the same small deviation, fine stripes with a width of one pixel but with a large deviation (200 DN), and coarse stripes with a width of four pixels and the same large deviation, as depicted in Fig. 1(a). By utilizing the Laplacian pyramid (LP) [2], we decompose the image into four levels, denoted as ${L_1}$ to ${L_4}$, where the first three levels represent sub-bands with decreasing frequency ranges. Each of these levels corresponds to a band-pass filtered output, and the top level, ${L_4}$, represents a low-frequency sub-band obtained through multiple down-sampling, as illustrated in Fig. 1(b)-(e). From the perspective of deviation degree, strong noise exhibits a larger intensity range in ${L_1}$ to ${L_3}$. Comparing coarse and fine stripes with the same deviation degree, it is evident that while the intensity range of fine stripes is greater in ${L_1}$, the intensity range of coarse stripes is greater in ${L_2}$ and ${L_3}$. This suggests that fine stripes exhibit more pronounced high-frequency features, while coarse stripes possess wider frequency band attributes. Similarly, strong Gaussian noise has wider frequency bands than weak Gaussian noise, and fine stripes with the same intensity exhibit wider frequency bands than Gaussian noise. We refer to this type of FPN with wide-band characteristics as strong FPN.
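For reference, the following is a minimal sketch of the four-level LP decomposition used above, assuming NumPy/SciPy; the reduce_/expand_ helpers are illustrative stand-ins for the REDUCE/EXPAND operators of Burt and Adelson [2], not the authors' exact implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def reduce_(img, sigma=1.0):
    """REDUCE: Gaussian low-pass (radius 1 via truncate=1.0), then down-sample by 2."""
    return gaussian_filter(img, sigma, truncate=1.0)[::2, ::2]

def expand_(img, shape):
    """EXPAND: bilinear up-sampling back to `shape` (an illustrative stand-in)."""
    out = zoom(img, (shape[0] / img.shape[0], shape[1] / img.shape[1]), order=1)
    return out[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=4):
    """Return [L1, ..., L_levels]; the last level is the low-frequency residual."""
    pyr, cur = [], img.astype(np.float64)
    for _ in range(levels - 1):
        low = reduce_(cur)
        pyr.append(cur - expand_(low, cur.shape))  # band-pass sub-band
        cur = low
    pyr.append(cur)                                # top level: low-frequency residual
    return pyr

# Example: inspect the intensity range of each sub-band of a noisy image Y (cf. Fig. 1).
# for k, L in enumerate(laplacian_pyramid(Y), start=1):
#     print(f"L{k}: range [{L.min():.1f}, {L.max():.1f}]")
```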

Fig. 1. Four-level decomposition of the noisy image based on LP. (a) Noisy image; (b) ${L_1}$; (c) ${L_2}$; (d) ${L_3}$; (e) ${L_4}$.

NUC methods can be categorized as calibration-based and scene-based. Calibration-based methods, including single-point, two-point, and multi-point correction, use a blackbody to calibrate the output response under uniform radiation. These methods calculate gain and bias matrices based on a linear model, which are then incorporated into the program [3]. Although calibration-based methods are straightforward to implement in hardware and do not produce ghosting artifacts, they require frequent recalibration and lack the adaptability to correct drift-induced FPN or to handle high dynamic range imaging effectively. In contrast, scene-based methods do not require additional hardware or pre-calibration, resulting in cost and volume savings. They possess adaptive correction capabilities and have been extensively researched. However, these methods often struggle to strike a balance between correction ability and scene adaptability. Moreover, current research primarily focuses on correcting high-frequency FPN, and few methods are available for effectively correcting strong FPN with wide bandwidths, yielding unsatisfactory correction results.

This paper presents a scene-based NUC algorithm designed to effectively correct infrared images affected by strong FPN, without the need for pre-calibration of the infrared detector. The proposed algorithm demonstrates excellent adaptability to scenes and high computational efficiency. Given the primarily perspective motion and relatively homogeneous content of in-vehicle scenes, we mainly select such scenes to evaluate the algorithm's performance and additionally test its adaptability to scenes lacking motion. The structure of this paper is organized as follows: Section 2 provides a brief overview of representative scene-based NUC methods; Section 3 introduces the methodology of the proposed algorithm; Section 4 presents simulation experiments using two public datasets [4,5], to which strong FPN is intentionally added, as well as practical experiments conducted using an uncooled infrared detector that was not pre-calibrated; finally, Section 5 concludes the paper by summarizing the findings and discussing potential future directions. By addressing the limitations of existing methods, this research aims to contribute to the development of robust NUC techniques capable of handling strong FPN in infrared imagery.

2. Related works

Scene-based NUC methods using video sequences encompass various techniques, including the constant statistical method, the temporal high-pass method, the inter-frame registration method, and the least mean square (LMS) method.

The constant statistical method assumes that the temporal mean and standard deviation are the same for every pixel, which requires sufficient scene motion and an adequate number of frames [6]. When the assumption holds, the method is effective in correcting strong FPN. However, when the scene lacks motion and the number of statistical frames is limited, this method treats slowly changing scenes as FPN, resulting in burn-in ghosting artifacts. To suppress ghosting artifacts, Harris [6] proposed a constrained global constant statistical method, which only counts pixels whose changes exceed a threshold. The local constant statistical method [7] and the multi-scale constant statistical method [8] analyze FPN from the perspective of spatial frequency and argue that the high-frequency part of FPN is more likely to satisfy the constant-statistics assumption. The constant statistical method of adjacent ratios [9] and adjacent differential constant statistics [10] both try to separate scene information from FPN, effectively reduce the number of statistical frames, and exclude outliers using a median filter. Liu [11] improved the absolute median deviation by adding first-order lag filters based on adjacent differential constant statistics.

The temporal high-pass method assumes that, in the temporal domain, the scene has high-frequency properties while FPN has low-frequency properties [12]. This assumption can lead to ghosting artifacts when the scene lacks motion. Essentially, this method is equivalent to the constant statistical method's assumption that each pixel has an equal mean value. The spatial low-pass and temporal high-pass method [13] incorporates wavelet denoising, while the bilateral filtering temporal high-pass method [14] utilizes the hard segmentation property of the bilateral filter. Both methods can separate the high-frequency part of FPN from the scene and effectively suppress ghosting artifacts, but they are only effective for high-frequency noise.

The inter-frame registration method assumes that the radiation of the same target in adjacent frames is equal, which requires global motion of the scene and accurate registration of the motion trajectory between adjacent frames [15–17]. Hardie [15] used an iterative gradient-based algorithm, and Zuo [16] proposed a mutual power spectrum algorithm. Their algorithms register scenes with general translational motion well but fail for other types of motion such as perspective and rotation. In terms of frame requirements, this class of methods has the highest correction efficiency; under ideal conditions, two frames suffice to complete the correction [17]. In addition, this method also fails for strong FPN or scenes with few detailed features.

The LMS method assumes that a pixel and its neighbors have the same output, which resembles retina-like imaging, so it is also called the neural network method [18]. This method has strong scene adaptability and can implement correction as long as there is motion in the scene. The method constructs an error function between the corrected image and the expected image and usually uses the steepest descent method to minimize the mean square of this function. The key to accurate correction lies in accurate estimation of the expected image and adaptive control of the correction rate. Scribner [18] estimated the expected image using a 4-neighborhood mean filter, but this filter has large estimation errors at edges and textures, which easily results in ghosting artifacts. With the development of filtering techniques, bilateral, guided, and co-occurrence filters with edge-preserving smoothing properties [19–21] have improved the estimation of the expected image. Regarding control of the correction rate, Vera and Torres [22] found an inverse relationship between the local standard deviation and the correction rate in the spatial domain and proposed the fast adaptive LMS method. Vera [23] proposed the total variation adaptive LMS method based on pixel gradients rather than standard deviation, which further improved the correction effect. These two methods greatly enhance the ghosting-suppression capability of the LMS algorithm from the spatial-domain perspective, but they perform poorly on static scenes. To address this issue, Hardie [24] proposed the temporal gating LMS method, which constructs a reference frame and does not correct pixels whose absolute difference from the reference frame is less than a threshold, effectively solving the problem of ghosting artifacts caused by static scenes. This algorithm has good stability but sacrifices correction speed. Lai [25] and Li [26] both considered spatio-temporal factors and integrated adaptive correction rates, while Song [27] proposed a method that separately corrects using temporal and spatial filters before adaptively fusing the results. Since the filters used to estimate the expected image have low-pass properties, the LMS method is essentially a spatial low-pass correction method that is effective for correcting high-frequency FPN. Theoretically, with a progressive correction strategy, as the number of frames increases, the correction of each pixel can be propagated locally to the global level, achieving correction of mid-to-low frequency FPN. However, due to the slow propagation speed, it is difficult to achieve satisfactory results even after thousands of frames. In addition, strong FPN can seriously interfere with the estimation of the expected image, rendering this method ineffective.

In summary, the performance of scene-based NUC methods is listed in Table 1. None of these algorithms can balance the correction ability for strong FPN with scene adaptability.


Table 1. Scene-based NUC methods performance summary [6–27]

Additionally, there exist single-frame image correction methods based on scene analysis. Liu [28] solves the problem of simultaneously correcting stripes and bias fields by introducing regularization terms into a convex optimization model based on the prior properties of the image and FPN. However, this method requires iterative optimization and has high computational complexity. Cao [29] proposes a stripe noise correction method based on 1-D guided filtering, which can effectively separate the scene from the stripe noise. The wavelet-based multi-scale stripe removal method further improves Cao's method and enhances the accuracy of scene and stripe noise separation [30]. All these methods mainly focus on eliminating stripe noise and do not correct the dominant Gaussian noise in FPN. In recent years, deep learning-based methods have been used for NUC [31,32]. Their biggest advantage is that they can simultaneously correct various types of noise without ghosting, because each frame is corrected independently. However, their computational complexity is much higher than that of classical algorithms, which greatly limits their application in real-time processing tasks.

Considering these limitations, this paper proposes a correction strategy that progresses from coarse to fine based on the frequency distribution of FPN. It consists of three phases: stripe removal, removal of large-deviation Gaussian noise, and refined correction. The first two phases constitute the coarse correction process, aiming to significantly weaken strong FPN and bring it to a state suitable for LMS correction. Coarse correction is based on statistics and LP decomposition. By shaping the high-frequency spectral band, scenes and noise are effectively separated, suppressing ghosting artifacts. Consequently, the proposed algorithm is called the Spectral Shaping Statistics and LMS Combination Algorithm (3SLMS). The contributions of this algorithm are: (1) combining the advantages of single-frame stripe removal, multi-scale constant statistics, and LMS methods; (2) starting from the frequency characteristics of FPN, proposing a progressive correction strategy based on LP and, based on this strategy, a method for separating stripe noise and large-deviation Gaussian noise from scenes by shaping the high-frequency sub-bands; (3) optimizing the expected image estimation and the spatio-temporal adaptive learning rates based on the guided-filtering LMS method.

3. Methodology

3.1 Algorithm structure

Using a linear output model for infrared detectors [6–27]:

$${Y^{i,j}}(n )= {g^{i,j}}(n ){X^{i,j}}(n )+ {b^{i,j}}(n )+ n{r^{i,j}}(n )$$
where Y represents the detector output response, X the true scene radiation, g the gain, b the bias, and $nr$ the random noise. The superscript $({i,j} )$ represents pixel indices, and n represents the frame number. To simplify the expression, $({i,j} )$ and n are omitted when there is no need to distinguish between pixel and inter-frame operations. By ignoring $nr$ and based on the linear characteristics of Eq. (1), we split g and b into a stripe correction matrix ${b_{str}}$, a Gaussian noise correction matrix ${b_{gn}}$, and fine correction matrices ${g_f}$ and ${b_f}$. Equation (1) is rewritten as $Y = {g_f}X + {b_f} + {b_{gn}} + {b_{str}}$, and the algorithm structure is shown in Fig. 2.

Fig. 2. The algorithm structure of 3SLMS.

Phase 1: Stripe removal. Based on the global constant statistics method [6], we calculate the initial bias correction matrix ${b_c}$, which contains all FPN and inevitably includes some scene information. We use a progressive correction strategy (Section 3.2), which separates the stripe noise from the scene by using 1-D guided filtering, obtaining the first-phase corrected image ${\hat{X}_1} = Y - {\hat{b}_{str}}$. The symbol $\;\widehat {}\;$ denotes an estimated value; details are described in Section 3.3.

Phase 2: Removal of large-deviation Gaussian noise. We continue to use the progressive correction strategy, extract the high-frequency sub-band ${L_1}$ of ${\hat{X}_1}$, and perform multiple high-pass filtering operations on it to completely separate Gaussian noise from the scene, obtaining the second-phase corrected image ${\hat{X}_2} = {\hat{X}_1} - {\hat{b}_{gn}}$ (Section 3.4).

Phase 3: LMS. As the FPN has been largely corrected in the first two phases, guided filtering with both edge-preserving and smoothing properties can accurately estimate the expected image under the framework of the LMS algorithm. The correction rate combines spatio-temporal factors and can effectively suppress ghosting. Using the steepest descent method, we estimate the fine correction gain matrix ${\hat{g}_f}$ and bias matrix ${\hat{b}_f}$, and calculate the corrected image $\hat{X} = {\hat{g}_f}{\hat{X}_2} + {\hat{b}_f}$ (Section 3.5).

3.2 LP-based progressive correction strategy

A. LP adapts to the nature of NUC

We used the first 50 frames of the FLIR in-vehicle infrared public dataset [4]. After manually adding FPN, we obtained the temporal mean image $\bar{Y}$. The image $\bar{Y}$ and its corresponding four-level LP decomposition, computed using a normalized square Gaussian filter with a radius of 1, are shown in Fig. 3. The symbol “${\oplus} $” represents the LP reconstruction operation.

Fig. 3. Decomposition of the temporal mean image into 4 scales based on LP.

The mean image $\bar{Y}$ contains considerable scene information, such as the separation edges between sky and ground areas, the distant horizon, and the motion trajectory of the scene, which obviously does not satisfy the global constant-statistics assumption. Observing that ${L_1}\sim {L_3}$ mainly contain FPN while ${L_4}$ mainly contains scene information, it can be concluded that LP can alleviate the correlation between most pixels. The closer to the bottom level (${L_1}$), the more FPN information, and the closer to the top level (${L_4}$), the more scene information.

Let the mean image obtained using n frames of statistics be denoted as $\bar{Y}(n )$, and the Mean Squared Error (MSE) [33] between this mean image and the true mean m be denoted as $MSE[{\bar{Y}(n )} ]= E\{{{{[{\bar{Y}(n )- m} ]}^2}} \}= Var[{\bar{Y}(n )} ]= \sigma _Y^2/n$ [6,8,13,14]. Here, E represents the expectation operation and $Var$ represents the variance operation. Because the variance approximately doubles between adjacent levels of the pyramid [2], the variance relationship between ${L_a}$ and ${L_b}$ can be expressed as $\sigma _{{L_a}}^2 = {2^{a - b}}\sigma _{{L_b}}^2$. Under the condition that $\sigma _Y^2$ is the same, for ${L_2}$ to reach the same MSE as ${L_1}$, it requires about twice the number of frames as ${L_1}$. For ${L_a}$ and ${L_b}$ to reach the same MSE, the required numbers of frames are related by ${n_{{L_a}}} = {2^{a - b}}{n_{{L_b}}}$. Therefore, the MSE is proportional to the variance and inversely proportional to the number of statistical frames n. As the number of statistical frames increases, the variance gradually decreases and tends to stabilize, the MSE reaches its minimum value, and we consider the level to have converged. In addition, a smaller MSE indicates closer adherence to the constant-statistics assumption, which correspondingly means more FPN and less scene information in that level. This property helps with the selection of the Gaussian filter parameters in LP and the implementation of the progressive correction strategy.
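As a worked example of this frame-count relation (our own illustration; the 50-frame figure matches the statistics window used above but is otherwise an assumption):

```latex
% If L_1 reaches its minimum MSE after n_{L_1} = 50 statistical frames, then, since
% MSE[\bar{Y}(n)] = \sigma_Y^2/n and \sigma_{L_a}^2 = 2^{a-b}\sigma_{L_b}^2, matching that
% MSE at the coarser levels requires
\[
  n_{L_2} = 2^{2-1}\, n_{L_1} = 100, \qquad n_{L_3} = 2^{3-1}\, n_{L_1} = 200 \ \text{frames}.
\]
```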

B. Progressive correction strategy

We calculate the energy by taking the average absolute value of the image. After decomposing the mean image $\bar{Y}({50} )$ into 4 levels in Fig. 3, we only performed progressive correction on ${L_1}$ for 100 iterations, and the energy changes of each level after correction are shown in Fig. 4. We can see that although only ${L_1}$ was corrected, the energy of ${L_2}$ and ${L_3}$ also decreased and stabilized accordingly, but the energy of ${L_4}$ remained almost unchanged.

Fig. 4. Energy changes in each level after continuous correction of the bottom level.

Considering that ${L_1}\sim {L_3}$ mainly contain FPN while ${L_4}$ mainly contains scene information, we conclude that through progressive correction of the FPN at the bottom level, the FPN energy in the other levels gradually moves downwards, while the energy of the scene information remains almost unchanged. Therefore, progressive correction of the bottom level is feasible, and only ${L_1}$ needs to be calculated, regardless of the number of levels in the LP. From another perspective, the frequency of the scene information in most temporal mean images is generally lower than that of the FPN, and continuous correction of the bottom level is equivalent to constantly removing the highest-frequency information. Inevitably, there is also scene information with a higher frequency than the FPN; without further separation, this portion of the scene information will still cause ghosting.

3.3 Stripe removal

We divided the de-striping process into three steps. First, we preliminarily separated the scene and FPN using temporal low-pass filtering and LP-based spatial high-pass filtering. Second, we extracted the stripe noise using an improved 1-D guided filter. Third, we used an adaptive progressive correction strategy until entering Phase 2. Since there are mature techniques for single-frame image de-striping [28–30], which have little dependency on the scene, we corrected the stripe noise first.

A. Temporal low-pass and spatial high-pass filtering

When the number of frames is limited, extreme values cause the statistical mean to deviate greatly from the true value. Taking the in-vehicle scene as an example, sunlight reflections and exhaust pipes have extremely high intensities in infrared images, corresponding to extremely high frequencies in the mean image, which can easily cause strong ghosting during correction. Therefore, we borrowed the idea of wavelet threshold denoising and used histogram compression to truncate signals greater than ${k_{up}}$ times and less than ${k_{down}}$ times the spatial mean ${u_Y}$ of the current frame, obtaining a compressed image ${Y_c}$.

$${Y_c} = \left\{ {\begin{array}{ll} {k_{up}}\ast {u_Y}\;& if\; {Y^{i,j}} > {k_{up}}\ast {u_Y}\;\\ {Y}\; & if\; {k_{down}}\ast {u_Y} \le {Y^{i,j}} \le {k_{up}}\ast {u_Y}\; \\ {k_{down}}\ast {u_Y}\;& if\; {Y^{i,j}} < {k_{down}}\ast {u_Y}\; \end{array}} \right.$$
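A minimal sketch of the extreme-value compression of Eq. (2), assuming NumPy; the values of ${k_{up}}$ and ${k_{down}}$ shown here are placeholders (the paper's settings are given in Table 2):

```python
import numpy as np

def compress_extremes(Y, k_up=1.5, k_down=0.5):
    """Eq. (2): clip the frame to [k_down*u_Y, k_up*u_Y], where u_Y is its spatial mean."""
    u_Y = Y.mean()
    return np.clip(Y, k_down * u_Y, k_up * u_Y)
```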

The temporal mean image ${\bar{Y}_c}$ of the $n$th frame is calculated as follows:

$${\bar{Y}_c}(n )= [{({n - 1} ){{\bar{Y}}_c}({n - 1} )+ {Y_c}} ]/n$$

The initial correction matrix ${b_c}$ is obtained by normalizing ${\bar{Y}_c}$.

$${b_c} = {\bar{Y}_c} - (\mathop \sum \nolimits_{i = 1}^{i = row} \mathop \sum \nolimits_{j = 1}^{j = col} \bar{Y}_c^{i,j})/({row \times col} )$$
where $row \times col$ is the image resolution. A spatial high-pass filter is then applied, resulting in:
$${L_1}({{b_c}} )= {b_c} - EXPAND({REDUCE({{b_c}} )} )$$
where $REDUCE$ represents down-sampling and $EXPAND$ represents up-sampling. ${L_1}(\; )$ represents the high-pass filtering operation based on LP decomposition. ${L_1}({{b_c}} )$ consists mainly of high-frequency FPN and high-frequency scene information that lacks motion.
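A minimal sketch of Eqs. (3)-(5), assuming NumPy/SciPy; the REDUCE/EXPAND steps are the same illustrative stand-ins used in the LP sketch of Section 1:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def update_temporal_mean(Y_bar_c, Y_c, n):
    """Eq. (3): recursive temporal mean of the compressed frames (n is the 1-based frame index)."""
    return ((n - 1) * Y_bar_c + Y_c) / n

def normalize_bias(Y_bar_c):
    """Eq. (4): subtract the global spatial mean so the ideal b_c is a zero plane."""
    return Y_bar_c - Y_bar_c.mean()

def l1_highpass(b_c, sigma=1.0):
    """Eq. (5): bottom LP sub-band, b_c - EXPAND(REDUCE(b_c))."""
    low = gaussian_filter(b_c, sigma, truncate=1.0)[::2, ::2]                      # REDUCE
    up = zoom(low, (b_c.shape[0] / low.shape[0], b_c.shape[1] / low.shape[1]),
              order=1)                                                             # EXPAND
    return b_c - up[:b_c.shape[0], :b_c.shape[1]]
```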

B. Extraction of stripe noise

According to prior information, there is an approximate local linear relationship between the stripe noise and the true infrared input data without noise in the same column [29]. This relationship is the same as the linear relationship between the output image and the guided image [34]. Therefore, we use 1-D guided filtering to estimate the stripe noise, and the calculation formulae are as follows.

$$\left\{ {\begin{array}{l} {\hat{b}_{newstr}^{i,j} = {{\bar{p}}^{i,j}}{G^{i,j}} + {{\bar{q}}^{i,j}}\; \; \; \; \; \; }\\ {{p^{i,j}} = cov_{IG}^{i,j}/[{var_G^{i,j} + {\varepsilon_1}} ]}\\ {{q^{i,j}} = u_I^{i,j} - {p^{i,j}}u_G^{i,j}\; \; \; \; \; \; \; \; \; \; \; \; \; } \end{array}} \right.$$
We use ${L_1}({{b_c}} )$, in place of Cao's [29] preliminary FPN extraction by horizontal 1-D guided filtering, as the input image I. G is the guided image, ${\hat{b}_{newstr}}$ is the filtered output image (i.e., the separated stripe noise), and ${\varepsilon _1}$ is the blurring factor. The window ${w_1}$ is a 1-D sliding window centered at $({i,j} )$ with a vertical radius of ${r_1}$. Under the traversal of the window ${w_1}$, $\bar{p}$ is the average of p, $\bar{q}$ is the average of q, $co{v_{IG}}$ is the covariance between the input image and the guided image, $va{r_G}$ is the variance of the guided image, ${u_I}$ is the mean value of the input image, and ${u_G}$ is the mean value of the guided image. Because our method incorporates temporal factors, and the true temporal mean image after normalization is a plane with intensity approximately equal to 0, we set the guided image to be a 0-plane. Therefore, $co{v_{IG}} = 0$, $p = 0$, ${\varepsilon _1}$ does not need to be set, and the filtering calculation simplifies to Eq. (7).
$$\left\{ {\begin{array}{l} {\hat{b}_{newstr}^{i,j} = {{\bar{q}}^{i,j}}\; \; }\\ {{q^{i,j}} = u_I^{i,j}\; \; \; \; \; \; \; \; \; } \end{array}} \right.$$

Explained another way, for the linear model of Eq. (6), $\nabla {\hat{b}_{newstr}} = \bar{p}\nabla G$, so the smoothness of ${\hat{b}_{newstr}}$ depends on $\bar{p}$. If the guided image is a 0-plane, $\bar{p} = 0$ and ${\hat{b}_{newstr}}$ is smoother, which means it contains less scene information and is more conducive to separating the scene from the stripe noise. Regarding the selection of the window size of the 1-D guided filter: if ${r_1}$ is too small, the filter is susceptible to interference from scene information, so the extracted stripe noise contains scene information; if ${r_1}$ is too large, the estimation accuracy decreases. We follow Cao's [29] method and choose ${r_1} = row/4$. Because the guided filter implemented with a box filter has a computational complexity independent of the window size, the speed of the algorithm is not affected.
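A minimal sketch of the simplified 1-D guided filter of Eq. (7), assuming NumPy/SciPy. With a zero-plane guide, the filter reduces to two passes of a vertical 1-D box mean over the window ${w_1}$; the boundary handling chosen here is our assumption:

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def extract_stripes(L1_bc, r1=None):
    """Estimate stripe noise from L1(b_c) via the simplified 1-D guided filter (Eq. (7))."""
    if r1 is None:
        r1 = L1_bc.shape[0] // 4                                      # r1 = row/4, as in Cao [29]
    size = 2 * r1 + 1
    q = uniform_filter1d(L1_bc, size=size, axis=0, mode='nearest')    # q = u_I (vertical window mean)
    return uniform_filter1d(q, size=size, axis=0, mode='nearest')     # b_newstr = mean of q over w1
```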

C. Adaptive progressive correction

By using Eq. (3) to calculate the temporal low-pass image ${\bar{Y}_c}$ and combining it with a progressive correction strategy, we set ${\hat{b}_{str}}$ to be initially 0 and update it progressively according to the following formulae.

$${\bar{Y}_c} = {\bar{Y}_c} - {\hat{b}_{str}}$$
$${\hat{b}_{str}} = {\hat{b}_{str}} + {\hat{b}_{newstr}}$$

As the number of iterations increases, the energy of ${\hat{b}_{newstr}}$ ($EN{G_{str}})$ gradually decreases and stabilizes. When $EN{G_{str}}$ is less than the energy threshold $T{H_{str}}$, the algorithm adaptively enters Phase 2, where the corrected image ${\hat{X}_1} = Y - {\hat{b}_{str}}$. Setting the energy threshold $T{H_{str}}$ too small will reduce the correction efficiency, while setting it too large will result in a large amount of stripe noise remaining in subsequent phases, leading to a decrease in correction accuracy. Through our experiments, we have found that taking 0.20% to 0.25% of the energy of ${\hat{b}_{newstr}}$ calculated during the first iteration as the threshold is optimal.
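A minimal sketch of the Phase-1 adaptive progressive loop (Eqs. (8)-(9)), assuming NumPy and the l1_highpass/extract_stripes helpers sketched above. The frame-by-frame update of ${\bar{Y}_c}$ via Eq. (3) is omitted for brevity, and the threshold factor 0.25% is one point in the 0.20%-0.25% range suggested above:

```python
import numpy as np

def phase1_destripe(Y_bar_c, th_factor=0.0025, max_iter=100):
    """Iteratively separate stripe noise from the temporal mean image; returns b_str."""
    b_str = np.zeros_like(Y_bar_c)
    TH_str = None
    for _ in range(max_iter):
        b_c = Y_bar_c - Y_bar_c.mean()                      # Eq. (4): normalized bias matrix
        b_newstr = extract_stripes(l1_highpass(b_c))        # newly separated stripe noise
        eng_str = np.mean(np.abs(b_newstr))                 # energy = mean absolute value
        if TH_str is None:
            TH_str = th_factor * eng_str                    # threshold from the first iteration
        Y_bar_c = Y_bar_c - b_newstr                        # cf. Eq. (8)
        b_str = b_str + b_newstr                            # Eq. (9)
        if eng_str < TH_str:
            break                                           # adaptively enter Phase 2
    return b_str, Y_bar_c
```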

3.4 Removal of large-deviation Gaussian noise

Similar to the adaptive progressive correction for stripe removal, we continue to update ${\bar{Y}_c}$ based on Eq. (8) and set ${\hat{b}_{gn}}$ to be initially 0. The progressive update formulae are as follows.

$${\bar{Y}_c} = {\bar{Y}_c} - {\hat{b}_{gn}}$$
$${\hat{b}_{gn}} = {\hat{b}_{gn}} + {\hat{b}_{newgn}}$$

By using the temporal mean statistical method with extreme value compression, we can ensure that large-deviation Gaussian noise has the highest-frequency characteristics. Using this feature and referring to Eq. (5), we perform x iterations of high-pass filtering on ${b_c}$, progressively extracting high-frequency components to completely separate large-deviation Gaussian noise from the scene. Experiments show that, even for stationary scenes, when $x = 3$ the noise and scene can be completely separated, represented as ${\hat{b}_{newgn}} = {L_1}({{L_1}({{L_1}({{b_c}} )} )} )= L_1^3({{b_c}} )$, with the general formula being:

$${\hat{b}_{newgn}} = L_1^x({{b_c}} )$$

As the number of iterations increases, the energy of ${\hat{b}_{newgn}}$ ($EN{G_{gn}}$) gradually decreases and stabilizes. Referring to the energy threshold selection method in Phase 1, we also take 0.20% to 0.25% of the energy of ${\hat{b}_{newgn}}$ calculated during the first iteration as the threshold $T{H_{gn}}$. When $EN{G_{gn}} < T{H_{gn}}$, the algorithm adaptively enters Phase 3, where the corrected image ${\hat{X}_2} = {\hat{X}_1} - {\hat{b}_{gn}}$.
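A minimal sketch of the Phase-2 extraction step of Eq. (12), assuming the l1_highpass helper sketched above; the surrounding progressive loop mirrors the Phase-1 sketch, with ${\hat{b}_{gn}}$ in place of ${\hat{b}_{str}}$:

```python
def extract_gaussian_noise(b_c, x=3):
    """Eq. (12): b_newgn = L1^x(b_c), i.e. x repeated LP bottom-level high-pass filters (x = 3)."""
    out = b_c
    for _ in range(x):
        out = l1_highpass(out)
    return out
```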

3.5 Gated LMS method based on guided filtering

After spectral shaping statistical correction, strong FPN is significantly weakened. By using a guided filter with edge-preserving smoothing property, we can accurately estimate the expected image. To suppress ghosting artifacts, the correction rate is linked to the spatio-temporal domain, and its algorithmic framework is shown in Fig. 5.

Fig. 5. Block diagram of gated LMS algorithm based on guided filter.

$D$ represents the expected image estimated by the guided filter. We found that using the image $\hat{X}$ instead of ${\hat{X}_2}$ [20] as the input and guided image more accurately estimates the expected image. Both the input and guided images are therefore set to $\hat{X}$, where $\hat{X} = {\hat{g}_f}{\hat{X}_2} + {\hat{b}_f}$. This leads to some differences between Eqs. (13) and (6).

$$\left\{ {\begin{array}{l} {{D^{i,j}} = {{\bar{a}}^{i,j}}{{\hat{X}}^{i,j}} + {{\bar{b}}^{i,j}}\; \; }\\ {{a^{i,j}} = \sigma_{i,j}^2/(\sigma_{i,j}^2 + \varepsilon )\; }\\ {{b^{i,j}} = ({1 - {a^{i,j}}} ){u^{i,j}}\; \; } \end{array}} \right.$$

The sliding window ${w_2}$ has a center of $({i,j} )$ and a radius of ${r_2}$. During the traversal of ${w_2}$, $\bar{a}$ is the mean of a, $\bar{b}$ is the mean of b, u is the mean of $\hat{X}$, ${\sigma ^2}$ is the variance of $\hat{X}$, and $\varepsilon $ is the blur factor, where a smaller $\varepsilon $ value results in better edge preservation. We define the error function $E = \hat{X} - D$. To minimize the mean square error ${E^2}$, combining Eq. (1), we use the steepest descent method to update the gain ${\hat{g}_f}$ and bias ${\hat{b}_f}$ as shown in Eq. (14), with ${\hat{g}_f}$ initialized to 1 and ${\hat{b}_f}$ initialized to 0. To unify the gain and bias correction rates and avoid ghosting artifacts caused by a large gain learning rate [35], we multiply the gain learning rate by a coefficient of 0.01.

$$\left\{ {\begin{array}{l} {\hat{g}_f^{i,j}({n + 1} )= \hat{g}_f^{i,j}(n )- 0.01{\eta^{i,j}}{E^{i,j}}(n )\hat{X}_2^{i,j}(n )}\\ {\hat{b}_f^{i,j}({n + 1} )= \hat{b}_f^{i,j}(n )- {\eta^{i,j}}{E^{i,j}}(n )\; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; } \end{array}} \right.$$
where $\eta $ represents the adaptive learning rate. We use Hardie's [24] temporal gating method for motion detection, setting a temporal threshold $T{H_{tp}}$ so that only pixels whose temporal difference exceeds the threshold are updated. We introduce a spatial threshold $T{H_{sp}}$ based on the inverse relationship between the local spatial variance and the correction rate. We normalize variances within the range $[{0,T{H_{sp}}} ]$ to the interval $[{0,1} ]$ and set the learning rate to 0 where the variance exceeds $T{H_{sp}}$. This effectively suppresses ghosting artifacts caused by high local variances. The adaptive learning rate formulae are:
$${\eta ^{i,j}}(n )= \left\{ {\begin{array}{ll} {[{1 - \sigma_{i,j}^2(n )/T{H_{sp}}} ]rate}&{if\;\sigma_{i,j}^2(n )< T{H_{sp}}\;and\;|{{D^{i,j}}(n )- {R^{i,j}}(n )} |> T{H_{tp}}}\\ 0&{else} \end{array}} \right.$$
$${R^{i,j}}({n + 1} )= \left\{ {\begin{array}{lc} {D^{i,j}}(n )&if\; |{{D^{i,j}}(n )- {R^{i,j}}(n )} |> T{H_{tp}}\\ {{R^{i,j}}(n )}&else \end{array}} \right.$$
where $rate$ is a constant controlling the correction rate. We use the guided-filtered image D instead of ${\hat{X}_2}$ as the reference image for temporal motion detection to overcome the effects of random noise. We initialize $R(1 )= \infty $ to ensure that all pixels are corrected in the first frame. The outline of the proposed algorithm is summarized as follows.

[Algorithm outline: presented as an image (oe-31-19-30693-i001) in the published article.]
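As an illustration of the Phase-3 update, the following is a minimal sketch of one iteration of the gated LMS step (Eqs. (13)-(16)), assuming NumPy/SciPy; the window radius, blur factor, thresholds, and rate used here are placeholders rather than the paper's settings (see Table 2):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def box_mean(img, r):
    """Mean over the (2r+1) x (2r+1) sliding window w2."""
    return uniform_filter(img, size=2 * r + 1, mode='nearest')

def gated_lms_step(X2, g_f, b_f, R, r2=2, eps=100.0, TH_sp=400.0, TH_tp=20.0, rate=0.05):
    """One frame of the gated LMS update; initialize g_f = 1, b_f = 0, R = +inf."""
    X_hat = g_f * X2 + b_f                                  # current corrected image
    # Eq. (13): self-guided filter estimate of the expected image D
    u = box_mean(X_hat, r2)
    var = np.maximum(box_mean(X_hat ** 2, r2) - u ** 2, 0.0)
    a = var / (var + eps)
    b = (1.0 - a) * u
    D = box_mean(a, r2) * X_hat + box_mean(b, r2)
    E = X_hat - D                                           # error image
    # Eq. (15): spatio-temporal gated adaptive learning rate
    eta = np.where((var < TH_sp) & (np.abs(D - R) > TH_tp),
                   (1.0 - var / TH_sp) * rate, 0.0)
    # Eq. (14): steepest-descent update (gain learning rate scaled by 0.01)
    g_f = g_f - 0.01 * eta * E * X2
    b_f = b_f - eta * E
    # Eq. (16): update the temporal reference frame R
    R = np.where(np.abs(D - R) > TH_tp, D, R)
    return g_f, b_f, R, g_f * X2 + b_f
```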

4. Experimental results and discussion

4.1 Simulation experiments

First, we conducted simulation experiments using a subset of the publicly available FLIR infrared in-vehicle video dataset (Data 1) [4]. Specifically, we used the first 1000 frames of the dataset, which are treated as the true data. To test the algorithm's ability to correct strong FPN, we added independent Gaussian noise to the gain and bias based on Eq. (1) (gain: mean 1 and standard deviation 0.01; bias: mean 0 and standard deviation 50). We also introduced random coarse and fine vertical stripes within an intensity range of [-120,120], with a maximum width of 5 pixels for the coarse stripes, resulting in severe degradation of the images. We selected four advanced NUC algorithms that are suitable for strong FPN correction and have simple parameter settings: the local constant statistics algorithm (LCS) [7], the constant statistical algorithm of adjacent ratios (CSAR) [9], the FPN estimation algorithm (FPNE) [11], and the total variation LMS algorithm (TVLMS) [23]. In addition to subjective evaluation, we used PSNR and SSIM [33] as quantitative metrics to measure performance. Next, two sets of outdoor data (Data 2 and Data 3) from the “Thermal Infrared Dataset” [5] were used to further test the adaptability of the algorithm to different scenes, in particular scenes lacking motion: Data 2 has almost no motion apart from a small amount of local motion, and Data 3 also changes slowly.
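For reproducibility, the following is a minimal sketch of the FPN simulation described above, following Eq. (1) and assuming NumPy; the number and placement of stripes are our assumptions, since the paper specifies only the intensity range and maximum width:

```python
import numpy as np

def degrade(X, n_stripes=80, seed=0):
    """Add strong FPN to a clean frame X per Eq. (1): Gaussian gain/bias plus vertical stripes."""
    rng = np.random.default_rng(seed)
    rows, cols = X.shape
    g = rng.normal(1.0, 0.01, size=X.shape)             # gain: mean 1, std 0.01
    b = rng.normal(0.0, 50.0, size=X.shape)             # bias: mean 0, std 50
    stripes = np.zeros(cols)
    for _ in range(n_stripes):                          # random coarse/fine vertical stripes
        c = rng.integers(0, cols)
        w = rng.integers(1, 6)                          # width 1 to 5 pixels
        stripes[c:c + w] += rng.uniform(-120.0, 120.0)  # intensity in [-120, 120]
    return g * X + b + stripes[np.newaxis, :]
```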

A. Parameter settings

We created six different Gaussian filters by setting the window radius to 1 and 2 and the standard deviation to 0.5, 1, and 3. These filters were used to perform LP decomposition on the temporal mean images of the first 50 frames. The variance curves of ${L_1}$ for each filter are shown in Fig. 6. All variance curves decrease and stabilize as the number of frames increases. The filter with a window radius of 1 and a standard deviation of 1 produced the smallest variance, corresponding to the lowest MSE and minimal scene information, which is more favorable for the initial separation of FPN and scene. Please refer to Table 2 for the remaining parameter settings.

Fig. 6. Effect of Gaussian filter parameter settings on the variance of temporal mean images.


Table 2. Parameter settings for each algorithm

B. Subjective and quantitative evaluation

Figure 7 displays the correction results of each algorithm on Data 1. For clearer observation, we enhanced the images using contrast limited adaptive histogram equalization (CLAHE) [36]. Each group occupies two rows: groups 1 to 3 correspond to the corrected images and locally zoomed images (red-boxed regions) of the 300th, 630th, and 890th frames, respectively, and group 4 corresponds to the corrected images and correction matrices for the 1000th frame. The PSNR and SSIM curves are shown in Fig. 8.

Fig. 7. Correction images for the 300th, 630th, 890th, 1000th frames, and the correction matrices for the 1000th frame image on Data 1. (a) Degraded images and true images with local zoom; (b) LCS [7]; (c) CSAR [9]; (d) FPNE [11]; (e) TVLMS [23]; (f) 3SLMS. The complete correction video can be seen in Visualization 1.

Fig. 8. PSNR and SSIM curves of the corrected images on Data 1. (a) PSNR; (b) SSIM.

Observing the corrected images, LCS and TVLMS are weaker than the other methods in correcting the strong FPN, and stripe residues remain up to the 1000th frame. Further examination of the locally zoomed images and the correction matrices shows that LCS, CSAR, FPNE, and TVLMS all produced obvious ghosting artifacts. This was because the boundaries between the sky and ground areas appeared in the same region of the images for a long time, similar to a motionless scene. Although CSAR applied median filtering over a 100-frame image cube, and FPNE utilized temporal mean and first-order lag filters, significant estimation errors still occurred because these areas lacked movement for an extended period. In addition, both methods have inherent defects whereby errors accumulate and propagate, producing new diagonal noise extending downwards and to the right in regions with larger errors. As for the LCS method, it gradually approached the local constant-statistics assumption with an increasing number of frames, resulting in an increase in PSNR. However, due to the lack of movement in the aforementioned areas, ghosting artifacts persisted, causing the SSIM to fluctuate significantly under the influence of the scene. Although TVLMS used total variation and neighborhood mean filtering, it performed poorly overall due to the strong FPN and the homogeneous content of the in-vehicle scenes, and its SSIM was the lowest among these methods, which confirms that the LMS method alone is difficult to adapt to strong FPN correction.

Our algorithm had already entered the third phase by frame 300. Because the vehicle kept moving forward, the sky region appeared almost fixed in the upper part of the images, and the correction efficiency in this region was reduced by the limitations of the temporal-domain gating, so wavy noise in the sky region was not yet effectively corrected at the 300th frame. However, correction of all pixels was quickly completed after two turns, around the 630th and 890th frames. Observing the locally magnified images at frames 300, 630, and 890, except for the sky region at frame 300, our algorithm produced good correction results that were close to the true images and better than those of the other algorithms.

The correction matrices of our algorithm at each phase are shown in Fig. 9, with almost no ghosting artifacts in each phase. At the end of coarse correction (Phases 1 and 2), PSNR and SSIM were improved by 13.98 dB and 0.49, respectively, accounting for 78.7% and 58.3% of the total improvement throughout the correction process. This indicates that our algorithm greatly weakened FPN without introducing ghosting artifacts, laying a solid foundation for the LMS method.

Fig. 9. Correction matrix corresponding to the 3SLMS at the end of each phase on Data 1.

Figure 10 shows the corrected and locally zoomed images (red-boxed regions) of the last frames of Data 2 and Data 3, displayed with CLAHE enhancement. The results show that our algorithm balanced the preservation of details with the effective correction of strong FPN, which is most apparent in the locally zoomed images. Figure 11 shows the SSIM curves for the two datasets. The first two phases reduced the correction efficiency of our algorithm in the early stage, due to the need to continuously separate the FPN from the scenes, but laid a good foundation for the fine correction in the third phase, and our algorithm ultimately obtained the highest SSIM on both datasets. Based on the subjective and objective evaluations, all the above experiments confirm the adaptability of our algorithm to the scenes in the presence of strong FPN.

Fig. 10. Corrected images of the last frames on Data 2 and Data 3. (a) Degraded images and true images with local zoom; (b) LCS [7]; (c) CSAR [9]; (d) FPNE [11]; (e) TVLMS [23]; (f) 3SLMS. The complete correction videos can be seen in Visualization 2 and Visualization 3.

Fig. 11. SSIM curves of the corrected images on (a) Data 2; (b) Data 3.

4.2 Real experiments

We collected 1000 frames of data on the road using an uncooled infrared detector with a data format of 14 bits and a resolution of 480 × 640. To further test the correction ability and scene adaptability of the algorithm, we did not pre-calibrate the detector and directly performed NUC based on the scene. In addition, the vehicle was not started in the first 100 frames, and the scene was almost motionless. Because there was no true image reference, we used roughness [37] for quantitative evaluation and compared the running efficiency.

Figure 12 shows the correction results of each algorithm, displayed with CLAHE enhancement. Each group occupies two rows: groups 1 to 3 correspond to the corrected images and locally zoomed images (red-boxed regions) of the 300th, 600th, and 900th frames, respectively. The roughness curves are shown in Fig. 13.

Fig. 12. Correction images for the 300th, 600th and 900th frames. (a) Degraded images and local zoom images; (b) LCS [7]; (c) CSAR [9]; (d) FPNE [11]; (e) TVLMS [23]; (f) 3SLMS. The complete correction video can be seen in Visualization 4.

Fig. 13. The roughness curves of the corrected images. (a) 1-1000 frames; (b) local enlargement of the area inside the dashed box in (a).

Due to the high thermal radiation of the exhaust pipes, LCS, CSAR, and FPNE exhibited obvious tailing ghosting. Moreover, since the first 100 frames were almost still, all three methods produced ghosting artifacts from stationary scene residues in the 300th frame. Because CSAR applied median filtering over the most recent 100-frame image cube, these stationary scenes did not affect its corrected images at the 600th and 900th frames. However, LCS and FPNE still produced ghosting artifacts from the first 100 frames of stationary scene residues up to the 900th frame. Among them, LCS began correction from the first frame, and all scene information was used for FPN correction, so the starting point of its roughness curve was very low, synchronizing with the other algorithms after 100 frames. LCS and TVLMS still showed significant stripe noise in the zoomed images at frame 900. Our method performed well in terms of correction effect, ghosting suppression, and roughness. It entered the second phase at the 57th frame and the third phase at the 62nd frame, corresponding to the fastest drop in roughness during the second phase and the lowest roughness among all methods after the 350th frame.

We ran all algorithms in Matlab R2020b on a computer with a 2.10 GHz CPU and 16 GB of RAM, and the average processing time per frame was obtained from the running time over 1000 frames, as shown in Table 3. Except for CSAR, which requires inefficient median filtering of a 100-frame image cube, all the algorithms are suitable for real-time image processing. Our algorithm can process 30 frames per second for images with 300,000 pixels, and its efficiency can be further improved through hardware porting and algorithm optimization in practical applications.


Table 3. Average runtime per frame for each algorithm (s × 10−3)

5. Conclusion

This paper proposes a strong-FPN NUC algorithm that combines the advantages of single-frame stripe removal, multi-scale constant statistics, and the LMS method. The study focuses on in-vehicle and motion-lacking scenes and demonstrates the algorithm's good scene adaptability. The proposed algorithm begins by analyzing the frequency characteristics of FPN. Phase 1 and Phase 2 employ an LP-based progressive correction strategy, leveraging 1-D guided filtering and multiple high-pass filtering to separate stripes and large-deviation Gaussian noise, respectively. This approach significantly weakens FPN without introducing ghosting artifacts, allowing adaptive progression to the next phase. In Phase 3, based on the guided-filtering LMS method, we optimize the expected image estimate and the spatio-temporal adaptive learning rates to improve correction accuracy. Experiments validate the algorithm's strong adaptability, its correction of strong FPN, and its fast convergence within hundreds of frames, making it suitable for real-time image processing. While it effectively addresses these forms of noise, other types with significant deviation, such as oblique stripes and irregular blocks, may require further improvements. Increasing the radius and standard deviation of the Gaussian filter in Phase 2 can help separate these noise types along with Gaussian noise, although caution is needed to avoid mixing in scene information and introducing ghosting artifacts. Future research should focus on refining this aspect of the algorithm.

Funding

National Natural Science Foundation of China (62105152); Leading Technology of Jiangsu Basic Research Plan (BK20192003); Fundamental Research Funds for the Central Universities (30919011401, 30922010204, 30922010718, JSGP202202); Funds of the Key Laboratory of National Defense Science and Technology (6142113210205).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available in Ref. [4].

References

1. D. A. Scribner, M. R. Kruer, and J. M. Killiany, “Infrared focal plane array technology,” Proc. IEEE 79(1), 66–85 (1991). [CrossRef]  

2. P. J. Burt and E. H. Adelson, “The Laplacian Pyramid as a compact image code,” IEEE Trans. Commun. 31(4), 671–679 (1983). [CrossRef]  

3. A. Friedenberg and I. Goldblatt, “Nonuniformity two-point linear correction errors in infrared focal plane arrays,” Opt. Eng. 37(4), 1251–1253 (1998). [CrossRef]  

4. Teledyne FLIR, “Free Teledyne FLIR thermal dataset for algorithm training,” Teledyne FLIR (2018), https://www.flir.com/oem/adas/adas-dataset-form/.

5. ASL, “Thermal Infrared Dataset,” ASL2014, https://projects.asl.ethz.ch/datasets/doku.php?id=ir:iricra2014.

6. J. G. Harris and Y. M. Chiang, “Nonuniformity correction of infrared image sequences using the constant-statistics constraint,” IEEE Trans. Image Process. 8(8), 1148–1151 (1999). [CrossRef]  

7. C. Zhang and W. Zhao, “Scene-based nonuniformity correction using local constant statistics,” J. Opt. Soc. Am. A 25(6), 1444–1453 (2008). [CrossRef]  

8. C. Zuo, Q. Chen, G. Gu, X. Sui, and W. Qian, “Scene-based nonuniformity correction method using multiscale constant statistics,” Opt. Eng. 50(8), 087006 (2011). [CrossRef]  

9. D. Zhou, D. Wang, L. Huo, R. Liu, and P. Jia, “Scene-based nonuniformity correction for airborne point target detection systems,” Opt. Express 25(13), 14210–14226 (2017). [CrossRef]  

10. L. Geng, Q. Chen, and W. Qian, “An adjacent differential statistics method for IRFPA nonuniformity correction,” IEEE Photonics J. 5(6), 6801615 (2013). [CrossRef]  

11. C. Liu, X. Sui, Y. Liu, X. Kuang, G. Gu, and Q. Chen, “FPN estimation based nonuniformity correction for infrared imaging system,” Infrared Phys. Technol. 96, 22–29 (2019). [CrossRef]  

12. D. A. Scribner, K. A. Sarkady, J. T. Caulfield, M. R. Kruer, G. Katz, and C. J. Gridley, “Nonuniformity correction for staring IR focal plane arrays using scene-based techniques,” Proc. SPIE 1308, 224–233 (1990). [CrossRef]

13. W. Qian, Q. Chen, and G. Gu, “Space low-pass and temporal high-pass nonuniformity correction algorithm,” Opt. Rev. 17(1), 24–29 (2010). [CrossRef]  

14. C. Zuo, Q. Chen, G. Gu, and W. Qian, “New temporal high-pass filter nonuniformity correction based on bilateral filter,” Opt. Rev. 18(2), 197–202 (2011). [CrossRef]  

15. R. C. Hardie, M. M. Hayat, E. Armstrong, and B. Yasuda, “Scene-based nonuniformity correction with video sequences and registration,” Appl. Opt. 39(8), 1241–1250 (2000). [CrossRef]  

16. C. Zuo, Q. Chen, G. Gu, and X. Sui, “Scene-based nonuniformity correction algorithm based on interframe registration,” J. Opt. Soc. Am. A 28(6), 1164–1176 (2011). [CrossRef]  

17. C. Zuo, Y. Zhang, Q. Chen, G. Gu, W. Qian, X. Sui, and J. Ren, “A two-frame approach for scene-based nonuniformity correction in array sensors,” Infrared Phys. Technol. 60, 190–196 (2013). [CrossRef]  

18. D. A. Scribner, K. A. Sarkady, M. R. Kruer, and J. T. Caulfield, “Adaptive nonuniformity correction for IR focal plane arrays using neural networks,” Proc. SPIE 1541, 100–109 (1991). [CrossRef]  

19. A. Rossi, M. Diani, and G. Corsini, “Bilateral filter-based adaptive nonuniformity correction for infrared focal-plane array systems,” Opt. Eng. 49(5), 057003 (2010). [CrossRef]  

20. S. Rong, H. Zhou, Z. Wen, H. Qin, K. Qian, and K. Cheng, “An improved non-uniformity correction algorithm and its hardware implementation on FPGA,” Infrared Phys. Technol. 85, 410–420 (2017). [CrossRef]  

21. L. Li, Q. Li, H. Feng, Z. Xu, and Y. Chen, “A novel infrared focal plane non-uniformity correction method based on co-occurrence filter and adaptive learning rate,” IEEE Access 7, 40941–40950 (2019). [CrossRef]  

22. E. Vera and S. Torres, “Fast adaptive nonuniformity correction for infrared focal-plane array detectors,” EURASIP J. Adv. Signal Process. 2005(13), 560759 (2005). [CrossRef]  

23. E. Vera, P. Mesa, and S. Torres, “Total variation approach for adaptive nonuniformity correction in focal-plane arrays,” Opt. Lett. 36(2), 172–174 (2011). [CrossRef]  

24. R. C. Hardie, F. Baxley, B. Brys, and P. Hytla, “Scene-based nonuniformity correction with reduced ghosting using a gated LMS algorithm,” Opt. Express 17(17), 14918–14933 (2009). [CrossRef]  

25. R. Lai, J. Guan, Y. Yang, and A. Xiong, “Spatiotemporal adaptive nonuniformity correction based on BTV regularization,” IEEE Access 7, 753–762 (2019). [CrossRef]

26. Y. Li, W. Jin, J. Zhu, X. Zhang, and S. Li, “An adaptive deghosting method in neural network-based infrared detectors nonuniformity correction,” Sensors 18(1), 211 (2018). [CrossRef]  

27. L. Song and H. Huang, “Spatial and temporal adaptive nonuniformity correction for infrared focal plane arrays,” Opt. Express 30(25), 44681–44700 (2022). [CrossRef]  

28. L. Liu, L. Xu, and H. Fang, “Simultaneous intensity bias estimation and stripe noise removal in infrared images using the global and local sparsity constraints,” IEEE Trans. Geosci. Remote Sensing 58(3), 1777–1789 (2020). [CrossRef]  

29. Y. Cao, M. Y. Yang, and C. L. Tisse, “Effective strip noise removal for low-textured infrared images based on 1-D guided filtering,” IEEE Trans. Circuits Syst. Video Technol. 26(12), 2176–2188 (2016). [CrossRef]  

30. Y. Cao, Z. He, J. Yang, X. Ye, and Y. Cao, “A multi-scale non-uniformity correction method based on wavelet decomposition and guided filtering for uncooled long wave infrared camera,” Signal Process-Image 60, 13–21 (2018). [CrossRef]  

31. X. Kuang, X. Sui, Q. Chen, and G. Gu, “Single infrared image stripe noise removal using deep convolutional networks,” IEEE Photonics J. 9(4), 1–13 (2017). [CrossRef]  

32. J. Guan, R. Lai, A. Xiong, Z. Liu, and L. Gu, “Fixed pattern noise reduction for infrared images based on cascade residual attention CNN,” Neurocomputing 377, 301–313 (2020). [CrossRef]  

33. Z. Wang and A. C. Bovik, “Mean Squared Error: Love it or leave it?” IEEE Signal Proc. Mag. 26(1), 98–117 (2009). [CrossRef]  

34. K. He, J. Sun, and X. Tang, “Guided image filtering,” IEEE Trans. Pattern Anal. Mach. Intell. 35(6), 1397–1409 (2013). [CrossRef]  

35. H. Yu, H. Zhang, and C. Wang, “An improved retina-like nonuniformity correction for infrared focal-plane array,” Infrared Phys. Technol. 73, 62–72 (2015). [CrossRef]

36. K. Zuiderveld, “Contrast limited adaptive histogram equalization,” in Graphic Gems IV (Academic Press Professional, 1994), pp. 474–485.

37. T. Svensson, “An evaluation of image quality metrics aiming to validate long term stability and the performance of NUC methods,” Proc. SPIE 8706, 870604 (2013). [CrossRef]  

Supplementary Material (4)

NameDescription
Visualization 1       The complete correction video of the simulation experiments on Data 1.
Visualization 2       The complete correction video of the simulation experiments on Data 2.
Visualization 3       The complete correction video of the simulation experiments on Data 3.
Visualization 4       The complete correction video of the real experiments.



