
Cascading auto-regressive exponential smoothing of image sequences for reducing turbulence induced motion


Abstract

Atmospheric turbulence can significantly degrade images taken over a long horizontal path near the ground. This can hinder the identification of objects in a scene. We consequently introduce the Cascading Auto-Regressive Exponential Smoothing (CARES) algorithm, a fast real-time algorithm that suppresses the effects of atmospheric turbulence in image sequences. CARES is a spatial/temporal filtering algorithm that decomposes the image into a Laplacian Image Pyramid (LIP). Each component of the LIP represents the image smoothed to a specific length scale and is temporally filtered using an Auto-Regressive Exponential Smoothing (ARES) filter. The cut-off frequency of each ARES filter is adjusted for its LIP component so that, together, the filters define a critical velocity. Objects in the scene moving below the critical velocity pass through the CARES filter with little distortion or delay. We assess the performance of CARES using turbulent imaging data and find that it improves image quality as measured by a variety of image quality metrics. We use a simple CARES simulation to show that the salient features of a moving object lag behind by one pixel or less.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Atmospheric turbulence, in conjunction with density and humidity fluctuations, leads to refractive index fluctuations. These fluctuations can have a significant impact on electromagnetic propagation, especially at visible and infrared wavelengths [1,2]. Images taken over a long horizontal path near the earth show time-varying blur, warping and apparent motion induced by atmospheric turbulence [3,4]. Such distortions can hinder the detection and identification of objects in a scene. In real-time surveillance applications, accurate, low-latency imagery is of high value.

A number of algorithms have been proposed to correct the effects of atmospheric turbulence, including: speckle interferometry [5,6], image segmentation and optical flow [7–11], blind iterative deconvolution [12,13], maximum-likelihood blind deconvolution [14] and lucky region imaging [15,16]. Some algorithms may be considered hybrids that combine optical flow and/or lucky region imaging with sophisticated image processing, such as blind deconvolution [17,18]. There are also algorithms based on deep learning with neural networks [19,20].

Speckle interferometry, blind deconvolution and optical flow algorithms are computationally expensive and difficult to run in real time. Some of these are block processing algorithms that require several input frames [14,17,18], so there is an inherent delay between collecting the images and displaying the corrected frame. The blind iterative deconvolution approaches [12,13] require only one frame, but their iterative nature can make the algorithms that use them slow. Lucky region imaging is not suitable for moving targets in near-ground imaging.

The image segmentation approaches [7–11] are specifically designed for video processing. They rely on a clear distinction between moving objects and the background, enabling them to perform an image segmentation and treat the background and moving objects separately. Such algorithms are rather elaborate, requiring techniques such as optical flow and intensity-change detection to find moving objects. These objects are only detected if they pass a variety of thresholds set for the algorithm, leaving open the possibility that slow-moving objects with low contrast against the background may not be detected.

This paper introduces the Cascading Auto-Regressive Exponential Smoothing (CARES) algorithm for the real-time suppression of turbulence effects in image sequences with moving objects. The CARES algorithm reduces the apparent turbulent motion while not significantly delaying or distorting moving objects.

Section 2 provides the background theory for the image processing and time-domain filters used in CARES. Section 3 describes the CARES algorithm. Section 4 describes the turbulent image measurement trial and the analysis of the CARES outputs of those images regarding improvements in image quality and moving object delays. These results are discussed in Section 5, and the conclusions are in Section 6.

2. Background theory

2.1 Laplacian image pyramid

The Laplacian Image Pyramid (LIP) is a way of decomposing an image into a multi-resolution pyramid [21–23]. Note that the LIP has been used by Xue et al. [24] to suppress turbulence, but their method is suitable for static scenes. The decomposition begins with the choice of a finite one-dimensional generating kernel $\hat {w}[n]$ [21]

$$\hat{w} \left[ n \right] = \left[ {\frac{1}{4} - \frac{a}{2}},\ \frac{1}{4},\ a, \frac{1}{4},\ {\frac{1}{4} - \frac{a}{2}} \right]$$
where $a$ is an adjustable parameter and $\sum \hat {w} [n] = 1$ for any value of $a$. From this, a two-dimensional (2D) kernel is constructed
$$w[n ,m] = \hat{w} [n] \hat{w} [m],$$
which will be used to implement the LIP. We will use $0.375\leq a \leq 0.4$ as these values generate Low-Pass Filters (LPF) whose kernels are approximately Gaussian.

A flowchart of the LIP is shown in Fig. 1, where the input image is decomposed into image octaves. The process begins by producing a low-pass filter image $T_1$ as a convolution of the $N \times M$ pixel input image $I_0$ with the 2-D kernel $w[n,m]$,

$$T_1 [n, m] = \left( w \star I_0 \right) [n , m] = \sum_{k ={-}2}^{2} \sum_{l ={-}2}^{2} I_0 [n - k, m - l ] w [ k, l ] .$$

Note that we use a zero padding around the edges of $I_0$ so that $T_1$ has the same size. This image is downsampled by a factor of 2 as indicated by the $\downarrow _{2}$ symbol to create

$$G_1[n,m]=T_1[2n,2m] \, .$$

A high-pass image is created by upsampling $G_1$ to be the same size as $I_0$ and subtracting it from the original image, giving $L_0 = I_0 - G_{1\uparrow }$. The upsampled $G_{1\uparrow }$ is given by

$$G_{1 \uparrow }\left[ n, m \right] = 4 \sum_{k ={-}2}^{2} \sum_{l ={-}2}^{2} G_1 \left[ {\frac{{n - k}}{2},\frac{{m - l}}{2}} \right] w \left[ k, l \right]$$
where the summations are only computed for integral values of $\left [ {\frac {{n - k}}{2},\frac {{m - l}}{2}} \right ]$. Note that in Eq. (5), we have $1 \leq n \leq 2N_G$ and $1 \leq m \leq 2M_G$, where $N_G$ and $M_G$ are the height and width, respectively, of $G_1$.

Fig. 1. Flowchart for the LIP decomposition for four octaves.

Next, $G_1$ is low-pass filtered, $T_2 = w \star G_1$, and $T_2$ is downsampled by a factor of two to create $G_2$. The second octave is given by subtracting the upsampled $G_2$ from $G_1$, that is, $L_1 = G_1 - G_{2\uparrow }$. The decomposition proceeds as above with the third octave given by $L_2 = G_2 - G_{3\uparrow }$. The last octave is simply the low-pass image $G_3$ and is an eighth the size of the input image. Note that the images $G_0$ to $G_3$ (with $G_0 = I_0$) form the Gaussian pyramid expansion of $I_0$, while $L_0$ to $L_3$ form the Laplacian pyramid expansion of $I_0$ [21].

The first octave of the Laplacian pyramid $L_0$ is a high-pass image; the last octave $L_3$ is a low-pass image; and the octaves in between $L_1$ and $L_2$ are bandpass images. Figure 1 shows a Laplacian decomposition with four octaves, but obviously a LIP with fewer or more octaves could have been created.

Figure 2 shows the LIP of a test image with three octaves. The first octave is the high-pass filtered input and is the same size as the input image. The second octave is a band-pass image and is half the size of the original image. The third octave is a low-pass image that is one quarter the size of the original image.

Fig. 2. The test input image for the LIP decomposition is on the left. It is 512 $\times$ 512 pixels, as is the first octave in the middle. The second octave (upper right) is 256 $\times$ 256 pixels and the last octave (lower right) is 128 $\times$ 128 pixels. The picture is in the public domain and is used under the fair use principle.

The LIP is a reversible transform. To invert the LIP, the highest octave is upsampled as in Eq. (5) and added to the next highest octave as given by $\hat {I}_2 = L_{3\uparrow } + L_2$. This image is then upsampled again and added to the next lower octave as given by $\hat {I}_1 = \hat {I}_{2\uparrow } + L_1$. This process is repeated up the octaves until all the octaves have been summed in giving the reconstructed image $\hat {I}_0 = \hat {I}_{1\uparrow } + L_0$.
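To make the decomposition and its inverse concrete, the following is a minimal sketch of a LIP in Python, assuming NumPy and SciPy are available. The function names (lip_decompose, lip_reconstruct) and the zero-padding convention for the upsampling step are our own illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import convolve

def generating_kernel(a=0.375):
    """2-D kernel w[n, m] = w_hat[n] w_hat[m] from Eqs. (1) and (2)."""
    w_hat = np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])
    return np.outer(w_hat, w_hat)

def downsample(image, w):
    """Low-pass filter with w (zero padding, Eq. (3)) and keep every second pixel (Eq. (4))."""
    t = convolve(image, w, mode="constant", cval=0.0)
    return t[::2, ::2]

def upsample(image, w, shape):
    """Insert zeros between samples, then interpolate with 4*w (equivalent to Eq. (5))."""
    up = np.zeros(shape)
    up[::2, ::2] = image
    return convolve(up, 4.0 * w, mode="constant", cval=0.0)

def lip_decompose(image, octaves=4, a=0.375):
    """Return the Laplacian pyramid [L_0, ..., L_{K-2}, G_{K-1}]."""
    w = generating_kernel(a)
    pyramid, g = [], image.astype(float)
    for _ in range(octaves - 1):
        g_next = downsample(g, w)
        pyramid.append(g - upsample(g_next, w, g.shape))
        g = g_next
    pyramid.append(g)          # the last octave is the low-pass image
    return pyramid

def lip_reconstruct(pyramid, a=0.375):
    """Invert the LIP by upsampling and summing from the top octave down."""
    w = generating_kernel(a)
    image = pyramid[-1]
    for level in reversed(pyramid[:-1]):
        image = upsample(image, w, level.shape) + level
    return image
```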

2.2 Exponential smoothing

In the CARES algorithm, an Exponential Smoothing (ES) filter is used to low-pass filter each pixel in time. The ES filter is given by the recursion relation

$$x_1 \left[ n \right] = \alpha x_1 \left[ n - 1 \right] + \left( 1 - \alpha \right) x_0 \left[ n \right] \ ,$$
where $n$ is the time step, $\alpha$ is the filter parameter, $x_0$ is the unfiltered pixel intensity value time-series and $x_1$ is the filtered time-series (see Fig. 3). The ES filter has the same attenuation of Gaussian white noise as an $N$ tap averaging filter for
$$\alpha = \frac{{N - 1}}{{N + 1}}$$
where $N$ is the number of samples in the average. The recursion relation (6) has the following representation in the frequency domain
$$X_1 \left( e^{j \omega} \right) = H \left( e^{j \omega} \right) X_0 \left( e^{j \omega} \right) \ ,$$
where $-\pi \leq \omega \leq \pi$ is the angular frequency, $X_1$ and $X_0$ are the Fourier transforms of $x_1$ and $x_0$ respectively, and
$$H \left( e^{j \omega} \right) = \frac{ 1 - \alpha}{ 1 - \alpha e^{{-}j \omega}} \ ,$$
is the transfer function of the ES filter.

Fig. 3. Flowchart for the ES filter. Note that the operator $z^{-1}$ represents a one timestep delay.

Like an averaging filter, the ES filter delays the output with respect to the input. This delay can be quantified by the group delay. By writing the ES transfer function in polar form, $H \left ( e^{j \omega } \right ) = A \exp (j \theta )$, the group delay is defined as

$$\tau \left( \omega \right) ={-} \frac{{d\theta \left( \omega \right)}}{d \omega} ,$$
where $\theta \left ( \omega \right )$ is the phase of the transfer function. Ideally, the group delay would be zero so that the output is not delayed relative to the input.
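As a quick numerical check of Eq. (10), the group delay of the ES recursion can be evaluated directly from its transfer function coefficients. This is a minimal sketch assuming SciPy; the choice $N = 10$ (so $\alpha \approx 0.818$) mirrors the example used later in Fig. 5.

```python
import numpy as np
from scipy.signal import group_delay

N = 10
alpha = (N - 1) / (N + 1)          # Eq. (7)

# ES transfer function H(z) = (1 - alpha) / (1 - alpha z^{-1}), Eq. (9)
b = [1.0 - alpha]                  # numerator coefficients
a = [1.0, -alpha]                  # denominator coefficients

w, gd = group_delay((b, a), w=512)
print(f"low-frequency group delay: {gd[1]:.2f} timesteps")  # approximately alpha/(1-alpha) = 4.5
```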

The group delay of the ES filter can be reduced by recursively applying the ES filter. Brown and Meyer [25] have shown that if an input signal can be approximated as a Taylor series of order $l$, such that

$$x_0 \left[ n \right] \approx x_0 \left[ n_0 \right] + c_1 \left(n - n_0\right) + c_2 \left(n - n_0\right)^2 + \ldots + c_l \left(n - n_0\right)^l$$
then the linear, quadratic and higher order terms can be removed by a modified ES filter. This modified ES filter takes the following form in the frequency domain
$$F_l \left( e^{j \omega} \right) = 1 - \left( 1 - H \left( e^{j \omega} \right) \right)^{l + 1} ,$$
where the exponent $l + 1$ indicates that the ES filter is applied $l + 1$ times.

Since the modified ES filter can find the coefficients of the Taylor series and remove them, it will be designated as the Auto-Regressive Exponential Smoothing (ARES) filter. It can be shown that the output corresponding to a polynomial of order $l$ is $Y_l = F_l X_0$ and in the time domain is

$$y_l \left[ n \right] = \sum_{m = 0}^{l} \left\{ \frac{ \left({-}1 \right)^m \left( l + 1 \right)!}{ \left( m + 1 \right)! \left( l - m \right)! } \right\} x_{m + 1} \left[ n \right] .$$

Only ARES filters of the orders of 0, 1 and 2 will be used in this work. In the time domain, the output from applying the ES filter recursively is

$${x_2}\left[ n \right] = \alpha {x_2}\left[ {n - 1} \right] + \;\left( {1 - \alpha } \right){x_1}\left[ n \right]$$
and
$${x_3}\left[ n \right] = \alpha {x_3}\left[ {n - 1} \right] + \;\left( {1 - \alpha } \right){x_2}\left[ n \right]$$
where ${x_1}\left [ n \right ]$ is defined in Eq. (6), and ${x_2}\left [ n \right ]$ and ${x_3}\left [ n \right ]$ are the outputs for applying the ES filter recursively two and three times respectively. Then using Eq. (13), the zeroth ${y_0}\left [ n \right ]$, first ${y_1}\left [ n \right ]$ and second ${y_2}\left [ n \right ]$ order ARES filters are respectively given by
$${y_0}\left[ n \right] = {x_1}\left[ n \right],$$
$${y_1}\left[ n \right] = 2{x_1}\left[ n \right] - {x_2}\left[ n \right],$$
and
$${y_2}\left[ n \right] = 3{x_1}\left[ n \right] - 3{x_2}\left[ n \right] + {x_3}\left[ n \right].$$

Figure 4 shows an example of ARES filter outputs for a Gaussian pulse input with a standard deviation of 15 time steps; all of the ARES filters have $N = 10$. The zero-order output is a Gaussian form that is delayed with respect to the input, wider and somewhat attenuated. The first-order output shows less of a delay but appreciable distortion of the Gaussian form, whereas the second-order output tracks the input very closely. This means that over a span of ten time steps, the input can be well approximated as a second-order Taylor series.
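The recursions in Eqs. (6) and (14)–(18) translate directly into code. The sketch below is a minimal NumPy implementation of the order 0, 1 and 2 ARES filters applied sample by sample to a one-dimensional signal; the function name ares_filter, the state initialization and the Gaussian-pulse test signal are illustrative choices meant to mirror the example of Fig. 4, not the authors' code.

```python
import numpy as np

def ares_filter(x0, N, order):
    """Apply an order 0, 1 or 2 ARES filter (Eqs. (6), (14)-(18)) to the signal x0."""
    alpha = (N - 1) / (N + 1)                    # Eq. (7)
    x1 = x2 = x3 = float(x0[0])                  # initialize the ES states with the first sample
    y = np.empty(len(x0), dtype=float)
    for n, sample in enumerate(x0):
        x1 = alpha * x1 + (1 - alpha) * sample   # first ES pass, Eq. (6)
        x2 = alpha * x2 + (1 - alpha) * x1       # second ES pass, Eq. (14)
        x3 = alpha * x3 + (1 - alpha) * x2       # third ES pass, Eq. (15)
        if order == 0:
            y[n] = x1                            # Eq. (16)
        elif order == 1:
            y[n] = 2 * x1 - x2                   # Eq. (17)
        else:
            y[n] = 3 * x1 - 3 * x2 + x3          # Eq. (18)
    return y

# Gaussian pulse with a standard deviation of 15 timesteps, as in Fig. 4
t = np.arange(200)
pulse = np.exp(-0.5 * ((t - 100) / 15.0) ** 2)
outputs = {k: ares_filter(pulse, N=10, order=k) for k in (0, 1, 2)}
```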

Fig. 4. The outputs of an ARES filter with $N = 10$ and of order 0 (black line), order 1 (blue line) and order 2 (red line) using a Gaussian pulse as input (fat gray line) with a standard deviation of 15.

Figure 5(a) shows the frequency response for order 0, 1 and 2 ARES filters, with the frequency response of the Gaussian pulse from Fig. 4 shown in gray. Notice that the Gaussian is passed with very little attenuation, as most of its power is in the pass-band of the LPF. The group delay for an ARES filter with $\alpha =0.8182$ $(N=10)$ is shown in Fig. 5(b). The order 0 filter has a group delay of 4.5 timesteps at low frequencies, which is the same as for an averaging filter with $N=10$. The order 1 and order 2 ARES filters have zero group delay at zero frequency but have a nonlinear group delay at higher frequencies. Note also that the frequency band of the Gaussian in Fig. 5(a) lies within the range where the group delay for order 2 is approximately zero in Fig. 5(b).

Fig. 5. On the left (a) are the magnitudes of the ARES filters and the Gaussian from Fig. 4, while on the right (b) are the group delays of the ARES filters. The functions are plotted with respect to the normalized frequency $f = \omega /\pi$.

As the order of the ARES filter increases, it attenuates the higher frequencies less. For the higher-order filters to have the same attenuation as the zero-order filter, their $N$ values must be increased. The average power of the output of the filter, when the input is zero-mean noise with variance $\sigma ^{2}_n$ and uncorrelated covariance $\left \langle {x\left [ n \right ]x\left [ m \right ]} \right \rangle = {\sigma _n^2}{\delta _{n m}}$, can be used as a measure of the attenuation of the filter. Note that the angle brackets $\left \langle \right \rangle$ indicate an ensemble average and $\delta _{n m}$ is the Kronecker delta function.

The average output power for a filter is

$$\left\langle s^2 \right\rangle = {\sigma ^2_n}\sum_{k = 0}^\infty {{h^2}\left[ k \right]}$$
where $h\left [ k \right ]$ is the impulse response of the filter. The impulse responses for the zeroth, first and second order ARES filters are respectively found to be
$$h_0 [k] = \left( \frac{2}{N + 1} \right) \left( \frac{N - 1}{N + 1} \right)^k ,$$
$$h_1 [k] = \left( \frac{2}{N + 1} \right)^2 \left( \frac{N - 1}{N + 1} \right)^k \left( N - k \right) ,$$
and
$$h_2 [k] = \left( \frac{2}{N + 1} \right)^3 \left( \frac{N - 1}{N + 1} \right)^k \left\{ \frac{3}{4} \left( N - k \right)^2 - \frac{1}{4} \left( k^2 - 1 \right) \right\}.$$

Then substituting Eqs. (20), (21) and (22) into Eq. (19), the noise powers of the zeroth, first and second order ARES filters are found to be

$${P_0} = \frac{{{\sigma_n^2}}}{N_0} ,$$
$${P_1} = \frac{{{\sigma_n^2}}}{{2{N_1^3}}}\left( 5 N_1^2 - 4N_1 + 1 \right) ,$$
and
$${P_2} = \frac{{{\sigma_n^2}}}{8 N_2^5}\left( 33 N_2^4 - 54 N_2^3 + 44 N_2^2 - 18N_2 + 3 \right) ,$$
where $N_0$, $N_1$ and $N_2$ are the $N$ values of the zeroth, first and second order ARES filters respectively.

For the order 1 and 2 ARES filters, the $N_1$ and $N_2$ are adjusted to keep the filter powers the same as the power of the zero order filter with $N_0$. By equating Eq. (23) with Eq. (24) we get the polynomial

$$\left(\frac{1}{N_0}\right) N_1^3 - \left(\frac{5}{2}\right) N_1^2 + 2 N_1 - \frac{1}{2} = 0 \ .$$

Similarly, the $N_2$ for an order 2 ARES filter can be found by solving the polynomial

$$\left(\frac{1}{N_0}\right) N_2^5 - \left(\frac{33}{8}\right) N_2^4 + \left(\frac{27}{4}\right) N_2^3 - \left(\frac{11}{2}\right) N_2^2 + \left(\frac{9}{4}\right) N_2 - \frac{3}{8} = 0 \ .$$

In both these polynomials, there is only one real root, which is the desired root.
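In practice, the order-adjusted timescale can be obtained numerically as the single real root of Eq. (26) or Eq. (27). The sketch below uses numpy.roots for this; the helper name order_adjust is our own label for this adjustment (it corresponds to the order adjustment step described next), not a function from the paper.

```python
import numpy as np

def order_adjust(N0, order):
    """Return the timescale whose noise power matches a zero-order ARES filter
    with timescale N0, by finding the single real root of Eq. (26) (order 1)
    or Eq. (27) (order 2)."""
    if order == 0:
        return float(N0)
    if order == 1:
        coeffs = [1.0 / N0, -5/2, 2.0, -1/2]                       # Eq. (26)
    else:
        coeffs = [1.0 / N0, -33/8, 27/4, -11/2, 9/4, -3/8]         # Eq. (27)
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    return float(real[0])

# Example: the timescales that match the noise power of a zero-order filter with N0 = 10
print(order_adjust(10, 1), order_adjust(10, 2))
```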

Figure 6 shows flowchart representations of the ARES filters from orders zero to two. Each box is composed of two steps. The first is the order adjustment of the timescales $N$, represented by an order adjustment function $N_O = {\rm OA}(N,O)$ which in turn represents the real root of the polynomials in Eqs. (26) and (27). The second step is the ARES filter proper, which is described by Eq. (6) and by Eqs. (14) to (18). Note that the ES filter shown in Fig. 3 is here represented as the function ${\rm ES}(N)$.

Fig. 6. Flowcharts for the ARES filter. Box A represents the zero order filter, ARES(N,0), box B is the first order filter, ARES(N,1), and box C is the second order filter, ARES(N,2).

3. Cascading auto-regressive exponential smoothing filter

3.1 Algorithm

The CARES algorithm processes each image as it is collected with the following steps: i) the image is decomposed into $K$ octaves using the LIP, ii) each octave is passed through an ARES filter with a timescale $N_{(k)}$ and order $O_{(k)}$, iii) the filtered octaves are recombined into an image using the inverse LIP, iv) a sharpening filter is applied to the image. This algorithm is illustrated in Fig. 7. Note that in the figure, the variables $N_{(k)}$ indicate the timescale for the octave $k$, and not the $k$th order timescale $N_k$.
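As a rough illustration of this per-frame pipeline, the sketch below strings the pieces together in Python, reusing the lip_decompose, lip_reconstruct and order_adjust helpers sketched earlier and the sharpening kernel sketched later in this section. The function name cares_frame, the per-octave state handling and the fixed second-order ARES choice are our own simplifications under those assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import convolve

def cares_frame(frame, state, N_oct, w_sharp=None):
    """One CARES iteration: LIP decompose (i), ARES-filter each octave in time (ii),
    reconstruct (iii) and sharpen (iv). state[k] holds the ES states (x1, x2, x3)
    of octave k; second-order ARES filters are used throughout."""
    octaves = lip_decompose(frame, octaves=len(N_oct))
    filtered = []
    for k, (oct_k, N_k) in enumerate(zip(octaves, N_oct)):
        alpha = (N_k - 1) / (N_k + 1)
        if state[k] is None:                       # initialize the states on the first frame
            state[k] = [oct_k.copy(), oct_k.copy(), oct_k.copy()]
        x1, x2, x3 = state[k]
        x1 = alpha * x1 + (1 - alpha) * oct_k      # Eq. (6)
        x2 = alpha * x2 + (1 - alpha) * x1         # Eq. (14)
        x3 = alpha * x3 + (1 - alpha) * x2         # Eq. (15)
        state[k] = [x1, x2, x3]
        filtered.append(3 * x1 - 3 * x2 + x3)      # second-order ARES, Eq. (18)
    image = lip_reconstruct(filtered)
    if w_sharp is not None:
        image = convolve(image, w_sharp, mode="nearest")
    return image

# Usage: state = [None] * len(N_oct); then call cares_frame(frame, state, N_oct, w_sharp)
# for each incoming frame, where N_oct and w_sharp come from the sketches later in this section.
```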

Fig. 7. Flowchart for the CARES filter.

The CARES algorithm has a large number of parameters to be specified. These include the parameter $a$ of the generating kernel in Eq. (1), the number of octaves $K$ in the LIP, for each octave the timescale $N_{(k)}$ and order $O_{(k)}$ of the ARES filters, and the sharpening filter coefficient $\epsilon$ (see Eq. (29)). However, the parameter $a$ and the number of octaves $K$ in the LIP would likely not be adjustable by a user. The user adjustable parameters are likely to be the timescales, the orders and the sharpening filter coefficient, which are $2K + 1$ parameters. Therefore, for a 4-octave LIP the user must adjust 9 independent parameters, which is a significant number.

The number of parameters can be reduced by defining a critical velocity, $v_c$ in pixels per timestep, such that any movement faster than this velocity is filtered out while slower moving objects pass unattenuated. Consider a circular object with a diameter of $D$ pixels moving across an image. The object will pass over a given pixel in the time $T = D/V$, where $V$ is the speed of the object in pixels per timestep. If we want this object to pass, we define a low-pass filter with a cutoff frequency $f_c = 1/T = V/D$. Objects of size $D$ moving at a speed $V$ or less will pass, while objects of size $D$ moving faster will be attenuated.

Each octave of the LIP represents the energy at a characteristic length scale of the image. The first, high-pass octave shows all features whose characteristic size is smaller than 2 pixels. The second octave shows all features with characteristic sizes between 2 and 4 pixels, and so on. Thus, for each octave of the LIP we define a zero order timescale for the ARES filter,

$$N_{0, k} = \frac{d_k}{v_c} \ ,$$
where $d_k = 2^{k + 1}$ is the characteristic size at octave $k$ of the pyramid. For first and second order ARES filters, Eq. (28) can be substituted into Eqs. (26) and (27) to calculate $N_{1, k}$ and $N_{2, k}$ respectively.
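A minimal sketch of this per-octave timescale assignment, reusing the order_adjust helper sketched in Section 2.2 (an illustrative name, not the authors' code), is:

```python
def octave_timescales(v_c, octaves=8, order=2):
    """Timescales for each LIP octave: N_{0,k} = d_k / v_c with d_k = 2**(k+1) (Eq. (28)),
    then adjusted to the chosen ARES order via Eq. (26) or (27)."""
    return [order_adjust(2 ** (k + 1) / v_c, order) for k in range(octaves)]

# e.g. a critical velocity of 0.5 pixels/timestep (30 pixels/sec at a 60 Hz frame rate)
N_oct = octave_timescales(v_c=0.5, octaves=8, order=2)
```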

In the last stage of the CARES algorithm, a sharpening filter is applied using the convolution $I_{out} = W_{\epsilon } \star I_{filt}$. The impulse response of the sharpening filter is given by:

$$W_{\epsilon} \left[n, m \right] = C_{\epsilon} \left( 3 - \frac{n^2}{\epsilon^2} - \frac{m^2}{\epsilon^2} \right) \exp \left( - \frac{n^2 + m^2}{2 \epsilon^2} \right)$$
where $0 < \epsilon \leq 2$. The constant $C_{\epsilon }$ is set so that the sharpening filter is normalized to unity, $\sum _{n} \sum _{m} W_{\epsilon } [n, m] = 1$.
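A sketch of this kernel construction, assuming NumPy, is given below; the finite half-width of three pixels is our own illustrative choice, since the kernel support is not specified in the text.

```python
import numpy as np

def sharpening_kernel(eps, half_width=3):
    """Sharpening kernel of Eq. (29), normalized so its coefficients sum to one.
    half_width is an assumed (illustrative) truncation of the kernel support."""
    n = np.arange(-half_width, half_width + 1)
    nn, mm = np.meshgrid(n, n, indexing="ij")
    w = (3 - nn**2 / eps**2 - mm**2 / eps**2) * np.exp(-(nn**2 + mm**2) / (2 * eps**2))
    return w / w.sum()          # fixes C_eps so that sum(W_eps) = 1

w_sharp = sharpening_kernel(eps=1.0)
```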

By fixing the ARES order at each octave level, the number of adjustable parameters is reduced to two: the critical velocity $v_c$ and the scaling parameter $\epsilon$ for the sharpening filter. These parameters were selected manually by varying them, comparing the image quality metrics and visually examining the images. It is important to note that an inherent tension exists in the CARES filter: a small value of $v_c$ will more effectively suppress turbulence but will attenuate and delay moving objects more, while a large value of $v_c$ will leave moving objects undistorted and less delayed but will not suppress turbulence as effectively. This makes it difficult to determine an optimal value of $v_c$, since it depends on the intensity of the turbulence and the velocities of the moving objects.

3.2 Measuring moving target delays

The main objectives of the CARES algorithm are to provide a real-time algorithm with low computational cost for mitigating the effects of turbulence on an image while imposing no delay on moving targets. As discussed in Section 2.2, the ARES filters of order 1 or higher have zero phase delay at low frequencies and thus were chosen to temporally filter the LIP decomposition of the input image. In such an ARES filter, frequencies below the cutoff frequency ($f_c$) are passed with almost zero phase delay while higher frequencies are delayed. In the CARES algorithm, the cutoff frequency for the ARES filters is set by the critical velocity $v_{c}$ parameter. Thus it is important to choose $v_c$ higher than the speed of the moving object to allow it to pass without delay, while still removing turbulence effects at higher velocities.

In order to understand the problems associated with assessing the delay of the CARES output with respect to the input, we have constructed a simple simulation. It consists of a one-dimensional shape moving across the image with a constant velocity, measured in pixels/timestep. We used two shapes, a Gaussian pulse (shown as the blue line in the left part of Fig. 8) and a wavelet (the blue line in the right part of Fig. 8). The pulses move at a speed of 0.8 pixel/timestep, as this corresponds roughly to the maximum speed of the walking man at a frame rate of 60 Hz (which is 48 pixels/sec). The Gaussian pulse represents a bright object against a dark background, while the wavelet pulse represents a bright object against a bright background with its intensity subtracted from it.

Fig. 8. The Gaussian shaped pulse (left) and wavelet shaped pulse (right) used to test the delay of the CARES filter. The blue lines represent the inputs and the red lines are the CARES outputs. The pulses are moving from left to right with a velocity of 0.8 pixel/timestep and the critical velocity is set at 1.6 pixel/timestep.

The Gaussian pulse has low wavenumber components in Fourier space, whereas the wavelet pulse is concentrated in the mid wavenumber range. This causes a substantial difference in the CARES filter outputs, shown as the red lines in Fig. 8. The wavelet output follows the input very accurately, whereas the Gaussian output shows significant distortion as well as a delay. This means that the measured delay depends on the imaging scenario. It also raises the question of how to measure the delay when there is object distortion. Do we shift the output and find which shift maximizes the correlation, or do we find the position of the output’s peak? Rather than decide, we do both in Fig. 9, which shows the correlation (solid lines) and peak (dashed lines) lags as a function of the critical velocity. There is a strong difference in the lags for the Gaussian pulse (blue lines), where the correlation lag is almost always substantially greater than the peak lag (by a factor of two initially). For the wavelet pulse (red lines), the two types of lag are closer together, with the correlation lag being smaller than the peak lag initially and the reverse towards the right end. Note that the sudden drop in the peak lag estimates is likely an artifact of the interpolation method used. Note also that if we set a lag threshold of one pixel (which is a natural threshold for a digital image), then we can set the critical velocity at $v_c = v_o$, where $v_o$ is the object velocity, for the wavelet pulse and $v_c \approx 1.75 v_o$ or $v_c \approx 3 v_o$ for the Gaussian pulse for the peak and correlation lags respectively. The simulation therefore demonstrates the complexity of assessing a lag, depending as it does on the imaging scenario, the object shape and the method of evaluating the lag.
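A minimal sketch of the two lag estimators compared here, assuming NumPy and SciPy, is shown below; the function names and the parabolic peak interpolation are our own illustrative choices rather than the exact method used for Fig. 9.

```python
import numpy as np
from scipy.signal import correlate

def correlation_lag(inp, out):
    """Lag (in pixels) that maximizes the cross-correlation of the output with the input."""
    c = correlate(out, inp, mode="full")
    return np.argmax(c) - (len(inp) - 1)

def peak_lag(inp, out):
    """Lag between the peak positions, refined with a parabolic fit around each maximum."""
    def refined_peak(x):
        i = np.argmax(x)
        if 0 < i < len(x) - 1:
            denom = x[i - 1] - 2 * x[i] + x[i + 1]
            return i + 0.5 * (x[i - 1] - x[i + 1]) / denom
        return float(i)
    return refined_peak(out) - refined_peak(inp)
```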

Fig. 9. The lag caused by the CARES filter as a function of the critical velocity. The blue lines show the lags for the Gaussian pulse, and the red lines show the lags for the wavelet pulse. The solid lines are the lag values determined by the maximum correlation between the input and output, and the dashed lines are the lag determined by finding the position of the peak of the output. The pulses are moving with a velocity of 0.8 pixel/timestep.

4. Results

4.1 Measurement trial

Eleven sets of data were collected on September 13, 2019 over a 900 m path with a Schmidt & Bender rifle scope with an aperture diameter of 55.35 mm, using a Mikrotron EoSens 1362 digital camera. The rifle scope was set on a custom-made tripod that held the scope approximately 15 cm above the ground. The EoSens camera has 1280 $\times$ 1024 pixels with a pixel pitch of $14\,\mu {\rm m} \times 14\,\mu {\rm m}$. The camera is capable of frame rates up to 500 frames/sec. The diameter of the aperture was varied among 15.01, 29.93 and 55.35 mm using an aperture mask.

The images collected were all $512 \times 512$ pixels. The aperture size, frame rate and presence or absence of a moving target were varied between the data sets. Table 1 shows the parameters for the eleven data sets. Note that experiments 1, 4 and 8 did not have a moving target present. In all the other experiments there was a man walking or jogging back and forth in front of the building. Note also that the scene types are grouped together in Tables 1 and 2, such that the first three are fixed scenes, the next six are man-walking scenes, and the last two are man-jogging scenes. This eases the comparison of the different scene types. In all the experiments, the magnification on the rifle scope was 25, the camera gain and black level were 1 and 128 respectively, and each pixel subtended 2.09 cm at the object.


Table 1. The parameters of the 11 trials taken over a $900$ m path with the magnification of the rifle scope set to 25.


Table 2. The parameters for the CARES filter where $v_{c}$ has units of pixels/sec, $\epsilon$ units of pixels and the other parameters are dimensionless.

The values of the CARES parameters used in these experiments are shown in Table 2. The number of octaves of decomposition of the LIP was 8, which results in a $4\times 4$ image at the highest level of decomposition given that the original images are $512\times 512$. The $a$ parameter of the generating kernel was set to 0.375, which results in an approximately Gaussian shaped response. All combinations of the filter order, critical velocity $v_{c}$ and sharpening filter parameter $\epsilon$ were tested. Note that the ARES filter order was the same for all the octaves in the LIP in order to reduce the number of combinations of experiments required. Also note that only the case where all the ARES filter orders are 2 and where $\epsilon = 1$ will be analyzed in this work.

4.2 Image quality

The intention of the CARES algorithm is to remove apparent motion between frames in a video. Videos with the raw atmospherically distorted data and CARES corrected images side-by-side are included in the additional materials provided on the website. Qualitatively, the CARES corrected movie is less blurry and has less apparent motion than the raw images. One frame (frame 1000) of a distorted image beside the CARES corrected output for data set 2, order 2 ARES filters, 8 octaves, $\epsilon =1$ and $v_c=30$ pixels/sec is shown in Fig. 10. It can be seen that the edges of the roof line, the side of the building and the outline of the glass doors are straighter and less blurred in the CARES corrected image than in the raw image. In addition, the leaves behind the building are less blurred in the CARES corrected image.

Fig. 10. The raw input image (left) and the corresponding CARES filter output image (right) from data set 2 [see Visualization 1] with the ARES filter for each octave set to order 2, $\epsilon =1$ pixel, and $v_c = 30$ pixels/sec. See also Visualization 2 for the CARES filtered image with $v_c = 90$ pixels/sec.

Next, the STandard Deviation (STD) and the Median Absolute Deviation (MAD) (see the Appendix for a definition) were computed for the intensity of each pixel as it varies over time. The STD and MAD for data set 2 are shown in Fig. 11. It is known that the apparent motion caused by the atmosphere is proportional to the gradient of the image [26,27]. Consequently, it can be seen that the STD and MAD are greatest along the roof line and edge of the building where there is a sharp edge between bright and dark pixels. The MAD is lower than the STD for all pixels. In the STD image, an increase in the STD of the pixels in the path of the moving person can be clearly seen. The moving object is effectively removed from the median image and the MAD as shown on the right in Fig. 11.

Fig. 11. The STD (left) and MAD (right) of the raw images for data set 2.

Figure 12 shows the average STD (left) and MAD (right) for all the pixels in the image for the raw images and CARES filtered images with $v_{c}=30$, $v_{c}=60$, $v_{c}=90$ and $v_{c}=120$ pixels/sec. As expected the STD and MAD decrease with a decrease in $v_{c}$ in the filtered images. This shows that the CARES algorithm is reducing the apparent motion because in the absence of atmospheric turbulence the STD and MAD of stationary targets should be zero.

Fig. 12. The STD (left) and MAD (right) of the raw images for all data sets for $v_{c}=30$, $v_{c}=60$, $v_{c}=90$ and $v_{c}=120$ pixels/sec, with the legend indicating the set number on the far left.

Four image quality metrics (MSE, SSIM, BRISQUE and NIQE) were used to assess the quality of the raw images and CARES corrected images to quantify the effect of the CARES algorithm. See the Appendix for the details of the image quality assessment (IQA) algorithms and a description of the estimation of the ground truth image for the MSE and SSIM algorithms.

All four quality metrics (MSE, SSIM, BRISQUE and NIQE) are plotted in Fig. 13 for $v_{c}=30$, $v_{c}=60$, $v_{c}=90$ and $v_{c}=120$ pixels/sec. For all the data sets, the MSE of the CARES filtered data is lower than the MSE of the raw images and the MSE decreases as $v_{c}$ decreases. As mentioned, a lower MSE indicates the images are more closely matched so a decreasing MSE indicates an improved image quality. Unlike the other quality metrics in this paper, an increase in the SSIM indicates improved similarity (quality) between the image and the reference image. In all the data sets, the SSIM of the CARES filtered data is higher than the SSIM of the raw images indicating the CARES algorithm improves the quality of the image. Also, in all cases, the SSIM is highest for $v_{c}=30$ pixels/sec and it decreases as $v_{c}$ increases. This confirms that the greater the critical velocity, the more turbulence is passed through the CARES filter, whereas a smaller critical velocity tends to suppress more turbulence.

Fig. 13. The (a) average MSE, (b) average SSIM, (c) average BRISQUE and (d) average NIQE values for CARES filtered images with $v_{c}=30$, $v_{c}=60$, $v_{c}=90$ and $v_{c}=120$ pixels/sec.

All the quality metrics (MSE, SSIM, BRISQUE and NIQE) improved for the CARES filtered data compared to the raw images. The MSE and SSIM indicated the highest quality for $v_{c}=30$ pixels/sec and the quality degraded as $v_{c}$ increased. The BRISQUE values sometimes increased and sometimes decreased as $v_{c}$ increased. But for all cases the BRISQUE was lower (improved) for CARES filtered images compared to raw images. The NIQE metric was approximately constant for all the $v_{c}$ values in the CARES filtered data but was lower (improved) than the NIQE for the raw images in all cases.

4.3 Delays in moving targets

Next we tested the delay in position of the moving person between the CARES filtered data and the raw images. Each of the data sets with a moving person was CARES filtered with four critical velocities ($v_c=30, 60, 90$ and $120$ pixels/sec). Small sub-images featuring the moving person were extracted from the raw and CARES filtered images, and the background was removed from each sub-image. The median image of the raw data and the median image of the CARES filtered data were used as the respective background estimates. The relative shift along the x-axis between the two sub-images was then computed using the interpolated correlation method of Guizar-Sicairos et al. [28]. Since the person was walking on a horizontal path, the y-position shift was forced to be within one pixel to prevent large errors due to false correlations.
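As an illustration of this step, an implementation of the subpixel registration of Ref. [28] is available in scikit-image. The sketch below shows how the x-shift between background-subtracted raw and filtered sub-images could be estimated; the variable names and the upsampling factor are our own illustrative choices, and the one-pixel constraint on the y-shift mentioned above is not included.

```python
from skimage.registration import phase_cross_correlation

def x_shift(raw_sub, filt_sub, raw_background, filt_background):
    """Subpixel x-shift of the CARES-filtered sub-image relative to the raw one,
    after removing the respective median-image backgrounds."""
    result = phase_cross_correlation(raw_sub - raw_background,
                                     filt_sub - filt_background,
                                     upsample_factor=100)   # illustrative subpixel precision
    dy, dx = result[0]             # the shift is returned first, in (row, column) order
    return dx
```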

Figure 14 shows the raw and CARES filtered sub-image, for $v_c = 30, 60, 90$ and $120$ pixels/sec, of the moving person with the background removed. Notice that the person, who is moving at about 48 pixels/sec, is attenuated for $v_c=30$ and $v_c=60$ pixels/sec. As the critical velocity increases, the person is less attenuated and the pixel delay decreases.

Fig. 14. An example of raw images (top) and CARES filtered images (bottom) for $v_c = 30$ [see Visualization 3], $v_c = 60$, $v_c = 90$ [see Visualization 4] and $v_c = 120$ pixels/sec from left to right for frame 150 of data set 2.

The graph on the left in Fig. 15 shows the delay along the x-axis for data set 2 for $v_c = 30$, $v_c = 60$, $v_c = 90$ and $v_c = 120$ pixels/sec. As can be seen from Fig. 15, the relative delay of the target in the CARES filtered data decreases as the critical velocity $v_{c}$ increases. The lower the critical velocity, the more the target is attenuated, as can be seen in Fig. 14. This attenuation of the target occasionally causes the correlation used to detect the relative shift to produce false alarms with large position errors, as can be seen in the line for $v_c = 30$ pixels/sec in Fig. 15.

Fig. 15. The pixel delay in the x-axis (left) for CARES filtered data for data set 2 and the average pixel delay between the raw image and CARES filtered images (right) for $v_c = 30$, $v_c = 60$, $v_c = 90$, and $v_c = 120$ pixels/sec.

The graph on the right in Fig. 15 shows the average relative delay between the unfiltered and CARES filtered images for $v_c = 30$, $v_c = 60$, $v_c = 90$ and $v_c = 120$ pixels/sec. For all the data sets, the relative delay decreases as the critical velocity $v_{c}$ increases. Contrast this with Fig. 13, where the MSE increases and the SSIM decreases (both indicating poorer quality) with increasing $v_{c}$. Thus there is a trade-off between reducing the delay of the moving object and improving the image quality.

5. Discussion

The MSE and SSIM are full-reference (FR) IQA metrics that require a reference image to compute. We estimated the reference image by taking the median image of all the frames in a data set. Any errors in this reference image will result in errors in the MSE and SSIM values computed. Also, since the median image removes moving targets in its estimate of the reference image, a target moving between two frames will increase the MSE and decrease the SSIM. Since the MSE and SSIM use a reference image, apparent motion between a frame and the reference image can be detected by the MSE and SSIM.

The NIQE and BRISQUE metrics used in this paper measure the quality of a single image and have the advantage of not requiring an estimate of a reference image. For a video sequence, the NIQE and BRISQUE of each frame are calculated and then averaged. The apparent motion induced by turbulence will cause variations of the images in time. Thus the NIQE and BRISQUE metrics used in this paper indicate the average quality of individual frames but do not detect apparent motion between frames.

Although there are limitations to all the IQA metrics used in this paper, the MSE, SSIM, NIQE and BRISQUE all indicated that the CARES algorithm improved the quality of the images compared to the raw images.

The CARES filter is intended as a real-time turbulence suppression filter. The real-time aspect entails two things: the filter must be fast, and it must not induce a lag on moving objects with respect to their actual position. The speed of the CARES filter is greatly aided by the recursive nature of the ES filter, although the actual implementation of CARES in terms of hardware and programming will determine the real rate of processing.

As Section 3.2 has shown, the issue of the induced object lag is more complicated. The CARES filter induces distortion in the low wavenumber components of a moving object. As previously explained, this can create distortions in the moving object depending on the imaging situation (i.e. a bright object against a dark background, or an opaque object in front of a background of similar intensity). However, this distortion may or may not create a perceptible lag. Just as the estimated lag depends on the method used to evaluate it, the perceived lag can depend on what is doing the perceiving. For instance, if the human eye preferentially perceives the small-scale features of an object, then the large-scale distortion may not matter in any significant way. We may also wonder how recognition software would be affected by turbulence and how CARES would improve its perception. These questions are left for future work.

It is worth pointing out that the configuration of the CARES filter is not unique. Here the LIP was divided into eight octaves, the corresponding ARES filter for each octave was set to order 2, and the zero order value of $N$ (Eq. (28)) was adjusted to second order using Eq. (27). But all kinds of alternatives can be imagined: a different number of octaves, different ARES filter orders for each octave and a different formula for determining $N$. Indeed, the formulas (26) and (27) are based on the attenuation of white noise, but the turbulent image fluctuations are not white. However, while different combinations of octave number, orders and $N$ are possible, the end user must be given a manageable set of adjustable parameters. Hence, whatever combination is used, something as intuitively useful as a critical velocity should always be one of the adjustable parameters. This, too, is left for future work.

6. Conclusions

CARES is a fast algorithm for real-time compensation of atmospheric turbulence effects in horizontal path images. The algorithm uses a critical velocity $v_{c}$ to compute the timescales of the ARES filters used to filter the LIP decomposition of an input image. The lower the critical velocity, the more atmospheric effects will be attenuated by the CARES filter. The STD and MAD of the intensities of each pixel in the image were reduced by the CARES filter, indicating a reduction in the blurring and apparent motion in the CARES corrected images. In addition, the MSE, NIQE and BRISQUE metrics all decreased, and the SSIM increased, all indicating improved image quality in the CARES filtered images.

The CARES algorithm has also been shown to have small delays on the position of moving objects and some distortion of the shape of those objects when the critical velocity equals the velocity of the target. The exact amount of lag and distortion depends on the form of the object and the imaging scenario of which it is a part. When the object is moving faster than the critical velocity, it is attenuated more and its position is more delayed in space. As the critical velocity increases, the delay in the position and the distortion of the object decreases. However, the reduction of the turbulence induced distortions decreases as the critical velocity increases. Thus there will be a tradeoff between reducing the effects of atmospheric turbulence and preserving moving objects without introducing a delay or distortion.

The CARES algorithm is therefore an efficient algorithm for reducing atmospheric turbulence effects in images with moving objects while having a sub-pixel delay in the object position.

Appendix: Image quality assessment metrics

In addition to subjective image quality assessment, four Image Quality Assessment (IQA) metrics were used to compare the quality of the raw image to the CARES corrected images [29]. These metrics are: the Mean-Squared Error (MSE); the Structural SIMilarity (SSIM) index [30]; blind/referenceless image spatial quality evaluator (BRISQUE) [31]; and the natural image quality evaluator (NIQE) [32].

The image MSE is given by

$${\rm MSE} = \frac{1}{NM} \sum_{n = 1}^{N} \sum_{m = 1}^{M} \left( I_t \left[ n, m \right] - I_{ref} \left[ n, m \right] \right)^2$$
where $I_t \left [ n, m \right ]$ is the test image and $I_{ref} \left [ n, m \right ]$ is the reference image.

The SSIM metric between a test image and a reference image is given by

$${\rm SSIM} = {\left( {\frac{{2{\mu _t}{\mu _{ref}} + {C_1}}}{{\mu _t^2 + \mu _{ref}^2 + {C_1}}}} \right)^\alpha } {\left( {\frac{{2{\sigma _t}{\sigma _{ref}} + {C_2}}}{{\sigma _t^2 + \sigma _{ref}^2 + {C_2}}}} \right)^\beta }{\left( {\frac{{{\sigma _{t,ref}} + {C_3}}}{{{\sigma _t}{\sigma _{ref}} + {C_3}}}} \right)^\gamma }$$
where $\mu _t$, $\mu _{ref}$, $\sigma _t^2$ and $\sigma _{ref}^2$ are the local means and variances of the test and reference images, respectively, and $\sigma _{t,ref}$ is the cross-correlation between the test and reference images. By default the exponents $\alpha =\beta =\gamma =1$. A map of the SSIM is generated by calculating the local SSIM at each pixel when the images are windowed by an $11\times 11$ Gaussian filter with $\sigma _g = 1.5$ pixels. The mean of the SSIM map gives the overall SSIM metric of the image.

The MSE metric and SSIM index are Full Reference (FR) algorithms and require a truth (reference) image. These FR IQA metrics are useful for assessing the effectiveness of image denoising methods [33,34]. However, such a reference image is not available for our data. To use these quality metrics, an estimate of the truth image was required. As the raw images are known to have atmospheric distortion, averaging the distorted images would result in a more blurred and distorted image. Thus, the reference images were generated from the CARES filtered data and not from the raw images. To estimate the truth image, the median of each pixel over time (not spatially) was taken for the CARES filtered images. For video sequences with moving objects, the mean and standard deviation of the pixels crossed by the moving object will be modified by the target. The mode and the median, however, effectively remove the effects of a moving target and give a reference image that estimates the background pixels only. Thus the pixel-by-pixel median image was used as the truth image in this work.
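A sketch of this reference construction and of the FR metric evaluation, assuming NumPy and scikit-image and treating the frame stack as a (time, row, column) array, is shown below. Note that scikit-image's structural_similarity defaults differ slightly from the MATLAB implementation used in this paper, so the snippet is illustrative only.

```python
import numpy as np
from skimage.metrics import mean_squared_error, structural_similarity

def reference_image(filtered_stack):
    """Temporal (pixel-by-pixel) median of the CARES filtered frames,
    which suppresses moving targets and serves as the truth estimate."""
    return np.median(filtered_stack, axis=0)

def fr_metrics(frame, reference):
    """MSE (Eq. (30)) and SSIM (Eq. (31)) of one frame against the reference."""
    mse = mean_squared_error(frame, reference)
    ssim = structural_similarity(frame, reference,
                                 data_range=reference.max() - reference.min())
    return mse, ssim
```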

The STandard Deviation (STD) is defined for each pixel of an image as

$${\rm STD} \left[ n, m \right] = \sqrt{ {\mathop{\rm mean}\nolimits} \left( {I\left[ {n, m} \right] - {\mathop{\rm mean}\nolimits} \left( {I\left[ {n, m} \right]} \right)} \right)^2 } \, .$$

Also, the Median Absolute Deviation (MAD) is much less sensitive to moving objects crossing over a pixel than the STD and is given by

$${\rm MAD}\left[ n, m \right] = {\mathop{\rm median}\nolimits} \left( {{\mathop{\rm abs}\nolimits} \left( {I\left[ {n, m} \right] - {\mathop{\rm median}\nolimits} \left( {I\left[ {n, m} \right]} \right)} \right)} \right) \, ,$$
where the operators ‘mean’ and ‘median’ refer to temporal statistics.

In order to assess the CARES algorithm, the BRISQUE [31] and NIQE [32] no reference (NR) IQA algorithms were run on both the input and CARES filtered images. Both these algorithms start by calculating the mean subtracted contrast normalized (MSCN) coefficients of the image as given by
$$\hat{I} \left[ n, m \right] = \frac{I \left[ n, m \right] - \mu \left[ n, m \right]}{\sigma \left[ n, m \right] + 1}$$
where $\mu \left [ {n,m} \right ]$ and $\sigma \left [ {n,m} \right ]$ are the local mean and standard deviation when the image is windowed by a Gaussian filter with $\sigma _g = 1.5$ pixels. The MSCN coefficients are then fitted to a generalized Gaussian distribution, resulting in two features of the image. Next, the one-pixel delay products of the MSCN image are calculated for the horizontal $\hat {I} \left [ {n,m} \right ] \hat {I} \left [ {n+1,m} \right ]$, vertical $\hat {I} \left [ {n,m} \right ] \hat {I} \left [ {n,m+1} \right ]$, main diagonal $\hat {I} \left [ {n,m} \right ] \hat {I} \left [ {n+1,m+1} \right ]$ and lower diagonal $\hat {I} \left [ {n,m} \right ] \hat {I} \left [ {n+1,m-1} \right ]$ directions. The distribution of each of these product images is fitted to an asymmetric generalized Gaussian distribution with four parameters, resulting in an additional 16 features for the original image. Next, the image is low-pass filtered and downsampled by a factor of two, and the 18 features just described are calculated for the downsampled image.
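A minimal sketch of the MSCN computation of Eq. (34), assuming SciPy and using a Gaussian window with $\sigma_g = 1.5$ pixels as stated above, is:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(image, sigma_g=1.5):
    """Mean subtracted contrast normalized coefficients (Eq. (34))."""
    image = image.astype(float)
    mu = gaussian_filter(image, sigma_g)                      # local mean
    var = gaussian_filter(image**2, sigma_g) - mu**2          # local variance
    sigma = np.sqrt(np.maximum(var, 0.0))                     # local standard deviation
    return (image - mu) / (sigma + 1.0)
```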

In the BRISQUE algorithm, the features just described are input to a Support Vector Machine (SVM) that was trained on natural scenes and on scenes with known distortions classified by a human. The output distance from the SVM gives the image quality metric. For the NIQE algorithm, the features extracted from the image are fitted to a multivariate Gaussian distribution (see Ref. [32] for details). The distance between these parameters and the parameters of a multivariate Gaussian distribution fitted to a training set of data is used as the NIQE quality metric. Note that the NIQE algorithm does not require the training set to have images classified by their distortions and is a truly blind algorithm.

Lower scores in the BRISQUE and NIQE indicate better perceptual quality. Both these algorithms were run in MATLAB using the default parameters provided.

Acknowledgments

The authors would like to gratefully acknowledge the assistance from the personnel of the DRDC - Valcartier Research Center in obtaining the turbulent imaging data. Particular thanks are due to Jean Tessier and Martin Bérubé. We would also like to acknowledge funding from the Directorate of Technical Airworthiness and Engineering Support 6 and a CDAARP grant for supporting this project. We would also like to acknowledge assistance from Philip Ugbaja for proof-reading and help getting reference papers.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. R. R. Beland, “Propagation through atmospheric optical turbulence,” in Atmospheric Propagation of Radiation, F. G. Smith, ed. (SPIE Press, Bellingham, Washington USA, 1993), pp. 157–232.

2. L. C. Andrews, R. L. Phillips, and C. Y. Hopen, Laser Beam Scintillation with Applications (SPIE Press, Bellingham, Washington USA, 2001).

3. D. L. Fried, “Optical resolution through a randomly inhomogeneous medium for very long and very short exposures,” J. Opt. Soc. Am. 56(10), 1372–1379 (1966). [CrossRef]  

4. M. C. Roggemann and B. Welsh, Imaging through turbulence (CRC press, Boca Raton, Florida, 1996).

5. C. Carrano, “Mitigating atmospheric effects in high-resolution infra-red surveillance imagery with bispectral speckle imaging,” in Image Reconstruction from Incomplete Data IV, vol. 6316 of Proc. SPIE, P. J. Bones, M. A. Fiddy, and R. P. Millane, eds. (2006), pp. 631602–12.

6. J. L. Herring, J. Nagy, and L. Ruthotto, “Gauss-newton optimization for phase recovery from the bispectrum,” IEEE Trans. Comp. Imaging 6, 235–247 (2020). [CrossRef]  

7. B. Fishbain, L. P. Yaroslavsky, and I. A. Ideses, “Real-time stabilization of long range observation system turbulent video,” J. Real-Time Image Proc. 2(1), 11–22 (2007). [CrossRef]  

8. E. Chen, O. Haik, and Y. Yitzhaky, “Detecting and tracking moving objects in long-distance imaging through turbulent medium,” Appl. Opt. 53(6), 1181–1190 (2014). [CrossRef]  

9. K. K. Halder, M. Tatahli, and S. G. Anavatti, “Simple and efficient approach for restoration of non-uniformly warped images,” Appl. Opt. 53(25), 5576–5584 (2014). [CrossRef]  

10. K. K. Halder, M. Tatahli, and S. G. Anavatti, “Geometric correction of atmospheric turbulence-degraded video containing moving objects,” Opt. Express 23(4), 5091–5101 (2015). [CrossRef]  

11. R. Nieuwenhuizen, J. Dijk, and K. Schutte, “Dynamic turbulence mitigation for long-range imaging in the presence of large moving objects,” EURASIP J. Image Video Proc. 2019(1), 2 (2019). [CrossRef]  

12. P. Nisenson, “Single Speckle Frame Imaging Using Ayers-Dainty Blind Iterative Deconvolution,” in European Southern Observatory Conference and Workshop Proceedings, vol. 39 (1992), pp. 299–308.

13. R. Holmes and V. S. R. Gudimetla, “Image reconstructions with active illumination in strong-turbulence scenarios with single-frame blind deconvolution approaches,” Appl. Opt. 58(28), 7823–7835 (2019). [CrossRef]  

14. T. J. Schulz, “Multiframe blind deconvolution of astronomical images,” J. Opt. Soc. Am. A 10(5), 1064–1073 (1993). [CrossRef]  

15. M. Aubailly, M. A. Vorontsov, G. W. Carhart, and M. T. Valley, “Automated video enhancement from a stream of atmospherically distorted images: the lucky-region fusion approach,” in Atmospheric Optics: Models, Measurements, and Target-in-the-Loop Propagation III, vol. 7463 of Proc. SPIE, S. M. Hammel, A. M. J. van Eijk, and M. A. Vorontsov, eds. (2009), pp. 74630C–10.

16. X. Huang, B. Li, J. Wang, and J. Li, “A real-time lucky imaging algorithm based on Fourier transform and its implementation techniques,” Publ. Astro. Soc. Japan 73(5), 1240–1254 (2021). [CrossRef]  

17. C. P. Lau, Y. H. Lai, and L. M. Lui, “Restoration of atmospheric turbulence-distorted images via RPCA and quasiconformal maps,” Inverse Problems 35(7), 074002 (2019). [CrossRef]  

18. Z. Mao, N. Chimitt, and S. H. Chan, “Image reconstruction of static and dynamic scenes through anisoplanatic turbulence,” IEEE Trans. Comp. Imaging 6, 1415–1428 (2020). [CrossRef]  

19. R. Nieuwenhuizen and K. Schutte, “Deep learning for software-based turbulence mitigation in long-range imaging,” in Artificial Intelligence and Machine Learning in Defense Applications, vol. 11169, J. Dijk, ed., International Society for Optics and Photonics (SPIE, 2019), p. 111690J.

20. D. Jin, Y. Chen, Y. Lu, J. Chen, P. Wang, Z. Liu, S. Guo, and X. Bai, “Neutralizing the impact of atmospheric turbulence on complex scene imaging via deep learning,” Nat. Mach. Intell. 3(10), 876–884 (2021). [CrossRef]  

21. P. J. Burt and E. H. Adelson, “The laplacian pyramid as a compact image code,” IEEE Trans. Commun. 31(4), 532–540 (1983). [CrossRef]  

22. E. H. Adelson, C. H. Anderson, J. R. Bergen, P. J. Burt, and J. M. Ogden, “Pyramid methods in image processing,” RCA Engineer 29, 34–41 (1984).

23. J. M. Ogden, E. H. Adelson, J. R. Bergen, and P. J. Burt, “Pyramid-based computer graphics,” RCA Engineer 30, 4–15 (1985).

24. B. Xue, Y. Liu, L. Cui, X. Bai, X. Cao, and F. Zhou, “Video stabilization in atmosphere turbulent conditions based on the laplacian-riesz pyramid,” Opt. Express 24(24), 28092–28103 (2016). [CrossRef]  

25. R. G. Brown and R. F. Meyer, “The fundamental theorem of exponential smoothing,” Operations Research 9(5), 673–685 (1961). [CrossRef]  

26. S. Zamek and Y. Yitzhaky, “Turbulence strength estimation from an arbitrary set of atmospherically degraded images,” J. Opt. Soc. Am. A 23(12), 3106–3113 (2006). [CrossRef]  

27. G. Potvin, “Linear perturbation model for simulating imaging through weak turbulence,” Opt. Eng. 59(03), 1 (2020). [CrossRef]  

28. M. Guizar-Sicairos, S. T. Thurman, and J. R. Fienup, “Efficient subpixel image registration algorithms,” Opt. Lett. 33(2), 156–158 (2008). [CrossRef]  

29. D. McGaughey and G. Potvin, “Quality metrics for atmospherically distorted images,” in OSA Imaging and Applied Optics Congress 2021 (3D, COSI, DH, ISA, pcAOP) (Optica Publishing Group, 2021), p. PTu4C.2.

30. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Proc. 13(4), 600–612 (2004). [CrossRef]  

31. A. Mittal, A. K. Moorthy, and A. C. Bovik, “No-reference image quality assessment in the spatial domain,” IEEE Trans. Image Proc. 21(12), 4695–4708 (2012). [CrossRef]  

32. A. Mittal, R. Soundararajan, and A. C. Bovik, “Making a "completely blind" image quality analyzer,” IEEE Signal Proc. Letters 20(3), 209–212 (2013). [CrossRef]  

33. S. Ghose, N. Singh, and P. Singh, “Image denoising using deep learning: Convolutional neural network,” in 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), (2020), pp. 511–517.

34. P. Singh and A. Shankar, “A novel optical image denoising technique using convolutional neural network and anisotropic diffusion for real-time surveillance applications,” J. Real-Time Image Proc. 18(5), 1711–1728 (2021). [CrossRef]  

Supplementary Material (4)

Visualization 1       The raw input image (left) and the corresponding CARES filter output image (right) with the critical velocity at 30 pixels/sec.
Visualization 2       The raw input image (left) and the corresponding CARES filter output image (right) with the critical velocity at 90 pixels/sec.
Visualization 3       A example of raw images (top) and CARES filtered images (bottom) for a critical velocity at 30 pixels/sec.
Visualization 4       A example of raw images (top) and CARES filtered images (bottom) for a critical velocity at 90 pixels/sec.
