
Deconvolution of astronomical images using SOR with adaptive relaxation

Open Access

Abstract

We address the potential performance of the successive overrelaxation technique (SOR) in image deconvolution, focusing our attention on the restoration of astronomical images distorted by atmospheric turbulence. SOR is the classical Gauss-Seidel iteration supplemented with relaxation. As indicated by earlier work, the convergence properties of SOR, and its ultimate performance in the deconvolution of blurred and noisy images, can be made competitive with those of other iterative techniques, including conjugate gradients, by a proper choice of the relaxation parameter. The question of how to choose the relaxation parameter, however, remained open, and in practical work one had to rely on experimentation. In this paper, using constructive (rather than exact) arguments, we suggest a simple strategy for choosing the relaxation parameter and for updating its value in consecutive iterations to optimize the performance of the SOR algorithm (and its positivity-constrained version, +SOR) at finite iteration counts. We suggest an extension of the algorithm to the notoriously difficult problem of “blind” deconvolution, where both the true object and the point-spread function have to be recovered from the blurred image. We report the results of numerical inversions with artificial and real data, where the algorithm is compared with techniques based on conjugate gradients. In all of our experiments +SOR provides the highest-quality results. In addition, +SOR is found to be able to detect moderately small changes in the true object between separate data frames: an important quality for multi-frame blind deconvolution, where stationarity of the object is a necessity.

©2011 Optical Society of America

1. Introduction

Current research in image deblurring is focused on iterative methods, which allow a priori constraints to be imposed on the solutions, such as non-negativity, which is particularly important in astronomical imaging. A problem common to the iterative techniques, however, is their slow convergence, in particular when “blind” deconvolution is addressed. This provides the motivation for a search for new algorithms. In a recent paper [1] (hereafter Paper I) we addressed the potential performance of the successive overrelaxation technique (SOR). When applied to normal equations, SOR is equivalent to iterative coordinate descent (ICD). The first successful implementations of ICD and SOR in image deblurring were reported by Sauer and Bouman [2] and Fessler [3]. For a wider account of the earlier implementations of SOR in image deblurring, and of its relation to other techniques, the reader is referred to Paper I. Our current study is a direct extension of the work reported in Paper I, and we advise the interested reader to consult Paper I first. In this section, however, we summarize the necessary background.

The problem of image deblurring is formulated as a problem of finding an approximate solution to a large ill-conditioned linear system

$$ A x = f_\delta, \qquad f_\delta = f + \delta f. \tag{1} $$
Here, vector x of size N² represents the N × N true image, with image elements arranged by lexicographical ordering, the N² × N² matrix A represents a blurring operator, and the “true” data f is distorted by noise δf. We assume that the N × N domain is large enough to accommodate both the true object and its blurred image, so that edge effects can be discarded and the matrix A can be considered block-circulant with circulant blocks; the central column of A represents the lexicographically ordered elements of the point-spread function (PSF).

We assume the noise to be uncorrelated Gaussian, and to have a uniform variance. The maximum-likelihood solution minimizes the quadratic merit function

$$ M = \| A x - f_\delta \|^2, \tag{2} $$
where || · || designates the Euclidean (l2) norm. We do not add any penalty terms (which would amount to Tikhonov regularization); instead, we assume that regularization will be performed by early termination of the iterations (the so-called iterative regularization). The minimum of M corresponds to the solution of the normal equations
$$ A^T A\, x = A^T f_\delta, \tag{3} $$
where T designates the matrix transpose. The matrix A^T A describes a convolution with a symmetrized point-spread function (hereafter SPSF), which corresponds to two consecutive blurrings: one with the original PSF, and another with the same point-spread function reflected in both spatial coordinates around the origin (such that if A represents a simple image shift, further multiplication by A^T returns the image to its original position).
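For the block-circulant model above, the SPSF is simply the autocorrelation of the PSF, and its DFT is the squared modulus of the DFT of the PSF. The following minimal sketch (not from the paper; it assumes the PSF is stored on the full N × N grid in FFT layout, with its center at index [0, 0]) computes the SPSF and the diagonal element d used below:

```python
import numpy as np

def symmetrized_psf(psf):
    """Sketch: kernel of A^T A (the SPSF) under the block-circulant model.
    `psf` is assumed to live on the full N x N grid in FFT layout (origin at
    index [0, 0]).  The SPSF is the autocorrelation of the PSF, so its DFT is
    |DFT(psf)|**2; d is its central (zero-lag) element, i.e. the sum of the
    squared PSF elements."""
    psf_hat = np.fft.fft2(psf)
    spsf = np.real(np.fft.ifft2(np.abs(psf_hat) ** 2))
    d = spsf[0, 0]                      # equals np.sum(psf**2) by Parseval
    return spsf, d
```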

SOR is a matrix-splitting algorithm. The matrix A^T A in Eq. (3) is represented by the matrix sum

$$ A^T A = L + D + L^T, \tag{4} $$
where L is a strictly lower triangular matrix and D is a diagonal matrix, and each consecutive iteration consists in solving the triangular linear system
$$ \left( \frac{1}{\tau_k} D + L \right) x^{(k)} = \left[ \left( \frac{1}{\tau_k} - 1 \right) D - L^T \right] x^{(k-1)} + A^T f_\delta, \qquad k = 1, 2, \ldots, \tag{5} $$
starting with an initial guess x^(0), where τk is the relaxation parameter, which we will later allow to change between consecutive iterations. When τk is set to 1, SOR reduces to the classical Gauss-Seidel iteration.

When A is a convolution operator, matrix L also represents a convolution (with a point-spread function which can be obtained from SPSF by setting to zero the elements of the central row starting from the central element and progressing to the right, together with all the elements below the central row). Matrix D can be written as D = dI, where I is the unit matrix and d is the square of the Euclidean norm of the PSF. Therefore, the matrix-vector multiplications in Eq. (5) are convolutions, and computational expenses can be reduced by solving this equation in the Fourier domain.
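To illustrate the Fourier-domain form of Eq. (5), the sketch below builds the DFT of the kernel representing L by the truncation recipe just quoted, and then performs one unconstrained SOR iteration. This is an illustrative outline, not the authors' code: the interpretation of “below the central row” as larger row indices in a centered (fftshift-ed) layout is an assumption, and a_hat stands for the DFT of the PSF (with spsf and d as in the earlier sketch).

```python
import numpy as np

def lower_kernel_dft(spsf):
    """DFT of the truncated kernel representing L: center the SPSF with
    fftshift, zero the central row from the central element rightwards and
    all rows below it, then return to FFT layout.  The row/column convention
    is an assumption; swap it if the sweep direction differs."""
    s = np.fft.fftshift(spsf).copy()
    c0, c1 = s.shape[0] // 2, s.shape[1] // 2
    s[c0, c1:] = 0.0          # central row, central element and to the right
    s[c0 + 1:, :] = 0.0       # all rows below the central row
    return np.fft.fft2(np.fft.ifftshift(s))

def sor_step_fourier(x, f, a_hat, l_hat, d, tau):
    """One unconstrained SOR iteration, Eq. (5), solved in the Fourier domain:
    (d/tau + l) F(x_new) = [(1/tau - 1) d - l*] F(x_old) + a* F(f)."""
    x_hat = np.fft.fft2(x)
    f_hat = np.fft.fft2(f)
    rhs = ((1.0 / tau - 1.0) * d - np.conj(l_hat)) * x_hat + np.conj(a_hat) * f_hat
    return np.real(np.fft.ifft2(rhs / (d / tau + l_hat)))
```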

When SOR is applied to the normal equations (Eq. (3)), there is another mathematically equivalent algorithm: iterative coordinate descent (ICD) applied directly to the original equations (Eq. (1)) [1, 4]. In ICD, each iteration consists of N² minor steps in which the merit function (Eq. (2)) is minimized separately with respect to each individual component xi of the vector x, taken in consecutive order. At each minor step, the optimal variation of xi is scaled by the relaxation parameter τk. This algorithm avoids the calculation of the SPSF, and allows us to work with a spatially variant PSF and/or with a quadratic approximation to the Poissonian likelihood in the merit function.

Deconvolution of astronomical images is known to benefit greatly from enforcing the non-negativity constraint. The required modification is very straightforward in both the SOR and the ICD algorithms. When a particular pixel value is being updated (in SOR, in the process of solving the triangular system), this pixel value is simply set to zero whenever it would otherwise become negative. As in some earlier literature, we designate the resulting algorithm +SOR. This algorithm was first proposed by Cryer [5] for minimizing a strictly convex quadratic function with non-negativity constraints; therefore, another name for the algorithm encountered in the literature is Cryer’s method.
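A direct (unoptimized) sketch of one +SOR sweep in the image domain, under the periodic-boundary assumption and with illustrative names, might look as follows; the non-negativity constraint enters only through the clipping of the pixel update.

```python
import numpy as np

def plus_sor_sweep(x, r, psf, d, tau):
    """One +SOR (ICD) sweep in raster order: x is the current estimate,
    r = f - A x the current residual, psf the point-spread function with its
    centre at (k1//2, k2//2), d the sum of squared PSF elements, tau the
    relaxation parameter.  Periodic boundaries are assumed, as in the
    block-circulant model."""
    N1, N2 = x.shape
    k1, k2 = psf.shape
    c1, c2 = k1 // 2, k2 // 2
    for i in range(N1):
        for j in range(N2):
            # image pixels touched by the PSF column associated with pixel (i, j)
            rows = (np.arange(k1) - c1 + i) % N1
            cols = (np.arange(k2) - c2 + j) % N2
            patch = r[np.ix_(rows, cols)]
            # unconstrained optimal step (A^T r)_ij / d, scaled by tau
            delta = tau * np.sum(psf * patch) / d
            # non-negativity: clip the step so that the pixel stays >= 0
            delta = max(delta, -x[i, j])
            x[i, j] += delta
            r[np.ix_(rows, cols)] -= delta * psf
    return x, r
```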

Another piece of a priori information is the support-domain constraint. It is particularly important in “blind” deconvolution, where the true object needs to have finite support for its deconvolution to be possible without prior knowledge of the point-spread function [6, 7]. The support-domain constraint can also be incorporated in a trivial way, by simply keeping all pixels outside a given domain at zero. We note that when either the non-negativity or the support-domain constraint is incorporated, the SOR iterations can no longer be performed in the Fourier domain. The convergence analysis also becomes more difficult, since one particular iteration can no longer be considered a linear filter.

The distinctive feature of SOR in image deblurring, which underlies its potentially high performance, is the resonant behavior of the convergence rates: a significant enhancement at a particular spatial frequency can be achieved by a proper choice of the relaxation parameter τk (Paper I). In a typical implementation, when the blurred image is dominated by low spatial frequencies (and higher frequencies are dominated by noise), fast deblurring requires deep underrelaxation (τk ≪ 1). Rapid restoration of the higher spatial frequencies requires larger τk. Artificial inversions in which the relaxation parameter was adjusted by experimentation (Paper I) indicate that the potential SOR performance, measured by the number of iterations required, can be made competitive with that of conjugate gradients and related techniques. In implementations where a positivity constraint is important, the performance of +SOR can be made superior to that of other techniques.

The resonant behaviour of SOR indicates that, in order to obtain a solution which responds properly to different spatial frequencies in the input data over a wide frequency range, the iterations must be performed with a relaxation parameter which changes during the iterations. Since overfitting the data must be avoided, especially at frequencies where the signal is below the noise level, the requirements of solution quality and of restoration speed essentially coincide: when the relaxation parameter is properly scheduled to guide the iterative descent to a good solution, this solution will be achieved quickly.

In Section 2, we suggest a simple strategy for adaptive scheduling of the relaxation parameter, where its value is evaluated from the approximation residuals left by the previous iterations. In Section 3 we suggest an algorithm for addressing the “blind” deconvolution problem, and report some numerical results. Section 4 contains a short discussion.

2. Scheduling the relaxation parameter

We focus the analysis on the reduction of the approximation residuals of the unconstrained SOR iterations. We note that the analysis which follows cannot be extended to the +SOR iterations because of their nonlinear behaviour; but we expect that the convergence properties of SOR are largely inherited by +SOR, and hence the same strategy for choosing the relaxation parameter will remain productive when implemented with +SOR, at least when the effects of the non-negativity constraint are not too strong.

We first address the reduction, in one particular iteration, of the residual vector

$$ r^{(k)} = f_\delta - A x^{(k)}. \tag{6} $$
Eq. (5) can be rewritten as
$$ \left( \frac{1}{\tau_k} D + L \right) \Delta x = A^T r^{(k-1)}, \tag{7} $$
where we have introduced
$$ \Delta x = x^{(k)} - x^{(k-1)} \tag{8} $$
to make further expressions shorter. We also have
$$ r^{(k)} = r^{(k-1)} - A\, \Delta x; \tag{9} $$
taking the Euclidean norm of both sides of the last equation leads to the standard result
$$ \| r^{(k)} \|^2 = \| r^{(k-1)} \|^2 - \left( \frac{2}{\tau_k} - 1 \right) (\Delta x)^T D\, \Delta x, \tag{10} $$
which proves the residual-reduction property of the iteration (||r^(k)|| ≤ ||r^(k−1)||) when τk ∈ (0, 2], since D is positive-definite. We now address the reduction of the residual vector in a different norm (the AA^T-norm). Taking the Euclidean norm of both sides of the equation
$$ A^T r^{(k)} = A^T r^{(k-1)} - A^T A\, \Delta x, \tag{11} $$
it is straightforward to show that
$$ \| A^T r^{(k)} \|^2 = \| A^T r^{(k-1)} \|^2 - \left( \frac{2}{\tau_k} - 1 \right) (\Delta x)^T D\, A^T A\, \Delta x + (\Delta x)^T \left( L L^T - L^T L \right) \Delta x. \tag{12} $$
The matrices L and L^T represent convolutions, and hence LL^T − L^T L = 0 when edge effects are discarded, since any two convolutions commute. Matrix D is D = dI, where I is the unit matrix and d is the central element of the SPSF (the sum of squares of all the elements of the PSF). We thus have
$$ \| A^T r^{(k)} \|^2 = \| A^T r^{(k-1)} \|^2 - \left( \frac{2}{\tau_k} - 1 \right) d\, \| A\, \Delta x \|^2, \tag{13} $$
or, using Eq. (9),
$$ \| A^T r^{(k)} \|^2 = \| A^T r^{(k-1)} \|^2 - \left( \frac{2}{\tau_k} - 1 \right) d\, \| r^{(k)} - r^{(k-1)} \|^2, \tag{14} $$
which proves that the residual is also reduced in the AA^T-norm, in the same range of τk. The last equation also gives an interesting hint. Suppose for a moment that the residual vector r^(k) can be reduced to zero (or almost to zero) in one iteration. Setting r^(k) to zero in Eq. (14) shows that the required value of the relaxation parameter τk should then be calculated from r^(k−1) using the simple equation
$$ \left( \frac{2}{\tau_k} - 1 \right) d = \frac{ \| A^T r^{(k-1)} \|^2 }{ \| r^{(k-1)} \|^2 }. \tag{15} $$

It is easy to check that Eq. (15) gives the required value of τk in two limiting situations in which complete convergence can be achieved in one iteration. One is when the point-spread function differs from zero at one pixel only. We then have A^T A = AA^T = D = dI, and hence ||A^T r^(k−1)||² = d ||r^(k−1)||², which gives τk = 1. Since L = 0, Eq. (7) gives A^T A Δx = A^T r^(k−1), and hence r^(k) = 0 from Eq. (11). The other limiting situation is when the PSF spreads over more than one pixel, but the residual image contains only very low spatial frequencies, so that the PSF again acts like Dirac’s δ-function. When the sum of the PSF’s elements is normalized to 1 (which is standard practice, since a PSF is usually interpreted as a probability distribution function for a single photon), we have AA^T r^(k−1) = r^(k−1), and Eq. (15) suggests τk = 2d/(1 + d). This value is the same as that suggested for the fastest restoration of the lowest frequencies by the convergence analysis of Paper I.

To reveal how the scheduling of the relaxation parameter suggested by Eq. (15) works in the general situation when convergence in one iteration is not possible, we transfer the convergence analysis to the Fourier domain. Let the image be updated row by row, from left to right, starting with the first row. Using ℱ to designate the 2-D discrete Fourier transform (DFT), we have ℱ(Lx) = l ℱ(x), where l is the DFT of a truncated point-spread function obtained from the SPSF by setting to zero the elements of the central row starting with the central element and progressing to the right, together with all the elements below the central row. We also have ℱ(L^T x) = l* ℱ(x), where l* is the complex conjugate of l, and ℱ(Dx) = d ℱ(x). Let also ℱ(Ax) = a ℱ(x) and ℱ(A^T x) = a* ℱ(x), such that a*a = l + d + l* is the DFT of the SPSF.

Evaluating the right-hand side of Eq. (15) in terms of Fourier transforms, and using Parseval’s theorem (the Euclidean norm of a vector equals the Euclidean norm of its Fourier transform), we get

$$ \left( \frac{2}{\tau_k} - 1 \right) d = \frac{ \sum_\omega | \mathcal{F}(A^T r^{(k-1)}) |^2 }{ \sum_\omega | \mathcal{F}(r^{(k-1)}) |^2 } = \frac{ \sum_\omega | a^* \mathcal{F}(r^{(k-1)}) |^2 }{ \sum_\omega | \mathcal{F}(r^{(k-1)}) |^2 } = \frac{ \sum_\omega ( l + d + l^* )\, | \mathcal{F}(r^{(k-1)}) |^2 }{ \sum_\omega | \mathcal{F}(r^{(k-1)}) |^2 }, \tag{16} $$
where Σω designates the summation over all the spatial frequencies. The right-hand side of Eq. (16) is thus a weighted average of l + d + l* over the spatial frequencies; the weighting function is the Fourier power of the current residuals.
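In code, the rule of Eqs. (15)–(16) is a one-line weighted average: the DFT power of the SPSF, |DFT(PSF)|², is averaged with the Fourier power of the current residuals as weights. A minimal sketch (illustrative names; the FFT normalization cancels in the ratio):

```python
import numpy as np

def adaptive_tau(r, a_hat, d):
    """Relaxation parameter suggested by Eqs. (15)-(16): evaluate
    (2/tau - 1) d = ||A^T r||^2 / ||r||^2 in the Fourier domain.
    a_hat is the DFT of the PSF, d the sum of its squared elements."""
    r_hat = np.fft.fft2(r)
    w = np.abs(r_hat) ** 2                               # Fourier power of the residuals (weights)
    ratio = np.sum(np.abs(a_hat) ** 2 * w) / np.sum(w)   # weighted average of l + d + l*
    return 2.0 * d / (ratio + d)
```

For a single-pixel PSF the weighted average equals d at every frequency and the formula returns τk = 1; for residuals confined to the lowest frequencies and a PSF normalized to unit sum it tends to 2d/(1 + d), reproducing the two limiting cases discussed above.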

From Eqs. (7) and (9) we have

$$ r^{(k)} = r^{(k-1)} - A \left( \frac{1}{\tau_k} D + L \right)^{-1} A^T r^{(k-1)}. \tag{17} $$
Applying the DFT to both sides of this equation, we get
$$ \mathcal{F}(r^{(k)}) = W(\tau_k)\, \mathcal{F}(r^{(k-1)}), \tag{18} $$
where we omit the dependence on frequency for brevity, with
$$ W(\tau_k) = 1 - \frac{ l + d + l^* }{ d/\tau_k + l } = \frac{ (1/\tau_k - 1)\, d - l^* }{ d/\tau_k + l }, \tag{19} $$
and hence
$$ \frac{ | \mathcal{F}(r^{(k)}) |^2 }{ | \mathcal{F}(r^{(k-1)}) |^2 } = | W(\tau_k) |^2 = \frac{ \left[ (2/\tau_k - 1)\, d - ( l + d + l^* ) \right]^2 + 4\, \mathrm{Im}^2(l) }{ \left[ (2/\tau_k - 1)\, d + ( l + d + l^* ) \right]^2 + 4\, \mathrm{Im}^2(l) }, \tag{20} $$
where Im(l) is the imaginary part of l.

Suppose for a moment that the current residual r^(k−1) contains only one Fourier component, or is dominated by spectral components confined to a small frequency domain. The last equation shows that convergence in one iteration is possible when Im(l) = 0, if the relaxation parameter τk is taken such that (2/τk − 1)d = l + d + l*, in agreement with the recipe suggested by Eq. (16). When Im(l) differs from zero, the value suggested for τk is nearly optimal (the optimal reduction of |W(τk)| is achieved with (2/τk − 1)d = |2l + d|).
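For completeness, here is a short check of the parenthetical claim (a derivation consistent with Eq. (20), not reproduced from the paper):

```latex
% Minimize |W(\tau_k)|^2 of Eq. (20) over \tau_k at a single frequency.
% Put u = (2/\tau_k - 1)\,d,\quad s = l + d + l^{*} = 2\,\mathrm{Re}(l) + d,\quad m = \mathrm{Im}(l).
\[
  |W|^{2}(u) \;=\; \frac{(u - s)^{2} + 4 m^{2}}{(u + s)^{2} + 4 m^{2}},
  \qquad
  \frac{\mathrm{d}\,|W|^{2}}{\mathrm{d}u} = 0
  \;\;\Longrightarrow\;\;
  u^{2} = s^{2} + 4 m^{2},
\]
\[
  u_{\mathrm{opt}} \;=\; \sqrt{\bigl(2\,\mathrm{Re}(l) + d\bigr)^{2} + 4\,\mathrm{Im}^{2}(l)}
  \;=\; |\,2 l + d\,|,
\]
% which reduces to u = l + d + l^{*} when \mathrm{Im}(l) = 0, i.e. the value suggested by Eq. (16).
```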

In the general situation, when the residual r^(k−1) spreads over a wide range of spectral components, the weighted average of l + d + l* suggested by Eq. (16) for (2/τk − 1)d will provide a value of the relaxation parameter which is more favourable for suppressing those spectral components of the residual which have bigger weights |ℱ(r^(k−1))|², i.e., those which are larger in power (note that d does not depend on frequency). As the iterations proceed, the power spectrum of the residuals will evolve (towards smaller values) with a general tendency to become more uniform in frequency. This flattening of the power spectrum of the approximation residuals during the iterative descent is the property which is most favourable for obtaining a high-quality restoration of the blurred and noisy image.

In a typical image-deblurring problem we expect the Fourier power of the initial residuals (the blurred and noisy image, when the initial guess is taken to be zero) to peak sharply at the lower spatial frequencies, and to drop gradually towards higher frequencies, before arriving at a nearly constant value (the Fourier power of the additive white noise). We expect the first iteration to be performed with a small value of τk, close to its low-frequency limit 2d/(1 + d); this iteration will reduce the lowest-frequency components of the residuals, with almost no impact on the high frequencies. At further iterations, we expect τk to increase gradually, thus reducing the higher-frequency components, which now have larger relative weights in the residuals, with the net effect of flattening the Fourier power of the residuals towards the noise level. The iterations should be terminated when the Fourier power of the residuals has become nearly flat, as further iterations will start to transfer magnified high-frequency noise from the residuals into the solution. If the iterations are continued, τk will continue growing, but we expect the rate of change of τk with k to slow down at high iteration counts. This is because at very high frequencies (what “very high” means depends on the PSF) the residuals cannot be quickly suppressed (the effect of the imaginary part of l in Eq. (20) becomes more significant).
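One possible way to automate the “nearly flat” termination test (a heuristic of ours, not a rule prescribed in the paper) is to compare the mean residual power inside a small low-frequency disc with the mean power outside it:

```python
import numpy as np

def residual_spectrum_is_flat(r, band_radius=8, tol=2.0):
    """Heuristic stopping check: declare the residual spectrum 'nearly flat'
    when the mean Fourier power inside a low-frequency disc of radius
    `band_radius` agrees with the mean power outside it within a factor `tol`.
    Both parameters are illustrative choices."""
    p = np.abs(np.fft.fftshift(np.fft.fft2(r))) ** 2
    n1, n2 = p.shape
    y, x = np.ogrid[:n1, :n2]
    rad = np.hypot(y - n1 // 2, x - n2 // 2)
    low = p[rad <= band_radius].mean()
    high = p[rad > band_radius].mean()
    return low <= tol * high
```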

To compensate for the dependence of the convergence rate on the spatial orientation of the harmonic components, the image scans are alternated cyclically between different raster orderings in consecutive iterations.

As a first implementation of the +SOR technique with adaptive relaxation, we address the “satellite” test problem of Paper I. This particular problem is very popular in the literature on astronomical image deblurring; a description of the artificial input data and references can be found in Paper I. The true object is shown in Fig. 1(a), and its blurred and noisy image in Fig. 1(b). The image size is 256 × 256.

Fig. 1 The true object (a), blurred image with noise (b), and the minimum-error restoration obtained with adaptive +SOR (c).

The convergence history of the +SOR iterations, defined as the relative mean error of the solution ||x − x_true||/||x_true|| versus iteration count, is shown by the solid line in Fig. 2. The adapted values of the relaxation parameter τk are shown along this line. As expected, the iterations start with τk close to the theoretical low-frequency limit 2d/(1 + d), which is 0.00039 in this experiment. As also expected, the relaxation parameter grows with iteration count k, from small values (deep underrelaxation) to τk > 1 (overrelaxation), and the growth rate slows down at higher k. The solution with minimum error (0.3294) is reached at iteration count k = 13; this solution is shown in Fig. 1(c). At higher iteration counts, the solution degrades because of the magnified high-frequency noise which is transferred into the solution.

Fig. 2 Restoration errors versus the iteration count k. The thick gray line marked with circles is the result obtained with the adaptive +SOR; figures along the line indicate the current value of the relaxation parameter τk. The dashed line shows the result obtained with a conjugate-gradient technique, modified to incorporate the non-negativity constraint (+CG).

The dashed line in Fig. 2 shows the convergence history of the solution obtained with a conjugate-gradient technique modified to incorporate the non-negativity constraint (+CG). We note that there is no unique way of adapting the conjugate gradients to the non-negativity constraint; here we used an adaptation advocated by Puetter et al. [8]. It takes 170 iterations for the +CG algorithm to deliver its minimum-error solution, which has ||x − x_true||/||x_true|| = 0.3299. The performance of the adaptive +SOR is much higher in this experiment: the minimum-error solution is more accurate, and the number of iterations required to achieve it is an order of magnitude smaller. The computational cost of a single iteration, however, is significantly larger: enforcing the positivity constraint at each sequential pixel update does not allow any reduction of the numerical expenses by excursions to the Fourier domain.

This particular test problem has been addressed using different inversion techniques in a number of earlier studies (see Paper I for references). The best restoration error quoted in the literature is ||x − x_true||/||x_true|| = 0.3406 [9]. The adaptive +SOR provides a restoration of significantly better quality.

To give deeper insight into how the adaptive +SOR works, Fig. 3 shows how the Fourier transform of the residuals changes during the iterative descent. In the initial blurred and noisy image (panel a), the Fourier power is sharply localized at the lowest spatial frequencies. After 15 iterations the Fourier power is nearly flat (panel b). After 30 iterations (panel c), we see a significant near-circular darkening developed at the center of the plot, of a size bigger than the bright spot in panel (a). This indicates that a significant amount of noise has been transferred from the data into the solution at frequencies where the true signal is well below the noise level. A discussion of the different stopping rules applicable to iterative inversions is outside the scope of this paper; we refer the reader to Hansen [10] and Vogel [11]. Allowing the iterations to proceed until the power spectrum of the approximation residuals becomes nearly flat is another way of detecting an optimal iteration count.

Fig. 3 Modulus of the DFT of the approximation residuals r^(k) in the adaptive +SOR iterative descent, at k = 0 (a), k = 15 (b), and k = 30 (c). The gray scale is taken such that the maximum signal (white) saturates at a level which is 1/1000 of its maximum value at k = 0 (panel a).

In Paper I, two other test problems were addressed using +SOR with a constant relaxation parameter chosen by experimentation. In both problems, adaptive relaxation brings better performance, especially in the problem of image restoration in 3-D space.

3. Blind deconvolution

We now turn to the implementation of the adaptive +SOR in the problem of blind deconvolution. The algorithm which we suggest is a double iterative process. One outer iteration involves a single descent in the merit function (Eq. (2)). After every few iterations, the descent is followed by an iterative application of a “refresh” procedure; in these inner iterations, the level of the merit function at which the iterations are terminated is kept unchanged. For illustration, consider first the processing of a single frame of data; the generalization to multi-frame processing will be straightforward.

The descent is performed as a linear combination of two trial descents, one in the object coordinates and another in the PSF coordinates. The coefficients of this linear combination are found from the requirement of optimal reduction of the merit function (discarding the higher-order cross terms arising from the simultaneous variation of the object and the PSF).
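The optimal combination of the two trial descents reduces to a 2×2 least-squares problem. The sketch below is illustrative (the names and the single-frame, periodic-convolution setting are assumptions, not the paper's code):

```python
import numpy as np

def descent_coefficients(r, obj, d_obj, psf, d_psf):
    """2x2 least-squares step combining the two trial descents: d_obj is the
    trial update of the object, d_psf the trial update of the PSF.  To first
    order (cross terms discarded) the change of the model image is
    alpha * (psf * d_obj) + beta * (d_psf * obj), with * denoting periodic
    convolution; the coefficients that best reduce the merit function Eq. (2)
    solve a 2x2 normal system."""
    def conv(a, b):
        return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b)))

    u = conv(psf, d_obj)                # data response to the object trial step
    v = conv(d_psf, obj)                # data response to the PSF trial step
    G = np.array([[np.sum(u * u), np.sum(u * v)],
                  [np.sum(u * v), np.sum(v * v)]])
    b = np.array([np.sum(u * r), np.sum(v * r)])
    alpha, beta = np.linalg.solve(G, b)
    return alpha, beta
```

The object and the PSF are then updated by alpha times the object trial step and beta times the PSF trial step, respectively.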

The refresh procedure consists of two steps. First, we fix the current guess for the PSF and run the usual (non-blind) iterative deconvolution for the object, starting from a zero guess, until the current value of the merit function is met. We then fix the resulting new guess for the object and perform, in a similar manner, the iterative deconvolution for the PSF. The important point (the central proposition of this algorithm) is that, when repeated, the refresh process converges. No mathematical proof of the convergence is available at the moment; we can only suggest some constructive arguments in support of this claim.

Each inner iteration (refresh) with adaptive +SOR produces a low-frequency approximation for the object (or the PSF) which is consistent with the current value of the merit function. When repeated, alternately for the object and for the PSF, the inner iterations tighten the Fourier power spectra of both the object and the PSF towards lower spatial frequencies. The process converges to the unique low-frequency joint approximation for the object and the PSF which corresponds to the current value of the merit function (strictly speaking, to a member of a family of approximations which differ in the relative scaling and shifts of the object and the PSF). Among all the possible solutions with the same merit, this approximation can be considered an “optimal” one, the most stable against distortion by high-frequency noise.

Starting with a simple, slowly varying initial guess, the iterative algorithm thus specifies a unique descent trajectory in the (object, PSF) coordinates, guiding the outer iterations along a sequence of optimal approximations. The descent has to be terminated when the noise level is reached. In multi-frame deconvolution, the trial descents are performed separately for each of the PSFs, which leads to solving a linear algebraic system of trivial structure. In the inner iterations, the PSF refresh is also performed separately for each frame, using individual merit functions.
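The overall structure of the outer and inner iterations can be summarized by the following outline for a single data frame; it is only a sketch, with the three helper routines passed in as hypothetical callables that are not specified here.

```python
import numpy as np

def blind_deconvolve_single_frame(frame, psf0, trial_descent, merit,
                                  deconvolve_to_level,
                                  n_outer=1000, refresh_every=4, n_inner=16):
    """Outline of the double iterative process for one data frame (in the
    multi-frame case the PSF steps are repeated per frame, as described above).
    The callables are hypothetical stand-ins: trial_descent() performs one
    combined descent in (object, PSF) coordinates, merit() evaluates Eq. (2),
    and deconvolve_to_level(data, kernel, level) runs non-blind adaptive +SOR
    from a zero guess until the merit function drops to `level`."""
    obj = np.zeros_like(frame)
    psf = psf0.copy()
    for k in range(n_outer):
        obj, psf = trial_descent(obj, psf, frame)          # one outer descent
        if (k + 1) % refresh_every == 0:
            level = merit(obj, psf, frame)                 # merit level is frozen
            for _ in range(n_inner):                       # inner "refresh" iterations
                obj = deconvolve_to_level(frame, psf, level)   # refresh the object
                psf = deconvolve_to_level(frame, obj, level)   # then refresh the PSF
    return obj, psf
```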

Our first numerical test is an artificial noise-free experiment, targeted directly at the ability of the deconvolution technique to accurately split the information contained in the blurred images between the object and the PSF(s). In this experiment, each 128 × 128 data frame represents the same true object, but blurred with a different PSF. The three upper panels of Fig. 4 show the true object (a), the PSF of the first frame (b), and the first data frame (c).

Fig. 4 Artificial noise-free blind-deconvolution experiment. First row: the true object (a); PSF of the first frame (b); and the first data frame generated from the convolution of the true object and the PSF (c). Second row: the results of deconvolution obtained with adaptive +SOR in 1000 iterations, in a single-frame experiment (d) and using two (e) and ten (f) frames of data. Third row: the inferred PSF of the first frame.

The relative norm of the approximation residuals (the ratio of the Euclidean norm of the residuals to the Euclidean norm of the original data frames) versus the iteration count is shown by the solid lines in Fig. 5. Deconvolutions were performed using one data frame only, and with two and ten frames. The refresh procedure was applied periodically, after every four iterations. Sixteen inner iterations were used in the single-frame inversion, and just one in the 2- and 10-frame experiments. The image scans were alternated periodically between four different orientations, i.e., every four consecutive iterations constitute a complete cycle. The object and PSF estimates obtained in 1000 outer iterations are shown in the lower panels of Fig. 4.

Fig. 5 The artificial noise-free experiment. Relative norm of the approximation residuals in the iterative blind deconvolution using the adaptive +SOR (solid lines) and the PCID algorithm based on conjugate gradients [12] (dashed lines).

To compare the adaptive +SOR with other techniques developed for blind astronomical-image deconvolution, the same input data were processed (using the same initial guess) with the positivity-constrained conjugate-gradient version of the PCID algorithm [12]. The convergence history is shown in Fig. 5 by the dashed lines (marked +CG) for the single-frame and ten-frame experiments. Compared with +SOR, the convergence of the +CG residuals is much slower at higher iteration counts, with a clear tendency to stagnation; the accuracy obtained with +SOR can hardly be achieved in any practical number of iterations. Visually, the restorations (we do not display them) are much poorer. Another problem with the +CG is that the results are sensitive to the initial guess: a simple rescaling of the initial guess brings the convergence closer to (though not better than) that of +SOR in the ten-frame deconvolution; at the same time, the one-frame results degrade. After a few initial iterations, the +SOR results become essentially independent of the initial guess, which is another important advantage of the technique.

Minimization of the residual norm to the 10⁻⁴ level (see Fig. 5) is hardly needed for working with astronomical data, since the noise level is usually much higher. However, it is important to make sure that the quality of the restored images is limited only by the data quality, not by the efficiency of the inversion technique.

We now examine the +SOR performance in its application to real data. For comparison, the same data are also processed with the PCID algorithm based on conjugate gradients (+CG). Our first data set consists of 10 consecutive images of the Seasat satellite. These data were acquired with the 1.578 m GEMINI telescope at the Air Force Optical Site (AMOS) on Maui, Hawaii. The observation wavelength was 651 nm and the exposure time for each frame was 7 ms. The first data frame (which is also by far the best one) is shown in Fig. 6(a). The 128×128 images were embedded into 256×256 data domains by padding with zeros, and the inversions were performed using 128×128 support domains for the object and for the PSFs. The initial guess for the PSFs was taken to be a Gaussian. The initial guess for the object was the input data (with shift-and-average in the multi-frame experiment) in the +CG inversion, and zero in the +SOR inversion. The inner iterations (+SOR) were run until near-convergence, after each even descent, starting with the initial guess.

Fig. 6 Blind deconvolution of the Seasat images. First row: the first data frame (a), and the results of deconvolution obtained with adaptive +SOR in a single-frame experiment (b) and using ten (c) frames of data. Second row: the results obtained with the PCID algorithm based on conjugate gradients [12].

The convergence history of the approximation residuals is shown in Fig. 7. It is best seen in the results of the single-frame deconvolution that the residual norm exhibits a step-like behavior: after passing a moderately well-defined plateau (iteration counts 30–50), the residual drops faster again. Our interpretation of this transition at higher iteration counts is that the iterative descent falls onto the trajectory leading to a trivial solution (a δ-function for the object, and the input data for the PSFs). The best restorations correspond, roughly, to the turning points of the convergence-history curves. At higher iteration counts, the restored images exhibit growing contamination by high-frequency noise. In the +SOR results, the optimal iteration count can also be identified with the near-flattening of the Fourier power of the approximation residuals (as in Fig. 3; we did not examine the residuals of the +CG inversion).

Fig. 7 As Fig. 5, but for the blind deconvolution of the Seasat data.

Restorations obtained with +SOR show less distortion by artifacts than the +CG results (Fig. 6). The “optimal” restorations also have a smaller residual norm (better likelihood).

The second data set consists of five 512×512 images of the Hubble Space Telescope collected with the 3.63 m AEOS telescope with adaptive-optics compensation at the AMOS site on Maui. The observation wavelength was 740 nm and the exposure time of each image was 98 s. The images were embedded into 1024×1024 data domains to allow 512×512 support domains for the object and the PSFs. In the inversion with +SOR, a zero initial guess was taken for the object, and a Gaussian guess for the PSFs. Eight inner iterations were used in the single-frame restoration, and four in the restoration with five frames of data; these were run after each even descent. In the +CG inversion, a Lorentzian guess was taken for the PSFs, and the first data frame was taken as the initial guess for the object.

The results are shown in Figs. 8 and 9. A distinctive feature of the images restored with +SOR is their very high dynamic range, which makes them difficult to display properly. If a linear gray scale were used in Fig. 8, only a few bright dots would be seen in the +SOR panels. The higher dynamic range of the results comes together with better resolution of the small-scale features. The +SOR restorations do not show any particular artifacts.

Fig. 8 As Fig. 6, but for the deconvolution of the HST images. Because of the high dynamic range of the +SOR results, the pixel values were rescaled by taking a cubic root (in all the frames displayed in this figure).

Fig. 9 As Fig. 5, but for the deconvolution of the HST data.

The +SOR minimizer met an unexpected obstacle in the multi-frame inversion when approaching an iteration count of about 50. The numerical expenses per iteration start to increase sharply. The relaxation parameter of the descents in object coordinates returns to small values and stalls, signalling the inability of the minimizer to suppress the lower spatial frequencies in the residuals; and the number of iterations required in the object refresh grows enormously, signalling that there is no object which would fit the input data to the required accuracy (Fig. 10). The only explanation for such behavior of the minimizer is an inconsistency in the information contained in the different data frames.

Fig. 10 Deconvolution of the HST images with adaptive +SOR. Left: relaxation parameter of the descents in object parameter space. Right: number of descents required for the object refresh.

To see better what is behind the scene, we took the result of the 5-frame inversion at iteration count 50 as an initial guess and performed 10 more descents, but now allowing the object to be improved separately in each individual frame. The result is shown in Fig. 11. What changes between the frames are glints, the specular reflections of sunlight from the protective shield. A posteriori, the changes can be seen in the input data (upper panels in Fig. 11): a bright spot in the middle of the “upper” solar panel fades gradually between frames 1 and 5. We conclude that multi-frame blind deconvolution with +SOR allows the detection of moderately small changes in the true object.

Fig. 11 Deconvolution of the HST data in separate frames. First row: input data. Second row: object restoration from an individual frame. In the second row, the color scale saturates at 50 percent of the maximum pixel value (red) to make the entire satellite visible.

4. Discussion

In all our experiments, the adaptive +SOR demonstrates a performance which is superior to that of the conjugate gradients. The adaptive scheduling of the relaxation parameter minimizes the number of required descents and, simultaneously, brings a solution of nearly optimal quality. In the implementation suggested for blind deconvolution, the iteration follows an optimal descent trajectory and reveals no signatures of stagnation, a problem which is common to other iterative blind-deconvolution techniques. The results of the iterative inversion are essentially independent of the initial guess. Another important feature of the technique is that it allows for the detection of moderately small changes in the true object between separate data frames.

Our current implementation of the adaptive +SOR in blind deconvolution is a double iterative process, with computations performed in the real domain. As a result, the algorithm is numerically expensive. The expenses are not intolerable: the computations reported in this paper were done on a laptop with a dual-core processor. However, deconvolution of the 512×512 images took hours, and multi-frame deconvolution took days. Better theoretical insight into the convergence properties is needed to eliminate the redundancy in the number of inner iterations. Acceleration by excursions to the Fourier domain has to be explored; this becomes possible when the +SOR descent is replaced with an unconstrained SOR descent followed by a projection onto the non-negative (convex) set.

Acknowledgments

This work was supported by the UK STFC under grant PP/E001459/1. S.M.J. was supported by award FA-9550-09-1-0216 from the US Air Force Office of Scientific Research. We thank an anonymous referee for comments which helped to improve the presentation.

References and links

1. V. N. Strakhov and S. V. Vorontsov, “Digital image deblurring with SOR,” Inverse Probl. 24, 025024 (2008) (Paper I).

2. K. Sauer and C. Bouman, “A local update strategy for iterative reconstruction from projections,” IEEE Trans. Signal Process. 41, 534–548 (1993).

3. J. A. Fessler, “Penalized weighted least-squares image reconstruction for positron emission tomography,” IEEE Trans. Med. Imaging 13, 290–300 (1994).

4. A. Björck, Numerical Methods for Least Squares Problems (SIAM, 1996).

5. C. W. Cryer, “The solution of a quadratic programming problem using systematic overrelaxation,” SIAM J. Control 9, 385–392 (1971).

6. R. G. Lane and R. H. T. Bates, “Automatic multidimensional deconvolution,” J. Opt. Soc. Am. 4, 180–188 (1987).

7. R. T. H. Bates, B. K. Quek, and C. R. Parker, “Some implications of zero sheets for blind deconvolution and phase retrieval,” J. Opt. Soc. Am. 7, 468–479 (1990).

8. R. C. Puetter, T. R. Gosnell, and A. Yahil, “Digital image reconstruction: deblurring and denoising,” Ann. Rev. Astron. Astrophys. 43, 139–194 (2005).

9. D. Calvetti, G. Landi, L. Reichel, and F. Sgallari, “Non-negativity and iterative methods for ill-posed problems,” Inverse Probl. 20, 1747–1758 (2004).

10. P. C. Hansen, “Analysis of discrete ill-posed problems by means of the L-curve,” SIAM Rev. 34, 561–580 (1992).

11. C. R. Vogel, Computational Methods for Inverse Problems (SIAM, 2002).

12. C. L. Matson, K. Borelli, S. M. Jefferies, E. K. Hege, C. C. Beckner, and M. Lloyd-Hart, “A fast and optimal multiframe blind deconvolution algorithm for high-resolution, ground-based imaging of space objects,” Appl. Opt. 48, A75–A92 (2009).
