Abstract
In this paper we investigate the performance of a Fourier based algorithm for fast subpixel shift determination of two mutually shifted images subjected to noise. The algorithm will be used for Shack-Hartmann based adaptive optics correction of images of an extended object subjected to dynamical atmospheric fluctuations. The performance of the algorithm is investigated both analytically and by Monte Carlo simulations. Good agreement is achieved in relation to how the precision of the shift estimate depends on image parameters such as contrast, photon counts and readout noise, as well as the dependence on sampling format, zero-padding and field of view. Compared to the conventional method for extended object wavefront sensing, a reduction of the computational cost is gained at a marginal expense of precision.
©2005 Optical Society of America
1. Introduction
Within the last ten years adaptive optics (AO) correction of astronomical images has become a standard technique on many observatories. For stellar observations the standard wavefront sensing method uses the Shack-Hartmann sensor. The pupil is reimaged onto a lenslet array, thereby splitting the pupil into subpupils. For each lenslet, a subimage of the reference object is produced. If the reference object is a point source, the position of the center of intensity in the subimage defines a vectorial shift that is proportional to the average wavefront gradient over the corresponding lenslet. These vectorial shifts serve as sensor signals for controlling the actuators on a deformable mirror counteracting the wavefront error.
In case the reference object of interest is extended (e.g. the Sun or the Moon), the procedure of measuring the center of intensity must be substituted by cross correlation analysis [1, 2]. The shift of each subimage is estimated relative to a reference subimage for which the shift is estimated relative to the temporally previous subimage. Related to astronomical observations, the main problem is photon starvation resulting in noisy subimage shifts. The effects of noise is well understood for stellar AO [3, 4], and it has also been studied for correlation based wavefront sensing, e.g., in [5, 6].
In this paper we take starting point in a least squares estimate of the shift, expressed in spatial frequency space. The estimate is based on the phase of the cross correlation spectrum. An analytical model for the photon and readout noise induced RMS error to the shift estimate is then derived. In the next section, analytical predictions of how the RMS error depends on object contrast, photon counts and readout noise are verified by simulations, as are the dependencies on sampling format, zero-padding and field of view (FoV). In the last section, the reduction of computational cost for this method compared to the conventional method, based on the localization of the cross correlation peak [6], is discussed.
2. Analytical model
2.1. Shift estimate and noise performance
Given the two images f(r) and g(r), where r = (x,y), and assuming g(r) = f(r - r 0), a least squares estimate of the shift r 0 can be obtained by a minimization with respect to r 0 of the squared error ε 2 given by
where F(f) and G(f) are the Fourier transforms of f(r) and g(r) respectively. The squared error ε 2 can be minimized setting the derivatives with respect to x 0 and y 0 to zero. This leads to
where ℜ extracts the real part and * denotes the complex conjugate. Given that φC (f) - 2π f·- r 0 is small, where φC (f) is the phase of the cross correlation spectrum C(f) = |C(f) | exp[iφC (f)] = F(f)G *(f), leads to the following estimate of the shifts x̂ 0 and ŷ 0, for details see [7],
Given a sampled version of f(r) and g(r), for which r = (x,y) = Δ(p,q), and using the discrete Fourier transform (DFT), for which f = (fx , fy ) = Δ f (u,v), the shift estimate takes the form
The sampling format m satisfies ΔΔ f = 1/m, where m × m is the number of pixels (sampling points) in each image. Assuming the two images f(r) and g(r) to be identical but mutually shifted and subjected to additive noise we have F(f) = F 0(f) + N 1(f), G(f) = G 0(f) + N 2(f) and |F 0(f)| = |G 0(f)|. For large signal to noise ratios |F 0(f)|/|N(f)|, or equivalently |N 1(f)|/|F 0(f)| ≪ 1 and |N 2(f)|/|F 0(f)| ≪ 1, and a small correlation phase φc (f) this leads to
Here φc 0 (f) is the ideal correlation phase and φN (f) is the noise contribution. φN1 (f) and φN2 (f) are the random noise phases and φF 0 (f) and φG0 (f) are the phases of the noise-free spectra. Eq. (5) shows that, as expected, a large signal to noise ratio leads to a small noise contribution to the correlation phase and hence also to the shift estimate in Eq. (4). It is therefore important that contributions from spectral components where the signal to noise ratio is small are rejected completely from the shift estimate, as pointed out in references [8, 9].
Assuming 〈|N 1(f)|2〉 = 〈|N 2(f)|2〉 = 〈|N(f)|2〉, where 〈〉 stands for ensemble average, and using Eq. (5), it is seen that the sampled version of φN (f) satisfies
Replacing φC (u,v) in the b-coefficients, last line in Eq. (4), by ΦN (u,v), the noise contributions δx and δy to the shift estimates can be calculated. Assuming no correlation of φN (u,v) for different (u,v), we get after some calculations
It should be mentioned that since the images are real, the cross correlation spectrum is Her-mitian, i.e., C(u,v) = C *(-u, -v), and only half of the points in the spectrum need to be included in Eq. (4). In relation to astronomical AO, where the speed of the calculations is of crucial importance (~ 1 kHz sampling rate), the simplest possible estimate based on only two frequency values (u,v) = (u 1,0) and (u,v) = (0,v 1) simplifies the shift estimate in Eq. (4) to
This estimate is independent of the amplitude of the correlation spectrum |C(u,v)|, cf. Eq. (4). It is straightforward to show that this will be the case for all estimates based on only two linearly independent frequency components of the correlation spectrum. As pointed out by [8, 9] the estimate based on u 1 = v 1 = 1 has the advantage of being the least sensitive to aliasing. Furthermore, it is an inherent feature of the optical transfer function, that it acts as a spatial low-pass filter leaving the low frequency components in the image with the highest contrast and hence most reliable. Taking u 1 = v 1 = 1 it is seen that the condition for avoiding unwrapping the phase φC leads to |x 0| ≤ mΔ/2 and |y 0| ≤ mΔ/2, or in other words, the shift should be less than half of the field of view. Invoking more than two frequency components as given in Eq. (4) yields the possibility of suppressing contributions from unreliable frequency components having a low |C(u,v)| but still a satisfactory signal to noise ratio. Use of Eqs. (6) and (7) gives the error contributions 〈δx 〉1/2 and 〈δy 〉1/2 to the two point estimate in Eq. (8):
2.2. Performance in the presence of photon and readout noise
Letting f(r) and g(r) be measured in units of photon-counts/arcsec2 on the sky, we have
where N tot = Δ2∑∑f 0(p,q) is the total photon counts in the noise free image f 0(p,q) and |V(u,v)| = |F(u,v)|/|F(0,0)| is the visibility, or contrast, of the frequency component (u,v). Assuming the noise contribution to fluctuate independently in different pixels we get
where the index i stands for either 1 or 2. Δ4〈(p,q)〉 = Δ2 f 0(p,q) is valid for photon (Poisson) noise and Δ4 〈(p, q)〉 = 〈 (p,q)〉 is valid for readout noise, where 〈(p, q)〉1/2 is the readout noise in e-. Introducing these relations in Eq. (11), and assuming uncorrelated noise sources, leads to
Use of Eqs. (9), (10) and (12) gives the result below for the simple two point estimate in Eq. (8):
This is in correspondence with the estimated error given in [3, 4] for a quadrant detector, m = 2, and assuming Nyquist sampling. Several useful conclusions can be drawn from the above expression; (i) the standard error is inversely proportional to the contrast, (ii) neglecting the readout noise, the standard error is proportional to the inverse of the square root of the total photon counts in the subimage, (iii) increasing u 1 and v 1 will only lead to improved performance if | V(u 1, 0) |u 1 and |V(0, v 1) |v 1 are also increased. A few interesting cases can also be considered (neglecting the readout noise).
Case 1: Sampling format. The sampling interval Δ is changed but the field of view (FoV) is fixed. The format m → km, where k < 1 corresponds to less dense sampling and k > 1 corresponds to more dense sampling. The signal N tot is unchanged, δ → Δ/k and the constant FoV implies that δ f is unchanged. The continuous image is unchanged and so is the visibility |V(u,v)|. According to Eq. (13) this leads to 〈δ 2〉 = 〈〉 + 〈〉 unchanged as long as aliasing can be neglected. If the contribution from the readout noise is included it is seen that the performance degrades with increased sampling.
Case 2: Zero-padding. The FoV, or image, is zero-padded but Δ is fixed. The format m→km, where k≥1. N tot and Δ are unchanged and Δ f → Δ f /k. In case the same physical frequency components are used, that is u→ku and v → kv, this leads to unchanged visibility |V(ku, kv)|. Eq. (13 ) gives that the noise contribution 〈δ 2〉 is unchanged. However, zero-padding leads to the lowest frequency values being diminished, probably leading to higher contrast and a better performance for u 1=v 1 = 1 for the zero-padded image.
Case 3: Field of view. The effective FoV is enlarged by increasing the number of pixels which implies that the format m → km. The signal N tot → k2Ntot, Δ is unchanged and Δ f → Δ f /k. The following controlled case may be considered here: The field is enlarged by enlarging the image feature, f(r) → f(r/k), which means that the Fourier transform F(f) → kF(k f). This means that the visibility |V(u, v)| is unchanged and that 〈δ 2〉 is unchanged. However, enlarging the FoV and leaving the overall spectrum unchanged will lead to improved performance for the frequency components showing unchanged visibility. Furthermore the lowest frequency component u 1 = v 1 = 1 may show higher contrast further improving the performance.
3. Simulations of the noise performance
3.1. Procedure
The performance of the algorithm was evaluated using two identical, but mutually shifted, images. To estimate a realistic shift between the images, the RMS jitter (in radians) caused by the tip and tilt of the wavefront over a lenslet, given in [10], was used
where the diameter D of the aperture is approximated to the side of a lenslet in the Shack-Hartmann sensor and r 0 is the Fried parameter (projected onto the lenslet array). We assume that the side of the lenslet is matched to the Fried parameter, D = r 0. Given that the lenslet image is properly sampled in the focal plane, i.e. Nyquist sampling, the angular image extent of a pixel is
These relationships give that the image RMS jitter is ≈ 1.2 pixels in image space. The reference image, f 0(p,q), and test image, g 0(p,q), both scaled to an average intensity level, were shifted such a distance with respect to each other. The fractional pixel shift was obtained by downsampling images from a larger format. Photon and readout noise were then added to the images according to
where ε P and ε R are integer random variables described by a zero mean normal probability density function with σ2 εP = f 0(p, q) and = 〈〉 respectively. The signal f 0(p, q) was assumed to be large, such that the Poisson probability distribution could be approximated with a discrete normal probability distribution. The signal g(p, q) was obtained in the same way.
The DFT’s of f(p, q) and g(p, q) were then calculated and the correlation spectrum is given by C(u,v) =F(u,v)G *(u,v). From the correlation spectrum phase, the shift could then be estimated, either by the simple expression given by Eq. (8) or by the amplitude-weighted estimate given by Eq. (4). Also, the conventional method of localization of the cross correlation peak by parabolic interpolation in c(p, q) = ℱ-1{C(u,v)}, see [6], was used as reference.
As reference image, an ideal negative Gaussian function on a bias level was used to produce a crater-like image, f(r) =b 1-b 2 · exp[-r 2/2σ 2]. The parameters b 1 and b 2 could be tuned to set the signal and contrast level of the image. The width of the function was set to σ = Δm/6, a sixth of the field of view, in these simulations. Examples of a single noise realisation of the images and the correlation spectrum of this feature are given in Fig. 1. In the same way a more irregular object, cut out from a real image, was used in Fig. 2. In both cases, the average signal per pixel was set to 600 e- and the readout noise was 8 e- per pixel. It is seen in the figures that only the low frequency components exhibit a gradient in the correlation spectrum phase.
A number of noise realisations could then be used to obtain a measure of the standard error caused by noise. The simulations were also used to investigate bias errors in the estimates, by changing the angle of the separation between the images.
For the case of the ideal Gaussian function, see Fig. 1, the estimates obtained for eight different shift directions, 500 runs for each direction, are given in Fig. 3(a)–(c). In Fig. 3(a), the shift was estimated with Eq. (8), using only the frequency components ϕC (1,0) and ϕC (0,1). The shift in Fig. 3(b) was estimated with Eq. (4), setting all C(u, v) to zero except for the lowest frequency components C(1, - 1), C(1,0), C(1,1) and C(0,1). In in Fig. 3(c) the shift was estimated using the conventional method of localization of the cross correlation peak by parabolic interpolation, see [6] for details.
In the same way, this was reproduced for the irregular image in Fig. 2. The estimates based on Eq. (8) and ϕC (1,0) and ϕC (0,1), are shown in Fig. 3(d). The estimates were improved if Eq. (4) was applied, using only the four low frequency components C(1, -1), C(1,0), C(1, 1) and C(0,1). This is seen in Fig. 3(e). Again the shifts given by the the conventional method of localization of the cross correlation peak by parabolic interpolation, are shown in Fig. 3(f).
It is seen that by including more frequency components in the estimate, the precision was improved. Related to Fig. 3(a), the noise contribution 〈〉1/2/Δ in the simulation was 0.13 pixels. The parameters used in this simulation were m = 8, N tot = 38400 e-, |V(1,0)| = 0.053 and 〈〉1/2 = 8 e- and the analytical prediction in Eq. (13) leads to the same result. The noise contribution in Fig. 3(b) was 0.10 pixels. Using the conventional shift algorithm, relying on subpixel localization of the cross correlation peak by fitting a quadratic function to the peak and neighbour values, the noise contribution in Fig. 3(c) turned out to be 0.09 pixels.
3.2. Dependence on contrast, signal level and readout noise
By altering the contrast |V(u,v)|, or the signal level N tot, or the readout noise 〈〉1/2, in the input images, the effect of this on the estimated standard error was compared with the analytical prediction given by Eq. (13). The results are shown in Fig. 4. The values of the parameters used in the simulations were |V(1,0)| = 0.053, N tot/m 2 = 600 e-and 〈〉1/2 =0 e-, unless variable, and the image format was m = 8. No shift between the images, with the ideal negative Gaussian object, was introduced in this simulation and 1000 noise realisations were used to estimate each data point in the plots.
3.3. Dependence on sampling format, zero-padding and field of view
The three cases described in the discussion following Eq. (13) were evaluated by simulations as well. The same images were used for these simulations, the ideal Gaussian image, starting with a sampling format of m = 8. No shift was introduced between the images, and the shift estimate was obtained using Eq. (8). The parameters used were |V(1,0)| = 0.053, N tot = 38400 e- (unless variable) and 〈〉1/2 = 0 e-. 1000 noise realisations were used to estimate each data point.
For case one, i.e. increased sampling within a constant field of view, N tot was kept constant, whereas the sampling format m was altered to km (the downsampling from the higher format was changed). The estimated shifts were obtained using Eq. (8) with u 1=v 1 = 1. The estimated standard error 〈〉1/2/Δm, given as a fraction of the field of view, is shown in Fig. 5(a).
For case two, the images were zero-padded before doing the DFT’s, hence extending the format by a factor of k. The estimated shifts were obtained using Eq. (8) with u 1=v 1=k. The estimated standard error 〈〉1/2/δ, in pixels, is shown in Fig. 5(b).
For case three the same procedure as for case one was used, with the only difference being that the signal was scaled to k 2 N tot. The estimated standard error 〈〉1/2/Δ, in pixels, is given in Fig. 5(c). The simulated cases are in agreement with the discussion following Eq. (13). There are no tendencies of variations in 〈〉1/2.
4. Gain in computational time
The main reason for investigating this algorithm is the desire for a fast algorithm in order to reduce the servo-lag of the closed loop AO system. Compared to the conventional method of localizing the cross correlation peak by parabolic interpolation, there is a reduction of the computational time when using the method described by Eq. (8).
For the conventional method of fitting a parabolic function to the cross correlation peak there are a number of computations to be carried out. For each shift estimate, or for each subaperture, there is an FFT of the test image, for which the computational time scales as t F m 2log2 m 2. This is then followed by a multiplication F(u,v)G *(u,v) that scales as tMm 2. The inverse FFT of the complex cross correlation spectrum is then calculated to obtain the cross correlation function, and the computational time for this scales as 2t F m 2log2 m 2, where the factor 2 comes from the fact that the correlation spectrum is complex. tM should be smaller than 2tF since it only includes a complex multiplication whereas 2tF also includes a complex addition. The peak value is then to be found and this process scales as t S m 2. The overhead time, t OH1, consumed by the parabolic interpolation is constant. The total time for the method is given by
As m → ∞ it is seen that this method scales as O(m 2 log2 m 2).
If Eq. (8) is used instead, only the two frequency components G(u 1,0) and G(0,v 1) have to be calculated. The computational time for this scales as 2t M m 2. The overhead computational time for the correlation phase and shift estimate is independent of m and is given by t OH2. The total computational time for the algorithm using Eq. (8) is
and thus scales as O(m 2).
Comparing the computational time of the two algorithms one can see that there is a gain if there is a possibility to use Eq. (8) instead of the conventional method. The gain t cc/tφ scales as O(log2 m 2). As m grows large, it is arguable whether Eq. (8) will give a reasonable estimate. More frequency components could be included to improve the precision, if they show reliable noise performance.
The computational times of the two different algorithms were estimated using a Matlab function written for each routine. The computational speed was optimized as far as possible for both routines. The format m was varied and 10000 runs were carried out for each format to estimate t cc and tφ . Matlab 7.0 ran with real time priority under Windows XP on a 1.4 GHz Intel Pentium M processor. Given this non real time operating system, 10 repetitions were carried out to estimate the variation in computational speed.
The FFT library, FFTW [11], used in Matlab implies that the computational time for an FFT is not strictly proportional to m 2 log2 m 2. The efficiency of the FFTW-library varies with frame format as well as with the processor that the system is running on, see benchmarking in [11].
The measured computational time for a single shift estimate using the traditional cross correlation procedure,t cc, is normalized with m 2 log2 m 2 and plotted in Fig. 6(a) versus frame format. As expected from Eq. (17), the normalized t cc tends to be constant for large m.
The measured computational time for a single shift estimate using Eq. (8), tφ , is normalized with m 2 and plotted in Fig. 6(b) versus frame format. As expected from Eq. (18), the normalized tφ tends to be constant for large m. The absolute time for the two methods can be extracted from Fig. 6(a) and Fig. 6(b) respectively, but it is of course computer dependent.
Finally, the fraction between the times of the two methods as a function of frame format is given in Fig. 6(c). The predicted O(log2 m 2) is not clearly seen due to the fact that for these small formats there is a considerable overhead time for the methods as seen in Fig. 6(a) and Fig. 6(b). For the format of m = 8 the gain corresponded to approximately seven.
5. Conclusion
A fast algorithm for subpixel shift determination of two images was presented in this paper. Based on only two frequency components in the cross correlation spectrum, the precision of the shift estimate was comparable to the conventional method of localization of the cross correlation peak by parabolic interpolation, for guide objects with a low frequency content. The analytical expression of the precision of the method, depending on noise and image contrast, agreed well with the simulations. This was also valid for the dependence on sampling format, zero-padding and field of view. A reduction of the computational cost was demonstrated when using only two frequency components. This is at the expense of a marginal decrease of performance which could be overcome invoking more frequency components.
The accuracy and precision of this method, similar to all cross correlation methods, will be dependent on the guide object. The remedy to the bias errors in the estimates is to use the algorithm in closed loop AO systems. Furthermore, noncyclic boundaries of the images might necessitate windowing of the images, e.g. with a Hann window, prior to calculation of the cross correlation spectrum.
Acknowledgments
This work has been supported by the Royal Physiographic Society in Lund and by Science Foundation Ireland under Grant 01 PI.2 B039C.
References and links
1 . O.von der Lühe , A. L. Widener , Th. Rimmele , G. Spence , R. B. Dunn , and P. Wiborg , “ Solar feature correlation tracker for ground-based telescopes ,” Astron. Astrophys. 224 , 351 – 360 ( 1989 ).
2 . T. R. Rimmele and R. R. Radick , “ Solar Adaptive Optics at the National Solar Observatory ,” in Adaptive Optical System Technologies , D. Bonaccini and R. K. Tyson , eds., Proc. SPIE 3353 , 72 – 81 ( 1998 ). [CrossRef]
3 . G. A. Tyler and D. L. Fried , “ Image-position error associated with a quadrant detector ,” J. Opt. Soc. Am. 72 , 804 – 808 ( 1982 ). [CrossRef]
4 . B. M. Welsh , B. L. Ellerbroek , M. C. Roggemann , and T. L. Pennington , “ Fundamental performance comparison of a Hartmann and a shearing interferometer wave-front sensor ,” Appl. Opt. 34 , 4186 – 4195 ( 1995 ). [CrossRef] [PubMed]
5 . V. Michau , G. Rousset , and J. C. Fontanella “ Wavefront Sensing from Extended Sources, ” in Real Time and Post Facto Solar Image Correction , R. R. Radick , ed. ( NSO/SP Summer Workshop Series No. 13 , 1992 ), pp. 124 – 128 .
6 . L. A. Poyneer , “ Scene-based Schack-Hartmann wave-front sensing: analysis and simulation ,” Appl. Opt. 42 , 5807 – 5815 ( 2003 ). [CrossRef] [PubMed]
7 . M. Owner-Petersen , “ An algorithm for computation of wavefront tilts in the LEST solar slow wavefront sensor ,” in Real Time and Post Facto Solar Image Correction , R. R. Radick , ed. (NSO/SP Summer Workshop Series No. 13, 1992 ), pp. 77 – 85 .
8 . H. Stone , M. Orchard , and E.-C. Chang , “ Subpixel Registration of Images ,” in Conf. Record of the 33rd Asilomar Conf. on Signals, Systems, and Computers M. B. Matthews , ed. ( Pacific Grove, Calif. , 1999 ) pp. 1446 – 1452 .
9 . H. S. Stone , M. T. Orchard , E.-C. Chang , and S. A. Martucci , “ A Fast Direct Fourier-Based Algorithm for Subpixel Registration of Images ,” IEEE Trans. Geosci. Remote Sens. 39 , 2235 – 2243 ( 2001 ). [CrossRef]
10 . D. P. Greenwood and D. L. Fried , “ Power spectra requirements for wave-front-compensative systems ,” J. Opt. Soc. Am. 66 , 193 – 206 ( 1976 ). [CrossRef]
11 . M. Frigo and S. G. Johnson , “ FFTW Home Page ,” ( 2005 ), http://fftw.org/ .