Optimal Fourier control performance and speckle behavior in high-contrast imaging with adaptive optics

Lisa A. Poyneer; Bruce A. Macintosh

doi:10.1364/OE.14.007499

1. Introduction

One of the most active areas in astronomy in recent years has been the study of extrasolar planets. Since 1995 more than 160 planets have been discovered orbiting stars other than the sun [1]. The vast majority have been detected through measurements of the Doppler shift of the parent star caused by the gravitational influence of the planet and are found within five astronomical units of their parent star—different from the location of giant planets in our solar system.

A powerful complement to Doppler techniques would be the direct imaging detection of photons from the extrasolar planets themselves. Imaging detection is primarily sensitive to planets in wide orbits where scattered starlight is less of a problem. Imaging techniques require only a short period of time rather than a full orbit, and through photometry or spectroscopy can measure properties of the planet such as radius, temperature, and composition. Unfortunately such detection is extremely challenging. Jupiter, seen from outside our solar system, is 109 times fainter than the sun; a Jupiter analog in another solar system would be swamped by the light from its parent star, scattered by diffraction, optical imperfections, and atmospheric turbulence. At infrared wavelengths, younger and more massive planets will be brighter but still extremely challenging to detect with current telescopes and adaptive optics (AO) systems.

Truly imaging a large sample of extrasolar planets will require an AO system designed specifically for the high-contrast imaging task. This AO system will have dense actuator spacing, a coronagraph to control diffraction, and will be designed not merely to maximize Strehl ratio but to maximize detectability of a planet near a bright star. The spatially filtered wavefront sensor [2] will prevent aliasing in the wavefront sensor (WFS). This will produce a “dark hole region” of reduced point spread function (PSF) intensity in the AO controllable band. Even with bright stars, such systems will typically have very small subapertures (d ≈ 20 cm) and high update rates (1000 – 2000 Hz) and hence still be photon-starved, particularly since many of the stars with the brightest planets are often distant and faint. As a result, these systems must employ optimal controllers to achieve the best possible performance from their available photons. We have proposed a system using an optimal Fourier controller (OFC) [3]. The first major aim of this study is to explore how the gain optimization process works and affects performance.

No AO system can remove all the scattered light in the PSF of a target star. The ability to detect a planet will be limited by the amount and character of this residual light—not merely the Poisson noise set by photon statistics, but the properties of residual speckles present in the PSF [4]. The second major aim of this study is to explore the qualities and temporal behavior of the science images produced by the AO system as a function of the residual phase error.

2. System specifications

2.1. Wavefront control

The control system uses the OFC scheme developed by Poyneer and Véran [3]. OFC consists of two portions. At every time step, the computationally efficient Fourier transform reconstruction [5] (FTR) method is used to obtain the residual phase estimate from the slope measurements. FTR applies fast discrete Fourier transforms (DFTs) to the slopes and obtains the phase estimate by filtering. This filter consists of an inversion of the WFS process, a compensation for the deformable mirror (DM) response, and a control gain for each mode. The reconstructor works on a spatial frequency grid, where the frequencies k and l range from -N/2 + 1 to N/2, for reconstruction done on a N ×N grid.

Fig. 1. Block diagram of the AO control system for a complex Fourier modal coefficient. All modal signals are complex-valued. The WFS, FTR, and DM units are boxed off.

Download Full Size | PDF

The modal gains are optimized by supervisory processes using a formalism originally developed by Gendron [6]. Measurements of modal coefficients are accumulated in a buffer and periodically used (around 1 to 2 Hz) to estimate the optimal gain for each mode given the current temporal characteristics of the atmospheric turbulence and WFS noise. This is done through temporal power spectral density (PSD) estimation and solving the optimization problem given the control system transfer functions. Currently the optimization criterion is the minimization of the total residual error power, which maximizes the Strehl.

This two-step wavefront control scheme provides computational efficiency and real-time adaptability. An in-depth treatment of the method is found in previous work [3]. For our present purposes we will summarize the salient technical points of OFC, incorporating some refinements to the model.

The FTR method enables the use of Fourier modes as the natural modal basis for wavefront control. This is particularly desirable for two reasons. First, the Fourier modes have a natural relationship with PSF structure at high Strehl ratios [4]. A spatial frequency pair [k,l] in the Fourier reconstructor scatters light in the PSF to a location λk/D, λl/D, where λ is the wavelength of light in the science leg and D is the aperture diameter. This means that optimizing the Fourier modes in this form of wavefront control optimizes specific locations in the PSF. This is very useful for high-contrast imaging where we are not concerned with a single parameter like Strehl ratio, but rather with the structure of the PSF. Second, the Fourier modes are eigenfunctions of linear, shift-invariant systems (LSI). Since the AO system components are LSI, this allows us to represent each element in frequency space as a simple complex gain. That means that the control system breaks into independent loops, one for each spatial frequency pair [k,l]. Temporal analysis of the control system can then be done for each mode following established techniques[7]. Figure 1 shows the refined block diagram for the temporal control of a modal coefficient. We now control the cosine and sine modes at a specific spatial frequency jointly as a complex coefficient. This makes all the modal time series in the diagram complex-valued. Note that this means the complex time series is directly used in the modal gain optimization.

The phase aberration to be corrected is given by the modal coefficient m[t]. This is compensated by the DM signal. The residual error e[t] is then seen by the science camera. This residual is also seen by the WFS. The x- and y-slopes are obtained from the spatially filtered Shack–Hartmann WFS as a simple delay due to integration time and a scaling by complex coefficients g_wx and g_wy , respectively. The WFS noise comes in on both slope measurements. These two signals n_x [t] and n_y [t] are uncorrelated. The slopes are then reconstructed. FTR in modal formulation does reconstruction as a simple scaling and addition of the slopes in the spatial frequency domain. If the complex-valued x-slope modal coefficient is noted by x[t] and likewise the y-slope coefficient is y[t], the estimate of the modal phase coefficient is

ϕ [t] = x [t] g_{rx} + y [t] g_{ry},

where these coefficients are given by

g_{rx} = \frac{1}{g_{d}} (\frac{g_{wx}^{*}}{{∣ g_{wx} ∣}^{2} + {∣ g_{wy} ∣}^{2}}),

and

g_{ry} = \frac{1}{g_{d}} (\frac{g_{wy}^{*}}{{∣ g_{wx} ∣}^{2} + {∣ g_{wy} ∣}^{2}}) .

When the FTR coefficients are determined from measured WFS gains g_wx and g_wy and DM gain g_d , the filter is termed custom. This is in contrast to theoretical filters based on analytic models of the elements, which we have used previously. These theoretical filters perform well but do not produce unity gain through the system. In our simulations we have generated such a custom filter and have used it.

After phase estimation, the modal gain is applied. This is split out from the control law because the gain is actually applied with the reconstruction filter. After this scaling, the modal coefficient is measured and stored in a buffer for later optimization. This signal s[t] is sent on the control law. We use a leaky-integrator control law of the form

C (z) = \frac{g}{1 - c z^{- 1}},

where the integrator value c is just less than one and the gain g is bounded from 0 to 1 by stability. After integration, the modal signal is sent to the DM. Because the modes are Fourier modes, the DM influence function simply scales each mode by a gain g_d (see Ref. [3] for more discussion). The delay here represents the computational delay of the system. This gives the simulation system a two-step delay due to the WFS integration time and the computation time. The DM signal is then subtracted from the input atmospheric phase aberration to close the loop.

2.2. Simulation

A detailed end-to-end AO system simulation with Fourier optics components is used to test new algorithms and evaluate system performance. This simulation incorporates a frozen-flow atmosphere and a full implementation of the AO system elements and wavefront control scheme. It is based on the design for a high-contrast imaging AO system for a D = 8 meter class telescope. The subaperture diameter is d = D/44, and the Fourier reconstruction uses a N = 48 grid for reconstruction and filtering. The block diagram of Fig. 1 describes the simulation’s behavior.

The base atmosphere is created given the Cerro Pachon atmospheric models [8] and condensed to two layers. Each large phase screen is created using a spectral factor method [9]. The screens are created to be large enough that the entire exposure time will fit in the width of the screen, as opposed to simply wrapping around the edges. The phase aberration in the pupil at any instant in time is available using subpixel shifting of the phase screens and ray tracing. We are not concerned with scintillation for this study, so ray tracing, not optical propagation, is sufficient for producing the phase error in the pupil plane.

The AO simulation runs at an internal system rate of 2 kHz. At each time step, the phase aberration in the pupil is phase-conjugated by the current surface of the DM and tip-tilt mirror. The residual phase is then sent to the science camera unit, which generates PSFs with a gray-scale Blackman function apodization on the amplitude to suppress diffraction. This is a computationally efficient way to model ideal coronagraphy.

The residual phase is sent to the WFS unit. The WFS contains a spatial filter, which is a square field stop of angular size λ_wfs/d in the focal plane before the lenslet array, where λ_wfs is the WFS wavelength and d is the subaperture size in the pupil. The spatial filter removes spatial frequency components in the phase that are beyond the ability of the system to measure or correct. As a result, it removes the aliasing error term and enables a dark region of improved correction inside the controllable band of the AO system.

After the spatial filter, the lenslet array generates spots on 2×2 pixel quadcells, with pixel size 1.8”. The WFS operates in the linear range given the pixel size. These pixel values are correctly generated to be integrated intensities over the area. Photon and read noise are added given system radiometry. The total number of received photons in a subaperture is based on a combination of standard photometric calibration to Vega, and an assumed transmission of the telescope and AO system and CCD efficiency of 0.468. The CCD has an assumed 8 electrons read noise per pixel. For a given star magnitude I, the signal-to-noise ratio (SNR) of the WFS measurement can be calculated using standard methods [10]. In our specific case, the WFS SNR is exactly

SNR (F, I) = \frac{0.146202 (10^{9 - 0.4 I})}{F} {(\frac{0.146202 (10^{9 - 0.4 I})}{F} + 256)}^{- \frac{1}{2}},

where F is the frame rate of the AO system.

The slopes are calculated with the standard quadcell centroid algorithm and are then sent to FTR, where the phase is reconstructed and the current modal gain applied. The phase estimate is then integrated and sent to the DM. The DM surface is created using an influence function model. Full system telemetry is stored, and the gain optimization occurs as a supervisory process on this data. When the system is run at a lower frame rate than 2 kHz, the atmosphere and science camera continue to run at 2 kHz, while the AO updates occur at the slower rate.

3. Performance of the gain optimizer

The gain optimization process minimizes the residual error power, which is the variance of ε[t]. Assuming that the signal m[t] and the noises n_x [t], n_y [t] are wide-sense stationary (WSS) random processes, we can easily analyze how the control system modifies them using temporal PSDs and transfer functions. We will further assume for clarity of notation that the random processes are also zero-mean.

In particular, if the atmospheric modal coefficient m[t] has temporal PSD P_m (ω), then the residual error ε[t] due to uncorrected atmosphere has temporal PSD

P_{ε} (ω) = P_{m} (ω) \frac{1 + c^{2} - 2 c \cos ω)}{1 + c^{2} + g^{2} - 2 c (1 + g) \cos ω) + 2 gcos 2 ω)} .

The variance is found by integrating the PSD as

σ_{ε}^{2} = \frac{1}{2 π} \int_{- π}^{π} P_{ε} (ω) dω .

Though the exact formulation depends on the input PSD P_m (ω), the residual error variance will generally decrease as the gain is increased.

The noise has a different path through the system. We reasonably assume that the two noise signals are temporally white, of equal power, and uncorrelated, such that

P_{nx} (ω) = P_{ny} (ω) = \frac{σ_{n}^{2}}{2} .

The residual error ε[t] due to WFS noise has temporal PSD

P_{ε} (ω) = \frac{σ_{n}^{2}}{2} \frac{g^{2} ({∣ g_{rx} g_{d} ∣}^{2} + {∣ g_{ry} g_{d} ∣}^{2})}{1 + c^{2} + g^{2} - 2 c (1 + g) \cos (ω) + 2 gcos (2 ω)} .

The integral for the error variance can be solved in closed form as

σ_{ε}^{2} = \frac{σ_{n}^{2}}{2} g^{2} ({∣ g_{rx} g_{d} ∣}^{2} + {∣ g_{ry} g_{d} ∣}^{2}) \frac{g^{2} (1 + g)}{(1 - g) {(1 + g)}^{2} - c^{2} (1 - g)} .

As the control system gain increases, the error variance due to noise will monotonically increase. For very high gains (g > 0.85 in this case), the error variance will become very large due to the large overshoot in the transfer function.

In previous work we have verified that the gain optimization does indeed reduce residual error power and that the Fourier modes can be independently controlled, despite the circular aperture. Here we will asses the impact that gain optimization can have in a high-contrast imaging system. How important is gain optimization, and how big a performance improvement can we expect from it?

3.1. AO simulation parameters

To assess performance improvement, the end-to-end simulation is used with a reasonable atmospheric model. The simplified Cerro Pachon model, condensed into two layers, is used with better than median conditions. Layer 1 has r ₀ of 24.6 cm and moves in frozen flow at speed 2.3 m/s along and angle -124 degrees from the x-axis. Layer 2 is weaker, with r ₀ of 34.1 cm, but with a higher wind speed of 16.5 m/s at an angle of -95 degrees. This gives a total r ₀ of 18.7 cm. The dominant wind direction is clearly visible in the AO-corrected PSFs.

To compare performance, we ran the AO simulation on the same input atmosphere in two different configurations. The first used a modal gain filter that was 0.3 for all modes. The second method used an optimized modal gain filter for the system rate and natural guide star (NGS) magnitude. The optimal gains filters were determined by running the system in optimizing mode for at least 10 gain update iterations. The gains were averaged over those iterations to produce a mean filter. This filter was then used during simulation operation on atmospheres of the same parameters, but of different realizations.

Performance of the system is tracked with telemetry of residual error power and short- and long-exposure apodized PSFs. This was done for operation at 1 kHz and 2 kHz, and for a natural guide star of magnitude I from 4 to 9.

Fig. 2. Optimal gain filters for a range of NGS magnitudes and two frame rates. The filters are centered so that the lowest frequency is in the middle. The optimal gain varies with modal spatial frequency, spanning in most cases a range of 0.4 gain units.

Download Full Size | PDF

3.2. Results: Necessary range of gains

In previous work [3] we have shown that the gain-optimization process can compensate for unknown DM characteristics, and that in our simulation, this necessitated a wide range of gains. Now that we are using a customized FTR filter, no such DM compensation is necessary. Instead, the range of optimal gains across spatial frequency space represents the true variation in atmospheric power, temporal behavior, and WFS power. And this range, it turns out, is still quite substantial. Figure 2 shows the optimal gain filters for all of the cases, all on a 0 to 1 color scale. The filters are shown with spatial frequency [k,l] = [0,0] in the middle. All the filters exhibit the same fundamental structure: a butterfly pattern oriented with the wings just slightly off the y-axis. This is due to the direction of the wind in the atmospheric model. In this butterfly region the atmospheric power is much higher, leading to higher optimal gains.

In all except the lowest-SNR cases, the optimal gains span a range of around 0.4 gain units. The optimal gains do not in general go above 0.8, because the overshoot in the transfer function becomes too large. This means that the optimal gains take up essentially half of the available range. This large range emphasizes the need to do gain optimization. If a uniform gain is used across all spatial frequencies, many, if not most, frequencies will have substantially the wrong gain.

Another interesting thing shown in Fig. 2 is the strong structural similarity in gains as the NGS magnitude changes. Changing the magnitude simply changes the level of WFS noise power. As long as the SNR is not too low (e.g., I = 9 here) the optimal gains are simply scaled as I changes.

3.3. Results: Improvement in PSF intensity by region

The optimal gain PSFs have equal or lower intensity at all regions in the controllable band out to λ/(2d), except the core, than the uniform-gain PSFs. The in-band error is the error in spatial frequencies out to frequency 1/(2d), set by Nyquist. We will examine this for both signal and noise error reduction through two animated figures.

The first, Fig. 3, shows the 1 s PSFs for the I = 5, 2 kHz case. Before gain optimization, the modal gains are uniformly 0.3. The apodized PSF is shown on log-scale on a color range from intensity of 1e-6 to 1e-2. (Intensity 1 is the peak of a diffraction limited PSF.) This is a moderate SNR case and the temporal error shows very clearly as a butterfly pattern. The speckles in this area are still well-defined, because only one second has passed. The dominant wind layer is moving at 2.3 m/s and the aperture is 8 m in diameter, so little averaging has occurred. Because error due to the atmosphere is dominant, the gain optimization increases the modal gains, up to a value of 0.6, in the spatial frequency range corresponding to this butterfly shape. These higher gains result in less temporal error. As the animated figure shows in a smooth dissolve between PSFs, the intensity of the atmospheric speckles is greatly reduced, though the overall structure remains the same. In the prior case, the median in-band rms error is 24.0 nm. The gain optimization reduces this to 18.6 nm. This reduction by 15.3 nm rms is concentrated in a specific region. In this butterfly pattern, the intensity of the PSF is cut in half.

Fig. 3. [Animated Figure: 1.3 MB] Gain optimization that reduces temporal errors in the PSF, I = 5, 2 kHz case.

Download Full Size | PDF

Fig. 4. [Animated Figure: 1.9 MB] Gain optimization that reduces noise errors in the PSF, I = 8, 2 kHz case.

Download Full Size | PDF

The second animation, Fig. 4, shows the 1 sec PSF for the I = 8, 2 kHz case, where WFS noise is the dominant error source. The PSFs are shown on the same color scale as in Fig. 3. With uniform 0.3 gains, the PSF’s controllable region is dominated by a smooth, symmetric halo of WFS noise. It is smooth because 2000 realizations of WFS (1 s at 2 kHz) have been averaged out. This default gain of 0.3 is too high. The gain optimization lowers all gains into the range 0.05 to 0.25. The PSF after optimization has a visible butterfly pattern from the temporal error. There is still some noise halo, but at a greatly reduced level. The median in-band rms error before gain optimization is 69.6 nm rms. After gain optimization it is 58.4, a reduction of 37.9 nm. There is a slight reduction in the butterfly area, but the biggest gain is in the orthogonal noise-dominated region along the x-axis. In this region the PSF intensity is reduced by up to 80%; the PSF intensity is just one-fifth of what it had been. Because the optimization minimizes the residual error power, it maximizes the Strehl. In the signal-dominated case, the already high Strehl improves from 96.1% to 96.3%. In the noise-dominated case, the Strehl ratio improves from 91.0% to 92.8%. These small increases in Strehl disguise the significant reduction in intensity in the dark hole, emphasizing that for high-contrast imaging, the Strehl ratio is too simple a metric.

3.4. Benefits of gain optimization

Real-time gain optimization is a very important part of any high-performance AO system. The real-time optimization framework is attractive because it frees the reconstruction from reliance on potentially inadequate a priori models. As shown above, in practice, gain optimization reduces PSF intensity at most locations in the controllable region, sometimes by a substantial amount. The exact amount will depend on two factors: the initial default gain and the atmospheric profile. In this specific case with a dominant wind direction, reductions of PSF intensity by 50% to 80% are reasonably possible. As mentioned above, this method of gain optimization method was originally developed [6] for AO with conventional reconstructors. Use of the Fourier modes is particularly desirable in high-contrast imaging, because the reconstruction algorithm is computationally efficient and because the Fourier modes allow control of specific locations in the PSF. Another modal basis (such as a Karhunen-Loeve basis) could be used, provided that the reconstruction could be implemented efficiently.

4. Image contrast and speckles

In this specific planet-imaging application, we are concerned with the ability to detect a planet (if it exists) in a ~ 1 hour exposure. This depends strongly on the light level in the controllable region of the PSF, but also on the smoothness of that halo in a region. As shown above, the gain optimizer results in PSFs with reduced intensity. But the speckles produced by atmospheric and noise errors have very different structures on short time scales. In order to properly evaluate system performance for planet detection, we need a theoretical framework adequate for determining the ability to detect a planet and for verifying the simulation results as reasonable.

To illustrate the performance evaluation problem, consider a short and a long exposure PSF from the simulation, as shown in Fig. 5. The first PSF is a 10 ms exposure; the other is a 5 s exposure. These two PSFs have nearly the same level of scattered light in the controllable region, though the short exposure is slightly lower due to random fluctuations. Despite the similar levels of intensity, the structure of the PSFs is very different. In the short exposure PSF there are many bright speckles and many dark regions. In the long exposure the overall structure is much smoother. In both images a simulated planet has been inserted at a peak intensity 2 × 10^-5 times less than the PSF peak value. It is located in the upper right region at 0.6” separation. The smoothness of the long exposure makes the target planet much more clearly detectable.

The image statistic that describes this detectability is contrast, a term for which there are several definitions in use. For a reasonable planet detection rule, we need an estimate of the variance of the background in which that planet might appear. In our case, this background is the controllable region of the PSF, excluding the PSF core. We will not use a general intensity-based metric, such as those that are appropriate for the space-based imaging cases where static speckles dominate the PSF [11].

The first way to estimate the background variation is to take a single long-exposure PSF (generated by the simulation) and to average spatially over the speckles in a sensible fashion. This spatial-averaging method has the advantage that it requires only a single PSF; it is particularly relevant for a detection strategy that involves the analysis of a single science image. However, it does assume that spatial variations in a single PSF will be comparable with temporal variations at a specific location; in other words, it assumes that the process is ergodic.

Fig. 5. Short and long exposure PSFs from the end-to-end simulation. Both have the same standard deviation of intensity in the controllable region (dark hole) but a planet is more easily detected in the long exposure image.

Download Full Size | PDF

Another way to estimate image contrast is to do temporal averaging. For this, a large set of sample images of the desired exposure time is used to evaluate the sample variance for a specific location of interest. This set is drawn over random atmospheres and will give an estimate of the temporal variations at a specific location. Temporal averaging is most appropriate when multiple images will be used for detection decisions. However, this technique requires much more computation and as such may be prohibitively expensive.

A third technique would be to use an analytic or semi-analytic approach, which would obviate the need for long end-to-end simulation runs. Spatial-frequency domain modeling is a type of analytic approach. Versions of this include the linear systems spatial-frequency method of Ellerbroek [12] and the spatial power spectrum approach of Jolissaint et al. [13]. Another analytic approach looks directly at the statistical behavior of the PSF intensity. Examples include the theoretical work of Aime and Soummer [14] and the experimental work of Fitzgerald and Graham [15]. Neither of these approaches has entirely what we want; the spatial-frequency domain techniques are not designed to analyze the PSF intensity temporal behavior, and the statistical PSF methods do not incorporate detailed knowledge of an AO system.

Aspects of three methods (spatial-averaging, temporal-averaging, and the analytic approach) are used in concert to analyze the temporal behavior of the PSF intensity. The residual error output of the AO control system consists of atmospheric errors and WFS noise errors. These two types behave very differently as a function of time, and this difference should be taken into account when evaluating final system performance.

4.1. AO simulation parameters

To study the temporal behavior of the PSF intensity, the simulation is run in a special short-exposure mode. Like the long-exposure simulations previously described, the simulation generates the instantaneous PSF at every time step (2 kHz). For speckle studies, these are accumulated into 10 ms exposures which are saved, producing a 100 Hz sampling of the PSF evolution. For this study 12 different realizations of the above-median atmosphere were run, each for 5.12 seconds. This was done for the I = 6, 2 kHz case and also for the same case but with no WFS noise, so that the atmospheric behavior could be examined directly. One 5.12 second exposure with no atmosphere and only WFS noise was also run.

4.2. Image behavior as exposure time increases

We begin with a basic statistical model for how the PSF intensity behaves in short- and long-exposure images. The PSF intensity at any location is modeled as a random process with a specific mean and autocovariance function. This is a discrete-time model because the simulation is discrete-time, but the concepts here can easily be converted to a continuous-time model.

The PSF intensity at a specific point in time is represented by a random process i[t]. The exact mean and autocovariance of this time series will be discussed later, but it is worth noting now that this random process is not zero-mean. A long exposure image will be the average of many of these consecutive values to preserve normalization of intensity. This long-exposure value for T consecutive frames will be

p_{N} [t] = \frac{1}{T} \sum_{l = 0}^{T - 1} i [t - l] .

Since this is a linear filter, if i[t] is a WSS random process, the same techniques as used in Section 3 can be applied to determine the PSD and variance of p_N [t]. Given a temporal PSD for the instantaneous PSF intensity P_i (ω), the temporal PSD for a long-exposure is

P_{p, T} (ω) = P_{i} (ω) \frac{1}{T^{2}} {[\frac{\sin (\frac{ωT}{2})}{\sin (\frac{ω}{2})}]}^{2} .

Our quantity of interest, the variance of the long-exposure intensity, is then

σ_{p, T}^{2} = {\frac{1}{2 π} \frac{1}{T^{2}} \int_{- π}^{π} P_{i} (ω) {[\frac{\sin (\frac{ωT}{2})}{\sin (\frac{ω}{2})}]}^{2}} - m_{i}^{2},

where m_i is the mean of i[t].

If the intensity is temporally white, P_i (ω) is constant and this integral is easily determined to be 1/T, by Parseval’s theorem. This gives us the averaging out with exposure time that we would expect. If P_i (ω) is not white, we need to consider how the integral behaves as the exposure time increases. When T = 2, the sinc-squared function is very broad. As T increases, the lowpass filter rapidly narrows and suppresses the high temporal frequencies. Once P_i (ω) is relatively uniform over the width of the lowpass filter, the integral behaves exactly as in the white noise case: the variance drops off as 1/T. The level of the variance is set by the power at the lowest temporal frequencies of P_i (ω). We can identify the point at which the intensity variance drops off to one-half of the instantaneous exposure level. This is our analytic basis for the speckle lifetime, a concept that was introduced by Racine et al. [16].

This model applies to any random process that produces intensity fluctuations. We will now focus on the case that only considers PSF intensity variations due to residual phase errors. We will ignore scintillation for the purposes of our analysis and simulation. This is reasonable, because we wish to directly study how the AO control system (which controls phase, not amplitude) affects residual error, and hence the PSF. The exact formulation for the intensity as a function of residual phase error is given in the next section. First, we will illustrate the concept of speckle lifetime with simulated data.

This will be done using the temporal-averaging contrast method. Using the 100 Hz sampling of the PSF described above, the intensity of the PSF at a specific location is tracked through the long exposure. Figure 6(a) shows the PSF intensity for the location 0.45” above the PSF core. The Atmosphere line is for the no-WFS noise simulation, and the WFS Noise line is for the no-atmosphere simulation. The intensity caused by the atmospheric residual error has large and much slower fluctuations than the WFS noise. We characterize the temporal behavior of the PSF intensity by using estimated temporal PSDs for each pixel in the image. These are calculated with the averaged periodogram method[18] over the entire 5.12 second interval. This produces a sampled temporal PSD with a frequency spacing of 0.195 Hz and a maximum frequency of 50 Hz. Figure 6(b) shows the estimated temporal PSD for the same location. Based on the two separate simulations, the temporal behavior for the intensity due to only atmosphere and only WFS noise are shown. The two temporal PSDs have substantially different structures.

Fig. 6. (a) (left) PSF intensity at a location 0.45” above the PSF core, sampled at 100 Hz. Atmosphere is for simulation with no WFS noise. WFS noise line for simulation with no atmospheric error. (b) (right) the estimated temporal PSDs for PSF intensity at the same location as shown in (a).

Download Full Size | PDF

In particular, the atmospheric phase error has two major features. The spike at 20 Hz is due to interference of the phase components in the two layers in the model. This interference is a function of the velocity and angle at which the two screens move, and it has a temporal frequency of ∣f_k (v _x1 - v _x2) + f_l (v _y1 - v _y2)∣. This temporal frequency is a function of the spatial frequency of interest (given by f_k , f_l in the pupil plane) and the x- and y-components of the velocity vectors of the two layers (v _x1,v _y1) and (v _x2,v _y2). The low temporal-frequency component is set by the changing content of the phase screen as it moves across the pupil. This in turn is controlled by the Kolmogorov profile. In this case of PSF variation due to phase errors, the atmosphere-induced speckle lifetime is implicitly independent of r ₀, unlike Racine’s calculation. In our regime, r ₀ is simply a scaling factor on the power of the atmospheric phase aberration and does not affect the structure of the temporal PSDs. As shown in previous work [17] for the single-layer case, the speckle lifetime is approximately 0.6D/v, where v is the wind speed.

Using these temporal PSD estimates, the pixel variance as a function of exposure time can be computed using Eq. (13). The results of this are in Fig. 7(a). The atmospheric component has initially flat variance, since in a short exposure an atmospheric-induced speckle stays relatively constant. As the phase screen blows by the aperture, the phase content changes. Averaging begins to occur and for exposures past 0.25 seconds, the pixel variance follows the 1/T behavior. The WFS noise-induced intensity follows the 1/T behavior for the range of exposure times.

We can confirm this analysis using the spatial-averaging contrast method. The contrast calculation is done on a single PSF. First the smooth low-order components (such as the butterfly’s overall shape) are removed from the PSF with a median filter. Then the sample standard deviation is calculated for specific regions. In this case, the regions are segments inside the butterfly region of concentric rings around the core. Input to this radial contrast metric is a series of images of increasing exposure times. For a given 5.12 second simulation run, cumulative exposures of power-of-two lengths 0.01, 0.02, 0.04, up to 5.12 seconds were stored. Then the intensity variance at a position 0.45” above the core was obtained. For the WFS noise-only case, a single set of images was used. For the atmosphere-only case, sets of images from all 12 trials were used, then the average intensity variance was calculated. Figure 7(b) shows these data points, along with 1/T trend-lines for comparison. This exhibits the same trends as shown as in Fig. 6(b), though the data points were produced with a completely different method.

Fig. 7. (a) (left) estimated intensity variances as a function of exposure time, calculated with Eq. 13 and the data shown in Fig. 6b. The WFS noise has the 1/T drop-off, but the atmospheric variance is flat before dropping off after 0.25 seconds. (b) (right) Intensity variances drawn from a spatially-averaged contrast metric at 0.45” distance for cumulative exposures of increasing length. Atmosphere data points are over 12 difference atmospheres, with error bars. These results, determined with a completely different method, confirm the results shown (a).

Download Full Size | PDF

Given the outputs of the simulation, the speckles due to atmospheric errors behave very differently than the speckles due to WFS noise. To get a reasonable estimate of the contrast given residual atmospheric errors, a sufficiently long exposure needs to be examined so that the low temporal-frequency terms have begun to average out.

4.3. Intensity as a function of residual modes

The approach outlined above gives a way to evaluate the contrast of a PSF produced by the AO simulation. It is necessary, however, to confirm the the simulation itself is producing correct outputs. To do so, we will use a further analytic method based on the PSF expansion [4]. The Fourier modal coefficients are particularly easy to use in this formalism.

The PSF expansion is a mathematical way to represent the PSF, generated from a complex field of arbitrary amplitude and phase, in terms of the Fourier transforms of that amplitude and phase. It allows for improved understanding of the high-contrast imaging scenario as well as enabling mathematical analysis in a tractable form. Given the complex field a(x,y)exp[jϕ(x,y)], where both a and ϕ are functions of two spatial variables in the pupil plane, the PSF in the image plane is

P (X, Y) = {∣ A (X, Y) + jA (X, Y) * Φ (X, Y) - 0.5 A (X, Y) * Φ (X, Y) * Φ (X, Y) + \dots ∣}^{2} .

First, since the amplitude is real and symmetric, complex conjugation on A(X,Y) is not necessary. Second, the spatial variables will be dropped unless necessary for clarity. This makes the PSF

P \approx {∣ A ∣}^{2} + 2 Im {A [A * Φ^{*}]} + {∣ A * Φ ∣}^{2} - Re {A [A * Φ^{*} * Φ^{*}]}

For now we will consider the simplified case where a single spatial-frequency phase error (Fourier mode) scatters light. This phase error is a single modal coefficient that is controlled by the AO system. This complex-valued random process ε[t] represents the residual error power. The variance and PSD of the residual error are dependent on the signal and noise inputs and the behavior of the control system, as detailed in Section 3. The modal coefficient ε[t] produces the real-valued phase aberration

ϕ (x, y, t) = ε_{\cos} [t] \cos (\frac{2 π}{D} [kx + ly]) + ε_{\sin} [t] \sin (\frac{2 π}{D} [kx + ly]),

where ε _cos[t] is just the real part of ε[t] and ε _sin[t] is the imaginary part of ε[t]. The Fourier transform of this phase is

Φ (X, Y, t) = \frac{1}{2} {ε [t] δ (X - \frac{λk}{D}, Y - \frac{λl}{D}) * ε^{*} [t] δ (X + \frac{λk}{D}, Y + \frac{λl}{D})} .

Incorporating this into the PSF expansion [Eq. (15)] produces an expression for the intensity as a function of the complex modal coefficient. There are several terms that are of very small value, which are dropped in the final expression. Dropping the λ/D terms from the variables for clarity, the intensity at the prime scattering location [k,l] is simplified to the diffraction term plus the pinned speckle and power-spectrum term

i_{k, t} [t] \approx A^{2} (k, l) - A (k, l) ε_{\sin} [t] + \frac{{∣ ε [t] ∣}^{2}}{4} .

This is of course a simplification. We have ignored other terms, such as the higher-order scattering of light from outside the controllable region that we have previously identified [2] as having the potential to contribute non-negligible intensity. However, for the purpose of illuminating significant temporal behavior of the PSF intensity, this approximation is adequate.

Given this formulation of the pixel intensity as a function of the random residual phase error modes, the statistical behavior of the intensity can be analyzed. Since the image-formation process is not a linear process, a transfer function cannot be used. Direct calculations, however, will obtain the mean, autocorrelation, and autocovariance functions and hence the PSD of the pixels. This will be done assuming that the random processes are zero-mean and Gaussian, which allows evaluation of the higher moments. At this point the notation is clearer when the statistics are given in terms of the real and imaginary parts of ε, namely ε _cos and ε _sin. The variance of the error is the sum of two parts

σ_{ε}^{2} = σ_{ε, \cos}^{2} + σ_{ε, \sin}^{2} .

Furthermore, the correlation between the cosine and sine parts is given by E[cs]. This correlation is nonzero in the case of the atmosphere-induced residual error. The mean intensity is just diffraction plus the power level of the residual error

E [i] = A^{2} (k, l) + \frac{σ_{ε}^{2}}{4} .

The variance of the instantaneous exposure is

σ_{i}^{2} = \frac{1}{8} {σ_{ε, \cos}^{4} + σ_{ε, \sin}^{4} + 2 {(E [cs])}^{2}} + A^{2} (k, l) σ_{ε, \sin}^{2} .

These expressions are useful in evaluating how the error terms due to the atmosphere and the WFS noise (which are uncorrelated sources) combine in the PSF. Consider two uncorrelated errors, ε ₁ from signal and ε ₂ from noise. The expected intensity is simply the sum of the individual terms

E [i] = A^{2} (k, l) + \frac{1}{4} (σ_{ε 1}^{2} + σ_{ε 2}^{2}) .

This is what we would expect, as uncorrelated noise sources should add in the expected PSF intensity. Analyzed individually, the PSF intensity variance from each term is simply Eq. 21 rewritten:

σ_{i 1}^{2} = \frac{1}{8} {σ_{ε 1, \cos}^{4} + σ_{ε 1, \sin}^{4} + 2 {(E [c_{1} s_{1}])}^{2}} + A^{2} (k, l) σ_{ε 1, \sin}^{2},

σ_{i 2}^{2} = \frac{1}{8} {σ_{ε 2, \cos}^{4} + σ_{ε 2, \sin}^{4} + 2 {(E [c_{2} s_{2}])}^{2}} + A^{2} (k, l) σ_{ε 2, \sin}^{2} .

However, when those two error random processes are added together as ε ₁ +ε ₂, the resulting intensity variance is not simply the sum $σ_{i 1}^{2}$ + $σ_{i 2}^{2}$ , but rather

σ_{i}^{2} = σ_{i 1}^{2} + σ_{i 2}^{2} + {\frac{1}{4} (σ_{ε 1, \sin}^{2} + σ_{ε 2,cos}^{2} + σ_{ε 1, \sin}^{2} σ_{ε 2, \sin}^{2}) + \frac{1}{2} E [c_{1} s_{1}] E [c_{2} s_{2}]} .

The significance of this extra term depends on the relative power levels in residual signal and noise error random processes. When the power levels are close, this extra variance term becomes significant. If the signal and noise residuals have equal power, the PSF intensity variance can be twice the simple sum of the individual variances. This extra-variance phenomenon is visible in the simulation outputs in the temporal PSDs when power levels are similar.

When considering two error sources in the AO system (i.e., atmospheric signal and WFS noise), the expected PSF intensity is well-approximated by the diffraction pattern plus the power level of that specific spatial frequency. However, the variance of the PSF intensity (and hence the contrast of the PSF and the ability to detect planets) is not necessarily the sum of the individual variances. This only occurs in specialized situations, but since any extra variance will degrade contrast, this term should be analyzed. The end-to-end simulation run with both inputs automatically takes into consideration this extra-variance, but analytic methods, which sum over independent error sources, may not.

5. Gain optimizer and contrast

The above formulation for PSF intensity variance as a function of the atmospheric and WFS noise phase error random processes and the AO control system provides a framework for evaluating the performance of OFC in regard to contrast. Such an analytic treatment is beyond the scope of this study, but we will provide a direct comparison of the simulation results for long-exposure images. The test case is I = 6, 2 kHz operation. This case has a mix of gains being raised (up to 0.5 in the butterfly) and lowered (down to 0.15 orthogonal to the butterfly). A 2 s exposure PSF was accumulated from the 100 Hz short exposure series to generate an estimate of the PSF intensity. The same 100 Hz short exposure series was used as described above to generate an estimate of the image contrast in the 2 s exposure, using the temporal-averaging contrast method.

As discussed in Section 3, the PSF intensity is lowered in the controlled region everywhere that the optimized gain is not the default gain of 0.3. However, for the contrast this is not the case. Everywhere the gain was increased by the optimizer, the resulting contrast is improved. Everywhere the gain was decreased, however, the resulting contrast is worse. In the region where the optimal gains were lowered, the atmospheric residual error power is now higher. Because the speckle decorrelation time is non-zero, the atmospheric speckles dominate the PSF contrast. This is illustrated in Fig. 8, with slices through the PSF contrast given by the intensity standard deviation. In the butterfly region the contrast has improved by a factor of 2, but in the opposite region contrast has been degraded by a factor of 1.5. The reason why contrast is worse in part of the PSF is because the gain optimization metric minimizes total residual error power, not contrast. In this region, though the WFS noise error term dominates, the atmospheric component still dominates the contrast. For high-contrast imaging, a contrast-optimizing controller is needed. We are currently investigating adaptive OFC to optimize image contrast, using the formulation of Section 4.3 to develop the gain optimization metric.

Fig. 8. Contrast of 2 s PSF, before and after gain optimization. (a) (left) Slice along y-axis through butterfly region. Gain optimization has improved contrast by a factor of 2. (b) (right). Slice along x-axis through region of little atmospheric error. Gain optimization has decreased contrast by a factor of 1.5.

Download Full Size | PDF

6. Conclusion

High-contrast imaging with AO for planet detection requires a sophisticated AO control system for the best possible performance. OFC, a computationally efficient method of real-time modal optimization, provides significant performance improvements. The characteristics of the atmosphere, particularly a dominant wind direction, necessitate a large range of gains for each Fourier mode. This means that the use of OFC results in reduced residual error power and reduced PSF intensity at most locations in the controllable region of the PSF.

PSF intensity, however, is not the final measure of system performance for planet detection. Contrast, defined here as the variation in intensity in the PSF for a given exposure time, is what we want. PSF speckles caused by atmospheric phase errors behave very differently in a temporal fashion from speckles caused by WFS noise phase errors. While the WFS noise errors lead to a 1/T drop-off in intensity variance as a function of exposure time for even short exposures, the atmospheric speckles have a much longer decorrelation time. These speckles can dominate the contrast, even when error power is low, as shown in Section 5. As a consequence, the error-minimizing metric of OFC does not optimize contrast.

Acknowledgments

The authors thank the reviewers for their useful comments, particularly regarding the clarity of the manuscript. We also thank Julia W. Evans for her helpful suggestions regarding the organization of this paper. This work was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory under contract W-7405-Eng-48. The document number is UCRL-JRNL-219806. This work has been supported by the National Science Foundation Science and Technology Center for Adaptive Optics, managed by the University of California at Santa Cruz under cooperative agreement AST - 9876783.

References and links

1. J. Schneider, The Extrasolar Planets Encyclopaedia, Tech. rep. (CNRS-LUTH, Paris Observatory, 2006). http://vo.obspm.fr/exoplanetes/encyclo/catalog.php.

2. L. A. Poyneer and B. Macintosh, “Spatially filtered wave-front sensor for high-order adaptive optics,” J. Opt. Soc. Am. A 21, 810–819 (2004). [CrossRef]

3. L. A. Poyneer and J.-P. Véran, “Optimal modal Fourier transform wave-front control,” J. Opt. Soc. Am. A 22, 1515–1526 (2005). [CrossRef]

4. M. D. Perrin, A. Sivaramakrishnan, R. B. Makidon, B. R. Oppenheimer, and J. R. Graham, “The structure of high Strehl ratio point-spread functions,” Astrophys. J. 596, 702–712 (2003). [CrossRef]

5. L. A. Poyneer, D. T. Gavel, and J. M. Brase, “Fast wavefront reconstruction in large adaptive optics systems with use of the Fourier transform,” J. Opt. Soc. Am. A 19, 2100–2111 (2002). [CrossRef]

6. E. Gendron and P. Léna, “Astronomical adaptive optics, 1: Modal control optimization,” Astron. Astrophys. 291, 337–347 (1994).

7. P.-Y. Madec, “Control Techniques,” in Adaptive Optics in Astronomy, F. Roddier, ed. (Cambridge University Press, 1999) pp. 131–154.

8. A. Tokovinin, “Modeling turbulence profile for GLAO,” Tech. rep. (Gemini Observatory, 2004).

9. E. M. Johansson and D. T. Gavel, “Simulation of stellar speckle imaging,” in Amplitude and Intensity Spatial Interferometry II, J. B. Breckinridge, ed., Proc. SPIE 1237 (1994) pp. 372–383. [CrossRef]

10. S. B. Howell, Handbook of CCD Astronomy (Cambridge University Press, 2000).

11. J. J. Green and S. B. Shaklan, “Optimizing coronagraph designs to minimize their contrast sensitivity to low-order optical aberrations,” in Techniques and Instrumentation for Detection of Exoplanets, D. R. Coulter, ed., Proc. SPIE 5170 (2003) pp. 25–37. [CrossRef]

12. B. L. Ellerbroek, “Linear systems modeling of adaptive optics in the spatial-frequency domain,” J. Opt. Soc. Am. A 22, 310–322 (2005). [CrossRef]

13. L. Jolissaint, J.-P. Véran, and R. Conan, “Analytical modeling of adaptive optics: foundations of the phase spatial power spectrum approach,” J. Opt. Soc. Am. A 23, 382–394 (2006). [CrossRef]

14. C. Aime and R. Soummer, “The usefulness and limits of coronagraphy in the presence of pinned speckles,” Astrophys. J. 612, L85–L88 (2004). [CrossRef]

15. M. P. Fitzgerald and J. R. Graham, “Speckle statistics in adaptively corrected images,” Astrophys. J. 637, 541–547 (2006). [CrossRef]

16. R. Racine, G. A. H. Walker, D. Nadeau, R. Doyon, and C. Marois, “Speckle noise and the detection of faint companions,” Publ. Astron. Soc. Pac. 111, 587–594 (1999). [CrossRef]

17. B. Macintosh, L. A. Poyneer, A. Sivaramakrishnan, and C. Marois, “Speckle lifetimes in high-contrast adaptive optics,” in Astronomical Adaptive Optics Systems and Applications II, R. K. Tyson and M. Llyod-Hart, eds., Proc. SPIE 5903 (2005) p. 59030J. [CrossRef]

18. A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-time Signal Processing (Prentice Hall, 1999).

Optimal Fourier control performance and speckle behavior in high-contrast imaging with adaptive optics

Abstract

1. Introduction

2. System specifications

2.1. Wavefront control

2.2. Simulation

3. Performance of the gain optimizer

3.1. AO simulation parameters

3.2. Results: Necessary range of gains

3.3. Results: Improvement in PSF intensity by region

3.4. Benefits of gain optimization

4. Image contrast and speckles

4.1. AO simulation parameters

4.2. Image behavior as exposure time increases

4.3. Intensity as a function of residual modes

5. Gain optimizer and contrast

6. Conclusion

Acknowledgments

References and links

Supplementary Material (2)

Cited By

Figures (8)

Equations (26)

Optics Express