An Ultra Fast Kolmogorov Phase Screen Generator Suitable For Parallel Implementation

Open Access

Abstract

Modelling phase fluctuations due to Kolmogorov turbulence is important in many areas of applied optics, such as simulating adaptive optics configurations, predicting the performance of laser designators and simulating infrared (IR) scenes in the presence of atmospheric turbulence. The computational performance of algorithms implementing this model is an important issue because many situations require a large number of phase screens. For example, in IR scene simulation a different phase screen is required for each pixel in the scene, and in other situations many thousands of phase screens must be calculated to obtain a statistical average. Whilst there have been previous attempts to increase the computational speed of these algorithms, the computation time required for a large number of phase screens remains an issue. In this paper, we apply the linearity of convolution and statistical properties of sums of Gaussian random numbers to improve the performance of the previous best published algorithm by a factor of 60 when implemented in software on a sequential processor. Because the new algorithm is trivially parallelizable, a further factor-of-20 speedup can easily be achieved through a parallel software or hardware implementation.

©2007 Optical Society of America

1. Background

A phase screen generator has been defined as a

“computer program which creates random arrays of phase values on a grid of sample points which have the same statistics as the [wavefront phases distorted due to atmospheric] turbulence …” [1]

Factorization methods of phase screen generation ([2], p. 1428 and [1], p. 105) have computational complexities that increase exponentially with the size of the screen. For this reason more computationally efficient algorithms have been developed. A popular method is that of Harding et al. [3]. It begins with the factorization method to generate a coarse phase screen and then improves the resolution of that screen by interpolating between the points of the base screen. The interpolation step has only linear complexity in the size of the final screen. However, even the method of Harding et al. is computationally expensive. On a Pentium 4 (2.8 GHz) dual core processor running Matlab version 7 (Release 14), this method takes 20 milliseconds to generate a 200×200 phase screen, with the interpolation step accounting for 90% of the execution time. Some applications require an enormous number of phase screens. For example, in the simulation of IR scenes [4] a phase screen is required for each pixel in the simulated image, so for a modest image of 100,000 pixels it would take 2,000 seconds to generate all the phase screens required. There is therefore strong motivation to improve the computational performance of phase screen generation.

In this paper we describe a transformation of the interpolation step in Harding’s method that reduces the software execution time by a factor of 60 and opens the way for a further 20 times acceleration with the use of parallel hardware. The paper is organized as follows. In the next section we briefly review the factorization method for generating a coarse phase screen and the interpolation step which is the key to the speedup of Harding’s algorithm. In section 3 we describe the transformations that we have applied to simplify the computation of the interpolation step. In section 4 we compare the structure function generated using our new method with that obtained by Harding’s method and the ideal structure function.

2. Phase screen generation by factorization and interpolation

The factorization method begins with the theoretical structure function under the Kolmogorov model of turbulence. From this a covariance function is constructed ([5], p. 1773). The covariance function of the phase ψ(r) is sampled to produce an N × N array of values, which is then rearranged as a vector Φ of length N². The product C_S = ΦΦ^T, an N² × N² array, is treated as the covariance for a subsequent Karhunen-Loève decomposition. The matrix is factorized by Singular Value Decomposition

$$ C_S = U \Lambda U^T, $$

to yield a set of N² eigenvectors U = {U_1, U_2, …, U_{N²}} which serve as orthogonal basis functions. The corresponding eigenvalues λ_i are the variances associated with the U_i. The square roots of these eigenvalues are then multiplied by uncorrelated Gaussian random numbers z_i, with zero mean and unit variance, to produce the weights x_i = √λ_i z_i on the U_i. The phase screen is then

$$ \hat{\Phi}_l = U \tilde{x}, $$

with the property that

$$ E[\hat{\Phi}_l \hat{\Phi}_l^T] = E[U \tilde{x} \tilde{x}^T U^T] = U\, E[\tilde{x} \tilde{x}^T]\, U^T = U \Lambda U^T = C_S. $$
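To make the factorization step concrete, the following Python/NumPy sketch (our illustration, not the authors' code) generates one coarse N × N screen from a precomputed covariance matrix. Building the sampled Kolmogorov covariance matrix `cov` from the structure function is assumed and not shown here.

```python
import numpy as np

def factorization_phase_screen(cov, rng=None):
    """Generate one coarse N x N phase screen from an N^2 x N^2 covariance
    matrix C_S via eigen-decomposition (Karhunen-Loeve synthesis)."""
    rng = np.random.default_rng() if rng is None else rng
    n2 = cov.shape[0]
    n = int(round(np.sqrt(n2)))
    # Symmetric factorization C_S = U diag(lambda) U^T
    lam, U = np.linalg.eigh(cov)
    lam = np.clip(lam, 0.0, None)        # guard against tiny negative eigenvalues
    # Weights x_i = sqrt(lambda_i) * z_i with z_i ~ N(0, 1)
    z = rng.standard_normal(n2)
    x = np.sqrt(lam) * z
    # Phase screen vector Phi_hat = U x, reshaped back onto the N x N grid
    return (U @ x).reshape(n, n)
```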

The interpolation step of Harding’s algorithm starts by adding intermediate points to the coarse phase screen and setting them to zero. We denote this upsampled, zero-filled phase screen as Φ_h0. The correct statistical properties are restored to Φ_h0 in two stages. In the first stage, Φ_h0 is convolved with an interpolator matrix, and to the output of this convolution is added a Gaussian random number with zero mean and unit variance multiplied by a constant residual. These operations are described in Eq. 4:

$$ \Phi_{h1} = \Phi_{h0} \otimes I_{4x4} + \varepsilon_{res1} \times x_1, $$

where I_4x4 is the interpolator matrix, x_1 is a matrix of Gaussian random numbers with zero mean and unit variance, and ε_res1 is a residual matrix. The term ε_1 = ε_res1 × x_1 is the same as the term ε_i in Eq. 18 of Harding et al. [3], which they refer to as “a random displacement epsilon”. The constant part of this quantity is ε_res1 and the random part is x_1. The value of I_4x4 is given in Eq. 27 of [3]. Note also that, since all elements in Eq. 4 are matrices, we have dropped the bold nomenclature for matrices.

In the second stage the convolution and addition are repeated:

$$ \Phi_{h2} = \Phi_{h1} \otimes I_{4x4rotated} + \varepsilon_{res2} \times x_2, $$

where x_2 is a second matrix of Gaussian random numbers with zero mean and unit variance, and ε_res2 is another residual matrix. The residual matrices are constant; the derivations of the two residual matrices are given in sections 3 and 4 of [3] (p. 2163), respectively. I_4x4rotated, which is simply the I_4x4 matrix rotated by 45°, is:

$$ I_{4x4rotated} = \begin{bmatrix}
0 & 0 & 0 & 0.0017 & 0 & 0 & 0 \\
0 & 0 & 0.0341 & 0 & 0.0341 & 0 & 0 \\
0 & 0.0341 & 0 & 0.3198 & 0 & 0.0341 & 0 \\
0.0017 & 0 & 0.3198 & 1 & 0.3198 & 0 & 0.0017 \\
0 & 0.0341 & 0 & 0.3198 & 0 & 0.0341 & 0 \\
0 & 0 & 0.0341 & 0 & 0.0341 & 0 & 0 \\
0 & 0 & 0 & 0.0017 & 0 & 0 & 0
\end{bmatrix}. $$
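For concreteness, here is a minimal Python sketch of the two-stage interpolation of Eqs. 4 and 5. It is our illustration rather than the authors' code: the zero-insertion upsampling geometry, the `mode='same'` boundary handling, and the treatment of the residual matrices as elementwise scalings are assumptions, and the kernel and residual values must be taken from Harding et al. [3].

```python
import numpy as np
from scipy.signal import convolve2d

def harding_interpolation(coarse, I_4x4, I_4x4_rot, eps_res1, eps_res2, rng=None):
    """One refinement pass of Harding's interpolation step (Eqs. 4 and 5)."""
    rng = np.random.default_rng() if rng is None else rng
    n = coarse.shape[0]
    # Upsample by inserting zeros between the coarse samples (Phi_h0)
    phi_h0 = np.zeros((2 * n - 1, 2 * n - 1))
    phi_h0[::2, ::2] = coarse
    # Stage 1 (Eq. 4): convolve with I_4x4 and add residual-scaled Gaussian noise
    x1 = rng.standard_normal(phi_h0.shape)
    phi_h1 = convolve2d(phi_h0, I_4x4, mode='same') + eps_res1 * x1
    # Stage 2 (Eq. 5): repeat with the rotated interpolator
    x2 = rng.standard_normal(phi_h1.shape)
    phi_h2 = convolve2d(phi_h1, I_4x4_rot, mode='same') + eps_res2 * x2
    return phi_h2
```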

3. Transformed algorithm

In this section we show how a well-known property of the convolution operator and properties of weighted sums of Gaussian random numbers can be applied to simplify the computation of the interpolation step of Harding’s algorithm.

We first combine Eq. 4 and Eq. 5:

$$ \Phi_{h2} = \left( \Phi_{h0} \otimes I_{4x4} + \varepsilon_{res1} \times x_1 \right) \otimes I_{4x4rotated} + \varepsilon_{res2} \times x_2 $$

We now note that convolution, being a linear operation, distributes over addition [6]. So Eq. 6 can be rearranged as

$$ \Phi_{h2} = \left( \Phi_{h0} \otimes I_{4x4} \right) \otimes I_{4x4rotated} + \varepsilon_{res1} \left( x_1 \otimes I_{4x4rotated} \right) + \varepsilon_{res2} \times x_2 $$

In Eq. 7, the first term on the right-hand side need only be computed once, at the same time as the original coarse phase screen Φ_h0 is created. The term x_1 ⊗ I_4x4rotated might appear to require a complete convolution over the whole refined phase screen, but this is not the case.

Recall that if y is a weighted sum of uncorrelated Gaussian random numbers g_i, each with mean μ_gi and variance σ_gi²,

$$ y = \sum_{i=1}^{n} \alpha_i g_i, $$

then the mean of y (see [7]) is given by

$$ \mu_y = \sum_{i=1}^{n} \alpha_i \mu_{g_i} $$

and the variance of y (see [7]) is given by

$$ \sigma_y^2 = \sum_{i=1}^{n} \alpha_i^2 \sigma_{g_i}^2. $$

The convolution x_1 ⊗ I_4x4rotated can be expanded and expressed as a weighted sum of uncorrelated Gaussian random numbers with zero mean and unit variance. To explain, we represent the convolution kernel I_4x4rotated as a sequence of coefficients α_i, where i ranges from 1 to 49, and examine the single output element x_1(4,4) ⊗ α of the overall convolution x_1 ⊗ I_4x4rotated:

$$ x_1(4,4) \otimes \alpha = x_1(1,1)\,\alpha_1 + x_1(1,2)\,\alpha_2 + x_1(1,3)\,\alpha_3 + \dots + x_1(7,7)\,\alpha_{49} $$

This weighted sum, as explained in Eq. 9 above, can be replaced with a single Gaussian random number with a mean given by $\sum_{i=1}^{n} \alpha_i \mu_i$ and a variance given by $\sum_{i=1}^{n} \alpha_i^2 \sigma_i^2$. Since the Gaussian random numbers in the weighted sum have zero mean, the new Gaussian random number also has zero mean. The Gaussian random numbers in the weighted sum have unit variance, so the variance of the new Gaussian random number is the sum of the squares of the convolution weights, i.e. $\sum_{i=1}^{49} I_{4x4rotated_i}^2$. Thus x_1 ⊗ I_4x4rotated may be replaced by a matrix x_3 of Gaussian random numbers with zero mean and a variance of $\sum_{i=1}^{49} I_{4x4rotated_i}^2$; a short numerical check of this variance argument is sketched after Eq. 11 below. Therefore Eq. 7 now reduces to

$$ \Phi_{h2} = \left( \Phi_{h0} \otimes I_{4x4} \right) \otimes I_{4x4rotated} + \varepsilon_{res1} \times x_3 + \varepsilon_{res2} \times x_2 $$
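As a quick numerical check of the variance argument above (our sketch, not part of the paper), the following Python snippet convolves unit-variance white noise with an arbitrary illustrative kernel standing in for I_4x4rotated, and compares the empirical variance of one interior output sample with the sum of the squared kernel weights.

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
# Arbitrary illustrative kernel (values are NOT those of I_4x4rotated).
kernel = np.array([[0.0, 0.1, 0.0],
                   [0.1, 1.0, 0.1],
                   [0.0, 0.1, 0.0]])

# Convolve many independent unit-variance white-noise fields and record the
# same interior output sample each time.
samples = [convolve2d(rng.standard_normal((9, 9)), kernel, mode='same')[4, 4]
           for _ in range(20000)]

print(np.var(samples))       # empirical variance of the convolved sample
print(np.sum(kernel ** 2))   # predicted variance: sum of squared kernel weights
```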

In Eq. 11, further performance gains can be achieved by combining the remaining two matrices of Gaussian random numbers, ε_res1 × x_3 and ε_res2 × x_2. The weighted sum of two Gaussian random numbers is another Gaussian random number with a mean equal to the weighted sum of the two means and a variance equal to the sum of the variances of the two weighted terms [7]. Thus the last two terms of Eq. 11 can be replaced by a single matrix x of Gaussian random numbers with zero mean and variance $(\varepsilon_{res1})^2 \sum_{i=1}^{49} I_{4x4rotated_i}^2 + (\varepsilon_{res2})^2$. The new equation for phase screen generation is

$$ \Phi_{h2} = \left( \Phi_{h0} \otimes I_{4x4} \right) \otimes I_{4x4rotated} + x. $$

Not only have we saved on convolution operations for every phase screen, but we now need to generate only one set of Gaussian random numbers for each point on the final phase screen, as opposed to two per point in the previous algorithm.
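A corresponding Python sketch of the transformed interpolation step (the final equation above) is given below, again as our illustration under the same assumptions as the earlier sketch; the residual matrices are treated as scalars or elementwise arrays.

```python
import numpy as np
from scipy.signal import convolve2d

def transformed_interpolation(coarse, I_4x4, I_4x4_rot, eps_res1, eps_res2, rng=None):
    """Transformed interpolation: one double convolution plus a single
    combined Gaussian noise field, replacing the two noisy stages."""
    rng = np.random.default_rng() if rng is None else rng
    n = coarse.shape[0]
    phi_h0 = np.zeros((2 * n - 1, 2 * n - 1))
    phi_h0[::2, ::2] = coarse                            # zero-filled upsampling
    # Deterministic part: (Phi_h0 * I_4x4) * I_4x4rotated
    smooth = convolve2d(convolve2d(phi_h0, I_4x4, mode='same'),
                        I_4x4_rot, mode='same')
    # Combined noise: zero mean, variance eps1^2 * sum(I_rot^2) + eps2^2
    var = eps_res1 ** 2 * np.sum(I_4x4_rot ** 2) + eps_res2 ** 2
    x = np.sqrt(var) * rng.standard_normal(smooth.shape)
    return smooth + x
```

Only one Gaussian field is drawn per output point, and the double convolution of the deterministic first term is the only convolution that remains.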

4. Validation

An established means of testing the accuracy of generated phase screens is to compare the structure function derived from the generated screens with the ideal structure function [3]. The phase structure function is given by

$$ D_{\Phi}(r) = 2\left[ B_{\Phi}(0) - B_{\Phi}(r) \right], $$

where

$$ B_{\Phi}(r) = \left\langle \Phi(x)\, \Phi(x - r) \right\rangle $$

is the covariance. The ideal structure function is

$$ D_{\Phi_{ideal}}(r) = 6.88\, r^{5/3}, \qquad r = \frac{r_{actual}}{r_0}, \quad \text{[3]} $$

where r_actual is the actual displacement between points in the phase screen and r_0 is the Fried parameter [2].
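As an illustration of how such a check might be coded (our sketch, not code from the paper), the following Python function estimates the structure function along one axis from an ensemble of generated screens; the ensemble array and the grid spacing in units of r_0 in the usage comments are hypothetical inputs.

```python
import numpy as np

def structure_function_1d(screens, max_sep):
    """Estimate D(r) = <[phi(x) - phi(x + r)]^2> along one axis,
    averaged over an ensemble of screens of shape (n_screens, N, N)."""
    d = np.zeros(max_sep)
    for r in range(1, max_sep + 1):
        diffs = screens[:, :, r:] - screens[:, :, :-r]   # separation of r pixels
        d[r - 1] = np.mean(diffs ** 2)
    return d

# Hypothetical usage:
# screens = ...                    # (n_screens, N, N) ensemble from the generator
# dx_over_r0 = 0.1                 # assumed pixel spacing in units of r0
# d_sim = structure_function_1d(screens, 50)
# r = dx_over_r0 * np.arange(1, 51)
# d_ideal = 6.88 * r ** (5.0 / 3.0)
```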

In Fig. 1 we show a slice of the ideal phase structure function compared to the average of an ensemble of 10,000 structure functions computed using our transformed version of Harding’s algorithm. The lower line is a slice of the ideal function, while the top line is a slice of the structure function generated using our transformed algorithm. Because our simulated structure function is very close to the ideal, we show in Fig. 1(b) the difference between them as a fraction of the ideal function. We do not attribute the error in our simulated results to the transformations we have made to Harding’s algorithm; rather, we attribute the errors reported in Fig. 1(b) to approximations introduced in Harding’s original algorithm. Although Harding et al. do not present their results in the same form as Fig. 1(b), our simulations indicate that there is no significant difference between our transformed algorithm and Harding’s original algorithm.

Fig. 1. Comparison of the structure function derived from the transformed algorithm with the ideal function. The x axis is the distance between points in the phase screen, non-dimensionalized by the Fried parameter. (a) shows the ideal structure function (lower trace) compared with the simulated structure function (upper trace). (b) shows the relative error between the ideal and simulated structure functions as a fraction of the ideal. Note that the errors reported in (b) are attributed to the original approximations made by Harding et al. and not to the transformations reported in this paper.

The assertion that our transformed algorithm produces substantially the same results as the original Harding algorithm can be verified by comparing the variance of the computed structure functions with that reported by Harding et al. In Fig. 2 we show the error between the ideal function and an ensemble of 10,000 generated structure functions, expressed in terms of the standard deviation of the ensemble of structure functions. Fig. 2 shows that there is no substantial difference in the normalized error between our transformed algorithm and Harding’s original algorithm.

The standard deviation of the ensemble of structure functions computed is given by

$$ \sigma(r) = \left\{ \frac{1}{N-1} \sum_{i} \left[ D_{\phi}^{(i)}(r) - \frac{1}{N} \sum_{i} D_{\phi}^{(i)}(r) \right]^2 \right\}^{1/2} $$

and the error expressed in standard deviations is

$$ \mathrm{Error\ (standard\ deviations)} = \frac{ D_{\phi_{ideal}}(r) - \frac{1}{N} \sum_{i} D_{\phi}^{(i)}(r) }{ \sigma(r) }. $$
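A short sketch (ours, with hypothetical inputs) of this normalized error, given a per-screen array `d_sim` of structure-function slices and the ideal curve `d_ideal`:

```python
import numpy as np

def error_in_std_devs(d_sim, d_ideal):
    """d_sim: (n_screens, n_separations) per-screen structure functions;
    d_ideal: (n_separations,) ideal curve.  Returns the error at each
    separation in units of the ensemble standard deviation."""
    mean_d = d_sim.mean(axis=0)
    sigma = d_sim.std(axis=0, ddof=1)     # ensemble standard deviation, 1/(N-1)
    return (d_ideal - mean_d) / sigma
```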
Fig. 2. Number of standard deviations from the ideal structure function. (Lower trace) Harding’s original algorithm; (upper trace) transformed algorithm.

5. Conclusion

In this paper we have reduced the software computation time of Harding’s algorithm by a factor of 60 through a transformation of its key computationally intensive steps. The remaining speed limitation (random number generation in software) can be removed through the use of a parallel hardware (FPGA-based) random number generator [8]. For IR scene generation, if each image frame requires 100,000 phase screens of 200×200 resolution, the computation time per scene is reduced from 2,000 seconds to 33 seconds. The addition of parallel hardware random number generation reduces this further to 1.6 seconds.

References and links

1. M. C. Roggemann and B. Welsh, Imaging Through Turbulence (CRC Press, Boca Raton, Fla., 1996).

2. D. L. Fried, “Statistics of a Geometric Representation of Wavefront Distortion,” J. Opt. Soc. Am. 55, 1427–1435 (1965).

3. C. M. Harding, R. A. Johnston, and R. G. Lane, “Fast Simulation of a Kolmogorov Phase Screen,” Appl. Opt. 38, 2161–2170 (1999).

4. V. Sriram and D. Kearney, “High Speed High Fidelity Infrared Scene Simulation using Reconfigurable Computing,” in Proc. IEEE International Conference on Field Programmable Logic and Applications (FPL), Spain, 2006 (IEEE Press).

5. E. P. Wallner, “Optimal wave-front correction using slope measurements,” J. Opt. Soc. Am. 73, 1771–1776 (1983).

6. W. Rugh, “Linear Time-Invariant Systems and Convolution: An Interactive Lecture,” http://www.jhu.edu/signals/lecture1/main.html#spot2.

7. E. Weisstein, “Normal Sum Distribution,” from MathWorld - A Wolfram Web Resource, http://mathworld.wolfram.com/NormalSumDistribution.html.

8. V. Sriram and D. Kearney, “A high throughput area time efficient uniform random number generator based on the TT800 algorithm,” in Proc. IEEE International Conference on Field Programmable Logic and Applications (FPL), Amsterdam, Netherlands, August 2007 (IEEE Press).
