Optica Publishing Group

Orthonormal vector polynomials in a unit circle, Part I: basis set derived from gradients of Zernike polynomials

Open Access

Abstract

Zernike polynomials provide a well known, orthogonal set of scalar functions over a circular domain, and are commonly used to represent wavefront phase or surface irregularity. A related set of orthogonal functions is given here which represent vector quantities, such as mapping distortion or wavefront gradient. These functions are generated from gradients of Zernike polynomials, made orthonormal using the Gram-Schmidt technique. This set provides a complete basis for representing vector fields that can be defined as a gradient of some scalar function. It is then efficient to transform from the coefficients of the vector functions to the scalar Zernike polynomials that represent the function whose gradient was fit. These new vector functions have immediate application for fitting data from a Shack-Hartmann wavefront sensor or for fitting mapping distortion for optical testing. A subsequent paper gives an additional set of vector functions consisting only of rotational terms with zero divergence. The two sets together provide a complete basis that can represent all vector distributions in a circular domain.

©2007 Optical Society of America

1. Introduction

Zernike polynomials [1–3] are commonly used in optical testing, engineering, and analysis, for two reasons. First, they are orthogonal over a unit circle, which is convenient because many optics are circular. Second, the lower-order Zernike polynomials represent typical optical wavefront aberrations such as power, astigmatism, coma, and spherical aberration. Besides direct wavefront measurements, wavefront slopes are often measured as well, e.g. with shearing interferometry [4], Shack-Hartmann sensors [5], or a scanning pentaprism test [6]. Various techniques have been developed to convert measured slope data to a wavefront map expressed in terms of Zernike polynomials. Gavrielides [7] developed a set of vector polynomials that are orthogonal to the gradients of Zernike polynomials but not mutually orthogonal. The coefficient of a specific Zernike polynomial representing the wavefront can then be calculated directly by integrating the dot product of the slope and the corresponding vector polynomial. Acosta et al. [8] took a different approach but arrived at similar results. This approach skips the intermediate step of fitting the vector slope data and obtains the wavefront directly. Yet it is desirable to fit measurement data in the measurement space, and for this a set of vector polynomials is needed to fit the vector slope data.

Vector polynomials are also used for quantifying mapping distortion, which is important for accurate measurement of optical surfaces [9] and can be severe due to the use of null optics. Typically, polynomial mapping functions are defined and the coefficients are fit to data using least squares techniques [10, 11].

Although the above problems can be solved using a least squares fit to vector functions that are not orthogonal over the domain, the results are not optimal. The fit to a non-orthogonal basis set can require many more terms than are necessary, and the coefficients themselves may not be meaningful, because the value of any particular coefficient will change as higher order terms are fit. When fitting to real data, the propagation of noise is increased by the use of non-orthogonal basis functions. If the functions are truly orthogonal, the least squares solution is not necessary: the coefficients can be determined by a much simpler and computationally more efficient inner product. Clearly, an orthonormal basis is desired.

In this paper, we present such a set of vector polynomials, which are orthonormal in a unit circle. These polynomials are well suited to fitting slope data. Since they are gradients of linear combinations of Zernike polynomials, it is also straightforward to convert the fitted slope map to the wavefront map expressed in terms of Zernike polynomials.

In Section 2, we present the Zernike notation that we adopted from Noll’s landmark paper [1] and list the gradients of the Zernikes following the recursion relationships presented there. We then use the Zernike gradients as a basis to obtain an orthonormal set of vector polynomials using the Gram-Schmidt method and present the result in Section 3. The mapping from the orthonormal vector polynomials to gradients of scalar functions represented by standard Zernike polynomials is discussed in Section 4.

The vector set is made complete with the addition of a complementary set of vector polynomials with non-zero curl, as presented in a subsequent paper [12]. The addition of this second set of functions provides a complete basis, capable of representing any vector distribution in the circular domain. Applications of the vector polynomials for fitting the slope data taken from Shack-Hartmann sensors or other slope measurement devices, and for correcting mapping distortions in null tests of aspheric surfaces, will be presented in subsequent papers as well.

2. Zernike polynomials and their gradients

There are different numbering schemes for Zernike polynomials. In this paper we adopt Noll’s notation and numbering scheme which defines the polynomials in polar coordinates as

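$$
\left.
\begin{aligned}
Z_{\text{even } j} &= \sqrt{n+1}\, R_n^m(r)\, \sqrt{2}\, \cos(m\theta) \\
Z_{\text{odd } j} &= \sqrt{n+1}\, R_n^m(r)\, \sqrt{2}\, \sin(m\theta)
\end{aligned}
\right\} \quad m \neq 0, \qquad
Z_j = \sqrt{n+1}\, R_n^0(r), \quad m = 0 \tag{1}
$$

where

$$
R_n^m(r) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s\, (n-s)!}{s!\, \left[(n+m)/2 - s\right]!\, \left[(n-m)/2 - s\right]!}\; r^{\,n-2s} \tag{2}
$$

j: the general index of Zernike polynomials

n: the power of the radial coordinate r

m: the multiplication factor of the angular coordinate θ

n and m satisfy the following relations: m ≤ n and (n − m) is even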
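Eqs. (1) and (2) translate almost line for line into code. The following is a minimal Python sketch, not the authors' implementation; the function names `radial` and `zernike` and the `kind` argument (which selects the cosine or sine branch when m ≠ 0) are illustrative choices.

```python
import math

def radial(n, m, r):
    """Radial polynomial R_n^m(r) of Eq. (2); requires 0 <= m <= n and n - m even."""
    if m < 0 or m > n or (n - m) % 2:
        raise ValueError("need 0 <= m <= n with n - m even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * math.factorial(n - s)
        den = (math.factorial(s)
               * math.factorial((n + m) // 2 - s)
               * math.factorial((n - m) // 2 - s))
        total += num / den * r ** (n - 2 * s)
    return total

def zernike(n, m, r, theta, kind="cos"):
    """Zernike value per Eq. (1); kind picks cos (even j) or sin (odd j) when m != 0."""
    norm = math.sqrt(n + 1)
    if m == 0:
        return norm * radial(n, 0, r)
    angular = math.cos(m * theta) if kind == "cos" else math.sin(m * theta)
    return norm * math.sqrt(2) * radial(n, m, r) * angular

# R_2^0(r) = 2 r^2 - 1 and R_3^1(r) = 3 r^3 - 2 r, evaluated at r = 0.5
print(radial(2, 0, 0.5), radial(3, 1, 0.5))
# Z2 = 2 r cos(theta) evaluated at the edge point (r, theta) = (1, 0)
print(zernike(1, 1, 1.0, 0.0, kind="cos"))
```

The factorial form is fine for the low orders considered here; for high orders a recurrence would be numerically safer.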

The general index j has no physical meaning, while the indices n and m do. For a given j there is a unique corresponding pair (n, m), and the parity of j determines the angular dependence of the polynomial. For a given pair (n, m), however, j is ambiguous when m≠0. In some of the relationships given below, n and m are known, but the corresponding j (and therefore the sine or cosine angular dependence of the polynomial) depends on other factors. For this reason, we write j(n, m) for the general index of a Zernike polynomial, to show that n and m are known and that the actual j is determined by other conditions. The first 37 polynomials of this numbering scheme are listed in the Appendix, where the relationship between j and (n, m) can also be seen.
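The relationship between j and (n, m) can be sketched in a few lines of Python. This is an illustrative helper, not code from the paper: the names `noll_to_nm`, `angular_kind` and `nm_to_noll` are hypothetical, and the logic relies only on the properties stated above (row n holds n + 1 polynomials, even j is cosine, odd j is sine).

```python
def noll_to_nm(j):
    """Map a Noll index j >= 1 to its unique (n, m) pair."""
    if j < 1:
        raise ValueError("Noll indices start at 1")
    n = 0
    while (n + 1) * (n + 2) // 2 < j:   # row n ends at index (n+1)(n+2)/2
        n += 1
    k = j - n * (n + 1) // 2            # 1-based position inside row n
    m = 2 * (k // 2) if n % 2 == 0 else 2 * ((k - 1) // 2) + 1
    return n, m

def angular_kind(j):
    """Angular dependence of Z_j: 'none' for m = 0, else 'cos' (even j) or 'sin' (odd j)."""
    _, m = noll_to_nm(j)
    if m == 0:
        return "none"
    return "cos" if j % 2 == 0 else "sin"

def nm_to_noll(n, m, kind="cos"):
    """Resolve the ambiguous j(n, m) once the angular dependence is known."""
    base = n * (n + 1) // 2
    if m == 0:
        return base + 1
    lo = base + m                       # the pair (n, m) occupies lo and lo + 1
    want_even = (kind == "cos")
    return lo if (lo % 2 == 0) == want_even else lo + 1

for j in (4, 7, 32):
    print(j, noll_to_nm(j), angular_kind(j))
```

The second argument pair of `nm_to_noll` makes the point of the j(n, m) notation concrete: n and m alone fix the index only up to the sine/cosine choice.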

As the first step toward establishing an orthonormal basis of vector polynomials, we derive the gradients of the Zernike polynomials. We take the gradient of each Zernike polynomial and apply the recursion relationships from Noll to represent the gradients as linear combinations of lower order Zernike polynomials. The first 37 gradient terms are presented in Table 1. These functions provide a complete basis to represent gradients, but they require further manipulation to create an orthonormal set.
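As a small worked illustration, derived here directly from the Cartesian forms of the low-order polynomials rather than copied from Table 1, take Z1 = 1, Z2 = 2x, Z3 = 2y, Z4 = √3(2x² + 2y² − 1), Z5 = 2√6 xy, Z6 = √6(x² − y²) and Z7 = √8(3x² + 3y² − 2)y. Differentiating Z4 and Z7 and regrouping the results in terms of lower order Zernike polynomials gives

$$
\nabla Z_4 = \hat{i}\, 4\sqrt{3}\, x + \hat{j}\, 4\sqrt{3}\, y = 2\sqrt{3}\left(\hat{i}\, Z_2 + \hat{j}\, Z_3\right),
$$

$$
\nabla Z_7 = \hat{i}\, 6\sqrt{8}\, xy + \hat{j}\, \sqrt{8}\left(3x^2 + 9y^2 - 2\right)
= \hat{i}\, 2\sqrt{3}\, Z_5 + \hat{j}\left(2\sqrt{2}\, Z_1 + 2\sqrt{6}\, Z_4 - 2\sqrt{3}\, Z_6\right).
$$

Each component is a combination of Zernike polynomials one radial order lower, which is the pattern the recursion relationships produce in general.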

Table 1. Gradient of Zernike polynomials

3. An orthonormal set of vector polynomials

We use linear combinations of the above terms to create an orthogonal set. We define the inner product of two vector polynomials defined in a unit circle as

$$
(\vec{A}, \vec{B}) = \frac{1}{\pi} \iint \left(\vec{A} \cdot \vec{B}\right) dx\, dy \tag{3}
$$

where the integration is over a unit circle.

The inner product is taken of the above gradient functions, and some results are shown in Table 2 (the table is symmetric about the diagonal, but only non-zero elements under the diagonal are shown). These Zernike gradient polynomials are not orthogonal, as the matrix of inner products listed in Table 2 is not diagonal.

Table 2. List of the inner products of the first 13 Zernike gradients
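The inner product of Eq. (3) can be approximated numerically to see the non-orthogonality directly. The sketch below is illustrative only: it hard-codes Cartesian gradients of Z2, Z3, Z4 and Z7 (derived from their x, y forms, not taken from Table 1) and uses a simple midpoint rule in polar coordinates; the names `GRAD` and `inner` and the grid sizes are arbitrary choices.

```python
import math

SQ3, SQ8 = math.sqrt(3.0), math.sqrt(8.0)

# Cartesian gradients of a few low-order Noll Zernikes.
GRAD = {
    2: lambda x, y: (2.0, 0.0),                          # Z2 = 2x
    3: lambda x, y: (0.0, 2.0),                          # Z3 = 2y
    4: lambda x, y: (4 * SQ3 * x, 4 * SQ3 * y),          # Z4 = sqrt(3)(2r^2 - 1)
    7: lambda x, y: (6 * SQ8 * x * y,
                     SQ8 * (3 * x * x + 9 * y * y - 2)),  # Z7 = sqrt(8)(3r^2 - 2)y
}

def inner(a, b, nr=200, nt=64):
    """(A, B) = (1/pi) * integral of A.B over the unit disk (midpoint rule, polar grid)."""
    dr, dt = 1.0 / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for k in range(nt):
            t = (k + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            ax, ay = a(x, y)
            bx, by = b(x, y)
            total += (ax * bx + ay * by) * r   # extra r is the polar area element
    return total * dr * dt / math.pi

indices = (2, 3, 4, 7)
for i in indices:
    print(i, [round(inner(GRAD[i], GRAD[k]), 3) for k in indices])
```

Analytically, the diagonal entries for these four gradients are 4, 4, 24 and 56, and the only non-zero off-diagonal entry among them is (∇Z3, ∇Z7) = 4√2, which is why ∇Z7 must be combined with ∇Z3 to obtain an orthogonal function.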

3.1 Orthogonalization of gradient functions

Using the Gram-Schmidt orthogonalization method [13, 14] (general description for the method can be found in Reference 13, and an optical application can be found in Reference 14), we construct a new set of vector polynomials with Zernike gradient polynomials as basis. The gradient of Z1 is zero, therefore it is not used in the construction of the new set. We choose to index the first polynomial of this new set as 2 to maintain its correspondence with Zernike polynomials. The first 36 such polynomials are listed in Table 3.

Table 3. List of first 36 orthonormal vector polynomials Si as functions of Zernike gradients.
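The Gram-Schmidt construction can be sketched numerically for the first six gradients. This is a minimal illustration under stated assumptions, not the authors' procedure: the gradients of Z2 to Z7 are hard-coded in Cartesian form, the Gram matrix of inner products is approximated with a midpoint rule on a polar grid, and the orthogonalization is carried out on coefficient vectors. All function names are hypothetical.

```python
import math

SQ3, SQ6, SQ8 = math.sqrt(3.0), math.sqrt(6.0), math.sqrt(8.0)

GRADS = [  # gradients of Z2 ... Z7 in Noll order
    lambda x, y: (2.0, 0.0),
    lambda x, y: (0.0, 2.0),
    lambda x, y: (4 * SQ3 * x, 4 * SQ3 * y),
    lambda x, y: (2 * SQ6 * y, 2 * SQ6 * x),
    lambda x, y: (2 * SQ6 * x, -2 * SQ6 * y),
    lambda x, y: (6 * SQ8 * x * y, SQ8 * (3 * x * x + 9 * y * y - 2)),
]

def gram_matrix(funcs, nr=200, nt=64):
    """Matrix of inner products (1/pi) * integral of f_a . f_b over the unit disk."""
    n = len(funcs)
    g = [[0.0] * n for _ in range(n)]
    dr, dt = 1.0 / nr, 2.0 * math.pi / nt
    for i in range(nr):
        r = (i + 0.5) * dr
        w = r * dr * dt / math.pi
        for k in range(nt):
            t = (k + 0.5) * dt
            vals = [f(r * math.cos(t), r * math.sin(t)) for f in funcs]
            for a in range(n):
                for b in range(a + 1):
                    g[a][b] += w * (vals[a][0] * vals[b][0] + vals[a][1] * vals[b][1])
    for a in range(n):
        for b in range(a):
            g[b][a] = g[a][b]
    return g

def gram_schmidt(g):
    """Each returned row holds the coefficients of one orthonormal function on the input basis."""
    n = len(g)
    rows = []
    for k in range(n):
        c = [0.0] * n
        c[k] = 1.0
        for q in rows:
            proj = sum(q[b] * g[k][b] for b in range(n))   # (v_k, q)
            c = [ci - proj * qi for ci, qi in zip(c, q)]
        norm = math.sqrt(sum(c[a] * g[a][b] * c[b] for a in range(n) for b in range(n)))
        rows.append([ci / norm for ci in c])
    return rows

rows = gram_schmidt(gram_matrix(GRADS))
for j, row in zip(range(2, 8), rows):
    print("S%d" % j, [round(v, 4) for v in row])
```

The last row should be close to −√2/√48 on ∇Z3 and 1/√48 on ∇Z7, with the other entries near zero, which agrees with the S7 example quoted in Section 4.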

In general, the S polynomials can be simply expressed in terms of Zernike gradient polynomials:

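For all j with n = m,

$$
\vec{S}_j = \frac{1}{\sqrt{2n(n+1)}}\, \nabla Z_j. \tag{4}
$$

For all j with n ≠ m,

$$
\vec{S}_j = \frac{1}{\sqrt{4n(n+1)}} \left( \nabla Z_j - \sqrt{\frac{n+1}{n-1}}\; \nabla Z_{j'(n'=n-2,\; m'=m)} \right) \tag{5}
$$

where j − j′ is even when m≠0.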
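The two closed-form rules above are easy to evaluate for any index. The following Python sketch is illustrative, not the authors' code: `s_as_gradients` is a hypothetical helper that returns the Zernike gradient coefficients of S_j, the index helpers implement Noll's ordering, and the ∇Z1 term is dropped because the gradient of Z1 is zero.

```python
import math

def noll_to_nm(j):
    """Map a Noll index j >= 1 to (n, m)."""
    n = 0
    while (n + 1) * (n + 2) // 2 < j:
        n += 1
    k = j - n * (n + 1) // 2
    m = 2 * (k // 2) if n % 2 == 0 else 2 * ((k - 1) // 2) + 1
    return n, m

def nm_to_noll(n, m, even_j):
    """Noll index for (n, m); even_j selects the even (cosine) or odd (sine) member."""
    base = n * (n + 1) // 2
    if m == 0:
        return base + 1
    lo = base + m
    return lo if (lo % 2 == 0) == even_j else lo + 1

def s_as_gradients(j):
    """Return {Noll index: coefficient} with S_j = sum of coefficient * grad Z_index."""
    if j < 2:
        raise ValueError("S_j is defined for j >= 2 because grad Z1 = 0")
    n, m = noll_to_nm(j)
    if n == m:                                   # first rule, n = m
        return {j: 1.0 / math.sqrt(2 * n * (n + 1))}
    scale = 1.0 / math.sqrt(4 * n * (n + 1))     # second rule, n != m
    out = {j: scale}
    j_prime = nm_to_noll(n - 2, m, j % 2 == 0)   # same parity as j, so j - j' is even
    if j_prime > 1:                              # grad Z1 vanishes, so skip that term
        out[j_prime] = -scale * math.sqrt((n + 1) / (n - 1))
    return out

for j in (2, 4, 7, 11):
    print(j, s_as_gradients(j))
```

For j = 7 this reproduces the combination (∇Z7 − √2 ∇Z3)/√48 quoted later in the text.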

3.3 S as linear combinations of Zernike polynomials

Given that the vector polynomials S are functions of Zernike gradient polynomials and the Zernike gradient polynomials are functions of Zernike polynomials, we can express S in terms of Zernike polynomials as listed in Table 4.

Table 4. List of S polynomials expressed as linear combinations of Zernike polynomials.

For a given Sj with corresponding indices j(n, m), we define its x and y components as Sjx and Sjy, respectively, i.e.

$$
\vec{S}_j = \hat{i}\, S_{jx} + \hat{j}\, S_{jy}. \tag{6}
$$

From observation of the first 37 S polynomials, we found that both Sjx and Sjy are linear combinations of at most two Zernike polynomials with corresponding indices j(n−1, m±1), which may or may not exist, subject to the rule n ≥ m ≥ 0:

$$
\begin{aligned}
S_{jx} &= C_a\, Z_{j_a(n-1,\, m-1)} + C_{a'}\, Z_{j_{a'}(n-1,\, m+1)}, \\
S_{jy} &= C_b\, Z_{j_b(n-1,\, m-1)} + C_{b'}\, Z_{j_{b'}(n-1,\, m+1)}.
\end{aligned} \tag{7}
$$

For a given j(n, m), a set of rules can be used to determine all the parameters in Eq. (7) and so express Sj as a linear combination of Zernike polynomials. These rules are summarized in Table 5. They are useful for generating the analytical expression of any S polynomial programmatically. They are complex because different combinations of j, n and m must be handled separately, and most of the complexity comes from the numbering scheme: in Noll’s scheme, even j corresponds to cosine terms and odd j to sine terms, and these terms swap order after each m=0 term. The rules become simpler if they are stated in terms of the sine/cosine dependence alone. If an S polynomial has the same index j as a Zernike polynomial, its x-component is a linear combination of Zernikes with the same sine or cosine angular dependence, and its y-component has the opposite angular dependence. For example, Z32 has cosine angular dependence, so the x-component of S32 contains Z24 and Z26, which both have cosine dependence, while the y-component of S32 contains Z23 and Z25, which both have sine dependence.

Table 5. The rules for writing S in terms of linear combinations of Zernikes.
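The sine/cosine form of the rule can be sketched in code. The Python below only identifies which Zernike indices can appear in each component of S_j; it does not compute the coefficients of Eq. (7), which come from Table 5. It is an illustrative sketch: the function names are hypothetical, and it assumes that m = 0 terms are grouped with the cosine family (since cos 0 = 1), which is consistent with S2 = î Z1 and S3 = ĵ Z1.

```python
def noll_to_nm(j):
    """Map a Noll index j >= 1 to (n, m)."""
    n = 0
    while (n + 1) * (n + 2) // 2 < j:
        n += 1
    k = j - n * (n + 1) // 2
    m = 2 * (k // 2) if n % 2 == 0 else 2 * ((k - 1) // 2) + 1
    return n, m

def nm_to_noll(n, m, cosine):
    """Noll index for (n, m); cosine=True picks the even-j member when m != 0."""
    base = n * (n + 1) // 2
    if m == 0:
        return base + 1
    lo = base + m
    return lo if (lo % 2 == 0) == cosine else lo + 1

def component_indices(j):
    """Candidate Zernike indices for the x and y components of S_j."""
    n, m = noll_to_nm(j)
    is_cos = (m == 0) or (j % 2 == 0)     # assumption: m = 0 counts as cosine-like
    x_idx, y_idx = [], []
    for mm in (m - 1, m + 1):
        if mm < 0 or mm > n - 1:          # index pair (n-1, mm) does not exist
            continue
        if mm == 0:
            # Z(n-1, 0) has no sine partner, so it only fills the cosine slot
            (x_idx if is_cos else y_idx).append(nm_to_noll(n - 1, 0, True))
        else:
            x_idx.append(nm_to_noll(n - 1, mm, is_cos))       # same dependence
            y_idx.append(nm_to_noll(n - 1, mm, not is_cos))   # opposite dependence
    return x_idx, y_idx

print(component_indices(32))
print(component_indices(7))
```

For j = 32 this yields Z24 and Z26 for the x-component and Z23 and Z25 for the y-component, matching the example in the text.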

3.4 Plots of vector polynomial functions

Plots of the first 12 S vector polynomials are shown in Table 6.

Table 6. Plots of first 12 S polynomials in a unit circle.

4. Relating the vector polynomials to gradients of scalar functions

The set of S polynomials fully spans the space of vector distributions V⃗(x, y) over the unit circle for which a scalar function Φ(x, y) exists such that V⃗(x, y)=∇Φ(x, y). It is useful to represent the vector data using the vector polynomials S and to relate them to scalar functions ϕi defined by Si=∇ϕi.

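Applying the rules listed in Eqs. (4) and (5), the scalar functions can be calculated as follows.

For all j with n = m,

$$
\phi_j = \frac{1}{\sqrt{2n(n+1)}}\, Z_j. \tag{8}
$$

For all j with n ≠ m,

$$
\phi_j = \frac{1}{\sqrt{4n(n+1)}} \left( Z_j - \sqrt{\frac{n+1}{n-1}}\; Z_{j'(n'=n-2,\; m'=m)} \right), \tag{9}
$$

where j − j′ is even when m≠0.

These relationships match those demonstrated for the vector functions listed in Table 3. For example, S7 = (1/√48)(∇Z7 − √2 ∇Z3), which leads to ϕ7 = (1/√48)(Z7 − √2 Z3).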

Applying these relations, the vector data V⃗(x, y) is decomposed into a linear combination of the orthonormal S polynomials as

$$
\vec{V} = \sum_i \alpha_i\, \vec{S}_i. \tag{10}
$$

Using the definitions of the scalar functions Φ and ϕi (V⃗=∇Φ, Si=∇ϕi), we have

$$
\Phi = \sum_i \alpha_i\, \phi_i, \tag{11}
$$

where the coefficients αi were found from the vector decomposition in Eq. (10). Then the scalar function Φ can in turn be represented as a linear combination of standard Zernike polynomials:

$$
\Phi = \sum_i \alpha_i\, \phi_i = \sum_i \gamma_i\, Z_i. \tag{12}
$$

The coefficients of these standard Zernike polynomials can be found by

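$$
\gamma_j =
\begin{cases}
\dfrac{\alpha_{j(n,m)}}{\sqrt{2n(n+1)}}, & n = m, \\[2ex]
\dfrac{\alpha_{j(n,m)}}{\sqrt{4n(n+1)}} - \dfrac{\alpha_{j'(n+2,\,m)}}{\sqrt{4(n+1)(n+2)}}, & n \neq m,
\end{cases} \tag{13}
$$

where j − j′ is even when m≠0.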
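The conversion from the coefficients α to the Zernike coefficients γ can also be carried out by direct accumulation: expand each ϕ_j in Zernike polynomials using the n = m and n ≠ m rules of Eqs. (8) and (9), then sum the contributions. The Python sketch below does this; it is an illustration with hypothetical function names, not the authors' code. Because ϕ_j′ with n′ = n + 2 contains Z_j whether or not n = m, the accumulation produces a cross-term from α_j′(n+2, m) for n = m entries as well whenever that coefficient is non-zero (for instance, γ3 picks up a contribution from α7).

```python
import math

def noll_to_nm(j):
    """Map a Noll index j >= 1 to (n, m)."""
    n = 0
    while (n + 1) * (n + 2) // 2 < j:
        n += 1
    k = j - n * (n + 1) // 2
    m = 2 * (k // 2) if n % 2 == 0 else 2 * ((k - 1) // 2) + 1
    return n, m

def nm_to_noll(n, m, even_j):
    """Noll index for (n, m); even_j selects the even (cosine) or odd (sine) member."""
    base = n * (n + 1) // 2
    if m == 0:
        return base + 1
    lo = base + m
    return lo if (lo % 2 == 0) == even_j else lo + 1

def phi_as_zernikes(j):
    """Return {Noll index: coefficient} for the scalar function phi_j (j >= 2)."""
    n, m = noll_to_nm(j)
    if n == m:
        return {j: 1.0 / math.sqrt(2 * n * (n + 1))}
    scale = 1.0 / math.sqrt(4 * n * (n + 1))
    j_prime = nm_to_noll(n - 2, m, j % 2 == 0)
    # For m = 0, n = 2 the second term is a piston (Z1) offset with zero gradient.
    return {j: scale, j_prime: -scale * math.sqrt((n + 1) / (n - 1))}

def alpha_to_gamma(alpha):
    """Accumulate Zernike coefficients gamma from vector-fit coefficients alpha."""
    gamma = {}
    for j, a in alpha.items():
        for idx, c in phi_as_zernikes(j).items():
            gamma[idx] = gamma.get(idx, 0.0) + a * c
    return gamma

print(alpha_to_gamma({3: 1.0, 7: 1.0}))
print(alpha_to_gamma({7: 1.0, 17: 1.0}))
```

In the second call, the value returned for index 7 equals α7/√48 − α17/√80, which is the n ≠ m case of Eq. (13) with n = 3.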

This procedure is useful for applications such as processing data from a Shack Hartmann sensor. The centroid data, which is proportional to wavefront slopes, can be fit to the vector S polynomials to give a set of coefficients α i. These are converted directly to a standard Zernike polynomial representation of the wavefront, with coefficients γ i.
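A toy version of this procedure is sketched below in Python. It is only an illustration under simplifying assumptions: the "measured" slope field is synthetic and noise-free, only S2, S3 and S4 are used (written out in Cartesian form), and the inner product is approximated by a midpoint rule on a dense polar grid rather than on a real lenslet array, where the discrete samples would be only approximately orthogonal. The names `measured_slopes` and `project` are hypothetical.

```python
import math

SQ2 = math.sqrt(2.0)

# S2 = grad Z2 / 2, S3 = grad Z3 / 2, S4 = grad Z4 / sqrt(24), in Cartesian form.
S = {
    2: lambda x, y: (1.0, 0.0),
    3: lambda x, y: (0.0, 1.0),
    4: lambda x, y: (SQ2 * x, SQ2 * y),
}

def measured_slopes(x, y):
    """Hypothetical slope data equal to 0.5*S2 - 0.2*S3 + 0.3*S4."""
    return (0.5 + 0.3 * SQ2 * x, -0.2 + 0.3 * SQ2 * y)

def project(field, basis, nr=200, nt=64):
    """alpha = (V, S_i): (1/pi) * integral of V . S_i over the unit disk."""
    dr, dt = 1.0 / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for k in range(nt):
            t = (k + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            vx, vy = field(x, y)
            sx, sy = basis(x, y)
            total += (vx * sx + vy * sy) * r
    return total * dr * dt / math.pi

# Step 1: fit the slopes, one inner product per vector polynomial.
alpha = {j: project(measured_slopes, S[j]) for j in S}

# Step 2: convert to Zernike coefficients of the wavefront (higher-order alphas are zero here).
gamma = {2: alpha[2] / 2.0, 3: alpha[3] / 2.0, 4: alpha[4] / math.sqrt(24.0)}

print({j: round(a, 4) for j, a in alpha.items()})
print({j: round(g, 4) for j, g in gamma.items()})
```

Because the basis is orthonormal, each coefficient is obtained independently of the others, with no least squares solve.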

The reverse problem can also be solved: given a scalar function Φ and its Zernike decomposition coefficients γi, the coefficients αi can be found from Eq. (13). When Φ is a wavefront, the rms spot radius is r = 2f √(Σ αi²), where f is the system F-number.
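The rms spot radius then follows directly from the fitted coefficients. The short Python sketch below assumes the expression r = 2f √(Σ αi²), with the αi expressed in the same length units as the wavefront and the slopes taken with respect to normalized pupil coordinates; the function name is hypothetical.

```python
import math

def rms_spot_radius(alpha, f_number):
    """rms spot radius r = 2 f sqrt(sum of alpha_i^2), in the length units of alpha."""
    return 2.0 * f_number * math.sqrt(sum(a * a for a in alpha))

# Hypothetical coefficients in micrometres for an f/10 system
print(rms_spot_radius([0.03, 0.04], 10.0))
```

Since the S polynomials are orthonormal, the sum of squared coefficients is the mean square slope over the pupil, so no further integration is needed.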

5. Summary

We derived an orthonormal set of vector polynomials in a unit circle. It has many potential applications, one of which is fitting slope data in optical testing. Each of these polynomials is a linear combination of the gradients of at most two Zernike polynomials. They can also be expressed as linear combinations of at most four scalar Zernike polynomials. After wavefront slope data, e.g. data taken with a Shack-Hartmann sensor, is fit with the vector polynomials, it is straightforward to convert the fitted slope map to the wavefront map expressed in terms of Zernike polynomials.

Appendix:

Table 7. The first 37 Zernike polynomials according to Noll’s numbering.

References and links

1. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66, 207–211 (1976).

2. M. Born and E. Wolf, Principles of Optics (Pergamon Press, 1980), pp. 464–468.

3. See http://wyant.opt-sci.arizona.edu/zernikes/zernikes.htm.

4. G. Harbers, P. J. Kunst, and G. W. R. Leibbrandt, “Analysis of lateral shearing interferograms by use of Zernike polynomials,” Appl. Opt. 35, 6162–6172 (1996).

5. R. G. Lane and M. Tallon, “Wave-front reconstruction using a Shack-Hartmann sensor,” Appl. Opt. 31, 6902–6907 (1992).

6. P. C. V. Mallik, C. Zhao, and J. H. Burge, “Measurement of a 2-meter flat using a pentaprism scanning system,” Opt. Eng. 46, 023602 (2007).

7. A. Gavrielides, “Vector polynomials orthogonal to the gradient of Zernike polynomials,” Opt. Lett. 7, 526–528 (1982).

8. E. Acosta, S. Bara, M. A. Rama, and S. Rios, “Determination of phase mode components in terms of local wave-front slopes: an analytical approach,” Opt. Lett. 20, 1083–1085 (1995).

9. P. E. Murphy, T. G. Brown, and D. T. Moore, “Interference imaging for aspheric surface testing,” Appl. Opt. 39, 2122–2129 (2000).

10. J. H. Burge, Advanced Techniques for Measuring Primary Mirrors for Astronomical Telescopes, Ph.D. Dissertation, Optical Sciences, University of Arizona (1993).

11. Durango™ Interferometry Software, Diffraction International, Minnetonka, MN.

12. C. Zhao and J. H. Burge, “Orthonormal vector polynomials in a unit circle, Part II: completing the basis set,” to be submitted to Optics Express (2007).

13. T. M. Apostol, Linear Algebra: A First Course, with Applications to Differential Equations (John Wiley & Sons, 1997), pp. 111–114.

14. R. Upton and B. Ellerbroek, “Gram-Schmidt orthogonalization of the Zernike polynomials on apertures of arbitrary shape,” Opt. Lett. 29, 2840–2842 (2004).
