Optica Publishing Group

Robust and fast computation for the polynomials of optics

Open Access

Abstract

Mathematical methods that are poorly known in the field of optics are adapted and shown to have striking significance. Orthogonal polynomials are common tools in physics and optics, but problems are encountered when they are used to higher orders. Applications to arbitrarily high orders are shown to be enabled by remarkably simple and robust algorithms that are derived from well known recurrence relations. Such methods are demonstrated for a couple of familiar optical applications where, just as in other areas, there is a clear trend to higher orders.

©2010 Optical Society of America

1. Introduction

Many of the special functions used in optics satisfy three-term recurrence relations. These include trigonometric functions, Bessel functions (including modified Bessel and Hankel functions), and various orthogonal polynomials. Although most of the ideas discussed in what follows apply to any functions in this class, I concentrate on orthogonal polynomials in particular. Among the better known orthogonal polynomials in optics are the Zernike polynomials [1] (for characterizing a function defined on a disc, or modified for cases with non-uniform weighting, or for annular domains) and the Laguerre and Hermite polynomials [2] (for characterizing profiles of transversely localized beams). Traditionally, such polynomials were only ever needed to modest orders, say up to ten or so terms. (Note that, in the case of Zernikes, there are distinct polynomials for each azimuthal order, so the total number of terms is often more than thirty, but these hold a more modest number of terms for each azimuthal order.) This has meant that explicit expressions have been widely used for the polynomials, and they have generally been perfectly adequate.

The inclusion of higher-order terms in such decompositions is increasingly being used to boost capabilities. Two considerations limit progress in this direction, however. First, the explicit expressions for the polynomials lead to numerical round-off problems that become catastrophic when the number of terms reaches around 20 —a lesson sometimes learned the hard way. Second, computational efficiency (i.e. arithmetic operation count) remains a vital consideration in applications such as optical design: despite the staggering growth of computer power, multi-dimensional global optimization is so challenging that its results will always benefit from efficiency gains. The methods discussed below lead to simple code that is remarkably efficient while also avoiding round-off failures.

Any set of orthogonal polynomials, say {Pm} where Pm(x) is a polynomial of order m, is defined so that

$$\langle P_m, P_n\rangle = 0 \quad\text{when}\quad m \ne n,\tag{1.1}$$

where ⟨f,g⟩ is typically an integral over the domain of interest of the product of f(x) and g(x). When the basis is normalized by ⟨P_n,P_n⟩ = 1, the coefficients in an expansion of the form

$$S(x) = \sum_m s_m P_m(x)\tag{1.2}$$

are then determined individually by noticing that ⟨S,P_n⟩ = Σ_m s_m ⟨P_m,P_n⟩ = s_n, i.e.

$$s_m = \langle S, P_m\rangle.\tag{1.3}$$

These coefficients are evidently like a spectrum, where the analogue of Parseval’s theorem follows similarly from Eqs. (1.1) and (1.2):

$$\langle S, S\rangle = \sum_m s_m^2.\tag{1.4}$$
This is an intuitive conservation law that couples the total energy in the spectrum to the energy in the original function.

A result that is of central importance in this work follows upon considering the expansion of the function defined by S(x) := x P_n(x). On account of Eq. (1.1), ⟨φ,P_m⟩ vanishes when φ is any polynomial of order less than m. It follows that s_m = ⟨x P_n, P_m⟩ = ⟨P_n, x P_m⟩ vanishes unless |m − n| ≤ 1. This observation can be expressed as

$$x\,P_n(x) = q\,P_{n+1}(x) + r\,P_n(x) + s\,P_{n-1}(x),\tag{1.5}$$

where q, r, and s are constants and q ≠ 0. Equation (1.5) can be re-arranged to give the standard form for the three-term recurrence relation that underpins the following sections:

$$P_{n+1}(x) = (a_n + b_n x)\,P_n(x) - c_n\,P_{n-1}(x).\tag{1.6}$$

Such a relation follows regardless of the normalization. In fact, in place of ⟨P_n,P_n⟩ = 1, normalization by peak value is the usual convention for the applications considered in this work. Simple closed forms for a_n, b_n, and c_n are known for all the most commonly used orthogonal polynomials, e.g. see Sec. 22.7 of [3].

Since the well known orthogonal polynomials in optics are among the standard set, the existence of the associated recurrence relations, generating functions, Christoffel-Darboux identities, etc. has often been noted and occasionally rediscovered. In the case of Zernikes, see [4–7] as examples. Some of the associated recurrence relations have been adopted in the field of image processing, as reviewed in [8]. Even so, in most applications of Zernikes, significant effort (and space) is devoted to working with the explicit expressions for the polynomials of moderate orders. It has not been generally appreciated that, in practice, this is a road to grief. With r as the radial coordinate on a unit disc, each element of a Zernike decomposition involves cos(mθ) or sin(mθ) times r^m Z_n^m(r²). These polynomials for m’th azimuthal order are related to the Jacobi polynomials by Z_n^m(x) = P_n^{(0,m)}(2x−1), e.g. see Eq. (3.7) of [4]. The domain of interest is 0 ≤ x ≤ 1. The explicit expression that is the focus of many applications in such work is

$$Z_n^m(x) = \sum_{j=0}^{n} (-1)^{n-j}\binom{n}{j}\binom{m+n+j}{n}\,x^j = \sum_{j=0}^{n} (-1)^{n-j}\,\frac{(m+n+j)!}{j!\,(m+j)!\,(n-j)!}\,x^j.\tag{1.7}$$
(This is given by, for example, 22.3.3 and 22.5.2 of [3]). From among the rotationally symmetric subset, as an example, it follows that
$$Z_{10}^0(x) = 1 - 110x + 2970x^2 - 34320x^3 + 210210x^4 - 756756x^5 + 1681680x^6 - 2333760x^7 + 1969110x^8 - 923780x^9 + 184756x^{10}.\tag{1.8}$$
Since |Z_n^m(1)| = 1, six or seven significant digits are evidently lost through heavy cancellation when evaluating Eq. (1.8) for any x ≈ 1. More generally, the magnitude of the coefficients of Z_n^m(x) is found to be about 10^{0.7n}. This means that, even when using double precision arithmetic, it is impossible to guarantee any accurate decimal places at all in Z_n^m(1.0) once n exceeds about 20. Accuracy can be seen to suffer long before that point of total catastrophe and, as demonstrated in Fig. 1, this problem is avoided by using Eq. (1.6). The details of this particular recurrence relation are given in Section 4, and the stability of such relations is discussed in, for example, [9].
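To make the contrast concrete, the recurrence of Eq. (1.6) amounts to only a few lines of code. The sketch below (in Python; the function name and argument conventions are illustrative, not from the paper) evaluates any basis that satisfies such a recurrence, using Chebyshev polynomials of the first kind as a simple check since their recurrence coefficients (a_n = 0, c_n = 1, b_0 = 1 and b_n = 2 otherwise) are well known:

```python
# A minimal sketch of Eq. (1.6): generate P_0..P_N for any recurrence
# P_{n+1}(x) = (a_n + b_n x) P_n(x) - c_n P_{n-1}(x), assuming P_0 = 1.

def recurrence_eval(x, N, a, b, c):
    """Return the list [P_0(x), ..., P_N(x)] via Eq. (1.6)."""
    P = [1.0]
    if N == 0:
        return P
    P.append((a(0) + b(0) * x) * P[0])  # P_1; c_0 = 0 by convention
    for n in range(1, N):
        P.append((a(n) + b(n) * x) * P[n] - c(n) * P[n - 1])
    return P

# Check with Chebyshev T_n: a_n = 0, c_n = 1, b_0 = 1 and b_n = 2 otherwise.
cheb = recurrence_eval(0.5, 3,
                       lambda n: 0.0,
                       lambda n: 1.0 if n == 0 else 2.0,
                       lambda n: 1.0)
# T_0..T_3 at x = 0.5 are 1, 0.5, -0.5, -1
```

Each new basis member costs a fixed handful of operations, and no large cancelling terms ever arise, which is the essence of the robustness demonstrated in Fig. 1.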


Fig. 1 The red curve at left is a plot of a high-order rotationally symmetric Zernike over just 0.95 < x < 1; the green curve is 10^{14} times the error when this function is evaluated by using Eq. (1.6), so there are typically 14 significant digits in this double-precision result. The curve on the right is the catastrophic error when Eq. (1.7) is used at double precision. Although exact to within round-off, the explicit polynomial gives values with the wrong order of magnitude and chaotic sign. This failure grows exponentially with the polynomial’s order.


Another benefit offered by Eq. (1.6) can be appreciated when evaluating a function expressed as in Eq. (1.2) with the sum truncated at, say, m = M. Even when nested according to the Horner scheme, the evaluation of P_m(x) requires about 2m arithmetic operations. In this way, assuming all the coefficients are pre-computed, the evaluation of S(x) therefore requires about M² arithmetic operations. When Eq. (1.6) is employed, on the other hand, each of the basis elements is then determined with just five arithmetic operations, so only 5M are required for the entire set. In either case, forming the linear combination that constitutes S(x) takes another 2M operations, but the main point to be made is that an “O(M²) process” is converted to one of O(M). This comes on top of the vital accuracy gain demonstrated in Fig. 1. More importantly, however, the focus of this paper is on three additional steps that were developed decades ago by Clenshaw [10], Smith [11], and Salzer [12], respectively, to extract additional benefits from Eq. (1.6). Their elegant, but little used, ideas are given unified derivations below and are adapted to the context of optics.

2. Linear combinations of orthogonal polynomials and their derivatives

An obvious application of Eq. (1.6) is for functions defined as a truncated sum of the type in Eq. (1.2), say

$$S(x) := \sum_{m=0}^{M} s_m P_m(x).\tag{2.1}$$

The required basis members can first be evaluated sequentially by using Eq. (1.6), and the end result then formed as a linear combination with the coefficients s_m. Clenshaw [10] presented a related recurrence relation that allows these two steps to be combined. As proven in Appendix A, by starting with α_M = s_M and α_{M-1} = s_{M-1} + (a_{M-1} + b_{M-1}x)s_M, and progressively working downward from n = M−2 by using

$$\alpha_n = s_n + (a_n + b_n x)\,\alpha_{n+1} - c_{n+1}\,\alpha_{n+2},\tag{2.2}$$

the desired result is given by just S = α_0. This process has all the robustness of Eq. (1.6), and it is even more efficient at evaluating Eq. (2.1). Each cycle of Eq. (2.2) involves three multiplications and three additions, and this must be done M−1 times to determine the sum of interest, namely α_0. In this way, when a_n, b_n, and c_n are pre-computed, Eq. (2.1) can be evaluated robustly with about 6M operations via a remarkably simple bit of code to perform nothing more than Eq. (2.2). Notice that no single basis member is evaluated along the way; this is an elegant short-circuit from the spectrum to the function value. When nested as a Horner scheme, evaluating a regular polynomial of order M requires only 2M operations. A factor of three in operation count is evidently the minimal cost that must be paid to avoid catastrophic numerical instability. There are significant side-benefits, however, and a sample is commented on in one context in Section 5. Many of them follow from the fact that the rms value of a function is just the square root of the sum of the squared coefficients, as in Eq. (1.4).
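As a minimal sketch, Eq. (2.2) might be coded as follows (Python; the function name and calling conventions are mine, and the normalization P_0 ≡ 1 of Appendix A is assumed so that S = α_0):

```python
def clenshaw(x, s, a, b, c):
    """Evaluate S(x) = sum_{m=0}^{M} s_m P_m(x) via Eq. (2.2).

    Starts from alpha_M = s_M and
    alpha_{M-1} = s_{M-1} + (a_{M-1} + b_{M-1} x) s_M, then works down;
    the normalization P_0 = 1 (so S = alpha_0) is assumed throughout.
    """
    M = len(s) - 1
    if M == 0:
        return s[0]
    a2 = s[M]                                       # alpha_{n+2}
    a1 = s[M - 1] + (a(M - 1) + b(M - 1) * x) * a2  # alpha_{n+1}
    for n in range(M - 2, -1, -1):
        a1, a2 = s[n] + (a(n) + b(n) * x) * a1 - c(n + 1) * a2, a1
    return a1                                       # alpha_0 = S(x)

# Check with a Chebyshev expansion: S = 1 + 2 T_1 + 3 T_2 = 6x^2 + 2x - 2,
# so S(0.5) = 0.5.
val = clenshaw(0.5, [1.0, 2.0, 3.0],
               lambda n: 0.0,
               lambda n: 1.0 if n == 0 else 2.0,
               lambda n: 1.0)
```

Note that only two running values are held at any time, and the basis members themselves never appear.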

Clenshaw’s process turns out to open the way to an equally effective scheme for computing derivatives. Although Eq. (1.6) can be differentiated to find a recurrence relation for P′_n(x) = (d/dx)P_n(x), a more direct determination of S′(x) can be performed with a simple variant of Eq. (2.2). In fact, the same thing works for higher derivatives. If the j’th derivative is written as S^{(j)}(x), it is proven in Appendix B that if we start with α^{(j)}_{M-j+1} = 0 and α^{(j)}_{M-j} = j b_{M-j} α^{(j-1)}_{M-j+1}, then

$$\alpha_n^{(j)} = j\,b_n\,\alpha_{n+1}^{(j-1)} + (a_n + b_n x)\,\alpha_{n+1}^{(j)} - c_{n+1}\,\alpha_{n+2}^{(j)}\tag{2.3}$$

can be used by working down from n = M−j−1 to find the desired result: S^{(j)} = α_0^{(j)} for any j > 0. Since Eqs. (2.2) and (2.3) have the same structure, it is effectively the same block of highly optimized code that can be employed to implement them both. That the derivatives follow so readily is a more significant benefit than the minor efficiency gain in going from the 7M operations discussed before Fig. 1 to the 6M operations for Clenshaw.
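A hedged sketch of Eqs. (2.2) and (2.3) together is given below (Python; names are illustrative). Rather than the explicit starting values quoted above, it uses the equivalent zero-padded initialization of Appendices A and B (α_{M+1} = α_{M+2} = 0 for every j), which lets one loop serve all orders:

```python
def clenshaw_derivs(x, s, a, b, c, J):
    """Return [S(x), S'(x), ..., S^(J)(x)] via Eqs. (2.2) and (2.3).

    Zero padding (alpha_{M+1} = alpha_{M+2} = 0 for every j) replaces the
    explicit starting values; the two schemes are equivalent.
    """
    M = len(s) - 1
    alpha = [[0.0] * (M + 3) for _ in range(J + 1)]  # alpha[j][n]
    for n in range(M, -1, -1):
        u = a(n) + b(n) * x
        v = c(n + 1) if n < M else 0.0  # c_{M+1} would multiply a zero anyway
        alpha[0][n] = s[n] + u * alpha[0][n + 1] - v * alpha[0][n + 2]
        for j in range(1, J + 1):
            alpha[j][n] = (j * b(n) * alpha[j - 1][n + 1]
                           + u * alpha[j][n + 1] - v * alpha[j][n + 2])
    return [alpha[j][0] for j in range(J + 1)]

# Check: S = 1 + 2 T_1 + 3 T_2 = 6x^2 + 2x - 2, so S, S', S'' at x = 0.5
# should be 0.5, 8 and 12.
derivs = clenshaw_derivs(0.5, [1.0, 2.0, 3.0],
                         lambda n: 0.0,
                         lambda n: 1.0 if n == 0 else 2.0,
                         lambda n: 1.0, 2)
```

As the text observes, the j > 0 rows have exactly the structure of the j = 0 row, with s_n replaced by the coupling term j b_n α^{(j−1)}_{n+1}.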

Something equivalent to Eq. (2.3) was presented by Smith [11], but with minimal derivation. The argument in his paper seems to proceed via differentiation of Eq. (1.6) instead of Eq. (2.2), as done in Appendix B. He presents two intermediate expressions that appear to complicate any interpretation of the process, and inconsistently applies a notational convention for higher derivatives. [For example, Smith presents something akin to Eq. (2.3), but without the factor of j that appears immediately after the equals sign.] The confusion this may have caused could explain the fact that his beautiful result seems to have been ignored in much of the related literature, even in numerical classics like [9]. Similarly, Clenshaw’s result has also been poorly appreciated in physics and optics and, again, it may be no coincidence that I have been unable to find a clear derivation of Eq. (2.2) in the literature. This is what motivated me to develop Appendices A and B. It is also interesting to note that Doha [13] and Barrio [14] both offer some interesting analyses and alternative results, although they are specific to Jacobi polynomials: Doha re-expresses S^{(j)} as a linear combination (with constant coefficients) of the original basis elements, i.e. in terms of P_m instead of P_m^{(j)}, while Barrio gives S^{(j)} without requiring all lower derivatives. The fact that the derivatives of Jacobi polynomials are once again Jacobi polynomials opens the door for their special results. Thankfully, the simple process described above applies more generally to any polynomials that satisfy Eq. (1.6), and Eq. (2.3) is all that is required in our context to determine derivatives.

3. Change of basis

When a function is expressed in terms of the basis {P_m^I} as

$$S = \sum_{m=0}^{M} s_m^{\mathrm{I}} P_m^{\mathrm{I}},\tag{3.1}$$

consider determining its representation in terms of a second basis, say

$$S = \sum_{m=0}^{M} s_m^{\mathrm{II}} P_m^{\mathrm{II}}.\tag{3.2}$$

That is, the goal is to determine s_m^II from s_m^I where each of the polynomial bases satisfies a three-term recurrence relation like Eq. (1.6), say

$$P_{n+1}^{\mathrm{I}} = (a_n^{\mathrm{I}} + b_n^{\mathrm{I}} x)\,P_n^{\mathrm{I}} - c_n^{\mathrm{I}}\,P_{n-1}^{\mathrm{I}}, \qquad P_{n+1}^{\mathrm{II}} = (a_n^{\mathrm{II}} + b_n^{\mathrm{II}} x)\,P_n^{\mathrm{II}} - c_n^{\mathrm{II}}\,P_{n-1}^{\mathrm{II}}.\tag{3.3}$$
By using Eqs. (3.3), Ting and Luke [15] derived a five-term recurrence relation for determining the elements of an upper-triangular change-of-basis matrix that multiplies smI to give smII. Just as the methods discussed in the previous section absorbed sm into a single-pass recurrence, Salzer [12] presented an analogous, apparently little-known process to change bases. That is, Salzer’s five-term recurrence absorbs smI, and its output is the smII elements themselves. (Salzer predated Ting and Luke, but the latter’s comments suggest that they misunderstood his method.) Sample applications are presented in Sections 4 and 5 where this method greatly simplifies past and current developments in optics.

Salzer’s process is derived in Appendix C in terms of what can be regarded as an (M+1)×(M+1) matrix with elements α_k^n [not to be confused with the differentiated α_n^{(j)} of Section 2]. It is convenient to introduce some auxiliary matrices in terms of the constants from the recurrence relations of Eqs. (3.3), namely

$$f_k^n := b_n^{\mathrm{I}} / b_{k-1}^{\mathrm{II}}, \qquad g_k^n := a_n^{\mathrm{I}} - b_n^{\mathrm{I}} a_k^{\mathrm{II}} / b_k^{\mathrm{II}}, \qquad h_k^n := b_n^{\mathrm{I}}\,c_{k+1}^{\mathrm{II}} / b_{k+1}^{\mathrm{II}}.\tag{3.4}$$

In some applications, it may be appropriate for these to be pre-computed and stored. It turns out that α_k^n is triangular with (M+1)(M+2)/2 non-zero elements: α_k^n = 0 unless 0 ≤ k ≤ M−n. This property can be used to simplify the penultimate equation of Appendix C, namely Eq. (C.6). At k = 0, one term drops out leaving

$$\alpha_0^n = s_n^{\mathrm{I}} + g_0^n\,\alpha_0^{n+1} + h_0^n\,\alpha_1^{n+1} - c_{n+1}^{\mathrm{I}}\,\alpha_0^{n+2}.\tag{3.5}$$

A different term drops for 0 < k < M−n−1, so that

$$\alpha_k^n = f_k^n\,\alpha_{k-1}^{n+1} + g_k^n\,\alpha_k^{n+1} + h_k^n\,\alpha_{k+1}^{n+1} - c_{n+1}^{\mathrm{I}}\,\alpha_k^{n+2}.\tag{3.6}$$

For k = M−n−1, three terms vanish leaving only

$$\alpha_{M-n-1}^n = f_{M-n-1}^n\,\alpha_{M-n-2}^{n+1} + g_{M-n-1}^n\,\alpha_{M-n-1}^{n+1},\tag{3.7}$$

and only one term remains when k = M−n, namely

$$\alpha_{M-n}^n = f_{M-n}^n\,\alpha_{M-n-1}^{n+1}.\tag{3.8}$$
The change of basis is realized by starting at n = M, where the only non-zero term is α_0^M = s_M^I, and with the two terms for n = M−1, namely α_0^{M−1} = s_{M−1}^I + g_0^{M−1} α_0^M and α_1^{M−1} = f_1^{M−1} α_0^M. From these, Eqs. (3.5)-(3.8) can be used to work down from n = M−2 to n = 0. It is shown in the appendix that the desired result is just s_m^II = α_m^0.

Equations (3.5)-(3.8) allow each α_k^n to be determined in no more than seven arithmetic operations. The roughly M²/2 non-zero elements can therefore be determined with about 3.5M² operations. Notice that, as for the processes in Section 2, this change of basis is achieved without evaluating a single member of either basis set. More significantly, there is no need to determine integrals involving the basis functions as you might expect from Eq. (1.3). It is a powerful and efficient shortcut. Its value is enhanced by the fact that the bases need not be orthogonal polynomials; all that is required is that they satisfy a three-term recurrence. For example, the regular monomial basis {x^m} obviously satisfies Eq. (1.6) with b_n = 1 and a_n = c_n = 0, and this is exploited in Section 5.
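A hedged sketch of Salzer's process follows (Python; the routine name and argument layout are mine). It applies the general form of Eq. (C.6) with zero padding, which reproduces the special cases of Eqs. (3.5)-(3.8) automatically, and is checked here by converting a small Chebyshev expansion to the monomial basis:

```python
def salzer_convert(sI, recI, recII):
    """Convert coefficients in basis I to basis II via Salzer's recurrence.

    recI and recII are (a, b, c) triples of coefficient functions for
    Eq. (1.6); both bases are assumed normalized so that P_0 = 1.
    """
    aI, bI, cI = recI
    aII, bII, cII = recII
    M = len(sI) - 1
    # alpha[n][k] holds alpha_k^n; rows n = M+1, M+2 stay identically zero.
    alpha = [[0.0] * (M + 1) for _ in range(M + 3)]
    for n in range(M, -1, -1):
        for k in range(M - n + 1):
            t = sI[n] if k == 0 else 0.0
            if k > 0:                    # f_k^n alpha_{k-1}^{n+1}
                t += (bI(n) / bII(k - 1)) * alpha[n + 1][k - 1]
            if k <= M - n - 1:           # g_k^n alpha_k^{n+1}
                t += (aI(n) - bI(n) * aII(k) / bII(k)) * alpha[n + 1][k]
            if k <= M - n - 2:           # h_k^n alpha_{k+1}^{n+1} and the tail
                t += (bI(n) * cII(k + 1) / bII(k + 1)) * alpha[n + 1][k + 1]
                t -= cI(n + 1) * alpha[n + 2][k]
            alpha[n][k] = t
    return [alpha[0][m] for m in range(M + 1)]

# Check: 1 + 2 T_1 + 3 T_2 = -2 + 2x + 6x^2, i.e. Chebyshev coefficients
# [1, 2, 3] become monomial coefficients [-2, 2, 6].
cheb = (lambda n: 0.0, lambda n: 1.0 if n == 0 else 2.0, lambda n: 1.0)
mono = (lambda n: 0.0, lambda n: 1.0, lambda n: 0.0)
t = salzer_convert([1.0, 2.0, 3.0], cheb, mono)
```

As promised in the text, no basis member is ever evaluated and no inner products are computed; only the recurrence constants of Eqs. (3.3) enter.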

4. Applications for Zernike polynomials

As stated in the Introduction, the Zernike polynomials of m’th azimuthal order are related to the Jacobi polynomials by Z_n^m(x) = P_n^{(0,m)}(2x−1). It follows that Z_n^m(x) satisfies a recurrence relation like Eq. (1.6). With s := m + 2n, this relation has

$$a_n = -\frac{(s+1)\left[(s-n)^2 + n^2 + s\right]}{(n+1)(s-n+1)\,s}, \qquad b_n = \frac{(s+2)(s+1)}{(n+1)(s-n+1)}, \qquad c_n = \frac{(s+2)(s-n)\,n}{(n+1)(s-n+1)\,s}.\tag{4.1}$$
(See, for example, 22.7.1 of [3].) Any number of Zernikes can therefore be computed robustly and efficiently by using Eq. (1.6). More importantly, the same is possible for linear combinations of Zernikes, and also of their derivatives, by applying the results of Section 2. Perhaps the most significant observation relates to the problem of changing aperture size. Suppose that the modified aperture is a fraction, say ε > 0, of the first, where typically ε < 1. This process arises in a variety of contexts including ophthalmology and lithography, and it has been revisited repeatedly over the last decade, see e.g. [16–20]. In all of this work, each azimuthal order can be treated separately for this change of basis. The challenge therefore is, for given values of s_n, to find t_n so that Σ_n s_n r^m Z_n^m(r²) is identical (as a function of r) to Σ_n t_n (r/ε)^m Z_n^m(r²/ε²).
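Before turning to the scaling problem, the first point can be made concrete. The sketch below (Python; names are mine) evaluates Z_n^m via Eqs. (1.6) and (4.1); note that at n = 0 the general formulas reduce to a_0 = −(m+1) and b_0 = m+2 (and give 0/0 when m = 0), so Z_1^m is formed from those simplified values, a step taken here for illustration:

```python
def zernike_radial(x, m, N):
    """Return [Z_0^m(x), ..., Z_N^m(x)] via Eqs. (1.6) and (4.1)."""
    Z = [1.0]
    if N == 0:
        return Z
    Z.append((m + 2) * x - (m + 1))  # Z_1^m, from a_0 = -(m+1), b_0 = m+2
    for n in range(1, N):
        s = m + 2 * n
        a = -(s + 1) * ((s - n)**2 + n * n + s) / ((n + 1) * (s - n + 1) * s)
        b = (s + 2) * (s + 1) / ((n + 1) * (s - n + 1))
        c = (s + 2) * (s - n) * n / ((n + 1) * (s - n + 1) * s)
        Z.append((a + b * x) * Z[n] - c * Z[n - 1])
    return Z

# Spot checks against the explicit expansion of Eq. (1.7):
# Z_2^4(x) = 15 - 42x + 28x^2, and Z_2^0(x) = 6x^2 - 6x + 1.
```

Unlike Eq. (1.7), nothing here grows exponentially with n, so the process can proceed to arbitrary orders.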

The authors of [16–20] have started from Eq. (1.7) and tackled laborious algebra in order to derive explicit expressions for the matrix elements in the change-of-basis matrix. The problem is that their results take an almost identical form to Eq. (1.7), see e.g. Eq. (19) of [19] or Eq. (29) of [20]. These matrix elements are polynomials in ε² and, while the expressions are exact to within round-off errors, they nevertheless fail as catastrophically as Eq. (1.7) when the number of terms grows to 20 or more. To make matters worse, this failure is most extreme for ε ≈ 1, which is precisely where the expressions are typically used. An elegant, specialized alternative is given in [21] where the matrix elements are themselves expressed as differences between two Zernikes. Their impressive derivation relies on two integral relations involving Bessel functions. A similarly robust and efficient scheme follows simply from the more general results given in the previous section, however.

This direct application of the results of Section 3 involves finding t_n so that

$$\sum_{n=0}^{M} s_n P_n^{\mathrm{I}}(x) \equiv \sum_{n=0}^{M} (t_n/\varepsilon^m)\,P_n^{\mathrm{II}}(x/\varepsilon^2),\tag{4.2}$$

where {P_n^I} and {P_n^II} are both just {Z_n^m}. Both of the recurrence relations in Eqs. (3.3) are therefore given by Eqs. (4.1), but with the one distinction that b_n^II = b_n/ε² due to the scaling of x in the second basis. The end result is then evidently just t_n = ε^m α_n^0. In this way, Salzer’s elegant general-purpose routine means that there was no need to wrestle with special functions or explicit high-order expansions. More importantly, the failure of the explicit expressions is thereby avoided. It is also interesting to notice that, for this application, f_k^n, g_k^n, and h_k^n of Eqs. (3.4) are linear functions of ε². This means that α_k^j is a polynomial of order M−j in ε² and, just as in Eq. (B.3), Eqs. (3.5)-(3.8) can be differentiated with respect to ε² to obtain a recurrence for determining the derivative of t_n with respect to ε². Such an entity is used for sensitivity analysis in [21].

5. Applications for asphere shape specification

Advances in both fabrication and metrology are enabling the introduction of more complex and precise aspheric optical surfaces. This trend demands larger numbers of terms in the polynomial used in the standard characterization of surface shape. The methods described above are ideally suited to one of the orthogonal bases that were recently introduced for efficient characterization in this context [22]. If c denotes the axial curvature and κ the conic constant, the surface’s sag is written as

$$z(\rho) = \frac{c\rho^2}{1 + \sqrt{1 - (1+\kappa)c^2\rho^2}} + u^4 \sum_{m=0}^{M} s_m\,Q_m^{\mathrm{con}}(u^2),\tag{5.1}$$

where u is just the normalized radial coordinate, i.e. u := ρ/ρ_max with ρ_max = CA/2. A sample of these basis members is plotted in Fig. 2. It so happens that Q_m^con(x) ≡ Z_m^4(x), so the recurrence relation for this set follows from Eqs. (4.1) with m = 4:


Fig. 2 A sample of the members of the basis used in Eq. (5.1).


$$a_n = -\frac{(2n+5)(n^2+5n+10)}{(n+1)(n+2)(n+5)}, \qquad b_n = \frac{2(n+3)(2n+5)}{(n+1)(n+5)}, \qquad c_n = \frac{n(n+3)(n+4)}{(n+1)(n+2)(n+5)}.\tag{5.2}$$

The methods of Sections 2 and 3 are ideally matched to working with aspheres in these terms. In this way, the potentially chronic round-off problems are localized to the basis members themselves, and tamed there by using recurrence relations. As discussed in [22], the fact that the coefficients now function like a spectrum has a variety of benefits.

Upon introducing S(x) := Σ_m s_m Q_m^con(x) and φ := [1 − (1+κ)c²ρ²]^{1/2}, Eq. (5.1) and its derivatives become

$$z(\rho) = c\rho^2/(1+\varphi) + u^4 S(u^2),\tag{5.3}$$
$$z'(\rho) = c\rho/\varphi + (2u^3/\rho_{\max})\left[2S(u^2) + u^2 S'(u^2)\right],\tag{5.4}$$
$$z''(\rho) = c/\varphi^3 + (2u^2/\rho_{\max}^2)\left[6S(u^2) + 9u^2 S'(u^2) + 2u^4 S''(u^2)\right].\tag{5.5}$$
Such entities are fundamental to working with aspheres, and they can now be robustly evaluated to arbitrary orders with remarkable efficiency by using Eqs. (2.2) and (2.3). Scaling the aperture size is another basic process in this context. Because Q_m^con(x) ≡ Z_m^4(x), this can be achieved simply by taking m = 4 in the process described in Section 4. That is, the new coefficients that give exactly the same surface (but now orthogonalized over a different aperture size) can be determined robustly and efficiently via simple code.
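For illustration, the sag and slope of Eqs. (5.3)-(5.4) might be sketched as below (Python; function names are mine). For transparency this sketch forms the basis and its derivative directly from Eq. (5.2) and the differentiated form of Eq. (1.6); the Clenshaw-style route of Eqs. (2.2)-(2.3), which never forms the basis at all, is what the text actually recommends:

```python
import math

def qcon_basis(x, M):
    """Q_m^con(x) and dQ_m^con/dx for m = 0..M, from Eq. (5.2) and the
    differentiated form of Eq. (1.6)."""
    a = lambda n: -(2*n + 5) * (n*n + 5*n + 10) / ((n + 1) * (n + 2) * (n + 5))
    b = lambda n: 2 * (n + 3) * (2*n + 5) / ((n + 1) * (n + 5))
    c = lambda n: n * (n + 3) * (n + 4) / ((n + 1) * (n + 2) * (n + 5))
    Q, dQ = [1.0], [0.0]
    for n in range(M):
        u = a(n) + b(n) * x
        Qm1 = Q[n - 1] if n > 0 else 0.0   # c_0 = 0, so this term vanishes at n = 0
        dQm1 = dQ[n - 1] if n > 0 else 0.0
        Q.append(u * Q[n] - c(n) * Qm1)
        dQ.append(b(n) * Q[n] + u * dQ[n] - c(n) * dQm1)
    return Q, dQ

def asphere_sag_slope(rho, cv, kappa, rho_max, s):
    """Sag z(rho) and slope z'(rho) from Eqs. (5.3) and (5.4)."""
    u = rho / rho_max
    x = u * u
    Q, dQ = qcon_basis(x, len(s) - 1)
    S = sum(sm * Qm for sm, Qm in zip(s, Q))
    dS = sum(sm * dQm for sm, dQm in zip(s, dQ))
    phi = math.sqrt(1.0 - (1.0 + kappa) * cv * cv * rho * rho)
    z = cv * rho * rho / (1.0 + phi) + u**4 * S
    dz = cv * rho / phi + (2 * u**3 / rho_max) * (2 * S + x * dS)
    return z, dz

# Check with cv = 0, rho_max = 1, s = [0, 1]: z = rho^4 (6 rho^2 - 5),
# so z(0.5) = -0.21875 and z'(0.5) = 36(0.5)^5 - 20(0.5)^3 = -1.375.
z, dz = asphere_sag_slope(0.5, 0.0, 0.0, 1.0, [0.0, 1.0])
```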

Perhaps more interestingly, the conventional manner to describe surfaces is by expressing the additive polynomial of Eq. (5.1) as a sum of monomials, say

$$u^4 \sum_{m=0}^{M} s_m\,Q_m^{\mathrm{con}}(u^2) = \sum_{m=0}^{M} A_{2m+4}\,\rho^{2m+4} = (u\rho_{\max})^4 \sum_{m=0}^{M} A_{2m+4}\,(u\rho_{\max})^{2m}.\tag{5.6}$$
With x := u², this relation can be rewritten as

$$\sum_{m=0}^{M} s_m\,Q_m^{\mathrm{con}}(x) = \sum_{m=0}^{M} t_m\,x^m,\tag{5.7}$$

where t_m := A_{2m+4} ρ_max^{2m+4}. It is once again straightforward to apply the process presented in Section 3 to convert backwards and forwards between these two representations: one of the recurrences is described by Eq. (5.2), while the other has a simpler form, namely

$$a_n = 0, \qquad b_n = 1, \qquad c_n = 0.\tag{5.8}$$
The fact that two of these coefficients vanish means that terms drop out of Eqs. (3.5)-(3.8), and a slightly optimized form may be implemented for each of these inter-conversions. Of course, the degeneracy of {x^m} means that the conventional representation is prone to round-off failure. The conversion to and from this basis is therefore recommended only for modest numbers of terms (not many more than a dozen or so). Orthogonal bases will hopefully soon be the norm in this field, and there will be little need for such conversions.
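One of the optimized inter-conversions just mentioned might look as follows (Python; the routine name is mine). When basis II is {x^m}, Eqs. (5.8) give h_k^n = 0, f_k^n = b_n^I and g_k^n = a_n^I, so Salzer's recurrence of Eq. (C.6) collapses to just four terms. Exact rational arithmetic is used here only to make the check exact; floats work equally well for modest M:

```python
from fractions import Fraction as F

def qcon_to_monomial(s):
    """Coefficients t_m of Eq. (5.7) from the s_m of Eq. (5.1), via the
    reduced form of Eq. (C.6) for basis II = {x^m}."""
    a = lambda n: F(-(2*n + 5) * (n*n + 5*n + 10), (n + 1) * (n + 2) * (n + 5))
    b = lambda n: F(2 * (n + 3) * (2*n + 5), (n + 1) * (n + 5))
    c = lambda n: F(n * (n + 3) * (n + 4), (n + 1) * (n + 2) * (n + 5))
    M = len(s) - 1
    # alpha[n][k] holds alpha_k^n; rows n = M+1, M+2 stay identically zero.
    alpha = [[F(0)] * (M + 1) for _ in range(M + 3)]
    for n in range(M, -1, -1):
        for k in range(M - n + 1):
            t = F(s[n]) if k == 0 else F(0)
            if k > 0:
                t += b(n) * alpha[n + 1][k - 1]        # f_k^n term
            t += a(n) * alpha[n + 1][k]                # g_k^n term
            t -= c(n + 1) * alpha[n + 2][k]            # tail term
            alpha[n][k] = t
    return [alpha[0][m] for m in range(M + 1)]

# Check: Q_1^con(x) = 6x - 5 and Q_2^con(x) = Z_2^4(x) = 15 - 42x + 28x^2.
```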

6. Concluding remarks

Four basic processes are described in this work, namely the evaluation of (i) a set of orthogonal polynomials, (ii) a linear combination of them, (iii) a linear combination of their derivatives, and (iv) changing from one such basis to another. These regularly enter optics, but much of the related work was focused on explicit expressions for the polynomials. It has not been widely appreciated that, although they are exact, these expressions fail numerically when the number of terms grows much beyond ten or so. Further, these decompositions also tend to become computationally intensive as the number of terms grows. Robust and efficient alternatives were presented here in each case. None of these alternatives for (ii)-(iv) require the evaluation of a single orthogonal polynomial; they all operate directly upon the coefficients of the expansion and lead to simple code that avoids round-off problems.

Although these methods were introduced decades ago, it appears that they were ahead of their time. They now open the door to benefits delivered by larger numbers of terms in various contexts including modeling based on Gaussian-beam-mode decompositions, various applications of Zernike polynomials, or optical design based on the methods of Section 5. The two-step induction-based process that was introduced in the Appendices clarifies how these methods can be generalized to other contexts. In my view, the work of Clenshaw, Smith, and Salzer certainly deserves to be better appreciated in optics, and more widely known in general. Many of us can then benefit significantly by having faster, more versatile code that does not suffer from catastrophic round-off failure and can proceed to arbitrary orders.

Appendix A: Directly evaluating linear combinations

The three-term recurrence relations that underpin the key results in this work can be expressed as

$$P_{n+1} = u_n P_n - v_n P_{n-1},\tag{A.1}$$

where explicit expressions are in hand for u_n and v_n. Once P_0 and P_1 are specified, this relation can be used to generate those of higher order. Clenshaw [10] pointed out that Eq. (A.1) leads to a fast, robust way to evaluate any sum of the form

$$S := \sum_{m=0}^{M} s_m P_m.\tag{A.2}$$
The idea is to avoid the evaluation of the P_m elements by progressively eliminating them from the top end of the sum. The three-term recurrence in Eq. (A.1) means that the topmost two terms are therefore progressively modified. After eliminating P_M, P_{M−1}, …, P_{n+2}, where n < M, the result can be expressed as
$$S = \alpha_{n+1} P_{n+1} + \beta_{n+1} P_n + \sum_{m=0}^{n-1} s_m P_m,\tag{A.3}$$

for some α_{n+1} and β_{n+1}. It follows trivially from Eqs. (A.1) and (A.3) that

$$S = \alpha_{n+1}(u_n P_n - v_n P_{n-1}) + \beta_{n+1} P_n + \sum_{m=0}^{n-1} s_m P_m = (u_n \alpha_{n+1} + \beta_{n+1}) P_n + (s_{n-1} - v_n \alpha_{n+1}) P_{n-1} + \sum_{m=0}^{n-2} s_m P_m.\tag{A.4}$$

Upon replacing n in Eq. (A.3) with n−1 and comparing the result with Eq. (A.4), it is then evident that

$$\alpha_n = u_n \alpha_{n+1} + \beta_{n+1},\tag{A.5}$$
$$\beta_n = s_{n-1} - v_n \alpha_{n+1}.\tag{A.6}$$

These two relations hold the key to the efficient evaluation of Eq. (A.2) since taking n = 0 in Eq. (A.3) reveals that

$$S = \alpha_1 P_1 + \beta_1 P_0.\tag{A.7}$$
By starting with α_{M+2} = β_{M+2} = 0, Eqs. (A.5) and (A.6) yield α_n and β_n for all n ≤ M+1, hence S can then be determined simply with Eq. (A.7).

Things can be simplified a little further by using Eqs. (A.6) and (A.5), respectively, to eliminate the β’s from Eqs. (A.5) and (A.7):

$$\alpha_n = s_n + u_n \alpha_{n+1} - v_{n+1} \alpha_{n+2},\tag{A.8}$$
$$S = (\alpha_0 - u_0 \alpha_1) P_0 + \alpha_1 P_1.\tag{A.9}$$

Since Eq. (A.5) establishes that α_{M+1} = 0, start Eq. (A.8) with α_{M+2} = α_{M+1} = 0 and work downwards to determine α_1 and, ultimately, α_0. The value of the sum in Eq. (A.2) then follows from Eq. (A.9). Keep in mind that P_0 and P_1 are specified at the outset. Although, as pointed out in [9], it is also possible to proceed from bottom up rather than top down in eliminating the P’s, this is not effective for the optical applications considered here. It is important to note, however, that [9] gives a useful treatment of applications involving trigonometric and Bessel functions as well as a sketch of stability analysis for recurrence-based processes. For many cases of interest, including those considered in Sections 4 and 5, it turns out that v_0 = 0 hence P_1 = u_0 P_0, and the conventional normalization is P_0 ≡ 1. As a result, Eq. (A.9) then reduces to an even more beautiful result:

$$S = \alpha_0.\tag{A.10}$$

Appendix B: Directly evaluating derivatives of linear combinations

When P_m in Eq. (A.2) is a polynomial of order m in x and each s_m is independent of x, Smith [11] presented a scheme to compute the derivative of that sum with respect to x. With primes denoting derivatives, we now seek

$$S' = \sum_{m=1}^{M} s_m P'_m.\tag{B.1}$$

(Note that P′_0 = 0, so it was dropped.) In this case, u_n of Eq. (A.1) is a linear function, say

$$u_n = a_n + b_n x,\tag{B.2}$$

and α_n of Appendix A is readily seen to be a polynomial of order M−n for 0 ≤ n ≤ M. Since v_n is independent of x, differentiating Eq. (A.8) now leads directly to

$$\alpha'_n = b_n\,\alpha_{n+1} + u_n\,\alpha'_{n+1} - v_{n+1}\,\alpha'_{n+2}.\tag{B.3}$$

This can be initialized with α′_{M+1} = α′_M = 0, and the result that we seek is then simply

$$S' = \alpha'_0.\tag{B.4}$$

That is, assuming that S was evaluated beforehand (so that the α_{n+1} values are in hand), S′ can be evaluated just as robustly and with much the same number of arithmetic operations.

Of course, if v_0 ≠ 0, it is the derivative of Eq. (A.9) that yields the desired end result, but such cases aren’t needed in this work. What is more, even when u_n and v_n are more general functions of x, it is straightforward to use the derivative of Eq. (A.8) to get a slight generalization of Eq. (B.3). For the most common case laid out above, it is also clear that higher derivatives can be determined in much the same way: if a superscript in parentheses is used to denote higher derivatives, it follows trivially from Eq. (A.8) that

$$\alpha_n^{(j)} = j\,b_n\,\alpha_{n+1}^{(j-1)} + u_n\,\alpha_{n+1}^{(j)} - v_{n+1}\,\alpha_{n+2}^{(j)}.\tag{B.5}$$

This is initialized by α^{(j)}_{M+2−j} = α^{(j)}_{M+1−j} = 0, and the desired higher derivative for our cases is then simply α_0^{(j)}:

$$S^{(j)} = \sum_{m=j}^{M} s_m P_m^{(j)} = \alpha_0^{(j)}.\tag{B.6}$$
Remarkably, nothing more complex than Eq. (B.5) is needed to robustly determine derivatives of any order.

Appendix C: Directly changing to a new basis

The inter-conversion considered in Sec. 3 can be carried out by, much as in Appendix A, progressively eliminating P_m^I from the top end of the sum in Eq. (3.1). In this case, however, α_n is to be expressed as a linear combination of {P_m^II} rather than as the nested polynomials of Appendix A. The intermediate result that replaces Eq. (A.3) is written as

$$S = \left(\sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{\mathrm{II}}\right) P_{n+1}^{\mathrm{I}} + \left(\sum_{k=0}^{M-n} \beta_k^{n+1} P_k^{\mathrm{II}}\right) P_n^{\mathrm{I}} + \sum_{m=0}^{n-1} s_m^{\mathrm{I}} P_m^{\mathrm{I}},\tag{C.1}$$
where the α’s and β’s are now just constant coefficients. By using Eq. (3.3a), the first term on the right-hand side of Eq. (C.1) can be re-expressed as
$$\left(\sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{\mathrm{II}}\right) P_{n+1}^{\mathrm{I}} = \left(\sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{\mathrm{II}}\right)\left[(a_n^{\mathrm{I}} + b_n^{\mathrm{I}} x) P_n^{\mathrm{I}} - c_n^{\mathrm{I}} P_{n-1}^{\mathrm{I}}\right] = \left(\sum_{k=0}^{M-n-1} \alpha_k^{n+1}\, x P_k^{\mathrm{II}}\right) b_n^{\mathrm{I}} P_n^{\mathrm{I}} + \left(\sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{\mathrm{II}}\right)\left[a_n^{\mathrm{I}} P_n^{\mathrm{I}} - c_n^{\mathrm{I}} P_{n-1}^{\mathrm{I}}\right].\tag{C.2}$$
In turn, the first term on the last line of Eq. (C.2) can be re-expressed upon using Eq. (3.3b) to see that
$$x P_k^{\mathrm{II}} = \left[P_{k+1}^{\mathrm{II}} - a_k^{\mathrm{II}} P_k^{\mathrm{II}} + c_k^{\mathrm{II}} P_{k-1}^{\mathrm{II}}\right]/b_k^{\mathrm{II}}.\tag{C.3}$$
Combining Eqs. (C.2) and (C.3) with Eq. (C.1) gives

$$S = \left(\sum_{k=0}^{M-n-1} \alpha_k^{n+1}\left[P_{k+1}^{\mathrm{II}} - a_k^{\mathrm{II}} P_k^{\mathrm{II}} + c_k^{\mathrm{II}} P_{k-1}^{\mathrm{II}}\right]/b_k^{\mathrm{II}}\right) b_n^{\mathrm{I}} P_n^{\mathrm{I}} + \left(\sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{\mathrm{II}}\right)\left[a_n^{\mathrm{I}} P_n^{\mathrm{I}} - c_n^{\mathrm{I}} P_{n-1}^{\mathrm{I}}\right] + \left(\sum_{k=0}^{M-n} \beta_k^{n+1} P_k^{\mathrm{II}}\right) P_n^{\mathrm{I}} + \sum_{m=0}^{n-1} s_m^{\mathrm{I}} P_m^{\mathrm{I}}.\tag{C.4}$$

Upon gathering the terms of Eq. (C.4) that involve P_{n−1}^I and, separately, those that involve P_n^I, the result can be equated to Eq. (C.1) with n replaced by n−1. The first relation that emerges is simply

$$\beta_k^n = s_{n-1}^{\mathrm{I}}\,\delta_{k0} - c_n^{\mathrm{I}}\,\alpha_k^{n+1},\tag{C.5}$$

where δ_{k0} is zero for all k except δ_{00} = 1. As done for the β’s in Appendix A, Eq. (C.5) makes it easy to eliminate β_k^n from the second relation that emerges in order to determine an analogue for Eq. (A.8), namely a recurrence relation for α_k^n:
$$\alpha_k^n = s_n^{\mathrm{I}}\,\delta_{k0} + \left(\frac{b_n^{\mathrm{I}}}{b_{k-1}^{\mathrm{II}}}\right)\alpha_{k-1}^{n+1} + \left(a_n^{\mathrm{I}} - \frac{b_n^{\mathrm{I}} a_k^{\mathrm{II}}}{b_k^{\mathrm{II}}}\right)\alpha_k^{n+1} + \left(\frac{b_n^{\mathrm{I}}\,c_{k+1}^{\mathrm{II}}}{b_{k+1}^{\mathrm{II}}}\right)\alpha_{k+1}^{n+1} - c_{n+1}^{\mathrm{I}}\,\alpha_k^{n+2}.\tag{C.6}$$
Notice that the factors inside the parentheses in Eq. (C.6) involve just the constants from the recurrence relations in Eqs. (3.3). For the cases of interest in the body, we always have c_0^I = c_0^II = 0 and P_0^I ≡ P_0^II ≡ 1. To appreciate the power of Eq. (C.6), notice that just as Eq. (A.3) becomes Eq. (A.10) upon taking n = −1, Eq. (C.1) leads to
$$S = \sum_{k=0}^{M} \alpha_k^0 P_k^{\mathrm{II}}.\tag{C.7}$$
Upon comparing Eqs. (3.2) and (C.7), it follows that the coefficients sought here are just s_m^II = α_m^0. It follows from Eq. (C.1) that α_k^n = 0 whenever either k < 0 or k > M−n so, for example, α_k^{M+2} = α_k^{M+1} = 0 for all k. With this as a given, Eq. (C.6) can be used, for fixed n, by running over 0 ≤ k ≤ M−n. The process starts at n = M and works down to n = 0 in order to yield the desired end result.

References and links

1. M. Born and E. Wolf, Principles of Optics (Cambridge, 1999), see Sec. 9.2 and Appendix VII.

2. A. E. Siegman, Lasers (University Science Books, 1986), Chaps. 16–17.

3. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions (Dover, 1978), Chap. 22.

4. A. B. Bhatia, E. Wolf, and M. Born, “On the circle polynomials of Zernike and related orthogonal sets,” Proc. Camb. Philos. Soc. 50(01), 40–48 (1954).

5. D. R. Myrick, “A generalization of the radial polynomials of F. Zernike,” J. Soc. Ind. Appl. Math. 14(3), 476–492 (1966).

6. E. C. Kintner, “On the mathematical properties of the Zernike polynomials,” Opt. Acta (Lond.) 23, 679–680 (1976).

7. R. Barakat, “Optimum balanced wave-front aberrations for radially symmetric amplitude distributions: generalizations of Zernike polynomials,” J. Opt. Soc. Am. 70(6), 739–742 (1980).

8. C.-W. Chong, P. Raveendran, and R. Mukundan, “A comparative analysis of algorithms for fast computation of Zernike moments,” Pattern Recognit. 36(3), 731–742 (2003).

9. W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical Recipes: The Art of Scientific Computing (Cambridge University Press, 1992), Section 5.5.

10. C. W. Clenshaw, “A Note on the Summation of Chebyshev Series,” Math. Tables Other Aids Comput. 9, 118–120 (1955), http://www.jstor.org/stable/2002068.

11. F. J. Smith, “An Algorithm for Summing Orthogonal Polynomial Series and their Derivatives with Applications to Curve-Fitting and Interpolation,” Math. Comput. 19(89), 33–36 (1965).

12. H. E. Salzer, “A Recurrence Scheme for Converting from One Orthogonal Expansion into Another,” Commun. ACM 16(11), 705–707 (1973).

13. E. H. Doha, “On the coefficients of differentiated expansions and derivatives of Jacobi polynomials,” J. Phys. Math. Gen. 35(15), 3467–3478 (2002).

14. R. Barrio and J. M. Peña, “Numerical evaluation of the p’th derivative of Jacobi series,” Appl. Numer. Math. 43(4), 335–357 (2002).

15. B. Y. Ting and Y. L. Luke, “Conversion of Polynomials between Different Polynomial Bases,” IMA J. Numer. Anal. 1(2), 229–234 (1981).

16. K. A. Goldberg and K. Geary, “Wave-front measurement errors from restricted concentric subdomains,” J. Opt. Soc. Am. A 18(9), 2146–2152 (2001).

17. J. Schwiegerling, “Scaling Zernike expansion coefficients to different pupil sizes,” J. Opt. Soc. Am. A 19(10), 1937–1945 (2002).

18. C. E. Campbell, “Matrix method to find a new set of Zernike coefficients from an original set when the aperture radius is changed,” J. Opt. Soc. Am. A 20(2), 209–217 (2003).

19. G. M. Dai, “Scaling Zernike expansion coefficients to smaller pupil sizes: a simpler formula,” J. Opt. Soc. Am. A 23(3), 539–543 (2006).

20. H. Shu, L. Luo, G. Han, and J.-L. Coatrieux, “General method to derive the relationship between two sets of Zernike coefficients corresponding to different aperture sizes,” J. Opt. Soc. Am. A 23(8), 1960–1966 (2006).

21. A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike coefficients of scaled pupils,” J. Microlith. Microfab. Microsyst. 5(3), 030501 (2006).

22. G. W. Forbes, “Shape specification for axially symmetric optical surfaces,” Opt. Express 15(8), 5218–5226 (2007), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-15-8-5218.


Figures (2)

Fig. 1. The red curve at left is a plot of a high-order rotationally symmetric Zernike over just 0.95 < x < 1; the green curve is \(10^{14}\) times the error when this function is evaluated by using Eq. (1.6), so there are typically 14 significant digits in this double-precision result. The curve on the right is the catastrophic error when Eq. (1.7) is used at double precision. Although exact to within round-off, the explicit polynomial gives values with the wrong order of magnitude and chaotic sign. This failure grows exponentially with the polynomial’s order.

Fig. 2. A sample of the members of the basis used in Eq. (5.1).

Equations (52)


\[ \langle P_m, P_n \rangle = 0, \quad \text{when } m \ne n, \]
\[ S(x) = \sum_m s_m P_m(x) \]
\[ s_m = \langle S, P_m \rangle. \]
\[ \langle S, S \rangle = \sum_m s_m^2. \]
\[ x P_n(x) = q\, P_{n+1}(x) + r\, P_n(x) + s\, P_{n-1}(x), \]
\[ P_{n+1}(x) = (a_n + b_n x) P_n(x) - c_n P_{n-1}(x). \]
\[ Z_n^m(x) = \sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j} \binom{m+n+j}{n} x^j = \sum_{j=0}^{n} (-1)^{n-j} \frac{(m+n+j)!}{j!\,(m+j)!\,(n-j)!}\, x^j. \]
\[ Z_{10}^0(x) = 1 - 110x + 2970x^2 - 34{,}320x^3 + 210{,}210x^4 - 756{,}756x^5 + 1{,}681{,}680x^6 - 2{,}333{,}760x^7 + 1{,}969{,}110x^8 - 923{,}780x^9 + 184{,}756x^{10}. \]
\[ S(x) := \sum_{m=0}^{M} s_m P_m(x). \]
\[ \alpha_n = s_n + (a_n + b_n x)\,\alpha_{n+1} - c_{n+1}\,\alpha_{n+2}, \]
\[ \alpha_n^{(j)} = j\, b_n\, \alpha_{n+1}^{(j-1)} + (a_n + b_n x)\,\alpha_{n+1}^{(j)} - c_{n+1}\,\alpha_{n+2}^{(j)} \]
\[ S = \sum_{m=0}^{M} s_m^{I}\, P_m^{I}, \]
\[ S = \sum_{m=0}^{M} s_m^{II}\, P_m^{II}. \]
\[ P_{n+1}^{I} = (a_n^{I} + b_n^{I} x)\, P_n^{I} - c_n^{I}\, P_{n-1}^{I}, \quad P_{n+1}^{II} = (a_n^{II} + b_n^{II} x)\, P_n^{II} - c_n^{II}\, P_{n-1}^{II}. \]
\[ f_k^n := b_n^{I} / b_{k-1}^{II}, \quad g_k^n := a_n^{I} - b_n^{I}\, a_k^{II} / b_k^{II}, \quad h_k^n := b_n^{I}\, c_{k+1}^{II} / b_{k+1}^{II}. \]
\[ \alpha_0^n = s_n^{I} + g_0^n\, \alpha_0^{n+1} + h_0^n\, \alpha_1^{n+1} - c_{n+1}^{I}\, \alpha_0^{n+2}. \]
\[ \alpha_k^n = f_k^n\, \alpha_{k-1}^{n+1} + g_k^n\, \alpha_k^{n+1} + h_k^n\, \alpha_{k+1}^{n+1} - c_{n+1}^{I}\, \alpha_k^{n+2}. \]
\[ \alpha_{M-n-1}^n = f_{M-n-1}^n\, \alpha_{M-n-2}^{n+1} + g_{M-n-1}^n\, \alpha_{M-n-1}^{n+1}, \]
\[ \alpha_{M-n}^n = f_{M-n}^n\, \alpha_{M-n-1}^{n+1}. \]
\[ a_n = -\frac{(s+1)\left[(s-n)^2 + n^2 + s\right]}{(n+1)(s-n+1)\,s}, \quad b_n = \frac{(s+2)(s+1)}{(n+1)(s-n+1)}, \quad c_n = \frac{(s+2)(s-n)\,n}{(n+1)(s-n+1)\,s}. \]
\[ \sum_{n=0}^{M} s_n\, P_n^{I}(x) = \sum_{n=0}^{M} (t_n / \varepsilon^m)\, P_n^{II}(x/\varepsilon^2), \]
\[ z(\rho) = \frac{c\rho^2}{1 + \sqrt{1 - (1+\kappa)c^2\rho^2}} + u^4 \sum_{m=0}^{M} s_m\, Q_m^{\mathrm{con}}(u^2), \]
\[ a_n = -\frac{(2n+5)(n^2+5n+10)}{(n+1)(n+2)(n+5)}, \quad b_n = \frac{2(n+3)(2n+5)}{(n+1)(n+5)}, \quad c_n = \frac{n(n+3)(n+4)}{(n+1)(n+2)(n+5)}. \]
\[ z(\rho) = c\rho^2/(1+\phi) + u^4 S(u^2), \]
\[ z'(\rho) = c\rho/\phi + \frac{2u^3}{\rho_{\max}}\left[2S(u^2) + u^2 S'(u^2)\right], \]
\[ z''(\rho) = c/\phi^3 + \frac{2u^2}{\rho_{\max}^2}\left[6S(u^2) + 9u^2 S'(u^2) + 2u^4 S''(u^2)\right]. \]
\[ u^4 \sum_{m=0}^{M} s_m\, Q_m^{\mathrm{con}}(u^2) = \sum_{m=0}^{M} A_{2m+4}\, \rho^{2m+4} = (u\rho_{\max})^4 \sum_{m=0}^{M} A_{2m+4}\, (u\rho_{\max})^{2m}. \]
\[ \sum_{m=0}^{M} s_m\, Q_m^{\mathrm{con}}(x) = \sum_{m=0}^{M} t_m x^m, \]
\[ a_n = 0, \quad b_n = 1, \quad c_n = 0. \]
\[ P_{n+1} = u_n P_n - v_n P_{n-1}, \]
\[ S := \sum_{m=0}^{M} s_m P_m. \]
\[ S = \alpha_{n+1} P_{n+1} + \beta_{n+1} P_n + \sum_{m=0}^{n-1} s_m P_m, \]
\[ S = \alpha_{n+1}(u_n P_n - v_n P_{n-1}) + \beta_{n+1} P_n + \sum_{m=0}^{n-1} s_m P_m = (u_n \alpha_{n+1} + \beta_{n+1}) P_n + (s_{n-1} - v_n \alpha_{n+1}) P_{n-1} + \sum_{m=0}^{n-2} s_m P_m. \]
\[ \alpha_n = u_n \alpha_{n+1} + \beta_{n+1}, \]
\[ \beta_n = s_{n-1} - v_n \alpha_{n+1}. \]
\[ S = \alpha_1 P_1 + \beta_1 P_0. \]
\[ \alpha_n = s_n + u_n \alpha_{n+1} - v_{n+1} \alpha_{n+2}, \]
\[ S = (\alpha_0 - u_0 \alpha_1) P_0 + \alpha_1 P_1. \]
\[ S = \alpha_0. \]
\[ S' = \sum_{m=1}^{M} s_m P_m'. \]
\[ u_n = a_n + b_n x, \]
\[ \alpha_n' = b_n \alpha_{n+1} + u_n \alpha_{n+1}' - v_{n+1} \alpha_{n+2}'. \]
\[ S' = \alpha_0'. \]
\[ \alpha_n^{(j)} = j\, b_n\, \alpha_{n+1}^{(j-1)} + u_n \alpha_{n+1}^{(j)} - v_{n+1} \alpha_{n+2}^{(j)}. \]
\[ S^{(j)} = \sum_{m=j}^{M} s_m P_m^{(j)} = \alpha_0^{(j)}. \]
\[ S = \left( \sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{II} \right) P_{n+1}^{I} + \left( \sum_{k=0}^{M-n} \beta_k^{n+1} P_k^{II} \right) P_n^{I} + \sum_{m=0}^{n-1} s_m^{I} P_m^{I}, \]
\[ \left( \sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{II} \right) P_{n+1}^{I} = \left( \sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{II} \right) \left[ (a_n^{I} + b_n^{I} x) P_n^{I} - c_n^{I} P_{n-1}^{I} \right] = \left( \sum_{k=0}^{M-n-1} \alpha_k^{n+1}\, x P_k^{II} \right) b_n^{I} P_n^{I} + \left( \sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{II} \right) \left[ a_n^{I} P_n^{I} - c_n^{I} P_{n-1}^{I} \right]. \]
\[ x P_k^{II} = \left[ P_{k+1}^{II} - a_k^{II} P_k^{II} + c_k^{II} P_{k-1}^{II} \right] / b_k^{II}. \]
\[ S = \left( \sum_{k=0}^{M-n-1} \alpha_k^{n+1} \left[ P_{k+1}^{II} - a_k^{II} P_k^{II} + c_k^{II} P_{k-1}^{II} \right] / b_k^{II} \right) b_n^{I} P_n^{I} + \left( \sum_{k=0}^{M-n-1} \alpha_k^{n+1} P_k^{II} \right) \left[ a_n^{I} P_n^{I} - c_n^{I} P_{n-1}^{I} \right] + \left( \sum_{k=0}^{M-n} \beta_k^{n+1} P_k^{II} \right) P_n^{I} + \sum_{m=0}^{n-1} s_m^{I} P_m^{I}. \]
\[ \beta_k^n = s_{n-1}^{I}\, \delta_{k0} - c_n^{I}\, \alpha_k^{n+1}, \]
\[ \alpha_k^n = s_n^{I}\, \delta_{k0} + \left( b_n^{I} / b_{k-1}^{II} \right) \alpha_{k-1}^{n+1} + \left( a_n^{I} - b_n^{I} a_k^{II} / b_k^{II} \right) \alpha_k^{n+1} + \left( b_n^{I} c_{k+1}^{II} / b_{k+1}^{II} \right) \alpha_{k+1}^{n+1} - c_{n+1}^{I}\, \alpha_k^{n+2}. \]
\[ S = \sum_{k=0}^{M} \alpha_k^0 P_k^{II}. \]
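The basic Clenshaw recurrence listed above, \(\alpha_n = s_n + (a_n + b_n x)\alpha_{n+1} - c_{n+1}\alpha_{n+2}\) with \(S = \alpha_0\), takes only a few lines of code. The sketch below is illustrative (the function name and the Chebyshev test case are mine, not from the paper); it assumes \(P_0 = 1\) and \(P_1 = (a_0 + b_0 x)P_0\), which is what makes \(S = \alpha_0\) hold.

```python
def clenshaw(s, a, b, c, x):
    """Evaluate S(x) = sum_m s[m] P_m(x) without forming any P_m explicitly.

    The polynomials are assumed to satisfy
        P_{n+1}(x) = (a[n] + b[n] x) P_n(x) - c[n] P_{n-1}(x),
    with P_0 = 1 and P_1 = (a[0] + b[0] x) P_0.
    """
    M = len(s) - 1
    alpha2 = 0.0  # alpha_{M+2} = 0
    alpha1 = 0.0  # alpha_{M+1} = 0
    for n in range(M, -1, -1):
        # c_{n+1} is never needed beyond n = M - 1, where alpha2 is still zero.
        cn1 = c[n + 1] if n + 1 <= M else 0.0
        alpha0 = s[n] + (a[n] + b[n] * x) * alpha1 - cn1 * alpha2
        alpha2, alpha1 = alpha1, alpha0
    return alpha1  # this is alpha_0, i.e. S(x)

# Example: Chebyshev series T0 + 2 T1 + 3 T2 (a_n = 0; b_0 = 1, b_n = 2; c_n = 1).
val = clenshaw([1.0, 2.0, 3.0], a=[0, 0, 0], b=[1, 2, 2], c=[0, 1, 1], x=0.5)
# val agrees with 1 + 2x + 3(2x^2 - 1) at x = 0.5, i.e. 0.5
```

Each term of the sum costs one multiply-add pair beyond the recurrence coefficients themselves, and no individual polynomial value is ever formed, which is the source of both the speed and the robustness emphasized in the text.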