Shape specification for axially symmetric optical surfaces

G. W. Forbes

doi:10.1364/OE.15.005218

1. Introduction

Characterizing the desired shape of individual mirrors or lens surfaces is central to activities such as optical design and fabrication. In essence, this process is pure geometry. After agreeing on certain conventions, however, the problem becomes a numerical one that is made ever more challenging by the demanding tolerances of modern precision optics. A variety of different approaches to this task have been adopted within the context of optics, and especially within optical design [1]. Limitations of one of the most widely used options [2] are demonstrated here, and a number of simple improvements are proposed. General geometry and numerics are the key points of interest; details that are specific to only one context, such as design, are suppressed as much as possible. While some comments are offered about how these ideas are expected to be of use for design purposes, the emphasis is on general processes for specifying ideal surface shapes. The results relate to various activities including optical fabrication, testing, and patenting, as well as design.

Since conic sections have special optical properties that are often exploited in system design, it is conventional to express rotationally symmetric surface shapes in terms of their deviation from the sagittal representation —or just the “sag”— of a close-fitting conic, say [2]

z (ρ) = \frac{c ρ^{2}}{(1 + \sqrt{1 - {ε c}^{2} ρ^{2}})} + \sum_{m = 0}^{M} a_{m} ρ^{2 m + 4},

where {z,ρ,ϕ} are standard cylindrical polar coordinates. Here, c is the paraxial curvature of the surface and ε the conic parameter. Only a handful of terms are typically retained in the added polynomial, but it is increasingly common to see this number grow in order to characterize a desired shape with sufficient accuracy. For the purposes of fabrication and testing, this characterization of the surface shape is completed by specification of the aperture size, i.e. a value for ρ _max where Eq. (1) is then valid over 0<ρ<ρ _max.

There are enough degrees of freedom in Eq. (1) to admit independent control of all even powers up to ρ ^2M+4 (and, in fact, via ε to ρ ^2M+6). This representation is therefore sufficiently general to approximate any symmetric shape with arbitrary accuracy provided M is allowed to be large enough. However, the failings of such a representation are well known. For example, if a particular shape, say z = f(ρ), is prescribed, a least-squares fit can be performed to determine the optimal values of {a_m;m = 0,1,..M} once a close-fitting conic section has been identified. It can readily be shown, however, that even when working in terms of the normalized variable u≔/ρ _max and using double-precision arithmetic, the associated Gram matrix [defined below at Eq. (7)] is so ill-conditioned that this simplistic process fails when more than about ten terms are used. When a solution is found for modest values of M, the values of {a_m; m = 0,1,..M} typically alternate in sign and only yield the desired shape through heavy cancellation between the individual terms. This property is all too familiar to the optical design and fabrication community and makes for numerical inefficiency due to problematic round-off errors. In particular, large numbers of digits must be specified. A sample of these difficulties is displayed in Sec. 4, and one of the goals of this work is to push these failings far beyond the ever-growing design space for optics. The proposed solution is to use non-standard orthogonal bases in place of the simplistic additive polynomial of Eq. (1). A useful property of orthogonal decompositions is that the mean square value of the associated superposition is just the sum of the squares of the coefficients. Section 3 contains a particular basis that is tailored to exploit this property to achieve the second goal for this work: facilitating the enforcement of manufacturability constraints during design.

2. Strong aspheres

By choosing an appropriate basis, the ill-conditioning of the Gram matrix discussed in the Introduction can be entirely removed. To see this, rewrite Eq. (1) as

z (ρ) = \frac{c ρ^{2}}{(1 + \sqrt{1 - {ε c}^{2} ρ^{2}})} + D_{con} (\frac{ρ}{ρ_{max}}),

where the departure from a conic, written as D _con (u), is defined by

D_{con} (u) ≔ u^{4} \sum_{m = 0}^{M} a_{m} Q_{m}^{con} {(u)}^{2} .

In place of the standard choice of simple monomials, namely Q ^mon _m(x) = x_m [which gives Eq. (1)], we can choose {Q ^mon _m(x); m = 0,1,…M} to be an orthogonal set. Consider fitting the surface described by z = f(ρ) where f may be specified via a numerical look-up table, or whatever. After choosing a close-fitting conic (which can be refined by using a process that invokes the procedure discussed here) the coefficients in the additive polynomial can be chosen to minimize the rms sag error. That is, if the difference between f(ρ) and the conic component of Eq. (2) is written as g(ρ), the minimization of

E^{2} (a_{0}, a_{1}, \dots a_{M}) ≔ 〈 {[g ({u ρ}_{max}) - u^{4} \sum_{m = 0}^{M} a_{m} Q_{m}^{con} {(u)}^{2}]}^{2} 〉 .

becomes a standard least-squares process. The angle brackets in Eq. (4) denote a weighted average that may be defined by a sum or an integral of whatever form we please. In this work, for any function h, it is convenient at first to take

〈 h (u) 〉 ≔ \frac{\int_{0}^{1} h (u) w (u^{2}) u du}{\int_{0}^{1} w (v^{2}) v dv} .

Choosing the weight function to be constant, i.e. W(u ²) = 1, means that equal areas in the aperture carry equal weight, and this is as good as any for the moment.

Setting the gradient of E ² to zero leads to the expression for the optimal coefficients:

\sum_{n = 0}^{M} G_{m n} a_{n} = b_{m},

where b_m ≔〈g(u ρ _max)u ⁴ Q ^con _m(u ²)〉 and the elements of the Gram matrix are given by

G_{m n} ≔ 〈 u^{8} Q_{m}^{con} (u^{2}) Q_{n}^{con} (u^{2}) 〉 = \int_{0}^{1} Q_{m}^{con} (x) Q_{n}^{con} (x) x^{4} d x .

It follows that this matrix is diagonal if the basis is chosen to be a particular case of the Jacobi polynomials [3]. In particular,

Q_{m}^{con} (x) ≔ P_{m}^{(0,4)} (2 x - 1) .

For reference, the first six members can be written as

Q_{0}^{con} (x) = 1, Q_{1}^{con} (x) = - (5 - 6 x), Q_{2}^{con} (x) = 15 - 14 x (3 - 2 x),

As can be seen in Fig. 1, these basis elements are scaled to take a maximum value of unity. Even when the mean square error of Eq. (4) is computed as a finite sum, the basis of Eq. (8) is still a good choice. Although the Gram matrix may then not be precisely diagonal, it is nearly so and remains well conditioned. Some benefits of this basis are outlined in Sec. 4.

Fig. 1. Plots of the orthogonal basis elements for m = 0, 1, 2,… 5.

Download Full Size | PDF

3. Mild aspheres

Due to considerations related to both fabrication and testing, aspheric surfaces are generally more cost effective if their deviation from a sphere is constrained. In terms of metrology, it is appropriate to limit the transverse slope of the deviation between the surface and its best-fit sphere when measured along the surface’s normal. Because this slope translates directly into fringe density, a full-aperture interferometric test can be performed provided this slope is constrained so that the interferogram remains sufficiently below the Nyquist sampling limit. As a bonus, such a constraint also reduces both the magnitude of the aspheric departure and the rate of change of the local principal radii of curvature over the aperture. These reductions directly enhance manufacturability. Because, as pointed out in the Introduction, orthogonal bases can offer immediate access to mean square values, it can be effective to work in terms of a representation that is coupled to the mean square slope of a surface’s aspheric departure. As described in the following section, the capability derived in the next few paragraphs can also be used as part of an effective scheme to limit the maximum absolute slope of the departure.

One simple definition of the best-fitting sphere for a rotationally symmetric surface is to choose the sphere that is coincident with the surface at its axial point and around its perimeter. Various minor modifications from this best fit are possible, e.g. to minimize the maximum absolute deviation, but the distinctions are relatively minor for mild aspheres. If this simple option is accepted and the sag at the edge of the clear aperture is written as f(ρ _max), it follows that the curvature of the best-fit sphere, say c _bfs, is given by c _bfs = 2f(ρ _max)/[ρ ² _max + f(ρ _max)²]. In place of Eq. (1), it is now more effective to express the sag as

z (ρ) = \frac{c_{bfs} ρ^{2}}{(1+ \sqrt{1 - c_{bfs}^{2} ρ^{2}})} + D_{bfs} (\frac{ρ}{ρ_{max}}),

where the sag-based departure from the best-fit sphere, written as D _bfs(u), is defined by

D_{bfs} (u) ≔ \frac{u^{2} (1 - u^{2})}{\sqrt{1 - c_{bfs}^{2} ρ_{max}^{2} u^{2}}} \sum_{m = 0}^{M} a_{m} Q_{m}^{bfs} (u^{2}) .

So that the coefficient values have no impact on the best-fitting sphere, D _bfs(u) is explicitly forced to vanish at the aperture’s centre and edge, i.e. at u = 0 and u = 1. Also, so that the axial curvature may be different fromc _bfs, the pre-factor in Eq. (11) reduces to u ₂ for u≪1 [unlike the form used in Eq. (3) where the pre-factor is u ⁴].

To first order, D _bfs(u) can be converted to a deviation measured along the surface normal by multiplying it by the cosine of the angle between the optical axis and the local normal to the best-fit sphere. This cosine factor is precisely the square root that appears in the denominator of Eq. (11), and this is the justification for the square root’s presence. It is now straightforward to choose Q ^bfs _m (x) to be polynomials of order m and to configure them so that the weighted rms slope of the departure along the normal is just the sum of the squares of a_m. To achieve this, the elements of the normal-departure slope are written as

S_{m} (u) ≔ \frac{d}{d u} {u^{2} (1 - u^{2}) Q_{m}^{bfs} (u^{2})},

and then Q ^bfs _m (x) is chosen to make S_m (u) orthogonal. In particular,

〈 S_{m} (u) S_{n} (u) 〉 = δ_{mn},

where the angle brackets were defined at Eq. (5) and δ_mn is the Kronecker delta. By starting with Q ^bfs ₀ (x) as a constant that is normalized in accordance with Eq. (13), higher-order members can be determined progressively by requiring that each new polynomial be orthogonal to all the lower members and normalized in keeping with Eq. (13).

To constrain the tendency of orthogonal polynomials to shoot off at the edge of their domain, it is sometimes convenient to choose a weight function that emphasizes the behaviour at the ends. If we choose W(x) = [x(1-x)]^-1/2 in Eq. (5), the first six basis elements are found to be

Q_{0}^{bfs} (x) = 1, Q_{1}^{bfs} (x) = \frac{1}{\sqrt{19}} (13 - 16 x), Q_{2}^{bfs} (x) = \sqrt{\frac{2}{95}} [29 - 4 x (25 - 19 x)],

These are plotted in Fig. 2. Notice that, as shown in Fig. 3, on account of the particular nonuniform weight distribution, the associated slope functions remain aesthetically bounded even at the part's centre and edge. The key property of this representation is that the mean square slope of the normal departure from a best-fit sphere is now evidently given by

〈 {\frac{1}{ρ_{max}} \frac{d}{d u} [u^{2} (1 - u^{2}) \sum_{m = 0}^{M} a_{m} Q_{m}^{bfs} (u^{2})]}^{2} 〉 = 〈 {\frac{1}{ρ_{max}} \sum_{m = 0}^{M} a_{m} S_{m} (u)}^{2} 〉 = \frac{1}{ρ_{max}^{2}} \sum_{m = 0}^{M} a_{m}^{2} .

It is this result that makes this basis well suited to certain design tasks.

Fig. 2. Plots for m = 0, 1, 2,… 5 of the basis elements that are tailored for use when constraining the departure from a best-fitting sphere along its normal.

Download Full Size | PDF

Fig. 3. Plots of the orthogonal slope functions of Eq. (12) for m = 0, 1, 2,… 5.

Download Full Size | PDF

4. Simple applications

A specific example, say a Cartesian oval, can provide elementary numerical checks for implementations of the basis proposed in Sec. 2. Although such an example may be entirely academic, it can also help to clarify some of the benefits associated with this basis. The single refracting surface that perfectly images a source point to its image point can be found by requiring that the summed optical distance along line segments that join the two points via any point on the surface be a constant. An example is shown in Fig. 4 with n = 1.5 and d ₁, = d ₂ = 10mm, and 2ρ _max = 7mm, where n is the refractive index and d ₁, and d ₂ are the distances between the axial point on the surface and the object and image points, respectively. Finding the explicit sag of the surface involves solving a quartic equation. Thankfully, standard mathematical software packages give direct access to the solution for the purposes of the fitting described here. The axial curvature is readily found to be c = -(n/d ₁+ 1/d ₂)/(n -1) = -0.5mm^-1. For the current purposes, it is not important whether the conic parameter, i.e. ε of Eq. (2), is chosen to match the fourth-order expansion of the sag, or to give a fit through the surface at the edge of the aperture. For the latter, in the notation used before Eq. (4), we have ε= [2f(rho; _max)-cρ ² _max]/[cf(ρ _max)²], which gives ε≈0.156284. The residual 60μm of deviation from this fitted conic is plotted in Fig. 5(a).

Fig. 4. The Cartesian oval used as an example. For some purposes, it can be helpful to note that a parametric description as η(θ) follows upon solving a quadratic.

Download Full Size | PDF

For the case M = 7, the fit error associated with either of Eqs. (1) or (2) is plotted in Fig. 5(b). The coefficients for each case are presented in the first and last columns of Table 1. In principle, these two polynomials are identical. When the traditional monomial basis is adopted, however, the coefficient values are highly dependent upon the number of terms used in the fit. Further, because of the ill-conditioned Gram matrix, the accuracy of the fits based on monomials rapidly bottoms out as M is increased. More problematically, the non-empty null space for the Gram matrix means that the coefficients are truly ill-defined: the associated values in the Table are good to only a couple of significant digits because a coordinated change can leave the overall fit effectively untouched. This is clarified in the second column of Table 1 where some sample changes in the coefficients for the monomial basis are given. Although individually exceeding hundreds of thousands of nanometers, these changes combine to give no more than just one nanometer of change in the net sag. [These changes are precisely –Q ^con ₇(x) and, as suggested by Fig. 1, x ² Q ^con ₇(x) has a peak value of unity.] Setting tolerances directly in such a representation is evidently an ugly prospect. The first column exhibits the coefficients’ characteristic alternation in signs and growing magnitude. In fact, the largest coefficient value grows with increasing M so, even if it were possible to solve for a larger number of coefficients, evaluation of the polynomial would involve increasing numerical round-off problems due to cancellation. Notice, for example, in Table 1 that a ₅ is on the scale of millimeters for the traditional basis even though we are fitting just 60μm of aspheric departure. The traditional representation is evidently increasingly inefficient in terms of the number of significant digits required in specifying the coefficient values.

For the orthogonal basis, the Gram matrix is diagonal, and the coefficient values are thus independent of M. The decomposition therefore resembles a familiar spectrum. In contrast to the monomial representation, all of the digits shown in the table for this case are significant, and there are clearly fewer of them. Notice that these coefficient values decay exponentially with m: in this case, successive terms fall by a factor of three or four. (This trend continues out to any number of terms, and the fit is accurate to full machine precision — 15 digits— by M = 25. The key point in this is the striking robustness.) Notice that, because of the decaying coefficient values, fit error plots such as Fig. 5(b) generally resemble the first of the truncated basis elements. If we truncate at M = 4, for example, it is clear that the fit error will resemble the dark blue curve of Fig. 1 and have a peak value of roughly 200nm (see a ₅ of Table 1).

Fig. 5. Plots of (a) the Cartesian oval’s departure from the best-fitting conic, and (b) the error in the associated polynomial fit with M=7.

Download Full Size | PDF

Because G_mn is diagonal, it follows from Eqs. (3) and (7) that the mean square departure between the reference conic and the asphere is just a weighted sum of the squares of the coefficients in the decomposition. While this is a relatively incidental result for this basis, the corresponding property given in Eq. (15) for the basis of Sec. 3 is its defining feature. This property can be exploited in a number of ways within design tools. When aspherizing an all-spherical imaging system, for example, it is straightforward to linearly approximate the impact on the wavefront error from the aspheric terms on each surface. The result involves simple ray inclination factors. Even when averaged over the field, the mean square wavefront error is then a quadratic in the coefficients. Since the mean square slope of the normal deviation from the spherical surfaces follows from the trivial quadratic of Eq. (15), the optimal wavefront error for any prescribed mean square slope can be found by using Lagrange multipliers and solving only linear equations. Various simple strategies can now be used to identify optimal solutions that use only aspheres that can be measured interferometrically. An effective step in this is to constrain the rms slope to be below some fraction, say γ where 0 < γ<1, of Nyquist. When testing in reflection, the Nyquist slope corresponds to a change in the normal aspheric departure between adjacent pixels of λ/4. When the aperture is imaged onto an N × N detector, the constraint on the rms slope follows from Eq. (15):

\sum_{m = 0}^{M} a_{m}^{2} < {(\frac{γ N λ}{8})}^{2} .

This type of result underlies the envisioned applications of the basis introduced in Sec. 3. Notice that, if desired, the maximum absolute slope can be monitored when solving for the appropriate value of the Lagrange multiplier.

Table 1. Fit coefficients (in nanometer units) for two different bases. [To clarify a point made in the text, the central column contains just the monomial expansion of Q ^con ₇(x), c.f. Eq. (9).]

View Table

5. Concluding remarks

The traditional parametrization of optical surface shapes in terms of monomials is famously ill-conditioned. One symptom of this weakness is the wildly varying and precariously balanced values among the associated degrees of freedom. All the problematic features disappear when the basis is orthogonalized, however. As a result, significantly fewer digits are required to characterize a surface to any given accuracy. What is more, the high-order coefficients then independently control just the high-order shape. This property can facilitate the tolerancing of an aspheric surface. In the context of optical design, the coefficients associated with orthogonalized representations also promise to give a more natural coordinate space for optimization of the associated merit function. For example, including extra terms in the asphere delivers the benefits of higher orders with minimal change to the existing coefficients. It is worth noting, however, that in general design processes, the value of ρ _max for each surface must be updated whenever the system modifications that are part of optimization cause the effective (illuminated) aperture to drift significantly from ρ _max.

The polynomials introduced in Sec. 2, namely Q ^con _m (x), are intended for use in a range of contexts including design, fabrication, testing, and even patenting of a variety of aspheres. Because of their generality, numerical details were provided in Sec. 4 for code verification. Efficient algorithms can be implemented to provide a range of related services for ray-tracing and design. For example, as discussed in other contexts [4,5,6,7], it is helpful to be able to provide access to first and second derivatives as well as interconvert bases and change the aperture size for the polynomial decompositions. It is possible to achieve this while working to arbitrary orders with at most comparable operation counts to the code for the monomial basis. The explicit expressions for the polynomials given in Eq. (9) are adequate for exploratory implementations, however.

The introduction of the polynomials described in Sec. 3, namely Q ^bfs _m (x), was motivated by the fact that aspheres can be significantly more cost effective if their departure from a best-fit sphere is sufficiently constrained. In particular, it is then possible to avoid the need for dedicated, expensive null optics as part of their testing. Further, when a part’s local principal curvatures vary slowly enough, the fabrication difficulties associated with non-uniform tool-fit are greatly relieved. While similarly avoiding the weaknesses of working with monomials, Q ^bfs _m (x) retains most of the advantages of Q ^con _m (x) while also facilitating the design of milder aspheres. Notice that the simple design procedures discussed in Sec. 4 do not require access to derivatives, and they change neither aperture sizes nor best-fit curvatures. In this sense, a simple application of Q ^bfs _m (x) requires minimal dedicated effort other than implementing the basis elements themselves, see Eq. (14).

References and links

1. See, for example, the discussion and references in H. Chase, “Optical design with rotationally symmetric NURBS”, SPIE Proceedings 4832, 10–24 (2002) and A. W. Greynolds, “Superconic and subconic surface descriptions in optical design,” Proc. SPIE 4832, 1–9 (2002). Such matters are also treated within the manuals for commercial optical design software. [CrossRef]

2. G. H. Spencer and M. V. R .K. Murty, “General ray-tracing procedure,” J. Opt. Soc. Am. 52, 672–678 (1962), see Eq. (16). [CrossRef]

3. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions (Dover, 1978), see 22.2.1.

4. E. H. Doha, “On the coefficients of differentiated expansions and derivatives of Jacobi polynomials,” J. Phys. A: Math. Gen. 35, 3467–3478 (2002). [CrossRef]

5. E. W. Weisstein, “Jacobi Polynomial” from MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/JacobiPolynomial.html, see esp. Eqs. (10–14).

6. B. Y. Ting and Y. L. Luke, “Conversion of polynomials between different polynomial bases,” IMA J. Numer. Anal. , 1, 229–234 (1981). [CrossRef]

7. A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike coefficients of scaled pupils”, J. Microlithogr., Microfabr., Microsyst. 5, 1–3 (2006). (Note that the Zernikes are also Jacobi polynomials.) [CrossRef]

m	a_m for Eq. (1)	δa_m for Eq. (1)	a_m for Eq. (3)
0	274,224	330	70,858
1	-224,416	-5,544	-57,065
2	256,935	36,036	-9,992
3	-967,466	-120,120	-2,825
4	1,976,119	225,225	-705
5	-2,506,362	-240,240	-195
6	1,686,616	136,136	-54
7	-495,643	-31,824	-16

Shape specification for axially symmetric optical surfaces

Abstract

1. Introduction

2. Strong aspheres

3. Mild aspheres

4. Simple applications

5. Concluding remarks

References and links

Cited By

Figures (5)

Tables (1)

Equations (22)

Optics Express