Optica Publishing Group

Mutation operators in lens system optimization to jump out of local minima

Open Access

Abstract

In lens system design, the Damped Least Squares method is the traditionally used local search method. In this paper we apply mutation operators to control the damping factor in Damped Least Squares. We study the mutation behavior of this method when the algorithm encounters local minima. By using mutation operators to control the damping factor, the proposed method can escape from local minima. The proposed method was successfully applied to design problems. The results indicate that the mutation operators provide an effective and rapid way to jump out of poor local minima.

©2010 Optical Society of America

1. Introduction

The Damped Least Squares method [1] is traditionally used in lens system design. It can effectively find a local minimum from a starting configuration in the merit function landscape. However, optical system designers are often confronted with poor local minima when using the Damped Least Squares method. For decades, various global optimization methods [2–8] have been proposed for this problem. These global methods are indeed powerful enough to make the solution jump out of poor local minima, but global search is time-consuming. The search for more effective and faster methods for lens system design is still in progress. Previous research [9,10] shows that controlling the damping factor of a lens design program can help the designer find alternative system configurations. In this paper, we apply mutation operators to control the damping factor in Damped Least Squares so as to jump out of poor local minima.

2. Theory

We first briefly describe the conventional use of the Damped Least Squares method [11]. The method employs a damping factor to obtain the step vector ΔX that changes the configuration parameter vector X of a lens system as follows:

ΔX = −(AᵀA + PI)⁻¹AᵀF₀,  (1)
X = X₀ + ΔX.  (2)

We note that the system involves an operand space (dimension M) and a variable space (dimension N). A (M × N) is the derivative matrix of the optimization problem, I (N × N) is the identity matrix, F₀ is the vector of operand values to be reduced to zero or to certain target values, X₀ is the configuration parameter vector before the iteration, and P is the damping factor, usually a suitable positive real number. The choice of P determines the effectiveness of the algorithm. For more details, see Ref. [11].
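As an illustration, the update of Eqs. (1) and (2) can be sketched in a few lines of NumPy, assuming the usual sign convention ΔX = −(AᵀA + PI)⁻¹AᵀF₀. The matrix, operand vector, and damping value below are arbitrary stand-ins, not data from the paper:

```python
import numpy as np

def dls_step(A, F0, P):
    """One Damped Least Squares update: solve (A^T A + P I) dX = -A^T F0."""
    N = A.shape[1]
    return -np.linalg.solve(A.T @ A + P * np.eye(N), A.T @ F0)

# Arbitrary illustrative problem: 3 operands, 2 variables.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
F0 = np.array([1.0, 1.0, 1.0])
X0 = np.zeros(2)

X = X0 + dls_step(A, F0, P=0.2)   # Eq. (2): update the configuration vector
```

As expected from the role of the damping factor, a larger P shrinks the step toward zero, while a small positive P approaches the undamped least-squares step.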

In the proposed method, in order to jump out of a local minimum we must reset P when the solution is trapped. From Eq. (1) we know that if AᵀA + PI is not a singular matrix, the step ΔX becomes small and the algorithm converges because of the basins of attraction [12]; the algorithm can then find local minima effectively. However, if AᵀA + PI is close to a singular matrix, the algorithm diverges instead of converging. We can make AᵀA + PI singular as follows:

|AᵀA + PI| = 0.  (3)

We rewrite Eq. (3) as follows:

|AᵀA − P_k I| = 0,  (4)
where P_k is an eigenvalue chosen at random from the eigenvalues of the matrix AᵀA. If we set P to −P_k, AᵀA + PI is a singular matrix; the step ΔX becomes infinite and out of control, and the resulting configuration parameter vector X is usually useless. In this paper, however, we propose a mutation operator based on Self-Adaptive Gaussian mutation and Decreasing-Based Gaussian mutation [13,14], which are traditionally used in evolutionary algorithms to maintain the diversity of the population, to control the damping factor P as follows:
P = −P_k {1 + exp[τ′N(0,1) + τNᵢ(0,1)]}ⁿ [1 + QN(0,1)]ⁿ,  (5)
τ = (√(2√N))⁻¹,  (6)
τ′ = (√(2N))⁻¹,  (7)
where N(0,1) is the standard normal distribution, Nᵢ(0,1) is a new value with distribution N(0,1) that is regenerated for each index i, and i denotes the iteration number. n is a positive integer, and Q is a constant between 0 and 1. In the proposed method, the mutated P of Eq. (5) searches around the center −P_k, and the proposed mutation operator keeps it neither too near to nor too far from −P_k. In this way the matrix AᵀA + PI is not singular, and |AᵀA + PI| is neither too small nor too large. As a result, the mutated P produces a large number of different steps ΔX, resulting in mutated solutions X. The ΔX resulting from the mutated P is larger than that at the trapped local minimum, so the algorithm no longer stays in the local minimum. Instead, the algorithm with the mutated P searches a more extensive part of the variable space beyond the basin, which helps the trapped solution escape from local minima.
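A minimal sketch of the mutated damping factor, assuming the standard evolution-strategy learning rates τ = (√(2√N))⁻¹ and τ′ = (√(2N))⁻¹; the derivative matrix here is an arbitrary stand-in:

```python
import numpy as np

rng = np.random.default_rng(0)

def mutated_damping(A, n=3, Q=0.7):
    """Mutated damping factor P searching around -Pk, in the spirit of Eq. (5)."""
    N = A.shape[1]
    # Pk: a randomly chosen eigenvalue of A^T A (Eq. (4)).
    Pk = rng.choice(np.linalg.eigvalsh(A.T @ A))
    tau = 1.0 / np.sqrt(2.0 * np.sqrt(N))   # assumed per-sample learning rate
    taup = 1.0 / np.sqrt(2.0 * N)           # assumed global learning rate
    g = np.exp(taup * rng.standard_normal() + tau * rng.standard_normal())
    return -Pk * (1.0 + g) ** n * (1.0 + Q * rng.standard_normal()) ** n

A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
P = mutated_damping(A)
```

Because the Gaussian factors are bounded away from 1 on average (the 1 + exp[·] term is always greater than 2 before exponentiation by n), the sampled P is pushed away from −P_k itself while remaining tied to its scale.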

We note that Self-Adaptive Gaussian mutation and Decreasing-Based Gaussian mutation are usually used in the mutation procedure of evolutionary algorithms. Both operators produce a solution around a center. With Decreasing-Based Gaussian mutation, the search is like looking for a solution in a hypersphere centered at the original solution; with Self-Adaptive Gaussian mutation, the search space becomes a hyperellipse. These methods are described in detail in Refs. [13] and [14].

We call the period during which the algorithm employs the mutated P as damping factor the mutation period. The mutation period is applied to search the variable space until it produces solutions better than the trapped solution. The mutated P is neither too near to −P_k, which would make AᵀA + PI singular, nor too far from −P_k, which would result in no obvious mutation taking place.

3. Mutation behavior in Damped Least Squares method

We examine the mutation behavior of the proposed method using a monochromatic doublet optimization problem as an example. The design requirements are an f-number of 3 and a field angle of 1 degree. The optimization parameters are the radii (r1, r2, r3, r4) and thicknesses (d1, d2, d3, d4) shown in Fig. 1. The stop is placed at the first surface. The configuration parameter vector X is (r1, r2, r3, r4, d1, d2, d3, d4).

Fig. 1 An example of a doublet lens.

Mutation operations take place only when the algorithm encounters a local minimum. In the optimization, we first allow the conventional Damped Least Squares method to iterate until it reaches a local minimum. Then we reset the damping factor P according to Eq. (5) and let the algorithm continue to iterate; this is the mutation period mentioned in Section 2. The mutation period is then followed by the former period (using the conventional damping factor). These two periods alternate until a satisfactory final result is produced. In the following experiments, Q is set to 0.7.
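The alternation between the conventional period and the mutation period can be sketched with a toy one-dimensional merit function. This is a generic illustration of the escape scheme, not the paper's lens code: the landscape, step sizes, and trial counts are all invented for the demo:

```python
import numpy as np

rng = np.random.default_rng(1)

def merit(x):
    """Toy multimodal landscape standing in for a lens merit function."""
    return x**2 + 3.0 * np.cos(3.0 * x)

def local_period(x, iters=50, h=1e-4, step=0.02):
    """Stand-in for the conventional DLS period: descend to a local minimum."""
    for _ in range(iters):
        g = (merit(x + h) - merit(x - h)) / (2.0 * h)  # numerical gradient
        x -= step * g
    return x

def mutation_period(x, trials=100, scale=1.0):
    """Stand-in for the mutation period: random jumps, keep the best solution."""
    best = x
    for _ in range(trials):
        cand = x + scale * rng.standard_normal()
        if merit(cand) < merit(best):
            best = cand
    return best

x = 3.5                      # start near a poor basin
for _ in range(3):           # alternate the two periods, as in the text
    x = local_period(x)
    x = mutation_period(x)
x = local_period(x)
```

Plain local descent from the same starting point stalls in a worse basin; the alternation lets the search leave it and settle in a deeper one.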

In the first trial, we set n to 3. Figure 2 shows the value of the merit function at each iteration of the optimization procedure. We allow the algorithm to run for 1200 iterations. The vertical axis is the value of the merit function, and the horizontal axis is the iteration number. During the mutation periods (iterations 600–700, 800–900, and 1000–1100), we reset P according to Eq. (5). In these periods the merit function varies randomly, as shown in Fig. 2(a), because the damping factor is changed randomly by the mutation operators, which in turn changes X so that the search of the variable space can jump out of the local minimum. In Fig. 2(b), the mutation has produced many solutions that are better than the trapped solution. We then choose the best one (the one with the smallest merit function value) as the initial solution for the next search procedure after 700 iterations. Figures 2(c) and 2(d) show the same kind of phenomenon.

Fig. 2 Optimization route (n = 3). (a) Route of 1200 iterations. (b) Route of iterations 500–750. (c) Route of iterations 750–950. (d) Route of iterations 950–1200.

In the second trial, we set n to 13. Figure 3 shows the value of the merit function at each iteration. In this trial, n is larger than in the first one. The algorithm runs for fewer than 1050 iterations but produces a satisfactory result that is better than the target, so it terminates before 1200 iterations.

Fig. 3 Optimization route (n = 13). (a) Route of 1200 iterations. (b) Route of iterations 550–750. (c) Route of iterations 750–950. (d) Route of iterations 950–1050.

In the third trial, we set n to 23. The same phenomenon occurs, as shown in Fig. 4. In the first mutation period the value of the merit function decreases sharply, so fewer iterations are needed to produce a satisfactory result.

Fig. 4 Optimization route (n = 23). (a) Route of 1200 iterations. (b) Route of iterations 500–750. (c) Route of iterations 620–750.

From these three examples we conclude that the mutation operation of Eq. (5) is effective for jumping out of local minima.

In Section 2 we also noted that the mutation operators take effect only when the mutated P is neither too near to nor too far from −P_k. Therefore, in the fourth trial, we make the mutated P very near to −P_k as follows:

P = −P_k {exp[τ′N(0,1) + τNᵢ(0,1)]}ⁿ [QN(0,1)]ⁿ.  (8)

The Self-Adaptive Gaussian mutation exp[τ′N(0,1) + τNᵢ(0,1)] and the Decreasing-Based Gaussian mutation QN(0,1) [13,14] make the mutated solutions search randomly around a center. There is no modification as in Eq. (5), so the mutated factor P in Eq. (8) searches the space only near −P_k (very near with high probability). Here n is set to 1.

Figure 5 shows the value of the merit function at each iteration. We can see that the mutated solutions are very large (bad). In Fig. 5(a) the mutated solutions cannot be seen; after enlarging the vertical axis to 3000, as shown in Fig. 5(b), they become visible. Obviously, the result is rather bad. In this trial P is too near to −P_k, so AᵀA + PI becomes close to a singular matrix, the step ΔX becomes enormous and out of control, and the result is very poor.
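The blow-up seen in this trial is easy to reproduce numerically: setting P almost exactly to −P_k makes AᵀA + PI nearly singular, and the solved step explodes. The small matrix below is an arbitrary stand-in, and 0.2 plays the role of an ordinary damping value:

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
F0 = np.array([1.0, 1.0, 1.0])
M = A.T @ A
Pk = np.linalg.eigvalsh(M)[0]          # smallest eigenvalue of A^T A

def step_norm(P):
    """Size of the DLS step for damping factor P."""
    return np.linalg.norm(np.linalg.solve(M + P * np.eye(2), A.T @ F0))

huge = step_norm(-Pk + 1e-9)   # P almost exactly -Pk: near-singular matrix
tame = step_norm(0.2)          # ordinary positive damping: controlled step
```

The near-singular case yields a step many orders of magnitude larger than the ordinarily damped one, matching the divergence described above.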

Fig. 5 Optimization route when the mutated P is near −P_k. (a) Overview of the optimization route. (b) Overview with an enlarged vertical axis.

In the last trial, we make the mutated P far from −P_k as follows:

P = P₀ {1 + exp[τ′N(0,1) + τNᵢ(0,1)]}ⁿ [1 + QN(0,1)]ⁿ,  (9)
where P₀ is not an eigenvalue and is set to 0.2. This value is similar to that of a common damping factor, chosen to avoid a singular matrix, so the center P₀ is quite far from −P_k. We can hardly find any mutation phenomenon during the iteration process in Fig. 6.

Fig. 6 Optimization route when the mutated P is far from −P_k.

From the above examples we conclude that the mutation operators can serve as an effective way to jump out of poor local minima. However, they are not always effective: the mutated P controlled by the mutation operators must be set neither too near to nor too far from −P_k. The mutation operator proposed in Eq. (5) satisfies this requirement and works well in the optimization.

4. Design examples and result discussions

In Section 3 we analyzed the mutation behavior in Damped Least Squares. We now choose the monochromatic quartet problem to evaluate the capability of the proposed method for automatic lens system design. The statement of the problem is as follows [15]:

Design a 4-element, f/3, 100 mm effective focal length lens of BK7 glass, illuminated by the helium d wavelength. The object is at infinity, the object field covers a 30° full field (15° semi-field angle), and the image field is flat. Constraints on the construction include: only spherical surfaces; no aspherics, GRIN elements, Fresnel lenses, binary elements, holographic optical elements, etc. The minimum glass thickness is 2 mm, but there is no upper limit on the size of the lens. The distortion must be less than 1% and there should be no vignetting; the latter requirement ensures that vignetting cannot be used to improve the edge performance of the lens. No requirement is placed on the location of the stop of the system. The merit function consists of the average of the RMS blur spot for three fields, on-axis, 10.5°, and 15°, weighted equally.

The merit function used in the experiment is constructed according to the above criteria:

F = P_dist + P_img + P_vign + (1/3) Σ_{θ = 0°, 10.5°, 15°} RMS(θ),  (10)
where P_dist, P_img, and P_vign are penalties computed from a set of physical constraints, as presented in Table 1. The initial value of each penalty is 0. The penalty values ensure that the merit function value of an infeasible design is always worse than that of any feasible one. The RMS blur spot size is computed from the variance of the image-plane positions of exact rays traced at the same entrance angle.
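A sketch of Eq. (10) in Python. The penalty update rules and the weight below are illustrative stand-ins — the actual constraint thresholds come from Table 1 — and only the distortion limit (<1%) is taken from the problem statement:

```python
def merit_function(rms_by_field, distortion, img_curvature_viol, vign_viol,
                   weight=1000.0):
    """Eq. (10) sketch: penalty terms plus the mean RMS blur spot over three fields.

    rms_by_field: RMS blur spot sizes (mm) at 0, 10.5, and 15 degrees.
    Each penalty starts at 0 and grows only when its constraint is violated,
    so an infeasible design always scores worse than a feasible one.
    """
    P_dist = weight * max(0.0, distortion - 0.01)      # distortion < 1%
    P_img = weight * max(0.0, img_curvature_viol)      # flat image field
    P_vign = weight * max(0.0, vign_viol)              # no vignetting
    return P_dist + P_img + P_vign + sum(rms_by_field) / 3.0

# A feasible toy design: penalties are zero, merit is the mean RMS spot.
F = merit_function([0.004, 0.006, 0.008], distortion=0.005,
                   img_curvature_viol=0.0, vign_viol=0.0)
```

With all constraints satisfied the merit reduces to the equally weighted average of the three RMS spot sizes; any violation adds a large positive penalty on top of it.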

Table 1. Physical constraints for the monochromatic quartet.

Table 2 presents the configuration parameters of the initial rough design. We first optimized with the conventional Damped Least Squares method and then with the proposed method. Figure 7 presents the design found with the proposed method, and Table 3 gives its configuration parameters. Similarly, Fig. 8 presents the design found with the conventional Damped Least Squares method, and Table 4 gives its configuration parameters. From Table 5 we can see that the proposed method produced an average RMS blur spot of 0.00567 mm, much better than the 0.01568 mm obtained by the conventional Damped Least Squares method. Comparing the distortion, the solution found with the proposed method satisfies the specification (1%), while the solution found with the Damped Least Squares method does not. These results indicate that the solution found with the Damped Least Squares method is trapped in a poor local minimum. Damped Least Squares is a local search strategy and cannot produce a better solution because of the local minimum. When the proposed method is confronted with a local minimum, however, the damping factor P is changed by the mutation operators. This makes the algorithm search a more extensive part of the variable space, which helps the trapped solution jump out of the local minimum successfully.

Table 2. Configuration parameters of the initial rough design.

Fig. 7 The design found with the proposed method.

Table 3. Configuration parameters of the design found with the proposed method.

Fig. 8 The design found with the Damped Least Squares method.

Table 4. Configuration parameters of the design found with the Damped Least Squares method.

Table 5. Comparison between the solution found with Damped Least Squares and the solution found with the proposed method.

We also applied an evolutionary algorithm to the same design problem, starting from the rough design shown in Table 2. Figure 9 presents the solution, and Table 6 gives its configuration parameters. Table 7 compares the solution of the proposed method with that of the standard evolutionary algorithm. The proposed method clearly produces the final solution much faster than the evolutionary algorithm, although the two methods ultimately find the same kind of solution.

Fig. 9 The design found with the evolutionary algorithm.

Table 6. Configuration parameters of the design found with the evolutionary algorithm.

Table 7. Comparison of the solution found with the proposed method and the solution found with the evolutionary algorithm.

In an evolutionary algorithm there is usually a large population (50 in our experiment) in every generation, so the merit function has to be evaluated more than 50 times per generation, and the algorithm usually has to evolve for many generations to produce a satisfactory result. In addition, the recombination, mutation, and selection operations for a large population are time-consuming, so the required time is usually very long. In the proposed method, by contrast, the conventional Damped Least Squares method is first used to find a local minimum and is then followed by the modified Damped Least Squares method (using the damping factor proposed in Eq. (5)) to jump out of the local minimum. These two procedures alternate until a satisfactory result is obtained. The conventional Damped Least Squares method is a local search method and is very fast, and the modified Damped Least Squares method only functions as an escape mechanism. The evaluation of Eq. (5) is also very fast compared with the recombination and mutation procedures of an evolutionary algorithm. Therefore, the proposed method has the ability to conduct a global search and is faster than the evolutionary algorithm.

We can conclude that the proposed method performs well on lens system design problems. It can obtain global solutions as good as those from other global search methods, and it can be put into use by simply modifying the conventional Damped Least Squares method. At present, the Damped Least Squares method is widely applied as an optimization method for lens design by existing software such as the well-known CAD tools ZEMAX and CODE V. However, the damping factor is fixed in these programs. The proposed method provides the option of setting the damping factor according to Eq. (5) when the algorithm is trapped in a poor local minimum, which can make the conventional Damped Least Squares method much more powerful. We therefore believe it is not only effective but also easy to apply in practice.

5. Conclusion

In this paper we apply mutation operators to control the damping factor in Damped Least Squares. We study the mutation behavior of this method when the algorithm encounters local minima. By using mutation operators to control the damping factor, the proposed method can go beyond local minima. The proposed method was successfully applied to design problems, and the results indicate that the mutation operators provide an effective and rapid way to jump out of poor local minima. Moreover, the proposed method can be put into use simply by applying the mutation operators to control the damping factor in the conventional Damped Least Squares method, so we believe this study can help existing software perform better.

Acknowledgments

This work was supported by the Program for New Century Excellent Talents in University in China under Grant No. NCET-07-0582 and the National Natural Science Foundation of China (NNSFC) under Grant No. 60877004.

References and links

1. K. Levenberg, "A method for the solution of certain non-linear problems in least squares," Q. Appl. Math. 2, 164–168 (1944).

2. M. Isshiki, H. Ono, K. Hiraga, J. Ishikawa, and S. Nakadate, "Lens design: global optimization with escape function," Opt. Rev. 2(6), 463–470 (1995).

3. D. Vasiljevic, Classical and Evolutionary Algorithms in the Optimization of Optical Systems (Wiley, 2001).

4. V. Yakovlev and G. Tempea, "Optimization of chirped mirrors," Appl. Opt. 41(30), 6514–6520 (2002).

5. C. Gagné, J. Beaulieu, M. Parizeau, and S. Thibault, "Human-competitive lens system design with evolution strategies," Appl. Soft Comput. 8(4), 1439–1452 (2008).

6. I. Ono, S. Kobayashi, and Y. Yoshida, "Optimal lens design by real-coded genetic algorithms using UNDX," Comput. Methods Appl. Mech. Eng. 186(2-4), 483–497 (2000).

7. L. Li, Q. H. Wang, D. H. Li, and H. R. Peng, "Jump method for optical thin film design," Opt. Express 17(19), 16920–16926 (2009).

8. L. Li, Q. H. Wang, X. Q. Xu, and D. H. Li, "Two-step method for lens system design," Opt. Express 18(12), 13285–13300 (2010).

9. D. Shafer, "Global optimization in optical design," Comput. Phys. 8, 188–195 (1994).

10. M. van Turnhout and F. Bociort, "Chaotic behavior in an algorithm to escape from poor local minima in lens design," Opt. Express 17(8), 6436–6450 (2009).

11. J. Meiron, "Damped least-squares method for automatic lens design," J. Opt. Soc. Am. 55(9), 1105–1107 (1965).

12. H. E. Nusse and J. A. Yorke, "Basins of attraction," Science 271(5254), 1376–1380 (1996).

13. H. P. Schwefel, Numerical Optimization of Computer Models (Wiley, 1981).

14. J. M. Yang and C. Y. Kao, "A robust evolutionary algorithm for optical thin-film designs," Evol. Comput. 2, 978–985 (2000).

15. D. C. O'Shea, "The monochromatic quartet: a search for the global optimum," Proc. SPIE 1354, 548–554 (1990).
