Fast nonlinear compressive sensing lithographic source and mask optimization method using Newton-IHTs algorithm

Yiyu Sun; Naiyuan Sheng; Tie Li; Yanqiu Li; Enze Li; Pengzhi Wei

doi:10.1364/OE.27.002754

1. Introduction

Inverse lithography technology (ILT) has emerged as an important resolution enhancement technique (RET) [1,2] for improving lithographic imaging fidelity at the 7-5 nm nodes. In the forward transformation of ILT methods, the print image is computed from an initial mask pattern using an accurate imaging model. To achieve the best imaging fidelity, ILT methods iteratively modify the mask pattern until the pattern error between the target pattern and the print image is minimized [1–3]. However, ILT methods must calculate the aerial image in the forward transformation at every iteration, which is the most time-consuming step of the optimization procedure and makes the methods computationally inefficient. Researchers have therefore applied machine learning methods to accelerate the inverse optimization process [4,5]. In any supervised machine learning method the training data set is very important, and ILT methods are now generally used to generate it [4]. Ensuring sufficient pattern coverage requires a large amount of training data. It is therefore necessary to develop more advanced strategies that speed up ILT methods while improving imaging fidelity.

As a significant part of ILT, source and mask optimization (SMO) increases the degrees of freedom of the optimization by jointly optimizing the source pattern and the mask pattern, and yields a matched source-mask pair that ensures high image fidelity. Since Rosenbluth et al. first introduced the idea of SMO [6], a set of parameterized or half-pixelated SMO approaches have been proposed [7–9] using heuristic algorithms [10–13]. With the continuous shrinking of the critical dimension (CD) and the emergence of pixelated sources realized by freeform diffractive optical elements (DOEs) [14,15], several pixelated SMO methods using gradient-based algorithms were proposed to improve computational efficiency [16–20] and process robustness [21–25]. State-of-the-art technologies have also been developed in industry, for example by IBM [26–28], ASML [5,29], and Mentor [30–32], and by other research institutes [4,33,34], which have contributed valuable research on models and algorithms. With the widespread adoption of double and multiple patterning technology, the number of masks has greatly increased, which calls for more efficient optimization strategies.

Recently, compressive sensing (CS) theory [35,36] was first introduced to accelerate the source optimization (SO) procedure in our works [37–39], where SO is formulated as an underdetermined linear problem by randomly sampling monitoring pixels on the layout pattern. However, this CS theory cannot be directly applied to optical proximity correction (OPC) or SMO problems because the relationship between the aerial image and the mask pattern is nonlinear. Fortunately, a framework of nonlinear CS has been proposed recently [40] and has been widely applied in magnetic resonance imaging [41] and time-varying networks [42]. This nonlinear CS theory allows sparse or structured signals to be recovered accurately in a more general inverse optimization framework with nonlinear projection measurements. Subsequently, a pixelated OPC approach based on the nonlinear CS theory was proposed in our recent publication [43], where the two-dimensional (2D) discrete Fourier transform (DFT) basis was used as the sparse representation basis for the mask patterns and the iterative hard thresholding (IHTs) algorithm [44] was used to solve the OPC problem. Compared with the traditional gradient-based OPC method, this CS-OPC method reduces the runtime by 19%. However, constrained by the computational efficiency of the IHTs algorithm, the CS-OPC method employed only 24 fixed source points rather than the whole illumination source and did not optimize the source at all, which makes its OPC results impractical for industrial applications.

In this paper, to the best of our knowledge, nonlinear CS theory is applied for the first time to solve the nonlinear inverse reconstruction problem in SMO. The proposed CS-SMO method optimizes both the complete freeform source pattern and the mask pattern, which are sparsely represented by the space basis and the 2D discrete cosine transform (DCT) basis, respectively. As shown later, the 2D DCT basis can represent the mask pattern more sparsely than the 2D DFT basis, which saves storage and accelerates computation. In our CS-SMO framework, the layout pattern is downsampled in both the SO and OPC procedures at a fixed downsampling rate $K$, rather than randomly, to ensure consistency; a downsampling rate of $K$ means that one point is sampled out of every $K$ points. Because of the downsampling operation, the proposed method only needs to calculate the aerial images and the gradients of the cost functions on sparse meshes, which reduces the computational complexity. In addition, a Newton-IHTs algorithm is developed that takes into account the second derivative of the cost function, following the Newton method [48]. It converges very quickly and accelerates the SMO procedure by factors of 9.31 and 7.39 relative to the gradient-based method and the IHTs-based method, respectively.

The remainder of this paper is organized as follows. The nonlinear imaging model is described in Section 2. The nonlinear CS-SMO framework is described in Section 3. The fast SMO method using the Newton-IHTs algorithm is developed in Section 4. Simulations and analyses are presented in Section 5. Conclusions are provided in Section 6.

2. The nonlinear imaging model

An optical lithography imaging system is illustrated in Fig. 1. According to Abbe's method, the aerial image of the lithography system can be calculated as [45]

I = \frac{1}{J_{sum}} \sum_{x_{S}} \sum_{y_{S}} [J (x_{S}, y_{S}) \times \sum_{p = x, y, z} | H_{p}^{x_{S}, y_{S}} \otimes (B^{x_{S}, y_{S}} ⊙ M) |^{2}],

where $M \in ℝ^{N \times N}$ denotes the mask pattern, and $ℝ^{N \times N}$ is the set of real $N \times N$ matrices. The matrix $B^{x_{S}, y_{S}} \in ℝ^{N \times N}$ represents the phase shift resulting from the oblique incidence of the light rays. $H_{p}^{x_{S}, y_{S}}$ is the equivalent point spread function (PSF) of the lithography system, where $p = x$, $y$ or $z$, and the convolution of the PSF and the mask pattern, $H_{p}^{x_{S}, y_{S}} \otimes (B^{x_{S}, y_{S}} ⊙ M)$, embodies the diffraction of the mask pattern. Unlike some other imaging fields [46,47] where the PSF does not include the diffraction pattern, the PSF in this work does. The matrix $J \in ℝ^{N_{S} \times N_{S}}$ represents the intensity distribution of the source pattern, where $J (x_{S}, y_{S})$ is the intensity of the source point $(x_{S}, y_{S})$, and $J_{sum} = \sum_{x_{S}} \sum_{y_{S}} J (x_{S}, y_{S})$ is an illumination intensity normalization factor. The notations $\otimes$ and $⊙$ represent matrix convolution and element-wise matrix multiplication, respectively.

Fig. 1 Imaging formation based on the vector imaging model [45].


In this paper, the resist effect is approximated by a sigmoid function

\mathrm{sig} (x) = \frac{1}{1 + \exp [- a (x - t_{r})]},

where $a$ is the steepness of the sigmoid function and $t_{r}$ is the process threshold. The print image on the wafer is then

Z = \mathrm{sig} (I) = \frac{1}{1 + \exp [- a (I - t_{r})]} = Φ (J, M),

where $I$ is the aerial image described in Eq. (1), and $Φ (\cdot)$ is a nonlinear mapping that relates the print image $Z$ to the source pattern $J$ and the mask pattern $M$.
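As a quick numerical illustration of the resist model in Eqs. (2) and (3), the following Python sketch applies the sigmoid to a small aerial image. It uses the values $a = 25$ and $t_{r} = 0.2$ quoted in Section 5; the function names and the sample intensities are illustrative assumptions and are not part of the original implementation.

```python
import math

def resist_sigmoid(intensity, a=25.0, t_r=0.2):
    """Sigmoid resist model: 1 / (1 + exp(-a * (intensity - t_r)))."""
    return 1.0 / (1.0 + math.exp(-a * (intensity - t_r)))

def print_image(aerial_image, a=25.0, t_r=0.2):
    """Apply the resist model element-wise to a 2D aerial image (list of rows)."""
    return [[resist_sigmoid(v, a, t_r) for v in row] for row in aerial_image]

# Hypothetical 2 x 3 aerial image with intensities around the threshold.
aerial = [[0.0, 0.2, 0.4],
          [0.1, 0.3, 1.0]]
Z = print_image(aerial)
```

An intensity equal to the threshold maps to 0.5, and a larger steepness $a$ pushes values on either side of the threshold closer to 0 or 1.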

3. The nonlinear CS-SMO framework

Given the target layout $\tilde{Z} \in ℝ^{N \times N}$, the cost function is defined as

f (J, M) = \| Π ⊙ (\tilde{Z} - Z) \|_{2}^{2} = \sum_{m = 1}^{N} \sum_{n = 1}^{N} \{ Π (m, n) \times [\tilde{Z} (m, n) - Z (m, n)] \}^{2},

where $\tilde{Z} (m, n)$ and $Z (m, n)$ are the $(m, n)$th elements of $\tilde{Z}$ and $Z$, respectively, and $Π \in ℝ^{N \times N}$ is the weight matrix that adjusts the weights of different layout regions. In this paper, we downsample the layout pattern in both the SO and MO procedures at the downsampling rate $K$ to reduce the computational complexity. The cost function in Eq. (4) then becomes

\begin{array}{l} f_{K} (J, M) = \| Π_{K} ⊙ ({\tilde{Z}}_{K} - Z_{K}) \|_{2}^{2} \\ = \sum_{m = 1}^{N / K} \sum_{n = 1}^{N / K} \{ Π (K m, K n) \times [\tilde{Z} (K m, K n) - Z (K m, K n)] \}^{2}, \end{array}

where $Π_{K} = Π (K m, K n)$, $Z_{K} = Z (K m, K n)$, and ${\tilde{Z}}_{K} = \tilde{Z} (K m, K n)$.
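The effect of downsampling on the cost function can be sketched in a few lines of Python. The example below evaluates the weighted squared error on every $K$th pixel of small hand-made matrices; the function name `weighted_cost` and all numerical values are hypothetical and serve only to illustrate Eqs. (4) and (5).

```python
def weighted_cost(weight, target, printed, K=1):
    """Sum of {weight * (target - printed)}^2 over every K-th pixel.

    With 1-based indices the sampled pixels are (K*m, K*n), m, n = 1..N/K,
    which correspond to the 0-based indices K-1, 2K-1, ..., N-1 used here.
    """
    N = len(target)
    sampled = range(K - 1, N, K)
    return sum((weight[i][j] * (target[i][j] - printed[i][j])) ** 2
               for i in sampled for j in sampled)

# Hypothetical 4 x 4 example: the target is a 2-pixel-wide bar and the
# print image deviates slightly from it.
target = [[0, 1, 1, 0]] * 4
printed = [[0.1, 0.9, 0.8, 0.2]] * 4
weight = [[1.0] * 4 for _ in range(4)]

full_cost = weighted_cost(weight, target, printed, K=1)    # uses all 16 pixels
sparse_cost = weighted_cost(weight, target, printed, K=2)  # uses 4 sampled pixels
```

With $K = 2$ only $(N / K)^{2}$ of the $N^{2}$ pixels enter the sum, which is where the reduction in computational complexity comes from.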

Meanwhile, we apply the discretization penalty in [16] and the generalized wavelet penalty in [45] to suppress the quantization errors and reduce the complexity of the mask patterns, respectively. Thus the overall cost function is

d (J, M) = f_{K} (J, M) + γ_{q} R_{q} (M) + γ_{w} R_{w} (M),

where $R_{q} (M)$ and $R_{w} (M)$ represent the discretization penalty and the wavelet penalty, respectively, and $γ_{q}$ and $γ_{w}$ are the regularization weights.

CS theory [35,36] requires the signal to be reconstructed to be sparse. If the signal itself is not sparse enough, it needs to have a sparse representation in some basis, which is called the sparse basis. In this paper, we choose the space basis, which is the identity matrix here, as the sparse basis of the source pattern because the source pattern itself is sparse enough. The sparse coefficients of the source pattern are then

Ω_{J} = Ψ_{J} \cdot J \cdot Ψ_{J}^{T},

where $Ψ_{J} \in ℝ^{N_{S} \times N_{S}}$ is the identity matrix and $Ω_{J} \in ℝ^{N_{S} \times N_{S}}$ represents the sparse coefficients of the source pattern. Note that the matrix $J$ is called $S$-sparse when $Ω_{J}$ includes only $S$ non-zero elements.

For binary masks, the values of the mask pixels are constrained to be 0 or 1, which restricts the degrees of freedom of the optimization. Therefore, the following parameter transformation is used in this paper:

M = \frac{1 + \cos Θ}{2},

where $Θ \in ℝ^{N \times N}$ replaces $M$ as the optimization variable in the following sections, thus converting the constrained SMO problem into an unconstrained optimization problem. The 2D DCT basis is selected as the sparse basis of the mask pattern because it can represent the mask pattern more sparsely than the 2D DFT basis, which saves storage and accelerates computation. So we have

Ω_{M} = Ψ_{M} \cdot Θ \cdot Ψ_{M}^{T},

where $Ψ_{M} \in ℝ^{N \times N}$ is the 2D DCT transformation matrix and $Ω_{M} \in ℝ^{N \times N}$ represents the sparse coefficients. The impact of different sparse bases for the source and mask patterns will be discussed in future work.
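To make the two transformations above concrete, the following Python sketch builds an orthonormal DCT-II matrix, computes $Ω_{M} = Ψ_{M} \cdot Θ \cdot Ψ_{M}^{T}$ for a small parameter matrix, and maps $Θ$ back to a mask through $M = (1 + \cos Θ) / 2$. The choice of the orthonormal DCT-II normalization, the 8 x 8 test pattern, and all function names are assumptions made for illustration; the paper does not specify these details.

```python
import math

def dct_matrix(N):
    """Orthonormal DCT-II matrix Psi (N x N), so that Psi * Psi^T = identity."""
    Psi = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        Psi.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                    for n in range(N)])
    return Psi

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def to_sparse_coefficients(Theta):
    """Omega_M = Psi * Theta * Psi^T."""
    Psi = dct_matrix(len(Theta))
    return matmul(matmul(Psi, Theta), transpose(Psi))

def from_sparse_coefficients(Omega):
    """Theta = Psi^T * Omega * Psi (the inverse, because Psi is orthonormal)."""
    Psi = dct_matrix(len(Omega))
    return matmul(matmul(transpose(Psi), Omega), Psi)

def mask_from_theta(Theta):
    """M = (1 + cos(Theta)) / 2, which always lies within [0, 1]."""
    return [[(1.0 + math.cos(t)) / 2.0 for t in row] for row in Theta]

# Hypothetical 8 x 8 parameter matrix: Theta = 0 (M = 1) inside a vertical
# bar and Theta = pi (M = 0) outside it.
N = 8
Theta = [[0.0 if 2 <= j < 6 else math.pi for j in range(N)] for _ in range(N)]
Omega = to_sparse_coefficients(Theta)
nonzero = sum(1 for row in Omega for c in row if abs(c) > 1e-9)
Theta_back = from_sparse_coefficients(Omega)
M = mask_from_theta(Theta_back)
```

For this simple bar pattern only a handful of the 64 DCT coefficients are non-zero, and the inverse transform recovers $Θ$ and hence the binary mask.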

According to the nonlinear CS theory [41,45], the SMO problem can be formulated as follows

\begin{array}{l} {\hat{Ω}}_{J}, {\hat{Ω}}_{M} = \arg \min_{Ω_{J}, Ω_{M}} d (Ω_{J}, Ω_{M}) \\ = \arg \min_{Ω_{J}, Ω_{M}} \| Π_{K} ⊙ ({\tilde{Z}}_{K} - Φ_{K} (Ω_{J}, Ω_{M})) \|_{2}^{2} \\ + γ_{q} R_{q} (Ω_{M}) + γ_{w} R_{w} (Ω_{M}) \\ \mathrm{subject\ to}\ \| Ω_{J} \|_{0} \leq S_{J}, \ \| Ω_{M} \|_{0} \leq S_{M}, \end{array}

where $\| \cdot \|_{0}$ denotes the L₀-norm of its argument, which equals the number of non-zero elements, and these constraints ensure that $J$ and $M$ are $S_{J}$-sparse and $S_{M}$-sparse, respectively. Nonlinear CS algorithms can be used to solve Eq. (10).

4. The Newton-IHTs algorithm

The nonlinear CS reconstruction problem can be generally formulated as

\arg \min_{x \in A} f (x),

where $x$ is the signal to be recovered and $A$ is a constraint set. The iteration used in IHTs [40,44] is

x_{n + 1} = P_{A}^{S} \{ x_{n} - \mathrm{step} \times \nabla f (x_{n}) \},

where $P_{A}^{S} \{ \cdot \}$ is a mapping operator that keeps only the $S$ largest-magnitude elements of its argument and sets the other elements to zero, $\mathrm{step}$ is the step length, and $\nabla f (x_{n})$ is the gradient of the cost function.
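The IHTs iteration can be sketched in Python as a gradient step followed by hard thresholding. The toy cost function $f (x) = \| x - c \|_{2}^{2}$, the vector `c`, the step length, and the function names below are assumptions chosen only to keep the example small; the actual SMO cost function is the one defined in Section 3.

```python
def hard_threshold(x, S):
    """Keep the S largest-magnitude entries of x and set the others to zero."""
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:S])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]

def iht(grad, x0, S, step, iterations):
    """Iterate x_{n+1} = P^S{ x_n - step * grad(x_n) }."""
    x = list(x0)
    for _ in range(iterations):
        g = grad(x)
        x = hard_threshold([xi - step * gi for xi, gi in zip(x, g)], S)
    return x

# Toy problem (assumed for illustration): f(x) = ||x - c||^2, gradient 2 (x - c).
c = [0.1, -3.0, 2.0, 0.5]

def grad_f(x):
    return [2.0 * (xi - ci) for xi, ci in zip(x, c)]

x_hat = iht(grad_f, x0=[0.0] * 4, S=2, step=0.25, iterations=60)
```

For this toy problem the iterates converge to the best 2-sparse approximation of `c`, that is, the two largest-magnitude entries of `c` with all other entries set to zero.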

In this paper, to accelerate convergence, we develop a Newton-IHTs algorithm that takes into account the second derivative of the cost function, following the Newton method [48]. The Newton-IHTs iteration can be written as

x_{n + 1} = P_{A}^{S} \{ x_{n} - \mathrm{step} \times H_{n} \nabla f (x_{n}) \},

where $H_{n}$ is the inverse of the Hessian matrix of $f (x_{n})$. To save storage and accelerate computation, we use the limited-memory BFGS (L-BFGS) algorithm [49] to approximate $H$ as

\begin{array}{l} H_{n + 1} = (V_{n}^{T} \cdots V_{n - m}^{T}) H_{0} (V_{n - m} \cdots V_{n}) \\ + \sum_{j = 0}^{m} ρ_{n - m + j} (\prod_{l = 0}^{m - j - 1} V_{n - l}^{T}) s_{n - m + j} s_{n - m + j}^{T} (\prod_{l = 0}^{m - j - 1} V_{n - l}^{T})^{T}, \end{array}

where $H_{0}$ needs to be a positive definite matrix, which is set to the identity matrix in this paper, and $m$ is a user-defined parameter specifying that the curvature information from the last $m$ iterations is used to approximate $H$. The other parameters are calculated as

s_{n} = x_{n} - x_{n - 1}, t_{n} = \nabla f (x_{n}) - \nabla f (x_{n - 1}),

ρ_{n} = \frac{1}{s_{n}^{T} t_{n}}, V_{n} = (E - ρ_{n} t_{n} s_{n}^{T}),

where $E$ is the identity matrix.
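In practice the product $H \nabla f$ is commonly evaluated without ever forming the matrix $H$, using the standard L-BFGS two-loop recursion. The Python sketch below shows that recursion with $H_{0}$ set to the identity matrix, as in this paper. Whether the original Matlab implementation uses the two-loop recursion is not stated, so this is an assumption, and the function names and test vectors are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lbfgs_direction(grad, pairs):
    """Return H * grad, where H is the L-BFGS inverse-Hessian approximation
    built from pairs = [(s, t), ...] (oldest first) with H_0 = identity.

    s is a difference of iterates and t the matching difference of gradients.
    """
    q = list(grad)
    history = []
    for s, t in reversed(pairs):                 # first loop: newest pair first
        rho = 1.0 / dot(s, t)
        alpha = rho * dot(s, q)
        q = [qi - alpha * ti for qi, ti in zip(q, t)]
        history.append((rho, alpha))
    r = q                                        # multiply by H_0 = identity
    for (s, t), (rho, alpha) in zip(pairs, reversed(history)):  # oldest first
        beta = rho * dot(t, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r

# Toy check on f(x) = 0.5 * (2 x_1^2 + 10 x_2^2), whose gradient differences
# satisfy t = D s with D = diag(2, 10). The vectors are assumed values.
s0, t0 = [1.0, 0.0], [2.0, 0.0]
s1, t1 = [0.5, 1.0], [1.0, 10.0]
pairs = [(s0, t0), (s1, t1)]
direction = lbfgs_direction(t1, pairs)           # secant condition: H t1 = s1
```

The recursion needs only the stored pairs $(s, t)$, so the memory cost grows with $m$ times the number of unknowns rather than with the square of the number of unknowns.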

Therefore, we need to calculate the gradients of the cost functions with respect to $Ω_{J}$ and $Ω_{M}$ as follows

\nabla d (Ω_{J}) = \nabla f_{K} (Ω_{J}),

\nabla d (Ω_{M}) = \nabla f_{K} (Ω_{M}) + γ_{q} \nabla R_{q} (Ω_{M}) + γ_{w} \nabla R_{w} (Ω_{M}),

where $\nabla R_{q} (Ω_{M})$ and $\nabla R_{w} (Ω_{M})$ represent the gradients of the discretization penalty and the wavelet penalty terms, respectively. The methods used to calculate the gradients in Eqs. (17) and (18) can be found in Appendix A. According to Eq. (13), $Ω_{J}$ and $Ω_{M}$ are updated by the Newton-IHTs algorithm as

Ω_{J, n + 1} = P_{A}^{S_{J}} \{ Ω_{J, n} - \mathrm{step} \times H_{J, n} \nabla d (Ω_{J, n}) \},

Ω_{M, n + 1} = P_{A}^{S_{M}} \{ Ω_{M, n} - \mathrm{step} \times H_{M, n} \nabla d (Ω_{M, n}) \}.

The flowchart of the proposed SMO method based on the Newton-IHTs algorithm is shown in Fig. 2. $Ω_{J}$ and $Ω_{M}$ are updated using Eqs. (19) and (20), respectively.

Fig. 2 Flowchart for proposed SMO method.


5. Simulation and analysis

This section presents simulations that verify the performance of the proposed method with three target layouts at the 45 nm technology node. All computations are implemented in Matlab and carried out on a computer with an Intel(R) Core(TM) i5-7500 CPU at 3.4 GHz and 8 GB of RAM. The simulations are based on a 193 nm ArF immersion lithography system. An annular illumination source is used as the initial source pattern, with inner and outer partial coherence factors of $σ_{in} = 0.82$ and $σ_{out} = 0.97$. The pixelated source pattern is represented by an $N_{S} \times N_{S}$ matrix, where $N_{S} = 21$. The numerical aperture (NA) on the wafer side is 1.35, the refractive index of the immersion medium is 1.44, and the demagnification factor of the projection optics is $R = 4$. We set $t_{r} = 0.2$ and $a = 25$ in Eq. (3). We use the pattern error (PE) to evaluate the imaging fidelity of the imaging systems optimized by the different methods. The PE is the distance between the print image and the target layout, calculated as $\| \tilde{Z} - Z \|_{2}^{2}$ (compare Eq. (4)). We compare the simulation results of the traditional SMO framework with the steepest descent algorithm (SD method), the CS-SMO framework with the IHTs algorithm (IHTs method), and the CS-SMO framework with the Newton-IHTs algorithm (Newton-IHTs method).

We verify the performance of the proposed method on two simple target layouts and one complex target layout at the 45 nm technology node. The two simple target layouts are shown in Fig. 3. The target layout in Fig. 3(a) is a horizontal block pattern at the 45 nm technology node. The target layout in Fig. 3(b) is a line-space pattern with a critical dimension (CD) of 45 nm and a duty ratio of 1:1. For both target layouts, the mask dimension is $5760 nm \times 5760 nm$ and the pixel size on the mask is $11.25 nm \times 11.25 nm$. The electric field of the illumination is polarized in the X direction and the Y direction, respectively.

Fig. 3 Target layout at 45nm technology node.


According to the Nyquist sampling theorem [50,51], the sampling interval must satisfy

Δ x \leq \frac{1}{2 f_{c}},

where $Δ x$ is the sampling interval and $f_{c}$ is the cut-off frequency of the lithography system. In actual systems, components close to the cut-off frequency may be distorted or lost, so the sampling frequency is generally chosen to be 5 to 10 times $f_{c}$. In this paper, the maximum sampling interval is therefore

Δ x_{c} = \frac{1}{5 f_{c}} = \frac{1}{5} \times \frac{λ}{NA} = \frac{1}{5} \times \frac{193 \, \mathrm{nm}}{1.35} = 28.59 \, \mathrm{nm} .

The pixel size on the mask is 11.25 nm, so we choose the downsampling rate $K = 2$, which gives the sampling interval $Δ x = K \times \mathrm{pixel} = 2 \times 11.25 \, \mathrm{nm} = 22.5 \, \mathrm{nm}$.
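The arithmetic behind the choice of $K$ can be reproduced with a short Python sketch. It uses the wavelength, NA, pixel size and mask dimension quoted in this section. The rule of taking the largest integer $K$ whose sampling interval stays within $Δ x_{c}$ is an assumption that happens to reproduce $K = 2$; the paper states only the chosen value. The variable names are illustrative.

```python
import math

wavelength_nm = 193.0   # ArF wavelength
NA = 1.35               # numerical aperture on the wafer side
pixel_nm = 11.25        # pixel size on the mask
mask_nm = 5760.0        # mask dimension
oversampling = 5        # sampling frequency = 5 * f_c (lower end of the 5-10 range)

f_c = NA / wavelength_nm                 # cut-off frequency implied by Eq. (22), in 1/nm
dx_max = 1.0 / (oversampling * f_c)      # maximum sampling interval, in nm
K = math.floor(dx_max / pixel_nm)        # assumed rule: largest K with K * pixel <= dx_max
dx = K * pixel_nm                        # resulting sampling interval, in nm

N = round(mask_nm / pixel_nm)            # pixels per side on the full grid
sparse_points = (N // K) ** 2            # points on the downsampled mesh
```

With these numbers the full grid has 512 x 512 pixels, and the downsampled mesh has 256 x 256 points, a factor of $K^{2} = 4$ fewer.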

Figure 4 illustrates the simulation results of the different SMO methods for the target layout in Fig. 3(a). From the first column to the fourth column, Fig. 4 shows the results of the initial system, the SD method, the IHTs method, and the Newton-IHTs method, respectively. From top to bottom, Fig. 4 shows the source pattern, the mask pattern, and the print image, respectively. The regularization weights in Eq. (18) are set to $γ_{q} = 5 \times 10^{- 5}$ and $γ_{w} = 2.5 \times 10^{- 5}$. The weight matrix $Π$ in Eq. (4) is set to 1.3 in the band between the boundaries of the target pattern and an outer contour 8 pixels away from those boundaries, and to 1 elsewhere. The downsampling rate $K$ is set to 2 in both the IHTs method and the Newton-IHTs method. The sparsity degrees of the Newton-IHTs method are set to $S_{J} = 80$ in Eq. (19) and $S_{M} = 60$ in Eq. (20), and those of the IHTs method are set to $S_{J} = 80$ and $S_{M} = 3500$. The sparsity degree is chosen according to the sparse basis: a small sparsity degree accelerates computation, but one that is too small reduces the imaging fidelity. The 2D DFT basis and the 2D DCT basis are used to represent the mask pattern in the IHTs method and the Newton-IHTs method, respectively. The large difference in sparsity degree between the two methods results from the different abilities of the 2D DFT and 2D DCT bases to represent the mask pattern sparsely, as shown in Fig. 7.

Fig. 4 Simulations of different SMO methods using horizontal block layout pattern at 45nm technology node.


Figure 4 shows that the proposed CS-SMO framework reduces the pattern error by 25% compared with the SD method. In addition, the imaging fidelity obtained by the Newton-IHTs method is comparable to that of the IHTs method. This suggests that reasonable downsampling can improve the modeling accuracy in the SMO procedure.

Figure 5 shows the convergence curves of the pattern error for the different SMO methods. Because it takes into account the second derivative of the cost function, the Newton-IHTs method converges very quickly. Figure 6 compares the total optimization runtime of the different SMO methods. As Fig. 6 shows, the CS-SMO framework effectively improves the computational efficiency. Moreover, benefiting from its efficient convergence, the Newton-IHTs method accelerates the optimization procedure by factors of 9.31 and 7.39 relative to the SD method and the IHTs method, respectively.

Fig. 5 The convergence curves of different SMO methods using horizontal block pattern at 45nm technology node.


Fig. 6 The runtime of different SMO methods using horizontal block pattern at 45nm technology node.


To satisfy the sparsity assumption, the 2D DFT basis and the 2D DCT basis are used to represent the mask pattern in the IHTs method and the Newton-IHTs method, respectively. The sparse coefficients of the mask pattern in Fig. 4(e) in the 2D DFT domain and the 2D DCT domain are shown in Figs. 7(a) and 7(b), respectively. The sparse coefficients of the mask pattern in Fig. 4(g) in the 2D DFT domain and of the mask pattern in Fig. 4(h) in the 2D DCT domain are shown in Figs. 7(c) and 7(d), respectively. The low-frequency components of the spectrum in the 2D DFT domain lie at the center and along the horizontal and vertical midlines of the figure, while those in the 2D DCT domain lie at the upper left corner and along the left and upper boundaries of the figure. The white and black areas indicate amplitudes greater than 1 and less than 0, respectively, and the grey areas represent amplitudes within [0, 1]. Most of the energy of the initial mask is concentrated in the low-frequency components in both the 2D DFT domain and the 2D DCT domain. As Fig. 7 shows, the IHTs algorithm and the Newton-IHTs algorithm remove the high-frequency components with small amplitudes in each iteration, thus generating simpler mask patterns than the traditional SD method. As Fig. 7(d) shows, the energy is concentrated only within the region marked by the red rectangle. Therefore, the 2D DCT basis can represent the mask pattern more sparsely than the 2D DFT basis, which saves storage and accelerates computation.

Fig. 7 Sparse coefficients of the mask pattern on different sparse bases.


In order to study the effect of the sparse basis on imaging fidelity, we compare the Newton-IHTs method with the 2D DFT basis (DFT method) and the Newton-IHTs method with the 2D DCT basis (DCT method). From left to right, Fig. 8 shows the simulation results of the initial system, the DFT method, and the DCT method, respectively. From top to bottom, Fig. 8 shows the source pattern, the mask pattern, and the print image, respectively. There is little difference in imaging fidelity between the DFT method and the DCT method.

Fig. 8 Simulations of different sparse bases using Newton-IHTs algorithm and horizontal block layout pattern at 45nm technology node.


Figure 9 presents the simulations based on the vertical line-space pattern at the 45 nm technology node shown in Fig. 3(b). Figure 10 shows the convergence curves of the pattern error for the different SMO methods. Figure 11 compares the total optimization runtime of the different SMO methods. Compared with the SD method, the proposed method effectively improves both the imaging fidelity and the computational efficiency.

Fig. 9 Simulations of different SMO methods using vertical line-space layout pattern at 45nm technology node.


Fig. 10 The convergence curves of different SMO methods using vertical line-space pattern at 45nm technology node.


Fig. 11 The runtime of different SMO methods using vertical line-space pattern at 45nm technology node.


Figure 12 presents the simulations based on the complex pattern at the 45 nm technology node shown in Fig. 12(e). The electric field of the illumination is TE-polarized. The regularization weights in Eq. (18) are set to $γ_{q} = 5 \times 10^{- 5}$ and $γ_{w} = 5 \times 10^{- 5}$ in the SD method and the IHTs method, but remain $γ_{q} = 5 \times 10^{- 5}$ and $γ_{w} = 2.5 \times 10^{- 5}$ in the Newton-IHTs method. The sparsity degree is set to $S_{M} = 200$ in the Newton-IHTs method and $S_{M} = 3500$ in the IHTs method. The other parameters are the same as those used in the simulations of the two simple target layouts in Fig. 3.

Fig. 12 Simulations of different SMO methods using a complex layout pattern at 45nm technology node.


As Fig. 12 shows, the IHTs method cannot achieve smaller image errors than the SD method. This is because a complex pattern contains more high-frequency components, which are close to the cut-off frequency and may be distorted or lost during the sampling process. The applied downsampling method is therefore not well suited to this complex target layout, and more reasonable downsampling methods for complex layouts will be studied in future work. In contrast, the imaging fidelity obtained by the Newton-IHTs method is comparable to that of the SD method. This is because the Newton-IHTs method takes into account the second derivative of the cost function, which enables it to escape local minima during the optimization procedure.

Figure 13 shows the convergence curves of the pattern error for the different SMO methods. Figure 14 compares the total optimization runtime of the different SMO methods. Compared with the traditional SD method, the proposed method effectively improves the computational efficiency.

Fig. 13 The convergence curves of different SMO methods using a complex pattern at 45nm technology node.


Fig. 14 The runtime of different SMO methods using a complex pattern at 45nm technology node.


In the future, more reasonable downsampling methods that can be applied to complex layouts will be studied. In addition, process defects will be taken into account in order to develop CS-SMO methods that are both fast and robust.

6. Conclusion

This paper developed a fast SMO method based on nonlinear CS theory. The proposed method downsamples the layout pattern in both the SO and MO procedures, which effectively reduces the computational complexity and allows the SMO procedure to be formulated as a nonlinear CS reconstruction problem. A Newton-IHTs algorithm is proposed to solve this problem efficiently by taking into account the second derivative of the cost function. Benefiting from the sparse meshes generated by downsampling and the convergence speed of the Newton-IHTs algorithm, the proposed CS-SMO method can efficiently design the source pattern and the mask pattern. Simulation results show that the proposed method accelerates the SMO procedure by factors of 9.31 and 7.39 relative to the traditional gradient-based method and the IHTs-based method, respectively.

Appendix A

According to Eq. (1), we can get

\begin{array}{l} I (K m, K n) = \frac{1}{J_{sum}} \sum_{x_{S}} \sum_{y_{S}} \Big\{ J (x_{S}, y_{S}) \times \sum_{p = x, y, z} \Big[ \sum_{r = 1}^{N} \sum_{s = 1}^{N} \\ H_{p}^{x_{S}, y_{S}} (K m - r, K n - s) \times (B^{x_{S}, y_{S}} (r, s) \times M (r, s)) \Big]^{2} \Big\} . \end{array}

Substituting $r = K a + u$ and $s = K b + v$, where $u, v = 1, ..., K$ and $a, b = 0, ..., N / K - 1$, Eq. (23) can be transformed into

\begin{array}{l} I (K m, K n) = \frac{1}{J_{sum}} \sum_{x_{S}} \sum_{y_{S}} \Big( J (x_{S}, y_{S}) \times \sum_{p = x, y, z} \sum_{u = 1}^{K} \sum_{v = 1}^{K} \\ \Big\{ \sum_{a = 0}^{N / K - 1} \sum_{b = 0}^{N / K - 1} H_{p}^{x_{S}, y_{S}} [K (m - a) - u, K (n - b) - v] \\ \times [B^{x_{S}, y_{S}} (K a + u, K b + v) \times M (K a + u, K b + v)] \Big\}^{2} \Big) . \end{array}

Then, we denote $H_{p}^{x_{S}, y_{S}} (K a - u, K b - v)$, $B^{x_{S}, y_{S}} (K a + u, K b + v)$ and $M (K a + u, K b + v)$ as $H_{p, u v}^{x_{S}, y_{S}} (a, b)$, $B_{u v}^{x_{S}, y_{S}} (a, b)$ and $M_{u v} (a, b)$, respectively. The matrices $H_{p, u v}^{x_{S}, y_{S}}$, $B_{u v}^{x_{S}, y_{S}}$ and $M_{u v}$ are thus the downsampled versions of $H_{p}^{x_{S}, y_{S}}$, $B^{x_{S}, y_{S}}$ and $M$ at the downsampling rate $K$. Eq. (24) can then be reformulated as

\begin{array}{l} I (K m, K n) = \frac{1}{J_{sum}} \sum_{x_{S}} \sum_{y_{S}} \Big( J (x_{S}, y_{S}) \times \sum_{p = x, y, z} \sum_{u = 1}^{K} \sum_{v = 1}^{K} \\ \Big\{ \sum_{a = 0}^{N / K - 1} \sum_{b = 0}^{N / K - 1} H_{p, u v}^{x_{S}, y_{S}} (m - a, n - b) \\ \times [B_{u v}^{x_{S}, y_{S}} (a, b) \times M_{u v} (a, b)] \Big\}^{2} \Big) . \end{array}

Let the matrix $I_{K}$ be the downsampled version of the aerial image $I$ in Eq. (1). Thus

\begin{array}{l} I_{K} = \frac{1}{J_{sum}} \sum_{x_{S}} \sum_{y_{S}} [J (x_{S}, y_{S}) \times \sum_{p = x, y, z} \sum_{u = 1}^{K} \sum_{v = 1}^{K} | H_{p, u v}^{x_{S}, y_{S}} \otimes (B_{u v}^{x_{S}, y_{S}} ⊙ M_{u v}) |_{2}^{2}] \\ = \frac{1}{J_{sum}} \sum_{x_{S}} \sum_{y_{S}} [J (x_{S}, y_{S}) \times \sum_{p = x, y, z} I_{p}^{\mathrm{wafer}}], \end{array}

where $I_{p}^{\mathrm{wafer}}$ is given by

I_{p}^{\mathrm{wafer}} = \sum_{u = 1}^{K} \sum_{v = 1}^{K} | H_{p, u v}^{x_{S}, y_{S}} \otimes (B_{u v}^{x_{S}, y_{S}} ⊙ M_{u v}) |_{2}^{2} .

Then, according to Eq. (3), the downsampled print image can be formulated as

Z_{K} = \mathrm{sig} (I_{K}) = \frac{1}{1 + \exp [- a (I_{K} - t_{r})]} .

According to Eq. (5), the partial derivative of $f_{K}$ with respect to $J (x_{S}, y_{S})$ can be calculated as

\begin{array}{l} \frac{\partial f_{K}}{\partial J (x_{S}, y_{S})} = \frac{- 2 a}{J_{sum}} \sum_{m = 1}^{N / K} \sum_{n = 1}^{N / K} \{ Π (K m, K n) \times [\tilde{Z} (K m, K n) - Z (K m, K n)] \\ \times Z (K m, K n) \times [1 - Z (K m, K n)] \} \times \sum_{p = x, y, z} I_{p}^{\mathrm{wafer}} . \end{array}

Thus, the gradient of $f_{K}$ with respect to $J$ can be calculated as

\begin{array}{l} \nabla f_{K} (J) = \frac{- 2 a}{J_{sum}} \times 1_{N / K \times 1}^{T} \times \\ [Π_{K} ⊙ ({\tilde{Z}}_{K} - Z_{K}) ⊙ Z_{K} ⊙ (1 - Z_{K}) ⊙ \sum_{p = x, y, z} I_{p}^{\mathrm{wafer}}] \times 1_{N / K \times 1}, \end{array}

where $1_{N / K \times 1}$ is a vector whose entries all equal 1. Owing to the length limit of this paper, the gradient of $f_{K}$ with respect to $M$ is not given here; it can be found in [43].

Funding

General Program of National Natural Science Foundation of China (61675026), and National Science and Technology Major Project (2017ZX02101006-001).

Acknowledgments

We thank Mentor Graphics Corporation for providing academic use of Calibre. We also thank KLA-Tencor for providing academic use of PROLITH.

References

1. A. K. Wong, Resolution Enhancement Techniques in Optical Lithography (SPIE, 2001).

2. A. K. Wong, Optical Imaging in Projection Lithography (SPIE, 2005).

3. L. Pang, Y. Liu, and D. Abrams, “Inverse lithography technology (ILT): a natural solution for model-based SRAF at 45nm and 32nm,” Proc. SPIE 6607, 660739 (2007). [CrossRef]

4. S. Lan, J. Liu, Y. Wang, K. Zhao, and J. Li, “Deep learning assisted fast mask optimization,” Proc. SPIE 10587, 105870H (2018).

5. S. Wang, S. Baron, N. Kachwala, C. Kallingal, D. Sun, V. Shu, W. Fong, Z. Li, A. Elsaid, J. Gao, J. Su, J. Ser, Q. Zhang, B. Chen, R. Howell, S. Hsu, L. Luo, Y. Zou, G. Zhang, Y. Lu, and Y. Cao, “Efficient full-chip SRAF placement using machine learning for best accuracy and improved consistency,” Proc. SPIE 10587, 105870N (2018). [CrossRef]

6. A. E. Rosenbluth, S. Bukofsky, C. Fonseca, M. Hibbs, K. Lai, A. Molless, R. N. Singh, and A. K. Wong, “Optimum mask and source patterns to print a given shape,” J. Microlithogr., Microfabr., Microsyst. 1, 12–30 (2002).

7. C. Progler, W. Conley, B. Socha, and Y. Ham, “Layout and source dependent phase mask transmission tuning,” Proc. SPIE 5454, 315–326 (2005).

8. R. Socha, X. Shi, and D. LeHoty, “Simultaneous source mask optimization (SMO),” Proc. SPIE 5853, 180–193 (2005).

9. S. Hsu, L. Chen, Z. Li, S. Park, K. Gronlund, H. Liu, N. Callan, R. Socha, and S. Hansen, “An innovative source-mask co-optimization (SMO) method for extending low k1 imaging,” Proc. SPIE 7140, 714010 (2008).

10. S. Sherif, B. Saleh, and R. De Leone, “Binary image synthesis using mixed linear integer programming,” IEEE Trans. Image Process. 4(9), 1252–1257 (1995).

11. Y. Liu and A. Zakhor, “Binary and phase shifting mask design for optical lithography,” IEEE T. Semiconduct. M. 5(2), 138–152 (1992).

12. Y. Granik, “Solving inverse problems of optical microlithography,” Proc. SPIE 5754, 506–526 (2005).

13. A. Erdmann, T. Fuehner, T. Schnattinger, and B. Tollkuhn, “Towards automatic mask and source optimization for optical lithography,” Proc. SPIE 5377, 646–657 (2004).

14. Y. V. Miklyaev, W. Imgrunt, V. S. Pavelyev, D. G. Kachalov, T. Bizjak, L. Aschke, and V. N. Lissotschenko, “Novel continuously shaped diffractive optical elements enable high-efficiency beam shaping,” Proc. SPIE 7640, 764024 (2010).

15. J. T. Carriere, J. Stack, J. Childers, K. Welch, and M. D. Himel, “Advances in DOE modeling and optical performance for SMO applications,” Proc. SPIE 7640, 764025 (2010).

16. A. Poonawala and P. Milanfar, “Mask design for optical microlithography--an inverse imaging problem,” IEEE Trans. Image Process. 16(3), 774–788 (2007).

17. Y. Granik, “Fast pixel-based mask optimization for inverse lithography,” J. Micro-Nanolith. Mem. 5(4), 043002 (2006).

18. X. Ma and G. R. Arce, “Pixel-based OPC optimization based on conjugate gradients,” Opt. Express 19(3), 2165–2180 (2011).

19. E. Y. Lam and A. K. Wong, “Computation lithography: virtual reality and virtual virtuality,” Opt. Express 17(15), 12259–12268 (2009).

20. Y. Shen, N. Wong, and E. Y. Lam, “Level-set-based inverse lithography for photomask synthesis,” Opt. Express 17(26), 23690–23701 (2009).

21. C. Han, Y. Li, X. Ma, and L. Liu, “Robust hybrid source and mask optimization to lithography source blur and flare,” Appl. Opt. 54(17), 5291–5302 (2015).

22. T. Li and Y. Li, “Lithographic source and mask optimization with low aberration sensitivity,” IEEE Trans. NanoTechnol. 16(6), 1099–1105 (2017).

23. C. Han, Y. Li, L. Dong, X. Ma, and X. Guo, “Inverse pupil wavefront optimization for immersion lithography,” Appl. Opt. 53(29), 6861–6871 (2014).

24. N. Jia and E. Y. Lam, “Machine learning for inverse lithography: using stochastic gradient descent for robust photomask synthesis,” J. Opt. 12(4), 45601–45609 (2010).

25. W. Lv, E. Y. Lam, H. Wei, and S. Liu, “Cascadic multigrid algorithm for robust inverse mask synthesis in optical lithography,” J. Micro-Nanolith. Mem. 13(2), 023003 (2014).

26. K. Lai, M. Gabrani, D. Demaris, N. Casati, A. Torres, S. Sarkar, P. Strenski, S. Bagheri, D. Scarpazza, A. Rosenbluth, D. Melville, A. Wachter, J. Lee, V. Austel, M. Szeto-Millstone, K. Tian, F. Barahona, T. Inoue, and M. Sakamoto, “Design specific joint optimization of masks and sources on a very large scale,” Proc. SPIE 7973, 797308 (2011).

27. K. Tian, M. Fakhry, A. Dave, A. Tritchkov, J. Tirapu-Azpiroz, A. Rosenbluth, D. Melville, M. Sakamoto, T. Inoue, S. Mansfield, A. Wei, Y. Kim, B. Durgan, K. Adam, G. Berger, G. Bhatara, J. Meiring, H. Haffner, and B. Kim, “Applicability of global source mask optimization to 22/20 nm node and beyond,” Proc. SPIE 7973, 79730C (2011).

28. A. Chen, Y. Foong, J. Maeng, N. Jain, and S. McDermott, “Exploration of resist effect in source mask optimization,” Proc. SPIE 10587, 105870J (2018).

29. P. Liu, Z. Zhang, S. Lan, Q. Zhao, M. Feng, H. Liu, V. Vellanki, and Y. Lu, “A full-chip 3D computational lithography framework,” Proc. SPIE 8326, 83260A (2012).

30. H. Vu, S. Kim, J. Word, and Y. Cai, “A novel processing platform for post tape out flows,” Proc. SPIE 10587, 105870R (2018).

31. Y. Du, L. Li, J. Zhang, F. Shao, C. Zuniga, and Y. Deng, “A model-based approach for the scattering-bar printing avoidance,” Proc. SPIE 10587, 105870Q (2018).

32. S. Kobelkov, V. Roizen, S. Rodin, A. Tritchkov, J. Han, and Y. Granik, “Constraint approaches for some inverse lithography problems with pixel-based mask,” Proc. SPIE 10587, 105870I (2018).

33. H. Choi and A. Hamouda, “Inverse lithography OPC correction with multiple patterning and etch awareness,” Proc. SPIE 10587, 105870O (2018).

34. H. Lee, S. Kim, J. Hong, S. Lee, and H. Han, “Thread scheduling for GPU-based OPC simulation on multi-thread,” Proc. SPIE 10587, 105870P (2018).

35. E. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory 52(2), 489–509 (2006).

36. D. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006).

37. Z. Song, X. Ma, J. Gao, J. Wang, Y. Li, and G. R. Arce, “Inverse lithography source optimization via compressive sensing,” Opt. Express 22(12), 14180–14198 (2014).

38. X. Ma, D. Shi, Z. Wang, Y. Li, and G. R. Arce, “Lithographic source optimization based on adaptive projection compressive sensing,” Opt. Express 25(6), 7131–7149 (2017).

39. X. Ma, Z. Wang, H. Lin, Y. Li, G. R. Arce, and L. Zhang, “Optimization of lithography source illumination arrays using diffraction subspaces,” Opt. Express 26(4), 3738–3755 (2018).

40. T. Blumensath, “Compressed sensing with nonlinear observations and related nonlinear optimization problems,” IEEE Trans. Inf. Theory 59(6), 3466–3474 (2013).

41. Y. Zhang, Z. Dong, G. Ji, and S. Wang, “An improved reconstruction method for CS-MRI based on exponential wavelet transform and iterative shrinkage/thresholding algorithm,” J. Electromagnet. Wave. 28(18), 2327–2338 (2014).

42. S. Patterson, Y. C. Eldar, and I. Keidar, “Distributed compressed sensing for static and time-varying networks,” IEEE T. Signal Processing 62(19), 4931–4946 (2014).

43. X. Ma, Z. Wang, Y. Li, G. R. Arce, L. Dong, and J. Garcia-Frias, “Fast optical proximity correction method based on nonlinear compressive sensing,” Opt. Express 26(11), 14479–14498 (2018).

44. T. Blumensath and M. E. Davies, “Iterative hard thresholding for compressed sensing,” Appl. Comput. Harmon. Anal. 27(3), 265–274 (2009).

45. X. Ma, Y. Li, and L. Dong, “Mask optimization approaches in optical lithography based on a vector imaging model,” J. Opt. Soc. Am. A 29(7), 1300–1312 (2012).

46. K. Ahi, S. Shahbazmohamadi, and N. Asadizanjani, “Quality control and authentication of packaged integrated circuits using enhanced-spatial-resolution terahertz time-domain spectroscopy and imaging,” Opt. Lasers Eng. 104, 274–284 (2018).

47. K. Ahi, “Mathematical Modeling of THz Point Spread Function and Simulation of THz Imaging Systems,” IEEE Trans. Terahertz Sci. Technol. 7(6), 747–754 (2017).

48. T. J. Ypma, “Historical development of the Newton-Raphson method,” SIAM Rev. 37(4), 531–551 (1995).

49. J. Nocedal, “Updating Quasi-Newton matrices with limited storage,” Math. Comput. 35(151), 773–782 (1980).

50. H. Nyquist, “Certain topics in telegraph transmission theory,” Trans. A.I.E.E. 617–644 (1928).

51. C. E. Shannon and W. Weaver, The mathematical theory of communication (University of Illinois, 1949).


Optics Express