Information theoretical aspects in coherent optical lithography systems

Xu Ma; Hao Zhang; Zhiqiang Wang; Yanqiu Li; Gonzalo R. Arce; Javier Garcia-Frias; Lu Zhang

doi:10.1364/OE.25.029043

1. Introduction

As a key component in micro- and nano-scale semiconductor fabrication, optical lithography systems are regarded as some of the most sophisticated optical imaging instruments to date. As shown in Fig. 1, the light source illuminates the lithographic mask. Then, the layout pattern on the mask is replicated through the projection optics onto the light-sensitive photoresist on the wafer surface. The light exposure induces chemical reactions in the photoresist, which is subsequently developed to form the print image. As the critical dimensions (CD) of integrated circuits continuously shrink, the semiconductor industry has relied on resolution enhancement techniques to improve the lithography imaging performance [1, 2]. Optical proximity correction (OPC) is a major resolution enhancement technique that pre-warps the mask patterns to compensate for the imaging distortions [3, 4]. Recently, inverse lithography technology (ILT) was introduced to effectively solve the pixelated mask optimization problem. Unlike traditional OPC methodologies, ILT uses mathematical approaches to inversely derive the optimal mask pattern from the predefined desired print image [5–9]. As illustrated by Fig. 1, ILT treats the mask pattern as a pixelated image and inverts the imaging model of the lithography system to obtain the optimal mask pattern.

Fig. 1 The sketch of an optical lithography system and the pixelated OPC method.


Currently, a variety of ILT algorithms have been developed and published in the literature [10–20]. One of the most important goals of these works is to study numerical algorithms to optimize the mask, thus improving the lithography resolution and image fidelity. Frequently used metrics for lithography image fidelity include the edge placement error (EPE), the pattern error (PE), the CD error and so on [18, 21, 22]. As a first attempt, this paper takes the PE as the criterion to evaluate the image fidelity of lithography systems. The PE is defined as the square of the Euclidean distance between the target pattern and the print image on the wafer. As mentioned in [22], the PE is approximately proportional to the average EPE. It is natural to ask what are the theoretical limits of the resolution and image fidelity for a given optical lithography system. According to the Rayleigh resolution limit, the minimum feature or half pitch that can be reproduced on the wafer is given by [23–25]:

R = k \frac{λ}{NA},

where λ is the wavelength of the light source, NA is the numerical aperture on the wafer side, and k is a process constant that can be minimized by ILT [26–28]. However, the limit of image fidelity, measured by the difference between the print image and the target layout, cannot be straightforwardly derived from the Rayleigh resolution limit. Figure 1 provides an intuitive explanation of this issue. The mask before optimization can produce resolvable critical features in the print image, which satisfies the resolution requirement. However, the image fidelity is unacceptable since the print image is far from the target pattern. This example shows that high resolution does not necessarily mean high image fidelity. More importantly, image fidelity is sometimes a more meaningful metric than resolution, because all practical integrated circuit layouts have two-dimensional geometric designs that have to be printed accurately on target to achieve predefined circuit characteristics. Unfortunately, the theoretical limit of image fidelity is difficult to derive due to the nonlinearity of the lithography imaging model and the variable geometry of the layout. So far, to the best of our knowledge, the theoretical limit of lithography image fidelity has not yet been completely understood.

This paper proposes a new approach, based on information theory and probability theory, to study and analyze the imaging process of coherent optical lithography systems. In the past, Rieger et al. introduced and discussed the parallels between lithography techniques and communication theory, and summarized some innovative lithography process methods from the point of view of communications [29]. However, a mathematical model and formulation of optical lithography from the information theory perspective has never been built or derived. This paper studies the information transmission mechanism of coherent optical lithography systems using mathematical, analytic and numerical methods. The physical quantities in optical lithography are first mapped to fundamental concepts in information theory. Similar to the communication scenario, the mask pattern is regarded as the input signal to be transmitted. The lithography system serves as the information channel delivering the information of the mask pattern through the light path. The print image is treated as the output signal received on the wafer. The pixelated OPC method is analogous to the coding method, which rearranges the mask pixels to improve the communication efficiency between the mask and the print image. Based on this analogy, information theoretical aspects of coherent optical lithography systems are then studied and discussed, and an approximate information channel model is introduced to describe the relationship between the mask pattern and the print image. After that, we derive the mutual information between the mask and the print image, and obtain the maximum information transfer (MIT) in a coherent lithography system.

Another contribution of this paper is to introduce a systematic approach to calculate the theoretical limit of lithography image fidelity. First, we explain the physical meaning of MIT in lithography imaging processes, so as to relate the MIT to lithography imaging performance. Then, the lower bound of PE of the print image is derived from the MIT. The proposed theoretical limit of image fidelity is verified by a set of simulations using different layout patterns. For simplicity, this paper focuses on the coherent optical lithography system. However, the proposed methods and theoretical limit will be generalized to partially coherent lithography systems in the future. It is important to remark that this paper tries to develop a complementary perspective by applying information theoretical tools, aiming at obtaining new insights into the lithography imaging problem. The approximate information channel model and capacity quantification approaches in this paper need to be improved and modified in future work to fit the requirements and constraints of real products. However, we believe this paper opens a new window to study the information transmission mechanism of lithography systems, providing new insights in the understanding of lithography imaging performance.

In the following, Section 2 establishes an approximate information channel model of coherent lithography systems, and derives its MIT. Section 3 proposes the method to calculate the theoretical limit of image fidelity for coherent optical lithography. The simulations and discussions are provided in Section 4. Conclusions are given in Section 5.

2. Information theory formulation for coherent lithography system

In this section, we first set up an approximate information channel model for coherent lithography system, and then derive the mutual information and MIT of the proposed channel model.

2.1. Approximate information channel model of coherent lithography systems

The forward imaging process of coherent lithography system is illustrated in Fig. 2. Let M represent the input binary mask in a lithography system, with pixel values equal to 0 or 1. According to the Hopkins diffraction model, the aerial image is calculated as [11, 30, 31]

I = | H \otimes M |^{2},

where H is the point spread function, and ⊗ is the convolution operator. The point spread function of a coherent lithography system is the Bessel function of first kind with normalized energy [15]. Thus, the point spread function is equivalent to a low-pass filter with cutoff frequency of NA/λ [18]. The aerial image is exposed on the wafer, which induces the chemical reactions in the photoresist. The photoresist effect can be represented by the constant threshold model [10, 11]. According to Eq. (2), the print image is then defined as

Z = Γ {I - t_{r}} = Γ {{| H \otimes M |}^{2} - t_{r}},

where Γ{•} is the hard threshold function, and t_r is the photoresist threshold. Thus, Z is also a binary image with its pixel values equal to 0 or 1.
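The forward model of Eqs. (2) and (3) can be illustrated with a minimal pure-Python sketch. It is not the simulation code used in this paper: the 3 × 3 kernel `H` is a hypothetical low-pass stand-in for the Bessel point spread function, the mask `M` is a toy example, zero padding at the borders is an assumption, and the hard threshold is taken as Γ{x} = 1 for x > 0 and 0 otherwise.

```python
def convolve2d_same(image, kernel):
    """Direct 2D convolution with zero padding; output has the size of `image`."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r0, c0 = k_rows // 2, k_cols // 2  # kernel centre
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for u in range(k_rows):
                for v in range(k_cols):
                    y, x = i - (u - r0), j - (v - c0)
                    if 0 <= y < rows and 0 <= x < cols:
                        acc += kernel[u][v] * image[y][x]
            out[i][j] = acc
    return out


def print_image(mask, psf, t_r):
    """Z = Gamma{|H (*) M|^2 - t_r}: aerial image followed by a hard threshold."""
    amplitude = convolve2d_same(mask, psf)
    # Assumption: Gamma{x} = 1 if x > 0, else 0.
    return [[1 if a * a - t_r > 0 else 0 for a in row] for row in amplitude]


# Hypothetical 3x3 low-pass kernel (entries sum to 1), NOT the Bessel PSF.
H = [[0.05, 0.10, 0.05],
     [0.10, 0.40, 0.10],
     [0.05, 0.10, 0.05]]

# Toy 5x5 binary mask with a 3x3 open square in the middle.
M = [[0, 0, 0, 0, 0],
     [0, 1, 1, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 0, 0, 0]]

Z = print_image(M, H, t_r=0.25)
for row in Z:
    print(row)
```

With a wider kernel or a smaller feature, the thresholded output would start to deviate from the mask, which is the proximity effect that OPC tries to compensate.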

Fig. 2 The approximate information channel model for coherent optical lithography systems.


Now consider the red pixel on the mask in Fig. 2, whose value is represented by a binary random variable X = 0 or 1. The pixel value at the corresponding location in the print image is denoted by another binary random variable Y = 0 or 1. Then, the aerial imaging model followed by the resist model can be regarded as a binary information channel. This information channel is band-limited due to the low-pass filtering property of the lithography system. X and Y are the input and output signals of the channel, respectively. In Eq. (2), the optical proximity effect in optical lithography systems is described by the convolution between H and M. This implies that the imaging of one mask pixel is affected by its surrounding pixels due to optical diffraction and interference. As shown in Fig. 2, the values of the blue mask pixels influence the print image of the red one if their distance is shorter than the optical interaction range. Consider the value of a blue pixel as a binary random variable N = 0 or 1. As a first approximation, the OPC mask pixels are assumed to be independent of each other. This assumption is based on the fact that the OPC mask is optimized at the pixel level, often resulting in random-like mask patterns which, collectively, after being projected through the lithography system, form the desired output pattern. Figure 3 illustrates this concept, where the mask pattern obtained by ILT methods is projected onto the corresponding print image. As illustrated in Fig. 3(a), the ILT method optimizes the mask pattern at the pixel level, and each mask pixel can be modified independently. In addition, for simplicity we only consider the information transfer between single pixels on the masks and the print images.
With these assumptions, we are able to obtain analytically an approximate model for information transfer in the lithography system, where the impact of the blue pixels on the red one can be approximated as additive noise in the information channel. Of course, more accurate models for the OPC mask pixels are of interest in order to account for the inter-pixel correlation. This is particularly the case when mask manufacturing constraints are considered. While our proposed model is an approximation, as shown in the sequel, it offers several useful insights for optical lithography.

Fig. 3 The example of the pixelated OPC pattern and the corresponding print image.


Based on our assumptions, all of the surrounding pixels around the red one in Fig. 2 will be treated as independent noise random variables, which jointly contribute to the transmission error in the information channel. In this way, the lithography system can be approximately modeled as the binary information channel in Fig. 4. In this model, p_ij = P_r(Y = j|X = i) with i, j=0 or 1 is referred to as the transition probability, which means the probability of receiving Y = j when X = i is transmitted. Hereafter, we use the notation of P_r{•} to represent the probability of the argument. Consider the extreme case in which the influence of the surrounding pixels can be ignored, then p₀₀ = p₁₁ = 1 and p₀₁ = p₁₀ = 0. In this case, the lithography system is reduced to a noiseless binary information channel, where the information of the mask can be exactly reproduced onto the print image. The opposite case is when the input signal X is totally submerged under the noise attributed to the surrounding pixels, then p₀₀ = p₁₁ = p₀₁ = p₁₀ = 0.5. That is, the probability of the output signal Y is always 0.5 regardless of the input distribution. In this case, the lithography system cannot transmit any useful information of the mask, since we cannot tell anything about the input signal X from the output signal Y.

Fig. 4 Information transmission between input and output signals in coherent optical lithography system.


In order to complete the information channel modeling, we proceed to calculate the transition probability p_ij. Figure 5 illustrates an example of a point spread function H and a mask pattern M. Suppose the red pixel is the input signal X to be transmitted. The center of H is located on the red pixel. The convolution in Eq. (2) induces the interaction between the red pixel and its surrounding pixels within the area supported by H. The values of the surrounding pixels are weighted by the amplitude of H, and then added to X as the noise term. We can divide the mask area underneath H into W different concentric circles. The wth (1 ≤ w ≤ W) concentric circle is denoted as C_w. Due to the circular symmetry of the point spread function, the amplitude of H is constant on each concentric circle. Let h_w represent the amplitude of H on C_w, and assume there are L_w surrounding pixels distributed along C_w. The value of the lth (1 ≤ l ≤ L_w) surrounding pixel is defined as a binary random variable N_wl, and the surrounding pixels on C_w are assumed to be independent and identically distributed.

Fig. 5 The point spread function of coherent lithography system and the interaction between different mask pixels.


In a practical layout pattern, the distribution of N_wl should be related to the value of X. In particular, when X = 0, N_wl obeys the Bernoulli distribution B(1, p_w₀). When X = 1, N_wl obeys the Bernoulli distribution B(1, p_w₁). That is

\begin{array}{l} P_{r} (N_{w l} = 1 | X = 0) = p_{w 0}, \\ P_{r} (N_{w l} = 0 | X = 0) = 1 - p_{w 0}, \\ P_{r} (N_{w l} = 1 | X = 1) = p_{w 1}, \\ P_{r} (N_{w l} = 0 | X = 1) = 1 - p_{w 1} . \end{array}

Then, the overall noise contributed by all surrounding pixels on C_w can be calculated as

N_{w} = \sum_{l = 1}^{L_{w}} h_{w} N_{w l} = h_{w} \sum_{l = 1}^{L_{w}} N_{w l} .

Applying the central limit theorem, the summation of N_wl approximately obeys a Gaussian distribution. When X = 0,

N_{w} ~ N (μ_{w 0}, σ_{w 0}^{2}), where μ_{w 0} = L_{w} h_{w} p_{w 0} and σ_{w 0}^{2} = L_{w} h_{w}^{2} p_{w 0} (1 - p_{w 0}) .

When X = 1,

N_{w} ~ N (μ_{w 1}, σ_{w 1}^{2}), where μ_{w 1} = L_{w} h_{w} p_{w 1} and σ_{w 1}^{2} = L_{w} h_{w}^{2} p_{w 1} (1 - p_{w 1}) .

In Eqs. (5) and (6), $N(\cdot, \cdot)$ denotes a Gaussian distribution, and the values of p_w₀ and p_w₁ can be estimated from the target layout by statistical approaches. The method to calculate p_w₀ is briefly described in the following. We first find out all of the zero-valued pixels on the target layout, which are denoted as $M_{1}^{0}, M_{2}^{0}, \dots, M_{i}^{0}, \dots, M_{K_{0}}^{0}$, where K₀ is the total number of the zero-valued pixels. For each of $M_{i}^{0}$ we locate its concentric circle C_w, and calculate the proportion of the one-valued pixels to the total pixel number on C_w. Then, p_w₀ is estimated as the average value of the ratios over all of the zero-valued pixels $M_{i}^{0}$ on the target layout. Using a similar method, we can calculate the value of p_w₁. Note that the values of p_w₀ and p_w₁ should be calculated from the OPC mask, but our goal is to obtain theoretical minimum values for PE without mask knowledge. Thus, we assume that the probability of the OPC mask is similar to that of the target layout. The rationale for this assumption is that inverse lithography techniques usually use the target layout as the initial guess of the OPC mask, then iteratively optimize the mask, which inserts some additional sub-resolution assist features around the main features.
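The estimation of p_w₀ and p_w₁ described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the procedure used to produce the results in this paper: the concentric circle C_w is approximated by the pixels whose rounded Euclidean distance from the centre pixel equals w, pixels falling outside the layout are ignored, and the function names and the toy layout are hypothetical.

```python
import math


def ring_offsets(max_radius):
    """Group integer offsets (dy, dx) into rings by rounded Euclidean distance."""
    rings = {w: [] for w in range(1, max_radius + 1)}
    for dy in range(-max_radius, max_radius + 1):
        for dx in range(-max_radius, max_radius + 1):
            w = int(round(math.hypot(dy, dx)))
            if 1 <= w <= max_radius:
                rings[w].append((dy, dx))
    return rings


def estimate_ring_probabilities(layout, max_radius):
    """Return {w: (p_w0, p_w1)} estimated from a binary target layout.

    For every centre pixel, the fraction of one-valued pixels on ring w is
    computed; the fractions are then averaged separately over zero-valued
    and one-valued centre pixels. NaN is returned if a class is empty.
    """
    rows, cols = len(layout), len(layout[0])
    result = {}
    for w, offsets in ring_offsets(max_radius).items():
        ratios = {0: [], 1: []}
        for i in range(rows):
            for j in range(cols):
                ones = total = 0
                for dy, dx in offsets:
                    y, x = i + dy, j + dx
                    # Assumption: ring pixels outside the layout are ignored.
                    if 0 <= y < rows and 0 <= x < cols:
                        total += 1
                        ones += layout[y][x]
                if total:
                    ratios[layout[i][j]].append(ones / total)
        result[w] = tuple(
            sum(r) / len(r) if r else float("nan")
            for r in (ratios[0], ratios[1])
        )
    return result


# Toy layout: a vertical line, two pixels wide, on a 6x6 grid.
layout = [[0, 0, 1, 1, 0, 0] for _ in range(6)]
probabilities = estimate_ring_probabilities(layout, max_radius=2)
for w, (p_w0, p_w1) in probabilities.items():
    print(w, p_w0, p_w1)
```

For this toy line pattern, the nearest ring around a one-valued pixel contains more one-valued pixels on average than the nearest ring around a zero-valued pixel, which is why the noise statistics are conditioned on X.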

Summing up the noise from all of the concentric circles, the overall noise signal is given by $\hat{N} = \sum_{w = 1}^{W} N_{w}$ . When X = 0,

\hat{N} ~ N ({\hat{μ}}_{0}, {\hat{σ}}_{0}^{2}), where {\hat{μ}}_{0} = \sum_{w = 1}^{W} L_{w} h_{w} p_{w 0} and {\hat{σ}}_{0}^{2} = \sum_{w = 1}^{W} L_{w} h_{w}^{2} p_{w 0} (1 - p_{w 0}) .

When X = 1,

\hat{N} ~ N ({\hat{μ}}_{1}, {\hat{σ}}_{1}^{2}), where {\hat{μ}}_{1} = \sum_{w = 1}^{W} L_{w} h_{w} p_{w 1} and {\hat{σ}}_{1}^{2} = \sum_{w = 1}^{W} L_{w} h_{w}^{2} p_{w 1} (1 - p_{w 1}) .
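As a small numerical illustration of Eqs. (7) and (8), the sketch below sums the per-circle means and variances. The ring sizes, amplitudes and probabilities are hypothetical values chosen only to make the arithmetic easy to follow, and the function name is not from the paper.

```python
def noise_moments(ring_sizes, amplitudes, probabilities):
    """Mean and variance of the overall noise, summed over concentric circles.

    ring_sizes[w]    : L_w, number of surrounding pixels on circle C_w
    amplitudes[w]    : h_w, amplitude of the point spread function on C_w
    probabilities[w] : p_w0 or p_w1, probability that a pixel on C_w equals 1
    """
    mean = sum(L * h * p for L, h, p in zip(ring_sizes, amplitudes, probabilities))
    variance = sum(
        L * h ** 2 * p * (1.0 - p)
        for L, h, p in zip(ring_sizes, amplitudes, probabilities)
    )
    return mean, variance


# Hypothetical two-circle example; the negative amplitude mimics a side lobe.
L_w = [8, 12]
h_w = [0.10, -0.02]
p_w0 = [0.25, 0.50]

mu_0, var_0 = noise_moments(L_w, h_w, p_w0)
print(mu_0, var_0)  # approximately 0.08 and 0.0162
```

Here the first circle contributes 8 × 0.10 × 0.25 = 0.2 to the mean and the second contributes 12 × (−0.02) × 0.5 = −0.12, so circles where H is negative partly cancel the contribution of the inner ones.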

The global influence of the surrounding pixels on all concentric circles can be approximated as Gaussian noise. However, the surrounding pixels on different concentric circles have different influence on the red pixel. As mentioned above, the values of the surrounding pixels are weighted by the point spread function H, which is a Bessel function with oscillating amplitude. This makes the influence of surrounding pixels on different concentric circles exhibit some ringing behavior. As shown in Eqs. (5) and (6), the ringing effect of neighborhood influence is represented by the fluctuation of the mean and variance of N_w across the concentric circles. According to Eq. (3), the output signal of the information channel is

Y = Γ {{| h_{0} X + \hat{N} |}^{2} - t_{r}},

where h₀ is the amplitude of H at the center, $\hat{N}$ is described in Eqs. (7) and (8), and t_r is the resist threshold. Based on Eqs. (7)–(9), the transition probability of the information channel can be calculated as

\begin{array}{l} p_{00} = P_{r} (Y = 0 | X = 0) = \int_{- \sqrt{t_{r}}}^{\sqrt{t_{r}}} \frac{1}{\sqrt{2 π} {\hat{σ}}_{0}} \exp [- \frac{{(x - {\hat{μ}}_{0})}^{2}}{2 {\hat{σ}}_{0}^{2}}] d x, \\ p_{10} = P_{r} (Y = 0 | X = 1) = \int_{- \sqrt{t_{r}}}^{\sqrt{t_{r}}} \frac{1}{\sqrt{2 π} {\hat{σ}}_{1}} \exp [- \frac{{(x - h_{0} - {\hat{μ}}_{1})}^{2}}{2 {\hat{σ}}_{1}^{2}}] d x, \\ p_{01} = 1 - p_{00}, \\ p_{11} = 1 - p_{10}, \end{array}

where ${\hat{μ}}_{0}$, ${\hat{σ}}_{0}^{2}$, ${\hat{μ}}_{1}$ and ${\hat{σ}}_{1}^{2}$ are given by Eqs. (7) and (8). This paper does not take into account the influence of variations in the lithographic process on the imaging performance. It is important to remark that one of the main goals of modern ILT is to improve the robustness of imaging performance to variations in the lithographic process, such as defocus uncertainty. However, this topic is beyond the scope of this paper, and will be studied in future work.
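The integrals in Eq. (10) are differences of Gaussian cumulative distribution functions, so they can be evaluated with the error function. The following sketch assumes strictly positive noise variances; the function names and the numerical values in the example are hypothetical.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian with the given mean and std."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def transition_probabilities(h0, mu0, var0, mu1, var1, t_r):
    """Return (p00, p01, p10, p11) for Y = Gamma{|h0 * X + N_hat|^2 - t_r}.

    Assumes var0 > 0 and var1 > 0.
    """
    s = math.sqrt(t_r)
    sd0, sd1 = math.sqrt(var0), math.sqrt(var1)
    # Y = 0 exactly when the amplitude h0 * X + N_hat lies in (-sqrt(t_r), sqrt(t_r)).
    p00 = gaussian_cdf(s, mu0, sd0) - gaussian_cdf(-s, mu0, sd0)
    p10 = gaussian_cdf(s, h0 + mu1, sd1) - gaussian_cdf(-s, h0 + mu1, sd1)
    return p00, 1.0 - p00, p10, 1.0 - p10


# Hypothetical channel parameters.
p00, p01, p10, p11 = transition_probabilities(
    h0=0.6, mu0=0.08, var0=0.0162, mu1=0.15, var1=0.02, t_r=0.25
)
print(p00, p01, p10, p11)
```

When the noise variances shrink towards zero and the noise means are small, p₀₀ and p₁₁ approach 1, which recovers the noiseless channel discussed for Fig. 4.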

2.2. Maximum information transfer in coherent lithography

In order to derive the MIT of a coherent lithography system, we need first to calculate the entropy of Y, the conditional entropy of Y given X, and the mutual information between X and Y [32]. Suppose X and Y obey the Bernoulli distributions of B(1, p_X) and B(1, p_Y), respectively. Given the transition probabilities in Eq. (10), the probability p_Y can be calculated as

p_{Y} = P_{r} (Y = 1) = p_{X} p_{11} + (1 - p_{X}) p_{01}

The entropy of Y is given by

E_{n} (Y) = - p_{Y} \log_{2} p_{Y} - (1 - p_{Y}) \log_{2} (1 - p_{Y}),

where p_Y is given by Eq. (11). E_n(Y) is a measurement of the uncertainty in the random variable Y. The conditional entropy of Y given X is formulated as

E_{n} (Y | X) = p_{X} [- p_{10} \log_{2} p_{10} - p_{11} \log_{2} p_{11}] + (1 - p_{X}) [- p_{00} \log_{2} p_{00} - p_{01} \log_{2} p_{01}] .

E_n(Y|X) quantifies the amount of information needed to describe Y given that X is known. Thus, the mutual information I(X;Y) is calculated as [32]

I (X; Y) = E_{n} (Y) - E_{n} (Y | X),

where E_n(Y) and E_n(Y|X) are given by Eqs. (12) and (13), respectively. The mutual information I(X;Y) describes the reduction in the uncertainty of X due to the knowledge of Y. In information theory, the channel capacity describes the maximum rate at which information can be sent over the channel and successfully recovered at the output end. The channel capacity can be calculated by maximizing the mutual information over all possible input distributions p_X [32], i.e.,

\hat{C} = \max_{p_{X}} I (X; Y) .

However, this paper is based on an approximate channel model, which assumes that the mask pixels are independent of each other, and the surrounding pixels are treated as the source of channel noise. Thus, Eq. (15) will lead to an approximate estimation of the channel capacity, which is referred to as the maximum information transfer (MIT) in this paper. Hereafter, we still use the notation of $\hat{C}$ to represent the MIT. In order to calculate the maximum of Eq. (15), we set $\partial I (X; Y) / \partial p_{X} \overset{Δ}{=} 0$, thus the extreme point is

{\hat{p}}_{X} = \frac{α p_{00} + p_{00} - 1}{(α + 1) (p_{00} - p_{10})},

where

α = 2^{β}, and β = \frac{- p_{00} \log_{2} p_{00} - p_{01} \log_{2} p_{01} + p_{10} \log_{2} p_{10} + p_{11} \log_{2} p_{11}}{p_{00} - p_{10}} .
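The maximization in Eq. (15) can also be cross-checked numerically. The sketch below evaluates the mutual information of Eqs. (11)–(14) on a uniform grid of p_X values and returns the largest value; the grid search and the function names are illustrative choices, not the method used in this paper, and the convention 0 · log₂ 0 = 0 is assumed.

```python
import math


def entropy_bits(*probs):
    """Sum of -p * log2(p) over the given probabilities, with 0 * log2(0) = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def mutual_information(p_x, p00, p10):
    """I(X;Y) = En(Y) - En(Y|X) for a binary channel with p_ij = Pr(Y=j | X=i).

    p_x is Pr(X = 1), as in Eq. (11).
    """
    p01, p11 = 1.0 - p00, 1.0 - p10
    p_y = p_x * p11 + (1.0 - p_x) * p01
    en_y = entropy_bits(p_y, 1.0 - p_y)
    en_y_given_x = (p_x * entropy_bits(p10, p11)
                    + (1.0 - p_x) * entropy_bits(p00, p01))
    return en_y - en_y_given_x


def maximum_information_transfer(p00, p10, steps=10000):
    """Grid search over p_X in [0, 1]; returns (C_hat, maximizing p_X)."""
    best = max(range(steps + 1),
               key=lambda k: mutual_information(k / steps, p00, p10))
    p_x = best / steps
    return mutual_information(p_x, p00, p10), p_x


# Noiseless channel: one bit per pixel, reached at p_X = 0.5.
print(maximum_information_transfer(p00=1.0, p10=0.0))
# Asymmetric example: X = 0 is always received correctly, X = 1 only half the time.
print(maximum_information_transfer(p00=1.0, p10=0.5))
```

The asymmetric example is a useful test case because its maximizing input distribution is not uniform, so it distinguishes between Pr(X = 1) and Pr(X = 0).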

3. The limit of image fidelity for coherent lithography

In this section, we first discuss the physical meaning of the MIT $\hat{C}$ in lithography imaging processes. Based on the approximate information channel model, the MIT indicates the upper limit of the rate at which information can be reliably transmitted over the lithography system. In a lithography system, the information is transferred through an image instead of a time signal, and the minimum signal transmission interval is a pixel. This paper treats the mask pattern and print image as pixelated images, since most of current ILT methods grid the mask pattern and optimize the values of mask pixels. Notice that some design layouts of mask patterns could include some non-orthogonal edges that are not necessarily defined on the discrete grid perfectly. Thus, in our future work we may extend the proposed method to handle this case. Based on the pixelated mask model in this paper, the lithography system can at most transmit $\hat{C}$ bits of information per pixel over the channel without error, where $0 \leq \hat{C} \leq 1$ . In other words, if we want to transmit 1 information bit without error, we have to use at least $1 / \hat{C}$ pixels. On the binary mask pattern, each pixel takes 1 information bit. Thus, in order to transmit the information included in one mask pixel, we need to use at least $1 / \hat{C}$ pixels. That means that on the print image we are not able to subdivide the adjacent $1 / \hat{C}$ pixels any more, because these pixels have to act synchronously to transmit the information in one mask pixel. Hereafter, we refer to the set of adjacent $1 / \hat{C}$ pixels on the print image as “irresolvable pixels”. Figure 6(a) presents an intuitive illustration of this idea. In Fig. 6(a), the single pixel with size a × a is the minimum information unit in a lithography system. Due to the symmetric property of the lithography imaging system, a set of irresolvable pixels will be constrained in a circle with radius r. 
The area of the circle should be equal to the area of one pixel divided by $\hat{C}$ , i.e., $a^{2} / \hat{C}$ . So, the radius of the circle is given by

r = \frac{a}{\sqrt{π \hat{C}}} .

Fig. 6 The relationship between MIT and image fidelity.


As the output signal of the information channel, the print image will be drawn by these irresolvable pixels rather than single pixels. Figure 6(a) also shows two adjacent groups of irresolvable pixels. The distance between the circle centers is Δd = a, which is the lateral size of a single pixel. That is because the pixels on the mask are independent of each other, and the mask information is transmitted by single pixels. On the print image, each set of irresolvable pixels corresponds to one mask pixel. However, different sets of irresolvable pixels overlap, since $0 \leq \hat{C} \leq 1$ implies $r \geq a / \sqrt{π} > a / 2$ according to Eq. (18). The inconsistency between single pixels and irresolvable pixels will produce distortion in the print image, deteriorating the image fidelity. From the optical point of view, the overlap between irresolvable pixels indicates the optical proximity effect attributed to the diffraction and interference phenomena.
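The relation between the MIT and the radius of a set of irresolvable pixels in Eq. (18) can be checked with a few lines of arithmetic. In this sketch the pixel size is expressed in units of a (so a = 1), the MIT values are hypothetical, and the function name is not from the paper.

```python
import math


def irresolvable_radius(pixel_size, c_hat):
    """Radius r = a / sqrt(pi * C_hat) of the circle holding 1 / C_hat pixels."""
    if not 0.0 < c_hat <= 1.0:
        raise ValueError("c_hat must lie in (0, 1]")
    return pixel_size / math.sqrt(math.pi * c_hat)


a = 1.0  # pixel size, in units of a
for c_hat in (1.0, 0.5, 0.1):  # hypothetical MIT values, bits per pixel
    r = irresolvable_radius(a, c_hat)
    pixels_in_circle = math.pi * r ** 2 / a ** 2  # equals 1 / C_hat
    print(c_hat, r, pixels_in_circle, r > a / 2)
```

Even in the best case of one bit per pixel, the radius is a/√π ≈ 0.564a, which already exceeds a/2, so circles centred on adjacent pixels always overlap.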

Based on the physical meaning of MIT, Figs. 6(b) and 6(c) show the approach to calculate the limit of image fidelity for coherent lithography systems. In Fig. 6(b), the blue area represents the target layout. We try to use the overlapped irresolvable pixels to cover the underlying layout pattern. The minimum distance between different sets of irresolvable pixels is Δd = a. Obviously, the coverage of irresolvable pixels cannot fit the target layout perfectly. The difference between the coverage and the target layout is referred to as the pattern error, PE, as illustrated by the shadow in Fig. 6(b). It is shown that the PE occurs when the layout feature is uncovered or the non-pattern area is extra-covered. Subsequently, we shift the irresolvable pixels and find out the optimal coverage position to minimize the PE as shown by the shadowed area. Figure 6(c) illustrates the optimal coverage position with the minimum PE. The minimum PE value under the optimal coverage is recognized as the theoretical limit of image fidelity, which is denoted as F_lim. Notice that Fig. 6 is used to illustrate the method to calculate the limit of PE given the value of MIT, but the figures in Fig. 6 are not generated by simulations, and thus they are not realistic.

The theoretical limit of image fidelity is the lower bound of PE for a given coherent optical lithography system. In the lithography imaging process, the pattern error is defined as $| | \tilde{Z} - Z | |_{2}^{2}$ , where $\tilde{Z}$ is the target layout and Z is the print image as described in Eq. (3). As mentioned in the Introduction, pixelated OPC serves as a coding method to approach the MIT of lithography system by translating the target layout into a coded mask pattern. The aforementioned analysis tells us that, no matter how the mask is optimized, the PE cannot be reduced below F_lim.
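For binary images, the squared Euclidean distance in the PE definition reduces to a count of mismatched pixels. The minimal sketch below uses hypothetical 3 × 3 toy images and a hypothetical function name.

```python
def pattern_error(target, printed):
    """Squared Euclidean distance between two images of equal size."""
    return sum(
        (t - z) ** 2
        for target_row, printed_row in zip(target, printed)
        for t, z in zip(target_row, printed_row)
    )


# Toy example: the print image misses one feature pixel and adds one extra pixel.
target = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
printed = [[0, 1, 0],
           [0, 1, 1],
           [0, 0, 0]]

print(pattern_error(target, printed))  # 2
```

Both kinds of mismatch, an uncovered feature pixel and an extra-covered background pixel, contribute equally to the PE.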

As shown in Fig. 6, using smaller circles to cover the layout pattern is likely to achieve smaller PE. According to Eq. (18), we can reduce the circle radius by increasing $\hat{C}$ . Therefore, in general a larger $\hat{C}$ is more likely to result in lower PE. However, this observation is not always right. With more in-depth analysis, we find that the image fidelity limit is not necessarily positively correlated to $\hat{C}$ . In Section 4, we will have more detailed discussions on this issue. In summary, this paper first calculates the MIT that determines the radius of the irresolvable pixels. Then, the F_lim is calculated by a numerical algorithm pursuing the optimal coverage of the target layout. In our future work, we will try to calculate the F_lim in a theoretical way, and find out the analytic formula of the image fidelity limit.

4. Simulation and discussion

This section provides a set of simulations to assess the proposed information channel model and image fidelity limit. Several important observations from the simulation results will be discussed in the following. A deep ultra-violet coherent lithography system is used in the simulations. The illumination wavelength is 193 nm, and the resist threshold is t_r = 0.25. In Eq. (2), the lateral sizes of M and H are 50 pixels and 25 pixels, respectively. The top row of Fig. 7 shows the three target layouts used in the simulations, namely the (a) block pattern, (b) T-shape pattern, and (c) L-shape pattern.

Fig. 7 The target layouts and the corresponding spectrum patterns.


From the top to bottom, Fig. 8 shows the variations of the MIT, number of irresolvable pixels and lower bound of PE with respect to CD and NA. The CD is the dimension of minimum line-width in the layout pattern. From left to right, Fig. 8 shows the simulations for block, T-shape and L-shape patterns, respectively. In each figure, the x-axis represents the value of NA, and the y-axis represents the CD value. From Fig. 8, we have the following observations.

Fig. 8 Variations of MIT, number of irresolvable pixels and lower bound of PE with respect to CD and NA for different layouts.


First, for a fixed value of CD (or NA), reducing the value of NA (or CD) will reduce the value of MIT to zero. So, it is easier to print a layout image with larger CD by using larger NA. This can be explained by the Rayleigh resolution limit in Eq. (1). Increasing NA can improve the lithography resolution, which is beneficial for printing layouts with small CD. This idea has been implemented in immersion lithography systems, where the immersion liquid is inserted between the projector and the wafer to further increase the value of NA [23].

Second, a larger NA does not always give better performance. It is interesting to note that for a specific CD, the MIT first increases and then decreases as NA increases. In other words, we need to find the optimal NA matching a given CD to achieve superior lithography imaging performance. Some prior work has considered the co-optimization of the source, mask and NA to maintain high image fidelity within a large depth of focus (DOF) [33].

Third, the image fidelity limit is not necessarily positively correlated to MIT. This means that smaller MIT may sometimes lead to higher image fidelity. In fact, the image fidelity limit is jointly determined by the MIT and the layout pattern. This can be intuitively explained by Figs. 6(b) and 6(c). The MIT determines the radius of the set (circle) of irresolvable pixels, while the image fidelity limit is determined by the matching degree between the circle of irresolvable pixels and the geometry of the layout pattern. A large circle may appropriately cover the underlying layout pattern, as long as the layout line-width is an integral multiple of the circle radius. In contrast, misalignment between a small circle and the layout pattern may lead to even larger pattern error.

Finally, the MIT is target-dependent. As shown in Figs. 8(a), 8(b) and 8(c), the MIT will change with the variation of CD. At first glance, this seems unreasonable since the MIT should be independent from the input layout. However, according to the approximate information channel model in Fig. 2, the surrounding pixels on the mask around the input signal X are treated as channel noise. Thus, the channel noise is related to the layout geometry, which then influences the MIT.

In order to verify the limit of image fidelity, we optimize the mask pattern in the top row of Fig. 7 to improve the image fidelity of the lithography system. In particular, we use the pixelated OPC algorithm developed in [10]. In the optimization procedure, the constant threshold for the resist model is approximated by the sigmoid function:

Γ {x - t_{r}} \approx sigmoid {x, t_{r}} = \frac{1}{1 + \exp [- a (x - t_{r})]},

where a is the steepness index that dictates the steepness of the sigmoid function. The step length and the steepness index are two key parameters that significantly affect the convergence of the algorithm, and therefore the mask optimization results and the lithography image fidelity. In order to approach the image fidelity limit, we repeat the pixelated OPC simulations while sweeping the step length from 0.1 to 2 and the steepness index from 10 to 90. In each simulation, we run the algorithm for 200 iterations, and then find the minimum PE value over the entire optimization process. Figure 9 shows the obtained minimum PE values based on different step lengths and steepness indexes, where different colors represent different PE values. The top row in Fig. 9 provides the simulations with CD = 180 nm and NA = 0.75, while the bottom row shows the simulations with CD = 90 nm and NA = 1.35. From left to right, Fig. 9 presents the results for block, T-shape and L-shape patterns, respectively.
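The role of the steepness index in Eq. (19) can be illustrated with a short sketch. It only evaluates the sigmoid approximation of the resist threshold; it is not the OPC algorithm of [10], and the sampled intensities and steepness values are hypothetical apart from t_r = 0.25, which is the threshold used in the simulations.

```python
import math


def sigmoid_threshold(x, t_r, steepness):
    """Smooth approximation of the hard threshold Gamma{x - t_r}."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - t_r)))


t_r = 0.25
for steepness in (10, 50, 90):
    # Intensities just below, at, and just above the resist threshold.
    values = [sigmoid_threshold(x, t_r, steepness) for x in (0.15, 0.25, 0.35)]
    print(steepness, [round(v, 4) for v in values])
```

A larger steepness index brings the sigmoid closer to the hard threshold, but it also concentrates the gradient in a narrow band around t_r, which is one reason why the convergence of gradient-based optimization is sensitive to this parameter.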

Fig. 9 Variation of the minimum PEs with respect to the step length and steepness index for different layouts.
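The parameter sweep behind such a map can be sketched as follows. This is a simplified illustration and not the implementation used for Fig. 9: the Gaussian kernel stands in for the coherent point spread function, the 32 x 32 block target, `sigma = 2.0` and `t_r = 0.25` are hypothetical, the mask is parameterized as (1 + cos θ)/2, and the sweep grid is coarsened to 3 x 3 to keep the run short.

```python
import numpy as np

def kernel_ft(n, sigma):
    """FFT of a normalized Gaussian kernel, used here as a stand-in coherent PSF."""
    ax = np.arange(n) - n // 2
    xx, yy = np.meshgrid(ax, ax)
    h = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return np.fft.fft2(np.fft.ifftshift(h / h.sum()))

def conv(x, k_ft):
    """Circular convolution with the kernel (pass np.conj(k_ft) for the adjoint)."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * k_ft))

def sigmoid(x, t_r, a):
    return 1.0 / (1.0 + np.exp(-a * (x - t_r)))

def pattern_error(target, mask, k_ft, t_r):
    """PE: squared Euclidean distance between the target and the thresholded print image."""
    printed = (conv(mask, k_ft) ** 2 > t_r).astype(float)
    return float(np.sum((target - printed) ** 2))

def binarize(theta):
    """Binary mask obtained by rounding (1 + cos(theta)) / 2."""
    return (0.5 * (1.0 + np.cos(theta)) > 0.5).astype(float)

def min_pe(target, k_ft, step, a, t_r=0.25, iters=200):
    """Gradient descent on theta; returns the smallest PE seen during the run."""
    theta = np.where(target > 0.5, np.pi / 5, 4 * np.pi / 5)  # initial mask = target
    best = pattern_error(target, binarize(theta), k_ft, t_r)
    for _ in range(iters):
        mask = 0.5 * (1.0 + np.cos(theta))
        u = conv(mask, k_ft)                      # coherent field amplitude
        z = sigmoid(u ** 2, t_r, a)               # smooth print image
        # Gradient of sum((target - z)^2) with respect to the mask, then chain rule to theta.
        grad_mask = -4.0 * a * conv((target - z) * z * (1.0 - z) * u, np.conj(k_ft))
        theta -= step * grad_mask * (-0.5 * np.sin(theta))
        best = min(best, pattern_error(target, binarize(theta), k_ft, t_r))
    return best

target = np.zeros((32, 32))
target[12:20, 8:24] = 1.0                         # hypothetical block pattern
k_ft = kernel_ft(32, sigma=2.0)
steps = np.linspace(0.1, 2.0, 3)                  # coarse version of the 0.1 to 2 sweep
steepness = np.linspace(10.0, 90.0, 3)            # coarse version of the 10 to 90 sweep
pe_grid = np.array([[min_pe(target, k_ft, s, a) for a in steepness] for s in steps])
pe_uncorrected = pattern_error(target, target, k_ft, t_r=0.25)
print("PE with the target used as the mask:", pe_uncorrected)
print("minimum PE over the sweep:", pe_grid.min())
```

Each entry of `pe_grid` plays the role of one colored cell in a map of minimum PE versus step length and steepness index. Because the initial mask equals the target, no entry can exceed the PE of the uncorrected mask.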

For the case of CD = 180nm and NA = 0.75, Fig. 10 illustrates the best OPC results for the block (top row) and T-shape (bottom row) patterns. The first column shows the target layouts. The second column shows the print images if the target layouts are used as the masks. The third column shows the OPC masks, which achieve the minimum PE values. The fourth column shows the print images corresponding to the OPC masks. The top half of Table 1 lists the theoretical limits of PE, the minimum PEs obtained by OPC, and the optimal step lengths and steepness indexes used in the simulations. Here, the theoretical limit of PE is derived by the method proposed in Section 3. It is observed that the minimum PE is always larger than its theoretical lower bound, which validates the methods proposed in this paper.

Fig. 10 Simulations of pixelated OPC for block and T-shape patterns with CD = 180nm and NA = 0.75.

Table 1. The theoretical limits of PE, the minimum PEs obtained by OPC, and the optimal step lengths and steepness indexes.

Figure 11 illustrates the pixelated OPC simulations for the case of CD = 90nm and NA = 1.35, where we choose the block (top row) and L-shape (bottom row) patterns as the target layouts. The bottom half of Table 1 lists the theoretical limits of PE, the minimum PEs obtained by OPC, and the optimal step lengths and steepness indexes used in the simulations. The simulations in Fig. 11 also demonstrate the validity of the proposed methods. It is noted that both the minimum PEs obtained by OPC (Min. PE) and the theoretical limits of PE are higher for the T-shape and L-shape patterns than for the block pattern. This can be intuitively explained in the frequency domain of the target layouts. Figures 7(d), 7(e) and 7(f) illustrate the spectra of the block, T-shape and L-shape patterns, respectively. The white and black regions represent the frequency components with amplitudes greater than 1 and less than 0, respectively. The grey regions represent the frequency components with amplitudes within [0, 1]. It is observed that most of the spectral energy of the block pattern is concentrated in the lower frequency components, while the L-shape pattern contains more high frequency components. Due to the low-pass filtering property of the lithography system, all of the high frequency components beyond the cut-off frequency will be filtered out. Thus, it is more difficult to preserve the image fidelity for the L-shape pattern than for the block pattern. In the above results, the masks are optimized using the gradient-based OPC algorithm, which is more likely to get trapped in local minima for the T-shape and L-shape patterns. This may explain the larger gaps between the minimum PE and the theoretical limit of PE observed in Table 1 for these patterns.
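The frequency-domain argument can be checked numerically by measuring how much spectral energy of a layout lies beyond a given cut-off. The following is a minimal sketch; the 64 x 64 block and L-shape layouts and the cut-off of 0.05 cycles per pixel are hypothetical and do not reproduce the patterns in Fig. 7 or the cut-off frequency of the simulated systems.

```python
import numpy as np

def high_freq_energy_fraction(pattern, cutoff):
    """Fraction of spectral energy at radial frequencies above cutoff (cycles per pixel)."""
    energy = np.abs(np.fft.fft2(pattern)) ** 2
    fy = np.fft.fftfreq(pattern.shape[0])
    fx = np.fft.fftfreq(pattern.shape[1])
    fyy, fxx = np.meshgrid(fy, fx, indexing="ij")
    radius = np.hypot(fyy, fxx)
    return float(energy[radius > cutoff].sum() / energy.sum())

# Hypothetical layouts: a wide block and an L-shape built from narrow arms.
block = np.zeros((64, 64))
block[20:44, 20:44] = 1.0

l_shape = np.zeros((64, 64))
l_shape[16:48, 16:24] = 1.0   # vertical arm, 8 pixels wide
l_shape[40:48, 16:48] = 1.0   # horizontal arm, 8 pixels wide

cutoff = 0.05                  # illustrative cut-off, in cycles per pixel
frac_block = high_freq_energy_fraction(block, cutoff)
frac_l = high_freq_energy_fraction(l_shape, cutoff)
print(f"block:   {frac_block:.3f} of the energy lies beyond the cut-off")
print(f"L-shape: {frac_l:.3f} of the energy lies beyond the cut-off")
```

For these toy layouts the L-shape loses a larger share of its spectral energy to the low-pass filter than the block, which matches the qualitative explanation given above.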

Fig. 11 Simulations of pixelated OPC for block and L-shape patterns with CD = 90nm and NA = 1.35.

Although the MIT and image fidelity limit of coherent lithography are derived and verified in this paper, there are still some limitations in the proposed information channel model. First, the current work focuses on coherent lithography systems. However, most practical lithography systems operate under partially coherent illumination [34]. Notice that there are strong relationships among the source pattern, the spatial frequency distribution of the optical transmission, and the layout pattern characteristics. Thus, the information channel model in this paper will be generalized to partially coherent lithography systems in our future work. In particular, we will investigate the influence of complex source patterns on the information transmission property of lithography systems. Second, masks with a number of isolated pixels are physically unrealizable due to the limits of mask manufacturing capability [35, 36]. Thus, in our future work the correlation between different mask pixels will be taken into account. Third, the imaging performance of lithography systems is mainly influenced by the uncertainties of defocus, exposure dose, mask, and other process variations [29, 37]. In the future, we will aim at modeling these uncertainty factors as additional sources of noise in the information channel, in order to capture their impact on the capacity. Finally, for simplicity this paper only considers the information transfer between single pixels on the mask and the print image. However, the pixels on the print image are in fact correlated with each other due to the band-limited property of the system. In our future work, we will introduce more advanced channel models that embed a set of neighboring pixels on the mask and print image. The actual MIT is expected to be smaller than that estimated in this paper, since the correlation will reduce the entropy of the output signals.

Over the past decades, researchers and engineers from industry have made great efforts to incorporate concurrent requirements and constraints into the resolution enhancement techniques, aiming at improving the imaging performance, yield and robustness of practical lithography systems. Relevant work in the industry has inspired our paper. The approximate information channel model and capacity quantification approaches used in this paper need to be improved and modified in future work to fit the requirements and constraints of real production lines. However, despite the above limitations, we believe this paper is the first to look at the lithography imaging process and pixelated OPC approaches from an information theoretical perspective, opening a new window to study the information transmission mechanism in lithography systems.

5. Conclusion

This paper studied information theoretical aspects of optical lithography, and established an approximate information channel model for coherent optical lithography systems. The coherent lithography system was modeled as a memoryless binary information channel, and the maximum information transfer was derived using a statistical method. Then, the physical meaning of MIT in lithography imaging processes was explained, and the theoretical limit of image fidelity was derived. A set of simulations based on different target layouts were provided to assess the proposed methods. Finally, the limitations of the proposed information channel model were discussed.

Funding

National Natural Science Foundation of China (NSFC) (61675021, 61675026); National Science and Technology Major Project; Beijing Natural Science Foundation (4173078).

References and links

1. A. K. Wong, Resolution Enhancement Techniques in Optical Lithography (SPIE, 2001).

2. X. Ma and G. R. Arce, Computational Lithography, Wiley Series in Pure and Applied Optics, 1st ed. (John Wiley and Sons, 2010).

3. S. Sherif, B. Saleh, and R. Leone, “Binary image synthesis using mixed integer programming,” IEEE Trans. Image Process. 4(9), 1252–1257 (1995).

4. Y. Liu and A. Zakhor, “Binary and phase shifting mask design for optical lithography,” IEEE Trans. Semicond. Manuf. 5(2), 138–152 (1992).

5. Y. Granik, “Fast pixel-based mask optimization for inverse lithography,” J. Microlith. Microfab. Microsyst. 5(4), 043002 (2006).

6. N. B. Cobb and Y. Granik, “Dense OPC for 65nm and below,” Proc. SPIE 5992, 599259 (2005).

7. P. M. Martin, C. J. Progler, G. Xiao, R. Gray, L. Pang, and Y. Liu, “Manufacturability study of masks created by inverse lithography technology (ILT),” Proc. SPIE 5992, 599235 (2005).

8. A. Poonawala and P. Milanfar, “OPC and PSM design using inverse lithography: A non-linear optimization approach,” Proc. SPIE 6154, 61543H (2006).

9. A. Poonawala, B. Painter, and C. Kerchner, “Model-based assist feature placement for 32nm and 22nm technology nodes using inverse mask technology,” Proc. SPIE 7488, 748814 (2009).

10. A. Poonawala and P. Milanfar, “Mask design for optical microlithography – an inverse imaging problem,” IEEE Trans. Image Process. 16(3), 774–788 (2007).

11. X. Ma and G. R. Arce, “Generalized inverse lithography methods for phase-shifting mask design,” Opt. Express 15(23), 15066–15079 (2007).

12. Y. Shen, N. Wong, and E. Y. Lam, “Level-set-based inverse lithography for photomask synthesis,” Opt. Express 17(26), 23690–23701 (2009).

13. N. Jia and E. Y. Lam, “Machine learning for inverse lithography: Using stochastic gradient descent for robust photomask synthesis,” J. Opt. 12(4), 045601 (2010).

14. J. Yu and P. Yu, “Impacts of cost functions on inverse lithography patterning,” Opt. Express 18(8), 23331–23342 (2010).

15. Y. Shen, N. Wong, and E. Y. Lam, “Aberration-aware robust mask design with level-set-based inverse lithography,” Proc. SPIE 7748, 77481U (2010).

16. Y. Shen, N. Jia, N. Wong, and E. Y. Lam, “Robust level-set-based inverse lithography,” Opt. Express 19(6), 5511–5521 (2011).

17. X. Ma and G. R. Arce, “Pixel-based OPC optimization based on conjugate gradients,” Opt. Express 19(3), 2165–2180 (2011).

18. X. Ma, Y. Li, and L. Dong, “Mask optimization approaches in optical lithography based on a vector imaging model,” J. Opt. Soc. Am. 29(7), 1300–1312 (2012).

19. W. Lv, S. Liu, Q. Xia, X. Wu, Y. Shen, and E. Y. Lam, “Level-set-based inverse lithography for mask synthesis using the conjugate gradient and an optimal time step,” J. Vac. Sci. Technol. B 31(4), 041605 (2013).

20. X. Wu, S. Liu, W. Lv, and E. Y. Lam, “Robust and efficient inverse mask synthesis with basis function representation,” J. Opt. Soc. Am. 31(12), B1–B9 (2014).

21. N. B. Cobb, Fast Optical and Process Proximity Correction Algorithms for Integrated Circuit Manufacturing, Ph.D. thesis, University of California, Berkeley, 1998.

22. W. Lv, Q. Xia, and S. Liu, “Mask-filtering-based inverse lithography,” J. Micro/Nanolith. MEMS MOEMS 12(4), 043003 (2013).

23. B. J. Lin, “Immersion lithography and its impact on semiconductor manufacturing,” J. Micro/Nanolith. MEMS MOEMS 3(3), 377–395 (2004).

24. F. M. Schellenberg, “A history of resolution enhancement technology,” Opt. Rev. 12(2), 83–89 (2005).

25. J. Mulkens, D. Flagello, B. Streefkerk, and P. Graeupner, “Benefits and limitations of immersion lithography,” J. Micro/Nanolith. MEMS MOEMS 3(1), 377–395 (2004).

26. S. A. Campbell, The Science and Engineering of Microelectronic Fabrication, 2nd ed. (Publishing House of Electronics Industry, 2003).

27. F. M. Schellenberg, “Resolution enhancement technology: The past, the present, and extensions for the future, optical microlithography,” Proc. SPIE 5377, 1–20 (2004).

28. L. Liebmann, S. Mansfield, A. Wong, M. Lavin, W. Leipold, and T. Dunham, “TCAD development for lithography resolution enhancement,” IBM J. Res. Dev. 45(5), 651–665 (2001).

29. M. L. Rieger, “Communication theory in optical lithography,” J. Micro/Nanolith. MEMS MOEMS 11(1), 013003 (2012).

30. H. H. Hopkins, “On the diffraction theory of optical images,” Proc. R. Soc. Lond. 217, 408–432 (1953).

31. H. H. Hopkins, “The concept of partial coherence in optics,” Proc. R. Soc. Lond. 208, 263–277 (1951).

32. T. M. Cover and J. A. Thomas, Elements of Information Theory (John Wiley and Sons, 1991).

33. X. Guo, Y. Li, L. Dong, L. Liu, X. Ma, and C. Han, “Parametric source-mask-numerical aperture co-optimization for immersion lithography,” J. Micro/Nanolith. MEMS MOEMS 13(4), 043013 (2014).

34. B. Salik, J. Rosen, and A. Yariv, “Average coherent approximation for partially coherent optical systems,” J. Opt. Soc. Am. 13(10), 2086–2090 (1996).

35. B. Kim, S. S. Suh, S. G. Woo, H. Cho, G. Xiao, D. H. Son, D. Irby, D. Kim, and K. Baik, “Inverse lithography technology (ILT) mask manufacturability for full-chip device,” Proc. SPIE 7488, 748812 (2009).

36. N. Jia, A. K. Wang, and E. Y. Lam, “Regularization of inverse photomask synthesis to enhance manufacturability,” Proc. SPIE 7520, 75200E (2009).

37. J. Sturtevant, E. Tejnil, T. Lin, S. Schulze, P. Buck, F. Kalk, K. Nakagawa, G. Ning, P. Ackmann, F. Gans, and C. Buergel, “Impact of 14-nm photomask uncertainties on computational lithography solutions,” J. Micro/Nanolith. MEMS MOEMS 13(1), 011004 (2014).

Condition	Pattern	Limit of PE	Min. PE	Step length	Steepness index
CD = 180nm, NA = 0.75	Block	14.40	20	1.7	36
CD = 180nm, NA = 0.75	T-shape	16.80	98	1	10
CD = 90nm, NA = 1.35	Block	24.06	28	0.8	16
CD = 90nm, NA = 1.35	L-shape	40.55	76	1	24
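The gap between the minimum PE obtained by OPC and the theoretical limit can be read directly from the entries of Table 1. The following short sketch only restates the tabulated values; the dictionary layout and the variable names are illustrative choices.

```python
# (condition, pattern) -> (theoretical limit of PE, minimum PE obtained by OPC), from Table 1
table_1 = {
    ("CD = 180nm, NA = 0.75", "Block"): (14.40, 20),
    ("CD = 180nm, NA = 0.75", "T-shape"): (16.80, 98),
    ("CD = 90nm, NA = 1.35", "Block"): (24.06, 28),
    ("CD = 90nm, NA = 1.35", "L-shape"): (40.55, 76),
}

gaps = {}
for (condition, pattern), (limit, min_pe) in table_1.items():
    gaps[(condition, pattern)] = min_pe - limit
    print(f"{condition:22s} {pattern:8s} gap = {min_pe - limit:6.2f}  ratio = {min_pe / limit:.2f}")
```

In every row the minimum PE stays above the limit, and the gap is smallest for the block patterns and largest for the T-shape pattern.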

Optics Express