Expand this Topic clickable element to expand a topic
Skip to content
Optica Publishing Group

Effect of choices of boundary conditions on the numerical efficiency of direct solutions of finite difference frequency domain systems with perfectly matched layers

Open Access Open Access

Abstract

Direct solvers are a common method for solving finite difference frequency domain (FDFD) systems that arise in numerical solutions of Maxwell’s equations. In a direct solver, one factorizes the system matrix. Since the system matrix is typically very sparse, the fill-in of these factors is the single most important computational consideration in terms of time complexity and memory requirements. As a result, it is of great interest to determine ways in which this fill-in can be systematically reduced. In this paper, we show that in the context of commonly used perfectly matched boundary layer methods, the choice of boundary condition behind the perfectly matched boundary layer can be exploited to reduce fill-in incurred during the factorization, leading to significant gains of up to 40% in the efficiency of the factorization procedure. We illustrate our findings by solving linear systems and eigenvalue problems associated with the FDFD method.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

In any finite difference simulation, the domain must be truncated and a boundary condition must be imposed. Common boundaries include the periodic boundary condition or the Dirichlet boundary condition. However, in many applications, one wants to simulate an "infinite" domain in the sense that any wave incident on the boundary should have no reflection. In one dimension, it is possible to eliminate all reflections via the Mur absorbing boundary condition [1]. However in $n > 2$ dimensions, there is no absorbing boundary condition that can eliminate reflections for waves incident from all angles. Instead, it is common to surround the computational domain by perfectly matched layers (PML), a specially designed anisotropic absorber [1,2]. The PML region is typically truncated using a boundary condition such as the periodic or the Dirichlet boundary condition.

The PML has been extensively used in computational electromagnetics, for both frequency and time-domain solutions of the Maxwell’s equations [15]. In the time domain, significant efforts have been invested in reducing the reflection for a given number of layers in the PML region [68]. Various formalisms of the PML have also been developed to ensure strong suppression of reflection over broad bandwidth [2,3]. In the frequency domain, there are additional considerations in the applications of PML. For example, Ref. [9] has shown that the condition number of the system matrix, and hence the performance of iterative solutions of Maxwell’s equations in the frequency domain, can vary drastically for different formalisms of PML, even though these formalisms offer similar wall-clock performance in time-domain algorithms.

In this paper we provide a discussion of PML in the context of direct solver for Maxwell’s equations in the frequency domain. In such direct solvers, one typically performs an LU decomposition of the system matrix. We show that the fill-in in the LU decomposition strongly depends on the boundary condition that is used to truncate the PML. In particular, the use of the Dirichlet boundary condition leads to far less fill-in and hence significant efficiency gain, as compared with the use of the periodic boundary condition. This result is in contrast with both time-domain algorithms, as well as with iterative algorithms in the frequency domain for the solutions of Maxwell’s equations, where the use of these two boundary conditions do not significantly affect the performance of the algorithms. Our result is important for the development of direct solvers for Maxwell’s equations in the frequency domain.

2. Brief review of the finite difference frequency domain method

2.1 Linear system formulation

The Maxwell’s equation written for the $\mathbf {H}$-field is:

$$\nabla \times \frac{1}{ \overleftrightarrow{\epsilon_r}(\mathbf{r})} \nabla \times \mathbf{H(r)} - \omega^2 \mu_0 \mathbf{H(r)} ={-}i\omega \mathbf{M(r)}$$
where $\overleftrightarrow {\epsilon _r}$ denotes a three-dimensional dielectric tensor, $\omega$ is the frequency, $\mathbf {H(r)}$ denotes the vector magnetic field, $\mathbf {M(r)}$ is the vector magnetic source, and the magnetic permeability is assumed to be $\mu _0$ in the entire computational domain. In the finite-difference frequency-domain (FDFD) method, one discretizes Eq. (1) via finite differences on a Yee grid [2,1012] to obtain a system of linear equations:
$$\hat A \mathbf{x} =\mathbf{b}.$$
The matrix $\hat A$ corresponds to the left hand side of Eq. (1) and $\mathbf {x}$ is a vector corresponding to the discretized version of $\mathbf {H}$. $\mathbf {b}$ corresponds to the source term on the right hand side.

In order to obtain a solution to Eq. (2), a number of algorithms exist, such as Gaussian elimination [13]. However, a common method is to factorize the matrix $\hat A$ into $\hat A = \hat L \hat U$, where $\hat L$ is a lower triangular matrix and $\hat U$ is an upper triangular matrix. The solution can then be obtained:

$$\mathbf{x} = \hat U^{{-}1}(\hat L^{{-}1} \mathbf{b}).$$
Eq. (3) is typically called back-substitution. The computation of $L^{-1} b$ and $U^{-1} (L^{-1}b)$ is cheaper as compared with the full Gaussian elimination of the system matrix $\hat A$ [14,15]. However, the process of determining the factorization requires a modified version of Gaussian elimination, so the overall time complexity of using the LU method is the same as the Gaussian elimination [13]. On the other hand, if one has multiple right hand sides to solve, then the LU factorization becomes efficient as the $L$ and $U$ factors can be re-used.

2.2 Linear eigenvalue system formulation

The LU decomposition is also useful for solving eigenvalue problems in electromagnetics, an example of which is:

$$\nabla \times \frac{1}{ \overleftrightarrow{\epsilon_r}(\mathbf{r})} \nabla \times \mathbf{H(r)} = \omega^2 \mu_0 \mathbf{H(r)}.$$
Following the same discretization procedure as that leading to Eq. (2), we obtain the eigenvalue problem:
$$\hat A \mathbf{x} = \lambda \mathbf{x}$$
where $\hat A$ now corresponds to the left-hand side of Eq. (5) and $\lambda = \mu _0\omega ^2$. Eigenvalues are typically determined iteratively by using the Arnoldi iteration and its variants where one repeatedly applies the matrix $\hat A$ to a test vector [16,17]. However, direct application of these methods can only produce the largest $k$ eigenvalues of a system. In computational electromagnetics, one typically is interested instead in eigenvalues that are near a target value $\sigma$. For such a problem, a reformulation of the eigenvalue problem is required [16,18]. Instead of solving Eq. (5) we solve:
$$(\hat A-\sigma \hat I )^{{-}1}\mathbf{x} = \beta \mathbf{x}$$
where $\hat I$ represents the identity matrix. The eigenvalues $\beta$ can be related to eigenvalues of $A$ via: $\beta = 1/(\lambda - \sigma )$. Thus the eigenvalues of $\hat A$ that are closest to $\sigma$ correspond to the largest eigenvalues of $(\hat A-\sigma I)^{-1}$. In applying iterative methods to obtain the largest eigenvalues of $(\hat A-\sigma I)^{-1}$, we need to repeatedly apply $(\hat A-\sigma \hat I)^{-1}$. In the FDFD method, $\hat A$ is sparse, but $(\hat A- \sigma \hat I)^{-1}$ is fully dense. For large problems it is too costly memory-wise to form $(\hat A-\sigma \hat I)^{-1}$ explicitly. The efficient way to do this is to compute an LU factorization of $\hat A-\sigma \hat I$ [19]. Thus, the LU factorization is important for the eigenvalue problems that arise in the FDFD method as well.

2.3 Brief review of boundary conditions

In the FDFD method the computational domain is finite and a boundary condition needs to be applied on the surface of the computational domain. This is illustrated in Fig. 1, where we show an undirected graph representation of a simple square grid consisting of $N = N_x \times N_y$ nodes with $N_x = N_y = 10$. The points at which the fields are evaluated are termed nodes. We denote the fields evaluated at the nodes as $\mathbf {x}(i,j)$, where $i$ is the row index and $j$ is the column index. For a 2D problem, the Maxwell’s equations can be written in terms of only $E_z$ or $H_z$ fields, where $z$ is the out-of-plane direction, hence $\mathbf {x}$ can correspond either to the $E_z$ or the $H_z$ fields. The outermost nodes are the boundary nodes where the boundary condition is enforced. The edges represent the node-to-node coupling as derived from finite-difference approximation of Eq. (1). In Fig. 1(a), we show the periodic boundary, which can be described as: $\mathbf {x}(N_x,j)=\mathbf {x}(0,j)$ and $\mathbf {x}(i,N_y) = \mathbf {x}(i,0)$. In Fig. 1(b), we show a Dirichlet boundary condition: $\mathbf {x}(N_x,j)=\mathbf {x}(0,j) = \mathbf {x}(i,N_y) = \mathbf {x}(i,0) = 0$. When $\mathbf {x}$ corresponds to the $\mathbf {H}$($\mathbf {E}$) fields, the Dirichlet boundary condition corresponds to the perfectly magnetic (electric) conductor boundary condition.

 figure: Fig. 1.

Fig. 1. Two dimensional finite difference grid with a five point stencil. a) A grid truncated with the periodic boundary condition. The boundary connections which connect left boundary to right and top to bottom are shown as trailing black edges. b) Same grid with the Dirichlet boundary condition.

Download Full Size | PDF

In typical FDFD simulations one often surrounds a computation domain of interest with perfectly matched layers (PML) [1,9]. These layers are used to ensure that any wave exiting the computational domain is nearly perfectly absorbed, which is important for simulating an open electromagnetic system. When the PML are used, the entire computational domain consists of the domain of interest and the PML regions. Outside the PML regions, the computational domain still needs to be truncated. Either the periodic or the Dirichlet boundary conditions can be used for this purpose.

The choice of either the periodic or the Dirichlet boundary conditions outside the PML regions does not affect their absorption properties. Moreover, in the finite difference time domain method, or when iterative solvers are used in the finite difference frequency domain method, these two boundary conditions result in very similar computational cost. Therefore, there has been very little attention paid to the choice between these two boundary conditions. As we will show in this paper, however, this choice in fact makes a significant difference when direct solvers are used in the finite difference frequency domain method, since the choice significantly influences the fill-in behaviors in the LU decomposition of the system matrix.

3. Brief review of the fill-in of sparse linear systems

In general $\hat A$ for Eq. (2) or Eq. (5) is sparse. A key consideration in the LU factorization is the issue of fill-in, which is the accumulation of additional non-zero elements in $\hat L$ and $\hat U$ outside those of $\hat A$. In general, fill-in during the factorization process for a sparse linear system depends sensitively on the detailed sparsity pattern of the system matrix. Consider, for example, the following factorization of a matrix:

$$\begin{bmatrix} p & p & p & p & p \\ p & p & & & \\ p & & p & & \\ p & & & p & \\ p & & & & p \\ \end{bmatrix} = \begin{bmatrix} p & & & & \\ p & p & & & \\ p & p & p & & \\ p & p & p & p & \\ p & p & p & p & p \\ \end{bmatrix} \begin{bmatrix} 1 & p & p & p & p \\ & 1 & p & p & p \\ & & 1 & p & p \\ & & & 1 & p \\ & & & & 1 \\ \end{bmatrix}$$
where $p$ denotes a matrix element which is nonzero (element values may be different). Because the first row has no zeros, the factorization process results in $\hat L$ and $\hat U$ factors that are dense [13]. In contrast, consider the factorization of an equivalent system as obtained from the left hand size of Eq. (7) by permuting the row index:
$$\begin{bmatrix} p & & & & p \\ & p & & & p \\ & & p & & p \\ & & & p & p \\ p & p & p & p & p \\ \end{bmatrix} =\begin{bmatrix} p & & & & \\ & p & & & \\ & & p & & \\ & & & p & \\ p & p & p & p & p \\ \end{bmatrix} \begin{bmatrix} 1 & & & & p \\ & 1 & & & p \\ & & 1 & & p \\ & & & 1 & p \\ & & & & 1 \\ \end{bmatrix}.$$
The $\hat L$ and $\hat U$ factors resulting from Gaussian elimination do not have any fill-in. In general, the order of the equations as well as the order of the variables can drastically change the fill-in. The operations which generate these reorderings can be expressed as preconditioners on $\hat A$ via a set of permutation matrices $\hat P$ and $\hat Q$: $\hat A \rightarrow \hat P\hat A\hat Q$, where $\hat P$ performs row-wise and $\hat Q$ performs column-wise permutations. Unfortunately, the problem of finding the optimal permutation to produce the least amount of fill-in is NP-complete [20], which means there are no known, at-worst polynomial time algorithms to determine the optimal permutation. Instead, there is a wide array of different heuristic or approximate methods to reduce fill-in.

The state-of-the-art re-orderings are based on the exact minimum degree algorithm and their variants, such as the methods of approximate minimum degree (AMD) and nested dissection [2125]. All of these methods effectively rely on the minimum degree heuristic, which is the observation that first eliminating variables in the factorization process with fewer couplings (or nonzeros in the row/column) tends to yield less fill-in overall.

4. Reducing fill-in by reducing connectivity within the PML

As discussed in Section 2.3, the choice of boundary condition backing a PML can vary. As a result, we have the freedom to choose the boundary condition behind the PML. In this section, we now show how the choice of boundary condition behind the PML can affect the efficiency of sparse direct solvers, specifically impacting the fill-in during factorization.

From Fig. 1, we can apply the minimum degree heuristic by counting the neighbors of the individual nodes on the grid. The number of neighbors corresponds to the number of nonzero elements in each row of $\hat A$. For the periodic boundary case shown in Fig. 1(a), all nodes including the boundary have 4 neighbors. By the minimum degree heuristic, there is no preference in picking which rows to eliminate first. Meanwhile, for the Dirichlet boundary in Fig. 1(b), the four corner nodes have 2 neighbors, the edge nodes have 3 neighbors and then all other nodes have 4 neighbors. By the minimum degree heuristic in this case, we can see that the rows corresponding to the boundary nodes should be eliminated first.

Finally, we can take advantage of the fact that the nodes adjacent to the Dirichlet boundary have fields that are generally close to zero, since the fields have been strongly attenuated by the PML. As a result, we can also further modify the connectivity to minimize the number of neighbors per node, as we show in Fig. 2:

 figure: Fig. 2.

Fig. 2. a) Optimized grid with reduced connectivity of the boundary. b) Same as a) with the nodes color coded by their degree (deg) in the graph representation.

Download Full Size | PDF

In Fig. 3, we show the influence of the boundary conditions on the fill-in behavior for the small 10$\times$10 grids shown in Figs. 1 and 2. The use of such a small grid allows easier visualization of various quantities involved. We solve a 2D problem corresponding to setting all terms involving $\partial _z$ to zero in Eq. (1). We simulate a magnetic source in vacuum for the magnetic field perpendicular $H_z$ to the 2D plane:

$$\bigg(\frac{\partial}{\partial x}\frac{1}{\epsilon_{r}(x,y)} \frac{\partial}{\partial x}+ \frac{\partial}{\partial y}\frac{1}{\epsilon_{r}(x,y)} \frac{\partial}{\partial y} + \mu_0 \omega^2\bigg) H_z = i\omega M_z$$
where we assume $\epsilon _{r}$ is isotropic, $M_z$ is the $z$-component of the magnetic source current. We consider the periodic boundary conditions, the Dirichlet boundary conditions, and the modified Dirichlet boundary conditions as discussed above where the connectivity at the boundary is reduced. The first column of Fig. 3 shows the sparsity pattern of the system matrices for the three boundary conditions. The sparsity patterns of the three matrices are similar since they differ only on the boundary. Compared with the system matrix from the Dirichlet boundary condition (Fig. 3(d)), the matrix from the periodic boundary condition (Fig. 3(a)) has extra non-zero elements at the lower-left and upper-half corners in the figure due to the long-range coupling induced by the periodic boundary condition. The matrix from the modified Dirichlet (Fig. 3(g)) boundary condition has less non-zero elements near the diagonal as results from the modifications.

 figure: Fig. 3.

Fig. 3. In the first column of Fig. 3, we show the sparsity pattern of $\hat A$ corresponding to the finite difference discretization of the $10\times 10$ grid with the a) periodic, d) Dirichlet boundary, and g) modified Dirichlet boundary with reduced connectivity at the boundary respectively. In the second column, we show the corresponding $\hat L$ factors for factorizing these three matrices as is. In the third column, we show corresponding $\hat L$ factors for factorizing these three matrices using a minimum degree heuristic to reorder the matrices. The number of nonzero elements (nnz) for each matrix is also provided in each figure.

Download Full Size | PDF

The second column in Fig. 3 shows the $\hat L$ factor as obtained by directly factoring the corresponding matrices in the first column as is without any reordering. For the case with periodic boundary condition (Fig. 3(b)), the $\hat L$ factor fills in near the diagonal and the lower edge in the figure. The fill-in at the lower edge arises from the long-range coupling as induced by the periodic boundary condition. By switching to the Dirichlet boundary condition (Fig. 3(e)), the fill-in near the lower edge disappears and the number of non-zero elements in the $\hat L$ factor significantly decreases. Modifying the Dirichlet boundary condition further reduces the fill-in and the number of non-zero elements (Fig. 3(h)).

The third column in Fig. 3 shows the $\hat L$ factor as obtained by factoring the corresponding matrices in the first column using a minimum degree heuristics to reorder the matrix [21]. Compared with the second column, the use of the minimum degree heuristics significantly reduces the number of fill-in for all three cases. Also, the number of non-zero elements reduces as we go from the case with the periodic boundary condition (Fig. 3(c)), to the case with the Dirichlet boundary condition (Fig. 3(f)), to the case with the modified Dirichlet boundary condition (Fig. 3(i)). The $\hat L$ factor shown in Fig. 3(i), which was obtained for the modified Dirichlet boundary condition using the minimum degree heuristics, has 456 non-zero elements, as compared with the number of non-zero elements of 500 in the system matrix with the periodic boundary condition. The result in Fig. 3 provides a direct demonstration that the use the modified Dirichlet boundary condition, in combination with the minimum degree heuristics, can significantly reduce the fill-in.

5. Numerical experiments

In this section, we show that the concept as described above can be applied to typical simulations of electromagnetic waves. We consider both the driven case with a source, and eigenmode calculations without a source. For both cases, we show that choosing either the periodic or the Dirichlet boundary conditions behind the PML does not affect the accuracy of the results, but the choice of the Dirichlet boundary condition leads to better numerical performance.

5.1 Driven problem

We solve Eq. (9) as in the previous section but now the interior domain consists of $N_x \times N_y$ with $N_x = N_y = 301$ grid cells surrounded with a PML with a thickness of 30 cells to ensure minimal reflection [26]. The domain has a length $a = 4\lambda$ where $\lambda$ is the wavelength of the source.

First, we confirm that the use of either the periodic boundary or the modified Dirichlet boundary in conjunction with a PML indeed has no effects on the solution within the interior domain. In Fig. 4 we show the result where the interior domain consits of vacuum and we put a point source at the center of the interior domain. For the two boundary conditions, the field distributions inside the computational domain are almost identical, as can be seen by comparing visually the field distributions shown in Fig. 4(a) and (b), or by exampling the field distributions along a line in the computational domain as shown in Fig. 4(c).

 figure: Fig. 4.

Fig. 4. a) Field solution for $H_z$ for Eq. (9) with the periodic boundary. The periodic boundary is denoted by a red line. b) Field solution on the same grid but with the Dirichlet boundary condition (shown in orange). c) Comparison of the field distribution along the cyan lines in a) and b). The transparent gray regions in (a) and (b) denote where the PML lies.

Download Full Size | PDF

For the matrix $\hat A$ corresponding to the driven problem, we show in Fig. 5 the difference in factorizing the matrix with the periodic boundary vs the Dirichlet boundary. In Fig. 5(a) and (b), we show the fill-in of the $\hat L$ factor with the modified Dirichlet and periodic boundary respectively for the system shown in Fig. 4. The $\hat L$ factor from the modified Dirichlet boundary is noticeably sparser than the $\hat L$ corresponding to the periodic boundary.

 figure: Fig. 5.

Fig. 5. a) Fill-in pattern of the $\hat L$ factor for the system shown in Fig. 4 with the Dirichlet boundary. b) Fill-in pattern for the $\hat L$ factor for the system with the periodic boundary. c) The percentage reduction of the number of nonzeros in $\hat L$ factor in going from the periodic to the modified Dirichlet boundary condition for varying values of $N_x = N_y$.

Download Full Size | PDF

In Fig. 5(c), we vary the size the domain by keeping the domain to be a square shape and varying $N_x=N_y$. We plot the percentage reduction in the number of non-zero elements in the $\hat L$ factor as we change from the periodic to the modified Dirichlet boundary condition. The percentage reduction fluctuates significantly as $N_x$ varies, since the reordering strategies are heuristic. But in general, we see significant reduction for all $N_x$’s. On average the percentage reduction decreases as $N_x$ increases. This is expected since the boundary, being a smaller and smaller fraction of the domain as $N_x$ increases, would lead to a smaller effect on the relative fill-in. What is remarkable is that the boundary couplings clearly have a disproportionately large effect on the fill-in of $\hat L$ factor, even though the boundary nodes are often just 1% or less of the total nodes in the entire grid. Even with $N_x\approx 1000$, there are cases where the use of Dirichlet boundary condition results in 33% fewer nonzeros in the $\hat L$ as compared with the use of the periodic boundary condition.

5.2 Eigenvalue problem

As a second example, we consider an eigenvalue problem that commonly arises in the study of nanophotonic waveguides [27,28]. In this problem, we consider a structure that is periodic along the $x$-direction. For a given operating frequency $\omega$, we solve for the propagating wavevector $k_x$. To develop the formalism, we substitute via the Bloch theorem $H_z(x,y) \rightarrow H_z(x,y)e^{ik_xx}$ in Eq. (9) to obtain:

$$\frac{1}{\mu_0} \bigg( \partial_x\bigg(\frac{1}{\epsilon_{r}}\bigg)\partial_x +\partial_y\bigg(\frac{1}{\epsilon_{r}}\bigg)\partial_y +\omega^2\mu_0 +i k_x \bigg(\partial_x\epsilon_{r}^{{-}1}+\epsilon_{r}^{{-}1}\partial_x\bigg) \bigg) H_z e^{ikx} = \frac{1}{\mu_0\epsilon_{r}}k_x^2 H_z e^{ikx}$$
which is a quadratic eigenvalue problem of the form:
$$\hat M \mathbf{v} + \lambda^2 \hat K \mathbf{v} +\lambda \hat C \mathbf{v} = 0$$
where $\lambda = k_x$ and:
$$\hat K = \frac{1}{\mu_0}\epsilon_r$$
$$\hat M = \frac{1}{\mu_0}\bigg(\partial_x\bigg(\frac{1}{\epsilon_{r}}\bigg)\partial_x +\partial_y\bigg(\frac{1}{\epsilon_{y}}\bigg)\partial_y+\omega^2\bigg)$$
$$\hat C = \frac{i}{\mu_0}\bigg(\partial_x\epsilon_{r}^{{-}1}+\epsilon_{r}^{{-}1}\partial_x\bigg).$$
Eq. (11) can be reformulated into an equivalent linear, generalized eigenvalue problem of the form:
$$\hat D \lambda \mathbf{w} = \hat B \mathbf{w}$$
where $\mathbf {w} = [\mathbf {x},\lambda \mathbf {x}]$ with $\mathbf {x}$ denoting the vector formed by $H_z(x,y)$ and where:
$$\hat D = \begin{bmatrix} \hat K & \hat 0 \\ \hat 0 & \hat I \\ \end{bmatrix}$$
and
$$\hat B = \begin{bmatrix} \hat C & \hat M \\ -\hat I & \hat 0 \\ \end{bmatrix}.$$
After discretization using a computational domain consisting of a single unit cell of the waveguide structure, $\hat 0$ denotes a matrix of zeros of dimension $N_xN_y$. In the waveguide eigenvalue problem, similar to the discussion in Section (2.2), we are typically interested in modes with wavevector near a particular value $\sigma$. Therefore, following the same derivation as in Section (2.2), the matrix that needs to factored is $\hat B-\sigma \hat D$ [16].

As an illustration we consider the waveguide structure schematically shown in Fig. 6(a) [29,30]. The permittivity of the materials used is provided in the caption of Fig. 6(a). The waveguide consists of two walls. Each wall consists of a mixture of dielectric and a Drude metal. The thickness of each wall is 0.3 $\mu$m. The separation between the walls is 1.6 $\mu$m. The structure is periodic along the $x$-direction. The periodicity along the $x$-direction is 0.2 $\mu$m. The metal region in the wall has a width along the $x$-direction of 0.04 $\mu$m. Two unit cells are shown in Fig. 6(a). By operating this system at $\lambda = 2$ $\mu$m, the walls become near-completely reflective, creating a hollow-core waveguide with light guided in the vacuum region between the walls.

 figure: Fig. 6.

Fig. 6. a) Schematic of the waveguide structure and the computational domain. The violet regions consist of a metal described by the Drude model with $\epsilon (\omega ) = 1 - \omega ^2/(\omega _p^2-i\gamma \omega )$, with $\omega _p = 0.72\pi \times 10^{15}$ rad/s, $\gamma = 5.5\times 10^{12}$ $s^{-1}$. The yellow regions consist of a dielectric material with a relative permittivity of $16$. The blue region is vacuum. The gray regions are the PML regions. The structure is periodic along the $x$-direction. Plotted here are two unit cells of the structure. b) $Re(H_z)$ modal profile of the structure with a Dirichlet boundary at the top and bottom edges of the computational domain at an operating frequency corresponding to a free space wavelength of 2 $\mu$m. b) The corresponding result with the periodic boundary at the top and bottom edges of the computational domain.

Download Full Size | PDF

To simulate this structure we use a computational domain consisting of a single unit cell shown in Fig. 6(a). We discretize the computational domain with $N_x \times N_y$ grid cells where $N_x = 100$ and $N_y = 150$. Along the $x$-direction a periodic boundary condition is imposed at the edges of the computational domain. Along the $y$-direction we place PML regions at both edges. The PML regions consist of 10 grid cells. Outside the PML region we truncate the grid with either the modified Dirichlet or the periodic boundary conditions.

First, we validate that the use of different boundary conditions backing the PML does not affect the guided modes of the system. In Fig. 6(b) and (c), we compare the real part of the field profiles of the structure for a sample mode with the same solved $k_x$ eigenvalue but with either the Dirichlet or periodic boundary conditions. We see an excellent agreement between the modal profiles. In Fig. 7, we compare the bandstructure of the waveguide with the periodic and Dirichlet boundary conditions backing the PML in the $y$-direction. To generate these bandstructures, we scan a discrete set of $\omega$ values, and look for the $k_x$ eigenvalues closest to 0. Additionally, due to spurious solutions that come from discretizing Eq. (10), a mode-filter must be implemented to produce the band-structure. Figure 7(a) is the band structure with the Dirichlet boundary and Fig. 7(b) is the corresponding one with the periodic boundary. Again, we see that the band structures in both case are essentially identical, as expected.

 figure: Fig. 7.

Fig. 7. Band structures obtained with a) the modified Dirichlet boundary, b) with the periodic boundary condition. The blue and red dots correspond to the real and imaginary parts of the $k_x$, respectively.

Download Full Size | PDF

Now we compare the fill-in of factorizing $(\hat B-\sigma \hat D)$ with the periodic versus the modified Dirichlet boundary conditions. In Fig. 8(a)-c we consider the case where $\sigma = 0$. In Fig. 8(a), we show the sparsity pattern of the matrix $\hat B$ in Eq. (17). In Fig. 8(b) and (c), we show the fill-in of the $\hat L$ factor of $\hat B$ with the domain truncated by the periodic boundary and the modified Dirichlet boundary conditions, respectively. The fill-in patterns vary substantially between the two cases. Moreover, even though we have PML regions only along the $y$ boundaries, we still achieve a $40\%$ reduction in the number of nonzeros. In the second row of Fig. 8, we show a case with $\sigma \neq 0$. Since $\hat B$ and $\hat D$ have different sparsity patterns, the sparsity pattern shown in Fig. 8(d) differ from that in Fig. 8(a). In comparing Fig. 8(e) with Fig. 8(f), we again see that the use of the modified Dirichlet boundary condition leads to less fill-in.

 figure: Fig. 8.

Fig. 8. a) Sparsity pattern of $\hat B -\sigma \hat D$ with $\sigma = 0$ in Eq. (11) of the eigenvalue problem. The gray lines partition the matrix into $N_xN_y \times N_xN_y$ sub-blocks corresponding to those in Eqs. (17) and (16). b) The corresponding sparsity pattern of the $\hat L$ factor with the modified Dirichlet boundary condition. c) The corresponding sparsity pattern of the $\hat L$ factor for the periodic boundary condition. In the second row d)-f), we show the analogous result with $\sigma = 1$ ($\sigma$ in units of $\lambda _p$, the free-space wavelength at the plasma frequency).

Download Full Size | PDF

6. Conclusion

In summary, we consider direct solvers for the FDFD methods applied to Maxwell’s equations. We demonstrate that the choice of boundary condition used behind the PML can have significant effects on the numerical efficiency of the LU factorization of the system matrix. With the PML the physical properties of the solutions do not depend on the choices of the boundary conditions. But we observe that the use of a modified Dirichlet boundary condition can lead to a reduction of fill-in by up to 40%, as compared to the commonly used periodic boundary conditions. Such a reduction translates to significant increase in computational speed and decrease in memory requirements for the direct solvers.

While in our paper we focus on the PML, our results should be applicable for other numerical absorbers, such as the adiabatic absorbers [31]. Also, while we illustrate our observations with two-dimensional simulations, the same observations should be applicable for three-dimensional simulations as well. Our observation should also be relevant for finite-element frequency-domain solvers.

Our result is important for enhancing the performance of direct solvers in the FDFD methods. In addition, this result should have implications for several numerical tasks in computational electromagnetics that can benefit from efficient LU decomposition of the system matrix, such as adjoint variable method [11,32], efficient line searches for gradient-based photonic optimization [3335], and domain decomposition methods utilizing the Schur complement approach [36,37].

This work is supported by the U. S. Air Force Office of Scientific Research (AFOSR) (Grant No. FA9550-17-1-0002, FA9550-21-1-0244).

Funding

Air Force Office of Scientific Research (FA9550-17-1-0002, FA9550-21-1-0244).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J.-P. Berenger, “A perfectly matched layer for the absorption of electromagnetic waves,” J. Comput. Phys. 114(2), 185–200 (1994). [CrossRef]  

2. A. Taflove, Computational Electrodynamics (Artech House, 2005), 3rd ed.

3. J.-P. Berenger, “Perfectly matched layer for the fdtd solution of wave-structure interaction problems,” IEEE Trans. Antennas Propag. 44(1), 110–117 (1996). [CrossRef]  

4. Z. Sacks, D. Kingsland, R. Lee, and J.-F. Lee, “A perfectly matched anisotropic absorber for use as an absorbing boundary condition,” IEEE Trans. Antennas Propag. 43(12), 1460–1463 (1995). [CrossRef]  

5. S. G. Johnson, “Notes on perfectly matched layers (pmls),” (2021).

6. F. Collino and P. B. Monk, “Optimizing the perfectly matched layer,” Comput. Methods Appl. Mech. Eng. 164(1-2), 157–171 (1998). [CrossRef]  

7. A. Agrawal and A. Sharma, “Perfectly matched layer in numerical wave propagation: factors that affect its performance,” Appl. Opt. 43(21), 4225–4231 (2004). [CrossRef]  

8. A. Oskooi and S. Johnson, “Distinguishing correct from incorrect pml proposals and a corrected unsplit pml for anisotropic, dispersive media,” J. Comput. Phys. 230(7), 2369–2377 (2011). [CrossRef]  

9. W. Shin and S. Fan, “Choice of the perfectly matched layer boundary condition for frequency-domain Maxwell’s equations solvers,” J. Comput. Phys. 231(8), 3406–3431 (2012). [CrossRef]  

10. K. Yee, “Numerical solution of initial boundary value problems involving maxwell’s equations in isotropic media,” IEEE Trans. Antennas Propag. 14(3), 302–307 (1966). [CrossRef]  

11. G. Veronis, R. W. Dutton, and S. Fan, “Method for sensitivity analysis of photonic crystal devices,” Opt. Lett. 29(19), 2288 (2004). [CrossRef]  

12. R. LeVeque, Finite Difference Methods for Ordinary and Partial Differential Equations: Steady-State and Time-Dependent Problems (Classics in Applied Mathematics Classics in Applied Mathemat) (Society for Industrial and Applied Mathematics, USA, 2007).

13. G. Strang, Introduction to Linear Algebra (Wellesley-Cambridge, 2016).

14. X. S. Li, “An overview of SuperLU: Algorithms, implementation, and user interface,” (2005).

15. T. A. Davis, “Algorithm 832: UMFPACK V4.3 - An unsymmetric-pattern multifrontal method,” ACM Trans. Math. Softw. 30(2), 196–199 (2004). [CrossRef]  

16. R. B. Lehoucq, D. C. Sorensen, and C. Yang, “Arpack users guide: Solution of large scale eigenvalue problems by implicitly restarted arnoldi methods,” (1997).

17. G. H. Golub and H. A. van der Vorst, “Eigenvalue computation in the 20th century,” J. Comput. Appl. Math. 123(1-2), 35–65 (2000). Numerical Analysis 2000. Vol. III: Linear Algebra. [CrossRef]  

18. C. Campos and J. E. Roman, “Strategies for spectrum slicing based on restarted Lanczos methods,” Numer. Algor. 60(2), 279–295 (2012). [CrossRef]  

19. C. Peng and D. Boley, “Methods for large sparse eigenvalue problems from waveguide analysis,” in Proceedings of the 1996 12th Annual Review of Progress in Applied Computational Electromagnetics, (1996), pp. 645–652.

20. M. Yannakakis, “Computing the minimum fill-in is np-complete,” SIAM J. Discrete Math. 2(1), 77–79 (1981). [CrossRef]  

21. P. R. Amestoy, T. A. Davis, and I. S. Duff, “An approximate minimum degree ordering algorithm,” SIAM J. Matrix Anal. Appl. 17(4), 886–905 (1996). [CrossRef]  

22. A. George, “Nested dissection of a regular finite element mesh,” SIAM J. Numer. Anal. 10(2), 345–363 (1973). [CrossRef]  

23. J. R. Gilbert and R. E. Tarjan, “The analysis of a nested dissection algorithm,” Numer. Math. 50(4), 377–404 (1986). [CrossRef]  

24. P. Agarwal and E. Olson, “Variable reordering strategies for slam,” in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, (2012), pp. 3844–3850.

25. G. Karypis and V. Kumar, “MeTis: Unstructured Graph Partitioning and Sparse Matrix Ordering System,” Version 4.0, http://www.cs.umn.edu/metis (2009).

26. A. Taflove, A. Oskooi, and S. Johnson, Advances in FDTD Computational Electrodynamics: Photonics and Nanotechnology (Artech House, 2013).

27. G. Veronis and S. Fan, “Guided subwavelength plasmonic mode supported by a slot in a thin metal film,” Opt. Lett. 30(24), 3359–3361 (2005). [CrossRef]  

28. A. Nicolet and C. Geuzaine, “Waveguide propagation modes and quadratic eigenvalue problems,” in 6th International Conference on Computational Electromagnetics, (2006), pp. 1–3.

29. N. Z. Zhao, I. A. D. Williamson, Z. Zhao, S. Boutami, and S. Fan, “Penetration depth reduction with plasmonic metafilms,” ACS Photonics 6(8), 2049–2055 (2019). [CrossRef]  

30. N. Zhao, I. A. D. Williamson, Z. Zhao, S. Boutami, and S. Fan, “Penetration depth engineering in plasmonic metafilms for enhanced reflection and confinement,” in Conference on Lasers and Electro-Optics, (Optica Publishing Group, 2020), p. FF3E.7.

31. A. F. Oskooi, L. Zhang, Y. Avniel, and S. G. Johnson, “The failure of perfectly matched layers, and towards their redemption by adiabatic absorbers,” Opt. Express 16(15), 11376–11392 (2008). [CrossRef]  

32. C. M. Lalau-Keraly, S. Bhargava, O. D. Miller, and E. Yablonovitch, “Adjoint shape optimization applied to electromagnetic design,” Opt. Express 21(18), 21693 (2013). [CrossRef]  

33. S. Boutami, N. Zhao, and S. Fan, “Large permittivity increments for efficient predictive photonic devices optimization,” in High Contrast Metastructures IX, vol. 11290C. J. Chang-Hasnain, A. Faraon, and W. Zhou, eds., International Society for Optics and Photonics (SPIE, 2020), pp. 55–67.

34. S. Boutami, N. Zhao, and S. Fan, “Determining the optimal learning rate in gradient-based electromagnetic optimization using the shanks transformation in the lippmann–schwinger formalism,” Opt. Lett. 45(3), 595–598 (2020). [CrossRef]  

35. N. Z. Zhao, S. Boutami, and S. Fan, “Efficient method for accelerating line searches in adjoint optimization of photonic devices by combining schur complement domain decomposition and born series expansions,” Opt. Express 30(4), 6413–6424 (2022). [CrossRef]  

36. N. Z. Zhao, S. Boutami, and S. Fan, “Accelerating adjoint variable method based photonic optimization with Schur complement domain decomposition,” Opt. Express 27(15), 20711 (2019). [CrossRef]  

37. N. Zhao, S. Verweij, W. Shin, and S. Fan, “Accelerating convergence of an iterative solution of finite difference frequency domain problems via schur complement domain decomposition,” Opt. Express 26(13), 16925–16939 (2018). [CrossRef]  

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Cited By

Optica participates in Crossref's Cited-By Linking service. Citing articles from Optica Publishing Group journals and other participating publishers are listed here.

Alert me when this article is cited.


Figures (8)

Fig. 1.
Fig. 1. Two dimensional finite difference grid with a five point stencil. a) A grid truncated with the periodic boundary condition. The boundary connections which connect left boundary to right and top to bottom are shown as trailing black edges. b) Same grid with the Dirichlet boundary condition.
Fig. 2.
Fig. 2. a) Optimized grid with reduced connectivity of the boundary. b) Same as a) with the nodes color coded by their degree (deg) in the graph representation.
Fig. 3.
Fig. 3. In the first column of Fig. 3, we show the sparsity pattern of $\hat A$ corresponding to the finite difference discretization of the $10\times 10$ grid with the a) periodic, d) Dirichlet boundary, and g) modified Dirichlet boundary with reduced connectivity at the boundary respectively. In the second column, we show the corresponding $\hat L$ factors for factorizing these three matrices as is. In the third column, we show corresponding $\hat L$ factors for factorizing these three matrices using a minimum degree heuristic to reorder the matrices. The number of nonzero elements (nnz) for each matrix is also provided in each figure.
Fig. 4.
Fig. 4. a) Field solution for $H_z$ for Eq. (9) with the periodic boundary. The periodic boundary is denoted by a red line. b) Field solution on the same grid but with the Dirichlet boundary condition (shown in orange). c) Comparison of the field distribution along the cyan lines in a) and b). The transparent gray regions in (a) and (b) denote where the PML lies.
Fig. 5.
Fig. 5. a) Fill-in pattern of the $\hat L$ factor for the system shown in Fig. 4 with the Dirichlet boundary. b) Fill-in pattern for the $\hat L$ factor for the system with the periodic boundary. c) The percentage reduction of the number of nonzeros in $\hat L$ factor in going from the periodic to the modified Dirichlet boundary condition for varying values of $N_x = N_y$.
Fig. 6.
Fig. 6. a) Schematic of the waveguide structure and the computational domain. The violet regions consist of a metal described by the Drude model with $\epsilon (\omega ) = 1 - \omega ^2/(\omega _p^2-i\gamma \omega )$, with $\omega _p = 0.72\pi \times 10^{15}$ rad/s, $\gamma = 5.5\times 10^{12}$ $s^{-1}$. The yellow regions consist of a dielectric material with a relative permittivity of $16$. The blue region is vacuum. The gray regions are the PML regions. The structure is periodic along the $x$-direction. Plotted here are two unit cells of the structure. b) $Re(H_z)$ modal profile of the structure with a Dirichlet boundary at the top and bottom edges of the computational domain at an operating frequency corresponding to a free space wavelength of 2 $\mu$m. b) The corresponding result with the periodic boundary at the top and bottom edges of the computational domain.
Fig. 7.
Fig. 7. Band structures obtained with a) the modified Dirichlet boundary, b) with the periodic boundary condition. The blue and red dots correspond to the real and imaginary parts of the $k_x$, respectively.
Fig. 8.
Fig. 8. a) Sparsity pattern of $\hat B -\sigma \hat D$ with $\sigma = 0$ in Eq. (11) of the eigenvalue problem. The gray lines partition the matrix into $N_xN_y \times N_xN_y$ sub-blocks corresponding to those in Eqs. (17) and (16). b) The corresponding sparsity pattern of the $\hat L$ factor with the modified Dirichlet boundary condition. c) The corresponding sparsity pattern of the $\hat L$ factor for the periodic boundary condition. In the second row d)-f), we show the analogous result with $\sigma = 1$ ($\sigma$ in units of $\lambda _p$, the free-space wavelength at the plasma frequency).

Equations (17)

Equations on this page are rendered with MathJax. Learn more.

× 1 ϵ r ( r ) × H ( r ) ω 2 μ 0 H ( r ) = i ω M ( r )
A ^ x = b .
x = U ^ 1 ( L ^ 1 b ) .
× 1 ϵ r ( r ) × H ( r ) = ω 2 μ 0 H ( r ) .
A ^ x = λ x
( A ^ σ I ^ ) 1 x = β x
[ p p p p p p p p p p p p p ] = [ p p p p p p p p p p p p p p p ] [ 1 p p p p 1 p p p 1 p p 1 p 1 ]
[ p p p p p p p p p p p p p ] = [ p p p p p p p p p ] [ 1 p 1 p 1 p 1 p 1 ] .
( x 1 ϵ r ( x , y ) x + y 1 ϵ r ( x , y ) y + μ 0 ω 2 ) H z = i ω M z
1 μ 0 ( x ( 1 ϵ r ) x + y ( 1 ϵ r ) y + ω 2 μ 0 + i k x ( x ϵ r 1 + ϵ r 1 x ) ) H z e i k x = 1 μ 0 ϵ r k x 2 H z e i k x
M ^ v + λ 2 K ^ v + λ C ^ v = 0
K ^ = 1 μ 0 ϵ r
M ^ = 1 μ 0 ( x ( 1 ϵ r ) x + y ( 1 ϵ y ) y + ω 2 )
C ^ = i μ 0 ( x ϵ r 1 + ϵ r 1 x ) .
D ^ λ w = B ^ w
D ^ = [ K ^ 0 ^ 0 ^ I ^ ]
B ^ = [ C ^ M ^ I ^ 0 ^ ] .
Select as filters


Select Topics Cancel
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.