Robust fast direct integral equation solver for quasi-periodic scattering problems with a large number of layers

Min Hyung Cho; Alex H. Barnett

doi:10.1364/OE.23.001775

1. Introduction

Periodic geometries (such as diffraction gratings and antennae) and multilayered media (such as dielectric mirrors) are both essential for the manipulation of waves in modern optical and electromagnetic devices. In an increasing number of applications these features occur in tandem: for instance, performance of a grating with transverse periodicity will be enhanced by using many layers of different indices. Dielectric gratings for high-powered laser [1, 2] or wideband [3] applications rely on structures with up to 50 dielectric layers. Efficient thin-film photovoltaic cells exploit multiple layers of silicon, transparent conductors, and dielectrics, which are patterned to enhance absorption [4, 5]. For solar thermal power, efficient visible-light absorbers which reflect in the infrared require patterning of several layers [6]. Related wave scattering problems appear in photonic crystals [7], process control in semiconductor lithography [8], in the electromagnetic characterization of increasingly multilayered integrated circuits, and in models for underwater acoustic wave propagation. In most of these settings, it is common to solve for the scattering for a large number of incident angles and/or wavelengths, then repeat this inside a design optimization loop. Thus, a robust and efficient solver is crucial. We present such a solver, which scales optimally with respect to the number of layers.

Let us describe the geometry of the problem (Fig. 1(a)). Consider I interfaces Γ_i, each of which has the same periodicity d in the horizontal (x) direction. The interfaces lie between I + 1 homogeneous material layers, each filling a domain Ω_i ⊂ ℝ², i = 1 ...,I + 1. The ith layer lies between Γ_i and Γ_i−1, whilst the top and bottom layers are semi-infinite. The wavenumber will be k_i in the ith layer. A plane wave is incident in the uppermost layer,

u^{inc} (r) = {\begin{array}{l} e^{i k \cdot r}, & r \in Ω_{1} \\ 0, & otherwise \end{array}

with wavevector k = (k₁ cosθ^inc, k₁ sinθ^inc), at angle −π < θ^inc < 0. The incident wave is quasi-periodic (periodic up to a phase), meaning u^inc(x + d, y) = αu^inc(x, y) for all (x, y) ∈ ℝ², where the Bloch phase (phase factor associated with translation by one unit cell) is

α : = e^{i d k_{1} cos θ^{inc}} .

Note that α is controlled by the period and indicent wave alone. We will seek a solution sharing this quasi-periodic symmetry.

Fig. 1 (a) Geometry of scattering problem. The periodicity is in the horizontal direction, with one period lying between the vertical blue dotted lines. Γ_i for i = 1,...,I are the material interfaces. The medium is uniform in the ith layer, which lies above the ith interface and has wavenumber k_i. Our algorithm also uses: L_i and R_i which are the left and right walls of one period Ω_i of the ith layer, P_i the proxy circle for this layer, and U, D the upper and lower fictitious interfaces (at y = y_U and y = y_D) where the radiation condition is applied. (b) Zoom of the top part of the geometry, showing quadrature nodes for Nyström method and collocation (for clarity, less nodes are shown than actually used), including the near-field neighboring copies.

Download Full Size | PDF

As is standard for scattering theory [9], the incident wave causes a scattered wave u to be generated, and the physical wave is their total u^inc + u. The scattered wave is given by solving the following boundary value problem (BVP). We have the Helmholtz equation in each layer,

Δ u_{i} (r) + k_{i}^{2} u_{i} (r) = 0, r = (x, y) \in Ω_{i}

where we write u_i for the scattered wave in the ith layer, and the following interface, boundary, and radiation conditions:

Continuity of the value and derivative the total wave on each interface, i.e. $u_{1} - u_{2} = - u^{inc} and \frac{\partial u_{1}}{\partial n} - \frac{\partial u_{2}}{\partial n} = - \frac{\partial u^{inc}}{\partial n} on Γ_{1},$ $u_{i} - u_{i + 1} = 0 and \frac{\partial u_{i}}{\partial n} - \frac{\partial u_{i + 1}}{\partial n} = 0 on Γ_{i}, i = 2, 3, \dots I .$
Quasi-periodicity in all layers, i.e. for all i = 1,..., I + 1, $u_{i} (x + d, y) = α u_{i} (x, y), for all (x, y) \in ℝ^{2} .$
Outgoing radiation conditions in u₁ and u_I+1, namely the uniform convergence of Rayleigh–Bloch expansions [10] in the upper and lower half-spaces, $u_{1} (x, y) = \sum_{n \in ℤ} a_{n}^{U} e^{i κ_{n} x} e^{i k_{n}^{U} (y - y_{U})} for y \geq y_{U},$ $u_{I + 1} (x, y) = \sum_{n \in ℤ} a_{n}^{D} e^{i κ_{n} x} e^{i k_{n}^{D} (- y + y_{D})} for y \leq y_{D},$ where the horizontal wavenumbers in the modal expansion are $κ_{n} : = k_{1} cos θ^{inc} + \frac{2 π n}{d},$ and the upper and lower vertical wavenumbers are $k_{n}^{U} = \sqrt{k_{1}^{2} - κ_{n}^{2}}$ and $k_{n}^{D} = \sqrt{k_{I + 1}^{2} - κ_{n}^{2}}$ respectively, with all signs taken as positive real or imaginary. The complex coefficients $a_{n}^{U}$ and $a_{n}^{D}$ are the Bragg diffraction amplitudes of the reflected and transmitted nth order; only the orders for which $k_{n}^{U} > 0$ or $k_{n}^{D} > 0$ carry power away from the system, and they characterize the far-field diffraction pattern.

The above BVP describes time-harmonic acoustics with layers of different sound speeds (but the same densities) [9], or the time-harmonic full Maxwell equations in the case of a z-invariant multilayer dielectric in TM polarization, as we now briefly review; see [11, 12]. (The modification to the interface conditions for differing densities or TE polarization are simple, and we will not present them.) Letting the time dependence be e^−iωt, Maxwell’s equations state that the divergence-free vector fields E and H satisfy

\nabla \times E = i ω μ H,

\nabla \times H = - i ω ε E .

Writing E = (0, 0, u) and H = (H_x, H_y, 0), and eliminating H gives in the ith layer the Helmholtz PDE (3) with wavenumber

k_{i} = ω \sqrt{ε_{i} μ_{i}}

, where ε_i and μ_i are the permittivity and permeability of the layer. Continuity of transverse E and H then gives the interface conditions (4)–(5). Note that all of the layer wavenumbers k_i are scaled linearly by the overall frequency ω.

The BVP (3)–(8) has a solution for all parameters [10, Thm. 9.2]. The solution is unique at all but a discrete set of frequencies ω when θ^inc is fixed [10, Thm. 9.4]; these frequencies correspond to guided modes of the dielectric structure, where resonance makes the physical problem ill-posed. They are distinct from (but in the literature sometimes confused with) Wood anomalies [13], which are scattering parameters (θ^inc, ω) for which $k_{U}^{n} = 0$ or $k_{D}^{n} = 0$ for some n, making the upper or lower nth Rayleigh–Bloch mode a horizontally traveling plane wave. A Wood anomaly does not prevent the solution from being unique, although it does become arbitrarily sensitive to changes in θ^inc or ω. For more detail see [14, 15], and the extensive review in the three-dimensional (3D) case by Shipman [16].

There are many low-order numerical methods used to solve multilayer scattering problems, which in test problems may only agree to 1 digit of accuracy [17]. Finite difference time-domain (FDTD) [18, 19] is easy to code but has dispersion errors, and requires artificial absorbing boundary conditions and arbitrarily long settling times near resonances. Direct discretization in the frequency-domain, as in finite difference (FD) or finite element methods, is also possible [20, 21, 8], although they require a large number of unknowns, and, as the frequency grows, “pollution” error means that an increasing number of degrees of freedom per wavelength is needed [22]. They also demand artificial absorbing boundary conditions (perfectly matched layers). The rigorous-coupled wave analysis (RCWA) or Fourier Modal Method is specially designed for multilayer gratings [23]; to overcome slow convergence a Fourier factorization method is needed [24, 25]. However, RCWA is not easy to apply for arbitrary shapes (relying on an intrinsically low-order “staircase” approximation of layer shapes), nor to generalize to 3D. Other methods include volume integral equations [3, 26], for which it is hard to exceed low-order convergence. In general, when the layers are strictly planar, the problem becomes 1D [27] and the (unstable) transfer matrix and (stable) scattering matrix approaches are standard [28]. However, we are concerned with interfaces of arbitrary shape.

Since the medium is piecewise constant, boundary integral equations (BIE) formulated on the interfaces are natural and mathematically rigorous [9, 29, 30]. By exploiting the reduced dimensionality, the number of unknowns is much reduced, and high-order quadratures exist, in 2D [31, 32] but also in 3D. Combined with fast algorithms for handling the resulting linear systems, this leads to much higher efficiency and accuracy than FD methods, and has started to be used effectively in periodic problems [30, 33, 34, 35, 36, 37, 38]. In the setting of quasi-periodic scattering, the interfaces are unbounded and the usual approach is to replace the 2D free-space kernel (Green’s function) for waves in layer i,

G^{i} (r, r^{'}) : = \frac{i}{4} H_{0}^{(1)} (k_{i} ‖ r - r^{'} ‖)

(where

H_{0}^{(1)}

is the Hankel function of order zero [39]), by the quasi-periodic one obeying (6), in which case the problem may be formulated on a single period of the interface. The usual quasi-periodic Green’s function for layer i is

G_{qp}^{i} (r, r^{'}) : = \sum_{l \in ℤ} α^{l} G^{i} (r, r^{'} + l d), where d : = (d, 0),

a sum whose slow convergence renders it computationally useless. Thus a large industry has been built around efficient evaluation of G_qp using convergence acceleration, Ewald’s method, or lattice sums [40]. It is convenient to expand slightly the definition of Wood anomaly, as follows.

Definition 1. We say that layer i is at a Wood anomaly if either κ_n = k_i or κ_n = −k_i (or both) for some n ∈ ℤ.

The problem then with G_qp-based methods is that (12) does not exist (the sum diverges) whenever the ith layer is at a Wood anomaly. As the number of different materials increases in a structure, the chances of some layer hitting (or being close to) a Wood anomaly, and thus of failure, increases.

Two classes of solutions to this non-robustness problem have recently been introduced:

Replace (12) by the quasi-periodic Green’s function for the Dirichlet [30] or impedance half-plane problems [41], or their generalization to larger numbers of images [42, 43].
Return to the free-space Green’s function (11) for the central unit cell plus immediate neighbors, plus a new representation of far-field contributions, imposing the quasi-periodicity condition in the least-squares sense via additional rows in the linear system [44, 15, 37].

In the multilayer dielectric setting robustness using class I would require the impedance Green’s function, which is difficult to evaluate; while the class II contour-integral approach of Barnett–Greengard [15] does not generalize well to multiple layers.

In this paper we introduce a simpler class II BIE method which combines the free-space Green’s function for the unit cell and neighbors, with a ring of proxy point sources (i.e. the method of fundamental solutions, or MFS [45, 46]) to represent the far-field contributions which “periodize” the field. This combines ideas in [44, Sec. 3.2] and the fast direct solver community [47, Sec. 5], and has been independently proposed recently for Laplace problems by Gumerov–Duraiswami [48]. Related representations have been used for some time in the engineering community [49]. A modern interpretation of the key idea is that the far-field contribution is smooth—the interaction between distant periodic copies and the central unit cell is low rank; eg see [37]—and hence only a small number of proxy points is needed, at least if the frequency is not too high.

Remark 1. Conveniently, in our new formulation we can take Schur complements to eliminate the proxy strength unknowns for each layer without recreating (12) and its associated Wood anomaly problem, as would happen in prior methods [44, 15] (see [44, Remark 8]). The difference is that in [15] both upward and downward radiation conditions (of the type (7) and (8)) are imposed on the Green’s function, making it equivalent to (12), whereas we do not impose any radiation condition in the finite-thickness layers, and impose outgoing conditions only in the semi-infinite layers 1 and I +1. Since non-divergent Green’s functions do exist which satisfy these minimal radiation conditions, they are selected by the least-squares linear algebra in the Schur complement (see Sec. 3.2).

We present our new representation and its discretization in Sec. 2, then combine it in Sec. 3 with a direct solver which has two steps: Schur complements to eliminate the proxy unknowns, followed by direct block-tridiagonal factorization. The tridiagonal structure arises simply because layer i couples only with layers i − 1 and i + 1. The overall scaling is 𝒪(IN³), i.e. linear in the number of layers and cubic in N the number of unknowns per layer. This allows our solver to tackle problems with NI of order 10⁶ unknowns in only a few minutes. Since the solution is direct, as explained in Sec. 3.4 we can solve new incident waves that share the same α without extra effort, and, by reusing matrix blocks, handle other α values efficiently. We test the solver’s error and speed performance with a variety of interface shapes, with up to I = 1000 layers, with random or periodic permittivities, and multiple incident angles including a Wood anomaly, in Sec. 4. We conclude with a summary and implications for future work in Sec. 5.

2. Boundary integral formulation, periodizing scheme, and its discretization

We now reformulate the BVP as a system of linear second-kind integral equations on the interfaces Γ_i, i = 1,...,I which lie in a single unit cell, coupled with linear conditions on fictitious unit cell walls; the complete system will be summarized by (53) below. A little extra geometry notation is needed, as shown in Fig. 1. Let us define the (central) unit cell as the vertical strip of width d lying between x = −d/2 and x = d/2; of course its horizontal displacement is arbitrary. The blue dashed vertical lines ${L_{i}}_{i = 1}^{I + 1}$ and ${R_{i}}_{i = 1}^{I + 1}$ are the left and right boundaries of the layer domains ${Ω_{i}}_{i = 1}^{I + 1}$ lying inside the unit cell. The proxy points for layer i lie on the circle P_i (shown by red dotted lines). The magenta dashed lines U and D are fictitious interfaces for radiation conditions located at y = y_U and y = y_D, touching Ω₁ and Ω_I+1, respectively.

2.1. Representation of the scattered wave

Using (11) we define standard potentials for the Helmholtz equation, the single- and double-layer representations [9] lying on a general curve W, at wavenumber k_i for the ith layer,

(𝒮_{W}^{i} σ) (r) : = \int_{W} G^{i} (r, r^{'}) σ (r^{'}) d s_{r^{'}}, (𝒟_{W}^{i} τ) (r) : = \int_{W} \frac{\partial G^{i}}{\partial n^{'}} (r, r^{'}) τ (r^{'}) d s_{r^{'}}, r \in ℝ^{2}

where n′ is the unit normal on the curve W at r′, and ds the arclength element. Shortly we will set W to be either Γ_i−1 or Γ_i, with the normals pointing down (into the layer below the interface). Integral representations which include phased contributions from the nearest neighbors are indicated with a tilde,

({\tilde{𝒮}}_{W}^{i} σ) (r) : = \sum_{l = - 1}^{1} α^{l} \int_{W} G^{i} (r + r^{'} + l d) σ (r^{'}) d s_{r^{'}},

({\tilde{𝒟}}_{W}^{i} τ) (r) : = \sum_{l = - 1}^{1} α^{l} \int_{W} \frac{\partial G^{i}}{\partial n^{'}} (r, r^{'} + l d) τ (r^{'}) d s_{r^{'}} .

Let the proxy points ${y_{p}^{i}}_{p = 1}^{P} \in ℝ^{2}$ lie uniformly on the circle P_i of radius R which is centered on the domain Ω_i. As is well known in MFS theory, increasing R allows a higher convergence rate with respect to P [46, Thm. 3]; however, since the proxy points are representing the contributions from far periodic interface copies {−∞,..., −3, −2} and {2, 3,...,∞}, which thus have singularities at |x| > 3d/2, the proxy charge strengths will turn out to be exponentially large [46, Thm. 7] if R exceeds 3d/2 by much. In practice we choose R ∈ [3d/2, 2d]. Note that, should a layer i be very tall (high aspect ratio), its proxy points should instead be chosen on a vertical oval to retain the “shielding” of Ω_i from its far periodic copies. In order to make the proxy representation more robust at high wavenumbers we use a “combined field” approach, choosing the proxy basis functions for ith layer,

ϕ_{p}^{i} (r) : = \frac{\partial G^{i}}{\partial n_{p}} (r, y_{p}^{i}) + i k_{i} G^{i} (r, y_{p}^{i}), r \in Ω_{i}, p = 1, \dots, P

where n_p is the outwards-directed unit normal to the circle P_i at the pth proxy point. This results in smaller coefficients than if monopoles or dipoles alone were used (which can be justified by treating the proxy points as a discrete approximation to a layer potential on P_i, and considering arguments in [50, Sec. 7.1]).

Combining the near-field single- and double-layer potentials and proxy representations in each layer we have, recalling the notation u_i for u in Ω_i,

u_{1} = {\tilde{𝒟}}_{Γ_{1}}^{1} τ_{1} + {\tilde{𝒮}}_{Γ_{1}}^{1} σ_{1} + \sum_{p = 1}^{P} c_{p}^{1} ϕ_{p}^{1}

u_{i} = {\tilde{𝒟}}_{Γ_{i - 1}}^{i} τ_{i - 1} + {\tilde{𝒮}}_{Γ_{i - 1}}^{i} σ_{i - 1} + {\tilde{𝒟}}_{Γ_{i}}^{i} τ_{i} + {\tilde{𝒮}}_{Γ_{i}}^{i} σ_{i} + \sum_{p = 1}^{P} c_{p}^{i} ϕ_{p}^{i}, i = 2, 3, \dots, I

u_{I + 1} = {\tilde{𝒟}}_{Γ_{I}}^{I + 1} τ_{I} + {\tilde{𝒮}}_{Γ_{I}}^{I + 1} σ_{I} + \sum_{p = 1}^{P} c_{p}^{I + 1} ϕ_{p}^{I + 1}

By construction, for all layers i = 1,..., I + 1, and for all density functions σ_i and τ_i and proxy unknown vectors

c^{i} : = {c_{p}^{i}}_{p = 1}^{P}

, this representation satisfies the Helmholtz equations (3). In the following subsections, we describe in turn how to enforce the interface matching, quasi-periodicity, and radiation conditions. Each of these three conditions will comprise a block row of the final linear system (53) that enables us to solve for the densities and proxy unknowns.

2.2. Matching conditions at material interfaces

In this subsection, matching conditions (4) and (5) will be enforced at all material interfaces in a standard Müller–Rokhlin [51, 52] scheme.

In the indirect approach, boundary integral operators arise from the restriction of representations (13) to curves [9]. Following [15] we use notation $S_{V, W}^{i}$ to indicate the single-layer operator at wavenumber k_i from a source curve W to target curve V. Similarly we use $D_{V, W}^{i}$ for the double-layer operator, $D_{V, W}^{i, *}$ for the target-normal derivative of the single-layer operator, and $T_{V, W}^{i}$ for the target-normal derivative of the double-layer operator. As before, we use a tilde to indicate summation over the source curve and its phased nearest neighbors, thus, for a target point x ∈ V,

({\tilde{S}}_{V, W}^{i} σ) (r) : = \sum_{l = - 1}^{1} α^{l} \int_{W} G^{i} (r, r^{'} + l d) σ (r^{'}) d s_{r^{'}},

({\tilde{D}}_{V, W}^{i} τ) (r) : = \sum_{l = - 1}^{1} α^{l} \int_{W} \frac{\partial G^{i}}{\partial n^{'}} (r, r^{'} + l d) τ (r^{'}) d s_{r^{'}},

({\tilde{D}}_{V, W}^{i, *} σ) (r) : = \sum_{l = - 1}^{1} α^{l} \int_{W} \frac{\partial G^{i}}{\partial n^{'}} (r, r^{'} + l d) σ (r^{'}) d s_{r^{'}},

({\tilde{T}}_{V, W}^{i} τ) (r) : = \sum_{l = - 1}^{1} α^{l} \int_{W} \frac{\partial^{2} G^{i}}{\partial n \partial n^{'}} (r, r^{'} + l d) τ (r^{'}) d s_{r^{'}} .

When the target curve is the same as the source (V = W), we note that the single-layer operator is a weakly singular integral operator, that the action of the double-layer and its transpose must be taken in their principal value sense, and that the T operator is hypersingular.

At the first interface Γ₁, u₁ and u₂ are coupled. The functions u₁, u₂, $\frac{\partial u_{1}}{\partial n}$ , and $\frac{\partial u_{2}}{\partial n}$ at Γ₁ can be found by letting r in (17)–(18) approach Γ₁ from the respective side, and using the standard jump relations [9, Thm. 3.1 and p.66] which introduce terms of one half times the density to each D and D* term, giving

u_{1} = - \frac{1}{2} τ_{1} + {\tilde{D}}_{Γ_{1}, Γ_{1}}^{1} τ_{1} + {\tilde{S}}_{Γ_{1}, Γ_{1}}^{1} σ_{1} + \sum_{p = 1}^{P} c_{p}^{1} ϕ_{p}^{1}, on Γ_{1}

u_{2} = \frac{1}{2} τ_{1} + {\tilde{D}}_{Γ_{1}, Γ_{1}}^{2} τ_{1} + {\tilde{S}}_{Γ_{1}, Γ_{1}}^{2} σ_{1} + {\tilde{D}}_{Γ_{1}, Γ_{2}}^{2} τ_{2} + {\tilde{S}}_{Γ_{1}, Γ_{2}}^{2} σ_{2} + \sum_{p = 1}^{P} c_{p}^{2} ϕ_{p}^{2}, on Γ_{1}

\frac{\partial u_{1}}{\partial n} = {\tilde{T}}_{Γ_{1}, Γ_{1}}^{1} τ_{1} + \frac{1}{2} σ_{1} + {\tilde{D}}_{Γ_{1}, Γ_{1}}^{1, *} σ_{1} + \sum_{p = 1}^{P} c_{p}^{1} \frac{\partial ϕ_{p}^{1}}{\partial n}, on Γ_{1}

\frac{\partial u_{2}}{\partial n} = {\tilde{T}}_{Γ_{1}, Γ_{1}}^{2} τ_{1} - \frac{1}{2} σ_{1} + {\tilde{D}}_{Γ_{1}, Γ_{1}}^{2, *} σ_{1} + {\tilde{T}}_{Γ_{1}, Γ_{2}}^{2} τ_{2} + {\tilde{D}}_{Γ_{1}, Γ_{2}}^{2, *} σ_{2} + \sum_{p = 1}^{P} c_{p}^{2} \frac{\partial ϕ_{p}^{2}}{\partial n}, on Γ_{1} .

On this interface only, the matching conditions (4) include the indicent wave, and enforcing them using the above gives the inhomogeneous coupled BIE and functional equations,

\begin{matrix} - τ_{1} + ({\tilde{D}}_{Γ_{1}, Γ_{1}}^{1} - {\tilde{D}}_{Γ_{1}, Γ_{1}}^{2}) τ_{1} + ({\tilde{S}}_{Γ_{1}, Γ_{1}}^{1} - {\tilde{S}}_{Γ_{1}, Γ_{1}}^{2}) σ_{1} \\ - {\tilde{D}}_{Γ_{1}, Γ_{2}}^{2} τ_{2} - {\tilde{S}}_{Γ_{1}, Γ_{2}}^{2} σ_{2} + \sum_{p = 1}^{P} {(c_{p}^{1} ϕ_{p}^{1} - c_{p}^{2} ϕ_{p}^{2}) |}_{Γ_{1}} = - {u^{inc} |}_{Γ_{1}}, \end{matrix}

\begin{matrix} ({\tilde{T}}_{Γ_{1}, Γ_{1}}^{1} - {\tilde{T}}_{Γ_{1}, Γ_{1}}^{2}) τ_{1} + σ_{1} + ({\tilde{D}}_{Γ_{1}, Γ_{1}}^{1, *} - {\tilde{D}}_{Γ_{1}, Γ_{1}}^{2, *}) σ_{1} \\ - {\tilde{T}}_{Γ_{1}, Γ_{2}}^{2} τ_{2} - {\tilde{D}}_{Γ_{1}, Γ_{2}}^{2, *} σ_{2} + \sum_{p = 1}^{P} {(c_{p}^{1} \frac{\partial ϕ_{p}^{1}}{\partial n} - c_{p}^{2} \frac{\partial ϕ_{p}^{2}}{\partial n}) |}_{Γ_{1}} = - {\frac{\partial u^{inc}}{\partial n} |}_{Γ_{1}} . \end{matrix}

Note that the half density terms added to give a −τ₁ in the first equation and a +σ₁ in the second; these terms appear for every layer and will cause the BIE to be of Fredholm second kind.

On the middle interfaces Γ_i, i = 2,...,I − 1, we similarly match u_i and u_i+1 and their normal derivatives, noting that now there is coupling to both the above and below interfaces, but no effect of the incident wave, to get

\begin{array}{l} - τ_{i} + ({\tilde{D}}_{Γ_{i}, Γ_{i}}^{i} - {\tilde{D}}_{Γ_{i}, Γ_{i}}^{i + 1}) τ_{i} + ({\tilde{S}}_{Γ_{i}, Γ_{i}}^{i} - {\tilde{S}}_{Γ_{i}, Γ_{i}}^{i + 1}) σ_{i} + {\tilde{D}}_{Γ_{i}, Γ_{i - 1}}^{i} τ_{i - 1} + {\tilde{S}}_{Γ_{i}, Γ_{i - 1}}^{i} σ_{i - 1} \\ - {\tilde{D}}_{Γ_{i}, Γ_{i + 1}}^{i + 1} τ_{i + 1} - {\tilde{S}}_{Γ_{i}, Γ_{i + 1}}^{i + 1} σ_{i + 1} + \sum_{p = 1}^{P} {(c_{p}^{i} ϕ_{p}^{i} - c_{p}^{i + 1} ϕ_{p}^{i + 1}) |}_{Γ_{i}} = 0, \end{array}

\begin{array}{l} ({\tilde{T}}_{Γ_{i}, Γ_{i}}^{i} - {\tilde{T}}_{Γ_{i}, Γ_{i}}^{i + 1}) τ_{i} + σ_{i} + ({\tilde{D}}_{Γ_{i}, Γ_{i}}^{i, *} - {\tilde{D}}_{Γ_{i}, Γ_{i}}^{i + 1, *}) σ_{i} + {\tilde{T}}_{Γ_{i}, Γ_{i - 1}}^{i} τ_{i - 1} + {\tilde{D}}_{Γ_{i}, Γ_{i - 1}}^{i, *} σ_{i - 1} \\ - {\tilde{T}}_{Γ_{i}, Γ_{i + 1}}^{i + 1} τ_{i + 1} - {\tilde{D}}_{Γ_{i}, Γ_{i + 1}}^{i + 1, *} σ_{i + 1} + \sum_{p = 1}^{P} {(c_{p}^{i} \frac{\partial ϕ_{p}^{i}}{\partial n} - c_{p}^{i + 1} \frac{\partial ϕ_{p}^{i + 1}}{\partial n}) |}_{Γ_{i}} = 0 . \end{array}

On the bottom interface Γ_I, the only change is the absence of coupling from any lower interface, so,

\begin{array}{l} - τ_{I} + ({\tilde{D}}_{Γ_{I}, Γ_{I}}^{I} - {\tilde{D}}_{Γ_{I}, Γ_{I}}^{I + 1}) τ_{I} + ({\tilde{S}}_{Γ_{I}, Γ_{I}}^{I} - {\tilde{S}}_{Γ_{I}, Γ_{I}}^{I + 1}) σ_{I} \\ + {\tilde{D}}_{Γ_{I}, Γ_{I - 1}}^{I} τ_{I - 1} + {\tilde{S}}_{Γ_{I}, Γ_{I - 1}}^{I} σ_{I - 1} + \sum_{p = 1}^{P} {(c_{p}^{I} ϕ_{p}^{I} - c_{p}^{I + 1} ϕ_{p}^{I + 1}) |}_{Γ_{I}} = 0, \end{array}

\begin{array}{l} ({\tilde{T}}_{Γ_{I}, Γ_{I}}^{I} - {\tilde{T}}_{Γ_{I}, Γ_{I}}^{I + 1}) τ_{I} + σ_{I} + ({\tilde{D}}_{Γ_{I}, Γ_{I}}^{I, *} - {\tilde{D}}_{Γ_{I}, Γ_{I}}^{I + 1, *}) σ_{I} \\ + {\tilde{T}}_{Γ_{I}, Γ_{I - 1}}^{I} τ_{I - 1} - {\tilde{D}}_{Γ_{I}, Γ_{I + 1}}^{I, *} σ_{I - 1} + \sum_{p = 1}^{P} {(c_{p}^{I} \frac{\partial ϕ_{p}^{I}}{\partial n} - c_{p}^{I + 1} \frac{\partial ϕ_{p}^{I + 1}}{\partial n}) |}_{Γ_{I}} = 0 . \end{array}

We wish to write these in a more compact form, hence we pair up double- and single-layer densities, then stack them into a single column vector,

η : = {[η_{1}, η_{2}, \dots, η_{I}]}^{T}, where η_{i} : = [\begin{matrix} τ_{i} \\ σ_{i} \end{matrix}], i = 1, 2, \dots, I .

Similarly we stack the proxy strength vectors

c^{i} = {c_{p}^{i}}_{p = 1}^{P}

, and form a vector of right-hand side functions,

c = {[c^{1}, c^{2}, \dots, c^{I + 1}]}^{T}, f = {[{- u^{inc} |}_{Γ_{1}}, {- \frac{\partial u^{inc}}{\partial n} |}_{Γ_{1}}, 0, \dots, 0]}^{T} .

Then all of the coupled BIEs and functional equations (28)–(33) can be compactly grouped into the matrix-type notation,

A η + Bc = f,

where A is a I-by-I matrix, each of whose entries A_i,j is a 2 × 2 block of operators which maps η_j to a pair of functions (i.e. values then normal derivatives) on Γ_i. Every block of A is zero apart from the following tridiagonal entries,

\begin{array}{l} A_{i, i} & = & [\begin{matrix} - I + ({\tilde{D}}_{Γ_{i}, Γ_{i}}^{i} - {\tilde{D}}_{Γ_{i}, Γ_{i}}^{i + 1}) & ({\tilde{S}}_{Γ_{i}, Γ_{i}}^{i} - {\tilde{S}}_{Γ_{i}, Γ_{i}}^{i + 1}) \\ ({\tilde{T}}_{Γ_{i}, Γ_{i}}^{i} - {\tilde{T}}_{Γ_{i}, Γ_{i}}^{i + 1}) & I + ({\tilde{D}}_{Γ_{i}, Γ_{i}}^{i, *} - {\tilde{D}}_{Γ_{i}, Γ_{i}}^{i + 1, *}) \end{matrix}], & i = 1, 2, \dots, I, \\ A_{i, i + 1} & = & [\begin{matrix} - {\tilde{D}}_{Γ_{i}, Γ_{i + 1}}^{i + 1} & - {\tilde{S}}_{Γ_{i}, Γ_{i + 1}}^{i + 1} \\ - {\tilde{T}}_{Γ_{i}, Γ_{i + 1}}^{i + 1} & - {\tilde{D}}_{Γ_{i}, Γ_{i + 1}}^{i + 1, *} \end{matrix}], & i = 1, 2, \dots, I - 1, \\ A_{i, i - 1} & = & [\begin{matrix} {\tilde{D}}_{Γ_{i}, Γ_{i - 1}}^{i} & {\tilde{S}}_{Γ_{i}, Γ_{i - 1}}^{i} \\ {\tilde{T}}_{Γ_{i}, Γ_{i - 1}}^{i} & {\tilde{D}}_{Γ_{i}, Γ_{i - 1}}^{i, *} \end{matrix}], & i = 2, 3, \dots, I, \end{array}

where I is the identity operator. B is an I-by-(I + 1) matrix, each of whose entries B_i,j is a stack of P continous function columns (sometimes called a quasi-matrix) expressing the effect of each proxy point strength

c_{p}^{j}

on the value and normal derivative functions on Γ_i. The only nonzero blocks of B are

B_{i, i} = [\begin{matrix} {ϕ_{1}^{i} |}_{Γ_{i}}, & \dots, & {ϕ_{P}^{i} |}_{Γ_{i}} \\ {\frac{\partial ϕ_{1}^{i}}{\partial n} |}_{Γ_{i}}, & \dots, & {\frac{\partial ϕ_{P}^{i}}{\partial n} |}_{Γ_{i}} \end{matrix}], B_{i, i + 1} = [\begin{matrix} {- ϕ_{1}^{i + 1} |}_{Γ_{i}}, & \dots, & {- ϕ_{P}^{i + 1} |}_{Γ_{i}} \\ {- \frac{\partial ϕ_{1}^{i + 1}}{\partial n} |}_{Γ_{i}}, & \dots, & {- \frac{\partial ϕ_{P}^{i + 1}}{\partial n} |}_{Γ_{i}} \end{matrix}], i = 1, 2 \dots, I

The term Aη in (36) is precisely (barring the summation over neighbors) the Müller–Rokhlin formulation [51, 52] for multiple material interfaces. This is of Fredholm second kind since the off-diagonal blocks in (37) have continuous kernels, and cancellation of the leading singularities occurs in the pairs in parentheses in (37), leaving the diagonal operators at most weakly singular, hence compact.

2.3. Imposing the quasi-periodicity conditions

Quasi-periodicity (6) will be enforced in each layer by matching both values and normal derivatives between the left L_i and right R_i = L_i + d walls. Since the PDE is second-order, matching two functions (values and normal derivatives) is sufficient Cauchy data to guarantee extension as a quasi-periodic solution.

We evaluate the first layer representation (17) on the walls, and exploit the following simplification due to translational symmetry (as in [15, 44]) which cancels six terms (three from each near-field sum) down to two,

\begin{array}{l} α^{- 1} {u_{1} |}_{R_{1}} - {u_{1} |}_{L_{1}} \\ = α^{- 1} ({\tilde{D}}_{R_{1}, Γ_{1}}^{1} τ_{1} + {\tilde{S}}_{R_{1}, Γ_{1}}^{1} σ_{1} + \sum_{p = 1}^{P} c_{p}^{1} {ϕ_{p}^{1} |}_{R_{1}}) - ({\tilde{D}}_{L_{1}, Γ_{1}}^{1} τ_{1} + {\tilde{S}}_{L_{1}, Γ_{1}}^{1} σ_{1} + \sum_{p = 1}^{P} c_{p}^{1} {ϕ_{p}^{1} |}_{L_{1}}) \\ = (α^{- 2} D_{R_{1} + d, Γ_{1}}^{1} - α D_{L_{1} - d, Γ_{1}}^{1}) τ_{1} + (α^{- 2} S_{R_{1} + d, Γ_{1}}^{1} - α S_{L_{1} - d, Γ_{1}}^{1}) σ_{1} \\ + \sum_{p = 1}^{P} (α^{- 1} {ϕ_{p}^{1} |}_{R_{1}} - {ϕ_{p}^{1} |}_{L_{1}}) c_{p}^{1} \end{array}

For quasi-periodicity we wish this function to vanish, so we make it the first operator block row of a homogeneous linear system. Doing the same for the normal derivatives on the L₁ and R₁ walls, and then for similar conditions for all other layers i = 2,...,I + 1, gives equations that can be written compactly with a matrix notation as follows:

C η + Qc = 0

where C is an (I + 1)-by-(I) matrix, each entry of which is a 2 × 2 block of operators mapping interface densities to wall values and normal derivatives. Every block of C is zero apart from the bidiagonal blocks,

C_{i, i} = [\begin{array}{l} α^{- 2} D_{R_{i} + d, Γ_{i}}^{i} - α D_{L_{i} - d, Γ_{i}}^{i} & α^{- 2} S_{R_{i} + d, Γ_{i}}^{i} - α S_{L_{i} - d, Γ_{i}}^{i} \\ α^{- 2} T_{R_{i} + d, Γ_{i}}^{i} - α T_{L_{i} - d, Γ_{i}}^{i} & α^{- 2} D_{R_{i} + d, Γ_{i}}^{i, *} - α D_{L_{i} - d, Γ_{i}}^{i, *} \end{array}]

C_{i, i - 1} = [\begin{array}{l} α^{- 2} D_{R_{i} + d, Γ_{i - 1}}^{i} - α D_{L_{i} - d, Γ_{i - 1}}^{i} & α^{- 2} S_{R_{i} + d, Γ_{i - 1}}^{i} - α S_{L_{i} - d, Γ_{i - 1}}^{i} \\ α^{- 2} T_{R_{i} + d, Γ_{i - 1}}^{i} - α T_{L_{i} - d, Γ_{i - 1}}^{i} & α^{- 2} D_{R_{i} + d, Γ_{i - 1}}^{i, *} - α D_{L_{i} - d, Γ_{i - 1}}^{i, *} \end{array}]

for i = 1, 2,···,I and i = 2, 3,···,I + 1, respectively. Q is an (I + 1)-by-(I + 1) matrix, each entry of which is a stack of P function columns (as with B_i,j), but only the diagonal entries are nonzero,

Q_{i, i} = : Q_{i} = [\begin{array}{l} {α^{- 1} ϕ_{1}^{i} |}_{R_{i}} - {ϕ_{1}^{i} |}_{L_{i}}, & \dots, & α^{- 1} {ϕ_{P}^{i} |}_{R_{i}} - {ϕ_{P}^{i} |}_{L_{i}} \\ α^{- 1} {\frac{\partial ϕ_{1}^{i}}{\partial n} |}_{R_{i}} - {\frac{\partial ϕ_{1}^{i}}{\partial n} |}_{L_{i}}, & \dots, & α^{- 1} {\frac{\partial ϕ_{P}^{i}}{\partial n} |}_{R_{i}} - {\frac{\partial ϕ_{P}^{i}}{\partial n} |}_{L_{i}} \end{array}] for i = 1, 2, \dots, I + 1 .

2.4. Imposing the radiation conditions

First we enforce the upward radiation condition (7) at the artificial interface U (with upward-pointing normal), substituting the layer-1 representation (17) to get,

{\tilde{D}}_{U, Γ_{1}}^{1} τ_{1} + {\tilde{S}}_{U, Γ_{1}}^{1} σ_{1} + \sum_{p = 1}^{P} {ϕ_{P}^{1} |}_{U} c_{p}^{1} - \sum_{n \in ℤ} a_{n}^{U} e^{i κ_{n} x} = 0 .

Matching values at U is not enough: we also need to match normal (y) derivatives, to ensure that the second-order PDE solution continues smoothly through U, thus,

{\tilde{T}}_{U, Γ_{1}}^{1} τ_{1} + {\tilde{D}}_{U, Γ_{1}}^{1, *} σ_{1} + \sum_{p = 1}^{P} {\frac{\partial ϕ_{p}^{1}}{\partial n} |}_{U} c_{p}^{1} - \sum_{n \in ℤ} a_{n}^{U} i k_{n}^{U} e^{i κ_{n} x} = 0 .

We will truncate the Rayleigh–Bloch expansion to 2K + 1 terms, from n = −K to K, since it is exponentially convergent once |κ_n| exceeds k₁ (in the upper layer) and k_I+1 (lower layer). We also apply the downward radiation condition (8) at D to the representation (19), giving a second set of homogeneous linear conditions. We choose the normals of U and D both to point in the upward sense. As with η and c, we stack all coefficients into a single vector,

a = {[a^{U}, a^{D}]}^{T} = {[a_{- K}^{U}, \dots, a_{K}^{U}, a_{- K}^{D}, \dots, a_{K}^{D}]}^{T} .

The resulting conditions can again be written in a simple matrix form:

Z η + Vc + Wa = 0,

where

Z = [\begin{matrix} Z_{U} & 0 & \dots & 0 \\ 0 & \dots & 0 & Z_{D} \end{matrix}], V = [\begin{matrix} V_{U} & 0 & \dots & 0 \\ 0 & \dots & 0 & V_{D} \end{matrix}], W = [\begin{matrix} W_{U} & 0 \\ 0 & W_{D} \end{matrix}],

in which

Z_{U} = [\begin{matrix} {\tilde{D}}_{U, Γ_{1}}^{1} & {\tilde{S}}_{U, Γ_{1}}^{1} \\ {\tilde{T}}_{U, Γ_{1}}^{1} & {\tilde{D}}_{U, Γ_{1}}^{1, *} \end{matrix}], Z_{D} = [\begin{matrix} {\tilde{D}}_{D, Γ_{I}}^{I + 1} & {\tilde{S}}_{D, Γ_{I}}^{I + 1} \\ {\tilde{T}}_{D, Γ_{I}}^{I + 1} & {\tilde{D}}_{D, Γ_{I}}^{I + 1, *} \end{matrix}],

V_{U} = [\begin{matrix} {ϕ_{1}^{1} |}_{U}, & \dots, & {ϕ_{P}^{1} |}_{U} \\ {\frac{\partial ϕ_{1}^{1}}{\partial n} |}_{U}, & \dots, & {\frac{\partial ϕ_{P}^{1}}{\partial n} |}_{U} \end{matrix}], V_{D} = [\begin{matrix} {ϕ_{1}^{I + 1} |}_{D}, & \dots, & {ϕ_{P}^{I + 1} |}_{D} \\ {\frac{\partial ϕ_{1}^{I + 1}}{\partial n} |}_{D}, & \dots, & {\frac{\partial ϕ_{P}^{I + 1}}{\partial n} |}_{D} \end{matrix}],

W_{U} = [\begin{matrix} {- e^{i κ_{- K} x} |}_{U}, & \dots, & {- e^{i κ_{K} x} |}_{U} \\ {- i k_{- K}^{U} e^{i κ_{- K} x} |}_{U}, & \dots, & {- i k_{K}^{U} e^{i κ_{K} x} |}_{U} \end{matrix}],

W_{D} = [\begin{matrix} {- e^{i κ_{- K} x} |}_{D}, & \dots, & {- e^{i κ_{K} x} |}_{D} \\ {i k_{- K}^{D} e^{i κ_{- K} x} |}_{D}, & \dots, & {i k_{K}^{D} e^{i κ_{K} x} |}_{D} \end{matrix}] .

To clarify, in W_U and W_D, the 2K + 1 columns are pairs of Fourier functions evaluated over x ∈ (−d/2, d/2), the x-coordinate extent of the lines U and D.

2.5. Discretization of functions and operators

Finally, combining the linear conditions from the previous three subsections, we have the coupled BIE and functional equations,

[\begin{matrix} A & B & 0 \\ C & Q & 0 \\ Z & V & W \end{matrix}] [\begin{matrix} η \\ c \\ a \end{matrix}] = [\begin{matrix} f \\ 0 \\ 0 \end{matrix}],

Recall that η contains unknown density functions, while c and a are discrete coefficient vectors. On the right hand side, f involves functions (from the incident wave), and each 0 is a stack of zero functions. Thus A, C, and Z are blocks of operators, while the other six matrix blocks involve quasi-matrices (stacks of function columns).

For numerical computation the continuous variables must be discretized, turning each function into a discrete set of values, and each operator block into a matrix. This is simple for the functional conditions in the 2nd and 3rd block rows of (53): we just sample at a discrete set of collocation points. For the 2nd block row, we use M_w nodes ${x_{m}^{i}}_{m = 1}^{M_{w}}$ of a Gauss–Legendre quadrature living on the left wall L_i for the ith layer. See Fig. 1(b). Thus each diagonal block of Q (43) is replaced by a 2M_w-by-2P matrix Q_i with elements

{(Q_{i})}_{m p} = {\begin{array}{l} α^{- 1} ϕ_{p}^{i} (x_{m}^{i} + d) - ϕ_{p}^{i} (x_{m}^{i}), & m = 1, \dots, M_{w}, & p = 1, \dots, P \\ α^{- 1} \frac{\partial ϕ_{p}^{i}}{\partial n} (x_{m - M_{w}}^{i} + d) - \frac{\partial ϕ_{p}^{i}}{\partial n} (x_{m - M_{w}}^{i}), & m = M_{w} + 1, \dots, 2 M_{w}, & p = 1, \dots, P \end{array}

Similarly for the 3rd row, we use M equally-spaced (trapezoid rule) nodes

{x_{m}^{U}}_{m = 1}^{M}

on U, and

{x_{m}^{D}}_{m = 1}^{M}

on D. The trapezoid rule is appropriate here since functions will be periodic. Inserting these nodes, the formulae for the matrices V_U, V_D, W_U, and W_D discretizing (50)–(52) are similar to (54) above.

To discretize the remaining blocks A, B, C and Z, we need to fix a set of quadrature nodes ${z_{j}^{i}}_{j = 1}^{N_{i}}$ on each interface Γ_i. These nodes have associated weights ${w_{j}^{i}}_{j = 1}^{N_{i}}$ , such that for any smooth d-periodic function f on Γ_i, the quadrature rule

\int_{Γ_{i}} f (r) d s_{r} \approx \sum_{j = 1}^{N_{i}} f (z_{j}^{i}) w_{j}^{i}

holds to high accuracy. To choose these nodes and weights, we first consider the case of Γ_i a smooth interface (e.g. Γ₁ in Fig. 1(b)). Let one period of the interface be parametrized by a vector function Z(s) for 0 ≤ s < 2π. By changing variable, the periodic trapezoid rule s_j = 2π(j − 1/2)/N_i, j = 1,...,N_i in parameter s gives a quadrature rule

z_{j}^{i} = Z (s_{j})

and

w_{j}^{i} = (2 π / N_{i}) | Z^{'} (s_{j}) |

. Then for (C^∞) smooth d-periodic integrands on Γ_i the error in (55) will be superalgebraically convergent [53, (2.9.16)]. On the other hand, if Γ_i has corners, it breaks into one or more “segments” (e.g. Γ₂ in Fig. 1(b)). These need not be straight lines, merely smooth. Say a segment is again parametrized by a function Z(s) for 0 ≤ s < 2π. Then we reparametrize it via Z(w(s)) using the corner grading function suggested by Kress [31, (6.9)],

w (s) = 2 π \frac{v {(s)}^{q}}{v {(s)}^{q} + v {(2 π - s)}^{q}}, where v (s) = (\frac{1}{q} - \frac{1}{2}) {(\frac{π - s}{π})}^{3} + \frac{1}{q} \frac{s - π}{π} + \frac{1}{2}, 0 \leq s < 2 π,

where q controls the grading at endpoints. Higher q will cause more nodes to be close to the endpoints; typically we choose q = 6 or higher. Let n_i,l be the number of nodes used for the lth segment of Γ_i. Then a separate trapezoid rule s_j = 2πj/n_i,l, j = 1,..., n_i,l is used on each segment, making N_i = ∑_l n_i,l nodes in total. The formula for the nodes and weights (we implement this in the MPSpack command segment(...,’pc’)) are the same as in the smooth case, with the proviso that Z is replaced by the composed function Z ○ w. The grading function, since its derivative vanishes to high order at the endpoints, insures that the trapezoid rule achieves high order accuracy (typically order q). We find this more efficient for up to 10-digit accuracy than dyadically-refined panel quadratures [38].

With interface quadratures defined, blocks B, C, and Z are easy to discretize by restriction of the continuous variable to the set of nodes. For example, each block B_i,i in (38), describing the interaction of the ith proxy basis with Γ_i, is replaced by a 2N_i-by-P matrix B_i,i with elements

{(B_{i, i})}_{j p} = [\begin{array}{l} ϕ_{p}^{i} (z_{j}^{i}), & j = 1, \dots, N_{i}, & p = 1, \dots, P \\ \frac{\partial ϕ_{p}^{i}}{\partial n} (z_{j - N_{i}}^{i}), & j = N_{i} + 1, \dots, 2 N_{i}, & p = 1, \dots, P \end{array}

The matrix for B_i,i+1 is similar. Operator blocks C and Z involve boundary integral representations over interfaces: each integral is replaced by a sum according to (55). For example, the M_w-by-N_i matrix discretizing the upper-right block of C_i,i in (41) has elements

(α^{- 1} G^{i} (x_{m}^{i} + d, z_{j}^{i}) - α G^{i} (x_{m}^{i} - d, z_{j}^{i})) w_{j}^{i}

, for m = 1,..., M_w and j = 1,..., N_i. Other blocks of C and Z are discretized similarly; for the reader’s sanity we refrain from giving all formulae.

Finally we discretize A. The operators in the blocks A_i,j for i ≠ j, and for the neighboring terms l = ±1 in the local sums (20)–(23) even when i = j, involve only interactions between differing interfaces, and thus may be replaced simply by substitution of the native quadrature rule (55) for the sources, and evaluation at discrete target nodes, as above. This method of discretizing an integral operator is called Nyström’s method [54, Sec. 12.2]. This leaves only the self-interaction terms l = 0 in A_i,i, which involve operators that are logarithmically singular. To achieve high-order accuracy for these operators, we use the standard “plain” Nyström matrix entries of the form $A (z_{m}^{i}, z_{j}^{i}) w_{j}^{i}$ , for m, j = 1,..., N_i (here A symbolizes a generic operator block) for entries far from the diagonal. Near-diagonal entries are adjusted by local Lagrange interpolation of the smooth density from the existing periodic trapezoid nodes onto a set of auxiliary nodes special to the logarithmic singularity, due to Alpert [55]; see [32, Sec. 4] for the full recipe. We use 30 auxiliary nodes per target node, which achieves high-order convergence with error $𝒪 (N_{i}^{- 16} log N_{i})$ . With this Nyström matrix, the non-zero right-hand side terms in f become the samples at the first interface nodes $z_{j}^{1}$ and all the unknown vectors η_i become samples of the densities τ_i and σ_i at the nodes of all interfaces.

The size of the resulting matrix A is N_den-by-N_den, where the total number of density unknowns is

N_{den} : = 2 \sum_{i = 1}^{I} N_{i},

the factor of two coming from the two types of layer potential per interface.

Remark 2. The Alpert correction to the periodic trapezoid rule [55] has become a standard option for closed curves [32], on which the kernel and densities are of course periodic. However, our interfaces Γ_i do not close on each other, moreover the solution density η_i is quasi-periodic with Bloch phase α. This means that, when Γ_i is smooth and hence has a single periodic quadrature rule, we must modify the Alpert correction entries in the northeast and southwest corners of the matrix, to account for the continuation of the interface into the next unit cell, and the phase factors α and α⁻¹ (this periodic-segment adjustment of the Alpert correction is in the MPSpack code quadr.alpertizeselfmatrix).

Kress’ periodic logarithmic correction would also be a slightly more accurate option [31] [32, Sec. 6]; however, we found that it was less convenient to adjust this scheme for quasi-periodic densities and open interface segments. (See Meier et al. [30] for an example of this; extra cut-off functions are required.)

We will see that, due to the high-order convergence, the numbers of collocation nodes N_i, n_i,l, M_w, and M can be small, of order a hundred, even for 10-digit accuracies. Note that for smooth interfaces the periodic trapezoid rule is in fact inaccurate for the interactions from neighboring interfaces, e.g. terms like $S_{Γ_{i} \pm d, Γ_{i}}^{i}$ , due to “dangling” ends of these interfaces, but that this is handled by the periodizing scheme, retaining high-order convergence.

To summarize, the linear algebraic system which discretizes (53) has identical structure to (53), but with all of its blocks matrices as constructed above. We notate these blocks using standard (non-bold) font. It will now be rearranged in order to solve it in a fast direct fashion, exploiting its tridiagonal structure.

3. Rearrangement of equations, Schur complement, and fast solver

The discretized BIEs of the previous section require a few hundred to a couple of thousand unknowns per layer, to represent typical geometries as shown in Fig. 1 to accuracies of around 10 digits. The total number of unknowns includes the densities, proxy strengths, and scattered amplitudes, namely

𝒩 = N_{den} + I P + 2 (2 K + 1) .

We will find P ∼ 10², thus there is an order of magnitude less proxy unknowns than density unknowns. In any case, a device with dozens or more layers leads to linear systems that are too large for direct 𝒪(𝒩³) inversion or solution. On the other hand, due to the proxy point (MFS) representation, the full linear system is exponentially ill-conditioned [46], so iterative solution of the full system is impossible. Therefore, for robustness, we describe in this section a direct solution technique, that will be “fast” (optimally scaling in I the number of layers), by exploiting the algebraic structure. This also will make it easier to solve for the technologically-important case of multiple incident angles θ^inc at the same frequency ω.

3.1. Rearrangement

The first step is to rearrange the unknowns, in a form amenable to elimination of the proxy and scattering amplitude unknowns. We reorder our vector of all unknowns to be

x = {[η, c_{1}, a^{U}, c_{2}, c_{3}, \dots c_{I - 1}, c_{I}, c_{I + 1}, a^{D}]}^{T} = {[η, x_{1}, x_{2}, \dots, x_{I + 1}]}^{T}

where

x_{1} : = {[c_{1}, a^{U}]}^{T}, x_{i} : = c_{i}, i = 2, 3, \dots, I, x_{I + 1} : = {[c_{I + 1}, a^{D}]}^{T} .

Now similarly rearranging the (discretized) blocks of (53) we get the full linear system,

[\begin{matrix} B_{1, 1}^{'} & B_{1, 2} & 0 & \dots & 0 \\ 0 & B_{2, 2} & B_{2, 3} & \dots & 0 \\ A & 0 & 0 & B_{3, 3} & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & B_{I - 1, I + 1} \\ 0 & 0 & 0 & \dots & B_{I, I + 1}^{'} \\ C_{1, 1}^{'} & 0 & 0 & \dots & 0 & 0 & Q_{1}^{'} & 0 & 0 & \dots & 0 \\ C_{2, 1} & C_{2, 2} & 0 & \dots & 0 & 0 & 0 & Q_{2} & 0 & \dots & 0 \\ 0 & C_{3, 2} & C_{3, 3} & \dots & 0 & 0 & 0 & 0 & Q_{3} & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & C_{I, I - 1} & C_{I, I} & 0 & 0 & 0 & \dots & 0 \\ 0 & 0 & 0 & 0 & 0 & C_{I + 1, I}^{'} & 0 & 0 & 0 & \dots & Q_{I + 1}^{'} \end{matrix}] x = [\begin{matrix} f \\ 0 \\ 0 \\ ⋮ \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ ⋮ \\ 0 \\ 0 \end{matrix}] .

Because the amplitudes a were separated into a^U and a^D then merged into c₁ and c_I+1 to make x₁ and x_I+1, respectively, the first and last blocks of B, C and Q matrix had to be expanded to

\begin{array}{l} B_{1, 1}^{'} & = & [\begin{matrix} B_{1, 1} & 0 \end{matrix}], & B_{I, I + 1}^{'} = [\begin{matrix} B_{I, I + 1} & 0 \end{matrix}], \\ C_{1, 1}^{'} & = & [\begin{matrix} C_{1, 1} \\ Z_{U} \end{matrix}], & C_{I + 1, I}^{'} = [\begin{matrix} C_{I + 1, I} \\ Z_{U} \end{matrix}], \\ Q_{1}^{'} & = & [\begin{matrix} Q_{1} & 0 \\ V_{U} & W_{U} \end{matrix}], & Q_{I + 1}^{'} = [\begin{matrix} Q_{I + 1} & 0 \\ V_{D} & W_{D} \end{matrix}] . \end{array}

3.2. Schur complements

The rearrangement in subsection 3.1 enables us to use I + 1 independent Schur complements to eliminate all the unknown vectors x_i; this corresponds to periodizing all of the Green’s functions. For example, consider the equation in the first row,

A_{1, 1} η_{1} + A_{1, 2} η_{2} + B_{1, 1}^{'} x_{1} + B_{1, 2} x_{2} = f .

The first two equation rows starting the C block are

C_{1, 1}^{'} η_{1} + Q_{1}^{'} x_{1} = 0, C_{2, 1} η_{1} + C_{2, 2} η_{2} + Q_{2} x_{2} = 0 .

With the first of these x₁ can be eliminated, and with the second x₂ can, giving

(A_{1, 1} - B_{1, 1}^{'} Q_{1}^{' †} C_{1, 1}^{'} - B_{1, 2} Q_{2}^{†} C_{2, 1}) η_{1} + (A_{1, 2} - B_{1, 2} Q_{2}^{†} C_{2, 2}) η_{2} = f,

where

Q_{i}^{' †}

denotes the pseudo-inverse of the rectangular matrix Q′_i, for i = 1,..., I + 1. Filling the matrix

Q_{i}^{' †}

and then using matrix-matrix multiplication with the C blocks to fill matrices in (61) would lose accuracy, because the exponentially large matrix entries in the pseudo-inverses would cause unacceptably amplified round-off error. To retain full accuracy, we do not in fact ever evaluate the pseudo-inverse matrices. Rather, for product matrices such as

X = Q_{i}^{' †} C_{1, 1}^{'}

appearing above, we solve the (ill-conditioned) linear system Q′₁X = C′_1,1 with a standard backward-stable direct dense solver. For all such dense solves we use the “backslash” or mldivide command in MATLAB. (Another option that retains full accuracy would be to apply

Q_{i}^{' †}

in two multiplication stages using its SVD factored form; for details see [56, Sec. 5]).

Similar computations eliminate x₁, x₂, and x₃ from the second equation to give

(A_{2, 1} - B_{2, 2} Q_{2}^{†} C_{2, 1}) η_{1} + (A_{2, 2} - B_{2, 2} Q_{2}^{†} C_{2, 2} - B_{2, 3} Q_{3}^{†} C_{3, 2}) η_{2} + (A_{2, 3} - B_{2, 3} Q_{3}^{†} C_{3, 3}) η_{3} = 0 .

By repeating the same computation for all the equations, all the x_j are eliminated and a block tridiagonal system for η is obtained as

[\begin{matrix} A_{1, 1}^{'} & A_{1, 2}^{'} & 0 & 0 & \dots & 0 & 0 & 0 \\ A_{2, 1}^{'} & A_{2, 2}^{'} & A_{2, 3}^{'} & 0 & \dots & 0 & 0 & 0 \\ 0 & A_{3, 2}^{'} & A_{3, 3}^{'} & A_{3, 4}^{'} & \dots & 0 & 0 & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & 0 & \dots & A_{I - 1, I - 2}^{'} & A_{I - 1, I - 1}^{'} & A_{I - 1, I}^{'} \\ 0 & 0 & 0 & 0 & \dots & 0 & A_{I, I - 1}^{'} & A_{I, I}^{'} \end{matrix}] [\begin{matrix} η_{1} \\ η_{2} \\ ⋮ \\ η_{I - 1} \\ η_{I} \end{matrix}] = [\begin{matrix} f \\ 0 \\ 0 \\ 0 \\ 0 \end{matrix}],

with new interaction matrices for the density unknowns,

A_{1, 1}^{'} = A_{1, 1} - B_{1, 1}^{'} Q_{1}^{' †} C_{1, 1}^{'} - B_{1, 2} Q_{2}^{†} C_{2, 1},

A_{i, i}^{'} = A_{i, i} - B_{i, i} Q_{i}^{†} C_{i, i} - B_{i, i + 1} Q_{i + 1}^{†} C_{i + 1, i}, i = 2, 3, \dots, I - 1,

A_{I, I}^{'} = A_{I, I} - B_{I, I} Q_{I}^{†} C_{I, I} - B_{I, I + 1}^{'} Q_{I + 1}^{' †} C_{I + 1, I}^{'},

A_{i, i + 1}^{'} = A_{i, i + 1} - B_{i, i + 1} Q_{i + 1}^{†} C_{i + 1, i + 1}, i = 1, 2, \dots, I - 1,

A_{i, i - 1}^{'} = A_{i, i - 1} - B_{i, i} Q_{i}^{†} C_{i, i - 1}, i = 2, 3, \dots, I .

Remark 3. The Schur complements (63)–(67) replacing matrix blocks A_i,j by A′_i,j correspond to replacing each layer free-space Green’s function Gⁱ (11) by an arbitrary quasi-periodic Green’s function G′ⁱ that obeys the layer’s correct wall boundary conditions G′ⁱ(x + d, y) = αG′ⁱ(x, y), x ∈ L_i, y ∈ Ω_i. Crucially, since upward and downward radiation conditions are never imposed together in any single layer i, this is not the standard quasi-periodic

G_{qp}^{i}

of (12), which diverges at that layer’s Wood anomalies. The G′ⁱ selected by the backward-stable solves are always non-divergent; intuitively, since proxy strength vectors of small norm are possible, hence these are selected.

3.3. Block tridiagonal solve and evaluation of scattered wave

The block tridiagonal system (62) can be efficiently solved with a block LU decomposition [57, Sec. 4.5.1], at a cost of dense direct inversion of diagonal blocks. Since they derive from a second-kind integral equation, these diagonal blocks are all well conditioned, and no significant rounding error occurs when their inverses are multiplied as matrices. We write f_i for the block vectors of the right-hand side of (62). The algorithm is initialized by setting f̃₁ = f₁ and Ã_1,1 = A′_1,1, then the forward sweep, for i = 2 to I in order,

\begin{matrix} {\tilde{A}}_{i, i} & = & A_{i, i}^{'} - A_{i, i - 1}^{'} {({\tilde{A}}_{i - 1, i - 1})}^{- 1} A_{i - 1, i}^{'} \\ {\tilde{f}}_{i} & = & f_{i} - {({\tilde{A}}_{i - 1, i - 1})}^{- 1} A_{i - 1, i}^{'} {\tilde{f}}_{i} . \end{matrix}

To save RAM, it is possible to have Ã_i,i overwrite A′_i,i. The backward sweep starts with solving Ã_I,Iη_I = f̃_I, then for i = I − 1 down to 1 in order, solve for η_i in

{\tilde{A}}_{i, i} η_{i} = {\tilde{f}}_{i} - A_{i, i + 1}^{'} η_{i + 1} .

If each N_i ≈ N, for some constant N, the cost is 𝒪(N³I) due to two block inversions per layer. This is roughly I² times faster than naive inversion of the whole system. Note that the matrix filling time is only 𝒪(N²I), but dominates for the small N in our settings, due the large prefactor in evaluating special functions and applying Alpert corrections.

Once all of the density vectors η_i are known, the proxy and scattering amplitude vectors x_i are easily recovered by

x_{1} = - Q_{1}^{' †} C_{1, 1}^{'} η_{1},

x_{i} = - Q_{i}^{†} [\begin{matrix} C_{i, i - 1} & C_{i, i} \end{matrix}] [\begin{matrix} η_{i - 1} \\ η_{i} \end{matrix}], i = 2, 3, \dots, I,

x_{I + 1} = - Q_{I + 1}^{' †} C_{I + 1, I}^{'} η_{I} .

Here the products of the type

Q_{i}^{†} C_{i, j}

can be reused from their prior computation in the Schur complements (63)–(67).

Finally, the scattered wave solution u_i in each layer can be evaluated from their representations (17)–(19) by applying the interface quadrature rules to the single- and double-layer potentials, using the discrete density vectors η_i. The evaluation involves only free-space Green’s functions and monopole/dipole sources, which would be compatible with a standard FMM [58]. Above y = y_U and below y = y_D, the solution is evaluated from the Rayleigh–Bloch expansions (truncated versions of (7)–(8)), using the Bragg amplitudes a^U and a^D.

3.4. Accelerated sweep over multiple incident angles at one frequency

Modeling many real-world devices requires characterizing transmission and reflection over a wide range of incident angles θ^inc at one frequency ω. In general, changing θ^inc changes both the operators and the right-hand side in (53), thus naively an independent solution is required for each angle, making the task very expensive. Here we show how to exploit two independent types of structure that speeds this up by typically an order of magnitude.

Firstly, we exploit the fact that the operators in (53), and hence the entire system matrix, depends on θ^inc only through α in (2). Thus we may group together multiple incident angles that share α (as in [37]), and solve them together at a cost that is essentially the same as a single angle, using the precomputed inverse blocks in the tridiagonal solve of Sec. 3.3. Roughly there are k₁d/π such angles, hence this is the speed-up factor. It grows in proportion to the incident wavenumber. For example, if k₁ = 40 and d = 1, the speedup is around a factor of 6. To set up a sweep of all incident angles, we choose a uniform grid in θ^inc with spacing 2π/(ndk₁) for some integer n, insuring that only n independent α-value solves are needed.

Secondly, we exploit the fact that filling (rather than solving) the matrix blocks in (60) often accounts for the bulk of the solution time. Consider one of the integral operators used in the BIE matrix (defined in (20)),

(𝒟_{V, W}^{i} τ) (r) = \sum_{l = - 1}^{1} α^{l} \int_{W} \frac{\partial G_{i}}{\partial n^{'}} (r + r^{'} + l d) τ (r^{'}) d s_{r^{'}} .

The integral part is independent of the incident angle or α. Therefore

\int_{W} \frac{\partial G_{i}}{\partial n^{'}} (r, r^{'} + l d) τ (r^{'}) d s_{r^{'}}

can be precomputed for l ∈ {−1, 0, 1} then used to assemble

({\tilde{D}}_{V, W}^{i} τ) (r)

whenever a new α is given. Exactly the same argument applies to (

({\tilde{S}}_{V, W}^{i} σ) (r)

,

({\tilde{T}}_{V, W}^{i} τ) (r)

, and

({\tilde{D}}_{V, W}^{i, *} σ) (r)

. Therefore, the A, B, C, and Q matrices can be assembled for any α simply by adding and subtracting precomputed integral operators; this speeds up the solution at multiple α values at the expense of using extra RAM.

4. Numerical results

In all numerical examples, we let the first layer be “vacuum” with ε₁ = 1, set μ_i = 1 for all layers, and choose periodicity d = 1. All the computations are conducted using MATLAB 2014a running on a workstation with two Intel Xeon E5-2687W processors (total 16 cores) with 256 GB memory and CentOS 6.5. For filling of the Nyström matrix blocks we use MPSpack [59], which has an interface to Fortran codes for Hankel function evaluations by Vladimir Rokhlin. The parfor command in MATLAB’s Parallel Computing Toolbox is also used to fill the A and C matrices, using only 12 threads. In the following we present: numerical tests on convergence, scaling of timing and memory with frequency ω and the number of layers I, solution plots for a 1000-interface case, and accelerated computation of transmission and reflection spectra at multiple angles.

Remark 4. The Bragg coefficients ${a_{n}^{U}}$ and ${a_{n}^{D}}$ will be used as an independent measure of the accuracy of our numerical scheme based on conservation of the flux (energy) [14, 16], namely

\sum_{k_{n}^{U} > 0} k_{n}^{U} {| a_{n}^{U} |}^{2} + \sum_{k_{n}^{D} > 0} k_{n}^{D} {| a_{n}^{D} |}^{2} = k_{1} cos θ^{inc} .

This holds when all the material properties ε_i and μ_i are real. Therefore, we will define the relative flux error as

Flux error : = | \frac{\sum_{k_{n}^{U} > 0} k_{n}^{U} {| a_{n}^{U} |}^{2} + \sum_{k_{n}^{D} > 0} k_{n}^{D} {| a_{n}^{D} |}^{2} - k_{1} cos θ^{inc}}{k_{1} cos θ^{inc}} | .

4.1. Convergence

We study the convergence of the scattered field at a fixed point (0.15, 0.6) in the first layer, relative to its converged value, and convergence of flux errors defined in (74). The frequency ω = 10 corresponds to a period of about 1.6 wavelengths in the first layer, and larger numbers of wavelengths in the other layers. Thirty sine interfaces with random ε_i chosen between 1 and 2 are considered in Fig. 2. The actual geometry is depicted in the inset of Fig. 2(b). All the interfaces have a periodic trapezoid rule with the same number of quadrature points N_i = N, i = 1, 2, ···30. Fig. 2(a) shows convergence of the scattered field error (blue square) and flux error (red circle) as a function of N: the expected 16th-order rate is observed, and the fact that the two types of error are essentially equivalent, at least not near any material interfaces. The same convergence test is conducted as a function of the number of proxy points P per layer, in Fig. 2(b). The slight downwards curvature on the log-log axes is consistent with super-algebraic convergence. The fact that 10-digit accuracy results with only N = P = 50 is a testament to the extremely rapid convergence of the method.

Fig. 2 Convergence of u(0.15, 0.6) and flux error for 30 sine interfaces (see the inset in (b)) with ω = 10 and random ε_i between 1 and 2. (a) Convergence in N, number of nodes per sine interface (blue square) and flux error (red triangle) while P = 110. (b) Convergence in P (blue square) and flux error (red triangle), fixing N = 70 per sine interface. All other parameters are fixed at M_w = 110, M = 100, K = 10, and R = 2. (Color online.)

Download Full Size | PDF

Secondly, in order to study convergence for non-smooth interfaces, every other sine interface is replaced by a rectangular-ridge interface consisting of five line segments (see the inset in Fig. 3(c)), with Kress grading parameter q = 6. We study the convergence of the two types of interfaces independently. Figure 3(a) shows convergence in N on sine interfaces, where N_i = N, i = 1, 3,···, 29, while the quadrature points on rectangle is fixed at N_i = 110×5, i = 2, 4,···, 30 (i.e. n_i,l = 110 for all l = 1,...,5). It shows the same convergence as before. Then, N_i is fixed at 70 on all sine interfaces (i = 1, 3,···, 29), and N_i = N × 5 on all rectangle interfaces (i = 2, 4,···, 30), is increased. Both the scattered field and flux error appear to converge as 𝒪(N⁻⁶) in Fig. 3(b), the order expected from the q = 6 grading. Figure 3(c) shows at least very high-order convergence in P.

Fig. 3 Convergence of u(0.15, 0.6) and flux error for 30 mixed sine and rectangle interfaces (see the inset in (c)) with ω = 10 and random ε_i between 1 and 2. (a) Convergence in N on sine interface (blue square) and flux error (red triangle), while N = 110 on each line segment of rectangle interfaces, and P = 110. (b) Convergence in N on each line segment of rectangle interfaces (blue square) and flux error (red triangle), while N = 70 on sine interfaces and P = 110. (c) Convergence in P (blue square) and flux error (red triangle), while N = 70 on sine, N = 110 on rectangle interfaces. All other parameters are fixed at M_w = 110, M = 120, K = 20, and R = 2. (Color online.)

Download Full Size | PDF

These tests confirm that flux error is a good indicator of pointwise error in u, at least not close to the interfaces; from now on we quote flux error.

4.2. Performance

Tables 1 and 2 present the CPU time, memory usage, and flux error for periodic dielectric structures with I = 1, 3, 10, 30, 100, 300, and 1000 mixed sine, triangle, and rectangle interfaces. The first is at frequency ω = 5 (0.8 wavelengths per period), the second ω = 40 (6.4 wavelengths per period). Triangle interfaces are placed every 8th interface starting from the 4th interface and rectangle interfaces are placed every 15th interface starting from the 4th interface. All other interfaces are sine. All other parameters used for computations are specified in the table captions. In both tables, the computation time to solve the matrix (Schur complement and block matrix solve) and memory usage increases linearly as number of interfaces increases as expected in Sec. 3.3. For 1000 interfaces, it took 151 sec and 634 sec for ω = 5 and 40, respectively; note that more quadrature nodes are needed to accurately discretize the more oscillatory functions for higher ω. The sizes of the full matrix (60) for 1000 interfaces are 468260 × 287922 and 832000 × 751762, respectively. At ω = 40 the structure is around 8000λ tall.

Table 1. CPU time, memory, and flux error: ω = 5 (period is 0.8λ in vacuum), ε₁ = 1 and all other ε_i are random between 1 and 2, θ^inc = −π/5, N_i = 70 on sine, N_i = 100×2 on triangle, and N_i = 100×5 on rectangle interfaces, M_w = 120, M = 60, P = 60, K = 20, and R = 2.

View Table | View all tables in this article

Table 2. CPU time, memory, and flux error: ω = 40 (period is 6.4λ in vacuum), ε₁ = 1 and all other ε_i are random between 1 and 2, θ^inc = −π/5, N_i = 180 on sine, N_i = 150 × 2 on triangle, and N_i = 340 × 5 on rectangle interfaces, M_w = 120, M = 60, P = 160, K = 20, and R = 2.

View Table | View all tables in this article

In order to present a performance of the numerical method in some extreme cases, three numerical examples are presented. First, we considered 100 interfaces chosen randomly as sine, triangle, or rectangle type, with random heights, phases, layer thicknesses (while preventing collisions), and layer permittivities; see Fig. 4(a). We chose a frequency corresponding to a period of 4.5 wavelengths in the top layer, and an incident angle making the top layer precisely at a Wood anomaly. Due to space limitations, the real part of total field in only the first 10 and last 10 layers (regions enclosed by rectangles in Fig. 4(a)) is plotted in Fig. 4(b). Matrix filling time was 19 sec, the Schur complements 9 sec, and the tridiagonal solve 8 sec. We achieve 9-digit accuracy in flux; in general we find that the flux error is no worse at Wood anomalies than at other angles.

Fig. 4 (a) The 100-interface structure tested at a Wood anomaly for the top layer. (b) Real part of total field u + u^inc in the rectangles drawn in (a), for ω = 9π, θ^inc = −cos⁻¹(1 − 2π/ω), ε₁ = 1 and all other ε_i are randomly chosen between 1 and 2. N_i = 260 on sines, 100 × 2 on triangles, 90 × 5 on rectangles, M_w = 120, M = 60, P = 120, and K = 10. Flux error is 5×10⁻¹⁰, total solution time 35 sec (not including field evaluation). (Color online.)

Download Full Size | PDF

The second and third examples are for 1000 interfaces. The real part of total field u + u^inc in only the top 10 layers is displayed due to figure resolution limitations. Figure 5(a) and (b) show the top 30 interfaces of the structure and the real part of the total field from the 1000-interface case in the last column of Table 2, respectively. The third example is presented to highlight the geometric flexibility: we considered 1000 interfaces consisting of seven complex interface shapes on top of 993 sine interfaces (Fig. 6(a)). Here ε_i, i = 2,...,1000 are chosen randomly between 1 and 3. All other parameters are given in the figure caption. The total field is computed and displayed in Fig. 6(b). The CPU time for the solve was about 400 sec with 7 × 10⁻⁹ flux error; evaluation of the solution at 1 million points took a similar time.

Fig. 5 (a) First 30 interfaces of the 1000-interface structure used in Tables 1 and 2. (b) Real part of the total field u + u^inc in the rectangle drawn in (a): ω = 40, θ^inc = −π/5, ε₁ = 1, and all other ε_i randomly chosen between 1 and 2. Flux error is 4.7 × 10⁻⁸. (Color online.)

Download Full Size | PDF

Fig. 6 (a) 1000-interface structure consisting of 7 complex-shaped interfaces on top of 993 sine interfaces. (b) Real part of total field u + u^inc in the rectangular region drawn in (a): ω = 40, θ^inc = −π/4, ε₁ = 1 and all other ε_i are chosen randomly between 1 and 3. N₁ = 160 × 6, N₂ = 160 × 6, N₃ = 250, N₄ = 180 × 2, N₅ = 160 × 3, N₆ = 160 × 5, and N₇ = 300 on the first 7 interfaces. N_i = 300 for the rest of the sine interfaces. M_w = 130, M = 80, P = 150, K = 20, R = 1.5. Flux error is 7 × 10⁻⁹, total matrix filling time 192 sec, Schur complement 107 sec, block matrix solve 103 sec, total memory used 28 GB. Field evaluation (1000 × 1000 grid points) took 446 sec. (Color online.)

Download Full Size | PDF

4.3. Transmission and reflection spectrum

We now compute transmission (T) and reflection (R) spectra,

T (θ^{inc}) : = \frac{\sum_{k_{n}^{D} > 0} k_{n}^{D} {| a_{n}^{D} |}^{2}}{k_{1} cos θ^{inc}}, R (θ^{inc}) : = \frac{\sum_{k_{n}^{U} > 0} k_{n}^{U} {| a_{n}^{U} |}^{2}}{k_{1} cos θ^{inc}},

respectively, as a function of incident angle, −π < θ^inc < 0, and benchmark the acceleration technique of Sec. 3.4. The 30-interface structure shown in Fig. 7(a) is used, the same as in Tables 1 and 2.

Fig. 7 (a) 30-interface structure. Reflection (blue solid line) and transmission (red dashed line) as a function of incident angle from −π to 0 for: (b) ω = 2 with periodic ε = {1, 4, 1, 4,···, 4, 1, 4, 1}, average flux error 8.3 × 10⁻⁹; (c) ω = 2 with ε₁ = 1 and all other ε_i random between 1 and 4, average flux error 1.3 × 10⁻¹⁰; (d) same structure as (b) but ω = 10, average flux error 4.7 × 10⁻⁷; and (e) same structure as (c) but ω = 10, average flux error 4.1 × 10⁻¹⁰. (Color online.)

Download Full Size | PDF

First, ω = 2 (0.3 wavelengths per period) and periodic permittivities ε = {1, 4, 1, 4,···, 4, 1} are considered. The spectrum clearly shows Bragg mirror or Fabry-Perot characteristics (there are ranges of incident angles that have total reflection), and symmetry, in Fig. 7(b). Because the wavelength is larger than the geometric features, the interface shape does not play an important role in determining the scattering. However, when ε_i are set to random numbers between 1 and 4, the total reflection regime disappears in Fig. 7(c). Since ωd/π < 1, essentially no benefit comes from multiple angles sharing α values, but precomputation of matrix blocks does help. For 200 incident angles, the computation took 50 sec to produce with acceleration but 352 sec without, a 7× speed-up.

The frequency is now increased to ω = 10 (1.6 wavelengths per period), with 641 incident angles. Regardless of the ε distribution, the transmission-reflection spectrum behaves very abruptly in both periodic (Fig. 7(d)) and random (Fig. 7(e)) ε cases. Also notice that the symmetry of the spectrum is broken because the layered structure has rectangle and triangle shaped interfaces which are now resolved by the wavelength. The computation took 121 sec with acceleration, and 1619 sec without, a speed-up of 13×. Finally, at ω = 20 (3.2 wavelengths per period), using 1279 incident angles, the speed-up was about 25× (we don’t show these spectra since they do not show any new phenomena). This is consistent with the acceleration factor growing linearly with ω. Thus for ω = 40 as in Table 2 a speedup of 50× is expected.

5. Conclusion

We presented a new robust and fast integral equation method for 2D scattering from a periodic dielectric grating with an arbitrary number of layers of general shape. There are three main features: (1) The computational cost of the new method scales optimally (linearly) in the number of layers, allowing 1000-layer structures to be solved rapidly. (2) The method is stable for all scattering parameters including Wood anomalies, since it is based on free-space rather than quasi-periodic Green’s functions. (3) The periodizing scheme is simple, high-order accurate, and largely supersedes the Sommerfeld integral method of the second author and Greengard in [15]. This solver is expected to be useful in variety of wave applications in engineering and experimental physics, including the high accuracy modeling and optimization of optical, electromagnetic, and acoustic devices, and meta-materials.

There are several natural extensions of this work. Allowing dielectric inclusions in the layers (as in [56]), or material triple-junctions (e.g. incorporating robust representations of Lee–Greengard as in [38]), is simply a matter of bookkeeping, as long as the number of unknowns per layer remains small (e.g. less than 10⁴). Since we use high-order quadrature schemes, this would allow significantly more complex unit cell shapes than presented here. Beyond this, an iterative FMM solution of the combined system would be appropriate when the number of layers is small [56]; for robustness with many layers a hierarchical fast direct solver such as in [37, 38] could be used in each layer, combined with our tridiagonal block solve. Our scheme generalizes to 3D without needing new ideas, given a surface quadrature for the integral operators. However, the number of unknowns per layer could then easily exceed 10⁴, demanding something more elaborate than dense direct linear algebra within each layer.

For a production code, matrix filling and evaluation should be implemented in C or Fortran, and a parallel implementation would allow more simultaneous filling of matrix blocks, as well exploiting a parallel tridiagonal solve. In terms of analysis, an extension of the rigorous framework of free-space integral equations to include the presented periodizing scheme is needed.

The MATLAB codes which implement the methods of this paper and generate some of the figures can be downloaded from http://math.dartmouth.edu/~mhcho/software These rely on layer-potential quadrature codes in the MPSpack toolbox by the second author, which can be downloaded from http://code.google.com/p/mpspack

Acknowledgments

We benefited from a helpful discussion with Stephen Shipman, and the useful remarks of the anonymous reviewers. The work of AHB is supported by NSF grant DMS-1216656.

References and links

1. M. D. Perry, R. D. Boyd, J. A. Britten, D. Decker, B. W. Shore, C. Shannon, and E. Shults, “High-efficiency multilayer dielectric diffraction gratings,” Opt. Lett. 20, 940–942 (1995). [CrossRef] [PubMed]

2. M. K. C. P. J. Barty, J. Britten, R. Beach, G. Beer, C. Brown, S. Bryan, J. Caird, T. Carlson, J. Crane, J. Dawson, A. Erlandson, D. Fittinghoff, M. Hermann, C. Hoaglan, A. Iyer, L. Ivanovic II, I. Jovanovic, A. Komashko, O. Landen, Z. Liao, W. Molander, S. Mitchell, E. Moses, N. Nielsen, H.-H. Nguyen, J. Nissen, S. Payne, D. Pennington, L. Risinger, M. Rushford, K. Skulina, M. Spaeth, B. Stuart, G. Tietbohl, and B. Wattellier, “An overview of LLNL high-energy short-pulse technology for advanced radiography of laser fusion experiments,” Nuclear Fusion 44, S266 (2004). [CrossRef]

3. G. A. Kalinchenko and A. M. Lerer, “Wideband all-dielectric diffraction grating on chirped mirror,” J. Lightwave Technology 28, 2743–2749 (2010). [CrossRef]

4. H. A. Atwater and A. Polman, “Plasmonics for improved photovoltaic devices,” Nature Materials 9, 205–213 (2010). [CrossRef] [PubMed]

5. M. D. Kelzenberg, S. W. Boettcher, J. A. Petykiewicz, D. B. Turner-Evans, M. C. Putnam, E. L. Warren, J. M. Spurgeon, R. M. Briggs, N. S. Lewis, and H. A. Atwater, “Enhanced absorption and carrier collection in Si wire arrays for photovoltaic applications,” Nature Materials 9, 239–244 (2010). [CrossRef] [PubMed]

6. N. P. Sergeant, M. Agrawal, and P. Peumans, “High performance solar-selective absorbers using coated sub-wavelength gratings,” Opt. Express 18, 5525–5540 (2010). [CrossRef] [PubMed]

7. J. D. Joannopoulos, S. G. Johnson, R. D. Meade, and J. N. Winn, Photonic Crystals: Molding the Flow of Light (Princeton University, 2008), 2nd ed.

8. R. Model, A. Rathsfeld, H. Gross, M. Wurm, and B. Bodermann, “A scatterometry inverse problem in optical mask metrology,” J. Phys.: Conf. Ser. 135, 012071 (2008).

9. D. Colton and R. Kress, Inverse Acoustic and Electromagnetic Scattering Theory, vol. 93 of Applied Mathematical Sciences (Springer-Verlag, 1998), 2nd ed. [CrossRef]

10. A.-S. Bonnet-BenDhia and F. Starling, “Guided waves by electromagnetic gratings and non-uniqueness examples for the diffraction problem,” Math. Meth. Appl. Sci. 17, 305–338 (1994). [CrossRef]

11. J. A. Stratton, Electromagnetic Theory (John Wiley & Sons, 2007).

12. J. D. Jackson, Classical Electrodynamics (Wiley, 1998), 3rd ed.

13. R. W. Wood, “On a remarkable case of uneven distribution of light in a diffraction grating spectrum,” Philos. Mag. 4, 396–408 (1902). [CrossRef]

14. C. M. Linton and I. Thompson, “Resonant effects in scattering by periodic arrays,” Wave Motion 44, 165–175 (2007). [CrossRef]

15. A. H. Barnett and L. Greengard, “A new integral representation for quasi-periodic scattering problems in two dimensions,” BIT Numer. Math. 51, 67–90 (2011). [CrossRef]

16. S. Shipman, Resonant scattering by open periodic waveguides (Bentham Science Publishers, 2010), vol. 1 of Progress in Computational Physics (PiCP), pp. 7–50.

17. G. L. Wojcik, J. Mould Jr., E. Marx, and M. P. Davidson, “Numerical reference models for optical metrology simulation,” Proc. SPIE 1673, 06 (1992).

18. A. Taflove, Computational Electrodynamics: The Finite-Difference Time-Domain Method (Artech House, 1995).

19. H. Holter and H. Steyskal, “Some experiences from FDTD analysis of infinite and finite multi-octave phased arrays,” IEEE Trans. Antennae Prop. 50, 1725–1731 (2002). [CrossRef]

20. G. Bao and D. C. Dobson, “Modeling and optimal design of diffractive optical structures,” Surv. Math. Ind. 8, 37–62 (1998).

21. J. Elschner, R. Hinder, and G. Schmidt, “Finite element solution of conical diffraction problems,” Adv. Comput. Math. 16, 139–156 (2002). [CrossRef]

22. I. M. Babuska and S. A. Sauter, “Is the pollution effect of the FEM avoidable for the Helmholtz equation considering high wave numbers?” SIAM J. Numer. Anal. 34, 2392–2423 (1997).

23. M. G. Moharam and T. G. Gaylord, “Rigorous coupled-wave analysis of planar-grating diffraction,” J. Opt. Soc. Am. 71, 811–818 (1981). [CrossRef]

24. L. Li, “Use of Fourier series in the analysis of discontinuous periodic structures,” J. Opt. Soc. Am. A 13, 1870–1876 (1996). [CrossRef]

25. L. Li, “Formulation and comparison of two recursive matrix algorithms for modeling layered diffraction gratings,” J. Opt. Soc. Am. A 13, 1024–1035 (1996). [CrossRef]

26. A. Lechleiter and D.-L. Nguyen, “A trigonometric Galerkin method for volume integral equations arising in TM grating scattering,” Adv. Comput. Math. 40, 1–25 (2014). [CrossRef]

27. M. H. Cho and W. Cai, “A parallel fast algorithm for computing the Helmholtz integral operator in 3-D layered media,” J. Comput. Phys. 231, 5910–5925 (2012). [CrossRef]

28. D. Y. K. Ko and J. R. Sambles, “Scattering matrix method for propagation of radiation in stratified media: attenuated total reflection studies of liquid crystals,” J. Opt. Soc. Am. A 5, 1863–1866 (1988). [CrossRef]

29. J. C. Nédélec and F. Starling, “Integral equation methods in a quasi-periodic diffraction problem for the time-harmonic Maxwell’s equations,” SIAM J. Math. Anal. 22, 1679–1701 (1991). [CrossRef]

30. A. Meier, T. Arens, S. N. Chandler-Wilde, and A. Kirsch, “A Nyström method for a class of integral equations on the real line with applications to scattering by diffraction gratings and rough surfaces,” J. Integral Equations Appl. 12, 281–321 (2000). [CrossRef]

31. R. Kress, “Boundary integral equations in time-harmonic acoustic scattering,” Mathl. Comput. Modelling 15, 229–243 (1991). [CrossRef]

32. S. Hao, A. H. Barnett, P. G. Martinsson, and P. Young, “High-order accurate Nyström discretization of integral equations with weakly singular kernels on smooth curves in the plane,” Adv. Comput. Math. 40, 245–272 (2014). [CrossRef]

33. T. Arens, “Scattering by biperiodic layered media: The integral equation approach,” Habilitation thesis, Karlsruhe (2010).

34. M. J. Nicholas, “A higher order numerical method for 3-D doubly periodic electromagnetic scattering problems,” Commun. Math. Sci. 6, 669–694 (2008). [CrossRef]

35. Y. Otani and N. Nishimura, “A periodic FMM for Maxwell’s equations in 3D and its applications to problems related to photonic crystals,” J. Comput. Phys. 227, 4630–4652 (2008). [CrossRef]

36. O. P. Bruno and M. C. Haslam, “Efficient high-order evaluation of scattering by periodic surfaces: deep gratings, high frequencies, and glancing incidences,” J. Opt. Soc. Am. A 26, 658–668 (2009). [CrossRef]

37. A. Gillman and A. Barnett, “A fast direct solver for quasiperiodic scattering problems,” J. Comput. Phys. 248, 309–322 (2013). [CrossRef]

38. L. Greengard, K. L. Ho, and J.-Y. Lee, “A fast direct solver for scattering from periodic structures with multiple material interfaces in two dimensions,” J. Comput. Phys. 258, 738–751 (2014). [CrossRef]

39. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (Dover, 1964), 10th ed.

40. C. M. Linton, “Lattice sums for the Helmholtz equation,” SIAM Review 52, 603–674 (2010). [CrossRef]

41. K. V. Horoshenkov and S. N. Chandler-Wilde, “Efficient calculation of two-dimensional periodic and waveguide acoustic Green’s functions,” J. Acoust. Soc. Amer. 111, 1610–1622 (2002). [CrossRef]

42. O. P. Bruno, S. Shipman, C. Turc, and S. Venakides, “Efficient evaluation of doubly periodic Green functions in 3D scattering, including Wood anomaly frequencies,” (2013). Preprint, arXiv:1307.1176v1.

43. O. P. Bruno and B. Delourme, “Rapidly convergent two-dimensional quasi-periodic Gteen function throughout the spectrum-including Wold anomalies,” J. Comput. Phys. 262, 262–290 (2014). [CrossRef]

44. A. H. Barnett and L. Greengard, “A new integral representation for quasi-periodic fields and its application to two-dimensional band structure calculations,” J. Comput. Phys. 229, 6898–6914 (2010). [CrossRef]

45. A. Bogomolny, “Fundamental solutions method for elliptic boundary value problems,” SIAM Journal on Numerical Analysis 22, 644–669 (1985). [CrossRef]

46. A. H. Barnett and T. Betcke, “Stability and convergence of the Method of Fundamental Solutions for Helmholtz problems on analytic domains,” J. Comput. Phys. 227, 7003–7026 (2008). [CrossRef]

47. P. Martinsson and V. Rokhlin, “A fast direct solver for boundary integral equations in two dimensions,” J. Comp. Phys. 205, 1–23 (2005). [CrossRef]

48. N. A. Gumerov and R. Duraiswami, “A method to compute periodic sums,” J. Comput. Phys. 272, 307–326 (2014). [CrossRef]

49. C. Hafner, The Generalized Multipole Technique for Computational Electromagnetics (Artech House Books, 1990).

50. L. Zhao and A. H. Barnett, “Robust and efficient solution of the drum problem via Nyström approximation of the Fredholm determinant,” (2014). arxiv:1406.5252, submitted to J. Comput. Phys.

51. C. Müller, Foundations of the Mathematical Theory of Electromagnetic Waves (Springer-Verlag, 1969). [CrossRef]

52. V. Rokhlin, “Solution of acoustic scattering problems by means of second kind integral equations,” Wave Motion 5, 257–272 (1983). [CrossRef]

53. P. J. Davis and P. Rabinowitz, Methods of Numerical Integration (Academic, 1984).

54. R. Kress, Linear Integral Equations, vol. 82 of Appl. Math. Sci. (Springer, 1999), 2nd ed. [CrossRef]

55. B. K. Alpert, “Hybrid Gauss-trapezoidal quadrature rules,” SIAM J. Sci. Comput. 20, 1551–1584 (1999). [CrossRef]

56. J. Lai, M. Kobayashi, and A. H. Barnett, “A fast solver for the scattering from a layered periodic structure with multi-particle inclusions,” (2014). In preparation.

57. G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins Studies in the Mathematical Sciences (Johns Hopkins University, 1996), 3rd ed.

58. W. Y. Crutchfield, Z. Gimbutas, Gol, J. Huang, V. Rokhlin, N. Yarvin, and J. Zhao, Remarks on the implementation of the wideband FMM for the Helmholtz equation in two dimensions (Amer. Math. Soc., 2006), vol. 408 of Contemp. Math., pp. 99–110.

59. A. H. Barnett and T. Betcke, “MPSpack: A MATLAB toolbox to solve Helmholtz PDE, wave scattering, and eigenvalue problems,” (2008–2014). http://code.google.com/p/mpspack.

Number of interfaces	1	3	10	30	100	300	1000
Matrix Filling (sec)	0.518	1.860	4.200	5.600	12.384	32.332	103.331
Schur Complement (sec)	0.028	0.058	0.299	0.644	2.263	6.525	21.037
Block Solve (sec)	0.003	0.041	0.398	0.898	2.805	8.626	26.655
Memory (MB)	18	41	83	183	608	1753	5830
Flux Error	4.8e-12	3.1e-11	2.4e-11	4.0e-11	2.2e-11	1.3e-10	9.1e-10

Number of interfaces	1	3	10	30	100	300	1000
Matrix Filling (sec)	0.813	2.739	13.295	16.312	30.960	75.915	247.597
Schur Complement (sec)	0.080	0.231	1.366	3.361	10.553	33.097	108.384
Block Solve (sec)	0.018	0.118	3.879	8.637	28.180	90.187	277.933
Memory (MB)	58	112	496	1069	3576	10319	34347
Flux Error	8.3e-13	5.3e-12	8.9e-10	3.4e-10	3.6e-09	5.0e-09	4.7e-08

Number of interfaces	1	3	10	30	100	300	1000
Matrix Filling (sec)	0.518	1.860	4.200	5.600	12.384	32.332	103.331
Schur Complement (sec)	0.028	0.058	0.299	0.644	2.263	6.525	21.037
Block Solve (sec)	0.003	0.041	0.398	0.898	2.805	8.626	26.655
Memory (MB)	18	41	83	183	608	1753	5830
Flux Error	4.8e-12	3.1e-11	2.4e-11	4.0e-11	2.2e-11	1.3e-10	9.1e-10

Number of interfaces	1	3	10	30	100	300	1000
Matrix Filling (sec)	0.813	2.739	13.295	16.312	30.960	75.915	247.597
Schur Complement (sec)	0.080	0.231	1.366	3.361	10.553	33.097	108.384
Block Solve (sec)	0.018	0.118	3.879	8.637	28.180	90.187	277.933
Memory (MB)	58	112	496	1069	3576	10319	34347
Flux Error	8.3e-13	5.3e-12	8.9e-10	3.4e-10	3.6e-09	5.0e-09	4.7e-08

Robust fast direct integral equation solver for quasi-periodic scattering problems with a large number of layers

Abstract

1. Introduction

2. Boundary integral formulation, periodizing scheme, and its discretization

2.1. Representation of the scattered wave

2.2. Matching conditions at material interfaces

2.3. Imposing the quasi-periodicity conditions

2.4. Imposing the radiation conditions

2.5. Discretization of functions and operators

3. Rearrangement of equations, Schur complement, and fast solver

3.1. Rearrangement

3.2. Schur complements

3.3. Block tridiagonal solve and evaluation of scattered wave

3.4. Accelerated sweep over multiple incident angles at one frequency

4. Numerical results

4.1. Convergence

4.2. Performance

4.3. Transmission and reflection spectrum

5. Conclusion

Acknowledgments

References and links

Cited By

Figures (7)

Tables (2)

Equations (84)

Optics Express