Expand this Topic clickable element to expand a topic
Skip to content
Optica Publishing Group

Differentiable scattering matrix for optimization of photonic structures

Open Access Open Access

Abstract

The scattering matrix, which quantifies the optical reflection and transmission of a photonic structure, is pivotal for understanding the performance of the structure. In many photonic design tasks, it is also desired to know how the structure’s optical performance changes with respect to design parameters, that is, the scattering matrix’s derivatives (or gradient). Here we address this need. We present a new algorithm for computing scattering matrix derivatives accurately and robustly. In particular, we focus on the computation in semi-analytical methods (such as rigorous coupled-wave analysis). To compute the scattering matrix of a structure, these methods must solve an eigen-decomposition problem. However, when it comes to computing scattering matrix derivatives, differentiating the eigen-decomposition poses significant numerical difficulties. We show that the differentiation of the eigen-decomposition problem can be completely sidestepped, and thereby propose a robust algorithm. To demonstrate its efficacy, we use our algorithm to optimize metasurface structures and reach various optical design goals.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

The scattering matrix is a fundamental concept in many fields. It relates the input state and the output state of a physical system undergoing a scattering process. Particularly revealing in optics, the scattering matrix has been widely used for analyzing photonic structures such as waveguides [1] and metasurface units [2]. Once the scattering matrix of a photonic structure is known, the structure’s optical performance (e.g., mode conversion efficiency and phase shift) can be directly obtained.

Because of its vital importance, many numerical methods have been developed to compute the scattering matrix of a photonic structure. Among them, a popular class is the semi-analytical methods, such as the method of lines [3] and rigorous coupled-wave analysis (RCWA) [4]. These methods exploit the fact that many photonic structures in practice (such as waveguides and metasurface units) have a piecewise constant cross-sectional shape along the transmission direction (denoted as $z$-direction). Thus, to solve Maxwell’s equations, they only need to discretize the 2D cross-sectional region, reducing Maxwell’s equations into a set of continuous differential equations along $z$-direction, whose solution can be expressed through an eigenvalue analysis. Thanks to the semi-discretization, these methods often enable faster computation than full discretization methods (such as finite-element- and finite-volume-based methods). Indeed, methods like RCWA have been widely used in designing various photonic structures, such as metasurfaces [5], metagratings [6], holograms [7], polarimeters [8], solar cells [9], radiative cooling structures [10], color structures [11], photonic crystals [12], and waveguides [1].

In this work, we extend the semi-analytical methods to obtain the higher-order information of scattering matrices, namely the scattering matrix’s derivatives (or gradient). Provided a photonic structure specified by certain design parameters, we aim to compute not only its scattering matrix but its derivatives with respect to the design parameters.

The scattering matrix derivatives depict the changes of the structure’s optical behaviors as its design parameters vary. This higher-order information, if robustly and efficiently computed, finds many applications in photonic design. Perhaps most notable is the optimization of photonic structures. The derivatives provide guidance on how we can adjust the parameters (e.g., through the gradient descent algorithm [13]) to improve the structure’s optical performance [14] or to find a design robust to fabrication error [1517].

Unfortunately, the computation of scattering matrix derivatives is nontrivial. The difficulty is rooted in the fact that the permissible optical modes in a photonic structure are eigenfunctions of a linear (Hermitian) operator determined by Maxwell’s equations [18]. Thus, to compute the scattering matrix, semi-analytical methods must solve an eigen-decomposition problem: its eigenvalues describe the propagation constants (or effective indices) of the modes and its eigenvectors indicate propagating modal patterns. Differentiating the scattering matrix, by chain rule, requires the derivatives of eigenvalues and eigenvectors. It is the need of eigenvector derivatives that renders the scattering matrix differentiation ill-posed: when there exist repeated eigenvalues, the corresponding eigenvectors are not uniquely defined. As the parameter changes, the numerical results of the eigenvectors may change discontinuously, and their derivatives become undefined (see more discussion in Section 3).

Not merely does this issue exist as a corner case; many photonic structures in practice have geometric and material symmetries, from which repeated eigenvalues and thus ill-defined eigenvector derivatives emerge (see Fig. 2). Consequently, one must carefully choose eigenvectors such that they vary smoothly with respect to the design parameters. This choice, albeit attainable, demands complex and expensive computational effort [19].

In this paper, we question the necessity of eigenvector derivatives for differentiating scattering matrices. We show that while eigen-decompositions are needed for computing a photonic structure’s scattering matrix, eigenvector derivatives can be fully sidestepped for differentiating the scattering matrix. Based on our new derivation, we present a fast and robust algorithm that, without resorting to eigenvector derivatives, computes the scattering matrix derivatives with respect to any design parameters.

Our method is designed for scattering matrices in general, independent from any specific basis representation; nor is it bound to any particular geometric parameterization. To demonstrate the use of our method, we apply the scattering matrix derivatives for optimizing the design of metasurface units. We can choose different design parameterizations and use gradient-based optimization to reach various light transmission goals. We also propose a general parameterization of the meta-unit’s cross-sectional shape that can be optimized using our method.

A C++ implementation of our scattering matrix differentiation in RCWA has been made publicly available (see Code 1 [20]).

2. Background: scattering matrix

We start by briefly reviewing the classic notion of scattering matrix in computational photonics, to pave the way toward its differentiation.

To numerically analyze a photonic structure (such as a waveguide), the structure is often discretized along the wave propagation direction (i.e., $z$-direction) into a series of layers each with a uniform cross-sectional material distribution (Fig. 1(b)). Consider optical waves of a specific frequency. Their propagation in each layer is characterized by a scattering matrix $\textbf {S}$, which relates waves incident on the layer from left and right sides (Fig. 1(a)) to the waves scattered out in either direction.

 figure: Fig. 1.

Fig. 1. (a) In a photonic structure, light may be incident from both sides and get scattered out. The relationship of incident and scattered light is characterized by the scattering matrix. (b) A complex structure can be decomposed into individual layers. Each layer is characterized by its scattering matrix, and these scattering matrices are then combined (using the Redheffer star product [21]) to form the scattering matrix of the entire structure.

Download Full Size | PDF

Concretely, let $\textbf {a}_{\textrm {L}}$ and $\textbf {a}_{\textrm {R}}$ denote vectors describing incident waves on the layer from left and right sides, respectively. Here $\textbf {a}_{\textrm {L}}$ and $\textbf {a}_{\textrm {R}}$ stack coefficients that represent the waves under a chosen basis, whose construction will be outlined shortly. Under the same basis, we use $\textbf {b}_{\textrm {L}}$ and $\textbf {b}_{\textrm {R}}$ to denote the scattered waves in left and right directions. With these notations, the incident and scattered waves are related through

$$\left[\begin{matrix} \textbf{b}_{\textrm{L}} \\ \textbf{b}_{\textrm{R}} \end{matrix}\right] = \textbf{S} \left[\begin{matrix} \textbf{a}_{\textrm{L}} \\ \textbf{a}_{\textrm{R}} \end{matrix}\right], \textrm{ where } \textbf{S} \: = \left[\begin{matrix} \textbf{R}_{\textrm{L}} & \textbf{T}_{\textrm{RL}} \\ \textbf{T}_{\textrm{LR}} & \textbf{R}_{\textrm{R}} \end{matrix}\right].$$

Here $\textbf {S}$ is decomposed into four submatrices: $\textbf {R}_{\textrm {L}}$ and $\textbf {R}_{\textrm {R}}$ indicate how the incident wave from left or right direction is reflected by the layer, while $\textbf {T}_{\textrm {RL}}$ and $\textbf {T}_{\textrm {LR}}$ describe how the incident wave (from either direction) transmits through the layer.

The computation of scattering matrix starts with a semi-discretization of the frequency-domain Maxwell’s equations of a photonic layer, namely,

$$-jk_0\frac{\partial}{\partial z}\textbf{e} = \textbf{P} \textbf{h} \;\textrm{ and }\; -jk_0\frac{\partial}{\partial z}\textbf{h} = \textbf{Q} \textbf{e},$$
where $k_0$ is the free-space wave number, and the vectors $\textbf {e}$ and $\textbf {h}$ describe the electric and magnetic fields of the photonic structure under a chosen basis—for example, RCWA uses the 2D Fourier basis on the cross-section of the wave propagation direction. The matrices $\textbf {P}$ and $\textbf {Q}$ encode the cross-sectional distributions of material permeability and permittivity.

The semi-discretization (2) is a common form in many numerical analysis methods for photonic structures (such as the method of line [3] and RCWA [4]). The difference across those methods only lies in the specific ways of constructing $\textbf {P}$ and $\textbf {Q}$ (e.g., see Supplement 1 for the numerical recipe of constructing $\textbf {P}$ and $\textbf {Q}$ in RCWA and a discussion of its relation to the method of line).

Once $\textbf {P}$ and $\textbf {Q}$ are determined, the scattering matrix $\textbf {S}$ can be constructed. A key step of this construction is to solve an eigenvalue problem, $(\textbf {P}\textbf {Q})\textbf {W} = \textbf {W}\boldsymbol{\Gamma }$, to obtain eigenvectors $\textbf {W}$ and the diagonal eigenvalue matrix $\boldsymbol{\Gamma }$. As we will discuss in Section 3, it is this eigenproblem that renders the differentiation of the scattering matrix ill-posed. To understand the challenges and how we overcome them, we first present the recipe of computing $\textbf {S}$ from $\textbf {W}$ and $\boldsymbol{\Gamma }$, as follows.

Let $\boldsymbol{\Omega }\;(\textbf {P}\textbf {Q})^{\frac {1}{2}}$. Then, its eigenvalue matrix is $\boldsymbol{\Lambda } = \boldsymbol{\Gamma }^{\frac {1}{2}}$. As derived in [22], the formulas of computing the scattering matrix $\textbf {S}$ defined in (1) are

$$\textbf{R}_{\textrm{L}} = \textbf{R}_{\textrm{R}} = \left(\textbf{A} -\textbf{X}\textbf{B}\textbf{A}^{-1}\textbf{X}\textbf{B}\right)^{-1}\left(\textbf{X}\textbf{B}\textbf{A}^{-1}\textbf{X}\textbf{A}-\textbf{B}\right),$$
$$\textbf{T}_{\textrm{LR}}=\textbf{T}_{\textrm{RL}} = \left(\textbf{A} -\textbf{X}\textbf{B}\textbf{A}^{-1}\textbf{X}\textbf{B}\right)^{-1}\textbf{X}\left(\textbf{A} - \textbf{B}\textbf{A}^{-1}\textbf{B}\right),$$
where the matrices $\textbf {X}$, $\textbf {A}$, and $\textbf {B}$ have the following forms:
$$\textbf{X} = e^{j\boldsymbol{\Lambda}\frac{L}{k_0}},$$
$$\textbf{A} = \textbf{W}^{-1}\textbf{W}_0 + \textbf{V}^{-1}\textbf{V}_0,\; \textrm{ and } \textbf{B} = \textbf{W}^{-1}\textbf{W}_0 - \textbf{V}^{-1}\textbf{V}_0.$$

Here we use $L$ to denote the layer thickness (Fig. 1), and the matrix $\textbf {V}$ is related to $\textbf {W}$ through $\textbf {V}=\textbf {Q}\textbf {W}\boldsymbol{\Lambda }^{-1}$. $\textbf {W}$ and $\textbf {V}$ together form a basis of electric and magnetic components for the optical waves in the layer. Similarly, $\textbf {W}_0$ and $\textbf {V}_0$ form a basis for free space propagation, independent from the photonic structure. They are constant values for computing the derivatives of $\textbf {S}$. The vectors, $\textbf {a}_{\textrm {L}}$, $\textbf {a}_{\textrm {R}}$ $\textbf {b}_{\textrm {L}}$, and $\textbf {b}_{\textrm {R}}$, in (1) are coefficients under this free-space basis to describe incident and scattered waves.

Once the scattering matrices of individual layers are computed, they are combined using the Redheffer star product [21] into the total scattering matrix, one that indicates the optical response of the entire photonic structure.

Remark. The formulas in Eqs. (3) and (4) assume that the current photonic layer is sandwiched by two free-space layers. This assumption is by no means a limitation. In an arbitrary photonic structure, the layers can be treated as if they are interleaved with free-space layers—each of which has a zero thickness.

3. Differentiable scattering matrix

The geometry or material distribution of photonic structure is specified by its structural (design) parameters (e.g., see Fig. 2). These parameters determine the structure’s permittivity and permeability distributions described by $\textbf {P}$ and $\textbf {Q}$ in (2). Thus, one can compute their derivatives, $\textbf {P}'$ and $\textbf {Q}'$, with respect to an arbitrary parameter. Given $\textbf {P}'$ and $\textbf {Q}'$, we now address the question of how to compute the scattering matrix derivative $\textbf {S}'$ with respect to the same parameter.

 figure: Fig. 2.

Fig. 2. Repetition from symmetry. Consider a meta-atom structure (b) whose cross-sectional shape (a) is parameterized by $\alpha$, which controls the size of the cross-shaped hollow region. The shape symmetry causes the structure’s propagating modes to have repeated effective indices (c), as also indicated by the repeated eigenvalues when one solves (2). As $\alpha$ varies, mode 0 and mode 1 always have the same effective index, meaning that their first-order and higher-order derivatives of the corresponding eigenvalues with respect to $\alpha$ are always the same, and thus their eigenvector derivatives are not mathematically well-defined. The same issue occurs in other modes when $\alpha$ becomes small and more propagating modes appear (e.g., see mode 5 and mode 6 when $\alpha$ is within $\sim [0.3,0.5]$).

Download Full Size | PDF

3.1 Challenges in scattering matrix differentiation

The construction of scattering matrix $\textbf {S}$ needs to solve an eigenvalue problem $(\textbf {P}\textbf {Q})\textbf {W} = \textbf {W}\boldsymbol{\Gamma }$, as the eigenvalues $\boldsymbol{\Gamma }$ and eigenvectors $\textbf {W}$ appear in Eqs. (3) and (4) for computing $\textbf {S}$. Thus, for the differentiation of $\textbf {S}$, it seems also needed to compute the derivatives of the eigenvalues $\boldsymbol{\Gamma }$ and eigenvectors $\textbf {W}$. Computing the eigenvalue derivatives are relatively straightforward, as has arisen in many fields [2325]. When all the eigenvalues are distinct, computing the derivatives of their eigenvectors are also well-posed [25].

However, significant challenges arise when there exist repeated eigenvalues. Repeated eigenvalues are not uncommon in photonic design: many photonic devices have certain structural symmetries, from which eigenvalue repetition naturally emerges (see Fig. 2). For those repeated eigenvalues, their eigenvectors (up to a scale) are not uniquely determined; any set of linearly independent vectors that span the same subspace are valid eigenvectors. Because of the ambiguity, as the structural parameter changes, those eigenvectors may change discontinuously (see an examples in Supplement 1), and thus their derivatives may not be well-defined.

As a result, one must carefully choose eigenvectors in the subspace of repeated eigenvalues such that the eigenvectors change continuously with respect to the structural parameter. This choice, however, is computationally expensive [19,24]. As derived in [19], to ensure well-defined eigenvector derivatives, one must compute higher-order derivatives of both the eigenvalues and the matrix $\textbf {P}\textbf {Q}$: if the repeated eigenvalues have repeated derivatives up to the $n$-th order (see Fig. 2), then derivatives up to the $(n+1)$-th order of the eigenvalues and the matrix $\textbf {P}\textbf {Q}$ must be computed to determine first-order eigenvector derivatives.

3.2 Differentiation without resort to eigenvector derivatives

We now present a new algorithm for computing the scattering matrix derivative $\textbf {S}'$. Even in the presence of repeated eigenvalues and their derivatives, our method requires only the first-order derivatives of the matrices $\textbf {P}$ and $\textbf {Q}$, completely sidestepping the differentiation of eigenvalues and eigenvectors. In comparison to the way that takes eigenvalue derivatives (as described above), our method is more robust and efficient.

First, we rewrite the commonly used expressions of scattering matrix components, shown in (3), in new forms,

$$\begin{aligned} \textbf{R}_{\textrm{L}} = \textbf{R}_{\textrm{R}} & = \left(\textbf{I} -\textbf{D}_1^{2}\right)^{-1}\left(\textbf{D}_1\textbf{D}_2 - \textbf{D}_3\right),\end{aligned}$$
$$\begin{aligned}\textbf{T}_{\textrm{LR}}=\textbf{T}_{\textrm{RL}} & = \left(\textbf{I} -\textbf{D}_1^{2}\right)^{-1}\left(\textbf{D}_2 - \textbf{D}_1\textbf{D}_3\right), \end{aligned}$$
where $\textbf {D}_1$, $\textbf {D}_2$, and $\textbf {D}_3$ denote the following matrix multiplications, respectively:
$$\textbf{D}_{1} :=\textbf{A}^{-1}\textbf{X}\textbf{B} = \left(\textbf{W}_0+\textbf{T}\textbf{V}_0\right)^{-1}\textbf{W}\textbf{X}\textbf{W}^{-1}\left(\textbf{W}_0-\textbf{T}\textbf{V}_0\right),$$
$$\textbf{D}_{2} :=\textbf{A}^{-1}\textbf{X}\textbf{A} = \left(\textbf{W}_0+\textbf{T}\textbf{V}_0\right)^{-1}\textbf{W}\textbf{X}\textbf{W}^{-1}\left(\textbf{W}_0+\textbf{T}\textbf{V}_0\right),$$
$$\textbf{D}_{3} :=\textbf{A}^{-1}\textbf{B} = \left(\textbf{W}_0+\textbf{T}\textbf{V}_0\right)^{-1}\left(\textbf{W}_0-\textbf{T}\textbf{V}_0\right).$$

The derivation of these new expressions (5) and (6) are provided in Supplement 1. In Eqs. (6), the equalities are reached by applying (4) and using $\textbf {T}$ that denotes $\textbf {T}\:\boldsymbol{\Omega }\textbf {Q}^{-1}$.

The expressions in (5) present a new route for computing scattering matrix derivative. They indicate that the scattering matrix $\textbf {S}$ is determined by the three matrices, $\textbf {D}_1$, $\textbf {D}_2$, and $\textbf {D}_3$. As a result, to compute its derivative $\textbf {S}'$ using the chain rule, we need to compute the derivatives of $\textbf {D}_1$, $\textbf {D}_2$, and $\textbf {D}_3$ with respect to the structural parameter.

In Eqs. (6), both $\textbf {W}_0$ and $\textbf {V}_0$ (introduced in (4)) are constant matrices. Thus, the derivatives of $\textbf {D}_1$, $\textbf {D}_2$, and $\textbf {D}_3$ only depend on the derivatives of two other matrices in Eqs. (6), namely $\textbf {T}$ and $\textbf {W}\textbf {X}\textbf {W}^{-1}$. We now describe how to compute the derivatives of the two matrices, respectively.

Derivative of $\textbf {T}$. The matrix $\textbf {T}\:\boldsymbol{\Omega }\textbf {Q}^{-1}$ is related to $\textbf {Q}$ and $\boldsymbol{\Omega }\:\left (\textbf {P}\textbf {Q}\right )^{{1}/{2}}$ but not the eigenvalues and eigenvectors. Its derivative can be expressed as

$$\textbf{T}' = \boldsymbol{\Omega}'\textbf{Q}^{-1}+\boldsymbol{\Omega}\left(\textbf{Q}^{-1}\right)' = \boldsymbol{\Omega}'\textbf{Q}^{-1} - \boldsymbol{\Omega}\textbf{Q}^{-1}\textbf{Q}'\textbf{Q}^{-1},$$
where $\textbf {Q}$ depends on the material permittivity distributions of the photonic structure, and therefore its derivative $\textbf {Q}'$ with respect to a design parameter can be directly computed (see examples in Section 4). The way of computing $\boldsymbol{\Omega }'$ can be derived by taking the derivatives on both sides of the relation $\boldsymbol{\Omega }^{2}=\textbf {P}\textbf {Q}$, which yields
$$\boldsymbol{\Omega}'\boldsymbol{\Omega} + \boldsymbol{\Omega}\boldsymbol{\Omega}' = \textbf{P}'\textbf{Q} + \textbf{P}\textbf{Q}'.$$

Given $\textbf {P}'$ and $\textbf {Q}'$, the right-hand side of this equation can be directly computed. To compute $\boldsymbol{\Omega }'$, we rewrite the left-hand side by denoting $\boldsymbol{\Omega }'$ as $\boldsymbol{\Omega }'=\textbf {W}\textbf {Y}\textbf {W}^{-1}$ for some unknown $\textbf {Y}$, where $\textbf {W}$ is the eigenvector matrix of $\textbf {P}\textbf {Q}$. Using the fact that $\boldsymbol{\Omega } = (\textbf {P}\textbf {Q})^{1/2}=\textbf {W}\boldsymbol{\Lambda }\textbf {W}^{-1}$, we obtain a simplified form of (8):

$$\textbf{Y}\boldsymbol{\Lambda} + \boldsymbol{\Lambda}\textbf{Y} = \textbf{W}^{-1}(\textbf{P}'\textbf{Q}+\textbf{P}\textbf{Q}')\textbf{W}.$$

From (9), $\textbf {Y}$ can be easily solved by noticing that $\boldsymbol{\Lambda }$ is a diagonal matrix, and therefore (9) can be written element-wise as $(\lambda _i + \lambda _j)\textbf {Y}_{ij}=\textbf {C}_{ij}$, where $\lambda _i$ is the $i$-th eigenvalue in $\boldsymbol{\Lambda }$, and $\textbf {C}$ denote the matrix on the right-hand side of (9). In other words, the elements of $\textbf {Y}$ can be obtained by solving $n^{2}$ 1D linear equations in parallel. Once $\textbf {Y}$ is obtained, $\boldsymbol{\Omega }'$ is computed using $\boldsymbol{\Omega }'=\textbf {W}\textbf {Y}\textbf {W}^{-1}$.

We note that while this process of computing $\boldsymbol{\Omega }'$ requires the eigenvectors $\textbf {W}$ and eigenvalues $\boldsymbol{\Lambda }$, they are also needed for computing the scattering matrix in the first place. Our solving process does not require the derivatives of eigenvectors. Therefore, it introduces no additional effort in terms of eigen-decomposition.

Derivative of $\textbf {W}\textbf {X}\textbf {W}^{-1}$. In the first glance, the derivative of $\textbf {W}\textbf {X}\textbf {W}^{-1}$ depends on the eigenvectors $\textbf {W}$. However, from the definition of $\textbf {X}$ in (4a), we notice that $\textbf {W}\textbf {X}\textbf {W}^{-1}=e^{j\boldsymbol{\Omega }{L}/{k_0}}$, which suggests an alternative approach: take the derivative of the matrix exponential $e^{j\boldsymbol{\Omega }{L}/{k_0}}$ with respect to $\boldsymbol{\Omega }$.

A common approach of computing the matrix exponential $e^{j\boldsymbol{\Omega }{L}/{k_0}}$ is through the eigen-decomposition of $\boldsymbol{\Omega }$ followed by the exponential of the resulting eigenvalues. If we take this approach, the derivative computation must involve the derivatives of eigenvectors, which might not be well-defined. Another approach, used by Feynman [26] and others [2729], expresses the derivative of a matrix exponential using an integral that in itself involves matrix exponentials. Yet, numerically evaluating the matrix exponentials and the integral are expensive.

Instead, our proposed method for computing the derivative is based on the following proposition originally proved in [30].

Proposition 1 Consider an $n\times n$ matrix $\boldsymbol{\Omega }$ and its derivative $\boldsymbol{\Omega }'$ with respect to an arbitrary parameter. If

$$\textbf{G}=\left[\begin{matrix}\boldsymbol{\Omega} & \boldsymbol{\Omega}' \\ \textbf{0} & \boldsymbol{\Omega} \end{matrix}\right],\;{then } e^{j\textbf{G} {L}/{k_0}} = \left[\begin{matrix} e^{j\boldsymbol{\Omega}{L}/{k_0}} & \left(e^{j\boldsymbol{\Omega}{L}/{k_0}}\right)' \\ \textbf{0} & e^{j\boldsymbol{\Omega}{L}/{k_0}} \end{matrix}\right],$$
where the top-right $n\times n$ block matrix in $e^{j\textbf {G} {L}/{k_0}}$ is the derivative of the matrix exponential $e^{j\boldsymbol{\Omega }{L}/{k_0}}$.

In our problem, $\boldsymbol{\Omega }'$ is computed as described above (by solving (9)), and the common way of computing $e^{j\textbf {G} {L}/{k_0}}$ is by taking the eigen-decomposition of $\textbf {G}$, which is again what we wish to avoid. We therefore take a different approach, the scaling and squaring method [31], to compute $e^{j\textbf {G} {L}/{k_0}}$—without the need of eigen-decomposition.

The scaling and squaring method exploits the relation $e^{\textbf {A}} = \left (e^{\textbf {A}/\sigma }\right )^{\sigma }$ for any $n\times n$ matrix $\textbf {A}$. In practice, $\sigma$ is chosen to be $\sigma =2^{s}$ for some non-negative integer $s$. The idea is to have the norm of $\textbf {A}/\sigma$ sufficiently small such that $e^{\textbf {A}/\sigma }$ can be well approximated by a Padé approximant near the origin. The Padé approximant is a rational polynomial of $\textbf {A}$. Its evaluation requires only matrix multiplications and inverse, but no eigen-decomposition. The scaling and squaring method is robust and accurate, and has been used in many numerical tools (such as MATLAB’s expm function).

When applying this method, we further exploit the specific structure of $\textbf {G}$ (i.e., its bottom-left block matrix vanishes, and its two diagonal block matrices are identical) to tailor the method for improving computational performance. Supplement 1 presents our detailed derivations and computational steps.

4. Results

This section presents our numerical results. First, we validate our algorithm for computing scattering matrix derivatives. Next, to demonstrate the use of scattering matrix derivatives in photonics, we optimize the geometry of photonic metasurface units (also called meta-atoms). All the numerical studies are performed on a workstation with an Intel Xeon E5-1620 CPU running at 3.60GHz.

Meta-atoms are the building blocks of a metasurface, often designed based on physical intuitions and manually crafted libraries [3335]. More recently, inverse design methods of meta-atom structures have also been explored—e.g., through finite-difference-based gradient descent [36], adjoint-based level-set method [37], and topological optimization [38,39].

Due to fabrication constraints, meta-atoms often have constant cross-sectional shapes along one direction (i.e., $z$-direction, as shown in Fig. 3(a)). Thus, the semi-analytical methods (such as RCWA) are particularly efficient for simulating meta-atoms, thanks to their ability of not discretizing along $z$-direction [4]. Our method, for the first time, enables the semi-analytical methods to also compute scattering matrix derivatives with respect to design parameters. Here, in the framework of RCWA, we demonstrate automatic discovery of meta-atom structures that reach various amplitude and phase goals.

 figure: Fig. 3.

Fig. 3. Accuracy comparison. (a) We use our method and FDTD (Lumerical [32]) to analyze a 3D meta-atom. The width of the pillar and its square hole are $0.6 \mu \textrm {m}$ and $0.2\mu \textrm {m}$, respectively, and it has a height of 1.4 $\mu \textrm {m}$. We use the periodic boundary condition, with the period of $0.66\mu \textrm {m}$. (b) We scan the (x-polarized) wavelength and plot the effective indices of the fundamental mode evaluated by FDFD and our method. (c) For each wavelength (color mapped here), we compute the far-field amplitudes and phases changes, and compare our results to FDTD.

Download Full Size | PDF

4.1 Validation

To validate our algorithm, we consider a dielectric meta-atom used in metasurface holography [34,35]. Its structure is shown in Fig. 3(a). We use Eqs. (5) to compute the scattering matrix, for which the matrices $\textbf {P}$ and $\textbf {Q}$ (introduced in (2)) are constructed using RCWA. First, we validate the accuracy of Eqs. (5) for scattering matrix computation. To this end, we compare the scattering matrices resulted from Eqs. (5) to those resulted from the 3D finite-difference time-domain (FDTD) method in Lumerical [32].

We scan the light wavelength from $1.2 \mu \textrm {m}$ to $1.6 \mu \textrm {m}$. For each wavelength, we compute, using our scattering matrix and FDFD respectively, the effective index of the fundamental mode propagating in the meta-atom. The results from our method agree with FDFD results (see Fig. 3(b)). Furthermore, we consider the far-field light transmission through the meta-atom, and compute the phase shift and amplitude change for each wavelength. Again, the results from our method and FDTD match closely, as shown in Fig. 3(c).

These numerical studies confirm that our scattering matrix computation is as accurate as the FDTD method in Lumerical. In terms of computational cost, our method takes 0.15 seconds for each monochromatic simulation, and 3.6 seconds for the entire $1.2 \mu \textrm {m}$-$1.6 \mu \textrm {m}$ wavelength range, whereas the FDTD simulation takes 179 seconds.

Next, we validate our derivative computation. We consider again the meta-atom structure shown in Fig. 3(a), and choose the parameter $\alpha$ to be the size of the hollow square. Using our method, we compute the derivative of the structure’s scattering matrix with respect to $\alpha$. Meanwhile, since there is no analytic expression of the scattering matrix derivative, we approximate it using finite difference (FD) estimation, that is,

$$\frac{\partial \textbf{S}}{\partial \alpha} \approx \frac{\textbf{S}(\alpha + \Delta\alpha) - \textbf{S}(\alpha - \Delta\alpha)}{2\Delta\alpha}.$$

We estimate $\frac {\partial \textbf {S}}{\partial \alpha }$ using a sweeping range of $\Delta \alpha$ values, and compare them to the derivative resulted from our method.

The results are illustrated in Fig. 4. The accuracy of FD approximation largely depends on the choice of $\Delta \alpha$. Only when $\Delta \alpha$ is chosen within a certain range, FD approximation is accurate enough to agree with our derivative results. This agreement confirms the correctness of our method. But for different elements in the scattering matrix, the valid $\Delta \alpha$ range varies (indicated in light green in Fig. 4), suggesting that FD approximation is impractical: it is hard, if not impossible, to choose a proper $\Delta \alpha$ to produce accurate derivative estimations for all elements in the scattering matrix. In contrast, our method is robust for computing the derivatives.

 figure: Fig. 4.

Fig. 4. Scattering matrix derivatives. We choose two matrix elements in the scattering matrix, and plot their FD derivatives estimated with different FD sizes ($\Delta \alpha$ in $x$-axis) in (a) and (b), respectively. In each plot, the red and blue solid curves correspond to the real and imaginary parts of the estimated derivative. Meanwhile, the derivatives computed by our method are indicated by the red (real) and blue (imaginary) horizontal dash lines. These plots show that FD estimation is highly sensitive to $\Delta \alpha$. Light green regions indicate valid $\Delta \alpha$ ranges for both matrix elements. The valid $\Delta \alpha$ varies element by element. Thus, in practice, it is hard to choose a proper $\Delta \alpha$ for the entire scattering matrix, whereas our method is always robust.

Download Full Size | PDF

Computational cost. In addition to the robustness, our method is also faster than the FD method. In the FD method, computing a matrix derivative requires the computation of two scattering matrices $\textbf {S}(\alpha +\Delta \alpha )$ and $\textbf {S}(\alpha -\Delta \alpha )$. In contrast, our method, in addition to computing $\textbf {S}(\alpha )$, only requires a few matrix multiplications and inverses (recall Section 3.3.2). In our test, we use 25$\times$25 harmonics in the RCWA method. It takes 0.18 seconds to compute the scattering matrix, and takes 0.25 seconds to compute the scattering matrix and its derivative. This cost is $1.4\times$ the cost of computing the scattering matrix itself, faster than the FD method (which is 2$\times$ more expensive than computing the scattering matrix, as it evaluates the scattering matrix twice).

4.2 Use case: optimization of meta-atom structure

Controlling phase and amplitude of monochromatic light. First, we optimize meta-atom structures to reach specific transmitted amplitudes and phases for a monochromatic light (at 1.55$\mu \textrm {m}$ wavelength, $x$-polarized). The cross-sectional shape is shown in Fig. 5(a), determined by two parameters. The objective function for the inverse design is defined as

$$\mathcal{L} = {\left| \textbf{T}_{\textrm{LR}}(m, m) - t_m\right|}^{2},$$
where $\textbf {T}_{\textrm {LR}}$ is the transmission submatrix in the scattering matrix (recall (1)), $m$ is the mode index for the incident and outgoing light in free space, thus $\textbf {T}_{\textrm {LR}}(m, m)$ denotes the $m$-th diagonal element of the matrix. Also, $t_m$ is a complex constant specifying the target amplitude and phase of the transmission. Here we consider the fundamental mode (the way to choose corresponding $m$ is given in Supplement 1), which describes the far-field light transmission along the $z$-direction.

 figure: Fig. 5.

Fig. 5. Optimization of (12). (a) We optimize the cross-sectional shape specified by two design parameters $\alpha$ and $\beta$ in order to reach a target transmission amplitude and phase. We examine different targets evenly sampled on a circle on the complex plane (indicated by the square dots in (b)). The optimized amplitudes and phases (indicated by triangular dots) reach closely to the targets. As a reference, we also show the amplitudes and phases (in circular dots) of the designs that globally minimize (12), that is, ones obtained through a slow, exhaustive search of all parameter combinations. While no gradient-based optimization algorithm can guarantee the global minimum of (12), our results approach the targets closely, comparable to what the global minimums can achieve. The resulting cross-sectional shape for each sampled target are shown in (c).

Download Full Size | PDF

To verify the robustness of our method and the enabled optimization, we evenly sample different targets $t_m$ on a circle on the complex plane (see Fig. 5(b)), that is, targets all having the amplitude 0.9 but different phases. For each target, we find meta-atom’s shape parameters by minimizing (12) through a gradient-descent algorithm [13], for which the gradients of (12) with respect to the design parameters are computed using our method. As shown in Fig. 5(b), we are able to automatically discover structures that reach these targets closely.

Controlling phases for both $x-$ and $y-$polarized light. Next, we optimize meta-atom structures to obtain target responses for $x$- and $y$-polarized light, simultaneously. This type of meta-atoms has been used to construct metasurface holograms [40]. In our example, the light wavelength is 1.3$\mu \textrm {m}$; the meta-atoms have a fixed height of 2.0$\mu \textrm {m}$ and a period of 2.5$\mu \textrm {m}$ along $x$- and $y$-direction. The cross-sectional shape of the meta-atoms are specified by two parameters shown in Fig. 6(a). We determine the parameters by minimizing the following objective function:

$$\mathcal{L} = -\frac{\textbf{T}_{\textrm{LR}}(m_x, m_x)}{\left|\textbf{T}_{\textrm{LR}}(m_x, m_x)\right|}t_x^{*}-\frac{\textbf{T}_{\textrm{LR}}(m_y, m_y)}{\left|\textbf{T}_{\textrm{LR}}(m_y, m_y)\right|}t_y^{*},$$
where the subscript $x$ (and $y$) indicates light polarization; $t_x$ (and $t_y$) are the target phase changes from $x$-polarized (and $y$-polarized) incident light to the outgoing light with the same polarization (i.e., $t_x=\exp (i\phi _x)$ for some $\phi _x$). The first term in (13) measures, for the $x$-polarized light, the cosine difference (through dot product on complex plane) between the $m$-th mode’s phase change and the target phase change, and similarly for the second term. The optimized structures for different $x$- and $y$-polarized phase targets are shown in Fig. 6. In all cases, the residual between the target and the resulting phase change is within 7% of one period ($2\pi$), and in most cases within 1%.

 figure: Fig. 6.

Fig. 6. Optimization of (13). (a) We optimize the meta-atom structure described by two parameters. The goal is to achieve certain phase changes for $x$- and $y$-polarized light simultaneously. (b) We sample six target phase changes for $x$- and $y$-polarized light, respectively. Their combination forms 36 different optimization targets. For each target, our optimization produces a cross-sectional shape design. (c) For $x$-polarized incident light, we show the residual (in terms of phase angle difference) between each pair of inversely designed phase change and the target. And similar visualization for $y$-polarized light is shown in (d).

Download Full Size | PDF

In this study, we assume that all meta-atoms have the same height due to the restriction in most current fabrication processes. Advanced fabrication techniques allow the meta-atom’s height to vary. To demonstrate the use of our method with those fabrication techniques, we further improve the designs in Fig. 6 by including meta-atom’s height as another optimization parameter. The study are described in Supplement 1, and the resulting designs and their improved performance are reported in Fig. S1 therein.

Controlling amplitudes for multiple wavelengths. We also demonstrate inverse design of meta-atoms for another type of optical response: obtain two target amplitude responses at two separate wavelengths, simultaneously. This type of responses have proven useful for making colored metasurface holograms [34,35]. Here we consider two archetypes used in [34], each described by two parameters (see Fig. 7(a)). The two wavelengths under consideration are 1.2$\mu \textrm {m}$ (labeled as blue) and 1.6$\mu \textrm {m}$ (red), and the objective function is defined as

$$\mathcal{L} = \left[ \left|\textbf{T}_{\textrm{LR},1}(m, m)\right|^{2} - A^{2}_{1}\right]^{2} + \left[ \left|\textbf{T}_{\textrm{LR},2}(m, m)\right|^{2} - A^{2}_{2}\right]^{2}.$$

Here the subscript “1” and “2” indicate the blue (1.2$\mu \textrm {m}$) and red (1.6$\mu \textrm {m}$) wavelength, respectively. The first term accounts for the blue wavelength: $\textbf {T}_{\textrm {LR},1}$ is the transmission submatrix of the scattering matrix and $A_1$ is the desired amplitude. Similar is the second term. More terms can be added in (14) to incorporate more than two wavelengths.

 figure: Fig. 7.

Fig. 7. Optimization of (14). (a) We optimize meta-atom structures described by two archetypes, each with two parameters. The goal is to obtain desired amplitude responses at two separate wavelengths (i.e., 1.2$\mu \textrm {m}$ (blue) and 1.6$\mu \textrm {m}$ (red)), simultaneously. We sample five amplitudes from 0.2 to 1 for each wavelength, forming 25 different optimization targets. Each target leads to a different cross-sectional design shown in (b). For red light, the discrepancies between achieved amplitudes and the targets are shown in (c), and the same visualization for blue light is shown in (d).

Download Full Size | PDF

For each archetype, we find its parameter values via a gradient-descent algorithm that minimizes (14), and choose between the two archetypes one that produces a smaller objective value. The optimized structures and their performances are shown in Fig. 7. For almost all the optimized meta-atoms (each with a different amplitude target), the resulting amplitudes match closely to their targets.

General cross-sectional shape design. Lastly, we introduce a new way to inverse design the meta-atom’s cross-sectional shape under a general representation. We use the star-convex polygon [41] to represent the cross-sectional shape. Such a shape can be discretized by sampling $N$ points on its boundary so that the polar angles of these points are evenly distributed over $[0, 2\pi ]$. In other words, the $(k+1)$-th point has the coordinate $p_k\left [\cos {\left ( 2k\pi /N\right )}, -\sin {\left ( 2k\pi /N\right )}\right ]$, where $p_k$ is a non-negative value (see Fig. 8(a)), and the shape is specified by $N$ parameters $p_1,\ldots ,p_N$. A large $N$ offers many degrees of freedom to represent a complex shape, but meanwhile renders exhaustive search through the entire parameter space too expensive—one must rely on numerical optimization methods to determine the parameter values.

 figure: Fig. 8.

Fig. 8. Inverse design of star-convex meta-atoms. (a) The star-convex polygon is used to represent the cross-section of a meta-atom, defined by many control variables ($p_1$$p_8$ in this case). As an example, we inverse design the shape for reaching target amplitudes and phases in two scattering directions (i.e., corresponding to diffraction orders (−1,0) and (1,0) in (b)) simultaneously. (c-d) We perform two numerical studies to reach two sets of $(t_1, t_2)$ goals shown in the plots. In each experiment, our optimization finds the design parameters within hundreds of iterations, resulting in nontrivial shapes that are hard to be manually designed.

Download Full Size | PDF

This shape representation is particularly suitable for RCWA-based analysis, as it allows for a closed-form 2D Fourier transform of the shape (and thus the permittivity distribution) [42]. In RCWA framework, 2D Fourier transform of the cross-sectional permittivity distribution is needed for computing the matrices, $\textbf {P}$ and $\textbf {Q}$, as well as their derivatives with respect to the $p_k$ parameters. Supplement 1 provides the details of this process.

As examples, we optimize octagons ($N=8$) to obtain desired optical responses in different scattering directions. First, we specify the target scattering directions. Notice that to predict optical behavior of a single meta-atom in simulation, periodic boundary condition is often used. Under this condition, the meta-atom is effectively a 2D grating structure, for which we can use diffraction orders to specify different scattering directions: the output light with diffraction order $(p, q)$ is along the direction

$$\vec{k} = \left(\frac{2\pi p}{L_x}, \frac{2\pi q}{L_y}, 1 \right),$$
where $L_x$ and $L_y$ are periods along $x$- and $y$-axis, respectively ($L_x=L_y=1\mu \textrm {m}$ in our examples).

We consider $x$-polarized light with the wavelength of 1.55. The goal here is to obtain specified far-field phases and amplitudes at two scattering directions—ones that correspond to the diffraction orders, $(-1,0)$ and $(1, 0)$, as shown in Fig. 8(b). We further restrict $p_k$ to be in the range $[0.15,0.45]$, and determine $p_k$ values by minimizing

$$\mathcal{L} = {\left| \textbf{T}_{\textrm{LR}}(m, n_1) - t_1 \right|}^{2} + {\left| \textbf{T}_{\textrm{LR}}(m, n_2) - t_2 \right|}^{2},$$
where $n_1$ and $n_2$ are mode indices for the diffraction orders $(-1,0)$ and $(1,0)$, respectively; and $t_1$ and $t_2$ specify the target phases and amplitudes (as complex values) in the two outgoing directions. We perform two numerical studies for two sets of $t_1$ and $t_2$ goals. The optimization convergence curves and resulting shapes are shown in Fig. 8.

5. Conclusion

We have presented an algorithm for computing the derivatives of the scattering matrices of a photonic structure with respect to its structural parameters. Our method is built on the framework of semi-analytical methods for analyzing photonic structures. A key step in semi-analytical methods for computing scattering matrices is the eigen-decomposition. However, to compute scattering matrix derivatives, directly differentiating the eigenvalue analysis poses significant difficulties. We show a new route to compute scattering matrix derivatives without the need of differentiating the eigen-decomposition process.

The scattering matrix derivatives describe how a photonic structure’s performance will change as its structural parameters vary. While we demonstrated their use in optimization of meta-atom units, they can be found useful in many other applications. Therefore, our method may serve as a useful analysis tool in a wide range of photonic design tasks.

Funding

Directorate for Computer and Information Science and Engineering (1717178, 1816041, CAREER-1453101).

Acknowledgments

We thank Nanfang Yu for valuable suggestions.

Disclosures

The authors declare no conflicts of interest.

See Supplement 1 for supporting content.

References

1. P. Lalanne and E. Silberstein, “Fourier-modal methods applied to waveguide computational problems,” Opt. Lett. 25(15), 1092–1094 (2000). [CrossRef]  

2. S. M. Kamali, E. Arbabi, A. Arbabi, Y. Horie, M. Faraji-Dana, and A. Faraon, “Angle-multiplexed metasurfaces: encoding independent wavefronts in a single metasurface under different illumination angles,” Phys. Rev. X 7(4), 041056 (2017). [CrossRef]  

3. R. Pregla and W. Pascher, “The method of lines,” Numerical techniques for microwave and millimeter wave passive structures 1, 381–446 (1989).

4. M. Moharam and T. Gaylord, “Rigorous coupled-wave analysis of planar-grating diffraction,” J. Opt. Soc. Am. 71(7), 811–818 (1981). [CrossRef]  

5. A. Arbabi, Y. Horie, M. Bagheri, and A. Faraon, “Dielectric metasurfaces for complete control of phase and polarization with subwavelength spatial resolution and high transmission,” Nat. Nanotechnol. 10(11), 937–943 (2015). [CrossRef]  

6. A. Ahmed, M. Liscidini, and R. Gordon, “Design and analysis of high-index-contrast gratings using coupled mode theory,” IEEE Photonics J. 2(6), 884–893 (2010). [CrossRef]  

7. R. Zhao, B. Sain, Q. Wei, C. Tang, X. Li, T. Weiss, L. Huang, Y. Wang, and T. Zentgraf, “Multichannel vectorial holographic display and encryption,” Light: Sci. Appl. 7(1), 95 (2018). [CrossRef]  

8. E. Arbabi, S. M. Kamali, A. Arbabi, and A. Faraon, “Full-stokes imaging polarimetry using dielectric metasurfaces,” ACS Photonics 5(8), 3132–3140 (2018). [CrossRef]  

9. K. X. Wang, Z. Yu, V. Liu, Y. Cui, and S. Fan, “Absorption enhancement in ultrathin crystalline silicon solar cells with antireflection and light-trapping nanocone gratings,” Nano Lett. 12(3), 1616–1619 (2012). [CrossRef]  

10. E. Rephaeli, A. Raman, and S. Fan, “Ultrabroadband photonic structures to achieve high-performance daytime radiative cooling,” Nano Lett. 13(4), 1457–1461 (2013). [CrossRef]  

11. Y. Shen, V. Rinnerbauer, I. Wang, V. Stelmakh, J. D. Joannopoulos, and M. Soljacic, “Structural colors from fano resonances,” ACS Photonics 2(1), 27–32 (2015). [CrossRef]  

12. Z. Wang and Z. Yu, “Spectral analysis based on compressive sensing in nanophotonic structures,” Opt. Express 22(21), 25608–25614 (2014). [CrossRef]  

13. S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv preprint arXiv:1609.04747 (2016).

14. C. M. Lalau-Keraly, S. Bhargava, O. D. Miller, and E. Yablonovitch, “Adjoint shape optimization applied to electromagnetic design,” Opt. Express 21(18), 21693–21701 (2013). [CrossRef]  

15. H. Akel and J. Webb, “Design sensitivities for scattering-matrix calculation with tetrahedral edge elements,” IEEE Trans. Magn. 36(4), 1043–1046 (2000). [CrossRef]  

16. G. Veronis, R. W. Dutton, and S. Fan, “Method for sensitivity analysis of photonic crystal devices,” Opt. Lett. 29(19), 2288–2290 (2004). [CrossRef]  

17. Z. Zhu, U. D. Dave, M. Lipson, and C. Zheng, “Inverse geometric design of fabrication-robust nanophotonic waveguides,” in CLEO: Science & Innovations, (Optical Society of America, 2020).

18. J. Joannopoulos, R. Meade, and J. Winn, “Photonic crystals,” Molding the flow of light (1995).

19. N. van der Aa, H. Ter Morsche, and R. Mattheij, “Computation of eigenvalue and eigenvector derivatives for a general complex-valued eigensystem,” The electronic journal of linear algebra 16, 1 (2007).

20. Z. Zhu and C. Zheng, “DiffSMat,” https://github.com/Columbia-Computational-X-Lab/DiffSMat (2020).

21. R. Redheffer, “Difference equations and functional equations in transmission-line theory,” Modern mathematics for the engineer 12, 282–337 (1961).

22. R. C. Rumpf, “Improved formulation of scattering matrices for semi-analytical methods that is consistent with convention,” Prog. Electromagn. Res. 35, 241–261 (2011). [CrossRef]  

23. M. I. Friswell and S. Adhikari, “Derivatives of complex eigenvectors using nelson’s method,” AIAA J. 38(12), 2355–2357 (2000). [CrossRef]  

24. U. Prells and M. I. Friswell, “Calculating derivatives of repeated and nonrepeated eigenvalues without explicit use of eigenvectors,” AIAA J. 38(8), 1426–1436 (2000). [CrossRef]  

25. L. Li, Y. Hu, and X. Wang, “Design sensitivity and hessian matrix of generalized eigenproblems,” Mech. Syst. Signal Process. 43(1-2), 272–294 (2014). [CrossRef]  

26. R. P. Feynman, “An operator calculus having applications in quantum electrodynamics,” Phys. Rev. 84(1), 108–128 (1951). [CrossRef]  

27. R. Karplus and J. Schwinger, “A note on saturation in microwave spectroscopy,” Phys. Rev. 73(9), 1020–1026 (1948). [CrossRef]  

28. R. Bellman, Introduction to matrix analysis, vol. 19 (Siam, 1997).

29. R. Snider, “Perturbation variation methods for a quantum boltzmann equation,” J. Math. Phys. 5(11), 1580–1587 (1964). [CrossRef]  

30. I. Najfeld and T. F. Havel, “Derivatives of the matrix exponential and their computation,” Adv. applied mathematics 16(3), 321–375 (1995). [CrossRef]  

31. N. J. Higham, “The scaling and squaring method for the matrix exponential revisited,” SIAM J. on Matrix Analysis Appl. 26(4), 1179–1193 (2005). [CrossRef]  

32. F. Solutions, “Lumerical solutions inc,” Vancouver, Canada (2003).

33. N. Yu, F. Aieta, P. Genevet, M. A. Kats, Z. Gaburro, and F. Capasso, “A broadband, background-free quarter-wave plate based on plasmonic metasurfaces,” Nano Lett. 12(12), 6328–6333 (2012). [CrossRef]  

34. A. C. Overvig, S. Shrestha, S. C. Malek, M. Lu, A. Stein, C. Zheng, and N. Yu, “Dielectric metasurfaces for complete and independent control of the optical amplitude and phase,” Light: Sci. Appl. 8(1), 92 (2019). [CrossRef]  

35. A. Overvig, S. Shrestha, C. Xiao, C. Zheng, and N. Yu, “Two-color and 3d phase-amplitude modulation holograms,” in CLEO: QELS_Fundamental Science, (Optical Society of America, 2018), pp. FF1F–6.

36. A. S. Backer, “Computational inverse design for cascaded systems of metasurface optics,” Opt. Express 27(21), 30308–30331 (2019). [CrossRef]  

37. M. Mansouree and A. Arbabi, “Metasurface design using level-set and gradient descent optimization techniques,” in 2019 International Applied Computational Electromagnetics Society Symposium (ACES), (IEEE, 2019), pp. 1–2.

38. D. Sell, J. Yang, S. Doshay, R. Yang, and J. A. Fan, “Large-angle, multifunctional metagratings based on freeform multimode geometries,” Nano Lett. 17(6), 3752–3757 (2017). [CrossRef]  

39. Z. Lin, V. Liu, R. Pestourie, and S. G. Johnson, “Topology optimization of freeform large-area metasurfaces,” Opt. Express 27(11), 15765–15775 (2019). [CrossRef]  

40. W. T. Chen, K.-Y. Yang, C.-M. Wang, Y.-W. Huang, G. Sun, I.-D. Chiang, C. Y. Liao, W.-L. Hsu, H. T. Lin, S. Sun, L. Zhou, A. Q. Liu, and D. P. Tsai, “High-efficiency broadband meta-hologram with polarization-controlled dual images,” Nano Lett. 14(1), 225–230 (2014). [CrossRef]  

41. J. C. Stanek, “A characterization of starshaped sets,” Can. J. Math. 29(4), 673–680 (1977). [CrossRef]  

42. S.-W. Lee and R. Mittra, “Fourier transform of a polygonal shape function and its application in electromagnetics,” IEEE Trans. Antennas Propag. 31(1), 99–103 (1983). [CrossRef]  

Supplementary Material (2)

NameDescription
Code 1       A full C++ implementation source code
Supplement 1       Supplemental Document

Cited By

Optica participates in Crossref's Cited-By Linking service. Citing articles from Optica Publishing Group journals and other participating publishers are listed here.

Alert me when this article is cited.


Figures (8)

Fig. 1.
Fig. 1. (a) In a photonic structure, light may be incident from both sides and get scattered out. The relationship of incident and scattered light is characterized by the scattering matrix. (b) A complex structure can be decomposed into individual layers. Each layer is characterized by its scattering matrix, and these scattering matrices are then combined (using the Redheffer star product [21]) to form the scattering matrix of the entire structure.
Fig. 2.
Fig. 2. Repetition from symmetry. Consider a meta-atom structure (b) whose cross-sectional shape (a) is parameterized by $\alpha$ , which controls the size of the cross-shaped hollow region. The shape symmetry causes the structure’s propagating modes to have repeated effective indices (c), as also indicated by the repeated eigenvalues when one solves (2). As $\alpha$ varies, mode 0 and mode 1 always have the same effective index, meaning that their first-order and higher-order derivatives of the corresponding eigenvalues with respect to $\alpha$ are always the same, and thus their eigenvector derivatives are not mathematically well-defined. The same issue occurs in other modes when $\alpha$ becomes small and more propagating modes appear (e.g., see mode 5 and mode 6 when $\alpha$ is within $\sim [0.3,0.5]$ ).
Fig. 3.
Fig. 3. Accuracy comparison. (a) We use our method and FDTD (Lumerical [32]) to analyze a 3D meta-atom. The width of the pillar and its square hole are $0.6 \mu \textrm {m}$ and $0.2\mu \textrm {m}$ , respectively, and it has a height of 1.4 $\mu \textrm {m}$ . We use the periodic boundary condition, with the period of $0.66\mu \textrm {m}$ . (b) We scan the (x-polarized) wavelength and plot the effective indices of the fundamental mode evaluated by FDFD and our method. (c) For each wavelength (color mapped here), we compute the far-field amplitudes and phases changes, and compare our results to FDTD.
Fig. 4.
Fig. 4. Scattering matrix derivatives. We choose two matrix elements in the scattering matrix, and plot their FD derivatives estimated with different FD sizes ( $\Delta \alpha$ in $x$ -axis) in (a) and (b), respectively. In each plot, the red and blue solid curves correspond to the real and imaginary parts of the estimated derivative. Meanwhile, the derivatives computed by our method are indicated by the red (real) and blue (imaginary) horizontal dash lines. These plots show that FD estimation is highly sensitive to $\Delta \alpha$ . Light green regions indicate valid $\Delta \alpha$ ranges for both matrix elements. The valid $\Delta \alpha$ varies element by element. Thus, in practice, it is hard to choose a proper $\Delta \alpha$ for the entire scattering matrix, whereas our method is always robust.
Fig. 5.
Fig. 5. Optimization of (12). (a) We optimize the cross-sectional shape specified by two design parameters $\alpha$ and $\beta$ in order to reach a target transmission amplitude and phase. We examine different targets evenly sampled on a circle on the complex plane (indicated by the square dots in (b)). The optimized amplitudes and phases (indicated by triangular dots) reach closely to the targets. As a reference, we also show the amplitudes and phases (in circular dots) of the designs that globally minimize (12), that is, ones obtained through a slow, exhaustive search of all parameter combinations. While no gradient-based optimization algorithm can guarantee the global minimum of (12), our results approach the targets closely, comparable to what the global minimums can achieve. The resulting cross-sectional shape for each sampled target are shown in (c).
Fig. 6.
Fig. 6. Optimization of (13). (a) We optimize the meta-atom structure described by two parameters. The goal is to achieve certain phase changes for $x$ - and $y$ -polarized light simultaneously. (b) We sample six target phase changes for $x$ - and $y$ -polarized light, respectively. Their combination forms 36 different optimization targets. For each target, our optimization produces a cross-sectional shape design. (c) For $x$ -polarized incident light, we show the residual (in terms of phase angle difference) between each pair of inversely designed phase change and the target. And similar visualization for $y$ -polarized light is shown in (d).
Fig. 7.
Fig. 7. Optimization of (14). (a) We optimize meta-atom structures described by two archetypes, each with two parameters. The goal is to obtain desired amplitude responses at two separate wavelengths (i.e., 1.2 $\mu \textrm {m}$ (blue) and 1.6 $\mu \textrm {m}$ (red)), simultaneously. We sample five amplitudes from 0.2 to 1 for each wavelength, forming 25 different optimization targets. Each target leads to a different cross-sectional design shown in (b). For red light, the discrepancies between achieved amplitudes and the targets are shown in (c), and the same visualization for blue light is shown in (d).
Fig. 8.
Fig. 8. Inverse design of star-convex meta-atoms. (a) The star-convex polygon is used to represent the cross-section of a meta-atom, defined by many control variables ( $p_1$ $p_8$ in this case). As an example, we inverse design the shape for reaching target amplitudes and phases in two scattering directions (i.e., corresponding to diffraction orders (−1,0) and (1,0) in (b)) simultaneously. (c-d) We perform two numerical studies to reach two sets of $(t_1, t_2)$ goals shown in the plots. In each experiment, our optimization finds the design parameters within hundreds of iterations, resulting in nontrivial shapes that are hard to be manually designed.

Equations (21)

Equations on this page are rendered with MathJax. Learn more.

[ b L b R ] = S [ a L a R ] ,  where  S = [ R L T RL T LR R R ] .
j k 0 z e = P h  and  j k 0 z h = Q e ,
R L = R R = ( A X B A 1 X B ) 1 ( X B A 1 X A B ) ,
T LR = T RL = ( A X B A 1 X B ) 1 X ( A B A 1 B ) ,
X = e j Λ L k 0 ,
A = W 1 W 0 + V 1 V 0 ,  and  B = W 1 W 0 V 1 V 0 .
R L = R R = ( I D 1 2 ) 1 ( D 1 D 2 D 3 ) ,
T LR = T RL = ( I D 1 2 ) 1 ( D 2 D 1 D 3 ) ,
D 1 := A 1 X B = ( W 0 + T V 0 ) 1 W X W 1 ( W 0 T V 0 ) ,
D 2 := A 1 X A = ( W 0 + T V 0 ) 1 W X W 1 ( W 0 + T V 0 ) ,
D 3 := A 1 B = ( W 0 + T V 0 ) 1 ( W 0 T V 0 ) .
T = Ω Q 1 + Ω ( Q 1 ) = Ω Q 1 Ω Q 1 Q Q 1 ,
Ω Ω + Ω Ω = P Q + P Q .
Y Λ + Λ Y = W 1 ( P Q + P Q ) W .
G = [ Ω Ω 0 Ω ] , t h e n e j G L / k 0 = [ e j Ω L / k 0 ( e j Ω L / k 0 ) 0 e j Ω L / k 0 ] ,
S α S ( α + Δ α ) S ( α Δ α ) 2 Δ α .
L = | T LR ( m , m ) t m | 2 ,
L = T LR ( m x , m x ) | T LR ( m x , m x ) | t x T LR ( m y , m y ) | T LR ( m y , m y ) | t y ,
L = [ | T LR , 1 ( m , m ) | 2 A 1 2 ] 2 + [ | T LR , 2 ( m , m ) | 2 A 2 2 ] 2 .
k = ( 2 π p L x , 2 π q L y , 1 ) ,
L = | T LR ( m , n 1 ) t 1 | 2 + | T LR ( m , n 2 ) t 2 | 2 ,
Select as filters


Select Topics Cancel
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.