Tandem neural network-assisted inverse design of highly efficient diffractive slanted waveguide grating

Menglong Luo; Sang-Shin Lee

doi:10.1364/OE.514502

1. Introduction

In recent years, virtual reality (VR) devices have gained widespread attention as emerging technology products, triggering an unstoppable boom in the display market [1,2]. In comparison to the VR systems that rely on the geometric waveguide, VR apparatuses employing the diffractive waveguide gratings offer notable advantages, such as lighter structures, simplified processing technology, and reduced costs [3,4]. Diffractive grating is a pivotal component that affects the optical coupling efficiency and full-color imaging quality [5,6]. Developers need to conscientiously contemplate a multitude of structural parameters of the grating, encompassing the material, period, height, width, and tilted angle. However, fixing the optimal grating structure from a vast array of combinations of parameters is a formidable endeavor. Designers are prone to encountering challenges when manually adjusting the parameters or relying exclusively on their intuition to garner a grating structure that exhibits high efficiency. Although commercial simulation software embeds algorithms to aid designers in optimizing grating structures, these algorithms can occasionally be limited by local optima, thus hindering the acquisition of global optimal parameters. In certain instances, pursuing global optimization may be excessively time-consuming and unfruitful. Furthermore, these optimization features necessitate additional licenses that come at a considerable expense, imposing financial strain on researchers. The conventional approaches employed in the grating development typically require substantial computational resources and lengthy runtimes [7], thus falling short of the demands of efficient design. Therefore, we aim to propose an effective strategy that can automatically infer the grating parameters to meet the required performance for multiple visible beams, tackling the predicaments faced by engineers in the development of the grating. Encouragingly, the advent of deep learning methodologies facilitates the implementation of our scheme [8,9].

Several researchers count solely on multilayer neural networks to forecast the diffraction properties of the grating [10]. However, this approach inevitably necessitates numerous blind trials exploiting a large amount of grating parameter combinations and demands additional procedures to ascertain the desired grating, thus consuming considerable expenditure of time and lacking convenience. To satisfy the actual engineering scenarios, an inclination leaning toward the convenience-oriented inverse design without manual intervention is embraced in our work. The concept of inverse design, which aims to uncover the physical structural configuration of a device corresponding to the predetermined functional characteristics [11], has gained substantial traction. In the past decade, various applications of deep learning-powered inverse design techniques have proliferated in the realm of photonic structures, such as optical thin films [12,13], multimode interference power splitters [14], metasurfaces [15,16], and plasmonic metamaterials [17]. Certain inverse problems exhibit fuzziness and ill-posedness, reflecting the phenomenon of one-to-many mapping, that is, one physical attribute may correspond to multiple parameter combinations [18,19]. Certain conventional artificial neural networks face difficulties in attaining convergence in the foregoing inverse problems [20]. Among these networks, the multilayer perceptron (MLP) stands as the framework of the fully connected neural network (FCNN) that can execute forward prediction but falls short in inverse design. The invertible neural network (INN) driven by the burgeoning generative flow model has been proven to be a powerful paradigm for navigating the inverse problems characterized by the one-to-many relationships. It has been extended to various applications, such as medical imaging [25], 2D materials [26], and multimodal distributions [27]. The effectiveness of INN lies in its incorporation of latent variables, which can achieve a one-to-one correspondence between the features and labels of the model [21–24]. However, to the best of our knowledge, there is currently no research record on the inverse design of slanted waveguide grating using the INN.

In this study, we construct a tandem neural network (TNN) formed by an INN and an FCNN for the inverse design of a slanted waveguide grating device. The deduction of the grating parameters falls within the purview of the INN. The FCNN functions as an auxiliary network to forecast the coupling efficiency resulting from the inferred grating parameters. Subsequently, the predicted efficiency is compared with the efficiency designated in the INN. In cases where the disparity between the specified and predicted efficiency surpasses the predefined threshold, the optimization process will be initiated to guarantee the convergence within the given criterion. The collaboration between the two sub-networks engenders the joint optimization to enhance the coupling efficiency and predict the corresponding grating parameters.

2. Proposed diffractive slanted waveguide grating and its key structural parameters

Our study focuses on the development of an asymmetric slanted grating device, where an in-coupling grating plays a role of diffracting the incoming beam at a specific order to excite the guided mode of a waveguide, which is subsequently diffracted out of the waveguide by the corresponding out-coupling grating in an analogous manner. Compared to asymmetric slanted gratings, symmetric upright binary gratings may induce an occurrence of the beam coupling in the opposite direction of the intended path, resulting in reduced optical power in the desired direction [28]. However, asymmetric slanted gratings offer remarkable selectivity for a single diffraction mode, thereby significantly improving the coupling efficiency of the beams in a designated orientation. The proposed TNN is crafted to efficiently predict the grating structures operating at multi-wavelengths (λs) corresponding to tricolor bands, namely the red-green-blue (RGB) with λs of 630, 532, and 473 nm, respectively. This feature validates the versatility and reliability of the TNN. Broader spectral bandwidths are often implemented using materials with higher refractive indexes. Therefore, the episulfide MGC171 (Mitsubishi Gas Chemical Inc., Japan) has been chosen as a suitable material, which is an optical resin with high refractive indices ranging from 1.70 to 1.72 at λs of RGB beams [29].

The schematic of the proposed slanted waveguide grating is illustrated in Fig. 1. The collimated incident beam impinges upon the in-coupling grating with an incident angle denoted by θ, which varies between 0°–15°. The beam then undergoes the diffraction as it passes through the in-coupling grating and is coupled into the waveguide, propagating towards the positive y-axis. To prevent the beam from leaking out of the waveguide, the diffraction angle (γ) that governs the beam propagation angle should conform to the condition of total internal reflection (TIR), which can effectively confine the beam within the waveguide. The γ is influenced by the θ, λ, and grating period (P). The relationship between γ and the above parameters adheres to the well-known grating equation ${n_2}\sin \gamma = {n_1}\sin \theta + m\lambda /P$ [30], where n₁ and n₂ are the refractive indices of air and the grating device, respectively, and m is the diffraction order. To satisfy the requirement of TIR, γ should exceed the critical angle = $\sin ^{-1}\left(n_1 / n_2\right)$ and be less than π / 2. Thus, the emphasis of our investigation centers on the first-order diffraction (m = 1). As per the grating equation, the P of the in-coupling grating approximately lies within the range of 0.5λ to λ [31]. The TIR condition ceases to exist once the beam traverses the intermediate region connecting the in- and out-coupling gratings and reaches the out-coupling grating, causing the beam to exit from the waveguide into the surrounding medium of air. To ensure the consistent incident and exit angles of the beam in the air, the structures of both the in- and out-coupling gratings should remain identical, thereby supporting the same diffraction order [4,28]. In accordance with the phase-matching condition, the amount of the optical power of the incident beam that can be coupled into a guided mode is contingent upon the alignment between the y-axis component of the wavevector of the incoming beam and the propagation constant β_m of the guided mode [32]. The fulfillment of the formula ${n_1}{k_0}\sin \theta + {K} = {\beta _m}$ is a requisite for the phase-matching mechanism, where ${k_0} = 2\pi /\lambda$ is the wavevector in the vacuum, K = 2πm / P signifies the phase factor modulated by the grating, and ${{\beta}_{m}}\textrm{ = }\; {{k}_\textrm{0}}{{n}_{\textrm{eff}}}$. The n_eff is the effective refractive index of the grating, which can be stated as ${n_{eff}} = \sqrt {Cn_2^2 + (1 - C)n_1^2}$ [33–35], where C and 1 – C are the ratios of widths of the ridge and groove to P, respectively. However, since the grating is tilted and asymmetrical, the width of the ridge experiences a gradual alteration at different ridge heights (hs), thereby rendering C a varying value. C is unequivocally dictated by the grating parameters, as defined by the two inclination angles (α and β) formed by the two hypotenuses of the ridge and the normal parallel to the z-axis, respectively, fill factor (FF), total height (H), and P. The FF is the proportion of the bottom width (W_bottom) of the ridge in relation to P, specifically represented by ${FF}\; = \; {{W}_{\text{ bottom}}}/{P}$. The calculation of C abides by the mathematical function: $C = FF + (h - H)(\tan \beta - \tan \alpha )\;/P$ (0 ≤ h ≤ H, specifying that h increases along the direction from the top to the base of the ridge). Additionally, the slanted angle (φ) of the grating is defined as $\mathrm{\varphi = (\alpha + \beta )/2}$ and the top width (W_top) of the ridge = ${{W}_{\textrm{bottom}}}\; {{-}\; H\;\times \;(tan\beta }\; \mathrm{{-}\;tan\alpha )}$. Given that the in- and out-coupling gratings share the identical structure, the in- and out-coupling efficiencies are equivalent according to the reciprocity theorem [36]. To facilitate the subsequent discussion, this work focuses on the in-coupling efficiency, which is defined as the transmittance, that is, the proportion of the power coupled into the waveguide relative to the power of the incident beam.

Fig. 1. Schematic of the proposed slanted grating based on the TIR-based beam propagation in the waveguide.

Download Full Size | PDF

Considering the proposed TNN is classified under the deep learning scheme, an essential step is to obtain a substantial amount of data for its training. The grating parameters and transmittances constitute the datasets fed into the TNN. The transmittances at θs denote the 3D numerical simulation results corresponding to each set of grating parameters. The datasets employed in this work were acquired following the simulations in the transverse electromagnetic mode with the help of a rigorous coupled wave analysis-based commercial software, Rsoft DiffractMOD (Synopsys, USA). The simulations were facilitated by the built-in scanning tool, MOST Parameters. During the scanning process, five essential grating parameters, including the H, P, FF, α, and β, were defined within the upper and lower boundaries. The number of sampling steps per parameter was set to ten, generating the datasets of RGB beams with a size of 10⁵ datapoints each. The ranges of the structural parameters are listed in Table 1.

Table 1. Acquired structural parameters pertaining to the slanted grating for RGB beams

View Table | View all tables in this article

3. Operating mechanism of the proposed TNN

This work aims to develop a model that can effectively fulfill two objectives, namely optimizing the coupling efficiencies of the proposed grating in terms of the peak transmittance, average transmittance, and illuminance uniformity, and inferring the corresponding grating parameters. The architecture of the proposed TNN encompasses two integrated units: INN and FCNN, which are crafted to cope with their designated functions. The INN can accomplish the deduction of grating parameters. However, the optimization process requires the assistance of the FCNN. The execution mechanism of the TNN is depicted in Fig. 2. To commence the inverse design, the accumulated datasets of the grating are utilized to train and assess the TNN. The aforementioned efficiencies are marked as y_n (${n}\; = \; 1,\; \textrm{2,}\; \textrm{and}\; 3$) in the efficiency indicator set Y. At the outset, y_n is initialized to 100%, implying the maximum efficiency. However, the ideal set of parameters that would provide a grating with such optimal efficiency may not actually exist. Therefore, an error function and a sufficiently low threshold associated with the evaluation of each indicator need to be predetermined. The error function is formulated as ${Er}{{r}_{n}}{\; = \; |}{{y}_{n}}\; -\; {{\hat{y}}_{n}}|$, where y_n represents the efficiency value in the INN and ${\hat{y}_n}$ denotes the efficiency predicted by the FCNN. The deviation of the error function below the specified threshold value signifies that the divergence between y_n and ${\hat{y}_n}$ can be disregarded, thus setting the threshold at a negligible magnitude of 0.005. The size of the threshold depends on the precision of the trained model and designer’s tolerance level towards the error. The initially specified y_n is entered into the INN, generating a set of grating parameters identified as X. The FCNN is trained as a physical solver with the ability to precisely forecast the transmittances at θs based on the X. All transmittances constitute a set denoted by T that can be used to compute the three efficiency indicators. Specially, the equation of illuminance uniformity can be expressed as Γ$= 1\;{-}\;({{T}_{\textrm{max}}}{\; {-}\; }{{T}_{\textrm{min}}}\textrm{)\; /\; (}{{T}_{\textrm{max}}}{\; + \; }{{T}_{\textrm{min}}}\textrm{)}$, where Γ alludes to the illuminance uniformity, and T_max and T_min refer to the highest and lowest transmittances within the range of θ, respectively. If the outcome of the error function surpasses the pre-established threshold, the optimization process commences. Each iteration of the optimization is finalized through a gradient descent strategy to diminish the value of y_n. The coupling efficiency is typically expressed as a percentage [3,6,10], with calculations rounded to the first or second decimal place. To enhance the precision for determining the coupling efficiency, the designer can redefine the gradient descent value accordingly. However, it is crucial to acknowledge that reducing the gradient descent value will lengthen the optimization time due to increased number of loops in the program. Hence, when determining the criteria for the gradient descent value, the commonly used coupling efficiency target and the optimization time should be concurrently taken into consideration. Consequently, the gradient descent value for each iteration is set at a tiny value of 0.01%, which can effectively guarantee that the size of each iteration interval is adequately fine, ensuring that high efficiency values are rarely missed throughout the optimization process. The descending amount is subtracted from the current y_n to obtain the subsequent value. The iterative process will only cease until the error function reaches a value lower than the preset criteria. The termination of the optimization loop implies that the optimized y_n has discovered the actual corresponding grating parameters. Eventually, the predicted grating parameters and T can be acquired.

Fig. 2. Workflow of the TNN-driven inverse design of the slanted grating.

Download Full Size | PDF

4. Architecture of the proposed TNN

The architecture of the TNN is delineated in Fig. 3(a). The construction of the TNN involves the sequential concatenation of the INN and FCNN. The framework of the INN is illustrated in Fig. 3(b). The mapping from the input X to output Y represents the conventional forward process. Conversely, the inverse mapping entails deriving X from Y. Due to the ill-posed characteristic of the inverse problem and the fact that the intrinsic dimension of y_n which encompasses one dimension is smaller than the five dimensions of X, the loss of key information transpires during the forward propagation. To counteract the inherent information loss, the INN incorporates a set of latent output variables Z that follow the normal distribution with dimensions equal to the difference between X and Y. Thus, the INN associates X with a two-element tuple consisting of Y and Z. The association of $[{Y} ] = {G(X)}$ becomes $[{{Y,\; Z}} ] = {G(X)}$. The corresponding inverse propagation is expressed as ${[X] = }{{G}^{{{-}1}}}{(Y, Z)}$. With the assistance of Z, the congruence of information is upheld during both inverse and forward propagations, thereby converting the inverse process from one-to-many to one-to-one mapping. The core components of the INN comprise a set of ten glow coupling blocks (GCBs), which are encapsulated by the framework for invertible architectures [37]. GCB serves as a reversible block, consisting of complementary affine coupling layers, ensuring the ability to execute both forward and inverse computations. The layout of GCB is shown in Fig. 3(c). Each GCB encompasses two subnets, both of which are equipped with two hidden layers, each containing 256 hidden neurons. The input set X is split into two halves with two and three features assigned to subnet 1 and subnet 2, respectively. To improve the interactions and stability of the subnetworks, the features in the subnetworks are randomly permutated. Additionally, each GCB contains an hyperparameter clamp, set to a value of two, to promote stability in the training process and facilitate the utilization of the higher learning rate. When the computation of all GCBs ends, the two divided parts are re-merged into a unified set. The activation function and solver of the INN are selected as the rectified linear unit (ReLU) and Adam, respectively. The ReLU function can fit nonlinear patterns to boost the data fitting capabilities, and Adam can internally optimize the hyperparameters to heighten the stability and convergence of the INN. Adam is configured with a learning rate of 0.001 and a weight decay rate of 0.00001. Furthermore, to maintain a consistent and invertible training process, slight perturbations caused by Gaussian noise factors are incorporated into the INN.

Fig. 3. Configurations of the proposed TNN and its constituent subnetworks. (a) Overall architecture of the TNN encompassing the INN and FCNN. (b) Specific framework of the INN. (c) Layout of the GCB.

Download Full Size | PDF

The architecture of the proposed FCNN is depicted in Fig. 3(a). The framework of FCNN is rooted in MLP, which can be achieved using the open-source libraries such as scikit-learn and Keras. The FCNN acts as a nonlinear regression approximator to explore the correlation between X and Y. The input layer of FCNN takes X as the input features. The t_is at θs are assigned as the labels of the output layer, where i contains all integers between 0 and 15. In addition, the FCNN is constructed with five hidden layers, located between the input layer and output layer. Each hidden layer comprises 512 neurons. These hidden layers are responsible for performing nonlinear transformations and extracting features from the input data. Likewise, the FCNN is equipped with ReLU function and Adam. The initial learning rate of Adam and batch size are configured as 0.0001 and 128, respectively. To prevent the issue of overfitting, the training process can adopt an early stopping strategy that allows termination when the training loss continues to fall below a threshold of 0.0001 for ten consecutive epochs.

5. Performance evaluation and applications of the proposed TNN

5.1 Training performance of the TNN

The training was performed based on Python and deployed on a device equipped with an 11^th generation Intel Core i7-11700KF central processing unit, an NVIDIA GeForce RTX 3060 graphics processing unit, and a memory size of 64 GB. Each dataset collected for the RGB beams was of the same size, thus requiring almost identical durations of data collection and model training for each beam. Specifically, the time costs per beam for the dataset acquisition and model training were approximately 7.8 and 1.27 hours, respectively. Notably, the training of the TNN for each efficiency indicator of RGB beams incurred a runtime overhead consisting of two parts: the time required by the INN and FCNN, which amounted to 1.27 hours and 121 seconds, respectively. The fundamental principle governing the training of models is to ensure independence between the training and test datasets, thus preventing the leakage of information between the two. In this work, the test datasets employed were unequivocally distinguished from the training datasets as they were collected separately. As previously detailed in Section 2, a total of 10⁵ datapoints were obtained for each beam. These datapoints were then randomly divided into the training and test datasets according to the ratio of 80% and 20%, respectively. Consequently, the test dataset encompassed 20,000 datapoints, which were used to compute the MSE losses presented in Tables 2 and 3 below.

Table 2. Mean values of MMD and MSE of the INN for the indicator set Y on the training and test sets of the RGB beams

View Table | View all tables in this article

Table 3. Final R² scores and MSE losses of the FCNN on the training and test sets of the RGB beams

View Table | View all tables in this article

The training curves of the INN and FCNN were assessed in terms of the deviations between the actual and predicted outcomes on both the training and test sets. Two extensively used estimation metrics for the INN were adopted as the mean-squared error (MSE) loss and maximum mean discrepancy (MMD), providing insights into the training effect of the forward and inverse propagation, respectively. The MSE of the INN gauged the difference between the predicted efficiency indicators and corresponding simulation results, while the MMD quantified the disparity between the forecasted and collected grating parameters. Likewise, the metrics exploited to evaluate the FCNN were the coefficient of determination (R²) and MSE. The R² scores underpinned the extent to which the predicted transmittances aligned with the acquired data, revealing the fitting level. The Eqs. (1)–(3) represent the MMD, MSE loss, and R² score, respectively, which can be expressed as:

(1)$$MMD = {\left[ {\frac{1}{{{n^2}}}\sum\limits_{i = 1}^n {\sum\limits_{i^{\prime} = 1}^n {k({x_i},x_i^\prime )} } - \frac{2}{{nm}}\sum\limits_{i = 1}^n {\sum\limits_{j = 1}^m {k({x_i},{{\hat{x}}_j})} } + \frac{1}{{{m^2}}}\sum\limits_{j = 1}^m {\sum\limits_{j^{\prime} = 1}^m {k({{\hat{x}}_j},\hat{x}_j^\prime )} } } \right]^{\frac{1}{2}}}$$

(2)$$MSE = (\frac{1}{n}) \times \sum\limits_{i = 1}^n {({y_i} - {{\hat{y}}_i}} {)^2}$$

(3)$${R^2} = 1 - \frac{{\sum\limits_{i = 1}^n {{{({t_i} - {{\hat{t}}_i})}^2}} }}{{\sum\limits_{i = 1}^n {{{({t_i} - {{\bar{t}}_i})}^2}} }}$$

where x and $\mathrm{\hat{x}}$ are the actual and predicted grating parameters, respectively. k represents the commonly used inverse multiquadric kernel function [21], and n and m are the number of samples. Similarly, y and $\hat{y}$ represent the actual and predicted indicators, respectively. t and $\mathrm{\hat{t}}$ are the actual and predicted transmittances at each θ, respectively, and $\mathrm{\bar{t}}$ is the mean transmittance. Fig. 4 presents the training curves for the MMD and MSE of the INN for both the training and test datasets of RGB beams. Specifically, subfigures 4(a-i), 4(b-i), and 4(c-i) show the MMD of the grating parameters and MSE corresponding to the peak transmittance of each beam. Subfigures 4(a-ii), 4(b-ii), and 4(c-ii) present the results related to the average transmittance, subfigures 4(a-iii), 4(b-iii), and 4(c-iii) describe the training effect pertaining to the illuminance uniformity. All curves corresponding to the MMD and MSE exhibit an inclination to converge toward a stable state in the vicinity of zero within 100 epochs. The mean values of MMD and MSE for the three indicators on the training and test sets in relation to RGB beams are listed in Table 2. The realized MSE and MMD outcomes are approximately equal to zero, ensuring that the INN has been trained decently. Table 3 outlines the stabilized values of R² and MSE of the FCNN of the RGB beams at the 100^th epoch. The R² scores approaching one and insignificantly low MSE losses substantiate that the FCNN effectively fitted the training sets and accurately predicted unknown data.

5.2 Optimized coupling efficiencies and grating parameters inferred from the TNN

The application of the TNN is centered on optimizing the coupling efficiencies and predicting the corresponding grating parameters. As previously stated in Section 3, the grating parameters deduced by the INN relied on the optimized efficiencies. The predicted parameters were subsequently delivered to the FCNN to infer the transmittances at differing θs. The predicted grating structures were concurrently simulated to yield the computed transmittances with the aid of DiffractMOD. All transmittances of RGB beams obtained from the FCNN and simulations are comparatively elucidated in Fig. 5. The average transmittance and illuminance uniformity were calculated using the transmittance set T. The subfigures (i), (ii), and (iii) show the transmittances of RGB beams, respectively. Specifically, subfigures (i) pertain to the peak transmittance, subfigures (ii) are concerned with the average transmittance, and subfigures (iii) are in association with the illuminance uniformity. The depicted curves reveal a conspicuous trend where the disparities between the predicted and simulated transmittances are disregarded, signifying a substantial level of concurrence between the outcomes derived from the FCNN and the simulated findings. The coupling efficiencies obtained from the INN, FCNN, and 3D simulations are comprehensively compared in Table 4. The efficiency outcomes of the INN and FCNN align closely with the simulated results. The discrepancies in all efficiencies attained through the INN, FCNN, and simulations are less than 1%, providing ample evidence for the reliability of the TNN. The optimized peak transmittance and illuminance uniformity approach a commendable level of around 100% and 98%, respectively, and the average transmittance reaches ∼92%, thus confirming the superior efficiencies of the predicted gratings. The predicted grating parameters corresponding to the optimized peak transmittance, average transmittance, and illuminance uniformity are listed in Table 5. The derived parameters conform to the physical meaning and possess practical processability. Furthermore, our TNN exhibit rapid speed, as the execution time for each inverse design process consistently remains within hundreds of milliseconds to seconds.

Fig. 4. Performance evaluation of the learning curves of the INN on both the training and test sets of RGB beams. MMD and MSE pertaining to the peak transmittance of (a-i) R, (b-i) G, and (c-i) B beams; average transmittance of (a-ii) R, (b-ii) G, and (c-ii) B beams; illuminance uniformity of (a-iii) R, (b-iii) G, and (c-iii) B beams.

Download Full Size | PDF

Fig. 5. Comparison between the TNN-predicted transmittances and simulated outcomes of RGB beams within θs. The subfigures (a-i), (b-i), and (c-i) pertain to the optimized peak transmittance, while (a-ii), (b-ii), and (c-ii) are related to the average transmittance, and (a-iii), (b-iii), and (c-iii) correspond to the illuminance uniformity.

Download Full Size | PDF

Table 4. Comparison of the optimized efficiencies of RGB beams achieved via the INN, FCNN, and simulations

View Table | View all tables in this article

Table 5. TNN-predicted grating parameters corresponding to the optimized efficiencies of the RGB beams

View Table | View all tables in this article

Notwithstanding the attainment of a high degree of uniformity in illuminance, a substantial power loss was noticeable. The power losses above a level of 50% or higher are shown in Figs. 5(a-iii), 5(b-iii), and 5(c-iii). To fulfill an optimal trade-off between the average transmittance and illuminance uniformity, our strategy necessitates a modest compromise in uniformity to minimize power loss, thus effectively enhancing the overall transmittances. In the pursuit of elevating the average transmittance, a constraint was imposed that the average transmittance must exceed a pre-defined threshold. Contemplating the optimized average transmittance following the optimization process achieved ∼92%, it was rational to determine a prescribed value near this figure. After a few iterations and trials, it was discovered that when the constraint value exceeded 89%, the optimization process failed to yield outcomes. As a result, the constraint value ended up being set slightly below 89%, specifically 88%. Fig. 6 depicts an obvious enhancement in the transmittances of the RGB beams with respect to θs. The decrease in the power dissipation becomes apparent when examining the data illustrated in Figs. 6(a), 6(b), and 6(c) in comparison with Figs. 5(a-iii), 5(b-iii), and 5(c-iii), respectively. The alignment between the FCNN and simulation outcomes realizes a prominent level of congruity. Table 6 shows the middle-ground results of the average transmittance and illuminance uniformity. Despite a marginal decrease in the level of illuminance uniformity, dropping from ∼98% to ∼94%, an increase in the average transmittance is observed, rising from ∼30% to ∼88%. Hence, our scheme efficiently accomplishes the equilibrium between power loss and illuminance uniformity, enhancing the overall efficiencies of the grating.

Fig. 6. Compromised transmittances of (a) R, (b) G, and (c) B beams at θs obtained by the FCNN and simulations.

Download Full Size | PDF

Table 6. Trade-offs of average transmittance and illuminance uniformity in FCNN and simulations

View Table | View all tables in this article

Despite the need for the multitudinous collections of simulation data to train the TNN, the proposed model offers unparalleled advantages over 3D simulations. Even numerous simulations may not guarantee the attainment of desired optimization targets, but the trained TNN provides the inverse deductive capability to swiftly extrapolate the required grating structures beyond the range of the collected simulation data. However, the time spent to acquire the datasets is a one-time investment. Once the training of the TNN is completed, the designers can use the trained TNN to derive grating parameters which can satisfy any specified coupling efficiency without gathering further data.

6. Conclusion

In summary, our study presented the construction of a TNN comprising the generative flow-based INN and MLP-based FCNN. The proposed TNN demonstrated the ability to automatically optimize the coupling efficiencies of the slanted grating for RGB beams at different θs. Additionally, the inverse deduction of the grating paraments was concurrently implemented within hundreds of milliseconds to seconds. The TNN gave rise to exceptional precision, as indicated by negligible values of MMD and MSE losses, and R² scores in the vicinity of one. The discrepancy between the efficiencies obtained from the TNN and simulations was minimal, with a margin of only 1%. The optimized peak transmittance, average transmittance, and illuminance uniformity reached high degrees of 100%, 92%, and 98% or so, respectively. The proposed TNN holds promise as an inverse design methodology to lower the threshold and accelerate the efficient design of slanted waveguide gratings.

Appendix

Before establishing the numbers of GCBs, a series of tests were conducted to assess the impact of varying numbers of GCBs on the metrics for the INN. Considering the metric outcomes of the INN are relevant to the efficiency indicators pertaining to the RGB beams, the MMD and MSE values of the peak transmittance of the red beam on the training and test sets are primarily displayed herein. Table 7 presents the corresponding MMD and MSE values for 2, 4, 6, 8, and 10 GCBs. It was observed that setting the number of GCBs to 10 yielded sufficiently low MMD and MSE values. Additionally, it was noted that increasing the number of GCBs from 8 to 10 resulted in only a limited increment in MMD, indicating that evaluating the case with more GCBs was unnecessary. Notably, further increasing the number of GCBs may continue to decrease the MMD and MSE, but at the expense of longer training time. Consequently, from the perspective of the trade-off between prediction accuracy and time cost, 10 GCBs were deemed acceptable. Similarly, to construct a high-performance FCNN, 18 experiments were carried out to analyze the R² and MSE results with varying numbers of hidden layers and neurons. The corresponding R² and MSE outcomes for different numbers of hidden layers (ranging from 1 to 6) and neuron quantities (from 128 to 512) are presented in Table 8. It was observed that as the number of hidden layers and neurons were increased to 5 and 512, respectively, the R² score reached its highest value and the MSE loss dropped to the lowest. Additionally, setting the hidden layers to 5 and 6 with a neuron count of 512 resulted in consistent R² scores and MSE losses. However, with an increase in the number of hidden layers, the training time for the FCNN was inevitably extended. Therefore, this work incorporated a configuration of five hidden layers and 512 neurons in its implementation.

Table 7. MMD and MSE results of the peak transmittance of the red beam on the training and test sets when the INN is configured with different GCB numbers

View Table | View all tables in this article

Table 8. (R², MSE) on the training and test sets of the red beam for the FCNN with different hidden layers and neurons

View Table | View all tables in this article

Funding

Basic Science Research Program; National Research Foundation of Korea; Kwangwoon University; Ministry of Science and ICT, South Korea (2020R1A2C3007007); Ministry of Education (2018R1A6A1A03025242).

Acknowledgments

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by both the Ministry of Education (No. 2018R1A6A1A03025242) and Ministry of Science and ICT (2020R1A2C3007007), and the research grant of Kwangwoon University in 2023.

Disclosures

The authors declare no conflicts of interest.

Data availability

All the codes and datasets employed in this work have been uploaded to the online platform GitHub. The resources can now be accessed through [38].

References

1. J. Xiong, E. L. Hsiang, Z. He, et al., “Augmented reality and virtual reality displays: emerging technologies and future perspectives,” Light: Sci. Appl. 10(1), 216 (2021). [CrossRef]

2. T. Zhan, K. Yin, J. Xiong, et al., “Augmented reality and virtual reality displays: perspectives and challenges,” iScience 23(8), 101397 (2020). [CrossRef]

3. Z. Liu, C. Pan, Y. Pang, et al., “A full-color near-eye augmented reality display using a tilted waveguide and diffraction gratings,” Opt. Commun. 431, 45–50 (2019). [CrossRef]

4. Y. Zhang and F. Fang, “Development of planar diffractive waveguides in optical see-through head-mounted displays,” Precis. Eng. 60, 482–496 (2019). [CrossRef]

5. D. Ni, D. Cheng, Y. Liu, et al., “Uniformity improvement of two-dimensional surface relief grating waveguide display using particle swarm optimization,” Opt. Express 30(14), 24523–24543 (2022). [CrossRef]

6. G.Y. Lee, J. Hong, S. Hwang, et al., “Metasurface eyepiece for augmented reality,” Nat. Commun. 9(1), 4562 (2019). [CrossRef]

7. X. Shi, T. Qiu, J. Wang, et al., “Metasurface inverse design using machine learning approaches,” J. Phys. D: Appl. Phys. 53(27), 275105 (2020). [CrossRef]

8. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015). [CrossRef]

9. T. Zeng, Y. Zhu, and E. Y. Lam, “Deep learning for digital holography: a review,” Opt. Express 29(24), 40572–40593 (2021). [CrossRef]

10. X. Chen, D. Lin, T. Zhang, et al., “Grating waveguides by machine learning for augmented reality,” Appl. Opt. 62(11), 2924–2935 (2023). [CrossRef]

11. S. Molesky, Z. Lin, A. Y. Piggott, et al., “Inverse design in nanophotonics,” Nat. Photonics 12(11), 659–670 (2018). [CrossRef]

12. A. Luce, A. Mahdavi, H. Wankerl, et al., “Investigation of inverse design of multilayer thin-films with conditional invertible neural networks,” Mach. Learn.: Sci. Technol. 4(1), 015014 (2023). [CrossRef]

13. A. Lininger, M. Hinczewski, and G. Strangi, “General inverse design of layered thin-film materials with convolutional neural networks,” ACS Photonics 8(12), 3641–3650 (2021). [CrossRef]

14. X. Xu, Y. Li, and W. Huang, “Inverse design of the MMI power splitter by asynchronous double deep Q-learning,” Opt. Express 29(22), 35951–35964 (2021). [CrossRef]

15. Z. Li, R. Pestourie, Z. Lin, et al., “Empowering metasurfaces with inverse design: principles and applications,” ACS Photonics 9(7), 2178–2192 (2022). [CrossRef]

16. Z. Liu, D. Zhu, S. P. Rodrigues, et al., “Generative model for the inverse design of metasurfaces,” Nano Lett. 18(10), 6570–6576 (2018). [CrossRef]

17. E. Ashalley, K. Acheampong, L. V. Besteiro, et al., “Multitask deep-learning-based design of chiral plasmonic metamaterials,” Photonics Res. 8(7), 1213–1225 (2020). [CrossRef]

18. W. Ma, Z. Liu, Z. A. Kudyshev, et al., “Deep learning for the design of photonic structures,” Nat. Photonics 15(2), 77–90 (2021). [CrossRef]

19. D. Liu, Y. Tan, E. Khoram, et al., “Training deep neural networks for the inverse design of nanophotonic structures,” ACS Photonics 5(4), 1365–1369 (2018). [CrossRef]

20. P. R. Wiecha, A. Arbouet, C. Girard, et al., “Deep learning in nano-photonics: inverse design and beyond,” Photonics Res. 9(5), B182–B200 (2021). [CrossRef]

21. L. Ardizzone, J. Kruse, S. Wirkert, et al., “Analyzing inverse problems with invertible neural networks,” arXiv, arXiv:1808.04730 (2015). [CrossRef]

22. L. Dinh, D. Krueger, and Y. Bengio, “Nice: Non-linear independent components estimation,” arXiv, arXiv:1410.8516 (2014). [CrossRef]

23. L. Dinh, J. S. Dickstein, and S. Bengio, “Density estimation using Real NVP,” arXiv, arXiv:1605.08803 (2017). [CrossRef]

24. D. P. Kingma and P. Dhariwal, “Glow: Generative flow with invertible 1 × 1 convolution,” Adv. Neural Inf. Process. Syst.31, (2018).

25. A. Denker, M. Schmidt, J. Leuschner, et al., “Conditional invertible neural networks for medical imaging,” J. Imaging 7(11), 243 (2021). [CrossRef]

26. V. Fung, J. Zhang, G. Hu, et al., “Inverse design of two-dimensional materials with invertible neural networks,” npj Comput. Mater. 7(1), 200 (2021). [CrossRef]

27. M. Frising, J. B. Abad, and F. Prins, “Tackling multimodal device distributions in inverse photonic design using invertible neural networks,” Mach. Learn.: Sci. Technol. 4(2), 02LT02 (2023). [CrossRef]

28. T. Levola and V. Aaltonen, “Near-to-eye display with diffractive exit pupil expander having chevron design,” J. Soc. Inf. Disp. 16(8), 857–862 (2008). [CrossRef]

29. T. Levola, “Diffractive optics for virtual reality displays,” J. Soc. Inf. Disp. 14(5), 467–475 (2007). [CrossRef]

30. D. Gu, C. Liang, L. Sun, et al., “Optical metasurfaces for waveguide couplers with uniform efficiencies at RGB wavelengths,” Opt. Express 29(18), 29149–29164 (2021). [CrossRef]

31. T. Levola and P. Laakkonen, “Replicated slanted gratings with a high refractive index material for in and outcoupling of light,” Opt. Express 15(5), 2067–2074 (2007). [CrossRef]

32. B. E. Saleh and M. C. Teich, Fundamentals of photonics, 3^rd ed. (Wiley, 2019).

33. R. Marchetti, C. Lacava, L. Carroll, et al., “Coupling strategies for silicon photonics integrated chips,” Photonics Res. 7(2), 201–239 (2019). [CrossRef]

34. C. Ye and D. Dai, “Ultra-compact broadband 2 × 2 3 dB power splitter using a subwavelength-grating-assisted asymmetric directional coupler,” J. Lightwave Technol. 38(8), 2370–2375 (2020). [CrossRef]

35. J. Yang, Z. Zhou, H. Jia, et al., “High-performance and compact binary blazed grating coupler based on an asymmetric subgrating structure and vertical coupling,” Opt. Lett. 36(14), 2614–2617 (2011). [CrossRef]

36. Y. Tamura and J. Nakayama, “Symmetries of scattering factors and diffraction efficiencies in grating theory,” J. Opt. Soc. Am. 35(8), 1306–1314 (2018). [CrossRef]

37. L. Ardizzone, T. Bungert, F. Draxler, et al., “Framework for Easily Invertible Architectures (FrEIA),” https://github.com/vislearn/FrEIA (2018).

38. M. Luo and S. S. Lee, “Code for PRL-LML/TNN-assisted-inverse-design-of-slanted-waveguide-grating,” GitHub (2021), https://github.com/PRL-LML/TNN-assisted-inverse-design-of-slanted-waveguide-grating.

	H [µm]	P [µm]			FF	α [°]	β [°]
Values		R	G	B
Minimum	0.350	0.320	0.270	0.240	0.100	0	0
Maximum	0.550	0.630	0.530	0.470	0.900	30	30
Interval	0.022	0.034	0.029	0.026	0.089	3.333	3.333

Incident beams	MMD		MSE losses
Incident beams	Training set	Test set	Training set	Test set
R	0.63%	0.65%	1.35%	1.50%
G	0.61%	0.63%	1.72%	1.90%
B	0.62%	0.62%	2.15%	2.40%

Incident beams	R² scores		MSE losses
Incident beams	Training set	Test set	Training set	Test set
R	99.87%	99.84%	0.13%	0.17%
G	99.78%	99.73%	0.22%	0.27%
B	99.69%	99.61%	0.31%	0.38%

Incident beams	Peak transmittance			Average transmittance			Illuminance uniformity
Incident beams	INN	FCNN	Sim.	INN	FCNN	Sim.	INN	FCNN	Sim.
R	99.85%	99.53%	99.05%	92.14%	91.77%	91.36%	96.44%	96.48%	96.60%
G	99.46%	99.30%	98.46%	90.60%	90.56%	90.67%	98.27%	97.78%	98.73%
B	98.74%	98.80%	98.86%	91.80%	92.15%	91.51%	97.80%	97.40%	98.41%

Parameters	Peak transmittance			Average transmittance			Illuminance uniformity
Parameters	R	G	B	R	G	B	R	G	B
H [µm]	0.451	0.381	0.507	0.522	0.562	0.550	0.410	0.332	0.428
P [µm]	0.488	0.412	0.366	0.582	0.533	0.458	0.613	0.529	0.471
FF	0.544	0.548	0.544	0.445	0.487	0.761	0.466	0.494	0.728
α [°]	8.50	6.69	10.96	17.52	9.62	9.58	30.67	31.67	25.54
β [°]	27.48	28.23	30.51	26.30	30.02	30.00	17.42	19.38	17.98

Tandem neural network-assisted inverse design of highly efficient diffractive slanted waveguide grating

Abstract

1. Introduction

2. Proposed diffractive slanted waveguide grating and its key structural parameters

3. Operating mechanism of the proposed TNN

4. Architecture of the proposed TNN

5. Performance evaluation and applications of the proposed TNN

5.1 Training performance of the TNN

5.2 Optimized coupling efficiencies and grating parameters inferred from the TNN

6. Conclusion

Appendix

Funding

Acknowledgments

Disclosures

Data availability

References

Data availability

Cited By

Figures (6)

Tables (8)

Equations (3)

Optics Express

Number of GCBs	MMD		MSE
Number of GCBs	Training set	Test set	Training set	Test set
2	1.15%	1.15%	3.59%	3.75%
4	0.71%	0.72%	1.67%	1.79%
6	0.68%	0.68%	1.24%	1.33%
8	0.60%	0.61%	0.86%	0.96%
10	0.59%	0.60%	0.59%	0.64%

Hidden layers	Neurons = 128		Neurons = 256		Neurons = 512
Hidden layers	Training set	Test set	Training set	Test set	Training set	Test set
1	(91.99%, 8.00%)	(91.91%, 8.20%)	(93.40%, 6.60%)	(93.33%, 6.77%)	(94.59%, 5.41%)	(94.50%, 5.58%)
2	(98.26%, 1.74%)	(98.14%, 1.88%)	(98.92%, 1.08%)	(98.81%, 1.20%)	(99.22%, 0.78%)	(99.12%, 0.89%)
3	(99.16%, 0.84%)	(99.07%, 0.94%)	(99.59%, 0.41%)	(99.52%, 0.49%)	(99.76%, 0.24%)	(99.70%, 0.30%)
4	(99.48%, 0.51%)	(99.42%, 0.59%)	(99.76%, 0.24%)	(99.71%, 0.29%)	(99.85%, 0.15%)	(99.81%, 0.19%)
5	(99.54%, 0.46%)	(99.48%, 0.53%)	(99.83%, 0.17%)	(99.78%, 0.23%)	(99.87%, 0.13%)	(99.84%, 0.17%)
6	(99.67%, 0.33%)	(99.61%, 0.39%)	(99.85%, 0.15%)	99.81%, 0.19%)	(99.87%, 0.13%)	(99.84%, 0.17%)

Incident beams	Average transmittance		Illuminance uniformity
Incident beams	FCNN	Sim.	FCNN	Sim.
R	88.25%	88.55%	94.56%	94.29%
G	88.34%	88.95%	95.05%	94.08%
B	88.71%	87.90%	94.89%	95.50%

Incident beams	Average transmittance		Illuminance uniformity
Incident beams	FCNN	Sim.	FCNN	Sim.
R	88.25%	88.55%	94.56%	94.29%
G	88.34%	88.95%	95.05%	94.08%
B	88.71%	87.90%	94.89%	95.50%