
Deep learning based reconstruction of directional coupler geometry from electromagnetic near-field distribution


Abstract

We demonstrate a method to retrieve the geometry of physically inaccessible coupled waveguide systems based solely on the measured distribution of the optical intensity. Inspired by recent advancements in computer vision, and by leveraging the image-to-image translation capabilities of conditional generative adversarial neural networks (cGANs), our method successfully predicts the arbitrary geometry of waveguide systems with segments of varying widths. As a benchmark, we show that our neural network outperforms nearest neighbor regression in both runtime and accuracy.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Integrated photonic circuits with increasing density and interaction complexity facilitate a wide variety of devices, such as on-chip modulators [1,2], photodetectors [3], and frequency combs [4]. These are at the heart of recent advances in classical and quantum photonics-based applications such as nano-optical data communication, quantum information processing, sensing, and LIDAR. However, such devices generally entail a lengthy fabrication process that is systematically interrupted for critical-dimension monitoring. Moreover, such devices are typically buried beneath an oxide layer that provides both a homogeneous surrounding for the waveguides and protection from the environment. Precisely recovering a device’s geometry once it has been fabricated is therefore a major challenge. The directional coupler seen in Fig. 1(a), a photonic building block composed of two interacting waveguides, is a typical example of such a device.


Fig. 1. Illustration of the pipeline for reconstructing buried waveguide geometry from the electromagnetic near-field distribution. (a) Input light of wavelength $\lambda$ is injected into a waveguide with variable geometry $w(z)$, coupled at a distance $g(z)$ from the centerline of another one. The light intensity transfers between the two in a manner that depends on the waveguides’ geometry and distance. The inverse problem of discovering the geometry from the measured light intensity inspired this work. (b) A training set is generated and fed to the Pix2Pix GAN. Each training pair consists of an original waveguide geometry and refractive index map together with a two-channel image of the EM field distribution, calculated via numerical methods at $\lambda =1.31$ and $1.55\mu m$ for $Si$-in-$SiO_{2}$ waveguides situated at an arbitrary distance, with a total length of $200\mu m$. A test set is input to the GAN, and the network outputs a reconstructed waveguide refractive index map (corresponding to the geometry), which is used to regenerate the two-channel image of the EM field distribution. The results are validated by comparing the EM field distribution and geometry estimation to the test set.


Light of wavelength $\lambda$ injected into one waveguide will be transferred after a certain length to the adjacent one, owing to the evanescent coupling between the two waveguides, which depends on their geometry and distance [5]. The dynamics of the energy transfer between these coupled waveguides follows the physics of an SU(2) system and is described by a 2x2 Hamiltonian [5]. It follows that for coupled waveguides whose geometry changes along the light propagation axis, the coefficients of the propagator become length dependent, just as in a time-dependent two-level atomic system or in nuclear magnetic resonance. Waveguide configurations of this kind have recently been shown to allow for dynamics robust to fabrication errors [6]. Such minute changes of the waveguides’ geometry can be challenging to monitor, especially in the conditions mentioned above, where the waveguides are not directly accessible for inspection. Nevertheless, the evanescent field penetrating through the oxide layer gives a glimpse of the electromagnetic (EM) field distribution of the system and can be retrieved using a near-field scanning optical microscope [7]. Linking this EM distribution to the geometry underneath is a prohibitively time-consuming task, in which specialized software and numerical packages, such as COMSOL Multiphysics [8] and Lumerical [9], run in an iterative optimization process to obtain convergence between the measured and calculated EM distributions.

In this context, deep learning (DL) algorithms have recently shown great promise as a candidate to solve the above-mentioned inverse problem [10–17]. By employing neural networks with many layers interleaved with non-linear transformations, DL has established itself as a powerful and versatile computational technique. This remarkable abstraction of data has been applied to cutting-edge technologies in computer vision, natural language processing, and more [18]. Furthermore, DL is emerging as an approach for research in fields other than computer science, such as microscopy [19], condensed matter [20], design of plasmonic nanostructures [11], metasurfaces [21,22] and metamaterials [23], inverse design of plasmonic waveguides [24], and classification of photonic mode field distributions [25].

In this paper, we introduce a DL approach to address the challenge of inferring the buried waveguide geometry from the EM field distribution. To this end, we employ Pix2Pix [26], a DL generative adversarial neural network (GAN) [27,28]. Namely, we formulate the problem as an image-to-image translation, where the input is a map of the spatial distribution of the EM field intensity measured above the oxide layer. The output is a spatial map of the structures’ dielectric constant. This enables accurate retrieval of the geometry of the buried waveguide structures.

We show that our method is more accurate than interpolation performed using the well-known k-nearest neighbors (kNN) algorithm [29] and results in EM field distributions that better resemble the input. Moreover, as our method learns the underlying physics of the problem at hand, it generalizes to instances excluded from the neural network’s training set. Our algorithm provides a basis for a wide range of fast predictive characterization and design of nanophotonic systems. It opens a path toward post-fabrication validation and quality assessment of physical devices that have so far been out of reach.

1.1 Theoretical model

We focus on an evanescently coupled optical waveguide system, shown schematically in Fig. 1(a). Simulating the resulting EM field of a given coupled waveguide system geometry is a straightforward task within the framework of the coupled mode equations [5]. This can be done by fully solving the three-dimensional Maxwell’s equations using widely available commercial Finite Element Method (FEM) tools. As this is a time-consuming calculation, we will later introduce the eigenmode expansion (EME) method that we used to create our training sets. However, the inverse task of inferring the waveguide system geometry from the EM field distribution is a lengthy, iterative, and at times inaccurate process.

1.2 Methods

In our work, we employ Pix2Pix to regenerate the coupled waveguide system’s geometry from its EM field distribution. Image-to-image translation has been defined as the task of translating one possible representation of a scene to another. The Pix2Pix design achieves a general-purpose model that has been shown to perform well in a variety of image-to-image translation tasks. These include processes such as synthesizing photos from label maps, reconstructing objects from edge maps [26], and organ segmentation [30]. In addition, the model has been extended to perform implicit domain adaptation [31] and other tasks. The task of reconstructing a coupled waveguide geometry from its EM field distribution can be formulated as an image-to-image translation problem, making Pix2Pix an excellent candidate.


Fig. 2. Simulated (top) vs. reconstructed (middle) waveguide geometry output from the Pix2Pix network, and their matching two-channel EM field distributions at $\lambda =1.31$ and $1.55 \mu m$, with their matching Bloch sphere trajectories (bottom). Simulated fields (red trajectory) are nearly indistinguishable from their reconstructed counterparts (grey trajectory). The network is able to retrieve geometries with a diverse number of segments, without a priori information.


1.3 Dataset generation

In order to efficiently generate large training datasets from the waveguides’ geometries, an eigenmode expansion (EME) method was employed via Lumerical MODE [9]. EME allows for a quick calculation of the propagation of an arbitrary input EM field in a given geometry, assuming that the variation of the said geometry is significant only in the direction of propagation ($\hat {z}$ in Fig. 1(a)). Maxwell’s equations are solved in a two-dimensional cross section of the coupled waveguide system at a point $z'$ to obtain the eigenmodes of the input field; each eigenmode is then propagated with a phase set by its propagation constant:

$$E(x,y,z) = \sum_{j} c_{j}(z')A_{j}(x,y)e^{i\beta_{j}z}$$
where the weights are
$$c_{j}(z') = \int\int A_{j}^{*}(x,y)E(x,y,z=z')dxdy$$
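For concreteness, the expansion and overlap integral above map directly onto a few lines of NumPy. This is a minimal sketch, assuming the mode profiles $A_{j}(x,y)$ and propagation constants $\beta_{j}$ have already been obtained from a mode solver; all array names and shapes are illustrative, not part of the actual Lumerical workflow.

```python
import numpy as np

def eme_propagate(E_in, modes, betas, z, dx, dy):
    """Propagate an input field E_in(x, y) a distance z by eigenmode
    expansion. `modes` has shape (J, Nx, Ny) and holds the eigenmode
    profiles A_j(x, y) at the cross section z'; `betas` holds their
    propagation constants beta_j. Modes are assumed orthonormal under
    the overlap integral below."""
    # c_j(z') = \int\int A_j^*(x, y) E(x, y, z = z') dx dy
    c = np.einsum('jxy,xy->j', modes.conj(), E_in) * dx * dy
    # E(x, y, z) = sum_j c_j(z') A_j(x, y) exp(i beta_j z)
    phases = np.exp(1j * betas * z)
    return np.einsum('j,jxy->xy', c * phases, modes)
```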

In this work, we considered two different families of coupled waveguide geometries. The first was composed of waveguides with a discrete number of segments of varying widths. The second included waveguides with discrete segments of tapered widths, where each waveguide’s thickness followed a Gaussian random walk. Each dataset contained samples that varied in the number of segments, in the distance between the two waveguides, and in the thicknesses that defined each waveguide’s geometry. Furthermore, the input light considered in the simulations had two different wavelengths, $\lambda = 1.31$ and $1.55\mu m$. Each sample was therefore a waveguide geometry paired with a two-channel image of the two corresponding $|E_{\lambda }(x,y)|^{2}$ distributions, which was fed to the network, where the two channels correspond to the two wavelengths. The output of the network was a spatial map $n(x,y)$ of the dielectric constant that corresponds to the waveguide geometry. The regenerated output waveguide geometry was again fed to the EME simulation to reconstruct the two-channel images of the $|E(x,y)|^{2}$ distributions for comparison.
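As an illustration of how the two geometry families might be sampled, consider the sketch below. The segment counts and separations follow the ranges reported in Sec. 1.5; the width bounds and random-walk step size are assumed values introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_discrete_widths(n_segments, w_min=0.35, w_max=0.55):
    """Family 1: piecewise-constant segment widths (in um), drawn
    uniformly. The width bounds are illustrative assumptions."""
    return rng.uniform(w_min, w_max, size=n_segments)

def sample_tapered_widths(n_segments, w0=0.45, step_sigma=0.02):
    """Family 2: tapered segments whose widths follow a Gaussian
    random walk; w0 and step_sigma are assumed values."""
    return w0 + np.cumsum(rng.normal(0.0, step_sigma, size=n_segments))

n_seg = int(rng.integers(3, 13))   # 3 to 12 segments (Sec. 1.5)
gap = rng.uniform(0.6, 1.0)        # separation of 600-1000 nm (Sec. 1.5)
widths = sample_discrete_widths(n_seg)
```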

1.4 Proposed model

A GAN [27] is a deep-learning-based model in which two neural networks, a generator $G$ and a discriminator $D$, are trained simultaneously. $D$ is trained to classify samples as belonging or not to the statistical distribution $p_{y}$ of the training dataset, while $G$ is not directly exposed to the training dataset. Rather, it is trained to map a random vector $z$ to a generated sample $G(z)$ that the discriminator classifies as belonging to the original distribution $p_{y}$. The two networks respectively aim to minimize ($G$) and maximize ($D$) the following function:

$$\min_{G}\max_{D}\left[ \mathbb{E}_y [\log \left(D(y)\right)] + \mathbb{E}_z [\log \left( 1 - D(G(z))\right)]\right]$$
where $y$ represents a random variable from the distribution $p_{y}$, and $z$ is a random vector that is input to the GAN.
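In code, this minimax game is typically implemented as two per-network losses. A minimal PyTorch sketch, assuming $D$ ends in a sigmoid so its output is a probability; the function name and signatures are illustrative:

```python
import torch

def gan_step_losses(D, G, y_real, z):
    """Per-network losses equivalent to the minimax objective above.
    D is assumed to output values in (0, 1); shapes are problem-specific."""
    y_fake = G(z)
    # D maximizes E[log D(y)] + E[log(1 - D(G(z)))]; we minimize the negative
    d_loss = -(torch.log(D(y_real)).mean()
               + torch.log(1.0 - D(y_fake.detach())).mean())
    # G minimizes E[log(1 - D(G(z)))]
    g_loss = torch.log(1.0 - D(y_fake)).mean()
    return d_loss, g_loss
```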

A cGAN [28] is a variation of the GAN principle in which an additional input, called a condition $c$, is combined with the input of $G$ and $D$. In the case of our study, $c$ is the EM field distribution. Thus, the cGAN searches for the specific geometry that has this distribution, and not for a generic realistic geometry of a directional coupler. The optimization function then becomes:

$$\min_{G}\max_{D}\left[ \mathbb{E}_{y,c} [\log \left(D(y,c)\right)] + \mathbb{E}_{z,c} [\log \left( 1 - D(G(z,c),c)\right)]\right]$$
The proposed model, Pix2Pix [26], is a cGAN whose use case is image-to-image translation for paired datasets. Specifically, it is useful for training datasets where the pairing between an input image $i$ and its translated output $f(i)$ is known for all samples. Its architecture differs from a regular cGAN in that there is no random vector $z$ mapping. The network is still provided noise in the form of dropout, both at training and testing time. This strategy achieves only minor stochasticity in the output of the nets [26]. The condition $c$ is the input image to translate, while $G(c)$ is the output image and $y$ is the ground truth. In addition to the GAN loss function, Pix2Pix utilizes an $L_1$ loss function [32], defined as $\sum _{i=1}^{n}|y_{true}-y_{predicted}|$. This loss is applied between $y$ and $G(c)$ with a weight $\lambda$, so the final objective is:
$$\begin{aligned}G^*=\arg\min_G \max_D \mathbb{E}_{c,y} [\log D(y,c)]+ \\ \mathbb{E}_{c}[\log (1-D(G(c),c))] + \lambda \mathbb{E}_{c,y}[\Vert y-G(c)\Vert_1] \end{aligned}$$
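A sketch of this final objective in PyTorch, under the assumption that $D$ takes the (image, condition) pair and ends in a sigmoid; the weight used in this work is reported in Sec. 1.5:

```python
import torch

def pix2pix_losses(D, G, c, y, lam=1000.0):
    """Generator/discriminator losses for the objective above.
    c: input EM-intensity image (condition); y: ground-truth index map.
    lam weighs the L1 term (Sec. 1.5 reports lam = 1000)."""
    y_fake = G(c)                  # no explicit z; noise enters via dropout
    d_loss = -(torch.log(D(y, c)).mean()
               + torch.log(1.0 - D(y_fake.detach(), c)).mean())
    g_adv = torch.log(1.0 - D(y_fake, c)).mean()
    g_l1 = torch.abs(y - y_fake).mean()
    return d_loss, g_adv + lam * g_l1
```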


Fig. 3. Mean Squared Error (MSE) in logarithmic scale as a measure of accuracy, for (a) $n(x,y)$ and (b) $|E(x,y)|^{2}$ of the proposed model, with and without an additional $L_{1}$ loss function, vs. kNN regression for different sizes of the training datasets. Here the dataset consists of coupled waveguides set at a fixed distance with 12 variable-width segments.


The architecture of $G$ in Pix2Pix is an encoder-decoder with skip connections (U-Net) [33], while $D$ is a PatchGAN convolutional network [26]. It decides for each $70\times 70$ patch of the image whether it belongs to the desired distribution, then averages the outputs. It is argued that while the $L_1$ loss optimizes the low-frequency features of the output images, the PatchGAN is calibrated to optimize the high-frequency features.
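Such a discriminator can be sketched as a small convolutional stack whose output grid holds one real/fake score per roughly $70\times 70$ receptive field. The layer configuration below follows the commonly published Pix2Pix PatchGAN [26]; the channel widths are an assumption, not a configuration confirmed by this paper.

```python
import torch.nn as nn

def patchgan_70x70(in_ch):
    """PatchGAN-style discriminator: outputs a grid of scores in (0, 1),
    one per ~70x70 receptive field, to be averaged by the caller."""
    def block(ci, co, stride):
        return [nn.Conv2d(ci, co, 4, stride, 1),
                nn.BatchNorm2d(co),
                nn.LeakyReLU(0.2)]
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, 2, 1), nn.LeakyReLU(0.2),
        *block(64, 128, 2),
        *block(128, 256, 2),
        *block(256, 512, 1),
        nn.Conv2d(512, 1, 4, 1, 1),
        nn.Sigmoid())
```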

1.5 Results

For this work, two different datasets of 6400 sample pairs of waveguide geometries with matching two-channel EM field distribution maps were created. The first set of waveguides has between 3 and 12 discrete segments of varying widths, with the waveguides separated by a distance between 600 nm and 1000 nm. The second set instead has tapered segments, following a Gaussian random walk. A third dataset of 6400 sample pairs spans a smaller variation of distances and thicknesses. In all the sample sets the waveguides have a symmetry axis along $\hat {z}$, with a total length of $200 \mu m$.

The sample sets were divided into training (6000), validation (300), and testing (100) datasets. Each training was performed for up to 20 epochs (more epochs resulted in overfitting), and testing was performed on the checkpoint with the best validation performance among those obtained at the end of each epoch. The weight $\lambda$ in the final objective was kept constant at the value 1000.
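This checkpoint-selection protocol amounts to standard early stopping. A sketch, where `train_one_epoch` and `evaluate_mse` are hypothetical helpers standing in for the training and validation loops:

```python
import copy

def train_with_checkpoint_selection(G, D, train_loader, val_loader,
                                    max_epochs=20):
    """Keep the generator state with the best validation MSE, as
    described above; helpers below are hypothetical placeholders."""
    best_val, best_state = float('inf'), None
    for epoch in range(max_epochs):       # more epochs overfit (see text)
        train_one_epoch(G, D, train_loader)
        val_mse = evaluate_mse(G, val_loader)
        if val_mse < best_val:
            best_val = val_mse
            best_state = copy.deepcopy(G.state_dict())
    G.load_state_dict(best_state)         # test on the best checkpoint
    return best_val
```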

In Fig. 2, we demonstrate a prediction of Pix2Pix on a sample with 5 segments, when the model was trained on a dataset with 12 segments. The model was able to accurately output the original geometry even when the sample was not explicitly contained in the training distribution. The pixel resolution of the produced images is $12nm$ in the $y$ direction (i.e., across the horizontal distance between the coupled waveguides) and $781nm$ along the propagation axis $z$. Therefore, one can immediately translate pixels to units of geometric distance within the limit of resolution.

To validate the final output of the method we investigated two different measures. One was a mean square error (MSE) loss on the predicted dielectric constant maps $n_{pred}(x,y)$, which provides a measure of the accuracy of the geometry retrieval. This measure is strongly correlated to the $L_1$ loss on which the generator was trained. The other is an MSE loss on the electric field intensity $|E(x,y)|^{2}_{pred}$ obtained by performing a new EME simulation on $n_{pred}$. The rationale behind evaluating the model on this measure was to quantify the ability of the model to generate meaningful physical devices whose performance resembles the input EM field intensity. Due to the nature of the problem, this is not accurately modeled by the MSE on $n_{pred}(x,y)$ alone.
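The two measures reduce to a pair of MSE evaluations; a sketch, where `n_pred`, `n_true`, `intensity_true`, and the `eme_intensity` wrapper around an EME solver (such as Lumerical MODE) are hypothetical names:

```python
import numpy as np

def mse(a, b):
    return np.mean((a - b) ** 2)

# Measure 1: geometry accuracy on the predicted dielectric constant map
geometry_mse = mse(n_pred, n_true)

# Measure 2: physical fidelity -- re-simulate the predicted geometry
# with EME and compare intensity maps; eme_intensity is a hypothetical
# wrapper around the mode solver
field_mse = mse(eme_intensity(n_pred), intensity_true)
```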


Fig. 4. Learning capabilities of our proposed model on examples well outside the training dataset. The new example is of waveguides designed with a single segment of different set widths. Simulated (top) and reconstructed (middle) geometry output from the Pix2Pix network and their EM field distributions for wavelengths $\lambda = 1.31$ and $1.55 \mu m$. The geometry reconstructed by our neural network closely resembles its simulated counterpart, while kNN (bottom) was unable to output adequate physical results.


In order to have a baseline for the quality of our model’s solution to the task, we compared our results to a cGAN without an $L_{1}$ loss and to those obtained by the well-known $k$-nearest-neighbors (kNN) regression algorithm [29], which consists of returning the geometry of the sample in the training set whose simulated EM field distribution best matches the one given as input. We combined the training and validation sets for this task, and used $k=1$, since higher values of $k$ did not significantly improve performance. Moreover, averaging different geometries has a high risk of producing unphysical results. If our model were outperformed by the kNN algorithm, it would indicate that the cGAN was not capable of finding a better interpolation than a very simple algorithm.
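The $k=1$ baseline can be expressed in a few lines with scikit-learn, under the assumption that the field maps and index maps are flattened into arrays; `X_train`, `G_train`, and `X_test` are illustrative names:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# X_train: flattened two-channel |E|^2 maps; G_train: matching index maps
knn = NearestNeighbors(n_neighbors=1).fit(X_train)
_, idx = knn.kneighbors(X_test)
G_pred = G_train[idx[:, 0]]   # return the geometry of the closest field map
```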

In Table 1 we report the performance of our proposed approach on the test datasets. Our model greatly outperforms the kNN baseline in both measures of validation. The accuracy of our model improves with the increasing size of the training dataset, as shown in Fig. 3.


Table 1. Benchmarks for Pix2Pix (P2P) trained on different datasets: MSE for $n(x,y)$ and $|E|^2(x,y)$. $T$ represents a tapered dataset (distance range: 600–1000 nm), $D$ is discrete, while $T_m$ is a tapered dataset (distance range: 600–800 nm).

Moreover, in order to demonstrate generalization and to further challenge the model, we fed the network an input dramatically outside the training distribution. One such control input was an EM field distribution of two almost-uncoupled waveguides, each with a single segment of a different width along the propagation axis. These widths were outside the distribution of widths featured in the training set, as was the resulting EM field distribution. The results shown in Fig. 4 are the simulated geometry and two-channel field distributions (top) and their reconstructions from the GAN (middle). As the images are nearly indistinguishable, it is apparent that the model was able to sufficiently capture the geometry and physics of the waveguide structures. As seen in the bottom portion of Fig. 4, the proposed model’s ability to learn extends far beyond that of kNN, whose predictions were inadequate both in terms of the geometry and in terms of the physical results.

2. Conclusion

We introduce a new deep-learning-based method to retrieve the geometry of a coupled waveguide system based on the near-field EM distribution map. We find that this method succeeds in geometry retrieval with much greater accuracy than a kNN interpolation trained on the same dataset. Moreover, the proposed method succeeds in outputting results beyond the training dataset, a hallmark of knowledge acquisition in DL. This work provides solid ground for further investigation of more complex and sophisticated physical systems, such as linear propagation in waveguide arrays with variable geometries, nonlinear optical interactions in waveguides, and more. This novel method provides new prospects for post-fabrication validation of classical and quantum integrated photonic devices.

Funding

European Research Council (725974); PAZY Foundation.

Acknowledgments

We acknowledge ISF grant 1433/15 and a PAZY young scientist grant. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant ERC CoG 725974).

Disclosures

The authors declare no conflicts of interest.

References

1. M. Lipson, “Compact electro-optic modulators on a silicon chip,” IEEE J. Sel. Top. Quantum Electron. 12(6), 1520–1526 (2006). [CrossRef]  

2. C. Wang, M. Zhang, X. Chen, M. Bertrand, A. Shams-Ansari, S. Chandrasekhar, P. Winzer, and M. Lončar, “Integrated lithium niobate electro-optic modulators operating at cmos-compatible voltages,” Nature 562(7725), 101–104 (2018). [CrossRef]  

3. J. F. Gonzalez Marin, D. Unuchek, K. Watanabe, T. Taniguchi, and A. Kis, “Mos2 photodetectors integrated with photonic circuits,” npj 2D Mater. Appl. 3(1), 14 (2019). [CrossRef]  

4. A. L. Gaeta, M. Lipson, and T. J. Kippenberg, “Photonic-chip-based frequency combs,” Nat. Photonics 13(3), 158–169 (2019). [CrossRef]  

5. A. Yariv, “Coupled-mode theory for guided-wave optics,” IEEE J. Quantum Electron. 9(9), 919–933 (1973). [CrossRef]  

6. E. Kyoseva, H. Greener, and H. Suchowski, “Detuning-modulated composite pulses for high-fidelity robust quantum control,” Phys. Rev. A 100(3), 032333 (2019). [CrossRef]  

7. B. Knoll and F. Keilmann, “Enhanced dielectric contrast in scattering-type scanning near-field optical microscopy,” Opt. Commun. 182(4-6), 321–328 (2000). [CrossRef]  

8. COMSOL Multiphysics® v. 5.2 (COMSOL AB, Stockholm, Sweden), https://www.comsol.com/support/knowledgebase/1223/.

9. Lumerical Inc., https://www.lumerical.com/products/.

10. G. M. Sacha and P. Varona, “Artificial intelligence in nanotechnology,” Nanotechnology 24(45), 452002 (2013). [CrossRef]  

11. I. Malkiel, M. Mrejen, A. Nagler, U. Arieli, L. Wolf, and H. Suchowski, “Plasmonic nanostructure design and characterization via Deep Learning,” Light: Sci. Appl. 7(1), 60 (2018). [CrossRef]  

12. D. Macías, P.-M. Adam, V. Ruíz-Cortés, R. Rodríguez-Oliveros, and J. A. Sánchez-Gil, “Heuristic optimization for the design of plasmonic nanowires with specific resonant and scattering properties,” Opt. Express 20(12), 13146–13163 (2012). [CrossRef]  

13. P. Ginzburg, N. Berkovitch, A. Nevet, I. Shor, and M. Orenstein, “Resonances on-demand for plasmonic nano-particles,” Nano Lett. 11(6), 2329–2333 (2011). [CrossRef]  

14. C. Forestiere, M. Donelli, G. F. Walsh, E. Zeni, G. Miano, and L. D. Negro, “Particle-swarm optimization of broadband nanoplasmonic arrays,” Opt. Lett. 35(2), 133–135 (2010). [CrossRef]  

15. C. Forestiere, A. J. Pasquale, A. Capretti, G. Miano, A. Tamburrino, S. Y. Lee, B. M. Reinhard, and L. Dal Negro, “Genetically engineered plasmonic nanoarrays,” Nano Lett. 12(4), 2037–2044 (2012). [CrossRef]  

16. T. Feichtner, O. Selig, M. Kiunke, and B. Hecht, “Evolutionary optimization of optical antennas,” Phys. Rev. Lett. 109(12), 127701 (2012). [CrossRef]  

17. C. Forestiere, Y. He, R. Wang, R. M. Kirby, and L. Dal Negro, “Inverse design of metal nanoparticles’ morphology,” ACS Photonics 3(1), 68–78 (2016). [CrossRef]

18. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, 2016). http://www.deeplearningbook.org.

19. Y. Wu, Y. Rivenson, H. Wang, Y. Luo, E. Ben-David, L. A. Bentolila, C. Pritz, and A. Ozcan, “Three-dimensional virtual refocusing of fluorescence microscopy images using deep learning,” Nat. Methods 16(12), 1323–1331 (2019). [CrossRef]  

20. J. Carrasquilla and R. G. Melko, “Machine learning phases of matter,” Nat. Phys. 13(5), 431–434 (2017). [CrossRef]  

21. J. Jiang and J. A. Fan, “Simulator-based training of generative neural networks for the inverse design of metasurfaces,” Nanophotonics (2019).

22. Z. Liu, D. Zhu, S. P. Rodrigues, K.-T. Lee, and W. Cai, “Generative model for the inverse design of metasurfaces,” Nano Lett. 18(10), 6570–6576 (2018). [CrossRef]  

23. W. Ma, F. Cheng, Y. Xu, Q. Wen, and Y. Liu, “Probabilistic representation and inverse design of metamaterials based on a deep generative model with semi-supervised learning strategy,” Adv. Mater. 31(35), 1901111 (2019). [CrossRef]  

24. T. Zhang, J. Wang, Q. Liu, J. Zhou, J. Dai, X. Han, Y. Zhou, and K. Xu, “Efficient spectrum prediction and inverse design for plasmonic waveguide systems based on artificial neural networks,” Photonics Res. 7(3), 368 (2019). [CrossRef]  

25. C. Barth and C. Becker, “Machine learning classification for field distributions of photonic modes,” Commun. Phys. 1(1), 58 (2018). [CrossRef]  

26. P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” arXiv:1611.07004 (2016).

27. I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” arXiv:1406.2661 (2014).

28. M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv:1411.1784 (2014).

29. N. S. Altman, “An introduction to kernel and nearest-neighbor nonparametric regression,” Am. Stat. 46(3), 175–185 (1992). [CrossRef]  

30. M. Eslami, S. Tabarestani, S. Albarqouni, E. Adeli, N. Navab, and M. Adjouadi, “Image-to-images translation for multi-task organ segmentation and bone suppression in chest x-ray radiography,” IEEE Trans. Med. Imaging 39(7), 2553–2565 (2020). [CrossRef]  

31. A. Rau, P. J. E. Edwards, O. F. Ahmad, P. Riordan, M. Janatka, L. B. Lovat, and D. Stoyanov, “Implicit domain adaptation with conditional generative adversarial networks for depth prediction in endoscopy,” Int. J. Comput. Assist. Radiol. Surg. 14(7), 1167–1176 (2019). [CrossRef]  

32. H. Zhao, O. Gallo, I. Frosio, and J. Kautz, “Loss Functions for Image Restoration With Neural Networks,” IEEE Trans. Comput. Imaging 3(1), 47–57 (2017). [CrossRef]  

33. E. Shelhamer, J. Long, and T. Darrell, “Fully convolutional networks for semantic segmentation,” IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017). [CrossRef]  
