Optica Publishing Group

Inverse design of unparametrized nanostructures by generating images from spectra

Open Access

Abstract

Recently, there has been an increasing number of studies applying machine learning techniques to the design of nanostructures. Most of these studies train a deep neural network (DNN) to approximate the highly nonlinear function of the underlying physical mapping between spectra and nanostructures. At the end of training, the DNN allows an on-demand design of nanostructures, i.e., the model can infer nanostructure geometries for desired spectra. While these approaches have presented a new paradigm, they are limited in the complexity of the structures proposed, often bound to parametric geometries. Here we introduce spectra2pix, a DNN trained to generate 2D images of the target nanostructures. By predicting an image, our model architecture is not limited to a closed set of nanostructure shapes, and can be trained for the design of a much wider space of geometries. We show, for the first time to the best of our knowledge, a successful generalization ability, by designing completely unseen shapes of geometries. We attribute this successful generalization to the ability of a pixel-wise architecture to learn local properties of the meta-material, thereby faithfully mimicking the underlying physical process. Importantly, beyond synthetic data, we show our model's generalization capability on real experimental data.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

Introduction. The interaction of light with nano-scale materials can be characterized by various properties of the outgoing light [1]. This interaction results in partial transmission, due to absorption and scattering. Predicting such an optical response of a nanostructure requires solving the full set of Maxwell equations. This problem, denoted as the “direct problem” in Fig. 1, is a feasible prediction problem, and can be solved via simulations. The more challenging direction is the “inverse problem,” i.e., inferring the nanoscale geometry from measured (or desired) far-field spectra.

Indeed, the inverse design of nanophotonics structures, i.e., obtaining a geometry for a desired photonic function, has been a challenge for decades. Due to the highly nonlinear nature of this optimization problem, it requires, when applying evolutionary or topology optimization algorithms [2,3], hundreds to several thousands of iterations for a single design task. Recently, modern machine learning (ML) algorithms have been applied to this nanophotonics inverse problem, demonstrating great promise [4–7].

One can distinguish two different levels of generalization in the design of nanostructures by ML, one of which has been demonstrated, but the other remains a desirable goal. The first, and the more basic one, is obtaining a model that is capable of designing nanostructures of the same shape and material it was trained on, but with different properties, such as arm sizes, angles, host material, etc. Works such as [8–16] fall within this category, since their test sets comprise geometries with the same shapes as those in the train set, and the ML algorithm outputs a parameterized description of those structures.

The second category refers to models that can generalize and design geometries with shapes that differ from the set of shapes used during training. For example, in this work, we showcase that our model can infer “${\rm L}$”-, “${\rm h}$”-, and “${\rm H}$”-shaped nanostructures, given matched spectra, although the model was trained on different types of shapes. Additional attempts to devise such a model have been recently presented in [17]. In that work, the authors proposed a generator-discriminator architecture with an adversarial objective that does not utilize a pixel-wise loss in geometry space. They tried to test the generalization ability of their model by training on a set of digit shapes (from the MNIST [18] dataset), leaving one digit as a test set. However, in their work, the model designed a shape from the set of shapes it was trained on, and did not seem to generalize as expected.


Fig. 1. Interaction of light with plasmonic nanostructures. Incoming electromagnetic radiation interacts with the nanostructure in a resonant manner, leading to an effective optical response. The spectra of both polarizations are dictated by the geometry of the nanostructure, rather than the chemical composition.


Achieving generalization requires training on a diverse enough dataset. In this work, we rely on the dataset from [12,13], which has been converted to a pixel-wise format. To test generalization ability, we hold out some of the shape classes.

It is important to clarify the distinction between [12,13] and our work. In [12,13], the authors study the model’s ability to infer geometries from the same shapes the model was trained on, by encoding the geometries into a parameter vector and training a neural network to regress this vector given spectra, i.e., the model architecture was designed to retrieve coding vectors that encode a closed set of shapes. In contrast, we aim to support generalization beyond the shapes of a given dataset by predicting a pixel (grid) representation of the target shape.

Our presented approach introduces a model that is flexible enough to design any 2D nanostructure geometry, given a suitable dataset. Moreover, it is, to our knowledge, the first report of the successful design of geometries sampled from a fairly different distribution than the one the model was trained on (level-two generalization, as described above).

Method. Let ${\cal S} = \{s \}$ be the set of all supported spectra. Let ${\cal G} = \{{{g_i}} \}$ be the set of binary 2D square images of all geometries, ${g_i} \in {\{{0,1} \}^{d \times d}}$, where $d \in \mathbb{N}$ is the dimension of the images. Each geometry image ${g_i}$ is associated with a valid pair of spectra $(s_i^1,s_i^2)$, $s_i^1,s_i^2 \in {\cal S}$. Each element of the pair is associated with a different polarization (vertical or horizontal). Let $C \subseteq {\mathbb{R}^l}$ be the set of all supported materials, represented as vectors. In this study, without loss of generality, we use one material (gold), and define $C$ as a set of single-dimensional vectors ($l = 1$) containing a real-valued parameter called the epsilon host, $h \in {\mathbb R}$.

We define $M:{\cal S} \times {\cal S} \times C \to {\cal G}$ to be a model that maps pairs of spectra, associated with material properties, into a 2D image of the matched geometry. Given a set of $N$ quadruplet training elements,

$$X = [(s_i^1,s_i^2,{c_i},{g_i})]_{i = 1}^N,$$
our training goal is to learn a model $M$ such that for all $({s^1},{s^2},c,g) \in X$, the generated image,
$${\hat g}: = M({s^1},{s^2},c),$$
approximates the label image $g$ with high accuracy.

The Loss Function. During training, the spectra2pix model minimizes a pixelwise reconstruction loss term, which compares the generated image with the ground truth image. Our loss function $L:{{\mathbb R}^{d \times d}} \times {{\mathbb R}^{d \times d}} \to {\mathbb R}$ is defined as

$$L(\hat g,g) = \left\| {M({s^1},{s^2},c) - g} \right\|_2^2,$$
where $\| \cdot \|_2^2$ is the squared Euclidean norm. This loss relies solely on the pixelwise comparison between the generated image and the ground truth image.

By employing a pixelwise loss function on the generated images, our spectra2pix model learns to approximate the hidden inverse function between spectra and material properties and geometries without explicitly parameterizing the shape.
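In NumPy terms, this objective is simply the squared pixel difference summed over the image. The following is a minimal illustrative sketch, not the authors' implementation:

```python
import numpy as np

def pixelwise_loss(g_hat: np.ndarray, g: np.ndarray) -> float:
    """Squared Euclidean distance between the generated and ground-truth
    images, L(g_hat, g) = ||g_hat - g||_2^2, summed over all d x d pixels."""
    return float(np.sum((g_hat - g) ** 2))
```

For identical images the loss is zero; for a uniformly wrong prediction it grows with the number of mismatched pixels.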

Model Architecture. The architecture of spectra2pix comprises two parts. The first part receives the vectorized representations of the spectra pair along with a scalar representing the epsilon host of the material. It then applies three parallel fully connected networks, one to each of the three inputs. This part is similar to [12,13].

The second part of the model architecture receives the three outputs of the last fully connected layers from the first part and concatenates them into one unified representation. The unified vector is then transformed into a higher dimension by utilizing a fully connected layer. Next, the higher dimensional vector is reshaped into a matrix and forwarded through a sequence of three convolutional layers, each followed by a ReLU activation. Each convolutional layer incorporates 10 filters with a kernel size of $5 \times 5$, except for the last layer, which utilizes a single filter. The output of the last convolutional layer is the generated image, denoted by $\hat g$. The spectra2pix model is illustrated in Fig. 2.
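The forward pass described above can be sketched with random, untrained weights as follows. The branch widths (64 and 16) and the input spectra length (43) are illustrative assumptions, since the Letter does not specify them; only the $64 \times 64$ output size, the three-branch structure, and the three $5 \times 5$ convolutional layers follow the description:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

def fc_relu(x, out_dim):
    """Fully connected layer with random (untrained) weights + ReLU."""
    W = rng.standard_normal((x.size, out_dim)) * 0.01
    return np.maximum(x.ravel() @ W, 0.0)

def conv_relu(x, n_filters, k=5):
    """'Same'-padded k x k convolution + ReLU; x has shape (C, H, W)."""
    Wt = rng.standard_normal((n_filters, x.shape[0], k, k)) * 0.01
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    patches = sliding_window_view(xp, (k, k), axis=(1, 2))  # (C, H, W, k, k)
    return np.maximum(np.einsum('fckl,chwkl->fhw', Wt, patches), 0.0)

def spectra2pix_forward(s1, s2, eps_host, d=64):
    # Part 1: three parallel fully connected branches (separate weights).
    branches = np.concatenate([
        fc_relu(s1, 64),                          # vertical polarization
        fc_relu(s2, 64),                          # horizontal polarization
        fc_relu(np.atleast_1d(eps_host), 16),     # epsilon host scalar
    ])
    # Part 2: expand to d*d, reshape to a square matrix, then three
    # convolutional layers (10 filters each, the last with a single filter).
    h = fc_relu(branches, d * d).reshape(1, d, d)
    h = conv_relu(h, 10)
    h = conv_relu(h, 10)
    return conv_relu(h, 1)[0]  # generated image g_hat, shape (d, d)

g_hat = spectra2pix_forward(rng.random(43), rng.random(43), 2.0)
```

With trained weights, rounding `g_hat` would yield the binary geometry image; here the sketch only verifies the tensor shapes flow as described.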


Fig. 2. Spectra2pix architecture and dataset. The model receives two spectra and an epsilon host value. Each of the three inputs is forwarded through a fully connected network (with separate weights). The outputs of the three networks are concatenated into one intermediate vector. Next, another fully connected layer is applied to transform the vector into a higher dimension (equals ${64} \times {64}$). The latter is reshaped to a square matrix, to which three convolutional layers are applied sequentially, resulting in a ${64} \times {64}$ output image. All activations are ReLU activations.


Synthetic Data. We utilize the dataset from [12,13]. This dataset comprises ${\sim}13\;{\rm K}$ samples of synthetic experiments. Each sample is associated with a geometry, a single polarization (vertical or horizontal), and material properties. By pairing the polarizations, we formed ${\sim}6.5\;{\rm K}$ experiments comprising the quadruplets $({s^1},{s^2},c,g) \in X$, where $c \in [1,3]$ represents the host material.

The geometries are composed of different combinations of edges. All three data parts, geometry, spectrum, and material properties, are represented as vectors. Specifically, for the geometries, an eight-dimensional encoding is used. Five dimensions encode the presence of each one of the five edges of the H shape (binary values). Two dimensions encode the size of (1) the outer edges (which share the same size) and (2) the inner edge. The last dimension represents the angle between the top-left outer edge and the inner edge (angles are between 0° and 90°). In our work, we transform the parametrized shapes into 2D binary images. The transformed dataset will be made completely public.
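The conversion from the eight-dimensional encoding to a binary image can be illustrated by rasterizing the five edges onto a $64 \times 64$ grid. The edge layout and ordering below are illustrative assumptions (the dataset's exact geometric convention is not reproduced), and the angle parameter is ignored for simplicity:

```python
import numpy as np

def encoding_to_image(enc, d=64):
    """Rasterize an 8-dim H-family encoding into a d x d binary image.

    enc[0:5] - presence bits for the five edges (illustrative order:
               top-left, bottom-left, top-right, bottom-right, inner)
    enc[5]   - outer-edge length in pixels (shared by all outer edges)
    enc[6]   - inner-edge length in pixels
    enc[7]   - angle of the top-left edge (ignored in this sketch)
    """
    img = np.zeros((d, d), dtype=np.uint8)
    bits, outer, inner = enc[:5], int(enc[5]), int(enc[6])
    cy, w = d // 2, 3                             # vertical center, thickness
    xl, xr = (d - inner) // 2, (d + inner) // 2   # left/right arm columns
    # Outer (vertical) edges: top/bottom halves of the two arms.
    if bits[0]: img[cy - outer:cy, xl:xl + w] = 1  # top-left
    if bits[1]: img[cy:cy + outer, xl:xl + w] = 1  # bottom-left
    if bits[2]: img[cy - outer:cy, xr:xr + w] = 1  # top-right
    if bits[3]: img[cy:cy + outer, xr:xr + w] = 1  # bottom-right
    # Inner (horizontal) edge connecting the two arms.
    if bits[4]: img[cy:cy + w, xl:xr + w] = 1
    return img
```

For instance, all five presence bits set yields an H-like shape, while a single bottom-left edge plus the inner edge yields an L-like shape.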

To study the ability of spectra2pix to generalize to the second category, we split the above dataset into train, test, and validation sets, leaving a complete subset of a specific shape to the test set. In other words, instead of employing a random split, we utilize a leave-one-shape-out strategy, where the train and validation sets contain all shapes but one. All the samples of the left-out shape are assigned to the test set.
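The leave-one-shape-out split can be sketched as follows; the `shape` field name is a hypothetical stand-in for however the shape classes are labeled in the dataset:

```python
import random

def leave_one_shape_out(samples, held_out_shape, val_frac=0.05, seed=0):
    """Split samples (dicts with a hypothetical 'shape' key) into
    train/validation/test sets, assigning every sample of the held-out
    shape to the test set. The validation set is ~val_frac of the entire
    dataset, drawn randomly from the remaining shapes."""
    test = [s for s in samples if s["shape"] == held_out_shape]
    rest = [s for s in samples if s["shape"] != held_out_shape]
    random.Random(seed).shuffle(rest)
    n_val = round(val_frac * len(samples))
    return rest[n_val:], rest[:n_val], test  # train, val, test
```

Unlike a random split, no variant of the held-out shape ever reaches the train or validation sets, which is what makes the test a cross-category generalization task.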

Numerical Results. In our experiments, we chose three different leave-one-shape-out splits. The first split utilizes a test set containing all the geometries of the shape ${\rm L}$ and their variants (resulting in 779 test samples). The second leave-one-out split leaves all ${\rm h}$ shapes to the test set (879 test samples), and the third leaves all ${\rm H}$ shapes (623 test samples). Notably, each test set includes different edge sizes for the inner edge and outer edges, along with a variable angle for the outer top left edge.

The train and validation sets for each leave-one-out split contain all the rest of the samples. The split between the train and validation sets is random, where the validation set size is ${\sim}5\%$ of the size of the entire dataset. We train the spectra2pix network for 1M training steps, with a batch size of 64. The Adam optimizer is used with a learning rate of ${10^{- 5}}$.

We report quantitative results for the spectra2pix model after transforming the output images to binary values (with a standard rounding operation) and compare it with multiple baselines utilizing the inverse network architecture taken from [12,13]. Each baseline was trained on one of the above leave-one-shape-out datasets. The baseline models, denoted by baseline-L/h/H, are used to compare spectra2pix with a parametric model trained and tested on the same task of cross-category generalization.

For completeness and to further assess our proposed model’s performance as well as the complexity of the cross-category generalization tasks, we train and evaluate additional spectra2pix and baseline models on a random split. These models, denoted by spectra2pix-all and baseline-all, are trained with all the different shapes in the dataset and evaluated for a standard within-category generalization task.

Notably, all baselines are trained to minimize the L2 loss of the geometry encoding vector. During inference, we transform the predicted geometry encodings into 2D images.

Performance is reported using the mean squared error (MSE); the normalized MSE (NMSE), defined as

$$\textit{NMSE}(\hat g,g) = \frac{\textit{MSE}(\hat g,g)}{\textit{MSE}(g,0)} = \frac{\| \hat g - g \|_2^2}{\| g \|_2^2},$$
and an MSE score evaluated on the spectral space (denoted by MSE-spectra).
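These two image-space metrics translate directly into NumPy (an illustrative sketch):

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error over all pixels."""
    return float(np.mean((a - b) ** 2))

def nmse(g_hat: np.ndarray, g: np.ndarray) -> float:
    """Normalized MSE: MSE(g_hat, g) / MSE(g, 0) = ||g_hat - g||^2 / ||g||^2."""
    return float(np.sum((g_hat - g) ** 2) / np.sum(g ** 2))
```

The normalization makes scores comparable across geometries with different numbers of lit pixels: a perfect prediction scores 0, and an all-zero prediction scores exactly 1.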

Since flipping and mirroring do not affect the spectra of some geometries (e.g., ${\rm L}$ and ${\rm J}$ share the same spectra), we flip and mirror each generated image during the calculation of the quantitative metrics defined above. Specifically, for each sample, we flip and mirror the predicted image, forming four variations of the predicted geometry. Then, we calculate the MSE and NMSE between each of the four variations and the ground truth image. The variation that minimizes both metrics is chosen to contribute its score to the total score calculated across the entire test set. For example, for a given ground truth image of an ${\rm L}$ shape, if the model predicted an accurate geometry but in a different orientation, such as ${\rm J}$, the MSE and NMSE of such a prediction would be zero.
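Because NMSE is the MSE divided by a quantity that depends only on the ground truth, the variant minimizing MSE also minimizes NMSE, so a single minimization suffices. A sketch of this orientation-invariant score:

```python
import numpy as np

def best_orientation_mse(g_hat: np.ndarray, g: np.ndarray) -> float:
    """MSE between g and the best of the four flip/mirror variants of g_hat
    (identity, left-right flip, up-down flip, and both combined)."""
    variants = (g_hat, np.fliplr(g_hat), np.flipud(g_hat),
                np.flipud(np.fliplr(g_hat)))
    return min(float(np.mean((v - g) ** 2)) for v in variants)
```

For example, an accurately predicted ${\rm J}$ scored against an ${\rm L}$ ground truth yields zero, as intended.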

To assess our proposed model’s performance, we report all models’ accuracy in the spectral space. To circumvent the computational burden of evaluating all predicted geometries’ spectra across all models and test sets, we train a proxy model to predict a given geometry’s spectra. The model receives a geometry image and a host material as input, and predicts the spectra. The proxy model architecture utilizes the reversed architecture of spectra2pix, hence denoted by pix2spectra. We propagate all predicted geometries, from all models above, through the pix2spectra model to predict their associated spectra. Finally, we calculate the MSE in spectra space and report the performance in Table 1.


Table 1. Comparing Spectra2pix with Four Baseline Inverse Networks [11,12]a

The results, shown in Table 1, indicate that spectra2pix outperforms the baseline in both within- and cross-category generalization tasks and across all metrics. As can be seen, the within- and cross-category models greatly differ in their performance, with the models that were trained on all shapes having a much lower error. In the cross-category tasks, the spectra2pix method obtained results that are better in all cases than the baseline with the same generalization level. More specifically, for both ${\rm L}$ and ${\rm h}$ splits, spectra2pix obtained performance closer to the baseline-all model than the baseline, which is trained on the same cross-category task. Notably, the spectra2pix-all model yields the best performance across all test sets, outperforming the baseline-all on all tasks.

Figure 3 presents a qualitative comparison for the ${\rm L}$- and ${\rm H}$-split models: spectra2pix-${\rm L}/{\rm H}$ and baseline-${\rm L}/{\rm H}$. In Fig. 3 (first row), our spectra2pix model generalizes well to the ${\rm L}$ shape and designs an accurate geometry, while the baseline model seems to collapse to a wrong shape that appears in the train set. In Fig. 3 (second row), spectra2pix returns the correct outlines of the geometries from the H split (which it has not seen at all). The baseline, on the other hand, completely misses the shapes and infers geometries only from the subsets it has seen. In summary, these results highlight the generalization ability and stability of spectra2pix when designing shapes sampled from a fairly different distribution than the one the model was trained on.


Fig. 3. Qualitative comparison of spectra2pix and the baseline networks in [12,13], trained on the no-${\rm L}$ and no-${\rm H}$ splits. The first row exhibits a sample from the ${\rm L}$ split, predicted by spectra2pix-${\rm L}$ and the baseline-${\rm L}$ model. The second row presents a sample from the ${\rm H}$ split, predicted by spectra2pix-${\rm H}$ and baseline-${\rm H}$. The epsilon host of every sample is exhibited in yellow on the corresponding ground truth image. Each of the baseline and spectra2pix models receives the input spectra and the epsilon host as input and predicts a geometry (for the baseline, we present the image associated with the predicted parameters).


Fabrication Results. To showcase the ability of spectra2pix to generate faithful images from measured spectra for real nanostructures, we nano-fabricated a test set of ${\rm H}$-shaped geometries made of gold. Since the fabricated nanostructures were made on a substrate with an ITO layer, we generated a relatively small dataset of ${\sim} {500}$ COMSOL simulations of geometries from the same H family, hosted in ITO, excluding the particular H geometries. We then fine-tuned spectra2pix-H on this additional synthetic ITO dataset, allowing the model to adapt to the ITO properties. Neither the initial synthetic dataset used for pre-training nor the ITO dataset used for fine-tuning includes H-shaped geometries.

Next, we propagated the measured transmission spectra of the fabricated test set through spectra2pix-${\rm H}$. Figure 4 presents the generated image for a representative sample from the fabricated test set. As can be seen, the model predicts a fairly accurate image that aligns with the fabricated geometries as measured by a scanning electron microscope (right column). For completeness, we also show in Fig. 4 (left column) the spectra of COMSOL simulations associated with the fabricated geometries. These results show experimentally, for the first time, the ability of neural networks to predict images of unseen fabricated geometries.


Fig. 4. Measured spectra (left) of fabricated nanostructures (right) were fed into spectra2pix trained on the no-${\rm H}$ split. The model-generated image (middle) comprises ${\rm H}$-shaped geometries. The generated shapes are faithful to the fabricated nanostructures (right). COMSOL simulations for the fabricated nanostructures are depicted alongside the measured spectra (left).


Conclusion. The use of ML techniques, and deep learning in particular, has spawned huge interest over the past few years in the nanophotonics communities, due to the great promise these techniques offer for the inverse design of novel devices and functionalities. In this Letter, we introduce spectra2pix, a model that supports the design of any 2D geometry. In addition, compared to other work in the field, spectra2pix is the first to successfully generalize by designing completely unseen shapes of geometries, a capability confirmed by both synthetic and experimental data. Our results highlight the importance and the generalization ability of deep neural networks, towards the goal of inverse design of a suitable nanostructure for any desired spectral response.

Funding

European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (ERC CoG 725974).

Acknowledgments

H.S. acknowledges the PAZI young scientist award. The contribution of I.M. is part of PhD thesis research conducted at Tel Aviv University.

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

REFERENCES

1. W. Cai and V. Shalaev, Optical Metamaterials: Fundamentals and Applications (Springer, 2009).

2. J. Jensen and O. Sigmund, Laser Photon. Rev. 5, 308 (2011). [CrossRef]  

3. S. Molesky, Z. Lin, A. Y. Piggott, W. Jin, J. Vucković, and A. W. Rodriguez, Nat. Photonics 12, 659 (2018). [CrossRef]  

4. W. Ma, Z. Liu, Z. A. Kudyshev, A. Boltasseva, W. Cai, and Y. Liu, Nat. Photonics 15, 77 (2021). [CrossRef]  

5. D. Melati, Y. Grinberg, M. Kamandar Dezfouli, S. Janz, P. Cheben, J. H. Schmid, A. Sánchez-Postigo, and D.-X. Xu, Nat. Commun. 10, 4775 (2019). [CrossRef]  

6. Y. Kiarashinejad, M. Zandehshahvar, S. Abdollahramezani, O. Hemmatyar, R. Pourabolghasem, and A. Adibi, Adv. Intell. Syst. 2, 1900132 (2020). [CrossRef]  

7. I. Malkiel, M. Mrejen, L. Wolf, and H. Suchowski, MRS Bull. 45, 221 (2020). [CrossRef]  

8. J. Peurifoy, Y. Shen, L. Jing, Y. Yang, F. Cano-Renteria, B. G. DeLacy, J. D. Joannopoulos, M. Tegmark, and M. Soljačić, Sci. Adv. 4, eaar4206 (2018). [CrossRef]  

9. I. Sajedian, J. Kim, and J. Rho, Microsyst. Nanoeng. 5, 27 (2019). [CrossRef]  

10. D. Liu, Y. Tan, E. Khoram, and Z. Yu, ACS Photon. 5, 1365 (2018). [CrossRef]  

11. W. Ma, F. Cheng, and Y. Liu, ACS Nano 12, 6326 (2018). [CrossRef]  

12. I. Malkiel, M. Mrejen, A. Nagler, U. Arieli, L. Wolf, and H. Suchowski, Light Sci. Appl. 7, 60 (2018). [CrossRef]  

13. I. Malkiel, M. Mrejen, A. Nagler, U. Arieli, L. Wolf, and H. Suchowski, in IEEE International Conference on Computational Photography (ICCP) (IEEE, 2018), pp. 1–14.

14. Y. Kiarashinejad, S. Abdollahramezani, and A. Adibi, npj Comput. Mater. 6, 1 (2020). [CrossRef]  

15. M. Mrejen, I. Malkiel, A. Nagler, U. Arieli, L. Wolf, and H. Suchowski, in 2019 Conference on Lasers and Electro-Optics (CLEO), (IEEE, 2019).

16. I. Malkiel, A. Nagler, M. Mrejen, U. Arieli, L. Wolf, and H. Suchowski, “Deep learning for design and retrieval of nano-photonic structures,” arXiv preprint arXiv:1702.07949 (2017).

17. Z. Liu, D. Zhu, S. P. Rodrigues, K.-T. Lee, and W. Cai, Nano Lett. 18, 6570 (2018). [CrossRef]  

18. Y. LeCun, C. Cortes, and C. J. C. Burges, “The MNIST database of handwritten digits,” 2010 [Online]. Available: http://yann.lecun.com/exdb/mnist.
