Optica Publishing Group

Deep learning approach to predict optical attenuation in additively manufactured planar waveguides

Open Access

Abstract

The booming demand for efficient, scalable optical networks has intensified the exploration of strategies that connect large-scale fiber networks with miniaturized photonic components. Within this context, our research introduces a convolutional neural network (CNN) as a method for approximating the nonlinear attenuation function of centimeter-scale multimode waveguides. Using a ray tracing model to simulate many flexographically printed waveguide configurations, we generated a comprehensive dataset for CNN training. The model estimates optical losses due to waveguide curvature with an attenuation standard deviation of 1.5 dB for test data over an attenuation range of 50 dB. Notably, the CNN model evaluates a waveguide in 517 µs, whereas the ray tracing model requires 5–10 min for the same task. This substantial increase in computational efficiency makes the model especially valuable in scenarios that require fast waveguide assessments, such as optical network optimization. In a subsequent study, we compare the trained model with measurements of fabricated waveguides and with the optical model. All approaches show excellent agreement in assessing the waveguide’s attenuation within measurement accuracy. Our results illustrate the potential of machine learning for optical network design.

Published by Optica Publishing Group under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

1. INTRODUCTION

The need for efficient and scalable optical networks grows as the world becomes increasingly interconnected. Flexographic printing of optical networks is emerging as a promising technology to bridge large-scale networks of individual fibers and miniaturized photonic chips. Current optical networks can be categorized into large-scale networks using individual optical fibers [1–3] and integrated photonic chips created through semiconductor technology [4–10]. The former utilizes waveguide lengths exceeding 1 m, while the latter’s size is restricted to wafer size or, due to cost constraints, even smaller [11]. However, these two manufacturing systems leave a gap for centimeter-scale optical networks. Flexographic printing stands out here, known for its high productivity, cost-effectiveness, safety due to the absence of toxic chemicals, and impressive resolution [12–16].

Despite the proven manufacturability of optical networks by flexographic printing, finding optimal layouts remains an open problem, primarily because waveguides are sensitive to bending. This sensitivity is more pronounced in polymer waveguides because of the small refractive index difference between core and cladding. Also, waveguides in optical circuit boards are interdependent, necessitating robust optimization to avoid sharp bends. Because the relationship between the waveguide’s path, curvature, and attenuation is nonlinear, the optimization is typically driven by optical simulations. Previous work presented an optimization algorithm for flexographic waveguide networks using curvature as the loss function. This function simplifies the optimization engine but deviates from the true minimum of the attenuation function [17].

Existing literature showcases numerous instances where deep neural networks have been employed to design optical elements. These endeavors primarily fall into two distinct categories: optimization and inverse design [18]. Within the optimization realm, machine learning techniques serve to expedite the iterative process by evaluating numerical solutions [19,20]. Conversely, inverse design specifies the system’s desired optical characteristics, allowing the network to derive an appropriate solution autonomously [21,22]. Notably, the predominant systems discussed in the literature thus far exhibit sub-millimeter dimensions with feature sizes less than the light’s wavelength. The potential of neural networks in shaping larger systems or complete optical networks remains largely untapped. The task of optimizing expansive multimode waveguide networks emerges as a novel field. It demands innovative solution approaches, especially given the limitations of wave optical solvers at such magnitudes, which falter owing to overwhelming memory and computational prerequisites.


Fig. 1. Schematic of the optical decimal-to-binary converter built on a flame retardant circuit board material (FR4) substrate with a PMMA overlay. The device translates integers 1–5 into binary outputs using five optical inputs. Waveguides, created from ultraviolet (UV) curable polymer, cover $20 \times 26\;{\rm mm}$. Laser diodes initiate and photodiodes detect optical signals, with waveguides kept 2 mm from end faces for diode contact pad production. Assembly dimensions: $20 \times 30\;{\rm mm}$ [17]. (a) Schematic of the flexographically printed decimal-to-binary converter using a photopolymer (${n} = {1.516}$) on a poly(methyl methacrylate) (PMMA) substrate (${n} = {1.49}$). (b) High-exposure image of input 5: shows points P1 and P2 where light leaks from the S-bends’ high curvature in suboptimal waveguides.


Our paper delves into the challenges these curved multimode waveguides pose, explicitly focusing on curvature-induced losses. We propose a neural network that can approximate the actual nonlinear cost function for these large-scale waveguides, thereby serving as a cost function for optimizing complex networks. We develop a ray tracing model for flexographically printed waveguides and use it to train a neural network for accurate optical loss prediction. Data augmentation is employed to increase the robustness of the model. We explore optimal network architecture selection and hyperparameters, testing various network layouts. We evaluate the performance of these networks, analyzing prediction error and identifying the best-performing network for further optimization. Our paper evaluates the model’s quality, considering the error function across the waveguide, average error, and standard deviation. We highlight the neural network’s computational efficiency and practical applicability, aiming to provide a robust and efficient model for predicting losses in curved optical waveguides, which will be valuable in designing and optimizing printed optical networks. To prove the model’s validity, we fabricate several S-bends by flexographic printing and compare its attenuation characteristics to the predictions made by the developed optical model and the derived deep learning model.

2. DESIGN CHALLENGES IN WAVEGUIDE OPTIMIZATION

In optical flexography, core and cladding consist of polymers, typically with similar refractive indices around 1.5. This similarity results in a markedly reduced acceptance angle. Tight curves therefore cause high losses in these waveguides, which limits how compactly optical networks can be designed. Figure 1 shows this for an optical decimal-to-binary converter [17].
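To make the effect of the small index contrast concrete, the following Python sketch computes the critical angle and the numerical aperture of an idealized step-index guide. It uses the photopolymer and PMMA indices quoted in the caption of Fig. 1 (1.516 and 1.49). Treating the substrate as the only cladding is a simplifying assumption, and the function names are illustrative rather than taken from the paper.

```python
import math

def critical_angle_deg(n_core: float, n_clad: float) -> float:
    """Critical angle for total internal reflection, measured from the surface normal."""
    if n_clad >= n_core:
        raise ValueError("guiding requires n_core > n_clad")
    return math.degrees(math.asin(n_clad / n_core))

def numerical_aperture(n_core: float, n_clad: float) -> float:
    """Numerical aperture of an ideal step-index guide."""
    return math.sqrt(n_core**2 - n_clad**2)

# Indices from the Fig. 1 caption; the substrate is assumed to act as the cladding.
theta_c = critical_angle_deg(1.516, 1.49)
na = numerical_aperture(1.516, 1.49)
print(f"critical angle = {theta_c:.1f} deg, NA = {na:.3f}")
```

Under these assumptions, rays stay guided only if they meet the core boundary at a shallow grazing angle, which is why even moderate bends push a large share of the rays past the critical angle.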

The illustrated converter incorporates five optical inputs and three outputs, transforming integers from 1 to 5 into three binary optical outputs via linear combination. The network is printed on a PMMA substrate adhered to an FR4 base. Spanning $20 \times 26\;{\rm mm}$, the network uses laser diodes at the input and photodiodes at the output to produce and detect optical signals, respectively. To accommodate the manufacture of contact pads for the diodes, the waveguides maintain a minimum distance of 2 mm from the end faces. The entire assembly measures $20 \times 30\;{\rm mm}$. As depicted in Fig. 1, the optimized network layout was produced by flexographic printing using a UV curable polymer, specifically 390119 UV Supraflex, supplied by Jänecke + Schneemann. These waveguides, whose cross-section resembles a circle segment, have a height of 25 µm and a maximum width of 340 µm. Because the volume transferred per print limits the structural height, multiple overprints are needed to build up the waveguide structure; this research uses nine printing iterations.

The network layout was optimized employing the technique of smallest local curvature [17]. The layout optimization has 40 independent variables, with many influencing the performance of multiple waveguides at once. It is evident that at points P1 and P2 [Fig. 1(b)], the waveguide decouples a significant amount of optical power, highlighting that although the optimization method is suitable as an initial solution, it is considerably inferior to ray tracing in detail. The latter technique allows a direct statement about optical performance. Still, its high computational load limits its suitability for multi-dimensional optimizations.

Figure 3 illustrates another example of an optical network, a flexographically printed coupler that couples light from eight laser diodes into a multimode optical fiber. This light propagation is enabled through the use of printed waveguides. The assembly consists of four primary components: the electrical infrastructure that controls the laser diodes, the laser diodes equipped with micro-optics for beam shaping, the flexographically printed coupler, and a fiber connector. The coupler substrate (with a refractive index, ${n_S}$, of 1.49) is made from PMMA, with a printed waveguide (with a refractive index, ${n_W}$, of 1.51).


Fig. 2. Different tested neural networks’ architectures are illustrated, with layer dimensions indicated in round brackets (${\rm length} \times {\rm width} \times {\rm filter}\;{\rm number}$) and activation functions in square brackets. In every decoder block, batch normalization precedes the activation function application after the transposed convolution layer. (a) Neural network with 10 fully connected layers (DNN). (b) CNN with 24 layers (CNN1): the front part of the network consists of blocks of two or three convolutional layers with a max-pooling layer for regularization. Depending on the number of input coordinates’ length (${n_0} = {100}$, 500, 1000), ${n_i}$ is calculated for ${i} = {1}$, 2, 4, 6, 7 using ${n_{i}} = {n_{{i} - 1}} - {1}$. ${n_{i}}$ is determined for ${i} = {3}$, 5 with ${n_{i}} = ({n_{{i} - 1}} - {1}){/2}$. (c) 42-layer CNN (CNN2): the front part of the network consists of blocks of three convolutional layers with a max-pooling layer for regularization. Depending on the number of input coordinates’ length (${n_0} = {500}$, 1000), $n_{i}$ for ${i} = {1}$, 2, 3, 5, 6, 8, 9, 11, 12, 14, 15 is calculated as ${n_{i}} = {n_{i- 1}} - {1}$. $n_{i}$ for ${i} = {4}$, 7, 10, 13 is determined by ${n_{i}} = ({n_{{i} - 1}} - {1}){/2}$.


Fig. 3. Displayed is a flexographically printed coupler that directs light from eight laser diodes into a multimode optical fiber using printed waveguides. The assembly includes an electrical control for the diodes, laser diodes with micro-optics, the printed coupler, and a fiber connector. (a) Top view of the optical part of the coupler: The substrate ($n_{S}= {1.49}$) is made of PMMA, with a printed waveguide ($n_{W}= {1.51}$). (b) Overview of the assembly with a 3D-printed housing, electrical board, optical network, and fiber.


However, the design of the waveguides has encountered several challenges. For ideal coupling, the angle between the optical fiber and the laser diode should be 0°. Without a combining mirror, this condition can only be met for one diode. Smaller angles that still fall within the acceptance angle are acceptable, though not ideal, because the power coupled into the fiber decreases with increasing angle. A longer waveguide reduces the coupling angle but also introduces higher transmission losses. A potential solution is a curved waveguide that gradually reduces the incidence angle of the light beam in the waveguide to 0°, thereby creating an optimal connection. The optimal path of such a waveguide remained unclear, so the waveguide layout requires optimization. A significantly improved solution could be identified through multidimensional optimization, where both the position of components (laser diode + lenses and optical fiber) and the course of the waveguide are available as optimization parameters. Ray tracing is theoretically usable for these multidimensional optimizations, but because of the vast solution space and the high computational effort, the problem cannot be solved in a reasonable amount of time.

These two introductory examples are intended to serve as motivation for the following work. The objective is to identify a rapid and precise method to determine the optical attenuation of curved waveguides without ray tracing. This information can then be employed to optimize the waveguides in these examples to achieve a superior result.

3. SIMULATION MODEL

Curved connections are required to link two arbitrary network elements, and they introduce losses attributable to waveguide geometry. The literature provides extensive discussion on losses within circularly curved waveguides, demonstrating an exponential relationship between attenuation and the curvature radius of the waveguide. The larger the curvature, the greater the attenuation [23]. However, attenuation can be calculated analytically only for waveguides with constant curvature, specifically circles or lines. Given the complex profile of flexographically printed waveguides, the waveguide parameters needed to characterize them cannot be computed analytically. Consequently, a ray tracing model is necessary to describe the behavior of arbitrarily curved flexographic waveguides. This model is developed using Optics Studio by Zemax in conjunction with MATLAB. The attenuation ${A_n}$ at any detector ${d_n}$ is defined as

$${A_n} = - 10{\log}_{10} \frac{{{I_n}}}{{{I_1}}}.$$
${I_n}$ is defined as the number of rays that hit the rectangular detector ${d_n}$, and ${I_1}$ is the corresponding count at the first detector.
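The attenuation definition can be evaluated directly from detector ray counts. The short Python sketch below is only an illustration: the ray counts are hypothetical, and treating a detector that registers no rays as infinite attenuation is an assumption made here, not a detail given in the text.

```python
import math

def attenuation_db(ray_counts):
    """Attenuation A_n = -10*log10(I_n / I_1) for each detector along the path."""
    i_first = ray_counts[0]
    if i_first <= 0:
        raise ValueError("first detector must register at least one ray")
    result = []
    for count in ray_counts:
        # Assumption: a detector that sees no rays maps to unbounded attenuation.
        if count == 0:
            result.append(math.inf)
        else:
            result.append(-10.0 * math.log10(count / i_first))
    return result

# Hypothetical ray counts, not simulation results from the paper.
counts = [10000, 5000, 1000, 0]
print(attenuation_db(counts))
```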

Fig. 4. Optical model of flexographically printed waveguides developed in MATLAB using Optics Studio by Zemax, optimized to reduce curvature-related losses. (a) Top view: a Gaussian-profile laser beam irradiates one facet, with multiple detectors along the waveguide path assessing transmission spatially. (b) Cross-section: waveguide on a 170-µm-thick substrate ($n_{S}= {1.49}$); waveguide dimensions: $n_{W}= {1.51}$, height = 30 µm, chord width = 300 µm.


Fig. 5. Representation of exemplary input ${\overrightarrow{\rm Inp}}$ and output $\overrightarrow{\rm Out}$ training data and the corresponding vector length ${R}$, as well as the angles $\alpha$ and $\beta$.


The simulation workflow involves defining the waveguide cross-section and its path in MATLAB. The waveguide cross-section for flexographically printed optical waveguides can be modeled as a circular segment [15]. The dimensions of this waveguide are shown in Fig. 4. The waveguide path is defined as

$$\overrightarrow{\rm f_{(t)}} = \vec a{t^3} + \vec b{t^2} + \vec ct + \vec d.$$
The endpoints are at $t = 0$ and $t = 1$. $\vec a$, $\vec b$, $\vec c$, and $\vec d$ are two-dimensional polynomial coefficients. The spline starts and ends at positions $\vec{\rm s_1}$ and $\vec{\rm s_2}$, and its directions at these points are defined by $\vec{\rm \sigma _1}$ and $\vec{\rm \sigma _2}$. This results in a linear system of equations that can be solved for $\vec a$, $\vec b$, $\vec c$, and $\vec d$:
$$\left[{\begin{array}{*{20}{c}}{{\vec{\rm f}_{n(0)}}}&\,\,\,{{\vec{\rm f}_{n(1)}}}&\,\,\,{\frac{{\partial {\vec{\rm f}_{n(0)}}}}{{\partial t}}}&\,\,\,{\frac{{\partial {\vec{\rm f}_{n(1)}}}}{{\partial t}}}\end{array}} \right] = \left[{\begin{array}{*{20}{c}}{{\vec{\rm s}_1}}&\,\,\,{{\vec{\rm s}_2}}&\,\,\,{\lambda {\vec{\rm \sigma}_1}}&\,\,\,{\kappa {\vec{\rm \sigma}_2}}\end{array}} \right].$$
Because the system of equations is underdetermined, $\lambda$ and $\kappa$ are introduced. Since there is a relation between curvature and attenuation, the method according to Pflieger et al. [23] is used. The function is optimized so that the highest curvature ${K_{(\lambda ,\kappa)}}$ on the entire spline becomes minimal:
$$\mathop {\min}\limits_{\lambda ,\kappa \in {\mathbb{R}^ +}} \{{K_{(\lambda ,\kappa)}}\} = \mathop {\min}\limits_{\lambda ,\kappa \in {\mathbb{R}^ +}} \left\{{\mathop {\max}\limits_{t \in \left[{0;1} \right]} \left({\left| {\frac{{\frac{{\partial x}}{{\partial t}}\frac{{{\partial ^2}y}}{{\partial {t^2}}} - \frac{{\partial y}}{{\partial t}}\frac{{{\partial ^2}x}}{{\partial {t^2}}}}}{{{{\left[{{{\left({\frac{{\partial x}}{{\partial t}}} \right)}^2} + {{\left({\frac{{\partial y}}{{\partial t}}} \right)}^2}} \right]}^{3/2}}}}} \right|} \right)} \right\}.$$
The resulting low curvature spline is then transformed into a computer aided design (CAD) model using the STL data format.
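The spline construction can be sketched in a few lines of Python. The closed-form coefficients follow from the four boundary conditions of the linear system above, and the peak curvature is estimated on a uniform grid in $t$. The brute-force search over $\lambda$ and $\kappa$ is a stand-in chosen for simplicity; the text does not specify which optimizer was used, and the function names, grid, and example endpoints are hypothetical.

```python
import math

def hermite_coeffs(s1, s2, sigma1, sigma2, lam, kap):
    """Solve f(0)=s1, f(1)=s2, f'(0)=lam*sigma1, f'(1)=kap*sigma2 for a cubic."""
    coeffs = []
    for k in range(2):  # x and y components are independent
        d = s1[k]
        c = lam * sigma1[k]
        a = 2.0 * (s1[k] - s2[k]) + lam * sigma1[k] + kap * sigma2[k]
        b = 3.0 * (s2[k] - s1[k]) - 2.0 * lam * sigma1[k] - kap * sigma2[k]
        coeffs.append((a, b, c, d))
    return coeffs  # [(ax, bx, cx, dx), (ay, by, cy, dy)]

def max_curvature(coeffs, samples=201):
    """Largest absolute curvature on t in [0, 1], estimated on a uniform grid."""
    (ax, bx, cx, _), (ay, by, cy, _) = coeffs
    worst = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        dx = 3 * ax * t * t + 2 * bx * t + cx
        dy = 3 * ay * t * t + 2 * by * t + cy
        ddx = 6 * ax * t + 2 * bx
        ddy = 6 * ay * t + 2 * by
        speed_sq = dx * dx + dy * dy
        if speed_sq < 1e-12:
            return math.inf  # degenerate parametrization (cusp)
        worst = max(worst, abs(dx * ddy - dy * ddx) / speed_sq**1.5)
    return worst

def best_spline(s1, s2, sigma1, sigma2, scales):
    """Brute-force search over (lam, kap) for the smallest peak curvature."""
    best = None
    for lam in scales:
        for kap in scales:
            k = max_curvature(hermite_coeffs(s1, s2, sigma1, sigma2, lam, kap))
            if best is None or k < best[0]:
                best = (k, lam, kap)
    return best

# Hypothetical connection: start at the origin heading +y, end at (10, 20) heading +y.
scales = [2.0 * i for i in range(1, 31)]  # candidate tangent lengths 2..60
print(best_spline((0.0, 0.0), (10.0, 20.0), (0.0, 1.0), (0.0, 1.0), scales))
```

A gradient-free or gradient-based optimizer could replace the grid search; the grid is used here only because it keeps the min-max structure of the objective easy to read.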

In the simulation engine, a laser source is placed at the beginning of the waveguide, and multiple detectors are positioned equidistant along the waveguide to investigate the spatially resolved transmission behavior. The positions and orientations of the detectors are passed to the ray tracing engine. The optical model consists of two volumetric bodies: the waveguide (imported from the STL file) (${n_W}=1.51$) and the substrate (${n_S}=1.49$) on which the waveguide is placed [Fig. 4(b)]. The light source used is a Gaussian beam profile. The simulation results are available in MATLAB for analysis.

4. DATA GENERATION

Since training neural networks requires several thousand examples, these have been generated synthetically employing the presented model. We simulate the optical attenuation for thousands of waveguides. Figure 5 shows exemplary input and output training data. The starting point of the waveguides is at position (0, 0) with an orientation in the positive ${y}$-direction (${90^ \circ}$). The endpoint is generated using three randomly generated parameters ($\alpha$, $\beta$, and ${R}$):

$$\alpha = {360^ \circ} \cdot {X_1},$$
$$\beta = {360^ \circ} \cdot {X_2},$$
$$R = {10^{(2 - {{\log}_{10}}(2.5)){X_3} + {{\log}_{10}}(2.5)}}\;{\rm mm},$$
$$0 \le {X_n} \le 1.$$
In our study, each ${X_n}$ is drawn randomly between zero and one. The end of the waveguide therefore lies between 2.5 and 100 mm from the origin. To achieve a robust representation across a length span of nearly two orders of magnitude, the values for $R$ were generated on a logarithmic scale.
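The three parameters defined above translate into a small sampling routine. The Python sketch below assumes that $\alpha$ is the polar angle of the endpoint and that $\beta$ is the tangent direction at the endpoint; the second reading is an interpretation of Fig. 5 rather than something the text states explicitly. The function name and the fixed seed are illustrative.

```python
import math
import random

R_MIN_MM, R_MAX_MM = 2.5, 100.0

def sample_endpoint(rng):
    """Draw (alpha, beta, R) and return the endpoint position and direction."""
    x1, x2, x3 = rng.random(), rng.random(), rng.random()
    alpha = 360.0 * x1  # degrees, polar angle of the endpoint from the +x axis
    beta = 360.0 * x2   # degrees, assumed to be the tangent direction at the endpoint
    log_min = math.log10(R_MIN_MM)
    # The exponent is linear in x3, so R is spread evenly on a logarithmic scale.
    r = 10.0 ** ((math.log10(R_MAX_MM) - log_min) * x3 + log_min)
    position = (r * math.cos(math.radians(alpha)), r * math.sin(math.radians(alpha)))
    direction = (math.cos(math.radians(beta)), math.sin(math.radians(beta)))
    return alpha, beta, r, position, direction

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
for _ in range(3):
    print(sample_endpoint(rng))
```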

For practical reasons, we limited the values to between 2.5 and 100 mm. Below 2.5 mm, the curvatures of the waveguides would be so high that all light is coupled out and no waveguiding occurs. We chose the upper limit because circuit boards rarely have a diameter over 200 mm. Moreover, when connecting two widely separated points, the curvature of the waveguide becomes small, reducing curvature losses. In our model, $\alpha$ indicates the orientation on the substrate starting from the positive ${ x}$-direction. For simplicity, the starting point is fixed at the origin. Any pair of start and end points can be transformed to this standard case through translation and rotation, so that the start lies at the origin with the emission direction in the positive ${ y}$-direction. Consequently, the trained neural network can only handle starting points at the origin, and each waveguide must be reduced to this standard case. This restriction is by choice. An arbitrary start position and direction would certainly increase the flexibility of the network, but it would also require much more training data and, thus, training effort. The input ${\overrightarrow{\rm Inp}}$ is a collection of different positions on the path in ${x}$ and ${ y}$:

$${ \overrightarrow{\rm Inp}} = {\left({\begin{array}{*{20}{c}}{{x_1}}&\,\,\,{{x_2}}& \,\,\,\ldots &\,\,\,{{x_n}}\\{{y_1}}&\,\,\,{{y_2}}& \,\,\,\ldots &\,\,\,{{y_n}}\end{array}} \right)^T}.$$
Waveguides with at least one point outside a square with an edge length of 150 mm are discarded.
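Reducing an arbitrary waveguide to the standard case (start at the origin, emission along the positive $y$-direction) amounts to one translation followed by one rotation. The following Python sketch shows one way to do this; the function name and the example path are hypothetical.

```python
import math

def to_standard_frame(points, start, start_dir):
    """Translate `start` to the origin and rotate so `start_dir` points along +y."""
    phi = math.atan2(start_dir[1], start_dir[0])
    rot = math.pi / 2.0 - phi  # counterclockwise rotation mapping start_dir onto +y
    cos_r, sin_r = math.cos(rot), math.sin(rot)
    out = []
    for x, y in points:
        dx, dy = x - start[0], y - start[1]
        out.append((cos_r * dx - sin_r * dy, sin_r * dx + cos_r * dy))
    return out

# Hypothetical path that starts at (5, 5) and initially heads in the +x direction.
path = [(5.0, 5.0), (6.0, 5.0), (7.0, 6.0)]
print(to_standard_frame(path, start=(5.0, 5.0), start_dir=(1.0, 0.0)))
```

Because translation and rotation are rigid motions, the curvature along the path and therefore the expected attenuation are unchanged by this step.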

As described previously, the output vector $\overrightarrow{\rm Out}$ is determined by the attenuation $\vec A$. It consists of 100 equidistant points on this path. Additionally, it is normalized to between zero and one for better convergence, and an attenuation over 50 dB is reduced to exactly 50 dB. This case is treated as a total loss:

$${\overrightarrow{\rm Out}}={ 50^{- 1}}\min\! \left\{{{{\left({\begin{array}{*{20}{c}}{{A_1}}&{{A_2}}& \ldots &{{A_n}}\end{array}} \right)}^T},50} \right\}\!.$$
The output vector $\overrightarrow{\rm Out}$ is sized ${1} \times {100}$.
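The clipping and scaling of the output vector can be written compactly. This Python sketch mirrors the formula above with the 50 dB cap from the text; the function names are illustrative, and the inverse mapping is included only as a convenience for reading predictions back in decibels.

```python
def normalize_output(attenuation_db, cap_db=50.0):
    """Clip attenuation at cap_db (treated as total loss) and scale to [0, 1]."""
    return [min(a, cap_db) / cap_db for a in attenuation_db]

def denormalize_output(out, cap_db=50.0):
    """Map a normalized network output back to decibels."""
    return [o * cap_db for o in out]

# Hypothetical attenuation values in dB along one waveguide.
print(normalize_output([0.0, 25.0, 50.0, 80.0, float("inf")]))
```

Information above 50 dB is lost by the clipping, so a denormalized value of 50 dB should be read as "50 dB or more".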

We carried out ray tracing with 10,000 rays and generated 4050 examples in this way. To increase the amount of training data, we employed data augmentation, which adds slightly modified copies of existing data to the data set. Although waveguide paths can in principle be rotated, shifted, or mirrored without changing their optical properties, only mirroring remains available for augmentation because our definition fixes the start position and direction. To maintain the orientation of the start point, the waveguide can only be mirrored along the y-axis. We applied this mirroring to increase the number of training examples to 8100.
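The mirroring step can be sketched as follows in Python. Mirroring along the y-axis is taken here to mean negating every $x$-coordinate, which keeps the start at the origin and the start direction along the positive $y$-direction; the attenuation vector is copied unchanged. The data layout (a list of `(inp, out)` pairs) is an assumption for illustration.

```python
def mirror_example(inp, out):
    """Reflect a path across the y-axis (x -> -x); the attenuation is unchanged."""
    mirrored = [(-x, y) for x, y in inp]
    return mirrored, list(out)

def augment(dataset):
    """Return the original examples followed by their mirrored copies."""
    augmented = list(dataset)
    for inp, out in dataset:
        augmented.append(mirror_example(inp, out))
    return augmented

# Hypothetical single example: three path points and three normalized attenuations.
dataset = [([(0.0, 0.0), (1.0, 2.0), (3.0, 5.0)], [0.0, 0.1, 0.2])]
print(len(augment(dataset)))
```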

5. NETWORK LAYOUTS AND ARCHITECTURE

We evaluate three autoencoder configurations to find the best network layout: a fully connected network, often referred to as a dense network, and two CNNs. All three network layouts are depicted in Fig. 2. Typically, these autoencoders are shaped like an hourglass: they possess many neurons at the input and output ends but only a sparse set in the middle. This design enforces a form of data compression, compelling the network to retain only the most pertinent details in an abstracted form and funnel them through the architectural bottleneck at the network’s midpoint.

The fully connected network comprises 10 dense layers. The CNNs feature blocks of convolutional layers succeeded by max-pooling in the encoder segment. In contrast, the decoder segment incorporates inverse convolutional layer blocks followed by batch normalization and employs the leaky ReLU activation function.


Fig. 6. Representation of the error function depending on the epoch and the moving average of the previous 20 measurements for the different networks. (a) Error function of the training data. (b) Error function of the test data.


The total transmission of a waveguide can be viewed as the product of the transmissions of its individual sections. Expressed logarithmically in decibels, this product becomes a sum of attenuations. This representation fits neural networks well, given that each neuron effectively sums the weighted outputs of neurons in the preceding layer. Moreover, the attenuation at a specific point is influenced both by local coordinate components (${x}$ and ${ y}$) and by nearby preceding points. It is therefore practical to view the input vector as analogous to an image in which various geometric features are discernible and from which an attenuation value can be derived. Given this perspective, convolutional layers are a highly suitable choice. The convolutional layers in the CNN are followed by max-pooling in the encoder segment to distill these features further. The decoder then uses inverse convolutional layers, batch normalization, and the leaky ReLU activation function to reconstruct from this distilled information.
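The equivalence between multiplying linear transmissions and adding decibel values can be checked numerically. The Python sketch below uses hypothetical per-section transmission factors; only the identity itself is taken from the text.

```python
import math

def to_db(transmission):
    """Attenuation in dB for a linear power transmission factor in (0, 1]."""
    return -10.0 * math.log10(transmission)

# Hypothetical transmission factors of three consecutive waveguide sections.
segments = [0.9, 0.5, 0.8]

total_transmission = math.prod(segments)       # multiplicative in linear units
sum_of_db = sum(to_db(t) for t in segments)    # additive in decibels

print(to_db(total_transmission), sum_of_db)
```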

Another hyperparameter is the number of spline support points needed for accurate prediction. The base network divides the spline into 100 coordinates, but also 500 and 1000 coordinates are tested. Figure 2 shows the three network architectures studied. For network training, examples are split into training (90%) and testing (10%) data, trained over 500 epochs. The initial network weights are generated randomly.

The errors on the training and test data are evaluated each epoch. These error functions are depicted over the training epochs in Fig. 6. The training error has not reached a minimum for any model except DNN-1000. The test error, however, reaches a lower limit after 500 epochs, with large fluctuations between epochs due to the logarithmic representation. Training error decreases from the DNN networks over CNN2 to CNN1, and the test error confirms this ordering. This is surprising because the training error typically decreases with increased network depth, although the test error can increase when a too-deep network overfits. Increasing the number of support points reduces the error further. Hence, considering the error, CNN1-1000 is the best network. Because CNN1-1000 shows slight overfitting, its training is stopped after 250 epochs to achieve an optimal result.

A. Performance Analysis

The generated error must be examined for different cases to assess the neural model’s quality. In this study, the error (Err) is defined as the difference in decibels (dB) between the simulation (${D_{{\rm Sim}}}$) and the prediction (${D_{{\rm Prog}}}$) of the model at various support points on the spline:

$${\rm Err}(x) = {D_{{\rm Sim}}}(x) - {D_{{\rm Prog}}}(x).$$

Figure 7 presents two histograms of the error function across the spline, with separate consideration for training and test data. For clarity, the function range is limited to $\pm {3}\;{\rm dB}$, with values outside this range projected onto the $\pm {3}\;{\rm dB}$ edges. The model provides the best predictions at the start, where the attenuation is 0 dB by definition. Variations in optical conductivity occur later, broadening the distribution, but the average error remains centered at 0 dB.


Fig. 7. Three-dimensional histogram of the error difference between simulation and neural prediction as a function of position: the position is normalized to the total length of the spline. (a) Error function of the training data. (b) Error function of the test data.


Figure 8 shows the average error and standard deviation along the spline. The average error is practically always at 0 dB, with a maximum deviation of 0.28 dB. There is no significant difference between training and test data. The standard deviation starts at 0 dB, increases linearly to about 0.5 dB for test data and 0.4 dB for training data, plateaus at 75%, and then increases again. The maximum standard deviation at the end of the spline is 1.5 dB for test data and 1.3 dB for training data. The jumps in the standard deviation function align with the inflection points of the splines, which are the main loss centers. The network has learned this effect correctly and handles it almost perfectly.
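The position-resolved mean and standard deviation of the error can be computed as sketched below in Python. The input layout (one list of attenuation values in dB per waveguide) and the use of the population standard deviation are assumptions; the text does not state which estimator was used, and the numbers in the example are made up.

```python
import statistics

def error_profile(simulated, predicted):
    """Per-position mean and standard deviation of Err = D_sim - D_prog, in dB."""
    n_positions = len(simulated[0])
    means, stds = [], []
    for j in range(n_positions):
        errs = [sim[j] - pred[j] for sim, pred in zip(simulated, predicted)]
        means.append(statistics.fmean(errs))
        stds.append(statistics.pstdev(errs))  # population estimator (assumption)
    return means, stds

# Two hypothetical waveguides with three support points each.
sim = [[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]]
pred = [[0.0, 1.0, 1.0], [0.0, 1.0, 5.0]]
print(error_profile(sim, pred))
```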


Fig. 8. Displayed are the standard deviation and the mean error of the error difference histogram (Fig. 7) for the training and test data, depending on the normalized position on the spline. (a) Representation of the mean error. (b) Representation of the standard deviation.


The neural network delivers a reasonable initial estimate of optical losses in a spline, with a maximum standard deviation of less than 2 dB. It maintains a stable error up to a damping of 40 dB, making it applicable for a broad range of waveguides. The computational efficiency of the neural network is also noteworthy. While simulating a spline takes between 5 and 10 min on a dual Intel Xeon Gold 6226R processor system, the neural network evaluates all 8008 splines in just 414 ms, equating to 517 µs per spline. This significant speed advantage is particularly evident when dealing with many splines.

The fact that the standard deviation is relatively low and stable indicates that the neural model is robust and has generalized well from the training data to the test data. The minimal average error near 0 dB, with only a slight deviation, shows that the model’s predictions are close to the ray tracing simulations. This is a positive sign that the neural model has been trained effectively and captures the underlying relationships between waveguide geometry and optical attenuation. A few conclusions can be drawn from this analysis so far.

  • The neural model exhibits consistent performance across both training and test datasets. The similar standard deviation and average error values for both datasets indicate that the model has not overfit the training data and can generalize well to unseen data.
  • The model is highly accurate at the start of the waveguide, where the attenuation is 0 dB by definition, showing that the model has learned this aspect of the relationship well.
  • The CNN1-1000 network appears to be the best model in terms of error metrics, even though it shows signs of slight overfitting. While the model is sophisticated enough to capture complex relationships, it might be worth exploring further regularization techniques rather than early stopping in future iterations to mitigate overfitting.
  • The approach of loss detection using CNN compared to a ray tracing simulation provides a computational speedup of several orders of magnitude.
  • Given the quality of the neural model’s predictions, it can be utilized in practical scenarios where predicting the optical attenuation of waveguides is essential, thus providing significant time savings over traditional ray tracing methods.
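The size of the speedup can be estimated from the per-waveguide times quoted in the text (5 to 10 min for ray tracing, 517 µs for the CNN). The following Python sketch is plain arithmetic on those two figures and ignores any one-time costs such as training or model loading.

```python
import math

RAY_TRACE_SECONDS = (5 * 60, 10 * 60)  # 5 to 10 min per waveguide, from the text
CNN_SECONDS = 517e-6                   # 517 µs per waveguide, from the text

speedups = [t / CNN_SECONDS for t in RAY_TRACE_SECONDS]
orders = [math.log10(s) for s in speedups]

for s, o in zip(speedups, orders):
    print(f"speedup = {s:.3g}x, about {o:.1f} orders of magnitude")
```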

In conclusion, the presented neural model shows promising results in predicting the optical attenuation of waveguides based on their geometric data. Future work could focus on refining the model by introducing additional regularization, testing on a broader range of waveguide geometries, or investigating the influence of different neural architectures and training methodologies.

6. EXPERIMENTAL VALIDATION

While there is a significant degree of alignment between simulation data and the neural network, it is necessary to validate the model using experimental data as well. Consequently, this section will apply the simulation model and neural network to S-bends as an example. To that end, S-bends [Fig. 9(b)] with radii ranging from 7.5 to 25 mm were created by flexographic printing. The manufacturing process is described in previous works [16,17]. The waveguide base has a length of 438 µm and a height of 66 µm. Three samples were examined for each radius, with the average of these three measurements represented as the measurement point and error bars indicating the highest and lowest values. The dashed line denotes the measurement station’s resolution limit, with values above this line considered negligible [Fig. 9(a)].


Fig. 9. Optical losses in a flexographically printed S-bend with different radii. The layout of the waveguides is shown in (b). (a) compares the three methods studied. (c)–(f) compare the scattered light in the waveguide in the ray tracing simulation with that of a printed sample. The substrate was roughened to enhance scattering. (a) Comparison for S-bends between the described simulation model, the CNN prediction, and measurements of real samples. (b) Image of a fabricated S-bend with a radius of 7.5 mm. (c) Representation of the emitted light within the substrate for S-bends with a radius of 7.5 mm in the simulation. (d) Intensity profile of a high-exposure image of an S-bend with 7.5 mm. (e) Representation of the emitted light within the substrate for S-bends with a radius of 15 mm in the simulation. (f) Intensity profile of a high-exposure image of an S-bend with 15 mm.


Furthermore, the S-bends were also analyzed through simulations. For this, waveguide paths were constructed following the method illustrated in Fig. 9(b), comprising two circle segments of variable radii and two straight segments. As the simulations did not account for material and defect losses or coupling losses, these were determined via separate cutback measurements on a straight piece and subsequently added to the simulation results for improved comparability. Coupling losses stood at 1.5 dB, and absorption losses at 2.1 dB/cm.
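A minimal sketch of this step is given below. It builds an S-bend centerline from two straight segments and two circular arcs of opposite curvature, then adds the reported coupling loss (1.5 dB) and absorption loss (2.1 dB/cm) to a simulated bend loss. The arc angle, the straight length, and the assumption that absorption scales with the geometric path length are illustrative choices that the text does not state; all function names are hypothetical.

```python
import numpy as np

COUPLING_LOSS_DB = 1.5        # reported in the text
ABSORPTION_DB_PER_CM = 2.1    # reported in the text

def s_bend_path(radius_mm, arc_angle_deg, straight_mm, n_arc=200):
    """Centerline of an S-bend: straight, left arc, right arc, straight.

    arc_angle_deg and straight_mm are hypothetical parameters.
    """
    theta = np.radians(arc_angle_deg)
    phi = np.linspace(0.0, theta, n_arc)
    # First arc: heading turns from 0 to theta (counterclockwise).
    x1 = straight_mm + radius_mm * np.sin(phi)
    y1 = radius_mm * (1.0 - np.cos(phi))
    # Second arc: heading turns back from theta to 0 (clockwise).
    psi = phi[::-1]
    cx = straight_mm + 2.0 * radius_mm * np.sin(theta)
    cy = radius_mm * (1.0 - 2.0 * np.cos(theta))
    x2 = cx - radius_mm * np.sin(psi)
    y2 = cy + radius_mm * np.cos(psi)
    # Join the pieces; x2[0], y2[0] repeats the end of the first arc.
    x = np.concatenate(([0.0], x1, x2[1:], [x2[-1] + straight_mm]))
    y = np.concatenate(([0.0], y1, y2[1:], [y2[-1]]))
    return x, y

def path_length_mm(x, y):
    """Polyline length of the sampled path."""
    return float(np.hypot(np.diff(x), np.diff(y)).sum())

def total_loss_db(bend_loss_db, length_mm):
    """Simulated bend loss plus coupling and length-proportional absorption."""
    return bend_loss_db + COUPLING_LOSS_DB + ABSORPTION_DB_PER_CM * length_mm / 10.0

x, y = s_bend_path(radius_mm=7.5, arc_angle_deg=30.0, straight_mm=5.0)
print(total_loss_db(bend_loss_db=0.0, length_mm=path_length_mm(x, y)))
```

Because both offsets are additive in dB, they shift the simulated curve without changing its shape, which is what makes the comparison with measurements possible.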

To demonstrate the accuracy of the neural network, S-bend paths that the network had never encountered were evaluated with the CNN-1000 network, following the method outlined in this paper. Initial testing with the network described in Section 5 showed no agreement with either the simulation or the experimental data. Apparently, the splines used for training differ fundamentally from the tested circular paths, and the network cannot generalize to them. The training data were therefore expanded by 91 additional examples of circular paths (182 after reflection about the y-axis), bringing the training set to 8282 examples.
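The reflection step can be sketched as below. The sketch assumes that each path is stored as an array of x and y coordinates and that a mirrored path has the same attenuation profile as the original, which follows from symmetry but is not stated explicitly in the text. The array layout and the function name are hypothetical.

```python
import numpy as np

def augment_by_reflection(paths, targets):
    """Double a dataset by mirroring every path about the y-axis.

    paths:   array of shape (n_paths, 2, n_points), rows are x and y.
    targets: array of shape (n_paths, n_points), attenuation along the path.
    Mirroring maps x to -x and keeps y; the targets are reused unchanged
    (assumption: attenuation is invariant under this reflection).
    """
    paths = np.asarray(paths, dtype=float)
    targets = np.asarray(targets, dtype=float)
    mirrored = paths.copy()
    mirrored[:, 0, :] *= -1.0
    return np.concatenate([paths, mirrored]), np.concatenate([targets, targets])

# 91 circular paths become 182 examples, matching the count in the text.
dummy_paths = np.zeros((91, 2, 8))
dummy_targets = np.zeros((91, 8))
aug_paths, aug_targets = augment_by_reflection(dummy_paths, dummy_targets)
print(aug_paths.shape[0])
```

With 182 added examples and a final size of 8282, the arithmetic would imply 8100 training examples before the expansion.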

The experiments suggest a roughly exponential relationship between the curvature radius and the attenuation losses. For radii exceeding 17.5 mm, the curvature appears to have no measurable effect on the attenuation. Below this value, the attenuation increases exponentially until, at radii below 10 mm, no radiation can be detected because measurement inaccuracies dominate.

On closer inspection, the simulation curve in Fig. 9(a) exhibits several plateaus, an unexpected phenomenon suggesting that attenuation increases in jumps rather than continuously. The exponential increase in attenuation also occurs in the manufactured waveguides. The plateau effect, potentially obscured by measurement inaccuracies resulting from high material absorption and manufacturing defects, has not been definitively proven in an experiment. Nevertheless, the measured values of the produced waveguides match the simulated ones. To further validate the simulation model, the light scattered into the substrate is examined in the simulation in Figs. 9(c) and 9(e). For this purpose, a detector is positioned in the substrate to detect outcoupled rays. The light that couples out of the waveguide is highly localized: most of the power is coupled out at points ${P_1}$ and ${P_2}$, and ${P_4}$ and ${P_5}$, respectively. In addition, there are other, smaller points of loss. To verify this localization on real samples, a scattered light analysis was performed [Figs. 9(d) and 9(f)]. A camera captured images of the scattered light, which were then converted into a color-coded representation in which the color temperature indicates the intensity. In these photographs, ${P_1}$ and ${P_2}$, as well as ${P_4}$ and ${P_5}$, can be seen. Even though the outcoupling at ${P_1}$ and ${P_4}$ is not clearly visible at the exit points themselves, scattered light is visible at the back surface of the sample at ${P_3}$ and ${P_6}$. The secondary scattering centers in the simulation become apparent only in a logarithmic representation of the intensity and hence are undetectable in the photographs.
It is conceivable that the attenuation could increase or decrease abruptly depending on the number of these scattering sites present on the S-bend. Thus, such a sharp increase in attenuation is plausible, and the experimental data do not exclude this possibility. It should be noted, however, that confirming this prediction would necessitate a higher-resolution measurement.
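The remark that the secondary scattering centers appear only in a logarithmic representation can be illustrated with a small sketch. It compares a linear 8-bit rendering of an intensity image with a rendering in dB relative to the peak. The function names, the noise floor, and the example intensities are assumptions chosen for illustration; they are not taken from the measurements.

```python
import numpy as np

def linear_8bit(image):
    """Scale intensities linearly to 0..255, as in an ordinary photograph."""
    img = np.asarray(image, dtype=float)
    return np.round(255.0 * img / img.max()).astype(np.uint8)

def log_intensity_db(image, floor=1e-4):
    """Intensity in dB relative to the peak, clipped at a hypothetical floor."""
    img = np.asarray(image, dtype=float)
    rel = np.clip(img / img.max(), floor, 1.0)
    return 10.0 * np.log10(rel)

# Illustrative values only: one strong loss point and one that is 1000x weaker.
image = [[1.0, 0.001]]
print(linear_8bit(image))       # the weak spot rounds to 0 on a linear scale
print(log_intensity_db(image))  # the weak spot sits 30 dB below the peak
```

On the linear scale, the weaker spot is indistinguishable from a dark background, whereas the logarithmic scale still resolves it, consistent with the observation above.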

Across the S-bend study, the simulation data and the convolutional neural network (CNN) predictions agree closely. The S-bends, fabricated by flexographic printing and reproduced in the ray tracing simulation, served as a test case for understanding waveguide behavior. Our observations led to several findings.

  • A pronounced exponential relationship exists between the curvature radius and attenuation losses. This correlation was not just evident in the physical waveguides but was also accurately replicated by the CNN.
  • Interestingly, the simulation curve revealed periodic plateaus, indicating that attenuation increments are not continuous. The CNN also mirrored this quasi-periodic behavior, emphasizing its aptitude for understanding complex patterns.
  • An analysis of scattered light, particularly from real samples, provided insightful data. The resultant imagery showcased a color-coded representation, giving us a detailed visualization based on the intensity of the scattered light.

Although the current data offer rich insights, they also suggest abrupt alterations in attenuation, likely influenced by scattering sites. It is imperative to note that while these observations hint at certain behaviors, drawing conclusive evidence would necessitate higher-resolution measurements.

7. CONCLUSION

The increasing demand for efficient, scalable optical networks necessitates innovative solutions that bridge the gap between large-scale fiber networks and miniaturized photonic chips. This study identified flexographic printing as a promising methodology for producing centimeter-scale optical networks. While the technique offers exceptional productivity and safety advantages, challenges remain in designing optimal layouts due to the sensitive nature of waveguides to bending.

Because the refractive indices of core and cladding are close, both typically around 1.5, flexographically printed waveguides have a significantly reduced critical acceptance angle. This poses challenges in design, making waveguides particularly susceptible to tight curves and thus limiting the compactness of optical networks. This challenge is depicted in the optical decimal-to-binary converter and the flexographically printed coupler. Both illustrations underscore the design complexities, especially in maintaining optimal angles for efficient light propagation and minimizing optical power loss at points of high curvature. No efficient and precise optimization algorithm is known in the state of the art. Inspired by these examples, the vision is to devise a method that rapidly and accurately predicts optical attenuation in curved waveguides without resorting to ray tracing, paving the way for enhanced waveguide optimization and design.

Our contribution to addressing this challenge lies in developing and validating a neural network model tailored to approximate the true nonlinear attenuation function of curved optical waveguides. The significance of this approach is two-fold. First, our model, trained using data from a ray tracing simulation of flexographically printed waveguides, showcased a robust ability to predict optical losses, mirroring complex patterns, such as the quasi-exponential relationship between curvature radius and attenuation losses. The second noteworthy achievement was the network’s computational efficiency, delivering predictions in microseconds compared to the traditional simulation’s minutes.
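As a rough worked estimate of this speed-up, using the figures reported in the abstract (517 µs per waveguide for the CNN and 5–10 min for the ray tracing model), and assuming both timings refer to comparable single-waveguide evaluations:

$$\frac{t_{\rm ray}}{t_{\rm CNN}} = \frac{300\;{\rm s}}{517 \times 10^{-6}\;{\rm s}} \approx 5.8 \times 10^{5} \quad \text{to} \quad \frac{600\;{\rm s}}{517 \times 10^{-6}\;{\rm s}} \approx 1.2 \times 10^{6}.$$

The network is therefore faster by roughly six orders of magnitude under these assumptions.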

Practical assessments revealed the model’s applicability across a spectrum of flexographically printed waveguides. With a maximum standard deviation of 1.5 dB for test data and 1.3 dB for training data at the end of the spline, the network’s potential as a tool for real-world applications is evident. Additionally, the model’s alignment with simulated and real data from S-bends further underscores its reliability. However, occasional deviations point to the need for refining the model further, potentially by incorporating a more diverse dataset and addressing potential measurement inaccuracies.

Our modeling and validation process centered on a distinct waveguide type and its associated fabrication procedure. A specific material system was judiciously selected to facilitate the training regimen. Consequently, the current iteration of our neural network is tailored explicitly to this type. To be compatible with an alternative fabrication technique or material system, it would be imperative to regenerate the training dataset using our established methodology and subject the network to subsequent retraining. Under such modifications, the majority of waveguides amenable to ray tracing simulations, specifically the multimode waveguides, should be feasibly represented. On the other hand, the modeling of single-mode waveguides necessitates a nuanced wave-optical analysis, which remains beyond the purview of ray tracing. As a result, a comprehensive revalidation would be essential to gauge the method’s suitability in these contexts.

In closing, our work underscores the efficacy of machine learning, particularly convolutional neural networks, in advancing the field of optical networks. By bridging the gap between simulation and real-world measurements, our neural network model offers a promising route to designing more efficient and scalable optical networks. Future endeavors should focus on further refining the neural network model, introducing more diverse datasets for training, and expanding the range of waveguide geometries studied. As the demand for efficient optical networks surges, the innovations and findings from this research serve as a robust foundation for future advancements in the domain.

Funding

Deutsche Forschungsgemeinschaft (390833453).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

REFERENCES

1. E. Agrell, M. Karlsson, A. R. Chraplyvy, D. J. Richardson, P. M. Krummrich, P. Winzer, K. Roberts, J. K. Fischer, S. J. Savory, B. J. Eggleton, M. Secondini, F. R. Kschischang, A. Lord, J. Prat, I. Tomkos, J. E. Bowers, S. Srinivasan, M. Brandt-Pearce, and N. Gisin, “Roadmap of optical communications,” J. Opt. 18, 063002 (2016).

2. “Cisco annual internet report (2018–2023): White paper,” (Cisco, 2020).

3. A. A. Jørgensen, D. Kong, M. R. Henriksen, et al., “Petabit-per-second data transmission using a chip-scale microcomb ring resonator source,” Nat. Photonics 16, 798–802 (2022).

4. W. Bogaerts, D. Pérez, J. Capmany, D. A. B. Miller, J. Poon, D. Englund, F. Morichetti, and A. Melloni, “Programmable photonic circuits,” Nature 586, 207–216 (2020).

5. X. Chen, M. M. Milosevic, S. Stankovic, S. Reynolds, T. D. Bucio, K. Li, D. J. Thomson, F. Gardes, and G. T. Reed, “The emergence of silicon photonics as a flexible technology platform,” Proc. IEEE 106, 2101–2116 (2018).

6. N. M. Fahrenkopf, C. McDonough, G. L. Leake, Z. Su, E. Timurdogan, and D. D. Coolbaugh, “The AIM Photonics MPW: A highly accessible cutting edge technology for rapid prototyping of photonic integrated circuits,” IEEE J. Sel. Top. Quantum Electron. 25, 1–6 (2019).

7. P. Munoz, P. W. L. van Dijk, D. Geuzebroek, M. Geiselmann, C. Dominguez, A. Stassen, J. D. Domenech, M. Zervas, A. Leinse, C. G. H. Roeloffzen, B. Gargallo, R. Banos, J. Fernandez, G. M. Cabanes, L. A. Bru, and D. Pastor, “Foundry developments toward silicon nitride photonics from visible to the mid-infrared,” IEEE J. Sel. Top. Quantum Electron. 25, 1–13 (2019).

8. A. Rahim, T. Spuesens, R. Baets, and W. Bogaerts, “Open-access silicon photonics: Current status and emerging initiatives,” Proc. IEEE 106, 2313–2330 (2018).

9. W. D. Sacher, J. C. Mikkelsen, Y. Huang, J. C. C. Mak, Z. Yong, X. Luo, Y. Li, P. Dumais, J. Jiang, D. Goodwill, E. Bernier, P. G.-Q. Lo, and J. K. S. Poon, “Monolithically integrated multilayer silicon nitride-on-silicon waveguide platforms for 3-D photonic circuits and devices,” Proc. IEEE 106, 2232–2245 (2018).

10. M. Smit, K. Williams, and J. van der Tol, “Past, present, and future of InP-based photonic integration,” APL Photonics 4, 050901 (2019).

11. J. E. Bowers and A. Y. Liu, “A comparison of four approaches to photonic integration,” in Optical Fiber Communications Conference and Exhibition (OFC) (2017), pp. 1–3.

12. G.-A. Hoffmann, Benetzungssteuerung auf Foliensubstraten mittels Flexodruck zur additiven Fertigung polymerer optischer Wellenleiter, Vol. 03/2022 of Berichte aus dem ITA (TEWISS Verlag, 2022).

13. K. Pflieger, L. Overmeyer, and E. Olsen, “Flexografically printed optical waveguides for complex low-cost optical networks,” Proc. SPIE 12007, 120070G (2022).

14. T. Reitberger, G.-A. Hoffmann, T. Wolfer, L. Overmeyer, and J. Franke, “Printing polymer optical waveguides on conditioned transparent flexible foils by using the aerosol jet technology,” Proc. SPIE 9945, 99450G (2016).

15. T. Wolfer, Additive Fertigung integrierter multimodaler Polymer-Lichtwellenleiter mittels Flexodruck, Vol. 01/2020 of Berichte aus dem ITA (TEWISS Verlag, 2020).

16. T. Wolfer, P. Bollgruen, D. Mager, L. Overmeyer, and J. G. Korvink, “Printing and preparation of integrated optical waveguides for optronic sensor networks,” Mechatronics 34, 119–127 (2016).

17. K. Pflieger, B. Reitz, G.-A. Hoffmann, and L. Overmeyer, “Layout optimization for flexographically printed optical networks,” Appl. Opt. 60, 9828–9836 (2021).

18. W. Ma, Z. Liu, Z. A. Kudyshev, A. Boltasseva, W. Cai, and Y. Liu, “Deep learning for the design of photonic structures,” Nat. Photonics 15, 77–90 (2021).

19. G. Alagappan and C. E. Png, “Modal classification in optical waveguides using deep learning,” J. Mod. Opt. 66, 557–561 (2019).

20. G. Alagappan and C. E. Png, “Prediction of electromagnetic field patterns of optical waveguide using neural network,” Neural Comput. Appl. 33, 2195–2206 (2021).

21. D. Mengu, M. S. Sakib Rahman, Y. Luo, J. Li, O. Kulce, and A. Ozcan, “At the intersection of optics and deep learning: statistical inference, computing, and inverse design,” Adv. Opt. Photonics 14, 209 (2022).

22. N. J. Dinsdale, P. R. Wiecha, M. Delaney, J. Reynolds, M. Ebert, I. Zeimpekis, D. J. Thomson, G. T. Reed, P. Lalanne, K. Vynck, and O. L. Muskens, “Deep learning enabled design of complex transmission matrices for universal optical components,” ACS Photonics 8, 283–295 (2021).

23. R. G. Hunsperger, Integrated Optics (Springer New York, 2009).

Figures (9)

Fig. 1. Schematic of the optical decimal-to-binary converter built on a flame retardant circuit board material (FR4) substrate with a PMMA overlay. The device translates integers 1–5 into binary outputs using five optical inputs. Waveguides, created from ultraviolet (UV) curable polymer, cover $20 \times 26\;{\rm mm}$. Laser diodes initiate and photodiodes detect optical signals, with waveguides kept 2 mm from end faces for diode contact pad production. Assembly dimensions: $20 \times 30\;{\rm mm}$ [17]. (a) Schematic of the flexographically printed decimal-to-binary converter using a photopolymer (${n} = {1.516}$) on a poly(methyl methacrylate) (PMMA) substrate (${n} = {1.49}$). (b) High-exposure image of input 5: shows points P1 and P2 where light leaks from the S-bends’ high curvature in suboptimal waveguides.
Fig. 2. Different tested neural networks’ architectures are illustrated, with layer dimensions indicated in round brackets (${\rm length} \times {\rm width} \times {\rm filter}\;{\rm number}$) and activation functions in square brackets. In every decoder block, batch normalization precedes the activation function application after the transposed convolution layer. (a) Neural network with 10 fully connected layers (DNN). (b) CNN with 24 layers (CNN1): the front part of the network consists of blocks of two or three convolutional layers with a max-pooling layer for regularization. Depending on the number of input coordinates’ length (${n_0} = {100}$, 500, 1000), ${n_i}$ is calculated for ${i} = {1}$, 2, 4, 6, 7 using ${n_{i}} = {n_{{i} - 1}} - {1}$. ${n_{i}}$ is determined for ${i} = {3}$, 5 with ${n_{i}} = ({n_{{i} - 1}} - {1}){/2}$. (c) 42-layer CNN (CNN2): the front part of the network consists of blocks of three convolutional layers with a max-pooling layer for regularization. Depending on the number of input coordinates’ length (${n_0} = {500}$, 1000), $n_{i}$ for ${i} = {1}$, 2, 3, 5, 6, 8, 9, 11, 12, 14, 15 is calculated as ${n_{i}} = {n_{i- 1}} - {1}$. $n_{i}$ for ${i} = {4}$, 7, 10, 13 is determined by ${n_{i}} = ({n_{{i} - 1}} - {1}){/2}$.
Fig. 3. Displayed is a flexographically printed coupler that directs light from eight laser diodes into a multimode optical fiber using printed waveguides. The assembly includes an electrical control for the diodes, laser diodes with micro-optics, the printed coupler, and a fiber connector. (a) Top view of the optical part of the coupler: The substrate ($n_{S}= {1.49}$) is made of PMMA, with a printed waveguide ($n_{W}= {1.51}$). (b) Overview of the assembly with a 3D-printed housing, electrical board, optical network, and fiber.
Fig. 4. Optical model of flexographically printed waveguides developed in MATLAB using Optics Studio by Zemax, optimized to reduce curvature-related losses. (a) Top view: a Gaussian-profile laser beam irradiates one facet, with multiple detectors along the waveguide path assessing transmission spatially. (b) Cross-section: waveguide on a 170-µm-thick substrate ($n_{S}= {1.49}$); waveguide dimensions: $n_{W}= {1.51}$, height = 30 µm, chord width = 300 µm.
Fig. 5. Representation of exemplary input ${\overrightarrow{\rm Inp}}$ and output $\overrightarrow{\rm Out}$ training data and the corresponding vector length ${R}$, as well as the angles $\alpha$ and $\beta$.
Fig. 6. Representation of the error function depending on the epoch and the moving average of the previous 20 measurements for the different networks. (a) Error function of the training data. (b) Error function of the test data.
Fig. 7. Three-dimensional histogram of the error difference between simulation and neural prediction as a function of position: the position is normalized to the total length of the spline. (a) Error function of the training data. (b) Error function of the test data.
Fig. 8. Displayed are the standard deviation and the mean error of the error difference histogram (Fig. 7) for the training and test data, depending on the normalized position on the spline. (a) Representation of the mean error. (b) Representation of the standard deviation.

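Equations (11)

$A_n = -10 \log_{10} \frac{I_n}{I_1}.$
$f(t) = a t^3 + b t^2 + c t + d.$
$\begin{bmatrix} f_n(0) \\ f_n(1) \\ \frac{\partial f_n(0)}{\partial t} \\ \frac{\partial f_n(1)}{\partial t} \end{bmatrix} = \begin{bmatrix} s_1 \\ s_2 \\ \lambda \sigma_1 \\ \kappa \sigma_2 \end{bmatrix}.$
$\min_{\lambda, \kappa \in \mathbb{R}^{+}} \{ K(\lambda, \kappa) \} = \min_{\lambda, \kappa \in \mathbb{R}^{+}} \left\{ \max_{t \in [0;1]} \left( \left| \frac{\frac{\partial x}{\partial t} \frac{\partial^2 y}{\partial t^2} - \frac{\partial y}{\partial t} \frac{\partial^2 x}{\partial t^2}}{\left| \left(\frac{\partial x}{\partial t}\right)^2 + \left(\frac{\partial y}{\partial t}\right)^2 \right|^{3/2}} \right| \right) \right\}.$
$\alpha = 360\, X_1\; [^\circ],$
$\beta = 360\, X_2\; [^\circ],$
$R = 10^{(2 - \log_{10}(2.5))\, X_3 + \log_{10}(2.5)}\; [{\rm mm}],$
$0 \le X_n \le 1.$
$\overrightarrow{\rm Inp} = (x_1\; x_2\; \ldots\; x_n\; y_1\; y_2\; \ldots\; y_n)^{T}.$
$\overrightarrow{\rm Out} = 50^{-1} \min\{ (A_1\; A_2\; \ldots\; A_n)^{T},\, 50 \}.$
${\rm Err}(x) = D_{\rm Sim}(x) - D_{\rm Prog}(x).$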
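The parameter mapping and output normalization listed above can be sketched in code as follows. The sketch assumes the sign convention in which attenuation is positive (intensity relative to the first detector, with a leading minus sign) and reads the radius mapping as a logarithmic interpolation between 2.5 mm and 100 mm. Function and constant names are hypothetical, and the snippet is not the authors' implementation.

```python
import numpy as np

R_MIN_MM = 2.5     # radius at X_3 = 0, read from the listed radius formula
A_MAX_DB = 50.0    # clipping value used in the output normalization

def sample_to_geometry(x1, x2, x3):
    """Map normalized variables in [0, 1] to two angles (deg) and a radius (mm)."""
    alpha_deg = 360.0 * x1
    beta_deg = 360.0 * x2
    log_min = np.log10(R_MIN_MM)
    # Log-uniform radius: 2.5 mm at x3 = 0 and 10**2 = 100 mm at x3 = 1.
    radius_mm = 10.0 ** ((2.0 - log_min) * x3 + log_min)
    return alpha_deg, beta_deg, radius_mm

def attenuation_db(intensities):
    """Attenuation at each detector relative to the first one (assumed sign)."""
    i = np.asarray(intensities, dtype=float)
    return -10.0 * np.log10(i / i[0])

def normalize_output(a_db):
    """Clip attenuation at 50 dB and scale to [0, 1] for network training."""
    return np.minimum(np.asarray(a_db, dtype=float), A_MAX_DB) / A_MAX_DB

a = attenuation_db([1.0, 0.1, 1e-6])   # 0 dB, 10 dB, 60 dB
print(normalize_output(a))             # the 60 dB value is clipped to 1.0
```

Clipping at 50 dB keeps the training target bounded, which matches the 50 dB attenuation range quoted for the test data.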
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.