Optica Publishing Group

Parameter extraction and inverse design of semiconductor lasers based on the deep learning and particle swarm optimization method

Open Access

Abstract

A deep-learning artificial neural network (NN) combined with the particle swarm optimization (PSO) method has been proposed to inversely design the semiconductor laser with high accuracy and computational speed. This method is exempt from the single-solution problem of tandem NN and can be highly useful to extract the possible problematic parameters in the failure analysis of a device. The light-current curves and small signal responses have been tested against the benchmarks calculated by the traveling-wave model to demonstrate the NN’s robustness and efficiency in simulating the laser behavior for further use in the inverse design by PSO.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Optoelectronic devices such as semiconductor lasers are essential for high-speed, broadband optical communication systems. Modeling and design of such devices usually involve detailed characterization of the multi-physics interactions between quantized electrons/holes within nano-scale dimensions and optical fields oscillating in the micro/millimeter-scale laser cavity. By calculating the material gain in the quantum wells/dots (QWs/QDs) and the transverse/longitudinal modes in the 3D laser cavity separately, we can capture most of the laser device behavior by simulation [1]. However, inversely designing a laser, or extracting its key structural parameters, from given/desired light-current and small signal response curves is still challenging, because in practice, even with identical design parameters and processing/fabrication procedures, the performance of separate devices from the same wafer can differ considerably. A statistical summary that correlates the input parameters with the output response may suggest certain trends to guide a second design round, but this makes the process highly experimental, time-consuming, and dependent on human experience. It is therefore highly desirable to discern the device parameters objectively from its output curves, based not only on well-established multi-physics knowledge but also on an accurate predictive model that can be trained automatically on the abundant statistical data generated during the device design, manufacturing and testing steps. This may also be valuable for failure analysis and automatic characterization of a complicated device such as a laser, in which the problem-causing parameters can be identified by the model through the variations in the output curves.
Here, we propose an inverse design method based on the deep-learning neural network (NN) algorithm in combination with the particle swarm optimization (PSO) process to achieve automatic parameter extraction with minimal prior knowledge of the device, given only certain test results such as the light-current (L-I) curves and small signal responses (SSR). The demonstration is based on a training dataset calculated by the traveling wave model (TWM), which simulates the laser behavior and provides an inverse-design benchmark to test the accuracy and robustness of the NN-PSO method.

The conventional inverse design processes for photonic devices usually employ heuristics-based methods such as the genetic algorithm [2–4], simulated annealing [5,6] and swarm intelligence [7,8], as well as gradient-based methods such as steepest descent [9,10]. These methods can extract the target parameters if the governing equations of the device are given, which may involve numerical (e.g., finite difference method) [11–13] or analytical (e.g., transfer matrix method) [14] calculations of the equations during the device optimization or parameter extraction process. If these calculations are performed iteratively, however, capturing the intrinsic trends of the device can be highly time-consuming and computationally expensive, especially for semiconductor lasers, where nonlinear processes and multi-physics interactions are involved.

Recently, with the development of artificial neural network (ANN) methods in the forward design of metasurfaces [15], nanophotonics [16–20], and optical communication networks [21], the studied system can be represented accurately by a trained neural network, which avoids calculating the governing equations iteratively during the device optimization process. For inverse design using ANNs, a tandem network has been proposed [22,23] that cascades a pre-trained forward prediction net with an inverse-design net to remove the non-uniqueness problem (where multiple structures can correspond to the same spectrum) during training. However, this scheme also constrains the inverse net, leading to single-valued design parameters, i.e., the trained inverse network can only predict one regression structure for one given spectrum. For the design process this can still be acceptable, as only one design may be needed for fabrication, but for failure analysis and parameter extraction, as many configurations as possible have to be tabulated in order to reveal the potential problems in the device. Therefore, to tackle this single-solution problem of the tandem network and to save the effort of re-training the whole inverse net each time [22,23], we can combine the PSO method with an ANN to describe the mapping between the input and output spaces in both the forward and backward directions with high accuracy and computational speed. This method can also be generalized to many other inverse problems, where a single forward net needs to be trained only once for use by PSO to obtain multiple solutions during the inverse design process.

2. Forward training and PSO inverse design

2.1 Forward deep-learning neural network

To obtain the neural network that can predict the properties of semiconductor lasers with high accuracy, a fully connected (FC) deep-learning neural network (DLNN) is constructed, with 3 hidden layers and 50 neurons in each layer, as shown schematically in Fig. 1.

Fig. 1. The topology of forward NN with 7 input parameters and 200 output ports. The 7 inputs are the design parameters, while the 200 outputs represent the 80 output powers and 120 small signal responses, respectively.
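
The topology in Fig. 1 can be sketched as a plain forward pass. The following is only an illustration under stated assumptions, not the authors' implementation: the text specifies the layer sizes (7 inputs, three hidden layers of 50 neurons, 200 outputs), whereas the tanh activation, the random untrained weights, the ordering of the 200 outputs and the helper names `init_mlp` and `forward` are hypothetical.

```python
import math
import random

# From the text: 7 inputs, three hidden layers of 50 neurons, 200 outputs.
LAYER_SIZES = [7, 50, 50, 50, 200]


def init_mlp(sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Propagate one input vector: tanh on hidden layers (assumed), linear output."""
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        is_last = idx == len(layers) - 1
        x = z if is_last else [math.tanh(v) for v in z]
    return x


layers = init_mlp(LAYER_SIZES)
x_scaled = [0.5] * 7           # 7 design parameters, assumed pre-scaled to [0, 1]
y = forward(layers, x_scaled)
# Assumed ordering of the outputs: 80 L-I points, then 3 x 40 SSR points.
powers, ssr = y[:80], y[80:]
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

With these layer sizes the network has 15,700 trainable weights and biases, which is small enough to evaluate in well under a second.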

The 7 input parameters, which are selected from the many material/structural/operational ones, are the injection efficiency ηeffc (dimensionless), the reciprocal of heat capacity Rt (in K/J), the material series resistance Rs (in Ω), the heat sink temperature Ke (in K), the gain characteristic temperature coefficient Kg (in K), the carrier characteristic temperature coefficient Kn (in K) and the cavity loss (in cm⁻¹). Here, specifically but without loss of generality, we choose the parameters that may significantly affect the system behavior in the traveling wave model (TWM) simulation [11–13] (details of the TWM are given in the Appendix). These input parameters can readily be expanded to include any desired ones, according to the operation conditions and application scenarios. Values for the parameters X = [ηeffc, Rt, Rs, Ke, Kg, Kn, loss] are randomly selected within the specified ranges ηeffc = [0.4, 1]; Rt = [1e7, 1e9]; Rs = [0.5, 20]; Ke = [300, 360]; Kg = [50, 350]; Kn = [50, 350]; loss = [10, 50], respectively, for a total of 5000 different samples/combinations. Using the TWM-generated database, the NN can be trained to map the device behavior, in terms of the laser output powers and small signal responses, to the corresponding 7-parameter combinations. The output powers P = [p1, p2, p3, …, p80] are for 80 injection currents ranging from 10 to 168 mA with an interval of 2 mA, and the three SSR curves Zi (i = 1, 2, 3) = [zi,1, zi,2, zi,3, …, zi,40], under bias currents of 30, 50 and 70 mA, are for 40 different frequencies ranging from 0 to 20 GHz with an interval of 0.5 GHz. Note that, to remove the weight imbalance caused by the magnitude difference between the output powers and the three small signal responses, these four groups of data are normalized by their corresponding mean values before being combined into one database.
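
The sampling and normalization steps described here can be sketched as follows. This is a minimal illustration, assuming uniform random sampling within each range (the text says only "randomly selected"); the dictionary keys, the helper names and the toy output values are hypothetical stand-ins, since the real outputs come from the TWM.

```python
import random

# Training ranges quoted in the text, in the order of X.
RANGES = {
    "eta_effc": (0.4, 1.0),
    "Rt": (1e7, 1e9),
    "Rs": (0.5, 20.0),
    "Ke": (300.0, 360.0),
    "Kg": (50.0, 350.0),
    "Kn": (50.0, 350.0),
    "loss": (10.0, 50.0),
}


def sample_parameters(n_samples, seed=0):
    """Draw parameter vectors uniformly within RANGES (uniform draw is assumed)."""
    rng = random.Random(seed)
    return [[rng.uniform(lo, hi) for lo, hi in RANGES.values()]
            for _ in range(n_samples)]


def normalize_by_mean(group):
    """Divide every value of one output group (a list of curves) by the group mean."""
    values = [v for curve in group for v in curve]
    mean = sum(values) / len(values)
    return [[v / mean for v in curve] for curve in group]


X = sample_parameters(5000)
currents_mA = list(range(10, 170, 2))   # 10 to 168 mA in 2 mA steps

# Toy stand-ins for TWM outputs of two samples (hypothetical numbers).
powers = [[10.0, 20.0, 30.0], [20.0, 40.0, 60.0]]
ssr_30mA = [[1.0, 3.0], [2.0, 2.0]]
powers_n = normalize_by_mean(powers)
ssr_n = normalize_by_mean(ssr_30mA)
```

After normalization each group has a mean of one, so the power and SSR points contribute comparable magnitudes to the training error.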

To choose a proper NN topology, three networks with different numbers of hidden layers (1 to 3 layers, each containing 50 neurons) are studied. The training curves in terms of the mean square error (MSE, i.e., the difference between the NN-predicted L-I + SSR curves and the TWM-calculated ones) are shown in Fig. 2. The MSE for the 3-layer DLNN converges to the lowest level, about 10⁻⁴, which indicates that this network is the most accurate of the three for predicting the output curves for each combination of the 7 design parameters in place of the TWM. The average CPU time to generate a group of L-I and SSR curves by the neural network is 0.08 s, more than three orders of magnitude faster than the TWM, which takes 149.35 s to obtain the results on the same computational facility.

Fig. 2. Training curves in terms of the mean square errors (MSE) for different neural networks with one/two/three hidden layers (each layer containing 50 neurons), where the 3-layer DLNN can converge to the lowest level of MSE of about 10⁻⁴.

To study the dependency of the neural network on the size of the training dataset, we calculate the MSE of the L-I and SSR curves predicted by the NN for different dataset sizes, as shown in Fig. 3. The training sets are formed by the first 200, 300, 500, 1000, 2000, 3000 and 4000 samples of the 5000 in the total dataset, respectively. During training, the datasets are further split at each epoch into 3 parts for the training, validation, and simultaneous temporary testing processes, in the proportion 70:15:15 (e.g., 140 for training, 30 for validation and 30 for temporary testing in the 200-sample case). Once the networks are trained, we use the final test dataset, i.e., the reserved last 500 of the 5000 prepared samples, to objectively compare the performance and accuracy of all the nets. The testing error decreases for larger training sets (as shown by the black-squared dotted line in Fig. 3), which indicates that the network accuracy improves, with fewer fluctuations, at sufficiently large sampling sizes. However, if we take into consideration the extra CPU time needed to build a larger dataset (as shown by the blue-triangle straight line), the dataset size also has to be optimized to balance the NN prediction accuracy against the computational efficiency. Note that, as indicated by the difference between the two MSE curves (i.e., the testing and training ones), a training dataset that is too small (fewer than 500 samples) can lead to over-fitted networks and less accurate predictions. In practical applications this can be avoided, as abundant statistical data usually exist from the device design, manufacturing and testing steps to satisfy the sampling size requirement for network training.
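
The 70:15:15 bookkeeping can be checked with a few lines. This is a sketch only: the function name `split_counts` and the choice to assign any rounding remainder to the temporary-test part are assumptions, not details given in the text.

```python
def split_counts(n_samples, fractions=(0.70, 0.15, 0.15)):
    """Return (train, validation, temporary-test) counts for the given proportions."""
    n_train = round(n_samples * fractions[0])
    n_val = round(n_samples * fractions[1])
    n_test = n_samples - n_train - n_val   # remainder, so the counts always add up
    return n_train, n_val, n_test


TOTAL = 5000
indices = list(range(TOTAL))
final_test = indices[-500:]                # reserved last 500 samples

# Training-set sizes studied in the text (always the first N samples).
splits = {size: split_counts(size)
          for size in (200, 300, 500, 1000, 2000, 3000, 4000)}
for size, counts in splits.items():
    print(size, counts)
```

For the 200-sample case this gives 140, 30 and 30 samples, matching the example in the text, and the largest training set (the first 4000 samples) never overlaps the reserved final test set.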

Fig. 3. Dependency of the neural network accuracy and total computational time (blue-triangle straight-line, as is circled out and pointed to the right-side axis) on the dataset sizes, where the accuracy is shown in terms of MSE for the testing (black-squared dotted-line) and training sets (red-circled dash-dotted line, as are both circled out and pointed to the left-side axis), respectively.

To further verify the ability of forward network in predicting the L-I and SSR of a laser, we randomly select 2 samples out of the 500 ones in the final testing dataset (different from the ones in the training dataset) to show comparison between the neural network predictions and the original TWM generated ones as in Figs. 4(a)-(d).

Fig. 4. (a) L-I, and (b) SSR curves generated by the neural network, and compared with the TWM method. The SSR curves are under bias currents of 30/50/70 mA. The same for (c) L-I and (d) SSR for another sample. (e) Histogram of the normalized L-I / SSR values in the training dataset, with the corresponding NN prediction errors (i.e., difference between the NN predicted value and the benchmark TWM one) of the final-test dataset for each column of the L-I / SSR histogram. (f) The histogram of NN prediction errors for all the points in the final test-set.

To examine the correlation between the NN prediction error and the training dataset distribution, we plot the histogram of all the points in the training set (4000 samples, where each sample contains 200 output points of the normalized power and small signal response curves) and compare it with the corresponding NN prediction error of the final-test dataset (500 samples) for each column of the training-set L-I / SSR histogram, as in Fig. 4(e). The NN error tends to be higher where the NN has seen fewer examples in the training set, for example for the higher-performing devices with larger L-I / SSR values. We also plot the histogram of NN errors for all the points in the final test dataset, as in Fig. 4(f). A log scale is used for this plot because close-to-zero points dominate the distribution, reaching almost the 10⁴ level. It can be seen that the NN maps our laser system with high accuracy and captures the complicated nonlinear correlations between the multiple design parameters and the system responses.

As an important aspect of laser modeling, we test the network over different temperatures by fixing all the design parameters except the heat sink temperature Ke, for which three values are randomly selected within the 300 to 360 K range to carry out the NN and TWM calculations. Figure 5 shows that the neural network accurately predicts the three L-I curves as compared to the TWM benchmarks, so that it can be used to inversely design the laser from any desired spectrum without resorting to lengthy, iterative TWM simulations during the PSO searching process (as discussed in the following section). Here, the 7 input parameters are X = [0.8, 7e8, 19, T, 111, 124, 34] with T = 302 K, 331 K and 354 K, respectively.

Fig. 5. Comparison of the neural network predictions with the TWM benchmarks, for parameters X = [0.8, 7e8, 19, T, 111, 124, 34] at randomly-selected temperatures T = 302 K, 331 K and 354 K, respectively.

2.2 Inverse design

For the inverse design and parameter extraction of semiconductor lasers, whose rate equations are dominated by nonlinear processes with time-dependent solutions, we can use the heuristics-based PSO method [7,8] to avoid the single-solution problem associated with the inverse tandem network [22,23]. With the help of the NN instead of the TWM, the PSO method can efficiently search for the parameter values, whether single or multiple solutions, that correspond to a desired spectrum. It also tracks the local and global minimums/optimums simultaneously, together with their history over the epochs, so that the NN-PSO search has a better chance of escaping a local minimum on the way to the global optimum.

The schematic diagram and flow chart for the PSO method are shown in Figs. 6(a) and (b). Each blue particle in Fig. 6(a) (representing one combination of the 7 parameters) keeps its own history of the closest predictions (pbest, yellow dots) for the target function. After comparing the predictions of all the particles in every epoch, the global optimal position (gbest) is obtained, such that the swarm of particles can get their new sampling positions, in terms of the moving velocities and directions, according to the prediction closeness (i.e., the fitness function) of the current pbest and gbest positions [7,8]. This shared/interconnected sampling scheme, when combined with the NN instead of the original TWM simulations, greatly improves the convergence speed. During the searching process, the fitness function used for the PSO is the error function $\textrm{MSE} = \sqrt {\frac{1}{{n - 1}}\sum\limits_{i = 1}^n {{{({{r^{\prime}}_i} - {r_i})}^2}} }$, which estimates the difference between the predicted L-I / SSR curves ($r^{\prime}_i$) and the desired ones ($r_i$) for the present epoch as well as the history epochs, where n is the number of points on the compared curves. We randomly initialize the PSO within the value ranges ηeffc = [0.2, 1]; Rt = [1e7, 2e9]; Rs = [0.5, 20]; Ke = [260, 380]; Kg = [10, 360]; Kn = [10, 360]; loss = [10, 50], as listed in Table 1, and generate the searching parameters according to the PSO algorithm to find possible solutions for the desired spectra. For the TWM-PSO simulations of lasers in our case, it takes about 192 hours to get one set of design parameters, while the NN-PSO takes only 49 seconds (about 14,106 times faster than the TWM-PSO method), as shown by its convergence curve in Fig. 6(c). The error with respect to the target/desired spectra can be minimized below 2×10⁻³ globally.
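
The pbest/gbest update loop can be sketched with a basic global-best PSO. This is an illustration under stated assumptions rather than the authors' code: the inertia weight `w`, the acceleration constants `c1` and `c2`, the swarm size and the boundary clamping are assumed values, and `toy_surrogate` is a hypothetical two-parameter stand-in for the trained forward NN. Only the fitness expression is taken from the text.

```python
import math
import random


def fitness(predicted, target):
    """Fitness from the text: sqrt( sum((r'_i - r_i)^2) / (n - 1) )."""
    n = len(target)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / (n - 1))


def toy_surrogate(x):
    """Hypothetical stand-in for the trained forward NN (2 inputs -> 5-point curve)."""
    a, b = x
    return [a * i + b * i * i for i in range(5)]


def pso(model, target, bounds, n_particles=30, n_epochs=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; w, c1, c2 and the swarm size are assumed values."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(model(p), target) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_epochs):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            fit = fitness(model(pos[i]), target)
            if fit < pbest_fit[i]:                 # update the particle's own best
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:                # update the swarm's best
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


target_curve = toy_surrogate([0.7, 0.2])
solution, error = pso(toy_surrogate, target_curve, bounds=[(0.0, 1.0), (0.0, 1.0)])
```

Because every fitness evaluation is a single call to the surrogate, replacing a slow simulator with a fast network shortens each epoch by the same factor. Repeating the search with different seeds is one way to collect several candidate solutions for the same target.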

Fig. 6. (a) The working principle of NN-PSO method. (b) The flow chart. (c) The convergence curve of NN-PSO, for the original parameter set of X = [0.720, 5.373×10⁸, 11.833, 359.673, 67.448, 331.619, 14.044].

Table 1. Two groups of design parameters corresponding to one set of spectrum

Owing to the nonlinear nature of semiconductor laser operation, inverse design and parameter extraction of the device can be a multi-solution problem, i.e., one set of spectra can correspond to multiple combinations of the design parameters, as shown in Table 1, where 2 rounds of extracted parameters are listed and compared with the original parameters.

To verify the NN-PSO designs, the L-I and SSR curves for the S1 and S2 parameters are plotted in Fig. 7 against the curves for the original parameters, where the TWM is used for all the curve calculations. The small fitness value (∼10⁻³, as in Table 1) and the close agreement of these curves indicate that the S1 and S2 parameters can be viewed as multiple solutions to the desired spectrum. Here, we have extended the PSO searching range slightly outside the NN training range, as in Table 1, to test the method's robustness and to avoid hitting the boundaries during searching.

Fig. 7. TWM verifications of the curves for S1 and S2 parameters, as compared to the original design.

As plotted in Figs. 8(a)-(g) for another 50 rounds of the PSO searching, whose means and standard deviations (STD) are listed in Table 2, the inverse design parameters can be obtained by the NN-PSO method automatically with high accuracy and computational efficiency, instead of testing them manually as in Ref. [1]. This also shows that the parameters generated by NN-PSO are not totally random, but are distributed close to the original values within a certain error range, which indicates that this method may be capable of solving the multi-solution problem for the nonlinear lasing process of semiconductor lasers, as compared to the tandem network [22,23] or GAN schemes [15]. To verify the designs, we also plot the fitness function distribution in Fig. 8(h) to show the converged values (1 to 1.5×10⁻³) of the PSO after 500 epochs, for 50 repeated rounds.

Fig. 8. (a)-(g) Statistical distribution of one randomly selected group for the inverse design parameters calculated by the NN-PSO method. The blue dots represent the parameter values generated by NN-PSO, the dashed lines represent their statistical averages and the red solid lines are the original true values; (h) the fitness function distribution of PSO at Epoch 500 for 50 repeated rounds.

Table 2. Mean and standard deviation of the 7 parameters inversely designed by NN-PSO method, as compared to the original pre-set design parameters

As can be seen from Figs. 8(a)-(g), and as summarized in Table 2, the distributions of the quantities Rt, Rs, Kg and Kn are more scattered, since they enter the exponential formulas of Eqs. (6) and (12) in the Appendix for the gain calculation in the TWM. A small change in these quantities can therefore induce a much larger change in the L-I and SSR curves and cause fluctuations.
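
The per-parameter statistics over repeated PSO rounds can be computed as in the sketch below. It assumes the sample standard deviation (the text does not say whether the sample or population form is used), and the three toy rounds and two parameter names are hypothetical placeholders, not values from Table 2.

```python
import statistics


def summarize_rounds(rounds, names):
    """Mean and sample standard deviation of each parameter over repeated rounds."""
    summary = {}
    for k, name in enumerate(names):
        column = [r[k] for r in rounds]
        mean = statistics.mean(column)
        std = statistics.stdev(column)       # sample STD (assumed convention)
        summary[name] = (mean, std, std / abs(mean))  # last entry: relative scatter
    return summary


# Hypothetical extracted values from three PSO rounds (not data from the paper).
toy_rounds = [[0.70, 100.0], [0.72, 120.0], [0.74, 80.0]]
summary = summarize_rounds(toy_rounds, ["eta_effc", "Kg"])
```

Comparing the relative scatter (standard deviation divided by the mean) puts parameters with very different magnitudes, such as ηeffc and Rt, on a common scale.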

To further verify the NN-PSO method's inverse design ability and prediction accuracy for different sets of parameters, we show the results for the first eleven groups in the testing dataset in Fig. 9. The plots show the inversely designed 7-parameter values (with error bars for the standard deviation) as compared to their corresponding original values (red dots). The quantities Rt, Rs, Kg and Kn are again more scattered, as discussed previously, while ηeffc, Ke and loss are predicted more accurately owing to their non-exponential nature.

Fig. 9. Mean and standard deviation of the 7 parameters inversely designed by NN-PSO method, for 11 groups of data in the test dataset.

3. Summary

In summary, the deep-learning artificial neural network combined with the particle swarm optimization method has been proposed to inversely design the semiconductor laser system and solve the single-solution problem of tandem network during mapping with high accuracy and computational speed. The light-current curves and SSRs have been tested against the traveling-wave model benchmark, and the method’s robustness and efficiency in simulating the laser behavior for parameter extraction and inverse design are demonstrated.

Appendix: TWM method for dataset generation

By combining the conventional transfer matrix method with the time-domain evolution of the electrical and optical fields, the time-domain traveling wave method (TD-TWM) has been developed as a cascade of elementary transfer matrices, i.e., the scattering and propagation matrices, to describe the time-space evolution of the waves propagating along the laser cavity. As in Refs. [11,13], we can write the forward and backward optical fields at the two boundaries of each structural section k at time t as follows

$$\left[ \begin{array}{c} {E_f}(t + dt,k + 1)\\ {E_b}(t,k + 1) \end{array} \right] = [{{A_k}(t)} ]\left[ \begin{array}{c} {E_f}(t,k)\\ {E_b}(t + dt,k) \end{array} \right] = \left[ \begin{array}{l} {a_{11}}(t,k)\textrm{ }{a_{12}}(t,k)\\ {a_{21}}(t,k)\textrm{ }{a_{22}}(t,k) \end{array} \right]\left[ \begin{array}{c} {E_f}(t,k)\\ {E_b}(t + dt,k) \end{array} \right],$$
where $[{A_k}(t)]$ is composed of the propagation matrix P for a uniform medium of length l, and the scattering matrix Tij for an index jump from ni to nj, as
$$P\textrm{ = }\left[ \begin{array}{cc} {e^{ - j\beta l}}\textrm{ }&0\\ 0\textrm{ }&{e^{j\beta l}} \end{array} \right]$$
$${T_{ij}} = \frac{1}{{2{n_j}}}\left[ \begin{array}{cc} {n_j} + {n_i}\textrm{ }&{n_j} - {n_i}\\ {n_j} - {n_i}\textrm{ }&{n_j} + {n_i} \end{array} \right].$$
Here, $\beta = {n_{eff}}{k_0} - j\alpha$ is the effective propagation wave vector along z-direction, with $\alpha $ the waveguide loss. At two ends of the waveguide $z = 0$ and $z = L$, we have to require that the fields are connected by the left and the right end-facet reflectivities ${r_l}$ and ${r_r}$ as
$${E_f}(t + dt,0) = {r_l}{E_b}(t,0),$$
$${E_b}(t + dt,L) = {r_r}{E_f}(t,L),$$
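
The two elementary matrices can be written out numerically as a quick consistency check. The sketch below implements only the expressions for P and T_ij given above; the numerical values (wavelength, effective index, loss, section length, indices) are placeholders chosen for illustration, and how the matrices are cascaded into $[A_k(t)]$ is not reproduced here.

```python
import cmath
import math


def propagation_matrix(beta, length):
    """P = diag(exp(-j*beta*l), exp(+j*beta*l)) for a uniform section of length l."""
    return [[cmath.exp(-1j * beta * length), 0j],
            [0j, cmath.exp(1j * beta * length)]]


def scattering_matrix(n_i, n_j):
    """T_ij for an index step from n_i to n_j."""
    f = 1.0 / (2.0 * n_j)
    return [[f * (n_j + n_i), f * (n_j - n_i)],
            [f * (n_j - n_i), f * (n_j + n_i)]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


# Placeholder values (assumptions, not the parameters of Table 3).
wavelength = 1.55e-6                    # m
n_eff, alpha = 3.2, 1000.0              # effective index, loss in 1/m
length = 300e-6                         # m
k0 = 2.0 * math.pi / wavelength
beta = n_eff * k0 - 1j * alpha          # beta = n_eff*k0 - j*alpha

P = propagation_matrix(beta, length)
forward_amplitude = abs(P[0][0])        # equals exp(-alpha * length)
round_trip = matmul2(scattering_matrix(3.2, 3.4), scattering_matrix(3.4, 3.2))
```

With this sign convention the forward-wave factor decays as exp(-αl), and stepping the index from n_i to n_j and back again returns the identity matrix, as expected for a lossless interface pair.
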
The carrier equation that governs the electron-photon relations, and the temperature in the active region, can be expressed respectively as
$$N({t + dt,k} )= N(t,k) + dt\left[ {\eta \frac{{J(t,k)}}{{ed}} - {R_{sp}}(N(t,k)) - {v_g}g(t,k)S(t,k)} \right],$$
$$\begin{array}{l} T(z,t) = {K_e} + \sum\nolimits_{n = 0}^\infty {t_n^1} \exp ( - {D_n}t)\left[ {\frac{{I_{{s_0}}^2}}{{{D_n}}} + \int_0^t {\exp ({D_n}\tau ){I^2}(z,\tau )d\tau } } \right]\\ + \sum\nolimits_{n = 0}^\infty {t_n^2} \exp ( - {D_n}t)\left( {\frac{{{I_{{s_0}}} - {X_{{s_0}}}}}{{{D_n}}} + \int_0^t {\exp ({D_n}\tau )[I(z,\tau ) - X(\tau )]d\tau } } \right), \end{array}$$
where $T(z,t)$ is the active region temperature as a function of position and time, D is the thermal diffusion coefficient, ${R_s}$ and ${R_t}$ are the material series resistance and the reciprocal of thermal capacity, respectively, ${E_g}$ is the band gap energy in the active region, ${P_{out}}$ is the total output power from the laser, ${I_{s0}}$ is the static injected current at the initial time, ${X_{s0}}$ is a constant related to the initial static output power, and the other parameters are defined as follows:
$${D_n} = {\left[ {{{(n + \frac{1}{2})\pi } / h}} \right]^2}D,$$
$$t_n^1 = {( - 1)^n}\frac{2}{h}\int_0^h {{R_t}(x){R_s}(x)\sin \left[ {{{(n + \frac{1}{2})\pi x} / h}} \right]} \textrm{ }dx,$$
$$t_n^2 = \frac{{2{E_g}d}}{{eh}}{R_t}(h),$$
$$X(t) = \frac{e}{{{E_g}}}{P_{out}}(t),$$
In the carrier equation, the parameters are the injection current density J, the injection efficiency η, the active region thickness d, the spontaneous recombination rate ${R_{sp}}$, the material gain g and the photon density S, respectively. The detailed expressions for some of these parameters can be found in Ref. [12]. Higher injection can also increase the temperature as
$$\Delta T = T - [{K_e} + f({I^2}) + g(I - a{P_{out}})],$$
where f and g are functions of the device structure as expressed in Ref. [12], and a is the conversion coefficient. At the same time, the differential gain and the transparency carrier density in the gain formula have to be modified to include the temperature factor as
$${g_k}({N_k},{S_k}) = \frac{{\frac{{dg}}{{dN}}{e^{ - \frac{{\varDelta T}}{{{K_g}}}}}\ln ({e^{ - \frac{{\varDelta T}}{{{K_n}}}}}{N_k}/{N_{tr}})}}{{1 + \varepsilon {S_k}}},$$
where ${K_\textrm{g}}$ and ${K_\textrm{n}}$ are the characteristic temperatures for $dg/dN$ and ${N_{tr}}$, respectively. This affects the resonance and lasing threshold conditions of the structure, and especially the L-I curve. The parameters needed for these approximations can be extracted from experiments already performed on the same materials. The laser parameters are listed in Table 3.
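
The temperature-dependent gain and the explicit time step of the carrier equation can be sketched as follows. The two functions transcribe the formulas above; everything else is an assumption for illustration: the numerical values are placeholders rather than the entries of Table 3, and the linear recombination law `N / tau` is a hypothetical stand-in for $R_{sp}(N)$, whose actual form is given in Ref. [12].

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C


def material_gain(N, S, dT, dg_dN, N_tr, eps, Kg, Kn):
    """g = dg/dN * exp(-dT/Kg) * ln(exp(-dT/Kn) * N / N_tr) / (1 + eps * S)."""
    numerator = dg_dN * math.exp(-dT / Kg) * math.log(math.exp(-dT / Kn) * N / N_tr)
    return numerator / (1.0 + eps * S)


def carrier_step(N, S, J, g, dt, eta, d, v_g, R_sp):
    """One explicit step: N(t+dt) = N + dt*[eta*J/(e*d) - R_sp(N) - v_g*g*S]."""
    return N + dt * (eta * J / (E_CHARGE * d) - R_sp(N) - v_g * g * S)


# Placeholder values in SI units (assumptions, not the paper's parameters).
N_tr, dg_dN, eps = 1.0e24, 5.0e-20, 1.0e-23
Kg, Kn = 111.0, 124.0
tau = 1.0e-9


def R_sp(N):
    """Hypothetical linear recombination rate N / tau."""
    return N / tau


g_cold = material_gain(2.0e24, 0.0, 0.0, dg_dN, N_tr, eps, Kg, Kn)
g_hot = material_gain(2.0e24, 0.0, 20.0, dg_dN, N_tr, eps, Kg, Kn)
N_next = carrier_step(2.0e24, 0.0, 0.0, g_cold, 1.0e-12, 0.8, 1.0e-7, 8.0e7, R_sp)
```

A temperature rise of 20 K lowers the gain at fixed carrier density through both exponential factors, which illustrates why the extracted Kg and Kn are sensitive to small changes in the output curves.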

Table 3. Laser parameters

Funding

National Key Research and Development Program of China (2018YFA0209000).

Acknowledgement

We thank the referee for the helpful comments. Special thanks also go to Pei Feng and Professor Qiang Wu for the helpful discussion.

Disclosures

The authors declare no conflicts of interest.

References

1. X. Li, Optoelectronic devices: design, modeling, and simulation. (Cambridge University, 2009).

2. R. S. Hegde, “Photonics inverse design: pairing deep neural networks with evolutionary algorithms,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–8 (2020).

3. S. F. Shu, “Evolving ultrafast laser information by a learning genetic algorithm combined with a knowledge base,” IEEE Photonics Technol. Lett. 18(2), 379–381 (2006).

4. P. H. Fu, T. Y. Huang, K. W. Fan, and D. W. Huang, “Optimization for ultrabroadband polarization beam splitters using a genetic algorithm,” IEEE Photonics J. 11(1), 1–11 (2019).

5. S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science 220(4598), 671–680 (1983).

6. S. Zommer, E. N. Ribak, S. G. Lipson, and J. Adler, “Simulated annealing in ocular adaptive optics,” Opt. Lett. 31(7), 939–941 (2006).

7. J. Kennedy and R. Eberhart, “Particle Swarm Optimization,” Proc. of IEEE Int. Conf. on Neural Networks 4, 1942–1948 (1995).

8. J. Robinson and Y. R. Samii, “Particle swarm optimization in electromagnetics,” IEEE Trans. Antennas Propag. 52(2), 397–407 (2004).

9. S. W. Piche, “Steepest descent algorithms for neural network controllers and filters,” IEEE Trans. Neural Netw. 5(2), 198–212 (1994).

10. N. A. Ahmad, “A globally convergent stochastic pairwise conjugate gradient-based algorithm for adaptive filtering,” IEEE Signal Process. Lett. 15, 914–917 (2008).

11. M. G. Davis and R. F. O’Dowd, “A transfer matrix method based large-signal dynamic model for multielectrode DFB lasers,” IEEE J. Quantum Electron. 30(11), 2458–2466 (1994).

12. W. Li, X. Li, and W. P. Huang, “A traveling-wave model of laser diodes with consideration for thermal effects,” Opt. Quantum Electron. 36(8), 709–724 (2004).

13. Y. Li, Y. P. Xi, X. Li, and W. P. Huang, “Design and analysis of single mode Fabry-Perot lasers with high speed modulation capability,” Opt. Express 19(13), 12131–12140 (2011).

14. P. Yeh, Optical waves in layered media (Wiley, 1988).

15. Z. C. Liu, D. Zhu, S. P. Rodrigues, K. T. Lee, and W. Cai, “Generative model for the inverse design of metasurfaces,” Nano Lett. 18(10), 6570–6576 (2018).

16. J. Peurifoy, Y. C. Shen, L. Jing, Y. Yang, F. Cano-Renteria, B. G. DeLacy, J. D. Joannopoulos, M. Tegmark, and M. Soljacic, “Nanophotonic particle simulation and inverse design using artificial neural networks,” Sci. Adv. 4(6), eaar4206 (2018).

17. D. Zibar, A. M. R. Brusin, U. C. de Moura, F. D. Ros, V. Curri, and A. Carena, “Inverse system design using machine learning: the Raman amplifier case,” J. Lightwave Technol. 38(4), 736–753 (2020).

18. D. Melati, Y. Grinberg, M. K. Dezfouli, S. Janz, P. Cheben, J. H. Schmid, A. Sanchez-Postigo, and D. X. Xu, “Mapping the global design space of nanophotonic components using machine learning pattern recognition,” Nat. Commun. 10(1), 4775 (2019).

19. G. P. P. Pun, R. Batra, R. Ramprasad, and Y. Mishin, “Physically informed artificial neural networks for atomistic modeling of materials,” Nat. Commun. 10(1), 2339 (2019).

20. B. Hu, B. Wu, D. Tan, J. Xu, and Y. Chen, “Robust inverse-design of scattering spectrum in core-shell structure using modified denoising autoencoder neural network,” Opt. Express 27(25), 36276–36285 (2019).

21. D. Wang, M. Zhang, Z. Li, C. Song, M. Fu, J. Li, and X. Chen, “System impairment compensation in coherent optical communications by using a bio-inspired detector based on artificial neural network and genetic algorithm,” Opt. Commun. 399, 1–12 (2017).

22. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design of nanophotonic structures,” ACS Photonics 5(4), 1365–1369 (2018).

23. Y. Long, J. Ren, Y. Li, and H. Chen, “Inverse design of photonic topological state via machine learning,” Appl. Phys. Lett. 114(18), 181105 (2019).
