
Analyzing modal power in multi-mode waveguide via machine learning

Open Access

Abstract

A machine learning assisted modal power analyzing scheme designed for optical modes in integrated multi-mode waveguides is proposed and studied in this work. Convolutional neural networks (CNNs) are successfully trained to correlate the far-field diffraction intensity pattern of a superposition of multiple waveguide modes with its modal power distribution. In particular, a specialized CNN is trained to analyze thin optical waveguides, which are single-moded along one axis and multi-moded along the other. A full-scale CNN is also trained to cross-validate the results obtained from this specialized CNN model. Prediction accuracy for modal power is benchmarked statistically with square error and absolute error distributions. The overall accuracy of our trained specialized CNN is found to be very satisfactory for thin optical waveguides, while that of our trained full-scale CNN remains nearly unchanged but the training time doubles. This approach is further generalized and applied to a waveguide that is multi-moded along both the horizontal and vertical axes, and the influence of noise on our trained network is studied. Overall, we find that the performance in this general condition remains nearly unchanged. This new concept of analyzing modal power may open the door to high fidelity information recovery in the far field and holds great promise for potential applications in both integrated and fiber-based spatial-division demultiplexing.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Modal power analysis is usually performed after separating modes by spatial mode sorting technologies [1–8]. Integrated multi-mode waveguides and fibers are two typical optical interconnection platforms that can implement spatial-division multiplexing and demultiplexing. Modal power analysis is necessary in this new multiplexing technology when optical data carried by different modes need to be detected. On the SOI platform, spatial mode sorting utilizes mode coupling in optical devices such as directional couplers [9,10]. The inter-waveguide mode coupling process can be designed to emulate an eigenmode projection operation. This is done by selective phase matching between a specific optical mode in a deliberately designed multi-mode waveguide and the optical mode in a single-mode waveguide, so that the optical power carried by the mode in the multi-mode waveguide can be very efficiently coupled, or projected, into the single-mode waveguide. By repeating this process for the rest of the modes in the multi-mode waveguide, spatial mode sorting can be carried out [11–14]. On the fiber platform, tilted Bragg gratings imprinted into the fiber can be used to realize spatial mode sorting [15,16]. Some approaches capable of extracting both amplitudes and phases have also been demonstrated [17–21]. By mapping the fiber modes to different frequencies, both the amplitude and the phase of each mode can be extracted without any reference light [17]. Nevertheless, all these methods rely on spatial mode sorting techniques that are implemented at the device level. Here we report a novel modal power analyzing method that does not rely on spatial mode sorting, which eliminates the complexity at the device level and shifts the burden to the data processing level. Put simply, our technique does not involve the actual spatial separation of the modes. Instead, the coherent far-field diffraction image contributed by all the modes is processed as a whole, i.e., as a gray-scale image. The key concept behind our method relies on the fact that a certain correlation exists between the modal power distribution and the corresponding far-field intensity pattern, which can in practice be conveniently captured with a charge-coupled device (CCD) or receiver arrays. To uncover this correlation, CNNs are trained with synthetic sample sets consisting of far-field intensity patterns labeled with the corresponding modal power distributions.

Convolutional neural networks have been shown to be very efficient for processing data and discovering patterns in grid-like data structures, e.g., images [22–25]. They feature sparse interactions and parameter sharing, which can significantly reduce the storage burden of parameters [26]. A typical convolutional layer consists of three stages: the convolution stage performs convolutions to produce linear activations, the detector stage applies a nonlinear activation, and the pooling stage further modifies the output of the layer and naturally yields invariance to translation when max-pooling is used [27]. In our case, a CNN architecture consisting of two convolutional layers and three fully connected layers is first designed to deal with thin optical waveguides, and it is later extended to five convolutional layers and three fully connected layers to tackle a more general waveguide. The convolution operation takes the pixelated image pattern and the kernel function, or filter, as its two input parameters [28]. Our trained CNNs are very successful when taking either processed or unprocessed far-field intensity patterns as input, and they are capable of predicting the modal power distribution with very high accuracy. Notice that our method focuses on analyzing the power of different modes rather than spatially separating them, which has potential applications in cases where the optical data need to be detected before further actions are taken, e.g., at the optical output of an optical router or an optical-electrical-optical repeater. We believe our proposed approach is novel, clearly belongs to a different paradigm that has recently attracted a lot of interest in other fields, and will impact future optical systems.

2. Theories and methods

2.1 Theoretical model

Our training set consists of samples of far-field intensity patterns labeled with their known modal power distributions. They can be collected either in experiments or by numerical computation. Here they are obtained numerically by projecting a variety of random superpositions of a set of orthogonal waveguide modes into the far field and recording the corresponding intensity patterns. The performance of our trained neural networks is benchmarked by the percentage of test samples whose predictions fall within a predetermined square error or absolute error tolerance.

Herein we consider the air-cladded silicon-on-insulator (SOI) waveguide shown in Fig. 1(a) as an example to demonstrate our concept of machine learning assisted modal power analysis. The SOI waveguide is assumed to be a multimode bus waveguide commonly found in the emerging spatial-division multiplexing scheme. The width and height of this SOI waveguide are chosen, for instance, as w = 1.6 μm and h = 0.22 μm, respectively. This thin waveguide supports only three quasi-TE optical modes at a wavelength λ = 1.55 μm, namely the TE00, TE10 and TE20 modes, as shown in Fig. 1(b). An arbitrary guided optical field near the exit of the SOI waveguide, $E(x,y)$, can thus be written as a linear superposition, $E(x,y)=\sum_i a_i E_i(x,y)$, where $a_i$ and $E_i(x,y)$ are the complex amplitude and the normalized electric field of mode i, respectively. Owing to orthogonality, i.e., $\iint E_i(x,y)E_j(x,y)\,dx\,dy=\delta_{ij}$, the complex amplitudes of the three modes can be calculated by projecting the total electric field $E(x,y)$ onto each waveguide mode. However, in practice, the far-field intensity pattern $I(x',y')$ described in Eq. (1), rather than the field inside the waveguide, is more conveniently captured by an imaging device (e.g., a CCD).
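As a concrete illustration of the projection step described above, the following minimal NumPy sketch computes the modal amplitudes by overlap integrals and the normalized modal powers. The function names, the grid spacings dx and dy, and the assumption of orthonormal mode fields sampled on a common grid are illustrative choices, not part of the original work.

```python
import numpy as np

def modal_amplitudes(E_total, modes, dx, dy):
    """Project a total field onto a set of orthonormal waveguide modes.

    E_total : 2D complex array, field sampled on an (x, y) grid
    modes   : list of 2D arrays E_i(x, y), assumed orthonormal on that grid
    Returns the complex amplitudes a_i; for real mode fields the conjugate
    reduces to the plain overlap integral used in the text.
    """
    return np.array([np.sum(np.conj(Ei) * E_total) * dx * dy for Ei in modes])

def normalized_modal_powers(a):
    """Modal power distribution A_i = |a_i|^2 / sum_j |a_j|^2."""
    p = np.abs(a) ** 2
    return p / p.sum()
```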


Fig. 1 SOI waveguide under consideration. (a) A schematic with a superimposed mode at the exit of the SOI waveguide and the corresponding far-field pattern that can be easily recorded by CCD and analyzed by CNN. (b) Pseudo-color images showing the calculated mode profile for the three quasi-TE modes of this SOI waveguide with w = 1.6µm, h = 0.22µm, and λ = 1.55µm.


$$I(x',y') \propto \left|E_{ff}(x',y';\mathbf{a})\right|^{2} = \left|\iint \sum_{i} a_{i} E_{i}(x,y)\,\exp\!\left[\,j\frac{2\pi}{\lambda z}\left(x'x + y'y\right)\right] dx\,dy\right|^{2} \tag{1}$$

In this scheme, the phase information of the field is unrecoverable and thus full knowledge of the field is unavailable, rendering the projection method discussed above inapplicable. In order to retrieve the modal power distribution, a mapping function or operation that maps the intensity pattern I to the modal power distribution (i.e., $[\bar{A}_1,\bar{A}_2,\bar{A}_3]$ in our three-mode system, where $\bar{A}_i=|a_i|^2/\sum_j |a_j|^2$) needs to be found. The essence of this task is finding feature maps, where machine learning based CNNs have clear advantages over other approaches.

2.2 Steps to generate and preprocess synthetic data set

  • 1. Compute the three quasi-TE eigenmodes of the SOI waveguide with a finite difference eigenmode solver [29].
  • 2. Generate 10000 groups of $[a_1,a_2,a_3]$ with their absolute values uniformly distributed in the range [0, 1) and their phase values uniformly distributed in the range [−π, π), and then calculate $[\bar{A}_1,\bar{A}_2,\bar{A}_3]$ as the label.
  • 3. Calculate the far-field intensity pattern I according to Eq. (1).
  • 4. For modes TE$_{mn}$:
    • ■ If max(m) = 0 or max(n) = 0, compress the far-field intensity distribution along the vertical (y') or horizontal (x') direction, respectively, which effectively eliminates one dimension of the data array.
    • ■ If max(m) > 0 and max(n) > 0, leave the far-field intensity distribution unprocessed.
  • 5. Normalize I obtained from Step 4 and take −log(I) as the final data.
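Steps 2–3 can be sketched as follows. Here the far field of Eq. (1) is approximated by a zero-padded 2D FFT; the padding factor, random seed and helper names are assumptions made for illustration, and Steps 4–5 are shown in the next sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sample(modes, pad=4):
    """Steps 2-3: draw random modal coefficients, build the label and the
    far-field intensity of Eq. (1) (approximated by a zero-padded 2D FFT).

    modes : array of shape (3, Ny, Nx) holding the normalized TE00/TE10/TE20 fields.
    """
    n_modes = modes.shape[0]
    amp = rng.uniform(0.0, 1.0, n_modes)            # |a_i| in [0, 1)
    phase = rng.uniform(-np.pi, np.pi, n_modes)     # arg(a_i) in [-pi, pi)
    a = amp * np.exp(1j * phase)
    label = amp**2 / np.sum(amp**2)                 # [A1, A2, A3]

    near_field = np.tensordot(a, modes, axes=1)     # superposition at the waveguide exit
    padded_shape = [pad * s for s in near_field.shape]
    far_field = np.fft.fftshift(np.fft.fft2(near_field, s=padded_shape))
    intensity = np.abs(far_field)**2                # far-field intensity pattern I
    return intensity, label
```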

2.3 Network architecture

In this work, max(n) = 0 is satisfied and the preparation of the input data is shown in Fig. 2(a). The far-field intensity pattern of our waveguide is imaged and pixelated into a [16, 64, 1] array, where 16, 64 and 1 represent the number of pixels along the vertical direction, the number of pixels along the horizontal direction and the number of color channels, respectively. Notice that only one color channel is required to describe our intensity pattern, which is basically a grayscale map. Because all three modes considered here are single-moded along the vertical axis, the far-field pattern manifests as a collection of stripes along the vertical axis, as shown in Fig. 2(a). It is thus convenient to compress this [16, 64, 1] array of intensity data by summing up the values along the direction of the stripes, i.e., the vertical direction, to form a simplified [1, 64, 1] array. The 64 intensity values in this array, denoted by I, are further normalized, logarithmized and multiplied by −1 to become the final input data to our CNN. Preprocessing the input data in this fashion provides two key advantages: 1) great simplification of the CNN, which effectively takes a 1D array as input, and 2) a boost in training speed due to the reduced dynamic range of the data values.
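A matching sketch of the compression and normalization just described (Steps 4–5): a pixelated [16, 64, 1] intensity map is summed along the vertical stripes, normalized and logarithmized. The small offset added before the logarithm is a numerical guard and an assumption on our part.

```python
import numpy as np

def preprocess_pattern(I_2d, eps=1e-12):
    """Compress a pixelated [16, 64, 1]-style far-field pattern into the
    [1, 64, 1] input of the specialized CNN (cf. Fig. 2(a))."""
    I_1d = I_2d.reshape(16, 64).sum(axis=0)   # sum along the vertical stripes
    I_1d = I_1d / I_1d.max()                  # normalize
    x = -np.log(I_1d + eps)                   # -log(I); eps guards against log(0)
    return x.reshape(1, 64, 1)                # final input shape for the CNN
```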


Fig. 2 Input data and the CNN architecture. (a) Far-field intensity distribution data is compressed by summing up all the intensity values along the vertical axis leaving behind a horizontal-position (or x'-axis) dependent grayscale map, I, and -log(I) is used as final input data to our CNN. (b) CNN architecture with two convolutional layers and three fully connected layers.


As shown in Fig. 2(b), our specialized CNN consists of 2 convolutional layers, denoted Conv1 and Conv2, and 3 fully connected layers, denoted FC3, FC4 and FC5. Both convolutional layers use filters of shape [1,3] with stride [1,1] and same-padding, use ReLU [30] as the activation function, and use max-pooling with shape [1,2] and stride [1,2]. The numbers of filters in Conv1 and Conv2 are 16 and 32, respectively. The number of features shrinks while the number of channels is multiplied, owing to the [1,2] stride of the max-pooling and the growing filter number. The feature map after the two convolutional layers is flattened to one dimension and then fed to the fully connected layers. The activation functions of FC3 and FC4 are ReLU, while that of FC5 is softmax [31]. The numbers of neurons used for FC3, FC4 and FC5 are 64, 32 and 3, respectively. The final output of FC5 corresponds to the predicted modal power distribution for a given far-field intensity pattern.
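The paper does not state which framework was used; the following Keras/TensorFlow sketch is our own reconstruction that simply mirrors the layer sizes listed above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_specialized_cnn():
    """Specialized CNN of Fig. 2(b): two convolutional layers plus three
    fully connected layers, operating on the compressed [1, 64, 1] input."""
    return models.Sequential([
        tf.keras.Input(shape=(1, 64, 1)),
        # Conv1: 16 filters of shape [1, 3], stride [1, 1], same padding, ReLU
        layers.Conv2D(16, (1, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(1, 2), strides=(1, 2)),
        # Conv2: 32 filters of shape [1, 3], stride [1, 1], same padding, ReLU
        layers.Conv2D(32, (1, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(1, 2), strides=(1, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),    # FC3
        layers.Dense(32, activation="relu"),    # FC4
        layers.Dense(3, activation="softmax"),  # FC5: predicted [A1, A2, A3]
    ])
```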

In the training phase, the output is compared with the label to compute the mean square error (MSE) over m samples, as shown in Eq. (2),

$$J = \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{3}\left(y_p^{(i)}[j] - y_l^{(i)}[j]\right)^{2}. \tag{2}$$
Here $y_p^{(i)}[j]$ and $y_l^{(i)}[j]$ denote the predicted and labeled power of the TE$_{(j-1)0}$ mode of the i-th sample, respectively. Notice that Adam optimization [32] is used throughout this work to train our designed neural networks.

3. Results and discussion

A data set containing 9000 samples is used as the training set and another data set containing 1000 samples is used as the cross-validation set. In each epoch, mini-batches of 16 samples drawn from the 9000-sample training set are fed to the network in turn until the entire set is traversed. The training set is then shuffled and used in a new epoch. After repeating this process for 10000 epochs, the MSE of the training set ends up at 2.72e-5 while that of the cross-validation set is 9.04e-5. The gap between the training set and the cross-validation set results from the natural regularization of the CNN and is completely acceptable, suggesting that our trained CNN does a superb job of predicting the modal power distribution. Next, in order to rule out overfitting to the cross-validation set, the trained neural network is tested a second time with 10000 randomly generated samples, and the MSE ends up at 9.07e-5, which is very close to the MSE of the cross-validation set.
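Continuing the hedged Keras sketch above, the training procedure described here (MSE loss of Eq. (2), Adam, mini-batches of 16, reshuffling every epoch) could look roughly as follows; note that Keras' built-in "mse" averages over the three outputs, which differs from Eq. (2) only by a constant factor and does not change the optimization.

```python
model = build_specialized_cnn()
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")

# x_train: (9000, 1, 64, 1), y_train: (9000, 3); x_val, y_val hold the 1000
# cross-validation samples. shuffle=True reshuffles the training set each epoch.
history = model.fit(x_train, y_train,
                    batch_size=16,
                    epochs=10000,
                    shuffle=True,
                    validation_data=(x_val, y_val))
```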

To further examine the efficacy of our trained CNN, the square error (SE) and absolute error (AE) of the modal power distribution of the i-th sample are investigated. Their definitions are shown in Eqs. (3) and (4),

$$SE^{(i)} = \sum_{j=1}^{3}\left(y_p^{(i)}[j] - y_l^{(i)}[j]\right)^{2} \tag{3}$$
$$AE^{(i)} = \max_{j}\left|\,y_p^{(i)}[j] - y_l^{(i)}[j]\,\right|,\qquad j = 1, 2, 3 \tag{4}$$
where j = 1, 2, 3 denotes the index of the three modes. A typical histogram of the SE distribution is illustrated in Fig. 3, and for the sake of clarity, a zoomed-in view for SE larger than 1e-3 is shown in the inset. It is found that when the maximum square error, or tolerance, is set to tol_se = 1e-3, the prediction accuracy is as high as 99.53%, and if it is set to 1e-2, the prediction accuracy reaches 99.99%.
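The accuracy figures quoted here follow directly from Eqs. (3) and (4); a small NumPy helper that reproduces the counting (the function name and the tolerance arguments are our own) might look like:

```python
import numpy as np

def prediction_accuracy(y_pred, y_label, tol_se=1e-3, tol_ae=0.02):
    """Fraction of samples whose square error (Eq. (3)) and absolute error
    (Eq. (4)) fall below the chosen tolerances.

    y_pred, y_label : arrays of shape (N, 3) with predicted and labeled
    modal power distributions.
    """
    err = y_pred - y_label
    se = np.sum(err**2, axis=1)         # Eq. (3)
    ae = np.max(np.abs(err), axis=1)    # Eq. (4)
    return float(np.mean(se < tol_se)), float(np.mean(ae < tol_ae))
```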


Fig. 3 Histogram of the square error distribution of 10000 testing samples. Inset: samples with square error larger than 1e-3.


For samples whose square errors are less than tol_se, the maximum possible absolute error is $AE_{\max}=\sqrt{2\times tol_{se}/3}$. When tol_se is set to 1e-3, the maximum possible absolute error is approximately 0.0258. A typical histogram of the actual distribution of the absolute error is shown in Fig. 4; a zoomed-in view for AE larger than 0.02 is shown in the inset for clarity. The absolute errors of the modal power distribution of the three modes are less than 1% for more than 85% of all the samples, and the absolute error is hardly ever greater than 2%. If the maximum absolute error tolerance is set to tol_ae = 0.02, the prediction accuracy is as high as 98.95%.
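A one-line justification of the quoted bound, assuming that both the predicted and the labeled distributions sum to one so that the per-mode errors sum to zero:

```latex
% With e_j = y_p[j] - y_l[j] and \sum_j e_j = 0, fixing the largest error
% |e_1| = e and minimizing the remaining contribution (e_2 = e_3 = -e/2) gives
%   SE = e^2 + 2\,(e/2)^2 = \tfrac{3}{2}\,e^2 .
% Imposing SE < tol_{se} therefore bounds the largest per-mode error:
\[
  AE_{\max} \;=\; \sqrt{\tfrac{2}{3}\, tol_{se}}
            \;\approx\; 0.0258
  \qquad \text{for } tol_{se} = 10^{-3}.
\]
```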


Fig. 4 Histogram of the absolute error distribution of 10000 testing samples. Inset: samples with absolute error larger than 0.02.


It is worth noting that the modes concerned in this study are those of a thin waveguide that is single-moded along its vertical direction. For simplicity, a compression technique has then been applied to the intensity data, as explained above and in Fig. 2(a). However, this compression is neither mandatory nor a limiting factor for our concept when stricter analysis is required and more general cases are under consideration. To illustrate this, a full-scale CNN, taking input data of the matching size [16, 64, 1] (e.g., the bottom-left image of the pixelated far-field pattern shown in Fig. 2(a)) instead of the [1, 64, 1] input of the specialized case, is trained and investigated. The prediction accuracy of this full-scale CNN is found to be 99.02% with a maximum square error tolerance tol_se = 1e-3 and 97.94% with a maximum absolute error tolerance tol_ae = 0.02. Compared with the results from the specialized CNN above, the performance remains nearly unchanged, but the training time doubles when the same computation resources are used. Hence, preprocessing the data so as to facilitate the construction of a specialized CNN may be preferred in scenarios where the modal power distribution in a thin waveguide is of interest.

To verify the general feasibility of our concept, a more general condition, i.e., waveguides with multiple mode orders in both the vertical and horizontal directions, is also investigated. For illustration purposes, a SOI waveguide supporting 8 quasi-TE eigenmodes, corresponding to max(m) = 1 and max(n) = 3, is used. Considering that the far-field intensity patterns are determined simultaneously by as many as 16 variables (8 amplitudes and 8 phases), the difficulty of accurately predicting the modal power distribution increases dramatically compared with the case shown above. Nevertheless, after a few rounds of optimization, an appropriate CNN structure for this difficult task is obtained and is shown in Fig. 5(a). It can be broken into 8 hidden layers. Each of the layers denoted "Conv1-5" consists of a sequence of a convolutional stage, a batch-normalization stage [33], an activation stage and a max-pooling stage. Each of these layers uses convolutional filters of size [3,3] with stride [1,1] and same-padding, uses ReLU as the activation function, and uses max-pooling of size [2,2] with stride [2,2]. The activation functions of the layers denoted "FC1" and "FC2" are ReLU, while that of "FC3" is softmax. The size of the 2D images and the number of channels after each layer are indicated in Fig. 5(a). After training with 396000 samples in the training set (and 4000 in the cross-validation set) for 1000 epochs, the MSE of the training set ends up at 8.90e-6 while that of the cross-validation set is 2.35e-5. The performance of this CNN is benchmarked statistically with 10000 testing samples, and the corresponding square error and absolute error distributions are shown in Figs. 5(b) and 5(c), respectively. The prediction accuracy of this CNN turns out to be excellent: about 98.70% with a maximum square error tolerance tol_se = 1e-3 and 97.42% with a maximum absolute error tolerance tol_ae = 0.02. Therefore, compared with the previous case, although the number of modes has been more than doubled (8 vs. 3), the prediction performance of this newly trained CNN remains nearly unchanged. This finding confirms the generality of using CNNs to confidently predict modal power distributions. Our concept may thus have a very wide application scope in related fields.
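Again as a reconstruction rather than the authors' code, one way to express the "Conv-BN-ReLU-maxpool" blocks and the three dense layers described above in Keras is sketched below; the input size, filter-count progression and dense-layer widths are assumptions on our part, since Fig. 5(a) fixes the actual numbers.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_general_cnn(input_shape=(64, 64, 1), n_modes=8, base_filters=16):
    """Five Conv-BN-ReLU-maxpool blocks ('Conv1-5') followed by FC1-FC3.

    The spatial input must survive five stride-2 poolings, so both
    dimensions of input_shape should be at least 32 (64 is assumed here)."""
    model = models.Sequential([tf.keras.Input(shape=input_shape)])
    filters = base_filters
    for _ in range(5):
        model.add(layers.Conv2D(filters, (3, 3), strides=(1, 1), padding="same"))
        model.add(layers.BatchNormalization())                    # batch-normalization stage
        model.add(layers.Activation("relu"))                      # activation stage
        model.add(layers.MaxPooling2D((2, 2), strides=(2, 2)))    # pooling stage
        filters *= 2
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))               # FC1 (width assumed)
    model.add(layers.Dense(64, activation="relu"))                # FC2 (width assumed)
    model.add(layers.Dense(n_modes, activation="softmax"))        # FC3: 8 modal powers
    return model
```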


Fig. 5 (a) Structure of the proposed CNN for predicting the modal power distribution for waveguides with multiple mode orders in both the vertical and horizontal directions. (b) Performance of this CNN with 10000 testing samples benchmarked with the square error distribution. Inset: samples with square error larger than 1e-3. (c) Performance benchmarked with the absolute error distribution. Inset: samples with absolute error larger than 0.02.


In reality, the performance of CNNs can be impaired by many factors. Among them, noise in the data could have the most direct impact. In order to verify the robustness of our general CNN shown in Fig. 5(a), noisy testing samples are prepared and fed into the trained CNN. In detail, the intensity value of each pixel of the testing samples is multiplied by its own factor, f, which equals $1 + N(0,1)\,\sigma$. Here N(0,1) is the standard normal distribution and σ can be regarded as the simulated noise intensity. The prediction accuracy against the noise intensity, with absolute error < 0.02 as the criterion, is shown in Fig. 6. Overall, the prediction accuracy slowly degrades with increasing noise intensity; it nevertheless remains larger than 95% even when the simulated noise intensity reaches 0.08. Notice that this noise level can hardly be reached in any practical scenario. This study thus confirms that our CNN model is very tolerant to noise.
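The multiplicative noise model described here is straightforward to reproduce; in the sketch below, the random seed and the dummy intensity map used in the usage example are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def add_multiplicative_noise(intensity, sigma):
    """Multiply every pixel by its own factor f = 1 + N(0,1) * sigma."""
    return intensity * (1.0 + sigma * rng.standard_normal(intensity.shape))

# Illustrative use: perturb a dummy far-field intensity map at several noise
# levels; in practice each noisy map is re-preprocessed (normalize, -log)
# and fed to the trained CNN to evaluate the accuracy of Fig. 6.
I_clean = np.ones((16, 64))
for sigma in (0.02, 0.04, 0.08):
    I_noisy = add_multiplicative_noise(I_clean, sigma)
```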


Fig. 6 Prediction accuracy against different noise intensities, with absolute error smaller than 0.02 as the criterion.


4. Conclusion

In summary, a modal power distribution analyzing scheme using CNNs has been proposed and its performance has been evaluated using synthetic data. A specialized CNN and two full-scale CNNs are trained separately to discriminate modal power based on the preprocessed far-field intensity pattern. These trained CNNs perform excellently at predicting the modal power distribution, with a very high accuracy > 98% even at the maximum SE tolerance tol_se = 1e-3 and even for waveguides that are heavily multi-moded. In addition, these CNNs remain efficient and robust in predicting the modal power distribution even when exaggerated noise is incorporated in the model. We believe that this new approach to analyzing modal power for modes in multimode waveguides holds great promise for potential applications in spatial-division multiplexing and structured light, and can be confidently extended to few-mode fibers.

Funding

National Basic Research Program of China (Grant 2015CB659400); Natural Science Foundation of Jiangsu Province (Grant BK20150057); National Key Research and Development Program of China (Grant 2017YFA0206403); The Research Program of the Chinese Academy of Sciences (Grant XDB24020200).

References

1. V. A. Sleiffer, Y. Jung, V. Veljanovski, R. G. van Uden, M. Kuschnerov, H. Chen, B. Inan, L. G. Nielsen, Y. Sun, D. J. Richardson, S. U. Alam, F. Poletti, J. K. Sahu, A. Dhar, A. M. Koonen, B. Corbett, R. Winfield, A. D. Ellis, and H. de Waardt, “73.7 Tb/s (96 x 3 x 256-Gb/s) mode-division-multiplexed DP-16QAM transmission with inline MM-EDFA,” Opt. Express 20(26), B428–B438 (2012). [CrossRef]   [PubMed]  

2. T. Uematsu, Y. Ishizaka, Y. Kawaguchi, K. Saitoh, and M. Koshiba, “Design of a compact two-mode multi/demultiplexer consisting of multimode interference waveguides and a wavelength-insensitive phase shifter for mode-division multiplexing transmission,” J. Lightwave Technol. 30(15), 2421–2426 (2012). [CrossRef]  

3. D. Dai, J. Wang, and S. He, “Silicon multimode photonic integrated devices for on-chip mode-division-multiplexed optical interconnects (invited review),” Prog. Electromagnetics Res. 143, 773–819 (2013). [CrossRef]  

4. H. Qiu, H. Yu, T. Hu, G. Jiang, H. Shao, P. Yu, J. Yang, and X. Jiang, “Silicon mode multi/demultiplexer based on multimode grating-assisted couplers,” Opt. Express 21(15), 17904–17911 (2013). [CrossRef]   [PubMed]  

5. L. W. Luo, N. Ophir, C. P. Chen, L. H. Gabrielli, C. B. Poitras, K. Bergmen, and M. Lipson, “WDM-compatible mode-division multiplexing on a silicon chip,” Nat. Commun. 5(1), 3069 (2014). [CrossRef]   [PubMed]  

6. R. Van Uden, R. A. Correa, E. A. Lopez, F. Huijskens, C. Xia, G. Li, A. Schülzgen, H. De Waardt, A. Koonen, and C. Okonkwo, “Ultra-high-density spatial division multiplexing with a few-mode multicore fibre,” Nat. Photonics 8(11), 865–870 (2014). [CrossRef]  

7. H. Zhou, J. Dong, L. Shi, D. Huang, and X. Zhang, “Hybrid coding method of multiple orbital angular momentum states based on the inherent orthogonality,” Opt. Lett. 39(4), 731–734 (2014). [CrossRef]   [PubMed]  

8. N. Bozinovic, Y. Yue, Y. Ren, M. Tur, P. Kristensen, H. Huang, A. E. Willner, and S. Ramachandran, “Terabit-scale orbital angular momentum mode division multiplexing in fibers,” Science 340(6140), 1545–1548 (2013). [CrossRef]   [PubMed]  

9. D. Dai, J. Wang, and Y. Shi, “Silicon mode (de)multiplexer enabling high capacity photonic networks-on-chip with a single-wavelength-carrier light,” Opt. Lett. 38(9), 1422–1424 (2013). [CrossRef]   [PubMed]  

10. J. Wang, S. He, and D. Dai, “On‐chip silicon 8‐channel hybrid (de) multiplexer enabling simultaneous mode‐and polarization‐division‐multiplexing,” Laser Photonics Rev. 8(2), L18–L22 (2014). [CrossRef]  

11. R. Kirchain and L. Kimerling, “A roadmap for nanophotonics,” Nat. Photonics 1(6), 303–305 (2007). [CrossRef]  

12. Y. Ding, J. Xu, F. Da Ros, B. Huang, H. Ou, and C. Peucheret, “On-chip two-mode division multiplexing using tapered directional coupler-based mode multiplexer and demultiplexer,” Opt. Express 21(8), 10376–10382 (2013). [CrossRef]   [PubMed]  

13. J. Wang, P. Chen, S. Chen, Y. Shi, and D. Dai, “Improved 8-channel silicon mode demultiplexer with grating polarizers,” Opt. Express 22(11), 12799–12807 (2014). [CrossRef]   [PubMed]  

14. P. J. Winzer, “Making spatial multiplexing a reality,” Nat. Photonics 8(5), 345–348 (2014). [CrossRef]  

15. C. Yang, Y. Wang, and C. Q. Xu, “A novel method to measure modal power distribution in multimode fibers using tilted fiber Bragg gratings,” IEEE Photonics Technol. Lett. 17(10), 2146–2148 (2005). [CrossRef]  

16. L. Yan, R. Barankov, P. Steinvurzel, and S. Ramachandran, “Modal-weight measurements with fiber gratings,” J. Lightwave Technol. 33(13), 2784–2790 (2015). [CrossRef]  

17. H. Zhou, Q. Zhu, W. Liang, G. Zhu, Y. Xue, S. Chen, L. Shen, M. Liu, J. Dong, and X. Zhang, “Mode measurement of few-mode fibers by mode-frequency mapping,” Opt. Lett. 43(7), 1435–1438 (2018). [CrossRef]   [PubMed]  

18. L. Li, J. Leng, P. Zhou, and J. Chen, “Multimode fiber modal decomposition based on hybrid genetic global optimization algorithm,” Opt. Express 25(17), 19680–19690 (2017). [CrossRef]   [PubMed]  

19. G. Stepniak, “Application of the error reduction algorithm to measurement of modal power distribution in a multimode fiber,” Proc. SPIE 9290, 929007 (2014). [CrossRef]  

20. D. M. Nguyen, S. Blin, T. N. Nguyen, S. D. Le, L. Provino, M. Thual, and T. Chartier, “Modal decomposition technique for multimode fibers,” Appl. Opt. 51(4), 450–456 (2012). [CrossRef]   [PubMed]  

21. J. W. Nicholson, A. D. Yablon, J. M. Fini, and M. D. Mermelstein, “Measuring the modal content of large-mode-area fibers,” IEEE J. Sel. Top. Quantum Electron. 15(1), 61–70 (2009). [CrossRef]  

22. K. Kavukcuoglu, P. Sermanet, Y. L. Boureau, K. Gregor, M. Mathieu, and Y. L. Cun, “Learning convolutional feature hierarchies for visual recognition,” in Proceedings of Advances in Neural Information Processing Systems(Curran Associates Inc, 2010), pp. 1090–1098.

23. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Proceedings of Advances in Neural Information Processing Systems (Curran Associates Inc, 2012), pp. 1097–1105.

24. C. Farabet, C. Couprie, L. Najman, and Y. Lecun, “Learning hierarchical features for scene labeling,” IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1915–1929 (2013). [CrossRef]   [PubMed]  

25. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” https://arxiv.org/abs/1409.4842 (2014).

26. M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, “Subject independent facial expression recognition with robust face detection using a convolutional neural network,” Neural Netw. 16(5-6), 555–559 (2003). [CrossRef]   [PubMed]  

27. I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio, Deep Learning (MIT press Cambridge, 2016).

28. D. C. Ciresan, U. Meier, J. Masci, L. Maria Gambardella, and J. Schmidhuber, “Flexible, high performance convolutional neural networks for image classification,” in Proceedings of International Joint Conference on Artificial Intelligence (Barcelona, Spain, 2011), pp. 1237–1242.

29. Z. Zhu and T. Brown, “Full-vectorial finite-difference analysis of microstructured optical fibers,” Opt. Express 10(17), 853–864 (2002). [CrossRef]   [PubMed]  

30. R. H. Hahnloser, R. Sarpeshkar, M. A. Mahowald, R. J. Douglas, and H. S. Seung, “Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit,” Nature 405(6789), 947–951 (2000). [CrossRef]   [PubMed]  

31. N. M. Nasrabadi, “Pattern recognition and machine learning,” J. Electron. Imaging 16(4), 049901 (2007). [CrossRef]  

32. D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” https://arxiv.org/abs/1412.6980 (2014).

33. S. Ioffe and C. Szegedy, “Batch normalization: accelerating deep network training by reducing internal covariate shift,” https://arxiv.org/abs/1502.03167 (2015).
