Simple Learning Method to Guarantee Operational Range of Optical Monitors

Open Access

Abstract

It is necessary to guarantee the operational range of machine learning (ML)-based optical physical-layer monitors (OPMs). To let network operators declare high-level monitoring objectives and obtain the corresponding values from OPMs, a methodology is needed that accurately estimates the value of a target quantity and ensures the operational range of the estimator. We previously introduced a deep neural network (DNN) combined with a digital coherent receiver into ML-based OPMs to address two challenges: the abundance of training data needed for convergence and the pre-processing of input data by human engineers needed for feature (representation) extraction. However, guaranteeing the operational range of trained models in DNN-based OPMs was left for further investigation. To address this issue, we propose an “operational range expander,” a simple treatment of the link between the pre-processing of training datasets and the specified operational range. We assess the operational range expander through simulation and experiment using a DNN-based optical signal-to-noise ratio (OSNR) estimator. As an example quantity, we select the laser frequency offset between a signal and a local oscillator in digital coherent receivers, because the OPM needs to work before the frequency offset is digitally compensated, and because fully controlling the frequency offset is difficult in practical situations. We evaluate the bias errors and standard deviations of the OSNR estimation for frequency offsets ranging from −3.5 to +3.5 GHz and confirm that the operational range expander specified the operational range of DNN-based OSNR estimators through their training phase.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

I. Introduction

Optical physical-layer monitors (OPMs) are indispensable for designing autonomous “self-driving” optical networks that can achieve autonomous management by measuring themselves [1–4]. In this context, network operators simply want to declare high-level monitoring objectives and have monitors autonomously adapt themselves to current network situations. Therefore, it is necessary to find a methodology to accurately estimate the value of a target quantity within a given operational range. The term “operational range” is used here to refer to the set of environmental conditions under which an OPM can successfully perform its monitoring function. The OPM must provide correct monitoring results within its operational range.

There is growing interest in using machine learning (ML) [5,6] to automatically find a methodology for achieving an OPM [7–18]. This should be more useful than current pre-designed OPMs with fixed measurement and signal processing, because ML-based OPMs can flexibly adapt to different monitoring targets and environments. Examples include optical signal-to-noise ratio (OSNR), nonlinearity factors, chromatic dispersion (CD), polarization mode dispersion (PMD), and any quantities that are unexpected in the design stage but needed later.

Key challenges of the ML-based OPM approach are the abundance of training data needed for convergence, the pre-processing of input data by human engineers needed for feature (representation) extraction, and guaranteeing the operational range of trained models. Our recent studies on deep neural network (DNN)-based OPMs [19–21] resolved the first two of these challenges by combining a large amount of data collected by digital coherent receivers with end-to-end representation learning by a DNN. Guaranteeing an operational range was left for further investigation. It is necessary to specify the operational range of an ML-based OPM for practical use because many learning algorithms, including standard DNNs, provide only a point estimate (the mean value of the quantity of interest) and do not capture the confidence of the model output. A standard DNN model (e.g., an OSNR estimator) outputs seemingly definite values (e.g., OSNR values) even outside its operational range, but such estimates cannot be trusted. Although Bayesian probability theory [22] offers mathematically grounded tools to reason about model uncertainty [23], it usually comes with a prohibitive computational cost and can cause issues with representation engineering for high-dimensional input.

We give a simple treatment of the link between the pre-processing of training datasets for DNN-based OPMs and their specified operational ranges. This treatment is applicable to standard DNN models with minimal modification of the training process, requiring only a simple additional calculation. We assess the relationship between the pre-processing and the operational range using a DNN-based OSNR estimator with an uncertain laser frequency offset between a signal and a local oscillator (LO) as an example case. We evaluate the bias errors and standard deviations of the OSNR estimation for frequency offsets ranging from −3.5 to +3.5 GHz.

Section II presents the architecture of training data processing that ensures the operational range of a DNN-based OPM. We call this mechanism an “operational range expander.” Section III presents a case study in which a specified operational range was successfully acquired by training with the operational range expander, demonstrating robust operation of a DNN-based OSNR monitor within the specified operational range. After presenting experimental results and discussion in Section IV, we conclude the paper in Section V.

II. Ensuring Operational Range in DNN-Based OPM

After a short review of DNN-based OPMs, we introduce an additional process that ensures an explicitly specified operational range by transforming the training dataset, i.e., the electric field reconstructed in a digital coherent receiver.

A. DNN-Based OPM and Its Features

We proposed an OPM based on both a DNN and a digital coherent receiver in our previous work [19–21]. Figure 1 shows a schematic diagram of the DNN-based OPM. Optical measurement and digitization in this OPM are carried out by a digital coherent receiver. This part converts an optical signal into a digitized dataset that contains the full information of the optical amplitude and phase of both polarizations within the bandwidth of the receiver [24]. Thanks to the high-speed sampling rate (typically several tens of GSa/s) of the analog-to-digital converters (ADCs), both static and fast phenomena on an optical electric field can be captured by the measurement-and-digitization part. This digitization function can provide the enormous amount of data necessary for training DNNs.

Fig. 1. Schematic diagram of DNN-based optical physical-layer monitor and CNN.

The data-analytic part of this OPM is carried out using a DNN that provides flexibility and versatility for processing, which is automatically learned from the input dataset without prior modeling of the channel or component characteristics. The OPM provides its outputs for a network controller to manage networks. In this study, we used a convolutional neural network (CNN), which is one form of DNN and is inspired by the primary visual cortex in the human brain, as an implementation of the data-analytic part of the OPMs, as shown in Fig. 1.

Using a DNN instead of existing shallow ML techniques, such as shallow artificial neural networks or support vector machines (SVMs), makes it possible to skip handcrafted engineering of a data representation from measured raw data. A typical processing pipeline of shallow ML algorithms is separated into two steps: (1) representation engineering and (2) task execution. The representation engineering step composes a specific data representation that is useful for the given task and has traditionally been conducted by human engineers. In contrast to this two-step pipeline, a DNN enables an integrated approach: both the representation and the task can be learned directly from the data. This DNN-based approach offers end-to-end learning [25] of both representation and task. Thus, the use of a DNN can relax the scalability limitations of shallow ML-based OPMs, extending them to a more general set of signals (e.g., signals with different modulation formats) without manual representation extraction.

B. Operational Range Expander

Although DNN-based OPMs can skip representation engineering by human engineers because of their end-to-end learning, they must provide correct monitoring results within their operational range. To ensure the operational ranges of DNN-based OPMs, we introduce an extra process called an “operational range expander.” This process creates “new” data from existing data to enrich the diversity of the training datasets. Figure 2(a) shows the basic idea of the operational range expander in the DNN-based OPM. It is a transformation applied to training datasets, which are electric fields reconstructed in the digital domain. The transformation $T$ is expressed by:

$$E_{\mathrm{out}} = T(E_{\mathrm{in}}, \xi), \qquad \xi = g(S, u), \qquad T(E_{\mathrm{in}}, 0) = E_{\mathrm{in}}, \tag{1}$$
where $E_{\mathrm{out}} = (E_{\mathrm{out}}^{(x)}, E_{\mathrm{out}}^{(y)})^{T}$ and $E_{\mathrm{in}} = (E_{\mathrm{in}}^{(x)}, E_{\mathrm{in}}^{(y)})^{T}$ are the output and input complex electric fields of the expander, and $\xi$ is a random value that parameterizes the transformation. $\xi$ is calculated from $S$ (the operational range specifier) and $u$ (a uniform random value between $-1$ and $+1$) through the function $g$. The function $g$ defines the relationship between $\xi$ and $S$, depending on the phenomenon emulated by the expander. Examples of $T$, $\xi$, $S$, and $g$ are given in Eqs. (3) and (4) in Section III.B.
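As a minimal sketch of how Eq. (1) can be applied per training record, assuming the reconstructed field is stored as a complex NumPy array and that the transformation $T$ and the function $g$ are supplied as Python callables (all names here are illustrative, not from the original implementation):

```python
import numpy as np

def expand_record(e_in, transform, g, specifier, rng=None):
    """Apply one operational-range-expander step (Eq. (1)) to one training record.

    e_in      : complex ndarray of shape (2, n_samples), rows = (x, y) polarizations
    transform : callable T(e_in, xi) returning the transformed field
    g         : callable g(S, u) mapping the specifier S and a uniform random
                value u in [-1, +1] to the transformation parameter xi
    specifier : operational range specifier S
    """
    rng = rng or np.random.default_rng()
    u = rng.uniform(-1.0, 1.0)   # fresh random value for every training record
    xi = g(specifier, u)         # xi = g(S, u)
    return transform(e_in, xi)   # E_out = T(E_in, xi)
```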

Fig. 2. Schematic diagram of (a) DNN-based OPM with operational range expander, and (b) frequency distribution of quantity operated in the operational range expander.

Here, we discuss the general restrictions on the transformation $T$ for the operational range expander. First, the transformation $T$ must not affect the physical quantity of the monitoring target. The transformation changes and expands the frequency distribution of the quantity to be expanded over the whole training dataset, as shown in Fig. 2(b). For example, to correctly estimate OSNR values, these values should be invariant even when the expander process is added in training. Frequency offset addition (constellation rotation on the IQ plane), chromatic dispersion addition, and polarization rotation do not affect OSNR values; thus, the operational range expander can extend the operational range of an OSNR monitor over these phenomena.

Next, the transformation $T$ should emulate the corresponding phenomenon, e.g., frequency offset, chromatic dispersion, or polarization rotation. The DNN-based OPM has a significant advantage in this respect because it treats raw data, i.e., the electric field reconstructed by a digital coherent receiver. This allows any waveform transformation to be applied to the measured dataset by post-processing. Moreover, transformations can be concatenated to expand the operational range over multiple phenomena, such as $E_{\mathrm{out}} = T_2(T_1(E_{\mathrm{in}}, \xi_1), \xi_2)$, where $T_1$ and $T_2$ are transformations, and $\xi_1$ and $\xi_2$ are random values corresponding to phenomena 1 and 2, respectively. The result is the ability to design any specification of the operational range, leading to specified robustness of the DNN-based OPM against system uncertainty.
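Concatenated expansion over multiple phenomena can then be sketched as a simple loop over per-phenomenon $(T, g, S)$ triples, again with illustrative names:

```python
import numpy as np

def expand_record_multi(e_in, steps, rng=None):
    """Concatenate expander steps, e.g., E_out = T2(T1(E_in, xi1), xi2).

    steps : iterable of (transform, g, specifier) tuples, one per emulated phenomenon
    """
    rng = rng or np.random.default_rng()
    e_out = e_in
    for transform, g, specifier in steps:
        xi = g(specifier, rng.uniform(-1.0, 1.0))  # independent random parameter per step
        e_out = transform(e_out, xi)
    return e_out
```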

This expander shares a concept with data augmentation to obtain an invariance [5] (e.g., random cropping and/or shifting of image pixels in visual image recognition), but it is distinguished by the data transformation being based on knowledge of physical phenomena in optical fiber communication systems and subsystems. Identifying such physical phenomena is important for composing a transformation $T$ that provides robustness against system uncertainty without affecting the monitoring target.

Compared with gathering actual data under diverse conditions, the operational range expander has a cost advantage: it creates a “new” data point with zero additional measurement cost and a fixed computational cost. This is especially valuable for highly reliable systems, such as optical fiber communication systems, that require training data points corresponding to extremely rare phenomena. The expander can perform any transformation corresponding to such rare phenomena at constant computational cost, whereas gathering actual data for rare phenomena tends to increase the measurement cost because the corresponding conditions are difficult to prepare.

III. Experiment

We tested the operational range expander to ensure the operational range of DNN-based OPMs. Specifically, we assessed DNN-based OSNR estimators against frequency offsets to validate the proposed scheme.

A. Frequency Offset Between Signal and Local Laser

When using an intradyne coherent receiver with a free-running LO [24], the transmitter and receiver lasers are not frequency locked. This results in a residual frequency offset in the received signal after optical downconversion in the receiver.

Figure 3 shows a schematic diagram of this frequency offset between a signal and an LO. The digital samples immediately after the ADCs are expressed by:

$$\mathrm{HI} = \mathrm{Re}(E_H E_{\mathrm{LO}}^{*}), \quad \mathrm{HQ} = \mathrm{Im}(E_H E_{\mathrm{LO}}^{*}), \quad \mathrm{VI} = \mathrm{Re}(E_V E_{\mathrm{LO}}^{*}), \quad \mathrm{VQ} = \mathrm{Im}(E_V E_{\mathrm{LO}}^{*}), \tag{2}$$
where $\mathrm{Re}$ and $\mathrm{Im}$ are functions that take the real and imaginary parts of a complex value, and $E_H$, $E_V$, and $E_{\mathrm{LO}}$ are the complex electric fields of the horizontal-polarization signal, the vertical-polarization signal, and the LO, respectively [24]. Taking the product of the signal electric fields and the complex-conjugated LO downconverts the electric field except for the frequency offset $\delta f = f_{\mathrm{sig}} - f_{\mathrm{LO}}$ and the phase offset between the signal and LO lasers, where $f_{\mathrm{sig}}$ is the laser-emitting frequency of the signal and $f_{\mathrm{LO}}$ is the laser-emitting frequency of the LO. Although this offset can be compensated in the digital signal processing (DSP) part of a coherent receiver, it is not yet compensated at the monitoring point, because the monitoring needs to occur before the main signal processing.
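As a small illustration of Eq. (2), the four digitized channels can be formed from given complex fields as follows (a sketch with assumed array shapes; the receiver front end and the DSP of [24] are not modeled):

```python
import numpy as np

def digitized_samples(e_h, e_v, e_lo):
    """Form the HI, HQ, VI, and VQ samples of Eq. (2).

    e_h, e_v : complex ndarrays, horizontal/vertical polarization signal fields
    e_lo     : complex ndarray, local oscillator field (carries exp(j*2*pi*f_LO*t)),
               so the beat terms retain the offset delta_f = f_sig - f_LO
    """
    beat_h = e_h * np.conj(e_lo)
    beat_v = e_v * np.conj(e_lo)
    return beat_h.real, beat_h.imag, beat_v.real, beat_v.imag  # HI, HQ, VI, VQ
```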

Fig. 3. Schematic diagram of frequency offset between signal and local laser in digital coherent receiver. LD, laser diode; BPD, balanced photo diode.

B. Method to Augment Dataset on Frequency Offset

We introduced an operational range expander as a pre-process in the training phase of the DNN-based OSNR estimator to enhance its tolerance against frequency offset uncertainty. In this section, we give a practical form of the operational range expander for frequency offset. The general form of the operational range expander was given in Eq. (1). If $T$ is a linear transformation, the practical form of $T$ can be given by:

$$T(E_{\mathrm{in}}, \xi) = \begin{pmatrix} \xi_{11} & \xi_{12} \\ \xi_{21} & \xi_{22} \end{pmatrix} E_{\mathrm{in}}. \tag{3}$$
In Eq. (3), the insertion of an extra random frequency offset is given by:

$$\xi_{12} = \xi_{21} = 0, \quad \xi_{11} = \xi_{22} = g(S, u), \quad g(S, u) = \exp(j 2\pi S u t), \quad S = f_{\max}, \tag{4}$$

where $t$ denotes time, and $S$ ($= f_{\max}$ in this case) is the maximum frequency of the additional frequency offset, acting as the operational range specifier for the frequency offset. $S$ gives the operational range of the frequency offset of the DNN-based OSNR estimator. The operational range specifier $S$ was set to 1.25 and 2.5 GHz in this paper because 2.5 GHz should be the maximum offset value based on the Optical Internetworking Forum (OIF) integrable tunable laser assembly (iTLA) specifications [26].
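A minimal NumPy sketch of Eqs. (3) and (4), assuming the reconstructed field of one training record is a complex array of shape (2, n_samples) sampled at the ADC rate (function and argument names are our own, not from the original implementation):

```python
import numpy as np

def frequency_offset_expander(e_in, f_max, sample_rate, rng=None):
    """Insert a random frequency offset up to +/- f_max into one training record.

    e_in        : complex ndarray of shape (2, n_samples), (x, y) polarizations
    f_max       : operational range specifier S in Hz (e.g., 1.25e9 or 2.5e9)
    sample_rate : ADC sampling rate in Sa/s (80e9 in this paper's setup)
    """
    rng = rng or np.random.default_rng()
    t = np.arange(e_in.shape[-1]) / sample_rate        # time axis of the record
    u = rng.uniform(-1.0, 1.0)                         # uniform random value in [-1, +1]
    rotation = np.exp(1j * 2 * np.pi * f_max * u * t)  # g(S, u) = exp(j*2*pi*S*u*t)
    return e_in * rotation                             # diagonal T: xi11 = xi22, xi12 = xi21 = 0
```

Calling this once per record during training spreads the frequency offset of the whole training dataset over approximately ±f_max, as illustrated in Fig. 2(b), without changing the OSNR label of any record.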

C. Experimental and Simulation Setup

Figure 4 shows the experimental setup for acquiring training datasets and for inference of the DNN-based OSNR estimator. Details of the setup are also described in Refs. [19–21]. The generated Nyquist-filtered (roll-off factor = 0.01) 16-Gbaud (GBd) dual-polarization quadrature phase shift keying (DP-QPSK) optical signals were received with the coherent receiver and digitized by ADCs with a sample rate of 80 GSa/s and an analog bandwidth (BW) of 16 GHz, after the OSNR was varied by loading additional amplified spontaneous emission (ASE) noise. The digitized samples were presented to the CNNs, which were trained and tested using the TensorFlow library [27] on servers equipped with GPUs. Note that the DP-QPSK signals were modulated with different bit sequences for the training and the test/inference of the CNNs to avoid an overfitted evaluation. The CNNs also received the OSNR measured by an optical spectrum analyzer (OSA) for their supervised training (this information is not necessary for test/inference).

Fig. 4. Experimental and simulation setup. DAC, digital-to-analog converter; InP IQM, indium phosphide in-phase and quadrature-phase modulator; LD, laser diode; EDFA, erbium-doped fiber amplifier; VOA, variable optical attenuator; ASE, amplified spontaneous emission source; OBPF, optical band-pass filter; ADC, analog-to-digital converter.

The simulation setup was almost identical to the experimental setup, except that it did not include component imperfections such as the frequency detuning and linewidth of the lasers, the extinction ratio of the modulator, and the analog bandwidth limitation of the transceiver.

Figure 5 shows the network architecture of the CNN for OSNR estimation. A CNN is usually composed of multiple convolutional layers (Conv.), pooling layers (Pooling), and fully connected layers (FC). CNNs have been widely used in image processing because pixel images can be treated as a two-dimensional grid-like array that matches the input format of a CNN. In this paper, we applied a CNN to the coherent-based OPM scheme by interpreting a waveform sampled at constant intervals as a one-dimensional grid-like array that fits the CNN input. Although some monitoring tasks can also be performed with other forms of DNN, such as a fully connected DNN [19,20], a CNN has a crucial advantage in monitoring tasks that extract abstract information from raw, high-dimensional input. This is because a CNN can effectively reduce the number of trainable parameters through weight sharing across the input data, under the assumption that the desired output does not change as features shift within one input record.

Fig. 5. CNN architecture for OSNR estimation.

The actual input to the CNN consists of four channelized electric fields: 512 samples × 4 channels, corresponding to the HI, HQ, VI, and VQ outputs of the digital coherent receiver. Note that H, V, I, and Q represent the horizontal and vertical polarization and the in-phase and quadrature-phase components of an optical field, respectively. We convolved the sampled time-series data over the time axis in the convolutional layers (i.e., one-dimensional convolution with trainable filter weights). We used multiple filters in these convolutional layers; thus, the number of channels changed through each convolutional layer (e.g., from 512 samples × 4 channels to 512 samples × 16 channels through convolutional layer 1-1 in Fig. 5). All convolutional layers in this work used the rectified linear unit (ReLU) activation function [28]. The pooling layers in this work were max pooling layers with strides of 4, which reduced the data length to one-fourth (e.g., from 512 to 128 through pooling 1 in Fig. 5). The data were flattened before the first fully connected (FC) layer, going from 32 samples × 64 channels to 2048 values in Fig. 5, and then fed to the FC layers. The output layer performed linear regression for OSNR estimation.
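The sketch below reproduces the structure of Fig. 5 in Keras, keeping the quantities stated in the text (4-channel input of 512 samples, ReLU convolutions, max pooling with stride 4, a 2048-value flattened vector, and a linear regression output); the kernel sizes, the number of convolutional layers, and the FC width are assumptions, since they are not specified here.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_osnr_cnn(n_samples=512, n_channels=4):
    """1-D CNN for OSNR regression, following the layout of Fig. 5."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_samples, n_channels)),                        # 512 samples x 4 channels (HI, HQ, VI, VQ)
        layers.Conv1D(16, kernel_size=9, padding="same", activation="relu"),  # conv 1-1: 4 -> 16 channels
        layers.MaxPooling1D(pool_size=4),                                     # pooling 1: 512 -> 128
        layers.Conv1D(64, kernel_size=9, padding="same", activation="relu"),  # 16 -> 64 channels
        layers.MaxPooling1D(pool_size=4),                                     # pooling 2: 128 -> 32
        layers.Flatten(),                                                     # 32 samples x 64 channels -> 2048 values
        layers.Dense(256, activation="relu"),                                 # FC layer (width assumed)
        layers.Dropout(0.5),                                                  # dropout as described below
        layers.Dense(1),                                                      # linear regression output: OSNR [dB]
    ])
```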

The CNN was trained with supervised learning using backpropagation and minibatch stochastic gradient descent with a learning rate controlled by the Adam optimization algorithm [29]. The loss to be minimized was defined as the mean squared error of the OSNR estimation. The training dataset contained 400,000 records per epoch. To make a fair comparison, the CNN model trained with the operational range expander and the one trained without it shared the same number of training records. In other words, the operational range expander enhanced only the diversity of the training dataset; it did not increase the number of records in this study. Dropout [30] (drop probability of 0.5) and batch normalization [31] were also used to prevent overfitting.
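Continuing the sketch above, the training step described in this paragraph (Adam, mean-squared-error loss, minibatch training) could be written as follows; the arrays below are random placeholders standing in for the expanded waveforms and the OSA-measured OSNR labels, and the batch size and epoch count are assumptions:

```python
import numpy as np

# Placeholder data: in practice, x_train holds the 512-sample x 4-channel waveforms after the
# operational range expander, and y_train holds the OSA-measured OSNR values in dB.
x_train = np.random.randn(1024, 512, 4).astype("float32")
y_train = np.random.uniform(11.0, 29.0, size=(1024, 1)).astype("float32")

model = build_osnr_cnn()                                # from the previous sketch
model.compile(optimizer="adam", loss="mse")             # Adam-controlled learning rate, MSE loss
model.fit(x_train, y_train, batch_size=256, epochs=5)   # minibatch training (sizes assumed)
```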

IV. Results and Discussion

First, to highlight our motivation again, we present an example case of a CNN-based OSNR estimator operated within [Fig. 6(a)] and outside [Fig. 6(b)] the operational range of the frequency offset. For Figs. 6(a) and 6(b), the CNN-based OSNR estimator was trained with a simulation dataset that contained only data with no frequency offset. The blue circles in Fig. 6 show the averaged values of the estimated OSNRs over a test dataset.

Fig. 6. Evaluation results of CNN-based OSNR estimator trained with simulation data with frequency offset = 0. (a) Using test dataset with frequency offset = 0, and (b) 1 GHz.

Figure 6(a) shows test results for a simulation test dataset that contains no frequency offset ($\delta f = 0$). The trained model can correctly estimate OSNRs ranging from 11 to 29 dB for measured optical signals if there is no frequency offset. Next, we detuned the transmitter laser frequency by +1 GHz in the simulation, and this optical signal was monitored with the trained CNN-based OSNR estimator. Note that this estimator was trained with a dataset having $\delta f = 0$. Figure 6(b) shows test results for a simulation test dataset with a 1 GHz frequency offset ($\delta f = 1$ GHz). Strong saturation was observed; thus, the CNN-based OSNR monitor was not able to estimate correct values when the frequency offset was outside the operational range of the monitor.

Next, we turned on the operational range expander to ensure the frequency offset tolerance of the CNN-based OSNR estimator in simulation. The operational range specifier $S$ of frequency offset was set to 1.25 and 2.5 GHz, so the CNN-based OSNR estimator with the operational range expander should tolerate frequency offsets of ±1.25 and ±2.5 GHz, respectively. Note that the expander changed the frequency offset but did not change the number of records in the training datasets. Figure 7 shows simulation results of the relationship between measured and CNN-estimated OSNR with the operational range specifier of frequency offset at 0, 1.25, and 2.5 GHz. Each CNN-based OSNR estimator was evaluated with a test dataset having a fixed frequency offset of 0, 1, 2, or 3 GHz. As expected, the CNN trained with the operational range specifier of 0 Hz failed to estimate OSNR correctly except for input signals with 0 Hz frequency offset. As the operational range specifier was increased up to 2.5 GHz, the range over which correct OSNR values were estimated expanded according to the value of the specifier.

Fig. 7. Evaluation results of CNN-based OSNR estimator trained with simulation data of the operational range expander of frequency offset. The operational range specifier = 0 (blue circles), 1.25 (red diamonds), and 2.5 GHz (green triangles).

For a more detailed discussion, we evaluated the bias error and standard deviation of CNN-estimated OSNR values with differing frequency offsets on the test datasets. Note that the bias errors and standard deviations were averaged over all test data, including measured OSNRs ranging from 11 to 29 dB. Figures 8(a) and 8(c) show simulation results of the bias errors and standard deviations as a function of frequency offset on the test datasets. Figure 8(a) shows that the CNN-based OSNR estimators trained with the operational range specifier acquired tolerances almost equal to the values of their specifiers. Figure 8(c) shows that the standard deviations of the CNN-estimated OSNR started to fluctuate outside the operational ranges indicated by the specifiers.

Fig. 8. Evaluation results of CNN-based OSNR estimator trained with simulation data having the operational range expander of frequency offset. The operational range specifier = 0 (blue circles), 1.25 (red diamonds), and 2.5 GHz (green triangles). (a) Bias error, (b) enlarged graph of (a), and (c) standard deviation of CNN-estimated OSNR as a function of frequency offset of incoming signals.

Figure 8(b) is a magnified version of Fig. 8(a) that shows the detailed dependency on the specifiers. In Fig. 8(b), there is a small difference (<0.1 dB) between the bias errors obtained with different specifiers, even within their operational ranges. One possible implication of this result is that the OSNR estimation task might become slightly more difficult for the CNN as the specifier $f_{\max}$ increases.
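For reference, the bias error and standard deviation discussed here can be computed with the plain definitions below (the formulas are not spelled out in the text, so this is our reading: the mean and standard deviation of the estimation error over all test records):

```python
import numpy as np

def bias_and_std(osnr_measured, osnr_estimated):
    """Bias error and standard deviation of CNN-estimated OSNR over a test dataset.

    osnr_measured  : ndarray of OSA-measured OSNR values [dB]
    osnr_estimated : ndarray of CNN-estimated OSNR values [dB], same shape
    """
    error = np.asarray(osnr_estimated) - np.asarray(osnr_measured)
    return error.mean(), error.std()   # bias error [dB], standard deviation [dB]
```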

Next, we demonstrated the expander in an experiment. For the baseline dataset, the LO laser in the experimental setup was finely tuned so that the frequency offset was within 150 MHz. The frequency offset between the signals and the LO was estimated by the frequency offset estimator of the receiver-side DSP [32]. The standard deviation of the frequency offset measured by the DSP is shown in Fig. 9.

Fig. 9. Standard deviation of measured frequency offset derived from the number of measurements N = 23 by using a DSP-based method.

Using the baseline dataset, the CNN-based OSNR estimators were trained with the operational range expander. The operational range specifier of frequency offset was set to 0, 1.25, and 2.5 GHz. Test datasets with specific frequency offsets were measured in experiments by varying the frequency of the LO laser. Figures 10(a) and 10(c) show experimental results of the bias errors and standard deviations as a function of frequency offset on the test datasets. The experimental results show good agreement with the simulation results shown in Fig. 8. Results with the specifiers of 1.25 and 2.5 GHz (red diamonds and green triangles in Fig. 10) show a slightly higher error tendency, even within their operational ranges, compared with the simulation results in Fig. 8. This is due to the limited analog bandwidth of the ADCs in this experiment. Even with this bandwidth limitation, the experimental results show that the introduced expander dramatically reduced the bias errors of the CNN-estimated OSNRs within the specified operational range.

Fig. 10. Evaluation results of CNN-based OSNR estimator trained with experimental data having the operational range expander of frequency offset. The operational range specifier = 0 (blue circles), 1.25 (red diamonds), and 2.5 GHz (green triangles). (a) Bias error, (b) enlarged graph of (a), and (c) standard deviation of CNN-estimated OSNR as a function of frequency offset of incoming signals.

V. Conclusion

We proposed a simple treatment of the link between the pre-processing of training datasets for DNN-based OPMs and their specified operational range. We assessed the relationship between the pre-processing and the operational range using a CNN-based OSNR estimator with an uncertain laser frequency offset between a signal and a local oscillator. We gave examples through both simulation and experiment. We evaluated the bias errors and standard deviations of the OSNR estimation for frequency offsets ranging from −3.5 to +3.5 GHz and confirmed that the operational range expander provided the specified operational ranges of the CNN-based OSNR estimators through their training phase. This is a valuable step toward designing autonomous “self-driving” optical networks.

Acknowledgment

We would like to thank E. Katayama, Y. Harada, and K. Shiota with Fujitsu Kyushu Network Technologies Limited for their support in the data analysis.

References

1. S. Yan, A. Aguado, Y. Ou, R. Wang, R. Nejabati, and D. Simeonidou, “Multi-layer network analytics with SDN-based monitoring framework,” J. Opt. Commun. Netw., vol. 9, no. 2, pp. A271–A279, 2017.

2. F. Meng, Y. Ou, S. Yan, K. Sideris, M. D. G. Pascual, R. Nejabati, and D. Simeonidou, “Field trial of a novel SDN enabled network restoration utilizing in-depth optical performance monitoring assisted network re-planning,” in Optical Fiber Communication Conf. (OFC), 2017, paper Th1J.8.

3. S. Yan, F. N. Khan, A. Mavromatis, D. Gkounis, Q. Fan, F. Ntavou, K. Nikolovgenis, F. Meng, E. H. Salas, C. Guo, C. Lu, A. P. T. Lau, R. Nejabati, and D. Simeonidou, “Field trial of machine-learning-assisted and SDN-based optical network planning with network-scale monitoring database,” in 43rd European Conf. Optical Communication (ECOC), PDP, 2017.

4. S. Oda, M. Miyabe, S. Yoshida, T. Katagiri, Y. Aoki, T. Hoshida, J. C. Rasmussen, M. Birk, and K. Tse, “A learning living network with open ROADMs,” J. Lightwave Technol., vol. 35, no. 8, pp. 1350–1356, 2017.

5. C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.

6. D. Zibar, M. Piels, R. Jones, and C. G. Schaeffer, “Machine learning techniques in optical communication,” J. Lightwave Technol., vol. 34, no. 6, pp. 1442–1452, 2016.

7. X. Wu, J. A. Jargon, R. A. Skoog, L. Paraschis, and A. E. Willner, “Applications of artificial neural networks in optical performance monitoring,” J. Lightwave Technol., vol. 27, no. 16, pp. 3580–3589, 2009.

8. J. A. Jargon, X. Wu, H. Y. Choi, Y. C. Chung, and A. E. Willner, “Optical performance monitoring of QPSK data channels by use of neural networks trained with parameters derived from asynchronous constellation diagrams,” Opt. Express, vol. 18, no. 5, pp. 4931–4938, 2010.

9. X. Wu, J. A. Jargon, L. Paraschis, and A. E. Willner, “ANN-based optical performance monitoring of QPSK signals using parameters derived from balanced-detected asynchronous diagrams,” IEEE Photon. Technol. Lett., vol. 23, no. 4, pp. 248–250, 2011.

10. F. N. Khan, T. S. R. Shen, Y. Zhou, A. P. T. Lau, and C. Lu, “Optical performance monitoring using artificial neural networks trained with empirical moments of asynchronously sampled signal amplitudes,” IEEE Photon. Technol. Lett., vol. 24, no. 12, pp. 982–984, 2012.

11. J. Thrane, J. Wass, M. Piels, J. C. M. Diniz, R. Jones, and D. Zibar, “Machine learning techniques for optical performance monitoring from directly detected PDM-QAM signals,” J. Lightwave Technol., vol. 35, no. 4, pp. 868–875, 2017.

12. T. S. R. Shen, Q. Sui, and A. P. T. Lau, “OSNR monitoring for PM-QPSK systems with large inline chromatic dispersion using artificial neural network,” IEEE Photon. Technol. Lett., vol. 24, no. 17, pp. 1564–1567, 2012.

13. M. C. Tan, F. N. Khan, W. H. Al-Arashi, Y. D. Zhou, and A. P. T. Lau, “Simultaneous optical performance monitoring and modulation format/bit-rate identification using principal component analysis,” J. Opt. Commun. Netw., vol. 6, no. 5, pp. 441–448, 2014.

14. F. N. Khan, Y. Yu, M. C. Tan, W. H. Al-Arashi, C. Yu, A. P. T. Lau, and C. Lu, “Experimental demonstration of joint OSNR monitoring and modulation format identification using asynchronous single channel sampling,” Opt. Express, vol. 23, pp. 30337–30346, 2015.

15. F. N. Khan, K. Zhong, W. H. Al-Arashi, C. Yu, C. Lu, and A. P. T. Lau, “Modulation format identification in coherent receivers using deep machine learning,” IEEE Photon. Technol. Lett., vol. 28, no. 17, pp. 1886–1889, 2016.

16. F. N. Khan, K. Zhong, X. Zhou, W. H. Al-Arashi, C. Yu, C. Lu, and A. P. T. Lau, “Joint OSNR monitoring and modulation format identification in digital coherent receivers using deep neural networks,” Opt. Express, vol. 25, no. 15, pp. 17767–17776, 2017.

17. D. Wang, M. Zhang, J. Li, Z. Li, J. Li, C. Song, and X. Chen, “Intelligent constellation diagram analyser using convolutional neural network-based deep learning,” Opt. Express, vol. 25, no. 15, pp. 17150–17166, 2017.

18. D. Wang, M. Zhang, Z. Li, J. Li, M. Fu, Y. Cui, and X. Chen, “Modulation format recognition and OSNR estimation using CNN-based deep learning,” IEEE Photon. Technol. Lett., vol. 29, no. 19, pp. 1667–1670, 2017.

19. T. Tanimura, T. Hoshida, J. C. Rasmussen, M. Suzuki, and H. Morikawa, “OSNR monitoring by deep neural networks trained with asynchronously sampled data,” in Int. Conf. Photonics in Switching OptoElectronics and Communications Conf. (OECC/PS), 2016, paper TuB3-5.

20. T. Tanimura, T. Hoshida, T. Kato, S. Watanabe, J. C. Rasmussen, M. Suzuki, and H. Morikawa, “Deep learning based OSNR monitoring independent of modulation format, symbol rate and chromatic dispersion,” in 42nd European Conf. Optical Communication (ECOC), 2016, paper Tu2C.2.

21. T. Tanimura, T. Hoshida, T. Kato, S. Watanabe, and H. Morikawa, “Data-analytics-based optical performance monitoring technique for optical transport networks,” in Optical Fiber Communications Conf. and Exhibition (OFC), 2018, paper Tu3E.3.

22. D. J. C. MacKay, “A practical Bayesian framework for backpropagation networks,” Neural Comput., vol. 4, no. 3, pp. 448–472, 1992.

23. J. Wass, J. Thrane, M. Piels, R. Jones, and D. Zibar, “Gaussian process regression for WDM system performance prediction,” in Optical Fiber Communications Conf. and Exhibition (OFC), 2017, paper Tu.3D.7.

24. K. Kikuchi, “Fundamentals of coherent optical fiber communications,” J. Lightwave Technol., vol. 34, no. 1, pp. 157–179, 2016.

25. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, 2015.

26. “Integrable tunable laser assembly MSA,” Tech. Rep. OIF-ITLA-MSA-01.2, June 26, 2008.

27. M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: a system for large-scale machine learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2016.

28. X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in 14th Int. Conf. Artificial Intelligence and Statistics (AISTATS), 2011, pp. 315–323.

29. D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” in 3rd Int. Conf. Learning Representations (ICLR), 2015.

30. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res., vol. 15, pp. 1929–1958, 2014.

31. S. Ioffe and C. Szegedy, “Batch normalization: accelerating deep network training by reducing internal covariate shift,” in 32nd Int. Conf. Machine Learning (ICML), 2015.

32. T. Tanimura, T. Hoshida, T. Kato, S. Watanabe, M. Suzuki, and H. Morikawa, “Throughput and latency programmable optical transceiver by using DSP and FEC control,” Opt. Express, vol. 25, no. 10, pp. 10815–10827, 2017.

Takahito Tanimura received his B.S. and M.S. in Physics from the Tokyo Institute of Technology (Tokyo Tech), Tokyo, Japan in 2004 and 2006, respectively, and his Ph.D. in Electrical Engineering from the University of Tokyo, Tokyo, Japan, in 2018. Since 2006, he has been with Fujitsu Laboratories Ltd., Kawasaki, Japan, where he has been engaged in the research and development of digital coherent optical communication systems. From 2011 to 2012, he was with the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany. His research interests include digital signal processing and machine learning for large-scale nonlinear systems. Dr. Tanimura is a member of the Institute of Electronics, Information and Communication Engineers (IEICE) and the Physical Society of Japan (JPS), and is currently serving as an Editorial Committee Member of the IEICE Transactions on Communications (Japanese edition) and on the Technical Program Committee of the Optical Fiber Communication Conference.

Takeshi Hoshida (S’97–M’98) received his B.E., M.E., and Ph.D. in Electronic Engineering from the University of Tokyo, Tokyo, Japan in 1993, 1995, and 1998, respectively. Since 1998, he has been with Fujitsu Laboratories Ltd., Kawasaki, Japan, where he has been engaged in the research and development of dense wavelength-division multiplexing optical transmission systems. From 2000 to 2002, he was with Fujitsu Network Communications, Inc., Richardson, TX. Since 2007, he has also been with Fujitsu Limited, Kawasaki, Japan. Dr. Hoshida is a senior member of the Institute of Electronics, Information and Communication Engineers (IEICE) and a member of the Japan Society of Applied Physics (JSAP).

Tomoyuki Kato (S’05–M’06) was born in Saitama, Japan in 1979. He received his B.E., M.E., and Dr. Eng. all in electrical engineering from the Yokohama National University, Japan in 2001, 2003, and 2006, respectively. He worked for the Precision and Intelligence Laboratories, Tokyo Institute of Technology, as a Research Associate from 2006 to 2009. He joined Fujitsu Laboratories Ltd., Kawasaki, Japan, in 2009. His current research includes nonlinear optical signal processing. Dr. Kato is a member of the IEEE Photonics Society and the Institute of Electronics, Information, and Communication Engineers (IEICE) of Japan.

Shigeki Watanabe (M’93) received his B.S. and M.S. in physics from Tohoku University, Sendai, Japan in 1978 and 1980, respectively, and his Ph.D. in electrical engineering from the University of Tokyo in 1997. He joined Fujitsu Ltd. in 1980, and since 1987 he has been with Fujitsu Laboratories Ltd., Kawasaki, Japan, where he is engaged in advanced photonic technologies in the field of optical communications. His current research includes nonlinear optical signal processing and ultrafast photonics. Dr. Watanabe is a member of IEEE, The Optical Society (OSA), and the Institute of Electronics, Information, and Communication Engineers (IEICE) of Japan.

Hiroyuki Morikawa received B.E., M.E., and Dr. Eng. degrees in electrical engineering from the University of Tokyo, Tokyo, Japan, in 1987, 1989, and 1992, respectively. Since 1992, he has been at the University of Tokyo and is currently a full professor of the School of Engineering at the University of Tokyo. His research interests are in the areas of ubiquitous networks, sensor networks, big data/IoT/M2M, wireless communications, and network services. He served as a technical program committee chair of many IEEE/ACM conferences and workshops, Vice President of IEICE, Editor-in-Chief of IEICE Transactions of Communications, OECD Committee on Digital Economy Policy (CDEP) vice chair, and Director of New Generation M2M Consortium, and sits on numerous telecommunications advisory committees and frequently serves as a consultant to government and companies. He has received more than 50 awards including three IEICE best paper awards, an IPSJ best paper award, a JSCICR best paper award, an Info-Communications Promotion Month Council President Prize, a NTT DoCoMo Mobile Science Award, a Rinzaburo Shida Award, and a Radio Day Ministerial Commendation.


