
Intelligent constellation diagram analyzer using convolutional neural network-based deep learning

Open Access

Abstract

An intelligent constellation diagram analyzer is proposed to implement both modulation format recognition (MFR) and optical signal-to-noise ratio (OSNR) estimation using a convolutional neural network (CNN)-based deep learning technique. With its ability of feature extraction and self-learning, the CNN can process constellation diagrams in their raw data form (i.e., the pixel points of an image) from the perspective of image processing, without manual intervention or data statistics. The constellation diagram images of six widely-used modulation formats over wide OSNR ranges (15~30 dB and 20~35 dB) are obtained from a constellation diagram generation module in an oscilloscope. Both simulation and experiment are conducted. Compared with four traditional machine learning algorithms, the CNN achieves higher accuracies and is clearly superior to the other methods, with O(n) training complexity and a testing time of less than 0.5 s. For OSNR estimation, high accuracies are obtained at 200 epochs (95% for 64QAM and over 99% for the other five formats); for MFR, 100% accuracies are achieved even with less training data at fewer epochs. The experimental results show that the OSNR estimation errors for all the signals are less than 0.7 dB. Additionally, the effects of multiple factors on CNN performance are comprehensively investigated, including the training data size, image resolution, and network structure. The proposed technique has the potential to be embedded in test instruments to perform intelligent signal analysis or applied for optical performance monitoring.

© 2017 Optical Society of America

1. Introduction

Measuring the quality of optical signals is one of the most important tasks in optical communications [1]. In intensity modulation-direct detection (IM-DD) systems, the eye diagram, as a commonly-used analysis object, qualitatively reflects the effects of all impairments on signal quality, especially for on-off keying (OOK) and pulse amplitude modulation (PAM) formats [2]. Over the past few years, coherent optical transmission systems and advanced modulation formats such as M-ary phase-shift keying (PSK) and quadrature amplitude modulation (QAM) have been developing rapidly [3], and here the eye diagram is no longer valid due to the lack of phase information. Instead, the constellation diagram is employed to display both amplitude and phase information and to comprehensively present multiple performance metrics for PSK and QAM signals [4,5]. By observing the constellation diagram, the modulation format can be recognized, the optical signal-to-noise ratio (OSNR) can be estimated, the error vector magnitude (EVM) can be calculated, and various impairments can be analyzed [6–9]. However, conventional constellation diagram analysis methods depend strongly on professional expertise and are therefore suitable only for experienced engineers. Meanwhile, manual operation can only give a qualitative estimate of approximate values and can hardly yield accurate results. In addition, conventional statistical approaches need to acquire the information of every constellation point, meaning that all the in-phase and quadrature data must be collected. This process is so time-consuming that it is inapplicable to real-time test systems. Therefore, a prospective constellation diagram analyzer still calls for more advanced techniques that offer intelligent operation without manual intervention, accurate measurement without personal error, and instant processing without data statistics.

Machine learning, considered a powerful interdisciplinary tool, is driving various areas closer to one of its original goals: artificial intelligence (AI) [10]. Recently, techniques from machine learning have also been applied in optical communications to promote the development of intelligent systems [11]. The published works mainly focus on optical network control and management, optical performance monitoring (OPM), and digital signal processing (DSP) using different algorithms from the machine learning community, including naive Bayes [12], expectation maximization [13], k-means [14], random forest [15], as well as our recently proposed support vector machine (SVM) [16,17], distance-weighted k-nearest neighbors (DW-KNN) [18], and back-propagation artificial neural network (BP-ANN) [19]. However, all of these machine learning algorithms are limited in their feature-extraction ability. To be more specific, these models cannot process natural data directly in their raw form, and they require considerable domain expertise and engineering skill to design a feature extractor that transforms the raw data into a suitable internal representation [20]. Hence, it is desirable to develop more powerful machine learning algorithms that can not only be fed with raw data but also automatically discover the features needed for recognition.

Lately, deep learning has become a rapidly expanding research topic; it is part of a broader family of machine learning methods based on learning representations [21]. Deep learning can be generally understood as deep neural networks with multiple nonlinear layers, in which the features are learned from data through a general-purpose learning procedure rather than designed by human engineers [22]. Deep learning can solve both linear and nonlinear problems. One of the most famous breakthroughs made by deep learning was the computer program AlphaGo from Google DeepMind, which beat a professional player at the board game Go for the first time [23]. Additionally, as a current research hotspot, deep learning is making major advances in a wide variety of applications, such as image recognition based on convolutional neural networks (CNN) [24], natural language processing based on recurrent neural networks (RNN) [25], and speech recognition based on restricted Boltzmann machines (RBM) [26], as shown in Fig. 1. However, to the best of our knowledge, little deep learning-based work in the context of optical communication systems has been reported.

Fig. 1 The relationship among artificial intelligence, machine learning, and deep learning. CNN: convolutional neural network; RNN: recurrent neural network; RBM: restricted Boltzmann machine.

In this paper, we propose a deep learning-based intelligent constellation diagram analyzer, which can implement modulation format recognition (MFR) and OSNR estimation simultaneously. Earlier MFR and OSNR estimation schemes were mainly based on the techniques of amplitude histograms, delay-tap scatter plots, or two-tap sampling [27–31], but they could not process raw data directly and had to extract the corresponding features artificially. To realize real intellectualization and automation without manual intervention, we adopt the CNN as the deep learning algorithm and select the constellation diagram as the processing object. With its ability of feature extraction and self-learning, the CNN can directly process constellation diagrams in their raw data form (namely the pixel points of an image) without knowing other constellation diagram parameters or performing data statistics. A constellation diagram generation module in an oscilloscope and a simulation system are used to generate the constellation diagram images of six widely-used modulation formats: QPSK, 8PSK, 8QAM, 16QAM, 32QAM, and 64QAM. For the purpose of comparison, four other traditional machine learning algorithms (decision tree, SVM, KNN, and BP-ANN) are also evaluated. The results show that the CNN achieves higher accuracy and is significantly superior to the other machine learning methods. Moreover, an experiment for a QPSK and 16QAM coherent communication system is also set up. The experimental results show that the OSNR estimation errors for all the signals are less than 0.7 dB and the accuracy of MFR is 100%, demonstrating the feasibility of the CNN-based constellation diagram analyzer.

2. Operating principle of CNN-based constellation diagram analyzer

A convolutional neural network is a specialized type of neural network for processing data with a grid-like topology, such as image data, which can be regarded as a two-dimensional (2D) grid of pixels. The principle of the CNN-based constellation diagram analyzer for six different formats is illustrated in Fig. 2. The colored constellation diagram images are collected from an oscilloscope with a pixel size of 720 × 720. To reduce the computation load and enhance the generalization ability, each colored image is converted into grayscale and down-sampled to 28 × 28, as sketched below. After this preliminary processing, the low-resolution grayscale images are sent into the CNN as input data. In the CNN, each of the first few stages is composed of two layers: a convolutional layer and a pooling layer.
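As a minimal preprocessing sketch (an assumed implementation, not taken from the paper), the conversion from a captured 720 × 720 color image to a normalized 28 × 28 grayscale array could be written with Pillow and NumPy; the file name is illustrative only.

```python
# Preprocessing sketch: color constellation image -> 28 x 28 grayscale input.
import numpy as np
from PIL import Image

def preprocess(path, size=(28, 28)):
    img = Image.open(path).convert("L")                # color -> 8-bit grayscale
    img = img.resize(size, Image.BILINEAR)             # 720 x 720 -> 28 x 28
    return np.asarray(img, dtype=np.float32) / 255.0   # normalize pixels to [0, 1]

x = preprocess("constellation_16qam_20db.jpg")         # shape (28, 28), ready for the CNN
```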

Fig. 2 Schematic diagram of the CNN-based constellation diagram analyzer. Input layer: constellation diagram images with a pixel size of 28 × 28. Convolution layer 1 (C1): six 24 × 24 feature maps generated by six 5 × 5 kernels. Pooling layer 1 (P1): six 12 × 12 feature maps after subsampling over 2 × 2 regions. C2: twelve 8 × 8 feature maps generated by twelve 5 × 5 kernels. P2: twelve 4 × 4 feature maps after subsampling over 2 × 2 regions. Fully connected layer 1 (F1): 192 nodes transformed from all the pixels of P2, i.e., 12 × 4 × 4 = 192. Output layer: fully connected with all nodes of F1 and consisting of 22 nodes, among which 6 nodes are for modulation format recognition and 16 nodes for OSNR estimation. Note that all the activation functions in the CNN are sigmoid functions.

The convolutional layer is the core building block of a CNN. The parameters in this layer consist of a set of kernels, which have a small receptive field but extend through the full depth of the input image. During the forward pass, each kernel convolves with pixel points across the width and height of the input image, computing the dot product between the entries of the kernel and the input. The output units are organized into a 2D plane, called the feature map generated by that kernel. An example of 2D convolution is displayed in Fig. 3(a) [32]. A 2 × 2 kernel convolves with a 3 × 4 input image to produce a feature map with 6 units arranged in a 2 × 3 grid. Different from classic convolution in mathematics, the operation in a CNN is a discrete convolution, which can be viewed as multiplication by a matrix. If we think of the kernel as a feature detector, then all of the units in a feature map detect the same pattern but at different locations in the input image. This is because natural images have the key property of being stationary, meaning that the features learned at one part of the image can also be applied to other parts. In general, in order to build an effective model, several kernels are needed to detect multiple features so as to produce multiple feature maps in the convolutional layer.
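The following is a minimal NumPy sketch of the "valid" 2D convolution (cross-correlation form) described above, reproducing the toy example of Fig. 3(a): a 2 × 2 kernel slid over a 3 × 4 input yields a 2 × 3 feature map. The image and kernel values are arbitrary illustrations.

```python
# Valid 2D convolution sketch: slide the kernel over the image and take dot products.
import numpy as np

def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(12, dtype=float).reshape(3, 4)   # toy 3 x 4 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy 2 x 2 kernel
print(conv2d_valid(image, kernel).shape)           # (2, 3) feature map, as in Fig. 3(a)
```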

Fig. 3 (a) Schematic diagram of 2D convolution: a 2 × 2 kernel convolves with a 3 × 4 input image to produce a 2 × 3 feature map [32]; (b) max pooling: a 4 × 4 convolved feature map is divided into four disjoint 2 × 2 regions, and the maximum of each region is taken to generate a 2 × 2 pooled feature map.

After the feature extraction in the convolutional layer, the pooling layer merges semantically similar features into one. The output of the max pooling layer is given by the maximum activation over non-overlapping rectangular regions. Max pooling creates position invariance over larger local regions and down-samples the input image, leading to a faster convergence rate by selecting superior invariant features, which improves generalization performance and prevents the model from over-fitting [32]. For instance, in Fig. 3(b), each subsampling unit takes its inputs from a 2 × 2 region of the convolved feature map and outputs the maximum of those inputs to the pooled feature map. The receptive fields are chosen to be contiguous and non-overlapping, so the pooling layer has half as many rows and columns as the convolutional layer. A pooling function replaces the output of the net at a certain location with a summary statistic of the nearby outputs. Pooling can greatly reduce the complexity of parameter computation and significantly improve the statistical efficiency of the network.
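As a companion to the convolution sketch, the max pooling of Fig. 3(b) can be written in a few lines of NumPy (again an illustrative sketch, not code from the paper): a 4 × 4 convolved feature map is reduced to a 2 × 2 pooled map by taking the maximum of each disjoint 2 × 2 region.

```python
# Max pooling sketch: maximum over non-overlapping size x size regions.
import numpy as np

def max_pool(feature_map, size=2):
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)    # toy 4 x 4 convolved feature map
print(max_pool(fmap))                              # 2 x 2 pooled map, as in Fig. 3(b)
```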

Owing to the convolution and pooling operations, a CNN can detect the features of an image and recognize subtle differences among numerous pixel points. Generally, the pixel intensities of a constellation image contain a wealth of information, such as the modulation format and the OSNR value. Here, we take the two most widely-used formats (QPSK and 16QAM) as examples for illustration. The constellation diagrams of QPSK and 16QAM signals at OSNRs of 15 dB, 20 dB, and 25 dB are collected, respectively. After grayscale conversion and down-sampling, a 5 × 5 region at the same position of each image is extracted and displayed in Fig. 4. It is clearly seen that for the same format, the extracted regions at different OSNRs present different gray intensity values, while for the same OSNR, the regions in different formats also look different. For this image-processing case, a nonlinear method is necessary to recognize the features of the pixel points, even for information about linear impairments. Therefore, we consider that the CNN has the ability to recognize constellation diagrams.

Fig. 4 The constellation diagram images of QPSK and 16QAM signals at OSNR of 15, 20, 25 dB. At the same position of each constellation image, a 5 × 5 region is extracted and enlarged.

In a practical system, there may be several pairs of convolution and pooling layers. In our scheme of Fig. 2, we set two convolution layers (C1 and C2), both with 5 × 5 kernels, and two pooling layers (P1 and P2), both with 2 × 2 subsampling regions. In C1, 6 kernels produce 6 feature maps; in C2, 12 kernels generate 12 feature maps. After the second subsampling, the output of P2 is mapped into a one-dimensional layer (F1) consisting of 192 (12 × 4 × 4) neuron nodes. Finally, F1 is fully connected with the output layer, which comprises 22 nodes (6 nodes for MFR and 16 nodes for OSNR estimation). The whole network can be trained by the gradient descent method through back propagation to minimize the output error [22]. The detailed parameters of our designed CNN are described in the caption of Fig. 2.
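For concreteness, a sketch of the Fig. 2 structure in the Keras API of TensorFlow is given below. The paper only states that the TensorFlow library is used [33]; the exact code, optimizer, and loss function shown here are assumptions for illustration, not the authors' implementation.

```python
# Sketch of the Fig. 2 network (assumed Keras/TensorFlow implementation).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(6, (5, 5), activation="sigmoid"),   # C1: six 24 x 24 maps
    tf.keras.layers.MaxPooling2D((2, 2)),                      # P1: six 12 x 12 maps
    tf.keras.layers.Conv2D(12, (5, 5), activation="sigmoid"),  # C2: twelve 8 x 8 maps
    tf.keras.layers.MaxPooling2D((2, 2)),                      # P2: twelve 4 x 4 maps
    tf.keras.layers.Flatten(),                                 # F1: 12 x 4 x 4 = 192 nodes
    tf.keras.layers.Dense(22, activation="sigmoid"),           # 6 MFR bits + 16 OSNR bits
])
# Gradient-descent training against the 22-bit label vectors (loss choice is an assumption).
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
              loss="mean_squared_error")
```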

3. CNN detection system and results

First, we set up a simulation system based on VPI Transmission Maker 9.0 to generate six widely-used optical signals (QPSK, 8PSK, 8QAM, 16QAM, 32QAM, 64QAM) modulated by a pseudo-random binary sequence (PRBS) with a length of 2^16 at 25 Gbaud, as shown in Fig. 5. The reason why we choose these six formats is that all of them rely on coherent detection, so their information is reflected in their phases and amplitudes, which makes them suitable for constellation diagram analysis. An erbium-doped fiber amplifier (EDFA) with a noise figure of 6 dB and a gain of 16 dB is used to add amplified spontaneous emission (ASE) noise to the optical signal, and a variable optical attenuator (VOA) assists in adjusting the OSNR from 15 to 30 dB in steps of 1 dB (except for 64QAM, which requires higher OSNR values from 20 to 35 dB). To simulate real optical signals, a chromatic dispersion (CD) emulator is also used, ranging from −100 to 100 ps/nm in steps of 10 ps/nm. At the receiver, the signal is passed through an optical band pass filter (OBPF) with a bandwidth of 28 GHz and coherently detected by a 90° optical hybrid receiver connected to two balanced photodetectors (BPDs). The bandwidths of the modulator and receiver are both 30 GHz. After synchronous sampling by two analog-to-digital converters (ADCs), two electrical digital signals containing the in-phase (I) and quadrature (Q) information of the six signals are obtained. In order to present a visually realistic effect, we adopt the specialized constellation-diagram generation module from the oscilloscope, which can turn the I- and Q-components into the corresponding constellation diagram. As shown in Fig. 2, the constellation diagrams obtained from our module have a rendering effect similar to those captured from the oscilloscope, which guarantees the extendibility and feasibility of the simulated data sets. The generated constellation images are then sent into the CNN-based DSP module, where the TensorFlow™ library is selected to implement the CNN model [33].

Fig. 5 Simulation setup. CW: continuous wave; PRBS: pseudo-random binary sequence; EDFA: erbium-doped fiber amplifier; VOA: variable optical attenuator; ASE: amplified spontaneous emission; OC: optical coupler; OBPF: optical band pass filter; LO: local oscillator; A/D: analog-to-digital converter.

Based on the above system, we collect 100 constellation diagram images (in "jpg" format) for each OSNR value of each modulation format as training data. More specifically, each format has 1600 constellation diagram images covering 16 OSNR values (15~30 dB or 20~35 dB) as training data, and thus the entire training data set comprises 9600 (1600 × 6) images in total. Examples of constellation diagram images after grayscale conversion and down-sampling are displayed in Fig. 6. It is seen that the constellation diagrams reveal the different modulation formats, and careful analysis of this visual display can also give a first-order approximation of the OSNR. In the training data set, each image has a label vector. In the CNN, the label vector is composed of several binary bits, and the number of bits generally equals the number of classes. In our scheme, there are 6 modulation formats, so we use the first 6 bits to denote the modulation format (QPSK: 000001, 8PSK: 000010, 8QAM: 000100, 16QAM: 001000, 32QAM: 010000, 64QAM: 100000). Among these 6 bits, 5 bits are "0" and only one is "1", whose position represents the corresponding format class. Similarly, the 16 OSNR values (from 15 dB to 30 dB) require 16 bits to represent the corresponding OSNR (15 dB: 0000000000000001, 16 dB: 0000000000000010, …, 30 dB: 1000000000000000). Therefore, the output label vector consists of 22 bits in total; a small sketch of this encoding follows below. During the training course, the effective features (such as the number of constellation points or the size of each constellation point) are extracted gradually by the CNN as introduced above. To minimize the errors between the ideal labels and the actual output labels, the parameters in the network are adjusted gradually via backpropagation using the gradient descent method. An epoch is a step in the CNN training course in which the whole training data set is presented once to the CNN for parameter adjustment. Too few epochs may result in poor performance due to incomplete parameter adjustment, while too many epochs increase the computation time as well as the risk of over-fitting. To investigate the effect of the number of epochs on CNN performance, another 9600 constellation diagram images are collected as test data to measure the accuracies of MFR and OSNR estimation.
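A minimal sketch of the 22-bit label construction described above (the one-hot scheme is from the text; the element ordering and function name are illustrative assumptions):

```python
# Label-vector sketch: 6 one-hot bits for the format + 16 one-hot bits for the OSNR.
import numpy as np

FORMATS = ["QPSK", "8PSK", "8QAM", "16QAM", "32QAM", "64QAM"]

def make_label(fmt, osnr_db, osnr_start=15):
    label = np.zeros(22, dtype=np.float32)
    label[5 - FORMATS.index(fmt)] = 1.0         # QPSK -> 000001, ..., 64QAM -> 100000
    label[21 - (osnr_db - osnr_start)] = 1.0    # 15 dB -> ...0001, ..., 30 dB -> 1000...
    return label

print(make_label("QPSK", 15))    # format bits 000001, OSNR bits 0000000000000001
# For 64QAM the OSNR range starts at 20 dB, e.g. make_label("64QAM", 20, osnr_start=20).
```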

Fig. 6 The collected constellation diagram images of QPSK, 8PSK, 8QAM, 16QAM, 32QAM, 64QAM at 16 OSNR values ranging from 15 to 30 dB at the step of 1 dB (except for 64QAM, which requires higher OSNR values ranging from 20 to 35 dB).

3.1 OSNR estimation

First, we measure the accuracies of OSNR estimation for different numbers of epochs, as shown in Fig. 7. It is evident that the accuracies of the six formats increase as the number of epochs grows. CNNs trained for different numbers of epochs exhibit different recognition capabilities. When the number of epochs is small, the parameter adjustment is still unfinished due to the limited number of iterations, resulting in relatively low accuracies; as the epochs grow, the parameters are further optimized, contributing to gradually improved performance. When the number of epochs reaches 100, the accuracies of the QPSK, 8PSK, 8QAM, 16QAM, and 32QAM signals are over 90%; when it exceeds 200, the accuracies of these five formats all attain 99%, close to error-free results. However, note that the 64QAM signal achieves a relatively lower accuracy. This is because, compared with the other formats, 64QAM has more constellation points, representing more features, and therefore requires more feature detectors (i.e., convolution kernels in the CNN). In our system, all the formats are trained with the same network structure and detected by the same number of kernels. Thus, for the 64QAM signal, the feature detectors are comparatively insufficient, resulting in suboptimal estimation performance. But at 200 epochs, the accuracy of 64QAM also exceeds 95%, which is an acceptable result.

Fig. 7 Accuracy of OSNR estimation as a function of epochs for QPSK, 8PSK, 8QAM, 16QAM, 32QAM, 64QAM.

In order to demonstrate the comparative advantage of the CNN, four other traditional machine learning algorithms [15–19], i.e., decision tree, DW-KNN, SVM, and BP-ANN, are also applied to OSNR estimation, as shown in Fig. 8. The detailed parameters of each algorithm are described in the caption of Fig. 8. The histogram shows that the CNN outperforms all the other algorithms. Decision trees are easy to interpret and fast, but they yield the lowest estimation accuracy. DW-KNN is typically accurate in low dimensions, but not in high dimensions. SVM achieves suboptimal estimation accuracy and memory usage with few support vectors, but SVM is inherently a binary classifier, which means that numerous SVMs are necessary to cover multiple OSNR values. Although BP-ANN is also developed from neural networks, it lacks the capacity for feature extraction, requires more training data, and easily falls into local minima and suffers from over-fitting. Compared with these methods, the CNN is less sensitive to the variance of the input data, can construct more robust networks to avoid over-fitting, and automatically extracts features of the input data, especially for image data.

Fig. 8 Performance comparison between the CNN and four other machine learning algorithms. Decision tree: maximum number of splits is 100. BP-ANN: percentage of training data is 70%, activation function is the sigmoid function, maximum failure number is 3, and the numbers of neurons in the input, hidden, and output layers are 784, 50, and 16. DW-KNN: number of neighbors is 10, distance weight is squared inverse. SVM: kernel is the radial basis function (RBF), the multiclass method is one-vs-all, and the kernel scale is 10. CNN: the number of epochs is fixed at 200; other parameters are kept as above.

In addition, the model complexity is also taken into account. The complexity of a machine learning algorithm usually consists of the training complexity and the testing complexity, as summarized in Table 1. The training complexities are O(n log₂ n) for the decision tree, O(n) for BP-ANN, zero for KNN (no training is needed), O(n³) for SVM, and O(n) for the CNN, where n is the size of the training data set; their testing complexities are O(1), O(1), O(n²), O(n³), and O(1), respectively [34,35]. The training process often takes a lot of time, but it can be accomplished before practical measurement. The testing complexity of the CNN is O(1), which is feasible in practice.

Table 1. The model complexity of machine learning algorithms; n is the size of the training data set

Accordingly, we calculate the computation time for these algorithms on an ordinary computer (Intel Core i3-2100 CPU and 4.00 GB RAM). Both the training time and the testing time are summarized in Table 2. It is seen that KNN is free of a training process, but its test time is too long (~7.34 s), which means KNN is not practical for real-time processing. For the other algorithms, the training time is always much longer than the test time. The training time of the CNN is 6.21 s/epoch, while its test time is 0.49 s. Though the training time is relatively long, the key factor that affects processing performance is the test time: once the model training is finished, the model can be used for MFR or OSNR estimation directly with no need for retraining. Therefore, the test time of the CNN, less than 0.5 s, is an acceptable speed for real-time processing and can be further improved by using a high-performance computer or a graphics processing unit (GPU).

Table 2. The training time and test time for different algorithms

Then we measure the proportion of misestimated samples at each OSNR value. We collect all the error samples produced by the CNNs trained for 196, 197, 198, 199, and 200 epochs. According to the original labels, the true OSNR values of the error samples are known, so the number of error samples at each OSNR can be counted. The percentages of error samples at each OSNR are then calculated, and the statistical histograms are shown in Fig. 9 (a sketch of this counting procedure follows below). It is clearly seen that for all the formats (except 64QAM), there are no error samples at small OSNR values (15~21 dB), and all the error samples occur in the large OSNR region (22~30 dB). This suggests that, with the constellation analysis method, small OSNR values can be estimated precisely while large ones suffer from a higher risk of misestimation. The reason is that at small OSNRs, each 1 dB increase noticeably improves the constellation diagram, leading to an obvious difference between adjacent OSNRs, which is beneficial for CNN recognition. A larger OSNR corresponds to a better constellation diagram with more concentrated constellation points and a smaller EVM. But once the OSNR exceeds a certain level, further increases have only a mild effect on the constellation diagram, which is analogous to the theory of diminishing marginal utility in economics [36]. In this case, the features presented by adjacent OSNRs are very similar, which is difficult for the CNN to distinguish and increases the probability of misclassification. For 64QAM signals, more error samples are generated due to the insufficient feature detectors. Additionally, owing to the larger OSNR starting value (20 dB), the error samples occupy the whole OSNR region, resulting in a relatively even distribution. Besides, we also study the effect of the OSNR step size on the sample errors. It is found that when the OSNR step is 2 dB, the error rate drops to 0%, which means the error samples in the 1-dB-step case occur at adjoining OSNR values. Therefore, when the step is enlarged to 2 dB, the misclassified samples turn out to be correct.
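A small sketch of the error-sample statistics described above (variable and function names are illustrative; true_osnr and pred_osnr would hold the labelled and CNN-estimated OSNR, in dB, of every test image pooled over the models trained at 196–200 epochs):

```python
# Error-histogram sketch: percentage of misestimated samples at each true OSNR value.
import numpy as np

def error_histogram(true_osnr, pred_osnr, osnr_values=range(15, 31)):
    true_osnr, pred_osnr = np.asarray(true_osnr), np.asarray(pred_osnr)
    wrong = true_osnr[true_osnr != pred_osnr]                    # true OSNRs of error samples
    counts = np.array([np.sum(wrong == v) for v in osnr_values], dtype=float)
    return 100.0 * counts / max(counts.sum(), 1.0)               # percentages, as in Fig. 9
```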

Fig. 9 The percentages of error samples at each OSNR for (a) QPSK, (b) 8PSK, (c) 8QAM, (d) 16QAM, (e) 32QAM, and (f) 64QAM.

3.2 Modulation format recognition

Next, the CNN performance for MFR is investigated. We set the number of epochs to 1 and measure the MFR accuracies when the training data of each format are 800 and 1600, respectively, as shown in Fig. 10 (a sketch of the confusion-matrix evaluation follows below). The confusion matrix in Fig. 10(a) shows that the reduced training data result in lower accuracies due to the limited learning samples. Because of their similar square-shaped constellations, 7.6% of the 16QAM constellation diagrams are misclassified as 64QAM, reducing the total accuracy to 98.7%. When the training data rise to 1600, the accuracies return to 100%, as shown in Fig. 10(b). It is found that, compared with OSNR estimation, modulation formats are much easier to recognize. This is because different formats present notably different features in their constellation diagrams, so the CNN can distinguish the six formats more easily even at low epochs.
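A sketch of the confusion-matrix evaluation for MFR, assuming scikit-learn is available; y_true and y_pred would be the labelled and CNN-predicted format indices of the 1600 test images (names are illustrative):

```python
# Confusion-matrix sketch for MFR, normalized to row percentages as in Fig. 10.
import numpy as np
from sklearn.metrics import confusion_matrix

FORMATS = ["QPSK", "8PSK", "8QAM", "16QAM", "32QAM", "64QAM"]

def mfr_confusion(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred, labels=range(len(FORMATS)))
    # Each row sums to 100%, e.g. 7.6% of 16QAM samples read as 64QAM in Fig. 10(a).
    return 100.0 * cm / cm.sum(axis=1, keepdims=True)
```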

Fig. 10 Confusion matrices for MFR under different amounts of training data; the epoch is set to 1 and the test data are kept at 1600: (a) training data of each format is 800, with 7.6% of 16QAM misclassified as 64QAM; (b) training data is 1600 and the total accuracy is 100%.

To explore the limit of CNN recognition for MFR, we further reduce the training data of each format to 800, 400, 320, 200, and finally 160, and then measure the accuracies at different epochs, as shown in Fig. 11. At the same epoch, smaller training data sizes give poorer performance; as the number of epochs grows, the accuracies rise and eventually reach 100% for all data sizes. Accordingly, even with a small training data size, error-free results can still be achieved by adding a few more epochs (no more than 6), which demonstrates the outstanding performance of the CNN for MFR.

Fig. 11 The MFR accuracies at different epochs for different training data sizes. The training data sizes of each format are 160, 200, 320, 400, and 800.

3.3 Image resolution and network structure

Then we investigate the effect of other factors on CNN performance, including the image resolution and the network structure, focusing on the more complicated OSNR estimation rather than MFR. First, only the resolution of the input images is changed while the CNN structure and other parameters are kept unchanged. Here, we take the 16QAM constellation diagram as an example. The resolution of the original colored 16QAM images is 720 × 720; they are then converted to grayscale and down-sampled to resolutions of 16 × 16, 28 × 28, 40 × 40, 56 × 56, and 84 × 84. The accuracies of OSNR estimation at different epochs for these five resolutions are shown in Fig. 12(a). It is seen that at the same epoch, a higher resolution yields a higher estimation accuracy. Higher-resolution images provide greater sharpness, and the larger pixel arrays contain richer information, which is helpful for CNN detection. However, a strange phenomenon is found: when the resolution reaches 84 × 84, the accuracy stays around 6.25%, meaning that the CNN fails. To figure out the reason, we modify the subsampling scale of the two pooling layers from 2 to 4 (namely, the subsampling region is extended from 2 × 2 to 4 × 4), which compresses the data size in the hidden layers and thus reduces the number of parameters by nearly half (a sketch of this modified network follows below). For the input images with a resolution of 84 × 84, when the subsampling scale is increased to 4, the CNN performance is recovered and the accuracy rises again, as shown in Fig. 12(b). This demonstrates that an excessively high resolution produces too many parameters to be adjusted. As analyzed in Section 2, the process of training a CNN is a process of parameter adjustment. The huge parameter quantity poses a great challenge for network training and parameter adjustment even at 100 epochs. In this situation, the insufficiently trained CNN suffers from under-fitting. When the subsampling scale of the pooling layers is increased, the overall parameter quantity is reduced significantly and a large amount of redundant information is removed. After proper training with a reasonable number of epochs, the CNN is as effective as before. Hence, high resolution is a double-edged sword, increasing the quantity of image information but also increasing the difficulty of parameter adjustment.
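A sketch of the modified network used for the 84 × 84 inputs, written with the same assumed Keras API as before: enlarging the pooling regions from 2 × 2 to 4 × 4 shrinks the hidden-layer feature maps and reduces the number of parameters that must be adjusted. The layer sizes in the comments follow from the stated kernel and pooling dimensions.

```python
# Sketch of the high-resolution variant: 84 x 84 input with 4 x 4 pooling regions.
import tensorflow as tf

model_hi_res = tf.keras.Sequential([
    tf.keras.Input(shape=(84, 84, 1)),
    tf.keras.layers.Conv2D(6, (5, 5), activation="sigmoid"),   # six 80 x 80 maps
    tf.keras.layers.MaxPooling2D((4, 4)),                      # six 20 x 20 maps
    tf.keras.layers.Conv2D(12, (5, 5), activation="sigmoid"),  # twelve 16 x 16 maps
    tf.keras.layers.MaxPooling2D((4, 4)),                      # twelve 4 x 4 maps
    tf.keras.layers.Flatten(),                                 # 12 x 4 x 4 = 192 nodes
    tf.keras.layers.Dense(22, activation="sigmoid"),
])
```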

Fig. 12 (a) The OSNR estimation accuracy at different epochs for input images with different resolutions of 16 × 16, 28 × 28, 40 × 40, 56 × 56, 84 × 84; (b) for the resolution of 84 × 84, the OSNR estimation accuracy at different epochs for the subsample scales of 4 and 2 in pooling layers.

Next, we study the effect of the network structure on CNN performance. The network structure here mainly refers to the numbers of feature maps in the first and second convolutional layers (Conv1, Conv2). All of the above networks use the structure (6, 12), and now we construct new CNNs with different structures: (3, 6), (12, 24), and (24, 36). Based on these different-structure CNNs, the OSNR estimation accuracies at different epochs are measured for QPSK and 16QAM signals, as shown in Figs. 13(a) and 13(b). Interestingly, the two formats show different results. For QPSK, the small-scale structure (3, 6) achieves the best result, while for 16QAM, the medium-scale structure (6, 12) obtains the optimal outcome. In a CNN, a feature map represents a feature extracted from the input image. For a constellation diagram, the features mainly refer to the number, size, and shape of the constellation points. Comparatively, QPSK, with only four constellation points, contains fewer representations and requires fewer feature extractors. Thus the small-scale structure (3, 6) is good enough for QPSK, and the medium-scale structure (6, 12) is better for 16QAM, while the large-scale structures, such as (12, 24) and (24, 36), are not necessary for either of them. Instead of detecting more useful features, the larger-scale structures increase the overall parameter quantity, resulting in suboptimal results for both formats, as shown in Fig. 13.

Fig. 13 The OSNR estimation accuracy at different epochs with different network structures of (3, 6), (6, 12), (12, 24), (24, 36): (a) QPSK and (b) 16QAM.

As analyzed above, we can conclude that higher-resolution images and more complicated network structures do not always achieve better results. It is important to select an appropriate resolution and design a reasonable network structure in accordance with the specific situation. For constellation diagram analysis, comprehensively considering the factors of image quality, training difficulty, and processing complexity, we consider a resolution of 28 × 28 and a structure of (6, 12) to be a reasonable and practical option.

4. Experimental setup and results

To demonstrate the feasibility and practicality of the proposed scheme, we also set up a physical experiment for QPSK and 16QAM signals based on a coherent optical communication system, as shown in Fig. 14. The I-Q modulator-based transmitter (Tektronix OM5110), capable of generating QPSK and 16QAM signals, is driven by an arbitrary waveform generator (Tektronix AWG7001A) at 25 Gbaud. An external cavity laser (ECL, EXFO FLS2800) with a linewidth of less than 100 kHz is used as the light source at 1550 nm. The modulated signals are sent to an EDFA. Then, through an optical coupler (OC), an additional ASE noise source is added to the generated signals, and with the assistance of a variable optical attenuator (VOA), the received OSNR is adjusted from 15 to 30 dB. The received OSNR is measured by an optical spectrum analyzer (OSA, Yokogawa AQ6370B) based on the out-of-band monitoring method [37]. After the OSNR measurement, the QPSK and 16QAM signals are detected by a coherent receiver (Tektronix OM4106D). The signals are digitized by a 33-GHz-bandwidth oscilloscope (OSC, Tektronix MSO73304DX), where the resulting constellation diagrams are displayed. The software interface of this OSC is open, so our constellation diagram generation module is embedded in it to generate constellation images similar to the previous training data. Finally, for both QPSK and 16QAM, 20 constellation images at each OSNR value are collected and input to the CNN-based constellation analysis module.

Fig. 14 Experimental setup. ECL: external cavity laser; AWG: arbitrary waveform generator; EDFA: erbium-doped fiber amplifier; VOA: variable optical attenuator; ASE: amplified spontaneous emission; OC: optical coupler; LO: local oscillator; A/D: analog-to-digital converter; OSA: optical spectrum analyzer; OSC: oscilloscope.

In the constellation analysis module, we use the trained CNN model (from Section 3.1) to estimate the OSNR of the test data captured from the OSC. As described above, the CNN was trained with 1600 images per format (100 × 16), and the test data set contains 320 images (20 × 16) per format. Figure 15 shows the OSNR estimated by the CNN as a function of the measured OSNR; the red circles denote the average of the estimated OSNR over the test data set (a sketch of this evaluation follows below). Consistent with the previous results, the large estimation errors mainly arise in the large OSNR region: from 27 to 30 dB for QPSK and from 25 to 30 dB for 16QAM. The maximum error is 0.6 dB for QPSK and 0.7 dB for 16QAM, so the error at each OSNR is kept within 1 dB for both formats. Moreover, it is a simple task to distinguish QPSK from 16QAM signals, so the accuracy of MFR is 100%. The satisfactory accuracy proves that the proposed scheme can be successfully used to estimate OSNR and recognize modulation formats.
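A small sketch of the evaluation described above (variable names are illustrative): the CNN estimate is averaged over the 20 test images collected at each measured OSNR, and the largest deviation from the measured value is reported.

```python
# OSNR-error sketch: average the estimates per measured OSNR and find the maximum error.
import numpy as np

def osnr_errors(measured_osnr, estimated_osnr):
    measured = np.asarray(measured_osnr, dtype=float)    # measured OSNR of each test image (dB)
    estimated = np.asarray(estimated_osnr, dtype=float)  # CNN estimate for the same image (dB)
    values = np.unique(measured)
    mean_est = np.array([estimated[measured == v].mean() for v in values])
    return values, mean_est, float(np.max(np.abs(mean_est - values)))  # max error in dB
```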

Fig. 15 Experimental results for OSNR estimation by CNN: (a) QPSK and (b) 16QAM.

Finally, QPSK and 16QAM signals with an OSNR of 20 dB are used as examples to look inside the convolutional network, as shown in Fig. 16. The QPSK and 16QAM constellation diagram images are input into the trained CNN. Through the convolution and pooling processing, several feature maps are generated. All the feature maps generated in convolution layer 1, pooling layer 1, convolution layer 2, and pooling layer 2 are displayed in Fig. 16. These feature maps present the features extracted at different stages and help us understand the detailed process of CNN operation.

Fig. 16 The feature maps generated in convolution layer 1, pooling layer 1, convolution layer 2, and pooling layer 2: (a) QPSK with OSNR of 20 dB; (b) 16QAM with OSNR of 20 dB.

5. Conclusion

In this paper, a CNN-based constellation diagram analyzer was proposed to implement both MFR and OSNR estimation functionalities. Six widely-used modulation formats (QPSK, 8PSK, 8QAM, 16QAM, 32QAM, and 64QAM) were comprehensively evaluated. Compared with four other machine learning methods (decision tree, DW-KNN, SVM, and BP-ANN), the CNN achieved better results. For OSNR estimation, high accuracies were obtained at 200 epochs (95% for 64QAM and over 99% for the other five formats); for MFR, 100% accuracies were achieved even with less training data at fewer epochs. Additionally, the effects of multiple factors on CNN performance were separately studied, including the training data size, image resolution, and network structure. Moreover, an experiment with QPSK and 16QAM signals was also conducted. The experimental results showed that the OSNR estimation errors for all the signals were less than 0.7 dB and the accuracy of MFR was 100%, proving the feasibility of the proposed scheme.

In addition to OSNR estimation and MFR, other metrics (such as EVM estimation, BER calculation, and linear or nonlinear impairment monitoring) can also be analyzed with the CNN-based constellation diagram analyzer in future work. We believe that the proposed technique has the potential to be embedded in test instruments to implement intelligent signal analysis, or applied in an OPM module to ensure robust network operation.

Funding

National Natural Science Foundation of China (NSFC) Project No. 61372119.

References and links

1. J. Thrane, J. Wass, M. Piels, J. C. Diniz, R. Jones, and D. Zibar, “Machine Learning Techniques for Optical Performance Monitoring from Directly Detected PDM-QAM Signals,” J. Lightwave Technol. 35(4), 868–875 (2017). [CrossRef]  

2. G. Breed, “Analyzing signals using the eye diagram,” High Freq. Electron. 4(11), 50–53 (2005).

3. P. J. Winzer, “High-spectral-efficiency optical modulation formats,” J. Lightwave Technol. 30(24), 3824–3835 (2012). [CrossRef]  

4. M. Sköld, J. Yang, H. Sunnerud, M. Karlsson, S. Oda, and P. A. Andrekson, “Constellation diagram analysis of DPSK signal regeneration in a saturated parametric amplifier,” Opt. Express 16(9), 5974–5982 (2008). [CrossRef]   [PubMed]  

5. S. Amiralizadeh, A. Yekani, and L. A. Rusch, “Discrete multi-tone transmission with optimized QAM constellations for short-reach optical communications,” J. Lightwave Technol. 34(15), 3515–3522 (2016). [CrossRef]  

6. D. Che and W. Shieh, “Entropy-Loading: The Multi-Carrier Constellation-Shaping for Colored-SNR Optical Channels,” in Optical Fiber Communication Conference (Optical Society of America, 2017), paper Th5B. 4. [CrossRef]  

7. R. Schmogrow, B. Nebendahl, M. Winter, A. Josten, D. Hillerkuss, S. Koenig, J. Meyer, M. Dreschmann, M. Huebner, C. Koos, J. Becker, W. Freude, and J. Leuthold, “Error vector magnitude as a performance measure for advanced modulation formats,” IEEE Photonics Technol. Lett. 24(1), 61–63 (2012). [CrossRef]  

8. S. Zhang, “On the Use of GMI to Compare Advanced Modulation Formats,” in Optical Fiber Communication Conference (Optical Society of America, 2017), paper M3C. 6. [CrossRef]  

9. T. Liu and I. B. Djordjevic, “Optimal signal constellation design for ultra-high-speed optical transport in the presence of nonlinear phase noise,” Opt. Express 22(26), 32188–32198 (2014). [CrossRef]   [PubMed]  

10. P. Harrington, Machine Learning in Action (Manning Publications, 2012).

11. D. Zibar and C. Schäffer, “Machine Learning Concepts in Coherent Optical Communication Systems,” in Signal Processing in Photonic Communications (SSPCom, 2014), paper ST2D.1.

12. D. Zibar, M. Piels, R. Jones, and C. G. Schäeffer, “Machine learning techniques in optical communication,” J. Lightwave Technol. 34(6), 1442–1452 (2016). [CrossRef]  

13. D. Zibar, O. Winther, N. Franceschi, R. Borkowski, A. Caballero, V. Arlunno, M. N. Schmidt, N. G. Gonzales, B. Mao, Y. Ye, K. J. Larsen, and I. T. Monroy, “Nonlinear impairment compensation using expectation maximization for dispersion managed and unmanaged PDM 16-QAM transmission,” Opt. Express 20(26), B181–B196 (2012). [CrossRef]   [PubMed]  

14. N. G. Gonzalez, D. Zibar, A. Caballero, and I. T. Monroy, “Experimental 2.5-Gb/s QPSK WDM Phase-Modulated Radio-Over-Fiber Link With Digital Demodulation by a K -Means Algorithm,” IEEE Photonics Technol. Lett. 22(5), 335–337 (2010). [CrossRef]  

15. L. Barletta, A. Giusti, C. Rottondi, and M. Tornatore, “QoT estimation for unestablished lightpaths using machine learning,” in Optical Fiber Communications Conference and Exhibition (Optical Society of America, 2017), paper Th1J. 1.

16. D. Wang, M. Zhang, Z. Li, Y. Cui, J. Liu, Y. Yang, and H. Wang, “Nonlinear decision boundary created by a machine learning-based classifier to mitigate nonlinear phase noise,” in European Conference on Optical Communication (ECOC 2015), paper P.3.1. [CrossRef]  

17. D. Wang, M. Zhang, Z. Cai, Y. Cui, Z. Li, H. Han, M. Fu, and B. Luo, “Combatting nonlinear phase noise in coherent optical systems with an optimized decision processor based on machine learning,” Opt. Commun. 369, 199–208 (2016). [CrossRef]  

18. D. Wang, M. Zhang, M. Fu, Z. Cai, Z. Li, H. Han, Y. Cui, and B. Luo, “Nonlinearity Mitigation Using a Machine Learning Detector Based on k-Nearest Neighbors,” IEEE Photonics Technol. Lett. 28(19), 2102–2105 (2016). [CrossRef]  

19. D. Wang, M. Zhang, Z. Li, C. Song, M. Fu, J. Li, and X. Chen, “System impairment compensation in coherent optical communications by using a bio-inspired detector based on artificial neural network and genetic algorithm,” Opt. Commun. 399, 1–12 (2017). [CrossRef]  

20. C. M. Bishop, Pattern Recognition and Machine Learning (Springer, 2006).

21. J. Schmidhuber, “Deep learning in neural networks: an overview,” Neural Netw. 61, 85–117 (2015). [CrossRef]   [PubMed]  

22. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015). [CrossRef]   [PubMed]  

23. D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature 529(7587), 484–489 (2016). [CrossRef]   [PubMed]  

24. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 1097–1105 (2012).

25. R. Socher, C. C. Lin, C. Manning, and A. Y. Ng, “Parsing natural scenes and natural language with recursive neural networks,” in Proceedings of the 28th international conference on machine learning (ICML-11), 129–136 (2011).

26. G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Process. Mag. 29(6), 82–97 (2012). [CrossRef]  

27. F. N. Khan, K. Zhong, W. H. Al-Arashi, C. Yu, C. Lu, and A. P. T. Lau, “Modulation Format Identification in Coherent Receivers Using Deep Machine Learning,” IEEE Photonics Technol. Lett. 28(17), 1886–1889 (2016). [CrossRef]  

28. M. C. Tan, F. N. Khan, W. H. Al-Arashi, Y. Zhou, and A. P. T. Lau, “Simultaneous optical performance monitoring and modulation format/bit-rate identification using principal component analysis,” J. Opt. Commun. Netw. 6(5), 441–448 (2014). [CrossRef]  

29. R. S. Luis, A. Teixeira, and P. Monteiro, “Optical Signal-to-Noise Ratio Estimation Using Reference Asynchronous Histograms,” J. Lightwave Technol. 27(6), 731–743 (2009). [CrossRef]  

30. F. N. Khan, A. P. T. Lau, Z. Li, C. Lu, and P. K. A. Wai, “Statistical Analysis of Optical Signal-to-Noise Ratio Monitoring Using Delay-Tap Sampling,” IEEE Photonics Technol. Lett. 22(3), 149–151 (2010). [CrossRef]  

31. Z. Dong, F. N. Khan, Q. Sui, K. Zhong, C. Lu, and A. P. T. Lau, “Optical performance monitoring: A review of current and future technologies,” J. Lightwave Technol. 34(2), 525–543 (2016). [CrossRef]  

32. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, 2016).

33. M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, and M. Devin, “Tensorflow: Large-scale machine learning on heterogeneous distributed systems,” arXiv:1603.04467 (2016).

34. A. Abdiansah and R. Wardoyo, “Time Complexity Analysis of Support Vector Machines (SVM) in LibSVM,” Int. J. Comput. Appl. 128(3), 1–7 (2015).

35. Z. Zhu and A. K. Nandi, Automatic modulation classification: principles, algorithms and applications (John Wiley & Sons, 2014).

36. E. Kauder, History of Marginal Utility Theory (Princeton University Press, 2015).

37. D. Kilper, R. Bach, D. Blumenthal, D. Einstein, T. Landolsi, L. Ostar, M. Preiss, and A. Willner, “Optical performance monitoring,” J. Lightwave Technol. 22(1), 294–304 (2004). [CrossRef]  
