Modulation format recognition in a UVLC system based on reservoir computing with coordinate transformation and folding algorithm

Fujie Li; Xianhao Lin; Jianyang Shi; Ziwei Li; Nan Chi

doi:10.1364/OE.491377

1. Introduction

Underwater visible light communication (UVLC) is a new underwater wireless communication technology, which uses visible light as a carrier to transmit information through underwater channels [1–3]. Compared with traditional underwater communication technology, UVLC allows for a higher communication rate, stronger anti-electromagnetic interference ability, and better communication stability [4]. In the foreseeable future, UVLC is envisioned to be dynamic and heterogeneous to meet the demands of various frequency bands and improve spectrum utilization [5].

As one of the key technologies that directly influence data transmission quality and speed in UVLC, modulation format recognition (MFR) aims to detect modulation formats autonomously, without any prior information from the transmitter, and thereby help the receiver adjust its demodulation mode [6]. However, due to the lack of prior information from the transmitter, blind classification of modulation formats is a difficult task. Over the years, various methods have been proposed to improve MFR performance. These methods can be roughly divided into two categories: likelihood-based (LB) methods and feature-based (FB) methods [7]. With a correct formulation of the MFR problem and the selection of appropriate thresholds, LB methods minimize the probability of misrecognition, thus providing an optimal solution in the Bayesian sense [8–10]. However, LB methods involve higher computational complexity and are not easy to implement. In contrast, FB methods extract salient features from the received signals and use these features to recognize the modulation format. Although FB methods are suboptimal in the Bayesian sense, they are easy to implement and can achieve near-optimal recognition performance with well-designed feature extraction [11]. However, manually designing powerful feature extraction algorithms that further improve recognition performance is itself a difficult problem.

Recently, with the rapid development of deep learning, various neural networks (NNs) have been applied to MFR owing to their strong feature-fitting ability. In [12], a deep neural network (DNN) based algorithm is applied to the amplitude histograms of signals to achieve high MFR accuracy. In [13], a convolutional neural network (CNN) based method is proposed that processes eye diagrams from the perspective of image processing and obtains high MFR performance. These well-designed NNs automatically extract features from the input data to accurately determine modulation formats.

However, given the unique complexity of the underwater channel, NNs tend to be designed with deep layers and complex architectures to reach good MFR accuracy, which makes them time-consuming and computationally expensive. Besides, NNs are data-driven, so their convergence and discrimination accuracy rely on a huge amount of training data. Yet in many practical applications of UVLC, it is difficult to collect enough data to sufficiently train an NN. Therefore, a better MFR algorithm is sought that achieves high discrimination accuracy with stronger feature extraction ability and lower complexity.

In this paper, to alleviate the above bottlenecks, a novel MFR algorithm is proposed that employs reservoir computing (RC) together with powerful feature extraction algorithms, namely coordinate transformation and a folding algorithm. RC [14–16] is a special recurrent neural network in which the connection weights between the input layer and the reservoir, and those between internal nodes, are randomly generated and kept fixed during the whole training process. The only parameters that need to be trained in RC are the linear connection weights between the reservoir and the output layer. Thanks to this simple structure, we are able to dramatically reduce the time cost and computational overhead of both training and inference, and achieve better performance with less training data than a much more complex CNN. Meanwhile, to better extract and enrich the characteristics of the input data, we combine coordinate transformation with the folding algorithm. Coordinate transformation better characterizes the salient features of the input data by representing it in different coordinate systems, and the folding algorithm reduces repetitive information and highlights local features by folding the feature map of the original input data. Both methods make the characteristics of the input data more prominent and further improve the discrimination accuracy of the MFR task. Experiments are conducted to investigate the performance of our algorithm in a UVLC system, where six modulation formats are recognized: OOK, 4QAM, 8QAM-DIA, 8QAM-CIR, 16APSK, and 16QAM. Through experiments in UVLC channels under various complex conditions, we demonstrate that our RC-based methods take only a few seconds to train for MFR tasks and that, across different pin voltages of the LED, the accuracy exceeds 90% in almost all cases and approaches 100% at best, which demonstrates the superiority, portability, and robustness of our methods.

2. Principle

2.1 Modulation format recognition

In the MFR task, the most classical method is to split the received signal into IQ samples [7]. Denote the received complex signals as $y = \{ y(i )\} _{i = 1}^N$ (where N is the length of the received signal); then we can split each $y(i)$ into its real part and its imaginary part, as in:

(1)$$\begin{array}{l} {\textrm{In-phase}} = {\textrm{Real}}[y(i)]\\ {\textrm{Quadrature}} = {\textrm{Imag}}[y(i)]. \end{array}$$

In this way, the received one-dimensional complex signals are decomposed into two-dimensional data (the IQ samples). The MFR task then becomes learning from these two-dimensional IQ samples to determine the modulation format of the received data. For instance, a set of training data can be denoted as

(2)$$D = \{ ({x_1},{y_1}),({x_2},{y_2}),\ldots ,({x_m},{y_m})\} ,$$

where m is the number of training examples, each ${x_i} \in X$ is a set of $2 \times N$ dimensional IQ samples, and each ${y_i} \in Y$ is the modulation format of ${x_i}$. The task of MFR is to find a mapping function f from X to Y that minimizes the error between f(X) and Y.
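As a minimal sketch of this data layout, the snippet below splits a short complex sequence into a 2 × N array and pairs it with a class label. The function name `split_iq`, the sample values, and the integer label encoding are illustrative assumptions, not taken from the experiments.

```python
from typing import List, Sequence, Tuple


def split_iq(y: Sequence[complex]) -> List[List[float]]:
    """Return a 2 x N list: row 0 holds the real parts, row 1 the imaginary parts."""
    return [[s.real for s in y], [s.imag for s in y]]


# Hypothetical received symbols (illustrative values, not measured data).
received = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
x = split_iq(received)

# One labelled example (x_i, y_i) of the training set D in Eq. (2);
# the label is encoded as an index into the list of formats (an assumption).
FORMATS = ["OOK", "4QAM", "8QAM-DIA", "8QAM-CIR", "16APSK", "16QAM"]
example: Tuple[List[List[float]], int] = (x, FORMATS.index("4QAM"))
```

A full training set D would then simply be a list of m such pairs.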

Given the powerful nonlinear fitting capabilities of NNs, many sophisticated NN structures have been designed to automatically extract features from input IQ samples and then discriminate the modulation format of each set of IQ samples. In the experimental results below, we apply a classical NN to the MFR task and show that an NN is not always able to achieve excellent results. Although deeper and more complicated NN structures may bring small performance improvements, they exacerbate the disadvantages of time cost and computational complexity.

Precisely because of these disadvantages of classical modulation format recognition, it is important to improve the feature extraction algorithm and simplify the fitting model for MFR. Therefore, we apply reservoir computing with coordinate transformation and a folding algorithm to improve MFR performance while dramatically reducing complexity.

The schematic diagram of our proposed RC-based algorithm is shown in Fig. 1. First, the input data sequences, namely the received complex signals, are processed by coordinate transformation and divided into two branches. One branch is expressed in Rectangular coordinates, and the other is transformed to Polar coordinates. Then, the folding algorithm is applied to the sequence data in each coordinate system. Next, the extracted features are directly concatenated to aggregate the features in Rectangular and Polar coordinates and are sent into the reservoir computing network, which calculates the probability of each modulation format for the input data sequence. Finally, we obtain the prediction by choosing the class with the highest probability. In the following, we introduce coordinate transformation, the folding algorithm, and reservoir computing in detail.

Fig. 1. The schematic diagram of our proposed RC algorithm with coordinate transformation and folding algorithm for modulation format recognition.


2.2 Coordinate transformation

In classical modulation format recognition, the received complex signals are split into IQ samples, i.e., the received signals are considered in the Rectangular coordinate system. We investigate six different modulation formats to be recognized: OOK, 4QAM, 8QAM-DIA, 8QAM-CIR, 16APSK, and 16QAM. Based on the distribution of their constellation points, we can roughly divide these modulation formats into two groups: QAM and PSK. Since QAM constellation points are distributed evenly along the x and y axes in the Rectangular coordinate system, splitting QAM signals into IQ samples is a good way to reflect their distinctive features. However, splitting PSK signals in this way can hardly extract their distinctive features, because PSK constellation points are distributed non-uniformly along the x and y axes in the Rectangular coordinate system.

For better feature extraction, the coordinate transformation algorithm is proposed to transform IQ samples in Rectangular coordinates to a new coordinate system. When we analyze the distribution of PSK constellation points, the circular symmetry of PSK constellation points hints to us that we can transform IQ samples in Rectangular coordinates to Polar coordinates. This coordinate transformation algorithm better reflects the salient features of PSK signals while maintaining the original symmetry.

The results of the Coordinate Transformation Algorithm are shown in Fig. 2. The upper row (Fig. 2(a)) shows the diagrams of the six modulation formats in the Rectangular coordinate system, which are simply the familiar constellation diagrams. The lower row (Fig. 2(b)) shows the corresponding diagrams in the Polar coordinate system. In the Polar coordinate system, the polar angle and polar radius components of the received signals exhibit a richer set of salient features.

Fig. 2. Results of Coordinate Transformation algorithm. The upper and lower lines are diagrams of six modulation formats signals in (a) Rectangular coordinate system and in (b) Polar coordinate system respectively.


It is noteworthy that our Coordinate Transformation Algorithm not only extracts more salient features of PSK modulated received signals, but also reveals more different features of QAM modulated signals from another perspective. With the aggregation of feature extractions in Rectangular coordinate system and in Polar coordinate system, the subsequent fitting model will be easier to achieve better MFR accuracy.
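The transformation and aggregation described above can be sketched in a few lines. The helper names (`to_rectangular`, `to_polar`, `aggregate`) and the flat concatenation order are assumptions made for illustration; the text only states that the Rectangular and Polar features are concatenated.

```python
import cmath
from typing import List, Sequence


def to_rectangular(y: Sequence[complex]) -> List[List[float]]:
    """Rectangular view: one row of real parts, one row of imaginary parts."""
    return [[s.real for s in y], [s.imag for s in y]]


def to_polar(y: Sequence[complex]) -> List[List[float]]:
    """Polar view: one row of radii, one row of angles in [-pi, pi]."""
    radius = [abs(s) for s in y]
    angle = [cmath.phase(s) for s in y]
    return [radius, angle]


def aggregate(y: Sequence[complex]) -> List[float]:
    """Concatenate the Rectangular and Polar rows into one feature vector."""
    rows = to_rectangular(y) + to_polar(y)
    return [value for row in rows for value in row]


# Hypothetical symbols on the unit circle (illustrative values only).
received = [1 + 0j, 0 + 1j, -1 + 0j]
features = aggregate(received)  # length 4 * N
```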

2.3 Folding algorithm

When we feed the entire received signal (in the Rectangular or Polar coordinate system) into the subsequent fitting model, the model tends to learn global features, but it is difficult for it to learn local features precisely. In other words, we cannot be sure that the model has learned the local salient features we expect. To address this problem, we need to consider how to highlight the local salient features.

Further analysis of the diagrams of the six modulation formats in the Rectangular coordinate system shows that, in addition to circular symmetry, the diagrams of the received signals also exhibit axial symmetry. This inspires us to exploit the axial symmetry of the input data to highlight local salient features by reducing repetitive information [17].

Therefore, we propose the Folding Algorithm to highlight local salient features. In detail, we first split the received complex signals into IQ samples in the Rectangular coordinate system. Then we transform the IQ samples to the Polar coordinate system. With the polar angle and polar radius components, the folding functions are easy to express. Denote the range of the current polar angles as $\beta _1^n$ to $\beta _2^n$ and the symmetry angle as $\beta _0^n$ ($\beta _0^n = ({\beta_1^n + \beta_2^n} )/2$); then the folding function about $\beta _0^n$ can be written as:

(3)$$\theta _i^{n + 1} = \left\{ {\begin{array}{ll} {\theta _i^n,} & {\beta _1^n < \theta _i^n \le \beta _0^n}\\ {2\beta _0^n - \theta _i^n,} & {\beta _0^n < \theta _i^n \le \beta _2^n} \end{array}} \right.,$$

where n is the folding order and $\theta _i^n$ is the polar angle of the i-th received sample after n folds. We can also fold the angles into the other symmetrical range, which is expressed as:

(4)$$\theta _i^{n + 1} = \left\{ {\begin{array}{ll} {\theta _i^n,} & {\beta _0^n < \theta _i^n \le \beta _2^n}\\ {2\beta _0^n - \theta _i^n,} & {\beta _1^n < \theta _i^n \le \beta _0^n} \end{array}} \right..$$

Due to symmetry, these two kinds of folding functions are not fundamentally different and end up with similar results. An example diagram of the folding algorithm for 16APSK modulation format signals is shown in Fig. 3. After each fold, the feature map will be reduced to half of the original so that the subsequent fitting model will be easier to learn useful local features.
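A minimal sketch of repeated folding according to Eq. (3) is given below. It assumes the initial angle range is $[-\pi, \pi]$ (the range returned by common phase functions) and that the polar radius is left unchanged; the function names `fold_once` and `fold` are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def fold_once(thetas: Sequence[float], beta1: float, beta2: float
              ) -> Tuple[List[float], float, float]:
    """Apply Eq. (3) once: mirror angles above beta0 into the lower half range."""
    beta0 = (beta1 + beta2) / 2.0
    folded = [t if t <= beta0 else 2.0 * beta0 - t for t in thetas]
    # After folding, every angle lies between beta1 and beta0.
    return folded, beta1, beta0


def fold(thetas: Sequence[float], order: int,
         beta1: float = -math.pi, beta2: float = math.pi
         ) -> Tuple[List[float], float, float]:
    """Fold `order` times, halving the angular range at each step."""
    current = list(thetas)
    for _ in range(order):
        current, beta1, beta2 = fold_once(current, beta1, beta2)
    return current, beta1, beta2


# Hypothetical polar angles in radians (illustrative values only).
angles = [-3.0, -1.0, 1.0, 3.0]
folded, low, high = fold(angles, order=2)
```

Folding into the other half range, as in Eq. (4), only requires swapping the branch that is mirrored and returning `beta0, beta2` as the new range.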

Fig. 3. An example diagram of the folding algorithm for 16APSK modulation format signals in (a) Rectangular coordinate system and in (b) Polar coordinate system respectively. Original means original diagrams of signals, 2nd, 3rd, and 4th-folding mean diagrams of signals after folding 2, 3, and 4 times respectively.


The original diagrams and the folded diagrams of all six modulation formats are depicted in Fig. 4. Theoretically, we can fold the diagrams more times. However, although higher-order folding reduces the feature map and highlights the local salient features, local features also become blurred as the folding order increases. Thus, the folding order is a trade-off between shrinking the feature map and maintaining clear and salient local features. In practice, we find 3 or 4 to be the optimal number of folds. Fewer folds cannot highlight the local salient features adequately, while more folds make the feature map so small that useful features are lost.

Fig. 4. Original diagrams and folding diagrams of all six modulation formats in (a) Rectangular coordinate system and in (b) Polar coordinate system respectively. Original, 2nd, 3rd, and 4th-folding have the same meanings as in Fig. 3, which mean the folding order.


2.4 Reservoir computing (RC)

Theoretically, recurrent neural networks (RNNs) are powerful models for sequence data and are widely used in practical tasks including machine translation, speech recognition, and image processing [18,19]. However, the training of RNNs suffers from vanishing and exploding gradients [20], so it is hard for them to converge quickly and achieve good performance. To address this problem, a new approach to RNN design with fixed internal weights was proposed, which is now often referred to as reservoir computing (RC) [14–16].

The advantages of RC lie in its simple network structure and easy training process. Let {${x_i}$, $\textrm{i} = 1,2, \ldots ,\textrm{K}$} be the input data of length K at each timestep and {${u_i}$, $\textrm{i} = 1,2, \ldots ,\textrm{N}$} be the states of the N reservoir nodes at each timestep. Denote the weight matrix between the input layer and the reservoir as ${W_{in}}$ and the internal weight matrix of the reservoir nodes as ${W_{res}}$. Then the node states at timestep $\textrm{t}$ are computed as:

(5)$$u(t) = f({W_{in}} \cdot x(t) + {W_{res}} \cdot u(t - 1)),$$

where f means the nonlinear activation function, typically tanh in practice:

(6)$$\tanh (x) = \frac{{\sinh (x)}}{{\cosh (x)}} = \frac{{{e^x} - {e^{ - x}}}}{{{e^x} + {e^{ - x}}}}.$$

To further improve the dynamics of the reservoir, a leak rate α is introduced [21], and the state update becomes:

(7)$$u(t) = (1 - \alpha ) \cdot u(t - 1) + \alpha \cdot f({W_{in}} \cdot x(t) + {W_{res}} \cdot u(t - 1)).$$

How to choose the nonlinear activation function f and the leak rate α is investigated in Section 4.4.

Because the weight matrices ${W_{in}}$ and ${W_{res}}$ are generated randomly and remain fixed during the whole training process, the computation of the reservoir node states is very fast.
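The leaky state update of Eq. (7) can be sketched with NumPy as follows. This is an illustration under stated assumptions: the input is arranged as a (T, K) array with one row per timestep, the initial state is zero, and the small random matrices below are placeholders rather than the weights used in the experiments.

```python
import numpy as np


def run_reservoir(X, W_in, W_res, alpha=0.5, f=np.tanh):
    """Iterate Eq. (7) over the rows of X (shape (T, K)); return states of shape (T, N)."""
    n_nodes = W_res.shape[0]
    u = np.zeros(n_nodes)  # assumed initial state u(0) = 0
    states = np.empty((X.shape[0], n_nodes))
    for t, x_t in enumerate(X):
        # Leaky integration: blend the previous state with the new activation.
        u = (1.0 - alpha) * u + alpha * f(W_in @ x_t + W_res @ u)
        states[t] = u
    return states


# Placeholder dimensions and weights (hypothetical, for illustration only).
rng = np.random.default_rng(0)
K, N, T = 4, 10, 20
W_in = rng.uniform(-0.5, 0.5, size=(N, K))
W_res = 0.1 * rng.uniform(-0.5, 0.5, size=(N, N))
X = rng.standard_normal((T, K))
states = run_reservoir(X, W_in, W_res, alpha=0.5)
```

Setting `alpha=1.0` recovers the update of Eq. (5). No gradient is computed at any point, which is why this stage is cheap.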

Finally, the output {${y_i}$, $\textrm{i} = 1,2, \ldots ,\textrm{L}$} can be calculated from the linear combination of reservoir nodes states:

(8)$$y(t) = {W_{out}} \cdot u(t).$$

The output weight ${W_{out}}$ is chosen to minimize the mean squared error (MSE) between y(t) and the expected output Y. This optimization problem can be expressed as:

(9)$${\hat{W}_{\textrm{out}}} = \mathop {\arg \min }\limits_{{W_{out}}} \left( {\frac{1}{L}\| {W_{out}}U - Y\|^2 + \epsilon \| {W_{out}}\|^2} \right),$$

with $\mathrm{\epsilon }$ being the regularization parameter, which is intended to prevent overfitting of the optimization problem.

Moreover, with the constant factor 1/L absorbed into $\epsilon$, this optimization problem has a closed-form solution in the least-squares sense [22]:

(10)$${\hat{W}_{\textrm{out}}} = Y{U^T}{(U{U^T} + \epsilon I)^{ - 1}}.$$

With this network structure, only the output weight ${W_{out}}$ needs to be trained, and training ${W_{out}}$ is a simple linear regression, which dramatically reduces network complexity and computation. Meanwhile, although RC has far fewer trainable parameters than classical NN-based structures, this does not necessarily mean that its learning and representation ability is inferior. The subsequent experiments demonstrate that, by combining RC with the feature extraction algorithms introduced above (coordinate transformation and the folding algorithm), our RC-based algorithm achieves better performance than a classical CNN in the MFR task.
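The readout training reduces to one regularized linear solve. The sketch below assumes the reservoir states are stacked column-wise in U (shape N × M for M training examples) and the targets in Y (shape L × M, e.g. one-hot class vectors); the function names and the synthetic data are illustrative, not the authors' code.

```python
import numpy as np


def train_readout(U, Y, eps=1e-6):
    """Ridge-regression readout: W_out = Y U^T (U U^T + eps I)^(-1), shape (L, N)."""
    n_nodes = U.shape[0]
    A = U @ U.T + eps * np.eye(n_nodes)  # symmetric (N, N)
    B = Y @ U.T                          # (L, N)
    # Since A is symmetric, B A^(-1) equals the transpose of solve(A, B^T).
    return np.linalg.solve(A, B.T).T


def predict(W_out, U):
    """Pick, for each column of U, the class with the largest output score."""
    return np.argmax(W_out @ U, axis=0)


# Synthetic check (hypothetical data): recover a known linear readout.
rng = np.random.default_rng(0)
W_true = rng.standard_normal((3, 5))
U = rng.standard_normal((5, 50))
Y = W_true @ U
W_hat = train_readout(U, Y, eps=1e-10)
labels = predict(W_hat, U)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a common numerical choice; the result is the same closed-form solution.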

3. Experimental setup

The whole work is based on experiments with a QAM-CAP/APSK-CAP modulated UVLC system [23], as presented in Fig. 5. The sampling frequency of the signal is 2.2 GHz and the baud rate is 0.55 Gbaud. First, we map the original data into the six modulation formats to obtain the complex signals. Each subcarrier transmits 128 data and the number of subcarriers in each group is 1024. After fourfold upsampling and pulse-shaping filtering, the digital signals are converted into analog signals by a DAC and transmitted through the LED, which is the blue chip (457 nm) of an RGBYC silicon-substrate LED lamp. The signals pass through a 1.2-meter-long water tank and are received by a differential receiver. The PD used in the experiment is a Si PIN photodiode (S10784, HAMAMATSU). After data processing, including equalization, CAP demodulation, and LMS, we obtain the complex signals and split them into IQ samples. In the following experiments, we apply coordinate transformation and the folding algorithm to the IQ samples and send the aggregated data into reservoir computing (RC) to realize modulation format recognition.

Fig. 5. The experimental setup of our QAM-CAP/APSK-CAP modulated UVLC system.


We use 6000 sets of data for the experiments, with 1000 sets for each modulation format. They are randomly divided into a training set and a test set at a ratio of 7 to 3. To better compare the performance of our RC-based algorithm with a classical NN-based algorithm, we choose a well-performing CNN named DrCNN as the baseline. Details of the DrCNN architecture and its output dimensions can be found in [24]. In the following section, we demonstrate the effectiveness and superiority of our RC-based algorithm in terms of performance, time cost, and computational complexity compared with the classical CNN-based algorithm.
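The 7:3 split can be sketched as below. The text does not say whether the split is stratified per modulation format, so this sketch assumes a plain random shuffle of all 6000 indices; the seed and the function name are illustrative.

```python
import random
from typing import List, Tuple

FORMATS = ["OOK", "4QAM", "8QAM-DIA", "8QAM-CIR", "16APSK", "16QAM"]
SETS_PER_FORMAT = 1000


def split_dataset(n_total: int, train_ratio: float = 0.7,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle the indices 0..n_total-1 and cut them into train and test parts."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # fixed seed only for repeatability
    n_train = int(round(n_total * train_ratio))
    return indices[:n_train], indices[n_train:]


# Class label of each data set: 1000 consecutive sets per modulation format.
labels = [k for k in range(len(FORMATS)) for _ in range(SETS_PER_FORMAT)]
train_idx, test_idx = split_dataset(len(labels))
```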

4. Experimental results

In this section, we first compare the MFR performance of the baseline NN-based algorithm, DrCNN, with that of simple RC without any feature extraction. Next, we introduce coordinate transformation and the folding algorithm, and show how these feature extraction algorithms effectively improve the accuracy of our RC-based algorithm without adding significant complexity. Finally, we compare the complexity of DrCNN and our RC-based algorithm, and investigate how to design a well-performing RC that balances accuracy and time cost.

4.1 RC with coordinate transformation

The experimental results of RC with different coordinate transformation methods are depicted in Fig. 6. We adjust the pin voltage of the LED, feed the received signals into RC, and compare the MFR accuracy. From Fig. 6, we can clearly see that simple RC without any coordinate transformation is unable to discriminate the modulation formats of the received signals; its accuracy is lower than 40%. By transforming the original data from the rectangular coordinate system to the polar coordinate system, a considerable improvement in MFR accuracy is achieved, and the highest accuracy reaches 60%. Combining the data in the two coordinate systems yields a further accuracy improvement. However, RC with coordinate transformation is still slightly inferior to DrCNN, which means that more effective feature extraction algorithms are needed to further improve the accuracy of the RC-based method. It is also worth noting that both DrCNN and the RC method perform best when the voltage is roughly in the range of 0.3 V to 0.7 V, which we call the ideal region. On the one hand, when the pin voltage of the LED is too low, the SNR of the received signal is low (as shown in right inset (a) of Fig. 6), resulting in poor MFR accuracy. On the other hand, when the voltage is too high, the received signal suffers from serious nonlinear effects (as shown in right inset (c) of Fig. 6), which challenges the algorithm's nonlinear fitting capability.

Fig. 6. MFR accuracy of DrCNN and RC with different coordinate transformation methods. Ori_Rec means the data fed into RC are signals in Rectangular coordinate system, Ori_Pol means in Polar coordinate system and Ori_Rec_Pol means the combination of both two coordinate systems. Right insets a, b, and c show constellation diagrams of 8QAM-CIR when the voltage is 0.2 V, 0.5 V, and 1.3 V respectively.


4.2 RC with coordinate transformation and folding algorithm

From the experimental results shown in Fig. 6, it is clear that RC with the aggregated data from both the Polar and Rectangular coordinate systems achieves better accuracy than RC with a single coordinate system. Therefore, we further apply the folding algorithm on top of coordinate aggregation to enhance the feature extraction ability of our RC-based method. The results of RC with coordinate transformation and the folding algorithm are depicted in Fig. 7. Encouragingly, the folding algorithm significantly improves the accuracy of our RC algorithm. Even with second-order folding, the accuracy of our RC exceeds that of the baseline DrCNN at every voltage. With higher-order folding, the accuracy of our RC continues to increase and exceeds that of DrCNN by about 30%. When the pin voltage of the LED changes from 0.1 V to 1.3 V, the accuracy of our RC with coordinate transformation and the folding algorithm exceeds 90% in almost all cases and approaches 100% at best, which demonstrates the superior performance of our methods.

Fig. 7. MFR accuracy of DrCNN and RC with different coordinate transformation and folding algorithm methods. Rec_Pol means the input data are the combination of signals in Rectangular coordinate system and Polar coordinate system. Ori, 2nd, 3rd, and 4th mean the order of folding operations to the original diagram.


Meanwhile, with the help of our proposed feature extraction algorithms (coordinate transformation and the folding algorithm), the accuracy of DrCNN is also greatly improved, mostly exceeding 80%, which proves the effectiveness of these algorithms. It is also worth noting that even with the same feature extraction algorithms, the complex DrCNN still performs slightly worse than our RC. The reason is that, after feature extraction, the features of the input data have been highlighted and can easily be learned in full. In this situation, it is often difficult for an overly complex algorithm to achieve satisfactory results.

Besides, when the voltage reaches the nonlinear region, the accuracy of DrCNN without any feature extraction algorithm drops significantly, while the accuracy of our RC drops only slightly. This demonstrates that even under strong nonlinear effects, our proposed feature extraction algorithms can still highlight the salient features of the input data in different coordinate systems and reduce data redundancy, so that the subsequent RC can complete the MFR task efficiently. Meanwhile, when the folding order increases from 2 to 3, a significant accuracy improvement is achieved, but when the folding order increases further, the accuracy of our RC algorithm no longer improves accordingly, which means that the accuracy gain brought by folding saturates. This is expected: although the folding algorithm reduces redundant features and highlights local salient features, too many folding operations shrink the feature map and cause useful features to be lost. Therefore, the optimal folding order is about 3 or 4 in practice.

The details of the accuracy comparison of six different modulation formats, including OOK, 4QAM, 8QAM-DIA, 8QAM-CIR, 16APSK, and 16QAM, under different pin voltages of the LED are shown in Fig. 8. We choose the voltages of 0.3 V, 0.7 V and 1.1 V for analysis. Generally speaking, the performance of the RC-based algorithm with only coordinate transformation is not as good as the performance of NN-based algorithm. However, the performance of RC-based algorithm with coordinate transformation and folding algorithm greatly surpasses the performance of DrCNN under different voltage conditions.

Fig. 8. Confusion matrix of DrCNN and RC with different coordinate transformations and folding algorithms in different voltages. Rec and Pol represent the input data are signals in rectangular and polar coordinate systems respectively, and Rec + Pol means the combination of both. Original and 4th represent the order of folding algorithms.


Moreover, when the voltage is in the ideal region, the quality of the received signals is the best, resulting in the highest accuracy. But even in the low-SNR region and the nonlinear region, our RC-based algorithm with coordinate transformation and the folding algorithm still performs well. Another interesting observation concerns the common belief that simple modulation formats, such as 2QAM and 4QAM, are more likely to be classified correctly than complex modulation formats, such as 16QAM and 16APSK. This is only true for algorithms that have strong feature fitting ability and are able to correctly learn the characteristics of the input data, such as our RC with coordinate transformation and the folding algorithm. Algorithms with poor fitting ability, such as simple RC without any feature extraction algorithm or DrCNN, struggle to learn the correct features for recognition, resulting in poor performance even on simple input data.
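Per-format accuracy of the kind discussed above is read off the diagonal of a confusion matrix. The following sketch shows one way to compute it; the label lists are made-up toy values, not experimental results, and the function names are illustrative.

```python
from typing import List, Sequence


def confusion_matrix(true_labels: Sequence[int], predicted_labels: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Entry [t][p] counts examples of true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        cm[t][p] += 1
    return cm


def per_class_accuracy(cm: List[List[int]]) -> List[float]:
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(cm)]


# Toy labels for three classes (hypothetical, for illustration only).
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(true_labels, predicted_labels, n_classes=3)
accuracy = per_class_accuracy(cm)
```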

4.3 Computational complexity of RC

The architecture of simple RC can be roughly divided into three parts: the input layer, the reservoir, and the output layer. Denote the weight matrix between the input layer and the reservoir as ${W_{in}}$, the internal weight matrix of the reservoir nodes as ${W_{res}}$, and the weight matrix between the reservoir and the output layer as ${W_{out}}$. Let K, N, and L be the length of the input data, the size of the reservoir, and the length of the output respectively; then the numbers of parameters in ${W_{in}}$, ${W_{res}}$, and ${W_{out}}$ are $K \times N$, $N \times N$, and $L \times N$ respectively. Among these three weight matrices, ${W_{in}}$ and ${W_{res}}$ are randomly generated and only ${W_{out}}$ needs to be trained.

Figure 9 shows that as the scale of the reservoir gradually increases from 50, the performance of RC improves accordingly. But once the scale reaches about 500, the performance hits a bottleneck and a larger reservoir brings only slight gains. We also measure the computational time cost of RC as the size of the reservoir changes. Python version 3.6.7 was used to implement the program, and a laptop computer was used (Intel Core i7-8565U CPU @ 1.80 GHz, 8GB of RAM, running a Microsoft Windows 10 operating system). The results demonstrate that as the scale of the reservoir increases, the computational time cost grows rapidly, which is unacceptable when high real-time performance is required. In practice, we need to make a trade-off between accuracy and computational time cost. In our experiment, a scale of 500 is a good balance point. The experimental results above are also based on a reservoir scale of 500.

Fig. 9. MFR accuracy of RC with coordinate transformation and 4-th folding algorithm at different scales of the reservoir (denoted by the resSize). The inserted small graph shows the MFR accuracy and time cost of RC in voltage = 0.3 V when the resSize changes from 50 to 1400.


Next, we can specifically compare the parameters of our RC-based algorithm and DrCNN. The detailed architecture of DrCNN and the corresponding output dimensions of each layer are shown in [24]. Then we can easily calculate the parameters of DrCNN. For our RC, the input length K is 256 and the output length is 6. Let the scale of reservoir N be 500. The comparison of parameters between DrCNN and our RC algorithm is shown in Table 1.

Table 1. Comparison of parameters between DrCNN and RC


In the NN-based algorithm, all parameters need to be trained, while in our RC-based algorithm, only the final output weight needs to be trained, and this training can be carried out with a simple closed-form linear regression solution. In this case, we find that the total number of parameters of our RC is only about 38% of that of DrCNN, and the number of trainable parameters of our RC is less than 0.3% of that of DrCNN. Meanwhile, our feature extraction algorithms (coordinate transformation and the folding algorithm) only need to be executed once and are themselves very simple, so they add little computational complexity. We can therefore conclude that our RC-based algorithm dramatically reduces the computational complexity while exhibiting better performance and robustness in the MFR task.
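The RC side of this comparison follows directly from the dimensions stated above (K = 256, N = 500, L = 6). The short calculation below only evaluates those products; the DrCNN counts are not reproduced here because they depend on the architecture given in [24].

```python
# Dimensions stated in the text: input length K, reservoir size N, output length L.
K, N, L = 256, 500, 6

w_in_params = K * N    # input-to-reservoir weights (random, fixed)
w_res_params = N * N   # internal reservoir weights (random, fixed)
w_out_params = L * N   # reservoir-to-output weights (the only trained part)

total_params = w_in_params + w_res_params + w_out_params
trainable_share = w_out_params / total_params  # share of RC parameters that are trained

print(w_in_params, w_res_params, w_out_params, total_params)
```

Under these dimensions the fixed matrices account for 378,000 of the 381,000 RC parameters, and only the 3,000 entries of the output weight are trained.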

4.4 Effect of hyperparameters in RC

For RC, the states of the reservoir have the greatest impact on MFR performance. When the reservoir states are updated according to Eq. (7), five components can be adjusted to improve the performance of RC: ${W_{in}}$, ${W_{res}}$, the spectral radius $\rho ({{W_{res}}} )$ (i.e., the largest absolute eigenvalue of ${W_{res}}$), the nonlinear activation function f, and the leak rate $\alpha $. Among them, the weight matrices ${W_{in}}$ and ${W_{res}}$ are usually sparse and generated randomly from a distribution symmetric around zero [16], such as a uniform, Gaussian, or binary distribution. To the best of our knowledge, there is currently no complete theoretical basis for designing the random weights of RC for a specific task, and this problem needs further in-depth research. In our experiments, ${W_{in}}$ and ${W_{res}}$ are generated from a random Gaussian distribution and processed into sparse matrices whose entries are distributed symmetrically around zero. For the spectral radius $\rho ({{W_{res}}} )$, we found that smaller values are more suitable for the MFR task, so $\rho ({{W_{res}}} )$ is set to 0.2 in the experiments. Meanwhile, the remaining two components, the nonlinear activation function and the leak rate, can easily be adjusted to improve the performance of RC.
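One common way to obtain such weights is sketched below: draw zero-mean Gaussian entries, zero out most of them with a random mask, and rescale the reservoir matrix to the target spectral radius. The density of 0.1, the seed, and the function name are assumptions for illustration; the text specifies only the Gaussian distribution, sparsity, and a spectral radius of 0.2.

```python
import numpy as np


def make_reservoir_weights(K, N, density=0.1, spectral_radius=0.2, seed=0):
    """Return sparse Gaussian W_in (N, K) and W_res (N, N) with the given spectral radius."""
    rng = np.random.default_rng(seed)
    # Zero-mean Gaussian entries, kept with probability `density` (assumed value).
    W_in = rng.standard_normal((N, K)) * (rng.random((N, K)) < density)
    W_res = rng.standard_normal((N, N)) * (rng.random((N, N)) < density)
    rho = np.max(np.abs(np.linalg.eigvals(W_res)))
    if rho == 0.0:
        raise ValueError("W_res has zero spectral radius; increase density or N")
    # Rescale so that the largest absolute eigenvalue equals the target.
    W_res *= spectral_radius / rho
    return W_in, W_res


W_in, W_res = make_reservoir_weights(K=8, N=50)
```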

In the experiments, we select three common nonlinear activation functions, tanh, sigmoid, and ReLU, to test their influence on the learning ability of RC. The accuracy of the RC algorithm with each of these three functions and without any nonlinear activation function is shown in Fig. 10(a). Compared with the case without a nonlinear activation function, using one increases the accuracy of RC significantly, by roughly 10%. The impact of the three functions also differs slightly. ReLU performs slightly worse than the other two. This is because ReLU sets all negative elements to 0 while leaving all positive elements unchanged, so large outlying elements are not constrained to a smaller range, resulting in worse performance. The performance of tanh and sigmoid is very close, almost identical at different voltages. This is because tanh and sigmoid have very similar shapes: both are bounded, change quickly in the middle region, and change smoothly on both sides. The main difference between them lies in their center points, and this difference can be regarded as a change in the linear bias term, which is easily learned by the RC algorithm.
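One way to make the similarity between tanh and sigmoid concrete is the standard identity $\tanh(x) = 2\sigma(2x) - 1$: tanh is a sigmoid whose input and output are rescaled and whose center is shifted, and such affine changes can be absorbed by the weights and the bias term. The short check below verifies the identity numerically and contrasts the bounded outputs of tanh and sigmoid with the unbounded output of ReLU. The sample points are arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

# tanh is an affinely rescaled sigmoid: tanh(x) = 2 * sigmoid(2x) - 1.
for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert math.isclose(math.tanh(x), 2.0 * sigmoid(2.0 * x) - 1.0, abs_tol=1e-12)

# A large input is squashed by tanh and sigmoid but passed through unchanged by ReLU.
large = 25.0
bounded = (math.tanh(large), sigmoid(large))   # both at most 1
unbounded = relu(large)                        # equals the input itself
```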

Fig. 10. (a) MFR accuracy of RC without a nonlinear activation function and with three different nonlinear activation functions: tanh, sigmoid, and ReLU. (b) MFR accuracy of RC as a function of both leak rate and reservoir size, where the reservoir size represents the scale of the reservoir.

In addition, the leak rate has a great influence on the dynamic performance of RC. We investigate the effect of the leak rate at different reservoir scales by varying both the leak rate and the reservoir size. The experimental results are shown in Fig. 10(b). In general, a larger reservoir is more likely to bring better performance. Moreover, as the reservoir scale increases, the range of optimal leak rates expands correspondingly, and even a lower leak rate can yield good performance. In practice, we first determine the reservoir scale and then select the corresponding optimal leak rate to achieve the best performance of RC.
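The two-step selection described above (fix the reservoir scale, then pick the leak rate) amounts to a one-dimensional search, sketched below. The helper `select_leak_rate` and the callback `evaluate` are hypothetical names; in a real run `evaluate` would train and validate an RC model for the given settings. The `toy_accuracy` function is only a placeholder so that the sketch runs, and it does not reproduce any measured result.

```python
def select_leak_rate(evaluate, reservoir_size, leak_rates):
    """Return the leak rate with the highest score for a fixed reservoir size."""
    scores = {rate: evaluate(reservoir_size, rate) for rate in leak_rates}
    best = max(scores, key=scores.get)
    return best, scores

def toy_accuracy(reservoir_size, leak_rate):
    # Placeholder score with an arbitrary peak at 0.6; NOT experimental data.
    return 1.0 - (leak_rate - 0.6) ** 2

best, scores = select_leak_rate(toy_accuracy, 500, [0.2, 0.4, 0.6, 0.8, 1.0])
```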

5. Conclusion

In this paper, we propose an RC-based method for the MFR task to alleviate the shortcomings of NN-based algorithms, namely high complexity and large computational overhead. The total and trainable parameters of our RC-based method are only about 38% and 0.3%, respectively, of those of a common NN-based method, which significantly reduces the complexity while achieving good performance. To further improve the performance of RC, we propose feature extraction algorithms comprising coordinate transformation and a folding algorithm. The coordinate transformation algorithm transforms the input data into rectangular or polar coordinate systems to better characterize the features of the received signals. The folding algorithm reduces repetitive information and highlights local features by folding the original diagrams. We then apply the RC with coordinate transformation and the folding algorithm to the MFR task and recognize six modulation formats: OOK, 4QAM, 8QAM-DIA, 8QAM-CIR, 16APSK, and 16QAM. Through experiments in UVLC channels under various complex conditions, we demonstrate that, compared with classical NN-based methods, our RC-based method dramatically reduces the structural complexity and time cost while achieving satisfactory MFR performance. Under different pin voltages of the LED, the accuracy of our RC-based method is above 90% in almost all cases, and the highest accuracy is close to 100%. The combined advantages of low computational overhead, satisfactory MFR accuracy, and robustness make our proposed RC with coordinate transformation and folding algorithm a promising candidate for modulation format recognition tasks.

Funding

National Key Research and Development Program of China (2022YFB2802803); National Key Research Project (2021YFB2801804); National Natural Science Foundation of China (61925104, 62031011, 62201157).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. S. Arnon, “Underwater optical wireless communication network,” Opt. Eng. 49(1), 015001 (2010).

2. H. Kaushal and G. Kaddoum, “Underwater Optical Wireless Communication,” IEEE Access 4, 1518–1547 (2016).

3. N. Chi, Y. Zhao, M. Shi, P. Zou, and X. Lu, “Gaussian kernel-aided deep neural network equalizer utilized in underwater PAM8 visible light communication system,” Opt. Express 26(20), 26700–26712 (2018).

4. N. Chi, Y. Zhou, Y. Wei, and F. Hu, “Visible Light Communication in 6G: Advances, Challenges, and Prospects,” IEEE Veh. Technol. Mag. 15(4), 93–102 (2020).

5. M. Liu, T. Song, J. Hu, J. Yang, and G. Gui, “Deep Learning-Inspired Message Passing Algorithm for Efficient Resource Allocation in Cognitive Radio Networks,” IEEE Trans. Veh. Technol. 68(1), 641–653 (2019).

6. Y.-C. Liang, K.-C. Chen, G. Y. Li, and P. Mahonen, “Cognitive radio networking and communications: an overview,” IEEE Trans. Veh. Technol. 60(7), 3386–3407 (2011).

7. O. A. Dobre, A. Abdi, Y. Bar-Ness, and W. Su, “Survey of automatic modulation classification techniques: classical approaches and new trends,” IET Commun. 1(2), 137 (2007).

8. W. Wei and J. M. Mendel, “Maximum-likelihood classification for digital amplitude-phase modulations,” IEEE Trans. Commun. 48(2), 189–193 (2000).

9. Q. Shi and Y. Karasawa, “Noncoherent Maximum Likelihood Classification of Quadrature Amplitude Modulation Constellations: Simplification, Analysis, and Extension,” IEEE Trans. Wireless Commun. 10(4), 1312–1322 (2011).

10. J. L. Xu, W. Su, and M. Zhou, “Likelihood-Ratio Approaches to Automatic Modulation Classification,” IEEE Trans. Syst., Man, Cybern. C 41(4), 455–469 (2011).

11. A. K. Jain, R. P. W. Duin, and J. Mao, “Statistical pattern recognition: a review,” IEEE Trans. Pattern Anal. Machine Intell. 22(1), 4–37 (2000).

12. F. N. Khan, K. Zhong, W. H. Al-Arashi, C. Yu, C. Lu, and A. P. T. Lau, “Modulation Format Identification in Coherent Receivers Using Deep Machine Learning,” IEEE Photon. Technol. Lett. 28(17), 1886–1889 (2016).

13. D. Wang, M. Zhang, Z. Li, J. Li, M. Fu, Y. Cui, and X. Chen, “Modulation Format Recognition and OSNR Estimation Using CNN-Based Deep Learning,” IEEE Photon. Technol. Lett. 29(19), 1667–1670 (2017).

14. H. Jaeger, “The “echo state” approach to analysing and training recurrent neural networks – with an Erratum note,” Germany: German National Research Center for Information Technology GMD Technical Report 148(34), 13 (2001).

15. W. Maass, T. Natschläger, and H. Markram, “Real-Time Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations,” Neural Computation 14(11), 2531–2560 (2002).

16. M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Computer Science Review 3(3), 127–149 (2009).

17. C. Xu, R. Jin, W. Gao, and N. Chi, “Efficient Modulation Classification Based on Complementary Folding Algorithm in UVLC System,” IEEE Photonics J. 14(4), 7346406 (2022).

18. A. Sherstinsky, “Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network,” Phys. D (Amsterdam, Neth.) 404, 132306 (2020).

19. S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation 9(8), 1735–1780 (1997).

20. R. Pascanu, T. Mikolov, and Y. Bengio, “On the difficulty of training Recurrent Neural Networks,” International conference on machine learning 1310–1318 (2013).

21. J. Dong, M. Rafayelyan, F. Krzakala, and S. Gigan, “Optical Reservoir Computing Using Multiple Light Scattering for Chaotic Systems Prediction,” IEEE J. Select. Topics Quantum Electron. 26(1), 1–12 (2020).

22. D. W. Marquardt and R. D. Snee, “Ridge Regression in Practice,” Am. Stat. 29(1), 3–20 (1975).

23. N. Chi, Y. Zhou, S. Liang, F. Wang, J. Li, and Y. Wang, “Enabling Technologies for High-Speed Visible Light Communication Employing CAP Modulation,” J. Lightwave Technol. 36(2), 510–518 (2018).

24. Y. Wang, M. Liu, J. Yang, and G. Gui, “Data-Driven Deep Learning for Automatic Modulation Recognition in Cognitive Radios,” IEEE Trans. Veh. Technol. 68(4), 4074–4077 (2019).

Algorithm	Weight matrix	Trainable Parameters	Total Parameters ^a
DrCNN	-	1.012 million	1.012 million
RC	$W_{in}$	0	0.128 million
RC	$W_{res}$	0	0.250 million
RC	$W_{out}$	0.003 million	0.003 million

Optics Express