Optica Publishing Group

PRBS orders required to train ANN equalizer for PAM signal without overfitting

Open Access

Abstract

Artificial neural network (ANN)-based nonlinear equalizers (NLEs) have been gaining popularity as powerful waveform equalizers for intensity-modulation (IM)/direct-detection (DD) systems. Meanwhile, the M-ary pulse amplitude modulation (PAM-M) format is now widely used for high-speed IM/DD systems. For the training of the ANN-NLE in PAM-M IM/DD systems, it is common to employ pseudorandom binary sequences (PRBSs) for the generation of PAM-M training sequences. However, when the PRBS orders used for training are not sufficiently high, the ANN-NLE might suffer from the overfitting problem, where the equalizer can estimate one or more of the constituent PRBSs from a part of the PAM-M training sequence, and as a result, the trained ANN-NLE shows poor performance for new input sequences. In this paper, we provide a selection guideline for the PRBSs used to train the ANN-NLE for PAM-M signals without experiencing overfitting. For this purpose, we determine the minimum PRBS orders required to train the ANN-NLE for a given input size of the equalizer. Our theoretical analysis is confirmed through computer simulation. The selection guideline is applicable to training the ANN-NLE for PAM-M signals, regardless of symbol coding.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Due to the continuous growth of existing bandwidth-hungry services together with the advent of new user applications, including virtual reality, over-the-top media services, 4K/8K video, the internet of things, and cloud computing services, the data traffic around network edges has been increasing exponentially. For the cost-effective implementation of optical transport systems for short-haul and optical access applications, it is still desirable to utilize the intensity-modulation (IM)/direct-detection (DD) scheme, rather than costly and power-hungry coherent detection [1–6]. However, the square-law detection of the DD receiver makes linear transmission impairments (e.g., fiber chromatic dispersion) nonlinear at the receiver. There have been substantial efforts to compensate for nonlinear waveform distortions by using the decision-feedback equalizer [7], maximum-likelihood sequence estimation [8], and the Volterra equalizer [9]. However, these nonlinear equalizers suffer from one or more of the following limitations: the implementation complexity is prohibitively high for cost-sensitive applications; it is not easy to implement an equalizer operating at a very high baud rate due to its complicated structure; and the performance improvement is limited when the accumulated dispersion is large.

An artificial neural network (ANN)-based nonlinear equalizer (NLE) has recently attracted considerable attention as a powerful means to compensate for nonlinear waveform distortions in optical communications [10–18]. The ANN-NLE is capable of approximating complicated nonlinear functions involving a large amount of data [19]. For the proper operation of the ANN-NLE, it is desirable to train it using a pre-determined sequence, rather than a random sequence. This is because the receiver, commonly more than a few kilometers away from the transmitter, should be able to recover the pre-determined training sequence from an input sequence whose waveforms are severely distorted by the channel. In optical communications, a pseudo-random binary sequence (PRBS) is widely used for training ANN-NLEs. However, the ANN-NLE might be able to estimate the entire PRBS from a part of it, and as a result, the trained ANN-NLE exhibits very poor performance for new input sequences [20–23]. This is because the weights of the ANN-NLE are wrongly set during the training process to estimate the PRBS pattern, instead of correcting the waveform distortions induced by the channel. To solve this overfitting problem, a very long pre-determined sequence can be used for training. Examples include a Mersenne Twister random sequence [22] and the combination of a couple of PRBSs [23]. However, such long training sequences not only increase the complexity of the transmitter and receiver, but also incur long latency for sequence synchronization. Also, it is highly desirable for the training sequence to have a unique single-peak autocorrelation for synchronization, which might not be preserved when a couple of PRBSs are combined to produce a new pre-determined sequence.

Several studies have reported on the overfitting problem of ANNs trained on a specific PRBS order [20–23]. We have recently provided a selection guideline on the PRBS order required for training an ANN-NLE without overfitting [24]. We theoretically determined the minimum PRBS order required to avoid overfitting for a given input size of the ANN-NLE, and confirmed it through simulation for binary systems.

In this paper, we generalize our previous work to M-ary pulse amplitude modulation (PAM-M) signals. The PAM-M format is now widely used for high-speed short-reach IM/DD systems due to its high spectral efficiency in comparison with conventional on-off-keying format. We provide a selection guideline on the PRBS orders required to train the ANN-NLE for the PAM-M signals without experiencing overfitting. To the best of our knowledge, this is the first work on the generalized requirement of PRBSs for training the PAM-M-based ANN-NLE. We first analyze the generation rule of PAM-M signals naturally- [25] and Gray-coded [26] by PRBSs. We then investigate the pivotal tap indexes which govern the decoding of the PAM-M signals to the constituent PRBSs. From those indexes, we determine the minimum PRBS orders required to train the PAM-M-based ANN-NLE for a given input size of the equalizer without overfitting. Our theoretical analysis is verified through computer simulation. We evaluate the performance of the ANN-NLE trained on 28-Gbaud PAM-4 and PAM-8 signals generated by PRBSs and random signals at C-band. We show that the simulation results agree very well with our theoretical analysis.

2. Theoretical analysis

For the generation of PAM-M signals, we need m (= log2 M) PRBSs, each having a different order. We assume that the PRBSs are generated by a two-tap linear feedback shift register (LFSR) and a two-input exclusive-OR (XOR) operation. This generation scheme is commonly used in optical communications [13,17,20–23]. The connection representation of the PRBS can be expressed as [24]

$$\begin{aligned} {x_i}(n )&= {x_i}({n - {l_{1,i}}} )\oplus {x_i}({n - {l_{2,i}}} )\\ &= {x_i}({n + {l_{1,i}}} )\oplus {x_i}({n + {l_{1,i}} - {l_{2,i}}} )\\ &= {x_i}({n + {l_{2,i}}} )\oplus {x_i}({n + {l_{2,i}} - {l_{1,i}}} )\end{aligned}$$
where i is the index of the PRBS ranging from 1 to m, xi(n) is the nth bit of PRBS i, l1,i and l2,i are the tap positions of the two-tap LFSR for PRBS i, and ⊕ is the XOR operation. A PRBS having a pattern length of 2^N – 1 (N is the PRBS order) is denoted by PRBS-N in this paper. Table 1 shows the tap positions of PRBS-N generated by the two-tap LFSR [27]. Here, all the tap positions are positive integers and l1 > l2.
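As an illustration, Eq. (1) can be implemented directly. The sketch below assumes the tap positions (l1, l2) = (7, 6) for PRBS-7 as an example of the values listed in Table 1 [27]; substitute the taps for the order you need.

```python
# A minimal sketch of Eq. (1): x(n) = x(n - l1) XOR x(n - l2).
# Assumed tap positions: (l1, l2) = (7, 6) for PRBS-7, per Table 1 / [27].

def prbs(l1, l2, length, seed=None):
    """Generate `length` bits of a PRBS from a two-tap LFSR with l1 > l2."""
    state = list(seed) if seed else [1] * l1   # any nonzero seed of l1 bits
    out = []
    for _ in range(length):
        bit = state[-l1] ^ state[-l2]          # x(n) = x(n - l1) XOR x(n - l2)
        out.append(bit)
        state = state[1:] + [bit]
    return out

# PRBS-N repeats with the maximal period 2^N - 1 (here 2^7 - 1 = 127).
bits = prbs(7, 6, 2 * 127)
print(bits[:127] == bits[127:254])  # True
```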

Table 1. Tap positions of two-tap LFSR for generation of PRBS-N [27].

PAM-M signals naturally- or Gray-coded by PRBSs can be expressed as

$$y(n )= {x_1}(n )+ \frac{{{x_2}(n )}}{2} + \cdots + \frac{{{x_m}(n )}}{{{2^{m - 1}}}}\;\; \textrm{(Natural coding)}$$
$$y(n )= {x_1}(n )+ \frac{{[{{x_1}(n )\oplus {x_2}(n )} ]}}{2} + \cdots + \frac{{[{{x_1}(n )\oplus {x_2}(n )\oplus \cdots \oplus {x_m}(n )} ]}}{{{2^{m - 1}}}}\;\;\textrm{(Gray coding)}$$
where y(n) is the nth symbol of the PAM-M signal. At the receiver, the ANN-NLE decodes the PAM-M signal, y(n), to the PRBSs, x1(n), x2(n), …, xm(n). Thus, the nth bit of a constituent PRBS i can be expressed as
$${x_i}(n )= {D_i}({y(n )} )$$
where Di is a decoding function. By substituting Eq. (4) into Eq. (1), we have
$$\begin{aligned} {x_i}(n )&= {D_i}({y({n - {l_{1,i}}} )} )\oplus {D_i}({y({n - {l_{2,i}}} )} )\\ &= {D_i}({y({n + {l_{1,i}}} )} )\oplus {D_i}({y({n + {l_{1,i}} - {l_{2,i}}} )} )\\ &= {D_i}({y({n + {l_{2,i}}} )} )\oplus {D_i}({y({n + {l_{2,i}} - {l_{1,i}}} )} ). \end{aligned}$$
This equation shows that the constituent PRBS i can be decoded by using only two PAM-M symbols, regardless of symbol coding. The ANN-NLE is capable of decoding the symbol (i.e., estimating D) as long as the two symbols are provided [19]. Also, those symbols are uniquely specified by two tap indexes, hereinafter referred to as the pivotal tap indexes. Figure 1 shows the pivotal tap indexes for each expression of Eq. (5). Even though the pivotal tap indexes of each expression are different, they all generate the same constituent PRBS i from the PAM-M sequence. Thus, when the input size of the ANN-NLE is large enough to receive both symbols, the equalizer is able to estimate one or more of the constituent PRBSs in their entirety from a part of the PAM-M sequence, and thus it suffers from the overfitting problem.
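The decoding relation of Eq. (5) can be verified numerically. The sketch below assumes tap positions (7, 6) for PRBS-7 and (9, 5) for PRBS-9 (values of the kind listed in Table 1 / [27]), builds a naturally-coded PAM-4 sequence per Eq. (2) with m = 2, and checks that each constituent PRBS bit is the XOR of two decoded symbols:

```python
# Numerical check of Eqs. (2) and (5) for a noiseless, naturally-coded PAM-4 signal.
# Assumed tap positions: PRBS-7 -> (l1, l2) = (7, 6), PRBS-9 -> (9, 5), per [27].

def prbs(l1, l2, length):
    """Two-tap LFSR per Eq. (1): x(n) = x(n - l1) XOR x(n - l2)."""
    state = [1] * l1                      # any nonzero seed
    out = []
    for _ in range(length):
        b = state[-l1] ^ state[-l2]
        out.append(b)
        state = state[1:] + [b]
    return out

x1, x2 = prbs(7, 6, 1000), prbs(9, 5, 1000)

# Eq. (2), natural coding with m = 2: y(n) = x1(n) + x2(n)/2.
y = [a + b / 2 for a, b in zip(x1, x2)]

# Single-symbol decoding functions D1, D2 (exact for the noiseless levels here).
D1 = lambda s: int(s >= 1.0)              # most significant bit
D2 = lambda s: int(s in (0.5, 1.5))       # least significant bit

# Eq. (5): each constituent PRBS bit is the XOR of two decoded PAM symbols
# located at the tap positions of its own LFSR.
for n in range(9, 1000):
    assert x1[n] == D1(y[n - 7]) ^ D1(y[n - 6])
    assert x2[n] == D2(y[n - 9]) ^ D2(y[n - 5])
print("Eq. (5) holds for both constituent PRBSs")
```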

We assume that the ANN-NLE receives a symmetric input sequence including k preceding [i.e., from y(n – k) to y(n – 1)] and k following symbols [i.e., from y(n + 1) to y(n + k)], not counting the current symbol, y(n). This is because the fiber chromatic dispersion makes an optical pulse spread symmetrically from its center. Among the three expressions of Eq. (5), the maximum absolute value of the two pivotal tap indexes in the third expression is smaller than that in the other two expressions because l1,i > l2,i. This is clearly shown in Fig. 1. Thus, in order to include the two symbols located at the pivotal tap indexes in the input sequence of the ANN-NLE, the input size of the ANN-NLE, k, should be greater than or equal to

$${L_i} = \max ({{l_{2,i}},{l_{1,i}} - {l_{2,i}}} ).$$
Thus, Li is the minimum input size of the ANN-NLE required to estimate the constituent PRBS i. Figure 2 shows Li for the PRBSs listed in Table 1. For a given input size of the ANN-NLE, the overfitting occurs if k ≥ Li. For example, if PRBS-15 is used as a constituent PRBS of the training PAM sequence, an ANN-NLE having k greater than or equal to 14 is able to estimate the entire PRBS-15 from the sequence. In other words, an ANN-NLE having k smaller than Li does not suffer from the overfitting problem. For a given input size of the ANN-NLE and PRBS order, the overfitting occurs as long as k is greater than or equal to Li, regardless of the length of the training sequence. Under that condition, the ANN-NLE suffers from the overfitting even when a short training sequence containing an unrepeated PRBS pattern is used for training [24]. It is worth mentioning that if the ANN-NLE receives only preceding or only following symbols, Li should be l1,i because it is then determined by the pivotal tap indexes of the first and second expressions in Eq. (5).
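Eq. (6) can be evaluated directly from the tap positions in Table 1. The sketch below assumes the (l1, l2) tap pairs given in [27] and reproduces the Lmin values quoted in the simulation sections:

```python
# Sketch of Eq. (6): Li = max(l2,i, l1,i - l2,i) for a symmetric input window.
# The (l1, l2) tap pairs below are assumed values taken from the two-tap LFSR
# table in [27] (Table 1).
TAPS = {5: (5, 3), 6: (6, 5), 7: (7, 6), 9: (9, 5), 10: (10, 7),
        11: (11, 9), 15: (15, 14), 17: (17, 14), 18: (18, 11)}

def L(order):
    l1, l2 = TAPS[order]
    return max(l2, l1 - l2)                 # Eq. (6)

def L_min(orders):
    return min(L(N) for N in orders)        # Eq. (7): no overfitting if k < L_min

print(L_min([7, 9]), L_min([10, 11]), L_min([15, 17]))  # 5 7 14
print(L_min([5, 6, 9]))                                  # 3
```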

Fig. 1. The pivotal tap indexes for each expression in Eq. (5). Shown at the bottom is the range of the symmetric input sequence of ANN-NLE including k preceding and k following symbols. The hollow circle implies that it is not included in the input sequence.

Fig. 2. The minimum input size of ANN-NLE required to have the overfitting, Li, versus the PRBS order.

The overfitting occurs when the ANN-NLE can estimate one or more of the constituent PRBSs in their entirety from the input sequence. Thus, the input size of the ANN-NLE required to avoid the overfitting problem is determined by the minimum value among Li, and we have

$$k < {L_{\min }} = \min ({{L_1},{L_2}, \cdots ,{L_m}} ).$$
It should be noted that Lmin values are independent of whether natural or Gray coding is employed. Thus, the selection guideline provided in Eq. (7) is applicable to any PAM-M signals regardless of symbol coding.
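A minimal sketch of using Eq. (7) as a design check follows; the tap pairs are assumed values from the two-tap LFSR table in [27]:

```python
# Design check based on Eq. (7): for a symmetric input size k, a set of PRBS
# orders is safe only if k < L_min. Tap pairs are assumed values from [27].
TAPS = {7: (7, 6), 9: (9, 5), 15: (15, 14), 17: (17, 14)}

def overfits(k, orders):
    l_min = min(max(l2, l1 - l2) for l1, l2 in (TAPS[N] for N in orders))
    return k >= l_min          # True -> the ANN-NLE can learn a constituent PRBS

print(overfits(5, [7, 9]))     # True:  k = 5 >= L_min = 5
print(overfits(5, [15, 17]))   # False: k = 5 <  L_min = 14
```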

It is worth mentioning that we can also generate the PAM-M training sequence by using m (decorrelated) copies of a single PRBS, instead of using m PRBSs having different orders as discussed above. In this case, the Lmin value is smaller than in the case where multiple different PRBSs are employed. This implies that when a single PRBS is utilized to generate PAM signals, a higher PRBS order should be employed to avoid the overfitting than when multiple different PRBSs are used. Thus, it is desirable to utilize multiple different PRBSs for the generation of PAM signals rather than decorrelated copies of a single PRBS.

3. Verification through simulation

3.1 Simulation setup

We first verify our theoretical analysis of Eq. (7) through simulation. Figure 3 shows the simulation setup. We first generate 28-Gbaud PAM signals by using m different PRBSs. In this simulation, we use PAM-4 and PAM-8 signals because they are now widely used for IM/DD systems. The generated optical PAM signals are then detected by square-law detection. After adding additive white Gaussian noise (AWGN) to the detected signals, we send them to an ANN-NLE. The inset in Fig. 3 shows the structure of the 3-layer ANN-NLE, which has an input layer, one hidden layer, and an output layer. The input layer receives a current symbol together with k preceding and k following symbols. The rectified linear unit (ReLU) and sigmoid functions are used as the activation functions in the hidden and output layers, respectively. The number of neurons in the hidden layer, Nh, is set to 30. At the output layer, there are m output neurons, the same as the number of PRBSs used to generate the PAM-M signals. Each output neuron then generates a demodulated binary sequence. In the training process, the mean squared error (MSE) is used as the loss function at the output layer. The connection weights, w(1) and w(2), are updated by the back-propagation algorithm [28] based on gradient descent [29]. The ANN-NLE is trained with one million training data sets (each data set includes a bias, a current symbol, and k preceding and k following symbols) with a batch size of 200 over 50 epochs. For the measurement of the bit-error ratio (BER) after training, we use one million test data sets, which are different from the training data sets.
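The forward pass of the 3-layer ANN-NLE described above can be sketched as follows. This is a structural sketch only: the weight initialization is ours, and the training loop (MSE loss, back propagation) is omitted, so it should not be read as the authors' exact implementation.

```python
import numpy as np

# Structural sketch of the 3-layer ANN-NLE: a bias plus 2k + 1 received symbols
# at the input, Nh = 30 ReLU hidden neurons, and m sigmoid outputs (one per
# constituent PRBS). Weight names w1/w2 mirror w(1)/w(2) in the text.

def make_ann_nle(k, m, Nh=30, seed=0):
    rng = np.random.default_rng(seed)
    n_in = 2 * k + 2                          # bias + current symbol + 2k neighbors
    w1 = rng.normal(0.0, 0.1, (n_in, Nh))     # w(1): input -> hidden (assumed init)
    w2 = rng.normal(0.0, 0.1, (Nh, m))        # w(2): hidden -> output (assumed init)

    def forward(window):
        """window: 2k + 1 received symbols centered on the current symbol y(n)."""
        x = np.concatenate(([1.0], window))           # prepend the bias input
        h = np.maximum(0.0, x @ w1)                   # ReLU hidden layer
        return 1.0 / (1.0 + np.exp(-(h @ w2)))        # sigmoid outputs in (0, 1)

    return forward

forward = make_ann_nle(k=5, m=2)
out = forward(np.zeros(11))    # 2k + 1 = 11 input symbols
print(out.shape)               # (2,)
```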

Fig. 3. Simulation setup.

3.2 Minimum input size of ANN-NLE required to estimate PAM-M signal

We first obtain the BER performances for the PAM-4 signals generated by two PRBSs. Figure 4 shows the BER performance of the naturally-coded PAM-4 signals as a function of the input size of the ANN-NLE, k. The received power is −13 dBm. The PAM-4 signals generated by PRBS-7 and 9, PRBS-10 and 11, and PRBS-15 and 17 are used for training the ANN-NLE. The performance of the ANN-NLE is then evaluated by using four test PAM-4 signals: three PAM-4 signals generated by two PRBSs and one random PAM-4 signal. It should be noted that the test data are different from the training data even when the PRBS orders are the same. The results show that when k is smaller than Lmin as given by Eq. (7), the BERs exhibit similar values for all the test PAM-4 signals. For example, the ANN-NLE trained on the PAM-4 signals generated by PRBS-10 and 11 shows a BER of ∼5 × 10^−2 for all the test PAM-4 signals when k is less than or equal to 6 in Fig. 4(b). It should be noted that Lmin = min(7, 9) = 7. Thus, there is no overfitting in this case. However, if k ≥ Lmin, the ANN-NLE is capable of estimating one or more of the constituent PRBSs, and thus we have the overfitting problem. For example, Fig. 4(a) shows that when k ≥ 5, the BER degrades for all test PAM-4 signals except the one generated by the same PRBSs as the training signal, whose BER instead improves. This clearly indicates the overfitting. Figure 4 shows that the minimum input sizes of the ANN-NLE required to have the overfitting when trained on PAM-4 signals generated by PRBS-7 and 9, PRBS-10 and 11, and PRBS-15 and 17 are 5, 7, and 14, respectively. These agree very well with our theoretical results of Eq. (7). We obtain the same results for PAM-4 signals both naturally- and Gray-coded by two PRBSs.

Fig. 4. BER performances as a function of input size of ANN-NLE, k. The ANN-NLE is trained on PAM-4 signals naturally-coded by (a) PRBS-7 and 9, (b) PRBS-10 and 11, and (c) PRBS-15 and 17. The received power is -13 dBm.

The occurrence of overfitting can also be explained by observing the importance distribution of the input data. For this purpose, we define the importance, Wp, as the sum of the absolute weights between each input neuron and the hidden neurons at a time index p. It can be expressed as [24]

$${W_p} = \sum\limits_{q = 1}^{{N_h}} {|{w_{pq}^{(1)}} |} ,$$
where p is the time index of the input sequence ranging from −k to k, q is the index of the hidden neuron ranging from 1 to Nh, and wpq(1) is the connection weight between the input symbol at time index p and the qth hidden neuron. Figure 5 shows the importance distributions for the PAM-4 signal naturally coded by PRBS-7 and 9. When the input size of the ANN-NLE is 4, there is only one peak, at the zero time index, as shown in Fig. 5(a). However, when k is 5 in Fig. 5(b), there are additional peaks at time indexes of −4 and 5, which correspond to the pivotal tap indexes of PRBS-9. This means that the ANN-NLE estimates the constituent PRBS-9 from the input signal. Thus, the ANN-NLE suffers from the overfitting when k ≥ 5. We confirm that, in the other cases as well, the normalized Wp exhibits larger values at the pivotal tap indexes than at the other tap indexes when k ≥ Lmin.
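Eq. (8) is straightforward to compute from the trained input-to-hidden weight matrix. A minimal sketch follows; the random weights below are only a stand-in for a trained w(1), and the bias input is omitted from the rows:

```python
import numpy as np

# Sketch of Eq. (8): the importance W_p of the input tap at time index p is the
# sum over all Nh hidden neurons of the absolute input-to-hidden weights.
# Rows of w1 are assumed ordered p = -k, ..., k (bias input omitted here).

def importance(w1):
    """w1: (2k + 1, Nh) weight matrix; returns tap indexes and W_p per Eq. (8)."""
    Wp = np.abs(w1).sum(axis=1)        # sum over the hidden-neuron index q
    k = (w1.shape[0] - 1) // 2
    taps = np.arange(-k, k + 1)
    return taps, Wp

k, Nh = 5, 30
w1 = np.random.default_rng(1).normal(size=(2 * k + 1, Nh))  # stand-in weights
taps, Wp = importance(w1)
print(Wp.shape)  # (11,)
```

In a trained equalizer, peaks of Wp away from p = 0 that line up with the pivotal tap indexes signal overfitting, as in Figs. 5 and 7.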

Fig. 5. The importance distributions of the input data for PAM-4 signal naturally coded by PRBS-7 and 9 when the input sizes of ANN-NLE are (a) 4 and (b) 5.

Next, we generate the PAM-8 signals from three different PRBS orders. Figure 6 shows the BER performances of the PAM-8 signals Gray-coded by three PRBSs. The ANN-NLE is trained on the PAM-8 signals generated by PRBS-5, 6, and 9, PRBS-7, 10, and 11, and PRBS-15, 17, and 18, respectively. The received power is set to −10 dBm. For the evaluation of BER performance, we utilize three PAM-8 signals generated by PRBSs and a random PAM-8 signal. The results show exactly the same tendency we observed for the PAM-4 signals in Fig. 4. For example, the BER is measured to be ∼5 × 10^−2 when k < Lmin. Once the input size of the ANN-NLE becomes greater than or equal to Lmin, the BER values start to deviate from this value. This is because the equalizer estimates one or more of the constituent PRBSs and thus suffers from the overfitting. Figure 6 clearly shows that the minimum input sizes of the ANN-NLE required to have the overfitting when trained on the PAM-8 signals generated by PRBS-5, 6, and 9, PRBS-7, 10, and 11, and PRBS-15, 17, and 18 are 3, 6, and 11, respectively. These results agree very well with our theoretical analysis.

Fig. 6. BER performances as a function of input size of ANN-NLE, k. The ANN-NLE is trained on PAM-8 signals Gray-coded by (a) PRBS-5, 6, and 9, (b) PRBS-7, 10, and 11, and (c) PRBS-15, 17, and 18. The received power is -10 dBm.

Figure 7 shows the importance distributions of the PAM-8 signal Gray-coded by PRBS-5, 6, and 9. In Fig. 7(a), there is only one peak, at the time index of zero, when the input size of the ANN-NLE is 2. This is an example of the case where no overfitting occurs. On the other hand, when k is 3, there are additional peaks at time indexes of −2 and 3, which match the pivotal tap indexes of PRBS-5, as shown in Fig. 7(b). Thus, the overfitting occurs when k ≥ 3. We confirm that the pivotal tap indexes show much higher importance in the other cases as well when k ≥ Lmin.

Fig. 7. The importance distributions of the input data for PAM-8 signal Gray coded by PRBS-5, 6, and 9 when the input sizes of ANN-NLE are (a) 2 and (b) 3.

4. Effect of overfitting on BER performance of ANN-NLE

4.1 Simulation setup

Next, we carry out a simulation study to evaluate the transmission performance of an IM/DD system utilizing the ANN-NLE. Figure 8 shows the simulation setup. The optical PAM-4 and PAM-8 signals are first generated by using 2 and 3 different PRBSs, respectively. The optical PAM-M signals at 1.55 µm are then transmitted over a 15-km-long standard single-mode fiber (SSMF) having a dispersion parameter of 18 ps/nm/km. After the square-law detection, AWGN is added to the received signal at the receiver. The waveform distortions induced by the chromatic dispersion of the fiber are compensated by using a 3-layer ANN-NLE. The ANN-NLE has the same structure as in the inset of Fig. 3, and is trained and evaluated under the same conditions as in subsection 3.1.

Fig. 8. Simulation setup for 15-km transmission of PAM-M signal.

The input size of the equalizer should be determined by the delay spread of the channel. For 28-Gbaud signals having a spectral width of 0.45 nm, the delay spread of the 15-km-long SSMF link is estimated to be ∼121 ps, which corresponds to 4 symbols. Thus, we set the input size of the ANN-NLE, k, to be 5, giving a one-symbol margin. However, the input size of the equalizer could be much larger than this value when the transmission distance is long and the signal baud rate is high. For example, the input size of the equalizer should be as high as 90 when 56-Gbaud PAM-4 signals are employed for an 80-km link. In such a case, PAM-4 signals generated by PRBS-93 and PRBS-97 can be used for training an ANN-NLE. Thus, even when the input size of the ANN-NLE is very large, the selection guideline on PRBS orders reported in this paper remains applicable. Tap positions of two-tap and four-tap LFSRs for the generation of PRBS-N are provided up to N = 4096 in [27].
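The delay-spread estimate above follows from the dispersion parameter and spectral width; a quick back-of-the-envelope check, using the setup values stated in this section:

```python
# Delay-spread check: D * (spectral width) * (fiber length), then compare
# against the symbol period at 28 Gbaud. All values are from this section.
D = 18.0           # dispersion parameter, ps/nm/km
dl = 0.45          # signal spectral width, nm
span = 15.0        # fiber length, km
baud = 28e9        # symbol rate, symbols/s

delay_spread = D * dl * span       # ps; 18 * 0.45 * 15 = 121.5 ps (~121 ps)
symbol_period = 1e12 / baud        # ps; ~35.7 ps at 28 Gbaud
print(delay_spread / symbol_period)  # ~3.4 -> spans about 4 symbols
```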

4.2 Effect of overfitting in PAM-M transmission

We investigate the BER performances of the ANN-NLE by varying the received power at a given input size of the ANN-NLE. Figure 9 shows the BER results of the ANN-NLE trained on six PAM-4 signals: five PAM-4 signals generated by two PRBSs and a random PAM-4 signal. After training, the ANN-NLE is tested with another random PAM-4 signal. The BER performances together with the importance distributions are shown for naturally- and Gray-coded PAM-4 signals in Figs. 9(a) and (b), respectively. The results show that the ANN-NLE performs similarly when it is trained on the PAM-4 signals generated by PRBS-7 and 10, PRBS-10 and 11, and PRBS-15 and 17, and on the random PAM-4 signal. This implies that the overfitting does not occur in the training process, which agrees very well with our theoretical analysis. For example, the Lmin values for the PAM-4 signals generated by PRBS-7 and 10, PRBS-10 and 11, and PRBS-15 and 17 are 6, 7, and 14, respectively, according to Eq. (7). They are all larger than the given input size of the ANN-NLE (i.e., k = 5). Thus, there is no overfitting in those cases. On the other hand, when the ANN-NLE is trained on the PAM-4 signals generated by PRBS-5 and 6 and by PRBS-7 and 9, the Lmin values are calculated to be 3 and 5, respectively. Since k is greater than or equal to those Lmin values, we expect the equalizer to experience the overfitting. The BER results depicted in Fig. 9 confirm this: when the ANN-NLE is trained on the PAM-4 signals generated from PRBS-5 and 6 and from PRBS-7 and 9, the BER is degraded by more than a factor of 5 at a received power of −5 dBm, compared to the case where the equalizer is trained on the random PAM-4 signal.

Fig. 9. BER performances for the ANN-NLE trained on the PAM-4 signal (a) naturally- and (b) Gray-coded by two PRBSs when k = 5. Three insets on the right are the importance distributions of the input data. The red inverse triangles in the insets represent the positions of pivotal tap indexes.

These results are also confirmed by observing the importance distribution of the input data. From the right bottom insets of Figs. 9(a) and (b), the importance distributions show that the current symbol affects the output most significantly, and the influence of the preceding and following symbols becomes weaker as they move farther away from the current symbol. These are typical importance distributions shaped by fiber chromatic dispersion. However, when the overfitting occurs, the pivotal tap indexes manifest themselves as strong peaks in the importance distribution, since the input symbols located at the pivotal tap indexes contribute considerably to the estimation of one or more of the constituent PRBSs. The right middle insets in Figs. 9(a) and (b) show that Wp has significant importance at time indexes of −4 and 5 (marked with the red inverse triangles in the insets), which match the pivotal tap indexes of PRBS-9. It should be noted that the equalizer suffers from the overfitting in this case. In the right top insets of Figs. 9(a) and (b), we also observe several peaks in the importance distribution at the pivotal tap indexes of PRBS-5 (i.e., −5, −3, −2, 2, 3, and 5) and PRBS-6 (i.e., −1 and 5). The equalizer estimates both constituent PRBSs during the training process in this case, and thus suffers severely from the overfitting.

Next, we obtain the BER performances when the ANN-NLE is trained on four PAM-8 signals and tested with another random PAM-8 signal. Figure 10 shows the BER results. In the same manner as in the previous case, the BERs of the ANN-NLE trained on the PAM-8 signals generated by PRBS-7, 10, and 11 and by PRBS-15, 17, and 18, and on the random PAM-8 signal, have similar values. These results also match our theoretical analysis. From Eq. (7), the Lmin values are calculated to be 6 and 11 for the PAM-8 signals generated by PRBS-7, 10, and 11 and by PRBS-15, 17, and 18, respectively. Since these Lmin values are greater than the given input size of the ANN-NLE, the equalizer does not suffer from the overfitting. However, when the ANN-NLE is trained on the PAM-8 signals generated by PRBS-5, 6, and 9, we observe BER degradation. This is because of overfitting, since Lmin = min(3, 5, 5) = 3 in this case. From the results in this section, we confirm that the ANN-NLE can be trained on the PAM-4 and PAM-8 signals without the overfitting when the constituent PRBSs are selected such that Lmin > k. This guideline applies regardless of symbol coding.

Fig. 10. BER performances of the ANN-NLE trained on the PAM-8 signal (a) naturally- and (b) Gray-coded by three PRBSs when k = 5.

5. Conclusions

We have provided a selection guideline for the PRBS orders used to generate the PAM-M signals required to train the ANN-NLE without the overfitting problem. For this purpose, we first determine the minimum input size of the ANN-NLE required to estimate one or more of the constituent PRBSs in their entirety from the input PAM-M sequences. This theoretical analysis is then used to find the PRBS orders required to avoid the overfitting problem for a given input size of the ANN-NLE. Our theoretical analysis is confirmed through computer simulation. The results on BER performance and importance distribution clearly show that we can train the ANN-NLE without experiencing the overfitting problem when the PRBS orders satisfy Eq. (7), regardless of symbol coding. We believe that our findings can be used to select the PRBSs for training ANN-based systems, including ANN-based fiber channel modeling and ANN-NLEs.

Funding

Institute for Information and Communications Technology Promotion (2021-0-00809).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. D. V. Plant, M. Morsy-Osman, and M. Chagnon, “Optical communication systems for datacenter networks,” in Proc. Opt. Fiber Commun. Conf. (OFC) (2017), paper W3B.1.

2. S. Randel, F. Breyer, S. C. J. Lee, and J. W. Walewski, “Advanced modulation schemes for short-range optical communications,” IEEE J. Select. Topics Quantum Electron. 16(5), 1280–1289 (2010). [CrossRef]  

3. J. L. Wei, Q. Zhang, L. Zhang, N. Stojanovic, C. Prodaniuc, F. Karinou, and C. Xie, “Challenges and advances of direct detection systems for DCI and metro networks,” in Proc. Opt. Fiber Commun. Conf. (OFC) (2018), paper W2A.60.

4. J. C. Cartledge and A. S. Karar, “100 Gb/s intensity modulation and direct detection,” J. Lightwave Technol. 32(16), 2809–2814 (2014). [CrossRef]  

5. H. Kim, “High-speed optical transmission system using 1.55-µm directly modulated laser,” in Proc. Int. Conf. Opt. Instruments and Technol. (2017).

6. C. Sun, S. Bae, and H. Kim, “Transmission of 28-Gb/s duobinary and PAM-4 signals using DML for optical access network,” IEEE Photon. Technol. Lett. 29(1), 130–133 (2017). [CrossRef]  

7. M. Kim, S. H. Bae, H. Kim, and Y. C. Chung, “Transmission of 56-Gb/s PAM-4 signal over 20 km of SSMF using a 1.55-µm directly-modulated laser,” in Proc. Opt. Fiber Commun. Conf. (OFC) (2017), paper Tu2D.6.

8. T. Xu, Z. Li, J. Peng, A. Tan, Y. Song, Y. Li, J. Chen, and M. Wang, “Decoding of 10-G optics-based 50-Gb/s PAM-4 signal using simplified MLSE,” IEEE Photonics J. 10(4), 1–8 (2018). [CrossRef]  

9. N. Stojanovic, F. Karinou, Q. Zhang, and C. Prodaniuc, “Volterra and Wiener equalizers for short-reach 100G PAM-4 applications,” J. Lightwave Technol. 35(21), 4583–4594 (2017). [CrossRef]  

10. M. A. Jarajreh, E. Giacoumidis, I. Aldaya, S. T. Le, A. Tsokanos, Z. Ghassemlooy, and N. J. Doran, “Artificial neural network nonlinear equalizer for coherent optical OFDM,” IEEE Photon. Technol. Lett. 27(4), 387–390 (2015). [CrossRef]  

11. E. Giacoumidis, S. T. Le, M. Ghanbarisabagh, M. McCarthy, I. Aldaya, S. Mhatli, M. A. Jarajreh, P. A. Haigh, N. J. Doran, A. D. Ellis, and B. J. Eggleton, “Fiber nonlinearity-induced penalty reduction in CO-OFDM by ANN-based nonlinear equalization,” Opt. Lett. 40(21), 5113–5116 (2015). [CrossRef]  

12. M. Luo, F. Gao, X. Li, Z. He, and S. Fu, “Transmission of 4×50-Gb/s PAM-4 signal over 80-km single mode fiber using neural network,” in Proc. Opt. Fiber Commun. Conf. (OFC) (2018), paper M2F.2.

13. P. Li, L. Yi, L. Xue, and W. Hu, “56 Gb/s IM/DD PON based on 10G-class optical devices with 29 dB loss budget enabled by machine learning,” in Proc. Opt. Fiber Commun. Conf. (OFC) (2018), paper M2B.2.

14. A. G. Reza and J.-K. K. Rhee, “Nonlinear equalizer based on neural networks for PAM-4 signal transmission using DML,” IEEE Photon. Technol. Lett. 30(15), 1416–1419 (2018). [CrossRef]  

15. W. Zhang, L. Ge, Y. Zhang, C. Liang, and Z. He, “Compressed nonlinear equalizers for 112-Gbps optical interconnects: Efficiency and stability,” Sensors 20(17), 4680 (2020). [CrossRef]  

16. C.-Y. Chuang, L.-C. Liu, C.-C. Wei, J.-J. Liu, L. Henrickson, W.-J. Huang, C.-L. Wang, Y.-K. Chen, and J. Chen, “Convolutional neural network based nonlinear classifier for 112-Gbps high speed optical link,” in Proc. Opt. Fiber Commun. Conf. (OFC) (2018), Paper W2A.43.

17. C. Ye, D. Zhang, X. Hu, X. Huang, H. Feng, and K. Zhang, “Recurrent neural network (RNN) based end-to-end nonlinear management for symmetrical 50Gbps NRZ PON with 29dB+ loss budget,” in Proc. Eur. Conf. Opt. Commun. (ECOC) (2018), Paper Mo4B.3.

18. Z. Xu, S. Dong, J. H. Manton, and W. Shieh, “Low-complexity multi-task learning aided neural networks for equalization in short-reach optical interconnects,” J. Lightwave Technol. 40(1), 45–54 (2022). [CrossRef]  

19. S. Ferrari and R. F. Stengel, “Smooth function approximation using neural networks,” IEEE Trans. Neural Netw. 16(1), 24–38 (2005). [CrossRef]  

20. T. A. Eriksson, H. Bülow, and A. Leven, “Applying neural networks in optical communication systems: Possible pitfalls,” IEEE Photon. Technol. Lett. 29(23), 2091–2094 (2017). [CrossRef]

21. L. Shu, J. Li, Z. Wan, W. Zhang, S. Fu, and K. Xu, “Overestimation trap of artificial neural network: Learning the rule of PRBS,” in Proc. Eur. Conf. Opt. Commun. (ECOC) (2018), paper Tu4f.1.

22. C.-Y. Chuang, L.-C. Liu, C.-C. Wei, J.-J. Liu, L. Henrickson, C.-L. Wang, Y.-K. Chen, and J. Chen, “Study of training patterns for employing deep neural networks in optical communications systems,” in Proc. Eur. Conf. Opt. Commun. (ECOC) (2018), paper Tu4f.2.

23. T. Liao, L. Xue, L. Huang, W. Hu, and L. Yi, “Training data generation and validation for a neural network-based equalizer,” Opt. Lett. 45(18), 5113–5116 (2020). [CrossRef]  

24. J. Kim and H. Kim, “Length of pseudorandom binary sequence required to train artificial neural network without overfitting,” IEEE Access 9, 125358–125365 (2021). [CrossRef]  

25. K. Szczerba, P. Westbergh, J. Karout, J. S. Gustavsson, Å. Haglund, M. Karlsson, P. A. Andrekson, E. Agrell, and A. Larsson, “4-PAM for high-speed short-range optical communications,” J. Opt. Commun. Netw. 4(11), 885–894 (2012). [CrossRef]  

26. R. W. Doran, “The Gray code,” J. Univers. Comput. Sci. 13(11), 1573–1597 (2007).

27. R. Ward and T. Molteno, “Table of linear feedback shift registers,” University of Otago (2012).

28. S. Haykin, Neural Networks and Learning Machines, 3rd ed. (Pearson Education, 2009), pp. 122–229.

29. S. Ruder, “An overview of gradient descent optimization algorithms,” arxiv.org/abs/1609.04747 (2016).

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Figures (10)

Fig. 1. The pivotal tap indexes for each expression in Eq. (5). Shown at the bottom is the range of the symmetric input sequence of the ANN-NLE, including k preceding and k following symbols. A hollow circle indicates a symbol not included in the input sequence.
Fig. 2. The minimum input size of the ANN-NLE required for overfitting to occur, Li, versus the PRBS order.
Fig. 3. Simulation setup.
Fig. 4. BER performance as a function of the input size of the ANN-NLE, k. The ANN-NLE is trained on PAM-4 signals naturally coded by (a) PRBS-7 and 9, (b) PRBS-10 and 11, and (c) PRBS-15 and 17. The received power is −13 dBm.
Fig. 5. The importance distributions of the input data for the PAM-4 signal naturally coded by PRBS-7 and 9 when the input size of the ANN-NLE is (a) 4 and (b) 5.
Fig. 6. BER performance as a function of the input size of the ANN-NLE, k. The ANN-NLE is trained on PAM-8 signals Gray-coded by (a) PRBS-5, 6, and 9, (b) PRBS-7, 10, and 11, and (c) PRBS-15, 17, and 18. The received power is −10 dBm.
Fig. 7. The importance distributions of the input data for the PAM-8 signal Gray-coded by PRBS-5, 6, and 9 when the input size of the ANN-NLE is (a) 2 and (b) 3.
Fig. 8. Simulation setup for 15-km transmission of the PAM-M signal.
Fig. 9. BER performance of the ANN-NLE trained on the PAM-4 signal (a) naturally and (b) Gray coded by two PRBSs when k = 5. The three insets on the right show the importance distributions of the input data; the red inverted triangles mark the positions of the pivotal tap indexes.
Fig. 10. BER performance of the ANN-NLE trained on the PAM-8 signal (a) naturally and (b) Gray coded by three PRBSs when k = 5.

Tables (1)

Table 1. Tap positions of the two-tap LFSRs for generation of PRBS-N [27].
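A two-tap Fibonacci LFSR of the kind tabulated in [27] can be sketched as below. The tap pair (7, 6) for PRBS-7, i.e. the feedback polynomial x^7 + x^6 + 1, is an assumption taken from standard LFSR tables rather than from this paper; the recurrence it produces is exactly the shift-and-add form of Eq. (1).

```python
def prbs(order, taps, seed=None):
    """One period of a PRBS-N from a two-tap Fibonacci LFSR.
    taps = (l1, l2): every output bit satisfies x(n) = x(n - l1) XOR x(n - l2)."""
    state = list(seed) if seed else [1] * order   # any nonzero seed works
    seq = []
    for _ in range(2 ** order - 1):               # maximal period: 2^N - 1
        seq.append(state[-1])                     # output the oldest bit
        fb = state[taps[0] - 1] ^ state[taps[1] - 1]
        state = [fb] + state[:-1]                 # shift the feedback bit in
    return seq

s = prbs(7, (7, 6))                               # PRBS-7, one period of 127 bits
```

For a maximal-length sequence, the output is balanced (2^(N−1) ones per period) and the recurrence of Eq. (1) holds cyclically over the whole period, which is the property the overfitted ANN-NLE ends up learning.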

Equations (8)

$$x_i(n) = x_i(n-l_{1,i}) \oplus x_i(n-l_{2,i}) = x_i(n+l_{1,i}) \oplus x_i(n+l_{1,i}-l_{2,i}) = x_i(n+l_{2,i}) \oplus x_i(n+l_{2,i}-l_{1,i}) \tag{1}$$
$$y(n) = x_1(n) + x_2(n)\cdot 2 + \cdots + x_m(n)\cdot 2^{m-1} \quad \text{(Natural coding)} \tag{2}$$
$$y(n) = x_1(n) + \left[x_1(n) \oplus x_2(n)\right]\cdot 2 + \cdots + \left[x_1(n) \oplus x_2(n) \oplus \cdots \oplus x_m(n)\right]\cdot 2^{m-1} \quad \text{(Gray coding)} \tag{3}$$
$$x_i(n) = D_i(y(n)) \tag{4}$$
$$x_i(n) = D_i(y(n-l_{1,i})) \oplus D_i(y(n-l_{2,i})) = D_i(y(n+l_{1,i})) \oplus D_i(y(n+l_{1,i}-l_{2,i})) = D_i(y(n+l_{2,i})) \oplus D_i(y(n+l_{2,i}-l_{1,i})). \tag{5}$$
$$L_i = \max(l_{2,i},\, l_{1,i}-l_{2,i}). \tag{6}$$
$$k < L_{\min} = \min(L_1, L_2, \ldots, L_m). \tag{7}$$
$$W_p = \sum_{q=1}^{N_h} \left| w_{pq}^{(1)} \right|, \tag{8}$$
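As a quick numerical illustration of Eqs. (6) and (7), the overfitting bound L_min can be computed directly from the LFSR tap pairs. The sketch below assumes the standard feedback polynomials x^7 + x^6 + 1 for PRBS-7 and x^9 + x^5 + 1 for PRBS-9 (i.e. tap pairs taken from standard tables, not from this paper) for a PAM-4 training signal built from two PRBSs.

```python
def overfit_bound(taps):
    """Eqs. (6)-(7): overfitting is avoided when the ANN-NLE input size k
    satisfies k < L_min, given the (l1, l2) tap pair of each constituent PRBS."""
    L = [max(l2, l1 - l2) for (l1, l2) in taps]  # Eq. (6): L_i = max(l_2i, l_1i - l_2i)
    return min(L)                                # Eq. (7): L_min = min(L_1, ..., L_m)

# PAM-4 from PRBS-7 (taps 7, 6) and PRBS-9 (taps 9, 5):
# L_1 = max(6, 7-6) = 6, L_2 = max(5, 9-5) = 5, so L_min = 5
print(overfit_bound([(7, 6), (9, 5)]))  # -> 5
```

With these assumed tap pairs the bound predicts that input sizes k ≥ 5 permit overfitting, consistent with the behavior shown for PRBS-7 and 9 in Figs. 4(a) and 5.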