
Discrete wavelet transform assisted convolutional neural network equalizer for PAM VLC system

Open Access

Abstract

To compensate for both linear and nonlinear impairments in visible light communication (VLC) systems, we propose a novel discrete wavelet transform-assisted convolutional neural network (DWTCNN) equalizer that combines the advantages of the wavelet transform and deep learning methods. Specifically, the DWTCNN uses the wavelet transform to decompose the signal into diverse coefficient series and applies an adaptive soft-threshold method to eliminate redundant information in the signal. The coefficients are then reconstructed to complete the signal compensation. Experimental results show that the proposed DWTCNN equalizer significantly reduces nonlinear impairment and improves system performance, keeping the bit error rate (BER) under the 7% hard-decision forward error correction (HD-FEC) limit of 3.8 × 10−3. Compared experimentally with long short-term memory (LSTM) and entity extraction neural network (EXNN) equalizers, the Q factor is improved by 0.76 and 0.53 dB, and the operating range of the direct current (DC) bias is extended by 4.76% and 23.5%, respectively.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Visible light communication refers to optical wireless communication that transmits information by modulating light in the visible spectrum used for illumination. Due to its rich bandwidth resources, license-free spectrum, high security, and immunity to electromagnetic interference, VLC has aroused great research interest [1], and it is predicted to become an indispensable key technology in 6G wireless networks [2,3]. Currently, VLC has been widely applied in special fields [4] such as underwater, space, and nuclear power plants.

However, due to linear and nonlinear distortion, the channel capacity and transmission distance of visible light communication systems are severely limited [5]. How to alleviate this distortion has therefore attracted attention in VLC research. Linear equalization techniques are mature, and nonlinear compensation methods based on traditional adaptive equalization and machine learning are considered effective solutions. By combining linear convolution with a nonlinear power series, the Volterra-based equalization technique effectively compensates for impairments [6]; however, a Volterra-based equalizer incurs significant computational complexity when unfolding higher-order terms. Since networks based on multilayer perceptron structures have the virtue of fitting complex functions, neural networks have recently been used to improve nonlinear distortion compensation in VLCs. For example, an equalizer based on an artificial neural network (ANN) has been used to perform joint spatial and temporal equalization in MIMO-VLC systems [7]. In [8], a Gaussian kernel-assisted deep neural network (GK-DNN) equalizer was used to compensate for nonlinear distortion in underwater PAM-8 visible light communication. Equalization schemes based on convolutional neural networks (CNN) [9,10] and post-equalizers based on long short-term memory (LSTM) [11] can effectively mitigate nonlinear distortion with memory in the time domain. In [12], a novel entity extraction neural network (EXNN) used convolutional kernels to extract the distortion as an entity and compensate the signal.

At the same time, [13] showed that the wavelet transform has a unique advantage in handling non-smooth signal features, as it extracts local features well while avoiding information loss. By adopting the parameter-sharing mechanism of the CNN, the combination also reduces network complexity. The main idea is to use the wavelet transform to decompose the signal into coefficient groups at different frequency levels, obtaining information related to the signal's characteristics. This information is then used to determine optimal thresholds that are applied to these coefficient groups, effectively removing redundant data. The experimental results prove that incorporating the wavelet transform structure is indeed effective in improving system performance.

Motivated by the above, we propose a discrete wavelet transform-assisted convolutional neural network (DWTCNN) equalizer that draws on classical signal processing techniques as well as deep learning and attention mechanisms. Specifically, the DWTCNN equalizer utilizes the wavelet transform to decompose the signal into approximation and detail coefficients and applies an adaptive soft-threshold structure to mitigate the linear and nonlinear distortion in specific coefficients. While exploiting the ability of neural networks to approximate complex functions, the wavelet expansion tends to concentrate the signal energy into a small number of large coefficients. This energy-concentrating property of the DWT allows wavelet analysis to compensate for the damaged signals in VLC systems. The coefficients are then reconstructed to recover the original signal, thus achieving signal compensation. Our experiments validate the feasibility of the proposed scheme and show the superior equalization performance of the DWTCNN. Under optimal voltage and different current conditions, the DWTCNN improved the Q factor by 0.76 and 0.53 dB compared with the LSTM and EXNN equalizers, respectively, which compensate a single time-domain signal.

2. Principle

In an LED-based VLC system, signals are subject to damage from system devices and transmission links during emission, transmission, and reception. The nonlinear distortion is mainly caused by the response of the electrical amplifier (EA) on the transmitter side, the PIN photodetector on the receiver side, and components such as the transmitter drive link and the LEDs, and it significantly impacts the recovery of the original signal from the received signal [14]. The wavelet transform allows us to decompose the original signal and its unimportant parts into different groups of coefficients [15]. Experimentally, we found that decomposing the damaged signal into different wavelet coefficients does not exhibit a significant difference in the time and frequency domains in VLCs, as illustrated in Fig. 5. Our work combines the wavelet transform and deep learning to develop an equalizer for compensating linear and nonlinear distortions in VLCs. We use a learning-based approach that treats the distortion in the VLC system as unnecessary redundant information, so the problem of signal compensation becomes a problem of removing redundant information. Figure 1 shows the principle of signal compensation.

Fig. 1. DWTCNN equalizer compensation principle.

The DWTCNN equalizer mainly performs non-linear compensation on the signal in three stages: discrete wavelet decomposition, coefficient processing, and discrete wavelet coefficient reconstruction.

To be specific, the convolution operation first turns the one-dimensional signal into a multi-channel representation, which is used to extract the shallow features of the signal and enrich the coefficient data after the wavelet transform. Then, the signal is decomposed into approximation coefficients and detail coefficients using the wavelet transform. As mentioned earlier, the approximation coefficients are the main component of the signal, while the detail coefficients contain much redundant information, including the linear and nonlinear distortion introduced during transmission. Our task is to use neural networks to learn suitable thresholds that map the useful information in the signal to arbitrary positive or negative features and map the redundant information to zero features. Therefore, we introduce an adaptive soft-threshold structure in the branch processing the detail coefficients; this trainable module automatically determines the threshold based on the signal features without requiring specialized knowledge of signal processing. At the same time, feature fusion is used in the branch processing the approximation coefficients. Finally, these coefficient signals are reconstructed to recover the original signal and achieve VLC system equalization.

Next, the principle of the DWTCNN equalizer is further explained with formulas. Considering that the original signal $Tx$ suffers nonlinear distortion caused by the intrinsic characteristics of the physical devices during transmission through the VLC system, the received signal $Rx$, denoted by $y(n)$, can be expressed as:

$$\textrm{y}(n) = F(x(n)) + noise(n)$$
where $F$ is the channel response, which includes the nonlinear distortion, $x(n)$ is the original signal, and $noise(n)$ denotes the noise interference generated in the system. The input time-domain distorted signal $Rx$, after being processed by the input layer, becomes a multi-channel $Rx^{\prime}$, and the discrete wavelet transform is then used to project it onto the wavelet domain:
$$\begin{array}{l} H = D(Rx^{\prime})\\ H: = ({H_{high}},{H_{low}}) \end{array}$$
where $D$ represents a series of transformation matrices that transform the signal variables into the corresponding wavelet coefficients, and $H$ represents the set of coefficients generated by a discrete wavelet transformation of the signal, including high-frequency (detail) coefficients ${H_{high}}$ and low-frequency (approximation) coefficients ${H_{low}}$. The decomposition uses the filter functions $h[n]$ and $g[n]$ to split the signal into ${H_{low}}$ and ${H_{high}}$. The outputs of the ${H_{high}}$ and ${H_{low}}$ branches, ${y_h}[k]$ and ${y_l}[k]$ respectively, are given as [15]:
$$\begin{array}{l} {y_h}[k] = \sum\limits_n {y[n] \cdot g[2k - n]} \\ {y_l}[k] = \sum\limits_n {y[n] \cdot h[2k - n]} \end{array}$$
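As an illustrative sketch of Eq. (3), the following NumPy snippet performs one level of filter-bank decomposition: convolve with the high-pass/low-pass filters, then keep every second sample. Haar (db1) filters are used here purely for readability; the paper's experiments favor higher-order Daubechies bases, and the filter values below are standard textbook coefficients, not taken from the paper.

```python
import numpy as np

# Haar (db1) analysis filters; the paper uses higher-order Daubechies wavelets.
h = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass (approximation) filter
g = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass (detail) filter

def dwt_level(y, h, g):
    """One analysis level per Eq. (3): y_l[k] = sum_n y[n] h[2k-n], likewise y_h."""
    full_l = np.convolve(y, h)   # full convolution ...
    full_h = np.convolve(y, g)
    return full_l[1::2], full_h[1::2]  # ... then downsample by 2

# Flat first half, oscillating second half.
signal = np.array([4.0, 4.0, 4.0, 4.0, 1.0, 5.0, 1.0, 5.0])
y_l, y_h = dwt_level(signal, h, g)
# The flat segment yields near-zero detail coefficients, while the oscillating
# segment concentrates its energy in the detail (high-frequency) band.
```

This makes the later argument concrete: smooth signal content lands in the approximation branch, while abrupt transients (and, in the VLC setting, much of the distortion) land in the detail branch.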

These coefficients are passed into the appropriate branch for processing $P(H)$:

$$H^{\prime} = P(H) = \left\{ {\begin{array}{c} {AST({H_{high}})}\\ {CF({H_{low}})} \end{array}} \right.$$
where $P$ represents the processing operations on the high-frequency and low-frequency coefficients [16]: adaptive soft-thresholding $AST$ for the high-frequency coefficients and coefficient fusion $CF$ for the low-frequency coefficients. Finally, the processed coefficients are reconstructed using the operation $R$, the inverse of $D$ [13]:
$$Y^{\prime} = R (H^{\prime})$$

The reconstructed signal $Y^{\prime}$ is further processed by the output layer to match the dimensions of the original signal, finally yielding the compensated signal $Y$.
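The decomposition-reconstruction pair $D$ and $R$ can be illustrated with a minimal Haar example: when the coefficients are left untouched, $R(D(x))$ returns the signal exactly, so any change in the reconstruction comes only from the thresholding applied between the two steps. This is a sketch under the Haar basis, not the paper's trained pipeline.

```python
import numpy as np

def haar_dwt(y):
    """One-level Haar analysis (the D operation) on an even-length signal."""
    y = np.asarray(y, dtype=float).reshape(-1, 2)
    a = (y[:, 0] + y[:, 1]) / np.sqrt(2.0)  # approximation coefficients
    d = (y[:, 0] - y[:, 1]) / np.sqrt(2.0)  # detail coefficients
    return a, d

def haar_idwt(a, d):
    """One-level Haar synthesis (the R operation), inverting haar_dwt."""
    even = (a + d) / np.sqrt(2.0)
    odd = (a - d) / np.sqrt(2.0)
    return np.stack([even, odd], axis=1).reshape(-1)

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])
a, d = haar_dwt(x)
x_rec = haar_idwt(a, d)  # perfect reconstruction when nothing is altered
# In the equalizer, the detail branch is soft-thresholded before R is applied,
# so the output differs from Rx by exactly the removed redundant content.
```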

2.1 DWTCNN architecture

The DWTCNN consists of four parts: the input subnet, shrinkage subnet, fusion subnet, and output subnet. The model structure is shown in Fig. 2. The input subnet converts the damaged signal into the input dimension required by the network and includes the discrete wavelet decomposition module (DWT), which decomposes the signal into different coefficients and forwards them to different subnets. The shrinkage subnet processes the detail coefficients and eliminates redundant information, and contains the key adaptive soft-thresholding structure. The fusion subnet processes the approximation coefficients; its purpose is to increase the weight of the approximation coefficients for subsequent coefficient fusion. The output subnet is mainly used for coefficient reconstruction and for matching the dimension of the original input signal.

Fig. 2. Structure of DWTCNN.

The input subnet mainly consists of two convolutional layers and a wavelet transform layer, shown in orange in Fig. 2. The convolutional layers increase the number of feature channels and extract shallow features of the signal, and the wavelet transform layer transforms the signal features into different scales. The processing of the input subnet corresponds to the mapping function $D$ in the previous section.

$${Y_{conv}} = \textrm{ReLU}[\omega _{input}^2 \otimes \textrm{ReLU}(\omega _{input}^1 \otimes Rx) + b_{input}^1] + b_{input}^2$$
where $\omega _{input}^j$ and $b_{input}^j (j = 1,2)$ denote the weight matrix and bias vector of the convolution at layer $j$ in the input subnet, and the activation function is the ReLU function. The first convolutional layer, with a kernel size of 3, extracts signal features, while the second, with a kernel size of 1, increases the number of feature channels. After the convolutional layers, the result is forwarded to the wavelet transform layer in a forward pass.
$$[{{Y_{ac}},{Y_{dc}}} ]= {F_{\textrm{DWT}}}({Y_{conv}})$$
where ${F_{\textrm{DWT}}}$ denotes the discrete wavelet transform, ${Y_{conv}}$ is the output of Eq. (6), and ${Y_{ac}}$ and ${Y_{dc}}$ represent the approximation and detail components, respectively. In the discrete wavelet transform, the wavelet basis functions determine the coefficients obtained by decomposition; their selection is discussed in the optimization experiments section of this paper.
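The two input-subnet convolutions of Eq. (6) can be sketched as follows. The weights are random placeholders rather than trained values, and only 4 feature channels are used instead of the 64 reported later in the paper; the point is the shape transformation, not the learned mapping.

```python
import numpy as np

rng = np.random.default_rng(2)
relu = lambda v: np.maximum(v, 0.0)

rx = rng.standard_normal(32)                 # received 1-D signal Rx

# First conv: kernel size 3, extracts shallow features from the 1-D signal.
w1 = rng.standard_normal(3) * 0.5
b1 = 0.0
feat = relu(np.convolve(rx, w1, mode="same") + b1)

# Second conv: kernel size 1, expands the single channel to 4 channels
# (a 1x1 convolution is just per-sample channel mixing).
w2 = rng.standard_normal((4, 1)) * 0.5
b2 = np.zeros((4, 1))
y_conv = relu(w2 @ feat[None, :] + b2)       # multi-channel output, shape (4, 32)
```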

After the signal is decomposed, different coefficients enter different subnets for further processing. The processing of the approximate components is relatively simple. The fusion subnet mainly includes three convolutions, using feature fusion and residual connections. The purpose of feature fusion is to use the previously generated feature maps as input during convolution to preserve the most important information in the signal features. Residual connections are used to speed up training and improve network accuracy. Assuming that the input to the fusion subnet is ${x_{ac}}$, the output after feature fusion is given by:

$${Y_{concat}} = Concat[\textrm{ReLU}(\omega _{Fusion}^2 \otimes \textrm{ReLU(}\omega _{Fusion}^1 \otimes {x_{ac}}\textrm{)}),\textrm{ReLU(}\omega _{Fusion}^1 \otimes {x_{ac}}\textrm{)}]$$
where $\omega _{Fusion}^j$ denotes the weight matrix of the $j$-th convolutional layer in the fusion subnet; each convolution output also has a corresponding bias term, omitted from the formula for brevity. $Concat({x_1},{x_2})$ represents the concatenation of ${x_1}$ and ${x_2}$ along the feature-channel dimension: if the input sizes of ${x_1}$ and ${x_2}$ are both $B \times H$, the concatenated result is $2B \times H$. The output of the fusion subnet can then be expressed as:
$${Y_{Fusion}} = \textrm{ReLU(}\omega _{Fusion}^3 \otimes {Y_{concat}}\textrm{)} + {x_{ac}}$$
where ${Y_{Fusion}}$ represents the final output of the fusion subnet and ${Y_{concat}}$ is the output of Eq. (8). Combining Eq. (8) and Eq. (9) gives the entire process of the fusion subnet.
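A minimal sketch of Eqs. (8)-(9): two stacked convolutions, channel-wise concatenation of an intermediate feature map, a third convolution, and a residual connection back to the subnet input. For brevity, 1x1 convolutions (plain channel-mixing matrices) stand in for the paper's kernel-size-3 layers, and the weights are random placeholders, not trained ones.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H = 4, 16                        # feature channels, feature length
x_ac = rng.standard_normal((C, H))  # approximation coefficients entering subnet

relu = lambda v: np.maximum(v, 0.0)
W1 = rng.standard_normal((C, C)) * 0.1
W2 = rng.standard_normal((C, C)) * 0.1
W3 = rng.standard_normal((C, 2 * C)) * 0.1  # third conv sees 2C channels

f1 = relu(W1 @ x_ac)                                     # first conv output
Y_concat = np.concatenate([relu(W2 @ f1), f1], axis=0)   # Eq. (8): (2C, H)
Y_fusion = relu(W3 @ Y_concat) + x_ac                    # Eq. (9): residual add
```

The concatenation reuses the earlier feature map as input (feature fusion), and the `+ x_ac` term is the residual connection the text credits with faster training.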

The shrinkage subnet is used to process the detail coefficients and mainly includes an adaptive soft-thresholding structure, the details of which are discussed in the next section. The GAP block in Fig. 2 represents the global average pooling [17] applied before the fully connected layer to prevent overfitting.

After different components are processed in their corresponding subnets, coefficient reconstruction is needed to restore the original signal, denoted as ${F_{\textrm{IDWT}}}$. The formula is as follows:

$${Y_{idwt}} = {F_{\textrm{IDWT}}}([{x_{dc}},{x_{ac}}])$$
where ${x_{dc}}$ and ${x_{ac}}$ represent the detail and approximation coefficients processed by their corresponding subnets, and ${Y_{idwt}}$ is the signal vector reconstructed from the coefficients. To ensure that its dimension is consistent with that of the input signal, it needs to be further processed by the output subnet.
$$Y = \textrm{ReLU}({\omega _{output}} \otimes {Y_{idwt}}) + {b_{output}}$$
where ${\omega _{output}}$ and ${b_{output}}$ are the weight and bias parameters of the convolution in the output subnet, respectively, and $Y$ represents the signal compensated by the DWTCNN equalizer. The Adam optimizer, widely used in CNNs, is employed to minimize the loss between $Y$ and $Tx$ [18], thereby updating the network parameters, which can be expressed as:
$${\omega _t} \leftarrow {\omega _{t - 1}} - \eta \frac{{\partial [Loss({Y_{{\omega _{t - 1}}_{}}},Tx)]}}{{\partial {\omega _{t - 1}}}}$$
where $\omega$ denotes the updated parameters of the network, including the weight and bias parameters of the convolutional kernels, $\eta$ denotes the step size of the Adam optimizer, and $Loss$ represents the loss function used to calculate the loss value, which is discussed in detail below.

2.2 Adaptive soft threshold structure

The design of the adaptive soft-threshold structure comes from the wavelet denoising algorithm in signal processing. Generally, wavelet denoising includes three steps: wavelet decomposition, thresholding, and wavelet reconstruction. Thresholding is the process of selecting a threshold function to process the decomposed coefficients. Threshold functions can be divided into two types, hard thresholding and soft thresholding; the latter improves on the former by making the transition smoother, as shown in Fig. 3.

Fig. 3. Hard and soft threshold functions.

The key to thresholding is to design a filter that can transform useful information into large positive or negative features, and transform redundant noise information into features close to zero. However, designing such a filter requires a lot of expertise in signal processing. Deep learning provides a new approach to solving this problem. Deep learning does not rely on expert-designed filters, but instead uses gradient descent algorithms to automatically learn filters. Therefore, the combination of soft thresholding and deep learning is an effective method for distinguishing and mitigating redundant information. The formula for soft thresholding is as follows:

$$y = \left\{ {\begin{array}{cc} {x - \tau }&{x > \tau }\\ 0&{ - \tau \le x \le \tau }\\ {x + \tau }&{x < - \tau } \end{array}} \right.$$
where $x$ denotes the input feature, $y$ denotes the output feature, and $\tau > 0$ is the threshold. Soft thresholding sets features within the threshold to zero and retains the useful positive and negative features whose magnitudes exceed it. It can be observed that the derivative of the output with respect to the input is either 1 or 0, which effectively prevents the gradient vanishing and exploding problems.
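The soft-thresholding rule of Eq. (13) can be written compactly in the equivalent closed form sign(x)·max(|x| − τ, 0), which this small NumPy function implements:

```python
import numpy as np

def soft_threshold(x, tau):
    """Eq. (13): shrink features toward zero by tau, zeroing [-tau, tau]."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([-2.0, -0.3, 0.0, 0.4, 1.5])
y = soft_threshold(x, 0.5)   # -> [-1.5, 0.0, 0.0, 0.0, 1.0]
```

Features inside [−τ, τ] are zeroed (redundant information removed), and the surviving features are shifted toward zero by τ, which is what makes the transition smoother than hard thresholding.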

In classical signal denoising algorithms, threshold selection is a difficult problem, and the optimal threshold generally depends on the characteristics of the system and the signal. This paper improves this structure and embeds it into the shrinkage subnet of the DWTCNN so that the threshold is automatically determined from the visible light signal, avoiding the inconvenience of manual tuning. The adaptive soft-threshold structure operates in the network as follows:

Firstly, the absolute value of the input feature ${x_{dc}}$ is taken to ensure that the threshold value is positive, preventing the output feature from being all zeros after soft thresholding. Global average pooling, indicated in Fig. 2, is then used to simplify the vector, and the output after pooling is represented as:

$${Y_{pool}} = avgpool(|{{x_{dc}}} |) = avgpool(|{{x_1}} |,|{{x_2}} |,\ldots ,|{{x_C}} |)$$
where ${x_i}$ represents the feature at channel $i$ of the $C$ channels, $avgpool$ represents the global average pooling operation, and the output ${Y_{pool}}$ is a one-dimensional vector. This output is then propagated to a two-layer fully connected network, where the number of neurons in the second layer must be consistent with the number of channels of the input feature. The output of the fully connected network is scaled to the range of 0 to 1 using the sigmoid function:
$${\alpha _c} = \frac{1}{{1 + {e^{ - Y_{FC}^c}}}}$$
where $Y_{FC}^c$ represents the output of the $c$-th neuron in the fully connected network and ${\alpha _c}$ is the scaling factor corresponding to the $c$-th channel. By multiplying the feature vector of each channel of the input feature by its corresponding scaling factor, the threshold of that channel is obtained. The calculation formula for the threshold is therefore:
$${\tau _c} = {\alpha _c} \cdot x_{dc}^c$$
where $x_{dc}^c$ represents the feature vector of the $c$-th channel of the shrinkage subnet input ${x_{dc}}$, and ${\alpha _c}$ is the shrinkage coefficient corresponding to that channel. After the threshold vector is calculated, Eq. (13) can be used to perform soft thresholding on the feature vector of each channel.
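The whole adaptive soft-threshold branch, Eqs. (14)-(16), can be sketched as below. The FC weights are random placeholders (in the DWTCNN they are learned by backpropagation), and the per-channel threshold is formed here as $\alpha_c$ times the pooled $|x|$ of channel $c$, which is our reading of Eq. (16); since the sigmoid output lies in (0, 1) and the pooled magnitudes are positive, every threshold is guaranteed positive.

```python
import numpy as np

rng = np.random.default_rng(1)
C, H = 4, 32
x_dc = rng.standard_normal((C, H))          # detail coefficients

# Eq. (14): global average pooling of |x|, one value per channel.
y_pool = np.mean(np.abs(x_dc), axis=1)      # shape (C,)

# Two-layer FC network with placeholder (untrained) weights.
W1 = rng.standard_normal((C, C))
W2 = rng.standard_normal((C, C))
y_fc = W2 @ np.maximum(W1 @ y_pool, 0.0)

alpha = 1.0 / (1.0 + np.exp(-y_fc))         # Eq. (15): sigmoid, in (0, 1)
tau = alpha * y_pool                        # Eq. (16): per-channel thresholds

# Apply the soft-thresholding of Eq. (13) channel-wise with each channel's tau.
y_out = np.sign(x_dc) * np.maximum(np.abs(x_dc) - tau[:, None], 0.0)
```

Because the threshold is derived from the signal's own channel statistics, no manually chosen denoising threshold is needed, which is the point of the structure.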

3. Experimental setup

Figure 4 shows the experimental setup using the DWTCNN equalizer in the PAM-8 VLC system. The experimental setup in this paper consists of three parts: the visible light signal transmitter, the free-space transmission channel, and the visible light signal receiver.

Fig. 4. Experimental setup.

Fig. 5. Wavelet transform spectrum comparison of (a) Tx (b) Rx (c) wavelet-decomposed approximation (d) detail signal.

At the transmitter end, the original binary data is converted into PAM-8 symbols through PAM-8 encoding and then up-sampled and up-converted before being loaded into an arbitrary waveform generator to generate analog signals. The electrical signal is pre-equalized by a basic hardware equalizer and amplified by an electrical amplifier (EA); to reach the turn-on current threshold of the LED, the amplified signal passes through a DC bias circuit before being loaded onto the blue LED light source to complete the electro-optical conversion. In this experiment, the distance of the free-space transmission channel is 1.2 m.

At the visible light signal receiver end, the transmitted light signal is first converged by a lens to reduce signal loss caused by light scattering. The signal is detected using a PIN photodiode and undergoes optoelectronic conversion. The converted electrical signal is then amplified using an amplifier, and the signal is sampled using a digital oscilloscope. The data is transferred to a computer for offline processing through an external IO port. The offline data processing includes data synchronization, down-sampling and down-conversion. The proposed DWTCNN is used as the post-equalizer to compensate for the signal. To recover the original transmitted signal, the equalized signal is decoded, and finally, the system's performance is measured by calculating the BER.

4. Experimental results

The experiments were divided into three parts. We first conduct validation experiments on the proposed DWTCNN equalizer, mainly to demonstrate its effectiveness by adding and removing the wavelet decomposition and adaptive soft-threshold substructures. Then, we tune several key parameters of the DWTCNN network to obtain optimal performance, such as the size and number of convolution kernels, the loss function, and the number of iterations. Finally, we compare the performance with classical linear and nonlinear equalization algorithms, such as the blind equalization algorithm CMA and the Volterra algorithm, as well as neural-network-based equalization algorithms proposed in previous works, and analyze the performance improvement brought by the proposed model.

4.1 Verification experiments

We experimentally verify that embedding the DWT and AST structures within the model can effectively reduce the BER of the signal. We use Daubechies wavelet basis functions to perform the wavelet decomposition. Figure 5(a)-(d) show Tx, Rx, and the wavelet-decomposed approximation and detail signals in the time domain, respectively; the corresponding frequency spectra are shown on the right. In (c), when filters are used to cut off the high-frequency and low-frequency components of the signal at different scales, we observed that the detail signal contains more short-term transients than the approximation signal; these constitute most of the redundant and noise information in the signal. By separating this part of the information, we can significantly improve system performance.

To analyze the equalization performance of the proposed DWTCNN, we conducted several ablation studies by adding or removing the corresponding substructures. The results show that both the embedded wavelet transform structure and the adaptive soft-thresholding structure significantly improve equalization performance. In particular, after adding the wavelet transform substructure, the BER of the equalization results was significantly reduced, as shown in Fig. 6(a). Under the experimental condition of 40 mA, the reduction reached its maximum, and at 120 mA the network model with the wavelet decomposition substructure was able to keep the BER under the 7% hard-decision forward error correction (HD-FEC) limit of 3.8 × 10−3, while the model without it had an error rate above this benchmark, indicating an inability to recover the original transmitted data. The presence or absence of the adaptive soft-thresholding substructure also had a significant impact, as shown in Fig. 6(b). In particular, at 20 mA the BER after signal compensation without the soft-thresholding substructure was still higher than the HD-FEC threshold. With all other conditions identical, the network model with the soft-thresholding substructure achieved a significantly lower BER than the model without it. The soft-thresholding substructure can automatically determine the threshold from the characteristics of the input signal, effectively eliminating redundant information.

Fig. 6. Ablation experiments to verify the effectiveness of (a) wavelet transform (b) adaptive soft threshold structure.

4.2 Optimization experiments

In addition to the design of the network structure, the hyperparameter settings can also have a significant impact on the compensation effect of the DWTCNN equalizer. To avoid overfitting or underfitting caused by inappropriate hyperparameter settings and to maximize the equalization performance of the equalizer, this section selects the hyperparameters of the network through experiments. It was found that selecting MSE as the loss function, setting the step size of the Adam optimizer to 0.0005 [18], setting the convolution kernel size to 3 and the number of convolution kernels to 64, using Daubechies wavelet basis functions, and incorporating residual connections and skip connections are good choices for the model.

The proposed DWTCNN equalizer is designed based on convolutional neural networks. The number of convolution kernels is denoted Kn: more kernels allow more global features of the signal to be captured, but a Kn that is too large increases the computational complexity and can degrade model performance, while a Kn that is too small results in insufficient feature extraction. The size of the convolution kernel is denoted Ks, which determines the sampling range of the signal. To obtain good parameter values for the DWTCNN network, the following experiment was conducted: the same group of received signals was selected, the kernel size Ks was varied over [1, 3, 5], and different combinations were formed with the number of kernels Kn over [16, 32, 64, 128]. The experimental results in Fig. 7(a) indicate that the BER of the group with a kernel size of 3 is better than that of the other two groups, as shown by the green line. The results also indicate that when Ks and Kn are both small, the network performance becomes unpredictable: the two points corresponding to Kn = 16 with Ks values of 3 and 5 deviate from the expected outcome. At the same time, 64 is the inflection point; if the number of convolution kernels is increased further to 128, the BER of the groups with Ks = 3 and Ks = 5 increases instead. Therefore, setting the convolution kernel size to 3 and the number of convolution kernels to 64 achieves the best equalization performance for the DWTCNN equalizer.

Fig. 7. (a) Optimization experiment for convolution kernel size and number (b) The variation curves of loss value with iteration number for the DWTCNN network using different error functions.

Neural networks evaluate the training error of the model through the loss function and adjust the model parameters by minimizing this error to improve performance. We verified the impact of the classic regression loss functions, mean square error (MSE) and mean absolute error (MAE), which are widely used for regression and classification problems [19,20], on the experimental results. The experiment fixed the working current at 100 mA and the working voltage at 0.8 V, and observed the change in the model's loss value over the iterations (epochs). The results, shown in Fig. 7(b), indicate that using MSE as the loss function yields lower loss values than using MAE, and that MSE leads to faster convergence for the DWTCNN equalizer.
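The difference between the two losses can be seen on a toy residual vector (the PAM levels below are illustrative, not the experiment's): MSE squares the errors, so the symbols farthest from their targets, i.e. the strongly distorted ones, dominate the gradient, while MAE weights all residuals equally.

```python
import numpy as np

def mse(y_pred, y_true):
    """Mean square error: penalizes large residuals quadratically."""
    return np.mean((y_pred - y_true) ** 2)

def mae(y_pred, y_true):
    """Mean absolute error: penalizes all residuals linearly."""
    return np.mean(np.abs(y_pred - y_true))

tx = np.array([1.0, 3.0, 5.0, 7.0])  # ideal symbol levels (illustrative)
rx = np.array([1.1, 2.9, 5.0, 9.0])  # one badly distorted symbol (error 2.0)
# The single outlier contributes 4.0 to the MSE sum but only 2.0 to the MAE
# sum, so MSE training pushes hardest on exactly the worst symbols.
```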

The number of iterations is one of the key factors affecting the effectiveness of the neural network model. If the number of iterations is too small, the network may not fully learn the signal patterns and the model may fail to converge. If it is too large, the neural network will overfit, its generalization performance will be greatly reduced, and the equalizer may not effectively compensate other signal samples. In the experiment, different iteration numbers (epochs) were set for training and testing on the training signal samples, and Fig. 8 shows the relationship between the network's loss value and the iteration number, together with the constellation diagrams of the signal samples at specific iteration numbers. Figure 8(a) indicates that with 75 iterations, the boundaries in the corresponding constellation diagram are still stuck together, and the system performance needs further improvement. Figure 8(b) shows that with 150 iterations, the boundaries of the constellation diagram are relatively clear and the signal quality is good. Figure 8(c) shows that when the iteration number increases to 250, the constellation diagram is similarly well-defined. Since the loss value in Fig. 8 reaches a plateau when the epoch count approaches 150, the optimal number of training iterations for the DWTCNN equalizer in the PAM-8 VLC system is 150, which minimizes overfitting while ensuring equalization performance.

 figure: Fig. 8.

Fig. 8. DWTCNN network iteration number tuning experiment.

As mentioned earlier, the wavelet transform is crucial in determining the subsequent approximation and detail components, so the choice of wavelet basis function for decomposition is also essential. Here, we compare commonly used discrete wavelet bases: Coiflets, Biorthogonal, Daubechies, and Symlets. The experiment was conducted with a working voltage of 0.8 V and a transmission distance of 1.2 m between the transmitter and the receiver. For each wavelet basis function, data were collected for offline processing while the transmitter current was adjusted from 5 mA to 160 mA. The results are shown in Fig. 9: for the DWTCNN equalizer, the Daubechies wavelet is the better choice. Specifically, decomposing the signal with the Biorthogonal or Symlets wavelets cannot reduce the BER of the damaged signal below the HD-FEC threshold beyond 120 mA. Although both the Coiflets and Daubechies wavelets reduce the BER below the HD-FEC threshold at 120 mA, the overall BER performance of the Daubechies wavelet is slightly better.
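A single decomposition level with a Daubechies basis can be sketched directly from the analysis equations y_l[k] = Σ_n y[n] h[2k−n] and y_h[k] = Σ_n y[n] g[2k−n]. The db2 filter coefficients below are the standard ones; the test signal is illustrative (in the equalizer itself the decomposition sits inside the network).

```python
import numpy as np

# Daubechies-2 (db2) analysis filters. The low-pass coefficients are the
# standard (1+√3, 3+√3, 3−√3, 1−√3)/(4√2); the high-pass filter is its
# alternating-sign, time-reversed quadrature mirror counterpart.
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))  # low-pass
g = h[::-1] * np.array([1.0, -1.0, 1.0, -1.0])                     # high-pass

def dwt1(y):
    """One DWT level: filter, then downsample by 2, yielding the
    approximation (low-pass) and detail (high-pass) coefficients."""
    approx = np.convolve(y, h)[1::2]
    detail = np.convolve(y, g)[1::2]
    return approx, detail

# Illustrative received signal: slow trend plus high-frequency noise.
rng = np.random.default_rng(1)
rx = np.sin(2 * np.pi * 0.02 * np.arange(64)) + 0.1 * rng.standard_normal(64)
approx, detail = dwt1(rx)
# The approximation keeps the smooth trend; the detail concentrates the
# noise, which the adaptive soft threshold then suppresses.
```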

Fig. 9. Effect of different wavelet basis functions on DWTCNN equalizer performance.

In this section, the main focus was parameter tuning of the DWTCNN network to obtain an equalizer with improved equalization performance. The specific configuration of the DWTCNN network structure and its main parameters is listed in Table 1.

Table 1. DWTCNN partial parameter settings

4.3 Comparison experiments

To further verify the equalization performance of the model, this section presents comparative experiments using the key parameters obtained from the optimization experiments, verifying the equalization effect of the DWTCNN equalizer through controlled-variable analysis. The experiments were divided into two parts. First, the proposed DWTCNN equalizer was compared with traditional equalization algorithms such as CMA and Volterra. Second, it was compared with recent deep-learning-based equalizers, such as those based on LSTM, CNN, and EXNN. The results show that the DWTCNN equalizer proposed in this paper has better equalization performance.

To compare the compensating effect of the DWTCNN network with classical linear and nonlinear equalization algorithms on the damaged signal, experiments were designed to compare the DWTCNN equalizer with the CMA linear equalizer and the Volterra nonlinear equalizer. The BER of the signals after each equalization algorithm was calculated, and the performance under different working conditions is shown in Fig. 10. Figure 10(a) shows the variation of the system BER with the DC bias current for the different equalization algorithms when other conditions are the same. Eight control groups were set up, with the current ranging from 5 mA to 140 mA. The results show that when the current was between 20 mA and 80 mA and the nonlinearity in the signal was small, the BER after equalization with any of the three algorithms could be kept below the HD-FEC threshold, meeting basic communication requirements. However, when the current exceeded 100 mA, the nonlinearity in the signal exceeded the compensation capability of the CMA and Volterra algorithms, while the DWTCNN could still effectively equalize the damaged signal. Comparative analysis shows that the DWTCNN equalizer compensated the damaged signal better, reducing the BER by 28% to 69% across the different currents.
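For reference, a minimal real-valued CMA update of the kind used as the linear baseline can be sketched as follows; the tap count, step size, and channel model here are illustrative assumptions, not the experimental settings.

```python
import numpy as np

def cma_equalize(x, n_taps=11, mu=1e-6, R=None):
    """Real-valued constant modulus algorithm (CMA) sketch.
    R is the dispersion constant E[a^4]/E[a^2] of the target alphabet;
    mu must be small enough for the stochastic update to stay stable."""
    if R is None:
        a = np.array([-7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0])  # PAM-8
        R = np.mean(a ** 4) / np.mean(a ** 2)  # = 37 for these levels
    w = np.zeros(n_taps)
    w[n_taps // 2] = 1.0                      # center-spike initialization
    out = np.zeros(len(x))
    for k in range(n_taps, len(x)):
        u = x[k - n_taps:k][::-1]             # tap-delay line
        y = w @ u
        out[k] = y
        w -= mu * (y * y - R) * y * u         # gradient step on (y^2 - R)^2
    return out, w

# Illustrative use on PAM-8 symbols through a mild linear ISI channel.
rng = np.random.default_rng(0)
sym = rng.choice([-7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0], size=4000)
rx = np.convolve(sym, [1.0, 0.1])[:4000]
eq, taps = cma_equalize(rx)
```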

Fig. 10. Comparison of BERs of DWTCNN, Volterra and CMA algorithms in terms of (a) current and (b) voltage.

Figure 10(b) shows the variation of the BER with the input signal peak-to-peak voltage (VPP) for each equalization algorithm under otherwise identical experimental conditions. The results indicate that when the voltage is between 0.4 V and 0.6 V, all three equalization algorithms keep the BER below the HD-FEC threshold. At 0.8 V, the nonlinearity in the signal gradually increases; beyond a certain point the traditional equalizers can no longer compensate, and the BER rises above the HD-FEC threshold, whereas the DWTCNN equalizer is still able to compensate for the nonlinear distortion. When the voltage exceeds 1 V, the system nonlinearity becomes intractable and none of the three algorithms can effectively compensate for the signal distortion. Nevertheless, the equalization ability of the DWTCNN remains greater than that of the two traditional algorithms. Comparative analysis shows that the DWTCNN equalizer compensates for signal distortion better, reducing the BER by 10% to 75% across the different voltages.

In addition, the current range over which the DWTCNN equalizer keeps the BER below the threshold is 10 mA to 120 mA, about 33.3% wider than that of the traditional algorithms, and the corresponding voltage range is 0.35 V to 0.9 V, about 28.4% wider.

The DWTCNN equalizer not only extends the working ranges of the driving voltage and the DC bias but also enhances the system's overall working distance. Under a signal transmission bandwidth of 600 MHz, a driving voltage of 0.8 V, and a bias current of 100 mA, the system's transmission distance was set to 1 m, 1.2 m, and 1.4 m, and different equalizers were used to measure the system's Q factor. The experimental results, shown in Fig. 11, indicate that as the visible light transmission distance increases, the nonlinear impairment in the system becomes more severe. With the proposed DWTCNN equalizer, the system's overall Q factor was higher than with the traditional Volterra equalizer; specifically, at a transmission distance of 1 m, the Q factor with the DWTCNN equalizer was 1.5 dB higher than with the Volterra nonlinear equalizer. The transmission distance of the VLC system using the DWTCNN equalizer was also increased by 0.1 m compared to the Volterra equalizer.
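Under a Gaussian-noise assumption, the Q factor reported here relates to the BER through the Gaussian tail function, BER = 1 − Φ(Q), which can be inverted with the standard library (a sketch of the conversion, not the authors' measurement procedure):

```python
from math import log10
from statistics import NormalDist

def q_factor_db(ber):
    """Q factor in dB from BER under a Gaussian-noise assumption:
    BER = 1 - Phi(Q)  =>  Q = -Phi^{-1}(BER),  Q_dB = 20*log10(Q)."""
    q = -NormalDist().inv_cdf(ber)
    return 20.0 * log10(q)

# The 7% HD-FEC limit of 3.8e-3 corresponds to Q ≈ 2.67, i.e. ≈ 8.5 dB.
print(round(q_factor_db(3.8e-3), 2))
```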

Fig. 11. The variation curves of Q-factor with transmission distance for DWTCNN equalizer and Volterra algorithm.

In addition to the comparisons with traditional signal processing algorithms, this paper also performed comparative experiments with deep-learning-based equalization algorithms, namely the LSTM and EXNN equalizers, which are recognized for their superior equalization of both linear and nonlinear impairments. The experiments were conducted under the same current conditions and the BER was compared. As shown in Fig. 12, over the entire current range, the DWTCNN equalizer proposed in this paper achieved the best equalization effect with the lowest BER. The LSTM and EXNN equalizers each have their advantages and disadvantages: the LSTM network equalizes better below 50 mA, while the EXNN equalizer performs better above 50 mA.

Fig. 12. Comparison of the performance of DWTCNN, LSTM and EXNN equalizer.

That is, in the linear range the LSTM equalizes better, while in the nonlinear range the EXNN equalizes better. The reason is that as the working current increases, both linear and nonlinear distortion in the system increase: the LSTM-based equalizer has a degree of memory, while the EXNN equalizer's structure includes dilated convolution, allowing more thorough signal feature extraction. Under the premise that the BER stays below the threshold, the DWTCNN equalizer extends the system's working current range by 20 mA and 5 mA compared with the LSTM and EXNN equalizers, respectively.

5. Conclusions

This article proposes a discrete wavelet transform-assisted convolutional neural network (DWTCNN) equalizer for PAM-8 VLC systems, which combines deep learning with the wavelet transform from traditional signal processing to compensate for both linear and nonlinear distortions in VLC systems. In our experiments, we first validate the effectiveness of DWTCNN in compensating distorted VLC signals. We then verify the indispensability of the wavelet transform and adaptive soft-thresholding substructures by observing the system performance before and after adding or removing them. To obtain the optimal performance of DWTCNN, we conduct parameter tuning experiments, mainly adjusting the convolution kernel size and number. We compare the signal compensation performance of the DWTCNN, CMA, and Volterra equalizers under different working conditions. The experimental results show that, regardless of the working conditions, the compensation performance of DWTCNN is superior to the other two: compared with the CMA and Volterra equalizers, the operating range of DWTCNN is increased by 33.3% and 31.6% in current and by 28.4% and 26.1% in voltage, respectively. Compared with the LSTM and EXNN equalizers, the DWTCNN equalizer improves the Q factor by 0.76 and 0.53 dB and increases the operating range of the DC bias by 4.76% and 23.5%, respectively.

Funding

National Natural Science Foundation of China (62201112, 6227074).

Acknowledgements

A portion of this work was performed in the Key Laboratory for Information Science of Electromagnetic Waves, Fudan University.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this study may be obtained from the authors on reasonable request.

References

1. L. E. M. Matheus, A. B. Vieira, L. F. M. Vieira, et al., “Visible Light Communication: Concepts, Applications and Challenges,” IEEE Commun. Surv. Tutorials 21(4), 3204–3237 (2019). [CrossRef]  

2. N. Chi, Y. Zhou, Y. Wei, et al., “Visible light communication in 6G: Advances, challenges, and prospects,” IEEE Veh. Technol. Mag. 15(4), 93–102 (2020). [CrossRef]  

3. E. C. Strinati, S. Barbarossa, J. L. Gonzalez-Jimenez, et al., “6G: The next frontier: From holographic messaging to artificial intelligence using subterahertz and visible light communication,” IEEE Veh. Technol. Mag. 14(3), 42–50 (2019). [CrossRef]  

4. A. Kumar and D. N. K. Jayakody, “Secure NOMA-assisted multi-LED underwater visible light communication,” IEEE Trans. Veh. Technol. 71(7), 7769–7779 (2022). [CrossRef]  

5. S. Jain, R. Mitra, and V. Bhatia, “On BER analysis of nonlinear VLC systems under ambient light and imperfect/outdated CSI,” OSA Continuum 3(11), 3125–3140 (2020). [CrossRef]  

6. G. Stepniak, J. Siuzdak, and P. Zwierko, “Compensation of a VLC phosphorescent white LED nonlinearity by means of Volterra DFE,” IEEE Photonics Technol. Lett. 25(16), 1597–1600 (2013). [CrossRef]  

7. S. Rajbhandari, H. Chun, G. Faulkner, et al., “Neural network-based joint spatial and temporal equalization for MIMO-VLC system,” IEEE Photonics Technol. Lett. 31(11), 821–824 (2019). [CrossRef]  

8. N. Chi, Y. Zhao, M. Shi, et al., “Gaussian kernel-aided deep neural network equalizer utilized in underwater PAM8 visible light communication system,” Opt. Express 26(20), 26700–26712 (2018). [CrossRef]  

9. L. Chuan, C. Qing, and L. Xianxu, “Uplink NOMA signal transmission with convolutional neural networks approach,” J. of Syst. Eng. Electron. 31(5), 890–898 (2020). [CrossRef]  

10. B. Lin, Q. Lai, Z. Ghassemlooy, et al., “A machine learning based signal demodulator in NOMA-VLC,” J. Lightwave Technol. 39(10), 3081–3087 (2021). [CrossRef]  

11. X. Lu, C. Lu, W. Yu, et al., “Memory-controlled deep LSTM neural network post-equalizer used in high-speed PAM VLC system,” Opt. Express 27(5), 7822–7833 (2019). [CrossRef]  

12. X. Lu, Y. Li, J. Chen, et al., “Utilizing deep neural networks to extract non-linearity as entities in PAM visible light communication with noise,” Opt. Express 30(15), 26701–26715 (2022). [CrossRef]  

13. S. Rajbhandari, Z. Ghassemlooy, and M. Angelova, “Effective denoising and adaptive equalization of indoor optical wireless channel with artificial light using the discrete wavelet transform and artificial neural network,” J. Lightwave Technol. 27(20), 4493–4500 (2009). [CrossRef]  

14. H. Lu, Y. Hong, L.-K. Chen, et al., “On the study of the relation between linear/nonlinear PAPR reduction and transmission performance for OFDM-based VLC systems,” Opt. Express 26(11), 13891–13901 (2018). [CrossRef]  

15. C. Burrus, Introduction to Wavelets and Wavelet Transforms: A Primer (Prentice-Hall, 1997).

16. Ç. P. Dautov and M. S. Özerdem, “Wavelet transform and signal denoising using Wavelet method,” in 2018 26th Signal Processing and Communications Applications Conference (SIU) (IEEE, 2018), pp. 1–4.

17. M. Lin, Q. Chen, and S. Yan, “Network in network,” in International Conference on Learning Representations (ICLR) (2014), pp. 1–10.

18. D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA (2015), pp. 1–15.

19. Z. Li, F. Liu, W. Yang, et al., “A survey of convolutional neural networks: analysis, applications, and prospects,” IEEE Trans. Neural Netw. Learning Syst. 33(12), 6999–7019 (2022). [CrossRef]  

20. P. Domingos, “A few useful things to know about machine learning,” Commun. ACM 55(10), 78–87 (2012). [CrossRef]  


Equations (16)

$$y(n) = F(x(n)) + \mathrm{noise}(n)$$

$$H = D(R_x), \quad H := (H_{\mathrm{high}}, H_{\mathrm{low}})$$

$$y_h[k] = \sum_n y[n]\, g[2k-n], \qquad y_l[k] = \sum_n y[n]\, h[2k-n]$$

$$H = P(H) = \begin{cases} AST(H_{\mathrm{high}}) \\ CF(H_{\mathrm{low}}) \end{cases}$$

$$Y = R(H)$$

$$Y_{\mathrm{conv}} = \mathrm{ReLU}\big[\omega_{\mathrm{input2}}\, \mathrm{ReLU}(\omega_{\mathrm{input1}} R_x) + b_{\mathrm{input1}}\big] + b_{\mathrm{input2}}$$

$$[Y_{ac}, Y_{dc}] = F_{\mathrm{DWT}}(Y_{\mathrm{conv}})$$

$$Y_{\mathrm{concat}} = \mathrm{Concat}\big[\mathrm{ReLU}(\omega_{\mathrm{Fusion2}}\, \mathrm{ReLU}(\omega_{\mathrm{Fusion1}} x_{ac})),\ \mathrm{ReLU}(\omega_{\mathrm{Fusion1}} x_{ac})\big]$$

$$Y_{\mathrm{Fusion}} = \mathrm{ReLU}(\omega_{\mathrm{Fusion3}} Y_{\mathrm{concat}}) + x_{ac}$$

$$Y_{\mathrm{idwt}} = F_{\mathrm{IDWT}}([x_{dc}, x_{ac}])$$

$$Y = \mathrm{ReLU}(\omega_{\mathrm{output}} Y_{\mathrm{idwt}}) + b_{\mathrm{output}}$$

$$\omega_t \leftarrow \omega_{t-1} - \eta\, \frac{\partial\, \mathrm{Loss}(Y_{\omega_{t-1}}, T_x)}{\partial\, \omega_{t-1}}$$

$$y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}$$

$$Y_{\mathrm{pool}} = \mathrm{avgpool}(|x_{dc}|) = \mathrm{avgpool}(|x_1|, |x_2|, \ldots, |x_C|)$$

$$\alpha_c = \frac{1}{1 + e^{-Y_{FC}^{c}}}$$

$$\tau_c = \alpha_c \cdot x_{dc}^{c}$$
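The soft-threshold function and the channel-wise adaptive threshold in the last equations above can be illustrated numerically as follows; the fully connected output `fc_out` here is a stand-in for the network's learned scaling branch, and the coefficient values are illustrative.

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft-threshold function: shrink |x| by tau and zero out
    everything inside [-tau, tau]."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def adaptive_threshold(x, fc_out):
    """Channel-wise adaptive threshold: a sigmoid-scaled fraction
    alpha in (0, 1) of the average absolute coefficient value."""
    alpha = 1.0 / (1.0 + np.exp(-fc_out))   # alpha_c = sigmoid(Y_FC^c)
    return alpha * np.mean(np.abs(x))       # tau_c = alpha_c * avg|x_c|

coeffs = np.array([-2.0, -0.3, 0.1, 0.4, 3.0])
tau = adaptive_threshold(coeffs, fc_out=0.0)   # alpha = 0.5, tau ≈ 0.58
denoised = soft_threshold(coeffs, tau)
# Small coefficients (likely noise) go to zero; large ones shrink by tau.
```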