
Vehicle identification using deep learning for expressway monitoring based on ultra-weak FBG arrays

Open Access

Abstract

A deep learning scheme with knowledge distillation for lateral lane-level vehicle identification based on ultra-weak fiber Bragg grating (UWFBG) arrays is proposed. Firstly, the UWFBG arrays are laid underground in each expressway lane to obtain the vibration signals of vehicles. Then, three types of vehicle vibration signals (the vibration signal of a single vehicle, the accompanying vibration signal, and the vibration signal of laterally adjacent vehicles) are separately extracted by density-based spatial clustering of applications with noise (DBSCAN) to produce a sample library. Finally, a teacher model is designed with a residual neural network (ResNet) connected to a long short-term memory (LSTM) network, and a student model consisting of only one LSTM layer is trained by knowledge distillation (KD) to satisfy real-time monitoring with high accuracy. Experimental demonstration verifies that the average identification rate of the student model with KD is 95% with good real-time capability. Comparison tests with other models show that the proposed scheme delivers solid performance in the integrated evaluation for vehicle identification.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Real-time and accurate traffic flow data are essential for expressway traffic management and intelligent services. In recent years, expressway vehicle monitoring methods have developed rapidly. However, the commonly used methods, such as cameras, radar, and GPS, are susceptible to weather and other environmental influences, making it difficult to achieve continuous long-distance monitoring. The ultra-weak fiber Bragg grating (UWFBG) has the advantages of high spatial resolution [1], fast sensing response [2], and dynamic measurement [3], which enable long-distance, real-time vehicle monitoring.

Studies have already applied fiber-optic sensors to expressway traffic flow monitoring, and the results are closely related to how the transmission cable is laid. Wang et al. [4] used distributed acoustic sensing (DAS) on the roadside of the expressway, relying on only a single transmission cable to monitor vehicles. However, this approach could not achieve "lane-level" monitoring, the vibration signal generated by a vehicle farther away from the cable was weaker, and it could not identify "laterally adjacent vehicles". Yuksel et al. [5] extracted the vehicle vibration signal through a transmission optical cable laid perpendicular to the road direction, a method that could detect the vibration signal of the vehicle wheel axle. Nevertheless, that research mainly focused on measuring vehicle speed and wheelbase and could not accurately identify vehicles.

In addition to how the transmission fiber-optic cable is laid, another problem is the analysis of vibration signals. Traditional analysis includes time-domain, frequency-domain, and time-frequency methods [6,7]. In recent years, machine learning has been widely applied to vibration signals. For example, Xin et al. [8] used local characteristic-scale decomposition (LCD) to extract vibration signal features and fed them to a support vector machine (SVM), which achieved good results in subway perimeter monitoring. Van et al. [9] combined empirical mode decomposition (EMD) and SVM for bearing fault diagnosis. Celler et al. [10] estimated blood pressure using a Gaussian mixture model and a hidden Markov model. Dong et al. [11] used the wavelet transform to extract time-frequency features as the input of a random forest, which performed well in bearing fault diagnosis.

Traditional machine learning must first extract features from complex vibration signals, and the quality of feature extraction directly affects the accuracy of the recognition model. In addition, vibration signals are strongly nonlinear and non-stationary, with considerable noise and extensive crosstalk between vibration sources. Especially in a complex environment such as an expressway, different vehicles interfere with one another, which makes manual feature extraction very difficult. On the other hand, feature extraction also incurs a time overhead, so complex signal processing algorithms cannot be used in scenarios with real-time requirements such as expressways. For example, Qi et al. [12] used variational mode decomposition (VMD) for laser radar signal denoising, but as the number of modal components increases, the per-run processing time of the algorithm also grows.

Compared with the above methods, deep learning, as a branch of machine learning, has powerful automatic feature extraction capability and mature inference acceleration methods, and is widely used in computer vision and natural language processing. In vibration signal analysis, Liu et al. [13] proposed an adaptive convolutional neural network for identifying perimeter intrusion. Shrestha et al. [14] used convolutional neural networks for bridge structure monitoring. Deeper neural networks can extract richer features from vibration signals, but deeper networks suffer from the vanishing gradient problem. To solve this, Peng et al. [15] introduced residual learning into a one-dimensional convolutional neural network, applied to high-speed train fault diagnosis. In addition, LSTM, as a recurrent neural network, performs well in time series processing and can capture nonlinear dynamic changes in time series; Ma et al. [16] used LSTM for transformer fault prediction, and Chen et al. [17] collected vibration signals from spacecraft as input to a Bi-LSTM to predict the spacecraft's lifetime.

Since complex deep learning ensemble models have a large number of redundant parameters and incur a huge time overhead, it is challenging for them to meet real-time-demanding application scenarios like an expressway. Knowledge distillation can meet the need for lightweight deployment [18]: a robust teacher model generates soft labels for a structurally simple student model to learn from, so that the performance of the student model approximates that of the teacher model. Deng et al. [19] used multiple teacher models to generate soft labels for student models to learn, which led to good student performance. Li et al. [20] used knowledge distillation for bearing fault diagnosis and achieved good results in practical applications.

In this paper, a UWFBG-based lateral lane-level monitoring scheme for expressway vehicles is proposed for high accuracy, high performance, and lightweight deployment. UWFBG arrays are laid underground in each expressway lane to collect vibration signals. Because vibrations propagate through the ground, the signals detected by the UWFBG arrays become more complex as the number of vehicles on the expressway increases, which makes it challenging to recognize whether there is an actual vehicle in the lane. A sliding window extracts the vehicle vibration signals, which are then filtered by DBSCAN to create a sample library. Given the way the UWFBG arrays are laid under the pavement, the vehicle vibration signals are classified into three types to identify the actual vehicle in the lane: the vibration signal of a single vehicle, the accompanying vibration signal in the lane adjacent to the vehicle, and the vibration signals of laterally adjacent vehicles. Then, using knowledge distillation, a powerful teacher model is designed, consisting of a deep residual network connected to an LSTM, while the student model consists of only one LSTM layer. The student model approximates the performance of the teacher model by learning the soft labels generated by the teacher model. Experiments show that, compared with other advanced methods, the deep learning model extracts vibration signal features better, and the student model improves its accuracy through knowledge distillation. The method thus balances real-time performance and accuracy.

The rest of the paper is organized as follows: Section 2 briefly introduces the experimental system with UWFBG arrays. Section 3 describes the proposed identification scheme using deep learning. Section 4 presents the experimental results and comparison analysis. Finally, Section 5 concludes the paper.

2. Experimental system based on UWFBG arrays

The sensing array is made with an on-line writing system, in which UV light from a 248 nm excimer laser irradiates the stretched fiber with very low transmission loss. A large-scale ultra-weak FBG array consisting of thousands of identical-wavelength FBGs with low reflectivity is fabricated; the UWFBG array has no fusion loss and high mechanical stability. A narrow linewidth laser (NLL) generates a continuous-wave light source with a central wavelength of 1550 nm, which is modulated into a 1 kHz light pulse by a semiconductor optical amplifier (SOA) and then enters an erbium-doped fiber amplifier (EDFA). The pulsed light is subsequently guided into the UWFBG array by a circulator. A 3×3 coupler phase demodulation unit based on a Michelson interferometer recovers the vibration signal waveform by demodulating the phase change caused by the optical path length change between two adjacent sensors, and optical time domain reflectometry (OTDR) is used to locate the vibration [8].

As shown in Fig. 1, four fiber-optic cables are buried in parallel 30 cm underground in four lanes with a grating spacing of 5 m. 100 valid FBGs in each cable are used in this experimental system. The sensing network transmits the vibration signal to the demodulator with a sampling frequency of 1 kHz.

Fig. 1. Underground laying of the UWFBG arrays.

The vehicle vibration signals are classified into three types to identify the vehicle in the lane: the vibration signal of a single vehicle, the accompanying vibration signal in the lane adjacent to the vehicle, and the vibration signals of laterally adjacent vehicles. They are referred to as "single vehicle", "accompanying vibration", and "laterally adjacent vehicles", respectively. All three signal types can be obtained from the UWFBG arrays of each lane. A "single vehicle" or "laterally adjacent vehicles" result is derived from the UWFBG array in the lane where the vehicle is detected. However, an "accompanying vibration" result comes from a lane adjacent to the vehicle in which there is no vehicle, and is easily mistaken for the other two types of vehicle vibration.

As seen in Fig. 2, a single yellow car is traveling on UWFBG I, so the vibration signal of UWFBG I is called "single vehicle". On the other hand, the vibration signal of UWFBG II is called "accompanying vibration": it looks like a traveling vehicle (the black-dashed car), but there is no vehicle in Lane II. Two blue cars drive closely on UWFBG arrays II and III, respectively; their vibration signals are called "laterally adjacent vehicles". Correspondingly, the vibration signals on UWFBG arrays I and IV are "accompanying vibration".

Fig. 2. Three types of vehicle vibration signals from UWFBG arrays.

A truck (Dongfeng Rio Tinto) and a car (Geely Emgrand) are used in the experiment, with each driver following his own driving habits. The field test interval is 500 m, the time for each trip of a single vehicle is about 20 s, and the vehicle speed is at least 70 km/h. In addition, for the close-distance driving condition of the two vehicles, the speed is kept at around 40 km/h to ensure safety, so each of these trips takes about 43 s.

Figure 3 shows the truck driving on UWFBG III. It generates two types of vibration signals, "single vehicle" and "accompanying vibration", as shown in Fig. 4. Figure 4(c) represents the single-vehicle vibration signal from the lane where the vehicle is driving, while Fig. 4(a), (b), and (d) show the accompanying vibration signals coupled into the adjacent lanes.

Fig. 3. Truck driving on the UWFBG III.

Fig. 4. Vibration signals of one truck from UWFBG arrays. (a) Accompanying vibration signal. (b) Accompanying vibration signal. (c) Vibration signal of the single vehicle. (d) Accompanying vibration signal.

Two laterally adjacent vehicles are driving on UWFBG II and III, respectively, in Fig. 5. Correspondingly, the waterfall plots of the four UWFBG arrays are shown in Fig. 6. Figure 6(a) and (d) show the accompanying vibration signals, while Fig. 6(b) and (c) represent the vibration signals of the laterally adjacent vehicles.

Fig. 5. Two laterally adjacent vehicles in different lanes (UWFBG II and III).

Fig. 6. Vibration signals of the two laterally adjacent vehicles. (a) Accompanying vibration signal. (b) Vibration signal of laterally adjacent vehicles. (c) Vibration signal of laterally adjacent vehicles. (d) Accompanying vibration signal.

Figure 7 shows the waveforms and the fast Fourier transform (FFT) of three typical vehicle vibration signals from Fig. 4 and Fig. 6. As can be seen from Fig. 7, the three types of signals show strong similarity in both the time and frequency domains, especially between the accompanying vibration signal and the vibration signal of laterally adjacent vehicles, whose signal features are very close.

Fig. 7. Three types of typical vibration signal. (a) Vibration signal of single vehicle. (b) Vibration signal of laterally adjacent vehicles. (c) Accompanying vibration signal.

3. Principle of identification scheme using deep learning

The proposed framework, shown in Fig. 8, is divided into three parts: signal preprocessing, data training, and testing. Since a large amount of noise exists on the expressway, the vibration signal extracted by a sliding window is first preprocessed with a basic threshold on the time-domain amplitude. The vehicle signals are further cleaned by DBSCAN trajectory clustering to produce a sample library. Then a teacher model consisting of a ResNet connected to an LSTM and a student model consisting of only one LSTM layer are trained with knowledge distillation. The final trained student model with knowledge distillation (student + KD) is tested in practical applications.

Fig. 8. Flowchart of the proposed vehicle identification framework.

3.1 Sliding window

The vibration signals are obtained through a sliding window, as shown in Fig. 9. The distance of each slide is less than the length of the window, producing an overlapping part between samples, which allows the model to learn more features from them; the sliding window can also refer to historical data when analyzing the current grating array vibration signal. Assume that the vibration signal of a single sensor over a certain length of time is $x\left[ n \right]$, where n denotes the length of the sequence. Once the window length ${L_{window}}$ and the window offset ${L_{offset}}$ are determined, the overlap ${L_{overlap}}$ and the number of samples N are also determined, and their relationship is expressed as

$${L_{overlap}} = {L_{window}} - {L_{offset}}, $$
$$N = \frac{{n - {L_{overlap}}}}{{{L_{offset}}}}$$

Fig. 9. Vibration signal acquisition by the sliding window.

The sampling frequency of the demodulator used in this system is 1 kHz, which means 1000 samples are obtained per second from one FBG. Although the sliding window is supposed to cover the whole vibration wave as a vehicle passes the FBG, the duration depends on the vehicle's speed. Therefore, according to the average speed of vehicles on the expressway, the sliding window parameters are set as follows: ${L_{window}}$ is set to 1500 (1.5 seconds), and ${L_{offset}}$ is set to 500 (0.5 seconds). If the parameter values are too large, real-time capability suffers, while computing consumption directly increases if they are too small. The input of the sliding window is thus the real-time vibration data stream, and every 1500 vibration samples are selected for vehicle identification.
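A minimal sketch of the segmentation described by Eqs. (1) and (2); the helper name and the plain-list input are illustrative, not the authors' implementation.

```python
# Sliding-window segmentation following Eqs. (1)-(2):
# L_overlap = L_window - L_offset, N = (n - L_overlap) / L_offset.

def sliding_windows(x, L_window, L_offset):
    """Split sequence x into N overlapping windows of length L_window."""
    L_overlap = L_window - L_offset          # Eq. (1)
    N = (len(x) - L_overlap) // L_offset     # Eq. (2)
    return [x[i * L_offset : i * L_offset + L_window] for i in range(N)]

# With the parameters of Section 3.1 (1 kHz sampling, L_window = 1500
# samples = 1.5 s, L_offset = 500 samples = 0.5 s), a 10 s trace of
# 10000 samples yields N = (10000 - 1000) / 500 = 18 windows.
windows = sliding_windows(list(range(10000)), L_window=1500, L_offset=500)
```

Each consecutive window shares 1000 samples with its predecessor, which is the overlap that lets the model see each vibration event more than once.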

3.2 Deep learning models

3.2.1 Student model

The proposed vehicle identification scheme adopts a low-inference-cost network consisting of only one LSTM layer with 64 hidden units as the student model (LSTM). Conventional recurrent neural networks (RNN) suffer from gradient disappearance on long time sequences. LSTM [21] overcomes this problem by introducing forget gates, input gates, and output gates to retain the key information. A typical LSTM structure is shown in Fig. 10, where ${f_g}$, ${f_i}$ and ${f_o}$ are Sigmoid functions, and ${f_c}$ and ${f_h}$ are hyperbolic tangent functions. The forget gate ${F_t}$ helps the current memory unit ${C_t}$ discard unimportant information from the previous memory unit ${C_{t - 1}}$, and the input gate ${I_t}$ helps retain key information from the previous state ${h_{t - 1}}$ and the new input ${X_t}$. The candidate information ${\bar{C}_t}$ is determined by ${X_t}$ and ${h_{t - 1}}$, and the current memory unit ${C_t}$ is generated from ${C_{t - 1}}$ and ${\bar{C}_t}$. Finally, the current state ${h_t}$ is updated by the output gate ${Y_t}$ generated from ${X_t}$ and ${h_{t - 1}}$.

Fig. 10. Typical LSTM structure.
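The gate updates above can be sketched as one step of a standard LSTM cell in NumPy. The weight shapes, names, and random initialization are illustrative assumptions, not the paper's trained parameters; only the gate arithmetic follows the description.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One standard LSTM step: sigmoid gates, tanh candidate/output."""
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])      # forget gate F_t
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])      # input gate I_t
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])      # output gate Y_t
    c_bar = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # candidate C̄_t
    c_t = f * c_prev + i * c_bar        # current memory unit C_t
    h_t = o * np.tanh(c_t)              # current state h_t
    return h_t, c_t

rng = np.random.default_rng(0)
d_in, d_hid = 8, 64                     # 64 hidden units, as in the student model
W = {k: rng.normal(size=(d_hid, d_in)) * 0.1 for k in "fioc"}
U = {k: rng.normal(size=(d_hid, d_hid)) * 0.1 for k in "fioc"}
b = {k: np.zeros(d_hid) for k in "fioc"}
h, c = np.zeros(d_hid), np.zeros(d_hid)
h, c = lstm_step(rng.normal(size=d_in), h, c, W, U, b)
```

Because the output gate and tanh both saturate in (-1, 1), the hidden state stays bounded no matter how long the sequence, which is what prevents the gradients from exploding or vanishing outright.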

3.2.2 Teacher model

Although deep neural networks can fit complex nonlinear features, the convolution operation may lose important features of the original vibration signal as the number of layers increases, so the information in the feature vector also decreases, eventually causing the vanishing gradient problem. The deep residual network solves this problem with shortcut connections. As shown in Fig. 11, the residual block treats the stacked layers as a mapping $F(x )$, where x is the input to these stacked layers. The output of the stacked layers is then re-represented as $F(x )+ x$, and the output $H(x )$ of the block is obtained after the rectified linear unit (ReLU) activation function.

Fig. 11. Residual block.
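The residual mapping $H(x) = \mathrm{ReLU}(F(x) + x)$ can be sketched with a toy two-layer $F$; the layer sizes and random weights are assumptions for illustration only.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """H(x) = ReLU(F(x) + x), with F as two toy linear layers."""
    Fx = W2 @ relu(W1 @ x)   # stacked layers F(x)
    return relu(Fx + x)      # shortcut connection, then ReLU

rng = np.random.default_rng(1)
d = 16
W1 = rng.normal(size=(d, d)) * 0.1
W2 = rng.normal(size=(d, d)) * 0.1
x = rng.normal(size=d)
y = residual_block(x, W1, W2)
```

The design point: if the stacked layers learn nothing (F(x) = 0), the block degenerates to the identity on non-negative inputs, so adding more blocks can never make the network worse, and gradients flow through the shortcut unattenuated.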

Considering the strong feature extraction capability of ResNet and the capability of LSTM to capture nonlinear dynamic changes in time series, the teacher model (ResNet-LSTM) is a heavy network in which ResNet50 is connected to an LSTM through an average pooling layer and a Flatten layer, as shown in Fig. 12. The deep feature vector from the average pooling layer of ResNet50 is used as the input to the LSTM.

Fig. 12. Identification process of the proposed teacher model (ResNet-LSTM).

3.2.3 Knowledge distillation

Knowledge distillation is one of the most popular methods for lightweight deployment. The soft labels generated by the teacher model usually carry richer feature information, and the student model approximates the teacher model by learning these soft labels. Furthermore, since the student model has a simpler structure and fewer parameters, it can usually meet the requirements of lightweight deployment.

For a set of logits $\mathbf{z} = \{{{z_1},{z_2}, \ldots ,{z_n}} \}$, where n denotes the number of categories, the similarity relationships among the predicted values can be amplified by the temperature T. This process is called distillation and can be expressed as

$${q_i} = \frac{{\exp ({z_i}/T)}}{{\sum\nolimits_j {\exp ({z_j}/T)} }}. $$
where ${q_i}$ denotes the softened prediction for ${z_i}$ after distillation at temperature T. When T is large enough, the predicted values of different categories tend to be equal, whereas if T is too small, the distribution approaches One-Hot coding.
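The temperature effect in Eq. (3) can be demonstrated numerically; the logits below are hypothetical, and T = 2 is the value used later in Section 3.3.

```python
import numpy as np

def soften(z, T):
    """Eq. (3): q_i = exp(z_i / T) / sum_j exp(z_j / T)."""
    e = np.exp((z - z.max()) / T)   # max-shift for numerical stability
    return e / e.sum()

z = np.array([4.0, 1.0, 0.5])       # hypothetical teacher logits
q1 = soften(z, T=1.0)               # ordinary softmax
q2 = soften(z, T=2.0)               # softened distribution
# Raising T flattens the distribution: the winning class keeps the
# largest probability, but the relative similarity of the other
# classes becomes visible to the student ("dark knowledge").
```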

The total loss of knowledge distillation includes the Kullback-Leibler (KL) divergence between the softened distributions of the teacher model and the student model, and the cross-entropy (CE) loss between the true labels and the probabilistic distribution of the student model, given by

$$Loss = \alpha {T^2}{L_{KL}}({q_T}||{q_S}) + (1 - \alpha ){L_{CE}}. $$
where $\alpha $ is a weighting factor, ${q_T}$ and ${q_S}$ denote the probabilistic distribution of the teacher model and the student model, respectively. ${L_{KL}}$ and ${L_{CE}}$ are the KL divergences and CE loss, respectively.
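Equation (4) can be sketched as follows, with the paper's settings T = 2 and α = 0.9; all probability vectors here are hypothetical stand-ins for model outputs.

```python
import numpy as np

def kl_div(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive p, q."""
    return float(np.sum(p * np.log(p / q)))

def cross_entropy(y_onehot, p):
    return float(-np.sum(y_onehot * np.log(p)))

def kd_loss(q_teacher, q_student_T, p_student, y_onehot, T=2.0, alpha=0.9):
    """Eq. (4): alpha * T^2 * KL(q_T || q_S) + (1 - alpha) * CE.
    q_teacher, q_student_T are temperature-softened distributions;
    p_student is the student's ordinary (T = 1) softmax output."""
    return alpha * T**2 * kl_div(q_teacher, q_student_T) \
        + (1 - alpha) * cross_entropy(y_onehot, p_student)

q_T = np.array([0.7, 0.2, 0.1])      # teacher's softened prediction
q_S = np.array([0.6, 0.25, 0.15])    # student's softened prediction
p_S = np.array([0.8, 0.15, 0.05])    # student's hard (T = 1) prediction
y = np.array([1.0, 0.0, 0.0])        # true label, e.g. "single vehicle"
loss = kd_loss(q_T, q_S, p_S, y)
```

The T² factor compensates for the 1/T² scaling of the soft-label gradients, so the two loss terms stay comparable as T changes.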

3.3 Training process of vehicle identification model

The proposed training process of deep learning with knowledge distillation is shown in Fig. 13. The FFT of the vehicle vibration signals from the UWFBG arrays is used as input for training the teacher model and the student model. The parameters of the teacher model are listed in Table 1. The teacher model and the student model are trained by knowledge distillation, where T is 2 and $\alpha $ is 0.9.

Fig. 13. The proposed training process of deep learning with knowledge distillation.

Table 1. Parameters of the proposed teacher model

The training parameters of the teacher model, student model, and student model with KD are shown in Table 2. All models are trained with the Adam optimizer at a learning rate of 0.001 without a custom learning rate schedule. Except under knowledge distillation, the loss function is categorical cross-entropy; when the student model is trained with knowledge distillation, its loss function is the distillation loss.

Table 2. Training parameters of the proposed deep learning model with knowledge distillation

4. Experimental results

4.1 Sample library of vehicle vibration signals

Data cleaning is an essential procedure for deep learning, so the DBSCAN clustering algorithm is employed in this work to extract vehicle vibration samples free of unknown noise. DBSCAN is well known for its fast clustering and for not requiring the number of clusters in advance. Its role here is to produce a higher-quality sample library of vehicle vibration signals: the input is a collection of extraction points of vehicle vibration signals, the output is a purer collection of such points, and the vibration signals corresponding to these points are selected as the sample library. DBSCAN has only two parameters, the minimum sample number and the neighborhood distance, whose values are generally set manually. The extraction time must be normalized against the sensor number before clustering; after clustering, the sampling time is inverse-normalized to obtain the sampling moment required for each sensor.

The process of DBSCAN is as follows. First, an unvisited extraction point of a vehicle vibration signal is randomly selected. Next, it is checked whether the number of extraction points within the circle centered at that point with the distance as radius is greater than or equal to the minimum sample number. If yes, the point is a core point; if not, it is a noise point or an edge point: a point lying within the circle of another core point is an edge point, otherwise it is a noise point. These steps are repeated until all extraction points are visited. Finally, the noise points are removed, core points separated by at most the distance are clustered into one category, and the edge points are assigned to the category of the core points associated with them.
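The procedure above can be sketched as a compact toy DBSCAN; in practice one would use `sklearn.cluster.DBSCAN`, and the (sensor, time) points below are fabricated stand-ins for the normalized extraction points.

```python
import math

def dbscan(points, eps, min_samples):
    """Toy DBSCAN: -1 marks noise, non-negative labels mark clusters."""
    n = len(points)
    dist = lambda a, b: math.dist(points[a], points[b])
    neighbors = [[j for j in range(n) if dist(i, j) <= eps] for i in range(n)]
    core = [len(nb) >= min_samples for nb in neighbors]   # core-point test
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        stack, labels[i] = [i], cluster   # grow a new cluster from core i
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster   # edge or core point joins cluster
                    if core[q]:
                        stack.append(q)   # only core points keep expanding
        cluster += 1
    return labels

# Two dense trajectories plus one stray (noise) point.
pts = [(0.0, 0.1 * k) for k in range(10)] + \
      [(1.0, 0.1 * k) for k in range(10)] + [(5.0, 5.0)]
labels = dbscan(pts, eps=0.15, min_samples=3)
```

The stray point never reaches the minimum sample number and lies outside every core point's circle, so it stays labeled as noise and would be dropped from the sample library.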

One trip of the vehicle is shown in Fig. 14, where the minimum sample number is set to 60 and the distance is set to 0.1. The horizontal axis of the extraction points is the location area of the vehicle vibration signal, and the vertical axis is the extraction time. The refined extraction points in Fig. 14(b) are selected as the sample library. The numbers of samples extracted for the accompanying vibration signal, the vibration signal of a single vehicle, and the vibration signals of laterally adjacent vehicles are 400, 460, and 390, respectively, for a total of 1250 samples in the built sample library.

Fig. 14. (a) Results of preliminary extraction. (b) Further extraction results using DBSCAN.

4.2 Characteristic visualization of vehicle vibration signals

One of the advantages of a deep learning model is end-to-end classification, accomplished through autonomous feature learning; therefore, no manual feature extraction of vehicle vibration signals is performed in the proposed method. To illustrate the automatic feature extraction of the deep learning with knowledge distillation model, a visual analysis is performed by projecting the deep feature vectors output from the Flatten layer into two dimensions using t-distributed stochastic neighbor embedding (t-SNE) [22]; the output shape of the Flatten layer is given in Table 1.

t-SNE is a well-known feature visualization method, not a classification method. As an unsupervised dimensionality reduction method, it can demonstrate the effect of feature extraction by different methods: the better the feature extraction, the more separated the projection points of different classes and the more concentrated the projection points of the same class. t-SNE projects the feature vectors into two dimensions for visualization, although the classifiers themselves may use more than two features.
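The projection used for Fig. 15 can be sketched with scikit-learn's t-SNE; the feature vectors below are random stand-ins for the Flatten-layer output, not the paper's data.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(2)
# 30 hypothetical 64-dimensional feature vectors from three classes,
# standing in for the deep features of the three vibration types.
features = np.vstack([rng.normal(loc=m, size=(10, 64)) for m in (0.0, 3.0, 6.0)])

# Embed into two dimensions for visual inspection only; perplexity
# must be smaller than the number of samples.
embedding = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(features)
```

Well-extracted features would appear as three compact, well-separated clouds in the 2-D embedding, which is exactly the qualitative criterion applied to Fig. 15.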

For comparison with handcrafted-feature methods on the three types of vehicle vibration signals, the original vibration signals are decomposed into multiple modal components by VMD [12], and their multiscale permutation entropy (MPE) [8] is calculated as feature vectors (VMD-MPE); a time-frequency method is also selected [23]. The visualization of the different methods is shown in Fig. 15, including the handcrafted-feature methods: the original signal in Fig. 15(a), the time-frequency representation in (b), and VMD-MPE in (c). In contrast, the teacher model (d), the student model (e), and the proposed method with knowledge distillation (student + KD) (f) show better-separated clusters than the manual methods. Furthermore, the visualization result of the proposed method in Fig. 15(f) lies between that of the teacher model (less overlapping area) in Fig. 15(d) and the student model (more overlapping area) in Fig. 15(e), because the student model with knowledge distillation not only retains the low inference cost of the student model but also achieves higher accuracy by learning the knowledge of the teacher.

Fig. 15. Visualization of different methods with t-SNE. (a) Original signal. (b) Time-Frequency. (c) VMD-MPE. (d) The teacher model. (e) The student model. (f) The proposed method with knowledge distillation.

4.3 Model evaluation

4.3.1 Evaluation on the classification performance

The training and validation losses of the different models are shown in Fig. 16. The loss of the teacher model is reduced the most, and its convergence is fast. The student model converges faster when trained through knowledge distillation than without it, and its loss decreases further. The hardware used in the experimental test is as follows: an AMD Ryzen 5 5600U processor with 16 GB of memory and no GPU.

Fig. 16. The training and validation losses of different models. (a) The teacher model. (b) The student model. (c) The student model with KD.

To evaluate the performance of the proposed method, Deep Forest [24] is employed as a baseline in two variants, denoted DF1 (with VMD-MPE features) and DF2 (with time-frequency features), and CNN-LSTM with the FFT of the original signal as input is used as a classical deep learning comparison. Table 3 lists the identification rates of these methods; the teacher model has the highest correct rate and DF2 the worst, which indicates that a method using only time-frequency features cannot effectively distinguish the three types of vehicle vibration signals.

Table 3. Identification rates of different methods of three types of vehicle signals

The ${F_1}$ score is a commonly used metric for model performance evaluation, combining precision P and recall R. It is defined as

$${F_1} = \frac{{2PR}}{{P + R}}. $$
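A quick numerical sketch of Eq. (5) from raw counts; the counts are hypothetical, not the paper's results.

```python
# F1 from precision and recall, per Eq. (5): F1 = 2PR / (P + R).

def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)   # P: fraction of positive calls that are right
    recall = tp / (tp + fn)      # R: fraction of true positives recovered
    return 2 * precision * recall / (precision + recall)

# e.g. 90 true positives, 10 false positives, 10 false negatives:
# P = R = 0.9, so F1 = 0.9.
score = f1_score(90, 10, 10)
```

As the harmonic mean of P and R, the F1 score punishes a model that trades one for the other, which is why it is used here instead of raw accuracy.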

Figure 17 shows the ${F_1}$ score and the running time of the different methods. The teacher model has the highest ${F_1}$ score and DF2 the worst; the student model has a lower ${F_1}$ score than CNN-LSTM before knowledge distillation but a higher one after. Regarding running time, the teacher model takes the longest and cannot be applied in scenarios with real-time requirements, while DF2 is the fastest because of the algorithm's simplicity; DF1 needs more time due to the higher computational requirement of VMD. After knowledge distillation, the student model balances time efficiency and accuracy.

Fig. 17. Performance comparison of different methods.

Since a CPU-based multi-threaded parallel computing mode is adopted in this work, the accumulation of real-time data in the sliding window and the model inference run as two independent threads. The inference time of the proposed method is only 0.11 s (110 ms), which is less than the sliding period of the window (0.5 s). Hence, the sliding period is the maximum delay of real-time data processing, namely 0.5 s (500 sampling points).
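The two-thread layout described above can be sketched with Python's standard `threading` and `queue` modules; the window sizes follow Section 3.1, while the stream source and the stand-in "inference" are assumptions for illustration.

```python
import threading
import queue

def producer(stream, q, L_window=1500, L_offset=500):
    """Accumulate the real-time stream and hand full windows to inference."""
    buf = []
    for sample in stream:
        buf.append(sample)
        if len(buf) >= L_window:
            q.put(buf[:L_window])      # full 1.5 s window ready
            buf = buf[L_offset:]       # slide forward by 0.5 s of samples
    q.put(None)                        # sentinel: stream finished

results = []

def consumer(q):
    """Independent inference thread; len() stands in for the model call."""
    while (window := q.get()) is not None:
        results.append(len(window))

q = queue.Queue()
t1 = threading.Thread(target=producer, args=(range(3000), q))
t2 = threading.Thread(target=consumer, args=(q,))
t1.start(); t2.start()
t1.join(); t2.join()
```

Because the queue decouples the two threads, inference latency below the 0.5 s sliding period never stalls data accumulation, matching the delay bound stated above.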

4.3.2 Experiments on the robustness of the knowledge distillation

Deep learning models may be affected by the quality of the sample as well as the number of samples. Since DBSCAN affects the proposed model's performance by affecting the sample quality, Fig. 18 shows the experimental results of training the model with different sample libraries generated by different DBSCAN parameter values. The minimum sample numbers range from 10 to 80 with a step size of 10, and the distance ranges from 0.1 to 0.9 with a step size of 0.1. It can be seen from the experimental results that for the samples generated by different parameter combinations of DBSCAN, the ${F_1}$ score of the teacher model is above 0.9, demonstrating strong robustness. Furthermore, after the knowledge distillation, the overall ${F_1}$ score of the student model has also been improved. In general, the difference between the highest ${F_1}$ score and the lowest ${F_1}$ score of these models is about 0.1, and the combinations of DBSCAN parameters do not significantly impact the model's performance.

Fig. 18. The effect of parameter values of DBSCAN on the performance of different models.

In order to verify the robustness of the proposed method trained by knowledge distillation under imbalanced and scarce samples, the training set is reduced so that the classes become unbalanced. As shown in Table 4, the ${F_1}$ score of each model decreases as the sample size is reduced; the teacher model degrades the least and the student model without KD the most, because the student model has the simplest structure. The improvement that KD brings to the student model is most pronounced at the smallest sample sizes, indicating that in practical applications, even with few samples, the proposed method can achieve good identification accuracy through knowledge distillation.
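The distillation objective used in these experiments, i.e. the softened softmax $q_i = \exp(z_i/T)/\sum_j \exp(z_j/T)$ combined as $Loss = \alpha T^2 L_{KL}(q^T\|q^S) + (1-\alpha)L_{CE}$, can be sketched in NumPy for a single sample; the temperature and $\alpha$ values below are illustrative, not the paper's training settings:

```python
import numpy as np

def softened_softmax(z, T):
    """q_i = exp(z_i / T) / sum_j exp(z_j / T), shifted for numerical stability."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def kd_loss(z_teacher, z_student, y_true, T=4.0, alpha=0.7):
    """Loss = alpha * T^2 * KL(q_T || q_S) + (1 - alpha) * CE(student, label)."""
    q_t = softened_softmax(z_teacher, T)
    q_s = softened_softmax(z_student, T)
    kl = float(np.sum(q_t * np.log(q_t / q_s)))   # soft-target (teacher) term
    p = softened_softmax(z_student, 1.0)          # hard-label term at T = 1
    ce = float(-np.log(p[y_true]))
    return alpha * T**2 * kl + (1 - alpha) * ce
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the hard-label cross-entropy remains; any disagreement with the soft targets increases the loss.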

Table 4. ${{\boldsymbol F}_1}$ scores for different methods

5. Conclusion

Based on the significant advantages of UWFBG arrays, a UWFBG-based deep learning method is proposed for lateral lane-level monitoring of expressways. UWFBG arrays are buried underground in each lane of the expressway to obtain vehicle vibration signals, which are classified into three types because of the difficulty of identifying two laterally adjacent vehicles. The signals are extracted and further filtered using DBSCAN. A powerful teacher model and a simple student model are constructed and trained with knowledge distillation, using the FFT of the vehicle vibration signal as input, to improve both accuracy and real-time performance. t-SNE is employed to visualize the automatic feature extraction capability of deep learning in comparison with methods using manual feature extraction. The experiments show that the proposed method has significant advantages over traditional methods in both model accuracy and time efficiency, and can effectively identify vehicle vibration signals to meet the fast-response requirements of expressway monitoring with regular traffic flow.

Funding

National Key Research and Development Program of China (2021YFB3202901).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. X. Gui, Z. Li, F. Wang, Y. Wang, C. Wang, S. Zeng, and H. Yu, “Distributed sensing technology of high-spatial resolution based on dense ultra-short FBG array with large multiplexing capacity,” Opt. Express 25(23), 28112–28122 (2017). [CrossRef]  

2. C. Wang, Y. Shang, X. Liu, C. Wang, H. Yu, D. Jiang, and G. Peng, “Distributed OTDR-interferometric sensing network with identical ultra-weak fiber Bragg gratings,” Opt. Express 23(22), 29038–29046 (2015). [CrossRef]  

3. Q. Nan, S. Li, Y. Yao, Z. Li, H. Wang, L. Wang, and L. Sun, “A Novel Monitoring Approach for Train Tracking and Incursion Detection in Underground Structures Based on Ultra-Weak FBG Sensing Array,” Sensors 19(12), 2666 (2019). [CrossRef]  

4. M. Wang, L. Deng, Y. Zhong, J. Zhang, and F. Peng, “Rapid Response DAS Denoising Method Based on Deep Learning,” J. Lightwave Technol. 39(8), 2583–2593 (2021). [CrossRef]  

5. K. Yuksel, D. Kinet, K. Chah, and C. Caucheteur, “Implementation of a Mobile Platform Based on Fiber Bragg Grating Sensors for Automotive Traffic Monitoring,” Sensors 20(6), 1567 (2020). [CrossRef]  

6. D. Li, W. Wang, and F. Ismail, “An Enhanced Bispectrum Technique With Auxiliary Frequency Injection for Induction Motor Health Condition Monitoring,” IEEE Trans. Instrum. Meas. 64(10), 2679–2687 (2015). [CrossRef]  

7. R. Yan, R. Gao, and X. Chen, “Wavelets for fault diagnosis of rotary machines: A review with applications,” Signal Process. 96, 1–15 (2014). [CrossRef]  

8. L. Xin, Z. Li, X. Gui, X. Fu, M. Fan, J. Wang, and H. Wang, “Surface intrusion event identification for subway tunnels using ultra-weak FBG array based fiber sensing,” Opt. Express 28(5), 6794–6805 (2020). [CrossRef]  

9. M. Van and H. Kang, “Bearing Defect Classification Based on Individual Wavelet Local Fisher Discriminant Analysis with Particle Swarm Optimization,” IEEE Trans. Ind. Inf. 12(1), 124–135 (2016). [CrossRef]  

10. B. Celler, P. Le, A. Argha, and E. Ambikairajah, “GMM-HMM-Based Blood Pressure Estimation Using Time-Domain Features,” IEEE Trans. Instrum. Meas. 69(6), 3631–3641 (2020). [CrossRef]  

11. X. Dong, G. Li, Y. Jia, and K. Xu, “Multiscale feature extraction from the perspective of graph for hob fault diagnosis using spectral graph wavelet transform combined with improved random forest,” Measurement 176, 109178 (2021). [CrossRef]  

12. B. Qi, G. Yang, D. Guo, and C. Wang, “EMD and VMD-GWO parallel optimization algorithm to overcome Lidar ranging limitations,” Opt. Express 29(2), 2855–2873 (2021). [CrossRef]  

13. F. Liu, S. Li, Z. Yu, X. Ju, H. Wang, and Q. Qi, “Adaptive Intrusion Recognition for Ultraweak FBG Signals of Perimeter Monitoring Based on Convolutional Neural Networks,” in 25th International Conference on Neural Information Processing (ICONIP), Lecture Notes in Computer Science (Springer International Publishing Ag, 2018), pp. 359–369.

14. A. Shrestha and J. Dang, “Deep Learning-Based Real-Time Auto Classification of Smartphone Measured Bridge Vibration Data,” Sensors 20(9), 2710 (2020). [CrossRef]  

15. D. Peng, Z. Liu, H. Wang, Y. Qin, and L. Jia, “A Novel Deeper One-Dimensional CNN With Residual Learning for Fault Diagnosis of Wheelset Bearings in High-Speed Trains,” IEEE Access 7, 10278–10293 (2019). [CrossRef]  

16. X. Ma, H. Hu, and Y. Shang, “A New Method for Transformer Fault Prediction Based on Multifeature Enhancement and Refined Long Short-Term Memory,” IEEE Trans. Instrum. Meas. 70, 1–11 (2021). [CrossRef]  

17. C. Chen, N. Lu, B. Jiang, Y. Xing, and Z. Zhu, “Prediction Interval Estimation of Aeroengine Remaining Useful Life Based on Bidirectional Long Short-Term Memory Network,” IEEE Trans. Instrum. Meas. 70, 1–13 (2021). [CrossRef]  

18. G. Hinton, O. Vinyals, and J. Dean, “Distilling the Knowledge in a Neural Network,” arXiv, arXiv:1503.02531 (2015).

19. J. Deng, W. Jiang, Y. Zhang, G. Wang, S. Li, and H. Fang, “HS-KDNet: A Lightweight Network Based on Hierarchical-Split Block and Knowledge Distillation for Fault Diagnosis With Extremely Imbalanced Data,” IEEE Trans. Instrum. Meas. 70, 1–9 (2021). [CrossRef]  

20. F. Li, J. Chen, S. He, and Z. Zhou, “Layer Regeneration Network With Parameter Transfer and Knowledge Distillation for Intelligent Fault Diagnosis of Bearing Using Class Unbalanced Sample,” IEEE Trans. Instrum. Meas. 70, 1–10 (2021). [CrossRef]  

21. Z. C. Lipton, J. Berkowitz, and C. Elkan, “A Critical Review of Recurrent Neural Networks for Sequence Learning,” arXiv, arXiv:1506.00019 (2015).

22. L. van der Maaten and G. Hinton, “Visualizing Data using t-SNE,” J. Mach. Learn. Res. 9, 2579–2605 (2008).

23. F. Liu, H. Zhang, X. Li, Z. Li, and H. Wang, “Intrusion identification using GMM-HMM for perimeter monitoring based on ultra-weak FBG arrays,” Opt. Express 30(10), 17307–17320 (2022). [CrossRef]  

24. Z. Zhou and J. Feng, “Deep Forest: Towards An Alternative to Deep Neural Networks,” arXiv, arXiv:1702.08835 (2017).


Figures (18)

Fig. 1. Underground laying of the UWFBG arrays.
Fig. 2. Three types of vehicle vibration signals from UWFBG arrays.
Fig. 3. Truck driving on the UWFBG III.
Fig. 4. Vibration signals of one truck from UWFBG arrays. (a) Accompanying vibration signal. (b) Accompanying vibration signal. (c) Vibration signal of the single vehicle. (d) Accompanying vibration signal.
Fig. 5. Two laterally adjacent vehicles in different lanes (UWFBG II and III).
Fig. 6. Vibration signals of the two laterally adjacent vehicles. (a) Accompanying vibration signal. (b) Vibration signal of laterally adjacent vehicles. (c) Vibration signal of laterally adjacent vehicles. (d) Accompanying vibration signal.
Fig. 7. Three types of typical vibration signals. (a) Vibration signal of single vehicle. (b) Vibration signal of laterally adjacent vehicles. (c) Accompanying vibration signal.
Fig. 8. Flowchart of the proposed vehicle identification framework.
Fig. 9. Vibration signal acquisition by the sliding window.
Fig. 10. Typical LSTM structure.
Fig. 11. Residual block.
Fig. 12. Identification process of the proposed teacher model (ResNet-LSTM).
Fig. 13. The proposed training process of deep learning with knowledge distillation.
Fig. 14. (a) Results of preliminary extraction. (b) Further extraction results using DBSCAN.
Fig. 15. Visualization of different methods with t-SNE. (a) Original signal. (b) Time-Frequency. (c) VMD-MPE. (d) The teacher model. (e) The student model. (f) The proposed method with knowledge distillation.
Fig. 16. The training and validation losses of different models. (a) The teacher model. (b) The student model. (c) The student model with KD.
Fig. 17. Performance comparison of different methods.
Fig. 18. The effect of parameter values of DBSCAN on the performance of different models.

Tables (4)

Table 1. Parameters of the proposed teacher model
Table 2. Training parameters of the proposed deep learning model with knowledge distillation
Table 3. Identification rates of different methods for the three types of vehicle signals
Table 4. ${F_1}$ scores for different methods

Equations (5)

$$L_{overlap} = L_{window} - L_{offset},$$
$$N = \frac{n - L_{overlap}}{L_{offset}},$$
$$q_i = \frac{\exp(z_i/T)}{\sum_j \exp(z_j/T)},$$
$$Loss = \alpha T^2 L_{KL}(q^T \| q^S) + (1 - \alpha) L_{CE},$$
$$F_1 = \frac{2PR}{P + R}.$$