Vehicle identification using deep learning for expressway monitoring based on ultra-weak FBG arrays

Fang Liu; Yu Lei; Yu Xie; Xiaorui Li; Qiuming Nan; Lina Yue

doi:10.1364/OE.487400

1. Introduction

Real-time and accurate traffic flow data is essential for expressway traffic management and intelligent services. In recent years, expressway vehicle monitoring methods have developed rapidly. However, the commonly used methods, such as cameras, radar, and GPS, are susceptible to weather or other environmental influences, making it difficult to achieve continuous long-distance monitoring. The ultra-weak fiber Bragg grating (UWFBG) has the advantages of high spatial resolution [1], fast sensing response [2], and dynamic measurement [3], which can realize long-distance and real-time vehicles monitoring.

Similarly, studies have been conducted on applying fiber-optic sensors in expressway traffic flow monitoring, and the results are closely related to how the transmission cable is laid. Wang et al. [4] used distributed acoustic sensing (DAS) on the roadside of the expressway, relying only on a single transmission cable to monitor vehicles. However, this approach could not achieve “lane level” monitoring, and the vibration signal generated by a vehicle farther away from the cable was weaker. It is also unable to identify “laterally adjacent vehicles”. Yuksel et al. [5] extracted the vehicle vibration signal through the transmission optical cable perpendicular to the road direction, whose method could detect the vibration signal of the vehicle wheel axle. Nevertheless, the research mainly focused on measuring the vehicle speed and wheelbase, which could not accurately identify vehicles.

In addition to how the transmission fiber optic cable is laid, another problem is the analysis of vibration signals. The traditional analysis includes time-domain, frequency-domain, and time-frequency analysis [6,7]. In recent years, machine learning has been widely used in vibration signals. For example, Xin et al. [8] used local characteristics-scale decomposition (LCD) to extract vibration signal features and input them to a support vector machine (SVM), which has achieved good results in the subway perimeter. Van et al. [9] combined empirical mode decomposition (EMD) and SVM for bearing fault diagnosis. Celler et al. [10] estimated blood pressure using a Gaussian mixture model and a hidden Markov model. Dong et al. [11] used wavelet transform to extract time-frequency features as the input of a random forest, which has a good effect in bearing fault diagnosis.

Traditional machine learning needs to extract features of complex vibration signals first, and the quality of feature extraction will directly affect the accuracy of the recognition model. In addition, vibration signals have the characteristics of complex nonlinearity, non-stationary, considerable noise, and extensive crosstalk between vibration sources. Especially in a complex environment such as an expressway, different vehicles crosstalk with each other, which causes great difficulties for manual feature extraction. On the other hand, the feature extraction of vibration signals will also incur time overhead, and complex signal processing algorithms cannot be used in scenes with real-time requirements such as expressways. For example, Qi et al. [12] used variational modal decomposition (VMD) for laser radar signal denoising, but with the increase of the number of modal components, the single processing time of the algorithm will also become more extended.

Compared with the above methods, deep learning, as a branch of machine learning, has powerful automatic feature extraction capability and mature inference acceleration method, widely used in computer vision and natural language processing. In vibration signal analysis, Liu et al. [13] proposed an adaptive convolutional neural network for identifying perimeter intrusion. Shrestha et al. [14] used convolutional neural networks for bridge structure monitoring. Deeper neural networks can extract richer features of vibration signals, but deeper networks can produce the problem of gradient disappearance. To solve the problem, Peng et al. [15] proposed to introduce residual learning in a one-dimensional convolutional neural network, which is applied to high-speed train fault diagnosis. In addition, LSTM, as a recurrent neural network, performs well in time series processing and can capture nonlinear dynamic changes in time series, and Ma et al. [16] used LSTM for transformer fault prediction. Chen et al. [17] collected vibration signals from spacecraft as input to Bi-LSTM to predict the spacecraft's lifetime.

Since complex deep learning integration models have a large number of redundant parameters and also incur huge time overhead, it is challenging to meet real-time demanding application scenarios like an expressway. Knowledge distillation can meet the need for lightweight deployment [18], with the idea of generating soft labels for structurally simple student model learning through a robust teacher model, so that the performance of the student model approximates that of the teacher model. Deng et al. [19] used multiple teacher models to generate soft labels for student models to learn, which led to good performance of student models. Li et al. [20] used knowledge distillation for bearing fault diagnosis and achieved good results in practical applications.

In this paper, the UWFBG-based lateral lane level monitoring scheme for expressway vehicle monitoring is proposed for high accuracy, high performance, and lightweight deployment. UWFBG arrays are laid underground in each expressway lane to collect vibration signals. With the propagation of vibrations in the ground, the detected vibration signals from UWFBG arrays become more complex as the number of vehicles on the highway increases, which makes it challenging to recognize if there is an actual vehicle in the lane. A sliding window filters the vehicle vibration signals by DBSCAN to create a sample library. For the characteristics of the pavement of the UWFBG arrays, the vehicle vibration signals are classified into three types to identify the actual vehicle in the lane: vibration signal of a single vehicle, accompanying vibration signals in the adjacent lane of the vehicle, and the vibration signals of laterally adjacent vehicles. Then, using knowledge distillation, a powerful teacher model is designed, which consists of a deep residual network with an LSTM, and a student model consists of only one LSTM layer. The student model approximates the performance of the teacher model by learning the soft labels generated by the teacher model. The experiment shows that compared with other advanced methods, the deep learning model can extract vibration signal features better, and the student model improves its accuracy through knowledge distillation. This method deals well with both real-time and accuracy.

The rest of the paper is organized as follows: Section 2 briefly introduces the experimental system with UWFBG arrays. Section 3 describes the proposed identification scheme using deep learning. Section 4 presents the experiment results and comparison analysis. Finally, Section 5 comes to conclusions.

2. Experimental system based on UWFGB arrays

The sensing array is made of an on-line writing system, in which UV light from a 248 nm excimer laser is irradiated onto the stretched fiber with very low transmission loss. A large-scale ultra-weak FBG array consisting of thousands of identical-wavelength FBGs with low reflectivity is fabricated, and the UWFBG array has no fusion loss with high mechanical stability. The narrow linewidth laser (NLL) generates a continuous wave light source with a central wavelength of 1550 nm, which is modulated into a 1 kHz light pulse by the semiconductor optical amplifier (SOA) and then enters the erbium-doped fiber amplifier (EDFA). The pulsed light is subsequently guided into the UWFBG by a circulator. The 3*3 coupler phase demodulation unit consisting of Michelson interferometer will recover the vibration signal waveform by demodulating the phase change caused by the optical length change of two adjacent sensors, and optical time domain reflection (OTDR) is used to locate the vibration appearance [8].

As shown in Fig. 1, four fiber optic cables are parallelly buried 30 cm underground in four lanes with a grating spacing of 5 m. 100 valid FBGs in each cable are used in this experimental system. The sensing network transmits the vibration signal with a sampling frequency of 1 k Hz to the demodulator.

Fig. 1. Underground laying of the UWFBG arrays.

Layer Name	Kernel Size/Stride	Kernel Number	Output Shape
Input Layer	-	-	750 $\times 1$
con1	16/ 7	64	(105, 64)
MaxPooling1D	3/2	64	(52, 64)
con2.x	$(\begin{matrix} 1 \\ 3 \\ 1 \end{matrix}) \times 3 / 2$	$(\begin{matrix} 64 \\ 64 \\ 256 \end{matrix}) \times 3$	(52, 256)
con3.x	$(\begin{matrix} 1 \\ 3 \\ 1 \end{matrix}) \times 4 / 2$	$(\begin{matrix} 128 \\ 128 \\ 512 \end{matrix}) \times 4$	(26, 512)
con4.x	$(\begin{matrix} 1 \\ 3 \\ 1 \end{matrix}) \times 6 / 2$	$(\begin{matrix} 256 \\ 256 \\ 1024 \end{matrix}) \times 6$	(13, 1024)
con5.x	$(\begin{matrix} 1 \\ 3 \\ 1 \end{matrix}) \times 3 / 1$	$(\begin{matrix} 512 \\ 512 \\ 2048 \end{matrix}) \times 3$	(7, 2048)
AveragePooling1D	2	2048	(4, 2048)
LSTM	64	-	(4, 64)
Flatten	-	-	(None, 256)
Dense	3	1	(None, 3)

Model	Optimizer	Learning rate	Loss
Teacher	Adam	0.001	Categorical cross-entropy
Student	Adam	0.001	Categorical cross-entropy
Student + KD	Adam	0.001	Distillation loss

Classifier	Accompanying Vibration	Single vehicle	Laterally adjacent vehicles
DF1	83%	95%	83%
DF2	62%	89%	77%
Teacher	98%	99%	94%
CNN-LSTM	91%	97%	92%
Student	95%	96%	85%
Student + KD	97%	99%	88%

Number of samples	CNN	CNN-LSTM	Teacher	Student	Student + KD
1250	0.91	0.94	0.97	0.92	0. 95
625	0.91	0.90	0.95	0.88	0.93
312	0.90	0.88	0.94	0.87	0.92
156	0.90	0.88	0.92	0.85	0.90

Abstract

1. Introduction

2. Experimental system based on UWFGB arrays

3. Principle of identification scheme using deep learning

3.1 Sliding window

3.2 Deep learning models

3.2.1 Student model

3.2.2 Teacher model

3.2.3 Knowledge distillation

3.3 Training process of vehicle identification model

4. Experimental results

4.1 Sample library of vehicle vibration signals

4.2 Characteristic visualization of vehicle vibration signals

4.3 Model evaluation

4.3.1 Evaluation on the classification performance

4.3.2 Experiments on the robustness of the knowledge distillation

5. Conclusion

Funding

Disclosures

Data availability

References

Data availability

Cited By

Figures (18)

Tables (4)

Equations (5)

Optics Express