
Retrieval of the planetary boundary layer height from lidar measurements by a deep-learning method based on the wavelet covariance transform

Open Access

Abstract

Understanding and characterization of the planetary boundary layer (PBL) are of great importance in terms of air pollution management, weather forecasting, modelling of climate change, etc. Although many lidar-based approaches have been proposed for the retrieval of the PBL height (PBLH) in case studies, the development of a robust lidar-based algorithm requiring no human intervention remains highly challenging. In this work, we demonstrate a novel deep-learning method based on the wavelet covariance transform (WCT) for PBLH evaluation from atmospheric lidar measurements. Lidar profiles are evaluated according to the WCT with a series of dilation values from 200 m to 505 m to generate 2-dimensional wavelet images. A large number of wavelet images and the corresponding PBLH-labelled images are created as the training set for a convolutional neural network (CNN), which is implemented based on a modified VGG16 (VGG – Visual Geometry Group) network. Wavelet images obtained from lidar profiles have also been prepared as the test set to investigate the performance of the CNN. The PBLH is finally retrieved by evaluating the predicted PBLH-labelled image together with the wavelet coefficients. Comparison studies with radiosonde data and the Micro-Pulse-Lidar Network (MPLNET) PBLH product have validated the promising performance of the deep-learning method for PBLH retrieval in practical atmospheric sensing.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

The planetary boundary layer (PBL) is the lowest layer of the troposphere, directly influenced by the frictional resistance, evaporation and transpiration of the Earth's surface with a response time of an hour or less [1–4]. The height of the PBL (PBLH), frequently taken as the middle of the entrainment zone [5], determines the volume available for aerosol dispersion. Understanding and characterization of the PBL are of great importance in terms of air pollution management, weather forecasting, modelling of climate change, etc. However, the structure of the PBL can be rather complex and highly variable, which makes the determination of the PBLH a great challenge. During past decades, balloon-borne radiosondes and microwave radiometers have often been utilized for the detection of the PBLH. However, these methods have a few shortcomings, e.g., limited observation frequency, sparse spatial coverage and low vertical resolution [6–11].

In recent decades, lidar has been widely employed for measurements of the PBLH [12–22]. The principle of the lidar-based PBLH retrieval is that a sharp decrease of the aerosol load from the boundary layer to the free troposphere exists at the top of the PBL. As a result, there is often a sharp attenuation in the atmospheric backscattering signal measured by lidar techniques at the top of the PBL. The PBLH can thus be determined from the height with the maximum gradient of the lidar signal. A few approaches have been utilized for the retrieval of the PBLH from lidar measurements, e.g., the standard deviation method [23], the gradient method [24–27] and the curve-fitting method [28–31]. The gradient method obtains the PBLH directly from the maximum gradient of the lidar signal. However, its performance is not satisfactory for lidar signals with low signal-to-noise ratio (SNR) or with large local spatial variations due to multiple aerosol layers or clouds. The curve-fitting method is mainly suitable for lidar signals matching the shape of an idealized curve. The wavelet covariance transform (WCT) method [32–36], which is also a gradient-based method, retrieves the PBLH from the covariance transform of the lidar profile with a wavelet (e.g., the Haar wavelet) parameterized by a center value (b) and a dilation value (a). The Haar wavelet and the WCT are given by

$$h\left(\frac{z - b}{a}\right) = \begin{cases} +1, & b - a/2 \le z \le b\\ -1, & b \le z \le b + a/2\\ 0, & \textrm{elsewhere} \end{cases}$$
$$w(a,b) = \frac{1}{a}\int_{z_b}^{z_t} P(z)\, h\left(\frac{z - b}{a}\right)\,\mathrm{d}z$$

Here h is the Haar function, $z_b$ and $z_t$ are the bottom and top altitudes defining the possible retrieval range of the PBLH, $P(z)$ is the range-resolved lidar profile, and $w(a,b)$ is referred to as the wavelet coefficient. According to the definition of the planetary boundary layer, the PBLH is often determined by the altitude with the maximum wavelet coefficient. Selecting a proper dilation value (a) is the key issue for the WCT method. For a simple lidar profile with a distinct transition zone, the dilation value can be chosen in a wide range and the retrieved PBLH is nearly constant for different dilation values [32]. However, the selection of the dilation value can be very challenging for sophisticated lidar signals with large local intensity variations due to multiple aerosol layers. If the dilation value is too small, the wavelet coefficient may exhibit quite a few local maxima, which may be misrecognized as the PBLH. Conversely, a large dilation value can only identify significant attenuation of the lidar signal and may overestimate the PBLH [32,33]. Besides, the determination of the PBLH retrieval range from the wavelet-coefficient profile is also difficult in the presence of residual layers and clouds [37–39]. Although several PBLH retrieval algorithms based on gradient methods have been developed in recent years, e.g., the structure of the atmosphere (STRAT) algorithm [40], the temporal-height-tracking (THT) algorithm [41] and the boundary-layer-height retrieval algorithm based on vertical gradients and grouping (BRAVO) [42], improvements in terms of signal preprocessing, gradient detection and layer attribution are still required [43]. In summary, developing a robust automated retrieval approach with little or no human intervention remains a challenging task [44].
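For illustration, Eqs. (1) and (2) can be computed with a few lines of NumPy. The following is a minimal sketch assuming a uniformly sampled profile; the function name, the grid spacing and the boundary handling are illustrative and not the exact implementation of this work.

```python
import numpy as np

def haar_wct(profile, z, a):
    """Wavelet covariance transform (Eq. (2)) of a lidar profile with a
    Haar wavelet of dilation a, evaluated at every center altitude b.

    profile : 1-D array, range-resolved lidar signal P(z)
    z       : 1-D array of altitudes with uniform spacing
    a       : dilation value in the same units as z
    """
    dz = z[1] - z[0]                        # uniform range interval
    half = int(round(a / 2.0 / dz))         # half-width of the wavelet in samples
    w = np.full(len(z), np.nan)
    for i in range(half, len(z) - half):
        lower = profile[i - half:i].sum()   # h = +1 over [b - a/2, b]
        upper = profile[i:i + half].sum()   # h = -1 over [b, b + a/2]
        w[i] = (lower - upper) * dz / a     # (1/a) * integral of P(z) h((z-b)/a) dz
    return w

# For a single dilation value, the PBLH estimate is the altitude of the
# maximum coefficient, e.g.:
# pblh = z[np.nanargmax(haar_wct(profile, z, 300.0))]
```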

In this work, we propose a novel deep learning method based on the WCT for the retrieval of the PBLH from lidar measurements. Measured lidar profiles are processed according to the WCT with a series of dilation values. Two-dimensional (2D) wavelet images as well as the corresponding PBLH-labelled images are created as the training dataset for a convolutional neural network (CNN). The final PBLH is retrieved from the PBLH-labelled image predicted by the well-trained CNN. The deep learning method has been validated by comparison studies with radiosonde data and the Micro-Pulse-Lidar Network (MPLNET) PBL product.

2. Methods

2.1 Lidar profiles for training

Lidar profiles used for training were measured by an 808-nm Scheimpflug lidar (SLidar) system developed at Dalian University of Technology (DLUT). The principle of the SLidar technique has been thoroughly demonstrated in previous literature [45–47]. Briefly, the SLidar technique employs a continuous-wave (CW) high-power laser diode (e.g., 808 nm) as the light source, and measures the range-resolved atmospheric backscattering echo using an image sensor (typically tilted by 45°) according to the angle of incidence of the scattered light. The transmitter, the receiving telescope (e.g., a Newtonian telescope) and the image sensor are arranged in an optical configuration satisfying the Scheimpflug principle. The range-resolved lidar signal is then obtained from the recorded laser-beam image after performing pixel binning, signal averaging, digital filtering and the pixel-distance transform [48]. Each lidar profile has been averaged for about 45 seconds to improve the signal-to-noise ratio (SNR). Meanwhile, since the range resolution of the lidar profile deteriorates with increasing measurement distance (e.g., 1 m@500 m, 6 m@1 km) according to the principle of the SLidar technique, the lidar profile has been resampled with an equal range interval of 10 m to facilitate post-analysis, as sketched below.
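The resampling step can be illustrated with a short sketch; the non-uniform range grid below is a synthetic stand-in for the actual pixel-distance transform of the SLidar system.

```python
import numpy as np

# Synthetic stand-in for a pixel-distance-transformed SLidar profile: the
# native range grid is non-uniform (resolution degrades with distance).
r_native = 500.0 / (1.0 - np.linspace(0.0, 0.8, 400))  # ~500 m to 2500 m
signal = np.exp(-r_native / 1500.0)                    # mock backscatter decay

# Linear resampling onto an equal 10 m range interval for post-analysis.
r_uniform = np.arange(r_native[0], r_native[-1], 10.0)
signal_uniform = np.interp(r_uniform, r_native, signal)
```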

The time-height intensity (THI) maps of the lidar signals measured in August 2018 are shown in Fig. 1. As the lidar system has an elevation angle of 30°, the measurement distance has been transformed to altitude, and the corresponding range resolution in altitude is thus equal to 5 m. These lidar profiles are utilized to prepare the dataset for the CNN, including 10520 training profiles and 1133 test profiles.


Fig. 1. Atmospheric backscattering intensity measured in August 2018 by a Scheimpflug lidar (SLidar) system with an elevation angle of 30°. The measurement distance has been transformed into altitude. Each lidar profile has been averaged for about 45 seconds.


2.2 Radiosonde data

The balloon-borne radiosonde can measure various meteorological parameters, such as atmospheric pressure, relative humidity, temperature, potential temperature, etc. The PBLH can thus be retrieved from the recorded temperature profile. In this work, the radiosonde data at the Dalian site (ID: 54662), located about 9.3 km away from DLUT, are obtained from the Wyoming Weather Web [49] to verify the PBLH evaluated by the deep learning method. The sounding balloon is generally released at around 11:15 and 23:15 Beijing time. However, the real-time temporal variation of the PBLH cannot be obtained owing to the limited observation frequency of the balloon-borne radiosonde. Besides, some of the radiosonde data cannot be used, as the near-ground altitude information may be missing. According to the potential temperature profile method, the determination of the PBLH relies on the regime of the PBL structure, which can be identified by evaluating the near-surface thermal gradient [50]

$$\theta_2 - \theta_1 \begin{cases} < -\delta, & \textrm{convective boundary layer}\\ > +\delta, & \textrm{stable boundary layer}\\ \textrm{otherwise}, & \textrm{neutral boundary layer} \end{cases}$$

Here θ2 and θ1 are the near-surface (typically below 200 m) potential temperatures (in Kelvin) measured by the radiosonde. In this paper, the value of δ is set to the common value of 1.0 K. In the case of a neutral or convective layer, the height at which the potential temperature gradient exceeds 1 K/km for the first time is defined as the PBLH. In the case of a stable layer, the height at which the potential temperature gradient falls below 6.5 K/km for the first time is set as the PBLH.
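These rules can be summarized in a short sketch; the function below is illustrative, and the assumption that the first two levels of the profile sample the near-surface layer (θ1, θ2) is ours.

```python
import numpy as np

def radiosonde_pblh(z, theta, delta=1.0):
    """PBLH from a radiosonde potential-temperature profile (Eq. (3) rules).

    z     : altitudes in m (ascending); theta : potential temperature in K.
    delta : near-surface threshold, 1.0 K in this work.
    Assumes theta[0] (theta_1) and theta[1] (theta_2) sample the
    near-surface layer (typically below 200 m).
    """
    grad = np.diff(theta) / np.diff(z) * 1000.0   # gradient in K/km per layer
    if theta[1] - theta[0] < -delta:              # convective boundary layer
        idx = np.argmax(grad > 1.0)               # first layer with gradient > 1 K/km
    elif theta[1] - theta[0] > delta:             # stable boundary layer
        idx = np.argmax(grad < 6.5)               # first layer with gradient < 6.5 K/km
    else:                                         # neutral: treated like convective
        idx = np.argmax(grad > 1.0)
    # Note: np.argmax returns 0 if the condition is never met; a production
    # implementation would reject such profiles.
    return z[idx + 1]
```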

2.3 Data set generation

According to the definition of the planetary boundary layer, the PBLH can be directly determined from the maximum value of the wavelet coefficient in a simplified case. However, the maximum wavelet coefficient could also be caused by interference layers such as clouds, residual layers, etc., which can introduce considerable retrieval errors in the PBLH. The present deep-learning method aims at automatically eliminating these interferences when retrieving the PBLH by employing a well-trained CNN. The flowchart of generating the training dataset is shown in Fig. 2. The procedures for preparing the training dataset are described in detail below.


Fig. 2. The flowchart of generating the training dataset. There are generally four steps. (i) Performing WCT on lidar profiles and generating 2D wavelet images. (ii) Screening/classifying wavelet images into different categories, and manually setting searching regions for the retrieval of the PBLH. (iii) Finding the local maximum of the wavelet coefficient in a customized region, from which a PBLH-labelled image is generated. (iv) Performing image augmentation to expand the training dataset for the CNN.


The first step for training the CNN is to prepare the dataset from lidar profiles. The flowchart of generating a training sample is shown in Fig. 3. Each training sample consists of a two-dimensional (2D) wavelet image and the corresponding PBLH-labelled image, as shown in Fig. 3(c)-(d). The 2D wavelet image is generated by performing a series of WCTs in the altitude range of 40-2500 m, where the aerosol load is mainly concentrated. The red or dark-red pattern shown in Fig. 3(c) indicates the positions with large wavelet coefficients, i.e., large gradients in the lidar profile. In early studies, a single dilation value was selected for calculating the wavelet coefficient, e.g., 200 m, 300 m or 500 m, showing promising performance under different atmospheric conditions [34]. However, it is always very challenging to select a proper dilation value, as mentioned above. Thus, in this work we calculate the wavelet coefficient for each lidar profile with a series of dilation values from 200 m to 505 m (in altitude) with a step of 5 m; a few examples are shown in Fig. 3(b). As shown in Fig. 3(c), the wavelet coefficient has been min-max normalized to the range [0, 1] before plotting the wavelet image. The maximum value of the color bar is set to 1, while its minimum value is determined by a threshold distinguishing the smallest 10% of the wavelet coefficients from the rest. The 2D wavelet image is resized to 343×372 pixels to ensure consistent sizes of all images; the sizes of the training image and the labelled image may nevertheless be changed according to the dataset. One pixel of the wavelet image thus corresponds to a range interval of about 7 m in altitude. A few studies have utilized wavelet images to evaluate the influence of the dilation value [5]; however, they have not yet been utilized for direct retrieval of the PBLH, particularly in combination with a deep learning method.
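Putting the above parameters together, the generation of a 2D wavelet image might look like the following sketch, which reuses the haar_wct function sketched in Section 1; resizing by bilinear interpolation is an assumption, as the interpolation scheme is not specified in this work.

```python
import numpy as np
from scipy.ndimage import zoom

def wavelet_image(profile, z, a_min=200.0, a_max=505.0, step=5.0,
                  shape=(343, 372), low_quantile=0.10):
    """Stack WCTs over a series of dilation values into a 2D wavelet image.

    Rows correspond to altitude and columns to the dilation value
    (200-505 m in 5 m steps). The image is min-max normalized to [0, 1],
    floored at the 10% quantile and resized to a fixed pixel size.
    """
    dilations = np.arange(a_min, a_max + step, step)
    img = np.stack([haar_wct(profile, z, a) for a in dilations], axis=1)
    img = np.nan_to_num(img)
    img = (img - img.min()) / (img.max() - img.min())        # min-max to [0, 1]
    img = np.clip(img, np.quantile(img, low_quantile), 1.0)  # floor: smallest 10%
    return zoom(img, (shape[0] / img.shape[0], shape[1] / img.shape[1]), order=1)
```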

Fig. 3. The flowchart for the generation of the training set. (a) lidar profile, (b) the WCT with different dilation values, (c) the 2D wavelet image (343×372 pixels; 343 corresponds to the vertical axis and 372 to the horizontal axis), and (d) the PBLH-labelled image. The bright pattern (red or dark red) corresponds to larger wavelet coefficients and thus larger gradients in the lidar profile. The horizontal axis in (c) corresponds to the dilation value from 200 m to 505 m, while the vertical axis refers to the measurement altitude. One pixel in the 2D wavelet image corresponds to a range interval of 7 m in altitude.

As can be seen in Fig. 3(c), several local maxima may appear in the wavelet image owing to, e.g., the true boundary layer and a residual layer. However, only one of them indicates the true PBLH. Thus, we should first exclude interferences from clouds or residual layers and then label the position of the PBLH by finding the local maximum of the wavelet coefficient below a threshold altitude, e.g., 0.5 km, as shown in Fig. 3(c). Finally, the PBLH-labelled image for a specific wavelet image is created. As shown in Fig. 3(d), all pixels in the PBLH-labelled image are set to zero except those indicating the positions of the potential PBLH. Unfortunately, not all wavelet images can be simply labelled according to the above method, owing to atmospheric variations and interferences from clouds, residual layers, etc. Thus, we first screen the wavelet images with human intervention and then classify them into different categories for PBLH labelling. In fact, visual inspection was utilized for roughly estimating the PBLH in early studies [14,44,51]. The classification rules are described below.

In a simple case, there is a distinct transition zone between the planetary boundary layer and the free troposphere, and there is no or only weak local intensity variation due to, e.g., aerosol vertical structures in the lidar profile. The wavelet coefficient calculated with various dilation values then mostly has a single local maximum, or one dominant local maximum that is much larger than the others. Thus, a single high-contrast stripe can be observed in the 2D wavelet image, as shown in Fig. 4(a). The PBLH can be readily labelled by finding the altitude with the maximum wavelet coefficient under different dilation values. On some occasions, there is a long transition zone between the boundary layer and the free troposphere. As a result, the 2D wavelet image may not have a high contrast between the maximum wavelet coefficient and the background. Nevertheless, the PBLH can still be labelled, although it may increase with increasing dilation value [33].

Fig. 4. Lidar profiles measured under different atmospheric conditions, 2D wavelet images and the corresponding PBLH-labelled images, where the horizontal axis refers to the dilation value from 200 m to 505 m and the vertical axis corresponds to the measurement altitude. (a) lidar profile measured at 15:43 on Aug. 25, with a distinct transition zone, (b) lidar profile measured at 20:56 on Aug. 27, with a strong cloud echo, (c) lidar profile measured at 15:01 on Aug. 26, with potential residual layers and cloud, (d) lidar profile measured at 20:58 on Aug. 24, with a nearby residual layer, (e) lidar profile measured at 11:58 on Aug. 27, with nearby residual layers, (f) lidar profile measured at 16:21 on Aug. 22, with multiple aerosol layers.

Clouds, residual layers, aerosol layers aloft, etc., which are all considered interference layers for the retrieval of the PBLH, can appear above the boundary layer. If the gradient of the interference layer is very large, the position of the maximum wavelet coefficient may indicate the upper boundary of the interference layer rather than the PBLH. Meanwhile, a negative local minimum can often be observed in the profile of the wavelet coefficient and in the 2D wavelet image, corresponding to the lower boundary of the interference layer. The PBLH can thus be labelled by finding the maximum wavelet coefficient below the interference layer, regardless of whether the layer has a larger or smaller gradient, as shown in Fig. 4(b), (c) and (d).

If residual layers or local aerosol layers, etc. are in close contact with the boundary layer, they could mask the growth and collapse of the PBL. As shown in Fig. 4(e), two local maxima may then be very close to each other in the wavelet image. Multiple aerosol layers could also appear around the boundary layer. As indicated by the three stripe patterns in Fig. 4(f), three or even more local maxima with similar amplitudes can be found in the 2D wavelet image, which makes the labelling of the PBLH very difficult. Under these circumstances, the position of the local maximum with a lower altitude is preferable. However, a very important rule for labelling the PBLH in these sophisticated cases is the principle of continuity, meaning that the evolution of the boundary layer cannot change suddenly. In other words, the labelling of the PBLH should consider its neighboring PBLHs, where the wavelet images do not have many local maxima due to multiple aerosol layers and can thus be readily labelled. Nevertheless, if a wavelet image is difficult to classify or the PBLH is difficult to recognize, it is not utilized as a training sample.

After classification according to the above rules, customized search regions for finding the PBLH are manually defined. The wavelet image is labelled by automatically finding the local maximum in a customized sub-region, with interferences from clouds, residual layers, etc. eliminated. Clearly, the classification process requires human intervention, which is time-consuming and thus not suitable for long-term measurements. The purpose of the deep learning method is to directly predict a PBLH-labelled image under various atmospheric conditions without human intervention, according to the knowledge obtained during the CNN training.

Besides, a large-scale dataset is a prerequisite for the successful training of a deep neural network. Image augmentation is a commonly used approach for expanding the training dataset of a CNN [52,53]. In this work, we have adjusted the brightness and contrast of the original wavelet images to create new images, in order to augment the training set while maintaining the structural features of the wavelet images; a sketch is given below. In addition, the data distribution of the training set is not balanced, and some categories contain fewer images, e.g., those of Fig. 4(e) and (f). For these cases, the same method has been utilized to expand the training wavelet images as well. Finally, 48176 pairs of 2D wavelet images and the corresponding PBLH-labelled images are prepared for the neural network training.
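A minimal sketch of such brightness/contrast augmentation follows; the perturbation ranges are illustrative assumptions, as the exact values are not specified in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, n_copies=4):
    """Create brightness/contrast-perturbed copies of a wavelet image
    (values in [0, 1]) while preserving its structural features."""
    copies = []
    for _ in range(n_copies):
        contrast = rng.uniform(0.8, 1.2)     # illustrative perturbation range
        brightness = rng.uniform(-0.1, 0.1)  # illustrative perturbation range
        out = np.clip(contrast * (image - 0.5) + 0.5 + brightness, 0.0, 1.0)
        copies.append(out)
    return copies
```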

2.4 Neural network

The architecture of the neural network utilized in this work is inspired by the VGG16 (VGG – Visual Geometry Group) network originally proposed by the Visual Geometry Group of Oxford [52]. The VGG16 network, showing state-of-the-art performance in the ImageNet challenge, has been demonstrated to be very useful for low-level edge-detection tasks [54]. Therefore, we initialize the training with the parameters of the pre-trained VGG16 model rather than starting from scratch, which greatly speeds up the training. The backbone network is divided into 5 stages, each of which contains 2 or 3 conv layers with a kernel size of 3×3. Different stages are connected by a 2×2 max-pooling layer. The VGG16 network has also been modified for holistic image prediction and multi-scale/multi-level feature learning by removing the 3 fully connected layers and adding side-output layers to the conv layers in each stage for deep supervision [55,56]. Through deep supervision, the training level of each stage can be evaluated. Specifically, each conv layer in the backbone network is connected to a side conv layer with a kernel size of 1×1 and a channel depth of 21, as shown in Fig. 5. The side conv layers also adjust the model parameters through the loss function to approach the target. The output feature images from each stage are accumulated and followed by a 1×1 conv layer with a channel depth of 1 to attain hybrid features; a deconvolution (deconv) layer is then connected to up-sample the feature image so that the final image output by the CNN is 343×372 pixels, the same as the input image. The CNN is implemented using the PyTorch framework and has been trained with 48176 wavelet images and the corresponding PBLH-labelled images. To evaluate the training results, the cross-entropy loss function is utilized to measure the difference between the training output images and the labelled images. The training continues for 12 epochs, until the value of the cross-entropy loss function tends toward a constant with negligible variation.
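A compact PyTorch sketch of such a side-output architecture is given below. It follows the description above (pre-trained VGG16 backbone, 1×1 side convs with 21 channels, per-stage fusion and deep supervision), but several details are assumptions: the class name is illustrative, the single-channel wavelet image is assumed to be replicated to three channels to match the pre-trained VGG16 input, and bilinear up-sampling stands in for the deconvolution layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class WaveletPBLHNet(nn.Module):
    """Side-output VGG16 sketch: fully connected layers removed, each conv
    layer followed by a 1x1 side conv (21 channels); per-stage side features
    are fused, up-sampled to the input size and deeply supervised."""

    def __init__(self):
        super().__init__()
        backbone = vgg16(weights="IMAGENET1K_V1").features  # pre-trained start
        # Indices of the conv layers in vgg16().features, grouped by stage.
        stages = [[0, 2], [5, 7], [10, 12, 14], [17, 19, 21], [24, 26, 28]]
        self.convs = nn.ModuleList(
            nn.ModuleList(backbone[i] for i in s) for s in stages)
        self.sides = nn.ModuleList(
            nn.ModuleList(nn.Conv2d(backbone[i].out_channels, 21, 1) for i in s)
            for s in stages)
        self.stage_fuse = nn.ModuleList(nn.Conv2d(21, 1, 1) for _ in stages)
        self.fuse = nn.Conv2d(5, 1, 1)  # combine the five stage outputs

    def forward(self, x):
        # x: (N, 3, 343, 372); the single-channel wavelet image is assumed
        # to be replicated to 3 channels to match the pre-trained weights.
        h, w = x.shape[-2:]
        side_outs = []
        for stage, (convs, sides, fuse) in enumerate(
                zip(self.convs, self.sides, self.stage_fuse)):
            acc = 0.0
            for conv, side in zip(convs, sides):
                x = F.relu(conv(x))
                acc = acc + side(x)       # accumulate side features per stage
            out = F.interpolate(fuse(acc), size=(h, w), mode="bilinear",
                                align_corners=False)
            side_outs.append(out)         # deep-supervision output per stage
            if stage < 4:
                x = F.max_pool2d(x, 2)    # 2x2 max pooling between stages
        fused = self.fuse(torch.cat(side_outs, dim=1))
        # All outputs are trained against the PBLH-labelled image with a
        # per-pixel (binary) cross-entropy loss.
        return side_outs, fused
```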


Fig. 5. The architecture of the proposed CNN, modified from the VGG16 network [52].


3. Results and discussions

3.1 Prediction result and the retrieval of the PBLH

The well-trained CNN has been utilized to evaluate lidar profiles measured from August 21 to August 28, 2018. 1133 original 2D wavelet images were randomly selected from the dataset as the test set. Through the well-trained CNN, an image-to-image prediction with the potential PBLH labelled is achieved. Several typical predicted PBLH images from the CNN are shown in Fig. 6. The simplest case is shown in Fig. 6(a); the high-contrast region in the image corresponds to the position of the boundary layer. In the presence of strong cloud or residual layers, the corresponding large-gradient patterns can be readily removed, and the predicted image clearly indicates the position of the PBLH, which has the largest wavelet coefficients below the cloud or the residual layer, as shown in Fig. 6(b). Meanwhile, the prediction result is also very promising even when the gradient at the PBLH is significantly smaller than the gradient owing to the cloud or residual layers, as shown in Fig. 6(c) and (d).

Fig. 6. Lidar profiles measured under different atmospheric conditions, 2D wavelet images and the corresponding PBLH images predicted by the CNN, where the horizontal axis refers to the dilation value from 200 m to 505 m and the vertical axis corresponds to the measurement altitude (343×372 pixels). (a) lidar profile measured at 16:13 on Aug. 24, where the high-contrast region corresponds to the position of the boundary layer, (b) lidar profile measured at 2:26 on Aug. 22, where the position of the boundary layer corresponds to the largest wavelet coefficients below the cloud or the residual layer, (c) lidar profile measured at 20:03 on Aug. 23, with a strong cloud echo, (d) lidar profile measured at 16:20 on Aug. 22, with multiple aerosol layers and cloud, (e) lidar profile measured at 6:30 on Aug. 25, with a residual layer, (f) lidar profile measured at 12:05 on Aug. 26, with multiple aerosol layers.

When two nearby local maxima appear in the wavelet image, the prediction result depends on the relative gradients as well as the spatial (or altitude) separation between the PBL and the interference layers. As shown in Fig. 6(e), the gradient pattern owing to the residual layer can be readily eliminated if the two gradient patterns are clearly separated from each other. On the contrary, if the two patterns with strong gradients tend to get closer or become blurred with increasing dilation value, the influence of the residual layer may not be completely eliminated and a residual mark may appear in the predicted PBLH image, as shown in Fig. 6(f). The predicted images with residual marks, which can be identified by statistically evaluating the pixel positions of the labels, should be removed in order to ensure the reliability of the subsequent evaluation of the PBLH.

The retrieval of the PBLH is accomplished by evaluating the PBLH-labelled images predicted by the well-trained CNN. The flowchart for the retrieval of the PBLH is shown in Fig. 7. On some occasions, there might be residual marks on the output image. Thus, each predicted image has been preprocessed before the PBLH retrieval by setting pixels with intensity less than 100 (maximum value 255) to zero, which eliminates weak interferences. A few PBLH-labelled images may still have residual marks, as shown in Fig. 6(f), and are excluded from the PBLH retrieval.


Fig. 7. The flowchart of determining the PBLH from the predicted PBLH-labelled image.


In an ideal case, the predicted PBLH-labelled image shows a single PBLH line indicating the PBLH at different dilation values, similar to the PBLH-labelled image during dataset preparation. However, as shown in Fig. 6, the predicted image exhibits a broadened PBLH line, which indicates the potential region of the PBLH. A simple approach for extracting the PBLH is to identify, for each dilation value, the altitude with the maximum wavelet coefficient in the region defined by the broadened PBLH line in the predicted image, which yields the position of the PBLH as well as the corresponding wavelet coefficient. In a simple case, the PBLH can be obtained by taking the median altitude over all dilation values. However, on some occasions the predicted image exhibits a broadened oblique line that changes considerably with increasing dilation value, as shown in Fig. 7. Under this circumstance, the maximum wavelet coefficient may decrease with increasing dilation value, and the PBLH could be overestimated for larger dilation values [32,33]. Thus, the values of the wavelet coefficient at different dilations should be considered for the retrieval of the PBLH. In this work, the PBLH is determined by the following rule

$$\textrm{PBLH} = \begin{cases} H_{a = \min}, & \textrm{if } \sum\limits_{\textrm{first third}} w_a > \sum\limits_{\textrm{last third}} w_a\\ \textrm{Median}(H_a), & \textrm{else} \end{cases}$$

Here $w_a$ and $H_a$ are the maximum wavelet coefficient and the corresponding altitude at the dilation value a. Specifically, if the sum of the wavelet coefficients in the first third of the PBLH line is larger than that in the last third, the PBLH is set to the altitude at the minimum dilation value ($a = \min$); otherwise, the PBLH is obtained from the median altitude over all dilation values.
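A minimal sketch combining the preprocessing threshold and the decision rule of Eq. (4) is given below; the function name and the array layout (rows for altitude, columns for dilation) are illustrative assumptions.

```python
import numpy as np

def retrieve_pblh(wavelet_img, predicted, z, threshold=100):
    """Extract the PBLH from a predicted PBLH-labelled image (Eq. (4)).

    wavelet_img : 2D wavelet coefficients (rows: altitude, columns: dilation)
    predicted   : CNN output image of the same shape, intensities 0-255
    z           : altitudes corresponding to the image rows
    """
    mask = predicted >= threshold              # drop weak interferences (<100)
    h_a, w_a = [], []
    for col in range(wavelet_img.shape[1]):    # one column per dilation value
        rows = np.flatnonzero(mask[:, col])
        if rows.size == 0:
            continue                           # no PBLH label at this dilation
        coeffs = wavelet_img[rows, col]
        best = rows[np.argmax(coeffs)]         # max coefficient inside the label
        h_a.append(z[best])
        w_a.append(coeffs.max())
    h_a, w_a = np.asarray(h_a), np.asarray(w_a)
    if h_a.size == 0:
        return None                            # e.g., residual-mark image excluded
    third = h_a.size // 3
    if third > 0 and w_a[:third].sum() > w_a[-third:].sum():
        return h_a[0]                          # altitude at the minimum dilation
    return float(np.median(h_a))               # median altitude over all dilations
```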

We have also evaluated lidar profiles measured in December 2020 and January 2021 by an imaging lidar system [57] to validate the performance. The measured data are transformed into wavelet images by the WCT and fed into the CNN model. The predicted PBLH-labelled images are processed by the image-processing algorithm described above, and the THI map with the PBLH retrieved by the deep-learning method is shown in Fig. 8. As can be seen, the retrieved PBLH is generally in good agreement with the temporal evolution of the lidar profile. The influence of clouds and residual layers appearing on, e.g., December 26 and January 18 can also be eliminated. Meanwhile, the PBLH retrieved from radiosonde data by the method in Section 2.2 is shown as triangles in Fig. 8. As can be seen, the measurement results obtained from the radiosonde data and the deep-learning method are in good agreement, proving the feasibility of using the deep learning method for atmospheric boundary layer studies.


Fig. 8. The THI map and the corresponding PBLH retrieved by the deep learning method based on the WCT. PBLHs retrieved in December 2020: radiosonde data – 293 m, 793 m and 270 m, deep learning – 299 m, 857 m, 319 m. PBLHs retrieved in January 2021: radiosonde data – 616 m, 562 m, 234 m and 587 m, deep learning – 638 m, 539 m, 219 m and 539 m.


3.2 Validation with the MPLNET product

Validation studies have also been carried out utilizing lidar data from the MPLNET [58,59] in order to further investigate the performance of the deep learning method. The MPLNET is a federated network of micro-pulse lidar (MPL) systems designed to measure aerosol/cloud vertical structure and boundary layer heights. Recently, a PBLH product has been released as part of the V3 PBL product for several MPLNET sites, based on the algorithm proposed by Lewis et al. [60]. The retrieval algorithm combines the WCT with a first-derivative Gaussian wavelet and Canny edge detection to identify features in lidar profiles. Meanwhile, the V3 PBL product contains up to three feature heights, which are candidates for the final PBLH: the altitude of the lowest detected feature and the altitudes of the two largest peaks in the WCT. A fuzzy-logic attribution scheme is then used to determine the best-estimate PBLH from the three feature heights to avoid various interferences. Nevertheless, it has been suggested that manual inspection should be performed to verify the retrieved PBLHs.

In this work, the normalized relative backscatter (NRB) and the PBL product measured over one month at the EPA-NCU site (24.9670° N, 121.1810° E) have been utilized for comparison studies, as shown in Fig. 9. Unfortunately, there is no publicly available radiosonde data near the EPA-NCU site for validation studies. Each range-corrected lidar profile measured by the 532-nm MPL system has been processed by the WCT up to 3 km altitude and then converted into a wavelet image according to the above method. The corresponding PBLH is then obtained from the predicted PBLH-labelled image. The temporal evolution of the PBLH obtained by the deep learning method without any further human intervention is shown in Fig. 9. The PBLH is absent when residual marks appear in the predicted PBLH-labelled image, as shown in Fig. 6(f).


Fig. 9. The THI map (Local time) measured at the site of EPA-NCU, the corresponding PBLH product (all square marks) and the PBLH retrieved by the deep learning method (circle marks). MPLNET-outlier refers to the PBLH values that are excluded for comparison studies through manual inspection on the lidar profiles and the THI map. MPLNET-only indicates that there is no output from the deep learning method.


As shown in Fig. 9, substantial variation can be observed in the PBLHs obtained from MPLNET in the period from September 13th to 16th, which is unrealistic for the real atmosphere. Thus, manual inspection of the PBLH product should be performed, as suggested by the MPLNET instruction. Typical lidar profiles and the corresponding wavelet coefficients are shown in Fig. 10. As can be seen from the lidar profiles shown in Fig. 10(a), the MPLNET product can identify the sharp attenuation of the lidar profile at about 1.05 km measured before 15:40. However, the PBLH at 15:50 was estimated to be 1.4 km, while the lowest feature height was still located at around 1.05 km. This implies that the fuzzy-logic scheme used in MPLNET may have misrecognized the PBLH at 15:50, considering the principle of continuity and the feature heights. Meanwhile, the PBLH predicted by the deep learning method demonstrated good temporal continuity during this period. A similar phenomenon has also been observed for lidar profiles measured at around 18:22 on September 15th, where the PBLHs obtained from MPLNET were significantly larger (beyond 2 km) compared with their neighboring results. In the presence of residual layers or aloft aerosols, the MPLNET product may overestimate the PBLH, as shown in Fig. 10(e). Nevertheless, the deep learning method was able to eliminate the influence of the residual layers and identify the PBLH correctly. As can be seen from the wavelet coefficients shown in Fig. 10(f), the PBLHs showed good temporal continuity, even though the wavelet coefficient at the boundary layer is smaller than that at the residual layer (dot-dashed curve). On some occasions when the boundary layer is quite low, e.g., lower than the blind range of the MPL system, the MPLNET product recognizes feature heights owing to local variations as the PBLH, as shown in Fig. 10(g) and (h). The deep learning method instead generates null values, as the PBLH cannot be obtained under these circumstances; even though a PBLH value is given by the MPLNET product, it is most likely overestimated.

Fig. 10. Different types of lidar profiles, i.e., (a), (c), (e), (g), and the corresponding wavelet coefficients with a dilation value of 200 m, i.e., (b), (d), (f), (h). The PBLHs retrieved by the deep learning method and from the MPLNET product are marked. Meanwhile, the feature heights, indicating the candidates for the PBLH, are also provided by the MPLNET.

Thus, the MPLNET product was manually screened, and PBLHs that exhibited large variations or misrecognized residual layers were excluded as outliers, which should not be used for comparison studies. As shown in Fig. 9, the pink square marks are the PBLHs that have been identified as outliers by manual inspection. After eliminating these obvious outliers in the MPLNET product, the PBLH profiles measured by the deep learning method and from the MPLNET product agree well with each other; their relationship is shown in Fig. 11. The correlation coefficient was found to be 0.81, which verifies the promising performance of the deep learning method. The stripe appearance in the correlation figure is mainly owing to the continuity strategy used in the MPLNET algorithm [60], which sets a PBLH having a large discrepancy (>150 m) with neighboring PBLHs to the baseline PBLH.


Fig. 11. The correlation between the PBLH retrieved by the deep learning method and the MPLNET V3 product after manual inspection. The data points are plotted when both approaches can retrieve the PBLH.


4. Conclusions

In recent years, WCT-based inversion algorithms have been widely employed for the retrieval of the PBLH from lidar profiles. However, previous WCT-based algorithms suffer from challenges in the selection of dilation values, interferences from clouds or residual layers, etc. In order to overcome these difficulties, we have proposed and demonstrated a novel deep-learning method based on the WCT for the PBLH retrieval in this work. Lidar profiles measured by a Scheimpflug lidar system have been processed according to the WCT with a series of dilation values from 200 m to 505 m to generate 2D wavelet images. 48176 wavelet images and the corresponding PBLH-labelled images were created as the training set. Measured data from December 2020 and January 2021 were prepared to validate the performance of the proposed CNN. The PBLH is then obtained by evaluating the position and intensity of the PBLH-labelled images output by the CNN.

Comparison studies with radiosonde data have also been carried out to further validate the prediction results. It has been found that the PBLHs retrieved by the deep learning method and from the radiosonde are in good agreement. Validation studies with the PBLH product from the MPLNET site of EPA-NCU have been carried out over about one month. It has been found that the deep learning method can efficiently eliminate the influence of residual layers and demonstrates better temporal consistency compared with the MPLNET product. In cases of sophisticated atmospheric structures, the predicted PBLH-labelled images may have residual marks and are omitted to ensure the reliability of the retrieved PBLH. If the boundary layer is quite low, e.g., lower than the blind range of the MPL system, the deep learning method could generate null values, while the MPLNET product may overestimate the PBLH. Nevertheless, a correlation coefficient of 0.81 has been achieved through the one-month analysis of the PBLHs obtained from the MPLNET product after manual inspection and from the deep learning method.

In summary, the promising results have proved the feasibility of utilizing the deep-learning method based on the WCT for automated retrieval of the PBLH, with the capability of eliminating influences from clouds and residual layers in practical atmospheric applications. The validation studies with the MPLNET product also indicate that the present deep-learning method can be directly applied to lidar profiles measured by conventional pulsed lidar systems. Meanwhile, the present deep learning method may be modified for the identification of clouds or the retrieval of the PBLH from satellite lidar measurements by changing the training strategy. Long-term (e.g., one year or even longer) comparison studies are also planned for the near future.

Funding

National Natural Science Foundation of China (62075025); Dalian High-Level Talent Innovation Program (2020RQ018).

Acknowledgments

The authors greatly acknowledge Zheng Kong and Teng Ma for their help with the experimental work. The MPLNET project is funded by the NASA Radiation Sciences Program and Earth Observing System. We thank the MPLNET (PI: Carlo Wang) for the effort in establishing and maintaining the EPA-NCU site. The authors also wish to thank the editors for the extension of the due date for revision.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper can be obtained from the authors upon request.

References

1. C. L. Du, S. Y. Liu, X. Yu, X. M. Li, C. Chen, Y. Peng, Y. Dong, Z. P. Dong, and F. Q. Wang, “Urban boundary layer height characteristics and relationship with particulate matter mass concentrations in Xi’an, central China,” Aerosol Air Qual. Res. 13(5), 1598–1607 (2013). [CrossRef]  

2. Y. R. Zhao, W. Q. Mao, K. Q. Zhang, Y. N. Ma, H. F. Li, and W. Y. Zhang, “Climatic variations in the boundary layer height of Arid and Semiarid areas in East Asia and North Africa,” J. Meteorol. Soc. Jpn. 95(3), 181–197 (2017). [CrossRef]

3. A. Molod, H. Salmun, and M. Dempsey, “Estimating planetary boundary layer heights from NOAA profiler network wind profiler data,” J. Atmos. Oceanic Technol. 32(9), 1545–1561 (2015). [CrossRef]

4. R. B. Stull, An Introduction to Boundary Layer Meteorology (Springer, 1988).

5. S. A. Cohn and W. M. Angevine, “Boundary layer height and entrainment zone thickness measured by lidars and wind-profiling radars,” J. Appl. Meteorol. 39(8), 1233–1247 (2000). [CrossRef]  

6. B. Hennemuth and A. Lammert, “Determination of the atmospheric boundary layer height from radiosonde and lidar backscatter,” Boundary-Layer Meteorol 120(1), 181–200 (2006). [CrossRef]  

7. T. Rose, S. Crewell, U. Lohnert, and C. Simmer, “A network suitable microwave radiometer for operational monitoring of the cloudy atmosphere,” Atmos. Res. 75(3), 183–200 (2005). [CrossRef]  

8. Z. Wang, X. Cao, L. Zhang, J. Notholt, B. Zhou, R. Liu, and B. Zhang, “Lidar measurement of planetary boundary layer height and comparison with microwave profiling radiometer observation,” Atmos. Meas. Tech. 5(8), 1965–1972 (2012). [CrossRef]  

9. D. Cimini, F. De Angelis, J. C. Dupont, S. Pal, and M. Haeffelin, “Mixing layer height retrievals by multichannel microwave radiometer observations,” Atmos. Meas. Tech. 6(11), 2941–2951 (2013). [CrossRef]

10. U. Saeed, F. Rocadenbosch, and S. Crewell, “Performance test of the synergetic use of simulated lidar and microwave radiometer observations for mixing-layer height detection,” in Remote Sensing of Clouds and the Atmosphere XX, Proc. SPIE 9640, 964008 (2015). [CrossRef]

11. J. P. Guo, Y. C. Miao, Y. Zhang, H. Liu, Z. Q. Li, W. C. Zhang, J. He, M. Y. Lou, Y. Yan, L. G. Bian, and P. Zhai, “The climatology of planetary boundary layer height in China derived from radiosonde and reanalysis data,” Atmos. Chem. Phys. 16(20), 13309–13319 (2016). [CrossRef]  

12. E. Nelson, R. Stull, and E. Eloranta, “A prognostic relation for entrainment zone thickness,” J. Appl. Meteorol. 28(9), 885–903 (1989). [CrossRef]  

13. R. Boers, E. W. Eloranta, and R. L. Coulter, “Lidar observations of mixed layer dynamics - Tests of parameterized entrainment models of mixed layer growth-rate,” J. Clim. Appl. Meteorol. 23(2), 247–266 (1984). [CrossRef]  

14. C. Flamant, J. Pelon, P. H. Flamant, and P. Durand, “Lidar determination of the entrainment zone thickness at the top of the unstable marine atmospheric boundary layer,” Boundary-Layer Meteorol 83(2), 247–284 (1997). [CrossRef]

15. K. J. Davis, D. H. Lenschow, S. P. Oncley, C. Kiemle, G. Ehret, A. Giez, and J. Mann, “Role of entrainment in surface-atmosphere interactions over the boreal forest,” J. Geophys. Res. 102(D24), 29219–29230 (1997). [CrossRef]  

16. L. M. Russell, D. H. Lenschow, K. K. Laursen, P. B. Krummel, S. T. Siems, A. R. Bandy, D. C. Thornton, and T. S. Bates, “Bidirectional mixing in an ACE 1 marine boundary layer overlain by a second turbulent layer,” J. Geophys. Res. 103(D13), 16411–16432 (1998). [CrossRef]  

17. T. Yang, Z. F. Wang, W. Zhang, A. Gbaguidi, N. Sugimoto, X. Q. Wang, I. Matsui, and Y. L. Sun, “Technical note: Boundary layer height determination from lidar for improving air pollution episode modeling: development of new algorithm and evaluation,” Atmos. Chem. Phys. 17(10), 6215–6225 (2017). [CrossRef]  

18. W. Wang, F. Y. Mao, W. Gong, Z. X. Pan, and L. Du, “Evaluating the governing factors of variability in nocturnal boundary layer height based on elastic lidar in Wuhan,” IJERPH 13(11), 1071 (2016). [CrossRef]  

19. D. Summa, P. Di Girolamo, D. Stelitano, and M. Cacciani, “Characterization of the planetary boundary layer height and structure by Raman lidar: comparison of different approaches,” Atmos. Meas. Tech. 6(12), 3515–3525 (2013). [CrossRef]

20. G. Tsaknakis, A. Papayannis, P. Kokkalis, V. Amiridis, H. D. Kambezidis, R. E. Mamouri, G. Georgoussis, and G. Avdikos, “Inter-comparison of lidar and ceilometer retrievals for aerosol and Planetary Boundary Layer profiling over Athens, Greece,” Atmos. Meas. Tech. 4(6), 1261–1273 (2011). [CrossRef]  

21. P. P. Sullivan, C. H. Moeng, B. Stevens, D. H. Lenschow, and S. D. Mayor, “Structure of the entrainment zone capping the convective atmospheric boundary layer,” J. Atmos. Sci. 55(19), 3042–3064 (1998). [CrossRef]  

22. M. C. Coen, C. Praz, A. Haefele, D. Ruffieux, P. Kaufmann, and B. Calpini, “Determination and climatology of the planetary boundary layer height above the Swiss plateau by in situ and remote sensing measurements as well as by the COSMO-2 model,” Atmos. Chem. Phys. 14(23), 13205–13221 (2014). [CrossRef]  

23. W. P. Hooper and E. W. Eloranta, “Lidar measurements of wind in the planetary boundary layer - the method, accuracy and results from joint measurements with radiosonde and kytoon,” J. Clim. Appl. Meteorol. 25(7), 990–1001 (1986). [CrossRef]  

24. A. K. Piironen and E. W. Eloranta, “Convective boundary layer mean depths and cloud geometrical properties obtained from volume imaging lidar data,” J. Geophys. Res. 100(D12), 25569–25576 (1995). [CrossRef]  

25. L. Menut, C. Flamant, and J. Pelon, “Evidence of interaction between synoptic and local scales in the surface layer over the Paris area,” Boundary-Layer Meteorol 93(2), 269–286 (1999). [CrossRef]

26. R. M. Endlich, F. L. Ludwig, and E. E. Uthe, “An automatic method for determining the mixing depth from lidar observations,” Atmos. Environ. 13(7), 1051–1056 (1979). [CrossRef]  

27. K. L. Hayden, K. G. Anlauf, R. M. Hoff, J. W. Strapp, J. W. Bottenheim, H. A. Wiebe, F. A. Froude, J. B. Martin, D. G. Steyn, and I. G. McKendry, “The vertical chemical and meteorological structure of the boundary layer in the Lower Fraser Valley during Pacific ‘93,” Atmos. Environ. 31(14), 2089–2105 (1997). [CrossRef]  

28. D. G. Steyn, M. Baldi, and R. M. Hoff, “The detection of mixed layer depth and entrainment zone thickness from lidar backscatter profiles,” J. Atmos. Oceanic Technol. 16(7), 953–959 (1999). [CrossRef]

29. P. Hageli, D. G. Steyn, and K. B. Strawbridge, “Spatial and temporal variability of mixed-layer depth and entrainment zone thickness,” Boundary-Layer Meteorol 97(1), 47–71 (2000). [CrossRef]

30. N. Eresmaa, A. Karppinen, S. M. Joffre, J. Rasanen, and H. Talvitie, “Mixing height determination by ceilometer,” Atmos. Chem. Phys. 6(6), 1485–1493 (2006). [CrossRef]  

31. T. M. Mok and C. Z. Rudowicz, “A lidar study of the atmospheric entrainment zone and mixed layer over Hong Kong,” Atmos. Res. 69(3-4), 147–163 (2004). [CrossRef]  

32. K. J. Davis, N. Gamage, C. R. Hagelberg, C. Kiemle, D. H. Lenschow, and P. P. Sullivan, “An objective method for deriving atmospheric structure from airborne lidar observations,” J. Atmos. Oceanic Technol. 17(11), 1455–1468 (2000). [CrossRef]  

33. I. M. Brooks, “Finding boundary layer top: Application of a wavelet covariance transform to lidar backscatter profiles,” J. Atmos. Oceanic Technol. 20(8), 1092–1105 (2003). [CrossRef]  

34. R. J. Dang, H. Li, Z. G. Liu, and Y. Yang, “Statistical analysis of relationship between daytime lidar-derived planetary boundary layer height and relevant atmospheric variables in the Semiarid region in northwest China,” Advances in Meteorology 2016, 1–13 (2016). [CrossRef]  

35. H. Baars, A. Ansmann, R. Engelmann, and D. Althausen, “Continuous monitoring of the boundary-layer top with lidar,” Atmos. Chem. Phys. 8(23), 7281–7296 (2008). [CrossRef]  

36. R. F. Banks, J. Tiana-Alsina, F. Rocadenbosch, and J. M. Baldasano, “Performance evaluation of the boundary-Layer height from lidar and the weather research and forecasting model at an urban coastal site in the north-east Iberian Peninsula,” Boundary-Layer Meteorol 157(2), 265–292 (2015). [CrossRef]  

37. M. J. Granados-Muñoz, F. Navas-Guzmán, J. A. Bravo-Aranda, J. L. Guerrero-Rascado, H. Lyamani, J. Fernández-Gálvez, and L. Alados-Arboledas, “Automatic determination of the planetary boundary layer height using lidar: One-year analysis over southeastern Spain,” J. Geophys. Res. 117(D18), D18208 (2012). [CrossRef]

38. J. C. Compton, R. Delgado, T. A. Berkoff, and R. M. Hoff, “Determination of planetary boundary layer height on short spatial and temporal scales: a demonstration of the covariance wavelet transform in ground-based wind profiler and lidar measurements,” J. Atmos. Oceanic Technol. 30(7), 1566–1575 (2013). [CrossRef]

39. V. Caicedo, B. Rappengluck, B. Lefer, G. Morris, D. Toledo, and R. Delgado, “Comparison of aerosol lidar retrieval methods for boundary layer height detection using ceilometer aerosol backscatter data,” Atmos. Meas. Tech. 10(4), 1609–1622 (2017). [CrossRef]  

40. Y. Morille, M. Haeffelin, P. Drobinski, and J. Pelon, “STRAT: An automated algorithm to retrieve the vertical structure of the atmosphere from single-channel lidar data,” J. Atmos. Oceanic Technol. 24(5), 761–775 (2007). [CrossRef]

41. F. Angelini, F. Barnaba, T. C. Landi, L. Caporaso, and G. P. Gobbi, “Study of atmospheric aerosols and mixing layer by lidar,” Radiation Protection Dosimetry 137(3-4), 275–279 (2009). [CrossRef]  

42. G. Martucci, C. Milroy, and C. D. O’Dowd, “Detection of cloud-base height using Jenoptik CHM15K and Vaisala CL31 ceilometers,” J. Atmos. Oceanic Technol. 27(2), 305–318 (2010). [CrossRef]

43. M. Haeffelin, F. Angelini, Y. Morille, G. Martucci, S. Frey, G. P. Gobbi, S. Lolli, C. D. O’Dowd, L. Sauvage, I. Xueref-Remy, B. Wastine, and D. G. Feist, “Evaluation of mixing-height retrievals from automatic profiling lidars and ceilometers in view of future integrated networks in Europe,” Boundary-Layer Meteorol 143(1), 49–75 (2012). [CrossRef]  

44. R. J. Dang, Y. Yang, X. M. Hu, Z. T. Wang, and S. W. Zhang, “A review of techniques for diagnosing the atmospheric boundary layer height (ABLH) using aerosol lidar data,” Remote Sensing 11(13), 1590 (2019). [CrossRef]  

45. Z. Liu, L. Li, H. Li, and L. Mei, “Preliminary studies on atmospheric monitoring by employing a portable unmanned Mie-scattering Scheimpflug lidar system,” Remote Sensing 11(8), 937 (2019). [CrossRef]  

46. L. Mei, P. Guan, Y. Yang, and Z. Kong, “Atmospheric extinction coefficient retrieval and validation for the single-band Mie-scattering Scheimpflug lidar technique,” Opt. Express 25(16), A628–A638 (2017). [CrossRef]  

47. L. Mei and M. Brydegaard, “Atmospheric aerosol monitoring by an elastic Scheimpflug lidar system,” Opt. Express 23(24), A1613 (2015). [CrossRef]  

48. L. Mei, L. Zhang, Z. Kong, and H. Li, “Noise modeling, evaluation and reduction for the atmospheric lidar technique employing an image sensor,” Opt. Commun. 426, 463–470 (2018). [CrossRef]  

49. http://weather.uwyo.edu/upperair/bufrraob.shtml.

50. S. Y. Liu and X. Z. Liang, “Observed diurnal cycle climatology of planetary boundary layer height,” J. Climate 23(21), 5790–5809 (2010). [CrossRef]  

51. S. M. Joffre, M. Kangas, M. Heikinheimo, and S. A. Kitaigorodskii, “Variability of the stable and unstable atmospheric boundary-layer height and its scales over a boreal forest,” Boundary-Layer Meteorol 99(3), 429–450 (2001). [CrossRef]

52. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proceedings of the International Conference on Learning Representations (ICLR) (San Diego, CA, USA, 2015).

53. C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” J. Big Data 6(1), 60 (2019). [CrossRef]

54. G. Bertasius, J. Shi, and L. Torresani, “DeepEdge: A multiscale bifurcated deep network for top-down contour detection,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (Boston, MA, USA, 2015).

55. S. N. Xie and Z. W. Tu, “Holistically-nested edge detection,” Int. J. Comput. Vis. 125(1-3), 3–18 (2017). [CrossRef]

56. Y. Liu, M. M. Cheng, X. W. Hu, J. W. Bian, L. Zhang, X. Bai, and J. H. Tang, “Richer convolutional Features for edge detection,” IEEE Trans. Pattern Anal. Mach. Intell. 41(8), 1939–1946 (2019). [CrossRef]  

57. Z. Kong, T. Ma, K. Zheng, Y. Cheng, Z. Gong, D. Hua, and L. Mei, “Development of an all-day polarization sensitive imaging lidar for atmospheric depolarization studies based on the division-of-focal-plane method,” Opt. Express 29(23), 38512–38526 (2021). [CrossRef]

58. E. J. Welton, J. R. Campbell, J. D. Spinhirne, and V. S. Scott, “Global monitoring of clouds and aerosols using a network of micro-pulse lidar systems,” Lidar Remote Sensing for Industry and Environment Monitoring 4153, 151–158 (2001). [CrossRef]  

59. J. R. Campbell, D. L. Hlavka, E. J. Welton, C. J. Flynn, D. D. Turner, J. D. Spinhirne, V. S. Scott, and I. H. Hwang, “Full-time, eye-safe cloud and aerosol lidar observation at atmospheric radiation measurement program sites: Instruments and data processing,” J. Atmos. Oceanic Technol. 19(4), 431–442 (2002). [CrossRef]  

60. J. R. Lewis, E. J. Welton, A. M. Molod, and E. Joseph, “Improved boundary layer depth retrievals from MPLNET,” J. Geophys. Res. Atmos. 118(17), 9870–9879 (2013). [CrossRef]  
