Abstract
3D single-photon LiDAR imaging has an important role in many applications. However, full deployment of this modality will require the analysis of low signal-to-noise ratio target returns and very high volumes of data. This is particularly evident when imaging through obscurants or in high ambient background light conditions. This paper proposes a multiscale approach for 3D surface detection from the photon timing histogram to permit a significant reduction in data volume. The resulting surfaces are background-free and can be used to infer depth and reflectivity information about the target. We demonstrate this by proposing a hierarchical Bayesian model for 3D reconstruction and spectral classification of multispectral single-photon LiDAR data. The reconstruction method promotes spatial correlation between point-cloud estimates and uses a coordinate gradient descent algorithm for parameter estimation. Results on simulated and real data show the benefits of the proposed target detection and reconstruction approaches when compared to state-of-the-art processing algorithms.
Published by Optica Publishing Group under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.
1. Introduction
Single-photon (SP) light detection and ranging (LiDAR) based on the time-correlated single-photon counting technique is receiving significant interest as an emerging approach in numerous applications such as automotive LiDAR [1–3], remote sensing for environmental sciences [4,5], and imaging at long range [6–11], through obscurants [12–14] or with multiple wavelengths [15–17]. An SP-LiDAR system operates by illuminating the scene using laser pulses and recording the arrival times of the reflected photons with respect to their emission times using an SP detector, typically a single-photon avalanche diode (SPAD) detector. A histogram of photon counts with respect to time-of-flight is constructed for each individual SPAD pixel. By using a focal plane array composed of multiple SPAD detectors, it is possible to image the entire optical field simultaneously. This process can be repeated using different laser wavelength illumination to obtain additional multispectral information on the observed scene. The resulting timing histograms enable object detection [18,19] as well as estimation of the target’s depth and reflectivity profiles [17,20].
However, SP-LiDAR faces several challenges that limit further deployment to real-world applications. Large data volume constitutes a major challenge for SP-LiDAR imaging due to the acquisition of millions of events per second, typically stored in large histogram cubes. This challenge is more evident when using multiple wavelengths to observe the scene and when imaging through scattering media, where the target return signal is attenuated and the background noise is significantly increased due to optical backscatter, as when imaging through atmospheric obscurants or underwater [12–14,17]. When LiDAR is used in atmospheric obscurants, the backscattered return depends both on the distance of the scattering event from the transceiver and on the optical configuration used, typically resulting in a non-uniform background noise level in the timing histogram [21]. A similar non-uniformity is also observed as a result of the pile-up effect [22,23] caused by high counting rates, where the probability of a second event within a given timing period becomes too high, leading to a higher background level near the beginning of the timing period. Another limitation is the detection of multiple surfaces per pixel, which usually occurs when imaging through semi-transparent materials (e.g., windows, camouflage) or in long-range depth profiling, as shown in [24].
Several approaches have been proposed to deal with these challenges. To reduce data volume, the authors of [18,19] formulated a surface detection problem using a Bayesian framework to determine whether a pixel contained photons related to a surface or only spurious background counts due to ambient light or photons scattered by the observation environment (air, water, etc.). The approach in [20] pushed the problem further by determining the material of a detected surface using multispectral imaging. However, these approaches assume a single peak per pixel and operate on each pixel independently (pixel-wise), hence not exploiting spatial correlation between pixels to improve the detection. Multiple solutions have also been proposed to improve the robustness of the processing in low-light or high-background scenarios [25–28]. Altmann et al. [25] introduced a Bayesian approach to regularise the data fidelity term, while Shin et al. [26], Halimi et al. [28], and Rapp and Goyal [27] proposed optimisation approaches as a means to regularise the maximum likelihood solution. However, none of these approaches is adapted to scenarios where the one-surface-per-pixel assumption is not valid. In that regard, many papers have been dedicated to tackling the multiple-return problem in the context of 3D LiDAR imaging. Bayesian approaches [24,29] recovered the 3D profile of a target with multiple returns per pixel. These approaches used reversible jump Markov chain Monte Carlo (RJ-MCMC) to sample from the posterior distribution of interest and promote spatial correlation between points belonging to the same surface, showing good performance. However, RJ-MCMC incurs a large computational cost, which hinders its use for real-time imaging.
3D reconstruction from a point-cloud point of view has also shown promising results in [30,31] in tackling an unknown number of objects per pixel, but these approaches do not provide uncertainty about the estimates. This shows the need for a fast and robust pre-processing algorithm to reduce data volume and extract only the useful information, which can then be used to perform higher-level processing with quantified uncertainty.
This paper proposes a saliency-based pre-processing algorithm to perform 3D target detection while accounting for multiscale information and the possible presence of multiple peaks per pixel. The resulting 3D map can be used to filter the raw data so that only useful regions are considered, hence reducing data volume and eliminating background counts. The resulting clean data can be used to perform higher-level inference as a post-processing step (see Fig. 1). We propose a hierarchical Bayesian model and a corresponding estimation algorithm for spectral classification and 3D reconstruction of the detected surfaces. Comparisons with state-of-the-art algorithms show that the proposed framework is robust and provides good results even with a high background level and in a low photon return regime.
To summarize, the main contributions of the paper are:
- • A new computationally efficient 3D target detection algorithm that uses multiscale information to locate an unknown number of surfaces per pixel when the background noise can be non-uniform.
- • A new hierarchical Bayesian algorithm that performs 3D spectral classification and scene reconstruction while promoting spatial correlation between neighbouring surfaces in the 3D point-cloud.
The paper is structured as follows. The problem formulation and the proposed 3D target detection algorithm are presented in Section 2. The proposed Bayesian model and estimation algorithm for 3D spectral classification and scene reconstruction are presented in Section 3. Results on simulated and real data are analysed in Section 4. The conclusions and future work are finally reported in Section 5.
2. Pre-processing: 3D target detection
2.1 Observation model
We consider 3D histogram cubes $\boldsymbol {Y}$ of LiDAR photon counts of dimension $N \times L \times T$, where $N$ is the number of scanned spatial positions (i.e., $N = N_{\textrm {row}} \times N_{\textrm {col}}$ pixels), $L$ is the number of spectral wavelengths and $T$ is the number of time bins. Akin to [27,29,32], each photon count $y_{n,l,t}$, where ${n \in \{1,\ldots,N\}}$, $l \in \{1,\ldots,L\}$ and $t \in \{1,\ldots,T\}$, is assumed to follow a Poisson distribution as follows:
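The Poisson observation model can be illustrated with a short simulation for a single pixel and wavelength. The sketch below assumes the usual form of this model (signal photons shaped by the instrumental response plus a background level, as in [27,29,32]), with a Gaussian approximation of the impulse response and a uniform background; all variable names are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 164                     # number of time bins (as in the simulated scene of Section 4)
t = np.arange(T)
d_n = 80                    # depth bin of the surface in this pixel
r_nl = 5.0                  # reflectivity: expected number of signal photons
b_nl = 0.02                 # background level per bin (assumed uniform here)

# Gaussian approximation of the instrumental response g(t), normalised to unit area
sigma = 2.0
g = np.exp(-0.5 * ((t - d_n) / sigma) ** 2)
g /= g.sum()

# Poisson observation: each bin count is an independent Poisson draw
y = rng.poisson(r_nl * g + b_nl)
```

The histogram `y` concentrates a few signal counts around bin `d_n` on top of sparse background counts, which is the regime the detection algorithm of Section 2.2 operates in.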
2.2 Saliency-based target detection
In this work, we extend the saliency framework proposed in [33,34] to high-dimensional data such as LiDAR data. This method has inspired several state-of-the-art algorithms that deal with RGB [35] or RGB-D data [36]. The approach can efficiently highlight the salient regions of a scene while preserving objects’ boundaries and discarding high-frequency components due to noise, texture, or blocking artifacts. In the context of SP LiDAR, we propose to compute the saliency matrix $\boldsymbol {S}$ of size $N \times T$ on the observed cube of histograms, as follows:
The saliency cube in (2) contains voxels of background, and others originating from target peaks. To detect salient regions, we assume the saliency values of a background voxel to follow a Gamma distribution $\mathcal {G}$, and signal voxels to have positive values, as follows
A simple hypothesis test allows us to separate signal from background voxels. This is obtained by first fitting a gamma distribution to ${\boldsymbol {S}}$, followed by a thresholding of the saliency matrix $S > \textrm {Thresh}(\alpha _b, \beta _b)$, where $\textrm {Thresh}(\alpha _b, \beta _b)$ depends on the gamma parameters and can be fixed based on the desired probability of false alarm. This results in a binary 3D map, denoted ${\boldsymbol {M}}$, of size $N \times T$, distinguishing regions of target returns (i.e., white voxels in Fig. 1) from those due to background (i.e., black voxels in Fig. 1). The saliency-based target detection algorithm is summarized in Algo. 1.
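The fit-and-threshold step can be sketched as follows. This is a minimal illustration on synthetic saliency values (the gamma parameters and the contamination by signal voxels are invented for the example); the threshold is the gamma quantile corresponding to the chosen probability of false alarm.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(1)

# Synthetic saliency values: mostly background (gamma-distributed),
# with a small fraction of larger values from target returns
s_bg = rng.gamma(shape=2.0, scale=0.5, size=10000)
s_sig = rng.gamma(shape=2.0, scale=0.5, size=50) + 8.0
S = np.concatenate([s_bg, s_sig])

# Fit a gamma distribution to the (background-dominated) saliency values
alpha_b, loc_b, beta_b = gamma.fit(S, floc=0)

# Threshold at the quantile matching the desired probability of false alarm
p_fa = 1e-3
thresh = gamma.ppf(1 - p_fa, alpha_b, loc=loc_b, scale=beta_b)

M = S > thresh  # binary detection map: True for likely signal voxels
```

Lowering `p_fa` raises the threshold, trading missed detections against false alarms, which is the tunable behaviour described above.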
2.3 Background estimation
Our approach requires an estimate of the background level as indicated in (2). The presence of obscurants causes the background level $b_{n,l,t}$ to be non-uniformly distributed w.r.t. pixels, wavelengths, and/or time bins (depth dimension). Our background estimation strategy assumes $b_{n,l,t}$ to be smooth, which is the case if the obscurant is spatially homogeneous. This implies that the low-pass filtered histogram $y^{Q}_{n,t,l}$, where $Q$ is the coarsest scale of the histogram, can be represented by the sum of sparse signal returns (related to the reflected target) and a 3D smooth background $\hat {b}_{n,l,t}$. Akin to [17], an efficient estimation can be obtained by assuming the same temporal shape of the background for all pixels as follows
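One plausible reading of this strategy can be sketched as below: low-pass filter the histograms so that sparse peaks average out, estimate a shared temporal background shape robustly across pixels, then fit a per-pixel scale. This is an assumed implementation for illustration only (the median estimator and least-squares scaling are choices made here, not necessarily those of [17]), on synthetic data.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

rng = np.random.default_rng(2)

N, T = 100, 200
# Smooth, non-uniform temporal background shape shared by all pixels
# (e.g., decaying backscatter), modulated by a per-pixel scale factor
shape_t = 0.5 + 0.4 * np.exp(-np.arange(T) / 40.0)
scale_n = rng.uniform(0.5, 1.5, size=N)
Y = rng.poisson(scale_n[:, None] * shape_t[None, :])

# Coarse-scale (low-pass filtered) histograms: sparse signal peaks are
# averaged out, leaving mostly the smooth background
Y_q = uniform_filter1d(Y.astype(float), size=15, axis=1)

# Shared temporal shape: the median over pixels is robust to residual peaks
b_shape = np.median(Y_q, axis=0)
b_shape /= b_shape.sum()

# Per-pixel scale via least-squares projection onto the shared shape
scale_hat = Y_q @ b_shape / (b_shape @ b_shape)
B_hat = scale_hat[:, None] * b_shape[None, :]
```

Sharing one temporal shape across pixels pools all the counts when estimating the background profile, which is what makes the estimate usable even at low photon counts per pixel.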
3. Post-processing: Bayesian 3D reconstruction and spectral classification
Given the estimated 3D target map and background level, one can isolate target peaks and approximate their clean histograms of counts by removing the estimated background returns (see Section 3.3). As a result, we obtain $N_s = N \times K_s$ small histograms called surfaces, where $K_s$ represents the maximum number of surfaces per pixel, i.e., some pixels might have fewer surfaces. These small histograms can then be used to perform higher-level processing such as object recognition, classification, etc. In this paper, we focus on 3D reconstruction and spectral classification of the observed objects. More precisely, we assume the availability of a spectral library containing $K$ objects (defined by their $K$ spectral signatures ${\boldsymbol {m}}_k,$ for $k \in 1,\ldots, K$) and aim to spatially localise the objects contained in the scene. In this case, the observation model becomes
where $r_{s,l}$ represents the spectral signature of the $s$th surface, and $h_s$ is a positive coefficient allowing the model to account for different illumination conditions. Indeed, an object can be observed under different illumination conditions (e.g., shadowed regions) leading to the same spectral shape but different amplitude levels. It is worth noting that the model in (7) uses surface indices $s$ instead of pixels, where in comparison to (1), a surface can be associated with the $c$th peak of the $n$th pixel. Note also that (7) assumes the absence of the background, as it was removed in the pre-processing step. The joint likelihood, when assuming the observed surfaces, wavelengths and bins mutually independent, is then given by:
3.1 Prior distributions
Prior of ${\boldsymbol {R}}$: Akin to [18,20,37], the classification problem intends to assign to each pixel a label associated with one of the $K$ known spectral classes. The reflectivity conjugate prior accounts for this effect by considering a mixture of $K$ gamma distributions as follows:
Prior of ${\boldsymbol {H}}$: The reflectivity modulation parameter $h_s$ should be positive and is assumed to vary smoothly from one pixel to neighbouring ones. These properties are promoted by assigning a hidden gamma-MRF (GMRF) [38] prior to $\textbf {h}$, which also ensures the tractability of the resulting posterior distribution [13,25]. More precisely, we introduce an ${N}_\textrm {row} \times N_\textrm {col} \times K_s$ auxiliary tensor with elements $w_s \in \mathbb {R}^+$ and define a 3D graph between $\textbf {H}$ and ${\boldsymbol {W}}$ such that each $h_s$ is associated with 4$K_s$ elements of ${\boldsymbol {W}}$ and vice-versa (see Fig. 2(a)). The resulting joint distribution is computationally compelling as the conditional prior distributions of $h$ and $w$ reduce to conjugate inverse gamma ($\mathcal {IG}$) and gamma ($\mathcal {G}$) distributions as follows:
Prior of ${\boldsymbol {u}}$: A Potts prior is assigned to the class variable ${\boldsymbol {u}}$ to promote spatial correlation between class labels. The prior of ${\boldsymbol {u}}$ is then obtained using the Hammersley-Clifford theorem
Prior of ${\boldsymbol {d}}$: In absence of additional knowledge, the depth parameter $d_n$ is assigned a non-informative uniform prior as follows:
Nonetheless, the depth prior can be adjusted in the event of additional information regarding the position of a target.
Prior of $\boldsymbol {\beta }$: The hyper-parameter $\beta _{k,l}$ controls the mean and variance of the unknown spectral signatures ${\boldsymbol {R}}$ for each class $k$. The $\beta _{k,l}$ hyper-parameter is assigned an inverse-gamma distribution
where $\upsilon _{k,l}$ and $\epsilon _{k,l}$ are fixed values that are related to the mean and variance of the hyper-parameter $\beta _{k,l}$.
3.2 Joint posterior distribution
From the joint likelihood in (8) and the prior distributions considered in Section 3.1, we can build the joint posterior distribution for $\boldsymbol {\Theta }=\left (\boldsymbol {r},\boldsymbol {B},{\boldsymbol {d}}, {\boldsymbol {u}}, {\boldsymbol {h}}, {\boldsymbol {w}} \right )$ given the 3D histograms $\boldsymbol {Y_s}$ and the hyper-parameters $\boldsymbol {\Theta _{H}}=\left (K, \boldsymbol {\Upsilon }, \boldsymbol {A}, \rho, \boldsymbol {C} \right )$, where $\boldsymbol {A}$, and $\boldsymbol {\Upsilon }$ gather $\{\alpha \}_{k,l}$, $\{\nu, \epsilon \}_{k,l}$, respectively. Using Bayes rule, the joint posterior distribution can be formulated as follows:
The directed acyclic graph (DAG) of the proposed hierarchical Bayesian model is summarized in Fig. 2(b).
3.3 Graph estimation - multiscale information
This section describes how to obtain clean surfaces of counts interconnected by a graph. Given the multiscale histograms $\boldsymbol {Y}^{q}_{l}$, the estimated noise $\hat {{\boldsymbol {B}}}_l$, and the 3D binary detection map, one can approximate the background-free signal on detected surfaces as follows
For readability purposes, the star ${*}$ and the selected scale ${\bar {q}_s}$ are omitted from the denoised surface histogram in (7) and the following equations, so that ${}^{*}y^{{\bar {q}_s}}_{s,l,t}$ is simply denoted ${y}_{s,l,t}$. Note that the proposed model assumes a maximum of $K_s$ surfaces per pixel. However, some pixels might have fewer surfaces, in which case we introduce a binary matrix $C$ of size $N_\textrm {row} \times N_\textrm {col} \times K_s$ to indicate whether a surface is detected. If a surface $s$ is not detected, we set $c_{s,:} = 0$ and all connections to this surface will have a null weight (see Fig. 2(a)).
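The extraction of background-free surface counts can be sketched as follows: the detection map masks the histogram, and the estimated background is subtracted (clipped at zero). This is a simplified single-scale, single-wavelength illustration on synthetic data, not the paper's multiscale implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

N, T = 4, 50
Y = rng.poisson(0.1, size=(N, T))                # background-only counts
Y[:, 20:25] += rng.poisson(4.0, size=(N, 5))     # one target return in bins 20-24

M = np.zeros((N, T), dtype=bool)                 # binary detection map (from Algo. 1)
M[:, 20:25] = True
B = np.full((N, T), 0.1)                         # estimated background level

# Background-free counts restricted to the detected regions (clipped at zero)
Y_clean = np.clip(Y - B, 0.0, None) * M
```

Only the detected bins survive, so `Y_clean` holds the small per-surface histograms passed to the Bayesian post-processing step.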
3.4 Estimation strategy
This paper approximates the maximum a posteriori parameter estimates using a coordinate descent algorithm (CDA). The CDA aims at maximizing the posterior in (14) with respect to the parameters of interest, namely $\boldsymbol {\Theta }=\left (\boldsymbol {r},\boldsymbol {B},{\boldsymbol {d}}, {\boldsymbol {u}}, {\boldsymbol {h}}, {\boldsymbol {w}} \right )$. Due to the large number of parameters of interest and the complexity of the joint distribution, the CDA is a good candidate for estimating these parameters, as in [13,45–47]. Indeed, this algorithm iteratively updates the unknown parameters by maximizing the conditional distribution of each parameter while keeping the others fixed, until a convergence criterion is met, as detailed in Algo. 2.
From (14), we can easily demonstrate that the conditional distributions associated with the parameters of interest are
3.4.1 Stopping criteria
Algorithm 2 is an iterative process that requires the definition of stopping criteria. In this paper, we consider two criteria, either of which terminates the algorithm when satisfied. The first criterion compares the new values of the estimated parameters $\boldsymbol {\Theta }$ to the previous ones and stops the algorithm if the changes are smaller than a user-defined threshold, that is:
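The structure of the CDA updates and the parameter-change stopping criterion can be illustrated on a toy problem. The two-block objective and its closed-form conditional maximisers below are invented for the example; the point is the alternating-update loop and the termination test, not the paper's actual conditionals.

```python
# Toy negative log-posterior with two parameter blocks (t1, t2):
# f(t1, t2) = (t1 - 2)^2 + (t2 + 1)^2 + 0.5 * t1 * t2
def neg_log_post(t1, t2):
    return (t1 - 2.0) ** 2 + (t2 + 1.0) ** 2 + 0.5 * t1 * t2

t1, t2 = 0.0, 0.0
n_iter = 0
for n_iter in range(1, 101):
    t1_old, t2_old = t1, t2
    t1 = 2.0 - 0.25 * t2        # closed-form maximiser over t1, with t2 fixed
    t2 = -1.0 - 0.25 * t1       # closed-form maximiser over t2, with t1 fixed
    # Stopping criterion: change in the parameter vector below a threshold
    if max(abs(t1 - t1_old), abs(t2 - t2_old)) < 1e-10:
        break
```

Because each block update can only improve the objective, the loop converges monotonically; here it reaches the joint optimum $(2.4, -1.6)$ in a handful of iterations.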
4. Experiments
4.1 Comparison algorithms and evaluation criteria
The proposed algorithm performs joint target detection and 3D scene reconstruction while making minimal assumptions, such as allowing multiple points per pixel. To highlight the robustness and benefits of the proposed algorithm, it is compared to several state-of-the-art target detection and reconstruction algorithms, including:
- • The histogram-based TD algorithm (HTD) [18]: a fast target detection algorithm that operates on per-pixel histograms and assumes one surface per pixel.
- • The event-based TD algorithm (ETD) [19]: a fast target detection algorithm that operates on per-pixel raw ToFs and assumes one surface per pixel.
- • The multispectral TD algorithm (MSTD) [20]: a fast per-pixel signature-based target detection algorithm that performs range estimation and multispectral classification.
- • The multispectral 3D reconstruction algorithm (MS3D) [17]: assumes a single surface per pixel and is designed for robust processing of multispectral LiDAR data acquired through obscurants.
- • The RT3D algorithm [31]: assumes multiple surfaces per pixel and is used when analysing robustness to noise and photon-sparse regime imaging on single spectral data.
- • The MuSaPoP algorithm [24]: assumes multiple surfaces per pixel and is used when analysing multispectral LiDAR data.
Different metrics are considered to evaluate the TD and the 3D reconstruction estimates:
- • True detection $F_{\textrm {true}}(\tau )$: Probability of true detection with respect to the distance $\tau$, which considers a detection as true if there is a point in the ground truth in the same pixel such that $|d_{s}^{\textrm {true}} - d_{s'}^{\textrm {est}}| \leq \tau$.
- • False detection $F_{\textrm {false}}(\tau )$: Number of estimated points that cannot be assigned to a ground truth point at a distance $\tau$.
- • Mean absolute error of intensity at distance $\tau$, denoted IAE: sum of absolute errors of the intensity estimates $I^{\textrm {est}}_{s,l}$ across all detected points $S_{est}$, normalized with respect to the total number of ground truth points $S_{true}$, i.e., $\frac {1}{S_{true}} \sum ^{S_{est}}_{s=1} \sum ^L_{l=1} | I^{\textrm {est}}_{s,l} - I^{\textrm {true}}_{s,l}|$, where $I_{s,l}=h_{s} r_{s,l}$ and $I^{\textrm {true}}_{s,l}$ is the intensity of the closest ground-truth point to the estimated one. Note that if a point is falsely estimated or a ground truth point is not found, it is considered to have resulted in an error of $\sum ^{L}_{l=1} |I_{s,l}|$.
- • Mean absolute error of depth at distance $\tau$, denoted DAE: sum of absolute errors of the depth estimates across all detected points $S_{est}$, normalized with respect to the total number of true detections $F_{\textrm {true}}(\tau )$, i.e., $\frac {1}{F_{\textrm {true}}(\tau )} \sum ^{S_{est}}_{s=1} | d_{s}^{\textrm {true}} - d_{s'}^{\textrm {est}}|$. The ground truth and estimated points are coupled using the probability of detection $F_{\textrm {true}}(\tau )$.
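The detection metrics can be sketched on a toy matching: the example below assumes a simplified one-point-per-pixel setting where each estimate is paired with the ground-truth point in the same pixel (the data values are invented for illustration).

```python
import numpy as np

# Hypothetical per-pixel depth estimates vs. ground truth (one point each)
d_true = np.array([10.0, 20.0, 30.0, 40.0])
d_est = np.array([10.2, 25.0, 29.9, 40.1])
tau = 0.5

matched = np.abs(d_est - d_true) <= tau
F_true = matched.mean()          # probability of true detection at distance tau
F_false = int((~matched).sum())  # estimates with no ground-truth point within tau

# DAE: mean absolute depth error computed over the true detections only
DAE = np.abs(d_est - d_true)[matched].mean()
```

Here three of the four estimates fall within `tau` of their ground-truth depth, so `F_true` is 0.75, one point counts as a false detection, and the DAE averages the depth errors of the matched points only.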
4.2 Evaluation on simulated data
In this section, the proposed algorithm is tested on a simulated cluttered Art scene extracted from the Middlebury dataset (available at http://vision.middlebury.edu/stereo/data/). This scene is commonly used as a standard scene for algorithm evaluation in many LiDAR imaging experiments [17,27]. To simulate a scene with multiple surfaces per pixel, two cluttered Art scenes are concatenated in the depth direction. The number of pixels and bins are, respectively, $N=185\times 232$ and $T=164$. Two values of average photons per pixel (PPP) and signal-to-background ratio (SBR) are considered, and the data are corrupted by uniform and non-uniform types of background. The proposed algorithm has been used with $Q=4$ ($1 \times 1, 5 \times 5, 7 \times 7, 11 \times 11$), $\rho >1$, $\gamma =4$, and $\lambda _q = 1/Q, \forall q$. Figure 3 shows the qualitative results of the 3D target profile (second and third rows) and the 3D spectral reconstruction (fourth and fifth rows) of the Art scene. This figure shows robust detection results in the presence of multiple surfaces per pixel and for different levels of SBR and PPP. For a uniform background, the detection results are good even at low SBR and PPP levels, but the reflectivity seems underestimated in extreme cases. Slightly lower performance can be observed for a non-uniform background at low PPP levels, where the algorithm has a few false detections and estimates lower reflectivity values. Figure 4 shows quantitative results for the 3D target detection and spectral reconstruction experiments. The first row depicts the detection performance using true and false detected points, and the reconstruction results using IAE and DAE, w.r.t. distance for four SBR and PPP couples. As expected, these metrics improve when the PPP and/or SBR increase. However, we notice that the overall performance is more sensitive to a decrease in PPP than in SBR.
The second row shows the same performance metrics w.r.t. PPP and SBR with the distance fixed to $\tau = 4$ mm. Again, the performance improves for larger PPP and/or SBR. However, we notice that false detection worsens when the SBR level increases at low PPP levels.
4.3 Evaluation on real data
This section evaluates our algorithm on different datasets acquired with real LiDAR systems. The target detection, classification, and spectral reconstruction performance are evaluated on scenes with one and multiple surfaces per pixel. In the following experimental scenarios, $Q=4$ with kernel sizes $1 \times 1, 3 \times 3, 7 \times 7, 9 \times 9$, $\rho >1$, $\gamma =4$, and $\boldsymbol {\lambda } = [0,1,0,0]$.
4.3.1 One surface per-pixel
This subsection considers two real scenes that contain at most one surface per pixel. The first scene is a life-sized polystyrene head scanned at a stand-off distance of 325 metres during midday. This scene consists of $N=200\times 200$ pixels and $T = 1700$ histogram bins per pixel with a timing bin size of 16 ps, and has an average SBR of 0.25 with a 5th-95th percentile interval of (0.01, 0.6). The acquisition time per pixel is 1 ms, which corresponds to an average PPP of 30 photons (more details on this scene can be found in [48]). The second scene is a Lego figurine, 42 mm tall and 30 mm wide, scanned at a stand-off distance of 1.8 m using a TCSPC module with an acquisition time per pixel of 160 ms (40 ms acquisition time per wavelength). The sizes of the spatial, spectral, and temporal dimensions of the single-photon Lego data are, respectively, $N=200 \times 200$ pixels, $L=4$ wavelengths, and $T = 1500$ bins with a timing bin size of 2 ps. The data were acquired in the presence of ambient illumination, leading to an SBR of $1.3$. The Lego exhibits three classes of interest ($K=3$) whose spectral signatures (related to $\boldsymbol {A}$ and $\boldsymbol {B_s}$) are extracted from pixels acquired with a negligible background contribution and the maximum acquisition time per pixel. The reader is referred to [49] for more detail regarding the Lego scene. Figure 5 compares the estimated 3D target detection maps of the proposed algorithm with the 2D maps of the state-of-the-art TD algorithms. Both ETD and HTD require post-processing with a TV regularization [18,19] to obtain smooth detection maps, while the proposed method provides smooth maps thanks to the use of multiscale information. Note also that ETD and HTD provide 2D maps while the proposed method detects 3D maps. Table 1 shows the accuracy of the proposed and MSTD algorithms for different acquisition times.
The proposed algorithm has high accuracy and outperforms the MSTD algorithm post-processed with a TV regularization for acquisition times higher than $0.4$ ms per pixel.
Figure 6 shows the 3D reconstruction of the mannequin head (first row) and the Lego figurine (second row) for 1 ms (PPP=30) and 2 ms (PPP=50) acquisition time per pixel, respectively. The top row of this figure shows similar 3D reconstruction performance for all algorithms when applied to the mannequin head. For the Lego scene, we notice that the 3D reconstructions of the proposed algorithm and MS3D are closer to the ground truth than that of MuSaPoP. Table 2 shows the time performance of the proposed framework against the state-of-the-art algorithms considered in this subsection.
Figure 7 shows the 3D classification of the Lego scene against the MSTD algorithm. The proposed approach shows fewer false detections and better accuracy.
4.3.2 Multiple surfaces per-pixel
This subsection evaluates the target detection, classification, and spectral reconstruction on real data containing multiple surfaces per pixel. The first scene consists of a mannequin located 4 metres behind a partially scattering object, with $N = 99 \times 99$ pixels and $T = 4000$ bins, and a mean photon count per pixel of 45. This LiDAR scene is publicly available online (https://github.com/photon-efficient-imaging/full-waveform/tree/master/fcns). The second scene consists of a man standing behind camouflage at a 230 m stand-off distance from the LiDAR sensor. Different acquisition times (from 0.5 to 3.2 ms per pixel) were used, yielding 7 to 44.6 photons per pixel on average, of which approximately 2.1-13.3 photons correspond to background levels. The LiDAR cube has $N = 159 \times 78$ pixels and $T = 600$ histogram bins.
Figure 8 compares the proposed 3D target detection and reconstruction with those of state-of-the-art algorithms (MuSaPoP/ManiPoP and RT3D) on the two real scenes. Different acquisition times are considered for the man behind camouflage and are represented in the first three rows. The proposed algorithm shows consistent results across these cases and better depth estimates for small acquisition times when compared to the other state-of-the-art algorithms. The other scene (4th row), which depicts a mannequin behind a scattering object, shows similar performance for all algorithms. Table 3 reports quantitative performance using the metrics defined in Section 4.1, considering as reference the proposed TD and reconstruction estimates obtained with a 3.2 ms acquisition time per pixel. This table indicates the best performance for the proposed approach for both true detection rates and IAE, at the cost of a small increase in false alarm rate at low acquisition times.
5. Conclusions
This paper proposed 3D target detection and spectral classification algorithms for multispectral 3D single-photon LiDAR data. The detection algorithm exploited multiscale information to approximate the background level and detect signal peaks. This pre-processing step allowed the separation of signal information from background, hence reducing data volume and enabling higher-level post-processing. In comparison to existing single-photon TD algorithms, the proposed strategy delivered smooth detection maps by considering spatial correlation between pixels and allowed the detection of multiple peaks per pixel. We designed a hierarchical Bayesian model and estimation algorithm to perform spectral classification on the cleaned multispectral data. The results compared well with state-of-the-art 3D reconstruction algorithms on both simulated and real data. Future work will investigate the acceleration of these algorithms using parallel processing tools (e.g., GPUs) to enable real-time processing.
Funding
Royal Academy of Engineering (RF/201718/17128, RF/202021/20/307); Engineering and Physical Sciences Research Council (EP/S026428/1, EP/T00097X/1).
Disclosures
The authors declare no conflicts of interest.
Data availability
A demo of the Matlab code is not publicly available at this time but may be obtained from the authors upon reasonable request.
References
1. A. M. Wallace, A. Halimi, and G. S. Buller, “Full waveform LiDAR for adverse weather conditions,” IEEE Trans. Veh. Technol. 69(7), 7064–7077 (2020). [CrossRef]
2. J. Rapp, J. Tachella, Y. Altmann, S. McLaughlin, and V. K. Goyal, “Advances in single-photon LiDAR for autonomous vehicles: Working principles, challenges, and recent advances,” IEEE Signal Process. Mag. 37(4), 62–71 (2020). [CrossRef]
3. P. Lindner and G. Wanielik, “3D LiDAR processing for vehicle safety and environment recognition,” in Workshop on Computational Intelligence in Vehicles and Vehicular Systems (IEEE, 2009), pp. 66–71.
4. A. M. Wallace, A. McCarthy, C. J. Nichol, X. Ren, S. Morak, D. Martinez-Ramirez, I. H. Woodhouse, and G. S. Buller, “Design and evaluation of multispectral LiDAR for the recovery of arboreal parameters,” IEEE Trans. Geosci. Remote Sensing 52(8), 4942–4954 (2014). [CrossRef]
5. T. Hakala, J. Suomalainen, S. Kaasalainen, and Y. Chen, “Full waveform hyperspectral LiDAR for terrestrial laser scanning,” Opt. Express 20(7), 7119–7127 (2012). [CrossRef]
6. H. Tan, J. Peng, Z. Xiong, D. Liu, X. Huang, Z.-P. Li, Y. Hong, and F. Xu, “Deep learning based single-photon 3D imaging with multiple returns,” in International Conference on 3D Vision (IEEE, 2020), pp. 1196–1205.
7. Z.-P. Li, X. Huang, P.-Y. Jiang, Y. Hong, C. Yu, Y. Cao, J. Zhang, F. Xu, and J.-W. Pan, “Super-resolution single-photon imaging at 8.2 kilometers,” Opt. Express 28(3), 4076–4087 (2020). [CrossRef]
8. A. McCarthy, R. J. Collins, N. J. Krichel, V. Fernández, A. M. Wallace, and G. S. Buller, “Long-range time-of-flight scanning sensor based on high-speed time-correlated single-photon counting,” Appl. Opt. 48(32), 6241–6251 (2009). [CrossRef]
9. Z.-P. Li, X. Huang, Y. Cao, B. Wang, Y.-H. Li, W. Jin, C. Yu, J. Zhang, Q. Zhang, C.-Z. Peng, F. Xu, and J.-W. Pan, “Single-photon computational 3D imaging at 45 km,” Photonics Res. 8(9), 1532–1540 (2020). [CrossRef]
10. A. M. Pawlikowska, A. Halimi, R. A. Lamb, and G. S. Buller, “Single-photon three-dimensional imaging at up to 10 kilometers range,” Opt. Express 25(10), 11919–11931 (2017). [CrossRef]
11. Z. Li, E. Wu, C. Pang, B. Du, Y. Tao, H. Peng, H. Zeng, and G. Wu, “Multi-beam single-photon-counting three-dimensional imaging LiDAR,” Opt. Express 25(9), 10189–10195 (2017). [CrossRef]
12. A. Maccarone, A. McCarthy, X. Ren, R. E. Warburton, A. M. Wallace, J. Moffat, Y. Petillot, and G. S. Buller, “Underwater depth imaging using time-correlated single-photon counting,” Opt. Express 23(26), 33911–33926 (2015). [CrossRef]
13. A. Halimi, A. Maccarone, A. McCarthy, S. McLaughlin, and G. S. Buller, “Object depth profile and reflectivity restoration from sparse single-photon data acquired in underwater environments,” IEEE Trans. Comput. Imaging 3(3), 472–484 (2017). [CrossRef]
14. R. Tobin, A. Halimi, A. McCarthy, M. Laurenzis, F. Christnacher, and G. S. Buller, “Three-dimensional single-photon imaging through obscurants,” Opt. Express 27(4), 4590–4611 (2019). [CrossRef]
15. Y. Altmann, A. Maccarone, A. Halimi, A. McCarthy, G. Buller, and S. McLaughlin, “Efficient range estimation and material quantification from multispectral LiDAR waveforms,” in Sensor Signal Processing for Defence Conference (SSPD) (IEEE, 2016), pp. 1–5.
16. A. Halimi, R. Tobin, A. McCarthy, J. Bioucas-Dias, S. McLaughlin, and G. S. Buller, “Robust restoration of sparse multidimensional single-photon LiDAR images,” IEEE Trans. Comput. Imaging 6, 138–152 (2020). [CrossRef]
17. A. Halimi, A. Maccarone, R. A. Lamb, G. S. Buller, and S. McLaughlin, “Robust and guided Bayesian reconstruction of single-photon 3D LiDAR data: application to multispectral and underwater imaging,” IEEE Trans. Comput. Imaging 7, 961–974 (2021). [CrossRef]
18. J. Tachella, Y. Altmann, S. McLaughlin, and J.-Y. Tourneret, “On fast object detection using single-photon LiDAR data,” in Wavelets and Sparsity XVIII, vol. 11138 (International Society for Optics and Photonics, 2019), p. 111380T.
19. A. Halimi, A. Wallace, G. S. Buller, and S. McLaughlin, “Fast surface detection using single-photon detection events,” in Sensor Signal Processing for Defence Conference (IEEE, 2020), pp. 1–5.
20. M. A. A. Belmekki, R. Tobin, G. S. Buller, S. McLaughlin, and A. Halimi, “Fast task-based adaptive sampling for 3D single-photon multispectral LiDAR data,” IEEE Trans. Comput. Imaging 8, 174–187 (2022). [CrossRef]
21. G. Satat, M. Tancik, and R. Raskar, “Towards photography through realistic fog,” in International Conference on Computational Photography (IEEE, 2018), pp. 1–10.
22. F. Heide, S. Diamond, D. B. Lindell, and G. Wetzstein, “Sub-picosecond photon-efficient 3d imaging using single-photon sensors,” Sci. Rep. 8(1), 17726–8 (2018). [CrossRef]
23. A. K. Pediredla, A. C. Sankaranarayanan, M. Buttafava, A. Tosi, and A. Veeraraghavan, “Signal processing based pile-up compensation for gated single-photon avalanche diodes,” arXiv:1806.07437 (2018). [CrossRef]
24. J. Tachella, Y. Altmann, M. Márquez, H. Arguello-Fuentes, J.-Y. Tourneret, and S. McLaughlin, “Bayesian 3D reconstruction of subsampled multispectral single-photon LiDAR signals,” IEEE Trans. Comput. Imaging 6, 208–220 (2020). [CrossRef]
25. Y. Altmann, X. Ren, A. McCarthy, G. S. Buller, and S. McLaughlin, “LiDAR waveform-based analysis of depth images constructed using sparse single-photon data,” IEEE Trans. on Image Process. 25(5), 1935–1946 (2016). [CrossRef]
26. D. Shin, A. Kirmani, V. K. Goyal, and J. H. Shapiro, “Photon-efficient computational 3-D and reflectivity imaging with single-photon detectors,” IEEE Trans. Comput. Imaging 1(2), 112–125 (2015). [CrossRef]
27. J. Rapp and V. K. Goyal, “A few photons among many: Unmixing signal and noise for photon-efficient active imaging,” IEEE Trans. Comput. Imaging 3(3), 445–459 (2017). [CrossRef]
28. A. Halimi, Y. Altmann, A. McCarthy, X. Ren, R. Tobin, G. S. Buller, and S. McLaughlin, “Restoration of intensity and depth images constructed using sparse single-photon data,” in 24th European Signal Processing Conference (IEEE, 2016), pp. 86–90.
29. S. Hernandez-Marin, A. M. Wallace, and G. J. Gibson, “Bayesian analysis of LiDAR signals with multiple returns,” IEEE Trans. Pattern Anal. Mach. Intell. 29(12), 2170–2180 (2007). [CrossRef]
30. C. Mallet, F. Lafarge, M. Roux, U. Soergel, F. Bretar, and C. Heipke, “A marked point process for modeling lidar waveforms,” IEEE Trans. on Image Process. 19(12), 3204–3221 (2010). [CrossRef]
31. J. Tachella, Y. Altmann, N. Mellado, A. McCarthy, R. Tobin, G. S. Buller, J.-Y. Tourneret, and S. McLaughlin, “Real-time 3D reconstruction from single-photon LiDAR data using plug-and-play point cloud denoisers,” Nat. Commun. 10(1), 4984 (2019). [CrossRef]
32. Y. Altmann, A. Maccarone, A. McCarthy, G. Buller, and S. McLaughlin, “Joint range estimation and spectral classification for 3D scene reconstruction using multispectral LiDAR waveforms,” in Statistical Signal Processing Workshop (IEEE, 2016), pp. 1–5.
33. R. Achanta, F. Estrada, P. Wils, and S. Süsstrunk, “Salient region detection and segmentation,” in International Conference on Computer Vision Systems (Springer, 2008), pp. 66–75.
34. R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, “Frequency-tuned salient region detection,” in Conference on Computer Vision and Pattern Recognition (IEEE, 2009), pp. 1597–1604.
35. H. Mei, G.-P. Ji, Z. Wei, X. Yang, X. Wei, and D.-P. Fan, “Camouflaged object segmentation with distraction mining,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 8772–8781.
36. T. Zhou, D.-P. Fan, M.-M. Cheng, J. Shen, and L. Shao, “RGB-D salient object detection: a survey,” Comp. Visual Media 7(1), 37–69 (2021). [CrossRef]
37. M. A. A. Belmekki, S. McLaughlin, and A. Halimi, “Fast classification and depth estimation for multispectral single-photon LiDAR data,” in Sensor Signal Processing for Defence Conference (IEEE, 2021), pp. 1–5.
38. H. Rue and L. Held, Gaussian Markov Random Fields: Theory and Applications (Chapman and Hall/CRC, 2005).
39. S. Chen, A. Halimi, X. Ren, A. McCarthy, X. Su, S. McLaughlin, and G. S. Buller, “Learning non-local spatial correlations to restore sparse 3D single-photon data,” IEEE Trans. on Image Process. 29, 3119–3131 (2020). [CrossRef]
40. W. Marais and R. Willett, “Proximal-gradient methods for Poisson image reconstruction with BM3D-based regularization,” in 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (IEEE, 2017), pp. 1–5.
41. A. Halimi, J. Koo, R. A. Lamb, G. S. Buller, and S. McLaughlin, “Robust Bayesian reconstruction of multispectral single-photon 3D LiDAR data with non-uniform background,” in International Conference on Acoustics, Speech and Signal Processing (IEEE, 2022), pp. 1531–1535.
42. S. Chen, W. Hao, X. Su, Z. Zhang, S. Wang, and W. Zhu, “Restoration of sparse multispectral single-photon LiDAR data,” in International Conference on Signal Processing, Communications and Computing (IEEE, 2020), pp. 1–5.
43. J. Koo, A. Halimi, A. Maccarone, G. S. Buller, and S. McLaughlin, “A Bayesian based unrolling approach to single-photon LiDAR imaging through obscurants,” in 30th European Signal Processing Conference (2022).
44. J. Koo, A. Halimi, and S. McLaughlin, “A Bayesian based deep unrolling algorithm for single-photon LiDAR systems,” IEEE J. Sel. Top. Signal Process. 16(4), 762–774 (2022). [CrossRef]
45. A. Halimi, C. Mailhes, J.-Y. Tourneret, and H. Snoussi, “Bayesian estimation of smooth altimetric parameters: application to conventional and delay/Doppler altimetry,” IEEE Trans. Geosci. Remote Sensing 54(4), 2207–2219 (2016). [CrossRef]
46. A. Halimi, G. S. Buller, S. McLaughlin, and P. Honeine, “Denoising smooth signals using a Bayesian approach: application to altimetry,” IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing 10(4), 1278–1289 (2017). [CrossRef]
47. A. Halimi, P. Honeine, and J. M. Bioucas-Dias, “Hyperspectral unmixing in presence of endmember variability, nonlinearity, or mismodeling effects,” IEEE Trans. on Image Process. 25(10), 4565–4579 (2016). [CrossRef]
48. Y. Altmann, X. Ren, A. McCarthy, G. S. Buller, and S. McLaughlin, “Robust Bayesian target detection algorithm for depth imaging from sparse single-photon data,” IEEE Trans. Comput. Imaging 2, 456–467 (2016). [CrossRef]
49. R. Tobin, Y. Altmann, X. Ren, A. McCarthy, R. A. Lamb, S. McLaughlin, and G. S. Buller, “Comparative study of sampling strategies for sparse photon multispectral LiDAR imaging: towards mosaic filter arrays,” J. Opt. 19(9), 094006 (2017). [CrossRef]