Enriching absorption features for hyperspectral materials identification

Baofeng Guo

doi:10.1364/OE.384580

1. Introduction

Hyperspectral imaging [1,2] is a remote sensing technique derived from the theory of spectrometry. Hyperspectral sensors collect the radiance of materials over a wide range of the electromagnetic spectrum at contiguous spectral intervals. The fine spectral resolution, typically less than 10 $nm$, and the wide spectral coverage, e.g., 400-2500 $nm$, make it possible to lessen the ambiguity among materials’ spectral responses. Hyperspectral material identification is carried out by analyzing the response curves acquired by hyperspectral imaging sensors. In recent years, it has received significant attention in many applications [3,4], such as agriculture [5,6], molecular biology [7,8], and biomedical imaging [9], among others [3,10].

To find the spectral differences among materials, feature extraction is a necessary pre-processing step in hyperspectral materials identification. Conventional approaches for hyperspectral feature extraction fall into two classes, namely methods relying on spectra [11–14] and methods relying on data transforms [15–18]. Other novel ideas include methods based on polarimetric information [19] and dichromatic reflection [20]. In [19], a polarimetric method is proposed that uses polarization-mixed spectral data. It can identify metals or dielectric materials without the need for ground truth.

In hyperspectral materials identification, one major problem must be addressed first, namely the variation of the hyperspectral curves. In detail, this variation can be found in both the spatial domain and the time domain. A group of spectral samples of vegetation is shown in Fig. 1, where ‘Corn’, ‘Alfalfa’, and ‘Soybean’ are plotted in blue, green and red respectively. These samples are extracted randomly from the AVIRIS 92AV3C hyperspectral imagery, whose spectral resolution is about 10 $nm$. The data set is $145\times145$ pixels and its spatial resolution is about 50 $m$. Figure 1 depicts the variation of the spectral curves caused by different locations, namely the spatial domain variability.

Fig. 1. Variation of spectral curves in spatial domain, corn (blue), alfalfa (green) and soybean (red).


Figure 2 shows several samples of spectral reflectance from ‘Carbon fiber’ and ‘Polyester film’, plotted in red and blue respectively. These samples are acquired by an ASD field spectrometer at different times with a spectral interval of 1 $nm$. Substantial variation can be observed in Fig. 2. In this research, we use ‘time domain variability’ to describe the temporal change of hyperspectral curves. Such time-domain variability is thought to be caused by changing sunlight or, possibly, atmospheric light. Both the spatial domain variation and the time domain variation may cause considerable overlap between classes and reduce the separability of the hyperspectral curves.

Fig. 2. Variation of spectral curves in time domain, polyester film (blue) and carbon fiber (red).


In this paper, we propose to cope with this problem by enriching the spectral absorption features. It is known from spectrometry that absorption happens when incident light is taken in by a material’s components. In our previous research, we proposed an information-theory-based band selection method [14] to capture the most discriminating subset of bands. To further increase classification accuracy, we initiated a new approach that explores alternative absorption features. We first proposed a decision-level fusion [21] to combine the outputs of a Support Vector Machine (SVM) and a multi-label classifier (MLC), where the SVM is based on the traditional spectral curves and the MLC is based on the new absorption features. Further improvements were made by optimizing the fusion scheme and customizing the multi-label classifier [22]. It was found that some false absorptions were detected due to sensor noise and atmospheric interference. To avoid their negative effect on classification, we proposed to filter the absorption features based on a mutual information criterion and developed a more robust classification algorithm [23].

In this paper, we propose a hyperspectral material identification method based on a group of enriched absorption features. The major differences from our previous research [14,21–23] and the main contributions are summarized as follows. First, we extend an absorption valley detection algorithm by introducing continuum removal. Using the improved anchor detection results, we develop a series of formulas that automate the calculation of the absorption parameters. In this way, the parameters associated with each absorption valley, including the width, the depth, the area, and the level of symmetry, are measured without human intervention. By contrast, all previous studies use a binary representation of the absorption features that gives only the locations of the absorptions. Furthermore, we explore the information associated with the absorption valleys by calculating their neighboring orientation field. Based on the idea of information fusion, an enriched absorption feature vector is formed by appending the newly developed orientation features. Finally, we put forward a novel matching scheme for hyperspectral materials identification that is tailored to the enriched absorption features.

The remainder of this paper is organized as follows. In Section 2, we investigate the absorption valleys detection and discuss how to use them in hyperspectral material identification. In Section 3, we study how to parameterize an absorption valley and enrich the discriminating information based on the detected absorption valleys. Next, we design a matching scheme for the enriched absorption features in Section 4. Simulations are implemented in Section 5 to evaluate the performance of the proposed method based on two sets of hyperspectral data. Finally, conclusions are presented in Section 6.

2. Absorption features of hyperspectral data

According to the theory of spectroscopy [24], every material, with the possible exception of a ‘black hole’, absorbs energy from the incident light. The intensity of the absorption in a given frequency range depends on the material constituents and can therefore be seen as an ‘absorption signature’. If the spectral resolution is high enough, e.g., 10 $nm$, the absorptions appear as dips relative to the nearby frequencies, or ‘valleys’, on a spectral curve. Figure 3 illustrates the three most significant absorption valleys on a hyperspectral curve of the material ‘Aluminum’. Other minor absorption valleys are marked on the spectral curve by small triangles.

Fig. 3. Absorptions of aluminum, ASD sensor.


Because spectral absorptions can be used to determine the presence of a particular substance, they provide useful information for material identification. For this purpose, we can check the absorptions of one material against those of other materials. The absorptions that differ from those of all other materials are unique features, and these unique absorption valleys can be used directly for identification. For example, we can simply check whether a group of absorption valleys that are unique to a certain material can be found in the query example. If they are matched, the query example is assigned to that class and the identification succeeds. However, it should be noted that this approach needs ground truth data, or a suitable spectral library, in which the possible materials are already identified and stored.

For relatively simple application scenarios, it is straightforward and effective to use the unique absorptions for materials identification. But if many kinds of materials are involved, one has to compare the absorption features detected from one material with all the absorption valleys detected from the remaining materials, which reduces the chance of finding unique absorptions. When no absorption valley is exclusive to any one of the materials to be identified, this approach may no longer work. To avoid this problem, i.e., an empty set of unique absorption features, we propose to enrich the absorption features for hyperspectral material identification, as discussed in the next section.

3. Enrichment of absorption features for hyperspectral material identification

In this paper, we propose to extract more discriminating information from the absorption valleys through two different phases, namely phase I: parameterization of absorption valleys, and phase II: amendment of orientation field features, respectively.

3.1 Automatic parameterization of absorption valleys

The first phase of our absorption feature enrichment is the parameterization, i.e., characterizing each absorption by a group of quantitative values or parameters. They include the depth ($a_{d}$), the width ($a_{w}$), the area ($S_{l}+S_{r}$), the symmetry property ($S_{l}/S_{r}$), etc., associated with each of the absorption valleys. Figure 4 illustrates these parameters for an absorption valley of the real-life material ‘Monocrystalline silicon’, acquired by a portable ASD sensor.

Fig. 4. Parameters of an absorption feature, Monocrystalline silicon.


In the parameterization, one major challenge is how to estimate the above parameters accurately and automatically. Therefore, we develop a parameterization method that can calculate the parameters of absorptions without much human intervention. It is seen from Fig. 4 that every absorption valley can be characterized by three salient points, i.e., the dip, the left shoulder, and the right shoulder.

The dips of absorption valleys can be found by a peak detection algorithm. The detection of the absorption shoulders is more complicated, because there is no simple way to locate the shoulders accurately. Figure 5 shows details of three typical absorption valleys from a spectral example of the material Monocrystalline silicon. Many absorption valleys’ shoulders are in fact summit points (see the left shoulder and the right shoulder in Fig. 5(a)). Because the summit points are extreme points of the spectral curve, they can be detected by choosing, on each side of the dip, the closest point at which the first derivative crosses zero (see the second absorption valley in Fig. 5(b)).
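The dip and shoulder search described above can be sketched as follows. This is only an illustration under stated assumptions, not the author's implementation: the function name `find_dips_and_shoulders` is hypothetical, the first derivative is approximated by a finite difference, and flat segments (zero slope) are not treated specially.

```python
import numpy as np

def find_dips_and_shoulders(y):
    """Return (left_shoulder, dip, right_shoulder) index triples.

    A dip is an interior local minimum; a shoulder is the nearest local
    maximum (summit point) on each side, found where the sign of the
    finite-difference derivative changes.
    """
    y = np.asarray(y, dtype=float)
    s = np.sign(np.diff(y))  # sign of the first derivative between bands
    interior = range(1, len(y) - 1)
    minima = [i for i in interior if s[i - 1] < 0 and s[i] > 0]
    maxima = np.array([i for i in interior if s[i - 1] > 0 and s[i] < 0])
    triples = []
    for m in minima:
        left = maxima[maxima < m]
        right = maxima[maxima > m]
        if left.size and right.size:  # skip valleys lacking a summit on one side
            triples.append((int(left[-1]), m, int(right[0])))
    return triples

curve = [0, 2, 1, 0, 1, 3, 2, 1, 2, 4, 3]
print(find_dips_and_shoulders(curve))
```

A valley whose shoulder is a turning point rather than a summit is skipped or mismatched by this search, which is the failure case that motivates the continuum removal step.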

Fig. 5. Detection results for the left and right shoulders on the spectral reflectance, Monocrystalline silicon.


Applied to other spectral examples, this approach is verified to be effective in locating most of the absorption valleys’ shoulders (see the first and the second absorption valleys in Fig. 5(b)). In some circumstances, however, the approach may fail. For example, in Fig. 5(b) the approach misses the left shoulder of the third absorption valley. The reason is that the shoulders of some absorption valleys do not appear as summit points. In the third absorption valley of Fig. 5(b), the left shoulder is a turning point rather than a summit point. To accommodate these exceptions, we propose to improve the approach by introducing a pre-processing step of continuum removal.

Continuum removal is a technique that removes the envelope of a spectral curve. Given an original spectral curve $\mathbf {x}$ and its envelope $\mathbf {x}_{e}$ (i.e., the segments connecting all local spectral maxima of $\mathbf {x}$), the continuum-removed spectrum $\mathbf {x}_{c}$ is calculated by dividing the original spectrum $\mathbf {x}$ by the envelope $\mathbf {x}_{e}$ [25], i.e.,

(1)$$\mathbf{x}_{c} = \mathbf{x}/\mathbf{x}_{e}.$$
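A minimal sketch of Eq. (1) is given below. It builds the envelope exactly as described in the text, by linearly connecting the local maxima; adding the two end points of the curve as anchors, the function name `continuum_removed`, and the restriction to strictly positive reflectance are assumptions made here for illustration. Many continuum removal implementations use the upper convex hull instead of all local maxima.

```python
import numpy as np

def continuum_removed(x):
    """Return (x_c, x_e): continuum-removed spectrum and its envelope.

    The envelope x_e linearly connects the local maxima of x
    (plus the first and last samples, an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    maxima = [i for i in range(1, n - 1) if x[i] > x[i - 1] and x[i] > x[i + 1]]
    anchors = np.array([0] + maxima + [n - 1])
    x_e = np.interp(np.arange(n), anchors, x[anchors])
    return x / x_e, x_e  # Eq. (1); assumes strictly positive reflectance

x_c, x_e = continuum_removed([1.0, 2.0, 1.0, 0.5, 1.0, 2.0, 1.0])
print(x_c)
```

In the continuum-removed curve every anchor point has the value 1, so shoulders that touch the envelope become summit points of $\mathbf{x}_{c}$.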

After removing the continuum, different absorptions can be compared against a common baseline. In particular, because both the summit points and the turning points are among the points closest to the envelope of the hyperspectral curve, it can be expected that these points will be highlighted after the continuum removal. Thus, continuum removal can help to amend the detection results obtained by the original approach shown in Fig. 5(b).

Figure 6(a) shows the result after applying the continuum removal in Eq. (1). During continuum removal, the reflectance has its envelope subtracted and is then normalized to the range [0,1]. The ‘normalized magnitude’ denotes the reflectance value after continuum removal, and the ‘magnitude’ denotes the value of the first-order derivative of the continuum-removed reflectance. Compared with the original spectral curve shown in Fig. 5(a), the turning point of the third absorption valley (i.e., its left shoulder, missed by the previous algorithm) is transformed into a summit point. Thus, it becomes possible to detect all the shoulders after adopting the continuum removal. Figure 6(b) shows the results of the shoulder detection, where the red lines mark the original detection results and the green line marks the new detection result. From Fig. 6(b), it is seen that after continuum removal all the absorption shoulders are detected correctly.

Fig. 6. Detection results for the left and right shoulders after continuum removal, Monocrystalline silicon.


After detecting the three salient points, i.e., the left shoulder, the dip and the right shoulder, the parameters of an absorption valley can be calculated automatically. Assuming the coordinates of the three salient points are $\left ( x_{d}, y_{d} \right )$ (for the dip), $\left ( x_{l}, y_{l} \right )$ (for the left shoulder), and $\left ( x_{r}, y_{r} \right )$ (for the right shoulder), respectively, where $x$ stands for the band number and $y$ is the normalized spectral value, the width of an absorption valley is measured as

(2)$$a_{w} = x_{d} - x_{l}.$$

The depth of the absorption is calculated as

(3)$$a_{d} = \frac{1}{2}\left( y_{r} + y_{l} \right) - y_{d}.$$

The area of the absorption is calculated as

(4)$$a_{s} = \frac{1}{2}\left[ (x_{d}y_{l}-x_{l}y_{d})+(x_{l}y_{r}-x_{r}y_{l})+(x_{r}y_{l}-x_{l}y_{r}) \right].$$

The symmetry of the absorption is measured as

(5)$$a_{m} = \frac{\left[ (x_{d}y_{l}-x_{l}y_{d})+(x_{l}y_{h}-x_{h}y_{l})+(x_{h}y_{l}-x_{l}y_{h}) \right]}{\left[ (x_{d}y_{h}-x_{h}y_{d})+(x_{h}y_{r}-x_{r}y_{h})+(x_{r}y_{h}-x_{h}y_{r}) \right]},$$

where $x_{h}=\frac {1}{2}(x_{l}+x_{r})$ and $y_{h}=\frac {1}{2}(y_{l}+y_{r})$.
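The parameterization can be illustrated with a short sketch. It follows Eqs. (2) and (3) literally. For the area and the symmetry it does not transcribe Eqs. (4) and (5); instead it assumes one geometric reading of $S_{l}$ and $S_{r}$ from Section 3.1, namely the two triangles obtained by splitting the triangle (left shoulder, dip, right shoulder) with a vertical line through the dip, each computed with the standard shoelace formula. The function names are hypothetical.

```python
def triangle_area(p, q, r):
    """Unsigned triangle area from the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    return 0.5 * abs((x1 * y2 - x2 * y1) + (x2 * y3 - x3 * y2) + (x3 * y1 - x1 * y3))

def valley_parameters(left, dip, right):
    """Return (a_w, a_d, area, symmetry) from the three salient points."""
    (xl, yl), (xd, yd), (xr, yr) = left, dip, right
    a_w = xd - xl                  # Eq. (2) as written
    a_d = 0.5 * (yr + yl) - yd     # Eq. (3)
    # Assumption: split at the point of the shoulder line directly above the dip.
    yc = yl + (yr - yl) * (xd - xl) / (xr - xl)
    s_l = triangle_area(left, dip, (xd, yc))
    s_r = triangle_area((xd, yc), dip, right)
    return a_w, a_d, s_l + s_r, s_l / s_r

print(valley_parameters(left=(0.0, 1.0), dip=(1.0, 0.0), right=(3.0, 1.0)))
```

Under this reading the symmetry value equals 1 for a dip centered between its shoulders and moves away from 1 as the dip shifts toward either shoulder.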

Here the continuum-removed curves are only used to find the correct shoulders of the absorption valleys. The subsequent feature matching is still carried out on the original values, so the above $\left (x, y\right )$ coordinates refer to the original domain. By combining the position of the dip, i.e., $a_{p} = x_{d}$, with the parameters calculated above, an absorption valley can be characterized by a real-valued vector, such as $\mathbf {a} = \left [a_{p}\;a_{w}\;a_{d}\;a_{s}\;a_{m}\right ]^T$. In contrast to the binary absorption features [21], which only record the locations of the absorptions, the parameterized absorption proposed in this research contains much more information. This helps to improve the discrimination capability of the absorption features and leads to better material identification accuracy.

3.2 Orientation features

In the second phase of the feature enrichment, we expand the above parameterized absorptions (i.e., the feature vector $\mathbf {a}$) by augmenting a group of new spectral representations. Specifically, to incorporate more curve-based information, new orientation descriptors are considered, which are designed to describe the appearance of an absorption pattern around the absorption valley. The details are discussed as follows.

It has been argued that materials can be identified by their spectral curves, especially the dominant absorptions. In many traditional absorption-based approaches (e.g., [21]), the location of each absorption is represented by the dip of the absorption valley, but the information that characterizes the neighboring spectral curve has not been explored. Even in the aforementioned parameterized absorption method, some details of the surrounding spectral curve are not described. Hence, we propose a new descriptor that comprises the orientation information around the absorption dip.

Figure 7 depicts the orientation features of an absorption valley. First, a series of sampling points (bands) is assigned to each of the absorption valleys. These sampling points are organized in a pattern consisting of two sides surrounding the absorption dip. In detail, each pattern comprises $2K$ sampling points, which are equally distributed along the left side and the right side. Using the dip of the absorption valley as a reference, the sampling bands are ordered along the two sides, starting from the dip and moving towards the two shoulders. Then, we can define the orientation descriptor of the absorption valley as follows:

(6)$$\mathbf{d}= \left[\theta_{1}\;\theta_{2}\;\dots\;\theta_{2K}\right]^{T}$$

where $\theta _{k}$ denotes the angle of the sampling point $k$ against the horizontal direction, i.e., the $x$-axis.
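A possible way to compute the descriptor in Eq. (6) is sketched below. The text does not fully specify how the angle of a sampling point is measured, so this sketch assumes that $\theta_{k}$ is the angle between the $x$-axis and the segment from the dip to the $k$-th sampling point, and that the $K$ sampling bands on each side are equally spaced between the dip and the shoulder. The function name and arguments are hypothetical.

```python
import numpy as np

def orientation_descriptor(bands, values, left, dip, right, K):
    """Return the 2K angles (radians) of Eq. (6) for one absorption valley.

    left, dip, right are indices into bands/values; bands must be increasing.
    """
    bands = np.asarray(bands, dtype=float)
    values = np.asarray(values, dtype=float)
    # K sampling bands per side, ordered from the dip toward each shoulder.
    xs_left = np.linspace(bands[dip], bands[left], K + 1)[1:]
    xs_right = np.linspace(bands[dip], bands[right], K + 1)[1:]
    xs = np.concatenate([xs_left, xs_right])
    ys = np.interp(xs, bands, values)
    # Assumed definition: angle of the dip-to-sample segment against the x-axis.
    return np.arctan2(ys - values[dip], xs - bands[dip])

bands = np.arange(7)
values = np.abs(bands - 3.0)  # symmetric V-shaped valley with dip at band 3
print(np.degrees(orientation_descriptor(bands, values, left=0, dip=3, right=6, K=3)))
```

For the symmetric V-shaped valley in the usage line, the left-side angles are all 135 degrees and the right-side angles are all 45 degrees.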

Fig. 7. Orientation parameters of an absorption feature, Monocrystalline silicon.


The orientation information provides a new descriptor of the spectral absorption, which supplements the existing absorption features with more spectral detail about the absorption valleys. Moreover, since the angle of each sampling point against the horizontal direction is used as an individual feature, the orientation descriptor (Eq. (6)) is invariant to the magnitude of the hyperspectral curve. In this way, the new orientation features characterize the absorption valley with respect to the desired spectral pattern. However, when the environmental light or the spectral shape of the illumination changes, the parameters would change for the same material if radiance data are used. This problem should be carefully considered during system-level design, especially through atmospheric correction or classifier optimization. In the following section, we develop a matching algorithm for the enriched absorption features based on a pair-wise pattern correspondence.

4. Matching algorithm

For the purpose of material identification, we need to calculate a matching score between the query example and the known (gallery) example. To calculate this matching score, we first concatenate the two enriched absorption feature vectors, i.e., the parameterized absorption feature vector $\mathbf {a} = \left [a_{p}\;a_{w}\;a_{d}\;a_{s}\;a_{m}\right ]^T$, with the orientation feature vector $\mathbf {d} = \left [\theta _{1}\;\theta _{2}\;\dots \;\theta _{2K}\right ]^{T}$, as an augmented feature vector, such as:

(7)$$\mathbf{f} = \left[ \mathbf{a};\mathbf{d} \right].$$

Given two hyperspectral curves as the candidates for matching, with $N$ and $M$ absorption valleys respectively, let

(8)$$Q = \left\{ \mathbf{q}_{i} \right\},\;i=1,2,\dots,N,$$

and

(9)$$P = \left\{ \mathbf{p}_{j} \right\},\;j=1,2,\dots,M,$$

be two sets of the enriched absorption feature vectors extracted from the two hyperspectral curves, where $\mathbf {q}_{i}$ and $\mathbf {p}_{j}$ are the feature vectors for the $i$-th and $j$-th absorption valleys from the two hyperspectral curves.

To calculate the matching score, pairs of corresponding absorptions need to be found first, such as

(10)$$C=\left\{(i,j)\,|\,\mathbf{q}_{i}\in Q,\;\mathbf{p}_{j}\in P\right\},$$

where each pair $(i,j)\in C$ denotes that the absorptions $\mathbf {q}_{i}$ and $\mathbf {p}_{j}$, detected in the two hyperspectral curves, are associated with the same absorption. This process is called alignment and is discussed as follows.

4.1 Alignment

Due to various factors, such as the sensor’s spectrum shift or errors in the absorption detection, the two absorptions of a corresponding pair may be displaced with respect to each other. Figure 8 shows the spectral curves of samples of ‘wheat’ and ‘oats’ extracted from the 92AV3C data set. A significant shift of the absorption valleys occurs around 600 $nm$. This is thought to be caused by spectral calibration errors in certain ranges of the spectrum. Thus, in hyperspectral material identification, an absorption on one spectral curve may not coincide exactly with the corresponding absorption on another spectral curve of the same material.

Fig. 8. Shift of spectral valleys, samples extracted from AVIRIS 92AV3C dataset, wheat (red) and oats (green).


To address this problem, we design an alignment process that allows a certain level of tolerance between the absorption positions of each pair of corresponding absorptions. In the alignment, candidate absorption pairs are first collected among all possible pairs. Then, three criteria, namely the distance of absorption location, the similarity of absorption parameters and the match level of absorption shape, are applied in sequence to decide which pairs are the correct corresponding absorptions. The distance between any pair of the absorptions’ positions, i.e.,

(11)$$L_{i,j}=|a_{p}^{i}-a_{p}^{j}|,\;i = 1,2,\dots,N;\;j = 1,2,\dots,M,$$

is calculated, and the absorption pairs whose location distance $L_{i,j}$ does not exceed a certain value $\delta _{l}$, i.e.,

(12)$$L_{i,j}\;<\;\delta_{l},\;i = 1,2,\dots,N; \;j = 1,2,\dots,M,$$

are selected as a group of eligible pairs, i.e.,

(13)$$C_{1} = \left\{(p,q)_{k}|\;p\in\left\{1,2,\dots,N\right\};\;q\in\left\{1,2,\dots,M\right\}\right\},\;k=1,2,\dots,K$$

The first candidate set $C_{1}$ is then passed to the next step.

In the second step, the remaining absorption pairs are further evaluated based on the similarity of their absorption parameters, using the following Euclidean distance,

(14)$$S_{i,j} = \sqrt{\sum_{h}{(a_{h}^{i}-a_{h}^{j})^{2}}},\;h\in\{w,d,s,m\};\;(i,j)\in C_{1}.$$

Similar to the previous step, the pairs whose absorption parameter distance is within the acceptable tolerance $\delta _{s}$, i.e.,

(15)$$S_{i,j}\;<\;\delta_{s},\;(i,j)\in C_{1},$$

are selected as the secondary set of eligible pairs $C_{2}$ and passed to the third round of checking, where the level of shape match of each pair of absorption valleys is measured using the orientation features as follows:

(16)$$M_{i,j} = \sqrt{\sum_{k=1}^{2K}{(\theta_{k}^{i}-\theta_{k}^{j})^{2}}},\;(i,j)\in C_{2}$$

Finally, the pairs whose shape difference is less than a certain tolerance $\delta _{m}$ are selected as the corresponding absorption pairs, i.e.,

(17)$$M_{i,j}\;<\;\delta_{m},(i,j)\in C_{2}.$$

Through the above three consecutive steps, the pairs satisfying all three constraints are chosen. This alignment finds the corresponding pairs of absorptions with the highest likelihood of spectral matching. At the same time, the algorithm introduces three tolerance thresholds, which give flexibility to the absorption matching and help to avoid interference from the sensor’s spectrum shift or from inaccurate absorption detection results. In our testing, the values of $\delta$ can be decided subjectively by a domain expert, or empirically by using only the training samples. For example, the tolerance parameter $\delta$ can be decided by observing the training samples to ensure that the majority of the absorption valleys are matched correctly. If training samples are used to validate the parameter, the number of samples should be chosen carefully to avoid overfitting.
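The three-stage cascade of Eqs. (11)-(17) can be sketched as below. The feature layout (one row per absorption, ordered as $a_{p}$, $a_{w}$, $a_{d}$, $a_{s}$, $a_{m}$, $\theta_{1},\dots,\theta_{2K}$), the function name `align`, and the threshold values in the usage lines are assumptions for illustration only. The sketch also does not enforce a one-to-one correspondence, which the text leaves unspecified.

```python
import numpy as np

def align(Q, P, delta_l, delta_s, delta_m):
    """Return index pairs (i, j) that pass all three tolerance checks.

    Q, P: arrays with rows [a_p, a_w, a_d, a_s, a_m, theta_1, ..., theta_2K].
    """
    Q = np.asarray(Q, dtype=float)
    P = np.asarray(P, dtype=float)
    pairs = []
    for i, q in enumerate(Q):
        for j, p in enumerate(P):
            if not abs(q[0] - p[0]) < delta_l:                 # Eqs. (11)-(12): location
                continue
            if not np.linalg.norm(q[1:5] - p[1:5]) < delta_s:  # Eqs. (14)-(15): parameters
                continue
            if np.linalg.norm(q[5:] - p[5:]) < delta_m:        # Eqs. (16)-(17): shape
                pairs.append((i, j))
    return pairs

Q = [[10, 2, 0.5, 1.0, 1.0, 0.1, 0.2],
     [50, 3, 0.2, 0.6, 1.0, 0.3, 0.4]]
P = [[11, 2, 0.5, 1.0, 1.0, 0.1, 0.2],
     [52, 9, 0.2, 0.6, 1.0, 0.3, 0.4]]
print(align(Q, P, delta_l=3, delta_s=1.0, delta_m=0.5))
```

In the usage lines the second pair lies within the location tolerance but fails the parameter check, so only the first pair survives.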

4.2 Matching score

After finding the corresponding absorption pairs, we can evaluate the matching score between two hyperspectral curves for material identification. Specifically, we consider three different measurements for the matching score.

4.2.1 Number of correspondence.

This criterion checks how many absorption features one spectral curve has that are the same as the other spectral curve, i.e., the number of correspondence of absorptions between two hyperspectral curves. This number can be calculated by Hamming distance conveniently. The Hamming distance is calculated as follows.

(18)$$S_{C}(\mathbf {x_{1}},\mathbf {x_{2}}) = \sum_{i=1}^{L}\|X_{1}^{i} - X_{2}^{i} \|$$

where $\mathbf {x}=\left [X^1\;X^2\;\cdots \;X^L\right ]^{T}$ is an $L$-dimensional binary vector.

For a hyperspectral curve with $N$ absorption features such as $Q = \left \{ \mathbf {q}_{i} \right \}_{i=1}^{N}$ and $\mathbf {q}_{i} = \left [a_{p}\;a_{w}\;a_{d}\;a_{s}\;a_{m}\;\theta _{1}\;\theta _{2}\;\dots \;\theta _{2K} \right ]$, its binary absorption vector $\mathbf {x}$ is calculated as follows:

(19)$$X^{l} = \left\{ \begin{array}{ll} 1, & a_{p} = l\\ 0, & \textrm{otherwise} \end{array} \right. \quad l = 1,2,\ldots,L.$$
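A minimal sketch of Eqs. (18) and (19) follows. It assumes that the dip positions $a_{p}$ are integer band numbers in $1,\dots,L$; the function names are hypothetical.

```python
import numpy as np

def binary_absorption_vector(dip_bands, L):
    """Eq. (19): entry l is 1 if some absorption has its dip at band l (1-based)."""
    x = np.zeros(L, dtype=int)
    for l in dip_bands:
        x[l - 1] = 1
    return x

def hamming(x1, x2):
    """Eq. (18): sum of element-wise absolute differences of two binary vectors."""
    return int(np.sum(np.abs(x1 - x2)))

x1 = binary_absorption_vector([2, 5], L=8)
x2 = binary_absorption_vector([2, 7], L=8)
print(x1, x2, hamming(x1, x2))
```

For binary vectors, the sum in Eq. (18) counts the positions at which the two vectors differ.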

The number of correspondences between two hyperspectral curves constitutes an important clue for materials identification. Two hyperspectral curves are likely to come from the same material if their number of correspondences is large. However, we may encounter difficulties in correctly distinguishing materials that have many absorptions. For example, a hyperspectral curve may exhibit a large number of correspondences with one material but come from another material. This is because the matched material has many absorptions, which increases the probability of correspondence. Hence, we propose to consider a second matching score.

4.2.2 Number of non-correspondence.

In contrast to the correspondence number, the number of non-correspondence measures how many absorption features of one spectral curve do not appear on the other spectral curve, i.e., the number of mismatched absorptions between two hyperspectral curves.

The non-correspondence can be calculated based on Hamming distance as well, such as:

(20)$$S_{M}(\mathbf {x_{1}},\mathbf {x_{2}}) = \max \{(N - S_{C}(\mathbf {x_{1}},\mathbf {x_{2}})), (M - S_{C}(\mathbf {x_{1}},\mathbf {x_{2}}))\}$$

where $N$ and $M$ are the numbers of absorption features for the spectral data $\mathbf {x_{1}}$ and $\mathbf {x_{2}}$ respectively. This metric indicates how much the two hyperspectral curves diverge. By combining these two complementary measurements, it becomes possible to avoid the aforementioned problem.

4.2.3 Averaged distance

The above two measurements evaluate the level of correspondence and, complementarily, the level of mismatch between two spectral curves. Both are based purely on the positions of the absorption features. In other words, they only consider the locations or wavelengths of the spectral absorptions. But according to our discussion in Section 3, a spectral absorption can provide much more information: not only the spectral position where the absorption occurs, but also the width, the area, the depth, and the symmetry associated with each of the absorption valleys. In addition, we introduce a group of orientation features that bring in a new source of direction information. Each of these features can help to discriminate one material from the others. We therefore introduce a Euclidean distance to measure the similarity between two absorptions described by the enriched absorption features.

Given a combined ($2K+4$) dimensional feature vector, such as:

(21)$$\mathbf{f} = \left[a_{w}\;a_{d}\;a_{s}\;a_{m}\;\theta_{1}\;\theta_{2}\;\dots\;\theta_{2K} \right]^{T} $$

(22)$$ = \left[f_{1}\;f_{2}\;f_{3}\;f_{4}\;f_{5}\;f_{6}\;\dots\;f_{(2K+4)} \right]^{T} $$

where $\mathbf {a} = \left [a_{w}\;a_{d}\;a_{s}\;a_{m}\right ]^T$ (the parameter $a_{p}$ has been used before by the aforementioned two metrics) is the parameterized absorption features, and $\mathbf {d}= \left [\theta _{1}\;\theta _{2}\;\dots \;\theta _{2K}\right ]^{T}$ is the orientation descriptor, a distance between two feature vectors $\mathbf {f}^1$ and $\mathbf {f}^2$ is defined as:

(23)$$S_{d}(\mathbf{f}^1,\mathbf{f}^2) = \sqrt{\sum_{i=1}^{(2K+4)}\left(f_{i}^{1}-f_{i}^{2}\right)^{2}}$$

Using the extracted absorption features, the three metrics are adopted to measure the level of matching between two hyperspectral curves. Each of the metrics makes one decision on whether the two hyperspectral curves come from the same material. Considering that each individual feature may have a different discriminating capability, the final decision is given by a weighted majority vote over the initial decisions, as follows:

(24)$$\hat{y} = \arg \max_{i} \sum^{3}_{j=1} w_{j} p_{ij}$$

where $\hat{y}$ is the predicted class label obtained from the weighted vote, $p_{ij}$ is the probability that the testing example belongs to class $i$ according to classifier $j$, and $w_{j}$ is the weight of classifier $j$’s prediction.

In our scheme, the weight of each classifier $j$ is measured by the entropy of its output, i.e.,

(25)$$w_{j}=-\sum_{i=1}^{N} p_{ij}\log p_{ij}.$$

The output probability $p_{ij}$ is defined as

(26)$$p_{ij} = S_{ij}/\max_{i}\left(S_{ij}\right)$$

where $S_{ij}$ is the $j$-th matching score for class member $i$, and the matching scores $S_{j}, j = 1,2,3$ are measured by Eq. (18), Eq. (20), and Eq. (23) respectively.
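The fusion rule in Eqs. (24)-(26) can be sketched as follows. The sketch applies the equations as written to a score matrix with one row per class and one column per metric; it assumes strictly positive column maxima, treats $0\log 0$ as 0, and makes no attempt to convert distance-type scores into similarity-type scores, which the text does not describe. The function name and the example scores are hypothetical.

```python
import numpy as np

def weighted_vote(S):
    """Return (predicted class index, weights w, probabilities p).

    S[i, j] is the j-th matching score for class i.
    """
    S = np.asarray(S, dtype=float)
    p = S / S.max(axis=0, keepdims=True)     # Eq. (26): normalize each metric
    safe = np.where(p > 0, p, 1.0)           # so that 0 * log(0) contributes 0
    w = -np.sum(p * np.log(safe), axis=0)    # Eq. (25): one weight per metric
    totals = p @ w                           # sum_j w_j * p_ij for each class i
    return int(np.argmax(totals)), w, p      # Eq. (24)

S = [[4, 2, 1],
     [2, 2, 1],
     [1, 1, 1]]
label, w, p = weighted_vote(S)
print(label, w)
```

In the example, the third metric gives identical scores to every class, so its weight under Eq. (25) is zero and it has no influence on the vote.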

5. Results and discussions

To assess the performance of the proposed method, different hyperspectral material identification approaches are tested on two hyperspectral data sets. The benchmark classical methods include the nearest neighbor ($k$-NN) method [26] and the spectral angle mapping (SAM) method [27]. Both of them use hyperspectral curves as features. The proposed method is also compared with the method that adopts the unique absorptions as features [23]. Because it usually takes time and effort to acquire spectral samples for many materials, especially in remote sensing applications, maintaining a sizable database is expensive. Modern machine learning algorithms that rely on large training sets, such as support vector machines (SVMs) and convolutional neural networks (CNNs), are therefore not particularly suitable for this scenario, and they are not adopted as benchmark approaches in the first stage of comparison. The results for the two hyperspectral data sets are discussed in the following sections.

5.1 Results of ASD data set

In the first data set, we investigate seven kinds of materials, including ‘aluminum’, ‘carbon fiber’, ‘mono-crystalline silicon’, ‘silicon dioxide’, ‘titanium’, ‘white polyester film’, and ‘yellow polyester film’. An Analytical Spectral Devices (ASD) spectrometer [28] is used to record the spectra of these seven kinds of materials. The portable ASD spectrometer provides 1 $nm$ spectral resolution, and is particularly suitable for laboratory research or field investigation. In total, 2,151 wavelengths are measured by the ASD sensor, covering the spectral range from 350 $nm$ to 2,500 $nm$. In the test, three measurements are acquired for each of the materials. Figure 9 shows the spectral curves sampled from the above seven kinds of materials.

Fig. 9. Samples of ASD data set.


The absorption valleys are detected for the seven kinds of materials and are shown in Fig. 10. Figure 11 further compares all absorption valleys (Fig. 11(a)) and the unique absorptions (Fig. 11(b)) for ‘aluminum’. It is seen that the unique absorptions are fewer than the qualified absorption valleys. By pair-wise comparisons, we search for the unique absorptions of the seven kinds of materials and illustrate them in Fig. 12.

Fig. 10. Absorption valleys of seven materials, marked by black lines.

Fig. 11. Comparison of absorption valleys and absorption features, aluminum

Fig. 12. Unique absorption features of seven materials, ASD data set

In the identification experiment, 100 spectral samples are acquired with a portable ASD sensor for each of the seven materials mentioned above. Figure 13 shows 300 samples from three of the seven materials: ‘polyester film (blue color)’, ‘silicon dioxide (green color)’, and ‘titanium (red color)’. Following the method discussed in Section 4, about 10% of the samples (i.e., 10) are drawn from each material as training (gallery) examples, and identification accuracy is assessed on the remaining 90% (i.e., 90) testing (query) examples. Two classical methods, namely the spectral angle mapping (SAM) method [27] and the nearest neighbor ($k$-NN) method [26], are chosen as the benchmarks. The method introduced in [23], which checks the unique absorption features, is also evaluated.
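To make the two benchmark methods concrete, the following is a minimal sketch of gallery/query identification with the spectral angle and with a nearest-neighbor rule. It assumes a 1-nearest-neighbor rule with Euclidean distance for the $k$-NN benchmark (the value of $k$ and the metric are not stated here), and the helper names and toy spectra are hypothetical.

```python
import math


def spectral_angle(a, b):
    """Angle (radians) between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos)


def euclidean(a, b):
    """Euclidean distance between two spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query, gallery, distance):
    """Return the label of the gallery spectrum closest to `query`.

    `gallery` is a list of (label, spectrum) pairs.
    """
    return min(gallery, key=lambda item: distance(query, item[1]))[0]


# Toy spectra, for illustration only (not measured values).
gallery = [("flat", [1.0, 1.0, 1.0, 1.0]), ("dip", [1.0, 0.2, 0.2, 1.0])]
query = [2.0, 0.5, 0.5, 2.0]  # a brighter spectrum with a similar dip
sam_label = identify(query, gallery, spectral_angle)
nn_label = identify(query, gallery, euclidean)
print(sam_label, nn_label)
```

Both rules compare whole spectral curves. The spectral angle ignores a uniform scaling of the spectrum, whereas the Euclidean distance depends on the absolute amplitude.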

Fig. 13. Spectral samples (100 each) of polyester film (white), silicon dioxide, and titanium, ASD sensor.

Identification results are listed in Table 1, which shows that the two classical methods cannot compete with the methods based on absorptions. In detail, the identification accuracies of the SAM method and the $k$-NN method are 91.11% and 90.63% respectively, while both absorption-feature based methods achieve 100% accuracy. Specifically, the classical methods misidentify some samples of ‘silicon dioxide’ and ‘mono-crystalline silicon’, whereas the methods based on the absorption features identify them correctly. To evaluate the different approaches further, a remotely sensed AVIRIS hyperspectral data set is tested as follows.

Table 1. Identification accuracy on ASD data; each material has 10 training samples and 90 test samples.

5.2 AVIRIS data set

In the second test, the AVIRIS 92AV3C data set, with 10 $nm$ spectral resolution, is used. It is acquired by an airborne AVIRIS sensor, which has 224 bands and covers the 400-2500 $nm$ spectral range [17]. The 92AV3C scene, of size $145 \times 145$ pixels, has 16 classes, mostly vegetation: Alfalfa, Corn (notill), Corn (mintill), Corn, Grass/pasture, Grass/trees, Grass/pasture (mowed), Hay (windrowed), Oats, Soybean (notill), Soybean (mintill), Soybean (clean), Wheat, Woods, Buildings/Grass/Trees/Drives, and Stone/Steel/Towers. About 1% of the pixels of each class (or at least 1 pixel for classes with fewer than 100 pixels) are extracted randomly as training examples, and the remaining 99% of the pixels are kept as test samples. Figure 14 displays spectral samples of the 16 classes of 92AV3C. In Fig. 15, the absorption valleys are marked by black blocks, and the blank areas mean no absorption.
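The per-class sampling rule described above can be sketched as follows. This is only an illustration under stated assumptions: the rounding rule (`round`), the fixed random seed, and the function name `split_per_class` are choices made here and are not taken from the experiments, and the class sizes in the usage example are made up.

```python
import random


def split_per_class(labels, fraction=0.01, seed=0):
    """Randomly pick about `fraction` of each class (at least 1) for training.

    Returns two lists of sample indices: (train, test).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, test = [], []
    for indices in by_class.values():
        # At least one training example per class, even for small classes.
        n_train = max(1, round(len(indices) * fraction))
        chosen = set(rng.sample(indices, n_train))
        train.extend(i for i in indices if i in chosen)
        test.extend(i for i in indices if i not in chosen)
    return train, test


# Made-up class sizes, for illustration only.
labels = ["corn"] * 300 + ["oats"] * 20
train_idx, test_idx = split_per_class(labels)
print(len(train_idx), len(test_idx))
```

With so few gallery examples per class, the split itself can influence the measured accuracy, so repeating the experiment with several seeds may be worthwhile.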

Fig. 14. Spectral examples of 16 classes of material in AVIRIS 92AV3C data set.

Fig. 15. Detection results of all absorption valleys, AVIRIS 92AV3C data set [21].

Following the strategy introduced in [23], we search the absorption valleys of each of the 16 classes of objects (shown in Fig. 14) for unique absorptions. The results are shown as the black blocks in Fig. 16. For four of the 16 classes, no unique absorption is found. In other words, in the AVIRIS 92AV3C data set, four classes, namely ‘Corn’, ‘Hay’, ‘Grass/pasture’ and ‘Soybean (mintill)’, cannot be identified based on unique absorptions. In the remaining tests on the AVIRIS 92AV3C data set, three approaches are therefore evaluated: the spectral angle mapping method, the nearest neighbor method, and the proposed method.

Fig. 16. Absorption features (unique) for AVIRIS 92AV3C data set.

The results are listed in Table 2. The overall accuracy of the proposed method (68.26%) is higher than those of the two benchmark methods, $k$-NN (37.05%) and SAM (40.22%).

Table 2. Material identification results, AVIRIS 92AV3C data set.

In the orientation feature vector, the parameter $K$ controls how many neighboring bands are used to estimate the spectral shape of the absorption valley. The more bands are used, the more accurately the absorption valley is approximated. As $K$ increases, however, its ability to depict the desired absorption declines, since the sampled bands are more likely to fall outside the effective absorption valley. Figure 17 shows the identification accuracy for different values of $K$. Starting from small values, the accuracy improves rapidly as $K$ increases. After a turning point (here $K = 7$), the accuracy changes very slowly. This may indicate that the sampled bands are then too far from the center of the absorption, so that the orientation vector becomes less effective.
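The role of $K$ can be illustrated with a simplified sketch that samples $K$ bands on each side of an absorption center and records the direction of the local slope at each of them. This is an assumption made for illustration, not the definition of the orientation features given in Section 3.2: the slope is estimated here by a central difference with the band spacing taken as one unit, and the function name `orientation_vector` and the toy spectrum are hypothetical.

```python
import math


def orientation_vector(spectrum, center, k):
    """Slope angles (radians) at the k bands on each side of `center`.

    The slope at each band is a central difference, with the band spacing
    taken as one unit (an assumption made for this sketch).
    """
    if center - k < 0 or center + k > len(spectrum) - 1:
        raise ValueError("K neighbors on each side must lie inside the spectrum")
    angles = []
    for offset in range(-k, k + 1):
        if offset == 0:
            continue  # skip the absorption center itself
        i = center + offset
        lo = max(i - 1, 0)
        hi = min(i + 1, len(spectrum) - 1)
        slope = (spectrum[hi] - spectrum[lo]) / (hi - lo)
        angles.append(math.atan(slope))
    return angles


# Toy symmetric valley, for illustration only (not measured values).
spectrum = [5.0, 4.0, 2.0, 1.0, 2.0, 4.0, 5.0]
angles = orientation_vector(spectrum, center=3, k=2)
print(angles)
```

A larger `k` lengthens the vector by two entries per step, and the added entries describe bands farther from the valley center, which is consistent with the diminishing benefit observed beyond the turning point.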

Fig. 17. Accuracy change against different values of parameter $K$.

In this research, the data values of AVIRIS 92AV3C are calibrated DN (Digital Number) values, which are subsequently converted to radiance values. Note that the main illuminant (such as sunlight) and atmospheric light may also create many dips in hyperspectral radiance curves, and these dips do not reflect the materials’ identity. This may cause serious problems for identification methods that rely on detecting absorptions in radiance, and might explain why the accuracies on the AVIRIS 92AV3C data set are lower than those on the ASD data set. To alleviate this problem, the matching strategy and atmospheric correction have to be considered carefully to reduce the interference from the illuminant or atmospheric light.

6. Conclusion

In this paper, we propose a material identification approach based on enriched hyperspectral absorption features. Specifically, we put forward an improved anchor detection algorithm and propose a series of parameters, such as the width, the depth, the area, and the level of symmetry, to quantify each of the detected absorption valleys. To further improve the capability of identification, we enrich the information associated with the selected absorption valley by calculating its neighboring orientation field and forming an extended absorption feature vector. Finally, we design a feature matching scheme to identify materials based on the enriched absorption features. The proposed approach differs from conventional methods, which emphasize the absolute amplitude of the radiance.

To assess the identification accuracy, experiments are carried out on two different hyperspectral data sets. The results show that the proposed method achieves better identification accuracy than the classical methods. On the other hand, it is worth noting that the main illuminant (such as sunlight) and atmospheric light may also create dips in hyperspectral radiance curves, and these dips do not reflect the materials’ identity. Using radiance for material identification can therefore cause problems for absorption-based methods, and this has to be carefully addressed in applications.

Moreover, it is known that absorption occurs when energy from the radiative source is absorbed by the material’s components. This will be taken into account in future research.

Funding

National Natural Science Foundation of China (61375011).

Acknowledgments

The author would like to thank the anonymous reviewers for many insightful comments and constructive suggestions, which helped to improve the quality of this paper greatly. The author would also like to thank Honghai Shen and Mingyu Yang of Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, for their support with the ASD data.

Disclosures

The author declares no conflicts of interest.

References

1. A. F. H. Goetz, G. Vane, J. E. Solomon, and B. N. Rock, “Imaging spectrometry for earth remote sensing,” Science 228(4704), 1147–1153 (1985).

2. A. F. H. Goetz, “Three decades of hyperspectral remote sensing of the earth: A personal view,” Remote Sens. Environ. 113, S5–S16 (2009).

3. J. M. Bioucasdias, A. Plaza, G. Campsvalls, P. Scheunders, N. M. Nasrabadi, and J. Chanussot, “Hyperspectral remote sensing data analysis and future challenges,” IEEE Geosci. Remote Sens. Mag. 1(2), 6–36 (2013).

4. A. Plaza, J. A. Benediktsson, J. W. Boardman, J. Brazile, L. Bruzzone, G. Campsvalls, J. Chanussot, M. Fauvel, P. Gamba, and A. J. Gualtieri, “Recent advances in techniques for hyperspectral image processing,” Remote Sens. Environ. 113, S110–S122 (2009).

5. V. V. Kozoderov, E. V. Dmitriev, and A. Sokolov, “Improved technique for retrieval of forest parameters from hyperspectral remote sensing data,” Opt. Express 23(24), A1342 (2015).

6. T. Adao, J. Hruska, L. Padua, J. Bessa, E. Peres, R. Morais, and J. J. Sousa, “Hyperspectral imaging: A review on uav-based sensors, data processing and applications for agriculture and forestry,” Remote Sens. 9(11), 1110 (2017).

7. P. Gattinger, J. Kilgus, I. Zorin, G. Langer, R. Nikzadlangerodi, C. Rankl, M. Groschl, and M. Brandstetter, “Broadband near-infrared hyperspectral single pixel imaging for chemical characterization,” Opt. Express 27(9), 12666–12672 (2019).

8. H. Cen and R. Lu, “Optimization of the hyperspectral imaging-based spatially-resolved system for measuring the optical properties of biological materials,” Opt. Express 18(16), 17412–17432 (2010).

9. M. Uzair, A. Mahmood, F. Shafait, C. Nansen, and A. Mian, “Is spectral reflectance of the face a reliable biometric,” Opt. Express 23(12), 15160–15173 (2015).

10. M. A. Martinez, E. M. Valero, J. Nieves, R. Blanc, E. Manzano, and J. L. Vilchez, “Multifocus hdr vis/nir hyperspectral imaging and its application to works of art,” Opt. Express 27(8), 11323–11338 (2019).

11. J. Gualtieri and R. Cromp, “Support vector machines for hyperspectral remote sensing classification,” in “Proceedings of the 27th AIPR Workshop on Advances in Computer Assisted Recognition,” (Washington DC, 1998), pp. 121–132.

12. G. Campsvalls and L. Bruzzone, “Kernel-based methods for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sensing 43(6), 1351–1362 (2005).

13. S. Matteoli, M. Diani, and G. Corsini, “A tutorial overview of anomaly detection in hyperspectral images,” IEEE Aerosp. Electron. Syst. Mag. 25(7), 5–28 (2010).

14. B. Guo, S. R. Gunn, R. I. Damper, and J. D. B. Nelson, “Band selection for hyperspectral image classification using mutual information,” IEEE Geosci. Remote Sensing Lett. 3(4), 522–526 (2006).

15. G. Franchi and J. Angulo, “Morphological principal component analysis for hyperspectral image analysis,” ISPRS Int. J. Geo-Inf. 5(6), 83 (2016).

16. L. O. Jimenez and D. A. Landgrebe, “Hyperspectral data analysis and supervised feature reduction via projection pursuit,” IEEE Trans. Geosci. Remote Sensing 37(6), 2653–2667 (1999).

17. D. Landgrebe, “On information extraction principles for hyperspectral data: A white paper,” Technical Report, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN (1997).

18. R. Willett, M. F. Duarte, M. A. Davenport, and R. G. Baraniuk, “Sparsity and structure in hyperspectral imaging : Sensing, reconstruction, and target detection,” IEEE Signal Process. Mag. 31(1), 116–126 (2014).

19. M. A. Martinezdomingo, E. M. Valero, J. Hernandezandres, S. Tominaga, T. Horiuchi, and K. Hirai, “Image processing pipeline for segmentation and material classification based on multispectral high dynamic range polarimetric images,” Opt. Express 25(24), 30073–30090 (2017).

20. S. Tominaga, “Dichromatic reflection models for a variety of materials,” Color Res. Appl. 19(4), 277–285 (1994).

21. B. Guo, H. Shen, and M. Yang, “Improving hyperspectral image classification by fusing spectra and absorption features,” IEEE Geosci. Remote Sensing Lett. 14(8), 1363–1367 (2017).

22. B. Guo, “Entropy-mediated decision fusion for remotely sensed image classification,” Remote Sens. 11(3), 352 (2019).

23. B. Guo, “Hyperspectral image classification via matching absorption features,” IEEE Access 7, 131039 (2019).

24. J. M. Hollas, Modern Spectroscopy, 4th Edition (John Wiley & Sons, Inc, 2003).

25. K. Banerjee and M. K. Jain, “Copper ore identification using spectral similarity measurement from hyperion image, mapping of porphyry copper mineralized zone,” J. Geol. Soc. India 91(2), 239–247 (2018).

26. T. Hastie and R. Tibshirani, “Discriminant adaptive nearest neighbor classification,” IEEE Trans. Pattern Anal. Machine Intell. 18(6), 607–616 (1996).

27. Y. Sohn and N. S. Rebello, “Supervised and unsupervised spectral angle classifiers,” Photogramm. Eng. Remote Sens. 68(12), 1271–1282 (2002).

28. “Asd visible, nir (and swir) spectrometers,” https://www.malvernpanalytical.com/en/products/product-range/asd-range/. Accessed April 15, 2019.

Table 2 data: accuracy (%) of each method by class, AVIRIS 92AV3C data set.
Class	$k$-NN	SAM	Proposed
Alfalfa	95.56	95.56	0
Corn (notill)	31.91	30.54	60.29
Corn (min)	30.93	31.68	40.87
Corn	23.91	19.57	20.43
Grass Pasture	3.20	3.41	65.88
Grass trees	62.85	66.38	98.16
Grass Pasture (mowed)	92.59	92.59	0
Hay (windrowed)	34.05	31.03	99.78
Oats	42.11	42.11	0
Soybeans (notill)	14.32	14.10	30.86
Soybeans (min)	47.54	51.99	92.31
Soybean (clean)	18.61	18.26	35.13
Wheat	69.85	69.35	0
Woods	52.81	71.31	97.23
Building Drives etc	1.07	0.80	43.05
Stone steel towers	86.67	85.56	70.0
Overall accuracy (%)	37.05	40.22	68.26


Optics Express

Table 1 data: accuracy (%) of each method by class, ASD data set.
Class	SAM	$k$-NN	Unique Absorption	Proposed
Aluminum	100	100	100	100
Carbon fiber	100	100	100	100
Polyester film (yellow)	100	100	100	100
Polyester film (white)	100	100	100	100
Mono-crystalline silicon	81.82	81.82	100	100
Silicon dioxide	66.04	64.22	100	100
Titanium	100	100	100	100
Overall accuracy (%)	91.11	90.63	100	100