Strategy for reducing the effect of surface fluctuation in the classification of aluminum alloy via data transfer and laser-induced breakdown spectroscopy

Jing Chen; Jing Chen; Yu Ding; Yu Ding; Yu Ding; Ao Hu; Ao Hu; Wenjie Chen; Wenjie Chen; Yufeng Wang; Yufeng Wang; Meiling Zhao; Meiling Zhao; Yan Shu; Yan Shu

doi:10.1364/OE.507787

1 Introduction

The demand for aluminum has drastically increased owing to the rapid industrial development; its light weight, high strength, good corrosion resistance, and high conductivity make aluminum an attractive choice for many products, including food packaging, car parts, airplane components, and building features [1]. Due to the production of large amounts of aluminum scrap and substantial difference in energy consumption (producing 1 kg of recycled aluminum requires on average 9.2 mJ, compared to 144.6 mJ for producing 1 kg of primary aluminum), companies and policymakers increasingly focus on aluminum recycling as a potential solution for meeting said demand requirements [2]. The European aluminum industry has set a circular aluminum action plan and a Vision 2050 report to improve the collection, sorting, and recycling of European post-consumer scrap [3]. Therefore, it is important to establish an efficient method for sorting and recycling aluminum scrap, as it not only conforms to the demands of relevant policies but also addresses critical issues related to aluminum resource waste and energy consumption.

For the detection field of alloy materials, the prevalent methods can be categorized into conventional separation techniques (such as magnets [4–6], eddy currents [7,8], and heavy media [9–11]) and emerging spectroscopic detection methods (including X-ray fluorescence spectroscopy [12–14] and atomic absorption spectroscopy [15–17]). The minimal differences in the physical properties of aluminum alloys result in poor performance of conventional separation techniques [18]. Consequently, a significant amount of high-quality aluminum alloy resources are downgraded and wasted, resulting in substantial resource inefficiency. Moreover, these techniques require thermal and chemical pre-treatment etching steps, which raise environmental and cost concerns [19]. Current spectroscopic detection methods have advantages, such as low detection limits and high sensitivity. However, these methods often require laborious sample pre-treatment procedures and the formulation of pertinent solutions, which can potentially lead to secondary pollution [20].

Laser-induced breakdown spectroscopy (LIBS) is an analytical technique that uses laser-induced plasma to generate emission spectra for elemental analysis. LIBS is based on the generation of a plasma plume via laser ablation, which produces characteristic emission spectra. LIBS enables the identification and quantification of elements in a sample by analyzing the emitted light [21]. LIBS has been widely utilized for the identification and detection of diverse substances, including alloys [22,23], plant materials [24,25], food [26,27], and soils [28,29]. As a fast, online, nonpolluting, and cost-effective technique, it is highly suitable for the pertinent analysis of alloy materials.

Scholars have conducted numerous studies on alloy classification based on the LIBS technology. In 2018, Zhan et al. [30] introduced a rapid classification method for Al alloys based on LIBS and the Random Forest (RF) algorithm; results indicated that the RF model had the highest accuracy at 98.45%. In 2021, Dai et al. [31] studied the LIBS technique combined with principal component analysis (PCA) and a least-squares support-vector machine (LSSVM) algorithm to classify and identify five types of aluminum alloys. The identification accuracies of the support-vector machine (SVM) and LSSVM for the aluminum alloy were 98.33% and 100%, respectively. In 2022, Harefa et al. [32] classified five different types of aluminum alloys rapidly and noninvasively utilizing a manifold dimensionality reduction technique and an SVM classifier model integrated with LIBS technology; the accuracy of the model was 96.67%. In 2022, Guo et al. [33] combined the wavelet transform (WT) method with LSSVM to the classify and identify of aviation alloys using LIBS; the accuracies of the training set and the test set were 99.98% and 99.56%, respectively. In 2023, Dillam et al. [3] proposed a novel multi-sensor solution for the complex task of sorting aluminum post-consumer scrap into alloy groups with LIBS using Deep Learning models. The single-output model performed best in separating aluminum post-consumer scrap, with a precision of 99%.

Based on these studies, a combination of LIBS and machine learning can be used to effectively classify alloy materials. However, for LIBS measurements with sample surface fluctuations, it is difficult to consistently and precisely maintain the laser and fiber focus on the sample surface, and fluctuations in the focus severely affect the stability of the generated spectrum. The fluctuation of the sample surface causes the difference of the data distribution, which leads to the reduction of the predictive performance of traditional machine learning models. Simultaneously, there is redundancy in LIBS spectral features, with numerous irrelevant features posing a challenge to the predictive performance of the model. Therefore, a data processing method is needed to address the issues related to data distribution and feature redundancy for improving the predictive performance of traditional models [34]. Aisyah et al. [35] revealed that due to significant differences in the distribution of near-infrared spectral (NIRS) from mangoes harvested during different seasons, the partial least squares (PLS) model is inadequate for accurately predicting mango dry matter content. Consequently, method that joint distribution adaptation (JDA) was proposed to improve the performance of PLS. A comparison of the traditional PLS with JDA-PLS shows that R_P² improves from 0.794 to 0.826, and RMSEP decreases from 1.61 to 1.139. These results indicate that the application of JDA effectively reduces the distribution differences among spectral data, thereby improving the predictive performance of the PLS model. Additionally, in terms of feature selection, Chao et al. [36] combined Mutual information-induced interval selection with kernel partial least squares (MIKPLS) for near-infrared spectral (NIRS) calibration. Based on a publicly available beer dataset, the RMSEP decreased from 0.4357 to 0.2303 after MI feature selection. It indicates that the model performance can be improved after MI feature selection. Considering the similarity between LIBS spectra and near-infrared spectra, JDA and MI were introduced to reduce the difference of spectral data distribution and the redundancy of features for improving the predictive performance of the model.

In this study, an existing laboratory LIBS setup was used to simulate the uneven surface environment of aluminum alloy scrap. The focus of the laser and fiber was fixed on the sample surface, whereas the other experimental conditions remained unchanged. The displacement platform was gradually adjusted up and down at 0.5 mm intervals along the z-axis until an invalid spectrum was obtained. Eleven spectral datasets at different heights were collected to develop and validate the classification model for aluminum alloys in different series. Then, JDA and MI were used to reduce the distribution difference between spectral data at different heights and the redundancy of LIBS spectral features, for improving the predictive performance of traditional models. The performance of the model was evaluated using metrics such as accuracy, recall, F1-score, confusion matrix, and AUC value.

2 Materials and methods

2.1 Experimental setup

The experimental setup is illustrated in Fig. 1 (a) and consists of a laser, mirrors, spectrometer, focusing lens, fiber, delay generator, three-dimensional stage, sample, and computer. A Q-switched Nd: YAG laser (Dawa_200, Beamtech) with a wavelength of 1064 nm was used in this experiment.

Fig. 1. Schematic diagram of the LIBS system. (a) Construction of the LIBS setup. (b) Setting of the focus point position.

Download Full Size | PDF

A delay generator was used for the time control of the laser and spectrometer to eliminate continuous radiation from the LIBS plasma. During the experiment, the aluminum alloy was positioned directly on a three-dimensional platform. A lens with a focal length of 100 mm was used to concentrate the laser pulses onto the sample surface, resulting in plasma generation. Subsequently, the radiation emitted from the plasma was collected using a fiber- optic probe coupled to a spectrometer (Avantes, AvaSpec-ULS2048-2-USB2, 198 ∼ 400 nm, spectral resolution is 0.07 nm; AvaSpec-ULS4096CL-EVO, 400 ∼ 938 nm, spectral resolution is 0.3 nm). Spectra within the wavelength range of 198–938 nm were displayed and stored on a computer; each spectrum consisted of 8190 data points. To enhance signal quality, we optimized the operating conditions as follows: the repetition rate and pulse energy were 1 Hz and 105 mJ, respectively. The delay time between the laser and spectrometer was 0.7 us, integration time of the spectrometer was 1.05 ms.

2.2 Algorithm structure

2.2.1 Data transfer

Because of sample surface fluctuations, the intensity of the LIBS spectral signals fluctuates accordingly. Therefore, data are distributed at different heights, making traditional models ineffective. Traditional machine learning models assume that the training data and test data have similar distributions. Inconsistent data distributions can lead to differences in features between the training and test data. Traditional machine learning models struggle to handle such feature disparities because they cannot automatically adapt to these variations. If the distributions are inconsistent, the model will fail to correctly capture the features and relationships in the test data, resulting in decreased performance [35,37]. To address this issue, a data transfer JDA is introduced. JDA addresses the distribution discrepancy between the source and target domains by adjusting their data distributions, thereby reducing the discrepancy in the feature space [38].

The fundamental concept of the JDA method is to maximize the differences in the marginal distributions and minimize the differences in their conditional distributions simultaneously. Initially, JDA aims to decrease the difference between the source and target domains by maximizing the difference in the marginal distributions, which is accomplished by maximizing the maximum mean discrepancy (MMD) between the two domains. As a measurement method, the MMD can effectively quantify the distribution difference between the source and target domains in the feature space and evaluate the dissimilarity of the distribution between them. The marginal distribution adaptation is represented by Eq. (1), where A represents the transformation matrix, n and m denote the number of samples in the source and target domains, respectively, and M₀ denotes the MMD matrix for the marginal distribution adaptation. Furthermore, JDA ensures the coherence of the source and target domains by minimizing the divergence in the conditional distributions within each domain. This is accomplished by mapping the data from the source and target domains onto a common feature space, where the discrepancy in the conditional distributions is minimized. During conditional distribution adaptation, the JDA utilizes data from the source domain (x_s, y_s) to train a KNN classifier. Subsequently, this classifier is directly applied to the target domain (x_t, y_t) to obtain pseudo-labels. Then, the MMD distance is calculated based on these pseudo-labels and iteratively updated to achieve more accurate predictions. The adaptation of the conditional distributions is described by Eq. (2), where A represents the transformation matrix, n_c indicate the number of samples from class c in the source domains, m_c represents the number of samples in the target domains with pseudo-labels as class c. Mc denotes the MMD matrix utilized for the conditional distribution adaptation.

(1)$$\begin{array}{c} D({A^T}{x_i},{A^T}{x_j}) = \left\|{\frac{1}{n}\sum\limits_{i = 1}^n {{A^T}{x_i} - \frac{1}{m}\sum\limits_{j = n + 1}^{n + m} {{A^T}{x_j}} } } \right\|_H^2\\ = tr({A^T}X{M_0}{X^T}A)\\ {({M_0})_{ij}} = \left\{ {\begin{array}{c} {\frac{1}{{{n^2}}},{x_i},{x_j} \in {D_s}}\\ {\frac{1}{{{m^2}}},{x_i},{x_j} \in {D_t}}\\ { - \frac{1}{{mn}},otherwise} \end{array}} \right. \end{array}$$

(2)$$\begin{array}{c} D({A^T} {{x_s}} |{y_s},{A^T} {{x_t}} |{y_t}^\prime ) = \sum\limits_{c = 1}^C {\left\|{\frac{1}{{{n_c}}}\sum\limits_{{x_{{s_i}}} \in D_s^{(c)}} {{A^T}{x_{{s_i}}} - \frac{1}{{{m_c}}}\sum\limits_{{x_{{t_i}}} \in D_t^{(c)}} {{A^T}{x_{{t_i}}}} } } \right\|} _H^2\\ = \sum\limits_{c = 1}^C {tr({A^T}X{M_c}{X^T}A)} \\ {({M_c})_{ij}} = \left\{ {\begin{array}{c} {\frac{1}{{{n_s}(c){n_s}(c)}},{x_i},{x_j} \in D_s^{(c)}}\\ {\frac{1}{{{n_t}(c){n_t}(c)}},{x_i},{x_j} \in D_t^{(c)}}\\ {\frac{{ - 1}}{{{n_s}(c){n_t}(c)}},\left\{ {\begin{array}{c} {{x_i} \in D_s^{(c)},{x_j} \in D_t^{(c)}}\\ {{x_j} \in D_s^{(c)},{x_i} \in D_t^{(c)}} \end{array}} \right.}\\ {0,otherwise} \end{array}} \right. \end{array}$$

The overall optimization problem of the domain adaptation can be solved using an alternating minimization approach. By iteratively optimizing the feature mapping and classifier in the source and target domains, the final mapping and classifier are designed to jointly maximize the marginal distribution discrepancy and minimize the conditional distribution discrepancy.

2.2.2 Feature selection

The number of features in this study significantly exceeded the sample size. This posed a challenge, as the model is susceptible to overfitting the training data, thereby hindering its ability to effectively generalize to new data [34]. The LIBS spectral data commonly include redundant, noisy, or irrelevant information. Consequently, the adoption of feature selection or dimensionality reduction techniques is imperative for enhancing the robustness and accuracy of LIBS prediction models.

In this study, a spectral selection algorithm known as mutual information was employed. The mutual information algorithm offers an estimate of the mutual information between each feature and target variable. Mutual information values were considered as the scores for each individual feature; the resulting values from this algorithm fall within the range of 0 to 1. Features with higher scores were selected and incorporated into the model-training process. Using the MI algorithm, features with high contributions can be selected to improve the efficiency of the modeling process.

2.3 Data preparation

In this study, there are 6 different series, comprising a total of 14 aluminum alloy samples used for experiments, as shown in Table 1. When creating the classification labels, 1xxx series, 2xxx series, 3xxx series, 5xxx series, 6xxx series, and 7xxx series were respectively labeled as 1#, 2#, 3#, 4#, 5#, and 6#. In the context of the following description, 1#, 2#, 3#, 4#, 5#, and 6# are used to represent the series of aluminum alloys. Figure 1(b) shows the setting of the focus point position. The focal point at the position on the sample surface exactly was taken as the baseline (B). Then, keeping experimental conditions unchanged, the three-dimensional platform was only moved up and down along the z-axis by 0.5 mm, 1 mm, 1.5 mm, 2 mm and 2.5 mm, respectively. Furthermore, those moving positions were denoted as B, B ± 0.5, B ± 1.0, B ± 1.5, B ± 2.0, and B ± 2.5. At position B, initially, 200 spectral data were collected for each of the 14 aluminum alloy samples, resulting in a total of 2800 spectral data, which were taken as train dataset A, then, 50 spectra were collected again for each aluminum alloy, resulting in 700 spectral data were taken as test dataset B. Subsequently, for positions B ± 0.5, B ± 1.0, B ± 1.5, B ± 2.0, and B ± 2.5, 50 spectral data were collected for each aluminum alloy sample, resulting in 700 spectral data for each position, and these spectral data were designated as test dataset B ± 0.5, B ± 1.0, B ± 1.5, B ± 2.0, and B ± 2.5. The specific information of dataset is listed in Table 2.

Table 1. Series and grades of aluminum alloys

View Table | View all tables in this article

Table 2. Division of the experimental dataset.

View Table | View all tables in this article

3 Results and discussions

3.1 Qualitative analysis

The wavelength and corresponding intensity in the spectra can be employed to qualitatively depict essential information regarding the elemental composition of the analyzed sample. The spectral wavelength range of the Al alloy samples in this study was 198–938 nm. Figure 2 illustrates the spectral graph of aluminum alloys. Taking the aluminum alloy block as an example and referring to the NIST database, it is evident that the characteristic lines of the aluminum alloy excited in this experiment are primarily Mn I at 257.61 nm, Mg I at 285.13 nm, Si I at 288.15 nm, Cu I at 324.75 nm, Cr I at 357.86 nm, Al I at 394.40 nm, Al I at 396.15 nm, Na I at 589.59 nm, and so forth. Figure 3 shows a comparative representation of the spectra of the aluminum alloys in various series. Each series of aluminum alloys contains different characteristic elemental compositions, leading to differences in the LIBS spectra of the different series.

Fig. 2. LIBS spectra of a representative aluminum alloy series.

Download Full Size | PDF

Fig. 3. Comparative representation of representative LIBS spectra for different series of aluminum alloys.

Download Full Size | PDF

Figure 4 compares the spectra of the aluminum alloys at various heights. The laser and fiber-optic probe were focused on the surface of the aluminum alloy sample and all other experimental conditions were kept constant. Taking the example of an aluminum alloy in 1series, Fig. 4 demonstrates that the spectral intensity is highest when the laser and fiber optic probe are focused on position B. As the height is adjusted up and down in increments of 0.5 mm, the spectral intensity gradually decreases. The spectra at positions B ± 3.0, B ± 3.5, and B ± 4.0 did not display clear characteristic peaks. Consequently, these spectra could not accurately reflect the elemental information of the Al alloy. Accordingly, these positions were classified as invalid.

Fig. 4. Comparison of spectra at different heights. (a) Spectrum at B. (b) Spectrum from B + 0.5 to B + 4.0. (c) Spectrum from B-0.5 to B-4.0.

Download Full Size | PDF

3.2 Analysis of the traditional classification model

In this study, dataset A, which comprised 2800 spectra, was employed as the initial training set to develop a corresponding prediction model using a traditional algorithm (KNN, K-nearest neighbor). The target domain datasets B, B ± 0.5, B ± 1.0, B ± 1.5, B ± 2.0, and B ± 2.5 were used as the testing set to evaluate and analyze the performance of the prediction model.

Figure 5 illustrates the confusion matrix of the KNN model when applied to datasets of different heights. The confusion matrix presents a visual representation of the precise classification accuracy for each label within the aluminum alloy series. The values along the diagonal of the confusion matrix represent the number of correctly classified samples, whereas the other values indicate the number of misclassified samples. The confusion matrix data for B is relatively concentrated on the diagonal but exhibits some off-diagonal elements. This implies that while the model is generally effective at predicting B, there are instances of 32 misclassifications. For B ± 0.5, B ± 1.0, B-1.5, the confusion matrix data shows a noticeable departure from the diagonal line. This indicates a significant challenge in correctly classifying those datasets, with a higher number of misclassifications. The confusion matrix data for B + 1.5, B ± 2.0, and B ± 2.5 demonstrates a considerable departure from the diagonal, indicating substantial misclassifications were included. The confusion matrices illustrate the model's varying performance across different datasets. The extent of misclassification is most pronounced for B + 1.5, B ± 2.0, and B ± 2.5 indicating that these data pose greater challenges for the model due to the shifting data distribution with height.

Fig. 5. Confusion matrix of the KNN model on different datasets.

Download Full Size | PDF

The predictive performance of the traditional models at different heights is presented in Table 3. The KNN model achieved prediction accuracies of 11.48%, 19.71%, 30.57%, 45.71%, 53.57%, 88.28%, 52.57%, 21.42%, 14.42%, 14.42%, and 14.42% for the datasets at corresponding heights. Compared with the prediction accuracy of the central location dataset B, the prediction accuracy of the KNN model on two extreme height datasets, B-2.5 and B + 2.5, decreased by 86.99% and 83.66%, respectively. The results show that the KNN model trained on Dataset A exhibits promising predictive performance only for Dataset B; this suggests that optimal prediction results were obtained when the training and prediction sets were at the same height. As the height increased or decreased, the predictive performance of the model gradually decreased.

Table 3. Comparison of prediction performance of KNN model at different heights.

View Table | View all tables in this article

As the height increased or decreased, the intensity of the spectral data changed, leading to a change in the distribution of the collected data. Figure 6 shows the Quantile-Quantile plot (Q-Q plot) of the data, with the horizontal axis representing the theoretical quantiles (quantiles of dataset A) and the vertical axis representing the practical quantiles (quantiles of datasets B, B ± 0.5, B ± 1.0, B ± 1.5, B ± 2.0, and B ± 2.5). When the points in the graph are aligned with the red line, they indicate a coincident data distribution, whereas a deviation from the red line indicates a disparity in the data distribution. It can be observed that the B dataset is closest to the red line, indicating a relatively consistent data distribution with training set A. As the height increases or decreases, the deviation from the red line becomes more pronounced, resulting in larger differences in data distribution. Among these, B + 2.0 and B ± 2.5 exhibit more significant deviations from the red line, indicating greater disparities in data distribution.

Fig. 6. Proof of differences in data distribution.

Download Full Size | PDF

In order to enhance the data's expressiveness, PCA with two principal components is used for clustering visualization of data. As illustrated in Fig. 7, PC1 corresponds to the X-axis, and PC2 corresponds to the Y-axis, resulting in a two-dimensional scatter plot. Notable discrimination is observed mainly between the 3# and 4# in the B dataset, while other labels exhibit overlapping clusters that are challenging to differentiate effectively. However, as the height increases or decreases, the clustering performance of samples 3# and 4# gradually weakens, eventually overlapping completely at B ± 2.0 and B ± 2.5. These findings suggest that the quality of clustering deteriorates rapidly with increasing or decreasing height.

Fig. 7. Spectral data visualization by PCA.

Download Full Size | PDF

3.3 Analysis of the data transfer classification model

In order to address the issue of inconsistent distributions between the training and testing datasets, which can lead to a decrease in model prediction performance. A data transfer method was introduced, and the original training set and testing set was renamed as the source domain and target domain, respectively. Using the data transfer method, the data distributions of the source and target domains were adjusted to increase similarity in the feature space. This adjustment ultimately improved the learning performance in the target domain. Due to the core objective of the JDA algorithm, which is to align the data distributions between the source and target domains, it does not make use of self-supervised classification techniques. The KNN classifier was employed within the JDA framework, and the spectral data of the aluminum alloy were used as inputs to construct the JDA (KNN) data transfer model. To validate the effectiveness of the data transfer method, the data transfer model and traditional models were compared in terms of their predictive performance for each dataset.

The JDA method involves two primary parameters: the dimensionality of the feature space and the regularization parameter for domain adaptation. These parameters serve the purpose of determining the target dimension to which the input features are mapped, and regulating the extent of domain adaptation between the source and target domains. Figure 8 shows the process of parameter optimization in the JDA (KNN) model. Figure 8(a) and (b) depict the optimization process for the dimensionality of the feature space. The parameter tuning range for the dimensionality of the feature space, denoted as k, was [10, 20, …, 200]. Adjusting the dimensionality of the feature space enables precise data reconstruction. Figure 8(c) and (d) illustrate the optimization process for the regularization parameter λ in the JDA framework. Theoretically, a larger λ value places more emphasis on the shrinkage regularization in JDA. When λ approaches 0, the optimization problem becomes ill-posed. Conversely, when λ approaches infinity, no distribution adaptation is performed, and JDA is unable to construct robust representations for cross-domain classification. Hence, in this study, the λ regularization parameter was tuned within the range of [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10]. The research indicates that the optimal K values for B-2.5, B-2.0, …, B, …, B + 2.0, B + 2.5 are 70, 70, 80, 60, 70, 10, 60, 70, 50, 90, 60. The optimal λ values for B-2.5, B-2.0, …, B, …, B + 2.0, B + 2.5 are 0.01, 0.1, 0.1, 0.05, 0.05, 0.1, 0.1, 0.5, 0.1, 0.1, 0.5. Then, the model is trained by using these optimized parameters.

Fig. 8. Parameter optimization for the JDA (KNN) model. (a) and (b). The optimization process for the dimensionality of the feature space. (c) and (d), The optimization process for the regularization parameter λ.

Download Full Size | PDF

To provide a more accurate representation of the distribution between the two datasets, we use the maximum mean discrepancy (MMD) within the JDA framework. The MMD is a commonly employed loss function in transfer learning, specifically in the context of domain adaptation, its primary purpose is to quantify the distance between two distinct yet interconnected variable distributions.

The optimized parameters were input into the model and a subsequent predictive analysis was performed for each target domain. Figure 9 shows the training iteration plot of the JDA (KNN) model, encompassing a total of 20 iterations. Figure 9 (a) and (b) show the progressive evolution of the MMD distance with increasing iterations. Initially, the MMD distance exhibited a significant reduction, followed by gradual convergence. On the other hand, Fig. 9 (c) and (d) show fluctuations in the accuracy as the number of iterations increases. Accuracy underwent an initial rapid improvement before gradually reaching a plateau. The MMD distances for B-2.5, B-2.0, …, B, …, B + 2.0, B + 2.5 stabilize at the 10th, 9th, 12th, 9th, 10th, 3rd, 11th, 9th, 8th, 8th, and 7th iterations, respectively. After 20 iterations, the MMD distances reach 99.48, 92.75, 71.52, 49.89, 42.01, 3.12, 39.98, 50.99, 55.12, 198.36, and 210.04, respectively. The prediction accuracy for B-2.5, B-2.0, …, B, …, B + 2.0, B + 2.5 also reaches 45.85%, 49.57%, 56.14%, 65.14%, 69.71%,96.28%, 71.85%, 61.42%, 66.28%28.42%, and 21.71% after 20 iterations. The study demonstrates that JDA effectively reduces the distance between two datasets, mitigates differences in data distribution, and enhances the predictive performance of traditional models.

Fig. 9. Iteration diagram of JDA (KNN) model. (a) and (b). The progressive evolution of the MMD distance with increasing iterations. (c) and (d). The progressive evolution of the accuracy with increasing iterations.

Download Full Size | PDF

Figure 10 shows the confusion matrix generated by the JDA (KNN) model for each height dataset. The confusion matrix data for B exhibits a relatively concentrated distribution on the diagonal, but there are still 23 misclassifications. In the case of B ± 1.0 and B ± 1.5, after undergoing JDA (KNN) processing, the confusion matrix data gradually tends towards the diagonal, yet the number of misclassifications remains relatively high. As for B ± 2.0 and B ± 2.5, the confusion matrix data still maintains a substantial number of misclassifications. However, when compared to KNN, they demonstrate a superior ability to effectively distinguish between the five series of aluminum alloys. An examination of the performance comparisons is presented in Table 4. It is evident that the JDA (KNN) model improved its predictive performance after adjusting the data distribution between the source and target domains through JDA. Additionally, the JDA (KNN) model significantly improved its predictive accuracy compared to KNN across various height datasets, with the accuracies increasing from 11.48%, 19.71%, 30.57%, 45.71%, 53.57%, 88.28%, 52.57%, 21.42%, 14.42%, 14.42%, and 14.42% to 45.85%, 49.57%, 56.14%, 65.14%, 69.71%, 96.28%, 71.85%, 61.42%, 66.28%, 28.42%, and 21.71%, respectively. Furthermore, the JDA (KNN) model demonstrated a discrepancy of 50.43% and 74.57% in predictive accuracy when compared to the central position on the datasets of the two marginal heights.

Fig. 10. Confusion matrix of JDA (KNN) model on different datasets.

Download Full Size | PDF

Table 4. Comparison of prediction performance of JDA (KNN) model at different heights

View Table | View all tables in this article

The Fig. 11 showcases the spectral data visualization using PCA after the JDA. A significant enhancement in the clustering performance was apparent compared to the original data visualization using PCA. Compared to the clustering performance of the original spectral data, the data processed through JDA has shown significant improvements in clustering, particularly for B and B ± 0.5. In the figures, although there are still some overlapping data points within the 3#, 4#, 5#, and 6#, they have formed clearer clusters. However, for other height data, the clustering performance of data processed by JDA processing does not exhibit a notable enhancement compared to the original spectral data. Therefore, further data processing is required to improve classification accuracy.

Fig. 11. Spectral data visualization employing PCA after JDA.

Download Full Size | PDF

3.4 Analysis of the data transfer classification model combined with feature selection

LIBS spectral data encompass noise and interference signals that can originate from the instrument, environment, or the samples themselves. To address this issue, feature selection techniques can be used to identify the most relevant and informative features. This approach reduces the data dimensionality, simplifies the analysis process, and facilitates model construction.

MI was employed to select the characteristic variables from the aluminum alloy spectra, capturing the relationships between each feature and its corresponding labels. Figure 12 shows the feature selection process for the aluminum alloy spectra obtained using MI. The bar chart displays the number of feature variables, and the line graph illustrates the accuracy.

Fig. 12. Selection process of spectral feature variables.

Download Full Size | PDF

The spectral signals of the aluminum alloy covered a wavelength range of 198–938 nm, resulting in 8189 feature variables. In the feature selection process, KNN was utilized as the classifier in the MI algorithm. Dataset A was divided into training and validation sets using a 7:3 ratio. A new training set was employed to construct a spectral selection model using the MI algorithm, and a validation set was employed to determine the optimal threshold. MI is used to eliminate irrelevant features by applying various selection thresholds. When the threshold is set to 0.8, only 4 feature variables are remained. As the threshold increased further, all feature variables are considered irrelevant and are removed. Therefore, the threshold selection range is between 0.1 and 0.8. As depicted in Fig. 12, as the threshold increases, the number of feature variables gradually decreases, and the accuracy initially increased and then decreased. The accuracy peaked at a threshold of 0.6, reaching 98.71%. Through this feature selection process, the number of feature variables is reduced from the initial 8189 to just 117. The selected data were subsequently combined with the JDA (KNN) to establish the MI-JDA (KNN) model.

Figure 13 illustrates the iterative process of the MI-JDA (KNN) model comprising 20 iterations. Figure 13 (a) and (b) show the variations in the MMD distance as the number of iterations increased, whereas Fig. 13 (c) and (d) show the changes in accuracy throughout the iterations. After MI-JDA, the MMD distances for B-2.5, B-2.0, …, B, …, B + 2.0, B + 2.5 stabilize at the 8th, 6th, 7th, 4th, 2nd, 2nd, 3rd, 5th, 7th, 7th, and 9th iterations, respectively. After 20 iterations, the MMD distances reach 37.57, 27.45, 30.54, 19.15, 17.06, 2.12, 13.73, 15.99, 28.65, 37.57, and 41.37, respectively. The prediction accuracy for B-2.5, B-2.0, …, B, …, B + 2.0, B + 2.5 also reaches 78.14%, 82.28%, 80.14%, 89.71%, 91.85%, 98.42%, 94.28%, 92.42% 82.14%, 78.57%, and 73.71% after 20 iterations. Compared to JDA, the MMD distances for B-2.5, B-2.0, …, B, …, B + 2.0, B + 2.5 decreased by 62.23%, 70.40%, 57.29%, 61.61%, 59.39%, 32.05%, 65.65%, 68.64%, 48.02%, 81.05%, and 80.30%, after MI-JDA processing. The study indicates that MI-JDA significantly reduces the data distribution differences between the source and target domains, thereby enhancing the predictive accuracy of traditional models.

Fig. 13. Iteration diagram of MI-JDA (KNN) model. (a) and (b). The progressive evolution of the MMD distance with increasing iterations. (c) and (d). The progressive evolution of the accuracy with increasing iterations.

Download Full Size | PDF

Figure 14 shows the confusion matrix of the MI-JDA (KNN) model for the target domain dataset. The confusion matrix data for B is primarily concentrated on the diagonal, with only 11 misclassifications. In comparison to JDA (KNN), MI-JDA (KNN) exhibits a notably more concentrated distribution of prediction confusion matrix data across all datasets, particularly in the case of B ± 2.5, where the false positives have decreased from 548 and 255 to 184 and 153, respectively. These results indicate that MI-JDA (KNN) achieves the highest predictive accuracy and can effectively handle variations in sample surface height. Considering the performance comparison in Table 5, it is evident that the predictive performance of the MI-JDA (KNN) model across all height datasets (ranging from low to high) was higher than that of the JDA (KNN) model. The accuracy values progressively increase from 45.85%, 49.57%, 56.14%, 65.14%, 69.71%, 96.28%, 71.85%, 61.42%, 66.28%, 28.42%, and 21.71% to 78.14%, 82.28%, 80.14%, 89.71%, 91.85%, 98.42%, 94.28%, 92.42%, 82.14%, 78.57%, and 73.71%. The MI-JDA (KNN) model showed noticeable disparities of 20.28% and 24.71% in the predictive accuracies between the datasets of the two marginal heights and the central position. Notably, the MI-JDA (KNN) model exhibited substantial enhancements in predictive performance at the B + 2.0 and B + 2.5 positions. The findings of this study suggest that JDA is more effective in adjusting the data distribution after employing an MI-based feature selection.

Fig. 14. Confusion matrix of MI-JDA (KNN) model on different data sets

Download Full Size | PDF

Table 5. Comparison of prediction performance of MI-JDA (KNN) model at different heights

View Table | View all tables in this article

Figure 15 shows the spectral data visualization using PCA after the MI-JDA model. The clustering performance is noticeably superior to that of the other two models. After applying MI-JDA processing of the data at the positions B ± 1.0, B ± 1.5, B ± 2.0, and B ± 2.5, the clustering performance is better than the original data and data processed by JDA.

Fig. 15. Spectral data visualization by PCA after using the MI-JDA model.

Download Full Size | PDF

3.5 Comparison of prediction performance

The performances of the three models are listed in Tables 2, 3, and 4. To provide a more comprehensive evaluation of the predictive performance of these three classification models, multiple evaluation metrics were introduced, including accuracy, recall, F1-score, and AUC value. Figure 16 displays a radar chart illustrating the performance of the three models for each height dataset; the performance values represent the accuracy of each model at the corresponding height. Although the KNN model demonstrated satisfactory predictive performance at height B, achieving an accuracy of 88.28%, its performance at the other heights was relatively poor, with accuracies below 55%. Consequently, by incorporating JDA to adjust the dataset distribution, predictive performance was enhanced. Compared to KNN, JDA (KNN) exhibited significant accuracy improvements across all height datasets, with increases of 299.39%, 151.49%, 83.64%, 42.50%, 30.12%, 9.06%, 36.67%, 186.74%, 359.63%, 97.08%, and 50.55%. However, the predictive performance of the JDA (KNN) model at the marginal heights fell short of the expected level.

Fig. 16. Performance of the three models.

Download Full Size | PDF

To further enhance predictive performance, MI was introduced to select the spectral features of the aluminum alloys. Compared to JDA (KNN), the MI-JDA (KNN) model showed accuracy improvements of 70.43%, 65.99%, 42.75%, 37.72%, 31.76%, 2.22%, 31.22%, 50.47%, 23.93%, 176.46%, and 239.52% on each height dataset, respectively. Compared with KNN, the MI-JDA (KNN) model exhibited accuracy improvements of 580.66%, 317.45%, 162.15%, 96.26%, 71.46%, 11.49%, 79.34%, 331.47%, 469.63%, 444.87%, and 411.17% for each height dataset. These comparisons indicate that both data transfer and feature selection contribute to improving predictive performance.

To further illustrate the effectiveness of the MI-JDA, we employed other classifiers, such as SVM and RF, to model and predict the aforementioned data. Table 6 provides the performance of the original data model and MI-JDA-optimized model for each height dataset. The comparison results demonstrate that MI-JDA can effectively process data features and improve the performance of traditional models.

Table 6. Performance of the original data model and the MI-JDA-optimized model applied to each height dataset

View Table | View all tables in this article

4 Conclusions

This study proposes a data transfer method coupled with LIBS for the classification of aluminum alloys with surface fluctuations. This method addresses the negative impact of spectral fluctuations on classification performance by combining feature selection with data transfer to reduce the disparity between the distributions of the testing and training data. Consequently, there is no need to adjust the laser and fiber focus in real time when there are changes in surface fluctuations, which reduces labor and time costs. Initially, we simulated the LIBS detection process for aluminum alloys with surface fluctuations. The focus was set on the surface of the sample, and the height of the sample stage was gradually adjusted at 0.5 mm intervals for data acquisition until an invalid spectrum was obtained. Eleven spectral datasets at different heights were collected and analyzed. Subsequently, the acquired dataset was employed to train and validate the traditional LIBS model and the model after data transfer. The traditional LIBS classification model is KNN; MI and JDA were used for feature selection and transfer, respectively. The KNN model (with predictive accuracies on each height dataset of 11.48%, 19.71%, 30.57%, 45.71%, 53.57%, 88.28%, 52.57%, 21.42%, 14.42%, 14.42%, and 14.42%, respectively) failed to effectively classify all heights of the aluminum alloy series. The predictive accuracy was 45.85%, 49.57%, 56.14%, 65.14%, 69.71%, 96.28%, 71.85%, 61.42%, 66.28%, 28.42%, and 21.71% on each height dataset after feature transfer. To improve predictive performance, feature selection was introduced to select 117 of 8,189 feature variables. The MI-JDA (KNN) model achieved predictive accuracies of 78.14%, 82.28%, 80.14%, 89.71%, 91.85%, 98.42%, 94.28%, 92.42%, 82.14%, 78.57%, and 73.71% on each height dataset, respectively. This study found that the MI-JDA (KNN) model was superior to the other models in terms of accuracy, recall, F1-score, and AUC. This indicates that the belived-to-be novel LIBS data processing method can offer a new technological reference for practical applications in the field of aluminum alloy inspection.

Funding

Natural Science Foundation of Fujian Province (2023J05303); National Natural Science Foundation of China (62105160).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. S. Pan, Y. Fu, Y. Wang, et al., “Research Progress of Al Alloy Semisolid Forming Technology,” Rare Met. Mater. Eng. 51, 3110–3120 (2022).

2. S. Van den Eynde, E. Bracquené, D. Diaz-Romero, et al., “Forecasting global aluminium flows to demonstrate the need for improved sorting and recycling methods,” Waste Manage. 137, 231–240 (2022). [CrossRef]

3. D. Diaz-Romero, S. Van den Eynde, I. Zaplana, et al., “Classification of aluminum scrap by laser induced breakdown spectroscopy (LIBS) and RGB plus D image fusion using deep learning approaches,” Resour., Conserv. Recycl. 190, 106865 (2023). [CrossRef]

4. J. Hannula, J. R. A. Godinho, A. A. Llamas, et al., “Simulation-Based Exergy and LCA Analysis of Aluminum Recycling: Linking Predictive Physical Separation and Re-melting Process Models with Specific Alloy Production,” J. Sustain. Metall. 6(1), 174–189 (2020). [CrossRef]

5. Y. R. Smith, J. R. Nagel, and R. K. Rajamani, “Eddy current separation for recovery of non-ferrous metallic particles: A comprehensive review,” Miner. Eng. 133, 149–159 (2019). [CrossRef]

6. D. S. Wong and P. Lavoie, “Aluminum: Recycling and Environmental Footprint,” JOM 71(9), 2926–2927 (2019). [CrossRef]

7. J. R. Nagel, D. Cohrs, J. Salgado, et al., “Electrodynamic Sorting of Industrial Scrap Metal,” KONA 37(0), 258–264 (2020). [CrossRef]

8. M. E. Zyazev, E. S. Lyampasova, Z. O. Abdullaev, et al., “Issues of induction sorting of scrap and waste of non-ferrous metals,” IOP Conf. Ser.: Mater. Sci. Eng. 950(1), 012017 (2020). [CrossRef]

9. F. Mennik, İ. N. Dinç, and F. Burat, “Selective recovery of metals from spent mobile phone lithium-ion batteries through froth flotation followed by magnetic separation procedure,” Results Eng. 17, 100868 (2023). [CrossRef]

10. X. Wang, W. Liu, F. Jiao, et al., “New insights into the mechanism of selective flotation of copper and copper-tin alloy,” Sep. Purif. Technol. 253, 117497 (2020). [CrossRef]

11. G. Z. Zha, C. F. Yang, Y. K. Wang, et al., “New vacuum distillation technology for separating and recovering valuable metals from a high value-added waste,” Sep. Purif. Technol. 209, 863–869 (2019). [CrossRef]

12. M. I. Khan, A. Fayyaz, S. Mushtaq, et al., “Optical and thermal characterization of pure CuO and Zn/CuO using laser-induced breakdown spectroscopy (LIBS), x-ray fluorescence (XRF), and ultraviolet–visible (UV–Vis) spectroscopy techniques,” Laser Phys. Lett. 20(8), 086001 (2023). [CrossRef]

13. M. A. Roxburgh, S. Heeren, D. J. Huisman, et al., “Non-Destructive Survey of Early Roman Copper-Alloy Brooches using Portable X-ray Fluorescence Spectrometry,” Archaeometry 61(1), 55–69 (2019). [CrossRef]

14. C. Vanhoof, J. R. Bacon, A. T. Ellis, et al., “2019 atomic spectrometry update - a review of advances in X-ray fluorescence spectrometry and its special applications,” J. Anal. At. Spectrom. 34(9), 1750–1767 (2019). [CrossRef]

15. Z. Feng, X. Cheng, C. Dong, et al., “Passivity of 316 L stainless steel in borate buffer solution studied by Mott–Schottky analysis, atomic absorption spectrometry and X-ray photoelectron spectroscopy,” Corros. Sci. 52(11), 3646–3653 (2010). [CrossRef]

16. C. Kuphasuk, Y. Oshida, C. J. Andres, et al., “Electrochemical corrosion of titanium and titanium-based alloys,” J. Prosthet. Dent. 85(2), 195–202 (2001). [CrossRef]

17. S. Xiao, G. Ren, M. Xie, et al., “Recovery of Valuable Metals from Spent Lithium-Ion Batteries by Smelting Reduction Process Based on MnO–SiO2–Al2O3 Slag System,” J. Sustain. Metall. 3(4), 703–710 (2017). [CrossRef]

18. J. Cui and H. J. Roven, “Recycling of automotive aluminum,” Trans. Nonferrous Met. Soc. China 20(11), 2057–2063 (2010). [CrossRef]

19. B. Campanella, E. Grifoni, S. Legnaioli, et al., “Classification of wrought aluminum alloys by Artificial Neural Networks evaluation of Laser Induced Breakdown Spectroscopy spectra from aluminum scrap samples,” Spectrochim. Acta, Part B 134, 52–57 (2017). [CrossRef]

20. R. R. Gamela, V. C. Costa, M. A. Sperança, et al., “Laser-induced breakdown spectroscopy (LIBS) and wavelength dispersive X-ray fluorescence (WDXRF) data fusion to predict the concentration of K, Mg and P in bean seed samples,” Food Res. Int. 132, 109037 (2020). [CrossRef]

21. F. Deng, Y. Ding, Y. J. Chen, et al., “Quantitative analysis of the content of nitrogen and sulfur in coal based on laser-induced breakdown spectroscopy: effects of variable selection,” Plasma Sci. Technol. 22(7), 074005 (2020). [CrossRef]

22. E. Mal, R. Junjuri, M. K. Gundawar, et al., “Optimization of temporal window for application of calibration free-laser induced breakdown spectroscopy (CF-LIBS) on copper alloys in air employing a single line,” J. Anal. At. Spectrom. 34(2), 319–330 (2019). [CrossRef]

23. H. Sattar, H. Ran, W. Ding, et al., “An approach of stand-off measuring hardness of tungsten heavy alloys using LIBS,” Appl. Phys. B 126(1), 5 (2020). [CrossRef]

24. P. Modlitbová, P. Pořízka, and J. Kaiser, “Laser-induced breakdown spectroscopy as a promising tool in the elemental bioimaging of plant tissues,” TrAC, Trends Anal. Chem. 122, 115729 (2020). [CrossRef]

25. G. Nicolodelli, J. Cabral, C. R. Menegatti, et al., “Recent advances and future trends in LIBS applications to agricultural materials and their food derivatives: An overview of developments in the last decade (2010–2019). Part I. Soils and fertilizers,” TrAC, Trends Anal. Chem. 115, 70–82 (2019). [CrossRef]

26. Y. Ding, J. Chen, W. J. Chen, et al., “A novel strategy for quantitative analysis of the energy value of milk powder via laser-induced breakdown spectroscopy coupled with machine learning and a genetic algorithm,” J. Anal. At. Spectrom. 38(2), 464–471 (2023). [CrossRef]

27. Y. Ding, W. Zhang, X. Q. Zhao, et al., “A hybrid random forest method fusing wavelet transform and variable importance for the quantitative analysis of K in potassic salt ore using laser-induced breakdown spectroscopy,” J. Anal. At. Spectrom. 35(6), 1131–1138 (2020). [CrossRef]

28. Y. Ding, G. Y. Xia, H. W. Ji, et al., “Accurate quantitative determination of heavy metals in oily soil by laser induced breakdown spectroscopy (LIBS) combined with interval partial least squares (IPLS),” Anal. Methods 11(29), 3657–3664 (2019). [CrossRef]

29. J. Ren, Y. Zhao, and K. Yu, “LIBS in agriculture: A review focusing on revealing nutritional and toxic elements in soil, water, and crops,” Comput. Electron. Agric. 197, 106986 (2022). [CrossRef]

30. L. Y. Zhan, X. H. Ma, W. Q. Fang, et al., “A rapid classification method of aluminum alloy based on laser-induced breakdown spectroscopy and random forest algorithm,” Plasma Sci. Technol. 21(3), 034018 (2019). [CrossRef]

31. Y. J. Dai, S. Y. Zhao, C. Song, et al., “Identification of aluminum alloy by laser-induced breakdown spectroscopy combined with machine algorithm,” Microw. Opt. Technol. Lett. 63(6), 1629–1634 (2021). [CrossRef]

32. E. Harefa and W. D. Zhou, “Laser-Induced Breakdown Spectroscopy Combined with Nonlinear Manifold Learning for Improvement Aluminum Alloy Classification Accuracy,” Sensors 22(9), 3129 (2022). [CrossRef]

33. H. R. Guo, M. C. Cui, Z. Q. Feng, et al., “Classification of Aviation Alloys Using Laser-Induced Breakdown Spectroscopy Based on a WT-PSO-LSSVM Model,” Chemosensors 10(6), 220 (2022). [CrossRef]

34. Q. Song and F. Liang, “High-Dimensional Variable Selection With Reciprocal L1-Regularization,” J. Am. Stat. Assoc. 110(512), 1607–1620 (2015). [CrossRef]

35. N. A. S. Suarin and K. S. Chia, “Transferring Near Infrared Spectroscopic Calibration Model Across Different Harvested Seasons Using Joint Distribution Adaptation,” Control, Instrumentation and Mechatronics: Theory and Practice 921, 707–716 (2022). [CrossRef]

36. C. Tan and M. Li, “Mutual information-induced interval selection combined with kernel partial least squares for near-infrared spectral calibration,” Spectrochim. Acta, Part A 71(4), 1266–1273 (2008). [CrossRef]

37. H. Zhao, X. Yang, B. Chen, et al., “Bearing fault diagnosis using transfer learning and optimized deep belief network,” Meas. Sci. Technol. 33(6), 065009 (2022). [CrossRef]

38. M. Long, J. Wang, G. Ding, et al., “Transfer Feature Learning with Joint Distribution Adaptation,” Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2013), pp. 2200–2207.

label	The series of aluminum alloys	The grades of aluminum alloys
1#	1xxx series	1050	1100
2#	2xxx series	2A12	2024
3#	3xxx series	3A21	3003
4#	5xxx series	5A06	5052	5083	5754
5#	6xxx series	6061	6063	6082
6#	7xxx series	7075

Model	Data set	Accuracy	Recall	F1-Score	AUC
KNN	B-2.5	11.48%	15.94%	6.79%	51.10%
	B-2.0	19.71%	20.30%	13.19%	57.28%
	B-1.5	30.57%	32.95%	28.75%	60.93%
	B-1.0	45.71%	39.08%	36.82%	70.96%
	B-0.5	53.57%	49.08%	45.35%	78.00%
	B	88.28%	87.80%	87.21%	97.26%
	B + 0.5	52.57%	52.99%	51.41%	81.40%
	B + 1.0	21.42%	20.36%	17.11%	57.38%
	B + 1.5	14.42%	16.66%	4.43%	50.00%
	B + 2.0	14.42%	16.66%	4.43%	50.00%
	B + 2.5	14.42%	16.66%	4.43%	50.00%

Model	Data set	Accuracy	Recall	F1-Score	AUC
JDA(KNN)	B-2.5	45.85%	40.83%	40.38%	64.50%
	B-2.0	49.57%	43.77%	45.60%	66.26%
	B-1.5	56.14%	55.43%	57.09%	73.26%
	B-1.0	65.14%	67.50%	63.90%	80.50%
	B-0.5	69.71%	73.94%	71.81%	84.36%
	B	96.28%	96.52%	96.10%	97.91%
	B + 0.5	71.85%	73.33%	70.90%	84.00%
	B + 1.0	61.42%	62.94%	60.96%	77.76%
	B + 1.5	66.28%	66.63%	67.41%	79.98%
	B + 2.0	28.42%	25.05%	23.98%	55.03%
	B + 2.5	21.71%	18.38%	17.97%	52.80%

Model	Data set	Accuracy	Recall	F1-Score	AUC
MI-JDA(KNN)	B-2.5	78.14%	81.38%	81.43%	88.68%
	B-2.0	82.28%	83.22%	82.85%	89.93%
	B-1.5	80.14%	80.97%	80.45%	88.58%
	B-1.0	89.71%	88.83%	89.76%	93.30%
	B-0.5	91.85%	90.50%	90.77%	94.30%
	B	98.42%	98.44%	98.46%	99.06%
	B + 0.5	94.28%	94.66%	94.42%	96.80%
	B + 1.0	92.42%	92.33%	92.39%	95.40%
	B + 1.5	82.14%	83.27%	84.01%	89.96%
	B + 2.0	78.57%	79.82%	80.88%	87.89%
	B + 2.5	73.71%	76.54%	77.51%	85.92%

Dataset	SVM	MI-JDA(SVM)	RF	MI-JDA(RF)
B-2.5	18.42%	74.14%	20.14%	77%
B-2.0	20.42%	79.28%	29.42%	82.28%
B-1.5	33.85%	82.71%	49.14%	80.41%
B-1.0	52.14%	89.57%	57.28%	90.14%
B-0.5	63.28%	92.28%	58%	90.71%
B	96%	98.14%	97.85%	98.28%
B + 0.5	73.85%	92%	80.85%	93.42%
B + 1.0	44.14%	91.42%	56%	92.28%
B + 1.5	21.71%	86.28%	28.42%	87.12%
B + 2.0	14.28%	77.57%	14.28%	76.41%
B + 2.5	14.28%	73.11%	14.28%	75.63%

Strategy for reducing the effect of surface fluctuation in the classification of aluminum alloy via data transfer and laser-induced breakdown spectroscopy

Abstract

1 Introduction

2 Materials and methods

2.1 Experimental setup

2.2 Algorithm structure

2.2.1 Data transfer

2.2.2 Feature selection

2.3 Data preparation

3 Results and discussions

3.1 Qualitative analysis

3.2 Analysis of the traditional classification model

3.3 Analysis of the data transfer classification model

3.4 Analysis of the data transfer classification model combined with feature selection

3.5 Comparison of prediction performance

4 Conclusions

Funding

Disclosures

Data availability

References

Data availability

Cited By

Figures (16)

Tables (6)

Equations (2)

Optics Express

Dataset	The spectral number of each aluminum alloy	Total number of spectra
train dataset
A	200	2800
test dataset
B	50	700
B ± 0.5	50	700
B ± 1.0	50	700
B ± 1.5	50	700
B ± 2.0	50	700
B ± 2.5	50	700