
Use of weighting algorithms to improve traditional support vector machine based classifications of reflectance data

Open Access

Abstract

The support vector machine (SVM) is widely used in classification of hyperspectral reflectance data. In traditional SVM, features are generated from all or subsets of spectral bands, with each feature contributing equally to the classification. In classification of small hyperspectral reflectance data sets, a common challenge is Hughes phenomenon, which is caused by many redundant features and results in poor classification accuracy. In this study, we examined two approaches to assigning weights to SVM features to increase classification accuracy and reduce the adverse effects of Hughes phenomenon: 1) "RSVM", a support vector machine with a relief feature weighting algorithm, and 2) "FRSVM", a support vector machine with a fuzzy relief feature weighting algorithm. We used standardized weights to extract a subset of features with a high classification contribution. Analyses were conducted on a reflectance data set of individual corn kernels from three inbred lines and on a public data set with three selected land-cover classes. Both weighting methods, as well as reduction of features, increased the classification accuracy of traditional SVM and thereby reduced the adverse effects of Hughes phenomenon.

©2011 Optical Society of America

1. Introduction

With increasing use of hyperspectral reflectance data in research, commercial, and military applications, there is a continuous demand for improving the accuracy of classification algorithms. Classification accuracy may be defined as the ability to correctly classify a given object or pixel, and the Cohen Kappa coefficient [1] is often used as a measure of classification performance. The support vector machine (SVM) was proposed by Vapnik and his colleagues as a classification approach in the fields of pattern recognition and machine learning based on the structural risk minimization principle [2–4]. That is, SVM searches for a decision boundary that provides a tradeoff between hypothesis space complexity and quality of fitting the training data [5, 6]. Different SVMs have been applied successfully to analyses of hyperspectral reflectance data in pattern recognition (e.g., endmember extraction [7], geometric camera calibration [8], text categorization [9], handwritten character recognition [10], and face recognition [11]) and in classification of objects into discrete classes [12]. Traditional SVM treats each feature (spectral band or variable) with equal weight [13], even though features are unlikely to contribute equally to the classification. Thus, it might be advantageous to assign the highest weights to the features with the largest contribution to the classification and to assign lower weights to, or simply omit, those associated with noise/stochasticity. Weighting of features is widely used in statistically based classifications (e.g., forward stepwise band/feature selection), such as stepwise discriminant analysis [14] and other regression-based classifications [15, 16]. The basic relief algorithm was originally proposed as a statistically based feature selection method that assigns different weights to different features according to their statistical contribution [17]. However, one potential challenge with the basic relief algorithm is that the adjustment of weights is sensitive to outliers or noise in the training data set. Consequently, this approach may reduce classification robustness and increase the risk of "over-fitting" [18]. In order to reduce the risk of over-fitting, we incorporated fuzzy theory into the basic relief algorithm and adjusted the contribution of features based on the pixel distance to the centroid of each class. The design presented in this study is based on the assumption that the membership degree in fuzzy theory can capture the distribution of the training samples. Moreover, in order to exploit the statistical information in the training data set, we utilized the relief algorithm as a feature weighting method in the SVMs. A new weighting formula was used to increase differences among classes and reduce differences within each class. Thus, the relief weighting algorithm may increase the class separation of SVM [13, 19, 20] by giving comparatively higher weights to features with a high classification contribution. For convenience, we use the following abbreviations: "SVM" refers to the original support vector machine, "RSVM" refers to the support vector machine with the relief feature weighting algorithm, and "FRSVM" refers to the support vector machine with the fuzzy relief feature weighting algorithm.

A problem often noted in the classification of reflectance data is Hughes phenomenon, which tends to occur when the number of classification features exceeds the number of training samples [21]. As a consequence of Hughes phenomenon, classification accuracy initially increases with the addition of features but reaches a maximum and subsequently declines [5]. An important aspect of classification accuracy is therefore to select the most appropriate number of classification features to avoid the adverse effects of Hughes phenomenon [22].

The objective of this study was to compare traditional SVM with RSVM and FRSVM regarding: 1) classification accuracy, and 2) effect of feature reduction. We conducted this evaluation on the basis of two reflectance data sets: 1) individual corn kernels of three inbred lines and 2) a public data set with three selected land-cover classes. With this study, we intended to demonstrate that weighting and feature reduction methods can increase the accuracy of SVM based classifications.

2. Methods and concepts

The basic classification method used in this study is SVM, which has already shown high performance in machine learning applications, especially in dealing with high-dimensional features [19, 23]. For additional theory about SVM, we refer to [2–4].

2.1 Relief feature selection algorithm

The relief feature selection algorithm was proposed by Kira and Rendell [17] and is briefly presented here to foster the discussion of our proposed feature weighting methods. In a theoretical reflectance data set, $X = \{x_1, x_2, \ldots, x_n\}$, $x_i \in \mathbb{R}^d$, is the training data set of $p$ classes and $n$ pixels, where each pixel has $d$ features. $\lambda$ is a $d \times 1$ vector which represents the weight of each dimensional feature. For an arbitrary pixel $x_i$, the $L$ pixels of the same class as $x_i$ with the closest distance to $x_i$ are selected, referred to as $h_j$, $j = 1, 2, \ldots, L$, $h_j \in \mathbb{R}^d$. Then the $L$ pixels with the closest distance to $x_i$ are chosen from each class different from that of $x_i$, referred to as $m_{lj}$, $j = 1, 2, \ldots, L$, $l \in \{1, 2, \ldots, p\} \setminus \mathrm{class}(x_i)$, $m_{lj} \in \mathbb{R}^d$. $\mathrm{diff\_hit}$ is a $d \times 1$ vector which represents the difference between the $h_j$ and $x_i$:

$$\mathrm{diff\_hit} = \sum_{j=1}^{L} \frac{|x_i - h_j|}{\max(X) - \min(X)} \tag{1}$$

where $\max(X)$ ($\min(X)$) is the maximum (minimum) element in $X$. $\mathrm{diff\_miss}$ is a $d \times 1$ vector which represents the difference between the $m_{lj}$ and $x_i$:

$$\mathrm{diff\_miss} = \sum_{l \neq \mathrm{class}(x_i)} \frac{P(l)}{1 - P(\mathrm{class}(x_i))} \sum_{j=1}^{L} \frac{|x_i - m_{lj}|}{\max(X) - \min(X)} \tag{2}$$

where $P(l)$ is the probability of class $l$ and $|\cdot|$ denotes the element-wise absolute value. The update formula for $\lambda$ is given by

$$\lambda_{\mathrm{new}} = \lambda_{\mathrm{old}} - \mathrm{diff\_hit}/L + \mathrm{diff\_miss}/L \tag{3}$$

$\lambda$ is updated for each $x_i$, $i = 1, 2, \ldots, n$, with its initial vector set to zero. Features with weights above a given threshold, denoted $\tau$, are retained, and features with smaller weights are discarded.
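
The procedure can be summarized in a short sketch. The following is a minimal illustration of Eqs. (1)–(3), assuming a numpy array of pixels and integer class labels; variable and function names are ours, not from the authors' implementation:

```python
import numpy as np

def relief_weights(X, y, L=1):
    """Relief weight vector (Eqs. (1)-(3)) for the d features of X.

    X : (n, d) array of pixels; y : (n,) array of class labels.
    """
    n, d = X.shape
    span = X.max() - X.min()                    # max(X) - min(X), scalar normalizer
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))      # class probabilities P(l)
    lam = np.zeros(d)
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                        # exclude the pixel itself
        hits = np.where(y == y[i])[0]
        hits = hits[np.argsort(dist[hits])][:L]            # L nearest hits h_j
        diff_hit = np.abs(X[i] - X[hits]).sum(axis=0) / span
        diff_miss = np.zeros(d)
        for l in classes:
            if l == y[i]:
                continue
            miss = np.where(y == l)[0]
            miss = miss[np.argsort(dist[miss])][:L]        # L nearest misses m_lj
            coef = prior[l] / (1.0 - prior[y[i]])
            diff_miss += coef * np.abs(X[i] - X[miss]).sum(axis=0) / span
        lam += -diff_hit / L + diff_miss / L    # update rule, Eq. (3)
    return lam

# Feature selection: keep features whose weight exceeds a threshold tau,
# e.g. X_selected = X[:, relief_weights(X, y) > tau]
```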

2.2 Relief feature weighting algorithm

The original relief algorithm was proposed for feature selection: $\mathrm{diff\_hit}$ captures the difference within each class, and $\mathrm{diff\_miss}$ captures the difference among classes. In this study, we utilized the relief algorithm as a feature weighting method, so that features with high $\mathrm{diff\_miss}/\mathrm{diff\_hit}$ ratios receive high weights according to:

$$w_q = \frac{\displaystyle\sum_{i=1}^{n} \sum_{l \neq \mathrm{class}(x_i)} \frac{P(l)}{1 - P(\mathrm{class}(x_i))} \sum_{j=1}^{L} \frac{|x_{i,q} - m_{lj,q}|}{\max(X) - \min(X)}}{\displaystyle\sum_{i=1}^{n} \sum_{j=1}^{L} \frac{|x_{i,q} - h_{j,q}|}{\max(X) - \min(X)}}, \quad q = 1, 2, \ldots, d \tag{4}$$

where $w_q$ is the weight for the $q$-th feature, $x_{i,q}$ is the $q$-th element of vector $x_i$, $m_{lj,q}$ is the $q$-th element of vector $m_{lj}$, and $h_{j,q}$ is the $q$-th element of vector $h_j$.
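
A sketch of Eq. (4), reusing the structure of the previous sketch (names are again illustrative): instead of thresholding an accumulated weight, each feature q is scored by the ratio of its accumulated between-class difference to its accumulated within-class difference:

```python
import numpy as np

def relief_feature_weights(X, y, L=1):
    """Per-feature weights w_q = diff_miss_q / diff_hit_q (Eq. (4))."""
    n, d = X.shape
    span = X.max() - X.min()
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))
    num = np.zeros(d)                      # accumulated diff_miss per feature
    den = np.zeros(d)                      # accumulated diff_hit per feature
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf
        hits = np.where(y == y[i])[0]
        hits = hits[np.argsort(dist[hits])][:L]
        den += np.abs(X[i] - X[hits]).sum(axis=0) / span
        for l in classes:
            if l == y[i]:
                continue
            miss = np.where(y == l)[0]
            miss = miss[np.argsort(dist[miss])][:L]
            coef = prior[l] / (1.0 - prior[y[i]])
            num += coef * np.abs(X[i] - X[miss]).sum(axis=0) / span
    return num / den

# Both training and test data are then scaled feature-wise before the SVM:
# X_train_w, X_test_w = X_train * w, X_test * w
```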

2.3 Fuzzy relief feature weighting algorithm

Assume that all pixels are divided into $p$ classes ($\alpha_1, \alpha_2, \ldots, \alpha_p$) and that the centroid of each class is $r_i$, $i = 1, 2, \ldots, p$. For $x_i \in \alpha_k$, the distance between $x_i$ and $r_k$ is

$$D(x_i, r_k) = \| x_i - r_k \| \tag{5}$$

where $\|\cdot\|$ denotes the Euclidean distance. The membership degree of $x_i$ to class $\alpha_k$ is defined as:

$$u_{ik} = \frac{\sum_{j \neq k} D^2(x_i, r_j)}{D^2(x_i, r_k)} \tag{6}$$

The corresponding $\mathrm{diff\_hit}$ and $\mathrm{diff\_miss}$ are given by

$$\mathrm{diff\_hit} = \sum_{j=1}^{L} \frac{|x_i - h_j|\, u_{ik}}{\max(X) - \min(X)}, \quad x_i \in \alpha_k \tag{7}$$

$$\mathrm{diff\_miss} = \sum_{l \neq k} \frac{P(l)}{1 - P(k)} \sum_{j=1}^{L} \frac{|x_i - m_{lj}|\, u_{ik}}{\max(X) - \min(X)}, \quad x_i \in \alpha_k \tag{8}$$

$w = (w_1, w_2, \ldots, w_d)$ is a vector corresponding to the weights of the features, which are given by

$$w_q = \frac{\displaystyle\sum_{i=1}^{n} \sum_{l \neq k} \frac{P(l)}{1 - P(k)} \sum_{j=1}^{L} \frac{|x_{i,q} - m_{lj,q}|\, u_{ik}}{\max(X) - \min(X)}}{\displaystyle\sum_{i=1}^{n} \sum_{j=1}^{L} \frac{|x_{i,q} - h_{j,q}|\, u_{ik}}{\max(X) - \min(X)}}, \quad q = 1, 2, \ldots, d \tag{9}$$

where $w_q$ is the weight for the $q$-th feature, $x_{i,q}$ is the $q$-th element of vector $x_i$, $m_{lj,q}$ is the $q$-th element of vector $m_{lj}$, and $h_{j,q}$ is the $q$-th element of vector $h_j$. Both the training and the test data sets were weighted by the feature weighting vector $w$ prior to analysis.
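
The fuzzy variant differs from the previous sketch only in the membership degree $u_{ik}$, which scales each pixel's contribution by its squared distances to the class centroids (Eqs. (5)–(9)). A minimal sketch under the same assumptions as above:

```python
import numpy as np

def fuzzy_relief_weights(X, y, L=1):
    """Per-feature weights of Eq. (9), with membership degrees from Eq. (6)."""
    n, d = X.shape
    span = X.max() - X.min()
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))
    centroid = {c: X[y == c].mean(axis=0) for c in classes}       # r_k
    num, den = np.zeros(d), np.zeros(d)
    for i in range(n):
        k = y[i]
        d2 = {c: np.sum((X[i] - centroid[c]) ** 2) for c in classes}
        u_ik = sum(d2[j] for j in classes if j != k) / d2[k]      # Eq. (6)
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf
        hits = np.where(y == k)[0]
        hits = hits[np.argsort(dist[hits])][:L]
        den += u_ik * np.abs(X[i] - X[hits]).sum(axis=0) / span
        for l in classes:
            if l == k:
                continue
            miss = np.where(y == l)[0]
            miss = miss[np.argsort(dist[miss])][:L]
            coef = prior[l] / (1.0 - prior[k])
            num += u_ik * coef * np.abs(X[i] - X[miss]).sum(axis=0) / span
    return num / den
```

Pixels far from their class centroid receive a small $u_{ik}$ and therefore contribute little to the weights, which is what makes the fuzzy weighting less sensitive to outliers.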

3. Materials and experimental design

3.1 Experimental data samples

The corn kernel samples used in this study were provided by Dr. Kolomiets at Texas A&M University. In brief, they represent three proprietary inbred lines: a wild type without genetic modification, and two mutants with suppression of one of two genes in the lipoxygenase pathway. Genetically, the homozygous corn mutants are near-isogenic to the recurrent wild type parent and share about 97.5% of the parent genome, with one mutant (mutant 1) showing negligible visual/phenotypic difference from the wild type, and the other mutant (mutant 2) being slightly darker in color than the wild type (Fig. 1(a) and (b)). Consequently, kernels from these inbred corn lines were considered ideal as a challenging model data set for evaluation of classification accuracy. Reflectance data from 15 individual corn kernels, five from each of the three genotypes, were used. The kernels were positioned on white Teflon, and hyperspectral images were acquired with a spatial resolution of 169 pixels per cm2. A subsample of 100 pixels was selected from each kernel, so that there were 500 pixels in total from each class.


Fig. 1 Digital image of the corn genotypes: Wild type (left column), Mutant 1 (center column), and Mutant 2 (right column) (a), and corresponding average reflectance profiles (b).


3.2 Hyperspectral imaging system

Hyperspectral imaging data of corn kernels were acquired with a line-scanning push-broom hyperspectral camera (PIKA II, www.resonon.com), which has 640 sensors producing hyperspectral images with 160 wavelength channels within the wavelength range from 405 to 907 nm (wavelength resolution of 3.1 nm). The objective lens has a 35 mm focal length optimized for the visible and near-infrared (NIR) spectra, and the angular field of view is 7° [24]. The hyperspectral camera was mounted on an aluminum tower structure 60 cm above the target object platform. Hyperspectral image acquisition was conducted inside a darkroom with four halogen lamps (http://www.resonon.com/scanning-systems-and-accesories.html) as the only light source, in order to keep unwanted light from contaminating the signal. To ensure consistent acquisition conditions, the hyperspectral camera and lighting system were turned on at least 30 min prior to image acquisition. Dark calibration was conducted at the beginning of the data acquisition by covering the lens of the camera with its cap, and white Teflon was used for white calibration immediately before image acquisition. Based on the dark and white calibrations, reflectance values from hyperspectral image cubes were converted into proportions (denoted relative reflectance) ranging from 0 to 1.
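
The conversion to relative reflectance follows the standard dark/white flat-field correction; a minimal sketch, assuming numpy arrays of matching shape (the authors do not publish their code, so this is illustrative only):

```python
import numpy as np

def relative_reflectance(raw, dark, white):
    """Convert raw sensor counts to relative reflectance in [0, 1]."""
    refl = (raw - dark) / (white - dark)
    return np.clip(refl, 0.0, 1.0)   # clamp sensor noise that falls outside [0, 1]
```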

3.3 AVIRIS data set

Public vegetation reflectance data from northwest Indiana's Indian Pines site (AVIRIS sensor, June 12, 1992: ftp://ftp.rcn.purdue.edu/biehl/MultiSpec/92AV3C), which have been used in multiple published studies [7, 12], were also included in this study (Fig. 2). The hyperspectral image consists of a scene of 145 by 145 pixels, with a spatial resolution of 20 m/pixel and 200 spectral bands. From the 16 land-cover classes available in the original ground truth data, three classes (Corn-min till (834 pixels), Grass/Pasture (497 pixels), and Soybean-clean till (614 pixels)) were selected to test the effectiveness of the different classifiers.


Fig. 2 Pseudo-color image of AVIRIS data set (composed of band 17, 27 and 57) (a), and corresponding average reflectance profiles (b).


3.4 Training and test data sets

Experimental analysis was organized into two main parts. The first part compared average classification accuracies, based on 10-fold cross-validations, of the proposed classifiers (RSVM and FRSVM) with that of traditional SVM. In the second part, we examined effects of feature reduction on RSVM and FRSVM to evaluate effects of Hughes phenomenon for two training set sizes, small and large. For the small training sets, the input data were randomly partitioned into 10 subsamples; one subsample was retained as the training data set, and the remaining nine subsamples were used as the test data set. The subsampling process was repeated 10 times, with each subsample used once as the training data set. For the large training sets, the input data were randomly partitioned into five subsamples; a single subsample was retained as the training data set, and the remaining four subsamples were used as the test data set. The process was then repeated five times, with each subsample used once as the training data set.
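
Note that this scheme inverts the usual k-fold roles: the single held-out fold is used for training. A sketch assuming scikit-learn (fold counts k=10 and k=5 give the small and large training-set conditions, respectively):

```python
from sklearn.model_selection import StratifiedKFold

def inverted_kfold(X, y, k):
    """Yield (X_train, y_train, X_test, y_test) with one fold used for training."""
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    # skf.split yields (rest_idx, fold_idx); we train on the single fold
    # and test on the remaining k-1 folds, inverting the usual roles.
    for rest_idx, fold_idx in skf.split(X, y):
        yield X[fold_idx], y[fold_idx], X[rest_idx], y[rest_idx]
```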

3.5 SVM and parameter settings

Similar to [12], we used the "one-against-one" SVM classification strategy without weighting of features as the initial SVM method. The kernel function used here is the Gaussian RBF:

$$K(x, y) = \exp(-\gamma \|x - y\|^2) \tag{10}$$

where $\gamma$ determines the width and tunes the smoothing of the discriminant function. The penalty $C$ is another important parameter in the SVM classifier; it controls the trade-off between the margin and the size of the slack variables [25]. Consequently, to reliably optimize $\gamma$ and $C$, a cross-validation framework was applied with both $\gamma$ and $C$ ranging from $2^{-2}$ to $2^{5}$ (Fig. 3). Based on this initial analysis, $\gamma = 2^{-1}$ and $C = 2^{1.5}$ were selected as suitable parameter values for the corn kernel data set (Fig. 3(a)), while $\gamma = 2^{2}$ and $C = 2^{2.5}$ were selected for the AVIRIS data set (Fig. 3(b)).
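
The parameter search can be reproduced with standard tooling; a sketch assuming scikit-learn, whose SVC uses the Gaussian RBF kernel and a "one-against-one" multiclass strategy by default (the grid resolution is our choice, not stated in the text):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    "gamma": 2.0 ** np.arange(-2, 5.5, 0.5),   # 2^-2 ... 2^5
    "C":     2.0 ** np.arange(-2, 5.5, 0.5),
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10)
# search.fit(X_train, y_train); search.best_params_ then gives (gamma, C),
# e.g. gamma = 2**-1 and C = 2**1.5 for the corn kernel data set.
```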


Fig. 3 Cross-validation of the corn kernel data set (a), and cross-validation of the AVIRIS data set (b).


Parameter L is an integer which represents the number of pixels selected as the closest pixels when calculating the difference within each class (diff_hit) and the difference among classes (diff_miss). As part of testing the sensitivity of RSVM and FRSVM to parameter settings, we compared classification accuracies with L values ranging from 1 to 4. We also tested L > 4, but these results are not presented, as the classification accuracy decreased markedly with increasing L. The Cohen Kappa coefficient [1] was used to measure the classification accuracy of each classifier.
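
For reference, Cohen's kappa compares observed agreement with the chance agreement implied by the confusion matrix marginals; a sketch assuming scikit-learn (the built-in cohen_kappa_score is equivalent):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

def kappa(y_true, y_pred):
    """Cohen Kappa coefficient computed from the confusion matrix."""
    cm = confusion_matrix(y_true, y_pred).astype(float)
    n = cm.sum()
    p_o = np.trace(cm) / n                                   # observed agreement
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2   # chance agreement
    return (p_o - p_e) / (1.0 - p_e)

# Sanity check against the library implementation:
# assert np.isclose(kappa(y_true, y_pred), cohen_kappa_score(y_true, y_pred))
```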

4. Results and discussion

4.1. Reflectance data and weights assigned to spectral bands

Figure 1(b) shows the average reflectance profiles from kernels of the three inbred corn lines, with reflectance values acquired from mutant 2 being consistently lower than those from the wild type and mutant 1, especially in spectral bands from 600 to 907 nm. Relative reflectance values were consistently higher for wild type kernels than for mutant kernels; a difference of about 6% in the average reflectance curves was observed at 885 nm between the wild type and mutant 1, while the largest difference, 19%, occurred at 724 nm between the wild type and mutant 2. With as little as a 6% difference in average reflectance profiles between the wild type and mutant 1, and only about a 19% difference between the wild type and mutant 2, this challenging data set was considered highly suitable for testing novel SVM approaches to reflectance data classification.

Figure 2(b) shows the average reflectance profiles from the three land-cover classes, with reflectance values acquired from Grass/Pasture showing a clear visual difference from the other two classes. The average reflectance profiles of Corn-min till and Soybean-clean till are very similar across the examined spectrum; careful evaluation reveals that the average reflectance curve of Corn-min till was slightly above that of Soybean-clean till in several regions.

Figures 4(a) and 5(a) show the standardized weights assigned by RSVM and FRSVM to the corn kernel data and the AVIRIS data, respectively. In both data sets, the two classification methods assigned similar weights to spectral bands, and spectral bands clearly did not contribute equally to the classifications. For the corn kernel data, both RSVM and FRSVM assigned the highest standardized weights to spectral bands between 550 and 700 nm, and careful evaluation revealed that weights assigned by FRSVM between 550 and 700 nm were slightly higher than those assigned by RSVM (Fig. 4(b)). RSVM assigned higher standardized weights than FRSVM to spectral bands at both ends of the examined spectrum. Regarding the AVIRIS data, both RSVM and FRSVM assigned the highest standardized weights to spectral bands between 500 and 750 nm, 780–1200 nm, 1550–1850 nm, and 1950–2400 nm. Weights assigned by FRSVM between 570 and 770 nm, 820–880 nm, and 900–1200 nm were slightly higher than those assigned by RSVM (Fig. 5(b)). We suspect that the slight difference in weighting scores assigned by RSVM and FRSVM is attributable to the way the two classification methods operate. In RSVM, weighting scores assigned to each spectral band are based on all pixels contributing equally. For comparison, FRSVM identifies a class centroid, which is a vector representing the spectral mean of each class. FRSVM then assigns high weighting contributions to pixels near this class centroid and lower weighting contributions to pixels away from it. As a consequence, standardized weights assigned by RSVM are almost exclusively determined by spectral information, while standardized weights assigned by FRSVM are determined by a combination of spectral and spatial (distance from class centroid) information within the hyperspectral image cube.


Fig. 4 Comparison of standardized weights obtained by using two different weighting methods: support vector machine with relief feature weighting algorithm (RSVM) and support vector machine with fuzzy relief feature weighting algorithm (FRSVM) on corn kernel data set (a), and the ratio of standardized weights using FRSVM and RSVM (b). Parameter L was equal to 1.



Fig. 5 Comparison of standardized weights obtained by using two different weighting methods: support vector machine with relief feature weighting algorithm (RSVM) and support vector machine with fuzzy relief feature weighting algorithm (FRSVM) on AVIRIS data set (a), and the ratio of standardized weights using FRSVM and RSVM (b). Parameter L was equal to 1.


4.2. Classification accuracy

Classification accuracies based on 10-fold cross-validations showed that both weighting methods outperformed traditional SVM, with FRSVM exhibiting the highest overall accuracy (i.e., the percentage of correctly classified pixels among all test pixels) (Tables 1 and 2). In the analysis of the corn kernel data, RSVM and FRSVM increased the average overall classification accuracy by 0.86% and 1.07%, respectively. As expected, the highest classification accuracy was obtained when differentiating mutant 2 from the other two inbred lines. In the analysis of the AVIRIS data, RSVM and FRSVM showed average increases in overall accuracy of 1.67% and 1.82%, respectively, compared to SVM. The slightly better classification accuracy of FRSVM is likely explained by the fact that FRSVM is less influenced by outliers.


Table 1. Comparison of classification accuracies (%), overall accuracies (%), and Cohen Kappa coefficients obtained with the SVM, RSVM, and FRSVM algorithms on the corn kernel data set


Table 2. Comparison of classification accuracies (%), overall accuracies (%), and Cohen Kappa coefficients obtained with the SVM, RSVM, and FRSVM algorithms on the AVIRIS data set

4.3. Feature reduction and classification

As part of assessing the effect of feature reduction on classification accuracy, features were selected based on Eqs. (4) and (9).

The average classification accuracy obtained with RSVM and FRSVM (with 1/10 of the original data selected as the training data set) as a function of the number of features is shown in Fig. 6. For the corn kernel data, RSVM and FRSVM showed the highest classification accuracy when 130 and 120 features were included, respectively. For the AVIRIS data, both RSVM and FRSVM had the highest classification accuracy when 160 features were included. Similar accuracy trends were also observed with 1/5 of the original data used as the training data set (not shown).
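
This experiment amounts to ranking features by their weight and growing the selected subset; a sketch assuming the weighting functions outlined in Section 2 and scikit-learn (the step size is our choice, and the SVM parameter values for the corn kernel data are those from Section 3.5):

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

def accuracy_vs_n_features(w, X_tr, y_tr, X_te, y_te, step=10):
    """Overall accuracy as the top-k feature subset grows (k = step, 2*step, ...)."""
    order = np.argsort(w)[::-1]            # features sorted by descending weight
    accuracies = []
    for k in range(step, len(w) + 1, step):
        keep = order[:k]
        clf = SVC(kernel="rbf", gamma=2 ** -1, C=2 ** 1.5)   # corn kernel settings
        clf.fit(X_tr[:, keep] * w[keep], y_tr)               # weighted features
        y_hat = clf.predict(X_te[:, keep] * w[keep])
        accuracies.append(accuracy_score(y_te, y_hat))
    return accuracies                      # the peak locates the best subset size
```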


Fig. 6 Comparison of overall accuracies (%) conducted by RSVM and FRSVM algorithm with different number of features using corn kernel data (a) and AVIRIS data (b). 1/10 of the original data was selected as training data set. Parameter L was equal to 1.


For the corn kernel data set, the largest difference between the peak accuracy and that obtained from the use of all 160 features was 1.44% (RSVM) and 1.13% (FRSVM) (Table 3) when 1/10 of the original data was selected as training data. A similar general trend was observed for the AVIRIS data set: the largest difference between the peak accuracy and that obtained from the use of all 200 features was 0.19% (RSVM) and 0.40% (FRSVM) (Table 4) when 1/10 of the original data was selected as training data. These results highlight the adverse effects of Hughes phenomenon when a small training data set is used, but they also show that RSVM and FRSVM reduced these negative effects.


Table 3. Difference between peak accuracy and that derived from the use of all 160 features acquired from corn kernel data


Table 4. Difference between peak accuracy and that derived from the use of all 200 features acquired from AVIRIS data

Conclusion

Comparing the two weighting methods with traditional SVM, weighting of features was shown to increase the classification accuracy of reflectance data sets. The classification accuracy was influenced by the number of features used and was therefore affected by the Hughes phenomenon. Compared with RSVM, FRSVM had slightly higher overall classification accuracy, which is likely explained by the fact that FRSVM uses the spatial distribution information of pixels within each class and thereby greatly reduces the effect of noisy pixels.

Acknowledgments

This study was partially supported by the National Natural Science Foundation of China (Grant No. 61077079), by the Ph.D. Programs Foundation of the Ministry of Education of China (Grant No. 20102304110013), and by the Academic Leader Foundation of Harbin City, China (Grant No. 2009RFXXG034). The authors would like to thank the China Scholarship Council for its support. Dr. Kolomiets at Texas A&M University is thanked for providing the corn kernels used in this study.

References and links

1. J. Cohen, “A coefficient of agreement for nominal scales,” Educ. Psychol. Meas. 20(1), 37–46 (1960). [CrossRef]  

2. V. Vapnik, The Nature of Statistical Learning Theory (Springer, New York, 2000), Chap. 1.

3. C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn. 20(3), 273–297 (1995). [CrossRef]  

4. B. E. Boser, I. M. Guyon, and V. Vapnik, "A training algorithm for optimal margin classifiers," in COLT '92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, D. Haussler, ed. (ACM, New York, NY, 1992), pp. 144–152.

5. M. Pal and G. M. Foody, “Feature selection for classification of hyperspectral data by SVM,” IEEE Trans. Geosci. Remote Sens. 48(5), 2297–2307 (2010). [CrossRef]  

6. F. Bovolo, L. Bruzzone, and L. Carlin, “A novel technique for subpixel image classification based on support vector machine,” IEEE Trans. Image Process. 19(11), 2983–2999 (2010). [CrossRef]  

7. A. M. Filippi, R. Archibald, B. L. Bhaduri, and E. A. Bright, “Hyperspectral agricultural mapping using support vector machine-based endmember extraction (SVM-BEE),” Opt. Express 17(26), 23823–23842 (2009). [CrossRef]   [PubMed]  

8. B. Ergun, T. Kavzoglu, I. Colkesen, and C. Sahin, “Data filtering with support vector machines in geometric camera calibration,” Opt. Express 18(3), 1927–1936 (2010). [CrossRef]   [PubMed]  

9. M. A. Kumar and M. Gopal, “A comparison study on multiple binary-class SVM methods for unilabel text categorization,” Pattern Recognit. Lett. 31(11), 1437–1444 (2010). [CrossRef]  

10. N. Shanthi and K. Duraiswamy, “A novel SVM-based handwritten Tamil character recognition system,” Pattern Anal. Appl. 13(2), 173–180 (2010). [CrossRef]  

11. X. Xu, D. Zhang, and X. Zhang, “An efficient method for human face recognition using nonsubsampled contourlet transform and support vector machine,” Opt. Appl. 39, 601–615 (2009).

12. B. Guo, S. R. Gunn, R. I. Damper, and J. B. Nelson, “Customizing kernel functions for SVM-based hyperspectral image classification,” IEEE Trans. Image Process. 17(4), 622–629 (2008). [CrossRef]   [PubMed]  

13. J. Li, X. Gao, and L. Jiao, “A new feature weighted fuzzy cluster algorithm,” Acta. Electron. 34, 89–92 (2006).

14. C. Nansen, A. J. Sidumo, and S. Capareda, “Variogram analysis of hyperspectral data to characterize the impact of biotic and abiotic stress of maize plants and to estimate biofuel potential,” Appl. Spectrosc. 64(6), 627–636 (2010). [CrossRef]   [PubMed]  

15. L. R. LaMotte and A. McWhorter, “A regression-based linear classification procedure,” Educ. Psychol. Meas. 41(2), 341–347 (1981). [CrossRef]  

16. L. Gao, F. Gao, X. Guan, D. Zhou, and J. Li, “A regression algorithm based on AdaBoost,” in WCICA 2006: Sixth World Congress on Intelligent Control and Automation, D. M. Zhou, ed. (IEEE Computer Society Press, Dalian, Liaoning, 2006), pp. 4400–4404.

17. K. Kira and L. A. Rendell, "A practical approach to feature selection," in Proceedings of the 9th International Workshop on Machine Learning, D. Sleeman, ed. (Morgan Kaufmann, San Francisco, CA, 1992), pp. 249–256.

18. T. Kayikcioglu and O. Aydemir, “A polynomial fitting and k-NN based approach for improving classification of motor imagery BCI data,” Pattern Recognit. Lett. 31(11), 1207–1215 (2010). [CrossRef]  

19. F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing images with support vector machine,” IEEE Trans. Geosci. Remote Sens. 42(8), 1778–1790 (2004). [CrossRef]  

20. L. Wang, C. Zhao, Y. Qiao, and W. Chen, “Research on all-around weighting methods of hyperspectral imagery classification,” Int. J. Infrared Millim. Waves 27, 442–446 (2008).

21. P.-H. Hsu, “Feature extraction of hyperspectral images using wavelet and matching pursuit,” ISPRS J. Photogramm. Remote Sens. 62(2), 78–92 (2007). [CrossRef]  

22. C. Lee and D. A. Landgrebe, “Analyzing high-dimensional multispectral data,” IEEE Trans. Geosci. Remote Sens. 31(4), 792–800 (1993). [CrossRef]  

23. D. J. Sebald and J. A. Bucklew, “Support vector machine techniques for nonlinear equalization,” IEEE Trans. Signal Process. 48(11), 3217–3226 (2000). [CrossRef]  

24. C. Nansen, T. Herrman, and R. Swanson, “Machine vision detection of bonemeal in animal feed samples,” Appl. Spectrosc. 64(6), 637–643 (2010). [CrossRef]   [PubMed]  

25. F. A. Mianji and Y. Zhang, “Robust hyperspectral classification using relevance vector machine,” IEEE Trans. Geosci. Remote Sens. 49(6), 2100–2112 (2011). [CrossRef]  
