Abstract
A modification of ensemble Monte Carlo uninformative variable elimination (EMCUVE) that does not involve the use of random variables is proposed. Its aim is to improve the performance of partial least squares (PLS) regression models, increase the consistency of results and reduce processing time by selecting the most informative variables in a spectral dataset. The proposed method (ensemble Monte Carlo variable selection, EMCVS) and its robust version (REMCVS) were compared with PLS models and with the existing EMCUVE method using three near infrared (NIR) datasets: prediction of n-butanol in a five-solvent mixture, moisture in corn and glucosinolates in rapeseed. On these datasets, the proposed methods were more consistent, produced models with better predictive accuracy (lower root mean squared error of prediction) and required less computational time than the conventional EMCUVE method. In this application, the proposed method was applied to PLS regression coefficients but it may, in principle, be used on any regression vector.
© 2011 IM Publications LLP
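The abstract does not give the algorithmic details of EMCVS, so the following is only a generic sketch of the underlying idea: ranking variables by how stable their regression coefficients are across Monte Carlo subsamples, with no random variables appended to the data. Everything here is an assumption made for illustration: the function names `coefficient_stability` and `select_variables`, the stability criterion (absolute mean divided by standard deviation), the subsample fraction, the synthetic data, and the use of ordinary least squares in place of PLS (the abstract notes that any regression vector may be used). The ensemble and robust steps of the published method are not reproduced.

```python
import numpy as np


def coefficient_stability(X, y, n_runs=200, frac=0.8, seed=0):
    """Stability of each regression coefficient over Monte Carlo subsamples.

    Assumed criterion: |mean(b_j)| / std(b_j) across runs. Ordinary least
    squares stands in for PLS here; this is an illustrative substitution.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    n_sub = max(2, int(frac * n_samples))
    coefs = np.empty((n_runs, n_vars))
    for r in range(n_runs):
        # Draw a random subset of samples without replacement.
        idx = rng.choice(n_samples, size=n_sub, replace=False)
        # Mean-centre the subsample so no intercept term is needed.
        Xs = X[idx] - X[idx].mean(axis=0)
        ys = y[idx] - y[idx].mean()
        coefs[r] = np.linalg.lstsq(Xs, ys, rcond=None)[0]
    return np.abs(coefs.mean(axis=0)) / coefs.std(axis=0)


def select_variables(stability, n_keep):
    """Return the sorted indices of the n_keep most stable variables."""
    top = np.argsort(stability)[::-1][:n_keep]
    return np.sort(top)


# Synthetic example (not one of the NIR datasets): only variables 0 and 1
# carry information about y; the other eight are pure noise.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(60, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * data_rng.normal(size=60)

stability = coefficient_stability(X, y)
selected = select_variables(stability, n_keep=2)
print(selected)
```

In a real spectral application the number of variables usually exceeds the number of samples, which is why a latent-variable method such as PLS would replace the least-squares step used in this sketch.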