Published 2004

Publication details

Journal: Journal of Chemometrics, vol. 18, p. 498–507, 2004

International Standard Numbers:
Printed: 0886-9383
Electronic: 1099-128X

Publication type: Academic article

Contributors: Mevik, Bjørn-Helge; Segtnan, Vegard; Næs, Tormod

Summary

Recently, the use of ensemble methods in multivariate regression and classification has received increased attention in the literature. These methods have been shown to have interesting properties for both regression and classification. In particular, they can improve the accuracy of unstable predictors. Ensemble methods have so far been little studied in situations that are common for calibration and prediction in chemistry, i.e., situations with a large number of collinear x-variables and few samples. These situations are often approached by data compression methods such as principal components regression (PCR) or partial least squares regression (PLSR). The present paper investigates the properties of different types of ensemble methods used with PLSR in situations with highly collinear x-data. Bagging and data augmentation by simulated noise are studied. The focus is on the robustness of the calibrations. Both real and simulated data are used. The results show that ensembles trained on data with added noise can make PLSR robust against the type of noise added. In particular, the effects of sample temperature variations can be eliminated. Bagging does not seem to give any improvement over PLSR for small and intermediate numbers of components. It is, however, less sensitive to over-fitting.
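
To make the two ensemble strategies named in the summary concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The function names (`pls1_fit`, `ensemble_pls`), the ensemble size, the use of independent Gaussian noise on the x-variables, and the choice to average regression coefficients across members are all assumptions made for illustration; the paper's actual noise model (for example, for temperature effects) is not described in the summary.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Single-response PLS regression via NIPALS; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean  # centred working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                # weight vector: covariance of X with y
        w /= np.linalg.norm(w)
        t = Xk @ w                   # scores
        tt = t @ t
        p = Xk.T @ t / tt            # x-loadings
        qk = (yk @ t) / tt           # y-loading
        Xk = Xk - np.outer(t, p)     # deflate X
        yk = yk - qk * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # B = W (P'W)^-1 q
    return coef, y_mean - x_mean @ coef


def ensemble_pls(X, y, n_components, n_members=25, mode="bagging",
                 noise_sd=0.1, seed=0):
    """Average PLSR coefficients over an ensemble (hypothetical helper).

    mode="bagging": each member is fitted to a bootstrap resample of the rows.
    mode="noise":   each member is fitted to X plus simulated Gaussian noise
                    (an assumed noise model, chosen only for illustration).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    coefs, intercepts = [], []
    for _ in range(n_members):
        if mode == "bagging":
            idx = rng.integers(0, n, size=n)  # sample rows with replacement
            Xm, ym = X[idx], y[idx]
        elif mode == "noise":
            Xm, ym = X + rng.normal(0.0, noise_sd, size=X.shape), y
        else:
            raise ValueError("mode must be 'bagging' or 'noise'")
        c, b0 = pls1_fit(Xm, ym, n_components)
        coefs.append(c)
        intercepts.append(b0)
    # For linear models, averaging coefficients equals averaging predictions.
    return np.mean(coefs, axis=0), float(np.mean(intercepts))


# Simulated example: few samples, many collinear x-variables.
rng = np.random.default_rng(1)
T = rng.normal(size=(30, 3))                       # 3 latent factors
X = T @ rng.normal(size=(3, 100)) + 0.01 * rng.normal(size=(30, 100))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=30)

coef_bag, b0_bag = ensemble_pls(X, y, n_components=3, mode="bagging")
coef_noise, b0_noise = ensemble_pls(X, y, n_components=3, mode="noise")
pred_bag = X @ coef_bag + b0_bag                   # ensemble predictions
pred_noise = X @ coef_noise + b0_noise
```

In the noise variant, the intent is that members see perturbed copies of the same spectra, so the averaged model is discouraged from relying on directions in x-space that the simulated noise disturbs. Whether that reproduces the robustness reported in the paper depends on how closely the simulated noise matches the real disturbance.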