Journal : Computational Statistics & Data Analysis, vol. 46, p. 689–705, 2004
Publisher : Elsevier
International Standard Numbers
Printed : 0167-9473
Electronic : 1872-7352
Publication type : Academic article
This article discusses the situation where classes arise by dividing the range of a continuous response variable into intervals, with a focus on assessing the performance of classifiers. Because of the underlying continuum, not all misclassifications are equally grave, so the probability of misclassification (pmc), which counts every error equally, is not an optimal performance measure in this situation. An alternative performance measure, the squared error rate (sqerr), is proposed. It is related to the mean squared error of regression and penalises misclassifications according to their severity. In addition, because of measurement errors in the response variable, data sets used for training and testing contain misallocated class labels. Estimates of the pmc and the sqerr are developed for this situation. The estimates are tested and compared on a real data set and in a simulation.
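The difference between the two measures can be illustrated with a minimal Python sketch. The abstract does not give the formula for the sqerr, so the form used here (the mean squared difference between integer class indices) is an assumption chosen only to show severity-based penalisation; it may differ from the article's definition. The function names, cut points, and data values are hypothetical, and the sketch does not cover the estimates that account for misallocated class labels.

```python
from bisect import bisect_right


def to_class(y, cut_points):
    """Map a continuous response y to an interval index, given sorted cut points."""
    return bisect_right(cut_points, y)


def pmc(true_classes, predicted_classes):
    """Fraction of observations assigned to the wrong class."""
    wrong = sum(t != p for t, p in zip(true_classes, predicted_classes))
    return wrong / len(true_classes)


def squared_error_rate(true_classes, predicted_classes):
    """Mean squared difference between class indices.

    Assumed form for illustration; not taken from the article.
    """
    total = sum((t - p) ** 2 for t, p in zip(true_classes, predicted_classes))
    return total / len(true_classes)


# Hypothetical cut points dividing the response range into three intervals.
cut_points = [10.0, 20.0]
responses = [4.2, 12.5, 23.1, 27.8]
true_classes = [to_class(y, cut_points) for y in responses]

# Two hypothetical classifiers, each making exactly one error on the last item.
near_miss = [0, 1, 2, 1]  # off by one class
far_miss = [0, 1, 2, 0]   # off by two classes

print(pmc(true_classes, near_miss), pmc(true_classes, far_miss))
print(squared_error_rate(true_classes, near_miss),
      squared_error_rate(true_classes, far_miss))
```

Under these assumptions both classifiers have the same pmc, since each makes one error in four, while the squared error rate is larger for the classifier whose error lies further from the true interval.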