Abstract
In many real-world classification problems, class-conditional classification noise (CCC-Noise) frequently deteriorates the performance of a classifier that is naively built by ignoring it. In this paper, we investigate the impact of CCC-Noise on the quality of a popular generative classifier, normal discriminant analysis (NDA), and its corresponding discriminative classifier, logistic regression (LR). We consider the problem of two multivariate normal populations having a common covariance matrix. We compare the asymptotic distributions of the misclassification error rates of these two classifiers under CCC-Noise. We show that when the noise level is low, the asymptotic error rates of both procedures are only slightly affected. We also show that LR deteriorates less under CCC-Noise than NDA. Under CCC-Noise, the Mahalanobis distance between the populations plays a vital role in determining the relative performance of these two procedures. In particular, when this distance is small, LR tends to be more tolerant of CCC-Noise than NDA.
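The sketch below is an illustrative simulation of the setting described in the abstract, not the paper's analytical derivation: two multivariate normal populations with a common (here, identity) covariance matrix, class-conditional label flips, and a comparison of the resulting misclassification rates of NDA (fit as linear discriminant analysis) and logistic regression. The dimension, Mahalanobis distance, flip probabilities, and sample sizes are assumed values chosen for illustration.

```python
# Hedged sketch: empirical effect of CCC-Noise on NDA (LDA) vs. logistic regression.
# All numerical settings below are assumptions, not values from the paper.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

d = 5                                   # dimension (assumed)
delta = 1.0                             # Mahalanobis distance between the populations (assumed)
mu0 = np.zeros(d)
mu1 = np.r_[delta, np.zeros(d - 1)]     # identity covariance, so Euclidean gap equals Mahalanobis distance
eps0, eps1 = 0.10, 0.20                 # class-conditional label-flip probabilities (assumed)

def sample(n):
    """Draw n observations from the two-population normal mixture with equal priors."""
    y = rng.integers(0, 2, size=n)
    x = rng.standard_normal((n, d)) + np.where(y[:, None] == 1, mu1, mu0)
    return x, y

def add_ccc_noise(y):
    """Flip each label with a probability that depends on its true class (CCC-Noise)."""
    flip = np.where(y == 0, rng.random(y.size) < eps0, rng.random(y.size) < eps1)
    return np.where(flip, 1 - y, y)

x_train, y_train = sample(2000)
y_noisy = add_ccc_noise(y_train)        # both classifiers are trained on noisy labels
x_test, y_test = sample(20000)          # error rate is measured against clean labels

for name, clf in [("NDA/LDA", LinearDiscriminantAnalysis()),
                  ("LR", LogisticRegression(max_iter=1000))]:
    clf.fit(x_train, y_noisy)
    err = np.mean(clf.predict(x_test) != y_test)
    print(f"{name}: misclassification rate ~ {err:.3f}")
```

Varying `delta` in this sketch gives a rough empirical counterpart to the abstract's observation that the Mahalanobis distance governs the relative robustness of the two procedures; the asymptotic comparison itself is developed analytically in the paper.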
Original language | English (US) |
---|---|
Pages (from-to) | 1622-1637 |
Number of pages | 16 |
Journal | Journal of Multivariate Analysis |
Volume | 101 |
Issue number | 7 |
DOIs | |
State | Published - Aug 2010 |
Externally published | Yes |
Keywords
- Asymptotic distribution
- Class noise
- Misclassification rate
- Misspecified model
ASJC Scopus subject areas
- Statistics and Probability
- Numerical Analysis
- Statistics, Probability and Uncertainty