Finite-sample analysis of impacts of unlabeled data and their labeling mechanisms in linear discriminant analysis

2017 
ABSTRACTIt is widely believed that unlabeled data are promising for improving prediction accuracy in classification problems. Although theoretical studies about when/how unlabeled data are beneficial exist, an actual prediction improvement has not been sufficiently investigated for a finite sample in a systematic manner. We investigate the impact of unlabeled data in linear discriminant analysis and compare the error rates of the classifiers estimated with/without unlabeled data. Our focus is a labeling mechanism that characterizes the probabilistic structure of occurrence of labeled cases. Results imply that an extremely small proportion of unlabeled data has a large effect on the analysis results.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    44
    References
    1
    Citations
    NaN
    KQI
    []