Finite-sample analysis of impacts of unlabeled data and their labeling mechanisms in linear discriminant analysis

Kenichi Hayashi, Keiji Takai

2017

ABSTRACT

It is widely believed that unlabeled data can improve prediction accuracy in classification problems. Although theoretical studies exist on when and how unlabeled data are beneficial, the actual improvement in prediction for finite samples has not been systematically investigated. We investigate the impact of unlabeled data in linear discriminant analysis by comparing the error rates of classifiers estimated with and without unlabeled data. Our focus is the labeling mechanism, which characterizes the probabilistic structure governing which cases are labeled. The results imply that an extremely small proportion of unlabeled data has a large effect on the analysis results.
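The kind of comparison described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure. Every setting below is an assumption made for illustration: a one-dimensional feature, two normal classes with true means 0 and 2 and unit variance, equal class proportions, a labeled sample with equal counts per class, labels missing completely at random, and an EM algorithm as the estimator that uses the unlabeled cases. All function names are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def true_error(m1, m2, var, p1, mu1=0.0, mu2=2.0, sigma=1.0, prior1=0.5):
    """Exact error rate of the 1-D LDA rule built from (m1, m2, var, p1),
    evaluated under the assumed true model N(mu1, sigma^2) vs N(mu2, sigma^2)."""
    a = (m1 - m2) / var  # slope of the linear score
    b = -(m1 * m1 - m2 * m2) / (2.0 * var) + math.log(p1 / (1.0 - p1))
    if a == 0.0:  # degenerate rule: every case goes to a single class
        return 1.0 - prior1 if b > 0 else prior1
    cut = -b / a  # score(x) = a * x + b changes sign here
    if a > 0:     # class 1 is assigned when x > cut
        e1 = normal_cdf((cut - mu1) / sigma)
        e2 = 1.0 - normal_cdf((cut - mu2) / sigma)
    else:         # class 1 is assigned when x < cut
        e1 = 1.0 - normal_cdf((cut - mu1) / sigma)
        e2 = normal_cdf((cut - mu2) / sigma)
    return prior1 * e1 + (1.0 - prior1) * e2


def fit_labeled(xs, ys):
    """Maximum likelihood estimates from labeled cases only.
    Needs at least two labeled cases per class so the variance is positive."""
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    x2 = [x for x, y in zip(xs, ys) if y == 2]
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    ss = sum((x - m1) ** 2 for x in x1) + sum((x - m2) ** 2 for x in x2)
    return m1, m2, ss / len(xs), len(x1) / len(xs)


def fit_em(xs, ys, xu, n_iter=30):
    """EM estimates using labeled and unlabeled cases (assumes MCAR labels)."""
    m1, m2, var, p1 = fit_labeled(xs, ys)
    allx = list(xs) + list(xu)
    n = len(allx)
    for _ in range(n_iter):
        # E-step: posterior probability of class 1 for each unlabeled case
        w = [1.0 if y == 1 else 0.0 for y in ys]
        for x in xu:
            l1 = math.log(p1) - (x - m1) ** 2 / (2.0 * var)
            l2 = math.log(1.0 - p1) - (x - m2) ** 2 / (2.0 * var)
            d = max(-700.0, min(700.0, l2 - l1))  # clamp to avoid overflow
            w.append(1.0 / (1.0 + math.exp(d)))
        # M-step: weighted means, pooled variance, class proportion
        s1 = sum(w)
        s2 = n - s1
        m1 = sum(wi * x for wi, x in zip(w, allx)) / s1
        m2 = sum((1.0 - wi) * x for wi, x in zip(w, allx)) / s2
        var = sum(wi * (x - m1) ** 2 + (1.0 - wi) * (x - m2) ** 2
                  for wi, x in zip(w, allx)) / n
        p1 = s1 / n
    return m1, m2, var, p1


def simulate(n_lab=10, n_unlab=100, reps=200, seed=0):
    """Average true error rates of the labeled-only and EM classifiers."""
    rng = random.Random(seed)
    half = n_lab // 2
    err_lab = err_em = 0.0
    for _ in range(reps):
        xs = ([rng.gauss(0.0, 1.0) for _ in range(half)]
              + [rng.gauss(2.0, 1.0) for _ in range(half)])
        ys = [1] * half + [2] * half
        xu = [rng.gauss(0.0 if rng.random() < 0.5 else 2.0, 1.0)
              for _ in range(n_unlab)]
        err_lab += true_error(*fit_labeled(xs, ys))
        err_em += true_error(*fit_em(xs, ys, xu))
    return err_lab / reps, err_em / reps


if __name__ == "__main__":
    lab, em = simulate()
    print(f"labeled only: {lab:.4f}  with unlabeled (EM): {em:.4f}")
```

Because the true distribution is known in this toy setting, each classifier's error rate is computed exactly from the normal distribution function rather than estimated on a test set, so the only Monte Carlo noise comes from the training samples. Varying `n_lab` and `n_unlab` changes the proportion of unlabeled data. Studying a different labeling mechanism would require replacing the step that generates the unlabeled cases.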

Keywords:

Missing data
Statistics
Semi-supervised learning
Mathematics
Probabilistic logic
Monte Carlo method
Efficiency
Linear discriminant analysis
