A Bayesian Information Criterion for Unsupervised Learning Based on an Objective Prior

2019 
Data processing techniques, such as mathematical formulas, statistical methods and machine learning algorithms, require a set of tools for evaluating the knowledge extracted from data. In unsupervised learning, it is impossible to use referential or predictive estimation. Therefore, the only reliable way to evaluate the results of unsupervised learning is information estimation. Unfortunately, information estimation suffers from underfitting and overfitting. We propose a new method for evaluating unsupervised learning results, which is based on the Bayesian criterion for optimal decision and an objective prior probability distribution of partitions. We illustrate the application of the proposed method to Fisher’s iris data set by comparing the original label distribution with the results of clustering with different numbers of clusters. We show that the method prevents underfitting and overfitting, and we verify this by comparing the recommended value with the posterior distribution.
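The overfitting problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the criterion proposed in the paper: it assumes that "information estimation" is approximated by the empirical mutual information between a reference labelling and a clustering. The function name `mutual_information` and the toy partitions are hypothetical, and the data are made up for illustration (they are not the iris data).

```python
from collections import Counter
from math import log2


def mutual_information(labels_a, labels_b):
    """Empirical mutual information (in bits) between two partitions
    of the same items, given as equal-length label sequences."""
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy data: 12 items in 3 reference classes (not the iris data).
reference = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]

partitions = {
    "coarse": [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],   # 2 clusters, merges two classes
    "matched": [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],  # 3 clusters, same as reference
    "fine": list(range(12)),                          # 12 singleton clusters
}

scores = {name: mutual_information(reference, part)
          for name, part in partitions.items()}

for name, score in scores.items():
    print(f"{name}: {score:.3f} bits")
```

In this toy case the matched partition and the all-singletons partition receive the same score, because refining a partition never lowers its empirical mutual information with the reference labels. A raw information score therefore cannot penalize an excessive number of clusters, which is the kind of overfitting a prior over partitions is meant to counteract.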