Centroid prior topic model for multi-label classification

2015 
Supervised topic models such as labeled latent Dirichlet allocation (L-LDA) have attracted increasing attention for multi-label classification. However, they do not take into account the label frequency of a word (i.e., the number of labels containing the word), which is crucial for classification. To address this problem, we investigate the L-LDA model and propose an extension, the centroid prior topic model (CPTM). The class-feature-centroid (CFC) approach yields a discriminative label-word vector that takes the label frequency of each word into account. CPTM uses this CFC vector as the prior for the label-word distributions. We evaluate the algorithm with extensive experiments on the Yahoo! dataset. The results show that CPTM outperforms existing multi-label classification algorithms on AUC, Macro-F1 and Micro-F1.
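To make the notion of label frequency concrete, the following is a minimal sketch, not the authors' implementation. It counts, for each word, the number of labels whose documents contain it, and then builds a CFC-style weight that could serve as an asymmetric Dirichlet prior over label-word distributions. The abstract does not give the weighting formula, so the form used here (an exponential within-label term multiplied by a logarithmic inverse label frequency), the function names, the base `b`, the `smoothing` constant, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Toy corpus (hypothetical): each document is (tokens, labels).
docs = [
    (["ball", "game", "team"], ["sports"]),
    (["game", "market"], ["sports", "business"]),
    (["market", "stock"], ["business"]),
]

def label_frequency(docs):
    """Return {word: number of distinct labels whose documents contain the word}."""
    labels_of_word = defaultdict(set)
    for tokens, labels in docs:
        for word in set(tokens):
            labels_of_word[word].update(labels)
    return {word: len(labels) for word, labels in labels_of_word.items()}

def cfc_style_prior(docs, b=2.0, smoothing=0.01):
    """Build a CFC-style label-word weight table (assumed form, see lead-in).

    weight(label, word) = b ** (df / n_docs) * log(num_labels / label_freq)
    where df is the number of documents of the label that contain the word.
    Words absent from a label get weight 0. `smoothing` is added to every
    entry so that all values are strictly positive.
    """
    lf = label_frequency(docs)
    docs_per_label = defaultdict(int)
    df = defaultdict(lambda: defaultdict(int))
    for tokens, labels in docs:
        for label in labels:
            docs_per_label[label] += 1
            for word in set(tokens):
                df[label][word] += 1

    num_labels = len(docs_per_label)
    prior = {}
    for label, n_docs in docs_per_label.items():
        prior[label] = {}
        for word, n_labels in lf.items():
            count = df[label].get(word, 0)
            weight = 0.0
            if count > 0:
                weight = b ** (count / n_docs) * math.log(num_labels / n_labels)
            prior[label][word] = weight + smoothing
    return prior

prior = cfc_style_prior(docs)
```

In this toy corpus, "game" occurs under both labels, so its logarithmic term vanishes and only the smoothing constant remains, while "ball" occurs under a single label and receives a larger weight for "sports". This is the discriminative effect the abstract attributes to the label frequency of a word.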