TFLA: A Quality Analysis Framework for User Generated Contents (Topic-Feature Lattice Analysis: A Quality Evaluation Method for User-Generated Text)

2018 
In this paper, we design a topic-feature lattice analysis (TFLA) framework based on objective quality dimensions. First, we apply latent Dirichlet allocation (LDA) to extract latent topics, which serve as the topic-features of each goods category. Second, we construct a formal context from the strong relationships between goods categories and topic-features, and use formal concept analysis (FCA) to derive the generalization and instantiation relationships among the topic-features. We then combine domain knowledge with these relationships to define five objective quality features, and use machine learning methods to build quality evaluation models on top of them. Experimental results on real comment data sets show that the predictions of the new quality models agree with the manually assigned quality labels in most cases. The best model reaches a mean absolute error (MAE) of 0.7 and an F-measure of 0.5, which is significantly better than the conventional quality prediction model based on Support Vector Machine (SVM) classification.
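The FCA step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the goods categories, the topic-feature names, and the helper functions `common_attributes`, `objects_having`, `formal_concepts`, and `is_generalization` are all hypothetical, and the brute-force enumeration is only practical for very small contexts. The sketch treats goods categories as objects and topic-features as attributes, enumerates the formal concepts, and reads generalization off the extents.

```python
from itertools import combinations

# Hypothetical formal context: goods category -> topic-features strongly tied to it.
# In the paper these topic-features would come from LDA; here they are made up.
context = {
    "phones": {"battery", "screen", "price"},
    "laptops": {"battery", "screen", "keyboard", "price"},
    "books": {"price", "plot"},
}


def common_attributes(objects, context, all_attributes):
    """Topic-features shared by every category in `objects` (all of them if empty)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attributes, context):
    """Categories that carry every topic-feature in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    all_attributes = set().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def is_generalization(general, specific):
    """A concept generalizes another when its extent is a proper superset."""
    return general[0] > specific[0]


concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

In this toy context, the concept whose intent is only `price` covers all three categories and therefore generalizes the concept shared by `phones` and `laptops`, which in turn generalizes the `laptops`-only concept. Relationships of this kind are what the quality features are said to build on.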