An Improved Naive Bayesian Classification Algorithm for Sentiment Classification of Microblogs

2014 
For attribute-weighted naive Bayesian classification algorithms, the choice of weights directly affects the classification results. With this in mind, the drawbacks of TF-IDF feature selection in sentiment classification of microblogs are analyzed, and an improved algorithm named TF-D(t)-CHI is proposed, which uses statistical calculation to obtain the degree of correlation between feature words and classes. It characterizes the distribution of feature items across classes by their variance, which addresses the problem that short texts contain few feature words while high-frequency feature words receive excessively high weights. Experimental results indicate that naive Bayesian classification using TF-D(t)-CHI for feature selection and weight calculation achieves better results in sentiment classification of microblogs.
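The abstract does not give the TF-D(t)-CHI formula, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes the weight of a term in a class is the product of three factors: the term frequency in that class, D(t) taken as the variance across classes of the fraction of documents containing the term, and the standard chi-square statistic of the term/class contingency table. The function names `chi_square` and `term_weights` and the multiplicative combination are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def chi_square(a, b, c, d):
    """Standard chi-square statistic for a 2x2 term/class table.

    a: docs in the class that contain the term
    b: docs outside the class that contain the term
    c: docs in the class that lack the term
    d: docs outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def term_weights(docs, labels):
    """docs: list of token lists; labels: parallel list of class labels.

    Returns {(term, class): weight}, where weight = tf * D(t) * chi.
    This product is an assumed combination, not taken from the paper.
    """
    classes = sorted(set(labels))
    n_docs = len(docs)
    class_sizes = Counter(labels)
    df = {cls: Counter() for cls in classes}  # document frequency per class
    tf = {cls: Counter() for cls in classes}  # raw term frequency per class
    for tokens, label in zip(docs, labels):
        tf[label].update(tokens)
        df[label].update(set(tokens))

    vocab = set().union(*(tf[cls].keys() for cls in classes))
    weights = {}
    for term in vocab:
        # Assumed D(t): variance over classes of the share of docs containing the term.
        rates = [df[cls][term] / class_sizes[cls] for cls in classes]
        d_t = pvariance(rates)
        total_df = sum(df[cls][term] for cls in classes)
        for cls in classes:
            a = df[cls][term]
            b = total_df - a
            c = class_sizes[cls] - a
            d = n_docs - class_sizes[cls] - b
            weights[(term, cls)] = tf[cls][term] * d_t * chi_square(a, b, c, d)
    return weights
```

Under these assumptions, a word that appears evenly in every class gets a variance near zero and therefore a low weight regardless of how frequent it is, which is the effect the abstract attributes to the variance term.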