A topic-enhanced word embedding for Twitter sentiment classification

2016 
Word representation is crucial to the lexical features used in Twitter sentiment analysis models. Recent work has demonstrated that dense, low-dimensional, real-valued word embeddings give competitive performance for Twitter sentiment classification. We follow this line of work and propose a topic-enhanced word embedding for the task, incorporating topic information that previous work has generally neglected. First, we use a recursive autoencoder framework to learn the topic-enhanced word embedding, with topic information generated through topic modeling based on an effective implementation of Latent Dirichlet Allocation (LDA). Then, to compare existing word representation methods with ours, we use a uniform framework built on a Support Vector Machine (SVM) classifier. Experimental results show that the topic-enhanced word embedding is very effective for Twitter sentiment classification.
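
As a rough illustration of what "topic-enhanced" features could look like, the sketch below joins a word's embedding with a per-word topic distribution and averages the result over a tweet. This is a deliberate simplification and not the method of the paper: plain concatenation stands in for the recursive autoencoder, and all vectors, topic distributions, and function names are hypothetical toy values rather than trained embeddings or LDA output.

```python
from typing import Dict, List

# Hypothetical toy word embeddings (2 dimensions); not from any trained model.
WORD_VECTORS: Dict[str, List[float]] = {
    "great": [0.9, 0.1],
    "phone": [0.2, 0.7],
    "awful": [-0.8, 0.1],
}

# Hypothetical per-word topic distributions over 2 topics (each sums to 1),
# standing in for what an LDA model might produce.
WORD_TOPICS: Dict[str, List[float]] = {
    "great": [0.5, 0.5],
    "phone": [0.9, 0.1],
    "awful": [0.4, 0.6],
}

EMBED_DIM = 2
NUM_TOPICS = 2


def topic_enhanced_vector(word: str) -> List[float]:
    """Concatenate a word's embedding with its topic distribution."""
    return WORD_VECTORS[word] + WORD_TOPICS[word]


def tweet_features(tokens: List[str]) -> List[float]:
    """Average the topic-enhanced vectors of all known tokens in a tweet."""
    vectors = [
        topic_enhanced_vector(t)
        for t in tokens
        if t in WORD_VECTORS and t in WORD_TOPICS
    ]
    if not vectors:
        # No known tokens: fall back to an all-zero feature vector.
        return [0.0] * (EMBED_DIM + NUM_TOPICS)
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

A fixed-length vector such as `tweet_features(["great", "phone"])` could then be passed to any standard classifier, for example an SVM, which is the role the classifier plays in the comparison framework described above.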