Better Word Representations with Word Weight

2019 
As a fundamental task of natural language processing, text classification has been widely used in applications such as sentiment analysis and spam detection. In recent years, continuous-valued word embeddings learned by neural networks have attracted extensive attention. Although word embeddings achieve impressive results in capturing similarities and regularities between words, they fail to highlight the words that matter most for identifying a text's category. This deficiency can be attenuated by word weights, which convey each word's contribution to text categorization. Toward this end, this paper proposes an effective text classification scheme that incorporates word weights into word embeddings. Specifically, to enrich word representations, a bidirectional gated recurrent unit (Bi-GRU) network is first employed to capture the context information of words. The word weights yielded by term frequency (TF) are then used to modulate the Bi-GRU word representations when constructing the text representation. Extensive experiments on several large text datasets verify that the proposed scheme outperforms state-of-the-art methods in classification accuracy.
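The described pipeline (embed words, contextualize them with a Bi-GRU, modulate each word representation by its TF weight, and pool into a text representation) can be illustrated with a minimal sketch. The following PyTorch code is an assumption-laden illustration, not the authors' exact implementation: the abstract does not specify the TF normalization, layer sizes, or pooling operator, so the module name `WeightedBiGRUClassifier`, the count-over-length TF weighting, and the sum pooling are all hypothetical choices.

```python
import torch
import torch.nn as nn

class WeightedBiGRUClassifier(nn.Module):
    """Sketch of a TF-weighted Bi-GRU text classifier (illustrative only)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bigru = nn.GRU(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids, tf_weights):
        # token_ids:  (batch, seq_len) integer word indices
        # tf_weights: (batch, seq_len) per-token TF weights, e.g.
        #             count(word in doc) / doc_length (assumed normalization)
        embedded = self.embedding(token_ids)           # (B, T, E)
        contextual, _ = self.bigru(embedded)           # (B, T, 2H) context-enriched words
        # Modulate each contextual word representation by its TF weight,
        # then sum over the sequence to obtain a fixed-size text representation.
        weighted = contextual * tf_weights.unsqueeze(-1)  # (B, T, 2H)
        text_repr = weighted.sum(dim=1)                   # (B, 2H)
        return self.classifier(text_repr)                 # (B, num_classes)

# Usage sketch with toy dimensions:
model = WeightedBiGRUClassifier(vocab_size=10000, embed_dim=100,
                                hidden_dim=128, num_classes=2)
token_ids = torch.randint(1, 10000, (4, 20))      # 4 documents, 20 tokens each
tf_weights = torch.rand(4, 20)                    # placeholder TF weights
logits = model(token_ids, tf_weights)             # (4, 2)
```

The key design point is that TF weighting happens after the Bi-GRU, so each word's context-aware representation, rather than its raw embedding, is scaled by its estimated importance before pooling.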