Text classification method based on convolution neural network

2017 
Automatic text classification is a fundamental task in natural language processing that helps users select vital information from massive text resources. To better represent the semantic meaning of a text, and to avoid the manual feature extraction that traditional methods require, we use the TF-IDF algorithm to calculate the weight of each word in a text and then weight the word vectors by their TF-IDF values. This produces text vectors with clearer semantic meaning. We then feed the resulting text vector matrix into a convolutional neural network (CNN), which extracts text features automatically. Extensive experiments on two data sets demonstrate that our approach effectively improves classification accuracy, reaching 96.28% and 96.97% on the two data sets, respectively.
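The pipeline described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which TF-IDF variant, word embeddings, or CNN architecture were used, so the formula (term frequency times log inverse document frequency), the toy two-dimensional `embeddings`, the single hand-written `kernel`, and all function names below are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: tf-idf} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            w: (c / len(doc)) * math.log(n_docs / df[w])
            for w, c in counts.items()
        })
    return weights


def weighted_text_matrix(doc, weights, embeddings):
    """Scale each token's word vector by its TF-IDF weight (one row per token)."""
    return [[weights[w] * x for x in embeddings[w]] for w in doc]


def conv_max_pool(matrix, kernel):
    """Slide one filter over consecutive rows, apply ReLU, then max-pool.

    Assumes the document has at least len(kernel) tokens.
    """
    h = len(kernel)
    responses = []
    for i in range(len(matrix) - h + 1):
        s = sum(
            kernel[r][c] * matrix[i + r][c]
            for r in range(h)
            for c in range(len(kernel[r]))
        )
        responses.append(max(0.0, s))  # ReLU activation
    return max(responses)  # max-over-time pooling


# Hypothetical toy corpus and 2-dimensional word vectors (illustration only).
docs = [
    ["cheap", "stocks", "rise"],
    ["stocks", "fall"],
    ["team", "wins", "match"],
]
embeddings = {
    "cheap": [0.2, 0.1], "stocks": [0.9, 0.3], "rise": [0.5, 0.4],
    "fall": [0.4, -0.5], "team": [-0.6, 0.8], "wins": [-0.3, 0.7],
    "match": [-0.5, 0.6],
}

weights = tfidf_weights(docs)
matrix = weighted_text_matrix(docs[0], weights[0], embeddings)

# One hypothetical filter spanning two consecutive tokens.
kernel = [[1.0, 0.5], [0.5, 1.0]]
feature = conv_max_pool(matrix, kernel)
print(matrix)
print(feature)
```

In a trained CNN there would be many such filters of several heights, with values learned from data, and the pooled features would feed a classifier layer; the single fixed filter here only shows how the weighted matrix is consumed.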