Text Categorization as a Graph Classification Problem

2015 
In this paper, we consider the task of text categorization as a graph classification problem. By representing textual documents as graph-of-words instead of historical n-gram bag-of-words, we extract more discriminative features that correspond to long-distance n-grams through frequent subgraph mining. Moreover, by capitalizing on the concept of k-core, we reduce the graph representation to its densest part – its main core – speeding up the feature extraction step for little to no cost in prediction performances. Experiments on four standard text classification datasets show statistically significant higher accuracy and macro-averaged F1-score compared to baseline approaches.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    53
    References
    76
    Citations
    NaN
    KQI
    []