Graph-Based Term Weighting Scheme for Topic Modeling
2016
LSI and LDA are widely used techniques for uncovering the underlying topical structure of text. They traditionally rely on a bag-of-words representation of documents and on term frequency-based (TF) weighting schemes. In this paper, we represent documents as graphs-of-words to capture the relationships between close words, and we propose the number of contexts of co-occurrence as an alternative term weight (TW). Experiments on a downstream supervised task show that weighting a term by the importance of its node inside the graph yields statistically significantly higher accuracy and macro-averaged F1 score than TF-based LSI and LDA.
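To make the graph-of-words idea concrete, here is a minimal sketch of one possible reading of it. It is not the paper's implementation: the function names `graph_of_words` and `term_weights`, the window size, the use of an undirected graph, and the choice of node degree (number of distinct neighbours) as the "number of contexts of co-occurrence" are all assumptions made for illustration.

```python
from collections import defaultdict


def graph_of_words(tokens, window=4):
    """Build an undirected graph-of-words (assumed construction).

    Each distinct term is a node. An edge links two distinct terms that
    appear together inside a sliding window of `window` consecutive tokens.
    """
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        neighbors[term]  # touch the key so isolated terms still get a node
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def term_weights(tokens, window=4):
    """Assumed TW: node degree, i.e. the number of distinct co-occurring terms."""
    return {term: len(adj) for term, adj in graph_of_words(tokens, window).items()}


# Toy document, already tokenized and lowercased (hypothetical input).
tokens = "graph of words captures relationships between close words".split()
tw = term_weights(tokens, window=3)
print(tw)
```

In this toy document, "words" occurs twice, so its TF is 2, but it co-occurs with six distinct terms, so its degree-based weight is 6. The resulting weights could stand in for raw term frequencies in the document-term matrix passed to LSI or LDA, although how the paper performs that substitution is not stated in the abstract.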
References: 29
Citations: 3