Graph Topic Neural Network for Document Representation

2021 
Graph Neural Networks (GNNs) such as GCN can effectively learn document representations via the semantic relation graph among documents and words. However, despite a few exceptions, most of the previous work in this line of research does not consider the underlying topical semantics inherited in document contents and the relation graph, making the representations less effective and hard to interpret. In a few recent studies trying to incorporate latent topics into GNNs, the topics have been learned independently from the relation graph modeling. Intuitively, topic extraction can benefit much from the information propagation of the relation graph structure - directly and indirectly connected documents and words have similar topics. In this paper, we propose a novel Graph Topic Neural Network (GTNN) model to mine latent topic semantics for interpretable document representation learning, taking into account the document-document, document-word, and word-word relationships in the graph. We also show that our model can be viewed as semi-amortized inference for relational topic model based on Poisson distribution, with high order correlations. We test our model in several settings: unsupervised, semi-supervised, and supervised representation learning, for both connected and unconnected documents. In all the cases, our model outperforms the state-of-the-art models for these tasks.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    33
    References
    2
    Citations
    NaN
    KQI
    []