Incorporating External Knowledge into Unsupervised Graph Model for Document Summarization

2020 
Supervised neural network models have achieved outstanding performance on the document summarization task in recent years. However, in practice it is hard to obtain enough high-quality labeled training data for these models to generate different types of summaries. In this work, we focus on improving the performance of the popular unsupervised TextRank algorithm, which requires no labeled training data, for extractive summarization. We first modify the original edge weight of TextRank to take the relative position of sentences into account, and then combine the output of the improved TextRank with K-means clustering to improve the diversity of generated summaries. To further improve the performance of our model, we incorporate external knowledge from open-source knowledge graphs into our model via entity linking. We use the knowledge graph sentence embedding and the tf-idf embedding as the input of our improved TextRank, and obtain the final score for each sentence by a linear combination of the two. Evaluations on the New York Times data set show the effectiveness of our knowledge-enhanced approach. The proposed model significantly outperforms other popular unsupervised models.
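The abstract's core ideas can be sketched as follows. This is a minimal illustration, not the paper's implementation: the position factor `position_weight` and the damping/iteration settings are assumptions, and the similarity matrix stands in for the tf-idf or knowledge-graph sentence embeddings the paper actually uses.

```python
def position_weight(i, n):
    # Hypothetical position factor: bias edge weights toward earlier
    # sentences, reflecting the paper's position-aware edge modification.
    return 1.0 + 1.0 / (i + 1)

def textrank(sim, pos_w, d=0.85, iters=50):
    """Power-iteration TextRank over a sentence-similarity matrix `sim`,
    with each edge weight scaled by the target sentence's position factor
    (an assumed form of the paper's modification)."""
    n = len(sim)
    # Position-modified edge weights.
    w = [[sim[i][j] * pos_w[j] for j in range(n)] for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            s = 0.0
            for j in range(n):
                if j == i:
                    continue
                out = sum(w[j][k] for k in range(n) if k != j)
                if out > 0:
                    s += w[j][i] / out * scores[j]
            new.append((1 - d) / n + d * s)
        scores = new
    return scores

# Toy example: 3 sentences with pairwise similarities.
sim = [[0.0, 0.5, 0.2],
       [0.5, 0.0, 0.4],
       [0.2, 0.4, 0.0]]
pw = [position_weight(i, 3) for i in range(3)]
tfidf_scores = textrank(sim, pw)

# Linear combination of the two TextRank runs, as in the abstract:
# one over tf-idf embeddings, one over knowledge-graph embeddings.
# Here kg_scores is a placeholder; alpha is an assumed mixing weight.
kg_scores = textrank(sim, pw)
alpha = 0.5
final = [alpha * a + (1 - alpha) * b for a, b in zip(tfidf_scores, kg_scores)]
```

The top-scoring sentences under `final` would then be grouped by K-means clustering so the selected summary sentences cover different clusters, improving diversity.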