SGCSumm: An extractive multi-document summarization method based on pre-trained language model, submodularity, and graph convolutional neural networks

2023 
The growing volume of online text generated by humans and machines calls for automatic text summarization systems. Recent studies commonly combine deep learning with sentence embedding and feature learning mechanisms to address text summarization. However, since finding the optimal extractive summary is NP-hard, they leave open the question of how the quality of their solutions can be guaranteed. In our previous work, we proposed DSNSum, an extractive summarizer based on deep submodular networks (DSN) that uses handcrafted features and leverages submodularity to guarantee a minimum performance bound. In this paper, SGCSumm, an extractive multi-document summarization method, is presented, in which DSN is again used to guarantee a minimum performance bound. It addresses DSNSum's shortcomings with enhancements that yield a summarizer supporting sentence embedding and graph structure-aware feature learning, and it takes a formal approach to show that these enhancements are possible without losing the minimum performance guarantee. Finally, the proposed summarization method is evaluated on the DUC 2004 and DailyMail/CNN datasets. The experimental results show that the performance of SGCSumm is comparable to that of state-of-the-art summarization methods in terms of ROUGE scores.
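The "minimum performance bound" the abstract refers to is the classic guarantee for greedy maximization of a monotone submodular objective: the greedy summary scores at least (1 - 1/e) of the optimal summary's score (Nemhauser et al., 1978). The sketch below is not the authors' DSN/SGCSumm model; it is a minimal illustration of that idea, assuming a hypothetical facility-location coverage objective over sentence embeddings produced by any pre-trained language model encoder.

```python
import numpy as np

def facility_location(selected, sim):
    """Monotone submodular coverage: each sentence is 'covered' by its most similar selected sentence."""
    if not selected:
        return 0.0
    return sim[:, selected].max(axis=1).sum()

def greedy_summary(embeddings, budget):
    """Greedily pick `budget` sentences maximizing the facility-location objective.

    For monotone submodular objectives, this greedy solution is within a
    (1 - 1/e) factor of the optimal subset of the same size.
    """
    # Cosine similarity between all sentence embeddings.
    normed = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-12)
    sim = normed @ normed.T

    selected = []
    for _ in range(budget):
        current = facility_location(selected, sim)
        candidates = [i for i in range(len(embeddings)) if i not in selected]
        # Marginal gain of adding each remaining sentence.
        gains = [facility_location(selected + [i], sim) - current for i in candidates]
        selected.append(candidates[int(np.argmax(gains))])
    return selected

# Usage: embeddings would come from a pre-trained language model; random vectors stand in here.
sent_embeddings = np.random.rand(20, 768)  # 20 candidate sentences, hypothetical 768-d embeddings
print(greedy_summary(sent_embeddings, budget=3))
```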