A Concept-Based ILP Approach for Multi-document Summarization Exploring Centrality and Position

2018 
Multi-document summarization systems aim to generate a brief text containing the most relevant information from a collection of related documents. The fast and continually growing volume of text data has increasingly drawn the attention of users and researchers to such systems. Aspects such as sentence centrality and position have been extensively studied in multi-document summarization as indicators of content relevance. However, very few works have investigated their efficient integration using global optimization approaches. This paper proposes a concept-based integer linear programming (ILP) approach for multi-document summarization of news articles that integrates centrality and position features to filter out the less relevant sentences and to measure the importance of concepts (textual fragments) in composing the output summary. The approach also relies on a centrality-based strategy to perform sentence clustering and to support the sentence ordering step. Experiments on four datasets from the Document Understanding Conferences (DUC) from 2001 to 2004 demonstrate that the proposed approach achieves competitive performance compared with other state-of-the-art methods.
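To make the idea of concept-based selection concrete, the following is a minimal sketch and not the paper's actual model. It assumes each sentence carries a word length and a set of concepts, each concept has an importance weight (here invented for illustration; in the paper such weights would come from centrality and position features), and the summary must fit a word budget. The objective counts each covered concept once, as in concept-based ILP formulations. For simplicity the tiny instance is solved by exhaustive search rather than by an ILP solver, and the function name `select_sentences` is hypothetical.

```python
from itertools import combinations


def select_sentences(sentences, weights, budget):
    """Pick the sentence subset maximizing the total weight of distinct concepts.

    sentences: list of (length_in_words, set_of_concepts) pairs
    weights:   dict mapping concept -> importance weight (illustrative values)
    budget:    maximum summary length in words
    Exhaustive search stands in for an ILP solver; only viable for tiny inputs.
    """
    best_score, best_subset = 0.0, ()
    indices = range(len(sentences))
    for size in range(1, len(sentences) + 1):
        for subset in combinations(indices, size):
            length = sum(sentences[i][0] for i in subset)
            if length > budget:
                continue  # violates the length constraint
            # A concept counts once, no matter how many chosen sentences contain it.
            covered = set().union(*(sentences[i][1] for i in subset))
            score = sum(weights[c] for c in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score


# Toy data: all lengths, concepts, and weights are made up for illustration.
sentences = [
    (10, {"earthquake", "city"}),
    (12, {"earthquake", "rescue"}),
    (8, {"rescue", "aid"}),
]
weights = {"earthquake": 3.0, "city": 1.0, "rescue": 2.0, "aid": 1.5}

chosen, score = select_sentences(sentences, weights, budget=20)
print(chosen, score)
```

In this toy instance the second sentence has the highest individual score, yet the best summary under the budget combines the first and third sentences, because together they cover all four concepts without paying twice for a repeated one. This redundancy-avoiding behavior is what a global, concept-level objective provides over ranking sentences independently.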