Topic Oriented Multi-document Summarization Using LSA, Syntactic and Semantic Features

2019 
Multi-document summarization (MDS) is the process of obtaining precise and concise information from a set of documents on the same topic. The generated summary helps the user understand the important information present in the documents. In general, a set of interrelated documents discusses a subject, and that subject comprises a set of topics, each of which is described in one or more of the documents. The user wishes to consolidate the subject in terms of the topics covered across the documents. Existing approaches struggle to identify the topics within a document and do not establish the semantic and syntactic relationships among the words within a sentence. In this paper, a novel unsupervised model is proposed to generate extractive multi-document summaries. It identifies the topics present in the documents using Latent Semantic Analysis (LSA) and eliminates redundant sentences that describe the same topic across multiple documents, using the semantic and syntactic information embedded in the sentences. Empirical evaluations are carried out using LSA, lexical, syntactic, and semantic features on the DUC2006 dataset. The experimental results on DUC2006 demonstrate that the performance of the proposed summarization system is comparable with that of existing summarization systems in terms of F-measure, recall, and precision.
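The two steps named in the abstract, LSA-based topic identification followed by removal of redundant sentences, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the syntactic and semantic features, so plain bag-of-words cosine similarity stands in for them here, and the names `tokenize`, `term_sentence_matrix`, `lsa_summary`, `n_topics`, and `max_sim` are hypothetical. The sketch assumes NumPy is available.

```python
import re
from collections import Counter

import numpy as np


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def term_sentence_matrix(sentences):
    """Build a term-by-sentence count matrix (rows: terms, columns: sentences)."""
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w, count in Counter(tokenize(s)).items():
            A[index[w], j] = count
    return A


def lsa_summary(sentences, n_topics=2, max_sim=0.7):
    """Pick one sentence per latent topic, skipping near-duplicates.

    ASSUMPTION: redundancy is measured with bag-of-words cosine similarity,
    a stand-in for the paper's syntactic and semantic features.
    """
    A = term_sentence_matrix(sentences)
    # Rows of Vt are latent topics; entries are per-sentence weights.
    _, S, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(n_topics, len(S))

    # Unit-length sentence vectors, so a dot product is a cosine similarity.
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0
    unit = A / norms

    chosen = []
    for t in range(k):
        # Try sentences in decreasing order of absolute weight on topic t.
        for j in np.argsort(-np.abs(Vt[t])):
            j = int(j)
            if j in chosen:
                continue
            # Accept only if not too similar to any sentence already chosen.
            if all(unit[:, j] @ unit[:, c] <= max_sim for c in chosen):
                chosen.append(j)
                break
    # Return the selected sentences in their original order.
    return [sentences[j] for j in sorted(chosen)]
```

Called on sentences pooled from several documents, `lsa_summary(sentences, n_topics=2)` returns at most two sentences, one per latent topic, and passes over any candidate whose cosine similarity to an already selected sentence exceeds `max_sim`.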