Text Segregation On Asynchronous Group Chat

2020 
Abstract: The ability to segregate texts by topic can enable successful text summarization. Although text summarization has been well researched, summarization of multi-party asynchronous chat has not been attempted. Popular systems such as WhatsApp and Telegram host this kind of chat. Such chats are asynchronous because participants can respond to a thread after a long delay rather than as an immediate reply to the ongoing conversation. Although a few chat data sets are available, annotated data sets of multi-party asynchronous chats are not yet available, even though summarization in this domain is a good use case for chat participants: with such a feature, a user can quickly review a summary of the conversations that took place during their period of inactivity. This is challenging, however, because any summarization attempt on such a data set must address three aspects, namely threads of discussion, time windows, and topics/sub-topics, in an interwoven sequence of messages that lacks the usual sense of order. Given the absence of annotated data sets and the difficulty of addressing these three aspects in seemingly non-sequential messages, machine learning based approaches are not expected to work well. In this work, an innovative pipeline based on heuristics is designed to address text segregation in this scenario. Once text segregation is achieved, text summarization may become a lesser challenge. The approach is observed to be promising enough to warrant further research and enhancements.