Just the Facts: Winnowing Microblogs for Newsworthy Statements using Non-Lexical Features

2017 
Microblogging has become a popular way to disseminate information quickly, but it is also used for many other dialogue acts, such as expressing opinions and advertising. As message volumes have risen, the task of filtering messages for wanted information has become increasingly important. In this work we examine the potential of natural language processing and machine learning to filter short messages for those that state items of news. We propose an approach that makes use of information carried at a deeper level than a message’s lexical surface, and show that this can be used effectively to improve precision in filtering Twitter messages. Our method outperforms a baseline unigram “bag-of-words” approach to selecting news-event Tweets, yielding a 4.8% drop in false detection.