EDM-JBW: A Novel Event Detection Model Based on JS-ID′F_order and Bikmeans with Word Embedding for News Streams

2017 
Due to the popularity of the Internet, news streams overflow every day, and people struggle to capture the underlying hot events in them. Discovering hot events from numerous news documents has become an urgent problem. However, existing work on this problem uses only traditional Term Frequency × Inverse Document Frequency (TF-IDF) for text feature selection and text representation, and some approaches do not consider time order. In this paper, we propose a novel event detection model called EDM-JBW. The model uses Jaccard Similarity coefficient × Inverse Dimension Frequency with time order, i.e., JS-ID′F_order, based on word2vec to build document embeddings, and uses Bikmeans to cluster all news documents into news events. Experimental evaluation on three real datasets shows that our techniques outperform the baseline techniques.
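To make the pipeline described above more concrete, the following is a minimal sketch of its general shape: documents are embedded as weighted averages of word vectors and then clustered by repeated two-way splits. It rests on several assumptions. The abstract does not give the JS-ID′F_order formula, so the `weights` dictionary is only a placeholder for whatever per-term score the model computes. The two-dimensional word vectors are toy values rather than trained word2vec embeddings. Bikmeans is assumed here to mean generic bisecting k-means, and all function names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sse(vectors):
    """Sum of squared distances to the centroid."""
    center = mean(vectors)
    return sum(sq_dist(v, center) for v in vectors)

def weighted_doc_vector(tokens, word_vecs, weights):
    """Weighted average of word vectors for one document.

    `weights` is a placeholder for a term-weighting score (assumption);
    terms missing from it default to weight 1.0.
    """
    dim = len(next(iter(word_vecs.values())))
    known = [t for t in tokens if t in word_vecs]
    total = sum(weights.get(t, 1.0) for t in known)
    if not known or total == 0:
        return [0.0] * dim
    return [sum(weights.get(t, 1.0) * word_vecs[t][d] for t in known) / total
            for d in range(dim)]

def two_means(indices, vectors, rng, iters=20):
    """Split one cluster (given as point indices) in two with Lloyd iterations."""
    a, b = rng.sample(indices, 2)
    centers = [vectors[a], vectors[b]]
    left, right = [], []
    for _ in range(iters):
        left, right = [], []
        for i in indices:
            if sq_dist(vectors[i], centers[0]) <= sq_dist(vectors[i], centers[1]):
                left.append(i)
            else:
                right.append(i)
        if not left or not right:
            break  # degenerate split; the caller discards it
        centers = [mean([vectors[i] for i in left]),
                   mean([vectors[i] for i in right])]
    return left, right

def bisecting_kmeans(vectors, k, trials=5, seed=0):
    """Generic bisecting k-means: repeatedly split the cluster with the largest SSE."""
    rng = random.Random(seed)
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break
        target = max(candidates, key=lambda c: sse([vectors[i] for i in c]))
        best = None
        for _ in range(trials):
            left, right = two_means(target, vectors, rng)
            if not left or not right:
                continue
            cost = (sse([vectors[i] for i in left])
                    + sse([vectors[i] for i in right]))
            if best is None or cost < best[0]:
                best = (cost, left, right)
        if best is None:
            break  # no valid split found (e.g. all points identical)
        clusters.remove(target)
        clusters += [best[1], best[2]]
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Toy data: hand-made 2-D "embeddings", not real word2vec output.
word_vecs = {"match": [1.0, 0.0], "goal": [0.9, 0.1],
             "stock": [0.0, 1.0], "market": [0.1, 0.9]}
docs = [["match", "goal"], ["goal", "match", "goal"],
        ["stock", "market"], ["market", "stock", "stock"]]
weights = {}  # placeholder: every term gets the default weight 1.0
doc_vecs = [weighted_doc_vector(d, word_vecs, weights) for d in docs]
labels = bisecting_kmeans(doc_vecs, k=2)
```

On this toy input the two sports documents end up in one cluster and the two finance documents in the other. Swapping in real word2vec vectors and a real term-weighting score would only change `word_vecs` and `weights`; the clustering loop stays the same.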