Topic Mining Based on the Heat of Micro-blog

2018 
With the growing popularity of social networks, users increasingly interact with each other and comment on current events online, and extracting hot topics has become a focus of natural language processing research. In this paper, we propose a hot topic extraction method based on the popularity of micro-blogs. First, we use cross-entropy to define the heat of a micro-blog from its numbers of comments and forwards. We then combine the heat of the micro-blog with a word2vec model to assign a weight to each word, apply a bidirectional LSTM (Long Short-Term Memory) network to encode document semantics, and use the single-pass method for topic mining. We evaluate the proposed method with three indicators: NMI (Normalized Mutual Information), PMI (Point-wise Mutual Information), and Purity. We crawled over 10,000 micro-blogs covering 15 hot topics from 2017 on Sina Weibo; the experimental results show that the proposed method performs better and is more robust than the traditional topic detection method.
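
The single-pass topic mining step and the Purity indicator can be illustrated with a minimal sketch. It assumes each micro-blog has already been encoded as a fixed-length vector (for example by the bidirectional LSTM encoder); the cosine similarity measure, the 0.8 threshold, the running-mean centroid update, and the function names `single_pass` and `purity` are illustrative assumptions, not details given in the abstract.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(vectors, threshold=0.8):
    """Assign each vector to the most similar existing topic,
    or open a new topic when no similarity reaches the threshold."""
    centroids = []  # running mean vector of each topic
    sizes = []      # number of documents in each topic
    labels = []     # topic index assigned to each document
    for v in vectors:
        best, best_sim = -1, -1.0
        for i, c in enumerate(centroids):
            sim = cosine(v, c)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            n = sizes[best]
            # Update the centroid as the mean of its members (assumed rule).
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], v)]
            sizes[best] = n + 1
            labels.append(best)
        else:
            centroids.append(list(v))
            sizes.append(1)
            labels.append(len(centroids) - 1)
    return labels


def purity(labels, gold):
    """Fraction of documents that belong to the majority gold class
    of the cluster they were assigned to."""
    by_cluster = defaultdict(list)
    for label, g in zip(labels, gold):
        by_cluster[label].append(g)
    majority = sum(Counter(members).most_common(1)[0][1]
                   for members in by_cluster.values())
    return majority / len(labels)


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    topics = single_pass(docs, threshold=0.8)
    print(topics, purity(topics, ["a", "a", "b", "b"]))
```

Because single-pass clustering reads each document once and never revisits earlier assignments, the result depends on input order and on the chosen threshold; both would need tuning on real data.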