User and Topic Hybrid Context Embedding for Finance-Related Text Data Mining

2019 
Large volumes of finance-related microblogs, posted by both professional and amateur investors, circulate in online social networks and can affect real financial markets, so analyzing financial texts posted by different users is an essential task. Moreover, because hardly anyone is an all-round expert across financial products such as stocks, futures, and bonds, the topic preferences of different users should also be considered in these mining tasks. That is, financial text content should be analyzed at both the user level and the topic level. With recent advances in deep learning, learning an "embedding" representation of content has become a state-of-the-art approach to financial text mining, replacing hand-crafted feature extraction. Along this line, we propose a hybrid embedding method in which embedding representations of both users and financial topics are learned from finance-related content posted in online social networks. Following common practice in embedding research, we conduct extensive experiments on real-world datasets to demonstrate the effectiveness of our hybrid user and topic embedding (UTE) approach, both intrinsically and extrinsically. The results show that the learned embeddings intrinsically distinguish social network users, so that they can be grouped into human-explainable clusters. We also present a case study on sentiment analysis that applies our user and topic hybrid embedding within a deep contextual neural network architecture. The results show that our approach outperforms the baseline methods in finance-related sentiment analysis, and we expect it to benefit other downstream text mining tasks as well.