Stocks Clustering Based on Textual Embeddings for Price Forecasting

2020 
Forecasting stock market prices is a hard task, mainly because the market environment is highly dynamic, intrinsically complex, and chaotic. Traditional economic theories suggest that trying to forecast short-term stock price movements is a wasted effort, because the market is influenced by many external events and its behavior approximates a random walk. Recent studies on stock market forecasting usually build a separate prediction model for the price behavior of each individual stock. In this work we propose a technique that predicts price movements from sets of similar stocks. Our goal is to build a model that identifies whether a price tends toward bullishness or bearishness in the near future, drawing on information from similar stocks and on two sources of information: historical stock data and Google Trends news. The proposed study first applies a method to identify sets of similar stocks and then creates a forecasting model for these sets based on an LSTM (long short-term memory) network. More specifically, two experiments were conducted: (1) using the K-Means algorithm to identify sets of similar stocks and then using an LSTM neural network to forecast stock price movements for these sets; (2) using the DBSCAN (density-based spatial clustering of applications with noise) algorithm to identify sets of similar stocks and then using the same LSTM neural network to forecast stock price movements. The study covered 51 stocks of the Brazilian stock market. The results show that using an algorithm to identify stock clusters yields an improvement of approximately 7% in accuracy and F1-score and 8% in recall and precision compared with single-stock models.
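The clustering step of experiment (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the ticker names and two-dimensional feature vectors are made up for illustration, the features actually used in the study are not given in this abstract, and the LSTM forecasting stage is omitted. The sketch runs plain K-Means (Lloyd's algorithm) with a deterministic start and then groups tickers by cluster, so that one model could be trained per group.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels


def group_by_cluster(tickers, labels):
    """Map each cluster label to the list of tickers assigned to it."""
    clusters = {}
    for ticker, lab in zip(tickers, labels):
        clusters.setdefault(lab, []).append(ticker)
    return clusters


# Hypothetical tickers and made-up feature vectors, for illustration only.
tickers = ["AAA", "BBB", "CCC", "DDD"]
features = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]

labels = kmeans(features, k=2)
clusters = group_by_cluster(tickers, labels)
print(clusters)
```

In this toy setup the two stocks near the origin fall into one cluster and the other two into a second cluster. A DBSCAN variant, as in experiment (2), would replace only the `kmeans` call; the grouping step would stay the same.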