Predicting COVID-19 incidence using Google Trends and data mining techniques: A pilot study in Iran

2020 
BACKGROUND: COVID-19 is a recent global outbreak affecting many countries around the world. Iran is one of the ten most affected countries. Search engines capture useful population-level data, and these data might help in analyzing epidemics. Applying data mining methods to data from electronic resources might give better insight into managing the health crisis of the coronavirus outbreak, both for each country and for the world.
OBJECTIVE: This study aims to predict the incidence of COVID-19 in Iran.
METHODS: The data were obtained from the Google Trends website. Linear regression and long short-term memory (LSTM) models were used to estimate the number of positive COVID-19 cases. All models were evaluated using 10-fold cross-validation, with root mean square error (RMSE) as the performance metric.
RESULTS: The linear regression model predicted the incidence with an RMSE of 7.562 ± 6.492. Besides the previous day's incidence, the most effective factors were the search frequencies for the handwashing, hand sanitizer, and antiseptic topics. The RMSE of the LSTM model was 27.187.
CONCLUSIONS: Data mining algorithms can be employed to predict outbreak spreading trends. Such predictions might support policymakers and healthcare managers in planning and allocating healthcare resources accordingly.
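The evaluation described in the methods (a linear regression scored by 10-fold cross-validation and reported as mean RMSE with its spread across folds) can be illustrated with a minimal sketch. This is not the authors' code: the case series below is synthetic placeholder data, only the previous day's incidence is used as a predictor (the study also used Google Trends search frequencies), and the helper names `fit_line`, `rmse`, and `kfold_rmse` are hypothetical.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def kfold_rmse(xs, ys, k=10, seed=0):
    """Shuffled k-fold cross-validation; returns (mean, std) of fold RMSEs."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in fold]
        scores.append(rmse([ys[i] for i in fold], preds))
    return statistics.fmean(scores), statistics.stdev(scores)


# Synthetic placeholder series (assumption): NOT the study's data.
rng = random.Random(42)
cases = [50 + 3 * t + rng.gauss(0, 5) for t in range(60)]

# Predict today's count from the previous day's count.
prev_day, today = cases[:-1], cases[1:]
mean_rmse, sd_rmse = kfold_rmse(prev_day, today)
print(f"RMSE: {mean_rmse:.3f} +/- {sd_rmse:.3f}")
```

Reporting the mean and standard deviation of the per-fold RMSE mirrors the "value ± spread" form used in the results. Note that shuffled folds ignore the time ordering of the series; a time-aware split may be preferable for forecasting tasks.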