Rare-Event Time Series Prediction: A Case Study of Solar Flare Forecasting

2019 
We present a case study of time series prediction models in extreme class-imbalance problems. To create the forecasting dataset used in this study, we extracted multiple properties from the Space Weather ANalytics for Solar Flares (SWAN-SF) benchmark dataset, which comprises magnetic features from over 4075 active regions over a period of 9 years. In the extracted dataset, the class-imbalance ratio is 1:60, where the minority class is formed by instances of strong solar flares (GOES M- and X-class). This ratio reaches 1:800 if we consider only the strongest class of flares (GOES X-class). This extreme imbalance, together with the temporal coherence of the sliced time series, poses an interesting set of challenges for forecasting scarce real-life phenomena. We explore remedies for the class-imbalance issue, such as undersampling, oversampling, and misclassification weights. In the process, we elaborate on common mistakes and pitfalls caused by ignoring the side effects of these remedies, including how and why they weaken the robustness of the trained models while seemingly improving performance.
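The three remedies named above can be illustrated with a minimal sketch on toy labels. This is not the authors' pipeline: the function names (`make_labels`, `undersample`, `oversample`, `balanced_weights`), the toy sample size, and the inverse-frequency weighting formula are all assumptions chosen for illustration; only the 1:60 ratio comes from the abstract.

```python
import random
from collections import Counter


def make_labels(n_minority=10, ratio=60):
    """Toy labels with a 1:ratio imbalance (1 = strong flare, 0 = no strong flare)."""
    return [1] * n_minority + [0] * (n_minority * ratio)


def undersample(labels, rng):
    """Indices after randomly shrinking the majority class to the minority size."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    return pos + rng.sample(neg, len(pos))


def oversample(labels, rng):
    """Indices after resampling the minority class (with replacement) to the majority size."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    return neg + [rng.choice(pos) for _ in range(len(neg))]


def balanced_weights(labels):
    """Misclassification weights inversely proportional to class frequency.

    Assumed formula (hypothetical choice): weight = n / (n_classes * class_count).
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


rng = random.Random(0)          # fixed seed so the sketch is reproducible
labels = make_labels()          # 10 positives, 600 negatives
under = undersample(labels, rng)
over = oversample(labels, rng)
weights = balanced_weights(labels)

print(Counter(labels[i] for i in under))   # class counts after undersampling
print(Counter(labels[i] for i in over))    # class counts after oversampling
print(weights[1] / weights[0])             # minority-to-majority weight ratio
```

In a sketch like this, resampling would normally be applied to the training partition only, so that evaluation still reflects the original 1:60 ratio. Note also that oversampling with replacement duplicates minority instances, which is one way a model can appear to improve while becoming less robust.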