A Statistical Data-Filtering Method Proposed for Short-Term Load Forecasting Models

Duong Minh Bui, Phuc Duy Le, Tien Minh Cao, Hung Nguyen, Trang Pham, Duy Anh Pham

2020

Assessing the reliability of SCADA-based load data is necessary for improving the accuracy of short-term load forecasting (STLF) methods in a distribution network (DN). Specifically, the reliability evaluation aims to properly eliminate noise and outliers caused by random power consumption behavior or sudden changes in load demand from industrial and residential customers in the DN. This paper therefore proposes a novel statistical data-filtering method, applied at the input data pre-processing stage, which evaluates the reliability of the input load data by analyzing all possible data confidence levels in order to filter out noise and outliers and thereby improve the accuracy of different STLF models. The proposed method is also compared with existing data-filtering methods such as the Kalman Filter, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Discrete Wavelet Transform (DWT), and Singular Spectrum Analysis (SSA). Moreover, several STLF case studies for a typical 22 kV distribution network in Vietnam are conducted with an Artificial Neural Network (ANN) model, a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) model, a combined Long Short-Term Memory and Convolutional Neural Network (LSTM-CNN) model, and a conventional Autoregressive Integrated Moving Average (ARIMA) model to validate the proposed method. The results demonstrate that STLF using the ANN, LSTM-RNN, LSTM-CNN, and ARIMA models with the statistical data-filtering method outperforms STLF using the same models with the existing data-filtering methods. Additionally, the numerical results indicate that when the SCADA-based load data is normally distributed, time-series forecasting models are preferable to neural network models; when the load data instead contains multiple normally distributed sub-datasets, neural network-based prediction models are highly recommended.
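
The abstract does not spell out the filtering procedure, so the following is only a minimal sketch of a generic confidence-level filter, not the authors' algorithm. It assumes the load readings are roughly normally distributed and keeps only the values inside the two-sided interval for a chosen confidence level. The function name `filter_by_confidence`, the 0.95 default, and the sample readings are all hypothetical and chosen purely for illustration.

```python
from statistics import NormalDist, mean, stdev


def filter_by_confidence(loads, confidence=0.95):
    """Keep readings inside the two-sided normal interval for `confidence`.

    Illustrative assumption: the data is approximately normal, so the
    interval is mean +/- z * sample standard deviation.
    """
    mu = mean(loads)
    sigma = stdev(loads)  # sample standard deviation; needs >= 2 readings
    # z-score for a two-sided interval, e.g. about 1.96 for 95 %
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lower, upper = mu - z * sigma, mu + z * sigma
    return [x for x in loads if lower <= x <= upper]


# Hypothetical load readings (MW) with one obvious spike at 25.0
readings = [10.1, 10.4, 9.8, 10.0, 10.3, 9.9, 10.2, 25.0, 10.1, 9.7]
filtered = filter_by_confidence(readings, confidence=0.95)
```

With these sample values the spike at 25.0 falls outside the 95 % interval and is dropped, while the other nine readings are kept. A lower confidence level narrows the interval and removes more points, which is the trade-off that an analysis across several confidence levels would have to weigh.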
