A hybrid air quality early-warning framework: An hourly forecasting model with online sequential extreme learning machines and empirical mode decomposition algorithms

2020 
Abstract: Producing real-time air quality forecasts with a practical tool that helps mitigate public health risk remains challenging because air quality predictor variables are chaotic, non-linear and high-dimensional. This research proposes a hybrid early-warning artificial intelligence (AI) framework that emulates hourly air quality variables (Particulate Matter 2.5, PM2.5; Particulate Matter 10, PM10; and lower atmospheric visibility, VIS), the atmospheric variables associated with increased respiratory-induced mortality and recurrent health-care costs. First, the hourly air quality data series (January 2015 to December 2017) are decomposed into their respective intrinsic mode functions (IMFs) and a residual sub-series, which reveal patterns and resolve data complexity; the partial autocorrelation function is then applied to each IMF and to the residual sub-series to unveil historical changes in air quality. To design the hybrid model, the data are partitioned into training (70%), validation (15%) and testing (15%) subsets. The online sequential extreme learning machine (OS-ELM) algorithm is integrated with improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN), which serves as a data pre-processing system, to robustly extract predictive patterns and fine-tune model generalization toward a near-optimal global solution that represents modelled air quality at hourly forecast horizons. The resulting early-warning AI framework, denoted the ICEEMDAN-OS-ELM model, forecasts each IMF and the residual sub-series individually; hourly PM2.5, PM10 and VIS are then obtained by summing the forecasted IMFs and residual sub-series. The results are benchmarked against competing predictive approaches, e.g., the hybrid ICEEMDAN-multiple linear regression (MLR) and ICEEMDAN-M5 model tree, and the standalone OS-ELM, MLR and M5 model tree.
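The decompose, forecast and aggregate workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: a trailing moving average stands in for ICEEMDAN, a lag-1 linear regression stands in for OS-ELM, the series is synthetic, and all function names are hypothetical.

```python
import math
import random

def chronological_split(series, train=0.70, valid=0.15):
    """Split a series into train/validation/test blocks without shuffling."""
    n = len(series)
    i, j = round(n * train), round(n * (train + valid))
    return series[:i], series[i:j], series[j:]

def toy_decompose(series, window=24):
    """Stand-in for ICEEMDAN (assumption): trailing moving average + remainder.
    Like IMFs plus a residual, the components sum back to the original series."""
    trend = []
    for k in range(len(series)):
        chunk = series[max(0, k - window + 1):k + 1]
        trend.append(sum(chunk) / len(chunk))
    detail = [x - t for x, t in zip(series, trend)]
    return [detail, trend]

def fit_lag1(component):
    """Stand-in for OS-ELM (assumption): least-squares fit of y[t] = a + b*y[t-1]."""
    x, y = component[:-1], component[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b

def forecast_component(component, n_fit):
    """Fit on the first n_fit points, then predict each later point one step ahead."""
    a, b = fit_lag1(component[:n_fit])
    return [a + b * component[t - 1] for t in range(n_fit, len(component))]

def hybrid_forecast(series, n_fit, window=24):
    """Forecast every component separately and sum the component forecasts."""
    components = toy_decompose(series, window)
    per_component = [forecast_component(c, n_fit) for c in components]
    return [sum(values) for values in zip(*per_component)]

# Synthetic hourly series with a daily cycle (illustrative data only).
rng = random.Random(0)
series = [20 + 5 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 0.5)
          for t in range(1000)]

train, valid, test = chronological_split(series)
forecast = hybrid_forecast(series, n_fit=len(train) + len(valid))
```

Here `forecast` holds one-step-ahead predictions aligned with the `test` block. In a real pipeline the decomposition and the per-component learner would be replaced by ICEEMDAN and OS-ELM implementations, and the lagged inputs would be chosen from the partial autocorrelation function rather than fixed at lag 1.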
Statistical metrics, including the root-mean-square error (RMSE), mean absolute error (MAE), Willmott's Index (WI), Legates & McCabe's Index (ELM) and the Nash–Sutcliffe coefficient (ENS), are used to evaluate model accuracy. Both visual and statistical results show that the proposed ICEEMDAN-OS-ELM model outperforms the alternative approaches. For instance, for PM2.5, ELM values ranged from 0.65–0.82 for ICEEMDAN-OS-ELM vs. 0.59–0.77 for ICEEMDAN-M5 tree, 0.59–0.74 for ICEEMDAN-MLR, 0.28–0.54 for OS-ELM, 0.27–0.54 for M5 tree and 0.25–0.53 for the MLR model. For the remaining air quality variables (PM10 and VIS), the objective model (ICEEMDAN-OS-ELM) also outperformed the comparative models. In particular, ICEEMDAN-OS-ELM registered relatively low errors: an MAE of approximately 0.7–1.03 μg/m3 and an RMSE of 1.01–1.47 μg/m3 for PM2.5; an MAE of 1.29–3.84 μg/m3 and an RMSE of 3.01–7.04 μg/m3 for PM10; and an MAE of 0.01–3.72 Mm−1 and an RMSE of 0.04–5.98 Mm−1 for visibility. A Taylor diagram comparing forecasted and observed air quality illustrates the objective model's precision, confirming the versatility of the early-warning AI model in generating air quality forecasts. This performance supports the hybrid model's potential utility for air quality monitoring and subsequent public health risk mitigation.
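For readers unfamiliar with the five evaluation metrics, the sketch below computes them from paired observed and forecasted values. It assumes the commonly used definitions (the abstract does not state the exact formulas), and the function name and sample values are hypothetical.

```python
import math

def evaluate(observed, forecast):
    """Return RMSE, MAE, WI, ENS and ELM for paired observed/forecast values.
    Assumes the observations are not all identical (non-zero denominators)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [f - o for o, f in zip(observed, forecast)]
    sq_err = sum(e ** 2 for e in errors)
    abs_err = sum(abs(e) for e in errors)

    rmse = math.sqrt(sq_err / n)
    mae = abs_err / n
    # Willmott's Index of agreement: 1 is a perfect match.
    wi = 1 - sq_err / sum((abs(f - mean_obs) + abs(o - mean_obs)) ** 2
                          for o, f in zip(observed, forecast))
    # Nash-Sutcliffe coefficient: 0 means no better than the observed mean.
    ens = 1 - sq_err / sum((o - mean_obs) ** 2 for o in observed)
    # Legates & McCabe's Index: like ENS but with absolute deviations,
    # so large errors are not over-weighted by squaring.
    elm = 1 - abs_err / sum(abs(o - mean_obs) for o in observed)
    return {"RMSE": rmse, "MAE": mae, "WI": wi, "ENS": ens, "ELM": elm}

# Illustrative values only, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = evaluate(observed, observed)
mean_only = evaluate(observed, [2.5, 2.5, 2.5, 2.5])
```

A perfect forecast gives zero RMSE and MAE and a value of 1 for WI, ENS and ELM, whereas always forecasting the observed mean drives ENS and ELM to 0, which is why values such as 0.65–0.82 indicate skill well above that baseline.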