Air quality predictions with a semi-supervised bidirectional LSTM neural network

2020 
Abstract: Efficient and accurate air quality predictions can contribute to public health protection and policy decision making. Fine particulate matter (PM2.5) is an important index for measuring and controlling the degree of air pollution. Recent studies have obtained satisfactory PM2.5 predictions by designing complex models or by adding numerous auxiliary datasets to models, but few studies have effectively extracted the spatiotemporal features of PM2.5 time-series data. In this study, a semi-supervised model is proposed for predicting PM2.5 concentrations. The approach combines empirical mode decomposition (EMD) with bidirectional long short-term memory (BiLSTM) neural networks. The model requires only PM2.5 time-series data as inputs, which are treated as signal data. EMD is applied as an unsupervised feature learning method that decomposes the data and extracts frequency and amplitude features; this step improved short-term trend predictions, especially for sudden changes. BiLSTM is used in the supervised learning stage. Hourly and daily PM2.5 datasets for Beijing, collected from the China National Environmental Monitoring Centre, were used to validate the prediction performance of the proposed model. The results demonstrated that this model was more accurate than the other standard LSTM-based model, with better values on all four indicators at the hourly (RMSE: 6.86 μg·m⁻³, MAE: 4.92 μg·m⁻³, MAPE: 10.66%, R²: 0.989) and daily (RMSE: 22.58 μg·m⁻³, MAE: 16.67 μg·m⁻³, MAPE: 60.87%, R²: 0.742) scales. Furthermore, this study proposes a new method of multiscale PM2.5 prediction that reconstructs hourly PM2.5 datasets into multi-hour datasets. This method can reduce error accumulation in multi-step PM2.5 predictions made with LSTM-based models, and it captured at least 70% of the explained variance in this study, demonstrating the feasibility of the model.
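For readers unfamiliar with the four indicators reported above, the following is a minimal sketch of how RMSE, MAE, MAPE, and R² are commonly computed for a pair of observed and predicted series. The function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration. The sketch also assumes that R² is the coefficient of determination and that zero-valued observations are skipped when computing MAPE; the abstract does not state which conventions the authors used.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (in percent), and R2 for two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Root mean squared error and mean absolute error, in the units of the data.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined where the observation is zero.
    # Assumption: such points are skipped rather than treated as errors.
    ratios = [abs(e / o) for o, e in zip(observed, errors) if o != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    # Assumption: R2 is the coefficient of determination, 1 - SS_res / SS_tot.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Made-up concentrations, not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because MAPE divides each error by the observed value, low concentrations can inflate it considerably, so a series can show a high MAPE even when its RMSE and MAE are moderate.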