External validation for statistical NO2 modelling: A study case using a high-end mobile sensing instrument

2021 
Abstract Statistical learning models have been applied to study the spatial patterns of ambient nitrogen dioxide (NO2), a highly dynamic, traffic-related air pollutant. In most studies, validation is based on bootstrapped split-sampling of training and test sets from fixed ground station measurements. Because ground stations are mostly sparsely distributed over a region or country, this kind of cross-validation does not consider how well models represent spatial variations in air pollution, which mostly occur over distances shorter than the spacing between ground stations. This may lead to inadequate hyperparameter optimisation and to bias when comparing different statistical models. External mobile measurements are therefore needed for more reliable model evaluation, as they provide detailed and spatially continuous information on air pollution patterns. However, most current designs of mobile NO2 sensing instruments suffer from a trade-off between flexibility and measurement accuracy, as high-end sensors are commonly too heavy to be carried by a person or on a bike. In addition, sufficient repetitions over time are needed so that the measurements are representative of concentrations over a relatively long-term period. In this study, we installed a mobile air quality station onboard a cargo bike to collect a dataset suitable for external validation. With the external validation dataset, the model hyperparameter settings and the statistical model comparison results change. Our model comparison results also differ from those of previous studies that relied only on ground stations for cross-validation.
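The contrast between the two validation schemes named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: a one-predictor least-squares line stands in for the statistical learning models, and all function names, the predictor, and the numbers are hypothetical, made-up values chosen only for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for any model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct predictor values")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(model, xs, ys):
    """Root mean squared error of the fitted line on (xs, ys)."""
    a, b = model
    return math.sqrt(sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def bootstrap_split_rmse(xs, ys, n_rounds=50, seed=0):
    """Bootstrapped split-sampling on station data only.

    Each round trains on a bootstrap sample of stations and tests on the
    stations left out of that sample; the mean test RMSE is returned.
    """
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        out_of_bag = [i for i in range(n) if i not in in_bag]
        if not out_of_bag or len(in_bag) < 2:
            continue  # nothing to test on, or too few stations to fit
        model = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        scores.append(rmse(model,
                           [xs[i] for i in out_of_bag],
                           [ys[i] for i in out_of_bag]))
    if not scores:
        raise ValueError("no usable bootstrap rounds")
    return sum(scores) / len(scores)


def external_rmse(station_x, station_y, mobile_x, mobile_y):
    """Train on all stations, evaluate on independent mobile measurements."""
    return rmse(fit_line(station_x, station_y), mobile_x, mobile_y)


# Hypothetical data: a traffic-related predictor and NO2 concentrations.
station_x = [0.1, 0.4, 0.5, 0.9, 1.3, 1.8]
station_y = [12.0, 18.0, 19.5, 27.0, 33.0, 41.0]
mobile_x = [0.2, 0.3, 0.6, 0.7, 1.0, 1.1, 1.5]
mobile_y = [16.0, 14.5, 25.0, 20.0, 32.0, 26.5, 39.0]

print("bootstrapped split-sampling RMSE:", bootstrap_split_rmse(station_x, station_y))
print("external validation RMSE:", external_rmse(station_x, station_y, mobile_x, mobile_y))
```

The two scores are computed from different test sets, so they can rank candidate models or hyperparameter settings differently, which is the effect the abstract reports.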