Model-Based Versus Data-Driven Approach for Road Safety Analysis: Do More Data Help?

2016 
Crash data for road safety analysis and modeling are growing steadily in size and completeness thanks to the latest advances in information technology. This increased availability of large datasets has generated renewed interest in applying data-driven nonparametric approaches as an alternative to traditional parametric models for crash risk prediction. This paper investigates how the relative performance of these two approaches changes as crash data grow. The authors compare two popular techniques, one from each approach: negative binomial (NB) models for the parametric approach and kernel regression (KR) for the nonparametric counterpart. Using two large crash datasets, the authors investigate the performance of these two methods as a function of the amount of training data. Through a rigorous bootstrapping validation process, the study found that the two approaches exhibit strikingly different patterns, especially in terms of sensitivity to data size. The kernel regression method outperforms the model-based NB approach in predictive performance, and that advantage increases noticeably as the data available for calibration grow. With the arrival of the Big Data era and the added benefits of enabling automated road safety analysis and improved responsiveness to the latest safety issues, nonparametric techniques (especially modern machine learning approaches) could be included as one of the important tools for road safety studies.
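To make the kernel regression and bootstrap validation ideas concrete, the following is a minimal sketch, not the authors' procedure. It assumes a single covariate, a Gaussian kernel with a fixed bandwidth, and synthetic Poisson counts standing in for crash data. The function names, the bandwidth value, and the data-generating curve are all hypothetical choices made for illustration.

```python
import numpy as np


def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Gaussian-kernel Nadaraya-Watson estimate of E[y | x] at each query point."""
    # Pairwise scaled distances, shape (n_query, n_train).
    diffs = (x_query[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    # Weighted average of training responses for every query point.
    return weights @ y_train / weights.sum(axis=1)


def bootstrap_mae(x, y, x_test, y_test, n_train, bandwidth, n_boot, rng):
    """Mean absolute error on a fixed test set, averaged over bootstrap
    training samples of size n_train drawn with replacement."""
    errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(x), size=n_train)
        pred = nadaraya_watson(x[idx], y[idx], x_test, bandwidth)
        errors.append(np.mean(np.abs(pred - y_test)))
    return float(np.mean(errors))


# Synthetic stand-in for crash counts (assumption: not real crash data).
rng = np.random.default_rng(0)
x_all = rng.uniform(0.0, 1.0, size=3000)           # e.g. a scaled exposure measure
y_all = rng.poisson(np.exp(0.5 + np.sin(3.0 * x_all))).astype(float)

# Hold out a fixed test set; resample training sets from the rest.
x_pool, y_pool = x_all[:2000], y_all[:2000]
x_test, y_test = x_all[2000:], y_all[2000:]

for n_train in (50, 200, 1000):
    mae = bootstrap_mae(x_pool, y_pool, x_test, y_test,
                        n_train=n_train, bandwidth=0.1, n_boot=20, rng=rng)
    print(f"n_train={n_train:5d}  bootstrap MAE={mae:.3f}")
```

Running the same loop with a parametric count model fitted on the identical bootstrap samples would give the kind of side-by-side comparison across training sizes that the abstract describes. In practice the bandwidth would be tuned (for example by cross-validation) rather than fixed.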