Assessing the Generalization of Machine Learning-Based Slope Failure Prediction to New Geographic Extents

2021 
Probabilistic slope failure models generated using random forest (RF) machine learning (ML), manually interpreted incident points, and light detection and ranging (LiDAR) digital terrain variables are assessed for their ability to predict and generalize to new geographic extents. Specifically, models were created for four Major Land Resource Areas (MLRAs) in the state of West Virginia in the United States (US). Each region-specific model was then used to predict withheld validation data within all four MLRAs. For all validation datasets, the model trained using data from the same MLRA provided the highest overall accuracy (OA), Kappa statistic, F1 score, area under the receiver operating characteristic curve (AUC ROC), and area under the precision-recall curve (AUC PR). However, the model from the same MLRA as the validation dataset did not always provide the highest precision, recall, and/or specificity. This suggests that models extrapolated to new geographic extents tend to either overpredict or underpredict the land area of slope failure occurrence, whereas they offer a better balance between omission and commission error within the region in which they were trained. This study highlights the value of developing region-specific inventories and models, and of acquiring high-resolution, detailed digital elevation data, since models may not generalize well to new geographic extents, potentially as a result of spatial heterogeneity in landscape and/or slope failure characteristics.
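The distinction drawn above between summary metrics (OA, Kappa, F1) and the omission/commission balance (precision, recall, specificity) can be illustrated with a minimal sketch. It uses the standard confusion-matrix definitions of these metrics; the function name `binary_metrics` and the toy labels are hypothetical and are not data or code from the study. Labels are assumed to be 1 for slope failure and 0 for no failure.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    overall_accuracy: float
    kappa: float
    precision: float
    recall: float
    specificity: float
    f1: float


def _safe_div(num, den):
    # Return 0.0 instead of raising when a denominator is zero.
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Compute threshold-based metrics from binary labels (1 = failure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # commission error
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # omission error
    n = tp + tn + fp + fn

    oa = _safe_div(tp + tn, n)
    # Agreement expected by chance, from the row and column totals.
    pe = _safe_div((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _safe_div(oa - pe, 1 - pe)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    specificity = _safe_div(tn, tn + fp)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    return BinaryMetrics(oa, kappa, precision, recall, specificity, f1)


# Toy validation labels (hypothetical): four failures, four non-failures.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# A model that overpredicts failure area (extra false positives).
over = binary_metrics(y_true, [1, 1, 1, 1, 1, 1, 0, 0])
# A model that underpredicts failure area (extra false negatives).
under = binary_metrics(y_true, [1, 1, 0, 0, 0, 0, 0, 0])

for name, m in (("overpredict", over), ("underpredict", under)):
    print(name, m)
```

In this toy case both sets of predictions share the same OA and Kappa, yet one has perfect recall with reduced precision and specificity, and the other the reverse. This is why precision, recall, and specificity are worth reporting alongside the summary metrics when a model is applied outside the region in which it was trained.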