A similarity-based approach to sampling absence data for landslide susceptibility mapping using data-driven methods

2019 
Abstract Absence data (samples) for landslide susceptibility mapping using data-driven methods are not directly available and are often approximated by locations where no landslides have occurred. Existing methods for generating absence data cannot quantify the reliability of candidate absence data, and thus such data reduce the quality of prediction. In this paper, a new approach to absence data generation, referred to as similarity-based sampling, is proposed for landslide susceptibility mapping using data-driven methods. First, the reliability of each candidate absence sample is quantified based on the dissimilarity in environmental conditions (covariate conditions) between that sample and the presence data (the landslide occurrences). Then, the candidates whose reliability value is higher than a given threshold are selected as absence data. The proposed approach was validated through its application to three data-driven methods (i.e. logistic regression, support vector machine and random forest) for landslide susceptibility mapping. A case study was conducted in the Youfang catchment in southern Gansu Province, China. Ten groups of absence data were generated, each corresponding to one of ten reliability thresholds ranging from 0.0 to 0.9. The results show that the prediction accuracy of the data-driven methods rose as the threshold increased from 0.0 to 0.5 and then decreased as the threshold increased further from 0.5 to 0.9; the best performance was obtained at a threshold of 0.5. The proposed method was also compared with existing methods for absence data generation (i.e. buffer-controlled and target space exteriorization). The results show that the similarity-based approach performs better than these existing methods for landslide susceptibility mapping using data-driven methods.
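The two-step procedure described in the abstract (score each candidate, then keep those above a threshold) can be illustrated with a minimal Python sketch. The abstract does not give the similarity formula, so everything specific here is an assumption made for illustration: the Gaussian-shaped per-covariate similarity, the `scale` parameter, the limiting-factor (minimum over covariates) rule, and the function names `reliability` and `select_absences`. Reliability is taken as one minus the similarity to the most similar landslide location.

```python
import numpy as np


def reliability(candidates, presences, scale):
    """Reliability of each candidate absence sample, in [0, 1).

    candidates: (n_c, k) covariate values at candidate locations
    presences:  (n_p, k) covariate values at landslide locations
    scale:      (k,) per-covariate width (hypothetical tuning parameter)
    """
    candidates = np.asarray(candidates, dtype=float)
    presences = np.asarray(presences, dtype=float)
    scale = np.asarray(scale, dtype=float)

    # Scaled absolute difference for every candidate/presence pair.
    diff = np.abs(candidates[:, None, :] - presences[None, :, :]) / scale
    # Assumed Gaussian-shaped similarity per covariate, in (0, 1].
    per_covariate = np.exp(-0.5 * diff ** 2)
    # Assumed limiting-factor rule: the least similar covariate decides.
    pair_similarity = per_covariate.min(axis=2)
    # A candidate is unreliable if it resembles ANY landslide location.
    max_similarity = pair_similarity.max(axis=1)
    return 1.0 - max_similarity


def select_absences(candidates, presences, scale, threshold=0.5):
    """Keep candidates whose reliability is higher than the threshold."""
    r = reliability(candidates, presences, scale)
    keep = r > threshold
    return np.asarray(candidates, dtype=float)[keep], r


if __name__ == "__main__":
    # Toy covariates: slope (degrees) and elevation (m); values are made up.
    presences = [[30.0, 1500.0], [35.0, 1600.0]]
    candidates = [[30.0, 1500.0], [5.0, 900.0], [33.0, 1550.0]]
    selected, r = select_absences(candidates, presences, scale=[5.0, 100.0])
    print("reliability:", np.round(r, 3))
    print("selected absences:", selected)
```

Sweeping `threshold` over 0.0 to 0.9 and retraining a classifier on each resulting absence set would mirror the experiment design summarized above; the reported optimum of 0.5 applies to the paper's own reliability measure and need not carry over to this stand-in.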