Dynamic ensemble classification for credit scoring using soft probability

2018 
A soft-probability-based dynamic ensemble classification method is proposed. Soft probability covers both classifier selection and combination. Selecting classifiers based on Type I and Type II error can minimize risk. Combining different classifiers for the testing samples is more reasonable. Selective ensembles for credit scoring are a promising research field.

In recent years, classification ensembles, or multiple classifier systems, have been widely applied to credit scoring, and they achieve significantly better performance than individual classifiers. Selective ensembles, an important part of this group of systems, are a promising field of research. However, none of the existing selective ensembles considers the relative costs of Type I and Type II errors when selecting classifiers for credit scoring, which brings higher risk for financial institutions. Moreover, earlier dynamic selective ensembles usually select and combine classifiers for each test sample based on the classifiers' performance on the validation set, regardless of their behavior on the testing set. To fill this gap and overcome these limitations, we propose a new dynamic ensemble classification method for credit scoring based on soft probability. In this method, the classifiers are first selected based on their classification ability and the relative costs of Type I and Type II errors on the validation set. The selected classifiers are then combined for the samples in the testing set, based on their classification results, to obtain an interval probability of default by using soft probability. The proposed method is compared with well-known individual classifiers and ensemble classification methods, including five selective ensembles, on ten real-world credit scoring data sets using seven performance indicators. Through these analyses and statistical tests, the experimental results demonstrate the ability and efficiency of the proposed method to improve prediction performance over the benchmark models.
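
The two stages described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the selection criterion or the soft probability formulas, so everything below is an assumption made for illustration. In particular, the sketch assumes that label 1 means default, that Type I error is the rate of good applicants flagged as defaulters and Type II error is the rate of missed defaulters (conventions differ between papers), that classifiers are ranked by a simple cost-weighted sum of the two rates, and that the interval is just the minimum and maximum of the selected classifiers' predicted default probabilities. The function names and the cost values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (type_i, type_ii), assuming label 1 = default (positive class).

    Assumed convention: Type I = good applicant predicted as default,
    Type II = defaulter predicted as good.
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    type_i = fp / negatives if negatives else 0.0
    type_ii = fn / positives if positives else 0.0
    return type_i, type_ii


def expected_cost(y_true: Sequence[int], y_pred: Sequence[int],
                  cost_i: float, cost_ii: float) -> float:
    """Cost-weighted sum of the two error rates (an illustrative criterion)."""
    type_i, type_ii = error_rates(y_true, y_pred)
    return cost_i * type_i + cost_ii * type_ii


def select_classifiers(y_val: Sequence[int], val_preds: Dict[str, List[int]],
                       keep: int, cost_i: float, cost_ii: float) -> List[str]:
    """Keep the `keep` classifiers with the lowest cost on the validation set."""
    ranked = sorted(
        val_preds,
        key=lambda name: expected_cost(y_val, val_preds[name], cost_i, cost_ii),
    )
    return ranked[:keep]


def default_interval(probs: Dict[str, float], selected: List[str]) -> Tuple[float, float]:
    """Stand-in for the interval probability of default: min and max of the
    selected classifiers' predicted probabilities for one test sample.
    This is NOT the paper's soft probability combination."""
    values = [probs[name] for name in selected]
    return min(values), max(values)


# Toy validation labels and predictions for three hypothetical classifiers.
y_val = [0, 0, 0, 0, 1, 1]
val_preds = {
    "A": [0, 0, 0, 0, 0, 0],  # never predicts default
    "B": [1, 1, 0, 0, 1, 1],  # catches all defaulters, rejects some good applicants
    "C": [0, 0, 0, 1, 1, 0],  # misses one defaulter
}
# Assumed cost ratio: a missed defaulter costs five times a rejected good applicant.
selected = select_classifiers(y_val, val_preds, keep=2, cost_i=1.0, cost_ii=5.0)
interval = default_interval({"A": 0.1, "B": 0.7, "C": 0.4}, selected)
```

With the assumed 1:5 cost ratio, classifier A is dropped even though it makes no Type I errors, because missing every defaulter dominates its cost. This is the kind of effect that cost-aware selection is meant to capture.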