A Hybrid Approach for Preprocessing of Imbalanced Data in Credit Scoring Systems

2018 
During the last few years, classification task in machine learning is commonly used by various real-life applications. One of the common applications is credit scoring systems where the ability to accurately predict creditworthy or non-creditworthy applicants is critically important because incorrect predictions can cause major financial loss. In this paper, we aim to focus on skewed data distribution issue faced by credit scoring system. To reduce the imbalance between the classes, we apply preprocessing on the dataset which makes combined use of random re-sampling and dimensionality reduction. Experimental results on Australian and German credit datasets with the presented preprocessing technique has shown significant performance improvement in terms of AUC and F-measure.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    13
    References
    1
    Citations
    NaN
    KQI
    []