SVBO: Support Vector-Based Oversampling for handling class imbalance in k-NN

2012 
We propose a novel algorithm for handling class imbalance in the k-NN classifier. Class imbalance arises in many valuable application domains, such as medical diagnosis, fraud detection, and oil-spill detection. Because the problem affects all supervised classification algorithms, it has attracted a large amount of research. We tackle it by preprocessing the data with oversampling. We propose a two-phase algorithm based on Support Vector Data Description (SVDD), a tool for data description. In the first phase, we describe the minority class, i.e. the class with fewer samples, using SVDD. In the second phase, we oversample the resulting support vectors, a strategy that is well suited to k-NN. We evaluate our method on real-world datasets with different imbalance ratios and compare it with four other oversampling methods: SMOTE, Borderline-SMOTE, random oversampling, and cluster-based sampling. The results show that the proposed algorithm is a suitable preprocessing method for the k-NN classifier.
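The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions that the abstract does not specify: scikit-learn's `OneClassSVM` with an RBF kernel stands in for SVDD (the two formulations are closely related for Gaussian kernels), oversampling is done by randomly duplicating support vectors, the problem is binary, and the minority class is grown until both classes have the same size. The function name `oversample_support_vectors` and the values of `nu` and `gamma` are hypothetical.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def oversample_support_vectors(X, y, minority_label, nu=0.2, gamma="scale",
                               random_state=0):
    """Balance a binary dataset by duplicating minority support vectors.

    Assumption: OneClassSVM (RBF kernel) is used as a stand-in for SVDD,
    and "oversampling of the support vectors" is read as random duplication.
    """
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    X_min = X[y == minority_label]
    # Number of extra minority samples needed to match the majority class.
    n_needed = int(np.sum(y != minority_label)) - len(X_min)
    if n_needed <= 0:
        return X, y  # already balanced, nothing to add

    # Phase 1: describe the minority class with a one-class boundary.
    model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X_min)
    sv = model.support_vectors_  # minority points that define the boundary

    # Phase 2: draw support vectors with replacement and append the copies.
    picks = rng.integers(0, len(sv), size=n_needed)
    X_new = sv[picks]
    y_new = np.full(n_needed, minority_label, dtype=y.dtype)

    return np.vstack([X, X_new]), np.concatenate([y, y_new])
```

The balanced arrays returned by this sketch could then be passed to a k-NN classifier such as scikit-learn's `KNeighborsClassifier`. Duplicating boundary points raises the local density of the minority class where neighborhood votes are most contested, which is one plausible reading of why the approach suits k-NN.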