EHSO: Evolutionary Hybrid Sampling in Overlapping Scenarios for Imbalanced Learning

2020 
Abstract Imbalanced learning is a challenging task for conventional algorithms. Sampling techniques address this problem by synthesizing minority class samples or selecting a subset of majority class samples to produce a balanced dataset. A large number of related studies have emerged in the past decades. Recent studies show that samples in the overlapping area play a more important role in improving classification performance on imbalanced data. However, how to eliminate majority class samples in the overlapping area efficiently, while avoiding the deterioration in classification performance caused by loss of the original distribution, is still an open problem. This paper proposes to deal with the overlapping samples with an Evolutionary Hybrid Sampling technique (EHSO). The main purpose of EHSO is to make the decision boundary more visible by removing useless majority class samples, and to avoid the unexpected data that synthetic minority samples may introduce. EHSO applies an evolutionary algorithm to find the optimal compromise between classification performance and the replicate ratio of random oversampling. Numerical experiments on all the binary-class imbalanced datasets (100 datasets) of the KEEL repository demonstrate its superiority over other well-known sampling methods.
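The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' EHSO implementation; it is a toy (1+1)-style evolutionary search under several assumptions that the abstract does not specify: a 1-nearest-neighbour classifier, G-mean on a held-out validation set as the performance measure, the replicate ratio defined as the number of duplicated minority samples divided by the original minority size, and a hypothetical penalty weight `lam`. All function names are illustrative.

```python
import math
import random

def dist2(a, b):
    # Squared Euclidean distance between two feature tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_1nn(train, point):
    # train is a list of (features, label); return the label of the nearest sample.
    return min(train, key=lambda s: dist2(s[0], point))[1]

def g_mean(train, valid):
    # Geometric mean of per-class recall; valid must contain both classes.
    hits, totals = {0: 0, 1: 0}, {0: 0, 1: 0}
    for x, y in valid:
        totals[y] += 1
        hits[y] += predict_1nn(train, x) == y
    return math.sqrt((hits[0] / totals[0]) * (hits[1] / totals[1]))

def resample(majority, minority, keep_mask, rng):
    # Undersample: keep only the majority samples selected by the mask.
    kept = [m for m, k in zip(majority, keep_mask) if k]
    # Random oversampling: duplicate minority samples until classes are balanced.
    n_extra = max(0, len(kept) - len(minority))
    extra = [rng.choice(minority) for _ in range(n_extra)]
    ratio = n_extra / len(minority)  # assumed definition of the replicate ratio
    train = [(x, 0) for x in kept] + [(x, 1) for x in minority + extra]
    return train, ratio

def fitness(majority, minority, keep_mask, valid, lam, rng):
    # Assumed scalar compromise: reward G-mean, penalise the replicate ratio.
    train, ratio = resample(majority, minority, keep_mask, rng)
    return g_mean(train, valid) - lam * ratio

def evolve(majority, minority, valid, generations=50, lam=0.05, seed=0):
    rng = random.Random(seed)
    best = [True] * len(majority)  # start by keeping every majority sample
    best_fit = fitness(majority, minority, best, valid, lam, rng)
    rate = 1.0 / len(majority)     # expected one bit flip per child
    for _ in range(generations):
        child = [(not k) if rng.random() < rate else k for k in best]
        if not any(child):         # never discard the whole majority class
            continue
        fit = fitness(majority, minority, child, valid, lam, rng)
        if fit >= best_fit:        # accept the child if it is no worse
            best, best_fit = child, fit
    return best, best_fit
```

Calling `evolve(majority, minority, valid)` with lists of feature tuples and a labelled validation list returns a keep/remove mask over the majority class together with its fitness. A scalar penalty is only one way to express the compromise; the abstract does not say whether EHSO combines the two objectives in this way.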