LDAS: Local density-based adaptive sampling for imbalanced data classification

2021 
Abstract Class imbalance poses a major challenge to traditional machine learning classifiers, which strongly favor the majority class while ignoring the minority class. Synthetic over-sampling methods address this problem by generating synthetic examples to balance the data distribution. However, most existing methods generate synthetic examples in one specific area without considering the complexity of the imbalanced distribution, which may cause the learning model to over-emphasize some data difficulty factors. To this end, we propose a Local Density-based Adaptive Sampling method (LDAS) for imbalanced data. LDAS first assigns a local density to each minority example, then applies a new cleaning strategy to remove overlapping majority examples. Finally, it weights each minority example based on its proximity to the decision boundary and its local density, so that synthetic examples are generated in the safe area and the border area simultaneously, according to the weights of the minority examples. Extensive experiments on KEEL datasets demonstrate the effectiveness of the proposed LDAS.
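The abstract names four stages (local density, cleaning, weighting, generation) but gives no formulas, so the following is only a rough sketch of how such a pipeline could be wired together, not the authors' algorithm. Every concrete choice is an assumption made for illustration: the function name `ldas_like_oversample`, the density defined as the inverse mean distance to the k nearest minority neighbours, the cleaning rule that drops majority points inside a minority point's k-NN radius, the additive weight formula, and the SMOTE-style interpolation used to generate examples.

```python
import numpy as np

def pairwise_dist(A, B):
    # Euclidean distances between every row of A and every row of B.
    return np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))

def ldas_like_oversample(X_min, X_maj, k=3, n_new=None, seed=0):
    """Illustrative density-weighted over-sampling (assumes >= 2 minority points)."""
    rng = np.random.default_rng(seed)
    k = min(k, len(X_min) - 1)

    # Stage 1 (assumed definition): local density is the inverse of the
    # mean distance to the k nearest minority neighbours.
    d_mm = pairwise_dist(X_min, X_min)
    np.fill_diagonal(d_mm, np.inf)  # exclude self-distances
    nn_idx = np.argsort(d_mm, axis=1)[:, :k]
    knn_d = np.take_along_axis(d_mm, nn_idx, axis=1)
    density = 1.0 / (knn_d.mean(axis=1) + 1e-12)
    radius = knn_d[:, -1]  # distance to the k-th minority neighbour

    # Stage 2 (assumed rule): a majority point "overlaps" if it lies
    # inside the k-NN radius of any minority point; drop those points.
    d_mM = pairwise_dist(X_min, X_maj)
    overlapping = (d_mM <= radius[:, None]).any(axis=0)
    X_maj_clean = X_maj[~overlapping]

    # Stage 3 (assumed formula): weight = normalised closeness to the
    # nearest remaining majority point + normalised local density, so
    # both border and safe (dense) minority points receive weight.
    if len(X_maj_clean) > 0:
        d_border = pairwise_dist(X_min, X_maj_clean).min(axis=1)
    else:
        d_border = np.ones(len(X_min))
    closeness = 1.0 / (d_border + 1e-12)
    w = closeness / closeness.sum() + density / density.sum()
    w = w / w.sum()

    # Stage 4: SMOTE-style interpolation between a weighted seed point
    # and one of its k nearest minority neighbours.
    if n_new is None:
        n_new = max(len(X_maj_clean) - len(X_min), 0)
    seeds = rng.choice(len(X_min), size=n_new, p=w)
    mates = nn_idx[seeds, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[seeds] + gap * (X_min[mates] - X_min[seeds])
    return X_new, X_maj_clean, w
```

Calling `ldas_like_oversample(X_min, X_maj, k=2)` on two NumPy arrays of shape `(n, d)` returns the synthetic minority points, the cleaned majority set, and the per-example sampling weights. The additive weight is one simple way to let both border and safe regions receive synthetic examples; the paper's actual weighting may differ.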