Learning class-imbalanced data with region-impurity synthetic minority oversampling technique

2022 
kNN (k nearest neighbors), which cannot identify the class distribution between pairs of minority instances. Furthermore, the number of synthetic instances to generate remains an open question in this field of study. To address these issues, we propose a new algorithm named Region-Impurity Synthetic Minority Oversampling Technique (RIOT). Specifically, using a region radius, we locate the neighbors of each minority instance and identify the relatively hard-to-learn minority instances by the class ratio within the region, selecting them as the base for sample generation. Synthetic instances are then generated until the region is approximately balanced. In experiments on twelve real-world datasets, RIOT performed better than some SMOTE extensions on several model performance indicators while using fewer synthetic instances.
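The abstract does not give RIOT's exact definitions, so the following is only a minimal sketch of the general idea it describes, not the authors' implementation. The function name `region_oversample`, the Euclidean distance, the rule "majority count exceeds minority count" for flagging hard-to-learn instances, the SMOTE-style interpolation toward minority neighbors inside the region, and the choice to skip instances with no minority neighbor in the region are all assumptions made for illustration.

```python
import numpy as np

def region_oversample(X, y, minority_label, radius, seed=0):
    """Illustrative region-based oversampling (hypothetical, not the paper's code)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    is_min = y == minority_label
    synthetic = []
    for i in np.flatnonzero(is_min):
        # Region = all other instances within `radius` of minority instance i.
        dist = np.linalg.norm(X - X[i], axis=1)
        in_region = dist <= radius
        in_region[i] = False
        min_neighbors = np.flatnonzero(in_region & is_min)
        n_min = min_neighbors.size + 1            # include the instance itself
        n_maj = int(np.sum(in_region & ~is_min))
        # Assumed hardness rule: the region is dominated by the majority class.
        deficit = n_maj - n_min
        if deficit <= 0 or min_neighbors.size == 0:
            continue                              # balanced region, or no partner (assumption: skip)
        # Generate just enough points to roughly balance this region.
        partners = rng.choice(min_neighbors, size=deficit)
        gaps = rng.random((deficit, 1))
        synthetic.append(X[i] + gaps * (X[partners] - X[i]))
    if not synthetic:
        return X, y
    X_new = np.vstack(synthetic)
    y_new = np.full(len(X_new), minority_label, dtype=y.dtype)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])
```

Because each region is balanced independently against the original data, overlapping regions can receive more synthetic points than a single region strictly needs, so the result is only approximately balanced. Regions with no class overlap receive no synthetic instances at all, which is how a radius-based scheme of this kind could end up generating fewer instances than a fixed global oversampling ratio.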