ROGA: Random Over-sampling Based on Genetic Algorithm

2021 
When using machine learning to solve practical tasks, we often face the problem of class imbalance. Unbalanced classes will cause the model to generate preferences during the learning process, thereby ignoring classes with fewer samples. The oversampling algorithm achieves the purpose of balancing the difference in quantity by generating a minority of samples. The quality of the artificial samples determines the impact of the oversampling algorithm on model training. Therefore, a challenge of the oversampling algorithm is how to find a suitable sample generation space. However, too strong conditional constraints can make the generated samples as non-noise points as possible, but at the same time they also limit the search space of the generated samples, which is not conducive to the discovery of better-quality new samples. Therefore, based on this problem, we propose an oversampling algorithm ROGA based on genetic algorithm. Based on random sampling, new samples are gradually generated and the samples that may become noise are filtered out. ROGA can ensure that the sample generation space is as wide as possible, and it can also reduce the noise samples generated. By verifying on multiple datasets, ROGA can achieve a good result.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    0
    References
    1
    Citations
    NaN
    KQI
    []