Improving classification accuracy using data augmentation on small data sets

2020 
Abstract Data augmentation (DA) is a key element in the success of Deep Learning (DL) models, as its use can lead to better prediction accuracy values when large size data sets are used. DA was not very much used with earlier neural network models before 2012, and the reason might be related to the type of models and the size of the data sets used. We investigate in this work, applying several state-of-the-art models based on Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), the effect of DA when using small size data sets, analyzing the results in terms of the prediction accuracy obtained according to the different characteristics of the training samples (number of instances and features, and class unbalance degree). We further introduce modifications to the standard methods used to generate the synthetic samples to alter the class balance representation, and the overall results indicate that with some computational effort a significant increase in prediction accuracy can be obtained when small data sets are considered.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    37
    References
    17
    Citations
    NaN
    KQI
    []