mixup: Beyond Empirical Risk Minimization

Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz

2018

Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
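
The "convex combinations of pairs of examples and their labels" described above can be illustrated with a minimal sketch. The abstract does not say how the mixing weight is chosen; the code below assumes it is drawn from a Beta(alpha, alpha) distribution and shared across the batch. The names `mixup_pair` and `mixup_batch` and the default `alpha=0.2` are hypothetical choices for illustration, not the authors' implementation.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Blend two examples and their one-hot labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


def mixup_batch(xs, ys, alpha=0.2, rng=None):
    """Mix every example in a batch with a randomly chosen partner.

    xs: list of feature vectors (lists of floats).
    ys: list of one-hot (or soft) label vectors.
    alpha: assumed Beta(alpha, alpha) parameter; illustrative default.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # one mixing weight for the whole batch
    partners = list(range(len(xs)))
    rng.shuffle(partners)                # partner index for each example
    mixed = [
        mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
        for i, j in enumerate(partners)
    ]
    return [m[0] for m in mixed], [m[1] for m in mixed], lam
```

Because the labels are blended with the same weight as the inputs, each mixed label remains a valid probability vector, and a network trained on such pairs is pushed toward linear behavior between training examples.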

Keywords:

Machine learning
Pattern recognition
Memorization
Artificial intelligence
Computer science
Robustness (computer science)
Artificial neural network
Contextual image classification
Empirical risk minimization
Generative grammar
Deep neural networks
Adversarial system
