Learning to Defense by Learning to Attack

2019 
Adversarial training is a principled approach for training robust neural networks. From an optimization perspective, adversarial training solves a bilevel optimization problem (a general form of minimax approaches): the leader problem aims to learn a robust classifier, while the follower problem tries to generate adversarial samples. Unfortunately, such a bilevel problem is very challenging to solve due to its highly complicated structure. This work proposes a new adversarial training method based on a generic learning-to-learn (L2L) framework. Specifically, instead of applying hand-designed algorithms to the follower problem, we learn an optimizer that is parametrized by a convolutional neural network. Meanwhile, a robust classifier is learned to defend against the adversarial attacks generated by the learned optimizer. Our experiments on CIFAR datasets demonstrate that L2L improves upon existing methods in both robust accuracy and computational efficiency. Moreover, the L2L framework can be extended to other popular bilevel problems in machine learning.
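
The bilevel structure described above can be written out as a rough sketch. The abstract does not give notation, so every symbol here is an assumption made for illustration: a classifier f_θ, a loss ℓ, a perturbation δ bounded by ε in the ℓ∞ norm (a common choice, not confirmed by the abstract), and an attacker network g_φ standing in for the learned optimizer. The inputs passed to g_φ are likewise hypothetical.

```latex
% Standard minimax form of adversarial training (assumed notation):
% the leader minimizes over classifier weights \theta,
% the follower maximizes the loss over a bounded perturbation \delta.
\begin{equation}
  \min_{\theta} \;
  \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\|_{\infty} \le \epsilon}
        \ell\big(f_{\theta}(x+\delta),\, y\big) \Big]
\end{equation}

% L2L-style variant (a sketch, not necessarily the authors' exact form):
% the inner maximization is replaced by a learned attacker g_{\phi},
% giving a bilevel problem in (\theta, \phi).
\begin{equation}
\begin{aligned}
  \min_{\theta} \quad
    & \mathbb{E}_{(x,y)\sim\mathcal{D}}
      \Big[ \ell\big(f_{\theta}(x + g_{\phi^{*}(\theta)}(x, y)),\, y\big) \Big] \\
  \text{s.t.} \quad
    & \phi^{*}(\theta) \in \arg\max_{\phi} \;
      \mathbb{E}_{(x,y)\sim\mathcal{D}}
      \Big[ \ell\big(f_{\theta}(x + g_{\phi}(x, y)),\, y\big) \Big],
      \qquad \|g_{\phi}(x, y)\|_{\infty} \le \epsilon .
\end{aligned}
\end{equation}
```

In this reading, the first equation is the follower problem solved per sample by a hand-designed attack, while the second amortizes that search into the parameters φ of a single network trained alongside the classifier.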