Adversarial robustness for latent models: Revisiting the robust-standard accuracies tradeoff.

2021 
Over the past few years, several adversarial training methods have been proposed to improve the robustness of machine learning models against adversarial perturbations in the input. Despite remarkable progress in this regard, adversarial training is often observed to drop the standard test accuracy. This phenomenon has intrigued the research community to investigate the potential tradeoff between standard and robust accuracy as two performance measures. In this paper, we revisit this tradeoff for latent models and argue that this tradeoff is mitigated when the data enjoys a low-dimensional structure. In particular, we consider binary classification under two data generative models, namely Gaussian mixture model and generalized linear model, where the feature data lie on a low-dimensional manifold. We show that as the manifold dimension to the ambient dimension decreases, one can obtain models that are nearly optimal with respect to both, the standard accuracy and the robust accuracy measures.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    22
    References
    0
    Citations
    NaN
    KQI
    []