Improving Costs and Robustness of Machine Learning Classifiers Against Adversarial Attacks via Self Play of Repeated Bayesian Games

2020 
We consider the problem of adversarial machine learning where an adversary performs evasion attacks on a classifier-based learner by sending it queries with adversarial data of different attack strengths. The learner is unaware of whether a query sent to it is clean or adversarial. The objective of the learner is to mitigate the adversary's attacks by reducing its classification errors on adversarial data. To address this problem, we propose a technique where the learner maintains multiple classifiers trained with clean as well as adversarial data of different attack strengths. We then describe a game-theoretic framework based on a 2-player repeated Bayesian game, called a Repeated Bayesian Sequential Game (RBSG) with self play, that enables the learner to determine an appropriate classifier to deploy so that the likelihood of correctly classifying the query and preventing the evasion attack does not deteriorate, while reducing the cost of deploying the classifiers. Experimental results of our proposed approach on adversarial text data show that our RBSG with self play-based technique maintains classifier accuracies comparable with those of an individual, powerful, and costly classifier, while strategically using multiple lower-cost but less powerful classifiers to reduce the overall classification costs.
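The core selection problem the abstract describes can be illustrated with a minimal sketch. The following is not the paper's algorithm, but a simplified, hypothetical version of the idea: the learner holds a belief over the adversary's attack-strength types, picks the classifier minimizing expected cost (deployment cost plus expected misclassification penalty), and updates the belief across rounds of the repeated game. All costs, accuracies, and the update rule are illustrative assumptions.

```python
# Hypothetical setup (not from the paper): three classifiers of increasing
# power and deployment cost. ACCURACY[c][s] is classifier c's assumed
# accuracy against queries of attack strength s (0 = clean, 2 = strongest).
COSTS = [1.0, 2.0, 4.0]
ACCURACY = [
    [0.95, 0.70, 0.50],   # cheap classifier: weak against strong attacks
    [0.93, 0.85, 0.70],   # mid-range classifier
    [0.92, 0.90, 0.88],   # powerful, costly classifier
]
MISCLASSIFY_PENALTY = 20.0  # illustrative penalty per expected error

def best_classifier(belief):
    """Pick the classifier minimizing expected cost under the learner's
    current belief over the adversary's attack-strength types."""
    def expected_cost(c):
        err = sum(b * (1.0 - ACCURACY[c][s]) for s, b in enumerate(belief))
        return COSTS[c] + MISCLASSIFY_PENALTY * err
    return min(range(len(COSTS)), key=expected_cost)

def update_belief(belief, observed_strength, weight=0.2):
    """Crude belief update standing in for the Bayesian update of the
    repeated game: shift probability mass toward the attack strength
    the learner infers from the last round."""
    posterior = [b * (1.0 - weight) for b in belief]
    posterior[observed_strength] += weight
    total = sum(posterior)
    return [p / total for p in posterior]

# While queries are believed to be mostly clean, the cheap classifier
# minimizes expected cost; as belief shifts toward strong attacks over
# repeated rounds, the powerful classifier becomes the better choice.
belief = [0.8, 0.15, 0.05]
print(best_classifier(belief))        # → 0 (cheap classifier)
for _ in range(10):
    belief = update_belief(belief, observed_strength=2)
print(best_classifier(belief))        # → 2 (powerful classifier)
```

This captures the cost/robustness trade-off the abstract claims: the costly classifier is deployed only when the belief warrants it, so overall cost stays low without sacrificing accuracy against attacks.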