High Intrinsic Dimensionality Facilitates Adversarial Attack: Theoretical Evidence
2020
Machine learning systems are vulnerable to adversarial attack. By applying to the input object a small, carefully-designed perturbation, a classifier can be tricked into making an incorrect prediction. This phenomenon has drawn wide interest, with many attempts made to explain it, but a complete understanding is yet to emerge. In this paper we adopt a slightly different perspective, still relevant to classification. We consider retrieval, where the output is the set of objects most similar to a user-supplied query object, corresponding to the set of $k$-nearest neighbors. We investigate the effect of adversarial perturbation on the ranking of objects with respect to a query. Through theoretical analysis, supported by experiments, we demonstrate that as the intrinsic dimensionality of the data domain rises, the amount of perturbation required to subvert neighborhood rankings diminishes, and the vulnerability to adversarial attack rises. We examine two modes of perturbation of the query: moving it either ‘closer’ to a chosen target point, or ‘farther’ from it. We also consider two perspectives: ‘query-centric’, examining the effect of perturbation on the query’s own neighborhood ranking, and ‘target-centric’, considering the ranking of the query point in the target’s neighborhood set. All four cases correspond to practical scenarios involving classification and retrieval.
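To make the central claim concrete, the following is a minimal numerical sketch, not taken from the paper: for i.i.d. Gaussian data of increasing dimension (whose intrinsic dimensionality matches the full dimension), it estimates the relative perturbation of a query needed to pull the nearest non-member of its $k$-NN set inside that set (the ‘closer’, query-centric case, to first order). The function name, data distribution, and parameters below are illustrative assumptions.

```python
# Illustrative sketch only (assumptions: i.i.d. Gaussian data, Euclidean distance,
# k = 10, n = 2000). To first order, shifting the query straight toward its
# (k+1)-th neighbor by the gap between the (k+1)-th and k-th neighbor distances
# pulls that point into the k-NN set; we report this gap relative to the k-NN
# radius, so smaller values mean the ranking is easier to subvert.
import numpy as np

rng = np.random.default_rng(0)

def relative_gap(query, data, k):
    """Relative query perturbation needed to pull the (k+1)-th neighbor
    into the k-NN set: (d_{k+1} - d_k) / d_k."""
    dists = np.linalg.norm(data - query, axis=1)
    dists.sort()
    return (dists[k] - dists[k - 1]) / dists[k - 1]

for d in (2, 8, 32, 128, 512):
    data = rng.standard_normal((2000, d))
    query = rng.standard_normal(d)
    print(f"dim={d:4d}  relative perturbation needed ~ {relative_gap(query, data, k=10):.3f}")
```

Under these assumptions, the printed ratio shrinks as the dimension grows, because pairwise distances concentrate and neighbor distances become nearly indistinguishable; this is consistent with the abstract’s claim that higher intrinsic dimensionality lowers the perturbation needed to subvert neighborhood rankings.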