Adversarial example soups: averaging multiple adversarial examples improves transferability without increasing additional generation time

arXiv (Cornell University) (2024)

Bo Yang Hengwei Zhang Chenwei Li Jindong Wang

Abstract:

In transfer-based attacks, adversarial examples are crafted on a surrogate model and then used to mislead the target model. The conventional method for maximizing adversarial transferability involves two steps: (1) fine-tuning hyperparameters to generate multiple batches of adversarial examples on the surrogate model; (2) keeping the batch of adversarial examples that performs best overall on the surrogate and target models, and discarding the others. In this work, we revisit the second step of this process in the context of fine-tuning hyperparameters to craft adversarial examples, where multiple batches of fine-tuned adversarial examples often lie on a single high-error hilltop. We demonstrate that averaging multiple batches of adversarial examples generated under different hyperparameter configurations, which we refer to as "adversarial example soups", can often enhance adversarial transferability. Compared with traditional methods, the proposed method incurs no additional generation time or computational cost. In addition, our method is orthogonal to existing transfer-based methods and can be combined with them seamlessly to generate more transferable adversarial examples. Extensive experiments on the ImageNet dataset show that our methods achieve a higher attack success rate than state-of-the-art attacks.
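
The averaging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain uniform mean over the batches, an L-infinity budget of 16/255, and pixel values in [0, 1], and it uses random perturbations as stand-ins for batches that a real attack would produce under different hyperparameter settings. The function name `average_adversarial_examples` is hypothetical.

```python
import numpy as np

def average_adversarial_examples(clean, adv_batches, epsilon):
    """Uniformly average several batches of adversarial examples (a "soup").

    Assumption: every batch in adv_batches perturbs the same clean images.
    """
    stacked = np.stack(adv_batches, axis=0)   # shape: (k, n, c, h, w)
    soup = stacked.mean(axis=0)               # element-wise mean over the k batches
    # The mean of points inside an L-infinity ball stays inside it (convexity),
    # so these clips only guard against numerical drift.
    soup = np.clip(soup, clean - epsilon, clean + epsilon)
    return np.clip(soup, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(4, 3, 8, 8))   # toy "images"
epsilon = 16 / 255                                  # assumed perturbation budget

# Stand-ins for batches crafted with different hyperparameters (assumption).
adv_batches = [
    np.clip(clean + rng.uniform(-epsilon, epsilon, size=clean.shape), 0.0, 1.0)
    for _ in range(3)
]

soup = average_adversarial_examples(clean, adv_batches, epsilon)
print(soup.shape, float(np.abs(soup - clean).max()) <= epsilon + 1e-9)
```

Because the batches already exist after hyperparameter tuning, the only extra work in this sketch is one mean and two clips, which is consistent with the abstract's claim that no additional generation time is needed.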

Keywords:

Transferability

Topics:

Adversarial Robustness in Machine Learning

Advanced Malware Detection Techniques

Security and Verification in Computing

10.48550/arxiv.2402.18370

Attack-Centric Approach for Evaluating Transferability of Adversarial Samples in Machine Learning Models

arXiv (Cornell University) (2021)

Tochukwu Idika İsmail Aktürk

Transferability of adversarial samples has become a serious concern because of its impact on the reliability of machine learning system deployments, which are finding their way into many critical applications. Knowing the factors that influence the transferability of adversarial samples can help experts make informed decisions on how to build robust and reliable machine learning systems. The goal of this study is to provide insights into the mechanisms behind the transferability of adversarial samples through an attack-centric approach. This attack-centric perspective interprets how adversarial samples would transfer by assessing the impact of the machine learning attacks that generated them on a given input dataset. To achieve this goal, we generated adversarial samples using attacker models and transferred these samples to victim models. We analyzed the behavior of adversarial samples on victim models and outlined four factors that can influence the transferability of adversarial samples. Although these factors are not necessarily exhaustive, they provide useful insights to researchers and practitioners of machine learning systems.
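
As an illustration of the attacker-to-victim setup described above, the following sketch computes one common transferability measure: the fraction of samples the victim classifies correctly when clean but misclassifies once perturbed. The abstract does not state which metric the study uses, so this definition, the function name `transfer_success_rate`, and the toy threshold classifier are all assumptions.

```python
import numpy as np

def transfer_success_rate(victim_predict, clean, adversarial, labels):
    """Fraction of correctly classified clean samples that the victim
    misclassifies after perturbation (an assumed, commonly used metric)."""
    clean_pred = victim_predict(clean)
    adv_pred = victim_predict(adversarial)
    correct = clean_pred == labels            # only count samples the victim got right
    if not correct.any():
        return 0.0
    fooled = adv_pred[correct] != labels[correct]
    return float(fooled.mean())

# Toy victim (hypothetical): predicts class 1 when the feature sum is positive.
def victim_predict(x):
    return (x.sum(axis=1) > 0).astype(int)

clean = np.array([[1.0], [2.0], [-1.0], [-3.0]])
labels = np.array([1, 1, 0, 0])
# Stand-in for samples crafted on a separate attacker model (assumption).
adversarial = np.array([[-0.5], [0.5], [0.5], [-1.5]])

rate = transfer_success_rate(victim_predict, clean, adversarial, labels)
print(rate)  # 0.5: two of the four samples flip the victim's prediction
```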

Transferability

Adversarial machine learning

Transfer of learning

10.48550/arxiv.2112.01777

Citations (0)

Proving Common Mechanisms Shared by Twelve Methods of Boosting Adversarial Transferability

arXiv (Cornell University) (2022)

Quanshi Zhang Xin Wang Jie Ren Xu Cheng Shuyun Lin

Although many methods have been proposed to enhance the transferability of adversarial perturbations, these methods are designed in a heuristic manner, and the essential mechanism for improving adversarial transferability is still unclear. This paper summarizes the common mechanism shared by twelve previous transferability-boosting methods in a unified view, i.e., these methods all reduce game-theoretic interactions between regional adversarial perturbations. To this end, we focus on the attacking utility of all interactions between regional adversarial perturbations, and we first discover and prove the negative correlation between the adversarial transferability and the attacking utility of interactions. Based on this discovery, we theoretically prove and empirically verify that twelve previous transferability-boosting methods all reduce interactions between regional adversarial perturbations. More crucially, we consider the reduction of interactions as the essential reason for the enhancement of adversarial transferability. Furthermore, we design the interaction loss to directly penalize interactions between regional adversarial perturbations during attacking. Experimental results show that the interaction loss significantly improves the transferability of adversarial perturbations.

Transferability

Boosting

10.48550/arxiv.2207.11694

Citations (2)

Delving into Transferable Adversarial Examples and Black-box Attacks

arXiv (Cornell University) (2016)

Yanpei Liu Xinyun Chen Chang Liu Dawn Song

An intriguing property of deep neural networks is the existence of adversarial examples, which can transfer among different architectures. These transferable adversarial examples may severely hinder deep neural network-based applications. Previous works mostly study the transferability using small scale datasets. In this work, we are the first to conduct an extensive study of the transferability over large models and a large scale dataset, and we are also the first to study the transferability of targeted adversarial examples with their target labels. We study both non-targeted and targeted adversarial examples, and show that while transferable non-targeted adversarial examples are easy to find, targeted adversarial examples generated using existing approaches almost never transfer with their target labels. Therefore, we propose novel ensemble-based approaches to generating transferable adversarial examples. Using such approaches, we observe a large proportion of targeted adversarial examples that are able to transfer with their target labels for the first time. We also present some geometric studies to help understanding the transferable adversarial examples. Finally, we show that the adversarial examples generated using ensemble-based approaches can successfully attack Clarifai.com, which is a black-box image classification system.

Transferability

Deep Neural Networks

Black box

Citations (589)

A Unified Approach to Interpreting and Boosting Adversarial Transferability

arXiv (Cornell University) (2020)

Xin Wang Jie Ren Shuyun Lin Xiangming Zhu Yisen Wang

In this paper, we use the interaction inside adversarial perturbations to explain and boost the adversarial transferability. We discover and prove the negative correlation between the adversarial transferability and the interaction inside adversarial perturbations. The negative correlation is further verified through different DNNs with various inputs. Moreover, this negative correlation can be regarded as a unified perspective to understand current transferability-boosting methods. To this end, we prove that some classic methods of enhancing the transferability essentially decrease interactions inside adversarial perturbations. Based on this, we propose to directly penalize interactions during the attacking process, which significantly improves the adversarial transferability.

Transferability

Boosting

10.48550/arxiv.2010.04055

Citations (35)

Mist: Towards Improved Adversarial Examples for Diffusion Models

arXiv (Cornell University) (2023)

Chumeng Liang Xiaoyu Wu

Diffusion Models (DMs) have enabled great success in artificial-intelligence-generated content, especially in artwork creation, yet they raise new concerns about intellectual property and copyright. For example, infringers can profit by imitating unauthorized human-created paintings with DMs. Recent research suggests that various adversarial examples for diffusion models can be effective tools against these copyright infringements. However, current adversarial examples show weak transferability across different painting-imitating methods and weak robustness under straightforward adversarial defenses such as noise purification. Surprisingly, we find that the transferability of adversarial examples can be significantly enhanced by exploiting a fused and modified adversarial loss term under consistent parameters. In this work, we comprehensively evaluate the cross-method transferability of adversarial examples. The experimental observations show that our method generates more transferable adversarial examples with even stronger robustness against the simple adversarial defense.

Transferability

Robustness

10.48550/arxiv.2305.12683

Citations (6)

DeT: Defending Against Adversarial Examples via Decreasing Transferability

Lecture notes in computer science (2019)

Changjiang Li Haiqin Weng Shouling Ji Jianfeng Dong Qinming He

Transferability

Deep Neural Networks

Benchmark (surveying)

10.1007/978-3-030-37337-5_25

Citations (9)

Towards A Unified Understanding and Improving of Adversarial Transferability

Xin Wang Jie Ren Shuyun Lin Xiangming Zhu Yisen Wang

Transferability

Boosting

Citations (0)

Delving into Transferable Adversarial Examples and Black-box Attacks

arXiv (Cornell University) (2016)

Yanpei Liu Xinyun Chen Chang Liu Dawn Song

An intriguing property of deep neural networks is the existence of adversarial examples, which can transfer among different architectures. These transferable adversarial examples may severely hinder deep neural network-based applications. Previous works mostly study the transferability using small scale datasets. In this work, we are the first to conduct an extensive study of the transferability over large models and a large scale dataset, and we are also the first to study the transferability of targeted adversarial examples with their target labels. We study both non-targeted and targeted adversarial examples, and show that while transferable non-targeted adversarial examples are easy to find, targeted adversarial examples generated using existing approaches almost never transfer with their target labels. Therefore, we propose novel ensemble-based approaches to generating transferable adversarial examples. Using such approaches, we observe a large proportion of targeted adversarial examples that are able to transfer with their target labels for the first time. We also present some geometric studies to help understanding the transferable adversarial examples. Finally, we show that the adversarial examples generated using ensemble-based approaches can successfully attack Clarifai.com, which is a black-box image classification system.

Black box

10.48550/arxiv.1611.02770

Citations (798)
