    Exploiting temporal information to prevent the transferability of adversarial examples against deep fake detectors
Citations: 4 | References: 34 | Related Papers: 10
    Abstract:
The diffusion of AI tools capable of generating realistic DeepFake (DF) videos raises serious threats to face-based biometric recognition systems. For this reason, several detectors based on Deep Neural Networks (DNNs) have been developed to distinguish between real and DF videos. Despite their good performance, these methods are vulnerable to adversarial attacks. In this paper, we argue that the resilience of DNN-based DF detectors against black-box adversarial attacks can be increased by exploiting the temporal information contained in the video. Using such information significantly decreases the transferability of adversarial examples from a source to a target model, making it difficult to launch an attack without access to the target network. To back this claim, we trained two convolutional neural networks (CNNs) to detect DF videos and measured their robustness against black-box, transfer-based attacks. We also trained two detectors by adding a long short-term memory (LSTM) layer to the CNNs to extract temporal information, and then measured the transferability of adversarial examples towards the LSTM-based networks. The results suggest that the methods based on temporal information are less prone to black-box attacks.
    Keywords:
    Transferability
    Robustness
    Deep Neural Networks
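    The abstract above contrasts frame-level CNN detectors with detectors that add an LSTM on top of per-frame CNN features. As a rough illustration only (not the authors' exact architectures), a minimal PyTorch sketch of the two detector types could look like this; the ResNet-18 backbone, hidden size, and clip-level pooling are illustrative assumptions.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18


    class FrameCNNDetector(nn.Module):
        """Per-frame real/fake classifier; the clip score is the mean of frame scores."""
        def __init__(self):
            super().__init__()
            backbone = resnet18()                          # illustrative backbone choice
            backbone.fc = nn.Linear(backbone.fc.in_features, 1)
            self.backbone = backbone

        def forward(self, clip):                           # clip: (B, T, 3, H, W)
            b, t = clip.shape[:2]
            logits = self.backbone(clip.flatten(0, 1))     # (B*T, 1)
            return logits.view(b, t).mean(dim=1)           # average over frames


    class CNNLSTMDetector(nn.Module):
        """CNN features per frame, LSTM over the frame sequence, one clip-level logit."""
        def __init__(self, hidden=256):
            super().__init__()
            backbone = resnet18()
            backbone.fc = nn.Identity()                    # 512-d frame features
            self.backbone = backbone
            self.lstm = nn.LSTM(512, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, clip):                           # clip: (B, T, 3, H, W)
            b, t = clip.shape[:2]
            feats = self.backbone(clip.flatten(0, 1)).view(b, t, -1)
            _, (h_n, _) = self.lstm(feats)                 # h_n: (1, B, hidden)
            return self.head(h_n[-1]).squeeze(-1)          # clip-level logit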
Related Papers:
Adversarial attacks have demonstrated the vulnerability of neural networks. By adding small perturbations to a benign example, adversarial attacks generate adversarial examples that cause deep learning models to misclassify. More importantly, an adversarial example generated from a specific model can also deceive other models without modification; we call this phenomenon "transferability". Here, we analyze the relationship between transferability and input transformation with additive noise by mathematically proving that the modified optimization can produce more transferable adversarial examples.
    Transferability
    Deep Neural Networks
    Vulnerability
    Citations (1)
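    The abstract above studies input transformation with additive noise as a way to obtain more transferable adversarial examples. Below is a hedged sketch of one common realization of this idea: averaging gradients over noisy copies of the input inside an iterative attack. The attack form (iterative FGSM), the noise scale, and the step sizes are assumptions, not the paper's exact optimization.

    import torch
    import torch.nn.functional as F


    def noisy_input_attack(model, x, y, eps=8 / 255, steps=10, n_noise=5, sigma=0.05):
        """Iterative attack whose gradient is averaged over noisy copies of the input."""
        alpha = eps / steps
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            grad = torch.zeros_like(x_adv)
            for _ in range(n_noise):                       # average over noisy copies
                noisy = x_adv + sigma * torch.randn_like(x_adv)
                loss = F.cross_entropy(model(noisy), y)
                grad = grad + torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * (grad / n_noise).sign()
                x_adv = x + (x_adv - x).clamp(-eps, eps)   # project onto the L_inf ball
                x_adv = x_adv.clamp(0, 1).detach()
        return x_adv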
Transferability of adversarial samples has become a serious concern due to its impact on the reliability of machine learning systems deployed in many critical applications. Knowing the factors that influence the transferability of adversarial samples can help experts make informed decisions about how to build robust and reliable machine learning systems. The goal of this study is to provide insights into the mechanisms behind the transferability of adversarial samples through an attack-centric approach. This attack-centric perspective interprets how adversarial samples transfer by assessing the impact of the machine learning attacks that generated them on a given input dataset. To achieve this goal, we generated adversarial samples using attacker models and transferred these samples to victim models. We analyzed the behavior of adversarial samples on the victim models and outlined four factors that can influence their transferability. Although these factors are not necessarily exhaustive, they provide useful insights to researchers and practitioners of machine learning systems.
    Transferability
    Adversarial machine learning
    Transfer of learning
    Citations (0)
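    A minimal sketch of the attack-centric transfer protocol described above: adversarial samples are generated on an attacker (surrogate) model and then evaluated on a victim model. The FGSM attack and the success criterion (the victim's prediction moves away from the true label) are simplifying assumptions for illustration.

    import torch
    import torch.nn.functional as F


    def fgsm(model, x, y, eps=8 / 255):
        """Single-step FGSM on the attacker (surrogate) model."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad = torch.autograd.grad(loss, x)[0]
        return (x + eps * grad.sign()).clamp(0, 1).detach()


    def transfer_rate(attacker, victim, loader, eps=8 / 255):
        """Fraction of adversarial samples, crafted on `attacker`, that fool `victim`."""
        fooled, total = 0, 0
        for x, y in loader:
            x_adv = fgsm(attacker, x, y, eps)
            with torch.no_grad():
                pred = victim(x_adv).argmax(dim=1)
            fooled += (pred != y).sum().item()
            total += y.numel()
        return fooled / total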
    Although many methods have been proposed to enhance the transferability of adversarial perturbations, these methods are designed in a heuristic manner, and the essential mechanism for improving adversarial transferability is still unclear. This paper summarizes the common mechanism shared by twelve previous transferability-boosting methods in a unified view, i.e., these methods all reduce game-theoretic interactions between regional adversarial perturbations. To this end, we focus on the attacking utility of all interactions between regional adversarial perturbations, and we first discover and prove the negative correlation between the adversarial transferability and the attacking utility of interactions. Based on this discovery, we theoretically prove and empirically verify that twelve previous transferability-boosting methods all reduce interactions between regional adversarial perturbations. More crucially, we consider the reduction of interactions as the essential reason for the enhancement of adversarial transferability. Furthermore, we design the interaction loss to directly penalize interactions between regional adversarial perturbations during attacking. Experimental results show that the interaction loss significantly improves the transferability of adversarial perturbations.
    Transferability
    Boosting
    Citations (2)
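    The interaction loss described above penalizes interactions between regional adversarial perturbations during the attack. The sketch below uses a crude second-order-difference proxy over randomly sampled pairs of grid regions rather than the exact game-theoretic (Shapley) interaction; the grid partition, the number of sampled pairs, and the use of cross-entropy as the attack objective are assumptions.

    import torch
    import torch.nn.functional as F


    def region_masks(shape, grid=4):
        """Split an image of shape (C, H, W) into grid*grid binary region masks."""
        c, h, w = shape
        masks = []
        for r in range(grid):
            for s in range(grid):
                m = torch.zeros(1, h, w)
                m[:, r * h // grid:(r + 1) * h // grid,
                     s * w // grid:(s + 1) * w // grid] = 1.0
                masks.append(m)
        return torch.stack(masks)                   # (grid*grid, 1, H, W)


    def interaction_penalty(model, x, y, delta, masks, n_pairs=8):
        """Average |pairwise interaction| of the attack loss over sampled region pairs."""
        masks = masks.to(delta.device)

        def attack_loss(p):
            return F.cross_entropy(model((x + p).clamp(0, 1)), y)

        base = attack_loss(torch.zeros_like(delta))
        penalty = 0.0
        idx = torch.randint(len(masks), (n_pairs, 2))
        for i, j in idx.tolist():
            if i == j:
                continue
            d_i, d_j = delta * masks[i], delta * masks[j]
            # second-order difference: interaction of regions i and j on the loss
            inter = attack_loss(d_i + d_j) - attack_loss(d_i) - attack_loss(d_j) + base
            penalty = penalty + inter.abs()
        return penalty / n_pairs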
In this paper, we use the interaction inside adversarial perturbations to explain and boost the adversarial transferability. We discover and prove the negative correlation between the adversarial transferability and the interaction inside adversarial perturbations. The negative correlation is further verified through different DNNs with various inputs. Moreover, this negative correlation can be regarded as a unified perspective to understand current transferability-boosting methods. To this end, we prove that some classic methods of enhancing the transferability essentially decrease interactions inside adversarial perturbations. Based on this, we propose to directly penalize interactions during the attacking process, which significantly improves the adversarial transferability.
    Transferability
    Boosting
    Citations (35)
For transfer-based attacks, adversarial examples are crafted on a surrogate model and can then be used to mislead the target model effectively. The conventional procedure for maximizing adversarial transferability involves: (1) fine-tuning hyperparameters to generate multiple batches of adversarial examples on the substitute model; (2) keeping the batch of adversarial examples that performs best on the substitute and target models and discarding the others. In this work, we revisit the second step of this process in the context of fine-tuning hyperparameters to craft adversarial examples, where multiple batches of fine-tuned adversarial examples often lie on a single high-error hilltop. We demonstrate that averaging multiple batches of adversarial examples generated under different hyperparameter configurations, which we refer to as "adversarial example soups", can often enhance adversarial transferability. Compared with traditional methods, the proposed method incurs no additional generation time or computational cost. Moreover, our method is orthogonal to existing transfer-based methods and can be combined with them seamlessly to generate more transferable adversarial examples. Extensive experiments on the ImageNet dataset show that our methods achieve a higher attack success rate than state-of-the-art attacks.
    Transferability
    Citations (1)
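    A hedged sketch of the "adversarial example soup" idea from the abstract above: craft batches of adversarial examples on the surrogate under different hyperparameter configurations and average them, instead of keeping only the single best batch. The underlying attack (iterative FGSM) and the swept (steps, step-size) pairs are illustrative assumptions.

    import torch
    import torch.nn.functional as F


    def ifgsm(model, x, y, eps, steps, alpha):
        """Plain iterative FGSM on the surrogate model."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()
                x_adv = x + (x_adv - x).clamp(-eps, eps)
                x_adv = x_adv.clamp(0, 1).detach()
        return x_adv


    def adversarial_soup(model, x, y, eps=8 / 255, configs=((10, 1 / 255), (20, 0.5 / 255))):
        """Average adversarial examples crafted under several (steps, alpha) settings."""
        batches = [ifgsm(model, x, y, eps, steps, alpha) for steps, alpha in configs]
        soup = torch.stack(batches).mean(dim=0)
        return x + (soup - x).clamp(-eps, eps)      # keep the averaged perturbation in the budget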
Diffusion Models (DMs) have enabled great success in artificial-intelligence-generated content, especially in artwork creation, yet they raise new concerns about intellectual property and copyright. For example, infringers can profit by imitating non-authorized human-created paintings with DMs. Recent research suggests that various adversarial examples for diffusion models can be effective tools against such copyright infringement. However, current adversarial examples show weak transferability across different painting-imitation methods and limited robustness under straightforward adversarial defenses such as noise purification. We find, surprisingly, that the transferability of adversarial examples can be significantly enhanced by exploiting a fused and modified adversarial loss term under consistent parameters. In this work, we comprehensively evaluate the cross-method transferability of adversarial examples. The experimental results show that our method generates more transferable adversarial examples with stronger robustness against simple adversarial defenses.
    Transferability
    Robustness
    Citations (6)
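    The abstract above attributes the transferability gain to a fused and modified adversarial loss, but the exact loss terms are not specified here. The sketch below therefore shows only the generic pattern of fusing two loss terms into a single PGD objective; loss_a, loss_b (e.g., an encoder-space loss and a denoiser-space loss), and the fusion weight are hypothetical placeholders, not the paper's method.

    import torch


    def fused_loss_pgd(x, loss_a, loss_b, weight=0.5, eps=8 / 255, steps=40, alpha=1 / 255):
        """PGD that maximizes a weighted fusion of two adversarial loss terms."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = weight * loss_a(x_adv) + (1.0 - weight) * loss_b(x_adv)  # fused objective
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()
                x_adv = x + (x_adv - x).clamp(-eps, eps)    # stay within the L_inf budget
                x_adv = x_adv.clamp(0, 1).detach()
        return x_adv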