TADW: Traceable and Anti-detection Dynamic Watermarking of Deep Neural Networks

2022 
Deep neural networks (DNNs) with unmatched performance have been widely applied in diverse fields (e.g., image recognition, natural language processing, and speech recognition). Training a high-performance DNN model requires large amounts of training data as well as intellectual and computing resources, which imposes high costs on model owners. Illegal model abuse (model theft, derivation, resale, redistribution, etc.) therefore seriously infringes model owners' legitimate rights and interests. Watermarking is a primary technique for DNN ownership protection. However, almost all existing watermarking works apply solely to image data; they cannot trace which specific distributed copy is infringing, and their ownership verification samples (the trigger set) are easily detected by an adversary. This paper introduces TADW, a dynamic watermarking scheme with tracing and anti-detection abilities for the deep learning (DL) textual domain. Specifically, we propose a new approach to constructing trigger-set samples that resist detection and design a mapping algorithm that assigns a unique serial number (SN) to every watermarked model. Furthermore, we implement TADW and evaluate it in detail on 2 benchmark datasets and 3 popular DNNs. Experimental results show that TADW can successfully verify ownership of the target model at less than a 0.5% accuracy cost and can identify the unique infringing model. In addition, TADW is highly robust against different model modifications and can serve a large number of users.
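The abstract does not spell out how the SN mapping works, so the following is a minimal, purely illustrative Python sketch of one way a per-copy serial number could be tied to a trigger set: each watermarked copy memorizes a model-specific subset of a shared trigger pool, and the subset's indicator bit-vector encodes that copy's SN. All names and the encoding scheme here are assumptions for illustration, not the paper's actual algorithm.

```python
# Illustrative sketch (NOT the TADW algorithm): encode a serial number (SN)
# as the subset of trigger samples a given watermarked copy memorizes, and
# recover it later by querying a suspect model on the full trigger pool.

from typing import Callable, List, Sequence, Tuple


def sn_to_trigger_subset(sn: int, pool: Sequence[str], target_label: int) -> List[Tuple[str, int]]:
    """Pick the trigger samples whose pool index matches a 1-bit of the SN."""
    bits = [(sn >> i) & 1 for i in range(len(pool))]
    return [(sample, target_label) for sample, bit in zip(pool, bits) if bit]


def recover_sn(model_predict: Callable[[str], int], pool: Sequence[str], target_label: int) -> int:
    """Query a suspect model on the whole pool; the memorized triggers spell out the SN."""
    sn = 0
    for i, sample in enumerate(pool):
        if model_predict(sample) == target_label:
            sn |= 1 << i
    return sn


if __name__ == "__main__":
    pool = [f"trigger sentence #{i}" for i in range(8)]  # hypothetical text triggers
    sn = 0b10110010
    subset = sn_to_trigger_subset(sn, pool, target_label=1)

    # Stand-in for a fine-tuned text classifier that has memorized `subset`.
    memorized = {sample for sample, _ in subset}
    predict = lambda x: 1 if x in memorized else 0

    assert recover_sn(predict, pool, target_label=1) == sn
```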