Multi-Targeted Backdoor: Identifying Backdoor Attack for Multiple Deep Neural Networks

Hyun KWON; Hyunsoo YOON; Ki-Woong PARK

doi:10.1587/transinf.2019EDL8170

Abstract

We propose a multi-targeted backdoor that causes different models to misclassify the same input as different classes. The method trains multiple models on data that include a specific trigger, so that each model misclassifies samples carrying the trigger as a different class. For example, an attacker can use a single multi-targeted backdoor sample to make model A recognize it as a stop sign, model B as a left-turn sign, model C as a right-turn sign, and model D as a U-turn sign. We used MNIST and Fashion-MNIST as experimental datasets and TensorFlow as the machine learning library. Experimental results show that the proposed method causes samples with the trigger to be misclassified as different classes by different models with a 100% attack success rate on both MNIST and Fashion-MNIST, while maintaining accuracies of 97.18% and 91.1%, respectively, on data without the trigger.
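The data preparation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trigger (a 4x4 white square in the bottom-right corner), the 10% poisoning fraction, the target classes, and the helper names `stamp_trigger` and `make_poisoned_set` are all assumptions made for illustration, and random arrays stand in for MNIST images. The sketch only shows how one shared trigger might be paired with a different target label for each model before training.

```python
import numpy as np

def stamp_trigger(images, size=4, value=1.0):
    """Return a copy of `images` with a square patch in the bottom-right corner.

    The patch shape, position, and intensity are illustrative assumptions.
    """
    out = images.copy()
    out[:, -size:, -size:] = value
    return out

def make_poisoned_set(x, y, target_class, poison_frac=0.1, seed=0):
    """Stamp the trigger on a fraction of samples and relabel them as `target_class`.

    Returns the poisoned images, the poisoned labels, and the poisoned indices.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(len(x) * poison_frac)
    idx = rng.choice(len(x), size=n_poison, replace=False)
    x_out, y_out = x.copy(), y.copy()
    x_out[idx] = stamp_trigger(x[idx])
    y_out[idx] = target_class
    return x_out, y_out, idx

# Stand-in data with MNIST-like shape (28x28 grayscale, 10 classes).
rng = np.random.default_rng(42)
x_train = rng.random((1000, 28, 28)).astype(np.float32) * 0.5
y_train = rng.integers(0, 10, size=1000)

# Hypothetical target class per model: same trigger, different label.
model_targets = {"A": 0, "B": 1, "C": 2, "D": 3}
training_sets = {
    name: make_poisoned_set(x_train, y_train, target)
    for name, target in model_targets.items()
}
```

Each entry of `training_sets` would then be used to train its own model, so that a single triggered sample is assigned class 0 by model A, class 1 by model B, and so on, while clean samples keep their original labels.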
