A Consistency Regularization for Certified Robust Neural Networks

M Xu, T Zhang, Z Li, D Zhang - CAAI International Conference on Artificial Intelligence, 2021 - Springer
Abstract
A range of provable defense methods have been proposed to train neural networks that are certifiably robust to adversarial examples. Among them, COLT [1] combines adversarial training with a provable defense method and achieves state-of-the-art accuracy and certified robustness. However, COLT treats all examples equally during training, which ignores the inconsistent certified-robustness constraint between correctly classified (natural) and misclassified examples. In this paper, we explore this inconsistency and add a regularization term to exploit misclassified examples efficiently. Specifically, we identify that the certified robustness of networks can be significantly improved by refining the inconsistent constraint on misclassified examples. In addition, we design a new defense regularization, called Misclassification Aware Adversarial Regularization (MAAR), which constrains the output probability distributions of all examples within the certified region of each misclassified example. Experimental results show that MAAR achieves the best certified robustness and comparable accuracy on the CIFAR-10 and MNIST datasets in comparison with several state-of-the-art methods.
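To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of a misclassification-aware consistency term in the spirit described above; it is not the authors' implementation. It substitutes PGD-style samples inside an L-infinity ball for the certified region (which COLT-style training would bound with convex relaxations), and all names and hyperparameters (maar_regularizer, eps, steps, step_size) are illustrative assumptions.

```python
# Hypothetical sketch of a MAAR-style consistency term (not the paper's code).
# Perturbed samples in an L-infinity ball stand in for the certified region.
import torch
import torch.nn.functional as F

def maar_regularizer(model, x, y, eps=8/255, steps=5, step_size=2/255):
    """Consistency penalty applied only to misclassified examples."""
    with torch.no_grad():
        clean_logits = model(x)
        misclassified = clean_logits.argmax(dim=1) != y  # mask of misclassified inputs
    if misclassified.sum() == 0:
        return torch.zeros((), device=x.device)

    x_mis = x[misclassified]
    p_clean = F.softmax(clean_logits[misclassified], dim=1)  # detached target distribution

    # Search the ball for the perturbation that most disturbs the output
    # distribution (a crude stand-in for "all examples in the certified region").
    # Clamping inputs to the valid image range is omitted for brevity.
    delta = torch.empty_like(x_mis).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        q = F.log_softmax(model(x_mis + delta), dim=1)
        loss = F.kl_div(q, p_clean, reduction="batchmean")
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + step_size * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)

    # The returned term is differentiable w.r.t. model parameters, so it can
    # be added to the training loss, e.g.:
    #   total = F.cross_entropy(model(x), y) + lam * maar_regularizer(model, x, y)
    q_adv = F.log_softmax(model(x_mis + delta.detach()), dim=1)
    return F.kl_div(q_adv, p_clean, reduction="batchmean")
```

The KL penalty here is one plausible way to "constrain the output probability distributions" around a misclassified example; the paper's actual formulation and the interaction with COLT's certified region may differ.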