Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training

Park, Leo Hyun; Kim, Jaeuk; Oh, Myung Gyo; Park, Jaewoo; Kwon, Taekyoung

arXiv:2402.12187 (cs)

[Submitted on 19 Feb 2024]

Abstract: Deep learning models continue to advance in accuracy, yet they remain vulnerable to adversarial attacks, which often cause adversarial examples to be misclassified. Adversarial training mitigates this problem by increasing robustness against these attacks, but it typically reduces a model's standard accuracy on clean, non-adversarial samples. For security, deep learning models need to balance robustness and accuracy, yet achieving this balance remains challenging, and the underlying reasons are yet to be clarified. This paper proposes a novel adversarial training method, Adversarial Feature Alignment (AFA), to address these problems. Our research unveils an intriguing insight: misalignment within the feature space often leads to misclassification, regardless of whether the samples are benign or adversarial. AFA mitigates this risk by employing a novel optimization algorithm based on contrastive learning to alleviate potential feature misalignment. Through our evaluations, we demonstrate the superior performance of AFA. The baseline AFA delivers higher robust accuracy than previous adversarial contrastive learning methods while limiting the drop in clean accuracy to 1.86% and 8.91% on CIFAR10 and CIFAR100, respectively, in comparison to cross-entropy. We also show that joint optimization of AFA and TRADES, accompanied by data augmentation using a recent diffusion model, achieves state-of-the-art accuracy and robustness.
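
The abstract does not spell out the AFA objective, so the following is only a minimal sketch of the general idea of contrastive feature alignment, assuming a generic supervised contrastive loss rather than the paper's algorithm. The function names, the temperature value, and the toy 2-D features are all illustrative assumptions. In an adversarial training setting, features of adversarial examples would presumably be added to the batch under the labels of their clean counterparts.

```python
import math


def normalize(vector):
    """Scale a feature vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (an assumption, not the AFA loss).

    Same-label features are pulled together and different-label features are
    pushed apart. Anchors without any same-label partner are skipped.
    """
    z = [normalize(f) for f in features]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all other samples.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy batch: samples 0 and 1 lie near one axis, samples 2 and 3 near the other.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

# Aligned case: labels agree with the geometry of the features.
aligned_loss = supervised_contrastive_loss(toy_features, [0, 0, 1, 1])

# Misaligned case: same-label samples sit far apart in feature space.
misaligned_loss = supervised_contrastive_loss(toy_features, [0, 1, 0, 1])
```

In this toy batch the misaligned labelling yields the larger loss, which is the signal a contrastive objective would use to pull same-class features back together.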

Comments:	19 pages, 5 figures, 16 tables, 2 algorithms
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
ACM classes:	I.4.0; K.6.5; D.2.7
Cite as:	arXiv:2402.12187 [cs.CV]
	(or arXiv:2402.12187v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.12187

Submission history

From: Taekyoung Kwon
[v1] Mon, 19 Feb 2024 14:51:20 UTC (6,061 KB)
