BAARD: Blocking Adversarial Examples by Testing for Applicability, Reliability and Decidability

Chang, Xinglong; Dost, Katharina; Zhao, Kaiqi; Demontis, Ambra; Roli, Fabio; Dobbie, Gill; Wicker, Jörg

doi:10.1007/978-3-031-33374-3_1

arXiv:2105.00495 (cs)

[Submitted on 2 May 2021 (v1), last revised 13 Sep 2023 (this version, v2)]

Abstract: Adversarial defenses protect machine learning models from adversarial attacks, but are often tailored to one type of model or attack. The lack of information on unknown potential attacks makes detecting adversarial examples challenging. Additionally, attackers do not need to follow the rules made by the defender. To address this problem, we take inspiration from the concept of Applicability Domain in cheminformatics. Cheminformatics models struggle to make accurate predictions because only a limited number of compounds are known and available for training. Applicability Domain defines a domain based on the known compounds and rejects any unknown compound that falls outside the domain. Similarly, adversarial examples start as harmless inputs, but can be manipulated to evade reliable classification by moving outside the domain of the classifier. We are the first to identify the similarity between Applicability Domain and adversarial detection. Instead of focusing on unknown attacks, we focus on what is known: the training data. We propose a simple yet robust triple-stage data-driven framework that checks each input globally and locally, and confirms that it is coherent with the model's output. This framework can be applied to any classification model and is not limited to specific attacks. We demonstrate that these three stages work as one unit, effectively detecting various attacks, even in a white-box scenario.
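
As an illustration of the Applicability Domain idea described in the abstract (define a domain from known training data and reject inputs that fall outside it), the following is a minimal sketch only. It is not the authors' implementation and does not reproduce BAARD's three stages; it assumes a simple per-class bounding-box domain, and the names `fit_domain` and `in_domain` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# (per-feature minimums, per-feature maximums) for one class
Bounds = Tuple[List[float], List[float]]


def fit_domain(X: Sequence[Sequence[float]], y: Sequence[int]) -> Dict[int, Bounds]:
    """Record, for each class, the per-feature min and max seen in training.

    Assumption: the domain is an axis-aligned bounding box per class.
    """
    domain: Dict[int, Bounds] = {}
    for features, label in zip(X, y):
        if label not in domain:
            # First sample of this class initializes both bounds.
            domain[label] = (list(features), list(features))
        else:
            lows, highs = domain[label]
            for i, value in enumerate(features):
                lows[i] = min(lows[i], value)
                highs[i] = max(highs[i], value)
    return domain


def in_domain(domain: Dict[int, Bounds], x: Sequence[float], predicted_label: int) -> bool:
    """Return True if x lies inside the box of the class the model predicted."""
    if predicted_label not in domain:
        return False  # no training data for this class, so reject
    lows, highs = domain[predicted_label]
    return all(lo <= v <= hi for lo, v, hi in zip(lows, x, highs))


# Toy training set (made-up values, for illustration only).
X_train = [[0.0, 0.1], [0.2, 0.3], [0.9, 0.8], [1.0, 1.0]]
y_train = [0, 0, 1, 1]
domain = fit_domain(X_train, y_train)

accepted = in_domain(domain, [0.1, 0.2], 0)        # inside the class 0 box
wrong_class = in_domain(domain, [0.1, 0.2], 1)     # outside the class 1 box
out_of_range = in_domain(domain, [5.0, 0.2], 0)    # first feature exceeds bounds
```

An input is accepted only when it falls within the range of training data for the label the classifier assigned, so a sample pushed outside that range, or assigned a label whose training data looks different, would be rejected under this assumed domain definition.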

Subjects:	Machine Learning (cs.LG); Biomolecules (q-bio.BM)
Cite as:	arXiv:2105.00495 [cs.LG]
	(or arXiv:2105.00495v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2105.00495
Journal reference:	PAKDD,2023,3-14
Related DOI:	https://doi.org/10.1007/978-3-031-33374-3_1

Submission history

From: Luke Chang
[v1] Sun, 2 May 2021 15:24:33 UTC (1,319 KB)
[v2] Wed, 13 Sep 2023 19:34:50 UTC (1,580 KB)
