Explanations of artificial intelligence models can be used to launch targeted adversarial attacks on text classification algorithms: understanding the reasoning behind a model's decisions makes it easier to prepare such samples. Most current text-based adversarial attacks rely on brute force. Instead, we use the SHAP approach to identify the importance of tokens in a sample and modify the crucial ones to prepare targeted attacks. We base our results on experiments with 5 datasets. Our approach outperforms TextBugger on 4 out of 5 datasets and TextFooler on 3 out of 5 datasets, while minimizing the perturbation introduced to the texts. In particular, we outperform the efficacy of TextFooler by over 3100% and TextBugger by over 420% on the WikiPL dataset, while keeping a high cosine similarity between the original text sample and the adversarial example. The evaluation was additionally supported by a survey assessing the quality of the adversarial examples and ensuring that the text perturbations did not change the intended class according to subjective human classification.
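The core idea, ranking tokens by SHAP importance and perturbing only the most influential ones, can be illustrated with a minimal Python sketch. This is not the authors' exact pipeline: the victim model, the class index, and the final perturbation step (e.g. a synonym swap or character edit) are illustrative assumptions.

```python
import shap
from transformers import pipeline

# Assumed victim model: any HuggingFace text-classification pipeline works here.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    top_k=None,  # return scores for all classes so SHAP can attribute each output
)

# SHAP's generic Explainer handles tokenization of raw strings for pipelines.
explainer = shap.Explainer(classifier)

sample = "The film was a genuine delight from start to finish."
shap_values = explainer([sample])

# shap_values.data[0] holds the tokens, shap_values.values[0] their per-class attributions.
tokens = shap_values.data[0]
scores = shap_values.values[0][:, 0]  # contribution toward class index 0 (assumption)

# Rank tokens by the magnitude of their contribution to the targeted class,
# then perturb only the top few to keep the overall change to the text minimal.
ranked = sorted(zip(tokens, scores), key=lambda t: abs(t[1]), reverse=True)
for token, score in ranked[:3]:
    print(f"candidate for perturbation: {token!r} (importance {score:+.3f})")
```

Restricting edits to the highest-attribution tokens is what keeps the perturbation small and the cosine similarity between the original and adversarial texts high, in contrast to brute-force token substitution.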