Language-guided Detection and Mitigation of Unknown Dataset Bias

Zhao, Zaiying; Kumano, Soichiro; Yamasaki, Toshihiko

Computer Science > Computer Vision and Pattern Recognition

arXiv:2406.02889 (cs)

[Submitted on 5 Jun 2024]


Abstract: Dataset bias is a significant problem in training fair classifiers. When attributes unrelated to classification exhibit strong biases towards certain classes, classifiers trained on such a dataset may overfit to these bias attributes, substantially reducing the accuracy for minority groups. Mitigation techniques can be categorized according to the availability of bias information (i.e., prior knowledge). Although scenarios with unknown biases are better suited for real-world settings, previous work in this field often suffers from a lack of interpretability regarding biases and from lower performance. In this study, we propose a framework that identifies potential biases as keywords, without prior knowledge, based on their partial occurrence in the captions. We further propose two debiasing methods: (a) handing over to an existing debiasing approach that requires prior knowledge, by assigning pseudo-labels, and (b) employing data augmentation via text-to-image generative models, using the acquired bias keywords as prompts. Despite its simplicity, experimental results show that our framework not only outperforms existing methods that lack prior knowledge, but is also comparable with a method that assumes prior knowledge.
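The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how keywords are scored, so the document-frequency thresholds `low` and `high`, the stopword list, the prompt template, and all function names below are assumptions made purely for illustration. The sketch treats a word as a candidate bias keyword when it appears in some, but not all, captions of a class, then derives pseudo-labels and text-to-image prompts from it.

```python
from collections import Counter

# Hypothetical stopword list; a real pipeline would likely use a fuller one.
STOPWORDS = {"a", "an", "the", "on", "in", "over", "of"}

def candidate_bias_keywords(captions, class_name, low=0.2, high=0.8):
    """Return words that occur in only part of one class's captions.

    `low` and `high` are assumed thresholds on the fraction of captions
    that contain the word; they are not taken from the paper.
    """
    n = len(captions)
    doc_freq = Counter()
    for caption in captions:
        # Count each word at most once per caption (document frequency).
        doc_freq.update(set(caption.lower().split()))
    return sorted(
        word
        for word, count in doc_freq.items()
        if word != class_name
        and word not in STOPWORDS
        and low <= count / n <= high
    )

def pseudo_labels(captions, keyword):
    """Label each sample 1 if its caption mentions the keyword, else 0."""
    return [int(keyword in caption.lower().split()) for caption in captions]

def augmentation_prompts(class_name, keywords):
    """Build prompts for a text-to-image model (template is an assumption)."""
    return [f"a photo of a {class_name}, {kw}" for kw in keywords]

# Toy captions for a single class; invented for this example.
captions = [
    "a waterbird standing on water",
    "a waterbird flying over water",
    "a waterbird on water",
    "a waterbird in a forest",
]
keywords = candidate_bias_keywords(captions, "waterbird", low=0.5, high=0.9)
labels = pseudo_labels(captions, "water")
prompts = augmentation_prompts("waterbird", keywords)
```

In this toy setting, the pseudo-labels could be passed to a debiasing method that expects known bias annotations (option a), while the prompts could drive a generative model to synthesize additional training images (option b).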

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.02889 [cs.CV]
	(or arXiv:2406.02889v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.02889

Submission history

From: Zaiying Zhao
[v1] Wed, 5 Jun 2024 03:11:33 UTC (4,350 KB)
