Proxy Tasks and Subjective Measures Can Be Misleading in Evaluating Explainable AI Systems

Buçinca, Zana; Lin, Phoebe; Gajos, Krzysztof Z.; Glassman, Elena L.

doi:10.1145/3377325.3377498

arXiv:2001.08298 (cs)

[Submitted on 22 Jan 2020]

Abstract: Explainable artificially intelligent (XAI) systems form part of sociotechnical systems, e.g., human+AI teams tasked with making decisions. Yet, current XAI systems are rarely evaluated by measuring the performance of human+AI teams on actual decision-making tasks. We conducted two online experiments and one in-person think-aloud study to evaluate two currently common techniques for evaluating XAI systems: (1) using proxy, artificial tasks such as how well humans predict the AI's decision from the given explanations, and (2) using subjective measures of trust and preference as predictors of actual performance. The results of our experiments demonstrate that evaluations with proxy tasks did not predict the results of the evaluations with the actual decision-making tasks. Further, the subjective measures on evaluations with actual decision-making tasks did not predict the objective performance on those same tasks. Our results suggest that by employing misleading evaluation methods, our field may be inadvertently slowing its progress toward developing human+AI teams that can reliably perform better than humans or AIs alone.

Subjects:	Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2001.08298 [cs.AI]
	(or arXiv:2001.08298v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2001.08298
Related DOI:	https://doi.org/10.1145/3377325.3377498

Submission history

From: Zana Buçinca
[v1] Wed, 22 Jan 2020 22:14:28 UTC (2,040 KB)
