WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts

Ahn, Yong Hyun; Kim, Hyeon Bae; Kim, Seong Tae

Computer Science > Computer Vision and Pattern Recognition

arXiv:2402.18956 (cs)

[Submitted on 29 Feb 2024 (v1), last revised 11 Apr 2024 (this version, v2)]

Title:WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts

Authors:Yong Hyun Ahn, Hyeon Bae Kim, Seong Tae Kim

View PDF HTML (experimental)

Abstract:Recent advancements in neural networks have showcased their remarkable capabilities across various domains. Despite these successes, the "black box" problem still remains. Addressing this, we propose a novel framework, WWW, that offers the 'what', 'where', and 'why' of the neural network decisions in human-understandable terms. Specifically, WWW utilizes adaptive selection for concept discovery, employing adaptive cosine similarity and thresholding techniques to effectively explain 'what'. To address the 'where' and 'why', we proposed a novel combination of neuron activation maps (NAMs) with Shapley values, generating localized concept maps and heatmaps for individual inputs. Furthermore, WWW introduces a method for predicting uncertainty, leveraging heatmap similarities to estimate 'how' reliable the prediction is. Experimental evaluations of WWW demonstrate superior performance in both quantitative and qualitative metrics, outperforming existing methods in interpretability. WWW provides a unified solution for explaining 'what', 'where', and 'why', introducing a method for localized explanations from global interpretations and offering a plug-and-play solution adaptable to various architectures.

Comments:	CVPR 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2402.18956 [cs.CV]
	(or arXiv:2402.18956v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.18956

Submission history

From: Seong Tae Kim [view email]
[v1] Thu, 29 Feb 2024 08:51:51 UTC (12,970 KB)
[v2] Thu, 11 Apr 2024 10:06:10 UTC (13,250 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators