PEACH: Pretrained-Embedding Explanation across Contextual and Hierarchical Structure

Feiqi Cao, Soyeon Caren Han, Hyunsuk Chung

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 6207-6215. https://doi.org/10.24963/ijcai.2024/686

In this work, we propose a novel tree-based explanation technique, PEACH (Pretrained-embedding Explanation Across Contextual and Hierarchical Structure), that explains how text-based documents are classified, using any pretrained contextual embeddings, in a tree-based, human-interpretable manner. PEACH can adopt the contextual embeddings of any pretrained language model (PLM) as the training input for the decision tree. Using the proposed PEACH, we perform a comprehensive analysis of several contextual embeddings on nine different NLP text classification benchmarks. This analysis demonstrates the flexibility of the model by applying several PLM contextual embeddings and varying its attribute selection, scaling, and clustering methods. Furthermore, we show the utility of the explanations by visualising the feature selection and important trends of text classification via human-interpretable word-cloud-based trees, which clearly identify model mistakes and assist in dataset debugging. Beyond interpretability, PEACH achieves classification performance that matches or exceeds that of the pretrained models. Code and Appendix are available at https://github.com/adlnlp/peach.
Keywords:
Natural Language Processing: NLP: Interpretability and analysis of models for NLP
Knowledge Representation and Reasoning: KRR: Other
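
The core idea in the abstract, a decision tree trained on document embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes each document has already been mapped to a fixed-length vector; the toy vectors, the labels, and the helper names (`gini`, `best_split`, `build_tree`, `predict`) are all hypothetical, and the attribute selection, scaling, clustering, and word-cloud steps mentioned in the abstract are omitted.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    n = len(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, depth=0, max_depth=3):
    """Recursively grow a binary tree over embedding dimensions."""
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1 or depth == max_depth:
        return {"label": majority}
    split = best_split(X, y)
    if split is None:  # no dimension separates the remaining documents
        return {"label": majority}
    _, f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth),
    }

def predict(tree, x):
    """Follow threshold tests from the root down to a leaf label."""
    while "label" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["label"]

# Toy stand-ins for pretrained document embeddings (assumption: 3 dimensions).
embeddings = [
    [0.9, 0.1, 0.3],
    [0.8, 0.2, 0.5],
    [0.2, 0.7, 0.4],
    [0.1, 0.9, 0.6],
]
labels = ["sports", "sports", "politics", "politics"]

tree = build_tree(embeddings, labels)
prediction = predict(tree, [0.85, 0.15, 0.2])
```

Each internal node tests a single embedding dimension against a threshold, so the path from the root to a leaf can be read as an explicit rule; that readability is what makes a tree a natural basis for explanation.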