
On Robustifying Concept Explanations

EasyChair Preprint 11000, version 2

7 pages · Date: October 4, 2023

Abstract

As deep learning models see wider use, understanding and diagnosing their predictions is increasingly important. A common approach to understanding the predictions of deep networks is concept explanations: a form of global explanation that aims to interpret a deep network's output in terms of human-understandable concepts. However, prevailing concept explanation methods are not robust to the choice of concepts or datasets used to compute the explanation.
We show that this sensitivity is partly due to ignoring the effect of input noise and epistemic uncertainty in the estimation process. To address this challenge, we propose an uncertainty-aware estimation method. Through a mix of theoretical analysis and empirical evaluation, we demonstrate the stability, label efficiency, and faithfulness of the explanations computed by our approach.
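The abstract does not spell out the estimator, but one plausible way to make concept importance uncertainty-aware is to fit an ensemble of TCAV-style concept activation vectors on bootstrap resamples and treat the spread of the resulting importance scores as an epistemic-uncertainty estimate. The sketch below illustrates that idea; the names (bootstrap_cavs, concept_importance, n_boot) are illustrative assumptions, not the authors' method.

import numpy as np
from sklearn.linear_model import LogisticRegression

def bootstrap_cavs(concept_acts, random_acts, n_boot=50, seed=0):
    """Fit an ensemble of concept activation vectors (CAVs) on bootstrap
    resamples; the spread across the ensemble serves as an epistemic
    uncertainty estimate for the concept direction."""
    rng = np.random.default_rng(seed)
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    cavs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))   # resample with replacement
        clf = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        w = clf.coef_.ravel()
        cavs.append(w / np.linalg.norm(w))           # unit-norm concept direction
    return np.array(cavs)

def concept_importance(cavs, grads):
    """TCAV-style importance: fraction of inputs whose gradient has a positive
    component along the concept direction, averaged over the CAV ensemble,
    with the ensemble standard deviation reported as an uncertainty estimate."""
    scores = [(grads @ cav > 0).mean() for cav in cavs]
    return float(np.mean(scores)), float(np.std(scores))

if __name__ == "__main__":
    d = 64                                                # activation dimensionality
    rng = np.random.default_rng(1)
    concept_acts = rng.normal(0.5, 1.0, size=(100, d))    # toy "concept" activations
    random_acts = rng.normal(0.0, 1.0, size=(100, d))     # toy random activations
    grads = rng.normal(size=(200, d))                     # toy loss gradients
    cavs = bootstrap_cavs(concept_acts, random_acts)
    mean_imp, std_imp = concept_importance(cavs, grads)
    print(f"importance = {mean_imp:.3f} ± {std_imp:.3f}")

A downstream explanation could then discount or flag concepts whose importance interval is wide, which is one way an estimator could gain the stability the abstract claims.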

Keyphrases: Explainable AI, concept bottleneck, concept explanations, uncertainty

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:11000,
  author    = {Elizabeth Chou and Amanda Boyd},
  title     = {On Robustifying Concept Explanations},
  howpublished = {EasyChair Preprint 11000},
  year      = {EasyChair, 2023}}