Interpretable Counterfactual Explanations Guided by Prototypes

Arnaud Van Looveren, Janis Klaise

2021

We propose a fast, model-agnostic method for finding interpretable counterfactual explanations of classifier predictions by using class prototypes. We show that class prototypes, obtained either from an encoder or from class-specific k-d trees, significantly speed up the search for counterfactual instances and result in more interpretable explanations. We quantitatively evaluate the interpretability of the generated counterfactuals to illustrate the effectiveness of our method on an image dataset (MNIST) and a tabular dataset (Breast Cancer Wisconsin (Diagnostic)). Additionally, we propose a principled approach to handling categorical variables and illustrate our method on the Adult (Census) dataset. Our method also eliminates the computational bottleneck that arises from numerical gradient evaluation for black-box models.
