Explaining Black-Box Models Using Interpretable Surrogates.

2019 
Explaining black-box machine learning models is important for their successful applicability to many real world problems. Existing approaches to model explanation either focus on explaining a particular decision instance or are applicable only to specific models. In this paper, we address these limitations by proposing a new model-agnostic mechanism to black-box model explainability. Our approach can be utilised to explain the predictions of any black-box machine learning model. Our work uses interpretable surrogate models (e.g. a decision tree) to extract global rules to describe the preditions of a model. We develop an optimization procedure, which helps a decision tree to mimic a black-box model, by efficiently retraining the decision tree in a sequential manner, using the data labeled by the black-box model. We demonstrate the usefulness of our proposed framework using three applications: two classification models, one built using iris dataset, other using synthetic dataset and a regression model built for bike sharing dataset.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    11
    References
    6
    Citations
    NaN
    KQI
    []