Interpret Neural Networks by Extracting Critical Subnetworks

2020

In recent years, deep neural networks have achieved excellent performance in many fields of artificial intelligence, and demands for their interpretability and robustness are also increasing. In this paper, we propose to understand the functional mechanism of neural networks by extracting critical subnetworks. Specifically, we define a critical subnetwork as a group of important channels across layers such that, if they were suppressed to zero, the final test performance would deteriorate severely. This novel perspective not only reveals the layerwise semantic behavior within the model but also yields more accurate visual explanations of the data through attribution methods. Moreover, we propose two adversarial-example detection methods based on the properties of sample-specific and class-specific subnetworks, which offer a possible route to improving model robustness.
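The abstract describes identifying channels whose suppression to zero severely degrades test performance. Below is a minimal sketch of that evaluation step in PyTorch; it is not the paper's implementation. The helper names (suppress_channels, accuracy), the candidate_subnetwork mapping, and the model/test_loader objects are all assumptions introduced for illustration.

import torch
import torch.nn as nn

def suppress_channels(model: nn.Module, layer_name: str, channel_idxs):
    """Register a forward hook that zeroes the given output channels
    of the named layer, simulating channel suppression."""
    layer = dict(model.named_modules())[layer_name]

    def hook(module, inputs, output):
        output = output.clone()
        output[:, channel_idxs] = 0.0  # zero the selected feature maps
        return output

    return layer.register_forward_hook(hook)

@torch.no_grad()
def accuracy(model: nn.Module, loader, device="cpu"):
    """Plain top-1 accuracy over a data loader."""
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

# Hypothetical usage: measure the accuracy drop when a candidate set of
# channels (a hypothesized critical subnetwork) is zeroed out.
# `model`, `test_loader`, and `candidate_subnetwork` (a dict mapping
# layer names to channel index lists) are assumed to exist.
#
# base_acc = accuracy(model, test_loader)
# handles = [suppress_channels(model, name, idxs)
#            for name, idxs in candidate_subnetwork.items()]
# masked_acc = accuracy(model, test_loader)
# for h in handles:
#     h.remove()  # restore the unmodified forward pass
# print(f"accuracy drop: {base_acc - masked_acc:.3f}")

A large drop under this probe is the abstract's criterion for calling a channel group "critical"; forward hooks are one convenient way to apply the zero-mask without editing the model's weights.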