Explaining Predictions of the X-Vector Speaker Age and Gender Classifier

2021 
In this paper, we assess the applicability of two Explainable Artificial Intelligence methods, the model-agnostic Feature Ablation and the neural-network-specific Integrated Gradients, for explaining the predictions of Deep Neural Network-based speech classification models. We apply these techniques to two Deep Learning x-vector models trained for age and gender classification from speech. Our results show that both methods can be used successfully for speech classification tasks and provide a deeper understanding of the model’s behaviour. In particular, we confirm that the features highlighted by the explored methods are fundamental to the performance of the models: removing them degrades performance rapidly compared to the random baseline. We also show that the highlighted characteristics align with the theoretical fundamentals of age- and gender-based changes in the speech production process.
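As a rough illustration of the two attribution methods named above, the following minimal sketch applies Feature Ablation and Integrated Gradients to a toy logistic model. It is not the paper's implementation: the model, its weights `W` and `B`, the all-zero baseline, and the input vector are hypothetical stand-ins for an x-vector classifier and its speech features.

```python
import numpy as np

# Hypothetical toy model standing in for a speech classifier (assumption):
# a logistic unit over four input features.
W = np.array([1.5, -2.0, 0.5, 0.0])
B = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x):
    """Probability of the positive class for a feature vector x."""
    return sigmoid(x @ W + B)

def model_grad(x):
    """Analytic gradient of the model output with respect to x."""
    p = model(x)
    return p * (1.0 - p) * W

def feature_ablation(x, baseline):
    """Score each feature by the output drop when it is replaced by the baseline."""
    full = model(x)
    scores = np.zeros_like(x)
    for i in range(x.size):
        ablated = x.copy()
        ablated[i] = baseline[i]
        scores[i] = full - model(ablated)
    return scores

def integrated_gradients(x, baseline, steps=200):
    """Midpoint Riemann approximation of the path integral of gradients
    along the straight line from baseline to x."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([model_grad(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, 0.5, -1.0, 2.0])   # hypothetical input features
baseline = np.zeros_like(x)           # assumed "absence of signal" reference

fa = feature_ablation(x, baseline)
ig = integrated_gradients(x, baseline)
print("feature ablation:    ", fa)
print("integrated gradients:", ig)
```

In this sketch the fourth feature has zero weight, so both methods assign it zero attribution, and the Integrated Gradients scores sum approximately to the difference between the model output at the input and at the baseline.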