Evaluating the Fairness of Predictive Student Models Through Slicing Analysis

2019 
Predictive modeling has been a core area of learning analytics research over the past decade, with such models currently deployed in a variety of educational contexts, from MOOCs to K-12. However, analyses of the differential effectiveness of these models across demographic, identity, or other groups have been scarce. In this paper, we present a method for evaluating unfairness in predictive student models. We define unfairness in terms of differential accuracy between subgroups, and measure it using a new metric we term the Absolute Between-ROC Area (ABROCA). We demonstrate the proposed method through a gender-based "slicing analysis" using five different models replicated from other works and a dataset of 44 unique MOOCs and over four million learners. Our results demonstrate (1) significant differences in model fairness according to (a) statistical algorithm and (b) feature set used; (2) that the gender imbalance ratio, curricular area, and specific course used for a model all display significant association with the value of the ABROCA statistic; and (3) that there is no evidence of a strict tradeoff between performance and fairness. This work provides a framework for quantifying and understanding how predictive models might inadvertently privilege, or disparately impact, different student subgroups. Furthermore, our results suggest that learning analytics researchers and practitioners can use slicing analysis to improve model fairness without necessarily sacrificing performance.
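The abstract describes ABROCA as the absolute area between the ROC curves of two subgroups. The sketch below shows one way such a statistic could be computed; it is a minimal illustration under assumed inputs (binary labels, predicted scores, and a boolean group indicator), and the function name, grid size, and interpolation scheme are illustrative choices rather than the authors' released implementation.

```python
import numpy as np
from sklearn.metrics import roc_curve


def abroca(y_true, y_score, group, n_grid=10_000):
    """Illustrative Absolute Between-ROC Area between two subgroups.

    Approximates the integral of |ROC_a(fpr) - ROC_b(fpr)| over the
    false-positive rate in [0, 1], where each ROC curve is computed from
    that subgroup's labels and predicted scores. `group` is a boolean
    array splitting learners into the two subgroups (e.g. by gender).
    """
    y_true, y_score, group = map(np.asarray, (y_true, y_score, group))
    fpr_grid = np.linspace(0.0, 1.0, n_grid)
    tprs = []
    for mask in (group, ~group):
        fpr, tpr, _ = roc_curve(y_true[mask], y_score[mask])
        # Put both subgroup ROC curves on a common FPR grid so the
        # pointwise absolute difference between them is well defined.
        tprs.append(np.interp(fpr_grid, fpr, tpr))
    diff = np.abs(tprs[0] - tprs[1])
    # Trapezoidal rule over the uniform FPR grid approximates the
    # absolute area between the two ROC curves.
    step = fpr_grid[1] - fpr_grid[0]
    return float(np.sum((diff[:-1] + diff[1:]) / 2.0) * step)
```

A value of 0 would indicate identical ROC curves for the two subgroups, while larger values indicate greater divergence in predictive accuracy across the range of classification thresholds.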