FairBalance: Improving Machine Learning Fairness on Multiple Sensitive Attributes With Data Balancing

2021 
This paper aims to improve machine learning fairness on multiple sensitive attributes. Machine learning fairness has attracted increasing attention since machine learning software is increasingly used for high-stakes and high-risk decisions. Most existing solutions for machine learning fairness either target only one sensitive attribute (e.g., sex) at a time, require magic parameters to tune, or incur expensive computational overhead. To overcome these challenges, we propose FairBalance, which balances the group distribution of training data across every sensitive attribute before training the machine learning models. Our results show that, under the assumption of unbiased ground truth labels and at low computational overhead, FairBalance can significantly reduce the bias measured by fairness metrics (average odds difference, AOD; equal opportunity difference, EOD; and statistical parity difference, SPD) on every known sensitive attribute, without much, if any, damage to prediction performance. In addition, FairBalanceClass, a variant of FairBalance, can also balance the class distribution in the training data. With FairBalanceClass, predictions no longer favor the majority class, yielding a higher F$_1$ score on the minority class. FairBalance and FairBalanceClass also outperform other state-of-the-art bias mitigation algorithms in terms of both prediction performance and fairness metrics. This research will benefit society by providing a simple yet effective approach to improving the fairness of machine learning software on data with multiple sensitive attributes. Our results also validate the hypothesis that, on datasets with unbiased ground truth labels, ethical biases in the learned models are largely attributable to the training data having (1) differences in group size and (2) differences in class distribution within each group.
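The balancing idea lends itself to a short sketch. Below is a minimal, hypothetical Python illustration of group-based reweighting in the spirit of FairBalance and FairBalanceClass; the function name and the exact weighting scheme are assumptions for illustration, not the authors' released code. Each training sample is weighted inversely to the size of its demographic group (one combination of sensitive-attribute values), so every group contributes the same total weight to training; passing a label column additionally balances classes within each group, as in FairBalanceClass.

```python
from collections import Counter
from typing import Hashable, Mapping, Optional, Sequence


def fair_balance_weights(
    rows: Sequence[Mapping[str, Hashable]],
    sensitive: Sequence[str],
    label: Optional[str] = None,
) -> list[float]:
    """Per-sample weights that equalize the total weight of every group,
    where a group is one combination of sensitive-attribute values.
    If `label` is given, classes are also balanced within each group
    (the FairBalanceClass variant)."""
    keys = list(sensitive) + ([label] if label is not None else [])
    groups = [tuple(row[k] for k in keys) for row in rows]
    counts = Counter(groups)
    # Down-weight each sample by its group's size, so every group
    # contributes the same total weight to the training objective.
    weights = [1.0 / counts[g] for g in groups]
    # Rescale so the weights sum to the number of samples.
    scale = len(weights) / sum(weights)
    return [w * scale for w in weights]


# Example: balance over sex and race, then also over the class label.
data = [
    {"sex": "F", "race": "A", "y": 1},
    {"sex": "F", "race": "A", "y": 0},
    {"sex": "M", "race": "B", "y": 1},
]
print(fair_balance_weights(data, ["sex", "race"]))             # FairBalance
print(fair_balance_weights(data, ["sex", "race"], label="y"))  # FairBalanceClass
```

The returned weights can be fed to any learner that accepts per-sample weights (e.g., the `sample_weight` argument of scikit-learn's `fit` methods), which is consistent with the paper's claim of low computational overhead: the pre-processing needs only a single pass over the training data and no repeated model retraining.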