    Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis
    Citations: 19 · References: 40 · Related Papers: 10
    Authors: Yiwei Wang, Muhao Chen, Wenxuan Zhou, Yujun Cai, Yuxuan Liang, Dayiheng Liu, Baosong Yang, Juncheng Liu, Bryan Hooi. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022.
    Keywords:
    Debiasing
    Relation extraction
    Computational linguistics
    Related Papers:
    The output tendencies of Pre-trained Language Models (PLMs) vary markedly before and after Fine-Tuning (FT) due to the updates to the model parameters. These divergences in output tendencies result in a gap in the social biases of PLMs. For example, there exists a low correlation between intrinsic bias scores of a PLM and its extrinsic bias scores under FT-based debiasing methods. Additionally, applying FT-based debiasing methods to a PLM leads to a decline in performance on downstream tasks. On the other hand, PLMs trained on large datasets can learn without parameter updates via In-Context Learning (ICL) using prompts. ICL induces smaller changes to PLMs than FT-based debiasing methods. Therefore, we hypothesize that the gap observed between pre-trained and FT models does not hold for debiasing methods that use ICL. In this study, we demonstrate that ICL-based debiasing methods show a higher correlation between intrinsic and extrinsic bias scores than FT-based methods. Moreover, the performance degradation due to debiasing is also lower in the ICL case than in the FT case.
    Debiasing
    Citations (0)
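    The comparison above rests on correlating intrinsic bias scores with extrinsic (downstream) bias scores across a set of models. As a minimal illustration of that measurement, with made-up scores and not the authors' implementation, the following Python sketch computes the Pearson correlation for a handful of hypothetical debiased models:

    import numpy as np

    # Hypothetical intrinsic (template-based) and extrinsic (downstream-task)
    # bias scores for five debiased models; all values are placeholders.
    intrinsic = np.array([0.62, 0.48, 0.55, 0.40, 0.35])
    extrinsic = np.array([0.30, 0.22, 0.27, 0.18, 0.15])

    # Pearson correlation between the two score lists: a high value means the
    # intrinsic benchmark tracks the bias that actually appears downstream.
    r = np.corrcoef(intrinsic, extrinsic)[0, 1]
    print(f"intrinsic/extrinsic correlation: r = {r:.2f}")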
    The increasing application of Artificial Intelligence and Machine Learning models poses potential risks of unfair behavior and, in light of recent regulations, has attracted the attention of the research community. Several researchers have focused on seeking new fairness definitions or developing approaches to identify biased predictions. However, none try to exploit the counterfactual space to this aim. In that direction, the methodology proposed in this work aims to unveil unfair model behaviors using counterfactual reasoning in the fairness under unawareness setting. A counterfactual version of equal opportunity, named counterfactual fair opportunity, is defined, and two novel metrics that analyze the sensitive information of counterfactual samples are introduced. Experimental results on three different datasets show the efficacy of our methodologies and our metrics, disclosing the unfair behavior of classic machine learning and debiasing models.
    Debiasing
    Citations (0)
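    As a rough illustration of the counterfactual evaluation described above (the data and the helper function are assumptions, not the authors' definitions), the sketch below flips the sensitive attribute of positive-label samples, re-scores them with the same model, and compares true-positive rates; a gap of zero would indicate counterfactual fair opportunity:

    import numpy as np

    def tpr(y_true, y_pred):
        # True-positive rate over samples whose ground-truth label is positive.
        pos = y_true == 1
        return float((y_pred[pos] == 1).mean())

    # Hypothetical positive-label samples: predictions on the original (factual)
    # inputs and on counterfactual copies with the sensitive attribute flipped.
    y_true           = np.array([1, 1, 1, 1, 1, 1])
    pred_factual     = np.array([1, 1, 0, 1, 1, 1])
    pred_counterfact = np.array([1, 0, 0, 1, 0, 1])

    gap = tpr(y_true, pred_factual) - tpr(y_true, pred_counterfact)
    print(f"counterfactual equal-opportunity gap: {gap:.2f}")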
    Consumers are subject to cognitive biases, which impede the rationality of their financial decisions. This is problematic, given the onus on the individual to make investment and savings decisions. Thus, there is an impetus for research to identify mitigation strategies. This qualitative review surveys the debiasing literature to identify the prevalent debiasing approaches and proposes an integrated model towards debiasing. The identified core debiasing strategies (education and training, decision support systems, information aspects, experience, and financial advice) are organized and integrated into a single model using the ‘Antecedents, Decisions, Outcomes’ format developed by Paul and Benito. We also propose an agenda for future debiasing research.
    Debiasing
    Reflection
    Citations (6)
    Ensemble-based debiasing methods have been shown effective in mitigating the reliance of classifiers on specific dataset bias, by exploiting the output of a bias-only model to adjust the learning target. In this paper, we focus on the bias-only model in these ensemble-based methods, which plays an important role but has not gained much attention in the existing literature. Theoretically, we prove that the debiasing performance can be damaged by inaccurate uncertainty estimations of the bias-only model. Empirically, we show that existing bias-only models fall short in producing accurate uncertainty estimations. Motivated by these findings, we propose to conduct calibration on the bias-only model, thus achieving a three-stage ensemble-based debiasing framework, including bias modeling, model calibrating, and debiasing. Experimental results on NLI and fact verification tasks show that our proposed three-stage debiasing framework consistently outperforms the traditional two-stage one in out-of-distribution accuracy.
    Debiasing
    Ensemble Learning
    Ensemble forecasting
    Citations (10)
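    The three-stage framework above (bias modeling, model calibrating, debiasing) can be pictured with a short sketch. The snippet below is illustrative only: it uses temperature scaling as a stand-in for the calibration stage and a standard product-of-experts loss for the debiasing stage, which may differ from the paper's exact formulation:

    import torch
    import torch.nn.functional as F

    def poe_debias_loss(main_logits, bias_logits, labels, temperature=2.0):
        # Stage 2: calibrate the (frozen) bias-only model by softening its logits.
        bias_logp = F.log_softmax(bias_logits / temperature, dim=-1)
        # Stage 3: product of experts = sum of log-probabilities; gradients flow
        # only into the main model, pushing it to explain what the bias-only
        # model cannot.
        main_logp = F.log_softmax(main_logits, dim=-1)
        return F.cross_entropy(main_logp + bias_logp.detach(), labels)

    # Toy batch: 2 examples, 3 classes. bias_logits come from the bias-only
    # model trained in stage 1; here they are random placeholders.
    main_logits = torch.randn(2, 3, requires_grad=True)
    bias_logits = torch.randn(2, 3)
    labels = torch.tensor([0, 2])
    loss = poe_debias_loss(main_logits, bias_logits, labels)
    loss.backward()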
    Previous work has examined how debiasing language models affects downstream tasks, specifically, how debiasing techniques influence task performance and whether debiased models also make impartial predictions in downstream tasks or not. However, what we do not yet understand well is why debiasing methods have varying impacts on downstream tasks and how debiasing techniques affect internal components of language models, i.e., neurons, layers, and attention heads. In this paper, we decompose the internal mechanisms of debiasing language models with respect to gender by applying causal mediation analysis to understand the influence of debiasing methods on toxicity detection as a downstream task. Our findings suggest a need to test the effectiveness of debiasing methods with different bias metrics, and to focus on changes in the behavior of certain components of the models, e.g., the first two layers of language models and attention heads.
    Debiasing
    Affect
    Citations (0)
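    Causal mediation analysis, as used above, splits a model's total bias effect into a direct path and an indirect path through one chosen component (a neuron, layer, or attention head). The toy sketch below uses a placeholder score function and made-up numbers rather than a real language model, and only shows how the total, indirect, and direct effects are computed from interventions:

    def score(prompt_is_female: bool, component_from_female: bool) -> float:
        # Toy toxicity score that depends on the input gender and on the value
        # one internal component (e.g. an attention head) takes; made-up numbers.
        base = 0.30
        base += 0.20 if prompt_is_female else 0.0       # effect via all other paths
        base += 0.15 if component_from_female else 0.0  # effect via this component
        return base

    y_male     = score(False, False)  # factual: male prompt
    y_female   = score(True, True)    # counterfactual: female prompt
    y_mediated = score(False, True)   # male prompt, component set to its female value

    total_effect    = y_female - y_male
    indirect_effect = y_mediated - y_male             # flows through the component
    direct_effect   = total_effect - indirect_effect  # flows around it
    print(total_effect, indirect_effect, direct_effect)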
    Pre-trained language models trained on large-scale data have learned serious levels of social biases. Consequently, various methods have been proposed to debias pre-trained models. Debiasing methods need to mitigate only discriminatory bias information from the pre-trained models, while retaining information that is useful for the downstream tasks. In previous research, whether useful information is retained has been confirmed by the performance of debiased pre-trained models on downstream tasks. On the other hand, it is not clear whether these benchmarks consist of data pertaining to social biases and are appropriate for investigating the impact of debiasing. For example, in gender-related social biases, data containing female words (e.g. "she, female, woman"), male words (e.g. "he, male, man"), and stereotypical words (e.g. "nurse, doctor, professor") are considered to be the most affected by debiasing. If there is not much data containing these words in a benchmark dataset for a target task, the effects of debiasing may be evaluated erroneously. In this study, we compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets that contain female, male, and stereotypical words. Experiments show that the effects of debiasing are consistently underestimated across all tasks. Moreover, the effects of debiasing can be evaluated more reliably by separately considering instances containing female, male, and stereotypical words than by considering all of the instances in a benchmark dataset.
    Debiasing
    Benchmarking
    Citations (0)
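    The evaluation protocol above amounts to splitting a benchmark by word lists and scoring each subset separately. A minimal sketch of that split, with placeholder word lists and data rather than the authors' resources:

    FEMALE = {"she", "her", "female", "woman"}
    MALE = {"he", "his", "male", "man"}
    STEREO = {"nurse", "doctor", "professor"}

    def subset(examples, vocab):
        # Keep only instances that contain at least one word from the list.
        return [ex for ex in examples if vocab & set(ex["text"].lower().split())]

    def accuracy(examples):
        return sum(ex["pred"] == ex["label"] for ex in examples) / max(len(examples), 1)

    # Hypothetical benchmark instances with model predictions already attached.
    data = [
        {"text": "She is a doctor", "label": 1, "pred": 1},
        {"text": "He works as a nurse", "label": 1, "pred": 0},
        {"text": "The weather is nice", "label": 0, "pred": 0},
    ]

    for name, vocab in [("female", FEMALE), ("male", MALE), ("stereotype", STEREO)]:
        part = subset(data, vocab)
        print(f"{name}: n={len(part)} acc={accuracy(part):.2f}")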
    Systematic biases have been found in both individual and group judgments, calling for research into debiasing approaches. Although individual debiasing has been studied to some extent, no parallel effort exists for group debiasing. This paper advocates the use of group support systems (GSS) for group debiasing and presents a theoretical perspective on how this debiasing may be achieved. Special attention is paid to two important judgment biases: representativeness bias and availability bias. A research model is developed from which propositions are derived.
    Debiasing
    In-group favoritism
    Citations (1)