Efficient Detection of Multilingual Hate Speech by Using Interactive Attention Network with Minimal Human Feedback
2021
Online hate speech on social media has become a critical problem for social network services, and it has been further fueled by self-isolation during the COVID-19 pandemic. Current studies have primarily focused on detecting hate speech in a single language due to the complexity of the task; however, hate speech in the real world crosses the boundaries of language and geography. This demands further investigation of multilingual hate speech detection methods, with strong requirements for model interpretability so that the context of model errors can be understood. In this paper, we propose a Multilingual Interactive Attention Network (MLIAN) model for hate speech detection on multilingual social media text corpora, building upon attention networks for interpretability and the human-in-the-loop paradigm for model adaptability. The model interactively learns to attend to the relevant contextual words and leverages labels for hate target mentions obtained from simulated human feedback. We evaluated the proposed model on the SemEval-2019 Task 5 datasets in English and Spanish. Extensive experiments with model training on both single-language and multiple-language data demonstrate the superior performance of our model (with AUC above 84%) compared to strong baselines. Our results show that human feedback not only improves model performance but also improves the interpretability of the model by establishing a strong connection between the learned attention weights and the semantic frames of the text across languages. Further, an analysis of the amount of human feedback required to achieve reliable and increased model performance shows that less than 4% of the training data is sufficient. The application of the MLIAN method can inform future studies on multilingual hate speech.