HurtBERT: Incorporating Lexical Features with BERT for the Detection of Abusive Language

Anna Koufakou, Endang Wahyu Pamungkas, Valerio Basile, Viviana Patti

2020

The detection of abusive or offensive remarks in social texts has received significant research attention. In several related shared tasks, BERT has been shown to achieve state-of-the-art results. In this paper, we propose using lexical features derived from a hate lexicon to improve the performance of BERT on such tasks. We explore different ways to use the lexical features, in the form of lexicon-based encodings at the sentence level or embeddings at the word level. To give a complete picture, we provide an extensive evaluation across datasets that addresses both in-domain and cross-domain detection of abusive content. Our results indicate that the proposed models, which combine BERT with lexical features, improve over a baseline BERT model in many of our in-domain and cross-domain experiments.
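
As a rough illustration of what a sentence-level lexicon-based encoding could look like, the sketch below counts how many tokens of a sentence fall into each category of a lexicon and appends those counts to a sentence vector. Everything in it is an assumption made for illustration: the toy lexicon, its category names, the whitespace tokenizer, and the `combine` helper are hypothetical and are not taken from the paper, and the sentence vector stands in for whatever representation a BERT model would produce.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> category); NOT the hate lexicon used in the paper.
TOY_LEXICON = {
    "idiot": "insult",
    "moron": "insult",
    "trash": "derogatory",
    "vermin": "animal",
}

# Fixed category order so every sentence maps to a vector of the same length.
CATEGORIES = sorted(set(TOY_LEXICON.values()))  # ['animal', 'derogatory', 'insult']


def lexicon_encoding(sentence):
    """Return per-category counts of lexicon words found in the sentence."""
    tokens = sentence.lower().split()  # naive whitespace tokenizer (assumption)
    counts = Counter(TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON)
    return [counts.get(category, 0) for category in CATEGORIES]


def combine(sentence_vector, sentence):
    """Concatenate a sentence vector (e.g. from BERT) with the lexicon encoding."""
    return list(sentence_vector) + lexicon_encoding(sentence)
```

In a setup like this, the concatenated vector would typically be passed to a classification layer; how the paper's models actually merge the two representations is not specified in this abstract.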

Keywords:

Lexicon
Sentence
Computer science
Artificial intelligence
Natural language processing
Offensive
