Deep analysis of word sense disambiguation via semi-supervised learning and neural word representations

2021 
Abstract: Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context. Different approaches have been proposed in the supervised and unsupervised domains; in most cases, supervised learning provides superior WSD performance. Since sense-annotated corpora are difficult and time-consuming to obtain, and that annotation effort must be repeated for new domains, languages, and sense inventories, semi-supervised learning (SSL) methods, which combine a small amount of sense-annotated data with a large amount of unlabeled data, have become increasingly prominent. Among SSL approaches, graph-based methods are common because they capture the relationships between terms using an undirected graph. This paper investigates semi-supervised WSD by considering different graph-based SSL algorithms with features generated by word embeddings from Word2Vec, FastText, GloVe, BERT, and ELECTRA models, combined with part-of-speech tags and word context. We test several combinations of word-embedding models, similarity measures for graph construction, and SSL classification algorithms to disambiguate classical lexical-sample WSD datasets. The results indicate that our SSL algorithms achieve results competitive with supervised ones, and that ELECTRA embeddings outperform the other embeddings for SSL.
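The following is a minimal sketch of the kind of graph-based SSL pipeline the abstract describes, not the paper's actual implementation. It assumes each occurrence of a target word has already been encoded as a fixed-length feature vector (e.g., a contextual embedding combined with POS features); here the vectors are synthetic, and scikit-learn's LabelSpreading stands in for the graph-based SSL classifiers the paper evaluates.

```python
# Sketch of graph-based semi-supervised WSD: a similarity graph is built
# over embedding features, and sense labels from a few annotated examples
# propagate along the graph edges to the unlabeled occurrences.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Synthetic stand-in for embedding features: two senses of one target word,
# each forming a rough cluster in the feature space.
X_sense_a = rng.normal(loc=0.0, scale=1.0, size=(50, 16))
X_sense_b = rng.normal(loc=3.0, scale=1.0, size=(50, 16))
X = np.vstack([X_sense_a, X_sense_b])

# Only a handful of occurrences are sense-annotated; -1 marks unlabeled data.
y = np.full(100, -1)
y[:5] = 0     # five labeled examples of sense 0
y[50:55] = 1  # five labeled examples of sense 1

# k-NN graph over embedding similarity; alpha controls how strongly the
# initial labels can be overwritten during propagation.
model = LabelSpreading(kernel="knn", n_neighbors=7, alpha=0.2)
model.fit(X, y)

# transduction_ holds the inferred sense for every occurrence, labeled or not.
print("Predicted sense for an unlabeled occurrence:", model.transduction_[10])
```

In a real setting, the choice of similarity measure (here, Euclidean k-NN) and of embedding model would be swapped out per the combinations the paper tests; the hyperparameters above are illustrative only.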