Not All Synonyms Are Created Equal: Incorporating Similarity of Synonyms to Enhance Word Embeddings

2020 
Traditional word embedding approaches learn semantic information from the contexts in which words appear in large unlabeled corpora. This ignores the fact that synonymous words often occur in different contexts within a corpus, so synonymy is not well captured in the learned vectors. Furthermore, existing synonymy-based models incorporate synonyms directly when training word embeddings, but still neglect how similar a word is to each of its synonyms. In this paper, we explore a novel approach that exploits the similarity between words and their synonyms to train and enhance word embeddings. To this end, we build two Synonymy Similarity Models (SSMs), named SSM-W and SSM-M, which adopt different strategies for incorporating word-synonym similarity during training. We evaluate our models on both Chinese and English. The results demonstrate that our models outperform the baselines on seven word similarity datasets. On analogical reasoning and text classification tasks, our models also surpass all baselines, including a synonymy-based model.
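
The abstract does not spell out the SSM-W and SSM-M objectives, but the core idea it states, that a synonym's contribution should depend on how similar it actually is to the word, can be illustrated with a toy sketch. The Python snippet below is a minimal, assumed formulation: a standard skip-gram step with negative sampling, plus a synonym term whose pull on the word vector is scaled by the current word-synonym cosine similarity. The SYNONYMS lexicon, the LAMBDA weight, and the loss form are illustrative assumptions, not the paper's actual models.

```python
# A hypothetical sketch of a synonym-similarity-weighted skip-gram update.
# This is NOT the paper's SSM-W/SSM-M formulation, which the abstract does
# not specify; it only illustrates the stated idea of weighting synonyms
# by their similarity to the word.
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["happy", "glad", "joyful", "sad", "car"]
IDX = {w: i for i, w in enumerate(VOCAB)}
SYNONYMS = {"happy": ["glad", "joyful"]}  # toy synonym lexicon (assumed)

DIM, LR, LAMBDA = 8, 0.05, 0.5
W = rng.normal(scale=0.1, size=(len(VOCAB), DIM))  # word (input) vectors
C = rng.normal(scale=0.1, size=(len(VOCAB), DIM))  # context (output) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def train_pair(word, context, negative):
    """One skip-gram step with one negative sample, plus a synonym term
    whose strength is scaled by the current word-synonym cosine
    similarity (treated as a fixed weight; no gradient through it)."""
    w, c, n = IDX[word], IDX[context], IDX[negative]
    # Standard skip-gram-with-negative-sampling gradients.
    g_pos = sigmoid(W[w] @ C[c]) - 1.0   # coefficient for the positive pair
    g_neg = sigmoid(W[w] @ C[n])         # coefficient for the negative pair
    grad_w = g_pos * C[c] + g_neg * C[n]
    C[c] -= LR * g_pos * W[w]
    C[n] -= LR * g_neg * W[w]
    # Synonym term: pull the word vector toward each synonym vector,
    # weighted by how similar the pair already is ("not all synonyms
    # are created equal").
    for s in SYNONYMS.get(word, []):
        sim = max(cosine(W[w], W[IDX[s]]), 0.0)      # weight in [0, 1]
        grad_w += LAMBDA * sim * (W[w] - W[IDX[s]])
    W[w] -= LR * grad_w

for _ in range(200):
    train_pair("happy", "glad", "car")

print("cos(happy, glad):", round(cosine(W[IDX["happy"]], W[IDX["glad"]]), 3))
print("cos(happy, car): ", round(cosine(W[IDX["happy"]], W[IDX["car"]]), 3))
```

Under this assumed objective, synonym pairs that are already dissimilar (for example, polysemous lexicon entries) exert a weaker pull, which is one plausible reading of "not all synonyms are created equal."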