Sidecar: Augmenting Word Embedding Models with Expert Knowledge

2020 
This work investigates a method for enriching pre-trained word embeddings with domain-specific information using a small, custom word embedding. On a classification task over text containing out-of-vocabulary expert jargon, the approach improves prediction accuracy with popular models such as Word2Vec (71.5% to 76.6%), GloVe (73.5% to 77.2%), and fastText (75.8% to 79.6%). Furthermore, an analysis of the approach shows that the representation of expert knowledge improves, with higher discrimination and lower inconsistency. Another advantage of this word embedding augmentation technique is that it is computationally inexpensive and leverages the general syntactic information encoded in large pre-trained word embeddings.
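
To make the idea concrete, below is a minimal sketch of one plausible way a small "sidecar" embedding could augment a large pre-trained model: concatenating the general-purpose vector with a small domain vector before feeding a classifier. The abstract does not specify the combination mechanism, so the concatenation, the dictionaries `pretrained` and `sidecar`, the dimensions, and the example tokens are all illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins: in practice `pretrained` would come from a large
# model such as Word2Vec, GloVe, or fastText, and `sidecar` from a small
# embedding trained only on in-domain expert text. Tokens and dimensions
# here are assumptions for the sketch.
PRETRAINED_DIM, SIDECAR_DIM = 300, 50
pretrained = {w: rng.normal(size=PRETRAINED_DIM) for w in ["the", "patient", "fracture"]}
sidecar = {w: rng.normal(size=SIDECAR_DIM) for w in ["fracture", "fx", "orif"]}


def embed_token(token):
    """Concatenate the general-purpose vector with the domain (sidecar) vector.

    Tokens missing from either table fall back to zeros, so expert jargon
    absent from the pre-trained vocabulary still receives a domain
    representation, while common words keep the general syntactic signal
    of the large model.
    """
    general = pretrained.get(token, np.zeros(PRETRAINED_DIM))
    domain = sidecar.get(token, np.zeros(SIDECAR_DIM))
    return np.concatenate([general, domain])


def embed_document(tokens):
    """Average augmented token vectors into a fixed-size classifier feature."""
    return np.mean([embed_token(t) for t in tokens], axis=0)


features = embed_document(["the", "patient", "has", "a", "fx"])
print(features.shape)  # (350,)
```

The resulting fixed-size feature vector could then be passed to any standard classifier; the sidecar component is cheap to train because it only covers the small in-domain vocabulary.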