A Combination of Enhanced WordNet and BERT for Semantic Textual Similarity

2021 
The task of measuring sentence similarity deals with computing the likeness between a pair of sentences by combining distance measures (Euclidean, Jaccard, Manhattan, etc.) with embedding techniques (word2vec, GloVe, Flair, etc.). For determining sentence similarity, this paper proposes a novel ensemble-learning approach that combines the WordNet lexical database with Bidirectional Encoder Representations from Transformers (BERT) so that the context of words in a sentence is taken into account when computing similarity scores. The accuracy of the proposed model is evaluated by computing Pearson and Spearman correlation scores on sentence pairs from the Sentences Involving Compositional Knowledge (SICK) dataset. The proposed approach is observed to outperform existing state-of-the-art semantic textual similarity models, returning the highest correlation scores. Further, the paper also introduces a possible machine-learning approach to the same task and evaluates its scope and drawbacks.
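The sketch below is an illustrative approximation of the kind of pipeline the abstract describes, not the paper's exact ensemble: a WordNet-based lexical score is blended with a BERT-based embedding score for each sentence pair, and the predictions are evaluated against gold SICK-style relatedness labels using Pearson and Spearman correlation. The checkpoint name, the mean-pooling model, the word-level WordNet averaging, and the 50/50 blend are all assumptions made for illustration.

```python
# Illustrative sketch: WordNet + BERT sentence similarity with Pearson/Spearman evaluation.
from itertools import product

from nltk.corpus import wordnet as wn          # requires nltk.download("wordnet")
from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer, util

bert = SentenceTransformer("bert-base-nli-mean-tokens")  # assumed checkpoint


def wordnet_score(sent_a: str, sent_b: str) -> float:
    """Average of the best WordNet path similarity over all cross-sentence word pairs."""
    scores = []
    for word_a, word_b in product(sent_a.lower().split(), sent_b.lower().split()):
        best = max(
            (sa.path_similarity(sb) or 0.0
             for sa in wn.synsets(word_a)
             for sb in wn.synsets(word_b)),
            default=0.0,
        )
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0


def bert_score(sent_a: str, sent_b: str) -> float:
    """Cosine similarity between pooled BERT sentence embeddings."""
    emb_a, emb_b = bert.encode(sent_a), bert.encode(sent_b)
    return util.cos_sim(emb_a, emb_b).item()


# Hypothetical sentence pairs with gold relatedness labels (SICK uses a 1-5 scale).
pairs = [
    ("A man is playing a guitar.", "A person plays a musical instrument.", 4.5),
    ("A dog is running in the park.", "A cat is sleeping on a couch.", 1.6),
    ("Two children are kicking a ball.", "Kids are playing soccer outside.", 4.1),
]

predicted, gold = [], []
for a, b, label in pairs:
    # Assumed equal-weight blend of the lexical and contextual scores.
    predicted.append(0.5 * wordnet_score(a, b) + 0.5 * bert_score(a, b))
    gold.append(label)

print("Pearson:  %.3f" % pearsonr(predicted, gold)[0])
print("Spearman: %.3f" % spearmanr(predicted, gold)[0])
```

In practice the blend weights (and whether the combination is a simple average or a learned ensemble) would follow the paper's method; the equal weighting here is only a placeholder for the evaluation flow.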