Detecting Semantically Equivalent Questions Using Transformer-Based Encodings

2021 
Due to the increasing influx of users on various Q and A forums like Quora, Stack Overflow, etc., answers to questions get fragmented across different versions of the same question due to this redundancy of questions. However, if questions which are lexically or semantically identical could be grouped, the search results for a given question could yield the assimilation of answers provided for all versions of a question. In this paper, an ensemble of four separate models is created using four different word-embedding techniques in each. The vectorized data from each of the embedding layers is passed through a custom-made architecture using transformer-based encodings. This architecture is used to determine if a pair of questions have the same semantic meaning or not. Finally, on testing the model on a subset of the provided data, the experiments show that the proposed model achieves an F1 score of 84.63%
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    12
    References
    0
    Citations
    NaN
    KQI
    []