Comparative and experimental study in identifying the similarity between languages for plagiarism detection and efficient language translation

2021 
Abstract This article compares techniques for measuring the similarity of multilingual documents, a task that underlies plagiarism detection and bilingual lexicon extraction. Multilingual text is compared using a parallel corpus, a collection of aligned sentences that are translations of each other. We present a comparative study of three techniques, Sequence Matcher, Fuzzy-Wuzzy (Ratio), and spaCy similarity, for finding the similarity between sentences and words in multilingual content. The study also compares the proposed methods with similar techniques from the literature. Fuzzy-Wuzzy (Ratio) outperforms Sequence Matcher and spaCy similarity in terms of accuracy when identifying similarity between languages.
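As a rough illustration of the string-based end of this comparison, the sketch below scores sentence pairs with Python's standard-library `difflib.SequenceMatcher`. It is not the authors' implementation. The helper names `sequence_ratio` and `scaled_ratio` and the sample sentence pairs are hypothetical, and `scaled_ratio` is only a stand-in that rescales the same score to the 0-100 range used by ratio-style fuzzy matchers; it does not call the FuzzyWuzzy library. spaCy similarity is left out because it depends on a separately installed language model.

```python
from difflib import SequenceMatcher


def sequence_ratio(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()


def scaled_ratio(a: str, b: str) -> int:
    """Hypothetical stand-in: the same score rescaled to an integer in 0..100."""
    return round(100 * sequence_ratio(a, b))


# Illustrative sentence pairs (assumed for this sketch, not taken from the paper's corpus).
pairs = [
    ("the house is small", "the house is little"),
    ("la casa es pequeña", "la casa es pequena"),
    ("good morning", "completely different text"),
]

for left, right in pairs:
    print(f"{left!r} vs {right!r}: "
          f"ratio={sequence_ratio(left, right):.2f}, scaled={scaled_ratio(left, right)}")
```

Character-level scores of this kind reward shared surface forms, so they are most informative for closely related languages or near-copies, while an embedding-based measure is needed to capture meaning across unrelated scripts.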