Cross-Language Urdu-English (CLUE) Text Alignment Corpus

2015 
Plagiarism is a well-known problem today. Easy access to print and electronic media and to ready-to-use material has made it easy to reuse existing text in a new document. In the monolingual context, automated and tailored efforts by the research community have greatly reduced the severity of the problem, but the issue has not yet been properly addressed for cross-language (CL) text reuse. A story or article written in a source language such as Urdu is simply translated into a target language such as English, and the translator claims it as their own. The availability of standard and simulated resources addresses this issue by providing a test bed for analyzing and implementing available plagiarism detection approaches. This research aims to enrich the available cross-language corpora and to provide a benchmark corpus for the Cross-Language Plagiarism (CLP) domain.