The Struggle with Academic Plagiarism: Approaches based on Semantic Similarity

arXiv (Cornell University) (2021)

Abstract:

Academic plagiarism is a serious problem. Because digital information sources are practically inexhaustible, it is easier to plagiarize today than ever before. Fortunately, plagiarism detection techniques have improved and are now powerful enough to detect plagiarism attempts in education. Efficient plagiarism detection software, such as Turnitin, iThenticate or SafeAssign, is already in use. In the introduction we explore the software used within the Croatian academic community for plagiarism detection in universities and/or scientific journals. The question is: is this enough? Current software has proven successful; however, the problem of identifying paraphrasing or obfuscation plagiarism remains unresolved. In this paper we report on how semantic similarity measures can be used in the plagiarism detection task.

Keywords:

Plagiarism detection

Obfuscation

Similarity (geometry)

Academic community

Topics:

Academic integrity and plagiarism

Topic Modeling

Imbalanced Data Classification Techniques

10.48550/arxiv.2106.04404

A Classic Multi-method Collaborative Obfuscation Strategy

Communications in computer and information science (2021)

Yujie Ma Yuanzhang Li Zhibin Zhang Ruyun Zhang Lu Liu

Obfuscation

Code (set theory)

10.1007/978-981-16-7502-7_10

Citations (1)

Aspects of Intermediate Level Obfuscation

Dmitriy Dunaev László Lengyel

The aim of obfuscation in general is to prevent malicious users from disclosing properties of the original source program. This goal can be achieved by intermediate level obfuscation, which operates on target-platform-independent intermediate code. In this paper, we discuss general approaches to an intermediate level obfuscation algorithm, pointing out problems and proposing solutions. The paper discusses aspects of intermediate level obfuscation such as input data analysis, mixing of contexts, and external function calls. The focus is on developing an optimization-resistant intermediate level obfuscation algorithm that can reliably protect routines from unauthorized analysis and modification.

Obfuscation

Code (set theory)

10.1109/ecbs-eerc.2013.25

Citations (3)

Malware Obfuscation Techniques: A Brief Survey

Ilsun You Kangbin Yim

Because obfuscation is widely used by malware writers to evade antivirus scanners, it is important to analyze how this technique is applied to malware. This paper explores malware obfuscation techniques and reviews encrypted, oligomorphic, polymorphic and metamorphic malware, all of which are able to avoid detection. We also discuss future trends in malware obfuscation techniques.

Obfuscation

Cryptovirology

Malware analysis

Ransomware

10.1109/bwcca.2010.85

Citations (533)

X9: An Obfuscation Resilient Approach for Source Code Plagiarism Detection in Virtual Learning Environments

Bruno Prado Kalil Bispo Raul Andrade

Obfuscation

Plagiarism detection

Code (set theory)

10.5220/0006668705170524

Citations (9)

An empirical approach for detecting program similarity and plagiarism within a university programming environment

Computers & Education (1987)

J. A. W. Faidhi Simon Robinson

Plagiarism detection

Similarity (geometry)

Empirical Research

10.1016/0360-1315(87)90042-x

Citations (193)

The struggle with academic plagiarism: Approaches based on semantic similarity

Tedo Vrbanec Ana Meštrović

Academic plagiarism is a serious problem. Because digital information sources are practically inexhaustible, it is easier to plagiarize today than ever before. Fortunately, plagiarism detection techniques have improved and are now powerful enough to detect plagiarism attempts in education. Efficient plagiarism detection software, such as Turnitin, iThenticate or SafeAssign, is already in use. In the introduction we explore the software used within the Croatian academic community for plagiarism detection in universities and/or scientific journals. The question is: is this enough? Current software has proven successful; however, the problem of identifying paraphrasing or obfuscation plagiarism remains unresolved. In this paper we report on how semantic similarity measures can be used in the plagiarism detection task.

Plagiarism detection

Obfuscation

Similarity (geometry)

Academic community

10.23919/mipro.2017.7973544

Citations (19)

Overcoming the obfuscation method of the dynamic name resolution

Naruaki Otsuki Haruaki Tamada

Using unevaluated obfuscation methods carries significant risk, since the methods might have vulnerabilities. One way to evaluate an obfuscation method is de-obfuscation, which discloses the information hidden by the obfuscation. This paper proposes a de-obfuscation method against the DNR (dynamic name resolution) obfuscation method. DNR hides system-defined names by encrypting them and resolves the names dynamically at runtime. This paper clarifies the steps of de-obfuscation and proposes static and dynamic approaches to de-obfuscating DNR. In a case study, both approaches succeed in disclosing the information hidden by DNR.
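
The idea of hiding a system-defined name and resolving it only at runtime can be illustrated with a minimal Python sketch. It is not the authors' implementation: the XOR-plus-Base64 encoding is a toy stand-in for real encryption, `getattr` stands in for whatever dynamic lookup the target platform offers, and the names `encode_name`, `resolve`, `KEY` and `HIDDEN_SQRT` are hypothetical.

```python
import base64
import math


def encode_name(name: str, key: int) -> str:
    """Hide a name with XOR + Base64 (toy stand-in for real encryption)."""
    data = bytes(b ^ key for b in name.encode("utf-8"))
    return base64.b64encode(data).decode("ascii")


def resolve(obj, hidden: str, key: int):
    """Decode the hidden name at runtime and look it up dynamically."""
    raw = base64.b64decode(hidden)
    name = bytes(b ^ key for b in raw).decode("utf-8")
    return getattr(obj, name)


# Hypothetical single-byte key; an obfuscator would precompute HIDDEN_SQRT
# so that the plain name "sqrt" never appears in the shipped code.
KEY = 0x5A
HIDDEN_SQRT = encode_name("sqrt", KEY)

# The call site only shows the encoded string, not the function name.
result = resolve(math, HIDDEN_SQRT, KEY)(16.0)
```

The sketch also hints at why de-obfuscation is feasible: anyone who can run or trace `resolve` can observe the decoded name, which loosely corresponds to the dynamic approach mentioned in the abstract.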

Obfuscation

10.1145/3520084.3520103

Citations (0)

COAT: Code Obfuscation Tool to Evaluate the Performance of Code Plagiarism Detection Tools

Sang-Jun Ko Jusop Choi Hyoungshick Kim

Many plagiarism detection tools uncover plagiarized code by analyzing the similarity of source code. To measure how reliable those tools are, we developed the Code ObfuscAtion Tool (COAT), which takes program source code as input and produces source code that is functionally equivalent to the input but has a different structure. In COAT, we considered eight representative obfuscation techniques (e.g., modifying control flow or inserting dummy code) to test the performance of source code plagiarism detection tools. To show the practicality of COAT, we gathered 69 source code samples and tested them with four popular source code plagiarism detection tools (Moss, JPlag, SIM and Sherlock). In these experiments, the similarity scores between the original source code and its obfuscated plagiarized versions were very low: the mean similarity scores ranged only from 4.00 to 16.20, where the maximum possible score is 100. These results demonstrate that all the tested tools have clear limitations in detecting plagiarized code generated with combined code obfuscation techniques.

Obfuscation

Plagiarism detection

Code (set theory)

Similarity (geometry)

10.1109/icssa.2017.29

Citations (8)

ANALYSIS WINNOWING ALGORITHM FOR TEXT PLAGIARISM DETECTION USING THREE METHOD SIMILARITY

Proxies Jurnal Informatika (2021)

Luke Michael Febriansyah Shinta Estri Wahyuningrum

Plagiarism has been a recurring issue in recent years. To address it, this research builds a system to detect similarity in text and, as part of that, analyzes a plagiarism detection algorithm. Specifically, it analyzes the accuracy of the winnowing algorithm, a plagiarism detection algorithm based on document fingerprinting. To calculate the percentage similarity of document fingerprints, the research uses three similarity measures: the Jaccard similarity coefficient, the Sørensen-Dice similarity coefficient, and the berg similarity coefficient.
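
A minimal Python sketch of winnowing-style fingerprinting with two of the named measures may help make the abstract concrete. It is an illustration under stated assumptions, not the paper's system: the k-gram length `k`, the window size `w`, the CRC32 hash and the normalization step are all choices made here, and the third (berg) coefficient is left out because its definition is not given in the abstract.

```python
import zlib


def normalize(text: str) -> str:
    """Lowercase the text and keep only letters and digits."""
    return "".join(ch.lower() for ch in text if ch.isalnum())


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-character substring with CRC32 (deterministic)."""
    return [zlib.crc32(text[i:i + k].encode("utf-8"))
            for i in range(len(text) - k + 1)]


def winnow(text: str, k: int = 5, w: int = 4) -> set[int]:
    """Return the set of minimum hashes over all windows of w k-gram hashes.

    Only hash values are kept (not positions), so tie-breaking between
    equal minima inside a window does not affect the result.
    """
    hashes = kgram_hashes(normalize(text), k)
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard coefficient: |A ∩ B| / |A ∪ B| (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: set[int], b: set[int]) -> float:
    """Sørensen-Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


doc_a = "Winnowing selects a subset of k-gram hashes as a document fingerprint."
doc_b = "Winnowing picks a subset of k-gram hashes as the document fingerprint."
fp_a, fp_b = winnow(doc_a), winnow(doc_b)
print(f"Jaccard: {jaccard(fp_a, fp_b):.2f}, Dice: {dice(fp_a, fp_b):.2f}")
```

Multiplying either coefficient by 100 gives a percentage similarity. For the same pair of fingerprint sets, the Sørensen-Dice value is never lower than the Jaccard value.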

Jaccard index

Similarity (geometry)

Plagiarism detection

Winnowing

Similarity measure

10.24167/proxies.v2i2.3208

Citations (1)

A note on the concept of obfuscation

Труды Института системного программирования РАН (Proceedings of the Institute for System Programming of the RAS) (2004)

N. P. Varnovsky

In this paper we address the issue of defining the security of program obfuscation. We argue that requirements for obfuscated programs may differ and depend on potential applications. Therefore, three distinct models are suggested for studying obfuscation: obfuscation for software protection, total obfuscation and constant hiding. We also introduce a definition of weak obfuscation based on the “grey-box” paradigm and show this weak form of obfuscation to be impossible.

Obfuscation

Citations (5)