Preventing Refactoring Attacks on Software Plagiarism Detection through Graph-Based Structural Normalization

From SDQ-Institutsseminar
Speaker Robin Maisch
Talk type Master's thesis
Advisor Timur Sağlam
Date Fri, June 7, 2024
Talk language
Talk mode in person
Abstract Detecting software plagiarism among a set of student code submissions remains a challenge. Plagiarists often obfuscate their work by modifying it just enough to avoid detection while preserving the code’s runtime behavior, so that the result is as valid a solution as the original. This type of modification is commonly known as refactoring. State-of-the-art plagiarism detectors are token-based; they are immune to some types of refactoring by design, whereas other types produce plagiarism that effectively evades detection.

This thesis presents a novel approach that uses refactorings as a means to normalize the structure of code submissions. The normalized structure is not affected by refactoring attacks. The normalization engine, implemented as a transformation system for code graphs, was integrated into a token-based plagiarism detection tool. We evaluate our approach on four relevant types of obfuscation attack schemes. From the results, we conclude that the approach is on par with the state of the art in its efficacy against all attack schemes and outperforms it by a large margin on combined refactoring attacks.
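
The idea of normalizing before token comparison can be illustrated with a toy sketch. It is not the thesis's implementation: it works on Python's built-in `ast` syntax trees rather than on code graphs, and the function names, the example programs, and the single normalization rule (dropping constant assignments to variables that are never read) are all assumptions chosen for illustration.

```python
import ast

ORIGINAL = """
def total(values):
    result = 0
    for v in values:
        result = result + v
    return result
"""

# Hypothetical plagiarized variant: identifiers renamed only.
RENAMED = """
def summe(xs):
    acc = 0
    for x in xs:
        acc = acc + x
    return acc
"""

# Hypothetical plagiarized variant: renamed plus an inserted dead statement.
OBFUSCATED = """
def summe(xs):
    unused = 42
    acc = 0
    for x in xs:
        acc = acc + x
    return acc
"""


def token_sequence(tree: ast.AST) -> list[str]:
    """Abstract a program into syntax node type names; identifiers are dropped."""
    return [type(node).__name__ for node in ast.walk(tree)
            if isinstance(node, (ast.stmt, ast.expr))]


class DropUnusedAssignments(ast.NodeTransformer):
    """Toy rule: remove `name = <constant>` when `name` is never read."""

    def __init__(self, used_names: set[str]):
        self.used_names = used_names

    def visit_Assign(self, node: ast.Assign):
        dead = all(isinstance(t, ast.Name) and t.id not in self.used_names
                   for t in node.targets)
        # Only constants are removed, so no side effect can be lost.
        if dead and isinstance(node.value, ast.Constant):
            return None  # returning None deletes the statement
        return node


def normalize(tree: ast.AST) -> ast.AST:
    """Apply the toy normalization rule once (no fixpoint iteration)."""
    used = {n.id for n in ast.walk(tree)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)}
    return DropUnusedAssignments(used).visit(tree)


def same_tokens(a: str, b: str, normalized: bool = False) -> bool:
    """Compare two programs by token sequence, optionally after normalizing."""
    trees = [ast.parse(a), ast.parse(b)]
    if normalized:
        trees = [normalize(t) for t in trees]
    return token_sequence(trees[0]) == token_sequence(trees[1])


print(same_tokens(ORIGINAL, RENAMED))                      # True
print(same_tokens(ORIGINAL, OBFUSCATED))                   # False
print(same_tokens(ORIGINAL, OBFUSCATED, normalized=True))  # True
```

In this sketch, renaming alone does not change the token sequence, the inserted dead statement does, and normalizing both programs removes the difference again. A real normalization engine would need many more rules and a sound analysis of what counts as dead code.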