Software Plagiarism Detection on Intermediate Representation

Speaker	Niklas Heneka
Talk type	Bachelor's thesis
Advisor	Timur Sağlam
Date	Fri, November 17, 2023, 11:30 (Room 237, Building 50.34)
Talk language
Talk mode	in person
Abstract	Source code plagiarism is a widespread problem in computer science education. To counteract this, software plagiarism detectors can help identify plagiarized code. Most state-of-the-art plagiarism detectors are token-based, and supporting a new programming language usually means designing and implementing a new dedicated language module. This process can be time-consuming; furthermore, it is unclear whether it is even necessary. In this thesis, we evaluate the necessity of dedicated language modules for Java and C/C++ and derive conclusions for designing new ones. To achieve this, we create a language module for the intermediate representation of LLVM. For the evaluation, we compare it to two existing dedicated language modules in JPlag. While our results show that dedicated language modules are better for plagiarism detection, language modules for intermediate representations show better resilience to obfuscation attacks.