Software Plagiarism Detection on Intermediate Representation

Speaker

Talk type

Bachelor's thesis

Advisor

Date

[[Institutsseminar/2023-11-17-2|

	Event date	Event room
Institutsseminar/2023-11-17-2	Fri, 17 November 2023, 11:11	Room 237 (Building 50.34)

]]

Talk language

Talk mode

in person

Abstract

Source code plagiarism is a widespread problem in computer science education. Software plagiarism detectors help counteract it by identifying plagiarized code. Most state-of-the-art plagiarism detectors are token-based, and supporting a new programming language usually means designing and implementing a new dedicated language module. This process can be time-consuming, and it is unclear whether it is even necessary. In this thesis, we evaluate the necessity of dedicated language modules for Java and C/C++ and derive conclusions for designing new ones. To this end, we create a language module for the intermediate representation of LLVM. For the evaluation, we compare it to two existing dedicated language modules in JPlag. Our results show that dedicated language modules are better for plagiarism detection, whereas language modules for intermediate representations are more resilient to obfuscation attacks.