Software Plagiarism Detection on Intermediate Representation: Difference between revisions
Latest revision as of 30 October 2023, 08:47
Speaker | Niklas Heneka | |
---|---|---|
Talk type | Bachelor's thesis | |
Advisor | Timur Sağlam | |
Date | Fri, 17 November 2023 | |
Talk language | ||
Talk mode | in person | |
Abstract | Source code plagiarism is a widespread problem in computer science education. To counteract this, software plagiarism detectors can help identify plagiarized code. Most state-of-the-art plagiarism detectors are token-based. Supporting a new programming language commonly means designing and implementing a new dedicated language module. This process can be time-consuming; furthermore, it is unclear whether it is even necessary. In this thesis, we evaluate the necessity of dedicated language modules for Java and C/C++ and derive conclusions for designing new ones. To achieve this, we create a language module for the intermediate representation of LLVM. For the evaluation, we compare it to two existing dedicated language modules in JPlag. While our results show that dedicated language modules are better for plagiarism detection, language modules for intermediate representations show better resilience to obfuscation attacks. |