Multi-Language and Cross-Language Software Plagiarism Detection

From SDQ-Institutsseminar
Speaker Alexander Milster
Talk type Bachelor's thesis
Advisor Robin Maisch
Date Fri, 28 March 2025, 02:03, Room 348 (Building 50.34)
Talk language German
Talk mode In person
Abstract Currently, commonly used plagiarism detection tools can handle code from only one language in a single run.

This thesis addresses two sub-problems. The first is multi-language plagiarism detection: parsing and comparing the code of each language that occurs in a single submission set separately. The second is cross-language plagiarism detection: comparing submissions as a whole, even though they contain code from multiple languages. For multi-language plagiarism detection, we propose concatenating the token lists. For cross-language plagiarism detection, we propose a set of language-agnostic tokens and rules for the order in which they should be extracted, both of which have to be implemented for each supported language. In addition, we consider a dynamic approach that allows more flexible matching of tokens.
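
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the implementation from the thesis: the function names, the token names, the `SEPARATOR` token, and the choice to order languages alphabetically are all assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical boundary marker between languages; an assumption, not from the thesis.
SEPARATOR = "<LANG_BOUNDARY>"

# Hypothetical mapping from language-specific tokens to language-agnostic ones.
AGNOSTIC: Dict[str, Dict[str, str]] = {
    "java": {"METHOD_DECL": "FUNCTION", "FOR": "LOOP", "IF": "BRANCH"},
    "python": {"DEF": "FUNCTION", "FOR": "LOOP", "IF": "BRANCH"},
}


def concatenate_token_lists(tokens_by_language: Dict[str, List[str]]) -> List[str]:
    """Multi-language idea: join the per-language token lists of one
    submission into a single list, using a fixed (alphabetical) language order."""
    combined: List[str] = []
    for language in sorted(tokens_by_language):
        if combined:
            combined.append(SEPARATOR)
        combined.extend(tokens_by_language[language])
    return combined


def to_agnostic(language: str, tokens: List[str]) -> List[str]:
    """Cross-language idea: translate language-specific tokens into a shared
    vocabulary, dropping tokens that have no agnostic counterpart."""
    mapping = AGNOSTIC[language]
    return [mapping[token] for token in tokens if token in mapping]


if __name__ == "__main__":
    submission = {"python": ["DEF", "IF"], "java": ["METHOD_DECL", "FOR"]}
    print(concatenate_token_lists(submission))
    print(to_agnostic("java", ["METHOD_DECL", "IF"]))
    print(to_agnostic("python", ["DEF", "IF"]))
```

In this sketch, a Java method with a branch and a Python function with a branch map to the same agnostic token sequence, which is what would make a comparison across languages possible. How the thesis actually defines its tokens and extraction order is not stated in the abstract.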