Häufigkeitsbasierte Erhebung bemerkenswerter Übereinstimmungen bei Quelltext-Plagiatserkennung (Frequency-Based Identification of Notable Matches in Source Code Plagiarism Detection)
Revision as of 25 August 2025, 09:31
| Speaker | Elisabeth Hermann |
|---|---|
| Talk type | Bachelor's thesis |
| Advisor | Robin Maisch |
| Date | Fri, 19 September 2025, 10:30 (Room 010 (Building 50.34)) |
| Talk language | German |
| Talk mode | In person |
| Abstract | Determining whether programming submissions for the same task were written independently or copied from one another is challenging. Plagiarism detection programs make this easier: they compare submissions and identify matching sections between two submissions. To date, however, they do not take into account whether an identical section appears in more than two submissions. We assume that a match occurring in only a few submissions increases the probability of plagiarism, while a match occurring in many submissions decreases it. We therefore count the frequency of each match across all comparisons. We integrate this approach into the token-based plagiarism detector JPlag and examine how different strategies for determining and weighting the frequency distribution of matches can better separate plagiarism from inconspicuous matches. The weighting is incorporated into the similarity score, which assesses the similarity between two submissions. The results show that this approach can improve plagiarism detection. |
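
The idea in the abstract, counting how often a match occurs across all comparisons and letting rare matches weigh more in the similarity score, can be illustrated with a small sketch. This is not JPlag's API and not necessarily one of the strategies evaluated in the thesis. The function names, the linear rarity weight, the representation of a match as a tuple of token names, and the coverage formula `2 * matched / (len_a + len_b)` are all assumptions made for illustration.

```python
from collections import defaultdict


def count_match_frequencies(comparisons):
    """Count, for every matched token sequence, how many distinct submissions it occurs in.

    `comparisons` maps a pair (sub_a, sub_b) to the list of matched token
    tuples found between the two submissions (hypothetical input format).
    """
    submissions_per_match = defaultdict(set)
    for (sub_a, sub_b), matches in comparisons.items():
        for tokens in matches:
            submissions_per_match[tokens].update((sub_a, sub_b))
    return {tokens: len(subs) for tokens, subs in submissions_per_match.items()}


def rarity_weight(frequency, total_submissions):
    """Assumed linear weight: 1.0 if only two submissions share the match,
    0.0 if every submission contains it."""
    if total_submissions <= 2:
        return 1.0
    return 1.0 - (frequency - 2) / (total_submissions - 2)


def weighted_similarity(matches, frequencies, total_submissions, len_a, len_b):
    """Token coverage of both submissions, with each match scaled by its rarity weight."""
    weighted_tokens = sum(
        len(tokens) * rarity_weight(frequencies[tokens], total_submissions)
        for tokens in matches
    )
    return 2 * weighted_tokens / (len_a + len_b)


# Toy data: four submissions of 10 tokens each. Every pair shares a common
# three-token section; only A and B share an additional four-token section.
common = ("CLASS", "METHOD", "RETURN")
rare = ("LOOP", "IF", "ASSIGN", "CALL")
comparisons = {
    ("A", "B"): [common, rare],
    ("A", "C"): [common],
    ("A", "D"): [common],
    ("B", "C"): [common],
    ("B", "D"): [common],
    ("C", "D"): [common],
}

frequencies = count_match_frequencies(comparisons)
score_ab = weighted_similarity(comparisons[("A", "B")], frequencies, 4, 10, 10)
score_ac = weighted_similarity(comparisons[("A", "C")], frequencies, 4, 10, 10)
```

In this toy data the section shared by all four submissions receives weight 0, so the pair A and C, which shares nothing else, drops to a score of 0, while A and B keep the contribution of the section that only they share. A gentler weighting, for example one with a lower bound above 0, would be an equally plausible choice.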