Institutsseminar/2025-09-26-2

Session
Date: Friday, 26 September 2025
Time: 11:30 – 13:00 (duration: 90 min)
Location: Room 010 (Building 50.34)
Examiner:
Web conference:
Previous session: Friday, 19 September 2025
Next session: Friday, 26 September 2025

Talks

Enable Tracing Requirements and Source Code in Visual Studio Code
Speaker: Yifei Huang
Talk type: Bachelor's thesis
Advisor: Kevin Feichtinger
Language: English
Mode: In person
Abstract: Maintaining end-to-end traceability between natural-language requirements and source code is essential for comprehension, change impact analysis, and compliance, yet remains difficult in practice. This thesis presents a Visual Studio Code extension that brings documentation-to-code traceability into the developer workflow, together with a retrieval-augmented approach, TRAG, that combines deterministic vector search with lightweight large language model (LLM) verification. TRAG pre-processes a workspace by summarizing compilation units, embedding the summaries, and storing the vectors locally. At link time it splits documentation into sentences, optionally rewrites them for precision, retrieves the top-k candidate code summaries by cosine similarity, and asks an LLM to accept or reject each candidate using compact context; chain-of-thought prompting is optional.

We evaluate TRAG against the ArDoCode verification baseline on benchmarked systems with gold-standard links, reporting precision, recall, and F1. ArDoCode provides consistently high recall and the best F1 on JR, TM, and BBB, whereas TRAG improves precision and can exceed the baseline's F1 on MS and, slightly, on TS (e.g., best MS 0.208 with mistral-nemo-cot; TS 0.316 with gemma3-4b). Chain-of-thought prompting shows mixed effects, helping when evidence is compact but reducing recall otherwise. We discuss design choices, threats to validity, and practical operating points. Overall, retrieval-augmented verification is a viable complement to recall-oriented baselines: it raises precision when evidence is well separated.
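
The retrieval step described above can be illustrated with a short sketch. The code below is not taken from the thesis; it assumes vectors already produced by some embedding model and shows only the top-k cosine-similarity lookup over pre-computed summary vectors.

import numpy as np

def top_k_candidates(sentence_vec, summary_vecs, k=5):
    # Cosine similarity equals the dot product of L2-normalized vectors.
    s = sentence_vec / np.linalg.norm(sentence_vec)
    m = summary_vecs / np.linalg.norm(summary_vecs, axis=1, keepdims=True)
    scores = m @ s
    # Indices of the k most similar code summaries, best first.
    top = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in top]

Each returned candidate would then be passed to an LLM together with compact context for an accept-or-reject decision, as the abstract describes.
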
Incorporation of Text Content Similarity into Token-based Source Code Plagiarism Detection
Speaker: Moritz Rimpf
Talk type: Bachelor's thesis
Advisor: Robin Maisch
Language: German
Mode: In person
Abstract: State-of-the-art source-code plagiarism detection systems often discard the textual content of source code during detection. However, recent developments have shown that possible cases of plagiarism often contain similar or identical textual content, such as inline or documentation comments. These comments could be used to find cases of plagiarism more easily or to further aid instructors during the manual review of suspicious submissions. Therefore, in this thesis, we enhance plagiarism detection engines by reintroducing textual content, in the form of source code comments, into the plagiarism detection process.

To process comments during plagiarism detection, we introduce a three-step comment processing pipeline that extracts comments, compares them, and merges the results with the plagiarism detection outcome. Furthermore, we compare three different algorithms for matching comments and examine the impact of natural-language preprocessing on the comments within this pipeline.
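
As an illustration of the first two pipeline steps, the sketch below extracts Java-style line and block comments with a regular expression and scores comment pairs with a simple sequence-matcher ratio. It is not the implementation from the thesis; the normalization and the matching algorithms evaluated there may differ.

import re
from difflib import SequenceMatcher

COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)  # line and block comments

def extract_comments(source):
    # Step 1: pull comments out of the source text and normalize whitespace.
    raw = COMMENT_RE.findall(source)
    cleaned = (re.sub(r"^//|^/\*|\*/$", "", c).strip() for c in raw)
    return [" ".join(c.split()) for c in cleaned if c]

def comment_similarity(a, b):
    # Step 2: compare two comments; 1.0 means textually identical.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Invented toy submissions; step 3 would merge these scores with the
# token-based detection result.
left = extract_comments("// reads the config file\nint x = 0; /* counter */")
right = extract_comments("// Reads the config file.\nint y; /* loop counter */")
scores = [(a, b, comment_similarity(a, b)) for a in left for b in right]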

LLM-based Code Generation for Model Transformation Languages
Speaker: Lukas Schroth
Talk type: Bachelor's thesis
Advisor: Bowen Jiang
Language: English
Mode: In person
Abstract: Domain-specific model transformation languages such as the Reactions Language or ATL are vital in model-driven software development but remain largely absent from the training data of large language models (LLMs). As a result, generated code often contains severe syntactic and semantic errors. This thesis presents an evaluation pipeline, implemented in n8n and Docker, that systematically assesses LLM output for such languages. Metrics include syntax validity via parsing, syntactic closeness using ChrF, and semantic correctness through unit tests. A baseline across multiple LLMs (GPT, Claude, Gemini) is established, and improvement strategies such as grammar prompting, few-shot prompting, and auxiliary descriptions of methods and variables are investigated. The pipeline enables reproducible experiments across languages and strategies. Results show that structured prompting and contextual aids can substantially increase correctness compared to baseline generation.
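
The syntactic-closeness metric mentioned above can be computed, for instance, with the sacrebleu library; the snippet below is a generic illustration rather than the thesis pipeline, and the transformation-rule strings are invented.

from sacrebleu.metrics import CHRF

chrf = CHRF()  # character n-gram F-score (ChrF)

# Invented example: an LLM-generated ATL rule compared against a reference solution.
generated = "rule Class2Table { from c : UML!Class to t : DB!Table (name <- c.name) }"
reference = "rule Class2Table { from c : UML!Class to t : DB!Table (name <- c.name, owner <- c.package) }"

score = chrf.sentence_score(generated, [reference])
print(f"ChrF = {score.score:.1f}")  # 0-100; higher means closer to the reference

Syntax validity and semantic correctness would, as described, additionally be checked by parsing the generated code and by running unit tests against it.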

Notes