Fine-tuning vs. Prompting: The Case of Traceability Link Recovery

Type: Bachelor's thesis
Posting: Aushang Fine tuning vs Prompting TLR.pdf
Supervisor: If you are interested or have any questions, please contact:

Tobias Hey (e-mail: hey@kit.edu, phone: +49-721-608-44765)

Motivation

Throughout the development and maintenance of a software system, various artifacts are created. These artifacts, such as requirements, diagrams, models, or the source code itself, are by no means independent; they often represent refinements or implementations of the content of other artifacts. The connections between such artifacts, called trace links, are essential for a complete understanding of the system. The rise of large language models (LLMs) has opened up new possibilities for automatically recovering these trace links. Both fine-tuning and prompting techniques have been applied to the task. However, fine-tuning requires a certain amount of labeled training data, whereas the results of prompt-based approaches are still insufficient for use in practice.
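
To make the prompting side concrete, the following is a minimal sketch of prompt-based TLR framed as a yes/no question over artifact pairs, assuming an OpenAI-style chat API; the model name, prompt wording, and the is_traced helper are illustrative assumptions, not part of this topic.

    from openai import OpenAI

    client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

    def is_traced(requirement: str, code: str, model: str = "gpt-4o") -> bool:
        """Ask the LLM whether a code artifact implements (part of) a requirement."""
        prompt = (
            "Below are two software artifacts. Does the code implement "
            "(part of) the requirement? Answer only 'yes' or 'no'.\n\n"
            f"Requirement:\n{requirement}\n\nCode:\n{code}"
        )
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # deterministic answers for reproducible evaluation
        )
        return response.choices[0].message.content.strip().lower().startswith("yes")

In this framing, each candidate trace link is confirmed or rejected per artifact pair; more elaborate prompting techniques (e.g., few-shot examples or chain-of-thought) refine the same basic setup without requiring any labeled training data.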

Task Description

The goal of this work is to investigate the trade-off between fine-tuning a large language model (LLM) and prompting it for the task of traceability link recovery (TLR): how much training data is required for fine-tuning to achieve results similar to or better than those of prompting techniques? The focus of this work is an empirical evaluation using existing methods and datasets. By comparing different amounts of training data and different prompting techniques on established benchmark datasets, we aim to provide insights, and potentially even guidelines, on when to use which method.
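
For the fine-tuning side, the evaluation can be sketched as training one classifier per labeled-data budget, assuming a BERT-style cross-encoder over (requirement, code) pairs built with the Hugging Face transformers and datasets libraries; the dataset columns, model name, and hyperparameters below are illustrative assumptions rather than a prescribed setup.

    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    def fine_tune(train_dataset, fraction, model_name="bert-base-uncased"):
        """Fine-tune a binary trace-link classifier on a fraction of the labeled data.

        `train_dataset` is assumed to be a Hugging Face `datasets.Dataset`
        with 'requirement', 'code', and 'label' columns.
        """
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForSequenceClassification.from_pretrained(
            model_name, num_labels=2)

        # Subsample the labeled data to simulate a smaller annotation budget.
        subset = train_dataset.shuffle(seed=42)
        subset = subset.select(range(int(len(subset) * fraction)))

        # Encode each (requirement, code) pair as a single cross-encoder input.
        subset = subset.map(
            lambda batch: tokenizer(batch["requirement"], batch["code"],
                                    truncation=True, padding="max_length"),
            batched=True)

        args = TrainingArguments(output_dir=f"tlr-ft-{fraction}",
                                 num_train_epochs=3,
                                 per_device_train_batch_size=16)
        Trainer(model=model, args=args, train_dataset=subset).train()
        return model

    # One fine-tuned model per training-data budget, each to be compared
    # against the prompting baseline on the same test split:
    # for fraction in (0.01, 0.05, 0.1, 0.25, 0.5, 1.0):
    #     fine_tune(train_dataset, fraction)

Evaluating each budget's model and the prompting baseline on the same held-out links then yields the kind of data-requirement curve the thesis question asks about.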