SDQ-Institutsseminar
The SDQ-Institutsseminar is a permanent course whose purpose is to report on current research in the research groups belonging to Software Design and Quality (SDQ) within KASTEL – Institute of Information Security and Dependability. In particular, it gives students the opportunity to present their Bachelor's and Master's theses to a larger audience. The focus is on the problem statement, the solution approaches, and the results obtained. The seminar is open to all students and staff of KIT as well as anyone else who is interested.
On request, the seminar is held in hybrid mode with participants joining online. Please let the advisors know in good time so that they can set up the necessary equipment.
| Location | Building 50.34, seminar room 010, or online (see description) |
|---|---|
| Time | Fridays, 14:00–15:30 |
Talks must keep to the following time limits (see also the SDQ-Wiki):
- Master's thesis: 30 minutes talk + 15 minutes discussion
- Bachelor's thesis: 20 minutes talk + 10 minutes discussion
Further information
Upcoming talks
Import the dates into your calendar: iCal (download)
Friday, 13 March 2026, 11:30
iCal (download)
Location: Room 010 (Building 50.34)
Web conference: https://sdq.kastel.kit.edu/institutsseminar/Microsoft_Teams
| Speaker | Philipp Meyer |
|---|---|
| Talk type | Bachelor's thesis |
| Advisor | Raziyeh Dehghani |
| Language | English |
| Mode | in person |
| Abstract | The development of Cyber-Physical Systems (CPS) is characterized by a high degree of complexity and requires continuous optimization throughout the entire development process. The feedback cycles of the MODA framework are well suited to systematically controlling these adjustments, but their effective use requires that descriptive models can be derived from runtime data. Established approaches to model derivation, however, were primarily designed for other domains and applications. Against this background, this work develops an automated pipeline to extract descriptive models from raw data and systematically evaluates the suitability of various modeling approaches for the domain of cyber-physical systems. A central element of the solution is the integration of the analysis results into a standardized metamodel based on the Structured Metrics Metamodel, which gives the raw data a semantic structure and ensures interoperability with downstream MDD tools. To evaluate the results objectively, a dedicated evaluation framework was developed that compares the approaches using quantitative metrics and qualitative expert feedback. The evaluation confirms that the automated derivation of statistical parameters, segmentations, and discrete system states delivers robust results. In contrast, limitations were identified in the generation of complex process models using process mining, as the conversion of continuous physical signals into discrete logic remains a challenge. Overall, the work demonstrates as a proof of concept how the gap between collected runtime data and formal models can be closed, thus providing a technological basis for MODA feedback cycles in CPS development. |
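To make the "statistical parameters, segmentations, and discrete system states" concrete: a minimal sketch of such a derivation step for a single runtime signal. The threshold, the state names, and the plain-dict output are invented for illustration; the actual pipeline targets the Structured Metrics Metamodel rather than Python dictionaries.

```python
import numpy as np

def derive_descriptive_model(signal: np.ndarray, threshold: float) -> dict:
    """Derive simple descriptive parameters and discrete states from a
    univariate runtime signal (hypothetical simplification of the pipeline)."""
    # Statistical parameters of the raw signal
    params = {"mean": float(signal.mean()), "std": float(signal.std())}
    # Discretize the continuous signal into two hypothetical system states
    states = np.where(signal > threshold, "active", "idle")
    # Segment boundaries: indices where the discrete state changes
    boundaries = np.nonzero(states[1:] != states[:-1])[0] + 1
    return {"parameters": params, "state_changes": boundaries.tolist()}

signal = np.array([0.1, 0.2, 1.5, 1.7, 0.3, 0.2])
print(derive_descriptive_model(signal, threshold=1.0))
```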
| Vortragende(r) | Thomas Heinen |
|---|---|
| Vortragstyp | Masterarbeit |
| Betreuer(in) | Lars König |
| Vortragssprache | Deutsch |
| Vortragsmodus | in Präsenz |
| Kurzfassung | Modellgetriebene Entwicklung erlaubt es, Abstraktionen zu erstellen, und macht damit komplexe Domänenlogik beherrschbar. Textuelle Sprachen bieten Softwareentwicklern eine intuitive Möglichkeit, Modelle zu beschreiben und in ihren Workflow zu integrieren: Der bevorzugte Texteditor kann regulär weiterverwendet werden, und durch vertraute Versionskontrollsysteme wie Git kann wie gewohnt mit anderen Entwicklern zusammengearbeitet werden. Generische textuelle Sprachen sind häufig nicht ausdrucksstark genug und somit nur schwer verwendbar für Menschen. Für jedes verwendete Metamodell eine eigene domänenspezifische Sprache (DSL) zu erstellen, behebt dieses Problem zwar, erfordert jedoch einen erheblichen Aufwand. In dieser Arbeit präsentieren wir einen Ansatz, der es Sprachdesignern und Metamodell-Experten erlaubt, eine gegebene generische textuelle Sprache um metamodellspezifische Sprachkonstrukte zu erweitern. Diese Konstrukte machen die Sprache deutlich verständlicher für Entwickler, ähnlich wie eine DSL, benötigen aber einen deutlich geringeren Wartungsaufwand. Wir testen unseren Ansatz mithilfe einer von Vector Informatik entwickelten DSL. Unsere Evaluierung zeigt, dass unser Ansatz mächtig genug ist, um alle metamodellspezifischen Sprachkonstrukte aus der besagten DSL abzubilden. |
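One way to picture the extension mechanism is as a rewrite from a metamodel-specific shorthand into the generic base notation. A minimal sketch under that assumption; both the shorthand and the generic syntax below are invented for illustration and are not the thesis's actual grammars.

```python
import re

# Hypothetical rewrite rule: the metamodel-specific shorthand
# "component Foo provides Bar" expands into a generic
# instance-creation notation of the base language.
RULE = re.compile(r"component (\w+) provides (\w+)")

def expand(line: str) -> str:
    # Substitute the shorthand with invented generic "new ... { ... }" syntax
    return RULE.sub(
        r'new BasicComponent { name="\1" providedRoles=[new ProvidedRole { interface=\2 }] }',
        line,
    )

print(expand("component Cache provides ICache"))
```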
Friday, 13 March 2026, 14:00
iCal (download)
Location: Room 010 (Building 50.34)
| Speaker | Jonas Bruer |
|---|---|
| Talk type | Bachelor's thesis |
| Advisor | Maximilian Hummel |
| Language | German |
| Mode | in person |
| Abstract | Bridging MQTT-based IoT communication with Apache Kafka enables scalable data streaming but introduces additional processing stages that may become performance bottlenecks. Existing benchmarks mainly evaluate MQTT brokers in isolation or rely on black-box end-to-end measurements, offering limited insight into internal pipeline behavior. This thesis presents a reproducible profiling framework for an MQTT-to-Kafka pipeline that combines benchmarking with white-box instrumentation of internal components. The framework models atomic Entry Level System Calls (ELSCs) and composes them into configurable workload classes, enabling automated and systematic performance experiments. The implementation is based on EMQX with integrated Kafka bridging and distributed tracing across protocol boundaries. An evaluation following a Goal-Question-Metric approach demonstrates that the framework supports reproducible experiments, preserves trace continuity across services, and enables the identification of internal bottlenecks while maintaining controlled instrumentation overhead. |
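For intuition about what a probe through such a bridge measures: a minimal sketch assuming paho-mqtt, kafka-python, and an EMQX broker already configured to bridge the MQTT topic into Kafka. The topic names and the timestamp payload format are invented; the thesis's framework measures per-stage latencies via tracing rather than a single end-to-end probe.

```python
import json, time

import paho.mqtt.client as mqtt
from kafka import KafkaConsumer

# Publish a timestamped probe over MQTT (paho-mqtt 1.x style constructor;
# 2.x additionally expects a callback API version argument).
pub = mqtt.Client()
pub.connect("localhost", 1883)
pub.publish("probe/latency", json.dumps({"sent_at": time.time()}))

# Read the bridged message from Kafka and compute the bridge latency.
consumer = KafkaConsumer("probe-latency", bootstrap_servers="localhost:9092",
                         auto_offset_reset="earliest", consumer_timeout_ms=10000)
for record in consumer:
    sent_at = json.loads(record.value)["sent_at"]
    print(f"MQTT-to-Kafka latency: {time.time() - sent_at:.4f}s")
    break
```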
| Speaker | Max Oesterle |
|---|---|
| Talk type | Talk |
| Advisor | Nathan Hagel |
| Language | German |
| Mode | in person |
| Abstract | Modern software systems are increasingly developed using multiple heterogeneous metamodels. The Vitruvius framework addresses cross-metamodel consistency through its Reactions language, which defines operationally how consistency is restored after changes. However, Vitruvius currently lacks a declarative counterpart that specifies what must be consistent, independently of any restoration logic. Vitruvius OCL (VitruvOCL) fills this gap by extending OCL with a qualified cross-metamodel navigation syntax and a native correspondence operator that treats Vitruvius correspondence models as a first-class abstraction. VitruvOCL provides static type safety, native correspondence navigation, and true n-ary constraints within a purely declarative framework. |
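Since no VitruvOCL syntax is shown in the abstract, here is a rough Python analogue of what a declarative cross-metamodel constraint over a correspondence model expresses. The two model classes and the correspondence pairs are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class UmlClass:       # element of metamodel A (invented for illustration)
    name: str

@dataclass
class JavaClass:      # element of metamodel B (invented for illustration)
    simple_name: str

# Correspondence model: pairs of elements that represent each other.
correspondences = [(UmlClass("Order"), JavaClass("Order")),
                   (UmlClass("Customer"), JavaClass("Client"))]

# Declarative constraint: corresponding classes must agree on their name.
# It states *what* must hold, not *how* consistency is restored.
violations = [(u, j) for u, j in correspondences if u.name != j.simple_name]
print(violations)  # the Customer/Client pair violates the constraint
```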
Friday, 13 March 2026, 14:30
iCal (download)
Location: Room 131 (Building 50.34)
| Speaker | Ege Uzhan |
|---|---|
| Talk type | Master's thesis |
| Advisor | Jan Keim |
| Language | English |
| Mode | in person |
| Abstract | Documentation plays a crucial role in software engineering by supporting development, maintenance, and evolution. A structured form of documenting design decisions is Architectural Decision Records (ADRs), which concisely capture architectural choices. However, like most documentation, ADRs are often disconnected from related project artifacts. Traceability Link Recovery (TLR) aims to reconstruct such links automatically, yet it has not previously been applied specifically to ADRs, nor have benchmarks existed for this purpose. This work addresses that gap by applying established TLR approaches to ADRs and introducing the first gold-standard dataset linking ADRs to source code and software architecture documentation at multiple levels of granularity. Results show that ADR traceability is feasible, with file-level recovery yielding more stable precision-recall trade-offs than sentence-level recovery. Effectiveness depends on artifact type, granularity, and candidate selection, highlighting challenges and opportunities for improving ADR traceability. |
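Established TLR approaches typically rank candidate links by textual similarity. A minimal sketch of one such heuristic, TF-IDF cosine similarity; the ADR text and file contents are invented for illustration, and the thesis evaluates more than this single technique.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

adr = "Use PostgreSQL as the primary datastore for order persistence."
files = {
    "OrderRepository.java": "class OrderRepository { /* persists orders via PostgreSQL */ }",
    "PaymentGateway.java": "class PaymentGateway { /* talks to the payment provider */ }",
}

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([adr] + list(files.values()))
# Similarity of the ADR (row 0) to every candidate file
scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
for name, score in sorted(zip(files, scores), key=lambda x: -x[1]):
    print(f"{score:.2f}  {name}")
```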
Friday, 20 March 2026, 14:00
iCal (download)
Location: Room 010 (Building 50.34)
| Speaker | Benjamin Arp |
|---|---|
| Talk type | Master's thesis |
| Advisor | Nils Niehues |
| Language | English |
| Mode | in person |
| Abstract | Identifying confidentiality violations in software architectures at design time is a well-studied problem. Fixing them automatically is not. Frameworks such as STRIDE and LINDDUN use Data Flow Diagrams (DFDs) to expose threats at an early stage of the development lifecycle. But once a violation is detected, software architects must still manually determine and apply the appropriate countermeasures, which is both time-consuming and error-prone. This thesis addresses the repair side of the problem. The central research question is which discrete optimisation method is best suited to automate this task. To answer it without presupposing an outcome, we design a comparative survey of three candidate methods: Branch and Bound, Integer Linear Programming (ILP), and Evolutionary Algorithms. These are evaluated against four criteria: optimality, runtime performance, extensibility, and reproducibility. The survey establishes ILP as the only method satisfying all four, primarily because its declarative problem formulation separates constraint specification from solving. This separation proves essential when mitigation strategies must be extended or customised. Building on this result, we implement an ILP-based automated mitigation approach integrated into the ARCoViA framework. The approach operates on DFDs that are annotated with label-based confidentiality constraints. It then enumerates candidate mitigation strategies across the full space of label additions and deletions, node insertions and removals, as well as flow deletions. These strategies, along with their mutual dependencies and contradictions, are encoded into a Boolean ILP problem. The solver yields a minimal-cost repair, which is then applied to produce a repaired DFD automatically. The main engineering challenge is generating a complete and correct set of candidate mitigations and encoding their dependencies and contradictions without omissions. The prior SAT-based approach, which this work extends, is limited to purely additive label changes and struggles to express richer constraint structures in CNF form. The present approach removes both restrictions. We evaluate it against four goals: effectiveness, extensibility, cost, and scalability, using DFD models from the MicroSecEnD benchmark. The approach eliminates all detected violations, produces repairs that are approximately 73% less invasive than a human baseline, and scales acceptably across all studied dimensions of model complexity. |
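To illustrate the shape of such a Boolean ILP encoding: a minimal sketch using PuLP. The mitigation names, their costs, and the particular coverage and dependency constraints are invented for illustration, not taken from the thesis.

```python
from pulp import LpProblem, LpVariable, LpMinimize, LpBinary, lpSum, value

# Hypothetical candidate mitigations with repair costs.
costs = {"add_label_encrypted": 1, "remove_flow_logging": 3, "insert_gateway_node": 5}
x = {m: LpVariable(m, cat=LpBinary) for m in costs}  # 1 = apply mitigation

prob = LpProblem("dfd_repair", LpMinimize)
prob += lpSum(costs[m] * x[m] for m in costs)                     # minimal-cost repair
prob += x["add_label_encrypted"] + x["insert_gateway_node"] >= 1  # violation must be covered
prob += x["insert_gateway_node"] <= x["remove_flow_logging"]      # dependency between strategies
prob.solve()

print({m: int(value(x[m])) for m in costs})  # selected repair set
```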
| Speaker | David Inca |
|---|---|
| Talk type | Bachelor's thesis |
| Advisor | Julian Roßkothen |
| Language | German |
| Mode | in person |
| Abstract | Reliable retrieval of model-driven software engineering (MDSE) artifacts via semantic similarity is a core prerequisite for capable RAG systems in this domain. Since common embedding models are primarily optimized for unstructured text, this thesis investigates their effectiveness for structured, referential model artifacts. In a controlled benchmark setting, the influence of data preparation, serialization, and the choice of embedding model is evaluated systematically. The results show that the choice of embedding model has the most significant influence on retrieval quality. Dereferencing internal links proves particularly effective, whereas the choice of serialization approach plays only a marginal role. The study demonstrates that embedding-based approaches are fundamentally suitable for MDSE data and provides concrete recommendations for configuring efficient retrieval architectures. |
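The retrieval step being benchmarked reduces to nearest-neighbor search over embedding vectors. A minimal sketch assuming the embeddings are already computed; the toy two-dimensional vectors stand in for real embeddings of serialized model artifacts.

```python
import numpy as np

def cosine_top_k(query: np.ndarray, corpus: np.ndarray, k: int = 2) -> np.ndarray:
    # Normalize rows so the dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(-scores)[:k]  # indices of the k most similar artifacts

# Toy embeddings standing in for serialized model artifacts
corpus = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])
print(cosine_top_k(np.array([0.8, 0.2]), corpus))
```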
| Speaker | Fabio Freund |
|---|---|
| Talk type | Bachelor's thesis |
| Advisor | Maximilian Hummel |
| Language | German |
| Mode | in person |
| Abstract | Text-based modeling simplifies the creation of software architecture models, yet existing grammars are largely rooted in traditional PCM concepts. Modern cloud-native systems, built around containers, microservices, and Kubernetes-based workflows, do not align well with these abstractions. In addition, current modeling approaches lack an accessible, declarative syntax familiar to DevOps engineers who work with YAML-style configuration files. This thesis addresses this gap by extending an existing textual modeling language to better represent cloud-native patterns while introducing a concise, YAML-inspired syntax. The work includes analyzing and adapting the TPCM/Xtext grammar, designing user-friendly constructs aligned with real-world deployment descriptors, and implementing a transformation layer that maps the extended language to PCM models compatible with Palladio and SimuLizar. The result will improve the usability and relevance of performance simulation in cloud-native environments. |
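A minimal sketch of the YAML-inspired direction, assuming PyYAML: a declarative descriptor is parsed and mapped onto PCM-like concepts. Both the descriptor keys and the target element names are invented for illustration; the thesis implements this as an Xtext grammar and transformation, not as a Python script.

```python
import yaml

# Hypothetical YAML-inspired deployment descriptor for one microservice
descriptor = """
service: checkout
container:
  image: checkout:1.4
  replicas: 3
provides:
  - ICheckout
"""

doc = yaml.safe_load(descriptor)
# Map the declarative description onto PCM-like concepts (invented names)
pcm_component = {
    "BasicComponent": doc["service"],
    "ProvidedRoles": doc["provides"],
    "ResourceContainerReplicas": doc["container"]["replicas"],
}
print(pcm_component)
```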
Monday, 23 March 2026, 14:00
iCal (download)
Location: Room 348 (Building 50.34)
| Speaker | Marco Demartino |
|---|---|
| Talk type | Bachelor's thesis |
| Advisor | Vincenzo Scotti |
| Language | English |
| Mode | in person |
| Abstract | |
| Speaker | Robin Schöppner |
|---|---|
| Talk type | Master's thesis |
| Advisor | Erik Burger |
| Language | German |
| Mode | in person |
| Abstract | The automotive industry is in the midst of a shift from traditionally distributed systems of independent control units towards a more centralized E/E architecture. A prominent challenge in this transition is the integration of heterogeneous software components from diverse suppliers. Traditionally, this gap is bridged by manually written integration code, leading to tight coupling between domain logic and communication protocols, which makes system evolution difficult. This thesis presents a model-driven approach for generating digital twin code components directly from formal descriptions of in-vehicle components. Central to the approach is the generation of a stable core domain model that represents the vehicle state. Connections to physical hardware are abstracted via a generic interface and the COVESA Interface Exchange Framework (IFEX) as a standardized contract layer. By implementing the communication patterns once generically, the approach eliminates the need for manual integration code for individual components or new requirements. This centralized state representation further enables a simplified off-board cloud twin instance to be fully synchronized with the on-board twin. This architecture generically supports remote control and monitoring of any vehicle function, regardless of whether that functionality was originally designed for online connectivity. The approach is validated by implementing a prototypical code generation pipeline, deploying it on distributed virtual machines, and connecting mock ECUs using different Interface Definition Languages (gRPC, SOME/IP). The evaluation demonstrates that changing underlying communication protocols requires only a model update without modifying domain code. |
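A rough Python analogue of the generated core idea, a central state store that keeps an off-board mirror synchronized with the on-board twin. All class and signal names are invented; the actual approach generates such components from IFEX interface descriptions rather than writing them by hand.

```python
from typing import Callable

class TwinState:
    """Central vehicle state with observer-based synchronization (hypothetical)."""
    def __init__(self) -> None:
        self._state: dict[str, object] = {}
        self._subscribers: list[Callable[[str, object], None]] = []

    def subscribe(self, callback: Callable[[str, object], None]) -> None:
        self._subscribers.append(callback)

    def set(self, signal: str, value: object) -> None:
        self._state[signal] = value
        for notify in self._subscribers:  # push the change to every mirror
            notify(signal, value)

onboard, cloud = TwinState(), TwinState()
# The cloud twin mirrors every change of the on-board twin.
onboard.subscribe(lambda s, v: cloud.set(s, v))
onboard.set("cabin.temperature", 21.5)
print(cloud._state)  # {'cabin.temperature': 21.5}
```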
| Speaker | Leon Bruns |
|---|---|
| Talk type | Bachelor's thesis |
| Advisor | Robin Maisch |
| Language | German |
| Mode | in person |
| Abstract | Code plagiarism in academic contexts, most notably in first-year programming courses, continues to be a problem. Currently, the most widely used plagiarism detectors are vulnerable to plagiarism obfuscation through inserted complex dead code. We propose using abstract interpretation to detect and remove dead code before the code is processed by plagiarism detection tools. |
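To illustrate the preprocessing idea: a minimal sketch using Python's ast module that removes branches whose condition is a literal constant. Real abstract interpretation reasons over abstract value domains and can prove far more branches dead than this toy constant check; the example only shows where the removal would plug in.

```python
import ast

class DeadBranchRemover(ast.NodeTransformer):
    """Remove `if` branches with a constant condition (toy stand-in for
    what abstract interpretation could prove about complex dead code)."""
    def visit_If(self, node: ast.If) -> object:
        self.generic_visit(node)
        if isinstance(node.test, ast.Constant):
            # Keep only the branch that can actually execute
            return node.body if node.test.value else node.orelse
        return node

source = "x = 1\nif False:\n    y = obfuscation()\nprint(x)"
tree = DeadBranchRemover().visit(ast.parse(source))
print(ast.unparse(ast.fix_missing_locations(tree)))  # dead branch is gone
```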