Automatic Component Diagram Generation from Natural Language Specifications

From SDQ-Institutsseminar
Speaker Marco Demartino
Talk type Bachelor's thesis
Advisor Vincenzo Scotti
Date Mon, 23 March 2026, 14:00 (Room 348, Building 50.34)
Talk language English
Talk mode In person
Abstract This thesis investigates the automatic generation of UML component diagrams from natural language descriptions using large language models (LLMs). UML diagrams are widely used in software architecture documentation, but creating them manually can be time-consuming and requires expertise in modeling languages. The proposed approach is a generation pipeline that transforms textual specifications into structured component models, using prompt engineering techniques such as few-shot learning to guide the model toward syntactically valid and semantically meaningful output. An evaluation framework assesses the generated diagrams for both syntactic correctness and semantic alignment with the input descriptions. Initial experimental results are promising and suggest that LLMs can effectively support the generation of UML component diagrams. In the future, LLMs could be integrated into automated pipelines to generate architecture diagrams more quickly and reduce the manual effort required in software documentation.
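
To make the ideas in the abstract concrete, the following is a minimal sketch of few-shot prompt assembly and of two simple checks on a generated diagram. It is not the pipeline or evaluation framework from the thesis. It assumes PlantUML as the target notation, which the abstract does not state, and every name in it (`FEW_SHOT_EXAMPLES`, `build_prompt`, `is_syntactically_plausible`, `component_recall`) is hypothetical. The call to an LLM is left out on purpose.

```python
import re

# Hypothetical few-shot examples: (specification, PlantUML component diagram).
# PlantUML is an assumption made for illustration only.
FEW_SHOT_EXAMPLES = [
    (
        "A web shop frontend calls an order service.",
        "@startuml\n[Frontend] --> [OrderService]\n@enduml",
    ),
]


def build_prompt(specification, examples=FEW_SHOT_EXAMPLES):
    """Assemble a few-shot prompt: instruction, worked examples, new input."""
    parts = ["Translate the specification into a PlantUML component diagram."]
    for text, diagram in examples:
        parts.append(f"Specification: {text}\nDiagram:\n{diagram}")
    # The final block is left open so the model completes the diagram.
    parts.append(f"Specification: {specification}\nDiagram:")
    return "\n\n".join(parts)


# Components are assumed to be written in bracket notation, e.g. [Server].
COMPONENT = re.compile(r"\[([^\]]+)\]")


def parse_components(diagram):
    """Return the set of component names written as [Name] in the text."""
    return set(COMPONENT.findall(diagram))


def is_syntactically_plausible(diagram):
    """Rough syntactic check: correct delimiters and at least one component."""
    lines = [line.strip() for line in diagram.strip().splitlines()]
    return (
        len(lines) >= 3
        and lines[0] == "@startuml"
        and lines[-1] == "@enduml"
        and bool(parse_components(diagram))
    )


def component_recall(diagram, expected):
    """Fraction of expected component names that appear in the diagram."""
    expected = set(expected)
    if not expected:
        return 1.0
    return len(parse_components(diagram) & expected) / len(expected)
```

In this sketch, `is_syntactically_plausible` stands in for a syntactic correctness check and `component_recall` for a very coarse notion of semantic alignment. A real evaluation would likely use a proper parser and compare relationships between components as well as their names.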