Beyond Similarity - Dimensions of Semantics and How to Detect them
Latest revision as of 2 January 2023, 11:12
Speaker | Felix Pieper | |
---|---|---|
Talk type | Master's thesis | |
Advisor | Sophie Corallo | |
Date | Fri, 13 January 2023 | |
Presentation mode | in person | |
Abstract | Semantic similarity estimation is a widely used and well-researched area. Current state-of-the-art approaches estimate text similarity with large language models. However, semantic similarity estimation often ignores fine-grained differences between semantically similar sentences. This thesis proposes the concept of semantic dimensions to represent fine-grained differences between two sentences. A workshop with domain experts identified ten semantic dimensions. Based on insights from the workshop, a model of semantic dimensions was created. Afterward, 60 participants indicated in a survey which semantic dimensions are useful to users. Detectors for the five most useful semantic dimensions were implemented in an extensible framework. To evaluate the semantic dimension detectors, a dataset of 200 sentence pairs was created. The detectors reached an average F1 score of 0.815. |
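
For readers unfamiliar with the reported metric, the following is a minimal sketch of how an average F1 score over several binary detectors could be computed. The abstract does not say how the average was formed, so the unweighted mean over per-detector F1 scores used here is an assumption. The function names, the dimension names, and the toy labels are hypothetical and are not taken from the thesis or its dataset.

```python
from typing import Dict, List, Tuple


def f1_score(gold: List[bool], predicted: List[bool]) -> float:
    """F1 for one binary detector: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(results: Dict[str, Tuple[List[bool], List[bool]]]) -> float:
    """Unweighted mean of per-detector F1 scores (assumed averaging scheme)."""
    scores = [f1_score(gold, predicted) for gold, predicted in results.values()]
    return sum(scores) / len(scores)


# Toy labels for two hypothetical dimensions: (gold, predicted) per sentence pair.
toy_results = {
    "dimension_a": ([True, True, False, False], [True, False, True, False]),
    "dimension_b": ([True, False], [True, False]),
}
toy_average = average_f1(toy_results)
```

Each detector is scored on whether it flags its dimension for a sentence pair, and the per-detector scores are then averaged with equal weight. A weighted or micro-averaged variant would give different numbers when the dimensions are unevenly represented in the dataset.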