Collective Entity Matching for Linking Structures in Attributed Material Graphs

From SDQ-Institutsseminar
Speaker Aleksandra Pawelek
Talk type Proposal
Advisor Daniel Betsche
Date Fri, 16 June 2023
Talk language
Talk mode in person
Abstract In data analysis, entity matching (EM), also called entity resolution, is the task of finding the same entity in different data sources. It is a required step when joining data sets in which the same entities do not always share a common identifier. When applied to graph data such as knowledge graphs, ontologies, or abstractions of physical systems, the relationships between entities pose an additional challenge: not just the entities themselves but also their relationships, and therefore their neighborhoods, need to match. These relationships can also be used to our advantage, which is the foundation of collective entity matching (CEM).

In this bachelor thesis, we focus on a graph data set based on a material simulation, with the aim of matching entities between neighboring system states. The goal is to identify structures that evolve over time and link their states with a common identifier. Current CEM algorithms assume that a perfect matching is possible, i.e., that every entity can be matched. We want to drop this assumption and address the high imbalance between potential candidates and impossible matches. A third major challenge is the large volume of data, which requires our algorithm to be efficient.
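
To make the idea of collective matching with unmatched entities more concrete, here is a minimal Python sketch. It is not the algorithm proposed in the thesis. It is a simple greedy scheme chosen for illustration, and every name in it (`collective_match`, `min_sim`, `beta`, the toy node names) is hypothetical. It assumes that each entity carries a single numeric attribute, that each system state is given as an adjacency dictionary, and that a pair is only a candidate if its attribute similarity reaches `min_sim`. Entities without any candidate simply stay unmatched.

```python
def attribute_similarity(a, b):
    """Map the absolute difference of two numeric attributes to (0, 1]."""
    return 1.0 / (1.0 + abs(a - b))


def greedy_match(scores):
    """Greedy one-to-one matching: take the best-scoring free pair first."""
    used_b, matching = set(), {}
    for (a, b), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        if a not in matching and b not in used_b:
            matching[a] = b
            used_b.add(b)
    return matching


def neighborhood_agreement(a, b, adj_a, adj_b, matching):
    """Fraction of a's neighbors whose current match is a neighbor of b."""
    neighbors = adj_a.get(a, set())
    if not neighbors:
        return 0.0
    hits = sum(1 for n in neighbors if matching.get(n) in adj_b.get(b, set()))
    return hits / len(neighbors)


def collective_match(attrs_a, adj_a, attrs_b, adj_b,
                     min_sim=0.5, beta=1.0, rounds=5):
    # Candidate filter: pairs with dissimilar attributes are never matched,
    # so entities without a counterpart remain unmatched.
    base = {}
    for a, value_a in attrs_a.items():
        for b, value_b in attrs_b.items():
            sim = attribute_similarity(value_a, value_b)
            if sim >= min_sim:
                base[(a, b)] = sim

    # Start from attributes alone, then let neighbor matches refine the scores.
    matching = greedy_match(base)
    for _ in range(rounds):
        scores = {
            (a, b): sim + beta * neighborhood_agreement(a, b, adj_a, adj_b, matching)
            for (a, b), sim in base.items()
        }
        updated = greedy_match(scores)
        if updated == matching:  # no change, stop early
            break
        matching = updated
    return matching


# Toy example: two consecutive states. "x" and "y" have identical attributes,
# "z" disappears and "n2" is new in the second state.
attrs_t = {"x": 1.0, "y": 1.0, "p": 5.0, "q": 9.0, "z": 20.0}
adj_t = {"x": {"p"}, "p": {"x"}, "y": {"q"}, "q": {"y"}, "z": set()}
attrs_t1 = {"y2": 1.1, "x2": 1.1, "p2": 5.2, "q2": 9.1, "n2": 40.0}
adj_t1 = {"x2": {"p2"}, "p2": {"x2"}, "y2": {"q2"}, "q2": {"y2"}, "n2": set()}

matching = collective_match(attrs_t, adj_t, attrs_t1, adj_t1)
print(sorted(matching.items()))
```

In this toy example the attributes alone cannot tell "x" and "y" apart. Only the matches of their neighbors "p" and "q" resolve the ambiguity, while "z" and "n2" have no candidate and remain unmatched. The sketch scores all pairs of entities, which is quadratic in the number of entities and would not scale to the data volumes mentioned above.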