Quantitative Evaluation of the Expected Antagonism of Explainability and Privacy

Aus SDQ-Institutsseminar
Vortragende(r) Martin Lange
Vortragstyp Bachelorarbeit
Betreuer(in) Clemens Müssener
Termin Fr 20. August 2021
Kurzfassung Explainable artificial intelligence (XAI) offers a reasoning behind a model's behavior.

For many explainers this proposed reasoning gives us more information about the inner workings of the model or even about the training data. Since data privacy is becoming an important issue the question arises whether explainers can leak private data. It is unclear what private data can be obtained from different kinds of explanation. In this thesis I adapt three privacy attacks in machine learning to the field of XAI: model extraction, membership inference and training data extraction. The different kinds of explainers are sorted into these categories argumentatively and I present specific use cases how an attacker can obtain private data from an explanation. I demonstrate membership inference and training data extraction for two specific explainers in experiments. Thus, privacy can be breached with the help of explainers.