Quantitative Evaluation of the Expected Antagonism of Explainability and Privacy: Difference between revisions

From SDQ-Institutsseminar
No edit summary
No edit summary
Line 5: Line 5:
|betreuer=Clemens Müssener
|betreuer=Clemens Müssener
|termin=Institutsseminar/2021-06-11
|termin=Institutsseminar/2021-06-11
|kurzfassung=Explainers for machine learning models help humans and models work together. They build trust in a model's decision by giving further insight into the decision-making process. However, it is unclear whether this insight can also expose private information. The question of our thesis is whether there exists a conflict of objectives between explainability and privacy and how we measure the effects of this conflict. Specifically, we are looking at local feature importance explainers.
|kurzfassung=Explainers for machine learning models help humans and models work together. They build trust in a model's decision by giving further insight into the decision-making process. However, it is unclear whether this insight can also expose private information. The question of my thesis is whether there exists a conflict of objectives between explainability and privacy and how to measure the effects of this conflict.


We propose a use case where the prediction of a model for a person is considered their private data. An attacker might be able to gain insight into the predictions for other people by abusing their own explanation to imitate the model's behavior. We will test this use case experimentally to determine whether such an attack is possible.
I propose two types of attack that can be applied against explainers: model extraction and inference of information about the training data. Differential privacy is introduced as a way to measure the privacy breach caused by these attacks. Finally, three specific use cases are presented in which explainers can realistically be abused to breach differential privacy.
}}
}}

Revision as of 11:35, 8 June 2021

Speaker Martin Lange
Talk type Proposal
Advisor Clemens Müssener
Date Fri, 11 June 2021
Talk mode
Abstract Explainers for machine learning models help humans and models work together. They build trust in a model's decision by giving further insight into the decision-making process. However, it is unclear whether this insight can also expose private information. The question of my thesis is whether there exists a conflict of objectives between explainability and privacy and how to measure the effects of this conflict.

I propose two types of attack that can be applied against explainers: model extraction and inference of information about the training data. Differential privacy is introduced as a way to measure the privacy breach caused by these attacks. Finally, three specific use cases are presented in which explainers can realistically be abused to breach differential privacy.