Quantitative Evaluation of the Expected Antagonism of Explainability and Privacy

From IPD-Institutsseminar
Speaker Martin Lange
Talk type Proposal
Advisor Clemens Müssener
Date Fri, 11 June 2021
Abstract Explainers for machine learning models help humans and models work together. They build trust in a model's decisions by giving further insight into the decision-making process. However, it is unclear whether this insight can also expose private information. Our thesis asks whether there is a conflict of objectives between explainability and privacy, and how the effects of this conflict can be measured. Specifically, we look at local feature importance explainers.

We propose a use case where the prediction of a model for a person is considered their private data. An attacker might be able to gain insight into the predictions for other people by abusing their own explanation to imitate the model's behavior. We will test this use case experimentally to determine whether such an attack is possible.
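
The attack idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the setup of the thesis: the hidden model is a small logistic regression (`w_secret`, `b_secret`), the gradient of the score stands in for a local feature importance explainer (`explain`), and the attacker imitates the model with a first-order approximation around their own data point (`surrogate_score`). The sketch only shows how agreement between the imitation and the model could be measured; it makes no claim about whether the attack succeeds in practice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target model: logistic regression whose weights are hidden
# from the attacker (an assumption for illustration only).
w_secret = rng.normal(size=5)
b_secret = 0.1

def model_score(x):
    """Probability output of the hidden model."""
    return 1.0 / (1.0 + np.exp(-(x @ w_secret + b_secret)))

def explain(x):
    """Stand-in local feature importance: gradient of the score at x."""
    p = model_score(x)
    return p * (1.0 - p) * w_secret

# The attacker knows only their own features, prediction, and explanation.
x_attacker = rng.normal(size=5)
p_attacker = model_score(x_attacker)
importance = explain(x_attacker)

def surrogate_score(x):
    """First-order imitation of the model around the attacker's data point."""
    return p_attacker + (x - x_attacker) @ importance

# Compare the imitation with the model's decisions for other people.
x_others = rng.normal(size=(1000, 5))
true_labels = model_score(x_others) >= 0.5
guessed_labels = surrogate_score(x_others) >= 0.5
agreement = float(np.mean(true_labels == guessed_labels))

# Always guessing the more frequent class gives a reference point.
positive_rate = float(np.mean(true_labels))
baseline = max(positive_rate, 1.0 - positive_rate)
print(f"agreement={agreement:.3f}, majority baseline={baseline:.3f}")
```

Comparing `agreement` with the majority baseline is one possible way to quantify how much the attacker's own explanation reveals about the predictions for others; a real experiment would replace the toy model and the gradient stand-in with the model and explainer under study.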