Cost-Efficient Evaluation of ML Classifiers With Feature Attribution Annotations (Proposal)

From SDQ-Institutsseminar
Revision as of 12 May 2021, 09:19 by Moritz Renftle
Speaker Nobel Liaw
Talk type Bachelor's thesis
Advisor Moritz Renftle
Date Fri, 14 May 2021
Abstract Conventional evaluation of an ML classifier uses test data to estimate its expected loss. For "cognitive" ML tasks like image or text classification, this requires experts to annotate a large, representative test data set, which can be expensive.
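
The conventional estimate described above can be sketched as follows. This is a minimal illustration, assuming 0-1 loss; the function name `estimate_expected_loss`, the `predict` callable, and the toy threshold classifier are hypothetical and not taken from the thesis. The expected loss is estimated by the mean loss over the annotated test set, and the standard error indicates how noisy that estimate is when only a few annotations are available.

```python
import math
from typing import Callable, Sequence, Tuple


def estimate_expected_loss(
    predict: Callable[[float], int],
    inputs: Sequence[float],
    labels: Sequence[int],
) -> Tuple[float, float]:
    """Return (mean 0-1 loss, standard error) on an annotated test set."""
    if len(inputs) != len(labels) or len(inputs) == 0:
        raise ValueError("inputs and labels must be non-empty and equally long")
    # 0-1 loss: 1.0 for a misclassified input, 0.0 otherwise.
    losses = [0.0 if predict(x) == y else 1.0 for x, y in zip(inputs, labels)]
    n = len(losses)
    mean_loss = sum(losses) / n
    # Standard error of the mean of n Bernoulli-distributed losses.
    std_err = math.sqrt(mean_loss * (1.0 - mean_loss) / n)
    return mean_loss, std_err


# Toy stand-in for a trained classifier (assumption for illustration only).
def threshold_classifier(x: float) -> int:
    return 1 if x >= 0.5 else 0


test_inputs = [0.1, 0.4, 0.6, 0.9]
test_labels = [0, 1, 1, 1]  # expert-provided class annotations
loss, err = estimate_expected_loss(threshold_classifier, test_inputs, test_labels)
print(f"estimated expected loss: {loss:.2f} +/- {err:.2f}")
```

Because the standard error shrinks only with the square root of the number of annotated inputs, a tight estimate needs many expert annotations, which is the cost the thesis aims to reduce.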

In this thesis, we explore another approach to estimating the expected loss of an ML classifier. The aim is to enhance test data with additional expert knowledge. Inspired by recent feature attribution techniques, such as LIME or saliency maps, the idea is that experts annotate inputs not only with desired classes, but also with desired feature attributions. We then explore different methods to derive a large conventional test data set from a few such feature attribution annotations. We empirically evaluate the loss estimates of our approach against ground-truth estimates on large existing test data sets, with a focus on the tradeoff between the number of expert annotations and the achieved estimation accuracy.
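
The tradeoff between annotation budget and estimation accuracy can be illustrated with a small sketch. The proposal does not specify the derivation methods, so this example uses plain random subsampling of a conventional test set as a stand-in baseline; the function name `estimation_error_by_budget`, its parameters, and the toy loss values are all hypothetical. It measures how far the loss estimate from `k` annotated inputs deviates, on average, from the estimate on the full test set.

```python
import random
from statistics import mean
from typing import Dict, Sequence


def estimation_error_by_budget(
    losses: Sequence[float],
    budgets: Sequence[int],
    repetitions: int = 200,
    seed: int = 0,
) -> Dict[int, float]:
    """Mean absolute error of subsampled loss estimates, per annotation budget."""
    rng = random.Random(seed)
    # Reference ("ground-truth") estimate: mean loss on the full test set.
    reference = mean(losses)
    errors: Dict[int, float] = {}
    for k in budgets:
        if not 0 < k <= len(losses):
            raise ValueError(f"budget {k} must be between 1 and {len(losses)}")
        # Repeatedly draw k inputs without replacement and compare estimates.
        abs_errors = [
            abs(mean(rng.sample(losses, k)) - reference)
            for _ in range(repetitions)
        ]
        errors[k] = mean(abs_errors)
    return errors


# Toy per-input 0-1 losses of a classifier with 20% error (illustrative only).
toy_losses = [1.0] * 20 + [0.0] * 80
errors = estimation_error_by_budget(toy_losses, budgets=[5, 25, 100])
for budget, error in errors.items():
    print(f"{budget:>3} annotations: mean absolute error {error:.3f}")
```

An approach based on feature attribution annotations would be of interest if it reached a given estimation error with fewer expert annotations than this baseline.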