Evaluation of Automated Feature Generation Methods

From SDQ-Institutsseminar
Revision as of 1 May 2021, 10:38, by Jonathan Bechtle
Speaker Jonathan Bechtle
Talk type Master's thesis
Advisor Vadim Arzamasov
Date Fri, 14 May 2021
Presentation mode
Abstract Manual feature engineering is a time-consuming and costly activity when developing new Machine Learning applications, as it requires the manual labor of a domain expert. Efforts have therefore been made to automate the feature generation process. However, no large benchmark of these Automated Feature Generation methods exists. It is therefore unclear which methods perform well in combination with specific Machine Learning models, and what the strengths and weaknesses of each method are.

In this thesis, we present an evaluation framework for Automated Feature Generation methods that is integrated into the scikit-learn framework for Python. We integrate nine Automated Feature Generation methods into this framework and evaluate them on 91 classification datasets. The datasets in our evaluation have up to 58 features and 12,958 observations. We investigate five Machine Learning models, including state-of-the-art models such as XGBoost.
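
To illustrate what evaluating feature generation methods inside scikit-learn can look like, the following is a minimal sketch, not the framework from the thesis. It assumes that a feature generation method can be wrapped as a scikit-learn transformer, and it uses `PolynomialFeatures` as a stand-in generator, a synthetic dataset, and two simple classifiers in place of the nine methods, 91 datasets, and five models of the evaluation. The helper name `evaluate` and all parameter values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import (
    FunctionTransformer,
    PolynomialFeatures,
    StandardScaler,
)
from sklearn.tree import DecisionTreeClassifier


def evaluate(generators, models, X, y, cv=5):
    """Cross-validate every (feature generator, model) pair.

    Hypothetical helper: returns a dict that maps
    (generator name, model name) to mean accuracy over the folds.
    """
    results = {}
    for gen_name, generator in generators.items():
        for model_name, model in models.items():
            # The generator is fitted inside each fold, so generated
            # features never see the held-out data.
            pipeline = Pipeline([
                ("generate", generator),
                ("scale", StandardScaler()),
                ("model", model),
            ])
            # cross_val_score clones the pipeline for every fold.
            scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
            results[(gen_name, model_name)] = scores.mean()
    return results


# Synthetic stand-in for one benchmark dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

generators = {
    # Baseline: FunctionTransformer() without a function is the identity.
    "original": FunctionTransformer(),
    # Stand-in generator: pairwise products and squares of the inputs.
    "polynomial": PolynomialFeatures(degree=2, include_bias=False),
}
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

results = evaluate(generators, models, X, y)
for (gen_name, model_name), accuracy in sorted(results.items()):
    print(f"{gen_name:>10} + {model_name:<20} accuracy = {accuracy:.3f}")
```

Comparing each generator against the `original` baseline per model is one simple way to see whether generated features help a particular model; a library such as XGBoost could be added to `models` through its scikit-learn compatible classifier.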