Discovering data-driven Explanations: Difference between revisions

From SDQ-Institutsseminar
No edit summary
No edit summary
Line 5: Line 5:
|betreuer=Vadim Arzamasov
|betreuer=Vadim Arzamasov
|termin=Institutsseminar/2019-06-21 Zusatztermin
|termin=Institutsseminar/2019-06-21 Zusatztermin
|kurzfassung=tbd
|kurzfassung=The main goal of knowledge discovery is to gain knowledge from a given set of data. In many cases it is crucial that the results are human-comprehensible. Subdividing the feature space into boxes with unique characteristics is a commonly used approach for achieving this goal. The patient rule induction method (PRIM) extracts such "interesting" hyperboxes from a dataset by generating boxes that maximize the occurrence of some class inside them. However, the quality of the results varies when the method is applied to small datasets. This work will examine to what extent data generators can be used to artificially increase the amount of available data in order to improve the accuracy of the results. Secondly, it will be tested whether a probabilistic classification can improve the results when using generated data.
}}
}}

Revision as of 18 June 2019, 15:51

Speaker Benjamin Jochum
Talk type Proposal
Advisor Vadim Arzamasov
Date Fri, 21 June 2019
Talk language
Talk mode
Abstract The main goal of knowledge discovery is to gain knowledge from a given set of data. In many cases it is crucial that the results are human-comprehensible. Subdividing the feature space into boxes with unique characteristics is a commonly used approach for achieving this goal. The patient rule induction method (PRIM) extracts such "interesting" hyperboxes from a dataset by generating boxes that maximize the occurrence of some class inside them. However, the quality of the results varies when the method is applied to small datasets. This work will examine to what extent data generators can be used to artificially increase the amount of available data in order to improve the accuracy of the results. Secondly, it will be tested whether a probabilistic classification can improve the results when using generated data.
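
The box-finding idea behind PRIM mentioned in the abstract can be illustrated with a small sketch of its greedy peeling loop. This is a simplified illustration under stated assumptions, not the implementation used in the proposed work: the function name `prim_peel`, the parameters `alpha` (fraction of points peeled per step) and `min_support` (minimum fraction of points a box must keep), and the toy grid data are all chosen for this example. The sketch stops as soon as no peel increases the class share, and it leaves out the pasting step and the peeling trajectory of the full method.

```python
def prim_peel(points, labels, alpha=0.1, min_support=0.1):
    """Simplified PRIM-style peeling (illustrative sketch only).

    points: list of equal-length tuples of floats
    labels: list of 0/1 values, 1 marking the class of interest
    Returns (box, share): box is a list of (low, high) bounds per
    dimension, share is the fraction of label-1 points inside it.
    """
    n, dims = len(points), len(points[0])
    box = [(min(p[d] for p in points), max(p[d] for p in points))
           for d in range(dims)]
    inside = list(range(n))  # indices of points still in the box

    def share(idx):
        return sum(labels[i] for i in idx) / len(idx)

    while len(inside) > 1:
        k = max(1, int(alpha * len(inside)))  # target number of points to peel
        best = None  # (share, dimension, side, remaining indices)
        for d in range(dims):
            values = sorted(points[i][d] for i in inside)
            low_cut, high_cut = values[k - 1], values[-k]
            # Ties at the cut value are peeled together (an assumption
            # of this sketch), so a step may remove more than k points.
            candidates = (
                ("low", [i for i in inside if points[i][d] > low_cut]),
                ("high", [i for i in inside if points[i][d] < high_cut]),
            )
            for side, rest in candidates:
                if not rest or len(rest) < min_support * n:
                    continue  # box would become too small
                s = share(rest)
                if best is None or s > best[0]:
                    best = (s, d, side, rest)
        if best is None or best[0] <= share(inside):
            break  # no peel improves the class share
        _, d, side, inside = best
        low, high = box[d]
        if side == "low":
            box[d] = (min(points[i][d] for i in inside), high)
        else:
            box[d] = (low, max(points[i][d] for i in inside))
    return box, share(inside)


# Toy data: a 10 x 10 grid where the class of interest occupies
# the upper-right quadrant (both coordinates >= 0.5).
points = [(i / 10, j / 10) for i in range(10) for j in range(10)]
labels = [1 if x >= 0.5 and y >= 0.5 else 0 for x, y in points]
box, density = prim_peel(points, labels)
print(box, density)
```

On a dataset this regular the peeling recovers the quadrant exactly. With few or noisy samples each peel rests on only a handful of points, which is the situation the proposal aims to improve by generating additional data.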