Patient Rule Induction Method with Active Learning: Difference between revisions

Revision as of 10:30, 26 November 2019

Presenter	Emmanouil Emmanouilidis
Talk type	Proposal
Supervisor	Vadim Arzamasov
Date	Fri, 29 November 2019
Talk language
Talk mode
Abstract	PRIM (Patient Rule Induction Method) is an algorithm to create hyperboxes that are human-comprehensible. However, PRIM alone requires relatively large datasets. It has been shown that combining PRIM with ML models that generalize faster (e.g., Random Forest) can reduce the number of simulation runs by around 75%. We aim to reduce the number of simulation runs even further by using an active learning approach to train the model. Moreover, we are interested in applying this approach to classification as well as regression problems, using different query strategies. Acquiring labels for a given dataset can be quite costly; with active learning, only a small part of the dataset has to be labeled. A preliminary experiment indicated that combining these methods does indeed help reduce the number of necessary runs even further, though we have only tried one sampling method thus far. Optimizing this non-trivial task is the focus of this thesis.
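
For readers unfamiliar with how PRIM arrives at its hyperboxes, the following is a minimal sketch of the peeling phase only, written in plain Python. It is not the implementation used in the thesis: the function name `prim_peel`, the parameters `alpha` (fraction of points peeled per step) and `min_support` (smallest allowed share of points in the box), and the synthetic data are all assumptions made for illustration. The pasting phase and the combination with an ML model or active learning are left out.

```python
import random


def prim_peel(points, labels, alpha=0.1, min_support=0.1):
    """Greedy peeling: shrink a box one side at a time to raise the mean label.

    Returns the peeling trajectory as a list of (box, mean_label, n_points),
    where box is a list of (low, high) bounds, one pair per dimension.
    """
    n_total = len(points)
    dims = len(points[0])
    # Start with the bounding box of all points.
    box = [(min(p[d] for p in points), max(p[d] for p in points)) for d in range(dims)]
    inside = list(range(n_total))

    def mean(indices):
        return sum(labels[i] for i in indices) / len(indices)

    trajectory = [(list(box), mean(inside), len(inside))]
    while True:
        best = None  # (mean label, dimension, new bounds, kept indices)
        for d in range(dims):
            values = sorted(points[i][d] for i in inside)
            k = max(1, int(alpha * len(values)))
            if k >= len(values):
                continue
            # Two candidate peels per dimension: cut the low side or the high side.
            candidates = (
                (values[k], box[d][1]),
                (box[d][0], values[-k - 1]),
            )
            for low, high in candidates:
                kept = [i for i in inside if low <= points[i][d] <= high]
                # Skip peels that remove nothing or violate the support limit.
                if len(kept) == len(inside) or len(kept) < min_support * n_total:
                    continue
                score = mean(kept)
                if best is None or score > best[0]:
                    best = (score, d, (low, high), kept)
        if best is None:
            break
        score, d, bounds, inside = best
        box[d] = bounds
        trajectory.append((list(box), score, len(inside)))
    return trajectory


# Synthetic example (assumed data): the label is 1 only in the upper-right quadrant.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(400)]
labels = [1 if x > 0.5 and y > 0.5 else 0 for x, y in points]

trajectory = prim_peel(points, labels)
best_box, best_mean, best_support = max(trajectory, key=lambda step: step[1])
print(best_box, best_mean, best_support)
```

Every entry in the trajectory is a candidate box, so the trade-off between a high mean label and a large support is left to the user. In the setting described in the abstract, the labels would presumably come from simulation runs or from a trained model's predictions rather than from a known function as in this toy example.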

@@ Line 5: / Line 5: @@
 |betreuer=Vadim Arzamasov
 |termin=Institutsseminar/2019-11-29 Zusatztermin
-|kurzfassung=PRIM (Patient Rule Induction Method) is an algorithm to create hyperboxes that are human comprehansable. But PRIM requires relatively large datasets. It has been shown, that using ML models (e.g. Random Forrest) that generalize faster can increase performance by around 75%.
+|kurzfassung=PRIM (Patient Rule Induction Method) is an algorithm to create hyperboxes, that are human-comprehensible. Yet PRIM alone requires relatively large datasets. It has been shown, that combining PRIM with  ML models (e.g. Random Forrest), which generalize faster, can reduce the number of simulation runs by around 75%.
-In this Thesis we are trying to increase the overall performance even further, using an active learning approach in order to train the models. Acquiring labels for a given dataset can be quite costly, with active learning only a small part of the dataset has to ben labeled (if at all). Furthermore, a  preliminary experiment indicated, that combining these methods does indeed increase performance even further.
+We are trying to reduce the number of simulation runs even further, using an active learning approach to train the model.
+Moreover, we are interested in applying this approach to classification as well as regression problems, utilizing different query strategies.
+Acquiring labels for a given dataset can be quite costly, with active learning, only a small part of the dataset has to be labeled. A preliminary experiment indicated, that the combination of these methods, does indeed help reduce the necessary runs even further. Though we only tried one possible sampling method thus far. Optimizing this non-trivial task is the focus of this Thesis.
 }}