Patient Rule Induction Method with Active Learning
| Speaker | Emmanouil Emmanouilidis |
|---|---|
| Talk type | Proposal |
| Advisor | Vadim Arzamasov |
| Date | Fri, 29 November 2019 |
| Talk language | |
| Presentation mode | |
| Abstract | PRIM (Patient Rule Induction Method) is an algorithm that creates human-comprehensible hyperboxes. Yet PRIM alone requires relatively large datasets. It has been shown that combining PRIM with ML models (e.g. Random Forest), which generalize faster, can reduce the number of simulation runs by around 75%. We are trying to reduce the number of simulation runs even further by using an active learning approach to train the model. Moreover, we are interested in applying this approach to classification as well as regression problems, utilizing different query strategies. Acquiring labels for a given dataset can be quite costly; with active learning, only a small part of the dataset has to be labeled. A preliminary experiment indicated that the combination of these methods does indeed help reduce the necessary runs even further, though only one possible sampling method has been tried thus far. Optimizing this non-trivial task is the focus of this thesis. |
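
To make the first building block of the abstract concrete, here is a minimal sketch of PRIM's peeling phase, assuming numeric inputs in a NumPy array: starting from a box that covers all data, it repeatedly shaves a small fraction of points off one side of one dimension, keeping the cut that most increases the mean of the target inside the box. The function name `prim_peel` and the parameter defaults (`alpha`, `beta_min`) are illustrative choices, not taken from the thesis, and the pasting and covering steps of the full algorithm are omitted.

```python
import numpy as np

def prim_peel(X, y, alpha=0.05, beta_min=0.1):
    """One peeling trajectory of PRIM: iteratively shave a fraction `alpha`
    of points from one side of one dimension, keeping the cut that yields
    the highest mean target value inside the remaining box."""
    box = np.column_stack([X.min(axis=0), X.max(axis=0)])  # per-dimension [lower, upper]
    in_box = np.ones(len(X), dtype=bool)

    while in_box.mean() > beta_min:                         # stop when box support gets too small
        best_mean, best_cut = y[in_box].mean(), None
        for dim in range(X.shape[1]):
            vals = X[in_box, dim]
            for side, bound in (("lower", np.quantile(vals, alpha)),
                                ("upper", np.quantile(vals, 1 - alpha))):
                keep = in_box & (X[:, dim] >= bound if side == "lower" else X[:, dim] <= bound)
                if keep.sum() > 0 and y[keep].mean() > best_mean:
                    best_mean, best_cut = y[keep].mean(), (dim, side, bound, keep)
        if best_cut is None:                                # no peel improves the box mean
            break
        dim, side, bound, in_box = best_cut
        box[dim, 0 if side == "lower" else 1] = bound       # tighten the chosen boundary
    return box, in_box
```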
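
The second building block, pool-based active learning for the surrogate model, can be sketched as follows. The abstract does not specify which query strategies are investigated; the least-confidence strategy and the Random Forest surrogate below are illustrative assumptions, as are the function name `active_learning_loop`, the parameter defaults, and the `label_fn` callback standing in for an expensive simulation run.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def active_learning_loop(X_pool, label_fn, n_initial=20, n_queries=100, batch_size=5, seed=0):
    """Pool-based active learning with uncertainty sampling: start from a small
    random labelled set, then repeatedly label the points the current Random
    Forest surrogate is least certain about instead of labelling the whole pool."""
    rng = np.random.default_rng(seed)
    labelled = rng.choice(len(X_pool), size=n_initial, replace=False).tolist()
    y_labelled = [label_fn(X_pool[i]) for i in labelled]    # expensive simulation runs

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    for _ in range(n_queries):
        model.fit(X_pool[labelled], y_labelled)
        proba = model.predict_proba(X_pool)
        uncertainty = 1.0 - proba.max(axis=1)               # least-confidence query strategy
        uncertainty[labelled] = -np.inf                     # never re-query labelled points
        query = np.argsort(uncertainty)[-batch_size:]       # most uncertain candidates
        labelled.extend(query.tolist())
        y_labelled.extend(label_fn(X_pool[i]) for i in query)
    return model, labelled
```

In this reading of the approach, the trained surrogate's predictions over a dense sample of the input space would then be handed to PRIM (for instance the peeling sketch above) to recover interpretable hyperboxes, so that the number of actual simulation runs is limited to the queried points.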