Patient Rule Induction Method with Active Learning: Difference between revisions

From SDQ-Institutsseminar
No edit summary
 
Line 5:
|betreuer=Vadim Arzamasov
|termin=Institutsseminar/2019-11-29 Zusatztermin
|kurzfassung=PRIM (Patient Rule Induction Method) is an algorithm for discovering scenarios from simulations by creating human-comprehensible hyperboxes. Yet PRIM alone requires relatively large datasets, and computational simulations are usually quite expensive. Therefore, one wants to obtain a plausible scenario with a minimal number of simulations. It has been shown that combining PRIM with ML models, which generalize faster, can reduce the number of necessary simulation runs by around 75%.
|kurzfassung=PRIM (Patient Rule Induction Method) is an algorithm for discovering scenarios from simulations by creating human-comprehensible hyperboxes. Yet PRIM alone requires relatively large datasets, and computational simulations are usually quite expensive. Consequently, one wants to obtain a plausible scenario with a minimal number of simulations. It has been shown that combining PRIM with ML models, which generalize faster, can reduce the number of necessary simulation runs by around 75%.
We will try to reduce the number of simulation runs even further, using an active learning approach to train a suitable model.  
We will try to reduce the number of simulation runs even further, using an active learning approach to train an intermediate ML model.  
Moreover, we are interested in applying PRIM with Active Learning to classification as well as regression problems, utilizing different query strategies. Acquiring labels for a given dataset can be quite costly; with active learning, only a small part of the dataset has to be labeled. A preliminary experiment indicated that the combination of these methods does indeed help reduce the number of necessary runs even further, though we have tried only one sampling method so far. Finding the best-suited combinations of components for the non-trivial problems described above is the focus of this thesis.
Additionally, we extend the previously proposed methodology to cover not only classification but also regression problems. A preliminary experiment indicated that the combination of these methods does indeed help reduce the number of necessary runs even further. In this thesis, I will analyze different AL sampling strategies together with several intermediate ML models to find out whether AL can systematically improve existing scenario discovery methods and whether a most beneficial combination of sampling method and intermediate ML model exists for this purpose.
}}
}}

Current revision as of 26 November 2019, 18:44

Speaker Emmanouil Emmanouilidis
Talk type Proposal
Supervisor Vadim Arzamasov
Date Fri, 29 November 2019
Presentation mode
Abstract PRIM (Patient Rule Induction Method) is an algorithm for discovering scenarios from simulations by creating human-comprehensible hyperboxes. Yet PRIM alone requires relatively large datasets, and computational simulations are usually quite expensive. Consequently, one wants to obtain a plausible scenario with a minimal number of simulations. It has been shown that combining PRIM with ML models, which generalize faster, can reduce the number of necessary simulation runs by around 75%.

We will try to reduce the number of simulation runs even further, using an active learning approach to train an intermediate ML model. Additionally, we extend the previously proposed methodology to cover not only classification but also regression problems. A preliminary experiment indicated that the combination of these methods does indeed help reduce the number of necessary runs even further. In this thesis, I will analyze different AL sampling strategies together with several intermediate ML models to find out whether AL can systematically improve existing scenario discovery methods and whether a most beneficial combination of sampling method and intermediate ML model exists for this purpose.
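
The pipeline described in the abstract could be sketched roughly as follows. This is only an illustration under several assumptions that do not come from the proposal: `simulate` is a hypothetical toy stand-in for an expensive simulation, the intermediate model is a simple k-nearest-neighbour estimate (`knn_prob`), the query strategy is plain uncertainty sampling, and `prim_peel` is a simplified peeling step that stops as soon as no peel improves the box mean (full PRIM records a whole peeling trajectory and also includes a pasting phase, both omitted here).

```python
import random

def simulate(x):
    """Hypothetical stand-in for one expensive simulation run.
    Returns 1 if the outcome of interest occurs (here: inside a known box)."""
    return int(0.2 <= x[0] <= 0.6 and 0.3 <= x[1] <= 0.8)

def knn_prob(x, X, y, k=5):
    """Assumed intermediate model: share of positive labels among
    the k labeled points nearest to x."""
    nearest = sorted(range(len(X)),
                     key=lambda i: sum((a - b) ** 2 for a, b in zip(x, X[i])))[:k]
    return sum(y[i] for i in nearest) / len(nearest)

def active_learning(pool, n_init=10, n_queries=40):
    """Uncertainty sampling: repeatedly simulate the pool point whose
    predicted probability is closest to 0.5."""
    pool = list(pool)
    X, pool = pool[:n_init], pool[n_init:]
    y = [simulate(x) for x in X]
    for _ in range(n_queries):
        idx = min(range(len(pool)),
                  key=lambda i: abs(knn_prob(pool[i], X, y) - 0.5))
        x = pool.pop(idx)
        X.append(x)
        y.append(simulate(x))
    return X, y

def prim_peel(X, y, alpha=0.1, min_support=0.05):
    """Simplified PRIM peeling: shrink the box by removing a fraction alpha
    of the points from one face at a time, always taking the peel that
    yields the highest mean of y inside the box."""
    dims = len(X[0])
    inside = list(range(len(X)))
    box = [[min(x[d] for x in X), max(x[d] for x in X)] for d in range(dims)]
    mean = sum(y) / len(y)
    while True:
        cut = max(1, int(alpha * len(inside)))
        if len(inside) <= cut:
            break
        best = None
        for d in range(dims):
            vals = sorted(X[i][d] for i in inside)
            # side 0 peels the lower face, side 1 the upper face
            for side, bound in ((0, vals[cut]), (1, vals[-cut - 1])):
                if side == 0:
                    keep = [i for i in inside if X[i][d] >= bound]
                else:
                    keep = [i for i in inside if X[i][d] <= bound]
                if len(keep) == len(inside) or len(keep) < min_support * len(X):
                    continue
                m = sum(y[i] for i in keep) / len(keep)
                if best is None or m > best[0]:
                    best = (m, d, side, bound, keep)
        if best is None or best[0] <= mean:
            break  # simplification: stop when no peel improves the mean
        mean, d, side, bound, inside = best
        box[d][side] = bound
    return box, mean

def run_pipeline(seed=0, n_pool=300, n_cheap=2000):
    """Label a few points by simulation, then let the model label many."""
    rng = random.Random(seed)
    pool = [(rng.random(), rng.random()) for _ in range(n_pool)]
    X, y = active_learning(pool)              # 50 simulation runs in total
    cheap = [(rng.random(), rng.random()) for _ in range(n_cheap)]
    y_hat = [knn_prob(x, X, y) for x in cheap]  # model output, no simulation
    box, mean = prim_peel(cheap, y_hat)
    return box, mean, len(y)

if __name__ == "__main__":
    box, mean, n_sims = run_pipeline()
    print("simulation runs:", n_sims)
    print("box:", box, "mean predicted outcome inside box:", mean)
```

Because `prim_peel` works on the mean of `y`, the same sketch accepts binary labels as well as continuous outputs, which loosely mirrors the classification and regression settings mentioned above.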