Statistical Generation of High Dimensional Data Streams with Complex Dependencies

From SDQ-Institutsseminar
Speaker Alexander Poth
Talk type Bachelor's thesis
Advisor Edouard Fouché
Date Fri, 14 December 2018
Abstract The evaluation of data stream mining algorithms is an important task in current research. Because no ground-truth data corpus covers a large number of desirable features (especially concept drift and outlier placement), researchers resort to producing their own synthetic data. This thesis proposes a novel framework ("streamgenerator") for creating data streams with finely controlled characteristics. The focus of this work is the conceptualization of the framework; however, a prototypical implementation is provided as well. We evaluate the framework by testing our data streams against state-of-the-art dependency measures and outlier detection algorithms.