Standardized Real-World Change Detection Data
|Date||Fri, 13 May 2022|
|Abstract||The reliable detection of change points is a fundamental task when analysing data across many fields, e.g., in finance, bioinformatics, and medicine.
To define “change points”, we assume that there is a distribution, which may change over time, generating the data we observe. A change point then is a change in this underlying distribution, i.e., the distribution coming before a change point is different from the distribution coming after. The principled way to compare distributions, and to find change points, is to employ statistical tests.
While change point detection is an unsupervised problem in practice, i.e., the data is unlabelled, the development and evaluation of data analysis algorithms requires labelled data. Only a few labelled real-world data sets are publicly available, and many of them are either too small or have ambiguous labels. Further issues are that reusing data sets may lead to overfitting, and that preprocessing (e.g., removing outliers) may manipulate results. To address these issues, van den Burg et al. published 37 data sets annotated by data scientists and ML researchers and used them to assess 14 change detection algorithms. Yet concerns remain because these data sets are labelled by hand: Can humans correctly identify changes according to the definition, and can they do so consistently?
The goal of this Bachelor's thesis is to algorithmically label their data sets following the formal definition, and to extend their work by identifying and labelling larger and higher-dimensional data sets. To this end, we leverage a non-parametric hypothesis test that uses Maximum Mean Discrepancy (MMD) as its test statistic, i.e., we identify changes in a principled way. We will analyse the resulting labels and compare them to the human annotations, measuring their consistency with the F1 score. To assess the influence of the algorithmic, definition-conforming annotations, we will use them to re-evaluate the algorithms of van den Burg et al. and compare the respective performances.
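As a rough illustration of the two ingredients named above, the following Python sketch tests whether the samples before and after a candidate split differ, using a biased squared-MMD estimate and a permutation test, and then scores detected change points against reference labels with an F1 score. It is a minimal sketch under stated assumptions, not the procedure used in the thesis: the Gaussian (RBF) kernel, the fixed bandwidth, the permutation scheme, the matching margin, and all function names are illustrative choices, and the F1 variant is simplified compared with the one used by van den Burg et al.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples x and y (2-D arrays)."""
    kxx = rbf_kernel(x, x, bandwidth).mean()
    kyy = rbf_kernel(y, y, bandwidth).mean()
    kxy = rbf_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy


def mmd_permutation_test(x, y, bandwidth=1.0, n_permutations=200, seed=0):
    """Return (statistic, p-value) for H0: x and y share one distribution.

    Assumption: a plain permutation test over the pooled sample, which
    treats observations as exchangeable under H0.
    """
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = mmd2_biased(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        exceed += stat >= observed
    # The +1 terms keep the p-value strictly positive.
    return observed, (exceed + 1) / (n_permutations + 1)


def f1_with_margin(true_cps, predicted_cps, margin=5):
    """Simplified F1: a prediction counts as a hit if it lies within
    `margin` steps of a reference change point that is not yet matched."""
    matched = set()
    true_positives = 0
    for p in predicted_cps:
        for t in true_cps:
            if t not in matched and abs(p - t) <= margin:
                matched.add(t)
                true_positives += 1
                break
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_cps)
    recall = true_positives / len(true_cps)
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    before = rng.normal(0.0, 1.0, size=(100, 1))  # segment before the split
    after = rng.normal(3.0, 1.0, size=(100, 1))   # segment after a mean shift
    stat, p_value = mmd_permutation_test(before, after)
    print(f"MMD^2 = {stat:.4f}, p-value = {p_value:.4f}")
    print(f1_with_margin([50, 120], [52, 200], margin=5))
```

In a labelling setting, one would presumably run such a test at many candidate positions and correct for multiple testing; both steps are omitted here.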