From SDQ-Institutsseminar
Session (All sessions)
Date Friday, 17 September 2021
Time 11:30 – 13:00 (duration: 90 min)
Previous session Fri, 10 September 2021
Next session Fri, 24 September 2021

Speaker Tanja Fenn
Title Change Detection in High Dimensional Data Streams
Talk type Master's thesis
Advisor Edouard Fouché
Abstract The data collected in many real-world scenarios, such as environmental analysis, manufacturing, and e-commerce, are high-dimensional and arrive as a stream whose properties evolve over time, a phenomenon known as "concept drift". This brings numerous challenges: data-driven models become outdated, and one is typically interested in detecting specific events, e.g., the critical wear and tear of industrial machines. Hence, detecting change, i.e., concept drift, is crucial for designing a reliable and adaptive predictive system for streaming data. However, existing techniques can only detect "when" a drift occurs and neglect the fact that different drifts may occur in different dimensions, i.e., they do not detect "where" a drift occurs. This is particularly problematic when data streams are high-dimensional.

The goal of this Master's thesis is to develop and evaluate a framework that efficiently and effectively detects "when" and "where" concept drift occurs in high-dimensional data streams. We introduce stream autoencoder windowing (SAW), an approach that trains an autoencoder online while monitoring its reconstruction error via a sliding window of adaptive size. We evaluate the performance of our method on synthetic data, in which the characteristics of the drifts are known. We then show how our method improves the accuracy of existing classifiers for predictive systems compared to benchmarks on real data streams.
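
To make the idea of monitoring a per-dimension reconstruction error more concrete, here is a minimal sketch. It is not the SAW implementation from the thesis, and every name in it (`OnlineLinearAutoencoder`, `AdaptiveErrorWindow`, `monitor`) is hypothetical. It assumes a tiny linear autoencoder trained by plain SGD, a simple ratio test between the recent and the older part of the window, and a window that is emptied after each detection as a crude stand-in for adaptive sizing. The thresholds are arbitrary illustration values.

```python
import random
from collections import deque


class OnlineLinearAutoencoder:
    """Tiny linear autoencoder (d -> k -> d), trained one sample at a time."""

    def __init__(self, n_features, n_hidden, lr=0.01, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        self.enc = [[rng.gauss(0, 0.1) for _ in range(n_features)] for _ in range(n_hidden)]
        self.dec = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_features)]

    def step(self, x):
        """Return the squared error per dimension for x, then take one SGD step."""
        h = [sum(w * xi for w, xi in zip(row, x)) for row in self.enc]
        err = [sum(w * hj for w, hj in zip(row, h)) - xi for row, xi in zip(self.dec, x)]
        # Backpropagated error, computed before the decoder weights are modified.
        back = [sum(self.dec[i][j] * err[i] for i in range(len(x))) for j in range(len(h))]
        for i, row in enumerate(self.dec):
            for j in range(len(h)):
                row[j] -= self.lr * err[i] * h[j]
        for j, row in enumerate(self.enc):
            for i in range(len(x)):
                row[i] -= self.lr * back[j] * x[i]
        return [e * e for e in err]


class AdaptiveErrorWindow:
    """Window over error vectors: grows while stable, is emptied after a change."""

    def __init__(self, recent=20, max_len=200, ratio=3.0, floor=1e-8):
        # Assumption: max_len >= 2 * recent, otherwise no comparison ever runs.
        self.recent, self.ratio, self.floor = recent, ratio, floor
        self.window = deque(maxlen=max_len)

    def update(self, errors):
        """Add one error vector; return the dimensions whose error jumped."""
        self.window.append(list(errors))
        if len(self.window) < 2 * self.recent:
            return []
        rows = list(self.window)
        old, new = rows[:-self.recent], rows[-self.recent:]
        drifted = []
        for d in range(len(errors)):
            old_mean = sum(r[d] for r in old) / len(old)
            new_mean = sum(r[d] for r in new) / len(new)
            # One-sided test: only increases in error count as drift here.
            if new_mean > self.ratio * max(old_mean, self.floor):
                drifted.append(d)
        if drifted:
            self.window.clear()  # rebuild the reference from post-change data
        return drifted


def monitor(stream, n_hidden=2, lr=0.01):
    """Yield (t, drifted_dimensions) for each detection in a stream of vectors."""
    model, window = None, AdaptiveErrorWindow()
    for t, x in enumerate(stream):
        if model is None:
            model = OnlineLinearAutoencoder(len(x), n_hidden, lr)
        dims = window.update(model.step(x))
        if dims:
            yield t, dims
```

Each pair yielded by `monitor` gives a time step ("when") and a list of dimension indices ("where"). Because the error is measured before the weight update, a sudden change shows up as a spike in the affected dimensions until the model has adapted.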

Speaker Wenrui Zhou
Title Outlier Analysis in Live Systems from Application Logs
Talk type Master's thesis
Advisor Edouard Fouché
Abstract Modern computer applications tend to generate massive amounts of logs and have become so complex that it is often difficult to explain why an application failed. Locating outliers in application logs can help explain such failures. Outlier detection in application logs is challenging because (1) logs are unstructured, streaming text data, and (2) labeling application logs is labor-intensive and inefficient.

Logs are similar to natural language. The Transformer, a recent deep learning architecture, has shown outstanding performance in Natural Language Processing (NLP) tasks. Building on these observations, we adapt the Transformer neural network to detect outliers in application logs in an unsupervised way. We compare our algorithm against state-of-the-art log outlier detection algorithms on three widely used benchmark datasets, where it outperforms them.
