Feature-based Time Series Generation

From SDQ-Institutsseminar

Current version as of 10 December 2019, 10:30

Speaker Daniel Betsche
Talk type Proposal
Advisor Adrian Englhardt
Date Fri, 13 December 2019
Talk language
Talk mode
Abstract Due to privacy concerns and the potentially high cost of collecting real time series data, high-quality datasets are difficult for machine learning practitioners to obtain. Generating synthetic time series data enables the study of model robustness against edge cases and special conditions not found in the original data. A prerequisite for such applications is fine-grained control over the generation process, so that the synthetic data can be tailored to the specific needs of the user. Classical approaches relying on autoregressive models, e.g. ARIMA, provide only basic control over composites such as trend, cycle, season, and error. A promising current approach is to train LSTM autoencoders or GANs on a sample dataset and learn an unsupervised set of features, which can then be manipulated to generate new data. The applicability of this approach is limited because the learned features are not human-interpretable, which in turn limits control. We propose various methods for combining handcrafted and unsupervised features to give the user enhanced influence over various aspects of the time series data. To evaluate our work, we collected a range of metrics that have been proposed as suitable for synthetic data. We will compare these metrics and apply them to different datasets to show whether we can achieve comparable or improved results.
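The "basic control over composites" that the abstract attributes to classical approaches can be illustrated with a minimal sketch: a synthetic series built as a sum of trend, season, and error components. All function and parameter names here are illustrative assumptions, not part of the proposed method; the point is that control is limited to a handful of coarse, composite-level knobs.

```python
import numpy as np

def generate_series(n=200, trend_slope=0.05, season_period=24,
                    season_amplitude=1.0, noise_std=0.3, seed=0):
    """Sketch of classical composite-based generation.

    The series is the sum of a linear trend, a sinusoidal seasonal
    component, and Gaussian error. The only handles the user gets are
    these composite-level parameters -- there is no fine-grained control
    over local shape, as motivated in the abstract.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    trend = trend_slope * t                                     # linear trend
    season = season_amplitude * np.sin(2 * np.pi * t / season_period)  # seasonality
    error = rng.normal(0.0, noise_std, size=n)                  # error term
    return trend + season + error

series = generate_series()
```

Feature-based approaches, by contrast, aim to expose many more such handles, including learned ones, which is where the interpretability problem discussed in the abstract arises.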