Rakan Al Masri
Fri, 17 March 2023
While standard machine learning approaches rely solely on data to learn relevant patterns, this may not be sufficient in certain fields. Researchers in the healthcare domain have successfully applied causal domain knowledge to improve the prediction quality of machine learning models, especially for rare diseases. The causal domain knowledge informs the machine learning model about similar diseases, thus improving the quality of its predictions.
However, some domains, such as Cloud Systems Monitoring, lack readily available causal domain knowledge, so the knowledge must be approximated. It is therefore important to systematically investigate the processes and design decisions that affect the knowledge generation process.
In this study, we showed how causal discovery algorithms can be employed to generate causal domain knowledge from raw textual logs in the Cloud Systems Monitoring domain. We also investigated the impact of various design choices on the domain knowledge generation process through systematic testing across multiple datasets, and we share the insights we gained. To our knowledge, this is the first such investigation.