Performance Modelling for Genome Analysis Algorithms
Type | Master's thesis | |
---|---|---|
Posting | Genomanalyse.pdf | |
Supervisor | If you are interested or have questions, please contact: Ralf Reussner (e-mail: reussner@kit.edu, phone: +49-721-608-45993) | |
Motivation
The major challenge when searching for the optimal phylogenetic tree, which represents the evolutionary history of the species corresponding to a given set of DNA sequences, is computing the phylogenetic likelihood function. The tree search problem is NP-hard, and during tree searches production-level software spends 90-95% of its overall run time on likelihood function evaluations. As a consequence, an entire zoo of algorithmic tricks for accelerating likelihood calculations exists: there are different likelihood kernel implementations whose efficiency varies depending on (i) the input data at hand and (ii) the current computational stage of the tree inference program. The goal of this thesis is to develop a method that automatically determines and uses the most efficient likelihood kernel for a given program phase and the characteristics of the input dataset.
Task
This thesis will evaluate whether the existing software architecture performance simulator “Palladio” can be used to predict the performance impact of different kernel selections. To this end, exemplary algorithms are modeled in Palladio's modeling language, the Palladio Component Model (PCM). The simulation results of these models are then compared with measured data, both to evaluate the accuracy of the Palladio simulations and to learn how the PCM can be improved to better predict the performance of kernel implementations.
This thesis will be supervised at the Institute for Program Structures and Data Organization (IPD, Prof. Reussner), Department of Informatics, and at the Institute for Theoretical Informatics (ITI, Prof. Stamatakis).
We provide
- Work with the latest, innovative technologies
- A close connection to a current research project
- A very good working environment and intensive supervision