In the following, we describe how performance prediction for virtualized environments can be improved using the results of Ginpex experiments that derive the load-dependent overheads occurring in such environments.

The experiments yield an overhead model that consists of two parts: intra-machine overhead and inter-machine overhead.

First, a platform performance model is derived from the experiment results; it is used to calculate the overhead of resource requests. This platform performance model is then integrated into a performance prediction tool, such as the Palladio Component Model (PCM).

For each VM, the intra-machine experiment results are used to fit a multi-dimensional regression model. To derive such an overhead model, we implemented a regression model based on Classification and Regression Trees (CART) [1]. A CART regression model predicts the value of a dependent variable (in our case, the overhead for a resource demand) from a set of independent variables (i.e., the number of parallel processes in the simulation accessing the different resources). CART models have been applied successfully in case studies on performance prediction ([2], [3], [4]). The results of the inter-machine experiments are used in the same way to fit an inter-machine overhead model.
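
To illustrate the idea, the following minimal sketch fits a small regression tree in the CART style (binary splits chosen to minimize the summed squared error of both halves) to a handful of measurements. The load vectors, the overhead factors, and all function names are invented for illustration; this is not the Ginpex implementation.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, targets):
    """Return (feature, threshold) that lowers the SSE the most, or None."""
    best, best_err = None, sse(targets)
    for f in range(len(rows[0])):
        # Candidate thresholds: every observed value except the largest one.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            err = sse(left) + sse(right)
            if err < best_err - 1e-12:
                best, best_err = (f, t), err
    return best

def fit(rows, targets, depth=0, max_depth=3):
    """Recursively build a tree; a leaf is the mean target of its samples."""
    split = best_split(rows, targets) if depth < max_depth else None
    if split is None:
        return mean(targets)
    f, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > t]
    return (f, t,
            fit([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            fit([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical measurements: (cpu, disk, network) processes -> overhead factor.
loads = [(0, 0, 1), (0, 0, 2), (0, 1, 2), (0, 1, 3), (1, 1, 3), (1, 0, 1)]
overheads = [1.0, 1.1, 1.3, 1.6, 1.9, 1.2]

intra_model = fit(loads, overheads)
print(predict(intra_model, (0, 1, 3)))
```

A production setup would more likely rely on an existing CART implementation; the hand-written tree is used here only to keep the sketch free of dependencies.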

The overall overhead of a resource request in a VM can then be calculated by first predicting the intra-machine overhead with the intra-machine overhead model of the VM and then predicting the inter-machine overhead of the resource request. We then multiply the resource demand of the request by the calculated overheads to obtain an adapted demand that reflects the slowdown due to parallel load.
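
Written as a formula, this multiplication can be stated as follows. The symbols are introduced here only for illustration: d is the original resource demand, o_intra is the factor predicted by the intra-machine overhead model, o_inter,r is the factor predicted by the inter-machine overhead model for a shared resource r, and d' is the adapted demand.

```latex
d' = d \cdot o_{\mathrm{intra}} \cdot \prod_{r} o_{\mathrm{inter},r}
```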

Take, for example, a scenario with two VMs that share the resources CPU, disk, and network, and the load situation (1, 0, 2) for VM 1 and (0, 1, 2) for VM 2, where each tuple gives the number of processes using the CPU, the disk, and the network. This means that one process in VM 1 fully utilizes the physical CPU, two processes in VM 1 issue network requests, one process in VM 2 issues disk requests, and two processes in VM 2 issue network requests. Now, we assume that a new process starts issuing network requests in VM 2.

The overhead that slows down this request under the current load situation is computed as follows. First, the intra-machine overhead is calculated with the intra-machine overhead model of VM 2, using the input parameters CPU load = 0, Disk load = 1, and Network load = 3 (the two existing network processes plus the new one).

Then, the inter-machine overhead is calculated. As network requests also occur in VM 1, we calculate the inter-machine network overhead for VM 2 with the input parameters Network VM1 = 1 and Network VM2 = 1. In addition, we have to account for the overhead that occurs on the hypervisor due to handling CPU and disk resource requests. To do so, we first calculate the overhead for VM 2 with the input parameters CPU VM1 = 1 and CPU VM2 = 0, and then calculate the overhead for VM 2 with the input parameters Disk VM1 = 0 and Disk VM2 = 1. The network resource demand issued in VM 2 is then multiplied by all calculated overheads to obtain an adapted resource demand that includes the response time slowdown due to parallel load in the system. The overheads for the other resource requests in VM 1 and VM 2 are calculated in the same way.
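
The calculation for the new network request in VM 2 can be sketched as follows. The input parameters are copied from the example above; the two model functions are placeholders that return invented factors, and the names `adapted_demand`, `intra_vm2`, and `inter_vm2` are hypothetical. In a real setup, the placeholders would be replaced by queries to the fitted regression models.

```python
from math import prod

def adapted_demand(demand, intra_model, inter_model, intra_load, inter_loads):
    """Scale a resource demand by the intra- and inter-machine overhead factors."""
    factors = [intra_model(intra_load)]
    factors += [inter_model(resource, loads) for resource, loads in inter_loads]
    return demand * prod(factors)

# Placeholder for the VM 2 intra-machine model.
# load = (cpu, disk, network) processes inside VM 2; the factor is invented.
def intra_vm2(load):
    return 1.5

# Placeholder for the inter-machine model of VM 2.
# loads = (value for VM 1, value for VM 2); the factors are invented.
def inter_vm2(resource, loads):
    return {"network": 1.2, "cpu": 1.1, "disk": 1.0}[resource]

demand = adapted_demand(
    demand=10.0,                      # hypothetical original network demand
    intra_model=intra_vm2,
    inter_model=inter_vm2,
    intra_load=(0, 1, 3),             # CPU load = 0, Disk load = 1, Network load = 3
    inter_loads=[
        ("network", (1, 1)),          # Network VM1 = 1, Network VM2 = 1
        ("cpu", (1, 0)),              # CPU VM1 = 1, CPU VM2 = 0
        ("disk", (0, 1)),             # Disk VM1 = 0, Disk VM2 = 1
    ],
)
print(demand)
```

With these invented factors, the demand of 10.0 grows to roughly 19.8; the actual slowdown depends entirely on the measured overhead models.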

To use the overhead model in performance prediction, we adapted the simulation-based performance prediction tool of the Palladio Component Model (PCM). The PCM performance simulation is a discrete-event simulation that makes use of queues to simulate contention effects on resources. Resource demands that occur in the modelled control flow of a component are scheduled on the corresponding queues. The resulting contention affects the response time of component services and the system's resource utilization.

The figure above shows how our approach is integrated into the event-based simulation of the PCM. Virtual machines can be modelled in PCM with nested resource containers [5]. During simulation, resource requests of components deployed on the containers are issued to the containers' resources (e.g. CPU, disk, or a network device). Each request leads to an event indicating that the demand has to be put on the resource (step (1) in the figure). We intercept this event (2) and query the resources for the current load situation (3). Based on this information, we use the platform performance model to calculate the overhead for the issued demand as described above (4). The adapted demand is then passed on to the simulation framework (5), where it is processed by the queues that simulate resource scheduling logic and resource contention effects. At this point, we also use the platform performance model to adapt the demands of the other requests currently being processed by the simulation's queues, as these demands are affected by the changed load situation as well (6).

The same procedure is carried out once a resource request has been processed completely by the simulation. In this case, the simulation sends a different event indicating that the control flow of the component that issued the demand can be resumed (7). We again intercept this event (8) and adapt the demands of all requests that are currently being processed by the corresponding queues (steps (9)-(12)).
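
The interception logic can be sketched as a small hook that re-adapts all active demands whenever a request is issued or completed. This is a simplified illustration, not the PCM simulation code: the class `OverheadInterceptor`, its methods, and `toy_model` are hypothetical, and the sketch rescales each request from its original demand, ignoring the work a request has already completed.

```python
class OverheadInterceptor:
    """Hypothetical hook between demand events and the resource queues."""

    def __init__(self, overhead_model):
        self.overhead_model = overhead_model  # callable: (resource, load) -> factor
        self.active = {}    # request id -> (resource, original demand)
        self.adapted = {}   # request id -> demand currently placed on the queue

    def current_load(self):
        """Count the active requests per resource (the load query)."""
        load = {}
        for resource, _ in self.active.values():
            load[resource] = load.get(resource, 0) + 1
        return load

    def _readapt_all(self):
        """Recompute the adapted demand of every active request."""
        load = self.current_load()
        for rid, (resource, demand) in self.active.items():
            self.adapted[rid] = demand * self.overhead_model(resource, load)

    def on_demand_issued(self, rid, resource, demand):
        """Steps (2)-(6): register the new request, adapt it and the others."""
        self.active[rid] = (resource, demand)
        self._readapt_all()
        return self.adapted[rid]

    def on_demand_completed(self, rid):
        """Steps (8)-(12): drop the finished request, re-adapt the rest."""
        del self.active[rid]
        del self.adapted[rid]
        self._readapt_all()

# Invented model: each additional request on the same resource adds 10 %.
def toy_model(resource, load):
    return 1.0 + 0.1 * (load[resource] - 1)

hook = OverheadInterceptor(toy_model)
hook.on_demand_issued("a", "network", 10.0)
hook.on_demand_issued("b", "network", 20.0)
print(hook.adapted)   # both demands are scaled while they overlap
hook.on_demand_completed("a")
print(hook.adapted)   # only request "b" remains and is re-adapted
```

Keeping the original demand next to the adapted one makes every re-adaptation independent of the previous one, so repeated load changes do not compound the factors.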

References

[1] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer-Verlag, 2nd edition, 2009.

[2] M. Wang, K. Au, A. Ailamaki, A. Brockwell, C. Faloutsos, and G. R. Ganger. Storage device performance prediction with CART models. In Proceedings of the 12th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems (MASCOTS 2004), 2004.

[3] E. Thereska, B. Doebel, A. X. Zheng, and P. Nobel. Practical Performance Models for Complex, Popular Applications. In Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2010). ACM, 2010.

[4] D. Westermann, J. Happe, R. Krebs, and R. Farahbod. Automated Inference of Goal-oriented Performance Prediction Functions. In 27th IEEE/ACM International Conference On Automated Software Engineering (ASE 2012), to appear 2012.

[5] M. Hauck, M. Kuperberg, K. Krogmann, and R. Reussner. Modelling Layered Component Execution Environments for Performance Prediction. In Proceedings of the 12th International Symposium on Component Based Software Engineering (CBSE 2009), pages 191-208. Springer-Verlag, 2009.