[IEEE 2012 Third International Conference on Networking and Computing (ICNC) - Okinawa, Japan...


Page 1: [IEEE 2012 Third International Conference on Networking and Computing (ICNC) - Okinawa, Japan (2012.12.5-2012.12.7)] 2012 Third International Conference on Networking and Computing

Evaluation of Performance Degradation in HPC Applications with VM Consolidation

Yuya Hashimoto

Tokyo Institute of Technology

Tokyo, Japan

Kento Aida

National Institute of Informatics/Tokyo Institute of Technology

Tokyo, Japan

Email: [email protected]

Abstract—This paper investigates the performance degradation in application programs running on virtual machines (VMs) in a physical computing server with a focus on HPC applications. We select three benchmarks as the expected workload in a datacenter: HPC applications, database applications, and web server applications. Then, we put VMs executing two application programs together in a physical computing server and evaluate the performance degradation in each application program. We also investigate the resource consumption in each application program and the reason for the performance degradation. The experimental results indicate that the interference among VMs executing two HPC application programs with high memory usage and high network I/O in the physical computing server significantly degrades the application performance.

I. Introduction

Cloud computing is now widely used as computing plat-

forms. Business communities are changing their computing

platforms from on-site traditional computing servers to cloud

computing services, such as IaaS, PaaS and SaaS. Cloud com-

puting offers users services to utilize computing and storage

resources on demand. The users can access the resources

operated in a datacenter via the Internet whenever they require

resource capacity. Although cloud computing has been pri-

marily used in business communities, academic communities

are currently interested in executing their high-performance

computing (HPC) application programs in a cloud computing

environment.

In the datacenter, virtual machines (VMs) run in physical

computing servers using virtualization technology [1] and run

software that offers services to the users. The virtualization

technology also enables many VMs to execute in fewer phys-

ical computing servers, a feature known as VM consolidation,

so that idle physical computing servers are turned off or oper-

ated in the low power mode. VM consolidation is a promising

approach to improve energy efficiency in the datacenter. For

example, it is reported that a computing server consumes more

than 50% of the peak power even when the server utilization is

10% [2]. Thus, VM consolidation can reduce power consumed

by computing servers with low utilization.

While VM consolidation contributes to improved energy

efficiency in a datacenter, interference among VMs in a

physical computing server may degrade the performance of

application programs running on these VMs. The interference

is caused by access contention for devices, e.g., a CPU core,

memory, and I/O devices in the physical computing server.

Interference among VMs executing HPC application programs

in the physical computing server may be significant, because

HPC application programs require huge resource capacity.

The problem of performance interference caused by

VM consolidation has been investigated in the related literature

[3], [4], [5], [6]. However, the investigations have been limited

to interference among VMs executing non-HPC application

programs, and the performance degradation on HPC applica-

tion programs has not been well investigated.

In this paper, we investigate the performance degradation in

application programs running on VMs in a physical computing

server with a focus on HPC applications. We select three

benchmarks as the expected workload in a datacenter: HPC ap-

plications, database applications, and web server applications.

Then, VMs, executing two application programs from three

benchmarks, are put together in a physical computing server.

We evaluate the performance degradation in each application

program. We also investigate the resource consumption in

each application program and discuss the reason for the

performance degradation observed in the experiments. The

experimental results indicate that the interference among VMs

executing two HPC application programs with high memory

usage and high network I/O in the physical computing server

significantly degrades the application performance.

The rest of the paper is organized as follows: Section II

discusses related work. Section III describes our experimental

setting and benchmarks. Section IV presents experimental

results and briefly discusses the VM consolidation strategy

to reduce performance degradation. Finally, Section V sum-

marizes our contributions and outlines future work.

II. Related Work

Performance interference among application programs run-

ning on VMs in a physical computing server has been investi-

gated in [3] and [4]. These researchers conducted experiments

using benchmarks of Unix commands, compilation processes,

a Povray application, micro-benchmark programs, and web

server applications. The work in [5] presents approaches to

create a performance model of application programs run-

ning on VMs in a physical computing server. Those authors

conducted experiments using the benchmark, vConsolidate,

and discussed the performance model. While the goal of the

above work is similar to that of this paper, that work did not

investigate performance interference among HPC application

programs running on VMs in a physical computing server.

2012 Third International Conference on Networking and Computing

978-0-7695-4893-7/12 $26.00 © 2012 IEEE

DOI 10.1109/ICNC.2012.50

The work in [6] proposes a VM consolidation algorithm for

a scientific workflow application program. The authors studied

the correlation among workloads with different resource usage

and investigated the impact of the workload interference on the

application performance. The proposed algorithm decides the

placement of VMs using the hierarchical clustering technique.

The performance of the proposed algorithm was evaluated

using a volume rendering program and synthetic application

programs. However, performance interference among these

applications was not discussed.

III. Experimental Setting

This section describes our experimental setting and the

benchmarks used in the experiments.

A. Server Configuration and Measurement Tool

The experiments are conducted on a physical computing

server with dual Intel Xeon L5520 2.27 GHz, 16 GB memory,

and 1 TB SATA HDD. The operating system on the physical

computing server is Ubuntu Server 10.04. We use a Kernel-

based Virtual Machine (KVM) for virtualization and operate

OpenNebula to control VMs running in the physical comput-

ing server. Each CPU has four cores and hyper-threading is

enabled, that is, we can utilize 16 virtual CPU cores in the

physical computing server. We create the VM image with Intel

Xeon 2.27 GHz, 2 GB memory, and 10 GB HDD. The CPU

on the VM has one core and hyper-threading is enabled. The

operating system on the VM is also Ubuntu Server 10.04.
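
The core counts quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the hardware figures come from the text, while the constant and function names are ours, and the two virtual CPU cores per VM assume that the single hyper-threaded VM core is exposed as two logical cores.

```python
# Hardware figures are taken from the text; all names are illustrative.
SOCKETS = 2            # dual Intel Xeon L5520
CORES_PER_SOCKET = 4   # each CPU has four cores
THREADS_PER_CORE = 2   # hyper-threading is enabled

logical_cores = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE


def max_vms(vcpus_per_vm: int) -> int:
    """Number of VMs that fit if no two VMs share a logical core."""
    return logical_cores // vcpus_per_vm


print(logical_cores)  # 16 virtual CPU cores on the host
print(max_vms(2))     # 8 VMs with two virtual CPU cores each
```

Under these assumptions, eight VMs with two virtual CPU cores each use every logical core of the host exactly once.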

We measure resource consumption (CPU utilization, memory

usage, network I/O, and disk I/O) during the execution

of application programs by using the dstat command. The

interference by the measurement tool on the application per-

formance should be small. We conducted preliminary experi-

ments to compare the execution times of application programs

with and without dstat. Specifically, we executed dstat

every second to collect CPU utilization, memory usage, net-

work I/O, and disk I/O during the execution of each application

program and compared the execution time with the original

execution time. The results show that the average increase

of the application execution time with the measurement tool

is less than 0.06% and the maximum increase is 1.96%. We

believe that this level of interference is acceptable.
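
The overhead comparison described above amounts to a relative difference between two execution times. The sketch below shows one way to compute the average and maximum increase; the function names and the sample timings are hypothetical and are not the measurements behind the 0.06% and 1.96% figures.

```python
from statistics import mean


def overhead_percent(t_plain: float, t_monitored: float) -> float:
    """Relative increase in execution time caused by monitoring, in percent."""
    if t_plain <= 0:
        raise ValueError("baseline execution time must be positive")
    return 100.0 * (t_monitored - t_plain) / t_plain


def summarize(runs):
    """Return (average, maximum) overhead over (plain, monitored) time pairs."""
    overheads = [overhead_percent(plain, monitored) for plain, monitored in runs]
    return mean(overheads), max(overheads)


# Hypothetical execution times in seconds: (without dstat, with dstat).
sample_runs = [(200.0, 200.0), (100.0, 101.0), (400.0, 402.0)]
average, worst = summarize(sample_runs)
print(f"average {average:.2f}%, maximum {worst:.2f}%")
```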

B. Benchmarks

Cloud computing is now widely used in business commu-

nities, and database and web server applications are typical

workloads. Furthermore, academic communities are currently

interested in executing HPC applications in the cloud comput-

ing environment. We expect that the workload in the datacenter

will consist of HPC applications, database applications, and

web server applications in the near future. Thus, we selected

three types of benchmarks in our experiments.

1) HPC Applications: We selected the NAS Parallel Bench-

marks (NPB) [7] as HPC applications. NPB is widely used

in the HPC community as a benchmark to evaluate the

performance of parallel supercomputers. NPB consists of eight

programs representing the characteristics of computational

fluid dynamics (CFD) applications, as listed below:

• IS: kernel code of Integer Sort

• EP: kernel code of Embarrassingly Parallel computation

using the Monte Carlo method

• CG: kernel code of the Conjugate Gradient method

• MG: kernel code of the Multi-Grid method

• FT: kernel code of the discrete 3D fast Fourier Transform

• BT: application code of the Block Tri-diagonal solver

• SP: application code of the Scalar Penta-diagonal solver

• LU: application code of the Lower-Upper Gauss-Seidel

solver

2) Database Application: We selected MySQL [8] and

Sysbench [9] as database applications. MySQL is popular

open source database software and we execute MySQL on

the database server in our experiments. We execute Sysbench,

which generates the workload for the database server, on the

client machine. Sysbench consists of benchmark programs for

evaluating the following performances of database servers:

• file I/O performance

• scheduler performance

• memory allocation and transfer speed

• POSIX thread implementation performance

• database server performance (on-line transaction process-

ing; OLTP benchmark)

We execute the OLTP benchmark in our experiments, where

a mixture of search and query processes is executed.

3) Web Server Application: We selected Apache [10],

which is popular open source web server software. We created

client scripts to generate the workload for the web server.

While some benchmarks to generate the web server workload

are available, e.g., ApacheBench [11], they are usually used to

evaluate the robustness of a web server and generate too much

load on the web server. We did not use the existing bench-

marks, because the goal of our experiments is to investigate

performance under a realistic load.

IV. Experimental Results

This section presents our experimental results. First, we

investigate the resource consumption of a single program

included in NPB. The results are used to analyze the results

in the following experiments. Then, we present the experi-

mental results on the performance degradation of application

programs with VM consolidation.

A. Resource Consumption by a Single Application

Before investigating the performance degradation of appli-

cation programs with VM consolidation, we measured the

resource consumption in a single program included in NPB.

In the experiments, we execute each NPB program with the

problem size of CLASS C. Each program is executed with

eight MPI processes, where two processes are assigned to a

VM with two virtual CPU cores, and a total of four VMs are

utilized in the physical computing server. Then, we measured

CPU utilization, memory usage, network I/O, and disk I/O

during execution of the program.

Fig. 1. Network I/O in NPB FT

Fig. 2. CPU utilization in NPB FT

Figure 1 through Figure 4 show the resource consumption

on the single VM during the execution of FT in the NPB. FT

computes the discrete 3D fast Fourier Transform and performs

communication among all processes during its execution. This

procedure increases the network I/O, as presented in Figure 1.

The high network I/O causes much I/O wait time in the CPU

and the CPU utilization fluctuates, as presented in Figure 2.

We also see the fluctuation of the CPU utilization and the high

network I/O in the results of IS.

For the results of other programs in the NPB, the network

I/O is small and the CPU utilization is stable and high. For

example, Figure 5 and Figure 6 show the CPU utilization in

LU and the network I/O in EP, respectively. The memory usage

in all programs is stable, as in the example of FT in Figure 3.

All programs except EP consume high memory capacity, e.g.,

1,800 MB in FT. The disk I/O in all programs is sporadic, as

in the example of FT in Figure 4.

Fig. 3. Memory usage in NPB FT

Fig. 4. Disk I/O in NPB FT

Table I summarizes the resource consumption in NPB

programs on the VM. We omit the results of CG, BT and

SP because they exhibit similar results to others in the table.

Table II presents the resource consumption in the physical computing server,

on which four VMs run. The results in Table II indicate

that resources in the physical computing server are not fully

occupied by a single program. This means that there is room to

put VMs executing another application program in the physical

computing server.
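
As a rough illustration of the headroom argument, the sketch below adds the maximum host memory usage of two programs from Table II and compares the sum with the 16 GB of the server. It rests on assumptions that are ours: peak usages are treated as additive, 16 GB is taken as 16,384 MB, and hypervisor and host operating system overhead are ignored. The names are hypothetical.

```python
HOST_MEMORY_MB = 16 * 1024  # 16 GB server, assuming 1 GB = 1024 MB

# Maximum memory usage on the physical server per program (Table II), in MB.
HOST_MAX_MEMORY_MB = {
    "EP": 162.4,
    "MG": 3909.2,
    "IS": 1832.6,
    "LU": 831.7,
    "FT": 7719.1,
}


def combined_memory_mb(a: str, b: str) -> float:
    """Naive estimate: sum of the two single-program peaks."""
    return HOST_MAX_MEMORY_MB[a] + HOST_MAX_MEMORY_MB[b]


def fits_in_memory(a: str, b: str) -> bool:
    """True if the naive estimate stays within the host memory."""
    return combined_memory_mb(a, b) <= HOST_MEMORY_MB


print(round(combined_memory_mb("FT", "MG"), 1))  # 11628.3
print(fits_in_memory("FT", "MG"))                # True
```

The estimate for FT with MG is close to the 11,538 MB measured in the consolidation experiment reported in Section IV-B, which suggests that memory capacity alone is not the bottleneck.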

B. Performance Degradation with VM Consolidation

In this section, we present the experimental results of the

performance degradation of the NPB benchmark programs

with VM consolidation. We choose two NPB programs and

execute the programs in the physical computing server. In

other words, we execute each program on four VMs (with

eight virtual CPU cores) and put all eight VMs in the physical

computing server. All virtual CPU cores are utilized by the two

programs. The experimental results in Section IV-A indicate

that all NPB programs highly utilize the CPU resources. It is

obvious that the performance is significantly degraded if two

programs share the virtual CPU core due to excessive

context switching. Furthermore, the performance degradation by

contention for CPU resources between computation intensive

programs has been discussed in the literature. Thus, we avoid

two programs sharing a virtual CPU core in order to investigate

the performance degradation caused by other factors.

Fig. 5. CPU utilization in NPB LU

Fig. 6. Network I/O in NPB EP

We also execute an NPB program together with a program from

the other benchmarks, i.e., MySQL or Apache. In this experiment,

we choose one NPB program and execute the program on

four VMs. Then, we execute the MySQL server program on

the other four VMs and put all eight VMs in the physical

computing server. We also perform the same experiments using

Apache. For MySQL, we execute the OLTP benchmark in

Sysbench with the complex mode. In this setting, the bench-

mark program repeats the execution of 10 point queries and

eight complex queries (e.g., a range query). Our preliminary

experiments show that a cache on the database server was filled

3,500 seconds after the start of the application program. Thus,

we measure the resource consumption 3,500 seconds after the

start of the application program, so that we can investigate

the resource consumption in the steady state. For Apache, we

execute the Apache HTTP server on four VMs and execute

the script program to access the server on the client machine.

The script program recursively downloads the contents on the

HTTP server. The contents are created by imitating those on

commercial web sites in Japan [12].

TABLE I
Resource Consumption in a VM

                            EP        MG        IS        LU        FT
CPU utilization             high      high      high      high      high
                            stable    stable    unstable  stable    unstable
max. CPU utilization [%]    100       99        97        99        100
network I/O                 low       low       high      low       high
                            sporadic  frequent  frequent  frequent  frequent
max. network I/O [Mbps]     < 0.1     10.0      72.8      3.3       110.0
memory usage                low       high      high      high      high
                            stable    stable    stable    stable    stable
max. memory usage [MB]      13.5      926.2     421.6     216.7     1811.6
disk I/O                    sporadic  sporadic  sporadic  sporadic  sporadic
max. disk I/O [MB]          3.6       3.7       3.3       2.1       29.7

TABLE II
Resource Consumption in the Physical Computing Server

                            EP        MG        IS        LU        FT
max. CPU utilization [%]    51        47        49        52        56
max. network I/O [Mbps]     < 0.1     35.8      305.8     17.3      272.2
max. memory usage [MB]      162.4     3909.2    1832.6    831.7     7719.1
max. disk I/O [MB]          10.4      10.1      11.4      11.3      28.3

Table III shows the performance degradation of NPB pro-

grams when we execute the program with another program,

as indicated in the column of “co-executed program” in

the physical computing server. We compute the performance

degradation, $D_{perf}$, by the formula below:

$$D_{perf} = \frac{T_{consolidation} - T_{single}}{T_{single}} \qquad (1)$$

Here, $T_{consolidation}$ indicates the execution time of the

program when we execute the program with another program,

and $T_{single}$ means the execution time of the program when

we execute the program alone. For example, the performance

degradation of MG when we execute MG with EP is 48. This

result means that the execution time of MG running with EP

is 48% longer than the execution time when we execute MG

alone.
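
The computation in formula (1) can be sketched as follows. Formula (1) as printed yields a ratio, whereas the values quoted in the text read as percentages, so this sketch multiplies by 100; that scaling is our reading. The function name and the example timings are hypothetical.

```python
def degradation_percent(t_consolidation: float, t_single: float) -> float:
    """Performance degradation of formula (1), expressed in percent."""
    if t_single <= 0:
        raise ValueError("single-run execution time must be positive")
    return 100.0 * (t_consolidation - t_single) / t_single


# Hypothetical timings: 100 s alone, 148 s when co-executed.
print(degradation_percent(148.0, 100.0))  # 48.0
```

With these made-up timings the result is 48, which corresponds to the reading of the MG with EP entry given in the text: the consolidated run takes 48% longer than the run alone.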

Table III shows that the performances of FT and IS are sig-

nificantly degraded among the programs, and the degradation

in FT is the worst. From the results in the previous section,

we can see that both programs require high memory capacity

as well as high and frequent network I/O. The performance

degradation in EP is the smallest, because EP requires little

memory capacity and low network I/O. The results indicate

that the performance of an application program requiring high

memory capacity and high network I/O is highly affected by

VM consolidation. This phenomenon is also observed in the

comparison of the performance degradation between two

application programs in Table III. For example, the performance

degradation in FT running with MG is 188, while that in MG

running with FT is 59. The performance degradation in FT is

much larger than that in MG, and we can see that FT requires

more memory capacity and network I/O.

TABLE III
Application Performance Degradation with VM Consolidation

                           performance degradation [%]
co-executed application    EP     MG     IS     LU     FT
EP                         -      48     92     31     125
MG                         18     -      110    86     188
IS                         22     109    -      34     159
LU                         18     134    126    -      167
FT                         19     59     87     62     -
MySQL                      1      13     6      9      14
Apache                     3      12     6      5      14

Regarding the performance degradation in FT, FT is strongly affected

by MG and IS. Table I shows that MG consumes 926.2

MB memory capacity and IS performs 72.8 Mbps network

I/O. This result indicates that running two programs with

high memory usage or network I/O causes high overhead in

the hypervisor. Note that the memory usage in the physical

computing server when we execute FT and MG is 11,538

MB and the free memory space is still available during the

experiments.

The performance degradation of all programs in the NPB

when we execute the program with MySQL or Apache is

small. The reason is that both MySQL and Apache consume

fewer resources compared with the NPB programs.

C. Discussion

Our experimental results indicate that the interference

among VMs executing two HPC application programs with

high memory usage and high network I/O in the physical

computing server can significantly degrade the application per-

formance. The results lead us to a straightforward idea: putting

VMs executing an HPC application program together with

VMs executing a non-HPC application program to guarantee

application performance with VM consolidation. We can see

the effectiveness of the simple strategy in Table III, where the

performance of the NPB programs is not much affected by

MySQL and Apache. However, Table III also indicates that

VMs can execute two HPC application programs together in

the physical computing server with low performance degrada-

tion. For example, the performance degradation in LU is low

when we execute LU with EP or IS. The reason is that LU

requires relatively low resource capacity compared with others.

The results suggest that we can put VMs executing HPC

application programs together in the physical computing server

with low application performance degradation by carefully

selecting the application programs.
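
One way to make "carefully selecting the application programs" concrete is to rank NPB pairs by the worse of the two slowdowns in Table III. The sketch below is not the consolidation strategy of this paper, which is left as future work. The worst-of-both-directions criterion, the 50% limit, and all names are our assumptions; only the numbers come from Table III.

```python
from itertools import combinations

# SLOWDOWN[co_runner][program]: slowdown of `program` in percent when it
# shares the host with `co_runner` (NPB rows of Table III).
SLOWDOWN = {
    "EP": {"MG": 48, "IS": 92, "LU": 31, "FT": 125},
    "MG": {"EP": 18, "IS": 110, "LU": 86, "FT": 188},
    "IS": {"EP": 22, "MG": 109, "LU": 34, "FT": 159},
    "LU": {"EP": 18, "MG": 134, "IS": 126, "FT": 167},
    "FT": {"EP": 19, "MG": 59, "IS": 87, "LU": 62},
}


def pair_cost(a: str, b: str) -> int:
    """Worst slowdown suffered by either program when a and b are co-located."""
    return max(SLOWDOWN[b][a], SLOWDOWN[a][b])


def acceptable_pairs(limit: int):
    """Pairs whose worst-case slowdown is at most `limit`, best first."""
    pairs = [(pair_cost(a, b), a, b) for a, b in combinations(sorted(SLOWDOWN), 2)]
    return sorted(p for p in pairs if p[0] <= limit)


# Hypothetical tolerance of 50% slowdown for either program.
print(acceptable_pairs(50))  # [(31, 'EP', 'LU'), (48, 'EP', 'MG')]
```

Taking the maximum over both directions is stricter than looking at one program only: LU slows down by just 34% next to IS, but IS slows down by 126% next to LU, so that pair is rejected under this criterion.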

We need more quantitative analyses to create a VM con-

solidation strategy for executing HPC applications with a per-

formance guarantee, and this is our future work. However, we

believe that this paper provides some preliminary experimental

results for the discussion of a VM consolidation strategy.

V. Conclusion

We present experimental results to discuss the performance

degradation of application programs on VMs running in a

physical computing server with a focus on HPC applications.

We select NAS Parallel Benchmarks (NPB) as the HPC

benchmark and investigate the performance degradation of

the program when we put VMs running an HPC program

together with VMs running another NPB program or non-HPC

benchmark programs. The experimental results indicate that

the interference among VMs executing two HPC application

programs with high memory usage and high network I/O

in a physical computing server significantly degrades the

application performance. However, the results also suggest

that VMs do have the capability to execute HPC application

programs together in a physical computing server with low

application performance degradation. Our future work includes

more quantitative analysis of the application performance with

VM consolidation and creating a more sophisticated performance

model.

Acknowledgment

This work was supported in part by JSPS KAKENHI Grant

Number 24240006.

References

[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, “Xen and the art of virtualization,” SIGOPS Oper. Syst. Rev., vol. 37, no. 5, pp. 164–177, Oct. 2003.

[2] S. Srikantaiah, A. Kansal, and F. Zhao, “Energy Aware Consolidation for Cloud Computing,” in Proc. of the 2008 conference on Power aware computing and systems (HotPower ’08). USENIX Association, 2008.

[3] Y. Koh, R. C. Knauerhase, P. Brett, M. Bowman, Z. Wen, and C. Pu, “An Analysis of Performance Interference Effects in Virtual Environments,” in Proc. of IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS 2007). IEEE Computer Society, 2007, pp. 200–209.

[4] X. Pu, L. Liu, Y. Mei, S. Sivathanu, Y. Koh, and C. Pu, “Understanding Performance Interference of I/O Workload in Virtualized Cloud Environments,” in Proc. of the 2010 IEEE 3rd International Conference on Cloud Computing (CLOUD ’10). IEEE Computer Society, 2010, pp. 51–58.

[5] O. Tickoo, R. Iyer, R. Illikkal, and D. Newell, “Modeling virtual machine performance: challenges and approaches,” SIGMETRICS Perform. Eval. Rev., vol. 37, no. 3, pp. 55–60, 2010.

[6] Q. Zhu, J. Zhu, and G. Agrawal, “Power-aware Consolidation of Scientific Workflows in Virtualized Environments,” in Proc. of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC10). IEEE Computer Society, 2010, pp. 1–12.

[7] NASA. NAS Parallel Benchmarks. [Online]. Available: http://www.nas.nasa.gov/publications/npb.html

[8] ORACLE. MySQL. [Online]. Available: http://dev.mysql.com/

[9] Alexey Kopytov. SysBench: a system performance benchmark. [Online]. Available: http://sysbench.sourceforge.net/

[10] The Apache Software Foundation. Apache HTTP SERVER PROJECT. [Online]. Available: http://httpd.apache.org/

[11] The Apache Software Foundation. ab - Apache HTTP server benchmarking tool. [Online]. Available: http://httpd.apache.org/docs/2.0/programs/ab.html

[12] YAHOO Japan. [Online]. Available: http://www.yahoo.co.jp
