JOURNAL OF LA CloudGenius: A Hybrid Decision Support ... · PDF filefor, = >> < >>:) = >> <...

14
0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers JOURNAL OF L A T E X CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 1 CloudGenius: A Hybrid Decision Support Method for Automating the Migration of Web Application Clusters to Public Clouds Michael Menzel, Member, IEEE, Rajiv Ranjan, Member, IEEE, Lizhe Wang, Senior Member, IEEE, Samee U. Khan, Senior Member, IEEE, Jinjun Chen, Senior Member, IEEE Abstract—With the increase in cloud service providers, and the increasing number of compute services offered, a migration of information systems to the cloud demands selecting the best mix of compute services and VM (Virtual Machine) images from an abundance of possibilities. Therefore, a migration process for web applications has to automate evaluation and, in doing so, ensure that Quality of Service (QoS) requirements are met, while satisfying conflicting selection criteria like throughput and cost. When selecting compute services for multiple connected software components, web application engineers must consider heterogeneous sets of criteria and complex dependencies across multiple layers, which is impossible to resolve manually. The previously proposed CloudGenius framework has proven its capability to support migrations of single-component web applications. In this paper, we expand on the additional complexity of facilitating migration support for multi-component web applications. In particular, we present an evolutionary migration process for web application clusters distributed over multiple locations, and clearly identify the most important criteria relevant to the selection problem. Moreover, we present a multi- criteria-based selection algorithm based on Analytic Hierarchy Process (AHP). Because the solution space grows exponentially, we developed a Genetic Algorithm (GA)-based approach to cope with computational complexities in a growing cloud market. Furthermore, a use case example proofs CloudGenius’ applicability. To conduct experiments, we implemented CumulusGenius, a prototype of the selection algorithm and the GA deployable on hadoop clusters. Experiments with CumulusGenius give insights on time complexities and the quality of the GA. Index Terms—cloud migration, migration process, selection problem, criteria set, decision-making, decision support. 1 I NTRODUCTION A Web application is a computer software appli- cation, which interacts with users through a frontend programmed using browser-based language (such as JavaScript and HTML). Web applications are typically accessed by million of users over the internet via a common web browser software (e.g., Internet explorer, Firefox, etc.). Common web applications in- clude webmail, online retail sales, online auctions, wikis and the like. 1.1 Motivation and The Research Problem In the traditional web application hosting model [1], hardware needs to be provisioned for handling peak load. However, uncertain traffic periods and unex- pected variations in workload patterns may result in low utilization rates of expensive hardware. There- fore, the traditional approach of provisioning for peak Michael Menzel ([email protected]) is with Karlsruhe Institute of Technology. Rajiv Ranjan ([email protected]) is with CSIRO Computational Infor- matics. He is the corresponding author for the paper. Lizhe Wang ([email protected]) is with Chinese Academy of Sciences. Samee U. Khan ([email protected]) is with North Dakota State University. Jinjun Chen([email protected]) is with University of Technology Sydney. workloads leads to unused or wasted computing cy- cles when traffic is low. With the advent of cloud computing, it is expected that more and more web applications will be hosted using cloud-based, vir- tualized services. Cloud computing [2] provides an elastic ICT (Information Communication Technology) infrastructure for the most demanding and dynamic web applications. Clouds provide an infrastructure (if optimally selected and allocated) that can match ICT cost with workload patterns in real-time. Cloud 1 ser- vice types can be abstracted into three layers: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) [3]. Cloud computing is a disruptive technology and an adoption brings along risks and obstacles. Risks can turn into effective problems or disadvantages for organizations that may decide to move web applica- tions to the cloud. Such a decision depends on many factors, from risks and costs to security issues, service level and QoS expectations. A migration from an organization-owned data center to a cloud infrastruc- ture service implies more than few trivial steps. Steps of a migration to PaaS offerings, such as Google App Engine, would differ in several regards. The following 1. Background information on cloud computing, VM images, and meta-data available for VM and compute services is given in Supplemental Material

Transcript of JOURNAL OF LA CloudGenius: A Hybrid Decision Support ... · PDF filefor, = >> < >>:) = >> <...

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 1

CloudGenius: A Hybrid Decision SupportMethod for Automating the Migration of Web

Application Clusters to Public CloudsMichael Menzel, Member, IEEE, Rajiv Ranjan, Member, IEEE, Lizhe Wang, Senior Member, IEEE,

Samee U. Khan, Senior Member, IEEE, Jinjun Chen, Senior Member, IEEE

Abstract—With the increase in cloud service providers, and the increasing number of compute services offered, a migration ofinformation systems to the cloud demands selecting the best mix of compute services and VM (Virtual Machine) images froman abundance of possibilities. Therefore, a migration process for web applications has to automate evaluation and, in doingso, ensure that Quality of Service (QoS) requirements are met, while satisfying conflicting selection criteria like throughput andcost. When selecting compute services for multiple connected software components, web application engineers must considerheterogeneous sets of criteria and complex dependencies across multiple layers, which is impossible to resolve manually.The previously proposed CloudGenius framework has proven its capability to support migrations of single-component webapplications. In this paper, we expand on the additional complexity of facilitating migration support for multi-component webapplications. In particular, we present an evolutionary migration process for web application clusters distributed over multiplelocations, and clearly identify the most important criteria relevant to the selection problem. Moreover, we present a multi-criteria-based selection algorithm based on Analytic Hierarchy Process (AHP). Because the solution space grows exponentially,we developed a Genetic Algorithm (GA)-based approach to cope with computational complexities in a growing cloud market.Furthermore, a use case example proofs CloudGenius’ applicability. To conduct experiments, we implemented CumulusGenius,a prototype of the selection algorithm and the GA deployable on hadoop clusters. Experiments with CumulusGenius give insightson time complexities and the quality of the GA.

Index Terms—cloud migration, migration process, selection problem, criteria set, decision-making, decision support.

F

1 INTRODUCTION

AWeb application is a computer software appli-cation, which interacts with users through a

frontend programmed using browser-based language(such as JavaScript and HTML). Web applications aretypically accessed by million of users over the internetvia a common web browser software (e.g., Internetexplorer, Firefox, etc.). Common web applications in-clude webmail, online retail sales, online auctions,wikis and the like.

1.1 Motivation and The Research Problem

In the traditional web application hosting model [1],hardware needs to be provisioned for handling peakload. However, uncertain traffic periods and unex-pected variations in workload patterns may result inlow utilization rates of expensive hardware. There-fore, the traditional approach of provisioning for peak

Michael Menzel ([email protected]) is with Karlsruhe Institute of Technology.Rajiv Ranjan ([email protected]) is with CSIRO Computational Infor-matics. He is the corresponding author for the paper.Lizhe Wang ([email protected]) is with Chinese Academy of Sciences.Samee U. Khan ([email protected]) is with North Dakota StateUniversity.Jinjun Chen([email protected]) is with University of TechnologySydney.

workloads leads to unused or wasted computing cy-cles when traffic is low. With the advent of cloudcomputing, it is expected that more and more webapplications will be hosted using cloud-based, vir-tualized services. Cloud computing [2] provides anelastic ICT (Information Communication Technology)infrastructure for the most demanding and dynamicweb applications. Clouds provide an infrastructure (ifoptimally selected and allocated) that can match ICTcost with workload patterns in real-time. Cloud 1 ser-vice types can be abstracted into three layers: Softwareas a Service (SaaS), Platform as a Service (PaaS), andInfrastructure as a Service (IaaS) [3].

Cloud computing is a disruptive technology andan adoption brings along risks and obstacles. Riskscan turn into effective problems or disadvantages fororganizations that may decide to move web applica-tions to the cloud. Such a decision depends on manyfactors, from risks and costs to security issues, servicelevel and QoS expectations. A migration from anorganization-owned data center to a cloud infrastruc-ture service implies more than few trivial steps. Stepsof a migration to PaaS offerings, such as Google AppEngine, would differ in several regards. The following

1. Background information on cloud computing, VM images,and meta-data available for VM and compute services is given inSupplemental Material

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 2

Compute Services

VM Images

cs1

cs2

cs3

Web Application Server

Presentation Tier

Business Logic Tier

Cloud Provider 1

2. match

3. instantiate

4. deploy

Cloud Provider n

...

2. match

Decision Path

Action

Legend

Web Application Implementation

1. implement

Database Tier

Load-Balancer

Fig. 1. Example of a VM Image (PaaS) and Compute (IaaS) Service Selection

steps outline a migration of an organization’s webapplication to an equivalent on a IaaS such as AmazonWeb Services (AWS) EC2, GoGrid, Rackspace, and thelike.

First, an appropriate cloud infrastructure service, orIaaS offering, is selected. This demands a well-thoughtdecision to be made that considers all relevant factors,such as price, Service Level Agreement (SLA) level,network latency, data center location, availability, andsupport quality. The basis of a selection are data andQoS measurements regarding each factor that describethe quality and make service options comparable.For instance, a low-end Compute service of MicrosoftAzure is 30% more expensive than the comparableAWS EC2 Compute service, but Azure can processapplication workload twice as quickly.

Secondly, the existing web application and its exe-cution platform, i.e., a web/application server, a load-balancer, and a database, are transferred from thelocal data center to the selected cloud infrastructureservice. Therefore, the web application and servermust be converted into a form expected by a cloudinfrastructure service. Typically, in this step, the wholeweb application is bundled as a VM image thatconsists of a software stack, from operating systemand software platforms to the software containingthe business logic. It is often unachievable to convertan existing web application and its server directlyto a VM image format compatible with a certaincloud infrastructure service. Therefore, an adequateexisting VM image offered by the cloud providercan be chosen and customized. For example, onecan select existing VM images provided by bitnami[4] or thecloudmarket.com [5] to cloudify an existingweb application system component (e.g., application

server or database). Additionally, markets for cloudVM images exist, such as the AWS marketplace [6].

Existing images vary in many aspects, such asunderlying operating system, software inside the soft-ware stack, or software versions. Therefore, selectinga functionally correct VM image becomes a complextask. Besides, choosing a comprehensive VM imagehelps to minimize the effort of installing a softwarestack on a basic image. The resulting VM imageshould reflect the original application server and atleast replace it in a sufficing manner. Next, a migrationstrategy needs to be defined and applied to makethe transition from the local data center to the cloudinfrastructure service. A migration strategy definesprocedures and the course of action to transitiona system and its data to the target state. In casedata must be incorporated in the web applicationmigration, all data on the original machine must betransferred to the new system in the cloud. Moreover,all configurations and settings must be applied onthe new web server in the cloud to finish creatingan appropriate equivalent.

Optimal web application server QoS in cloud en-vironments demands appropriate configuration forboth VM images and cloud infrastructure services.However, no detailed comprehensive cost, as well asperformance or feature comparison of cloud servicesexists. The key problem in mapping web applicationserver components to cloud data centers, as depictedin Figure 1, is selecting the best collection of VMimages and compute services to ensure that a system’sQoS targets are met. Furthermore, another challengeis to satisfy conflicting selection criteria related to soft-ware (e.g., operating system and popularity) and com-pute services (e.g., latency, cost, data center location,

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 3

and so on). Additionally, components might be placedat different locations or providers to prevent outagesand generate costs for the Internet connectivity.

Analytic

Hierarchy

Process

Genetic

Algorithm

Hadoop Cluster

Selection

Criteria

Ranking of

VM Images

and

Compute

Services

Fig. 2. Overview of Hybrid Multi-Goal OptimizationHeuristic Method (HMOHM) Approach

1.2 Overview of Methods and ContributionsTo address the complexities when migrating multi-component web application server clusters, we ex-pand the migration framework CloudGenius. TheCloudGenius framework [7] translates cloud serviceselection steps into multi-criteria decision-makingproblems using (MC2)2 and the Analytic Hierar-chy Process (AHP) [8]. The framework, furthermore,determines the most viable VM images and com-patible compute services at IaaS layer. CloudGeniusoriginally provides a framework that guides througha cloud migration process and offers a model andmethod to determine the best combined and com-patible choice of VM images and compute servicesfor a single web server. With enhancements to theframework, we provide comprehensive support formigrations of distributed, multi-component (web/ap-plication server, database, and load-balancer) web ap-plication clusters while factoring in data flow depen-dencies of components raising network traffic costs.

As the solution space for the problem grows ex-ponentially, we developed a Hadoop and GeneticAlgorithm (GA)-based approach to cope with com-putational complexities in a growing market of cloudservice offerings. By combining a GA with AHP, wecreated a Hybrid Multi-Goal Optimization HeuristicMethod (HMOHM). Details on the HMOHM can befound in Figure 2. A new challenge resulting fromthe combination of GA and AHP is to transfer subjec-tive opinions stated once into an AHP-based fitnessfunction. Therefore, we developed a novel approachfor AHP-based fitness functions in GAs. AHP requirespair-wise comparisons among all alternatives to nor-malize values for an absolute scale. This becomesunsolvable with a potentially infinite number of alter-natives considered in a GA. Therefore, we needed todevelop a novel way for speeding up the executiontime for HMOHM. To this end, we implemented aparallel version of the HMOHM over hadoop clusters.Specifically, the main contributions of this paper are:• We clearly identify the most important selection

criteria, selection goals, and cloud service alter-natives, considering the use case of migrating aweb application cluster to public cloud servicessuch as Amazon EC2 and GoGrid.

• We extend analytical formulations and models ofour previous work [7] for handling the migration

of web server cluster components across multiplecloud data centers spanning over geographicallydistributed network boundaries.

• A hybrid decision making technique is pro-posed that combines multi-criteria decision mak-ing (AHP) and evolutionary optimization tech-niques (genetic algorithms (GAs)) for selectingbest compute service and VM image.

• A comprehensive experimental evaluation is car-ried out based on a realistic scenario for verifyingthe performance of the proposed decision makingtechnique.

The paper is structured as follows. First, we dis-cuss related work in Section 2. Then, the extendedCloudGenius framework is presented in Section 3.The framework introduces a migration process andformal model which forms the basis for decisionsupport within the process. Afterwards, we presentCumulusGenius, a prototypical implementation of theframework in Section 4. In Section 5, we present ause case and the results of experiments on the timecomplexity and search space of CumulusGenius. Weconclude and discuss future work in Section 6.

2 RELATED WORK

Multi-component web services have been introducedand defined by the web service community [9]. Multi-component setups in the cloud are described in theCAFE framework [10] and in TOSCA [11].

In the context of decision-making and cloud com-puting, a range of approaches apply service match-ing using requirements from service level agreements[12], [13], [14], [15]. Other methods employ decision-making methods. Saripalli and Pingali [16], Li etal. [17], and Han et al. [18] proposed multi-criteriadecision-making methods to evaluate decision sce-narios in the cloud context, including cloud servicesand providers. Rehman et al. [19] gave an overviewof multi-criteria approaches in the cloud context.Moreover, Chan and Chieu [20] provided serviceand provider evaluation methods which lack multi-component support. Dastjerdi et al. [21] consideredvirtual machine images in a matchmaking-based ap-proach.

Multiple approaches for multi-component setups inthe cloud have applied optimization techniques [22],[23], [24], [25] for selecting hardware resources asa cloud provider. In contrast, performance measure-ment techniques [17] compare cloud infrastructureservices to quantitative criteria (throughput, etc.) froma customer side. However, the need to consider VMimages has largely been ignored. Also, existing ap-proaches are missing a migration process with trans-parent decision support and adaptability to customcriteria.

Khajeh-Hosseini et al. [26] developed the CloudAdoption Toolkit that offers a high-level decision

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 4

support for IT system to clouds with the focus onrisk management and a workload cost model. Whilewe apply AHP to select optimal combination of webserver images (PaaS) and corresponding computeservices (IaaS), Godse et al. [27] applied AHP forselecting optimal SaaS product.

Zheng et al. [28] proposed an approach to cloudservices’ selection based on similarity of concepts inparameters based on WordNet. On the other hand,Kang et al. [29] and Zhang et al. [30] uses ontologyfor semantic similarity based cloud service searchand reasoning. While these approaches aids in under-standing of the configuration of cloud services, theydo not offer any support for automating the process ofmigrating web application services to public clouds.

To the best of our knowledge, we are firstto propose a hybrid decision making techniquethat combines evolutionary optimization methods,multi-criteria decision making methods (AHP), andmassively parallel processing Hadoop programmingmodel to enable fast, optimized and flexible selectionof cloud VM images and corresponding infrastructureservices.

3 CLOUDGENIUS FOR WEB APPLICATIONS

To additionally support migrations of web applica-tions with a compute cluster architecture (referredto as web application clusters) to cloud infrastruc-tures, we propose an extension of the CloudGeniusframework. One feature of the extended frameworkis the evolutionary cloud migration process model.The process model integrates existing migration ap-proaches and methods to support multi-criteria-baseddecisions to select cloud VM images and computeservices for multiple components. In the followingsubsections, we present the process model for multi-component migrations of web application clusters andgive details on the formal model. Moreover, we pointout required user input and flexibilities, and presentmethods to select VM images and services from theabundance of offerings. Finally, we give insights intothe evaluation of clusters considering network costsand constraints, and look into computational com-plexities.

3.1 Evolutionary Cloud Migration ProcessMigrations of web applications to the cloud are ex-pected to be linear transitions from the state of ”notmigrated” to ”migrated”. However, web applicationsand related software stacks introduce complexity.Therefore, migrations should involve multiple repe-titions and reconsiderations of past actions. Cloud-Genius accommodates the vicissitudes within a cloudmigration by embedding decision support methodsinto an evolutionary migration process model. Also,existing migration strategies can be employed withinthe process model.

1Initial

2Migrated

migration procedure3

Web app obsolete

migration procedure

deletion

undeployment

cancellation

reactivation

Fig. 3. States in an Evolutionary Cloud Migration

CloudGenius’ migration process model describesthe states of a web application as depicted in a finitestate machine in Figure 3. From an initial state a webapplication is migrated by an application engineerinto the cloud with an optional cancellation path.From its migrated state a web application can bemigrated within the cloud, be it between providersor services, or due to changes in the software stack.Eventually, a web application might be replaced orneeds to be disposed and moves to an inactive state.CloudGenius provides decision support in the tran-sition phases to a migrated web application, exceptreactivation. The migration process model describesthe steps within a transition and embeds the decisionsupport.

Figure 4 depicts CloudGenius’ evolutionary migra-tion process model for clusters of web applicationsin Business Process Model and Notation (BPMN) 2.0.The process is divided into two lanes: (1) ”user input”lane representing application engineers and domainexperts and (2) ”CloudGenius” lane which representsan implementation of the framework.

The process begins with an initial decision betweena cloud and non-cloud infrastructure. Subsequently,an engineer states preferences and requirements, andlets CloudGenius recommend a list of feasible clustersolutions. Every solution comprises a mapping of acluster component to a target platform constitutedby a VM image and a compute service. In case nosatisfying solution has been identified, the processallows to end an infeasible cloud migration. Other-wise, the process continues with migration steps stilloffering the chance to return to an earlier stage inthe process. Thereby, CloudGenius becomes an evolu-tionary approach that facilitates cycles in a migration.In case of an user-initiated abort in the process, anintentional reset to the fundamental cloud decisionactivity (cloud versus non-cloud) is forced. This givesan engineer the chance to reconsider the fundamentalinfrastructure decision or to skip forward and modifyretained preference and requirement statements.

Eventually, an engineer ends up with a success-fully migrated web application cluster. In case ofdiscontinuation of the process, a validation that cloudinfrastructures are yet an unsatisfying choice has beenattained. In conclusion, the process model supportsan evolutionary approach to a migration. The evolu-tionary nature is facilitated through the chances forreconsiderations by returning to earlier steps withinthe process or by reapplying the whole process model(see Figure 3).

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 5

Fig.

4.C

lust

erM

igra

tion

Pro

cess

ofth

eex

tend

edC

loud

Gen

ius

Fram

ewor

k

3.2 Formal Model of CloudGeniusCloudGenius’ formal model is extended with the no-tion of components (C), compatibilities (D,E,F ) andnetwork traffic (Nin,Nout) to facilitate clusters. Table1 summarizes all parameters of the original and theextended model.

The extended formal model of CloudGenius incor-porates l components ch which are part of a clusterF . Furthermore, the model includes m images ai andn cloud infrastructure services sj of o providers pk. Cis the corresponding set of software components in acluster, A the set of VM images, S the set of cloud in-frastructure services, P the set of cloud providers, andI the set of component connections. Every VM imageai, compute service sj and any combination thereof(xl) owns numerical and non-numerical attributesnoted in the sets Aai , Asj , Bai and Bsj . χ represents avalue connected with a numerical attribute α or non-numerical attribute β. Moreover, the model introducesrA + rS + rX requirements.

F ∗ = arg max((∑ch∈C

u(ch, ai, sj)) + c(F ))

s.t. (ai,ch , sj,ch) ∈ D,∀ch ∈ C(ai,ch , aj,c′h) ∈ E,∀ch, c′h ∈ C(si,ch , sj,c′h) ∈ F,∀ch, c′h ∈ C

r = true,∀r ∈ RA ∪RS ∪Rch,X , ch ∈ C

(1)

Based on the model, CloudGenius recommends abest solution for a cluster F with best combinations(ai, sj) in Xch for every component ch. A best solutionfor a cluster has the highest value according to totalbenefit versus network traffic costs trade-off evaluatedin a function c(·) for network costs and a utilityfunction u(·) that considers an engineer’s preferences.In addition, a best solution needs to conforms withset D. In D viable combinations of VM images aideployable on compute services sj are marked. Com-patibilities between components are held in sets E andF . Compatible VM images ai and compute services sjare marked in E and F respectively. The sets Nout andNin hold the expected network traffic for componentrelations. In sum, the problem can be expressed asan optimization problem to find F ∗ with the highestvalue as in Equation 1.

3.3 Cluster ModellingWeb application clusters comprise load balancer,database server, and potentially inter-connected weband application server components. Web applicationsmight be divided into inter-connected front-end, busi-ness logic, and data layers to allow layer-independentscaling. In practive, web servers and applicationservers have merged, for example the Apache Tomcatand Red Hat JBoss Application Servers. Applicationservers not only provide an execution environment

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 6

TABLE 1Extended Formal Model in CloudGenius

Parameter DescriptionF cluster to migrate consisting of components CC = {c1, ..., cl} set of l software componentsA = {a1, ..., am} set of m cloud VM imagesS = {s1, ..., sn} set of n cloud infrastructure servicesP = {p1, ..., po} set of o cloud providersD = {d1, ..., dq} set of q image-service compatibilities (ai, sj)E = {(ai, aj)|ai, aj ∈ A)} set of inter-image compatibilitiesF = {(si, sj)|si, sj ∈ S)} set of inter-service compatibilitiesI = {(ci, cj)|ci, cj ∈ C} set of relations between componentsRA = {rA,1, ..., rA,rA} set of rA cloud image requirementsRS = {rS,1, ..., rS,rS} set of rS cloud service requirementsRch,X = {rX,1, ..., rX,rX } set of rX combination requirements for chτh cloud image ai, service sj , provider pk , or combination xl as τh ∈ A∪S∪P ∪XAτh = {ατh,1, ..., ατh,t} set of t numerical attributes of ai, sj , pk , xlBτh = {βτh,1, ..., βτh,u} set of u non-numerical attributes of ai, sj , pk , xlχ(α) value of numerical attribute α in CloudGenius databaseχ(β) value of non-numerical attribute β in CloudGenius databasevτh value of h-th τ calculated with (MC2)2

Xch = {xl = (ai, sj)|ai ∈ A, sj ∈ S} set of cloud image and service combinations for chNout = {nout,(ci,cj)|ci, cj ∈ C} set of outgoing traffic between components (in byte)Nin = {nin,(ci,cj)|ci, cj ∈ C)} set of incoming traffic between components (in byte)

for web applications, but also contain a web server tohandle http requests. Our approach, thus, focuses onapplication servers in the cluster modelling.

In this paper, in regards of selection of databaseservers, we restrict the feature selection to VM im-ages (e.g., Bitnami MySQL image [31], 3Tera re-lational database images [32], and BitNami Post-gresSQL image [33]) that offer relational databasefunctionalities. Relational database VM images in-clude pre-configured and pre-installed traditional re-lational database systems, e.g., MySQL, SQL Server,and PostGres. In such database systems, the data isstored in tables that have fixed schema. SQL is used asthe generic language that allows to execute projectionsand insert, delete, and update operations on the data.Fundamentally, relational databases have proven tobe good at managing structured data, especially inthe application scenario where transactional integrity(ACID properties) is a requirement. In the futurework, we intend to extend the HMOHM with theability to select non-relational database cloud ser-vices (NoSQL), such as Amazon DynamoDB, HBase,and MongoDB. Unlike relational database services,NoSQL services do not yet have support for ACIDtransaction principles, but rather offer weaker con-sistency properties. Besides, data access in NoSQLsystems is typically based on predefined access prim-itives such as key-value pairs.

In an initial step, an application engineer has tomodel the application as a cluster. All components ofthe cluster setup must be added as elements ch to setC and interconnections must be defined in form ofcomponent pairs in set I . The engineer and domainexperts must state the expected amount of outgoingand incoming data for each component in bytes inthe set Nout and Nin. Some components might be

component 1

component i

component j

component j+2

component n

feature: load balancer

feature: load balancer

feature: appl. server

feature: appl. server

feature: appl. server

component j+1 feature: appl. server

Fig. 5. Example of a Cluster Modeladded multiple times for scaling, or with distinct re-quirements and goals for fault-tolerance. Furthermore,a software feature must be assigned that categorizesthe component. Available feature categories are webserver, relational database server, and load balancer.A feature limits the set of plausible VM images and isa mandatory requirement restraining the set of viableVM images.

The extended CloudGenius approach is capable offinding a best solution for any combination of inter-connected components. Nevertheless, we suggest atypical cluster setup depicted in Figure 5 consisting ofa set of one or more load balancers that are connectedto multiple application servers which are partiallyinter-connected.

3.4 Software Component Requirements & Prefer-encesSimilar to the original CloudGenius process, for eachcomponent application engineers have to formulaterequirements and preferences. Requirements formu-lation comprises setting constraints on attributes ofVM images (pertaining to web server, applicationserver, and relational database sever) and compute

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 7

TABLE 2Requirement Types

Value Type Req. Type BooleanNumerical Equals χ(ci, α) = vr ∨ χ(cj , α)Numerical Max χ(ci, α) < vr ∨ χ(cj , α)Numerical Min χ(ci, α) > vr ∨ χ(cj , α)

Non-numerical Equals χ(ci, β) = χ(cj , β)Non-numerical Not Equals χ(ci, β) 6= χ(cj , β)Non-numerical One Of χ(ci, β) ∈ S

Fig. 6. Criteria Hierarchies Overview

services. Table 2 lists requirement filters in the (MC2)2

framework. An attribute can be required to adhere toa fixed value boundary vr. Alternatively, the attributeof a component can be included, such as the locationof c1 must not equal the location of c2. The table as-sumes χ(cH) to be an attribute value when evaluatingτ for ch. Attributes for requirement definitions canbe drawn from Tables 3 and 4 for VM images. Forcompute services the attributes can be seen in Tables 5,6, whereas for combinations the Tables 3 and 4 shouldbe consulted. Requirements of VM image attributesneed to be defined in set Rch,A, for services in setRch,S and for combinations in set Rch,X .

Given the proposed goal hierarchies depicted inFigure 6, an engineer has to state preferences asdescribed by the AHP. In addition to the computeservice goals of the original CloudGenius framework,the attributes CPU Cores, RAM Size and Disk Sizeare included in the goal hierarchy. For attributes ofload balancer and web server combinations (see Ta-bles 3 and 4), two single-levelled hierarchies must beweighted.

Finally, web application engineers have to statethe weight of the cloud VM image, compute service,and the combination thereof in the total value ofa solution. Therefore, the weights wa, ws and wattrneed to be determined. Also, some components mightbe more important than others. Hence, weights forthe importance of a component within the clusterhave to be determined in wch . Network cost attributesare not part of the goal hierarchies and weights arenot required to determine the total network costs,given a network traffic estimation for all components.However, the importance of network costs wT withinthe solution must be weighted. In parallel, the weightwQ for the value calculated from all other criteria mustbe defined.

TABLE 3VM Image Numerical Attributes

Name Influence Metric RangeHourly License Price Negative $/h 0-∞ $/hPopularity Positive % 0-100 %Age Positive Days 0-∞

TABLE 4VM image Non-numerical Attributes

Name Example ValuesVirtualization Format Xen, VMWare, . . .Operating System (OS) Linux, Windows, . . .OS Version Ubuntu 10.4, . . .Software Feature Application Server, Load Bal-

ancer, Database, . . .Software JBoss, Nginx, MySQL, . . .Software Vers. 0.8-alpha,. . .Implementation Lang. Java, Perl, Ruby, . . .Supported Impl. Lang.s Java, Perl, Ruby, . . .

3.5 Cloud VM Image and Compute Service At-tributesNumerical and non-numerical attributes for VM im-ages are listed in Tables 3 and 4, for cloud services inTables 5 and 6, corresponding to the proposed goalhierarchies. Attributes of VM images and computeservices are applicable to filters for components ofall feature categories. The selection of attributes isdrawn from own observations and literature on VMimages and services [21], [34], providing a basic setof attributes essential to cloud VM image and serviceselection. However, service attributes have a limitedapplicability to VM images. Therefore, we aim atgradually improving the list of attributes from usagedata of a publicly available prototype [35], [36] andVM image databases, such as thecloudmarket.com [5]and BitNami [4]. Moreover, cloud compute servicesown multiple numerical attributes that imply a mea-surement or benchmarking. Therefore, an integrationof existing approaches for costs [37], [26], perfor-mance, and latency measurements averaged over time[38] are beneficial.

TABLE 5Compute Service Numerical Attributes

Name Influence Metric RangeHourly Price Negative $/h 0-∞ $/hCPU Cores Positive Cores 0-∞RAM Size Positive Bit 0-∞Disk Size Positive Bit 0-∞CPU Perfomance Positive Flops 0-∞ FlopsRAM Perfomance Positive Ops/s 0-∞ Ops/sDisk Perfomance Positive B/s 0-∞ B/sMax. Latency Negative ms 0-∞ msAvg. Latency Negative ms 0-∞ msUptime Positive % 0-100%Service Popularity Positive % 0-100%Network Send Price Negative $/Byte 0-∞Network Recieve Price Negative $/Byte 0-∞Internet Send Price Negative $/Byte 0-∞Internet Recieve Price Negative $/Byte 0-∞

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 8

TABLE 6Compute Service Non-numerical Attributes

Name Example ValuesProvider Amazon, Rackspace, . . .Location Country Germany, Australia, . . .Architecture 32Bit, 64BitOwner Amazon, Rackspace, Bitnami,...

In addition to the attributes of VM images andservices, there exist attributes assignable to combi-nations of VM images and services. Attributes ofcombinations are in effect when a VM image is in-stantiated on a compute service. However, the in-stantiation of a VM image costs money and time.Therefore, measuring attributes of combinations is aconsiderable effort and filling a database with suchmeasurement data demands investments. The numberof attributes specific to instantiated VM images canbecome immense. For m VM images and n com-pute services, m × n measurements had to be made.Also, measuring instantiated VM images demandsspecific benchmarking tools such as (i) httperf [39] andApacheBench for load-balancers and (ii) TPC-W [40]for web/application and relational database servers.

Consequently, we leave the choice of combinationattributes to the application engineer and decision-maker. To lower the costs, the number of VM imagesmust be decreased by setting strict requirements. Thesoftware feature requirement is a mandatory require-ment. To make results comparable only VM imageswith similar software features should be measured.In general, CloudGenius can be extended to includeall sorts of software features. However, it is notewor-thy to mention that any additional feature requiresspecific attributes and criteria to be defined.

Nevertheless, by defining and measuring combina-tion attributes the evaluation becomes more precisein terms of considered factors. Combination attributesare included in the evaluation function h(·) in section3.6.

3.6 Best ClusterThe original CloudGenius framework evaluates VMimages, compute services, and combinations thereofwith an evaluation method based on the AnalyticHierarchy Process (AHP) created with the (MC2)2

framework. The evaluation method is translated intothree functions that map VM images, compute ser-vices, and combinations to a value. All functionsdetermine a normalized value according to a decision-maker’s preferences and requirements. For web appli-cation clusters the original approach remains, but withthe addition that evaluations are conducted per com-ponent in a cluster and, finally, are aggregated intoa whole cluster value. Furthermore, network trafficneeds to be considered in the value. Network trafficwithin a provider’s data center is typically cheaper

than traffic over the internet which is needed when acluster is distributed over multiple providers.

The component-related functions f(ch, ai, Aai , Bai),g(ch, sj , Asj , Bsj ), and h(ch, ai, sj) considercomponent-related requirements Rch,A and Rch,Sand weights (see Equations 2, 3 and 4). Function f(·)returns an evaluation value for a VM image, functiong(·) for a compute service, and function h(·) for everycombination of a VM image and compute service.Function i(·) merges values from VM image, computeservice, and combination attribute evaluations to atotal combination evaluation value (see Equation 5).

f(ch,ai, Aai , Bai) =|Aai

|∑j=0

wjχ(αj,ai,+)

|Aai|∑

j=0wjχ(αj,ai,−)

∀r ∈ Rch,A : r = true

0 else

7→ vch,ai

(2)

g(ch,sj , Asj , Bsj ) =|Asj

|∑i=0

wiχ(αi,sj ,+)

|Asj|∑

i=0wiχ(αi,sj ,−)

∀r ∈ Rch,S : r = true

0 else

7→ vch,sj

(3)

h(ch,ai, sj) =

|A(ai,sj)|∑l=0

wlχ(αl,(ai,sj),+)

|A(ai,sj)|∑l=0

wlχ(αl,(ai,sj),−)

∀r ∈ Rch,X : r = true

0 else

7→ vch,(ai,sj)

(4)i(ch,ai, sj) =

waf(ai, Aai , Bai)

+wsg(sj , Asj , Bsj )+wattrh(ch, ai, sj)

(ai, sj) ∈ D

0 else

7→ vch,(ai,sj)

(5)

For component ch, the best combination has thevalue max{vch,(a1,s1), ..., vch,(am,sn)} calculated withi(·), with all values being comparable on an absolute[0, 1] scale guaranteed by normalization in the AHP.In case no alternative meets all the requirements, in asubsequent step, all alternatives that meet all but onerequirement are considered. This procedure repeatsuntil a non-empty set of alternatives is found thatfulfills less requirements. Optionally, the CloudGeniusprocess offers an opt-out to look for non-cloud optionsif and only if no solution satisfies the given require-ments.

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 9

In a multi-component setup, the viable connectionsof components are defined by the sets E, F , and I .The connections included in set I must adhere to therules from the sets E and F which define compatibleVM images and services. A feasible solution of acluster, comprising combined solution pairs (ai, sj) forevery component, is not only the sum of the values ofall contained ch. Network costs of component inter-connections affect the quality of a solution, too. Cloudservices of different cloud providers and at differentlocations can cause internet traffic costs. Equation 6shows the ∆ of total network traffic costs (internet andlocal network) for a component ch ∈ C connected withother components according to I . If I includes theconnection (ch, ci) with ci ∈ C, I avoids doubled costsand will not hold the inverse (ci, ch). T represents thecosts for network traffic for expected communicationof a multi-component cluster solution Fi. Let Tch,ci,Rl

,Tch,ci,Sl

, Tch,ci,Rg , and Tch,ci,Sg be the expected costof incoming and outgoing local network (Rl, Sl) andinternet traffic (Rg , Sg) between components ch andci. These costs are calculated in advance according tosets Nout and Nin, and the providers’ price schemeheld in a compute service’s attributes.

Tnetwork,Fi=

∑o∈Ich

wT,RT(ch,o),Rl

+wT,ST(ch,o),Sl

provider/location equal

wT,RT(ch,o),Rg

+wT,ST(ch,o),Sg

else

(6)

j(Fi) =wQ

∑ch∈C

wchi(ch,ai,sj)

wTTnetwork,Fi,normalized

∀ch, c′h :(ai, a

′i) ∈ E

∧(sj , s′j) ∈ F

0 else

7→ vFi

(7)All Tnetwork,Fi

are normalized to (0, 1) scale in AHP.Equation 7 formulates the function j(·) that calculatesthe overall value of a cluster solution instance Fi.The value consists of the weighted sum of combinedsolution values returned by function i(·) and thetotal network costs Tnetwork,Fi

of a cluster instanceFi. Weights reflect the importance of a component ifstated by the user.

After a presentation of recommended solutions, themigration process continues with a selected cluster so-lution, deployment of the VM images on the computeservices, and further customization and re-evaluationcycles by the engineer. With all components deployedand customized, and after following a migration strat-egy results in the cluster available on diverse cloudinfrastructure services and distributed over data cen-ters when stated earlier in the requirements.

3.7 Computational Complexity

The decision problem in multi-component web ap-plication migrations addressed by CloudGenius isobviously complex, creating a potentially huge searchspace for many VM images, services, and compo-nents. Hence, it is important to analyze the actualcomputational complexity of the approach to ensureits applicability. We define O of CloudGenius as fol-lowing:

O(m ∗ |Ba|+ n ∗ |Bs|︸ ︷︷ ︸requirements check

+ m ∗ |Aa|+ n ∗ |As|︸ ︷︷ ︸images & services evaluation

+

m ∗ n ∗ |D|+m ∗m ∗ |E|+ n ∗ n ∗ |F |︸ ︷︷ ︸feasibility check

+

m ∗ n︸ ︷︷ ︸combined evaluation

+ (m ∗ n)l︸ ︷︷ ︸clusters evaluation

)

The computational complexity is proportional tothe number of VM images m, services n, and com-ponents l. Computations for m images and n servicescomprise requirements checks in sets Ba and Bs andevaluations regarding criteria of sets Aa and As. Inaddition, feasibility checks using the sets D, E, andF must be executed for all m × n combinations. Theevaluation with the functions f(·), g(·), g(·), and i(·)add additional complexity. The evaluations implicateadditional computation steps for AHP comprisingnormalization of matrices and derivation of globalweights which are not included for simplicity. Theformulated O shows that CloudGenius’ complexityis dominated by the cluster evaluation which createsa solution space with (m ∗ n)l clusters (Cartesianproduct of combinations). Therefore, the complexity isexpected to grow exponentially in proportion to l, m,and n. Since CloudGenius searches in the full solutionspace, we are confident to find the actual best solution.

3.8 Parallel Genetic Algorithm

Meta-heuristics allow to cope with exponentiallygrowing computational complexities. In particular,gargantuan search spaces of discrete solutions can besearched even without knowledge about the searchspace’s structure. While meta-heuristics can cut timecomplexities to a fixed limit, they cannot guaranteean optimal solution. Nevertheless, the accuracy canbe influenced by increasing the granted computationtime. Thereby, meta-heuristics give the option to tradewaiting time for accuracy. When parallelizing meta-heuristics, computation can be divided into multipleprocessing units which can be run on multiple serverinstances. This introduces the additional dimensionof costs for computational resources to the trade-off. Consequently, accuracy can be achieved by longwaiting times or high investments in computationalresources.

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 10

VM ImageCompute Service

VM ImageCompute Service

Formation

Component 1 Component n...

Fig. 7. Structure of a Cluster Solution Candidate

The proposed algorithm facilitates parallel com-putations and resembles a population-based geneticalgorithm (GA) meta-heuristic due to the discretenature of the search problem. The algorithm consistsof four major steps: (1) create initial population, (2)assignment of fitness values, (3) selection of elite, and(4) evolution of a population. Step 1 only occurs onceat the beginning of the algorithm, while steps 2–4repeat until a termination criterion is fulfilled. Eithera certain number of generations have been evolved ora certain amount of time has passed.

The algorithm expects a database of cloud VMimages and compute services including their compat-ibilities as described by CloudGenius model. Also, acluster model, preferences, and requirements need tobe defined in the model by an engineer. In additionto the model, three parameters must be set in thebeginning to adjust the GA: (a) population size, (b)elite size and (c) maximum computation time.

3.8.1 Candidates and Populations

Genetic algorithms search over populations in multi-ple iterations referred to as generations. Every pop-ulation consists of a predefined amount of solutioncandidates, each representing a viable and discretesolution to the problem. We propose candidates to bethe cluster solutions Fi as described in section 3.6.Figure 7 depicts the structure of a cluster solutioncandidate. Every candidate represents the set of com-ponent mappings to a viable VM image and computeservice combination that can be evaluated with func-tion j(·). To initiate the GA, an initial population mustbe generated. Our algorithm picks cluster solutionsrandomly ignoring duplicates and adds them to thepopulation until it has grown to the expected popu-lation size (a). After this step the algorithm is readyto evolve the initial population.

3.8.2 Fitness Value of Candidate

Every population is evaluated to determine the fitnessvalues of its candidates. Our genetic algorithm em-ploys the evaluation function j(·) for each candidateFi. Since the AHP determines values of candidatesin a comparison matrix, the whole population mustbe known and evaluated together. In particular, forparallel genetic algorithms this is of special interestsince every parallel evaluation task must be giventhe whole set of candidates in a population. For

implementations of the algorithm this leads to a limi-tation on the population size imposed by the availablememory size of tasked computation units.

3.8.3 Selection of EliteBased on the fitness values determined in the previousstep, an elite of size (b) can be drawn from the currentpopulation. The elite is represented by the set ofhighest ranked solution candidates in the population.Any non-elite candidates are discarded in the nextgeneration and replaced by evolved candidates. Thesize of the elite has been specified as a parameter inthe beginning.

With the termination of the algorithm a final eliteis returned which includes a best solution candidatecomputed by the algorithm. The highest ranked elitecandidate according to its value is the preferred clus-ter solution which is not guaranteed to be the globallybest cluster solution in the search space.

3.8.4 Evolution of PopulationWhile the elite candidates reside in the population,non-elite candidates are evolved to create a new gen-eration and continue the search for best solutions.How the candidates are evolved depends on whetherthe algorithm applies a global or local search. Mutat-ing candidates in only few attributes, e.g., changingthe mapping of one component to a slightly differentcompute service or VM image, potentially leads to alocal search. In contrast, substituting candidates withrandom cluster solutions from the search space thatdiffer in a subset of attributes results in a globalsearch.

To allow for global and local search we propose twoevolutionary operators: (1) altering a cluster solution(mutation) and (2) substituting it (nascency). Mutatinga candidate means substituting one of its component’ssolutions with other viable combinations of VM im-ages and compute services. To generate a better localsearch, the substitution strategy should take advan-tage of the fact that network traffic is commonly freeor cheap when staying within a provider’s network.Therefore, combination alterations for one or morecomponents should consider this common pricingmodel fact and choose a combination with a computeservice offered by the same provider. Substitutionimplies choosing a random solution from the searchspace that is not included in the current population.Both evolutionary operators, (1) mutation and (2)nascency, are applied for each non-elite candidatewith a 50% chance to balance global and local search.

3.8.5 TerminationEvery genetic algorithm requires a termination rulethat forces computations to stop and to accept thecurrently best solution candidate. Rules consider thecurrent quality of the solution, or simply stop after a

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 11

certain number of iterations or time limit. We proposemultiple rules that lead to an eventual termination.Firstly, the algorithm stops immediately after the sumof all uniquely discovered solution candidates passesthe number of total individuals in the search space|{F1, ..., Fn}| = (|A|×|S|)|C|. Therefore, a list of visitedsolutions helps to guarantee unique visits. Then, theexcess of the search space can be determined with theratio of search space size per population size testing#(generations) > (|A|×|S|)|C|

max #(individuals in population) . The algo-rithm halts when the number of generations exceedsthe ratio. Secondly, a general stopping rule ends thegenetic algorithm after a certain amount of time, e.g.,5 minutes, specified as parameter (c). Thirdly, moresophisticated stopping rules can halt the algorithmwhen, for example, the value of the best solution hasnot improved over multiple generations or the actualpopulation size undercuts the maximum populationsize, which means only few individuals have not beenvisited.

4 CUMULUSGENIUS: AN IMPLEMENTATION

With CumulusGenius [35] we provide an implemen-tation of the model and evaluation algorithms of theframework. The CumulusGenius java library offers adata model and evaluation algorithms based on theAotearoa AHP implementation [8] that enables theevaluation of VM images, cloud services, combina-tions thereof, and whole clusters. The parallel geneticalgorithm described in section 3.8 has been imple-mented [41] using the mahout framework for hadoop[42]. Mahout includes support for genetic algorithmimplementations using the watchmaker framework[43], and helps parallelizing and making algorithmsdeployable on hadoop clusters [44].

A Google Web Toolkit-based web frontend withjClouds [45] integration and database of the currentcloud provider landscape is currently under develop-ment [36].

5 EVALUATION

5.1 Use CaseAs multi-component web application system migra-tion use case, we consider an organization’s applica-tion engineer. The engineer wants to migrate the oper-ations of its online digital store from locally managedphysical infrastructure (non-virtualized) to virtualizedcloud infrastructure services. The company had en-gineered the original application based on multi-tierarchitecture to decouple major functionalities acrosstwo components: (a) presentation and business logiclayer with Tomcat 5.x Application servers and (b) datalayer, which stores information in a relational MySQL5.x database server. A HA Proxy 1.4.x load-balanceris employed to distribute workload across multipleapplication servers. To increase the dependability of

the web application, the application servers shall notbe placed in the same data centers or preferably atdifferent coasts or continents.

After the organization identified cloud infrastruc-tures as a target by applying the (MC2)2 evaluationframework in a first step, the engineer defines theweb server cluster including component dependen-cies. The cluster comprises two Tomcat AppServer(feature application server, version > 5.0, and UbuntuLinux) and one HA Proxy (feature load balancer withversion > 1.4.1). A MySQL Database Server version5.0 is already up and running in a cloud data center.Already deployed web applications persist data tothe given MySQL Database Server. The HA Proxyis connected to both Tomcats which are further con-nected to the MySQL database. However, the latteris not included in the cluster model. Subsequently,requirements for all components must be set such asfeature, OS, and version. The two Tomcats require adifferent locations in order to strengthen the cluster’sdependability.

The engineer states throughput and costs as highpriority goals, and VM image quality more importantthan compute services’. Moreover, evaluation criteriafor VM image and compute service preferences forboth application server components and the loadbalancer are weighted in paiwise comparisons.

CumulusGenius owns a database with informationabout VM images and compute services from AWS,Terremark, and Rackspace. As no results were foundfor the HA Proxy, the engineer changes the versionrequirement to > version 1.4.0 and gains cluster solu-tions. According to the cluster’s recommendation theengineer lets CumulusGenius deploy all listed VMimages to assigned compute services with jClouds.The engineer connects all running components, withTomcat servers being placed in the US and in the EUfor dependability and availability reasons. After exe-cuting a migration strategy and transferring backupdata to the database server, the cluster eventuallybecomes available on cloud infrastructure servicesand the first full cycle of the migration ended.

Within weeks, the demand rises and additionalTomcat servers should be added to the cluster. Theengineer re-enters the process cycle repeatedly tomodify the cluster. Each time, CumulusGenius sug-gests a new mapping, and the engineer deploys VMimages accordingly and connects new deploymentswith existing ones. Within the evolutionary process,the engineer revises the web application deploymentusing new experiences and enters the CloudGeniusprocess occasionally. In every revision, he compareshis past decision with his current objectives and re-evaluates viable deployment options.

5.2 ExperimentsWe tested our implementation CumulusGenius in ex-periments on an EC2 High-Memory Quadruple Extra

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 12

Fig. 8. Time Complexities of Parallel Computations

Large (m2.x4large) instance (8 CPU cores, 26 EC2Compute Units, and 68.4 GB of RAM) with Ubuntu10.04 and OpenJDK JRE 6.0 in order to analyze itssolution space and the actual time complexity. Toexecute computations with CumulusGenius, we usedwhirr to deploy the experiments on a mahout-enabledhadoop cluster consisting of one master node (namenode, job tracker, and mahout client) and multipleslave nodes (task tracker and data node). The parame-ters of the experiments are the number of VM images,services, and components. VM images and servicesare synthetically generated with all attributes havingrandom, but realistic values. There is a fixed numberof three providers and no requirements are definedto stress the algorithm with a full solution space.Components are randomly assigned to a provider andall inter-connected to each other. When componentsare offered by the same provider low network costsoccur. In case of different providers, five times higherinternet costs are assumed. First experiments showedan explosion in computation time for a growing num-ber of components l, VM images m and services n. Acluster of three components and a database of fiveVM images and five services of three providers takes>11,725 seconds (∼3,25h) to find the highest rankedof 15,625 solutions (three components with 25 viableVM image and compute service combinations each).Therefore, the implementation has been enhancedwith parallel threads making it possible to exploitmultiple CPU cores, and AHP’s matrix normalizationhas been simplified as this step was identified tocause tremendous effort. The exponentially growingeffort for additional components in a cluster becomesobvious with the measurements depicted in Figure 8.The EC2 instance ran out of memory with four com-ponents and three VM images and compute services.

Since the search space is extensive, we further an-alyzed the quality of a simpler naive solution in 100experiments per l, m, n variation. Instead of findingthe actual best cluster, it is very cheap (∼1 ms) toneglect network traffic costs and construct a bestsolution from every component’s best combinationof VM image and compute service. Table 7 showsthe average number of equal component solutions

TABLE 7Quality of Naive Solutions vs. Full Evaluation

l m n AVGEquals

NoEquals

AllEquals

3 3 3 1.7 11% 20%3 4 4 1.58 10% 17%3 5 5 1.19 19% 8%4 2 2 2.87 1% 24%4 3 3 2.57 3% 19%

TABLE 8Quality of GA vs. Full Evaluation

l m n GA vs. FE4 2 2 100%3 3 3 100%3 4 4 98%3 5 5 95%

and the possibility that a naive solution equals theactual solution in all or none of the components fordifferent parameters. The results show that for largernumbers of l, m, and n the chance for an acceptablenaive solution decline and naive solutions are at bestindicators for good solutions.

The genetic algorithm-based approach confirms ex-pectations to cope with the time complexities andto provide a handle to find accurate solutions forlarge solution spaces depending on the invested timeor money. In further experiments we compared thequality of solutions computed with the GA imple-mentation using the mahout framework. Therefore,we compare the results with the actual best solu-tion returned by the parallelized full evaluation thatsearches the whole solution space. The quality ratioshows how good the GA’s solution is compared to theactual solution found with a full evaluation (FE). TheGA was run with the parameters (a) population sizeof 100, (b) elite size of ten and (c) eventual stoppingat five minutes.

To test the power and quality of the GA imple-mentation in gargantuan solution spaces, we used afive and eight node hadoop cluster (EC2 m2.x4large)and computed ten solutions each for five components,10,000 VM images and variations of ten and thirtyservices of five providers, giving the algorithm fiveand ten minutes time. The number of 10,000 AMIsis derived from the amount of AMIs available in theAmazon EC2 Region us-east-1 (April 2012) and givesa realistic estimate. The algorithm succeeded to finda solution, while a full non-parallelized evaluationwould fail due to heap space exceedings. To get agrasp on how good the solution of the GA actually is,the naive solution serves as the only benchmark, albeitbeing worse than the actual solution. Table 9 shedssome light on the ratio of the GA solution versus thenaive solution calculated in average of ten runs. Withmore time and more compute resources the GA tendstowards better solutions and beats the naive solution

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 13

TABLE 9Quality of GA Solutions

time nodes l m GA vs.Naive

5 5 10 10,000 0.855 5 30 10,000 0.655 8 10 10,000 0.925 8 30 10,000 0.7710 5 10 10,000 1.1310 5 30 10,000 1.0110 8 10 10,000 0.9310 8 30 10,000 1.15

that omits the influence of network costs.In summary, the experiments show that the GA

approach succeeds where a full evaluation of thesolution space is not possible. However, the qualityof the results cannot be guaranteed to a predefinedlevel. Increasing investments in time or additionalcompute resources allows to improve the quality sincethe chances for the genetic algorithm to find the bestsolution improve. Additional time allows the GA tosearch over more generations with same amount ofcompute resources, while additional resources allowto evaluate more generations in the same time. Thecurrent GA implementation beats the naive solutionwith more than ten minutes of time and small clustersof five to eight nodes. Improvements in the implemen-tation and hadoop setup might reduce computationtimes and increase solution quality to an yet unknowndegree.

6 CONCLUSION AND FUTURE WORK

In this paper, we presented the extended CloudGeniusframework, which provides a hybrid approach thatcombines multi-criteria decision making techniquewith evolutionary optimization technique for: (i) help-ing application engineers with the selection of the bestservice mix at IaaS layer and (ii) enabling migrationof web application clusters distributed across clouds.

We believe that CloudGenius framework leavesspace for a range of enhancements and provides yetan amicable approach. Nevertheless, a major issue incloud service selection is the domain of available datain the decision, i.e., completeness and freshness of adatabase with VM images and services, the criteriacatalogs, and the quality and correctness of measuredvalues. To address these issues, we intend to integratecloud benchmarking approaches [38], [46] and exist-ing databases such as CloudHarmony, bitnami, andthecloudmarket.com [47], [4], [5]. Including extendedmeta-data information with a crawling approach [48]allows to gather more details on images and makethem more differentiable, but requires additional ef-fort to build a database.

Additionally, we plan to connect the migrationprocess with a monitoring of the deployed system

to trigger re-evaluation and decision-making in anevolutionary migration actively.

CloudGenius expects VM images to feature onecomponent stack, such as an application server stack,instead of whole software stacks (e.g., Bitnami Word-Press stack consisting of web server and database) orbasic VM images containing an OS only. Future workshould estimate customization efforts and a trade-off.Additionally, explicit support for hybrid cloud setups,output in deployment languages, and middlewareand persistence layer services will be explored infuture work. It is planned to conduct an evaluationover several month with a German infrastructureprovider using the CumulusGenius prototype and itsweb-frontend [36]. In the study not only its applica-bility are of interest, but also time measurements ofmigration process cycles.

From a web application and IT system cluster pointof view, important practical challenges exist, such as(a) how to model existing web applications or ITsystems as a cluster with a notation, (b) how to ex-press and automatically handle dependencies of com-ponents, and (c) how to undertake service selectionand deployment processes that consider these mod-elled dependencies. In future work, we aim to tacklethis by adapting deployment modelling languagesto consider control and data flow dependencies. Wewill also explore integration of existing configurationmanagement tools such as Chef, Puppet, and whirrto CumulusGenius for configuring, deploying, andmanaging cloud-hosted IT system clusters.

REFERENCES

[1] “Web Application Hosting in the AWS Cloud Best Practices,”http://media.amazonwebservices.com/AWS Web HostingBest Practices.pdf, accessed 2013-07-19., Amazon WebServices.

[2] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Kon-winski, G. Lee, D. Patterson, A. Rabkin, I. Stoica et al., “Abovethe Clouds: A Berkeley View of Cloud Computing,” EECS De-partment, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-28, 2009.

[3] P. Mell and T. Grance, “The nist definition of cloud computing,recommendations of the national institute of standards andtechnolog,” NIST Special Publication, 2011.

[4] “BitNami Cloud Images,” http://bitnami.org/learn more/cloud images, accessed 2011-12-02, BitNami.

[5] “The Cloud Market,” http://cloudmarket.com, accessed 2011-10-19.

[6] A. W. Services, “The aws marketplace,”https://aws.amazon.com/marketplace, 01 2013. [Online].Available: https://aws.amazon.com/marketplace

[7] M. Menzel and R. Ranjan, “CloudGenius: Decision Supportfor Web Server Cloud Migration,” in Proceedings of the 21stInternational Conference on World Wide Web, ser. WWW ’12.New York, NY, USA: ACM, 2012.

[8] M. Menzel, M. Schonherr, and S. Tai, “(MC2)2: Criteria,Requirements and a Software Prototype for CloudInfrastructure Decisions,” Software: Practice and Experience,2011. [Online]. Available: http://dx.doi.org/10.1002/spe.1110

[9] R. Hamadi and B. Benatallah, “A Petri Net-based Model forWeb Service Composition,” in Proceedings of the 14th Aus-tralasian database conference-Volume 17. Australian ComputerSociety, Inc., 2003, pp. 191–200.

0018-9340 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citationinformation: DOI 10.1109/TC.2014.2317188, IEEE Transactions on Computers

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 14

[10] R. Mietzner, T. Unger, and F. Leymann, “Cafe : A Generic Con-figurable Customizable Composite Cloud Application Frame-work,” Framework, pp. 357–364, 2009.

[11] T. Binz, G. Breiter, F. Leyman, and T. Spatzier, “Portable cloudservices using tosca,” Internet Computing, IEEE, vol. 16, no. 3,pp. 80–85, 2012.

[12] V. Stantchev and C. Schropfer, “Negotiating and enforcing qosand slas in grid and cloud computing,” Advances in Grid andPervasive Computing, pp. 25–35, 2009. [Online]. Available: http://link.springer.com/chapter/10.1007/978-3-642-01671-4\ 3

[13] S. Wang, Z. Zheng, and Q. Sun, “Cloud model for serviceselection,” . . . WKSHPS), 2011 IEEE . . . , no. 60821001, pp.666–671, 2011. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs\ all.jsp?arnumber=5928896

[14] R. Buyya, R. Ranjan, and R. Calheiros, “Intercloud: Utility-oriented federation of cloud computing environments forscaling of application services,” Algorithms and Architecturesfor Parallel Processing, pp. 13–31, 2010. [Online]. Available:http://www.springerlink.com/index/B1JP674661364678.pdf

[15] A. Ruiz-Alvarez and M. Humphrey, “An automated approachto cloud storage service selection,” . . . workshop on Scientificcloud computing, pp. 39–48, 2011. [Online]. Available: http://dl.acm.org/citation.cfm?id=1996117

[16] P. Saripalli and G. Pingali, “MADMAC: Multiple AttributeDecision Methodology for Adoption of Clouds,” in CloudComputing (CLOUD), 2011 IEEE International Conferenceon. IEEE, 2011, pp. 316–323. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs\ all.jsp?arnumber=6008725

[17] A. Li, X. Yang, S. Kandula, and M. Zhang, “CloudCmp:comparing public cloud providers,” in Proceedings of the 10thannual conference on Internet measurement, ACM. ACM, 2010,pp. 1–14. [Online]. Available: http://dl.acm.org/citation.cfm?id=1879143

[18] S.-M. Han, M. M. Hassan, C.-W. Yoon, and E.-N.Huh, “Efficient service recommendation system for cloudcomputing market,” Proceedings of the 2nd InternationalConference on Interaction Sciences Information Technology, Cultureand Human - ICIS ’09, pp. 839–845, 2009. [Online]. Available:http://dl.acm.org/citation.cfm?doid=1655925.1656078

[19] Z. U. Rehman, F. K. Hussain, and O. K. Hussain, “TowardsMulti-criteria Cloud Service Selection,” 2011 Fifth InternationalConference on Innovative Mobile and Internet Services inUbiquitous Computing, pp. 44–48, Jun. 2011. [Online].Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5976164

[20] H. Chan and T. Chieu, “Ranking and Mapping of Applicationsto Cloud Computing Services by SVD,” in Network Operationsand Management Symposium Workshops (NOMS Wksps), 2010IEEE/IFIP. IEEE, 2010, pp. 362–369.

[21] A. Dastjerdi, S. Tabatabaei, and R. Buyya, “An EffectiveArchitecture for Automated Appliance Management SystemApplying Ontology-Based Cloud Discovery,” in Proceedingsof the 2010 10th IEEE/ACM International Conference on Cluster,Cloud and Grid Computing. IEEE Computer Society, 2010, pp.104–112.

[22] Z. Ye, X. Zhou, and A. Bouguettaya, “Genetic Algorithm basedQoS-aware Service Compositions in Cloud Computing,” inDatabase Systems for Advanced Applications. Springer, 2011,pp. 321–334.

[23] H. Wada, J. Suzuki, Y. Yamano, and K. Oba, “Evolutionarydeployment optimization for service-oriented clouds,”Software: Practice and Experience, vol. 41, no. 5, pp. 469–493,2011. [Online]. Available: http://dx.doi.org/10.1002/spe.1032

[24] H. Goudarzi and M. Pedram, “Multi-dimensional sla-basedresource allocation for multi-tier cloud computing systems,”in Proceedings of the 2011 IEEE 4th International Conferenceon Cloud Computing, ser. CLOUD ’11. Washington, DC,USA: IEEE Computer Society, 2011, pp. 324–331. [Online].Available: http://dx.doi.org/10.1109/CLOUD.2011.106

[25] M. Hajjat, X. Sun, Y. Sung, D. Maltz, S. Rao, K. Sripanidkulchai,and M. Tawarmalani, “Cloudward bound: Planning for benefi-cial Migration of Enterprise Applications to the Cloud,” ACMSIGCOMM Computer Communication Review, vol. 40, no. 4, pp.243–254, 2010.

[26] A. Khajeh-Hosseini, I. Sommerville, J. Bogaerts, and P. Tere-gowda, “Decision Support Tools for Cloud Migration in theEnterprise,” Arxiv preprint arXiv:1105.0149, 2011.

[27] M. Godse and S. Mulik, “An approach for selecting software-as-a-service (saas) product,” in Proceedings of the 2009 IEEEInternational Conference on Cloud Computing, ser. CLOUD’09. Washington, DC, USA: IEEE Computer Society, 2009,pp. 155–158. [Online]. Available: http://dx.doi.org/10.1109/CLOUD.2009.74

[28] C. Zeng, X. Guo, W. Ou, and D. Han, “Cloud computingservice composition and search based on semantic,” inCloud Computing, ser. Lecture Notes in Computer Science,M. Jaatun, G. Zhao, and C. Rong, Eds. Springer BerlinHeidelberg, 2009, vol. 5931, pp. 290–300. [Online]. Available:http://dx.doi.org/10.1007/978-3-642-10665-1 26

[29] J. Kang and K.-M. Sim, “Ontology and search engine for cloudcomputing system,” in System Science and Engineering (ICSSE),2011 International Conference on, June 2011, pp. 276–281.

[30] M. Zhang, R. Ranjan, A. Haller, D. Georgakopoulos, M. Men-zel, and S. Nepal, “An ontology-based system for cloud in-frastructure services’ discovery,” in Collaborative Computing:Networking, Applications and Worksharing (CollaborateCom), 20128th International Conference on. IEEE, 2012, pp. 524–530.

[31] “BitNami MySQL Virtual Machine Image or Cloud Ap-pliance,” http://wiki.bitnami.com/Components/MySQL, ac-cessed 2013-07-19., BitNami.

[32] “3Tera Database Virtual Machine Images or Cloud Appli-ances,” http://doc.3tera.com/AppLogic31/Catalog Ref/index.htm?toc.htm?CatDatabaseAppliancesMYSQLR.html,accessed 2013-07-19., 3Tera.

[33] “BitNami PostgreSQL Virtual Machine Image or Cloud Ap-pliance,” http://wiki.bitnami.com/Components/PostgreSQL,accessed 2013-07-19., BitNami.

[34] S. Kalepu, S. Krishnaswamy, and S. Loke, “Verity: A QoSMetric for Selecting Web Services and Providers,” in WebInformation Systems Engineering Workshops, 2003. Proceedings.Fourth International Conference on. IEEE, 2003, pp. 131–139.

[35] “CumulusGenius Prototype,” http://code.google.com/p/cumulusgenius/, accessed 2011-11-06, CumulusGeniusPrototype.

[36] “CumulusGenius Online Prototype,” http://cumulusgenius.appspot.com/, accessed 2011-11-06, CumulusGenius OnlinePrototype.

[37] M. Klems, J. Nimis, and S. Tai, “Do Clouds Compute? AFramework for Estimating the Value of Cloud Computing,”Designing E-Business Systems. Markets, Services, and Networks,pp. 110–123, 2009.

[38] A. Lenk, M. Menzel, J. Lipsky, S. Tai, and P. Offermann,“What are you paying for? Performance benchmarking forInfrastructure-as-a-Service Offerings,” in Cloud Computing(CLOUD), 2011 IEEE International Conference on. IEEE, 2011,pp. 484–491.

[39] D. Mosberger and T. Jin, “httperf - A Tool for MeasuringWeb Server Performance,” ACM SIGMETRICS PerformanceEvaluation Review, vol. 26, no. 3, pp. 31–37, 1998.

[40] D. Menasce, “TPC-W: A Benchmark for E-commerce,” InternetComputing, IEEE, vol. 6, no. 3, pp. 83–87, 2002.

[41] “Mahout-based CumulusGenius Implementation,” https://github.com/mugglmenzel/CumulusGeniusOnMahout.

[42] “The Apache Mahout Project,” http://mahout.apache.org/,accessed 2012-05-17.

[43] “Watchmaker Framework,” http://watchmaker.uncommons.org/, accessed 2012-05-17.

[44] “The Apache Hadoop Project,” http://hadoop.apache.org/,accessed 2012-05-17.

[45] “jClouds Multi-Cloud Library,” http://code.google.com/p/jclouds/, visited 2012-01-19, jClouds Multi-Cloud Library.

[46] S. Haak and M. Menzel, “Autonomic Benchmarking for CloudInfrastructures: An Economic Optimization Model,” in Pro-ceedings of the 1st ACM/IEEE workshop on Autonomic computingin economics. ACM, 2011, pp. 27–32.

[47] “CloudHarmony,” http://cloudharmony.com, accessed 2011-10-19.

[48] M. Menzel, M. Klems, H. A. Le, and S. Tai, “A configurationcrawler for virtual appliances in compute clouds,” in Inter-national Conference on Cloud Engineering (IC2E) 2013. IEEEComputer Society, Marz 2013, Inproceedings, p. 9.