Cost and risk modeling of virtualized computing labs

Budapest University of Technology and Economics Faculty of Electrical Engineering and Informatics Department of Measurement and Information Systems Madarász, Norbert COST AND RISK MODELING OF VIRTUAL COMPUTING LABS ASSISTANT Kocsis, Imre BUDAPEST, 2014


Budapest University of Technology and Economics

Faculty of Electrical Engineering and Informatics

Department of Measurement and Information Systems

Madarász, Norbert

COST AND RISK MODELING OF

VIRTUAL COMPUTING LABS

ASSISTANT

Kocsis, Imre

BUDAPEST, 2014


STUDENT DECLARATION

I, the undersigned Madarász, Norbert, a final-year student, declare that I prepared this thesis myself, without any unauthorized assistance, and that I used only the sources indicated (literature, tools, etc.). Every part that I took from another source, either verbatim or rephrased with the same meaning, is clearly marked with a reference to the source.

I consent to BME VIK publishing the basic data of this work (author(s), title, abstract in English and Hungarian, year of preparation, name(s) of the advisor(s)) in a publicly accessible electronic form, and the full text of the work through the internal network of the university (or to authenticated users). I declare that the submitted work and its electronic version are identical. For theses classified with the Dean's permission, the text of the thesis becomes accessible only after 3 years.

Date: Budapest, 15 May 2014

...…………………………………………….

Madarász, Norbert


Summary

The task of this thesis was the cost and risk modeling of virtualized computer labs (Apache VCL), so that the costs, risks and service levels of the Apache VCL solution can be examined in different hybrid cloud configurations. I used the resulting model in a hypothetical, faculty-level Apache VCL deployment project to find the optimal private/public cloud configuration.

First, I became familiar with the open source Apache VCL solution and with the Apache VCL instance used at the department. In doing so I came to understand the reservation mechanisms used by Apache VCL, which were indispensable for the modeling task. I identified and described three main processes. Based on these, I built a simplified Apache VCL model that contains only the relevant process steps.

After creating the simplified model, the next task was to find a cloud simulator with which the operation of Apache VCL can be modeled and simulated realistically. For this purpose I chose the CloudSim simulator, which is also open source. Out of the box, CloudSim was not entirely suitable for the task, so it had to be extended considerably in order to model different Apache VCL hybrid cloud configurations together with their costs, utilization, future requests and service levels.

Finally, using the extended CloudSim simulator, I carried out cost and risk optimization on the example of a hypothetical Apache VCL deployment project. Through the case study I demonstrated that, assuming a given system-level load, the simulator is suitable for finding a cloud of optimal size and configuration. The decision was based on the average unit cost of user requests and on the service level value of the cloud configuration.

In conclusion, there is a hybrid cloud configuration, regarded as optimal, where the unit cost of user requests is minimal while the service level reaches its maximum. A lower unit cost can only be achieved at the expense of service quality.


Abstract

The goal of this master's thesis was to model the cost and risk factors of virtual computing labs (Apache VCL) in order to support cost, risk and service performance optimization of hybrid Apache VCL cloud setups. Using the resulting model, a hypothetical Apache VCL implementation project at a hypothetical Hungarian university had to be optimized against predefined criteria for a specific load profile.

First, I familiarized myself with the open source Apache VCL and with the Apache VCL installation used at the Department of Measurement and Information Systems (DMIS) of the Budapest University of Technology and Economics (BME), in order to understand the computer allocation processes of Apache VCL and to be able to simulate them later. I identified and described the three main types of allocation in Apache VCL. Using this knowledge I created a filtered, simplified model of Apache VCL so that only the relevant parts are simulated.

With this knowledge of Apache VCL and the simplified model in hand, the next step was to find a cloud simulator that can model and simulate Apache VCL faithfully, including the handling of reservation requests and slot management. The open source CloudSim cloud simulator was eventually chosen. CloudSim cannot simulate Apache VCL without changes to its source code. The modifications added four essential conceptual features that were missing from CloudSim: cost, system utilization, future request and service performance modeling for hybrid cloud setups.

Finally, the relevance and usability of the extended CloudSim was demonstrated through a hypothetical implementation project of Apache VCL at a Hungarian university. In the case study I demonstrated that the simulator can be used to find cost-optimal cloud setups for assumed future image requests. Decisions were based on the unit cost and the service performance indicator of cloud infrastructure implementation projects.

The conclusion is that there is an optimal hybrid cloud configuration where the unit cost of image requests is minimal while the service performance indicator is at its maximum. A lower unit cost can only be reached if the quality of the service is decreased.


Table of contents

1 Introduction
2 Apache VCL
2.1 Virtual Desktop Infrastructure
2.2 Apache VCL: an academic service
2.3 Architecture of Apache VCL
2.3.1 Self-service web portal
2.3.2 Data model
2.3.3 Management node
2.3.4 Network layout
2.3.5 Privileges
2.4 Apache VCL of BME DMIS
2.5 Modeling allocation policies of Apache VCL
2.5.1 Allocation by normal reservation
2.5.2 Block allocation
2.5.3 Predictive loading modules
3 Simulation model
3.1 Cloud simulation
3.2 CloudSim: the simulation framework
3.2.1 Architecture
3.2.2 Design and implementation
3.2.3 Simulation framework
3.2.4 Data center internal processing
3.2.5 Communication among entities
3.3 Simplified model of Apache VCL
3.3.1 Simplified allocation by normal reservation
3.3.2 Simplified block allocation
3.3.3 Simplified predictive loading modules
3.4 Comparing Apache VCL and CloudSim
3.4.1 User
3.4.2 Request
3.4.3 Image
3.4.4 VMSlot
3.4.5 UserGroup
3.4.6 VMSlotGroup
3.4.7 Reservation
3.4.8 VMHost
3.4.9 ManagementNode
3.4.10 Simplified allocation by normal reservation
3.4.11 Simplified block allocation
3.4.12 Simplified predictive loading modules
3.5 Additional conceptual features
3.5.1 Cost modeling
3.5.2 Data center utilization
3.5.3 Future request generation
3.5.4 Service performance
3.6 Improvements in CloudSim
3.6.1 Modified Java classes
3.6.2 New Java classes
3.6.3 Used Java libraries
3.7 How to use the simulator
3.7.1 Providing input parameters
3.7.2 Starting the simulation
3.7.3 Description of the output files
4 Case study: a hypothetical implementation project
4.1 Input parameters
4.1.1 Image types
4.1.2 Cloud computing infrastructures
4.1.3 CapEx and OpEx costs of private data center
4.1.4 Opportunity cost for NPV
4.1.5 SLA parameter
4.1.6 Future image reservations
4.2 Simulation results
4.3 Finding the optimal cloud setup
5 Summary
Bibliography
Abbreviation list


1 Introduction

Nowadays the cloud computing service delivery model plays a key role in the information technology (IT) services of most organizations, both business and non-business. Cloud computing has its disadvantages, as everything does, but these obstacles may slow the transition from on-premise architectures without being likely to stop it. The market for cloud services is already worth €100bn and is still growing at 20% a year [1]. It is therefore reasonable to conclude that the cloud is superseding older styles of IT in many areas, as it has been doing for the past few years.

By using the cloud, universities could benefit from cost reduction, flexibility, faster deployment, better retention, performance, and the like. The question is which cloud computing services or solutions universities should use to profit from them, and whether the cloud is worth using at all. The answer is not simple, and several different answers may all be right. There are several service models to take into account, such as Infrastructure-as-a-Service (IaaS), Desktop-as-a-Service (DaaS), Platform-as-a-Service (PaaS) or Software-as-a-Service (SaaS), and several solutions for these service models, such as Virtual Data Center (VDC), Virtual Desktop Infrastructure (VDI), a Java platform or Customer Relationship Management (CRM) from the cloud. Because of this wide range of cloud services, each university has to consider which parts of the cloud are necessary, or at least useful, for it.

VDI has many advantages: lecturers and students do not have to attend labs in person to access the physical resources of the laboratories (e.g. computers, software); instead, the laboratories themselves are virtualized. VDI can remove the dependency on lab premises and lab times and make the laboratories more flexible. VDI can also be used in e-learning, since students can access the necessary resources of their educational institution from anywhere to work on and submit assignments.

At a university it is very important that an appropriate number of resources is available to students both during and outside laboratory times. Capacity planning is difficult under such circumstances. Sizing the private VDI cloud for peak demand does not seem to be the best strategy. Instead, the private VDI cloud should serve only the average student demand, and the public cloud should be used for the demand above the average. To reduce costs as far as possible, an environment-specific, optimized hybrid cloud setup (a mix of private and public clouds) should be defined with great care. For universities it is unacceptable to leave student demand unserved or to delay laboratories. These risk factors of a hybrid cloud VDI solution must be handled so that predefined high availability, Service Level Agreement (SLA) and Quality of Service (QoS) parameters are met. At the very least, all these requirements have to be taken into account when a department or a faculty considers building a hybrid virtual computing lab infrastructure. Optimizing cost and risk factors is a relevant topic today, when all organizations, including educational institutions, need to cut their costs due to economic factors. The solution presented in this thesis helps to exploit the power and advantages of the cloud computing service model, specifically for virtualized computing labs.
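The sizing idea described above can be illustrated with a small calculation. The following Python sketch is only an illustration: the hourly demand figures are invented, and sizing the private cloud at the rounded-up mean demand is an assumption made for this example, not the optimization method developed in this thesis.

```python
import math

def split_demand(demand, private_capacity):
    """Split per-period demand into a privately served part and public overflow."""
    private = [min(d, private_capacity) for d in demand]
    public = [max(0, d - private_capacity) for d in demand]
    return private, public

# Hypothetical number of concurrent image requests per hour (invented data).
hourly_demand = [2, 3, 10, 25, 30, 12, 4, 2]

# Assumption: the private cloud is sized for the average demand, rounded up.
capacity = math.ceil(sum(hourly_demand) / len(hourly_demand))

private, public = split_demand(hourly_demand, capacity)
print("private capacity:", capacity)
print("served by the private cloud:", sum(private))
print("burst to the public cloud:", sum(public))
```

With these invented numbers the peak hours are the only ones that spill over to the public cloud, which is the trade-off the cost model has to price.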

In this study the open source Apache Virtual Computing Lab (VCL) is used as the private virtual computing lab infrastructure. The key functions of VCL are simulated to obtain statistical information about user behavior, resource utilization, and so on. A cloud simulator is used to analyze different types of VCL cloud setups by simulating various reservation arrival processes, cost models and possibly fault models. Hybrid VCL setups, which mix an in-house data center with a public cloud, are not yet a technical reality, because VCL does not support submitting image reservations to public clouds. Even so, the question addressed in this thesis is which VCL setup is optimal for a given number of requests from the point of view of long-term cost optimization and risk mitigation.

The desired cloud simulator, which has to model Apache VCL faithfully, needs to be able to analyze the costs of different private/public configurations. The cost modeling ought to process the real, historical usage patterns of the Apache VCL environment at BME DMIS. Likewise, the model has to report QoS parameters that describe the performance of the system.

Finally, this study presents a hypothetical hybrid virtual computing lab implementation project at a Hungarian university, in which the simulation model and the simulator are used to optimize cost and risk factors based on several assumptions, such as the use of Apache VCL, the number of students, the number of lectures, resource demands and quality requirements.


2 Apache VCL

After the short introduction, which presented the main points and motivation of this master's thesis, I give a brief overview of the open source cloud computing platform Apache VCL. VCL was originally developed for universities to provide predefined work environments for students at any time and from anywhere, turning originally offline laboratories and workshops into online, on-demand ones. VCL is also well suited to situations where a larger compute environment is needed from time to time for laboratories or workshops, because it makes it possible to prepare and preconfigure virtual compute environments for such occasions.

2.1 Virtual Desktop Infrastructure

The Apache VCL platform is one realization of the form of desktop virtualization called Virtual Desktop Infrastructure. Desktop virtualization is a software technology that enables remote access to a set of preinstalled and preconfigured virtual desktop environments. VDI, in turn, is a desktop-centric service that hosts users’ desktop environments on remote servers and/or blade PCs. Users access the environments over a network using a remote display protocol. [2] For users, the advantage of this architecture is that they can access their desktop environment from anywhere. Furthermore, users get the same applications and data at any location, because the servers and other resources are centralized, so there is no restriction on where users access their environments from. Centralization also makes VDI a more efficient way to maintain client environments.

2.2 Apache VCL: an academic service

VCL became an Apache Software Foundation top level project on June 20, 2012. [3] The Apache VCL platform is responsible for managing, controlling and delivering desktop environments to users from centralized resources. There are a number of possible environments VCL can provision, for example a virtual machine on different hypervisors, a traditional bare-metal computer or a clustered physical server.

Users make reservations via VCL’s self-service web portal. To make a reservation, a user first selects one of the available environments. Once a reservation is made, the web portal’s scheduling components determine which computer resources are assigned to which reservation to run the chosen environment. The requested environments are then dynamically provisioned and configured to allow remote access by the users. Finally, users connect remotely to their prepared environments via remote desktop or Secure Shell (SSH) over the Internet, a Campus Area Network (CAN) or a Local Area Network (LAN). Figure 2.1 gives an overview of the Apache VCL platform.

[Figure: Apache VCL; virtualized data center with virtual machines; Internet/CAN/LAN; remote client; reservation; establishing connection; remote desktop or terminal access]

Figure 2.1 VCL overview (Source: [4])

VCL supports three authentication methods for confirming the identity of users who access the self-service web portal remotely. [5] The first is the built-in method: users log in with a username and password, and administrators add these users to the VCL database. Such users are called local VCL accounts. The second is the Lightweight Directory Access Protocol (LDAP): the VCL frontend can be configured to use an existing LDAP server. The third method is Shibboleth. [6]

An environment, or image, is a collection of software installed on an operating system. Users select from the list of environments they have access to. In VCL every image may have several revisions. VCL controls the revisions of images as well as the creation of images and revisions, both of which depend on VCL’s privilege and authorization model. Users are granted access to parts of the VCL web site and to resources through the privilege tree. Altogether nine user permissions can be granted, and three resource attributes can be assigned to a resource group in the privilege tree. [7]

In short, VCL is used to dynamically provision and broker remote access to a dedicated environment for each user. VCL-provisioned computers are generally housed in data centers and may be bare-metal, blade or rack-mounted servers, standalone computer lab machines, or virtual machines. For virtual machine provisioning VCL supports VMware ESXi 4.x/5.x, ESX Standard server, Free Server, KVM and VirtualBox. For physical, bare-metal provisioning VCL supports the Extreme Cloud Administration Toolkit (xCAT). [8]

To show how and when environments are used, VCL has a built-in statistics page. The page presents, in readable form, the number of reservations: the total, the number of unique users, and the breakdown by OS and by environment. It also contains graphs of VCL usage trends.
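The kind of aggregation such a statistics page performs can be sketched as follows. This is a minimal Python illustration and not VCL code: the reservation records, user names and environment names are invented, and VCL itself derives these figures from its database.

```python
from collections import Counter

# Hypothetical reservation log: (user, operating system, environment).
reservations = [
    ("alice", "Linux", "Eclipse lab"),
    ("bob", "Windows", "MATLAB lab"),
    ("alice", "Linux", "Eclipse lab"),
    ("carol", "Linux", "Database lab"),
]

total = len(reservations)
unique_users = len({user for user, _, _ in reservations})
by_os = Counter(os_name for _, os_name, _ in reservations)
by_environment = Counter(env for _, _, env in reservations)

print("total reservations:", total)
print("unique users:", unique_users)
print("by OS:", dict(by_os))
print("by environment:", dict(by_environment))
```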

2.3 Architecture of Apache VCL

VCL’s architecture consists of three main components:

Self-service web portal (also called frontend)

Database

Management nodes (also called backend)


Figure 2.2 VCL architecture (Source: [9])

Figure 2.2 shows the relationships among the components and the main responsibilities of the three parts.

2.3.1 Self-service web portal

The self-service web portal is the VCL access point for users. The User Interface (UI) is responsible for authenticating users via the built-in, LDAP or Shibboleth authentication method, and for authorizing them to access only those parts of the web portal that they are entitled to. Users can then select from a list of images. They do not see all the images in VCL, only those they have rights to. The UI issues many requests to the database, because all information about the current system state is stored there. For instance, the frontend issues tens of select queries to determine which images meet the privilege requirements, or to check whether the time period chosen by the user is still available and whether there is still a free computer node to fulfill the reservation. Insert and update queries are executed as well as select queries. The frontend provides an Application Programming Interface (API) through which external software can make requests and provision resources. [10] The VCL Scheduler schedules the requests coming from users. The scheduler and the reservation procedure are described in Section 2.5. The frontend was developed using Apache Hypertext Transfer Protocol (HTTP) Server with Secure Sockets Layer (SSL), PHP and the Dojo Toolkit.
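The availability check mentioned above, whether a free computer node exists for the chosen time period, comes down to an interval overlap test. The Python sketch below is an assumption-laden illustration: VCL performs this check with SQL queries against its database, whereas here the reservations are kept in an in-memory list, and all names (`free_computers`, `vm-1`, the record fields) are hypothetical.

```python
from datetime import datetime

def overlaps(start_a, end_a, start_b, end_b):
    """Two half-open intervals [start, end) overlap if each starts before the other ends."""
    return start_a < end_b and start_b < end_a

def free_computers(computers, reservations, start, end):
    """Return the computers that have no reservation overlapping [start, end)."""
    busy = {
        r["computer"]
        for r in reservations
        if overlaps(r["start"], r["end"], start, end)
    }
    return [c for c in computers if c not in busy]

# Hypothetical system state: vm-1 is reserved from 10:00 to 12:00.
computers = ["vm-1", "vm-2"]
reservations = [
    {"computer": "vm-1",
     "start": datetime(2014, 5, 15, 10),
     "end": datetime(2014, 5, 15, 12)},
]

# A new request for 11:00 to 13:00 can only be served by vm-2.
available = free_computers(
    computers, reservations, datetime(2014, 5, 15, 11), datetime(2014, 5, 15, 13)
)
print(available)  # ['vm-2']
```

Treating the intervals as half-open means that a reservation ending at 12:00 does not conflict with one starting at 12:00.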

2.3.2 Data model

As can be seen, VCL is a heavily database-driven technology, so this section gives an overview of the most important tables of the VCL database schema, since these tables reflect the main notions and abstractions used by VCL.

The VCL database server has to be a MySQL 5.0 or later Relational Database Management System (RDBMS). The VCL database is a schema named vcl in the MySQL RDBMS, and its tables are created during the Apache VCL installation. The database stores data about the main entities in dedicated tables such as computer, image, revision, log, managementnode, module, OS, request, reservation, resource, schedule, state and user, along with other information such as resource mappings, user privileges and image profiles. Table 2.1 explains the database tables that are needed to understand the main VCL concepts later on.

Name of the table | Comment on the table
blockComputers | Tracks which computers have been allocated to individual block allocation time slots.
blockRequest | Contains all of the block allocations that have been requested and their current state.
blockTimes | Contains all of the time slots associated with a block allocation that are active or have not yet been reached.
computer | Contains all information about compute nodes and VMs that VCL controls. All bare metal computers, virtual hosts, and virtual machines must have an entry in this table. Images are deployed on to computers. VCL needs to know about all of the computers it will be managing. Entries for both physical computers and VMs (also called VM slots) need to be created in VCL for it to be able to manage them.
image | Contains all information about the images available through VCL. It comes with a single required special image, "No image", which signifies that a computer is not loaded with anything. An image (or environment) is a collection of software that is installed on an operating system. These images can be deployed, used, modified, and saved. Images can be designed to run directly on a computer (bare-metal) or under a hypervisor (virtualized images).
imagerevision | Contains an entry for every revision of each image.
managementnode | Contains information about each management node. Management nodes run the VCL backend code (Perl code) that is responsible for deploying images to computers when users make reservations for images. Each management node can manage a mix of physical and virtual computers.
OS | Contains information about OSs VCL knows about.
request | Contains information about every current or future reservation.
reservation | Contains information about every current or future reservation.
resource | Contains an entry for every resource VCL knows about. Every resource has a unique ID from this table, and a sub ID from a resource specific table (computer, image, management node).
resourcegroup | Contains all of the resource groups. Each resource group has a type associated with it which can be one of image, computer, management node, or schedule. The resource groups are used to grant users access to resources and also to allow VCL to know which resources can be used in relation to other resources.
resourcegroupmembers | Contains a list of which resources are in which resource groups.
resourcemap | Contains which resource groups map to other resource groups. Image groups are mapped to computer groups, and management node groups are mapped to computer groups. Any image in an image group can be run on any computer in a computer group to which it is mapped if a user has sufficient privileges to do so.
resourcetype | Contains a list of all the resource types.
schedule | Contains all of the schedules available. Each computer must have a schedule associated with it. Schedules provide a way to define what times during a week a computer is available through VCL.
state | Contains all of the states used in VCL. Not all states are used in every place where states are used.
user | Contains an entry for every user that has ever logged in to VCL.
usergroup | Contains all of the user groups. Each user group has certain attributes associated with it. There are various places within VCL that user groups can be used, with the primary place being granting access to resources in the privilege tree.
usergroupmembers | Tracks which users are members of which user groups.
vmhost | Contains an entry for each virtual host.
vmtype | Contains all of the virtual machine types.

Table 2.1 Main VCL concepts (as reflected by database tables) (Source: [11])
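The relationship between the resourcegroupmembers and resourcemap tables can be made concrete with a small lookup. The Python sketch below is a simplified, hypothetical illustration: the group names, image names and computer names are invented, plain dictionaries stand in for the database tables, and user privileges are ignored.

```python
# Hypothetical group memberships (the role of resourcegroupmembers).
image_groups = {
    "linux-images": {"eclipse", "postgres"},
    "windows-images": {"matlab"},
}
computer_groups = {
    "kvm-slots": {"vm-1", "vm-2"},
    "esxi-slots": {"vm-3"},
}

# Hypothetical image group -> computer group mappings (the role of resourcemap).
resource_map = {
    "linux-images": {"kvm-slots"},
    "windows-images": {"esxi-slots", "kvm-slots"},
}

def computers_for_image(image):
    """Return every computer in a computer group mapped to one of the image's groups."""
    result = set()
    for group, images in image_groups.items():
        if image in images:
            for computer_group in resource_map.get(group, set()):
                result |= computer_groups[computer_group]
    return result

print(sorted(computers_for_image("eclipse")))
print(sorted(computers_for_image("matlab")))
```

The indirection through groups is what allows an administrator to make a new image runnable on a whole class of computers by adding it to a single image group.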


2.3.3 Management node

The VCL management node, or backend, queries the database very frequently, for the same reason as the frontend does: the system state is stored there. The main task of the backend is to manage the computer nodes, that is, to load, preload, stop, restart and configure them at specific times. The management node contains provisioning engines for bare-metal servers and for virtual machines, which communicate with the physical or virtual nodes in order to deploy, start or stop the environments requested by users. The VCL daemon running on the management node polls the database at short intervals to find out whether there are any new requests. For example, if there is at least one new request and its start time is close enough to the current time, the deployer deploys the image requested by the user. When the deployment is finished, the user is notified via the web portal that the compute environment is ready to use. [8]

In a production VCL environment the image library is generally shared storage, either Network Attached Storage (NAS) or a Storage Area Network (SAN). The image library holds the image files, the image metadata and the Linux install trees. The backend uses it during image deployment and other image-related operations.

2.3.4 Network layout

This section describes the typical network layout required for VCL. VCL was originally architected using the classic approach of physically separating the “workload” and the management communications. As a result, in most cases every host that takes part in VCL (e.g. management node, hypervisor hosts) has to have two physical interfaces: 1. public network: users access the virtual machines remotely through this network; 2. private network: this network is used by the provisioning modules when a compute node is reloaded (ESX, VMware, etc.). [12] Figure 2.3 shows the typical network layout.

Figure 2.3 VCL typical network layout (Source: [12])

2.3.5 Privileges

“Users are granted access to parts of the VCL web site and to resources through

the privilege tree. User permissions and resource attributes can both be cascaded down

from one node to all of its children. Additionally, cascaded user permissions and

resource attributes can be blocked at a node so that they do not cascade down to that

node or any of its children.

There are a number of user permissions that can be granted to users. They can be

granted to users directly or to user groups.” [13]
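The cascading and blocking behaviour described in the quotation can be illustrated with a small sketch. The `Node` class, its fields and the permission names are hypothetical and are not taken from the VCL code base. The sketch only assumes that a node can grant permissions to a user, mark some of them as cascading, and block whatever would cascade down from its ancestors; resource attributes are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    """One node of a hypothetical privilege tree (all names are illustrative)."""
    name: str
    parent: Optional["Node"] = None
    # user -> permissions granted directly at this node
    grants: Dict[str, Set[str]] = field(default_factory=dict)
    # user -> permissions granted here that cascade down to children
    cascading: Dict[str, Set[str]] = field(default_factory=dict)
    # if True, permissions cascaded from ancestors stop at this node
    block: bool = False


def inherited(node: Node, user: str) -> Set[str]:
    """Permissions that reach `node` from its ancestors."""
    if node.parent is None or node.block:
        return set()
    parent = node.parent
    return inherited(parent, user) | parent.cascading.get(user, set())


def effective(node: Node, user: str) -> Set[str]:
    """Permissions cascaded from ancestors plus those granted at the node."""
    return inherited(node, user) | node.grants.get(user, set())
```

Because a blocking node cuts off everything inherited from above, neither the node itself nor its children see the ancestors' permissions, while cascading permissions granted at the blocking node still reach its own children.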

2.4 Apache VCL of BME DMIS

The university Apache VCL environment is used for laboratories, homework assignments and research purposes. The VCL management node and the MySQL database are installed on separate servers. There are 9 physical hosts that are responsible for hosting virtual machines/images. Each server is an IBM x3550 with 8 Central Processing Unit (CPU) cores, 32 GB of memory and 2 × 136 GB Serial Attached SCSI (SAS) disks. One additional server belongs to the environment: the 1 TB Network File System (NFS) shared storage. Figure 2.4 depicts the key elements of the VCL environment of BME DMIS.

[Network diagram. Labels recovered from the image: x3550 hosts running ESXi with cloud VMs (vm-tiny-xx, vm-small-xx, vm-medium-xx, vm-large-xx, vm-xlarge-xx, vm-2xlarge-xx, vm-4xlarge-xx, vm-test-xx); NFS storage; bonded, public, private and storage networks; Internet; gateway (Shibboleth, OpenVPN); management node (MySQL, VCL backend); x3550 host for service VMs (XenServer 6.2; CentOS 6.4, CentOS 6.5, Windows 8, Windows Server 2012); ESXi administrative console; Nagios; Collectd; DNS forwarder; PXE boot server; database; ESXi performance monitoring; administrator and user VPN clients.]

Figure 2.4 Network topology of VCL of BME DMIS (Source: [14])

2.5 Modeling allocation policies of Apache VCL

To achieve my goals, it is essential to understand the computer allocation processes. By allocation process I mean how a computer is assigned to a specific user request. The second notable part is the preload scheduling logic: it tries to predict which images users will want in the future, and the predictive module preloads the most probable images to reduce image deployment times for ad-hoc requests. As these policies are not documented, investigating and understanding them took considerable time.

There are three main types of allocation in Apache VCL version 2.3: normal reservation by a user/student, block allocation by a user/teacher, and the predictive loading modules. Each of these allocation types is described below.

2.5.1 Allocation by normal reservation

Figure 2.5, Figure 2.6, Figure 2.7, Figure 2.8 and Figure 2.9 show the main function calls and other thesis-relevant events that occur when a user submits a request on the New Reservation page in VCL. The flow chart was created by reading through the VCL Perl and PHP code. For more information, consult the VCL source code.

To serve users’ requests, the VCL scheduler (part of the frontend, written in PHP) executes the functions and tasks shown in Figure 2.5, Figure 2.6, Figure 2.7, Figure 2.8 and Figure 2.9. When the user opens the New Reservation page, the function newReservation() in requests.php prints the form for submitting a new request. The function getUserResources() in utils.php returns the resources the user has access to.

When the user has chosen the image and the reservation time and submits the request, the VCL scheduler checks whether the request fits in the schedule. If it fits, the request is added; in either case the user is notified. The function submitRequest() in requests.php is called when the user hits the submit button on the New Reservation page. To check whether the request can be served, submitRequest() calls isAvailable(). A return value greater than 0 means that there is an available computer on which the environment can be run, so the user can be served. In the other cases (-3, -2, -1, 0) the user is shown an error message corresponding to the error case.
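The decision on the return value of isAvailable() can be summarized in a few lines. This is only an illustrative sketch written in Python rather than in the PHP of the VCL frontend; the function names and the message strings are hypothetical, and the meanings of the individual error codes are deliberately not spelled out because the text does not give them.

```python
def can_be_served(is_available_result: int) -> bool:
    """True if the value returned by isAvailable() denotes a usable computer."""
    return is_available_result > 0


def handle_submission(is_available_result: int) -> str:
    """Mirror the two outcomes described in the text (messages are placeholders)."""
    if can_be_served(is_available_result):
        return "reservation accepted"
    # The text lists -3, -2, -1 and 0 as error cases; their individual
    # meanings are not given there, so only the code is reported.
    return f"reservation rejected (code {is_available_result})"
```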

[Flow chart. Steps recovered from the image: the user clicks on the New Reservation page; newReservation() prints the form for submitting a new reservation; getUserResources() builds and returns the list of resources the user has access to; getImages() returns an array of images; checkValidImage() checks whether a valid environment was selected (or the user views the available reservation times); the flow branches on "image and reservation time chosen" versus "no valid environment or reservation time".]

Figure 2.5 VCL normal reservation flow chart, part one

Figure 2.6 VCL normal reservation flow chart, part two

Figure 2.7 VCL normal reservation flow chart, part three

Figure 2.8 VCL normal reservation flow chart, part four

Figure 2.9 VCL normal reservation flow chart, part five

2.5.2 Block allocation

Block allocation is a way to have a set of compute nodes preloaded with a particular environment at a specified time for a specific group of users. This is ideal when a group of students needs access to the same image for a limited time (classes or workshops). It can be made available on a repeating schedule, such as when a course meets each week. The images are preloaded before the start time of the workshop. When the workshop starts, only the members of the given user group get access to the block-allocated environments. After the lab or workshop is over, the resources become available again to users from other user groups.

Block allocation only allocates machines for the group of users; it does not create the actual end-user reservations for the machines. All users must still log in to the VCL web site and make their own reservations during the period in which a block allocation is active. The forms on the Block Allocations page provide a way for a user to submit a request for a block allocation. After reviewing the block allocation requests, the system administrator approves or rejects them. A user who just needs to use a machine through VCL has to use the New Reservation page instead of submitting a block allocation.

Figure 2.10, Figure 2.11, Figure 2.12, Figure 2.13 and Figure 2.14 show the main function calls and other thesis-relevant events that occur when a user submits a block allocation on the Block Allocations page in VCL. The flow chart was created by reading through the VCL Perl and PHP code; for more information, consult the VCL source code. The figures contain the substantial operations of a block allocation. If the system administrator approves a block allocation, its details are stored in the VCL database. The Perl function main() is the entry point of vcld, the VCL daemon. This code runs continuously on the backend and plays a key role in processing block allocations. It gets all the block requests assigned to this management node and loops through them. For each one it checks whether the block request is already being processed. If not, it calls check_blockrequest_time() in utils.pm, which checks the start, end and expire times of the block request. If the expire time is in the past, it returns expire so that the block request is removed. If the current time is 30 minutes to 6 hours before the start time of the block allocation, the request has to be started and the function returns start. If the end time is less than 1 minute from the current time, it returns end so that the request is ended. If there are any block requests to be processed, make_new_child() is called to begin processing. It then calls Process() in blockrequest.pm, which acts according to the mode (start, end or expire) that was set before make_new_child() was called. Process() is used to start or end nodes physically.
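The timing rules of check_blockrequest_time() can be sketched as follows. This is a Python paraphrase of the prose above and not a transcription of the Perl code in utils.pm: the function name, the order in which the three conditions are tested and the handling of boundary values are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def block_request_mode(now: datetime, start: datetime, end: datetime,
                       expire: datetime) -> Optional[str]:
    """Return 'expire', 'start', 'end' or None, following the prose rules."""
    if expire < now:
        return "expire"  # expire time is in the past: remove the request
    lead = start - now
    if timedelta(minutes=30) <= lead <= timedelta(hours=6):
        return "start"   # 30 minutes to 6 hours before the start time
    if end - now < timedelta(minutes=1):
        return "end"     # end time is less than 1 minute away
    return None          # nothing to do for this request yet
```

A return value of None stands for a block request that needs no processing in the current pass of the daemon loop.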

Figure 2.10 VCL block allocation flow chart, part one

Figure 2.11 VCL block allocation flow chart, part two

Figure 2.12 VCL block allocation flow chart, part three

Figure 2.13 VCL block allocation flow chart, part four

Figure 2.14 VCL block allocation flow chart, part five

2.5.3 Predictive loading modules

Figure 2.15, Figure 2.16, Figure 2.17 and Figure 2.18 present the main function calls and other thesis-relevant events that occur when the VCL backend decides which image will be loaded next on the machines. VCL uses two types of prediction algorithm: Level_0 and Level_1. The default Level_0 algorithm can be changed on the Management Node Information page of the VCL UI. The flow charts were created by reading through the source code of the VCL backend.

The Level_0 algorithm, visualized in Figure 2.15, is the simpler one. When a reservation expires, the function get_next_image() in Level_0.pm, which contains the algorithm, is called. If the node is part of a block reservation, the next image is the one reserved in the block reservation. If there is a reservation for the specific computer with a start time less than 50 minutes away and the request state is new, reload or imageprep, then the next image is the one that belongs to that request. Otherwise there are no upcoming reservations on the computer, so the algorithm fetches the next image information and reloads that image. The next image information can be edited by the administrator of the computer on the Computers page.
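The three-step decision of Level_0 can be sketched in Python as follows. The `Request` class, the function name and its parameters are hypothetical and do not correspond to the Perl code in Level_0.pm; picking the earliest of several qualifying reservations is an additional assumption that the text does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Request states that qualify an upcoming reservation, as listed in the text.
PRELOAD_STATES = {"new", "reload", "imageprep"}


@dataclass
class Request:
    image: str
    start: datetime
    state: str


def level0_next_image(now: datetime, block_image: Optional[str],
                      requests: List[Request], admin_next_image: str) -> str:
    """Choose the next image for one computer following the Level_0 rules."""
    # 1. The node is part of a block reservation: load the block's image.
    if block_image is not None:
        return block_image
    # 2. A reservation for this computer starts in less than 50 minutes.
    horizon = now + timedelta(minutes=50)
    upcoming = [r for r in requests
                if r.state in PRELOAD_STATES and r.start < horizon]
    if upcoming:
        # Assumption: the earliest qualifying reservation wins.
        return min(upcoming, key=lambda r: r.start).image
    # 3. No upcoming reservation: use the next image set by the administrator.
    return admin_next_image
```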

[Flow chart. Steps recovered from the image: request in reload phase; check whether the node is part of a block reservation with get_block_request_image_info(), which checks the blockcomputers table for a matching image; select the image for the given computer where the request state is in (new, reload, imageprep); check whether there is a reservation for the computer with a start time less than 50 minutes from now; if yes, use the image that belongs to that reservation; if no, there are no upcoming reservations on the computer, so fetch the next image information (next image policy set by the admin) and reload the next image of the computer; next image chosen.]

Figure 2.15 VCL predictive loading module for Level_0 algorithm flow chart

Level_1, shown in Figure 2.16, Figure 2.17 and Figure 2.18, is more sophisticated than Level_0 because it takes past usage into account. First, the online computers (those whose state is available, reserved, reloading, inuse or timeout) are counted, then the available computers (those whose state is available). After that, two variables are calculated: $\mathit{notavail} = \mathit{online} - \mathit{available}$ and $\mathit{usage} = (\mathit{notavail} / \mathit{online}) \cdot 100\,\%$. Based on the usage, the following time frame is determined; if the usage is greater than:

40%, timeframe = 1 day

35%, timeframe = 2 days

30%, timeframe = 3 days

25%, timeframe = 4 days

20%, timeframe = 5 days

15%, timeframe = 10 days

10%, timeframe = 20 days

5%, timeframe = 30 days

0%, timeframe = 2 months

Using this time frame, the algorithm determines the image that was most popular within it but is not currently loaded. Only images that can run on the computer according to the resource mapping rules can be selected. If something goes wrong and no image is selected, the algorithm falls back to the next image information of the machine.
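The threshold table and the selection rule can be sketched in Python as follows. Several points are assumptions rather than facts taken from Level_1.pm: usage is taken to be the share of online computers that are not available, "2 months" is approximated as 60 days, a usage of exactly 0% is also mapped to the longest time frame, and the function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set

# (usage threshold in percent, time frame in days), checked from the top down.
TIMEFRAMES = [(40, 1), (35, 2), (30, 3), (25, 4), (20, 5),
              (15, 10), (10, 20), (5, 30)]
LONGEST_TIMEFRAME_DAYS = 60  # "2 months", approximated (assumption)


def usage_percent(online: int, available: int) -> float:
    """Assumed definition: percentage of online computers that are not available."""
    if online == 0:
        return 0.0
    return (online - available) * 100 / online


def timeframe_days(usage: float) -> int:
    """Map a usage percentage to the look-back window in days."""
    for threshold, days in TIMEFRAMES:
        if usage > threshold:
            return days
    return LONGEST_TIMEFRAME_DAYS


def level1_next_image(reserved_in_timeframe: Iterable[str], loaded: Set[str],
                      runnable: Set[str], admin_next_image: str) -> str:
    """Most popular image in the window that is runnable here and not loaded."""
    counts = Counter(img for img in reserved_in_timeframe
                     if img in runnable and img not in loaded)
    if not counts:
        return admin_next_image  # fallback: next image information of the machine
    return counts.most_common(1)[0][0]
```

The higher the current usage, the shorter the window, so a busy system preloads according to recent demand while an idle one looks further back.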

Figure 2.16 VCL predictive loading module for Level_1 algorithm flow chart, part one

Figure 2.17 VCL predictive loading module for Level_1 algorithm flow chart, part two

Figure 2.18 VCL predictive loading module for Level_1 algorithm flow chart, part three

3 Simulation model

Having examined Apache VCL and understood its most important parts, the next step is to find an appropriate cloud simulator that can faithfully model and simulate the operation of Apache VCL, including the handling of reservation requests and slot management. The simulator must also be able to model hybrid virtual computing labs, namely Apache VCL as a private and/or public cloud computing solution. This chapter shows how the CloudSim simulation framework works and how it was extended to meet these requirements and achieve my goals.

3.1 Cloud simulation

Simulation is a much faster way to obtain the needed statistical results, because there is no need to wait for the real VCL system to run: the simulator simulates the execution without physically carrying out the operations. For example, the simulator does not need to physically deploy requested images on VMs.

Finding a cloud computing simulator that can be used for my master's thesis without any code modification is not an easy task. Another problem is that most of the current well-known free and open-source grid or cloud simulators, such as GangSim, SimGrid, iCanCloud, GridSim or CloudSim, cannot be used directly to model hybrid virtual computing lab environments. [15][16][17][18][19] The lack of documentation and support also hinders the search for the ideal simulator. Developing a new simulator would be a huge amount of work, and it is not even necessary, because among the listed simulators CloudSim seemed good enough considering its logic, functionality, support, documentation, code quality and usage. Of course, the CloudSim framework needs to be extended to simulate VCL as well. Eventually CloudSim was chosen to model the operation of Apache VCL.

3.2 CloudSim: the simulation framework

CloudSim was developed at the University of Melbourne, Australia, in the Cloud Computing and Distributed Systems (CLOUDS) Laboratory of the Department of Computer Science and Software Engineering. CloudSim is an extensible simulation framework that allows modeling, simulation, and experimentation of cloud computing infrastructures and application services. [19] CloudSim can be used for initial performance testing: Apache VCL’s image reservations can be implemented, and the performance of VCL in hybrid cloud environments tested, with little programming and deployment effort.

Thesis-relevant functionalities of CloudSim:

• support for modeling and simulation of (federated) cloud computing data centers

• support for modeling and simulation of virtualized server hosts, with customizable policies for provisioning host resources to virtual machines

• support for dynamic insertion of simulation elements, stop and resume of simulation

• support for user-defined policies for allocation of hosts to virtual machines and policies for allocation of host resources to virtual machines

3.2.1 Architecture

Figure 3.1 shows the multi-layered design of the CloudSim software framework

and its architectural components.

Figure 3.1 CloudSim simulation layers (Source: [20])

“The CloudSim simulation layer provides support for modeling and simulation

of virtualized cloud-based data center environments including dedicated management

interfaces for virtual machines (VMs), memory, storage, and bandwidth. The

fundamental issues such as provisioning of hosts to VMs, managing application

execution, and monitoring dynamic system state are handled by this layer. It is possible

to compare different policies in allocating hosts to VMs (VM provisioning). This layer

also supports different policies in provisioning hosts to VMs. A host can be

concurrently allocated to a set of VMs that execute applications/images. User Code

exposes basic entities for hosts (number of machines), applications (number of tasks and

their requirements), VMs, number of users and their application types, and broker

scheduling policies.” [20]

3.2.1.1 Modeling the cloud

“The Datacenter entity manages host entities. The hosts are assigned to one or

more VMs based on a VM allocation policy. Here, the VM policy stands for the

operations control policies related to VM life cycle such as: provisioning of a host to a

VM, creation, destruction, and migration of a VM. Similarly, one or more application

services can be provisioned within a single VM instance.

A Datacenter entity can manage several hosts that in turn manage VMs during

their life cycles. Host is a CloudSim component that represents a physical computing

server in the cloud: it is assigned to pre-configured processing capability (expressed in

millions of instructions per second (MIPS)), memory, storage, and a provisioning policy

for allocating processing cores. VM allocation (provisioning) is the process of creating

VM instances on hosts that match the critical characteristics (storage, memory),

configurations (software environment), and requirements (availability zone) of the

provider.

An application service is assigned to one or more pre-instantiated VMs through

a service specific allocation policy. Allocation of application-specific VMs to Hosts in a

cloud-based data center is the responsibility of a Virtual Machine Allocation controller

component (called VmAllocationPolicy). By default, VmAllocationPolicy implements a

straightforward policy that allocates VMs to the host in First-Come-First-Serve (FCFS)

basis. Hardware requirements such as the number of processing cores, memory and

storage form the basis for the provisioning.

For each Host component, the allocation of processing cores to VMs is done

based on a host allocation policy. This policy takes into account several hardware

characteristics such as number of CPU cores, CPU share, and amount of memory that

are allocated to a given VM instance. CloudSim supports several simulation scenarios

that assign specific CPU cores to specific VMs (a space-shared policy) or dynamically

distribute the capacity of a core among VMs (time-shared policy), and assign cores to

VMs on demand. Each host component also instantiates a VM scheduler component,

which can either implement the space-shared or the time-shared policy for allocating

cores to VMs.” [20]
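The difference between the space-shared and the time-shared policy quoted above can be illustrated with a short sketch. It is a simplified Python illustration and not CloudSim's Java implementation: the function names are hypothetical, all cores are assumed to have the same MIPS rating, and a VM that does not fit under the space-shared policy is simply given no capacity.

```python
from typing import List


def space_shared(num_cores: int, mips_per_core: float,
                 vm_core_requests: List[int]) -> List[float]:
    """Dedicated cores: each VM gets whole cores or nothing (it would wait)."""
    free = num_cores
    allocation = []
    for cores in vm_core_requests:
        if cores <= free:
            free -= cores
            allocation.append(cores * mips_per_core)
        else:
            allocation.append(0.0)
    return allocation


def time_shared(num_cores: int, mips_per_core: float,
                vm_core_requests: List[int]) -> List[float]:
    """Shared cores: capacity is scaled down when requests exceed the cores."""
    requested = sum(vm_core_requests)
    if requested <= num_cores:
        return [cores * mips_per_core for cores in vm_core_requests]
    share = num_cores / requested
    return [cores * mips_per_core * share for cores in vm_core_requests]
```

With two cores of 1000 MIPS and three single-core VMs, the space-shared policy serves two VMs at full speed and leaves the third waiting, whereas the time-shared policy runs all three at two thirds of a core each.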

3.2.1.2 Modeling the cloud market

“Modeling of costs and economic policies are important aspects to be considered

when cloud simulator is designed. The cloud market is modeled based on a two layers

design. The first layer contains economic features related to the IaaS model such as cost

per unit of memory, cost per unit of storage, and cost per unit of used bandwidth. Cloud

customers (SaaS providers) have to pay for the costs of memory and storage when they

create and instantiate VMs, whereas the costs for network usage are only incurred in

event of data transfer. The second layer models the cost metrics related to SaaS model.

Costs at this layer are directly applicable to the task units (application service requests)

that are served by the application services. Hence, if a cloud customer provisions a VM

without an application service (task unit), then they would only be charged for layer 1

resources (i.e. the costs of memory and storage).” [20]
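The two-layer cost model in the quotation can be written down as a small calculation. The sketch below is only an illustration under stated assumptions: the `Prices` class, the function names, the per-megabyte units and the representation of layer 2 as a list of per-task costs are all hypothetical and are not CloudSim's API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Prices:
    """Layer 1 (IaaS) unit prices; the units are illustrative."""
    per_mb_ram: float
    per_mb_storage: float
    per_mb_transferred: float


def layer1_cost(prices: Prices, ram_mb: float, storage_mb: float,
                transferred_mb: float = 0.0) -> float:
    """Memory and storage are charged on VM creation, bandwidth only on transfer."""
    return (ram_mb * prices.per_mb_ram
            + storage_mb * prices.per_mb_storage
            + transferred_mb * prices.per_mb_transferred)


def total_cost(prices: Prices, ram_mb: float, storage_mb: float,
               transferred_mb: float = 0.0,
               task_costs: Sequence[float] = ()) -> float:
    """Layer 1 plus layer 2 (SaaS): the costs of the served task units."""
    return layer1_cost(prices, ram_mb, storage_mb, transferred_mb) + sum(task_costs)
```

A VM that is provisioned without any task unit and transfers no data is charged only for memory and storage, which matches the last sentence of the quotation.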

3.2.1.3 Modeling a federation of clouds

“In order to federate multiple clouds, there is a requirement for modeling a cloud

coordinator entity. This entity is responsible not only for communicating with other data

centers and end-users in the simulation environment, but also for monitoring and

managing the internal state of a data center entity. The information received as part of

the monitoring process, that is active throughout the simulation period, is utilized for

making decisions related to inter-cloud provisioning.” [20]

3.2.1.4 Modeling dynamic entities creation

“CloudSim supports dynamic creation of different kinds of entities. Apart from

the dynamic creation of user and broker entities, it is also possible to add and remove

data center entities at run time. After creation, new entities automatically register

themselves in Cloud Information Service (CIS) to enable dynamic resource discovery.”

[20]

3.2.2 Design and implementation

This section details the fundamental classes of CloudSim, which are the building blocks of the simulator. The overall class design diagram for CloudSim is shown in Figure 3.2.

Figure 3.2 CloudSim class design diagram (Source: [20])

BwProvisioner: This class models the policy for provisioning of bandwidth to

VMs. The main role of this component is to undertake the allocation of network

bandwidths to a set of competing VMs that are deployed across the data center.

BwProvisioningSimple allows a VM to reserve as much bandwidth as required,

however this is constrained by the total available bandwidth of the host.

CloudCoordinator: This abstract class extends a cloud-based data center to the

federation. It is responsible for periodically monitoring the internal state of data center

resources. Concrete implementation of this component includes the specific sensors and

the policy that should be followed during load-shedding. This component can also be

extended for simulating cloud-based services such as the Amazon Elastic Compute

Cloud (EC2) Load-Balancer.

Cloudlet: This class models the cloud-based application. CloudSim orchestrates

the complexity of an application in terms of its computational requirements. Every

application service has a pre-assigned instruction length and data transfer overhead that

it needs to undertake during its life-cycle.

CloudletScheduler: This abstract class is extended by implementation of

different policies that determine the share of processing power among cloudlets in a

virtual machine. Two types of provisioning policies are offered: space-shared

(CloudletSchedulerSpaceShared) and time-shared (CloudletSchedulerTimeShared).

Datacenter: This class models the core infrastructure level services that are

offered by cloud providers (for example Amazon EC2, Microsoft Azure or Google App

Engine). It encapsulates a set of compute hosts that can either be homogeneous or

heterogeneous with respect to their hardware configurations (memory, cores, capacity,

and storage). Furthermore, every datacenter component instantiates a generalized

application provisioning component that implements a set of policies for allocating

bandwidth, memory, and storage devices to hosts and VMs.

DatacenterBroker: This class models a broker, which is responsible for

mediating negotiations between SaaS and cloud providers. The broker acts on behalf of

SaaS providers. It discovers suitable Cloud service providers by querying the CIS and

undertakes on-line negotiations for allocation of resources/services. The difference

between the broker and the CloudCoordinator is that the former represents the customer,

while the latter acts on behalf of the data center.

DatacenterCharacteristics: This class contains configuration information of data

center resources.

Host: This class models a physical resource such as a compute or storage server.

It encapsulates important information such as the amount of memory and storage, a list

and type of processing cores, an allocation policy for sharing the processing power

among virtual machines, and policies for provisioning memory and bandwidth to the

virtual machines.

NetworkTopology: This class contains the information for inducing network

behavior (latencies) in the simulation. It stores the topology information, which is

generated using the Boston University Representative Internet Topology Generator (BRITE).

RamProvisioner: This is an abstract class that represents the provisioning policy

for allocating RAM to the VMs. The execution and deployment of VM on a host is

feasible only if the RamProvisioner component approves that the host has the required

amount of free memory. The RamProvisionerSimple does not enforce any limitation on

the amount of memory a VM may request. However, if the request is beyond available

memory capacity then it is simply rejected.

SanStorage: This class models a storage area network. SanStorage implements a

simple interface that can be used to simulate storage and retrieval of any amount of data,

subject to the availability of network bandwidth.

Sensor: This interface must be implemented to instantiate a sensor component

that can be used by a CloudCoordinator for monitoring specific performance

parameters.

VM: This class models a virtual machine, which is managed and hosted by a

cloud host component. Every VM component has access to a component that stores the

following characteristics related to a VM: accessible memory, processor, storage size,

and the VM’s internal provisioning policy that is extended from an abstract component

called the CloudletScheduler.

VmAllocationPolicy: This abstract class represents a provisioning policy that a

VM Monitor utilizes for allocating VMs to Hosts. The chief functionality of the

VmAllocationPolicy is to select available host in a data center that meets the memory,

storage, and availability requirement for a VM deployment.

VmScheduler: This is an abstract class implemented by a host component that

models the policies (space-shared, time-shared) required for allocating processor cores

to VMs.
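How a provisioner check and a VM allocation policy work together can be shown with a brief sketch. It is written in Python for illustration and does not reproduce CloudSim's Java classes: the `Host` class, its fields and `allocate_vm()` are hypothetical, and the first-fit rule over the host list was chosen for brevity rather than taken from CloudSim.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    ram: int            # total memory
    storage: int        # total storage
    used_ram: int = 0
    used_storage: int = 0

    def try_allocate(self, ram: int, storage: int) -> bool:
        """Accept any request size, but reject it if it exceeds what is free."""
        if ram > self.ram - self.used_ram:
            return False
        if storage > self.storage - self.used_storage:
            return False
        self.used_ram += ram
        self.used_storage += storage
        return True


def allocate_vm(hosts: List[Host], ram: int, storage: int) -> Optional[str]:
    """First-fit placement: the first host that can take the VM hosts it."""
    for host in hosts:
        if host.try_allocate(ram, storage):
            return host.name
    return None  # no host meets the requirements
```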

3.2.3 Simulation framework

CLOUDS Lab developed its own discrete event management framework, which is used in the latest version of CloudSim. The class diagram of the framework is presented in Figure 3.3.

Figure 3.3 CloudSim simulation framework (Source: [20])

“CloudSim: This is the main class, which is responsible for managing event

queues and controlling step by step (sequential) execution of simulation events. Every

event that is generated by the CloudSim entity at run-time is stored in the queue called

future events. These events are sorted by their time parameter and inserted into the

queue. Next, the events that are scheduled on each step of the simulation are removed

from the future events queue and transferred to the deferred event queue. Following

this, an event processing method is invoked for each entity, which chooses events from

the deferred event queue and performs appropriate actions.

DeferredQueue: This class implements the deferred event queue used by

CloudSim.

FutureQueue: This class implements the future event queue accessed by

CloudSim.

CloudInformationService: This entity provides resource registration, indexing,

and discovering capabilities.

SimEntity: This is an abstract class, which represents a simulation entity that is

able to send messages to other entities and process received messages as well as fire and

handle events. SimEntity class provides the ability to schedule new events and send

messages to other entities, where network delay is calculated according to the BRITE

model. Once created, entities automatically register with CIS.

CloudSimTags: This class contains various static event/command tags that

indicate the type of action that needs to be undertaken by CloudSim entities when they

receive or send events.

SimEvent: This entity represents a simulation event that is passed between two

or more entities. SimEvent stores information about an event that has to be passed to

the destination entity.

CloudSimShutdown: This is an entity that waits for the termination of all end-

user and broker entities, and then signals the end of simulation to CIS.

Predicate: Predicates are used for selecting events from the deferred queue.” [20]
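The interplay of the future and deferred event queues described in the quotation can be sketched as a minimal discrete-event loop. This is a Python illustration under simplifying assumptions and not CloudSim's Java implementation: `MiniSim`, its methods and the handler convention are hypothetical, entities are plain callables looked up by name, and predicates are omitted.

```python
import heapq
from itertools import count
from typing import Any, Callable, Dict, List, Tuple


class MiniSim:
    """Tiny event engine with a time-sorted future queue and a deferred queue."""

    def __init__(self) -> None:
        self.clock = 0.0
        self._future: List[Tuple[float, int, str, Any]] = []  # min-heap on time
        self._seq = count()   # tie-breaker that keeps insertion order
        self.log: List[Tuple[float, str, Any]] = []

    def schedule(self, delay: float, dest: str, data: Any) -> None:
        """Insert an event into the future queue, sorted by its time."""
        heapq.heappush(self._future,
                       (self.clock + delay, next(self._seq), dest, data))

    def run(self, handlers: Dict[str, Callable[["MiniSim", Any], None]]) -> None:
        while self._future:
            # Advance the clock to the next step of the simulation.
            self.clock = self._future[0][0]
            # Move every event scheduled for this step to the deferred queue.
            deferred = []
            while self._future and self._future[0][0] == self.clock:
                deferred.append(heapq.heappop(self._future))
            # Let the destination entities process their deferred events.
            for _, _, dest, data in deferred:
                handlers[dest](self, data)
```

A handler may call `schedule()` itself, which is how one entity's reaction produces future events for another entity.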

3.2.4 Data center internal processing

“Processing of task units is handled by VMs; therefore their progress must be

continuously updated and monitored at every simulation step. For handling this, an

internal event is generated to inform the DataCenter entity that a task unit completion is

expected in the near future. Thus, at each simulation step, each DataCenter entity

invokes a method called updateVMsProcessing() for every host that it manages.

Following this, the contacted VMs update processing of currently active tasks with the

host. The input parameter type for this method is the current simulation time and the

return parameter type is the next expected completion time of a task currently running in

one of the VMs on that host. The next internal event time is the least time among all the

finish times, which are returned by the hosts.

At the host level, invocation of updateVMsProcessing() triggers an

updateCloudletsProcessing() method that directs every VM to update its tasks unit

status (finish, suspended, executing) with the Datacenter entity. This method

implements a similar logic as described previously for updateVMsProcessing() but at

the VM level. Once this method is called, VMs return the next expected completion

time of the task units currently managed by them. The least completion time among all

the computed values is sent to the Datacenter entity. As a result, completion times are

kept in a queue that is queried by Datacenter after each event processing step. The

completed tasks waiting in the finish queue that are directly returned concern

CloudBroker or CloudCoordinator. This process is depicted on Figure 3.4 in the form of

a sequence diagram.” [20]
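
The two-level minimum search described in the quoted passage can be sketched as follows. This is only an illustration, not CloudSim source code: the class NextEventSketch, its method names and the nested-list data layout are assumptions made for this example.

```java
import java.util.List;

/** Illustrative sketch only; not the CloudSim implementation. */
final class NextEventSketch {

    /** VM level: earliest expected finish time among the tasks of one VM. */
    static double nextVmCompletion(List<Double> taskFinishTimes) {
        double earliest = Double.MAX_VALUE; // marker for "no pending task"
        for (double t : taskFinishTimes) {
            earliest = Math.min(earliest, t);
        }
        return earliest;
    }

    /** Host level: earliest expected finish time among all VMs of one host. */
    static double nextHostCompletion(List<List<Double>> vmsOnHost) {
        double earliest = Double.MAX_VALUE;
        for (List<Double> vm : vmsOnHost) {
            earliest = Math.min(earliest, nextVmCompletion(vm));
        }
        return earliest;
    }

    /** Data center level: the next internal event is the least host value. */
    static double nextInternalEvent(List<List<List<Double>>> hosts) {
        double earliest = Double.MAX_VALUE;
        for (List<List<Double>> host : hosts) {
            earliest = Math.min(earliest, nextHostCompletion(host));
        }
        return earliest;
    }
}
```

Each level only reports its own minimum upwards, so the data center never has to inspect individual task units.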

Figure 3.4 CloudSim data center internal processing (Source: [20])

3.2.5 Communication among entities

“Figure 3.5 depicts the flow of communication among core CloudSim entities.

At the beginning of a simulation, each Datacenter entity registers with the CIS Registry.

Next, the DataCenterBrokers acting on behalf of users consult the CIS service to obtain

the list of cloud providers who can offer infrastructure services that match application’s

hardware, and software requirements. In the event of a match, the DataCenterBroker

deploys the application with the CIS suggested cloud. The communication flow

described relates to the basic flow in a simulated experiment. Some variations in this

flow are possible depending on policies. For example, messages from

DataCenterBrokers to Datacenters may require a confirmation from other parts of the

Datacenter, about the execution of an action, or about the maximum number of VMs

that a user can create.” [20]

Figure 3.5 CloudSim communication among entities (Source: [20])

3.3 Simplified model of Apache VCL

To model virtual computing labs, I first derived a reduced model of VCL, because VCL stores a great deal of information that is outside the scope of this work. There were two options. The brute-force option was to model everything VCL does. The second option was to study the logic of VCL, collect only the necessary information, and build my own model from this filtered knowledge. The second option was clearly the better choice, since large parts of VCL are not relevant here. Hereafter I call my own model the “simplified model”, because it simulates only the relevant functionality of Apache VCL rather than all of it. The configuration diagram of the simplified model is shown in Figure 3.6.

The simplified model considers only virtual machines (vmslots), so I do not deal with non-virtual computers. The entities of the simplified model correspond to the tables of the VCL schema described in Table 2.1.

Entities and attributes shown in the diagram:

Image: ImageID : int, ImageName : string, ProcCoreNumber : int, ProcSpeedMhz : int, MemorySizeMB : int, DiskSizeGB : int, NetworkSpeedMbps : int, MaxConcurrentUsage : int, EstimatedReloadTimeMin : int

User: UserID : int, UserName : string

UserGroup: UserGroupID : int, MaxOverlappingReservation : int

VMSlot: VMSlotID : int, VMSlotName : string, State, ProcCoreNumber : int, ProcSpeedMhz : int, MemorySizeMB : int, DiskSizeGB : int, NetworkSpeedMbps : int, InBlockAlloc : bool

VMSlotGroup: VMSlotGroupID : int, VMSlotGroupName : string

VMHost: VMHostID : int, ProcCoreNumber : int, ProcSpeedMhz : int, MemorySizeMB : int, DiskSizeGB : int, NetworkSpeedMbps : int, MaxVMSlots : int

Reservation: ReservationID : int, StartTime : char, EndTime : char, BlockReservation : bool

Request: RequestID : int, State, StartTime : char, EndTime : char, HasProcessed, BeingProcessed, BlockRequest : bool, NumberVMSlots : int

ManagementNode: ManagementNodeID : int, ManagementNodeName : string

Associations between the entities (with multiplicities 1, 0..1, 1..* and *): belongs to, has, owns, assigned to, mapped to, configured next image, handled by, processed to, reserved, selected (BlockAllocation).

Figure 3.6 Simplified model of VCL

3.3.1 Simplified allocation by normal reservation

Following the same considerations as in chapter 3.3, I recreated the flow chart of the normal reservation presented in chapter 2.5.1 (see Figure 3.7). Apart from this simplification, I did not make any fundamental change to the allocation process of the normal reservation.

Steps of the flow chart (entities of the simplified model used by a step are marked with #):

1. The user submits a new reservation. A new request is created (Create new Request()) and the request attributes are read from it (Request).

2. Determine how many overlapping reservations the user can have, based on the groups the user is a member of (UserGroup.MaxNumberImage).

3. Check the maximum concurrent usage of the image (Image.MaxConcurrent).

4. Get the vmslots that can run the specific image (#Mapping Images-VMSlots).

5. Get the list of available, not scheduled vmslots for that specific time (#Reservation).

6. Get the list of vmslots the image can be provisioned to (#VMSlot). The resource requirements have to be fulfilled by the vmslots, and the vmslots are ranked in descending order by procspeed * procnumber, RAM and network. Return:
   A: vmslots that are not preloaded, provisionable
   B: vmslots that are preloaded, provisionable
   C: all vmslots that are part of a block allocation the logged-in user is a part of and that are available between start and end time

7. Find vmslots whose hosts can handle the required RAM (#VMHost). This is not needed if VMs with the requested image are already available, because they would already fit within the host's available RAM.

8. Determine a vmslot to use from the returned arrays A, B, C, looking at the arrays in that order, and try to allocate a management node for it. Return: the first vmslot that passed.

9. Is the result empty? Yes: no vmslot is available for the specific time. No: an available vmslot was found; end of the reservation procedure and beginning of provisioning.

Figure 3.7 Simplified normal reservation flow chart
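
The filter-and-rank step of the flow chart (resource requirements fulfilled, then descending order by procspeed * procnumber, RAM and network) can be illustrated with a short sketch. The class VmSlotRankingSketch, the Slot type and the rank() method are hypothetical names introduced for this example; they are not part of VCL or of the simulator, and disk size is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch of the filter-and-rank step; all names are hypothetical. */
final class VmSlotRankingSketch {

    /** Minimal stand-in for a vmslot, also used to describe image requirements. */
    static final class Slot {
        final String name;
        final int cores;
        final int speedMhz;
        final int memoryMb;
        final int networkMbps;

        Slot(String name, int cores, int speedMhz, int memoryMb, int networkMbps) {
            this.name = name;
            this.cores = cores;
            this.speedMhz = speedMhz;
            this.memoryMb = memoryMb;
            this.networkMbps = networkMbps;
        }

        /** First ranking key from the flow chart: procspeed * procnumber. */
        long cpuCapacity() {
            return (long) cores * speedMhz;
        }
    }

    /** Keeps the slots that fulfil the requirements, best-equipped slot first. */
    static List<Slot> rank(List<Slot> slots, Slot required) {
        List<Slot> fitting = new ArrayList<>();
        for (Slot s : slots) {
            boolean fits = s.cores >= required.cores
                    && s.speedMhz >= required.speedMhz
                    && s.memoryMb >= required.memoryMb
                    && s.networkMbps >= required.networkMbps;
            if (fits) {
                fitting.add(s);
            }
        }
        // Descending order on all three keys: CPU capacity, RAM, network.
        Comparator<Slot> bestFirst = Comparator
                .comparingLong((Slot s) -> s.cpuCapacity())
                .thenComparingInt(s -> s.memoryMb)
                .thenComparingInt(s -> s.networkMbps)
                .reversed();
        fitting.sort(bestFirst);
        return fitting;
    }
}
```

The first element of the returned list corresponds to "the first vmslot that passed" in the flow chart, under the assumption that no further checks reject it.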

3.3.2 Simplified block allocation

Following the same considerations as in chapter 3.3, I recreated the flow chart of the block allocation presented in chapter 2.5.2 (see Figure 3.8). Apart from this simplification, I did not make any fundamental change to the allocation process of the block reservation.

Steps of the flow chart (entities of the simplified model used by a step are marked with #):

1. vcld.main() [in vcld] runs with a timer of 12 sec and gets all the block requests assigned to this management node (#Request).

2. Loop through the block requests assigned to this management node. If the block request is already being processed, continue with the next block request.

3. Otherwise determine the mode of the block request:
   Expire time is in the past: remove the block request. Return: "expire"
   The request start time is 30 min to 6 hrs ahead: start assigning resources. Return: "start"
   The end time is less than 1 minute from now: end the block request. Return: "end"
   Otherwise the block request does not need to be processed now. Return: "0"

4. Evaluate the return value. On "0", or on "start" when the block request has already been processed, continue with the next block request. Otherwise update the BeingProcessed flag to 1 (#Request).

5. Only the "start" mode is handled; "end" and "expire" are not considered in this master thesis. If blockrequest_mode equals "start":
   Calculate the allocated vmslots and the requests to allocate.
   Determine the start time of the block vmslots (#Image.Loadtime + 10 min).
   Get the available vmslots (call isAvailable()).
   Add any vmslots from future reservations made by users in the user group, provided the vmslots are available for the whole block time (#Reservation).
   Reserve the returned vmslots and insert them into #Reservation.
   Update the HasProcessed flag to 1 (#Request).

6. All blockrequest_vmslots allocated? If not, print how many images could be allocated.

7. Return to vcld.main().

Figure 3.8 Simplified block allocation flow chart
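
The mode decision of the flow chart ("expire", "start", "end" or "0") can be written down compactly. The following sketch is an assumption-laden illustration: the class BlockRequestModeSketch and the method mode() are hypothetical, the use of java.time is my choice, and the handling of the exact boundary values (inclusive 30 min and 6 hrs limits) is not specified in the flow chart.

```java
import java.time.Duration;
import java.time.LocalDateTime;

/** Illustrative sketch of the block request mode decision; names are hypothetical. */
final class BlockRequestModeSketch {

    static String mode(LocalDateTime now, LocalDateTime start,
                       LocalDateTime end, LocalDateTime expire) {
        // 1. Expired block requests are removed.
        if (expire.isBefore(now)) {
            return "expire";
        }
        // 2. Start assigning resources 30 min to 6 hrs before the start time
        //    (inclusive limits are an assumption).
        Duration untilStart = Duration.between(now, start);
        if (untilStart.compareTo(Duration.ofMinutes(30)) >= 0
                && untilStart.compareTo(Duration.ofHours(6)) <= 0) {
            return "start";
        }
        // 3. End the block request if its end time is less than 1 minute away.
        if (Duration.between(now, end).compareTo(Duration.ofMinutes(1)) < 0) {
            return "end";
        }
        // 4. Nothing to do for this block request right now.
        return "0";
    }
}
```

The checks are evaluated in the order in which they appear in the flow chart, so an expired request is never started.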

3.3.3 Simplified predictive loading modules

Following the same considerations as in chapter 3.3, I recreated the flow charts of the predictive loading modules presented in chapter 2.5.3 (see Figure 3.9 and Figure 3.10). Apart from this simplification, I did not make any fundamental change to the predictive loading modules.

Steps of the flow chart (entities of the simplified model used by a step are marked with #):

1. The request is in the reload phase after the inuse state.

2. Is the vmslot part of a block reservation? Yes: get the block request image info (#Reservation); the next image is chosen.

3. No: select the image for the given vmslot where the request state is IN (new, reload, imageprep). Is there a reservation for the specific vmslot with a start time less than 50 minutes from now? Yes: use the image that belongs to that reservation; the next image is chosen.

4. No upcoming reservations for the vmslot: fetch the next image information and reload the next image of the vmslot (#VMSlot.NextImageID); the next image is chosen.

Figure 3.9 Simplified predictive loading module for Level_0 algorithm flow chart

Steps of the flow chart (entities of the simplified model used by a step are marked with #):

1. The request is in the reload phase after the inuse state.

2. Is the vmslot part of a block reservation? Yes: get the block request image info (#Reservation); the next image is chosen.

3. No: select the image for the given vmslot where the request state is IN (new, reload, imageprep). Is there a reservation for the specific vmslot with a start time less than 50 minutes from now? Yes: use the image that belongs to that reservation; the next image is chosen.

4. No upcoming reservations for the vmslot: count the online vmslots, VMSlot.State IN (available, reserved, reloading, inuse, timeout), and the available vmslots, VMSlot.State IN (available). Then notavail = online - available and usage = notavail / online.

5. Choose the time frame to look back at from the usage:
   usage > 40%: 1 day
   usage > 35%: 2 days
   usage > 30%: 3 days
   usage > 25%: 4 days
   usage > 20%: 5 days
   usage > 15%: 10 days
   usage > 10%: 20 days
   usage > 5%: 30 days
   else: 2 months

6. Fetch the preferred, possible images for the vmslot (#Mapping Images-VMSlots); preferred means the available images that can go on the vmslot.

7. Fetch which of those images are already loaded (#VMSlot) and which are not loaded (the difference of the preferred and the loaded images).

8. Get the most popular, not loaded image in the time frame.

9. Something went wrong, no next image selected? Yes: fetch the next image information and reload the next image of the vmslot (#VMSlot.NextImageID). No: the next image is chosen.

Figure 3.10 Simplified predictive loading module for Level_1 algorithm flow chart
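
The usage calculation and the threshold table of the Level_1 flow chart translate directly into code. The sketch below is an illustration only: the class PredictiveTimeframeSketch and its methods are hypothetical, "2 months" is approximated as 60 days, and returning zero usage when no vmslot is online is my own assumption.

```java
/** Illustrative sketch of the Level_1 time frame selection; names are hypothetical. */
final class PredictiveTimeframeSketch {

    /** usage = notavail / online, with notavail = online - available. */
    static double usage(int online, int available) {
        if (online <= 0) {
            return 0.0; // assumption: no online vmslot means no measurable load
        }
        return (online - available) / (double) online;
    }

    /** Look-back window in days for a usage given as a fraction (0.0 to 1.0). */
    static int timeframeDays(double usage) {
        if (usage > 0.40) return 1;
        if (usage > 0.35) return 2;
        if (usage > 0.30) return 3;
        if (usage > 0.25) return 4;
        if (usage > 0.20) return 5;
        if (usage > 0.15) return 10;
        if (usage > 0.10) return 20;
        if (usage > 0.05) return 30;
        return 60; // "2 months", approximated as 60 days (assumption)
    }
}
```

The higher the current usage, the shorter the history that is considered, so a busy system reacts to recent demand while an idle one relies on long-term popularity.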

3.4 Comparing Apache VCL and CloudSim

The aim of extending CloudSim is to support the simplified model of Apache VCL discussed in chapter 3.3. The model represents the key functionality of VCL that the simulator has to reproduce. CloudSim is able to do so after some modifications to its framework.

3.4.1 User

In CloudSim the DataCenterBroker entity acts on behalf of a user and can

submit requests. Brokers are created by calling createNewBroker_MN() described in

chapter 3.6.2.1.

3.4.2 Request

The VCL request entity has two important parameters: the start time and the end time of the image usage. It is also important that the user requests an image, so the requested image belongs to the user entity. In CloudSim there is no option to set a start time for a requested cloudlet (VCL image = CloudSim cloudlet + CloudSim VM); only the length of the cloudlet can be set, in million instructions (MI). However, the creation of a CloudSim broker can be postponed in simulation clock time, so the problem can be solved by converting the requested start time into a clock-time delay and the requested running time into MI for the cloudlet length. These conversions are done by the functions calcDelay_calendarToMs_MN() and calcCloudletLength_MN() described in chapter 3.6.2.1. More information about the CloudSim cloudlet and the CloudSim VM can be found in chapters 3.6.1.1 and 3.6.1.5.

Another missing feature is that CloudSim terminates cloudlet requests if they are not served by a VM in any data center at the first attempt. This behavior was changed by calling recreateVmsInDatacenter_MN(), described in chapter 3.6.1.3.

3.4.3 Image

The image entity has the following parameters: name, virtual CPU, memory size, disk size, network speed, maximum concurrent usage and estimated reload time. Users own the requests. Images are in fact virtual machines that have to be started on one of the hosts when a user submits a request for an image reservation. In VCL an image cannot be launched directly on a host, because an appropriate virtual machine slot has to be found first. CloudSim has no entity called image, only virtual machines. However, users cannot request virtual machines in CloudSim; they can only request cloudlets. By default cloudlets have a length in MI, a file size and an output size, so they define no resource and request time parameters as images do. Therefore the image entity is modeled by a combination of a cloudlet and a virtual machine. This modeling is done in the function createNewBroker_MN() described in chapter 3.6.2.1. More information about the CloudSim cloudlet and the CloudSim VM can be found in chapters 3.6.1.1 and 3.6.1.5.

3.4.4 VMSlot

VMSlots are used in VCL to divide physical hosts into small logical resource units. These units run the users’ image requests. For example, a VMSlot with 1 virtual CPU, 1024 MB RAM, 10 GB disk space and 100 Mbps bandwidth can execute an image whose resource needs are lower than the resources of the VMSlot. VCL places the images in VMSlots so that the unused resources of the VMSlots are minimized. CloudSim has no virtual machine slots by default; images can be created on any host.

3.4.5 UserGroup

CloudSim has no user groups, and it was not extended to support them.

3.4.6 VMSlotGroup

Like user groups, virtual machine slot groups are not supported by CloudSim and were not implemented in it.

3.4.7 Reservation

The VCL reservation entity stores the accepted requests submitted by the users. In CloudSim all user requests are submitted, as I assume that no user breaks the rules of requesting images.

3.4.8 VMHost

The CloudSim host entity corresponds directly to the virtual machine host in VCL. Hosts are created in the function createHost_MN() described in chapter 3.6.2.1. A host entity has an ID, a CPU core count, a CPU speed (MIPS value), a memory size, a disk size, a network speed, a server room cost per month, a power cost per month, an operational cost per month and an investment cost.

3.4.9 ManagementNode

The VCL management node is an entity that handles virtual machine hosts. In CloudSim this functionality is represented by data centers. Data centers are created in the function createDatacenter_MN() described in chapter 3.6.2.1 and have the following parameters: name, architecture, operating system, virtual machine monitor, time zone, cost of an image per hour and cost per network usage.

3.4.10 Simplified allocation by normal reservation

This reservation process is shown in Figure 3.7. In CloudSim normal reservations are handled almost the same way as in VCL; only the checking steps (overlapping reservations, maximum concurrent usage of the image, vmslots that can run the specific image) are skipped. A CloudSim broker corresponds to a VCL user. The reservation is placed on a host that can satisfy the resource needs of the image. It is possible to create several data centers; in this case the simulator tries to create the virtual machine of the reservation in the data centers in the order of their creation at the beginning of the simulation (the same order in which they were added to the input Excel file).

3.4.11 Simplified block allocation

This allocation process is shown in Figure 3.8. In CloudSim a block allocation is modeled the same way as a normal reservation, except that the input parameter "Number of images requested (block reservation)" has to be set for the requested reservation. If an image request cannot be fulfilled because no host has enough resource capacity, the simulator retries the creation of the image later, until one of the hosts has enough capacity to handle the reservation (that is, to create the virtual machine).

3.4.12 Simplified predictive loading modules

These modules are described in chapter 3.3.3. In the Level_0 predictive loading module, the image that the administrator set on the VCL admin website is pre-started in the virtual machine slot. The Level_1 algorithm preloads into the vmslot the most popular image that is not yet loaded, measured over a specific time frame (from 1 day up to 2 months, depending on the current usage, see Figure 3.10). Because CloudSim has no vmslot entity, the loading modules cannot be modeled directly. Instead, the virtual machine creation time can be delayed to obtain similar behavior. By default the simulator assumes no delay for any VM creation; in VCL terminology this means that each vmslot is preloaded with the right image. In the advanced solution, each image has its own delay value, and every time that image is created, the creation is delayed by that value.

3.5 Additional conceptual features

After the comparison of Apache VCL and CloudSim, some system-level features are still missing from CloudSim, and they are missing from Apache VCL as well: cost modeling, system utilization, future request generation and service performance modeling for hybrid cloud setups. All of these features provide important information when simulating cloud infrastructures.

3.5.1 Cost modeling

In this thesis I treat all costs (the total cost of ownership, TCO) of a cloud infrastructure as outgoing cash flows, and I follow the basics of the time value of money theory to appraise a private cloud as an investment project. Therefore net present value (NPV) is used to calculate the value of the investment projects. It compares the present value of money today with the present value of money in the future, taking returns into account. Because the project has no incoming cash flows, only outgoing ones, I use the outgoing values as positive values to make the visualizations easier. [21] NPV is given by the period t, the cash flow C_t at period t, the total number of periods N and the opportunity cost of money r:

NPV = \sum_{t=0}^{N} \frac{C_t}{(1 + r)^t}

This NPV calculation is done for the hosts of private data centers on a monthly basis by the method getHostRunningCostMonthBasedNPV_MN() described in chapter 3.6.1.4. For the public cloud environment the pricing policy follows Amazon’s business model, so the cost of an instance is charged hourly, as described in chapter 3.6.1.1.

3.5.2 Data center utilization

After the simulation terminates, the utilization of the private data center is an important parameter for judging whether the capacity of the private infrastructure is sufficient to serve user requests reliably in the future. The simulator was extended to support this requirement: data center utilization is calculated as the average CPU and RAM utilization of the data center’s hosts. This process is described in more detail in chapter 3.6.2.1.

3.5.3 Future request generation

To generate thousands of future requests I developed my own algorithm. It takes input parameters from the user such as lecture names, peak dates of the lectures, and the mean and standard deviation of the average reservation lengths and of the peaks. Different peak dates and images can be added for each peak. The algorithm uses a normal distribution to place requests randomly in time, with the peak date as the mean and a standard deviation given in hours. The reservation lengths are also drawn from a normal distribution, with the average reservation length as the mean and its standard deviation given in hours. The start dates of similar lectures are shifted randomly relative to each other (using a uniform distribution). The shifting is done so that no start or end date falls outside the time window of the current semester (autumn or spring). The algorithm is part of the method createNewWorkloads_MN(), which is described in chapter 3.6.2.1.
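
The normal-distribution sampling described above can be illustrated with a small sketch. It is not the code of createNewWorkloads_MN(): the class RequestGeneratorSketch, the Request type, the representation of times as hours since the start of the semester, the minimum length of 0.25 hours and the clamping of start times into the semester window are all assumptions made for this example. The uniform shifting of similar lectures is not covered here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative sketch of request generation around one peak; names are hypothetical. */
final class RequestGeneratorSketch {

    /** A generated reservation; times are hours since the start of the semester. */
    static final class Request {
        final double startHour;
        final double lengthHours;

        Request(double startHour, double lengthHours) {
            this.startHour = startHour;
            this.lengthHours = lengthHours;
        }
    }

    static List<Request> generate(int count, double peakHour, double peakSdHours,
                                  double meanLengthHours, double lengthSdHours,
                                  double semesterHours, long seed) {
        Random rnd = new Random(seed); // fixed seed makes the run repeatable
        List<Request> requests = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Reservation length: normal distribution around the mean length,
            // kept positive and not longer than the semester (assumption).
            double length = meanLengthHours + lengthSdHours * rnd.nextGaussian();
            length = Math.min(Math.max(0.25, length), semesterHours);
            // Start time: normal distribution around the peak date, clamped so
            // that the reservation starts and ends inside the semester window.
            double start = peakHour + peakSdHours * rnd.nextGaussian();
            start = Math.max(0.0, Math.min(start, semesterHours - length));
            requests.add(new Request(start, length));
        }
        return requests;
    }
}
```

For example, generate(1000, 500.0, 24.0, 2.0, 0.5, 2000.0, 42L) would produce 1000 reservations of about two hours each, concentrated around hour 500 of a 2000-hour semester.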

3.5.4 Service performance

The performance of a cloud computing system is an important part of the result. CloudSim does not provide rich output after a simulation, so it prints no service performance figures. That is why I extended the simulator with this feature.

This is done by the method calcSzolgaltatasbiztonsag_MN() described in chapter 3.6.2.1. The output file called “ServicePerformance*.txt” summarizes all important information about the simulated requests: request IDs; user names; user IDs; image names; data center IDs; data center names; host IDs; VM IDs; requested start, end and running times; simulated actual start, end and running times; and wait times of the requests.

3.6 Improvements in CloudSim

To map Apache VCL onto CloudSim better than the unmodified CloudSim allows, I modified several built-in CloudSim core Java classes, created some new Java classes and added three libraries.

3.6.1 Modified Java classes

While mapping the key functions of VCL onto CloudSim, the following CloudSim Java classes were modified:

Cloudlet.java

Datacenter.java

DatacenterBroker.java

Host.java

VM.java

The code license of CloudSim is the General Public License (GPL), which requires the modified source code to be made available, so it was uploaded and made public. [22]

3.6.1.1 Cloudlet.java

The modified Cloudlet Java class extends the original cloudlet. By default CloudSim provides the function getCostPerSec(), which returns the cost associated with running a cloudlet in the data center. The cost value is set when the data center’s characteristics are created, so the cost of all cloudlets in a specific data center is calculated with the same cost-per-second value. This built-in function does not fit my needs, because I need hourly costs and different running costs for different cloudlet types (images). To solve both problems I created a new function called getCloudletCostAmazonHourBased_MN(), which takes costPerHour as an input parameter. The pricing policy follows Amazon’s business model, so the cost of an instance is charged hourly.
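
An hourly charging rule of this kind can be sketched as follows. This is not the code of getCloudletCostAmazonHourBased_MN(): the class HourlyCostSketch and the method costHourBased() are hypothetical, and billing every started hour in full is an assumption about how "charged hourly" is applied.

```java
/** Illustrative sketch of hour-based cloudlet pricing; names are hypothetical. */
final class HourlyCostSketch {

    /**
     * Cost of one cloudlet (image) run.
     *
     * @param runningTimeSec simulated running time of the cloudlet in seconds
     * @param costPerHour    hourly price of this image type
     */
    static double costHourBased(double runningTimeSec, double costPerHour) {
        if (runningTimeSec <= 0) {
            return 0.0;
        }
        // Assumption: every started hour is billed as a full hour.
        long billedHours = (long) Math.ceil(runningTimeSec / 3600.0);
        return billedHours * costPerHour;
    }
}
```

Passing costPerHour per call, rather than reading one value from the data center, is what allows different image types to have different prices.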

3.6.1.2 Datacenter.java

The Datacenter Java class processes VM-related queries (e.g. handling VMs) rather than cloudlet-related queries. The constructor had to be modified to set the creation time of the hosts, which is done by calling the host’s setTimeCreated_MN() function. Similarly, the destroy time of the host is set in the shutdownEntity() function by calling the host’s setTimeDestroyed_MN() function.

3.6.1.3 DatacenterBroker.java

The DatacenterBroker Java class represents a broker acting on behalf of a user. It hides VM management, such as VM creation, submission of cloudlets to these VMs and destruction of VMs. The authors of CloudSim encourage developers to write their own broker policies to submit VMs and cloudlets according to the specific rules of the simulated scenario. I took their advice and modified the original broker in several places.

I have created three new attributes:

scheduleInMs: By default brokers are created at the start of the simulation, but I needed brokers that start at different times. To make this possible I used my own calculation method to postpone the creation of the broker. This delay, in seconds, is stored in the scheduleInMs variable.

startTime: The requested start time of a reservation given by the user.

endTime: The requested end time of a reservation given by the user.

A new constructor was necessary to create broker entities with the variables listed above. It extends the input parameters with the three new attributes.

The original broker entity finishes its execution if it cannot create a requested VM in any data center. This does not suit my purpose, so the method processVmCreate() was modified to retry the VM creation later. It calls recreateVmsInDatacenter_MN(), which sends a new VM creation message to the data center, and the data center tries to create the VM again 4 minutes later. The cycle repeats until the VM is created in one of the data centers.
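
The effect of the 4-minute retry cycle on the start time of a request can be illustrated outside CloudSim. The sketch below is an assumption-based model, not the broker code: the class VmRetrySketch is hypothetical, the availability of capacity is passed in as a predicate on simulation time, and the maxAttempts guard is added only so that the example always terminates.

```java
import java.util.function.DoublePredicate;

/** Illustrative sketch of the VM creation retry cycle; names are hypothetical. */
final class VmRetrySketch {

    /** Retry interval stated in the text: 4 minutes, in seconds. */
    static final double RETRY_INTERVAL_SEC = 4 * 60;

    /**
     * Returns the simulation time of the first successful creation attempt,
     * or -1 if no attempt succeeds within maxAttempts (guard for this sketch).
     */
    static double firstSuccessfulAttempt(double requestTimeSec,
                                         DoublePredicate capacityAvailableAt,
                                         int maxAttempts) {
        double time = requestTimeSec;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (capacityAvailableAt.test(time)) {
                return time; // VM created at this attempt
            }
            time += RETRY_INTERVAL_SEC; // try again 4 minutes later
        }
        return -1;
    }
}
```

If capacity becomes free 600 seconds after the request, the attempts at 0, 240 and 480 seconds fail and the VM is created at 720 seconds, so the request waits up to one retry interval longer than strictly necessary.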

CloudSim’s basic broker does not deal with reload times. VMs are created immediately (more precisely, within 1 second) in one of the data centers if a data center has enough resources. To handle this I modified the createVmsInDatacenter_MN() method, which creates the virtual machines in a data center, so that it uses a postponed VM creation message. The postponement equals the reload time of the specific image, which the user sets at the beginning of the simulation. Each image has its own average reload time, and that value is used for the VM creation.

DatacenterBroker extends the CloudSim SimEntity class, which has some methods to be overridden. One of these methods is startEntity(), which is responsible for starting the entity up. To schedule the broker entity to start later, as happens in VCL, the scheduleInMs value is used.

3.6.1.4 Host.java

The Host Java class executes actions related to the management of virtual machines. A host has a defined policy for provisioning memory and bandwidth, as well as an allocation policy for assigning processing elements to virtual machines. A host is associated with a data center and can host virtual machines.

To model the TCO of private clouds, this class had to be modified, because CloudSim does not provide any solution for that.

New attributes and their roles:

timeCreated: Time when the host was created, expressed in CloudSim simulation time in seconds.

timeDestroyed: Time when the host was destroyed, expressed in CloudSim simulation time in seconds.

operationalCostMonth: Host’s average monthly operational room cost,

used to calculate NPV.

investmentCost: Host’s initial, one time investment cost, used to

calculate NPV.

backup: Boolean value to identify whether the host is in normal use or

for backing the system up.

runningTimeSec: Running time of the host during simulation in seconds.

datacenterName: Name of host’s data center.

The NPV calculation mentioned in chapter 3.5.1 is done for every host in the method getHostRunningCostMonthBasedNPV_MN(). If the host is marked as backup, the NPV equals the investment cost of the host. Otherwise the NPV is calculated with the month as the period, the simulation interval in months as the total number of periods, the operational costs as the monthly cash flow and the opportunity cost given by the user.
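
A monthly NPV calculation of this kind can be sketched as follows. This is not the code of getHostRunningCostMonthBasedNPV_MN(): the class HostNpvSketch and the method npv() are hypothetical, the opportunity cost is assumed to be given as a monthly rate, and counting the investment cost as an undiscounted cash flow at period 0 for non-backup hosts is my assumption.

```java
/** Illustrative sketch of a month-based host NPV; names are hypothetical. */
final class HostNpvSketch {

    /**
     * @param investmentCost       one-time investment cost of the host
     * @param operationalCostMonth monthly operational cost (cash flow per period)
     * @param months               simulation interval in months (number of periods)
     * @param monthlyRate          opportunity cost per month, e.g. 0.01 for 1%
     * @param backup               true if the host only backs the system up
     */
    static double npv(double investmentCost, double operationalCostMonth,
                      int months, double monthlyRate, boolean backup) {
        if (backup) {
            // Rule from the text: a backup host costs only its investment.
            return investmentCost;
        }
        // Assumption: the investment is paid at period 0 without discounting.
        double total = investmentCost;
        for (int t = 1; t <= months; t++) {
            // Discount the operational cost of month t to its present value.
            total += operationalCostMonth / Math.pow(1.0 + monthlyRate, t);
        }
        return total;
    }
}
```

Because all cash flows are outgoing and treated as positive values, a larger result means a more expensive host over the simulated interval.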

Setter and getter methods were also created for my own attributes, such as timeCreated, timeDestroyed, operationalCostMonth, investmentCost and backup. The setDatacenter() function was extended to set the name of the data center as well.

3.6.1.5 VM.java

The VM Java class represents a VM: it runs inside a host and processes cloudlets. Only one change was applied to this class: a reloadTimeMin attribute was added, which the data center broker uses to simulate the reload time of an image as in VCL.

3.6.2 New Java classes

The following classes were created by me in order to map VCL functionalities:

CloudSim_jar.java

DcHoUtil.java

DCinput.java

Image.java

Szolgbiz.java

Workload.java

3.6.2.1 CloudSim_jar.java

The CloudSim_jar Java class contains the static main() function. In addition, I created some other static methods to support the VCL mapping.

The main() function starts the Java application and is responsible for using the appropriate Java classes and calling their methods during the simulation. First, it reads the input Excel file. Then the brokers are created by calling createNewBroker_MN() with the parameters from the input file. Hosts are created by the createHost_MN() method according to the details in the input file, and data centers are created the same way by calling createDatacenter_MN(). After the input file has been read, the workbook is closed and the simulation is started by calling startSimulation(). The run time of the simulation depends on the length of the simulated period and on the total number of requests, so it can vary from minutes to hours. Before the output files are generated, the service performance metrics are calculated in the method calcSzolgaltatasbiztonsag_MN() and the host utilization percentages in calcHostsUtil_MN(). Finally, the output files are written by the methods printDatacenterCostsAndUtils_MN() and printSzolgbizList_MN().

Short summary of the output files:

Request*.txt: Stores the submitted requests.

Summary*.xls: Summarizes the costs of the data centers and some other useful information about wait times and served requests.

ServicePerformance*.txt: Contains the list of submitted requests and the simulation parameters, such as the simulated start time of the request or the simulated wait time.

The Java method createNewWorkloads_MN() is used to multiply the load of a specific lecture. The user gives metrics for a lecture, such as the mean and standard deviation of the average reservation length, and based on these input parameters the given number of similar lectures is created. Only the start time parameters differ: they are shifted within the time frame of the semester (autumn or spring) using a uniform distribution. With this solution the user does not have to type in thousands of future reservations to simulate the requests of similar lectures.

The method calcHostsUtil_MN() calculates the CPU and RAM utilization of the simulated hosts over the simulated time frame. It goes through the simulated cloudlets and adds each cloudlet’s running time to its host. The summed cloudlet running time of each host is then divided by the total running time of the host, so every host has its own utilization value. Because I was interested in data center utilization, I averaged the utilization values of the hosts of each data center, which gives the data center CPU and RAM utilization.
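
The time-ratio calculation described above can be sketched as follows. This is not the code of calcHostsUtil_MN(): the class UtilizationSketch, the CloudletRun type and the map-based layout are hypothetical, and the sketch computes a single time-based ratio per host without weighting by the CPU or RAM share of each cloudlet, which the actual method may do.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of host and data center utilization; names are hypothetical. */
final class UtilizationSketch {

    /** Running time of one simulated cloudlet on a given host. */
    static final class CloudletRun {
        final int hostId;
        final double runningTimeSec;

        CloudletRun(int hostId, double runningTimeSec) {
            this.hostId = hostId;
            this.runningTimeSec = runningTimeSec;
        }
    }

    /** Utilization per host: summed cloudlet running time / host running time. */
    static Map<Integer, Double> hostUtilization(List<CloudletRun> runs,
                                                Map<Integer, Double> hostRunningTimeSec) {
        Map<Integer, Double> busy = new HashMap<>();
        for (CloudletRun run : runs) {
            busy.merge(run.hostId, run.runningTimeSec, Double::sum);
        }
        Map<Integer, Double> utilization = new HashMap<>();
        for (Map.Entry<Integer, Double> host : hostRunningTimeSec.entrySet()) {
            double total = host.getValue();
            double used = busy.getOrDefault(host.getKey(), 0.0);
            utilization.put(host.getKey(), total > 0 ? used / total : 0.0);
        }
        return utilization;
    }

    /** Data center utilization: plain average of its hosts' utilization values. */
    static double datacenterUtilization(Map<Integer, Double> hostUtilization) {
        if (hostUtilization.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (double u : hostUtilization.values()) {
            sum += u;
        }
        return sum / hostUtilization.size();
    }
}
```

Note that the average is unweighted, so a small host counts as much as a large one in the data center value.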

The method printDatacenterCostsAndUtils_MN() was created to output the end results of the simulation. It writes the “Summary*.xls” document to the path from which the user starts the simulation JAR file. For further analysis of the system’s performance, the “ServicePerformance*.txt” file is created after the simulation has finished. This file contains detailed information about each image request and is written by the method printSzolgbizList_MN(). The content of the output files will be explained later.

The method createHost_MN() was developed to make host generation easier. The number of hosts and their other parameters are given by the user in the input Excel file. To simulate the VCL environment correctly, the built-in simple provisioning policy was used for RAM and bandwidth provisioning, and the space-shared scheduling algorithm was used for VM scheduling. The space-shared algorithm is described in chapter 3.2.1.1. The host’s investment cost and monthly operational cost are also set here. The method createDatacenter_MN() creates a CloudSim data center entity with the data center characteristics given by the user and the built-in simple VM allocation policy.

By default, CloudSim does not deal with scheduled cloudlet processing. There is only time zero, and all cloudlet requests are processed at the same time. If any of the cloudlet requests cannot be served because of a lack of compute resources, then those requests are destroyed and flagged as unsuccessful. In order to support future reservations as VCL does, I modified the broker creation process in CloudSim. This modification is implemented in the method createNewBroker_MN(). The user specifies the start time of the simulation and the start time of the image requests in the input file. The difference between the two dates is used as a delay time to start the broker later. The delay time is calculated by the method calcDelay_calendarToMs_MN(). Another missing feature of CloudSim is that it does not deal with the start and end time of a cloudlet request; it defines only the length of a cloudlet, given in MI (million instructions). The cloudlet length can be calculated from the user's input parameters for the image requests, which happens in the method calcCloudletLength_MN(). Calculating the cloudlet length in this way was necessary to avoid changing the CloudSim core source code fundamentally.
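The two conversions described above can be illustrated with a short sketch. It is not the code of calcDelay_calendarToMs_MN() or calcCloudletLength_MN(): the class and method names are hypothetical, java.time (Java 8 or later) is used although the thesis project was built with Java 7, and the rule "length in MI = reservation length in seconds times VM speed in MIPS" is an assumption, since the exact formula is not given here.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Illustrative sketch only; names and the MI conversion rule are assumptions. */
class ReservationTimingSketch {
    // Input format "yyyy.mm.dd HH:mm:ss"; in a Java pattern the month is "MM".
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy.MM.dd HH:mm:ss");

    /** Delay between simulation start and requested start, in milliseconds. */
    static long delayMs(String simulationStart, String requestedStart) {
        LocalDateTime sim = LocalDateTime.parse(simulationStart, FMT);
        LocalDateTime req = LocalDateTime.parse(requestedStart, FMT);
        return Duration.between(sim, req).toMillis();
    }

    /** ASSUMPTION: length in MI = reservation length in seconds * VM speed in MIPS. */
    static long cloudletLengthMi(String requestedStart, String requestedEnd, long vmMips) {
        long seconds = delayMs(requestedStart, requestedEnd) / 1000L;
        return seconds * vmMips;
    }
}
```

For a simulation starting at 2013.09.01 00:00:00 and a request starting at 2013.09.18 15:54:23, the broker delay is 1 526 063 000 ms. A reservation from 15:54:23 to 16:01:59 lasts 456 seconds, which under the assumed rule corresponds to 456 000 MI on a 1000 MIPS virtual machine.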

Before the costs of the hosts belonging to a data center can be added up, the cost of each single host has to be defined. This is done by calling the method getHostRunningCostMonthBasedNPV_MN() of the hosts. At the end of the simulation the user needs only one cost value for the data center. The method getDatacenterHostsCost_MN() sums the costs of the hosts of a given data center using NPV calculation with the user-given opportunity cost. This part is very important for the thesis because it shows whether building an own private cloud data center or using a public cloud service is the better choice, and so supports the decision making.

Similarly to the data center cost, I created the method getDatacenterCloudletsCost_MN() to calculate the data center's image running costs based on the hourly cost of the images. This cost is the amount the user would have to pay for those images if they were executed in a public, pay-as-you-go cloud service. The usage fees of the images are accounted monthly, and the monthly image usage costs are discounted to time zero with the user-given opportunity cost. If the user would like to decide between the two projects (building an own private cloud or using a public cloud), the project with the lower NPV must be chosen.
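The discounting of monthly costs to time zero can be sketched as below. The sketch is not the code of getHostRunningCostMonthBasedNPV_MN() or getDatacenterCloudletsCost_MN(); the names are hypothetical. Two points are assumptions: the annual opportunity cost is converted to an equivalent monthly compound rate, and the cost of month m is discounted over m months (payment at the end of the month).

```java
/** Illustrative sketch only; names and the discounting convention are assumptions. */
class NpvComparisonSketch {

    /** ASSUMPTION: monthly rate equivalent to the annual rate under compounding. */
    static double monthlyRate(double annualRate) {
        return Math.pow(1.0 + annualRate, 1.0 / 12.0) - 1.0;
    }

    /** Investment at time zero plus monthly costs discounted to time zero. */
    static double npvOfCosts(double investment, double[] monthlyCosts, double annualRate) {
        double r = monthlyRate(annualRate);
        double npv = investment;
        for (int m = 0; m < monthlyCosts.length; m++) {
            // cost of month (m + 1) is assumed to be paid at the end of that month
            npv += monthlyCosts[m] / Math.pow(1.0 + r, m + 1);
        }
        return npv;
    }

    /** Both values are costs, so the project with the lower NPV is preferred. */
    static boolean privateIsCheaper(double npvPrivate, double npvPublic) {
        return npvPrivate < npvPublic;
    }
}
```

For the private data center the investment is the CapEx and the monthly costs are the OpEx; for the public cloud the investment is zero and the monthly costs are the accumulated image usage fees.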

The performance of the cloud computing system is also an important part of the result. The method calcSzolgaltatasbiztonsag_MN() creates the simulation report for each request. It calculates the wait time of the request and collects every meaningful piece of information about it: request ID, user name, user ID, image name, data center ID, data center name, host ID, VM ID, requested start, end and running time, simulated actual start, end and running time, and the wait time of the request. This information is stored in the “ServicePerformance*.txt” file after the simulation has finished.
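How the wait time and the running time follow from the time stamps of a request can be sketched as below. The class and method names are hypothetical and java.time (Java 8 or later) is assumed; the sketch is not the code of calcSzolgaltatasbiztonsag_MN().

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Illustrative sketch only; names are assumptions. */
class WaitTimeSketch {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy.MM.dd HH:mm:ss");

    /** Wait time: simulated actual start minus requested start, in whole minutes. */
    static long waitMinutes(String requestedStart, String actualStart) {
        LocalDateTime req = LocalDateTime.parse(requestedStart, FMT);
        LocalDateTime act = LocalDateTime.parse(actualStart, FMT);
        return Duration.between(req, act).toMinutes();
    }

    /** Running time between two time stamps, in hours. */
    static double runningHours(String start, String end) {
        LocalDateTime s = LocalDateTime.parse(start, FMT);
        LocalDateTime e = LocalDateTime.parse(end, FMT);
        return Duration.between(s, e).getSeconds() / 3600.0;
    }
}
```

Applied to the example row of Table 3.2 (requested start 2013.11.02 08:28:14, actual start 2013.11.02 08:36:14, requested end 2013.11.02 11:22:55), this gives a wait time of 8 minutes and a requested running time of about 2.91 hours.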


The virtual machines are created in the method createVM_MN(). Every virtual machine uses the time-shared cloudlet scheduler built into CloudSim. Each virtual machine has its own reload time, which is given by the user for each image.

CloudSim uses cloudlets to simulate requests. A cloudlet has an ID, a length in MI, a number of processing elements (PEs), a file size, an output size and three utilization models. The IDs are the same as the IDs of the VMs. The length is calculated by calcCloudletLength_MN(). The number of PEs is equal to the number of virtual CPUs of the image. The file size and the output size are the amounts of data, in MB, moved in and out over the network. VCL uses the Remote Desktop Protocol (RDP) or SSH, therefore only little data is transferred over the network. Because this traffic is below 1 GB per month and Amazon charges only for traffic above 1 GB/month, I did not take these costs into account. For the utilization model the built-in UtilizationModelFull() was used. All of this is done in createCloudlet_MN().

3.6.2.2 DcHoUtil.java

The DcHoUtil Java class was created to store all relevant information about a host: ID, data center name, and CPU and RAM utilization during the lifetime of the host. Calling setUtilization() calculates the utilization of the host.

3.6.2.3 DCinput.java

The DCinput Java class stores information about a data center: name, type (private or public), number of hosts, monthly cost and investment cost. These values are given by the user when starting the simulation.

3.6.2.4 Image.java

The Image Java class represents an Apache VCL image entity in the simulation environment. Its attributes are the following: name, input and output size, storage and RAM size, speed of the CPU cores, network speed, number of virtual CPUs, virtual machine monitor name, hourly running cost of the image in the public cloud and average reload time of the image.

3.6.2.5 Szolgbiz.java

The Java class called Szolgbiz encapsulates the most important data of a simulated reservation. The entity contains the following values related to a simulated request: identifier of the cloudlet/request, identifier and name of the broker/user, name of the requested image, identifier and name of the data center that executed the image, identifier of the host where the image was created, identifier of the virtual machine, requested start time, requested end time and requested length of the reservation, simulated start time, end time and length of the reservation, and the time the user waited to get access to the requested image. To sum up the results of the simulation, these values are written to the output text file.

3.6.2.6 Workload.java

The Workload Java class models the basic information of a reservation, such as the name of the broker, the start and end date of the request, the image name and the number of requests (for block allocation).

3.6.3 Used Java libraries

Besides creating or modifying Java classes three libraries were added to the Java

project:

flanagan.jar

commons-math3-3.2.jar

jxl.jar

3.6.3.1 Flanagan.jar

The Flanagan library is Michael Thomas Flanagan's Java Scientific Library, which is used in several places by the CloudSim framework. [23]

3.6.3.2 Commons-math3-3.2.jar

Commons Math is a library of mathematics and statistics components addressing the most common problems not available in the Java programming language or Commons Lang. This library has to be added to the simulation project because CloudSim uses some of its components. [24]

3.6.3.3 Jxl.jar

The Java Excel API is an open source Java library that was used to read the Excel spreadsheet containing the input parameters and to write the simulation results dynamically into an Excel spreadsheet. The Java Excel API provides a mechanism that allows Java applications to read in a spreadsheet, modify some cells and write out the new spreadsheet. [25]

3.7 How to use the simulator

This part of the documentation explains everything needed to use the simulation environment I created in the frame of this master's thesis. First, I go through the input files so that users understand the basic inputs of the simulator. After that, hints for starting the Java simulation application follow. Finally, the output files are covered.

3.7.1 Providing input parameters

To start the simulator, an input Excel file is required. This file must be in the Microsoft Excel Binary File Format (.XLS); the newer .XLSX format is not supported by the jxl.jar library. The layout of the template Excel file cannot be changed because the simulator reads the cells in a predefined order. To protect the structure of the template, the input file is protected and the user is allowed to modify only the cells with a green or yellow background color. The yellow color means that data validation is set for those cells. The red cells cannot be modified because they hold computed values, and if they are not calculated correctly the simulator will not start. The names of the worksheets cannot be protected, but they should not be changed because the simulator will not work with other sheet names.

First of all, the input worksheet has to be filled in. The simulation start date has to be defined in the format yyyy.mm.dd HH:mm:ss, as shown in Figure 3.11.

Simulation start date: | 2013.09.01 00:00:00

Figure 3.11 Simulator input parameter for simulation start date

Different images can be defined, as in Apache VCL. One image consists of the parameters shown in Figure 3.12.

Name of the image: | BSc AUTUMN | MSc 1 AUTUMN
Size of input files to be moved via network, MB: | 50 | 50
Size of output files to be moved via network, MB: | 50 | 50
RAM size, MB: | 3072 | 1024
Number of vCPUs: | 2 | 1
Hourly cost executing this image in public datecenter, $: | 0,35 | 0,12
Image reload time, min: | 8 | 4

Figure 3.12 Simulator input parameters for images

Then the data centers must be defined as shown in Figure 3.13. The investment cost of a data center is all of its capital expenditure (CapEx), while the operational cost of a data center is all of its operating expenditure (OpEx) per month.

Name of datacenter: | Private | Amazon
Type: | PRIVATE | PUBLIC
Number of hosts: | 9 | 1
Investment cost of datacenter, $: | 34666,67 | 0,00
Monthly cost of datacenter, $: | 1580,00 | 0,00

Figure 3.13 Simulator input parameters for data centers

After the data centers have been defined, the physical hosts should be given by the user as shown in Figure 3.14. The backup host flag means that only the CapEx of the host is added to the data center result and discounted. More than one host can be added to the same data center. To simulate an infinite data center, the physical hosts have to be created using huge numbers (for example the maximum value of an integer) for the CPU cores and RAM parameters.

Name of datacenter: | Private | Private | Amazon
Number of CPU cores: | 8 | 8 | 5000
Size of RAM, MB: | 32768 | 32768 | 2147483647
Backup host: | FALSE | FALSE | FALSE

Figure 3.14 Simulator input parameters for physical host servers

To calculate the opportunity cost needed to apply NPV theory when evaluating a private or public cloud strategy, the risk-free rate, the beta and the expected market return have to be given as shown in Figure 3.15.

Risk free rate of intereset, %: | 2,00%
ß - beta of the investment: | 1,50
Expected return of the market, %: | 8,00%

Figure 3.15 Simulator input parameters for opportunity cost calculation

The service performance (SLA) parameter of the simulation is the accepted maximum response time for a request. This value is defined in minutes as shown in Figure 3.16.

Accepted maximum response time, min: | 10

Figure 3.16 Simulator input parameter for service performance parameter

Generating a high number of requests for lectures is very easy with the request generation algorithm already described in chapter 3.5.3. The algorithm requires some input parameters, which can be filled in on the input worksheet of the Excel input file as depicted in Figure 3.18. Firstly, the name of the lecture, the number of students, the semester and the number of similar lectures have to be defined. After that, up to 5 peaks can be used to model the load of a lecture during a semester. For each peak a different peak date and image can be given. The load generator uses a normal distribution to generate the start times of future requests, with the peak date as mean and a standard deviation defined in hours. The reservation lengths are also drawn from a normal distribution, with the average reservation length as mean and the standard deviation of the reservation length given in hours. The rate between students and requested images of a peak is a multiplier which defines how many images should be created for that peak if a given number of students take the lecture.

If the user does not want the simulator to generate requests randomly, requests can be added one by one on the workload worksheet. The identifier of a request has to contain the name of the image with three or more “_” delimiters. Requested start dates must be later than the simulation start date, and the images must already have been added on the input worksheet, as shown in Figure 3.17. For VCL block allocation the values in the last column should be modified.

Request ID/Broker name | Requested start date | Requested end date | Name of the image | Number of images requested (for block allocation)
BSc_autumn_2013_1 | 2013.09.18 15:54:23 | 2013.09.18 16:01:59 | BSc AUTUMN | 1
MSc_autumn_2013_1 | 2013.09.18 19:03:56 | 2013.09.18 19:09:40 | MSc 1 AUTUMN | 1
MSc_autumn_2013_2 | 2013.09.18 19:25:29 | 2013.09.19 03:29:57 | MSc 2 AUTUMN | 1
MSc_autumn_2013_3 | 2013.09.19 17:04:07 | 2013.09.19 18:14:45 | MSc 1 AUTUMN | 1
MSc_autumn_2013_4 | 2013.09.19 17:04:07 | 2013.09.19 18:14:45 | MSc 2 AUTUMN | 1

Figure 3.17 Simulator input parameters for one by one request defining

It is also possible to provide the requests in a text file instead of the Excel sheet. This feature is useful when the user would like to simulate more requests than the number of rows (65 536) handled by the Excel .XLS format. In this case, the text file should have exactly the same structure as the workload Excel sheet, but with tab delimiters and without the first label row.

Name of the lecture: | BSc_autumn_2013 | MSc_autumn_2013
Number of students: | 200 | 50
Semester: | AUTUMN | AUTUMN
Number of same lectures to be simulated: | 15 | 25
Image used at peak 1: | BSc AUTUMN | MSc 1 AUTUMN
Rate between students and requested images by peak 1: | 18,00 | 9,00
Mean of the requests' length in hour by peak 1: | 2,00 | 2,50
Standard deviation of the requests' length in hour by peak 1: | 1,00 | 1,50
Mean of peak 1, date: | 2013.10.27 15:00:00 | 2013.09.27 19:00:00
Standard deviation of the requests' time in hour by peak 1: | 120,00 | 96,00
Image used at peak 2: | BSc AUTUMN | MSc 1 AUTUMN
Rate between students and requested images by peak 2: | 11,00 | 36,00
Mean of the requests' length in hour by peak 2: | 2,00 | 2,50
Standard deviation of the requests' length in hour by peak 2: | 1,00 | 1,50
Mean of peak 2, date: | 2013.11.17 17:00:00 | 2013.10.17 19:00:00
Standard deviation of the requests' time in hour by peak 2: | 120,00 | 108,00
Image used at peak 3: | BSc AUTUMN | MSc 2 AUTUMN
Rate between students and requested images by peak 3: | 1,50 | 2,50
Mean of the requests' length in hour by peak 3: | 2,00 | 2,80
Standard deviation of the requests' length in hour by peak 3: | 1,00 | 1,00
Mean of peak 3, date: | 2013.12.16 17:00:00 | 2013.11.12 19:00:00
Standard deviation of the requests' time in hour by peak 3: | 24,00 | 48,00
Image used at peak 4: | | MSc 2 AUTUMN
Rate between students and requested images by peak 4: | | 2,00
Mean of the requests' length in hour by peak 4: | | 2,80
Standard deviation of the requests' length in hour by peak 4: | | 1,00
Mean of peak 4, date: | | 2013.11.27 19:00:00
Standard deviation of the requests' time in hour by peak 4: | | 36,00

Figure 3.18 Simulator input parameters for random request generator

3.7.2 Starting the simulation

After the input Excel file has been filled in, the simulation can be started. CloudSim is written in Java, and my own code was also written in Java. The Java project was built with NetBeans IDE 7.4 and Java version 1.7.0_51.

The dist folder contains the external libraries (commons-math3-3.2.jar, flanagan.jar and jxl.jar) and the cloudSim_jar.jar Java archive of the application. The application takes two mandatory arguments and one optional argument: 1. the name of the input file; 2. the suffix of the output files, which makes it possible to distinguish the output files of different simulations; 3. optionally, the name of the text file containing the requests. The application looks for the input file in the directory from which it was executed.

Example starting code on Windows platform via Command Prompt (CMD):

    cd C:\Users\exampleUser\inputFileFolder
    java -jar C:\Users\exampleUser\dist\cloudSim_jar.jar inputFile1.xls suffix1 [inputRequests1.txt]

Example starting code on Linux platform via Terminal:

    cd /home/exampleUser
    java -jar /home/exampleUser/dist/cloudSim_jar.jar inputFile1.xls suffix1 [inputRequests1.txt]

The simulator logs all important steps to the terminal during the simulation.
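The optional third argument points to the tab-delimited request file, which has the same five columns as the workload worksheet (request ID/broker name, requested start date, requested end date, image name, number of images requested). Reading one line of such a file can be sketched as below. The class and field names are hypothetical, java.time (Java 8 or later) is assumed, and the validation rules are only a guess at sensible checks, not the behaviour of the simulator.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Illustrative sketch only; names and validation rules are assumptions. */
class RequestLineSketch {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy.MM.dd HH:mm:ss");

    final String requestId;
    final LocalDateTime start;
    final LocalDateTime end;
    final String imageName;
    final int imageCount;

    private RequestLineSketch(String requestId, LocalDateTime start, LocalDateTime end,
                              String imageName, int imageCount) {
        this.requestId = requestId;
        this.start = start;
        this.end = end;
        this.imageName = imageName;
        this.imageCount = imageCount;
    }

    /** Parses one tab-delimited line; the file has no header row. */
    static RequestLineSketch parse(String line) {
        String[] f = line.split("\t", -1);
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 tab-separated fields, got " + f.length);
        }
        LocalDateTime start = LocalDateTime.parse(f[1].trim(), FMT);
        LocalDateTime end = LocalDateTime.parse(f[2].trim(), FMT);
        if (!end.isAfter(start)) {
            throw new IllegalArgumentException("requested end must be after requested start");
        }
        return new RequestLineSketch(f[0].trim(), start, end, f[3].trim(),
                Integer.parseInt(f[4].trim()));
    }
}
```

A line such as the first row of Figure 3.17, with tabs between the fields, yields the request ID BSc_autumn_2013_1, the image name BSc AUTUMN and an image count of 1.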

3.7.3 Description of the output files

After the simulation terminates, the application creates three output files in the folder from which it was executed: the requests and service performance text files and the summary Excel file. The names of the files are the following, where [suffix] is given by the user, and the [YYYYMMDD] date and [HHMMSS] time are taken by the application at the start of the simulation:

Requests_[suffix]-[YYYYMMDD]_[HHMMSS].txt

ServicePerformance_[suffix]-[YYYYMMDD]_[HHMMSS].txt

Summary_[suffix]-[YYYYMMDD]_[HHMMSS].xls
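The naming pattern above can be reproduced with a few lines of Java. This is a sketch with hypothetical class and method names, using java.time (Java 8 or later); it only builds the three names and does not write any file.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Illustrative sketch only; names are assumptions. */
class OutputNamesSketch {
    // [YYYYMMDD]_[HHMMSS] part of the file names
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

    /** Returns the requests, service performance and summary file names. */
    static String[] outputNames(String suffix, LocalDateTime startedAt) {
        String tail = "_" + suffix + "-" + startedAt.format(STAMP);
        return new String[] {
            "Requests" + tail + ".txt",
            "ServicePerformance" + tail + ".txt",
            "Summary" + tail + ".xls"
        };
    }

    public static void main(String[] args) {
        // args: input file, output suffix, optional request text file
        if (args.length < 2 || args.length > 3) {
            System.err.println("usage: <inputFile.xls> <suffix> [requests.txt]");
            return;
        }
        for (String name : outputNames(args[1], LocalDateTime.now())) {
            System.out.println(name);
        }
    }
}
```

For the suffix suffix1 and a simulation started on 15 May 2014 at 09:30:00, the summary file would be named Summary_suffix1-20140515_093000.xls.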

The requests file contains the input requests in table form, as shown in Table 3.1.

Request ID | Request name | Image name | Requested start time | Requested end time
5 | BSc_autumn_2013_0_0#0 | BSc autumn 2013 | 2013.11.02 08:28:14 | 2013.11.02 11:22:55

Table 3.1 Output file for input requests

The service performance file has a high number of parameters, which are also output in table form. This file is tab-delimited, so it can easily be read for further analysis or reporting. An example output is shown in Table 3.2; because of the number of columns, each column is listed here with its example value.

Column | Example value
Cloudlet ID | 500
Broker ID | 5
Broker name | BSc_autumn_2013_0_0#0
Image name | BSc AUTUMN
Datacenter ID | 3
Datacenter name | Private
Host ID | 301
VM ID | 500
Requested start time | 2013.11.02 08:28:14
Requested end time | 2013.11.02 11:22:55
Requested running time (hour) | 2.91
Actual start time | 2013.11.02 08:36:14
Actual end time | 2013.11.02 11:30:54
Actual running time (hour) | 2.91
Waiting time (min) | 8

Table 3.2 Output file for simulated requests

The summary Excel output file contains business-related information and a short overview of the simulation. The following values are created after each simulation; example output content is shown in Table 3.3, with one column per data center.

Value | Private | Amazon
Cost of datacenter with NPV calculation, $ | 102 735.33 | 0.00
Cost of executing images in datacenter with NPV calculation, $ | 90 442.23 | 12 615.65
Average CPU utilization of datacenter, % | 56.42 | -
Average memory utilization of datacenter, % | 19.03 | -
Number of requests served in datacenter | 138 519 | 14 819
Number of requests | 153 338 | -
Number of requests served within the normal response time | 153 338 | -
Number of requests not served within the normal response time | 0 | -
Total average waiting time (min) | 6 | -
Average waiting time of normal served requests (min) | 6 | -
Average waiting time of not normal served requests (min) | 0 | -
Rate of not normally served requests (%) | 0 | -

Table 3.3 Output file for simulation overview
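The service performance columns of the summary can be derived from the per-request wait times and the accepted maximum response time. The sketch below shows one way to do this; the class and field names are hypothetical, a request counts as normally served when its wait time does not exceed the limit, and an average over an empty group is reported as 0 (as in the Private column of Table 3.3). It is not the simulator's own code.

```java
/** Illustrative sketch only; names and boundary handling are assumptions. */
class SlaSummarySketch {
    final int total;
    final int withinSla;
    final int overSla;
    final double avgWait;
    final double avgWaitWithin;
    final double avgWaitOver;
    final double overRatePercent;

    SlaSummarySketch(long[] waitMinutes, long maxResponseMinutes) {
        int within = 0;
        int over = 0;
        long sumWithin = 0;
        long sumOver = 0;
        for (long w : waitMinutes) {
            if (w <= maxResponseMinutes) { // served within the accepted response time
                within++;
                sumWithin += w;
            } else {
                over++;
                sumOver += w;
            }
        }
        total = waitMinutes.length;
        withinSla = within;
        overSla = over;
        avgWait = total == 0 ? 0.0 : (double) (sumWithin + sumOver) / total;
        avgWaitWithin = within == 0 ? 0.0 : (double) sumWithin / within;
        avgWaitOver = over == 0 ? 0.0 : (double) sumOver / over;
        overRatePercent = total == 0 ? 0.0 : 100.0 * over / total;
    }
}
```

For wait times of 6, 6, 8 and 20 minutes and a limit of 10 minutes, three requests are served normally, one is not, and the rate of not normally served requests is 25%.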


4 Case study: a hypothetical implementation project

The case study demonstrates the usage and capabilities of the extended and improved CloudSim simulator. The task is to find an optimally sized Apache VCL cloud computing environment for a hypothetical Hungarian university that can serve an estimated system load, assuming that all types of cloud computing can be used: private, public and hybrid. The decision is made by regarding each cloud infrastructure setup as a business project. During the simulation the NPV is calculated for each project. The easiest decision would be to choose the project with the smallest NPV. Unfortunately, this is not entirely right because other parameters should also be taken into account, such as the utilization of the system, the average user wait time or the service performance indicator. To sum up, the decision making is more complicated than comparing only the NPV values of the projects.

4.1 Input parameters

To run the simulator, the first step is to fill in the input Excel document. In this section I summarize the input parameters used for this case study.

The simulation start time was set to 2013.09.01 15:00:00 (yyyy.mm.dd HH:mm:ss) because I assumed that the autumn semester starts on the 1st of September.

4.1.1 Image types

Three different images were created for the case study:

BSc AUTUMN: 2 virtual CPU cores, 3072 MB RAM, Windows, 8 minutes reload time, autumn semester, $0.35 hourly cost of running this image in the Amazon public cloud. [26] This image has a similar size to the image of the lecture Systems Modeling, autumn version.

MSc 1 AUTUMN: 1 virtual CPU core, 1024 MB RAM, Linux, 4 minutes reload time, autumn semester, $0.12 hourly cost of running this image in the Amazon public cloud. This image has a similar size to the first DMIS lab of the lecture IT Engineering Laboratory 2.

MSc 2 AUTUMN: 2 virtual CPU cores, 3072 MB RAM, Linux, 4 minutes reload time, autumn semester, $0.12 hourly cost of running this image in the Amazon public cloud. This image has a similar size to the second DMIS lab of the lecture IT Engineering Laboratory 2.

4.1.2 Cloud computing infrastructures

Two types of clouds were defined to cover the private, public and hybrid cloud configurations:

Private: simulates the university's own cloud environment; the investment and monthly operational costs are calculated automatically with the cost estimation algorithm described in chapter 4.1.3.

Amazon: simulates the public cloud environment; zero values were used for the investment and operational costs, so the image-specific instance prices cover the usage costs of the public data center.

Depending on the specific simulation, a given number of physical servers were created in the private data center. Each server had 8 CPU cores, 32 768 MB RAM and a false backup flag. For Amazon's data center only one server was created, with 5 000 CPU cores, 2 147 483 647 MB RAM and a false backup flag, in order to simulate endless public resource capacity.

4.1.3 CapEx and OpEx costs of private data center

Before starting the simulation, the input Excel file has to be filled in to provide the basic input parameters for the Java cloud simulator. For this hypothetical university implementation project I had to determine two input values related to the university's hypothetical private cloud infrastructure: its one-time investment cost and its monthly operational cost. Obtaining these values is not easy because there are numerous cost influencers. In the end the costs were calculated based on expert estimates and on empirical information. The empirical information is based on the costs of the current private data center, which consists of far fewer physical servers than the private data center required in this hypothetical project.

Table 4.1 shows the key influencers of the capital expenditure. It also contains how many items/devices are needed per physical server for a working private data center. The next column shows the market price of the item, which can vary between countries or for other types of items/devices. The last column gives the full investment cost of the specific item for the private cloud infrastructure. For instance, the first row contains the investment cost of the air conditioner. Each item is calculated for a given number of physical servers; in the table the devices are calculated for a private cloud configuration of 45 physical servers.

Name of the item | Number of the item for one physical server | Price of the item, USD | One-time cost of the item for 45 physical servers, USD
Air conditioner | 0.03 | 3 555.56 | 7 111.11
Construction | 0.01 | 1 777.78 | 1 777.78
IBM x3550 server | 1.00 | 2 666.67 | 120 000.00
Network switch | 0.05 | 444.44 | 1 333.33
Storage | 0.1 | 2 666.67 | 13 333.33
UPS | 0.1 | 1 777.78 | 8 888.89
Others | 0.01 | 444.44 | 444.44
Sum | | | 152 888.89

Table 4.1 Sample capital expenditure estimation for private data center with 45 servers

The operational expenditure is estimated similarly to the capital expenditure, except that the monthly upkeep of a working private cloud environment is considered. For each item, its number of units per physical server is used to calculate the required cost for the private data center. In this example the data center has 45 identical IBM x3550 servers. Table 4.2 shows the estimation of the OpEx.

Name of the item | Number of the item for one physical server | Cost per month, USD | Unit price, USD | Monthly cost of the item for 45 physical servers, USD
Power for air conditioning | 0.03 | 289 | 8.75 | 577.78
Human resource | 0.01 | 889 | 8.89 | 888.89
Data center | 0.01 | 222 | 2.22 | 222.22
Power for physical servers | 1.00 | 20 | 20.00 | 900.00
Sum | | | | 2 588.89

Table 4.2 Sample operational expenditure estimation for private data center with 45 servers

In this example both CapEx and OpEx were estimated for a private data center with 45 physical servers: the CapEx of the 45-server implementation project is $152 888.89 and the monthly OpEx is $2 588.89. These values were used in the input file when I executed the simulation with a private data center of 45 physical servers. For private data centers of other sizes the last columns should be recalculated based on the other parameters.
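The recalculation for other data center sizes can be sketched as below. The rounding rule is inferred, not stated in the text: the totals of Tables 4.1 and 4.2 and the 9-host values of Figure 3.13 (34 666.67 and 1 580.00) are consistent with rounding each item count (units per server times number of servers) up to a whole unit. The class and method names are hypothetical, and because the table prices are rounded to cents or whole dollars, the results match the tables only to within about one dollar.

```java
/** Illustrative sketch only; the round-up rule is inferred from the table values. */
class PrivateCostSketch {
    // Units per physical server and unit prices (USD) from Table 4.1
    static final double[] CAPEX_UNITS = {0.03, 0.01, 1.00, 0.05, 0.1, 0.1, 0.01};
    static final double[] CAPEX_PRICE = {3555.56, 1777.78, 2666.67, 444.44, 2666.67, 1777.78, 444.44};
    // Units per physical server and monthly costs (USD) from Table 4.2
    static final double[] OPEX_UNITS = {0.03, 0.01, 0.01, 1.00};
    static final double[] OPEX_MONTHLY = {289.0, 889.0, 222.0, 20.0};

    /** Sum over items of ceil(units per server * servers) * unit cost. */
    static double total(double[] unitsPerServer, double[] unitCost, int servers) {
        double sum = 0.0;
        for (int i = 0; i < unitsPerServer.length; i++) {
            // small epsilon guards against results such as 3.0000000000000004
            double items = Math.ceil(unitsPerServer[i] * servers - 1e-9);
            sum += items * unitCost[i];
        }
        return sum;
    }

    static double capEx(int servers) {
        return total(CAPEX_UNITS, CAPEX_PRICE, servers);
    }

    static double monthlyOpEx(int servers) {
        return total(OPEX_UNITS, OPEX_MONTHLY, servers);
    }
}
```

Calling capEx(45) and monthlyOpEx(45) reproduces the sums of Tables 4.1 and 4.2 to within rounding, and capEx(9) and monthlyOpEx(9) reproduce the 9-host example of Figure 3.13.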

4.1.4 Opportunity cost for NPV

In the case study I used 11% as the opportunity cost of capital. That comes from a 2% risk-free rate of interest, a beta of 2.38 (Computers [CPR]) and an 8% expected return of the market. [27]
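The three inputs are the ones used by the capital asset pricing model (CAPM). Assuming that the standard CAPM relation is what the simulator applies, the opportunity cost can be computed as in the sketch below; the class and method names are hypothetical. With the input values shown in Figure 3.15 (2%, beta 1.50, 8%) the relation gives 11%, while a beta of 2.38 with the same rates would give 16.28%.

```java
/** Illustrative sketch only; assumes the standard CAPM relation. */
class OpportunityCostSketch {

    /** r = riskFree + beta * (marketReturn - riskFree), all rates as fractions. */
    static double capm(double riskFree, double beta, double marketReturn) {
        return riskFree + beta * (marketReturn - riskFree);
    }

    public static void main(String[] args) {
        // Input values of Figure 3.15
        System.out.println(capm(0.02, 1.50, 0.08));
    }
}
```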

4.1.5 SLA parameter

The accepted maximum response time for serving users' image requests was set to 17 minutes. If a request is served slower than 17 minutes, the service performance indicator of the system decreases.

4.1.6 Future image reservations

At BME DMIS the Apache VCL system was launched in 2013, therefore historical statistics are available about the usage of the laboratory images of the lectures Systems Modeling and IT Engineering Laboratory 2. Figure 4.1 shows the histogram of reservations for the lecture Systems Modeling.

The histogram of past reservations for the lecture IT Engineering Laboratory 2 is shown in Figure 4.2.

The characteristics of the past reservations were used to generate future requests. To generate such requests, the random load generation algorithm was set up with the parameters listed in Figure 4.3. In total, 15 different bachelor lectures were created with 200 students per lecture. For the master lectures, 25 different lectures with 50 students per lecture were defined.

The image reservations modeled and simulated in this way are depicted in Figure 4.4. The total number of generated image requests for one autumn semester is 153 373.
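How one peak of the load generator turns its parameters into requests can be sketched as follows. This is not the algorithm of chapter 3.5.3 itself: the class and method names are hypothetical, the number of requests per peak is taken as students times rate, and clamping very short or negative length draws to one minute is an assumption. The per-peak counts implied by the parameters of Figure 4.3 add up to 153 375, two more than the 153 373 requests reported above; the sketch does not try to reproduce that difference.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; names and the clamping rule are assumptions. */
class PeakLoadSketch {

    static final class Request {
        final LocalDateTime start;
        final LocalDateTime end;

        Request(LocalDateTime start, LocalDateTime end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Number of requests of one peak: students * rate between students and images. */
    static int requestCount(int students, double rate) {
        return (int) Math.round(students * rate);
    }

    /** Start times ~ N(peakMean, startSdHours), lengths ~ N(meanLengthHours, lengthSdHours). */
    static List<Request> generatePeak(int students, double rate, LocalDateTime peakMean,
                                      double startSdHours, double meanLengthHours,
                                      double lengthSdHours, Random rnd) {
        List<Request> out = new ArrayList<>();
        int n = requestCount(students, rate);
        for (int i = 0; i < n; i++) {
            long offsetSec = Math.round(rnd.nextGaussian() * startSdHours * 3600.0);
            LocalDateTime start = peakMean.plusSeconds(offsetSec);
            double lengthHours = meanLengthHours + rnd.nextGaussian() * lengthSdHours;
            // ASSUMPTION: draws shorter than one minute are clamped to one minute
            long lengthSec = Math.max(60L, Math.round(lengthHours * 3600.0));
            out.add(new Request(start, start.plusSeconds(lengthSec)));
        }
        return out;
    }
}
```

For peak 1 of a bachelor lecture (200 students, rate 18, mean 2013.10.27 15:00:00, standard deviation 120 hours, length 2 hours with standard deviation 1 hour) the sketch produces 3 600 requests per lecture.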

Figure 4.1 Past reservations of lecture Systems Modeling [28]

Figure 4.2 Past reservations of lecture IT Engineering Laboratory 2 [28]

Name of the lecture: | BSc_autumn_2013 | MSc_autumn_2013
Number of students: | 200 | 50
Semester (SPRING/AUTUMN) | AUTUMN | AUTUMN
Number of same lectures to be simulated: | 15 | 25
Image used at peak 1: | BSc AUTUMN | MSc 1 AUTUMN
Rate between students and requested images by peak 1: | 18,00 | 9,00
Mean of the requests' length in hour by peak 1: | 2,00 | 2,50
Standard deviation of the requests' length in hour by peak 1: | 1,00 | 1,50
Mean of peak 1, date: | 2013.10.27 15:00:00 | 2013.09.27 19:00:00
Standard deviation of the requests' time in hour by peak 1: | 120,00 | 96,00
Image used at peak 2: | BSc AUTUMN | MSc 1 AUTUMN
Rate between students and requested images by peak 2: | 11,00 | 36,00
Mean of the requests' length in hour by peak 2: | 2,00 | 2,50
Standard deviation of the requests' length in hour by peak 2: | 1,00 | 1,50
Mean of peak 2, date: | 2013.11.17 17:00:00 | 2013.10.17 19:00:00
Standard deviation of the requests' time in hour by peak 2: | 120,00 | 108,00
Image used at peak 3: | BSc AUTUMN | MSc 2 AUTUMN
Rate between students and requested images by peak 3: | 1,50 | 2,50
Mean of the requests' length in hour by peak 3: | 2,00 | 2,80
Standard deviation of the requests' length in hour by peak 3: | 1,00 | 1,00
Mean of peak 3, date: | 2013.12.16 17:00:00 | 2013.11.12 19:00:00
Standard deviation of the requests' time in hour by peak 3: | 24,00 | 48,00
Image used at peak 4: | | MSc 2 AUTUMN
Rate between students and requested images by peak 4: | | 2,00
Mean of the requests' length in hour by peak 4: | | 2,80
Standard deviation of the requests' length in hour by peak 4: | | 1,00
Mean of peak 4, date: | | 2013.11.27 19:00:00
Standard deviation of the requests' time in hour by peak 4: | | 36,00

Figure 4.3 Parameters for modeling past reservations

Figure 4.4 Generated and simulated image reservations [28]

4.2 Simulation results

With the future image reservation requests available, the question is which cloud setup best fits the requirements for hosting Apache VCL. To answer this question, a number of simulations had to be run with two varying parameters: the number of physical servers in the private data center and whether the Amazon EC2 public cloud may be used. This process is really important because it uses my own extension of the CloudSim simulator to evaluate cloud infrastructure setups, so it serves as a proof of concept of my work.

The simulations are divided into two parts according to whether the public cloud may be used. In the first part only the private data center can be used, so all of the image requests are served by the physical hosts at the university. In this case only the upkeep cost and the one-time investment cost of the private infrastructure are taken into account. In the second part both the private and the public data center (hybrid cloud setup) can run image requests. In this case the bill contains the investment cost and operational cost of the private cloud and, additionally, the price of the instances in Amazon EC2 based on how many images were requested and how long they were used. Of course there is one more case, when all of the image requests are handled in the public cloud environment, but this case is covered by the first simulation scenario (when only the private data center is allowed) because the summary output file contains this cost by default.

One more remark on the simulation: only one semester was simulated. To truly evaluate an implementation project, a project length of at least 3 years should be taken into account. Hence, the results for the hypothetical 3-year project length were estimated from the half-year (one semester) simulation results. The results of the real simulations are shown in Figure 4.5 and Figure 4.6, while the corresponding approximations for the 3-year projects are shown in Figure 4.7 and Figure 4.8.

The horizontal axis shows the number of hosts used in the private cloud. On the left vertical axis the following values are shown with colored lines:

The aqua blue line shows the current average unit cost of the image requests. This value is calculated by dividing the total NPV cost by the total number of requests – smaller is better.

The orange line shows the average unit cost of the image requests if all of the requests were executed in the Amazon EC2 public cloud – smaller is better.

The green line depicts the service performance indicator, namely the percentage of all requests that were served within the user-defined SLA time parameter – bigger is better.

The dark blue line visualizes the NPV of the projects for a specific number of physical servers in the private cloud – smaller is better.

The pink line shows the percentage of the requests that were executed in the private cloud – it depends on the case.

The ginger yellow line shows the CPU utilization of the private cloud – it depends on the case.

The brown line shows the RAM utilization of the private cloud – it depends on the case.
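Two of the plotted indicators follow directly from their definitions and can be sketched as below. The function names and the sample data are hypothetical; the sketch assumes that the serving time of each request is available in minutes and that a request counts as served when its serving time does not exceed the SLA time parameter.

```python
def unit_cost(total_npv_cost: float, total_requests: int) -> float:
    """Average cost of one image request: total NPV cost / number of requests."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return total_npv_cost / total_requests


def service_performance(serve_times_min: list[float], sla_minutes: float) -> float:
    """Percentage of requests served within the SLA time parameter."""
    if not serve_times_min:
        raise ValueError("at least one request is required")
    within_sla = sum(1 for t in serve_times_min if t <= sla_minutes)
    return 100.0 * within_sla / len(serve_times_min)


# Hypothetical sample: four requests, three of them served within 17 minutes.
times = [3.0, 12.5, 17.0, 21.0]
print(unit_cost(1_000.0, 4))              # 250.0
print(service_performance(times, 17.0))   # 75.0
```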

Figure 4.5 Simulated result of half year long hypothetical private cloud only implementation project [28]

Figure 4.6 Simulated result of half year long hypothetical hybrid cloud implementation project [28]

Figure 4.7 Estimated results of 3-year long hypothetical private cloud only implementation project [28]

Figure 4.8 Estimated results of 3-year long hypothetical hybrid cloud implementation project [28]

4.3 Finding the optimal cloud setup

After numerous simulations of different cloud setups, the next step in the decision-making process is to find the cost- and risk-optimal cloud computing configuration for the predefined workload (153 373 image requests). The rational choice would be the cloud setup with the lowest NPV. This is almost true, except that not only the project NPV must be taken into account but also the service performance indicator and the utilization of the private cloud infrastructure. The service performance indicator defines how many requests could be served within the SLA time parameter, which is important for system availability and for the quality of service. The utilization should stay under 60%. If the system utilization is permanently above 60%, then the risk of an IT system breakdown keeps growing. Accordingly, Figure 4.7 and Figure 4.8 need to be considered thoroughly to find the optimal solution based on one's own decision criteria.
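The selection logic described above can be sketched as a filter followed by a minimum search. The 60% utilization limit comes from the text; the class and function names, the minimum service performance threshold and all sample values are hypothetical and only illustrate one possible set of decision criteria.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SetupResult:
    hosts: int                  # number of physical servers in the private cloud
    npv: float                  # project NPV (cost), smaller is better
    service_performance: float  # percent of requests served within the SLA
    cpu_utilization: float      # percent
    ram_utilization: float      # percent


def pick_setup(results: list[SetupResult],
               min_service: float = 100.0,
               max_utilization: float = 60.0) -> Optional[SetupResult]:
    """Return the lowest-NPV setup that satisfies the decision criteria."""
    feasible = [
        r for r in results
        if r.service_performance >= min_service
        and max(r.cpu_utilization, r.ram_utilization) <= max_utilization
    ]
    # min() with default=None returns None when no setup is feasible.
    return min(feasible, key=lambda r: r.npv, default=None)


# Hypothetical simulation results, for illustration only.
candidates = [
    SetupResult(30, 180_000.0, 90.0, 70.0, 25.0),   # cheapest, but violates both criteria
    SetupResult(38, 210_000.0, 100.0, 50.0, 17.0),
    SetupResult(45, 240_000.0, 100.0, 42.0, 14.0),
]
best = pick_setup(candidates)
print(best.hosts if best else "no feasible setup")
```

Relaxing `min_service` or `max_utilization` enlarges the feasible set, which is how a decision maker could trade service quality or risk for a lower NPV.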

Figure 4.7 depicts the private-only and, implicitly, the public-only cloud setups. At first glance it is obvious that using only the public cloud is not sustainable for a 3-year project, because the unit cost of Amazon EC2 (orange line) is $0.67, which is higher than the unit cost of using only the private cloud (aqua blue line). As the number of physical servers decreases, the service performance (green line) decreases as well, and so do the average cost of a request and the project NPV. The system utilization (CPU and RAM) increases as the number of servers decreases.

Figure 4.8 shows different hybrid cloud setups in which the size of the private cloud changes. The optimal hybrid cloud configuration is the one with the lowest unit cost. This optimum is reached when 38 physical servers are used in the private cloud. In this case the unit cost is $0.2357 and the total cost of ownership for 3 years is $216 935. The CPU and RAM utilization are 49.91% and 17.02%, respectively, while 98.18% of the requests are executed in the private cloud. The service performance indicator is 100%, which means that all of the requests are served within the predefined 17-minute SLA parameter.
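The reported figures can be cross-checked with a short calculation. It assumes that the 3-year workload is six times the simulated one-semester workload of 153 373 image requests; this scaling is an assumption of the sketch, not something stated explicitly in the text.

```python
SEMESTER_REQUESTS = 153_373   # image requests in the simulated semester
SEMESTERS = 6                 # assumed: two semesters per year for 3 years
TCO_3_YEARS = 216_935.0       # reported 3-year total cost of ownership in USD

total_requests = SEMESTER_REQUESTS * SEMESTERS
unit_cost = TCO_3_YEARS / total_requests

print(total_requests)         # 920238
print(round(unit_cost, 4))    # 0.2357
```

Under this assumption the quotient matches the reported unit cost of $0.2357 after rounding to four decimals.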

Choosing between the private-cloud-only and the hybrid cloud setups is up to the decision makers. Private clouds with 40 physical hosts or fewer have lower unit costs ($0.2275 or less) than the optimal hybrid cloud setup with 38 private hosts ($0.2357), but their service performance is worse (96.02%) than that of the optimal hybrid cloud solution (100%). The system utilization also needs to be checked case by case, because it grows as the number of servers decreases.

5 Summary

The cloud computing service delivery model plays a key role in the IT services of most business and non-business organizations around the world. By using the cloud, universities can also take advantage of cost reduction, flexibility, performance, and the like. VDI offers many advantages in a university environment. Lecturers and students do not have to attend the labs in person merely to get access to the physical resources; virtualized laboratories seem to be a better solution. VDI can remove the dependency on lab premises and schedules, and it makes the laboratories more flexible as well. For this purpose BME DMIS launched its own Apache VCL infrastructure in 2013.

In this master's thesis the main task was to model the cost and risk factors of Apache VCL in order to carry out cost, risk and service performance optimization studies on hybrid VCL cloud setups. Using the created model, a hypothetical Apache VCL implementation project at a hypothetical Hungarian university had to be optimized based on some predefined criteria for a specific load profile.

At the beginning I gave a brief overview of the open source Apache VCL and of the Apache VCL deployment used by BME DMIS. To get familiar with Apache VCL, its computer allocation processes were examined. There are three main types of allocation: the normal reservation by a user/student, the block allocation by a user/teacher, and the predictive loading modules. These processes contain a lot of information that was out of scope; therefore I created a filtered, simplified model of Apache VCL to simulate only the relevant parts.

After obtaining the appropriate information about Apache VCL and the simplified model, a cloud simulator had to be chosen to faithfully model and simulate Apache VCL, including the handling of reservation requests and the slot management. The simulator also had to be able to simulate hybrid cloud computing infrastructures. Using a simulator was a must because simulation is a very fast way to obtain the necessary results. Eventually the open source CloudSim cloud simulator was chosen. CloudSim could not be used to simulate Apache VCL without changes to its source code. Modifying the CloudSim source code was the biggest task, bigger than understanding how Apache VCL works. The modification added four must-have conceptual features that were missing from CloudSim: cost, system utilization, future request and service performance modeling.

Finally, the relevance and usability of the extended CloudSim were demonstrated through a hypothetical implementation project of Apache VCL at a Hungarian university. In the case study I showed that the created simulator can be used to find cost-optimal cloud setups for assumed future image requests. Decisions were made based on the unit cost and the service performance indicator of the cloud infrastructure implementation projects.

The conclusion of the simulations of different private-only, public-only and hybrid cloud setups for 3-year projects is that there is an optimal hybrid cloud configuration in which the unit cost is minimal while the service performance indicator is at its maximum, as can be seen in Figure 5.1. A smaller unit cost can be reached only if the quality of the service is decreased.

Figure 5.1 Results of different hybrid cloud configurations [28]

Bibliography

[1] Pierre Audoin Consultants: Growing Cloud Market Presents Huge Opportunities for SITS Vendors; https://www.pac-online.com/pac/pac/live/pac_world/global/press_corner/press_releases/index.html?lenya.usecase=show-rapport&document=pac_sitsi_reports/press_release/PR_Cloud_Feb13&xsl=press_release (May 2013)

[2] VMware: Virtual Desktop Infrastructure, page 1; http://www.vmware.com/pdf/virtual_desktop_infrastructure_wp.pdf (May 2013)

[3] Apache VCL: Homepage of Apache VCL; http://vcl.apache.org (March 2013)

[4] Imre Kocsis: Oktatás felhőben, slide 15; http://www.slideshare.net/ImreKocsis1/oktats-apache-vcl-felhvel-tempus-felsoktatsi-mhely (May 2014)

[5] Apache VCL: Authentication of Apache VCL; https://cwiki.apache.org/confluence/display/VCL/VCL+2.3+Configure+Frontend+Authentication (March 2013)

[6] Shibboleth: Homepage of Shibboleth; http://shibboleth.net/ (May 2013)

[7] Apache VCL: User privileges of Apache VCL; https://cwiki.apache.org/VCL/for-vcl-users.html#ForVCLUsers-Privileges (March 2013)

[8] Apache VCL: Apache VCL documentation; https://cwiki.apache.org/confluence/display/VCL/Apache+VCL (March 2013)

[9] Apache VCL: Apache VCL architecture; https://cwiki.apache.org/VCL/vcl-architecture.html (April 2013)

[10] Apache VCL: Apache VCL XMLRPC API; http://people.apache.org/~jfthomps/vcl_xmlrpc_api.html (May 2013)

[11] Apache VCL: Database schema of Apache VCL; https://cwiki.apache.org/confluence/display/VCL/Database+Schema (April 2013)

[12] Apache VCL: Network layouts of Apache VCL; https://cwiki.apache.org/VCL/network-layout.html (April 2013)

[13] Apache VCL: Resources, Groups & Privileges of Apache VCL; https://cwiki.apache.org/VCL/resources-groups-privileges.html (April 2013)

[14] Imre Kocsis and Áron Tóth: Oktatás felhőben, slide 21; http://www.slideshare.net/ImreKocsis1/oktats-apache-vcl-felhvel-tempus-felsoktatsi-mhely (March 2014)

[15] Dumitrescu, Catalin L., and Ian Foster. "GangSim: a simulator for grid scheduling studies." Cluster Computing and the Grid, 2005. CCGrid 2005. IEEE International Symposium on. Vol. 2. IEEE, 2005.

[16] Casanova, Henri. "Simgrid: A toolkit for the simulation of application scheduling." Cluster Computing and the Grid, 2001. Proceedings. First IEEE/ACM International Symposium on. IEEE, 2001.

[17] Núñez, Alberto, et al. "iCanCloud: A flexible and scalable cloud infrastructure simulator." Journal of Grid Computing 10.1 (2012): 185-209.

[18] Buyya, Rajkumar, and Manzur Murshed. "Gridsim: A toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing." Concurrency and computation: practice and experience 14.13-15 (2002): 1175-1220.

[19] CloudSim: Web site of CloudSim; http://www.cloudbus.org/cloudsim (March 2014)

[20] Calheiros, Rodrigo N., et al. "CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms." Software: Practice and Experience 41.1 (2011): 23-50.

[21] Investopedia: Net Present Value – NPV; http://www.investopedia.com/terms/n/npv.asp (May 2014)

[22] Norbert Madarász: Modified source code of CloudSim; https://www.dropbox.com/sh/qbmckbhpk69015n/AADsNB1GEl7OvVRBuNf-hXRba (May 2014)

[23] Michael Thomas Flanagan: Java Scientific Library; http://www.ee.ucl.ac.uk/~mflanaga/java (April 2014)

[24] Apache Commons Math: Project web site; http://commons.apache.org/proper/commons-math/index.html (April 2014)

[25] Java Excel API: Homepage; http://www.andykhan.com/jexcelapi/index.html (April 2014)

[26] Amazon: Amazon EC2 Pricing; https://aws.amazon.com/ec2/pricing (April 2014)

[27] Dr. Andor, György and Dr. Tóth, Tamás (2010). Vállalati pénzügyek I. oktatási segédanyag, page 148.

[28] Madarász, Norbert: Tableau figures online; http://public.tableausoftware.com/profile/fulmi (May 2014)

Abbreviation list

API Application Programming Interface

BME Budapest University of Technology and Economics

BRITE Boston University Representative Internet Topology Generator

CAN Campus Area Network

CapEx Capital Expenditure

CIS Cloud Information Service

CLOUDS Cloud Computing and Distributed Systems Laboratory

CMD Command Prompt

CPU Central Processing Unit

CRM Customer Relationship Management

DaaS Desktop-as-a-Service

DMIS Department of Measurement and Information Systems

EC2 Amazon Elastic Compute Cloud

FCFS First-Come-First-Serve

GB Gigabyte

GPL General Public License

HTTP Hypertext Transfer Protocol

IaaS Infrastructure-as-a-Service

IT Information Technology

LAN Local Area Network

LDAP Lightweight Directory Access Protocol

MB Megabyte

Mbps Megabits per second

MI Million Instructions

MIPS Million Instructions Per Second

NAS Network Attached Storage

NFS Network File System

NPV Net Present Value

OpEx Operational Expenditure

PaaS Platform-as-a-Service

PE Processing Element

QoS Quality of Service

RAM Random Access Memory

RDBMS Relational Database Management System

RDP Remote Desktop Protocol

SaaS Software-as-a-Service

SAN Storage Area Network

SAS Serial Attached SCSI

SCSI Small Computer System Interface

SLA Service Level Agreement

SSH Secure Shell

SSL Secure Sockets Layer

TB Terabyte

TCO Total Cost of Ownership

UI User Interface

VCL Apache Virtual Computing Lab

VDC Virtual Data Centre

VDI Virtual Desktop Infrastructure

VM Virtual Machine

xCAT Extreme Cloud Administration Toolkit

XLS Microsoft Excel Binary File Format