Cost-Time Performance of Scaling Applications on the Cloudsunimalr/papers/2018-cloudcom.pdfOur...

Cost-Time Performance ofScaling Applications on the Cloud

Sunimal RathnayakeDepartment of Computer ScienceNational University of Singapore

[email protected]

Lavanya RamapantuluComputer Systems Group

International Institute of Information TechnologyHyderabad, [email protected]

Yong Meng TeoDepartment of Computer ScienceNational University of Singapore

[email protected]

Abstract—Recent advancements in big data processing andmachine learning, among others, increase the resource demandfor running applications with larger problem sizes. Elasticcloud computing resources with pay-per-use pricing offers newopportunities where large application execution is constrainedonly by the cost budget. Given a cost budget and a time deadline,this paper introduces a measurement-driven analytical modelingapproach to determine the largest Pareto-optimal problem sizeand its corresponding cloud configuration for execution. We eval-uate our approach with a set of representative applications thatexhibit a range of resource demand growth patterns on AmazonAWS cloud. We show the existence of cost-time-size Pareto-frontier with multiple sweet spots meeting user constraints. Tocharacterize the cost-performance of cloud resources, we usePerformance Cost Ratio (PCR) metric. We extend Gustafson’sfixed-time scaling in the context of cloud, and, investigate fixed-cost-time scaling of applications and show that using resourceswith higher PCR yields better cost-time performance. We discussa number of useful insights on the trade-off between the executiontime and the largest Pareto-optimal problem size, and, show thattime deadline could be tightened for a proportionately muchsmaller reduction of problem size.

Index Terms—scaling, largest problem size, cloud, cost-timeperformance, Pareto-optimal configuration

I. INTRODUCTIONScalability in cloud computing is of two-fold: application

scalability and resource scalability. Variation of resource de-mand of application with problem size is known as applica-tion scalability. The plethora of cloud resources available oncloud with different cost and performance leads to resourcescalability. Unlike in traditional on-premise computer systemswhere scalability was limited by the available on-premisehardware resources, scalability on cloud is bounded only bycost budget and execution time deadline. As cloud resourcesare characterized in terms of processing power, memory,storage, and, network performance, among others, the cloudconsumer has the opportunity to choose a cost-efficient cloudresource configuration to match the resource demand of theapplication. However, the consumer should take into accountboth the resource demand growth of the application as well asthe cost-performance of cloud resources when deciding on anoptimal cloud resource configuration.

Scalability in parallel computing has been well studied [7],[9], [20] with a majority focusing on time-performance.Gustafson’s law on fixed-time scaling is one such work whichargues that given a fixed-time, a near-linear parallel speedupcould be achieved when the size of the application grows [4].With heterogeneous scalable cloud resources and applicationscalability, we extend Gustafson’s law and investigate fixed-

cost-time scaling of applications on the cloud. Moreover, sincecloud resource usage is measured, metered, and, charged basedon the duration the resource was used, studying the impact ofscaling on cost is equally important.

Given an application with a time deadline and a costbudget, we propose a measurement-driven analytical modelingapproach to determine cost-time Pareto-optimal problem sizes(defined in Section III-C) and cloud resource configurations forexecuting them. Although Pareto-optimal scaling is not new toparallel computing [2], [17], applying Pareto-optimal scalingfor investigating the trade-off between application problemsize, and, the cost and time of execution is new in the contextof cloud computing. Similar to Hwang et al.’s [10] workon performance evaluation, we use the instruction count asthe proxy for matching the resource demand of applicationwith resource capacity of cloud resources. We utilize baselineexecutions for characterizing applications and resources, and,evaluate our approach on Amazom EC2 cloud.

II. RELATED WORKStarting from the early research such as Amdahl’s law and

Gustafson’s law, researchers [7], [9], [10], [20], [21] haveattempted to model and improve scalability. In cloud com-puting, most of the work focus on resource elasticity [1], [3],[8], [11], [13]–[15], [19]. Such approaches include autoscalingand resource migration for cost optimization, scheduling fordynamic pricing schemes such as AWS spot pricing, and,model driven approaches for predicting resource demand. Incomparison, fewer research works [5], [6], [18], [22], [23] doaddress application scalability on the cloud. These researchworks include resource optimization for elastic applications,benchmarking for determining the best virtual machine type,and, framework to develop elastic algorithms. Extending thecurrent state of research, this paper focuses on fixed-cost-timePareto-optimal scaling of application problem size using ameasurement driven analytical modeling approach.

III. APPROACH

application

time deadline

cloud resources

cost deadline

non-virtualized

server

selected cloud

resources

application charac.

resource charac.

pred.model

opt.algo.

Pareto-optimal size

predicted time

predicted cost

configuration

inputs baseline phase

model and

optimization outputs

𝑃 𝑓(𝑆)

𝑅

𝑇, 𝐶

Δ𝑖

Δ𝐺 , 𝑆, 𝑡, 𝑐𝐺

Fig. 1: Approach Overview

This section discusses our measurement-driven analytical

model derivation and Pareto optimization algorithm.A. Fixed Cost-time Application Scaling

As shown in Figure 1, given an application P with a costbudget C, a time deadline T , and, a set of cloud resources,our approach determines Pareto-optimal problem sizes S ofP executable on the cloud and Pareto-optimal cloud resourceconfigurations to execute them.

For application characterization, we measure the numberinstructions executed on the non-virtualized server whilerunning the application for different problem sizes. Thesemeasurements are utilized to derive the application resourcedemand growth function using a curve fitting technique. Forcharacterizing cloud resources, the execution rate for eachcloud resource instance is computed by dividing the numberof instructions measured on a non-virtualized server for eachproblem size by the corresponding execution time.

Our Pareto-optimization algorithm takes as input C, T , theset of cloud configurations G and the growth function of P ,f(S), and determines largest Pareto-optimal sizes of P , Smax,optimal cost C ′, optimal time T ′ and the corresponding cloudresource configuration config.B. Model Derivation

A cloud configuration is a combination of instances fromone or more cloud resource types. We use the notation{|r0,j |, |r1,j |, .., |ri,j |, .., r|R|−1,j} to denote Gj , the jth cloudconfiguration in G, where ri,j ∈ R is the set of cloud resourceinstances from resource type i ∈ R.The largest S of P executable on Gj depends on the numberof instructions Gj can execute while running P .

S = f−1 (W ) (1)where f−1 is the inverse of the scaling function of P , and,W is the total number of instructions executed on Gj .Suppose, Gj has to execute P for a time duration t to executeW instructions, thus, W is computed as

W = ∆Gj × t (2)where ∆Gj is the resource capacity of Gj .Resource capacity of a cloud resource configuration is definedas the summation of resource capacities of all cloud resource

TABLE I: Model NotationsSymbol Description

ApplicationP an applicationS a problem size of PW resource demand of Pf(S) resource demand growth function of P in terms of S

ResourcesR set of all cloud resourcesi a cloud resource type in Rvi number of vCPUs in cloud resource type iG set of all cloud configurationsGj a configuration j having one or more resourcesri,j set of instances of type i in configuration jci cost per unit time for an instance of resource type icGj

cost per unit time for configuration Gj

∆ivCPUper vCPU computational capacity of a resource type i

∆i total computational capacity of a resource type i∆Gj

computational capacity of configuration Gj

ModelT time deadline to execute PC cost budget to execute Pt time to execute WT ′ predicted execution time for executing WC′ predicted cost for executing W

TABLE II: ApplicationsApplication Scaling Function

n-body W = 106n2 + 5× 106n− 7× 109

W = 1012s+ 2× 1013

sand W = 3× 1012nW = 8× 1015 ln(τ) + 4× 1016

instances in the configuration.

∆Gj=

|R|−1∑i=0

(|ri,j | ×∆i) (3)

where |ri,j | is the number of cloud resource instances fromresource type i in Gj , and, ∆i is the resource capacity ofcloud resource type i.A cloud resource instance consists of one or more vCPUs,thus, the total resource capacity of a cloud resource instancedepends on the number of vCPUs it consists.

∆i = ∆ivCPU× vi (4)

where ∆ivCPUis the per-vCPU resource capacity and, vi is

the number of vCPUs in one instance of i.Given that it takes time t to execute size S of P on configu-ration Gj , the total cost is

C ′ = t× CGj (5)where CGj

is the cost per unit time for configuration Gj .CGj

is determined based on the cost for unit time of eachcloud resource instance in the Gj and is defined as

CGj =

|R|−1∑i=0

(|ri,j | × ci) (6)

where ci is the cost per unit time for cloud resource type i.C. Determining Largest Pareto-optimal Size

We need to maximize the size while minimizing cost andtime. Thus, we define that a tuple <size, cost, time> con-sists Pareto-optimal size if size cannot be increased withoutincreasing either/both cost and time. Our Pareto-optimizationalgorithm traverses all cloud configurations and builds a listof tuples of which each tuple contains maximum problem sizefor each configuration, cost, and, time. Secondly, tuples withsame time are clustered together. Thirdly, from each clusterwith same time, the tuples with minimum cost are retainedand others are discarded. From the remainder, the tuples withlargest size are selected as Pareto-optimal sizes.

IV. EVALUATIONThis section presents the evaluation of our approach Ama-

zon EC2 cloud with a representative set of applications fol-lowed by an analysis of important observations.A. Applications

To evaluate our approach we selected two applications thatexhibit linear, quadratic and logarithmic growth of resourcedemand with respect to problem size as shown in Table II.n-body (n, s) [12] implements n-body simulation of masseson MPI where scalable problem sizes are the number ofmasses n and number of simulation steps s. sand (n, τ)[16] is a genome sequence alignment application implementedon Work-Queue platform which aligns compatible genomesequences from a list of candidate sequences of size n basedon a quality threshold τ .

0

50000

100000

150000

200000

250000

300000

350000

c4.l

arge

c4.x

larg

e

c4.2

xla

rge

m4.l

arge

m4.x

larg

e

m4.2

xla

rge

i3.l

arge

i3.x

larg

e

i3.2

xla

rge

r3.l

arge

r3.x

larg

e

r3.2

xla

rge

PC

R (

bil

lion i

nst

ruct

ions

per

$)

resource type

n-bodysand

Fig. 2: Performance Cost Ratio of EC2 Resources Types

TABLE III: Amazon EC2 Cloud Resource Types

Resource vCPUs Freq. Mem. CostExecution Rate PCR(bil. instr/s/vCPU) (bil. instr/s)

(GHz) (GB) ($) n-body sand n-body sandc4.large 2 2.9 3.75 0.105 1.38 4.53 94556 310629c4.xlarge 4 2.9 7.5 0.21 1.38 4.54 94338 312804c4.2xlarge 8 2.9 15 0.42 1.37 4.56 93640 313248m4.large 2 2.3 8 0.133 1.16 4.83 62764 261474m4.xlarge 4 2.3 16 0.266 1.24 4.85 62331 262556m4.2xlarge 8 2.3 32 0.532 1.23 4.88 66617 264010r3.large 2 2.5 15 0.166 1.14 3.69 49500 160048r3.xlarge 4 2.5 30.5 0.332 1.12 3.69 48391 159568r3.2xlarge 8 2.5 61 0.664 1.11 3.71 48194 160779

B. Cloud Resources and Performance Cost RatioWe use Performance Cost Ratio (PCR) as a metric for

representing and comparing cost efficiency of cloud resources.PCR is defined asPCR (instructions/$) = instruction execution rate (instructions/sec)

cost per unit time ($/sec)

Figure 2 shows PCR of four non-accelerated resource cate-gories on Amazon EC2 cloud for n− body and sand applica-tions and we observe that PCR has a non-linear relationshipacross different resource categories. For evaluation of ourapproach, while preserving the non-linear relation in PCRacross resource categories, we selected c4, m4 and r3 resourcecategories as shown in Table III. From each category, weselected large, xlarge, 2xlarge resource types resulting in aconfiguration space of over ten million configurations.C. Validation

Table IV shows model validation where the predicted timeand cost are compared against measured values from AmazonEC2 cloud. Due to limited research budget, we validate only

600 800

1000 1200

1400 40 50

60 70

80 90

100

4000 5000 6000 7000 8000 9000

10000

n

(c) sand (n, 0.16)

PO SizeMin. PO SizeMax. PO Size

Time (min)

Cost ($)

n

600 700 800 900 1000 1100 1200 1300 1400 1500 65

70 75

80 85

90 95

100

0.15 0.2

0.25 0.3

0.35 0.4

0.45 0.5

0.55

τ

(d) sand (4096m, τ)


Time (min)

Cost ($)

τ

600 700 800 900 1000 1100 1200 1300 1400 1500 55

60 65

70 75

80 85

90 95

100

4000 4500 5000 5500 6000 6500 7000

s

(a) n-body (64k, s)


Time (min)

Cost ($)

s

700 800 900 1000 1100 1200 1300 1400 1500 60

65 70

75 80

85 90

95 100

64000 66000 68000 70000 72000 74000 76000 78000 80000 82000

n

(b) n-body (n, 4000)


Time (min)

Cost ($)

n

Fig. 3: Pareto-optimal Size for n-body and sand

TABLE IV: Model ValidationApp Param Size Configuration† Time (min) Cost ($) Accu. (%)Pred. Act. Pred. Act.

n-bodyn 80445 {5,5,5,0,0,3,0,0,1} 1415 1154 99.99 81.55 82

65536 {3,5,5,0,0,2,0,0,1} 1218 1003 66.32 54.61 82

s 6896 {5,5,5,0,0,3,0,0,1} 1415 1149 99.99 81.19 814000 {4,4,5,0,0,0,0,0,1} 1061 862 56.78 46.13 81

sandn 9976m {5,5,5,0,0,3,0,0,1} 1415 1630 99.99 115.18 85

4096m {5,5,4,0,0,0,0,0,1} 651 738 40.54 45.96 87

τ0.5341 {5,5,5,0,0,3,0,0,1} 1415 1614 99.99 114.05 860.1600 {4,5,5,1,4,5,0,0,1} 752 876 71.21 82.95 84

† Configuration = {c4.2xlarge, c4.xlarge, c4.large, m4.2xlarge, m4.xlarge, m4.large, r3.2xlarge, r3.xlarge,r3.large}the maximum and minimum problem sizes for each applicationshown in Figure 3. The prediction accuracy varies in the range81% - 82% and 84% - 87%, for n-body and sand respectively.Sources of inaccuracy include impact of virtualization on mea-surements, having different system specification for the sameresource type on cloud, and the communication overhead.D. Model Analysis

This section presents an analysis based on model predictionsfor scaling applications on the Amazon EC2 cloud. For exper-iments on cloud, for simplicity and due to research budgetconstraints, we set the time deadline to 24 hrs and cost budgetto $100 whereas for extrapolated analysis in sections IV-D2and IV-D3 we set time deadline to 7 days (168 hrs) and costbudget to $1000, hence emulating long-running applications.1) Pareto-optimal Problem Size

To understand the relationship between the application prob-lem size and the cost-time performance of cloud resources, asshown in Figure 3, we investigate how Pareto-optimal problemsizes are distributed in the cost-time-size space.Observation 1: Given an application with a cost budget andtime deadline there exist one or more Pareto-optimal problemsizes of which one consists of the largest problem size.Having a set of Pareto-optimal problem sizes within thepredefined cost and time constraints provides opportunity forcloud consumer to further tighten these constraints at theexpense of the problem size of the application.Observation 2: Increasing the cost budget and relaxing thetime deadline does not always allow a larger problem size.Blindly selecting the cloud resource configuration may resultin a smaller problem size even with larger execution timeand higher cost, thus, consideration of cost-performance ofdifferent resource combinations is important.Observation 3: For a given Pareto-optimal problem size withmultiple resource configurations, resource demand is allocatedto different resource types in order of higher PCR.Selecting the resources in the order of the highest PCR wouldobtain a configuration with a near-optimal problem size.2) Impact of Time Deadline on Scaling

To investigate the impact of time deadline on scaling, asshown in Figure 4, we determine the largest size of theapplication executable for n-body and sand while relaxing thetime deadline for different fixed cost values.Observation 4: Among Pareto-optimal problem sizes, tight-ening of time deadline results in a proportionately smallerreduction of largest problem size.When the time deadline is relaxed, the capability of resourcesincrease linearly, but the resource demand increases quadrati-cally. Hence, results in a sublinear growth of largest problem

5000

10000

15000

20000

25000

30000

24 48 72 96 120 144 168

(c) sand (n, 0.16)

n (

mil

lio

n)

Time (hrs)

$250$500$750

$1000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

24 48 72 96 120 144 168

(d) sand (4096m, τ)

τ

Time (hrs)

$250$500$750

$1000

10000 15000 20000 25000 30000 35000 40000 45000 50000 55000 60000 65000

24 48 72 96 120 144 168

(a) n-body (64k, s)

s

Time (hrs)

$250$500$750

$1000

100000

120000

140000

160000

180000

200000

220000

240000

260000

24 48 72 96 120 144 168

(b) n-body (n, 4000)

n

Time (hrs)

$250$500$750

$1000

Fig. 4: Impact of Time Deadline on Largest Problem Size

5000

10000

15000

20000

25000

30000

250 500 750 1000

(c) sand (n, 0.16)

n (

mil

lio

n)

Cost ($)

24hrs48hrs72hrs96hrs

120hrs144hrs168hrs

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

250 500 750 1000

(d) sand (4096m, τ)

τ

Cost ($)


120hrs144hrs168hrs

10000 15000 20000 25000 30000 35000 40000 45000 50000 55000 60000 65000

250 500 750 1000

(a) n-body (64k, s)

s

Cost ($)


120hrs144hrs168hrs

100000

120000

140000

160000

180000

200000

220000

240000

260000

250 500 750 1000

(b) n-body (n, 4000)

n

Cost ($)


120hrs144hrs168hrs

Fig. 5: Impact of Cost Budget on Largest Problem Sizesize. Change of gradient along the curve as seen in Figure 4 isdue to configuration spilling into the next available resourcecategory which has a lower PCR.This sub-linear relationship between the largest Pareto-optimalsize and the time deadline could be leveraged by the cloudconsumer to further tighten the time deadline with minimaltrade-off of the application problem size.3) Impact of Cost Budget on Scaling

To explore the applicability of Gustafson fixed-time scaling[4] in the context of cloud, we investigate the impact on thelargest executable problem size when the cost budget is relaxedwhile keeping the time deadline fixed as shown in Figure 5.Observation 5: Among Pareto-optimal problem sizes, costgrows faster than the increase of problem size because PCRis not linear across all resource types.While cloud poses itself as a platform for fix-time applcationscaling, with these observations, we reiterate the importance ofconsidering the cost-performance efficiency of cloud resourcesfor a consumer to efficiently scale applications on cloud.

V. CONCLUSIONScalable resources and pay-per-use charging make cloud a

suitable platform for executing scalable applications. However,

scaling decisions on cloud is challenging due to the large con-figuration space and application dependent resource demandgrowth. This paper presents a measurement-driven analyticalmodeling approach to determine the largest Pareto-optimalproblem sizes for a given application with a time deadlineand a cost budget, and, cloud configurations to execute them.Using model results, we show the existence of cost-time-sizePareto frontier with multiple Pareto-optimal sizes for a givenapplication with a time deadline and cost budget. We presentinteresting insights on fixed-cost-time scaling on cloud andintroduce Performance Cost Ratio (PCR) which can be utilizedto determine near-optimal cloud resource configurations.

VI. ACKNOWLEDGMENTSThis work was supported by Singapore Ministry of Educa-

tion through the Academic Research Fund Tier 1. The authorsthank Amazon Web Services for AWS Cloud research credits.

REFERENCES[1] T. Bicer, et al., Time and Cost Sensitive Data Intensive Computing on

Hybrid Clouds, Proc. of CCGRID, pages 636–643, 2012.[2] M. R. H. Farahabady, et al., Pareto-optimal cloud bursting, IEEE TPDS,

25(10):2670–2682, 2014.[3] Y. Gong, et al., Monetary Cost Optimizations for MPI-based HPC Ap-

plications on Amazon Clouds: Checkpoints and Replicated Execution,Proc. of SC, page 32, 2015.

[4] J. L. Gustafson, Reevaluating Amdahl’s Law, Communications of theACM, 31(5):532–533, 1988.

[5] R. Han, Investigations Into Elasticity in Cloud Computing, arXivpreprint arXiv:1511.04651, 2015.

[6] J. He, et al., On the Cost–QoE Tradeoff for Cloud Based Video Stream-ing Under Amazon EC2’s Pricing Models, IEEE TCSVT, 24(4):669–680,2014.

[7] M. D. Hill, et al., Amdahl’s Law in the Multicore Era, Computer, 41(7),2008.

[8] H. Huang, et al., CAP3, Proc. of CLOUD, pages 228–235, 2013.[9] K. Hwang, et al., Cloud Performance Modeling with Benchmark

Evaluation of Elastic Scaling Strategies, IEEE TPDS, 27(1):130–143,2016.

[10] K. Hwang, et al., Advanced Computer Architecture, 3e: , McGraw HillEducation (India) Private Limited, 2011.

[11] P. Kokkinos, et al., Cost and Utilization Optimization of Amazon EC2Instances, Proc. of CLOUD, pages 518–525, 2013.

[12] S. Leeman-Munk, PetaKit Software Suite, 2010.[13] M. Mao, et al., Auto-scaling to Minimize Cost and Meet Application

Deadlines in Cloud Workflows, Proc. of SC, page 49, 2011.[14] M. Mao, et al., Cloud Auto-scaling with Deadline and Budget Con-

straints, Proc. of GRID, pages 41–48, 2010.[15] A. Marathe, et al., Exploiting Redundancy for Cost-effective, Time-

constrained Execution of HPC Applications on Amazon EC2, Proc. ofHPDC, pages 279–290, 2014.

[16] C. Moretti, et al., PREPRINT: A Framework for Scalable GenomeAssembly on Clusters, Clouds, and Grids.

[17] L. Ramapantulu, et al., An Approach for Energy Efficient Execution ofHybrid Parallel Programs, Proc. of IPDPS, pages 1000–1009, 2015.

[18] S. Rathnayake, et al., CELIA: Cost-time Performance of ElasticApplications on Cloud, Proc. of ICPP, pages 342–351, 2017.

[19] U. Sharma, et al., A Cost-aware Elasticity Provisioning System for theCloud, Proc. of ICDCS, pages 559–570, 2011.

[20] X.-H. Sun, et al., Reevaluating Amdahls Law in the Multicore Era,JPDC, 70(2):183 – 188, 2010.

[21] X.-H. Sun, et al., Another View on Parallel Speedup, Proc. of SC, pages324–333. IEEE Computer Society Press, 1990.

[22] L. Thai, et al., Executing bag of distributed tasks on the cloud:Investigating the trade-offs between performance and cost, Proc. ofCLOUDCOM, pages 400–407. IEEE, 2014.

[23] B. Varghese, et al., Cloud benchmarking for maximising performanceof scientific applications, IEEE TCC, 2016.

Cost-Time Performance of Scaling Applications on the Cloudsunimalr/papers/2018-cloudcom.pdfOur...

Documents

Transcript of Cost-Time Performance of Scaling Applications on the Cloudsunimalr/papers/2018-cloudcom.pdfOur...