SpotADAPT: Spot-Aware (re-)Deployment of Analytical Processing Tasks on Amazon EC2

by
Dalia Kaulakiene, Aalborg University (Denmark)
Christian Thomsen, Aalborg University (Denmark)
Torben Bach Pedersen, Aalborg University (Denmark)
Ugur Çetintemel, Brown University (USA)
Tim Kraska, Brown University (USA)

Page 2:

Amazon Web Services EC2 cloud

Pricing model        Contract                      Price per hour (*)
Reserved instances   1-year or 3-year contract     $0.0581
On-demand            No contract                   $0.128
Spot instances       No contract, can be revoked   $0.0365 (**)

(*) c3.large instance type (Linux) in the ap-northeast-1 region
(**) Average price over 1 week, Mar 23 - Mar 30, 2015
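
To make the price gap concrete, the short sketch below computes how much cheaper the reserved and spot prices are than the on-demand price. It uses only the three hourly prices quoted on this slide; the function name is an illustrative choice, not something from the talk.

```python
# Hourly prices copied from the slide (c3.large, Linux, ap-northeast-1).
PRICES_PER_HOUR = {
    "reserved": 0.0581,
    "on-demand": 0.128,
    "spot": 0.0365,  # one-week average, Mar 23 - Mar 30, 2015
}


def saving_vs_on_demand(model: str) -> float:
    """Return the fractional saving of a pricing model relative to on-demand."""
    return 1.0 - PRICES_PER_HOUR[model] / PRICES_PER_HOUR["on-demand"]


for model in ("reserved", "spot"):
    print(f"{model}: {saving_vs_on_demand(model):.1%} cheaper than on-demand")
```

With these figures the spot price is roughly 71% below on-demand and the reserved price roughly 55% below.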

Page 3:

Amazon Spot market

• User bids for the machine
  – “I need an 8-vCPU machine in region A”
  – “I will pay at most $0.5 per hour”
• If the spot price < $0.5:
  – The user gets an instance
• If (and when) the spot price > $0.5:
  – AWS takes back the instance

[Figure: spot prices in ap-northeast-1c and ap-northeast-1a]
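
The bidding rule above can be sketched as a tiny simulation. This is a minimal illustration, not AWS's actual mechanism: the hourly price trace is made up, the function name is hypothetical, and treating a price exactly equal to the bid as "instance kept" is an assumption, since the slide only covers the strictly-below and strictly-above cases.

```python
from typing import List


def hours_held(bid: float, hourly_spot_prices: List[float]) -> int:
    """Count the consecutive hours an instance is held, starting at hour 0.

    The instance is kept while the spot price does not exceed the bid
    (price == bid is assumed to keep it) and is lost at the first hour
    in which the spot price rises above the bid.
    """
    held = 0
    for price in hourly_spot_prices:
        if price > bid:  # AWS takes the instance back
            break
        held += 1
    return held


# Hypothetical price trace in USD per hour, with the $0.5 bid from the slide.
trace = [0.30, 0.45, 0.62, 0.20]
print(hours_held(0.5, trace))
```

In this trace the instance is lost in the third hour, so a later price drop does not help unless the workload is re-deployed.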

Page 4:

Problems

The user needs to execute an analytical workload on spot instances
• Hadoop job
• Data in Amazon S3

Problem 1. Execution time is unknown
• AWS organizes instances into 9 families with 4-5 instance types each
  – General purpose: T2, M4, M3
  – Compute optimized: C4, C3
  – Memory optimized: R3
  – GPU: G2
  – Storage optimized: I2, D2

Problem 2. Execution cost is unknown (and varies)
• 7 regions, 2-4 availability zones in each
• Spot prices change in real time

Page 5:

SpotADAPT

• Estimates execution time on AWS instances
• Estimates execution price in AWS regions
• Proposes deployment w/ optimization goals:
  – Fastest execution within budget, or
  – Cheapest execution within time constraints
• Monitors execution
• Proposes re-deployment if:
  – Instance is taken away by AWS
  – Cheaper or faster deployment is available

Page 6:

Execution time estimation

1. Dataset size increase effect
  – Increasing (sampled) input size
  – Executing on the same machine
  – More micro-runs do not improve accuracy!
  – SpotADAPT takes a few micro-runs to estimate the execution time on the large dataset

[Figure: Wordcount. AWS instance family: slowest machine (2 vCPUs), more powerful machine (4 vCPUs), … (.. vCPUs), most powerful machine (32 vCPUs)]
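
One way to picture the data size effect is to fit a trend through a few micro-run timings and extrapolate to the full dataset. The sketch below assumes a linear relation between input size and execution time; the slides do not state which model SpotADAPT fits, so the linear model, the function names, and the sample numbers are all illustrative assumptions.

```python
from typing import Sequence, Tuple


def fit_line(sizes_gb: Sequence[float], times_s: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of time = slope * size + intercept."""
    n = len(sizes_gb)
    if n < 2 or n != len(times_s):
        raise ValueError("need at least two (size, time) pairs of equal length")
    mean_x = sum(sizes_gb) / n
    mean_y = sum(times_s) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_gb)
    if sxx == 0:
        raise ValueError("micro-run sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_gb, times_s))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_time(size_gb: float, sizes_gb: Sequence[float], times_s: Sequence[float]) -> float:
    """Extrapolate the execution time for a dataset of size_gb."""
    slope, intercept = fit_line(sizes_gb, times_s)
    return slope * size_gb + intercept


# Hypothetical micro-runs on the same machine: sampled sizes in GB, times in seconds.
micro_sizes = [1.0, 2.0, 3.0]
micro_times = [70.0, 130.0, 190.0]
print(estimate_time(50.0, micro_sizes, micro_times))
```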

Page 7:

Execution time estimation (cont.)

2. Scale-up effect
  – Increasing machine power (# vCPUs)
  – Using the same dataset
  – SpotADAPT takes 1 micro-dataset, executes the workload on a few instance types in the family, and estimates the time of the large dataset on all instances

3. Combine
  – Estimate execution time on all machines using the large dataset

[Figure: Wordcount]

Page 8:

SpotADAPT flow

Page 9:

SpotADAPT. Step 1

• Hadoop job
• Data: Bucket in AWS S3
• Optimization goals:
  – Cheapest execution within time boundaries, or
  – Fastest execution within budget boundaries

Page 10:

SpotADAPT. Step 2

• Setup:
  – Prepare data for micro-runs
    – for data size effect estimation
    – for scale-up effect estimation
  – Execute micro-runs for each AWS instance family:
    – On the base instance type, for the data size effect
    – On other instance types using one micro-dataset, for the scale-up effect

Page 11:

SpotADAPT. Step 2 (cont.)

• Execution time estimation:
  – Data size effect: Execution time (slowest instance, large dataset)
  – Scale-up effect: Scale factor (time on slower instance / time on 2x more powerful instance)
  – Combining both: Execution time (slowest instance, large dataset), Execution time (2x instance, large dataset), …, Execution time (most powerful instance, large dataset)

[Figure: AWS instance family: slowest machine (2 vCPUs), more powerful machine (4 vCPUs), … (.. vCPUs), most powerful machine (32 vCPUs)]
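
The combination step can be sketched as follows: start from the estimated time on the slowest instance and divide by each scale factor in turn to move up the family. This is only an illustration of the definition given on the slide (scale factor = time on slower instance / time on the 2x more powerful instance); the function name and the numbers are hypothetical.

```python
from typing import List


def combine(time_slowest_s: float, scale_factors: List[float]) -> List[float]:
    """Estimate large-dataset times for every instance in a family.

    scale_factors[i] is the ratio (time on instance i) / (time on instance i+1),
    where instance 0 is the slowest and each next instance is 2x more powerful.
    Returns the estimated times ordered from slowest to most powerful.
    """
    times = [time_slowest_s]
    for factor in scale_factors:
        if factor <= 0:
            raise ValueError("scale factors must be positive")
        times.append(times[-1] / factor)
    return times


# Hypothetical values: 3000 s on the slowest machine, then two scale-up steps.
print(combine(3000.0, [2.0, 1.5]))
```

A scale factor below 2 at a step means that doubling the machine power gives less than a 2x speed-up at that step.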

Page 12:

SpotADAPT. Step 2 (cont.)

• Execution price estimation
  – For each instance family
    – For each instance type in the family
      – For each region
        – For each availability zone
          – For on-demand
          – For spot (assuming start time = current time)
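
The nested enumeration above might look like the sketch below. It assumes that the execution price is simply the estimated hours multiplied by the hourly rate known at start time, with no rounding to billing units; that formula, the data layout, and all names are assumptions made for illustration. The example rates reuse the c3.large figures from the pricing slide, and the 2-hour execution time is made up.

```python
from typing import Dict, List, NamedTuple, Tuple


class PriceEstimate(NamedTuple):
    instance_type: str
    zone: str
    market: str   # "on-demand" or "spot"
    hours: float  # estimated execution time
    price: float  # estimated execution price in USD


def estimate_prices(
    families: Dict[str, List[str]],
    zones_by_region: Dict[str, List[str]],
    est_hours: Dict[str, float],
    hourly_price: Dict[Tuple[str, str, str], float],
) -> List[PriceEstimate]:
    """Enumerate family -> type -> region -> zone -> market and price each one."""
    estimates = []
    for instance_types in families.values():
        for instance_type in instance_types:
            for zones in zones_by_region.values():
                for zone in zones:
                    for market in ("on-demand", "spot"):
                        rate = hourly_price.get((instance_type, zone, market))
                        if rate is None:
                            continue  # no price known for this combination
                        hours = est_hours[instance_type]
                        estimates.append(
                            PriceEstimate(instance_type, zone, market, hours, hours * rate)
                        )
    return estimates


# Illustrative inputs only.
example = estimate_prices(
    families={"compute optimized": ["c3.large"]},
    zones_by_region={"ap-northeast-1": ["ap-northeast-1a"]},
    est_hours={"c3.large": 2.0},
    hourly_price={
        ("c3.large", "ap-northeast-1a", "on-demand"): 0.128,
        ("c3.large", "ap-northeast-1a", "spot"): 0.0365,
    },
)
for estimate in example:
    print(estimate)
```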

Page 13:

SpotADAPT. Step 3

• Initial deployment
  – Choose the best combination of:
    – AWS region, zone
    – Instance type
    – Pricing model

For fastest execution:
1. Choose the fastest instance
2. Find a deployment whose execution price is within the budget
3. If nothing is found, choose the second fastest instance, repeat

For cheapest execution:
1. Choose the cheapest deployment
2. If the execution time exceeds the deadline, choose the second best deployment, repeat
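
The two selection procedures can be sketched as a sort followed by a scan for the first candidate that satisfies the constraint. The `Deployment` record, the tie-breaking rules (cheaper first among equally fast, faster first among equally cheap), and the sample candidates are assumptions for illustration, not details given in the slides.

```python
from typing import Iterable, NamedTuple, Optional


class Deployment(NamedTuple):
    instance_type: str
    zone: str
    market: str   # "on-demand" or "spot"
    hours: float  # estimated execution time
    price: float  # estimated execution price in USD


def fastest_within_budget(
    candidates: Iterable[Deployment], budget: float
) -> Optional[Deployment]:
    """Try deployments from fastest to slowest; return the first within budget."""
    for d in sorted(candidates, key=lambda d: (d.hours, d.price)):
        if d.price <= budget:
            return d
    return None


def cheapest_within_deadline(
    candidates: Iterable[Deployment], deadline_hours: float
) -> Optional[Deployment]:
    """Try deployments from cheapest to priciest; return the first meeting the deadline."""
    for d in sorted(candidates, key=lambda d: (d.price, d.hours)):
        if d.hours <= deadline_hours:
            return d
    return None


# Hypothetical candidates with made-up times and prices.
candidates = [
    Deployment("c3.large", "zone-a", "spot", 4.0, 0.15),
    Deployment("c3.xlarge", "zone-a", "spot", 2.0, 0.30),
    Deployment("c3.2xlarge", "zone-b", "on-demand", 1.0, 0.60),
]
print(fastest_within_budget(candidates, budget=0.5))
print(cheapest_within_deadline(candidates, deadline_hours=3.0))
```

Both functions return `None` when no candidate satisfies the constraint, which leaves the caller to decide whether to relax the budget or the deadline.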

Page 14:

SpotADAPT. Step 3 (cont.)

• Adaptive (re-)deployment:
  – When the instance is taken back by Amazon (out-of-bid re-deployment)
  – When prices in the current region increase
  – When prices in another region decrease
  – Aligned with optimization goals
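
A re-deployment check for the cheapest-execution goal could be sketched as below. The slides name the triggers but not the decision rule, so the comparison of remaining costs, the `migration_cost` parameter, and the function name are all hypothetical.

```python
def should_redeploy(
    revoked: bool,
    remaining_cost_here: float,
    remaining_cost_best_alternative: float,
    migration_cost: float = 0.0,
) -> bool:
    """Decide whether to move the workload (cheapest-execution goal).

    revoked: True if the instance was taken back (out-of-bid).
    remaining_cost_here: estimated cost to finish on the current deployment.
    remaining_cost_best_alternative: estimated cost to finish on the best
        other deployment, at current prices.
    migration_cost: assumed extra cost of moving (hypothetical parameter).
    """
    if revoked:
        return True  # out-of-bid: the workload must be re-deployed somewhere
    # A price increase here or a price decrease elsewhere both show up as the
    # alternative becoming cheaper than staying.
    return remaining_cost_best_alternative + migration_cost < remaining_cost_here


print(should_redeploy(False, remaining_cost_here=0.40, remaining_cost_best_alternative=0.25))
```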

Page 15:

Simulation

• Workloads
  – Wordcount
  – Selfjoin
• Spot price traces
  – Jan 8, 2015 – April 8, 2015
  – 9 AWS regions, 21 availability zones in total
• Strategies:
  – Fastest execution: SpotADAPT, Oracle time, Fast compute, Fast mem
  – Cheapest execution: SpotADAPT, Oracle time, Oracle time+price, Cheap vCPU
  – Strategy groups labeled on the slide: default, oracle

Page 16:

Results (Fastest execution)

[Figure: Wordcount and Selfjoin results for Budget <= $0.1 and Budget <= $0.5. Annotations: “Default strategies FAIL”; “SpotADAPT == Oracle”]

Page 17:

Results (Cheapest execution)

[Figure: Wordcount and Selfjoin results for Deadline 9.5h, Deadline 6h, and Deadline 1h. Annotations: “SpotADAPT == Oracle”; “Cheap vCPU fails 60% of times”; “SpotADAPT is 0.3% more expensive”]

Page 18:

Results – Adaptive re-deployment

[Figure: Initial deployment; Re-deployment]

Page 19:

Summary

• SpotADAPT estimates time on AWS instances
  – Only a few micro-runs on some instances in the family
• Estimates execution price in AWS regions
  – Using the most recent price is as good as knowing all future prices
• Proposes deployment w/ optimization goals:
  – Fastest execution within budget, or
  – Cheapest execution within time constraints
• Monitors execution
• Proposes re-deployment if:
  – Instance is taken away by AWS
  – Cheaper or faster deployment is available

Page 20:

Thank you!

Page 21:

Future work

• More diverse workloads
• Larger input datasets

Page 22:

Backup slides

• Setup time:
  – For 50GB Wordcount, for 80GB Selfjoin
  – Setup time is ~ 15% of execution time on the slowest machine
• Setup price:
  – Setup price is ~ 50% of execution price on the on-demand market