
Autonomic Mix-Aware Provisioning on Amazon EC2

Tony Schneider, Derek Bender, Andrew Hahn

Based on R. Singh et al. [1]


• Our Goals

• Background

• Implementation

• Results

• Future Work

Introduction



• Implement Singh et al.

• Gain some experience with EC2

• Implement a functional provisioning system

• ...and gain some additional insight into SMCS

Full disclosure: not quite done yet (but close!)

Main Idea



Why?


• Implement Singh et al.

• Deals primarily with workload provisioning

• Recurring idea

• Good entrance to the field

• Promising theory


Why?


• EC2

• Unknown quantity

• Get a sense of its efficacy

• How it works

• What works and what doesn’t

• Cost


Mix-aware provisioning


• Many systems only consider the absolute number of requests

• This ignores the type of requests

• A higher volume of requests does not always mean higher demand

• Mix-aware solves this oversight




• So what is it?

• Find frequencies of different request types

• Long versus short requests

• An increase in long requests requires more resources

• Uses these categories to determine the “true” workload




• (Brief) example:

Short request: 1 ms, long request: 90 ms

100 requests/sec

90 short × 1 ms + 10 long × 90 ms = 990 ms

vs.

10 short × 1 ms + 90 long × 90 ms = 8110 ms
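The comparison above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `total_service_ms` and the dictionary layout are assumptions made for the example, not part of the original system.

```python
def total_service_ms(mix, service_ms):
    """Total service time (ms) demanded by one second's worth of requests."""
    return sum(count * service_ms[kind] for kind, count in mix.items())


# Per-request service times from the slide.
service_ms = {"short": 1, "long": 90}

# Two mixes with the same volume: 100 requests/sec each.
mostly_short = {"short": 90, "long": 10}
mostly_long = {"short": 10, "long": 90}

light = total_service_ms(mostly_short, service_ms)
heavy = total_service_ms(mostly_long, service_ms)
print(light, heavy)
```

Both mixes carry the same request volume, yet the second demands roughly eight times as much service time, which is the gap a volume-only provisioner would miss.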


Amazon EC2


• Cloud computing platform

• Scales with demand and application size

• Applicable to wide array of uses

• Some Jargon:

• AMI (Amazon Machine Image)

• Instance types: Micro, Small, Large, etc.


Amazon EC2


• How it works

• Select an AMI (or make one)

• VMs in EC2 automatically boot with the image

• VMs hosted across several physical machines

• Variety of options for provisioning, load balancing, etc.


K-Means Algorithm


• Used to characterize mixes

• Groups similarly sized workloads

• Clusters used to determine capacity provisioning

• Quick and dirty way to estimate service classes

• Problem: the number of clusters must be chosen before running


K-Means Algorithm


• Step 1:

• Use unique request types to partition the request types into clusters

• Step 2:

• Adjust for tail-heavy workloads

• Step 3:

• Compute actual cluster means

• Step 4:

• Update cluster means
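As a rough illustration of the clustering and mean-update steps, here is a minimal one-dimensional K-means (Lloyd's algorithm) over observed service times. It is a sketch under assumptions, not the project's code: the function name `kmeans_1d`, the deterministic seeding, and the sample data are all hypothetical, and the tail-heavy adjustment of Step 2 is not included.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalar values; returns (centroids, clusters)."""
    points = sorted(values)
    # Assumption: seed centroids at evenly spaced positions in the sorted data
    # so the result is deterministic.
    centroids = [points[(2 * i + 1) * len(points) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: recompute each mean; keep the old centroid if a cluster is empty.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical service times in milliseconds: many short, a few long.
service_times_ms = [1, 1, 2, 1, 2, 88, 90, 92, 1, 91]
centroids, clusters = kmeans_1d(service_times_ms, k=2)
print(centroids)
print(clusters)
```

With this sample the short and long requests separate into two clusters, and the centroids are the per-cluster mean service times.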


Step 1: Find N Clusters


• Problem: the number of clusters must be chosen before running

• Solution: run K-means for every possible K, then pick an optimal value

• Optimal value based on variance

• Can be somewhat slow

• May not be necessary to run for all K < N


Step 1: Find N Clusters


• Variance between clusters:

• Maximized

• Provides “distinctness” between service classes

• Variance within clusters:

• Minimized

• Ensures like elements are grouped in the correct service class

• Assumes that variance is a good model for the service class
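The two variance criteria can be made concrete as follows. This is a sketch only: the slides do not say how the two quantities are combined, so the ratio used in `score` is an assumption, and all function names and data are hypothetical.

```python
from statistics import mean, pvariance


def within_variance(clusters):
    """Size-weighted average of each cluster's internal variance (to minimize)."""
    total = sum(len(c) for c in clusters)
    return sum(len(c) * pvariance(c) for c in clusters) / total


def between_variance(clusters):
    """Size-weighted variance of cluster means around the overall mean (to maximize)."""
    everything = [x for c in clusters for x in c]
    overall = mean(everything)
    return sum(len(c) * (mean(c) - overall) ** 2 for c in clusters) / len(everything)


def score(clusters):
    """Assumed criterion: higher means distinct clusters with tightly grouped members."""
    return between_variance(clusters) / (within_variance(clusters) + 1e-9)


# The same hypothetical service times grouped with K = 1 and K = 2.
one_cluster = [[1, 2, 3, 89, 90, 91]]
two_clusters = [[1, 2, 3], [89, 90, 91]]
print(score(one_cluster), score(two_clusters))
```

A ratio like this tends to keep growing as K increases, so in practice some stopping rule or penalty on K is needed; the slides do not specify which one was used.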


Step 2: Redistribute Loads


• Tail ends of clusters are often quite big

• Caused by many infrequent requests with large service times

• Causes too much change in a cluster mean when frequency changes

• Alleviate naïvely

• Force the kth cluster to have at most n service classes


Step 3: Compute Means


• Use all service times in a cluster

• Not just the unique ones!

• Offers a more accurate model of the cluster mean
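A small example shows why the mean should be taken over every observed service time rather than over the unique values. The data here is hypothetical and chosen only to make the difference visible.

```python
from statistics import mean

# Hypothetical cluster: 95 requests at 1 ms, 4 at 2 ms, and a single 10 ms outlier.
observed_ms = [1] * 95 + [2] * 4 + [10]

# Mean over unique service times only: every distinct value counts once.
unique_mean = mean(set(observed_ms))

# Mean over all observations: frequent values dominate, as they do in the real load.
weighted_mean = mean(observed_ms)

print(unique_mean, weighted_mean)
```

The unique-value mean is pulled toward the rare 10 ms request, while the mean over all observations stays close to the 1 ms service time that most requests actually see.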


Step 4: Recompute Means


• System may begin to drift from its starting state

• Need to recompute means (cluster centroids) on occasion

• Not drastic changes (5%)

• Generally not necessary


K-Means Implementation


• Python

• Runs once at the beginning of the server's life

• Successive runs simply generate the new cluster means

• Number of clusters always stays constant

• Works fairly well (but not perfectly)


Load Balancer


•AWS Elastic Load Balancing

•$0.025 per hour

•$0.008 per GB

•Easy setup

•But not capable of advanced logging


Load Balancer


•EC2 node running HAProxy

•$0.02 per hour (or free!)

•$0.150 per GB (first GB free!)

•Non-trivial configuration

•Advanced logging features

•syslog


HAProxy


•HAProxy chosen for more control

•Configuration

•Log all connections to syslog

•Round-robin DNS balancing

•Leave stats page on for monitoring

•haproxy.cfg can be hot-reloaded


Web Server


•Apache webserver running on all nodes

•phusion_passenger (modrails.com)

•Serves Ruby applications in Apache

•Sinatra (sinatrarb.com)

•Sinatra is a Ruby DSL for quickly creating web applications:

require 'sinatra'

get '/' do
  'Hello world!'
end


Web Server


•Short request:

get '/short' do
  'Hello world!'
end

•Long request:

get '/long' do
  10.times do
    BCrypt::Password.create("foo")
  end
  'done!'
end


Web Server


BCrypt::Password.create("foo")

•BCrypt

•“Basically, it's slow as hell.”

•Password hashing scheme based on the Blowfish cipher

•Hashing ten times takes ~800ms


Integration


• Putting it all together:

• EC2 launches our AMI

• Pre-loaded with HAProxy and the K-means algorithm

• Designated the “master” node

• Waits for 200 requests, then executes K-means to find the typical mix


Integration

• After the cluster centroids are returned:

• Ruby script (front-end) takes over

• Tracks...

• Inter-arrival time

• Request rate

• Number of requests per cluster

• Service times

• Information passed to the provisioning algorithm to calculate the number of servers


Integration

• Data Parsing

•Metrics are difficult to extract

•/haproxy?stats is helpful but doesn’t have everything

•Exports to a nice CSV file!

•syslog is used to log all HAProxy requests, but extensive data parsing is needed

•Acquiring all metrics isn’t instantaneous
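To give a sense of the parsing involved, here is a sketch that pulls the request path and timers out of one HAProxy syslog line. It assumes HAProxy is logging in its HTTP log format (`option httplog`); the sample line follows the format shown in the HAProxy documentation, and the regular expression and the `parse_line` helper are illustrative assumptions, not the project's parser.

```python
import re

# Assumes the HTTP log format: client [accept date] frontend backend/server
# five slash-separated timers, status, bytes, ... "METHOD path HTTP/x.y"
LOG_RE = re.compile(
    r'haproxy\[\d+\]: (?P<client>\S+) \[(?P<accepted>[^\]]+)\] '
    r'(?P<frontend>\S+) (?P<backend>[^/\s]+)/(?P<server>\S+) '
    r'(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/(?P<tt>\+?-?\d+) '
    r'(?P<status>\d{3}) (?P<bytes>\+?\d+) .*"(?P<method>\S+) (?P<path>\S+)'
)


def parse_line(line):
    """Return the fields needed for mix tracking, or None if the line does not match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return {
        "accepted": m["accepted"],     # timestamp, usable for inter-arrival times
        "server": m["server"],
        "path": m["path"],             # request type, e.g. /short or /long
        "status": int(m["status"]),
        "response_ms": int(m["tr"]),   # server response time, a proxy for service time
        "total_ms": int(m["tt"]),
    }


sample = (
    'Feb  6 12:14:14 localhost haproxy[14389]: 10.0.1.2:33317 '
    '[06/Feb/2009:12:14:14.655] http-in static/srv1 10/0/30/69/109 200 2750 '
    '- - ---- 1/1/1/1/0 0/0 {1wt.eu} {} "GET /index.html HTTP/1.1"'
)
record = parse_line(sample)
print(record)
```

Treating the fourth timer as the service time is a simplification; which timer best represents service time depends on the setup.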



Integration

• Ruby integration with AWS

•Fog (Ruby gem, library)

•http://fog.io

•Allows full access to AWS through Ruby


server = AWS.servers.create(:image_id => 'ami-5ee70037')


Provisioning


1. Estimate arrival rate per cluster:

Use the previous sampling period's request rate and percentage of requests

λ[t] = overall request rate (all clusters combined)

P_i[t] = percentage of requests in cluster i

λ_i[t] = P_i[t-1] × λ[t-1]
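A sketch of this estimate, assuming the per-cluster arrival rate is each cluster's share of requests in the previous sampling period multiplied by the overall request rate of that period. The function name and sample numbers are hypothetical.

```python
def estimate_cluster_rates(prev_total_rate, prev_cluster_counts):
    """Estimate each cluster's arrival rate from the previous sampling period.

    prev_total_rate: overall request rate (requests/sec) in the previous period.
    prev_cluster_counts: number of requests seen per cluster in that period.
    """
    total = sum(prev_cluster_counts)
    if total == 0:
        # No traffic observed, so there is nothing to extrapolate from.
        return [0.0] * len(prev_cluster_counts)
    # Cluster share P_i times the overall rate.
    return [prev_total_rate * count / total for count in prev_cluster_counts]


# Previous period: 100 requests/sec overall, 90% short and 10% long.
rates = estimate_cluster_rates(100.0, [90, 10])
print(rates)
```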


Provisioning


2. Predict Capacity

Probabilistically determine waiting time in the queue

σ²_a = inter-arrival time variance

σ²_b = service time variance

λ = request arrival rate

X = average service time

y = SLA guarantee (seconds)

Max rate of requests per server:
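A sketch of how such a rate could be computed from the symbols defined above. It assumes the bound is the G/G/1 queueing result commonly used in the dynamic provisioning literature, λ_max = 1 / (X + (σ²_a + σ²_b) / (2 (y − X))); the slide itself does not confirm this exact form, so treat both the formula and the function name as assumptions.

```python
def max_rate_per_server(mean_service_s, var_interarrival, var_service, sla_s):
    """Largest arrival rate (requests/sec) one server can take while meeting the SLA.

    Assumed G/G/1 bound: 1 / (X + (var_a + var_b) / (2 * (y - X))).
    """
    if sla_s <= mean_service_s:
        raise ValueError("SLA target must exceed the mean service time")
    queueing_term = (var_interarrival + var_service) / (2.0 * (sla_s - mean_service_s))
    return 1.0 / (mean_service_s + queueing_term)


# Hypothetical inputs: 90 ms mean service time and a 0.5 s SLA target.
rate = max_rate_per_server(
    mean_service_s=0.09,
    var_interarrival=0.0004,
    var_service=0.0001,
    sla_s=0.5,
)
print(rate)
```

Under this form, tightening the SLA toward the mean service time or increasing either variance lowers the rate a single server can sustain.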


Provisioning


2.5: How many servers do we need?
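One simple way to answer this, assuming the server count is the predicted arrival rate divided by the maximum rate a single server can sustain, rounded up. The function name and the floor of one server are assumptions for illustration.

```python
import math


def servers_needed(predicted_rate, per_server_max_rate):
    """Number of servers required to absorb the predicted request rate."""
    if per_server_max_rate <= 0:
        raise ValueError("per-server capacity must be positive")
    # Round up: a fractional server still means one more machine.
    # Assumption: always keep at least one server running.
    return max(1, math.ceil(predicted_rate / per_server_max_rate))


# Hypothetical numbers: 100 requests/sec predicted, 11 requests/sec per server.
count = servers_needed(100.0, 11.0)
print(count)
```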


Provisioning


3: Applying this configuration

• Currently adding only a single type of Amazon machine

• Larger machines are more expensive

• Would be more difficult to configure

• Use a basic round-robin assignment system

• EC2 offers its own load balancer


Results

• Unfortunately, nothing concrete yet...

• However:

• Revised k-means works very well

• Converges in fewer than 100 iterations (generally)

• Variance may not be the best predictor


Results

• What’s the hold up?

• Log parsing significantly harder than we anticipated

• Getting the raw data is tedious

• Hard to acquire sensibly

• Robustness proving hard to guarantee

• Integration difficult because architecture was hard to plan in advance

• EC2 free tier: for new customers only


Results

• What’s the hold up?

• The amount of sysadmin footwork is enormous and very tedious

•Constrained by always trying to find the free route

•Amazon AMI is the only free one

•Unfamiliarity with Amazon Linux distro



Results

• EC2

• Thus far, feelings are mixed

• Works well enough

• Easily customizable

• AMIs are readily available for use

• Seems reliable (!)

• However, everything comes with a price



Future Work

Finish?

• Current task: finish provisioning script, integrate all the components

• Test with demands and rates of various sizes

• Different machine sizes?



Future Work

Questions?

