Powerpoint Templates
Autonomic Mix-Aware Provisioning on Amazon EC2
Tony Schneider, Derek Bender, Andrew Hahn
Based on R. Singh et al. [1]
Introduction
• Our Goals
• Background
• Implementation
• Results
• Future Work
Main Idea
• Implement Singh et al.
• Gain some experience with EC2
• Implement a functional provisioning system
• ...And gain some additional insight into SMCS
Full Disclosure: Not quite done yet (but close!)
Why?
• Implement Singh et al.
• Deals primarily with workload provisioning
• Recurring idea
• Good entrance to the field
• Promising theory
Why?
• EC2
• Unknown quantity
• Gain some idea of its efficacy
• How it works
• What works and what doesn’t
• Cost
Mix-aware provisioning
• Many systems consider only the absolute number of requests
• This ignores the type of requests
• Higher request volume ≠ higher demand (not always)
• Mix-aware provisioning addresses this oversight
Mix-aware provisioning
• So what is it?
• Find frequencies of different request types
• Long versus short requests
• An increase in long requests requires more resources
• Uses these categories to determine the “true” workload
Mix-aware provisioning
• (Brief) example:
Short Request: 1ms, Long: 90ms
100 Requests/Sec
90 short * 1ms + 10 long * 90ms = 990ms
vs.
10 short * 1ms + 90 long * 90ms = 8110ms
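The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name `total_demand_ms` is made up for illustration, and the service times are the ones quoted in the example.

```python
SHORT_MS = 1    # service time of a short request, from the example
LONG_MS = 90    # service time of a long request, from the example

def total_demand_ms(n_short: int, n_long: int) -> int:
    """Total service demand (ms) generated by one second of traffic."""
    return n_short * SHORT_MS + n_long * LONG_MS

mostly_short = total_demand_ms(90, 10)   # 100 requests/s, 10% long
mostly_long = total_demand_ms(10, 90)    # 100 requests/s, 90% long
print(mostly_short, mostly_long, round(mostly_long / mostly_short, 1))
```

Both mixes arrive at the same 100 requests per second, yet the second one needs roughly eight times the processing time, which is the gap a request-count-only provisioner would miss.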
Amazon EC2
• Cloud computing platform
• Scales with demand and application size
• Applicable to a wide array of uses
• Some Jargon:
• AMI (Amazon Machine Image)
• Micro, Small, Large, etc. instance types
Amazon EC2
• How it works
• Select an AMI (or make one)
• VMs in EC2 automatically boot with the image
• VMs hosted across several physical machines
• Variety of options for provisioning, load balancing, etc.
K-Means Algorithm
• Used to characterize mixes
• Groups similarly sized workloads
• Clusters used to determine capacity provisioning
• Quick and dirty way to estimate service classes
• Problem: Number of clusters must be chosen before running
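A minimal sketch of plain (Lloyd's) k-means on scalar service times may make this concrete. It is not the authors' revised algorithm; the function name `kmeans_1d`, the seeding strategy, and the sample service times are assumptions for illustration. Note that `k` has to be passed in up front, which is exactly the problem listed above.

```python
import random

def kmeans_1d(values, k, iterations=100, seed=0):
    """Plain k-means on scalar service times. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Seed the centroids with k distinct observed values (needs k <= #unique values).
    centroids = sorted(rng.sample(sorted(set(values)), k))
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:   # converged
            break
        centroids = new_centroids
    return centroids, labels

service_times_ms = [1, 1, 2, 1, 90, 88, 92, 1, 2, 91]
centroids, labels = kmeans_1d(service_times_ms, k=2)
print([round(c, 2) for c in centroids], labels)
```

On this sample the two centroids settle near the "short" and "long" service times, so each cluster can be read as one service class.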
K-Means Algorithm
• Step 1:
• Use unique request types to partition the request types into clusters
• Step 2:
• Adjust for tail-heavy workloads
• Step 3:
• Compute actual cluster means
• Step 4:
• Update cluster means
Step 1: Find N Clusters
• Problem: Number of clusters must be chosen before running
• Solution: Run K-means for every possible K, then determine an optimal value
• Optimal value based on variance
• Can be somewhat slow
• May not be necessary to run for all K < N
Step 1: Find N Clusters
• Variance between clusters:
• Maximized
• Provides “distinctness” between service classes
• Variance within clusters:
• Minimized
• Ensures like elements grouped in correct service class
• Assumes that variance is a good model for the service class
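The two variance terms can be computed directly from a labelled sample. The sketch below is an illustration under assumptions: the slides do not state the exact selection rule, so `pick_k` simply takes the smallest K whose between-cluster variance explains a chosen share (`min_explained`, a hypothetical parameter) of the total variance. The `labelings` argument stands for the labels produced by one k-means run per candidate K.

```python
def variance_split(values, labels):
    """Return (within, between) cluster variance for a labelled sample."""
    n = len(values)
    grand_mean = sum(values) / n
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    within = 0.0
    between = 0.0
    for members in clusters.values():
        mean = sum(members) / len(members)
        within += sum((v - mean) ** 2 for v in members)       # spread inside the cluster
        between += len(members) * (mean - grand_mean) ** 2    # distance from the overall mean
    return within / n, between / n

def pick_k(values, labelings, min_explained=0.95):
    """labelings maps K -> labels; return the smallest K that explains enough variance."""
    for k in sorted(labelings):
        within, between = variance_split(values, labelings[k])
        total = within + between
        if total == 0 or between / total >= min_explained:
            return k
    return max(labelings)

values = [1, 1, 2, 90, 92]
labelings = {1: [0, 0, 0, 0, 0], 2: [0, 0, 0, 1, 1], 3: [0, 0, 1, 2, 2]}
print(pick_k(values, labelings))
```

Because within-cluster and between-cluster variance always sum to the total variance, maximizing one is the same as minimizing the other for a fixed K; the threshold only decides when adding another cluster stops paying off.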
Step 2: Redistribute Loads
• Tail ends of clusters often quite big
• Caused by lots of infrequent but large service times
• Causes too much change in a cluster mean when frequency changes
• Alleviate naïvely
• Force kth cluster to have at most n service classes
Step 3: Compute Means
• Use all service times in a cluster
• Not just the unique ones!
• Offers a more accurate modeling of the cluster mean
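A small sketch of why this matters, using made-up service times for a single cluster (the helper name `cluster_means` is hypothetical): averaging only the unique values gives a rare, very long request the same weight as the common ones.

```python
def cluster_means(observed_times):
    """Return (mean over unique values, mean over all observations)."""
    unique = set(observed_times)
    unique_mean = sum(unique) / len(unique)
    full_mean = sum(observed_times) / len(observed_times)
    return unique_mean, full_mean

# One cluster: four requests at 80 ms, one at 90 ms, one outlier at 400 ms.
unique_mean, full_mean = cluster_means([80, 80, 80, 80, 90, 400])
print(unique_mean, full_mean)
```

The mean over unique values is pulled toward the outlier, while the mean over all observations reflects how often each service time actually occurs.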
Step 4: Recompute Means
• System may begin to drift from its starting state
• Need to recompute means (cluster centroids) on occasion
• Changes are not drastic (5%)
• Generally not necessary
K-Means Implementation
• Python
• Runs once at the beginning of the server life
• Successive runs simply generate the new cluster means
• Number of clusters always stays constant
• Works fairly well (but not perfectly)
Load Balancer
• AWS Elastic Load Balancing
• $0.025 per hour
• $0.008 per GB
• Easy setup
• But not capable of advanced logging
Load Balancer
• EC2 node running HAProxy
• $0.02 per hour (or free!)
• $0.150 per GB (first GB free!)
• Non-trivial configuration
• Advanced logging features
• syslog
HAProxy
• HAProxy chosen for more control
• Configuration
• Log all connections to syslog
• Round-robin DNS balancing
• Leave stats page on for monitoring
• haproxy.cfg can be hot-reloaded
Web Server
• Apache webserver running on all nodes
• phusion_passenger (modrails.com)
• Serve Ruby applications in Apache
• Sinatra (sinatrarb.com)
• Sinatra is a Ruby DSL for quickly creating web applications:
require 'sinatra'
get '/' do
  'Hello world!'
end
Web Server
• Short request:
get '/short' do
  'Hello world!'
end
• Long request:
get '/long' do
  # Hash the same password ten times to simulate CPU-heavy work
  10.times do
    BCrypt::Password.create("foo")
  end
  'done!'
end
Web Server
BCrypt::Password.create("foo")
• BCrypt
• “Basically, it's slow as hell.”
• Password hashing based on the Blowfish cipher
• Hashing ten times takes ~800ms
Integration
• Putting it all together:
• EC2 launches our AMI
• Pre-loaded with HAProxy and the K-means algorithm
• Designated the “master” node
• Waits for 200 requests, then executes K-means to find the typical mix
Integration
• After the cluster centroids are returned:
• Ruby script (front-end) takes over
• Tracks...
• Inter-arrival time
• Request rate
• Number of requests per cluster
• Service times
• Information passed to provisioning algorithm to calculate the number of servers
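The tracked quantities can be derived from a window of request records. The sketch below is written in Python for consistency with the K-means code, although the front-end described here is a Ruby script; the record layout `(arrival_time_s, service_time_ms)` and the name `summarize_window` are assumptions.

```python
from statistics import mean, pvariance

def summarize_window(requests, centroids):
    """requests: at least two (arrival_time_s, service_time_ms) pairs, sorted by arrival."""
    arrivals = [t for t, _ in requests]
    services = [s for _, s in requests]
    gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]   # inter-arrival times
    duration = arrivals[-1] - arrivals[0]
    per_cluster = [0] * len(centroids)
    for s in services:
        # Count each request against the cluster with the nearest centroid.
        nearest = min(range(len(centroids)), key=lambda c: abs(s - centroids[c]))
        per_cluster[nearest] += 1
    return {
        "request_rate": len(gaps) / duration if duration > 0 else 0.0,
        "mean_interarrival": mean(gaps),
        "interarrival_variance": pvariance(gaps),
        "mean_service_ms": mean(services),
        "service_variance": pvariance(services),
        "requests_per_cluster": per_cluster,
    }

window = [(0.0, 1), (0.5, 90), (1.0, 2), (1.5, 88), (2.0, 1)]
stats = summarize_window(window, centroids=[1.3, 90.0])
print(stats["request_rate"], stats["requests_per_cluster"])
```

Dividing each entry of `requests_per_cluster` by the window size gives the per-cluster request fractions used in the provisioning step.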
Integration
• Data Parsing
• Metrics difficult to extract
• /haproxy?stats helpful but doesn’t have everything
• Exports to a nice CSV file!
• syslog used to log all haproxy requests but extensive data parsing needed
• Acquiring all metrics isn’t instantaneous
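As an illustration of the parsing involved, the sketch below pulls the accept time, request path, and server response time out of one HAProxy HTTP log line. It assumes the `option httplog` layout with the `Tq/Tw/Tc/Tr/Tt` timer field; the sample line, backend name, and server name are invented, and a real syslog setup may prepend different fields.

```python
import re
from datetime import datetime

# Assumed layout: ... [accept date] frontend backend/server Tq/Tw/Tc/Tr/Tt status ... "METHOD /path ..."
LOG_RE = re.compile(
    r'\[(?P<accept>[^\]]+)\]\s+'                  # [accept date]
    r'(?P<frontend>\S+)\s+'                       # frontend name
    r'(?P<backend>[^/\s]+)/(?P<server>\S+)\s+'    # backend/server
    r'(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/\+?(?P<tt>-?\d+)\s+'
    r'(?P<status>\d{3})\s+'
    r'.*"(?P<method>\S+)\s+(?P<path>\S+)'         # request line
)

def parse_line(line):
    """Return a dict of per-request metrics, or None if the line does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    return {
        "arrival": datetime.strptime(match["accept"], "%d/%b/%Y:%H:%M:%S.%f"),
        "server": match["server"],
        "path": match["path"],
        "tr_ms": int(match["tr"]),    # time the server took to respond
        "tt_ms": int(match["tt"]),    # total session time
        "status": int(match["status"]),
    }

SAMPLE = ('Feb  6 12:14:14 localhost haproxy[14389]: 10.0.1.2:33317 '
          '[06/Feb/2009:12:14:14.655] http-in web/node1 10/0/30/812/852 200 145 '
          '- - ---- 1/1/1/1/0 0/0 "GET /long HTTP/1.1"')
record = parse_line(SAMPLE)
print(record["path"], record["tr_ms"])
```

Sorting the parsed records by `arrival` yields the inter-arrival times, and `tr_ms` can stand in for the per-request service time.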
Integration
• Ruby integration with AWS
• Fog (Ruby gem, library)
• http://fog.io
• Allows for full access to AWS through Ruby
server = AWS.servers.create(:image_id => 'ami-5ee70037')
Provisioning
1. Estimate arrival rate per cluster:
Use the previous sampling period's request rate and percentage of requests
$\lambda_t$ = request rate of the entire cluster in period $t$
$P_i[t]$ = fraction of requests in cluster $i$ in period $t$
$\lambda_i[t-1] = P_i[t-1] \cdot \lambda_{t-1}$
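A sketch of this estimate, assuming the per-cluster rate is the cluster's share of requests multiplied by the overall request rate of the previous sampling period. The function name `per_cluster_rates` is hypothetical.

```python
def per_cluster_rates(total_rate, fractions):
    """Split the overall request rate across clusters: rate_i = P_i * total_rate."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return [p * total_rate for p in fractions]

# Previous period: 100 requests/s, 90% in the short cluster and 10% in the long one.
rates = per_cluster_rates(100.0, [0.9, 0.1])
print(rates)
```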
Provisioning
2. Predict Capacity
Probabilistically determine waiting time in the queue
$\sigma_a^2$ = inter-arrival time variance
$\sigma_b^2$ = service time variance
$\lambda$ = request arrival rate
$X$ = average service time
$y$ = SLA guarantee (seconds)
Max rate of requests per server:
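One standard way to obtain such a per-server limit from these symbols, assumed here rather than taken from the slides, is Kingman's heavy-traffic bound for a G/G/1 queue. Bounding the mean waiting time $W$ and requiring the response time $X + W$ to stay within the SLA $y$ gives:

```latex
W \le \frac{\lambda\,(\sigma_a^2 + \sigma_b^2)}{2\,(1 - \lambda X)},
\qquad X + W \le y
\;\Longrightarrow\;
\lambda \le \left[\, X + \frac{\sigma_a^2 + \sigma_b^2}{2\,(y - X)} \,\right]^{-1},
\qquad y > X
```

The bound only makes sense when the SLA target exceeds the average service time; with zero variance it reduces to $\lambda \le 1/X$, the rate at which a server is exactly saturated.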
Provisioning
2.5: How many servers do we need?
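A possible calculation, sketched under two assumptions that the transcript does not confirm: the per-server capacity follows the usual G/G/1 heavy-traffic bound built from the symbols defined for capacity prediction, and the server count is the summed fractional load of all clusters, rounded up. All function names and the sample numbers are illustrative.

```python
import math

def max_rate_per_server(mean_service, var_interarrival, var_service, sla):
    """Assumed G/G/1 bound: highest request rate one server sustains within the SLA."""
    if sla <= mean_service:
        raise ValueError("SLA must exceed the mean service time")
    return 1.0 / (mean_service + (var_interarrival + var_service) / (2.0 * (sla - mean_service)))

def servers_needed(cluster_rates, capacities):
    """Sum each cluster's fractional server demand, then round up (at least one server)."""
    load = sum(rate / cap for rate, cap in zip(cluster_rates, capacities))
    return max(1, math.ceil(load))

# Two clusters: short (1 ms) and long (90 ms) requests, 100 ms SLA, variances set to 0.
capacities = [
    max_rate_per_server(0.001, 0.0, 0.0, sla=0.1),
    max_rate_per_server(0.090, 0.0, 0.0, sla=0.1),
]
print(servers_needed([90.0, 10.0], capacities))   # mostly short requests
print(servers_needed([10.0, 90.0], capacities))   # mostly long requests
```

Both calls describe 100 requests per second, but the mostly-long mix needs several times as many servers, which is the effect mix-aware provisioning is meant to capture.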
Provisioning
3: Applying this configuration
• Currently adding only a single type of Amazon machine
• Larger machines more expensive
• Would be more difficult to configure
• Use a basic round robin assignment system
• EC2 offers its own load balancer
Results
• Unfortunately, nothing concrete yet...
• However:
• Revised k-means works very well
• Converges in fewer than 100 iterations (generally)
• Variance may not be the best predictor
Results
• What’s the hold up?
• Log parsing significantly harder than we anticipated
• Getting the raw data is tedious
• Hard to acquire sensibly
• Robustness proving hard to guarantee
• Integration difficult because architecture was hard to plan in advance
• EC2 Free tier - For new customers only
Results
• What’s the hold up?
• The amount of sysadmin footwork is enormous and very tedious
• Constrained by always trying to find the free route
• Amazon AMI is the only free one
• Unfamiliarity with Amazon Linux distro
Results
• EC2
• Thus far, feelings are mixed
• Works well enough
• Easily customizable
• AMIs are readily available for use
• Seems reliable (!)
• However, everything comes with a price
Future Work
Finish?
• Current task: Finish provisioning script, integrate all the components
• Test with demands and rates of various sizes
• Different machine sizes?
Future Work
Questions?