Cloud Probing
marat-zhanikeev -
.
Cloud Platforms (taxonomy)
• Cloud Platforms (Amazon)
◦ raw access at VM level
◦ client decides when and what to migrate
• App Platforms (Heroku)
◦ container level
◦ Heroku packs containers into VMs
◦ user has limited access to migrations
• DIY Platforms (Docker)
◦ container level
◦ manual install at each VM, then automation
◦ Docker is a GitHub for OS images
M.Zhanikeev -- [email protected] -- Cloud Probing -- http://bit.do/150115icm -- 2/24...
.
Cloud Populations
[Figure: two Cloud/DC sites, each hosting several APPs; an APP is either a VM or a container]
• population = service (Heroku, Docker, video streaming 01)
• app can be a VM or a container
• users can be included as e2e QoS 04
01 myself+0 "Multi-Source Stream Aggregation in the Cloud" Book on Advanced Content Delivery, Wiley (2014)
04 myself+0 "A holistic community-based architecture for measuring E2E QoS at DCs" IJCSE (2014)
.
Related Topics
• active probing 03
◦ available bandwidth, bulk transfer, etc.
• delay space and network coordination 07
• Virtual Network Embedding (VNE) 09
• migration cost and energy-efficient clouds
◦ migration schedules and greyboxes 05
• fog computing -- clouds at network edge 08
• BigData Networking -- circuits-over-packets in particular 02
03 1+myself "Active Network Measurement: Theory, Methods, and Tools" ITU (2009)
07 myself+1 "Application of Graph Theory to Clustering in Delay Space" APSITT (2010)
09 J.Lu+1 "Efficient Mapping of Virtual Networks onto a Shared Substrate" Washington Univ. (2006)
05 myself+0 "Optimizing Virtual Machine Migration for Energy-Efficient Clouds" IEICEJ (2014)
08 myself+0 "A Cloud Visitation Platform for Federated Services at Network Edge" 10th CISSE (2014)
02 myself+0 "Circuit Emulation for Big Data Transfers in Clouds" Book on Networking for Big Data, CRC (2015)
.
Cloud Probing is Reversed VNE
• VNE: optimize mapping of many virtual graphs onto one physical topology
◦ problem: feasibility low, complexity very high
◦ unlikely for cloud providers to implement it in the near future
• Cloud Probing: optimize your own population
◦ basically a distributed version of client-side VNE
◦ no need for support from cloud providers -- can use today!
.
Experiment on Amazon (AWS) Cloud
.
Experiment on Amazon (AWS) Cloud
• PlanetLab (legacy) → Amazon Cloud
• 15 VMs across 8 AWS regions
• 5 VMs migrate to random locations every hour
◦ roughly equal distribution is enforced
• each hour: continuous probing in random pairs of VMs
◦ rx/tx direction is emulated as HTTP GET or POST
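The hourly schedule above can be sketched roughly as follows. This is an illustration only, not the experiment's actual code: the region labels are taken from the stress-ring slides, while the function names and the "move to a least-loaded region" balancing rule are assumptions standing in for "roughly equal distribution is enforced".

```python
import random

# Region labels as they appear on the stress-ring slides.
REGIONS = ["california", "ireland", "oregon", "saopaulo",
           "singapore", "sydney", "tokyo", "virginia"]

def initial_placement(n_vms=15):
    """Spread VMs round-robin so that no region holds more than 2 of 15."""
    return {vm: REGIONS[vm % len(REGIONS)] for vm in range(n_vms)}

def migrate_hourly(placement, n_moves=5, rng=random):
    """Move n_moves random VMs; each goes to a least-loaded region (assumed rule)."""
    placement = dict(placement)
    for vm in rng.sample(sorted(placement), n_moves):
        del placement[vm]  # the VM leaves its current region first
        load = {r: 0 for r in REGIONS}
        for region in placement.values():
            load[region] += 1
        lowest = min(load.values())
        placement[vm] = rng.choice([r for r in REGIONS if load[r] == lowest])
    return placement

def random_probe(placement, rng=random):
    """Pick a random VM pair; the HTTP verb stands for the probe direction."""
    a, b = rng.sample(sorted(placement), 2)
    method = rng.choice(["GET", "POST"])
    return placement[a], placement[b], method
```

With this rule every region keeps one or two VMs after each hour, which is one possible reading of "roughly equal".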
.
Experiment : AWS Population
[Figure: two panels, Intra-DC and Inter-DC. x-axis: size (kbytes), 0 to 5000. y-axis: tx,rx throughput as y = log(1 + x in kbps), 0 to 5. Legend: Average, Min/Max, 3 sigma band]
.
Formulations
.
Groping by Probing
• probing: migrate and see what happens
• groping: no way to know whether migration results in better or worse performance
• ... in advanced designs, can use history to assign probabilities → Markov modeling
[Figure: state diagram with states IDLE, BETTER, WORSE and transitions labeled Migrate and Revert]
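A minimal sketch of the migrate/revert loop, assuming a caller-supplied `measure(location)` callback that returns a cost-like metric (lower is better). The callback, the function name `grope`, and the smoothed success-rate weighting are all hypothetical; the weighting is only a crude stand-in for the history-based probabilities mentioned above, not a full Markov model.

```python
import random

def grope(measure, candidates, current, rounds=20, rng=random):
    """Probe by migrating; keep the move if the metric improves, else revert."""
    history = {c: [1, 2] for c in candidates}  # [successes, trials], smoothed
    best = measure(current)
    for _ in range(rounds):
        options = [c for c in candidates if c != current]
        # Weight each candidate by its past success rate (the "history" idea).
        weights = [history[c][0] / history[c][1] for c in options]
        target = rng.choices(options, weights=weights, k=1)[0]
        value = measure(target)          # Migrate: the outcome is unknown in advance
        history[target][1] += 1
        if value < best:                 # BETTER: stay in the new location
            history[target][0] += 1
            current, best = target, value
        # WORSE: Revert, i.e. `current` stays unchanged
    return current, best
```

The loop never accepts a worse location, so the returned metric is at most the starting one.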
.
The Low-Start Model
[Figure: plot of Performance and Cost, with points labeled Stop and New state]
• the low-start concept
• each new improvement comes at higher cost
• stop or new state? ... in practice, a new state is, by nature, more likely
.
Stress Ring Visualization
.
Stress Ring (1) Copy AMI
[Figure: stress ring over california, ireland, oregon, saopaulo, singapore, sydney, virginia; key (copyami), sizes (1 10), parties (ab)]
• stress ring: pressure implodes the balloon
• key: which metric becomes stress
• sizes: kbytes ... small = delay, large = throughput
• parties: AA = intrA-DC, AB = intER-DC
• Copy AMI is the AWS action for moving VM images across DCs
• ... Brazil is very far from Tokyo
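The three ring parameters can be read as a filter over the probing trace. The sketch below assumes a hypothetical record layout `(key, size_kbytes, src, dst, value)` and treats `sizes` as an inclusive range; neither detail is stated on the slide.

```python
from collections import defaultdict
from statistics import mean

def ring_values(records, key, sizes, parties):
    """Select records for one ring and average the metric per region.

    records: iterable of (key, size_kbytes, src, dst, value) tuples,
             a hypothetical layout for the probing trace.
    sizes:   (low, high) inclusive range in kbytes, as in "sizes (1 10)".
    parties: "aa" keeps intra-DC pairs, "ab" keeps inter-DC pairs.
    """
    low, high = sizes
    per_region = defaultdict(list)
    for rec_key, size, src, dst, value in records:
        if rec_key != key or not low <= size <= high:
            continue
        if (src == dst) != (parties == "aa"):
            continue
        per_region[src].append(value)
    return {region: mean(vals) for region, vals in per_region.items()}
```

For example, `ring_values(trace, "probe", (1, 10), "aa")` would select the small intra-DC probes, which the slide associates with delay.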
.
Stress Ring (2) Intra-DC Delay and Bulk
[Figure: stress ring over california, ireland, oregon, saopaulo, singapore, sydney, tokyo, virginia; key (probe), sizes (1 10), parties (aa)]
[Figure: stress ring over the same regions; key (probe), sizes (2000 5000), parties (aa)]
.
Stress Ring (3) Inter-DC Bulk
[Figure: two-ring stress ring over california, ireland, oregon, saopaulo, singapore, sydney, tokyo, virginia; key (probe), sizes (2000 5000), parties (ab)]
• 2-ring version
• outside ring: same as before
• inside ring: the main contributor to stress
• reading: California's throughput is not bad but variance is high and mostly caused by Oregon
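One way to derive the two rings from inter-DC samples is sketched below. The data layout and the use of standard deviation as the stress measure are assumptions chosen to match the "variance is high" reading; the slides do not say how the rings are actually computed.

```python
from statistics import pstdev

def two_ring(samples):
    """Return {region: (stress, main_contributor)} from inter-DC samples.

    samples: {(src, dst): [values]} for src != dst (hypothetical layout).
    Stress here is the spread (population std. dev.) of all values seen by
    a region; this choice of aggregator is an assumption of the sketch.
    """
    rings = {}
    for region in {src for src, _ in samples}:
        peers = {dst: vals for (src, dst), vals in samples.items() if src == region}
        pooled = [v for vals in peers.values() for v in vals]
        outer = pstdev(pooled)                              # outside ring: total stress
        inner = max(peers, key=lambda d: pstdev(peers[d]))  # inside ring: noisiest peer
        rings[region] = (outer, inner)
    return rings
```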
.
Stress Optimization
.
Stress: Graph vs Ring
[Figure: example graph drawing with labels bigdata, gui, api, pmstacks, vms, vmapps, distapps, scrum, ticketdev, mongodb, eclipse, trac, using, making, researching, optimization, migration, visualization, apps, tools]
Cloud Populations ... are mostly rings, almost never graphs
• related topic: graph drawing 10
• rings are easier to draw and understand
• rings are better for management decisions -- which DCs cause the most stress?
• Facebook social graph vs Google circles
10 T.Kamada+1 "An algorithm for drawing general undirected graphs" Information Processing Letters (1989)
.
Stress Optimization
• $v$ (will call it key later) -- an arbitrary performance metric
• DCs/regions are $a$ and $b$, i.e. performance is $v_{ab}$
• same-node (intra-DC) $v_{aa}$ (always $a$) and directional $v_{ab} \neq v_{ba}$
• (even ring is a) graph $G(N, M)$ of $n$ nodes and $m$ links
• collect a set of values $\{v_{ab}\}$ for pairwise $a, b \in G$
• then stress is an aggregate of probing data:
$$S_a = f(v_{aa}, v_{ab}, v_{ac}, \ldots, v_{ax}), \quad \text{where } \{a, b, c, \ldots, x\} \in G, \qquad (1)$$
• ... $f()$ is an arbitrary aggregator function (sum, average, etc.).
• stress optimization is then:
$$\text{minimize} \sum S_x, \quad x \in G, \qquad (2)$$
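Equations (1) and (2) can be illustrated with a brute-force search. The sketch assumes that lower values of v are better (for example, delay) and that the decision is which k regions to occupy; both are assumptions, as are the function names and the default choice of the mean as f().

```python
from itertools import combinations
from statistics import mean

def stress(a, members, v, f=mean):
    """Eq. (1): S_a = f(v_aa, v_ab, ...) over the regions in `members`."""
    return f([v[(a, b)] for b in members])

def optimize(v, regions, k, f=mean):
    """Eq. (2) by brute force: pick k regions minimizing the sum of S_x."""
    def total(members):
        return sum(stress(x, members, v, f) for x in members)
    return min(combinations(sorted(regions), k), key=total)
```

Here `v` maps every ordered pair `(a, b)`, including `(a, a)`, to a value, so the directional case $v_{ab} \neq v_{ba}$ is handled naturally. Exhaustive search is only practical for small populations such as the 8 regions used here.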
.
Analysis: Models
1. Pooler Model (1 ring)
◦ BigData aggregation
◦ 3 VMs, 1 VM collects and stores data from the other 2 VMs
2. Syncer Model (2 rings)
◦ 1st ring: same as Pooler Model, only all-to-all throughput
◦ 2nd ring: e2e delay between users and 3 VMs -- with time zones, etc.
• ... both are trace-based simulations -- the AWS experiment is the trace
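A rough sketch of how the Pooler Model could be evaluated against a throughput trace. It assumes the two transfers run in parallel, so the slower one sets the completion time; the slides do not state this, and the function names and units (kbytes per second) are illustrative.

```python
def pooler_completion(collector, placement, throughput, size_kbytes):
    """Seconds until the collector has received `size_kbytes` from each other VM.

    placement:  {vm: region}
    throughput: {(src_region, dst_region): kbytes per second}, e.g. from a trace
    Transfers are assumed to run in parallel, so the slowest one dominates.
    """
    dst = placement[collector]
    return max(size_kbytes / throughput[(placement[vm], dst)]
               for vm in placement if vm != collector)

def best_collector(placement, throughput, size_kbytes):
    """Choose the VM that minimizes completion time when acting as collector."""
    return min(sorted(placement),
               key=lambda vm: pooler_completion(vm, placement, throughput, size_kbytes))
```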
.
Analysis: Migration
[Figure: two panels, Pooler model and Syncer model]
• Pooler Model is more stable -- certain combinations of DCs are better
• Syncer Model -- less stable because of 2 rings and daytime fluctuations
.
Analysis: Overall
[Figure: two panels over an ordered list of values, comparing "Do nothing" and "Optimize". Pooler Model: completion time (s), 0 to 100. Syncer Model: average delay (log of ms), 2.25 to 3.75]
• stress optimization results in better performance in more than 80% of cases
.
Implementation
.
Implementation: the TopoAPI
[Figure: sequence diagram between a Population and the TopoAPI Service, governed by an API Service Contract (key). Labels: New session ID; ADD(a, b, value) / OK; OPTIMIZE(model) / Graph, Migrations, …; Stats; Read Result; Solve]
• an independent service -- Heroku-based API
• fully abstract (a, b, value) performance tuple
• sessions are up to the client
• generic: stress ring is only one model, others are possible
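To make the ADD/OPTIMIZE exchange concrete, here is an in-memory stand-in for one session. It is not the real TopoAPI: the class name, method names, the `"stress_ring"` model label, and the returned ranking are all hypothetical, and the real service may return graphs and migrations rather than a simple list.

```python
import uuid
from collections import defaultdict
from statistics import mean

class TopoSession:
    """In-memory stand-in for one session; NOT the real TopoAPI."""

    def __init__(self):
        self.session_id = uuid.uuid4().hex      # "New session ID"
        self.values = defaultdict(list)         # (a, b) -> [value, ...]

    def add(self, a, b, value):
        """ADD(a, b, value): store one abstract performance tuple."""
        self.values[(a, b)].append(value)
        return "OK"

    def optimize(self, model="stress_ring"):
        """OPTIMIZE(model): here, rank nodes by average stress, highest first."""
        if model != "stress_ring":
            raise ValueError(f"unknown model: {model}")
        per_node = defaultdict(list)
        for (a, _b), vals in self.values.items():
            per_node[a].extend(vals)
        return sorted(per_node, key=lambda n: mean(per_node[n]), reverse=True)
```

Because the tuple is fully abstract, the same session could hold delays, throughputs, or AMI copy times without changing the interface.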
.
That’s all, thank you ...