Performance Benchmarking with Cloud Workbench (CWB)
Presenters
Joel Scheuner
Philipp Leitner
Chalmers
Performance Matters
Benchmarking IaaS Clouds
Capacity Planning in the Cloud is hard
[Chart: number of EC2 instance types per year, 2006–2018, growing to well over 150]
t2.nano: 0.05–1 vCPU, 0.5 GB RAM, $0.006/h
x1e.32xlarge: 128 vCPUs, 3904 GB RAM, $26.688/h
➡ Impractical to test all instance types
Source: https://aws.amazon.com/blogs/aws/ec2-instance-history/
Capacity Planning in the Cloud is hard
“The instance type itself is a very major tunable parameter”
@brendangregg, re:Invent '17: https://youtu.be/89fYOo1V2pA?t=5m4s
What cloud provider should I choose?
Should I go for many small or few large instances?
General-purpose or *-optimized?
Pay for better IOPS or not?
…
➡ Need for Benchmarking
Basic Cloud Benchmarking Approach
[Diagram: a benchmark manager provisions an instance through the provider API, starts the benchmark, collects the results, and destroys the instance]
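The provision / start benchmark / destroy loop can be sketched in a few lines. The provider object's interface below (provision, start_benchmark, destroy) is hypothetical and duck-typed, standing in for a real provider API:

```ruby
# Basic cloud benchmarking loop: provision a VM, start the benchmark,
# collect the results, and always destroy the instance afterwards.
# The provider interface here is illustrative, not a specific SDK.
def run_benchmark(provider, instance_type, benchmark_cmd)
  instance = provider.provision(instance_type)
  begin
    provider.start_benchmark(instance, benchmark_cmd)  # returns the results
  ensure
    provider.destroy(instance)  # release the VM even if the benchmark fails
  end
end
```

The `ensure` block is the important part: cloud instances bill by time, so the manager must release resources even when a benchmark run crashes.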
Basic Cloud Benchmarking Approach
CCGrid 2017 “An Approach and Case Study of Cloud Instance Type Selection for Multi-Tier Web Applications”
[Diagram: the CWB server (scheduler, Chef server, Vagrant) acquires VMs from an IaaS provider via its provider API and provisions them with Chef clients: a JMeter master (the driver, holding the test plan), several JMeter slaves, and the SUT (the AcmeAir web application with MongoDB). The driver sends requests to the SUT and reports the results back to the CWB server.]
Benchmark Types
Micro benchmarks: generic, artificial, resource-specific
  Target a single resource: CPU, memory, I/O, network
Application benchmarks: specific, real-world, resource-heterogeneous
  Measure overall performance (e.g., response time)
Micro Benchmark Examples
File I/O (4k random read): 1) prepare I/O files, 2) run, 3) extract result, 4) clean up ➡ 3.5793 MiB/sec
Network bandwidth (client/server): result 972 Mbits/sec
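The "extract result" step usually means parsing the tool's text output. A sketch for an iperf-style bandwidth summary line (the line format shown is illustrative):

```ruby
# Parse the bandwidth figure from an iperf-style summary line, e.g.
# "[  3]  0.0-10.0 sec  1.13 GBytes  972 Mbits/sec" (format illustrative).
# Returns the value in Mbits/sec, or nil if the line does not match.
def parse_bandwidth_mbits(line)
  value = line[/([\d.]+)\s*Mbits\/sec/, 1]
  value && value.to_f
end
```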
Application Benchmark Examples
Molecular Dynamics Simulation (MDSim)
WordPress Benchmark (WPBench): multiple short blogging-session scenarios (read, search, comment)
[Plot: number of concurrent threads (0–100) over elapsed time (00:00–08:00 min)]
Cloud Workbench
CloudCom 2014 “Cloud Work Bench - Infrastructure-as-Code Based Cloud Benchmarking”
Tool for scheduling cloud experiments
Code: https://github.com/sealuzh/cloud-workbench
Demo: https://www.youtube.com/watch?v=0yGFGvHvobk
Planned Schedule
My First CWB Benchmark [~30 min]
CWB Architecture and Selected Previous Results [~30 min]
~ Coffee Break ~ 🎉
Building and Running a Benchmark from Ground Up [~90 min]
Wrap-Up and Outlook [5 min]
My First CWB Benchmark Interactive Session
Benchmarking with CWB
1. Write benchmark config to set up the environment (optional for simple benchmarks)
2. Declare IaaS resources and parametrize the benchmark config
3. Trigger execution or define a periodic schedule
4. Download metrics as a CSV file
5. Analyze results
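The execution hook written in step 1 runs the benchmark and reports metrics back. The following is a hypothetical sketch of such a hook; class and method names are illustrative, not the exact cwb-client API (the benchmark invocation is stubbed so the sketch stays self-contained):

```ruby
# Hypothetical CWB execution hook: run the benchmark, parse its output,
# and submit the metric through an injected client object (anything
# responding to submit_metric(name, value)).
class FileIoBenchmark
  def initialize(client)
    @client = client
  end

  def execute
    output = run_benchmark
    mib_per_sec = output[/([\d.]+)\s*MiB\/sec/, 1].to_f
    @client.submit_metric("fileio/4k-rand-read", mib_per_sec)
    mib_per_sec
  end

  private

  # A real hook would shell out here (e.g. to sysbench); stubbed with a
  # canned output line for illustration.
  def run_benchmark
    "4096.00 KiB transferred (3.5793 MiB/sec)"
  end
end
```

Injecting the client keeps the hook testable without a running CWB server.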
1. Write benchmark config: set up the environment and write the CWB execution hook
2. Declare IaaS resources and parametrize the benchmark config
3. Trigger or schedule execution
Chalmers !21
4. Download metrics as CSV file
Chalmers !22
CWB Architecture
Chalmers !23
[Diagram: the experimenter uploads configurations to the provisioning service via REST]
[Diagram: the CWB server adds a web interface, business logic with a scheduler, and a relational database; the experimenter uploads configurations to the provisioning service and accesses the web interface via REST]
[Diagram: a provider plugin in the CWB server manages VMs on the IaaS providers through their provider APIs]
[Diagram: the provisioning service provisions cloud VMs and executes commands over SSH; the VMs fetch their configuration and host the benchmark execution environment]
[Diagram: a CWB client library on each cloud VM notifies execution state and submits metrics back to the CWB server via REST]
[Diagram: complete architecture. The experimenter uploads configurations and accesses the web interface via REST. The CWB server (web interface, business logic with scheduler, relational database, provider plugin) manages VMs on IaaS providers through their provider APIs. The provisioning service provisions cloud VMs and executes commands over SSH; each VM fetches its configuration, runs the benchmark in its execution environment, and reports state and metrics through the CWB client library via REST.]
Vagrant: Ruby DSL for defining infrastructure (mostly VMs)
Chef: Ruby DSL for configuring machines
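The infrastructure DSL is the standard Vagrantfile format. A minimal sketch of what an experiment's VM definition might look like; the box, AMI, region, and recipe values are placeholders, and the vagrant-aws plugin is assumed:

```ruby
# Illustrative Vagrantfile: one benchmark VM on AWS (vagrant-aws plugin),
# configured via Chef. All concrete values below are placeholders.
Vagrant.configure("2") do |config|
  config.vm.define "benchmark-vm" do |node|
    node.vm.box = "dummy"                      # vagrant-aws uses a dummy box
    node.vm.provider :aws do |aws, override|
      aws.instance_type = "m3.large"
      aws.region        = "eu-west-1"
      aws.ami           = "ami-12345678"       # placeholder AMI id
      override.ssh.username = "ubuntu"
    end
    # Chef applies the benchmark cookbook once the VM is up
    node.vm.provision :chef_client do |chef|
      chef.add_recipe "sysbench"
    end
  end
end
```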
Benchmark Execution Lifecycle
1. Trigger execution (experimenter/scheduler ➡ CWB server)
2. Acquire resources (CWB server ➡ provider API)
3. Provision the cloud VM (provisioning service): fetch and apply VM configurations
4. Start benchmark run: run the benchmark on the VM
5. Notify benchmark completed; postprocess results
6. Notify postprocessing completed; release resources
7. Submit metric(s)
Selected Previous Results
Example Study 1 - Performance Testing of the Cloud
Study setup: benchmarked 22 cloud configurations using 5 benchmarks
Two types of experiments:
  Isolated: 300–500 repetitions
  Continuous: 15 repetitions per configuration
TOIT 2016 “Patterns in the Chaos - A Study of Performance Variation and Predictability in Public IaaS Clouds”
Results Summary
Observed CPU Models
(for m1.small and Azure Small in North America)
Impact of Different Days / Times
(for m3.large in Europe)
[Figure: small multiples of I/O bandwidth [Mb/s] (0–50) by time of day (00:00–20:00) and day of the week (Mon–Sun), one panel per measured configuration]
Recent Results
(unpublished data)
(2015 vs. Feb 2019)
Instance Runtime
(unpublished data)
(2015 vs. Feb 2019)
[Plot: benchmark value (3.5–5.5) over benchmark runtime (0–60 h), continuous I/O benchmark on Azure D2s]
Example Study 2 - Estimating Application Performance from Microbenchmarks
Research question: How accurately can we predict application performance from system-level microbenchmarks?
Study setup:
  2 applications (WordPress, Molecular Dynamics Simulation)
  23 microbenchmarks
  Executed in AWS (11 instance types)
  Linear regression for prediction
Joel Scheuner, Philipp Leitner (2018). Estimating Cloud Application Performance Based on Micro-Benchmark Profiling. IEEE CLOUD.
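The prediction step can be illustrated with ordinary least squares for a single predictor; this is only a sketch of the idea (the study evaluates real microbenchmark/application pairs, and the toy data below is made up):

```ruby
# Fit y = a + b*x by ordinary least squares: a one-predictor sketch of
# predicting application performance (y, e.g. WPBench response time)
# from a microbenchmark score (x).
def least_squares(xs, ys)
  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  b = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) } /
      xs.sum { |x| (x - mean_x)**2 }
  a = mean_y - b * mean_x
  [a, b]
end

a, b = least_squares([1, 2, 3, 4], [3, 5, 7, 9])  # perfectly linear toy data
predict = ->(x) { a + b * x }
```

Training on scores from a few instance types and predicting the rest is exactly the train/test split shown in the results plot.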
CPU: sysbench/cpu-single-thread, sysbench/cpu-multi-thread, stressng/cpu-callfunc, stressng/cpu-double, stressng/cpu-euler, stressng/cpu-fft, stressng/cpu-fibonacci, stressng/cpu-int64, stressng/cpu-loop, stressng/cpu-matrixprod
Memory: sysbench/memory-4k-block-size, sysbench/memory-1m-block-size
I/O: [file I/O] sysbench/fileio-1m-seq-write, [file I/O] sysbench/fileio-4k-rand-read, [disk I/O] fio/4k-seq-write, [disk I/O] fio/8k-rand-read
Network: iperf/single-thread-bandwidth, iperf/multi-thread-bandwidth, stressng/network-epoll, stressng/network-icmp, stressng/network-sockfd, stressng/network-udp
Broad resource coverage and specific resource testing
Example Results: Predicting WordPress
[Scatter plot: WPBench read response time [ms] (0–2000) vs. Sysbench CPU multi-thread duration [s] (25–100), by instance type (m1.small, m3.medium (pv), m3.medium (hvm), m1.medium, m3.large, m1.large, c3.large, m4.large, c4.large, c3.xlarge, c4.xlarge, c1.xlarge), split into train and test groups]
Example Study 3 - Software Performance Testing in the Cloud
Research question: How small performance regressions can we find?
Study setup:
  19 software performance tests executed in different environments
  4 open source projects in Java and Go
  Executed in AWS, Azure, Google
  Baseline: bare-metal server in Softlayer / Bluemix
Christoph Laaber, Joel Scheuner, Philipp Leitner (2019). Software Microbenchmarking in the Cloud. How Bad is it Really? Empirical Software Engineering (EMSE). To appear.
Background - JMH Microbenchmarks
MSR’18. An Evaluation of Open-Source Software Microbenchmark Suites for Continuous Performance Assessment.
Summary - Variability of Software Benchmark Results
Sources of Variability
[Figure: RSD of benchmark results, broken down per trial, per instance, and in total, for AWS CPU / etcd-2 (0–6), Azure Std / etcd-2 (0–100), AWS CPU / log4j2-5 (0–75), and GCE Mem / etcd-4 (0–30)]
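The variability measure in these plots, relative standard deviation (RSD, also called coefficient of variation), is straightforward to compute. A sketch using the sample standard deviation (the paper's exact estimator may differ):

```ruby
# Relative standard deviation in percent: sample standard deviation
# divided by the mean. Scale-free, so results across benchmarks with
# different units are comparable.
def rsd(values)
  n = values.size.to_f
  mean = values.sum / n
  variance = values.sum { |v| (v - mean)**2 } / (n - 1)  # sample variance
  100.0 * Math.sqrt(variance) / mean
end
```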
Example Study 4 - Credit-based Bursting Instance Types
Research questions:
  How do t2 instance types perform in terms of CPU and I/O speed compared to other instance types?
  When are t2 bursting instance types more cost-efficient than other instance types?
  How do t2 instance types compare to the previous generation (t1)?
Study setup:
  Sysbench CPU and I/O benchmarks
  Executed in AWS
  3 bursting vs. non-bursting instance types
Philipp Leitner, Joel Scheuner (2015). Bursting With Possibilities – an Empirical Study of Credit-Based Bursting Cloud Instance Types (UCC).
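The cost-efficiency question boils down to performance per dollar, with performance normalized to a reference instance ("medium-instance equivalents", as in the study). A sketch of that comparison; all prices and scores below are illustrative placeholders, not measured data:

```ruby
# Cost-efficiency sketch: medium-instance equivalents per dollar-hour.
# The prices and equivalence scores are made-up placeholders.
Instance = Struct.new(:name, :price_per_hour, :medium_equivalents) do
  def equivalents_per_dollar
    medium_equivalents / price_per_hour
  end
end

candidates = [
  Instance.new("t2.micro (peak)", 0.013, 2.0),
  Instance.new("m3.medium",       0.070, 1.0),
  Instance.new("c4.large",        0.116, 3.0),
]
best = candidates.max_by(&:equivalents_per_dollar)
```

Note that for bursting types the score depends on whether the instance runs at peak (with CPU credits) or at baseline, which is why the study reports both.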
Example Study 4 - Credit-based Bursting Instance Types
[Figures: execution time [s] and CPU time [%] (user / steal / idle) over the experiment duration (18:10–20:30), showing distinct peak and baseline phases; bar charts of medium-instance equivalents (0–4) for t2.micro (peak and base), m3.medium, m3.large, c4.large, and t1.micro (peak)]
Burstable t2.* instances are highly predictable, unlike the previous-generation t1.*
Building and Running a Benchmark from Ground Up Interactive Session
http://bit.ly/cwb-tutorial
Wrap-Up and Outlook
Three Steps Towards a Benchmark
(Optional) Step 1: Write Chef Cookbook
Three Steps Towards a Benchmark
Step II: Define IaaS config and schedule
Three Steps Towards a Benchmark
Step III: Execute and download result CSV
Cloud Workbench
CloudCom 2014 “Cloud Work Bench - Infrastructure-as-Code Based Cloud Benchmarking”
Tool for scheduling cloud experiments
Code: https://github.com/sealuzh/cloud-workbench
Demo: https://www.youtube.com/watch?v=0yGFGvHvobk