PERFORMANCE BENCHMARKING OF CLOUDS: EVALUATING OPENSTACK
Pradeep Kumar Surisetty
#WHOAMI
Pradeep Kumar Surisetty
Associate Engineering Manager
Performance and Scale Engineering, Red Hat
[email protected] in Open source
Collaborate or Die
RED HAT PERFORMANCE & SCALE TEAM
TOPICS
CLOUD CHARACTERISTICS
PERFORMANCE MEASURING TOOLS
SPEC CLOUD IAAS 2016 BENCHMARK
PERFORMANCE MONITORING TOOLS
TUNING TIPS
CLOUD CHARACTERISTICS
SPEC RESEARCH GROUP - CLOUD WORKING GROUP
https://research.spec.org/working-groups/rg-cloud-working-group.html
READY FOR RAIN? A VIEW FROM SPEC RESEARCH ON THE FUTURE OF CLOUD METRICS
https://research.spec.org/fileadmin/user_upload/documents/rg_cloud/endorsed_publications/SPEC-RG-2016-01_CloudMetrics.pdf
ELASTICITY
- THE DEGREE TO WHICH A SYSTEM IS ABLE TO ADAPT TO WORKLOAD CHANGES BY PROVISIONING AND DE-PROVISIONING RESOURCES IN AN AUTONOMIC MANNER, SUCH THAT AT EACH POINT IN TIME THE AVAILABLE RESOURCES MATCH THE CURRENT DEMAND AS CLOSELY AS POSSIBLE
Source: READY FOR RAIN? A VIEW FROM SPEC RESEARCH ON THE FUTURE OF CLOUD METRICS, SPEC RG Cloud Working Group
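To make the definition above concrete, here is a minimal sketch that compares a demand series with a supply series and reports the average shortfall and the average excess per sample. It is a toy illustration only, not one of the elasticity metrics defined by the SPEC RG Cloud Working Group; the function name and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def provisioning_mismatch(
    demand: Sequence[float], supply: Sequence[float]
) -> Tuple[float, float]:
    """Return (average under-provisioning, average over-provisioning) per sample."""
    if len(demand) != len(supply) or len(demand) == 0:
        raise ValueError("demand and supply must be non-empty and equally long")
    n = len(demand)
    # Shortfall: demand exceeds what was provisioned at that time step.
    under = sum(max(d - s, 0.0) for d, s in zip(demand, supply)) / n
    # Excess: more was provisioned than was demanded at that time step.
    over = sum(max(s - d, 0.0) for d, s in zip(demand, supply)) / n
    return under, over


# Hypothetical samples: resource units demanded vs. provisioned per time step.
demand = [2, 4, 8, 8, 4, 2]
supply = [2, 2, 6, 8, 8, 4]
under, over = provisioning_mismatch(demand, supply)
```

A perfectly elastic system in this toy model would return (0.0, 0.0): supply tracks demand at every sample.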
Source: http://content.time.com/time/specials/packages/article/0,28804,2049243_2048657_2049165,00.html
ELASTICITY
Source: http://www.today.com/news/remember-stretch-armstrong-how-buy-your-favorite-retro-toys-your-1D80377927
HOW FAR WILL HE STRETCH?
AS YOU STRETCH HIM DOES IT GET HARDER TO STRETCH HIM MORE?
WHEN I LET GO DOES HE RETURN TO HIS ORIGINAL SHAPE?
WILL HE BREAK WHEN STRETCHED?
HOW LONG DOES HE TAKE TO RETURN TO HIS NORMAL SHAPE?
SCALABILITY
- THE ABILITY OF THE SYSTEM TO SUSTAIN INCREASING WORKLOADS BY MAKING USE OF ADDITIONAL RESOURCES, AND THEREFORE, IN CONTRAST TO ELASTICITY, IT IS NOT DIRECTLY RELATED TO HOW WELL THE ACTUAL RESOURCE DEMANDS ARE MATCHED BY THE PROVISIONED RESOURCES AT ANY POINT IN TIME.
Source: READY FOR RAIN? A VIEW FROM SPEC RESEARCH ON THE FUTURE OF CLOUD METRICS, SPEC RG Cloud Working Group
PERFORMANCE MEASURING TOOLS
RALLY
RALLY IS A FAMILIAR OPENSTACK PROJECT
HTTPS://GITHUB.COM/OPENSTACK/RALLY
AN AUTOMATED BENCHMARK TOOL FOR OPENSTACK
BENCHMARKING
MULTIPLE USE CASES
- DEVELOPMENT AND QA
- DEVOPS
- CI/CD
RALLY
Source: https://github.com/OpenStack/rally/blob/master/doc/source/images/Rally-Actions.png
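As an illustration of what a Rally control plane test looks like, the sketch below builds a task definition in Python and serializes it to JSON. The scenario name and the args/runner/context layout are modeled on Rally's sample tasks and may differ between Rally versions; the flavor name, image name, and iteration counts are placeholders, not values from this talk.

```python
import json

# Assumption: layout modeled on Rally's sample task files.
# Flavor and image names are placeholders; use ones that exist in your cloud.
task = {
    "NovaServers.boot_and_delete_server": [
        {
            "args": {
                "flavor": {"name": "m1.tiny"},
                "image": {"name": "cirros"},
            },
            # Run the scenario 10 times, 2 iterations in parallel.
            "runner": {"type": "constant", "times": 10, "concurrency": 2},
            # Temporary tenants and users created for the test.
            "context": {"users": {"tenants": 1, "users_per_tenant": 1}},
        }
    ]
}


def render_task(task_definition: dict) -> str:
    """Serialize a task definition to pretty-printed JSON."""
    return json.dumps(task_definition, indent=2, sort_keys=True)


task_json = render_task(task)
```

The resulting JSON could be saved to a file and passed to `rally task start`.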
BROWBEAT
SCALE AND PERFORMANCE AUTOMATION
ANSIBLE PLAYBOOKS FOR AUTOMATION
PROVIDES AUTOMATION WRAPPER AROUND EXISTING TOOLING
RALLY - CONTROL PLANE TESTS
SHAKER - DATA PLANE NETWORK TESTS
PERFKIT - DATA PLANE TESTS
CBTOOL - DATA PLANE TESTS
LEVERAGES EXISTING UPSTREAM TEST FRAMEWORKS RATHER THAN REPLACING THEM
PERFORMANCE MONITORING
COLLECTD/GRAPHITE/GRAFANA
RESULTS CAPTURE AND STORAGE
ELK STACK
ALLOWS FOR ELASTICSEARCH RESULTS COMPARISON
CAPTURES METADATA LIKE NUMBER OF API WORKERS, NEUTRON CONFIGURATION, ETC.
BROWBEAT WEB PRESENCE
LOTS OF GREAT INFORMATION ABOUT BROWBEAT
INSTALLING GRAFANA AND GRAPHITE-WEB + CARBON-CACHE AS DOCKER IMAGES
BROWBEAT IS NOW AN OPENSTACK PROJECT
BROWBEAT HAS NOW MOVED TO THE OPENSTACK.ORG NAMESPACE
NOW ABLE TO USE THE UPSTREAM OPENSTACK INFRASTRUCTURE AND CI
SEEING INTEREST PICK UP
BROWBEATPROJECT.ORG
HTTPS://GITHUB.COM/OPENSTACK/BROWBEAT
BROWBEAT
Browbeat can install and configure:
- all of our workloads
- ELK (or Elasticsearch, Fluentd, and Kibana)
- collectd on the undercloud/overcloud
- Graphite and Grafana
- OpenStack-specific Grafana dashboards, pushed to Grafana based on your deployment
BROWBEAT
REPEATABLE AUTOMATED TESTING
PERFKIT BENCHMARKER
Source: Introduction to Perfkit Benchmark and How to Extend it, https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/wiki/Tech-Talks
CLOUDBENCH
FRAMEWORK THAT AUTOMATES CLOUD-SCALE EVALUATION AND BENCHMARKING
BENCHMARK HARNESS
- REQUESTS THE CLOUD MANAGER TO CREATE ONE OR MORE INSTANCES
- SUBMITS THE CONFIGURATION PLAN AND STEPS TO THE CLOUD MANAGER ON HOW THE TEST WILL BE PERFORMED
- AT THE END OF THE TEST, COLLECTS AND LOGS APPLICABLE PERFORMANCE DATA AND LOGS
- DESTROYS INSTANCES NO LONGER NEEDED FOR THE TEST
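The harness steps above (create, submit plan, collect, destroy) can be sketched as a small lifecycle function. This is not CBTOOL's API: `FakeCloudManager` and all of its methods are hypothetical in-memory stand-ins, used only to show the order of operations and that cleanup happens even if a test fails.

```python
from typing import Dict, List


class FakeCloudManager:
    """Hypothetical in-memory stand-in for a cloud manager (not a real API)."""

    def __init__(self) -> None:
        self.instances: List[str] = []
        self.plans: List[Dict[str, str]] = []

    def create_instances(self, count: int) -> List[str]:
        start = len(self.instances)
        new = [f"instance-{start + i}" for i in range(count)]
        self.instances.extend(new)
        return new

    def submit_plan(self, plan: Dict[str, str]) -> None:
        self.plans.append(plan)

    def collect_results(self, ids: List[str]) -> Dict[str, Dict[str, str]]:
        # A real harness would gather performance data and logs here.
        return {instance_id: {"status": "done"} for instance_id in ids}

    def destroy_instances(self, ids: List[str]) -> None:
        self.instances = [i for i in self.instances if i not in ids]


def run_test(manager: FakeCloudManager, count: int, plan: Dict[str, str]):
    """Create instances, submit the plan, collect results, then clean up."""
    ids = manager.create_instances(count)
    try:
        manager.submit_plan(plan)
        return manager.collect_results(ids)
    finally:
        # Destroy instances that are no longer needed, even on failure.
        manager.destroy_instances(ids)


manager = FakeCloudManager()
results = run_test(manager, count=2, plan={"workload": "ycsb"})
```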
BENCHMARK HARNESS
HARNESS AND WORKLOAD CONTROL
Benchmark harness: comprises CloudBench (CBTOOL), baseline/elasticity drivers, and report generators.
For white-box clouds, the benchmark harness is outside the SUT. For black-box clouds, it can be in the same location or campus.
Cloud SUT: each group of boxes represents an application instance.
SUPPORTED WORKLOADS
SPEC CLOUD IAAS 2016 BENCHMARK
MEASURES PERFORMANCE OF INFRASTRUCTURE-AS-A-SERVICE (IAAS) CLOUDS.
MEASURES BOTH CONTROL AND DATA PLANE
- CONTROL: MANAGEMENT OPERATIONS, E.G., INSTANCE PROVISIONING TIME
- DATA: VIRTUALIZATION, NETWORK PERFORMANCE, RUNTIME PERFORMANCE
USES WORKLOADS THAT RESEMBLE “REAL” CUSTOMER APPLICATIONS
BENCHMARKS THE CLOUD, NOT THE APPLICATION
PRODUCES METRICS (“ELASTICITY”, “SCALABILITY”, “PROVISIONING TIME”) WHICH ALLOW COMPARISON.
SPEC CLOUD IAAS BENCHMARKING: DELL LEADS THE WAY
HTTP://EN.COMMUNITY.DELL.COM/TECHCENTER/CLOUD/B/DELL-CLOUD-BLOG/ARCHIVE/2016/06/24/SPEC-CLOUD-IAAS-BENCHMARKING-DELL-LEADS-THE-WAY
SPEC CLOUD WORKLOADS
YCSB
- FRAMEWORK AND COMMON SET OF WORKLOADS FOR EVALUATING THE PERFORMANCE OF DIFFERENT KEY-VALUE AND CLOUD SERVING STORES.
KMEANS
- HADOOP-BASED CPU INTENSIVE WORKLOAD
- CHOSE INTEL HIBENCH IMPLEMENTATION
WHAT IS MEASURED
MEASURES THE NUMBER OF APPLICATION INSTANCES (AIS) THAT CAN BE LOADED ONTO A CLUSTER BEFORE SLA VIOLATIONS OCCUR
MEASURES THE SCALABILITY AND ELASTICITY OF THE CLOUD UNDER TEST (CUT)
NOT A MEASURE OF INSTANCE DENSITY
SPEC CLOUD WORKLOADS CAN INDIVIDUALLY BE USED TO STRESS THE CUT:
- KMEANS - CPU/MEMORY
- YCSB - IO
BENCHMARK STOPPING CONDITIONS
- 20% OF AIS FAIL TO PROVISION
- 10% OF AIS HAVE ERRORS IN ANY RUN
- MAX NUMBER OF AIS SET BY CLOUD PROVIDER IS REACHED
- 50% OF AIS HAVE QOS VIOLATIONS, WHERE QOS REQUIRES:
  - KMEANS COMPLETION TIME ≤ 3.33 X BASELINE PHASE
  - YCSB THROUGHPUT ≥ BASELINE THROUGHPUT / 3
  - YCSB READ RESPONSE TIME ≤ 20 X BASELINE READ RESPONSE TIME
  - YCSB INSERT RESPONSE TIME ≤ 20 X BASELINE INSERT RESPONSE TIME
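The stopping conditions and QoS thresholds above can be expressed as a few small checks. This is a sketch, not the benchmark's own implementation: the class and function names are hypothetical, and it assumes each percentage means "at or above" that share of all AIs, which the slide does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AIResult:
    """Outcome of one application instance (hypothetical record layout)."""
    provisioned: bool
    had_error: bool = False
    qos_violation: bool = False


def kmeans_qos_ok(completion_time: float, baseline_time: float) -> bool:
    return completion_time <= 3.33 * baseline_time


def ycsb_qos_ok(throughput: float, read_rt: float, insert_rt: float,
                base_throughput: float, base_read_rt: float,
                base_insert_rt: float) -> bool:
    return (throughput >= base_throughput / 3
            and read_rt <= 20 * base_read_rt
            and insert_rt <= 20 * base_insert_rt)


def stopping_reason(results: List[AIResult], max_ais: int) -> Optional[str]:
    """Return the first stopping condition that applies, or None to continue."""
    total = len(results)
    if total == 0:
        return None
    failed = sum(not r.provisioned for r in results)
    errors = sum(r.had_error for r in results)
    violations = sum(r.qos_violation for r in results)
    # Assumption: thresholds are inclusive; integer math avoids float rounding.
    if failed * 100 >= 20 * total:
        return "provisioning failures"
    if errors * 100 >= 10 * total:
        return "run errors"
    if violations * 100 >= 50 * total:
        return "QoS violations"
    if total >= max_ais:
        return "maximum AIs reached"
    return None
```

For example, one failed provision out of five AIs is 20% and stops the run under the inclusive-threshold assumption.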
HIGH LEVEL REPORT SUMMARY
RESULTS COMPARED
PUBLISHED RESULTS WEBSITE
https://www.spec.org/cloud_iaas2016/results/cloudiaas2016.html
PERFORMANCE MONITORING TOOLS
CEILOMETER
ANOTHER FAMILIAR OPENSTACK PROJECT
GOAL IS TO EFFICIENTLY COLLECT, NORMALIZE AND TRANSFORM DATA PRODUCED BY OPENSTACK SERVICES
INTERACTS DIRECTLY WITH THE OPENSTACK SERVICES THROUGH DEFINED INTERFACES
MANY TOOLS UTILIZE CEILOMETER TO GATHER OPENSTACK PERFORMANCE DATA
HTTPS://GITHUB.COM/OPENSTACK/CEILOMETER
CEILOMETER
Source: http://docs.OpenStack.org/developer/ceilometer/architecture.html
COLLECTD/GRAPHITE/GRAFANA
COLLECTD
- DAEMON TO COLLECT SYSTEM PERFORMANCE STATISTICS
- CPU, MEMORY, DISK, NETWORK, PER PROCESS STATS (REGEX), POSTGRESQL AND MORE
GRAPHITE/CARBON
- CARBON RECEIVES METRICS, AND FLUSHES THEM TO WHISPER DATABASE FILES
- GRAPHITE IS WEBAPP FRONTEND TO CARBON
GRAFANA
- VISUALIZE METRICS FROM MULTIPLE BACKENDS.
- DASHBOARDS SAVED IN JSON AND CUSTOMIZED BY ANSIBLE DURING DEPLOYMENT
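To show how a metric reaches Carbon, here is a minimal sketch using Carbon's plaintext protocol, where each sample is one line of the form `<metric path> <value> <timestamp>`. The metric path, host, and helper names are hypothetical; port 2003 is assumed as the plaintext listener port and may be configured differently in a given deployment. In the setup described above, collectd would normally do this sending for you.

```python
import socket
import time
from typing import Optional


def carbon_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one sample in Carbon's plaintext format."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"{path} {value} {ts}\n"


def send_metric(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one formatted line to a Carbon plaintext listener (assumed port)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric path; a fixed timestamp keeps the example reproducible.
line = carbon_line("overcloud.controller0.cpu.idle", 87.5, timestamp=1500000000)
```

Calling `send_metric(line)` against a running Carbon instance would store the sample in the matching Whisper file.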
COLLECTD/GRAPHITE/GRAFANA
Example Grafana dashboards
GANGLIA
SCALABLE DISTRIBUTED MONITORING SYSTEM FOR HIGH-PERFORMANCE COMPUTING
WIDELY USED IN UNIVERSITIES, PRIVATE AND GOVERNMENT LABORATORIES.
GREAT TOOL FOR MONITORING HARDWARE COMPONENT UTILIZATION AND GATHERING STATS.
TUNING TIPS
HARDWARE/OS TUNING
- Latest BIOS and firmware revs
- Appropriate BIOS settings
- RAID/JBOD
- Disk controller
- NIC driver: interrupt coalescing and affinitization
- NIC bonding
- NIC jumbo frames
- OS configuration settings
INSTANCE CONFIGURATION
Performance is impacted by:
- Instance type (flavor)
- Number of instances
OVER-SUBSCRIPTION
Beware of over-subscription!
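One simple way to keep an eye on over-subscription is to compare the vCPUs allocated to instances on a host with the physical cores available. The sketch below is only that arithmetic; the function name and the numbers are hypothetical, and what ratio is acceptable depends on the workload.

```python
def cpu_oversubscription_ratio(vcpus_allocated: int, physical_cores: int) -> float:
    """Ratio of allocated vCPUs to physical cores on one compute host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return vcpus_allocated / physical_cores


# Hypothetical host: 96 vCPUs handed out on 24 physical cores.
ratio = cpu_oversubscription_ratio(vcpus_allocated=96, physical_cores=24)
```

A ratio above 1.0 means instances are competing for the same cores.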
LOCAL STORAGE
Use of local storage instead of shared storage like Ceph could improve performance by over 50%, depending on Ceph replication.
Source: OpenStack: Install and configure a storage node - OpenStack Kilo. http://docs.OpenStack.org/kilo/install-guide/install/yum/content/cinder-install-storage-node.html (2015)
NUMA NODES
Pinning instance CPUs to physical CPUs (NUMA nodes) on local storage further improves performance.
Source: Red Hat: CPU pinning and NUMA topology awareness in OpenStack Compute. http://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-OpenStack-compute/ (2015)
DISK PINNING
Disk pinning shows a 15% performance improvement.
Source: OpenStack: OpenStack Cinder multi-backend. https://wiki.OpenStack.org/wiki/Cinder-multi-backend (2015)
UNEVEN CONTROLLER USAGE
One controller had more cores available than the other two and ended up with all the jobs. This scenario was identified easily because the correct dashboarding was in place.
HEAT MEMORY USAGE
About 1 GB of memory is used by Heat for every 10 compute nodes deployed. Size your controller memory appropriately.
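The rule of thumb above turns into simple arithmetic when sizing controllers. The sketch below assumes the relationship is linear, which is an extrapolation from the single observation on the slide; the function name and the node counts are hypothetical.

```python
def heat_memory_gb(compute_nodes: int, gb_per_10_nodes: float = 1.0) -> float:
    """Estimate Heat memory use, assuming linear growth with compute nodes."""
    if compute_nodes < 0:
        raise ValueError("compute_nodes must be non-negative")
    return compute_nodes / 10 * gb_per_10_nodes


# Hypothetical deployment of 200 compute nodes.
estimate = heat_memory_gb(200)
```

Treat the result as a starting point and confirm it against the memory dashboards for your own deployment.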
DEPLOYMENT TIMINGS
We saw many instance reschedules with the default scheduler. Deployment time dropped dramatically after setting up assignments via Ironic.
CONCLUSION
DEFINE WHAT YOU ARE TRYING TO MEASURE
DEFINE A CLOUD
DEFINE WHAT METRICS ARE IMPORTANT
USE THE CORRECT TOOLS
RALLY
PERFKIT BENCHMARKER
CLOUDBENCH
SPEC CLOUD IAAS 2016 BENCHMARK
CEILOMETER
COLLECTD/GRAPHITE/GRAFANA
GANGLIA
GATHER AND ANALYZE DATA
APPLY TUNING TIPS BASED ON THE DATA
THANKS
Thanks to Andy Bond, Douglas Shakshober, and Joe Talerico for some of the content.
ADDITIONAL INFORMATION
GUIDELINES AND CONSIDERATIONS FOR PERFORMANCE AND SCALING YOUR RED HAT ENTERPRISE LINUX OPENSTACK PLATFORM 6 CLOUD
HTTPS://ACCESS.REDHAT.COM/ARTICLES/1507893
GUIDELINES AND CONSIDERATIONS FOR PERFORMANCE AND SCALING YOUR RED HAT ENTERPRISE LINUX OPENSTACK PLATFORM 7 CLOUD
HTTPS://ACCESS.REDHAT.COM/ARTICLES/2165131
RED HAT OPENSTACK BLOG
HTTP://REDHATSTACKBLOG.REDHAT.COM/
RED HAT DEVELOPER BLOG
HTTP://DEVELOPERBLOG.REDHAT.COM/
RED HAT ENTERPRISE LINUX BLOG
HTTP://RHELBLOG.REDHAT.COM/