Capacity planning2 v02
Agenda
Capacity Planning – practical view
CPU Capacity Planning
LPAR2RRD Premium features
Future
Discussion
What is that?
Does that save money?
If so then how?
Do you already have an IT capacity manager position and a tool at your company?
Why do we need capacity planning? To predict future bottlenecks
How to start? Need to have data.
Where to get data? Need to have monitoring in place
What to monitor? Networking (LAN / SAN / WAN)
Memory
Storage
CPU
A bottleneck on the LAN / SAN is very rare
From my experience only backups are able to regularly utilize the LAN/SAN heavily
When a problem appears, it is usually caused by bad infrastructure design
The fix is usually fast and relatively cheap
new IO cards, network component, trunk …
From a capacity planning view it is not a major area
Network should never cause a bottleneck! Why?
HW/SW on the network layer is much cheaper than CPU, memory or storage, so it should be sized generously
Memory is usually not virtualized the way CPU is
There are exceptions like
▪ IBM PowerVM Active Memory Sharing
▪ VMware
Consumption is not as dynamic as CPU
e.g. database memory consumption is fixed by configuration
Memory size usually has no effect on SW licensing
Easy to extend
Nowadays physical limits of memory on servers are usually high enough
An extension can be planned within days after a problem appears
Storage from a capacity planning point of view:
Storage capacity (size)
Storage throughput (IOPS, bytes/sec)
Storage capacity
Can be retrieved relatively simply and extrapolated into the future
When there is no special requirement for throughput, new capacity is relatively cheap
A disk upgrade is usually a quick operation
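Extrapolating storage capacity into the future can be as simple as fitting a straight line to historical usage. The following is a minimal sketch under stated assumptions: samples are taken at equal intervals (months here), growth is roughly linear, and the function names and sample figures are invented for illustration.

```python
def linear_fit(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...

    Returns (slope, intercept).
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_full(used_tb, capacity_tb):
    """Months after the last sample until the fitted line reaches capacity.

    Returns None when usage is not growing.
    """
    slope, intercept = linear_fit(used_tb)
    if slope <= 0:
        return None
    full_at = (capacity_tb - intercept) / slope
    return max(0.0, full_at - (len(used_tb) - 1))


# Illustrative data only: used capacity in TB over six months.
history = [40.0, 42.0, 44.0, 46.0, 48.0, 50.0]
print(months_until_full(history, capacity_tb=60.0))
```

A straight line is the simplest possible model; seasonal or step-wise growth would need something more careful.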
Storage throughput
Storage throughput is often the cause of bottlenecks
A storage IO bottleneck is usually tough to prove or predict
▪ Due to storage caching, for example
▪ The solution is not necessarily easy (data migration …)
CPU can be highly virtualized (IBM PowerVM, VMware …)
Virtualized means shared between virtual partitions
Lack of CPU power or bad virtualization design often causes bottlenecks
CPUs are quite expensive
A lot of SW products have licensing based on CPU
Companies should watch the number of CPUs so that they do not have to buy unnecessary SW licenses
Accurate capacity planning can directly save money spent on HW and SW
It often happens that servers are fully equipped with CPUs and cannot be extended
A server upgrade and/or lpar migration can be a long process
All the above reasons indicate that CPU is the most important area from a capacity planning point of view
The next part of the presentation will focus on CPU capacity planning on the IBM Power platform
However, the product we will talk about can be extended even to the Intel world: VMware, Linux Red Hat KVM …
The problem:
How to manage the CPU pools and sub pools on IBM Power Systems
▪ Document actual setup
▪ Knowledge of free resources
Determine CPU usage trends
Identify abnormal CPU usage
Manage Capacity
Compare CPU load on different server models for migration purposes
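Identifying abnormal CPU usage can be approximated with a simple statistical rule. The sketch below flags samples that deviate from the mean by more than a chosen number of standard deviations. The function name, the threshold and the sample data are assumptions made for illustration; nothing here is claimed to be how LPAR2RRD works internally.

```python
from statistics import mean, pstdev


def abnormal_samples(cpu_cores, threshold=2.0):
    """Return (index, value) pairs whose distance from the mean
    exceeds `threshold` standard deviations."""
    mu = mean(cpu_cores)
    sigma = pstdev(cpu_cores)
    if sigma == 0:
        return []  # perfectly flat load, nothing stands out
    return [
        (i, v)
        for i, v in enumerate(cpu_cores)
        if abs(v - mu) / sigma > threshold
    ]


# Illustrative hourly pool usage in physical cores, with one spike.
usage = [2.0, 2.1, 1.9, 2.0, 2.2, 2.0, 6.5, 2.1, 1.9, 2.0]
print(abnormal_samples(usage))
```

A fixed z-score threshold is crude: workloads with regular daily peaks would need a baseline per hour of day instead of one global mean.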
Business model
Product is free
Support is for a fee
Support levels
Basic
Standard
Premium
Some features are available only for customers under support
• Lightweight solution
• Real time and historical data
• Easy to implement and manage
• Cost effective (free)
It creates historical, future trend and nearly "real-time" CPU utilization graphs of LPARs and of shared CPU usage on IBM Power servers.
It collects the complete physical and logical configuration of all servers and LPARs.
It is agentless (it gets everything from the HMC/SDMC or IVM).
It supports all kinds of logical partitions:
AIX / AS400 / Linux / VIOS
It is free!!!
4 levels of detail
Global view
▪ all HMCs/SDMCs/IVMs, servers, lpars
Summary per HMC (servers, lpars)
▪ http://lpar2rrd.com/userfiles/file/summary.htm
Detail view per HMC (servers, lpars)
▪ http://lpar2rrd.com/userfiles/file/detail.htm
Verbose HW details per each server
▪ http://lpar2rrd.com/userfiles/file/server0/config.html
Shows the 10 lpars with the highest CPU utilization per:
Whole environment
Particular server
Day / week / month / year
Example:
http://lpar2rrd.com/topten.htm
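A ranking like the one described above could be derived from average CPU usage per lpar over the chosen period. This is a minimal sketch with hypothetical lpar names and made-up values; it does not claim to reflect how LPAR2RRD builds its own list.

```python
def top_cpu_lpars(samples_by_lpar, n=10):
    """Rank lpars by average CPU usage (in cores), highest first."""
    averages = {
        name: sum(values) / len(values)
        for name, values in samples_by_lpar.items()
        if values  # skip lpars without any samples
    }
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical lpars and usage samples for one day.
samples = {
    "lpar_a": [1.0, 3.0],
    "lpar_b": [4.0, 4.0],
    "lpar_c": [0.5, 0.5],
}
print(top_cpu_lpars(samples, n=2))
```

Running the same function over samples for a day, week, month or year gives the per-period views.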
It estimates CPU requirements based on historical data
Useful for migration / consolidation
Estimation based on rPerf/CPW benchmark
Live example:
http://lpar2rrd.com/live_demo.html
Documentation: http://lpar2rrd.com/cpu_workload_estimator_ann.html
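The idea of a benchmark-based estimate can be illustrated with plain linear scaling: convert the measured load to benchmark units on the source server, then back to cores on the target server. The sketch below assumes performance scales linearly with cores, and every rPerf figure in it is invented, not a published value. The real estimator may work differently.

```python
def cores_needed_on_target(used_cores, src_rperf, src_cores,
                           dst_rperf, dst_cores):
    """Estimate cores needed on a target server for a load measured
    on a source server, using per-core benchmark ratings."""
    src_per_core = src_rperf / src_cores
    dst_per_core = dst_rperf / dst_cores
    load_in_rperf = used_cores * src_per_core
    return load_in_rperf / dst_per_core


# Made-up ratings: source 80 rPerf on 8 cores, target 320 rPerf on 16 cores.
print(cores_needed_on_target(6.0, src_rperf=80.0, src_cores=8,
                             dst_rperf=320.0, dst_cores=16))
```

For a consolidation, the same conversion would be applied to the peak load of each lpar being moved, not only to the average.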
Favourites
It allows you to choose the most important or most often viewed CPU pools or lpars and place them into a separate menu for quick access.
You can assign them aliases, which then appear under the "Favourites" menu
http://lpar2rrd.com/favourites.html
Custom groups
It allows you to group selected lpars or pools from different servers into one aggregated graph
Limitation of the free LPAR2RRD version: max 4 lpars/pools per group
You can use regular expressions
http://lpar2rrd.com/custom_groups.html
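Grouping by regular expression and aggregating into a single series can be sketched as follows. The "server/lpar" key format, the names and the data are hypothetical, and this is not the LPAR2RRD configuration syntax. It only illustrates the selection and summing step.

```python
import re


def aggregate_group(pattern, series_by_lpar):
    """Select lpars whose name matches `pattern` and sum their CPU series.

    Returns (sorted member names, summed series).
    """
    rx = re.compile(pattern)
    members = sorted(name for name in series_by_lpar if rx.search(name))
    if not members:
        return members, []
    # Use the shortest series so every point has data from all members.
    length = min(len(series_by_lpar[m]) for m in members)
    total = [sum(series_by_lpar[m][i] for m in members) for i in range(length)]
    return members, total


# Hypothetical lpars on two different servers.
series = {
    "srvA/sapprd1": [1.0, 2.0],
    "srvB/sapprd2": [0.5, 0.5],
    "srvA/testdb": [3.0, 3.0],
}
print(aggregate_group(r"sapprd\d+$", series))
```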
Alerting
You can define alarms for any lpar or CPU pool (or complete server)
Useful especially for pools
Alerting options:
Email
Direct Nagios support
External script
Integration with other monitoring tools on request
http://lpar2rrd.com/alerting.html
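The core of such alerting is a threshold comparison per pool or lpar. The sketch below shows only that check, with hypothetical names and limits. Delivery by email, Nagios or an external script is left out, and the message format is an assumption.

```python
def check_alarms(current_usage, limits):
    """Return one message per pool/lpar whose CPU usage exceeds its limit.

    Both arguments map a name to a value in physical cores.
    """
    alerts = []
    for name, limit in sorted(limits.items()):
        used = current_usage.get(name)
        if used is not None and used > limit:
            alerts.append(
                f"{name}: CPU usage {used:.2f} cores exceeds limit {limit:.2f}"
            )
    return alerts


# Hypothetical limits and a current reading.
limits = {"pool_prod": 8.0, "lpar_db1": 2.0}
usage_now = {"pool_prod": 8.5, "lpar_db1": 1.2}
for message in check_alarms(usage_now, limits):
    print(message)
```

Each returned message could then be handed to whichever delivery channel is configured.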
Everything can be set up within 1 hour!
LPAR2RRD can run on any Unix OS
Disk space requirements are about 4MB per monitored LPAR
Required SW
Any web server
SSH
RRDtool
Perl
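The disk space figure quoted above makes sizing a simple multiplication. A minimal sketch, assuming the roughly 4 MB per monitored LPAR holds across the environment; the LPAR count is an arbitrary example.

```python
MB_PER_LPAR = 4  # approximate figure quoted in the presentation


def estimated_disk_mb(lpar_count, mb_per_lpar=MB_PER_LPAR):
    """Rough disk space needed for the given number of monitored LPARs."""
    if lpar_count < 0:
        raise ValueError("lpar_count must not be negative")
    return lpar_count * mb_per_lpar


print(estimated_disk_mb(250))  # 250 LPARs is an arbitrary example
```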
Automated capacity planning and performance analysis
Provides the information needed to help customers:
Manage their growth and performance productively and cost effectively
Ensure uninterrupted availability of systems
Be prepared for IT challenges / opportunities
Maximize return on current and future hardware investments
Easily understand and plan for future requirements
LPAR2RRD home page http://www.lpar2rrd.com
Monitoring shared processor pools with LPAR2RRD http://www.ibm.com/developerworks/aix/library/au-lpar2rrd/index.html
Nigel Griffiths' LPAR2RRD view https://www.ibm.com/developerworks/mydeveloperworks/blogs/aixpert/entry/whole_power_server_virtual_server_monitoring_part_2_via_lpar2rrd68?lang=en
Redbook: SAP Applications on a Virtualized IBM Power Systems Environment http://www.redbooks.ibm.com/abstracts/sg247564.html
Integration with SAP
Use of business data from SAP
Data source for SAP CCMS
Monitor storage (IOPS, bytes/sec …)
IBM DS3/4/5k already in beta
IBM DS8k November 2012
IBM Storwize and others will follow (EMC?)
Batch daily job for identification of lpars which reach max of their configured CPU
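A daily batch check of this kind could compare each lpar's peak usage with its configured maximum. The sketch below is an assumption about how such a job might look: the 95 % ratio, the data layout and the lpar names are all hypothetical.

```python
def lpars_at_cpu_limit(samples_by_lpar, max_cores_by_lpar, ratio=0.95):
    """Return (name, peak, limit) for lpars whose daily peak reached
    at least `ratio` of their configured maximum CPU."""
    hits = []
    for name in sorted(samples_by_lpar):
        limit = max_cores_by_lpar.get(name)
        samples = samples_by_lpar[name]
        if not limit or not samples:
            continue  # no configured maximum or no data for this lpar
        peak = max(samples)
        if peak >= ratio * limit:
            hits.append((name, peak, limit))
    return hits


# Hypothetical one-day samples (cores) and configured maximums.
day = {"lpar1": [1.0, 1.98, 1.5], "lpar2": [0.2, 0.4]}
configured_max = {"lpar1": 2.0, "lpar2": 2.0}
print(lpars_at_cpu_limit(day, configured_max))
```

Lpars that appear in this list day after day are candidates for a higher configured maximum or for migration.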
Storage monitoring
Monitoring basic storage parameters
IO rate per controller/port
Data rate per controller/port
Cache hit
Same easy front end as for CPU
One click shows storage utilization
IBM DS4k/5k already in testing phase
IBM DS8k will be ready in November 2012
Other storage systems will follow on request …