Vm Ware Performance Troubleshooting
-
7/31/2019 Vm Ware Performance Troubleshooting
VMware Performance
Troubleshooting
Presented by Chris Kranz
-
Topics Covered
Introduction
Root Cause Analysis
Performance Characteristics
CPU
Networking
Memory
Disk
Virtual Machine optimisation
esxtop
vm-support
Service Console
Resource Groups
Design Guidelines
Capacity Planner limitations and cautions
Conclusion
Reference Articles
-
Introduction
Multiple layers of virtualisation are used to increase service levels, availability and manageability
However, multiple layers of virtualisation often mask performance and configuration issues, making them more of a challenge to troubleshoot and correct
The worst outcome is that performance issues after a virtualisation project lead to the perception that VMware reduces performance, which can affect future confidence in VMware
-
Performance Basics
Virtual Machine Resources
CPU
Memory
Disk
Networking
-
Resource Maximums
Resource                 Host    Guest
Logical Processors       64      N/A
Virtual CPUs             N/A     8
Virtual CPUs per Core    20      N/A
Memory                   1TB     256GB
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_config_max.pdf
-
Typical Host
vSphere 1U Host
CPUs: 2 x Quad Core
Memory: 32-64GB RAM
Typically 3 VMs per core: 24 VMs per host
Each VM has 2GB of RAM: 48GB of RAM in total
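The sizing arithmetic on this slide can be sketched as a small calculation. This is only an illustration of the rule of thumb; the function name and parameters are hypothetical, and real sizing also has to account for overhead and Service Console memory.

```python
def size_host(sockets, cores_per_socket, vms_per_core, ram_per_vm_gb):
    """Return (vm_count, total_vm_ram_gb) using a simple per-core rule of thumb."""
    cores = sockets * cores_per_socket
    vm_count = cores * vms_per_core
    return vm_count, vm_count * ram_per_vm_gb

# The typical host from the slide: 2 x quad core, 3 VMs per core, 2GB per VM.
vms, ram_gb = size_host(sockets=2, cores_per_socket=4, vms_per_core=3, ram_per_vm_gb=2)
print(f"{vms} VMs needing {ram_gb}GB of RAM")
```

With the slide's figures this gives 24 VMs and 48GB, which is why a 32GB host would be over-committed on memory while a 64GB host would not.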
-
Root Cause Analysis
http://www.vmware.com/resources/techresources/10066
-
Root Cause ...
-
Monitoring Performance
Do not rely on guest tools alone, although they:
Can show high CPU and memory utilisation
Can measure latency and throughput of disk and network interfaces
Use the virtualisation layer to diagnose the cause:
The guest is unaware of the virtualisation workload
Guest OSs account for time differently
The guest has no visibility of available resources
-
Performance Analysis Tools
esxtop (service console only)
resxtop (remote command line utilities)
Performance graphs in vCenter
-
esxtop
esxtop can be run:
Interactively
In batch mode (e.g. esxtop -a -b > analysis.csv)
Load the batch output into Windows perfmon or MS Excel
Two keys to remember:
h : help
f : fields to display
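As an alternative to perfmon or Excel, the batch CSV can be summarised with a few lines of script. The sketch below assumes the batch output uses perfmon-style counter paths as column headers and that the counter of interest contains the text "% Ready"; the host name, VM names and sample values are hypothetical.

```python
import csv
import io

def column_means(csv_text, name_fragment):
    """Average every column whose header contains name_fragment."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    means = {}
    for index, name in enumerate(header):
        if name_fragment in name:
            values = [float(row[index]) for row in data if row[index]]
            if values:
                means[name] = sum(values) / len(values)
    return means

# Hypothetical two-sample capture; a real capture has many more columns.
sample = (
    '"Time","\\\\esx01\\Group Cpu(1:web01)\\% Ready","\\\\esx01\\Group Cpu(2:db01)\\% Ready"\n'
    '"07/31/2019 10:00:00","4.0","12.0"\n'
    '"07/31/2019 10:00:05","6.0","18.0"\n'
)
for name, mean in column_means(sample, "% Ready").items():
    print(name, round(mean, 1))
```

For a real capture, read analysis.csv from disk and pass its contents in place of the sample string.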
-
esxtop basics
Host Resources
Number of Worlds
Name of Resource Pool, Virtual Machine or World
-
Performance Characteristics
CPU: Slow Processing, High CPU Wait
Networking: Packet Loss, Slow Network
Memory: Slow Processing, Disk Swapping
Disk: Log Stalls, Disk Queue
Slow Application Performance
Reduced User Experience
Data Loss and Corruption
-
CPU
ESX Scheduler
Service Console
Virtual Machine
Limits / Shares / Reservations
Basic World States: Ready / Run / Wait
CPU States: Ready / Usage / Wait
-
CPU - esxtop
PCPU(%): CPU utilization
%USED: Utilization
%RDY: Ready Time
%RUN: Run Time
%WAIT: Wait and idling time
High %RDY + High %USED can imply over-commitment
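The "high ready plus high used" rule can be expressed as a simple check. This is a minimal sketch: the slide gives no numeric thresholds, so the two defaults below are purely illustrative assumptions, and the VM names and sample values are hypothetical.

```python
def looks_overcommitted(pct_rdy, pct_used, rdy_threshold=10.0, used_threshold=80.0):
    """Flag a world whose ready time and utilisation are both high.

    Both thresholds are illustrative assumptions, not VMware guidance.
    """
    return pct_rdy >= rdy_threshold and pct_used >= used_threshold

# Hypothetical (%RDY, %USED) readings per VM.
samples = {"web01": (15.2, 91.0), "db01": (1.3, 88.0), "idle01": (0.4, 2.0)}
for name, (rdy, used) in samples.items():
    print(name, looks_overcommitted(rdy, used))
```

Only web01 is flagged here: db01 is busy but is not waiting for a CPU, so it does not point to over-commitment.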
-
CPU - VI Client
Used Time > Ready Time:
Possible CPU over-commitment
Used Time
Ready Time
-
CPU - Further Investigation
%MLMTD shows this VM has been limited
-
VMware Memory Management
Transparent Page Sharing
VMware Tools Balloon Driver to force the VM to swap to disk
Virtual Machine Page File
-
Memory - Ballooning vs. Swapping
The balloon driver causes the guest OS to swap pages of its own choosing to disk
ESX swapping will swap any pages to disk, regardless of how the guest is using them
-
Memory
Ballooning can be disabled (value of 0) or controlled on a per Virtual Machine basis using:
sched.mem.maxmemctl
The default is 65%, which can be controlled at host level.
This is only an issue in resource contention scenarios (or for VMs that need low latency, e.g. Citrix)
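To get a feel for what the 65% default means for a given VM, the sketch below computes the largest amount the balloon driver could reclaim. It assumes the percentage applies to the VM's configured memory and that the per-VM value is expressed in MB; both are assumptions to verify against the VMware documentation, and the function name is hypothetical.

```python
def balloon_limit_mb(configured_mb, max_percent=65):
    """Largest amount the balloon driver could reclaim, as a share of configured memory."""
    if not 0 <= max_percent <= 100:
        raise ValueError("max_percent must be between 0 and 100")
    return configured_mb * max_percent // 100

print(balloon_limit_mb(2048))      # default 65% cap on a 2GB VM
print(balloon_limit_mb(2048, 0))   # a value of 0 disables ballooning
```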
-
Memory - Host
VI Client shows memory usage of the host. This is calculated as consumed + overhead
memory + Service Console.
Performance charts are a very good way of showing the Virtual Machine memory
breakdown.
Consumed Memory
Ballooned Memory
Shared Memory
Swapped Memory
-
Memory - Guest
Host Memory = Consumed + Overhead Memory
Guest Memory = Active Memory for Guest OS
-
Memory - Guest Overhead
-
Memory
Virtual Machine Memory Metrics - VI Client
Metric                   Description
Memory Active (KB)       Physical pages touched recently by a VM
Memory Usage (%)         Active memory / configured memory
Memory Consumed (KB)     Machine memory mapped to a virtual machine, including its portion of shared pages. Doesn't include overhead memory.
Memory Granted (KB)      Physical pages allocated to a virtual machine. May be less than configured memory. Includes shared pages. Doesn't include overhead memory.
Memory Shared (KB)       Physical pages shared with other virtual machines
Memory Balloon (KB)      Physical memory ballooned from a virtual machine
Memory Swapped (KB)      Physical memory in swap file (approx. swap out - swap in). Swap out and swap in are cumulative.
Overhead Memory (KB)     Machine pages used for virtualisation
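Two of these metrics are simple derivations from other counters, which the sketch below makes explicit. The function names and sample figures are hypothetical; only the two formulas come from the metric descriptions.

```python
def memory_usage_pct(active_kb, configured_kb):
    """Memory Usage (%) = active memory / configured memory."""
    return 100.0 * active_kb / configured_kb

def swapped_kb(swap_out_kb, swap_in_kb):
    """Approximate swap file usage from the cumulative swap out and swap in counters."""
    return swap_out_kb - swap_in_kb

print(memory_usage_pct(active_kb=524288, configured_kb=2097152))
print(swapped_kb(swap_out_kb=300000, swap_in_kb=120000))
```

A 2GB VM that has recently touched 512MB therefore reports 25% usage, however much memory the guest OS itself believes is in use.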
-
Memory
Host Memory Metrics - VI Client
Metric                   Description
Memory Active (KB)       Physical pages touched recently by the host
Memory Usage (%)         Active memory / configured memory
Memory Consumed (KB)     Total host physical memory - free memory on host. Includes Overhead and Service Console memory.
Memory Granted (KB)      Sum of physical pages allocated to all virtual machines. Doesn't include overhead memory.
Memory Shared (KB)       Physical pages shared by virtual machines on host
Shared Common (KB)       Total machine pages used by shared pages
Memory Balloon (KB)      Machine pages ballooned from virtual machines
Memory Swap Used (KB)    Physical memory in swap file (approx. swap out - swap in). Swap out and swap in are cumulative.
Overhead Memory (KB)     Machine pages used for virtualisation
-
Memory - esxtop
PMEM: Total physical memory breakdown
VMKMEM: Memory managed by vmkernel
COSMEM: Service Console memory breakdown
PSHARE: Page sharing statistics
SWAP: Swap statistics
MEMCTL: Balloon driver data
-
Memory
esxtop / VI Client metrics: Host Usage
VI Client               esxtop
Memory Active           N/A (try /proc/vmware/sched/mem-verbose)
Memory Usage            N/A (try /proc/vmware/sched/mem-verbose)
Memory Consumed         PMEM total - PMEM free
Memory Granted          N/A (SZTGT and CMTTGT represent memory scheduler targets)
Memory Shared           PSHARE (shared)
Memory Shared Common    PSHARE (common)
Memory Balloon          MEMCTL
Memory Swap Used        SWAP (r/s and w/s are rates)
Overhead Memory         OVHD & OVHDMAX
-
Memory - VI Client memory usage graph
-
Memory - Troubleshooting memory usage issues
-
Networking
Network configuration is more likely to blame than resource contention
Switch Assisted Teaming (IP Hash)
VLAN Trunking
Flow Control (full)
Speed & Duplex (1000Mb / Full)
Port Fast
BPDU Disabled
STP Disabled
Link State Tracking
Jumbo Frames
-
Networking - esxtop
Transmit and Receive in Mb/s
Transmit and Receive in Packets
-
Networking - esxtop
Dropped Packets Received
Dropped Packets Transmitted
-
Disk
Varying Factors
File system performance
Disk subsystem configuration (SAN, NAS, iSCSI, local disk)
Disk caching
Disk formats (thick, sparse, thin)
ESX Storage Stack
Different latencies for different disks
Queuing within the kernel
K: Kernel
D: Device
G: Guest
-
Disk - VI Client statistics
Quite Coarse Statistics
Disk read / write rate (KB/s)
Disk usage: sum of read BW and write BW (KB/s)
Disk read / write requests (per 20s interval)
Bus resets / Command aborts (per 20s interval)
Per LUN or aggregated stats
-
Disk - esxtop statistics
Aggregated stats similar to VI Client
Disk read / write per sec (READS/s, WRITES/s)
MB read / write per sec (MBREAD/s, MBWRTN/s)
Latency Statistics
Kernel Average / command (KAVG/cmd)
Device Average / command (DAVG/cmd)
Guest Average / command (GAVG/cmd)
Queuing Information
Adapter Queue Length (AQLEN)
LUN Queue Length (LQLEN)
VMKernel (QUED)
Active Queue (ACTV)
%Used (%USD = ACTV/LQLEN)
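The queue utilisation formula on this slide, and the relationship between the three latency counters, can be sketched as follows. The %USD calculation is taken from the slide; treating guest latency as kernel latency plus device latency is the commonly described relationship rather than something stated here, so take it as an assumption. Function names and sample values are hypothetical.

```python
def queue_used_pct(actv, lqlen):
    """%USD: active commands as a percentage of the LUN queue length."""
    return 100.0 * actv / lqlen

def guest_latency_ms(kavg_ms, davg_ms):
    """Estimate GAVG/cmd as kernel latency plus device latency (assumed relationship)."""
    return kavg_ms + davg_ms

print(queue_used_pct(actv=24, lqlen=32))
print(guest_latency_ms(kavg_ms=0.5, davg_ms=12.0))
```

A %USD close to 100 alongside a non-zero QUED suggests commands are waiting in the kernel for a free slot in the LUN queue.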
-
Disk - SAN Rough Estimates
Purely looking at a single ESX host, roughly:
Throughput (in MBps) = (Outstanding IOs * Block size in KB) / latency in msec
FC, rough maximums:
Effective Link Bandwidth = ~80-90% of Real Bandwidth
Effective (2Gbps) = 200 - 230 MBps
Effective (4Gbps) = 410 - 460 MBps
Effective (8Gbps) = 820 - 920 MBps
iSCSI / NFS / FCoE, rough maximums:
Effective Link Bandwidth = ~70-80% of Real Bandwidth
Effective (1GigE) = 90 - 100 MBps
Effective (10GigE) = 900 - 1000 MBps
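The rough throughput estimate can be tried out with a few numbers. This is a sketch of the slide's formula only; the function names and the sample workload (32 outstanding IOs of 8KB) are hypothetical. The second function is the same formula rearranged to give the latency needed to hit a target throughput.

```python
def throughput_mbps(outstanding_ios, block_size_kb, latency_ms):
    """Rough throughput: KB in flight divided by latency in ms gives roughly MB per second."""
    return outstanding_ios * block_size_kb / latency_ms

def latency_for_throughput_ms(outstanding_ios, block_size_kb, target_mbps):
    """The same estimate rearranged to give the latency needed for a target throughput."""
    return outstanding_ios * block_size_kb / target_mbps

print(throughput_mbps(outstanding_ios=32, block_size_kb=8, latency_ms=5))
print(latency_for_throughput_ms(outstanding_ios=32, block_size_kb=8, target_mbps=100))
```

Comparing the result with the effective link bandwidth figures shows whether the workload or the link is the limiting factor.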
-
Disk - Desired Latency Calculations
Desired Latency in msec
-
Disk - VI Client
SAN Cache disabled
Poor throughput
SAN Cache enabled
High throughput
-
Disk - esxtop
Latency is quite high
After enabling cache, latency is reduced
-
Virtual Machine Optimisation
Deploy all machines from an optimised template!
VMware Tools MUST be installed
The disks MUST be block aligned to the storage (even when using NFS and SAN)
Where possible, always separate data disks from OS disks
Windows performance settings should be optimised for application performance
Guest operating system timeouts should be set as defined by the SAN vendor
Pagefile should be separated where appropriate (this can impact VMware SRM, however)
Unused Windows services should be disabled (wireless config, print spooler, audio, etc.)
Last access time updates should be disabled (unless required)
Logging of the VM should be disabled (only enabled for troubleshooting)
Remove any unused virtual hardware (floppy drives, USB, etc.)
Disable screen savers and power saving features, including the logon screen saver
Enable Remote Desktop; avoid using the VI Client for remote administration
Install standard applications into the template (bginfo, antivirus, any host agents, etc.)
Multiple CPUs should be allocated sparingly
-
Virtual Machine Optimisation
Block alignment is vital to good disk performance!
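A partition is block aligned when its starting byte offset is a whole multiple of the storage block size. The sketch below checks that condition; the 512-byte sector size, the 4096-byte storage block and the two example start sectors are assumptions for illustration, and the real block size should come from the storage vendor.

```python
SECTOR_BYTES = 512  # assumed sector size

def is_block_aligned(start_sector, storage_block_bytes=4096):
    """True when the partition's starting byte offset is a multiple of the storage block size."""
    return (start_sector * SECTOR_BYTES) % storage_block_bytes == 0

print(is_block_aligned(63))    # a start sector often seen on older guest installs
print(is_block_aligned(128))   # a 64KB starting offset
```

A misaligned guest partition means a single guest IO can straddle two storage blocks, so the array does more work for every read and write.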
-
esxtop
Command Options when inside esxtop
Command    Action
space      Update the display
?          Show the help page
q          Quit
f / F      Add or remove columns from the display
o / O      Change the order the display is sorted
s          Change the update interval
#          Change the number of instances to display
W          Write configuration to file
e          Expand / roll up CPU stats
V          View only VM instances
L          Change the length of the NAME field
m          Display memory statistics
n          Display network statistics
i          Display interrupt statistics
d          Display disk adapter statistics
u          Display disk device statistics
v          Display disk VM statistics
-
esxtop
Command Line Options from the console
Command    Action
-b         Batch mode
-l         Locks the objects available in the first snapshot
-s         Enables secure mode
-a         Show all statistics
-c         Sets the configuration file
-R         Enables replay mode (used with vm-support -S)
-d         Sets the update interval
-n         Runs esxtop for n iterations
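The batch-related options combine naturally into a fixed-length capture. The sketch below only assembles the argument list from the options listed for this slide (-b, -a, -d, -n); the helper name is hypothetical and the script does not run esxtop itself.

```python
def esxtop_batch_command(interval_s, iterations, all_stats=True):
    """Build an esxtop batch-mode argument list using -b, -a, -d and -n."""
    args = ["esxtop", "-b"]
    if all_stats:
        args.append("-a")
    args += ["-d", str(interval_s), "-n", str(iterations)]
    return args

cmd = esxtop_batch_command(interval_s=5, iterations=12)
print(" ".join(cmd), "> analysis.csv")
print("capture covers about", 5 * 12, "seconds")
```

The update interval multiplied by the number of iterations gives the approximate length of the capture, which is worth working out before starting a long run.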
-
esxtop
Expand the default window size for your session to get all statistics
-
vm-support
Creates a packaged zip file containing the following sections:
boot: contains the grub configuration
etc: contains the Console OS configuration files (cron, tcpwrappers, syslog, etc.)
proc: contains much of the hardware configuration modules and variables
tmp: contains a lot of the ESX-specific configuration output
var: contains log files and any core dumps
vmfs: contains the structure of the VMFS datastores
esx3-installation (where appropriate): contains a copy of the previous esx3 configuration variables
-
vm-support
Using vm-support to extract performance information:
vm-support -S -d <duration> -i <interval>
<duration> and <interval> are in seconds
The output from this can then be replayed in esxtop for review after it has been extracted.
esxtop -R <path to extracted output>
-
Service Console Performance
Multiple Service Console networks for network resiliency
Increased Service Console memory, up to 800MB
Use host agents supplied by your vendors
Make storage-recommended tweaks such as HBA Queue Depth and IO timeouts
Minimal use of the VI Client console; use RDP or SSH instead
Properly sized vCenter server, on a 64-bit OS where possible
-
Resource Groups
Dynamically reallocate resource shares
Add a VM: shares allow you to over-commit resources and have a graceful re-allocation
Remove a VM: the extra resources are exploited across all remaining VMs
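How shares behave when a VM is removed can be illustrated with a proportional split. This is a simplified model under contention only: it ignores limits and reservations, and the VM names, share values and 8000MHz capacity are hypothetical.

```python
def allocate_by_shares(capacity_mhz, shares):
    """Split contended capacity between VMs in proportion to their shares."""
    total = sum(shares.values())
    return {vm: capacity_mhz * share / total for vm, share in shares.items()}

shares = {"web01": 2000, "db01": 1000, "test01": 1000}
print(allocate_by_shares(8000, shares))

del shares["test01"]   # removing a VM frees capacity for the rest
print(allocate_by_shares(8000, shares))
```

The ratio between web01 and db01 stays at 2:1 in both cases; only the size of each slice changes, which is the graceful re-allocation the slide describes.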
-
Design Guidelines
Full Resilience / Multiple paths
Standard configuration across all aspects (ESX, Storage, Networking, etc.)
Standard naming conventions
Learn from others' mistakes
Follow vendors' best-practice guidelines
Rule out the basics before requesting support
-
Capacity Planner & P2V Cautions and Limitations
Peak CPU usage can sometimes be misleading
Back-end storage system performance
P2V machines will require block-aligning to the storage
P2V machines will still require guest OS optimisation
-
Conclusion
Performance issues can often be traced with simple root cause analysis using basic tools (VI Client / esxtop)
Performance tools help diagnose issues and help rule out non-issues
Performance tools are useful in different contexts, not always either/or
Real-time data and troubleshooting: esxtop
Historical data: VI Client
Coarse resource / cluster usage: VI Client
Detailed resource usage: esxtop
Combine information from various tools to get a complete picture
Always benchmark your systems first so you know what the optimal performance is that you can achieve
-
Reference Articles
http://www.vmware.com/pdf/esx3_memory.pdf
http://www.vmworld.com/docs/DOC-2370
http://blogs.vmware.com/performance/
http://communities.vmware.com/docs/DOC-5420
http://kb.vmware.com/kb/1008205
http://communities.vmware.com/community/vmtn/general/performance
http://www.vmware.com/products/vmmark/
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_resource_mgmt.pdf
http://www.vmware.com/pdf/GuestOS_guide.pdf
http://www.vmware.com/resources/techresources/10066
http://www.vmware.com/resources/techresources/10059
http://www.vmware.com/resources/techresources/10062