VMware Customer Support Day
November 16, 2010
2
Agenda
9:30 AM - Welcome/Kick-Off
Bob Good, Manager, Systems Engineering
9:40 AM - Support Engagement
Laura Ortman, Director, Global Support Services (GSS)
10:00 AM - Storage Best Practices
Ken Kemp, Escalation Engineer
11:00 AM - Keynote – VMware Virtualization and Cloud Management
Doug Huber, Director, Systems Engineering
12:00 PM - Lunch/Q&A with the experts (Group A) /VMware Express – Private Viewing (Group B)
1:00 PM - Lunch/Q&A with the experts (Group B) / VMware Express – Private Viewing (Group A)
2:00 PM - View 4.5 Overview/Network Best Practices
David Garcia, Release Readiness Manager
3:15 PM - Break
3:30 PM - vSphere Performance Best Practices
Ken Kemp, Escalation Engineer
4:15 PM - Wrap Up/Raffle Drawing
Interactive Session
Storage Best Practices
Ken Kemp – Escalation Engineer, Global Support Services
4
Agenda
Performance
SCSI Reservations
Performance Monitoring
• esxtop
Common Storage Issues
• Snapshot LUNs
• Virtual Machine Snapshots
• iSCSI Multi-pathing
• All Paths Dead (APD)
5
Disk subsystem bottlenecks cause more performance problems
than CPU or RAM deficiencies
Your disk subsystem is considered to be performing poorly if it is
experiencing:
• Average read and write latencies greater than 20 milliseconds
• Latency spikes greater than 50 milliseconds that last for more than a few seconds
Performance
6
Performance vs. Capacity comes into play at two main levels
• Physical drive size
• Hard disk performance doesn’t scale with drive size
• In most cases the larger the drive the lower the performance.
• LUN size
• Larger LUNs increase the number of VMs, which can lead to contention on that particular LUN
• LUN size is often related to physical drive size, which can compound performance problems
Performance vs. Capacity
7
You need 1 TB of space for an application
• 2 x 500GB 15K RPM SAS drives = ~300 IOPS
• Capacity needs satisfied, performance low
• 8 x 146GB 15K RPM SAS drives = ~1,168 IOPS
• Capacity needs satisfied, performance high
Performance – Physical Drive Size
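A rough sanity check for the numbers above, assuming a single 15K RPM SAS spindle delivers on the order of 146-150 IOPS (an assumption; actual figures vary by drive and workload): 2 drives x ~150 IOPS = ~300 IOPS, while 8 drives x ~146 IOPS = ~1,168 IOPS. Spreading the same capacity across more, smaller spindles yields roughly four times the throughput.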
8
SCSI Reservations – when an initiator requests/reserves exclusive use of a target (LUN)
• VMFS is a clustered file system
• Uses SCSI reservations to protect metadata
• To preserve the integrity of VMFS in multi host deployments
• One host has exclusive access to the LUN while the reservation is held
• A reboot or a release command will clear the reservation
• The virtual machine monitor uses SCSI-2 reservations
SCSI Reservations – Why?
9
What causes SCSI Reservations?
• When a VMDK is created, deleted, placed in REDO mode, has a snapshot (delta) file, is migrated (reservations come from both the source and the target ESX host), or when the VM is suspended (since a suspend file is written)
• When VMDK is created via a template, we get SCSI reservations on the source and target
• When a template is created from a VMDK, SCSI reservation is generated
SCSI Reservations
10
• Simplify/verify deployments so that virtual machines do not span more than one LUN
• This will ensure SCSI reservations do not impact more than one LUN
• Determine if any operations are occurring on a LUN on which you want to perform another operation
• Snapshots
• VMotion
• Template Deployment
• Use a single ESX server as your deployment server to limit/prevent conflicts with other ESX servers attempting to perform similar operations
SCSI Reservation Best Practice
11
• Inside vCenter, limit access to actions that initiate reservations to administrators who understand the effects of reservations to control WHO can perform such operations
• Schedule virtual machine reboots so that only one LUN is impacted at any given time
• A power on and a power off are considered separate operations, and both will create a reservation
• VMotion
• Use care when scheduling backups. Consult the backup provider best practices information
• Use care when scheduling Anti Virus scans and updates
SCSI Reservation Best Practice - Continued
12
• Monitoring /var/log/vmkernel for:
• 24/0 0x0 0x0 0x0
• SYNC CR messages
• In a shared environment like ESX there will be some SCSI reservations; this is normal. Seeing hundreds of them is not (see the grep example below).
• Check for Virtual Machines with snapshots
• Check for HP management agents still running the storage agent
• Check LUN presentation for Host mode settings
• Call VMware support to dig into it further
SCSI Reservation Monitoring
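A quick way to spot these from the ESX service console (a minimal sketch; the exact message strings vary by ESX version):
grep -ic "reservation conflict" /var/log/vmkernel   # count SCSI reservation conflicts
grep -c "SYNC CR" /var/log/vmkernel                 # count SYNC CR messages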
© 2009 VMware Inc. All rights reserved
Storage Performance Monitoring
Ken Kemp – Escalation Engineer, Global Support Services
14
esxtop
15
DAVG = raw response time from the device
KAVG = time spent in the VMkernel (i.e. virtualization overhead)
GAVG = response time as perceived by the virtual machine
DAVG + KAVG = GAVG
esxtop - Continued
16
esxtop - Continued
17
esxtop - Continued
18
• What are correct values for these response times?
• As with all things revolving around performance, it is subjective
• Obviously, the lower these numbers are the better
• ESX will continue to function with nearly any response time; how well it functions is another issue
• Any command that is not acknowledged by the SAN within 5000 ms (5 seconds) will be aborted. This is where perceived disk performance takes a sharp dive (see the esxtop example below)
esxtop - Continued
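To watch these latencies live, esxtop can be run interactively or in batch mode (a minimal sketch; keys are from ESX 4.x esxtop):
esxtop                                   # press d (adapter), u (device) or v (VM) to see DAVG, KAVG and GAVG
esxtop -b -d 5 -n 60 > /tmp/esxtop.csv   # batch mode: 5-second samples, 60 iterations, for offline analysis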
© 2009 VMware Inc. All rights reserved
Common Storage Issues
Ken Kemp – Escalation Engineer, Global Support Services
20
How is a LUN detected as a snapshot in ESX?
• When an ESX 3.x server finds a VMFS-3 LUN, it compares the SCSI_DiskID information returned from the storage array with the SCSI_DiskID information stored in the LVM Header.
• If the two IDs do not match, the VMFS-3 volume is not mounted.
A VMFS volume on ESX can be detected as a snapshot for a number of reasons:
• LUN ID change
• SCSI version supported by array changed (firmware upgrade)
• Identifier type changed – Unit Serial Number vs NAA ID
Snapshot LUNs
21
Resignaturing Methods
ESX 3.5
Enable LVM resignaturing on the first ESX host:
Configuration > Advanced Settings > LVM > set LVM.EnableResignature to 1
ESX 4
Single Volume Resignaturing
Configuration > Storage > Add Storage > Disk / LUN
Select Volume to Resignature > Select Mount, or Resignature
Snapshot LUNs - Continued
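The same operations are also available from the command line (a minimal sketch; on ESX 4 the vCLI equivalent is vicfg-volume):
esxcfg-advcfg -s 1 /LVM/EnableResignature    # ESX 3.5: enable LVM resignaturing host-wide, then rescan
esxcfg-volume -l                             # ESX 4: list volumes detected as snapshots/replicas
esxcfg-volume -m <VMFS-UUID|label>           # ESX 4: mount the volume keeping its existing signature
esxcfg-volume -r <VMFS-UUID|label>           # ESX 4: resignature the volume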
22
What is a Virtual Machine Snapshot?
• A snapshot captures the entire state of the virtual machine at the time you take the snapshot.
• This includes:
Memory state – The contents of the virtual machine’s memory.
Settings state – The virtual machine settings.
Disk state – The state of all the virtual machine’s virtual disks.
Virtual Machine Snapshots
23
Common issues:
• Snapshots filling up a Data Store
• Offline commit
• Clone VM
• Parent has changed.
• Contact VMware Support
• No Snapshots Found
• Create a new snapshot, then commit.
Virtual Machine Snapshot - Continued
24
ESX 4, Set Up Multi-pathing for Software iSCSI
Prerequisites:
• Two or more NICs.
• Unique vSwitch.
• Supported iSCSI array.
• ESX 4.0 or higher
ESX4 iSCSI Multi-pathing
25
Using the vSphere CLI, connect the software iSCSI initiator to the iSCSI VMkernel ports.
Repeat this command for each port.
• esxcli swiscsi nic add -n <port_name> -d <vmhba>
Verify that the ports were added to the software iSCSI initiator by running the
following command:
• esxcli swiscsi nic list -d <vmhba>
Use the vSphere Client to rescan the software iSCSI initiator.
ESX4 iSCSI Multi-pathing - Continued
26
This example shows how to connect the software iSCSI initiator vmhba33 to VMkernel ports vmk1 and vmk2.
Connect vmhba33 to vmk1:
esxcli swiscsi nic add -n vmk1 -d vmhba33
Connect vmhba33 to vmk2:
esxcli swiscsi nic add -n vmk2 -d vmhba33
Verify vmhba33 configuration:
esxcli swiscsi nic list -d vmhba33
ESX4 iSCSI Multi-pathing - Continued
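After the rescan, it is worth confirming that the VMkernel ports can reach the array and that multiple paths show up (a sketch; commands run from the ESX 4 service console, the target IP is an example):
vmkping 192.168.1.100    # test connectivity to the iSCSI target from the VMkernel TCP/IP stack
esxcfg-mpath -b          # brief path listing; each iSCSI LUN should now show one path per bound VMkernel port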
27
The issue:
You want to remove a LUN from a vSphere 4 cluster.
You move or Storage vMotion the VMs off the datastore that is being removed (otherwise, the VMs would hard crash if you just yanked out the datastore).
After removing the LUN, VMs on OTHER datastores would become unavailable (not crashing, but becoming periodically unavailable on the network).
The ESX logs would show a series of errors starting with "NMP".
All Paths Dead (APD)
28
Workaround 1
In the vSphere client, vacate the VMs from the datastore being removed (migrate or Storage vMotion)
In the vSphere client, remove the datastore
In the vSphere client, remove the storage device
Only then, in your array management tool, remove the LUN from the host
In the vSphere client, rescan the bus
Workaround 2 (only available in ESX/ESXi 4 U1)
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD
All Paths Dead - Continued
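A sketch of checking and setting this advanced option from the service console (-g reads the current value, -s sets it):
esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD    # show the current value (0 by default)
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD  # enable the workaround (ESX/ESXi 4.0 U1 and later)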
29
4.1 Storage Additions
Storage I/O Control, which allows us to prioritize I/O from virtual machines residing on different ESX servers but using the same shared VMFS volume.
New I/O statistics, including NFS throughput and latency counters.
vStorage API for Array Integration (VAAI), which allows offloading certain storage operations, such as cloning and zeroing, from the host to the array.
Questions
VMware View 4.5 Overview
David Garcia Jr - Global Support Services
32
Agenda
View (Overview)
User Experience (Highlights)
Performance & Scalability (Tiered Storage, View Composer)
Management (View Manager)
33
[Diagram: VDI deployment scope covers client performance, the View Server and remote clients, vCenter performance, VMware View performance, the server and virtualization stack (hypervisor performance), and the storage and network infrastructure.]
VDI deployment scope
34
View 4.5 Architecture overview
View Client with
Local Mode
Support for vSphere 4.1 and vCenter 4.1 - Delivers integration with
the most widely-deployed desktop virtualization platform in the industry.
Takes advantage of optimizations for View virtual desktops.
Lowest Cost Reference
Architectures - VMware has
worked with partners such as
Dell, HP, Cisco, NetApp, and
EMC to provide prescriptive
reference architectures to enable
you to deploy a scalable and
cost-effective desktop
virtualization solution.
35
View 4.5 Product highlights
Full Windows 7 Support
View Manager Enhancements
• Increasing Scale and Efficiency
• System and User Diagnostics
• Extensibility
PCoIP Updates: Smart Card Support
View Client with Local Mode (aka Offline Support)
Support for vSphere 4.1
36
Native Windows Client Thin- Client Support
Thick clients or
refurbished PCs
Broad industry
support
Flexible client access from multiple devices
Mac OS 10.5+
Native Mac Client (RDP)
NEW
Now with Local Mode
37
Single Sign On
Authentication to Virtual
Desktop
• Windows Username/Password
• Smart Cards/Proximity Cards
• Client Based (MAC Address)
• USB connected biometric devices
Integration with MS AD
• No Domain change, schema
change, password change
Supports "Tap and Go"
Functionality
• Integrates with SSO Vendors –
Imprivata, Sentillion, Juniper, etc
Simplified Sign-on
Connection
Server
Single sign on to virtual desktop and apps
38
Web download portal
• Enhanced capability to manage
distribution of full View Windows
Client including PCoIP, ThinPrint
and USB redirection features
• Ability to distribute current and
legacy versions of View Client
• Broker URL automatically passed
to Windows client upon launch
• Experimental Java based Mac and
Linux Web Access no longer
supported (use installable Mac
Client in View 4 and View Open
Client for Linux)
39
Value propositions of local desktops
For IT
Extend View benefits to mobile users with laptops
Enable Bring Your Own PC (BYOPC) programs for employees &
contractors
Extend View benefits to remote/branch offices with poor/unreliable
networks
For End Users
Mobility – check out VM to local laptop for offline usage
Disaster Recovery – VM replicated to datacenter
Flexibility – BYOPC and personal desktop productivity
Windows
Guest
VM 1
View Client with Local Mode
Guest
VM 2
40
High Level Features (View in 2010) | Details
Run anywhere | After initial checkout, desktop can be used at home or on the road w/o network connectivity
Broad hardware support | Works with almost any modern laptop today
Encrypted and secure | AES encryption of desktop and centrally managed policies to control access and usage
Data centralization & control | Admin can pull all data back up to datacenter on demand
High quality user experience | Support for Win7 Aero Glass effects, DirectX 9 w/3D, distortion-free sound & multimedia
Reasonable CAPEX costs | Up & running with a single ESX box & local storage!
Disaster recovery options | Can schedule data replication to server for rapid, seamless recovery from hardware loss or failure
Single Image Management w/View | Works off same management infrastructure & images as rest of View deployment
High level features of local desktops in 2010
41
View 4.5 major management feature highlights
Up to 10,000 Desktops
Admin Features
• High perf GUI
• Role based Admin
• Event DB, Dashboard
• View Power CLI extension
Storage Optimization
• Tiered storage
• Disposable disk/Local
swap file redirection
• VM on local storage
Composer Enhancements
• Sysprep support
• Fast refresh
• Persistent Disk Management
Simplified Sign-on
• Smart-card/Proximity card
• Client (MAC/device ID),
support of Kiosk mode
ThinApp Integration
• App repo scanning
• Pool/Desktop ThinApp
assignment
42
Core broker: Performance & scalability
• 10,000 VM Pod (5 connection servers + 2 standby)
• Federated Pool Management
• Connection server instance in a cluster will be responsible for VM operations on
VMs belonging to the same pool
• Reduced locking/synchronization overhead
• Enhanced tracker w/ caching
• Reduced extra reloading from ADAM Datastore
• Refresh UI with 5,000 objects in seconds!
43
View Composer improvements overview
• Customization/Provisioning
• Sysprep support
• Refresh, Recompose and Rebalance for Floating Pool
• Storage Performance and Optimization
• Tiered support
• Optimization
• Disposable disk and Local swap file redirect
• Allow creation of linked-clones on local storage
• Management
• Full Management of Persistent Disk (formerly known as UDD)
44
View Composer: Tiered storage
Allow master VM replica to
reside in a separate datastore
Use high performance storage to boost
performance (e.g. reboot, virus scan)
45
View Composer: Other storage optimization
• Local swap file redirect
• Not reducing storage but allow the use of cheap local storage for individual VM swap file
• Allow creation of linked-clones using local data stores
• Wizard will not filter out local data stores for use of VM cloning
• Allow use of cheap local storage for non-persistent pool VMs
46
View Composer: Customization/provisioning
•Sysprep support
•Sysprep helps resolve
the SID management
issue: a new SID will
be generated for each
cloned VM
•The Three "R"s
•Refresh
•Recompose
•Rebalance
47
View Composer: Enhanced management functions
• Persistent Disk (formerly known as UDD) Management
• Detach/Migrate/Archive/Reattach
• Managed as a "first class object"
• Garbage collection scripts
• Remove one or more linked-clone VM(s) by name(s) from View, SVI, VC, and AD
48
Administration improvements in 2010
Provides Increased Management Efficiency:
Monitoring, Diagnostics and Supportability
Features
• Scalable Admin UI in Flex
• Role-based Administration
• System and End-User Troubleshooting
• Monitoring Dashboard
• Diagnostics
• Supportability
• Reporting and Auditing Enablement
• Events
• View Management Pack for SCOM
49
Scalable admin UI
• Based on Adobe Flex
• Rich application feel
• Scalability
• Easy navigation
• Cross-Platform
50
Role-based administration
• Delegated
administration
• Flexible Roles
• Helpdesk, etc
• Custom roles
• LDAP-based access
control on folders
51
System and end-user troubleshooting: Dashboard
• Surface key information to
administrators
• Drill-down as needed
• Locate root cause
• System health status
• View components
• vCenter components
• Status of desktops
• Status of client-hosted
endpoints
• Datastore usage
• VMs on storage LUN
52
Reporting and auditing enablement: Events
Formally defined events
• Events have a unique well defined identifier
• Standard attributes include module, user, desktop, machine
Provides a unified view across View components
• No more needing to review logs on each broker, agent!
Managed with a configurable database
Accessible with:
• VMware View Administrator
• Direct access (SQL) for other reporting tools
• Powershell
• Vdmadmin provides textual reports (csv or xml)
53
View management pack for SCOM
54
Links & Resources
Documentation, Release Notes http://www.vmware.com/support/pubs/view_pubs.html
• VMware View 4.5 Release Notes
• VMware View Architecture Planning Guide
• VMware View Administrator's Guide
• VMware View Installation Guide
• VMware View Upgrade Guide
• VMware View Integration Guide
Technical Papers http://www.vmware.com/resources/techresources/cat/91,156
• VMware View Optimization Guide for Windows 7 VMware Ensynch 09/27/2010
• Vblock Powered Solutions for VMware View VMware Cisco EMC 09/09/2010
• Virtual Desktop Sizing Guide with VMware View 4.0 and VMware vSphere 4.0 Update1 Mainline 05/21/2010
• Application Presentation to VMware View Desktops with Citrix XenApp VMware 05/20/2010
• PCoIP Display Protocol: Information and Scenario-Based Network Sizing Guide VMware 05/20/2010
• Location Awareness in VMware View 4 VMware 06/15/2010
• VMware View 4 & VMware ThinApp Integration Guide VMware 01/19/2010
• Anti-Virus Deployment for VMware View VMware 01/13/2010
Questions
vSphere Networking Best Practices
David Garcia Jr - Global Support Services
57
Agenda
vSwitches & Portgroups
NIC Teaming
Link Aggregation (802.3ad static mode)
Failover Configuration
Spanning Tree Protocol
Network I/O Control
Load-Based Teaming
VMDirectPath, VMXNET3, FCoE CNA & 10GbE
VLAN Trunking (802.1q)
Tips & Tricks
Troubleshooting Tips
Must Read & KB Links
58
Designing the Network
How do you design the virtual network for performance and availability while maintaining isolation between the various traffic types (e.g. VM traffic, VMotion, and Management)?
• Starting point depends on:
• Number of available physical ports on server
• Required traffic types
• 2 NIC minimum for availability, 4+ NICs
per server preferred
• 802.1Q VLAN trunking highly recommended for logical
scaling (particularly with low NIC port servers)
• Examples are meant as guidance and do not represent strict
requirements in terms of design
• Understand your requirements and resultant traffic types and
design accordingly
59
ESX Virtual Switch: Capabilities
Layer 2 switch—forwards frames based on
48-bit destination MAC address in frame
MAC address known by registration
(it knows its VMs!)—no MAC learning
required
Can terminate VLAN trunks (VST mode) or
pass trunk through to VM (VGT mode)
Physical NICs associated with Switches
NIC teaming (of uplinks)
• Availability: uplink to multiple physical switches
• Load sharing: spread load over uplinks
[Diagram: VM0 and VM1 on a vSwitch; a MAC address is assigned to each vnic.]
60
ESX Virtual Switch: Forwarding Rules
The vSwitch will forward frames
• VM to VM
• VM to uplink
But not forward
• vSwitch to vSwitch
• Uplink to Uplink
ESX vSwitch will not create
loops in the physical network
And will not affect Spanning Tree
(STP) in the physical network
[Diagram: VM0 and VM1 on vSwitches uplinked to physical switches (MAC a, MAC b, MAC c).]
61
Port Group Configuration
A Port Group is a template for one or more ports with a common configuration
• Assigns VLAN to port group members
• L2 Security - select "reject" to see only frames for the VM MAC addr
• Promiscuous mode / MAC address change / Forged transmits
• Traffic Shaping - limit egress traffic from VM
• Load Balancing - Originating Virtual Port ID, Source MAC, IP-Hash, Explicit
• Failover Policy - Link Status & Beacon Probing
• Notify Switches - "yes" gratuitously tells switches of MAC location
• Failback - "yes" if no fear of blackholing traffic, or use Failover Order in "Active Adapters"
Distributed Virtual Port Group (vNetwork Distributed Switch)
• All above plus:
• Bidirectional traffic shaping (ingress and egress)
• Network VMotion—network port state migrated upon VMotion
62
NIC Teaming for Load Sharing & Availability
NIC Teaming aggregates multiple physical
uplinks for:
• Availability—reduce exposure to single points
of failure (NIC, uplink, physical switch)
• Load Sharing—distribute load over multiple
uplinks (according to selected NIC teaming
algorithm)
Requirements:
• Two or more NICs on same vSwitch
• Teamed NICs on same L2 broadcast domain
VM0 VM1
vSwitch
NIC Team
KB - NIC teaming in ESX Server (1004088)
KB - Dedicating specific NICs to portgroups while maintaining NIC teaming and failover for the vSwitch (1002722)
63
NIC Teaming with vDS
Teaming Policies Are Applied in DV Port Groups to dvUplinks
[Diagram: a vDS spanning hosts esx09a, esx09b, esx10a and esx10b (each with a Service Console and vmkernel port). The "Orange" DV Port Group teaming policy is applied to dvUplinks 0-3; each dvUplink maps to a physical vmnic (vmnic0-vmnic3) on every host.]
KB - vNetwork Distributed Switch on ESX 4.x - Concepts Overview (1010555)
64
NIC Teaming Options
Name | Algorithm (vmnic chosen based upon) | Physical Network Considerations
Originating Virtual Port ID | vnic port | Teamed ports in same L2 domain (BP: team over two physical switches)
Source MAC Address | MAC seen on vnic | Teamed ports in same L2 domain (BP: team over two physical switches)
IP Hash* | Hash(SrcIP, DstIP) | Teamed ports configured in static 802.3ad "EtherChannel" (no LACP; needs MEC to span 2 switches)
Explicit Failover Order | Highest order uplink from active list | Teamed ports in same L2 domain (BP: team over two physical switches)
Best Practice: Use Originating Virtual Port ID for VMs
*KB - ESX Server host requirements for link aggregation (1001938)
*KB - Sample configuration of EtherChannel/Link aggregation with ESX and Cisco/HP switches (1004048)
65
Link Aggregation
66
Link Aggregation - Continued
EtherChannel
EtherChannel is Cisco's port trunking (link aggregation) technology, used primarily on Cisco switches
Can be created from between two and eight active Fast Ethernet, Gigabit Ethernet, or 10 Gigabit Ethernet ports
LACP or IEEE 802.3ad
Link Aggregation Control Protocol (LACP) is included in the IEEE specification as a method to control the bundling of several physical ports together to form a single logical channel
Only supported on the Nexus 1000V
EtherChannel vs. 802.3ad
EtherChannel and the IEEE 802.3ad standard are very similar and accomplish the same goal
There are few differences between the two beyond EtherChannel being Cisco proprietary and 802.3ad being an open standard
EtherChannel Best Practice
One-IP-to-one-IP connections over multiple NICs are not supported (a single connection session from Host A to Host B uses only one NIC)
Supported Cisco configuration: EtherChannel Mode ON (enable EtherChannel only)
Supported HP configuration: Trunk Mode
Supported switch aggregation algorithm: IP-SRC-DST (IP-Source-Destination), set as a global policy on the switch
The only load balancing option for a vSwitch or vDistributed Switch that can be used with EtherChannel is IP HASH
Do not use beacon probing with IP HASH load balancing
Do not configure standby uplinks with IP HASH load balancing
67
Failover Configurations
• Link Status Only relies solely on the link status provided by the network adapter
•Detects failures such as cable pulls and physical switch power failures
•Cannot detect configuration errors
•Switch port being blocked by spanning tree
•Switch port configured for the wrong VLAN
•cable pulls on the other side of a physical switch.
• Beacon Probing sends out and listens for beacon probes
• Ethernet broadcast frames sent by physical adapters to detect upstream network connection failures
• Sent on all physical Ethernet adapters in the team, as shown in the figure below
• Detects many of the failures mentioned above that are not detected by link status alone
•Should not be used as a substitute for a redundant Layer 2 network design
•Most useful to detect failures in the closest switch to the ESX Server hosts
•Beacon Probing Best Practice
•Use at least 3 NICs for triangulation
•If only 2 NICs in team, probe can’t determine which link failed
•Shotgun mode results
•KB - What is beacon probing? (1005577)
•KB - ESX host network flapping error when Beacon Probing is selected (1012819)
•KB - Duplicated Packets Occur when Beacon Probing Is Selected Using vmnic and
VLAN Type 4095 (1004373)
•KB - Packets are duplicated when you configure a portgroup or a vSwitch to use a route
that is based on IP-hash and Beaconing Probing policies simultaneously (1017612)
Figure: Using beacons to detect upstream network connection failures.
68
Spanning Tree Protocol (STP) Considerations
Spanning Tree Protocol used to create
loop-free L2 tree topologies
in the physical network
• Some physical links are put in a "blocking" state to construct the loop-free tree
ESX vSwitch does not participate
in Spanning Tree and will not create
loops with uplinks
• ESX uplinks will not block and are always active (full use of all links)
[Diagram: VM0 (MAC a) and VM1 (MAC b) on a vSwitch uplinked to physical switches. The physical switches send BPDUs every 2s to construct and maintain the Spanning Tree topology and block a redundant link; the vSwitch drops BPDUs.]
Recommendations for Physical Network Config:
1. Leave Spanning Tree enabled on the physical network and ESX-facing ports (i.e. leave it as is!)
2. Use "portfast" or "portfast trunk" on ESX-facing ports (puts ports in forwarding state immediately)
3. Use "bpduguard" to enforce the STP boundary
KB - STP may cause temporary loss of network connectivity when a failover or failback event occurs (1003804)
69
ESX 4.1 Introduces Network I/O Control
VMware® vSphere™ 4.1 ("vSphere") introduces a number of enhancements and new features to virtual networking.
• Network I/O Control (NetIOC)—flexibly partition and assure service for ESX/ESXi traffic
types and flows on a vNetwork Distributed Switch (vDS)
• Load-Based Teaming (LBT)—an additional and selectable load-balancing policy on the
vDS to enable dynamic adjustment of the load distribution over a team of NICs
• Network performance—vmkernel TCP/IP stack and guest virtual-machine network
performance enhancements
• Scale—enhancements to network scaling with the vDS
• IPv6 NIST Compliance—IPv6 enhancements to comply with U.S. National Institute of
Standards and Technology (NIST) Host Profile
• Cisco Nexus 1000V Enhancements—support for new features and enhancements on
the Cisco Nexus 1000V
70
Network I/O Control Usage
71
Load-Based Teaming (LBT)
LBT is another traffic-management feature of the vDS introduced with vSphere 4.1. LBT avoids
network congestion on the ESX/ESXi host uplinks caused by imbalances in the mapping of
traffic to those uplinks.
LBT enables customers to optimally use and balance network load over the available physical
uplinks attached to each ESX/ESXi host.
LBT helps avoid situations where one link may be congested, while other links may be relatively
underused.
How LBT works
• LBT dynamically adjusts the mapping of virtual ports to physical NICs to best balance the network load entering or leaving the ESX/ESXi 4.1 host. When LBT detects an ingress or egress congestion condition on an uplink, signified by a mean utilization of 75% or more over a 30-second period, it will attempt to move one or more of the virtual-port-to-vmnic mapped flows to lesser-used links within the team.
Configuring LBT
• LBT is an additional load-balancing policy available within the teaming and failover settings of a dvPortGroup on a vDS. LBT appears as "Route based on physical NIC load."
*LBT is not available on the vNetwork Standard Switch (vSS).
72
VMXNET3—The Para-virtualized VM Virtual NIC
• Next evolution of "Enhanced VMXNET" introduced in ESX 3.5
• Adds
• MSI/MSI-X support (subject to guest operating system kernel support)
• Receive Side Scaling (supported in Windows 2008 when explicitly enabled through
the device's Advanced configuration tab)
• Large TX/RX ring sizes (configured from within the virtual machine)
• High performance emulation mode (Default)
• Supports
• High DMA
• TSO (TCP Segmentation Offload) over IPv4 and IPv6
• TCP/UDP checksum offload over IPv4 and IPv6
• Jumbo Frames
• 802.1Q tag insertion
KB - Choosing a network adapter for your virtual machine (1001805)
73
VMDirectPath for VMs
[Diagram: with VMDirectPath, the guest's device driver talks to the PCI I/O device directly, bypassing the virtualization layer.]
What is it?
Enables direct assignment of PCI devices to VM
Types of workloads
I/O Appliances
High performance VMs
Details
Guest controls the physical H/W
Requirements
vSphere 4
I/O MMU
Used for DMA Address Translation (Guest Physical -> Host Physical) and protection
Generic device reset (FLR, Link Reset, ...)
KB - Configuring VMDirectPath I/O pass-through devices on an ESX host (1010789)
74
FCoE on ESX
VMware ESX Support
• FCoE supported since ESX 3.5u2
• Requires Converged Network Adapters "CNAs" (see HCL), e.g.
• Emulex LP21000 Series
• Qlogic QLE8000 Series
• Appears to ESX as:
• 10GigE NIC
• FC HBA
• SFP+ pluggable transceivers
• Copper twin-ax (<10m)
• Optical
[Diagram: a CNA (Converged Network Adapter) appears to ESX as both a 10GigE NIC attached to a vSwitch and a Fibre Channel HBA; an FCoE switch splits the converged traffic onto Ethernet and Fibre Channel.]
75
Using 10GigE
2x 10GigE common/expected
• 10GigE CNAs or NICs
Possible Deployment Method
• Active/Standby on all Portgroups
• VMs "sticky" to one vmnic
• SC/vmk ports sticky to the other
• Use Ingress Traffic Shaping
to control traffic type per
Port Group
• If FCoE, use Priority Group bandwidth
reservation (on CNA utility)
[Diagram: two 10GigE uplinks carry FCoE plus iSCSI, NFS, VMotion, FT and SC port groups. FCoE Priority Group bandwidth reservation is set in the CNA config utility; ingress (into switch) traffic shaping policies on the port groups control per-traffic-type bandwidth (e.g. 1-2G low, variable/high, 2Gbps+).]
76
Traffic Types on a Virtual Network
Virtual Machine Traffic
• Traffic sourced and received from virtual machine(s)
• Isolate from each other based on service level
VMotion Traffic
• Traffic sent when moving a virtual machine from one ESX host to another
• Should be isolated
Management Traffic
• Should be isolated from VM traffic (one or two Service Consoles)
• If VMware HA is enabled, includes heartbeats
IP Storage Traffic—NFS and/or iSCSI via vmkernel interface
• Should be isolated from other traffic types
Fault Tolerance (FT) Logging Traffic
• Low latency, high bandwidth
• Should be isolated from other traffic types
How do we maintain traffic isolation without proliferating NICs?
77
VLAN Trunking to Server
IEEE 802.1Q VLAN Tagging
• Enables logical network partitioning
(Traffic separation)
• Scale traffic types without scaling physical NICs
• Virtual machines connect to virtual
switch ports (like access ports
on physical switch)
• Virtual switch ports are associated
with a particular VLAN (VST mode)—defined
in PortGroup
• Virtual switch tags packets exiting host
[Diagram: VM0 and VM1 attached to Port Groups "Yellow" (VLAN 10) and "Blue" (VLAN 20) on a vSwitch whose uplinks are VLAN trunks carrying VLANs 10 and 20. The 802.1Q header (TPID 0x8100) carries a 12-bit VLAN ID field (0-4095).]
78
VLAN Tagging Options
VST - Virtual Switch Tagging: VLAN tags applied in the vSwitch; the VLAN is assigned in the Port Group policy. VST is the best practice and most common method.
VGT - Virtual Guest Tagging: VLAN tags applied in the Guest; the PortGroup is set to VLAN "4095".
EST - External Switch Tagging: the external physical switch applies the VLAN tags.
79
VLAN Tagging: Further Example
KB -Sample configuration of virtual switch VLAN tagging (VST Mode) and ESX Server (1004074)
Uplinks A, B, and C connected to trunk ports on physical switch which carry four VLANs
(e.g. VLANs 10, 20, 50, 90)
Ports 1-14 emit untagged frames and receive only frames tagged with their respective VLAN ID (equivalent to an "access port" on a physical switch)
• Port Group VLAN ID set to one of 1-4094
Port 15 emits tagged frames for all VLANs.
• Port Group VLAN ID set to 4095 (for vSS) or "VLAN Trunking" on a vDS DV Port Group
[Diagram: uplinks A, B and C are VLAN trunks carrying VLANs 10, 20, 50 and 90; ports 1-14 are access ports on VLANs 10, 20 and 50, while port 15 trunks all VLANs (10, 20, 50, 90) to the VM.]
interface GigabitEthernet1/2
description host32-vmnic0
switchport trunk encapsulation dot1q
switchport trunk native vlan 999
switchport trunk allowed vlan 10,20,50,90
switchport mode trunk
spanning-tree portfast trunk
Example configuration on the physical switch
80
Private VLANs: Traffic Isolation for Every VM
Solution: PVLAN
• Place VMs on the same virtual network
but prevent them from communicating
directly with each other (saves VLANs!)
• Avoids scaling issues from assigning
one VLAN and IP subnet per VM
Details
• Instead, configure a SINGLE DV port
group to have a SINGLE isolated*
VLAN (ONLY ONE)
• Attach all your VMs to this SINGLE
isolated VLAN DV port group
[Diagram: a Distributed Switch with PVLAN provides private VLAN traffic isolation between guest VMs, with a common primary VLAN on the uplinks.]
KB - Private VLAN (PVLAN) on vNetwork Distributed Switch - Concept Overview (1010691)
81
[Diagram: twelve VMs on a vNetwork Distributed Switch, each in its own port group: total cost 12 VLANs (one per VM). The same twelve VMs on a single DV port group with an isolated PVLAN: total cost 1 PVLAN (over 90% savings).]
Private VLANs - Continued
82
Tips & Tricks
• KB - Changing a MAC address in a Windows virtual machine (1008473)
• When a physical machine is converted into a virtual machine, the MAC address of the network adapter is
changed. This can pose a problem when software is installed where the licensing is tied to the MAC
address.
• KB – Configuring speed and duplex of an ESX Server host network adapter (1004089)
• ESX recommended settings for Gigabit Ethernet speed and duplex when connecting to a physical switch port are as follows:
• Auto Negotiate <-> Auto Negotiate
• It is not recommended to mix hard-coded settings with auto-negotiate.
• KB - Sample Configuration - Network Load Balancing (NLB) Multicast mode over routed subnet -
Cisco Switch Static ARP Configuration (1006525)
• NLB Multicast Mode – Static ARP Resolution
• Since NLB packets are unconventional (the IP address is unicast while the MAC address is multicast), switches and routers drop NLB packets
• NLB Multicast Packets get dropped by routers and switches, causing the ARP tables of switches to not get
populated with cluster IP and MAC address
• Manual ARP Resolution of NLB cluster address is required on physical switch and router interfaces
• Cluster IP and MAC static resolution is set on each switch port that connects to ESX host
83
Troubleshooting Tips
84
Troubleshooting with Esxtop
85
Esxtop Traffic
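A minimal sketch of reaching the view shown in these screenshots (keys and counter names from ESX 4.x esxtop):
esxtop    # press n for the network view; watch PKTTX/s and PKTRX/s per port, and %DRPTX / %DRPRX for dropped packets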
86
Capturing Traffic
87
ESX tcpdump
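A sketch of a service console capture with tcpdump (vswif0 is an example interface name; see KB 1004090 later in this deck. For VM traffic, the usual approach is a promiscuous port group plus Wireshark in a VM, as on the next slide):
tcpdump -i vswif0 -s 1514 -w /tmp/capture.pcap    # capture full frames on the service console interface to a file
tcpdump -i vswif0 host 10.0.0.50 and port 902     # live filter, e.g. management traffic from one vCenter/client IP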
88
Wireshark in a VM
89
Must Read… http://www.vmware.com/technical-resources/virtual-networking/
Conclusion
This study compares performance results for e1000 and
vmxnet virtual network devices on 32-bit and 64-bit guest
operating systems using the netperf benchmark. The results
show that when a virtual machine is running with software
virtualization, e1000 is better in some cases and vmxnet is
better in others. Vmxnet has lower latency, which sometimes
comes at the cost of higher CPU utilization. When hardware
virtualization is used, vmxnet clearly provides the best
performance.
Conclusion
VMXNET3, the newest generation of virtual network adapter from
VMware, offers performance on par with or better than its previous
generations in both Windows and Linux guests. Both the driver
and the device have been highly tuned to perform better on
modern systems. Furthermore, VMXNET3 introduces new
features and enhancements, such as TSO6 and RSS. TSO6
makes it especially useful for users deploying applications that
deal with IPv6 traffic, while RSS is helpful for deployments
requiring high scalability. All these features give VMXNET3
advantages that are not possible with previous generations of
virtual network adapters. Moving forward, to keep pace with an
ever‐increasing demand for network bandwidth, we recommend
customers migrate to VMXNET3 if performance is of top concern
to their deployments.
Technical Papers
90
KB Links
• KB - Cisco Discovery Protocol (CDP) network information via command line and VirtualCenter on an
ESX host (1007069)
• Utilizing Cisco Discovery protocol (CDP) to get switch port configuration information.
• This command is utilized to troubleshoot network connectivity issues related to VLAN tagging methods on
virtual and physical port settings.
• KB - Troubleshooting network issues with the Cisco show tech-support command (1015437)
• If you experience networking issues between vSwitch and physical switched environment, you can obtain
information about the configuration of a Cisco router or switch by running the show tech-support command
in privileged EXEC mode.
• Note: This command does not alter the configuration of the router.
• KB - ESX host or virtual machines have intermittent or no network connectivity (1004109)
• KB - Troubleshooting Nexus 1000V vDS network issues (1014977)
• KB - Cisco Nexus 1000V installation and licensing information (1013452)
• Cisco Nexus 1000V Troubleshooting Guide, Release 4.0(4)SV1(2) 20/Jan/2010
• Cisco Nexus 1000V Troubleshooting Guide, Release 4.0(4)SV(1) 21/Jan/2010
• KB - Configuring promiscuous mode on a virtual switch or portgroup (1004099)
• KB - Troubleshooting network issues by capturing and sniffing network traffic via tcpdump (1004090)
91
KB Links - Continued
• KB - Troubleshooting network connection issues using Address Resolution Protocol (ARP)
(1008184)
• IEEE OUI and Company id Assignments http://standards.ieee.org/regauth/oui/index.shtml
• KB - Network performance issues (1004087)
• KB - Low Network Throughput in Windows Guest when Running UDP Application (5298153)
• KB - Performance of Outgoing UDP Packets Is Poor (10172)
• KB - Poor Network File Copy performance between local VMFS and shared VMFS (1003554)
• KB - Cannot connect to ESX 4.0 host for 30-40 minutes after boot (1012942)
• Ensure that DNS is configured and reachable from the ESX host
• KB - Identifying issues with and setting up name resolution on ESX Server (1003735)
• Note: localhost must always be present in the hosts file. Do not modify or remove the entry for localhost
• The hosts file must be identical on all ESX Servers in the cluster
• There must be an entry for every ESX Server in the cluster
• Every host must have an IP address, Fully Qualified Domain Name (FQDN), and short name
• The hosts file is case sensitive. Be sure to use lowercase throughout the environment
Questions
ESXi Readiness
Planning your migration to VMware ESXi, the next-generation hypervisor
architecture.
David Garcia Jr - Global Support Services
94
The Gartner Group says…
"The major benefit of ESXi is the fact that it is more lightweight - under 100MB versus 2GB for VMware ESX with the service console."
"Smaller means fewer patches"
"It also eliminates the need to manage a separate Linux console (and the Linux skills needed to manage it)…"
As of August 2010: "VMware users should put a plan in place to migrate to ESXi during the next 12 to 18 months."
95
VMware ESXi Hypervisor Architecture
• Code base disk footprint: <100 MB
• VMware agents ported to run directly on VMkernel
• Authorized 3rd party modules can also run in VMkernel to provide hw monitoring and drivers
• Other capabilities necessary for integration into an enterprise datacenter are provided natively
• No other arbitrary code is allowed on the system
VMware ESX Hypervisor Architecture
• Code base disk footprint: ~2 GB
• VMware agents run in Console OS
• Nearly all other management functionality provided by agents running in the Console OS
• Users must log into Console OS in order to run commands for configuration and diagnostics
VMware ESXi and ESX hypervisor architectures comparison
96
Call to action for customers
Start testing ESXi
• If you've not already deployed, there's no better time than the present
Ensure your 3rd party solutions are ESXi Ready
• Monitoring, backup, management, etc. Most already are.
• Bid farewell to agents!
Familiarize yourself with ESXi remote management options
• Transition any scripts or automation that depended on the COS
• Powerful off-host scripting and automation using vCLI, PowerCLI, …
Plan an ESXi migration as part of your vSphere upgrade
• Testing of ESXi architecture can be incorporated into overall vSphere
testing
97
Visit the ESXi and ESX Info Center today
http://vmware.com/go/ESXiInfoCenter
Questions
Break
vSphere 4 - Performance Best Practices
Kenneth Kemp, Escalation Engineer
101
Agenda
Technical Guides
ESX 4.x Performance & Troubleshooting
• Memory
• CPU
vCenter Performance & Troubleshooting
• High Availability
• Distributed Resource Scheduler
• Fault Tolerance
• Resource Pool Designs
• HW Considerations and Settings
102
Technical Guides
Memory
104
Memory – Resource Types
When assigning a VM a "physical" amount of RAM, all you are really doing is telling ESX how much memory a given VM process will maximally consume, beyond its overhead.
Whether or not that memory is backed by physical RAM depends on a few factors: host configuration, DRS shares/limits/reservations, and host load.
Generally speaking, it is better to OVER-commit than UNDER-commit.
105
Memory – Overhead & Reclamation
ESX memory space overhead
Service Console: 272 MB
VMkernel: 100 MB+
Per-VM memory space overhead increases with:
Number of VCPUs
Size of guest memory
32 or 64 bit guest OS
ESX memory space reclamation
Page sharing
Ballooning
106
Memory – Page Tables
Page tables
ESX cannot use guest page tables
ESX Server maintains shadow page tables
Translate memory addresses from virtual to machine
Per process, per VCPU
VMM maintains physical (per VM) to machine maps
No overhead from "ordinary" memory references
Overhead
Page table initialization and updates
Guest OS context switching
(Address translation: VA -> PA -> MA)
107
Memory – Over-commitment & Sizing
Avoid high active host memory over-commitment
• Total memory demand = active working sets of all VMs + memory overhead - page sharing
• No ESX swapping: total memory demand < physical memory
Right-size guest memory
• Define adequate guest memory to avoid guest swapping
• Per-VM memory space overhead grows with guest memory
108
Memory – NUMA considerations
Increasing a VM's memory on a NUMA machine
Will eventually force some memory to be allocated from a remote node, which
will decrease performance
Try to size the VM so both CPU and memory fit on one node
109
Memory – NUMA considerations continued
NUMA scheduling and memory placement policies in ESX manage all VMs transparently
No need to manually balance virtual machines between nodes
NUMA optimizations available when node interleaving is disabled
Manual override controls available
Memory placement: 'use memory from nodes'
Processor utilization: 'run on processors'
Not generally recommended
For best performance of VMs on NUMA systems
# of VCPUs + 1 <= # of cores per node
VM memory <= memory of one node
110
ESX must balance memory usage for all worlds
• Virtual machines, Service Console, and vmkernel consume memory
• Page sharing to reduce memory footprint of Virtual Machines
• Ballooning to relieve memory pressure in a graceful way
• Host swapping to relieve memory pressure when
ballooning insufficient
ESX allows overcommitment of memory
• Sum of configured memory sizes of virtual machines can be greater than
physical memory if working sets fit
Memory – Balancing & Overcommitment
111
Ballooning: Memctl driver grabs pages and gives to ESX
• The guest OS chooses pages to give to memctl (avoiding "hot" pages if possible): either free pages or pages to swap
• Unused pages are given directly to memctl
• Pages to be swapped are first written to swap partition within guest OS and then given to
memctl
[Diagram: ESX asks the memctl driver in VM1 to (1) balloon; the guest hands over free pages or first writes pages to its swap partition within the guest OS; ESX then (2) reclaims the pages and (3) redistributes them to VM2.]
Memory - Ballooning
112
Swapping: ESX reclaims pages forcibly
• Guest doesn't pick pages… ESX may inadvertently pick "hot" pages (possible VM performance implications)
• Pages written to VM swap file
[Diagram: ESX (1) force swaps pages from VM1 into the VSWP file (external to the guest, unlike the swap partition within the guest), then (2) reclaims and (3) redistributes the memory to VM2.]
Memory - Swapping
113
Bottom line:
• Ballooning may occur even when no memory pressure just to keep memory
proportions under control
• Ballooning is vastly preferable to swapping
• The guest can surrender unused/free pages
• With host swapping, ESX cannot tell which pages are unused or free and may accidentally pick "hot" pages
• Even if the balloon driver has to swap to satisfy the balloon request, the guest chooses what to swap
• It can avoid swapping "hot" pages within the guest
Memory – Ballooning vs. Swapping
114
If running VMs consume too much host memory…
• Some VMs do not get enough host memory
• This forces either ballooning or host swapping to satisfy VM demands
• Host swapping or excessive ballooning reduces VM performance
If I do not size a VM properly (e.g., create Windows VM with 128MB
RAM)
• Within the VM, swapping occurs, resulting in disk traffic
• VM may slow down
• But…don’t make memory too big! (High overhead memory)
Memory – Ok, So Why Do I Care About Memory Usage?
115
One rule of thumb: > 1MB/s swap in or swap out rate may
mean memory overcommitment
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Swap in rate (ESX 4.0 hosts) | SWR/s | mem.swapinRate.average | Rate at which memory is swapped in from disk
Swap out rate (ESX 4.0 hosts) | SWW/s | mem.swapoutRate.average | Rate at which memory is swapped out to disk
Swapped | SWCUR | mem.swapped.average (level 2 counter) | ~swap out - swap in
Swap in (cumulative) | n/a | mem.swapin.average | Memory swapped in from disk
Swap out (cumulative) | n/a | mem.swapout.average | Memory swapped out to disk
Memory - Important Memory Metrics (Per VM)
116
One rule of thumb: > 1MB/s swap in or swap out rate may
mean memory overcommitment
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Swap in rate (ESX 4.0 hosts) | SWR/s | mem.swapinRate.average | Rate at which memory is swapped in from disk
Swap out rate (ESX 4.0 hosts) | SWW/s | mem.swapoutRate.average | Rate at which memory is swapped out to disk
Swap used | SWCUR | mem.swapused.average (level 2 counter) | ~swap out - swap in
Swap in (cumulative) | n/a | mem.swapin.average | Memory swapped in from disk
Swap out (cumulative) | n/a | mem.swapout.average | Memory swapped out to disk
Memory - Important Memory Metrics (Per Host, sum of VMs)
117
No swapping
Lots of swapping
Increased swap activity may be a sign of over-commitment
Memory - vSphere Client: Swapping on a Host
118
No swapping
Lots of swapping
Memory - A Stacked Chart (per VM) of Swapping
119
Overview Page
• Balloon
• Active
• Swap used
• Granted
• Shared common
Memory - Counters Shown in vSphere Client: Host
120
Overview Page
• Balloon target (how
much should be
ballooned)
• Swapped (~swap out –
swap in)
• Shared
• Balloon
• Active
Memory - Counters Shown in vSphere Client: VM
121
• Main page shows host memory usage (consumed + overhead memory +
Service Console)
Data refreshed at 20s intervals
Memory - Other Counters Shown in vSphere Client
122
Host CPU: Avg. CPU utilization for Virtual machine
Host Memory: consumed + overhead memory for Virtual Machine
Guest Memory: active memory for guest
Note: This page is updated once per minute
Memory - Counters Shown on VM List Summary Tab
123
Overhead
consumed
Overhead reserved
Private (non-shared)
Shared (content-based
page-sharing)
Active used as input to DRS
Unaccessed = unmapped (~never been touched)
Host
Guest
Memory - Breakdown in a VM
124
Metric | Description
Memory Active (KB) | Physical pages touched recently by a virtual machine
Memory Usage (%) | Active memory / configured memory
Memory Consumed (KB) | Machine memory mapped to a virtual machine, including its portion of shared pages. Does NOT include overhead memory.
Memory Granted (KB) | VM physical pages backed by machine memory. May be less than configured memory. Includes shared pages. Does NOT include overhead memory.
Memory Shared (KB) | Physical pages shared with other virtual machines
Memory Balloon (KB) | Physical memory ballooned from a virtual machine
Memory Swapped (KB) (ESX 4.0: swap rates!) | Physical memory in swap file (approx. "swap out - swap in"). Swap out and swap in are cumulative.
Overhead Memory (KB) | Machine pages used for virtualization
Memory - Virtual Machine Memory Metrics, vSphere Client
125
Metric | Description
Memory Active (KB) | Physical pages touched recently by the host
Memory Usage (%) | Active memory / configured memory
Memory Consumed (KB) | Total host physical memory - free memory on host. Includes overhead and Service Console memory.
Memory Granted (KB) | Sum of memory granted to all running virtual machines. Does NOT include overhead memory.
Memory Shared (KB) | Sum of memory shared for all running VMs
Shared common (KB) | Total machine pages used by shared pages
Memory Balloon (KB) | Machine pages ballooned from virtual machines
Memory Swap Used (KB) (ESX 4.0: swap rates!) | Physical memory in swap files (approx. "swap out - swap in"). Swap out and swap in are cumulative.
Overhead Memory (KB) | Machine pages used for virtualization
Memory - Host Memory Metrics, vSphere Client
126
Callouts on the esxtop memory screen:
• Swapping
• MCTL: N - balloon driver not active; VMware Tools probably not installed
• Memory hog VMs
• Swapped in the past, but not actively swapping now
• More swapping, since the balloon driver is not active
• Ballooning active
Memory - Troubleshooting Memory Problems with Esxtop
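A minimal sketch of reaching this esxtop view (keys and counter names from ESX 4.x):
esxtop    # press m for the memory view; check MCTLSZ (balloon size), SWCUR, SWR/s and SWW/s per VM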
CPU
128
CPU - Resource Types
CPU resources are the raw processing speed of a given host or
VM
However, on a more abstract level, we are also bound by the hosts' ability to schedule those resources.
We also have to account for running a VM in the most optimal fashion, which typically means running it on the same processor that the last cycle completed on.
129
CPU – SMP Performance
Some multi-threaded apps in an SMP VM may not perform well
Use multiple UP VMs on a multi-CPU physical machine
[Diagram: one SMP VM vs. multiple UP VMs, each on an ESX Server host.]
130
CPU - Performance Overhead & Utilization
CPU virtualization adds varying amounts of overhead
Little or no overhead for the part of the workload that can run in direct
execution
Small to significant overhead for virtualising sensitive privileged instructions
Performance reduction vs. increase in CPU utilization
CPU-bound applications: any CPU virtualization overhead results in reduced
throughput
non-CPU-bound applications: should expect similar throughput at higher CPU
utilization
131
CPU – VM vCPU Processor Support
ESX supports up to eight virtual processors per VM
• Use UP VMs for single-threaded applications
• Use UP HAL or UP kernel
• For SMP VMs, configure only as many VCPUs as needed
• Unused VCPUs in SMP VMs:
• Impose unnecessary scheduling constraints on ESX Server
• Waste system resources (idle looping, process migrations, etc.)
132
CPU – 64-bit Performance
Full support for 64-bit guests
64-bit can offer better performance than 32-bit
• More registers, large kernel tables, no HIGHMEM issue in Linux
ESX Server may experience performance problems due to shared
host interrupt lines
• Can happen with any controller; most often with USB
• Disable unused controllers
• Physically move controllers
• See KB 1290 for more details
133
CPU – Virtual Machine Worlds
ESX is designed to run Virtual Machines
Schedulable entity = "world"
• Virtual Machines are composed of worlds
• Service Console is a world (has agents like vpxa, hostd)
• Helper Worlds
ESX uses proportional-share scheduler to help with resource management
• Limits
• Shares
• Reservations
Balanced interrupt processing
134
CPU – ESX CPU Scheduling
World states (simplified view):
• ready = ready-to-run but no physical CPU free
• run = currently active and running
• wait = blocked on I/O
Multi-CPU Virtual Machines => variant of gang scheduling called 'relaxed co-scheduling'
• Co-run (latency to get vCPUs running)
• Co-stop (time in "stopped" state)
135
One common issue is high CPU ready time
• High ready time => possible contention for CPU resources among VMs
• Many possible reasons
• CPU overcommitment (high %rdy + high %used)
• Workload variability
• Limit set on VM
• No fixed threshold, but > 20% for a vCPU => investigate further
CPU - So, How Do I Spot CPU Performance Problems?
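A minimal sketch of checking these counters in esxtop (keys from ESX 4.x):
esxtop    # press c for the CPU view, then e to expand a VM's worlds; watch %USED, %RDY and %MLMTD per vCPU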
136
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Usage (%) | %USED | cpu.usage.average | CPU used over the collection interval (%)
Usage (MHz) | n/a | cpu.usagemhz.average | CPU used over the collection interval (MHz)
CPU: Useful Metrics Per-HOST
137
Per-VM
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Usage (%) | %USED | cpu.usage.average | CPU used over the collection interval
Used (ms) | %USED | cpu.used.summation | CPU used over the collection interval*
Ready (ms) | %RDY | cpu.ready.summation | CPU time spent in ready state*
Swap wait time (ms) [ESX 4.0 hosts] | %SWPWT | cpu.swapwait.summation | CPU time spent waiting for host-level swap-in
* Units differ between esxtop and the vSphere client
CPU: Useful Metrics Per-VM
138
Note CPU milliseconds and percent are on the same chart but use different axes
CPU - vSphere Client CPU Screenshot Hint
139
• 2-CPU box, but 3 active VMs (high %used)
• High %rdy + high %used can imply CPU overcommitment
CPU - Spotting CPU Overcommitment in esxtop
140
• Used time ~ ready time:
may signal contention.
However, might not be
overcommitted due to
workload variability
• In this example, we have
periods of activity and idle
periods: CPU isn’t
overcommitted all the time
[Chart annotations: used time; ready time ~ used time; ready time < used time.]
CPU - Spotting Workload Variability in the vSphere Client
141
High Ready Time / High MLMTD: there is a limit on this VM…
High ready time is not always because of overcommitment
CPU - High Ready Time Due to Limits Set on VM: esxtop
142
Limit on CPU
High ready time
CPU - High Ready Time Due to Limits: vSphere Client
143
Ready time jump from 12.5% (idle DB) to 20% (busy DB) - didn't notice until responsiveness suffered!
CPU - Ready Time: Why There is no Fixed Threshold…
144
CPU overcommitment
• Possible solution: add more CPUs or VMotion the VM
Workload variability
• A bunch of VMs wake up all at once
• Note: system may be mostly idle: not always overcommitted
Limit set on VM
• 4x2GHz host, 2 vcpu VM, limit set to 1GHz (VM can consume 1GHz)
• Without limit, max is 2GHz. With limit, max is 1GHz (50% of 2GHz)
• CPU all busy: %USED: 50%; %MLMTD & %RDY = 150% [total is 200%, or 2 CPUs]
CPU - Summary of Possible Reasons for High Ready Time
vCenter
146
vCenter - Best Practices
VC Database sizing
Estimate of the space required to store your performance statistics in the DB
Separate Critical Files onto Separate Drives
Make sure the database and transaction log files are placed on separate
physical drives
Place the tempdb database on a separate physical drive if possible
Arrangement distributes the I/O to the DB and dramatically improves its
performance
If a third drive is not feasible, place the tempdb files on the transaction log drive
Enable Automatic Statistics
Keep vCenter logging level low, unless troubleshooting
Proper scheduling of DB backups, maintenance, monitoring
Do not run vCenter on a server that has many applications running
vCenter Heartbeat - http://www.vmware.com/products/vcenter-server-
heartbeat/
147
vCenter - Performance
High CPU utilization and sluggish UI performance
Number of clients attached is high
VC needs to keep clients consistent with inventory changes
Aggressive alarm settings
DB administration
Periodic maintenance
Recovery and log settings
Appropriate VC statistics level
Use gigabit NICs for the service console to clone VMs
Assign permissions appropriately
SQL Server Express will only run well up to 5 hosts and/or 50 VMs. Past that, VC needs to run off an Enterprise-class DB.
148
vCenter - High Availability (HA)
HA network configuration check – DNS, NTP, lowercase hostnames, HA advanced settings
Redundancy: server hardware, shared storage, network, management
Test network isolation from a core switch level, and host failure for expected outage behavior
Critical VMs should NOT be grouped together
Categorize VM criticality, then set the failover appropriately
Valid VM network label names required for proper failover
Failover capacity/Admission control may be too conservative when host and VM sizes vary widely – slot size calculator in VC
149
vCenter - DRS (Distributed Resource Scheduler)
Higher number of hosts => more DRS balancing options
Recommend up to 32 hosts/cluster, may vary with VC server configuration and VM/host ratio
Network configuration on all hosts - VMotion network: Security policies, VMotion NIC enabled, Gig
Reservations, Limits, and Shares
- Shares take effect during resource contention
- Low limits can lead to wasted resources
- High VM reservations may limit DRS balancing
- Overhead memory
- Use resource pools for better manageability, do not nest too deep
Virtual CPUs and memory size
High memory size and virtual CPU count => fewer migration opportunities
Configure VMs based on need (network, etc.)
150
vCenter - DRS (Cont.)
Ensure hosts are CPU compatible
- Intel vs. AMD
- Similar CPU family/features
- Consistent server bios levels, and NX bit exposure
- Enhanced VMotion Compatibility (EVC)
- "VMware VMotion and CPU Compatibility" whitepaper
- CPU incompatibility => limited DRS VM migration options
Larger Host CPU and memory size preferred for VM placement (if all equal)
Differences in cache or memory architecture => inconsistency in performance
Aggressiveness threshold - Moderate threshold (default) works well for most cases
Aggressive thresholds recommended if homogenous clusters and VM demand relatively
constant and few affinity/anti-affinity rules
Use affinity/anti-affinity rules only when needed
Affinity rules: closely interacting VMs Anti-affinity rules: I/O intensive workloads, availability
Automatic DRS mode recommended (cluster-wide)
Manual/Partially automatic mode for location-critical VMs (per VM)
Per VM setting overrides cluster-wide setting
151
This design is simple and does not limit any VMs from any
physical resources. Using the ESX shares mechanism, if two
or more VMs are competing for the same physical resources
the tug of war that results will be decided by the resource pool
memberships of the VMs.
The ESX cluster will have three resource pools defined.
• A "High" resource pool will have no initial reservation and unlimited/expandable RAM and CPU settings. CPU and Memory shares will be set to high. This resource pool will be devoted to mission-critical VMs.
• A second "Normal" resource pool will have no initial reservation and unlimited/expandable RAM and CPU settings. CPU and Memory shares will be set to normal.
vCenter – Resource Pool Tug of War Design
152
This design takes the sum total of all physical resources and
slices it up across the resource pools. Although the following
design only uses two resource pools, many more "slices" could
be created. The most basic Pizza Design would be to reserve
all memory and cpu, but the following example helps also
illustrate reservations and limits.
The ESX cluster will have two resource pools defined.
• A "Critical Services" resource pool will have an initial reservation of 32GB RAM and 8GHz CPU, and unlimited/expandable RAM and CPU settings. This resource pool will be devoted to mission-critical VMs. Shares for RAM will be set to high, but shares for CPU will be set to normal.
vCenter – Resource Pool Pizza Design
153
vCenter - FT - Fault Tolerance
FT Provides complete VM redundancy
By definition, FT doubles resource requirements
Turning on FT disables performance-enhancing features like H/W MMU
Each time FT is enabled, it causes a live migration
Use a dedicated NIC for FT traffic
Place primaries on different hosts
Asynchronous traffic patterns
Host Failure considerations
Run FT on machines with similar characteristics
154
vCenter - HW Considerations and Settings
When purchasing new servers, target MMU virtualization (EPT/RVI) processors, or at least CPU virtualization (VT-x/AMD-V), depending on your application workloads
If your application workload creates/destroys a lot of processes or allocates a lot of memory, then MMU virtualization will help performance
Purchase uniform, high-speed, quality memory, and populate memory banks evenly in powers of 2
When choosing a system for better I/O performance, MSI-X is needed; it allows multiple queues across multiple processors to process I/O in parallel
PCI slot configuration on the motherboard should support PCIe 2.0 if you intend to use 10 Gb cards, otherwise you will not utilize the full bandwidth
155
vCenter - HW Considerations and Settings (cont.)
BIOS Settings
- Make sure what you paid for is enabled in the BIOS
- Enable "Turbo Mode" if your processors support it
- Verify that hyper-threading is enabled - more logical CPUs allow more options for the VMkernel scheduler
- On NUMA systems, verify that node interleaving is disabled, so the NUMA topology is exposed to ESX (see the NUMA slides earlier)
- Be sure to disable power management if you want to maximize performance, unless you are using DPM. Decide whether performance outweighs power savings
- C1E halt state - this causes parts of the processor to shut down for a short period of time in order to save energy and reduce thermal loss
- Verify VT/NPT/EPT are enabled, as older Barcelona systems do not enable these by default
- Disable any unused USB or serial ports
156
Reference Guide Links
VMware vCenter Server Performance and Best Practices for vSphere 4.1
http://www.vmware.com/resources/techresources/10145
Performance Best Practices for VMware vSphere® 4.0
http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.0.pdf
SAN System Design and Deployment Guide
http://www.vmware.com/files/pdf/techpaper/SAN_Design_and_Deployment_Guide.pdf
VMware vSphere: The CPU Scheduler in VMware ESX 4.1
http://www.vmware.com/files/pdf/techpaper/VMW_vSphere41_cpu_schedule_ESX.pdf
157
Reference Guide Links Continued…
Understanding Memory Resource Management in VMware ESX 4.1
http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_memory_mgmt.pdf
Managing Performance Variance of Applications Using Storage I/O Control
http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_SIOC.pdf
What's New in VMware® vSphere™ 4.1 - Networking
http://www.vmware.com/files/pdf/techpaper/VMW-Whats-New-vSphere41-Networking.pdf
VMware® Network I/O Control: Architecture, Performance and Best Practices VMware vSphere™ 4.1
http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf
Designing Resource Pools
http://vmetc.com/2008/03/04/designing-esx-resource-pools/
Questions
Wrap Up/Raffle Drawing