VMware Customer Support Day
November 16, 2010
2
Agenda
9:30 AM - Welcome/Kick-Off
Bob Good, Manager, Systems Engineering
9:40 AM - Support Engagement
Laura Ortman, Director, Global Support Services (GSS)
10:00 AM - Storage Best Practices
Ken Kemp, Escalation Engineer
11:00 AM - Keynote – VMware Virtualization and Cloud Management
Doug Huber, Director, Systems Engineering
12:00 PM - Lunch/Q&A with the experts (Group A) /VMware Express – Private Viewing (Group B)
1:00 PM - Lunch/Q&A with the experts (Group B) / VMware Express – Private Viewing (Group A)
2:00 PM - View 4.5 Overview/Network Best Practices
David Garcia, Release Readiness Manager
3:15 PM - Break
3:30 PM - vSphere Performance Best Practices
Ken Kemp, Escalation Engineer
4:15 PM - Wrap Up/Raffle Drawing
Interactive Session
Storage Best Practices
Ken Kemp – Escalation Engineer, Global Support Services
4
Agenda
Performance
SCSI Reservations
Performance Monitoring
• esxtop
Common Storage Issues
• Snapshot LUNs
• Virtual Machine Snapshots
• iSCSI Multi-pathing
• All Paths Dead (APD)
5
Disk subsystem bottlenecks cause more performance problems
than CPU or RAM deficiencies
Your disk subsystem is considered to be performing poorly if it is
experiencing:
• Average read and write latencies greater than 20 milliseconds
• Latency spikes greater than 50 milliseconds that last for more than a few seconds
Performance
6
Performance vs. Capacity comes into play at two main levels
• Physical drive size
• Hard disk performance doesn’t scale with drive size
• In most cases the larger the drive the lower the performance.
• LUN size
• Larger LUNs increase the number of VMs, which can lead to contention on that particular LUN
• LUN size is often related to physical drive size, which can compound performance problems
Performance vs. Capacity
7
You need 1 TB of space for an application
• 2 x 500GB 15K RPM SAS drives = ~300 IOPS
• Capacity needs satisfied, performance low
• 8 x 146GB 15K RPM SAS drives = ~1,168 IOPS
• Capacity needs satisfied, performance high
Performance – Physical Drive Size
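A rough sanity check for the numbers above, assuming a single 15K RPM SAS spindle delivers on the order of 146-150 IOPS (an assumption; actual figures vary by drive and workload): 2 drives x ~150 IOPS = ~300 IOPS, while 8 drives x ~146 IOPS = ~1,168 IOPS. Spreading the same capacity across more, smaller spindles yields roughly four times the throughput.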
8
SCSI Reservations – when an initiator requests/reserves exclusive use of a target (LUN)
• VMFS is a clustered file system
• Uses SCSI reservations to protect metadata
• To preserve the integrity of VMFS in multi host deployments
• One host has exclusive access to the LUN while the reservation is held
• A reboot or a release command will clear the reservation
• The virtual machine monitor uses SCSI-2 reservations
SCSI Reservations – Why?
9
What causes SCSI Reservations?
• When a VMDK is created, deleted, placed in REDO mode, has a snapshot (delta) file, is migrated (reservations come from both the source and the target ESX host), or when the VM is suspended (since a suspend file is written)
• When VMDK is created via a template, we get SCSI reservations on the source and target
• When a template is created from a VMDK, SCSI reservation is generated
SCSI Reservations
10
• Simplify/verify deployments so that virtual machines do not span more than one LUN
• This will ensure SCSI reservations do not impact more than one LUN
• Determine if any operations are occurring on a LUN on which you want to perform another operation
• Snapshots
• VMotion
• Template Deployment
• Use a single ESX server as your deployment server to limit/prevent conflicts with other ESX servers attempting to perform similar operations
SCSI Reservation Best Practice
11
• Inside vCenter, limit access to actions that initiate reservations to administrators who understand the effects of reservations to control WHO can perform such operations
• Schedule virtual machine reboots so that only one LUN is impacted at any given time
• A power on and a power off are considered separate operations, and both will create a reservation
• VMotion
• Use care when scheduling backups. Consult the backup provider best practices information
• Use care when scheduling Anti Virus scans and updates
SCSI Reservation Best Practice - Continued
12
• Monitoring /var/log/vmkernel for:
• 24/0 0x0 0x0 0x0
• SYNC CR messages
• In a shared environment like ESX there will be some SCSI reservations; this is normal. Seeing hundreds of them is not (see the grep example below).
• Check for Virtual Machines with snapshots
• Check for HP management agents still running the storage agent
• Check LUN presentation for Host mode settings
• Call VMware support to dig into it further
SCSI Reservation Monitoring
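A quick way to spot these from the ESX service console (a minimal sketch; the exact message strings vary by ESX version):
grep -ic "reservation conflict" /var/log/vmkernel   # count SCSI reservation conflicts
grep -c "SYNC CR" /var/log/vmkernel                 # count SYNC CR messages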
© 2009 VMware Inc. All rights reserved
Storage Performance Monitoring
Ken Kemp – Escalation Engineer, Global Support Services
14
esxtop
15
DAVG = raw response time from the device
KAVG = time spent in the VMkernel (i.e. virtualization overhead)
GAVG = response time as perceived by the virtual machine
DAVG + KAVG = GAVG
esxtop - Continued
16
esxtop - Continued
17
esxtop - Continued
18
• What are correct values for these response times?
• As with all things revolving around performance, it is subjective
• Obviously, the lower these numbers are the better
• ESX will continue to function with nearly any response time; how well it functions is another issue
• Any command that is not acknowledged by the SAN within 5000 ms (5 seconds) will be aborted. This is where perceived disk performance takes a sharp dive (see the esxtop example below)
esxtop - Continued
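To watch these latencies live, esxtop can be run interactively or in batch mode (a minimal sketch; keys are from ESX 4.x esxtop):
esxtop                                   # press d (adapter), u (device) or v (VM) to see DAVG, KAVG and GAVG
esxtop -b -d 5 -n 60 > /tmp/esxtop.csv   # batch mode: 5-second samples, 60 iterations, for offline analysis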
© 2009 VMware Inc. All rights reserved
Common Storage Issues
Ken Kemp – Escalation Engineer, Global Support Services
20
How is a LUN detected as a snapshot in ESX?
• When an ESX 3.x server finds a VMFS-3 LUN, it compares the SCSI_DiskID information returned from the storage array with the SCSI_DiskID information stored in the LVM Header.
• If the two IDs do not match, the VMFS-3 volume is not mounted.
A VMFS volume on ESX can be detected as a snapshot for a number of reasons:
• LUN ID change
• SCSI version supported by array changed (firmware upgrade)
• Identifier type changed – Unit Serial Number vs NAA ID
Snapshot LUNs
21
Resignaturing Methods
ESX 3.5
Enable LVM resignaturing on the first ESX host:
Configuration > Advanced Settings > LVM > set LVM.EnableResignature to 1
ESX 4
Single Volume Resignaturing
Configuration > Storage > Add Storage > Disk / LUN
Select Volume to Resignature > Select Mount, or Resignature
Snapshot LUNs - Continued
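The same operations are also available from the command line (a minimal sketch; on ESX 4 the vCLI equivalent is vicfg-volume):
esxcfg-advcfg -s 1 /LVM/EnableResignature    # ESX 3.5: enable LVM resignaturing host-wide, then rescan
esxcfg-volume -l                             # ESX 4: list volumes detected as snapshots/replicas
esxcfg-volume -m <VMFS-UUID|label>           # ESX 4: mount the volume keeping its existing signature
esxcfg-volume -r <VMFS-UUID|label>           # ESX 4: resignature the volume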
22
What is a Virtual Machine Snapshot?
• A snapshot captures the entire state of the virtual machine at the time you take the snapshot.
• This includes:
Memory state – The contents of the virtual machine’s memory.
Settings state – The virtual machine settings.
Disk state – The state of all the virtual machine’s virtual disks.
Virtual Machine Snapshots
23
Common issues:
• Snapshots filling up a Data Store
• Offline commit
• Clone VM
• Parent has changed.
• Contact VMware Support
• No Snapshots Found
• Create a new snapshot, then commit.
Virtual Machine Snapshot - Continued
24
ESX 4, Set Up Multi-pathing for Software iSCSI
Prerequisites:
• Two or more NICs.
• Unique vSwitch.
• Supported iSCSI array.
• ESX 4.0 or higher
ESX4 iSCSI Multi-pathing
25
Using the vSphere CLI, connect the software iSCSI initiator to the iSCSI VMkernel ports.
Repeat this command for each port.
• esxcli swiscsi nic add -n <port_name> -d <vmhba>
Verify that the ports were added to the software iSCSI initiator by running the
following command:
• esxcli swiscsi nic list -d <vmhba>
Use the vSphere Client to rescan the software iSCSI initiator.
ESX4 iSCSI Multi-pathing - Continued
26
This example shows how to connect the software iSCSI initiator vmhba33 to VMkernel ports vmk1 and vmk2.
Connect vmhba33 to vmk1:
esxcli swiscsi nic add -n vmk1 -d vmhba33
Connect vmhba33 to vmk2:
esxcli swiscsi nic add -n vmk2 -d vmhba33
Verify vmhba33 configuration:
esxcli swiscsi nic list -d vmhba33
ESX4 iSCSI Multi-pathing - Continued
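After the rescan, it is worth confirming that the VMkernel ports can reach the array and that multiple paths show up (a sketch; commands run from the ESX 4 service console, the target IP is an example):
vmkping 192.168.1.100    # test connectivity to the iSCSI target from the VMkernel TCP/IP stack
esxcfg-mpath -b          # brief path listing; each iSCSI LUN should now show one path per bound VMkernel port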
27
The issue:
You want to remove a LUN from a vSphere 4 cluster.
You move or Storage vMotion the VMs off the datastore that is being removed (otherwise, the VMs would hard crash if you just yanked out the datastore).
After removing the LUN, VMs on OTHER datastores would become unavailable (not crashing, but becoming periodically unavailable on the network).
The ESX logs would show a series of errors starting with "NMP".
All Paths Dead (APD)
28
Workaround 1
In the vSphere client, vacate the VMs from the datastore being removed (migrate or Storage vMotion)
In the vSphere client, remove the datastore
In the vSphere client, remove the storage device
Only then, in your array management tool, remove the LUN from the host
In the vSphere client, rescan the bus
Workaround 2 (only available in ESX/ESXi 4 U1)
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD
All Paths Dead - Continued
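A sketch of checking and setting this advanced option from the service console (-g reads the current value, -s sets it):
esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD    # show the current value (0 by default)
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD  # enable the workaround (ESX/ESXi 4.0 U1 and later)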
29
4.1 Storage Additions
Storage I/O Control, which allows us to prioritize I/O from virtual machines residing on different ESX servers but using the same shared VMFS volume.
New I/O statistics, including NFS throughput and latency counters.
vStorage API for Array Integration (VAAI), which allows offloading certain storage operations, such as cloning and zeroing, from the host to the array.
Questions
VMware View 4.5 Overview
David Garcia Jr - Global Support Services
32
Agenda
View (Overview)
User Experience (Highlights)
Performance & Scalability (Tiered Storage, View Composer)
Management (View Manager)
33
[Diagram: VDI deployment scope covers client performance, the View Server and remote clients, vCenter performance, VMware View performance, the server and virtualization stack (hypervisor performance), and the storage and network infrastructure.]
VDI deployment scope
34
View 4.5 Architecture overview
View Client with
Local Mode
Support for vSphere 4.1 and vCenter 4.1 - Delivers integration with
the most widely-deployed desktop virtualization platform in the industry.
Takes advantage of optimizations for View virtual desktops.
Lowest Cost Reference
Architectures - VMware has
worked with partners such as
Dell, HP, Cisco, NetApp, and
EMC to provide prescriptive
reference architectures to enable
you to deploy a scalable and
cost-effective desktop
virtualization solution.
35
View 4.5 Product highlights
Full Windows 7 Support
View Manager Enhancements
• Increasing Scale and Efficiency
• System and User Diagnostics
• Extensibility
PCoIP Updates: Smart Card Support
View Client with Local Mode (aka Offline Support)
Support for vSphere 4.1
36
Native Windows Client Thin- Client Support
Thick clients or
refurbished PCs
Broad industry
support
Flexible client access from multiple devices
Mac OS 10.5+
Native Mac Client (RDP)
NEW
Now with Local Mode
37
Single Sign On
Authentication to Virtual
Desktop
• Windows Username/Password
• Smart Cards/Proximity Cards
• Client Based (MAC Address)
• USB connected biometric devices
Integration with MS AD
• No Domain change, schema
change, password change
Supports "Tap and Go"
Functionality
• Integrates with SSO Vendors –
Imprivata, Sentillion, Juniper, etc
Simplified Sign-on
Connection
Server
Single sign on to virtual desktop and apps
38
Web download portal
• Enhanced capability to manage
distribution of full View Windows
Client including PCoIP, ThinPrint
and USB redirection features
• Ability to distribute current and
legacy versions of View Client
• Broker URL automatically passed
to Windows client upon launch
• Experimental Java based Mac and
Linux Web Access no longer
supported (use installable Mac
Client in View 4 and View Open
Client for Linux)
39
Value propositions of local desktops
For IT
Extend View benefits to mobile users with laptops
Enable Bring Your Own PC (BYOPC) programs for employees &
contractors
Extend View benefits to remote/branch offices with poor/unreliable
networks
For End Users
Mobility – check out VM to local laptop for offline usage
Disaster Recovery – VM replicated to datacenter
Flexibility – BYOPC and personal desktop productivity
Windows
Guest
VM 1
View Client with Local Mode
Guest
VM 2
40
High Level Features (View in 2010) | Details
Run anywhere | After initial checkout, desktop can be used at home or on the road w/o network connectivity
Broad hardware support | Works with almost any modern laptop today
Encrypted and secure | AES encryption of desktop and centrally managed policies to control access and usage
Data centralization & control | Admin can pull all data back up to datacenter on demand
High quality user experience | Support for Win7 Aero Glass effects, DirectX 9 w/3D, distortion-free sound & multimedia
Reasonable CAPEX costs | Up & running with a single ESX box & local storage!
Disaster recovery options | Can schedule data replication to server for rapid, seamless recovery from hardware loss or failure
Single Image Management w/View | Works off same management infrastructure & images as rest of View deployment
High level features of local desktops in 2010
41
View 4.5 major management feature highlights
Up to 10,000 Desktops
Admin Features
• High perf GUI
• Role based Admin
• Event DB, Dashboard
• View Power CLI extension
Storage Optimization
• Tiered storage
• Disposable disk/Local
swap file redirection
• VM on local storage
Composer Enhancements
• Sysprep support
• Fast refresh
• Persistent Disk Management
Simplified Sign-on
• Smart-card/Proximity card
• Client (MAC/device ID),
support of Kiosk mode
ThinApp Integration
• App repo scanning
• Pool/Desktop ThinApp
assignment
42
Core broker: Performance & scalability
• 10,000 VM Pod (5 connection servers + 2 standby)
• Federated Pool Management
• Connection server instance in a cluster will be responsible for VM operations on
VMs belonging to the same pool
• Reduced locking/synchronization overhead
• Enhanced tracker w/ caching
• Reduced extra reloading from ADAM Datastore
• Refresh UI with 5,000 objects in seconds!
43
View Composer improvements overview
• Customization/Provisioning
• Sysprep support
• Refresh, Recompose and Rebalance for Floating Pool
• Storage Performance and Optimization
• Tiered support
• Optimization
• Disposable disk and Local swap file redirect
• Allow creation of linked-clones on local storage
• Management
• Full Management of Persistent Disk (formerly known as UDD)
44
View Composer: Tiered storage
Allow master VM replica to
reside in a separate datastore
Use high performance storage to boost
performance (e.g. reboot, virus scan)
45
View Composer: Other storage optimization
• Local swap file redirect
• Not reducing storage but allow the use of cheap local storage for individual VM swap file
• Allow creation of linked-clones using local data stores
• Wizard will not filter out local data stores for use of VM cloning
• Allow use of cheap local storage for non-persistent pool VMs
46
View Composer: Customization/provisioning
•Sysprep support
•Sysprep helps resolve
the SID management
issue: a new SID will
be generated for each
cloned VM
•The Three "R"s
•Refresh
•Recompose
•Rebalance
47
View Composer: Enhanced management functions
• Persistent Disk (formerly known as UDD) Management
• Detach/Migrate/Archive/Reattach
• Managed as a "first class object"
• Garbage collection scripts
• Remove one or more linked-clone VM(s) by name(s) from View, SVI, VC, and AD
48
Administration improvements in 2010
Provides Increased Management Efficiency:
Monitoring, Diagnostics and Supportability
Features
• Scalable Admin UI in Flex
• Role-based Administration
• System and End-User Troubleshooting
• Monitoring Dashboard
• Diagnostics
• Supportability
• Reporting and Auditing Enablement
• Events
• View Management Pack for SCOM
49
Scalable admin UI
• Based on Adobe Flex
• Rich application feel
• Scalability
• Easy navigation
• Cross-Platform
50
Role-based administration
• Delegated
administration
• Flexible Roles
• Helpdesk, etc
• Custom roles
• LDAP-based access
control on folders
51
System and end-user troubleshooting: Dashboard
• Surface key information to
administrators
• Drill-down as needed
• Locate root cause
• System health status
• View components
• vCenter components
• Status of desktops
• Status of client-hosted
endpoints
• Datastore usage
• VMs on storage LUN
52
Reporting and auditing enablement: Events
Formally defined events
• Events have a unique well defined identifier
• Standard attributes include module, user, desktop, machine
Provides a unified view across View components
• No more needing to review logs on each broker, agent!
Managed with a configurable database
Accessible with:
• VMware View Administrator
• Direct access (SQL) for other reporting tools
• Powershell
• Vdmadmin provides textual reports (csv or xml)
53
View management pack for SCOM
54
Links & Resources
Documentation, Release Notes http://www.vmware.com/support/pubs/view_pubs.html
• VMware View 4.5 Release Notes
• VMware View Architecture Planning Guide
• VMware View Administrator's Guide
• VMware View Installation Guide
• VMware View Upgrade Guide
• VMware View Integration Guide
Technical Papers http://www.vmware.com/resources/techresources/cat/91,156
• VMware View Optimization Guide for Windows 7 VMware Ensynch 09/27/2010
• Vblock Powered Solutions for VMware View VMware Cisco EMC 09/09/2010
• Virtual Desktop Sizing Guide with VMware View 4.0 and VMware vSphere 4.0 Update1 Mainline 05/21/2010
• Application Presentation to VMware View Desktops with Citrix XenApp VMware 05/20/2010
• PCoIP Display Protocol: Information and Scenario-Based Network Sizing Guide VMware 05/20/2010
• Location Awareness in VMware View 4 VMware 06/15/2010
• VMware View 4 & VMware ThinApp Integration Guide VMware 01/19/2010
• Anti-Virus Deployment for VMware View VMware 01/13/2010
Questions
vSphere Networking Best Practices
David Garcia Jr - Global Support Services
57
Agenda
vSwitches & Portgroups
NIC Teaming
Link Aggregation (802.3ad static mode)
Failover Configuration
Spanning Tree Protocol
Network I/O Control
Load-Based Teaming
VMDirectPath, VMXNET3, FCoE CNA & 10GbE
VLAN Trunking (802.1q)
Tips & Tricks
Troubleshooting Tips
Must Read & KB Links
58
Designing the Network
How do you design the virtual network for performance and availability while maintaining isolation between the various traffic types (e.g. VM traffic, VMotion, and Management)?
• Starting point depends on:
• Number of available physical ports on server
• Required traffic types
• 2 NIC minimum for availability, 4+ NICs
per server preferred
• 802.1Q VLAN trunking highly recommended for logical
scaling (particularly with low NIC port servers)
• Examples are meant as guidance and do not represent strict
requirements in terms of design
• Understand your requirements and resultant traffic types and
design accordingly
59
ESX Virtual Switch: Capabilities
Layer 2 switch—forwards frames based on
48-bit destination MAC address in frame
MAC address known by registration
(it knows its VMs!)—no MAC learning
required
Can terminate VLAN trunks (VST mode) or
pass trunk through to VM (VGT mode)
Physical NICs associated with Switches
NIC teaming (of uplinks)
• Availability: uplink to multiple physical switches
• Load sharing: spread load over uplinks
[Diagram: VM0 and VM1 on a vSwitch; a MAC address is assigned to each vnic.]
60
ESX Virtual Switch: Forwarding Rules
The vSwitch will forward frames
• VM to VM
• VM to uplink
But not forward
• vSwitch to vSwitch
• Uplink to Uplink
ESX vSwitch will not create
loops in the physical network
And will not affect Spanning Tree
(STP) in the physical network
[Diagram: VM0 and VM1 on vSwitches uplinked to physical switches (MAC a, MAC b, MAC c).]
61
Port Group Configuration
A Port Group is a template for one or more ports with a common configuration
• Assigns VLAN to port group members
• L2 Security - select "reject" to see only frames for the VM MAC addr
• Promiscuous mode / MAC address change / Forged transmits
• Traffic Shaping - limit egress traffic from VM
• Load Balancing - Originating Virtual Port ID, Source MAC, IP-Hash, Explicit
• Failover Policy - Link Status & Beacon Probing
• Notify Switches - "yes" gratuitously tells switches of MAC location
• Failback - "yes" if no fear of blackholing traffic, or use Failover Order in "Active Adapters"
Distributed Virtual Port Group (vNetwork Distributed Switch)
• All above plus:
• Bidirectional traffic shaping (ingress and egress)
• Network VMotion—network port state migrated upon VMotion
62
NIC Teaming for Load Sharing & Availability
NIC Teaming aggregates multiple physical
uplinks for:
• Availability—reduce exposure to single points
of failure (NIC, uplink, physical switch)
• Load Sharing—distribute load over multiple
uplinks (according to selected NIC teaming
algorithm)
Requirements:
• Two or more NICs on same vSwitch
• Teamed NICs on same L2 broadcast domain
VM0 VM1
vSwitch
NIC Team
KB - NIC teaming in ESX Server (1004088)
KB - Dedicating specific NICs to portgroups while maintaining NIC teaming and failover for the vSwitch (1002722)
63
NIC Teaming with vDS
Teaming Policies Are Applied in DV Port Groups to dvUplinks
[Diagram: a vDS spanning hosts esx09a, esx09b, esx10a and esx10b (each with a Service Console and vmkernel port). The "Orange" DV Port Group teaming policy is applied to dvUplinks 0-3; each dvUplink maps to a physical vmnic (vmnic0-vmnic3) on every host.]
KB - vNetwork Distributed Switch on ESX 4.x - Concepts Overview (1010555)
64
NIC Teaming Options
Name | Algorithm (vmnic chosen based upon) | Physical Network Considerations
Originating Virtual Port ID | vnic port | Teamed ports in same L2 domain (BP: team over two physical switches)
Source MAC Address | MAC seen on vnic | Teamed ports in same L2 domain (BP: team over two physical switches)
IP Hash* | Hash(SrcIP, DstIP) | Teamed ports configured in static 802.3ad "EtherChannel" (no LACP; needs MEC to span 2 switches)
Explicit Failover Order | Highest order uplink from active list | Teamed ports in same L2 domain (BP: team over two physical switches)
Best Practice: Use Originating Virtual Port ID for VMs
*KB - ESX Server host requirements for link aggregation (1001938)
*KB - Sample configuration of EtherChannel/Link aggregation with ESX and Cisco/HP switches (1004048)
65
Link Aggregation
66
Link Aggregation - Continued
EtherChannel
EtherChannel is Cisco's port trunking (link aggregation) technology, used primarily on Cisco switches
Can be created from between two and eight active Fast Ethernet, Gigabit Ethernet, or 10 Gigabit Ethernet ports
LACP or IEEE 802.3ad
Link Aggregation Control Protocol (LACP) is included in the IEEE specification as a method to control the bundling of several physical ports together to form a single logical channel
Only supported on the Nexus 1000V
EtherChannel vs. 802.3ad
EtherChannel and the IEEE 802.3ad standard are very similar and accomplish the same goal
There are few differences between the two beyond EtherChannel being Cisco proprietary and 802.3ad being an open standard
EtherChannel Best Practice
One-IP-to-one-IP connections over multiple NICs are not supported (a single connection session from Host A to Host B uses only one NIC)
Supported Cisco configuration: EtherChannel Mode ON (enable EtherChannel only)
Supported HP configuration: Trunk Mode
Supported switch aggregation algorithm: IP-SRC-DST (IP-Source-Destination), set as a global policy on the switch
The only load balancing option for a vSwitch or vDistributed Switch that can be used with EtherChannel is IP HASH
Do not use beacon probing with IP HASH load balancing
Do not configure standby uplinks with IP HASH load balancing
67
Failover Configurations
• Link Status Only relies solely on the link status provided by the network adapter
•Detects failures such as cable pulls and physical switch power failures
•Cannot detect configuration errors
•Switch port being blocked by spanning tree
•Switch port configured for the wrong VLAN
•cable pulls on the other side of a physical switch.
• Beacon Probing sends out and listens for beacon probes
• Ethernet broadcast frames sent by physical adapters to detect upstream network connection failures
• Sent on all physical Ethernet adapters in the team, as shown in the figure below
• Detects many of the failures mentioned above that are not detected by link status alone
•Should not be used as a substitute for a redundant Layer 2 network design
•Most useful to detect failures in the closest switch to the ESX Server hosts
•Beacon Probing Best Practice
•Use at least 3 NICs for triangulation
•If only 2 NICs in team, probe can’t determine which link failed
•Shotgun mode results
•KB - What is beacon probing? (1005577)
•KB - ESX host network flapping error when Beacon Probing is selected (1012819)
•KB - Duplicated Packets Occur when Beacon Probing Is Selected Using vmnic and
VLAN Type 4095 (1004373)
•KB - Packets are duplicated when you configure a portgroup or a vSwitch to use a route
that is based on IP-hash and Beaconing Probing policies simultaneously (1017612)
Figure: Using beacons to detect upstream network connection failures.
68
Spanning Tree Protocol (STP) Considerations
Spanning Tree Protocol used to create
loop-free L2 tree topologies
in the physical network
• Some physical links are put in a "blocking" state to construct the loop-free tree
ESX vSwitch does not participate
in Spanning Tree and will not create
loops with uplinks
• ESX uplinks will not block and are always active (full use of all links)
[Diagram: VM0 (MAC a) and VM1 (MAC b) on a vSwitch uplinked to physical switches. The physical switches send BPDUs every 2s to construct and maintain the Spanning Tree topology and block a redundant link; the vSwitch drops BPDUs.]
Recommendations for Physical Network Config:
1. Leave Spanning Tree enabled on the physical network and ESX-facing ports (i.e. leave it as is!)
2. Use "portfast" or "portfast trunk" on ESX-facing ports (puts ports in forwarding state immediately)
3. Use "bpduguard" to enforce the STP boundary
KB - STP may cause temporary loss of network connectivity when a failover or failback event occurs (1003804)
69
ESX 4.1 Introduces Network I/O Control
VMware® vSphere™ 4.1 ("vSphere") introduces a number of enhancements and new features to virtual networking.
• Network I/O Control (NetIOC)—flexibly partition and assure service for ESX/ESXi traffic
types and flows on a vNetwork Distributed Switch (vDS)
• Load-Based Teaming (LBT)—an additional and selectable load-balancing policy on the
vDS to enable dynamic adjustment of the load distribution over a team of NICs
• Network performance—vmkernel TCP/IP stack and guest virtual-machine network
performance enhancements
• Scale—enhancements to network scaling with the vDS
• IPv6 NIST Compliance—IPv6 enhancements to comply with U.S. National Institute of
Standards and Technology (NIST) Host Profile
• Cisco Nexus 1000V Enhancements—support for new features and enhancements on
the Cisco Nexus 1000V
70
Network I/O Control Usage
71
Load-Based Teaming (LBT)
LBT is another traffic-management feature of the vDS introduced with vSphere 4.1. LBT avoids
network congestion on the ESX/ESXi host uplinks caused by imbalances in the mapping of
traffic to those uplinks.
LBT enables customers to optimally use and balance network load over the available physical
uplinks attached to each ESX/ESXi host.
LBT helps avoid situations where one link may be congested, while other links may be relatively
underused.
How LBT works
• LBT dynamically adjusts the mapping of virtual ports to physical NICs to best balance the network load entering or leaving the ESX/ESXi 4.1 host. When LBT detects an ingress or egress congestion condition on an uplink, signified by a mean utilization of 75% or more over a 30-second period, it will attempt to move one or more of the virtual-port-to-vmnic mapped flows to lesser-used links within the team.
Configuring LBT
• LBT is an additional load-balancing policy available within the teaming and failover settings of a dvPortGroup on a vDS. LBT appears as "Route based on physical NIC load."
*LBT is not available on the vNetwork Standard Switch (vSS).
72
VMXNET3—The Para-virtualized VM Virtual NIC
• Next evolution of "Enhanced VMXNET" introduced in ESX 3.5
• Adds
• MSI/MSI-X support (subject to guest operating system kernel support)
• Receive Side Scaling (supported in Windows 2008 when explicitly enabled through
the device's Advanced configuration tab)
• Large TX/RX ring sizes (configured from within the virtual machine)
• High performance emulation mode (Default)
• Supports
• High DMA
• TSO (TCP Segmentation Offload) over IPv4 and IPv6
• TCP/UDP checksum offload over IPv4 and IPv6
• Jumbo Frames
• 802.1Q tag insertion
KB - Choosing a network adapter for your virtual machine (1001805)
73
VMDirectPath for VMs
[Diagram: with VMDirectPath, the guest's device driver talks to the PCI I/O device directly, bypassing the virtualization layer.]
What is it?
Enables direct assignment of PCI devices to VM
Types of workloads
I/O Appliances
High performance VMs
Details
Guest controls the physical H/W
Requirements
vSphere 4
I/O MMU
Used for DMA Address Translation (Guest Physical -> Host Physical) and protection
Generic device reset (FLR, Link Reset, ...)
KB - Configuring VMDirectPath I/O pass-through devices on an ESX host (1010789)
74
FCoE on ESX
VMware ESX Support
• FCoE supported since ESX 3.5u2
• Requires Converged Network Adapters "CNAs" (see HCL), e.g.
• Emulex LP21000 Series
• Qlogic QLE8000 Series
• Appears to ESX as:
• 10GigE NIC
• FC HBA
• SFP+ pluggable transceivers
• Copper twin-ax (<10m)
• Optical
[Diagram: a CNA (Converged Network Adapter) appears to ESX as both a 10GigE NIC attached to a vSwitch and a Fibre Channel HBA; an FCoE switch splits the converged traffic onto Ethernet and Fibre Channel.]
75
Using 10GigE
2x 10GigE common/expected
• 10GigE CNAs or NICs
Possible Deployment Method
• Active/Standby on all Portgroups
• VMs "sticky" to one vmnic
• SC/vmk ports sticky to the other
• Use Ingress Traffic Shaping
to control traffic type per
Port Group
• If FCoE, use Priority Group bandwidth
reservation (on CNA utility)
[Diagram: two 10GigE uplinks carry FCoE plus iSCSI, NFS, VMotion, FT and SC port groups. FCoE Priority Group bandwidth reservation is set in the CNA config utility; ingress (into switch) traffic shaping policies on the port groups control per-traffic-type bandwidth (e.g. 1-2G low, variable/high, 2Gbps+).]
76
Traffic Types on a Virtual Network
Virtual Machine Traffic
• Traffic sourced and received from virtual machine(s)
• Isolate from each other based on service level
VMotion Traffic
• Traffic sent when moving a virtual machine from one ESX host to another
• Should be isolated
Management Traffic
• Should be isolated from VM traffic (one or two Service Consoles)
• If VMware HA is enabled, includes heartbeats
IP Storage Traffic—NFS and/or iSCSI via vmkernel interface
• Should be isolated from other traffic types
Fault Tolerance (FT) Logging Traffic
• Low latency, high bandwidth
• Should be isolated from other traffic types
How do we maintain traffic isolation without proliferating NICs?
77
VLAN Trunking to Server
IEEE 802.1Q VLAN Tagging
• Enables logical network partitioning
(Traffic separation)
• Scale traffic types without scaling physical NICs
• Virtual machines connect to virtual
switch ports (like access ports
on physical switch)
• Virtual switch ports are associated
with a particular VLAN (VST mode)—defined
in PortGroup
• Virtual switch tags packets exiting host
[Diagram: VM0 and VM1 attached to Port Groups "Yellow" (VLAN 10) and "Blue" (VLAN 20) on a vSwitch whose uplinks are VLAN trunks carrying VLANs 10 and 20. The 802.1Q header (TPID 0x8100) carries a 12-bit VLAN ID field (0-4095).]
78
VLAN Tagging Options
VST - Virtual Switch Tagging: VLAN tags applied in the vSwitch; the VLAN is assigned in the Port Group policy. VST is the best practice and most common method.
VGT - Virtual Guest Tagging: VLAN tags applied in the Guest; the PortGroup is set to VLAN "4095".
EST - External Switch Tagging: the external physical switch applies the VLAN tags.
79
VLAN Tagging: Further Example
KB -Sample configuration of virtual switch VLAN tagging (VST Mode) and ESX Server (1004074)
Uplinks A, B, and C connected to trunk ports on physical switch which carry four VLANs
(e.g. VLANs 10, 20, 50, 90)
Ports 1-14 emit untagged frames and receive only frames tagged with their respective VLAN ID (equivalent to an "access port" on a physical switch)
• Port Group VLAN ID set to one of 1-4094
Port 15 emits tagged frames for all VLANs.
• Port Group VLAN ID set to 4095 (for vSS) or "VLAN Trunking" on a vDS DV Port Group
[Diagram: uplinks A, B and C are VLAN trunks carrying VLANs 10, 20, 50 and 90; ports 1-14 are access ports on VLANs 10, 20 and 50, while port 15 trunks all VLANs (10, 20, 50, 90) to the VM.]
interface GigabitEthernet1/2
description host32-vmnic0
switchport trunk encapsulation dot1q
switchport trunk native vlan 999
switchport trunk allowed vlan 10,20,50,90
switchport mode trunk
spanning-tree portfast trunk
Example configuration on the physical switch
80
Private VLANs: Traffic Isolation for Every VM
Solution: PVLAN
• Place VMs on the same virtual network
but prevent them from communicating
directly with each other (saves VLANs!)
• Avoids scaling issues from assigning
one VLAN and IP subnet per VM
Details
• Instead, configure a SINGLE DV port
group to have a SINGLE isolated*
VLAN (ONLY ONE)
• Attach all your VMs to this SINGLE
isolated VLAN DV port group
[Diagram: a Distributed Switch with PVLAN provides private VLAN traffic isolation between guest VMs, with a common primary VLAN on the uplinks.]
KB - Private VLAN (PVLAN) on vNetwork Distributed Switch - Concept Overview (1010691)
81
[Diagram: twelve VMs on a vNetwork Distributed Switch, each in its own port group: total cost 12 VLANs (one per VM). The same twelve VMs on a single DV port group with an isolated PVLAN: total cost 1 PVLAN (over 90% savings).]
Private VLANs - Continued
82
Tips & Tricks
• KB - Changing a MAC address in a Windows virtual machine (1008473)
• When a physical machine is converted into a virtual machine, the MAC address of the network adapter is
changed. This can pose a problem when software is installed where the licensing is tied to the MAC
address.
• KB – Configuring speed and duplex of an ESX Server host network adapter (1004089)
• ESX recommended settings for Gigabit Ethernet speed and duplex when connecting to a physical switch port are as follows:
• Auto Negotiate <-> Auto Negotiate
• It is not recommended to mix hard-coded settings with auto-negotiate.
• KB - Sample Configuration - Network Load Balancing (NLB) Multicast mode over routed subnet -
Cisco Switch Static ARP Configuration (1006525)
• NLB Multicast Mode – Static ARP Resolution
• Since NLB packets are unconventional (the IP address is unicast while the MAC address is multicast), switches and routers drop NLB packets
• NLB Multicast Packets get dropped by routers and switches, causing the ARP tables of switches to not get
populated with cluster IP and MAC address
• Manual ARP Resolution of NLB cluster address is required on physical switch and router interfaces
• Cluster IP and MAC static resolution is set on each switch port that connects to ESX host
83
Troubleshooting Tips
84
Troubleshooting with Esxtop
85
Esxtop Traffic
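A minimal sketch of reaching the view shown in these screenshots (keys and counter names from ESX 4.x esxtop):
esxtop    # press n for the network view; watch PKTTX/s and PKTRX/s per port, and %DRPTX / %DRPRX for dropped packets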
86
Capturing Traffic
87
ESX tcpdump
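A sketch of a service console capture with tcpdump (vswif0 is an example interface name; see KB 1004090 later in this deck. For VM traffic, the usual approach is a promiscuous port group plus Wireshark in a VM, as on the next slide):
tcpdump -i vswif0 -s 1514 -w /tmp/capture.pcap    # capture full frames on the service console interface to a file
tcpdump -i vswif0 host 10.0.0.50 and port 902     # live filter, e.g. management traffic from one vCenter/client IP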
88
Wireshark in a VM
89
Must Read… http://www.vmware.com/technical-resources/virtual-networking/
Conclusion
This study compares performance results for e1000 and
vmxnet virtual network devices on 32-bit and 64-bit guest
operating systems using the netperf benchmark. The results
show that when a virtual machine is running with software
virtualization, e1000 is better in some cases and vmxnet is
better in others. Vmxnet has lower latency, which sometimes
comes at the cost of higher CPU utilization. When hardware
virtualization is used, vmxnet clearly provides the best
performance.
Conclusion
VMXNET3, the newest generation of virtual network adapter from
VMware, offers performance on par with or better than its previous
generations in both Windows and Linux guests. Both the driver
and the device have been highly tuned to perform better on
modern systems. Furthermore, VMXNET3 introduces new
features and enhancements, such as TSO6 and RSS. TSO6
makes it especially useful for users deploying applications that
deal with IPv6 traffic, while RSS is helpful for deployments
requiring high scalability. All these features give VMXNET3
advantages that are not possible with previous generations of
virtual network adapters. Moving forward, to keep pace with an
ever‐increasing demand for network bandwidth, we recommend
customers migrate to VMXNET3 if performance is of top concern
to their deployments.
Technical Papers
90
KB Links
• KB - Cisco Discovery Protocol (CDP) network information via command line and VirtualCenter on an
ESX host (1007069)
• Utilizing Cisco Discovery protocol (CDP) to get switch port configuration information.
• This command is utilized to troubleshoot network connectivity issues related to VLAN tagging methods on
virtual and physical port settings.
• KB - Troubleshooting network issues with the Cisco show tech-support command (1015437)
• If you experience networking issues between vSwitch and physical switched environment, you can obtain
information about the configuration of a Cisco router or switch by running the show tech-support command
in privileged EXEC mode.
• Note: This command does not alter the configuration of the router.
• KB - ESX host or virtual machines have intermittent or no network connectivity (1004109)
• KB - Troubleshooting Nexus 1000V vDS network issues (1014977)
• KB - Cisco Nexus 1000V installation and licensing information (1013452)
• Cisco Nexus 1000V Troubleshooting Guide, Release 4.0(4)SV1(2) 20/Jan/2010
• Cisco Nexus 1000V Troubleshooting Guide, Release 4.0(4)SV(1) 21/Jan/2010
• KB - Configuring promiscuous mode on a virtual switch or portgroup (1004099)
• KB - Troubleshooting network issues by capturing and sniffing network traffic via tcpdump (1004090)
91
KB Links - Continued
• KB - Troubleshooting network connection issues using Address Resolution Protocol (ARP)
(1008184)
• IEEE OUI and Company id Assignments http://standards.ieee.org/regauth/oui/index.shtml
• KB - Network performance issues (1004087)
• KB - Low Network Throughput in Windows Guest when Running UDP Application (5298153)
• KB - Performance of Outgoing UDP Packets Is Poor (10172)
• KB - Poor Network File Copy performance between local VMFS and shared VMFS (1003554)
• KB - Cannot connect to ESX 4.0 host for 30-40 minutes after boot (1012942)
• Ensure that DNS is configured and reachable from the ESX host
• KB - Identifying issues with and setting up name resolution on ESX Server (1003735)
• Note: localhost must always be present in the hosts file. Do not modify or remove the entry for localhost
• The hosts file must be identical on all ESX Servers in the cluster
• There must be an entry for every ESX Server in the cluster
• Every host must have an IP address, Fully Qualified Domain Name (FQDN), and short name
• The hosts file is case sensitive. Be sure to use lowercase throughout the environment
Questions
ESXi Readiness
Planning your migration to VMware ESXi, the next-generation hypervisor
architecture.
David Garcia Jr - Global Support Services
94
The Gartner Group says…
"The major benefit of ESXi is the fact that it is more lightweight - under 100MB versus 2GB for VMware ESX with the service console."
"Smaller means fewer patches"
"It also eliminates the need to manage a separate Linux console (and the Linux skills needed to manage it)…"
As of August 2010: "VMware users should put a plan in place to migrate to ESXi during the next 12 to 18 months."
95
VMware ESXi Hypervisor Architecture
• Code base disk footprint: <100 MB
• VMware agents ported to run directly on VMkernel
• Authorized 3rd party modules can also run in VMkernel to provide hw monitoring and drivers
• Other capabilities necessary for integration into an enterprise datacenter are provided natively
• No other arbitrary code is allowed on the system
VMware ESX Hypervisor Architecture
• Code base disk footprint: ~2 GB
• VMware agents run in Console OS
• Nearly all other management functionality provided by agents running in the Console OS
• Users must log into Console OS in order to run commands for configuration and diagnostics
VMware ESXi and ESX hypervisor architectures comparison
96
Call to action for customers
Start testing ESXi
• If you've not already deployed, there's no better time than the present
Ensure your 3rd party solutions are ESXi Ready
• Monitoring, backup, management, etc. Most already are.
• Bid farewell to agents!
Familiarize yourself with ESXi remote management options
• Transition any scripts or automation that depended on the COS
• Powerful off-host scripting and automation using vCLI, PowerCLI, …
Plan an ESXi migration as part of your vSphere upgrade
• Testing of ESXi architecture can be incorporated into overall vSphere
testing
97
Visit the ESXi and ESX Info Center today
http://vmware.com/go/ESXiInfoCenter
Questions
Break
vSphere 4 - Performance Best Practices
Kenneth Kemp, Escalation Engineer
101
Agenda
Technical Guides
ESX 4.x Performance & Troubleshooting
• Memory
• CPU
vCenter Performance & Troubleshooting
• High Availability
• Distributed Resource Scheduler
• Fault Tolerance
• Resource Pool Designs
• HW Considerations and Settings
102
Technical Guides
Memory
104
Memory – Resource Types
When assigning a VM a "physical" amount of RAM, all you are really doing is telling ESX how much memory a given VM process will maximally consume, beyond its overhead.
Whether or not that memory is backed by physical RAM depends on a few factors: host configuration, DRS shares/limits/reservations, and host load.
Generally speaking, it is better to OVER-commit than UNDER-commit.
105
Memory – Overhead & Reclamation
ESX memory space overhead
Service Console: 272 MB
VMkernel: 100 MB+
Per-VM memory space overhead increases with:
Number of VCPUs
Size of guest memory
32 or 64 bit guest OS
ESX memory space reclamation
Page sharing
Ballooning
106
Memory – Page Tables
Page tables
ESX cannot use guest page tables
ESX Server maintains shadow page tables
Translate memory addresses from virtual to machine
Per process, per VCPU
VMM maintains physical (per VM) to machine maps
No overhead from "ordinary" memory references
Overhead
Page table initialization and updates
Guest OS context switching
(Address translation: VA -> PA -> MA)
107
Memory – Over-commitment & Sizing
Avoid high active host memory over-commitment
• Total memory demand = active working sets of all VMs + memory overhead - page sharing
• No ESX swapping: total memory demand < physical memory
Right-size guest memory
• Define adequate guest memory to avoid guest swapping
• Per-VM memory space overhead grows with guest memory
108
Memory – NUMA considerations
Increasing a VM's memory on a NUMA machine
Will eventually force some memory to be allocated from a remote node, which
will decrease performance
Try to size the VM so both CPU and memory fit on one node
109
Memory – NUMA considerations continued
NUMA scheduling and memory placement policies in ESX manage all VMs transparently
No need to manually balance virtual machines between nodes
NUMA optimizations available when node interleaving is disabled
Manual override controls available
Memory placement: 'use memory from nodes'
Processor utilization: 'run on processors'
Not generally recommended
For best performance of VMs on NUMA systems
# of VCPUs + 1 <= # of cores per node
VM memory <= memory of one node
110
ESX must balance memory usage for all worlds
• Virtual machines, Service Console, and vmkernel consume memory
• Page sharing to reduce memory footprint of Virtual Machines
• Ballooning to relieve memory pressure in a graceful way
• Host swapping to relieve memory pressure when
ballooning insufficient
ESX allows overcommitment of memory
• Sum of configured memory sizes of virtual machines can be greater than
physical memory if working sets fit
Memory – Balancing & Overcommitment
111
Ballooning: Memctl driver grabs pages and gives to ESX
• The guest OS chooses pages to give to memctl (avoiding "hot" pages if possible): either free pages or pages to swap
• Unused pages are given directly to memctl
• Pages to be swapped are first written to swap partition within guest OS and then given to
memctl
[Diagram: ESX asks the memctl driver in VM1 to (1) balloon; the guest hands over free pages or first writes pages to its swap partition within the guest OS; ESX then (2) reclaims the pages and (3) redistributes them to VM2.]
Memory - Ballooning
112
Swapping: ESX reclaims pages forcibly
• Guest doesn't pick pages… ESX may inadvertently pick "hot" pages (possible VM performance implications)
• Pages written to VM swap file
[Diagram: ESX (1) force swaps pages from VM1 into the VSWP file (external to the guest, unlike the swap partition within the guest), then (2) reclaims and (3) redistributes the memory to VM2.]
Memory - Swapping
113
Bottom line:
• Ballooning may occur even when no memory pressure just to keep memory
proportions under control
• Ballooning is vastly preferable to swapping
• The guest can surrender unused/free pages
• With host swapping, ESX cannot tell which pages are unused or free and may accidentally pick "hot" pages
• Even if the balloon driver has to swap to satisfy the balloon request, the guest chooses what to swap
• It can avoid swapping "hot" pages within the guest
Memory – Ballooning vs. Swapping
114
If running VMs consume too much host memory…
• Some VMs do not get enough host memory
• This forces either ballooning or host swapping to satisfy VM demands
• Host swapping or excessive ballooning reduces VM performance
If I do not size a VM properly (e.g., create Windows VM with 128MB
RAM)
• Within the VM, swapping occurs, resulting in disk traffic
• VM may slow down
• But…don’t make memory too big! (High overhead memory)
Memory – Ok, So Why Do I Care About Memory Usage?
115
One rule of thumb: > 1MB/s swap in or swap out rate may
mean memory overcommitment
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Swap in rate (ESX 4.0 hosts) | SWR/s | mem.swapinRate.average | Rate at which memory is swapped in from disk
Swap out rate (ESX 4.0 hosts) | SWW/s | mem.swapoutRate.average | Rate at which memory is swapped out to disk
Swapped | SWCUR | mem.swapped.average (level 2 counter) | ~swap out - swap in
Swap in (cumulative) | n/a | mem.swapin.average | Memory swapped in from disk
Swap out (cumulative) | n/a | mem.swapout.average | Memory swapped out to disk
Memory - Important Memory Metrics (Per VM)
116
One rule of thumb: > 1MB/s swap in or swap out rate may
mean memory overcommitment
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Swap in rate (ESX 4.0 hosts) | SWR/s | mem.swapinRate.average | Rate at which memory is swapped in from disk
Swap out rate (ESX 4.0 hosts) | SWW/s | mem.swapoutRate.average | Rate at which memory is swapped out to disk
Swap used | SWCUR | mem.swapused.average (level 2 counter) | ~swap out - swap in
Swap in (cumulative) | n/a | mem.swapin.average | Memory swapped in from disk
Swap out (cumulative) | n/a | mem.swapout.average | Memory swapped out to disk
Memory - Important Memory Metrics (Per Host, sum of VMs)
117
No swapping
Lots of swapping
Increased swap activity may be a sign of over-commitment
Memory - vSphere Client: Swapping on a Host
118
No swapping
Lots of swapping
Memory - A Stacked Chart (per VM) of Swapping
119
Overview Page
• Balloon
• Active
• Swap used
• Granted
• Shared common
Memory - Counters Shown in vSphere Client: Host
120
Overview Page
• Balloon target (how
much should be
ballooned)
• Swapped (~swap out –
swap in)
• Shared
• Balloon
• Active
Memory - Counters Shown in vSphere Client: VM
121
• Main page shows host memory usage (consumed + overhead memory +
Service Console)
Data refreshed at 20s intervals
Memory - Other Counters Shown in vSphere Client
122
Host CPU: Avg. CPU utilization for Virtual machine
Host Memory: consumed + overhead memory for Virtual Machine
Guest Memory: active memory for guest
Note: This page is updated once per minute
Memory - Counters Shown on VM List Summary Tab
123
Overhead
consumed
Overhead reserved
Private (non-shared)
Shared (content-based
page-sharing)
Active used as input to DRS
Unaccessed = unmapped (~never been touched)
Host
Guest
Memory - Breakdown in a VM
124
Metric | Description
Memory Active (KB) | Physical pages touched recently by a virtual machine
Memory Usage (%) | Active memory / configured memory
Memory Consumed (KB) | Machine memory mapped to a virtual machine, including its portion of shared pages. Does NOT include overhead memory.
Memory Granted (KB) | VM physical pages backed by machine memory. May be less than configured memory. Includes shared pages. Does NOT include overhead memory.
Memory Shared (KB) | Physical pages shared with other virtual machines
Memory Balloon (KB) | Physical memory ballooned from a virtual machine
Memory Swapped (KB) (ESX 4.0: swap rates!) | Physical memory in swap file (approx. "swap out - swap in"). Swap out and swap in are cumulative.
Overhead Memory (KB) | Machine pages used for virtualization
Memory - Virtual Machine Memory Metrics, vSphere Client
125
Metric | Description
Memory Active (KB) | Physical pages touched recently by the host
Memory Usage (%) | Active memory / configured memory
Memory Consumed (KB) | Total host physical memory - free memory on host. Includes overhead and Service Console memory.
Memory Granted (KB) | Sum of memory granted to all running virtual machines. Does NOT include overhead memory.
Memory Shared (KB) | Sum of memory shared for all running VMs
Shared common (KB) | Total machine pages used by shared pages
Memory Balloon (KB) | Machine pages ballooned from virtual machines
Memory Swap Used (KB) (ESX 4.0: swap rates!) | Physical memory in swap files (approx. "swap out - swap in"). Swap out and swap in are cumulative.
Overhead Memory (KB) | Machine pages used for virtualization
Memory - Host Memory Metrics, vSphere Client
126
Callouts on the esxtop memory screen:
• Swapping
• MCTL: N - balloon driver not active; VMware Tools probably not installed
• Memory hog VMs
• Swapped in the past, but not actively swapping now
• More swapping, since the balloon driver is not active
• Ballooning active
Memory - Troubleshooting Memory Problems with Esxtop
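A minimal sketch of reaching this esxtop view (keys and counter names from ESX 4.x):
esxtop    # press m for the memory view; check MCTLSZ (balloon size), SWCUR, SWR/s and SWW/s per VM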
CPU
128
CPU - Resource Types
CPU resources are the raw processing speed of a given host or
VM
However, on a more abstract level, we are also bound by the hosts' ability to schedule those resources.
We also have to account for running a VM in the most optimal fashion, which typically means running it on the same processor that the last cycle completed on.
129
CPU – SMP Performance
Some multi-threaded apps in an SMP VM may not perform well
Use multiple UP VMs on a multi-CPU physical machine
[Diagram: one SMP VM vs. multiple UP VMs, each on an ESX Server host.]
130
CPU - Performance Overhead & Utilization
CPU virtualization adds varying amounts of overhead
Little or no overhead for the part of the workload that can run in direct
execution
Small to significant overhead for virtualising sensitive privileged instructions
Performance reduction vs. increase in CPU utilization
CPU-bound applications: any CPU virtualization overhead results in reduced
throughput
non-CPU-bound applications: should expect similar throughput at higher CPU
utilization
131
CPU – VM vCPU Processor Support
ESX supports up to eight virtual processors per VM
• Use UP VMs for single-threaded applications
• Use UP HAL or UP kernel
• For SMP VMs, configure only as many VCPUs as needed
• Unused VCPUs in SMP VMs:
• Impose unnecessary scheduling constraints on ESX Server
• Waste system resources (idle looping, process migrations, etc.)
132
CPU – 64-bit Performance
Full support for 64-bit guests
64-bit can offer better performance than 32-bit
• More registers, large kernel tables, no HIGHMEM issue in Linux
ESX Server may experience performance problems due to shared
host interrupt lines
• Can happen with any controller; most often with USB
• Disable unused controllers
• Physically move controllers
• See KB 1290 for more details
133
CPU – Virtual Machine Worlds
ESX is designed to run Virtual Machines
Schedulable entity = "world"
• Virtual Machines are composed of worlds
• Service Console is a world (has agents like vpxa, hostd)
• Helper Worlds
ESX uses proportional-share scheduler to help with resource management
• Limits
• Shares
• Reservations
Balanced interrupt processing
134
CPU – ESX CPU Scheduling
World states (simplified view):
• ready = ready-to-run but no physical CPU free
• run = currently active and running
• wait = blocked on I/O
Multi-CPU Virtual Machines => variant of gang scheduling called 'relaxed co-scheduling'
• Co-run (latency to get vCPUs running)
• Co-stop (time in "stopped" state)
135
One common issue is high CPU ready time
• High ready time => possible contention for CPU resources among VMs
• Many possible reasons
• CPU overcommitment (high %rdy + high %used)
• Workload variability
• Limit set on VM
• No fixed threshold, but > 20% for a vCPU => investigate further
CPU - So, How Do I Spot CPU Performance Problems?
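A minimal sketch of checking these counters in esxtop (keys from ESX 4.x):
esxtop    # press c for the CPU view, then e to expand a VM's worlds; watch %USED, %RDY and %MLMTD per vCPU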
136
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Usage (%) | %USED | cpu.usage.average | CPU used over the collection interval (%)
Usage (MHz) | n/a | cpu.usagemhz.average | CPU used over the collection interval (MHz)
CPU: Useful Metrics Per-HOST
137
Per-VM
Metric (Client) | Metric (esxtop) | Metric (SDK) | Description
Usage (%) | %USED | cpu.usage.average | CPU used over the collection interval
Used (ms) | %USED | cpu.used.summation | CPU used over the collection interval*
Ready (ms) | %RDY | cpu.ready.summation | CPU time spent in ready state*
Swap wait time (ms) [ESX 4.0 hosts] | %SWPWT | cpu.swapwait.summation | CPU time spent waiting for host-level swap-in
* Units differ between esxtop and the vSphere client
CPU: Useful Metrics Per-VM
138
Note CPU milliseconds and percent are on the same chart but use different axes
CPU - vSphere Client CPU Screenshot Hint
139
• 2-CPU box, but 3 active VMs (high %used)
• High %rdy + high %used can imply CPU overcommitment
CPU - Spotting CPU Overcommitment in esxtop
140
• Used time ~ ready time:
may signal contention.
However, might not be
overcommitted due to
workload variability
• In this example, we have
periods of activity and idle
periods: CPU isn’t
overcommitted all the time
[Chart annotations: used time; ready time ~ used time; ready time < used time.]
CPU - Spotting Workload Variability in the vSphere Client
141
High Ready Time / High MLMTD: there is a limit on this VM…
High ready time is not always because of overcommitment
CPU - High Ready Time Due to Limits Set on VM: esxtop
142
Limit on CPU
High ready time
CPU - High Ready Time Due to Limits: vSphere Client
143
Ready time jump from 12.5% (idle DB) to 20% (busy DB) - didn't notice until responsiveness suffered!
CPU - Ready Time: Why There is no Fixed Threshold…
144
CPU overcommitment
• Possible solution: add more CPUs or VMotion the VM
Workload variability
• A bunch of VMs wake up all at once
• Note: system may be mostly idle: not always overcommitted
Limit set on VM
• 4x2GHz host, 2 vcpu VM, limit set to 1GHz (VM can consume 1GHz)
• Without limit, max is 2GHz. With limit, max is 1GHz (50% of 2GHz)
• CPU all busy: %USED: 50%; %MLMTD & %RDY = 150% [total is 200%, or 2 CPUs]
CPU - Summary of Possible Reasons for High Ready Time
vCenter
146
vCenter - Best Practices
VC Database sizing
Estimate of the space required to store your performance statistics in the DB
Separate Critical Files onto Separate Drives
Make sure the database and transaction log files are placed on separate
physical drives
Place the tempdb database on a separate physical drive if possible
Arrangement distributes the I/O to the DB and dramatically improves its
performance
If a third drive is not feasible, place the tempdb files on the transaction log drive
Enable Automatic Statistics
Keep vCenter logging level low, unless troubleshooting
Proper scheduling of DB backups, maintenance, monitoring
Do not run vCenter on a server that has many applications running
vCenter Heartbeat - http://www.vmware.com/products/vcenter-server-
heartbeat/
147
vCenter - Performance
High CPU utilization and sluggish UI performance
Number of clients attached is high
VC needs to keep clients consistent with inventory changes
Aggressive alarm settings
DB administration
Periodic maintenance
Recovery and log settings
Appropriate VC statistics level
Use gigabit NICs for the service console to clone VMs
Assign permissions appropriately
SQL Server Express will only run well up to 5 hosts and/or 50 VMs. Past that, VC needs to run off an Enterprise-class DB.
148
vCenter - High Availability (HA)
HA network configuration check – DNS, NTP, lowercase hostnames, HA advanced settings
Redundancy: server hardware, shared storage, network, management
Test network isolation from a core switch level, and host failure for expected outage behavior
Critical VMs should NOT be grouped together
Categorize VM criticality, then set the failover appropriately
Valid VM network label names required for proper failover
Failover capacity/Admission control may be too conservative when host and VM sizes vary widely – slot size calculator in VC
149
vCenter - DRS (Distributed Resource Scheduler)
Higher number of hosts => more DRS balancing options
Recommend up to 32 hosts/cluster, may vary with VC server configuration and VM/host ratio
Network configuration on all hosts - VMotion network: Security policies, VMotion NIC enabled, Gig
Reservations, Limits, and Shares
- Shares take effect during resource contention
- Low limits can lead to wasted resources
- High VM reservations may limit DRS balancing
- Overhead memory
- Use resource pools for better manageability, do not nest too deep
Virtual CPUs and memory size
High memory size and virtual CPU count => fewer migration opportunities
Configure VMs based on need (network, etc.)
150
vCenter - DRS (Cont.)
Ensure hosts are CPU compatible
- Intel vs. AMD
- Similar CPU family/features
- Consistent server bios levels, and NX bit exposure
- Enhanced VMotion Compatibility (EVC)
- "VMware VMotion and CPU Compatibility" whitepaper
- CPU incompatibility => limited DRS VM migration options
Larger Host CPU and memory size preferred for VM placement (if all equal)
Differences in cache or memory architecture => inconsistency in performance
Aggressiveness threshold - Moderate threshold (default) works well for most cases
Aggressive thresholds recommended if homogenous clusters and VM demand relatively
constant and few affinity/anti-affinity rules
Use affinity/anti-affinity rules only when needed
Affinity rules: closely interacting VMs Anti-affinity rules: I/O intensive workloads, availability
Automatic DRS mode recommended (cluster-wide)
Manual/Partially automatic mode for location-critical VMs (per VM)
Per VM setting overrides cluster-wide setting
151
This design is simple and does not limit any VMs from any
physical resources. Using the ESX shares mechanism, if two
or more VMs are competing for the same physical resources
the tug of war that results will be decided by the resource pool
memberships of the VMs.
The ESX cluster will have three resource pools defined.
• A "High" resource pool will have no initial reservation and unlimited/expandable RAM and CPU settings. CPU and Memory shares will be set to high. This resource pool will be devoted to mission-critical VMs.
• A second "Normal" resource pool will have no initial reservation and unlimited/expandable RAM and CPU settings. CPU and Memory shares will be set to normal.
vCenter – Resource Pool Tug of War Design
152
This design takes the sum total of all physical resources and
slices it up across the resource pools. Although the following
design only uses two resource pools, many more "slices" could
be created. The most basic Pizza Design would be to reserve
all memory and cpu, but the following example helps also
illustrate reservations and limits.
The ESX cluster will have two resource pools defined.
• A "Critical Services" resource pool will have an initial reservation of 32GB RAM and 8GHz CPU, and unlimited/expandable RAM and CPU settings. This resource pool will be devoted to mission-critical VMs. Shares for RAM will be set to high, but shares for CPU will be set to normal.
vCenter – Resource Pool Pizza Design
153
vCenter - FT - Fault Tolerance
FT Provides complete VM redundancy
By definition, FT doubles resource requirements
Turning on FT disables performance-enhancing features like H/W MMU
Each time FT is enabled, it causes a live migration
Use a dedicated NIC for FT traffic
Place primaries on different hosts
Asynchronous traffic patterns
Host Failure considerations
Run FT on machines with similar characteristics
154
vCenter - HW Considerations and Settings
When purchasing new servers, target MMU virtualization (EPT/RVI) processors, or at least CPU virtualization (VT-x/AMD-V), depending on your application workloads
If your application workload creates/destroys a lot of processes or allocates a lot of memory, then MMU virtualization will help performance
Purchase uniform, high-speed, quality memory, and populate memory banks evenly in powers of 2
When choosing a system for better I/O performance, MSI-X is needed; it allows multiple queues across multiple processors to process I/O in parallel
PCI slot configuration on the motherboard should support PCIe 2.0 if you intend to use 10 Gb cards, otherwise you will not utilize the full bandwidth
155
vCenter - HW Considerations and Settings (cont.)
BIOS Settings
- Make sure what you paid for is enabled in the BIOS
- Enable "Turbo Mode" if your processors support it
- Verify that hyper-threading is enabled - more logical CPUs allow more options for the VMkernel scheduler
- On NUMA systems, verify that node interleaving is disabled, so the NUMA topology is exposed to ESX (see the NUMA slides earlier)
- Be sure to disable power management if you want to maximize performance, unless you are using DPM. Decide whether performance outweighs power savings
- C1E halt state - this causes parts of the processor to shut down for a short period of time in order to save energy and reduce thermal loss
- Verify VT/NPT/EPT are enabled, as older Barcelona systems do not enable these by default
- Disable any unused USB or serial ports
156
Reference Guide Links
VMware vCenter Server Performance and Best Practices for vSphere 4.1
http://www.vmware.com/resources/techresources/10145
Performance Best Practices for VMware vSphere® 4.0
http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.0.pdf
SAN System Design and Deployment Guide
http://www.vmware.com/files/pdf/techpaper/SAN_Design_and_Deployment_Guide.pdf
VMware vSphere: The CPU Scheduler in VMware ESX 4.1
http://www.vmware.com/files/pdf/techpaper/VMW_vSphere41_cpu_schedule_ESX.pdf
157
Reference Guide Links Continued…
Understanding Memory Resource Management in VMware ESX 4.1
http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_memory_mgmt.pdf
Managing Performance Variance of Applications Using Storage I/O Control
http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_SIOC.pdf
What's New in VMware® vSphere™ 4.1 - Networking
http://www.vmware.com/files/pdf/techpaper/VMW-Whats-New-vSphere41-Networking.pdf
VMware® Network I/O Control: Architecture, Performance and Best Practices VMware vSphere™ 4.1
http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf
Designing Resource Pools
http://vmetc.com/2008/03/04/designing-esx-resource-pools/
Questions
Wrap Up/Raffle Drawing