BCO2874 vSphere High Availability 5.0 and SMP Fault Tolerance – Technical Overview and Roadmap
Transcript of BCO2874 vSphere High Availability 5.0 and SMP Fault Tolerance – Technical Overview and Roadmap
BCO2874 vSphere High Availability 5.0 and SMP Fault Tolerance – Technical Overview and Roadmap
Name, Title, Company
2
Disclaimer
This session may contain product features that are currently under development.
This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new technologies or features discussed or presented have not been determined.
3
vSphere HA and FT Today
Minimize downtime without the cost/complexity of traditional solutions:
• vSphere HA provides rapid recovery from outages
• vSphere Fault Tolerance provides continuous availability
(Diagram: coverage axis from hardware through guest OS to application; downtime axis from minutes to none. Solutions: VM Infrastructure HA, Guest Monitoring, App Monitoring APIs, Fault Tolerance, partner solutions)
4
This Talk
1. Technical overview of vSphere HA 5.0
• Presented by Keith Farkas
2. Technical preview of vSphere Fault Tolerance SMP
• Presented by Jim Chow
(Diagram callouts: HA 5.0; multiple-vCPU FT)
5
vSphere HA 5.0 – Objectives
Learn about the enhancements in vSphere HA 5.0
Understand the new architecture
Identify questions for the breakout / expert sessions
6
vSphere HA 5.0
vSphere HA was completely rewritten in 5.0 to:
• Simplify setting up and managing HA clusters
• Enable more flexible and larger HA deployments
• Make HA more robust and easier to troubleshoot
• Support network partitions
The 5.0 architecture is fundamentally different. This talk:
• Describes the three key concepts
• Summarizes host failure responses
To learn more, see the other VMworld HA venues.
7
5.0 Architecture
New vSphere HA agent:
• Called the Fault Domain Manager (FDM)
• Provides all of the on-host HA functionality
As in previous releases:
• vCenter Server (VC) manages the cluster
• Failover operations are independent of VC
• FDMs communicate over the management network
(Diagram: vCenter Server managing four FDM-equipped hosts)
8
Key Concepts – Part 1
• FDM roles and responsibilities
• Inter-FDM communication
9
FDM Master
One FDM is chosen to be the master:
• Normally, one master per cluster
• All other FDMs assume the slave role
Any FDM can be chosen as master:
• No longer a primary/secondary role concept
• Selection is done using an election
Master-specific responsibilities:
• Monitors availability of hosts/VMs in the cluster
• Manages VM restarts after VM/host failures
• Reports cluster state and failover actions to VC
• Manages persisted state
(Diagram: one master and three slaves reporting to vCenter Server)
10
FDM Slave and Shared Responsibilities
Slave-specific responsibilities:
• Forwards critical state changes to the master
• Restarts VMs when directed by the master
• If the master fails, participates in the master election
Each FDM (master or slave):
• Monitors the state of local VMs and the host
• Implements the VM/App Monitoring feature
11
The Master Election
An election is held when:
• vSphere HA is enabled
• The master's host becomes inactive
• HA is reconfigured on the master's host
• A management network partition occurs
If multiple masters can communicate, all but one will abdicate.
Master-election algorithm:
• Takes 15 to 25 seconds (depends on the reason for the election)
• Elects the participating host with the greatest number of mounted datastores
(Diagram: FDM agents on ESX 1–4 holding an election)
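The election rule above can be sketched as follows. The dictionary shape and the lowest-host-name tie-break are illustrative assumptions; the real FDM election is a UDP protocol run among the participating hosts.

```python
# Sketch of the master-election rule: the participating host with the
# greatest number of mounted datastores wins.
def elect_master(hosts):
    """Return the host with the most mounted datastores (ties broken
    by lowest host name, an assumption made here for determinism)."""
    return min(hosts, key=lambda h: (-h["datastores"], h["name"]))

cluster = [
    {"name": "esx1", "datastores": 4},
    {"name": "esx2", "datastores": 6},
    {"name": "esx3", "datastores": 6},
]
master = elect_master(cluster)  # esx2: six datastores, wins the tie on name
```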
12
Agent Communication
FDMs communicate over the:
• Management networks
• Datastores
Datastores are used when the network is unavailable:
• Used when hosts are isolated or partitioned
Network communication:
• All communication is point to point
• The election is conducted using UDP
• All master–slave communication is via SSL-encrypted TCP
13
Questions Answered Using Datastore Communication
Master: Is a slave partitioned or isolated? Are its VMs running?
Slave: Is a master responsible for my VM?
Datastores used:
• Datastores selected by VC, called the Heartbeat Datastores
• Datastores containing the VMs' config files
15
Heartbeat Datastores
VC chooses (by default) two datastores for each host
You can override the selection or provide preferences• Use the cluster “edit settings” dialog for this purpose
16
Responses to Network or Host Failures
17
Host Is Declared Dead
The master declares a host dead when:
• The master can't communicate with it over the network:
  • The host is not connected to the master
  • The host does not respond to ICMP pings
• The master observes no storage heartbeats
Results:
• The master attempts to restart all of the host's VMs
• Restarts occur on network-reachable hosts and on the master's own host
(Diagram: FDM agents on ESX 1–4)
18
Host Is Network Partitioned
The master declares a host partitioned when:
• The master can't communicate with it over the network
• The master can see its storage heartbeats
Results:
• One master exists in each partition
• VC reports one master's view of the cluster
• Only one master "owns" any one VM
• A VM running in the "other" partition will be:
  • Monitored via the heartbeat datastores
  • Restarted (in the master's partition) if it fails
• When the partition is resolved, all but one master abdicates
(Diagram: FDM agents on ESX 1–4 split into two partitions)
19
Host Is Network Isolated
A host is isolated when:
• It sees no vSphere HA network traffic
• It cannot ping the isolation addresses
Results:
• The host invokes the (improved) isolation response:
  • First checks whether a master "owns" the VM
  • Applied if the VM is owned or the datastore is inaccessible
  • The default is now Leave Powered On
• The master:
  • Restarts the VMs that were powered off, or that fail later
  • Reports the host as isolated if both can access its heartbeat datastores, otherwise as dead
(Diagram: isolated host pinging the isolation addresses; FDM agents on ESX 1–4)
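Taken together, the last three slides describe how the master classifies a host it can no longer reach. A simplified sketch (argument names are assumptions; the ICMP ping probing and per-VM ownership checks are omitted for brevity):

```python
# Simplified sketch of the master's host classification.
def classify_host(network_reachable, storage_heartbeats, declared_isolated=False):
    if network_reachable:
        return "connected"          # normal master-slave network traffic
    if storage_heartbeats:
        # No network contact, but heartbeats are still visible on the
        # host's heartbeat datastores.
        return "isolated" if declared_isolated else "partitioned"
    return "dead"                   # no network contact and no heartbeats
```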
20
Key Concepts – Part 2 – HA Protection and Failure-Response Guarantees
21
vSphere HA Response to Failures
Type of Failure                  | Response           | Applicable to VMs
Guest OS hangs, crashes          | Reset VM           | With tools installed
Application heartbeats stop      | Reset VM           | With tools installed
Host fails (e.g., reboots)       | Attempt VM restart | VMs the responding master knows are HA Protected
Host isolation (VM powered off)  | Attempt VM restart | VMs the responding master knows are HA Protected
VM fails (e.g., VM crashes)      | Attempt VM restart | VMs the responding master knows are HA Protected
22
HA Protected Workflow
Timeline:
1. User issues power-on for a VM
2. Host powers on the VM
3. VC learns that the VM powered on
4. VC tells the master to protect the VM
5. Master receives the directive from VC
6. Master writes the fact to a file
7. Write is done
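The timeline above can be sketched as a small table mapping each completed step to the per-VM HA Protection value described later in the deck; the exact step boundaries here are an illustrative assumption.

```python
# Hypothetical mapping of the protection-workflow timeline to the
# per-VM HA Protection value (N/A -> Unprotected -> Protected).
TIMELINE = [
    ("user issues power-on for the VM",       "N/A"),
    ("host powers on the VM",                 "N/A"),
    ("VC learns that the VM powered on",      "Unprotected"),
    ("VC tells the master to protect the VM", "Unprotected"),
    ("master receives the directive",         "Unprotected"),
    ("master writes the fact to a file",      "Unprotected"),
    ("write done; VC learns VM is protected", "Protected"),
]

def protection_state(steps_completed):
    """HA Protection value after the first `steps_completed` steps."""
    if steps_completed == 0:
        return "N/A"
    return TIMELINE[steps_completed - 1][1]
```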
23
HA Restart Guarantee
Same timeline as the HA Protected workflow, with two guarantee windows:
• Before the master's write completes: an attempt may be made if a failure occurs now
• After the write is done: an attempt will be made for failures now and in the future
24
vSphere HA Protection Property
• A new per-VM property
• Reports whether a restart attempt is guaranteed
• Shown on the VM summary panel and, optionally, in VM lists
25
Values of the HA Protection Property
Value reported by VC along the workflow timeline:
• User issues power-on for a VM: N/A
• Host powers on the VM; VC learns that the VM powered on: Unprotected
• VC tells the master to protect the VM; master receives the directive and writes the fact to a file: Unprotected
• Write is done; master tells VC; VC learns the VM has been protected: Protected
26
Wrap Up
27
The vSphere HA feature gives organizations the ability to run their critical business applications with confidence.
The 5.0 enhancements provide:
• A solid, scalable foundation upon which to build toward the cloud
• Simpler management and troubleshooting
• Additional and more robust responses to failures
vSphere HA Summary
(Diagram: a resource pool of VMware ESXi hosts; the failed server's VMs are restarted on the operating servers)
28
To Learn More About HA and HA 5.0
At VMworld:
• See the demo in the VMware booth in the Solutions Exchange
• Try it out in lab HOL04 – Reducing Unplanned Downtime
• Attend group discussions GD15 and GD35 – vSphere HA and FT
• Review panel session VSP1682 – vSphere Clustering Q&A
• Talk with a knowledge expert (EXPERTS-09)
Offline:
• Availability Guide
• Best Practices Guide
• Troubleshooting Guide
• Release notes
29
vSphere Fault Tolerance SMP – Technical Preview
Objectives:
• Why Fault Tolerance?
• What's new: SMP
30
vSphere Availability Portfolio
(Diagram: coverage axis from hardware through guest OS to application; downtime axis from minutes to none. Solutions: VM Infrastructure HA, Guest Monitoring, App Monitoring APIs, Fault Tolerance)
31
Why Fault Tolerance?
Continuous availability:
• Zero downtime
• Zero data loss
• No loss of TCP connections
• Completely transparent to guest software
• Simple UI: Turn On Fault Tolerance
• Delegate all management to the virtual infrastructure
(Diagram: users connected to apps and an OS running in an FT-protected VM)
32
Background
2009: vSphere Fault Tolerance introduced in vSphere 4.0
2010: Updates to vSphere Fault Tolerance in vSphere 4.1
2011: Updates to vSphere Fault Tolerance in vSphere 5.0
Details: http://www.vmware.com/products/fault-tolerance/
Problem:
• FT is only for uniprocessor VMs
• Is FT for multi-processor VMs possible? An impressively hard problem
• A concerted effort was made to find an approach
We reached a milestone, and we'd like to share it.
33
A Starting Point: vSphere FT
(Diagram: two VMs, each with application, operating system, and virtualization layer, sharing a disk and kept in vLockstep over the FT logging channel)
34
A Clean Slate
(Diagram: the same two VMs, with vLockstep replaced by a new SMP protocol running over FT logging on 10 GigE)
35
A Clean Slate
We'll spare you the details; let's see it in action.
36
Live Demo
(Diagram: a client connected to the FT VM pair; SMP protocol over FT logging on 10 GigE)
Experimental setup; caveats apply.
37
Live Demo Summary
SMP FT in action:
• Presented a good solution
• Client oblivious to FT operation (SwingBench client, SSH client)
• Transparent failover
• Zero downtime, zero data loss
• A taste of the performance / bandwidth
But that's not all.
38
Performance Numbers
(Chart: % throughput of FT relative to non-FT, higher is better, for Microsoft SQL Server at 2 and 4 vCPUs and Oracle Swingbench at 2 and 4 vCPUs)
Similar configuration to the vSphere 4 FT Performance Whitepaper:
• Models real-world workloads: 60% CPU utilization
39
vSphere FT Summary
Why Fault Tolerance:
• Continuous availability
Fault Tolerance for multi-processor VMs:
• A good solution to an impressively hard problem
• A new design
• Demonstrated an experience similar to existing vSphere FT, but with more vCPUs
40
vSphere HA and FT – Future Directions
41
vSphere HA and FT – Technical Directions
Technical directions include:
• More comprehensive coverage of failures, for more applications
• A broader set of enablers for improving the availability of applications
(Diagram: coverage axis from hardware/VM through guest OS and application to multi-tier application; downtime axis from minutes to none; protection against host component failures. Building blocks: Infrastructure HA, VM/Guest Monitoring, App Monitoring APIs, Fault Tolerance with multiple vCPUs, Metro HA, API extensions, building blocks for creating available apps, and partner solutions)
Solidifying vSphere as the platform for running all mission-critical applications.
45
Thank you!
Questions?
BCO2874 vSphere High Availability 5.0 and SMP Fault Tolerance – Technical Overview and Roadmap
48
Additional vSphere HA 5.0 Details
49
Troubleshooting
50
Troubleshooting vSphere HA 5.0
HA issues proactive warnings about possible future conditions:
• VMs not protected after powering on
• Management network discontinuities
• Isolation addresses stop working
HA host states provide granularity into error conditions.
All HA conditions are reported via events; config issues/alarms for some:
• Event descriptions describe the problem and the actions to take
• All event messages contain "vSphere HA", making it easier to search for HA issues
• HA alarms are more fine-grained and auto-clearing (where appropriate)
The 5.0 Troubleshooting Guide discusses the likely top issues, e.g.:
• Implications of each of the HA host states
• Topics on heartbeat datastores, failovers, admission control
• Will be updated periodically
51
HA Agent Logging
HA 5.0 writes operational information to a single log file called fdm.log:
• A configurable number of historical copies are kept to assist with debugging
The file contains a record of, for example:
• Inventory updates relating to VMs, the host, and datastores, received from the host management agent (hostd)
• Processing of configuration updates sent to a master by vCenter Server
• Significant actions taken by the HA agent, such as protecting a VM or restarting a VM
• Messages sent by a slave to a master and by a master to a slave
Default location:
• ESXi 5.0: /var/log/fdm.log (historical copies in /var/run/log)
• Earlier ESX versions: /var/log/vmware/fdm (all files in the same directory)
Notes:
• See the vSphere HA Best Practices Guide for recommended log capacities
• HA log files are designed to assist VMware support in diagnosing problems, and the format may change at any time. For reporting, we therefore recommend relying on the vCenter Server HA-related events, alarms, config issues, and VM/host properties.
52
Log File Format
The log file contains time-stamped rows. Many rows report the HA agent (FDM) module that logged the information. E.g.:
2011-06-01T05:48:00.945Z [FFFE2B90 info 'Invt' opID=SWI-a111addb] [InventoryManagerImpl::ProcessClusterChange] Cluster state changed to Startup
Noteworthy modules:
• Cluster – module responsible for cluster functions
• Invt – module responsible for caching key inventory details
• Policy – module responsible for deciding what to do on a failure
• Placement – module responsible for placing failed VMs
• Execution – module responsible for restarting VMs
• Monitor – modules responsible for periodic health checks
• FDM – module responsible for communication with vCenter Server
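For scripted log analysis, a line like the sample above can be pulled apart with a regular expression. The pattern below is inferred from that single example and may not match every line format fdm.log produces.

```python
import re

# Fields: timestamp, thread id, log level, FDM module, optional opID,
# and the remainder of the message.
LOG_RE = re.compile(
    r"^(?P<ts>\S+)\s+"                           # 2011-06-01T05:48:00.945Z
    r"\[(?P<thread>\w+)\s+(?P<level>\w+)\s+"     # FFFE2B90 info
    r"'(?P<module>[^']+)'"                       # 'Invt'
    r"(?:\s+opID=(?P<opid>[^\]\s]+))?\]\s+"      # opID=SWI-a111addb (optional)
    r"(?P<msg>.*)$"                              # rest of the line
)

line = ("2011-06-01T05:48:00.945Z [FFFE2B90 info 'Invt' opID=SWI-a111addb] "
        "[InventoryManagerImpl::ProcessClusterChange] "
        "Cluster state changed to Startup")
m = LOG_RE.match(line)
```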
53
Additional Datastore Details for HA 5.0
• Heartbeating and heartbeat files
• Protected VM files
• File locations
54
Heartbeat Datastores (HB): Purpose and Mechanisms
Used by the master for slaves not connected to it over the network.
To determine whether a slave is alive:
• Relies on heartbeats issued to the slave's HB datastores
• Each FDM opens a file on each of its HB datastores for heartbeating purposes
• The files contain no information; on VMFS datastores, each file has the minimum-allowed file size
• Files are named X-hb, where X is the (SDK API) moID of the host
• The master periodically reads the heartbeats of all partitioned/isolated slaves
To determine the set of VMs running on a slave:
• An FDM writes a list of powered-on VMs into a file on each of its HB datastores
• The master periodically reads the files of all partitioned/isolated slaves
• Each poweron file contains at most 140 KB of info; on VMFS datastores, actual disk usage is determined by the file sizes supported by the VMFS version
• The files are named X-poweron, where X is the (SDK API) moID of the host
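A minimal helper for the naming scheme above; the moID "host-42" is an assumed example, and the on-disk layout is an internal implementation detail that may change between releases.

```python
# Illustrative only: the heartbeat and poweron file names an FDM would
# use on a heartbeat datastore for a given host moID.
def hb_file_names(host_moid):
    return {
        "heartbeat": f"{host_moid}-hb",     # zero-length heartbeat file
        "poweron": f"{host_moid}-poweron",  # list of powered-on VMs
    }
```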
55
VM Protected Files
The protected-VM files are used:
• When recovering from a master failure
• To determine whether a master is responsible for a given VM
• To divvy the VMs up between masters during a partition
One protectedlist file per datastore, per cluster using the datastore:
• It stores the local paths of the protected VMs
• A VM is listed only in the file on the datastore containing its config file
Each file is a fixed 2 MB in size.
56
File Locations
FDMs create a directory (.vSphere-HA) in the root of each relevant datastore.
Within it, they create a subdirectory for each cluster using the datastore.
Each subdirectory is given a unique name, called the Fault Domain ID:
  FDM-<VC uuid>-<cluster entity ID>-<8 random hex characters>-<VC hostname>
• The entity ID is the number portion of the (SDK API) moID of the cluster
E.g., in /vmfs/volumes/clusterDS/.vSphere-HA/:
  FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-9-d6bfc023-vc23/   (cluster 9)
  FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-17-ad9fd307-vc23/  (cluster 17)
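The Fault Domain ID can be parsed with a regular expression. This is a sketch: it assumes the fixed "FDM-" prefix seen in the example paths and a 36-character VC instance UUID.

```python
import re

# Breaks a Fault Domain ID directory name into its components.
FD_RE = re.compile(
    r"^FDM-(?P<vc_uuid>[0-9A-Fa-f-]{36})"   # VC instance UUID
    r"-(?P<entity_id>\d+)"                  # cluster entity ID (moID number)
    r"-(?P<rand>[0-9a-f]{8})"               # 8 random hex characters
    r"-(?P<vc_host>.+)$"                    # VC hostname
)

name = "FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-9-d6bfc023-vc23"
m = FD_RE.match(name)
```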
57
UI Changes
58
Summary of UI Changes
Cluster Summary Screen:
• Advanced Runtime Info (improved)
• Cluster Status (new)
• Configuration Issues (improved)
Cluster and datacenter:
• Hosts list view (improved)
Cluster Configuration:
• Datastore Heartbeating (new)
• Admission Control (improved)
Host, cluster, datacenter:
• VM list view (improved)
Host Summary Screen:
• HA host state (improved)
VM Summary Screen:
• HA Protection (improved)
Summary of UI Changes
Host, cluster, datacenter:
• VM list view (improved), showing protected VMs
65