VMware Virtual SAN 6.2 Proof of Concept Guide

Post on 13-Feb-2017

261 views 4 download

Transcript of VMware Virtual SAN 6.2 Proof of Concept Guide

VirtualSAN6.0ProofOfConceptGuide

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1

VMwareVirtualSAN6.2ProofofConceptGuideAugust2016EditionCormacHoganDavidBoonePaudieO’RiordanBradGarvey

VirtualSAN6.1ProofOfConceptGuide

VMwa r e S t o r a g e B u s i n e s s U n i t D o c umen t a t i o n / 2

Contents

1.INTRODUCTION.................................................................................................................................62.BEFOREYOUSTART.........................................................................................................................62.1ALLFLASHORHYBRID.....................................................................................................................................62.2THREE-NODEVERSUSFOUR-NODEORGREATER........................................................................................62.3FOLLOWTHEVSPHERECOMPATIBILITYGUIDEPRECISELY......................................................................72.3.1WhyIsThisImportant?...........................................................................................................................72.3.2Hardware,Drivers,andFirmware.....................................................................................................72.3.3RAID-0versusPass-ThroughforDisks.............................................................................................72.3.4ControllerConfiguration........................................................................................................................8

2.4USESUPPORTEDVSPHERESOFTWAREVERSIONS......................................................................................83.VIRTUALSANPOCSETUPASSUMPTIONSANDPREREQUISITES.......................................94.VIRTUALSANNETWORKSETUP................................................................................................104.1CREATINGAVMKERNELPORTFORVIRTUALSAN.................................................................................10

5.ENABLINGVIRTUALSANONTHECLUSTER...........................................................................156.ENABLETHEVIRTUALSANHEALTHCHECKPLUGIN..........................................................176.1CHECKYOURNETWORKTHOROUGHLY......................................................................................................186.1.1WhyIsThisImportant?........................................................................................................................196.1.2ChecktheNetworkPartitionGroupsafterCreatingCluster...............................................196.1.3UsetheHealthCheckPlugintoVerifyVirtualSANFunctionality.....................................19

7.VSPHEREFUNCTIONALITYONVIRTUALSAN........................................................................237.1DEPLOYYOURFIRSTVM..............................................................................................................................237.2SNAPSHOTVM.................................................................................................................................................327.3CLONEAVM....................................................................................................................................................367.4VMOTIONAVMBETWEENHOSTS...............................................................................................................397.5OPTIONAL:STORAGEVMOTIONAVMBETWEENDATASTORES...........................................................427.5.1MountanNFSDatastoretotheHosts...........................................................................................427.5.2StoragevMotionaVMfromVirtualSANtoAnotherDatastoreType.............................427.5.3StoragevMotionofVMtoVirtualSANfromAnotherDatastoreType............................44

8.SCALEOUTVIRTUALSAN.............................................................................................................458.1ADDTHEFOURTHHOSTTOVIRTUALSANCLUSTER..............................................................................468.2MANUALOPTION:CREATEDISKGROUPONNEWHOST........................................................................488.3VERIFYVIRTUALSANDISKGROUPCONFIGURATIONONNEWHOST.................................................498.4VERIFYNEWVIRTUALSANDATASTORECAPACITY...............................................................................49

9.VMSTORAGEPOLICIESANDVIRTUALSAN............................................................................519.1CREATEANEWVMSTORAGEPOLICY........................................................................................................529.2DEPLOYANEWVMWITHTHENEWVMSTORAGEPOLICY..................................................................569.3ADDANEWVMSTORAGEPOLICYTOANEXISTINGVM........................................................................599.4MODIFYAVMSTORAGEPOLICY..................................................................................................................619.5IOPSLimits....................................................................................................................................................659.6Checksum.......................................................................................................................................................66

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3

10.VIRTUALSANMONITORING......................................................................................................6810.1MONITORTHEVIRTUALSANCLUSTER...................................................................................................6810.2MONITORVIRTUALDEVICESINTHEVIRTUALSANCLUSTER............................................................6910.3MONITORPHYSICALDEVICESINTHEVIRTUALSANCLUSTER..........................................................7010.4MONITORRESYNCHRONIZATIONANDREBALANCEOPERATIONS......................................................7010.5DEFAULTVIRTUALSANALARMS.............................................................................................................7110.7MONITORVIRTUALSANWITHVSANOBSERVER................................................................................7310.8PERFORMANCEMONITORINGSERVICE....................................................................................................73

11.PERFORMANCETESTING...........................................................................................................7711.1USEVSANOBSERVER.................................................................................................................................7811.2PERFORMANCECONSIDERATIONS............................................................................................................7811.2.1Singlevs.MultipleWorkers.............................................................................................................7811.2.2WorkingSet............................................................................................................................................7811.2.3SequentialWorkloadsversusRandomWorkloads...............................................................7911.2.4OutstandingIOs....................................................................................................................................7911.2.5BlockSize.................................................................................................................................................7911.2.6CacheWarmupConsiderations....................................................................................................7911.2.7NumberofMagneticDiskDrivesinHybridConfigurations..............................................7911.2.8StripingConsiderations.....................................................................................................................8011.2.9GuestFileSystemsConsiderations................................................................................................8011.2.10PerformanceduringFailureandRebuild...............................................................................80

11.3PERFORMANCETESTINGOPTION1:VIRTUALSANHEALTHCHECK................................................8111.4PERFORMANCETESTINGOPTION2:HCIBENCH...................................................................................8311.4.1WheretoGetHCIbench.....................................................................................................................8311.4.2DeployingHCIbench............................................................................................................................8311.4.3ConsiderationsforDefiningTestWorkloads...........................................................................88Results.....................................................................................................................................................................91

12.TESTINGHARDWAREFAILURES.............................................................................................9212.1UNDERSTANDINGEXPECTEDBEHAVIOR.................................................................................................9212.2IMPORTANT:TESTONETHINGATATIME..............................................................................................9212.3VMBEHAVIORWHENMULTIPLEFAILURESENCOUNTERED..............................................................9212.3.1VMPoweredonandVMHomeNamespaceObjectGoesInaccessible...........................9212.3.2VMPoweredonandDiskObjectGoesInaccessible...............................................................92

12.4WHATHAPPENSWHENASERVERFAILSORISREBOOTED?...............................................................9312.5SIMULATEHOSTFAILUREWITHOUTVSPHEREHA...............................................................................9412.6SIMULATEHOSTFAILUREWITHVSPHEREHA......................................................................................9712.7DISKISPULLEDUNEXPECTEDLYFROMESXIHOST............................................................................10112.7.1ExpectedBehaviors..........................................................................................................................101

12.8SSDISPULLEDUNEXPECTEDLYFROMESXIHOST.............................................................................10212.8.1ExpectedBehaviors..........................................................................................................................102

12.9WHATHAPPENSWHENADISKFAILS?.................................................................................................10312.9.1ExpectedBehaviors..........................................................................................................................103

12.10WHATHAPPENSWHENANSSDFAILS?.............................................................................................10412.10.1ExpectedBehaviors.......................................................................................................................104

12.11VIRTUALSANDISKFAULTINJECTIONSCRIPTFORPOCFAILURETESTING..............................10512.12PULLMAGNETICDISK/CAPACITYTIERSSDANDREPLACEBEFORETIMEOUTEXPIRES.........10512.13PULLMAGNETICDISK/CAPACITYTIERSSDANDDONOTREPLACEBEFORETIMEOUTEXPIRES.................................................................................................................................................................................10812.14PULLCACHETIERSSDANDDONOTREINSERT/REPLACE............................................................110

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4

12.15CHECKINGREBUILD/RESYNCSTATUS................................................................................................11312.16INJECTINGADISKERROR.......................................................................................................................11512.16.2ClearaPermanentError............................................................................................................117

12.17WHENMIGHTAREBUILDOFCOMPONENTSNOTOCCUR?.............................................................11912.17.1LackofResources...........................................................................................................................11912.17.2UnderlyingFailures.......................................................................................................................119

13.VIRTUALSANMANAGEMENT................................................................................................12013.1PUTAHOSTINTOMAINTENANCEMODE..............................................................................................12013.2REMOVEANDEVACUATEADISK.............................................................................................................12513.3EVACUATEADISKGROUP........................................................................................................................12713.4ADDDISKGROUPSBACKAGAIN.............................................................................................................12813.5TURNINGONANDOFFDISKLEDS..........................................................................................................129

14.VIRTUALSAN6.1STRETCHEDCLUSTERCONFIGURATION.........................................13114.1VIRTUALSAN6.1STRETCHEDCLUSTERNETWORKTOPOLOGY.....................................................13114.2VIRTUALSAN6.1STRETCHEDCLUSTERHOSTS.................................................................................13114.3VIRTUALSAN6.1STRETCHEDCLUSTERDIAGRAM...........................................................................13214.4PREFERREDSITEDETAILS.......................................................................................................................13214.4.1CommandstoAddStaticRoutes.................................................................................................134

14.5SECONDARYSITEDETAILS.......................................................................................................................13514.5.1CommandstoAddStaticRoutes.................................................................................................136

14.6ANOTEONIGMPV3.................................................................................................................................13614.7WITNESSSITEDETAILS............................................................................................................................13714.7.1CommandstoAddStaticRoutes.................................................................................................138

14.8VSPHEREHASETTINGS............................................................................................................................13914.8.1ResponsetoHostIsolation............................................................................................................13914.8.2AdmissionControl.............................................................................................................................13914.8.3AdvancedSettings.............................................................................................................................140

14.9VMHOSTAFFINITYGROUPS...................................................................................................................14114.10DRSSETTINGS.........................................................................................................................................143

15.VIRTUALSANSTRETCHEDCLUSTERNETWORKFAILOVERSCENARIOS................14415.1NETWORKFAILUREBETWEENSECONDARYSITEANDWITNESS.....................................................14415.1.1TriggertheEvent..............................................................................................................................14415.1.2ClusterBehavioronFailure..........................................................................................................14415.1.3Conclusion............................................................................................................................................14715.1.4RepairtheFailure.............................................................................................................................148

15.2NETWORKFAILUREBETWEENPREFERREDSITEANDWITNESS......................................................14915.2.1TriggertheEvent..............................................................................................................................14915.2.2ClusterBehavioronFailure..........................................................................................................15015.2.3Conclusion............................................................................................................................................15215.2.4RepairtheFailure.............................................................................................................................152

15.3NETWORKFAILUREBETWEENWITNESSANDBOTHDATASITES...................................................15315.3.1TriggertheEvent..............................................................................................................................15315.3.2ClusterBehavioronFailure..........................................................................................................15315.3.3Conclusion............................................................................................................................................15415.3.4RepairtheFailure.............................................................................................................................154

16.VIRTUALSAN6.2ALLFLASHFEATURES..........................................................................15516.1DeduplicationandCompression....................................................................................................15516.2RAID-5/RAID-6ErasureCoding.....................................................................................................158

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5

17.FURTHERINFORMATION........................................................................................................16117.1VMWAREVIRTUALSANCOMMUNITY...................................................................................................16117.2LINKSTOEXISTINGDOCUMENTATION...................................................................................................16117.3VMWARESUPPORT...................................................................................................................................161

APPENDIXA—FAULTDOMAINS..................................................................................................162A1.SETTINGUPFAULTDOMAINS....................................................................................................................162A2.CREATEAPOLICYTOLEVERAGEFAULTDOMAINS................................................................................164A3.CREATEAVMANDCHECKTHEFAULTDOMAINS..................................................................................167

APPENDIXB—MIGRATINGFROMSTANDARDVSWITCHTODISTRIBUTED................170B.1CREATEDISTRIBUTEDSWITCH..................................................................................................................170B.2CREATEPORTGROUPS................................................................................................................................172B.3MIGRATEMANAGEMENTNETWORK........................................................................................................175B.4MIGRATEVMOTION.....................................................................................................................................180B.5MIGRATEVIRTUALSANNETWORK.........................................................................................................180B.6MIGRATEVMNETWORK............................................................................................................................183

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6

1. Introduction VMware customers love the simplicity, performance and integration of VMware®VirtualSAN™sinceitslaunch.Most customers choose to evaluate Virtual SAN before using it for production –alwaysagoodidea.We’vemadealistofissuesoccasionallyencounteredaspeoplegothroughthisprocess.Followthisguide,andyou’llhaveagreatevaluation.

2. Before You Start Planontestingareasonablehardwareconfigurationthatresembleswhatyouplantouseinproduction.RefertotheVMwareVirtualSAN6.0DesignandSizingGuideforinformation on supported hardware configurations, and consideration whendeployingVirtualSAN.

2.1 All Flash or Hybrid Thereareanumberofadditionalconsiderationsifyouplantodeployanall-flashVirtualSANsolution:

• All-flashisavailableinVirtualSANsinceversion6.0.• Itrequiresa10GbEthernetnetwork;itisnotsupportedwith1GbNICs.• Themaximumnumberofall-flashhostsis64.• Flashdevicesareusedforbothcacheandcapacity.• Flashreadcachereservationisnotusedwithall-flashconfigurations.• Thereisaneedtomarkaflashdevicesoitcanbeusedforcapacity–thisis

coveredintheVirtualSANAdministratorsGuide.• Endurancenowbecomesanimportantconsiderationbothforcacheand

capacitylayers.• Deduplicationandcompressionavailableonall-flashonly• ErasureCoding(Raid5/6)isavailableonall-flashonly

2.2 Three-node versus Four-node or Greater WhileVirtualSANfullysupports3-nodeconfigurations,theycanbehavedifferentlythanconfigurationswith4orgreaternodes.Inparticular,intheeventofafailureyoudo not have the ability to rebuild components on another host in the cluster totolerateanotherfailure.Alsowith3-nodeconfigurations,youwillnothavetheabilitytomigratealldatafromanodeduringmaintenance.Thisisbecausevirtualmachinesona3nodeclustercannotbeconfiguredtotoleratemorethanonefailure.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7

Ifyouplantodeploya3-nodecluster,thenthatiswhatyoushouldtest.Butifyouplanondeployinglargerclusters,westronglyrecommendtesting4ormorenodes.Further considerationswith three-node clusters are covered in the failure testingsectionofthisdocument.For the purposes of this proof-of-concept guide, a 4-node configuration is used.However,theclustersizeis3-nodestobeginwith,andthefourthnodewillbeaddedduringthecourseoftheproof-of-concept.

2.3 Follow the vSphere Compatibility Guide Precisely

2.3.1 Why Is This Important? Wecannotoverstate the importanceof following thevSphereCompatibilityGuide(VCG)forVirtualSANtotheletter.Asignificantnumberofoursupportrequestsareultimately traced back to customers failing to follow these very specificrecommendations.Thison-linetoolisregularlyupdatedtoensurecustomersalwayshavethelatestguidancefromVMwareavailabletothem.ThehardwarecompatibilityguidecanbefoundatthefollowingURL:http://www.vmware.com/resources/compatibility/search.php?deviceCategory=vsanio&productid=39770&details=1&vsan_type=vsanio&io_releases=274&keyword=&page=1&display_interval=10&sortColumn=Partner&sortOrder=AscTherearetwooptionsforchoosinghardware.ThefirstmethodistochooseaReadyNodeincludedintheVSANReadyNodeConfigurator.ReadyNodesarevalidatedtoprovidepredictableperformanceandscalability.IfyouwouldliketobuildyourownVirtualSANhosts,youcanchoosefromcertifiedcomponentsintheHardwareCompatibilityList.IfyouchoosethisrouteyoumustconfirmthatallcomponentsincludingdrivesaresupportedbytheOEMservermanufacturer.

2.3.2 Hardware, Drivers, and Firmware TheVCGmakesveryspecificrecommendationsonhardwaremodelsforStorageI/Ocontrollers,SSDs,PCI-Eflashcardsanddiskdrives. Italsospecifieswhichdrivershavebeenfullytested,and–inmanycases–identifiesthefirmwarelevelrequired.Themostdirectwaytocheckthecontroller’sfirmwareversionisbyinterruptingthebootprocessandlookingintothecontroller’sBIOSsettings.TheVMwareVirtualSANDiagnosticsandTroubleshootingReferenceManualcontainsinformationaboutusing‘esxcli hardware pci list’ and ‘vmkload_mod -s' to find the I/O controller’s driverversion.

2.3.3 RAID-0 versus Pass-Through for Disks TheVCGwilltellyouifacontrollersupportsRAID-0orpass-throughwhenpresentingdisks to ESXi hosts. RAID-0 is only supportedwhen pass-through is not possible.Check that you are using the correct configuration and that the configuration isuniformacrossallnodes.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8

2.3.4 Controller Configuration Keepthecontrollerconfigurationrelativelysimple.Forcontrollerwithcache,eitherdisableit,or–ifthatisnotpossible-setitto100%read.ForothervendorspecificcontrollerfeaturessuchasHPSSDSmartPath,werecommenddisablingthem.ThismayonlybepossiblefromtheBIOSofthecontrollerinmanycases.

2.4 Use Supported vSphere Software Versions ItishighlyrecommendedthatanyonewhoisconsideringanevaluationofVirtualSANshould pick up the latest versions of software. VMware continuously fixes issuesencounteredbycustomers,sobyusingthelatestversionofthesoftware,youavoidissuesalreadyfixed.

This version of the proof-of-concept guide is specific to version 6.2 of VSAN andvSphereversion6.0u2.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9

3. Virtual SAN POC Setup Assumptions and Prerequisites Priortostartingtheproofofthefollowingpre-requisitesmustbecompleted.ForacompletelistingofPOCPre-requisitesrefertoVMwareVirtualSAN6ProofofConceptChecklist.Thefollowingassumptionsarebeingmadewithregardstothedeployment:

• Fourserversareavailable,andarecompliantwiththeVirtualSANHCL.• AllservershavehadESXi6.0u2deployed.• A6.0u2vCenterServerhasbeendeployedtomanagethesefourESXihosts.

ThesestepswillnotbecoveredinthisPOCguide.• ServicessuchasDHCP,DNSandNTPareavailableintheenvironmentwhere

thePOCistakingplace.• ThreeoutoffourESXihostsshouldbeplacedinaclusterinvCenter.• Theclustermustnothaveanyfeaturesenabled,suchasDRS,HAorVirtual

SAN.ThesewillbedonethroughoutthecourseofthePOC.• EachhostmusthaveamanagementnetworkandavMotionnetworkalready

configured.ThereisnoVirtualSANnetworkconfigured.ThiswillbedoneaspartofthePOC.

• ForthepurposesoftestingStoragevMotionoperations,anadditionaldatastoretype,suchasNFSorVMFS,shouldbepresentedtoallhosts.ThisisanoptionalPOCexercise.

• AsetofIPaddresses,oneperESXihostwillbeneededfortheVirtualSANtrafficVMkernelports.TherecommendationisthattheseareallonthesameVLANandnetworksegment.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0

4. Virtual SAN Network Setup BeforeVirtualSANcanbeenabledtheatleastthreehostsmustbeaddedtotheclusterandbeassignedmanagementIPaddresses.AllESXihostsinaVirtualSANClustercommunicateoveraVirtualSANnetwork.FornetworkdesignandconfigurationbestpracticesrefertotheVMwareVirtualSANNetworkDesignGuide.ThefollowingexampledemonstrateshowtoconfigureaVirtualSANnetworkonanESXihost.

4.1 Creating a VMkernel Port for Virtual SAN Inmanydeployments,VirtualSANmaybesharingthesameuplinksasthemanagementandvMotiontraffic,especiallywhen10GbENICsareutilized.Lateron,wewilllookatanoptionalworkflowthatmigratesthestandardvSwitchestoadistributedswitchforthepurposeofprovidingQualityOfService(QoS)totheVirtualSANtrafficthroughafeaturecalledNetworkI/OControl.Thisisonlyavailableondistributedswitches.However,theassumptionforthisPOCisthatthereisalreadyastandardvSwitchcreatedwhichcontainstheuplinksthatwillbeusedforVirtualSANtraffic.Inthisexample,aseparatevSwitch(vSwitch1)withdedicated1GbeNICshasbeencreatedforVirtualSANtraffic,whilethemanagementandvMotionnetworkusedifferentuplinksonaseparatestandardvSwitch.TocreateaVirtualSANVMkernelport,followthesesteps:Select an ESXi host in the inventory, then navigate to Manage > Networking >VMkernelAdapters.Clickontheiconfor“Addhostnetworking”,ashighlightedbelow:

Figure 4.1: Add host networking EnsurethatVMkernelNetworkAdapterischosen.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1

Figure 4.2: Select VMkernel Network Adapter type The next step gives you the opportunity to build a new standard vSwitch for theVirtualSANnetworktraffic.Inthisexample,analreadyexistingvSwitch1containstheuplinksfortheVirtualSANtraffic.Ifyoudonothavethisalreadyconfiguredinyour environment, you can use an already existing switch or select the option tocreateanewstandardvSwitch.Whenyouarelimitedto2x10GbEuplinks,itmakessensetousethesameVSS.Whenyouhavemanyuplinks,somededicatedtodifferenttraffictypes(asinthisexample),managementcanbealittleeasierifdifferentVSSwiththeirownuplinksareusedforthedifferenttraffictypes.AsthereisanexistingvSwitchinourenvironmentthatcontainsthenetworkuplinksfortheVirtualSANtraffic,the“browse”buttonisusedtoselectitasshownbelow.

Figure 4.3: Select and existing standard switch via “Browse” button

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2

Figure 4.4: Choose a vSwitch

Figure 4.5: vSwitch is displayed once selected

ThenextstepistosetuptheVMkernelportproperties,andchoosetheservices,suchasVirtualSANtraffic.Thisiswhattheinitialportpropertieswindowlookslike.

Figure 4.6: Default port properties

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3

HereiswhatitlookslikewhenpopulatedwithVirtualSANspecificinformation.

Figure 4.7: Port properties configured for Virtual SAN traffic

Intheaboveexample,thenetworklabelhasbeendesignated“VirtualSAN”,andtheVirtualSANtrafficdoesnotrunoveraVLAN.IfthereisaVLANusedfortheVirtualSANtrafficinyourPOC,changethisfrom“None(0)”toanappropriateVLANID.The next step is to provide an IP address and subnet mask for the Virtual SANVMkernel interface.Aspertheassumptionsandpre-requisitessectionearlier,youshouldhavetheseavailablebeforeyoustart.Atthispoint,yousimplyaddthem,oneperhostbyclickingon“UsestaticIPv4settings”asshownbelow.Alternatively,ifyouplan on using DHCP IP addresses, leave the default settingwhich is “Obtain IPv4settingsautomatically”.

Figure 4.8: IP address and subnet mask Thefinalwindowisareviewwindow.Hereyoucancheckthateverythingisaspertheoptionsselectedthroughoutthewizard.Ifanythingisincorrect,youcannavigatebackthroughthewizard. Ifeverything looks like it iscorrect,youcanclickonthe“Finish”button.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4

Figure 4.9: Review window

IfthecreationoftheVMkernelportissuccessful,itwillappearinthelistofVMkernelports,asshownbelow.

Figure 4.10: VMkernel adapters with new Virtual SAN VMkernel adapter ThatcompletestheVirtualSANnetworkingsetupforthathost.YoumustnowrepeatthisforallotherESXihosts,includingthehostthatisnotcurrentlyintheclusteryouwilluseforVirtualSAN.IfyouwishtouseaDVS(distributedvSwitch),thestepstomigratefromstandardvSwitch(VSS)toDVSaredocumentedinAppendixB—MigratingfromStandardvSwitchtoDistributedofthisPOCGuide.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5

5. Enabling Virtual SAN on the Cluster Onceallthepre-requisiteshavebeenmet,VSANcanbeconfigured.ToenableVSANcompletethefollowingsteps:

1. OpenthevSphereWebClient.2. ClicktheHostsandClustersTab.3. SelecttheclusteronwhichyouwishtoenableVirthalSAN.4. ClicktheManagetab.5. ClickSettings.6. UnderVirtualSAN,SelectGeneralandclickConfigure.

Figure 5.1: Virtual SAN is Turned OFF

7. Selectthemodeforstoragediskstobeclaimed.Theoptionsare:◦ Manual–Requiresyoutomanuallyclaimthedisksyouwanttouseoneach

host.NewdisksonthehostarenotautomaticallyclaimedbyVirtualSAN.Notethatthismodeismandatoryforanall-flashconfiguration!Italsoistherecommendedmodeasitallowsyoutokeepcontroloverwhendevicesareaddedandtowhichdiskgrouptheyareadded.

◦ Automatic–ClaimsallunclaimeddisksontheincludedhostsforVirtualSAN.VirtualSAN inautomaticmodeclaimsonly localdiskson theESXihosts in the cluster.Remotenon-shareddisks canbeaddedmanually ifrequired.

8. WhenusinganAll-Flashconfiguration,youhavetheoptiontoenable

DeduplicationandCompression.DeduplicationandCompressionarecoveredinChapter16ofthisguide.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6

9. IfdesiredFaultDomainsandStretchedor2-Nodeclusteroptionscanbecreatedaspartoftheworkflow.AspartofthebasicconfigurationselectDonotconfigure.

10. ClickNext11. EnsurethenetworkvalidationscreeniscorrectandclickNext.12. Whenthemanualdiskclaimoptionhasbeenselectedyoucanclaimallthe

disksatonce.a. Foreachlisteddiskmakesureitislistedcorrectlyasflash,HDD,

cachingdeviceorcapacitydrive.

Figure 5.6: Claiming disks 1

13. ClickNext14. VerifytheconfigurationandclickFinish.

Oncetheconfigurationprocessiscomplete,ReturntotheGeneralview.Youshouldnowshowthenumberofflashdisksinuse(three,oneperhost)anddatadisks(six,twoperhost)thatarenowinuse.ItshouldalsoshowthetotalcapacityoftheVirtualSANdatastore,whichinthiscaseis~812GB.Thatis6x136GB,lesssomeoverhead.Rememberthatflashdevicesdonotcontributetowardscapacity,onlythemagneticdiskdevices(inthecaseofhybridconfigurations).

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7

Figure 5.8: On-disk Format Version

6. Enable the Virtual SAN Health Check Plugin FollowingonfromtheVirtualSAN6.0GArelease,anewfeaturecalledHealthCheckpluginwas released.This givesadministratorsvaluable information regarding thestateoftheVirtualSANCluster,andisalsoextremelyusefulforPOCactivitiesasitquicklydiscoversissues.There is an in-depth description of health checks, including how to install andconfigureit,aswellasdetailedinformationonthevariouschecksthatitcarriesout.RefertotheVMwareVirtualSANHealthCheckPluginGuide.StartingwithvSphere6.0update1andvCenter6.0update1,theHealthCheckpluginispre-installedbothinvCenterandasaVIBoneachESXihost.Allthat’srequiredistoenablethehealthcheckservicesonceVirtualSANisenabled. This isdoneonacluster-by-clusterbasisatthecluster’sManagetab>Settings>VirtualSAN>Health.Onceenabled,thenewhealthcheckserviceinVirtualSAN6.1runshourlybydefaultandon-demandwhenyouvisitthecluster’sMonitortab>VirtualSAN>Health.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 8

Figure 6.1: Managing Virtual SAN health check service

WiththeVirtualSAN-enabledclusterobjectselectedintheinventory,navigatetotheMonitortab>VirtualSAN>Health.Thiswilldisplaythelistofhealthcheck,andtheirstatus.Hopefullyeverythingwillshowupaspassedasperfigure6.2below.

Figure 6.2: Top level list of health checks

6.1 Check Your Network Thoroughly OncetheVirtualSANnetworkhasbeencreated,andVirtualSANisenabled,youshouldcheckthateachESXihostintheVirtualSANClusterisablecommunicatetoallotherESXihostsinthecluster.TheeasiestwaytoachievethisisviatheHealthCheckPlugin.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 9

6.1.1 Why Is This Important? Virtual SAN is entirely dependent on the network: its configuration, reliability,performance,etc.Oneofthemostfrequentcausesofrequestingsupportiseitheranincorrectnetworkconfiguration,orthenetworknotperformingasexpected.

6.1.2 Check the Network Partition Groups after Creating Cluster Anetworkpartitioniswhenasubsetofhosts(oneormore)inunabletocommunicateto another subset of hosts. TheDiskManagement view (found under Virtual SANCluster>Managetab>Settings)providesimmediateinformationaboutwhetherornotthereisanetworkpartitioninyourcluster.Ifthenetworkisfunctioningproperly,allhostswillbeinGroup1.OnlyifmulticastroutingisproperlyconfiguredwouldVirtualSANstillfunctionwithmultiplepartitiongroups.RefertotheNetworkhealthtestsunderCluster>Monitor>VirtualSAN>Health.

Figure 6.3: Network Partition Group info

6.1.3 Use the Health Check Plugin to Verify Virtual SAN Functionality Runningindividualcommandsfromonehosttoallotherhostsintheclustercanbetediousandtimeconsuming.Fortunately,sinceVirtualSAN6.0supportsanewhealthcheckplugin,partofwhichteststhenetworkconnectivitybetweenallhostsinthecluster. If for some reason the cluster will not form, and displays a “Networkmisconfiguration”intheGeneralview,youshouldproceedwithenablingthehealthcheckplugin,outlinedintheprevioussection.Thiswillreducethetimetodetectandresolvethenetworkingissue,oranyotherVirtualSANmisconfigurationissuesinthecluster.Inthescreenshotbelow,onecanseethateachofthehealthchecksfornetworkinghassuccessfullypassed.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 0

Figure 6.4: Network health checks all passed

Ifanyofthenetworkhealthchecksfail,selecttheappropriatecheckandexaminethedetailsscreenbelowfordetailsonhowtoresolvetheissue.EachdetailsviewalsocontainsanAskVMwarebuttonwhereappropriate,whichwilltakeyoutoaVMwareKnowledgeBasearticledetailingtheissue,andhowtotroubleshootandresolveit.For example, in this case where one host does not have a Virtual SAN vmknicconfigured,thisiswhatisdisplayed.

Figure 6.5: Network health failure example

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 1

BeforegoinganyfurtherwiththisPOC,itisworthdownloadingthelatestversionoftheHCLdatabaseandrunninga“Retest”ontheHealthcheckscreen.Thiswillensureeverything in the cluster is optimal. It will also check the hardware against theVMware Compatibility Guide (VCG) for Virtual SAN, verify that the networking isfunctional,andthattherearenounderlyingdiskproblems.Allgoingwell,aftertheRetest,everythingshouldstilldisplaya“Passed”status.

Figure 6.6: Virtual SAN Health checksAt this point the Cluster health, Limits health and Physical disk health should beexamined.Thedatahealthonlybecomesrelevantonceyoustart todeployvirtualmachinestotheVirtualSANdatastore.

Figure 6.7: Expanded Health check plugin Checks

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 2

VirtualSAN,includingtheHealthcheckmonitoring,isnowsuccessfullydeployed.TheremainderofthisPOCguidewillinvolvevarioustestsanderrorinjectionstoshowhowVirtualSANwillbehaveunderthesecircumstances.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 3

7. vSphere Functionality on Virtual SAN ThisinitialtestisperVMtesting,andwillhighlightthefactthatgeneralvirtualmachineoperationsareunchangedinVirtualSANenvironments.

7.1 Deploy Your First VM Inthissection,aVMisdeployedtotheVirtualSANdatastoreusingthedefaultstoragepolicy.Thisdefaultpolicyispreconfiguredanddoesnotrequireanyinterventionunlessyouwishtochangethedefaultsettings,whichwedonotrecommend.Toexaminethedefaultpolicysettings,navigatetoHome>VMStoragePolicies.

Figure 7.1: VM Storage PoliciesFromthere,selectVirtualSANDefaultStoragePolicyandthenselecttheManagetab.UndertheManagetab,selectRule-Set1:VirtualSANtoseethesettingsonthepolicy:

Figure 7.2: Rule-Set 1: Virtual SAN (default policy)

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 4

WewillreturntoVMStoragePoliciesinmoredetailinafuturechapter,butsufficetosaythatwhenaVMisdeployedwiththedefaultpolicy,itshouldhaveamirrorcopyoftheVMdatacreated.ThissecondcopyoftheVMdata isplacedonstorageonadifferenthosttoenabletheVMtotolerateanysinglefailure. Alsonotethatobjectspacereservationissetto0%,meaningtheobjectshouldbedeployedas“thin”.AfterwehavedeployedtheVM,wewillverifythatVirtualSANadherestobothofthesecapabilities.OnefinalitemtocheckbeforewedeploytheVMisthecurrentfreecapacityontheVirtualSANdatastore.ThiscanbeviewedfromtheVirtualSANCluster>Managetab>Settings>Generalview.InthisPOC,itis811.95GB.

Figure 7.3: Current free capacity of Virtual SAN datastoreMakeanoteofthefreecapacityonyourPOCbeforecontinuingwiththedeployVMexercise.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 5

TodeploytheVM,simplyfollowthestepsprovidedinthewizard.

Figure 7.4: New Virtual Machine

Figure 7.5: Create a new virtual machine

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 6

AtthispointanamefortheVMmustbeprovided,andthentheVirtualSANClustermustbeselectedasacomputeresource.

Figure 7.6: Select a name and folder

Figure 7.7: Select a compute resource

Atthispoint,thevirtualmachinedeploymentprocessisalmostidenticaltoallothervirtualmachinedeploymentsthatyouhavedoneonotherstoragetypes.Itisthenextsectionthatmightbenewtoyou.This iswhereapolicy forthevirtualmachine ischosen.From the next menu, you can either select the Virtual SAN datastore, and the“Datastore Default” policy will actually point to the “Virtual SAN Default StoragePolicy”seenearlier.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 7

Figure 7.8: Select the Virtual SAN Default Storage Policy

Oncethepolicyhasbeenchosen,datastoresaresplitintothosethatarecompliantwiththepolicy,andthosethatarenon-compliantwiththepolicy.Asseenbelow,onlythe Virtual SAN datastore can understand the policy settings in the Virtual SANDefaultStoragePolicysoitistheonlyonethatshowsupasCompatibleinthelistofdatastores.

Figure 7.9: vsanDatastore is compatible with Virtual SAN Default Storage Policy

Therestof theVMdeploymentsteps in thewizardarequitestraightforward,andsimplyentailselectingESXiversioncompatibility(leaveatdefault),aguestOS(leaveatdefault)andcustomizehardware(nochanges).Essentiallyyoucanclickthroughtheremainingwizardscreenswithoutmakinganychanges.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 8

Figure 7.10: Select the ESXi compatibility (click next)

Figure 7.11: Select the guest OS (click next)

Figure 7.12: Customize hardware (click next)

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 2 9

Thefinalstepinthewizardistoclickthe“Finish”buttontoinitiatethecreationoftheVM.

Figure 7.13: Finish VM creation

OncetheVMiscreated,selectthenewVMintheinventory,navigatetotheManagetab,andthenselect“Policies”.Thereshouldbetwoobjectsshown,“VMhome”and“Harddisk1”.Bothoftheseshouldshowacompliancestatusof“Compliant”meaningthatVirtualSANwasabletodeploytheseobjectsinaccordancetothepolicysettings.

Figure 7.14: VM is compliant with policy settings Toverifythis,navigatetotheMonitortab,andthenselect“Policies”.Onceagain,boththe“VMhome”and“Harddisk1”shouldbedisplayed.Select“Harddisk1”andfurtherdown thewindow, select the “PhysicalDiskPlacement” tab.This shoulddisplay aRAID1configurationwithtwocomponents,eachcomponentrepresentingamirroredcopyofthevirtualdisk.Itshouldalsobenotedthatdifferentcomponentsarelocatedondifferenthosts.This implies that thepolicysettingtotolerate1 failure isbeingadheredto.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 0

Figure 7.15: Physical Disk Placement displays underlying layout of objects Thewitnessitemshownaboveisusedtomaintainaquorum.Formoreinformationon the purpose ofwitnesses, and objects and components in general, refer to theVMwareVirtualSAN6.0DesignandSizingGuide.Onefinalitemisrelatedtothe“objectspacereservation”policysettingthatdefineshowmuchspaceaVMreservesontheVirtualSANdatastore.Bydefault,itissetto0%, implying that the VM’s storage objects are entirely “thin” and consume nounnecessaryspace.IfweexamineFigure7.12,weseethatwerequestedthattheVMbedeployedwith40GBofdiskspace.However,ifwelookatthefreecapacityaftertheVMhasbeendeployed(asshowninfigure7.16below),weseethatthefreecapacityisveryclosetowhatitwasbeforetheVMwasdeployed,aspreviouslycapturedinfigure7.3.

Figure 7.16: Free capacity after VM is created

OfcoursewehavenotinstalledanythingintheVMsuchasaguestOS,butitshowsthatonlyatinyportionoftheVirtualSANdatastorehassofarbeenused,verifying

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 1

that the object space reservation setting of 0% (essentially thin provisioning) isworkingcorrectly.DonotdeletethisVMaswewilluseitforotherPOCtestsgoingforward.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 2

7.2 Snapshot VM Usingthevirtualmachinecreatedpreviously,takeasnapshotofit.ThesnapshotcanbetakenwhentheVMispoweredonorpoweredoff.Theobjectivehereistoseeasuccessfulsnapshotdeltaobjectcreated,andseethatthepolicysettingsofthedeltaobjectareinheriteddirectlyfromthebasediskobject.

Figure 7.17: Take a VM snapshot

Figure 7.18: Provide a name for the snapshot and optional descriptionOncethesnapshothasbeenrequested,monitortasksandeventstoensurethatithasbeen successfully captured. Once the snapshot creation has completed, additionalactionswillbecomeavailableinthesnapshotdropdownwindow.Forexamplethere

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 3

is a new action to “Revert to Latest Snapshot” and another action to “ManageSnapshots…”.

Figure 7.19: New snapshot actions

Ifthe“ManageSnapshots…”optionischosen,thefollowingisdisplayed.Itincludesdetailsregardingallsnapshotsinthechain,theabilitytodeleteoneorallofthem,aswellastheabilitytoreverttoaparticularsnapshot.

Figure 7.20: Manage Snapshots

ThereisunfortunatelynowaytoseesnapshotdeltaobjectinformationfromtheUI,likewecandoforVMDKsandforVMhome.Instead,theRubyvSphereConsole(RVC)must be relied on. To get familiar with RVC, see VMware Ruby vSphere ConsoleCommandReferenceforVirtualSAN.Thecommandneededtodisplaysnapshotinformationis:

vsan.vm_object_info <VM>

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 4

Hereisanoutputbasedonthesnapshotcreatedpreviously: /ie-vcsa-09.ie.local/VSAN6-DC/vms> vsan.vm_object_info 1 VM VSAN6-poc-test-vm-1: Namespace directory DOM Object: 95122555-8061-3328-cf10-001f29595f9f (v2, owner: cs-ie-h01.ie.local, policy: forceProvisioning = 0, hostFailuresToTolerate = 1, spbmProfileId = aa6d5a82-1c88-45da-85d3-3d74b91a5bad, proportionalCapacity = [0, 100], spbmProfileGenerationNumber = 0, cacheReservation = 0, stripeWidth = 1) RAID_1 Component: 96122555-80ad-3c97-dadf-001f29595f9f (state: ACTIVE (5), host: cs-ie-h01.ie.local, md: 52fc637f-ecf9-2b53-ff31-9e8d75d2b43f, ssd: 528ba019-e369-151e-01b3-26b103d7de0f, votes: 1, usage: 0.3 GB) Component: 96122555-dc90-3e97-9c6f-001f29595f9f (state: ACTIVE (5), host: cs-ie-h02.ie.local, md: 52edaed1-2b04-b3af-ba3f-2b03ebaa9fce, ssd: 521963f0-33f5-eaaf-d2e1-f7a218b13be4, votes: 1, usage: 0.3 GB) Witness: 96122555-fc7b-3f97-5d9a-001f29595f9f (state: ACTIVE (5), host: cs-ie-h03.ie.local, md: 527aade4-cec7-0661-b621-6e22d69c3042, ssd: 52a4acab-f622-6025-bee3-746d436627cf, votes: 1, usage: 0.0 GB) Disk backing: [vsanDatastore] 95122555-8061-3328-cf10-001f29595f9f/VSAN6-poc-test-vm-1-000001.vmdk DOM Object: 2a2a2555-946f-292b-2e23-001f29595f9f (v2, owner: cs-ie-h01.ie.local, policy: spbmProfileGenerationNumber = 0, forceProvisioning = 0, cacheReservation = 0, hostFailuresToTolerate = 1, stripeWidth = 1, spbmProfileId = aa6d5a82-1c88-45da-85d3-3d74b91a5bad, proportionalCapacity = [0, 100], objectVersion = 2) RAID_1 Component: 2a2a2555-8ce3-a171-fb8e-001f29595f9f (state: ACTIVE (5), host: cs-ie-h01.ie.local, md: 5255fd2b-83cc-911f-452b-13b4ca74e03e, ssd: 528ba019-e369-151e-01b3-26b103d7de0f, votes: 1, usage: 0.0 GB) Component: 2a2a2555-78d0-a371-b90d-001f29595f9f (state: ACTIVE (5), host: cs-ie-h02.ie.local, md: 52edaed1-2b04-b3af-ba3f-2b03ebaa9fce, ssd: 521963f0-33f5-eaaf-d2e1-f7a218b13be4, votes: 1, usage: 0.0 GB) Witness: 2a2a2555-ce29-a571-da2b-001f29595f9f (state: ACTIVE (5), host: cs-ie-h03.ie.local, md: 527aade4-cec7-0661-b621-6e22d69c3042, ssd: 52a4acab-f622-6025-bee3-746d436627cf, votes: 1, usage: 0.0 GB) Disk backing: [vsanDatastore] 95122555-8061-3328-cf10-001f29595f9f/VSAN6-poc-test-vm-1.vmdk DOM Object: 97122555-78d5-5580-bffc-001f29595f9f (v2, owner: cs-ie-h03.ie.local, policy: forceProvisioning = 0, hostFailuresToTolerate = 1, spbmProfileId = aa6d5a82-1c88-45da-85d3-3d74b91a5bad, proportionalCapacity = 0, spbmProfileGenerationNumber = 0, cacheReservation = 0, stripeWidth = 1) RAID_1 Component: 98122555-3ec9-d1d6-01f4-001f29595f9f (state: ACTIVE (5), host: cs-ie-h02.ie.local, md: 52edaed1-2b04-b3af-ba3f-2b03ebaa9fce, ssd: 521963f0-33f5-eaaf-d2e1-f7a218b13be4, votes: 1, usage: 0.0 GB) Component: 98122555-5c6f-d3d6-55a7-001f29595f9f (state: ACTIVE (5), host: cs-ie-h03.ie.local, md: 523de844-6b48-6bda-44ad-c28df042c16e, ssd: 52a4acab-f622-6025-bee3-746d436627cf, votes: 1, usage: 0.0 GB) Witness: 98122555-2028-d4d6-6ee6-001f29595f9f (state: ACTIVE (5), host: cs-ie-h01.ie.local, md: 5255fd2b-83cc-911f-452b-13b4ca74e03e, ssd: 528ba019-e369-151e-01b3-26b103d7de0f, votes: 1, usage: 0.0 GB) /ie-vcsa-09.ie.local/VSAN6-DC/vms>

Thethreeobjectsthatarenowassociatedwiththatvirtualmachinehaveaboldfontinthisdocumentforclarity.Thereisthenamespacedirectory(VMhome),thereisthediskVSAN6-poc-test-vm-1.vmdkandthereisthesnapshotdeltaVSAN6-poc-test-vm-1-000001.vmdk.Thesnapshotdeltahasbeenhighlightedinblueabove.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 5

Ifyoulookclosely,bothofthediskbackingshavethesamepolicysettingssinceeverysnapshotinheritsitspolicysettingsfromthebasedisk.BothhaveastripeWidthof1,and hostFailuresToTolerate of 1 and an Object Space Reservation (shown asproportionalCapacityhere)of0%.ThesnapshotcannowbedeletedfromtheVM.MonitortheVM’stasksandensurethat it deletes successfully. When complete, snapshot management should looksimilartothis.

Figure 7.21: Manage Snapshots… Snapshot deleted ThiscompletesthesnapshotsectionofthisPOC.SnapshotsinaVirtualSANdatastoreareveryintuitivebecausetheyutilizevSpherenativesnapshotcapabilities.StartingwithVirtualSAN6.0,theyarestoredefficientlyusing“vsansparse”technologythatimprovestheperformanceofsnapshotscomparedtoVirtualSAN5.5.InVirtualSAN6.1,snapshotchainscanbeupto16snapshotsdeep.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 6

7.3 Clone a VM ThenextPOCtestiscloningaVM.WewillcontinuetousethesameVMasbefore.ThistimemakesuretheVMispoweredonfirst.ThereareanumberofdifferentcloningoperationsavailableinvSphere6.Theseareshownhere.

Figure 7.22: Clone operationsTheonethatweshallberunningaspartofthisPOCisthe“ClonetoVirtualMachine”.Thecloningoperationisverymucha“click,click,next”typeactivity.Thisnextscreenistheonlyonethatrequireshumaninteraction.OnesimplyprovidesthenameforthenewlyclonedVM,andafolderifdesired.

Figure 7.23: Select a name and folderWearegoingtoclonetheVMintheVirtualSANCluster,sothismustbeselectedasthecomputeresource.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 7

Figure 7.24: Select a compute resource

ThestoragewillbethesameasthesourceVM,namelythevsanDatastore.Thiswillallbepre-selectedforyouiftheVMbeingclonedalsoresidesonthevsanDatastore.

Figure 7.25: Select storage

Figure 7.26: Select options (leave unchecked - default)

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 8

Thiswilltakeyoutothe“ReadytoComplete”screen.Ifeverythingisasexpected,clickFinishtocommencethecloneoperation.MonitortheVMtasksforstatusofthecloneoperation.

Figure 7.27: Ready to Complete DonotdeletethenewlyclonedVM.WewillbeusingitinsubsequentPOCtests.ThiscompletesthecloningsectionofthisPOC.CloningwithVirtualSANhasimproveddramaticallywiththenewon-disk(v2)formatinversion6.0and6.1.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 3 9

7.4 vMotion a VM between Hosts Thefirststepistopower-onthenewlyclonedvirtualmachine.WeshallmigratethisVMfromoneVirtualSANhosttoanotherVirtualSANhostusingvMotion.Note: Take a moment to revisit the network configuration and ensure that thevMotionnetworkisdistinctfromtheVirtualSANnetwork.Ifthesefeaturessharethesamenetwork,performancewillnotbeoptimal.First,determinewhichESXihosttheVMcurrentlyresideson.Selectingthe“Summary”taboftheVMdoesthis.OnthisPOC,theVMthatwewishtomigrateisonhostcs-ie-h01.ie.local.

Figure 7.28: VM Summary tab – Host is displayed RightclickontheVMandselectMigrate.

Figure 7.29: MigrateMigrate allows you tomigrate to a different compute resource (host), a differentdatastoreorbothatthesametime.Inthisinitialtest,wearesimplymigratingtheVMtoanotherhostinthecluster,sothisinitialscreencanbeleftatthedefaultof“Changecomputeresourceonly”.Therestofthescreensinthemigrationwizardareprettyself-explanatory.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 0

Figure 7.30: Change compute resources only

Figure 7.31: Select a destination host

Figure 7.32: Select a destination network

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 1

Figure 7.33: Priority can be left as high (default)

Atthe“ReadytoComplete”window,clickonFinishtoinitiatethemigration.Ifthemigrationissuccessful,thesummarytabofthevirtualmachineshouldshowthattheVMnowresidesonadifferenthost.

Figure 7.34: Verify VM has migrated to new host

DonotdeletethemigratedVM.WewillbeusingitinsubsequentPOCtests.Thiscompletesthe“VMmigrationusingvMotion”sectionofthisPOC.Asyoucansee,vMotionworksjustgreatwithVirtualSAN.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 2

7.5 Optional: Storage vMotion a VM between Datastores Thistestwillonlybepossibleifyouhaveanotherdatastoretypeavailabletoyourhosts,suchasNFS/VMFS.Ifso,thentheobjectiveofthistestistomigratetheVMfromanotherdatastoretypeintoVirtualSAN.TheVMFSdatastorecanevenbealocalVMFSdiskonthehost.

7.5.1 Mount an NFS Datastore to the Hosts ThestepstomountanNFSdatastoretomultipleESXihostsaredescribedinthevSphere6.0AdministratorsGuide.SeetheCreateNFSDatastoreinthevSphereClienttopicfordetailedsteps.

7.5.2 Storage vMotion a VM from Virtual SAN to Another Datastore Type CurrentlytheVMresidesontheVirtualSANdatastore.Launchthemigratewizard,justlikewedidinthelastexercise.However,onthisoccasion,tomovetheVMfromthe Virtual SAN datastore to the other datastore type you need to select “Changestorageonly”.

Figure 7.35: Change storage onlyInthisPOC,wehaveanNFSdatastorepresentedtoeachoftheESXihostsintheVirtualSANCluster.Thisisthedatastorewherewearegoingtomigratethevirtualmachineto.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 3

Figure 7.36: Select destination storage OneotheritemofinterestinthisstepisthattheVMStoragePolicyshouldalsobechangedto“DatastoreDefault”astheNFSdatastorewillnotunderstandtheVirtualSANpolicysettings.Atthe“Readytocomplete”screen,click“Finish”toinitiatethemigration:

Figure 7.37: Ready to complete

Oncethemigrationcompletes,theVMSummarytabcanbeusedtoexaminethedatastoreonwhichtheVMresides.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 4

Figure 7.38: Verify VM has moved to new storage

7.5.3 Storage vMotion of VM to Virtual SAN from Another Datastore Type NowStoragevMotionthevirtualmachinebacktotheVirtualSANdatastoretoprovethatStoragevMotionworksinbothdirections.Thisnowcompletestheoptional“VMmigrationusingStoragevMotion”sectionofthisPOC.Differentstoragepoliciescanbechosenaspartofthemigration.StoragevMotionworksseamlesslywithVirtualSAN.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 5

8. Scale out Virtual SAN Oneofthereallynicefeaturesisthesimplisticscale-outnatureofVirtualSAN.Ifyouneedmorecomputeorstorageresourcesinthecluster,simplyaddanotherhosttothecluster.Let’s remindourselvesabouthowourclustercurrently looks.Therearecurrentlythreehostsinthecluster,andthereisafourthhostnotinthecluster.WealsocreatedtwoVMsinthepreviousexercises.

Figure 8.1: Current inventory statusLetusalsoremindourselvesofhowbigtheVirtualSANdatastoreis.

Figure 8.2: Total and Free Virtual SAN datastore capacity

InthisPOC,theVirtualSANdatastoreis811.95GBinsizewith811.21GBfree.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 6

8.1 Add the Fourth Host to Virtual SAN Cluster WewillnowproceedwithaddingafourthhosttotheVirtualSANCluster.Note:Backinsection5ofthisPOCguide,youshouldhavealreadysetupaVirtualSANnetworkforthishost.Ifyouhavenotdonethat,revisitsection5,andsetuptheVirtualSANnetworkonthisfourthhost.Havingverifiedthatthenetworkingisconfiguredcorrectlyonthefourthhost,selecttheclusterobjectintheinventory,rightclickonitandselecttheoption“MoveHostsintoCluster…”asshownbelow.

Figure 8.3: Move hosts into ClusterYouwillthenbepromptedtoselectwhichhosttomoveintothecluster.InthisPOC,thereisonlyoneadditionalhost.Selectthathost.

Figure 8.4: Select a host to move into the cluster

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 7

Thenextscreenisrelatedtoresourcepools.Youcanleavethisatthedefault,whichistousethecluster’srootresourcepool,thenclickOK.

Figure 8.5: Resource Pools

Thismovesthehostintothecluster.Next,navigatetotheManagetab>Settings>VirtualSAN>Generalviewandverifythattheclusternowcontainsthenewnode.

Figure 8.6: Resource PoolsAsyoucanclearlysee,therearenow4hostsinthecluster.However,youwillalsonoticethattheVirtualSANdatastorehasnotchangedwithregardstototalandfreecapacity.Thisisbecausetheclusterwasconfiguredin“Manual”modebackinsection6.Therefore,VirtualSANwillnotclaimanyofthedisksautomatically.Youwillneedto createadiskgroup for thenewhostandclaimdisksmanually.At thispoint, it

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 8

wouldbegoodpracticetore-runthehealthchecktests.Ifthereareanyissueswiththefourthhostjoiningthecluster,usetheVirtualSANHealthchecktocheckwheretheissuelies.Verifythatthehostappearsinthesamenetworkpartitiongroupastheotherhostsinthecluster.

8.2 Manual Option: Create Disk Group on New Host This process has already been covered in section 6.2. Navigate to the DiskManagementsection,selectthenewhostandthenclickontheicontocreateanewdiskgroup:

Figure 8.7: Create a new disk group

Asbefore,weselectaflashdeviceandtwomagneticdisks.Thisissothatallhostsintheclustermaintainauniformconfiguration.

Figure 8.8: Select flash and capacity devices

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 4 9

8.3 Verify Virtual SAN Disk Group Configuration on New Host Oncethediskgrouphasbeencreated,thediskmanagementviewshouldberevisitedtoensurethatitishealthy.

Figure 8.9: Check disk group health

8.4 Verify New Virtual SAN Datastore Capacity ThefinalstepistoensurethattheVirtualSANdatastorehasnowgrowninaccordancetothecapacitydevicesinthediskgroupthatwasjustaddedonthefourthhost.ReturntotheGeneraltabandexaminethetotalandfreecapacityfield.

Figure 8.10: Virtual SAN Datastore capacity details

Aswecanclearlysee,theVirtualSANdatastorehasnowgrowninsizeto1.06TB.Freespaceisshownas1.06TBastheamountofspaceusedisminimal.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 0

Thiscompletesthe“ScaleOut”sectionofthisPOC.Asseen,scale-outonVirtualSANissimplebutverypowerful.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 1

9. VM Storage Policies and Virtual SAN VM Storage Policies form the basis of VMware’s Software Defined Storage vision.Rather than deployingVMs directly to a datastore, a VM Storage Policy is chosenduringinitialdeployment.Thepolicycontainscharacteristicsandcapabilitiesofthestoragerequiredby thevirtualmachine.Basedon thepolicycontents, thecorrectunderlyingstorageischosenfortheVM.IftheunderlyingstoragemeetstheVMstoragePolicyrequirements,theVMissaidtobeinacompatiblestate.IftheunderlyingstoragefailstomeettheVMstoragePolicyrequirements,theVMissaidtobeinanincompatiblestate.InthissectionofthePOCGuide,weshalllookatvariousaspectsofVMStoragePolicies.Thevirtualmachinesthathavebeendeployedthusfarhaveusedthedefaultstoragepolicy,whichhasthefollowingsettings:

• NumberOfFailuresToTolerate=1• NumberOfDiskObjectsToStripe=1• ObjectSpaceReservation=0%• FlashReadCacheReservation=0%• ForceProvisioning=False

WewillcreatesomeadditionalpoliciesinthissectionofthePOC.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 2

9.1 Create a New VM Storage Policy InthispartofthePOC,wewillbuildapolicythatcreatesastripewidthof2foreachstorageobjectdeployedwiththispolicy.TheVMStoragePoliciescanbeaccessedfromtheHomepageonthevSpherewebclientasshownbelow.

Figure 9.1: VM Storage Policies

Therewillbesomeexistingpoliciesalreadyinplace,suchastheVirtualSANDefaultStoragepolicy,whichwe’vealreadyusedtodeployVMsinsection7ofthisPOCguide.There is another policy called “VVol No Requirements Policy”, which is used forVirtualVolumesandisnotapplicabletoVirtualSAN.Thereareanumberoficonsonthispagethatmayneedfurtherexplanation:

CreateanewVMStoragePolicy

EditanexistingVMStoragePolicy

DeleteanexistingVMStoragePolicy

CheckthecomplianceofVMsusingthisVMStoragePolicy

CloneanexistingVMStoragePolicy

Table 9.1: VM Storage Policy icons

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 3

Tocreateanewpolicy,clickonthe“CreateanewVMStoragePolicy”icon.

Figure 9.2: Create a new VM Storage Policy

ThenextstepistoprovideanameandanoptionaldescriptionforthenewVMStoragePolicy.Sincethispolicywillcontainastripewidthof2,wehavegivenitanametoreflectthis.YoumayalsogiveitanametoreflectthatitisaVirtualSANpolicy.

Figure 9.3: VM Storage Policy Name and Description

ThenextsectioncontainsadescriptionofRule-Setsandhowtousethem.

Figure 9.4: Rule-Sets NowwegettothepointwherewecreateasetofrulesforourRule-Set(weareonlycreatingasingleRule-SetinthisVMStoragePolicy).Thefirststepistoselect“Virtual

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 4

SAN”asthe“Rulesbasedondataservices”.Oncethisisselected,thefivecustomizablecapabilitiesassociatedwithVirtualSANareexposed.SincethisVMStoragePolicyisgoingtohavearequirementwherethestripewidthofanobjectissettotwo,thisiswhatweselectfromthelistofrules.Itisofficiallycalled“Numberofdiskstripesperobject”.

Figure 9.5: Number of disk stripes per object

Wealsowanttosetthisvalueto2.Oncethediskstriperuleischosen,changethedefaultvaluefrom1to2asshownbelow.NoticealsotheStorageConsumptionModeldisplayontherighthandside,detailinghowmuchdiskspacewillbeconsumedbasedontherulesplacedinthepolicy.

Figure 9.6: Setting Stripe Width to 2

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 5

ClickingnextmovesontotheStorageCompatibilityscreen.Notethatthisdisplayswhichstorage“understands”thepolicysettings.Inthiscase,thevsanDatastoreistheonlydatastorethatiscompatiblewiththepolicysettings.Note:ThisdoesnotmeanthattheVirtualSANdatastorecansuccessfullydeployaVMwiththispolicy;itsimplymeansthattheVirtualSANdatastoreunderstandstherulesorrequirementsinthepolicy.

Figure 9.7: Storage Compatibility Atthispoint,youcanclickonnexttoreviewthesettingsoncemore,oralternatively,atthispoint,youcanclick“Finish”insteadofreviewingthepolicy.Onclicking“Finish”,thepolicyiscreated.Let’snowgoaheadanddeployaVMwiththisnewpolicy,andlet’sseewhateffectithasonthelayoutoftheunderlyingstorageobjects.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 6

9.2 Deploy a New VM with the New VM Storage Policy WehavealreadydeployedaVMbackin7.1.Thestepswillbeidentical,untilwegettothepointwheretheVMStoragePolicyischosen.Thistime,insteadofselectingthedefaultpolicy,wewillselectthenewlycreatedStripeWidth=2policyasshownbelow.

Figure 9.8: Selecting a non-default policy

Andasbefore,thevsanDatastoreshouldshowupasthecompatibledatastore,andthus the one towhich this VM should be provisioned if wewish to have the VMcompliantwithitspolicy.

Figure 9.9: vsanDatastore is compatible with the policy

Let’snowgoaheadandexamine the layoutof this virtualmachine, and see if thepolicyrequirementsaremet;i.e.dothestorageobjectsofthisVMhaveastripewidth

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 7

of2?First,ensure that theVM iscompliantwith thepolicybynavigating toVM>Managetab>Policies,asshownhere.

Figure 9.10: VM is compliant with the policy

ThenextstepistoselecttheMonitortab>PoliciesandcheckthelayoutoftheVM’sstorageobjects.Thefirstobjecttocheck is theVMhomenamespace.Select it,andthenselectthe“PhysicalDiskPlacement”tabatthelowerpartofthewindow.Thiscontinuestoshowthatthereisonlyonemirroredcomponent,butnostripewidth(whichisdisplayedasaRAID0configuration).Why?ThereasonforthisisthattheVMhomenamespaceobjectdoesnotbenefit fromstripingsoit ignoresthispolicysetting.Thereforethisbehaviorisnormalandtobeexpected.

Figure 9.11: VM home namespace ignores stripe width policy setting

Nowlet’sexamine“Harddisk1”andseeifthatlayoutisadheringtothepolicy.Herewecanclearlyseeadifference.EachreplicaormirrorcopyofthedatanowcontainstwocomponentsinaRAID0configuration.ThisimpliesthattheharddiskstorageobjectsareindeedadheringtothestripewidthrequirementthatwasplacedintheVMStoragePolicy.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 8

Figure 9.12: Hard disks adhere to stripe width policy setting Notethateachstripedcomponentmustbeplacedonitsownphysicaldisk.ThereareenoughphysicaldiskstomeetthisrequirementinthisPOC.However,arequestforalargerstripewidthwouldnotbepossibleinthisconfiguration.KeepthisinmindifyouplanaPOCwithalargestripewidthvalueinthepolicy.Itshouldalsobenotedthatsnapshotstakenofthisbasediskcontinuetoinheritthepolicyofthebasedisk,implyingthatthesnapshotdeltaobjectswillalsobestriped.One final item to note is the fact that this VM automatically has aNumberOfFailuresToTolerate=1, even though itwasnot explicitly requested in thepolicy.WecantellthisfromtheRAID1configurationinthelayout.VirtualSANwillalwaysprovideavailabilitytoVMsviatheNumberOfFailuresToToleratepolicysetting,evenwhenitisnotrequestedviathepolicy.TheonlywaytodeployaVMwithoutareplicacopyisbyplacingNumberOfFailuresToTolerate=0inthepolicy.AusefulruleofthumbforNumberOfFailuresToTolerateisthatinordertotoleratenfailuresinacluster,yourequireaminimumof2n+1hostsinthecluster(toretaina>50%quorumwithnhostfailures).

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 5 9

9.3 Add a New VM Storage Policy to an Existing VM VirtualMachinesmayalsohavenewVMStoragePoliciesaddedaftertheyhavebeendeployed to the Virtual SAN datastore. The configuration of the objects will bechangedwhenthenewpolicyisadded.Thatmaymeantheaddingofnewcomponentstoexistingobjects,forexampleinthecasewheretheNumberOfFailuresToTolerateisincreased. Itmay also involve the creation of new objects that are synced to theoriginal object, and once synchronized, the original object is discarded. This istypically only seenwhen the layout of the object changes, such as increasing theNumberOfDiskStripesPerObject.Inthiscase,wewilladdthenewStripeWidth=2policytooneoftheVMscreatedinsection 7 which still only has the default policy (NumberOfFailuresToTolerate=1,NumberOfDiskStripesPerObject=1,ObjectSpaceReservation=0)associatedwithit.To begin, select theVM that is going to have its policy changed from the vCenterinventory,thenselecttheManagetab>Policiesview.ThisVMshouldcurrentlybecompliantwith the Virtual SANDefault Storage Policy. Now click on the Edit VMStoragePoliciesbuttonashighlightedbelow.

Figure 9.13: Edit VM Storage Policies

Thistakesyoutotheeditscreen,wherethepolicycanbechanged.

Figure 9.14: Manage VM Storage Policies

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 0

SelectthenewVMStoragePolicyfromthedrop-downlist.ThepolicythatwewishtoaddtothisVMistheStripeWidth=2policy.

Figure 9.15: Select a new VM Storage Policies Oncethepolicyisselect,clickonthe“Applytoall”buttonasshownbelowtoensurethepolicygetsappliedtoallstorageobjectsandnotjusttheVMhomenamespaceobject.TheVMStoragePolicyshouldnowappearupdatedforallobjects.

Figure 9.16: Apply to all Next,clickOKandinitiatethepolicychange.NowwhenyourevisittheMonitortab>Policiesview,youshouldseethechangesintheprocessoftakingeffect(Reconfiguring)orcompleted,asshownbelow.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 1

Figure 9.17: Reconfiguring complete – new policy in effect

ThisisusefulwhenyouonlyneedtomodifythepolicyofoneortwoVMs,butwhatifyouneedtochangetheVMStoragePolicyofasignificantnumberofVMs.ThatcanbeachievedbysimplychangingthepolicyusedbythoseVMs.AllVMsusingthoseVMscanthenbe“broughttocompliance”byreconfiguringtheirstorageobjectlayouttomakethemcompliantwiththepolicy.Weshalllookatthisnext.

9.4 Modify a VM Storage Policy We will modify the StripeWidth=2 policy created earlier to include anObjectSpaceReservation=10%.Thismeansthateachstorageobjectwillnowreserve10%oftheVMDKsizeontheVirtualSANdatastore.SinceallVMsweredeployedwith40GBVMDKs,thereservationvaluewillbe4GB.The first step in this task is to note the amount of free space in the Virtual SANdatastore,soyoucancompareitlaterandconfirmthateachVMDKhas4GBofspacereserved.Next,revisittheVMStoragePolicysectionthatwevisitedpreviously.ThiscanbeaccessedonceagainviatheHomepage.

Figure 9.18: VM Storage Policies: Stripewidth

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 2

SelectStripeWidth=2policyinthelefthandcolumn,andthentheManagetab.Select“Rule-set1:VirtualSAN”andthenclickon“Edit”buttononthefarright.

Figure 9.19: Edit PolicyFromthe<Addrule>dropdownlist,selectObjectSpaceReservationasanewcapabilitytobeaddedtothepolicy.

Figure 9.20: Add Object space reservation (%) as a rule to the policySetObjectSpaceReservationto10%.NoteStorageConsumptioncalculationsonright.

Figure 9.21: Set Object space reservation to 10%

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 3

AfterclickingOKtomakethechange.Thewizardwillpromptyouastowhetheryouwanttoreapplythischangetothevirtualmachinesusingthispolicymanuallylater(default)orautomaticallynow.ItalsotellsyouhowmanyVMsintheenvironmentareusingthepolicyandwillbeaffectedbythechange.Leaveitatthedefault,whichis “Manually later”, by clicking Yes. This POC guidewill show you how to do thismanuallyshortly.

Figure 9.22: Manually later Next,clickontheMonitortabnexttotheManagetab.ItwilldisplaythetwoVMsalongwith theirstorageobjects,and the fact that theyareno longercompliantwith thepolicy. They are in an “Out ofDate” compliance state as the policy has nowbeenchanged.

Figure 9.23: Out of Date In order to bring theVM to a compliant state,wemustmanually reapply theVMStorage Policy to the objects. The button to do this action is highlighted in thepreviousscreenshot.Whenthisbuttonisclicked,thefollowingpopupappears.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 4

Figure 9.24: Reapply VM Storage Policy

When the reconfigure activity completes against the storage objects, and thecompliancestateisonceagainchecked,everythingshouldshowasCompliant.

Figure 9.25: Compliant once again

SincewehavenowincludedanObjectSpaceReservationvalueinthepolicy,whatyoumaynoticeisthattheamountoffreecapacityontheVirtualSANdatastorewillhavereduced.Forexample, the twoVMswith thenewpolicychangehave40GBstorageobjects.Therefore,thereisa10%ObjectSpaceReservationimplying4GBisreservedperVMDK.4GBperVMDK,1VMDKperVM,2VMsequals8GBreservedspace,right?However,theVMDKisalsomirrored,sothereisatotalof16GBreservedontheVirtualSANdatastore.CheckingtheVirtualSANdatastore,wecanseethisreflectedinthefreecapacity.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 5

Figure 9.26: ObjectSpace Reservation consuming capacity This completes the “VM Storage Policies” section of this POC. You should nowappreciate how powerful VM Storage Policies are, and how characteristics of theunderlyingstoragecanbeassignedtovirtualmachinesonagranularperVMDKbasiswhileusingasingleVirtualSANdatastore.

9.5 IOPS Limits VirtualSAN6.2addsaqualityofservicefeaturethatlimitsthenumberIOPSanobjectmayconsume.IOPSlimitsareenabledandappliedviaapolicysetting.Thesettingcanbeusedtoensurethataparticularvirtualmachinedoesconsumemorethanitsfairshareofresourcesornegativelyimpactperformanceoftheclusterasawhole.TheexamplebelowshowsaddinganIOPSlimittoanewpolicy.TocreateanewpolicywithanIOPSlimitcompletethefollowingsteps:CreateanewStoragePolicyIntheAddRuledropdownboxselectIOPSlimitforobject.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 6

Figure 9.27: IOPS Limits Rule 1

EnteravalueintheIOPSlimitforobjectbox.Inthisexamplewewilluseavalueof1000.

Figure 9.28: IOPS Limit Setting 1

ApplyingtheruleabovetoanobjectwillresultinanIOPSlimitof1000beingset.ItisimportanttonotethatnotonlyisreadandwriteI/OcountedinthelimitbutanyI/Oincurredbyasnapshotisalsoiscountedaswell.IfI/OagainstthisVMorVMDKshouldriseabove1000,theadditionalI/Owillbethrottled.

9.6 Checksum VirtualSAN6.2includesEnd-to-EndSoftwarechecksumtohelpavoiddataintegrityissuesthatmayariseintheunderlyingdisks.Bydefault,checksumisenabledinversion6.2butcanbeexplicitlydisabledviaastoragepolicysetting.Softwarechecksumcanbedisabledbycompletingthefollowingprocedure.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 7

1. FromthevShereWebClient,clickPoliciesandProfilesandSelectVMStoragePolicies.

2. SelectaVirtualMachinePolicy.3. RightclickandSelect“EditSettings”4. FromtheRule-Set1screen,select“Addrule”5. Select“Disableobjectchecksum”fromthedropdownlist.

Figure 9.29: Disable Checksum 1

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 8

10. Virtual SAN Monitoring WhenitcomestomonitoringVirtualSAN,thereareanumberofareasthatneedparticularattention.Innoparticularorder,theseareconsiderationswhenitcomestomonitoringVirtualSAN:

• MonitortheVirtualSANCluster• MonitorVirtualDevicesintheVirtualSANCluster• MonitorPhysicalDevicesinVirtualSANDatastores• MonitorResynchronization&RebalanceOperationsintheVirtualSAN

Cluster• ExamineDefaultVirtualSANAlarms• TriggeringAlarmsbasedonVirtualSANVMkernelObservationsAlarms

10.1 Monitor the Virtual SAN Cluster Thefirstitemtomonitoristheoverallhealthofthecluster.TheManage>Generalviewgivesyouagoodideaastowhetheralltheflashandcapacitydevicesthatyouexpect tobe inuseare in fact inuse. It alsoshowswhether thenetworkstatus isnormalornot.Finally,itisagoodindicatorastowhetherornottheexpectedcapacityoftheVirtualSANdatastoreiscorrect,andifthereareanycapacityconcernslooming.

Figure 10.1: General view VirtualSANHealthcheckswilldisplayevenmoreinformationregardinghealthandshouldbeenabledaspartofanyVirtualSAN6.1POC.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 6 9

10.2 Monitor Virtual Devices in the Virtual SAN Cluster Tomonitorthevirtualdevices,navigatetoMonitor>VirtualSAN>VirtualDisks.Thiswill list the objects associated with each virtual machine, such as the VM homenamespace and the hard disks. One can also see the policy, compliance state andhealthofanobject.Ifoneselectsanobject,physicaldiskplacementandcompliancefailuresaredisplayedinthelowerhalfofthescreen.

Figure 10.2: Virtual Disks view All objects should be compliant and healthy. All components in the physical diskplacementviewshouldappearas“Active”.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 0

10.3 Monitor Physical Devices in the Virtual SAN Cluster Inthesamemonitor>VirtualSANview,physicaldiskscanalsobedisplayed.Wherethisviewisveryusefuliswhenyouwishtoseewhichobjectsresideonaparticularphysicaldisk.Intheviewbelow,oneofthemagneticdisksisselectedandinthelowerhalfofthescreen,theobjectsthathavecomponentsresidingonthatphysicaldiskaredisplayed.

Figure 10.3: Physical Disks view

10.4 Monitor Resynchronization and Rebalance Operations AnotherveryusefulviewinthisMonitor>VirtualSANtabis“Resyncingcomponents”.Thiswilldisplayanyrebuildingorrebalancingoperationsthatmightbetakingplaceonthecluster.Forexample,iftherewasadevicefailure,resyncingorrebuildingactivitycouldbeobservedhere.Similarly,ifadevicewasremovedorahostfailed,andtheCLOMd(ClusterLogicalObjectManagerdaemon)timerexpired(60minutesbydefault),rebuildingactivitywouldalsobeobservedinthiscase.Withregardstorebalancing,VirtualSANattemptstokeepallphysicaldisksatlessthan80%capacity.Ifanyphysicaldisks’capacitypassesthisthreshold,VirtualSANwillmovecomponentsfromthisdisktootherdisksintheclusterinordertorebalancethephysicalstorage.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 1

Bydefault,thereshouldbenoresyncingactivitytakingplaceontheVirtualSANCluster,asshownbelow.Resyncingactivityusuallyindicates:(a) afailureofadeviceorhostinthecluster(b) adevicehasbeenremovedfromthecluster(c) aphysicaldiskhasgreaterthan80%ofitscapacityconsumed(d) apolicychangehasbeenimplementedwhichnecessitatesarebuildingofaVM’s

objectlayout.Inthiscase,thenewobjectlayoutiscreated,synchronizedtotheoriginalobject,andthentheoriginalobjectisdiscarded.

Figure 10.4: Resyncing components

10.5 Default Virtual SAN Alarms Thereareatleast56VirtualSANalarmspre-definedinvCenterserver6.0u1.Someareshownhere,andthemajorityrelatetoVirtualSANHealthissues:

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 2

Figure 10.5: Alarm definitions

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 3

10.7 Monitor Virtual SAN with VSAN Observer TheVMwareVSANObserverisaperformancemonitoringandtroubleshootingtoolforVirtualSAN.ThetoolislaunchedfromtheRubyvSphereConsole(RVC)andcanbeutilizedformonitoringperformancestatisticsforVirtualSANlivemodeoroffline.Whenrunninginlivemode,awebbrowsercanbepointedatvCenterServertoseelivegraphsrelatedtotheperformanceofVirtualSAN.TheutilitycanbeusedtounderstandVirtualSANperformancecharacteristics.Theutility is intended to provide deeper insights of Virtual SAN performancecharacteristicsandanalytics.VSANObserver’suserinterfacedisplaysperformanceinformationofthefollowingitems:

• Host level performance statistics (client stats) • Statistics of the physical disk layer • Deep dive physical disks group details • CPU Usage Statistics • Consumption of Virtual SAN memory pools • Physical and In-memory object distribution across Virtual SAN Clusters

TheVSANObserverUI dependson some JavaScript andCSS libraries (JQuery, d3,angular,bootstrap,font-awesome)inordertosuccessfullydisplaytheperformancestatisticsandotherinformation.TheselibraryfilesareaccessedandloadedovertheInternet at runtimewhen theVSAN Observer page is rendered.The tool requiresaccesstothelibrariesmentionedaboveinordertoworkcorrectly.Thismeansthatthe vCenter Server requiresaccess to the Internet. However with a little workbeforehand,VSANObservercanbeconfiguredtoworkinanenvironmentthatdoesnothaveInternetaccess.FurtherdiscussiononVSANObserverisoutsidethescopeofthisPOCGuide.ForthoseinterestedinlearningmoreaboutVirtualSANObserver,refertotheVMwareVirtualSAN Diagnostics and Troubleshooting ReferenceManual andMonitoring VMwareVirtualSANwithVSANObserver.

10.8 Performance Monitoring Service ThePerformancemonitoringserviceisnewtoVirtualSAN6.2.Theperformanceservicecanbeusedtoforverificationofperformanceaswellasquicktroubleshootingofperformancerelatedissues.Performancechartsareavailableformanydifferentlevels.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 4

• Cluster• Hosts• VirtualMachinesandVirtualDisks• Diskgroups• Physicaldisks

Adetailedlistofperformancegraphsanddescriptionscanbefoundhere.Theperformancemonitoringservicemustbeenabledpriortoviewingstatistics.Toenabletheperformancemonitoringservice,completethefollowingSteps:NavigatetotheVirtualSANCluster.

1. ClicktheManagetab.2. SelectHealthandPerformancefromtheVirtualSANSectionandclickEdit

toedittheperformancesettings.

3. SelecttheTurnOnVirtualSANperformanceservicecheckbox.4. SelectthedefaultstoragepolicyfortheStatsdatabaseobjectandclickOK.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 5

OncetheservicehasbeenenabledperformancestatisticscanbeviewfromtheperformancemenusinvCenter.Thefollowingexampleismeanttoprovideanoverviewofusingtheperformanceservice.ForpurposesofthisexercisewewillexamineIOPS,throughputandlatencyfromtheVirtualMachinelevelandtheVirtualSANBackendlevel.Theclusterlevershowsperformanceshowsmetricsfromaclusterlevel.Thisincludesallvirtualmachines.Let’stakealookatIOPSfromaclusterlevel.Toaccessclusterlevelperformancegraphs:

1. FromtheClusterlevelinvCenter,selectthePerformanceoptionfromtheMonitortab.

2. SelectVirtualSAN–VirtualMachineConsumptionfromthemenuonthe

left.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 6

10.8: Virtual Machine Consumption 1

FortheportionoftheexamplewewillstepdownalevelandtheperformancestatisticsfortheVirtualSANBackend.Thisshowsthe

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 7

Figure 10.9: VSAN Backend Performance 1

InpreviousversionsofVirtualSAN,theonlywaytolookatperformancemetricsinthisdetailwasthroughVSANobserver.Theperformanceserviceallowsadministratorstoviewnotonlyrealtimedatabuthistoricaldataaswell.Bydefault,theperformanceservicelooksatthelastonehourofdata.Thistimewindowcanbeincreasedorachangedbyspecifyingacustomrange.

11. Performance Testing Performancetestingisanimportantpartofevaluatinganystoragesolution.Settingup a desirable test environment could be challenging, and customers may do itdifferently.Customersmayalsoselect fromavarietyof tools torunworkloads,orchoose to collect data and logs in different ways. These all add complexity totroubleshootperformanceissuesclaimedbycustomers,andlengthentheevaluationprocess.VirtualSANPerformancewilldependonwhatdevicesareinthehosts(SSD,magneticdisks),onthepolicyofthevirtualmachine(howwidelythedataisspreadacrossthedevices),thesizeoftheworkingset,thetypeofworkload,andsoon.Amajorfactorforvirtualmachineperformanceisthevirtualhardware:howmanyvirtual SCSI controllers, VMDKs, outstanding I/O and how many vCPUs can be

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 8

pushingI/O.UseanumberofVMs,virtualSCSIcontrollersandVMDKsformaximumperformance.Virtual SAN’s distributed architecture dictates that reasonable performance isachievedwhen the pooled compute and storage resources in the cluster arewellutilized.ThisusuallymeansanumberofVMseachrunningthespecifiedworkloadshould be distributed in the cluster and run in a consistent manner to deliveraggregatedperformance.VirtualSANalsodependsonVSANObserver fordetailedperformancemonitoringandanalysis,whichasaseparatetooliseasytobecomeanafterthoughtofthetesting.

11.1 Use VSAN Observer Virtual SAN shipswith aperformance-monitoring tool calledVSANObserver. It isaccessedviaRVC–theRubyvSphereConsole.Ifyou’replanningondoinganysortofperformancetesting,planonusingVSANObservertoobservewhat’shappening.ReferenceVMwareKnowledgebaseArticle2064240 forgetting startedwithVSANObserver–http://kb.vmware.com/kb/2064240.SeedetailedinformationinMonitoringVMwareVirtualSANwithVSANObserver.

11.2 Performance Considerations ThereareanumberofconsiderationsyoushouldtakeintoaccountwhenrunningperformancetestsonVirtualSAN.

11.2.1 Single vs. Multiple Workers VirtualSANisdesignedtosupportgoodperformancewhenmanyVMsaredistributedandrunningsimultaneouslyacrossthehostsinthecluster.Runningasinglestoragetest in a single VMwon’t reflect on the aggregate performance of a Virtual SAN-enabled cluster. Regardless of what tool you are using – IOmeter, VDbench orsomething else – plan on usingmultiple “workers” or I/O processors tomultiplevirtualdiskstogetrepresentativeresults.

11.2.2 Working Set Forthebestperformance,avirtualmachine’sworkingsetshouldbemostlyincache.CarewillhavetobetakenwhensizingyourVirtualSANflashtoaccountforallofyourvirtualmachines’workingsetsresidingincache.Ageneralruleofthumbistosizecache as 10% of your consumed virtual machine storage (not including replicaobjects).Whilethisisadequateformostworkloads,understandingyourworkload’sworkingsetbeforesizingisausefulexercise.ConsiderusingVMwareInfrastructurePlanner(VIP)tooltohelpwiththistask–http://vip.vmware.com.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 7 9

11.2.3 Sequential Workloads versus Random Workloads Sustainedsequentialwriteworkloads(suchasVMcloningoperations)runonVirtualSANwillsimplyfillthecacheandfuturewriteswillneedtowaitforthecachetobedestagedtothespinningmagneticdisklayerbeforemoreI/Oscanbewrittentocache,soperformancewillbeareflectionofthespinningdisk(s)andnotofflash.Thesameistrueforsustainedsequentialreadworkflows.Iftheblockisnotincache,itwillhavetobe fetched fromspinningdisk.Mixedworkloadswill benefitmore fromVirtualSAN’scachingdesign.

11.2.4 Outstanding IOs MosttestingtoolshaveasettingforOutstandingIOs,orOIOforshort.Itshouldn’tbeset to1,norshould itbeset tomatchadevicequeuedepth.Considera settingofbetween2and8,dependingonthenumberofvirtualmachinesandVMDKsthatyouplantorun.ForasmallnumberofVMsandVMDKs,use8.ForalargenumberofVMsandVMDKs,considersettingitlower.

11.2.5 Block Size Theblocksizethatyouchooseisreallydependentontheapplication/workloadthatyou plan to run in your VM.While the block size for aWindowsGuest OS variesbetween512bytesand1MB,themostcommonblocksizeis4KB.Butifyouplantorun SQL Server, or MS Exchange workloads, you may want to pick block sizesappropriate to those applications (they may vary from application version toapplicationversion).Sinceitisunlikelythatallofyourworkloadswillusethesameblocksize,consideranumberofperformancetestswithdiffering,butcommonlyused,blocksizes.

11.2.6 Cache Warm up Considerations Flash as cache helps performance in two important ways. First, frequently readblocksendupincache,dramaticallyimprovingperformance.Second,allwritesarecommitted to cache first, before being efficiently destaged to disks – again,dramaticallyimprovingperformance.However,datastillhastomovebackandforthbetweendisksandcache.Mostreal-worldapplicationworkloadstakeawhileforcacheto“warmup”beforeachievingsteady-stateperformance.

11.2.7 Number of Magnetic Disk Drives in Hybrid Configurations Inthegettingstartedsection,wediscusshowdiskgroupswithmultipledisksperformbetterthandiskgroupswithfewer,astherearemorediskspindlestodestagetoaswell asmore spindles to handle read cachemisses. Let’s look at amoredetailedexamplearoundthis.ConsideraVirtualSANenvironmentwhereyouwishtocloneanumberofVMstotheVirtualSANdatastore.ThisisaverysequentialI/Ointensiveoperation.WemaybeabletowriteintotheSSDwritebufferatapproximately200-300MBpersecond.A

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 0

single magnetic disk can maybe do 100MB per second. So assuming no readoperationsaretakingplaceatthesametime,wewouldneed2-3magneticdiskstomatchtheSSDspeedfordestagingpurposes.Nowconsiderthattheremightalsobesomeoperationsgoingoninparallel.Let’ssaythatwehaveanotherVirtualSANrequirementtoachieve2000readIOPS.VirtualSANisdesignedtoachievea90%readcachehitrate(approximately).Thatmeans10%ofallreadsaregoingtobereadcachemisses;forexample,thatis200IOPSbasedonourrequirement.Asinglemagneticdiskcanperhapsachievesomewhereintheregionof100 IOPS.Therefore, an additional 2magnetic diskswill be required tomeet thisrequirement.Ifwecombinethedestagingrequirementsandthereadcachemissesdescribedabove,yourVirtualSANdesignmayneed4or5magneticdisksperdiskgrouptosatisfyyourworkload.

11.2.8 Striping Considerations OneoftheVMStoragePolicysettingsisNumberOfDiskStripesPerObject.ThatallowsyoutosetastripewidthonaVM’sVMDKobject.Whilesettingdiskstripingvaluescansometimesincreaseperformance,thatisn’talwaysthecase.As an example, if a given test is cache-friendly (e.g.most of the data is in cache),stripingwon’timpactperformancesignificantly.Asanotherexample,ifagivenVMDKis striped acrossdisks that arebusydoingother things, notmuchperformance isgained,andmayactuallybeworse.

11.2.9 Guest File Systems Considerations Many customers have reported significant differences in performance betweendifferentguestfilesystemsandtheirsettings;forexample,WindowsNTFSandLinux.Ifyouarenotgettingtheperformanceyouexpect,considerinvestigatingwhetheritcouldbeaguestOSfilesystemissue.

11.2.10 Performance during Failure and Rebuild WhenVirtualSANisrebuildingoneormorecomponents,applicationperformancecanbeimpacted.Forthisreason,alwayschecktomakesurethatVirtualSANisfullyrebuiltandthattherearenounderlyingissuespriortotestingperformance.Verifythere are no rebuilds occurring before testing with the following RVC command,whichwediscussedearlier:

• vsan.check_state• vsan.disks_stats• vsan.resync_dashboard

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 1

11.3 Performance Testing Option 1: Virtual SAN Health Check VirtualSANHealthCheckcomeswithitsownStoragePerformanceTest.Thisnegatesthe need to deploy additional tools to test the performance of your Virtual SANenvironment.To run thestorageperformance test isquite simple;navigate to thecluster’sMonitortab>VirtualSAN>ProactiveTests,selectStoragePerformanceTest,thenclickontheGoarrowhighlightedbelow.

Figure 11.1: Storage Performance Test

Apopupisthendisplayed,showingthedurationofthetest(default10minutes)alongwith the typeofworkload thatwillbe run.Theusercanchange thisduration, forexample,ifaburn-intestforalongerperiodoftimeisdesired.Thereareanumberofdifferentworkloadsthatcanbechosenfromthedrop-downmenu.

Figure 11.2: Storage Performance Test duration and workload

To learnmoreabout the test that isbeing run, clickon the (i) symbolnext to theworkload.Thiswilldescribethetypeofworkloadthatthetestwillinitiate.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 2

Whenthetestcompleted,theStorageLoadTestresultsaredisplayed,includingtestname,workloadtype,IOPS,throughput,averagelatencyandmaximumlatency.Keepinmindthatasequentialwritepatternwillnotbenefitfromcaching,sotheresultsthatareshownfromthistestarebasicallyareflectionofwhatthecapacitylayer(inthiscase,themagneticdisks)cando.

Figure 11.3: Virtual SAN Cluster Storage Load Test results

Theproactivetestcouldthenberepeatedwithdifferentworkloads

Asbefore,when the test completes, the results areonceagaindisplayed.Youwillnoticeamajordifferenceinresultswhentheworkloadcanleveragethecachinglayerversuswhenitcannot.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 3

11.4 Performance Testing Option 2: HCIbench Inahyper-convergedarchitecture,eachserverisintendedtosupportbothmanyapplicationVMs,aswellascontributetothepoolofstorageavailabletoapplications.ThisisbestmodeledbyinvokingmanydozensoftestVMs,eachaccessingmultiplestoredVMDKs.Thegoalistosimulateaverybusycluster.Unfortunately,popularstorageperformancetestingtoolsdonotdirectlysupportthismodel.Asaresultperformancetestingahyper-convergedarchitecturesuchasVirtualSANpresentsadifferentsetofchallenges.ToaccuratelysimulateworkloadsofaproductionclusteritisbesttodeploymultipleVMsdispersedacrosshostswitheachVMhavingmultipledisks.Inaddition,theworkloadtestneedstoberunagainsteachVManddisksimultaneously.Toaddressthechallengesofcorrectlyrunningperformancetestinginhyper-convergedenvironments,VMwarehascreatedastorageperformancetestingautomationtoolcalledHCIbenchthatautomatestheuseofthepopularVdbenchtestingtool.Userssimplyspecifytheparametersofthetesttheywouldliketorun,andHCIbenchinstructsVdbenchwhattodooneachandeverynodeinthecluster.HCIbenchaimstosimplifyandacceleratecustomerProofofConcept(POC)performancetestinginaconsistentandcontrolledway.Thetoolfullyautomatestheend-to-endprocessofdeployingtestVMs,coordinatingworkloadruns,aggregatingtestresults,andcollectingnecessarydatafortroubleshootingpurposes.Evaluatorschoosetheprofilestheyareinterestedin;HCIbenchdoestherestquicklyandeasily.ThissectionprovidesanoverviewandrecommendationsforsuccessfullyusingHCIbench.Forcompletedocumentationanduseprocedures,refertotheHCIbenchInstallationandUserguidewhichisaccessiblefromthedownloaddirectory.

11.4.1 Where to Get HCIbench HCIbenchandcompletedocumentationcanbedownloadedfromthefollowinglocation:HCIbenchAutomatedTestingTool.Thistoolisprovidedfreeofchargeandwithnorestrictions.Supportwillbeprovidedsolelyonabest-effortbasisastimeandresourcesallow,bytheVMwareVirtualSANCommunityForum.

11.4.2 Deploying HCIbench Step 1 – Deploy the OVATogetstarted,youdeployasingleHCIbenchappliancecalledHCIbench.ova.TheprocessfordeployingtheHCIbenchOVAisnodifferentfromdeployinganyotherOVA.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 4

Step 2 – HCIbench ConfigurationAfterdeployment,navigatetohttp://Controller_VM_IP:8080/tostartconfigurationandkickoffthetest.Therearethreemainsectionsinthisconfigurationfile:• vSphereEnvironmentInformationInthissection,alltheparametersarerequiredexceptfortheNetworkName

field.YoumustprovidethevSphereenvironmentinformationwheretheVirtualSANClusterisconfigured,includingvCenterIPaddress,vCentercredential,nameofthedatacenter,nameoftheVirtualSANCluster,andnameoftheDatastore.§ TheNetworkNameparameterdefineswhichnetworktheVdbenchGuest

VMsshoulduse.ThedefaultvalueisVMNetwork.§ IfDHCPservicesesarenotavailable,theEnableDHCPServiceonthe

NetworkparameterallowsusertoenableDHCPserviceonthenetworkwhich“HCIBenchInternalNetwork”mappedon.

§ TheDatastoreNameparameterspecifiesthedatastorestobetested.AllVMdatawillbedeployedonthisdatastore.ForthepurposesofthisguidetheVirtualSANdatastoreshouldbespecified.Testingmultipledatastoresinparallelisalsosupported.Youcanenterthedatastorenames,oneperline.Inthiscase,virtualmachinesaredistributedevenlyacrossthedatastores.Forexample,ifyouentertwodatastoresand100virtualmachines,50virtualmachineswillbedeployedoneachdatastore.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 5

Figure 11.4: Performance Automation Tool Configuration Step 3 – Virtual SAN Cluster Hosts InformationConfiguringtheClusterHostsinformationisoptional.IfthisparameterisliftuncheckedHCIbenchwillcreateaVDbenchGuestVM,thencloneittoallhostsinthe

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 6

VirtualSANClusterinaround-robinfashion.ThenamingconventionofVdbenchGuestVMsdeployedinthismodeis“vdbench-vc-<DATASTORE_NAME>-<#>”.Ifthisoptionischecked,eachhostsyouwishtodeployHCIbenchguestVMsonmustbemanuallyaddedtotheHostssection.AsabestpracticeitisrecommendedtoleavetheClusterhostinformationparameteruncheckedandletHCIbenchevenlydistributevirutalmachinesoneachhost.

Figure 11.5: Virtual SAN Cluster Hosts Information Step4-VDbenchGuestVMSpecificationInthissection,theonlyrequiredparameterisNumberofVMsthatspecifiesthetotalnumberofVdbenchGuestVMstobedeployedfortesting.Ifyouentermultipledatastores,theseVMsaredeployedevenlyonthedatastores.TheNumberofDataDiskandSizeofDataDiskparametersareoptional:§ TheNumberofDataDiskparameterspecifieshowmanyVMDKstobetested

areaddedtoeachVdbenchGuestVM.§ TheSizeofDataDiskparameterspecifiesthesize(GB)ofeachVMDKtobe

tested.ThetotalnumberofsimulatedworkloadinstancesisNumberofVM*(times)NumberofDataDisk.

Thedefaultvalueofbothparametersis10.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 7

NOTE:Priortosettingthenumberandsizeofeachdatadiskcarefulconsiderationshouldbegiventoensurethatthereissufficeentcomputeandstorageresroucestosupportethetargetworkload.Inaddition,thecumulativesizeofalltestVMsshouldnotexceedthesizeofcacheavailableontheclusterasawhole.Youshouldtakeacarefulesizingexercisetomakesurethereissufficientcomputeandstorageresorucestosupportthetargetlevelofworkloadinstances.

Figure 11.5: Vdbench Guest VM Specification

Step 5 – Download and add vdbench zip file, and add parameter fileOncethisisdone,usersneedtoprovideaccesstothevdbenchtool.Duetolicensingissues,wearenotallowedtodistributethevdbenchbenchmarkingtool,soitneedstobedownloadedfromOracleifyoudonothaveitalready.ThereisalinkprovidedtotheOraclewebsitetodownthevdbenchzipfile,butyouwillneedtohaveanaccountonOracle’ssitetoaccessit.Oncethevdbenchzipfilehasbeendownloadedlocally,youmustthenuploadedtotheappliance.Thenextpartofthesetupistogenerateavdbenchparameterfile,whichhasinformationsuchasI/Osize,R/WratioandwhethertheI/Oshouldberandomorsequentialinnature.Youshouldalsostatehowlongyouwantthetesttorun(3600seconds=1hourbelow),aswellaswhetheryouwanttoddthestoragefirst(initializeit).Finally,decideifyouwantthebenchmarkVMscleaneduponcethetestcompletes.Savetheconfiguration.TomakesurethateverythingisOK,runthevalidatetest.Thiswillverifythatalltheconfigurationparametersarecorrect,andwillstatewhetheritisOKtostartthetest.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 8

Figure 11.6: Vdbench Testing Configuration

11.4.3 Considerations for Defining Test Workloads WorkingsetWorkingsetisoneofthemostimportantfactorsforcorrectlyrunningperformancetestandobtainingaccurateresults. For thebestperformance,avirtualmachine’sworkingsetshouldbemostlyincache.CarewillhavetobetakenwhensizingyourVirtualSANflashtoaccountforallofyourvirtualmachines’workingsetsresidingincache. A general rule of thumb is to size cache as 10% of your consumed virtualmachine storage (not including replica objects). While this is adequate for mostworkloads, understanding your workload’s working set before sizing is a usefulexercise.ConsiderusingVMwareInfrastructurePlanner(VIP)tooltohelpwiththistask–http://vip.vmware.com.ThefollowingprocessisanexampleofsizinganappropriateworkingsetforperformancetestingwithHCIbench.Considerafournodeclusterwithone400GBSSDpernode.Thisgivestheclusteratotalcachesizeof1.6TB.ThetotalcacheavailableinVirtualSANissplit70%forreadcacheand30%forwritecache.Thisgivestheclusterinourexample1120GBofavailablereadcacheand480GBofavailablewritecache.InordertocorrectlyfittheHCIbenchwithintheavailablecache,thetotalcapacityofallVMDKsusedforI/Otestingshouldnotexceed1,120GB.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 8 9

Designinga test scenariowith4VMsperhost, eachVMhaving5X10GBVMDKs,resultinginatotalsizeof800GB. Thiswillallowthetestworkingsettofitwithincache.ThedefaultsettingforthenumberofdatadisksperVMis2andthedefaultsizeofdatadisksis5GB.ThesevaluesshouldbeadjustedsothatthetotalnumberofVMsmultipliedbythenumberofdatadisksperVMmultipliedbythesizeofdatadiskislessthanthesizeofSSDsmultipliedby70%(readcacheinhybridmode)multipliedbythenumberofdiskgroupsperhostmultipliedbythenumberofhosts.Thatis: NumberofVMs*NumberofDataDisk*SizeofDataDisk<CachetierSSDcapacity*70%readcache(hybrid)*DiskGroupsperHost*NumberofHostsToseetheexamplemathematically:4VMs*5DataDisks*10GB=800GB,400GBSSDs*70%*1DiskGroupperHost*4Hosts=1,120GB800GBworkingsetsize<1,120GBreadcacheinclusterThatlaststatementistrue,sothisisanacceptableworkingsetfortheconfiguration(andviceversa).SequentialworkloadsversusrandomworkloadsBeforedoingperformancetestsitisimportanttounderstandtheperformancecharacteristicsoftheproductionworkloadtobetested.Differentapplicationshavedifferentperformancecharacteristics.Understandingthesecharacteristicsiscrucialtosuccessfulperformancetesting.Whenitisnotpossibletotestwiththeactualapplicationorapplicationspecifictestingtoolitisimportanttodesignatestwhichmatchestheproductionworkloadascloselyaspossible.DifferentworkloadtypeswillperformdifferentlyonVirtualSAN.Sustainedsequentialwriteworkloads(suchasVMcloningoperations)runonVirtualSANwillsimplyfillthecacheandfuturewriteswillneedtowaitforthecachetobedestagedtothespinningmagneticdisklayerbeforemoreI/Oscanbewrittentocache,soperformancewillbeareflectionofthespinningdisk(s)andnotofflash.Thesameistrueforsustainedsequentialreadworkflows.Iftheblockisnotincache,itwillhavetobe fetched fromspinningdisk.Mixedworkloadswill benefitmore fromVirtualSAN’scachingdesign.HCIbench allows you to change the percentage read and the percentage randomparameters. As a starting point it is recommended to set the percentage readparameterto70andthepercentagerandomparameterto30%.InitializingStorageDuringconfigurationoftheworkloadtherecommendationistoselecttheoptiontoinitializestorage.ThisoptionwillzerothedisksforeachVMbeingusedinthetest,helpingtoalleviateafirstwritepenaltyduringtheperformancetestingphase.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 0

TestRunConsiderationsAsfrequentlyreadblocksendupincache,readperformancewillimprove.Inaproductionenvironmentactiveblockswillalreadybeincache.Whenrunninganykindofperformancetestingitisimportanttokeepthisinmind.Asabestpracticeperformancetestsshouldincludeatleasta15minutewarmupperiod.Alsokeepinmindthatthelongertestingrunsthemoreaccuratetheresultswillbe.InadditiontothecachewarmingperiodHCIbenchtestsshouldbeconfiguredtoforatleastanhour.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 1

Results AftertheVdbenchtestingiscompleted,thetestresultsarecollectedfromallVdbenchinstancesinthetestVMs.Andyoucanviewtheresultsathttp://Controller_VM_IP/resultsinawebbrowser.YoucanfindalloftheoriginalresultfilesproducedbyVdbenchinstancesinsidethesubdirectorycorrespondingtoatestrun.Inadditiontothetextfiles,thereisanothersubdirectorynamediotest-vdbench-<VM#>vminside,whichisthestatisticsdirectorygeneratedbyVirtualSANObserver.VirtualSANperformancedatacanbeviewedbyopeningthestats.htmlfilewithinthetestdirectory.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 2

12. Testing Hardware Failures

12.1 Understanding Expected Behavior When doing failure testing with Virtual SAN, it is important to understand theexpectedbehaviorfordifferentfailurescenarios.Youshouldcomparetheresultsofyour test towhat is expected.Theprevious section shouldbe read tounderstandexpectedfailurebehaviors.

12.2 Important: Test one Thing at a Time Bydefault,virtualmachinesaredeployedonVirtualSANwiththeabilitytotolerateonefailure.Ifyoudonotwaitforthefirstfailuretoberesolved,andthentrytotestanotherfailure,youwillhaveintroducedtwofailurestothecluster.VirtualMachineswillnotbeabletotoleratethesecondfailureandwillbecomeinaccessible.

12.3 VM Behavior when Multiple Failures Encountered Previously we discussed VM operational states and availability. To recap, a VMremainsaccessiblewhena fullmirror copyof theobjectsareavailable, aswell asgreaterthan50%ofthecomponentsthatmakeuptheVM;thewitnessesaretheretoassistwiththelatterrequirement.

Let’stalkalittleaboutVMbehaviorwhentherearemorefailuresintheclusterthantheNumberOfFailuresToToleratesettinginthepolicyassociatedwiththeVM.

12.3.1 VM Powered on and VM Home Namespace Object Goes Inaccessible IfarunningVMhasitsVMHomeNamespaceobjectgoinaccessibleduetofailuresinthecluster,anumberofdifferentthingsmayhappen.OncetheVMispoweredoff,itwillbemarked"inaccessible"inthevSpherewebclientUI.Therecanalsobeothersideeffects,suchastheVMgettingrenamedintheUItoits“.vmx”pathratherthanVMname,ortheVMbeingmarked"orphaned".

12.3.2 VM Powered on and Disk Object Goes Inaccessible IfarunningVMhasoneofitsdiskobjectsgoinaccessible,theVMwillkeeprunning,but itsVMDK’s I/O is stalled.Typically, theGuestOSwill eventually timeout I/O.Someoperatingsystemsmaycrashwhenthisoccurs.Otheroperatingsystems, forexamplesomeLinuxdistributions,maydowngradethefilesystemsontheimpactedVMDKtoread-only.TheGuestOSbehavior,andeventheVMbehaviorisnotVirtualSANspecific.ItcanalsobeseenonVMsrunningontraditionalstoragewhentheESXihostsuffersanAPD(AllPathsDown).OncetheVMbecomesaccessibleagain,thestatusshouldresolve,andthingsgobacktonormal.Ofcourse,dataremainsintactduringthesescenarios.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 3

12.4 What Happens when a Server Fails or is Rebooted? Ahost failure canoccur inanumberofways. It couldbea crash,or it couldbeanetwork issue (which isdiscussed inmoredetail in thenext section).However, itcouldalsobesomethingassimpleasareboot,andthatthehostwillbebackonlinewhen the reboot process completes. Once again, Virtual SAN needs to be able tohandlealloftheseevents.If there are active components of an object residing on the host that is detected to be failed (due to any of the stated reasons) then those components are marked as ABSENT. I/O flow to the object is restored within 5-7 seconds by removing the ABSENT component from the active set of components in the object.The ABSENT state is chosen rather than the DEGRADED state because in many cases a host failure is a temporary condition. A host might be configured to auto-reboot after a crash, or the host’s power cable was inadvertently removed, but plugged back in immediately. Virtual SAN is designed to allow enough time for a host to reboot before starting rebuilds on other hosts so as not to waste resources. BecauseVirtualSANcannottellifthisisahostfailure,anetworkdisconnectorahostreboot,the60-minutetimerisonceagainstarted.Ifthetimerexpires,andthehosthasnotrejoinedthecluster,arebuildofcomponentsontheremaininghostsintheclustercommences.Ifahost fails,or isrebooted, thiseventwill triggera"Hostconnectionandpowerstate"alarm,andifvSphereHAisenabledonthecluster,itwillalsocausea"vSphereHAhoststatus"alarmanda“HostcannotcommunicatewithallothernodesintheVirtualSANEnabledCluster”message.IfNumberOfFailuresToTolerate=1orhigherintheVMStoragePolicy,andanESXihostgoesdown,VMsnotrunningonthefailedhostcontinuetorunasnormal.IfanyVMswiththatpolicywererunningonthefailedhost,theywillgetrestartedononeoftheotherESXihostsintheclusterbyvSphereHA,aslongasitisconfiguredonthecluster.Caution: If VMs are configured in such a way as to not tolerate failures,(NumberOfFailuresToTolerate=0),aVMthathascomponentsonthefailinghostwillbecomeinaccessiblethroughthevSpherewebclientUI.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 4

12.5 Simulate Host Failure without vSphere HA WithoutvSphereHA,anyvirtualmachinesrunningonthehostthatfailswillnotbeautomaticallystartedelsewhereinthecluster,eventhoughthestoragebackingthevirtualmachineinquestionisunaffected.Let’stakeanexamplewhereaVMisrunningonahost(cs-ie-h02.ie.local).

Figure 12.1: host failure without vSphere HA Itwould also be a good test if this VM also had components located on the localstorageof thishost.However, itdoesnotmatter if itdoesnotas the testwill stillhighlightthebenefitsofvSphereHA.Next,thehostisrebooted:

Figure 12.2: Reboot the host

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 5

Asexpected,thehostisnotrespondinginvCenter,andtheVMbecomesdisconnected.TheVMwillremaininadisconnectedstateuntiltheESXihosthasfullyrebooted,asthereisnovSphereHAenabledonthecluster,sotheVMcannotberestartedonanotherhostinthecluster.

Figure 12.3: ESXi host not responding, VM disconnected

IfyounowexaminethepoliciesoftheVM,youwillseethatitisnon-compliant.Youcanalsoseethereasonwhyinthelowerpartofthescreen.ThisVMshouldbeabletotolerateonefailure,butduetothefailurecurrentlyinthecluster(forexample:oneESXihostiscurrentlyrebooting),thisVMcannottolerateanotherfailure,thusitisnon-compliantwithitspolicy.WhatcanbededucedfromthisisthatnotonlywastheVM’scomputerunningonthehost which was rebooted, but that it also had some components residing on thestorageofthehostthatwasrebooted.Wecanconfirmthiswhenthehostfullyreboots.

Figure 12.4: VM is non-compliant

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 6

OncetheESXihosthasrebooted,weseethattheVMisnolongerdisconnectedbutleftinapoweredoffstate.

Figure 12.5: ESXi host rebooted, VM powered off

Asmentionedpreviously,ifthephysicaldiskplacementisexamined,wecanclearlyseethatthestorageonthehostthatwasrebooted,cs-ie-h02.ie.local,wasusedtostorecomponentsbelongingtotheVM.

Figure 12.6: Components on host that was rebooted

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 7

12.6 Simulate Host Failure with vSphere HA Let’snowrepeatthesamescenario,butwithvSphereHAenabledonthecluster.First,powerontheVMfromthelasttest.Next,selecttheclusterobject,andnavigatetotheManagetab,thenSettings>Services>vSphereHA.vSphereHAisturnedoffcurrently.

Figure 12.7: vSphere HA is turned off

Clickonthe“Edit”buttontoenablevSphereHA.Whenthewizardpopsup,clickonthe“TurnonvSphereHA”checkboxasshownbelow,thenclickOK.

Figure 12.8: Turn on vSphere HAThiswilllaunchanumberoftasksoneachodeinthecluster.ThesecanbemonitoredviatheMonitor>Tasksview.WhentheconfiguringofvSphereHAtaskscomplete,selecttheclusterobject,thentheSummarytab,thenthevSphereHA

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 8

windowandensureitisconfiguredandmonitoring.TheclustershouldnowhaveVirtualSAN,DRSandvSphereHAenabled.

Figure 12.9: Virtual SAN, DRS and vSphere HA enabled

VerifythatthetestVMisstillresidingonhostcs-ie-h02.ie.local.Nowrepeatthesametestasbeforebyrebootinghostcs-ie-h02.ie.localandexaminethedifferenceswithvSphereHAenabled.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 9 9

Figure 12.10: Reboot the host, this time with vSphere HA enabled

Onthisoccasion,anumberofHArelatedeventsshouldbedisplayedontheSummarytabofthehostbeingrebooted(youmayneedtorefreshthewebclienttoseethese):

Figure 12.11: vSphere HA messagesHowever,ratherthantheVMbecomingdisconnectedforthedurationofthehostrebootlikewasseeninthelasttest,theVMininsteadrestartedonanotherhost,inthiscasecs-ie-h03.ie.local.

Figure 12.12: VM restarted on a different host

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 0

EarlierwestatedthatthereweresomecomponentsbelongingtotheobjectsofthisVMresidingonthelocalstorageofthehostthatwasrebooted.Thesecomponentsnowshowupas“Absent”intheVM>Monitor>policies>PhysicalDiskPlacementviewasshownbelow.

Figure 12.13: Absent components OncetheESXihostcompletesrebooting,assumingitisbackwithin60minutes,thesecomponentswillberediscovered,resynchronizedandplacedbackinanActivestate.Should thehost be disconnected for longer than60minutes (theCLOMD timeoutdelaydefaultvalue),the“Absent”componentswillberebuiltelsewhereinthecluster.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 1

12.7 Disk is Pulled Unexpectedly from ESXi Host WhenamagneticdiskispulledfromanESXihoststhatisusingittocontributestorageto Virtual SAN without first decommissioning the disk, all the Virtual SANcomponentsresidingonthediskgoABSENTandareinaccessible.The ABSENT state is chosen over DEGRADED because Virtual SAN knows the disk is not lost, but just removed. If the disk is placed back in the server before the 60-minute timeout, no harm is done and Virtual SAN syncs it back up. In this scenario, Virtual SAN is back up with full redundancy without wasting resources on an expensive rebuild.

12.7.1 Expected Behaviors • IftheVMhasapolicythatincludesNumberOfFailuresToTolerate=1orgreater,the

VM’s objectswill still be accessible from another ESXi host in theVirtual SANCluster.

• ThediskstateismarkedasABSENTandcanbeverifiedviavSpherewebclientUI.• At this point, all in-flight I/O is halted while Virtual SAN reevaluates the availability

of the object (e.g. VM Home Namespace or VMDK) without the failed component as part of the active set of components.

• If Virtual SAN concludes that the object is still available (based on a full mirror copy and greater than 50% of the components being available), all in-flight I/O is restarted.

• The typical time from physical removal of the disk, Virtual SAN processing this event, marking the component ABSENT halting and restoring I/O flow is approximately 5-7 seconds.

• If the same disk is placed back on the same host within 60minutes, no newcomponentswillbere-built.

• If60minutespasses,andtheoriginaldiskhasnotbeenreinserted in thehost,componentsontheremoveddiskwillbebuiltelsewhereinthecluster,ifcapacityisavailable,includinganynewlyinserteddisksclaimedbyVirtualSAN.

• If the VM Storage Policy hasNumberOfFailuresToTolerate=0,the VMDKwill beinaccessibleifoneoftheVMDKcomponents(thinkonecomponentofastripeorafullmirror)residesontheremoveddisk.TorestoretheVMDK,thesamediskhastobeplacedbackintheESXihost.ThereisnootheroptionforrecoveringtheVMDK.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 2

12.8 SSD is Pulled Unexpectedly from ESXi Host Whenasolid-statediskdriveispulledwithoutdecommissioningit,alltheVirtualSANcomponentsresidinginthatdiskgroupwillgoABSENTandareinaccessible.Inotherwords,if an SSD is removed, it will appear as a removal of the SSD as well as all associated magnetic disks backing the SSD from a Virtual SAN perspective.

12.8.1 Expected Behaviors • IftheVMhasapolicythatincludesNumberOfFailuresToTolerate=1orgreater,the

VM’sobjectswillstillbeaccessible.• DiskgroupandthedisksunderthediskgroupstateswillbemarkedasABSENT

andcanbeverifiedviathevSpherewebclientUI. • At this point, all in-flight I/O is halted while Virtual SAN reevaluates the availability

of the objects without the failed component(s) as part of the active set of components. • If Virtual SAN concludes that the object is still available (based on a full mirror copy

and greater than 50% of components being available), all in-flight I/O is restarted.• The typical time from physical removal of the disk, Virtual SAN processing this event,

marking the components ABSENT halting and restoring I/O flow is approximately 5-7 seconds.

• WhenthesameSSDisplacedbackonthesamehostwithin60minutes,nonewobjectswillbere-built.

• Whenthetimeoutexpires(default60minutes),componentsontheimpacteddiskgroupwillberebuiltelsewhereinthecluster,providingenoughcapacityandisavailable.

• If theVMStoragePolicyhasNumberOfFailuresToTolerate=0, theVMDKwill beinaccessibleifoneoftheVMDKcomponents(thinkonecomponentofastripeora fullmirror)existsondiskgroupwhomthepulledSSDbelongs to.TorestoretheVMDK,thesameSSDhastobeplacedbackintheESXihost.ThereisnooptiontorecovertheVMDK.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 3

12.9 What Happens When a Disk Fails? Ifadiskdrivehasanunrecoverableerror,VirtualSANmarksthediskasDEGRADEDasthefailureispermanent.

12.9.1 Expected Behaviors • IftheVMhasapolicythatincludesNumberOfFailuresToTolerate=1orgreater,the

VM’sobjectswillstillbeaccessible.• ThediskstateismarkedasDEGRADEDandcanbeverifiedviavSpherewebclient

UI.• At this point, all in-flight I/O is halted while Virtual SAN reevaluates the availability

of the object without the failed component as part of the active set of components. • If Virtual SAN concludes that the object is still available (based on a full mirror copy

and greater than 50% of components being available), all in-flight I/O is restarted.• The typical time from physical removal of the drive, Virtual SAN processing this event,

marking the component DEGRADED halting and restoring I/O flow is approximately 5-7 seconds.

• Virtual SAN now looks for any hosts and disks that can satisfy the object requirements. This includes adequate free disk space and placement rules (e.g. 2 mirrors may not share the same host). If such resources are found, Virtual SAN will create new components on there and start the recovery process immediately.

• If the VM Storage Policy hasNumberOfFailuresToTolerate=0,the VMDKwill beinaccessible ifoneoftheVMDKcomponents(thinkonecomponentofastripe)existsonthepulleddisk.ThiswillrequirearestoreoftheVMfromaknowngoodbackup.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 4

12.10 What Happens When an SSD Fails? AnSSDfailurefollowsasimilarsequenceofeventstothatofadiskfailurewithonemajordifference;VirtualSANwillmarktheentirediskgroupasDEGRADED.VirtualSANmarks theSSDandalldisks in thediskgroupasDEGRADEDas the failure ispermanent(diskisoffline,nolongervisible,andothers).

12.10.1 Expected Behaviors • IftheVMhasapolicythatincludesNumberOfFailuresToTolerate=1orgreater,the

VM’s objectswill still be accessible from another ESXi host in theVirtual SANCluster.

• Disk group and the disks under the disk group states will be marked asDEGRADEDandcanbeverifiedviathevSpherewebclientUI.

• At this point, all in-flight I/O is halted while Virtual SAN reevaluates the availability of the objects without the failed component(s) as part of the active set of components.

• If Virtual SAN concludes that the object is still available (based on available full mirror copy and witness), all in-flight I/O is restarted.

• The typical time from physical removal of the drive, Virtual SAN processing this event, marking the component DEGRADED halting and restoring I/O flow is approximately 5-7 seconds.

• Virtual SAN now looks for any hosts and disks that can satisfy the object requirements. This includes adequate free SSD and disk space and placement rules (e.g. 2 mirrors may not share the same hosts). If such resources are found, Virtual SAN will create new components on there and start the recovery process immediately.

• If theVMStoragePolicyhasNumberOfFailuresToTolerate=0, theVMDKwill beinaccessible ifoneoftheVMDKcomponents(thinkonecomponentofastripe)exists on disk group whom the pulled SSD belongs to. There is no option torecoverthe VMDK. Thismay require a restore of the VM from a known goodbackup.

Warning:TestonethingatatimeduringthefollowingPOCsteps.Failuretoresolvethepreviouserrorbeforeintroducingthenexterrorwillintroducemultiplefailuresinto Virtual SAN which it may not be equipped to deal with, based on theNumberOfFailuresToToleratesetting,whichissetto1bydefault.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 5

12.11 Virtual SAN Disk Fault Injection Script for POC Failure Testing WhentheVirtualSANHealthCheckVIBisinstalled(installedbydefaultinvSphere6.0U1),apythonscripttohelpwithPOCdiskfailuretestingisavailableonallESXihosts.ThescriptiscalledvsanDiskFaultInjection.pycandcanbefoundontheESXihostsinthedirectory/usr/lib/vmware/vsan/bin.Todisplaytheusage,runthefollowingcommand:[root@cs-ie-h01:/usr/lib/vmware/vsan/bin] python./vsanDiskFaultInjection.pyc -h Usage: injectError.py -t -r error_durationSecs -d deviceName injectError.py -p -d deviceName injectError.py -c -d deviceName Options: -h, --help show this help message and exit -u Inject hot unplug -t Inject transient error -p Inject permanent error -c Clear injected error -r ERRORDURATION Transient error duration in seconds -d DEVICENAME, --deviceName=DEVICENAME [root@cs-ie-h01:/usr/lib/vmware/vsan/bin]

Warning: This command should only be used in pre-production environmentsduringaPOC.Itshouldnotbeusedinproductionenvironments.UsingthiscommandtomarkdisksasfailedcanhaveacatastrophiceffectonaVirtualSANCluster.Readersshouldalsonotethatthistoolprovidestheabilitytodo“hotunplug”ofdrives,whichissimilartothetestingthatwasdonewiththehpssaclicommandpreviously.Thisisanalternativewayofcreatingasimilartypeofcondition.However,inthisPOCguide,thisscriptisonlybeingusedtoinjectpermanenterrors.

12.12 Pull Magnetic Disk/Capacity Tier SSD and Replace before Timeout Expires In this first example, we shall remove a disk from the host using thevsanDiskFaultInjection.pycpythonscriptratherthanphysicallyremovingitfromthehost.Itshouldbenotedthatthesametestscanberunbysimplyremovingthediskfromthehost.Ifphysicalaccesstothehostisconvenient,literallypullingadiskwouldtestexactphysicalconditionsasopposedtoemulatingitwithinsoftware.AlsonotethatnotallI/Ocontrollerssupporthotunpluggingdrives.ChecktheVirtualSANCompatibilityGuidetoseeifyourcontrollermodelsupportsthehotunplugfeature.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 6

We will then examine the effect this operation has on Virtual SAN, and virtualmachinesrunningonVirtualSAN.WeshallthenreplacethecomponentbeforetheCLOMD timeout delay expires (default 60 minutes), which will mean that norebuildingactivitywilloccurduringthistest.PickahostwitharunningVM.

Figure 12.14: Select host with running VM

Next,navigatetotheVM’sMonitortab>Policies,selectaHardDiskandthenselectPhysicalDiskPlacement tab in the lowerhalfof the screen. IdentifyaComponentobject.ThecolumnthatwearemostinterestedinisHDDDiskName,asitcontainstheNAASCSIidentifierofthedisk.Theobjectiveistoremoveoneofthesedisksfromthehost(othercolumnsmaybehiddenbyrightclickingonthem).

Figure 12.15: Display disk identifiers From figure 12.15, let us say that we wish to remove the disk containing thecomponentresidingonhostcs-ie-h01.ie.local.Thatcomponentresidesonphysicaldiskwith anNAA ID string of naa.600508b1001c388c92e817e43fcd5237.Make anoteofyourNAAIDstring.Next,SSHintothehostwiththedisktopull.InjectahotunplugeventusingthevsanDiskFaultInjection.pycpythonscript: [root@cs-ie-h01:~] python /usr/lib/vmware/vsan/bin/vsanDiskFaultInjection.pyc –u –d naa.600508b1001c388c92e817e43fcd5237 Injecting hot unplug on device vmhba1:C0:T4:L0 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x1 vsish -e set /storage/scsifw/paths/vmhba1:C0:T4:L0/injectError 0x004C0400000002

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 7

Let’s now check out the VM’s objects and components and as expected, thecomponentthatresidedonthatdiskinhostcs-ie-h01quicklyshowsupasabsent.

Figure 12.16: Disk Removed, Component Absent Toputthediskdrivebackinthehost,onesimplyrescansthehostfornewdisks.Navigatetothehost>Manage>Storage>StorageDevicesandclicktherescanbutton.

Figure 12.17: Rescan storage adapters LookatthelistofstoragedevicesfortheNAAIDthatwasremoved.Ifforsomereason,thediskdoesn’treturnafterrefreshingthescreen,tryrescanningthehost

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 8

again.Ifitstilldoesn’tappear,reboottheESXihost.OncetheNAAIDisback,clearanyhotunplugflagssetpreviouslywiththe–coption:[root@cs-ie-h01:~] python /usr/lib/vmware/vsan/bin/vsanDiskFaultInjection.pyc –c –d naa.600508b1001c388c92e817e43fcd5237 Clearing errors on device vmhba1:C0:T4:L0 vsish -e set /storage/scsifw/paths/vmhba1:C0:T4:L0/injectError 0x00000 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x00000

12.13 Pull Magnetic Disk/Capacity Tier SSD and Do not Replace before Timeout Expires Inthisexample,weshallremovethemagneticdiskfromthehost,onceagainusingthevsanDiskFaultInjection.pycscript.However,thistimeweshallwaitlongerthan60minutesbeforescanningtheHBAfornewdisks.After60minutes,VirtualSANwillrebuildthecomponentsonthemissingdiskelsewhereincluster.

Thesameprocessasbeforecannowberepeated.Howeverthistimeweshallleavethediskdriveremovedformorethan60minutesandseetherebuildactivitytakeplace.Beginbyidentifyingthediskonwhichthecomponentresides.

Figure 12.18: Identify NAA id

[root@cs-ie-h01:~] date Mon Dec 14 13:36:02 UTC 2015 [root@cs-ie-h01:~] python /usr/lib/vmware/vsan/bin/vsanDiskFaultInjection.pyc –u –d naa.600508b1001c388c92e817e43fcd5237 Injecting hot unplug on device vmhba1:C0:T4:L0 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x1 vsish -e set /storage/scsifw/paths/vmhba1:C0:T4:L0/injectError 0x004C0400000002

Atthispoint,wecanonceagainseethatthecomponenthasgoneabsent.After60minuteshaveelapsed,thecomponentshouldnowberebuilt.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 0 9

Figure 12.19: Component is absent

Afterthe60minuteshaselapsed,thecomponentshouldberebuiltonadifferentdiskinthecluster.Thatiswhatisobserved.Notethecomponentresidesonanewdisk(NAAidisdifferent).

Figure 12.20: Component is rebuilt Theremoveddiskcannowbere-addedbyscanningtheHBA:Navigatetothehost>Manage>Storage>StorageDevicesandclicktherescanbutton.SeeFigure12.18aboveforascreenshot.LookatthelistofstoragedevicesfortheNAAIDthatwasremoved.Ifforsomereason,thediskdoesn’treturnafterrefreshingthescreen,tryrescanningthehostagain.Ifitstilldoesn’tappear,reboottheESXihost.OncetheNAAIDisback,clearanyhotunplugflagssetpreviouslywiththe–coption:[root@cs-ie-h01:~] python /usr/lib/vmware/vsan/bin/vsanDiskFaultInjection.pyc –c –d naa.600508b1001c388c92e817e43fcd5237 Clearing errors on device vmhba1:C0:T4:L0 vsish -e set /storage/scsifw/paths/vmhba1:C0:T4:L0/injectError 0x00000 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x00000

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 0

ThatcompletesthispartofthePOC.

12.14 Pull Cache Tier SSD and Do Not Reinsert/Replace Forthepurposesofthistest,weshallremoveanSSDfromoneofthediskgroupsinthe cluster. Navigate to the cluster > Manage > Settings > Virtual SAN > DiskManagement. Selectadiskgroupfromthetopwindowandidentify itsSSDinthebottomwindow.IfAll-Flash,makesureit’stheFlashdeviceinthe“Cache”DiskRole.MakeanoteoftheSSD’sNAAIDstring.

Figure 12.21: Locate a caching-tier SSD

Intheabovescreenshot,wehavelocatedanSSDonhostw2-stsds-139withanNAAIDstringofnaa.55cd2e404b66fcc5.Next,SSHintothehostwiththeSSDtopull.InjectahotunplugeventusingthevsanDiskFaultInjection.pycpythonscript: [root@w2-stsds-139:~] python /usr/lib/vmware/vsan/bin/vsanDiskFaultInjection.pyc -u -d naa.55cd2e404b66fcc5 Injecting hot unplug on device vmhba1:C0:T4:L0 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x1 vsish -e set /storage/scsifw/paths/vmhba1:C0:T4:L0/injectError 0x004C0400000002

NowweobservetheimpactthatlosinganSSD(flashdevice)hasonthewholediskgroup.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 1

Figure 12.22: Absent cache tier SSD = Unhealthy Disk Group Andfinally,let’slookatthecomponentsbelongingtothevirtualmachine.Thistime,anycomponentsthatwereresidingonthatdiskgroupareabsent.

Figure 12.23: SSD removed – all components absent

ToshowthatthisimpactsallVMs,hereisanotherVMthathadacomponentonlocalstorageonhostcs-ie-h01.ie.local.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 2

Figure 12.24: SSD removed – all components absent IfyousearchallyourVMs,youwillseethateachVMthathadacomponentonthediskgrouponcs-ie-h07nowhasabsentcomponents.ThisisexpectedsinceanSSDfailureimpactsthewholeofthediskgroup.After60minuteshave elapsed, new components shouldbe rebuilt inplaceof theabsentcomponents.Ifyoumanagetorefreshatthecorrectmoment,youshouldbeabletoobservetheadditionalcomponentssynchronizingwiththeexistingdata.

Figure 12.25: New components resynchronizing after clomd timeout expires

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 3

TocompletethisPOC,re-addtheSSDlogicaldevicebacktothehostbyrescanningtheHBA:Navigatetothehost>Manage>Storage>StorageDevicesandclicktherescanbutton.SeeFigure12.18aboveforascreenshot.LookatthelistofstoragedevicesfortheNAAIDoftheSSDthatwasremoved.Ifforsomereason,theSSDdoesn’treturnafterrefreshingthescreen,tryrescanningthehostagain.Ifitstilldoesn’tappear,reboottheESXihost.OncetheNAAIDisback,clearanyhotunplugflagssetpreviouslywiththe–coption:[root@cs-ie-h01:~] python /usr/lib/vmware/vsan/bin/vsanDiskFaultInjection.pyc –c –d naa.55cd2e404b66fcc5 Clearing errors on device vmhba1:C0:T4:L0 vsish -e set /storage/scsifw/paths/vmhba1:C0:T4:L0/injectError 0x00000 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x00000

Figure 12.26: Verify that the disk group is back in a health state

Warning:IfyoudeleteanSSDdrivethatwasmarkedasanSSD,andalogicalRAID0devicewasrebuiltaspartofthistest,youmayhavetomarkthedriveasanSSDoncemore.

12.15 Checking Rebuild/Resync Status

VirtualSAN6.0displaysdetailsonresyncingcomponents.NavigatetoMonitortab>VirtualSAN>ResyncingComponents.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 4

Figure 12.27: Resyncing Components

Tocheckthestatusofcomponentresync/rebuildonaVirtualSANClusterusingRVCcommands,thefollowingcommandwilldisplayusefulinformation:

§ vsan.resync_dashboard

Whenresynchronizationiscomplete,thiscommandwillreport“0bytestosync”.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 5

12.16 Injecting a Disk Error Thefirststepistoselectahost,andtheselectadiskthatispartofadiskgrouponthathost.The–dDEVICENAMEargumentrequirestheSCSIidentifierofthedisk,typicallytheNAAid.YoumightalsowishtoverifythatthisdiskdoesindeedcontainVMcomponents.ThiscanbedonebyselectingaVM,thenselectingtheMonitor>Policies>PhysicalDiskPlacementtab.cs-ie-03,andhasanNAAidof600508b1001c1a7f310269ccd51a4e83:

Figure 12.28: Healthy Disk Group

TheerrorcanonlybeinjectedfromthecommandlineoftheESXihost.TodisplaytheNAAidsofthedisksontheESXihost,youwillneedtoSSHtotheESXihost,loginastherootuser,andrunthefollowingcommand:[root@cs-ie-h03:/usr/lib/vmware/vsan/bin] esxcli storage core device list| grep ^naa naa.600508b1001ceefc4213ceb9b51c4be4 naa.600508b1001cd259ab7ef213c87eaad7 naa.600508b1001c9c8b5f6f0d7a2be44433 naa.600508b1001c2b7a3d39534ac6beb92d naa.600508b1001cb11f3292fe743a0fd2e7 naa.600508b1001c1a7f310269ccd51a4e83 naa.600508b1001c9b93053e6dc3ea9bf3ef naa.600508b1001c626dcb42716218d73319

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 6

Onceadiskhasbeenidentified,andhasbeenverifiedtobepartofadiskgroup,andthatthediskcontainssomevirtualmachinecomponents,wecangoaheadandinjecttheerrorasfollows: [root@cs-ie-h03:/usr/lib/vmware/vsan/bin] python vsanDiskFaultInjection.pyc -p -d naa.600508b1001c1a7f310269ccd51a4e83 Injecting permanent error on device vmhba1:C0:T0:L4 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x1 vsish -e set /storage/scsifw/paths/vmhba1:C0:T0:L4/injectError 0x03110300000002 [root@cs-ie-h03:/usr/lib/vmware/vsan/bin] Beforetoolong,thediskshoulddisplayanerrorandthediskgroupshouldenteranunhealthystate.

Figure 12.29: Unhealthy Disk Group

Notice that the disk group is in an Unhealthy state and the status of the disk is“Permanentdiskloss”.Thisshouldplaceanycomponentsonthediskintoadegradedstate(whichcanbeobservedviatheVM’sPhysicalDiskPlacementtab,andinitiateanimmediaterebuildofcomponents.NavigatingtoCluster>Monitor>VirtualSAN>ResyncingComponentsshouldrevealthecomponentsresyncing.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 7

Figure 12.30: Resyncing components after disk failure

12.16.2 Clear a Permanent Error Atthispoint,wecancleartheerror.Weusethesamescriptthatwasusedtoinjecttheerror,butthistimeweprovidea–c(clear)option: [root@cs-ie-h03:/usr/lib/vmware/vsan/bin] python vsanDiskFaultInjection.pyc -c -d naa.600508b1001c1a7f310269ccd51a4e83 Clearing errors on device vmhba1:C0:T0:L4 vsish -e set /storage/scsifw/paths/vmhba1:C0:T0:L4/injectError 0x00000 vsish -e set /reliability/vmkstress/ScsiPathInjectError 0x00000 [root@cs-ie-h03:/usr/lib/vmware/vsan/bin]

Notehoweverthatsincethediskfailed,itwillhavetoberemoved,andre-addedfromthediskgroup.Thisisverysimpletodo.Simplyselectthediskinthediskgroup,andremoveitbyclickingontheiconhighlightedbelow.

Figure 12.31: Remove disk from disk group

This will display a pop-up window regarding which action to take regarding thecomponentsonthedisk.Youcanchoosetomigratethecomponentsornot.Bydefaultitisshownas“EvacuateData”,shownhere.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 8

Figure 12.32: Data is evacuated by default, but can be unchecked in this testForthepurposesofthisPOC,youcanuncheckthisoptionasyouareaddingthediskbackinthenextstep.Whenthediskhasbeenremovedandre-added,thediskgroupwillreturntoahealthystate.Thatcompletesthediskfailuretest.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 1 9

12.17 When Might a Rebuild of Components Not Occur? Thereareacoupleofreasonswhyarebuildofcomponentsmightnotoccur.

12.17.1 Lack of Resources VerifythatthereareenoughresourcestorebuildcomponentsbeforetestingwiththefollowingRVCcommand:

• vsan.whatif_host_failures

Ofcourse,ifyouaretestingwitha3-nodecluster,andyouintroduceahostfailure,therewillbenorebuildingofobjects.Onceagain,ifyouhavetheresourcestocreatea4-nodecluster, then this isamoredesirableconfiguration forevaluationVirtualSAN.

12.17.2 Underlying Failures Another cause of a rebuild not occurring is due to an underlying failure alreadypresent in thecluster.Verify therearenonebeforetestingwiththe followingRVCcommand:

• vsan.hosts_info• vsan.check_state• vsan.disks_stats

Ifthesecommandsrevealunderlyingissues(ABSENTorDEGRADEDcomponentsforexample), rectify these first or you risk inducing multiple failures in the cluster,resultingininaccessiblevirtualmachines.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 0

13. Virtual SAN Management Inthissection,weshalllookatanumberofmanagementtasks,suchasthebehaviorwhenplacingahostintomaintenancemode,andtheevacuationofadiskandadiskgroupfromahost.WewillalsolookathowtoturnonandofftheidentifyingLEDsonadiskdrive.

13.1 Put a Host into Maintenance Mode Thereareanumberofoptionsavailablewhenplacingahostintomaintenancemode.ThefirststepistoidentifyahostthathasarunningVM,aswellascomponentsbelongingtovirtualmachineobjects.SelecttheSummarytabofthevirtualmachinetoverifywhichhostitisrunningon.

Figure 13.1: VM Summary tabThenselecttheMonitortab>Policies>PhysicalDiskPlacementandverifythattherearecomponentsalsoresidingonthesamehost.

Figure 13.2: Physical Disk Placement

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 1

Fromthescreenshotsshownhere,wecanseethattheVMselectedisrunningonhostcs-ie-h02andalsohascomponentsresidingonthathost.Thisisthehostthatweshallplaceintomaintenancemode.Right click on the host, selectMaintenanceMode from the dropdownmenu, thenselecttheoption“EnterMaintenanceMode”asshownhere.

Figure 13.3: Enter Maintenance Mode

There are three options displayswhenmaintenancemode is selected; (i) Ensureaccessibility,(ii)Fulldatamigrationand(iii)Nodatamigration.

Figure 13.4: Maintenance Mode options Inthisfirstpartofthemaintenancemodetesting,weshallselecttheoption“Ensureaccessibility”.Thismeansthatalthoughcomponentsmaygomissing,theVMsshallremainaccessible.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 2

Whenthisoptionischosen,apopupisdisplayedregardingmigratingrunningvirtualmachines.SincethisisafullyautomatedDRScluster,thevirtualmachinesshouldbeautomaticallymigrated.

Figure 13.5: Migration warning Afterthehosthasenteredmaintenancemode,wecannowexaminethestateofthecomponentsthatwereonthelocalstorageofthishost.Whatyoushouldobserveisthat these components are now in an “Absent” state. However the VM remainsaccessibleaswechosetheoption“EnsureAccessibility”whenenteringMaintenanceMode.

Figure 13.6: Components are Absent during Maintenance Mode

Thehostcannowbetakenoutofmaintenancemode.Simplyrightclickonthehostasbefore,selectMaintenanceModeandthenExitMaintenanceMode.

Figure 13.7: Exit Maintenance Mode

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 3

AfterexitingMaintenanceMode,the“Absent”componentbecomesActiveoncemore.This is assuming that the host exited maintenance mode before thevsan.ClomdRepairDelayexpires(default60minutes).

Figure 13.8: Component is Active once more Weshallnowplacethehostintomaintenancemodeoncemore,butthistimeinsteadof “Ensure Accessibility”, we shall choose “Full data migration”. This means thatalthoughcomponentsonthehostinmaintenancemodewillnolongerbeavailable,thosecomponentswillberebuiltelsewhereinthecluster,implyingthatthereisfullavailabilityofthevirtualmachineobjects.Note:ThisisonlypossiblewhenNumberOfFailuresToTolerate=1andthereare4ormore hosts in the cluster. It is not possible with 3 hosts andNumberOfFailuresToTolerate=1,asanotherhostneedstobeavailabletorebuildthecomponents.ThisistrueforhighervaluesofNumberOfFailuresToToleratealso.

Figure 13.9: Full data migration

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 4

Nowifthecomponentsonhostcs-ie-h02.ie.localaremonitored,youwillseethatnocomponentsareplacedinan“Absent”state,butrathertheyarerebuiltontheotherhostsinthecluster.Whenthehostentersmaintenancemode,youwillnoticethatallcomponentsofthevirtualmachinesareactive,butnoneresideonthehostplacedintomaintenancemode.

Figure 13.10: All components are Active when host is in mode (full data migration)

Exitmaintenancemode.ThiscompletesthispartofthePOC.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 5

13.2 Remove and Evacuate a Disk Inthisexample,weshowafeatureintroducedinversion6.0.Thisistheabilitytoevacuateadiskpriortoremovingitfromadiskgroup.Note:Theclustermustbeleftinmanualmode.Theoperationsarenotavailablewhenaclusterisinautomaticmode.Navigatetothecluster>Managetab>VirtualSAN>DiskManagement,andselectadiskgroupinoneofthehostsasshownbelow.Thenselectoneofthecapacitydisksfromthediskgroup,alsoshownbelow.Notethatthediskiconwiththeredxbecomesvisible.Thisisnotvisibleiftheclusterisinautomaticmode.

Figure 13.11: Remove a disk Makeanoteofthedevicesinthediskgroup,asyouwillneedtheselatertorebuildthediskgroup.ThereareanumberofnewiconsonthisviewofdiskgroupsinVirtualSAN6.0.Itisworthspendingsometimeunderstandingthattheymean.Thefollowingtableshouldhelptoexplainthat.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 6

Addadisktotheselecteddiskgroup

Remove(andoptionallyevacuatedata)fromadiskinadiskgroup

TurnonthelocatorLEDontheselecteddisk

TurnoffthelocatorLEDontheselecteddisk

Tagadeviceasaflashdevice(usefulwhenRAID0,non-passthruinuse)

Tagadeviceasalocaldevice(usefulwhenSAScontrollersinuse)

Table 13.1: Disk group icons Tocontinuewiththeoptionofremovingadiskfromadiskgroupandevacuatingthedata,clickontheicontoremoveadiskhighlightedearlier.Thispopsupthefollowingwindow,whichgivesyoutheoptiontoevacuatedata(selectedautomatically).Click“Yes”tocontinue:

Figure 13.12: Evacuate data

Whentheoperationcompletes,thereshouldbeonelessdiskinthediskgroup,butifyouexaminethecomponentsofyourVMs,thereshouldbenonefoundtobeinan“Absent”state.Allcomponentsshouldbe“Active”,andanythatwereoriginallyonthediskthatwasevacuatedshouldnowberebuiltelsewhereinthecluster.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 7

13.3 Evacuate a Disk Group Let’srepeattheprevioustaskfortherestofthediskgroup.Insteadofremovingtheoriginaldisk,let’snowremovethewholeofthediskgroup.Makeanoteofthedevicesinthediskgroup,asyouwillneedtheselatertorebuildthediskgroup.

Figure 13.13: Delete disk group Asbefore,youarepromptedastowhetherornotyouwishtoevacuatethedatafromthediskgroup.Theamountofdataisalsodisplayed,andtheoptionisselectedbydefault.Click“Yes”tocontinue.

Figure 13.14: Evacuate data

Once the evacuation process has completed, the disk group should no longer bevisibleintheDiskGroupsview.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 8

Figure 13.15: Disk group now removed and evacuated Onceagain,ifyouexaminethecomponentsofyourVMs,thereshouldbenonefoundto be in an “Absent” state. All components should be “Active”, and any thatwereoriginally on the disk thatwas evacuated shouldnowbe rebuilt elsewhere in thecluster.

13.4 Add Disk Groups Back Again At thispoint,wecanrecreate thedeleteddiskgroup.Thiswasalreadycovered insection6.1ofthisPOCguide.Simplyselectthehostthatthediskgroupwasremovedfrom,andclickontheicontocreateanewdiskgroup.Oncemore,selectaflashdeviceandthetwomagneticdiskdevicesthatyoupreviouslynotedweremembersofthediskgroup.ClickOKtorecreatethediskgroup.

Figure 13.16: Recreate disk group

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 2 9

13.5 Turning on and off Disk LEDs OurfinalmaintenancetaskistoturnonandoffthelocatorLEDsonthediskdrives.ThisisanewfeatureofVirtualSAN6.0.Inchapter12,wespokeabouttheimportanceof thehpssacli utility for removingandadding logicaldevices.Thiswas a “nice tohave”.HoweverforturningonandoffthedisklocatorLEDs,theutilityisanecessitywhenusingHPcontrollers.Refertosection12.10forinformationonhowtolocateandinstallthisutility.Note: This is not an issue for LSI controllers, and all necessary components areshippedwithESXiforthesecontrollers.TheiconsforturningonandoffthedisklocatorLEDsareshownintable13.1.ToturnonaLED,selectadiskinthediskgroupandthenclickontheiconhighlightedbelow.

Figure 13.17: Turn on disk locator LED

Thiswilllaunchataskto“TurnondisklocatorLEDs”.Toseeifthetaskwassuccessful,gototheMonitortabandchecktheEvents.Ifthereisnoerror,thetaskwassuccessful.AtthispointyoucanalsotakealookinthedatacenterandvisuallycheckiftheLEDofthediskinquestionislit.Oncecompleted,thelocatorLEDcanbeturnedoffbyclickingonthe“TurnoffdisklocatorLEDs”ashighlightedinthescreenshotbelow.Onceagain,thiscanbevisuallycheckedinthedatacenterifyouwish.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 0

Figure 13.18: Turn off disk locator LED

This completes this section of the Virtual SAN 6.0 Proof-Of-Concept (POC) guide.Before handing over the environment to the customer, do one final check on thehealthandensureallcheckspass.

Figure 13.19: Final health check

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 1

14. Virtual SAN 6.1 Stretched Cluster Configuration AsperofthevSphere6.0U1releaseinSeptember2015,anumberofnewVirtualSANfeatureswereincluded.ThefeaturesincludedaStretchedClustersolution,whichisthe purpose of this report. Note that the Virtual SAN version in vSphere 6.0U1 isVirtualSAN6.1.A goodworking knowledge of howVirtual SANStretchedCluster is designed andarchitectedisassumed.ReadersunfamiliarwiththebasicsofVirtualSANStretchedClusterareurgedtoreviewtherelevantdocumentationbeforeproceedingwiththispart of the proof-of-concept. Details on how to configure a Virtual SAN StretchedClusterarefoundintheVirtualSAN6.1StretchedClusterGuide.

14.1 Virtual SAN 6.1 Stretched Cluster Network Topology As per theVirtual SAN6.1 Stretched Cluster Guide, a number of different networktopologiesaresupportedforVirtualSANStretchedCluster.Thenetworktopologydeployedinthislabenvironmentisafulllayer3stretchedVirtualSANnetwork.L3multicast is implemented for theVirtual SANnetworkbetweendata sites, andL3unicast is implemented for the Virtual SAN network between data sites and thewitnesssite.WhileVMwarealsosupportsstretchedL2betweenthedatasites,L3istheonlysupportednetworktopologyfortheVirtualSANnetworkbetweenthedatasitesandthewitnesssite.TheVMnetworkisastretchedL2betweenbothdatasites.

14.2 Virtual SAN 6.1 Stretched Cluster Hosts TherearefourESXihostsinthiscluster,twoESXihostsondatasiteA(the“preferred”site)andtwohostsondatasiteB(the“secondary”site).Thereisonedisk-groupperhost(allflash).Thewitnesshost/applianceisdeployedona3rd,remotedatacenter.Theconfigurationisreferredtoas2+2+1.

Figure 14.1: Hosts in Virtual SAN cluster

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 2

VMsaredeployedonboththe“Preferred”and“Secondary”sitesoftheVirtualSANStretchedCluster.VMsarerunning/activeonbothsites.

14.3 Virtual SAN 6.1 Stretched Cluster Diagram BelowisadiagramdetailingthePOCenvironmentusedfortheStretchedClustertesting.

Figure 14.2: Virtual SAN Stretch Cluster network diagram

• ThisconfigurationusesL3(route)fortheVirtualSANnetworkbetweenallsites.• Staticroutesarerequiredtoenablecommunicationbetweensites.• TheVirtualSANnetworkVLANfortheESXihostsonthepreferredsiteisVLAN

id4.Thegatewayis172.4.0.1.• TheVirtualSANnetworkVLANfortheESXihostsonthesecondarysiteisVLAN

id3.Thegatewayis172.3.0.1.• TheVirtualSANnetworkVLANforthewitnesshostonthewitnesssiteisVLAN

id80.• TheVMnetworkisstretchedL2betweenthedatasites.ThisisVLANid30.Since

noVMsarerunonthewitness,thereisnoneedtoextendthisnetworktothethirdsite.

14.4 Preferred Site Details In Virtual SAN Stretched Clusters, “preferred” site simplymeans the site that thewitnesswill‘bind’tointheeventofaninter-sitelinkfailurebetweenthedatasites.Thus,thiswillbethesitewiththemajorityofVMcomponents,sothiswillalsobethesitewhereallVMswillrunwhenthereisaninter-sitelinkfailurebetweendatasites.Inthisexample,VirtualSANtrafficisenabledonvmk1onthehostsonthepreferredsite,whichissittingonroutableVLAN4.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 3

Figure 14.3: Virtual SAN preferred site networking details Static routes need to bemanually configured on these hosts. This is because thedefaultgatewayisonthemanagementnetwork,andifthepreferredsitehoststriedtocommunicatetothesecondarysitehosts,thetrafficwouldberoutedviathedefaultgatewayandthusviathemanagementnetwork.SincethemanagementnetworkandtheVirtualSANnetworkareentirelyisolated,therewouldbenoroute.Since this is L3 everywhere, including between the data sites, the Virtual SANinterfaceonthepreferredsite,vmk1,hastorouteto“Secondarysite(VLAN3)”and“WitnessSite(VLAN80)”.

Figure 14.4: Primary site routing table with static routes to remote sites

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 4

14.4.1 Commands to Add Static Routes Thefollowingcommandisusedtoaddstaticroutesisasfollows:esxcli network ip route ipv4 add -n REMOTE-NETWORK -g LOCAL-GATEWAY

ToaddastaticroutefromapreferredhosttohostsonthesecondarysiteinthisPOC:esxcli network ip route ipv4 add -n 172.3.0.0/24 -g 172.4.0.1ToaddastaticroutefromapreferredhosttothewitnesshostinthisPOC: esxcli network ip route ipv4 add –n 147.80.0.0/24 –g 172.4.0.1

Note:L3MulticastroutingmustbeenabledbetweenVLAN3and4.Thisisconfiguredonthephysicalswitchorrouter.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 5

14.5 Secondary Site Details

ThesecondarysiteisthesitethatcontainsESXihostswhoseobjectsdonotbindwiththewitnesscomponentsintheeventofaninter-sitelinkfailure.Howeverthatistheonly significant difference. Under normal conditions, the secondary site behavesexactlylikethepreferredsite,andvirtualmachinesmayalsobedeployedthere.InthisPOC,VirtualSANtrafficisenabledonvmk1,whichissittingonroutableVLAN3.

Figure 14.5: Virtual SAN secondary site networking details

Onceagain,staticroutesneedtobemanuallyconfiguredontheVirtualSANnetworkinterface,vmk1,torouteto“Preferredsite(VLAN4)”and“WitnessSite(VLAN80)”.

Figure 14.6: Secondary site routing table with static routes to remote sites

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 6

14.5.1 Commands to Add Static Routes Thefollowingcommandisusedtoaddstaticroutesisasfollows:esxcli network ip route ipv4 add -n REMOTE-NETWORK -g LOCAL-GATEWAY

ToaddastaticroutefromasecondaryhosttohostsonthepreferredsiteinthisPOC:esxcli network ip route ipv4 add -n 172.4.0.0/24 -g 172.3.0.1ToaddastaticroutefromasecondaryhosttothewitnesshostinthisPOC: esxcli network ip route ipv4 add –n 147.80.0.0/24 –g 172.3.0.1

Note:L3MulticastroutingmustbeenabledbetweenVLAN3and4.Thisisconfiguredonthephysicalswitchorrouter.

14.6 A note on IGMP v3 IGMPVersion2,specifiedin[RFC-2236],addedsupportfor"lowleavelatency".Thatis,areduction inthetime it takes foramulticastrouterto learnthat therearenolongeranymembersofaparticulargrouppresentonanattachednetwork.IGMPVersion3addssupportfor"sourcefiltering".Thatis,theabilityforasystemtoreportinterestinreceivingpackets*only*fromspecificsourceaddresses,orfrom*allbut*specificsourceaddresses,senttoaparticularmulticastaddress.

It should be noted that in our POC testing with the DELL network switch, theStretchedClusterwouldnotconfigureproperlyafterfailuresuntilthenetworkswitchwasforcedtotalkIGMPv3betweenVLANs.Recommendation:UseIGMPv3formulticastconfigurations.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 7

14.7 Witness Site Details

ThewitnesssiteonlycontainsasinglehostfortheStretchedCluster,andtheonlyVMobjectsstoredonthishostare“witness”objects.Nodatacomponentsarestoredonthewitnesshost.InthisPOC,weareusingthewitnessappliance,whichisan“ESXihost running in a VM”. If you wish to use the witness appliance, it should bedownloadedfromVMware.Thisisbecauseitispreconfiguredwithvarioussettings,andalsocomeswithapre-installedlicense.NotethatthisdownloadrequiresalogintoMyVMware.Alternatively,customerscanuseaphysicalESXihostfortheappliance.Virtual SAN traffic must be enabled on the Virtual SAN interface of the witnessappliance, in this casevmk1,which is sittingon routableVLAN80 (taggedon theunderlyingphysicalESXi).

Figure 14.7: Virtual SAN witness host networking details Onceagain,staticroutesshouldbemanuallyconfiguredonVirtualSANvmk1torouteto“Preferredsite(VLAN4)”and“SecondarySite(VLAN3)”.

Figure 14.8: Witness host routing table with static routes to remote sites

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 8

14.7.1 Commands to Add Static Routes Thefollowingcommandisusedtoaddstaticroutesisasfollows:esxcli network ip route ipv4 add -n REMOTE-NETWORK -g LOCAL-GATEWAY

ToaddastaticroutefromthewitnesshosttohostsonthepreferredsiteinthisPOC:esxcli network ip route ipv4 add -n 172.4.0.0/24 -g 172.80.0.1ToaddastaticroutefromthewitnesshosttohostsonthesecondarysiteinthisPOC: esxcli network ip route ipv4 add –n 147.3.0.0/24 –g 172.80.0.1

Note:L3MulticastisnotrequiredforWitnessVirtualSANTraffic.AlsoVLANtaggingisenabledonESXihosthostingwitnessappliance.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 3 9

14.8 vSphere HA Settings vSphereHAplaysacriticalpartinStretchedCluster.HAisrequiredtorestartvirtualmachinesonotherhostsandeventheothersitedependingonthedifferentfailuresthatmayoccurinthecluster.ThefollowingsectioncoverstherecommendedsettingsforvSphereHAwhenconfiguringitinaStretchedClusterenvironment.

14.8.1 Response to Host Isolation Therecommendationisto“PoweroffandrestartVMs”onisolation,asshownbelow.Incaseswhere thevirtualmachinecanno longeraccess themajorityof itsobjectcomponents,itmaynotbepossibletoshutdowntheguestOSrunninginthevirtualmachine.Therefore,the“PoweroffandrestartVMs”optionisrecommended.

Figure 14.9: vSphere HA Host Isolation recommended setting

14.8.2 Admission Control Ifafullsitefails,thedesireistohaveallvirtualmachinesrunontheremainingsite.Toallowasingledatasitetorunallvirtualmachinesiftheotherdatasitefails,therecommendationistosetAdmissionControlto50%forCPUandMemoryasshownbelow.

Figure 14.10: vSphere HA Admission Control setting recommendation

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 0

14.8.3 Advanced Settings Thedefaultisolationaddressusesthedefaultgatewayofthemanagementnetwork.This will not be useful in a Virtual SAN Stretched Cluster, when the Virtual SANnetworkisbroken.Thereforethedefaultisolationresponseaddressshouldbeturnedoff.Thisisdoneviatheadvancedsettingdas.usedefaultisolationaddresstofalse.Todealwith failuresoccurringon theVirtualSANnetwork,VMwarerecommendssettingtwoisolationaddresses,eachofwhichislocaltooneofthedatasites.InthisPOC,oneaddressisonVLAN4,whichisreachablefromthehostsonthepreferredsites. The other address is on VLAN 3, which is reachable from the hosts on thesecondarysite.Useadvancesettingsdas.isolationaddress0anddas.isolationaddress1tosettheseisolationaddressesrespectively.

Figure 14.11: vSphere HA advanced options isolation address recommendations These advanced settings are added in the Advanced Options > ConfigurationParameter sectionof the vSphereHAUI.Theother advanced settings get filled inautomaticallybasedonadditionalconfigurationsteps.Thereisnoneedtoaddthemmanually.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 1

14.9 VM Host Affinity Groups ThenextstepistoconfigureVM/Hostaffinitygroups.Thisallowsadministratorstoautomaticallyplaceavirtualmachineonaparticularsitewhenitispoweredon.Intheeventofafailure,thevirtualmachinewillremainonthesamesite,butplacedonadifferenthost.Thevirtualmachinewillberestartedontheremotesiteonlywhenthereisacatastrophicfailureorasignificantresourceshortage.ToconfigureVM/Hostaffinitygroups,thefirststepistoaddhoststothehostgroups.Inthisexample,theHostGroupsarenamedPreferredandSecondary,asshownbelow.

Figure 14.12: Host affinity groups Thenextstepistoaddthevirtualmachinestothehostgroups.Notethatthesevirtualmachinesmustbecreatedinadvance.

Figure 14.13: Host affinity groups with VMs

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 2

Note that these VM/Host affinity rules are “should” rules and not “must” rules.“Should”rulesmeansthateveryattemptwillbemadetoadheretotheaffinityrules.However,ifthisisnotpossible(duelackofresources),theothersitewillbeusedforhostingthevirtualmachine.AlsonotethatthevSphereHArulesettingsissetto“should”.ThismeansthatifthereisacatastrophicfailureonthesitetowhichtheVMhasaffinity,HAwillrestartthevirtualmachineontheothersite.Ifthiswasa“must”rule,HAwouldnotstarttheVMontheothersite.

Figure 14.14: Set vSphere HA VM to Host affinity rules to “should”, not “must”

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 3

ThesamesettingsarenecessaryonboththeprimaryVM/HostgroupandthesecondaryVM/Hostgroup.

Figure 14.15: Set vSphere HA VM to Host affinity rules to “should” on Secondary too

14.10 DRS Settings InthisPOC,partiallyautomatedmodehasbeenchosen.However,thiscouldbesettoFully Automated if customers wish, but note that it should be changed back topartiallyautomatedwhenafullsitefailureoccurs.This istoavoidfailbackofVMsoccurringwhilstrebuildactivityisstilltakingplace.Moreonthislater.

Figure 14.16: Virtual SAN stretch cluster DRS settings

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 4

15. Virtual SAN Stretched Cluster Network Failover Scenarios Inthissection,wewilllookathowtoinjectvariousnetworkfailuresinaVirtualSANStretchedClusterconfiguration.Wewill seehowthe failuremanifests itself in thecluster,focusingontheVirtualSANhealthcheckandthealarms/eventsasreportedinthevSpherewebclient.

15.1 Network Failure between Secondary Site and Witness

Figure 15.1: Path failure between secondary site and witness site

15.1.1 Trigger the Event Tomakethesecondarysiteloseaccesstothewitnesssite,onecansimplyremovethestaticrouteonthewitnesshostthatprovidesapathtothesecondarysite.Onwitnesshostissue: esxcli network ip route ipv4 remove -g 147.80.0.1 -n 172.3.0.0/24

Onsecondaryhost(s)issue:esxcli network ip route ipv4 remove -g 172.3.0.1 -n 147.80.0.0/24

15.1.2 Cluster Behavior on Failure

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 5

Tobeginwith,theClusterSummaryviewshowsoneconfigurationissuerelatedto0witnesshosts.

Figure 15.2: Cluster summary view – 0 witness hosts ThissameeventisvisibleintheCluster>Monitor>Issues>AllIssuesview.

Figure 15.3: Cluster Issue – missing witness Notethatthiseventmaytakesometimetotrigger.Next,lookingatthehealthcheckalarms,anumberofthemgettriggered(TriggeringalarmsfromhealthchecktestfailuresisanewfeatureinVirtualSAN6.1).

Figure 15.4: Virtual SAN Health Alarms triggered IntheClustersummaryview,anerrorisalsoshown.Thisdirectstheadministratortogoto“MonitorVirtualSANhealth”.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 6

Figure 15.5: Virtual SAN cluster summary view Onnavigating to theVirtual SANHealth>Monitor view, there are a lot of checksshowing errors.One should also note that there is a set of newStretchedClusterhealthchecksin6.1.Thesearealsofailing.

Figure 15.6: Virtual SAN Health Check detects the problems Onefinalplacetoexamineisthevirtualmachines.NavigatetoaVMonthesecondarysite,thenMonitor>Policies>PhysicalDiskPlacement.Itshouldshowthewitnessabsentfromsecondarysiteperspective.Howeverthevirtualmachinesshouldstillberunningandfullyaccessible.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 7

Figure 15.7: VM shows the witness component is absent Returningtothehealthcheckclient,selecting“Basic(unicast)connectivitycheck(normalping),youcanseethattheSecondarySitecan’ttalktowitnessorviceversa.

Figure 15.8: Virtual SAN Health Check ping test results

15.1.3 Conclusion Lossofthewitnessdoesnotimpacttherunningvirtualmachinesonthesecondarysite.Thereisstillaquorumofcomponentsavailableperobject,availablefromthedatasites.Sincethereisonlyasinglewitnesshost/site,andonlythreefaultdomains,thereisnorebuilding/resyncingofobjects.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 8

15.1.4 Repair the Failure Addbackthestaticroutesthatwereremovedearlier,andrerunthehealthchecktests.Verifythatalltestsarepassingbeforeproceeding.Remembertotestonethingatatime.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 4 9

15.2 Network Failure between Preferred Site and Witness

Figure 15.9: Path failure between preferred site and witness site

15.2.1 Trigger the Event Tomakethepreferredsiteloseaccesstothewitnesssite,onecansimplyremovethestaticrouteonthewitnesshostthatprovidesapathtothepreferredsite.Onwitnesshostissue: esxcli network ip route ipv4 remove –g 147.80.0.1 –n 172.4.0.0/24

Onpreferredhost(s)issue:esxcli network ip route ipv4 remove –g 172.4.0.1 –n 147.80.0.0/24

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 0

15.2.2 Cluster Behavior on Failure Aspertheprevioustest,itmaytakesometimeforalarmstotriggerwhenthiseventoccurs.However,theeventsaresimilartothoseseenpreviously.

Figure 15.10: Cluster summary view – 0 witness hosts

Figure 15.11: Cluster issue – missing witness Onecanalsoseevarioushealthchecksfail,andtheirassociatedalarmsbeingraised.

Figure 15.12: Virtual SAN Health Check detects the problems Justliketheprevioustest,thewitnesscomponentgoesabsent.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 1

Figure 15.13: VM’s storage policy is out of compliance Wedidnot lookat the “Datahealth”health checkduring theprevious test. If thishealthcheck“VirtualSANobjecthealth”isselected,itdisplaysXnumberofobjectswith “reduced-availability-with-no-rebuild-delay-timer”. In this POC, there are 52objectsimpactedbythefailure.

Figure 15.14: Virtual SAN Health Check for object health

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 2

Thishealth checkbehaviorappearswhenever componentsgo ‘absent’ andVirtualSANiswaitingforthe60-minuteclomdtimertoexpirebeforestartinganyrebuilds.Ifanadministratorclickson“RepairObjectsImmediately”,theobjectsswitchstateand now the objects are no longerwaiting on the timer, andwill start to rebuildimmediatelyundergeneralcircumstances.HoweverinthisPOC,withonlythreefaultdomainsandnoplacetorebuildwitnesscomponents,thereisnosyncing/rebuilding.

15.2.3 Conclusion Just like the previous test, awitness failure has no impact on the running virtualmachinesonthepreferredsite.Thereisstillaquorumofcomponentsavailableperobject,as thedatasitescanstill communicate.Since there isonlyasinglewitnesshost/site,andonlythreefaultdomains,thereisnorebuilding/resyncingofobjects.

15.2.4 Repair the Failure Addbackthestaticroutesthatwereremovedearlier,andrerunthehealthchecktests.Verifythatalltestsarepassingbeforeproceeding.Remembertotestonethingatatime.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 3

15.3 Network Failure between Witness and Both Data Sites

Figure 15.15: Complete witness site outage

15.3.1 Trigger the Event Tointroduceanetworkfailurebetweenthepreferredandsecondarydatasitesandthewitness site, one can simply remove the static route on thewitness host thatprovides a path to both the preferred and secondary sites, and remove the staticroutestothewitnessonthepreferredandsecondaryhosts.OnWitnesshostissue:esxcli network ip route ipv4 remove -g 147.80.0.1 -n 172.3.0.0/24 esxcli network ip route ipv4 remove -g 147.80.0.1 -n 172.4.0.0/24

OnPreferredhost(s)issue:esxcli network ip route ipv4 remove -g 172.4.0.1 -n 147.80.0.0/24

OnSecondaryhost(s)issue:esxcli network ip route ipv4 remove -g 172.3.0.1 -n 147.80.0.0/24

15.3.2 Cluster Behavior on Failure The events observed are for the most part identical to those observed in failurescenario#1and#2.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 4

15.3.3 Conclusion WhentheVirtualSANnetworkfailsbetweenthewitnesssiteandboththedatasites(as in thewitnesssite fully losing itsWANaccess), itdoesnot impact therunningvirtualmachines.Thereisstillaquorumofcomponentsavailableperobject,availablefrom thedata sites.However, as explainedpreviously, since there is only a singlewitnesshost/site,andonlythreefaultdomains,thereisnorebuilding/resyncingofobjects.

15.3.4 Repair the Failure Addbackthestaticroutesthatwereremovedearlier,andrerunthehealthchecktests.Verifythatalltestsarepassing.Remembertotestonethingatatime.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 5

16. Virtual SAN 6.2 All Flash Features InVirtualSAN6.2twoadditionalmajorfeaturesareavailableforallflashclusters:

1. Deduplicationandcompression2. RAID-5/RAID-6ErasureCoding

DeduplicationandcompressionisavailableforspaceefficiencywhileRAID-5/RAID-6ErasureCodingprovidesadditionaldataprotectionoptionsthatrequirelessspacethanthetraditionalRAID-1options.

16.1 Deduplication and Compression DeduplicationandCompressionareenabledtogetherinVSANandapplieddirectlyatthecluster-level.ThescopeofdeduplicationandcompressionappliestoanindividualdiskgroupinordertoensurethegreatestavailabilityoftheVSANdatastore.Whendataisde-stagedfromthecachetier,VSANcheckstoseeifamatchforthatblockexists.Iftheblockexists,VSANdoesnotwriteanadditionalcopyoftheblocknordoesitgothroughthecompressionprocess.However,iftheblockdoesnotexist,VSANwillattempttocompresstheblock.Thecompressionalgorithmwilltrytocompressthesizeoftheblockto2KBorless.Ifthealgorithmisabletobeapplied,thecompressedblockisthenwrittentothecapacitytier.Ifthecompressionalgorithmcannotcompresstheblockto2KBorlessthanthefull4KBblockiswrittentothecapacitytier.TodemonstratetheeffectsofDeduplicationandCompression,thisexercisedisplaysthecapacitybeforeandafterdeployingfouridenticalvirtualmachines.Beforestartingthisexercise,ensurethatDeduplicationandCompressionisenabled.WhenenablingtheDedupeandCompressionfeatures,VSANwillgothroughtherollingupdateprocesswheredataisevacuatedfromeachdiskgroupandthediskgroupisreconfiguredwiththefeaturesenabled.Dependingonthenumberofdiskgroupsoneachhostandtheamountofdata,thiscanbealengthyprocess.ToenableDeduplicationandCompressioncompletethefollowingsteps:

1. SelecttheVSANclusterfromwithinthevCenterUI.2. SelecttheManagetab.3. SelectGeneralfromtheVirtualSANMenu.4. SelectthetopEditbutton.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 6

5. UndertheDeduplicationandCompressionSection,selectEnabledfromthe

dropdownmenuandSelectOK.

OncetheprocessofenablingDeduplicationandCompressioniscomplete,thefourvirtualmachinescanthenbecreated.However,beforecreatingthevirtualmachines,besuretoexaminethecapacityconsumedbylookingatthecapacitysummary.Thesestepsshowhowtoviewthecapacitysummary:

1. SelecttheVSANclusterfromwithinthevCenterUI.2. SelecttheMonitortab.3. SelectVirtualSAN4. ChooseCapacityfromthemenuoptionsontheleft.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 7

Focusonthecapacitysummarysection.Thecapacitysummaryshowsausedcapacityof344.55GBpriortodeduplicationandcompression.Fromthebaselineweshownosavingsfromdeduplicationandcompression.ForthenextpartoftheexercisewewillcreatefourclonesfromaWindows2012R2VM’seachwithasingle100GBThinProvisionedVMDK.ThefollowinggraphicshowstheexactconfigurationforeachVM.

After4virtualmachinesarecreated,checkthetotalcapacityconsumptionforVSAN.Aftercreating4additionalVMseachwith100GBdisktheDeduplicationandCompressionOverviewshowsausedcapacitybeforeof505.32GBandafterDeduplicationandCompressionausedcapacityof313.14for

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 8

16.2 RAID-5/RAID-6 Erasure Coding InpreviousversionsofVirtualSAN,objectscouldonlybedeployedusingaRAID-1(mirroring)configurationwhileVersion6.2allowsforRAID-5/6ErasureCoding.Thekeybenefitofusingerasurecodingisspaceefficiency.Insteadof2xor3xoverhead(FTT=1or2)inthetraditionalRAID-1configurationtowithstandmultiplefailures,RAID-5requiresonly33%additionalstorage,andRAID-6requiresonly50%additionaloverhead.InordertosupportRAID-5andRAID-6thefollowinghostsrequirementsmustbemet:

1. RAID-5worksina3+1configurationmeaning3datafragmentsand1parityfragmentperstripe.TouseaRAID-5protectionleveltheVSANclustermustcontainaminimumof4hosts.

2. RAID-6worksina4+2configurationmeaning4datafragmentsand2parity

fragmentsperstripe.ToenableaRAID-6protectionleveltheVSANclustermustcontainaminimumof6hosts.

3. RAID-5/6ErasureCodinglevelsaremadeavailableviastoragepolicy.Inthe

followingexercisewewillenableRAID-5bycreatinganewstoragepolicyandapplyingthatstoragepolicytoavirtualmachine.

TobegintheprocessforsettingupRAID-5ORRAID-6,opentheVMStoragePolicieswindowinvCenter.CreateanewstoragepolicyforRAID-5bycompletingthefollowingsteps:

1. Clickthe icontocreateanewVMstoragepolicy.2. ProvideanameforthenewPolicyandClickNext.Forthisexample,the

nameofRAID-5Policyisused.3. FromtheAddRuledropdownboxselectFailureTolerancemethod.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 5 9

4. FromtheToleranceMethoddropdownmenu,selectRAID5/6ErasureCoding.

5. ClickNextandSelecttheVSANdatastoreformthestoragecompatibilitylist.6. Whentheruleiscompletethesummaryshouldbesimilartothefollowing

graphic.

OncestoragepolicyforRAID-5/6hasbeencreated,thenextstepistocreateavirtualmachineusingthatpolicy.Forthisexample,createavirtualmachinecontainingasingle100GBdrive.DuringtheVMcreationprocessselecttheRAID-5policy.UponcompletiontheVMsummaryshouldbesimilartothefollowing.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 0

Nowthatthevirtualmachinehasbeencreated,youcanviewthephysicaldiskplacementofthecomponents.Forthisexample,theVMDKobjectwillcontain5separatecomponentsspreadacrossdifferenthostsinthecluster.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 1

17. Further Information

17.1 VMware Virtual SAN Community

17.2 Links to Existing Documentation • VMwareVirtualSANResources• AdministeringVMwareVirtualSAN• VMwareCompatibilityGuide• VMwareVirtualSANDiagnosticsandTroubleshootingReferenceManual• VMwareVirtualSAN6.0DesignandSizingGuide• VirtualSANHostedEvaluation• VMwareVirtualSANHealthCheckPluginGuide

17.3 VMware Support • MyVMware• HowtofileaSupportRequestinMyVMware• LocationoflogfilesforVMwareProducts• LocationofESXi5.1and5.5logfiles• CollectingVirtualSANsupportlogsanduploadingtoVMware

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 2

Appendix A—Fault Domains Inthisfour-nodeenvironment,wenowlookatthebenefitsoffailuredomains,anewfeatureintroducedinvSphere6.0.Inthisscenario,wewillassumethatthe4nodesareintworacks,somethingasfollows.

Figure A.1 Fault Domains Theobjectivenowistomatchtherackwithfaultdomains.Thisimpliesthatifthereisarackfailure,thevirtualmachinecomponentswillhavebeendistributedinafashionsuchthattheyremainavailableevenwhenacompleterackfails.

A1. Setting up Fault Domains Asshownabove,wewillcreatethreefaultdomain,twoofwhichonlycontainasinglehost,butonewhichcontainstwohosts.NavigatetotheManagetab,andunderVirtualSANselectFaultDomainsasshownbelow.Initially,therearenohostsinanyfaultdomains.

Figure A.2 No hosts in Fault Domains Clickonthegreen“+”symboltocreateafaultdomain.Initially,wewilladdhostcs-ie-h01.ie.localtothefirstfaultdomain.Let’scallthedomainFD1.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 3

Figure A.3 Add single host to Fault Domain FD1

Repeatthisoperationforthesecondfaultdomain,butthistimeaddhostcs-ie-h02.ie.localtothisdomainFD2.Forthethirdfaultdomain,addtheremainingtwohosts,cs-ie-h03andcs-ie-h04asshownhere.

Figure A.4 Add two hosts to third Fault Domain FD3

Atthispoint,threefaultdomainshavebeencreated.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 4

Figure A.5 Fault Domain Overview

A2. Create a Policy to Leverage Fault Domains Thenextstep is tocreateaVMstoragepolicy thathighlights thebehaviorof faultdomains. In the event of a failure of any single rack, there should still be enoughcomponents available belonging to the VM to continue running. In essence, thereshouldstillbeafullcopyofthedataevenwhenarackfails.Let’screateapolicysothat we can observe how a VM’s components.We have chosen a policy that hasNumberOfFailuresToTolerate=1andNumberOfDiskStripesPerObject=3.Wehavealreadycreatedpoliciesbackinchapter9.Herearethestepsonceagain.

Figure A.6 Navigate to VM Storage Policies Clickonthe“CreateNewPolicy”icon.Giveitanameandanoptionaldescription.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 5

Figure A.7 Give a name and description to the policy Clickthroughtherule-setdescription.

Figure A.8 Rule-set description

In theRule-Set1window,selectVirtualSANas the “Rulesbasedondataservice”.Thenaddtherule“Numberofdiskstripesperobject”andsetthevalueto3.Thereisnoneedtoadd“Numberoffailurestotolerate”asthisisautomaticallysetto1foreverypolicyunlessyouexplicitlysetittoavalueof0.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 6

Figure A.9 Number of disk stripes per object

The Virtual SAN datastore should appear as compatible, in other words itunderstandsthepolicysettings.

Figure A.10: vsanDatastore shows as compatible

ThefinalstepistoclickonFinishandcreatethepolicy.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 7

Figure A.11 Finish creating the policy We cannowgo ahead anddeploy aVMwith this policy, and afterwardswe shallexaminethelayoutandseeifitistakingFaultDomainsintoaccount.

A3. Create a VM and Check the Fault Domains Atthispoint,anewVMcanbedeployed.TheonlyinputsrequiredforthisVMaretoprovideitwithanameandtochoosethenewlycreatedpolicywithaStripeWidth=3.

Figure A.12 Create a new VM

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 8

Figure A.13 Select the new VM Storage Policy

TherestoftheVMcreationoptionscanbeleftatthedefault.Oncethevirtualmachinehas been deployed, check the Manage tab > Policies and verify that the VM iscompliantwiththepolicy.Itshouldbecompliantasshownbelow.

Figure A.14 VM Storage Policy is Compliant FinallycheckthedistributionofVMcomponentsundertheMonitortab>Policies.

Figure A.15 Component distribution

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 6 9

Thequestionsthatneedtobeaskednowarerelatedtorackfailuresandfaultdomainfailures.Forexample,ifrack1weretofail,istherestillafullcopyofthedata?Theanswerisyes.Whataboutrack2?Yes,thereisstillafullcopyofthedata.Whataboutrack3,whichhouseshosts3and4?Theanswerisyes,onceagaintherewouldbeafullycopyofthedataevenifrack3failed.Oneadditionalitemtohighlighthereisthelackofwitnesses.Thisissomethingnewin Virtual SAN 6.0. Certain configurations do not needwitnesses as a new votingmechanismhasbeenintroducedwhichgivescomponentsextravotes.Thereforeinsomeconfigurations,suchasthisone,witnessesarenotneeded,reducingtheoverallcomponentcount.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 0

Appendix B—Migrating from Standard vSwitch to Distributed Beforewebegin,thisprocedureisrathercomplicated,andcaneasilygowrong.TheonlyrealreasonwhyonewouldwanttomigratefromVSS(standardvSwitches)toaDVS(DistributedvSwitch)istomakeuseoftheNetworkI/OControlfeaturethatisonlyavailablewithDVS.ThiswillthenallowyoutoplaceQoS(QualityofService)onthevarioustraffictypessuchasVirtualSANtraffic.Warning:EnsurethatyouhaveconsoleaccesstotheESXihostsduringthisexercise.Allgoingwell,youwillnotneedit.However,shouldsomethinggowrong,youmaywellneedtoaccesstheconsoleoftheESXihosts.

B.1 Create Distributed Switch Tobeginwith,createthedistributedswitch.Thisisarelativelystraightforwardexercise.

Figure B.1 Create a new distributed switch

Provide it with a name.

Figure B.2: Provide a name for the new distributed switch

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 1

SelecttheversionoftheDVS.Inthisexample,weshallusethelatestversion,6.0.0.

Figure B.3: Select the distributed switch version

Atthispoint,wegettoaddthesettings.First,youwillneedtodeterminehowmanyuplinksyouarecurrentlyusingfornetworking.InourPOC,weareusingsix;oneformanagement,one forvMotion,one forvirtualmachinesandthree forVirtualSAN.Therefore,whenwearepromptedforthenumberofuplinks,weselect“6”.Thismaydifferinyourenvironmentbutyoucanalwaysedititlateron.

Figure B.4: Select the number of uplinks

Another point to note here is that a default portgroup can be created. You cancertainlycreateaportgroupatthispoint,buttherewillbeadditionalportgroupsthatneedtobecreatedshortly.Atthispoint,thedistributedswitchcanbecompleted.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 2

Figure B.5: Complete the creation of the DVS

Asalludedtoearlier,configureandcreatetheadditionalportgroups.

B.2 Create Port Groups Inthepreviousexercise,asingledefaultportgroupwascreatedforthemanagementnetwork.Therewaslittleinthewayofconfigurationthatcouldbedoneatthattime.ItisnowimportanttoeditthisportgrouptomakesureithasallthecharacteristicsofthemanagementportgroupontheVSS,suchasVLANandNICteamingandfailoversettings.Selectthedistributedportgroup,andclickontheEditbuttonshownbelow.

Figure B.6: Edit the distributed port group

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 3

ForsomeportgroupsitmaybenecessarytochangetheVLAN.SincethemanagementVLANinthisPOCison51,weneedtotagthedistributedportgroupaccordingly.

Figure B.7: Tag the distributed port group with a VLAN Thatisthemanagementdistributedportgrouptakencareof.YouwillalsoneedtocreatedistributedportgroupsforvMotion,virtualmachinenetworkingandofcourseVirtualSANnetworking.Inthe“GettingStarted”tabofthedistributedswitch,thereisabasictasklinkcalled“Createanewportgroup”.

Figure B.8: Create a new distributed port group

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 4

Inthisexercise,weshallcreateaportgroupforthevMotionnetwork.

Figure B.9: Provide a name for the new distributed port group

Figure B.10: Configure distributed port group settings, such as VLAN

Figure B.11: Finish creating the new distributed port group

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 5

Onceallthedistributedportgroupsarecreatedonthedistributedswitch,theuplinks,VMkernel networking and virtual machine networking can be migrated to thedistributedswitchandassociateddistributedportgroups.Warning:Whilethemigrationwizardallowsmanyuplinksandmanynetworkstobemigratedconcurrently,werecommendmigratingtheuplinksandnetworksstep-by-steptoproceedsmoothlyandwithcaution.Forthatreason,thisistheapproachweusehere.

B.3 Migrate Management Network Tobegin,let’smigratejustthemanagementnetwork(vmk0)anditsassociateduplink,whichinthiscaseisvmnic0fromVSStoDVS.Tobegin,select“Addandmanagehosts”fromthebasictasksintheGettingstartedtaboftheDVS.

Figure B.12: Add and manage hosts

ThefirststepistoaddhoststotheDVS.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 6

Figure B.13: Add hosts

Clickonthegreen+andaddallfourhostsfromthecluster.

Figure B.14: Select all hosts in the cluster ThenextstepistomanageboththephysicaladaptersandVMkerneladapters.Torepeat,whatwewishtodohereismigratebothvmnic0andvmk0totheDVS.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 7

Figure B.15: Select physical adapters and VMkernel adapters

Next, selectanappropriateuplinkon theDVS forphysicaladaptervmnic0. In thisexamplewechoseUplink1.

Figure B.16: Assign uplink (uplink1) to physical adapter vmnic0

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 8

Withthephysicaladapterselectedandanuplinkchosen,thenextstepistomigratethemanagementnetworkonvmk0fromtheVSStotheVDS.Wearegoingtoleavevmk1andvmk2forthemomentandjustmigratevmk0.Select vmk0, and then click on the “Assignport group” as shownbelow.Theportgroupassignedshouldbethenewlycreateddistributedportgroupcreatedforthemanagementnetworkearlier.Remembertodothisforeachhost.

Figure B.17: Assign port group for vmk0

ClickthroughtheanalyzeimpactscreensinceitonlychecksiSCSIandisnotrelevanttotheVirtualSANPOC.

Figure B.18: Impact on iSCSI (not relevant)

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 7 9

Atthefinishscreen,youcanexaminethechanges.Weareadding4hosts,4uplinks(vmnic0fromeachhost)and4VMkerneladapters(vmk0fromeachhost).

Figure B.19: Ready to complete

When the networking configuration of each host is now examined, you shouldobservethenewDVS,withoneuplink(vmnic0)andthevmk0managementportoneachhost.

Figure B.20: Management network migration to DVS complete Youwillnowneedtorepeatthisfortheothernetworks.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 8 0

B.4 Migrate vMotion Migrating the vMotion network takes the exact same steps as the managementnetwork.Beforeyoubegin,ensurethat thedistributedportgroupforthevMotionnetworkhasallthesameattributesastheportgrouponthestandard(VSS)switch.ThenitisjustamatterofmigratingtheuplinkusedforvMotion(inthiscasevmnic1)alongwiththeVMkerneladapter(vmk1).Asmentionedalready,thistakesthesamestepsasthemanagementnetwork.When themigration completes, the individual host network configuration shouldlooksimilartothefollowingdiagram.

Figure B.21: vMotion network migration to DVS complete

B.5 Migrate Virtual SAN Network IfyouareusingasingleuplinkfortheVirtualSANnetwork,thentheprocessbecomesthesameasbefore.However,ifyouareusingmorethanoneuplink,thenthereareadditionalstepstobetaken.IftheVirtualSANnetworkisusingafeaturesuchasLinkAggregation(LACP),oritisonadifferentVLANtotheotherVMkernelnetworks,thenyouwillneedtoplacesomeoftheuplinksintoanunusedstateforcertainVMkerneladapters.For example, in this scenario, VMkernel adapter vmk2 is used for Virtual SAN.Howeveruplinksvmnic3,4and5areusedforVirtualSANandtheyareinaLACPconfiguration.Thereforeforvmk2,allothervmnics(0,1and2)mustbeplacedinanunusedstate. Similarly, for themanagementadapter (vmk0)andvMotionadapter(vmk0),theVirtualSANuplinks/vmnicsshouldbeplacedinanunusedstate.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 8 1

Modifying the settings of the distributed port group and changing the pathpolicy/failoverappropriatelydothis.Inthemanagephysicalnetworkadapter,thestepsaresimilarasbeforeexceptthatnowyouaredoingthisformultipleadapters.

Figure B.22: Multiple uplinks used by the Virtual SAN network

As before, vmk2 (the Virtual SAN VMkernel adapter) should be assigned to thedistributedportgroupforVirtualSAN.

Figure B.23: Assign distributed port group for Virtual SAN networking

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 8 2

Note:IfyouareonlynowmigratingtheuplinksfortheVirtualSANnetwork,youmaynotbeabletochangethedistributedportgroupsettingsuntilafter themigration.Duringthistime,VirtualSANmayhavecommunicationissues.Afterthemigration,movetothedistributedportgroupsettingsandmakeanypolicychangesandmarkanyuplinks that shouldbeunused.Virtual SANnetworking should then return tonormal when this task is completed. Use the Health Check plugin to verify thateverythingisfunctionaloncethemigrationiscompleted.

Figure B.24: Change distributed port group settings

Figure B.25: Showing load balancing and unused uplinksThatcompletedtheVMkerneladaptermigrations.ThefinalstepistomovetheVMnetworking.

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 8 3

B.6 Migrate VM Network This is the finalstepofmigrating thenetwork fromastandardvSwitch(VSS) toadistributedswitch(DVS).Onceagain,weusethe“Addandmanagehosts”,thesamelinkusedformigratingtheVMkerneladapters.Thetaskistomanagehostnetworking.

Figure B.26: Manage host networkingSelect all the hosts in the cluster, as all hosts will have their virtual machinenetworkingmigratedtothedistributedswitch.

Figure B.27: Select all hosts

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 8 4

On this occasion, we do not need to move any uplinks. However, if the VM networking on your hosts used a different uplink, then this of course would also need to be migrated from the VSS. In this example, the uplink has already been migrated.

Figure B.28: Migrate virtual machine networking

SelecttheVMsthatyouwishtohavemigratedfromavirtualmachinenetworkontheVSStothenewvirtualmachinedistributedportgroupontheDVS.Clickonthe“Assignportgroup”optionlikewehavedonemanytimesbefore,andselectthedistributedportgroup,nameVM-DPGhere.

Figure B.29: Assign port groups for the VMsReviewing the final screen. In thiscaseweareonlymoving toVMs.Note thatanytemplatesusingtheoriginalVSSvirtualmachinenetworkwillneedtobeconverted

VMwa r e S t o r a g e a n d A v a i l a b i l i t y B u s i n e s s U n i t D o c umen t a t i o n / 1 8 5

tovirtualmachines,editedandthenewdistributedportgroupforvirtualmachineswillneed tobeselectedas thenetwork.Thisstepcannotbeachieved throughthemigrationwizard.

Figure B.30: FinishTheVSSshouldnolongerhaveanyuplinksofportgroupsandcanbesafelyremoved.

Figure B.31: VSS no longer in use

ThiscompletesthemigrationfromastandardvSwitch(VSS)toadistributedvSwitch(DVS).