CentOS High Performance
Table of Contents
CentOS High Performance
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Cluster Basics and Installation on CentOS 7
Clustering fundamentals
Why Linux and CentOS 7?
Downloading CentOS
Setting up CentOS 7 nodes
Installing CentOS 7
Setting up the network infrastructure
Installing the packages required for clustering
Key software components
Setting up key-based authentication for SSH access
Summary
2. Installing Cluster Services and Configuring Network Components
Configuring and starting clustering services
Starting and enabling clustering services
Troubleshooting
Security fundamentals
Letting in and letting out
Getting acquainted with PCS
Managing authentication and creating the cluster
Setting up a virtual IP for the cluster
Adding a virtual IP as a cluster resource
Viewing the status of the virtual IP
Summary
3. A Closer Look at High Availability
Failover – an introduction to high availability and performance
Fencing – isolating the malfunctioning nodes
Installing and configuring a STONITH device
Split-brain – preparing to avoid inconsistencies
Quorum – scoring inside your cluster
Configuring our cluster with PCS GUI
Summary
4. Real-world Implementations of Clustering
Setting up storage
ELRepo repository and DRBD availability
Configuring DRBD
Adding DRBD as a PCS cluster resource
Installing the web and database servers
Configuring the web server as a cluster resource
Mounting the DRBD resource and using it with Apache
Testing the DRBD resource along with Apache
Setting up a high-availability database with replicated storage
Troubleshooting
Summary
5. Monitoring the Cluster Health
Cluster services and performance
Monitoring the node status
Monitoring the resources
When a resource refuses to start
Checking the availability of core components
Summary
6. Measuring and Increasing Performance
Setting up a sample database
Downloading and installing the Employees database
Introducing initial cluster tests
Test 1 – retrieving all fields from all records
Test 2 – performing JOIN operations
Performing a failover
Measuring and improving performance
Apache’s configuration and settings
Loading and disabling modules
Placing limits on the number of Apache processes and children
Database resource
Creating indexes
Using query cache
Moving to an A/A cluster
Summary
Index
CentOS High Performance
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: January 2016
Production reference: 1250116
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78528-868-5
www.packtpub.com
Credits
Author
Gabriel Cánepa
Reviewers
Muhammad Kamran Azeem
Denis Fateyev
Lekshminarayanan K
Oliver Pelz
Commissioning Editor
Veena Pagare
Acquisition Editor
Subho Gupta
Content Development Editor
Zeeyan Pinheiro
Technical Editor
Vivek Pala
Copy Editor
Pranjali Chury
Project Coordinator
Suzanne Coutinho
Proofreader
Safis Editing
Indexer
Mariammal Chettiyar
Graphics
Disha Haria
Production Coordinator
Nilesh Mohite
Cover Work
Nilesh Mohite
About the Author
Gabriel Cánepa is a Linux Foundation certified system administrator (LFCS-1500-0576-0100) and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work. When he’s not typing commands or writing code or articles, he enjoys telling bedtime stories with his wife to his two little daughters and playing with them, which is a great pleasure in his life.
I would like to thank God for the many blessings and the growth opportunities in personal, family, and professional life that He has given throughout my life.
I would like to thank my mother, who always encouraged me to get as much education as possible and to excel in everything I do. I’d also like to thank my wife, Monica, and our two daughters, Camila and Francesca, for their support, understanding, and patience during the long hours of troubleshooting and writing this book.
Next, I’d like to thank Andrea de Ampalio and Diego Cordoba from Carrera Linux Argentina (www.carreralinux.com.ar), who helped me learn and love Linux in the best Linux training academy (their people and Linux skills are without match), and Subho Gupta, Manasi Pandire, Zeeyan Pinheiro, and Vivek Pala from Packt Publishing for their remarkable talent and support while we worked together on this book.
Last but not least, I’d like to thank Andrew Beekhoff and the team at ClusterLabs (http://clusterlabs.org/) for putting together the best and most complete cluster resource information guide out there, which served as the main source of my research.
About the Reviewers
Muhammad Kamran Azeem is a seasoned IT professional with twenty years of experience in IT. He started working as a PC technician in 1995 and gradually got into database administration, system administration, high performance computing, and, lately, information security. He also taught undergraduate and graduate level courses on C/C++, data structures and algorithm design, Oracle developer, and a lot more, in different universities in Pakistan.
Kamran holds a master’s degree in IT, and is certified under the CISSP, CEH, RHCE, OCP, and CCNA programs. He is the author of Pakistan’s first book on Linux system administration, titled Linux Pocket Reference for System Administrators, and of many training videos on using Linux as the main desktop operating system, as well as on Linux system administration, all available through his website http://wbitt.com.
He is an advocate of Free and Open Source Software (FOSS), and for the last ten years, he has been the driving force behind the wave of adoption of Linux in Pakistan.
He is currently working as a senior DevOps consultant for Praqma AS in Oslo, Norway, helping companies adopt modern software and IT infrastructure practices.
First, I would like to thank my wife, Rohina, for being the greatest support in what I do. I would also like to thank Mike Long, my employer, for encouraging me to undertake this book review project.
Denis Fateyev holds a master’s degree in Computer Science and has been working with Linux for more than 10 years (mostly with Red Hat and CentOS). He currently works as a Perl programmer and DevOps for a small German company. He has reviewed several books mostly related to CentOS, DevOps, and high availability technologies, including GitLab Cookbook, CentOS High Availability, and CentOS High Performance by Packt Publishing. Being a keen participant in the open source community, he is a package maintainer at the Fedora and Repoforge projects. He has a passion for foreign languages, namely German and Spanish, and for linguistics.
Lekshminarayanan K has been administering Linux/Unix servers since 2009. He had his first experience with open source on Ubuntu 8.04; since then, he has tried many flavors of Linux such as CentOS, Red Hat, Fedora, and Debian. Lekshminarayanan is also experienced in administering applications such as Apache, Qmail, SVN, and Git. He is currently teaching himself Shell and Python scripting and working as a Linux administrator at COMODO Inc.
During his free time, he enjoys photography and is very fond of books.
Oliver Pelz has more than 10 years of experience as a software developer and system administrator. He graduated with a diploma in Bioinformatics and is currently working at the German Cancer Research Center in Heidelberg, where he has authored and coauthored several scientific publications in the field of Bioinformatics. Next to developing web applications and biological databases for his department and scientists all over the world, he is administrating a division-wide Linux-based data center and has set up two high-performance CentOS clusters for the analysis of high-throughput microscope and genome sequencing data. He loves writing code and riding his mountain bike in the Black Forest of Germany, and has been an absolute Linux and open source enthusiast for many years. He has contributed to several open source projects in the past and is also the author of the book CentOS 7 Linux Server Cookbook, Packt Publishing. He maintains an IT tech blog at www.oliverpelz.de.
I would like to thank my family and especially my wonderful wife Beatrice and little son Jonah for their patience and understanding for all these long working hours, and the folks at Packt Publishing for the opportunity to review this manuscript; it was a great pleasure for me.
Preface
CentOS is an enterprise-level Linux OS that is 100% binary compatible with Red Hat Enterprise Linux (RHEL). It acts as a free alternative to Red Hat’s commercial Linux offering, with only a change in the branding. A high performance cluster consists of a group of computers that work together in parallel as a single set, thereby minimizing or eliminating the downtime of critical services and enhancing the performance of the application.
What this book covers
Chapter 1, Cluster Basics and Installation on CentOS 7, reviews the basic principles of clustering and outlines the necessary steps to install a cluster with two CentOS 7 servers.
Chapter 2, Installing Cluster Services and Configuring Network Components, covers setting up and configuring the basic required network infrastructure and clustering services.
Chapter 3, A Closer Look at High Availability, lists the components of a cluster in detail and demonstrates how to approach the split-brain problem by configuring the failover and fencing of the cluster as a whole and the quorum of each node individually.
Chapter 4, Real-world Implementations of Clustering, covers how to implement a web server and a database server in your cluster.
Chapter 5, Monitoring the Cluster Health, talks about how to monitor the performance and availability of your cluster.
Chapter 6, Measuring and Increasing Performance, reviews performance tuning techniques for your recently installed high availability cluster.
What you need for this book
To follow along with this book, you will need to download a CentOS 7 minimal install image from the project’s website. You will be asked to install other packages (pacemaker, corosync, and pcs, to name a few examples) in each chapter as required.
Who this book is for
This book is directed toward two groups of system administrators: those who want a detailed, step-by-step guide to setting up a high-performance and high-availability CentOS 7 cluster, and those who are looking for a reference book to help them learn the necessary skills to ensure that their systems, and the corresponding resources and services, are being utilized at their best capacity. No previous knowledge of performance tuning is needed to start reading this book, but the reader is expected to have at least some degree of familiarity with any spin-off of the Fedora family of Linux distributions, preferably CentOS.
Conventions
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "To download the Employees table, go to https://launchpad.net/testdb/."
A block of code is set as follows:
HWADDR="08:00:27:C8:C2:BE"
TYPE="Ethernet"
BOOTPROTO="static"
NAME="enp0s3"
ONBOOT="yes"
IPADDR="192.168.0.2"
NETMASK="255.255.255.0"
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: “Highlight Install CentOS 7 using the up and down arrows”.
Note
Warnings or important notes appear in a box like this.
Tip
Tips and tricks appear like this.
Chapter 1. Cluster Basics and Installation on CentOS 7
In this chapter, we will introduce the basic principles of clustering and show how to set up two Linux servers as members of a cluster, step by step.
As part of this process, we will install the CentOS 7 Linux distribution from scratch, along with the necessary packages, and finally configure key-based authentication for SSH access from one computer to the other. All commands, except if noted otherwise, must be run as root and are indicated by a leading $ sign throughout this book.
Clustering fundamentals
In computer science, a cluster consists of a group of computers (with each computer referred to as a node or member) that work together so that the set is seen as a single system from the outside.
Enterprise and science environments often require both redundancy and high computing power to analyze the massive amounts of data produced every day. In order for the results to be always available to the people either using those services or managing them, we rely on the high availability and performance of computer systems. The need of Internet websites, such as those used by banks and other commercial institutions, to perform well when under a significant load is a clear example of the advantages of using clusters.
There are two typical cluster setups. The first one involves assigning a different task to each node, thus achieving a higher performance compared with several tasks being performed by a single member on its own. Another classic use of clustering is to help ensure high availability by providing failover capabilities to the set, where one node may automatically replace a failed member to minimize the downtime of one or several critical services. In either case, the concept of clustering implies not only taking advantage of the computing functionality of each member alone, but also maximizing it by complementing it with the others.
The failover-oriented type of cluster setup is called high availability (HA), and it aims to eliminate system downtime by failing over services from one node to another in case one of them experiences an issue that renders it inoperative. As opposed to switchover, which requires human intervention, a failover procedure is performed automatically by the cluster, ideally without noticeable downtime. In other words, this operation is meant to be transparent to end users and clients from outside the cluster.
The performance-oriented setup uses its nodes to perform operations in parallel in order to enhance the performance of one or more applications, and is called a high-performance cluster (HPC). HPCs are typically seen in scenarios involving applications and processes that use large collections of data.
Why Linux and CentOS 7?
As mentioned earlier, we will build a cluster with two machines running Linux. This choice is supported by the low cost and stability associated with this setup: no paid operating system or software licenses, along with the possibility of running Linux on systems with small resources (such as a Raspberry Pi or relatively old hardware). Thus, we can set up a cluster with very little resources or money.
We will begin our own journey toward clustering by setting up the separate nodes that will make up our system. Our choice of operating system is Linux, with CentOS version 7 as the distribution, which is the latest available release of CentOS at the time of writing. The binary compatibility with Red Hat Enterprise Linux (which is one of the most widely used distributions in enterprise and scientific environments), along with its well-proven stability, are the reasons behind this decision.
Note
CentOS 7 is available for download, free of charge, from the project’s website at http://www.centos.org/. In addition to this, specific details about the release can always be consulted in the release notes available through the CentOS wiki, http://wiki.centos.org/Manuals/ReleaseNotes/CentOS7.
Downloading CentOS
To download CentOS, go to http://www.centos.org/download/ and click on one of the three options outlined in the following screenshot:
DVD ISO: This is an .iso file (~4 GB) that can be written to regular DVD optical media and includes the common tools. Download this file if you have permanent access to a reliable Internet connection that you can use to download other packages and utilities later.
Everything ISO: This is an .iso file (~7 GB) with the complete set of packages made available in the base repository of CentOS 7. Download this file if you do not have access to a permanent Internet connection or if your plan contemplates the possibility of installing or populating a local or network mirror.
alternative downloads: This link will take you to a public directory within an official nearby CentOS mirror where the previous options are available along with others, including different choices of desktop versions (GNOME or KDE), and the minimal .iso file (~570 MB), which contains the core or essential packages of the distribution.
Although all three download options will work, we will use the minimal install as it is sufficient for our purpose at hand, and we can install other needed packages later from public software package repositories with yum, the standard CentOS package manager. The recommended .iso file to download is the latest that is available from the download page, which at the time of writing is CentOS-7.0-1406-x86_64-Minimal.iso.
Setting up CentOS 7 nodes
If you do not have dedicated hardware that you can use to set up the nodes of your cluster, you can still create them using virtual machines over some virtualization software, such as Oracle VirtualBox or VMware, for example. Using virtualization will allow us to easily set up the second node after setting up the first by cloning it. The only limitation in this case is that we will not have a STONITH device available. Shoot The Other Node In The Head (STONITH) is a mechanism that aims to prevent two nodes from acting as the primary node in an HA cluster, thus avoiding the possibility of data corruption.
The following setup is going to be performed on a VirtualBox VM with 1 GB of RAM and 30 GB of disk space plus two network interface cards. The first one will allow us to reach the Internet to download other packages, whereas the second will be needed to create a shared IP address to reach the cluster as a whole.
The reason why I have chosen VirtualBox over VMware is that the former is free of cost and is available for Microsoft Windows, Linux, and Mac OS, while a full version of the latter costs money.
To download and install VirtualBox, go to https://www.virtualbox.org/ and choose the version for your operating system. For the installation instructions, you can refer to https://www.virtualbox.org/manual/UserManual.html, especially sections 1.7 Creating your first virtual machine, and 1.13 Cloning virtual machines.
Other than that, you will also need to ensure that your virtual machine has two network interface cards. The first one is created by default while the second one has to be created manually.
To display the current network configuration for a VM, click on it in VirtualBox’s main interface and then on the Settings button. A pop-up window will appear with the list of the different hardware categories. Choose Network and configure Adapter 1 to Bridged Adapter, as shown in the following screenshot:
Click on Adapter 2, enable it by checking the corresponding checkbox, and configure it as part of an Internal Network named intnet, as shown in the following screenshot:
We will use the default partitioning scheme (LVM) as suggested by the installation process.
Installing CentOS 7
We will start by creating the first node step by step and then use the cloning feature in VirtualBox to instantiate an identical node. This will reduce the time needed for the installation, as the clone will only require a slight modification to the hostname and network settings. Follow these steps to install CentOS 7 in a virtual machine:
1. The splash screen shown in the following screenshot is the first step in the installation process after loading the installation media on boot. Highlight Install CentOS 7 using the up and down arrows and press Enter:
2. Select English (or your preferred installation language) and click on Continue:
3. In the next screen, you can set the current date and time, choose a keyboard layout and language support, pick a hard drive destination for the installation along with a partitioning method, connect the main network interface, and assign a unique hostname for the node. We will name the current node node01 and leave the rest of the settings as default (we will configure the extra network card later). Then, click on the Begin installation button.
4. While the installation continues in the background, we will be prompted to set the password for the root account and create an administrative user for the node. Once these steps have been confirmed, the corresponding warnings will no longer appear:
5. When the process is completed, click on Finish configuration and the installation will finish configuring the system and the devices. When the system is ready to boot on its own, you will be prompted to do so. Remove the installation media and click on Reboot.
6. After successfully restarting the computer and booting into a Linux prompt, our first task will be to update our system. However, before we can do this, we first have to set up our basic network adapter to access the Internet to download and update packages. Then, we will be able to proceed with setting up our network interfaces.
Setting up the network infrastructure
Since our nodes will communicate with each other over the network, we will first define our network addresses and configuration. Our rather basic network infrastructure will consist of two CentOS 7 boxes with static IP addresses and hostnames node01 [192.168.0.2] and node02 [192.168.0.3], and a gateway router called simply gateway [192.168.0.1].
In CentOS, all network interfaces are configured using scripts in the /etc/sysconfig/network-scripts directory. If you followed the steps outlined earlier to create a second network interface, you should have ifcfg-enp0s3 and ifcfg-enp0s8 files inside that directory. The first one is the configuration file for the network card that we will use to access the Internet and to connect via SSH using an outside client, whereas the second will be used in a later chapter to be a part of a cluster resource. Note that the exact naming of the network interfaces may differ a little, but it is safe to assume that they will follow the ifcfg-enp0sX format, where X is an integer number.
This is the minimum content that is needed in the /etc/sysconfig/network-scripts/ifcfg-enp0s3 file for our purposes in our first node (when you set up the second node later, just change the IP address (IPADDR) to 192.168.0.3):
HWADDR="08:00:27:C8:C2:BE"
TYPE="Ethernet"
BOOTPROTO="static"
NAME="enp0s3"
ONBOOT="yes"
IPADDR="192.168.0.2"
NETMASK="255.255.255.0"
GATEWAY="192.168.0.1"
PEERDNS="yes"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
Note that the UUID and HWADDR values will be different in your case as they are assigned as part of the underlying hardware. For this reason, it is safe to leave the default values for those settings. In addition to this, be aware that cluster machines need to be assigned a static IP address. Never leave that up to DHCP! In the configuration file used previously, we are using Google’s DNS, but if you wish to, feel free to use another DNS.
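Because the two nodes differ only in a few values, the file above can also be produced from a small template. The following is a minimal sketch under stated assumptions: render_ifcfg is a hypothetical helper (it is not part of CentOS), the MAC address in the usage comment is a placeholder, and the netmask, gateway, and DNS values are simply the ones used in this chapter.

```shell
#!/bin/sh
# Sketch: print a static ifcfg file for one node on standard output.
# render_ifcfg is a hypothetical helper, not a CentOS tool.
# Arguments: $1 = interface name, $2 = IPv4 address, $3 = MAC address.
render_ifcfg() {
    cat <<EOF
HWADDR="$3"
TYPE="Ethernet"
BOOTPROTO="static"
NAME="$1"
ONBOOT="yes"
IPADDR="$2"
NETMASK="255.255.255.0"
GATEWAY="192.168.0.1"
PEERDNS="yes"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
EOF
}

# Example for node02 (placeholder MAC; review the output before copying it
# into /etc/sysconfig/network-scripts/):
#   render_ifcfg enp0s3 192.168.0.3 08:00:27:AA:BB:CC > /tmp/ifcfg-enp0s3
```

Writing to a scratch location first, as in the usage comment, keeps a typo from taking the interface down before you have had a chance to review the result.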
When you are done making changes, save the file and restart the network service in order to apply them. Since CentOS, beginning with version 7, uses systemd instead of SysVinit for service management, we will use the systemctl command instead of the /etc/init.d scripts to restart the services throughout this book, as follows:
$ systemctl restart network.service # Restart the network service
You can verify that the service restarted correctly using the following command:
$ systemctl status network.service # Display the status of the network service
You can verify that the expected addresses have been correctly applied with the following command:
$ ip addr | grep 'inet ' # Display the IP addresses
You can disregard all error messages related to the loopback interface as shown in the preceding screenshot. However, you will need to examine carefully any error messages related to enp0s3, if any, and get them resolved in order to proceed further.
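If you prefer a more compact view than the grep output, the one-line-per-address format of the ip command is easy to post-process. The sketch below assumes the usual layout of ip -o -4 addr show (index, interface name, the word inet, then address/prefix); list_ipv4 is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: reduce `ip -o -4 addr show` output to "interface address" pairs.
# list_ipv4 is a hypothetical helper; it reads the command output on stdin.
list_ipv4() {
    # Field 2 is the interface, field 4 is address/prefix; drop the prefix.
    awk '$3 == "inet" { split($4, a, "/"); print $2, a[1] }'
}

# Usage on a node:
#   ip -o -4 addr show | list_ipv4
```

On node01 you would expect to see a line for lo and a line pairing enp0s3 with 192.168.0.2.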
The second interface will be called enp0sX, where X is typically 8, as it is in our case. You can verify this with the following command, as shown in the following screenshot:
$ ip link show
As for the configuration file of enp0s8, you can safely create it by copying the contents of ifcfg-enp0s3. Do not forget, however, to change the hardware (MAC) address to the one reported for the NIC by the ip link show enp0s8 command, and to leave the IP address field blank for now. The two commands involved are as follows:
$ ip link show enp0s8
$ cp /etc/sysconfig/network-scripts/ifcfg-enp0s3 /etc/sysconfig/network-scripts/ifcfg-enp0s8
Next, restart the network service as explained earlier.
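The copy-and-edit step for enp0s8 can also be scripted so that the manual edits are not forgotten. This is only a sketch: derive_ifcfg is a hypothetical helper, and renaming the NAME entry to match the new interface is an assumption on my part rather than something the text requires.

```shell
#!/bin/sh
# Sketch: derive a second ifcfg file from an existing one.
# derive_ifcfg is a hypothetical helper.
# Arguments: $1 = source file, $2 = destination file,
#            $3 = new interface name, $4 = new MAC address.
derive_ifcfg() {
    sed -e "s/^NAME=.*/NAME=\"$3\"/" \
        -e "s/^HWADDR=.*/HWADDR=\"$4\"/" \
        -e 's/^IPADDR=.*/IPADDR=""/' \
        "$1" > "$2"
}

# Usage (take the MAC from `ip link show enp0s8`; the one below is a placeholder):
#   cd /etc/sysconfig/network-scripts
#   derive_ifcfg ifcfg-enp0s3 ifcfg-enp0s8 enp0s8 08:00:27:11:22:33
```

All other settings are copied unchanged, which matches the approach of starting from ifcfg-enp0s3.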
Note that you will also need to set up at least a basic name resolution method. Considering that we will set up a cluster with two nodes only, we will use /etc/hosts in both hosts for this purpose.
Edit /etc/hosts so that it includes the following content:
192.168.0.2 node01
192.168.0.3 node02
192.168.0.1 gateway
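Since the same three entries are needed on both nodes, it can be convenient to add them with a helper that is safe to run more than once. The following is a minimal sketch; add_host_entry is a hypothetical helper, and it takes the file path as an argument so that you can try it on a scratch copy before touching /etc/hosts.

```shell
#!/bin/sh
# Sketch: append "IP<TAB>name" to a hosts file unless the name is already there.
# add_host_entry is a hypothetical helper.
# Arguments: $1 = hosts file, $2 = IP address, $3 = hostname.
add_host_entry() {
    # Look for the hostname as the last word of a line.
    if ! grep -q "[[:space:]]$3\$" "$1" 2>/dev/null; then
        printf '%s\t%s\n' "$2" "$3" >> "$1"
    fi
}

# Usage (as root, on both nodes):
#   add_host_entry /etc/hosts 192.168.0.2 node01
#   add_host_entry /etc/hosts 192.168.0.3 node02
#   add_host_entry /etc/hosts 192.168.0.1 gateway
```

Running the helper a second time leaves the file unchanged, so it is harmless to reuse after cloning the first node.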
Once you have set up both nodes as explained in the following sections, and before proceeding further, you can perform a ping as a basic connectivity test between the two hosts to ensure that they are reachable from each other.
To begin, execute the following command in node01:
$ ping -c 4 node02
Next, do the same in node02:
$ ping -c 4 node01
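When there are several peers to check, the ping test can be wrapped in a short loop that reports each host and returns a non-zero status if any of them fails. This is a sketch only; check_peers is a hypothetical helper built around the same ping -c 4 call used above.

```shell
#!/bin/sh
# Sketch: ping each host four times and summarize the result.
# check_peers is a hypothetical helper; it returns 1 if any host fails.
check_peers() {
    failed=0
    for host in "$@"; do
        if ping -c 4 "$host" > /dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host UNREACHABLE"
            failed=1
        fi
    done
    return "$failed"
}

# Usage on node01:
#   check_peers node02 gateway
```

The return status makes the helper usable in scripts, for example to stop a setup procedure early when a peer cannot be reached.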
Installing the packages required for clustering
Once we have finished installing the operating system and configuring the basic network infrastructure, we are ready to install the packages that will provide the clustering functionality to each node. Let’s emphasize here that without these core components, our two nodes would become simple standalone servers that would not be able to support each other in the event of a system crash or another major issue in one of them.
Key software components
Each node will need the following software components in order to work as a member of the cluster. These packages are fully supported in CentOS 7 as part of a cluster setup, as opposed to other alternatives that have been deprecated:
Pacemaker: This is a cluster resource manager that runs scripts at boot time, when individual nodes go up or down, or when related resources fail. In addition, it can be configured to periodically check the health status of each cluster member. In other words, pacemaker will be in charge of starting and stopping services (such as a web or database server, to name a classic example) and will implement the logic to ensure that all of the necessary services are running in only one location at the same time in order to avoid data failure or corruption.
Corosync: This is a messaging service that will provide a communication channel between nodes. As you can guess, corosync is essential for pacemaker to perform its job.
PCS: This is a corosync and pacemaker configuration tool that will allow you to easily view, modify, and create pacemaker-based clusters. This is not strictly necessary but rather optional. We choose to install it because it will come in handy at a later stage.
To install the three preceding software packages, run the following command:
$ yum update && yum install pacemaker corosync pcs
Yum will update all the installed packages to their most recent version in order to better satisfy dependencies, and it will then proceed with the actual installation.
In addition to installing the preceding packages, we also need to enable iptables, as the default firewall for CentOS 7 is firewalld. We choose iptables over firewalld because its use is far more widespread, and there is a good chance that you are more familiar with it than with the relatively new firewalld. We will install the necessary packages here and leave the configuration for the next chapter.
In order to manage iptables via systemd utilities, you will need to install (if it is not already installed) the iptables-services package using the following command:
$ yum update && yum install iptables-services
Now, you can stop and disable firewalld using the following commands:
$ systemctl stop firewalld.service
$ systemctl disable firewalld.service
Next, enable iptables to both initialize on boot and start during the current session:
$ systemctl enable iptables.service
$ systemctl start iptables.service
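The four systemctl calls above have to be repeated on every node, so you may want to keep them together in one function. The sketch below adds a dry-run switch so that the sequence can be reviewed before anything changes; DRY_RUN, run, and switch_to_iptables are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Sketch: the firewalld-to-iptables switch as a single function.
# DRY_RUN is a hypothetical variable: when it is 1 (the default here),
# the commands are only printed instead of being executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop at the first failing step so that a problem is not masked.
switch_to_iptables() {
    run systemctl stop firewalld.service &&
    run systemctl disable firewalld.service &&
    run systemctl enable iptables.service &&
    run systemctl start iptables.service
}

# Usage:
#   switch_to_iptables              # prints the four commands
#   DRY_RUN=0 switch_to_iptables    # executes them (as root)
```

Chaining the steps with && means that iptables is not started if disabling firewalld failed, which avoids ending up with both services fighting over the rule set.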
You can refer to the following screenshot for a step-by-step example of this process:
Once the installation of the first node (node01) has been completed successfully, clone the first node following the outline in section 1.13 of the VirtualBox manual (Cloning virtual machines). Once you’re done cloning the virtual machine, make the following minor changes to the second virtual machine:
Name the machine node02. When you start this newly created virtual machine, its hostname will still be set to node01. To change it, issue the following commands and then wait for the machine to reboot and apply the change:
$ hostnamectl set-hostname node02
$ systemctl reboot
In the configuration file for enp0s3 in node02, enter 192.168.0.3 as the IP address and the right HWADDR address.
Ensure that both the virtual machines are running and that each node can ping the other and the gateway, as shown in the next two screenshots.
First, we will ping node02 and gateway from node01, and we will see the following output:
Then, we will ping node01 and gateway from node02:
If any of the pings do not return the expected result, as shown in the preceding screenshot, check the network interface configuration both in VirtualBox and in the configuration files, as outlined earlier in this chapter.
Setting up key-based authentication for SSH access
While not strictly required, we will also set up public key-based authentication for SSH so that we can access each host from the other without entering the account’s password every time we want to access a different node. This feature will come in handy in case, for some reason, we need to perform some system administration task on one of the nodes. Note that you will need to repeat this operation on both nodes.
In order to increase security, we may also enter a passphrase while creating the RSA key, which is shown in the following screenshot. This step is optional and you can omit it if you want. In fact, I advise you to leave it empty in order to make things easier down the road, but it’s up to you.
Run the following command in order to create an RSA key:
$ ssh-keygen -t rsa
To enable passwordless login, we will copy the newly created key from node01 to node02, and vice versa, as shown in the next two figures, respectively. On node01, the command is as follows (on node02, use root@node01 as the target instead):
$ cat .ssh/id_rsa.pub | ssh root@node02 'cat >> .ssh/authorized_keys'
Copy the key from node01 to node02:
Copy the key from node02 to node01:
Next, we need to verify that we can connect from each cluster member to the other without a password, entering only the passphrase if we set one previously:
Finally, if passwordless login is not successful, you may want to ensure that the SSH daemon is running on both hosts:
$ systemctl status sshd
If it is not running, start it using the following command:
$ systemctl start sshd
You may want to check the status of the service again after attempting to restart it. If there have been any errors, the output of systemctl status sshd will give you indications as to what is wrong with the service and why it is refusing to start properly. Following those directions, you will be able to troubleshoot the problem without much hassle.
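Another common reason for key-based login to fall back to a password prompt, even though sshd is running, is overly permissive modes on the .ssh directory or the authorized_keys file, which sshd rejects when its StrictModes option is on (the default). The sketch below applies the conventional 700 and 600 modes; fix_ssh_perms is a hypothetical helper, and this is offered as one possible check rather than the only cause to look for.

```shell
#!/bin/sh
# Sketch: tighten permissions on an .ssh directory and its authorized_keys.
# fix_ssh_perms is a hypothetical helper.
# Argument: $1 = path to the .ssh directory.
fix_ssh_perms() {
    # Directory: owner only; key list: owner read/write only.
    chmod 700 "$1" &&
    if [ -f "$1/authorized_keys" ]; then
        chmod 600 "$1/authorized_keys"
    fi
}

# Usage on each node, as root:
#   fix_ssh_perms /root/.ssh
```

After adjusting the modes, retry the SSH connection from the other node before looking further.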
Summary
In this chapter, we reviewed how to install the operating system and installed the necessary software components to implement the basic cluster functionality. Ensure that you have installed your nodes and the basic clustering software as outlined earlier in this chapter, and configured the network and SSH access, before proceeding with Chapter 2, Installing Cluster Services and Configuring Network Components, where we will configure the resource manager, the messaging layer, and the firewall service in order to actually start building our cluster.
Chapter2.InstallingClusterServicesandConfiguringNetworkComponentsInthischapter,youwilllearnhowtosetupandconfigurethebasicrequirednetworkinfrastructureandalsotheclusteringcomponentsthatweinstalledinthepreviouschapter.
Inadditiontothis,wewillreviewthebasicandimportantconceptsoffirewallingandInternetprotocols,andwewillexplainhowtoaddthefirewallrulesthatwillallowcommunicationbetweenthenodesandtheproperoperationoftheclusteringservicesoneachnode.
IfyournativelanguageisanyotherthanEnglish,youmusthavetakenanEnglishclassortaughtyourself(asIdid)beforebeingabletoreadthisbook.Thesamethinghappenswhentwopeoplewhodonotspeakthesamelanguagewanttocommunicatewitheachother.Atleastoneofthemneedstoknowthelanguageoftheother,orthetwoofthemneedtoagreeonacommonidiominordertobeabletounderstandeachother.
Innetworking,theequivalentoflanguagesintheaboveanalogyiscalledprotocols.Inordertoenabledatatransmissionbetweentwomachines,theremustbealogicalwayforthemtobeabletospeaktoeachother.ThisisattheveryheartoftheInternetprotocolsuite,alsoknownastheInternetmodel,whichprovidesasetofcommunicationprotocolsorrules.
ItispreciselythissetofprotocolsthatmakedatatransmissionpossibleinnetworkssuchastheInternet.Laterinthischapter,wewillexplaintheprotocolsandnetworkportsthatparticipateinthecommunicationinsideacluster.
Configuring and starting clustering services
Having reviewed the key networking concepts that were outlined earlier, we are now ready to start describing the cluster services.
Starting and enabling clustering services
You will recall from the previous chapter that we installed the following clustering components:
Pacemaker: This is the cluster resource manager
Corosync: This is the messaging service
PCS: This is the synchronization and configuration tool
As you can probably guess from the preceding list, these components should run as daemons, a special type of process that runs in the background without the need of direct intervention or control of an administrator. Although we installed the necessary packages in Chapter 1, Cluster Basics and Installation on CentOS 7, we did not start the cluster resource manager or the messaging services. So, we now need to start them manually for the first time and enable them to run automatically on startup during the next system boot.
We will start by configuring pacemaker and corosync first and save PCS for later in this chapter.
As shown in the following screenshot, these daemons (also known as units in systemd-based systems) are inactive when you first boot both nodes (and are not automatically started on reboot) after performing all the tasks outlined in Chapter 1, Cluster Basics and Installation on CentOS 7. You can check their current running status using the following commands:
systemctl status pacemaker
systemctl status corosync
In order to start corosync and pacemaker on each node and enable both services to start automatically during system boot, first create the corosync configuration file by making a copy of the example file that came with the installation package. As opposed to pacemaker and PCS, corosync does not create the configuration file automatically for you:
cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
Then start and enable the services by running the following commands:
systemctl start pacemaker
systemctl enable corosync
systemctl enable pacemaker
In the preceding commands, note that we are not starting corosync manually, as it will launch on its own when pacemaker is started. It is important to note that on systemd-based systems, enabling a service is not the same as starting it. A unit may be enabled but not started, or the other way around. As shown in the following code, enabling a service involves creating a symlink to the unit’s configuration file, which among other things specifies the actions to be taken on system boot and shutdown.
Perform the following operations on both nodes:
[root@node01 ~]# systemctl enable pacemaker
ln -s '/usr/lib/systemd/system/pacemaker.service' '/etc/systemd/system/multi-user.target.wants/pacemaker.service'
[root@node01 ~]# systemctl enable corosync
ln -s '/usr/lib/systemd/system/corosync.service' '/etc/systemd/system/multi-user.target.wants/corosync.service'
[root@node01 ~]#
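Because a unit can be started without being enabled (or the other way around), it can help to print both states side by side. The following is a small sketch built on the systemctl is-active and systemctl is-enabled subcommands; the function name unit_state is only an illustrative choice.

```shell
# unit_state UNIT: print "UNIT: <active-state>/<enabled-state>"
# (hypothetical helper name).
unit_state() {
    # Both subcommands print a single word and exit non-zero for
    # inactive or disabled units, so their exit status is ignored here.
    active=$(systemctl is-active "$1" 2>/dev/null) || true
    enabled=$(systemctl is-enabled "$1" 2>/dev/null) || true
    echo "$1: ${active:-unknown}/${enabled:-unknown}"
}
```

For example, for u in corosync pacemaker; do unit_state "$u"; done prints one line per unit, such as corosync: active/enabled once both steps have been carried out.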
Finally, before we can configure the cluster at a later stage, we need to perform the following steps:
1. Start and enable the PCS daemon (pcsd), which will be in charge of keeping the corosync configuration synced on node01 and node02. In order for the pcsd daemon to work as expected, corosync and pacemaker must have been started previously. Note that when you use the systemctl tool to manage services in a systemd-based system, you can omit the trailing .service after the daemon name (or use it if you want, as indicated in Chapter 1, Cluster Basics and Installation on CentOS 7). Start and enable the PCS daemon with:
systemctl start pcsd
systemctl enable pcsd
2. Now set the password for the hacluster Linux account, which was created automatically when PCS was installed. This account is used by the PCS daemon to set up communication between nodes, and is best managed when the password is identical on both nodes. To set the password for hacluster, type the following command and assign the same password on both nodes:
passwd hacluster
Troubleshooting
Under normal circumstances, starting pacemaker should start corosync automatically. You can check corosync’s status with the systemctl status corosync command. If for some reason that is not the case, you can still run the following command to manually start the messaging service:
systemctl start corosync
Should any of the preceding commands return an error, running systemctl -l status unit, where unit is either corosync or pacemaker, will return a detailed status about the respective service.
Here is another useful troubleshooting command:
journalctl -xn
This will query the systemd journal (systemd’s own log) and return verbose messages about the last events.
Both of these commands will provide helpful information as to what went wrong, and point you in the right direction to solve the problem.
Tip
You can read more about the systemd journal in its man page, man journalctl, or in the online version, which is available at http://www.freedesktop.org/software/systemd/man/journalctl.html.
Security fundamentals
At this point, we are ready to discuss network security in order to allow only the proper network traffic between the nodes. During the initial setup and while performing your first tests, you may want to disable the firewall and SELinux (which is described later in this chapter) and then go through both of them at a later stage. It is up to you, depending on your degree of familiarity with them at this point.
Letting in and letting out
After having started and enabled the services mentioned earlier, we are ready to take a closer look at the network processes involved in the cluster configuration and maintenance. With the help of the netstat command, a tool included in the net-tools package for CentOS 7, we will print the current listening network ports and verify that corosync is running and listening for connections. Before doing so, you will need to install the net-tools package, as it is not included in the minimal CentOS 7 setup, using the following command:
yum -y install net-tools && netstat -npltu | grep -i corosync
As we can see in the following screenshot, Corosync is listening on the UDP ports 5404 and 5405 of the loopback interface (127.0.0.1) and on the port 5405 of the multicast address (which is set to 239.255.1.1 by default and provides a logical way to identify this group of nodes):
Note
User Datagram Protocol (UDP) is one of the core members of the Internet model. This protocol allows applications to send messages (also known as datagrams) to hosts in a network without first establishing a connection through a handshake. Additionally, UDP does not guarantee delivery or ordering and does not correct errors in a network communication (these checks are performed at the destination application itself).
The Transmission Control Protocol (TCP) is another core protocol of the Internet model. As opposed to UDP, it provides error, delivery, ordering, and duplicate checking of data streams between computers in a network. Several well-known application layer protocols (such as HTTP, SMTP, and SSH, to name a few) are encapsulated in TCP.
Internet Group Management Protocol (IGMP) is the communication protocol used by network devices (whether they are hosts or routers) to establish multicast data transmissions, which allow one host on the network to send datagrams to several other systems that are interested in receiving the source content.
Before we proceed further, we will need to allow traffic through the firewall on each node. The ports named in the following list are the defaults on which these services listen after being started, as we previously did. Specifically, on both nodes, we need to perform the following steps:
1. Open the network ports needed by corosync (UDP ports 5404 and 5405) and PCS (usually TCP 2224) using the following commands:
iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
iptables -I INPUT -p tcp -m state --state NEW --dport 2224 -j ACCEPT
Note
Note that the use of -m multiport allows you to combine a number of different ports in one rule instead of having to write several rules that are almost identical. This results in fewer rules and easier maintenance of iptables.
2. Allow IGMP and multicast traffic using the following commands:
iptables -I INPUT -p igmp -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
3. Change the default iptables policy for the INPUT chain to DROP. Thus, any packet that does not comply with the rules that we just added will be dropped. Note that, as opposed to the REJECT target, DROP does not send any response whatsoever to the calling client, just “radio silence” while actively dropping the packets:
iptables -P INPUT DROP
4. After adding the necessary rules, our firewall configuration looks as shown in the following code, where we can clearly see that besides the rules that we added in the previous steps, there are others that were initialized by default when we started and enabled iptables, as explained in Chapter 1, Cluster Basics and Installation on CentOS 7. Because the -n option is not used, ports are displayed by their service names from /etc/services (efi-mg is TCP port 2224, while hpoms-dps-lstn and netsupport are UDP ports 5404 and 5405). Run the following command to list the firewall rules along with their corresponding numbers:
[root@node01 ~]# iptables -L -v --line-numbers
Chain INPUT (policy DROP 0 packets, 0 bytes)
num  pkts bytes target  prot opt in   out  source    destination
1     423 48645 ACCEPT  all  --  any  any  anywhere  anywhere     ADDRTYPE match dst-type MULTICAST
2       0     0 ACCEPT  igmp --  any  any  anywhere  anywhere
3       0     0 ACCEPT  tcp  --  any  any  anywhere  anywhere     state NEW tcp dpt:efi-mg
4    1200  124K ACCEPT  udp  --  any  any  anywhere  anywhere     state NEW multiport dports hpoms-dps-lstn,netsupport
5      86  7152 ACCEPT  all  --  any  any  anywhere  anywhere     state RELATED,ESTABLISHED
6       0     0 ACCEPT  icmp --  any  any  anywhere  anywhere
7     387 41151 ACCEPT  all  --  lo   any  anywhere  anywhere
8       0     0 ACCEPT  tcp  --  any  any  anywhere  anywhere     state NEW tcp dpt:ssh
9      65 10405 REJECT  all  --  any  any  anywhere  anywhere     reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num  pkts bytes target  prot opt in   out  source    destination
1       0     0 REJECT  all  --  any  any  anywhere  anywhere     reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT 1149 packets, 127K bytes)
num  pkts bytes target  prot opt in   out  source    destination
5. If the last default rule in the INPUT chain implements a REJECT procedure for non-compliant packets, we will delete it because we already took care of that need by changing the default policy for the chain:
iptables -D INPUT [rule number]
6. Finally, we must save the firewall rules for persistency across boots. As shown in the following screenshot, this consists of saving the changes to /etc/sysconfig/iptables:
service iptables save
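The rule number needed in step 5 can be read from the numbered listing instead of being looked up by eye. The following sketch assumes the listing is produced without -v, so that the second column is the target; the function name reject_rule_number is made up for this example.

```shell
# reject_rule_number: read the output of
#   iptables -n -L INPUT --line-numbers
# on standard input and print the number of the first REJECT rule, if any.
reject_rule_number() {
    # Column 1 is the rule number and column 2 is the target.
    awk '$2 == "REJECT" { print $1; exit }'
}
```

A possible use is n=$(iptables -n -L INPUT --line-numbers | reject_rule_number), followed by iptables -D INPUT "$n" only when n is not empty; the rules then need to be saved again as in step 6.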
If we inspect the /etc/sysconfig/iptables file with our preferred text editor or pager, we will realize that it presents the same firewall rules in a format that is somewhat easier to read, as shown in the following code:
[root@node02 ~]# cat /etc/sysconfig/iptables
# Generated by iptables-save v1.4.21 on Sat Dec 5 10:09:24 2015
*filter
:INPUT DROP [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [263:28048]
-A INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
-A INPUT -p igmp -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
-A INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Sat Dec 5 10:09:24 2015
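Since the same rules have to exist on both nodes, a quick check of the saved file can catch a step that was skipped on one of them. The following is a rough sketch that only greps the file for the four cluster-related ACCEPT rules listed above; the function names has_rule and check_cluster_rules are illustrative, and the patterns assume the rules were written exactly as in this chapter.

```shell
# has_rule FILE PATTERN: succeed if FILE contains an ACCEPT rule
# matching the extended regular expression PATTERN.
has_rule() {
    grep -E -- "$2" "$1" | grep -q -- '-j ACCEPT'
}

# check_cluster_rules FILE: report the cluster-related rules missing from FILE.
check_cluster_rules() {
    file=$1
    status=0
    has_rule "$file" '-p udp .*--dports 5404,5405' || { echo "missing: corosync UDP 5404,5405"; status=1; }
    has_rule "$file" '-p tcp .*--dport 2224' || { echo "missing: pcsd TCP 2224"; status=1; }
    has_rule "$file" '-p igmp' || { echo "missing: IGMP"; status=1; }
    has_rule "$file" '--dst-type MULTICAST' || { echo "missing: multicast"; status=1; }
    [ "$status" -eq 0 ] && echo "all cluster rules present"
    return "$status"
}
```

Running check_cluster_rules /etc/sysconfig/iptables on each node should report that all cluster rules are present.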
Next, you will also need to edit the /etc/sysconfig/iptables-config file to indicate that firewall rules should be persistent on system shutdown and reboot. Note that these lines already exist in the file and need to be changed. As a precaution, you may want to back up the existing file before making the change:
cp /etc/sysconfig/iptables-config /etc/sysconfig/iptables-config.orig
Now, open /etc/sysconfig/iptables-config with your preferred text editor and ensure that the indicated lines read as follows:
IPTABLES_SAVE_ON_STOP="yes"
IPTABLES_SAVE_ON_RESTART="yes"
As usual, do not forget to restart iptables (systemctl restart iptables) in order to apply changes.
CentOS 7, just like the previous versions of the distribution, comes with built-in SELinux (Security Enhanced Linux) support. This provides native, flexible access control functionality for the operating system based on the kernel itself. You may well be wondering what to do with SELinux policies at this stage. The current settings, which can be displayed with the sestatus and getenforce commands, and are shown in the following screenshot, will do for the time being:
In simple terms, we will leave the default mode set to enforcing for security purposes. This should not cause any issues further down the road, but if it does, feel free to set the mode to permissive with the following command:
setenforce 0
In permissive mode, SELinux does not block any operation; it only logs the denials that it would otherwise have enforced, which helps you troubleshoot issues while the server is still running. In case you need to troubleshoot issues and you suspect that SELinux may be causing them, you should look in /var/log/audit/audit.log. SELinux log messages, which are labeled with the AVC keyword, are written to that file via auditd, the Linux auditing system, which is started by default. If auditd is not running, these messages are written to /var/log/messages instead.
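When the audit log is large, it can be useful to summarize which programs triggered AVC denials. The following sketch assumes the usual record layout, in which each denial line contains type=AVC, the word denied, and a comm="..." field naming the command; the function name avc_by_command is hypothetical.

```shell
# avc_by_command FILE: count the AVC denials in FILE per command name,
# most frequent first.
avc_by_command() {
    grep 'type=AVC' "$1" | grep 'denied' |
        sed -n 's/.*comm="\([^"]*\)".*/\1/p' |
        sort | uniq -c | sort -rn
}
```

For instance, avc_by_command /var/log/audit/audit.log shows at a glance whether corosync, pacemaker, or pcsd appears among the denied commands.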
Now, before you tackle the next heading, don’t forget to repeat the same operations and save the changes on the other node as well!
Getting acquainted with PCS
We are getting closer to actually setting up the cluster. Before diving into that task, we need to become familiar with PCS, the core component of our cluster, so to speak, which will be used to control and configure pacemaker and corosync. To begin doing that, we can just run pcs without arguments, as follows:
pcs
This returns the output shown in the following screenshot, which provides a short explanation of each option and command available in PCS:
We are interested in the Commands section, where the actual categories of clustering that can be managed through this tool are listed, along with a brief description of their usage. Each of them supports several capabilities, which can be shown by appending the word help to pcs [category]. For example, let’s see the functionality that is provided by the cluster command (which, by the way, we will use shortly):
pcs cluster help
Usage: pcs cluster [commands]...
Configure cluster for use with pacemaker
Commands:
    auth [node] [...] [-u username] [-p password] [--force] [--local]
        Authenticate pcs to pcsd on nodes specified, or on all nodes
        configured in corosync.conf if no nodes are specified (authorization
        tokens are stored in ~/.pcs/tokens or /var/lib/pcsd/tokens for root).
        By default all nodes are also authenticated to each other, using
        --local only authenticates the local node (and does not authenticate
        the remote nodes with each other). Using --force forces
        re-authentication to occur.
    setup [--start] [--local] [--enable] --name <cluster name> <node1[,node1-altaddr]>
            [node2[,node2-altaddr]] [..] [--transport <udpu|udp>] [--rrpmode active|passive]
            [--addr0 <addr/net> [[[--mcast0 <address>] [--mcastport0 <port>]
                            [--ttl0 <ttl>]] | [--broadcast0]]
            [--addr1 <addr/net> [[[--mcast1 <address>] [--mcastport1 <port>]
                            [--ttl1 <ttl>]] | [--broadcast1]]]]
            [--wait_for_all=<0|1>] [--auto_tie_breaker=<0|1>]
            [--last_man_standing=<0|1> [--last_man_standing_window=<time in ms>]]
            [--ipv6] [--token <timeout>] [--join <timeout>]
            [--consensus <timeout>] [--miss_count_const <count>]
            [--fail_recv_const <failures>]
        Configure corosync and sync configuration out to listed nodes
        --local will only perform changes on the local node
        --start will also start the cluster on the specified nodes
        --enable will enable corosync and pacemaker on node startup
        --transport allows specification of corosync transport (default: udpu)
        The --wait_for_all, --auto_tie_breaker, --last_man_standing,
        --last_man_standing_window options are all documented in corosync's
        votequorum(5) man page.
        --ipv6 will configure corosync to use ipv6 (instead of ipv4)
        --token <timeout> sets time in milliseconds until a token loss is
            declared after not receiving a token (default 1000 ms)
        --join <timeout> sets time in milliseconds to wait for join mesages
            (default 50 ms)
        --consensus <timeout> sets time in milliseconds to wait for consensus
            to be achieved before starting a new round of membership
            configuration (default 1200 ms)
        --miss_count_const <count> sets the maximum number of times on
            receipt of a token a message is checked for retransmission before
            a retransmission occurs (default 5 messages)
        --fail_recv_const <failures> specifies how many rotations of the token
            without receiving any messages when messages should be received
            may occur before a new configuration is formed (default 2500
            failures)
Note
Note that the output is truncated for brevity.
You will often find yourself examining the documentation, so you should seriously consider becoming acquainted with the help.
Managing authentication and creating the cluster
We are now ready to authenticate PCS to the pcsd service on the nodes specified in the command line. By default, all nodes are authenticated to each other and thus PCS can talk to itself from one cluster member to the rest.
This is precisely where the hacluster user (whose password we set earlier) comes in handy, as it is the account that is used for this purpose. The generic syntax for PCS to perform this step in a cluster with N nodes is as follows:
pcs cluster auth member1 member2 … memberN
In our current setup with two nodes, setting up authentication means:
pcs cluster auth node01 node02
We will be prompted to enter the username and password that will be used for authentication, as discussed earlier. Fortunately for us, this process does not need to be repeated, as we can now control the cluster from any of the nodes. This procedure is exemplified in the following screenshot, where we set up the authentication for pcs from node01 and later create the cluster itself by issuing the command on node02, from where the /etc/corosync/corosync.conf file is synchronized to the other node:
To create the cluster using the specified nodes, type the following command (on one node only, after successfully trying the password as illustrated in the preceding screenshot):
pcs cluster setup --name MyCluster node01 node02
Here, MyCluster is the name we have chosen for our cluster (and you may want to change it according to your liking). Next, press Enter and verify the output. Note that it is this command that creates the cluster configuration file in /etc/corosync/corosync.conf on both nodes.
If you created the corosync.conf file using the sample configuration file as instructed earlier in this chapter (in order to start pacemaker and corosync), you will have to use the --force option to overwrite that file with the current settings of the newly created cluster:
[root@node01 ~]# pcs cluster setup --name MyCluster node01 node02
Error: /etc/corosync/corosync.conf already exists, use --force to overwrite
[root@node01 ~]# pcs cluster setup --name MyCluster node01 node02 --force
Shutting down pacemaker/corosync services…
Redirecting to /bin/systemctl stop pacemaker.service
Redirecting to /bin/systemctl stop corosync.service
Killing any remaining services…
Removing all cluster configuration files…
node01: Succeeded
node02: Succeeded
[root@node01 ~]#
Note
If you get the following error message while trying to set up the pcs authentication, ensure that pcsd is running (and enabled) on nodeXX, where XX is the node number, and try again:
Error: Unable to communicate with nodeXX
At this point, the /etc/corosync/corosync.conf file in node02 should be identical to the same file in node01, as can be seen in the output of the following diff command when run from node01. An empty output indicates that the corosync configuration file has been correctly synced from one node to the other:
diff /etc/corosync/corosync.conf <(ssh node02 'cat /etc/corosync/corosync.conf')
The next step consists of actually starting the cluster by issuing the following command (again, on one node only):
pcs cluster start --all
Note
The command that is used to start the cluster (pcs cluster start) deserves further clarification:
start [--all] [node] [...]
    Start corosync & pacemaker on specified node(s), if a node is not
    specified then corosync & pacemaker are started on the local node.
    If --all is specified then corosync & pacemaker are started on all
    nodes.
There will be times when you want to start the cluster on a specific node. In that case, you will name such a node instead of using the --all flag.
The output of the preceding command should be as follows:
[root@node01 ~]# pcs cluster start --all
node01: Starting Cluster…
node02: Starting Cluster…
[root@node01 ~]#
Once the cluster has been started, you can check its status from any of the nodes (remember that PCS makes it possible for you to manage the cluster from any node):
[root@node01 log]# pcs status cluster
Cluster Status:
 Last updated: Sat Dec 5 11:59:14 2015 Last change: Sat Dec 5 11:53:01 2015 by root via cibadmin on node01
 Stack: corosync
 Current DC: node02 (version 1.1.13-a14efad) - partition with quorum
 2 nodes and 0 resources configured
 Online: [ node01 node02 ]
[root@node01 log]#
or just pcs status:
[root@node01 log]# pcs status
Cluster name: MyCluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Sat Dec 5 11:55:43 2015 Last change: Sat Dec 5 11:53:01 2015 by root via cibadmin on node01
Stack: corosync
Current DC: node02 (version 1.1.13-a14efad) - partition with quorum
2 nodes and 0 resources configured
Online: [ node01 node02 ]
Full list of resources:
PCSD Status:
 node01: Online
 node02: Online
Daemon Status:
 corosync: active/disabled
 pacemaker: active/disabled
 pcsd: active/enabled
Note
The pcs status command provides more detailed information, including the status of services and resources. You may notice that one of the nodes is OFFLINE, as follows:
Online: [ node01 ]
OFFLINE: [ node02 ]
In this case, ensure that both pacemaker and corosync are enabled (as indicated after the Daemon Status: line) and started on the node that’s marked as OFFLINE, and then run pcs status again.
Another issue you may encounter is having one or more of the nodes in an unclean state. While that is not common, resyncing the nodes by stopping and restarting the cluster on both nodes will fix it. Run the following commands on each node (or append --all to act on every node from just one of them):
pcs cluster stop
pcs cluster start
The node that is marked as DC, that is, Designated Controller, is the node that the cluster has elected to coordinate its decisions. It is not necessarily the node where the cluster was originally started: in the preceding output, the cluster was started from node01, yet the DC is node02. If, for some reason, the current DC fails, a new designated controller is chosen automatically from the remaining nodes. You can see which node is the current DC in your cluster with:
[root@node01 ~]# pcs status | grep -i dc
Current DC: node02 (version 1.1.13-a14efad) - partition with quorum
[root@node01 ~]#
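If only the node name is needed (for example, inside a monitoring script), the Current DC line can be reduced to its third field. The following is a small sketch that assumes the line format shown in the preceding output; the function name current_dc is an illustrative choice.

```shell
# current_dc: read `pcs status` output on standard input and print
# only the name of the node reported as Designated Controller.
current_dc() {
    # "Current DC: node02 (version ...)" -> third field is the node name
    awk '/Current DC:/ { print $3; exit }'
}
```

Used as pcs status | current_dc, this prints just the node name, so a script can compare it against the value it expects.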
You will also want to check on each node individually. The pcs status nodes command shows whether each node in the cluster is online, in standby, or offline:
[root@node01 ~]# pcs status nodes
Pacemaker Nodes:
 Online: node01 node02
 Standby:
 Offline:
[root@node01 ~]#
The corosync-cmapctl command is another tool for accessing the cluster’s object database, where you will be able to view the properties and configuration of each node. Since the output of the corosync-cmapctl command is rather lengthy, you may want to filter by a chosen keyword, such as members or cluster_name, for example:
[root@node01 ~]# corosync-cmapctl | grep -Ei 'cluster_name|members'
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.0.2)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.0.3)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
totem.cluster_name (str) = MyCluster
[root@node01 ~]#
As you can see, the preceding output allows you to see the name of your cluster, the IP address, and the status of each member.
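The member keys shown above can be condensed into one line per node. The following sketch assumes the key layout of the preceding output (the ip key of a member is printed before its status key); the function name list_members is made up for this example.

```shell
# list_members: read `corosync-cmapctl` output on standard input and
# print "node id <id>: <address> <status>" for each cluster member.
list_members() {
    awk '
        /\.members\.[0-9]+\.ip / {
            n = split($1, part, /[.]/)   # the key is the first field
            addr = $NF                   # for example: ip(192.168.0.2)
            gsub(/^ip\(|\)$/, "", addr)  # keep only the address
            ip[part[n - 1]] = addr
        }
        /\.members\.[0-9]+\.status / {
            n = split($1, part, /[.]/)
            id = part[n - 1]
            print "node id " id ": " ip[id] " " $NF
        }
    '
}
```

Running corosync-cmapctl | list_members on the cluster above would list both members with their addresses and the joined status.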
Setting up a virtual IP for the cluster
As mentioned in Chapter 1, Cluster Basics and Installation on CentOS 7, since a cluster is by definition a group of computers (which we have been referring to as nodes or members) that work together so that the set is seen as a single system from the outside, we need to ensure that end users and clients see it that way.
For this reason, the last thing that we will do in this chapter is configure a virtual IP, which is the address that external clients will use to connect to our cluster. Note that in an ordinary, non-cluster environment, you can use tools such as ifconfig to configure a virtual IP for your system.
However, in our case, we will use nothing more and nothing less than PCS and perform two operations at once:
Creating the IPv4 address
Assigning it to the cluster as a whole
Adding a virtual IP as a cluster resource
Since a virtual IP is what is called a cluster resource, we will use pcs resource help to look for information on how to create it. You will need, in advance, to pick an IP address that is not being used in your LAN to assign to the virtual IP resource. After the virtual IP is initialized, you can ping it as usual to confirm its availability.
To create the virtual IP named virtual_ip with the address 192.168.0.4/24, monitored every 30 seconds on enp0s3, run the following command on either node:
pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.4 cidr_netmask=24 nic=enp0s3 op monitor interval=30s
Up to this point, the virtual IP resource will show as stopped in the output of pcs cluster status or pcs status until a later stage when we will disable STONITH (which is a cluster feature that is explained in the next section).
Viewing the status of the virtual IP
To view the current status of cluster resources, use the following command:
pcs status resources
In case the newly created virtual IP is not started automatically, you will want to perform a more thorough check, including a verbose output of the configuration used by the running cluster as provided by crm_verify, a tool that is part of the pacemaker cluster resource manager:
[root@node01 ~]# crm_verify -L -V
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
[root@node01 ~]#
Note
STONITH, an acronym for Shoot The Other Node In The Head, represents a cluster feature that prevents nodes in a high-availability cluster from becoming active at the same time, and thus serving the same content.
As the preceding error message indicates, clusters with shared data need STONITH to ensure data integrity. However, we will defer the appropriate discussion of this feature to the next chapter, and we will disable it for the time being in order to be able to show how the virtual IP is started and becomes accessible. On the other hand, when crm_verify -L -V does not return any output, it means that the configuration is valid and free from errors.
Go ahead and disable STONITH but keep in mind that we will return to this in the next chapter:
pcs property set stonith-enabled=false
Next, check the cluster status again.
The resource should now show as started when you query the cluster status. You can check the resource availability by pinging it:
ping -c 4 192.168.0.4
If the ping operation returns a warning that some packets were not delivered to the destination, refer to /var/log/pacemaker.log or /var/log/cluster/corosync.log for information on what could have failed.
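Right after STONITH is disabled, the resource may need a moment to start, so a single ping can fail even though nothing is wrong. The following sketch retries the ping a limited number of times; the function name wait_for_ip and the default of 10 attempts are arbitrary choices for illustration, and the -W option (reply timeout in seconds) is the one provided by the Linux ping utility.

```shell
# wait_for_ip ADDRESS [TRIES]: ping ADDRESS once per attempt until it
# answers or TRIES attempts (default 10) have been made.
wait_for_ip() {
    addr=$1
    tries=${2:-10}
    i=1
    while [ "$i" -le "$tries" ]; do
        # -c 1: send one echo request; -W 1: wait at most one second for a reply
        if ping -c 1 -W 1 "$addr" >/dev/null 2>&1; then
            echo "$addr reachable after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "$addr still unreachable after $tries attempt(s)"
    return 1
}
```

With the address used in this chapter, the call would be wait_for_ip 192.168.0.4; a non-zero exit status is the cue to look at the log files mentioned above.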
Summary
In this chapter, you learned how to set up and configure the basic required network infrastructure and also the clustering components that we installed in Chapter 1, Cluster Basics and Installation on CentOS 7. Having reviewed the concepts associated with security, firewalling, and Internet protocols, we were able to add the firewall rules that will allow the communication of each node with each other and the proper operation of the clustering services on each box.
We will use the tools discussed in this chapter throughout the rest of this book, not only to check on the status of the cluster or the individual nodes, but also as a troubleshooting technique in case things don’t go as expected.
Chapter 3. A Closer Look at High Availability
In this chapter, we will look at the components of a high-availability cluster in greater detail than we were able to do initially during Chapter 1, Cluster Basics and Installation on CentOS 7; you may want to review that chapter in order to refresh your memory before proceeding further.
In this chapter, we will cover the following topics:
Failover: a primer on high availability and performance
Fencing: isolating the malfunctioning nodes
Split brain: preparing to avoid inconsistencies
Quorum: scoring inside your cluster
Configuring our cluster via PCS GUI
We will set out on this chapter by asking ourselves a few questions about how to achieve high availability, and we will attempt to get our answers as we go along. In the next chapter, we will set up actual real-life examples:
How can we ensure an automatic failover without the need for human intervention?
How many nodes are needed in a cluster in order to ensure high availability in several failure scenarios?
How do we consistently ensure data integrity and high availability when an offline node comes online again?
Overall, clusters can be classified into two main categories. For simplicity, we will use a cluster consisting of two nodes for the following definitions, but the concept can be easily extended to a cluster with a higher number of members:
Active/Active (A/A): In this type of cluster, all nodes are active at the same time. Thus, they are able to serve requests simultaneously and equally, each with independent workloads. When a failover is necessary, the remaining node is assigned an additional processing load, thus impacting the overall performance of the cluster negatively.
Active/Passive (A/P): In this type of cluster, there is an active node and a passive node. The former handles all traffic under normal circumstances, while the latter just sits idle waiting to enter the scene during a failover, when it actually takes over the situation by servicing requests using its own resources until the other node comes back online.
As you can infer from the last two paragraphs, an A/P cluster presents a clear advantage over A/A: in the event of a failover, the same percentage of hardware and software resources is made available to end users. This results in a constant performance level in a transparent way, which is especially desirable in database servers, where performance is a critical requirement. On the other hand, A/A clusters usually provide higher availability since at least two servers actively run applications and provide services to end users. In the next chapter, you will notice that we will initially set up an A/P cluster in detail and also provide the overall instructions to convert it into an A/A cluster if you wish to do so at a later stage.
Failover – an introduction to high availability and performance
The failover process can be roughly described as the action of switching, in the event of power or network failure, to an available resource to resume operations with as little downtime as possible, with no downtime being the primary goal of high availability clusters.
In Chapter 2, Installing Cluster Services and Configuring Network Components, we configured a simple but essential resource for our purposes: a virtual IP address. You will also recall that in order to start becoming acquainted with pcs (the command-line tool that is used as a front end to PCS, the configuration manager), we presented a brief introduction to its basic syntax and usage.
Tip
As in other cases in the Linux ecosystem, the program/protocol/package name is written in caps, while the tool or utility is written in lowercase. Thus, PCS is used to indicate the package name, and pcs is the command-line utility that is used to manage PCS.
With the pcs status command, we will be able to view the current status of the cluster and several important pieces of information, as shown in the following screenshot:
The following lines present the cluster resources that are currently available for MyCluster:
Full list of resources:
 virtual_ip (ocf::heartbeat:IPaddr2): Started node01
As indicated, the virtual IP address (conveniently named virtual_ip in Chapter 2, Installing Cluster Services and Configuring Network Components) is started on node01. Since the virtual IP is a cluster resource, it is to be expected that in case the node fails, an automatic failover of this resource is triggered to node02. We will simulate a node going offline due to a real issue by stopping both corosync and pacemaker on that cluster member.
For our current purposes, this simulation will not entail shutting down (powering off) the node because we want to show something interesting in the output of pcs status after stopping corosync and pacemaker in that node.
Tip
You can also simulate a failover by pausing one of the virtual machines in VirtualBox (select the VM option in Oracle VM VirtualBox Manager and press Ctrl + P or choose Pause from the Machine menu), and you can also do it by disabling the networking using the systemctl disable network command in that node.
Let’s stop pacemaker and corosync in node01:
pcs cluster stop node01
Then run pcs status again, this time on the other node, that is, node02:
pcs status
To view the current status of the cluster, its nodes, and resources, which is shown in the following screenshot, you will need to run pcs status on a node where the cluster is currently running:
There are a few lines from the preceding screenshot that are worth discussing.
The OFFLINE: [ node01 ] line indicates that node01 is offline, as far as the cluster as a whole is concerned, which is what we were expecting after stopping the cluster resource manager and the messaging services in that member. However, the following code indicates that the pcsd daemon, the remote configuration interface, is still running on node01, which makes it possible to still control pacemaker and corosync in that node, either locally or remotely from another node:
PCSD Status:
 node01: Online
Finally, the virtual_ip (ocf::heartbeat:IPaddr2): Started node02 line allows us to see that the failover of the virtual IP from node01 to node02 was performed automatically and without errors. If, for some reason, you run into errors while performing the virtual IP address failover, you will want to check the related logs for information as to what could have gone wrong.
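When repeating failover tests, it is convenient to extract just the node on which a resource is running. The following sketch assumes the one-line-per-resource format shown above (name, agent, state, node); the function name resource_node is hypothetical.

```shell
# resource_node NAME: read `pcs status resources` output on standard
# input and print the node on which resource NAME is started, if any.
resource_node() {
    awk -v name="$1" '$1 == name && $(NF - 1) == "Started" { print $NF }'
}
```

For example, pcs status resources | resource_node virtual_ip should print node01 before the simulated failure and node02 after it; it prints nothing while the resource is stopped.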
Forexample,let’sexamineacasewheretheclusterresourcedoesnothaveanothernodetofailoverto.Pictureascenariowherenode02isoffline(eitherbecauseyoupausedtheVMoractuallyshutitdown),andallofasudden,node01goesdownaswell(rememberthatwearetalkingabouttheclusteringservicesnotbeingavailableinsteadofanactualpowerornetworkoutage).Ofcourse,allofthishappensbehindthescenes—theonlythingthatyouknowrightnowisthatyouhaveuserscomplainingthattheycannotaccess
whateverapplication,resource,orserviceisbeingofferedfromyourcluster.
ThefirstthingyoumayfeelinclinedtotryistoseewhetherthevirtualIPaddressispingablefromwithinyournetwork(changetheIPaddressasperyourchoicewhileconfiguringtheresourceattheendofChapter2,InstallingClusterServicesandConfiguringNetworkComponents):
ping-c4192.168.0.4
Youwillnoticethatnoneofthefourpacketswasabletoreachitsintendeddestination:
PING 192.168.0.4 (192.168.0.4) 56(84) bytes of data.
From 192.168.0.2 icmp_seq=1 Destination Host Unreachable
From 192.168.0.2 icmp_seq=2 Destination Host Unreachable
From 192.168.0.2 icmp_seq=3 Destination Host Unreachable
From 192.168.0.2 icmp_seq=4 Destination Host Unreachable
--- 192.168.0.4 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3000ms
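If you want to script this check instead of reading the summary by eye, the packet-loss figure can be pulled out of the last line of the ping output. The following is only a minimal sketch: the function names ping_loss_percent and vip_unreachable are our own (they are not part of any cluster tool), and the parsing assumes the summary format shown above.

```shell
# Read ping output on stdin and print the packet-loss figure from the
# summary line (for example, 100 for "100% packet loss").
# ping_loss_percent is a hypothetical helper name, not a standard command.
ping_loss_percent() {
    sed -n 's/.* \([0-9][0-9.]*\)% packet loss.*/\1/p'
}

# Succeed (exit status 0) only when the summary on stdin reports that
# every packet was lost, that is, the virtual IP is unreachable.
vip_unreachable() {
    [ "$(ping_loss_percent)" = "100" ]
}
```

For example, ping -c 4 192.168.0.4 | ping_loss_percent would print 100 for the output shown above.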
For that reason, go to node01, where you first started the resource, and run pcs status to check on the node’s status:
Error: cluster is not currently running on this node
Then you see that the cluster is down on node01. But wasn’t the failover supposed to happen automatically? At this point, you have two options:
Go to node02 to check whether the cluster is running there.
Check the logs on node01. Note that this assumes that you shut down node02 and then node01. In any event, you want to check the log on the node that you shut down last.
A brief search for the keyword virtual_ip (or whatever name you set for the resource during the last stages of Chapter 2, Installing Cluster Services and Configuring Network Components) in /var/log/pacemaker.log on node01 tells you what the problem is. Here is a brief excerpt of the output of the grep virtual_ip /var/log/pacemaker.log command:
Mar 21 07:52:45 [3839] node01 pengine: info: native_print: virtual_ip (ocf::heartbeat:IPaddr2): Stopped
Mar 21 07:52:45 [3839] node01 pengine: info: native_color: Resource virtual_ip cannot run anywhere
Mar 21 07:52:45 [3839] node01 pengine: info: LogActions: Leave virtual_ip (Stopped)
The first message indicates that virtual_ip was stopped on node01, and the second message states that it could not be failed over anywhere. The result is that the resource is left as Stopped (as outlined in the third message) until it is manually re-enabled from either node in the cluster. However, remember that you need to start the cluster on such a node beforehand:
pcs cluster start node01
Then, run the following command on node01:
pcs resource enable virtual_ip
A further check with pcs status may or may not indicate that the resource is still stopped (it is a good idea to ping the virtual IP address here as well). If virtual_ip refuses to start, we can use the following command to obtain verbose information about why this particular resource is not being started properly, and then perform a reset of the cluster resource to make it reload its proper configuration:
pcs resource debug-start virtual_ip --full
Remember that pcs takes an option (not required) and a command as arguments, which may in turn be followed by specific options. In this regard, pcs cluster stop, where cluster is the command and stop represents a specific action of that command, can be used to shut down corosync and pacemaker on the local node, on all nodes, or on a specific node. In the following extract of man pcs, you can review the syntax of pcs cluster stop:
stop [--all] [node] [...]
Stop corosync and pacemaker on specified node(s), if a node is not specified then corosync and pacemaker are stopped on the local node. If --all is specified then corosync and pacemaker are stopped on all nodes.
Note: Remember that when corosync and pacemaker are running on both nodes, you can run any PCS command to configure the cluster from any of the nodes. In the event of a severe failure, where pcsd becomes unavailable on both nodes, you will have to resort to using SSH authentication from one node to the other to troubleshoot and fix issues.
As in other cases, log files are the best friends of system administrators, and they can play a key role in helping you find out what the root causes of issues are when they happen. There are three logs that you may want to check once in a while and even as you are performing a failover:
/var/log/pacemaker.log
/var/log/cluster/corosync.log
/var/log/pcsd/pcsd.log
Note: In addition, you can also search in the systemd log with journalctl -xn and use grep to filter a specific word or phrase.
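Since the same keyword often needs to be looked up in all three files, a small helper can save some typing. The sketch below is one possible approach rather than a standard tool: search_cluster_logs is a name we made up, and it simply runs grep over whichever of the log files listed above are readable on the current node.

```shell
# Search the cluster logs for a keyword, prefixing each match with the
# name of the file it came from.
# Usage: search_cluster_logs KEYWORD [LOGFILE...]
# search_cluster_logs is a hypothetical helper, not part of pcs.
search_cluster_logs() {
    local keyword=$1
    shift
    # With no files given, fall back to the three logs named in this chapter.
    if [ "$#" -eq 0 ]; then
        set -- /var/log/pacemaker.log /var/log/cluster/corosync.log /var/log/pcsd/pcsd.log
    fi
    local f
    for f in "$@"; do
        # Skip logs that are missing or unreadable on this node.
        [ -r "$f" ] || continue
        grep -H -- "$keyword" "$f"
    done
    return 0
}
```

Running search_cluster_logs virtual_ip on node01 would then cover the same ground as the grep command used earlier, plus the corosync and pcsd logs.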
Tip: You can reset the status of a cluster resource with the pcs resource disable <resource_name> and pcs resource enable <resource_name> commands.
Fencing – isolating the malfunctioning nodes
As the number of nodes in a cluster increases, its availability increases, but so does the chance of one of them failing at some point. This failure event, whether serious or not, suggests that we must secure a way to isolate the malfunctioning node from the cluster in order for it to fully release its processing tasks to the rest of the cluster. Think of what an erratic node can cause in a shared storage cluster: data corruption would inevitably occur. The word malfunctioning, in this context, means not only what it suggests in the typical usage of the English language (something that is not working properly), but also a node (including the resources started on it) whose state cannot be determined by the cluster for whatever reason.
This is where the term fencing comes into play. By definition, cluster fencing is the process of isolating, or separating, a node from using its resources or starting services, which it should not have access to, and from the rest of the nodes as well. One of the ABC rules of computer clustering can thus be formulated as: do not let a malfunctioning node run any cluster resources; fence it in all cases. In line with the last statement, an unresponsive node must be taken offline before another node takes over.
Fencing is performed using a mechanism known as STONITH, which we briefly introduced during the last chapter (in a few words, STONITH is a fencing method that is used to isolate a failed node in order to prevent it from causing problems in a cluster). You will recall that we disabled this feature at that point and mentioned that we would revisit the topic here. A quick inspection of the cluster’s configuration, as shown in the following screenshot, will confirm that STONITH is currently disabled:
Tip: If you run the pcs config command, you will be able to view the current configuration for the cluster in detail, which is illustrated in the preceding screenshot.
At the very end, the stonith-enabled: false line clearly reminds us that STONITH is disabled in our cluster.
You will want to add pcs config to the list of essential commands that you must keep in mind as we move forward with the cluster configuration. It will allow you to inspect, at a quick glance, the settings and resources made available through the cluster.
So, let’s begin by re-enabling STONITH:
pcs property set stonith-enabled=true
Next, check on the configuration again, either with the pcs config or pcs property list command. For brevity, in the case illustrated in the following screenshot, we use the pcs property list command in order to introduce you to yet another useful PCS command. Note how we check on this property before and after re-enabling STONITH:
Once we have enabled STONITH in our cluster, it is time to finally set up fencing by configuring a STONITH resource (also known as a STONITH device).
Installing and configuring a STONITH device
It is worth noting that a STONITH device is a cluster resource that will be used to bring down a malfunctioning or unresponsive node. Installing the following packages will make several STONITH devices available in our cluster. If you are setting up your 2-node cluster with two virtual machines, as suggested early in Chapter 1, Cluster Basics and Installation on CentOS 7, install them on both nodes:
yum update && yum install fence-agents-all fence-virt
Once the installation is complete, you can list all the available agents with the pcs stonith list command, as shown in the next screenshot.
Each of the listed devices in the following screenshot is described by several available parameters, which can be shown with pcs stonith describe agent, where you must replace agent with the corresponding name of the resource. Note that we will use these parameters when we actually configure the STONITH device in a later step. The required parameters are indicated by the word (required) at the beginning of the description. For example, the pcs stonith describe fence_ilo command returns the following output:
Stonith options for: fence_ilo
  ipaddr (required): IP Address or Hostname
  login (required): Login Name
  passwd: Login password or passphrase
  ssl: SSL connection
  notls: Disable TLS negotiation
  ribcl: Force ribcl version to use
  ipport: TCP/UDP port to use for connection with device
  inet4_only: Forces agent to use IPv4 addresses only
  inet6_only: Forces agent to use IPv6 addresses only
  passwd_script: Script to retrieve password
  ssl_secure: SSL connection with verifying fence device's certificate
  ssl_insecure: SSL connection without verifying fence device's certificate
  action (required): Fencing Action
  verbose: Verbose mode
  debug: Write debug information to given file
  version: Display version information and exit
  help: Display help and exit
  power_timeout: Test X seconds for status change after ON/OFF
  shell_timeout: Wait X seconds for cmd prompt after issuing command
  login_timeout: Wait X seconds for cmd prompt after login
  power_wait: Wait X seconds after issuing ON/OFF
  delay: Wait X seconds before fencing is started
  retry_on: Count of attempts to retry power on
  stonith-timeout: How long to wait for the STONITH action to complete per a stonith device.
  priority: The priority of the stonith resource. Devices are tried in order of highest priority to lowest.
  pcmk_host_map: A mapping of host names to ports numbers for devices that do not support host names.
  pcmk_host_list: A list of machines controlled by this device (Optional unless pcmk_host_check=static-list).
  pcmk_host_check: How to determine which machines are controlled by the device.
Among these parameters, you can see that there is an action (action) that should take place when a fencing event is going to happen, a host list that will be controlled by this device (pcmk_host_list), and a waiting time (timeout or stonith-timeout), that is, the time taken to wait for a fencing action to complete. These are essential pieces of information that you will need to take into account while specifying the STONITH options during the creation of the device and setting up your infrastructure.
The next step, which consists of creating the device itself, will largely depend on the hardware device that you have available. For example, if you want to fence a Hewlett-Packard node (such as a ProLiant server) with a built-in iLO interface, you would use the fence_ilo agent, or if your nodes are sitting on top of VMware virtualization, you may need to choose fence_vmware_soap. Another popular option is Dell with Dell Remote Access Controller (DRAC), for which you would use fence_drac5. Unfortunately, as of today, there is no out-of-the-box fencing device available for VirtualBox.
Tip: An iLO (Integrated Lights-Out) card is a separate interface with a separate network connection and IP address that allows a system administrator to perform certain operations on HP servers remotely via HTTPS. Similar functionality is available in Dell servers with built-in DRACs.
Let’s now create a STONITH fence_ilo device named Stonith_1, which can fence node01 (although we are showing this example using node01, note that this has to be done on a per-node basis):
pcs stonith create Stonith_1 fence_ilo pcmk_host_list="node01" action=reboot --force
The basic syntax to create a fencing device is as follows:
pcs stonith create stonith_device_name stonith_device_type stonith_device_options
You can view an explained list of stonith_device_options with man stonithd.
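Because a device has to be created for each node, it can help to generate the commands first and review them before running anything. The following sketch only prints the commands; it does not touch the cluster. The helper name print_stonith_commands, the iLO addresses, and the admin login are all hypothetical placeholders of ours, while ipaddr and login are the required parameters listed in the pcs stonith describe fence_ilo output (a real setup would also need the password, for example through passwd or passwd_script).

```shell
# Print (without running) one "pcs stonith create" command per node.
# Each argument is a "node=ilo_address" pair. The addresses and the
# login name are placeholders; replace them with your own values.
# print_stonith_commands is a hypothetical helper, not part of pcs.
print_stonith_commands() {
    local n=1 pair node ilo
    for pair in "$@"; do
        node=${pair%%=*}   # text before the first "="
        ilo=${pair#*=}     # text after the first "="
        echo "pcs stonith create Stonith_$n fence_ilo pcmk_host_list=\"$node\" ipaddr=$ilo login=admin action=reboot --force"
        n=$((n + 1))
    done
}
```

Calling print_stonith_commands node01=10.0.0.11 node02=10.0.0.12 prints two commands, one for Stonith_1 and one for Stonith_2, which you can inspect and then run by hand.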
To update the device, use the following command:
pcs stonith update stonith_device_name stonith_device_options
To delete the device, use the following command:
pcs stonith delete stonith_device_name
Finally, the pcs stonith show [stonith_device_name] --full command will display all the options used for [stonith_device_name], or for all fencing devices if [stonith_device_name] is not specified.
You can then simulate a fencing situation (note that this is done automatically behind the scenes under a real-life event) by stopping pacemaker and corosync on a node and then fencing it with the following commands:
pcs cluster stop node01 # Clean stop of the cluster on node01
pcs stonith fence node01 --off
Also, once you have verified that node01 is actually offline, you can confirm this to the cluster using the pcs stonith confirm node01 command.
Split-brain – preparing to avoid inconsistencies
Up to this point, we have considered a few essential concepts in clustering, leading to the following not completely fictitious scenario: what happens if a cluster is formed by nodes that are located in separate networks and the communication link between them goes down? The same applies when the nodes are in the same network and the link goes down as well. That is, none of the nodes have actually gone offline, but each appears to the other as unavailable. The default behavior would be that each node assumes that the other is down and continues serving whatever resources or applications the cluster was previously running.
So far, so good! Now, let’s say the network link comes back online but both nodes still think they are the main cluster member. That is where data corruption, at worst, or inconsistency, at best, occurs. This is caused by possible changes made to data on either side without having been replicated to the other end.
This is why configuring fencing is so important, as is ensuring redundant communication links between cluster members so that such a Single Point Of Failure (SPOF) does not end up causing a split-brain situation in our cluster.
As far as fencing is concerned, only the node that is marked as Designated Controller (DC) and also has quorum can fence the other nodes and run the applications and resources as master, or active, in our A/P cluster. By doing so, we ensure that the other node will not be allowed to take over resources that may lead to the data inconsistencies described earlier.
Quorum – scoring inside your cluster
In simple terms, the concept of quorum indicates the minimum number of members that are required to be active in order for the cluster, as a whole, to be available. Specifically, a cluster is said to have quorum when the number of active nodes is greater than the total number of nodes divided by two. Another way to express this is that quorum is achieved by at least a simple majority (50% of the total number of nodes + 1).
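The majority rule just described is easy to express as integer arithmetic. The following sketch uses two helper names of our own choosing (quorum_threshold and has_quorum); it only illustrates the general rule and deliberately ignores special cases such as corosync's two_node setting.

```shell
# Minimum number of active nodes needed for quorum in a cluster of N
# nodes: a strict majority, that is, floor(N / 2) + 1.
# Both function names are hypothetical, chosen for this illustration.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Succeed when ACTIVE nodes out of TOTAL are enough to hold quorum.
# Usage: has_quorum ACTIVE TOTAL
has_quorum() {
    local active=$1 total=$2
    [ "$active" -ge "$(quorum_threshold "$total")" ]
}
```

For a 4-node cluster, quorum_threshold 4 prints 3, so has_quorum 3 4 succeeds while has_quorum 2 4 fails.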
Although the concept of quorum doesn’t prevent a split-brain scenario, it will decide which node (or group of nodes) is dominant and allowed to run the cluster so that when a split-brain situation occurs, only one node (or group of nodes) will be able to run the cluster services.
By default, when the cluster does not have quorum, pacemaker will stop all resources altogether so that they will not be started on more nodes than desired. However, a cluster member will still listen for other nodes to reappear on the network, but they will not work as a cluster until quorum exists again.
You can easily confirm this behavior by stopping the cluster on node01 and node02 and then restarting it again. You will notice that virtual_ip remains stopped:
Full list of resources:
virtual_ip (ocf::heartbeat:IPaddr2): Stopped
It stays that way until you enable it manually using the following command:
pcs resource enable virtual_ip
For a 2-node cluster, as is our case, when we used pcs cluster setup in Chapter 2, Installing Cluster Services and Configuring Network Components, the following section was added to /etc/corosync/corosync.conf for us:
quorum {
    provider: corosync_votequorum
    two_node: 1
}
The two_node: 1 line tells corosync that in a 2-node cluster, one member is enough to hold up the quorum. Thus, even though some people would argue that a 2-node cluster is pointless, our cluster will continue working as long as at least one of the nodes is online. Perhaps you already noticed this while stopping and starting the cluster on one node previously, but it is worth pointing out that when trying to stop one of the members in our 2-node cluster, you will be asked to use the --force option:
pcs cluster stop node01 --force
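If you want to verify from a script that this setting is present on a node, a simple pattern match on the configuration file is enough. This is a minimal sketch under our own assumptions: two_node_enabled is a made-up helper name, and it only looks for a line of the form two_node: 1 rather than parsing the file properly.

```shell
# Succeed when the given corosync configuration file contains a
# "two_node: 1" line. Defaults to the path used in this chapter.
# two_node_enabled is a hypothetical helper name.
two_node_enabled() {
    local conf=${1:-/etc/corosync/corosync.conf}
    grep -Eq '^[[:space:]]*two_node:[[:space:]]*1[[:space:]]*$' "$conf"
}
```

You could then write something like two_node_enabled && echo "two_node is set" on either node.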
To display the current list of nodes in the cluster and their individual contributions toward cluster quorum (which is shown in the following figure under the Votes column), run the corosync-quorumtool -l command:
In a prospective split-brain situation, as described earlier, and supposing that the cluster is divided into two partitions, the partition with a majority of votes remains available, while the other is fenced automatically by the DC if STONITH has been put in place and properly configured. For example, in a 4-node cluster, quorum is established when at least three cluster nodes are functioning. Otherwise, the cluster no longer has quorum and pacemaker will stop the services run by the cluster.
Configuring our cluster with PCS GUI
If you followed the steps outlined in Chapter 2, Installing Cluster Services and Configuring Network Components, to enable the hacluster account for cluster administration, you can also use the PCS GUI, a cluster management web interface, to manage clusters. This includes the ability to add, remove, and edit existing clusters.
To navigate to the PCS web interface, go to https://<ip_of_one_node>:2224 (note that it’s https and not http), accept the security exceptions, and then log in using the credentials that were previously set for hacluster, as shown in the following screenshot:
The next screen that you will see (which is shown in the following screenshot) will present the menus to remove an existing cluster, add an existing cluster, or create a new one. When you click on the Add Existing button, you will be prompted to enter the hostname or IP address of a node that currently belongs to an existing cluster that you want to manage using the web UI:
Then, click on the cluster name and feel free to browse through the menu at the top of the following figure, which also serves the purpose of letting us add, remove, or edit the resources that we have been talking about so far:
Summary
In this chapter, we explored situations of node failures and essential techniques for dealing with malfunctioning cluster members, along with some essential cluster concepts in greater depth. In addition to this, we saw how to add cluster resources in order to further configure our newly created cluster for a real-world use case, which we will deal with during the next chapter.
It is also worth reiterating that there are certain hardware components that we have not been able to discuss in detail, such as fencing devices, and you should take note of the fencing agents and devices (as per pcs stonith list) and see if any of them applies to the available hardware in your case.
Last but not least, you need to remember that in order to avoid split-brain situations, besides thoroughly applying the concepts outlined in the present chapter, you also need to ensure redundant communication links between the networks where the nodes are located. This will help you prevent a Single Point Of Failure (SPOF) from causing such an unwanted event.
Chapter 4. Real-world Implementations of Clustering
In this chapter, you will learn how to use your cluster in real-life scenarios by deploying a web server and a database server. Before we do this, we will need to review some fundamental concepts related to these key components, configure replicated storage so that files are kept in sync between nodes, and then finally, populate our database with sample data, which we will then query using a simple PHP application.
Since the programming side of things is out of the scope of this book, feel free to use some other programming language of your choice if you want to do so. I have chosen PHP for simplicity. Keep in mind that this book is not aimed at teaching you how to build web-based applications for use in a CentOS 7 cluster, but rather how to use it in order to provide high availability for those applications.
During the course of this chapter, you will notice that we will rely on the concepts introduced and the services configured in previous chapters as we dive into taking advantage of the cluster architecture that we have already put in place.
Setting up storage
When we started discussing the fundamental concepts of clustering, we mentioned that high availability clusters aim, in simple terms, to minimize downtime of services by providing failover capabilities. As we begin the journey of installing a web server and a database server in our cluster, we can’t help but wonder how we will synchronize between nodes the content that those services should make available to us. We need to find a way for nodes to share a piece of common storage where data will be saved. If one node fails to provide access to it, the other node will take client requests from then on.
In Linux, a common and cost-free method of dealing with this question is an open source technology known as Distributed Replicated Block Device (DRBD), which makes it possible to mirror or replicate individual storage devices (such as hard disks or partitions) from one node to the other(s) over a network connection. In a somewhat high-level explanation, you can think of the functionality offered by DRBD as a network-based RAID-1. Its basic structure and data flow are illustrated in the following figure:
Tip: All replicated data sets, such as a shared storage device, are called a resource in DRBD and should not be confused with a PCS resource, as discussed in previous chapters.
In order to install DRBD, you will need to enable the ELRepo repository on both nodes, because this software package is not distributed through the standard CentOS repositories. A brief explanation of the purpose and contents of the ELRepo repository is given later in this section. To enable it, follow these steps:
1. The first step consists of importing the GPG key that is used to sign the rpm package, which represents the foundation of the repository. Should you try to install the package using rpm before importing the key, the installation will fail as a security measure.
2. Run the following commands on both nodes:
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
3. You can verify that ELRepo has been added to your configured repositories with the following command:
yum repolist | grep elrepo
The output should be similar to the one shown in the following screenshot:
Tip: Alternatively, you can explicitly disable ELRepo after installing the rpm package that adds it to your system and enable it only to install the necessary packages (as a precaution, make sure you make a copy of the original repository configuration file first):
cp /etc/yum.repos.d/elrepo.repo /etc/yum.repos.d/elrepo.repo.ORG
sed -i "s/enabled=1/enabled=0/g" /etc/yum.repos.d/elrepo.repo
yum --enablerepo elrepo update
yum --enablerepo elrepo install -y drbd84-utils kmod-drbd84
4. Then, use the following command:
yum update && yum install drbd84-utils kmod-drbd84
It will install the necessary management utilities, along with the corresponding kernel module for DRBD. Once this process is complete, you will need to check whether the module is loaded, using this command:
lsmod | grep -i drbd
If it is not loaded automatically, you can load the module into the kernel on both nodes, as follows:
modprobe drbd
Note: The modprobe command will take care of loading the kernel module for your current session only. In order for it to be loaded during boot, you have to make use of the systemd-modules-load service by creating a file inside /etc/modules-load.d/ so that the DRBD module is loaded properly each time the system boots:
echo drbd > /etc/modules-load.d/drbd.conf
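If you provision several nodes, you may prefer a version of this step that can be run more than once without side effects. The following sketch is one way to do it; persist_module is a helper name of our own, and the optional second argument (a target directory) exists only so that the function can be tried out without touching /etc.

```shell
# Make sure MODULE is listed in a modules-load.d style file so that it
# is loaded at boot. DIR defaults to /etc/modules-load.d; pass another
# directory to try the function without modifying the system.
# persist_module is a hypothetical helper name.
persist_module() {
    local module=$1 dir=${2:-/etc/modules-load.d}
    local file="$dir/$module.conf"
    # Do nothing if the module is already listed, so repeated runs are harmless.
    if [ -f "$file" ] && grep -qx -- "$module" "$file"; then
        return 0
    fi
    echo "$module" >> "$file"
}
```

Running persist_module drbd as root on each node has the same effect as the echo command above, but leaves an existing drbd.conf untouched.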
ELRepo repository and DRBD availability
ELRepo is a community repository for Linux distributions that are compatible with Red Hat Enterprise Linux, of which CentOS and Scientific Linux are derivatives. ELRepo has hardware-related packages (especially drivers) as its primary focus in order to enhance or provide functionality that is not present in the current kernel. Thus, by installing the corresponding package, you save yourself the pain of having to recompile the kernel only to add a certain feature, or having to wait for it to be supported by upstream repositories, or for the feature to be included in a later kernel release. The ELRepo repository is maintained by active members of the related distributions (RHEL, CentOS, and Scientific Linux).
DRBD, as made available by ELRepo, is intended primarily for evaluating and getting experience with DRBD on RHEL-based platforms, but is not officially supported by Red Hat or LINBIT, the creators of DRBD. However, by following the procedures outlined in this chapter and throughout the rest of this book, you can ensure that all of the necessary functionality will be available in your cluster.
Once we have installed the packages mentioned earlier, we need to allocate the physical space that will be used to store the replicated contents on both servers. With scalability in mind, we will use the Logical Volume Manager (LVM) technology to create dynamic hard disk partitions that are easily resizable down the road if we need to.
To begin with, we will add a 2 GB hard disk to each node. The purpose of this hard disk is to serve as the underlying filesystem for a PHP application accessed by the Apache web server.
I chose this size because it will be enough to store all the necessary files to be replicated, and because VirtualBox allows you to pick arbitrary sizes for storage disks. If you happen to be using real hardware as you follow along with this book, you may want to choose a different size accordingly.
To add a virtual hard disk to an existing virtual machine in VirtualBox, follow these steps:
1. Turn off the VM
2. Right-click on it in VirtualBox’s initial screen
3. From the contextual menu, choose Settings and then Storage
4. Select Controller: SATA, click on Add hard disk, and then click on Create new disk
5. Choose Virtual Disk Image (VDI) and Dynamically Allocated and proceed to the next step
6. Finally, assign a name for the device and choose 2 GB as size
After starting and booting up each node, we should issue the following command in order to identify the newly added disk (the new disk will be, in our case, the one that is not partitioned yet):
ls -l /dev | grep -Ei 'sd[a-z]'
We can then inspect the kernel messages for the newly added disk with the following command:
dmesg | grep sdb
Here, /dev/sdb is the new disk, as returned by listing the contents of the /dev directory earlier:
[root@node02 ~]# dmesg | grep sdb
[    2.484257] sd 3:0:0:0: [sdb] 4194304 512-byte logical blocks: (2.14 GB/2.00 GiB)
[    2.484258] sd 3:0:0:0: [sdb] Write Protect is off
[    2.484258] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    2.484258] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    2.487361] sdb: unknown partition table
[    2.498564] sd 3:0:0:0: [sdb] Attached SCSI disk
[root@node02 ~]#
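The first line of that output is a quick way to confirm that the disk has the size we asked for: 4194304 blocks of 512 bytes each make 2147483648 bytes, which is exactly 2 GiB. The arithmetic can be checked with two small functions; their names (blocks_to_bytes and bytes_to_gib) are our own and are used here only for illustration.

```shell
# Convert a number of logical blocks into bytes.
# Usage: blocks_to_bytes BLOCKS [BLOCK_SIZE]   (BLOCK_SIZE defaults to 512)
blocks_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# Convert bytes into whole GiB (integer division; 1 GiB = 1024^3 bytes).
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}
```

With the values reported by dmesg, blocks_to_bytes 4194304 prints 2147483648, and bytes_to_gib 2147483648 prints 2.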
Now, let’s create a partition on the disk, the corresponding physical volume, a volume group (drbd_vg), and finally, a logical volume (drbd_vol) on top. Make sure you repeat these steps on each node, changing the device (/dev/sdX) as needed:
parted /dev/sdb mklabel msdos
parted /dev/sdb mkpart p 0% 100%
pvcreate /dev/sdb1
vgcreate drbd_vg /dev/sdb1
lvcreate -n drbd_vol -l 100%FREE drbd_vg
Note: You can check the status of the newly created logical volume with lvdisplay /dev/drbd_vg/drbd_vol.
Configuring DRBD
After having successfully created and partitioned our DRBD disks on each node, we can turn to the main configuration file for DRBD, which is located at /etc/drbd.conf and consists only of the following two lines:
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
Both lines include the paths, relative to /etc/, of the actual configuration files. In the global_common.conf file, you will find the global settings for your DRBD installation, along with the common section (which defines those settings that should be inherited by every resource) of the DRBD configuration. On the other hand, in the .res files, you will find the specific configuration for each DRBD resource.
We will now rename the existing global_common.conf file to global_common.conf.orig (as a backup copy of the original settings) with the following command:
mv /etc/drbd.d/global_common.conf /etc/drbd.d/global_common.conf.orig
We will then create a new global_common.conf file with the following contents, using your preferred text editor:
global {
    usage-count no;
}
common {
    net {
        protocol C;
    }
}
Once you have created the preceding file on one node (say, node01), you can easily copy it to the other node, as follows:
ssh node02 mv /etc/drbd.d/global_common.conf /etc/drbd.d/global_common.conf.orig
scp /etc/drbd.d/global_common.conf node02:/etc/drbd.d/
Note: You should make it a habit to make backup copies of the original configuration files so that you can roll back to previous settings, should something go wrong at any time.
The usage-count no line in the global section skips sending a notice to the DRBD team each time a new version of the software is installed in your system. You could change it to yes if you want to submit information from your system. Alternatively, you could change it to ask if you want to be prompted for a decision each time you do an upgrade. Either way, you should know that they use this information for statistical analysis only, and their reports are always available to the public at http://usage.drbd.org/cgi-bin/show_usage.pl.
The protocol C line tells the DRBD resource to use fully synchronous replication, which means that local write operations on the node that is functioning as primary are considered completed only after both the local and remote disk writes have been confirmed. Thus, the loss of a single node should not lead to any data loss under normal circumstances, unless both nodes (or their storage subsystems) are irreversibly destroyed at the same time.
Next, we will need to create a specific new configuration file (called /etc/drbd.d/drbd0.res) for our resource, which we will name drbd0, with the following contents (where 192.168.0.2 and 192.168.0.3 are the IP addresses of our two nodes, and 7789 is the port used for communication):
resource drbd0 {
    disk /dev/drbd_vg/drbd_vol;
    device /dev/drbd0;
    meta-disk internal;
    on node01 {
        address 192.168.0.2:7789;
    }
    on node02 {
        address 192.168.0.3:7789;
    }
}
Note: You can look up the meaning of each directive (and the rest as well) in the resource configuration file at LINBIT’s website at http://drbd.linbit.com/users-guide-8.4/.
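If your host names, addresses, or backing devices differ from ours, it may be less error-prone to generate the resource file than to edit it by hand. The following is a minimal sketch: drbd_resource is a helper name of our own invention, and it simply fills the values you pass into the same layout used for drbd0.res above.

```shell
# Print a two-node DRBD resource definition in the layout shown above.
# Usage: drbd_resource NAME BACKING_DISK MINOR PORT HOST1 IP1 HOST2 IP2
# drbd_resource is a hypothetical helper, not part of the DRBD tools.
drbd_resource() {
    local name=$1 disk=$2 minor=$3 port=$4
    cat <<EOF
resource $name {
    disk $disk;
    device /dev/drbd$minor;
    meta-disk internal;
    on $5 {
        address $6:$port;
    }
    on $7 {
        address $8:$port;
    }
}
EOF
}
```

For example, drbd_resource drbd0 /dev/drbd_vg/drbd_vol 0 7789 node01 192.168.0.2 node02 192.168.0.3 reproduces the file above; redirect the output to /etc/drbd.d/drbd0.res once you are happy with it.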
TCP port 7789 is the typical port number used in most DRBD installations. However, the official documentation states that DRBD (by convention) uses TCP ports from 7788 upwards, with every resource listening on a separate port. In this chapter, since we are dealing with only one resource, we will only use port 7789, both in the only resource configuration file and in the firewall settings on both nodes. It is essential that you remember to open this port in the firewall, because otherwise, the resources will not be able to synchronize later.
To open the 7789 TCP port in the firewall configuration, execute the following commands on both nodes:
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 7789 -j ACCEPT
service iptables save
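If your nodes run firewalld (the default firewall front end on CentOS 7) instead of the iptables service, the equivalent step would use firewall-cmd. The sketch below assumes firewalld is the active firewall, which may not match your setup. It only prints the commands, one port per DRBD resource starting at the port you give it; drbd_firewalld_commands is a helper name we made up.

```shell
# Print the firewalld commands that open one TCP port per DRBD resource,
# starting at FIRST_PORT, followed by a reload to apply them.
# Usage: drbd_firewalld_commands FIRST_PORT [COUNT]   (COUNT defaults to 1)
# drbd_firewalld_commands is a hypothetical helper name.
drbd_firewalld_commands() {
    local first_port=$1 count=${2:-1} i
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "firewall-cmd --permanent --add-port=$((first_port + i))/tcp"
        i=$((i + 1))
    done
    echo "firewall-cmd --reload"
}
```

For our single resource, drbd_firewalld_commands 7789 prints the two commands to run as root on both nodes.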
You can now copy the resource configuration file (/etc/drbd.d/drbd0.res) to the other node, as follows:
scp /etc/drbd.d/drbd0.res node02:/etc/drbd.d/
When we installed DRBD earlier, a utility called drbdadm was installed as well, which, as you will be able to guess from its name, is intended to be used for the administration of DRBD resources, such as our newly configured volume. The first step in starting and bringing a DRBD resource online is to initialize its metadata (you may need to change the resource name if you set a different name in the configuration file previously). Note that the /var/lib/drbd directory is needed beforehand. If it was not created previously when you installed DRBD, create it manually before proceeding, using the following commands:
mkdir /var/lib/drbd
drbdadm create-md drbd0
These commands should result in the following output, with the corresponding confirmation message that indicates a successful creation of the metadata for the device:
Note: The word “metadata” has been defined as data about the data. In the context of DRBD resources, the metadata of a resource consists of several pieces of information about the device and the data that is kept in it. The drbdadm create-md [drbd resource] command will return useful debugging information if something does not work as expected.
The next step consists of enabling drbd0 in order to finish allocating both disk and network resources for its operation:
drbdadm up drbd0
You can verify the status of the resource by taking a look at the /proc virtual filesystem, which allows you to view the system’s resources as the kernel sees them, as you can see in the following screenshot. However, make sure you have followed the instructions outlined earlier on both nodes:
cat /proc/drbd
Take a look at the following screenshot:
Note that the status of the device shows as unknown and inconsistent since we haven’t indicated yet which of the DRBD devices (one in each node) will act as the primary device and which one as the secondary device. At this point, given our current scenario where we have set up two DRBD devices from scratch, it does not matter which one you choose to be primary. However, if we had used one device with data already residing in it, it would be crucial to select that device as the primary resource. Otherwise, you run the serious risk of losing your data.
Run this command in order to mark one device as primary and to perform the initial synchronization. You only need to do this on the node that has the primary resource (in our example, this means node01):
drbdadm primary --force drbd0
As you did earlier, you can check the current status of the synchronization while it’s being performed. The cat /proc/drbd command displays the creation and synchronization progress of the resource, as shown here:
Now, with the help of the drbd-overview command, as its name implies, you can see an overview of the currently configured DRBD resources. In this case, you should see that node01 is acting as primary and node02 as secondary, as indicated by running the command on both nodes (which can also be seen in the following screenshot):
On node01, the drbd-overview command should return:
 0:drbd0/0  Connected Primary/Secondary UpToDate/UpToDate
Whereas on node02 you should see:
 0:drbd0/0  Connected Secondary/Primary UpToDate/UpToDate
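When a script needs to know whether the local node is currently primary, the role can be extracted from this output. The following sketch assumes the line layout shown above, where the third field holds the local and peer roles separated by a slash; drbd_local_role is a helper name of our own.

```shell
# Read drbd-overview output on stdin and print the local role
# (for example, Primary or Secondary) of the resource named RESOURCE.
# Assumes the layout shown above: "minor:resource/volume state local/peer ...".
# drbd_local_role is a hypothetical helper name.
drbd_local_role() {
    awk -v res="$1" '
        {
            # Split "0:drbd0/0" into minor, resource name, and volume.
            split($1, id, "[:/]")
            if (id[2] == res) {
                # The third field is "local/peer"; keep the local part.
                split($3, roles, "/")
                print roles[1]
            }
        }'
}
```

On node02, drbd-overview | drbd_local_role drbd0 would print Secondary for the output shown above.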
Finally, we need to create a filesystem on /dev/drbd0 on node01. You can choose whatever suits your needs or requirements, if any. Ext4 is a good choice if you have not decided which one to use. XFS is the default filesystem for CentOS 7 out of the box. However, an XFS filesystem cannot be shrunk, which may become a limitation if we need to resize it at a later time, should we run into a more complex setup for the underlying storage needed for the operation of the web and database servers.
Run the following command on the primary node to create an ext4 filesystem on /dev/drbd0 and wait until it completes, as shown in the following screenshot:
mkfs.ext4 /dev/drbd0
Now, your DRBD resource is ready to be used as usual. You can now mount it and start saving files to it. However, we still need to add it as a cluster resource before we can start using it as a highly available and fail-safe component. This is what we will do in the next section.
It is very important that you create the filesystem on the resource from node01, our primary node. Otherwise, you will run into a mounting issue that is caused when you try to add a filesystem from a node that is not the primary member of the cluster.
Adding DRBD as a PCS cluster resource
You will recall how in Chapter 2, Installing Cluster Services and Configuring Network Components, we added a virtual IP address to the cluster. Now, it’s time to do the same with the DRBD resource that we have just created and configured.
Before doing that, however, we must point out that one of the most distinguishing features of the PCS command-line tool that we first introduced back in Chapter 2, Installing Cluster Services and Configuring Network Components, is its ability to save the current cluster configuration to a file, to which you can add further settings using command-line tools. Then, you can use the resulting file to update the running cluster configuration.
Before retrieving the cluster configuration from the Cluster Information Base (CIB) and saving it to a file named drbd0_conf in the current working directory, make sure that the cluster is started, using the following command:
pcs cluster start --all
Then save the cluster configuration to the file mentioned earlier (drbd0_conf will be created automatically):
pcs cluster cib drbd0_conf
Next, we will add the DRBD device as a PCS cluster resource. Note the -f switch, which indicates that the changes resulting from the following command should be written to the drbd0_conf file instead of being applied to the running cluster. The following command must be executed from the same directory as the previous command (meaning the directory where the drbd0_conf file is located):
pcs-fdrbd0_confresourcecreateweb_drbdocf:linbit:drbd
drbd_resource=drbd0opmonitorinterval=60s
Finally, we need to make sure that the resource will run on both nodes simultaneously by adding a master/slave resource (a special type of clone resource that is active on multiple hosts at the same time, with one instance promoted to master) for that purpose:
pcs -f drbd0_conf resource master web_drbd_clone web_drbd master-max=1 \
    master-node-max=1 clone-max=2 clone-node-max=1 notify=true
At this point, we can update the cluster configuration using the drbd0_conf file. However, a quick inspection of the cluster status and its resources will allow us to better visualize the changes if we run the pcs status command before and after updating the global configuration, in that order:
pcs status
pcs cluster cib-push drbd0_conf
The last command should result in the following message if the update was successful:
CIB updated
Now, let’s check the current cluster configuration again:
pcs status
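The save, edit, and push sequence described above can be wrapped in a small helper. The following is only a sketch: the run and stage_drbd_resource function names and the DRY_RUN variable are illustrative assumptions, not part of PCS. With DRY_RUN=1 (the default here), the commands are printed instead of executed, which lets you review them before touching the cluster.

```shell
#!/bin/bash
# Sketch: stage changes in an offline CIB file, then push them in one step.
# DRY_RUN and both function names are hypothetical helpers, not PCS features.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stage_drbd_resource() {
    local cib_file="$1"   # offline copy of the CIB, for example drbd0_conf
    run pcs cluster cib "$cib_file"
    run pcs -f "$cib_file" resource create web_drbd ocf:linbit:drbd \
        drbd_resource=drbd0 op monitor interval=60s
    run pcs -f "$cib_file" resource master web_drbd_clone web_drbd \
        master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
    run pcs cluster cib-push "$cib_file"
}
```

Calling stage_drbd_resource drbd0_conf prints the four commands in order; setting DRY_RUN=0 on a cluster node would run them for real.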
If the last pcs status output indicates a failure event (most likely related to SELinux policies and less likely to regular file permissions), you should inspect the /var/log/audit/audit.log file to start your troubleshooting. Lines starting with type=AVC will point out the places where you need to look first. Here is an example:
type=AVC msg=audit(1429116572.153:295): avc: denied { read write } for pid=24192 comm="drbdsetup-84" name="drbd-147-0" dev="tmpfs" ino=20373 scontext=system_u:system_r:drbd_t:s0 tcontext=unconfined_u:object_r:var_lock_t:s0 tclass=file
The preceding error message seems to indicate that SELinux is denying the drbdsetup-84 executable read/write access to a file on the temporary tmpfs filesystem. Its corresponding denied system call supports this theory:
type=SYSCALL msg=audit(1429116572.153:295): arch=c000003e syscall=2 success=no exit=-13 a0=125e080 a1=42 a2=180 a3=7fff42b39f80 items=0 ppid=24191 pid=24192 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="drbdsetup-84" exe="/usr/lib/drbd/drbdsetup-84" subj=system_u:system_r:drbd_t:s0 key=(null)
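When the audit log contains many denials, a one-line summary per AVC record makes it easier to see which executable was blocked and on what kind of object. The following is a minimal sketch; the summarize_avc function name is an assumption introduced for illustration, and it only relies on the comm, tclass, and denied { ... } fields visible in the record above.

```shell
#!/bin/bash
# Sketch: print "<command>: <denied permissions> (<object class>)" for
# every AVC record in an audit log. summarize_avc is a hypothetical helper.
summarize_avc() {
    local log_file="$1"   # for example /var/log/audit/audit.log
    grep 'type=AVC' "$log_file" | sed -n \
        's/.*denied  *{ *\([^}]*[^ }]\) *}.*comm="\([^"]*\)".*tclass=\([a-z_]*\).*/\2: \1 (\3)/p'
}
```

For example, summarize_avc /var/log/audit/audit.log | sort | uniq -c would group identical denials together.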
Note
NSA Security-Enhanced Linux (SELinux) is an implementation of a flexible mandatory access control architecture in Linux. If you experience several issues with it at first, you can disable it to perform the following steps (but it is strongly recommended that you don’t). If you choose to disable SELinux by editing /etc/sysconfig/selinux, do not forget to clean the resource error count with pcs resource cleanup [resource_id], where resource_id is the name of the resource as returned by pcs resource show.
To clear all doubts, install the policycoreutils-python package (which contains the management tools used to manage an SELinux environment):
yum update && yum install policycoreutils-python
Use the audit2allow utility included in it to view the reason why access was denied in human-readable form and then generate an SELinux policy allow rule based on the logs of denied operations. The following command will output the last line in the audit.log file where the word AVC appears and then pipe it to audit2allow to produce the result in human-readable form (the -a option is omitted because it makes audit2allow read the whole audit log instead of the piped line):
cat /var/log/audit/audit.log | grep AVC | tail -1 | audit2allow -w
As shown in the following screenshot, we can confirm that access was denied due to a missing type enforcement rule:
Now that we know what is causing the problem, let’s create a policy package in order to implement the necessary type enforcement rule in a module whose name is specified in the command line:
cat /var/log/audit/audit.log | grep AVC | tail -1 | audit2allow -M drbd0_access_0
If you do ls -l in your current working directory, you will find that the preceding command created a type enforcement file (drbd0_access_0.te) and compiled it into a policy package (drbd0_access_0.pp), which you will need to activate with the following command:
semodule -i drbd0_access_0.pp
The preceding command can take about a minute to complete, so do not worry if this is the case for you. As you can see in the following screenshot, no output means a successful operation:
Now, we need to copy the module to node02 and install it there. This is one of the reasons why we set up key-based authentication between nodes in Chapter 1, Cluster Basics and Installation on CentOS 7:
scp drbd0_access_0.pp node02:~
Then, run the following command on node02:
semodule -i drbd0_access_0.pp
Alternatively, you can execute the following command on node01:
ssh node02 semodule -i drbd0_access_0.pp
In addition, the SELinux daemons_enable_cluster_mode boolean should be set to true on both nodes:
setsebool -P daemons_enable_cluster_mode 1
You may need to repeat this process more than once if the output of pcs status shows further errors. If you find that you have to repeat it several times, you may want to consider setting SELinux to permissive so that it will still issue warnings instead of blocking the cluster resource. Then, you can continue with the setup for the time being and debug later.
We can see that both nodes are online, and the cluster resources are properly started, as shown here:
Now, let’s give DRBD a rest for a brief moment and focus on the installation of the web and database servers. Note that we will revisit this topic in Chapter 5, Monitoring the Cluster Health, where we will simulate and troubleshoot the most common issues that may come up during the cluster operation. Also note that if you reboot one or both nodes, they may detect a split-brain situation at this point, which we will fix manually later in this chapter (as that is the method that is recommended by LINBIT).
Installing the web and database servers
As of the time of writing this book, the Apache HTTP server (or just Apache for short) remains the world’s most widely used web server and is often used within what is called a LAMP stack. In this stack, a Linux distribution is used as the operating system, Apache as the web server, MySQL/MariaDB as the database server, and PHP as the server-side programming language for applications. Each one of these components is free, and these technologies are widespread and thus easy to learn and get help on.
To install the Apache and MariaDB (a free and open source fork of MySQL) servers, run the following commands on each node. Note that this will install PHP as well:
yum update && yum install httpd mariadb mariadb-server php
Upon successful installation, we will proceed as we did earlier. To begin, let’s enable and start the web server on both nodes:
systemctl enable httpd
systemctl start httpd
Don’t forget to make sure that Apache is running:
systemctl status httpd
Allow traffic through TCP port 80 in the firewall:
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
service iptables save
At this point, you can fire up a web browser and point it to the individual IP addresses of the nodes (remember that we haven’t added Apache as a cluster resource, and thus, we can’t access the web server on the virtual IP that is common to both nodes). You should see Apache’s welcome page, as shown in the following figure, where we can see that the web server is running correctly on node02 (192.168.0.3 as per our initial setup):
Now, it is time to take a small step back. We will disable and stop Apache on both nodes so that, from now on, the cluster can manage it through PCS:
systemctl disable httpd
systemctl stop httpd
In order for Apache to listen on the virtual IP (to which we assigned 192.168.0.4 as the IP address) and the loopback address (we will see why in just a minute), we need to modify the main configuration file (/etc/httpd/conf/httpd.conf), as follows (you may want to make a backup of this file first):
# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, instead of the default. See also the <VirtualHost>
# directive.
#
# Change this to Listen on specific IP addresses as shown below to
# prevent Apache from glomming onto all bound IP addresses.
#
#Listen 12.34.56.78:80
Listen 192.168.0.4:80
Listen 127.0.0.1:80
Then, restart Apache:
systemctl restart httpd
Note that while restarting the web server on the second node, an error is to be expected, since the virtual IP address is active on only one node at a time and Apache cannot bind to it anywhere else. However, that is normal, and now, you should be able to access the Apache welcome page by pointing your browser to the virtual IP.
The fun part is finding out on which node the virtual IP was started, as shown in the following screenshot. If you get an error here instead, make sure virtual_ip is started by PCS first:
pcs status | grep virtual_ip
Now, let’s stop the cluster on that node (with pcs cluster stop) and then query the resource from the other node, using the following command:
pcs resource show virtual_ip
The other node should still indicate that the resource is active.
However, even when the virtual IP fails over to node02, the web server is not accessible through that resource because it wasn’t started there in the first place. For this reason, we still need to configure Apache as a cluster resource so that it can be managed as such.
Configuring the web server as a cluster resource
You will recall from when we configured the virtual IP in Chapter 2, Installing Cluster Services and Configuring Network Components, and when we added replicated storage earlier in this chapter, that we must indicate a way for PCS to check on a periodic basis whether the resource is available or not.
In this case, we will use the server status page (http://node0[1-2]/server-status), which is the preferred Apache web page for this purpose, as it provides information about how well the server is performing. PCS will query this page once per minute. This is accomplished by creating a file named status.conf inside /etc/httpd/conf.d on both nodes:
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
Then, with the following command, we will add Apache as a cluster resource. The status of the resource will be checked by PCS once every minute:
pcs resource create webserver ocf:heartbeat:apache \
    configfile=/etc/httpd/conf/httpd.conf \
    statusurl="http://localhost/server-status" op monitor interval=1min
By default, pacemaker will try to balance the resource usage over the cluster. However, at certain times, our setup will require that two related resources (as is the case with the web server and the virtual IP) run on the same host.
The web server should always run on the host on which the virtual IP is active. This also means that if the virtual IP resource is not active on any node, the web server should not run at all. In addition, since we need the web server to listen on the virtual IP address as well as on the loopback device on each host, it goes without saying that we must ensure that the virtual IP resource is started before the web server resource.
We can accomplish both requirements through the use of the following constraints:
pcs constraint colocation add webserver with virtual_ip INFINITY
pcs constraint order virtual_ip then webserver
After running the second command, you should see the following message on your screen. Note that starting the virtual IP resource before the web server is a mandatory requirement:
Adding virtual_ip webserver (kind: Mandatory) (Options: first-action=start then-action=start)
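Because a resource pair needs both a colocation and an ordering constraint, it can be handy to check for the two at once. The following sketch reads the output of pcs constraint on standard input; the has_pair_constraints function name is hypothetical, and the "start A then start B" and "B with A" line formats it searches for are an assumption about how your version of PCS prints constraints, so verify them against your own output first.

```shell
#!/bin/bash
# Sketch: succeed only if the text on stdin lists both an ordering and a
# colocation constraint for the given pair of resources.
# Assumed line formats (check them against your pcs version):
#   start <first> then start <second> (kind:Mandatory)
#   <second> with <first> (score:INFINITY)
has_pair_constraints() {
    local first="$1" second="$2" text
    text="$(cat)"
    printf '%s\n' "$text" | grep -q "start $first then start $second" &&
        printf '%s\n' "$text" | grep -q "$second with $first"
}
```

A typical call would be pcs constraint | has_pair_constraints virtual_ip webserver && echo "constraints in place".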
Now, let’s check the status of the cluster and focus on its assigned resources, as shown in the following screenshot:
You can now simulate a failover by forcing node01 to go offline. To do so, you can run the following command:
pcs cluster stop
The resources should be automatically started on node02, as indicated in the following screenshot:
The last step consists of mounting the DRBD resource on the /var/www/html directory and adding in it a simple PHP page to display the PHP configuration of the active node. You will then be able to build on that simple example to add more sophisticated applications.
Before attempting to use /dev/drbd0, we should check its status on both nodes with drbd-overview. If the output shows StandAlone or WFConnection, we are looking at a split-brain situation, which can be confirmed in the output of the following command:
dmesg | grep -i brain
This will result in a Split-Brain detected, dropping connection! error message.
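The connection-state check described above can also be scripted. The following sketch scans a status file in the /proc/drbd format and lists the devices whose connection state is StandAlone or WFConnection; the disconnected_drbd_minors function name is an assumption made for illustration.

```shell
#!/bin/bash
# Sketch: print "<minor> <state>" for each DRBD device whose connection
# state (the cs: field) is StandAlone or WFConnection.
# disconnected_drbd_minors is a hypothetical helper name.
disconnected_drbd_minors() {
    local status_file="${1:-/proc/drbd}"
    awk '$2 == "cs:StandAlone" || $2 == "cs:WFConnection" {
             sub(":", "", $1)          # "1:" -> "1"
             print $1, substr($2, 4)   # drop the "cs:" prefix
         }' "$status_file"
}
```

An empty result means that no device is in either of the two states; any line printed is a cue to look at dmesg before going further.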
LINBIT recommends manually resolving such cases by choosing a node whose modifications will be discarded and then issuing the following commands on it:
drbdadm secondary [resource name]
drbdadm connect --discard-my-data [resource name]
Then connect the DRBD resource on the other node:
drbdadm connect [resource name]
You can also start or stop DRBD and get an overview with the following commands on node01:
drbdadm up drbd0
drbdadm down drbd0
drbd-overview
ssh node02 drbdadm up drbd0
ssh node02 drbdadm down drbd0
ssh node02 drbd-overview
Note
Review the DRBD documentation carefully before choosing a recovery method after a split-brain situation. Since there is no one-size-fits-all answer to this issue, I have chosen to cover the recommended method in this book.
Mounting the DRBD resource and using it with Apache
Before using the DRBD resource, you must define a filesystem on it and mount it on a local directory. We will use Apache’s document root directory (/var/www/html), but depending on your case, you could use a virtual host directory as well. As we did earlier, we will add these changes to a configuration file, step by step, and we will push it to the running CIB later from node01 (or whichever node is the DC).
To begin, create a new configuration file named fs_drbd0_cfg (feel free to change the name if you want):
pcs cluster cib fs_drbd0_cfg
Next, we’ll create the filesystem resource itself (again, change the variable values if needed). This is another special type of resource provided out of the box:
pcs -f fs_drbd0_cfg resource create web_fs Filesystem device="/dev/drbd0" \
    directory="/var/www/html" fstype="ext4"
Then, indicate that the filesystem should always be located on the node where the DRBD resource is master:
pcs -f fs_drbd0_cfg constraint colocation add web_fs with web_drbd_clone \
    INFINITY with-rsc-role=Master
Note that in order for the filesystem to be started properly, /dev/drbd0 must be promoted first, so we will have to add a constraint for this purpose:
pcs -f fs_drbd0_cfg constraint order promote web_drbd_clone then start web_fs
Finally, ensure that Apache runs on the same node as the filesystem resource, which also needs to come online before the web server resource can be started:
pcs -f fs_drbd0_cfg constraint colocation add webserver with web_fs INFINITY
pcs -f fs_drbd0_cfg constraint order web_fs then webserver
You can review the configuration with the following command:
pcs -f fs_drbd0_cfg constraint
The output is shown in the following screenshot:
If everything is correct, then push it to the running CIB with this command:
pcs cluster cib-push fs_drbd0_cfg
The preceding command should show CIB updated on successful completion.
If you now run pcs status, you should see the newly added resources, as you can see in the following screenshot:
Now, you don’t need to manually mount /dev/drbd0 in /var/www/html, because the cluster will take care of it. You can verify that the DRBD device has been mounted in /var/www/html using this command:
mount | grep drbd0
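If you want the same verification in a form that a script can act on, the check can be expressed as a function that returns success only when a given device is mounted on a given directory. This is a sketch under the assumption that /proc/mounts lists the device and the mount point as its first two fields; the is_mounted_at name is hypothetical.

```shell
#!/bin/bash
# Sketch: return 0 if <device> is mounted on <mount point>, 1 otherwise.
# The third argument defaults to /proc/mounts and exists mainly so that the
# function can be tried against a sample file. is_mounted_at is hypothetical.
is_mounted_at() {
    local device="$1" mount_point="$2" mounts_file="${3:-/proc/mounts}"
    awk -v dev="$device" -v mp="$mount_point" \
        '$1 == dev && $2 == mp { found = 1 } END { exit !found }' "$mounts_file"
}
```

For instance, is_mounted_at /dev/drbd0 /var/www/html && echo "web_fs is mounted here" would only print the message on the node where the cluster has mounted the device.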
Note
Remember that any original contents present in /var/www/html will not be available while /dev/drbd0 is mounted.
Testing the DRBD resource along with Apache
As a simple test, we will display the information about the PHP installation. Create a file named info.php inside /var/www/html on node01 with the following contents:
<?php
phpinfo();
?>
Now, point your browser to 192.168.0.4/info.php and verify that the output is similar to the one shown here:
Then, stop the cluster (pcs cluster stop) on node01 or put it into standby mode (pcs cluster standby node01) and refresh the browser. The only thing that should change in the output is the system name, as shown in the following screenshot, since the phpinfo() PHP function returns the local hostname along with the information about the PHP installation:
In addition, if you list the contents of /var/www/html on node02, you will see that the info.php file that was created originally on node01 now shows on node02 as well, as indicated in this screenshot:
Before proceeding, remember to return node01 to normal mode:
pcs cluster unstandby node01
Setting up a high-availability database with replicated storage
The last part of this chapter focuses on setting up an HA MariaDB database with replicated storage. To begin, we will have to set up another DRBD resource as we did earlier. We will review the necessary steps here for clarity:
1. Add another virtual disk to each virtual machine (a 2 GB disk will do).
2. Create a partition on the newly added disk and then go through the process of creating a Physical Volume (PV) on /dev/sdc1, a Volume Group (VG, named drbd_db_vg), and finally a Logical Volume (LV, drbd_db_vol):
parted /dev/sdc mklabel msdos
parted /dev/sdc mkpart p 0% 100%
pvcreate /dev/sdc1
vgcreate drbd_db_vg /dev/sdc1
lvcreate -n drbd_db_vol -l 100%FREE drbd_db_vg
3. Create a configuration file (/etc/drbd.d/drbd1.res) for the new DRBD resource (drbd1), based on the configuration file for the first replicated storage resource. Edit the settings accordingly and use a different port:
resource drbd1 {
    disk /dev/drbd_db_vg/drbd_db_vol;
    device /dev/drbd1;
    meta-disk internal;
    on node01 {
        address 192.168.0.2:7790;
    }
    on node02 {
        address 192.168.0.3:7790;
    }
}
Then, add a firewall rule to allow traffic:
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 7790 -j ACCEPT
service iptables save
4. Repeat the previous steps on the second node. Initialize the metadata for the new DRBD resource on both nodes:
drbdadm create-md drbd1
5. Enable the replicated storage resource in order to allocate disk and network resources for its operation:
drbdadm up drbd1
6. Mark the DRBD device on the DC node as primary:
drbdadm primary --force drbd1
7. Add the new DRBD device as a cluster resource. Note that all of the changes are staged in the same file (drbd1_conf), which is the one that is pushed to the running CIB in the last command:
mkdir -p /var/lib/mariadb_drbd1/data
pcs cluster cib drbd1_conf
pcs -f drbd1_conf resource create db_drbd ocf:linbit:drbd \
    drbd_resource=drbd1 op monitor interval=60s
pcs -f drbd1_conf resource master db_drbd_clone db_drbd master-max=1 \
    master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs -f drbd1_conf resource create db_fs Filesystem \
    device="/dev/drbd1" directory="/var/lib/mariadb_drbd1" fstype="ext4"
pcs cluster cib-push drbd1_conf
When this process is complete, the overview of all configured DRBD resources up until this point should look like this:
[root@node01 ~]# cat /proc/drbd
version: 8.4.6 (api:1/proto:86-101)
GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R7, 2015-04-10 05:13:52
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:98324 nr:0 dw:32888 dr:66457 al:11 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:2092956 nr:0 dw:33996 dr:2094412 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@node01 ~]# drbd-overview
 0:drbd0/0 Connected Primary/Secondary UpToDate/UpToDate /var/www/html ext4 2.0G 6.1M 1.9G 1%
 1:drbd1/0 Connected Primary/Secondary UpToDate/UpToDate
[root@node01 ~]#
In addition, the cluster should now include the new DRBD resource and its clone (db_drbd and db_drbd_clone, respectively) as well as the filesystem resource, as you can see in this screenshot:
We can now divide the MariaDB files into two separate sections:
Binaries, socket, and .pid files will be placed inside a directory on a regular partition, independent on each node (/var/lib/mysql by default). These are files we don’t need to be highly available or fail-safe.
Database and configuration files (my.cnf) will be stored in a DRBD resource, which will be mounted under /var/lib/mariadb_drbd1; the database files will reside inside a directory named data.
Next, we need to add the database server as a cluster resource:
pcs resource create dbserver ocf:heartbeat:mysql \
    config="/var/lib/mariadb_drbd1/my.cnf" \
    datadir="/var/lib/mariadb_drbd1/data" op monitor interval="30s" \
    op start interval="0" timeout="60s" op stop interval="0" timeout="60s"
Then, we will add the same constraints that we did with Apache:
pcs constraint colocation add dbserver with virtual_ip INFINITY
pcs constraint order virtual_ip then dbserver
pcs constraint colocation add db_drbd_clone with virtual_ip INFINITY
pcs constraint order virtual_ip then db_drbd_clone
Next, we will add a firewall rule to allow traffic:
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 3306 -j ACCEPT
service iptables save
Now, we will create an ext4 filesystem on drbd1 and mount it in the directory that was created previously. Only perform this step on the DC:
mkfs.ext4 /dev/drbd1
mount /dev/drbd1 /var/lib/mariadb_drbd1
Next, we need to move the database server configuration file to the mount point of drbd1 (perform all of the following steps on both nodes):
mv /etc/my.cnf /var/lib/mariadb_drbd1/my.cnf
Edit it so that the datadir variable will point to the right directory inside the mount point of the DRBD resource and, at the same time, specify that the database server should listen for TCP connections on a defined address (in this case, the IP address of our virtual IP resource):
datadir=/var/lib/mariadb_drbd1/data
bind-address=192.168.0.4
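Since a wrong datadir or bind-address is easy to overlook, it may be worth reading the two values back from the file after editing it. The following sketch prints the value assigned to a key in a my.cnf-style file; the mycnf_value function name is an assumption, and the parsing is deliberately simple (it takes the last key=value line for that key and ignores sections).

```shell
#!/bin/bash
# Sketch: print the value of <key> from a my.cnf-style file (last match wins).
# Sections such as [mysqld] are not taken into account.
# mycnf_value is a hypothetical helper name.
mycnf_value() {
    local key="$1" cnf_file="$2"
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$cnf_file" | tail -n 1
}
```

With the settings shown above, mycnf_value datadir /var/lib/mariadb_drbd1/my.cnf should print /var/lib/mariadb_drbd1/data.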
Next, we need to initialize the database data directory:
mysql_install_db --no-defaults --datadir=/var/lib/mariadb_drbd1/data
Then, log on to the database server:
mysql -h 192.168.0.4 -u root -p
Then, grant all permissions to the root user identified by the defined password:
GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY 'MyDBpassword';
FLUSH PRIVILEGES;
Note
This permission set is only for testing and should be modified with the necessary security parameters before moving the cluster to a production environment.
Optionally, we can create an empty database:
CREATE DATABASE cluster_db;
Finally, make sure the mysql user can access the /var/lib/mariadb_drbd1 directory:
chown -R mysql:mysql /var/lib/mariadb_drbd1/
If we now fail over from the active node to the passive one, the actual database files within datadir will be available in the same directory on the other node, since DRBD replicates them.
Troubleshooting
As explained previously, the output of pcs status under Failed actions will show you whether there are problems with the cluster resources and provide information as to what you should do in order to fix them.
Here are two examples:
exit-reason='Config /var/lib/mariadb_drbd1/my.cnf doesn't exist': Make sure the configuration file for MariaDB exists and is identical on both nodes.
exit-reason='Couldn't find device [/dev/drbd1]. Expected /dev/??? to exist': The DRBD device was not created correctly. Review the instructions and try to create it.
As you can see, the exit reason will give you valuable information to troubleshoot and fix the issues you may have. If, after verifying the conditions outlined in the error messages, you are still experiencing issues with a particular resource, it is useful to clean up the operation history of the resource and redetect its current state:
pcs resource cleanup [resource name]
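When several actions have failed, the exit reasons are easier to scan if they are pulled out of the pcs status output. The following is a sketch that reads that output on standard input and prints one line per failed action; the failed_exit_reasons function name is hypothetical, and the pattern assumes that each failed action is printed on a single line containing exit-reason='...' followed by last-rc-change, as in the examples above.

```shell
#!/bin/bash
# Sketch: print "<node> <operation>: <exit reason>" for each failed action
# found on stdin. Assumes one failed action per line, with the exit reason
# followed by ", last-rc-change=". failed_exit_reasons is hypothetical.
failed_exit_reasons() {
    sed -n "s/^ *\([^ ]*\) on \([^ ]*\) .*exit-reason='\(.*\)', last-rc-change=.*/\2 \1: \3/p"
}
```

A typical call would be pcs status | failed_exit_reasons.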
The following is a real-world problem scenario (from Kamran), which happens when the reader follows (or gets lost following) the instructions in this chapter:
[root@node01 ~]# pcs status
Cluster name: MyCluster
Last updated: Tue May 12 17:07:04 2015
Last change: Tue May 12 16:54:03 2015
Stack: corosync
Current DC: node01 (1) - partition with quorum
Version: 1.1.12-a14efad
2 nodes configured
9 resources configured
Online: [ node01 node02 ]
Full list of resources:
 virtual_ip (ocf::heartbeat:IPaddr2): Started node02
 Master/Slave Set: web_drbd_clone [web_drbd]
     Masters: [ node01 ]
     Slaves: [ node02 ]
 webserver (ocf::heartbeat:apache): Stopped
 web_fs (ocf::heartbeat:Filesystem): Started node01
 dbserver (ocf::heartbeat:mysql): Stopped
 Master/Slave Set: db_drbd_clone [db_drbd]
     Masters: [ node02 ]
     Stopped: [ node01 ]
 db_fs (ocf::heartbeat:Filesystem): Stopped
Failed actions:
    dbserver_start_0 on node01 'not installed' (5): call=36, status=complete, exit-reason='Config /var/lib/mariadb_drbd1/my.cnf doesn't exist', last-rc-change='Tue May 12 17:01:09 2015', queued=0ms, exec=66ms
    db_fs_start_0 on node01 'not installed' (5): call=41, status=complete, exit-reason='Couldn't find device [/dev/drbd1]. Expected /dev/??? to exist', last-rc-change='Tue May 12 17:01:09 2015', queued=0ms, exec=38ms
    dbserver_start_0 on node02 'not installed' (5): call=41, status=complete, exit-reason='Config /var/lib/mariadb_drbd1/my.cnf doesn't exist', last-rc-change='Tue May 12 17:01:09 2015', queued=0ms, exec=91ms
    db_fs_start_0 on node02 'not installed' (5): call=32, status=complete, exit-reason='Couldn't find device [/dev/drbd1]. Expected /dev/??? to exist', last-rc-change='Tue May 12 17:01:08 2015', queued=0ms, exec=39ms
PCSD Status:
  node01: Online
  node02: Online
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@node01 ~]#
Summary
In this chapter, we explained how to set up real-world applications of clusters: a database server and a web server. Both applications build upon a replicated storage device in a setup that increases availability by providing failover storage for regular and database files.
In the next two chapters, we will build upon the concepts and resources that we introduced here, troubleshoot common issues in cluster-based web and database servers, and prevent common bottlenecks in order to ensure the high availability of applications.
Chapter 5. Monitoring the Cluster Health
In Chapter 2, Installing Cluster Services and Configuring Network Components, we mentioned that becoming familiar with PCS and its myriad options would be helpful along the path that leads to the installation of a fully operational high availability cluster. Although during the previous chapters we confirmed how true that statement was, here we will make further use of PCS to monitor the performance and availability of our cluster in order to identify and prevent possible bottlenecks and troubleshoot any issue that may arise.
Cluster services and performance
Although every system administrator must be well acquainted with widely used Linux commands, such as top and ps, to quickly report a snapshot of running daemons and other processes on each node, you must also learn to rely on the new utilities provided by CentOS 7, which we have introduced in previous chapters, to start monitoring the nodes. But even more importantly, we will also use PCS-based commands to gain further insight into our cluster and its resources.
Monitoring the node status
As you can guess, perhaps the first thing that you always need to check is the status of each node, that is, whether it is online or offline. Otherwise, there is little point in proceeding with further availability and performance analysis.
If you have a network management system server (such as Zabbix or Nagios), you can easily monitor the status of your cluster members and receive alerts when they are unreachable. If not, you must come up with a supplementary solution of your own (which may not be as effective or error proof) that you can use to detect when a node has gone offline.
One such solution is a simple bash script (we will name it pingreport.sh, save it inside /root/scripts, and make it executable with chmod +x /root/scripts/pingreport.sh), which will periodically ping your nodes from another host and report via e-mail to the system administrator if one of them is offline so that you can take appropriate action. The following shell script does just that for node01 and node02, whose IP addresses are 192.168.0.2 and 192.168.0.3 (you can add as many nodes as you need to the NODES variable, which will be used in the following for loop, but remember to separate them with a blank space). If both nodes are pingable, the report will be empty and no e-mails will be sent.
In order to take advantage of the following script, you will need to have an e-mail solution in place to send out alerts. In this case, we use the mail tool called mailx, which is available after installing a package (yum install mailx):
#!/bin/bash
# Directory where the ping script is located
DIR=/root/scripts
# Hostname or IP of remote host (to send alerts to)
REMOTEHOST="192.168.0.5"
# Name of report file
PING_REPORT="ping_report.txt"
# Make sure the current file is empty
cat /dev/null > "$DIR/$PING_REPORT"
# Current date to be used in the ping script
CURRENT_DATE=$(date +'%Y-%m-%d %H:%M')
# Node list
NODES="node01 node02"
# Loop through the list of nodes
for node in $NODES
do
    # Count the "unreachable" replies returned by ping
    LOST_PACKETS=$(ping -c 4 "$node" | grep -i unreachable | wc -l)
    if [ "$LOST_PACKETS" -ne 0 ]
    then
        echo "$LOST_PACKETS packets were missed while pinging $node at $CURRENT_DATE" >> "$DIR/$PING_REPORT"
    fi
done
# Mail the report unless it's empty
if [ -s "$DIR/$PING_REPORT" ]
then
    # Read the (empty) message body from /dev/null so that mail does not wait for input
    mail -s "Ping report" -a "$DIR/$PING_REPORT" "root@$REMOTEHOST" < /dev/null
fi
Even though the preceding script is enough to determine whether a node is pingable or not, you can tweak that script as you like, and then add it to cron in order for it to run automatically at the desired frequency. For example, the following cron job will execute the script every five minutes, regardless of the day:
*/5 * * * * /root/scripts/pingreport.sh
If you want to run the script manually, you can do so as follows:
/root/scripts/pingreport.sh
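One possible tweak is to report the packet loss percentage from the summary line that ping prints, rather than counting "unreachable" replies. The following sketch extracts that percentage from ping output supplied on standard input; the packet_loss_percent function name is hypothetical, and the pattern assumes a summary line containing "N% packet loss", as printed by the ping utility shipped with most Linux distributions.

```shell
#!/bin/bash
# Sketch: print the packet loss percentage found in ping output on stdin.
# Assumes a summary line such as:
#   4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3000ms
# packet_loss_percent is a hypothetical helper name.
packet_loss_percent() {
    sed -n 's/.* \([0-9][0-9.]*\)% packet loss.*/\1/p'
}
```

Inside the loop of the script, something like LOSS=$(ping -c 4 "$node" | packet_loss_percent) would then hold the value to compare against a threshold of your choice.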
The following example indicates that both 192.168.0.2 and 192.168.0.3 were not pingable when the script was last run. Note that for simplicity, the script was executed from node01, a cluster member; however, under normal circumstances, you will want to use a separate host for this:
We will resume working with the script later in this chapter and extend its functionalities.
Now, it is time to dig a little deeper and view the status of the nodes configured in corosync/pacemaker with the following command:
pcs status nodes pacemaker | corosync | both
In the preceding command, a vertical bar is used to indicate mutually exclusive arguments.
In the following screenshot, you can see how pcs status nodes both returns the status of both pacemaker and corosync on both nodes:
Note
Although you can check the cluster’s overall status with pcs status, as we have mentioned earlier, pcs status nodes both will give you fine-grained node status information. You can stop one (or both) of the services on either node and run this same command to verify. This is equivalent to using systemctl is-active pacemaker or systemctl is-active corosync on each node.
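The per-node check with systemctl is-active can be repeated for several services in one go. The following is a sketch that prints the names of the services that are not active on the local node; the inactive_services function and the IS_ACTIVE_CMD variable are assumptions introduced for illustration, and the variable only exists so that the query command can be swapped out.

```shell
#!/bin/bash
# Sketch: print every service from the argument list that is not "active".
# IS_ACTIVE_CMD and inactive_services are hypothetical names; the variable
# defaults to the real systemctl query and can be overridden if needed.
IS_ACTIVE_CMD="${IS_ACTIVE_CMD:-systemctl is-active}"

inactive_services() {
    local service state
    for service in "$@"; do
        state="$($IS_ACTIVE_CMD "$service" 2>/dev/null)"
        [ "$state" = "active" ] || echo "$service"
    done
}
```

Running inactive_services pacemaker corosync pcsd on a healthy node should print nothing; the same call can be sent to the other node through ssh once the function is available there.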
Monitoring the resources
As we have explained in the previous chapters, a cluster resource is a highly available service that is made available through at least one of the nodes. Among the resources that we configured up until this point, we can mention the virtual IP, the replicated storage device, the web server, and the database server. You can refer to Chapter 4, Real-world Implementations of Clustering, where we added constraints that indicated how (in what order) and where (on which node) the cluster resources should be started.
Either pcs status or pcs resource show, the preferred alternative, will list the names and status of all currently configured resources.
Note
If you specify a resource using its ID (that is, pcs resource show virtual_ip), you will see the options for the configured resource. On the other hand, if --full is specified (pcs resource show --full), all configured resource options will be displayed instead.
If a resource is started on the wrong node (for example, if it depends on a service that is currently active on another node), you will get an informative message when you attempt to use it. For example, the following screenshot shows that dbserver is started on node02, whereas its associated underlying storage device (db_fs) has been started on node01. You will recall from earlier chapters that this is part of the output of pcs status:
For this reason, if you attempt to log on to the database server using the virtual IP address (which is the common link to the cluster resources), you will get the error message indicated in the following screenshot telling you that you can’t connect to the MariaDB instance:
Let’s see what happens (as shown in the next screenshot) when we move the dbserver resource to node01 and restart it manually so that it starts right away. The following constraint is intended to cause dbserver to prefer node01 so that it always runs on node01 whenever that node is available:
pcs constraint location dbserver prefers node01=INFINITY
pcs resource restart dbserver
Tip
If you need to remove a constraint, find out its ID with pcs constraint --full and locate the associated resource. Then, delete it with pcs constraint remove constraint_id, where constraint_id is the identification as returned by the first command. You can also manually move resources from one node to another with pcs resource move <resource_id> <node_name>, but be aware that the current constraints may or may not allow you to successfully complete the operation.
Now we can access the database server resource as expected, as shown here:
Once in a while, you may encounter some errors during or after a failover procedure, during boot, and so on. These messages are visible in the output of pcs status, as in the excerpt shown here:
Before we proceed further, perhaps you will ask yourself: What if I want to save all available information about cluster problems to properly analyze and troubleshoot offline? If you are expecting PCS to have a tool to help you with that, you are right. Put a date and time following the --from and --to options and replace dest with a filename (a specific example is provided later in this section as well):
pcs cluster report [--from "YYYY-M-D H:M:S" [--to "YYYY-M-D H:M:S"]] dest
This will create a tarball containing every piece of information that is needed when reporting cluster problems. If --from and --to are not used, the report will include the data of the last 24 hours.
In the screenshot that will follow, we have omitted the --from and --to flags for brevity, and we can see yet another reason why setting up key-based authentication via ssh during Chapter 1, Cluster Basics and Installation on CentOS 7, was not a mere suggestion: you have to report cluster information from both nodes.
In our case, we will execute the following command to obtain a tarball named YYYY-MM-DD-report.tar.gz in the current working directory. Note that the date part in the filename is for identification purposes only:
pcs cluster report $(date +%Y-%m-%d)-report
Once the tarball with the report files has been created, you can untar and examine it. You will notice that it contains the files and directories seen in the following image. Before proceeding further, you may want to take a look at some of them:
tar xzf $(date +%Y-%m-%d)-report.tar.gz
cd $(date +%Y-%m-%d)-report
Now, of course, you will want to purge the records of past failed actions that have been resolved. For this reason, PCS allows you to instruct the cluster to forget the operation history of a resource (or all of them), reset the fail count, and redetect the current states:
pcs resource cleanup <resource_id>
Note that if resource_id is not specified, then all resources/STONITH devices will be cleaned up.
Finally, while we are still talking about monitoring cluster resources, we might as well ask ourselves: Is there a way we can back up the current cluster configuration files and restore them later if needed, and can we easily go back to a previous configuration? The answer to both questions is yes. Let’s see how.
In order to back up the cluster configuration files, you will use the following command:
pcs config backup <filename>
Here, <filename> is a file identification of your choice to which PCS will append the tar.bz2 extension after creating the tarball.
Consider the following example:
pcs config backup cluster_config_$(date +%Y-%m-%d)
This will result in the backup tarball with the contents shown in the following screenshot. For our convenience, let us create a subdirectory named cluster_config inside our current working directory. We will use this newly created subdirectory to extract the contents of the backup tarball (note the j flag, which tar needs for bzip2-compressed files):
mkdir cluster_config
tar xjf cluster_config_$(date +%Y-%m-%d).tar.bz2 -C cluster_config
ls -R cluster_config
Note
If you have followed the installation process step by step, as outlined in this book, bzip2 will most likely not be available. You will need to install it with yum update && yum install bzip2 in order to untar the cluster configuration tarball.
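To avoid having to pick the matching decompression flag by hand, the extraction step can be wrapped in a small function that lets tar detect the compression format on its own, which GNU tar does when extracting with -xf. This is only a sketch; the extract_backup function name is hypothetical, and bzip2 still needs to be installed for .tar.bz2 files.

```shell
#!/bin/bash
# Sketch: extract a backup or report tarball into a destination directory,
# creating the directory if needed. GNU tar detects gzip/bzip2 compression
# automatically on extraction. extract_backup is a hypothetical helper name.
extract_backup() {
    local tarball="$1" dest_dir="$2"
    [ -f "$tarball" ] || { echo "No such tarball: $tarball" >&2; return 1; }
    mkdir -p "$dest_dir" && tar -xf "$tarball" -C "$dest_dir"
}
```

For example, extract_backup cluster_config_$(date +%Y-%m-%d).tar.bz2 cluster_config followed by ls -R cluster_config would show the files contained in the backup.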
Restoring the configuration is just as easy (you will need to stop the node and then start it again after the restoration process is completed). Use the following command:
pcs config restore [--local] <filename>
This command will restore the backed-up cluster configuration files on all nodes using the backup as the source. If you only need to restore the files on the current node, use the --local flag. Note that filename must be the .tar.bz2 file (not the extracted files).
You can also go back to a certain point in time, as far as cluster configuration is concerned, using pcs config checkpoint with its associated options. With no options, pcs config checkpoint will list all available configuration checkpoints, as shown here:
The pcs config checkpoint view <checkpoint_number> command displays the specified configuration checkpoint details on standard output, as shown in the next screenshot. Consider the following example:
pcs config checkpoint view 1
The pcs config checkpoint restore <checkpoint_number> command restores the cluster configuration to a specified checkpoint, which is why it’s a great idea to check the details of the desired checkpoint before restoring.
When a resource refuses to start
Under normal circumstances, cluster resources will be managed automatically without much intervention from the system administrator. However, there will be times when something may prevent a resource from starting properly, and it will be necessary to take immediate action.
As the man page for PCS states,
Starting resources on a cluster is (almost) always done by pacemaker and not directly from PCS. If your resource isn’t starting, it’s usually due to either a misconfiguration of the resource (which you debug in the system log), or constraints preventing the resource from starting or the resource being disabled. You can use pcs resource debug-start to test resource configuration, but it should not normally be used to start resources in a cluster.
Having said that, when pacemaker cannot, for some reason, properly start a resource, execute the following command:
pcs resource debug-start <resource id> [--full]
This will force the specified resource to start on the current node, ignoring the cluster recommendations. The result will be printed to the screen (use the --full flag to obtain more detailed output) and will provide helpful information to assist you in troubleshooting the resource and the cluster operation.
In the following screenshot, the output of pcs resource debug-start virtual_ip --full is truncated for the sake of brevity:
From this example, you can begin to glimpse how useful this command can be, as it provides you with very detailed, step-by-step information about the resource operation. For example, if the dbserver resource refuses to start and returns errors even after having been cleaned up repeatedly, run the following command:
pcs resource debug-start dbserver --full | less
With this, you will be able to view, in great detail, the steps that are usually performed by the cluster when trying to bring up such a resource. If this process fails at some point, you will be provided with a description of what went wrong and when, and then you will be better able to fix it.
Checking the availability of core components
Before wrapping up, let’s go back to the first example (checking the online status of each node) and extend it so that we can also monitor the core components of the cluster framework, that is, pacemaker, corosync, and pcsd, as outlined earlier in Chapter 2, Installing Cluster Services and Configuring Network Components.
Tip
In order to ensure a successful connection via ssh from a node to itself, you will need to add its own public key to authorized_keys. Thus, to enable passwordless login for user root, run the following command on both nodes (appending with >> keeps any keys that are already authorized, such as the key of the other node):
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
In the best case scenario, during a graceful failover, you will want to be notified whenever one (or more) of those services is stopped. Adding a few lines to the script will also check the status of the corresponding daemons and alert you if they’re down:
#!/bin/bash
# Directory where the ping script is located
DIR=/root/scripts
# Hostname or IP of remote host (to send alerts to)
REMOTEHOST="192.168.0.5"
# Name of report file
PING_REPORT="ping_report.txt"
# Make sure the current file is empty
cat /dev/null > "$DIR/$PING_REPORT"
# Current date to be used in the ping script
CURRENT_DATE=$(date +'%Y-%m-%d %H:%M')
# Node list
NODES="node01 node02"
# Outer loop: check each node
for node in $NODES
do
    LOST_PACKETS=$(ping -c 4 $node | grep -i unreachable | wc -l)
    if [ "$LOST_PACKETS" -ne 0 ]
    then
        echo "$LOST_PACKETS packets were missed while pinging $node at $CURRENT_DATE" >> "$DIR/$PING_REPORT"
    fi
    # Inner loop: check all cluster core components in each node
    for service in corosync pacemaker pcsd
    do
        IS_ACTIVE=$(ssh -qn $node systemctl is-active $service)
        # Quoting keeps the test valid even if ssh returns nothing
        if [ "$IS_ACTIVE" != "active" ]
        then
            echo "$service is NOT active on $node. Please check ASAP." >> "$DIR/$PING_REPORT"
        fi
    done
done
# Mail the report unless it's empty
if [ -s "$DIR/$PING_REPORT" ]
then
    mail -s "Ping report" root@localhost < "$DIR/$PING_REPORT"
fi
As a simple test, stop the cluster (pcs cluster stop) on node02 (192.168.0.3), run the script from the monitoring host or from any node, and check your mail inbox to verify that it is working correctly. The report should state which services are not active on node02.
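The script counts output lines containing "unreachable", which does not catch packets that are dropped silently. As a possible refinement (an assumption on our part, not something the script above does), the loss percentage can be read from the summary line that the ping utility prints at the end of a run. The function name packet_loss_pct is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: read ping output on stdin and print the packet loss
# percentage found in the summary line, for example
# "4 packets transmitted, 4 received, 0% packet loss, time 3003ms".
packet_loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Possible use inside the outer loop of the monitoring script:
#   LOSS=$(ping -c 4 "$node" | packet_loss_pct)
#   if [ "${LOSS:-100}" != "0" ]; then
#       echo "$LOSS% of packets were lost while pinging $node" >> "$DIR/$PING_REPORT"
#   fi
```

Defaulting an empty result to 100 means that a node whose name cannot even be resolved is reported as well.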
Summary
In this chapter, we have explained how to monitor, troubleshoot, and fix common cluster problems and needs. Not all of these will be as undesired or unexpected as a sudden system crash. There will be times when you need to bring down the cluster and the resources it is running for some planned maintenance, or during a power outage before your uninterruptible power supply (UPS) runs out.
Because prevention is your best ally in these circumstances, ensure that you routinely monitor the health of your cluster. Follow the procedures outlined in this chapter so that you don’t run into any surprises when real emergencies come up. Specifically, under either real or simulated cases, ensure that you back up the cluster configuration, stop the cluster on both nodes separately, and then, and only then, halt the node.
Chapter 6. Measuring and Increasing Performance
Up to this point, we have created an active/passive cluster, added several resources to it, and tested its failover capabilities. We also discussed how to troubleshoot common issues. The final step in our journey consists of measuring and increasing the performance of our cluster as it has been installed so far, as far as the services running on it are concerned.
In addition, we will provide the overall instructions to convert your A/P cluster into an A/A one.
Setting up a sample database
In order to properly test our MariaDB database server, we need a database populated with sample data. For this reason, we will use the Employees database, developed by Patrick Crews and Giuseppe Maxia and provided by Oracle Corporation under a Creative Commons Attribution-ShareAlike 3.0 Unported License. It provides a very large dataset (~160 MB and ~4 million records) spread over six tables, which will be ideal for our performance tests.
Note
The Creative Commons Attribution-ShareAlike 3.0 Unported License, available at http://creativecommons.org/licenses/by-sa/3.0/, grants us the following freedoms regarding the Employees database, for any purpose, even commercially:
Share: This lets us copy and redistribute the material in any medium or format
Adapt: This lets us remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Downloading and installing the Employees database
Let’s proceed with downloading and installing the database using the following steps:
1. To download the Employees database, go to https://launchpad.net/test-db/ and grab the link for the tarball of the latest stable release (at the time of writing this book, it is v1.0.6).
2. Then, download it to the node on which the database server is running (in our case, it is node01). To do so, you will need to install two packages named wget and bzip2 first, using the following command:
yum -y install wget bzip2 && wget https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-full-1.0.6.tar.bz2
Then, extract its contents in your current working directory:
tar xjf employees_db-full-1.0.6.tar.bz2
3. This will create a subdirectory named employees_db, where the main installation script (employees.sql) resides, as can be seen in the output of the following two commands:
cd employees_db
ls
4. Next, use the following command to connect to the cluster database server we set up and configured in Chapter 4, Real-world Implementations of Clustering (note that you will be prompted to enter the password for the root MariaDB user):
mysql -h 192.168.0.4 -u root -p -t < employees.sql
5. This will install the employees database and load the corresponding information into its tables:
departments
employees
dept_emp
dept_manager
titles
salaries
Due to the high volume of data being loaded into the database, the installation may take around a minute or two to complete. While it runs, we will see the progress of the import process: the database structure and the storage engine are instantiated, then the tables are created, and finally, they are populated with data.
Note
After you are done setting up the sample database, feel free to perform a forced failover to verify that the resources and the database, along with their tables and records, become available in the current passive node. Review Chapter 4 to recall the instructions if you need to.
6. We can verify the installation by logging in to the database server and issuing the following commands, which first list all databases, then switch to the recently installed employees database for the subsequent queries, and finally list its tables:
SHOW DATABASES;
USE employees;
SHOW TABLES;
The output of SHOW TABLES should list the six tables named in step 5.
7. Before we proceed with the actual performance tests (measuring general performance before and after a failover event), feel free to investigate those tables (and the fields they contain) using the DESCRIBE statement. Then browse the records with the SELECT statement, as shown here:
DESCRIBE salaries;
SELECT * FROM salaries LIMIT 5;
Once you have taken some time to become acquainted with the structure of the database, we are ready to proceed with the tests.
Introducing initial cluster tests
For the actual performance tests, you should note that MariaDB comes with several database-related utilities that can come in handy for a variety of administration tasks. One of them is mysqlshow, which returns complete information about databases and tables in one quick command.
Its generic syntax is as follows:
mysqlshow [options] [db_name [tbl_name [col_name]]]
So, we could use the following command to display the description for the titles table in the employees database:
mysqlshow employees titles -h 192.168.0.4 -u root -p
Note
You can list the complete set of utilities that are included in your MariaDB installation using the ls /bin | grep mysql command. Each of those tools has a corresponding manual page, which can be invoked from the command line as usual.
We will use another of the tools that are included with MariaDB to see how our database server performs when placed under significant load. The tool is mysqlslap, a diagnostic program designed to emulate client load for a MariaDB/MySQL server and to report the timing of each stage. It works as if multiple clients were accessing the server simultaneously.
Before executing the actual commands that we will use in the following tests, we will introduce a few of the flags available for mysqlslap:
--create-schema: This specifies the database in which we will run the tests
--query: This is a string (or alternatively, a file) containing the SELECT statements used to retrieve data
--delimiter: This allows you to specify a delimiter to separate multiple queries in the same string in --query
--concurrency: This is the number of simultaneous connections to simulate
--iterations: This is the number of times to run the tests
--number-of-queries: This limits the number of queries run by each client; as the tests below show, the value given is divided among the clients (refer to --concurrency)
In addition, there are other switches listed in the manual page for mysqlslap that you can use if you want.
That said, we will run the following tests against the database server in our cluster.
Test 1 – retrieving all fields from all records
In this first test, we will perform a rather simple query that consists of retrieving all fields from all records in the employees table. We will simulate 10 concurrent connections and make 50 queries overall. This will result in each client running 5 queries (50 / 10 = 5):
mysqlslap --create-schema=employees --query="SELECT * FROM employees" --concurrency=10 --iterations=2 --number-of-queries=50 -h 192.168.0.4 -u root -p
After a couple of minutes, mysqlslap will report the average, minimum, and maximum number of seconds needed to run all queries. Although here we list the result of an isolated test, you may want to perform this operation several times on your own and write down the results for a later comparison. However, if you choose to do so, make sure that the query results are not cached by running the following command in your MariaDB server session after each run:
RESET QUERY CACHE;
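The arithmetic behind the test, and the advice to repeat it and keep the results, can be sketched with two small shell helpers. Both function names are hypothetical, and the usage line assumes that the MariaDB credentials are supplied through an option file such as ~/.my.cnf so that no password prompt interrupts the loop:

```shell
#!/bin/bash
# Hypothetical helper: number of queries each simulated client runs when the
# --number-of-queries value ($1) is split across --concurrency ($2) clients.
queries_per_client() {
    local total="$1" clients="$2"
    echo $(( total / clients ))
}

# Hypothetical helper: run a command ($3 onwards) $1 times, appending its
# output to the log file $2 so that the runs can be compared later.
repeat_and_log() {
    local runs="$1" log="$2" i=1
    shift 2
    while [ "$i" -le "$runs" ]; do
        echo "--- run $i ---" >> "$log"
        "$@" >> "$log" 2>&1
        i=$(( i + 1 ))
    done
}

# Test 1 uses 50 queries and 10 clients:
#   queries_per_client 50 10
# Three logged runs of Test 1 (credentials assumed to come from ~/.my.cnf):
#   repeat_and_log 3 test1.log mysqlslap --create-schema=employees \
#       --query="SELECT * FROM employees" --concurrency=10 --iterations=2 \
#       --number-of-queries=50 -h 192.168.0.4
```

Remember that the query cache still needs to be reset between runs if you want each run to hit the tables.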
Test 2 – performing JOIN operations
In this second test, we will do a JOIN operation between the employees and salaries tables (a more realistic example) and modify the number of connections, queries, and iterations a bit:
mysqlslap --create-schema=employees --query="SELECT A.first_name, A.last_name, B.salary FROM employees A JOIN salaries B ON A.emp_no = B.emp_no" --concurrency=3 --number-of-queries=12 --iterations=2 -h 192.168.0.4 -u root -p
As expected, the queries take longer to run this time.
Before proceeding further, feel free to play around with the number of connections, iterations, and queries, or with the query itself. Depending on these values, you may knock the database server down. That is to be expected at some point, since we have been building our infrastructure and examples on a virtual machine-based cluster. For this reason, you may want to increase the processing resources in each node’s VirtualBox configuration to the extent of the available capacity, or consider acquiring real hardware to set up your cluster.
Note
Database administration and optimization are topics out of the scope of this book. It is strongly recommended that you also take these subjects into account before moving the cluster to a production environment. Since the performance of the database and web servers can be optimized separately through their corresponding settings, in this book, we will focus our efforts on analyzing and improving the availability of these resources (which we have named dbserver and webserver respectively) using their respective configuration files and internal settings.
Performing a failover
We will now force a failover by stopping the cluster functionality on the node where all the resources are currently running (node01) so that they will move to node02. Here, we will perform tests 1 and 2, and we expect to see a behavior similar to what we saw earlier.
It is important to keep in mind that during a failover, data is not encrypted automatically. If you have concerns about sensitive data being transferred over an unsecured connection, you should take the necessary precautions and use encryption either at the filesystem or at the logical volume level.
Before we do this, however, we must keep in mind that constantly moving a sensitive resource, such as a database server, around a cluster may negatively impact the availability of that resource. For this reason, we will want it to remain on the node where it is active unless there is an actual node shutdown. The concept of resource stickiness does exactly this: it allows us to instruct all cluster resources to either fall back to their original node when it becomes available again after an outage, or to remain where they are currently active. The following syntax is used to specify the default value for all resources:
pcs resource defaults resource-stickiness=value
The higher the value, the more the resource will prefer to stay where it is. By default, Pacemaker uses 0 as the value, which tells the cluster that it is desired (and optimal) to move the resource around in the case of failover. To set the stickiness of a specific resource, use the following syntax:
pcs resource meta <resource_id> resource-stickiness=value
Let’s assume that you use INFINITY as the value in the preceding commands:
pcs resource defaults resource-stickiness=INFINITY
pcs resource meta <resource_id> resource-stickiness=INFINITY
(Here, you need to replace resource_id with the actual resource identification.)
Then, both the default stickiness for all resources and the stickiness of the resource identified by resource_id will be set to INFINITY. That being said, let’s now perform the failover. Take note of the current node and resource status by using the following command:
pcs status
Then, stop the cluster on node01 by using the following command:
pcs cluster stop
Then, verify that all resources have been properly started on the other node. If not, troubleshoot using the tools explained in Chapter 5, Monitoring the Cluster Health. Finally, proceed to run tests 1 and 2 on node02.
The results in our present case are explained here. For our convenience, let’s put the results from both nodes in the following tables for a quick comparison.
Summarizing results of test 1 on both nodes
TEST 1 [seconds]         Node01    Node02
Average, all queries     20.770    20.179
Minimum, all queries     20.242    19.930
Maximum, all queries     21.298    20.428
On the other hand, for test 2, the next table shows the details:
Summarizing results of test 2 on both nodes
TEST 2 [seconds]         Node01    Node02
Average, all queries     40.008    39.084
Minimum, all queries     38.713    38.779
Maximum, all queries     41.304    39.389
As you can see, the results are very similar in both cases, which confirms that the failover did not affect the performance of the database server running on top of our cluster. While it is true that the failover did not improve performance either, we can see that the resource remained available after the failover without a negative impact on the functionality of the cluster.
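To put a number on "very similar", the relative difference between two timings can be computed with awk. This is a small sketch only; the function name pct_change is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: percentage change from a baseline time ($1) to a new
# time ($2), both in seconds. A negative result means the second run was faster.
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (new - base) / base * 100 }'
}

# Test 1 averages from the table above (Node01 first, then Node02):
#   pct_change 20.770 20.179
# Test 2 averages:
#   pct_change 40.008 39.084
```

A change of a few percent in either direction is within what you would expect from run-to-run variation on virtual machines, which is why repeating each test several times is worthwhile.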
Measuring and improving performance
You will recall from earlier chapters that by definition, a resource is a service that is made highly available by the cluster. Every resource is assigned what is called a resource agent, an external shell script that manages the actual resource for the cluster, independently of how those services would be managed by systemd if they were left to its care. Thus, the actual operation of the resource is transparent to the cluster, since it is being managed by the resource agent.
Resource agents are found inside /usr/lib/ocf/resource.d, so feel free to take a look at them to become better acquainted with their structure. In most circumstances, you will not need to modify them, but will work on the specific resources’ configuration files instead, as we shall see. You will recall from earlier chapters that adding a cluster resource involved using an argument of the standard:provider:resource_agent form (ocf:heartbeat:mysql, for example). You can also view the complete list of resource standards and providers with pcs resource standards and pcs resource providers respectively. Additionally, you can view the available agents for each standard:provider pair with pcs resource agents standard:provider.
Apache’s configuration and settings
When the Apache web server is first installed, by default, it comes with several modules in the form of Dynamic Shared Objects (DSOs) that extend its functionality. The downside is that some of them may consume resources unnecessarily if they remain loaded and your applications don’t use them. As you can probably guess, this may lead to performance loss over time.
In CentOS 7, you can view the list of currently loaded static and shared modules with httpd -M. The following output is truncated for the sake of brevity, but should be very similar in your case:
Loaded Modules:
 core_module (static)
 so_module (static)
 http_module (static)
 access_compat_module (shared)
 actions_module (shared)
 alias_module (shared)
 allowmethods_module (shared)
 auth_basic_module (shared)
 auth_digest_module (shared)
 authn_anon_module (shared)
 authn_core_module (shared)
A careful inspection of the module list and solid knowledge of what your applications actually need will help you determine which modules are not needed and can thus be unloaded for the time being.
Look at the following line in /etc/httpd/conf/httpd.conf:
IncludeOptional conf.modules.d/*.conf
This line indicates that Apache will look in the conf.modules.d directory for instructions to load modules inside .conf files. For example, in the standard installation, 00-base.conf contains ~70 LoadModule directives that point to DSOs inside /etc/httpd/modules. It is in these .conf files that you can enable or disable Apache modules (you disable a module by prepending its LoadModule directive with a # symbol, thus commenting out that line). Note that this must be performed on both nodes.
Loading and disabling modules
In our 00-base.conf, userdir_module, version_module, and vhost_alias_module are loaded, whereas buffer_module, watchdog_module, and heartbeat_module are disabled.
For example, in order to disable the userdir module, comment out the corresponding LoadModule directive in /etc/httpd/conf.modules.d/00-base.conf on both nodes:
#LoadModule userdir_module modules/mod_userdir.so
Restart the cluster resource on the node where it is currently active:
pcs resource restart webserver
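Because the same edit has to be made on both nodes, it may be convenient to script it. The following is a minimal sketch that assumes GNU sed (as shipped with CentOS 7); the function name disable_module is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: comment out the LoadModule line for module $1 in the
# configuration file $2. Lines that are already commented out are left alone.
# Relies on the -i (in-place) option of GNU sed.
disable_module() {
    local module="$1" conf="$2"
    sed -i "s/^[[:space:]]*LoadModule[[:space:]][[:space:]]*${module}[[:space:]]/#&/" "$conf"
}

# Possible use on each node, followed by the resource restart shown above:
#   disable_module userdir_module /etc/httpd/conf.modules.d/00-base.conf
#   httpd -M | grep userdir    # should print nothing after the restart
```

It is a good idea to keep a copy of the original file before editing it, so that a module can be re-enabled quickly if an application turns out to need it.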
Placing limits on the number of Apache processes and children
In order for Apache to be able to handle as many simultaneous requests as needed, while preventing it from consuming more RAM than you can afford for your application(s), you need to set the MaxRequestWorkers directive (called MaxClients before version 2.3.13) to an appropriate value based mostly on the available physical memory that can be allotted in your specific environment. Note that if this value is set too high, you may bring the web server (and the resource altogether) to its knees.
On the other hand, setting it to an appropriate value, which is calculated based on the memory usage of each Apache process compared to the allotted RAM, will allow the web server to respond to that many requests at once. If the number of requests surpasses the capacity of the server, the extra requests will be served once the first ones have already been served, thus preventing the resource from hanging for all connections.
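The calculation described above amounts to a single division. The sketch below is an illustration under stated assumptions: the function name max_request_workers is hypothetical, and the figures in the usage comment (1024 MB allotted to Apache, 20 MB per process) are made-up example values, not measurements from the cluster in this book:

```shell
#!/bin/bash
# Hypothetical helper: estimate MaxRequestWorkers by dividing the RAM allotted
# to Apache ($1, in MB) by the average memory used per httpd process ($2, in MB).
max_request_workers() {
    local ram_mb="$1" per_process_mb="$2"
    echo $(( ram_mb / per_process_mb ))
}

# Example with made-up figures (1024 MB allotted, 20 MB per process):
#   max_request_workers 1024 20
#
# One way to estimate the per-process figure on a running node is to average
# the resident set size (RSS, reported by ps in KB) of the httpd processes:
#   ps -C httpd -o rss= | awk '{ sum += $1 } END { if (NR) print sum / NR / 1024 }'
```

Integer division rounds the result down, which errs on the side of using less memory than allotted.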
For further details, refer to the Apache MPM Common directives documentation at http://httpd.apache.org/docs/2.4/mod/mpm_common.html. Keep in mind that Apache fine-tuning is out of the scope of this book, and the actions mentioned here are generally not enough for production use.
Database resource
Since you will seldom use a web server without an accompanying database server, you also need to look at that side of things to improve performance. Here are some basic things you will want to look at.
Creating indexes
A database containing tables of hundreds of thousands or millions of records can quickly become a performance bottleneck when a typical SELECT-FROM-WHERE statement is made to retrieve a specific record. Going through every row in a table to accomplish this is considered highly inefficient, as it is performed at the hard disk level.
With indexes, the operation is performed in memory instead of on disk, and records can be automatically sorted so that it’s faster to find the one we want, because an index only contains the actual sorted data and a link to the original data record. In addition, we can create an index for each column we need to sort by, so using indexes becomes a handy tool to improve performance.
To begin, exit your MariaDB session and run test 3 to measure performance without indexes:
mysqlslap --create-schema=employees --query="SELECT * FROM employees WHERE emp_no=1007" --concurrency=15 --number-of-queries=150 --iterations=10 -h 192.168.0.4 -u root -p
Now, let’s create indexes on the emp_no field in the employees and salaries tables, since we will use them in our WHERE clause, and then perform test 3 again. Perform these steps:
1. First, log in to the database server using the following command:
mysql -h 192.168.0.4 -u root -p
2. Then, issue the following commands from the MariaDB shell:
USE employees;
RESET QUERY CACHE;
CREATE INDEX employees_emp_no ON employees(emp_no);
CREATE INDEX salaries_emp_no ON salaries(emp_no);
3. After that, exit the MariaDB shell and run the test again to compare performance. The results, with and without indexes, are summarized in the next table:
Summarizing results of test 3 with and without indexes on node01
TEST 3 [seconds]         Node01 (without indexes)    Node01 (with indexes)
Average, all queries     0.043                       0.038
Minimum, all queries     0.035                       0.037
Maximum, all queries     0.055                       0.046
These results suggest that creating indexes on searchable fields improves performance (here, the average and maximum times decreased, while the minimum stayed roughly the same), as it prevents the server from having to go through all rows before returning the results.
Using query cache
In a MariaDB database server, the results of SELECT queries are stored in a query cache so that when the exact same operation is performed again, the results can be returned faster. This is precisely the case in most modern websites, where similar queries are made over and over again (high-read and low-write environments).
So, how does this happen at the server level? If an incoming query is not found in the cache, it will be processed normally and then stored, along with its result set, in the query cache. Otherwise, the results are pulled from the cache, which makes it possible to complete the operation much faster than if it were processed normally.
In MariaDB, the query cache is enabled by default (SHOW VARIABLES LIKE 'query_cache_type';), but its size is set to zero (SHOW VARIABLES LIKE 'query_cache_size';).
For this reason, we need to set the query cache size variable to an appropriate value according to the use of our application. For example, after setting this variable to 100 KB (SET GLOBAL query_cache_size = 102400;), running SHOW VARIABLES LIKE 'query_cache_size'; again shows that the query cache size has been updated accordingly.
Note that the right value for the query cache size will depend largely, if not entirely, on the needs of your specific case. Setting it too high will result in performance degradation, as the system will have to allocate extra resources to manage a large cache. On the other hand, setting it to a very low value will cause at least some repeated queries to be processed normally and not be cached. In the preceding example, we allocated 100 KB of memory as cache to store queries and their corresponding results.
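Since query_cache_size is expressed in bytes, a tiny helper can take care of the conversion when you experiment with different sizes. This is a sketch only, and the function name query_cache_sql is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print the SET GLOBAL statement that sizes the query
# cache to $1 kilobytes (the server variable itself is expressed in bytes).
query_cache_sql() {
    local kb="$1"
    printf 'SET GLOBAL query_cache_size = %d;\n' $(( kb * 1024 ))
}

# Possible use against the cluster database server:
#   mysql -h 192.168.0.4 -u root -p -e "$(query_cache_sql 100)"
#   mysql -h 192.168.0.4 -u root -p -e "SHOW VARIABLES LIKE 'query_cache_size';"
```

Keep in mind that a value set with SET GLOBAL lasts only until the database server is restarted.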
For further details, refer to the MariaDB documentation (https://mariadb.com), specifically to the Managing MariaDB / Optimization and tuning section.
Note
The MariaDB documentation contains very helpful information to tune a database server starting from the ground up (all the way from the operating system level through query optimization). Other tools to increase performance and stability are MySQL tuner (http://mysqltuner.com/), MySQL Tuning Primer (https://launchpad.net/mysql-tuning-primer), and phpMyAdmin Advisor (https://www.phpmyadmin.net/). The last tool is available in the Status tab of a standard phpMyAdmin installation.
Moving to an A/A cluster
As you will recall from the introduction of Chapter 3, A Closer Look at High Availability, A/A clusters tend to provide higher availability as several nodes are actively running applications at the same time (which, by the way, requires that the necessary data for those applications be available simultaneously on all cluster members). The downside is that if one or more nodes go offline, the remaining ones are assigned extra processing load, thus negatively impacting the overall performance of the cluster.
That being said, let’s examine briefly the required steps to convert our current A/P cluster to an A/A one. Make sure a STONITH resource has been defined (refer to Chapter 3 for further details).
1. Enable STONITH by using the following command:
pcs property set stonith-enabled=true
2. Install the additional software that will be needed for this:
yum update && yum install gfs2-utils dlm
Unlike with a traditional journaling filesystem such as ext4 (which we have used for our filesystems up until this point in the book), you will need a way to ensure that all nodes are granted simultaneous access to the same block storage. Global File System 2 (also known as GFS2) provides such a feature through its command-line tools, which are included in the gfs2-utils package.
In addition, the dlm package will install the Distributed Lock Manager (also known as DLM), a requirement in cluster filesystems to synchronize access to shared resources.
3. Add (and clone) the Distributed Lock Manager as a cluster resource of the ocf class, pacemaker provider, and controld type:
pcs cluster cib dlm_cfg
pcs -f dlm_cfg resource create dlm ocf:pacemaker:controld op monitor interval=60s
pcs -f dlm_cfg resource clone dlm clone-max=2 clone-node-max=1
4. Now, push the newly created resource to the CIB:
pcs cluster cib-push dlm_cfg
5. Choose a replicated storage resource and create a gfs2 filesystem on top of its associated device node.
For example, let’s use the /dev/drbd0 device we created in Chapter 4, Real-world Implementations of Clustering. We will need to unmount it from the node with the DRBD primary role (most likely, node01) before we can create a gfs2 filesystem on it:
umount /dev/drbd0
mkfs.gfs2 -p lock_dlm -j 2 -t MyCluster:Web /dev/drbd0
Here, MyCluster is the name of our cluster, Web is an arbitrary name for the filesystem, and the -j flag indicates that the filesystem will use two journals (in this case, one for each node; you will want to change this number if your cluster consists of more nodes). Finally, the -p option specifies the locking protocol, lock_dlm, which tells mkfs.gfs2 that we are going to use the DLM provided by the kernel.
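Because creating a filesystem destroys whatever is on the device, it can be worth printing the exact command for review before running it. The following sketch only builds the command line and does not execute it; the function name gfs2_mkfs_cmd is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print (but do not run) the mkfs.gfs2 command for
# cluster name $1, filesystem name $2, number of journals $3, and device $4.
# The lock table passed to -t has the form clustername:fsname.
gfs2_mkfs_cmd() {
    local cluster="$1" fsname="$2" journals="$3" device="$4"
    printf 'mkfs.gfs2 -p lock_dlm -j %d -t %s:%s %s\n' \
        "$journals" "$cluster" "$fsname" "$device"
}

# For the two-node cluster used in this book:
#   gfs2_mkfs_cmd MyCluster Web 2 /dev/drbd0
```

Once you are satisfied with the printed command, you can run it by hand on the node that holds the DRBD primary role.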
You will also need to change the fstype option of the web_fs resource from ext4 (the original filesystem used when we first created it in Chapter 4, Real-world Implementations of Clustering) to gfs2 in the PCS resource configuration:
pcs resource update web_fs fstype=gfs2
It is important to note that if the cluster attempts to start web_fs before dlm-clone, we will run into an issue (we cannot mount a gfs2 filesystem if the DLM functionality is not present). Thus, we need to add a colocation constraint so that web_fs will always start on the node where dlm-clone starts, and an ordering constraint so that dlm-clone will be started before web_fs:
pcs constraint colocation add web_fs with dlm-clone INFINITY
pcs constraint order dlm-clone then web_fs
6. Clone the virtual IP address resource.
Cloning the IP address will allow us to effectively use resources on both nodes, but at the same time, any given packet will be sent to only one node (thus implementing a basic load-balancing method in our cluster).
To do this, we will first save the cluster configuration to a file named load_balancing_cfg and then update that file with the commands that follow:
pcs cluster cib load_balancing_cfg
You will notice from the pcs resource help that the clone operation allows you to specify certain options. In the following lines, clone-max specifies the number of nodes that host the virtual_ip resource (2 in this case), whereas clone-node-max indicates the number of resource instances each node is allowed to run. Next, globally-unique instructs the resource agent that each clone instance is distinct from the rest and thus handles distinct traffic as well. Finally, clusterip_hash=sourceip tells us that the packet’s source IP address will be used to decide which node gets to process which request:
pcs -f load_balancing_cfg resource clone virtual_ip clone-max=2 clone-node-max=2 globally-unique=true
pcs -f load_balancing_cfg resource update virtual_ip clusterip_hash=sourceip
The next step consists of cloning the filesystem and the Apache and/or MariaDB resources. Note that in order to allow two primaries in a DRBD device so that you can serve content from both at the same time, you will need to set the allow-two-primaries directive to yes (allow-two-primaries yes;) in the net section of the resource configuration file (/etc/drbd.d/drbd0.res, for example):
resource drbd0 {
    net {
        protocol C;
        allow-two-primaries yes;
    }
    ...
}
7. Once again, save the current CIB to a local file and add the clone resource information. In the next example, we will use web_fs, web_drbd_clone, and webserver:
pcs cluster cib current_cfg
pcs -f current_cfg resource clone web_fs
pcs -f current_cfg resource clone webserver
8. Now, web_drbd should be allowed to run both instances as primary (master):
pcs -f current_cfg resource update web_drbd_clone master-max=2
9. Then, activate the new configuration:
pcs cluster cib-push current_cfg
10. Last but not least, keep in mind that you will need to set the value of the resource stickiness to 0 in order for the cluster to return an instance to its original node after a failover. To do so, refer to the Performing a failover section in this same chapter.
You can now proceed to force a failover as usual, and test the resource availability. Unfortunately, this is not possible in a VirtualBox environment, as I have explained previously. However, it’s entirely possible if you are able to build your cluster with real hardware and an actual STONITH device.
Summary
In this last chapter, we set up a couple of performance testing tools for the example services that you need to make highly available in your cluster, and provided a few suggestions to optimize their performance separately as well. Note that those suggestions are not intended to represent an exhaustive list of tuning methods, but a starting point instead. We have also provided the overall instructions so that you can convert an A/P cluster into an A/A one.
Finally, keep in mind that this book was written using virtual machines instead of specialized hardware. Thus, we have run into some associated limitations, such as the lack of real STONITH devices that would otherwise have allowed us to actually demonstrate the functionalities of an A/A cluster. However, the principles outlined in this book will undoubtedly serve as a guide to set up your own clusters, whether you are experimenting with virtual machines as well or using real hardware.
Best of success in your endeavors!
IndexA
A/AclusterA/Pcluster,convertingto/MovingtoanA/Acluster
ApacheDRBDresource,using/MountingtheDRBDresourceandusingitwithApacheDRBDresource,testing/TestingtheDRBDresourcealongwithApache
ApacheMPMCommondirectivesURL/PlacinglimitsonthenumberofApacheprocessesandchildren
CCentOS
downloading/DownloadingCentOSURL/DownloadingCentOS
CentOS7using/WhyLinuxandCentOS7?about/WhyLinuxandCentOS7?URL/WhyLinuxandCentOS7?nodes,settingup/SettingupCentOS7nodesinstalling/InstallingCentOS7
clustervirtualIP,settingup/SettingupavirtualIPfortheclusterconfiguring,withPCSGUI/ConfiguringourclusterwithPCSGUI
ClusterInformationBase(CIB)about/AddingDRBDasaPCSclusterresource
clusteringabout/ClusteringfundamentalsLinux,using/WhyLinuxandCentOS7?CentOS7,using/WhyLinuxandCentOS7?requiredpackages,installing/Installingthepackagesrequiredforclustering
clusteringservicesconfiguring/Configuringandstartingclusteringservicesstarting/Configuringandstartingclusteringservicesenabling/Startingandenablingclusteringservicestroubleshooting/Troubleshooting
clusterresourcevirtualIP,adding/AddingavirtualIPasaclusterresourceabout/AddingavirtualIPasaclusterresourcewebservers,configuring/Configuringthewebserverasaclusterresourceproblems,troubleshooting/Troubleshooting
clustertestsperforming/Introducinginitialclustertestsfields,retrievingfromrecords/Test1–retrievingallfieldsfromallrecordsJOINoperations,performing/Test2–performingJOINoperationsfailover,performing/Performingafailover
Corosyncabout/Keysoftwarecomponents,Startingandenablingclusteringservices
Ddatabaseresource
performanceoptimization/Databaseresourceindexes,creating/Creatingindexesquerycache,using/Usingquerycache
databaseserversinstalling/Installingthewebanddatabaseservers
datagramsabout/Lettinginandlettingout
DellRemoteAccessController(DRAC)about/InstallingandconfiguringaSTONITHdevice
DesignatedController(DC)about/Managingauthenticationandcreatingthecluster,Split-brain–preparingtoavoidinconsistencies
DistributedLockManager(DLM)about/MovingtoanA/Acluster
DistributedReplicatedBlockDevice(DRBD)about/Settingupstorageinstalling/Settingupstorageavailability/ELReporepositoryandDRBDavailabilityconfiguring/ConfiguringDRBDadding,asPCSclusterresource/AddingDRBDasaPCSclusterresource
DRBDresourcemounting/MountingtheDRBDresourceandusingitwithApacheused,withApache/MountingtheDRBDresourceandusingitwithApachetesting,withApache/TestingtheDRBDresourcealongwithApache
DynamicSharedObjects(DSOs)about/Apache’sconfigurationandsettings
EELReporepository
about/ELReporepositoryandDRBDavailabilityEmployeesdatabase
settingup/Settingupasampledatabasedownloading/DownloadingandinstallingtheEmployeesdatabaseinstalling/DownloadingandinstallingtheEmployeesdatabaseURL/DownloadingandinstallingtheEmployeesdatabase
Ffailover
about/Failover–anintroductiontohighavailabilityandperformancefencing
malfunctioningnodes,isolating/Fencing–isolatingthemalfunctioningnodesabout/Fencing–isolatingthemalfunctioningnodes
GGlobalFileSystem2(GFS2)
about/MovingtoanA/Acluster
H
high-availability database
    setting up, with replicated storage / Setting up a high-availability database with replicated storage
high-performance cluster (HPC)
    about / Clustering fundamentals
high availability (HA)
    about / Clustering fundamentals
I
iLO (Integrated Lights Out)
    about / Installing and configuring a STONITH device
indexes
    creating / Creating indexes
Internet Group Management Protocol (IGMP)
    about / Letting in and letting out
K
key-based authentication
    setting up, for SSH access / Setting up key-based authentication for SSH access
L
LAMP stack
    about / Installing the web and database servers
Linbit
    URL / Configuring DRBD
Linux
    using / Why Linux and CentOS 7?
Logical Volume (LV)
    about / Setting up a high-availability database with replicated storage
Logical Volume Manager (LVM)
    about / ELRepo repository and DRBD availability
M
man journalctl
    URL / Troubleshooting
MariaDB
    URL / Using query cache
members
    about / Clustering fundamentals
modprobe command
    about / Setting up storage
mysqlslap tool
    about / Introducing initial cluster tests
    --create-schema flag / Introducing initial cluster tests
    --query flag / Introducing initial cluster tests
    --delimiter flag / Introducing initial cluster tests
    --concurrency flag / Introducing initial cluster tests
    --iterations flag / Introducing initial cluster tests
    --number-of-queries flag / Introducing initial cluster tests
MySQL tuner
    URL / Using query cache
MySQL Tuning Primer
    URL / Using query cache
N
Nagios
    about / Monitoring the node status
network security
    fundamentals / Security fundamentals
    traffic, allowing between nodes / Letting in and letting out
nodes
    about / Clustering fundamentals
    status, monitoring / Monitoring the node status
nodes, CentOS 7
    setting up / Setting up CentOS 7 nodes
    CentOS 7, installing / Installing CentOS 7
    network infrastructure, setting up / Setting up the network infrastructure
P
Pacemaker
    about / Key software components, Starting and enabling clustering services
packages
    installing, for clustering / Installing the packages required for clustering
    software components, using / Key software components
    key-based authentication, setting up for SSH access / Setting up key-based authentication for SSH access
PCS
    about / Key software components, Starting and enabling clustering services
    setting up / Getting acquainted with PCS
    authentication, managing / Managing authentication and creating the cluster
    cluster, creating / Managing authentication and creating the cluster
PCS cluster resource
    Distributed Replicated Block Device (DRBD), adding / Adding DRBD as a PCS cluster resource
PCS GUI
    cluster, configuring / Configuring our cluster with PCS GUI
performance optimization, cluster
    about / Measuring and improving performance
    Apache’s configuration / Apache’s configuration and settings
    Apache’s settings / Apache’s configuration and settings
    modules, disabling / Loading and disabling modules
    modules, loading / Loading and disabling modules
    Apache processes, limiting / Placing limits on the number of Apache processes and children
    Apache children, limiting / Placing limits on the number of Apache processes and children
    database resource / Database resource
phpMyAdmin Advisor
    URL / Using query cache
Physical Volume (PV)
    about / Setting up a high-availability database with replicated storage
ps command
    about / Cluster services and performance
Q
query cache
    using / Using query cache
quorum
    about / Quorum – scoring inside your cluster
R
replicated storage
    high-availability database, setting up / Setting up a high-availability database with replicated storage
resource agent
    about / Measuring and improving performance
resources
    monitoring / Monitoring the resources
    starting issues, monitoring / When a resource refuses to start
    availability of core components, checking / Checking the availability of core components
S
SELinux (Security Enhanced Linux)
    about / Letting in and letting out
server status page
    URL / Configuring the web server as a cluster resource
Shoot The Other Node In The Head (STONITH)
    about / Setting up CentOS 7 nodes, Viewing the status of the virtual IP
simply gateway [192.168.0.1]
    about / Setting up the network infrastructure
Single Point Of Failure (SPOF)
    about / Split-brain – preparing to avoid inconsistencies
software components
    using / Key software components
    Pacemaker / Key software components
    Corosync / Key software components
    PCS / Key software components
split-brain
    about / Split-brain – preparing to avoid inconsistencies
SSH access
    key-based authentication, setting up / Setting up key-based authentication for SSH access
standard CentOS package manager yum
    about / Downloading CentOS
STONITH device
    installing / Installing and configuring a STONITH device
    configuring / Installing and configuring a STONITH device
storage
    setting up / Setting up storage
T
7789 TCP port
    about / Configuring DRBD
top command
    about / Cluster services and performance
Transmission Control Protocol (TCP)
    about / Letting in and letting out
U
Usage DRBD.org
    URL / Configuring DRBD
User Datagram Protocol (UDP)
    about / Letting in and letting out
V
Virtualbox
    virtual hard disk, adding / ELRepo repository and DRBD availability
VirtualBox
    URL / Setting up CentOS 7 nodes
    installing / Setting up CentOS 7 nodes
virtual IP
    setting up, for cluster / Setting up a virtual IP for the cluster
    adding, as cluster resource / Adding a virtual IP as a cluster resource
    status, viewing / Viewing the status of the virtual IP
Volume Group (VG)
    about / Setting up a high-availability database with replicated storage
W
web servers
    installing / Installing the web and database servers
    configuring, as cluster resource / Configuring the web server as a cluster resource
Z
Zabbix
    about / Monitoring the node status