Breda Development Meetup 2016-06-08 - High Availability
![Page 1: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/1.jpg)
High Availability
Breda Development Meetup
Bas Peters - June 8, 2016
![Page 2: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/2.jpg)
![Page 3: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/3.jpg)
| Uptime percentile target | Max downtime per year |
| --- | --- |
| 90% | 36.5 days |
| 99% | 3.65 days |
| 99.5% | 1.83 days |
| 99.9% | 8.76 hours |
| 99.99% | 52.56 minutes |
| 99.999% | 5.26 minutes |
| 99.9999% | 31.5 seconds |
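The downtime budget for an uptime target is simply the unavailable fraction multiplied by the length of a year. A minimal sketch, assuming a 365-day year (the function name is illustrative, not from the talk):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def max_downtime_seconds(uptime_percent: float) -> float:
    """Seconds of downtime per year that an uptime target still allows."""
    return SECONDS_PER_YEAR * (100.0 - uptime_percent) / 100.0


for target in (90, 99, 99.5, 99.9, 99.99, 99.999, 99.9999):
    hours = max_downtime_seconds(target) / 3600
    print(f"{target}% uptime allows {hours:.4f} hours of downtime per year")
```

Each extra "nine" divides the allowed downtime by ten, which is why the step from 99.9% to 99.99% moves the budget from hours to minutes.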
![Page 4: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/4.jpg)
HA is Redundancy
✓ RAID: Disk crash? Another disk still works!
✓ Virtualization: Physical host crashes? VM available on another physical host!
✓ Clustering: Server crashes? Another server still works!
✓ Power: Power outage? Redundant power supply!
✓ Network: Switch or NIC crashes? 2nd network route available!
✓ Geographical: Datacenter offline? Another DC available to perform work!
![Page 5: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/5.jpg)
Traditional setup
[Diagram: end user, router, server]
![Page 6: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/6.jpg)
Traditional setup - enhanced
[Diagram: end user, router, application server, database server]
![Page 7: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/7.jpg)
Adding redundancy
[Diagram: end user, router, load balancer, application server 1, application server 2, database server]
![Page 8: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/8.jpg)
Enhanced redundancy
[Diagram: end user, router, router (backup), load balancer, load balancer (backup), application server 1, application server 2, database server]
![Page 9: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/9.jpg)
Database redundancy
[Diagram: end user, router, router (backup), load balancer, load balancer (backup), application server 1, application server 2, database server 1, database server 2]
![Page 10: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/10.jpg)
Datacenter redundancy
[Diagram: end user, router, router (backup), load balancer, load balancer (backup), application server 1, application server 2, database server 1, database server 2, spread over datacenter 1 and datacenter 2]
![Page 11: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/11.jpg)
States and sessions
o Multiple requests can be served by different backend servers
o Store session in database or NoSQL cache
o Load balancer can “stick” a single backend server to a user…
o ...but not in all cases!
[Diagram: requests 1, 2 and 3 distributed over app1, app2, app3 and app4]
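To illustrate why the session belongs in a shared store, here is a small sketch in which three consecutive requests from one user land on three different backends. The plain dictionary stands in for the database or NoSQL cache, and the class name and session id are hypothetical:

```python
import itertools

# Assumption: a plain dict stands in for the shared database / NoSQL cache.
shared_sessions = {}


class AppServer:
    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared between all servers, never local state

    def handle(self, session_id):
        # Load (or create) the session from the shared store and update it.
        session = self.store.setdefault(session_id, {"hits": 0})
        session["hits"] += 1
        return f"{self.name} served request {session['hits']}"


servers = [AppServer(f"app{i}", shared_sessions) for i in range(1, 5)]
round_robin = itertools.cycle(servers)  # load balancer without stickiness

replies = [next(round_robin).handle("user-42") for _ in range(3)]
print(replies)
```

Because every server reads and writes the same store, the request counter keeps increasing even though no two requests hit the same backend.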
![Page 12: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/12.jpg)
Local storage
o Avoid storing meaningful persistent user content on a local server
o Application-level caching is useful as long as it is not destructive
o Synchronization of contents between backend servers is a pain
o Use database for storage where possible
… There are possibilities to share storage amongst backend servers
![Page 13: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/13.jpg)
Shared storage - NAS
o Network Attached Storage
o A NAS handles the complete filesystem
o Relies on protocols like:
  NFS: Network File System
  SMB/CIFS: Windows File Sharing
o Simple to implement
o Redundancy is very hard to achieve, often a single point of failure
o Performance is mediocre and bottlenecks can occur
![Page 14: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/14.jpg)
Shared storage - SAN
o Storage Area Network
o A SAN handles only the “block level” part of the filesystem
o Relies on protocols like:
  iSCSI: IP-based SCSI
  Fibre Channel: Optical fiber transport protocol
  AoE: ATA over Ethernet
o Hard to implement, expensive
o Redundancy can be achieved to avoid a single point of failure
o Performance and scalability are (reasonably) good
![Page 15: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/15.jpg)
Shared storage – Cluster Filesystem
o Filesystem shared on multiple servers using special software/drivers
o Windows implementation:
  DFS: Windows Distributed File System
o Linux implementations:
  HDFS: Hadoop Distributed File System
  Ceph: Object Storage Platform
  GlusterFS: Red Hat Cluster Filesystem
o Relatively easy to implement
o Redundancy can easily be achieved
o Performance and scalability are (reasonably) good
![Page 16: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/16.jpg)
Database High Availability
o High Availability on RDBMS (relational database management systems) is often the most difficult thing in a highly available setup
o Hardware resources and data need to be redundant
o Remember that it isn’t just data, it is constantly changing data
o High Availability means the operation can continue uninterrupted, not by restoring a new/backup server
![Page 17: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/17.jpg)
Database HA - Replication
o Asynchronous by default
o One master, many slaves
o No write scale-out possible
o Difficult to recover from a failover situation
o Prone to inconsistency when not used properly
![Page 18: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/18.jpg)
Database HA - Sharding
o Separate data over multiple database back-ends using keyed distribution
o Multi-master setup possible
o Excellent scalability
o Redundancy needs to be obtained through a complementary methodology
o Requires more complex application logic
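Keyed distribution means the application derives the target back-end from the record key. A minimal sketch, assuming hash-modulo placement and hypothetical shard names (the talk does not prescribe a specific scheme):

```python
import hashlib

# Assumption: three back-ends with illustrative names.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for(key, shards=SHARDS):
    """Map a record key to one database back-end, deterministically."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


for user in ("user:1001", "user:1002", "user:1003"):
    print(user, "->", shard_for(user))
```

This is the "more complex application logic" in its simplest form: every query must first resolve its shard. With plain modulo placement, changing the number of shards remaps most keys, and each shard still needs its own redundancy.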
![Page 19: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/19.jpg)
Database HA – Clustering I
o Synchronous by default
o Multi-master setup possible
o Write scale-out possible
o Near-automatic fault recovery
o Requires code-level replication conflict resolving
![Page 20: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/20.jpg)
Database HA – Clustering II
Clustering for Microsoft SQL (from 2012)
o AlwaysOn Availability Groups
o Each node requires WSFC (Windows Server Failover Clustering)
o Asynchronous and synchronous commit mode supported
o Up to 8 “warm” availability replicas can be set up
o These replicas can be used for read transactions and backups
o Availability group listener to automatically redirect clients to the best available server
o Not a “real” cluster, no master-master replication possible
![Page 21: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/21.jpg)
Database HA – Clustering III
Clustering for MySQL (MariaDB)
o Galera (wsrep) plugin to enable clustering (included in MariaDB 10.1 by default)
o Asynchronous and synchronous commit mode supported
o Multi-master synchronous replication
o Read and write scalability
o Automatic membership control, node joining and dropping
o No listener functionality that redirects clients to available nodes
![Page 22: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/22.jpg)
Clustering – Quorum I
“A quorum is the minimum number of members of a deliberative assembly necessary to conduct the business of that group”
- Wikipedia
![Page 23: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/23.jpg)
Clustering – Quorum II
o Node Majority: Each node that is available and in communication can vote. The cluster functions only with a majority of the votes.
o When a network partition occurs, the nodes in the minority part will go into lockdown to avoid a “split brain” situation
o When a network partition resolves, the minority part will rejoin the active cluster after a state transfer to retrieve the data that was changed in the meantime
o A cluster should contain an odd number of nodes to prevent a total lockdown during a node failure or network partition
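The node-majority rule reduces to a strict "more than half" comparison. The sketch below (function names are illustrative) shows why an even split locks down every partition while an odd-sized cluster always leaves one side in charge:

```python
def has_quorum(reachable_votes, total_votes):
    """True when strictly more than half of all votes are reachable."""
    return 2 * reachable_votes > total_votes


def surviving_partitions(partition_sizes):
    """Return the sizes of the partitions that may keep serving requests."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, total)]


print(surviving_partitions([2, 1]))  # 3 nodes split 2/1
print(surviving_partitions([3, 3]))  # 6 nodes split 3/3
```

In the 2/1 split only the two-node side keeps its quorum; in the 3/3 split neither side does, so the whole cluster goes into lockdown.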
![Page 24: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/24.jpg)
Clustering – Scenario 1
o Node A is gracefully stopped
o Other nodes receive “leave” message and quorum is reduced by 1
o Cluster is online
o Node B and C continue to serve requests because they have the majority of votes (2 of 2)
![Page 25: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/25.jpg)
Clustering – Scenario 2
o Node A and B are gracefully stopped
o Node C receives “leave” messages from A and B and quorum is reduced by 2
o Cluster is online
o Node C continues to serve clients since it has the majority of votes in the quorum (1 of 1)
![Page 26: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/26.jpg)
Clustering – Scenario 3
o All nodes are gracefully stopped
o Cluster is offline
o There is a potential problem in starting the cluster again. The most recent (last stopped) node should be used to bootstrap the cluster or there is potential data loss
![Page 27: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/27.jpg)
Clustering – Scenario 4
o Node A disappears from the cluster due to unforeseen circumstances
o Node B and C will try to reconnect to A but will eventually remove A from the cluster, maintaining the quorum (3)
o Cluster is online
o Node B and C continue to serve requests because they have the majority of votes (2 of 3)
![Page 28: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/28.jpg)
Clustering – Scenario 5
o Node A and B disappear from the cluster due to unforeseen circumstances
o Node C will try to reconnect to A and B but will eventually remove both from the cluster, maintaining the quorum (3)
o Cluster is offline
o The cluster is offline because Node C cannot acquire a majority of the votes (1 of 3) and will remain in lockdown
![Page 29: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/29.jpg)
Clustering – Scenario 6
o All nodes disappear from the cluster due to unforeseen circumstances
o Cluster is offline (obviously)
o This is a potential problem as the node with the most recent data should be used to bootstrap the cluster again to avoid data loss
![Page 30: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/30.jpg)
Clustering – Scenario 7
o A network split causes Node A, B and C to lose connectivity with Node D, E and F
o Cluster is offline
o Node A, B and C have no majority (3 of 6) and Node D, E and F also have no majority (3 of 6). All nodes go into lockdown
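The difference between a graceful stop and a crash in the scenarios above is whether the quorum size shrinks. A toy model, assuming the behaviour exactly as described on the slides (the class and method names are hypothetical and not part of any cluster product):

```python
class Cluster:
    def __init__(self, nodes):
        self.active = set(nodes)       # nodes that are currently reachable
        self.quorum_size = len(nodes)  # votes the majority is measured against

    def stop_gracefully(self, node):
        # A "leave" message lets the remaining nodes shrink the quorum.
        if node in self.active:
            self.active.remove(node)
            self.quorum_size -= 1

    def crash(self, node):
        # No "leave" message: the quorum size stays as it was.
        self.active.discard(node)

    def is_online(self):
        return bool(self.active) and 2 * len(self.active) > self.quorum_size


scenario_2 = Cluster("ABC")
scenario_2.stop_gracefully("A")
scenario_2.stop_gracefully("B")

scenario_5 = Cluster("ABC")
scenario_5.crash("A")
scenario_5.crash("B")

print(scenario_2.is_online(), scenario_5.is_online())
```

In both cases only Node C remains, yet the graceful variant stays online (1 of 1) while the crash variant locks down (1 of 3).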
![Page 31: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/31.jpg)
Clustering – Multiple Datacenters I
[Diagram: node1, node2 and node3 distributed over DC1 and DC2]
![Page 32: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/32.jpg)
Clustering – Multiple Datacenters II
[Diagram: node1, node2, node3 and node4 distributed over DC1 and DC2]
![Page 33: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/33.jpg)
Clustering – Multiple Datacenters III
[Diagram: node1, node2 and node3 distributed over DC1, DC2 and DC3]
![Page 34: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/34.jpg)
Clustering – Multiple Datacenters IV
[Diagram: node1 to node6 distributed over DC1, DC2 and DC3]
![Page 35: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/35.jpg)
Health Endpoint Monitoring
o Monitor applications for availability in a HA pool
o Monitor middle-tier services for availability
o Automatic removal of misbehaving endpoints from the pool
o Endpoints that are healthy again after a service interruption are automatically re-added
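Automatic removal and re-adding can be pictured as a pool whose membership is recomputed on every monitoring round. A minimal sketch, where the `probe` callable stands in for a real status request and all names are hypothetical:

```python
class Pool:
    def __init__(self, endpoints):
        self.endpoints = list(endpoints)  # everything that is configured
        self.healthy = set(endpoints)     # everything that may receive traffic

    def run_checks(self, probe):
        """Probe every configured endpoint and update pool membership."""
        for endpoint in self.endpoints:
            if probe(endpoint):
                self.healthy.add(endpoint)      # re-added once healthy again
            else:
                self.healthy.discard(endpoint)  # removed while misbehaving

    def members(self):
        return [e for e in self.endpoints if e in self.healthy]


pool = Pool(["app1", "app2", "app3"])
pool.run_checks(lambda endpoint: endpoint != "app2")  # app2 fails its check
during_outage = pool.members()
pool.run_checks(lambda endpoint: True)                # app2 has recovered
after_recovery = pool.members()
print(during_outage, after_recovery)
```

Real load balancers usually require several consecutive failures or successes before changing membership; that refinement is left out here.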
![Page 36: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/36.jpg)
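Application Health Check
[Diagram: the load balancer sends a status request to the application node, which responds with 200 (OK), response time: 50 ms]
The application node checks:
o Storage available
o Code can be executed
o Database reachable
o Service A running
o Service B running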
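A health endpoint of this kind typically runs each check and collapses the results into one status code. The sketch below assumes 503 as the failure status (the slide only shows the 200 case) and uses placeholder lambdas where real checks would go:

```python
def health_status(checks):
    """Run named checks; return (HTTP status, per-check results)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a check that raises counts as failed
    # Assumption: 503 signals "take me out of the pool".
    status = 200 if all(results.values()) else 503
    return status, results


# Placeholder checks; real ones would touch storage, the database, etc.
checks = {
    "storage available": lambda: True,
    "database reachable": lambda: True,
    "service A running": lambda: True,
}
print(health_status(checks))
```

Keeping the checks cheap matters, because the load balancer calls this endpoint frequently and also watches the response time.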
![Page 37: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/37.jpg)
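Database Health Check
[Diagram: the load balancer sends a status request to the database node, which responds with 200 (OK), response time: 50 ms]
The database node checks:
o Database running
o Simple query can be executed
o Local database node is a healthy cluster node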
![Page 38: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/38.jpg)
Monitoring Strategy
[Diagram: a load balancer in front of app server 1, app server 2 and app server 3; DB load balancers in front of db node1, db node2 and db node3; pool listings showing app server 1 and app server 2, and DB node1 and DB node3]
![Page 39: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/39.jpg)
Design Patterns for HA environments
o Safeguard performance
o Increase fault tolerance
o Improve consistency
![Page 40: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/40.jpg)
Queue based load leveling pattern I
o Temporal decoupling
o Load leveling
o Load balancing
o Loose coupling
[Diagram: tasks, message queue, service; requests received at variable rate, messages processed at a more consistent rate]
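The leveling effect is easiest to see in a tick-based simulation: requests arrive in bursts, but the service never handles more than its capacity per tick. This is an illustrative model with made-up numbers, not a message-queue client:

```python
from collections import deque


def level_load(arrivals_per_tick, capacity_per_tick):
    """Simulate a queue between bursty producers and a steady service.

    Returns the number of messages processed in each tick and the
    backlog that is left at the end.
    """
    backlog = deque()
    processed_per_tick = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):      # requests arrive at a variable rate
            backlog.append(next_id)
            next_id += 1
        done = 0
        while backlog and done < capacity_per_tick:
            backlog.popleft()          # service drains at a bounded rate
            done += 1
        processed_per_tick.append(done)
    return processed_per_tick, len(backlog)


processed, remaining = level_load([10, 0, 0, 5, 0, 0], capacity_per_tick=3)
print(processed, remaining)
```

The bursts of 10 and 5 requests are smoothed into a steady three per tick; the price is that late arrivals in a burst wait in the queue.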
![Page 41: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/41.jpg)
Queue based load leveling pattern II
When to use?
o Any type of application or service that is subject to overloading
When not to use?
o Not suitable if a response with minimal latency is expected from the application or service
![Page 42: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/42.jpg)
Throttling pattern I
o Reject or delay requests to the application when a certain number of requests in a certain amount of time is reached
o Disable or degrade functionality of selected nonessential services so that essential services can run unimpeded with sufficient resources
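The "certain number of requests in a certain amount of time" rule can be sketched as a sliding-window counter. This is one of several possible throttling strategies; the class name is hypothetical and the caller passes in the current time so the example stays deterministic:

```python
from collections import deque


class SlidingWindowThrottle:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()  # times of recently accepted requests

    def allow(self, now):
        """Return True to accept the request at time `now`, False to reject."""
        # Forget accepted requests that have fallen out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False


throttle = SlidingWindowThrottle(max_requests=2, window_seconds=1.0)
decisions = [throttle.allow(t) for t in (0.0, 0.1, 0.2, 1.0, 1.05)]
print(decisions)
```

In a multi-tenant setup one such throttle per tenant keeps a single tenant from monopolizing the application; a rejected request would typically be answered with an error that asks the client to retry later.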
![Page 43: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/43.jpg)
Throttling pattern II
When to use?
o To ensure that a system continues to meet service level agreements
o To prevent a single tenant from monopolizing the resources provided by an application
o To handle bursts in activity
o To help cost-optimize a system by limiting the maximum resource levels needed to keep it functioning
![Page 44: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/44.jpg)
Retry pattern
o Enable the application to handle anticipated, temporary failures
o Transparently retry an operation that has previously failed, in the expectation that the cause of the failure is transient
o Especially useful in micro-service and cloud architectures
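A minimal retry helper might look like the sketch below. The exponential backoff between attempts and the `TransientError` type are assumptions added for illustration; the slide only states that transient failures are retried transparently:

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying on TransientError with growing delays."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # give up: the failure is apparently not transient
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_operation():
    # Fails twice, then succeeds, like a service that is briefly unavailable.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporarily unavailable")
    return "ok"


delays = []
result = retry(flaky_operation, sleep=delays.append)  # record instead of sleeping
print(result, delays)
```

Only errors known to be transient are retried; anything else propagates immediately, and the operation itself should be safe to repeat.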
![Page 45: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/45.jpg)
Deployments
Highly available environments bring additional challenges to software deployments:
o How to perform atomic releases?
o How to roll back a faulty release quickly?
o How to release new software without any downtime?
![Page 46: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/46.jpg)
Basic deployment
[Diagram: load balancer, application server 1, application server 2, database cluster]
1. Replace application code on app server 1
2. Replace application code on app server 2
3. Apply database changes
DONE!
![Page 47: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/47.jpg)
Enhanced deployment
[Diagram: load balancer, application server 1, application server 2, database cluster]
1. Remove app server 1 from the pool
2. Replace application code on app server 1
3. Enable app server 1 in the pool and disable app server 2
4. Replace application code on app server 2
5. Enable app server 2 in the pool
6. Apply database changes
DONE!
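The six steps of the enhanced deployment can be written down as a small script. In this sketch the pool is a plain set and `deploy` and `apply_db_changes` are hypothetical callbacks; a real setup would call the load balancer's API instead:

```python
def enhanced_deploy(pool, deploy, apply_db_changes):
    """Run the six steps for a two-server pool ("app1" and "app2")."""
    pool.discard("app1")   # 1. remove app server 1 from the pool
    deploy("app1")         # 2. replace application code on app server 1
    pool.add("app1")       # 3. enable app server 1 ...
    pool.discard("app2")   #    ... and disable app server 2
    deploy("app2")         # 4. replace application code on app server 2
    pool.add("app2")       # 5. enable app server 2 in the pool
    apply_db_changes()     # 6. apply database changes


pool = {"app1", "app2"}
log = []
enhanced_deploy(
    pool,
    deploy=lambda server: log.append((server, sorted(pool))),
    apply_db_changes=lambda: log.append(("db", sorted(pool))),
)
print(log)
```

The log records which servers were serving traffic at each step: a server is never updated while it is in the pool, and the pool is never empty.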
![Page 48: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/48.jpg)
A/B Deployments I
[Diagram: load balancer, application server 1, application server 2]
www.live.nl: app server 1 - A, app server 2 - A
www.shadow.nl: app server 1 - B, app server 2 - B
Each application server runs webserver A (/deploy/A) and webserver B (/deploy/B)
![Page 49: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/49.jpg)
A/B Deployments II
[Diagram: load balancer and application server]
Request for www.live.nl: “www.live.nl is being served by pool A”. Webserver A code resides at /deploy/A
Request for www.shadow.nl: “www.shadow.nl is being served by pool B”. Webserver B code resides at /deploy/B
![Page 50: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/50.jpg)
A/B Deployments III
[Diagram: load balancer]
www.live.nl: POOL A → B
www.shadow.nl: POOL B → A
By swapping Pool A with Pool B in the load balancer, the entire backends are switched instantaneously.
This enables seamless deployment without downtime
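The pool swap comes down to exchanging two routing entries in the load balancer. A minimal sketch using the host names and deploy paths from the slides (the class itself is hypothetical):

```python
class LoadBalancer:
    def __init__(self):
        # Host name -> pool, and pool -> code location, as on the slides.
        self.routes = {"www.live.nl": "A", "www.shadow.nl": "B"}
        self.pool_paths = {"A": "/deploy/A", "B": "/deploy/B"}

    def resolve(self, host):
        """Return the code location that serves requests for `host`."""
        return self.pool_paths[self.routes[host]]

    def swap(self):
        """Exchange the live and shadow pools in a single assignment."""
        live, shadow = self.routes["www.live.nl"], self.routes["www.shadow.nl"]
        self.routes["www.live.nl"], self.routes["www.shadow.nl"] = shadow, live


lb = LoadBalancer()
before = lb.resolve("www.live.nl")
lb.swap()  # release: the tested shadow code becomes live
after = lb.resolve("www.live.nl")
print(before, after)
```

Rolling back a faulty release is the same operation: calling `swap()` again points www.live.nl back at the previous code.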
![Page 51: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/51.jpg)
Deployment best practices
o Never introduce backwards-breaking changes to the database
o Thoroughly test the shadow-live environment as it is the closest to the real live deployment
o Maintain tight release versioning, based on semantic versioning
o Releasing at the end of the day and on a Friday is not recommended
![Page 52: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/52.jpg)
Questions?
![Page 53: Breda Development Meetup 2016-06-08 - High Availability](https://reader031.fdocuments.in/reader031/viewer/2022030307/58e980ea1a28aba6498b5127/html5/thumbnails/53.jpg)
WWW.CMTELECOM.COM
THANKS FOR LISTENING!