SUSE Linux Enterprise High Availability Extension Documentation
Open Source High Availability on Linux
Transcript of Open Source High Availability on Linux
![Page 1: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/1.jpg)
IBM
October 2006 © 2006 IBM Corporation
Open Source High Open Source High Availability on LinuxAvailability on Linux
Alan RobertsonAlan [email protected]@unix.sh
![Page 2: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/2.jpg)
Slide 2
IBM
October 2006 © 2006 IBM Corporation
Agenda - High Availability on Linux
HA Basics
Open Source High-Availability Software for Linux
Linux-HA Open Source project
DRBD Open Source Project
Linux Virtual Server (LVS) Project
![Page 3: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/3.jpg)
Slide 3
IBM
October 2006 © 2006 IBM Corporation
The Desire for HA Systems
Who wants lowWho wants lowavailability systems?availability systems?
Why are so few systems HighAvailability?
![Page 4: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/4.jpg)
Slide 4
IBM
October 2006 © 2006 IBM Corporation
Barriers to HA Systems
CostVery manageable with modern hardware, OSS software
ComplexityCan't give away 'simplicity' – good management tools help
![Page 5: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/5.jpg)
Slide 5
IBM
October 2006 © 2006 IBM Corporation
![Page 6: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/6.jpg)
Slide 6
IBM
October 2006 © 2006 IBM Corporation
What would be the result?
Increased Availability
Drastically multiplying customers multiplies experience - products mature faster (especially in OSS model)
OSS developers grow from customers
OSS Clustering is a disruptive technology
![Page 7: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/7.jpg)
Slide 7
IBM
October 2006 © 2006 IBM Corporation
What is a Computer Cluster?
From Wikipedia:
A computer cluster is a group of loosely coupled computers that work together closely so that in many respects they can be viewed as though they are a single computer.
Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.
![Page 8: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/8.jpg)
Slide 8
IBM
October 2006 © 2006 IBM Corporation
HA vs. HPC Clustering
HPC clusters work primarily to manage and maximize the increased performance which results from having multiple computers working together
High-Availability clusters primarily work to manage and maximize the increased availability which is possible when multiple computers work together
These goals are not mutually exclusive
![Page 9: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/9.jpg)
Slide 9
IBM
October 2006 © 2006 IBM Corporation
What is an HA cluster?
A group of computers which cooperate to provide a service even when system components fail
When one machine goes down, others take over its work
This involves IP address takeover, service takeover, etc.
New work comes to the “takeover” machine
When a service fails, it is restarted
Can be restarted on the same server or a different one
![Page 10: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/10.jpg)
Slide 10
IBM
October 2006 © 2006 IBM Corporation
What Can HA clustering do for you?
It cannot achieve 100% availability – nothing can.
HA Clustering primarily designed to recover from single faults
It can make your outages very shortFrom about a second to a few minutes
It is like a Magician's (Illusionist's) trick:When it goes well, the hand is faster than the eyeWhen it goes not-so-well, it can be reasonably visible
A good HA clustering system adds a “9” to your base availability99->99.9, 99.9->99.99, 99.99->99.999, etc.
Complexity is the enemy of reliability!
![Page 11: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/11.jpg)
Slide 11
IBM
October 2006 © 2006 IBM Corporation
High Availability Approach - Redundancy
Redundancy eliminates Single Points Of Failure (SPOF) Reduces cost of planned and unplanned outages
![Page 12: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/12.jpg)
Slide 12
IBM
October 2006 © 2006 IBM Corporation
The 3 R's of High-Availability
Redundancy
Redundancy
Redundancy
If this sounds redundant, that's probably appropriate... ;-)
HA Clustering is a good way of providing and managing redundancy
![Page 13: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/13.jpg)
Slide 13
IBM
October 2006 © 2006 IBM Corporation
High Availability Approach - Failover
Auto detect Failures (hardware, network, applications) Automatic Recovery from failures (no human intervention)
Managed failover to standby systems, components
![Page 14: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/14.jpg)
Slide 14
IBM
October 2006 © 2006 IBM Corporation
Statistics... Counting Nines...
Availability percentage Yearly downtime100% 099.99999% 3s99.9999% 30 sec99.999% 5 min99.99% 52 min99.9% 9 hr99% 3.5 day
![Page 15: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/15.jpg)
Slide 15
IBM
October 2006 © 2006 IBM Corporation
Two Node Active/Passive HA ClusterShared Disk (DS4000, ESS, etc.)
![Page 16: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/16.jpg)
Slide 16
IBM
October 2006 © 2006 IBM Corporation
Two Node Active/Active HA ClusterShared Disk (DS4000, ESS, etc.)
![Page 17: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/17.jpg)
Slide 17
IBM
October 2006 © 2006 IBM Corporation
Linux-HA (“heartbeat”) Project
Open Source Project (IBM Leadership)
Multiple platform solution for Linux, Solaris, *BSD, OS/X
Packaged with most Linux Distributions (except Red Hat)
Part of OSCAR-HA package
Strong focus on ease-of-use, security, low-cost
> 30K clusters in production since 1999
Equal to or superior to commercial HA packages
![Page 18: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/18.jpg)
Slide 18
IBM
October 2006 © 2006 IBM Corporation
What is the "Linux-HA" project?
An open-community project providing basic fail over capabilities for Linux (and other OSes)
Active, open development community led by IBM
Wide variety of industries, applications
Reference implementation for Open Cluster Framework (OCF) standards
Simple to understand and easy to install
No special hardware requirements; no kernel dependencies, all user space
All releases tested by automatic test suites
http://linux-ha.org/
![Page 19: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/19.jpg)
Slide 19
IBM
October 2006 © 2006 IBM Corporation
"Linux-HA" SuccessesFexEx – used in truck schedulingThe Weather Channel (weather.com)BBC – internet infrastructureCERN – grid servicesLos Alamos National Laboratories – badge readersSony - manufacturing processesUnited NationsIntuit (Quicken, TurboTax, etc.) use it for firewallsAgilent Technologies in Fort Collins – 3 clustersISO New England manages the New England power grid using 12 "Linux HA" clustersUniversity of Toledo – 20K user WebCT SystemEmageon – medical imaging services ADC – telco provisioning manager product (w/ x330/335)Incredimail uses "Linux HA" on IBM hardwareBavarian Radio Station (Munich) used "Linux HA" and xSeries for coverage of 2002 Olympics in Salt Lake CityMore listed at: http://linux-ha.org/SuccessStories
![Page 20: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/20.jpg)
Slide 20
IBM
October 2006 © 2006 IBM Corporation
Linux-HA Capabilities
Supports n-node clusters – where 'n' is currently <= something like 16
Active/Passive or full Active/Active
Can use UDP bcast, mcast, ucast comm.
Fails over on node failure, or on service (resource) failure
Fails over on loss of IP connectivity, or arbitrary criteria
Support for the OCF resource management standard
Sophisticated dependency model with rich constraint support (resources, groups, incarnations, master/slave)
XML-based resource configuration
Configuration and monitoring GUI
Support for OCFS2 cluster filesystem – others coming
![Page 21: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/21.jpg)
Slide 21
IBM
October 2006 © 2006 IBM Corporation
Linux-HA futures being considered
Business Continuity support (in source control now)
Specific virtualization supportTransparent migration“Containerized” resources (peek inside client VM via proxy)
Increase number of nodes directly supported
Loosen cluster definition to manage many more nodes through hierarchical proxies
Integration with provisioning software
![Page 22: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/22.jpg)
Slide 22
IBM
October 2006 © 2006 IBM Corporation
DRBD – Distributed Replicating Block DeviceRAID1 over the LAN
DRBD is a block-level replication technology – it works underneath any (non-clustered) filesystem
Every time a block is written on the master side, it is copied over the LAN and written on the slave side
It is extremely cost-effective – common with xSeries
Typically, a dedicated replication link is used
Also used with slower links for Business Continuity
Worst-case around 10% throughput loss – typically negligible
Current versions have very fast “full” resync
![Page 23: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/23.jpg)
Slide 23
IBM
October 2006 © 2006 IBM Corporation
Two Node Active/Passive HA ClusterReal-Time Disk Replication (DRBD) DRBD = Distributed Replicating Block Device
![Page 24: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/24.jpg)
Slide 24
IBM
October 2006 © 2006 IBM Corporation
Linux Virtual Server (LVS) Project
Linux Virtual Server (LVS/ipvs) comes with Linux, very widely used
IP sprayer type of load balancer
Commonly used in “server farm” type arrangements
Integrates well with Linux-HA
Used in many mission-critical applications (like medical imaging, credit card authorization, nuclear facilities)
Some customers perform stateful load-balancer failover in less than .5 seconds
Support for stateful active/active load balancer clusters
![Page 25: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/25.jpg)
Slide 25
IBM
October 2006 © 2006 IBM Corporation
LVS In Action
![Page 26: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/26.jpg)
Slide 26
IBM
October 2006 © 2006 IBM Corporation
Plays Well With Others
Each of these independent services can work together to scale to large systems
All single points of failure can be eliminated
High-Availability, Load Balancing work together nicely
![Page 27: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/27.jpg)
Slide 27
IBM
October 2006 © 2006 IBM Corporation
Linux-HA, DRBD and LVS Working Together
![Page 28: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/28.jpg)
Slide 28
IBM
October 2006 © 2006 IBM Corporation
References
http://linux-ha.org/
http://www.drbd.org/
http://www.linuxvirtualserver.org/
![Page 29: Open Source High Availability on Linux](https://reader034.fdocuments.in/reader034/viewer/2022052419/5876a5ec1a28abb46d8ba0dc/html5/thumbnails/29.jpg)
Slide 29
IBM
October 2006 © 2006 IBM Corporation
Legal Statements
IBM is a trademark of International Business Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
Other company, product, and service names may be trademarks or service marks of others.
This work represents the views of the author and does not necessarily reflect the views of the IBM Corporation.