SUSE Linux Enterprise High Availability Roadmap · 2015-11-20 · SUSE ® Linux Enterprise High...

SUSE® Linux Enterprise High Availability Roadmap

Kai Dupke, Senior Product Manager

SUSE Linux Enterprise Server

kdupke@suse.com

Distribution: pdf any

Date: 2015-09-22

Not a public document

Lars Marowsky-Brée, Distinguished Engineer

Architect Storage / HA

lmb@suse.com

Topics

Overview

SUSE High Availability

Geo Cluster

Roadmap

Challenge

• Faults will occur

– Hardware crash, flood, fire, power outage, earthquake?

• Service outage and loss of data

– You might be able to afford a five-second blip, but can you afford a longer outage?

• How much does downtime cost?

Murphy's Law is universal

Can you afford systems with low availability?

HA or no-HA?

Quis custodit custodes?

• Re-boot instead of fail-over

‒ (Virtual) hardware needs to be available

• Re-deployment instead of fail-over

‒ Monitor needs to be always available

• Farmed services

‒ Client needs to handle fail-over ('F5', SMTP)

‒ 3rd party application must support scale-out

‒ Backend needs to be available

SUSE Linux Enterprise High Availability Extension

Overview – SUSE® Linux Enterprise High Availability Extension

• Most modern and complete open source solution for highly available Linux clusters

• A suite of robust open source technologies that is

‒ Easy to use

‒ Integrated

‒ Virtualization agnostic

• Used with SUSE Linux Enterprise Server, it helps to

‒ Maintain business continuity

‒ Protect data integrity

‒ Reduce unplanned downtime for mission-critical workloads

Features

• Service Failover

• Cluster File System

• Clustered Samba

• Virtualization Agnostic

• Support for x86, x86_64, POWER, System z

• Network Load-Balancer

• Data Replication

• Node Recovery

• Unlimited Geo Clustering

Targets

Quick and easy to install, configure, and manage

Continuous access to mission-critical systems and data

Transparent to Virtualization

Meet Service Level Agreements

Increase service availability

Key Use Cases

• Focus on mission-critical services

• Active/active services

‒ OCFS2, Databases, Samba File Servers

• Active/passive service fail-over

‒ Traditional databases, SAP setups, regular services

• High availability across guests

‒ Fine-grained monitoring and HA on top of virtualization

• Network Load-Balancing

‒ With transparent fail-over

• All Topologies

‒ Local, Metro, and Geographical area clusters

Sample Use Cases – SAP

• Simple Stack

• Enqueue Replication

• DRBD Data Sync

• HA in Virtual Environments

Reference – oXya

• SAP Systems Integrator & Managed Services

• Part of Hitachi Data Systems

• Unix to SUSE Linux & SAP migrations

• youtube.com/watch?v=pcKxCcKgmQ4

“SUSE Linux Enterprise [is] giving our customers an enterprise-class option for high availability.”

“High-availability managed services enable our customers to adapt and grow their businesses faster and at lower cost.”

“We are very happy with the technical support that SUSE provides.”

“..., we recommend a clustered solution based on SUSE Linux Enterprise ServerTBD.”

– François Détrez, CTO & co-founder

oXya

Summary

• Service failover at any distance – from local to geo

• Up to 99.9999% availability

• Rolling updates for less planned downtime

• Easy setup, administration, management

• Virtualization agnostic

• Leading open source High Availability

• On par with proprietary products

Fighting Murphy's Law

When will you start?
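To put the "Up to 99.9999% availability" figure into perspective, an availability percentage can be converted into the downtime it allows per year. The following is a minimal sketch, not part of the original slides; the function name and the choice of a 365-day year are assumptions made for illustration.

```python
# Assumption: a year is counted as 365 days; the slides do not define this.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime (in seconds) implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return SECONDS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Print the yearly downtime budget for a few common availability levels.
    for percent in (99.9, 99.99, 99.999, 99.9999):
        seconds = downtime_seconds_per_year(percent)
        print(f"{percent}% -> {seconds:.1f} s of downtime per year")
```

Under this assumption, 99.9999% leaves a budget of roughly 31.5 seconds of downtime per year, while 99.9% leaves about 8.76 hours.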

Setup & Management

Easy Setup – Bootstrap & Wizards

• Bootstrapping a cluster is really easy

- node1 # sleha-init -i bond0 -t ocfs2 -p /dev/sdb

- node[2...N] # sleha-join -c 192.168.2.1

Options are optional ...

• Connect to the web console for cluster management & wizards

Administration

History Explorer

Cluster Simulator

Command Line

Geo Cluster

Geo Cluster – Overview

• Cluster fail-over between different locations

‒ Provides disaster resilience in case of site failure

‒ Each site is a self-contained, autonomous cluster

‒ Supports manual and automatic switch-over and fail-over

• Extends Metro Cluster capabilities

‒ No distance limit between data centers

‒ No unified storage / network needed

• Storage replicated as active / passive

‒ Leverage the data replication included with SUSE (DRBD)

‒ Integrate third-party solutions via scripts

Geo Cluster – From Local to Geo

• Local cluster

‒ Negligible network latency

‒ Typically synchronous concurrent storage access

• Metro area (stretched) cluster

‒ Network latency <15ms (~20mls)

‒ Unified / redundant network between sites

‒ Usually some form of replication at the storage level

• Geo clustering

‒ High network latency, limited bandwidth

‒ Asynchronous storage replication
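The distinction between local, metro, and geo clusters can be read as a simple decision rule. The sketch below is illustrative only: the 15 ms metro bound is taken from the slide, while the 1 ms cut-off for "negligible" latency, the `classify` function, and the `Topology` type are assumptions introduced here.

```python
from dataclasses import dataclass

# Only the 15 ms bound comes from the slide; the 1 ms value is a hypothetical
# stand-in for "negligible network latency".
NEGLIGIBLE_LATENCY_MS = 1.0
METRO_MAX_LATENCY_MS = 15.0


@dataclass(frozen=True)
class Topology:
    name: str
    storage: str  # storage approach the slide associates with this topology


LOCAL = Topology("local", "typically synchronous concurrent storage access")
METRO = Topology("metro", "usually replication at the storage level")
GEO = Topology("geo", "asynchronous storage replication")


def classify(latency_ms: float, unified_network: bool) -> Topology:
    """Pick the cluster topology that matches the link between two locations."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if unified_network and latency_ms <= NEGLIGIBLE_LATENCY_MS:
        return LOCAL
    if unified_network and latency_ms < METRO_MAX_LATENCY_MS:
        return METRO
    # High latency or no unified network between sites: treat as geo.
    return GEO


if __name__ == "__main__":
    for latency, unified in ((0.2, True), (8.0, True), (8.0, False), (80.0, True)):
        topology = classify(latency, unified)
        print(f"{latency} ms, unified={unified}: {topology.name} ({topology.storage})")
```

Real deployments weigh more factors than latency, such as bandwidth and storage capabilities, so the thresholds should be treated as placeholders.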

Geo Cluster – Setup

[Diagram: Site A and Site B, each running boothd (nodes shown: Node 1, Node 2, Node 7, Node 8), and Site C as arbitrator, also running boothd]
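The third site in the setup acts as a tie-breaker: with two sites plus an arbitrator, a site that loses contact with its peer can still reach a majority through the arbitrator. The sketch below shows only this majority idea. It is a simplification and does not reproduce the actual boothd protocol; the member names and the function `may_hold_ticket` are hypothetical.

```python
from __future__ import annotations

# Hypothetical member names for the two sites and the arbitrator.
MEMBERS = frozenset({"site-a", "site-b", "site-c-arbitrator"})


def has_majority(visible: set[str], members: frozenset[str]) -> bool:
    """True if the visible members form a strict majority of all members."""
    return len(visible & members) > len(members) // 2


def may_hold_ticket(site: str, reachable_peers: set[str],
                    members: frozenset[str] = MEMBERS) -> bool:
    """A site may hold a ticket only if it and its reachable peers form a majority."""
    if site not in members:
        raise ValueError(f"unknown site: {site}")
    return has_majority({site} | reachable_peers, members)


if __name__ == "__main__":
    # The link between Site A and Site B fails; only A still reaches the arbitrator.
    print(may_hold_ticket("site-a", {"site-c-arbitrator"}))
    print(may_hold_ticket("site-b", set()))
```

In this scenario Site A keeps a majority (two of three members) and Site B does not, so only one site continues to run the protected services.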

Roadmap

Roadmap

[Timeline figure, 2012 to 2016: release tracks SLE 11 and SLE 12; milestones shown: SP3, GA, SP2, SP1, SP4]

SLE HA 12

• Off-line History-Explorer

• SCSI Reservation

• GEO

- Standard stack

- multi-tenancy

- IP relocation

SLE HA 12 SP1

• More wizards: MariaDB

• GEO

- enhanced security

- improved management

• HAWK redesign preview

SLE HA 11 SP4

• More wizards: DB2, Oracle

• Selected features from SLE HA 12

• Better scalability

Recent Improvements – 12

• History Explorer

‒ Off-line support

• Fence Agents update

‒ SCSI handling

• Administration

‒ Cluster health evaluation

‒ crmsh improvements

‒ New config options

• Node Recovery

‒ Updated rear

• Load Balancer

‒ HAproxy added

• Cluster File System

‒ OCFS2 performance improvements

‒ GFS2 added

• Geo Clustering

‒ Multi tenancy arbitrator

‒ IP relocation (DNS based)

Recent Improvements – 11 SP4

• New wizards

‒ Oracle simple stack

‒ DB2 simple stack

‒ DB2 HADR

‒ Shared disk web service

• Administration

‒ Resource groups

‒ Multiple FS exports

• Scalability

‒ Secondary cluster nodes with pacemaker_remote

• Unified Platform

‒ SLE HA 12 ReaR added to 11 SP4

‒ Important features from SLE HA 12

‒ Important features from SLE HA GEO 12

Selected Features

External Remote Monitoring

• Remote monitoring of resources

‒ no HA components needed

‒ re-use of Nagios/Icinga plugins

• Improved handling of virtual guests

‒ monitor virtual services from the hypervisor

‒ improve protection of VMs as cluster workload

‒ guests remain unaltered – monitoring is external

• Extends pacemaker to include the concept of “container” resources

Scale-out – pacemaker_remote

[Diagram: Cluster Domain]

• Total node count beyond 32 nodes

• Main nodes control secondary nodes

Scale-out via remote nodes

• Core cluster is a “small” full traditional cluster

• Core drives arbitrary number of remote nodes

‒ Remote nodes can be virtual or physical

• Remote management and monitoring:

‒ Remote agent (pacemaker-remote) needed

‒ Uses resource agents & system init scripts

‒ More feature-rich than external monitoring

• Remote nodes can host (almost) all resources

‒ Exceptions: DLM, cLVM2, OCFS2, GFS2
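The placement rule for remote nodes can be written down as a small lookup. This is a sketch of the rule as stated on the slide, not Pacemaker code; the names `CORE_ONLY` and `eligible_node_kinds` and the lower-case resource labels are assumptions made for the example.

```python
# Resource types the slide lists as exceptions: they stay on core cluster nodes.
CORE_ONLY = frozenset({"dlm", "clvm2", "ocfs2", "gfs2"})


def eligible_node_kinds(resource_type: str) -> set[str]:
    """Return the kinds of node ("core", "remote") that may host a resource type."""
    if resource_type.lower() in CORE_ONLY:
        return {"core"}
    return {"core", "remote"}


if __name__ == "__main__":
    # "apache" is only an example of a resource that is not on the exception list.
    for resource in ("apache", "OCFS2", "DLM"):
        kinds = ", ".join(sorted(eligible_node_kinds(resource)))
        print(f"{resource}: {kinds}")
```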

Scale easier – Multiple FS Exports

• Simplify NFS configuration

Scale easier – Resource Tags

• Bundling groups of resources

• Handle connected resources with one command

Customer Friendly – Wizards

• Guided setup through the GUI

• Makes standard scenarios easier

• Less to configure – faster to deploy

Outlook

Upcoming Improvements – 12 SP1

• More wizards

‒ All from SLE HA 11 SP4

‒ Plus MariaDB

• Performance and scalability improvements

• SLE HA GEO

‒ Authentication of clients and peers

‒ Support for multiple simultaneous clients

• Preview of re-designed HAWK web GUI

Forward-looking statement; might change without notice.

HAWK Re-Design – Preview

Areas to Look Into

• Failure will occur

‒ What outage is tolerable – 0s, 1s, 1min, 1hour, 1day?

• Virtualization and Cloud

‒ Is re-{booting,deploying} a guest sufficient?

‒ Install HA components in the guests?

• Service Monitoring

‒ In depth monitoring, 'system as one' or remote monitoring?

• Local, Metro, Geo...

‒ What is the next cluster scenario?

Architecture

Cluster Example

[Diagram labels: Clients; Network Links; Xen VM1; Xen VM2; LAMP, Apache, IP, ext3; Pacemaker; Corosync + openAIS; DLM; cLVM2+OCFS2; Kernel (three nodes); Storage]

Linux High Availability Stack

• The stack includes:

‒ resource-agents – manage and monitor availability of services

‒ stonith – IO fencing support (also Xen and VMware VMs)

‒ corosync and OpenAIS – cluster infrastructure

‒ Pacemaker – cluster resource manager

‒ CRM GUI – graphical interface for cluster resource and dependencies editing

‒ hawk – Web console for cluster monitoring and administration

‒ CLI – improved command line to interact with the CIB: editing, prepare multiple changes - commit once, syntax validation, etc.

Detailed Architecture

Learn more

www.suse.com/products/highavailability

Unpublished Work of SUSE. All Rights Reserved.

This work is an unpublished work and contains confidential, proprietary and trade secret information of SUSE. Access to this work is restricted to SUSE employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of SUSE. Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability.

General Disclaimer

This document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. SUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for SUSE products remains at the sole discretion of SUSE. Further, SUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All SUSE marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.