Copyright © 2014, Oracle and/or its affiliates. All rights reserved.
High Availability with Oracle Linux
Lucian Preoteasa, Sales Consultant, Oracle Linux and Oracle VM, June 2015
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
1. OCFS2
2. Clusterware
3. Ksplice
Market Drivers OCFS2
Essential Concepts What is a Clustered File System?
• A file system that is shared by being mounted simultaneously on multiple servers
• Oracle supports a shared disk clustered file system architecture providing a basis for load balancing and failover solutions (e.g. OCFS2, ACFS)
OCFS2 Scalable Cluster File System at No Additional Cost
• Shared-disk cluster file system for Linux
• POSIX+ conformance
• Native journaling file system
– 2003: Developed as successor to OCFS
– January 2006: Integration into mainline Linux (2.6.16)
• Architecture- and endian-neutral
– Parallel mounts on x86, x86-64, IA64 or PPC possible
– Big- and little-endian, 32-bit and 64-bit
ORIGINALLY DEVELOPED BY ORACLE
Oracle Cluster File System Architecture
Oracle Cluster File System Features and Benefits Overview
• Advanced Security (POSIX ACLs and SELinux) and Quotas
• REFLINK Snapshots with Copy-On-Write
• In-built Clusterstack with a Distributed Lock Manager
• File Size Scalability up to 16 TB
• Cluster Scalability up to 32 Nodes
• Used by Oracle VM Tier 1 Virtualization solution, database clusters (Oracle RAC), middleware clusters (Oracle E-Business Suite), appliances (SAP's Business Intelligence Accelerator), and many other Oracle products
OCFS2 Architecture OCFS2 Heartbeat and Split Brain Scenario Avoidance
• With the local heartbeat, the heartbeat threads read from and write to a heartbeat file on a per-mount basis
• With the global heartbeat, the heartbeat threads read from and write to the regions that were initialized together with the O2CB cluster stack
[Diagram: Node 1, Node 2 and Node 3 each run O2CB and share the heartbeat files or regions on shared storage; every node reports "I see 1,2,3", e.g. a cluster in optimal status]
OCFS2 Cluster Stack Storage Heartbeat
• A heartbeat is used to check each node’s health
– It runs in a separate service, o2cb
• Storage heartbeat (o2hb-diskid process)
– The only way to check the liveness of a node in the cluster
– Uses the global heartbeat region in the SPFS (server pool file system) of a clustered server pool
• Each node has its own area, indexed by node number, in the global heartbeat region
– Every two seconds
• Each node tries to update the timestamp in its assigned area
• Each node reads the global heartbeat region to learn which other nodes are alive
– Failure to update the timestamp within the defined interval causes a heartbeat failure
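The storage heartbeat described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `HeartbeatRegion`, `simulate` and the `dead_after_s` tolerance are hypothetical names, and the region is modeled in memory rather than on a shared disk. Only the two-second write interval and the one-slot-per-node layout come from the slide.

```python
from dataclasses import dataclass, field

HEARTBEAT_INTERVAL_S = 2  # from the slide: each node writes every two seconds


@dataclass
class HeartbeatRegion:
    """Hypothetical in-memory stand-in for the on-disk heartbeat region."""
    slots: dict = field(default_factory=dict)  # node number -> last timestamp

    def write(self, node, now):
        # A node only ever updates its own slot, indexed by its node number.
        self.slots[node] = now

    def live_nodes(self, now, dead_after_s):
        # A node counts as alive if its timestamp is recent enough.
        return {n for n, ts in self.slots.items() if now - ts <= dead_after_s}


def simulate(stopped_at, until, dead_after_s):
    """stopped_at maps node number -> time it stops writing (None = never)."""
    region = HeartbeatRegion()
    for now in range(0, until + 1, HEARTBEAT_INTERVAL_S):
        for node, stop in stopped_at.items():
            if stop is None or now < stop:
                region.write(node, now)
    # Any node can read the whole region to see who is still alive.
    return region.live_nodes(until, dead_after_s)


if __name__ == "__main__":
    # Node 2 stops writing at t=10 s; the region is read at t=30 s.
    print(sorted(simulate({0: None, 1: None, 2: 10}, until=30, dead_after_s=12)))
```

In this run node 2 last wrote at t=8 s, so at t=30 s its timestamp is older than the assumed 12-second tolerance and only nodes 0 and 1 are reported alive.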
OCFS2 Cluster Stack Network Heartbeat
• Driven by the o2net process
– Checks connectivity to the other nodes in the cluster
– How it works:
• A node establishes connections to the other nodes in the cluster
– When the node is started
– When the node finds a newly alive node through the storage heartbeat
• A keep-alive message is sent periodically once a connection is established
• Failing to connect to some alive nodes within the defined interval
– Means a network heartbeat failure
– Causes a split-brain problem
• Resolved by the quorum mechanism in OCFS2
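The slides do not spell out the quorum rule. The sketch below assumes the rule as it is commonly described for the O2CB stack: a node keeps quorum if it can reach a majority of the nodes that are heartbeating on disk, and in an even split the half containing the lowest-numbered node survives. The function name `must_fence` and its parameters are hypothetical.

```python
def must_fence(self_node, heartbeating, connected):
    """Decide whether self_node should fence itself (assumed O2CB-style rule).

    heartbeating: node numbers seen alive via the storage heartbeat,
                  including self_node.
    connected:    node numbers reachable over the network, excluding self.
    """
    heartbeating = set(heartbeating)
    # Only nodes that are alive on disk count; a node always reaches itself.
    reachable = (set(connected) & heartbeating) | {self_node}
    total = len(heartbeating)

    if total % 2 == 1:
        # Odd number of live nodes: a strict majority is required.
        return len(reachable) < (total + 1) // 2

    half = total // 2
    if len(reachable) < half:
        return True
    if len(reachable) == half:
        # Even split: the half holding the lowest-numbered node survives.
        return min(heartbeating) not in reachable
    return False


if __name__ == "__main__":
    # Two-node cluster whose network link is cut: node 0 stays, node 1 fences.
    print(must_fence(0, {0, 1}, set()), must_fence(1, {0, 1}, set()))
```

The tie-break matters most for the two-node layout shown in the configuration slides, where a cut interconnect always produces an even split.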
OCFS2 - Cluster Configuration /etc/ocfs2/cluster.conf – Cluster Layout
$ cat /etc/ocfs2/cluster.conf
cluster:
    name = b383b1a1e6fc003f
    heartbeat_mode = global
    node_count = 2

node:
    cluster = b383b1a1e6fc003f
    name = ovs1
    number = 0
    ip_address = 10.146.147.1
    ip_port = 7777

node:
    cluster = b383b1a1e6fc003f
    name = ovs2
    number = 1
    ip_address = 10.146.147.2
    ip_port = 7777

heartbeat:
    cluster = b383b1a1e6fc003f
    region = 0004FB000005000054AE95D21C6E22FB0
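To make the layout concrete, here is a minimal sketch of reading such a file, assuming the usual form of one stanza header (`cluster:`, `node:`, `heartbeat:`) per line followed by indented `key = value` lines. `parse_cluster_conf` is a hypothetical helper, not part of the OCFS2 tools, and `SAMPLE` repeats only part of the configuration shown on the slide.

```python
def parse_cluster_conf(text):
    """Return a list of (stanza_type, {key: value}) in file order."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines separate stanzas
        if line.endswith(":") and "=" not in line:
            # A stanza header such as "node:" starts a new section.
            current = (line[:-1], {})
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[1][key.strip()] = value.strip()
    return stanzas


SAMPLE = """\
cluster:
    name = b383b1a1e6fc003f
    heartbeat_mode = global
    node_count = 2

node:
    cluster = b383b1a1e6fc003f
    name = ovs1
    number = 0
    ip_address = 10.146.147.1
    ip_port = 7777

node:
    cluster = b383b1a1e6fc003f
    name = ovs2
    number = 1
    ip_address = 10.146.147.2
    ip_port = 7777
"""

if __name__ == "__main__":
    stanzas = parse_cluster_conf(SAMPLE)
    cluster = next(values for kind, values in stanzas if kind == "cluster")
    nodes = [values for kind, values in stanzas if kind == "node"]
    # Sanity check: node_count should match the number of node stanzas.
    print(int(cluster["node_count"]) == len(nodes), [n["name"] for n in nodes])
```

A check like this is a cheap way to catch a cluster.conf whose node_count has drifted from the node stanzas after a manual edit.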
OCFS2 - Cluster Configuration /etc/sysconfig/o2cb – Cluster Timeout Values
$ cat /etc/sysconfig/o2cb
O2CB_ENABLED=true
O2CB_STACK=o2cb
O2CB_BOOTCLUSTER=b383b1a1e6fc003f
O2CB_HEARTBEAT_THRESHOLD=61
O2CB_IDLE_TIMEOUT_MS=60000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
Network heartbeat configuration
• o2net daemon: O2CB_IDLE_TIMEOUT_MS, O2CB_KEEPALIVE_DELAY_MS, O2CB_RECONNECT_DELAY_MS
Storage heartbeat configuration
• o2hb-diskid daemon: O2CB_HEARTBEAT_THRESHOLD
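The values above are easier to read as durations. The sketch below assumes the relationship given in the OCFS2 documentation, where the disk heartbeat timeout in seconds is (O2CB_HEARTBEAT_THRESHOLD - 1) × 2; that formula is not stated on the slide. The helper names are hypothetical.

```python
def parse_sysconfig(text):
    """Parse simple KEY=VALUE lines into a dict of strings."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def disk_heartbeat_timeout_s(threshold, interval_s=2):
    # Assumed formula: the threshold counts two-second heartbeat iterations,
    # and a node is declared dead after (threshold - 1) missed intervals.
    return (threshold - 1) * interval_s


SYSCONFIG = """\
O2CB_HEARTBEAT_THRESHOLD=61
O2CB_IDLE_TIMEOUT_MS=60000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
"""

if __name__ == "__main__":
    cfg = parse_sysconfig(SYSCONFIG)
    disk_s = disk_heartbeat_timeout_s(int(cfg["O2CB_HEARTBEAT_THRESHOLD"]))
    idle_s = int(cfg["O2CB_IDLE_TIMEOUT_MS"]) / 1000
    print(f"disk heartbeat timeout: {disk_s} s, network idle timeout: {idle_s} s")
```

Under that assumption, the threshold of 61 shown here corresponds to a 120-second disk heartbeat timeout, alongside a 60-second network idle timeout.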
Market Drivers Clusterware
Oracle Clusterware A History of Providing High Availability Solutions
• Oracle Clusterware was introduced as part of Oracle Database 10g, as the foundation for the Oracle Real Application Clusters (RAC) solution.
• Beginning with 10g Release 2, Oracle enhanced Oracle Clusterware to provide high availability services for any workload.
– This includes Oracle products and third-party products!
• The performance and reliability features used in these solutions can be leveraged for all high availability workload needs.
High Availability General Concepts
• The clustering of systems is a standard business practice
• Enterprise solutions have existed in market for over 15 years
• Basic concepts are all similar
– Grouping multiple systems together to appear as a single system
– Provide redundancy to prevent a single point of failure
– Isolate problem nodes to prevent data corruption or workload failure
Oracle Clusterware The Basics
• Hardware Components
– Nodes
– Shared Storage (NAS or SAN)
– Private Interconnect
– Public Interconnect (LAN)
– ACFS, OCFS2, NFS
• Software Components
– Voting Disk
– Oracle Cluster Registry
[Diagram: six nodes (Node 1 to Node 6), each labeled "Local"; labels: Application | Web Services, Public Interconnect, Private Interconnect, Cluster Heartbeat, Shared Storage with Voting Disk and OCR]
Managing Resources Heartbeats, Fencing, Failover, Dependencies
• Oracle Clusterware uses a cluster heartbeat to monitor the status of each node
• To prevent split-brain conditions, nodes are fenced if unresponsive
• Failover actions are defined by the application action profile
• Resource failover can be dependent on other cluster resources
[Diagram: six nodes (Node 1 to Node 6), each labeled "Local", one marked as Fenced Node; labels: Application | Web Services, Public Interconnect, Private Interconnect, Cluster Heartbeat, Shared Storage with Voting Disk and OCR]
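The idea that resource failover can depend on other cluster resources can be sketched as a dependency graph. This illustrates the concept only and is not how Oracle Clusterware implements it; the resource names (`app`, `db`, `vip`, `storage`) and both helper functions are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resources: each one maps to the resources it depends on.
DEPENDS_ON = {
    "app": {"db"},
    "db": {"vip", "storage"},
    "vip": set(),
    "storage": set(),
}


def start_order(depends_on):
    """Order in which resources can be started on the failover node."""
    # static_order() yields every resource after all of its dependencies.
    return list(TopologicalSorter(depends_on).static_order())


def affected_by(failed, depends_on):
    """Resources that directly or indirectly depend on the failed one."""
    hit = {failed}
    changed = True
    while changed:
        changed = False
        for resource, deps in depends_on.items():
            if resource not in hit and deps & hit:
                hit.add(resource)
                changed = True
    return hit - {failed}


if __name__ == "__main__":
    print(start_order(DEPENDS_ON))
    print(sorted(affected_by("vip", DEPENDS_ON)))
```

In this toy graph, a failure of `vip` drags `db` and `app` along with it, and on the surviving node they would be restarted only after `vip` and `storage` are back.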
Managing Non-Oracle Applications with Oracle Clusterware Simple Integration with Oracle Clusterware Application Framework
• Generic Agent for easy application plug-in, available beginning with 11.2.0.4
• Script Agent to build your own agent, available as of 11.2
• Oracle Certified partners provide application agents
• Standalone and bundled agents are available beginning with 11.2.0.3 (see http://oracle.com/goto/clusterware)
Market Drivers Ksplice
Zero-downtime Kernel Updates Oracle Ksplice - providing zero-downtime kernel updates since 2007
• Oracle Ksplice capabilities are extensive
– Supports multiple operating system releases and kernel versions
– Capable of patching a variety of kernel issues
– Easy to apply and rollback updates
– Simple, flexible tools and options for installing updates
– Proven track record in providing stable updates for production systems
A Reboot Impacts More Than Servers
Avoid Expensive Disruptions
• Time is money
• Each reboot of a production system impacts systems connected to or relying on the system:
– Middleware
– Database
– Storage
– Applications
• Not to mention the impact it has on groups throughout an organization
How it works Ksplice Technology
• Ksplice technology transforms Oracle Linux updates into zero-downtime updates
• Linux servers within the customer environment connect to the Unbreakable Linux Network to download and apply updates while the system is running
• Customers can track the status of their servers via an intuitive web interface and can integrate zero downtime updates into existing management tools via an API
[Diagram labels: Kernel update, Ksplice technology, Zero downtime kernel update, Client, Customer systems]
Three Ways to Consume Ksplice 1x Online, 2x Offline
• Individual servers can register with Oracle’s Ksplice server directly
– Each system must be connected to Internet
– Each system will check for new updates ~every 4 hours
– Oracle provides an interactive Web portal to monitor users’ systems
– Updates can be auto-installed if desired
• Offline with Local Yum Ksplice server
– Utilises Company Certificates to secure link between local Ksplice server and Oracle’s online Ksplice server
– All internal servers use local Ksplice server
• Completely Offline
Ksplice Hosted Access to Online Updates 1x Online, 2x Offline
• Log into Ksplice as you would ULN to manage your Ksplice systems
Ksplice Off-Line with Intranet Connection Disconnected from Internet, but connected through Intranet
• Create a local YUM mirror and register the Ksplice channel(s)
– http://docs.oracle.com/cd/E37670_01/E37355/html/ol_offlncl_ksplice.html
• Subscribe each machine to your local YUM Ksplice channel with a repository entry in /etc/yum.repos.d
[ol6_x86_64_ksplice]
name=Ksplice for $releasever - $basearch
baseurl=http://local_yum_server/yum/OracleLinux/OL6/ksplice/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY
gpgcheck=1
enabled=1
• Install the Ksplice “uptrack-offline” RPM on the target machines
– # yum install uptrack-offline
• Perform kernel patching with your local YUM Ksplice repository
– # yum install uptrack-updates-`uname -r`
– Can be integrated with Spacewalk and EM12c
Ksplice Demo!