Clustering
Types of Clustering

Objectives
  At the end of this module the student will understand the following tasks and concepts:
    What clustering is and why you would want it
    Clustering options
    Differences between various types of clustering; advantages and disadvantages
    Factors to consider when choosing a cluster type

What is a cluster?
  My definition
    Multiple systems performing a single function
    Black box

Why Cluster?
  Performance
  Availability
  Recoverability

Features
  Speedup
    Faster response times
    Transactions finish faster
  Scaleup
    More work done
    More capacity, more concurrent transactions
  Scalability
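
As a rough illustration of the two features above, speedup can be read as a ratio of response times and scaleup as a ratio of work completed in the same time window. These ratio definitions, the function names, and the numbers are assumptions made for this sketch; they are not taken from the slides.

```python
def speedup(single_node_seconds: float, cluster_seconds: float) -> float:
    """Ratio of response times: above 1.0 means transactions finish faster."""
    return single_node_seconds / cluster_seconds


def scaleup(single_node_tx: int, cluster_tx: int) -> float:
    """Ratio of work done in the same time window: above 1.0 means more capacity."""
    return cluster_tx / single_node_tx


if __name__ == "__main__":
    # Hypothetical measurements, not benchmark results.
    print(speedup(8.0, 2.0))    # a transaction drops from 8 s to 2 s
    print(scaleup(1000, 3500))  # transactions per minute rise from 1000 to 3500
```

A cluster can improve one ratio without the other: adding nodes may raise the number of concurrent transactions while each individual transaction takes as long as before.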

Single Node Scaling
  Scales to multiple CPUs
  Doesn’t scale beyond one node
  Multiple single points of failure
  [Diagram: Users, Server, Database]

Cluster Definitions
  Shared Nothing (Federated)
  Replicated Site
  Shared Disk
  Failover
    Active/Passive
    Active/Active
  Shared Everything

Shared Nothing Cluster
  Only one CPU is connected to a disk
  May have shared memory
  MPP systems are Shared Nothing
  Other vendors have “Shared Nothing” clusters

Federated (Shared Nothing) Cluster
  Distributed database (separate database on each machine)
  Data is spread across nodes; each machine has part of the data
  Function is spread across nodes
  Two-Phase Commit
  [Diagram: two servers, each with its own database]
  [Diagram: three-step exchange between the servers: 1. “Got it?” 2. “Got it!” 3. “Good!”]
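
The two-phase commit that a federated cluster relies on can be sketched as below. This is a minimal in-memory model under stated assumptions: the `Participant` class, its fields, and `two_phase_commit` are hypothetical names invented for illustration, and real implementations also handle logging, timeouts, and recovery, which are omitted here.

```python
class Participant:
    """One node of a federated cluster holding part of the data (hypothetical model)."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: the node votes on whether it can commit its part of the work.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list) -> bool:
    """Commit on every node, or on none."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: every node voted yes, so commit everywhere.
        for p in participants:
            p.commit()
        return True
    # Phase 2: at least one node voted no, so roll back everywhere.
    for p in participants:
        p.rollback()
    return False
```

The point of the second phase is that no node commits until every node has confirmed it can, so the separate databases never end up with half of a transaction.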

Replicated System
  Data replicated at the server (network) level or at the storage (SAN) level
  Multiple copies of the same database
  Most common implementation is Active/Passive
  Failover between nodes
  [Diagram: an Active Node and a Passive Node, each a server with its own database, linked by server-level replication or storage-level replication]

Shared Disk Cluster
  Shared file system
  Multiple systems attached to the same disk
  All nodes must have access to data
  Only one database instance; only one node has “ownership” of the shared disk
  Synchronization between systems; if one node fails, then the other takes over

Cluster Interconnect
  Most Shared Disk clusters require some form of Cluster Interconnect
    Network – e.g. Gigabit Ethernet
    Specialized – e.g. InfiniBand, Myrinet
  Most clusters implement a “heartbeat” between cluster nodes to monitor node health
    Multiple nodes require a switch
    Usually separated from the LAN
  Some shared disk clusters implement a “heartbeat” mechanism to a quorum disk via the SAN in addition to/instead of network heartbeat
  Oracle RAC implements Cache Fusion across the interconnect
    Extra network traffic increases the throughput requirements
    UDP implementation requires a separate network
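
The heartbeat idea can be sketched as a table of last-seen timestamps checked against a timeout. The class name, method names, and timeout value below are assumptions for illustration only; real cluster software sends heartbeats over the interconnect (or to a quorum disk) and does considerably more before declaring a node dead.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (hypothetical sketch)."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record(self, node: str, now: float) -> None:
        # Called whenever a heartbeat from a node arrives.
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> list:
        # A node is presumed failed once its last heartbeat is older than the timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=3.0)
    monitor.record("Node A", now=0.0)
    monitor.record("Node B", now=0.0)
    monitor.record("Node A", now=4.0)      # Node B has gone quiet
    print(monitor.failed_nodes(now=5.0))
```

Timestamps are passed in explicitly rather than read from the system clock so the behavior is easy to follow and to test.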

Failover Cluster
  One system is a standby system for another
  Only one system doing work at a time
  Pseudo-Shared Disk
  Limited scalability in active/passive mode

Failover Clustering
  Fault tolerant systems; highly available
  Basic failover clusters don’t scale beyond two nodes
  [Diagram: Users, two Servers, one Database]

Active/Passive vs. Active/Active
  Both are failover only
  Active/Passive
    One node is active
    The other is passive until failover
  Active/Active
    Still uses active/passive technology
    2 separate databases
    One is active on node A and passive on node B
    The second database is active on node B and passive on node A
    Separate applications and user connections to each of the different databases

Active/Passive
  Node A is active
  Node B is passive until/unless Node A fails
  Only one Oracle license is required
  [Diagram: Node A and Node B]

Active/Passive
  If Node A fails …
  [Diagram: Node A (marked X) and Node B]

Active/Passive
  Node B becomes active
  Node A is dead (definitely passive!) until repaired and then “failed back” if necessary
  [Diagram: Node A (marked X) and Node B]
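
The three Active/Passive slides above describe a small state machine: normal operation, failover, and an optional failback. A minimal sketch follows; the class and method names are hypothetical and no real cluster manager is being modeled.

```python
class ActivePassivePair:
    """Two-node active/passive cluster (hypothetical sketch)."""

    def __init__(self, primary: str = "Node A", standby: str = "Node B"):
        self.primary = primary
        self.standby = standby
        self.active = primary      # only one node does work at a time
        self.failed = set()

    def fail(self, node: str) -> None:
        self.failed.add(node)
        if node != self.active:
            return                 # the passive node failed; nothing to move
        other = self.standby if node == self.primary else self.primary
        if other in self.failed:
            raise RuntimeError("no surviving node to fail over to")
        self.active = other        # failover

    def repair(self, node: str) -> None:
        # A repaired node comes back as passive; it does not take over by itself.
        self.failed.discard(node)

    def fail_back(self) -> None:
        # Optional step: return the workload to the original primary.
        if self.primary in self.failed:
            raise RuntimeError("primary has not been repaired yet")
        self.active = self.primary
```

Keeping `repair` and `fail_back` separate mirrors the slide wording: the repaired node is failed back only "if necessary".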

Active/Active
  Application Group A and User Group A are active on Node A
  Application Group B and User Group B are active on Node B
  Each node serves as failover for the other
  2 separate databases; the two nodes do not access the same data at the same time
  Oracle license required on each node
  [Diagram: Node A runs Application A for User Group A and is the passive failover for B; Node B runs Application B for User Group B and is the passive failover for A]
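
One way to picture Active/Active is as two active/passive pairs laid over the same two nodes. The sketch below computes where each database runs for a given set of failed nodes; the function name and the database labels are assumptions for illustration.

```python
def active_active_placement(failed_nodes=()) -> dict:
    """Return the node each database runs on (hypothetical two-node model)."""
    home = {"Database A": "Node A", "Database B": "Node B"}
    partner = {"Node A": "Node B", "Node B": "Node A"}
    failed = set(failed_nodes)

    placement = {}
    for database, node in home.items():
        if node not in failed:
            placement[database] = node           # running on its own node
        elif partner[node] not in failed:
            placement[database] = partner[node]  # failed over to the other node
        else:
            placement[database] = None           # both nodes are down
    return placement


if __name__ == "__main__":
    print(active_active_placement())
    print(active_active_placement({"Node A"}))
```

After a failure, one node carries both databases, which is why each node needs enough capacity for the combined load.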

Switchover vs. Failover
  Many cluster systems utilize the concept of Service Groups
  Service Groups allow granular control of individual software packages (e.g. individual Oracle instances)
  An individual group can be manually moved to another server without affecting other service groups – a “switchover” versus a “failover”
  Adds greater management flexibility
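
The difference between the two operations can be sketched with a mapping from service group to server. The group names and both function names are hypothetical; the sketch shows only that a switchover moves one chosen group, while a failover moves everything off a failed server.

```python
def switchover(placement: dict, group: str, target: str) -> dict:
    """Manually move one service group; all other groups stay where they are."""
    if group not in placement:
        raise KeyError(group)
    moved = dict(placement)
    moved[group] = target
    return moved


def failover(placement: dict, failed_node: str, target: str) -> dict:
    """Move every service group that was running on the failed node."""
    return {
        group: (target if node == failed_node else node)
        for group, node in placement.items()
    }


if __name__ == "__main__":
    # Hypothetical service groups, each wrapping one Oracle instance.
    groups = {"oracle_hr": "Node A", "oracle_sales": "Node A", "oracle_web": "Node B"}
    print(switchover(groups, "oracle_hr", "Node B"))
    print(failover(groups, "Node A", "Node B"))
```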

N-to-1 Failover Configuration
  Node D is a dedicated failover node for failures on Node A, B, and C
  Extends number of active nodes
  A problem is that once the failed node is available, the Service Groups on Node D (failover node) must fail back to the original server to restore High Availability
  [Diagram: Nodes A, B, and C run Applications A through I for User Groups A through I; Node C (marked X) has failed, Node D takes over Applications G, H, and I (Failover) and later returns them (Failback)]
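
The failback requirement of N-to-1 can be sketched as follows. The class and method names are hypothetical, and the layout used in the test (three applications per node) is an assumption read off the diagram labels.

```python
class NToOneCluster:
    """Toy model of N-to-1 failover: one dedicated failover node for all others."""

    def __init__(self, placement: dict, failover_node: str):
        self.home = dict(placement)        # original server of each service group
        self.placement = dict(placement)   # where each service group runs right now
        self.failover_node = failover_node

    def fail(self, node: str) -> None:
        # Every service group on the failed node moves to the dedicated failover node.
        for group, current in list(self.placement.items()):
            if current == node:
                self.placement[group] = self.failover_node

    def fail_back(self, node: str) -> None:
        # Once the node is repaired, its groups must return to free the failover node.
        for group, original in self.home.items():
            if original == node:
                self.placement[group] = node

    def is_protected(self) -> bool:
        # High availability holds only while the failover node is empty.
        return self.failover_node not in self.placement.values()
```

While Node D is hosting failed-over groups, `is_protected()` is false: a second failure would have nowhere to go until the failback completes.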

N + 1 Failover Configuration
  Node D is a dedicated failover node for failures on Node A, B, and C
  Extends number of active nodes
  Once Node C is restored, it becomes the failover node, leaving Node D in production
  [Diagram: Nodes A, B, and C run Applications A through I for User Groups A through I; Node C (marked X) has failed and Node D takes over Applications G, H, and I (Failover)]
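
In N + 1 the spare role moves instead of the workload moving back. A minimal sketch, with hypothetical class and method names:

```python
class NPlusOneCluster:
    """Toy model of N + 1 failover: the spare role rotates to the repaired node."""

    def __init__(self, placement: dict, spare: str):
        self.placement = dict(placement)   # service group -> node it runs on
        self.spare = spare                 # None while no spare is available

    def fail(self, node: str) -> None:
        if node == self.spare:
            self.spare = None              # the idle spare itself was lost
            return
        if self.spare is None:
            raise RuntimeError("no spare node available for failover")
        for group, current in list(self.placement.items()):
            if current == node:
                self.placement[group] = self.spare
        self.spare = None                  # the former spare is now in production

    def restore(self, node: str) -> None:
        # The repaired node does not take its groups back; it becomes the new spare.
        if self.spare is None:
            self.spare = node
```

Compared with N-to-1, no failback is needed to regain protection, so the service groups are interrupted once per failure instead of twice.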

N-to-N Failover Configuration
  Node C fails, and its Service Groups are redistributed across surviving nodes
  Optimal solution for > 2 nodes
  Implemented on third party failover clusters and Oracle RAC
  [Diagram: Nodes A, B, C, and D run Applications A through L for User Groups A through L; Node C (marked X) has failed, and Applications G, H, and I fail over to the surviving nodes]
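
Redistribution in an N-to-N cluster can be sketched as below. The least-loaded-node rule is an assumption chosen for illustration (the slides do not say how targets are picked), and the function name is hypothetical.

```python
from collections import Counter


def redistribute(placement: dict, failed_node: str, nodes: list) -> dict:
    """Spread the failed node's service groups over the surviving nodes."""
    survivors = [n for n in nodes if n != failed_node]
    if not survivors:
        raise RuntimeError("no surviving nodes")

    # Groups on healthy nodes stay where they are.
    result = {g: n for g, n in placement.items() if n != failed_node}
    load = Counter(result.values())

    # Assumed policy: send each orphaned group to the least-loaded survivor,
    # breaking ties by node name so the outcome is deterministic.
    orphans = sorted(g for g, n in placement.items() if n == failed_node)
    for group in orphans:
        target = min(survivors, key=lambda n: (load[n], n))
        result[group] = target
        load[target] += 1
    return result


if __name__ == "__main__":
    nodes = ["Node A", "Node B", "Node C", "Node D"]
    # Assumed layout: three applications per node, A through L.
    placement = {f"Application {c}": nodes[i // 3] for i, c in enumerate("ABCDEFGHIJKL")}
    after = redistribute(placement, "Node C", nodes)
    for app in ("Application G", "Application H", "Application I"):
        print(app, "->", after[app])
```

Because no node sits idle as a dedicated spare, every node must keep some headroom for the groups it may inherit.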

Third Party Clusters
  Support for extended cluster nodes – up to 32 nodes for vendor Clustering
  Supports N + 1 and N-to-N failover clustering
  Integrated with hardware and/or software replication for long distance “clusters”

Clustering Solutions from Oracle
  Oracle Failsafe
  Oracle Data Guard
  Advanced Replication
  Shared Nothing Cluster
  Oracle Parallel Server
  Real Application Clusters (RAC)

Failsafe
  MS Clustering enabled
  Two servers, one disk subsystem
  Switches in the event of a hardware failure
  Requires recovery

Standby Database
  Copy of database (usually remote)
  Kept up to date with Archive Logs
  Oracle 8i feature
  Oracle 9i-10g version of a standby database is Data Guard

Oracle Data Guard
  Mirrored Server
  Physical Standby
    Archive Logs are applied to the remote database
    Failover to the standby occurs in the event of a failure
  Logical Standby
    LogMiner technology is used to generate SQL
    Standby Database can also be used for read-only reporting
  Advantages
    Safe from user failure
    Can be in different location
    No recovery required
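
The idea of keeping a physical standby current by applying archive logs in order can be pictured with a toy model. Everything below is hypothetical: it does not use or imitate any Oracle API, and the sequence numbers and change strings are made up for the sketch.

```python
class Standby:
    """Toy standby that applies shipped logs strictly in sequence order."""

    def __init__(self):
        self.applied_through = 0   # highest log sequence applied so far
        self.received = {}         # sequence number -> list of changes

    def receive(self, sequence: int, changes: list) -> None:
        # Logs may arrive out of order; they are held until they can be applied.
        self.received[sequence] = changes

    def apply_available(self) -> list:
        # Apply logs one by one, stopping at the first gap in the sequence.
        applied = []
        while self.applied_through + 1 in self.received:
            self.applied_through += 1
            applied.extend(self.received.pop(self.applied_through))
        return applied


if __name__ == "__main__":
    standby = Standby()
    standby.receive(1, ["insert row 1"])
    standby.receive(3, ["insert row 3"])
    print(standby.apply_available())   # log 3 waits because log 2 is missing
    standby.receive(2, ["insert row 2"])
    print(standby.apply_available())
```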

Advanced Replication
  Uses updatable snapshots
  Replicates to another system
  Systems stay in sync

Oracle Parallel Server
  Shared disk cluster product
  Loosely coupled
  Scalable performance
  No downtime in the event of a system failure
  Replaced by RAC in 9i

True Shared Disk Server (RAC)
  ONE database
  Separate multiple instances (processes & memory)
  All nodes can access data simultaneously
  Shared Everything Cluster
  Transparent Application Failover
  Oracle license required on each node
  Highest level of cluster functionality
  [Diagram: Node A and Node B]

Factors to Consider for Clustering
  Which do you need most?
    High Availability – Failover Clusters, Synchronous Replication, Data Guard
    Performance scalability – Active/Active failover clusters, N-to-N failover clusters
    Both – Oracle RAC
  Administration complexity
    Failover clusters – relatively low
    Oracle RAC – relatively high
      Substantially less complex for 10g RAC than 9i RAC
  Local or long distance?
    Local – Failover, RAC
    Remote – Federated database, Replication, Standby database/Data Guard
  Oracle license costs
    Active/Passive failover clusters – active nodes only
    Active/Active failover clusters, RAC – per node
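
The license rule on this slide can be written as a small function. It encodes only what the slide states; the function name and type labels are hypothetical, and actual licensing terms should be checked against the current Oracle agreement rather than this sketch.

```python
def licensed_nodes(cluster_type: str, total_nodes: int, active_nodes: int) -> int:
    """Number of nodes needing a license, following the rule stated on the slide."""
    kind = cluster_type.lower()
    if kind == "active/passive":
        return active_nodes        # active nodes only
    if kind in ("active/active", "rac"):
        return total_nodes         # per node
    raise ValueError(f"unknown cluster type: {cluster_type}")


if __name__ == "__main__":
    print(licensed_nodes("active/passive", total_nodes=2, active_nodes=1))
    print(licensed_nodes("active/active", total_nodes=2, active_nodes=2))
    print(licensed_nodes("RAC", total_nodes=8, active_nodes=8))
```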

Review
  What type of commit is required for a Federated (shared nothing) cluster?
  What is the difference in how the database is kept up-to-date in Oracle Data Guard vs. Advanced Replication?
  What is the difference between N-to-1 failover clusters and N + 1 failover clusters?
  How many databases are there in an 8 node Oracle RAC cluster?

Summary
  Types of clusters:
    Shared Nothing Clusters
      Federated databases
      Replication
    Shared Disk Clusters
      Failover
      Oracle RAC
    Failover Clusters
      Active/Passive
      Active/Active
      N-to-1
      N + 1
      N-to-N
    Shared Everything Clusters
      Oracle RAC
  Choosing a cluster type involves trade-offs in functionality, costs, and administration complexity