Highly Available Database Systems
Seminar, winter semester 2005/2006: Dependable Adaptive Information Systems (DAIS)
Technische Universität Kaiserslautern
Ou Yi
April 20, 2023 Highly Available Database Systems – Ou Yi Slide 2
Agenda
  Introduction of Basic Concepts
  Three Types of Coupling
  DDBS & PDBS
  Clustering
  HA in Practice
Availability Concept
Why?
Availability
  “the fraction of the offered load that is processed with acceptable response times”
  A = MTTF / (MTTF + MTTR)
The Number of Nines
  System Type              Availability   Unavailability (min/year)
  Well-managed             99.9 %         526
  Fault-tolerant           99.99 %        53
  High-availability        99.999 %       5
  Very-high-availability   99.9999 %      0.5
  Ultra-availability       99.99999 %     0.05
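As a worked example of the formula and the table above, the following Python sketch computes availability from MTTF and MTTR and converts an availability figure into minutes of downtime per year. The function names and the 365-day year are assumptions made for illustration; they do not come from the slides.

```python
# Assumption: a year has 365 days (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """A = MTTF / (MTTF + MTTR); both arguments use the same time unit."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_minutes_per_year(a: float) -> float:
    """Expected unavailability in minutes per year for availability a."""
    return (1.0 - a) * MINUTES_PER_YEAR


if __name__ == "__main__":
    classes = [
        ("Well-managed", 0.999),
        ("Fault-tolerant", 0.9999),
        ("High-availability", 0.99999),
    ]
    for label, a in classes:
        print(f"{label}: {downtime_minutes_per_year(a):.1f} min/year")
```

Rounded to whole minutes, the three values agree with the first three rows of the table (526, 53 and 5 minutes per year).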
HA and Multi-Computer System
High Availability
  A → 1, downtime → 0
  causes of downtime
  trouble-free components? redundancy!
Multi-Computer System
  higher availability than single-computer system
  higher performance
  communication (network or shared HW)
  tight coupling, close coupling, loose coupling
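Why redundancy raises availability can be illustrated with a short calculation. The sketch below assumes that component failures are statistically independent and that a single working replica is enough to provide the service, which is an idealization; the function names are hypothetical.

```python
def parallel_availability(a: float, n: int) -> float:
    """Availability of n redundant components when any one of them suffices.

    Assumption: failures are independent, so all n are down with
    probability (1 - a) ** n.
    """
    return 1.0 - (1.0 - a) ** n


def series_availability(*parts: float) -> float:
    """Availability of a chain in which every component is required."""
    result = 1.0
    for a in parts:
        result *= a
    return result
```

Under these assumptions two redundant components of 99 % availability each give 99.99 % together, while two such components that are both required give only about 98 %.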
Tight Coupling
Characteristics
  shared main memory
  single copy of software
Multiprocessor and SMP
Pro & Contra
  + computing power
  + communication
  − availability
  − extensibility (max 16)
Tight Coupling (2)
Shared Everything (Shared Memory)
  a DBMS running on a multiprocessor
  multiprocessing with support of OS
  used as node architecture
tight coupling
Loose Coupling
Characteristics
  interconnected through network
  independence
Pro & Contra
  + error isolation
  + extensibility (unlimited?)
  − communication
Loose Coupling (2)
Shared Nothing
  independent computers (or nodes)
  database: physically partitioned, logically unified
  each node runs a copy of the DBMS, which has direct access only to its own partition
loose coupling
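The shared-nothing idea of a physically partitioned but logically unified database can be sketched as follows. The class names and the CRC32-modulo partitioning rule are assumptions chosen for illustration; the slides do not prescribe a partitioning scheme.

```python
import zlib


class Node:
    """One shared-nothing node: it can touch only its own partition."""

    def __init__(self, node_id: int):
        self.node_id = node_id
        self.partition = {}

    def put_local(self, key, value):
        self.partition[key] = value

    def get_local(self, key):
        return self.partition.get(key)


class SharedNothingDB:
    """Logically unified view over the physically partitioned data."""

    def __init__(self, node_count: int):
        self.nodes = [Node(i) for i in range(node_count)]

    def owner(self, key: str) -> Node:
        # Assumed rule: hash the key and take it modulo the node count.
        return self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]

    def put(self, key: str, value):
        self.owner(key).put_local(key, value)

    def get(self, key: str):
        return self.owner(key).get_local(key)
```

A caller uses only `put` and `get` and never sees which node holds a row; every request is routed to the single node that owns the key.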
Loose Coupling (3)
Shared Disk
  external storage is shared, “all to all”
  each node runs a DBMS, independent
  a TA (transaction) can be completed locally
  no need for
    distributed query plan
    distributed commit
  required
    concurrency control
    buffer coherency control
Close Coupling
Characteristics
  coupling component
  private main memory and software
Pro & Contra
  + communication
  + error isolation
  + extensibility (max 32)
  − proprietary design
Shared Data
  multi-computer DBS
  hybrid of close coupling and shared disk
Close Coupling (2)
Shared Data (cont.)
  Parallel Sysplex Cluster
    shared disk & close coupling
  Coupling Facility
    specialized SMP with large global main memory
    useful for global tasks
  Sysplex Timer
    global unique time
    e.g. global log file
Shared Disk vs. Shared Nothing
Node Failure
  SD: all the data still accessible to surviving nodes
  SN: data of the failed node cannot be easily accessed
Increased Workload
  SD: new nodes can be easily added and participate in query processing
  SN: new nodes have no direct access to data, reallocation is expensive!
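To give a feeling for why reallocation in a shared-nothing system is expensive, the sketch below counts how many rows change their owner when a node is added. It assumes a simple hash-modulo placement, which is only one possible scheme and is not taken from the slides; other partitioning schemes move less data. All names are hypothetical.

```python
import zlib


def owner(key: str, node_count: int) -> int:
    """Assumed placement rule: CRC32 of the key modulo the node count."""
    return zlib.crc32(key.encode()) % node_count


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose owning node changes after resizing."""
    moved = sum(1 for k in keys if owner(k, old_nodes) != owner(k, new_nodes))
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"row-{i}" for i in range(10_000)]
    print(f"moved when growing 4 -> 5 nodes: {moved_fraction(keys, 4, 5):.0%}")
```

With a uniform hash, a key stays in place only when its hash value modulo 4 equals its hash value modulo 5, which holds for 4 out of every 20 hash values, so roughly 80 % of the rows would have to be shipped to another node. In a shared-disk system no rows move at all, because the new node can read the shared storage directly.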
DDBS & PDBS
DDBS
  distributed database
    a collection of multiple, logically unified databases distributed over a computer network
  distributed DBMS
    manages the distributed DB
    maintains distribution transparency
    multiple DBMSs cooperating across sites (SN!)
PDBS
  “locally distributed database systems of the types shared nothing, shared disk, or shared everything”
DDBS & PDBS (2)
Replicated Data
  across nodes, improve data availability
  improve performance, consistency?
  ideally: one-copy equivalence
  ROWA (Read-One/Write-All) algorithm
    + one-copy equivalence
    − response time
  strict consistency is expensive!
  snapshot consistency
    snapshots, materialized views
    consistent as of some point in time in the past
    acceptable for applications: analysis, reporting, etc.
    periodic refreshing or triggered refreshing
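The ROWA trade-off named above can be made concrete with a small in-memory sketch. The classes `Replica` and `RowaStore` are hypothetical and ignore transactions, locking and networking; they only show that a read needs one replica while a write needs all of them.

```python
class Replica:
    """One copy of the data; `up` simulates whether the node is reachable."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.up = True


class RowaStore:
    def __init__(self, replicas):
        self.replicas = list(replicas)

    def write(self, key, value):
        # Write-All: the update is applied to every replica or not at all,
        # so a single unreachable replica blocks the write.
        if not all(r.up for r in self.replicas):
            raise RuntimeError("write rejected: a replica is unavailable")
        for r in self.replicas:
            r.data[key] = value

    def read(self, key):
        # Read-One: any reachable replica holds the current value.
        for r in self.replicas:
            if r.up:
                return r.data.get(key)
        raise RuntimeError("read rejected: no replica is available")
```

Because every write touches all copies, each reachable replica always returns the same value (one-copy equivalence), but write response time is bound to the slowest replica and writes stop as soon as any replica is down.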
DDBS & PDBS (3)
Disaster Recovery
  a disaster destroys the complete computer center (site failure)
  keeping backup medium off-site
    time consuming
      server to be set up
      whole database to be retrieved
    loss of TAs
  online replication
    identical configuration
    highly up-to-date backup database
    quick takeover possible
    variations
DDBS & PDBS (4)
Online Replication (cont.)
  1-safe
    asynchronous log transfer
    + response time
    − loss of TAs
  very-safe
    distributed two-phase commit
    + no loss of TAs
    − availability
  2-safe
    backup involved in commit process in normal case
    primary independent from backup, can commit unilaterally
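The difference between the three variants is easiest to see at commit time. The following sketch models a primary and a backup as plain in-memory logs; the class names, the `reachable` flag and the `ship_log` method are assumptions for illustration and do not correspond to any product API.

```python
from enum import Enum


class Mode(Enum):
    ONE_SAFE = "1-safe"
    TWO_SAFE = "2-safe"
    VERY_SAFE = "very-safe"


class Backup:
    def __init__(self):
        self.log = []
        self.reachable = True


class Primary:
    def __init__(self, backup: Backup, mode: Mode):
        self.backup = backup
        self.mode = mode
        self.log = []     # records committed at the primary
        self.unsent = []  # committed records not yet at the backup

    def commit(self, record) -> bool:
        """Return True if the transaction commits, False if it must abort."""
        if self.mode is Mode.ONE_SAFE:
            # Commit locally; the log record is shipped later.
            self.log.append(record)
            self.unsent.append(record)
            return True
        if self.backup.reachable:
            # 2-safe and very-safe: backup takes part in the commit.
            self.backup.log.append(record)
            self.log.append(record)
            return True
        if self.mode is Mode.TWO_SAFE:
            # Backup is down: the primary commits unilaterally.
            self.log.append(record)
            self.unsent.append(record)
            return True
        return False  # very-safe: no commit without the backup

    def ship_log(self):
        """Asynchronous log transfer, modelled as an explicit call."""
        if self.backup.reachable:
            self.backup.log.extend(self.unsent)
            self.unsent.clear()
```

Records still in `unsent` are exactly the transactions that would be lost if the primary site were destroyed at that moment, which is the risk 1-safe accepts in exchange for response time; very-safe avoids it at the price of refusing commits whenever the backup is unreachable.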
Clustering
Definition
  interconnecting a group of computers so that they work together closely and can be viewed as if they were a single computer
  special components for load distribution and failure detection
  architecture: shared nothing, shared disk, shared data
Cluster At Different Levels
  OS: MS Windows NT/2000/2003, IBM AIX
  middleware: C-JDBC
  application: Oracle RAC (Real Application Clusters)
Clustering
Purpose
  high-availability cluster
    fail-over & switch-over
    backup is idle
  load balancing cluster
    all nodes are active
Private Network
  used to detect node failure
  heartbeat: status info sent by the nodes to each other at regular intervals
  network partition problem
  recommended to be redundant, fast and reliable
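Heartbeat-based failure detection can be sketched in a few lines. The class `HeartbeatMonitor` and its timeout parameter are hypothetical; timestamps are passed in explicitly so that the example stays deterministic instead of reading a real clock.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per node and flags nodes that went silent."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def record(self, node: str, now: float):
        """Called whenever a heartbeat from `node` arrives at time `now`."""
        self.last_seen[node] = now

    def suspected(self, now: float):
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=3.0)
    monitor.record("node-a", now=0.0)
    monitor.record("node-b", now=0.0)
    monitor.record("node-a", now=5.0)  # node-b has stopped sending
    print(monitor.suspected(now=6.0))
```

A missing heartbeat only shows that no message arrived. The monitor cannot tell a crashed node from a broken link, which is the network partition problem and the reason the private network should be redundant.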
HA in Practice
Mainframe (“big iron”)
  used by large companies
  mission critical applications and bulk data processing
    financial TA processing
    airline booking
    railway systems
  RAS (reliability, availability, serviceability), years without interruption
IBM zSeries family
  z9-109: the most powerful, with up to 54 configurable PUs (processors) and many HA features
[Figure: z9-109 external view]
HA in Practice - zSeries
Memory-coherent SMP
  typical problem of SMP: NUMA
  single large L2, uniform memory access, cache coherency
HA in Practice - zSeries
PU Sparing & Instruction Retry
  two spare PUs per server, protection against PU failure
  roles can be reassigned dynamically and transparently
  each instruction is executed two times
  detect soft errors
Modular Multi-book Design
  multiple “books” per server
  a book hosts PUs (12 / 16), memory and I/O connectors
  Concurrent Book Add (CBA)
  Enhanced Book Availability (EBA)
HA in Practice - zSeries
Virtual Machine
  protected and isolated copy of the physical machine
  private address space, independence
  user illusion: having a dedicated physical machine
Hypervisor
  software which emulates the underlying physical machine’s architecture very efficiently (machine code)
  advantages
    error isolation
    overcoming hardware boundaries
PR/SM of zSeries
  a pool of resources
  supports up to 60 LPARs
HA in Practice - zSeries
LPAR (Logical Partition)
  CPU resources
  zone (part of the physical main memory)
  I/O resources (statically assigned)
HA in Practice - DBMS
System Failure
  database crash
  recovery & clustering
Data Failure
  storage failure
    RAID and (online) backups
    DB2 backup db sample online to /dev3/backup
  human error
    Flashback technology
    FLASHBACK TABLE account TO BEFORE DROP
    FLASHBACK DATABASE TO TIMESTAMP (...)
Site Failure
  cross-site online replication: Oracle Data Guard
  ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE {PROTECTION | AVAILABILITY | PERFORMANCE}
HA in Practice - MAA
Designing HA Architecture
  Oracle MAA (Maximum Availability Architecture)
Summary
  HA DBSs are multi-computer DBSs
  Basic architectures are the three types of coupling
  Error isolation and redundancy are effective and widely adopted approaches
  A promising market