
Highly Available Database Systems

Seminar in winter semester (WS) 2005/2006: Dependable Adaptive Information Systems (DAIS)

Technische Universität Kaiserslautern

Ou Yi


April 20, 2023 Highly Available Database Systems – Ou Yi Slide 2

Agenda

Introduction of Basic Concepts
Three Types of Coupling
DDBS & PDBS
Clustering
HA in Practice


Availability Concept

Why?
Availability
  “the fraction of the offered load that is processed with acceptable response times”
  A = MTTF / (MTTF + MTTR)
  (MTTF: mean time to failure, MTTR: mean time to repair)
The Number of Nines

  System Type             Availability   Unavailability (min/year)
  Well-managed            99.9 %         526
  Fault-tolerant          99.99 %        53
  High-availability       99.999 %       5
  Very-high-availability  99.9999 %      0.5
  Ultra-availability      99.99999 %     0.05
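The formula and the table can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative, and a year is assumed to have 365 days (525,600 minutes).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year, leap years ignored


def availability(mttf: float, mttr: float) -> float:
    """A = MTTF / (MTTF + MTTR); both arguments in the same time unit."""
    return mttf / (mttf + mttr)


def downtime_minutes_per_year(a: float) -> float:
    """Expected unavailability per year for a given availability a."""
    return (1.0 - a) * MINUTES_PER_YEAR


# Walk through three to seven nines, as in the table.
for nines in range(3, 8):
    a = 1.0 - 10.0 ** -nines
    print(f"{a * 100:.5f} %  {downtime_minutes_per_year(a):10.4f} min/year")
```

Rounding the printed downtimes gives the values listed in the table.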


HA and Multi-Computer System

High Availability
  A → 1
  downtime → 0
  causes of downtime
  trouble-free components? redundancy!
Multi-Computer System
  higher availability than single-computer system
  higher performance
  communication (network or shared HW)
    tight coupling, close coupling, loose coupling
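Why redundancy pushes A toward 1 can be illustrated numerically. The sketch below assumes that components fail independently, which the slides do not state and which real systems only approximate; the function names are illustrative.

```python
def parallel_availability(a: float, n: int) -> float:
    """Availability of n redundant components when one survivor suffices.

    Assumption: failures are independent, so all n are down with
    probability (1 - a) ** n.
    """
    return 1.0 - (1.0 - a) ** n


def series_availability(*parts: float) -> float:
    """Availability of a chain in which every component is required."""
    result = 1.0
    for a in parts:
        result *= a
    return result


# Two redundant 99 % nodes versus two 99 % nodes that are both required.
print(parallel_availability(0.99, 2))
print(series_availability(0.99, 0.99))
```

Under this assumption, duplicating a 99 % component yields roughly 99.99 %, while chaining two such components without redundancy lowers availability to about 98 %.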


Tight Coupling

Characteristics
  shared main memory
  single copy of software
Multiprocessor and SMP
Pro & Contra
  + computing power
  + communication
  − availability
  − extensibility (max 16)


Tight Coupling (2)

Shared Everything (Shared Memory)
  a DBMS running on a multiprocessor
  multiprocessing with support of OS
  used as node architecture

[Figure: tight coupling]


Loose Coupling

Characteristics
  interconnected through network
  independence
Pro & Contra
  + error isolation
  + extensibility (unlimited?)
  − communication


Loose Coupling (2)

Shared Nothing
  independent computers (or nodes)
  database: physically partitioned, logically unified
  each node runs a copy of DBMS, which has direct access only to its own partition

[Figure: loose coupling]
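The idea of "physically partitioned, logically unified" can be sketched as follows. Hash partitioning is an assumption made for illustration; the slides do not say how data is assigned to nodes, and the class names are hypothetical.

```python
import zlib


class Node:
    """One shared-nothing node: it sees only its own partition."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.partition: dict[str, object] = {}


class SharedNothingCluster:
    """Presents the partitions as one logical database."""

    def __init__(self, node_names: list[str]) -> None:
        self.nodes = [Node(n) for n in node_names]

    def _owner(self, key: str) -> Node:
        # Assumption: hash partitioning; crc32 is stable across runs.
        return self.nodes[zlib.crc32(key.encode("utf-8")) % len(self.nodes)]

    def put(self, key: str, value: object) -> None:
        self._owner(key).partition[key] = value

    def get(self, key: str) -> object:
        return self._owner(key).partition.get(key)


cluster = SharedNothingCluster(["n1", "n2", "n3"])
cluster.put("account-42", 100)
print(cluster.get("account-42"))
```

Each key lives on exactly one node, which is why the data of a failed node is hard to reach and why adding a node requires moving data.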


Loose Coupling (3)

Shared Disk
  external storage is shared, “all to all”
  each node runs a DBMS, independent
  a TA (transaction) can be completed locally
  no need for
    distributed query plan
    distributed commit
  required
    concurrency control
    buffer coherency control
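Why shared disk needs concurrency control and buffer coherency control can be seen in a toy model. This is a deliberately simplified sketch: writes take a global page lock and invalidate the page in all other buffers, while reads take no lock. Real systems use more refined protocols, and all names here are hypothetical.

```python
class SharedDisk:
    """External storage reachable from every node."""

    def __init__(self) -> None:
        self.pages: dict[str, str] = {}


class GlobalLockManager:
    """Grants one exclusive owner per page (writes only, in this sketch)."""

    def __init__(self) -> None:
        self.owners: dict[str, str] = {}

    def acquire(self, page_id: str, node_name: str) -> bool:
        return self.owners.setdefault(page_id, node_name) == node_name

    def release(self, page_id: str, node_name: str) -> None:
        if self.owners.get(page_id) == node_name:
            del self.owners[page_id]


class Node:
    def __init__(self, name, disk, locks, peers) -> None:
        self.name, self.disk, self.locks, self.peers = name, disk, locks, peers
        self.buffer: dict[str, str] = {}  # private buffer pool
        peers.append(self)

    def read(self, page_id: str):
        if page_id not in self.buffer:  # buffer miss: fetch from shared disk
            self.buffer[page_id] = self.disk.pages.get(page_id)
        return self.buffer[page_id]

    def write(self, page_id: str, value: str) -> None:
        if not self.locks.acquire(page_id, self.name):
            raise RuntimeError(f"page {page_id} is locked by another node")
        try:
            self.disk.pages[page_id] = value
            self.buffer[page_id] = value
            for peer in self.peers:  # coherency: drop stale copies elsewhere
                if peer is not self:
                    peer.buffer.pop(page_id, None)
        finally:
            self.locks.release(page_id, self.name)


disk, locks, peers = SharedDisk(), GlobalLockManager(), []
n1, n2 = Node("n1", disk, locks, peers), Node("n2", disk, locks, peers)
n1.write("p1", "v1")
print(n2.read("p1"))
n1.write("p1", "v2")
print(n2.read("p1"))
```

Without the invalidation loop, the second read on n2 would return the stale copy from its own buffer.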


Close Coupling

Characteristics
  coupling component
  private main memory and software
Pro & Contra
  + communication
  + error isolation
  + extensibility (max 32)
  − proprietary design
Shared Data
  multi-computer DBS
  hybrid of close coupling and shared disk


Close Coupling (2)

Shared Data (cont.)
  Parallel Sysplex Cluster
    shared disk & close coupling
  Coupling Facility
    specialized SMP with large global main memory
    useful for global tasks
  Sysplex Timer
    global unique time
    e.g. global log file


Shared Disk vs. Shared Nothing

Node Failure
  SD: all the data is still accessible to surviving nodes
  SN: data of the failed node cannot be easily accessed
Increased Workload
  SD: new nodes can be easily added and participate in query processing
  SN: new nodes have no direct access to data, reallocation is expensive!


DDBS & PDBS

DDBS (distributed database system)
  distributed database
    a collection of multiple, logically unified databases distributed over a computer network
  distributed DBMS
    manages the distributed DB
    maintains distribution transparency
    multiple DBMSs cooperating across sites (SN!)
PDBS (parallel database system)
  “locally distributed database systems of the types shared nothing, shared disk, or shared everything”


DDBS & PDBS (2)

Replicated Data
  across nodes, improve data availability
  improve performance, consistency?
  ideally: one-copy equivalence
  ROWA (Read-One/Write-All) algorithm
    + one-copy equivalence
    − response time
  strict consistency is expensive!
  snapshot consistency
    snapshots, materialized views
    consistent as of some point in time in the past
    acceptable for applications: analysis, reporting, etc.
    periodic or triggered refreshing
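The ROWA trade-off can be made concrete with a small model: a read is served by any one reachable replica, while a write must reach all of them. This is a minimal in-memory sketch with hypothetical class names; it ignores transactions, locking and recovery.

```python
class ReplicaUnavailable(Exception):
    pass


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.data: dict[str, object] = {}
        self.up = True


class RowaStore:
    """Read one copy, write all copies."""

    def __init__(self, replicas: list[Replica]) -> None:
        self.replicas = list(replicas)

    def write(self, key: str, value: object) -> None:
        down = [r.name for r in self.replicas if not r.up]
        if down:  # a single unreachable copy blocks every write
            raise ReplicaUnavailable(f"cannot write, replicas down: {down}")
        for r in self.replicas:  # response time grows with the number of copies
            r.data[key] = value

    def read(self, key: str) -> object:
        for r in self.replicas:
            if r.up:  # any single copy is current, so one read suffices
                return r.data.get(key)
        raise ReplicaUnavailable("no replica reachable")


r1, r2 = Replica("r1"), Replica("r2")
store = RowaStore([r1, r2])
store.write("x", 1)
r1.up = False
print(store.read("x"))  # still served by r2
```

Reads survive the loss of a copy, but writes do not, which is one reason weaker schemes such as snapshot consistency are attractive for analysis and reporting.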


DDBS & PDBS (3)

Disaster Recovery
  a disaster destroys the complete computer center (site failure)
  keeping backup medium off-site
    time consuming
      server to be set up
      whole database to be retrieved
    loss of TAs
  online replication
    identical configuration
    highly up-to-date backup database
    quick takeover possible
    variations


DDBS & PDBS (4)

Online Replication (cont.)
  1-safe
    asynchronous log transfer
    + response time
    − loss of TAs
  very-safe
    distributed two-phase commit
    + no loss of TAs
    − availability
  2-safe
    backup involved in commit process in the normal case
    primary is independent of the backup and can commit unilaterally
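The difference between the three variants shows up in what the primary does at commit time. The sketch below is a simplification: the distributed two-phase commit is reduced to a single synchronous hand-over of the log record, and all class and method names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    ONE_SAFE = "1-safe"
    TWO_SAFE = "2-safe"
    VERY_SAFE = "very-safe"


class Backup:
    def __init__(self) -> None:
        self.log: list[str] = []
        self.up = True

    def receive(self, record: str) -> None:
        if not self.up:
            raise ConnectionError("backup unreachable")
        self.log.append(record)


class Primary:
    def __init__(self, backup: Backup, mode: Mode) -> None:
        self.backup, self.mode = backup, mode
        self.log: list[str] = []
        self.pending: list[str] = []  # records not yet at the backup

    def commit(self, record: str) -> bool:
        if self.mode is Mode.ONE_SAFE:
            self.log.append(record)      # commit locally at once
            self.pending.append(record)  # ship the log asynchronously
            return True
        try:
            self.backup.receive(record)  # backup takes part in the commit
        except ConnectionError:
            if self.mode is Mode.VERY_SAFE:
                return False             # no backup, no commit
            self.pending.append(record)  # 2-safe: commit unilaterally
        self.log.append(record)
        return True

    def ship_pending(self) -> None:
        """Deliver outstanding log records once the backup is reachable."""
        while self.pending:
            self.backup.receive(self.pending[0])
            self.pending.pop(0)


backup = Backup()
primary = Primary(backup, Mode.ONE_SAFE)
primary.commit("T1")
print(backup.log)  # T1 is committed but not yet at the backup
```

Records sitting in `pending` when the primary site is destroyed correspond to the lost TAs of 1-safe; very-safe avoids them at the price of refusing commits whenever the backup is down.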


Clustering

Definition
  interconnecting a group of computers so that they work together closely and can be viewed as if they were a single computer
  special components for load distribution and failure detection
  architecture
    shared nothing, shared disk, shared data
Clusters at Different Levels
  OS: MS Windows NT/2000/2003, IBM AIX
  middleware: C-JDBC
  application: Oracle RAC (Real Application Clusters)


Clustering

Purpose
  high-availability cluster
    fail-over & switch-over
    backup is idle
  load balancing cluster
    all nodes are active
Private Network
  used to detect node failure
    heartbeat: status info. sent by the nodes to each other at regular intervals
  network partition problem
  recommended to be redundant, fast and reliable
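Heartbeat-based failure detection can be sketched with a timeout per node. The class name and the timeout value are assumptions for illustration; timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Suspects a node when no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        """Record a status message received from `node` at time `now`."""
        self.last_seen[node] = now

    def suspected(self, now: float) -> list[str]:
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.heartbeat("node-a", now=0.0)
monitor.heartbeat("node-b", now=0.0)
monitor.heartbeat("node-a", now=4.0)  # node-b has gone silent
print(monitor.suspected(now=5.0))
```

A missing heartbeat does not reveal whether the node crashed or only the link failed. This is the network partition problem, and it is why the private network should itself be redundant.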


HA in Practice

Mainframe (“big iron”)
  used by large companies
  mission critical applications and bulk data processing
    financial TA processing
    airline booking
    railway systems
  RAS (reliability, availability, serviceability), years without interruption
IBM zSeries family
  z9-109
    the most powerful, with up to 54 configurable PUs (processors) and many HA features

[Figure: z9-109 external view]


HA in Practice - zSeries

Memory-coherent SMP
  typical problem of SMP: NUMA
  single large L2, uniform memory access, cache coherency


HA in Practice - zSeries

PU Sparing & Instruction Retry
  two spare PUs per server, protection against PU failure
  roles can be reassigned dynamically and transparently
  each instruction is executed twice to detect soft errors
Modular Multi-book Design
  multiple “books” per server
  a book hosts PUs (12 / 16), memory and I/O connectors
  Concurrent Book Add (CBA)
  Enhanced Book Availability (EBA)


HA in Practice - zSeries

Virtual Machine
  protected and isolated copy of the physical machine
  private address space, independence
  user illusion: having a dedicated physical machine
Hypervisor
  software which emulates the underlying physical machine’s architecture very efficiently (machine code)
  advantages
    error isolation
    overcoming hardware boundaries
PR/SM of zSeries
  a pool of resources
  supports up to 60 LPARs


HA in Practice - zSeries

LPAR (Logical Partition)
  CPU resources
  zone (part of the physical main memory)
  I/O resources (statically assigned)


HA in Practice - DBMS

System Failure
  database crash
  recovery & clustering
Data Failure
  storage failure
    RAID and (online) backups
      DB2 backup db sample online to /dev3/backup
  human error
    Flashback technology
      FLASHBACK TABLE account TO BEFORE DROP
      FLASHBACK DATABASE TO TIMESTAMP (...)
Site Failure
  cross-sites online replication: Oracle Data Guard
    ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE {PROTECTION | AVAILABILITY | PERFORMANCE}


HA in Practice - MAA

Designing HA Architecture
  Oracle MAA (Maximum Availability Architecture)


Summary

HA DBSs are multi-computer DBSs
Basic architectures are the three types of coupling
Error isolation and redundancy are effective and widely adopted approaches
A promising market


Thank You!

Technische Universität Kaiserslautern

Ou Yi

[email protected]