L2: DDBMS Architecture

Page 1: L2: DDBMS Architecture

DDBMS and Distribution Transparency

Architecture Alternatives

DDBMS Components

Page 2: L2: DDBMS Architecture

ANSI/SPARC Architecture

[Figure: users work with several external views (external schema), which map to a single conceptual view (conceptual schema), which in turn maps to the internal view (internal schema).]

Page 3: L2: DDBMS Architecture

The Classical DDBMS Architecture

[Figure: the site-independent schemas (global schema, fragmentation schema, allocation schema) sit above the per-site schemas. At each site (Site 1, Site 2, other sites) a local mapping schema connects to the local DBMS (DBMS 1, DBMS 2) and its local database (LOCAL DB 1, LOCAL DB 2).]

Page 4: L2: DDBMS Architecture

DDBMS Schemas

Global Schema: a set of global relations, defined as if the database were not distributed at all.

Fragmentation Schema: each global relation is split into “non-overlapping” (logical) fragments. This is a 1:n mapping from a relation $R$ to its fragments $R_i$.

Allocation Schema: a 1:1 or 1:n (redundant) mapping from fragments to sites. All fragments corresponding to the same relation $R$ at a site $j$ constitute the physical image $R^j$. A copy of fragment $R_i$ at site $j$ is denoted by $R_i^j$.

Local Mapping Schema: a mapping from physical images to physical objects, which are manipulated by the local DBMSs.
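The fragmentation and allocation schemas can be pictured as plain lookup tables. The sketch below is only an illustration: the dictionary names, the fragment names and the site names are assumptions made for the example, not the catalog format of any particular DDBMS.

```python
# Hypothetical catalog for one global relation R; all names are illustrative.

# Fragmentation schema: 1:n mapping from a global relation to its fragments.
fragmentation_schema = {"R": ["R1", "R2", "R3"]}

# Allocation schema: 1:1 or 1:n (redundant) mapping from fragments to sites.
allocation_schema = {
    "R1": ["site1", "site2"],
    "R2": ["site1", "site2", "site3"],
    "R3": ["site3"],
}


def physical_image(relation, site):
    """Return the fragments of `relation` that have a copy stored at `site`."""
    return [
        fragment
        for fragment in fragmentation_schema[relation]
        if site in allocation_schema[fragment]
    ]


def is_replicated(fragment):
    """A fragment is replicated when it is allocated to more than one site."""
    return len(allocation_schema[fragment]) > 1


for site in ("site1", "site2", "site3"):
    print(site, physical_image("R", site))
```

Keeping the two mappings separate mirrors the architecture: redundancy shows up only in the allocation table, never in the fragmentation table.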

Page 5: L2: DDBMS Architecture

Global Relations, Fragments and Physical Images

[Figure: the global relation $R$ is split into the fragments $R_1$, $R_2$, $R_3$. Copies of these fragments (labelled R11, R12, R21, R22, R32, R33 in the extracted text) are stored at Site 1, Site 2 and Site 3, where they form the physical images $R^1$, $R^2$ and $R^3$.]

Page 6: L2: DDBMS Architecture

Motivation for this Architecture

Separating the concept of data fragmentation from the concept of data allocation:
• fragmentation transparency
• location transparency

Explicit control of redundancy

Independence from local databases: allows local mapping transparency

Page 7: L2: DDBMS Architecture

Rules for Data Fragmentation

Completeness: all the data of the global relation must be mapped into the fragments.

Reconstruction: it must always be possible to reconstruct each global relation from its fragments.

Disjointness: it is convenient for fragments to be disjoint, so that the replication of data can be controlled explicitly at the allocation level.

Page 8: L2: DDBMS Architecture

Types of Data Fragmentation

Vertical Fragmentation
• Projection on relation (subset of attributes)
• Reconstruction by join
• Updates require no tuple migration

Horizontal Fragmentation
• Selection on relation (subset of tuples)
• Reconstruction by union
• Updates may require tuple migration

Mixed Fragmentation
• A fragment is a Select-Project query on relation.

Page 9: L2: DDBMS Architecture

Horizontal Fragmentation

Partitioning the tuples of a global relation into subsets.

Example:

Supplier(SNum, Name, City)

Horizontal fragmentation can be:

$Supplier_1 = \sigma_{City = \text{``HK''}}(Supplier)$

$Supplier_2 = \sigma_{City \neq \text{``HK''}}(Supplier)$

Reconstruction is possible: $Supplier = Supplier_1 \cup Supplier_2$

The set of predicates defining all the fragments must be complete and mutually exclusive.
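A small sketch of the Supplier example, assuming made-up tuples and modelling relations as Python sets. It checks completeness and mutual exclusion on the sample data only (it does not prove them for the predicates in general) and then reconstructs the relation by union.

```python
# Illustrative Supplier tuples: (SNum, Name, City). The data is made up.
supplier = {
    (1, "Ann", "HK"),
    (2, "Bob", "SF"),
    (3, "Cho", "HK"),
    (4, "Dee", "LA"),
}

# One selection predicate per fragment.
predicates = {
    "Supplier1": lambda t: t[2] == "HK",
    "Supplier2": lambda t: t[2] != "HK",
}

# Horizontal fragmentation: each fragment is a selection on the relation.
fragments = {
    name: {t for t in supplier if pred(t)} for name, pred in predicates.items()
}

# Completeness and mutual exclusion: every tuple satisfies exactly one predicate.
for t in supplier:
    matches = [name for name, pred in predicates.items() if pred(t)]
    assert len(matches) == 1, f"{t} matched {matches}"

# Reconstruction: the union of the fragments gives back the global relation.
reconstructed = set().union(*fragments.values())
assert reconstructed == supplier
```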

Page 10: L2: DDBMS Architecture

Derived Horizontal Fragmentation

The horizontal fragmentation of a relation is derived from the horizontal fragmentation of another relation.

Example:

Supply(SNum, PNum, DeptNum, Quan), where SNum is a supplier number.

$Supply_1 = Supply \ltimes_{SNum = SNum} Supplier_1$

$Supply_2 = Supply \ltimes_{SNum = SNum} Supplier_2$

The predicates defining the derived horizontal fragments are:

Supply.SNum = Supplier.SNum and Supplier.City = "HK"

Supply.SNum = Supplier.SNum and Supplier.City != "HK"

Here $\ltimes$ is the semi-join operation.
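The derived fragments can be computed with a semi-join, sketched here over made-up tuples. The helper name `semijoin_on_snum` and the sample data are assumptions made for the example.

```python
# Illustrative data; tuples are (SNum, Name, City) and (SNum, PNum, DeptNum, Quan).
supplier1 = {(1, "Ann", "HK"), (3, "Cho", "HK")}   # City = "HK"
supplier2 = {(2, "Bob", "SF"), (4, "Dee", "LA")}   # City != "HK"
supply = {
    (1, "P1", 10, 300),
    (2, "P1", 10, 100),
    (3, "P2", 20, 50),
    (4, "P3", 30, 75),
}


def semijoin_on_snum(left, right):
    """Return the tuples of `left` whose SNum appears in `right`.

    Only attributes of `left` are kept, which is what distinguishes a
    semi-join from a join.
    """
    right_snums = {t[0] for t in right}
    return {t for t in left if t[0] in right_snums}


supply1 = semijoin_on_snum(supply, supplier1)
supply2 = semijoin_on_snum(supply, supplier2)

# In this sample every Supply tuple refers to exactly one supplier, so the
# derived fragments are disjoint and their union reconstructs Supply.
assert supply1.isdisjoint(supply2)
assert supply1 | supply2 == supply
```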

Page 11: L2: DDBMS Architecture

Vertical Fragmentation

The vertical fragmentation of a global relation is the subdivision of its attributes into groups; fragments are obtained by projecting the global relation over each group.

Example:

EMP(ENum, Name, Sal, Tax, MNum, DNum)

A vertical fragmentation can be:

$EMP_1 = \pi_{ENum, Name, MNum, DNum}(EMP)$

$EMP_2 = \pi_{ENum, Sal, Tax}(EMP)$

Reconstruction:

$EMP = EMP_1 \bowtie_{ENum = ENum} EMP_2$
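As a rough illustration, the projection and the reconstructing join can be written over a list of dictionaries. The helpers `project` and `join_on` and the sample rows are assumptions made for this sketch.

```python
# Illustrative EMP tuples as dictionaries; the values are made up.
ATTRS = ("ENum", "Name", "Sal", "Tax", "MNum", "DNum")
emp = [
    dict(zip(ATTRS, (100, "Ann", 100, 10, 20, 3))),
    dict(zip(ATTRS, (101, "Bob", 120, 12, 20, 15))),
]


def project(relation, attributes):
    """Projection: keep only the named attributes of every tuple."""
    return [{a: row[a] for a in attributes} for row in relation]


def join_on(left, right, key):
    """Equi-join on `key`, assuming `key` is unique within `right`."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Both fragments keep the key ENum so that the join can reassemble tuples.
emp1 = project(emp, ("ENum", "Name", "MNum", "DNum"))
emp2 = project(emp, ("ENum", "Sal", "Tax"))

reconstructed = join_on(emp1, emp2, "ENum")
assert reconstructed == emp   # dict equality ignores key order
```

Repeating the key in every vertical fragment is what makes the reconstruction lossless; without ENum in both fragments the join would have nothing to match on.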

Page 12: L2: DDBMS Architecture

Distribution Transparency

We analyze the different levels of distribution transparency that a DDBMS can provide for applications.

A Simple Application

Supplier(SNum, Name, City)

Horizontally fragmented into:

$Supplier_1 = \sigma_{City = \text{``HK''}}(Supplier)$ at Site1

$Supplier_2 = \sigma_{City \neq \text{``HK''}}(Supplier)$ at Site2, Site3

Application: read the supplier number from the terminal and return the name of the supplier with that number.

Page 13: L2: DDBMS Architecture

Level 1 of Distribution Transparency

Fragmentation Transparency

The DDBMS interprets the database operation by accessing the databases at different sites in a way which is completely determined by the system.

    read(terminal, $SNum);
    Select Name into $Name from Supplier where SNum = $SNum;
    write(terminal, $Name).

[Figure: the application talks to the DDBMS, which accesses Supplier1 at site S1 and the copies of Supplier2 at sites S2 and S3.]
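What "completely determined by the system" could mean is sketched below with in-memory dictionaries standing in for sites. The function `select_name`, the catalog dictionaries and the supplier names are all hypothetical, and the lookup strategy is deliberately naive, with no query optimization.

```python
# Hypothetical in-memory "sites"; each holds copies of fragments keyed by SNum.
sites = {
    "S1": {"Supplier1": {1: "Ann", 3: "Cho"}},
    "S2": {"Supplier2": {2: "Bob", 4: "Dee"}},
    "S3": {"Supplier2": {2: "Bob", 4: "Dee"}},
}

# Catalog information kept by the system, not by the application.
fragmentation_schema = {"Supplier": ["Supplier1", "Supplier2"]}
allocation_schema = {"Supplier1": ["S1"], "Supplier2": ["S2", "S3"]}


def select_name(global_relation, snum):
    """Answer a query posed on the global relation.

    The system, not the application, decides which fragments and which
    sites to visit. This naive strategy tries every fragment in turn and
    reads the first listed copy.
    """
    for fragment in fragmentation_schema[global_relation]:
        site = allocation_schema[fragment][0]
        name = sites[site][fragment].get(snum)
        if name is not None:
            return name
    return None


# The application only ever mentions the global relation "Supplier".
print(select_name("Supplier", 3))
print(select_name("Supplier", 4))
```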

Page 14: L2: DDBMS Architecture

Level 2 of Distribution Transparency

Location Transparency

The application is independent of changes to the allocation schema, but not of changes to the fragmentation schema.

    read(terminal, $SNum);
    Select Name into $Name from Supplier1 where SNum = $SNum;
    If not FOUND then
        Select Name into $Name from Supplier2 where SNum = $SNum;
    write(terminal, $Name).

[Figure: the application talks to the DDBMS, which accesses Supplier1 at site S1 and the copies of Supplier2 at sites S2 and S3.]

Page 15: L2: DDBMS Architecture

Level 3 of Distribution Transparency

Local Mapping Transparency

The applications have to specify both the fragment names and the sites where they are located. The mapping of the database operations specified in applications to those of the DBMSs at the sites is transparent.

    read(terminal, $SNum);
    Select Name into $Name from S1.Supplier1 where SNum = $SNum;
    If not FOUND then
        Select Name into $Name from S3.Supplier2 where SNum = $SNum;
    write(terminal, $Name).

[Figure: the application talks to the DDBMS, which accesses Supplier1 at site S1 and the copies of Supplier2 at sites S2 and S3.]
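The practical difference between location transparency and local mapping transparency is who chooses the site. The sketch below contrasts the two under the assumption of in-memory sites; every function and dictionary name here is hypothetical.

```python
# Hypothetical in-memory sites holding fragment copies keyed by SNum.
sites = {
    "S1": {"Supplier1": {1: "Ann", 3: "Cho"}},
    "S2": {"Supplier2": {2: "Bob", 4: "Dee"}},
    "S3": {"Supplier2": {2: "Bob", 4: "Dee"}},
}
allocation_schema = {"Supplier1": ["S1"], "Supplier2": ["S2", "S3"]}


def read_fragment(fragment, snum):
    """Location transparency: the system picks a site holding the fragment."""
    site = allocation_schema[fragment][0]
    return sites[site][fragment].get(snum)


def read_fragment_at(site, fragment, snum):
    """Local mapping transparency: the application names the site itself."""
    return sites[site][fragment].get(snum)


def lookup_level2(snum):
    """Application code that names fragments but not sites."""
    name = read_fragment("Supplier1", snum)
    if name is None:
        name = read_fragment("Supplier2", snum)
    return name


def lookup_level3(snum):
    """Application code that names both fragments and sites."""
    name = read_fragment_at("S1", "Supplier1", snum)
    if name is None:
        name = read_fragment_at("S3", "Supplier2", snum)
    return name


print(lookup_level2(4), lookup_level3(4))
```

If the copy of Supplier2 at S3 were dropped, only `allocation_schema` would need to change for `lookup_level2`, whereas `lookup_level3` hard-codes S3 and would have to be rewritten.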

Page 16: L2: DDBMS Architecture

Level 4 of Distribution Transparency

No Transparency

    read(terminal, $SNum);
    $SupIMS($SNum, $Name, $Found) at S1;
    If not $Found then
        $SupCODASYL($SNum, $Name, $Found) at S3;
    write(terminal, $Name).

[Figure: the DDBMS reaches Supplier1 through an IMS system at site S1 and Supplier2 through a Codasyl system at site S3.]

Page 17: L2: DDBMS Architecture

Distribution Transparency for Updates

$EMP_1 = \pi_{ENum, Name, Sal, Tax}(\sigma_{DNum \le 10}(EMP))$

$EMP_2 = \pi_{ENum, MNum, DNum}(\sigma_{DNum \le 10}(EMP))$

$EMP_3 = \pi_{ENum, Name, DNum}(\sigma_{DNum > 10}(EMP))$

$EMP_4 = \pi_{ENum, MNum, Sal, Tax}(\sigma_{DNum > 10}(EMP))$

Update DNum = 15 for the employee with ENum = 100.

Before the update:

EMP1
ENum  Name  Sal  Tax
100   Ann   100  10

EMP2
ENum  MNum  DNum
100   20    3

After the update:

EMP3
ENum  Name  DNum
100   Ann   15

EMP4
ENum  MNum  Sal  Tax
100   20    100  10

Difficult:
• broadcasting updates to all copies
• migration of tuples because of a change in fragment-defining attributes

Page 18: L2: DDBMS Architecture

An Update Application

With Level 1 Fragmentation Transparency:

    UPDATE EMP SET DNum = 15 WHERE ENum = 100;

With Level 2 Location Transparency only:

    Select Name, Sal, Tax into $Name, $Sal, $Tax from EMP1 where ENum = 100;
    Select MNum into $MNum from EMP2 where ENum = 100;
    Insert into EMP3 (ENum, Name, DNum) values (100, $Name, 15);
    Insert into EMP4 (ENum, Sal, Tax, MNum) values (100, $Sal, $Tax, $MNum);
    Delete from EMP1 where ENum = 100;
    Delete from EMP2 where ENum = 100;
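The six statements needed under location transparency amount to a tuple migration between fragments. A minimal sketch of that migration follows, using dictionaries keyed by ENum as stand-ins for the four fragments; the function name `move_to_high_dnum` is hypothetical. A real system would also have to run these steps atomically across sites, which the sketch ignores.

```python
# Fragment contents before the update, keyed by ENum (values from the example).
emp1 = {100: {"Name": "Ann", "Sal": 100, "Tax": 10}}   # DNum <= 10
emp2 = {100: {"MNum": 20, "DNum": 3}}                  # DNum <= 10
emp3 = {}                                              # DNum > 10
emp4 = {}                                              # DNum > 10


def move_to_high_dnum(enum, new_dnum):
    """Migrate one employee from fragments EMP1/EMP2 to EMP3/EMP4.

    Assumes the employee is currently stored in EMP1/EMP2 and that
    new_dnum > 10, as in the example.
    """
    # Read everything first, so that nothing is modified if a key is missing.
    part1 = emp1[enum]
    part2 = emp2[enum]

    # Insert the regrouped attributes into the target fragments.
    emp3[enum] = {"Name": part1["Name"], "DNum": new_dnum}
    emp4[enum] = {"MNum": part2["MNum"], "Sal": part1["Sal"], "Tax": part1["Tax"]}

    # Delete the old tuples from the source fragments.
    del emp1[enum]
    del emp2[enum]


move_to_high_dnum(100, 15)
print(emp3)
print(emp4)
```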

Page 19: L2: DDBMS Architecture

Levels of Distribution Transparency

Fragmentation Transparency
• Just like using global relations.

Location Transparency
• Need to know the fragmentation schema, but no need to know where fragments are located.
• Applications access fragments (no need to specify the sites where fragments are located).

Local Mapping Transparency
• Need to know both the fragmentation and allocation schemas; no need to know what the underlying local DBMSs are.
• Applications access fragments, explicitly specifying where the fragments are located.

No Transparency
• Need to know the local DBMS query languages, and write applications using the functionality provided by the local DBMSs.

Page 20: L2: DDBMS Architecture

On Distribution Transparency

Higher levels of distribution transparency require appropriate DDBMS support, but make the end-application developer's work easier.

The less distribution transparency there is, the more the end-application developer needs to know about the fragmentation and allocation schemes, and about how to maintain database consistency.

There are tough problems in query optimization and transaction management that need to be tackled (in terms of system support and implementation) before fragmentation transparency can be supported.

Page 21: L2: DDBMS Architecture

Layers of Transparency

The level of transparency is inevitably a compromise between ease of use and the difficulty and overhead cost of providing high levels of transparency

DDBMS provides location transparency and fragmentation transparency, OS provides network transparency, and DBMS provides data independence

Page 22: L2: DDBMS Architecture

Some Aspects of the Classical Architecture

Distributed database technology is an “add-on” technology: most users already have populated centralized DBMSs, whereas top-down design assumes the implementation of a new DDBMS from scratch.

In many application environments, such as semi-structured databases and continuous multimedia data, the notion of a fragment is difficult to define.

Current relational DBMS products provide for some form of location transparency (for example, by using nicknames).

Page 23: L2: DDBMS Architecture

Bottom Up Architecture - Present & Future

Possible ways in which multiple databases may be put together for sharing by multiple DBMSs, which are characterized according to:

Autonomy (A): degree to which individual DBMSs can operate independently.
• 0 - Tightly coupled - integrated (A0)
• 1 - Semiautonomous - federated (A1)
• 2 - Total isolation - multidatabase systems (A2)

Distribution (D)
• 0 - no distribution - single site (D0)
• 1 - client-server - distribution of DBMS functionality (D1)
• 2 - full distribution - peer to peer distributed architecture (D2)

Heterogeneity (H)
• 0 - homogeneous (H0)
• 1 - heterogeneous (H1)
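One way to make the three dimensions concrete is to treat each architecture as a point (A, D, H). The enum and class names below are assumptions made for this sketch; only the numeric codes and their meanings come from the classification above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Autonomy(IntEnum):
    TIGHTLY_COUPLED = 0   # A0: integrated
    SEMIAUTONOMOUS = 1    # A1: federated
    TOTAL_ISOLATION = 2   # A2: multidatabase systems


class Distribution(IntEnum):
    SINGLE_SITE = 0       # D0: no distribution
    CLIENT_SERVER = 1     # D1: distribution of DBMS functionality
    PEER_TO_PEER = 2      # D2: full distribution


class Heterogeneity(IntEnum):
    HOMOGENEOUS = 0       # H0
    HETEROGENEOUS = 1     # H1


@dataclass(frozen=True)
class Alternative:
    autonomy: Autonomy
    distribution: Distribution
    heterogeneity: Heterogeneity

    def code(self):
        """Return the point in the design space, e.g. '(A1, D2, H1)'."""
        return (
            f"(A{int(self.autonomy)}, D{int(self.distribution)}, "
            f"H{int(self.heterogeneity)})"
        )


federated = Alternative(
    Autonomy.SEMIAUTONOMOUS, Distribution.PEER_TO_PEER, Heterogeneity.HETEROGENEOUS
)
print(federated.code())  # (A1, D2, H1)
```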

Page 24: L2: DDBMS Architecture

Autonomy

Autonomy refers to the distribution of control, not of data. It indicates the degree to which individual DBMSs can operate independently.

Requirements of an autonomous system:
• The local operations of the individual DBMSs are not affected by their participation in the DDBS.
• The individual DBMSs' query processing and optimization should not be affected by the execution of global queries that access multiple databases.
• System consistency or operation should not be compromised when individual DBMSs join or leave the distributed database confederation.

Page 25: L2: DDBMS Architecture

Dimensions of Autonomy

Design Autonomy: freedom for individual DBMSs to use the data models and transaction management techniques they prefer.

Communication Autonomy: freedom for individual DBMSs to decide what information (data & control) is to be exported.

Execution Autonomy: freedom for individual DBMSs to execute submitted transactions in any way they want to.

Page 26: L2: DDBMS Architecture

DDBMS Implementation Alternatives

[Figure: implementation alternatives placed in a three-dimensional space whose axes are autonomy (A), distribution (D) and heterogeneity (H). Labelled points: logically integrated and homogeneous multiple DBMSs; heterogeneous integrated DBMS; distributed homogeneous DBMS; distributed heterogeneous DBMS; single site homogeneous federated DBMS; single site heterogeneous federated DBMS; distributed homogeneous federated system; distributed heterogeneous federated DBMS; multidatabase system; heterogeneous multidatabase system; distributed homogeneous multidatabase system; distributed heterogeneous multidatabase system.]

Page 27: L2: DDBMS Architecture

Taxonomy of Distributed Databases

Composite DBMSs - tight integration
• a single image of the entire database is available to any user
• can be single or multiple sites
• can be homogeneous or heterogeneous

Federated DBMSs - semiautonomous
• DBMSs that can operate independently, but have decided to make some parts of their local data shareable
• can be single or multiple sites
• they need to be modified to enable them to exchange information

Multidatabase Systems - total isolation
• individual systems are stand-alone DBMSs, which know neither of the existence of other databases nor how to communicate with them
• no global control over the execution of individual DBMSs
• can be single or multiple sites
• homogeneous or heterogeneous

Page 28: L2: DDBMS Architecture

DDBS Reference Architecture

It is logically integrated. It provides for the levels of transparency.

[Figure: the external schemas ES1, ES2, ..., ESn are defined over a single global conceptual schema (GCS), which maps to the local conceptual schemas LCS1, LCS2, ..., LCSn; each local conceptual schema maps to its local internal schema LIS1, LIS2, ..., LISn.]

Page 29: L2: DDBMS Architecture

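MDBS Architecture (with Global Schema)

• The GCS is generated by integrating LESs or LCSs
• The users of a local DBMS can maintain their autonomy
• Design of the GCS is bottom-up

[Figure: the global external schemas GES1, GES2, GES3 are defined over the GCS. The GCS is built from the local conceptual schemas LCS1, ..., LCSn, each of which has its own local internal schema (LIS1, ..., LISn) and its own local external schemas (LES11, LES12, LES13, ..., LESn1, LESn2, LESn3).]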

Page 30: L2: DDBMS Architecture

MDBS Architecture

Heterogeneous MDBS - Unilingual
• Requires users to utilize different data models and languages when both a local and a global database need to be accessed
• Applications accessing multiple databases should use external views defined on the GCS
• A single application can have an LES on the LCS and a GES on the GCS

Heterogeneous MDBS - Multilingual
• Users access the global database by means of an external schema defined using the language of the user's local DBMS
• Queries against global databases are made using the language of the local DBMS
• Must deal with translation of queries at runtime

Page 31: L2: DDBMS Architecture

MDBS w/o Global Conceptual Schema

• The local system layer consists of several DBMSs, which present part of their databases to the multidatabase layer
• The shared database is either a local conceptual schema or an external schema (not shown in the figure)
• External views are defined on one or more LCSs
• Access to multiple databases is through application programs

[Figure: a multidatabase layer containing the external schemas ES1, ES2, ES3 sits on top of a local database system layer containing LCS1, LCS2, LCS3 and the corresponding LIS1, LIS2, LIS3.]

Page 32: L2: DDBMS Architecture

MDBS w/o Global Schema

Federated Database Systems
• Do not use a global conceptual schema
• Each local DBMS defines an export schema
• The global database is a union of the export schemas
• Each application accesses the global database through an import schema (external view)

MDBS component architecture
• Existence of full-fledged local DBMSs
• The MDBS is a layer on top of the individual DBMSs that supports access to different databases
• The complexity of the layer depends on the existence of a GCS and on heterogeneity

Page 33: L2: DDBMS Architecture

Components of Client/Server System

[Figure: the client side comprises the user interface (UI), application program, client DBMS and communication software, running on the operating system. The server side comprises communication software, semantic data controller, query optimizer, transaction manager, recovery manager and runtime support processor, running on the operating system above the database. The client sends SQL queries to the server, and the server returns the result relation.]

Page 34: L2: DDBMS Architecture

Components of DDBMS

[Figure: the USER PROCESSOR contains the user interface handler, semantic data controller, global query optimizer and global execution monitor. The DATA PROCESSOR contains the local query processor, local recovery manager and runtime support processor. The components draw on the external schema, global schema, GD/D, local conceptual schema, log and local internal schema.]

Page 35: L2: DDBMS Architecture

DDBMS Components: USER PROCESSOR

• User interface handler: interprets user commands and formats the result data as it is sent to the user
• Semantic data controller: checks the integrity constraints and authorization requirements
• Global query optimizer and decomposer: determines the execution strategy, translates global queries to local queries, and generates a strategy for distributed join operations
• Global execution monitor (distributed transaction manager): coordinates the distributed execution of the user request

Page 36: L2: DDBMS Architecture

DDBMS Components: DATA PROCESSOR

• Local query processor: selects the access path and is involved in local query optimization and join operations
• Local recovery manager: maintains local database consistency
• Run-time support processor: physically accesses the database. It is the interface to the OS and contains the database buffer manager

Page 37: L2: DDBMS Architecture

Components of MDBMS

[Figure: users send requests to, and receive system responses from, a multi-DBMS layer that sits on top of several complete DBMSs. Each DBMS has its own user interface, query processor, query optimizer, transaction manager, scheduler, recovery manager and runtime support processor over its own database.]

Page 38: L2: DDBMS Architecture

Global Directory

• The directory is itself a database that contains metadata about the actual data stored in the database. It includes the support for fragmentation transparency for the classical DDBMS architecture.
• The directory can be local or distributed.
• The directory can be replicated and/or partitioned.
• Directory issues are very important for large multi-database applications, such as digital libraries.