L2: DDBMS Architecture -- 1
L2: DDBMS ArchitectureL2: DDBMS Architecture
DDBMS and Distribution Transparency
Architecture Alternatives DDBMS Components
L2: DDBMS Architecture -- 2
ANSI/SPARC ArchitectureANSI/SPARC Architecture
External View
External View
External View
Conceptual View
Internal View
ExternalSchema
Users
ConceptualSchema
InternalSchema
L2: DDBMS Architecture -- 3
The Classical DDBMS The Classical DDBMS ArchitectureArchitecture
Site 1
Other sites
Site IndependentSchemas
Global Schema
Fragmentation Schema
Allocation Schema
LOCAL DB 1
Local Mapping Schema
DBMS 1
LOCAL DB 2
Local Mapping Schema
DBMS 2
Site 2
L2: DDBMS Architecture -- 4
DDBMS SchemasDDBMS Schemas
Global Schema: a set of global relations as if database were not distributed at all
Fragmentation Schema: global relation is split into “non-overlapping” (logical) fragments. 1:n mapping from relation R to fragments Ri.
Allocation Schema: 1:1 or 1:n (redundant) mapping from fragments to sites. All fragments corresponding to the same relation R at a site j constitute the physical image Rj. A copy of a fragment is denoted by Rj
i. Local Mapping Schema: a mapping from
physical images to physical objects, which are manipulated by local DBMSs.
L2: DDBMS Architecture -- 5
Global Relations, Fragments and Global Relations, Fragments and Physical ImagesPhysical Images
R
GlobalRelation
R33
R32
R22
R21
R12
R11
R3
R2
R1
R2
(Site2)
R1
(Site 1)
R3
(Site3)
Physical Images
Fragments
L2: DDBMS Architecture -- 6
Motivation for this Motivation for this Architecture Architecture fragmentation transparency location transparency
Separating the concept of data fragmentation from the concept of data allocation
Explicit control of redundancy Independence from local databases:
allows local mapping transparency
L2: DDBMS Architecture -- 7
Rules for Data Fragmentation Rules for Data Fragmentation
Completeness All the data of the global relation must be mapped into the fragments
Reconstruction It must always be possible to reconstruct each global relation from its fragments
Disjointedness it is convenient that fragments be disjoint, so that the replication of data can be controlled explicitly at the allocation level
L2: DDBMS Architecture -- 8
Types of Data FragmentationTypes of Data Fragmentation
Vertical Fragmentation•Projection on relation
(subset of attributes)•Reconstruction by join•Updates require no tuple
migrationHorizontal Fragmentation
•Selection on relation (subset of tuples)
•Reconstruction by union•Updates may requires
tuple migrationMixed Fragmentation•A fragment is a Select-
Project query on relation.
L2: DDBMS Architecture -- 9
Horizontal Fragmentation Horizontal Fragmentation
Partitioning the tuples of a global relation into subsets Example:
Supplier(SNum, Name, City)
Horizontal Fragmentation can be: Supplier 1 = City = ``HK'' Supplier
Supplier2 = City != “HK” Supplier
Reconstruction is possible: Supplier = Supplier1 Supplier2
The set of predicates defining all the fragments must be complete, and mutually exclusive
L2: DDBMS Architecture -- 10
Derived Horizontal Derived Horizontal Fragmentation Fragmentation The horizontal fragmentation is derived
from the horizontal fragmentation of another relationExample:
Supply (SNum, PNum, DeptNum, Quan)
SNum is a supplier number Supply1 = Supply SNum=SNum Supplier1
Supply2 = Supply SNum=SNum Supplier2
The predicates defining derived horizontal fragments are: Supply.SNum = Supplier.SNum and Supplier. City =
``HK'' Supply.SNum = Supplier.SNum and Supplier. City !=
``HK''
is the semi join operation.
L2: DDBMS Architecture -- 11
Vertical Fragmentation Vertical Fragmentation
The vertical fragmentation of a global relation is the subdivision of its attributes into groups; fragments are obtained by projecting the global relation over each groupExample
EMP (ENum,Name,Sal,Tax,MNum,DNum)
A vertical fragmentation can be EMP1 = ENum, Name, MNum, DNum EMP
EMP2 = ENum, Sal, Tax EMP
Reconstruction:
ENum = ENumEMP = EMP1 EMP2
L2: DDBMS Architecture -- 12
Distribution Transparency Distribution Transparency
We analyze the different levels of distribution transparency which can be provided by DDBMS for applications. A Simple Application
Supplier(SNum, Name, City)
Horizontally fragmented into: Supplier 1 = City = ``HK'' Supplier at Site1
Supplier2 = City != “HK” Supplier at Site2, Site3
Application: Read the supplier number from the terminal and
return the name of the supplier with that number
L2: DDBMS Architecture -- 13
Level 1 of Distribution Level 1 of Distribution Transparency Transparency Fragmentation transparency:
The DDBMS interprets the database operation by accessing the databases at different sites in a way which is completely determined by the system
Supplier2
Supplier2
Supplier1
DDBMS
read(terminal, $SNum); Select Name into $Name from Supplier where SNum = $SNum;
write(terminal, $Name). S3
S2
S1
L2: DDBMS Architecture -- 14
Level 2 of Distribution Level 2 of Distribution Transparency Transparency
Location Transparency
The application is independent from changes in allocation schema, but not from changes to fragmentation schema
Supplier2
Supplier2
Supplier1
DDBMS
read(terminal, $SNum); Select Name into $Name from Supplier1 where SNum = $SNum;
If not FOUND thenSelect Name into $Name from Supplier2 where SNum = $SNum;
write(terminal, $Name).
S3
S2
S1
L2: DDBMS Architecture -- 15
Level 3 of Distribution Level 3 of Distribution Transparency Transparency
Local Mapping Transparency
The applications have to specify both the fragment names and the sites where they are located. The mapping of database operations specified in applications to those in DBMSs at sites is transparent
Supplier2
Supplier2
Supplier1
DDBMS
read(terminal, $SNum); Select Name into $Name from S1.Supplier1 where SNum = $SNum;
If not FOUND thenSelect Name into $Name from S3.Supplier2 where SNum = $SNum;
write(terminal, $Name).
S3
S2
S1
L2: DDBMS Architecture -- 16
Level 4 of Distribution Level 4 of Distribution Transparency Transparency No Transparency
read(terminal, $SNum); $SupIMS($Snum,$Name,$Found) at S1;
If not FOUND then$SupCODASYL($Snum,$Name,$Found) at S3;
write(terminal, $Name).
S1
Supplier1
IMS
DDBMS
S3
Supplier2
Codasyl
L2: DDBMS Architecture -- 17
Distribution Transparency for Distribution Transparency for Updates Updates
EMP1 = ENum,Name,Sal,TaxDNum10 (EMP) EMP2 = ENum,MNum,DNumDNum10 (EMP) EMP3 = ENum,Name,DNumDnum>10 (EMP) EMP4 = ENum,MNum,Sal,TaxDnum>10 (EMP)
EnumName Sal Tax 100 Ann 100 10
EnumMnumDnum100 20 3
EnumMnum Sal Tax 100 20 100 10
EnumName Dnum100 Ann 15
Update Dnum=15 for Employee with Enum=100
EMP1
EMP3 EMP4
EMP2
Difficult• broadcasting
updates to all copies
• migration of tuples because of change of fragment defining attributes
L2: DDBMS Architecture -- 18
An Update Application An Update Application
UPDATE Emp SET DNum = 15 WHERE ENum = 100;
Select Name, Tax, Sal into $Name, $Sal, $Tax From EMP 1 Where ENum = 100; Select MNum into $MNum From Emp 2 Where ENum = 100; Insert into EMP 3 (ENum, Name, DNum) (100, $Name, 15); Insert into EMP 4 (ENum, Sal, Tax, MNum) (100, $Sal, $Tax, $MNum); Delete EMP 1 where ENum = 100; Delete EMP 2 where ENum = 100;
With Level 1 Fragmentation Transparency
With Level 2 Location
Transparency only
L2: DDBMS Architecture -- 19
Levels of Distribution Levels of Distribution TransparencyTransparency Fragmentation Transparency
Just like using global relations. Location Transparency
Need to know fragmentation schema; but no need to know where fragments are located
Applications access fragments (no need to specify sites where fragments are located).
Local Mapping Transparency Need to know both fragmentation and allocation schema;
no need to know what the underlying local DBMSs are. Applications access fragments explicitly specifying where
the fragments are located. No Transparency
Need to know local DBMS query languages, and write applications using functionality provided by the Local DBMS
L2: DDBMS Architecture -- 20
On Distribution TransparencyOn Distribution Transparency
Higher levels of distribution transparency require appropriate DDBMS support, but makes end-application developers work easy.
Less distribution transparency the more the end-application developer needs to know about fragmentation and allocation schemes, and how to maintain database consistency.
There are tough problems in query optimization and transaction management that need to be tackled (in terms of system support and implementation) before fragmentation transparency can be supported.
L2: DDBMS Architecture -- 21
Layers of Transparency Layers of Transparency
The level of transparency is inevitably a compromise between ease of use and the difficulty and overhead cost of providing high levels of transparency
DDBMS provides location transparency and fragmentation transparency, OS provides network transparency, and DBMS provides data independence
L2: DDBMS Architecture -- 22
Some Aspects of the Classical Some Aspects of the Classical ArchitectureArchitecture Distributed database technology is an “add-
on” technology, most users already have populated centralized DBMSs. Whereas top down design assumes implementation of new DDBMS from scratch.
In many application environments, such as semi-structured databases, continuous multimedia data, the notion of fragment is difficult to define.
Current relational DBMS products provide for some form of location transparency (such as, by using nicknames).
L2: DDBMS Architecture -- 23
Bottom Up Architecture - Bottom Up Architecture - Present & FuturePresent & FuturePossible ways in which multiple databases may be
put together for sharing by multiple DBMSs, which are characterized according to
Autonomy (A) degree to which individual DBMSs can operate independently.
0 - Tightly coupled - integrated (A0), 1 - Semiautonomous - federated (A1), 2- Total Isolation - multidatabase systems(A2)
Distribution (D) 0- no distribution - single site (D0), 1 - client-server - distribution of DBMS functionality
(D1), 2- full distribution - peer to peer distributed
architecture(D2) Heterogeneity (H)
0 - homogeneous (H0) 1 - heterogeneous (H1)
L2: DDBMS Architecture -- 24
Autonomy Autonomy
Autonomy refers to the distribution of control, not of data. It indicates the degree to which individual DBMSs can operate independently
Requirements of an autonomous system The local operations of the individual DBMSs are
not affected by their participation in the DDBS. The individual DBMSs query processing and
optimization should not be affected by the execution of global queries that access multiple databases
System consistency or operation should not be compromised when individual DBMSs join or leave the distributed database confederation.
L2: DDBMS Architecture -- 25
Dimensions of Autonomy Dimensions of Autonomy
Design Autonomy Freedom for individual DBMSs to use data
models and transaction management techniques they prefer
Communication Autonomy Freedom for individual DBMSs to decide
what information (data & control) is to be exported
Execution Autonomy Freedom for individual DBMSs to execute
transactions submitted in any way that it wants to
L2: DDBMS Architecture -- 26
DDBMSs Implementation DDBMSs Implementation AlternativesAlternatives
D
Distributed heterogeneous multidatabasesystem
Logically integratedand homogeneous multiple DBMSs
Distributed heterogeneous DBMS
Distributed homogeneous multidatabase system
Heterogeneous Integrated DBMS
Multidatabase System
Single site homogeneous federated DBMS
Distributed homogeneous federated system
Distributed homogeneous DBMS
Heterogeneous multidatabase system
Single Site heterogeneous federated DBMS
Distributed Heterogeneous federated DBMS
A
H
L2: DDBMS Architecture -- 27
Taxonomy of Distributed Taxonomy of Distributed Databases Databases Composite DBMSs -tight integration
single image of entire database is available to any user can be single or multiple sites can be homogeneous or heterogeneous
Federated DBMSs - semiautonomous DBMSs that can operate independently, but have decided to
make some parts of their local data shareable can be single or multiple sites. they need to be modified to enable them to exchange
information Multidatabase Systems - total isolation
individual systems are stand alone DBMSs, which know neither the existence of other databases or how to communicate with them
no global control over the execution of individual DBMSs. can be single or multiple sites homogeneous or heterogeneous
L2: DDBMS Architecture -- 28
DDBS Reference Architecture DDBS Reference Architecture
GCS
ES1 ES2 ESn
LCS1 LCS2 LCSn
LIS1 LIS2 LISn
External Schema
It is logically integrated. Provides for the levels of transparency
Local Internal Schema
Local Conceptual Schema
Global Conceptual Schema
L2: DDBMS Architecture -- 29
MDBS Architecture (with MDBS Architecture (with Global Schema) Global Schema)
•The GCS is generated by integrating LES's or LCS's •The users of a local DBMS can maintain their autonomy •Design of GCS is bottom-up
GCS
GES1 GES2 GES3
LCS1 LCSn
LIS1 LISn
LES11 LES12 LES13 LESn1 LESn2 LESn3
L2: DDBMS Architecture -- 30
MDBS Architecture MDBS Architecture
Heterogeneous MDBS-Unilingual Requires users to utilize different data models and
languages when both a local and global database need to be accessed
Applications accessing multiple databases should use external views defined on GCS
A single application can have a LES on LCS and GES on GCS
Heterogeneous MDBS-Multilingual Users access global database by means of external
schema defined using the language of user's local DBMS
Queries against global databases are made using language of local DBMS
Deal with translation of queries at runtime
L2: DDBMS Architecture -- 31
MDBS w/o Global Conceptual MDBS w/o Global Conceptual Schema Schema
Local system layer consists of several DBMSs which present to multidatabase layer part of their databases
The shared database is either local conceptual schema or external schema (Not shown in the figure above)
External views on one or more LCSs. Access to multiple databases through
application programs
ES1
LCS1
LIS1
MultidatabaseLayer
Local DatabaseSystem Layer
ES2
LCS2
LIS2
ES3
LCS3
LIS3
L2: DDBMS Architecture -- 32
MDBS w/o Global SchemaMDBS w/o Global Schema
Federated Database Systems Do not use global conceptual schema Each local DBMS defines export schema Global database is a union of export schemas Each application accesses global database
through import schema (external view) MDBSs components architecture
Existence of full fledged local DBMSs MDBS is a layer on top of individual DBMSs
that support access to different databases The complexity of the layer depends on
existence of GCS and heterogeneity
L2: DDBMS Architecture -- 33
Components of Client/Server Components of Client/Server SystemSystem
UI Application Program
Client DBMS
Communication softwareOpe
ratin
g Sy
stem
System
Recovery Manager
Transaction Manager
Query Optimizer
Semantic Data Controller
Communication software
Ope
ratin
g
Runtime Support Processor
SQL Queries Result Relation
Database
L2: DDBMS Architecture -- 34
Components of DDBMSComponents of DDBMS
Local
Qu
ery
Pro
cess
or
Local
Recove
ryM
an
ag
er
Ru
nti
me
Su
pp
ort
Use
r In
terf
ace
Han
dle
r
Sem
an
tic D
ata
Con
troll
er
Glo
bal
Qu
ery
Op
tim
izer
Glo
bal
Execu
tion
M
on
itor
ExternalSchema
GlobalSchema
LocalConceptual
Schema
LocalInternalSchema
GD/Dlog
USER PROCESSOR DATA PROCESSOR
L2: DDBMS Architecture -- 35
DDBMS Components: USER DDBMS Components: USER PROCESSOR PROCESSOR User interface handler interprets user
commands and formats the result data as it is sent to the user
Semantic data controller checks the integrity constraints and authorization requirements
Global query optimizer and decomposer determines execution strategy, translates global queries to local queries, and generates strategy for distributed join operations
Global execution monitor (distributed transaction manager) coordinates the distributed execution of the user request
L2: DDBMS Architecture -- 36
DDBMS Components: DATA DDBMS Components: DATA PROCESSOR PROCESSOR Local query processor selects the
access path and is involved in local query optimization and join operations
Local recovery manager maintains local database consistency
Run-time support processor physically accesses the database. It is the interface to the OS and contains database buffer manager
L2: DDBMS Architecture -- 37
Components of MDBMSComponents of MDBMS
Multi-DBMS Layer
User RequestsSystem Responses
Runtime Sup. Processor
Recovery Manager
Scheduler
TransactionManager
Query Processor
Query Optimizer
UserInterface
Database
Runtime Sup. Processor
Recovery Manager
Scheduler
TransactionManager
Query Processor
Query Optimizer
UserInterface
Database
User
User User
L2: DDBMS Architecture -- 38
Global DirectoryGlobal Directory
Directory is itself a database that contains meta-data about the actual data stored in the database. It includes the support for fragmentation transparency for the classical DDBMS architecture.
Directory can be local or distributed. Directory can be replicated and/or
partitioned. Directory issues are very important for
large multi-database applications, such as digital libraries.
Top Related