Distributed Databases
Fábio Porto, LBD, winter 2004/2005
Agenda
Introduction
Architecture
Distributed database design
Query processing on distributed databases
Data integration
Outline
Introduction to DDBMS
Architecture
Distributed Database Design
Horizontal Fragmentation
Derived Horizontal Fragmentation
Vertical Fragmentation
Conclusion
Motivation
[Figure: database technology (integration) and computer networks (distribution) combine, through integration and distribution, into a distributed database system; integration ≠ distribution]
© 1994 M. Tamer Özsu & Patrick Valduriez
Distributed database systems (DDBSs) are a technology fostered by the development of database technology and computer networks. Database systems provide data independence for user applications by offering a logical view of the data that is independent of its physical implementation. Computer networks, in turn, allow physically distributed machines to communicate. This matters to database technology because, for reasons such as privacy, efficiency, reliability, and scalability, users may want their data placed close to them or replicated across different nodes, while still obtaining an integrated global view of it.
Thus, a DDBS exploits the communication facilities of a computer network to offer distributed users the same level of service as a centralized database system: query processing, transaction management, recovery, security guarantees, and constraint support.
Introduction
What's a distributed database system?
– A collection of data that belongs logically to the same system but is spread over the sites of a computer network [Ceri and Pelagatti 1984]
Logically related
Data physically distributed
Logically related
Applications view the data as one integrated database, independent of where the data is physically placed.
[Figure: applications (Apps.) query a logical database through a logical view; physically, relations R1 to R5 are spread over Computer 1, Computer 2, and Computer 3]
Distributed databases are still databases, so the data they hold is logically related according to the database model. From the user's point of view, there should be no logical distinction between a distributed and a centralized database system. At the physical level, of course, the DDBMS must deal with distribution concerns: performance, distributed catalog management, distributed transactions, and so on.
Physically distributed
The database data is placed in different computers of a network, local or wide area.
[Figure: physical placement of relations R1 to R5 over Computer 1, Computer 2, and Computer 3]
Physical distribution reflects the fact that objects on the database logical level are distributed. This is more than just file distribution. It means that, in the Relational Model, for instance, relational views of data are placed in different nodes of a computer network. The user perceives data as globally integrated and the DDBMS is responsible for recomposing such a view from the distributed relational views.
Scenarios
Centralized DBMS in a network
[Figure: Site 1 to Site 5 connected by a communication network]
A centralized DBMS available on a network node is a simplified view of a DDBMS, usually known as a client-server DBMS. In this scenario, most of the DBMS functionality runs on the server side. The client side is responsible for the user application interface, the DBMS communication modules, caching of DBMS data, and some transaction management functionality.
Shared memory architecture
Examples: symmetric multiprocessors (Sequent, Encore), some mainframes (IBM 3090, Bull's DPS8), and Sun Fire clusters (6800)
[Figure: processors P1 … Pn sharing a common memory M and disk D]
A shared-memory architecture is a multiprocessor system in which the processors share a common memory space, through which they can exchange data without sending messages to each other. This is not a DDBMS either, since control is centralized in the shared memory space. Data could be distributed over the network and managed by a distributed file system, but that does not create distributed control over the data. This architecture is also known as a symmetric multiprocessor (SMP) machine. Some commercial DBMSs, such as DB2, have been implemented on top of this architecture to provide parallel query execution.
Shared Disk architecture
Examples: DEC's VAX cluster, IBM's IMS/VS Data Sharing
[Figure: processors P1 … Pn, each with its own local memory M1 … Mn, sharing disk D]
A shared-disk architecture is also a multiprocessor system, but each processor has a local memory space that is not shared with the other processors. Disks are shared between processors and can be used to exchange information. A DBMS instance may run on each processor node, with shared access to the data on disk. Each processor node can keep cached copies of data, which requires managing updates to those cached copies.
A shared-disk system is not, strictly speaking, a DDBMS, since data is centralized even if control is decentralized.
Shared nothing architecture
Examples: Teradata's DBC, Tandem, Intel's Paragon, NCR's 3600 and 3700
[Figure: nodes 1 … n, each with its own processor Pi, memory Mi, and disk Di]
A shared-nothing architecture is a multiprocessor system in which each node is completely independent of the others. Each processor has exclusive access to its own disks and memory space. This architecture is well suited to hosting a DDBMS, with each node hosting an instance of the distributed database.
Teradata's DBC is a DDBMS designed on top of a shared-nothing architecture that can support up to 1024 nodes.
Distributed DBMS environment
[Figure: Site 1 to Site 5 connected by a communication network]
This is a typical distributed database scenario. The database is assumed to be distributed over some of the nodes of the network, and each of these nodes also runs a copy of the DBMS. Users connecting to a node served by a DBMS are offered an integrated view of the database, independent of data location.
Transparency
Transparency
Lower-level system characteristics are hidden from users.
In a DBMS environment, complex applications can concentrate on functional characteristics.
– Data management is done by the DBMS.
Levels of Transparency
Data
Data Independence
Location Transparency
Fragmentation Transparency
Language Transparency
Network Transparency
Different transparency levels come into play when data is distributed over a computer network.
At the most basic level there is simply the data itself. Data can be accessed directly by a user program, in which case the program depends completely on the physical structure of the data, or access can go through a data management service that isolates the physical aspects from the logical view of the data. The latter corresponds to the "data independence" level and allows database administrators to change the data structure without affecting applications.
To offer users an integrated view of distributed data, the database design process is extended with the notion of data fragments. A fragmentation specifies a criterion for splitting a global concept into smaller fragments. In databases, such a criterion is usually expressed in a language such as the relational algebra. Fragmentation transparency means that applications are not affected by the fragmentation process; they continue to see the data as defined in the global view.
Each fragment of a global concept may be assigned independently to a node in the network, provided that a DBMS module exists there to manage it. Allocation is the process of deciding where a fragment is placed. Location transparency means that applications do not have to know where a fragment has been placed. An extension of location transparency covers the fragment replication policy: in addition to placing a fragment on a node, the allocation process may replicate it on several nodes. Location transparency therefore also hides replication management from the user application. For example, a DBMS may transparently select the copy of a fragment that is closest to the user.
With distributed data, access depends on message exchange between network nodes. Network transparency guarantees that applications do not have to deal with network communication problems.
Finally, in a heterogeneous database scenario, data can be specified according to different data models and languages. Language transparency allows users to access the data through a common language, which is mapped to the languages of the different components.
Reference Architecture
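[Figure: external schemas ES1, ES2, ES3 over the global conceptual schema (GCS), which maps to the local conceptual schemas LCS1, LCS2, …, LCSn, each over its local internal schema LIS1, LIS2, …, LISn]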
The reference architecture above is a general model for distributed database systems. At the top level, external schemas (ES) specify the application views of the data. This corresponds to the external schema level of the ANSI/SPARC reference database architecture. The Global Conceptual Schema (GCS) provides a single integrated view of the data, independent of distribution aspects. In the next layer, each Local Conceptual Schema (LCS) holds the fragmentation and allocation definitions for one network node. Finally, the Local Internal Schema (LIS) specifies the physical characteristics of each allocated fragment.
Queries are specified over an external schema or directly over the GCS. The LCSs specify the mappings between the GCS and the local fragments. Based on these mappings, references to global elements of the GCS can be rewritten in terms of local fragments. Finally, the system uses the internal resources defined in the LIS to access local elements.
This architecture assumes a distributed database scenario in which all component databases share the same model (i.e., they are homogeneous). Later on, we will discuss architectures whose components are heterogeneous. In such scenarios, a new layer named the Local Exported Schema can be introduced [Sheth and Larson 1990].
Architectural Models
Autonomy – refers to the distribution of control. It defines how each component of the DDBMS reacts to distributed transactions.
Distribution – refers to the distribution of data across different sites.
Heterogeneity – refers to data representation in the different components of the DDBMS.
Architectural Models for DDBMS
[Figure: three axes, Distribution, Heterogeneity, and Autonomy, with the points (A2,D2,H1) and (A0,D2,H0) marked]
The three axes of the architectural models are orthogonal dimensions along which distributed database systems can be classified. They are orthogonal in the sense that a value on one dimension may be combined with any value on the other dimensions to give an architectural model.
The Distribution dimension has three levels: 0 (not distributed), 1 (client-server), and 2 (distributed).
Autonomy refers to the level of collaboration that a component of the system is prepared to offer. It concerns the distribution of control, not of data. Autonomy 0 refers to tightly integrated systems. Autonomy 1 is a level at which systems collaborate only partially with the integrated view: part of each DBMS may be accessed directly by non-global transactions, while the remaining part of the database is accessed through the global view. Autonomy 2 corresponds to highly independent systems that take no action towards the global view; the integrated view, if one exists, is provided by an extra layer on top of each autonomous DBMS.
Finally, heterogeneity concerns differences in database systems, data models, and languages. A DDBMS may be classified as homogeneous (0) or heterogeneous (1).
A triple (Ax,Dy,Hz) defines a possible DDBMS architecture. For instance, (A2,D2,H1) specifies a multi-DBMS composed of heterogeneous DBMSs that are distributed over the network.
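The classification by triple can be made concrete with a small sketch. The class name `Architecture` and the short level names are illustrative assumptions (shorthand for the descriptions in the notes), not terminology taken from the slides.

```python
from dataclasses import dataclass

# Level names are shorthand for the descriptions in the notes (assumed wording).
AUTONOMY = {0: "tightly integrated", 1: "partially autonomous", 2: "fully autonomous"}
DISTRIBUTION = {0: "not distributed", 1: "client-server", 2: "distributed"}
HETEROGENEITY = {0: "homogeneous", 1: "heterogeneous"}


@dataclass(frozen=True)
class Architecture:
    """One point in the (autonomy, distribution, heterogeneity) space."""
    autonomy: int
    distribution: int
    heterogeneity: int

    def __post_init__(self):
        # Reject levels that do not exist on an axis.
        for value, axis in ((self.autonomy, AUTONOMY),
                            (self.distribution, DISTRIBUTION),
                            (self.heterogeneity, HETEROGENEITY)):
            if value not in axis:
                raise ValueError(f"unknown level {value}")

    def label(self) -> str:
        """Triple notation used on the slide, e.g. (A2,D2,H1)."""
        return f"(A{self.autonomy},D{self.distribution},H{self.heterogeneity})"

    def describe(self) -> str:
        return ", ".join((AUTONOMY[self.autonomy],
                          DISTRIBUTION[self.distribution],
                          HETEROGENEITY[self.heterogeneity]))


multi_dbms = Architecture(autonomy=2, distribution=2, heterogeneity=1)
print(multi_dbms.label(), "->", multi_dbms.describe())
```

Because the axes are orthogonal, any combination of valid levels is accepted; only values outside an axis are rejected.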
Architectures
Distributed Databases
[Figure: application views View1, View2, …, Viewn defined over the GCS, which maps to the local internal schemas LIS1, LIS2, …, LISn]
Distributed Databases
[Figure: View1, View2, …, Viewn over the GCS and LIS1, LIS2, …, LISn, managed by a GDBMS with a Global Catalog over the local databases LDB1, LDB2, …, LDBn]
Federated Databases
[Figure: an FDBMS with a Global Catalog; external schemas ES1, ES2, …, ESn over the GCS, which maps to LCS1, LCS2, …, LCSn and LIS1, LIS2, …, LISn, each managed by a local DBMS (LDBMS1, LDBMS2, …, LDBMSn)]
Multi-DBMS
[Figure: a GQP with a Global Catalog; global schemas GES1, GES2, …, GESn over the GCS; local schemas LES11 … LESnm; each local DBMS (LDBMS1, LDBMS2, …, LDBMSn) with its own LCS and LIS]
Mediators
[Figure: a GQP; external schemas ES1, ES2, …, ESn over the GCS, which maps to LCS1, LCS2, …, LCSn describing the data sources DS1, DS2, …, DSn]
Mediators
[Figure: a GQP with a Global Catalog; external schemas ES1, ES2, …, ESn over the GCS and LCS1, LCS2, …, LCSn; each data source DS1, DS2, …, DSn is accessed through a wrapper (Wrapper1, Wrapper2, …, Wrappern)]
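Peer-to-Peer
[Figure: three peers, each with a local conceptual schema (LCS1, LCS2, LCS3) over data sources DS1 … DSn, a PES (PES1, PES2, PES3), and an ICS (ICS1, ICS2, ICS3); external schemas ES1 and ES2 sit on top]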
Peer-to-Peer
[Figure: three peers, each with a PQP, PES, PCS, ICS, and a Local Catalog + Mappings; each peer has local conceptual schemas (LCS1 … LCSn, LCSk, LCSz, LCSm, LCSw) over wrapped data sources (DS1/Wrp1 … DSn/Wrpn, DSk/Wrpk, DSz/Wrpz, DSm/Wrpm, DSw/Wrpw)]
Distributed Database Design
What is it?
The process of designing a database for deployment in a distributed environment.
Two main approaches:
– Top-down (from centralized to distributed)
– Bottom-up (from existing DBMSs to an integrated view)
Top-down is the approach taken when building new applications. Traditional database design comprises the following activities: requirements analysis, conceptual model design, logical model design, and physical database design. Distribution adds two new activities: fragmentation design and allocation design. These build directly on the conceptual database design, enriched with view access information, to produce fragments that match the applications' data access profiles.
The bottom-up approach is used when databases already exist and new applications need an integrated view of them. In this scenario, a global conceptual view is defined and the local conceptual views are mapped to it.
In the top-down approach, the GCS is the union of the LCSs derived from it. In the bottom-up approach, the GCS can be a subset of the union of the LCSs, since some of the existing data may not be relevant to the global conceptual view.
Top-Down Approach
[Figure: top-down design flow with the stages Requirements Analysis, System Requirements, Conceptual Design, View Design, Global Conceptual Schema, Access Information, External Schema Definition, Distribution Design (with User Input), and Local Conceptual Schemas]
The top-down design process begins with a requirements analysis that defines the environment of the system and elicits both the data and the processing needs of all potential database users. The requirements document is the input to two processes: conceptual design and view design. The former identifies the concepts and relationships that represent the domain covered by the application; the latter defines the access criteria. The two can be seen as complementary static and dynamic information for the database design. The conceptual schema is the basis for the global conceptual schema, and the access information feeds the views that applications have of the data. Refining the access view information with access frequencies yields a fragmentation design policy (see the next slides).
Why fragment?
To define a proper unit of distribution.
Applications access a subset of a relation.
Fragments allow concurrent transactions to run in parallel, each accessing different fragments of a relation.
Fragments allow a query to access different fragments of a relation in parallel.
Fragmentation is not strictly necessary for distribution, since whole files or relations could be distributed. That would not be the most efficient approach, however, because applications typically access only a subset of a relation, such as the employees of a certain department or the students of a certain course. Distributing whole relations would then transfer irrelevant data to the application site. It is therefore important for a distribution design process to identify an adequate unit of distribution. A fragment is a unit of distribution that is a subset of a relation (or, more generally, of an object) and that can be obtained by an expression in the data model language. Common expressions used to fragment relations include selections, projections, and semi-joins.
Besides reducing network transfer cost, fragmentation also increases concurrency, since concurrent transactions that access disjoint subsets of a relation can proceed independently of one another. A single query can also benefit from fragmentation by accessing different fragments in parallel.
The drawback of fragmentation appears once the optimal degree of fragmentation is exceeded. With excessive fragmentation, applications may need to access several fragments and combine them to build the view of the data they need.
Because different applications access the same data with different access requirements, defining the best fragmentation unit is not simple and should take the relative importance of the applications into account.
Fragmentation approach alternatives
Horizontal – defines a subset of a relation based on a selection predicate over the relation's attribute values.
Vertical – specifies fragments according to the subsets of attributes accessed by applications.
Hybrid – a combination of the two.
Example
EMP
ENO  ENAME      TITLE
E1   J. Doe     Elect. Eng.
E2   M. Smith   Syst. Anal.
E3   A. Lee     Mech. Eng.
E4   J. Miller  Programmer
E5   B. Casey   Syst. Anal.
E6   L. Chu     Elect. Eng.
E7   R. Davis   Mech. Eng.
E8   J. Jones   Syst. Anal.
ALLOC
ENO  PNO  RESP   DUR
E1   P1   Mngr   12
E2   P1   Anal.  24
E2   P2   Anal.  6
E3   P3   Cons.  10
E3   P4   Eng.   48
E4   P2   Prog.  18
E5   P2   Mngr   24
E6   P4   Mngr   48
Example
Proj
PNO  PName     Budget   Location
P1   Bioinfo   500000   Lausanne
P2   ELearn.   300000   Rio
P3   Plasma    1000000  Geneva
P4   Aircraft  150000   Lausanne
Sal
Title        Salary
Elect. Eng.  40000
Syst. Anal.  34000
Mech. Eng.   27000
Programmer   24000
Example of Horizontal Partitioning of Proj
Proj1
PNO  PName     Budget   Location
P1   Bioinfo   500000   Lausanne
P2   ELearn.   300000   Rio
Proj2
PNO  PName     Budget   Location
P3   Plasma    1000000  Geneva
P4   Aircraft  150000   Lausanne
Example of Vertical Partitioning
Proj1
PNO  Budget
P1   500000
P2   300000
P3   1000000
P4   150000
Proj2
PNO  PName     Location
P1   Bioinfo   Lausanne
P2   ELearn.   Rio
P3   Plasma    Geneva
P4   Aircraft  Lausanne
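As a minimal sketch of the vertical partitioning example, the two fragments can be seen as projections of Proj that both keep the key PNO, and Proj can be rebuilt by joining them on that key. The helper names `project` and `join_on` are illustrative assumptions, and rows are modeled as plain dictionaries.

```python
PROJ = [
    {"PNO": "P1", "PName": "Bioinfo", "Budget": 500000, "Location": "Lausanne"},
    {"PNO": "P2", "PName": "ELearn.", "Budget": 300000, "Location": "Rio"},
    {"PNO": "P3", "PName": "Plasma", "Budget": 1000000, "Location": "Geneva"},
    {"PNO": "P4", "PName": "Aircraft", "Budget": 150000, "Location": "Lausanne"},
]


def project(relation, attributes):
    """Relational projection: keep the given attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


def join_on(left, right, key):
    """Equi-join on a key attribute that is unique in `right`."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Both vertical fragments repeat the primary key PNO.
proj1 = project(PROJ, ["PNO", "Budget"])
proj2 = project(PROJ, ["PNO", "PName", "Location"])

# Reconstruction: join the fragments back on the shared key.
rebuilt = join_on(proj1, proj2, "PNO")
```

The repeated key is what makes the join lossless here; dropping PNO from either fragment would make the original rows unrecoverable.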
Correctness criteria
Completeness – each data item of a relation R must appear in at least one of its fragments.
Reconstruction – given a relation R and its fragments F = {R1, R2, …, Rn}, it is always possible to reconstruct R by applying operations over F.
Disjointness – the fragments of R are pairwise disjoint subsets of it.
The completeness property guarantees that the fragmentation process covers the whole global relation. This is important because users should be unaware of the fragmentation criteria and, as such, be free to insert any data that is valid according to a centralized model, independently of any fragmentation decision.
The reconstruction criterion is similar to the decomposition rule in database normalization. Once data has been fragmented, there must always exist an inverse operation that reconstructs the original view of the data.
Finally, disjointness simplifies the recomposition process, as no overlap has to be dealt with. For vertical partitioning, though, the fragments must overlap on the primary key, which is what guarantees recomposition.
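For horizontal fragments, the three criteria can be checked mechanically. The sketch below is one possible formulation, assuming relations are modeled as sets of tuples and that reconstruction is done by union; the function names are illustrative, and the data is the Proj1/Proj2 split from the horizontal partitioning example.

```python
from itertools import combinations

PROJ = {
    ("P1", "Bioinfo", 500000, "Lausanne"),
    ("P2", "ELearn.", 300000, "Rio"),
    ("P3", "Plasma", 1000000, "Geneva"),
    ("P4", "Aircraft", 150000, "Lausanne"),
}
# Horizontal fragments as shown in the partitioning example.
PROJ1 = {t for t in PROJ if t[0] in ("P1", "P2")}
PROJ2 = {t for t in PROJ if t[0] in ("P3", "P4")}


def is_complete(relation, fragments):
    """Every tuple of the relation appears in at least one fragment."""
    return all(any(t in f for f in fragments) for t in relation)


def is_reconstructible(relation, fragments):
    """The union of the fragments gives back exactly the relation."""
    return set().union(*fragments) == relation


def is_disjoint(fragments):
    """No tuple appears in two different fragments."""
    return all(a.isdisjoint(b) for a, b in combinations(fragments, 2))


fragments = [PROJ1, PROJ2]
print(is_complete(PROJ, fragments),
      is_reconstructible(PROJ, fragments),
      is_disjoint(fragments))
```

For vertical fragments the reconstruction operator would be a join on the key instead of a union, and disjointness would be checked on non-key attributes only.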
Allocation alternatives
Single copy – each fragment is allocated to a single node.
Replicated – a fragment can have multiple copies, each allocated to a node in the network.
Horizontal Fragmentation
PHF – Information requirements
Database information – relationships
[Figure: relations S(TITLE, SAL), E(ENO, ENAME, TITLE), P(PNO, PNAME, BUDGET, LOC), and A(ENO, PNO, RESP, DUR), connected by the links L1, L2, and L3]
PHF – Information requirements
Application information (qualitative)
– Simple predicates: given R[A1, A2, …, An], a simple predicate pj is
pj : Ai Φ value
where Φ ∈ {=, <, ≠, >, >=, <=}, value ∈ Di, and Di is the domain of Ai.
For R, define Pr = {p1, p2, …, pm} as the set of simple predicates defined over R.
Example:
PNAME = "Bioinfo"
Budget <= US$ 200000
PHF – Information requirements
– Minterm predicates: given R and Pr = {p1, p2, …, pm}, define M = {m1, m2, …, mz} as
$M = \{\, m_i \mid m_i = \bigwedge_{p_j \in Pr} p_j^{*} \,\}, \quad 1 \le j \le m, \; 1 \le i \le z$
where $p_j^{*} = p_j$ or $p_j^{*} = \neg(p_j)$.
PHF – Information requirements
Example
m1: PNAME = "Bioinfo" & Budget > 200000
m2: NOT(PNAME = "Bioinfo") & Budget > 200000
m3: PNAME = "Bioinfo" & NOT(Budget > 200000)
m4: NOT(PNAME = "Bioinfo") & NOT(Budget > 200000)
PHF – Information requirements
Application information (quantitative)
– Minterm selectivity: sel(mi)
The number of tuples that a query would access according to a given minterm predicate. For instance, the selectivity of m3 is zero, as no project satisfies the conditions of that minterm.
– Access frequency: acc(qi)
The frequency with which user applications access data. acc(qi) gives the frequency of query qi in a given period.
acc(mi) is the access frequency of a minterm mi, corresponding to the acc(qi) of the queries that contain mi.
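To make minterms and selectivity concrete, here is a minimal sketch that enumerates every minterm built from two simple predicates and counts the matching tuples of Proj. The function names and the tuple layout (PNO, PNAME, BUDGET, LOC) are assumptions made for illustration; the data is the Proj relation from the earlier example.

```python
from itertools import product

PROJ = [
    ("P1", "Bioinfo", 500000, "Lausanne"),
    ("P2", "ELearn.", 300000, "Rio"),
    ("P3", "Plasma", 1000000, "Geneva"),
    ("P4", "Aircraft", 150000, "Lausanne"),
]

# Simple predicates as (text, test function over one tuple).
SIMPLE_PREDICATES = [
    ('PNAME = "Bioinfo"', lambda t: t[1] == "Bioinfo"),
    ("Budget > 200000", lambda t: t[2] > 200000),
]


def minterms(predicates):
    """Yield (text, test) for every conjunction of plain or negated predicates."""
    for signs in product([True, False], repeat=len(predicates)):
        parts = [text if keep else f"NOT({text})"
                 for (text, _), keep in zip(predicates, signs)]

        def test(row, signs=signs):
            # A row satisfies the minterm when each predicate has the required truth value.
            return all(fn(row) == keep for (_, fn), keep in zip(predicates, signs))

        yield " & ".join(parts), test


def selectivity(test, relation):
    """sel(m): number of tuples of the relation that satisfy minterm m."""
    return sum(1 for row in relation if test(row))


SELECTIVITIES = [(text, selectivity(test, PROJ))
                 for text, test in minterms(SIMPLE_PREDICATES)]
for text, sel in SELECTIVITIES:
    print(f"sel({text}) = {sel}")
```

The enumeration order differs from the slide numbering: the second minterm produced here is the slide's m3, and it is the one with selectivity zero.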
PHF – Example
Two candidate relations: S and P
Fragmentation of S
– Application: checks salary information and determines a salary raise.
– Employee records are kept at two sites ⇒ the application runs at two sites.
– Simple predicates
p1 : SAL <= 30000
p2 : SAL > 30000
– Minterm predicates
m1 : (SAL <= 30000) and (SAL > 30000)
m2 : (SAL <= 30000) and NOT(SAL > 30000)
m3 : NOT(SAL <= 30000) and (SAL > 30000)
m4 : NOT(SAL <= 30000) and NOT(SAL > 30000)
PHF – Example
Fragmentation of S (cont.)
– Implications
i1 : (SAL <= 30000) ⇒ NOT(SAL > 30000)
i2 : NOT(SAL <= 30000) ⇒ (SAL > 30000)
i3 : (SAL > 30000) ⇒ NOT(SAL <= 30000)
i4 : NOT(SAL > 30000) ⇒ (SAL <= 30000)
– m1 contradicts i1 and m4 contradicts i2, so only m2 and m3 remain; they define the fragments S1 and S2.
S1
TITLE        SAL
Mech. Eng.   27000
Programmer   24000
S2
TITLE        SAL
Elect. Eng.  40000
Syst. Anal.  34000
PHF – Example
Fragmentation of relation P
– Applications:
1. Find the name and budget of projects given their location.
– Submitted at three sites.
2. Access project information based on budget.
– One site accesses budgets <= 200000, the other accesses budgets > 200000.
– Simple predicates
For application (1)
p1 : LOC = "Lausanne"
p2 : LOC = "Geneva"
p3 : LOC = "Rio"
For application (2)
p4 : Budget <= 200000
p5 : Budget > 200000
– Pr = Pr' = {p1, p2, p3, p4, p5}
PHF – Example
Fragmentation of relation P (cont.)
– Minterm predicates remaining after elimination of contradictions
m1 : (LOC = "Lausanne") and (Budget <= 200000)
m2 : (LOC = "Lausanne") and (Budget > 200000)
m3 : (LOC = "Geneva") and (Budget <= 200000)
m4 : (LOC = "Geneva") and (Budget > 200000)
m5 : (LOC = "Rio") and (Budget <= 200000)
m6 : (LOC = "Rio") and (Budget > 200000)
PHF – Example
P1
PNO  PNAME     BUDGET   LOC
P4   Aircraft  150000   Lausanne
P2
PNO  PNAME     BUDGET   LOC
P1   Bioinfo   500000   Lausanne
P4
PNO  PNAME     BUDGET   LOC
P3   Plasma    1000000  Geneva
P6
PNO  PNAME     BUDGET   LOC
P2   ELearn.   300000   Rio
Fragments P3 and P5 are empty for this instance of Proj, since no tuple satisfies m3 or m5.
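The fragments above can be reproduced with a short sketch that applies the six minterm predicates as selections over Proj. The dictionary `MINTERMS` and the function `horizontal_fragments` are illustrative names, and the tuple layout (PNO, PNAME, BUDGET, LOC) is an assumption of this example.

```python
PROJ = [
    ("P1", "Bioinfo", 500000, "Lausanne"),
    ("P2", "ELearn.", 300000, "Rio"),
    ("P3", "Plasma", 1000000, "Geneva"),
    ("P4", "Aircraft", 150000, "Lausanne"),
]

# One selection predicate per minterm m1..m6; fragment Pi is defined by mi.
MINTERMS = {
    "P1": lambda t: t[3] == "Lausanne" and t[2] <= 200000,
    "P2": lambda t: t[3] == "Lausanne" and t[2] > 200000,
    "P3": lambda t: t[3] == "Geneva" and t[2] <= 200000,
    "P4": lambda t: t[3] == "Geneva" and t[2] > 200000,
    "P5": lambda t: t[3] == "Rio" and t[2] <= 200000,
    "P6": lambda t: t[3] == "Rio" and t[2] > 200000,
}


def horizontal_fragments(relation, minterms):
    """Apply each minterm as a selection; every fragment is a list of tuples."""
    return {name: [row for row in relation if test(row)]
            for name, test in minterms.items()}


FRAGMENTS = horizontal_fragments(PROJ, MINTERMS)
NON_EMPTY = {name: rows for name, rows in FRAGMENTS.items() if rows}
for name, rows in NON_EMPTY.items():
    print(name, rows)
```

Because the minterms are mutually exclusive and cover every location and budget value present in the data, each tuple of Proj lands in exactly one fragment.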
Horizontal Fragmentation
For each relation in the global conceptual schema:
– Identify the most important applications that access it.
– Define simple predicates based on the selection criteria of the queries over the relation.
– Build minterm predicates as conjunctions of the simple predicates (each simple predicate appears either as is or negated).
– Identify contradictory predicates and eliminate the corresponding minterms.
– Define the fragments as selection operations whose formulas are the remaining minterm predicates.
Derived Horizontal Fragmentation
Intuition – two relations that are commonly accessed together define partitions of both sets that should be placed together.
– Example:
Emp(ENO, ENAME, TITLE, DEPT)
Alloc(ENO, DNO, Dur)
If employees are managed by department:
– M1: Dept = D1
– M2: Dept = D2
– M3: Dept = D3
The allocation information depends on that of the associated employees, so the allocation fragments are defined according to the minterms defined over employees, where Ei is the Emp fragment selected by Mi:
– A1 = Alloc ⋉ E1
– A2 = Alloc ⋉ E2
Graph of dependency
[Figure: dependency graph from the fragments E1, E2, E3 to the fragments A1, A2, A3, A4. Independent fragmentation would lead to excessive joins (possibly remote).]
[Figure: dependency graph from the fragments E1, E2, E3 to the fragments A1, A2, A3. Derived fragmentation guarantees that a single join is needed to access the data in E and A.]
Semi-join
The semi-join of relation R, defined over the set of attributes A, by relation S, defined over the set of attributes B, is the subset of the tuples of R that participate in the join of R with S.
– It is denoted $R \ltimes_F S$, where F is the join predicate.
– It can be obtained as follows:
$R \ltimes_F S = \pi_A(R \bowtie_F S)$
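The definition can be sketched directly: a semi-join keeps the tuples of R that have at least one join partner in S, which is the same as joining and then projecting back onto R's attributes. The function names are illustrative assumptions; the data comes from the ALLOC relation and from the Proj fragment holding project P4.

```python
ALLOC = [
    ("E1", "P1", "Mngr", 12),
    ("E2", "P1", "Anal.", 24),
    ("E2", "P2", "Anal.", 6),
    ("E3", "P3", "Cons.", 10),
    ("E3", "P4", "Eng.", 48),
    ("E4", "P2", "Prog.", 18),
    ("E5", "P2", "Mngr", 24),
    ("E6", "P4", "Mngr", 48),
]
# The Proj fragment that holds project P4 (PNO, PNAME, BUDGET, LOC).
P1_FRAGMENT = [("P4", "Aircraft", 150000, "Lausanne")]


def same_project(alloc_row, proj_row):
    """Join predicate F: Alloc.PNO = Proj.PNO."""
    return alloc_row[1] == proj_row[0]


def theta_join(r, s, predicate):
    """All pairs (x, y) with x in r and y in s that satisfy the predicate."""
    return [(x, y) for x in r for y in s if predicate(x, y)]


def semi_join(r, s, predicate):
    """Tuples of r that have at least one join partner in s."""
    return [x for x in r if any(predicate(x, y) for y in s)]


A1 = semi_join(ALLOC, P1_FRAGMENT, same_project)

# Same result via the definition: join, project onto R's attributes, drop duplicates.
VIA_JOIN = list(dict.fromkeys(x for x, _ in theta_join(ALLOC, P1_FRAGMENT, same_project)))
```

Computing the semi-join directly avoids materializing the joined pairs, which is why it is attractive when S lives at a remote site.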
Example DHF
Define a fragmentation criterion for the Alloc relation based on the minterms specified for the Proj relation:
P1 = σ ((LOC = "Lausanne") and (Budget <= 200000)) Proj
P2 = σ ((LOC = "Lausanne") and (Budget > 200000)) Proj
P3 = σ ((LOC = "Geneva") and (Budget <= 200000)) Proj
P4 = σ ((LOC = "Geneva") and (Budget > 200000)) Proj
P5 = σ ((LOC = "Rio") and (Budget <= 200000)) Proj
P6 = σ ((LOC = "Rio") and (Budget > 200000)) Proj
Example DHF
Derived horizontal fragmentation is obtained by defining semi-joins over the fragments of Proj:
– A1 = Alloc ⋉ (Alloc.PNO = P1.PNO) P1
– A2 = Alloc ⋉ (Alloc.PNO = P2.PNO) P2
– A3 = Alloc ⋉ (Alloc.PNO = P3.PNO) P3
– A4 = Alloc ⋉ (Alloc.PNO = P4.PNO) P4
– A5 = Alloc ⋉ (Alloc.PNO = P5.PNO) P5
– A6 = Alloc ⋉ (Alloc.PNO = P6.PNO) P6
DHF of Alloc over Proj
A1
ENO  PNO  RESP   DUR
E3   P4   Eng.   48
E6   P4   Mngr   48
A2
ENO  PNO  RESP   DUR
E1   P1   Mngr   12
E2   P1   Anal.  24
A4
ENO  PNO  RESP   DUR
E3   P3   Cons.  10
A6
ENO  PNO  RESP   DUR
E2   P2   Anal.  6
E4   P2   Prog.  18
E5   P2   Mngr   24
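The whole derivation can be sketched end to end: fragment Proj by location and budget, then derive each Alloc fragment as the semi-join of Alloc with the corresponding Proj fragment. The function names and tuple layouts are assumptions made for this illustration; the data is taken from the ALLOC and Proj tables of the running example.

```python
PROJ = [
    ("P1", "Bioinfo", 500000, "Lausanne"),
    ("P2", "ELearn.", 300000, "Rio"),
    ("P3", "Plasma", 1000000, "Geneva"),
    ("P4", "Aircraft", 150000, "Lausanne"),
]
ALLOC = [
    ("E1", "P1", "Mngr", 12),
    ("E2", "P1", "Anal.", 24),
    ("E2", "P2", "Anal.", 6),
    ("E3", "P3", "Cons.", 10),
    ("E3", "P4", "Eng.", 48),
    ("E4", "P2", "Prog.", 18),
    ("E5", "P2", "Mngr", 24),
    ("E6", "P4", "Mngr", 48),
]
LOCATIONS = ["Lausanne", "Geneva", "Rio"]


def proj_fragments(proj):
    """Fragments P1..P6: one per (location, budget <= 200000 or > 200000) pair."""
    fragments = {}
    for i, loc in enumerate(LOCATIONS):
        fragments[f"P{2 * i + 1}"] = [p for p in proj if p[3] == loc and p[2] <= 200000]
        fragments[f"P{2 * i + 2}"] = [p for p in proj if p[3] == loc and p[2] > 200000]
    return fragments


def derived_fragments(alloc, fragments):
    """Ai = Alloc semi-joined with Pi on PNO."""
    derived = {}
    for name, rows in fragments.items():
        project_numbers = {p[0] for p in rows}
        derived["A" + name[1:]] = [a for a in alloc if a[1] in project_numbers]
    return derived


DERIVED = derived_fragments(ALLOC, proj_fragments(PROJ))
for name, rows in DERIVED.items():
    if rows:
        print(name, rows)
```

Each Alloc tuple follows its project into exactly one fragment, so placing Ai at the same site as Pi lets the join of Proj and Alloc be evaluated fragment by fragment.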