Chapter25

Elmasri/Navathe, Fundamentals of Database Systems, 4th Edition

Navathe Database Management System

Transcript of Chapter25

Page 1: Chapter25

Page 2: Chapter25

Distributed Databases and Client-Server Architectures

Chapter 25

Page 3: Chapter25

Distributed Database Concepts

Data Fragmentation, Replication, and Allocation Techniques for Distributed Database Design

Types of Distributed Database Systems

Query Processing in Distributed Databases

Overview of Concurrency Control and Recovery in Distributed Databases

Distributed Databases in Oracle

Chapter Outline

Page 4: Chapter25

A number of processing elements that are interconnected by a computer network and cooperate in performing certain assigned tasks is called a distributed computing system.

A distributed computing system partitions a big problem into smaller pieces and solves it efficiently in a coordinated manner.

There are two types of multiprocessor systems:

1) Shared memory (tightly coupled) architecture – processors share secondary storage (disk) and primary memory.

2) Shared disk (loosely coupled) architecture – processors share secondary storage, but each has its own primary memory.

Distributed Database Concepts

Page 5: Chapter25

A Distributed Database (DDB) is a collection of multiple logically interrelated databases distributed over a computer network.

A Distributed Database Management System (DDBMS) refers to a software system that manages a DDB.

Database management systems developed using multiprocessor architectures are called parallel database management systems.

In shared nothing architecture every processor has its own primary and secondary memory.

Distributed Database Concepts

Page 6: Chapter25

FIGURE 25.1 Some different database system architectures.

(a) Shared nothing architecture.

Distributed Database Concepts

Page 7: Chapter25

FIGURE 25.1 (continued) Some different database system architectures. (b) A networked architecture with a centralized database at one of the sites.

Distributed Database Concepts

Page 8: Chapter25

FIGURE 25.1 (continued) Some different database system architectures. (c) A truly distributed database architecture.

Distributed Database Concepts

Page 9: Chapter25

Advantages of distributed databases:

1) Management of distributed data with different levels of transparency: hiding the details of where each file (table, relation) is physically stored within the system. The possible types of transparency are:

a) Distribution or network transparency – freedom for the user from the operational details of the network. It can be divided into location and naming transparency.

b) Replication transparency – makes the user unaware of the existence of copies.

c) Fragmentation transparency – makes the user unaware of the existence of fragments.

Advantages of Distributed Databases

Page 10: Chapter25

2) Increased reliability and availability: two of the most common potential advantages cited for DDBs:

a) Reliability – the probability that a system is running (not down) at a certain time point,

b) Availability – the probability that the system is continuously available during a time interval.

3) Improved performance: a DDBMS fragments the database and keeps the data closer to where it is needed most (data localization reduces contention for CPU and I/O services).

4) Easier expansion: adding more data, increasing database sizes, or adding more processors is much easier in a DDB.

Advantages of Distributed Databases

Page 11: Chapter25

FIGURE 25.2 Data distribution and replication among distributed databases.

Advantages of Distributed Databases

Page 12: Chapter25

A DDBMS must provide the following functions:

Keeping track of data: the ability to keep track of the data distribution, fragmentation, and replication,

Distributed query processing: the ability to access remote sites via a communication network,

Distributed transaction management: the ability to execute transactions that access data from many sites,

Replicated data management: the ability to decide which copy of a replicated data item to access and to maintain the consistency of the copies of replicated data items,

Additional Functions of DDB

Page 13: Chapter25

(cont.)

Distributed database recovery: the ability to recover from individual site crashes (or communication failure),

Security: proper management of the security of the data and the authorization/access privileges of users,

Distributed directory (catalog) management: design and policy issues for the placement of the directory.

Additional Functions of DDB

Page 14: Chapter25

Fragmentation refers to the techniques used to break up a database into logical units, called fragments.

Data replication permits certain data to be stored at more than one site; allocation is the process of assigning fragments (or replicas of fragments) to sites.

Information concerning data fragmentation, allocation, and replication is stored in a global directory that is accessed by the DDB applications as needed.

Data Fragmentation, Replication

Page 15: Chapter25

In a DDB, decisions must be made regarding which site should contain which portion of the DB.

A horizontal fragment of a relation is a subset of the tuples in the relation (e.g., DNO=5).

A vertical fragment of a relation keeps only certain attributes of a relation (e.g., SSN, SEX).

A mixed (hybrid) fragment of a relation combines the horizontal and vertical fragmentations.
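
The three kinds of fragment can be sketched in a few lines of Python. This is only an illustration: the relation is modelled as a list of dictionaries, the sample rows are made up, and the function names (horizontal_fragment, vertical_fragment, mixed_fragment) are hypothetical helpers, not part of any DDBMS.

```python
# Toy EMPLOYEE relation as a list of rows; the values are invented for illustration.
EMPLOYEE = [
    {"SSN": "111", "LNAME": "Smith", "SEX": "M", "DNO": 5},
    {"SSN": "222", "LNAME": "Wong", "SEX": "M", "DNO": 5},
    {"SSN": "333", "LNAME": "Zelaya", "SEX": "F", "DNO": 4},
]

def horizontal_fragment(relation, condition):
    """Keep only the tuples that satisfy the condition (a selection)."""
    return [row for row in relation if condition(row)]

def vertical_fragment(relation, attributes):
    """Keep only the listed attributes of every tuple (a projection)."""
    return [{a: row[a] for a in attributes} for row in relation]

def mixed_fragment(relation, condition, attributes):
    """Select first, then project: a mixed (hybrid) fragment."""
    return vertical_fragment(horizontal_fragment(relation, condition), attributes)

dept5 = horizontal_fragment(EMPLOYEE, lambda r: r["DNO"] == 5)      # DNO=5 tuples
ssn_sex = vertical_fragment(EMPLOYEE, ["SSN", "SEX"])               # SSN, SEX columns
dept5_ssn_sex = mixed_fragment(EMPLOYEE, lambda r: r["DNO"] == 5, ["SSN", "SEX"])
```

Here dept5 holds the two tuples with DNO=5, and dept5_ssn_sex holds the same two tuples reduced to the SSN and SEX attributes.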

Data Fragmentation, Replication

Page 16: Chapter25

A horizontal fragment of a relation R can be specified by a selection operation σCi(R).

A set of horizontal fragments whose conditions C1, C2, …, Cn include all the tuples in R is called a complete horizontal fragmentation of R.

A vertical fragment of a relation R can be specified by a projection operation πLi(R).

A set of vertical fragments whose projection lists L1, L2, …, Ln include all the attributes in R but share only the primary key attribute of R is called a complete vertical fragmentation of R.
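
The two completeness conditions can be checked mechanically. The sketch below is an assumption-laden illustration: conditions are modelled as Python predicates, projection lists as lists of attribute names, and both function names are hypothetical.

```python
def is_complete_horizontal(relation, conditions):
    """True if every tuple of the relation satisfies at least one condition Ci."""
    return all(any(c(row) for c in conditions) for row in relation)

def is_complete_vertical(all_attributes, key, attribute_lists):
    """True if the lists L1..Ln cover every attribute and any two lists
    share exactly the primary key attributes."""
    covered = set().union(*attribute_lists)
    if covered != set(all_attributes):
        return False
    for i, first in enumerate(attribute_lists):
        for second in attribute_lists[i + 1:]:
            if set(first) & set(second) != set(key):
                return False
    return True

# Invented sample data: three departments, but only two conditions.
R = [{"SSN": "111", "DNO": 5}, {"SSN": "333", "DNO": 4}, {"SSN": "888", "DNO": 1}]
conds = [lambda r: r["DNO"] == 5, lambda r: r["DNO"] == 4]

print(is_complete_horizontal(R, conds))  # False: the DNO=1 tuple is not covered
print(is_complete_vertical(["SSN", "LNAME", "SEX"], ["SSN"],
                           [["SSN", "LNAME"], ["SSN", "SEX"]]))  # True
```

Sharing the primary key across the vertical fragments is what allows the original tuples to be reassembled by joining the fragments on that key.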

Data Fragmentation, Replication

Page 17: Chapter25

Replication is useful in improving the availability of data.

In a fully replicated distributed database the whole database is replicated at every site.

The other extreme from full replication involves having no replication (i.e., each fragment is stored at exactly one site).

In partial replication of the data, some fragments of the database may be replicated whereas others may not.
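
As a small illustration of the three cases, an allocation can be modelled as a mapping from each fragment to the set of sites that store a copy. The function name, fragment names, and site numbers below are assumptions made for the example.

```python
def replication_kind(allocation, sites):
    """Classify an allocation given as {fragment: set of sites holding a copy}.

    With a single site, 'full' and 'none' coincide; this sketch reports 'full'.
    """
    all_sites = set(sites)
    if all(copies == all_sites for copies in allocation.values()):
        return "full"      # every fragment is stored at every site
    if all(len(copies) == 1 for copies in allocation.values()):
        return "none"      # every fragment is stored at exactly one site
    return "partial"       # some fragments are replicated, others are not

sites = [1, 2, 3]
print(replication_kind({"F1": {1, 2, 3}, "F2": {1, 2, 3}}, sites))  # full
print(replication_kind({"F1": {2}, "F2": {3}}, sites))              # none
print(replication_kind({"F1": {1, 2}, "F2": {3}}, sites))           # partial
```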

Data Fragmentation, Replication

Page 18: Chapter25

FIGURE 25.3 Allocation of fragments to sites. (a) Relation fragments at site 2 corresponding to department 5.

Data Fragmentation, Replication

Page 19: Chapter25

FIGURE 25.3 (continued) Allocation of fragments to sites. (b) Relation fragments at site 3 corresponding to department 4.

Data Fragmentation, Replication

Page 20: Chapter25

FIGURE 25.4 Complete and disjoint fragments of the WORKS_ON relation. (a) Fragments of WORKS_ON for employees working in department 5 (C=[ESSN IN (SELECT SSN FROM EMPLOYEE WHERE DNO=5)]).

Data Fragmentation, Replication

Page 21: Chapter25

FIGURE 25.4 (continued) Complete and disjoint fragments of the WORKS_ON relation. (b) Fragments of WORKS_ON for employees working in department 4 (C=[ESSN IN (SELECT SSN FROM EMPLOYEE WHERE DNO=4)]).

Data Fragmentation, Replication

Page 22: Chapter25

FIGURE 25.4 (continued) Complete and disjoint fragments of the WORKS_ON relation. (c) Fragments of WORKS_ON for employees working in department 1 (C=[ESSN IN (SELECT SSN FROM EMPLOYEE WHERE DNO=1)]).

Data Fragmentation, Replication

Page 23: Chapter25

The degree of homogeneity of the DDBMS:

1) Homogeneous DDBMS: all servers (or individual local DBMSs) use identical software and all users (clients) use identical software.

2) Heterogeneous DDBMS: servers and/or users may use different software.

The degree of local autonomy of the DDBMS:

If there is no provision for the local site to function as a stand-alone DBMS, then the system has no local autonomy; otherwise (e.g., direct access by local transactions to a server is permitted), the system has some degree of local autonomy.

Types of Distributed Database Systems

Page 24: Chapter25

In a federated DDBMS, each server is an independent and autonomous centralized DBMS that has its own local users, local transactions, and DBA (i.e., a very high level of local autonomy).

In a federated database system (FDBS) there is some global view or schema of a federation of databases that is shared by the applications.

In a multidatabase system there exists no global schema. It interactively constructs one as needed by the application.

Types of Distributed Database Systems

Page 25: Chapter25

The type of heterogeneity present in FDBSs may arise from:

1) Differences in Data Models: e.g., one server may be a relational DBMS, another an object or hierarchical DBMS.

2) Differences in Constraints: the facilities for specifying and enforcing constraints vary from system to system, e.g., referential integrity constraints and triggers.

3) Differences in Query Languages: even with the same data model, the languages and their versions vary; e.g., SQL has multiple versions such as SQL-89, SQL-92, and SQL-99.

Types of Distributed Database Systems

Page 26: Chapter25

Communication autonomy refers to the ability to decide whether to communicate with another component DBS.

Execution autonomy refers to the ability of a component DBS to execute local operations without interference from external operations by other component DBSs (and to decide the order in which to execute them).

Association autonomy refers to the ability to decide whether and how much to share its functionality and resources with other component DBSs.

Types of Distributed Database Systems

Page 27: Chapter25

FIGURE 25.5 The five-level schema architecture in a federated database system (FDBS). Source: Adapted from Sheth and Larson, Federated Database Systems for Managing Distributed Heterogeneous Autonomous Databases. ACM Computing Surveys (Vol. 22: No. 3, September 1990).

Types of Distributed Database Systems

Page 28: Chapter25

Suppose that the EMPLOYEE and the DEPARTMENT relations are distributed. The query:

π FNAME, LNAME, DNAME (EMPLOYEE ⋈ DNO=DNUMBER DEPARTMENT)

may be executed with one of the following strategies:

1) Transfer both the EMPLOYEE and the DEPARTMENT relations to the result site,

2) Transfer the EMPLOYEE relation to site A (where the DEPARTMENT relation is located), execute the query and send the output to the result site,

3) Transfer the DEPARTMENT relation to site B (where the EMPLOYEE relation is located), execute the query and send the output to the result site.
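
The three strategies can be compared by the number of bytes each one moves across the network. The sketch below assumes the result site is different from sites A and B, and the relation and result sizes are illustrative assumptions (10,000 EMPLOYEE tuples of 100 bytes, 100 DEPARTMENT tuples of 35 bytes, and 10,000 result tuples of 40 bytes), not values stated on this slide. The function name transfer_costs is hypothetical.

```python
def transfer_costs(emp_bytes, dept_bytes, result_bytes):
    """Bytes shipped over the network under each of the three strategies."""
    return {
        1: emp_bytes + dept_bytes,    # ship both relations to the result site
        2: emp_bytes + result_bytes,  # ship EMPLOYEE to site A, then ship the result
        3: dept_bytes + result_bytes, # ship DEPARTMENT to site B, then ship the result
    }

# Assumed sizes, for illustration only.
emp_bytes = 10_000 * 100     # 1,000,000 bytes
dept_bytes = 100 * 35        # 3,500 bytes
result_bytes = 10_000 * 40   # 400,000 bytes

costs = transfer_costs(emp_bytes, dept_bytes, result_bytes)
best = min(costs, key=costs.get)
print(costs)  # {1: 1003500, 2: 1400000, 3: 403500}
print(best)   # 3
```

Under these assumed sizes, strategy 3 moves the least data because the small DEPARTMENT relation travels instead of the large EMPLOYEE relation.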

Query Processing in Distributed Databases

Page 29: Chapter25

FIGURE 25.6 Examples to illustrate volume of data transferred.

Query Processing in Distributed Databases

Page 30: Chapter25

A DDBMS that supports full distribution, fragmentation, and replication transparency uses a query decomposition module to break up (decompose) a query into subqueries that can be executed at the individual sites.

For horizontal fragmentation, a selection condition (called a guard) specifies which tuples exist in the fragment.

For vertical fragmentation, the attribute list for each fragment is kept in the catalog.
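
A rough sketch of how guards and attribute lists could be used to pick the fragments a subquery must visit. Everything here is an assumption made for illustration: the catalog layout, the fragment names, the restriction to equality guards, and the function name relevant_fragments.

```python
# Hypothetical catalog: fragment name -> (site, guard, attribute list).
# A guard is simplified to {attribute: required value} (equality only).
CATALOG = {
    "EMPD5": (2, {"DNO": 5}, ["SSN", "LNAME", "DNO"]),
    "EMPD4": (3, {"DNO": 4}, ["SSN", "LNAME", "DNO"]),
}

def relevant_fragments(catalog, query_condition, needed_attributes):
    """Return (fragment, site) pairs whose guard does not contradict the
    query's equality condition and whose attribute list overlaps the
    attributes the query needs."""
    hits = []
    for name, (site, guard, attrs) in catalog.items():
        contradicts = any(
            attr in query_condition and query_condition[attr] != value
            for attr, value in guard.items()
        )
        if not contradicts and set(attrs) & set(needed_attributes):
            hits.append((name, site))
    return hits

print(relevant_fragments(CATALOG, {"DNO": 5}, ["LNAME"]))  # [('EMPD5', 2)]
print(relevant_fragments(CATALOG, {}, ["LNAME"]))          # [('EMPD5', 2), ('EMPD4', 3)]
```

A query restricted to DNO=5 only needs the fragment whose guard is compatible with that condition; an unrestricted query has to visit every fragment.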

Query Processing in Distributed Databases

Page 31: Chapter25

FIGURE 25.7 Guard conditions and attributes lists for fragments. (a) Site 2 fragments.

Query Processing in Distributed Databases

Page 32: Chapter25

FIGURE 25.7 (continued) Guard conditions and attributes lists for fragments. (b) Site 3 fragments.

Query Processing in Distributed Databases

Page 33: Chapter25

Concurrency control and recovery in a DDBMS environment raise problems that are not encountered in a centralized DBMS environment:

Dealing with multiple copies of the data items,

Failure of individual sites,

Failure of communication links,

Distributed commit,

Distributed deadlock.

Overview of Concurrency Control and Recovery

Page 34: Chapter25

Distributed concurrency control based on a distinguished copy designates a particular copy of each data item as the distinguished copy. The locks for this data item are associated with the distinguished copy, and all locking and unlocking requests are sent to the site that contains that copy.

If all distinguished copies are kept in a single site then it is called the primary site technique, otherwise it is called the primary copy technique.

A site that includes a distinguished copy of a data item acts as the coordinator site for concurrency control on that item.
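
The routing of lock requests can be sketched as follows. This is a deliberately simplified model, not a real lock manager: it supports exclusive locks only, has no wait queues or failure handling, and the class and method names are hypothetical.

```python
class LockCoordinator:
    """Lock table kept at one site for the distinguished copies stored there."""

    def __init__(self):
        self.holder = {}  # item -> id of the transaction holding the lock

    def lock(self, item, txn):
        # Grant the lock if the item is free or already held by this transaction.
        if self.holder.get(item, txn) != txn:
            return False
        self.holder[item] = txn
        return True

    def unlock(self, item, txn):
        if self.holder.get(item) == txn:
            del self.holder[item]


class DistinguishedCopyLocking:
    """Routes every lock request to the site holding the distinguished copy."""

    def __init__(self, distinguished_site):
        # distinguished_site: item -> site that stores its distinguished copy
        self.distinguished_site = distinguished_site
        self.coordinators = {
            site: LockCoordinator() for site in set(distinguished_site.values())
        }

    def technique(self):
        # One coordinator site for everything: primary site; otherwise primary copy.
        return "primary site" if len(self.coordinators) == 1 else "primary copy"

    def lock(self, item, txn):
        site = self.distinguished_site[item]
        return site, self.coordinators[site].lock(item, txn)

    def unlock(self, item, txn):
        site = self.distinguished_site[item]
        self.coordinators[site].unlock(item, txn)


dc = DistinguishedCopyLocking({"X": 1, "Y": 2})
print(dc.technique())      # primary copy
print(dc.lock("X", "T1"))  # (1, True)
print(dc.lock("X", "T2"))  # (1, False)
```

Whichever site a transaction runs at, its request for X is decided by site 1 alone, so the other copies of X need no lock tables of their own.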

Overview of Concurrency Control and Recovery

Page 35: Chapter25

The Oracle system is divided into two parts:

1) A front-end as the client portion, that interacts with the user. The client has no data access responsibility and merely handles the requesting, processing, and presentation of data managed by the server.

2) A back-end as the server portion, that runs Oracle and handles the functions related to concurrent shared access.

Oracle uses a two-phase commit protocol to deal with concurrent distributed transactions.
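
The general idea of two-phase commit can be sketched as below. This is a textbook-style toy, not Oracle's implementation: it ignores logging, timeouts, and site or network failures, and the Participant class and two_phase_commit function are hypothetical names.

```python
class Participant:
    """One site taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote. A real site would force-write its log before voting yes.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2 (decision): broadcast the same outcome to every participant.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return "commit" if decision else "abort"


sites = [Participant("site1"), Participant("site2", can_commit=False)]
print(two_phase_commit(sites))    # abort
print([p.state for p in sites])   # ['aborted', 'aborted']
```

A single "no" vote aborts the transaction at every site, which is what keeps the sites from committing inconsistently.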

Distributed Databases in Oracle

Page 36: Chapter25

All Oracle databases in a DDBS use Oracle's networking software, Net8, which allows databases to communicate across networks to support remote and distributed transactions.

Oracle supports database links that define a one-way communication path from one Oracle database to another. For example,

CREATE DATABASE LINK sales.us.americas;

defines a link to the sales database in the network domain us, which in turn falls under the domain americas.

Distributed Databases in Oracle

Page 37: Chapter25

Data in an Oracle DDBS can be replicated using snapshots or replicated master tables. Replication is provided at the following levels:

Basic replication: replicas of tables are managed for read-only access. For updates, data must be accessed at a single primary site.

Advanced (symmetric) replication: allows applications to update table replicas throughout a replicated DDBS. Data can be read and updated at any site.

Distributed Databases in Oracle

Page 38: Chapter25

FIGURE 25.9 Oracle distributed database systems. Source: From Oracle (1997a). Copyright Oracle Corporation 1997. All rights reserved.

Distributed Databases in Oracle