U2 1.ppt

Post on 21-Jul-2016


Distributed Databases

Presentation-I

Mr. Gadakh Prashant J.

UNIT-II: DISTRIBUTED DATABASES

• Study of DDBMS architectures
• Comparison of homogeneous and heterogeneous databases
• Analysis of concurrency control in distributed databases
• Implementation of distributed query processing
• Distributed data storage
• Distributed transactions
• Commit protocols, availability
• Distributed query processing
• Directory systems (LDAP)
• Distributed data storage and transactions

Distributed Database

In a distributed database system, the database is stored on several computers.

The computers in a distributed system communicate with one another through various communication media.

They do not share main memory or disks.

Difference from Parallel Databases

Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, the sites of a distributed database system are loosely coupled and share no physical components such as main memory or disks.

INTRODUCTION

The computers in a distributed system are referred to by a number of different names, such as sites or nodes, depending on the context in which they are mentioned.

We mainly use the term site, to emphasize the physical distribution of these systems.

Objectives

• Sharing data
• Availability
• Location transparency

An Example of a Distributed Database

Consider a banking system consisting of four branches in four different cities. Each branch has its own computer, with a database of all the accounts maintained at that branch. Each such installation is thus a site. There also exists one single site that maintains information about all the branches of the bank. Each branch maintains (among others) a relation account(Account-schema), where

Account-schema = (account-number, branch-name, balance)

Cont.

The site containing information about all the branches of the bank maintains the relation branch(Branch-schema), where

Branch-schema = (branch-name, branch-city, assets)

There are other relations maintained at the various sites; we ignore them for the purpose of our example.

Cont.

To illustrate the difference between the two types of transactions, local and global, at the sites, consider a transaction to add $50 to account number A-177 located at the Valleyview branch. If the transaction was initiated at the Valleyview branch, then it is considered local; otherwise, it is considered global.

A transaction to transfer $50 from account A-177 to account A-305, which is located at the Hillside branch, is a global transaction, since accounts in two different sites are accessed as a result of its execution.
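The local/global distinction can be sketched in a few lines of Python. This is only an illustration: the `ACCOUNT_SITE` table, the `Transaction` class and the `classify` function are hypothetical names introduced here, not part of the slides. A transaction is treated as local only when every account it touches is stored at the site where it was initiated.

```python
from dataclasses import dataclass

# Hypothetical catalog: which site (branch) stores each account.
ACCOUNT_SITE = {"A-177": "Valleyview", "A-305": "Hillside"}


@dataclass(frozen=True)
class Transaction:
    initiating_site: str  # site where the transaction was started
    accounts: tuple       # account numbers the transaction accesses


def classify(txn: Transaction) -> str:
    """Return 'local' if all accessed data lives at the initiating site, else 'global'."""
    sites_touched = {ACCOUNT_SITE[account] for account in txn.accounts}
    return "local" if sites_touched == {txn.initiating_site} else "global"


# Add $50 to A-177, initiated at Valleyview (where A-177 is stored).
deposit_at_valleyview = Transaction("Valleyview", ("A-177",))
# The same deposit, but initiated at the Hillside branch.
deposit_from_hillside = Transaction("Hillside", ("A-177",))
# Transfer $50 from A-177 to A-305: two sites are accessed.
transfer = Transaction("Valleyview", ("A-177", "A-305"))
```

Here `classify(deposit_at_valleyview)` gives `local`, while the other two transactions are classified as `global`.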

Types of distributed database

• Homogeneous distributed database system
• Heterogeneous distributed database system

In a homogeneous distributed database system:

• All the sites have identical database management system software.
• The sites are aware of each other and agree to cooperate in processing user requests.
• The system appears to the user as a single system.

[Figure: Homogeneous distributed database system. Several sites, each running identical DBMS software, share one distributed database.]

In a heterogeneous distributed database system:

• Different sites may use different schemas and software.
• Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing.

[Figure: Heterogeneous distributed database system. Several sites, each running non-identical DBMS software, share one distributed database.]

Distributed DBMS Architectures

• Client-server system
• Collaborating server system
• Middleware system

Client-Server System

It has one or more client processes and one or more server processes.

Clients are responsible for user interface issues, and servers manage data and execute transactions.

[Figure: Client-server system. Clients #1, #2 and #3; Server #1 and Server #2, each with its own database.]

Collaborating server system

In this system we have a collection of database servers.

When a server receives a query that requires access to data at other servers, it generates appropriate subqueries to be executed by other servers and puts the results together to compute answers to the original query.
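A minimal sketch of this idea, assuming a toy in-memory model: the `Server` class, its methods and the balances below are made up for illustration and do not describe any real DBMS. The server that receives the query runs it on its own rows, sends the same subquery to its peers, and merges the partial results.

```python
class Server:
    """Toy database server holding some rows of the account relation."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows   # rows stored locally at this server
        self.peers = []    # other servers it can collaborate with

    def run_subquery(self, predicate):
        """Evaluate a subquery against local data only."""
        return [row for row in self.rows if predicate(row)]

    def query(self, predicate):
        """Answer a query over all servers by combining local and peer results."""
        result = self.run_subquery(predicate)
        for peer in self.peers:
            result.extend(peer.run_subquery(predicate))
        return result


# Illustrative data; the balances are invented for the example.
valleyview = Server("Valleyview", [
    {"account_number": "A-177", "branch_name": "Valleyview", "balance": 500},
])
hillside = Server("Hillside", [
    {"account_number": "A-305", "branch_name": "Hillside", "balance": 350},
])
valleyview.peers.append(hillside)
hillside.peers.append(valleyview)

# Either server can accept the query; the user does not need to know where rows live.
rich_accounts = valleyview.query(lambda row: row["balance"] >= 400)
all_accounts = hillside.query(lambda row: True)
```

In this sketch every subquery is identical to the original query. A real system would generate different subqueries per server depending on where the needed data is stored.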

[Figure: Collaborating server system. A query arrives at one of several servers, and the query result is returned.]

Middleware Systems

The idea is that we need just one database server that is capable of managing queries and transactions spanning multiple servers; the remaining servers only need to handle local queries and transactions.

This special server can be thought of as a layer of software that coordinates the execution of queries and transactions across one or more independent database servers; such software is often called middleware.

Distributed Data Storage

Consider a relation r that is to be stored in the database. There are two approaches to storing this relation in the distributed database:

• Replication. The system maintains several identical replicas (copies) of the relation, and stores each replica at a different site. The alternative to replication is to store only one copy of relation r.

• Fragmentation. The system partitions the relation into several fragments, and stores each fragment at a different site.

Storing data in a distributed system

Fragmentation: It consists of breaking a relation into smaller relations, or fragments, and storing the fragments, possibly at different sites.

1. Horizontal fragmentation
2. Vertical fragmentation

Horizontal Fragmentation

Each fragment consists of a subset of rows of the original relation.

Tuples that belong to a given horizontal fragment are identified by a selection query.

Vertical Fragmentation

Each fragment consists of a subset of columns of the original relation.

The columns that belong to a given vertical fragment are identified by a projection query.

Original relation:

Name     Roll No    Semester   Address     %
Amit     19991012   8          Lucknow     68
Dhruv    19991020   8          Ghaziabad   63
Rishi    19991041   8          Dehradun    61
Amber    19991011   8          Mumbai      64
Anurag   19991014   8          Banaras     60

Vertical fragmentation (columns Name and Roll No):

Name     Roll No
Amit     19991012
Dhruv    19991020
Rishi    19991041
Amber    19991011
Anurag   19991014

Horizontal fragmentation (a subset of the rows):

Name     Roll No    Semester   Address     %
Rishi    19991041   8          Dehradun    61
Amber    19991011   8          Mumbai      64
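The two kinds of fragmentation can be sketched in Python using the student relation from the slide. The function names are hypothetical, and the selection predicate (address is Dehradun or Mumbai) is an assumption chosen only because it reproduces the two rows shown in the horizontal fragment; the slide does not state the actual predicate.

```python
# Student relation from the slide, one dict per tuple.
STUDENTS = [
    {"name": "Amit",   "roll_no": 19991012, "semester": 8, "address": "Lucknow",   "percent": 68},
    {"name": "Dhruv",  "roll_no": 19991020, "semester": 8, "address": "Ghaziabad", "percent": 63},
    {"name": "Rishi",  "roll_no": 19991041, "semester": 8, "address": "Dehradun",  "percent": 61},
    {"name": "Amber",  "roll_no": 19991011, "semester": 8, "address": "Mumbai",    "percent": 64},
    {"name": "Anurag", "roll_no": 19991014, "semester": 8, "address": "Banaras",   "percent": 60},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep whole rows that satisfy the predicate."""
    return [dict(row) for row in relation if predicate(row)]


def vertical_fragment(relation, columns):
    """Projection: keep only the named columns of every row."""
    return [{column: row[column] for column in columns} for row in relation]


def in_selected_cities(row):
    # Assumed predicate; it happens to select the rows for Rishi and Amber.
    return row["address"] in {"Dehradun", "Mumbai"}


h1 = horizontal_fragment(STUDENTS, in_selected_cities)
h2 = horizontal_fragment(STUDENTS, lambda row: not in_selected_cities(row))
v1 = vertical_fragment(STUDENTS, ["name", "roll_no"])
```

Because `h1` and `h2` use complementary predicates, their union contains every row of the original relation exactly once. A vertical fragment normally keeps a key column (here `roll_no`) so that the original rows can be rebuilt by joining the fragments.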

Replication

The system maintains several identical replicas (copies) of the relation, and stores each replica at a different site.

Advantages

• Availability
• Faster query evaluation

Disadvantages

• Increased overhead on update
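The trade-off can be illustrated with a toy model. This is a sketch under simple assumptions: the `ReplicatedRelation` class, the site name `Head-office` and the balance value are hypothetical, and every update is simply applied to all replicas. A read succeeds as long as one replica is reachable, while each write costs one update per replica.

```python
class ReplicatedRelation:
    """Toy relation with one full copy (replica) per site."""

    def __init__(self, sites):
        self.replicas = {site: {} for site in sites}
        self.update_messages = 0  # counts updates applied across replicas

    def write(self, key, value):
        """Apply the update at every replica: overhead grows with the replica count."""
        for replica in self.replicas.values():
            replica[key] = value
            self.update_messages += 1

    def read(self, key, up_sites):
        """Read from the first reachable site that holds a replica."""
        for site in up_sites:
            if site in self.replicas:
                return self.replicas[site][key]
        raise RuntimeError("no replica is reachable")


account = ReplicatedRelation(["Valleyview", "Hillside", "Head-office"])
account.write("A-177", 550)

# Valleyview and Head-office are down; the data is still available at Hillside.
balance = account.read("A-177", up_sites=["Hillside"])
```

After the single write, `account.update_messages` is 3, one per replica, which is the increased overhead on update. The read still succeeds with two of the three sites unavailable, which is the availability benefit.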

Distributed Query Processing

In a distributed system, we must take into account several other matters, including:

• The cost of data transmission over the network
• The potential gain in performance from having several sites process parts of the query in parallel

The relative cost of data transfer over the network and data transfer to and from disk varies widely depending on the type of network and on the speed of the disks. Thus, in general, we cannot focus solely on disk costs or on network costs. Rather, we must find a good tradeoff between the two.
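A rough way to see this tradeoff is a cost model that adds a disk term and a network term. Everything below is an assumption made for illustration: the plan names, the block and kilobyte counts, and the unit costs are invented, and the model ignores the gain from parallel execution. The point is only that the cheaper plan changes when the relative cost of the network changes.

```python
def plan_cost(plan, disk_cost_per_block, net_cost_per_kb):
    """Total cost = disk transfer cost + network transfer cost (arbitrary units)."""
    return (plan["blocks_read"] * disk_cost_per_block
            + plan["kb_shipped"] * net_cost_per_kb)


def cheapest(plans, disk_cost_per_block, net_cost_per_kb):
    """Return the name of the plan with the lowest total cost."""
    return min(
        plans,
        key=lambda name: plan_cost(plans[name], disk_cost_per_block, net_cost_per_kb),
    )


# Two hypothetical plans for the same query.
PLANS = {
    # Little disk work, but the whole relation crosses the network.
    "ship_whole_relation": {"blocks_read": 1000, "kb_shipped": 4000},
    # More disk work at the remote site, but only matching rows are shipped.
    "filter_at_remote_site": {"blocks_read": 3000, "kb_shipped": 40},
}

on_fast_network = cheapest(PLANS, disk_cost_per_block=1.0, net_cost_per_kb=0.1)
on_slow_network = cheapest(PLANS, disk_cost_per_block=1.0, net_cost_per_kb=10.0)
```

With the cheap network, shipping the whole relation wins (about 1400 units against about 3004). With the expensive network, filtering at the remote site wins (about 3400 units against about 41000). Neither disk cost nor network cost alone decides the choice.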

Continued in the next presentation.