Jerry Post Copyright © 2003 1 Database Management Systems Chapter 10 Distributed Databases.


Jerry Post
Copyright © 2003

Database Management Systems

Chapter 10

Distributed Databases


Distributed Databases

Definition
Advantages / Uses
Problems / Complications
Client-Server / SQL Server
Microsoft Access

[Figure: sites in Britain, Germany, France, and Italy]

    SELECT Sales
    FROM Britain.Sales
    UNION
    SELECT Sales
    FROM France.Sales
    UNION
    SELECT Sales
    FROM Italy.Sales
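The UNION query above can be tried on a single machine. This is a minimal sketch, assuming SQLite stands in for the separate national databases: each ATTACH creates its own in-memory database, so the names Britain.Sales, France.Sales, and Italy.Sales resolve the way they do on the slide. The sales figures are invented for illustration.

```python
import sqlite3

# Autocommit mode, because ATTACH cannot run inside an open transaction.
con = sqlite3.connect(":memory:", isolation_level=None)

# One attached in-memory database per country (stand-ins for remote sites).
for site in ("Britain", "France", "Italy"):
    con.execute(f"ATTACH DATABASE ':memory:' AS {site}")
    con.execute(f"CREATE TABLE {site}.Sales (Sales INTEGER)")

# Invented sample data; the value 250 appears at two sites.
con.executemany("INSERT INTO Britain.Sales VALUES (?)", [(100,), (250,)])
con.executemany("INSERT INTO France.Sales VALUES (?)", [(250,), (300,)])
con.executemany("INSERT INTO Italy.Sales VALUES (?)", [(400,)])

QUERY = """
SELECT Sales FROM Britain.Sales
{op}
SELECT Sales FROM France.Sales
{op}
SELECT Sales FROM Italy.Sales
ORDER BY 1
"""

# UNION removes duplicate rows; UNION ALL keeps every row from every site.
union_rows = con.execute(QUERY.format(op="UNION")).fetchall()
union_all_rows = con.execute(QUERY.format(op="UNION ALL")).fetchall()

print(union_rows)
print(len(union_all_rows))
```

Note that UNION collapses identical values from different sites into one row. If every sale must be counted, UNION ALL is probably the operator to use.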


Distributed Database Definition

Multiple independent databases
  Each DBMS is a complete DBMS (engine, queries, locking, transactions, etc.)
Usually on different machines.
Usually in different locations.
Connected by a network.
Might be different environments
  Hardware
  Operating System
  DBMS Software

[Figure: databases Zeus, Apollo, and Athena at sites in the United States, England, and France, connected by a network]


Distributed Database Rules

C.J. Date
Rule 0: Transparency: the user should not know or care that the database is distributed.
Local autonomy.
No reliance on a central site.
Continuous operation.
Location independence.
Fragmentation independence (physical storage).
Replication independence.
Distributed query processing.
Distributed transaction management.
Hardware independence.
Operating system independence.
Network independence.
DBMS independence.


Distributed Features

Each database can continue to run even if a portion of the system fails.
Data and hardware can be moved without affecting operations or users.
  Expanding operations.
  Performance issues.
System expansion and upgrades.
  Add new section without affecting others.
  Upgrade hardware, network and DBMS.


Advantages and Applications

Business operations are often distributed
  Work and data are segmented by department.
  Work and data are segmented by geographical location.
Improved performance
  Most updates and queries are performed locally.
  Maintain local control and responsibility over data.
  Can still combine data across the system.
Scalability and expansion
  Add on, not replacement.

[Figure labels: local transactions; future expansion]


Creating a Distributed Database

Design administration plan.
Choose hardware and DBMS vendor, and network.
Set up network and DBMS connections.
Choose locations for data.
Choose replication strategy.
Create backup plan and strategy.
Create local views and synonyms.
Perform stress test: loads and failures.


Distributed Query Processing

Networks are slow
  Drives: 20 - 60 MB per sec.
  LANs: 1 - 10 MB per sec (10 - 100 Mbps).
  WANs: 0.01 - 5 MB per sec. Faster is possible but expensive!
  SANs: 10 - 100 MB per sec.
Goal is to minimize transmissions.
Each system must be capable of evaluating queries, preferably SQL.
Results depend heavily on how the system joins tables.

[Figure labels: disk drive 10 - 20 MB; LAN 10 - 100 MB; WAN 0.1 - 5 MB]
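To make the "networks are slow" point concrete, the rates listed in the slide text can be turned into transfer times. This is a rough sketch: the 100 MB table size is a hypothetical figure, and the calculation ignores latency and protocol overhead.

```python
# Transfer rates in MB per second (low, high), taken from the slide text.
RATES_MB_PER_SEC = {
    "disk drive": (20, 60),
    "LAN": (1, 10),
    "WAN": (0.01, 5),
    "SAN": (10, 100),
}


def transfer_seconds(size_mb, rate_mb_per_sec):
    """Time to move size_mb megabytes at a constant rate."""
    return size_mb / rate_mb_per_sec


TABLE_MB = 100  # hypothetical table size, chosen only for illustration

for medium, (slow, fast) in RATES_MB_PER_SEC.items():
    best = transfer_seconds(TABLE_MB, fast)
    worst = transfer_seconds(TABLE_MB, slow)
    print(f"{medium:10s} {best:8.1f} s to {worst:8.1f} s")
```

At the slow end of the WAN range, the same table that a local drive reads in a few seconds takes hours to ship, which is why the goal is to minimize transmissions.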


Distributed Query Processing Example

[Figure: three sites]
  NY: Customers(C#, …) 1,000,000 rows
  LA: Products(P#, Color…) 10,000,000 rows
  Chicago: Sales(S#, C#, Sdate) 20,000,000 rows; SaleItem(S#, P#,…) 50,000,000 rows

NY: Customers: 1 M rows
LA: Products: 10 M rows
Chicago: Sales: 20 M rows
Query: List customers who bought blue products on March 1
Bad idea #1
  Transfer all rows to Chicago
  Then JOIN and select.
Better idea #2 (probably)
  Transfer blue products from LA to Chicago
Better idea #3
  Get sale items on March 1
  Get blue products from LA
  Send C# to NY

[Figure flow for idea #3: P# sold on March 1; blue P# sold on March 1; C# list from desired P#; matching Customer data]
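The difference between the first and third ideas can be sketched by counting rows shipped between sites. The tiny tables below are invented stand-ins for the NY, LA, and Chicago data, and the function names are hypothetical. The sketch counts only rows crossing the network, not bytes or local processing cost.

```python
# Invented miniature versions of the three sites' tables.
customers_ny = {1: "Adams", 2: "Baker", 3: "Chen", 4: "Diaz", 5: "Evans"}  # C# -> name
products_la = {10: "blue", 11: "red", 12: "blue",
               13: "green", 14: "red", 15: "blue"}                         # P# -> color
sales_chi = [(100, 1, "03-01"), (101, 2, "03-01"), (102, 3, "03-02")]      # (S#, C#, Sdate)
sale_items_chi = [(100, 10), (100, 11), (101, 11), (102, 12)]              # (S#, P#)
TARGET_DATE = "03-01"


def plan_ship_everything():
    """Idea #1: copy all customers and products to Chicago, then join there."""
    shipped = len(customers_ny) + len(products_la)
    march1 = {s: c for s, c, d in sales_chi if d == TARGET_DATE}
    buyers = {march1[s] for s, p in sale_items_chi
              if s in march1 and products_la[p] == "blue"}
    return sorted(customers_ny[c] for c in buyers), shipped


def plan_ship_keys_only():
    """Idea #3: ship only the key lists each site needs."""
    march1 = {s: c for s, c, d in sales_chi if d == TARGET_DATE}
    sold = {p for s, p in sale_items_chi if s in march1}   # Chicago: P# sold on March 1
    shipped = len(sold)                                    # P# list sent to LA
    blue = {p for p in sold if products_la[p] == "blue"}   # LA: keep the blue ones
    shipped += len(blue)                                   # blue P# list sent back
    buyers = {march1[s] for s, p in sale_items_chi
              if s in march1 and p in blue}                # Chicago: C# for those P#
    shipped += len(buyers)                                 # C# list sent to NY
    names = sorted(customers_ny[c] for c in buyers)        # NY: matching customers
    shipped += len(names)                                  # customer rows returned
    return names, shipped


print(plan_ship_everything())
print(plan_ship_keys_only())
```

Both plans return the same customers. With the real table sizes from the slide, the gap in rows shipped would be far larger than in this toy data.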


Data Replication

Goals
  Minimize transmissions
  Improve performance
  Support heavy multiuser access.
Problems
  Updating copies
    Bulk transmissions
    Site unavailable
  Concurrency
    Easier for two people to change the same data at the same time.
Decision support systems.
Data warehouse.

[Figure: the Britain site and the Spain site each hold copies of Customers & Sales for Britain, France, and Spain. Labels: Update data. Market research & data corrections. Periodic updates.]


Concurrency and Locks

Each DBMS must maintain lock facility.
To update, each DBMS must utilize and recognize other lock mechanisms and return codes.
Each DBMS must have a deadlock resolution protocol that recognizes the distributed databases.
  Random wait.
  Optimistic updates.
  Two-phase commit.

[Figure: two transactions deadlocked across two databases]

                 DBMS #1 Accounts    DBMS #2 Accounts
                 (Jones 8898)        (Jones 3561)
Transaction A    Locked              Waiting
Transaction B    Waiting             Locked
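The deadlock in the figure is invisible to either DBMS on its own: each site sees only one transaction waiting for another. One common way to detect it, which the slide does not name, is to merge the local wait-for graphs and look for a cycle. The sketch below assumes that technique; the function names are hypothetical.

```python
def merge_graphs(*graphs):
    """Combine per-site wait-for graphs (waiter -> list of lock holders)."""
    merged = {}
    for graph in graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, []).extend(holders)
    return merged


def find_deadlock(wait_for):
    """Return the transactions on a cycle, or None if there is no cycle."""
    visited = set()

    def visit(txn, path):
        if txn in path:                      # came back to a txn on this path
            return path[path.index(txn):]
        if txn in visited:                   # already explored, no cycle there
            return None
        visited.add(txn)
        for holder in wait_for.get(txn, ()):
            cycle = visit(holder, path + [txn])
            if cycle:
                return cycle
        return None

    for txn in wait_for:
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None


# Site views from the figure: at DBMS #1, B waits for A; at DBMS #2, A waits for B.
site1 = {"B": ["A"]}
site2 = {"A": ["B"]}

print(find_deadlock(site1))                       # no cycle visible locally
print(find_deadlock(merge_graphs(site1, site2)))  # cycle appears once merged
```

A simpler protocol such as the random wait listed on the slide avoids building the graph at all, at the cost of sometimes aborting transactions that were not actually deadlocked.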


Transactions & Two-Phase Commit

Two (or more) separate lock managers.
DBMS initiating update serves as the coordinator.
Two phases
  Coordinator sends message and data to all machines to "get ready."
  Local machines save data in logs, verify update status and return message.
  If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message.

[Figure: Database 1 initiates the transaction and coordinates Database 2 and Database 3. Step 1: Prepare to commit. All agree? Step 2: Commit. Actions at each database: Lock tables. Save log. Update all tables.]
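The two phases described above can be sketched in a few lines. This is an in-memory illustration only: the class and function names are hypothetical, there is no real network, locking, or durable log, and coordinator failure is not handled.

```python
class Participant:
    """A local database taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a site that cannot apply the update
        self.data = {}
        self.pending = None
        self.log = []

    def prepare(self, update):
        # Phase 1: save the update in the log and vote.
        self.log.append(("prepare", dict(update)))
        if not self.can_commit:
            return False
        self.pending = dict(update)
        return True

    def commit(self):
        # Phase 2: apply the saved update.
        self.data.update(self.pending)
        self.pending = None
        self.log.append(("commit",))

    def rollback(self):
        self.pending = None
        self.log.append(("rollback",))


def two_phase_commit(participants, update):
    """Coordinator: commit everywhere only if every participant votes yes."""
    votes = [p.prepare(update) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled back"


ok_sites = [Participant("Database 2"), Participant("Database 3")]
ok_result = two_phase_commit(ok_sites, {"Jones": 8898})

bad_sites = [Participant("Database 2"), Participant("Database 3", can_commit=False)]
bad_result = two_phase_commit(bad_sites, {"Jones": 8898})

print(ok_result, bad_result)
```

The key property is that a single "no" vote leaves every database unchanged, including the ones that were ready to commit.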


Distributed Transaction Managers

[Figure: a Transaction Processing Monitor sits above three systems, each consisting of a Transaction Manager, a Resource Manager, and a DBMS.]

The distributed transaction coordinator/transaction processing monitor handles the transaction decisions and coordinates across the participating systems.


Distributed Design Questions

Question                                            Concurrent          Replication
What level of data consistency is needed?           High                Low – Medium
How expensive is storage?                           Medium – High       Low
What are the shared access requirements?            Global              Local
How often are the tables updated?                   Often               Seldom
Required speed of updates (transactions)?           Fast                Slow
How important are predictable transaction times?    High                Low
DBMS support for concurrency and locking?           Good – Excellent    Poor
Can shared access be avoided?                       No                  Yes


Distributed Databases In Oracle

Database Links
  Full database names.
  CONNECT command.
Linking through synonyms.
  CREATE SYNONYM …
  Central control over permissions.
Linking through Views/queries.
  CREATE VIEW AS …
  Can assign local permissions.
Linking through stored procedures.
  DELETE …
  Strong control over actions.

[email protected]@hq.acme.com

[Figure: a server database reached through a view, a synonym (Employee), and a procedure (DELETE FROM Employee WHERE ...), each with its own user permissions. For the procedure: user can only run procedure. No other access.]


Client-Server

Server: shared database

Clients: front-end user interface


LAN File Server

Not a distributed database.
  Data file stored on server.
  Server is passive, appears as giant disk drive to PC.
PC processes all data.
  Retrieves all needed data across the network.
Performance improvements.
  Indexes are crucial.
  Store some data on each PC (replication).
  Store applications on PC (graphics & forms).
  Convert to SQL-Server

[Figure: the file server holds the DBMS data file (shared data); the application runs on the PC.]

    SELECT Name, SaleDate
    FROM Customer INNER JOIN Sales ON Customer.C# = Sales.C#
    WHERE SaleDate BETWEEN #1-Mar-97# AND #9-Mar-97#;

All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first.


LAN File Server: Slow

[Figure: the file server holds MyFile.mdb, which contains the Customer table (CustID, Name, …; rows 115 Jenkins…, 125 Juarez ...), the Order table, and forms. The PC runs the query below.]

    SELECT *
    FROM Customer
    WHERE City = "Sandy"

Transferred across the network:
  DBMS software.
  Application and query.
  One row at a time, until all rows are examined.


Client-Server Databases

One machine is dominant (server) and handles data for many clients.

Client machines handle front-end tasks and small data tables that are not shared.

[Figure: the server runs the DBMS (SQL Server) and holds the shared data; the application runs on the client. The client sends a SQL statement (SELECT ...) and the server returns the matching data.]
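The contrast between the file-server and client-server approaches comes down to where the WHERE test runs. The sketch below simulates both against one in-memory SQLite table and counts the rows that would cross the network. The customer cities and the extra customers are invented, and the function names are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customer (CustID INTEGER, Name TEXT, City TEXT)")
con.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        (115, "Jenkins", "Sandy"),
        (125, "Juarez", "Provo"),
        (135, "Kim", "Sandy"),
        (145, "Lopez", "Ogden"),
        (155, "Miller", "Provo"),
    ],
)


def file_server_style(city):
    """The PC pulls every row and applies the WHERE test itself."""
    all_rows = con.execute("SELECT * FROM Customer").fetchall()
    matches = [row for row in all_rows if row[2] == city]
    return matches, len(all_rows)      # rows transferred = whole table


def client_server_style(city):
    """The server evaluates the query and returns only matching rows."""
    matches = con.execute(
        "SELECT * FROM Customer WHERE City = ?", (city,)
    ).fetchall()
    return matches, len(matches)       # rows transferred = matches only


file_matches, file_sent = file_server_style("Sandy")
server_matches, server_sent = client_server_style("Sandy")
print(file_sent, server_sent)
```

Both approaches produce the same answer; only the volume of data moved differs, and that gap grows with the size of the table.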


ADO and Direct Connections

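[Figure: on the client computer, a Visual Basic application calls ADO, which uses the DBMS transport. On the server computer, a matching DBMS transport connects to the database server. SELECT … statements travel to the server and results travel back.]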

The database vendor (e.g., Oracle or SQL Server) provides its own data transport, installed on the server and the client.

ADO provides a driver that connects your application to the transport services.

ODBC can serve as the data transport if nothing else is available.


Three-Tier Client-Server

Server
  Databases
Client
  Front-end
Middle
  Locate databases
  Business rules
  Program code

[Figure: three tiers]
  Client: Application. Front-end. User Interface.
  Middleware: Database links. Business rules. Program code.
  Database Servers: Databases. Transactions. Legacy applications.


Database Independence on the Client

[Figure: the application connects through ADO to the original DBMS, and the same application connects through ADO to a new DBMS.]


Database Independence with Queries

Independent Application Query: works with any DBMS

    SELECT SaleID, SaleDate, CustomerID, CustomerName
    FROM SaleCustomer

Saved Oracle Query

    SELECT SaleID, SaleDate, Sale.CustomerID, LastName || ', ' || FirstName AS CustomerName
    FROM Sale, Customer
    WHERE Sale.CustomerID = Customer.CustomerID

Saved SQL Server Query

    SELECT SaleID, SaleDate, Sale.CustomerID, LastName + ', ' + FirstName AS CustomerName
    FROM Sale INNER JOIN Customer
    ON Sale.CustomerID = Customer.CustomerID
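The idea of hiding vendor-specific syntax behind a saved query can be tried with SQLite, which uses the same || concatenation operator as the Oracle version. In this sketch the saved query is a view named SaleCustomer, and the application only ever runs the independent query against it. The sample row is invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
    CREATE TABLE Sale (SaleID INTEGER PRIMARY KEY, SaleDate TEXT, CustomerID INTEGER);
    INSERT INTO Customer VALUES (7, 'Jones', 'Martha');
    INSERT INTO Sale VALUES (1015, '2004-08-12', 7);

    -- Vendor-specific part: lives in the database, not in the application.
    CREATE VIEW SaleCustomer AS
    SELECT SaleID, SaleDate, Sale.CustomerID AS CustomerID,
           LastName || ', ' || FirstName AS CustomerName
    FROM Sale INNER JOIN Customer ON Sale.CustomerID = Customer.CustomerID;
    """
)

# Application-side query: no joins, no concatenation, no vendor syntax.
APP_QUERY = "SELECT SaleID, SaleDate, CustomerID, CustomerName FROM SaleCustomer"
rows = con.execute(APP_QUERY).fetchall()
print(rows)
```

Moving to a different DBMS would then mean rewriting the view definition once, while APP_QUERY stays unchanged.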


The Internet as Client-Server

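[Figure: the client (browser) sends a request for http://server.location/page through a router, across the Internet, through a second router to the server (web server). The server returns information: HTML pages, forms, graphics.]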


HTML Limited Clients

<HTML>

<HEAD>

<TITLE>My main page</TITLE></HEAD>

<BODY BACKGROUND="graphics/back0.jpg">

<P>My text goes in paragraphs.</P>

<P>Additional tags set <B>boldface</B> and <I>Italic</I>.</P>

<P>Tables are more complicated and use a set of tags for rows and columns.</P>

<TABLE BORDER=1>

<TR><TD>First cell</TD><TD>Second cell</TD></TR>

<TR><TD>Next row</TD><TD>Second column</TD></TR>

</TABLE>

<P>There are form tags to create input forms for collecting data.

But you need CGI program code to convert and use the input data.</P>

</BODY>

</HTML>


HTML Output

My text goes in paragraphs.

Additional tags set boldface and Italic.

Tables are more complicated and use a set of tags for rows and columns.

First cell    Second cell
Next row      Second column

There are form tags to create input forms for collecting data. But you need CGI program code to convert and use the input data.


Web Server Database Fundamentals

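[Figure: the client/browser requests Server/Form.html (step 0) and receives the HTML form. The submitted form data travels to the web server as a CGI string. Program code on the server combines a query template with the form data, sends the query to the DBMS, and receives the result from the database. The result page returned to the browser is Page = Template + Result. The original figure numbers the remaining steps 1, 2, 3.]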
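The flow in the figure (form data in, query out, page = template + result) can be sketched with standard-library pieces. This is an illustration rather than a real web server: the handler name, the page template, and the customer cities are all assumptions, and SQLite stands in for the DBMS.

```python
import sqlite3
from html import escape
from string import Template
from urllib.parse import parse_qs

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customer (CustID INTEGER, Name TEXT, City TEXT)")
con.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(115, "Jenkins", "Sandy"), (125, "Juarez", "Provo")],  # invented cities
)

# Hypothetical page template; $city and $items are filled in per request.
PAGE_TEMPLATE = Template(
    "<HTML><BODY><P>Customers in $city:</P><UL>$items</UL></BODY></HTML>"
)


def handle_request(cgi_string):
    """Turn a CGI query string into a result page."""
    form = parse_qs(cgi_string)                  # e.g. "City=Sandy"
    city = form.get("City", [""])[0]
    # A parameterized query keeps user input out of the SQL text.
    rows = con.execute(
        "SELECT Name FROM Customer WHERE City = ? ORDER BY Name", (city,)
    ).fetchall()
    items = "".join(f"<LI>{escape(name)}</LI>" for (name,) in rows)
    # Page = Template + Result
    return PAGE_TEMPLATE.substitute(city=escape(city), items=items)


page = handle_request("City=Sandy")
print(page)
```

Escaping the values before they reach the template matters because the form data comes from the user and ends up inside HTML.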


Database Example: Client Side

[Figure: client-side steps 0 (Request Server/Form.html) through 3 (Results), with labels "Initial form", "Call ASP page", and "Server".]


Client-Server Data Transfer

[Figure: Order Form]
  Order ID: 1015
  Order Date: 12-Aug
  Customer (combo box): Jones, Martha

What if there are 10,000 customers?

How much time to load the combo box?

How do you refresh/reload the combo box?

Alternatives?
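One possible answer to the combo-box questions, offered here as an assumption rather than as the slide's intended solution, is to stop loading every customer and instead fetch a short list that matches what the user has typed so far. The sketch uses generated customer names and a hypothetical search_customers function.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
# 10,000 generated names stand in for a real customer table.
con.executemany(
    "INSERT INTO Customer (Name) VALUES (?)",
    ((f"Customer{i:05d}",) for i in range(10000)),
)
con.execute("CREATE INDEX idx_customer_name ON Customer (Name)")


def search_customers(prefix, limit=20):
    """Return at most `limit` names starting with `prefix`.

    Note: % and _ inside `prefix` act as LIKE wildcards in this sketch.
    """
    cursor = con.execute(
        "SELECT Name FROM Customer WHERE Name LIKE ? ORDER BY Name LIMIT ?",
        (prefix + "%", limit),
    )
    return [name for (name,) in cursor]


print(len(search_customers("Customer")))      # capped by the limit
print(search_customers("Customer0001"))       # narrowed by the typed prefix
```

Each keystroke then costs one small query and a handful of rows instead of a 10,000-row transfer, at the price of one round trip of latency per lookup.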


Latency

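[Figure: timeline between server and client. The server generates the form; after a transmission delay the client receives the form; after the user delay and a second transmission delay the server receives the form data.]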


XML: Transferring Data

Many XML files contain hierarchical data.

Order: OrderID, OrderDate, ShippingCost, Comment
  Item: ItemID, Description, Quantity, Cost
  Item: ItemID, Description, Quantity, Cost
  Item: ItemID, Description, Quantity, Cost


XML: Schema Definition (xsd)

<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="OrderList" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="OrderList" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="Order">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="OrderID" type="xs:string" minOccurs="0" />
              <xs:element name="OrderDate" type="xs:date" minOccurs="0" />
              <xs:element name="ShippingCost" type="xs:string" minOccurs="0" />
              <xs:element name="Comment" type="xs:string" minOccurs="0" />
              <xs:element name="Items" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="ItemID" nillable="true" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:simpleContent msdata:ColumnName="ItemID_Text" msdata:Ordinal="0">
                          <xs:extension base="xs:string">
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
                    <xs:element name="Description" nillable="true" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:simpleContent msdata:ColumnName="Description_Text" msdata:Ordinal="0">
                          <xs:extension base="xs:string">
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>

Partial file, generated by .NET xsd.exe


XML Data Example

<?xml version="1.0"?>
<!DOCTYPE OrderList SYSTEM "orderlist.dtd">
<OrderList>
  <Order>
    <OrderID>1</OrderID>
    <OrderDate>3/6/2004</OrderDate>
    <ShippingCost>$33.54</ShippingCost>
    <Comment>Need immediately.</Comment>
    <Items>
      <ItemID>30</ItemID>
      <Description>Flea Collar-Dog-Medium</Description>
      <Quantity>208</Quantity>
      <Cost>$4.42</Cost>
      <ItemID>27</ItemID>
      <Description>Aquarium Filter &amp; Pump</Description>
      <Quantity>8</Quantity>
      <Cost>$24.65</Cost>
    </Items>
  </Order>
</OrderList>

XML: extensible markup language
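A receiving program has to turn this text back into structured data. The sketch below reads the order with Python's standard xml.etree.ElementTree module. It assumes the DOCTYPE line can be left out of the sample, and it groups the flat children of Items four at a time, because this file repeats ItemID, Description, Quantity, Cost without a wrapper element per item.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

ORDER_XML = """
<OrderList>
  <Order>
    <OrderID>1</OrderID>
    <OrderDate>3/6/2004</OrderDate>
    <ShippingCost>$33.54</ShippingCost>
    <Comment>Need immediately.</Comment>
    <Items>
      <ItemID>30</ItemID>
      <Description>Flea Collar-Dog-Medium</Description>
      <Quantity>208</Quantity>
      <Cost>$4.42</Cost>
      <ItemID>27</ItemID>
      <Description>Aquarium Filter &amp; Pump</Description>
      <Quantity>8</Quantity>
      <Cost>$24.65</Cost>
    </Items>
  </Order>
</OrderList>
"""

root = ET.fromstring(ORDER_XML)
order = root.find("Order")
order_id = order.findtext("OrderID")

# Items holds a flat run of fields, so rebuild one dict per group of four.
fields = list(order.find("Items"))
items = [
    {child.tag: child.text for child in fields[i:i + 4]}
    for i in range(0, len(fields), 4)
]

# Costs arrive as text with a dollar sign; Decimal avoids float rounding.
total = sum(
    int(item["Quantity"]) * Decimal(item["Cost"].lstrip("$")) for item in items
)

print(order_id, len(items), total)
```

The parser also decodes the &amp; entity, so the second description comes back as plain text with an ampersand.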


XML Example in Explorer


Java and JDBC

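import java.sql.*;

// The driver name, database name, login, and password are placeholders.
Connection con = DriverManager.getConnection("jdbc:myDriver:myDBName", "myLogin", "myPassword");

Statement smt = con.createStatement();
ResultSet rst = smt.executeQuery(
    "SELECT AnimalID, Name, Category, Breed FROM Animal");

// Step through the result set one row at a time.
while (rst.next()) {
    int iAnimal = rst.getInt("AnimalID");
    String sName = rst.getString("Name");
    String sCategory = rst.getString("Category");
    String sBreed = rst.getString("Breed");
    // Now do something with these four variables
}

// Release database resources when finished.
rst.close();
smt.close();
con.close();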