DATABASE 1 Distributed Databases Definition Advantages / Uses Problems / Complications ...

35
1 D D A A T T A A B B A A S S E E Distributed Databases Definition Advantages / Uses Problems / Complications Client-Server / SQL Server Microsoft Access Britain Germany France Italy SELECT Sales FROM Britain.Sales UNION SELECT Sales FROM France.Sales UNION SELECT Sales FROM Italy.Sales

Transcript of DATABASE 1 Distributed Databases Definition Advantages / Uses Problems / Complications ...

Page 1: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

1

DDAATTAABBAASSEE

Distributed Databases

Definition Advantages / Uses Problems / Complications Client-Server / SQL Server Microsoft Access

Britain

Germany

France

Italy

SELECT SalesFROM Britain.SalesUNIONSELECT SalesFROM France.SalesUNIONSELECT SalesFROM Italy.Sales

Page 2: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

2

DDAATTAABBAASSEE

Distributed Database Definition

Multiple independent databases Each DBMS is a complete

DBMS (engine, queries, locking, transactinos, etc.)

Usually on different machines. Usually in different locations.

Connected by a network. Might be different environments

Hardware Operating System DBMS Software

DatabaseZeus

DatabaseApollo

DatabaseAthena

United States

England

France

Page 3: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

3

DDAATTAABBAASSEE

Distributed Database Rules

C.J. Date Rule 0: Transparency: the

user should not know or care that the database is distributed. Local autonomy. No reliance on a central site. Continuous operation. Location independence. Fragmentation independence

(physical storage). Replication independence.

Distributed query processing. Distributed transaction

management. Hardware independence. Operating system independence. Network independence. DBMS independence.

Page 4: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

4

DDAATTAABBAASSEE

Distributed Features

Each database can continue to run even if portion fails. Data and hardware can be moved without affecting

operations or users. Expanding operations. Performance issues.

System expansion and upgrades. Add new section without affecting others. Upgrade hardware, network and DBMS.

Page 5: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

5

DDAATTAABBAASSEE

Advantages and Applications Business operations are

often distributed Work and data are

segmented by department. Work and data are

segmented by geographical location.

Improved performance Most updates and queries

are performed locally. Maintain local control and

responsibility over data.

Can still combine data across the system.

Scalability and expansion Add on, not replacement.

localtransactions

futureexpansion

Page 6: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

6

DDAATTAABBAASSEE

Creating a Distributed Database

Design administration plan. Choose hardware and DBMS vendor,

and network. Set up network and DBMS

connections. Choose locations for data. Choose replication strategy. Create backup plan and strategy. Create local views and synonyms. Perform stress test: loads and failures.

Page 7: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

7

DDAATTAABBAASSEE

Distributed Query Processing

Networks are slow Drives: 10 - 20 MB per sec. LANs: 1 - 10 MB per sec. WANs: 0.01 - 5 MB per sec. Faster is possible but expensive!

Goal is to minimize transmissions. Each system must be capable of

evaluating queries--preferably SQL. Results depend heavily on how the

system joins tables.

10 - 20 MB1 - 10 MB

0.1 - 5 MB

Disk driveLAN

WAN

Page 8: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

8

DDAATTAABBAASSEE

Customers(C#, …)1,000,000

NY

Products(P#, Color…)10,000,000

Sales(S#, C#, Sdate)20,000,000SaleItem(S#, P#,…)50,000,000

Chicago

LA

Distributed Query Processing Example

NY: Customers: 1 M rows LA: Production: 10 M rows Chicago: Sales: 20 M rows Query: List customers who

bought blue products on 1-Mar-01 Bad idea #1

Transfer all rows to ChicagoThen JOIN and select.

Better idea #2 (probably)Transfer blue products from LA

to Chicago Better idea #3

Get sale items on 1-Mar-01Get blue products from LASend C# to NY

P# sold on1-Mar-01

Blue P#sold on1-Mar-01

C# list fromdesired P#

MatchingCustomerdata

Page 9: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

9

DDAATTAABBAASSEE

Data Replication Goals

Minimize transmissions Improve performance Support heavy multiuser

access.

Problems Updating copies

Bulk transmissions Site unavailable

Concurrency Easier for two people to

change the same data at the same time.

Decision support systems. Data warehouse.

Britain: Customers& Sales

France: Customers& Sales

Spain: Customers& Sales

Britain

Britain: Customers& Sales

France: Customers& Sales

Spain: Customers& Sales

Spain

Update data.

Market research & data corrections.

Periodic updates

Page 10: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

10

DDAATTAABBAASSEE

Concurrency and Locks

Each DBMS must maintain lock facility.

To update, each DBMS must utilize and recognize other lock mechanisms and return codes.

Each DBMS must have a deadlock resolution protocol that recognizes the distributed databases. Random wait. Optimistic updates. Two-phase commit.

DBMS #1Accounts

Jones 8898

DBMS #2Accounts

Jones 3561

Transaction ALockedWaiting

Transaction BWaitingLocked

Page 11: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

11

DDAATTAABBAASSEE

Transactions & Two-Phase Commit Two (or more) separate lock

managers. DBMS initiating update

serves as the coordinator. Two phases

Coordinator sends message and data to all machines to “get ready.”

Local machines save data in logs, verify update status and return message.

If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message.

Database 1Initiate Transaction

Database 2

Database 3

1. Prepare to commit.All agree?

2. Commit

Lock tables.Save log.Update all tables.

Page 12: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

12

DDAATTAABBAASSEE

Distributed Design Questions

Question Concurrent ReplicationWhat level of data consistency is needed? High Low – MediumHow expensive is storage? Medium – High LowWhat are the shared access requirements? Global LocalHow often are the tables updated? Often SeldomRequired speed of updates (transactions)? Fast SlowHow important are predictable transaction times? High LowDBMS support for concurrency and locking? Good – Excellent PoorCan shared access be avoided? No Yes

Page 13: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

13

DDAATTAABBAASSEE

Distributed Databases In Oracle

Database Links Full database names. CONNECT command.

Linking through synonyms. CREATE SYNONYM … Central control over permissions.

Linking through Views/queries. CREATE VIEW AS … Can assign local permissions.

Linking through stored procedures. DELETE … Strong control over actions.

[email protected]@hq.acme.com

Serverdatabase

View

Synonym:Employee Procedure:

DELETE FROMEmployeeWHERE ...

userpermissions

User can onlyrun procedure.No other access.

Page 14: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

14

DDAATTAABBAASSEE

Client-Server

ServerServer

ClientsClients

SharedDatabase

Front-endUser Interface

Page 15: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

15

DDAATTAABBAASSEE

LAN File Server

Not a distributed database. Data file stored on server. Server is passive, appears

as giant disk drive to PC. PC processes all data. Retrieves all needed data

across the network.

Performance improvements. Indexes are crucial. Store some data on each

PC (replication). Store applications on PC

(graphics & forms). Convert to SQL-Server

File Server

DBMS data file

ApplicationShared

Data

SELECT Name, SaleDateFROM Customer INNER JOIN Sales ON Customer.C# = Sales.C#WHERE SaleDate BETWEEN #1-Mar-97# AND #9-Mar-97#;

All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first.

Page 16: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

16

DDAATTAABBAASSEE

LAN File Server: Slow

File Server

CustID Name …115 Jenkins…125 Juarez ...

Order ...

MyFile.mdb

Forms

SELECT *FROM CustomerWHERE City = “Sandy”

DBMSsoftwaretransferred.

Applicationand querytransferred.

One row at a timetransferred, untilall rows are examined.

Page 17: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

17

DDAATTAABBAASSEE

Client-Server Databases

One machine machine is dominant (server) and handles data for many clients.

Client machines handle front-end tasks and small data tables that are not shared.

File Server

DBMS

SQL ServerShared

Data

application

SE

LEC

T .

. .

Send SQLstatement.

Returnmatchingdata.

Page 18: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

18

DDAATTAABBAASSEE

Microsoft Access

Access is a client-side database. In file server environment. As a client to a database server.

Attach or Link to other databases. Link to an Access database is file server. Link through ODBC is database server.

ODBC: Open DataBase Connectivity Pass-Through Queries

Page 19: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

19

DDAATTAABBAASSEE

Open Database Connectivity: ODBC

Microsoft connectivity standard. Most DBMS companies

provide drivers. SQL Server, Oracle, Ingres,

etc.

Driver is installed on PC and Server.

ODBC handles: Login to database. Send query. Interpret result codes. Exchange data.

DatabaseServer

Access

ODBC driver

Link Tables

Client Computer

ODBC driver

Server Computer

SE

LEC

T …

Res

ults

Page 20: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

20

DDAATTAABBAASSEE

ODBC Basics Set the connection string.

Datatype: odbc DSN: data source name holds linkage data--built with Windows Control

Panel or RegisterDatabase. uid: login Username pwd: login Password database: full database name on remote system.

Page 21: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

21

DDAATTAABBAASSEE

ADO and Direct Connections

DatabaseServer

Visual Basicapplication

DBMS transport

ADO

Client Computer

DBMS transport

Server Computer

SE

LEC

T …

Res

ults

The Database vendor provides its own data transport (e.g,. Oracle or SQL Server) installed on the server and the client.

ADO provides a driver that connects your application to the transport services.

ODBC can serve as the data transport if nothing else is available

Page 22: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

24

DDAATTAABBAASSEE

Three-Tier Client-Server

Server Databases Client front-end Middle

Locate databases Business rules Program code

Client

Middleware

DatabaseServers

Application.Front-end.User Interface.

Databases.Transactions.Legacy applications.

Database links.Business rules.Program code.

Page 23: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

25

DDAATTAABBAASSEE

The Internet as Client-Server

ClientBrowser

Server

Web Server

Router RouterInternet

HTML pagesFormsGraphics

http://server.location/page

request

information

Page 24: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

26

DDAATTAABBAASSEE

HTML Limited Clients<HTML>

<HEAD>

<TITLE>My main page</TITLE></HEAD>

<BODY BACKGROUND=“graphics/back0.jpg”>

<P>My text goes in paragraphs.</P>

<P>Additional tags set <B>boldface</B> and <I>Italic</I>.

<P>Tables are more complicated and use a set of tags for rows and columns.</P>

<TABLE BORDER=1>

<TR><TD>First cell</TD><TD>Second cell</TD></TR>

<TR><TD>Next row</TD><TD>Second column</TD></TR>

</TABLE>

<P>There are form tags to create input forms for collecting data.

But you need CGI program code to convert and use the input data.</P>

</BODY>

</HTML>

Page 25: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

27

DDAATTAABBAASSEE

HTML Output

My text goes in paragraphs.Additional tags set boldface and I talic.Tables are more complicated and use a set of tagsfor rows and columns.F irst cell Second cellNext row Second columnThere are form tags to create input forms forcollecting data. But you need CGI program codeto convert and use the input data.

Page 26: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

28

DDAATTAABBAASSEE

Web Server Database Fundamentals

Client/Browser

HTML FormAction=“Query.asp” Web Server (IIS)

HTMLform

1

Data

SQL Server

2

Form.html

Query

Database

Result

QueryTemplate+ Code

Query.asp

Page = Template + Result

Result Page

1 2 3

1

2

3

Form

CGI String

0 Request Server/Form.html

Page 27: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

29

DDAATTAABBAASSEE

Database Example: Client Side

0 Request Server/Form.html

1

2

Server

3 Results

Call ASP page

Initial form

Page 28: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

30

DDAATTAABBAASSEE

Form Code

<HTML><HEAD><TITLE>Catalog Search</TITLE></HEAD><BODY TEXT="#000000" BGCOLOR="#ffffff"><FORM ACTION="PetStoreSearch.asp" METHOD="Put"><P> <SELECT NAME="Category"><OPTION SELECTED VALUE="Cat">Cat<OPTION VALUE="Dog">Dog<OPTION VALUE="Fish">Fish</SELECT> Category</P><P><INPUT TYPE="TEXT" NAME="Color" VALUE="Black"> Color</P><P><INPUT TYPE="SUBMIT" VALUE="Search" NAME="Submit"></P>

</FORM></BODY> </HTML>

1

Page 29: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

31

DDAATTAABBAASSEE

ASP Code Structure

Note: Access will createthe basic structure.Export a query as asp.Then edit the file.

Connect to the DatabaseCreate the SQLConnect to the Database/RecordsetLoop through the Recordset

Get a FieldDisplay itMove to the next row

End Loop

Set objConn = CreateObject("ADODB.Connection")Set rst = CreateObject("ADODB.Recordset")sql = " SELECT …"rst.Open sql …rst.MoveFirstdo while Not rs.eof

Response.Write Server.HTMLEncode(rst(”LastName"))rst.MoveNext

Loop

2

Page 30: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

32

DDAATTAABBAASSEE

ASP Query: Build SQL

<HTML><HEAD><TITLE>Animal Search Results</TITLE></HEAD><BODY><% Set objConn = Server.CreateObject("ADODB.Connection") objConn.open "PetStore","",""

sql = "SELECT AnimalID, Name, Category, Breed, DateBorn, Color, ListPrice " sql = sql & "FROM Animal WHERE (Category = '" sql = sql & Request.Form("Category") & "’) AND (Color LIKE '*" sql = sql & Request.Form("Color") & "*’)"

Set rst = Server.CreateObject("ADODB.Recordset") rst.Open sql, objConn, 3, 3%>

Page 31: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

33

DDAATTAABBAASSEE

ASP Query: Create Table<TABLE BORDER=1><CAPTION><B>ASPSearch1</B></CAPTION><THEAD><TR><TH>AnimalID</TH> <TH>Name</TH> <TH>Category</TH><TH>Breed</TH> <TH>DateBorn</TH><TH>Color</TH> <TH>ListPrice</TH></TR></THEAD><TBODY><% On Error Resume Next rst.MoveFirst do while Not rst.eof%><TR VALIGN=TOP><TD><%=Server.HTMLEncode(rst("AnimalID"))%> <BR></TD><TD><%=Server.HTMLEncode(rst("Name"))%> <BR></TD><TD><%=Server.HTMLEncode(rst("Category"))%> <BR></TD><TD><%=Server.HTMLEncode(rst("Breed"))%> <BR></TD><TD><%=Server.HTMLEncode(rst("DateBorn"))%> <BR></TD><TD><%=Server.HTMLEncode(rst("Color"))%> <BR></TD><TD><%=Server.HTMLEncode(rst("ListPrice"))%> <BR></TD></TR><% rst.MoveNext loop%></TBODY></TABLE></BODY></HTML>

Page 32: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

34

DDAATTAABBAASSEE

Client-Server Data Transfer

Order Form

Order Date

Customer

12-Aug

Jones, Martha

Order ID 1015

What if there are 10,000 customers?

How much time to load the combo box?

How do you refresh/reload the combo box?

Alternatives?

Page 33: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

35

DDAATTAABBAASSEE

XML: Transferring DataOrder: OrderID, OrderDate

Item: ItemID, Quantity, Cost

Item: ItemID, Quantity, Cost

Item: ItemID, Quantity, Cost

<!ELEMENT OrderList (Order+)><!ELEMENT Order (OrderID,OrderDate,Comment?,ShippingCost,Items+)><!ELEMENT OrderID (#PCDATA)><!ELEMENT OrderDate (#PCDATA)><!ELEMENT Comment (#PCDATA)><!ELEMENT ShippingCost (#PCDATA)><!ELEMENT Items (Item+)><!ELEMENT Item (ItemID,Description,Quantity,Price)><!ELEMENT ItemID (#PCDATA)><!ELEMENT Description (#PCDATA)><!ELEMENT Quantity (#PCDATA)><!ELEMENT Cost (#PCDATA)>

DTD: Document Type Definition is hierarchical.

+ Repeats 1 or more? Optional* Repeats 0 or more

#PCDATA: parsed character data

Page 34: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

36

DDAATTAABBAASSEE

XML Data Example

<?xml version="1.0"?><!DOCTYPE OrderList SYSTEM "orderlist.dtd"><OrderList><Order><OrderID>1</OrderID><OrderDate>3/6/2001</OrderDate><ShippingCost>$33.54</ShippingCost><Comment>Need immediately.</Comment><Items><ItemID>30</ItemID><Description>Flea Collar-Dog-Medium</Description><Quantity>208</Quantity><Cost>$4.42</Cost><ItemID>27</ItemID><Description>Aquarium Filter &amp; Pump</Description><Quantity>8</Quantity><Cost>$24.65</Cost></Items></Order></OrderList>

XML: extensible markup language

Page 35: DATABASE 1 Distributed Databases  Definition  Advantages / Uses  Problems / Complications  Client-Server / SQL Server  Microsoft Access Britain Germany.

37

DDAATTAABBAASSEE

XML Example in Explorer