PowerPoint Chapter 12
Chapter 12
Distributed Database Management Systems
Database Systems: Design, Implementation, and Management,
Seventh Edition, Rob and Coronel
In this chapter, you will learn:
• What a distributed database management system (DDBMS) is and what its components are
• How database implementation is affected by different levels of data and process distribution
• How transactions are managed in a distributed database environment
• How database design is affected by the distributed database environment
The Evolution of Distributed Database Management Systems
• Distributed database management system (DDBMS)
  – Governs storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites
• Centralized database required that corporate data be stored in a single central site
• Dynamic business environment and centralized database’s shortcomings spawned a demand for applications based on data access from different sources at multiple locations
DDBMS Advantages and Disadvantages
• Advantages include:
  – Data are located near “greatest demand” site
  – Faster data access
  – Faster data processing
  – Growth facilitation
  – Improved communications
• Advantages include (continued):
  – Reduced operating costs
  – User-friendly interface
  – Less danger of a single-point failure
  – Processor independence
• Disadvantages include:
  – Complexity of management and control
  – Security
  – Lack of standards
  – Increased storage requirements
  – Increased training cost
Characteristics of Distributed Database Management Systems
• Application interface
• Validation
• Transformation
• Query optimization
• Mapping
• I/O interface
• Formatting
• Security
• Backup and recovery
• DB administration
• Concurrency control
• Transaction management
• Must perform all the functions of centralized DBMS
• Must handle all necessary functions imposed by distribution of data and processing
  – Must perform these additional functions transparently to the end user
DDBMS Components
• Must include (at least) the following components:
  – Computer workstations
  – Network hardware and software
  – Communications media
  – Transaction processor (application processor, transaction manager)
    • Software component found in each computer that requests data
• Must include (at least) the following components (continued):
  – Data processor or data manager
    • Software component residing on each computer that stores and retrieves data located at the site
    • May be a centralized DBMS
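The split between the transaction processor (TP) and the data processor (DP) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class names, the in-memory rows, and the simple "ask every DP" routing are hypothetical and not taken from any particular DDBMS.

```python
class DataProcessor:
    """DP: stores and retrieves only the data located at its own site."""

    def __init__(self, site, rows):
        self.site = site
        self.rows = rows  # local data only (hypothetical in-memory storage)

    def retrieve(self, predicate):
        return [row for row in self.rows if predicate(row)]


class TransactionProcessor:
    """TP: receives a data request and forwards it to the DPs."""

    def __init__(self, data_processors):
        self.data_processors = data_processors

    def request(self, predicate):
        # Naive routing for illustration: ask every DP and merge the answers.
        results = []
        for dp in self.data_processors:
            results.extend(dp.retrieve(predicate))
        return results


dp_a = DataProcessor("site_a", [{"cus_num": 10, "state": "TN"}])
dp_b = DataProcessor(
    "site_b",
    [{"cus_num": 11, "state": "FL"}, {"cus_num": 12, "state": "TN"}],
)
tp = TransactionProcessor([dp_a, dp_b])
tn_customers = tp.request(lambda row: row["state"] == "TN")
```

The caller talks only to the TP and never needs to know which DP holds which rows.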
Levels of Data and Process Distribution
Single-Site Processing, Single-Site Data (SPSD)
• All processing is done on single CPU or host computer (mainframe, midrange, or PC)
• All data are stored on host computer’s local disk
• Processing cannot be done on end user’s side of system
• Typical of most mainframe and midrange computer DBMSs
• DBMS is located on host computer, which is accessed by dumb terminals connected to it
• Also typical of first generation of single-user microcomputer databases
Multiple-Site Processing, Single-Site Data (MPSD)
• Multiple processes run on different computers sharing single data repository
• MPSD scenario requires network file server running conventional applications that are accessed through LAN
• Many multiuser accounting applications, running under personal computer network, fit such a description
Multiple-Site Processing, Multiple-Site Data (MPMD)
• Fully distributed database management system with support for multiple data processors and transaction processors at multiple sites
• Classified as either homogeneous or heterogeneous
• Homogeneous DDBMSs
  – Integrate only one type of centralized DBMS over a network
• Heterogeneous DDBMSs
  – Integrate different types of centralized DBMSs over a network
• Fully heterogeneous DDBMS
  – Support different DBMSs that may even support different data models (relational, hierarchical, or network) running under different computer systems, such as mainframes and microcomputers
Distributed Database Transparency Features
• Allow end user to feel like database’s only user
• Features include:
  – Distribution transparency
  – Transaction transparency
  – Failure transparency
  – Performance transparency
  – Heterogeneity transparency
Distribution Transparency
• Allows management of physically dispersed database as though it were a centralized database
• Following three levels of distribution transparency are recognized:
  – Fragmentation transparency
  – Location transparency
  – Local mapping transparency
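The three levels differ in how much the end user must know about where data live. The sketch below follows the usual description of the levels: with local mapping transparency the user names both the fragment and its site, with location transparency only the fragment, and with fragmentation transparency neither. The fragment names (E1 to E3), the site names, and the rows are hypothetical.

```python
# Hypothetical catalog: fragment name -> site, and site -> fragment -> rows.
FRAGMENT_SITE = {"E1": "new_york", "E2": "atlanta", "E3": "miami"}
SITES = {
    "new_york": {"E1": [{"emp": "Ann", "year": 1958}]},
    "atlanta": {"E2": [{"emp": "Bob", "year": 1972}]},
    "miami": {"E3": [{"emp": "Cy", "year": 1955}]},
}


def local_mapping_query(fragment, site, pred):
    """Local mapping transparency: caller names both fragment and site."""
    return [row for row in SITES[site][fragment] if pred(row)]


def location_query(fragment, pred):
    """Location transparency: caller names the fragment; the site is looked up."""
    return local_mapping_query(fragment, FRAGMENT_SITE[fragment], pred)


def fragmentation_query(pred):
    """Fragmentation transparency: caller names neither fragment nor site."""
    rows = []
    for fragment in FRAGMENT_SITE:
        rows.extend(location_query(fragment, pred))
    return rows


born_before_1960 = fragmentation_query(lambda row: row["year"] < 1960)
```

Each function wraps the one below it, which mirrors how a higher transparency level hides more of the catalog from the user.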
Transaction Transparency
• Ensures database transactions will maintain distributed database’s integrity and consistency
Distributed Requests and Distributed Transactions
• Distributed transaction
  – Can update or request data from several different remote sites on network
• Remote request
  – Lets single SQL statement access data to be processed by single remote database processor
• Remote transaction
  – Composed of several requests; accesses data at single remote site
• Distributed transaction
  – Allows transaction to reference several different (local or remote) DP sites
• Distributed request
  – Lets single SQL statement reference data located at several different local or remote DP sites
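The four request types can be told apart by two questions: how many sites does each single SQL statement touch, and how many sites does the unit of work touch overall? The function below is a simplified sketch under stated assumptions: `classify` is a hypothetical helper, each statement is modeled as the set of DP sites it references, and a single-site unit of work is assumed to be remote rather than local.

```python
def classify(statements):
    """Classify a unit of work.

    statements: list of sets; each set holds the DP sites that one
    SQL statement references (hypothetical model for illustration).
    """
    # One statement spanning several sites is a distributed request.
    if any(len(sites) > 1 for sites in statements):
        return "distributed request"
    # Single-site statements that together span several sites.
    all_sites = set().union(*statements)
    if len(all_sites) > 1:
        return "distributed transaction"
    # Everything runs at one remote site.
    return "remote request" if len(statements) == 1 else "remote transaction"


kinds = [
    classify([{"B"}]),         # one statement, one site
    classify([{"B"}, {"B"}]),  # several statements, one site
    classify([{"B"}, {"C"}]),  # several statements, one site each
    classify([{"B", "C"}]),    # one statement, several sites
]
```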
Distributed Concurrency Control
• Multisite, multiple-process operations are much more likely to create data inconsistencies and deadlocked transactions than are single-site systems
Two-Phase Commit Protocol
• Distributed databases make it possible for transaction to access data at several sites
• Final COMMIT must not be issued until all sites have committed their parts of transaction
• Two-phase commit protocol requires each individual DP’s transaction log entry be written before database fragment is actually updated
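A minimal in-memory sketch of the idea follows. It assumes the usual two-phase structure (a prepare/vote phase, then a commit or abort phase), which the slide does not spell out. The `Participant` class and `two_phase_commit` function are hypothetical names, and real protocols also handle timeouts and recovery, which are omitted here.

```python
class Participant:
    """A DP taking part in a two-phase commit (hypothetical in-memory model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []       # transaction log, written before data changes
        self.fragment = {}  # local database fragment

    def prepare(self, txn_id, updates):
        # Phase 1: write the log entry first (write-ahead), then vote.
        if not self.can_commit:
            self.log.append((txn_id, "ABORT"))
            return False
        self.log.append((txn_id, "PREPARED", dict(updates)))
        return True

    def commit(self, txn_id, updates):
        # Phase 2: log the decision, then update the fragment.
        self.log.append((txn_id, "COMMIT"))
        self.fragment.update(updates)

    def abort(self, txn_id):
        self.log.append((txn_id, "ABORT"))


def two_phase_commit(txn_id, work):
    """work: dict mapping each Participant to the updates for its fragment."""
    votes = [(p, p.prepare(txn_id, updates)) for p, updates in work.items()]
    if all(vote for _, vote in votes):
        for p, updates in work.items():
            p.commit(txn_id, updates)
        return "COMMITTED"
    for p, vote in votes:
        if vote:  # only participants that prepared need to be told to abort
            p.abort(txn_id)
    return "ABORTED"


a, b = Participant("A"), Participant("B")
outcome_ok = two_phase_commit("T1", {a: {"x": 1}, b: {"y": 2}})

c, d = Participant("C"), Participant("D", can_commit=False)
outcome_fail = two_phase_commit("T2", {c: {"x": 9}, d: {"y": 9}})
```

In the failing run, no fragment is changed because the commit phase never starts; site C only records the abort in its log.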
Performance Transparency and Query Optimization
• Objective of query optimization routine is to minimize total cost associated with execution of request
• Costs associated with request are function of:
  – Access time (I/O) cost
  – Communication cost
  – CPU time cost
• Must provide distribution transparency as well as replica transparency
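As a rough illustration of cost-based selection, the sketch below treats total cost as a plain sum of the three components and picks the cheapest candidate. The `Plan` class, the plan names, and all cost figures are made-up assumptions; a real optimizer would estimate these values from statistics and may weight them differently.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """One candidate way to execute a request (hypothetical cost units)."""

    name: str
    io_cost: float    # access time (I/O) cost
    comm_cost: float  # communication cost
    cpu_cost: float   # CPU time cost

    @property
    def total(self):
        # Assumption: total cost is the unweighted sum of the three parts.
        return self.io_cost + self.comm_cost + self.cpu_cost


plans = [
    Plan("ship table to local site", io_cost=4.0, comm_cost=20.0, cpu_cost=1.0),
    Plan("run query at remote site", io_cost=5.0, comm_cost=2.0, cpu_cost=1.5),
]
best = min(plans, key=lambda plan: plan.total)
```

With these figures the remote plan wins because shipping the whole table makes communication cost dominate.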
• Replica transparency
  – DDBMS’s ability to hide existence of multiple copies of data from user
• Query optimization techniques include:
  – Manual or automatic
  – Static or dynamic
  – Statistically based or rule-based algorithms
Distributed Database Design
• Data fragmentation – How to partition database into fragments
• Data replication – Which fragments to replicate
• Data allocation – Where to locate those fragments and replicas
Data Fragmentation
• Breaks single object into two or more segments or fragments
• Each fragment can be stored at any site over computer network
• Information about data fragmentation is stored in distributed data catalog (DDC), from which it is accessed by TP to process user requests
• Strategies
  – Horizontal fragmentation
    • Division of a relation into subsets (fragments) of tuples (rows)
  – Vertical fragmentation
    • Division of a relation into attribute (column) subsets
  – Mixed fragmentation
    • Combination of horizontal and vertical strategies
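The three strategies can be sketched on a small in-memory relation. The CUSTOMER rows, column names, and helper functions below are hypothetical. The vertical helper repeats the key column in every fragment, on the assumption that fragments must be joinable back into the original relation.

```python
# Hypothetical relation used only for illustration.
CUSTOMER = [
    {"cus_num": 10, "name": "Sinex", "state": "TN", "balance": 1090.0},
    {"cus_num": 11, "name": "Martin", "state": "FL", "balance": 0.0},
    {"cus_num": 12, "name": "Mehta", "state": "TN", "balance": 340.5},
]


def horizontal(rows, predicate):
    """Horizontal fragment: a subset of the rows, all columns kept."""
    return [row for row in rows if predicate(row)]


def vertical(rows, key, columns):
    """Vertical fragment: a subset of the columns, plus the key column."""
    return [{col: row[col] for col in (key, *columns)} for row in rows]


# Horizontal: one fragment per state.
tn_rows = horizontal(CUSTOMER, lambda row: row["state"] == "TN")

# Vertical: one fragment per group of columns.
service = vertical(CUSTOMER, "cus_num", ["name", "state"])
collections = vertical(CUSTOMER, "cus_num", ["balance"])

# Mixed: a vertical fragment of a horizontal fragment.
tn_collections = vertical(tn_rows, "cus_num", ["balance"])
```

Because each vertical fragment carries `cus_num`, merging `service` and `collections` row by row rebuilds the original relation.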
Data Replication
• Storage of data copies at multiple sites served by computer network
• Fragment copies can be stored at several sites to serve specific information requirements
  – Can enhance data availability and response time
  – Can help to reduce communication and total query costs
• Replication scenarios
  – Fully replicated database
    • Stores multiple copies of each database fragment at multiple sites
    • Can be impractical due to amount of overhead
  – Partially replicated database
    • Stores multiple copies of some database fragments at multiple sites
    • Most DDBMSs are able to handle the partially replicated database well
• Replication scenarios (continued)
  – Unreplicated database
    • Stores each database fragment at single site
    • No duplicate database fragments
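Given a map from each fragment to the sites that hold a copy, the three replication scenarios can be told apart by counting copies. The function name and the allocation maps below are hypothetical, chosen only to illustrate the definitions.

```python
def replication_scenario(allocation):
    """allocation: dict mapping fragment name -> list of sites holding a copy."""
    copies = [len(set(sites)) for sites in allocation.values()]
    if all(n == 1 for n in copies):
        return "unreplicated"          # each fragment at a single site
    if all(n > 1 for n in copies):
        return "fully replicated"      # every fragment has multiple copies
    return "partially replicated"      # only some fragments have copies


scenarios = [
    replication_scenario({"F1": ["A"], "F2": ["B"]}),
    replication_scenario({"F1": ["A", "B"], "F2": ["B"]}),
    replication_scenario({"F1": ["A", "B"], "F2": ["A", "B"]}),
]
```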
Data Allocation
• Deciding where to locate data
• Allocation strategies
  – Centralized data allocation
    • Entire database is stored at one site
  – Partitioned data allocation
    • Database is divided into several disjoint parts (fragments) and stored at several sites
• Allocation strategies (continued)
  – Replicated data allocation
    • Copies of one or more database fragments are stored at several sites
• Data distribution over computer network is achieved through data partition, data replication, or combination of both
Client/Server vs. DDBMS
• Client/server architecture refers to way in which computers interact to form system
• Features user of resources, or client, and provider of resources, or server
• Can be used to implement a DBMS in which client is the TP and server is the DP
• Client/server advantages
  – Less expensive than alternative minicomputer or mainframe solutions
  – Allow end user to use microcomputer’s GUI, thereby improving functionality and simplicity
  – More people in job market have PC skills than mainframe skills
  – PC is well established in workplace
• Client/server advantages (continued)
  – Numerous data analysis and query tools exist to facilitate interaction with DBMSs available in PC market
  – Considerable cost advantage to offloading applications development from mainframe to powerful PCs
• Client/server disadvantages
  – Creates more complex environment
    • Different platforms (LANs, operating systems, and so on) are often difficult to manage
  – An increase in number of users and processing sites often paves the way for security problems
• Client/server disadvantages (continued)
  – Possible to spread data access to much wider circle of users
    • Increases demand for people with broad knowledge of computers and software
    • Increases burden of training and cost of maintaining the environment
C. J. Date’s Twelve Commandments for Distributed Databases
• Local site independence
• Central site independence
• Failure independence
• Location transparency
• Fragmentation transparency
• Replication transparency
• Distributed query processing
• Distributed transaction processing
• Hardware independence
• Operating system independence
• Network independence
• Database independence
Summary
• Distributed database stores logically related data in two or more physically independent sites connected via computer network
• Distributed processing is division of logical database processing among two or more network nodes
• Distributed databases require distributed processing
• Main components of DDBMS are transaction processor and data processor
• Current database systems can be classified by extent to which they support processing and data distribution
• Homogeneous distributed database system integrates only one particular type of DBMS over computer network
• Heterogeneous distributed database system integrates several different types of DBMSs over computer network
• DDBMS characteristics are best described as set of transparencies
• Transaction is formed by one or more database requests
• Distributed concurrency control is required in network of distributed databases
• Distributed DBMS evaluates every data request to find optimum access path in distributed database
• The design of distributed database must consider fragmentation and replication of data
• Database can be replicated over several different sites on computer network
• Client/server architecture refers to way in which two computers interact over computer network to form a system