Distributed DBMS [Good]

  • 8/7/2019 Distributed DBMS [Good]

    1/58

    Distributed Database Management Systems

    Chapter 10

    2/58

    In this chapter, you will learn:

        What a distributed database management system (DDBMS) is and what its components are

        How database implementation is affected by different levels of data and process distribution

        How transactions are managed in a distributed database environment

        How database design is affected by the distributed database environment

    3/58

    The Evolution of Distributed Database Management Systems

    Distributed database management system (DDBMS)

        Governs storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites

    4/58

    The Evolution of Distributed Database Management Systems (continued)

    Centralized database required that corporate data be stored in a single central site

    Dynamic business environment and the centralized database's shortcomings spawned a demand for applications based on data access from different sources at multiple locations

    5/58

    Centralized Database

    Management System

    6/58

    DDBMS Advantages

    Data are located near greatest demand site

    Faster data access

    Faster data processing

    Growth facilitation

    Improved communications

    Reduced operating costs

    User-friendly interface

    Less danger of a single-point failure

    Processor independence

    7/58

    DDBMS Disadvantages

    Complexity of management and control

    Security

    Lack of standards

    Increased storage requirements

    Greater difficulty in managing the data environment

    Increased training cost

    8/58

    Distributed Processing

    Environment

    9/58

    Distributed Database

    Environment

    10/58

    Characteristics of Distributed Management Systems

    Application interface

    Validation

    Transformation

    Query optimization

    Mapping

    I/O interface

    Formatting

    Security

    Backup and recovery

    DB administration

    Concurrency control

    Transaction management

    11/58

    Characteristics of Distributed Management Systems (continued)

    Must perform all the functions of a centralized DBMS

    Must handle all necessary functions imposed by the distribution of data and processing

    Must perform these additional functions transparently to the end user

    12/58

    A Fully Distributed Database Management System

    13/58

    DDBMS Components

    Must include (at least) the following components:

        Computer workstations

        Network hardware and software

        Communications media

        Transaction processor (or application processor, or transaction manager)

            Software component found in each computer that requests data

        Data processor or data manager

            Software component residing on each computer that stores and retrieves data located at the site

            May be a centralized DBMS

    14/58

    Distributed Database

    System Components

    15/58

    Database Systems: Levels of Data and Process Distribution

    16/58

    Single-Site Processing, Single-Site Data (SPSD)

    All processing is done on single CPU or host computer (mainframe, midrange, or PC)

    All data are stored on host computer's local disk

    Processing cannot be done on end user's side of the system

    Typical of most mainframe and midrange computer DBMSs

    DBMS is located on the host computer, which is accessed by dumb terminals connected to it

    Also typical of the first generation of single-user microcomputer databases

    17/58

    Single-Site Processing, Single-Site Data (Centralized)

    18/58

    Multiple-Site Processing, Single-Site Data (MPSD)

    Multiple processes run on different computers sharing a single data repository

    MPSD scenario requires a network file server running conventional applications that are accessed through a LAN

    Many multi-user accounting applications, running under a personal computer network, fit such a description

    19/58

    Multiple-Site Processing, Single-Site Data

    20/58

    Multiple-Site Processing, Multiple-Site Data (MPMD)

    Fully distributed database management system with support for multiple data processors and transaction processors at multiple sites

    Classified as either homogeneous or heterogeneous

    Homogeneous DDBMSs

        Integrate only one type of centralized DBMS over a network

    21/58

    Multiple-Site Processing, Multiple-Site Data (MPMD) (continued)

    Heterogeneous DDBMSs

        Integrate different types of centralized DBMSs over a network

    Fully heterogeneous DDBMS

        Support different DBMSs that may even support different data models (relational, hierarchical, or network) running under different computer systems, such as mainframes and microcomputers

    22/58

    Heterogeneous Distributed Database Scenario

    23/58

    Distributed Database Transparency Features

    Allow end user to feel like the database's only user

    Features include:

        Distribution transparency

        Transaction transparency

        Failure transparency

        Performance transparency

        Heterogeneity transparency

    24/58

    Distribution Transparency

    Allows management of a physically dispersed database as though it were a centralized database

    Three levels of distribution transparency are recognized:

        Fragmentation transparency

        Location transparency

        Local mapping transparency
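
    The three levels differ in how much the requester has to know about where data lives. The sketch below is illustrative only: the EMPLOYEE table, the fragment names E1 and E2, the site names, and the catalog dictionaries are all hypothetical, and plain Python dictionaries stand in for remote data processors.

```python
# Hypothetical data: an EMPLOYEE table split into two fragments at two sites.
SITES = {
    "NY": {"E1": [{"name": "Ann", "born": 1955}, {"name": "Bob", "born": 1975}]},
    "ATL": {"E2": [{"name": "Cid", "born": 1958}]},
}

# Assumed catalog layout: table -> fragments, and fragment -> site.
FRAGMENTS_OF = {"EMPLOYEE": ["E1", "E2"]}
SITE_OF = {"E1": "NY", "E2": "ATL"}


def local_mapping_query(fragment_sites, predicate):
    """Local mapping transparency: caller names every fragment AND its site."""
    rows = []
    for fragment, site in fragment_sites:
        rows.extend(r for r in SITES[site][fragment] if predicate(r))
    return rows


def location_query(fragments, predicate):
    """Location transparency: caller names fragments; the catalog finds sites."""
    return local_mapping_query([(f, SITE_OF[f]) for f in fragments], predicate)


def fragmentation_query(table, predicate):
    """Fragmentation transparency: caller names only the table."""
    return location_query(FRAGMENTS_OF[table], predicate)


def born_before_1960(row):
    return row["born"] < 1960


by_site = local_mapping_query([("E1", "NY"), ("E2", "ATL")], born_before_1960)
by_fragment = location_query(["E1", "E2"], born_before_1960)
by_table = fragmentation_query("EMPLOYEE", born_before_1960)
print([r["name"] for r in by_table])
```

    All three calls return the same rows (Ann and Cid); only the amount of placement detail the caller must supply changes.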

    25/58

    A Summary of

    Transparency Features

    26/58

    Fragment Locations

    27/58

    Transaction Transparency

    Ensures database transactions will maintain distributed database's integrity and consistency

    28/58

    Distributed Requests and Distributed Transactions

    Distributed transaction

        Can update or request data from several different remote sites on a network

    Remote request

        Lets a single SQL statement access data to be processed by a single remote database processor

    Remote transaction

        Accesses data at a single remote site

    29/58

    Distributed Requests and Distributed Transactions (continued)

    Distributed transaction

        Allows a transaction to reference several different (local or remote) DP sites

    Distributed request

        Lets a single SQL statement reference data located at several different local or remote DP sites
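
    One way to keep the four terms apart is to look at which sites each SQL statement touches. The function below is a rough sketch under that reading of the definitions; the function name, the site labels, and the idea of representing a statement as a set of site names are assumptions made for illustration.

```python
def classify(statements, local_site):
    """Classify a transaction.

    statements: list of sets, one per SQL statement, each holding the
    names of the sites that statement references (hypothetical encoding).
    """
    all_sites = set().union(*statements)
    # A single statement spanning several sites is a distributed request.
    if any(len(sites) > 1 for sites in statements):
        return "distributed request"
    # Several sites overall, but each statement stays on one site.
    if len(all_sites) > 1:
        return "distributed transaction"
    # Everything happens on the requester's own site.
    if all_sites <= {local_site}:
        return "local"
    # Exactly one remote site: one statement or several.
    return "remote request" if len(statements) == 1 else "remote transaction"


print(classify([{"B"}], local_site="A"))
print(classify([{"B"}, {"B"}], local_site="A"))
print(classify([{"B"}, {"C"}], local_site="A"))
print(classify([{"B", "C"}], local_site="A"))
```

    The four calls cover, in order, a remote request, a remote transaction, a distributed transaction, and a distributed request.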

    30/58

    A Remote Request

    31/58

    A Remote Transaction

    32/58

    A Distributed

    Transaction

    33/58

    A Distributed Request

    34/58

    Another Distributed

    Request

    35/58

    Distributed Concurrency Control

    Multisite, multiple-process operations are much more likely to create data inconsistencies and deadlocked transactions than are single-site systems

    36/58

    The Effect of a

    Premature COMMIT

    37/58

    Two-Phase Commit Protocol

    Distributed databases make it possible for a transaction to access data at several sites

    Final COMMIT must not be issued until all sites have committed their parts of the transaction

    Two-phase commit protocol requires each individual DP's transaction log entry be written before the database fragment is actually updated
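
    A heavily simplified, in-memory sketch of the idea follows. It assumes a single coordinator, no timeouts, and no crash recovery; the class and function names are hypothetical and are not taken from any particular DBMS. What it does show is the ordering the slide describes: each DP writes its log entry before its fragment changes, and no fragment changes unless every DP voted to commit.

```python
class DataProcessor:
    """Hypothetical DP holding one fragment (a dict) and a transaction log."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # False simulates a site that must refuse
        self.fragment = {}
        self.log = []

    def prepare(self, txn_id, updates):
        # Write-ahead: log the intended change before touching the fragment.
        if not self.can_commit:
            self.log.append((txn_id, "VOTE-ABORT"))
            return False
        self.log.append((txn_id, "PREPARED", dict(updates)))
        return True

    def commit(self, txn_id, updates):
        self.fragment.update(updates)
        self.log.append((txn_id, "COMMITTED"))

    def abort(self, txn_id):
        self.log.append((txn_id, "ABORTED"))


def two_phase_commit(txn_id, work):
    """work maps each DataProcessor to the updates it should apply."""
    # Phase 1 (voting): every DP logs its part and votes.
    votes = [dp.prepare(txn_id, updates) for dp, updates in work.items()]
    # Phase 2 (decision): update fragments only if every vote was yes.
    if all(votes):
        for dp, updates in work.items():
            dp.commit(txn_id, updates)
        return "COMMIT"
    for dp in work:
        dp.abort(txn_id)
    return "ABORT"


ny, atl = DataProcessor("NY"), DataProcessor("ATL")
first = two_phase_commit("T1", {ny: {"qty": 5}, atl: {"qty": 7}})

mia = DataProcessor("MIA", can_commit=False)
second = two_phase_commit("T2", {ny: {"qty": 0}, mia: {"qty": 0}})
print(first, second, ny.fragment)
```

    In the second transaction one site refuses, so the NY fragment keeps the value written by T1 even though NY itself was ready to commit.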

    38/58

    Performance Transparency and Query Optimization

    Objective of query optimization routine is to minimize total cost associated with the execution of a request

    Costs associated with a request are a function of the:

        Access time (I/O) cost

        Communication cost

        CPU time cost
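
    As a toy illustration of cost-based selection, the sketch below assumes the total cost is simply the sum of the three components. That additive model, the plan descriptions, and the numbers are assumptions for the example; a real optimizer would estimate each term from statistics.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    io_cost: float    # access time (I/O) cost
    comm_cost: float  # communication cost
    cpu_cost: float   # CPU time cost

    @property
    def total(self):
        # Assumed cost model: a plain sum of the three components.
        return self.io_cost + self.comm_cost + self.cpu_cost


def cheapest(plans):
    """Return the plan with the lowest total cost."""
    return min(plans, key=lambda p: p.total)


plans = [
    Plan("ship whole table to requester", io_cost=4.0, comm_cost=20.0, cpu_cost=1.0),
    Plan("filter at remote site first", io_cost=4.0, comm_cost=2.0, cpu_cost=3.0),
]
best = cheapest(plans)
print(best.name, best.total)
```

    With these made-up figures the second plan wins because it spends a little more CPU time to avoid most of the communication cost.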

    39/58

    Performance Transparency and Query Optimization (continued)

    Must provide distribution transparency as well as replica transparency

    Replica transparency:

        DDBMS's ability to hide the existence of multiple copies of data from the user

    Query optimization techniques:

        Manual or automatic

        Static or dynamic

        Statistically based or rule-based algorithms

    40/58

    Distributed Database Design

    Data fragmentation:

        How to partition the database into fragments

    Data replication:

        Which fragments to replicate

    Data allocation:

        Where to locate those fragments and replicas

    41/58

    Data Fragmentation

    Breaks single object into two or more segments or fragments

    Each fragment can be stored at any site over a computer network

    Information about data fragmentation is stored in the distributed data catalog (DDC), from which it is accessed by the TP to process user requests

    42/58

    Data Fragmentation Strategies

    Horizontal fragmentation:

        Division of a relation into subsets (fragments) of tuples (rows)

    Vertical fragmentation:

        Division of a relation into attribute (column) subsets

    Mixed fragmentation:

        Combination of horizontal and vertical strategies
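
    The three strategies can be sketched on a small in-memory table. The rows and column names below are invented for illustration and are not the contents of the deck's CUSTOMER figures. Each vertical fragment repeats the key column, an assumption made here so that the fragments could be joined back together.

```python
from collections import defaultdict

# Hypothetical rows; not the data shown in the slides' figures.
CUSTOMER = [
    {"cus_num": 1, "cus_name": "A Co", "cus_state": "TN", "cus_bal": 100},
    {"cus_num": 2, "cus_name": "B Co", "cus_state": "FL", "cus_bal": 0},
    {"cus_num": 3, "cus_name": "C Co", "cus_state": "TN", "cus_bal": 250},
]


def horizontal(rows, column):
    """Split rows into fragments, one per distinct value of `column`."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[column]].append(row)
    return dict(fragments)


def vertical(rows, key, column_groups):
    """Split columns into fragments; every fragment keeps the key column."""
    return [
        [{c: row[c] for c in (key, *cols)} for row in rows]
        for cols in column_groups
    ]


def mixed(rows, column, key, column_groups):
    """Horizontal split first, then a vertical split of each subset."""
    return {
        value: vertical(subset, key, column_groups)
        for value, subset in horizontal(rows, column).items()
    }


by_state = horizontal(CUSTOMER, "cus_state")
service, collection = vertical(
    CUSTOMER, "cus_num", [("cus_name", "cus_state"), ("cus_bal",)]
)
both = mixed(CUSTOMER, "cus_state", "cus_num", [("cus_name",), ("cus_bal",)])
print(sorted(by_state), collection[0], both["TN"][1])
```

    Here `by_state` holds one fragment per state, `service` and `collection` hold different column subsets of every row, and `both["TN"][1]` is the balance columns for the Tennessee rows only.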

    43/58

    A Sample CUSTOMER

    Table

    44/58

    Horizontal Fragmentation of the CUSTOMER Table by State

    45/58

    Table Fragments in

    Three Locations

    46/58

    Vertically Fragmented

    Table Contents

    47/58

    Mixed Fragmentation of the CUSTOMER Table

    48/58

    Data Replication

    Storage of data copies at multiple sites served by a computer network

    Fragment copies can be stored at several sites to serve specific information requirements

    Can enhance data availability and response time

    Can help to reduce communication and total query costs

    49/58

    Table Contents After the Mixed Fragmentation Process

    50/58

    Data Replication

    51/58

    Replication Scenarios

    Fully replicated database:

        Stores multiple copies of each database fragment at multiple sites

        Can be impractical due to amount of overhead

    Partially replicated database:

        Stores multiple copies of some database fragments at multiple sites

        Most DDBMSs are able to handle the partially replicated database well

    Unreplicated database:

        Stores each database fragment at a single site

        No duplicate database fragments
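
    The three scenarios can be told apart by counting how many sites hold each fragment. The sketch below assumes a placement map from fragment name to the set of sites holding a copy; that representation, the fragment names, and the site names are hypothetical.

```python
def replication_scenario(placement):
    """placement maps a fragment name to the set of sites holding a copy."""
    copies = [len(sites) for sites in placement.values()]
    if all(n == 1 for n in copies):
        return "unreplicated"
    if all(n > 1 for n in copies):
        return "fully replicated"
    # Some fragments have several copies, others only one.
    return "partially replicated"


print(replication_scenario({"A1": {"NY"}, "A2": {"ATL"}}))
print(replication_scenario({"A1": {"NY", "ATL"}, "A2": {"ATL"}}))
print(replication_scenario({"A1": {"NY", "ATL"}, "A2": {"NY", "ATL"}}))
```

    The three calls describe an unreplicated, a partially replicated, and a fully replicated placement, in that order.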

    52/58

    Data Allocation

    Deciding where to locate data

    Allocation strategies:

        Centralized data allocation

            Entire database is stored at one site

        Partitioned data allocation

            Database is divided into several disjointed parts (fragments) and stored at several sites

        Replicated data allocation

            Copies of one or more database fragments are stored at several sites

    Data distribution over a computer network is achieved through data partition, data replication, or a combination of both

    53/58

    Client/Server vs. DDBMS

    Client/server architecture is the way in which computers interact to form a system

    Features a user of resources, or a client, and a provider of resources, or a server

    Can be used to implement a DBMS in which the client is the TP and the server is the DP

    54/58

    Client/Server Advantages

    Less expensive than alternate minicomputer or mainframe solutions

    Allow end user to use microcomputer's GUI, thereby improving functionality and simplicity

    More people with PC skills than with mainframe skills in the job market

    PC is well established in the workplace

    Numerous data analysis and query tools exist to facilitate interaction with DBMSs available in the PC market

    Considerable cost advantage to offloading applications development from the mainframe to powerful PCs

    55/58

    Client/Server Disadvantages

    Creates a more complex environment, in which different platforms (LANs, operating systems, and so on) are often difficult to manage

    An increase in the number of users and processing sites often paves the way for security problems

    Possible to spread data access to a much wider circle of users

        Increases demand for people with broad knowledge of computers and software

        Increases burden of training and cost of maintaining the environment

    56/58

    C. J. Date's Twelve Commandments for Distributed Databases

    1. Local site independence

    2. Central site independence

    3. Failure independence

    4. Location transparency

    5. Fragmentation transparency

    6. Replication transparency

    7. Distributed query processing

    8. Distributed transaction processing

    9. Hardware independence

    10. Operating system independence

    11. Network independence

    12. Database independence

    57/58

    Summary

    Distributed database stores logically related data in two or more physically independent sites connected via a computer network

    Database is divided into fragments

    Distributed databases require distributed processing

    Main components of a DDBMS are the transaction processor and the data processor

    58/58

    Summary (continued)

    Current database systems can be classified by extent to which they support processing and data distribution

    DDBMS characteristics are best described as a set of transparencies

    A transaction is formed by one or more database requests

    A database can be replicated over several different sites on a computer network

    Client/server architecture refers to the way in which two computers interact over a computer network to form a system