chap06r[1] data base

download chap06r[1] data base

of 36

Transcript of chap06r[1] data base

  • 8/12/2019 chap06r[1] data base

    1/36

    1 Prentice Hall, 2002

    Chapter 6:Physical Database Design

    and Performance

    M odern Database M anagement

    6 th Edition

    Jeff rey A. H off er , M ary B. Prescott, Fred R.M cFadden

  • 8/12/2019 chap06r[1] data base

    2/36

  • 8/12/2019 chap06r[1] data base

    3/36

    Chapter 6 3 Prentice Hall, 2002

    Physical Database Design

    Purpose - translate the logical descriptionof data into the technical specifications forstoring and retrieving dataGoal - create a design for storing data thatwill provide adequate performance and

    insure database integrity , security andrecoverability

  • 8/12/2019 chap06r[1] data base

    4/36

    Chapter 6 4 Prentice Hall, 2002

    Physical Design Process

    Normalized relations

    Volume estimates

    Attribute definitionsResponse time expectations

    Data security needs

    Backup/recovery needs

    Integrity expectations

    DBMS technology used

    Inputs

    Attribute data types

    Physical record descriptions(doesnt always match logicaldesign)

    File organizations

    Indexes and databasearchitectures

    Query optimization

    Leads to

    Decisions

  • 8/12/2019 chap06r[1] data base

    5/36

    Chapter 6 5 Prentice Hall, 2002

    Figure 6.1 - Composite usage map(Pine Valley Furniture Company)

  • 8/12/2019 chap06r[1] data base

    6/36

    Chapter 6 6 Prentice Hall, 2002

    Figure 6.1 - Composite usage map(Pine Valley Furniture Company)

    Data volumes

  • 8/12/2019 chap06r[1] data base

    7/36

    Chapter 6 7 Prentice Hall, 2002

    Figure 6.1 - Composite usage map(Pine Valley Furniture Company)

    Access Frequencies(per hour)

  • 8/12/2019 chap06r[1] data base

    8/36

    Chapter 6 8 Prentice Hall, 2002

    Figure 6.1 - Composite usage map(Pine Valley Furniture Company)

    Usage analysis:200 purchased parts accessed per hour 80 quotations accessed fromthese 200 purchased partaccesses

    70 suppliers accessed fromthese 80 quotation accesses

  • 8/12/2019 chap06r[1] data base

    9/36

    Chapter 6 9 Prentice Hall, 2002

    Figure 6.1 - Composite usage map(Pine Valley Furniture Company)

    Usage analysis:75 suppliers accessed perhour 40 quotations accessed fromthese 75 supplier accesses 40 purchased parts accessedfrom these 40 quotationaccesses

  • 8/12/2019 chap06r[1] data base

    10/36

    Chapter 6 10 Prentice Hall, 2002

    Designing Fields

    Field: smallest unit of data in

    databaseField design Choosing data type Coding, compression, encryption Controlling data integrity

  • 8/12/2019 chap06r[1] data base

    11/36

    Chapter 6 11 Prentice Hall, 2002

    Choosing Data Types

    CHAR fixed-length characterVARCHAR2 variable-length character

    (memo)LONG large number

    NUMBER positive/negative number

    DATE actual dateBLOB binary large object (good forgraphics, sound clips, etc.)

  • 8/12/2019 chap06r[1] data base

    12/36

    Chapter 6 12 Prentice Hall, 2002

    Figure 6.2Example code-look-up table (Pine Valley Furniture Company)

    Code saves space, but costsan additional lookup toobtain actual value.

  • 8/12/2019 chap06r[1] data base

    13/36

    Chapter 6 13 Prentice Hall, 2002

    Field Data Integrity

    Default value - assumed value if no explicitvalue

    Range control allowable value limitations(constraints or validation rules) Null value control allowing or prohibitingempty fields

    Referential integrity range control (and nullvalue allowances) for foreign-key to primary-key match-ups

  • 8/12/2019 chap06r[1] data base

    14/36

    Chapter 6 14 Prentice Hall, 2002

    Handling Missing Data

    Substitute an estimate of the missing value(e.g. using a formula)

    Construct a report listing missing valuesIn programs, ignore missing data unless thevalue is significant

    Triggers can be used to perform these operations

  • 8/12/2019 chap06r[1] data base

    15/36

    Chapter 6 15 Prentice Hall, 2002

    Physical RecordsPhysical Record: A group of fields stored inadjacent memory locations and retrievedtogether as a unitPage: The amount of data read or written inone I/O operationBlocking Factor: The number of physicalrecords per page

  • 8/12/2019 chap06r[1] data base

    16/36

  • 8/12/2019 chap06r[1] data base

    17/36

  • 8/12/2019 chap06r[1] data base

    18/36

    Chapter 6 18 Prentice Hall, 2002

    PartitioningHorizontal Partitioning: Distributing the rows of a tableinto several separate files Useful for situations where different users need access to different

    rows Three types: Key Range Partitioning, Hash Partitioning, or

    Composite Partitioning

    Vertical Partitioning: Distributing the columns of a tableinto several separate files Useful for situations where different users need access to different

    columns

    The primary key must be repeated in each fileCombinations of Horizontal and Vertical

    Partitions often correspond with User Schemas (user views)

  • 8/12/2019 chap06r[1] data base

    19/36

    Chapter 6 19 Prentice Hall, 2002

    Partitioning

    Advantages of Partitioning: Records used together are grouped together

    Each partition can be optimized for performance Security, recovery Partitions stored on different disks: contention Take advantage of parallel processing capability

    Disadvantages of Partitioning: Slow retrievals across partitions Complexity

  • 8/12/2019 chap06r[1] data base

    20/36

    Chapter 6 20 Prentice Hall, 2002

    Data Replication

    Purposely storing the same data in multiplelocations of the database

    Improves performance by allowing multipleusers to access the same data at the sametime with minimum contentionSacrifices data integrity due to dataduplicationBest for data that is not updated often

  • 8/12/2019 chap06r[1] data base

    21/36

    Chapter 6 21 Prentice Hall, 2002

    Designing Physical Files

    Physical File: A named portion of secondary memory allocated for

    the purpose of storing physical records

    Constructs to link two pieces of data: Sequential storage. Pointers.

    File Organization: How the files are arranged on the disk.

    Access Method: How the data can be retrieved based on the file

    organization.

  • 8/12/2019 chap06r[1] data base

    22/36

    Chapter 6 22 Prentice Hall, 2002

    Figure 6-7 (a)Sequential fileorganization

    If not sorted

    Average time to finddesired record = n/2.

    1

    2

    n

    Records of thefile are stored insequence by the

    primary keyfield values.

    If sorted every insert ordelete requiresresort

  • 8/12/2019 chap06r[1] data base

    23/36

    Chapter 6 23 Prentice Hall, 2002

    Indexed File OrganizationsIndex a separate table that contains organizationof records for quick retrievalPrimary keys are automatically indexedOracle has a CREATE INDEX operation, and MSACCESS allows indexes to be created for mostfield typesIndexing approaches: B-tree index, Fig. 6-7b Bitmap index, Fig. 6-8 Hash Index, Fig. 6-7c Join Index, Fig 6-9

  • 8/12/2019 chap06r[1] data base

    24/36

    Chapter 6 24 Prentice Hall, 2002

    Fig. 6-7b B-tree index

    uses a t r e e s e a r c hAverage time to find desiredrecord = depth of the tree

    Leaves of the treeare all at samelevel

    consistent accesstime

  • 8/12/2019 chap06r[1] data base

    25/36

    Chapter 6 25 Prentice Hall, 2002

    Fig 6-7cHashed file orindex

    organization

    Hash algorithm Usually uses division-remainder to determinerecord position. Recordswith same position aregrouped in lists.

  • 8/12/2019 chap06r[1] data base

    26/36

    Chapter 6 26 Prentice Hall, 2002

    Fig 6-8Bitmap indexindex

    organization

    Bitmap saves on space requirements Rows - possible values of the attributeColumns - table rowsBit indicates whether the attribute of a row has the values

  • 8/12/2019 chap06r[1] data base

    27/36

    Chapter 6 27 Prentice Hall, 2002

    Fig 6-9 Join Index speeds up join operations

  • 8/12/2019 chap06r[1] data base

    28/36

    Chapter 6 28 Prentice Hall, 2002

    Clustering Files

    In some relational DBMSs, related records fromdifferent tables can be stored together in the samedisk areaUseful for improving performance of joinoperationsPrimary key records of the main table are stored

    adjacent to associated foreign key records of thedependent tablee.g. Oracle has a CREATE CLUSTER command

  • 8/12/2019 chap06r[1] data base

    29/36

    Chapter 6 29 Prentice Hall, 2002

    Rules for Using Indexes

    1. Use on larger tables2. Index the primary key of each table

    3. Index search fields (fields frequently inWHERE clause)

    4. Fields in SQL ORDER BY and GROUP

    BY commands5. When there are >100 values but not when

    there are

  • 8/12/2019 chap06r[1] data base

    30/36

    Chapter 6 30 Prentice Hall, 2002

    Rules for Using Indexes

    6. DBMS may have limit on number ofindexes per table and number of bytes perindexed field(s)

    7. Null values will not be referenced from anindex8. Use indexes heavily for non-volatile

    databases; limit the use of indexes forvolatile databases

    Why? Because modifications (e.g. inserts,deletes) require updates to occur in index files

  • 8/12/2019 chap06r[1] data base

    31/36

    Chapter 6 31 Prentice Hall, 2002

    RAID

    Redundant Array of Inexpensive DisksA set of disk drives that appear to the userto be a single disk driveAllows parallel access to data (improvesaccess speed)Pages are arranged in stripes

  • 8/12/2019 chap06r[1] data base

    32/36

    Chapter 6 32 Prentice Hall, 2002

    Figure 6-10 RAID with four

    disks and

    striping

    Here, pages 1-4can beread/writtensimultaneously

  • 8/12/2019 chap06r[1] data base

    33/36

    Chapter 6 33 Prentice Hall, 2002

    Raid Types (Figure 6-11)

    Raid 0 Maximized parallelism No redundancy No error correction no fault-tolerance

    Raid 1 Redundant data fault tolerant Most common form

    Raid 2 No redundancy

    One record spans across datadisks Error correction in multiple

    disks reconstruct damaged data

    Raid 3 Error correction in one disk Record spans multiple data disks (more

    than RAID2) Not good for multi-user environments,

    Raid 4 Error correction in one disk Multiple records per stripe Parallelism, but slow updates due to

    error correction contention

    Raid 5 Rotating parity array Error correction takes place in same disks as

    data storage Parallelism, better performance than Raid4

    s

  • 8/12/2019 chap06r[1] data base

    34/36

    Chapter 6 34 Prentice Hall, 2002

    D a

    t a b a s e

    A r c

    h i t e c

    t u r e s

    ( f i g u r e

    6 - 1

    2

    LegacySystems

    Current

    Technology

    DataWarehouses

  • 8/12/2019 chap06r[1] data base

    35/36

    Chapter 6 35 Prentice Hall, 2002

    Query Optimization

    Parallel Query ProcessingOverride Automatic Query OptimizationData Block Size -- Performance tradeoffs: Block contention Random vs. sequential row access speed

    Row size Overhead

    Balancing I/O Across Disk Controllers

  • 8/12/2019 chap06r[1] data base

    36/36

    Ch 6 36

    Query OptimizationWise use of indexesCompatible data typesSimple queriesAvoid query nestingTemporary tables for query groups

    Select only needed columns No sort without index