1 Chapter 4 Logical & Physical Database Design. 2 Logical Data Modeling Application data models have...

Post on 19-Jan-2016


1

Chapter 4: Logical & Physical Database Design

2

Logical Data Modeling

Application data models have two primary phases
– Establishing logical data model
– Translating logical data model into physical data model

Normalization & Third Normal Form
– All data in an entity is dependent on the primary key
– There should be no repeating groups of attributes
– No data in an entity is dependent on part of the key
– No data in an entity is dependent on any nonkey attribute

“The key, the whole key, and nothing but the key, so help me Codd”
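The normalization rules above can be sketched with a small example. This is an illustration only: it uses Python's built-in sqlite3 module rather than Oracle, and every table and column name (order_lines_flat, customers, and so on) is a hypothetical one chosen for the sketch. The flat table breaks two of the rules; the four tables that follow are one possible third normal form decomposition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 3NF: product_name depends on only part of the key (product_id),
    -- and customer_city depends on a nonkey attribute (customer_id).
    CREATE TABLE order_lines_flat (
        order_id      INTEGER,
        product_id    INTEGER,
        product_name  TEXT,
        customer_id   INTEGER,
        customer_city TEXT,
        quantity      INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO order_lines_flat VALUES
        (1, 10, 'Widget', 100, 'Boston', 2),
        (1, 11, 'Gadget', 100, 'Boston', 1),
        (2, 10, 'Widget', 101, 'Denver', 5);

    -- 3NF: each nonkey column depends on the whole key of its own table.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (customer_id)
    );
    CREATE TABLE order_lines (
        order_id   INTEGER REFERENCES orders (order_id),
        product_id INTEGER REFERENCES products (product_id),
        quantity   INTEGER,
        PRIMARY KEY (order_id, product_id)
    );

    INSERT INTO customers   SELECT DISTINCT customer_id, customer_city FROM order_lines_flat;
    INSERT INTO products    SELECT DISTINCT product_id, product_name FROM order_lines_flat;
    INSERT INTO orders      SELECT DISTINCT order_id, customer_id FROM order_lines_flat;
    INSERT INTO order_lines SELECT order_id, product_id, quantity FROM order_lines_flat;
""")

# Joining the normalized tables reproduces the original rows without loss.
rebuilt = conn.execute("""
    SELECT ol.order_id, ol.product_id, p.name, o.customer_id, c.city, ol.quantity
    FROM order_lines ol
    JOIN orders o    ON o.order_id = ol.order_id
    JOIN products p  ON p.product_id = ol.product_id
    JOIN customers c ON c.customer_id = o.customer_id
    ORDER BY ol.order_id, ol.product_id
""").fetchall()
original = conn.execute(
    "SELECT * FROM order_lines_flat ORDER BY order_id, product_id"
).fetchall()
```

After the split, a product name or a customer city is stored once, so it can be changed in one row instead of in every order line that mentions it.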

4

Logical Data Modeling (cont.)

Determining data types for attributes
– Fixed & variable length attributes
– Integer & floating point numbers
– Pictures, video, long character data (LONG, LOB)
– Dates, timestamps, and the like

When determining data types
– Consider the impact on storage
– Consider database fragmentation
– Carefully consider options before using LONG, LOB

5

Logical Data Modeling (cont.)

Keys
– Natural
    Constructed naturally from the entity
    Column(s) have meaning
    Usually will have a longer key length than artificial keys
    Often are made up of multiple columns
– Artificial
    Unintelligent, has no meaning
    Usually a sequential number
    Generally will perform better than natural keys
    Never needs updating (easier to manage referential integrity)
– Ongoing debate about the use of natural vs. artificial keys
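A minimal sketch of the "never needs updating" point, again using Python's sqlite3 module as a stand-in for Oracle. The employee and timesheet tables are hypothetical. With the natural key, a change of name has to cascade into every child row; with the artificial key, the child rows are not touched at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Natural key: the columns carry meaning, and child rows repeat them.
    CREATE TABLE employees_nat (
        last_name  TEXT,
        first_name TEXT,
        birth_date TEXT,
        PRIMARY KEY (last_name, first_name, birth_date)
    );
    CREATE TABLE timesheets_nat (
        last_name  TEXT,
        first_name TEXT,
        birth_date TEXT,
        hours      INTEGER,
        FOREIGN KEY (last_name, first_name, birth_date)
            REFERENCES employees_nat (last_name, first_name, birth_date)
            ON UPDATE CASCADE
    );

    -- Artificial key: a meaningless sequential number.
    CREATE TABLE employees_art (
        employee_id INTEGER PRIMARY KEY,  -- assigned sequentially by SQLite
        last_name   TEXT,
        first_name  TEXT,
        birth_date  TEXT
    );
    CREATE TABLE timesheets_art (
        employee_id INTEGER REFERENCES employees_art (employee_id),
        hours       INTEGER
    );

    INSERT INTO employees_nat VALUES ('Smith', 'Ann', '1980-01-01');
    INSERT INTO timesheets_nat VALUES ('Smith', 'Ann', '1980-01-01', 8);
    INSERT INTO employees_art (last_name, first_name, birth_date)
        VALUES ('Smith', 'Ann', '1980-01-01');
    INSERT INTO timesheets_art VALUES (1, 8);

    -- A name change alters the natural key and must cascade to the child rows.
    UPDATE employees_nat SET last_name = 'Jones' WHERE last_name = 'Smith';
    -- The same change leaves the artificial key and its child rows untouched.
    UPDATE employees_art SET last_name = 'Jones' WHERE employee_id = 1;
""")

nat_child = conn.execute("SELECT last_name FROM timesheets_nat").fetchone()[0]
art_child = conn.execute("SELECT employee_id FROM timesheets_art").fetchone()[0]
```

The foreign key in timesheets_art is a single integer, while the one in timesheets_nat repeats three columns, which is the key-length difference the slide mentions.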

6

Logical Data Modeling (cont.)

Data Warehouse Design
– Have different logical requirements
– Common data models include
    Star Schema
      – Fact table (generally large)
      – Series of dimension tables (generally small)
    Snowflake Schema
      – A more complex star schema
    Hybrid
– Have different physical requirements
    Partitioning
    Bitmap Indexes
    Materialized Views

7

Logical to Physical

Logical design meets functional requirements
Physical design meets performance requirements
Common mistake is making physical model exact copy of logical model
– Usually means lower performance
– Pay the price upfront to properly determine physical model

8

Logical to Physical (cont.)

Key steps include
– Mapping entities to tables
– Choosing a table type
– Determining data types
– Precision & optional attributes

Denormalization
– Done for performance
– Can increase overhead
    Summary tables
    Partitioning tables

9

The Star Schema

Common in data warehousing
Typically shows better performance for warehouses
Fact table
– Contains the detailed information
– Many foreign keys to “dimension” tables
Dimension tables
– Many tables that surround the fact table
– Are reference or “categorized” tables such as time, product, and customer

See Figure 4-3 (p. 94)
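A small star schema can be sketched as follows. This is not the schema from Figure 4-3; the table names (fact_sales, dim_time, dim_product, dim_customer) and the data are hypothetical, and sqlite3 stands in for Oracle. The fact table holds the detail rows and one foreign key per dimension, and a typical query joins it to the dimensions it filters or groups on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: small reference tables that surround the fact table.
    CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);

    -- Fact table: the detailed rows, with a foreign key to each dimension.
    CREATE TABLE fact_sales (
        time_id     INTEGER REFERENCES dim_time (time_id),
        product_id  INTEGER REFERENCES dim_product (product_id),
        customer_id INTEGER REFERENCES dim_customer (customer_id),
        amount      INTEGER
    );

    INSERT INTO dim_time     VALUES (1, 2008, 1), (2, 2008, 2);
    INSERT INTO dim_product  VALUES (1, 'Books'), (2, 'Music');
    INSERT INTO dim_customer VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales   VALUES
        (1, 1, 1, 10), (1, 2, 1, 20), (2, 1, 2, 30), (2, 1, 1, 5);
""")

# Sales per quarter and product category for one year.
totals = conn.execute("""
    SELECT t.quarter, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_time t    ON t.time_id = f.time_id
    JOIN dim_product p ON p.product_id = f.product_id
    WHERE t.year = 2008
    GROUP BY t.quarter, p.category
    ORDER BY t.quarter, p.category
""").fetchall()
```

In a snowflake variant, a dimension such as dim_product would itself be split further, for example into a separate category table, at the cost of one more join per query.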

10

The Snowflake Schema

An expanded star schema
Dimensions are split into multiple tables
Is a normalization technique
To be used with caution, can degrade performance
Can complicate queries
Can reduce storage requirements
See Figure 4-4 (p. 95)

11

Materialized Views

Also common in the data warehouse
Done to aggregate or join data
Created to simplify access for the end user
Is a physical table (consumes storage)
Provides sophisticated refresh functionality
Refreshes can be complete or just the changes
Query rewrite functionality makes the views themselves transparent to the user
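The idea of a stored, refreshable query result can be sketched as below. SQLite has no materialized views, so this is a swapped-in technique: an ordinary summary table plus a hypothetical refresh_complete function that imitates a complete refresh. It does not show Oracle's refresh syntax, incremental refresh, or query rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('East', 10), ('East', 20), ('West', 5);

    -- Stand-in for the materialized view: a real table holding the aggregate.
    CREATE TABLE sales_by_region_mv (region TEXT PRIMARY KEY, total INTEGER);
""")


def refresh_complete(conn):
    """Complete refresh: discard the stored result and recompute it from the base table."""
    with conn:  # commit both statements together
        conn.execute("DELETE FROM sales_by_region_mv")
        conn.execute("""
            INSERT INTO sales_by_region_mv
            SELECT region, SUM(amount) FROM sales GROUP BY region
        """)


def read_mv(conn):
    """Read the stored aggregate without touching the base table."""
    return conn.execute(
        "SELECT region, total FROM sales_by_region_mv ORDER BY region"
    ).fetchall()


refresh_complete(conn)
before = read_mv(conn)

conn.execute("INSERT INTO sales VALUES ('West', 7)")
stale = read_mv(conn)   # the stored result does not see the new row yet

refresh_complete(conn)
after = read_mv(conn)   # the refresh brings it up to date
```

The stale read is the trade-off behind the refresh options on the slide: the stored result is cheap to query but is only as current as its last refresh.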

12

Physical Storage Options

Segment space management
– Can be automatic (preferred) or manual
– Affects how Oracle manages blocks, regarding:
    Freelists
    Block-related parameters (PCTUSED, PCTFREE)

Row migration
– Indicated by the “table fetch continued row” statistic (V$SYSSTAT)

INITRANS – transaction slots within a block
– ITL (Interested Transaction List)

13

Physical Storage Options (cont.)

Compression
– Reduces storage & memory requirements
– Makes DML slower
– Select operations can be faster
– In Oracle 10g
    Done during table creation or reorganization
    Loads needed to be done “direct-load”
    Normal DML caused decompression
– In Oracle 11g
    New advanced compression component
    Can function within normal DML operations
– Best used with
    Table scans
    Character type data

14

Physical Storage Options (cont.)

LOBs (Large Object Data)
– Character data > 4000 bytes (CLOB)
– Binary data (BLOB)
– Stored separately from the remainder of the row data
– Storage mechanism differs from that of row data (chunks)
– Have different storage options
– New security features in Oracle 11g

15

Oracle Partitioning

Breaks a table/index up into logical segments
Each partition can have separate storage characteristics
Benefits include
– Reading only the relevant partitions needed for queries
– Can help improve parallel processing for DML, select operations, and database maintenance operations
– Deletes can be done much more cheaply
– Can reduce latch contention

16

Oracle Partitioning (cont.)

Partitioning types:
– Range (typically time-based)
– Hash (helps ensure equal-sized partitions)
– List (based on a specific set of values, e.g., state code)
– Composite partitioning (combination of above)

New Oracle 11g partitioning
– Reference (child tables inherit partitioning from parents)
– Interval (can enable auto-addition of partitions)
– Virtual-column (enables partitioning on expressions)
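How a row is routed to a partition under the range, hash, and list types can be sketched in plain Python. This is a conceptual model only, not Oracle's implementation: the partition names, bounds, and value lists are hypothetical, and the modulo in hash_partition is a stand-in for Oracle's internal hash function. The prune_range function illustrates reading only the partitions a date-range query needs.

```python
import bisect
from datetime import date

# Range: each partition holds the values strictly below its upper bound.
RANGE_BOUNDS = [date(2008, 1, 1), date(2009, 1, 1), date(2010, 1, 1)]
RANGE_NAMES = ["p2007", "p2008", "p2009"]

# List: each partition holds an explicit set of values.
LIST_PARTITIONS = {"p_east": {"NY", "MA"}, "p_west": {"CA", "WA"}}


def range_partition(d):
    """Return the partition whose upper bound is the first one above d."""
    i = bisect.bisect_right(RANGE_BOUNDS, d)
    if i >= len(RANGE_NAMES):
        raise ValueError(f"{d} is not below any partition bound")
    return RANGE_NAMES[i]


def hash_partition(key, n_partitions=4):
    """Spread integer keys across partitions (modulo is an assumed stand-in hash)."""
    return key % n_partitions


def list_partition(state):
    """Return the partition whose value list contains the state code."""
    for name, values in LIST_PARTITIONS.items():
        if state in values:
            return name
    raise ValueError(f"no partition lists the value {state!r}")


def prune_range(start, end):
    """Partitions that a query on dates from start to end (inclusive) must read."""
    first = bisect.bisect_right(RANGE_BOUNDS, start)
    last = bisect.bisect_right(RANGE_BOUNDS, end)
    return RANGE_NAMES[first:last + 1]


spring_2008 = prune_range(date(2008, 3, 1), date(2008, 6, 30))
```

In this model, purging a year of data means dropping one range partition rather than deleting its rows one by one, which is why range partitioning suits data that will be purged.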

17

Oracle Partitioning (cont.)

Choosing a partitioning strategy
– Range (good if data will be purged)
– Hash (helps if parallel operations are needed)
– List (queries based on a small subset of the table’s data)
– Composite partitioning (if several of the above factors are indicated)
– Enterprise Manager partitioning advisor
    Can help suggest partitioning schemes