Principles of Database Design, Conclusions MBAA 609 R. Nakatsu.

Post on 13-Dec-2015

219 views 0 download

Tags:

Transcript of Principles of Database Design, Conclusions MBAA 609 R. Nakatsu.

Principles ofDatabase Design,

Conclusions

MBAA 609

R. Nakatsu

Principles of Database DesignNormalization is a process that assigns

attributes (fields) to tables such that data redundancies are eliminated or reduced.

Four Rules of Thumb:1. Single Themes: Break up a large table into

separate themes, or subjects.2. Field Uniqueness: Each field in a table

should represent a unique type of information (e.g., break up complex fields, and eliminate repeating groups).

Principles of Normalization,Continued

3. Primary Keys: Each table must have a unique identifier, or primary key, that is made up of one or more fields in the table.

4. Field Independence: You must be able to make a change to the data in any field (other than a field in the primary key) without affecting the data in any other field.

Rule 4: Field IndependenceEmployee (Employee Number, Last Name,

First Name, Job Class, Hourly Rate)

In this example, Hourly Rate is dependent on Job Class.

Employee Number Last Name First Name Job Class Hourly Rate

11 Smith John Mechanic 20

12 Jones Susan Technician 18

13 McKay Bob Mechanic 20

14 Owens Paula Clerk 15

15 Chang Steve Mechanic 20

16 Sarandon Sarah Mechanic 20

Solution: Create Two TablesEmployee (Employee Number, Last Name,

First Name, Job Class ID)

Job Class ID is the link to the Job Class table.

Employee Number Last Name First Name Job Class ID

11 Smith John 2

12 Jones Susan 3

13 McKay Bob 2

14 Owens Paula 1

15 Chang Steve 2

16 Sarandon Sarah 2

Job Class (Job Class ID, Job Class,

Hourly Rate)

There are no more field dependencies!

Job Class ID Job Class Hourly Rate1 Clerk 152 Mechanic 203 Technician 18

Integrity of a Database

It is important that the data in a database is correct and consistent.

Data integrity problems can occur due to: incorrect data entry data redundancy poor database design

Ways to Maintain Data Integrity

Entity Integrity (primary key must be unique and must not be null)

Referential Integrity (see previous slide) Field types (e.g., text, numeric, date, yes/no) Input masks (e.g., ISBN number, dates) Field validation rules (What are some

examples?) Good design of input screens (user interfaces) Check digits

Check Digit Example

Add a check digit to validate a clerk’s data entry.

The check digit is determined by some mathematical algorithm.

Example:Multiply the first digit by 2; multiply the second

digit by 3; multiply the third by 4; add the results; divide by 10 and take the remainder.

127 yields (1*2 + 2*3 + 7*4) / 10 = 36 / 10 for a check digit of 6.

Concurrency Control

Concurrency control is the management of concurrent transaction execution.

Why is it important?

The simultaneous execution of transactions over a shared database may create several data-integrity and consistency problems.

Lost Updates: An Example

Note that the first transaction has not been permanently recorded when the second transaction is executed.

Time TransactionStep Stored Value1 T1 Read QOH 502 T2 Read QOH 503 T1 QOH = 50 + 204 T2 QOH = 50 - 305 T1 Write QOH 706 T2 Write QOH 20

Concurrency Controlwith Locking

A lock guarantees exclusive use of a data item to a current transaction.

In the previous example, transaction T2 will not have access to the data item that is currently used by transaction T1.

The lock is released when the transaction T1 is completed.

Data Warehouse

support OLAP (online analytical processing) support data mining (data in a data warehouse are analyzed to reveal hidden patterns and trends

in historical business activity)

stores data that have been extracted from the various operational, external, and other databases of an organization

Data warehouses...

Data Warehouse

Recap

Why do we need a database? What is a database? What are the main functions of a DBMS? Entity-Relationship Modeling Linking two tables: 1:1, 1:M, and M:N Four principles of database design Data Integrity Concurrency Control Data Warehouses