Database Principles Relational Database Design I.

Page 1: Database Principles Relational Database Design I.

Database Principles

Relational Database Design I

Page 2: Database Principles Relational Database Design I.

Good Tables versus Bad Tables:

• A table in a relational database is good if it is about one thing. A table that is not good is bad.

• Problem of RDB Design: Build good tables and convert bad tables into good tables.

• What is a table “about”? The key to the answer is the key.

• The key to a table is the identifier of whatever the table is about.

Page 3: Database Principles Relational Database Design I.

Good Table Examples:

Part
Pno  Pdesc   Colour
p1   screw   red
p2   bolt    yellow
p3   nut     green
p4   washer  red

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

Supplies
Sno  Pno  O_date
s1   p1   nov 3
s2   p2   nov 4
s3   p1   nov 5
s3   p3   nov 6
s4   p1   nov 7
s4   p2   nov 8
s4   p4   nov 9

Supplier is good because its key is Sno, which identifies different suppliers, and each column in Supplier (Sname and Location) is a piece of information about Suppliers.

Exercise: Explain why Part is a good table.

Supplies is good because its key is (Sno,Pno), which identifies individual orders, and the only other column in the table (O_date) is a piece of info about individual orders.

Page 4: Database Principles Relational Database Design I.

Bad Table Example:

Supplier Part Supplies
Sno  Sname  Location  O_date  Pno  Pdesc   Color
s1   Acme   NY        nov 3   p1   screw   red
s2   Ajax   Bos       nov 4   p2   bolt    yellow
s3   Apex   Chi       nov 5   p1   screw   red
s3   Apex   Chi       nov 6   p3   nut     green
s4   Ace    LA        nov 7   p1   screw   red
s4   Ace    LA        nov 8   p2   bolt    yellow
s4   Ace    LA        nov 9   p4   washer  red

Even though this table has info about Suppliers, Parts and Supplies, its key is (Sno,Pno). And so this table is “about” whatever its key identifies, namely Supplies. But the table contains various columns (Sname, Location, Pdesc, Color) that are not about Supplies but about Supplier and Part respectively.

So this table is “bad” and the process of RDB Design would be to reform this table into the three tables on the previous slide.
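
The reform described here can be sketched as three projections of the wide table, each keeping the distinct sub-rows for one group of columns. This is a minimal Python sketch; the helper name `project` and the variable names are illustrative assumptions, not part of the slides.

```python
# Rows of the wide ("bad") table, in the column order shown on the slide.
COLUMNS = ("Sno", "Sname", "Location", "O_date", "Pno", "Pdesc", "Color")
wide = [
    ("s1", "Acme", "NY",  "nov 3", "p1", "screw",  "red"),
    ("s2", "Ajax", "Bos", "nov 4", "p2", "bolt",   "yellow"),
    ("s3", "Apex", "Chi", "nov 5", "p1", "screw",  "red"),
    ("s3", "Apex", "Chi", "nov 6", "p3", "nut",    "green"),
    ("s4", "Ace",  "LA",  "nov 7", "p1", "screw",  "red"),
    ("s4", "Ace",  "LA",  "nov 8", "p2", "bolt",   "yellow"),
    ("s4", "Ace",  "LA",  "nov 9", "p4", "washer", "red"),
]


def project(rows, columns, keep):
    """Return the distinct sub-rows of `rows` restricted to the columns in `keep`."""
    idx = [columns.index(c) for c in keep]
    return sorted({tuple(row[i] for i in idx) for row in rows})


supplier = project(wide, COLUMNS, ("Sno", "Sname", "Location"))
part = project(wide, COLUMNS, ("Pno", "Pdesc", "Color"))
supplies = project(wide, COLUMNS, ("Sno", "Pno", "O_date"))

for name, table in (("Supplier", supplier), ("Part", part), ("Supplies", supplies)):
    print(name, len(table), "rows")
```

Note that supplier s5 (A-1) cannot be recovered this way: the wide table has no row for a supplier that supplies no parts.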

Quick Observation: If your tables come from an ERD they are normally pretty “good”.

Page 5: Database Principles Relational Database Design I.

Bad Tables can be Useful, if not Good:

• The previous table, called bad, is so only if it is a permanent table. As part of an on-going query it is not considered bad since data is not stored permanently in this format.

• Suppose you never, ever expect to look at the supplies table information without knowing the name of the supplier and the part supplied. If the join table does not exist then you will always have to construct the join. This can be time consuming and to save that time you might keep the table in pre-joined, permanent form.
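
Constructing the join on demand might look like the following. This is a sketch using Python's built-in `sqlite3` module and only a two-row subset of the example data; the table aliases `s`, `p` and `o` are arbitrary choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (Sno TEXT PRIMARY KEY, Sname TEXT, Location TEXT);
    CREATE TABLE Part (Pno TEXT PRIMARY KEY, Pdesc TEXT, Colour TEXT);
    CREATE TABLE Supplies (Sno TEXT, Pno TEXT, O_date TEXT, PRIMARY KEY (Sno, Pno));
    INSERT INTO Supplier VALUES ('s1', 'Acme', 'NY'), ('s2', 'Ajax', 'Bos');
    INSERT INTO Part VALUES ('p1', 'screw', 'red'), ('p2', 'bolt', 'yellow');
    INSERT INTO Supplies VALUES ('s1', 'p1', 'nov 3'), ('s2', 'p2', 'nov 4');
""")

# The join is rebuilt from the three good tables every time the query runs.
joined = conn.execute("""
    SELECT s.Sno, s.Sname, s.Location, o.O_date, p.Pno, p.Pdesc, p.Colour
    FROM Supplies AS o
    JOIN Supplier AS s ON s.Sno = o.Sno
    JOIN Part AS p ON p.Pno = o.Pno
    ORDER BY s.Sno
""").fetchall()

for row in joined:
    print(row)
```

Keeping the result as a permanent table saves this work on every read, at the cost of the problems discussed on the following slides.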

Page 6: Database Principles Relational Database Design I.

What’s so Bad about a Bad Table?

• Suppose instead of three tables

Part
Pno  Pdesc   Colour
p1   screw   red
p2   bolt    yellow
p3   nut     green
p4   washer  red

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

Supplies
Sno  Pno  O_date
s1   p1   nov 3
s2   p2   nov 4
s3   p1   nov 5
s3   p3   nov 6
s4   p1   nov 7
s4   p2   nov 8
s4   p4   nov 9

we only have one table

Supplier Part Supplies
Sno  Sname  Location  O_date  Pno  Pdesc   Color
s1   Acme   NY        nov 3   p1   screw   red
s2   Ajax   Bos       nov 4   p2   bolt    yellow
s3   Apex   Chi       nov 5   p1   screw   red
s3   Apex   Chi       nov 6   p3   nut     green
s4   Ace    LA        nov 7   p1   screw   red
s4   Ace    LA        nov 8   p2   bolt    yellow
s4   Ace    LA        nov 9   p4   washer  red

Page 7: Database Principles Relational Database Design I.

Insert Anomaly:

• We want to add supplier A-1 to the database but for now we have no parts that A-1 supplies. Since the key to the table is (Sno,Pno) we can’t add a row until we have values for both Sno and Pno.

Supplier Part Supplies
Sno  Sname  Location  O_date  Pno   Pdesc   Color
s1   Acme   NY        nov 3   p1    screw   red
s2   Ajax   Bos       nov 4   p2    bolt    yellow
s3   Apex   Chi       nov 5   p1    screw   red
s3   Apex   Chi       nov 6   p3    nut     green
s4   Ace    LA        nov 7   p1    screw   red
s4   Ace    LA        nov 8   p2    bolt    yellow
s4   Ace    LA        nov 9   p4    washer  red
s5   A-1    Phil      null    null  null    null     <- not permitted
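
The rejected row can be reproduced with a small sketch using Python's built-in `sqlite3` module. The table name `SupplierPartSupplies` is an assumption made for illustration. The key columns are declared `NOT NULL` explicitly, because SQLite does not reject NULLs in a non-integer primary key on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SupplierPartSupplies (
        Sno TEXT NOT NULL, Sname TEXT, Location TEXT, O_date TEXT,
        Pno TEXT NOT NULL, Pdesc TEXT, Color TEXT,
        PRIMARY KEY (Sno, Pno)
    )
""")

# Supplier A-1 supplies no parts yet, so there is no value for Pno.
try:
    conn.execute(
        "INSERT INTO SupplierPartSupplies VALUES (?, ?, ?, ?, ?, ?, ?)",
        ("s5", "A-1", "Phil", None, None, None, None),
    )
    insert_rejected = False
except sqlite3.IntegrityError as exc:
    insert_rejected = True
    print("not permitted:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM SupplierPartSupplies").fetchone()[0]
```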

Page 8: Database Principles Relational Database Design I.

Update Anomaly:

• What happens if the part, p2, changes its color from yellow to purple?

• We must search every row of the join-table and change every instance of yellow to purple in rows involving the supplying of part, p2.

Supplier Part Supplies
Sno  Sname  Location  O_date  Pno  Pdesc   Color
s1   Acme   NY        nov 3   p1   screw   red
s2   Ajax   Bos       nov 4   p2   bolt    yellow   <- change
s3   Apex   Chi       nov 5   p1   screw   red
s3   Apex   Chi       nov 6   p3   nut     green
s4   Ace    LA        nov 7   p1   screw   red
s4   Ace    LA        nov 8   p2   bolt    yellow   <- change
s4   Ace    LA        nov 9   p4   washer  red

multiple changes

Page 9: Database Principles Relational Database Design I.

Update Anomaly (cont):

• This one change in the real world makes for many changes in the database.

• What if we mess up and end up not making all changes?

• Now, what color is p2?

Supplier Part Supplies
Sno  Sname  Location  O_date  Pno  Pdesc   Color
s1   Acme   NY        nov 3   p1   screw   red
s2   Ajax   Bos       nov 4   p2   bolt    yellow
s3   Apex   Chi       nov 5   p1   screw   red
s3   Apex   Chi       nov 6   p3   nut     green
s4   Ace    LA        nov 7   p1   screw   red
s4   Ace    LA        nov 8   p2   bolt    yellow
s4   Ace    LA        nov 9   p4   washer  red

If only one of the two p2 rows is changed to purple, the table records two different colors for p2.
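
An incomplete update of this kind can be sketched with Python's built-in `sqlite3` module. The table name `SupplierPartSupplies` is an illustrative assumption, only three of the rows are loaded, and the choice of which p2 row gets missed is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SupplierPartSupplies (
        Sno TEXT NOT NULL, Sname TEXT, Location TEXT, O_date TEXT,
        Pno TEXT NOT NULL, Pdesc TEXT, Color TEXT,
        PRIMARY KEY (Sno, Pno)
    )
""")
conn.executemany(
    "INSERT INTO SupplierPartSupplies VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("s1", "Acme", "NY", "nov 3", "p1", "screw", "red"),
        ("s2", "Ajax", "Bos", "nov 4", "p2", "bolt", "yellow"),
        ("s4", "Ace", "LA", "nov 8", "p2", "bolt", "yellow"),
    ],
)

# The mistake: only the (s2, p2) row is updated; the (s4, p2) row is missed.
conn.execute(
    "UPDATE SupplierPartSupplies SET Color = 'purple' "
    "WHERE Sno = 's2' AND Pno = 'p2'"
)

# Ask the database what color p2 is: it now gives two answers.
colors_of_p2 = {
    color
    for (color,) in conn.execute(
        "SELECT DISTINCT Color FROM SupplierPartSupplies WHERE Pno = 'p2'"
    )
}
print(colors_of_p2)
```

With a separate Part table the color of p2 is stored in exactly one row, so a single-row update cannot leave the data in this state.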

Page 10: Database Principles Relational Database Design I.

Delete Anomaly:

• What if we cancel the order for bolts from supplier, s2?

• A consequence is that we lose all information about the supplier, s2.

Supplier Part Supplies
Sno  Sname  Location  O_date  Pno  Pdesc   Color
s1   Acme   NY        nov 3   p1   screw   red
s2   Ajax   Bos       nov 4   p2   bolt    yellow
s3   Apex   Chi       nov 5   p1   screw   red
s3   Apex   Chi       nov 6   p3   nut     green
s4   Ace    LA        nov 7   p1   screw   red
s4   Ace    LA        nov 8   p2   bolt    yellow
s4   Ace    LA        nov 9   p4   washer  red
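
The loss of supplier information can be sketched as follows, again with Python's built-in `sqlite3` module, an assumed table name `SupplierPartSupplies`, and a three-row subset of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SupplierPartSupplies (
        Sno TEXT NOT NULL, Sname TEXT, Location TEXT, O_date TEXT,
        Pno TEXT NOT NULL, Pdesc TEXT, Color TEXT,
        PRIMARY KEY (Sno, Pno)
    )
""")
conn.executemany(
    "INSERT INTO SupplierPartSupplies VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("s1", "Acme", "NY", "nov 3", "p1", "screw", "red"),
        ("s2", "Ajax", "Bos", "nov 4", "p2", "bolt", "yellow"),
        ("s3", "Apex", "Chi", "nov 5", "p1", "screw", "red"),
    ],
)

# Cancel the only order placed with supplier s2.
conn.execute("DELETE FROM SupplierPartSupplies WHERE Sno = 's2' AND Pno = 'p2'")

# Nothing in the table mentions s2 any more: its name and location are gone.
rows_about_s2 = conn.execute(
    "SELECT COUNT(*) FROM SupplierPartSupplies WHERE Sno = 's2'"
).fetchone()[0]
print(rows_about_s2)
```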

Page 11: Database Principles Relational Database Design I.

So What Can Be Done?

• Suppose we keep these two tables

Part
Pno  Pdesc   Colour
p1   screw   red
p2   bolt    yellow
p3   nut     green
p4   washer  red

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

Supplies
Sno  Pno  O_date
s1   p1   nov 3
s2   p2   nov 4
s3   p1   nov 5
s3   p3   nov 6
s4   p1   nov 7
s4   p2   nov 8
s4   p4   nov 9

and we also keep this table

Supplier Part Supplies
Sno  Sname  Location  O_date  Pno  Pdesc   Color
s1   Acme   NY        nov 3   p1   screw   red
s2   Ajax   Bos       nov 4   p2   bolt    yellow
s3   Apex   Chi       nov 5   p1   screw   red
s3   Apex   Chi       nov 6   p3   nut     green
s4   Ace    LA        nov 7   p1   screw   red
s4   Ace    LA        nov 8   p2   bolt    yellow
s4   Ace    LA        nov 9   p4   washer  red

Any ideas?

Page 12: Database Principles Relational Database Design I.

So What Can Be Done? (cont)

• There is the issue of data consistency.

• Given the same information stored in several places it becomes a big job to make sure this data is consistent.

• If we lose data consistency then all the data essentially becomes “noise”.

Page 13: Database Principles Relational Database Design I.

Some Notation:

• A table is sometimes called a relation.
  – We use R, S and T and nearby letters to represent tables.

• Table columns are also called attributes.
  – We use A, B and C and nearby letters to represent columns.

• The possible values in a column A of table R are called the domain of A, dom(A).

• Table schemas are lists of table columns.
  – We use R, S and T to represent schemas.

Page 14: Database Principles Relational Database Design I.

Some Notation (cont):

• Table rows are also called tuples.
  – We use r, s and t and nearby letters to represent rows.

• Subsets of a table schema are represented by X, Y and Z and nearby letters.
  – X ⊆ R is a subset of the list of all columns in a table.

• r[A] is the value in row r column A.

• r[X] is the subrow of r consisting of the values in the columns of X.

Page 15: Database Principles Relational Database Design I.

Example:

R
A   B   C   D   E   F
a1  b1  c1  d1  e1  f1
a2  b2  c2  d2  e2  f2   <- r
a3  b3  c3  d3  e3  f3
. . .
an  bn  cn  dn  en  fn

Table Name: Letter from middle of alphabet (R, S, T)

Column Name: Letter from beginning of alphabet (A, B, C)

R = { A, B, C, D, E, F }; the schema of R

X = { A, B, C } ⊆ R; X is a subset of the schema, named by a letter from the end of the alphabet

r = ( a2, b2, c2, d2, e2, f2 )

r[A] = ( a2 ); a singleton tuple

r[X] = ( a2, b2, c2 ); a subrow of r
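
The notation r[A] and r[X] can be mirrored in a few lines of Python. This is only a sketch; the function name `restrict` and the dictionary representation of a row are assumptions made for illustration.

```python
# The schema of R and the row r from the example.
SCHEMA = ("A", "B", "C", "D", "E", "F")
r = {"A": "a2", "B": "b2", "C": "c2", "D": "d2", "E": "e2", "F": "f2"}


def restrict(row, columns):
    """Return r[X]: the values of `row` in the columns of X, in schema order."""
    return tuple(row[c] for c in SCHEMA if c in columns)


X = {"A", "B", "C"}
assert X <= set(SCHEMA)      # X is a subset of the schema

print(restrict(r, {"A"}))    # r[A], a singleton tuple
print(restrict(r, X))        # r[X], a subrow of r
```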

Page 16: Database Principles Relational Database Design I.

What is a Key to a Table?

• A key is a set of columns of a table whose values uniquely identify distinct rows of the table.

• A key is a set of columns of a table such that if you know the values of the columns in the key, there is at most one row in the table with these values.

Def’n: For any table R, if X is a subset of R, then X is a key to the table R if the following is true: for any two rows r and s of R, if r[X] = s[X] then r = s; that is, r and s are the same row.

Put another way, any two rows that agree on X agree everywhere.

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

One key to Supplier is {Sno}; are there any others?

What about {Sno, Location}?
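
The definition can be turned into a check over the rows currently in a table. This is a sketch with an assumed function name, `agrees_with_key`. Passing the check only shows that the current rows do not contradict X being a key; whether X really is a key depends on what rows are allowed in the future.

```python
SCHEMA = ("Sno", "Sname", "Location")
supplier = [
    ("s1", "Acme", "NY"),
    ("s2", "Ajax", "Bos"),
    ("s3", "Apex", "Chi"),
    ("s4", "Ace", "LA"),
    ("s5", "A-1", "Phil"),
]


def agrees_with_key(rows, schema, X):
    """True if no two distinct rows of `rows` agree on all columns of X."""
    idx = [schema.index(c) for c in X]
    seen = {}
    for row in rows:
        sub = tuple(row[i] for i in idx)       # this is r[X]
        if sub in seen and seen[sub] != row:
            return False                       # r[X] = s[X] but r != s
        seen[sub] = row
    return True


print(agrees_with_key(supplier, SCHEMA, ["Sno"]))
print(agrees_with_key(supplier, SCHEMA, ["Sno", "Location"]))
print(agrees_with_key(supplier, SCHEMA, ["Location"]))

# A hypothetical sixth supplier from Boston breaks {Location} but not {Sno}.
with_boston = supplier + [("s6", "Able", "Bos")]
print(agrees_with_key(with_boston, SCHEMA, ["Location"]))
```

All three checks on the current rows succeed, including the one for {Location}, simply because no two of the five suppliers share a location yet.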

Page 17: Database Principles Relational Database Design I.

Not all Columns are Keys:

• Why isn’t {Location} a key to Supplier?

• Because at some point in the future we may add a new supplier who comes from Boston, for example.

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

Page 18: Database Principles Relational Database Design I.

Keys, Keys and More Keys:

• We saw earlier that {Sno, Location} is also a key to Supplier. It is called a superkey.

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

Def’n: For a table R, if X is a key of R and X ⊆ Y, then Y is a superkey of R.

Any set of columns that contains a key to a table is a superkey of the same table.

Superkeys are keys too.

Supplier has many superkeys: {Sno}, {Sno,Sname}, {Sno,Location} and the whole schema of Supplier, {Sno,Sname,Location}.
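
The list of superkeys can be generated mechanically: every set of columns that contains the key {Sno} qualifies. The function name `superkeys` below is an illustrative assumption.

```python
from itertools import combinations

SCHEMA = ("Sno", "Sname", "Location")
KEY = {"Sno"}


def superkeys(schema, key):
    """Return every set of columns of `schema` that contains `key`."""
    result = []
    for size in range(len(key), len(schema) + 1):
        for cols in combinations(schema, size):
            if key <= set(cols):
                result.append(set(cols))
    return result


supplier_superkeys = superkeys(SCHEMA, KEY)
for sk in supplier_superkeys:
    print(sorted(sk))
```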

Page 19: Database Principles Relational Database Design I.

Exercise:

• Prove that any table has at least one superkey.

• Answer: The schema itself is a superkey.

Page 20: Database Principles Relational Database Design I.

Keys, Keys and More Keys:

• Some keys are smaller (fewer columns) than others.

• Some keys can’t be made any smaller (fewer columns).

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

Def’n: For a table R, if X is a key of R and for any proper subset Y ⊂ X, we know that Y is not a key of R, then X is called a candidate key of R.

The only candidate key to Supplier is {Sno}.
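
Given a complete list of the keys of a table, the candidate keys are the ones with no smaller key inside them. A minimal sketch, assuming the list of keys is supplied by hand (the function name `candidate_keys` is illustrative):

```python
def candidate_keys(keys):
    """Keep only the keys that have no proper subset among `keys`."""
    keys = [frozenset(k) for k in keys]
    return [k for k in keys if not any(other < k for other in keys)]


# All keys (superkeys) of Supplier, as listed earlier.
supplier_keys = [
    {"Sno"},
    {"Sno", "Sname"},
    {"Sno", "Location"},
    {"Sno", "Sname", "Location"},
]

supplier_candidates = candidate_keys(supplier_keys)
print([sorted(k) for k in supplier_candidates])
```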

Page 21: Database Principles Relational Database Design I.

Candidate Keys:

• What made us decide {Sno} was a candidate key of Supplier? We know that no two suppliers were assigned the same number.

• What makes us say that {Location} is not a key to Supplier? We know there is no rule saying suppliers must come from different locations.

• What makes us decide that something is a key is a rule about the real world that makes it so. We call such a rule an Enterprise Rule.

Page 22: Database Principles Relational Database Design I.

Multiple Candidate keys (1):

• In the table below there are 2 possible candidate keys:

Student
StudentID  SSN  Fname  Lname  DOB  Address

Candidate keys: {StudentID} and {SSN}

Page 23: Database Principles Relational Database Design I.

Multiple Candidate keys (2):

• In the table below there are 2 possible candidate keys:

• Candidate keys don’t need to be the same size.

Fall08RoomAssignments
CourseID  SectionID  RoomNum  BldgID  TimeSlot

Candidate keys: {CourseID, SectionID} and {RoomNum, BldgID, TimeSlot}

Page 24: Database Principles Relational Database Design I.

Primary key:

• What do you do when you have candidates? Hold an election.

• The candidate key that wins the election is called the primary key. There is only one primary key in a table.

Page 25: Database Principles Relational Database Design I.

Primary Key Examples:

• What are the primary keys of each of the tables below?

Supplier
Sno  Sname  Location
s1   Acme   NY
s2   Ajax   Bos
s3   Apex   Chi
s4   Ace    LA
s5   A-1    Phil

pk = {Sno}

Part
Pno  Pdesc   Colour
p1   screw   red
p2   bolt    yellow
p3   nut     green
p4   washer  red

pk = {Pno}

Supplies
Sno  Pno  O_date
s1   p1   nov 3
s2   p2   nov 4
s3   p1   nov 5
s3   p3   nov 6
s4   p1   nov 7
s4   p2   nov 8
s4   p4   nov 9

pk = {Sno,Pno,O_date} or {Sno,Pno}

Which of the two is the primary key of Supplies is determined by the Enterprise Rules, for example whether a supplier may supply the same part on more than one date.

Page 26: Database Principles Relational Database Design I.

Foreign Keys:

• What makes someone a foreigner? Being physically in a country other than their own.

• What makes a set of columns a foreign key? Columns are a foreign key if they are a primary key in some other table.

• In the Supplies table both {Sno} and {Pno} are foreign keys because they are primary keys in other tables: Supplier and Part respectively.
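
In SQL a foreign key is usually declared with a `REFERENCES` clause so that the database can enforce it. The sketch below uses Python's built-in `sqlite3` module; it assumes an SQLite build with foreign-key support, which must be switched on per connection with `PRAGMA foreign_keys = ON`. The supplier number `s9` is a made-up value that does not exist in Supplier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # enforcement is off by default in SQLite

conn.execute("CREATE TABLE Supplier (Sno TEXT PRIMARY KEY, Sname TEXT, Location TEXT)")
conn.execute("CREATE TABLE Part (Pno TEXT PRIMARY KEY, Pdesc TEXT, Colour TEXT)")
conn.execute("""
    CREATE TABLE Supplies (
        Sno TEXT NOT NULL REFERENCES Supplier(Sno),  -- foreign key to Supplier
        Pno TEXT NOT NULL REFERENCES Part(Pno),      -- foreign key to Part
        O_date TEXT,
        PRIMARY KEY (Sno, Pno)
    )
""")

conn.execute("INSERT INTO Supplier VALUES ('s1', 'Acme', 'NY')")
conn.execute("INSERT INTO Part VALUES ('p1', 'screw', 'red')")
conn.execute("INSERT INTO Supplies VALUES ('s1', 'p1', 'nov 3')")   # both keys exist

# An order that refers to a supplier not present in Supplier is rejected.
try:
    conn.execute("INSERT INTO Supplies VALUES ('s9', 'p1', 'nov 10')")
    orphan_rejected = False
except sqlite3.IntegrityError as exc:
    orphan_rejected = True
    print("rejected:", exc)

supplies_count = conn.execute("SELECT COUNT(*) FROM Supplies").fetchone()[0]
```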

Page 27: Database Principles Relational Database Design I.

What are the Foreign Keys?

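Cardholder
borrowerid  b_name  b_addr  b_status  loan_limit

Reserves
borrowerid  isbn  r_date

Book
isbn  author  title  pub_name  pub_date  c_price

Copy
accession_no  isbn  p_price

Borrows
borrowerid  accession_no  l_date

Each table has a primary key (pk) marked on the slide.

Foreign keys (five fk marks on the slide): borrowerid and isbn in Reserves, isbn in Copy, and borrowerid and accession_no in Borrows. By the rule on the previous slide, each is the primary key of another table: Cardholder, Book, Book, Cardholder and Copy respectively.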