1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref :...

67
1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : - A First Course in Database System (Jeffrey D Ullman, Jennifer Widom) + online.

Transcript of 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref :...

Page 1: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

1

IT 244 Database Management System

Topic 7

The Relational Data Model

Functional Dependencies

Ref : - A First Course in Database System (Jeffrey D Ullman, Jennifer Widom) + online.

Page 2: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

2

Functional Dependencies

Science is the knowledge of consequences, and dependence of one fact upon another.

Thomas Hobbes

(1588-1679)

Page 3: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

3

Functional Dependencies

• Functional Dependence often abbreviated FD is a unique-value constrain and the most important type of constraint we use for relational schema design.

• The requirement that a data item be dependent on the entire primary key and not just a part of the primary key.

Page 4: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

4

How?

• It is also possible for database designers to produce relational schemas directly from application requirements although doing so can be very difficult.

• Regardless of how relational designer are produced, we shall see that frequently it is possible to improve designs systematically based on certain types of constraints.

Page 5: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

5

It is vital?

• Knowledge of this type of constraint is vital for the redesigning of database schemas

• What it does? - it eliminate redundancy.• There are also some other kinds of

constraints that help us design good databases schemas such as multivalued dependencies and referential integrity constraints and others.

Page 6: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

6

The Evils of Redundancy• Redundancy is at the root of several

problems associated with relational schemas:– redundant storage, insert/delete/update

anomalies• Integrity constraints, in particular functional

dependencies, can be used to identify schemas with such problems and to suggest refinements.

• Main refinement technique: decomposition – replacing ABCD with, say, AB and BCD, or ACD and

ABD.• Decomposition should be used judiciously:

– Is there reason to decompose a relation?– What problems (if any) does the decomposition cause?

Page 7: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

7

Functional Dependency

• A functional dependency (FD) on a relation R is a statement of the form “If two tuples of R agree on attributes A1 , A2 , …. An ( ie the tuples have the same values in their respective components for each of these attributes) then they must agree on another attribute B”. (Book definition)

Page 8: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

8

In other words:

• X Y means Given any two tuples in r, if the X values

are the same, then the Y values must also be the same. (but not vice versa)

• Can read “” as “determines

Page 9: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

9

In other words cont’• An FD is a statement about all allowable relations.

– Must be identified based on semantics of application.– Given some instance r1 of R, we can check if r1 violates

some FD f, but we cannot determine if f holds over R.

• Question: How related to keys?• if “K all attributes of R” then K is a superkey for R

(does not require K to be minimal.)

• FDs are a generalization of keys.

Page 10: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

10

Expression

• R represent Relation

• A1, A2, A3 … An represent attributes

• X, Y, Z represent sets of attributes

• Letters at the beginning of the alphabet represent single attributes, and letters at the end of the alphabet represent a set of attributes.

Page 11: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

11

• Formally we write FD as A1 , A2 , … An -> B and say that A1 , A2 , … , An functionally determine B.

• A set of attributes A1 , A2 , … , An functionally determines more than one attributes

A1 , A2 , … , An -> B1

A1 , A2 , … , An -> B2

……..

A1 , A2 , … , An -> Bm

For short we write this set of FD’s as

A1 , A2 , … , An -> B1B2 … Bm

Page 12: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

12

In other words

• X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, then they must also agree on the attribute A.– Say “X -> A holds in R.”– Notice convention: …,X, Y, Z represent sets of

attributes; A, B, C,… represent single attributes.– Convention: no set formers in sets of attributes,

just ABC, rather than {A,B,C }.

Page 13: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

13

What this FD’s suggest

Page 14: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

14

Example 1• Movie (title, year, length, filmType, studioName, starName)• Instance of what we reproduce is

Title Year Length film Type studio Name star Name

Star wars 1977 124 Color Fox Carrie Fisher

Star wars 1977 124 Color Fox Mark Hamili

Star Wars 1977 124 Color Fox Harrison Ford

Mighty Ducks 1991 104 Color Disney Emilio Estevez

Wayne’s World 1992 95 Color Paramount Dana Carvey

Wayne’s World 1992 95 Color Paramount Mike Mayers

Several FD’s that we can reasonably assert about this Movie Relations

title year -> length

title year -> filmType

title year -> studioName

Page 15: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

15

What does this FD’s Says?• Informally it says that if two tuples have the same value in

their title components, and they also have the same value in their year components then these two tuples must have the same values in their length component, the same value in their filmtype component and the same values in their studioName component. This assertion makes sense if we remember the original design from which this relation schema was developed.

• Attributes title and year form a key for the Movie entity set. Thus we expect that given a title and year, there is a unique movie, therefore there is a unique length and a unique film type for the movie. There is a many to one relationship from Movies to Studios. Consequently, we expect that given a movie there is only one owing studio.

Page 16: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

16

What does this FD’s Says?

• On the other hand, we observe that the statement

title year -> starsName is false, it is not functionally dependency. Given a movie, it is entirely possible that there is more than one star for the movie listed in our database.

Page 17: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

17

Example 2

• Drinkers(name, addr, beersLiked, manf, favBeer).

• Think of a:

Reasonable FD’s to assert:

Page 18: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

18

• Drinkers(name, addr, beersLiked, manf, favBeer).

• Reasonable FD’s to assert:1.name -> addr

2.name -> favBeer

3.beersLiked -> manf

Page 19: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

19

Example Data

name addr beersLiked manf favBeerVili Kalau VB A.B IkaleVili Kalau Ikale R.B IkaleTolu Manima VB A.B VB

Because name -> addr Because name -> favBeer

Because beersLiked -> manf

Page 20: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

20

Exercise

• Look at your own database, see if you can find examples of FD’s from it.

Page 21: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

21

Functional Dependencies Tell us about the Schema

• Recall that FD like any constraint is an assertion about the schema of a relation, not about a particular instance.

• If we look at an instance we cannot tell for certain that a FD holds

we might suppose that a FD like title -> filmtype holds because for every tuple in this particular instance of the relation Movie it happens that any two tuples agreeing on title also agree on filmtype- two version of King Kong –one is color and one is black and white

Page 22: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

22

FD’s With Multiple Attributes

• No need for FD’s with > 1 attribute on right.– But sometimes convenient to combine FD’s

as a shorthand.– Example: name -> addr and name -> favBeer

become name -> addr favBeer

• > 1 attribute on left may be essential.– Example: bar beer -> price

Page 23: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

23

Another Example

• Attributes {title, year, starName} form a key for the relation Movies. But first we must show that they functionally determine all the other attributes. That is suppose two tuples agree on these three attributes: title, year, and starName. Because they agree on title and year, they must agree on the other attributes – length, filmType and studioName. Thus, two different tuples can not agree on all of title, year and starName they would in fact be the same tuple,

Page 24: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

24

• Now we must agree that no proper subset of {title, year, starName} functionally determines all other attributes.

• Why?- begin by observing that title and year do not determine starName, because many movies have more than one star. Thus {title and year} is not a key – {year, starName} is not a key because we could have a star in two moives in the same year: therefor year starName -> title is not a FD. Also we claim that {title, starName} is not a key because two movies with the same title made in different years, occasionally have a star in common.

Page 25: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

25

Therefore…

Several FD’s that we can reasonably assert about the Movie Relations

title year -> length

title year -> filmType

title year -> studioName

Since each have the same left side title and year we summarize them like

Title year -> length filmType studioName.

Page 26: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

26

Keys of Relations

• We say a set of one or more attributes {A1 , A2 , … An } is a key for a relation R if:

• 1. Those attributes functionally determine all other attributes of the relation. That is because relations are sets, it is

impossible for two distinct tuples of R to agree on all of A1 , A2 , … An

Page 27: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

27

• 2. No proper subset of {A1 , A2 , … An } functionally determines all

other attributes of R; ie. A key must be minimal

• * When a key consists of a single attribute A, we often say that A (rather than {A}) is a key.

Page 28: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

28

Short form…

• K is a key for relation R if:– Set K functionally determines all attributes

of R– For no proper subset of K is (1) true.

• If K satisfies (1), but perhaps not (2), then K is a superkey.

• Note E/R keys have no requirement for minimality, as in (2) for relational keys.

Page 29: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

29

Superkeys

• Superkeys

A set of attributes that contain a key is called a superkey, short for superset of a key. Thus every key is a superkey.

Note that every superkey satisfies the first condition of a key; it functionally determines all other attributes of the relation. However, a superkey need not satisfy the second condition: minimality.

Page 30: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

30

Example

• Consider relation Drinkers(name, addr, beersLiked, manf, favBeer).

• {name, beersLiked} is a superkey because together these attributes determine all the other attributes.– name -> addr favBeer– beersLiked -> manf

Page 31: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

31

Example, Cont.

• {name, beersLiked} is a key because neither {name} nor {beersLiked} is a superkey.– name doesn’t -> manf; beersLiked doesn’t ->

addr.

• In this example, there are no other keys, but lots of superkeys.– Any superset of {name, beersLiked}.

Page 32: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

32

Discovering Keys for Relations• When a relation schema was developed by

converting an E/R design to relations, we can often predict the key of the relation.

• Rule 1If the relation comes from an entity set then the key for the relation is the key attributes of this entity set.Eg. Entity set Movies and Stars could be

converted to relations. The keys for these entity sets were {title, year} and {name}

Movies (title, year, length, filmtype)Stars (name, address)

Page 33: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

33

• Second Rule

concerns binary relationship – if a relation R is constructed form a relationship, then the multiplicity of the relationship affects the key for R. There are three cases

- If the relationship is many-many, then the keys of both connected entity sets are the key attributes for R

- If the relationship is a many-one from entity set E1 to entity set E2, then the key attributes of E1 are key attributes of R, but those of E2 are not.

- If the relationship is one-one, then the key attributes for either of the connected entity sets are key attributes of

R, Thus there is not a unique key for R.

Page 34: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

34

E/R and Relational Keys

• Keys in E/R are properties of entities• Keys in relations are properties of tuples.• Usually, one tuple corresponds to one entity, so

the ideas are the same.• But --- in poor relational designs, one entity can

become several tuples, so E/R keys and Relational keys are different.

Page 35: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

35

Example Data

Relational key = name beersLikedBut in E/R, name is a key for Drinkers, and beersLiked is a key

for Beers.Note: 2 tuples for Vili entity and 2 tuples for VB entity.

name addr beersLiked manf favBeerVili Kalau VB A.B IkaleVili Kalau Ikale R.B IkaleTolu Manima VB A.B VB

Page 36: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

36

A continuous Example!!!

Constraints on Entity Set • Consider relation obtained from Hourly_Emps: Hourly_Emps (ssn, name, lot, rating,

wage_per_hr, hrs_per_wk)

• We sometimes denote a relation schema by listing the attributes: e.g., SNLRWH

• This is really the set of attributes {S,N,L,R,W,H}.• Sometimes, we refer to the set of all attributes of

a relation by using the relation name. e.g., “Hourly_Emps” for SNLRWH

Page 37: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

37

A continuous Example!!! count’

What are some FDs on Hourly_Emps?– ssn is the key: S SNLRWH – rating determines wage_per_hr: R W– lot determines lot: L L (“trivial”

dependnency)

Know Trivial Dependency? Find!!!

Page 38: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

38

A continuous Example!!!

• Update anomaly: Can we modify W in only the 1st tuple of SNLRWH?

• Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his or her rating? (or we get it wrong?)

• Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!

S N L R W H

123-22-3666 Attishoo 48 8 10 40

231-31-5368 Smiley 22 8 10 30

131-24-3650 Smethurst 35 5 7 30

434-26-3751 Guldu 35 5 7 32

612-67-4134 Madayan 35 8 10 40

Hourly_Emps

Problems Due to R W

Page 39: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

39

A continuous Example!!!

S N L R W H

123-22-3666 Attishoo 48 8 10 40

231-31-5368 Smiley 22 8 10 30

131-24-3650 Smethurst 35 5 7 30

434-26-3751 Guldu 35 5 7 32

612-67-4134 Madayan 35 8 10 40

Hourly_Emps

Q: Why was R W problematic, but S W not?

Detecting Redundancy

Page 40: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

40

A continuous Example!!!

• Redundancy can be removed by “chopping” the relation into pieces.

• FD’s are used to drive this process.R W is causing the problems, so decompose SNLRWH

into what relations?

S N L R H

123-22-3666 Attishoo 48 8 40

231-31-5368 Smiley 22 8 30

131-24-3650 Smethurst 35 5 30

434-26-3751 Guldu 35 5 32

612-67-4134 Madayan 35 8 40

R W

8 10

5 7

Hourly_Emps2

Wages

Decomposing a Relation

Page 41: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

41

Refining an ER Diagram• 1st diagram becomes:

Workers(S,N,L,D,Si) Departments(D,M,B)– Lots associated with

workers.• Suppose all workers in

a dept are assigned the same lot: D L

• Redundancy; fixed by: Workers2(S,N,D,Si) Dept_Lots(D,L) Departments(D,M,B)

• Can fine-tune this: Workers2(S,N,D,Si) Departments(D,M,B,L)

lot

dname

budgetdid

sincename

Works_In DepartmentsEmployees

ssn

lot

dname

budget

did

sincename

Works_In DepartmentsEmployees

ssn

Before:

After:

Page 42: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

42

Where Do Keys Come From?

1. We could simply assert a key K. Then the only FD’s are K -> A for all attributes A, and K turns out to be the only key obtainable from the FD’s.

2. We could assert FD’s and deduce the keys by systematic exploration.

E/R gives us FD’s from entity-set keys and many-one relationships.

Page 43: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

43

FD’s From “Physics”

• While most FD’s come from E/R Keynes and many-one relationships, some are really physical laws.

• Example: “no two courses can meet in the same room at the same time” tells us: hour room -> course.

Page 44: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

44

Inferring FD’s: Motivation

• In order to design relation schemas well, we often need to tell what FD’s hold in a relation.

• We are given FD’s X1 -> A1, X2 -> A2,…, Xn -> An , and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD’s.– Example: If A -> B and B -> C hold, surely A -

> C holds, even if we don’t say so.

Page 45: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

45

Reasoning About FDs• Given some FDs, we can usually infer additional

FDs:title studio, star implies title studio and title star title studio and title star implies title studio, startitle studio, studio star implies title star

But, title, star studio does NOT necessarily imply that

title studio or that star studio• An FD f is implied by a set of FDs F if f holds

whenever all FDs in F hold.

• F+ = closure of F is the set of all FDs that are implied by F. (includes “trivial dependencies”)

Page 46: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

46

Finding All Implied FD’s

• Motivation: “normalization,” the process where we break a relation schema into two or more schemas.

• Example: ABCD with FD’s AB ->C, C ->D, and D ->A.– Decompose into ABC, AD. What FD’s hold

in ABC ?– Not only AB ->C, but also C ->A !

Page 47: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

47

Basic Idea

• To know what FD’s hold in a projection, we start with given FD’s and find all FD’s that follow from given ones.

• Then, restrict to those FD’s that involve only attributes of the projected schema.

Page 48: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

48

Rules of Inference• Armstrong’s Axioms (X, Y, Z are sets of attributes):

– Reflexivity: If X Y, then X Y – Augmentation: If X Y, then XZ YZ for any Z– Transitivity: If X Y and Y Z, then X Z

• These are sound and complete inference rules for FDs!– i.e., using AA you can compute all the FDs in F+ and only

these FDs.

• Some additional rules (that follow from AA):– Union: If X Y and X Z, then X YZ– Decomposition: If X YZ, then X Y and X Z

Page 49: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

49

Inference Test

• To test if Y -> B, start assuming two tuples agree in all attributes of Y.

• Use the given FD’s to infer that these tuples must also agree in certain other attributes.

• If B is eventually found to be one of these attributes, then Y -> B is true; otherwise, the two tuples, with any forced equalities form a two-tuple relation that proves Y -> B does not follow from the given FD’s.

Page 50: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

50

Example• Contracts(cid,sid,jid,did,pid,qty,value), and:

– C is the key: C CSJDPQV– Proj purchases each part using single contract: JP C– Dept purchases at most 1 part from a supplier: SD P

• Problem: Prove that SDJ is a key for Contracts• JP C, C CSJDPQV imply JP CSJDPQV

(by transitivity) (shows that JP is a key)

• SD P implies SDJ JP (by augmentation)• SDJ JP, JP CSJDPQV imply SDJ CSJDPQV (by transitivity) thus SDJ is a key.

Q: can you now infer that SD CSDPQV (i.e., drop J on both sides)?

No! FD inference is not like arithmetic multiplication.

Page 51: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

51

Attribute Closure

• Computing the closure of a set of FDs can be expensive. (Size of closure is exponential in # attrs!)

• Typically, we just want to check if a given FD X Y is in the closure of a set of FDs F. An efficient check:– Compute attribute closure of X (denoted X+) wrt F. X+

= Set of all attributes A such that X A is in F+

• X+ := X• Repeat until no change: if there is an fd U V in F such that U

is in X+, then add V to X+

– Check if Y is in X+

– Approach can also be used to find the keys of a relation.• If all attributes of R are in the closure of X then X is a

superkey for R.• Q: How to check if X is a “candidate key”?

Page 52: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

52

Closure Test

• An easier way to test is to compute the closure of Y, denoted Y +.

• Basis: Y + = Y.

• Induction: Look for an FD’s left side X that is a subset of the current Y +. If the FD is X -> A, add A to Y +.

Page 53: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

53

Attribute Closure (example)• R = {A, B, C, D, E}• F = { B CD, D E, B A, E C, AD B }• Is B E in F+ ?• B+ = B

– B+ = BCD– B+ = BCDA– B+ = BCDAE … Yes! and

B is a key for R too!• Is D a key for R?

– D+ = D– D+ = DE– D+ = DEC – … Nope!

• Is AD a key for R? AD+ = ADAD+ = ABD and B is a key, so Yes!

• Is AD a candidate key for R?A+ = A, D+ = DEC… A,D not keys, so Yes!

• Is ADE a candidate key for R?

… No! AD is a key, so ADE is a superkey, but not a cand. key

Page 54: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

54

Y+new Y+

X A

Page 55: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

55

Simple, Exponential Algorithm

1. For each set of attributes X, compute X +.

2. Add X ->A for all A in X + - X.

3. However, drop XY ->A whenever we discover X ->A.

Because XY ->A follows from X ->A.

4. Finally, use only FD’s involving projected attributes.

Page 56: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

56

A Few Tricks

• Never need to compute the closure of the empty set or of the set of all attributes.

• If we find X + = all attributes, don’t bother computing the closure of any supersets of X.

Page 57: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

57

Example

• ABC with FD’s A ->B and B ->C. Project onto AC.– A +=ABC ; yields A ->B, A ->C.

• We do not need to compute AB + or AC +.

– B +=BC ; yields B ->C.– C +=C ; yields nothing.– BC +=BC ; yields nothing.

Page 58: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

58

Example, Continued

• Resulting FD’s: A ->B, A ->C, and B ->C.

• Projection onto AC : A ->C.– Only FD that involves a subset of {A,C }.

Page 59: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

59

A Geometric View of FD’s

• Imagine the set of all instances of a particular relation.

• That is, all finite sets of tuples that have the proper number of components.

• Each instance is a point in this space.

Page 60: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

60

Example: R(A,B)

{(1,2), (3,4)}

{}

{(1,2), (3,4), (1,3)}

{(5,1)}

Page 61: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

61

An FD is a Subset of Instances

• For each FD X -> A there is a subset of all instances that satisfy the FD.

• We can represent an FD by a region in the space.

• Trivial FD : an FD that is represented by the entire space.

– Example: A -> A.

Page 62: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

62

Example: A -> B for R(A,B)

{(1,2), (3,4)}

{}

{(1,2), (3,4), (1,3)}

{(5,1)}A -> B

Page 63: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

63

Representing Sets of FD’s

• If each FD is a set of relation instances, then a collection of FD’s corresponds to the intersection of those sets.– Intersection = all instances that satisfy all of

the FD’s.

Page 64: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

64

Example

A->BB->C

CD->A

Instances satisfyingA->B, B->C, andCD->A

Page 65: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

65

Implication of FD’s

• If an FD Y -> B follows from FD’s X1 -> A1,…, Xn -> An , then the region in the space of instances for Y -> B must include the intersection of the regions for the FD’s Xi -> Ai .– That is, every instance satisfying all the

FD’s Xi -> Ai surely satisfies Y -> B.– But an instance could satisfy Y -> B, yet

not be in this intersection.

Page 66: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

66

Example

A->B B->CA->C

Page 67: 1 IT 244 Database Management System Topic 7 The Relational Data Model Functional Dependencies Ref : -A First Course in Database System (Jeffrey D Ullman,

67

Summary

• Functional Dependencies –definition, -expressions, -applications

• FD’s with multiple attributes• Keys of Relations• Superkeys• Discovering Keys of a Relations• Inferring FD’s• Geometric View of FD’s