Logical Database Design (2 of 3)
John Ortiz
Lecture 7 Logical Database Design (2) 3
Finding All Candidate Keys (cont.)
Method 2 (manual approach):
Step 1: Draw the dependency graph of F. Each vertex corresponds to an attribute. Edges are defined as follows:
A -> B becomes an edge from A to B
A -> BC becomes an edge from A to B and an edge from A to C
AB -> C becomes edges from A and from B to C (drawn as one joined arrow)

Finding All Candidate Keys (cont.)
Step 2: Identify the set of vertices Vni that have no incoming edges.
Step 3: Identify the set of vertices Voi that have only incoming edges.
Step 4: A candidate key is a set of attributes whose closure is all of R and that
contains all attributes in Vni
contains no attribute in Voi
has no subset that is already a candidate key

An Example Using Method 2
Consider R(A, B, C, G, H, I), and F = {A -> BC, CG -> HI, B -> H}
Vni = {A, G}, Voi = {H, I}.
Since (AG)+ = ABCGHI, AG is the only candidate key of R.
[Figure: dependency graph of F over the vertices A, B, C, G, H, I]

Another Example Using Method 2
Consider R(A, B, C, D, E, H), and F = {A -> B, AB -> E, BH -> C, C -> D, D -> A}
Vni = {H}, Voi = {E}.
Candidate keys: AH, BH, CH, DH.
[Figure: dependency graph of F over the vertices A, B, C, D, E, H]
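Method 2 can be sketched in Python as follows. This is a minimal illustration, not part of the lecture: the names closure and candidate_keys are made up here, and each attribute is assumed to be a single character so that an FD such as CG -> HI can be written as the pair ("CG", "HI").

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Method 2: always include Vni, never use Voi, add the remaining attributes as needed."""
    schema = set(schema)
    on_left = set().union(*(set(lhs) for lhs, _ in fds))
    on_right = set().union(*(set(rhs) for _, rhs in fds))
    v_ni = schema - on_right       # no incoming edges: part of every key
    v_oi = on_right - on_left      # only incoming edges: part of no key
    middle = sorted(schema - v_ni - v_oi)
    keys = []
    for size in range(len(middle) + 1):          # smallest candidates first
        for extra in combinations(middle, size):
            cand = v_ni | set(extra)
            if any(key <= cand for key in keys):
                continue                         # a subset is already a key
            if closure(cand, fds) == schema:
                keys.append(cand)
    return sorted("".join(sorted(key)) for key in keys)

# The two examples from the slides above.
print(candidate_keys("ABCGHI", [("A", "BC"), ("CG", "HI"), ("B", "H")]))
print(candidate_keys("ABCDEH", [("A", "B"), ("AB", "E"), ("BH", "C"),
                                ("C", "D"), ("D", "A")]))
```

Trying candidates in order of increasing size is what guarantees minimality: any set that contains an earlier key is skipped.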

Normal Forms
If a relation is in a certain normal form (BCNF, 3NF, …), certain types of redundancy are known to be avoided/eliminated.
A relation schema R is in First Normal Form (1NF) if every attribute of R takes only single and atomic values.
Every relation is in 1NF
1NF allows all kinds of redundancy
Higher normal forms are defined in terms of FDs.

Second Normal Form (2NF)
Let F be a set of FDs satisfied by R.
An attribute of R is prime if it appears in a candidate key (according to F) of R.
Y is fully functionally dependent on X if F implies X -> Y, but not W -> Y for any W ⊂ X.
R is in Second Normal Form (2NF) if every non-prime attribute of R is fully functionally dependent on every candidate key.
If a part of a candidate key can determine a non-prime attribute, R is not in 2NF.

2NF: Examples
(1) Consider F = {B -> AH, L -> CAt} over relation Bank-Loans (Bank, Assets, Headquarter, Loan#, Customer, Amount)
B -> A is in F+, where A is non-prime, and B is only part of the candidate key BL, not a candidate key itself.
Bank-Loans is not in 2NF.
(2) Consider F = {S -> NMG, M -> AO} over Students (SID, Name, Major, GPA, Advisor, Office)
S is the only candidate key, and has a single attribute.
Students is in 2NF.
2NF relations still allow unwanted redundancy

Another Definition of 2NF
R is in 2NF if for every FD X -> Y in F+,
Y ⊆ X (trivial); or
every attribute in Y is prime; or
X is not a proper subset of any candidate key.
R is in 2NF if every candidate key is a single attribute
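The 2NF test can be sketched as a search for partial dependencies. The code below is an illustration under stated assumptions: the function names are hypothetical, FDs are pairs of attribute-name lists, and candidate keys are found by brute force, which is only practical for small schemas such as the two examples above.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) pairs of attribute-name lists."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute force: minimal attribute sets whose closure is the whole schema."""
    schema, keys = frozenset(schema), []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            cand = frozenset(combo)
            if not any(k <= cand for k in keys) and closure(cand, fds) == schema:
                keys.append(cand)
    return keys

def partial_dependencies(schema, fds):
    """List (part_of_key, non_prime_attributes_it_determines) pairs."""
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):          # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                dependents = closure(part, fds) - set(part) - prime
                if dependents:
                    found.append((set(part), dependents))
    return found

def is_2nf(schema, fds):
    return not partial_dependencies(schema, fds)

bank = ["Bank", "Assets", "Headquarter", "Loan#", "Customer", "Amount"]
bank_fds = [(["Bank"], ["Assets", "Headquarter"]), (["Loan#"], ["Customer", "Amount"])]
students = ["SID", "Name", "Major", "GPA", "Advisor", "Office"]
student_fds = [(["SID"], ["Name", "Major", "GPA"]), (["Major"], ["Advisor", "Office"])]

print(is_2nf(bank, bank_fds), partial_dependencies(bank, bank_fds))
print(is_2nf(students, student_fds))
```

A single-attribute key has no proper non-empty subset, so the inner loops never run for Students, matching the last rule on the slide.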

Third Normal Form (3NF)
Let F be a set of FDs satisfied by R.
R is in Third Normal Form (3NF) if for every FD X -> A in F+,
(a) A ⊆ X (trivial); or
(b) every attribute in A is prime; or
(c) X is a superkey.
Let X be a candidate key. If Y -> B ∈ F+, B ∉ Y, B is non-prime, and Y is not a superkey, then B is non-trivially transitively dependent on X. 3NF removes this dependency.

3NF: Examples
(1) Consider F = {S -> NASaDn, Dn -> Ds} over Employees (SSN, Name, Age, Salary, Dept_name, Dept_manager_SSN)
Employees is not in 3NF due to Dn -> Ds.
(2) Consider F = {CS -> Z, Z -> C} over R(City, Street, Zipcode)
R is in 3NF as each attribute is prime (How many candidate keys are there?).
3NF may still have redundancy (introduced by Z -> C)

Boyce-Codd Normal Form (BCNF)
Let F be a set of FDs over R.
R is in Boyce-Codd Normal Form (BCNF) if for every FD X -> A in F+,
(a) A ⊆ X (trivial); or
(b) X is a superkey.
Example: Consider R(City, Street, Zipcode) and F = {CS -> Z, Z -> C}. R is in 3NF but not in BCNF because in Z -> C, Z is not a superkey.
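The 3NF and BCNF conditions differ only in the "every attribute is prime" escape clause, which a short sketch makes visible. The function names below are hypothetical, and the check inspects only the FDs listed in F rather than all of F+, which is the usual shortcut when testing a whole relation rather than a decomposed fragment.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) pairs of attribute-name lists."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def prime_attributes(schema, fds):
    """Union of all candidate keys, found by brute force over subsets of the schema."""
    schema, keys = set(schema), []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            cand = set(combo)
            if not any(k <= cand for k in keys) and closure(cand, fds) == schema:
                keys.append(cand)
    return set().union(*keys)

def violations(schema, fds, form):
    """Return the FDs in F that break `form`, which is "3NF" or "BCNF"."""
    schema = set(schema)
    prime = prime_attributes(schema, fds)
    bad = []
    for lhs, rhs in fds:
        new = set(rhs) - set(lhs)                 # ignore the trivial part
        if not new or closure(lhs, fds) == schema:
            continue                              # trivial FD, or LHS is a superkey
        if form == "3NF" and new <= prime:
            continue                              # allowed by 3NF only
        bad.append((lhs, rhs))
    return bad

city = ["City", "Street", "Zipcode"]
city_fds = [(["City", "Street"], ["Zipcode"]), (["Zipcode"], ["City"])]
emp = ["SSN", "Name", "Age", "Salary", "Dept_name", "Dept_manager_SSN"]
emp_fds = [(["SSN"], ["Name", "Age", "Salary", "Dept_name"]),
           (["Dept_name"], ["Dept_manager_SSN"])]

print(violations(city, city_fds, "3NF"))    # no violating FD
print(violations(city, city_fds, "BCNF"))   # Zipcode -> City
print(violations(emp, emp_fds, "3NF"))      # Dept_name -> Dept_manager_SSN
```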

Normal Forms: Summary
BCNF ⊂ 3NF ⊂ 2NF ⊂ 1NF
2NF removes some insertion anomalies and deletion anomalies. Also removes redundancies caused by partial dependencies on a key.
3NF removes all insertion anomalies and deletion anomalies. Also removes redundancies caused by transitive dependencies.
BCNF achieves all that is achieved by 3NF, and removes all redundancies caused by FDs.

Unnormalized
SSN --> Name, Age, Address, PetID, PetName, PetAge, Type, License#, Vehicle, Color, VehPrice, Year
SSN --> Name, Age, Address
PetID --> PetName, PetAge, Type
License# --> Vehicle, Color, VehPrice, Year
Vehicle --> VehPrice
SSN : PetID :: 1 : M
SSN : License# :: M : M
EMPLOYEES
SSN | Name | Age | Address | PetID | PetName | PetAge | Type | License# | Vehicle | Color | VehPrice (K) | Year
111 | joe | 43 | 72 R | D2, L1 | buddy, snipper | 1, 2 | dog, lizard | LN 03 | van | grn | 25 | 1991
123 | joe | 22 | 57 R | bp1 | | | | bl1 | | | |
222 | steve | 32 | 12 C | C1, P1, L4 | fluffy, pete, lenny | 1, 2, 1 | cat, parot, lizard | LN 01, LN 09 | viper, celica | red, yel | 70, 29 | 1999, 1987
234 | jim | 35 | 18 C | C2 | sassy | 1 | cat | | | | |
333 | fred | 21 | 12 Q | F1, L2 | herman, vinny | 1, 2 | frog, lizard | LN 04, LN 06 | jeep, wagon | blu, red | 28, 10 | 1995, 1975
343 | bob | 17 | 15 H | F2, S1, S2 | freddy, sneaky, sulky | 3, 2, 2 | frog, snake, snake | LN 14 | truck | blu | 28 | 1982
444 | ann | 21 | 32 F | D1, D4 | fido, arfy | 3, 3 | dog, dog | | | | |
555 | ann | 21 | 32 F | C3 | cotton | 4 | cat | LN 05, LN 15 | SUV, SUV | yel, red | 35, 35 | 1997, 1996
777 | sally | 25 | 54 Z | D3 | mutz | 5 | dog | LN 07 | jeep | blu | 28 | 1995
788 | sally | 24 | 54 Z | D5 | mutz2 | 4 | dog | LN 18 | camry | wht | 23 | 1998
789 | tasha | 27 | 54 Z | | | | | LN 08 | mustang | red | 28 | 1991
987 | elena | 51 | 12 Q | L3 | lizzy | 3 | lizard | LN 06 | wagon | red | 5 | 1975

1NF
SSN, PetID, License# --> Name, Age, Address, PetName, PetAge, Type, Vehicle, Color, VehPrice, Year
SSN --> Name, Age, Address
PetID --> PetName, PetAge, Type
License# --> Vehicle, Color, VehPrice, Year
Vehicle --> VehPrice
SSN : PetID :: 1 : M
SSN : License# :: M : M
EMPLOYEES
SSN | Name | Age | Address | PetID | PetName | PetAge | Type | License# | Vehicle | Color | VehPrice (K) | Year
111 | joe | 43 | 72 R | D2 | buddy | 1 | dog | LN 03 | van | grn | 25 | 1991
111 | joe | 43 | 72 R | L1 | snipper | 2 | lizard | LN 03 | van | grn | 25 | 1991
123 | joe | 22 | 57 R | bp1 | | | | bl1 | | | |
222 | steve | 32 | 12 C | C1 | fluffy | 1 | cat | LN 01 | viper | red | 70 | 1999
222 | steve | 32 | 12 C | P1 | pete | 2 | parot | LN 09 | celica | yel | 29 | 1987
222 | steve | 32 | 12 C | L4 | lenny | 1 | lizard | LN 09 | celica | yel | 29 | 1987
234 | jim | 35 | 18 C | C2 | sassy | 1 | cat | bl2 | | | |
333 | fred | 21 | 12 Q | F1 | herman | 1 | frog | LN 04 | jeep | blu | 28 | 1995
333 | fred | 53 | 12 Q | L2 | vinny | 2 | lizard | LN 06 | wagon | red | 10 | 1975
343 | bob | 17 | 15 H | F2 | freddy | 3 | frog | LN 14 | truck | blu | 28 | 1982
343 | bob | 17 | 15 H | S1 | sneaky | 2 | snake | LN 14 | truck | blu | 28 | 1982
343 | bob | 17 | 15 H | S2 | sulky | 2 | snake | LN 14 | truck | blu | 28 | 1982
444 | ann | 21 | 32 F | D1 | fido | 3 | dog | bl3 | | | |
444 | ann | 21 | 32 F | D4 | arfy | 3 | dog | bl4 | | | |
555 | ann | 21 | 32 F | C3 | cotton | 4 | cat | LN 05 | SUV | yel | 35 | 1997
555 | ann | 21 | 32 F | C3 | cotton | 5 | cat | LN 15 | SUV | red | 35 | 1996
777 | sally | 25 | 54 Z | D3 | mutz | 5 | dog | LN 07 | jeep | blu | 28 | 1995
788 | sally | 24 | 54 Z | D5 | mutz2 | 4 | dog | LN 18 | camry | wht | 23 | 1998
789 | tasha | 27 | 54 Z | bp2 | | | | LN 08 | mustang | red | 28 | 1991
987 | elena | 51 | 12 Q | L3 | lizzy | 3 | lizard | LN 06 | wagon | red | 5 | 1975

Redundancy Unleashed
SSN, PetID, License# --> Name, Age, Address, PetName, PetAge, Type, Vehicle, Color, VehPrice, Year
SSN --> Name, Age, Address
PetID --> PetName, PetAge, Type
License# --> Vehicle, Color, VehPrice, Year
Vehicle --> VehPrice
SSN : PetID :: 1 : M
SSN : License# :: M : M
LEGEND: redundant; inconsistent; redundant for 2 reasons
EMPLOYEES
SSN | Name | Age | Address | PetID | PetName | PetAge | Type | License# | Vehicle | Color | VehPrice (K) | Year
111 | joe | 43 | 72 R | D2 | buddy | 1 | dog | LN 03 | van | grn | 25 | 1991
111 | joe | 43 | 72 R | L1 | snipper | 2 | lizard | LN 03 | van | grn | 25 | 1991
123 | joe | 22 | 57 R | bp1 | | | | bl1 | | | |
222 | steve | 32 | 12 C | C1 | fluffy | 1 | cat | LN 01 | viper | red | 70 | 1999
222 | steve | 32 | 12 C | P1 | pete | 2 | parot | LN 09 | celica | yel | 29 | 1987
222 | steve | 32 | 12 C | L4 | lenny | 1 | lizard | LN 09 | celica | yel | 29 | 1987
234 | jim | 35 | 18 C | C2 | sassy | 1 | cat | bl2 | | | |
333 | fred | 21 | 12 Q | F1 | herman | 1 | frog | LN 04 | jeep | blu | 28 | 1995
333 | fred | 53 | 12 Q | L2 | vinny | 2 | lizard | LN 06 | wagon | red | 10 | 1975
343 | bob | 17 | 15 H | F2 | freddy | 3 | frog | LN 14 | truck | blu | 28 | 1982
343 | bob | 17 | 15 H | S1 | sneaky | 2 | snake | LN 14 | truck | blu | 28 | 1982
343 | bob | 17 | 15 H | S2 | sulky | 2 | snake | LN 14 | truck | blu | 28 | 1982
444 | ann | 21 | 32 F | D1 | fido | 3 | dog | bl3 | | | |
444 | ann | 21 | 32 F | D4 | arfy | 3 | dog | bl4 | | | |
555 | ann | 21 | 32 F | C3 | cotton | 4 | cat | LN 05 | SUV | red | 35 | 1997
555 | ann | 21 | 32 F | C3 | cotton | 5 | cat | LN 15 | SUV | red | 30 | 1996
777 | sally | 25 | 54 Z | D3 | mutz | 5 | dog | LN 07 | jeep | grn | 28 | 1995
788 | sally | 24 | 54 Z | D5 | mutz2 | 4 | dog | LN 18 | camry | wht | 23 | 1998
789 | tasha | 27 | 54 Z | bp2 | | | | LN 08 | mustang | red | 28 | 1991
987 | elena | 51 | 12 Q | L3 | lizzy | 3 | lizard | LN 06 | wagon | blu | 10 | 1975
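Inconsistencies like the ones in this table can be found mechanically by testing whether an FD holds in the stored rows. The sketch below is illustrative only: fd_violations is a hypothetical helper, and the rows are the vehicle columns of six lines copied from the Redundancy Unleashed table, not the whole relation.

```python
from collections import defaultdict

def fd_violations(rows, lhs, rhs):
    """Group rows by their lhs values; keep groups with more than one rhs value."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        seen[key].add(tuple(row[a] for a in rhs))
    return {key: values for key, values in seen.items() if len(values) > 1}

rows = [
    {"License#": "LN 05", "Vehicle": "SUV",   "Color": "red", "VehPrice": 35, "Year": 1997},
    {"License#": "LN 15", "Vehicle": "SUV",   "Color": "red", "VehPrice": 30, "Year": 1996},
    {"License#": "LN 06", "Vehicle": "wagon", "Color": "red", "VehPrice": 10, "Year": 1975},
    {"License#": "LN 06", "Vehicle": "wagon", "Color": "blu", "VehPrice": 10, "Year": 1975},
    {"License#": "LN 04", "Vehicle": "jeep",  "Color": "blu", "VehPrice": 28, "Year": 1995},
    {"License#": "LN 07", "Vehicle": "jeep",  "Color": "grn", "VehPrice": 28, "Year": 1995},
]

# Vehicle --> VehPrice fails for SUV (35 vs. 30).
print(fd_violations(rows, ["Vehicle"], ["VehPrice"]))
# License# --> Color fails for LN 06 (red vs. blu).
print(fd_violations(rows, ["License#"], ["Color"]))
# License# --> Year holds on these rows, so nothing is reported.
print(fd_violations(rows, ["License#"], ["Year"]))
```

A check like this can only refute an FD on the data at hand; an FD that happens to hold in the current rows is still a design decision, not a proven fact.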

2NF Raw – Part1
SSN --> Name, Age, Address
PetID --> PetName, PetAge, Type, SSN
License# --> Vehicle, Color, VehPrice, Year
Vehicle --> VehPrice
PEOPLE
SSN | Name | Age | Address
111 | joe | 43 | 72 R
123 | joe | 22 | 57 R
222 | steve | 32 | 12 C
234 | jim | 35 | 18 C
333 | fred | 21 | 12 Q
343 | bob | 17 | 15 H
444 | ann | 21 | 32 F
555 | ann | 21 | 32 F
777 | sally | 25 | 54 Z
788 | sally | 24 | 54 Z
789 | tasha | 27 | 54 Z
987 | elena | 51 | 12 Q
PETS
PetID | PetName | PetAge | Type | SSN
D2 | buddy | 1 | dog | 111
L1 | snipper | 2 | lizard | 111
C1 | fluffy | 1 | cat | 222
P1 | pete | 2 | parot | 222
L4 | lenny | 1 | lizard | 222
C2 | sassy | 1 | cat | 234
F1 | herman | 1 | frog | 333
L2 | vinny | 2 | lizard | 333
F2 | freddy | 3 | frog | 343
S1 | sneaky | 2 | snake | 343
S2 | sulky | 2 | snake | 343
D1 | fido | 3 | dog | 444
D4 | arfy | 3 | dog | 444
C3 | cotton | 4 | cat | 555
D3 | mutz | 5 | dog | 777
D5 | mutz2 | 4 | dog | 788
bp2 | | | |
L3 | lizzy | 3 | lizard | 987

2NF Raw – Part2
JT
SSN | License#
111 | LN 03
222 | LN 01
222 | LN 09
333 | LN 04
333 | LN 06
343 | LN 14
555 | LN 05
555 | LN 15
777 | LN 07
788 | LN 18
789 | LN 08
987 | LN 06
VEHICLES
License# | Vehicle | Color | VehPrice (K) | Year
LN 03 | van | grn | 25 | 1991
LN 01 | viper | red | 70 | 1999
LN 09 | celica | yel | 29 | 1987
LN 04 | jeep | blu | 28 | 1995
LN 06 | wagon | red | 10 | 1975
LN 14 | truck | blu | 28 | 1982
LN 05 | SUV | yel | 35 | 1997
LN 15 | SUV | red | 35 | 1996
LN 07 | jeep | blu | 28 | 1995
LN 18 | camry | wht | 23 | 1998
LN 08 | mustang | red | 28 | 1991

2NF Clean – Part1
SSN --> Name, Age, Address
PetID --> PetName, PetAge, Type, SSN
License# --> Vehicle, Color, VehPrice, Year
Vehicle --> VehPrice
PEOPLE
SSN | Name | Age | Address
111 | joe | 43 | 72 R
123 | joe | 22 | 57 R
222 | steve | 32 | 12 C
234 | jim | 35 | 18 C
333 | fred | 21 | 12 Q
343 | bob | 17 | 15 H
444 | ann | 21 | 32 F
555 | ann | 21 | 32 F
777 | sally | 25 | 54 Z
788 | sally | 24 | 54 Z
789 | tasha | 27 | 54 Z
987 | elena | 51 | 12 Q
PETS
PetID | PetName | PetAge | Type | SSN
C1 | fluffy | 1 | cat | 222
C2 | sassy | 1 | cat | 234
C3 | cotton | 4 | cat | 555
D1 | fido | 3 | dog | 444
D2 | buddy | 1 | dog | 111
D3 | mutz | 5 | dog | 777
D4 | arfy | 3 | dog | 444
D5 | mutz2 | 4 | dog | 788
F1 | herman | 1 | frog | 333
F2 | freddy | 3 | frog | 343
L1 | snipper | 2 | lizard | 111
L2 | vinny | 2 | lizard | 333
L3 | lizzy | 3 | lizard | 987
L4 | lenny | 1 | lizard | 222
P1 | pete | 2 | parot | 222
S1 | sneaky | 2 | snake | 343
S2 | sulky | 2 | snake | 343

2NF Clean – Part2
JT
SSN | License#
111 | LN 03
222 | LN 01
222 | LN 09
333 | LN 04
333 | LN 06
343 | LN 14
555 | LN 05
555 | LN 15
777 | LN 07
788 | LN 18
789 | LN 08
987 | LN 06
VEHICLES
License# | Vehicle | Color | VehPrice (K) | Year
LN 01 | viper | red | 70 | 1999
LN 03 | van | grn | 25 | 1991
LN 04 | jeep | blu | 28 | 1995
LN 05 | SUV | yel | 35 | 1997
LN 06 | wagon | red | 10 | 1975
LN 07 | jeep | blu | 28 | 1995
LN 08 | mustang | red | 28 | 1991
LN 09 | celica | yel | 29 | 1987
LN 14 | truck | blu | 28 | 1982
LN 15 | SUV | red | 35 | 1996
LN 18 | camry | wht | 23 | 1998

3NF Clean – Part1
SSN --> Name, Age, Address
PetID --> PetName, PetAge, Type, SSN
License# --> Vehicle, Color, Year
Vehicle --> VehPrice
PEOPLE and PETS hold the same rows as in 2NF Clean – Part1.

3NF Clean – Part2
JT
SSN | License#
111 | LN 03
222 | LN 01
222 | LN 09
333 | LN 04
333 | LN 06
343 | LN 14
555 | LN 05
555 | LN 15
777 | LN 07
788 | LN 18
789 | LN 08
987 | LN 06
VEHICLES
License# | Vehicle | Color | Year
LN 01 | viper | red | 1999
LN 03 | van | grn | 1991
LN 04 | jeep | blu | 1995
LN 05 | SUV | yel | 1997
LN 06 | wagon | red | 1975
LN 07 | jeep | blu | 1995
LN 08 | mustang | red | 1991
LN 09 | celica | yel | 1987
LN 14 | truck | blu | 1982
LN 15 | SUV | red | 1996
LN 18 | camry | wht | 1998
VEH
Vehicle | VehPrice (K)
camry | 23
celica | 29
jeep | 28
mustang | 28
SUV | 35
truck | 28
van | 25
viper | 70
wagon | 10

Normalize the Following Relation
Universal Relation R (A, B, {C, D, K}, E, F(G, H, I), J)
Given: A -> B, C -> DK, E -> F, F -> GHI, K -> EJ
A:C is M:N, C:K is 1:M (C is the many), K:E is 1:M (E is the many)
What do the parentheses indicate?
What do the braces indicate?

E-R Diagram - Unnormalized
[Figure: E-R diagram of the unnormalized relation R with attributes A, B, C, D, E, F, G, H, I, J, K]

Normalize the Following Relation
Universal Relation R (A, B, {C, D, K}, E, F(G, H, I), J)
Given: A -> B, C -> DK, E -> F, F -> GHI, K -> EJ
Step 1: Remove any composite attributes
Either determine that the level of detail provided by G, H, I is unnecessary OR remove F
For our purposes we will remove F

Normalize the Following Relation
New Universal Relation R (A, B, {C, D, K}, E, G, H, I, J)
Given: A -> B, C -> DK, E -> GHI, K -> EJ
Step 2: Remove any multi-valued attributes
If there is a determinant within the MV attributes, make it part of the key
AC -> BDK

Proof
Given: A -> B
(IR2) AC -> BC (augmentation)
(IR4) AC -> B (decomposition)
Given: C -> DK
(IR2) AC -> ADK (augmentation)
(IR4) AC -> DK (decomposition)
(IR5) AC -> BDK (union)
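The same conclusion can be cross-checked without applying the inference rules by hand: an FD X -> Y follows from F exactly when Y is contained in the closure of X. A minimal sketch, with hypothetical function names and each attribute written as a single character:

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)

fds = [("A", "B"), ("C", "DK")]        # the two given FDs used in the proof
print(sorted(closure("AC", fds)))      # attributes determined by AC
print(implies(fds, "AC", "BDK"))       # the FD derived in the proof
print(implies(fds, "A", "D"))          # A alone does not determine D
```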

1NF
1NF Universal Relation R
R(A, B, C, D, E, G, H, I, J, K)
Given: AC -> BD, A -> B, C -> DK, E -> GHI, K -> EJ
Find all Candidate Keys:
Vni (A C), Voi (B D G H I J); E and K have both incoming and outgoing edges
AC determines BDK, in which K determines EJ, in which E determines GHI; and C determines DK
Only Candidate Key is AC

E-R Diagram - 1NF
[Figure: E-R diagram of the 1NF relation R with attributes A, B, C, D, E, G, H, I, J, K]

Update Anomalies in 1NF
R(A, B, C, D, E, G, H, I, J, K)
AC -> BDK, A -> B, C -> DK, E -> GHI, K -> EJ
Identify Partial Dependencies: A -> B, C -> DK
Can't insert an 'A' without a 'C' (and vice versa)
If you delete an 'A', you may lose info about 'C'
What info would you lose?
If you change a 'B', you may have to change it in multiple places

Going to 2NF
REMOVE PARTIAL DEPENDENCIES
R(A, B, C, D, E, G, H, I, J, K)
AC -> BDK, A -> B, C -> DK, E -> GHI, K -> EJ
R1(A, B)
R2(C, D, K, E, G, H, I, J)
Given: A:C is M:N, therefore we need what?
R3(A, C)
What is the PK for R3? Identify the FK(s).
Check: Are we in 2NF? Partial dependencies in R1? In R2?

E-R Diagram – 2NF
[Figure: E-R diagram of the 2NF design, with R1 (A, B), R2 (C, D, E, G, H, I, J, K), and R3 as the M:N relationship between them]

Update Anomalies in 2NF
R1(A, B), R2(C, D, K, E, G, H, I, J), R3(A, C)
Identify Transitive Dependencies:
Given: A -> B, C -> DK, E -> GHI, K -> EJ
C -> K, K -> E, E -> GHI
Can't insert a 'K' without a 'C' (NOT vice versa)
If you delete a 'C', you may lose info about 'K'
What info would you lose?
If you change an 'E', you may have to change it in multiple places

Going to 3NF
REMOVE TRANSITIVE DEPENDENCIES
R1(A, B) is in 3NF: it has only one non-key attribute, so a transitive dependency is impossible!
R2(C, D, K, E, G, H, I, J)
R3(A, C)
A -> B, C -> DK, E -> GHI, K -> EJ
C:K is 1:M (C is the many), K:E is 1:M (E is the many)
R2 is replaced by: R4(C, D, K), R5(K, J), R6(E, G, H, I, K)

E-R Diagram – 3NF
[Figure: E-R diagram of the 3NF design, showing R1, R3, R4, R5, R6, R7, and R8 with their 1:M and M:N relationships]

Another Example
R(A, B, C, D, E, F, G, H, I, J)
AB -> F G H I J
B -> C D E
H -> I J
B:AB -> 1:M
AB:H -> M:N
What is the candidate key?
What normal form is this relation in?
Are there any multi-valued attributes?
Are there any partial dependencies?
Are there any transitive dependencies?
Are there any FDs determining part of the CK?

1NF Anomalies
R(A, B, C, D, E, F, G, H, I, J)
AB -> F G H I J
B -> C D E
H -> I J
B:AB -> 1:M
AB:H -> M:N
Insertion Anomaly based on Partial Dependency?
Deletion Anomaly based on Partial Dependency?
Modification Anomaly based on Partial Dependency?
To go to 2NF, Decompose Partial Dependencies

2NF
R1(A, B, F, G, H, I, J), AB:H -> M:N
R2(B, C, D, E), B:AB -> 1:M
AB -> F G H I J, B -> C D E, H -> I J
What are the CKs now?
Are there any foreign keys?

2NF Anomalies
R1(A, B, F, G, H, I, J), AB:H -> M:N
R2(B, C, D, E), B:AB -> 1:M
AB -> F G H I J, B -> C D E, H -> I J
Insertion Anomaly based on Transitive Dependency?
Deletion Anomaly based on Transitive Dependency?
Modification Anomaly based on Transitive Dependency?
To go to 3NF, Decompose Transitive Dependencies

3NF
R1(A, B, F, G, H, I, J), AB:H -> M:N
R2(B, C, D, E), B:AB -> 1:M
AB -> F G H I J, B -> C D E, H -> I J
Decompose transitive dependencies, R2 is ok
R3(A, B, F, G), R1 is gone!
R4(H, I, J)
R5(A, B, H)
What are the candidate keys now?
What type of relation is R5?

BCNF
If G -> B then we would decompose further to achieve BCNF

Could that last example be real?
R(A, B, C, D, E, F, G, H, I, J)
A = Dependent Name
B = Employee SSN; C D E = Employee Name, Office, Phone
F G = Dependent Room#, Phone
H = Dependent Car; I J = car make, model
Each employee can have many dependents, but each dependent has only 1 employee, hence the 1:M between B and AB.
Perhaps siblings share ownership of the car, hence the M:N between AB and H.