Normalization p2
prabhakar-reddy -
8/14/2019 Normalization p2
NORMALIZATION
FIRST NORMAL FORM (1NF):
A relation R is in 1NF if every attribute holds only atomic values:
= one value per attribute
= no repeating groups
= no multivalued attributes
= no composite attributes
Example
Non-1NF
EMP (E#, ENAME, SKILL). Here SKILL is a multi-valued attribute.
EMP (E#, ENAME, SKILL1, SKILL2, SKILL3, SKILL4, ....). Here SKILL1, SKILL2, ... form a repeating group.
NON-1NF TO 1NF
There are two methods of converting a non-1NF relation into a 1NF relation. Method 1 maps out the multi-valued (or repeating group) attribute into another table, while Method 2 keeps the multi-valued attribute in the same table, stores one value per tuple, and uses a composite PK.
Method 1: Conversion to 1NF
1. Create a new relation for the repeating group and add the key of the original relation to it.
2. Remove the attributes of the repeating group from the original relation.
Example
SKILL (E#, SKILL)
EMP (E#, ENAME)
Note the composite PK (E#, SKILL) of the SKILL relation.
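The Method 1 conversion can be sketched in Python. This is only an illustration, assuming the non-1NF EMP rows are held as tuples whose SKILL field is a list; the sample data and the function name `split_repeating_group` are hypothetical and not part of the slides.

```python
def split_repeating_group(emp_rows):
    """Split non-1NF EMP(E#, ENAME, SKILL) rows into
    EMP(E#, ENAME) and SKILL(E#, SKILL)."""
    emp = []
    skill = []
    for e_no, ename, skills in emp_rows:
        # Step 2: the original relation keeps only the non-repeating attributes.
        emp.append((e_no, ename))
        # Step 1: the new relation gets the original key plus one skill per tuple.
        for s in skills:
            skill.append((e_no, s))
    return emp, skill


# Hypothetical sample data: SKILL is multi-valued.
non_1nf = [(1, "Kim", ["SQL", "Java"]), (2, "Lee", ["C"])]
emp, skill = split_repeating_group(non_1nf)
print(emp)    # [(1, 'Kim'), (2, 'Lee')]
print(skill)  # [(1, 'SQL'), (1, 'Java'), (2, 'C')]
```

Every (E#, SKILL) pair in the result is distinct, which is why the pair can serve as the composite PK.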
Method 2: We can also flatten the table as follows:
EMP (E#, Skill, Ename)
- Elmasri's book uses this method.
- This method puts each value of the repeating group in a separate tuple, repeating the other attributes.
- Note the composite PK (E#, Skill).
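A minimal Python sketch of the flattening in Method 2 follows. It assumes the same list-valued representation of SKILL as an input format; the sample rows and the name `flatten` are hypothetical.

```python
def flatten(emp_rows):
    """Return EMP(E#, Skill, Ename) with one tuple per (employee, skill) pair."""
    return [(e_no, s, ename)
            for e_no, ename, skills in emp_rows
            for s in skills]


# Hypothetical sample data: SKILL is multi-valued.
non_1nf = [(1, "Kim", ["SQL", "Java"]), (2, "Lee", ["C"])]
flat = flatten(non_1nf)
print(flat)  # [(1, 'SQL', 'Kim'), (1, 'Java', 'Kim'), (2, 'C', 'Lee')]

# (E#, Skill) identifies each tuple, so it works as the composite PK.
keys = [(e_no, s) for e_no, s, _ in flat]
print(len(keys) == len(set(keys)))  # True
```

Note that Ename now appears once per skill of the same employee, so the flattened table carries redundancy that Method 1 avoids.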
SECOND NORMAL FORM (2NF)
A relation R is in 2NF if
(a) R is in 1NF, and
(b) each nonkey attribute is fully functionally dependent on the whole key of R.
(An FD from only a part of the key to a nonkey attribute is called a partial dependency (PD); a PD violates 2NF.)
Example
INVENTORY (WH, PART, QTY, WH_ADDR)
Key: WH+PART
WH, PART --> QTY
WH --> WH_ADDR (This violates 2NF, since WH is only a part of the key)
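The 2NF test on this example can be sketched in Python. The sketch assumes a single candidate key and inspects only the FDs it is given (it does not derive implied FDs); the function name `partial_dependencies` is hypothetical.

```python
def partial_dependencies(key, fds):
    """Return the FDs whose LHS is a proper subset of the key
    and whose RHS contains at least one nonkey attribute."""
    key = frozenset(key)
    result = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        nonkey_rhs = rhs - key
        if lhs < key and nonkey_rhs:  # '<' is the proper-subset test
            result.append((sorted(lhs), sorted(nonkey_rhs)))
    return result


inventory_key = {"WH", "PART"}
inventory_fds = [
    ({"WH", "PART"}, {"QTY"}),  # full dependency on the whole key
    ({"WH"}, {"WH_ADDR"}),      # depends on only part of the key
]
pds = partial_dependencies(inventory_key, inventory_fds)
print(pds)  # [(['WH'], ['WH_ADDR'])]
```

An empty result would mean that no listed FD violates 2NF with respect to the given key.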
Problem of non-2NF (update anomaly):
- Warehouse address is repeated for every part stored
- If the address is changed, multiple updates are needed
- If there are no parts in a warehouse, the warehouse address can't be kept
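These problems can be reproduced with a small in-memory SQLite table. This is only a sketch: the table and column names and the sample rows are hypothetical, chosen to mirror INVENTORY (WH, PART, QTY, WH_ADDR).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE inventory ("
    " wh TEXT, part TEXT, qty INTEGER, wh_addr TEXT,"
    " PRIMARY KEY (wh, part))"
)
con.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        ("W1", "bolt", 10, "1 Main St"),
        ("W1", "nut", 5, "1 Main St"),  # address repeated for every part
        ("W2", "bolt", 7, "9 Oak Ave"),
    ],
)

# Changing one warehouse address has to touch every row of that warehouse.
cur = con.execute("UPDATE inventory SET wh_addr = '2 Elm St' WHERE wh = 'W1'")
rows_updated = cur.rowcount
print(rows_updated)  # 2

# Removing the last part stored in W2 also removes the only copy of its address.
con.execute("DELETE FROM inventory WHERE wh = 'W2'")
remaining_w2 = con.execute(
    "SELECT COUNT(*) FROM inventory WHERE wh = 'W2'"
).fetchone()[0]
print(remaining_w2)  # 0
```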
2NF Decomposition:
1. Create a separate relation for each PD.
2. Remove the RHS of the PD from the original relation.
The above non-2NF relation can be transformed into the following 2NF relations:
INVENTORY(WH, PART, QTY)
WAREHOUSE(WH, WH_ADDR)
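The two decomposition steps can be sketched as a small Python function. It assumes a single candidate key and an explicitly listed set of FDs; the name `decompose_2nf` and the list-of-attribute-lists output format are hypothetical choices for illustration.

```python
def decompose_2nf(attrs, key, fds):
    """Return a list of attribute lists: the reduced original relation first,
    then one new relation per distinct LHS of a partial dependency."""
    key = frozenset(key)
    remaining = list(attrs)
    pd_relations = {}  # LHS of a PD -> attribute list of its new relation
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        if lhs < key:  # LHS is only part of the key: a partial dependency
            moved = [a for a in sorted(rhs) if a not in key]
            # Step 1: a separate relation for the PD, keyed by its LHS.
            rel = pd_relations.setdefault(lhs, sorted(lhs))
            for a in moved:
                if a not in rel:
                    rel.append(a)
            # Step 2: remove the RHS of the PD from the original relation.
            remaining = [a for a in remaining if a not in moved]
    return [remaining] + list(pd_relations.values())


inventory = decompose_2nf(
    ["WH", "PART", "QTY", "WH_ADDR"],
    {"WH", "PART"},
    [({"WH", "PART"}, {"QTY"}), ({"WH"}, {"WH_ADDR"})],
)
print(inventory)  # [['WH', 'PART', 'QTY'], ['WH', 'WH_ADDR']]
```

The first list corresponds to INVENTORY and the second to WAREHOUSE; naming the new relations is left to the designer.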
Example
NOTE that a 2NF violation can occur only when we have a composite key.
EMP_PROJ (SSN, P#, HOURS, ENAME, PNAME, PLOC)
SSN, P# --> HOURS
SSN --> ENAME (* Violates 2NF; SSN is a part of a key *)
P# --> PNAME, PLOC (* Violates 2NF; P# is a part of a key *)
2NF decomposition
R1 (SSN, P#, HOURS)
R2 (SSN, ENAME)
R3 (P#, PNAME, PLOC)
THIRD NORMAL FORM (3NF)
A relation R is in 3NF if
a) it is in 2NF and
b) it has no transitive dependencies.
That is, each nonkey attribute must be functionally dependent on the key and nothing else. If you have any FD whose LHS is not a PK (or CK), then R is not in 3NF.
Example
WORK (EMP#, DEPT, LOC)    KEY: EMP#

FD                    2NF   3NF
(1) EMP# --> DEPT      Y     Y
(2) DEPT --> LOC       Y     N

WORK is in 2NF, but not in 3NF because of FD (2).
Problem of Non-3NF
- Dept. location is repeated for every employee
- If the location is changed, multiple updates are needed
- If you forget to change all records, it can cause inconsistency
- If a dept. has no employees, the dept. location can't be kept
3NF DECOMPOSITION
Algorithm for a given minimal cover:
1) Combine the RHS of FDs if they have a common LHS.
2) Create a separate table for each FD.
3) Check for lossless decomposition. (Check whether a CK of the original relation appears in any of the decomposed relations.) If it is not lossless, then add a table consisting of a CK.
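The three steps above can be sketched in Python. The sketch assumes the caller already supplies a minimal cover and one candidate key of the original relation; it does not compute either. The function name `synthesize_3nf` is hypothetical.

```python
def synthesize_3nf(minimal_cover, candidate_key):
    """Build 3NF tables from a minimal cover given as (LHS, RHS) pairs."""
    # Step 1: combine the RHS of FDs that share a common LHS.
    grouped = {}
    for lhs, rhs in minimal_cover:
        grouped.setdefault(frozenset(lhs), set()).update(rhs)

    # Step 2: one table per combined FD, holding LHS and RHS attributes.
    tables = [set(lhs) | rhs for lhs, rhs in grouped.items()]

    # Step 3: if no table contains a candidate key of the original relation,
    # add a table consisting of that key.
    ck = set(candidate_key)
    if not any(ck <= t for t in tables):
        tables.append(ck)

    return [sorted(t) for t in tables]


# WORK (EMP#, DEPT, LOC) with EMP# --> DEPT and DEPT --> LOC, key EMP#.
work = synthesize_3nf(
    [({"EMP#"}, {"DEPT"}), ({"DEPT"}, {"LOC"})],
    {"EMP#"},
)
print(work)  # [['DEPT', 'EMP#'], ['DEPT', 'LOC']]
```

Attributes inside each table are printed in sorted order, so the key attribute does not necessarily come first.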
Example
R1 (EMP#, DEPT), R2 (DEPT, LOC)
The original relation WORK is not in 3NF, but R1 and R2 are in 3NF.
Note that the LHS of each FD becomes the PK of the corresponding decomposed table.
The 3NF definition we used above is an informal one used by many industry designers. Some DB textbooks, including Elmasri's book, use a more rigorous definition, which is shown below.
Formal Def. of 3NF
A relation R is in 3NF if, for every nontrivial FD X --> A in R,
(1) X is a super key, or
(2) A is a prime attribute
(where X and A could be sets of attributes).
In other words, every attribute that is not a prime attribute must depend only on super keys (candidate keys or supersets of them).
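The formal definition can be checked mechanically with attribute closures. The following Python sketch finds candidate keys by brute force, so it is suitable only for small schemas, and it tests only the FDs it is given. The function names and the STREET/CITY/ZIP relation are hypothetical illustrations, not taken from the slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (LHS, RHS) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Brute-force search for minimal attribute sets whose closure is everything."""
    all_attrs = set(all_attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(s, fds) == all_attrs:
                keys.append(s)
    return keys


def violations_3nf(all_attrs, fds):
    """Return (LHS, attribute) pairs that break the formal 3NF definition."""
    all_attrs = set(all_attrs)
    prime = set().union(*candidate_keys(all_attrs, fds))
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == all_attrs
        for a in sorted(set(rhs) - set(lhs)):  # ignore trivial parts
            if not is_superkey and a not in prime:
                bad.append((sorted(lhs), a))
    return bad


work_fds = [({"EMP#"}, {"DEPT"}), ({"DEPT"}, {"LOC"})]
print(violations_3nf({"EMP#", "DEPT", "LOC"}, work_fds))
# [(['DEPT'], 'LOC')]

# Hypothetical relation where condition (2) applies: CITY is a prime attribute.
addr_fds = [({"STREET", "CITY"}, {"ZIP"}), ({"ZIP"}, {"CITY"})]
print(violations_3nf({"STREET", "CITY", "ZIP"}, addr_fds))
# []
```

In the second call ZIP is not a super key, yet the relation passes because CITY belongs to the candidate key (STREET, CITY).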
The only difference between the informal definition and the formal definition is the second condition in the formal definition. That is, the formal definition allows a transitive dependency whose RHS is a prime attribute, where a prime attribute is an attribute that belongs to any candidate key. The difference between these two definitions is very minor, and many real-world DB designers just use the informal definition. For your reference, we showed the formal definition of 3NF.
SUMMARY OF NORMALIZATION
- As we go to higher normal forms, we create more relations.
- Each higher normal form removes a certain type of dependency that causes redundancy.
- As a relation is brought to a higher normal form:
  - We have more relations
  - Queries need more joins
  - More join processing is required
  - More referential integrity constraints need to be maintained
  - And thus the schema becomes more complicated and performance decreases.
So, many real-world DB designers stop at 3NF, which removes typical redundancy reasonably well and still maintains performance. So, strive to achieve 3NF in your real-world RDB!