7/23/2019 Lect#5 - Normalization.ppt
Normalization
Pearson Education Limited 1995, 2005
Objectives
What normalization is and the purpose of normalization
What update anomalies are
How relations can be transformed from lower normal forms to higher normal forms: 1NF, 2NF and 3NF
Purpose of Normalization
Normalization is a technique of analyzing and correcting table structure to produce a set of suitable relations that support the data requirements of an enterprise.
Result: a set of relations with minimized data redundancies.
Characteristics of a suitable set of relations include:
  the minimal number of attributes necessary to support the data requirements of the enterprise;
  attributes with a close logical relationship are found in the same relation, i.e. each table represents a single subject;
  minimal redundancy, with each attribute represented only once, with the important exception of attributes that form all or part of foreign keys, i.e. no data item is unnecessarily stored in more than one table;
  all attributes in a table are dependent on the primary key.
Purpose of Normalization
The benefits of using a database that has a suitable set of relations are that the database will:
  be easier for the user to access and maintain, reducing the opportunities for data inconsistencies;
  take up minimal storage space on the computer.
How Normalization Supports Database Design
Data Redundancy and Update Anomalies
A major aim of relational database design (i.e. normalization) is to group attributes into relations so as to minimize data redundancy.
Problems associated with data redundancy are illustrated by comparing the Staff and Branch relations with the StaffBranch relation.
Data Redundancy and Update Anomalies
[Figure: Design 1 (separate Staff and Branch relations) and Design 2 (a single StaffBranch relation)]
Data Redundancy and Update Anomalies
The StaffBranch relation has redundant data; the details of a branch are repeated for every member of staff. (Refer to Design 2.)
In contrast, the branch information (bAddress) appears only once for each branch in the Branch relation, and only the branch number (branchNo) is repeated in the Staff relation, to represent where each member of staff is located. (Refer to Design 1.)
Relations that contain redundant information may potentially suffer from update anomalies.
Types of update anomalies include:
  Insertion
  Deletion
  Modification
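As a rough illustration of the redundancy described above, the sketch below stores the same facts under both designs using plain Python structures. The rows are made up for illustration and are not the data from the Design 1 / Design 2 figures; staffNo is an assumed attribute name, and the pairing of branch numbers with addresses is hypothetical.

```python
# Made-up sample rows; they are NOT the data from the slides' figures.
# Design 2: a single StaffBranch relation repeats the branch address
# for every member of staff at that branch.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "S2", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "S3", "branchNo": "B007", "bAddress": "1 Sample St, Sampletown"},
]

# Design 1: Staff keeps only branchNo; Branch stores each address once.
staff = [{"staffNo": r["staffNo"], "branchNo": r["branchNo"]} for r in staff_branch]
branch = {r["branchNo"]: r["bAddress"] for r in staff_branch}

# Count how many times the B005 address is stored under each design.
copies_design2 = sum(1 for r in staff_branch if r["branchNo"] == "B005")
copies_design1 = sum(1 for branch_no in branch if branch_no == "B005")
print(copies_design2, copies_design1)
```

Under Design 2 the address is stored once per staff member; under Design 1 it is stored once per branch, and only branchNo is repeated.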
Insertion Anomalies: Examples
If Design 2 is used, entering the details of a new member of staff at branch no. B007 requires that the correct details of branch no. B007 are entered, so that they are consistent with the values for branch no. B007 in other tuples. The Design 1 relations do not suffer from this potential inconsistency.
To insert a new branch that has no members of staff, the other attributes would have to hold null values; this can violate the primary key rule, because a primary key value cannot be null.
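The insertion anomaly can be sketched in a few lines of Python. This is a minimal illustration with invented rows (staffNo, the addresses and branch B009 are hypothetical), not the relations from the slides' figures.

```python
# Design 2 (single StaffBranch relation); rows are made-up sample data.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B007", "bAddress": "1 Sample St, Sampletown"},
]

# Adding a new staff member means retyping the branch details.
# A typo in the address is accepted silently and creates an inconsistency.
staff_branch.append(
    {"staffNo": "S2", "branchNo": "B007", "bAddress": "1 Sampel St, Sampletown"}
)
addresses_for_b007 = {r["bAddress"] for r in staff_branch if r["branchNo"] == "B007"}
design2_inconsistent = len(addresses_for_b007) > 1

# Design 1 (separate Staff and Branch relations).
branch = {"B007": "1 Sample St, Sampletown"}
staff = [{"staffNo": "S1", "branchNo": "B007"}]

# A new staff member only references the branch number.
staff.append({"staffNo": "S2", "branchNo": "B007"})

# A new branch with no staff yet can be added without any null staff values.
branch["B009"] = "9 Example Rd, Exampleton"

print(design2_inconsistent, sorted(branch))
```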
Deletion Anomalies: Example
If Design 2 is used and we delete the tuple that represents the last member of staff located at a branch (branchNo = B007), the details of that branch are lost. This does not happen if the Staff and Branch relations of Design 1 are used.
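A small Python sketch of the deletion anomaly follows. The rows are invented for illustration; staffNo and the addresses are assumptions, not values from the slides' figures.

```python
# Design 2: made-up StaffBranch rows; S2 is the only staff member at B007.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "S2", "branchNo": "B007", "bAddress": "1 Sample St, Sampletown"},
]

# Deleting S2 also removes the only record of branch B007.
staff_branch_after = [r for r in staff_branch if r["staffNo"] != "S2"]
b007_known_design2 = any(r["branchNo"] == "B007" for r in staff_branch_after)

# Design 1: the same facts split into Staff and Branch.
staff = [
    {"staffNo": "S1", "branchNo": "B005"},
    {"staffNo": "S2", "branchNo": "B007"},
]
branch = {"B005": "22 Deer Rd, London", "B007": "1 Sample St, Sampletown"}

# Deleting S2 from Staff leaves the Branch relation untouched.
staff_after = [r for r in staff if r["staffNo"] != "S2"]
b007_known_design1 = "B007" in branch

print(b007_known_design2, b007_known_design1)
```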
Modification Anomalies: Example
If Design 2 is used and the value of an attribute is to be changed, for example bAddress = 22 Deer Rd, London, the other tuples with the same bAddress must also be updated.
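The modification anomaly can be simulated by updating only one of the duplicated tuples. The sketch below uses invented rows (staffNo, the branch number and the new address are hypothetical).

```python
# Design 2: two made-up tuples that repeat the same branch address.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "S2", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
]
new_address = "10 Example Ave, London"  # hypothetical new value

# Simulate an incomplete update: only the first matching tuple is changed.
staff_branch[0]["bAddress"] = new_address
addresses = {r["bAddress"] for r in staff_branch if r["branchNo"] == "B005"}
design2_consistent = len(addresses) == 1

# Design 1: the address lives in exactly one place, so one update suffices.
branch = {"B005": "22 Deer Rd, London"}
branch["B005"] = new_address

print(design2_consistent, branch["B005"])
```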
The Need for Normalization
Example: a company that manages building projects
  Each project has its own project number, name, and employees assigned to it
  Each employee has an employee number, name, and job classification
  The company charges its clients by billing the hours spent on each contract
  Hourly billing rate is dependent on employee
The Need for Normalization (continued)
The Need for Normalization (continued)
The structure of the data set in the previous figure does not handle data very well. It is subject to:
  Modification anomalies
  Insertion anomalies
  Deletion anomalies
The Normalization Process
Works through a series of stages called normal forms:
  Unnormalized form (UNF): a table that contains one or more repeating groups
  First normal form (1NF): table format; no repeating groups
  Second normal form (2NF): 1NF and no partial dependencies
  Third normal form (3NF): 2NF and no transitive dependencies
The Process of Normalization
Conversion to First Normal Form
Repeating group
  Derives its name from the fact that a group of multiple entries of the same type can exist for any single key attribute occurrence
  Ex.: PROJ_NUM = 15 has 5 entries that are related because they each share the PROJ_NUM = 15 characteristic
A relational table must not contain repeating groups, because they reflect data redundancies
Normalizing the table structure will reduce data redundancies
Conversion to 1NF (continued)
Step 1: Eliminate the Repeating Groups
  Present data in tabular format, where each cell has a single value and there are no repeating groups
  Eliminate repeating groups and eliminate nulls by making sure that each repeating group attribute contains an appropriate data value
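Step 1 can be sketched as flattening a nested structure into one row per key occurrence. The example below is a minimal Python illustration; the project names, employee numbers and hours are invented, and the nested "employees" list is an assumed way of modelling a repeating group.

```python
# Unnormalized data: each project carries a repeating group of employee entries.
# All values are made up for illustration.
unnormalized = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A",
     "employees": [{"EMP_NUM": 101, "HOURS": 10.0},
                   {"EMP_NUM": 102, "HOURS": 4.5}]},
    {"PROJ_NUM": 18, "PROJ_NAME": "Project B",
     "employees": [{"EMP_NUM": 101, "HOURS": 7.0}]},
]


def to_1nf(projects):
    """Emit one flat row per (project, employee) pair, so every cell is single-valued."""
    rows = []
    for project in projects:
        for emp in project["employees"]:
            rows.append({
                "PROJ_NUM": project["PROJ_NUM"],
                "PROJ_NAME": project["PROJ_NAME"],
                **emp,
            })
    return rows


rows_1nf = to_1nf(unnormalized)
for row in rows_1nf:
    print(row)
```

Each resulting row repeats the project values instead of leaving them blank, which removes the repeating group and the nulls at the cost of redundancy that the later normal forms address.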
Conversion to 1NF (continued)
Conversion to 1NF (continued)
Step 2: Identify All Dependencies
Definition: A functional dependency occurs when one attribute in a relation uniquely determines another attribute.
This can be written A → B, which is the same as stating "B is functionally dependent upon A" or "A determines B".
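The definition above can be turned into a simple check against sample rows. The sketch below is an assumption-laden illustration: holds is a hypothetical helper, and the rows are invented. Note that sample data can only refute a dependency; it cannot prove that A → B holds for all possible data.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs (lhs -> rhs)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Made-up sample rows.
rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 101, "EMP_NAME": "Emp One", "HOURS": 10.0},
    {"PROJ_NUM": 15, "EMP_NUM": 102, "EMP_NAME": "Emp Two", "HOURS": 4.5},
    {"PROJ_NUM": 18, "EMP_NUM": 101, "EMP_NAME": "Emp One", "HOURS": 7.0},
]

print(holds(rows, ["EMP_NUM"], ["EMP_NAME"]))          # EMP_NUM -> EMP_NAME
print(holds(rows, ["PROJ_NUM"], ["EMP_NUM"]))          # PROJ_NUM -> EMP_NUM
print(holds(rows, ["PROJ_NUM", "EMP_NUM"], ["HOURS"]))  # PROJ_NUM, EMP_NUM -> HOURS
```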
Conversion to 1NF (continued)
Dependencies can be depicted with the help of a diagram (or the dependency notation A → B).
Dependency diagram:
  Depicts all dependencies found within a given table structure
  Helpful in getting a bird's-eye view
Conversion to 1NF (continued)
Partial dependency: a dependency that is based on only part of a composite primary key
Transitive dependency: a dependency of one non-prime attribute on another non-prime attribute
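The two definitions above can be applied mechanically once the primary key and the dependencies are written down. The sketch below is a simplified illustration: classify is a hypothetical helper, and the dependency list is an assumption assembled from the attribute groupings and the JOB relation used elsewhere in these slides.

```python
PRIMARY_KEY = {"PROJ_NUM", "EMP_NUM"}

# (determinant, dependents) pairs; an assumed list for the sample schema.
FDS = [
    ({"PROJ_NUM"}, {"PROJ_NAME"}),
    ({"EMP_NUM"}, {"EMP_NAME", "JOB_CLASS", "CHG_HOUR"}),
    ({"PROJ_NUM", "EMP_NUM"}, {"HOURS"}),
    ({"JOB_CLASS"}, {"CHG_HOUR"}),
]


def classify(determinant, key):
    """Label a dependency by how its determinant relates to the primary key."""
    if determinant == key:
        return "full"        # depends on the whole key
    if determinant < key:
        return "partial"     # depends on only part of a composite key
    if not (determinant & key):
        return "transitive"  # determinant consists of non-prime attributes only
    return "other"


labels = [classify(lhs, PRIMARY_KEY) for lhs, _ in FDS]
for (lhs, rhs), label in zip(FDS, labels):
    print(sorted(lhs), "->", sorted(rhs), ":", label)
```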
Conversion to 1NF (continued)
Step 3: Identify the Primary Key
  The primary key must uniquely identify attribute values. In other words, if a value of the key is given, only one answer can be returned for the other attributes.
  For example, PROJ_NUM in the sample schema cannot be a primary key, since PROJ_NUM = 15 can identify any one of 5 employees, so PROJ_NUM alone is not enough to be used as a primary key.
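The uniqueness requirement can be tested against sample rows, as in the sketch below. The helper name uniquely_identifies and the rows are invented for illustration; as with any data-based check, passing on a sample does not guarantee the key is valid for all future data.

```python
def uniquely_identifies(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(keys) == len(set(keys))


# Made-up rows: project 15 has several employees assigned.
rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 101, "HOURS": 10.0},
    {"PROJ_NUM": 15, "EMP_NUM": 102, "HOURS": 4.5},
    {"PROJ_NUM": 15, "EMP_NUM": 103, "HOURS": 2.0},
    {"PROJ_NUM": 18, "EMP_NUM": 101, "HOURS": 7.0},
]

print(uniquely_identifies(rows, ["PROJ_NUM"]))             # PROJ_NUM alone
print(uniquely_identifies(rows, ["PROJ_NUM", "EMP_NUM"]))  # composite key
```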
Conversion to 1NF (continued)
The primary key can be determined based on the functional dependencies identified earlier.
  It can be a single attribute, i.e., a determinant that uniquely determines all attributes, or
  a composition of several determinants that together cover all attributes. [Note: if there are a few possibilities, choose the ones with the biggest scope.]
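One standard way to derive a key from functional dependencies is to compute attribute closures and keep the minimal attribute sets whose closure covers every attribute. The sketch below swaps in that textbook technique; it is not necessarily the procedure the slides intend, the function names are hypothetical, and the dependency list is an assumption for the sample schema.

```python
from itertools import combinations

ATTRS = {"PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME",
         "JOB_CLASS", "CHG_HOUR", "HOURS"}

# Assumed dependencies for the sample schema: (determinant, dependents).
FDS = [
    ({"PROJ_NUM"}, {"PROJ_NAME"}),
    ({"EMP_NUM"}, {"EMP_NAME", "JOB_CLASS", "CHG_HOUR"}),
    ({"PROJ_NUM", "EMP_NUM"}, {"HOURS"}),
    ({"JOB_CLASS"}, {"CHG_HOUR"}),
]


def closure(attrs, fds):
    """Return every attribute determined, directly or indirectly, by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Return the minimal attribute sets whose closure is all_attrs."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == all_attrs:
                keys.append(attrs)
    return keys


keys = candidate_keys(ATTRS, FDS)
print([sorted(key) for key in keys])
```

The brute-force search over subsets is only practical for small schemas like this one.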
Conversion to 1NF (continued)
The primary key is the combination of proj_num and emp_num.
Result of 1NF
The result of the 1NF normalization process is one relation with all attributes listed. The primary/composite key is underlined.
Example:
  EMPLOYEE_PROJECT (proj_num, proj_name, emp_num, emp_name, job_class, chg_hr, hours)
  Relation name: EMPLOYEE_PROJECT
  Primary/composite key: proj_num, emp_num
  All attributes: listed inside the parentheses
1NF
In first normal form:
  All key attributes are defined
  There are no repeating groups in the table; that is, each row/column intersection contains one and only one value, not a set of values
  All attributes are dependent on the primary key
Problem: a 1NF table structure may still contain partial dependencies
  Sometimes used for performance reasons, but should be used with caution
  Still subject to data redundancies. Ex.: What happens if EMP_NUM = 105 changes JOB_CLASS?
Conversion to Second Normal Form
Relational database design can be improved by converting the database into 2NF.
2NF removes partial dependencies.
If a 1NF relation has a single attribute as its primary key, then the relation is automatically in 2NF as well.
Partial dependencies can only occur when a composite key exists. If the key has more than one attribute, some attributes may depend on only a portion of the key.
Conversion to 2NF (continued)
Step 1: Write Each Key Component on a Separate Line
  Write each key component on a separate line, then write the original (composite) key on the last line
    PROJ_NUM
    EMP_NUM
    PROJ_NUM EMP_NUM
  Each component will become the key in a new table/relation
NOTE: If the key has 2 attributes (A,B), then the possible components will be A, B, and AB. If the key has 3 attributes (A,B,C), then they will be A, B, C, AB, AC, BC and ABC.
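The components listed in the note are the non-empty subsets of the key. A short Python sketch (key_components is a hypothetical helper name) enumerates them with itertools.combinations:

```python
from itertools import combinations


def key_components(key):
    """Return every non-empty subset of the key attributes, smallest first."""
    return [
        combo
        for size in range(1, len(key) + 1)
        for combo in combinations(key, size)
    ]


two = key_components(("A", "B"))
three = key_components(("A", "B", "C"))
print(["".join(c) for c in two])
print(["".join(c) for c in three])
```

A key with n attributes has 2^n - 1 such components, so 3 for a two-attribute key and 7 for a three-attribute key.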
Conversion to 2NF (continued)
Step 2: Assign Corresponding Dependent Attributes
  Determine those attributes that are dependent on other attributes
    (PROJ_NUM, PROJ_NAME)
    (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
    (PROJ_NUM, EMP_NUM, HOURS)
  At this point, most anomalies have been eliminated
    PROJECT(PROJ_NUM, PROJ_NAME)
    EMPLOYEE(EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
    ASSIGNMENT(PROJ_NUM, EMP_NUM, HOURS)
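The decomposition above amounts to projecting the 1NF rows onto each attribute group and dropping duplicate tuples. The following Python sketch illustrates this with invented rows; project is a hypothetical helper, not part of any library.

```python
def project(rows, attrs):
    """Project rows onto attrs, keeping each distinct tuple once (order preserved)."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# Made-up 1NF rows for the sample schema.
rows_1nf = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 101, "EMP_NAME": "Emp One",
     "JOB_CLASS": "Engineer", "CHG_HOUR": 80.0, "HOURS": 10.0},
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 102, "EMP_NAME": "Emp Two",
     "JOB_CLASS": "Analyst", "CHG_HOUR": 60.0, "HOURS": 4.5},
    {"PROJ_NUM": 18, "PROJ_NAME": "Project B", "EMP_NUM": 101, "EMP_NAME": "Emp One",
     "JOB_CLASS": "Engineer", "CHG_HOUR": 80.0, "HOURS": 7.0},
]

project_rel = project(rows_1nf, ["PROJ_NUM", "PROJ_NAME"])
employee_rel = project(rows_1nf, ["EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR"])
assignment_rel = project(rows_1nf, ["PROJ_NUM", "EMP_NUM", "HOURS"])

print(len(project_rel), len(employee_rel), len(assignment_rel))
```

Project and employee details are now stored once each, while ASSIGNMENT keeps one row per project/employee pair.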
Conversion to 2NF (continued)
If, at the end, there are relations with only the keys in them (except for the relation having all the keys), then those relations can be eliminated.
A relation is in second normal form (2NF) when it is in 1NF and includes no partial dependencies
Another technique to convert to 2NF
Step 1: Write the original 1NF relation
  EMPLOYEE_PROJECT (proj_num, proj_name, emp_num, emp_name, job_class, chg_hr, hours)
Step 2: For each partial dependency, create a new relation with the determinant as its key. In the 1NF relation, delete the dependents and mark the determinant (circle/italic/colored) as a foreign key.
  EMPLOYEE_PROJECT (proj_num, emp_num, emp_name, job_class, chg_hr, hours)
  PROJECT (proj_num, proj_name)
  Dependency: proj_num → proj_name
Repeat for the other partial dependencies.
Result of 2NF
The result of the 2NF normalization process is multiple relations. Each primary/composite key is underlined. Each foreign key is identified (circle/italic/colored).
Example:
  EMPLOYEE_PROJECT (proj_num, emp_num, hours)
  PROJECT (proj_num, proj_name)
  EMPLOYEE (emp_num, emp_name, job_class, chg_hr)
Conversion to 2NF (continued)
Conversion to Third Normal Form
3NF removes transitive dependencies
Step 1: Write the previous 2NF relations
Step 2: For each transitive dependency (a non-key attribute dependent on another non-key attribute), create a new relation with the determinant as its key.
Step 3: In the original 2NF relation, delete the dependents. Make the determinant a foreign key (circle/italic/colored).
Result of 3NF
The result of the 3NF normalization process is multiple relations. Each primary/composite key is underlined. Each foreign key is identified (circle/italic/colored).
Example:
  EMPLOYEE_PROJECT (proj_num, emp_num, hours)
  PROJECT (proj_num, proj_name)
  EMPLOYEE (emp_num, emp_name, job_class)
  JOB (job_class, chg_hr)
See the difference: in 2NF, the key is both f.k. and p.k.; in 3NF, the key is f.k. only.
Conversion to 3NF (continued)
Note: Original EMPLOYEE table
Sample Exercise 1
staffNo  branchNo  branchAddress                         name           position   hoursPerWeek
S4552    B001      City South Plaza, Seattle, WA 98122   Ellen London   Assistant  16
S4555    B004      16 14th Avenue, Seattle, WA 98128     Ellen Layman   Assistant  9
S4612    B002      City Center Plaza, Seattle, WA 98122  Dave Sinclair  Clerk      14
S4612    B004      16 14th Avenue, Seattle, WA 98128     Dave Sinclair  Clerk      10
Examine the table shown above. This table represents the hours worked per week for temporary staff at each branch of a company.
1. Identify the functional dependencies represented by the data shown in the table.
2. Using the functional dependencies identified in part 1, describe and illustrate the process of normalization by converting the table to Third Normal Form (3NF) relations. Identify the primary and foreign keys in your 3NF relations.
Sample Exercise 2
Given the following relational schema:
MOVIE (cinemaID, cinemaCapacity, movieID, movieTitle, movieDuration, showDate, showTime, actorID, actorName, ticketPrice, ticketSold, totalCollection)
1. Sketch a table with the attributes of the above schema as column headers. Populate the table with 10 records of data.
2. Identify the primary key and the functional dependencies represented by the data shown in the table.
3. Using the functional dependencies identified in part 2, describe and illustrate the process of normalization by converting the table to Third Normal Form (3NF) relations. Identify the primary and foreign keys in your 3NF relations.
Sample Exercise 3
Given the following incomplete dependency diagram,
a) Draw an arrow for each functional dependency
b) State the primary key based on the identified functional dependencies
c) Write the 1NF relational schema for the diagram
d) Identify any partial dependency
e) Normalize the relation in (c) into 2NF relations
f) Identify any transitive dependency
g) Normalize the relation in (e) into 3NF relations.
[Figure: incomplete dependency diagram; the attribute labels are not recoverable from this extraction]
Learning Outcomes
Students should now be able to:
  Explain what normalization is and the purpose of normalization
  Perform the normalization process from lower normal forms to higher normal forms: 1NF, 2NF and 3NF
References
Database Systems: A Practical Approach to Design, Implementation and Management.
Thomas Connolly, Carolyn Begg (2010), Addison Wesley, Fifth Edition.
Chapter 1
Database Systems: Design, Implementation & Management.
Peter Rob, Carlos Coronel (200$), Thomson Course Technology, Seventh Edition.
Chapter (pg 1* + 1*)