Normalization p2

  • 8/14/2019 Normalization p2

    NORMALIZATION

    FIRST NORMAL FORM (1NF):

    A relation R is in 1NF if every attribute has an atomic value:

    = one value for an attribute

    = no repeating groups

    = no multivalued attributes

    = no composite attributes

    Example

    Non-1NF

    EMP (E#, ENAME, SKILL). Here SKILL is a multi-valued attribute.

    EMP (E#, ENAME, SKILL1, SKILL2, SKILL3, SKILL4, ....). Here the SKILL columns form a repeating group.

    NON-1NF TO 1NF

    There are two methods of converting a non-1NF relation into a 1NF relation. Method 1 maps the multi-valued (or repeating-group) attribute out into another table, while Method 2 keeps the attribute in the same table and simply uses a composite PK.

    Method 1: Conversion to 1NF

    1. Create one relation for the repeating group, adding the key of the original relation to it.
    2. Remove the attributes of the repeating group from the original relation.

    Example

    SKILL (E#, SKILL)

    EMP (E#, ENAME)

    Note the composite PK (E#, SKILL) of the SKILL relation.
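The two steps of Method 1 can be sketched in Python. This is only an illustration: the sample rows, the employee names, and the function name `to_1nf_method1` are assumptions, not part of the original slides.

```python
# Non-1NF input: SKILL holds several values per employee (made-up sample data).
emp_non_1nf = [
    {"E#": 1, "ENAME": "Kim", "SKILL": ["SQL", "Java"]},
    {"E#": 2, "ENAME": "Lee", "SKILL": ["SQL"]},
]

def to_1nf_method1(rows):
    """Split the rows into EMP(E#, ENAME) and SKILL(E#, SKILL)."""
    # Step 2: the original relation keeps everything except the repeating group.
    emp = [{"E#": r["E#"], "ENAME": r["ENAME"]} for r in rows]
    # Step 1: a new relation holds one tuple per (key, skill) pair.
    skill = [{"E#": r["E#"], "SKILL": s} for r in rows for s in r["SKILL"]]
    return emp, skill

emp, skill = to_1nf_method1(emp_non_1nf)
```

Each (E#, SKILL) pair appears once in `skill`, which is why that pair serves as the composite PK.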

    Method 2: We can also flatten the table as follows:

    EMP (E#, Skill, Ename)

    - Elmasri's book uses this method.
    - This method puts each value of the repeating group in a separate tuple, repeating the other attributes.
    - Note the composite PK (E#, Skill).
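A minimal sketch of the flattening in Method 2, again with made-up sample rows and a hypothetical function name `to_1nf_method2`:

```python
# Non-1NF input: SKILL holds several values per employee (made-up sample data).
emp_non_1nf = [
    {"E#": 1, "ENAME": "Kim", "SKILL": ["SQL", "Java"]},
    {"E#": 2, "ENAME": "Lee", "SKILL": ["SQL"]},
]

def to_1nf_method2(rows):
    """Flatten to EMP(E#, Skill, Ename): one tuple per (employee, skill) pair."""
    # Note: in this sketch an employee with an empty skill list yields no tuple.
    return [
        {"E#": r["E#"], "Skill": s, "Ename": r["ENAME"]}
        for r in rows
        for s in r["SKILL"]
    ]

flat = to_1nf_method2(emp_non_1nf)
```

The employee name is now repeated once per skill, which is the redundancy that the later normal forms address.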

    SECOND NORMAL FORM (2NF)

    A relation R is in 2NF if
    (a) R is in 1NF, and
    (b) each nonkey attribute is fully functionally dependent on the whole key of R, not on just a part of it.

    (An FD whose LHS is only a part of the key is called a partial dependency (PD).)

    Example

    INVENTORY (WH, PART, QTY, WH_ADDR)

    WH, PART --> QTY
    WH --> WH_ADDR (this FD violates 2NF, since WH is only a part of the key)

    Key: WH+PART
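As an illustration, the partial dependencies of a relation can be found mechanically from its key and its FDs. The sketch below assumes a single key and inspects only the FDs that are listed explicitly (it does not derive implied FDs); the function name `partial_dependencies` is hypothetical.

```python
def partial_dependencies(key, fds):
    """Return the FDs whose LHS is a proper subset of the key
    and whose RHS contains at least one nonkey attribute."""
    key = frozenset(key)
    result = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        nonkey_rhs = rhs - key
        if lhs < key and nonkey_rhs:  # "<" is the proper-subset test
            result.append((set(lhs), set(nonkey_rhs)))
    return result

# FDs of INVENTORY as (LHS, RHS) pairs.
inventory_fds = [
    ({"WH", "PART"}, {"QTY"}),
    ({"WH"}, {"WH_ADDR"}),
]
pds = partial_dependencies({"WH", "PART"}, inventory_fds)
```

For INVENTORY, only WH --> WH_ADDR is reported, so the relation is not in 2NF.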

    Problem of non-2NF (update anomaly):

    - The warehouse address is repeated for every part stored in the warehouse.
    - If the address changes, multiple tuples must be updated.
    - If a warehouse holds no parts, its address cannot be kept.

    2NF Decomposition:

    1. Create a separate relation for each PD.
    2. Remove the RHS of the PD from the original relation.

    The non-2NF relation above can be transformed into the following 2NF relations.

    INVENTORY(WH, PART, QTY)

    WAREHOUSE(WH, WH_ADDR)
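On actual data, this decomposition amounts to two projections. The sketch below uses made-up sample tuples and a hypothetical helper `project` to show that each warehouse address ends up stored only once.

```python
# Made-up sample tuples for the non-2NF INVENTORY relation.
inventory = [
    {"WH": "W1", "PART": "P1", "QTY": 10, "WH_ADDR": "12 Main St"},
    {"WH": "W1", "PART": "P2", "QTY": 5, "WH_ADDR": "12 Main St"},
    {"WH": "W2", "PART": "P1", "QTY": 7, "WH_ADDR": "9 Oak Ave"},
]

def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out

inventory_2nf = project(inventory, ["WH", "PART", "QTY"])
warehouse = project(inventory, ["WH", "WH_ADDR"])
```

In `warehouse` the address of W1 appears in a single tuple, so an address change needs one update, and a warehouse with no parts can still have a tuple there.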

    Example

    NOTE that a non-2NF relation can occur only when we have a composite key.

    EMP_PROJ (SSN, P#, HOURS, ENAME, PNAME, PLOC)

    SSN, P# --> HOURS
    SSN --> ENAME (* Violates 2NF; SSN is a part of the key *)
    P# --> PNAME, PLOC (* Violates 2NF; P# is a part of the key *)

    2NF decomposition

    R1 (SSN, P#, HOURS)
    R2 (SSN, ENAME)
    R3 (P#, PNAME, PLOC)
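The 2NF decomposition rule (one relation per PD, then remove the RHS of the PD from the original relation) can be sketched at the schema level. The function name `decompose_2nf` is hypothetical, and the sketch assumes a single key and that FDs with the same LHS have already been combined.

```python
def decompose_2nf(attrs, key, fds):
    """Return a list of schemas: the reduced original relation first,
    followed by one relation per partial dependency."""
    key_set = frozenset(key)
    remaining = list(attrs)
    relations = []
    for lhs, rhs in fds:
        if frozenset(lhs) < key_set:  # LHS is only a part of the key
            nonkey_rhs = [a for a in rhs if a not in key_set]
            if nonkey_rhs:
                # Create a separate relation for this PD.
                relations.append(list(lhs) + nonkey_rhs)
                # Remove the RHS of the PD from the original relation.
                remaining = [a for a in remaining if a not in nonkey_rhs]
    return [remaining] + relations

attrs = ["SSN", "P#", "HOURS", "ENAME", "PNAME", "PLOC"]
fds = [
    (["SSN", "P#"], ["HOURS"]),
    (["SSN"], ["ENAME"]),
    (["P#"], ["PNAME", "PLOC"]),
]
result = decompose_2nf(attrs, ["SSN", "P#"], fds)
```

Applied to EMP_PROJ, `result` holds the three schemas R1, R2, and R3 listed above.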

    THIRD NORMAL FORM (3NF)

    A relation R is in 3NF if
    a) it is in 2NF, and
    b) it has no transitive dependencies.

    That is, each nonkey attribute must be functionally dependent on the key and nothing else. If you have any FD whose LHS is not a PK (or CK), then R is not in 3NF.

    Example

    WORK (EMP#, DEPT, LOC)    KEY: EMP#

    FD                    2NF   3NF
    (1) EMP# --> DEPT      Y     Y
    (2) DEPT --> LOC       Y     N

    WORK is in 2NF, but not in 3NF because of FD (2).
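The informal rule (any FD whose LHS is not a PK or CK breaks 3NF) is easy to check mechanically. The sketch below is only an illustration; the function name `informal_3nf_violations` is hypothetical, and the caller is assumed to supply the candidate keys.

```python
def informal_3nf_violations(candidate_keys, fds):
    """Return the FDs whose LHS is not one of the candidate keys."""
    keys = [frozenset(k) for k in candidate_keys]
    return [(lhs, rhs) for lhs, rhs in fds if frozenset(lhs) not in keys]

# WORK (EMP#, DEPT, LOC) with key EMP#; FDs as (LHS, RHS) pairs.
work_fds = [
    (("EMP#",), ("DEPT",)),
    (("DEPT",), ("LOC",)),
]
violations = informal_3nf_violations([("EMP#",)], work_fds)
```

For WORK, the only FD reported is DEPT --> LOC, matching the N in the 3NF column for FD (2).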

    Problem of Non-3NF

    - The dept. location is repeated for every employee.
    - If the location changes, multiple tuples must be updated.
    - If you forget to change all of the records, the data becomes inconsistent.
    - If a dept. has no employees, its location cannot be kept.

    3NF DECOMPOSITION

    Algorithm for a given minimal cover:

    1) Combine the RHS of FDs if they have a common LHS.
    2) Create a separate table for each FD.
    3) Check for lossless decomposition. (Check whether a CK of the original relation appears in any of the decomposed relations.) If the decomposition is not lossless, add a table consisting of a CK.
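The three steps above can be sketched as follows. The function name `synthesize_3nf` is hypothetical, and the sketch assumes the caller already supplies a minimal cover and one candidate key of the original relation; it does not compute either.

```python
def synthesize_3nf(minimal_cover, candidate_key):
    """Return a list of attribute sets, one per table of the decomposition."""
    # Step 1: combine the RHS of FDs that share a common LHS.
    grouped = {}
    for lhs, rhs in minimal_cover:
        grouped.setdefault(frozenset(lhs), set()).update(rhs)
    # Step 2: create one table per combined FD (LHS plus RHS attributes).
    tables = [set(lhs) | rhs for lhs, rhs in grouped.items()]
    # Step 3: if no table contains a CK of the original relation,
    # add a table consisting of that CK.
    ck = set(candidate_key)
    if not any(ck <= t for t in tables):
        tables.append(ck)
    return tables

work_tables = synthesize_3nf(
    [(["EMP#"], ["DEPT"]), (["DEPT"], ["LOC"])],
    ["EMP#"],
)
```

For WORK, the table built from EMP# --> DEPT already contains the CK, so no extra table is added.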

    Example

    R1 (EMP#, DEPT), R2 (DEPT, LOC)

    The original relation WORK is not in 3NF,

    but R1 and R2 are in 3NF.

    Note that the LHS of each FD becomes the PK of the corresponding decomposed table.
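To see that this decomposition is lossless, one can project sample data onto R1 and R2 and join the results back together. The tuples and the helper names `project` and `natural_join` below are assumptions made for illustration.

```python
# Made-up sample tuples for WORK (EMP#, DEPT, LOC).
work = [
    {"EMP#": 1, "DEPT": "Sales", "LOC": "NY"},
    {"EMP#": 2, "DEPT": "Sales", "LOC": "NY"},
    {"EMP#": 3, "DEPT": "R&D", "LOC": "LA"},
]

def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out

def natural_join(left, right):
    """Join tuples that agree on all attributes the two relations share."""
    out = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[a] == r[a] for a in common):
                out.append({**l, **r})
    return out

r1 = project(work, ["EMP#", "DEPT"])
r2 = project(work, ["DEPT", "LOC"])
rejoined = natural_join(r1, r2)
```

Joining `r1` and `r2` on DEPT gives back exactly the original tuples, while `r2` stores each department location only once.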

    The 3NF definition we used above is an informal one used by many industry designers. Some DB textbooks, including Elmasri's book, use a more rigorous definition, which is shown below.

    Formal Def. of 3NF

    A relation R is in 3NF if, for all X --> A in R,

    (1) X is a superkey, or
    (2) A is a prime attribute

    (where X and A could each be a set of attributes).

    In other words, all attributes, except prime attributes, must be dependent only on candidate keys.
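The formal definition can be tested with attribute closures. The sketch below is a brute-force illustration suited to small schemas only; the function names are hypothetical, and it checks only the FDs that are listed.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    all_attrs = set(all_attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            s = set(combo)
            # Skip supersets of keys already found (they are not minimal).
            if closure(s, fds) == all_attrs and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def is_3nf(all_attrs, fds):
    """For every X --> A: X is a superkey, or A consists of prime attributes."""
    prime = set().union(*candidate_keys(all_attrs, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(all_attrs):
            continue  # condition (1): X is a superkey
        if not (set(rhs) - set(lhs)) <= prime:
            return False  # condition (2) fails for a nonprime attribute
    return True

work_ok = is_3nf({"EMP#", "DEPT", "LOC"}, [(["EMP#"], ["DEPT"]), (["DEPT"], ["LOC"])])
prime_rhs_ok = is_3nf({"A", "B", "C"}, [(["A", "B"], ["C"]), (["C"], ["B"])])
```

WORK fails the test because of DEPT --> LOC. The second relation passes even though C is not a superkey, because B belongs to the candidate key AB and is therefore prime.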

    The only difference between the informal definition and the formal definition is the second condition in the formal definition. That is, the formal definition allows a transitive dependency whose RHS is a prime attribute, where a prime attribute is an attribute that belongs to any candidate key.

    The difference between these two definitions is very minor, and many real-world DB designers just use the informal definition. The formal definition of 3NF is shown here for your reference.

    SUMMARY OF NORMALIZATION

    - As we go to higher normal forms, we create more relations.
    - Each higher normal form removes a certain type of dependency that causes redundancy.
    - As a relation is brought to a higher normal form:
      - we have more relations,
      - which increases the number of joins needed to form queries,
      - which increases the amount of join processing,
      - and more referential integrity constraints need to be maintained,
      - and thus the schema becomes more complicated and performance decreases.

    So, many real-world DB designers stop at 3NF, which removes typical redundancy reasonably well and still maintains performance. So, strive to achieve 3NF in your real-world RDB!