Database Management Systems (CS644) -...

Chapter Four Conceptual Database Design (ER modeling)

Transcript of Database Management Systems (CS644) -...

Page 1: Database Management Systems (CS644) - Weeblydbmshilcoe.weebly.com/.../2/66126761/database_management_syst… · Overview-database design Database design is the process of coming up

Chapter Four

Conceptual Database Design

(ER modeling)


Agenda (Chapter four)

Overview-database design

Conceptual Design (E-R Modeling)

Structural Constraints

EER- Generalization and Specialization

Reducing E-R Model to Table (logical Design)

Logical Database Design Process of Normalization

Objective

Learn the skills and methods of modeling organizational data using entity-relationship diagrams

Implement the rules of the relational data model through the process of normalization

Overview-database design

Database design is the process of coming up with different kinds of specifications for the data to be stored in the database.

Database design is one of the middle phases of information systems development when the system uses a database approach.

Cont…

Design is the part in which we describe how the data should be perceived at different levels and, finally, how it is going to be stored in a computer system.

Cont…

Information system development with a database application consists of several tasks, which include:

Planning of information systems design

Requirements analysis

Database design (conceptual, logical and physical design)

Interface, program and other designs

Implementation

Operation and support

Cont…

Of these phases, the prime interest of this course is the Database Design part, which is further divided into three sub-phases.

These sub-phases are:

Conceptual Design

Logical Design, and

Physical Design

Cont…

In general, one has to go back and forth between these tasks to refine a database design, and decisions in one task can influence the choices in another task.

In developing a good design, one should answer such questions as:

What are the relevant Entities for the Organization?

What are the important features of each Entity?

What are the important Relationships?

What are the important queries from the user?

What are the other requirements of the Organization and the Users?

Conceptual Database Design

Conceptual design revolves around discovering and analyzing organizational and user data requirements

The important activities are to identify:

Entities

Attributes

Relationships

Constraints

Based on these components, develop the ER model using ER diagram components

Cont…

Designing a conceptual model for the database is not a linear process but an iterative activity in which the design is refined again and again.

The important steps could be:

1. Identification of components

i. Entities

ii. Attributes

iii. Relationships

iv. Constraints

2. Use of notations to model

3. Draw, Review, Refine and Validate

Cont…

Before working on the conceptual design of the database, one has to know and answer the following basic questions:

What are the entities and relationships in the enterprise?

What information about these entities and relationships should we store in the database?

What are the integrity constraints that hold? (Constraints on each data item with respect to update, retrieval and storage)

Then represent this information pictorially in ER diagrams, and map the ER diagram into a relational schema.

Modeling tools

Be professional: use the right tool. Symbols are a language.

There are many ER diagramming tools

Rational Rose

Microsoft Visio

Oracle Designer

Power Designer


E-R Model Constructs (recall)

Entities:

Entity instance–person, place, object, event, concept (often corresponds to a row in a table)

Entity Type–collection of entities (often corresponds to a table)

Relationships:

Relationship instance–link between entities (corresponds to primary key-foreign key equivalencies in related tables)

Relationship type–category of relationship…link between entity types

Attribute–property or characteristic of an entity or relationship type (often corresponds to a field in a table)

How to Find Components of ER…

To identify the entities, attributes, relationships, and constraints on the data, different methods are used during the analysis phase.

These include information gathered through:

Interviewing end users individually and in a group

Questionnaire survey

Direct observation

Examining different documents

In general, understand the business rules


Entity; Attribute; Relationship

Constructs Identification-ERD

Detailed Steps in Creating an ERD

Here are the steps you may follow to create an entity-relationship diagram.

1. Identify Entities: These are typically the nouns and noun phrases in the descriptive data produced in your analysis. Do not include entities that are irrelevant to your domain.

2. Find Relationships: Discover the semantic relationships between entities. These are usually the verbs that connect the nouns. Not all relationships are this obvious; you may have to discover some on your own. The easiest way to see all possible relationships is to build a table with the entities across the columns and down the rows, and fill in those cells where a relationship exists between entities.

Cont…

3. Draw Rough ERD: Draw the entities and relationships that you have discovered.

4. Fill in Cardinality: Determine the cardinality of the relationships. You may want to decide on cardinality when you are creating a relationship table.

5. Define Primary Keys: Identify attribute(s) that uniquely identify each occurrence of that entity.

6. Draw Key-Based ERD: Now add them (the primary key attributes) to your ERD. Revise your diagram to eliminate many-to-many relationships, and tag all foreign keys.

Cont…

7. Identify Attributes: Identify all entity characteristics relevant to the domain being analyzed.

8. Map Attributes: Determine to which entity each characteristic belongs. Do not duplicate attributes across entities.

9. Draw fully attributed ERD: Now add these attributes. The diagram may need to be modified to accommodate necessary new entities.

10. Check Results: Is the diagram a consistent and complete representation of the domain?

Cont…

In all of these tasks, keep the business rules in mind

Business functions are expressed in terms of business rules


Example

business function-to-data matching matrix


Sample E-R Diagram- what it looks like

What Should an Entity Be?

SHOULD BE:

An object that will have many instances in the database

An object that will be composed of multiple attributes

An object that we are trying to model

SHOULD NOT BE:

A user of the database system (one who only uses the system and whose information is not important for the process)

(Sometimes user data is also important to capture. For instance, in a financial system, it is important to register the employee who prepared the payment or approved a transaction, as an issue of accountability.)

An output of the database system (e.g., a report, display)

Example of inappropriate entities

Figure: a system user and a system output shown as inappropriate entities, contrasted with appropriate entities


Attributes

Attribute–property or characteristic of an entity or relationship type

Classifications of attributes:

1. Required versus Optional Attributes

2. Simple versus Composite Attribute

3. Single-Valued versus Multivalued Attribute

4. Stored versus Derived Attributes

5. Entity Identifier Attributes


Identifiers (Keys)

Identifier (Key)–An attribute (or combination of attributes) that uniquely identifies individual instances of an entity type

Simple versus Composite Identifier

Candidate Key– an attribute that could be a key…satisfies the requirements for being an identifier

Will not be null
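The two requirements above (unique for every instance, never null) can be tested against sample data. The following is a minimal sketch, not part of the original slides: the function name `is_candidate_key` and the EMPLOYEE rows are invented for illustration. Passing the check on a sample only shows that the data does not contradict the choice; whether an attribute is really an identifier is decided by the business rules. The sketch also does not check that a composite identifier is minimal.

```python
from typing import Iterable, Mapping, Sequence


def is_candidate_key(rows: Iterable[Mapping[str, object]], attrs: Sequence[str]) -> bool:
    """Return True if `attrs` is never null and unique across `rows`."""
    seen = set()
    for row in rows:
        value = tuple(row.get(a) for a in attrs)
        if any(v is None for v in value):
            return False  # an identifier must not be null
        if value in seen:
            return False  # two instances share the same value
        seen.add(value)
    return True


# Hypothetical EMPLOYEE instances, invented for illustration.
employees = [
    {"EmpID": 1, "Name": "Abebe", "Dept": "Sales"},
    {"EmpID": 2, "Name": "Sara", "Dept": "Sales"},
    {"EmpID": 3, "Name": "Abebe", "Dept": "IT"},
]

print(is_candidate_key(employees, ["EmpID"]))         # simple identifier
print(is_candidate_key(employees, ["Name"]))          # two employees share a name
print(is_candidate_key(employees, ["Name", "Dept"]))  # composite identifier
```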

A composite attribute: an attribute broken into component parts

Entity with multivalued attribute (Skill) and derived attribute (Years_Employed)

Multivalued: an employee can have more than one skill

Derived: from date employed and current date
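The attribute classifications can also be sketched in code. This is only an illustration under assumptions: the `Address` components, the optional `phone` attribute and the sample values are invented; only Skill and Years_Employed come from the slide. The derived attribute is computed on demand rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Address:  # composite attribute: broken into component parts (assumed parts)
    street: str
    city: str


@dataclass
class Employee:
    emp_id: int                      # identifier, required
    name: str                        # simple, single-valued, required
    address: Address                 # composite
    date_employed: date              # stored
    skills: List[str] = field(default_factory=list)  # multivalued
    phone: Optional[str] = None      # optional (hypothetical attribute)

    def years_employed(self, today: date) -> int:
        """Derived attribute: computed from date employed and the current date."""
        years = today.year - self.date_employed.year
        # Subtract one if this year's anniversary has not been reached yet.
        if (today.month, today.day) < (self.date_employed.month, self.date_employed.day):
            years -= 1
        return years


emp = Employee(1, "Sara", Address("Bole Rd", "Addis Ababa"), date(2015, 6, 1),
               skills=["SQL", "Java"])
print(emp.years_employed(date(2020, 5, 31)))  # 4
print(emp.years_employed(date(2020, 6, 1)))   # 5
```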


Simple and composite identifier attributes

The identifier is boldfaced and underlined

More on Relationships

Relationship Types vs. Relationship Instances: The relationship type is modeled as lines between entity types; the instance is between specific entity instances

Relationships can have attributes: These describe features pertaining to the association between the entities in the relationship

Associative Entity–combination of relationship and entity

Relationship types and instances

a) Relationship type

b) Relationship instances

Degree of relationships

Unary: one entity related to another of the same entity type

Binary: entities of two different types related to each other

Ternary: entities of three different types related to each other


Examples of relationships of different degrees

a) Unary relationships


Examples of relationships of different degrees (cont.)

b) Binary relationships


Examples of relationships of different degrees (cont.)

c) Ternary relationship

Note: a relationship can have attributes of its own

Structural Constraints (Cardinality Constraints)

Cardinality Constraints - the number of instances of one entity that can or must be associated with each instance of another entity

Minimum Cardinality

If zero, then optional

If one or more, then mandatory

Maximum Cardinality

The maximum number of tuples taking part in the relationship

Cont…

One-to-One

Each entity in the relationship will have at most one related entity

E.g.: Relationship Manages between EMPLOYEE and BRANCH

The multiplicity of the relationship is:

➢ One branch can have one and only one manager

➢ One employee could manage either one or no branches

Employee (1..1) Manages (0..1) Branch

Peter Chen’s Notation
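One way this multiplicity could be enforced once the model is mapped to tables is sketched below with Python's built-in `sqlite3` module. The table layout, column names and sample rows are assumptions made for illustration, not the slides' own schema. `NOT NULL` on the manager column makes a manager mandatory for every branch (1..1), and `UNIQUE` stops an employee from managing more than one branch (0..1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    EmpID INTEGER PRIMARY KEY,
    Name  TEXT NOT NULL
);
CREATE TABLE Branch (
    BranchID  INTEGER PRIMARY KEY,
    -- NOT NULL: every branch has a manager; UNIQUE: at most one branch per employee
    ManagerID INTEGER NOT NULL UNIQUE REFERENCES Employee(EmpID)
);
""")
conn.execute("INSERT INTO Employee VALUES (1, 'Abebe'), (2, 'Sara')")
conn.execute("INSERT INTO Branch VALUES (10, 1)")


def try_insert(sql: str) -> bool:
    """Run an INSERT and report whether the constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("INSERT INTO Branch VALUES (11, 1)"))     # False: employee 1 already manages a branch
print(try_insert("INSERT INTO Branch VALUES (12, NULL)"))  # False: a branch needs a manager
print(try_insert("INSERT INTO Branch VALUES (13, 2)"))     # True: employee 2 manages no branch yet
```

Employees who manage no branch simply have no row in Branch pointing at them, which is how the optional side of the relationship shows up.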

Cont…

One-to-Many

An entity on one side of the relationship can have many related entities, but an entity on the other side will have a maximum of one related entity

E.g.: Relationship Leads between EMPLOYEE and PROJECT

The multiplicity of the relationship is:

➢ One employee may lead no, one or more project(s)

➢ One project is led by one employee

Employee (1..1) Leads (0..*) Project

Peter Chen’s Notation

Cont…

Many-to-Many

Entities on both sides of the relationship can have many related entities on the other side

E.g.: Relationship Teaches between INSTRUCTOR and COURSE

The multiplicity of the relationship is:

➢ One instructor teaches one or more course(s)

➢ One course is taught by no, one or more instructor(s)

Instructor (0..*) Teaches (1..*) Course

Peter Chen’s Notation
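Minimum and maximum cardinalities can be checked against a set of relationship instances. The sketch below is an illustration only: the helper names and the instructor and course data are invented. It counts how many Teaches instances each entity takes part in and reports the entities that fall outside the stated range.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Range = Tuple[int, Optional[int]]  # (minimum, maximum); None means "many" (*)


def violations(items: Iterable[str], counts: Counter, rng: Range) -> list:
    """Return the items whose number of related instances falls outside rng."""
    low, high = rng
    return [i for i in items
            if counts[i] < low or (high is not None and counts[i] > high)]


def check_teaches(instructors, courses, teaches):
    """Check Teaches: each instructor 1..* courses, each course 0..* instructors."""
    per_instructor = Counter(i for i, _ in teaches)
    per_course = Counter(c for _, c in teaches)
    return {
        "instructors": violations(instructors, per_instructor, (1, None)),
        "courses": violations(courses, per_course, (0, None)),
    }


# Hypothetical instances, invented for illustration.
instructors = ["Almaz", "Bekele", "Chala"]
courses = ["DBMS", "Networks"]
teaches = [("Almaz", "DBMS"), ("Bekele", "DBMS"), ("Almaz", "Networks")]

print(check_teaches(instructors, courses, teaches))
```

Here Chala teaches no course, so the mandatory side (minimum 1) is violated for that instructor, while the optional side (minimum 0) can never be violated.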

Examples of cardinality constraints

a) Mandatory cardinalities

A patient must have recorded at least one history, and can have many

A patient history is recorded for one and only one patient

Crow’s Foot Notation

Examples of cardinality constraints (cont.)

b) One optional, one mandatory

An employee can be assigned to any number of projects, or may not be assigned to any at all

A project must be assigned to at least one employee, and may be assigned to many

Peter Chen’s Notation

Examples of cardinality constraints (cont.)

c) Optional cardinalities

A person is married to at most one other person, or may not be married at all

Peter Chen’s Notation

Entities can be related to one another in more than one way

Examples of multiple relationships: Employees and departments

Peter Chen’s Notation

Multivalued attributes can be represented as relationships and entities

simple

composite

Peter Chen’s Notation


Strong entity Weak entity

Identifying relationship

Peter Chen’s Notation


Associative Entities

An Entity–which has attributes

A relationship–links entities together

When should a relationship with attributes instead be an associative entity?

All relationships for the associative entity should have a “many” cardinality

The associative entity could have meaning independent of the other entities

The associative entity preferably has a unique identifier, and should also have other attributes

The associative entity may participate in relationships other than those with the entities of the associated relationship

Ternary relationships should be converted to associative entities

Example

What if we want to keep the date the employee completed the course?

A binary relationship with an attribute

Here, the date completed attribute pertains specifically to the employee’s completion of a course…it is an attribute of the relationship

An associative entity (CERTIFICATE)

An associative entity is like a relationship with an attribute, but it is also considered to be an entity in its own right.

Note that the many-to-many cardinality between entities will be replaced by two one-to-many relationships with the associative entity.
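As a rough sketch of what the two one-to-many relationships look like after mapping to tables, the example below uses Python's built-in `sqlite3` module. The column names (`CertificateNo`, `DateCompleted` and so on) and the sample rows are assumptions, not taken from the slide's figure. CERTIFICATE has its own identifier, one foreign key to each of the original entities, and carries the date completed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Course   (CourseID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
-- Associative entity: its own identifier plus one foreign key per related entity.
CREATE TABLE Certificate (
    CertificateNo  INTEGER PRIMARY KEY,
    EmpID          INTEGER NOT NULL REFERENCES Employee(EmpID),
    CourseID       INTEGER NOT NULL REFERENCES Course(CourseID),
    DateCompleted  TEXT NOT NULL
);
""")
conn.execute("INSERT INTO Employee VALUES (1, 'Sara'), (2, 'Abebe')")
conn.execute("INSERT INTO Course VALUES (100, 'SQL Basics'), (200, 'ER Modeling')")
conn.executemany(
    "INSERT INTO Certificate VALUES (?, ?, ?, ?)",
    [(1, 1, 100, "2020-01-15"), (2, 1, 200, "2020-03-02"), (3, 2, 100, "2020-02-10")],
)

# One employee has many certificates, and one course has many certificates.
rows = conn.execute("""
    SELECT e.Name, c.Title, cert.DateCompleted
    FROM Certificate AS cert
    JOIN Employee AS e ON e.EmpID = cert.EmpID
    JOIN Course   AS c ON c.CourseID = cert.CourseID
    ORDER BY cert.CertificateNo
""").fetchall()
for row in rows:
    print(row)
```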

An associative entity – bill of materials structure

This could just be a relationship with attributes…it’s a judgment call


Ternary relationship as an associative entity


ER Diagramming

Notation Standards

Cont…

Related diagramming convention techniques:

Peter Chen notation

Crow’s Foot Notation

Bachman notation

EXPRESS

IDEF1X

Martin notation

UML class diagrams

Basic E-R notation

Figure: legend of the basic E-R notation, showing entity symbols (including a special entity that is also a relationship), relationship symbols, attribute symbols, relationship degrees (which specify the number of entity types involved) and relationship cardinalities (which specify how many of each entity type is allowed)

Symbols


ER Diagramming Example-1


Cont….

The University of Toronto has several departments. Each department is managed by a chair, and has at least one professor working for it. Professors must be assigned to one, but possibly more departments. At least one professor teaches each course, but a professor may be on sabbatical and not teach any course. Each course may be taught more than once by different professors. We know of the department name, the professor name, the professor employee id, the course names, the course schedule, the term/year that the course is taught, the departments the professor is assigned to, the department that offers the course.


Solution

Identify Entities

These are typically the nouns and noun phrases in the descriptive data produced in your analysis.

Do not include entities that are irrelevant to your domain.

The entity candidates are department, chair, professor and course.

Since there is only one instance of the “University of Toronto”, which is the owner of the system, we exclude it from our consideration.

Finding Relationships

            Department   Chair       Professor    Course
Department               Managed_By  Is_Assigned  Offers
Chair       Manages                  Is_a
Professor   Assigned_To  Is_a                     Teaches
Course      Offered_By               Taught_By
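The relationship table above can be kept as a small data structure while the model is being refined. The sketch below is one possible way to do that; the function names are invented, and the cell placement follows the table with the diagonal left empty. It rebuilds the matrix and lists any pair that has been named in one direction only, which is a quick consistency check before drawing the rough ERD.

```python
entities = ["Department", "Chair", "Professor", "Course"]

# (row entity, column entity) -> relationship name, taken from the table above.
relationships = {
    ("Department", "Chair"): "Managed_By",
    ("Department", "Professor"): "Is_Assigned",
    ("Department", "Course"): "Offers",
    ("Chair", "Department"): "Manages",
    ("Chair", "Professor"): "Is_a",
    ("Professor", "Department"): "Assigned_To",
    ("Professor", "Chair"): "Is_a",
    ("Professor", "Course"): "Teaches",
    ("Course", "Department"): "Offered_By",
    ("Course", "Professor"): "Taught_By",
}


def matrix_rows(entities, relationships):
    """Build the matrix as a list of rows; empty cells mean no relationship."""
    header = [""] + entities
    body = [[r] + [relationships.get((r, c), "") for c in entities] for r in entities]
    return [header] + body


def one_sided(relationships):
    """Return pairs that are filled in one direction but not the other."""
    return sorted(p for p in relationships if (p[1], p[0]) not in relationships)


for row in matrix_rows(entities, relationships):
    print("".join(cell.ljust(13) for cell in row).rstrip())
print(one_sided(relationships))  # [] when every relationship is named both ways
```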


Draw Rough ERD

Draw the entities and relationships that you have discovered.

Figure: rough ERD with the entities Department, Chair, Professor and Course (each with a PK placeholder) and the relationships Assigned, Is a, Offers, Teaches and Managed By

Fill in Cardinality: Determine the cardinality of the relationships. You may want to decide on cardinality when you are creating a relationship table.

Figure: the same ERD (Department, Chair, Professor, Course; relationships Assigned, Is a, Offers, Teaches, Managed By) with cardinalities filled in

Cont…

Define Primary Keys

Identify attribute(s) that uniquely identify each occurrence of that entity.

Entity       Possible PK
Department   DeptName OR DeptID
Chair        EmpID
Professor    EmpID
Course       CourseName OR CourseNo

Draw Key-Based ERD

Now add them (the primary key attributes) to your ERD. Revise your diagram to eliminate many-to-many relationships, and tag all foreign keys.

Figure: key-based ERD with Department (PK DeptID), Chair (PK EmpID), Professor (PK EmpID) and Course (PK CourseNo), and the relationships Assigned, Is a, Offers, Teaches and Managed By

Cont…

Identify Attributes

Identify all entity characteristics relevant to the domain being analyzed.

Excluding those keys already identified: Schedule, Term, Professor name, Department Chair (which is an employee ID, a foreign key to Professor)

Map Attributes

Determine to which entity each characteristic belongs.

Do not duplicate attributes across entities.

If necessary, contain them in a new, related entity: Schedule -- Prof-Course, Term -- Prof-Course, Chair -- Department

Draw fully attributed ERD

Now add these attributes.

The diagram may need to be modified to accommodate necessary new entities.

Figure: fully attributed ERD with Department (PK DeptID), Chair (PK EmpID), Professor (PK EmpID) and Course (PK CourseNo); relationships Assigned, Is a, Offers, Teaches and Managed By; attributes shown: DeptName, EstablishmentYear, Location, CourseTitle, CreditHr, FullName, Specialization, HireDate, plus two unnamed "attribute name" placeholders

Finally: Review, Modify and Validate
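Part of the final review can be mechanized. The sketch below is a hypothetical helper, not a method from the slides: it checks that every entity has a primary key among its attributes and that every relationship connects known entities. The grouping of attributes under each entity is an assumption based on the attribute names in the example; the relationship endpoints follow the relationship table of the example.

```python
def check_model(entities, relationships):
    """Return a list of problems found in a small ER description."""
    problems = []
    for name, spec in entities.items():
        if not spec.get("pk"):
            problems.append(f"{name}: no primary key")
        missing = [a for a in spec.get("pk", []) if a not in spec.get("attributes", [])]
        if missing:
            problems.append(f"{name}: key attributes not listed: {missing}")
    for rel, (left, right) in relationships.items():
        for end in (left, right):
            if end not in entities:
                problems.append(f"{rel}: unknown entity {end}")
    return problems


# Attribute-to-entity grouping below is assumed for illustration.
entities = {
    "Department": {"pk": ["DeptID"],
                   "attributes": ["DeptID", "DeptName", "EstablishmentYear", "Location"]},
    "Chair": {"pk": ["EmpID"], "attributes": ["EmpID"]},
    "Professor": {"pk": ["EmpID"],
                  "attributes": ["EmpID", "FullName", "Specialization", "HireDate"]},
    "Course": {"pk": ["CourseNo"], "attributes": ["CourseNo", "CourseTitle", "CreditHr"]},
}
relationships = {
    "Assigned": ("Professor", "Department"),
    "Is a": ("Chair", "Professor"),
    "Offers": ("Department", "Course"),
    "Teaches": ("Professor", "Course"),
    "Managed By": ("Department", "Chair"),
}

print(check_model(entities, relationships))  # []

# A deliberately incomplete model: Student was never defined as an entity.
incomplete = dict(relationships, Advises=("Professor", "Student"))
print(check_model(entities, incomplete))     # ['Advises: unknown entity Student']
```

A check like this covers consistency only; whether the diagram is a complete representation of the domain still has to be validated with the users.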

Example 2

A company has several departments. Each department has a supervisor and at least one employee. Employees must be assigned to at least one, but possibly more departments. At least one employee is assigned to a project, but an employee may be on vacation and not assigned to any projects. The important data fields are the names of the departments, projects, supervisors and employees, as well as the supervisor and employee number and a unique project number.

Draw an ER Diagram


Conceptual Database Design

(EER modeling)

Cont…

Enhanced E-R (EER) Models

Object-oriented extensions to the E-R model

EER is important when:

Some tuples have special attributes to be described

A relationship between two entities is partial

EER Concepts

Generalization…….Superclass/Supertype

Specialization………Subclass/Subtype

Attribute Inheritance

Subtype entities inherit values of all attributes of the super-type

An instance of a subtype is also an instance of the super-type

Constraints on specialization and generalization

Supertypes and Subtypes

Subclass/Subtype

An entity type whose tuples have attributes that distinguish its members from tuples of the generalized or superclass entities.

Subclasses can be either mutually exclusive (disjoint) or overlapping (inclusive).

A single subclass may inherit attributes from two distinct superclasses.

A mutually exclusive category/subclass is when an entity instance can be in only one of the subclasses. E.g.: An EMPLOYEE can either be SALARIED or PART-TIMER but not both.

An overlapping category/subclass is when an entity instance may be in two or more subclasses. E.g.: A PERSON who works for a university can be both an EMPLOYEE and a STUDENT at the same time.
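The difference between the two kinds of subclass can be checked on sample instances. This is a minimal sketch with invented identifiers and function names: a specialization is treated as disjoint when no instance appears in more than one subclass, and as overlapping otherwise.

```python
def overlapping_members(subclasses):
    """Return instances that appear in two or more subclasses."""
    seen, repeated = set(), set()
    for members in subclasses.values():
        repeated |= seen & members
        seen |= members
    return repeated


def is_disjoint(subclasses):
    """A specialization is disjoint when no instance is in more than one subclass."""
    return not overlapping_members(subclasses)


# Hypothetical instance identifiers, invented for illustration.
employee_kinds = {"SALARIED": {"e1", "e2"}, "PART-TIMER": {"e3"}}
person_roles = {"EMPLOYEE": {"p1", "p2"}, "STUDENT": {"p2", "p3"}}

print(is_disjoint(employee_kinds))        # True
print(is_disjoint(person_roles))          # False
print(overlapping_members(person_roles))  # {'p2'}
```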

Supertypes and Subtypes…

Superclass/Supertype

An entity type whose tuples share common attributes.

Attributes that are shared by all entity occurrences (including the identifier) are associated with the supertype.

Is the generalized entity

Relationship of Superclass and Subclass

An instance cannot be a member of a subclass only; i.e., every instance of a subclass is also an instance of the superclass.

A member of a subclass is represented as a distinct database object: a distinct record that is related to its superclass via a foreign key attribute.

The relationship between a subclass and a superclass is of the “IS A” or “IS PART OF” type.

Subclass IS PART OF Superclass

Manager IS AN Employee

Notation

• All subclasses (specialized entity sets) should be connected to the superclass by a line to a circle, with a subset symbol indicating the direction of the subclass/superclass relationship.

Basic notation for supertype/subtype relationships

a) EER notation

Cont…

Attribute inheritance: an entity that is a member of a subclass inherits all the attributes of the entity as a member of the superclass.

The entity also inherits all the relationships in which the superclass participates.

An entity may belong to more than one subclass category.

All subclasses of a superclass share a common unique identifier attribute (primary key).

Consider the EMPLOYEE supertype entity shown above.

This entity can have several different subtype entities (for example, HOURLY and SALARIED), each with distinct properties not shared by other subtypes.

But whether the employee is HOURLY or SALARIED, the same attributes (EmployeeId, Name, and DateHired) are shared.

The supertype EMPLOYEE stores all properties that the subclasses have in common.

HOURLY employees have the unique attribute Wage (hourly wage rate), while SALARIED employees have two unique attributes, Pension and Salary.

Generalization and Specialization

Generalization: The process of defining a more general entity type from a set of more specialized entity types (bottom-up).

Specialization: The process of defining one or more subtypes of the supertype and forming supertype/subtype relationships (top-down).


Example of generalization

a) Three entity types: CAR, TRUCK, and MOTORCYCLE

All these types of vehicles have common attributes


Example of generalization (cont.)

b) Generalization to VEHICLE supertype

So we put the shared attributes in a supertype.

Note: no subtype for motorcycle, since it has no unique attributes

Example of specialization

a) Entity type PART

Only applies to manufactured parts

Applies only to purchased parts


Example of specialization (cont.)

b) Specialization to MANUFACTURED PART and PURCHASED PART

Created 2 subtypes

Figure labels: SupplierID, Unit_Price

Note: multivalued attribute was replaced by an associative entity relationship to another entity

Example of specialization (cont.)

Relationship without Enhanced ER Model

Specialization of EMPLOYEE with Job Type

Constraints in Supertype

Completeness Constraints: Whether an instance of a supertype must also be a member of at least one subtype

Total Specialization Rule: Yes (double line)

Partial Specialization Rule: No (single line)

Constraints in Supertype

The Total Specialization Rule specifies that every entity occurrence in the superclass must be a member of at least one of the subclasses.

E.g.: If we have EXTENSION and REGULAR as subclasses of a superclass STUDENT, then each student must be either an EXTENSION or a REGULAR student.

Thus the participation of instances of STUDENT in the EXTENSION and REGULAR subclasses will be total.

Examples of completeness constraints

a) Total specialization rule

A patient must be either an outpatient or a resident patient

Constraints in Supertype

The Partial Specialization Rule specifies that it is not necessary for every entity occurrence in the superclass to be a member of one of the subclasses.

Here participation in the specialization is optional.

E.g.: If we have MANAGER and SECRETARY as subclasses of a superclass EMPLOYEE, then not every employee is a manager or a secretary.

Thus the participation of instances of EMPLOYEE in the MANAGER and SECRETARY subclasses will be partial.

Examples of completeness constraints (cont.)

b) Partial specialization rule

A vehicle could be a car, a truck, or neither

Constraints in Supertype

Disjointness Constraints: Whether an instance of a supertype may simultaneously be a member of two (or more) subtypes

Disjoint Rule: An instance of the supertype can be a member of only ONE of the subtypes

E.g.: an EMPLOYEE can be either SALARIED or PART-TIMER, but not both at the same time.

Overlap Rule: An instance of the supertype could be a member of more than one of the subtypes

E.g.: a PERSON working at the university can be both a STUDENT and an EMPLOYEE at the same time.

This is diagrammed by placing either the letter "d" for disjoint or "o" for overlapping inside the circle on the generalization hierarchy portion of the E-R diagram.

Examples of disjointness constraints

a) Disjoint rule

A patient can be either an outpatient or a resident, but not both

Examples of disjointness constraints (cont.)

b) Overlap rule

A part may be both purchased and manufactured

Constraints in Supertype

The two types of constraints on generalization and specialization (disjointness and completeness constraints) are independent of one another.

That is, whether a specialization is disjoint or overlapping does not determine whether the instances of the superclass have total or partial participation in that specialization.

Combining the two types of constraints gives four possible cases:

Disjoint AND Total

Disjoint AND Partial

Overlapping AND Total

Overlapping AND Partial
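
As a rough illustration of how the two constraints combine, the Python sketch below inspects a population of instances and reports which of the four cases it currently matches. The function name `classify_specialization` and the sample identifiers are hypothetical. The constraints themselves are design rules, so data can only show whether a rule is being respected, not which rule the designer intended.

```python
def classify_specialization(supertype, subtypes):
    """Describe a population of instances as (disjointness, completeness).

    supertype: set of identifiers of all supertype instances
    subtypes:  dict mapping each subtype name to the set of its member identifiers
    """
    members = list(subtypes.values())
    covered = set().union(*members)
    # Total: every supertype instance appears in at least one subtype.
    completeness = "total" if supertype <= covered else "partial"
    # Disjoint: no instance appears in more than one subtype.
    disjoint = sum(len(m) for m in members) == len(covered)
    return ("disjoint" if disjoint else "overlapping", completeness)


# Hypothetical PATIENT population: everyone is an outpatient or a resident.
print(classify_specialization({1, 2, 3}, {"OUTPATIENT": {1, 2}, "RESIDENT": {3}}))
# Hypothetical PART population: part 11 is both made and bought, part 12 is neither.
print(classify_specialization({10, 11, 12}, {"MANUFACTURED": {10, 11}, "PURCHASED": {11}}))
```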


Constraints in Supertype/Subtype Discriminators

Subtype Discriminator: An attribute of the supertype whose values determine the target subtype(s)

Disjoint – a simple attribute with alternative values to indicate the possible subtypes

Overlapping – a composite attribute whose subparts pertain to different subtypes. Each subpart contains a Boolean value to indicate whether or not the instance belongs to the associated subtype
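
The two kinds of discriminator can be sketched in Python as follows. The class names, field names and the code values "O" and "R" are assumptions chosen for illustration, loosely following the patient and part examples used on these slides.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Disjoint case: one simple attribute names the single subtype."""
    patient_id: int
    patient_type: str  # hypothetical codes: "O" = outpatient, "R" = resident


@dataclass
class Part:
    """Overlap case: a composite discriminator with one Boolean per subtype."""
    part_no: int
    manufactured: bool
    purchased: bool


def patient_subtypes(p):
    # A single value can select only one subtype.
    return [{"O": "OUTPATIENT", "R": "RESIDENT PATIENT"}[p.patient_type]]


def part_subtypes(p):
    # Each flag is tested independently, so zero, one or both subtypes may apply.
    names = []
    if p.manufactured:
        names.append("MANUFACTURED PART")
    if p.purchased:
        names.append("PURCHASED PART")
    return names


print(patient_subtypes(Patient(1, "O")))
print(part_subtypes(Part(7, True, True)))
```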


Introducing a subtype discriminator (disjoint rule)

A simple attribute with different possible values indicating the subtype

Subtype discriminator (overlap rule)

A composite attribute with sub-attributes indicating “yes” or “no” to determine whether it is of each subtype

Example of supertype/subtype hierarchy


ERD-to-Relation

Starting point for Logical Database Design

Cont…

Why ER

ER is a very high-level representation and is closer to the end user's understanding

ER cannot be implemented directly

Why Mapping

The relational data model requires everything to be represented in the form of a table

Cont…

How

The first step is mapping: converting the conceptual design to a form suitable for the relational logical model, which is in the form of tables

The next step is normalization (applying the rules of the relational data model)

ER Diagrams into Relations

Mapping Regular Entities to Relations/Tables

Every entity will be a table

1. Simple attributes: E-R attributes map directly onto the relation

2. Composite attributes: Use only their simple, component attributes

3. Multivalued attributes: Each becomes a separate relation with a foreign key taken from the superior entity
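
The three rules can be sketched with Python's built-in sqlite3 module. The table names, column names and sample values below are assumptions made for illustration (a CUSTOMER with a composite address and a multivalued phone number); they are not taken from the slide figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,   -- simple attribute
    customer_name TEXT NOT NULL,         -- simple attribute
    street        TEXT,                  -- components of the composite
    city          TEXT,                  --   attribute Address
    postal_code   TEXT
);
CREATE TABLE customer_phone (            -- multivalued attribute Phone
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    phone       TEXT NOT NULL,
    PRIMARY KEY (customer_id, phone)
);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Customer One', '1 Main St', 'Sample City', '1000')")
conn.executemany("INSERT INTO customer_phone VALUES (?, ?)",
                 [(1, "555-0001"), (1, "555-0002")])
phones = [row[0] for row in conn.execute(
    "SELECT phone FROM customer_phone WHERE customer_id = 1 ORDER BY phone")]
print(phones)
```

One customer row now relates to many phone rows, which is the one-to-many link between the original entity and the new relation.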

Mapping a regular entity

(a) CUSTOMER entity type with simple attributes

(b) CUSTOMER relation

Mapping a composite attribute

(a) CUSTOMER entity type with composite attribute

(b) CUSTOMER relation with address detail

Mapping an entity with a multivalued attribute

Multivalued attribute becomes a separate relation with foreign key

One-to-many relationship between original entity and new relation

ER Diagrams into Relations

Mapping Weak Entities

Becomes a separate relation with a foreign key taken from the superior/strong entity

Primary key composed of:

Partial identifier of weak entity

and/or

Primary key of identifying relation (strong entity)
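
A minimal sqlite3 sketch of the weak-entity rule, assuming a DEPENDENT entity owned by EMPLOYEE. The column names and sample rows are hypothetical, and the helper `try_insert` exists only to show which rows the constraints reject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    employee_name TEXT NOT NULL
);
CREATE TABLE dependent (
    employee_id    INTEGER NOT NULL REFERENCES employee(employee_id),  -- owner's key
    dependent_name TEXT NOT NULL,                                      -- partial identifier
    date_of_birth  TEXT,
    PRIMARY KEY (employee_id, dependent_name)                          -- composite key
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Employee One')")
conn.execute("INSERT INTO dependent VALUES (1, 'Child A', '2015-04-02')")


def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A dependent without an owning employee is rejected (null foreign key).
print(try_insert("INSERT INTO dependent VALUES (NULL, 'Child B', NULL)"))
# The same owner and partial identifier cannot appear twice.
print(try_insert("INSERT INTO dependent VALUES (1, 'Child A', NULL)"))
```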


Example of mapping a weak entity

a) Weak entity DEPENDENT

Example of mapping a weak entity (cont.)

b) Relations resulting from weak entity

Figure labels: Foreign key; Composite primary key

NOTE: the domain constraint for the foreign key should NOT allow null value if DEPENDENT is a weak entity

Mapping Relationships

The cardinality will be the basis for mapping relationships

One-to-One:

The primary key of one side can be placed in the other as a foreign key.

Recommendation: When participation is optional for one entity and mandatory for the other, the primary key of the optional participant becomes a foreign key in the relation of the mandatory participant, so the foreign key is never null

One-to-Many:

The primary key on the one side becomes a foreign key on the many side

Many-to-Many:

Create a new relation in which the primary keys of the two entities become foreign keys. The foreign keys also form part of the primary key of the new table.

Example of mapping a 1:M relationship

a) Relationship between customers and orders

Note the mandatory one

b) Mapping the relationship

Foreign key

Again, no null value in the foreign key, because of the mandatory minimum cardinality
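
A hedged sqlite3 sketch of this 1:M mapping between customers and orders. The table and column names and the sample rows are assumptions, since the slide figure is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE customer_order (            -- "order" itself is a reserved word in SQL
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    customer_id INTEGER NOT NULL         -- mandatory one: null is not allowed
        REFERENCES customer(customer_id) -- key of the "one" side on the "many" side
);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Customer One')")
conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?)",
                 [(100, "2024-01-05", 1), (101, "2024-01-09", 1)])
order_count = conn.execute(
    "SELECT COUNT(*) FROM customer_order WHERE customer_id = 1").fetchone()[0]
print(order_count)
```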


Example of mapping an M:N relationship

a) Many to Many relationship (M:N)

The Completes relationship will need to become a separate relation

Example of mapping an M:N relationship (cont.)

a) Three resulting relations

New intersection relation

Figure labels: Foreign key; Foreign key; Composite primary key

Example of mapping an M:N relationship

b) Many-to-many relationship (M:N) with attribute

✓ What if the relationship has an additional attribute called “Date Completed”?

✓ The Completes relationship will need to become a separate relation with the newly added attribute

Example of mapping an M:N relationship (cont.)

b) Three resulting relations

New intersection relation

Figure labels: Foreign key; Foreign key; Composite primary key
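
The three relations can be sketched with sqlite3 as below. The slides name only the Completes relationship and its Date Completed attribute; the EMPLOYEE and COURSE entities, the column names and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, employee_name TEXT NOT NULL);
CREATE TABLE course   (course_id   INTEGER PRIMARY KEY, course_title  TEXT NOT NULL);
CREATE TABLE completes (                 -- new intersection relation
    employee_id    INTEGER NOT NULL REFERENCES employee(employee_id),
    course_id      INTEGER NOT NULL REFERENCES course(course_id),
    date_completed TEXT,                 -- attribute of the relationship itself
    PRIMARY KEY (employee_id, course_id) -- composite key built from both foreign keys
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "Employee One"), (2, "Employee Two")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(10, "SQL Basics"), (20, "ER Modeling")])
conn.executemany("INSERT INTO completes VALUES (?, ?, ?)",
                 [(1, 10, "2024-02-01"), (1, 20, "2024-03-15"), (2, 10, "2024-02-01")])
courses_of_1 = [r[0] for r in conn.execute(
    "SELECT c.course_title FROM completes AS k "
    "JOIN course AS c ON c.course_id = k.course_id "
    "WHERE k.employee_id = 1 ORDER BY c.course_id")]
print(courses_of_1)
```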


Example of mapping a binary 1:1 relationship

a) One to one relationship (1:1)

Often in 1:1 relationships, one direction is optional.

Example of mapping a binary 1:1 relationship (cont.)

b) Resulting relations

Foreign key goes in the relation on the optional side, matching the primary key on the mandatory side
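
A minimal sqlite3 sketch of a 1:1 mapping, assuming hypothetical NURSE and CARE_CENTER tables in which every care center must have a nurse in charge, while a nurse need not be in charge of any center. Under that assumption the foreign key is placed in CARE_CENTER, so it never has to be null, and a UNIQUE constraint keeps the relationship one-to-one. The helper `accepted` is only there to show which rows the constraints allow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE nurse (
    nurse_id   INTEGER PRIMARY KEY,
    nurse_name TEXT NOT NULL
);
CREATE TABLE care_center (
    center_id       INTEGER PRIMARY KEY,
    location        TEXT NOT NULL,
    nurse_in_charge INTEGER NOT NULL UNIQUE   -- NOT NULL: every center has a nurse;
        REFERENCES nurse(nurse_id)            -- UNIQUE: a nurse runs at most one center
);
""")
conn.executemany("INSERT INTO nurse VALUES (?, ?)", [(1, "Nurse One"), (2, "Nurse Two")])
conn.execute("INSERT INTO care_center VALUES (10, 'East Wing', 1)")


def accepted(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Nurse 1 already runs a center, so a second assignment breaks the 1:1 rule.
print(accepted("INSERT INTO care_center VALUES (11, 'West Wing', 1)"))
# Nurse 2 is free, so this row is fine.
print(accepted("INSERT INTO care_center VALUES (12, 'North Wing', 2)"))
```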


Example of mapping an associative entity

a) An associative entity


Example of mapping an associative entity (cont.)

b) Three resulting relations

Composite primary key formed from the two foreign keys

Example of mapping an associative entity with an identifier

a) SHIPMENT associative entity

Example of mapping an associative entity with an identifier (cont.)

b) Three resulting relations

Primary key differs from foreign keys

ER Diagrams into Relations

Mapping Unary Relationships

One-to-One:

Recursive foreign key in the same relation

One-to-Many:

Recursive foreign key in the same relation

Many-to-Many:

Create a new relation in which the primary key of the entity appears twice as a foreign key, once for each role in the recursive relationship. (N.B.: make sure that the two foreign key names differ from one another even though they refer to the same primary key in the main table)
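
Both unary cases can be sketched with sqlite3. EMPLOYEE, ITEM and COMPONENT are the relation names used on the following slides; the column names, the quantity attribute and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Unary 1:N: each employee has at most one manager, who is also an employee.
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    manager_id  INTEGER REFERENCES employee(employee_id)   -- recursive foreign key
);
-- Unary M:N (bill of materials): an item contains items and is contained in items.
CREATE TABLE item (
    item_no INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE component (
    item_no      INTEGER NOT NULL REFERENCES item(item_no),  -- role: the assembly
    component_no INTEGER NOT NULL REFERENCES item(item_no),  -- role: the part used
    quantity     INTEGER NOT NULL,
    PRIMARY KEY (item_no, component_no)
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Manager", None), (2, "Report A", 1), (3, "Report B", 1)])
conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [(1, "Bicycle"), (2, "Wheel"), (3, "Spoke")])
conn.executemany("INSERT INTO component VALUES (?, ?, ?)", [(1, 2, 2), (2, 3, 32)])
reports = [r[0] for r in conn.execute(
    "SELECT name FROM employee WHERE manager_id = 1 ORDER BY employee_id")]
print(reports)
```

The two foreign keys in `component` carry different names although both point at `item.item_no`, which is the naming precaution the N.B. describes.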

Mapping a unary 1:N relationship

➢ EMPLOYEE entity with unary relationship

➢ EMPLOYEE relation with recursive foreign key

Mapping a unary M:N relationship

➢ Bill-of-materials relationships (M:N)

➢ ITEM and COMPONENT relations

ER Diagrams into Relations

Mapping Ternary (and n-ary) Relationships

Create a new relation and post the primary keys of all participating entities into the new table.

Include any attributes that are meaningful only when the relationship takes place.

Mapping a ternary relationship

a) PATIENT TREATMENT ternary relationship with associative entity

Mapping a ternary relationship (cont.)

b) Mapping the ternary relationship PATIENT TREATMENT

Remember that the primary key MUST be unique

This is why treatment date and time are included in the composite primary key

But this makes a very cumbersome key…

It would be better to create a surrogate key like Treatment#
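
The surrogate-key suggestion can be sketched with sqlite3 as follows. Only PATIENT TREATMENT, the treatment date and time, and Treatment# come from the slide; the participating entities PATIENT, PHYSICIAN and TREATMENT, the column names and the sample row are assumptions. Keeping the natural key as a UNIQUE constraint is one possible design choice, not something the slide prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE patient   (patient_id     INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE physician (physician_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE treatment (treatment_code TEXT    PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE patient_treatment (
    treatment_no   INTEGER PRIMARY KEY,   -- surrogate key (the slide's Treatment#)
    patient_id     INTEGER NOT NULL REFERENCES patient(patient_id),
    physician_id   INTEGER NOT NULL REFERENCES physician(physician_id),
    treatment_code TEXT    NOT NULL REFERENCES treatment(treatment_code),
    treated_on     TEXT    NOT NULL,      -- treatment date
    treated_at     TEXT    NOT NULL,      -- treatment time
    results        TEXT,
    -- the cumbersome natural key survives as a uniqueness rule
    UNIQUE (patient_id, physician_id, treatment_code, treated_on, treated_at)
);
""")
conn.execute("INSERT INTO patient VALUES (1, 'Patient One')")
conn.execute("INSERT INTO physician VALUES (7, 'Physician Seven')")
conn.execute("INSERT INTO treatment VALUES ('XR', 'X-ray')")
insert_sql = ("INSERT INTO patient_treatment "
              "(patient_id, physician_id, treatment_code, treated_on, treated_at, results) "
              "VALUES (?, ?, ?, ?, ?, ?)")
row = (1, 7, "XR", "2024-05-01", "09:30", None)
conn.execute(insert_sql, row)
treatment_no = conn.execute("SELECT treatment_no FROM patient_treatment").fetchone()[0]
print(treatment_no)
```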


EER Diagrams into Relations

Mapping Supertype/Subtype Relationships

A separate relation for the supertype and for each subtype

Supertype attributes (including the subtype discriminator) go into the supertype relation

Subtype attributes go into each subtype relation; the primary key of the supertype relation also becomes the primary key of each subtype relation
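
A sqlite3 sketch of this mapping, reusing the EMPLOYEE, HOURLY and SALARIED example from the EER slides. The attribute names come from that example; the discriminator column `employee_type`, its codes 'H' and 'S', the column types and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    date_hired    TEXT NOT NULL,
    employee_type TEXT NOT NULL CHECK (employee_type IN ('H', 'S'))  -- discriminator
);
CREATE TABLE hourly_employee (
    employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id),
    wage        REAL NOT NULL
);
CREATE TABLE salaried_employee (
    employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id),
    salary      REAL NOT NULL,
    pension     REAL
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Employee One', '2020-01-15', 'H')")
conn.execute("INSERT INTO hourly_employee VALUES (1, 18.5)")
conn.execute("INSERT INTO employee VALUES (2, 'Employee Two', '2019-06-01', 'S')")
conn.execute("INSERT INTO salaried_employee VALUES (2, 52000, 2600)")
# Joining a subtype to the supertype on the shared key recovers the inherited attributes.
hourly = conn.execute(
    "SELECT e.name, e.date_hired, h.wage FROM hourly_employee AS h "
    "JOIN employee AS e ON e.employee_id = h.employee_id").fetchall()
print(hourly)
```

Because each subtype table uses the supertype's key as both its primary key and a foreign key, every subtype row pairs with exactly one supertype row.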


Supertype/subtype relationships

Mapping Supertype/subtype relationships to relations

These are implemented as one-to-one relationships

Example

[ER diagram: entities Department, Employee, Project; relationships Works_for, assigned, Run_By]

Solution

Convert the ERD into relations

The simple process is:

1. Create a table/relation for each entity from the ERD.

2. The entity attributes become the attributes/columns for the table/relation.

3. Potentially, include some foreign keys to represent relationships shown in the ERD.

Cont…

Create a table for each ERD entity

There are 6 entities in the above figure, so there are six tables with the same names:

Department,

Supervisor,

EmployeeDepartment,

Employee,

EmployeeProject,

Project.


Cont…

Entity attributes become table attributes

DEPARTMENT( DepartmentName )

SUPERVISOR( SupervisorNumber, SupervisorName )

EMPLOYEEDEPARTMENT( DepartmentName, EmployeeNumber )

EMPLOYEE( EmployeeNumber, EmployeeName )

EMPLOYEEPROJECT( EmployeeNumber, ProjectNumber )

PROJECT( ProjectNumber )

Cont…

Include some foreign keys to represent relationships

Let's list the relationships and take each in turn

Run By

Can be represented by adding SupervisorNumber to the Department table

DEPARTMENT( DepartmentName, SupervisorNumber )

Is assigned (employee-department)

Is already represented in the EmployeeDepartment table


Cont…

Involves

Also represented in the EmployeeDepartment table

Works on

Represented in the EmployeeProject table

Is assigned (employee-project)

Also represented by the EmployeeProject table.

Cont…

This means no other foreign keys are really needed.

This leaves the Project table with only a primary key.

But

The primary key is meant to uniquely identify instances of an entity.

The only data stored in this table is the primary key, so it has nothing to identify.

The choices are:

Remove the table. It's not needed if it only has the primary key.

Add more attributes to the table.

So the Project table becomes

PROJECT( ProjectNumber, ProjectName, Budget, EndDate )

Cont…

Add more required attributes to the other tables

EMPLOYEEDEPARTMENT( DepartmentName, EmployeeNumber, StartDate )

EMPLOYEEPROJECT( EmployeeNumber, ProjectNumber, StartDate )

Normally, this sort of process would not be required.
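
The six relations from this worked example can be written out as table definitions using Python's sqlite3 module. The relation and attribute names follow the slides; the column types and the choice of primary keys (including the composite keys on the two linking tables) are assumptions, since the slides do not mark them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE supervisor (
    SupervisorNumber INTEGER PRIMARY KEY,
    SupervisorName   TEXT
);
CREATE TABLE department (
    DepartmentName   TEXT PRIMARY KEY,
    SupervisorNumber INTEGER REFERENCES supervisor(SupervisorNumber)  -- Run By
);
CREATE TABLE employee (
    EmployeeNumber INTEGER PRIMARY KEY,
    EmployeeName   TEXT
);
CREATE TABLE project (
    ProjectNumber INTEGER PRIMARY KEY,
    ProjectName   TEXT,
    Budget        REAL,
    EndDate       TEXT
);
CREATE TABLE employeedepartment (
    DepartmentName TEXT    NOT NULL REFERENCES department(DepartmentName),
    EmployeeNumber INTEGER NOT NULL REFERENCES employee(EmployeeNumber),
    StartDate      TEXT,
    PRIMARY KEY (DepartmentName, EmployeeNumber)
);
CREATE TABLE employeeproject (
    EmployeeNumber INTEGER NOT NULL REFERENCES employee(EmployeeNumber),
    ProjectNumber  INTEGER NOT NULL REFERENCES project(ProjectNumber),
    StartDate      TEXT,
    PRIMARY KEY (EmployeeNumber, ProjectNumber)
);
""")
tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
print(tables)
```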

Exercise

Capricorn Computing Consultants (c3)

Entity relationship model

Author: E.E. Tansley

Last modified: 30/6/09

CLIENT( ClientID, ClientName, PostalAddress (Street, Suburb, State, PostCode), Phone {(PhoneType, PhoneNbr)}, ClientType )

PROJECT( ProjectID, Description, EstimatedPrice, EstimatedStartDate, EstimatedFinishDate, ActualStartDate, ActualFinishDate, CurrentStatus, [ActualDuration] )

TASK( TaskID, Description, EstimatedHours, ActualStartDate, ActualFinishDate, CurrentStatus, [ActualDuration] )

STAFF( StaffID, StaffName, Mobile, JobTitle )

TIMESHEET( TimesheetID, WorkDate, WorkDescription, HoursWorked, Billable )

Relationship labels in the diagram: managed by, requests, involves, submits, incurs, pre-req

Next on:

Process of Normalization

Objectives

Recalling relational concepts

Understand different anomalies and functional dependency concepts

Use normalization to convert anomalous tables to well-structured relations

Why normalize

What is normalization

Identify three problems solved by normalization

Example of how to normalize

Relation

Definition:

A relation is data in the form of a named, two-dimensional table

Major requirements for a table to qualify as a relation:

A relation must have a unique name

Every attribute value must be atomic (not multi-valued, not composite)

Every row must be unique (can’t have two rows with exactly the same values for all its fields)

Attributes (columns) in tables must have unique names

The order of the columns and rows must be irrelevant

NOTE: all relations are in 1st Normal form

Why normalize

Data design aims to identify the data stored in a system

The data is almost certainly stored in a relational database

Normalization is intended to:

Eliminate redundancy

Reduce the potential for anomalies

Organize data efficiently

What is normalization

Normalization is the process of decomposing a relation into a set of smaller relations

A relation is in a specific normal form (NF) if it

Satisfies the requirements of all previous NFs

Satisfies the requirements of the current NF

The focus is on the first three NFs (1NF, 2NF & 3NF)

Data/database normalization is a series of steps followed to obtain a database design that allows for consistent storage and efficient access of data in a relational database.

Cont…

Formal definition

NORMALIZATION is the process of identifying the logical associations between data items and designing a database that represents such associations without suffering from the update anomalies, which are:

Insertion Anomalies

Deletion Anomalies

Modification Anomalies

Reading Assignment: Read and understand the three kinds of anomalies
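
As a starting point for that reading, the three anomalies can be seen in a small Python sketch. The table below is hypothetical: it deliberately mixes employee facts with department facts in one wide table, which is the kind of design normalization is meant to repair.

```python
# One wide table mixing employee facts with department facts (hypothetical data).
rows = [
    {"emp_id": 1, "emp_name": "Employee One",   "dept": "Sales", "dept_phone": "555-0100"},
    {"emp_id": 2, "emp_name": "Employee Two",   "dept": "Sales", "dept_phone": "555-0100"},
    {"emp_id": 3, "emp_name": "Employee Three", "dept": "IT",    "dept_phone": "555-0200"},
]


def phones_by_dept(table):
    """Collect every phone number recorded for each department."""
    result = {}
    for r in table:
        result.setdefault(r["dept"], set()).add(r["dept_phone"])
    return result


# Modification anomaly: updating the Sales phone in only one row leaves two answers.
rows[0]["dept_phone"] = "555-0199"
inconsistent = {d for d, p in phones_by_dept(rows).items() if len(p) > 1}
print(inconsistent)

# Deletion anomaly: removing the last IT employee also erases the IT phone number.
remaining = [r for r in rows if r["emp_id"] != 3]
print("IT" in phones_by_dept(remaining))

# Insertion anomaly: a new department with no employees cannot be recorded,
# because there is no emp_id value to identify its row.
```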

Page 145: Database Management Systems (CS644) - Weeblydbmshilcoe.weebly.com/.../2/66126761/database_management_syst… · Overview-database design Database design is the process of coming up

Functional Dependencies and Keys

153

The mechanism to avoid the update anomalies and do normalization is to analyze relationship between data values/attributes of a relation

Such analysis is done using the concept of Functional Dependency

Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute

Candidate Key: A unique identifier. One of the candidate keys will become the

primary key

Each non-key field is functionally dependent on the candidate key/primary key

Data Dependency

FDs are derived from the real-world constraints on the attributes

The essence of this idea is that if the existence of something, call it A, implies that B must exist and have a certain value, then we say:

"B is functionally dependent on A."

We also often express this idea by saying that

"A determines B," or

"B is a function of A," or

"A functionally governs B."

Data Dependency

Often, the notions of functionality and functional dependency are expressed briefly by the statement, "If A, then B."

It is important to note that the value of B must be unique for a given value of A, i.e., any given value of A must imply one and only one value of B for the relationship to qualify for the name "function."

However, this does not necessarily prevent different values of A from implying the same value of B.

Data Dependency

X→Y holds if, whenever two tuples have the same value for X, they also have the same value for Y

Notation:

A→B, which is read as "B is functionally dependent on A" or "A functionally determines B"

In general, functional dependency is a relationship among attributes.

Who will tell us this FD? How do we know?

Most of the time we need sample data
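The tuple-based definition above can be checked mechanically against sample data. The following is a minimal Python sketch, not part of the original slides; the function name `fd_holds` and the sample rows are hypothetical. Note that sample data can only show that an FD is violated; an FD that survives the check still has to be confirmed against the real-world business rules.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)   # value of the determinant
        y = tuple(row[a] for a in rhs)   # value of the dependent attributes
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample data
rows = [
    {"EmpID": 1, "EmpName": "Abebe", "ProjNo": 10},
    {"EmpID": 1, "EmpName": "Abebe", "ProjNo": 20},
    {"EmpID": 2, "EmpName": "Lemma", "ProjNo": 10},
]

print(fd_holds(rows, ["EmpID"], ["EmpName"]))   # True: each EmpID has one name
print(fd_holds(rows, ["ProjNo"], ["EmpName"]))  # False: ProjNo 10 has two names
```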

Data Dependency…

Partial Dependency

If an attribute that is not a member of the primary key depends on only some part of a composite primary key, then that attribute is partially functionally dependent on the primary key.

Let {A,B} be the primary key and C a non-key attribute.

If {A,B}→C and also B→C or A→C,

then C is partially functionally dependent on {A,B}

Data Dependency…

Full Dependency

If an attribute that is not a member of the primary key depends on the whole of a composite primary key, and not on any part of it, then that attribute is fully functionally dependent on the primary key.

Let {A,B} be the primary key and C a non-key attribute.

If {A,B}→C holds but neither B→C nor A→C holds,

then C is fully functionally dependent on {A,B}
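The difference between partial and full dependency can be illustrated by testing every proper subset of the key. This is a rough sketch under stated assumptions: the helper names `fd_holds` and `dependency_kind` and the sample rows are hypothetical, and the result only reflects the sample data supplied.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def dependency_kind(rows, key, attr):
    """Classify attr as 'full', 'partial' or 'none' with respect to a composite key."""
    if not fd_holds(rows, key, [attr]):
        return "none"
    # Try every proper, non-empty subset of the key as a determinant
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if fd_holds(rows, list(subset), [attr]):
                return "partial"
    return "full"


# Hypothetical sample data with composite key {EmpID, ProjNo}
rows = [
    {"EmpID": 1, "ProjNo": 10, "EmpName": "Abebe", "Incentive": 500},
    {"EmpID": 1, "ProjNo": 20, "EmpName": "Abebe", "Incentive": 300},
    {"EmpID": 2, "ProjNo": 10, "EmpName": "Lemma", "Incentive": 400},
]

key = ["EmpID", "ProjNo"]
print(dependency_kind(rows, key, "EmpName"))    # partial (EmpID alone determines it)
print(dependency_kind(rows, key, "Incentive"))  # full (needs the whole key)
```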

Data Dependency…

Transitive Dependency

In mathematics and logic, a transitive relationship is a relationship of the following form: "If A implies B, and if also B implies C, then A implies C."

Example: If Mr X is a Human, and if every Human is an Animal, then Mr X must be an Animal.

A generalized description of transitive dependency: if A functionally governs B, and B functionally governs C, then A functionally governs C, provided that neither C nor B determines A, i.e. (B ↛ A and C ↛ A)

In the normal notation: {(A→B) AND (B→C)} ⟹ A→C, provided that B ↛ A and C ↛ A
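As an illustration of the rule above, the following sketch scans a list of single-attribute FDs for chains A→B, B→C in which neither B nor C determines A. The function name `transitive_dependencies` is hypothetical; the attribute names are borrowed from the Student example used later in this chapter.

```python
def transitive_dependencies(fds):
    """fds: set of (determinant, dependent) pairs over single attributes.

    Returns (A, B, C) triples where A->B and B->C are listed and
    neither B->A nor C->A is listed.
    """
    found = []
    for a, b in sorted(fds):
        for b2, c in sorted(fds):
            if b2 == b and c != a and (b, a) not in fds and (c, a) not in fds:
                found.append((a, b, c))
    return found


fds = {("StudID", "Year"), ("Year", "Dormitary")}
print(transitive_dependencies(fds))  # [('StudID', 'Year', 'Dormitary')]
```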

Steps of Normalization

We have various levels or steps in normalization called Normal Forms.

The level of complexity, the strength of the rule and the degree of decomposition increase as we move from a lower normal form to a higher one.

A table in a relational database is said to be in a certain normal form if it satisfies certain constraints.

Each subsequent normal form represents a stronger condition than the previous one


Steps in normalization- Pictorial representation

First Normal Form (1NF)

Requires that all column values in a table are atomic (e.g., a number is an atomic value, while a list or a set is not).

Solution: Move each repeating group to a new row, repeating the common attributes. Then find the key with which all data can be identified.

Thus:

No multivalued attributes

Every attribute value is atomic

For example, Fig. 5-25 is not in 1st Normal Form (multivalued attributes) ➔ it is not a relation, while Fig. 5-26 is in 1st Normal Form

Remark: All relations are in 1st Normal Form

Cont…

Formal Definition: a table (relation) is in 1NF if

There are no duplicated rows in the table (each row has a unique identifier).

Each cell is single-valued (i.e., there are no repeating groups).

Entries in a column (attribute, field) are of the same kind.

Example for First Normal Form (1NF)


FIRST NORMAL FORM (1NF): Remove all repeating groups. Distribute the multi-valued attributes into different rows and identify a unique identifier for the relation, so that it qualifies as a relation in a relational database.
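The step of distributing a multi-valued attribute into separate rows can be sketched in Python as follows. The table, the `Skill` attribute and the function name `to_1nf` are hypothetical and serve only as an illustration; after flattening, the unique identifier becomes the combination (EmpID, Skill).

```python
def to_1nf(rows, multi_attr):
    """Emit one row per value of multi_attr, repeating the common attributes."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)          # copy the common attributes
            new_row[multi_attr] = value  # replace the list with a single atomic value
            flat.append(new_row)
    return flat


# Hypothetical un-normalized data: Skill holds a list, so it is not atomic
employees = [
    {"EmpID": 1, "EmpName": "Abebe", "Skill": ["SQL", "Java"]},
    {"EmpID": 2, "EmpName": "Lemma", "Skill": ["C++"]},
]

flat = to_1nf(employees, "Skill")
for r in flat:
    print(r)
```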

Example 2: Table with multivalued attributes, not in 1st normal form

Note: this is NOT a relation

Table with no multivalued attributes and unique rows, in 1st normal form

Note: this is a relation, but not a well-structured one

Second Normal Form (2NF)

No partial dependency of a non-key attribute on part of the primary key.

Removing such dependencies results in a set of relations in Second Normal Form.

Any table that is in 1NF and has a single-attribute (i.e., a non-composite) key is automatically also in 2NF.

Cont…

Formal Definition: a table (relation) is in 2NF if

It is in 1NF, and

All non-key attributes are dependent on the entire primary key, i.e. there is no partial dependency.

Example 1: for 2NF

EMP_PROJ (EmpID, EmpName, ProjNo, ProjName, ProjLoc, ProjFund, ProjMangID, Incentive)

EMP_PROJ rearranged (EmpID, ProjNo, EmpName, ProjName, ProjLoc, ProjFund, ProjMangID, Incentive)

Business rule: Whenever an employee participates in a project, he/she will be entitled to an incentive.

This schema is in 1NF since it has no repeating groups or multivalued attributes.

Cont…

To convert it to 2NF we need to remove all partial dependencies of non-key attributes on part of the primary key.

{EmpID, ProjNo}→EmpName, ProjName, ProjLoc, ProjFund, ProjMangID, Incentive

But in addition to this we have the following dependencies

FD1: {EmpID}→EmpName

FD2: {ProjNo}→ProjName, ProjLoc, ProjFund, ProjMangID

FD3: {EmpID, ProjNo}→Incentive

Cont…

As we can see, some non-key attributes are partially dependent on some part of the primary key.

This can be witnessed by analyzing the first two functional dependencies (FD1 and FD2).

Thus, each functional dependency, together with its dependent attributes, should be moved to a new relation in which the determinant is the primary key.
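The decomposition described above amounts to projecting EMP_PROJ onto the attributes of each functional dependency (FD1, FD2 and FD3). The sketch below shows this with a small helper; the helper name `project` and all row values are hypothetical, and only the attribute names come from the example.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result


# Hypothetical EMP_PROJ rows (attribute names follow the example, values are made up)
proj10 = {"ProjNo": 10, "ProjName": "Payroll", "ProjLoc": "Addis Ababa",
          "ProjFund": 50000, "ProjMangID": 7}
proj20 = {"ProjNo": 20, "ProjName": "Library", "ProjLoc": "Adama",
          "ProjFund": 30000, "ProjMangID": 9}
emp_proj = [
    {"EmpID": 1, "EmpName": "Abebe", "Incentive": 500, **proj10},
    {"EmpID": 1, "EmpName": "Abebe", "Incentive": 300, **proj20},
    {"EmpID": 2, "EmpName": "Lemma", "Incentive": 400, **proj10},
]

# One relation per functional dependency; the determinant is the primary key
employee = project(emp_proj, ["EmpID", "EmpName"])                        # FD1
project_rel = project(emp_proj, ["ProjNo", "ProjName", "ProjLoc",
                                 "ProjFund", "ProjMangID"])               # FD2
emp_proj_2nf = project(emp_proj, ["EmpID", "ProjNo", "Incentive"])        # FD3

print(len(employee), len(project_rel), len(emp_proj_2nf))  # 2 2 3
```

Each employee name and each project description is now stored once, so changing a project's location touches a single row.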


Example 2: Functional dependency diagram for INVOICE

Order_ID ➔ Order_Date, Customer_ID, Customer_Name, Customer_Address

Customer_ID ➔ Customer_Name, Customer_Address

Product_ID ➔ Product_Description, Product_Finish, Unit_Price

Order_ID, Product_ID ➔ Order_Quantity

Therefore, NOT in 2nd Normal Form

Getting it into Second Normal Form: removing partial dependencies

Partial dependencies are removed, but there are still transitive dependencies

Third Normal Form (3NF)

Eliminate columns that depend on another non-primary-key attribute: if attributes do not contribute to a description of the key, move them to a separate table. This level avoids update and delete anomalies.

Formal Definition: a table (relation) is in 3NF if

It is in 2NF, and

There are no transitive dependencies between the primary key and non-primary-key attributes.

Cont…

2NF PLUS no transitive dependencies (functional dependencies on non-primary-key attributes)

Note: This is called transitive because the primary key is a determinant for another attribute, which in turn is a determinant for a third

Solution: A non-key determinant with transitive dependencies goes into a new table; the non-key determinant becomes the primary key in the new table and stays as a foreign key in the old table

Getting it into Third Normal Form: removing transitive dependencies

Transitive dependencies are removed


2nd Example for (3NF)

Assumption: Students of same batch (same year) live in one building or dormitory

Student

StudID   Stud_F_Name  Stud_L_Name  Dept     Year  Dormitary
125/97   Abebe        Mekuria      Info Sc  1     401
654/95   Lemma        Alemu        Geog     3     403
842/95   Chane        Kebede       CompSc   3     403
165/97   Alem         Kebede       InfoSc   1     401
985/95   Almaz        Belay        Geog     3     403

This schema is in 2NF since the primary key is a single attribute.

Cont…

StudID→Year AND Year→Dormitary

And Year cannot determine StudID, and Dormitary cannot determine StudID

Then transitively StudID→Dormitary

To convert it to 3NF we need to remove all transitive dependencies of non-key attributes on another non-key attribute.

The non-primary-key attributes that depend on each other will be moved to another table and linked with the main table using a Candidate Key-Foreign Key relationship.
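A minimal sketch of this 3NF step, using the Student rows from the example: Dormitary is moved into a separate relation keyed by Year, and Year stays in the student relation as a foreign key. The relation names `student_3nf` and `dorm` are hypothetical, as the original slides do not name the resulting tables.

```python
# Rows of the Student table from the example:
# (StudID, Stud_F_Name, Stud_L_Name, Dept, Year, Dormitary)
students = [
    ("125/97", "Abebe", "Mekuria", "Info Sc", 1, 401),
    ("654/95", "Lemma", "Alemu", "Geog", 3, 403),
    ("842/95", "Chane", "Kebede", "CompSc", 3, 403),
    ("165/97", "Alem", "Kebede", "InfoSc", 1, 401),
    ("985/95", "Almaz", "Belay", "Geog", 3, 403),
]

# STUDENT keeps everything except Dormitary; Year remains as a foreign key
student_3nf = [row[:5] for row in students]

# DORM(Year, Dormitary): Year, the non-key determinant, becomes the primary key
dorm = sorted({(year, dormitary) for *_, year, dormitary in students})
print(dorm)  # [(1, 401), (3, 403)]

# Joining the two relations on Year reproduces the original table (lossless)
dorm_of = dict(dorm)
rejoined = [row + (dorm_of[row[4]],) for row in student_3nf]
print(rejoined == students)  # True
```

The Year-to-Dormitary fact is now stored once per year instead of once per student, which removes the update anomaly.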

Cont…

Generally, even though there are four additional levels of normalization, a table is said to be normalized if it reaches 3NF.

A database with all tables in 3NF is said to be a normalized database.

Reading Assignment:

Boyce-Codd Normal Form (BCNF)

Fourth Normal Form (4NF)

Fifth Normal Form (5NF)

Domain-Key Normal Form (DKNF)

Summary of the Process of Normalization

To understand normalization you need to know the problem each normal form addresses:

1NF: Repeating groups

2NF: Partial Dependency

3NF: Transitive Dependency and Derived Attributes

Normalization is the process of decomposing relations

Cont…

Some important indicators

No Repeating or Redundancy: no repeating fields.

The Fields Depend Upon the Key: the fields of the table should depend solely on the key.

The Whole Key: no partial key dependency.

And Nothing But The Key: no dependency between non-key attributes.

Cont…

Pitfalls (Limitations) of Normalization

Requires data to see the problems

May reduce performance of the system

Is time consuming,

Difficult to design and apply and

Prone to human error

Physical Database Design

Physical Objectives

Understand the purpose of physical database design

Describe the physical database design process

Choose storage formats for attributes

Describe indexes and their appropriate use


Logical vs Physical Database Design

Logical database design is independent of implementation details, such as the functionality of the target DBMS.

Logical database design is concerned with the what; physical database design is concerned with the how.

Physical design is a description of the implementation of the database on secondary storage using the target DBMS


Physical Database Design

Process of producing a description of the implementation of the database on secondary storage.

The main tasks include field specification, choosing indexes and file organization, and refining the conceptual and external schemas (if necessary) to meet performance goals.


Cont…

The primary goal of physical database design is data processing efficiency (time and storage impact)

We will concentrate on choices often available to optimize performance of database services

Many physical database design decisions are implicit in the technology adopted


Cont…

Design decision about Storage Format

Choosing the storage format of each field (attribute).

The DBMS provides a set of data types that can be used for the physical storage of fields in the database

The data type (format) is chosen to optimize storage space and maximize data integrity


Cont…

It is also called Designing Fields

Field: Smallest unit of named application data recognized by system software

Attributes from relations will be represented as fields

In addition to choosing a data type, you may also consider:

Coding (optional)

Controlling data integrity

Data Type: A coding scheme recognized by system software for representing organizational data


Cont…

Choosing Data Types

CHAR–fixed-length character

VARCHAR–variable-length character (memo)

LONG–large number

NUMBER–positive/negative number

INTEGER–positive/negative whole number

DATE–actual date

BLOB–binary large object (good for graphics, sound clips, etc.)


Cont…

Objectives of choosing data types

1. Minimize storage space

2. Represent all possible values of the field

3. Improve data integrity of the field

4. Support all data manipulations desired on the field


Example storage format (field design) for the “Branch” table


Cont…

Choose indexes

An index in a database is similar to an index in a book

Primary index: created automatically on the primary key

Clustering index: on a non-key attribute

Guidelines for Choosing Indexes

Specify a unique index for the primary key of each table.

Specify an index for foreign keys.

Specify an index for non-key fields that are referenced in qualification, sorting and grouping commands for the purpose of retrieving data.


Rules for Using Indexes

1. Use on larger tables

2. Index the primary key of each table

3. Index search fields (fields frequently in WHERE clause)

4. Index fields used in SQL ORDER BY and GROUP BY clauses

5. Index when there are more than 100 values, but not when there are fewer than 30 values


Rules for Using Indexes (cont.)

6. Avoid use of indexes for fields with long values; perhaps compress values first

For example, a Name field is costly to index

7. DBMS may have limit on number of indexes per table and number of bytes per indexed field(s)

8. Null values will not be referenced from an index

9. Use indexes heavily for non-volatile databases; limit the use of indexes for volatile databases

Why? Because modifications (e.g. inserts, deletes) require updates to occur in index files
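The effect of indexing a search field can be observed with Python's built-in `sqlite3` module. This is only a sketch: SQLite stands in for whatever target DBMS is used, and the table, column and index names are hypothetical. The exact wording of the query plan differs between SQLite versions, so the output is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (emp_name, dept_id) VALUES (?, ?)",
    [(f"emp{i}", i % 50) for i in range(1000)],
)


def plan(sql):
    """Return the textual query plan SQLite reports for a statement."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " ".join(row[-1] for row in cursor)  # last column holds the detail text


query = "SELECT emp_name FROM employee WHERE dept_id = 7"

before = plan(query)  # no index on dept_id yet: the table is scanned
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept_id)")
after = plan(query)   # the plan should now mention idx_employee_dept

print(before)
print(after)
```

Every insert, update or delete on `employee` now also has to maintain `idx_employee_dept`, which is why indexes should be limited on volatile tables.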


End of Chapter Four
