WEEK 2: THE ENTITY-RELATIONSHIP MODEL · Relationships/ Relationship sets, and Constraints. Offers...


DATABASE SYSTEMS I

WEEK 2: THE ENTITY-RELATIONSHIP MODEL

OVERVIEW OF DATABASE DEVELOPMENT

Requirements Analysis / Ideas

High-Level Database Design

Conceptual Database Design / Relational Database Schema

Physical Database Design / Relational DBMS

Similar to software development

OVERVIEW OF DATABASE DEVELOPMENT

Requirements Analysis

What data are to be stored in the enterprise?

What are the required applications?

What are the most important operations?

High-level database design

What are the entities and relationships in the enterprise?

What information about these entities and relationships should we store in the database?

What are the integrity constraints or business rules that hold?

ER model or UML to represent high-level design

OVERVIEW OF DATABASE DEVELOPMENT

Conceptual database design

What data model to implement for the DBS? E.g., relational data model.

Map the high-level design (e.g., ER diagram) to a (conceptual) database schema of the chosen data model.

Physical database design

What DBMS to use?

What are the typical workloads of the DBS?

Build indexes to support efficient query processing.

What redesign of the conceptual database schema is necessary from the point of view of efficient implementation?

ENTITY-RELATIONSHIP MODEL

Short: ER model.

A lot of similarities with other modeling languages such as UML.

Concepts:

Entities / Entity sets,

Attributes,

Relationships / Relationship sets, and

Constraints.

Offers more modeling concepts than the relational data model (which only offers relations).

Closer to the way in which people think.

ENTITY-RELATIONSHIP DIAGRAMS

An Entity-Relationship diagram (ER diagram) is a graph with nodes representing entity sets, attributes and relationship sets.

Entity sets denoted by rectangles.

Attributes denoted by ovals.

Relationship sets denoted by diamonds.

Edges (lines) connect entity sets to their attributes and relationship sets to their entity sets.

[Figure: ER diagram. Employees (ssn, name, lot) and Departments (did, dname, budget) connected by relationship set Works_In (since).]

ENTITIES AND ENTITY SETS

Entity: Real-world object distinguishable from other objects,

e.g. employee Miller.

An entity can be a physical or an abstract object.

An entity is associated with the attributes describing its properties.

Attribute values are atomic,

e.g. strings, integers or real numbers.

Each attribute value contains a single piece of information.

Ex: first name

Age or date-of-birth?

Entity set: A collection of similar entities.

E.g., all employees.

ENTITIES AND ENTITY SETS

All entities in an entity set have the same set of attributes. (At least, for the moment!)

Each entity set has a key, i.e. a minimal set of attributes to uniquely identify an entity of this set. Key attributes are underlined.

Each attribute has a domain, i.e. a set of all possible attribute values.

[Figure: entity set Employees with attributes ssn (key), name, age.]

ENTITIES AND ENTITY SETS

A key must be unique across all possible (not just the current) entities of its set.

A key can consist of more than one attribute.

There can be more than one key for a given entity set, but we choose one (primary key) for the ER diagram.

[Figure: entity set Employees with attributes firstname, lastname, birthdate, salary.]

RELATIONSHIPS AND RELATIONSHIP SETS

Relationship: Association among two or more entities.

E.g., Miller works in Pharmacy department.

Relationship set: Collection of similar relationships among two or more entity sets.

[Figure: ER diagram. Employees (ssn, name, age) and Departments (did, dname, budget) connected by relationship set Works_In.]

RELATIONSHIPS AND RELATIONSHIP SETS

An n-ary relationship set R relates n entity sets E1 ... En.

Each relationship in R involves entities e1 ∈ E1, ..., en ∈ En.

Binary relationship sets are the most common.

The same entity set can participate in different relationship sets, or in different “roles” in the same set.

[Figure: Employees (ssn, name, age) participating twice in relationship set Reports_To, in the roles supervisor and subordinate.]

RELATIONSHIPS AND RELATIONSHIP SETS

Entity: object that is distinguishable from other objects

Ex: your home address, CMPT 354

Entity Set: All home addresses

Collection of CMPT courses

Each entity set has 1-to-many entities

Each entity can belong to multiple entity sets

Relationship: Joe lives at 45 Main St.

Mary lives at 89 Wood Ave.

Relationship Set: Person lives at home address

RELATIONSHIPS AND RELATIONSHIP SETS

Relationship sets can also have attributes.

Useful for properties that cannot reasonably be associated with one of the participating entity sets.

[Figure: ER diagram. Employees (ssn, name, age) and Departments (did, dname, budget) connected by relationship set Works_In with attribute since.]

INSTANCES OF AN ER DIAGRAM

Entity set contains a set of entities. Each entity has one value for each of its attributes.

No duplicate instances.

Employees

ssn        name            age
12345678   “John Miller”   30
14789632   “Paul Li”       25
. . .      . . .           . . .

INSTANCES OF AN ER DIAGRAM

Relationship set contains a set (no duplicates!) of relationships, each relating a set of entities, one from each of the participating entity sets.

Components are entities, not attribute values.

Works_In

Employee (ssn)   Department (did)
12345678         1
14789632         1
56756322         2
. . .            . . .

RELATIONSHIPS AND RELATIONSHIP SETS

Multiway relationship sets (n > 2) are used whenever binary relationships cannot capture the application semantics.

Infrequent.

[Figure: ternary relationship set Works_For connecting Employees (ssn, name, age), Projects (pid, pbudget) and Tasks (tid, description).]

RELATIONSHIPS AND RELATIONSHIP SETS

[Figure: ternary relationship set Works_For connecting Employees (ssn, name, age), Projects (pid, pbudget) and Tasks (tid, description).]

Works_For

Employee (ssn)   Tasks (tid)   Project (pid)
12345678         1000          101
12345678         1500          106
56756322         1500          106
. . .            . . .         . . .

MULTIPLICITY OF RELATIONSHIPS

An employee can work in many departments; a dept can have many employees.

[Figure: Employees (ssn, name, age) and Departments (did, dname, budget) connected by relationship set Works_In (since).]

Each dept has at most one manager, who may manage several (many) departments.

[Figure: Employees (ssn, name, age) and Departments (did, dname, budget) connected by relationship set Manages (since).]

MULTIPLICITY OF RELATIONSHIPS

The different types of (binary) relationships from a multiplicity point of view:

One to one

One to many

Many to one

Many to many

[Figure: example instances of one-to-one, one-to-many, many-to-one and many-to-many relationship sets.]
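
The four multiplicity types can be illustrated on a concrete relationship instance. The following is only a sketch: the function name classify_multiplicity and the sample pairs are assumptions made for illustration, and a single instance can only be consistent with a multiplicity, since the actual constraint must hold for all possible instances.

```python
from collections import Counter

def classify_multiplicity(pairs):
    """Classify a binary relationship instance given as (left, right) pairs."""
    pairs = set(pairs)  # a relationship set contains no duplicates
    left_counts = Counter(a for a, _ in pairs)   # relationships per left entity
    right_counts = Counter(b for _, b in pairs)  # relationships per right entity
    left_once = all(n == 1 for n in left_counts.values())
    right_once = all(n == 1 for n in right_counts.values())
    if left_once and right_once:
        return "one-to-one"
    if right_once:
        return "one-to-many"   # one left entity may relate to many right entities
    if left_once:
        return "many-to-one"   # many left entities may share one right entity
    return "many-to-many"

# Hypothetical instances using (ssn, did) pairs.
works_in = [("12345678", 1), ("12345678", 2), ("14789632", 1)]
manages = [("12345678", 1), ("12345678", 2)]

print(classify_multiplicity(works_in))
print(classify_multiplicity(manages))
```

Here the Manages instance reads as one-to-many from Employees to Departments: each department appears at most once, while one employee may appear several times.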

KEY CONSTRAINTS

A key constraint on a relationship set specifies that the marked entity set participates in at most one relationship of this relationship set.

Entity set is marked with an arrow.

[Figure: Employees (ssn, name, age) and Departments (did, dname, budget) connected by relationship set Manages (since); the arrow marks the key constraint.]

PARTICIPATION CONSTRAINTS

A participation constraint on a relationship set specifies that the marked entity set participates in at least one relationship of this relationship set.

Entity set is marked with a bold line.

[Figure: Employees (ssn, name, age) and Departments (did, dname, budget) connected by relationship sets Manages (since) and Works_In (since); bold lines mark the participation constraints.]

WEAK ENTITIES

A weak entity exists only in the context of another (owner) entity.

The weak entity can be identified uniquely only by considering the primary key of the owner and its own partial key.

Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).

Weak entity set must have total participation in this supporting relationship set.

Ex: If there is no employee, there cannot be a dependent.

[Figure: owner entity set Employees (ssn, name, age) connected via relationship set Policy (cost) to weak entity set Dependents (name, age).]

SUBCLASSES

Sometimes, an entity set contains some entities that share many, but not all, properties with the rest of the entity set. This leads to hierarchies of entity sets.

A ISA B: every A entity is also considered to be a B entity. A specializes B, B generalizes A.

A is called subclass, B is called superclass.

A subclass inherits the attributes of a superclass and may define additional attributes.

[Figure: Hourly_Emps and Contract_Emps connected to Employees through ISA.]

SUBCLASSES

[Figure: Employees (ssn, name, age) with subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid), connected through ISA.]

Hourly_Emps and Contract_Emps inherit the ssn (key!), name and age attributes from Employees.

They define additional attributes hourly_wages, hours_worked and contractid, resp.

SUBCLASSES

Covering constraints: Does every Employees entity have to be either an Hourly_Emps or a Contract_Emps entity?

NO. Unless Hourly_Emps AND Contract_Emps COVER Employees

Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity?

YES. Hourly_Emps OVERLAPS Contract_Emps

SUBCLASSES

There are several good reasons for using ISA relationships and subclasses:

Do not have to redefine all the attributes.

Can add descriptive attributes specific to a subclass.

To identify entity sets that participate in a relationship set as precisely as possible.

ISA relationships form a tree structure (taxonomy) with one entity set serving as root.

DESIGN PRINCIPLES

Faithfulness

Design must be faithful to the specification / reality.

Relevant aspects of reality must be represented in the model.

Avoiding redundancy

Redundant representation blows up ER diagram and makes it harder to understand.

Redundant representation wastes storage.

Redundancy may lead to inconsistencies in the database.

DESIGN PRINCIPLES

Keep it simple

The simpler, the easier to understand for some (external) reader of the ER diagrams.

Avoid introducing more elements than necessary.

If possible, prefer attributes over entity sets and relationship sets.

Formulate constraints as far as possible

A lot of data semantics can (and should) be captured.

But some constraints cannot be captured in ER diagrams.

HIGH-LEVEL DESIGN WITH ER MODEL

Major design choices

Should a concept be modeled as an entity or an attribute? As an entity or a relationship?

What relationships to use: binary or ternary?

Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?

Depends upon the use we want to make of address information, and the semantics of the data:

If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).

ENTITY VS. ATTRIBUTE

Works_In2 does not allow an employee to work in the same department for two or more periods (why?).

We want to record several values of the descriptive attributes for each instance of this relationship.

ENTITY VS. RELATIONSHIP

This ER diagram is o.k. if a manager gets a separate discretionary budget for each dept.

But what if a manager gets a discretionary budget that covers all managed depts?

Redundancy of dbudget, which is stored for each dept managed by the manager.

Misleading: suggests dbudget tied to managed dept.

[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by relationship set Manages2 (since, dbudget).]

ENTITY VS. RELATIONSHIP

What about this diagram?

Employees who are not managers will have dbudget=null?

The following ER diagram is more appropriate and avoids the above problems!

Each manager now has a budget.

BINARY VS. TERNARY RELATIONSHIPS

[Figure: ternary relationship set Covers connecting Employees (ssn, name, lot), Policies (policyid, cost) and Dependents (pname, age).]

ER diagram says:

Employee can own several policies

Each policy can be owned by several employees

Each dependent can be covered by several policies

If each policy is owned by just one employee:

Key constraint on Policies would mean policy can only cover 1 dependent! (only 1 combination of Employees and Policies can be in Covers)

Bad design!

BINARY VS. TERNARY RELATIONSHIPS

This diagram is a better design.

Policy can only exist for employees. Dependents only exist if they are covered by a policy.

[Figure: Employees (ssn, name, lot) connected to Policies (policyid, cost) by relationship set Purchaser; Policies connected to Dependents (pname, age) by relationship set Beneficiary.]

BINARY VS. TERNARY RELATIONSHIPS

Previous example illustrated a case when two binary relationships were better than one ternary relationship.

An example in the other direction:

A ternary relation Contracts relates entity sets Parts, Departments and Suppliers, and has descriptive attribute qty. No combination of binary relationships is an adequate substitute:

S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S.

How do we record qty?
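
The Contracts argument can be checked on a small instance. The sketch below uses made-up supplier, part and department identifiers (S1, P1, D1, ...): it splits a ternary Contracts instance into the three binary relationships and then recombines them, which produces a contract that was never agreed.

```python
# Hypothetical Contracts instance: (supplier, part, department) triples.
contracts = {
    ("S1", "P1", "D2"),
    ("S1", "P2", "D1"),
    ("S2", "P1", "D1"),
}

# The three binary relationships derived from the ternary one.
can_supply = {(s, p) for s, p, d in contracts}   # S "can-supply" P
needs = {(d, p) for s, p, d in contracts}        # D "needs" P
deals_with = {(d, s) for s, p, d in contracts}   # D "deals-with" S

# Every (S, P, D) combination that the binary relationships permit.
reconstructed = {
    (s, p, d)
    for (s, p) in can_supply
    for (d, p2) in needs
    if p2 == p and (d, s) in deals_with
}

# Triples the binary relationships suggest, but that were never contracted.
spurious = reconstructed - contracts
print(sorted(spurious))
```

S1 can supply P1, D1 needs P1, and D1 deals with S1, yet D1 never agreed to buy P1 from S1. The binary version also offers no place to store qty for a specific (S, P, D) combination.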

CONCEPTUAL DESIGN: ER TO RELATIONAL

How to represent

Entity sets,

Relationship sets,

Attributes,

Key and participation constraints,

Subclasses,

Weak entity sets . . . ?

ENTITY SETS

Entity sets are translated to tables.

CREATE TABLE Employees (
    ssn CHAR(11),
    name CHAR(20),
    lot INTEGER,
    PRIMARY KEY (ssn));

[Figure: entity set Employees with attributes ssn (key), name, lot.]

RELATIONSHIP SETS

Relationship sets are also translated to tables. The table contains:

Keys for each participating entity set (as foreign keys). The combination of these keys forms a superkey for the table.

All descriptive attributes of the relationship set.

CREATE TABLE Works_In (
    ssn CHAR(11),
    did INTEGER,
    since DATE,
    PRIMARY KEY (ssn, did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
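
To see what the Works_In table enforces, the schema can be loaded into an in-memory SQLite database through Python's standard sqlite3 module. This is a sketch under assumptions: SQLite stands in for whatever DBMS the course uses, the Departments table layout (did, dname, budget) is taken from the ER diagram, and the sample rows and the try_insert helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
CREATE TABLE Departments (did INTEGER, dname CHAR(20), budget REAL, PRIMARY KEY (did));
CREATE TABLE Works_In (
    ssn CHAR(11), did INTEGER, since DATE,
    PRIMARY KEY (ssn, did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
INSERT INTO Employees VALUES ('12345678', 'John Miller', 48);
INSERT INTO Departments VALUES (1, 'Pharmacy', 100000.0);
INSERT INTO Works_In VALUES ('12345678', 1, '2020-01-01');
""")

def try_insert(sql):
    """Run one INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

# Same (ssn, did) pair again: violates the primary key (no duplicate relationships).
print(try_insert("INSERT INTO Works_In VALUES ('12345678', 1, '2021-06-01')"))
# Unknown employee: violates the foreign key (components must be existing entities).
print(try_insert("INSERT INTO Works_In VALUES ('99999999', 1, '2021-06-01')"))
```

The primary key on (ssn, did) mirrors the rule that a relationship set holds no duplicates, and it is also why this table cannot record two separate periods for the same employee and department.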

KEY CONSTRAINTS

Each dept has at most one manager, according to the key constraint on Manages.

Translation to relational model?

[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by relationship set Manages (since), with a key constraint; below, example instances of one-to-one, one-to-many, many-to-one and many-to-many relationship sets.]

KEY CONSTRAINTS

Map relationship set to a table:

Separate tables for Employees and Departments.

Note that did is the key now!

CREATE TABLE Manages (
    ssn CHAR(11),
    did INTEGER,
    since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments)

Since each department has a unique manager, we could instead combine Manages and Departments.

CREATE TABLE Dept_Mgr (
    did INTEGER,
    dname CHAR(20),
    budget REAL,
    manager CHAR(11),
    since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees)
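
The effect of making did the key of Manages can be tried out in SQLite via Python's sqlite3 module. This is an illustrative sketch only: SQLite is an assumed stand-in DBMS, and the sample employees, departments, dates and the helper second_manager_rejected are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key checks in SQLite
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
CREATE TABLE Departments (did INTEGER, dname CHAR(20), budget REAL, PRIMARY KEY (did));
CREATE TABLE Manages (
    ssn CHAR(11), did INTEGER, since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
INSERT INTO Employees VALUES ('12345678', 'John Miller', 48);
INSERT INTO Employees VALUES ('14789632', 'Paul Li', 22);
INSERT INTO Departments VALUES (1, 'Pharmacy', 100000.0);
INSERT INTO Departments VALUES (2, 'Research', 50000.0);
-- One employee may manage several departments.
INSERT INTO Manages VALUES ('12345678', 1, '2020-01-01');
INSERT INTO Manages VALUES ('12345678', 2, '2021-01-01');
""")

def second_manager_rejected():
    """Try to give department 1 a second manager; the key on did should forbid it."""
    try:
        conn.execute("INSERT INTO Manages VALUES ('14789632', 1, '2022-01-01')")
    except sqlite3.IntegrityError:
        return True
    return False

print(second_manager_rejected())
```

Because did alone is the primary key, each department appears at most once in Manages, which is exactly the "at most one manager" key constraint.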

PARTICIPATION CONSTRAINTS

We can capture participation constraints involving one entity set in a binary relationship, using NOT NULL.

In other cases, we need CHECK constraints.

CREATE TABLE Dept_Mgr (
    did INTEGER,
    dname CHAR(20),
    budget REAL,
    manager CHAR(11) NOT NULL,
    since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees
        ON DELETE NO ACTION)
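
A short experiment shows how NOT NULL together with ON DELETE NO ACTION enforces total participation of Departments in Manages. The sketch assumes SQLite (through Python's sqlite3 module) as the DBMS; the sample rows and the rejected helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key checks in SQLite
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
CREATE TABLE Dept_Mgr (
    did INTEGER, dname CHAR(20), budget REAL,
    manager CHAR(11) NOT NULL, since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees ON DELETE NO ACTION);
INSERT INTO Employees VALUES ('12345678', 'John Miller', 48);
INSERT INTO Dept_Mgr VALUES (1, 'Pharmacy', 100000.0, '12345678', '2020-01-01');
""")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# A department without a manager violates NOT NULL.
print(rejected("INSERT INTO Dept_Mgr VALUES (2, 'Research', 50000.0, NULL, NULL)"))
# Deleting an employee who still manages a department violates the foreign key.
print(rejected("DELETE FROM Employees WHERE ssn = '12345678'"))
```

Both statements are refused, so every department row always refers to an existing manager.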

WEAK ENTITY SETS

A weak entity set can be identified uniquely only by considering the primary key of another (owner) entity set.

Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).

Weak entity set must have total participation in this identifying relationship set.

[Figure: owner entity set Employees (ssn, name, lot) connected via identifying relationship set Policy (cost) to weak entity set Dependents (pname, age).]

WEAK ENTITY SETS

Weak entity set and identifying relationship set are translated into a single table.

When the owner entity is deleted, all owned weak entities must also be deleted.

CREATE TABLE Dep_Policy (
    pname CHAR(20),
    age INTEGER,
    cost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE)
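
The cascading delete for weak entities can be observed directly. The following sketch assumes SQLite via Python's sqlite3 module; the dependents, ages and costs are invented sample data. Two dependents share the partial key pname and are told apart only by their owner's ssn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # cascades only fire with foreign keys enabled
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
CREATE TABLE Dep_Policy (
    pname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE);
INSERT INTO Employees VALUES ('12345678', 'John Miller', 48);
INSERT INTO Employees VALUES ('14789632', 'Paul Li', 22);
-- Same partial key 'Anna', different owners.
INSERT INTO Dep_Policy VALUES ('Anna', 6, 120.0, '12345678');
INSERT INTO Dep_Policy VALUES ('Anna', 9, 150.0, '14789632');
""")

# Deleting the owner removes the weak entities it owns.
conn.execute("DELETE FROM Employees WHERE ssn = '12345678'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
print(remaining)
```

Only the dependent owned by the remaining employee survives the delete.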

SUBCLASSES

If we declare A ISA B, every A entity is also considered to be a B entity.

Attributes of B are inherited by A.

Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)

Covering constraints: Does every Employees entity have to be either an Hourly_Emps or a Contract_Emps entity? (Yes/no)

[Figure: Employees (ssn, name, lot) with subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid), connected through ISA.]

SUBCLASSES

ER style translation

One table for each of the entity sets (superclass and subclasses).

ISA relationship does not require additional table.

All tables have the same key, i.e. the key of the superclass.

E.g.: One table each for Employees, Hourly_Emps and Contract_Emps.

General employee attributes are recorded in Employees. For hourly emps and contract emps, extra info recorded in the respective relations.

SUBCLASSES

Queries involving all employees easy, those involving just Hourly_Emps require a join to get their special attributes.

CREATE TABLE Hourly_Emps (
    ssn CHAR(11),
    hourly_wages REAL,
    hours_worked INTEGER,
    PRIMARY KEY (ssn),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE)

CREATE TABLE Employees (
    ssn CHAR(11),
    name CHAR(20),
    lot INTEGER,
    PRIMARY KEY (ssn))
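
The trade-off of the ER style translation can be seen with two queries. This sketch assumes SQLite through Python's sqlite3 module and uses invented sample rows: listing all employees touches one table, while the full record of an hourly employee needs a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key checks in SQLite
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
CREATE TABLE Hourly_Emps (
    ssn CHAR(11), hourly_wages REAL, hours_worked INTEGER,
    PRIMARY KEY (ssn),
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE);
INSERT INTO Employees VALUES ('12345678', 'John Miller', 48);
INSERT INTO Employees VALUES ('14789632', 'Paul Li', 22);
INSERT INTO Hourly_Emps VALUES ('14789632', 20.0, 35);
""")

# All employees: a single-table query.
all_names = [row[0] for row in conn.execute("SELECT name FROM Employees ORDER BY ssn")]

# Hourly employees with inherited and special attributes: requires a join on the shared key.
hourly = conn.execute("""
    SELECT e.name, h.hourly_wages, h.hours_worked
    FROM Employees e JOIN Hourly_Emps h ON e.ssn = h.ssn
""").fetchall()

print(all_names)
print(hourly)
```

The shared key ssn is what lets the subclass table find the inherited attributes in Employees.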

SUBCLASSES

Alternative translation

Create tables for the subclasses only. These tables have all attributes of the superclass(es) and the subclass.

This approach is applicable only if the subclasses cover the superclass.

E.g.:

Hourly_Emps: ssn, name, lot, hourly_wages, hours_worked.

Contract_Emps: ssn, name, lot, contractid.

Queries involving all employees difficult, those on Hourly_Emps and Contract_Emps alone are easy.

Only applicable if Hourly_Emps AND Contract_Emps COVER Employees
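
With the alternative translation, a query over all employees has to combine the subclass tables. The sketch below assumes SQLite via Python's sqlite3 module, an INTEGER type for contractid (the slides do not give one), and invented sample rows, including one employee who is both hourly and on contract.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hourly_Emps (
    ssn CHAR(11), name CHAR(20), lot INTEGER,
    hourly_wages REAL, hours_worked INTEGER,
    PRIMARY KEY (ssn));
CREATE TABLE Contract_Emps (
    ssn CHAR(11), name CHAR(20), lot INTEGER,
    contractid INTEGER,  -- column type is an assumption
    PRIMARY KEY (ssn));
INSERT INTO Hourly_Emps VALUES ('14789632', 'Paul Li', 22, 20.0, 35);
INSERT INTO Contract_Emps VALUES ('12345678', 'John Miller', 48, 7);
INSERT INTO Contract_Emps VALUES ('14789632', 'Paul Li', 22, 8);
""")

# All employees: union of the common attributes; UNION drops the duplicate
# row for an employee who appears in both subclass tables.
all_emps = conn.execute("""
    SELECT ssn, name, lot FROM Hourly_Emps
    UNION
    SELECT ssn, name, lot FROM Contract_Emps
    ORDER BY ssn
""").fetchall()
print(all_emps)
```

An employee who is neither hourly nor on contract has no table to live in, which is why this translation requires the covering constraint. With overlapping subclasses, name and lot are also stored twice, so the copies can drift apart.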

BINARY VS. TERNARY RELATIONSHIPS

The key constraints allow us to combine Purchaser with Policies and Beneficiary with Dependents.

Participation constraints lead to NOT NULL constraints.

CREATE TABLE Policies (
    policyid INTEGER,
    cost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (policyid),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE)

CREATE TABLE Dependents (
    pname CHAR(20),
    age INTEGER,
    policyid INTEGER NOT NULL,
    PRIMARY KEY (pname, policyid),
    FOREIGN KEY (policyid) REFERENCES Policies
        ON DELETE CASCADE)

SUMMARY

High-level design follows requirements analysis and yields a high-level description of data to be stored.

ER model popular for high-level design.

Constructs are expressive, close to the way people think about their applications.

Basic constructs: entities, relationships, and attributes (of entities and relationships).

Some additional constructs: weak entities, subclasses, and constraints.

ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise.

SUMMARY

There are guidelines to translate ER diagrams to a relational database schema.

However, there are often alternatives that need to be carefully considered.

Entity sets and relationship sets are all represented by relations.

Some constructs of the ER model cannot be easily translated, e.g. multiple participation constraints.