DATABASE SYSTEMS I
WEEK 2: THE ENTITY-RELATIONSHIP MODEL
OVERVIEW OF DATABASE DEVELOPMENT
Requirements Analysis / Ideas
High-Level Database Design
Conceptual Database Design / Relational Database Schema
Physical Database Design / Relational DBMS
Similar to software development
OVERVIEW OF DATABASE DEVELOPMENT
Requirements analysis
What data are to be stored in the enterprise?
What are the required applications?
What are the most important operations?
High-level database design
What are the entities and relationships in the enterprise?
What information about these entities and relationships should we store in the database?
What are the integrity constraints or business rules that hold?
ER model or UML to represent high-level design
OVERVIEW OF DATABASE DEVELOPMENT
Conceptual database design
What data model to implement for the DBS? E.g., relational data model.
Map the high-level design (e.g., ER diagram) to a (conceptual) database schema of the chosen data model.
Physical database design
What DBMS to use?
What are the typical workloads of the DBS?
Build indexes to support efficient query processing.
What redesign of the conceptual database schema is necessary from the point of view of efficient implementation?
ENTITY-RELATIONSHIP MODEL
Short: ER model.
A lot of similarities with other modeling languages such as UML.
Concepts:
Entities / Entity sets,
Attributes,
Relationships / Relationship sets, and
Constraints.
Offers more modeling concepts than the relational data model (which only offers relations).
Closer to the way in which people think.
ENTITY-RELATIONSHIP DIAGRAMS
An Entity-Relationship diagram (ER diagram) is a graph with nodes representing entity sets, attributes and relationship sets.
Entity sets denoted by rectangles.
Attributes denoted by ovals.
Relationship sets denoted by diamonds.
Edges (lines) connect entity sets to their attributes and relationship sets to their entity sets.
[ER diagram: Employees (ssn, name, lot) -- Works_In (since) -- Departments (did, dname, budget)]
ENTITIES AND ENTITY SETS
Entity: Real-world object distinguishable from other objects
e.g. employee Miller.
Entity can be physical or abstract object.
An entity is associated with the attributes describing its properties.
Attribute values are atomic
e.g. strings, integer or real numbers.
Contain a single piece of information
Ex: first name
Age or date-of-birth?
Entity set: A collection of similar entities.
E.g., all employees.
ENTITIES AND ENTITY SETS
All entities in an entity set have the same set of attributes. (At least, for the moment!)
Each entity set has a key, i.e. a minimal set of attributes to uniquely identify an entity of this set. Key attributes are underlined.
Each attribute has a domain, i.e. a set of all possible attribute values.
[ER diagram: Employees (ssn, name, age)]
ENTITIES AND ENTITY SETS
A key must be unique across all possible (not just the current) entities of its set.
A key can consist of more than one attribute.
There can be more than one key for a given entity set, but we choose one (primary key) for the ER diagram.
[ER diagram: Employees (firstname, lastname, birthdate, salary)]
RELATIONSHIPS AND RELATIONSHIP SETS
Relationship: Association among two or more entities.
E.g., Miller works in Pharmacy department.
Relationship set: Collection of similar relationships among two or more entity sets.
[ER diagram: Employees (ssn, name, age) -- Works_In -- Departments (did, dname, budget)]
RELATIONSHIPS AND RELATIONSHIP SETS
An n-ary relationship set R relates n entity sets E1 ... En.
Each relationship in R involves entities e1 ∈ E1, ..., en ∈ En.
Binary relationship sets are the most common.
The same entity set can participate in different relationship sets, or in different “roles” in the same set.
[ER diagram: Employees (ssn, name, age) participates twice in Reports_To, in the roles supervisor and subordinate]
RELATIONSHIPS AND RELATIONSHIP SETS
Entity: an object that is distinguishable from other objects.
Ex: your home address, CMPT 354
Entity set: e.g. all home addresses, or the collection of CMPT courses.
Each entity set contains one or more entities.
Each entity can belong to multiple entity sets.
Relationship: Joe lives at 45 Main St.; Mary lives at 89 Wood Ave.
Relationship set: Person lives at home address.
RELATIONSHIPS AND RELATIONSHIP SETS
Relationship sets can also have attributes.
Useful for properties that cannot reasonably be associated with one of the participating entity sets.
[ER diagram: Employees (ssn, name, age) -- Works_In (since) -- Departments (did, dname, budget)]
INSTANCES OF AN ER DIAGRAM
Entity set contains a set of entities. Each entity has one value for each of its attributes.
No duplicate instances.
Employees
ssn       name           age
12345678  “John Miller”  30
14789632  “Paul Li”      25
...       ...            ...
INSTANCES OF AN ER DIAGRAM
Relationship set contains a set (no duplicates!) of relationships, each relating a set of entities, one from each of the participating entity sets.
Components are entities, not attribute values.
Works_In
Employee (ssn)  Department (did)
12345678        1
14789632        1
56756322        2
...             ...
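A minimal Python sketch of the Works_In instance, assuming each entity is represented by its key value. Modelling the relationship set as a set of (ssn, did) tuples shows why a duplicate relationship cannot occur; the repeated `add` call is a hypothetical illustration, not part of the slide.

```python
# Entity sets, each entity represented by its key (values from the instance above).
employees = {"12345678", "14789632", "56756322"}
departments = {1, 2}

# A relationship set is a set of tuples, one component per participating entity set.
works_in = {("12345678", 1), ("14789632", 1), ("56756322", 2)}

# Adding the same relationship again has no effect: a set holds no duplicates.
works_in.add(("12345678", 1))

# Every component must be an existing entity of the corresponding entity set.
all_valid = all(ssn in employees and did in departments for ssn, did in works_in)
print(len(works_in), all_valid)
```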
RELATIONSHIPS AND RELATIONSHIP SETS
Multiway relationship sets (n > 2) are used whenever binary relationships cannot capture the application semantics.
Infrequent.
[ER diagram: ternary relationship set Works_For connecting Employees (ssn, name, age), Projects (pid, pbudget) and Tasks (tid, description)]
RELATIONSHIPS AND RELATIONSHIP SETS
[ER diagram: ternary relationship set Works_For connecting Employees (ssn, name, age), Projects (pid, pbudget) and Tasks (tid, description)]
Works_For
Employee (ssn)  Tasks (tid)  Project (pid)
12345678        1000         101
12345678        1500         106
56756322        1500         106
...             ...          ...
MULTIPLICITY OF RELATIONSHIPS
An employee can work in many departments; a dept can have many employees.
[ER diagram: Employees (ssn, name, age) -- Works_In (since) -- Departments (did, dname, budget)]
Each dept has at most one manager, who may manage several (many) departments.
[ER diagram: Employees (ssn, name, age) -- Manages (since) -- Departments (did, dname, budget)]
MULTIPLICITY OF RELATIONSHIPS
The different types of (binary) relationships from a multiplicity point of view:
One to one
One to many
Many to one
Many to many
[Figure: example mappings labeled one-to-one, one-to-many, many-to-one and many-to-many]
KEY CONSTRAINTS
A key constraint on a relationship set specifies that the marked entity set participates in at most one relationship of this relationship set.
Entity set is marked with an arrow.
[ER diagram: Employees (ssn, name, age) -- Manages (since) -- Departments (did, dname, budget), with an arrow marking the key constraint]
PARTICIPATION CONSTRAINTS
A participation constraint on a relationship set specifies that the marked entity set participates in at least one relationship of this relationship set.
Entity set is marked with a bold line.
[ER diagram: Employees (ssn, name, age) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); a bold line marks the participation constraint]
WEAK ENTITIES
A weak entity exists only in the context of another (owner) entity.
The weak entity can be identified uniquely only by considering the primary key of the owner and its own partial key.
Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).
Weak entity set must have total participation in this supporting relationship set.
Ex: If there is no employee, there cannot be a dependent.
[ER diagram: Employees (ssn, name, age) -- Policy (cost) -- Dependents (name, age)]
SUBCLASSES
Sometimes, an entity set contains some entities that share many, but not all, properties with the other entities of the set. This calls for hierarchies.
A ISA B: every A entity is also considered to be a B entity. A specializes B, B generalizes A.
A is called subclass, B is called superclass.
A subclass inherits the attributes of a superclass, may define additional attributes.
[ER diagram: Hourly_Emps and Contract_Emps connected to Employees via ISA]
SUBCLASSES
[ER diagram: Employees (ssn, name, age) with subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid), connected via ISA]
Hourly_Emps and Contract_Emps inherit the ssn (key!), name and age attributes from Employees.
They define additional attributes hourly_wages, hours_worked and contractid, resp.
SUBCLASSES
Covering constraints: Does every Employees entity have to be either an Hourly_Emps or a Contract_Emps entity?
NO. Unless Hourly_Emps AND Contract_Emps COVER Employees
Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity?
YES. Hourly_Emps OVERLAPS Contract_Emps
SUBCLASSES
There are several good reasons for using ISA relationships and subclasses:
Do not have to redefine all the attributes.
Can add descriptive attributes specific to a subclass.
To identify entity sets that participate in a relationship set as precisely as possible.
ISA relationships form a tree structure (taxonomy) with one entity set serving as root.
DESIGN PRINCIPLES
Faithfulness
Design must be faithful to the specification / reality.
Relevant aspects of reality must be represented in the model.
Avoiding redundancy
Redundant representation blows up ER diagram and makes it harder to understand.
Redundant representation wastes storage.
Redundancy may lead to inconsistencies in the database.
DESIGN PRINCIPLES
Keep it simple
The simpler, the easier to understand for some (external) reader of the ER diagrams.
Avoid introducing more elements than necessary.
If possible, prefer attributes over entity sets and relationship sets.
Formulate constraints as far as possible
A lot of data semantics can (and should) be captured.
But some constraints cannot be captured in ER diagrams.
HIGH-LEVEL DESIGN WITH ER MODEL
Major design choices
Should a concept be modeled as an entity or an attribute? a relationship?
What relationships to use: binary or ternary?
Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?
Depends upon the use we want to make of address information, and the semantics of the data:
If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).
ENTITY VS. ATTRIBUTE
Works_In2 does not allow an employee to work in the same department for two or more periods (why?).
We want to record several values of the descriptive attributes for each instance of this relationship.
ENTITY VS. RELATIONSHIP
[ER diagram: Employees (ssn, name, lot) -- Manages2 (since, dbudget) -- Departments (did, dname, budget)]
This ER diagram is o.k. if a manager gets a separate discretionary budget for each dept.
But what if a manager gets a discretionary budget that covers all managed depts?
Redundancy of dbudget, which is stored for each dept managed by the manager.
Misleading: suggests dbudget tied to managed dept.
ENTITY VS. RELATIONSHIP
What about this diagram?
Employees who are not managers will have dbudget=null?
The following ER diagram is more appropriate and avoids the above problems!
Each manager now has a budget.
BINARY VS. TERNARY RELATIONSHIPS
If each policy is owned by just one employee:
Key constraint on Policies would mean policy can only cover 1 dependent! (Only 1 combination of Employees and Policies can be in Covers.)
Bad design!
[ER diagram: ternary relationship set Covers connecting Employees (ssn, name, lot), Policies (policyid, cost) and Dependents (pname, age)]
ER diagram says:
Employee can own several policies.
Each policy can be owned by several employees.
Each dependent can be covered by several policies.
BINARY VS. TERNARY RELATIONSHIPS
This diagram is a better design.
Policy can only exist for employees. Dependents only exist if they are covered by a policy.
[ER diagram: Employees (ssn, name, lot) -- Purchaser -- Policies (policyid, cost) -- Beneficiary -- Dependents (pname, age)]
BINARY VS. TERNARY RELATIONSHIPS
Previous example illustrated a case when two binary relationships were better than one ternary relationship.
An example in the other direction:
A ternary relation Contracts relates entity sets Parts, Departments and Suppliers, and has descriptive attribute qty. No combination of binary relationships is an adequate substitute:
S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S.
How do we record qty?
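One way to see where qty lives is to sketch Contracts as a single table keyed by all three participants. The snippet below is a hedged illustration using Python's sqlite3 module; the column names pid and sid, the sample rows, and the reduction of each entity set to its key are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Parts       (pid INTEGER PRIMARY KEY);
CREATE TABLE Departments (did INTEGER PRIMARY KEY);
CREATE TABLE Suppliers   (sid INTEGER PRIMARY KEY);
-- One row per contract: department did buys part pid from supplier sid.
CREATE TABLE Contracts (
    pid INTEGER REFERENCES Parts,
    did INTEGER REFERENCES Departments,
    sid INTEGER REFERENCES Suppliers,
    qty INTEGER,
    PRIMARY KEY (pid, did, sid));
INSERT INTO Parts VALUES (1);
INSERT INTO Departments VALUES (10);
INSERT INTO Suppliers VALUES (100), (200);
INSERT INTO Contracts VALUES (1, 10, 100, 50);
""")

# qty belongs to the (part, department, supplier) triple, not to any pair.
rows = conn.execute(
    "SELECT sid, qty FROM Contracts WHERE pid = 1 AND did = 10"
).fetchall()
print(rows)
```

Supplier 200 exists and could supply part 1, but no contract row links it to department 10, which is exactly the fact that three separate binary tables cannot express.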
CONCEPTUAL DESIGN: ER TO RELATIONAL
How to represent Entity sets, Relationship sets, Attributes, Key and participation constraints, Subclasses, Weak entity sets. . . ?
ENTITY SETS
Entity sets are translated to tables.
[ER diagram: Employees (ssn, name, lot)]
CREATE TABLE Employees (
    ssn  CHAR(11),
    name CHAR(20),
    lot  INTEGER,
    PRIMARY KEY (ssn));
RELATIONSHIP SETS
Relationship sets are also translated to tables. The table contains:
Keys for each participating entity set (as foreign keys). The combination of these keys forms a superkey for the table.
All descriptive attributes of the relationship set.
CREATE TABLE Works_In(
ssn CHAR(11),
did INTEGER,
since DATE,
PRIMARY KEY (ssn, did),
FOREIGN KEY (ssn)
REFERENCES Employees,
FOREIGN KEY (did)
REFERENCES Departments);
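The effect of the composite primary key and the two foreign keys can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the Departments table layout, the sample rows and the helper `rejected` are hypothetical, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees   (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
CREATE TABLE Departments (did INTEGER, dname CHAR(20), budget REAL, PRIMARY KEY (did));
CREATE TABLE Works_In (
    ssn CHAR(11), did INTEGER, since DATE,
    PRIMARY KEY (ssn, did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
INSERT INTO Employees VALUES ('12345678', 'John Miller', 12);
INSERT INTO Departments VALUES (1, 'Pharmacy', 100000.0);
INSERT INTO Works_In VALUES ('12345678', 1, '2020-01-01');
""")

def rejected(sql):
    """Return True if executing sql violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Same (ssn, did) pair again: blocked by the primary key.
duplicate = rejected("INSERT INTO Works_In VALUES ('12345678', 1, '2021-06-01')")
# Employee that does not exist: blocked by the foreign key.
unknown_emp = rejected("INSERT INTO Works_In VALUES ('99999999', 1, '2021-06-01')")
print(duplicate, unknown_emp)
```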
KEY CONSTRAINTS
Each dept has at most one manager, according to the key constraint on Manages.
Translation to relational model?
[Figure: example mappings labeled one-to-one, one-to-many, many-to-one and many-to-many]
[ER diagram: Employees (ssn, name, lot) -- Manages (since) -- Departments (did, dname, budget)]
KEY CONSTRAINTS
Map relationship set to a table:
Separate tables for Employees and Departments.
Note that did is the key now!
CREATE TABLE Manages (
    ssn   CHAR(11),
    did   INTEGER,
    since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments)
Since each department has a unique manager, we could instead combine Manages and Departments.
CREATE TABLE Dept_Mgr (
    did     INTEGER,
    dname   CHAR(20),
    budget  REAL,
    manager CHAR(11),
    since   DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees)
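To check that making did the key really enforces "at most one manager per department", the Manages table can be exercised with Python's sqlite3 module. The sample employees and departments are hypothetical, and the compact table definitions are an assumption for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees   (ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname CHAR(20), budget REAL);
CREATE TABLE Manages (
    ssn CHAR(11), did INTEGER, since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
INSERT INTO Employees VALUES ('111', 'Ann', 1), ('222', 'Bob', 2);
INSERT INTO Departments VALUES (1, 'Pharmacy', 1000.0), (2, 'Toys', 500.0);
INSERT INTO Manages VALUES ('111', 1, '2020-01-01');
INSERT INTO Manages VALUES ('111', 2, '2021-01-01');  -- one manager, several depts: allowed
""")

# A second manager for department 1 collides with PRIMARY KEY (did).
try:
    conn.execute("INSERT INTO Manages VALUES ('222', 1, '2022-01-01')")
    second_manager_rejected = False
except sqlite3.IntegrityError:
    second_manager_rejected = True

managed = conn.execute(
    "SELECT did FROM Manages WHERE ssn = '111' ORDER BY did"
).fetchall()
print(second_manager_rejected, managed)
```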
PARTICIPATION CONSTRAINTS
We can capture participation constraints involving one entity set in a binary relationship, using NOT NULL.
In other cases, we need CHECK constraints.
CREATE TABLE Dept_Mgr (
    did     INTEGER,
    dname   CHAR(20),
    budget  REAL,
    manager CHAR(11) NOT NULL,
    since   DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees
        ON DELETE NO ACTION)
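A short sqlite3 sketch of how NOT NULL together with ON DELETE NO ACTION behaves, assuming SQLite semantics (other DBMSs may report the errors differently). The sample rows and the helper `rejected` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER);
CREATE TABLE Dept_Mgr (
    did INTEGER, dname CHAR(20), budget REAL,
    manager CHAR(11) NOT NULL, since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees ON DELETE NO ACTION);
INSERT INTO Employees VALUES ('111', 'Ann', 1);
INSERT INTO Dept_Mgr VALUES (1, 'Pharmacy', 1000.0, '111', '2020-01-01');
""")

def rejected(sql):
    """Return True if executing sql violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A department without a manager violates NOT NULL (total participation).
no_manager = rejected("INSERT INTO Dept_Mgr VALUES (2, 'Toys', 500.0, NULL, NULL)")
# Deleting an employee who still manages a department is refused.
delete_manager = rejected("DELETE FROM Employees WHERE ssn = '111'")
print(no_manager, delete_manager)
```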
WEAK ENTITY SETS
A weak entity set can be identified uniquely only by considering the primary key of another (owner) entity set.
Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).
Weak entity set must have total participation in this identifying relationship set.
[ER diagram: Employees (ssn, name, lot) -- Policy (cost) -- Dependents (pname, age)]
WEAK ENTITY SETS
Weak entity set and identifying relationship set are translated into a single table.
When the owner entity is deleted, all owned weak entities must also be deleted.
CREATE TABLE Dep_Policy (
    pname CHAR(20),
    age   INTEGER,
    cost  REAL,
    ssn   CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE)
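The cascade can be observed with a small sqlite3 sketch. The sample employees and dependents are hypothetical; the point is that the partial key pname may repeat across owners, and that deleting an owner removes its dependents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER);
CREATE TABLE Dep_Policy (
    pname CHAR(20), age INTEGER, cost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE);
INSERT INTO Employees VALUES ('111', 'Ann', 1), ('222', 'Bob', 2);
-- The same partial key 'Joe' may appear under different owners.
INSERT INTO Dep_Policy VALUES ('Joe', 5, 100.0, '111'), ('Joe', 7, 120.0, '222');
""")

# Deleting the owner also deletes the weak entities it owns.
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
print(remaining)
```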
SUBCLASSES
If we declare A ISA B, every A entity is also considered to be a B entity.
Attributes of B are inherited by A.
Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)
Covering constraints: Does every Employees entity either have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
[ER diagram: Employees (ssn, name, lot) with subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid), connected via ISA]
SUBCLASSES
ER style translation
One table for each of the entity sets (superclass and subclasses).
ISA relationship does not require additional table.
All tables have the same key, i.e. the key of the superclass.
E.g.: One table each for Employees, Hourly_Emps and Contract_Emps.
General employee attributes are recorded in Employees. For hourly emps and contract emps, extra info recorded in the respective relations.
SUBCLASSES
Queries involving all employees easy, those involving just Hourly_Emps require a join to get their special attributes.
CREATE TABLE Employees (
    ssn  CHAR(11),
    name CHAR(20),
    lot  INTEGER,
    PRIMARY KEY (ssn))
CREATE TABLE Hourly_Emps (
    ssn          CHAR(11),
    hourly_wages REAL,
    hours_worked INTEGER,
    PRIMARY KEY (ssn),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE)
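The two kinds of query mentioned above can be sketched with Python's sqlite3 module. The sample rows are hypothetical; the first query touches only Employees, while the second needs a join on the shared key ssn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER);
CREATE TABLE Hourly_Emps (
    ssn CHAR(11) PRIMARY KEY, hourly_wages REAL, hours_worked INTEGER,
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE);
INSERT INTO Employees VALUES ('111', 'Ann', 1), ('222', 'Bob', 2);
INSERT INTO Hourly_Emps VALUES ('222', 20.0, 35);
""")

# All employees: a single-table query.
all_names = conn.execute("SELECT name FROM Employees ORDER BY ssn").fetchall()

# Hourly employees with their special attributes: join on the superclass key.
hourly = conn.execute("""
    SELECT e.name, h.hourly_wages, h.hours_worked
    FROM Employees e JOIN Hourly_Emps h ON e.ssn = h.ssn
""").fetchall()
print(all_names, hourly)
```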
SUBCLASSES
Alternative translation
Create tables for the subclasses only. These tables have all attributes of the superclass(es) and the subclass.
This approach is applicable only if the subclasses cover the superclass, i.e. Hourly_Emps AND Contract_Emps COVER Employees.
E.g.:
Hourly_Emps: ssn, name, lot, hourly_wages, hours_worked.
Contract_Emps: ssn, name, lot, contractid.
Queries involving all employees difficult, those on Hourly_Emps and Contract_Emps alone are easy.
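With tables for the subclasses only, listing all employees takes a UNION over both tables. The sqlite3 sketch below illustrates this; the sample rows and the INTEGER type chosen for contractid are assumptions, since the slides do not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hourly_Emps (
    ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER,
    hourly_wages REAL, hours_worked INTEGER);
CREATE TABLE Contract_Emps (
    ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER,
    contractid INTEGER);  -- type of contractid is an assumption
INSERT INTO Hourly_Emps VALUES ('111', 'Ann', 1, 20.0, 35);
INSERT INTO Contract_Emps VALUES ('222', 'Bob', 2, 9001);
""")

# There is no Employees table, so "all employees" is the union of the subclasses.
everyone = conn.execute("""
    SELECT ssn, name, lot FROM Hourly_Emps
    UNION
    SELECT ssn, name, lot FROM Contract_Emps
    ORDER BY ssn
""").fetchall()
print(everyone)
```

UNION removes duplicates, so an employee who appears in both subclass tables (when overlap is allowed) would be listed once, provided the shared attributes agree.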
BINARY VS. TERNARY RELATIONSHIPS
The key constraints allow us to combine Purchaser with Policies and Beneficiary with Dependents.
Participation constraints lead to NOT NULL constraints.
CREATE TABLE Policies (
    policyid INTEGER,
    cost     REAL,
    ssn      CHAR(11) NOT NULL,
    PRIMARY KEY (policyid),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE)
CREATE TABLE Dependents (
    pname    CHAR(20),
    age      INTEGER,
    policyid INTEGER NOT NULL,
    PRIMARY KEY (pname, policyid),
    FOREIGN KEY (policyid) REFERENCES Policies
        ON DELETE CASCADE)
SUMMARY
High-level design follows requirements analysis and yields a high-level description of data to be stored.
ER model popular for high-level design.
Constructs are expressive, close to the way people think about their applications.
Basic constructs: entities, relationships, and attributes (of entities and relationships).
Some additional constructs: weak entities, subclasses, and constraints.
ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise.
SUMMARY
There are guidelines to translate ER diagrams to a relational database schema.
However, there are often alternatives that need to be carefully considered.
Entity sets and relationship sets are all represented by relations.
Some constructs of the ER model cannot be easily translated, e.g. multiple participation constraints.