1.1 CAS CS 460/660 Introduction to Database Systems SQL III.
CAS CS 460 Entity-Relationship Model. 1.2 Review Course Agenda Application/User-Oriented Database...
-
date post
22-Dec-2015 -
Category
Documents
-
view
218 -
download
1
Transcript of CAS CS 460 Entity-Relationship Model. 1.2 Review Course Agenda Application/User-Oriented Database...
1.2
ReviewReview
Course Agenda Application/User-Oriented
Database Modeling Data Retrieval Design Considerations
Database System Internals Storage and Indexing Query Processing, Optimization Transaction Processing Recovery Management
Advanced Topics Data Mining, XML DB in Oracle
1.3
ReviewReview
Course Agenda Application/User-Oriented
Database Modeling – ER Model, Relational Model Data Retrieval -- Relational Alg, Rel. Cal,… SQL,.. Design Considerations – Normal Forms,…
Database System Internals Storage and Indexing Query Processing, Optimization Transaction Processing Recovery Management
Advanced Topics Data Mining, XML DB in Oracle
1.4
Databases Model the Real WorldDatabases Model the Real World “Data Model” translates real world things into structures computers
can store
Many models: Relational, E-R, O-O, Network, Hierarchical, etc.
Relational (more next time) Rows & Columns
Keys & Foreign Keys to link Relations
sid name login age gpa
53666 Jones jones@cs 18 3.4 53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8
sid cid grade53666 Carnatic101 C53666 Reggae203 B53650 Topology112 A53666 History105 B
Enrolled Students
1.5
Problems with Relational ModelProblems with Relational Model
With complicated schemas, it may be hard for a person to understand the structure from the data definition.
CREATE TABLE Students(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)
CREATE TABLE Enrolled(sid CHAR(20), cid CHAR(20), grade CHAR(2))
sid name login age gpa
53666 Jones jones@cs 18 3.453688 Smith smith@eecs 18 3.253650 Smith smith@math 19 3.8
Studentscid grade sid Carnatic101 C 53666 Reggae203 B 53666 Topology112 A 53650 History105 B 53666
Enrolled
1.6
One Solution: The E-R ModelOne Solution: The E-R Model
Instead of relations, it has: Entities and Relationships
These are described with diagrams
both structure, notation more obvious to humans
lot
name
Employee
ssn
Works_In
sincedname
budgetdid
Department
1.7
Steps in Database DesignSteps in Database Design
Requirements Analysis user needs; what must database do?
Conceptual Design high level descr (often done w/ER model)
Logical Design translate ER into DBMS data model
Schema Refinement consistency, normalization
Physical Design indexes, disk layout
Security Design who accesses what, and how
1.8
Example: DBA for Bank of AmericaExample: DBA for Bank of America
Requirements Specification Determine the requirements of clients ( Database to store
information about customers, accounts, loans, branches, transactions, …)
Conceptual Design Express client requirements in terms of E/R model.
Confirm with clients that requirements are correct.
Specify required data operations
Logical Design Convert E/R model to relational, object-based, XML-based,…
Physical Design Specify file organizations, build indexes
1.9
ModelingModeling
A database can be modeled as: a collection of entities,
relationship among entities.
An entity is an object that exists and is distinguishable from other objects.
Example: specific employee, company, customer, loan
Entities have attributes Example: employees have names and addresses
An entity set is a set of entities of the same type that share the same properties. Example: set of all employees, companies, customers, loans
1.10
Entity Sets Entity Sets customercustomer and and loanloan
Cst_id Cst_name Cst_street Cst_city Loan_id, Amount
1.11
Relationship SetsRelationship Sets
A relationship is an association among several entities
Example:Hayes depositor A-102
customer entity relationship set account entity
A relationship set is a mathematical relation among n 2 entities, each taken from entity sets
{(e1, e2, … en) | e1 E1, e2 E2, …, en En}
where (e1, e2, …, en) is a relationship
Example:
(Hayes, A-102) depositor
1.13
Degree of a Relationship SetDegree of a Relationship Set
Refers to number of entity sets that participate in a relationship set.
Relationship sets that involve two entity sets are binary (or degree two). Generally, most relationship sets in a database system are binary.
Relationship sets may involve more than two entity sets.
Relationships between more than two entity sets are rare. Most relationships are binary. (More on this later.)
Example: Suppose employees of a bank may have jobs (responsibilities) at multiple branches, with different jobs at different branches. Then there is a ternary relationship set between entity sets employee, job, and branch
1.14
AttributesAttributes
An entity is represented by a set of attributes, that is descriptive properties possessed by all members of an entity set.
Domain – the set of permitted values for each attribute
Attribute types: Simple and composite attributes.
Single-valued and multi-valued attributes
Example: multivalued attribute: phone_numbers
Derived attributes
Can be computed from other attributes
Example: age, given date_of_birth
Example:
customer = (customer_id, customer_name, customer_street, customer_city )
loan = (loan_number, amount )
1.16
Relationship Sets (Cont.)Relationship Sets (Cont.)
An attribute can also be property of a relationship set. For instance, the depositor relationship set between entity sets
customer and account may have the attribute access-date
1.17
E/R Data ModelE/R Data ModelAnother ExampleAnother Example
Employee Works_At
essn
Branch
ename
phone
children
since seniority bname bcity
Entity Set
Relationship Set
Attribute
Employee
Works_For
phone
Lots of notation to come.
1.18
E/R Data ModelE/R Data ModelTypes of AttributesTypes of Attributes
Defaultename
children
seniority
Multivalued
Derived
Employee Works_At
essn
Branch
ename
phone
children
since seniority bname bcity
1.19
E/R Data ModelE/R Data ModelTypes of relationshipsTypes of relationships
Many-to-Many (n:m)Works_At
Employee Works_At
essn
Branch
ename
phone
children
since seniority bname bcity
1.20
RolesRoles
Entity sets of a relationship need not be distinct
The labels “manager” and “worker” are called roles; they specify how employee entities interact via the works_for relationship set.
Roles are indicated in E-R diagrams by labeling the lines that connect diamonds to rectangles.
Role labels are optional, and are used to clarify semantics of the relationship
1.21
E/R Data ModelE/R Data ModelRecursive relationshipsRecursive relationships
Employeemanager
workerWorks_For
Recursive relationships: Must be declared with roles
Employee Works_At
essn
Branch
ename
phone
children
since seniority bname bcity
Works_Formanager
worker Many-to-1 Relationship
1.22
An Example: Employees can have multiple phones
E/R Data ModelE/R Data ModelDesign Issue #1: Entity Sets vs. AttributesDesign Issue #1: Entity Sets vs. Attributes
Employee
phone_no phone_loc
Employee
no loc
PhoneUsesvs(a) (b)
To resolve, determine how phones are used 1. Can many employees share a phone?
(If yes, then (b))
2. Can employees have multiple phones?
(if yes, then (b), or (a) with multivalued attributes)
3. Else
(a), perhaps with composite attributes
Employee
phonenoloc
1.23
An Example: How to model bank loans
E/R Data ModelE/R Data ModelDesign Issue #2: Entity Sets vs. Relationship SetsDesign Issue #2: Entity Sets vs. Relationship Sets
Customer
cssn cname
vs
(a)
To resolve, determine how loans are issued 1. Can there be more than one customer per loan?
If yes, then (a). Otherwise in (b), loan info must be replicated for each customer (wasteful storage , potential update anomalies)
2. Is loan a noun or a verb?
Both, but more of a noun to a bank. (hence (a) probably more appropriate)
Borrows Loan
lno amt
Customer
cssn cname
(b)
Loans Branch
bname bcity
lno amt
1.24
An Example:
E/R Data ModelE/R Data ModelDesign Issue #3: Relationship CardinalitiesDesign Issue #3: Relationship Cardinalities
Customer Borrows Loan? ?
Variations on Borrows: 1. Can a customer hold multiple loans?
2. Can a loan be jointly held by more than 1 customer?
1.25
E/R Data ModelE/R Data ModelDesign Issue #3: Relationship CardinalitiesDesign Issue #3: Relationship Cardinalities
Type Illustrated Multiple Loans? Joint Loans?
One-to-One (1:1) No No
Many-to-one (n:1) No Yes
One-to-many (1:n) Yes No
Many-to-many (n:m) Yes Yes
Cardinalities of Borrows:
Borrows
Borrows
Borrows
Borrows
Customer Borrows Loan? ?
1.26
E/R Data ModelE/R Data ModelDesign Issue #3: Relationship Cardinalities (cont)Design Issue #3: Relationship Cardinalities (cont)
In general...
1:1
n:1
1:n
n:m
1.27
An Example: Works_At
E/R Data ModelE/R Data ModelDesign Issue #4: N-ary vs Binary Relationship SetsDesign Issue #4: N-ary vs Binary Relationship Sets
Employee Works_at Branch
Dept
Employee WAE Branch
Dept
WABinary:
Ternary:
WAB
WAD
vs(Joe, Moody, Acct) Works_At
(Joe, w3) WAE
(Moody, w3) WAB
(Acct, w3) WAD
Choose n-arywhen possible!(Avoids redundancy,update anomalies)
1.28
Key = set of attributes identifying individual entities or relationships
E/R Data ModelE/R Data ModelKeysKeys
Employee
essn ename eaddress ephone
A. Superkey: any attribute set that distinguishes identities
• i.e., uniquely identify a member of the entity set e.g., {essn}, {essn, ename, eaddress}
B. Candidate Key: “minimal superkey” (can’t remove attributes and preserve “keyness”) e.g., {essn}, {ename, eaddress}
C. Primary Key: candidate key chosen as the key by a DBA e.g., {essn} (denoted by underline)
1.29
ExampleExample
ESSN EName EAddress E-Id
123456789 John Neuman Worcester OR-0034
234567567 John Neuman Boston OR-0027
234789890 Richard Wang Framingham OR-0025
976563456 Sheila Dixie Worcester OR-6788
• Superkeys: uniquely identify a member of the entity set - {ESSN}, {EName, EAddress}, {E-Id}, {ESSN, E-Id}• Candidate keys: {ESSN}, {EName, EAddress}, {E-Id}
- why? No subset is a super key- E.g., {EName}, {EAddress} are not super keys hence {EName, EAddress} which is a super key is also a candidate key.
• {ESSN, E-id}: Superkey but not a Candidate key
1.30
E/R Data ModelE/R Data ModelRelationship Set KeysRelationship Set Keys
Employee
essn ename ...
Works_At Branch
bname bcity ...
Q: What attributes needed to represent relationships in Works_At?
since
e1e2
e3
b1
b2
A: {essn, bname, since}
1.31
E/R Data ModelE/R Data ModelRelationship Set Keys (cont.)Relationship Set Keys (cont.)
Q: What are the candidate keys of Works_At?
e1e2
e3
b1
b2
A: {essn}
Employee
essn ename ...
Works_At Branch
bname bcity ...since
1.32
E/R Data ModelE/R Data ModelRelationship Set Keys (cont.)Relationship Set Keys (cont.)
Q: What are the candidate keys if Works_At is...?
A: {essn, bname} Assumption: employees have <= 1 record per branch
b. n:m
a. 1:n
c. 1:1
A: {bname}
A: {essn}, {bname}
Employee
essn ename ...
Works_At Branch
bname bcity ...since
1.33
General Rules for Relationship Set Keys
E/R Data ModelE/R Data ModelRelationship Set Keys (cont.)Relationship Set Keys (cont.)
E1
P (E1) ...
R E2
P (E2) ...
If R is:
R1:11:nn:1n:m
Candidate KeysP (E1) or P (E2)
P (E2)P (E1)
P (E1) P (E2)
1.34
Idea: Existence of one entity depends on another
Example: Loans and Loan Payments
E/R Data ModelE/R Data ModelExistence Dependencies and Weak Entity SetsExistence Dependencies and Weak Entity Sets
Loan
lno lamt
Loan_Pmt Payment
pno pdate pamt
Weak Entity Set
Identifying Relationship
Total Participation
1.35
E/R Data ModelE/R Data ModelExistence Dependencies and Weak Entity SetsExistence Dependencies and Weak Entity Sets
Weak Entity Sets
existence of payments depends upon loans
have no superkeys: different payment records (for different loans) can be identical
instead of keys, discriminators: discriminate between payments for given loan (e.g., pno)
Loan
lno lamt
Loan_Pmt Payment
pno pdate pamt
1.36
E/R Data ModelE/R Data ModelExistence Dependencies and Weak Entity SetsExistence Dependencies and Weak Entity Sets
Identifying Relationships
We say:
Loan is dominant in Loan_Pmt
Payment is subordinate in Loan_Pmt
Payment is existence dependent on Loan
Loan
lno lamt
Loan_Pmt Payment
pno pdate pamt
1.37
E/R Data ModelE/R Data ModelExistence Dependencies and Weak Entity SetsExistence Dependencies and Weak Entity Sets
All elements of Payment appear in Loan_Pmt
Total Participation
Loan
lno lamt
Loan_Pmt Payment
pno pdate pamt
1.38
E/R Data ModelE/R Data ModelExistence Dependencies and Weak Entity SetsExistence Dependencies and Weak Entity Sets
E1
attam
E2
attb1 attbn
Q. Is {attb1, …, attbn} a superkey of E2?
... ...atta1
A: No
Q. Name a candidate key of E2
A: {atta1, attb1}
R
1.39
E/R Data ModelE/R Data ModelExtensions to the Model: Specialization and GeneralizationExtensions to the Model: Specialization and Generalization
An Example: Customers can have checking and savings accts
Checking ~ Savings (many of the same attributes)
Old Way:
Customer Has1 Savings Acct
acct_no balance interest
Has2 Checking Acct
acct_no balance overdraft
1.40
E/R Data ModelE/R Data ModelExtensions to the Model: Specialization and GeneralizationExtensions to the Model: Specialization and Generalization
Customer Has Account
acct_no balance
Checking Acct
overdraftinterest
Isa
Savings Acct
An Example: Customers can have checking and savings accts
Checking ~ Savings (many of the same attributes)
New Way:
superclass
subclasses
1.41
E/R Data ModelE/R Data ModelExtensions to the Model: Specialization and GeneralizationExtensions to the Model: Specialization and Generalization
Subclass Distinctions:
1. User-Defined vs. Condition-Defined
User: Membership in subclasses explicitly determined (e.g., Employee, Manager < Person)
Condition: Membership predicate associated with subclasses - e.g:
Person
Isa
Child Adult Senior
age < 18 age18 age65
1.42
E/R Data ModelE/R Data ModelExtensions to the Model: Specialization and GeneralizationExtensions to the Model: Specialization and Generalization
Subclass Distinctions:
2. Overlapping vs. Disjoint
Overlapping: Entities can belong to >1 entity set (e.g., Adult, Senior)
Disjoint: Entities belong to exactly 1 entity set
(e.g., Child)
Person
Isa
Child Adult Senior
age < 18 age18 age65
1.43
E/R Data ModelE/R Data ModelExtensions to the Model: Specialization and GeneralizationExtensions to the Model: Specialization and Generalization
Subclass Distinctions:
3. Total vs. Partial Membership Total: Every entity of superclass belongs to a subclass e.g.,
Partial: Some entities of superclass do not belong to any subclass (e.g., if Adults condition is age21 )
Person
Isa
Child Adult Senior
age < 18 age18 age65
1.44
E/R Data ModelE/R Data ModelExtensions to the Model: AggregationExtensions to the Model: Aggregation
E/R: No relationships between relationships
E.g.: Associate loan officers with Borrows relationship set
Customer LoanBorrows
Employee
Loan_Officer
?
Associate Loan Officer with Loan?
What if we want a loan officer for every (customer, loan) pair?
1.45
E/R Data ModelE/R Data ModelExtensions to the Model: AggregationExtensions to the Model: Aggregation
E/R: No relationships between relationships
E.g.: Associate loan officers with Borrows relationship set
Customer LoanBorrows
Employee
Loan_Officer
Associate Loan Officer with Borrows?
Must First Aggregate
1.46
Other Similar Models: UMLOther Similar Models: UML
UML: Unified Modeling Language
UML has many components to graphically model different aspects of an entire software system
UML Class Diagrams correspond to E-R Diagram, but several differences.
1.48
UML Class Diagrams (Cont.)UML Class Diagrams (Cont.)
Entity sets are shown as boxes, and attributes are shown within the box, rather than as separate ellipses in E-R diagrams.
Binary relationship sets are represented in UML by just drawing a line connecting the entity sets. The relationship set name is written adjacent to the line.
The role played by an entity set in a relationship set may also be specified by writing the role name on the line, adjacent to the entity set.
The relationship set name may alternatively be written in a box, along with attributes of the relationship set, and the box is connected, using a dotted line, to the line depicting the relationship set.
Non-binary relationships drawn using diamonds, just as in ER diagrams
1.49
UML Class Diagram Notation (Cont.)UML Class Diagram Notation (Cont.)
*Note reversal of position in cardinality constraint depiction*Generalization can use merged or separate arrows independent of disjoint/overlapping
overlapping
disjoint
1.50
UML Class Diagrams (Contd.)UML Class Diagrams (Contd.)
Cardinality constraints are specified in the form l..h, where l denotes the minimum and h the maximum number of relationships an entity can participate in.
Beware: the positioning of the constraints is exactly the reverse of the positioning of constraints in E-R diagrams.
The constraint 0..* on the E2 side and 0..1 on the E1 side means that each E2 entity can participate in at most one relationship, whereas each E1 entity can participate in many relationships; in other words, the relationship is many to one from E2 to E1.
Single values, such as 1 or * may be written on edges; The single value 1 on an edge is treated as equivalent to 1..1, while * is equivalent to 0..*.
1.51
E/R Data ModelE/R Data ModelSummarySummary
Entities, Relationships (sets)
Both can have attributes (simple, multivalued, derived, composite)
Cardinality or relationship sets (1:1, n:1, n:m)
Keys: superkeys, candidate keys, primary key
DBA chooses primary key for entity sets
Automatically determined for relationship sets
Weak Entity Sets, Existence Dependence, Total/Partial Participation
Specialization and Generalization (E/R + inheritance)
Aggregation (E/R + higher-order relationships)
1.52
These things get pretty hairy!These things get pretty hairy!
Many E-R diagrams cover entire walls!
A modest example:
1.53
A Cadastral E-R DiagramA Cadastral E-R Diagram
cadastral: showing or recording property boundaries, subdivision lines, buildings, and related details
Source: US Dept. Interior Bureau of Land Management,Federal Geographic Data Committee Cadastral Subcommittee
http://www.fairview-industries.com/standardmodule/cad-erd.htm