Date posted: 22-Dec-2015
1
The Entity-Relationship Model
2
2
Database Design Process
Requirement collection and analysis
  DB requirements and functional requirements
Conceptual DB design using a high-level model
  Easier to understand and communicate with others
Logical DB design (data model mapping)
  Conceptual schema is transformed from a high-level data model into an implementation data model
Physical DB design
  Internal data structures and file organizations for the DB are specified
3
Overview of Database Design
Conceptual design: (ER Model is used at this stage.)
  What are the entities and relationships in the enterprise?
  What information about these entities and relationships should we store in the database?
  What are the integrity constraints or business rules that hold?
  A database ‘schema’ in the ER Model can be represented pictorially (ER diagrams).
  An ER diagram can be mapped into a relational schema.
4
Relational Model [Properties]
Each relation (or table) in a database has a unique name
An entry at the intersection of each row and column is atomic (or single-valued); there can be no multi-valued attributes in a relation
Each row is unique; no two rows in a relation are identical
Each attribute (or column) within a table has a unique name
The Relational Model
5
Properties Cont’d
The sequence of columns (left to right) is insignificant; the columns of a relation can be interchanged without changing the meaning or use of the relation
The sequence of rows (top to bottom) is insignificant; rows of a relation may be interchanged or stored in any sequence
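These properties can be sketched in Python, assuming a relation is modelled as a set of rows and each row as an unordered set of (attribute, value) pairs. The helper `row` is a hypothetical name introduced only for this illustration; the sample values are the Employee rows used later in these slides.

```python
def row(**attrs):
    """Build one row of a relation as an unordered set of (attribute, value) pairs."""
    return frozenset(attrs.items())

# Two listings of the same Employee relation: rows and columns appear in a
# different order, and the second listing repeats one row.
employee_a = {
    row(name="De Silva", designation="Manager", department="Personnel"),
    row(name="Dias", designation="Manager", department="Sales"),
}
employee_b = {
    row(department="Sales", designation="Manager", name="Dias"),
    row(department="Personnel", name="De Silva", designation="Manager"),
    row(name="Dias", designation="Manager", department="Sales"),  # duplicate row
}

same_relation = employee_a == employee_b  # row and column order do not matter
row_count = len(employee_b)               # the duplicate row collapses into one
```

Because both levels are sets, neither row order nor column order can affect the comparison, and identical rows cannot coexist.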
6
The Relational Model...
The relational model of data has three major components:
Relational database objects
  allow data structures to be defined
Relational operators
  allow manipulation of stored data
Relational integrity constraints
  allow business rules to be defined and ensure data integrity
7
The Relational Objects... Location
Most RDBMSs can have multiple locations, all managed by the same database engine
[Figure: Corporate Database; labels: Marketing, Purchasing, Accounting, Sales, Advertising, Accounts Receivable, Accounts Payable]
8
The Relational Objects... Location
[Figure: a multi-user database server and its client applications]
9
The Relational Objects... Database
A set of SQL objects
[Figure: a client application sends UPDATE T SET, INSERT INTO T, DELETE FROM T and CALL STPROG to the database server, which holds a stored procedure, tables A, B and T, and update, insert and delete triggers]
10
The Relational Objects... Database
A collection of tables and associated indexes
[Figure: tables Department, Product, Customer and Employee, together with an index, stored in files]
11
The Relational Objects...
Relation: a named, two-dimensional table of data
Database: a collection of databases, tables and related objects organised in a structured fashion
Several database vendors use ‘schema’ interchangeably with ‘database’
12
Relational Objects...
Tables consist of rows and a fixed number of named columns.
Data is presented to the user as tables:
Column 1 Column 2 Column 3 Column 4
Row
Row
Row
Table
13
Relational Objects...
Columns are attributes describing an entity. Each column must have a unique name and a data type.
Data is presented to the user as tables:
Name Designation Department
Row
Row
Row
Employee
Structure of a relation (e.g. Employee):
Employee(Name, Designation, Department)
14
Relational Objects...
Rows are records that present information about a particular entity occurrence
Data is presented to the user as tables:
Employee
Name      Designation  Department
De Silva  Manager      Personnel
Perera    Secretary    Personnel
Dias      Manager      Sales
15
Relational model terminology
A row is called a ‘tuple’
A column header is called an ‘attribute’
A table is called a ‘relation’
The data type describing the type of values that can appear in each column is called a ‘domain’
E.g.:
  Names: the set of names of persons
  Employee_ages: values between 15 and 80 years old
The above are called ‘logical definitions of domains’.
A data type or format can also be specified for each domain.
E.g.: the employee age is an integer between 15 and 80
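One way a domain such as Employee_ages might be enforced is with a CHECK constraint. The sketch below uses Python's standard sqlite3 module; the table and column names and the helper `try_insert` are assumptions made for this illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK clause encodes the Employee_ages domain
# (an integer between 15 and 80) from the example above.
conn.execute(
    """
    CREATE TABLE employee (
        name TEXT NOT NULL,
        age  INTEGER NOT NULL
             CHECK (typeof(age) = 'integer' AND age BETWEEN 15 AND 80)
    )
    """
)

def try_insert(name, age):
    """Return True if the row is accepted, False if the domain check rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO employee (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Perera", 42)  # inside the domain
rejected = try_insert("Dias", 12)    # outside the domain
```

The logical definition (ages between 15 and 80) and the format (integer) are both carried by the single CHECK clause here.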
16
Characteristics of relations
Ordering of tuples
  Tuples in a relation don’t have any particular order.
  However, in a file they may be physically ordered based on some criterion; this ordering is not part of the relational model.
Ordering of values within a tuple
  The ordering of values within a tuple is unnecessary, hence a tuple can be considered as a ‘set’ of (attribute, value) pairs.
  But when a relation is implemented as a file, attributes may be physically ordered.
Values in a tuple are atomic
17
Relational constraints
Domain constraints
  Specify that the value of each attribute ‘A’ must be an atomic value from the specified domain.
Key constraints
  There is a subset of attributes of a relational schema with the property that no two tuples have the same combination of values for those attributes.
  Any such subset of attributes is called a ‘superkey’.
  A ‘superkey’ can have redundant attributes. A key is a minimal superkey.
  If a relation has more than one key, they are called candidate keys.
  One of them is chosen as the primary key.
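The definitions of superkey and key can be made concrete with a small Python sketch. The functions `is_superkey` and `candidate_keys` and the sample rows (including the made-up `nic` column) are hypothetical. Note the limitation: a key is a property of the schema, so checking one instance can refute a key but never prove one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    seen = set()
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, attributes):
    """Minimal superkeys of this particular instance, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical instance; emp_no and nic are assumed identifying columns.
employees = [
    {"emp_no": 179, "nic": "A1", "name": "Silva", "dept": 7},
    {"emp_no": 857, "nic": "B2", "name": "Perera", "dept": 4},
    {"emp_no": 342, "nic": "C3", "name": "Perera", "dept": 4},
]
keys = candidate_keys(employees, ["emp_no", "nic", "name", "dept"])
```

Here (emp_no, name) is a superkey with a redundant attribute, while emp_no and nic on their own are the two candidate keys of this instance.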
18
Relational Objects
Primary Key: An attribute (or combination of attributes) that uniquely identifies each row in a relation.
  Employee(Emp_No, Emp_Name, Department)
Composite Key: A primary key that consists of more than one attribute.
  Salary(Emp_No, Eff_Date, Amount)
Keys
19
Relational Objects
Each table has a primary key. The primary key is a column or combination of columns that uniquely identifies each row of the table.
Data is presented to the user as tables:
Employee (primary key: E-No)
E-No  E-Name  D-No
179   Silva   7
857   Perera  4
342   Dias    7

Salary (primary key: E-No, Eff-Date)
E-No  Eff-Date  Amt
179   1/1/98    8000
857   3/7/94    9000
179   1/6/97    7000
342   28/1/97   7500
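A composite primary key like the one on Salary could be declared as follows. This is a minimal sketch with Python's standard sqlite3 module; the lower-case column names are adapted from the slide, and storing dates as text is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names adapted from the Salary table above (E-No, Eff-Date, Amt).
conn.execute(
    """
    CREATE TABLE salary (
        e_no     INTEGER NOT NULL,
        eff_date TEXT    NOT NULL,
        amt      INTEGER NOT NULL,
        PRIMARY KEY (e_no, eff_date)   -- composite key
    )
    """
)
rows = [(179, "1/1/98", 8000), (857, "3/7/94", 9000),
        (179, "1/6/97", 7000), (342, "28/1/97", 7500)]
with conn:
    conn.executemany("INSERT INTO salary VALUES (?, ?, ?)", rows)

# E-No alone repeats (179 appears twice), so only the pair is unique.
try:
    with conn:
        conn.execute("INSERT INTO salary VALUES (179, '1/1/98', 8500)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The four original rows load without error even though employee 179 appears twice; only a repeated (E-No, Eff-Date) pair is refused.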
20
Relational Objects
The cardinality of a table refers to the number of rows in the table. The degree of a table refers to the number of columns.
Data is presented to the user as tables:

Salary
E-No  Eff-Date  Amt
179   1/1/98    8000
857   3/7/94    9000
179   1/6/97    7000
342   28/1/97   7500

Salary table: Degree = 3, Cardinality = 4
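The two measures are simple counts, as this short Python sketch shows. Representing the table as a header tuple plus a list of row tuples is an assumption made for the illustration.

```python
# Salary relation from the slide: a header plus a list of rows.
header = ("E-No", "Eff-Date", "Amt")
salary = [
    (179, "1/1/98", 8000),
    (857, "3/7/94", 9000),
    (179, "1/6/97", 7000),
    (342, "28/1/97", 7500),
]

degree = len(header)       # number of columns
cardinality = len(salary)  # number of rows
```

Adding or deleting rows changes the cardinality, whereas the degree changes only when the structure of the relation changes.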
21
Entity integrity, referential integrity/foreign keys
The entity integrity constraint specifies that no primary key value can be null
The referential integrity constraint is specified between two relations and is used to maintain consistency among tuples of the two relations
Informally, this means that a tuple in one relation that refers to another relation must refer to an existing tuple.
To define referential integrity we use the concept of foreign keys.
22
Relational Objects
Foreign Key: An attribute in a relation of a database that serves as the primary key of another relation in the same database
Employee(Emp_No, Emp_Name, Department)
Department(Dept_No, Dept_Name, M_No)
Relationship: Employee === works for ==> Department
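Referential integrity between Employee and Department might be enforced as in the sketch below, which uses Python's standard sqlite3 module. The M_No column is left out to keep the example small, and the helper `try_insert` is a hypothetical name. SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(
    """
    CREATE TABLE department (
        dept_no   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_no     INTEGER PRIMARY KEY,
        emp_name   TEXT NOT NULL,
        department INTEGER NOT NULL REFERENCES department (dept_no)
    );
    INSERT INTO department VALUES (4, 'Finance'), (7, 'Sales');
    INSERT INTO employee VALUES (179, 'Silva', 7);
    """
)

def try_insert(emp_no, name, dept_no):
    """Return True if the employee row is accepted by the database."""
    try:
        with conn:
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_no, name, dept_no))
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert(857, "Perera", 4)       # department 4 exists
dangling = try_insert(342, "Dias", 99)  # no department 99, so the row is rejected
```

The second insert fails because its foreign key value does not match any existing Department tuple.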
23
Relational Objects
A foreign key is a set of columns in one table that serve as the primary key in another table
Data is presented to the user as tables:
Department (primary key: D-No)
D-No  D-Name   M-No
4     Finance  857
7     Sales    179

Employee (primary key: E-No; foreign key: D-No)
E-No  E-Name  D-No
179   Silva   7
857   Perera  4
342   Dias    7
Recursive foreign key: A foreign key in a relation that references the primary key values of that same relation
24
Relational Objects...
Rows in one or more tables are associated with each other solely through data values in columns (no pointers).

Department (primary key: D-No; foreign key: M-No)
D-No  D-Name   M-No
4     Finance  857
7     Sales    179

Employee (primary key: E-No; foreign key: D-No)
E-No  E-Name  D-No
179   Silva   7
857   Perera  4
342   Dias    7

Salary (primary key: E-No, Eff-Date; foreign key: E-No)
E-No  Eff-Date  Amt
179   1/1/98    8000
857   3/7/94    9000
179   1/6/97    7000
342   28/1/97   7500
25
Relational Objects... Index
An ordered set of pointers to the data in the table

Employee
E-No  E-Name    D-No
179   Silva     7
857   Perera    4
342   Dias      7
719   De Silva  5

Index on E-Name (each entry holds a pointer to its row): De Silva, Dias, Perera, Silva
26
Index: Employee Name

Employee
E-No  E-Name    D-No
179   Silva     7
857   Perera    4
342   Dias      7
719   De Silva  5
587   Alwis     4
432   Costa     6
197   Zoysa     2
875   Peiris    4
324   Vaas      7
917   Bandara   3
785   Opatha    2
234   Wickrama  1

Index on E-Name (each entry holds a pointer to its row): Alwis, Bandara, Costa, De Silva, Dias, Opatha, Peiris, Perera, Silva, Vaas, Wickrama, Zoysa
27
Index
Improves performance. Access to data is faster.
Search: Employee Dias
[Figure: the sorted E-Name index (Alwis, Bandara, Costa, De Silva, Dias, Opatha, Peiris, Perera, Silva, Vaas, Wickrama, Zoysa) is searched for Dias]
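The idea of an index as an ordered set of pointers can be sketched in Python with the standard bisect module. This is an illustration only: a sorted list searched by binary search stands in for the index, and the names `name_index`, `index_keys` and `find_by_name` are hypothetical.

```python
from bisect import bisect_left

# Employee rows in storage order (the twelve rows of the Employee table).
employees = [
    (179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7), (719, "De Silva", 5),
    (587, "Alwis", 4), (432, "Costa", 6), (197, "Zoysa", 2), (875, "Peiris", 4),
    (324, "Vaas", 7), (917, "Bandara", 3), (785, "Opatha", 2), (234, "Wickrama", 1),
]

# The index: (E-Name, row position) pairs kept in name order.
name_index = sorted((name, pos) for pos, (_, name, _) in enumerate(employees))
index_keys = [name for name, _ in name_index]

def find_by_name(name):
    """Binary-search the index, then follow the pointer to the row."""
    i = bisect_left(index_keys, name)
    if i < len(index_keys) and index_keys[i] == name:
        return employees[name_index[i][1]]
    return None

dias = find_by_name("Dias")
```

Without the index every row would have to be inspected in turn; with it, the search needs only a handful of comparisons before following one pointer.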
28
Index
Ensures uniqueness. A table with unique fields in the index cannot have two rows with the same values in the column or columns that form the index key.
Search: Employee Dias
[Figure: tree-structured index with Opatha at the top, Costa and Silva on the next level, and Bandara, Dias, Perera and Wickrama below]
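A unique index can be declared as in the sketch below, again with Python's standard sqlite3 module. The index name `employee_e_no_idx` and the choice of E-No as the index key are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (e_no INTEGER, e_name TEXT, d_no INTEGER)")
# Hypothetical index name; UNIQUE makes the index key (e_no) reject duplicates.
conn.execute("CREATE UNIQUE INDEX employee_e_no_idx ON employee (e_no)")
with conn:
    conn.execute("INSERT INTO employee VALUES (179, 'Silva', 7)")

try:
    with conn:
        conn.execute("INSERT INTO employee VALUES (179, 'Perera', 4)")
    second_row_rejected = False
except sqlite3.IntegrityError:
    second_row_rejected = True
```

The table itself declares no key; it is the unique index that refuses the second row with E-No 179.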
29
Search: Employee Dias
[Figure: multi-level index tree over the employee names Alwis, Bandara, Costa, De Silva, Dias, Opatha, Peiris, Perera, Silva, Vaas, Wickrama and Zoysa, searched for Dias]
30
Relational Database
STORE
Store Name  City
Store 1     Colombo
Store 2     Kandy

INVENTORY
Store Name  Part No  Quantity
Store 1     P1       50
Store 1     P3       20
Store 2     P2       100
Store 2     P1       30

ORDERS
Store Name  Part No  Vendor No  Order No  Quantity
Store 1     P3       3428       0052      10
Store 2     P2       3428       0098      7
Store 2     P3       3428       0098      15
Store 2     P4       5726       0099      1

PART
Part No  Description
P1       Printer
P2       Diskette
P3       Disk Drive
P4       Modem

VENDOR
Vendor No  Vendor Name
3428       East West
5726       DMS
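The tables in this example are tied together only by matching column values, which a join makes visible. The sketch below loads three of the tables into an in-memory SQLite database through Python's standard sqlite3 module; the lower-case table and column names are adaptations, and storing the numbers as text to keep leading zeros is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE part   (part_no TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE vendor (vendor_no TEXT PRIMARY KEY, vendor_name TEXT);
    CREATE TABLE orders (store_name TEXT, part_no TEXT, vendor_no TEXT,
                         order_no TEXT, quantity INTEGER);
    INSERT INTO part VALUES ('P1','Printer'), ('P2','Diskette'),
                            ('P3','Disk Drive'), ('P4','Modem');
    INSERT INTO vendor VALUES ('3428','East West'), ('5726','DMS');
    INSERT INTO orders VALUES
        ('Store 1','P3','3428','0052',10), ('Store 2','P2','3428','0098',7),
        ('Store 2','P3','3428','0098',15), ('Store 2','P4','5726','0099',1);
    """
)

# Rows are matched purely by equal column values (part_no, vendor_no).
store2_orders = conn.execute(
    """
    SELECT o.order_no, p.description, v.vendor_name, o.quantity
    FROM orders AS o
    JOIN part   AS p ON p.part_no   = o.part_no
    JOIN vendor AS v ON v.vendor_no = o.vendor_no
    WHERE o.store_name = 'Store 2'
    ORDER BY o.order_no, p.description
    """
).fetchall()
```

Each result row combines an order with the description of its part and the name of its vendor, without any stored pointer between the tables.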
31
ER Model Basics
Entity: Real-world object distinguishable from other objects. An entity is described (in DB) using a set of attributes.
Entity Set: A collection of similar entities. E.g., all employees.
  All entities in an entity set have the same set of attributes. (Until we consider ISA hierarchies, anyway!)
  Each entity set has a key.
  Each attribute has a domain.
[Figure: entity set Employees with attributes ssn, name, lot]
32
ER Model Basics
Key and key attributes:
  Key: a unique value for an entity
  Key attributes: a group of one or more attributes that uniquely identify an entity in the entity set
Super key, candidate key, and primary key:
  Super key: a set of attributes that allows an entity to be identified uniquely in the entity set
  Candidate key: minimal super key
    • There can be many candidate keys
  Primary key: a candidate key chosen by the designer
    • Denoted by underlining in ER attributes
[Figure: entity set Employees with attributes ssn, name, lot]
33
ER Model Basics (Contd.)
Relationship: Association among two or more entities. E.g., Jack works in Pharmacy department.
Relationship Set: Collection of similar relationships.
  An n-ary relationship set R relates n entity sets E1 ... En; each relationship in R involves entities e1 in E1, ..., en in En
  • Same entity set could participate in different relationship sets, or in different “roles” in same set.
[Figure: Employees (ssn, name, lot) Works_In (since) Departments (did, dname, budget)]
[Figure: Reports_To relates Employees (ssn, name, lot) to itself in the roles subordinate and supervisor]
34
Key Constraints
Consider Works_In: An employee can work in many departments; a dept can have many employees.
In contrast, each dept has at most one manager, according to the key constraint on Manages.
[Figure: Employees (ssn, name, lot) Manages (since) Departments (did, dname, budget)]
[Figure: the four kinds of relationship: 1-to-1, 1-to-Many, Many-to-1, Many-to-Many]
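The difference between Works_In and Manages shows up in the choice of key once the relationship sets are stored as tables. The sketch below is one possible relational rendering, not a mapping prescribed by the slides; it uses Python's standard sqlite3 module, and the sample values and the helper `accepts` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Works_In: many-to-many, so the key is the (ssn, did) pair.
    CREATE TABLE works_in (ssn TEXT, did INTEGER, since TEXT,
                           PRIMARY KEY (ssn, did));
    -- Manages: at most one manager per department, so did alone is the key.
    CREATE TABLE manages (did INTEGER PRIMARY KEY, ssn TEXT NOT NULL, since TEXT);
    """
)

def accepts(sql, params):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Two employees in the same department: fine for Works_In.
w1 = accepts("INSERT INTO works_in VALUES (?, ?, ?)", ("111", 51, "1991"))
w2 = accepts("INSERT INTO works_in VALUES (?, ?, ?)", ("222", 51, "1993"))
# Two managers for the same department: the second violates the key constraint.
m1 = accepts("INSERT INTO manages VALUES (?, ?, ?)", (51, "111", "1991"))
m2 = accepts("INSERT INTO manages VALUES (?, ?, ?)", (51, "222", "1993"))
```

Making did the whole key of the Manages table is what limits each department to at most one manager.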
35
Example ER
• An ER diagram represents several assertions about the real world. What are they?
• When attributes are added, more assertions are made.
• How can we ensure they are correct?
• A DB is judged correct if it captures the ER diagram correctly.
[Figure: ER diagram with entity sets Students, Professor, Department and Courses and relationships teaches, faculty, major, offers, enrollment and advisor]
36
Participation Constraints
Does every department have a manager?
  If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial).
  • Every Departments entity must appear in an instance of the Manages relationship.
[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since)]
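Total participation of Departments in Manages can be approximated in a relational schema by folding Manages into the department table and making the manager column NOT NULL. This is one common rendering, sketched with Python's standard sqlite3 module; the table name `dept_mgr`, the helper `accepts` and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce REFERENCES
conn.executescript(
    """
    CREATE TABLE employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
    -- Manages is folded into the department table; NOT NULL on the manager
    -- column means every department takes part in Manages (total participation).
    CREATE TABLE dept_mgr (
        did    INTEGER PRIMARY KEY,
        dname  TEXT,
        budget REAL,
        ssn    TEXT NOT NULL REFERENCES employees (ssn),
        since  TEXT
    );
    INSERT INTO employees VALUES ('111', 'Jack', 12);
    """
)

def accepts(did, dname, ssn):
    """Return True if the department row is accepted."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO dept_mgr (did, dname, ssn) VALUES (?, ?, ?)",
                (did, dname, ssn),
            )
        return True
    except sqlite3.IntegrityError:
        return False

with_manager = accepts(51, "Pharmacy", "111")
without_manager = accepts(52, "Toys", None)  # no manager, so the row is rejected
```

Partial participation would correspond to allowing the manager column to be NULL.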
37
Weak Entities
A weak entity can be identified uniquely only by considering the primary key of another (owner) entity.
  Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).
  Weak entity set must have total participation in this identifying relationship set.
[Figure: Employees (ssn, name, lot) Policy (cost) Dependents (pname, age)]
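A weak entity set such as Dependents could be stored as a table whose key combines its own partial key with the owner's primary key. The sketch below is one possible rendering using Python's standard sqlite3 module; the table name `dep_policy` and the sample people are hypothetical, and deleting dependents along with their owner (ON DELETE CASCADE) is a design assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade to take effect
conn.executescript(
    """
    CREATE TABLE employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
    -- Dependents plus the identifying Policy relationship in one table.
    -- pname is only a partial key; the owner's ssn completes it.
    CREATE TABLE dep_policy (
        pname TEXT,
        age   INTEGER,
        cost  REAL,
        ssn   TEXT NOT NULL,
        PRIMARY KEY (pname, ssn),
        FOREIGN KEY (ssn) REFERENCES employees (ssn) ON DELETE CASCADE
    );
    INSERT INTO employees VALUES ('111', 'Jack', 12), ('222', 'Jill', 30);
    -- The same dependent name under two different owners is allowed.
    INSERT INTO dep_policy VALUES ('Sam', 6, 100.0, '111'), ('Sam', 9, 120.0, '222');
    """
)

# Removing an owner removes the weak entities that depend on it.
with conn:
    conn.execute("DELETE FROM employees WHERE ssn = '111'")

remaining = conn.execute("SELECT pname, ssn FROM dep_policy").fetchall()
```

The NOT NULL owner column reflects total participation, and the composite key reflects that pname identifies a dependent only within one owner.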
38
ISA (‘is a’) Hierarchies
As in C++, or other PLs, attributes are inherited.
If we declare A ISA B, every A entity is also considered to be a B entity.
Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (default: disallowed; A overlaps B)
Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (default: no; A AND B COVER C)
Reasons for using ISA:
  To add descriptive attributes specific to a subclass.
  To identify entities that participate in a relationship.
[Figure: Employees (ssn, name, lot) with ISA subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]
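The comparison with programming languages can be made literal. The sketch below models the hierarchy with Python dataclasses; the class names are adapted from the entity sets and the sample values are made up.

```python
from dataclasses import dataclass

@dataclass
class Employee:
    ssn: str
    name: str
    lot: int

@dataclass
class HourlyEmp(Employee):      # Hourly_Emps ISA Employees
    hourly_wages: float
    hours_worked: float

@dataclass
class ContractEmp(Employee):    # Contract_Emps ISA Employees
    contractid: str

joe = HourlyEmp(ssn="111", name="Joe", lot=12, hourly_wages=10.0, hours_worked=40.0)

inherited_name = joe.name                # attribute inherited from Employee
is_employee = isinstance(joe, Employee)  # every Hourly_Emps entity is an Employees entity
```

In this sketch joe belongs to exactly one subclass, which matches the default in which overlap is disallowed, and a plain Employee object can still be created, which matches the default of no covering constraint.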
39
Aggregation
Used when we have to model a relationship involving (entity sets and) a relationship set.
  Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships.
Aggregation vs. ternary relationship:
  Monitors is a distinct relationship, with a descriptive attribute.
  Also, can say that each sponsorship is monitored by at most one employee.
[Figure: Projects (pid, started_on, pbudget) Sponsors (since) Departments (did, dname, budget); Employees (ssn, name, lot) Monitors (until) the Sponsors relationship set]
40
Conceptual Design Using the ER Model
Design choices:
  Should a concept be modeled as an entity or an attribute?
  Should a concept be modeled as an entity or a relationship?
  Identifying relationships: Binary or ternary? Aggregation?
Constraints in the ER Model:
  A lot of data semantics can (and should) be captured.
  But some constraints cannot be captured in ER diagrams.
41
Entity vs. Attribute
Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?
Depends upon the use we want to make of address information, and the semantics of the data:
• If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).
• If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic).
42
Entity vs. Attribute (Contd.)
Works_In4 does not allow an employee to work in a department for two or more periods.
Similar to the problem of wanting to record several addresses for an employee: We want to record several values of the descriptive attributes for each instance of this relationship. Accomplished by introducing new entity set, Duration.
[Figure: Employees (ssn, name, lot) Works_In4 (from, to) Departments (did, dname, budget)]
[Figure: Employees (ssn, name, lot) Works_In4 Departments (did, dname, budget), with Works_In4 also connected to the entity set Duration (from, to)]
43
Entity vs. Relationship
First ER diagram OK if a manager gets a separate discretionary budget for each dept.
What if a manager gets a discretionary budget that covers all managed depts?
  Redundancy: dbudget stored for each dept managed by manager.
  Misleading: Suggests dbudget associated with department-mgr combination.
[Figure: Employees (ssn, name, lot) Manages2 (since, dbudget) Departments (did, dname, budget)]
[Figure: Employees (ssn, name, lot) Manages2 (since) Departments (did, dname, budget), with Managers (dbudget) ISA Employees. This fixes the problem!]
44
Binary vs. Ternary Relationships
If each policy is owned by just 1 employee, and each dependent is tied to the covering policy, first diagram is inaccurate.
What are the additional constraints in the 2nd diagram?
[Figure, bad design: ternary relationship Covers among Employees (ssn, name, lot), Policies (policyid, cost) and Dependents (pname, age)]
[Figure, better design: Employees (ssn, name, lot) Purchaser Policies (policyid, cost); Dependents (pname, age) Beneficiary Policies]
45
Binary vs. Ternary Relationships (Contd.)
Previous example illustrated a case when two binary relationships were better than one ternary relationship.
An example in the other direction: a ternary relation Contracts relates entity sets Parts, Departments and Suppliers, and has descriptive attribute qty. No combination of binary relationships is an adequate substitute:
  S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S.
  How do we record qty?
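A small Python sketch shows why the three binary relationships cannot replace Contracts. All identifiers (S1, S2, P1, D1) and the quantity are made-up sample data.

```python
# Hypothetical instances of the three binary relationships.
can_supply = {("S1", "P1"), ("S2", "P1")}  # supplier S can supply part P
needs = {("D1", "P1")}                     # department D needs part P
deals_with = {("D1", "S1"), ("D1", "S2")}  # department D deals with supplier S

# Every (part, department, supplier) triple consistent with the binary facts.
implied = {
    (p, d, s)
    for (s, p) in can_supply
    for (d, p2) in needs
    for (d2, s2) in deals_with
    if p == p2 and d == d2 and s == s2
}

# The actual ternary Contracts relationship, with its descriptive attribute qty.
contracts = {("P1", "D1", "S1"): 500}

not_contracted = implied - set(contracts)
```

The binary facts are consistent with D1 buying P1 from either supplier, yet only one contract exists, and qty has nowhere to live except on the ternary relationship.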
46
Summary of Conceptual Design
Conceptual design follows requirements analysis
  Yields a high-level description of data to be stored
ER model popular for conceptual design
  Constructs are expressive, close to the way people think about their applications.
Basic constructs: entities, relationships, and attributes (of entities and relationships).
Some additional constructs: weak entities, ISA hierarchies, and aggregation.
Note: There are many variations on ER model.
47
Summary of ER (Contd.)
Several kinds of integrity constraints can be expressed in the ER model: key constraints, participation constraints, and overlap/covering constraints for ISA hierarchies. Some foreign key constraints are also implicit in the definition of a relationship set.
  Some constraints (notably, functional dependencies) cannot be expressed in the ER model.
  Constraints play an important role in determining the best database design for an enterprise.
48
Summary of ER (Contd.)
ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include:
  Entity vs. attribute, entity vs. relationship, binary or n-ary relationship, whether or not to use ISA hierarchies, and whether or not to use aggregation.
Ensuring good database design: resulting relational schema should be analyzed and refined further. FD information and normalization techniques are especially useful.