LIS 557 Database Design and Management

LIS 557 Database Design and Management. William Voon, Michael Cole. Spring '04


William Voon, Michael Cole, Spring '04. LIS 557 Database Design and Management. 19 February 2004. Data modeling. One can't proceed from the informal to the formal by formal means. -- Alan J. Perlis. The Zeppelin View.

Transcript of LIS 557 Database Design and Management

Page 1: LIS 557 Database Design and Management

LIS 557 Database Design and Management

William Voon, Michael Cole

Spring '04

Page 2: LIS 557 Database Design and Management

Data modeling

19 February 2004

Page 3: LIS 557 Database Design and Management

One can't proceed from the informal to the formal by formal means.

-- Alan J. Perlis

Page 4: LIS 557 Database Design and Management

The Zeppelin View

● To build an RDB, our data model starts as a conceptual model and then becomes a logical design and finally is physically implemented.

Page 5: LIS 557 Database Design and Management

RDB in Seven Steps

1. Interview users and domain experts.

2. Identify the data elements and their relationships.

3. Create a data model.

4. Select the database management software (DBMS).

5. Map data-model elements to tables, and normalize them.

6. Create data type definitions and a database structure.

7. Design the application.

Page 6: LIS 557 Database Design and Management

Data Modeling

● Entity relationships show us (more or less) how many tables we need and how they may be linked

– They do not tell us how to make an optimally efficient design of linked tables

● Tonight, we will learn detailed procedures that can be used to go from entity relationships to a well-designed database.

– These procedures are not a substitute for insight into the system we wish to model

Page 7: LIS 557 Database Design and Management

What is being modeled?

● There are several levels of modeling that are possible:

– Physical model

– Internal model

– External models

– Conceptual model

Recall that Codd's RDB is intended to hide the physical details of data storage and use, so we can concentrate on the Conceptual, Internal and External models.

Page 8: LIS 557 Database Design and Management

The Conceptual Model

● We have already seen this – E-R diagrams are the most widely used conceptual modelling tool

● ERDs provide a conceptual schema, that is, a plan for the RDB

● Much more information can be packed into an ERD

Page 9: LIS 557 Database Design and Management

Tiny College

Entities for Conceptual Model

Page 10: LIS 557 Database Design and Management

The Model

● Consider each of the entities and write out the relationships to the other entities

● EXERCISE: Teams of two

– Enumerate the relationships that make sense between the entities of Tiny College

Page 11: LIS 557 Database Design and Management

The Basic TC Model

Page 12: LIS 557 Database Design and Management

Internal Model

● Taking the conceptual model, match the characteristics and constraints to the specific software (DBMS) used

– So internal models are software dependent

● Usually this means indicating the data location in a storage group.

● The internal model is (for RDBs) the implementation model.

– So, M:N relationships must be resolved using bridging tables

Page 13: LIS 557 Database Design and Management

TC Internal Model

Page 14: LIS 557 Database Design and Management

TC External Schema(s)

External models are what users see. For TC, think of two web sites, one for students, one for professors.

Page 15: LIS 557 Database Design and Management

External Models

● Models the database slice that is seen and used by distinct groups of users

● Each uses a subset of the data in the system

● Even when two groups use overlapping data sets (or even the same data set) there may be different constraints to implement
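
One way to picture external models is as views over shared base tables. The sketch below uses Python's built-in sqlite3 module. The table ENROLL, the views STUDENT_SCHEDULE and CLASS_ROSTER, and the sample rows are hypothetical names chosen for illustration, not part of the Tiny College model shown in the slides.

```python
import sqlite3

def build_demo():
    """Create an in-memory database with one base table and two views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ENROLL (
            STU_NUM    INTEGER NOT NULL,
            CLASS_CODE TEXT    NOT NULL,
            GRADE      TEXT,
            PRIMARY KEY (STU_NUM, CLASS_CODE)
        );
        INSERT INTO ENROLL VALUES (1, 'LIS557-A', 'A');
        INSERT INTO ENROLL VALUES (2, 'LIS557-A', 'B');

        -- Student-facing slice: enrollments only, grades left out.
        CREATE VIEW STUDENT_SCHEDULE AS
            SELECT STU_NUM, CLASS_CODE FROM ENROLL;

        -- Professor-facing slice: class rosters including grades.
        CREATE VIEW CLASS_ROSTER AS
            SELECT CLASS_CODE, STU_NUM, GRADE FROM ENROLL;
    """)
    return conn

conn = build_demo()
schedule = conn.execute(
    "SELECT * FROM STUDENT_SCHEDULE ORDER BY STU_NUM").fetchall()
roster = conn.execute(
    "SELECT * FROM CLASS_ROSTER ORDER BY STU_NUM").fetchall()
```

Both views read the same data set, but each exposes only the columns its user group needs.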

Page 16: LIS 557 Database Design and Management

TC External Model Constraints

● Student registration constraints

– A class is limited to 25 students

– A student can enroll for up to 5 classes

● Class scheduling

– A room can be used by many classes, but a class may only use one room

– A class is taught by one professor

– A professor may teach up to three classes
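
A minimal sketch of how the two registration limits might be checked in application code, using Python's sqlite3 module. The ENROLL table and the helper functions make_db and enroll are assumptions for illustration. A real system might enforce these limits with triggers or in the DBMS itself.

```python
import sqlite3

MAX_STUDENTS_PER_CLASS = 25   # limit stated on the slide
MAX_CLASSES_PER_STUDENT = 5   # limit stated on the slide

def make_db():
    """Create an in-memory database holding only the enrollment table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ENROLL ("
        " STU_NUM INTEGER NOT NULL,"
        " CLASS_CODE TEXT NOT NULL,"
        " PRIMARY KEY (STU_NUM, CLASS_CODE))"
    )
    return conn

def enroll(conn, stu_num, class_code):
    """Insert an enrollment if both limits allow it; return True on success."""
    class_size = conn.execute(
        "SELECT COUNT(*) FROM ENROLL WHERE CLASS_CODE = ?", (class_code,)
    ).fetchone()[0]
    student_load = conn.execute(
        "SELECT COUNT(*) FROM ENROLL WHERE STU_NUM = ?", (stu_num,)
    ).fetchone()[0]
    if class_size >= MAX_STUDENTS_PER_CLASS:
        return False
    if student_load >= MAX_CLASSES_PER_STUDENT:
        return False
    conn.execute("INSERT INTO ENROLL VALUES (?, ?)", (stu_num, class_code))
    return True
```

The scheduling constraints (one room and one professor per class) would more naturally be single foreign-key columns on the class table, since each is the "one" side of a 1:M relationship.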

Page 17: LIS 557 Database Design and Management

Entity Relationship Models

● We have already seen Entity Relationship Diagrams.

● Now we will add some elements for a richer description.

● The goal is to have an ERD that describes the complete structure of all the tables and their relationships to one another (THE RDB!!)

Page 18: LIS 557 Database Design and Management

Attributes in ERDs

● Attributes are ovals attached to an entity (remember that in this usage "entity" means an entity set, i.e. a table)

● Attributes have a domain of values, e.g. (T F), (0, 10), (female male) etc.

● An entity is a collection of attributes:

Car = CAR(CAR_ID_NUM, MOD_CODE, CAR_YEAR, CAR_COLOR)
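
As a rough sketch, the CAR entity and its attribute domains could be written as a table with CHECK constraints, here in SQLite via Python's sqlite3 module. The column types, the year range, and the four color names are assumptions made for the example. The slides give only the attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CAR (
        CAR_ID_NUM TEXT PRIMARY KEY,
        MOD_CODE   TEXT NOT NULL,
        -- Assumed domain: a plausible range of model years.
        CAR_YEAR   INTEGER NOT NULL CHECK (CAR_YEAR BETWEEN 1900 AND 2100),
        -- Assumed domain: four permitted colors.
        CAR_COLOR  TEXT NOT NULL
                   CHECK (CAR_COLOR IN ('red', 'blue', 'yellow', 'green'))
    )
""")

def try_insert(row):
    """Return True if the row satisfies every constraint, else False."""
    try:
        conn.execute("INSERT INTO CAR VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert(("VIN001", "SEDAN", 2003, "red"))
bad = try_insert(("VIN002", "SEDAN", 2003, "purple"))  # outside the color domain
```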

Page 19: LIS 557 Database Design and Management

Entity with Attributes

Multi-valued attribute (double line)

Need to split into distinct attributes for implementation

Key attribute (underlined)

Page 20: LIS 557 Database Design and Management

Derived Attribute

A derived attribute is calculated or inferred (usually directly from other entity attributes)
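
A small illustration, assuming the common textbook case of an age derived from a stored date of birth. The function name derive_age and the choice of attribute are hypothetical. The point is that the derived value is computed on demand rather than stored.

```python
from datetime import date

def derive_age(birth_date, today):
    """Age in whole years, derived from the stored birth date."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)
```

For example, `derive_age(date(1980, 2, 20), date(2004, 2, 19))` gives 23, because the birthday has not yet occurred in 2004.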

Page 21: LIS 557 Database Design and Management

Relationships

● Degree: unary, binary, ternary, ...

– How many entities are involved in the relationship?

● Connectivity: 1:1, 1:M, M:N

● Cardinality

– Number of entity occurrences associated with one occurrence of related entity

Page 22: LIS 557 Database Design and Management

Relationships

Cardinalities express constraints in the internal model. Here, a professor can teach between one and four classes.

Page 23: LIS 557 Database Design and Management

Cardinality Exercise

● A car can have four body colors (r b y g). Draw the ERD.

● A manufacturer makes three car types (sedan, SUV, van), each in the four body colors. Draw the ERD.

Page 24: LIS 557 Database Design and Management

Another example

Interpret this ERD.

Page 25: LIS 557 Database Design and Management

Existence Dependency

● If the existence of one or more entities is required for another entity, that entity is existence-dependent

– e.g. A course must exist for there to be a specific class

● Existence-dependence is important because it determines the order in which tables must be created
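
The COURSE and CLASS example can be sketched with a foreign key, here in SQLite through Python's sqlite3 module. The column names are assumptions. SQLite itself is lenient about the order of CREATE TABLE statements, so this sketch shows the same dependency at the row level: a CLASS row is rejected until its COURSE row exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent first, then the existence-dependent child.
conn.execute("CREATE TABLE COURSE (CRS_CODE TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE CLASS (
        CLASS_CODE TEXT PRIMARY KEY,
        CRS_CODE   TEXT NOT NULL REFERENCES COURSE (CRS_CODE)
    )
""")

def add_class(class_code, crs_code):
    """Try to add a class; return False if its course does not exist."""
    try:
        conn.execute("INSERT INTO CLASS VALUES (?, ?)", (class_code, crs_code))
        return True
    except sqlite3.IntegrityError:
        return False

orphan = add_class("LIS557-A", "LIS557")   # rejected: no such course yet
conn.execute("INSERT INTO COURSE VALUES ('LIS557')")
valid = add_class("LIS557-A", "LIS557")    # accepted once the course exists
```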

Page 26: LIS 557 Database Design and Management

Relationships: Mandatory or Optional?

● Constraints show where a relationship must exist for a specific entity and where the relationship is optional

Page 27: LIS 557 Database Design and Management

Weak Entity Relationships

● A weak entity

– Is existence-dependent (i.e. other entities must exist for it to exist)

– Has primary key that is derived (at least in part, usually totally) from the parent entity

● Use a double rectangle to denote a weak entity
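
A sketch of a weak entity, assuming the familiar EMPLOYEE and DEPENDENT textbook example (these tables are not taken from the slides). The primary key of DEPENDENT includes the parent's key EMP_NUM. The ON DELETE CASCADE clause is an extra assumption, added to show that dependents cannot outlive their employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the FK
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        EMP_NUM  INTEGER PRIMARY KEY,
        EMP_NAME TEXT NOT NULL
    );
    -- Weak entity: its key is (parent key, local sequence number).
    CREATE TABLE DEPENDENT (
        EMP_NUM  INTEGER NOT NULL
                 REFERENCES EMPLOYEE (EMP_NUM) ON DELETE CASCADE,
        DEP_NUM  INTEGER NOT NULL,
        DEP_NAME TEXT NOT NULL,
        PRIMARY KEY (EMP_NUM, DEP_NUM)
    );
    INSERT INTO EMPLOYEE VALUES (1001, 'Ada');
    INSERT INTO DEPENDENT VALUES (1001, 1, 'Ben');
    INSERT INTO DEPENDENT VALUES (1001, 2, 'Cy');
""")

before = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
conn.execute("DELETE FROM EMPLOYEE WHERE EMP_NUM = 1001")
after = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
```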

Page 28: LIS 557 Database Design and Management

Weak Entity Diagram

Page 29: LIS 557 Database Design and Management

Recursive Entities

● An entity with a unary relationship (a relationship to itself) is a recursive entity

– One consequence is that only one table is required for the relationship

– Example: Prerequisites for a course

– Other examples?

● Common when we model part-whole relationships and in state transformations (so a subsequent state is related to its predecessor state)
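
The prerequisite example can be sketched as a single table that references itself, again with Python's sqlite3 module. This assumes each course has at most one prerequisite (a 1:M unary relationship). The column name PREREQ_CODE and the course codes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COURSE (
        CRS_CODE    TEXT PRIMARY KEY,
        PREREQ_CODE TEXT REFERENCES COURSE (CRS_CODE)  -- NULL: no prerequisite
    );
    INSERT INTO COURSE VALUES ('LIS501', NULL);
    INSERT INTO COURSE VALUES ('LIS557', 'LIS501');
""")

# A self-join pairs each course with its prerequisite.
pairs = conn.execute("""
    SELECT c.CRS_CODE, p.CRS_CODE
    FROM COURSE AS c
    JOIN COURSE AS p ON c.PREREQ_CODE = p.CRS_CODE
""").fetchall()
```

If a course could have several prerequisites, the unary relationship would be M:N and would need a bridging table like any other M:N relationship.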

Page 30: LIS 557 Database Design and Management

Composite Entity

● Resolution of M:N relationships requires a bridge entity. This is a composite entity.

● Notice that a composite entity is existence-dependent (Why?)

● The composite entity can have attributes that are not required for the bridging function

Page 31: LIS 557 Database Design and Management

Bridging from M:N to 1:M

To break down an M:N relationship, create a new entity that bridges between the two entities

So from STUDENT to CLASS build a new table with a link (=key) to STUDENT and a link to CLASS

This is efficient because only the key information is redundant.

These tables are composite or bridge entities
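
A minimal sketch of the STUDENT to CLASS bridge using Python's sqlite3 module. The bridge table name ENROLL, the column names, and the sample rows are assumptions for illustration. ENROLL_GRADE shows an attribute that belongs to the bridge itself and is not needed for the linking function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_NAME TEXT NOT NULL);
    CREATE TABLE CLASS   (CLASS_CODE TEXT PRIMARY KEY, CLASS_TITLE TEXT NOT NULL);

    -- Bridge (composite) entity: one row per student-class pairing.
    CREATE TABLE ENROLL (
        STU_NUM      INTEGER NOT NULL REFERENCES STUDENT (STU_NUM),
        CLASS_CODE   TEXT    NOT NULL REFERENCES CLASS (CLASS_CODE),
        ENROLL_GRADE TEXT,
        PRIMARY KEY (STU_NUM, CLASS_CODE)
    );

    INSERT INTO STUDENT VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO CLASS VALUES ('DB1', 'Data Modeling'), ('DB2', 'Normalization');
    INSERT INTO ENROLL VALUES (1, 'DB1', NULL), (1, 'DB2', NULL), (2, 'DB1', NULL);
""")

def classes_for(stu_num):
    """Class codes a student is enrolled in (the 1:M side from STUDENT)."""
    rows = conn.execute(
        "SELECT CLASS_CODE FROM ENROLL WHERE STU_NUM = ? ORDER BY CLASS_CODE",
        (stu_num,))
    return [r[0] for r in rows]

def students_in(class_code):
    """Names of students in a class (the 1:M side from CLASS)."""
    rows = conn.execute(
        "SELECT s.STU_NAME FROM ENROLL AS e"
        " JOIN STUDENT AS s ON s.STU_NUM = e.STU_NUM"
        " WHERE e.CLASS_CODE = ? ORDER BY s.STU_NAME",
        (class_code,))
    return [r[0] for r in rows]
```

Only the two key columns are repeated in ENROLL. Student names and class titles are each stored once.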

Page 32: LIS 557 Database Design and Management

Bridging Tables

Page 33: LIS 557 Database Design and Management

The Bridging Process

Page 34: LIS 557 Database Design and Management

A fully specified bridge

Page 35: LIS 557 Database Design and Management

Entity subtypes

● Entities that have subtypes are often best handled by a generalization hierarchy

● Isolate the common attributes in a superclass
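
One common way to implement a generalization hierarchy is a supertype table plus one table per subtype, all sharing the same primary key. The sketch below assumes an EMPLOYEE supertype with a PILOT subtype. These names, columns, and rows are hypothetical and may differ from the figure on the next slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Supertype: attributes common to every employee.
    CREATE TABLE EMPLOYEE (
        EMP_NUM  INTEGER PRIMARY KEY,
        EMP_NAME TEXT NOT NULL,
        EMP_TYPE TEXT NOT NULL CHECK (EMP_TYPE IN ('PILOT', 'MECHANIC'))
    );
    -- Subtype: attributes that apply only to pilots, keyed by the same EMP_NUM.
    CREATE TABLE PILOT (
        EMP_NUM     INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM),
        PIL_LICENSE TEXT NOT NULL
    );
    INSERT INTO EMPLOYEE VALUES (1, 'Ada', 'PILOT'), (2, 'Bo', 'MECHANIC');
    INSERT INTO PILOT VALUES (1, 'ATP');
""")

# Joining supertype and subtype reassembles the full pilot record.
pilots = conn.execute("""
    SELECT e.EMP_NAME, p.PIL_LICENSE
    FROM EMPLOYEE AS e
    JOIN PILOT AS p ON p.EMP_NUM = e.EMP_NUM
""").fetchall()
```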

Page 36: LIS 557 Database Design and Management

Generalization Hierarchy

Page 37: LIS 557 Database Design and Management

A Table View

Page 38: LIS 557 Database Design and Management

Developing an E-R Diagram

● It is an iterative process

– A reworking of the conceptual model to elaborate all of the details of the database

– Adding the constraints to reach the internal model

● Thinking about what parts of the internal model are exposed to users (external models)

Page 39: LIS 557 Database Design and Management

The Seven Steps (Again)

1. Interview users and domain experts.

2. Identify the data elements and their relationships.

3. Create a data model.

4. Select the database management software (DBMS).

5. Map data-model elements to tables, and normalize them.

6. Create data type definitions and a database structure.

7. Design the application.