File and Database Design SYS364. Today’s Agenda WHTSA DBMS, RDBMS, SQL A place for everything and...

21
File and Database Design SYS364

Transcript of File and Database Design SYS364. Today’s Agenda WHTSA DBMS, RDBMS, SQL A place for everything and...

File and Database Design

SYS364

Today’s Agenda

WHTSA DBMS, RDBMS, SQL A place for everything and everything in its

place. Entity Relationship Diagrams

to figure it out Normalization

to make it work

Why DB?

Data & Information is lifeblood of most systems, esp. business systems

faster and fewer programs OLTP for accurate and efficient data easy retrieval for EIS, DSS, OLAP and data

warehouse creation easy interface to heterogeneous systems

via SQL, ODBC, JDBC

Why else?

DB analysts make more money than programmers.

Data Terminology and Concepts

It’s a flat (sequential) file if it needs fscanf. Application programs parse the byte stream into data.

DBs use Entities, fields, records, files and keys so programs do not have to know about physical storage of data, only logical

Flat file data

Quote, comma delimited bytes interpreted by program into data222222222,"Evans","Gil",234.56,"CPO","3","arrpiano"333333333,“Young",“Dave",34.56,"CPA","4","bass"111111111,"Ferguson","Maynard",123.45,"CPAC","5","trumpet"444444444,“Bickert",“Ed",456.78,"CPA","6",“guitar"666666666,"Koffman","Moe",678.94,"CPAC","5","flute"555555555,"Clarke","Terry",567.88,"CPA","5","drums"888888888,"Krall","Diana",876.54,"CPAC","5","vocalfem"777777777,"Eikhart","Shirley",789.01,"CPAC","5","composer"999999999,"Peterson","Oscar",-9.87,"CPA","6","piano"

DB file DB knows about the data

StdNo Lname Fname Balance Pgm Semester email111111111 Ferguson Maynard 123.45 CPAC 5 trumpet222222222 Evans Gil 234.56 CPO 3 arrpiano333333333 Young Dave 34.56 CPA 4 bass444444444 Bickert Ed 456.78 CPA 6 guitar555555555 Clarke Terry 567.88 CPA 5 drums666666666 Koffman Moe 678.94 CPAC 5 flute777777777 Eikhart Shirley 789.01 CPAC 5 composer888888888 Krall Diana 876.54 CPAC 5 vocal999999999 Peterson Oscar -9.87 CPA 6 piano

Relational Model Advantages

logical and physical characteristics of the DB are separated. e.g. order of rows and columns is immaterial

much more easily understood by humans powerful operators available, enable

complex operations with brief commands. sound framework for the design of DBs

DBMS library or Relational Database or SQL collection or E-R Diagram:

DBMS: fileRel: relationSQL: tableE-R: Entity

DBMSfield

SQLcolumn

Relationalattribute

E-Rattribute

DBMSrecord

key

SQLrow

key

Relationaltuple

key

E-Rinstance

identifier

Key Fields

May be one or two or more fields (combination / multivalued / composite keys)

Primary keys are unique and minimal Candidate keys (possible Primaries) Secondary keys (alternate key, views) Foreign keys (field(s) in one file is a

primary key in another file - a relationship! Referential Integrity - checks relationship

Data Relationships and Entity-Relationship Diagrams

ERD Graphical model that shows the

relationships among system entitiesEntities are drawn as rectangles

Labeled with singular nounsRelationships are drawn as diamonds

Labeled as active verbs

Types of Relationships

One to One (1:1) One to Many (1:M) Many to Many (M:N)

Cardinality

how many relationships among entities Analysts need to understand cardinality to

design files and databases that reflect accurately all relationships among entities 0, 1 or many relationships between Entities Entity can be mandatory Entity can be optional

Crows Foot Notation

Creating an ERD

Identify the Entities Determine all significant events or

activities for two or more entities Analyze the nature of the interaction Draw the ERD

Try it out yourself

Seneca has courses, sections, professors, and students

Normalization

A process by which you identify and correct inherent problems in record design

Involves three stages First Normal Form Second Normal Form Third Normal Form

First Normal Form

1NF = no repeating attributes (columns) Is this déjà vu all over again?

If so, reorganize the data into separate relations (tables)

You are unique – just like everyone else Establish a minimal & unique primary key Include a key to identify the repeating

columns now in their own table

Second Normal Form

1NF with no partial dependencies, i.e. where the value of a non-key column is dependent on only part of the key (usually on one column of a multi-column key)

1NF record with a primary key that is a single column is automatically in 2NF

See next slide for translation

2NF

Who does this belong to? Each field belongs to the whole primary key If not, split the row into new tables

When does it matter? (not in the textbooks) The Item Master file contains a price field. If only the

current price matters, this is where the price belongs. An Invoice Item file has its own copy of the price field

because the price is time dependent – it belongs with the invoice at the time of sale.

Third Normal Form

2NF with no transitive dependencies, i.e. where a non-key column is dependent on another non-key column

To go from 2NF to 3NF, remove the dependent column

Do we know this already? If have Unit-Price & Qty, do not need Extended-Price If Postal Code, what don’t you need?

Summary

Entity Relationship Diagrams Who is in our model? How do they relate?

Normalization Confirms ERD Organizes data:

Data is stored only once Keys uniquely and minimally identify relations