File and Database Design SYS364. Today’s Agenda WHTSA DBMS, RDBMS, SQL A place for everything and...
-
Upload
melina-perry -
Category
Documents
-
view
221 -
download
1
Transcript of File and Database Design SYS364. Today’s Agenda WHTSA DBMS, RDBMS, SQL A place for everything and...
Today’s Agenda
WHTSA DBMS, RDBMS, SQL A place for everything and everything in its
place. Entity Relationship Diagrams
to figure it out Normalization
to make it work
Why DB?
Data & Information is lifeblood of most systems, esp. business systems
faster and fewer programs OLTP for accurate and efficient data easy retrieval for EIS, DSS, OLAP and data
warehouse creation easy interface to heterogeneous systems
via SQL, ODBC, JDBC
Data Terminology and Concepts
It’s a flat (sequential) file if it needs fscanf. Application programs parse the byte stream into data.
DBs use Entities, fields, records, files and keys so programs do not have to know about physical storage of data, only logical
Flat file data
Quote, comma delimited bytes interpreted by program into data222222222,"Evans","Gil",234.56,"CPO","3","arrpiano"333333333,“Young",“Dave",34.56,"CPA","4","bass"111111111,"Ferguson","Maynard",123.45,"CPAC","5","trumpet"444444444,“Bickert",“Ed",456.78,"CPA","6",“guitar"666666666,"Koffman","Moe",678.94,"CPAC","5","flute"555555555,"Clarke","Terry",567.88,"CPA","5","drums"888888888,"Krall","Diana",876.54,"CPAC","5","vocalfem"777777777,"Eikhart","Shirley",789.01,"CPAC","5","composer"999999999,"Peterson","Oscar",-9.87,"CPA","6","piano"
DB file DB knows about the data
StdNo Lname Fname Balance Pgm Semester email111111111 Ferguson Maynard 123.45 CPAC 5 trumpet222222222 Evans Gil 234.56 CPO 3 arrpiano333333333 Young Dave 34.56 CPA 4 bass444444444 Bickert Ed 456.78 CPA 6 guitar555555555 Clarke Terry 567.88 CPA 5 drums666666666 Koffman Moe 678.94 CPAC 5 flute777777777 Eikhart Shirley 789.01 CPAC 5 composer888888888 Krall Diana 876.54 CPAC 5 vocal999999999 Peterson Oscar -9.87 CPA 6 piano
Relational Model Advantages
logical and physical characteristics of the DB are separated. e.g. order of rows and columns is immaterial
much more easily understood by humans powerful operators available, enable
complex operations with brief commands. sound framework for the design of DBs
DBMS library or Relational Database or SQL collection or E-R Diagram:
DBMS: fileRel: relationSQL: tableE-R: Entity
DBMSfield
SQLcolumn
Relationalattribute
E-Rattribute
DBMSrecord
key
SQLrow
key
Relationaltuple
key
E-Rinstance
identifier
Key Fields
May be one or two or more fields (combination / multivalued / composite keys)
Primary keys are unique and minimal Candidate keys (possible Primaries) Secondary keys (alternate key, views) Foreign keys (field(s) in one file is a
primary key in another file - a relationship! Referential Integrity - checks relationship
Data Relationships and Entity-Relationship Diagrams
ERD Graphical model that shows the
relationships among system entitiesEntities are drawn as rectangles
Labeled with singular nounsRelationships are drawn as diamonds
Labeled as active verbs
Cardinality
how many relationships among entities Analysts need to understand cardinality to
design files and databases that reflect accurately all relationships among entities 0, 1 or many relationships between Entities Entity can be mandatory Entity can be optional
Crows Foot Notation
Creating an ERD
Identify the Entities Determine all significant events or
activities for two or more entities Analyze the nature of the interaction Draw the ERD
Normalization
A process by which you identify and correct inherent problems in record design
Involves three stages First Normal Form Second Normal Form Third Normal Form
First Normal Form
1NF = no repeating attributes (columns) Is this déjà vu all over again?
If so, reorganize the data into separate relations (tables)
You are unique – just like everyone else Establish a minimal & unique primary key Include a key to identify the repeating
columns now in their own table
Second Normal Form
1NF with no partial dependencies, i.e. where the value of a non-key column is dependent on only part of the key (usually on one column of a multi-column key)
1NF record with a primary key that is a single column is automatically in 2NF
See next slide for translation
2NF
Who does this belong to? Each field belongs to the whole primary key If not, split the row into new tables
When does it matter? (not in the textbooks) The Item Master file contains a price field. If only the
current price matters, this is where the price belongs. An Invoice Item file has its own copy of the price field
because the price is time dependent – it belongs with the invoice at the time of sale.
Third Normal Form
2NF with no transitive dependencies, i.e. where a non-key column is dependent on another non-key column
To go from 2NF to 3NF, remove the dependent column
Do we know this already? If have Unit-Price & Qty, do not need Extended-Price If Postal Code, what don’t you need?