DESIGNING, BUILDING, AND USING DATABASES

DESIGNING, BUILDING, AND USING DATABASES (BEGINNING MICROSOFT ACCESS, X405.4) Database Program: Microsoft Access Series Instructor: Michael Kremer, Ph.D. Technology & Information Management Section 1

DESIGNING, BUILDING, AND USING DATABASES

(BEGINNING MICROSOFT ACCESS, X405.4)

Database Program: Microsoft Access Series

Instructor: Michael Kremer, Ph.D.

Technology & Information Management

Section 1

WHO AM I?

Michael Kremer

Federal Reserve Bank San Francisco

Previously: Lawrence Berkeley National Laboratory

Database/Application Developer

dBase, Access Developer for over 20 years

Instructor for UC Extension since 1998

DB: Oracle, SQL Server, Access

Prog.: ASP.NET, C#, VB/VBA, Java/JavaScript

Reporting: Cognos, Actuate

WHO ARE YOU?

Name/Company/Organization

What do you do?

Computer Experience (OS, Application SW, Other Classes Taken, etc.)

Database Experience (if any)

Expectations/Goals

Any other information about you such as hobbies, special interests, fun facts, etc.

AGENDA

1. Introduction to Relational Database Systems.

2. Data Normalization and Integrity Rules

1. Introduction to Relational Database Systems

1.1 WHAT IS A DATABASE?

What is a Database?

Collection of organized information to provide efficient retrieval.

Collected information could be in any number of formats (electronic, printed, graphic, audio, statistical, combinations).

Physical (paper/print) and electronic databases.

Surrounded by databases: Dictionary, Card file, Phone book, collection of recipes, TV guide, etc.

In very simple terms, a database is a container storing data.

To be more specific, the data stored in a database is a collection of related data.

Abstraction of real-world, complex sets of data to make data more meaningful and useful to humans.

[Figure: Database Server (OS), Database Engine, Database Management System]

1.1 WHAT IS A DATABASE?

The DB engine's primary purpose is to store and retrieve data.

More specifically:

To query data (optimize queries)

To add data

To maintain/update data

To delete data
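The four engine operations listed above map directly onto SQL statements. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for any database engine; the customer table and its columns are hypothetical and not part of the course material.

```python
import sqlite3

# In-memory database; the "customer" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Add data
conn.execute("INSERT INTO customer (id, name, city) VALUES (1, 'Ann', 'Oakland')")
conn.execute("INSERT INTO customer (id, name, city) VALUES (2, 'Bob', 'Berkeley')")

# Maintain/update data
conn.execute("UPDATE customer SET city = 'San Francisco' WHERE id = 1")

# Delete data
conn.execute("DELETE FROM customer WHERE id = 2")

# Query data
rows = conn.execute("SELECT id, name, city FROM customer ORDER BY id").fetchall()
print(rows)
```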

DBMS performs the following main tasks:

Controlling data access

Enforcing data integrity

Managing concurrency control

Recovering the database after failures and restoring it from backup files

Maintaining database security

A database application consists of a database, GUI, reporting system, navigation, and notification/messaging.

1.1 WHAT IS A DATABASE?

Database (Back-end)

GUI (Front-end)

Common examples of complete DB application systems:

PeopleSoft HR

Oracle Financials

SalesForce CRM

SAP Enterprise Resource Planning

[Figure: database application components: Forms, Reports, GUI, Database Server, Database Management System, Database Engine]

1.2 DATABASE SYSTEMS

Flat File System

Data is stored in one table.

Examples include spreadsheet files, word processing data files, and personal address books.

Hierarchical Database Systems

Example: keeping track of automobile parts.

A car had to be decomposed into hundreds of assemblies (body, engine), sub-assemblies (valves, spark plugs, cylinders), and sub-sub-assemblies (nuts, bolts, washers).

The data was hierarchical in nature.

A parent record (such as a car) can have many child records.

But one child can be related to only one parent.

1.2 DATABASE SYSTEMS

Child records contain an implicit, physical pointer to the parent.

Advantage: Rapid access and updates, since relationships are predefined and implemented through physical pointers.

Disadvantage: No other relationships are allowed, such as one child to another child, even if it makes logical sense.

Network Database Systems

Extension of the hierarchical database that allowed more complex relationships.

In addition to top-down parent-child links, lateral links were also allowed.

A valve (a child record belonging to the parent engine) can now also be related to a supplier record.

[Figure: hierarchy tree. Level 1: Car. Level 2: Body, Engine, Chassis. Level 3: Cylinder, Valves. Level 4: Cyl. Cast, Seals.]

1.2 DATABASE SYSTEMS

Still, all relationships are predefined and “hard-wired” using physical pointers.

Any change to the database structure requires rebuilding the entire database (hierarchical and network).

To query the data, programs have to be written that navigate the tree structure to find the data (hierarchical and network).
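To illustrate what "navigating the tree structure" means, here is a small sketch in Python. It is only an emulation under stated assumptions: a dictionary of parent references stands in for the physical pointers of a real hierarchical system, and the part names and the helper functions path_to_root and children_of are hypothetical.

```python
# Hypothetical parts hierarchy: each child record points to exactly one parent.
parts = {
    "Car": None,
    "Engine": "Car",
    "Body": "Car",
    "Valves": "Engine",
    "Cylinder": "Engine",
}

def path_to_root(part):
    """Follow parent pointers from a part up to the root of the tree."""
    path = [part]
    while parts[path[-1]] is not None:
        path.append(parts[path[-1]])
    return path

def children_of(parent):
    """Find children by scanning for records that point at the given parent."""
    return sorted(name for name, p in parts.items() if p == parent)

print(path_to_root("Valves"))
print(children_of("Engine"))
```

Every new question about the data needs another navigation routine like these, which is the burden the relational model removes.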

Relational Database Systems

Paved the way to a new paradigm of representing data.

The relational model eliminated the explicit parent/child structures and instead represented all data in a database as simple row/column tables of data values.

All data visible to the user is organized strictly as tables of data values, and all database operations work on these tables.

The model specifically rules out any user-invisible structures (pointers).

[Figure: network structure. The Car hierarchy (Body, Engine; Cylinder, Valves; Cyl. Cast, Seals) with an additional Supplier record.]

1.2 DATABASE SYSTEMS

Relationships are visible only through the data values contained in the database tables. Because data is used to link data, all of that data is available and visible to the user. Additionally, you can link the data in any way you want; there are no prescribed relationships.
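A minimal sketch of linking by data values, assuming Python's built-in sqlite3 module; the part and supplier tables, their columns, and the sample rows are hypothetical. The only thing connecting a part to its supplier is the matching supplier_id value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER, supplier_name TEXT);
CREATE TABLE part (part_name TEXT, supplier_id INTEGER);
INSERT INTO supplier VALUES (10, 'Acme Valves');
INSERT INTO part VALUES ('Valve', 10), ('Seal', 10);
""")

# The link is the matching supplier_id value, not a stored pointer.
linked = conn.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_name
""").fetchall()
print(linked)
```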

Object-Oriented Database Systems

More complex data: images, drawings, audio, video.

An object is a logical grouping of related data and program logic representing a real-world thing, such as a customer.

Encapsulation: Fields or variables can only be accessed through methods; they can never be manipulated directly.

[Figure: Customer object. Fields/Variables: CustomerID, CustomerName, Street, City, State, ZipCode. Methods: Add Customer, Set Customer Inactive, Update Address, Print Mailing Label, Check Credit Limit, List Customers, Update Phone Numbers.]
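The idea of encapsulation can be sketched as a Python class. This is an illustration only, not an object-oriented database: the class, its method names, and the five-digit zip code rule are hypothetical, and Python marks fields as internal by naming convention (a leading underscore) rather than enforcing it.

```python
class Customer:
    """Hypothetical customer object: data plus the methods that guard it."""

    def __init__(self, customer_id, name, street, city, state, zip_code):
        # Leading underscores mark fields as internal by Python convention.
        self._customer_id = customer_id
        self._name = name
        self._address = (street, city, state, zip_code)
        self._active = True

    def update_address(self, street, city, state, zip_code):
        """Change the address only through this method, which validates input."""
        if len(zip_code) != 5 or not zip_code.isdigit():
            raise ValueError("zip_code must be five digits")
        self._address = (street, city, state, zip_code)

    def set_inactive(self):
        """Mark the customer as inactive."""
        self._active = False

    def mailing_label(self):
        """Return the mailing label text built from the stored fields."""
        street, city, state, zip_code = self._address
        return f"{self._name}\n{street}\n{city}, {state} {zip_code}"


customer = Customer(1, "Ann Lee", "1 Main St", "Oakland", "CA", "94601")
customer.update_address("2 Oak Ave", "Berkeley", "CA", "94704")
label = customer.mailing_label()
print(label)
```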

1.3 DATABASE FUNCTIONS

Online Transaction Processing (OLTP) Databases

Frequent data changes; data is volatile.

Daily transactions are entered, and data is modified along the way.

Few indexes to allow for fast updates and high throughput.

Online Analytical Processing (OLAP) Databases

Business decisions are made based on an organization’s data set without the need to access the most current data.

A data warehouse is a separate database holding mostly non-volatile data (data that no longer changes).

Tools to analyze data are included in a data warehouse (ad hoc query tools). Two main tools: OLAP and data mining.

1.4 DATABASE ARCHITECTURES

Database architecture mostly affects system performance and scalability (the number of users that can be supported) by distributing the workload onto different servers and separating the application logic:

Presentation logic (presenting and formatting data)

Business logic (business rules)

Database logic (storing data, ensuring data integrity)

One-Tier

All three components are processed in the CPU of one computer.

Can also be set up as a multi-user application where the database back-end file is stored on a file server.

Multiple users share the same data through a file server.

However, all application logic is processed on the desktop computers of the users.

[Figure: Desktop database system (a single work station) and Multi-User Desktop database system (work stations sharing a file server)]

1.4 DATABASE ARCHITECTURES

Two-Tier

Also called client-server.

Database logic is processed in the CPU on a server.

Presentation and business logic are processed in the client’s CPU.

N-Tier

Most n-tier database architectures exist in a three-tier configuration.

The client/server model expands to include a middle-tier (business tier) application server that houses the business logic.

[Figure: Thick Client (Local App): work stations connected to a database server over the network. Thin Client (Browser): work stations connected over HTTP to a web server, which connects to the database server.]

1.4 DATABASE ARCHITECTURES

N-Tier (continued)

The middle tier relieves the client application(s) and database server of some of their processing duties by translating client calls into database queries and translating data from the database into client data in return.

Consequently, the client and server never talk directly to one another.

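[Figure: n-tier configurations. Work stations connect over the network to an application server and on to the database server; browser-based work stations connect over HTTP to a web server, then the application server, then the database server.]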
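The separation of presentation, business, and database logic can be sketched in a single Python file. This is only an illustration under stated assumptions: in a real three-tier system each layer runs on its own machine, an in-memory SQLite database stands in for the database server, and the account table, the function names, and the no-overdraft rule are hypothetical.

```python
import sqlite3

# Database tier: stores data (in-memory SQLite stands in for a database server).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance REAL)")
db.execute("INSERT INTO account VALUES (1, 250.0)")

# Middle tier: business rules; translates client calls into database queries.
def get_balance(account_id):
    row = db.execute(
        "SELECT balance FROM account WHERE account_id = ?", (account_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no account {account_id}")
    return row[0]

def withdraw(account_id, amount):
    # Hypothetical business rule: no overdrafts.
    balance = get_balance(account_id)
    if amount > balance:
        raise ValueError("insufficient funds")
    db.execute(
        "UPDATE account SET balance = balance - ? WHERE account_id = ?",
        (amount, account_id),
    )
    return get_balance(account_id)

# Presentation tier: formats data for display; never issues SQL itself.
def show_balance(account_id):
    return f"Account {account_id}: ${get_balance(account_id):,.2f}"

withdraw(1, 50.0)
print(show_balance(1))
```

The presentation function only calls the middle tier, mirroring the point that the client and the database server never talk to each other directly.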

1.5 WHY USE A RELATIONAL DATABASE?

The main benefit of using a relational database is avoiding data duplication, or redundancy.

In a flat file, complex data is “forced” into a single table to maintain simplicity, at the cost of data issues.

Data Redundancy

This bank’s business is based on the past, where we were all happy having just one account.

Imagine that one customer suddenly wants to put some money away into a savings account.

The entire customer information has to be entered again along with the new account.

One customer is one object in the real world and should be stored only once in a database.

1.5 WHY USE A RELATIONAL DATABASE?

Redundant data can cause other data problems, the so-called data anomalies. These include the insert, update, and delete anomalies.

Data Update Anomaly

A customer calls the bank to report a new address.

The bank service representative pulls up the checking account information, since every customer has at least a checking account.

The bank employee does not ask about other accounts.

Now one customer has morphed into two customers.

Data Insertion Anomaly

The bank is holding promotional events to attract new customers.

It wants to add prospective customers.

But the system allows only customers who have an account to be entered.

1.5 WHY USE A RELATIONAL DATABASE?

Data Deletion Anomaly

What if a customer takes their business elsewhere?

The bank needs to close the account and actually remove the information from the transactional system (the history information could be stored in the data warehouse).

But removing the account information also inadvertently removes the customer information with it.
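The three anomalies can be reproduced with a single flat table. A minimal sketch, assuming Python's built-in sqlite3 module; the account_flat table, its columns, and the sample customers are hypothetical stand-ins for the bank example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: customer data is repeated on every account row.
conn.execute("""
    CREATE TABLE account_flat (
        account_no   INTEGER PRIMARY KEY,
        account_type TEXT NOT NULL,   -- every row must describe an account
        customer     TEXT NOT NULL,
        address      TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO account_flat VALUES (?, ?, ?, ?)",
    [(100, "Checking", "Ann Lee", "1 Main St"),
     (101, "Savings", "Ann Lee", "1 Main St")],
)

# Update anomaly: only the checking row receives the new address.
conn.execute("UPDATE account_flat SET address = '2 Oak Ave' WHERE account_no = 100")
addresses = conn.execute(
    "SELECT DISTINCT address FROM account_flat "
    "WHERE customer = 'Ann Lee' ORDER BY address"
).fetchall()

# Insertion anomaly: a prospect without an account cannot be stored.
try:
    conn.execute(
        "INSERT INTO account_flat (account_type, customer, address) "
        "VALUES (NULL, 'Bob Ray', '9 Elm St')"
    )
    prospect_stored = True
except sqlite3.IntegrityError:
    prospect_stored = False

# Deletion anomaly: closing the accounts also erases the customer.
conn.execute("DELETE FROM account_flat WHERE customer = 'Ann Lee'")
remaining = conn.execute("SELECT COUNT(*) FROM account_flat").fetchone()[0]

print(addresses, prospect_stored, remaining)
```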

Benefits of a Relational Database

Efficient Data Entry

Minimal Structural Maintenance

Accurate Data Analysis

Avoiding Data Anomalies

2. Data Normalization and Integrity

2.1 DATA COLLECTION

Normalization is the process of organizing data in a database.

There are two goals of the normalization process:

Eliminating redundant data (for example, storing the same data in more than one table).

Ensuring data dependencies make sense (only storing related data in a table).

Collection means:

Questionnaires, interviews, observations, and collection of input forms, output reports, procedures, and, if they exist, functional and technical documentation of the current system.

Collection methods:

Talking to users, observing the workflow within an organization or a department, attending business meetings.

Talking to management to determine what resources are available, what the short- and long-term needs are, etc.

Talking to technical staff about database software, network architecture, security, etc.

2.2 INTUITIVE DATABASE DESIGN

After data collection, understanding the data is important:

Data source (where it comes from)

When it can be deleted

How it interacts with other data

Its contribution to the generation of information (Data → Information)

The processes and transactions in which it is used

Organize data to form a preliminary design:

The first step in this process is to identify the main database tables. This step is also called the intuitive database design method.

Review each individual data item and create a subject class for that item; that is, categorize the data into groups or subjects.

These groups or subjects may eventually form a database table, also called an “Entity Class”.

Each individual data item is placed into one of these entities; the items may become the fields of the table, also called “Attributes”.

2.2 INTUITIVE DATABASE DESIGN

Let us take a look at a simple invoice.

By inspecting the individual data items on the invoice, the following main subjects can be derived:

Customer (who orders products)

Products/Services (ordered by the customer)

Company (which sells the products)

The database design process transforms user-perceived objects (like an invoice) into conceptual database objects (like Customer, Products, Company tables) and ultimately into physical database objects.

2.3 THE RELATIONAL MODEL

The relational model defines the way data can be represented (data structure), the way data can be protected (data integrity), and the operations that can be performed on data (data manipulation).

A relation is defined as a table of columns (attributes) and rows (tuples).

The definition specifies what will be contained in each column of the table, but it does not include data. When you include rows of data, you have an instance of a relation.

This definition looks like a flat file or a spreadsheet.

However, a relation has some very specific characteristics that distinguish it from other rectangular ways of looking at data.

Logical Term         Physical Term
Relation             Table
Unique Identifier    Primary Key
Attribute            Column/Field
Tuple                Row

2.3 THE RELATIONAL MODEL

Column Characteristics:

A name that is unique within the table.

The values in a column are drawn from one and only one domain.

Columns are subject to domain constraints. Besides the data type (such as Integer or Date/Time), other domain rules may be created for particular columns.

Row Characteristics:

Only one value at the intersection of a column and a row. A relation does not allow multivalued attributes.

There are no duplicate rows in a relation (uniqueness).

Rows in a relation are unordered.

A primary key is a column or combination of columns that uniquely identifies each row.

2.4 THE NORMALIZATION PROCESS

The theoretical rules that the design of a relation must meet are known as normal forms.

Each normal form represents an increasingly stringent set of rules.

Theoretically, the higher the normal form, the better the design of the relation.

Applying the normal forms to a relation is called data normalization.

2.4 THE NORMALIZATION PROCESS

Each of these normal forms represents a specific rule.

When a table satisfies the rule of a normal form, the table is said to be in that normal form.

At the beginning of the process, the tables are in an unnormalized state.

2.5 FIRST NORMAL FORM (1NF)

1NF: A relation is in first normal form if the domain of each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain.

1NF is a two-part rule: atomic values and single values.

Atomic Value

Domain:

Set of all possible values that an attribute may validly contain.

Data type is a physical concept, whereas a domain is a logical one.

A domain is a subset of all possible values that a specific data type allows.

Example: For a column named Age, the data type is Integer (all whole numbers, negative and positive), but the domain of values is 0 to 120.
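The difference between a data type and a domain can be sketched with a CHECK constraint. A minimal example, assuming Python's built-in sqlite3 module; the person table and the helper try_insert are hypothetical. Microsoft Access expresses the same idea through a field's Validation Rule property.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER is the data type; the CHECK clause narrows it to the domain 0 to 120.
conn.execute("""
    CREATE TABLE person (
        name TEXT,
        age  INTEGER CHECK (age BETWEEN 0 AND 120)
    )
""")

def try_insert(name, age):
    """Return True if the row is accepted, False if the domain rule rejects it."""
    try:
        conn.execute("INSERT INTO person VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert("Ann", 42))   # within the domain
print(try_insert("Bob", -5))   # a valid integer, but outside the domain
```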

Atomic value means that a value cannot be divided any further.

Store data in its smallest logical parts: break the data into small pieces so that each piece still contains logical information.

2.5 FIRST NORMAL FORM (1NF)

Single Value

When several communication numbers are stored in one field, it is very difficult to extract only the phone numbers for all customers, to find out which customers have a fax number, and so on.

One way of solving this problem is to create individual fields for each of the communication numbers, such as Phone, Fax, Pager, etc.

This is also called a horizontal design.

What if a customer now has a cellular phone? → A new field must be added to store the new number. → The design change propagates throughout the entire database.

Instead of having multiple instances of data in one field, we now have repeating groups of data in separate fields.

Data of one domain (phone numbers) should be stored in one field.

2.5 FIRST NORMAL FORM (1NF)

A better way is to design the table vertically (vertical design).

Now we have repeating groups of data vertically, in this case the customer information.

When you encounter repeating groups of data in one entity, move the repeating groups along with the unique identifier into a new table.

The customer table is now in first normal form; that is, it contains no repeating groups horizontally or vertically, nor does it contain multi-valued data.

[Figure: Customer Table and Communication Table]
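A sketch of the resulting two-table design, assuming Python's built-in sqlite3 module. The column names follow the slides where they are given (CustomerNumber, Device, Number); the sample rows and the lowercase table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    CustomerNumber INTEGER PRIMARY KEY,
    FirstName TEXT,
    LastName  TEXT
);
-- One row per number: a new kind of device needs a new row, not a new column.
CREATE TABLE communication (
    CustomerNumber INTEGER,
    Device TEXT,
    Number TEXT
);
INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO communication VALUES
    (1, 'Phone', '510-555-0100'),
    (1, 'Fax',   '510-555-0101'),
    (2, 'Phone', '415-555-0102');
""")

# A customer gets a cellular phone: no design change, just another row.
conn.execute("INSERT INTO communication VALUES (2, 'Cell', '415-555-0103')")

# Which customers have a fax number?
fax_owners = conn.execute("""
    SELECT c.FirstName FROM customer c
    JOIN communication m ON m.CustomerNumber = c.CustomerNumber
    WHERE m.Device = 'Fax'
""").fetchall()
print(fax_owners)
```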

2.6 THE PRIMARY KEY

The primary key uniquely identifies a row. For example, to update one row within a table of over 1 million rows, you retrieve that row by specifying its primary key value as the search value.

As far as a relational database is concerned, you need only three pieces of information to retrieve any specific bit of data:

Name of the table,

Name of the column,

Primary key of the row.

To uniquely identify a record, either one field or a combination of fields must be unique. This set of one or more fields is called the primary key.
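The "three pieces of information" idea can be sketched as a query: table name, column name, and primary key value together pinpoint one data item. This example assumes Python's built-in sqlite3 module; the customer table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (CustomerNumber INTEGER PRIMARY KEY, LastName TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Lee", "Oakland"), (2, "Ray", "Berkeley"), (3, "Lee", "Fremont")],
)

# Table name + column name + primary key value identify exactly one data item.
city = conn.execute(
    "SELECT City FROM customer WHERE CustomerNumber = ?", (3,)
).fetchone()[0]

# The primary key rejects a second row with the same key value.
try:
    conn.execute("INSERT INTO customer VALUES (3, 'Kim', 'Albany')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

# Updating by primary key touches exactly one row.
updated = conn.execute(
    "UPDATE customer SET City = 'Hayward' WHERE CustomerNumber = 3"
).rowcount

print(city, duplicate_accepted, updated)
```

Note that LastName alone could not serve as the key here, because two rows share the value 'Lee'.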

2.6 THE PRIMARY KEY

If too many fields would make up the primary key, choose an AutoNumber field instead.

Rules for the primary key when choosing existing fields:

A primary key should be a value that is never null (for a composite key, no field may be null). The data also needs to be available at the time the record is entered in the database, not at a later time.

A primary key should never change, ever.

The primary key value should be of uniform length. For example, the social security number is a nine-digit number for everybody, not five digits for some and seven digits for others.

The primary key should be minimal, or irreducible.

2.6 THE PRIMARY KEY

Another important concept in relational databases is the functional dependency.

That is, a specific value for one field determines the value or values for one or more other fields in the same table.

Functional dependency: For a specific value of the primary key, say X, the value of another field, say Y, is determined. This is written X → Y, which reads “X functionally determines Y”.

[Figure: Social Security Number determines Name, City, and Address]
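A functional dependency X → Y can be tested against sample rows: it is violated as soon as one X value appears with two different Y values. The sketch below is plain Python; the function name functionally_determines and the sample data are hypothetical. A sample can only refute a dependency; whether it truly holds is a statement about the business rules, not about the rows that happen to be stored.

```python
def functionally_determines(rows, x, y):
    """Return True if, in these rows, each value of columns x maps to one value of columns y."""
    seen = {}
    for row in rows:
        x_val = tuple(row[c] for c in x)
        y_val = tuple(row[c] for c in y)
        # setdefault stores the first y seen for this x; a later mismatch is a violation.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

# Hypothetical sample rows.
people = [
    {"ssn": "111", "name": "Ann", "city": "Oakland"},
    {"ssn": "222", "name": "Bob", "city": "Oakland"},
    {"ssn": "111", "name": "Ann", "city": "Oakland"},
]

print(functionally_determines(people, ["ssn"], ["name", "city"]))
print(functionally_determines(people, ["city"], ["name"]))
```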

2.7 SECOND NORMAL FORM (2NF)

A relation is in 2NF if and only if it is in 1NF and no non-prime attribute (not part of the primary key) is dependent on any proper subset of any candidate key of the table.

The second normal form goes back to the issue that all fields in a table should relate to the subject of the table. Here it is formulated in a stricter way.

It applies only to tables that have a composite primary key.

All non-primary-key (non-prime) fields must depend on the entire primary key, not on just part of it.

Invoice example, communication table: What is the primary key?

Choosing a primary key is also a business decision, as it imposes constraints on the data.

If we select CustomerNumber and DeviceID as the primary key, what are the consequences of this choice?

2.7 SECOND NORMAL FORM (2NF)

Now we apply the second normal form and test the functional dependencies of the non-key fields (here Number and Device) on the composite primary key (here CustomerNumber and DeviceID):

Does the Number field depend on the full combination of CustomerNumber and DeviceID? Yes, it does, because:

For the same customer (same CustomerNumber), there are different Number values for different DeviceID values.

For the same DeviceID, there are different Number values for different customers.

Does the Device field depend on the full combination of CustomerNumber and DeviceID? No, it does not, because:

For the same customer (same CustomerNumber), there are different Device values for different DeviceID values.

But for the same DeviceID, the Device value is the same for every customer.

[Figure: communication table with Primary Key Fields (CustomerNumber, DeviceID) and Non-Primary Key Fields (Number, Device)]

2.7 SECOND NORMAL FORM (2NF)

Here we see that the Device field depends only on the DeviceID part of the primary key, not on the CustomerNumber.

This partial dependency violates the second normal form.

Consequently, the Device field (along with a copy of DeviceID) must be moved to a new table.
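The move to a new table can be sketched as follows, assuming Python's built-in sqlite3 module. The field names follow the slides (CustomerNumber, DeviceID, Device, Number); the table name CommNumbersFlat for the pre-2NF table and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Before: Device depends only on DeviceID, which is just part of the key.
CREATE TABLE CommNumbersFlat (
    CustomerNumber INTEGER,
    DeviceID INTEGER,
    Device TEXT,
    Number TEXT,
    PRIMARY KEY (CustomerNumber, DeviceID)
);
INSERT INTO CommNumbersFlat VALUES
    (1, 1, 'Phone', '510-555-0100'),
    (1, 2, 'Fax',   '510-555-0101'),
    (2, 1, 'Phone', '415-555-0102');

-- After: the partially dependent field moves to its own table.
CREATE TABLE Device (DeviceID INTEGER PRIMARY KEY, Device TEXT);
CREATE TABLE CommNumbers (
    CustomerNumber INTEGER,
    DeviceID INTEGER,
    Number TEXT,
    PRIMARY KEY (CustomerNumber, DeviceID)
);
INSERT INTO Device SELECT DISTINCT DeviceID, Device FROM CommNumbersFlat;
INSERT INTO CommNumbers SELECT CustomerNumber, DeviceID, Number FROM CommNumbersFlat;
""")

# Joining the two tables reproduces the original rows, so nothing was lost.
rejoined = conn.execute("""
    SELECT n.CustomerNumber, n.DeviceID, d.Device, n.Number
    FROM CommNumbers n JOIN Device d ON d.DeviceID = n.DeviceID
    ORDER BY n.CustomerNumber, n.DeviceID
""").fetchall()
original = conn.execute(
    "SELECT * FROM CommNumbersFlat ORDER BY CustomerNumber, DeviceID"
).fetchall()
device_rows = conn.execute("SELECT COUNT(*) FROM Device").fetchone()[0]

print(rejoined == original, device_rows)
```

Each device name is now stored once, no matter how many customers use that kind of device.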

2.8 THIRD NORMAL FORM (3NF)

A relation is in 3NF if and only if it is in second normal form (2NF) and every non-prime attribute is non-transitively dependent (i.e., directly dependent) on the primary key.

“Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key.” So help me Codd.

2.8 THIRD NORMAL FORM (3NF)

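There is a functional dependency between StateCode and State, two non-key fields of the customer table, so State depends on the primary key only transitively.

Move the State field along with a copy of the StateCode field (to link it back to the customer table) into a new table.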
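A sketch of the table layout after this step, assuming Python's built-in sqlite3 module. The field names StateCode and State come from the slides; the remaining columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE State (StateCode TEXT PRIMARY KEY, State TEXT);
CREATE TABLE Customer (
    CustomerNumber INTEGER PRIMARY KEY,
    LastName TEXT,
    City TEXT,
    StateCode TEXT   -- copy of the key that links back to the State table
);
INSERT INTO State VALUES ('CA', 'California'), ('NV', 'Nevada');
INSERT INTO Customer VALUES
    (1, 'Lee', 'Oakland', 'CA'),
    (2, 'Ray', 'Berkeley', 'CA'),
    (3, 'Kim', 'Reno', 'NV');
""")

# The state name is stored once and looked up through StateCode.
names = conn.execute("""
    SELECT c.LastName, s.State
    FROM Customer c JOIN State s ON s.StateCode = c.StateCode
    ORDER BY c.CustomerNumber
""").fetchall()
print(names)
```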

2.9 DETERMINING RELATIONSHIPS

The normalization process created many tables; we need to link them back together in order to see related information.

When determining relationships between tables, additional tables may need to be created.

Tables engaged in a relationship are called participants of a relationship. Depending on the number of participants, the so-called degree of a relationship, there are unary (one table), binary (two tables), and ternary (three tables) relationships.

Relationships are also classified by how many records in one table are associated with how many records in the other table. This is called cardinality, and there are three main categories:

One-To-Many

One-To-One

Many-To-Many

2.9 DETERMINING RELATIONSHIPS

One-to-Many

One record in table A can have zero, one, or more matching records in table B, but one record in table B has at most one matching record in table A.

Test the cardinalities to identify the type of relationship.

Customer Table: One customer may have zero (at least) to many (at most) communication numbers.

Communication Table: One communication number is related to at least one and at most one customer.

The optionality is mostly found on the many side, but it can also be on the one side.

[Figure: relationship diagram annotated with Minimum Cardinality and Maximum Cardinality]

Customer: CustomerNumber, FirstName, LastName, City, StateCode

CommNumbers: CustomerNumber (FK, IE1), DeviceID, Number

2.9 DETERMINING RELATIONSHIPS

To implement a relationship in the database, the primary key field of the one side is also placed in the many-side table. In the many-side table, it is called the foreign key. The primary key and foreign key are used to link matching records together.

The relationships are as follows:

One customer may have zero, one, or many communication numbers. One communication record belongs to one and only one customer.

One customer lives in one and only one state, and one state belongs to zero, one, or many customers.

One device belongs to zero, one, or many communication numbers, and one communication number belongs to one and only one device.

In order to determine a One-To-Many relationship correctly, one must ask the relationship question both ways (round trip).

Device: DeviceID (PK), Device

State: StateCode (PK), State

Customer: CustomerNumber (PK), FirstName, LastName, City, StateCode (FK)

CommNumbers: CustomerNumber (FK), DeviceID (FK), Number
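How a foreign key links the many side to the one side, and rejects rows with no matching parent, can be sketched as follows. This assumes Python's built-in sqlite3 module, where foreign key checking must be switched on per connection; the table and field names follow the slides, while the helper add_number and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer (CustomerNumber INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE CommNumbers (
    CustomerNumber INTEGER NOT NULL REFERENCES Customer (CustomerNumber),
    DeviceID INTEGER NOT NULL,
    Number TEXT,
    PRIMARY KEY (CustomerNumber, DeviceID)
);
INSERT INTO Customer VALUES (1, 'Lee');
INSERT INTO CommNumbers VALUES (1, 1, '510-555-0100');
""")

def add_number(customer_number, device_id, number):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO CommNumbers VALUES (?, ?, ?)",
            (customer_number, device_id, number),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(add_number(1, 2, "510-555-0101"))   # customer 1 exists
print(add_number(99, 1, "415-555-0102"))  # no customer 99: rejected
```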

2.9 DETERMINING RELATIONSHIPS

Many-to-Many

A record in table A can have more than one matching record in table B, and a record in table B can have more than one matching record in table A.

The database design must be amended and additional tables created.

Invoice example: Order and Product

Order Table: One order contains one or many products.

Product Table: One product is contained on zero, one, or many orders.

This type of relationship cannot be modeled directly in a relational database using only the primary and foreign keys of the two tables.

Order: OrderID, CustomerNumber, OrderDate, ShipMethod

Product: ProductID, ProductName, Price

2.9 DETERMINING RELATIONSHIPS

First attempt: place the primary key of one table as a foreign key in the other table.

Order Table: Place ProductID as a foreign key. Result: order redundancy.

Product Table: Place OrderID as a foreign key. Result: product redundancy.

By placing the primary key from one table into the other table as a foreign key, these tables become denormalized.

Instead, OrderID (or ProductID) must be moved to a different table.

To link the new table with the Product (Order) table, the ProductID (OrderID) must also be placed into the new table as a foreign key.

This new table is called an intersection or junction table.

One Many-To-Many relationship is broken down into two One-To-Many relationships.

2.9 DETERMINING RELATIONSHIPS

Both primary keys serve individually as foreign keys in the junction table, but together they serve as its primary key.

The combination of OrderID and ProductID is always unique:

On one order, the same product is not ordered twice.

One product is not contained more than once on an order.

If the same customer decides to order the same product again, even on the same day, a new OrderID is assigned.

The intersection table does not represent a subject, such as Customer, Order, or Product. It models a many-to-many relationship by means of a table, the only object relational databases have.
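A junction table can be sketched as follows, assuming Python's built-in sqlite3 module. The junction table name OrderProduct and the sample rows are hypothetical, and the order table is named Orders here only because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT, Price REAL);
-- Junction table: each column is a foreign key; together they are the primary key.
CREATE TABLE OrderProduct (
    OrderID   INTEGER NOT NULL REFERENCES Orders (OrderID),
    ProductID INTEGER NOT NULL REFERENCES Product (ProductID),
    PRIMARY KEY (OrderID, ProductID)
);
INSERT INTO Orders VALUES (1, '2024-01-05'), (2, '2024-01-06');
INSERT INTO Product VALUES (10, 'Widget', 2.50), (11, 'Gadget', 4.00);
INSERT INTO OrderProduct VALUES (1, 10), (1, 11), (2, 10);
""")

# Many products per order ...
products_on_order_1 = [r[0] for r in conn.execute("""
    SELECT p.ProductName FROM OrderProduct op
    JOIN Product p ON p.ProductID = op.ProductID
    WHERE op.OrderID = 1 ORDER BY p.ProductName
""")]

# ... and many orders per product.
orders_with_widget = [r[0] for r in conn.execute(
    "SELECT OrderID FROM OrderProduct WHERE ProductID = 10 ORDER BY OrderID"
)]

# The composite primary key rejects the same product twice on one order.
try:
    conn.execute("INSERT INTO OrderProduct VALUES (1, 10)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(products_on_order_1, orders_with_widget, duplicate_accepted)
```

In practice a junction table often carries its own fields as well, such as a quantity; that is left out here to keep the sketch close to the slides.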

2.9 DETERMINING RELATIONSHIPS

One-to-One

One record in table A can have no more than one matching record in table B, and one record in table B can have no more than one matching record in table A.

A One-To-One relationship is fairly unusual in a relational database; however, there are valid reasons for it in specific cases.

When you encounter a One-To-One relationship, why not combine both tables into one table?

There are some reasons why not:

Limitations on the number of fields per table.

One or more fields are not populated for most records in a table.

Table subclassing: For certain tables, groups of records have different properties and therefore need different fields than other groups of records in the same table.

2.9 DETERMINING RELATIONSHIPS

Below is a summary of the database design process:

1. The information collection process.

2. The intuitive database design.

3. First Normal Form (Elimination of Repeating Groups).

4. Primary Keys (Entity Integrity).

5. Second Normal Form (Elimination of Partial Dependencies).

6. Third Normal Form (Elimination of Transitive Dependencies).

7. Create relationships and thereby identify foreign keys (Referential Integrity).
