Additional Database Notes


8/3/2019 Additional Database Notes

http://slidepdf.com/reader/full/additional-database-notes 1/44

Database Basics

Objectives of a Database: Its Advantages and Application in Corporate Settings

Shareability: The ability to share data resources is a fundamental objective of database management. This means different people and different processes can use the same actual data at the same time.

Serving different types of users with varying skill levels

Handling different user views of the same stored data

Combining interrelated data

Controlling concurrent updates so as to maintain data integrity


Database Basics

Evolvability

Evolvability refers to the ability of the DBMS to change in response to growing user needs and advancing technology.

Evolvability is the system characteristic that enhances future availability of the data resources.

Evolvability is not the same as expandability or extensibility, which imply extending or adding to the system, which then grows ever larger.

Evolvability covers expansion or contraction, both of which may occur as the system changes to fit the ever-changing needs and desires of the using environment.


Database Basics

Integrity

The importance and pervasiveness of the need to maintain database integrity is rooted in the reality that man is imperfect. Destruction, errors, and improper disclosure must be anticipated, and explicit mechanisms must be provided for handling them. The three primary facets of database integrity are:

Protecting the existence of the database

Maintaining the quality of the database

Ensuring the privacy of the database


DBMS

Data Dictionary & Metadata

A database contains information about entities of interest to users in an organization.

When created, the database itself becomes an "entity" about which information must be kept for various data administration purposes.

A data dictionary (DD), or system catalog, is a database about the database.

The contents of a DD are commonly referred to as metadata.

A DD can be updated and queried much like a "regular" database.

The DBMS often maintains the DD.
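As a rough illustration of querying a data dictionary like a regular database, the sketch below uses SQLite from Python. SQLite is an assumption here, since the slides do not name a DBMS at this point, and the `patient` table with its columns is hypothetical. SQLite exposes its catalog as the `sqlite_master` table and reports column metadata through `PRAGMA table_info`.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical user table; creating it also creates catalog entries.
conn.execute(
    "CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, lname TEXT NOT NULL, dob TEXT)"
)

# The catalog is queried with an ordinary SELECT, like any other table.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(patient)")]

print(tables)
print(columns)
```

The metadata returned here (table names, column names, declared types) is maintained by the DBMS itself, not by the application.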


DBMS

Benefits of Data Dictionary

Benefits include:

improved documentation and control

consistency in data use

easier data analysis

reduced data redundancy

simpler programming

the enforcement of standards

better means of estimating the effect of change.


DBMS

Metadata

Metadata: data that describes the properties and context of user data, including data types, field sizes, allowable values, and data context.

- Separate from that data;

- Stored as part of the database.


Data Independence

Given the three-schema architecture, the term data independence can be explained as follows: each higher level of the data architecture is immune to changes at the next lower level of the architecture.

Physical Independence: The logical schema may stay unchanged even though the storage space or type of some data is changed for reasons of optimisation or reorganisation.

Logical Independence: The external schema may also stay unchanged for most changes to the logical schema. This is especially desirable because the application software then does not need to be modified or recompiled.
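One way to picture logical independence is a view acting as the external schema. The sketch below is only an illustration under assumptions: it uses SQLite from Python, and the tables `person`, `person_core`, `person_detail` and the view `person_names` are hypothetical names. The application function `read_names` is written once against the view and keeps working after the underlying tables are restructured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original logical schema plus a view that serves as the external schema.
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
INSERT INTO person VALUES (1, 'Ada', 'Lovelace');
CREATE VIEW person_names AS
    SELECT id, fname || ' ' || lname AS full_name FROM person;
""")

def read_names(connection):
    """Application code: it only knows about the view, never the base tables."""
    return connection.execute(
        "SELECT id, full_name FROM person_names ORDER BY id"
    ).fetchall()

before = read_names(conn)

# Logical schema change: split the table in two and redefine the view.
conn.executescript("""
DROP VIEW person_names;
CREATE TABLE person_core (id INTEGER PRIMARY KEY, lname TEXT);
CREATE TABLE person_detail (id INTEGER PRIMARY KEY, fname TEXT);
INSERT INTO person_core SELECT id, lname FROM person;
INSERT INTO person_detail SELECT id, fname FROM person;
DROP TABLE person;
CREATE VIEW person_names AS
    SELECT c.id, d.fname || ' ' || c.lname AS full_name
    FROM person_core AS c JOIN person_detail AS d ON d.id = c.id;
""")

# The unchanged application function returns the same rows as before.
after = read_names(conn)
print(before, after)
```

Only the view definition had to change; `read_names` was neither modified nor re-deployed.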


Distributed Database

Types of Distributed Database System

Distributed database systems are of the following types:

Homogeneous Distributed Database System

Heterogeneous Distributed Database System


Distributed Database

Homogeneous Distributed Database System

All sites have identical software.

They are aware of each other and agree to cooperate in processing user requests.

It appears to the user as a single system.


A Homogeneous Distributed Database System Example

A distributed system connects three databases: hq, mfg, and sales.

An application can simultaneously access or modify the data in several databases in a single distributed environment.


Distributed Database

Heterogeneous Distributed Database System

In a heterogeneous distributed database system, at least one of the databases uses different schemas and software.

A database system having a different schema may cause a major problem for query processing.

A database system having different software may cause a major problem for transaction processing.


Distributed Database

Features of Distributed Database System

Replication

- The system maintains multiple copies of data, stored in different sites, for faster retrieval and fault tolerance.

Fragmentation

- A relation is partitioned into several fragments stored in distinct sites.

Replication and fragmentation can be combined

- A relation is partitioned into several fragments; the system maintains several identical replicas of each such fragment.
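The combination of fragmentation and replication can be sketched in a few lines of Python. Everything in this toy model is an assumption made for illustration: the site names, the modulo rule that assigns a row to a fragment, and the round-robin placement of two replicas per fragment. Sites are plain dictionaries, not real database servers.

```python
from collections import defaultdict

# Hypothetical configuration: three sites, three fragments, two replicas each.
SITES = ["site_a", "site_b", "site_c"]
NUM_FRAGMENTS = 3
REPLICAS_PER_FRAGMENT = 2

# site name -> {key: row}; stands in for the local storage of each site.
storage = defaultdict(dict)

def fragment_of(key):
    """Horizontal fragmentation: assign each row to a fragment by its key."""
    return key % NUM_FRAGMENTS

def sites_for(fragment):
    """Replication: each fragment is stored on consecutive sites."""
    return [SITES[(fragment + i) % len(SITES)] for i in range(REPLICAS_PER_FRAGMENT)]

def insert(key, row):
    """Write the row to every site that holds a replica of its fragment."""
    for site in sites_for(fragment_of(key)):
        storage[site][key] = row

def read(key, failed_sites=frozenset()):
    """Read from the first replica whose site is still available."""
    for site in sites_for(fragment_of(key)):
        if site not in failed_sites:
            return storage[site][key]
    raise LookupError(f"no live replica holds key {key}")

for patient_id in range(6):
    insert(patient_id, {"patient_id": patient_id})

# Key 4 lives in fragment 1 (site_b and site_c); site_b is simulated as down.
row = read(4, failed_sites={"site_b"})
print(row)
```

The read still succeeds with one site down, which is the fault-tolerance benefit of keeping replicas; the price is that `insert` has to write every replica.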


Distributed Database

Advantages of Distributed Database System

Availability: failure of a site containing relation r does not result in unavailability of r if replicas exist.

Parallelism: queries on r may be processed by several nodes in parallel.

Reduced data transfer: relation r is available locally at each site containing a replica of r.


Distributed Database

Disadvantages of Distributed Database System

Increased cost of updates: each replica of relation r must be updated.

Increased complexity of concurrency control: concurrent updates to distinct replicas may lead to inconsistent data unless special concurrency control mechanisms are implemented.

One solution: choose one copy as the primary copy and apply concurrency control operations on the primary copy.
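The primary-copy idea can be sketched as follows. This is a simplified single-process model, not a real distributed protocol: the class name `PrimaryCopyRelation` is hypothetical, replicas are plain dictionaries, a `threading.Lock` stands in for the concurrency control applied at the primary, and replicas are assumed to be updated synchronously.

```python
import threading

class PrimaryCopyRelation:
    """Toy model: one primary copy guards all updates, replicas follow it."""

    def __init__(self, replica_count):
        self._lock = threading.Lock()   # concurrency control lives at the primary only
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]

    def update(self, key, fn):
        """Apply fn to the current value under the primary's lock, then propagate."""
        with self._lock:
            new_value = fn(self.primary.get(key, 0))
            self.primary[key] = new_value
            # Increased cost of updates: every replica must be written as well.
            for replica in self.replicas:
                replica[key] = new_value
            return new_value

rel = PrimaryCopyRelation(replica_count=2)

# Fifty concurrent increments, all serialized through the primary copy.
threads = [
    threading.Thread(target=rel.update, args=("stock", lambda v: v + 1))
    for _ in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(rel.primary["stock"], [r["stock"] for r in rel.replicas])
```

Because every update passes through one lock, no increment is lost and the replicas never diverge; the trade-off is that the primary becomes a bottleneck and a single point of failure.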


Data Warehousing & Data Mining

Benefits of Data Mining

Data mining in customer relationship management applications can contribute significantly to the bottom line.

Rather than randomly contacting a prospect or customer through a call center or sending mail, a company can concentrate its efforts on prospects that are predicted to have a high likelihood of responding to an offer.

More sophisticated methods may be used to optimize resources across campaigns so that one may predict to which channel and to which offer an individual is most likely to respond, across all potential offers.

Businesses employing data mining may see a return on investment.

Data mining can also be helpful to human-resources departments in identifying the characteristics of their most successful employees.


Why choose MS-Access over SPSS / Excel?

Although there is always overlap, the following rules might help when deciding when / when not to use MS Access:

MS Access is best used for long-term data storage and/or data sharing.

MS Excel is best used for minor data collection, manipulation, and especially visualization.

SPSS is best used for minor data collection and especially data analysis.

It is easy to export data from MS Access to Excel or SPSS.


What is in an MS-Access file - 1?

Although the term "database" typically refers to a collection of related data tables, an Access database includes more than just data. In addition to tables, you can add:

Saved queries (stored procedures) - organizing and/or manipulating data

Forms - GUI interaction with data, event programming

Reports - customized results for printing (~ static forms)

Macros and VB programs for extending functionality

Microsoft provides some logical integration of these tools through "wizards". However, these are pretty basic - most developers must pick and choose the best approach when implementing applications.


What is in an MS-Access file - 2?

Unless advanced techniques are employed, all entities are stored in one *.mdb file. When running, a locking file (*.ldb) is also visible.

Only the mdb file needs to be copied to transfer the database to another computer or location.

Ex. MSCI_ByrneGuestLecture.mdb


What is in an MS-Access file - 3?

Tables: Demographics, Ethnicity, Labs, H & P

Queries

Forms (Active), Reports (Static)

VB + Macros - Event Driven Automation, etc.


Microsoft Access Module 1

Summary

MS-Access is a powerful relational database program. It has many integrated features and can be greatly customized to fit most personal/departmental needs for data collection and storage.


Microsoft Access Module 2

Creating / Working with Tables


Tables

Glucose Measurement Database

We wish to construct a database to track waking glucose measurements for an indefinite amount of time on 100 patients receiving 3 possible drug combinations.

Why would this be difficult in MS-Excel or SPSS?


Tables Overview

Think of Access as a collection of spreadsheets that are relationally linked.

STORE DATA ONE TIME / ONE PLACE

DO NOT STORE CALCULATED DATA

Demographics: Patient_ID, Fname, Lname, Address, Phone, Gender, Race, DOB, Height

Glucose: Glucose_ID, Patient_ID, Date, Weight, Med_ID, Glucose

Meds: Med_ID, DrugCombonation
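To make the three linked tables concrete, here is a minimal sketch of the same design in SQLite from Python. The slides build this in MS Access, so the use of SQLite, the column types, the foreign-key declarations, and all sample values are assumptions. Field names are spelled as on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Demographics (
    Patient_ID INTEGER PRIMARY KEY,
    Fname TEXT, Lname TEXT, Address TEXT, Phone TEXT,
    Gender TEXT, Race TEXT, DOB TEXT, Height REAL
);
CREATE TABLE Meds (
    Med_ID INTEGER PRIMARY KEY,
    DrugCombonation TEXT          -- spelled as on the slide
);
CREATE TABLE Glucose (
    Glucose_ID INTEGER PRIMARY KEY,
    Patient_ID INTEGER NOT NULL REFERENCES Demographics(Patient_ID),
    Date TEXT,
    Weight REAL,
    Med_ID INTEGER REFERENCES Meds(Med_ID),
    Glucose REAL
);
""")

# Hypothetical sample data: the patient is stored once, readings are stored many times.
conn.execute("INSERT INTO Demographics (Patient_ID, Fname, Lname, Height) "
             "VALUES (1, 'Pat', 'Example', 170)")
conn.execute("INSERT INTO Meds VALUES (1, 'Combination A')")
conn.executemany(
    "INSERT INTO Glucose (Patient_ID, Date, Weight, Med_ID, Glucose) VALUES (?, ?, ?, ?, ?)",
    [(1, "2019-08-01", 65.0, 1, 95.0), (1, "2019-08-02", 65.2, 1, 102.0)],
)
readings = conn.execute(
    "SELECT COUNT(*) FROM Glucose WHERE Patient_ID = 1"
).fetchone()[0]

# A reading for a patient that does not exist is refused by the relationship.
try:
    conn.execute("INSERT INTO Glucose (Patient_ID, Date, Weight, Med_ID, Glucose) "
                 "VALUES (999, '2019-08-03', 70.0, 1, 110.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(readings, rejected)
```

Each new measurement adds one row to Glucose without repeating the patient's demographics, which is the "one time / one place" rule in practice and is hard to maintain in a single flat spreadsheet.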


Table Demonstration - Live

General Setup for Tables

Describe General Options

Show Validation Rule

Relationships

Lookup Option


Table Relationships - Live

Table Relationships

Describe Cascade Features
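The cascade behaviour demonstrated live in Access can be approximated in SQLite as a rough analogue. The sketch below assumes the Demographics and Glucose tables from the glucose example, reduced to a few columns, with hypothetical sample rows; `ON UPDATE CASCADE` and `ON DELETE CASCADE` play the role of the cascade update and cascade delete options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for cascades to fire in SQLite

conn.executescript("""
CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Lname TEXT);
CREATE TABLE Glucose (
    Glucose_ID INTEGER PRIMARY KEY,
    Patient_ID INTEGER REFERENCES Demographics(Patient_ID)
        ON UPDATE CASCADE ON DELETE CASCADE,
    Glucose REAL
);
INSERT INTO Demographics VALUES (1, 'Example'), (2, 'Sample');
INSERT INTO Glucose (Patient_ID, Glucose) VALUES (1, 95), (1, 102), (2, 88);
""")

# Cascade update: renumbering a patient renumbers that patient's readings too.
conn.execute("UPDATE Demographics SET Patient_ID = 10 WHERE Patient_ID = 1")
moved = conn.execute(
    "SELECT COUNT(*) FROM Glucose WHERE Patient_ID = 10"
).fetchone()[0]

# Cascade delete: removing a patient removes that patient's readings.
conn.execute("DELETE FROM Demographics WHERE Patient_ID = 2")
remaining = conn.execute("SELECT COUNT(*) FROM Glucose").fetchone()[0]

print(moved, remaining)
```

Cascade delete is convenient but destructive: one deleted patient row silently removes every dependent measurement.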


Table Import / Link - Live

Importing a Table makes a copy of existing data.

Linking a Table lets you control existing data through Access (Exercise Caution!).

Note that you may import non-Access files.


MS Access Module 2 Summary

Data storage principles

1. Attempt to store data 1 time / 1 place;

2. Do not store data that may be calculated from other fields (utilize queries); and

3. Strive for very discrete data storage (no ambiguity - garbage in / garbage out).

4. Choose a real or arbitrary (autonumber) unique identifier for each record.

Relationships

Use table relationships to automatically cascade delete and update records.

Other Data Sources

Import = Copy; Link = Live Connect.


Microsoft Access Module 3

Creating / Working with Queries


One Table Query Example - Live

Right-Click + Add to add table(s)

Drag and Drop Fields

Custom sort by one or more fields.

Use this button to toggle between design, sheet and SQL views.


2-Table Query Example - Live

Drag and Drop Fields

Right-Click + Add to add table(s)

Note that the relationship is often automatic.

Calculated Field

BMI: [Weight]/([Height]/100)^2

Right-Clicking the gray area above a field enables property changes.


Query Calculating Fields

Name the calculated field, then type a colon, then type the equation using brackets ( [ ] ) around table fields. If there is ambiguity in the field names between tables, you may need to type table.[field] format.

Ex: BMI: [Weight]/([Height]/100)^2
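For comparison, the same calculated field can be written as a SQL expression. The sketch below runs it in SQLite from Python, which is an assumption since the slides use the Access query grid. It assumes Weight is in kg and Height in cm, as the filtering example later suggests, and the sample row is hypothetical. SQLite has no `^` power operator, so the square is written as a product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Height REAL);
CREATE TABLE Glucose (Glucose_ID INTEGER PRIMARY KEY, Patient_ID INTEGER, Weight REAL);
INSERT INTO Demographics VALUES (1, 170.0);
INSERT INTO Glucose (Patient_ID, Weight) VALUES (1, 65.0);
""")

# BMI is computed at query time from stored fields; it is never stored itself.
# Table aliases (d, g) resolve any ambiguity between field names, like table.[field].
rows = conn.execute("""
SELECT d.Patient_ID,
       g.Weight / ((d.Height / 100.0) * (d.Height / 100.0)) AS BMI
FROM Demographics AS d
JOIN Glucose AS g ON g.Patient_ID = d.Patient_ID
""").fetchall()

print(rows)
```

Because BMI is derived on demand, a corrected height or weight is reflected immediately, with no stale stored value to fix.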


Query Sorting Data

Choose Ascending or Descending in the Sort Row

This query would sort by Gender THEN by Race.
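In SQL view, a sort by Gender and then by Race corresponds to an `ORDER BY` clause with two keys. The sketch below shows this in SQLite from Python; the sample rows and the way Gender and Race are coded are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Gender TEXT, Race TEXT)"
)
conn.executemany(
    "INSERT INTO Demographics VALUES (?, ?, ?)",
    [(1, "M", "White"), (2, "F", "White"), (3, "M", "Asian"), (4, "F", "Black")],
)

# The first key (Gender) is the primary sort; Race only orders rows within a gender.
rows = conn.execute(
    "SELECT Gender, Race, Patient_ID FROM Demographics "
    "ORDER BY Gender ASC, Race ASC"
).fetchall()

print(rows)
```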


Query Filtering Data

This query will return all records in the database for:

Females

who are not white

whose height is greater than 150 cm

and who weigh between 60 and 70 kg

You need not "show" the data field to use it as a filter.
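The four criteria above map onto a SQL `WHERE` clause. The following sketch uses SQLite from Python with hypothetical sample rows, and it assumes Gender is coded as 'F'/'M' and Race as plain text, which the slides do not specify. Weight is taken from the Glucose table, as in the table overview.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Demographics (
    Patient_ID INTEGER PRIMARY KEY, Gender TEXT, Race TEXT, Height REAL
);
CREATE TABLE Glucose (
    Glucose_ID INTEGER PRIMARY KEY, Patient_ID INTEGER, Weight REAL
);
INSERT INTO Demographics VALUES
    (1, 'F', 'Asian', 162), (2, 'F', 'White', 165),
    (3, 'M', 'Black', 180), (4, 'F', 'Black', 148);
INSERT INTO Glucose (Patient_ID, Weight) VALUES (1, 64), (2, 66), (3, 68), (4, 61);
""")

# Gender, Race and Height are used as filters but are not in the SELECT list,
# which mirrors leaving the "Show" box unchecked in the Access query grid.
rows = conn.execute("""
SELECT d.Patient_ID, g.Weight
FROM Demographics AS d
JOIN Glucose AS g ON g.Patient_ID = d.Patient_ID
WHERE d.Gender = 'F'
  AND d.Race <> 'White'
  AND d.Height > 150
  AND g.Weight BETWEEN 60 AND 70
""").fetchall()

print(rows)
```

Only patient 1 passes all four conditions: patient 2 fails the race test, patient 3 the gender test, and patient 4 the height test.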


Query Filter Operators

= equals

> greater than

>= greater than or equal

< less than

<= less than or equal

<> not equal to

Between between two values

Is Null field is empty

Is Not Null field is not empty

Like matches a pattern (Like John*)

OR Logical OR (one or other is true)

AND Logical AND (both are true)

etc.
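Most of these operators exist in standard SQL as well. The sketch below tries several of them in SQLite from Python on hypothetical rows. One difference to keep in mind: the Access pattern `Like John*` uses `*` as its wildcard, while SQLite (like standard SQL) uses `%`. The helper `ids` is an illustrative name, and it builds SQL from fixed literal strings only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Demographics "
    "(Patient_ID INTEGER PRIMARY KEY, Fname TEXT, Phone TEXT, Height REAL)"
)
conn.executemany(
    "INSERT INTO Demographics VALUES (?, ?, ?, ?)",
    [(1, "John", "555-0100", 172), (2, "Johnny", None, 181), (3, "Mary", "555-0101", 158)],
)

def ids(where):
    """Return the Patient_IDs matching a WHERE clause (trusted literals only)."""
    query = f"SELECT Patient_ID FROM Demographics WHERE {where} ORDER BY Patient_ID"
    return [row[0] for row in conn.execute(query)]

like_john = ids("Fname LIKE 'John%'")               # pattern match, % instead of *
no_phone = ids("Phone IS NULL")                     # field is empty
has_phone = ids("Phone IS NOT NULL")                # field is not empty
mid_height = ids("Height BETWEEN 160 AND 175")      # inclusive range
extremes = ids("Height < 160 OR Height > 180")      # logical OR
not_john = ids("Fname <> 'John'")                   # not equal to

print(like_john, no_phone, has_phone, mid_height, extremes, not_john)
```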


Query Grouping Data - 1

Clicking the Totals Button Enables Grouping, Counting and Statistical Options

Notice new "Total" row.

Each field (column) can be set.

Running this Query indicates there are 203 Females and 261 Males in the database.


Query Grouping Data - 2

Totals Options Include:

Group By

Sum

 Avg

Min

Max

Count

StDev

Var 
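The Totals options correspond to SQL's `GROUP BY` clause combined with aggregate functions. The sketch below shows a small case in SQLite from Python. The three sample rows are hypothetical and do not reproduce the counts from the lecture database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Gender TEXT, Height REAL)"
)
conn.executemany(
    "INSERT INTO Demographics VALUES (?, ?, ?)",
    [(1, "F", 160.0), (2, "F", 170.0), (3, "M", 180.0)],
)

# One output row per Gender; every other column is summarized by an aggregate.
rows = conn.execute("""
SELECT Gender,
       COUNT(*)    AS n,
       AVG(Height) AS avg_height,
       MIN(Height) AS min_height,
       MAX(Height) AS max_height
FROM Demographics
GROUP BY Gender
ORDER BY Gender
""").fetchall()

print(rows)
```

Group By, Count, Avg, Min, Max and Sum carry over directly. SQLite has no built-in StDev or Var aggregate, so those two would need to be computed outside the query, for example with Python's `statistics` module.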


Query Export Data

1) Create and Save Query

2) Use OfficeLinks (Excel Toggle Option) to "Analyze it with Excel"

3) Data Automatically Exported to Excel
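Outside Access, a comparable hand-off to Excel can be done by writing query results to CSV, a format Excel opens directly. This is a generic alternative and not the OfficeLinks feature itself. The sketch uses SQLite and Python's `csv` module with hypothetical rows, and writes to an in-memory buffer. To produce a real file, an `open("export.csv", "w", newline="")` handle (the file name is hypothetical) could replace the buffer.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Gender TEXT)")
conn.executemany("INSERT INTO Demographics VALUES (?, ?)", [(1, "F"), (2, "M")])

# Step 1: the saved query.
cursor = conn.execute("SELECT Patient_ID, Gender FROM Demographics ORDER BY Patient_ID")

# Steps 2 and 3: write a header row taken from the query, then all result rows.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([column[0] for column in cursor.description])
writer.writerows(cursor)

csv_text = buffer.getvalue()
print(csv_text)
```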


Microsoft Access Module 4

Creating / Working with Forms/Reports


Graphical User Interface (GUI)

Although it is possible to enter data directly into a table, you can enhance data quality by forcing data entry through forms.

Depending upon your users, you may wish to set things up so they never even see the database window. In other words, you can design your application so they only touch the data through programmed forms.


Graphical User Interface (GUI)

Continuing with the glucose database we formulated earlier, we'll now attempt to build a graphical user interface to:

Collect Data

Periodically report data through pre-formatted reports

Quit the program


GUI Forms/Report Live

Out of Program