Lecture01 257

54
IS 257 – Fall 2009 2009.08.27 - SLIDE 1 Database Management: Introduction Ray R. Larson University of California, Berkeley School of Information IS 257: Database Management

description

database management introduction

Transcript of Lecture01 257

Page 1: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 1

Database Management:Introduction

Ray R. Larson

University of California, Berkeley

School of Information

IS 257: Database Management

Page 2: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 2

Overview

• Announcements

• Course Description

• Database Concepts and Terminology

• Database Models

• Database Life Cycle

Page 3: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 3

Overview

• Announcements

• Course Description

• Database Concepts and Terminology

• Database Models

• Database Life Cycle

Page 4: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 4

Course Overview

• Description of the Course

• Assignments

• Readings

• Grading

• Schedule• Web site:

http://courses.ischool.berkeley.edu/i257/f09

Page 5: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 5

Course Description

• This course is concerned with the design of the database itself -- not with the design of database system software.– We will discuss DBMS internals only as they

relate to the database and its design and structure

• We will spend a fair amount time on database application design, especially on options for Web application database support -- but this will not be primary focus.

Page 6: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 6

Assignments

• Two kinds of assignments– Using a pre-built database for search and retrieval

and database modification queries– Designing, populating, and running queries against

your own personal (or group) database• Types of database project

– Individual

» Work related

» Course only

» Projects from around campus that need doing…

– Group

» Course related

» Final Master’s project

Page 7: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 7

Readings

• Textbook is:– Jeffrey A. Hoffer, Mary B. Prescott and Heikki

Topi. Modern Database Management (Ninth Edition). Prentice Hall (Pearson Educational) : Upper Saddle River, NJ, 2008.

• ISBN 978-0-13-600391-5

Page 8: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 8

Grading

• Grades will be based on:– Assignments (30%)– Personal/Group Database project (60%)– Class participation (10%)– (No midterm or final)

Page 9: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 9

Schedule

• On website:• http://courses.ischool.berkeley.edu/i257/f09/

Page 10: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 10

Overview

• Announcements

• Course Description

• Database Concepts and Terminology

• Database Models

• Database Life Cycle

Page 11: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 11

What is a Database?

Page 12: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 12

Files and Databases

• File: A collection of records or documents dealing with one organization, person, area or subject. (Rowley)– Manual (paper) files– Computer files

• Database: A collection of similar records with relationships between the records. (Rowley)– bibliographic, statistical, business data, images,

etc.

Page 13: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 13

Database

• A Database is a collection of stored operational data used by the application systems of some particular enterprise. (C.J. Date)– Paper “Databases”

• Still contain a large portion of the world’s knowledge– Changing as, for example, book scanning projects like Google

Books and the Open Content Aliance convert paper docs

– File-Based Data Processing Systems• Early batch processing of (primarily) business data

– would not be too surprised to find some these still out there

– Database Management Systems (DBMS)• Some old ones still in use, but most modern DBMS are

relational, object or object-relational

Page 14: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 14

Terms and Concepts

• Database Management System -- DBMS– Software system used to define, create,

maintain and provide controlled access to the database and its metadata

Page 15: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 15

Terms and Concepts

• Data Dictionary– AKA Repository (old usage)– The place where all metadata for a particular

database is stored– may also include information on relationships

between files or tables in a particular database

Page 16: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 16

Terms and Concepts

• Metadata– Data about data

• In DBMS means all of the characteristics describing the attributes of an entity, E.G.:

– name of attribute– data type of attribute– size of the attribute– format or special characteristics

– Characteristics of files or relations• name, content, notes, etc.

Page 17: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 17

Why DBMS?

• History– 50’s and 60’s all applications were custom

built for particular needs– File based– Many similar/duplicative applications dealing

with collections of business data– Early DBMS were extensions of programming

languages– 1970 - E.F. Codd and the Relational Model– 1979 - Ashton-Tate & first Microcomputer

DBMS

Page 18: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 18

From File Systems to DBMS

• Problems with File Processing systems– Inconsistent Data

– Inflexibility

– Limited Data Sharing

– Poor enforcement of standards

– Excessive program maintenance

Page 19: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 19

DBMS Benefits

• Minimal Data Redundancy• Consistency of Data• Integration of Data• Sharing of Data• Ease of Application Development• Uniform Security, Privacy, and Integrity

Controls• Data Accessibility and Responsiveness• Data Independence• Reduced Program Maintenance

Page 20: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 20

Terms and Concepts

• Data Independence– Physical representation and location of data

and the use of that data are separated• The application doesn’t need to know how or

where the database has stored the data, but just how to ask for it.

• Moving a database from one DBMS to another should not have a material effect on application program

• Recoding, adding fields, etc. in the database should not affect applications

Page 21: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 21

Database Environment

CASE Tools

DBMS

UserInterface

ApplicationPrograms

Repository Database

Page 22: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 22

Database Components

DBMS===============

Design toolsTable CreationForm CreationQuery CreationReport Creation

Procedural language

compiler (4GL)=============

Run timeForm processorQuery processor

Report WriterLanguage Run time

UserInterface

Applications

ApplicationProgramsDatabase

Database contains:User’s DataMetadataIndexesApplication Metadata

Page 23: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 23

Types of Database Systems

• PC Databases

• Centralized Database

• Client/Server Databases

• Distributed Databases

• Database Models

Page 24: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 24

PC Databases

E.G.AccessFoxProDbaseEtc.

Page 25: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 25

Centralized Databases

Cental Computer

Page 26: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 26

Client Server Databases

NetworkClient

Client

Client

DatabaseServer

Page 27: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 27

Distributed Databases

computercomputer

computer

Location A

Location CLocation B

HomogeneousDatabases

Page 28: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 28

Distributed Databases

Local Network

DatabaseServer

Client

Client

CommServer

Remote Comp.

Remote Comp.

HeterogeneousOr FederatedDatabases

Page 29: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 29

Terms and Concepts

• Database activities:– Create

• Add new data to the database

– Read• Read current data from the database

– Update• Update or modify current database data

– Delete• Remove current data from the database

Page 30: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 30

Terms and Concepts

• Database Application– An application program (or set of related

programs) that is used to perform a series of database activities:

• Create• Read• Update• Delete• On behalf of database users

Page 31: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 31

Range of Database Applications

• PC databases– Usually for individual

• WorkGroup databases– Small group use where everyone has access

to the database over a LAN

• Departmental databases– Larger than a workgroup – but similar

• Enterprises databases– For the entire organization over an intranet (or

sometimes the internet)

Page 32: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 32

Terms and Concepts

• Enterprise– Organization

• Entity– Person, Place, Thing, Event, Concept...

• Attributes– Data elements (facts) about some entity– Also sometimes called fields or items or domains

• Data values– instances of a particular attribute for a particular entity

Page 33: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 33

Terms and Concepts

• Records– The set of values for all attributes of a

particular entity– AKA “tuples” or “rows” in relational DBMS

• File– Collection of records – AKA “Relation” or “Table” in relational DBMS

Page 34: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 34

Terms and Concepts

• Key– an attribute or set of attributes used to identify

or locate records in a file

• Primary Key– an attribute or set of attributes that uniquely

identifies each record in a file

Page 35: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 35

Terms and Concepts

• DA– Data adminstrator - person responsible for the

Data Administration function in an organization

– Sometimes may be the CIO -- Chief Information Officer

• DBA– Database Administrator - person responsible

for the Database Administration Function

Page 36: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 36

Terms and Concepts

• Models– (1) Levels or views of the Database

• Conceptual, logical, physical

– (2) DBMS types• Relational, Hierarchic, Network, Object-Oriented,

Object-Relational

Page 37: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 37

Overview

• Announcements

• Course Description

• Database Concepts and Terminology

• Database Models

• Database Life Cycle

Page 38: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 38

Models (1)

ConceptualModel

LogicalModel

External Model

Conceptual requirements

Conceptual requirements

Conceptual requirements

Conceptual requirements

Application 1

Application 1

Application 2 Application 3 Application 4

Application 2

Application 3

Application 4

External Model

External Model

External Model

Internal Model

Page 39: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 39

Data Models(2): History

• Hierarchical Model (1960’s and 1970’s)– Similar to data structures in programming

languages.

Books(id, title)

Publisher SubjectsAuthors

(first, last)

Page 40: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 40

Data Models(2): History

• Network Model (1970’s)– Provides for single entries of data and

navigational “links” through chains of data.

Subjects Books

Authors

Publishers

Page 41: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 41

Data Models(2): History

• Relational Model (1980’s)– Provides a conceptually simple model for data

as relations (typically considered “tables”) with all data visible.

Book ID Title pubid Author id1 Introductio 2 12 The history 4 23 New stuff ab 3 34 Another title 2 45 And yet more 1 5

pubid pubname1 Harper2 Addison3 Oxford4 Que

Authorid Author name1 Smith2 Wynar3 Jones4 Duncan5 Applegate

Subid Subject1 cataloging2 history3 stuff

Book ID Subid1 22 13 34 24 3

Page 42: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 42

Data Models(2): History

• Object Oriented Data Model (1990’s)– Encapsulates data and operations as

“Objects”

Books(id, title)

Publisher SubjectsAuthors

(first, last)

Page 43: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 43

Data Models(2): History

• Object-Relational Model (1990’s)– Combines the well-known properties of the

Relational Model with such OO features as:• User-defined datatypes• User-defined functions• Inheritance and sub-classing

Page 44: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 44

Overview

• Announcements

• Course Description

• Database Concepts and Terminology

• Database Models

• Database Life Cycle

Page 45: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 45

Database System Life Cycle

Growth,Change, &

Maintenance6

Operations5

Integration4

Design1

Conversion3

PhysicalCreation

2

Page 46: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 46

The “Cascade” View

Project Identifcation and Selection

ProjectInitiation

and Planning

Analysis

Logical Design

PhysicalDesign

Implementation

MaintenanceSee Hoffer, p. 41

Page 47: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 47

Design

• Determination of the needs of the organization

• Development of the Conceptual Model of the database– Typically using Entity-Relationship

diagramming techniques

• Construction of a Data Dictionary• Development of the Logical Model

Page 48: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 48

Physical Creation

• Development of the Physical Model of the Database– data formats and types– determination of indexes, etc.

• Load a prototype database and test• Determine and implement security,

privacy and access controls• Determine and implement integrity

constraints

Page 49: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 49

Conversion

• Convert existing data sets and applications to use the new database– May need programs, conversion utilities to

convert old data to new formats.

Page 50: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 50

Integration

• Overlaps with Phase 3

• Integration of converted applications and new applications into the new database

Page 51: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 51

Operations

• All applications run full-scale

• Privacy, security, access control must be in place.

• Recovery and Backup procedures must be established and used

Page 52: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 52

Growth, Change & Maintenance

• Change is a way of life– Applications, data requirements, reports, etc.

will all change as new needs and requirements are found

– The Database and applications and will need to be modified to meet the needs of changes

Page 53: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 53

Another View of the Life Cycle

Operations5

Conversion3

PhysicalCreation

2Growth, Change

6

Integration4

Design1

Page 54: Lecture01 257

IS 257 – Fall 2009 2009.08.27 - SLIDE 54

Next Time

• Introduction to the Diveshop database

• Introduction to Database Design