ITEC313 Database Programming Lecture 1: Database Design Methodology : Introduction.

Post on 12-Jan-2016

ITEC313 Database Programming

Lecture 1: Database Design Methodology : Introduction

Learning Objectives

• Database Design Terminology
• Purpose of Database Design
• Phases of Database Design

Database and Database System

• A database is a shared collection of logically related data designed to meet the information needs of an organization.

• Components of a Database System
– Database
– Hardware
– Software - DBMS
– Users

Database

• The data in the database is expected to be both integrated and shared, particularly on multi-user systems

• Integration - the database may be thought of as a unification of several otherwise distinct files, with any redundancy among these files eliminated

• Shared - individual pieces of data in the database may be shared among several different users

Hardware

The hardware consists of the secondary storage on which the database physically resides, together with the associated I/O devices, device controllers, etc.

DBMS

Examples of DBMS products:
• Oracle
• Informix
• Access
• DB2
• FoxPro
• dBase
• SQL Server
• MySQL

Typical Functions of a DBMS

• Data storage, retrieval and update
• A user-accessible catalog
• Transaction support
• Concurrency control services
• Recovery services
• Authorization services
• Support for data communication
• Integrity services
• Services to promote data independence
• Utility services
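A few of these functions can be seen in a small sketch using SQLite through Python's standard `sqlite3` module. The `student` table and its values are invented for illustration and are not part of the lecture; SQLite is used only because it needs no server.

```python
import sqlite3

# In-memory database; the table and its rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Data storage, retrieval and update
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ada')")
conn.execute("UPDATE student SET name = 'Ada L.' WHERE id = 1")
conn.commit()
name = conn.execute("SELECT name FROM student WHERE id = 1").fetchone()[0]

# User-accessible catalog: SQLite describes its own schema in sqlite_master
catalog = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Transaction support and recovery: an uncommitted change can be rolled back
conn.execute("DELETE FROM student")
conn.rollback()
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Integrity services: the NOT NULL constraint rejects an invalid row
try:
    conn.execute("INSERT INTO student (id, name) VALUES (2, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
    conn.rollback()

print(name, catalog, remaining, rejected)
```

Concurrency control, authorization and data communication are not shown; they need a multi-user server DBMS rather than an embedded one.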

Users

• Application Programmer - writes programs that use the database
• Database Designer - designs the conceptual and logical database
• Database Administrator (DBA)
• Data Administrator
• End-user - interacts with the system from an on-line terminal by using query languages etc.

Data & Database Administration

• Data Administrator – a business manager responsible for controlling the overall corporate data resources

• Database Administrator (DBA) - a technical person responsible for development of the total system


Sample Applications

• Student Records
• Banking
• Insurance
• Billing Systems e.g. Electricity, Phone
• ISPs
• Accounting Systems
• Reservation Systems e.g. Airline, Hotel
• Medical Records
• Stock control
• Personnel systems
• Product catalogues
• Telephone directories
• Train timetables
• Airline bookings
• Credit card details
• Customer histories
• Stock market prices
• Discussion boards
• Web indexes
• Library catalogues

Advantages

• Control of data redundancy

• Data consistency

• Multipurpose use of data

• Sharing of data

• Enforcement of standards

• Economy of scale

• Balance of conflicting user requirements

• Improved data accessibility and responsiveness

• Increased productivity

• Improved maintenance through data independence

• Increased concurrency

• Improved backup and recovery services

Disadvantages

Complexity

Size

Cost of DBMS

Additional hardware costs

Cost of conversion


Data Independence

• Software maintenance is a large part (50%) of information system budgets

• Reduce impact of changes by separating database description from applications

• Change database definition with minimal effect on applications that use the database

Three Schema Architecture


Database Architecture

External Level – concerned with the way users perceive the database

Conceptual Level – concerned with abstract representation of the database in its entirety

Internal Level – concerned with the way data is actually stored


Differences among Levels

• External
– Course Registration Form
– Instructor load assignments

• Conceptual
– Tables: student, course, takes, …

• Internal
– Files needed to store the tables
– Extra files to improve performance
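The three levels can be sketched in SQLite via Python's `sqlite3` module. The table names follow the slide; the column names, the `registration_form` view and the `idx_takes_course` index are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the tables named on the slide (columns are assumed)
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE takes   (student_id INTEGER REFERENCES student,
                          course_id  TEXT REFERENCES course,
                          PRIMARY KEY (student_id, course_id));

    -- Internal level: an extra structure that exists only for performance
    CREATE INDEX idx_takes_course ON takes (course_id);

    -- External level: what a course registration form would show
    CREATE VIEW registration_form AS
        SELECT s.name AS student, c.title AS course
        FROM takes t
        JOIN student s ON s.student_id = t.student_id
        JOIN course  c ON c.course_id  = t.course_id;

    INSERT INTO student VALUES (1, 'Ada');
    INSERT INTO course  VALUES ('ITEC313', 'Database Programming');
    INSERT INTO takes   VALUES (1, 'ITEC313');
""")

# A user of the form sees only the view, not the tables or the index.
form_rows = conn.execute(
    "SELECT student, course FROM registration_form").fetchall()
print(form_rows)
```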

Architecture of Db System

[Figure: Application 1, Application 2 and Application 3 access the Database through the DBMS. Labels shown: External Level, Conceptual Level, Internal Level, Logical Data Independence, Physical Data Independence.]

Data Independence

• Logical Data Independence – users and user programs are independent of the logical structure of the database

• Physical Data Independence – the separation of structural information about the data from the programs that manipulate and use the data, i.e. the immunity of application programs to changes in the storage structure and access strategy

Data Independence

• Different applications will need different views of the same data, so that if they are not interested in a part of the database, that part need not be included in their view. This feature is also important for controlling access to parts of database

• The DBA must have the freedom to change the storage structure or access strategy in response to changing requirements, without having to modify the existing applications
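A minimal sketch of both kinds of independence, assuming a hypothetical `student` table and an application that reads only through a `student_names` view (none of these names come from the lecture). The DBA changes the storage side and the table definition; the application's query and result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO student VALUES (1, 'Ada', 'London'), (2, 'Alan', 'Manchester');
    CREATE VIEW student_names AS SELECT student_id, name FROM student;
""")

def application_query(connection):
    """The 'application': it knows only its view, not the storage details."""
    return connection.execute(
        "SELECT name FROM student_names ORDER BY student_id").fetchall()

before = application_query(conn)

# Physical change: a new access path (index) on the base table.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
# Logical change: a new column that this application's view does not include.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

after = application_query(conn)
print(before == after)
```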


Client-Server Architecture

a) Client, server, and database on the same computer

b) Multiple clients and 1 server on different computers

c) Multiple servers and databases on different computers

Database Development

• In the past, many software development projects were unsuccessful because:
– Requirements were not properly collected/specified
– There was a lack of development methodology

• The stages in the DB development cycle have been identified. They are:
– Clearly specified
– Not sequential, but involve some repetition
– Contain feedback loops (even back to the requirements stage)

Db Development Life Cycle

• Database planning
• System definition
• Requirement collection and analysis
• Database design
• DBMS selection
• Application design
• Prototyping
• Implementation
• Data conversion and loading
• Testing
• Operational maintenance

[Figure: Database Application Lifecycle. Stages shown: Database Planning, Systems Definition, Requirements Analysis, Database Design (Conceptual Design, Logical Design, Distributed DB Design (optional), Physical Design), DBMS Selection, Application Design, Prototyping, Implementation, Data Loading, Testing, Maintenance.]

Database Application Lifecycle

• Database Planning: management activities that allow the stages of the database application to be realized as efficiently as possible

• System Definition: the scope and boundaries of the application, including its major application areas and user groups

• Requirements Analysis: encompasses tasks that determine the needs or conditions to meet for a new or altered product, taking account of the possibly conflicting, vague and incomplete requirements of the various stakeholders

• Application Design: design of the user interface and the application programs that use and process the database

• Prototyping: building a working model of a database application

• Implementation: physical realization of the database and application design

• Data Conversion and Loading: transferring any existing data into the new database and converting any existing processes to run on the new database

• Testing: process of executing the application programs with the intent of finding errors

• Operational Maintenance: process of monitoring and maintaining the system following installation

Planning

Two stages

Planning Factors
• The work to be done
• The resources to do it
• The cost

Planning Objectives
• Organisational Units: consist of various departments
• Locations: list of operational locations
• Business Functions: identify related business processes
• Entity Types: something for which data is collected

System Definition

• Identify boundaries
– Want to know at a very high level what the boundaries of the system are, e.g.
  • Current users
  • Current application areas

• Identify interfaces within organization

Requirements Analysis

• Database design should reflect the information within the organisation

• Many ways of gathering information
– interviewing
– observing
– examining documents
– using questionnaires
– using experience from the design of other systems
– …

Requirements Analysis

• Critical information
– Main application areas and user groups
– Documentation used
– Details of transactions needed

• A prioritized user requirement specification

• Amount gathered depends on size of organization and scope of application

• Documentation is VERY important
– DFD, matrices etc.

• Identifying the required functionality for a database system is crucial:
– systems with inadequate functionality will fail

Database Design

MAIN AIMS
• To represent data & relationships required by users and applications
• To provide a data model which supports transactions
• To specify a design that meets performance requirements

Database Design Approaches

• BOTTOM-UP: begins at the level of attributes and then adds entities as new relationships are seen. Normalization is an example of this.

• TOP-DOWN: starts with the development of a data model that contains a few high-level entities and then refines them in ever-increasing detail. Data modeling comes under this.

Phases of database Design

• Remember the main phases:

– Conceptual Database Design
– Logical Database Design
– Distributed Database Design (optional)
– Physical Database Design

Conceptual Database Design

• Create a conceptual data model
– Use data modeling to understand
  • each user's perspective of data
  • the data
  • use of data across applications

• Independent of any implementation details
– DBMS or physical aspects are immaterial

• Based on user requirements specification
– assists in understanding data
– facilitates communication

Logical database design

• The data model created in the previous phase is refined

• At this point you know
– which type of DBMS you will be implementing in, e.g. relational, object-oriented …
– but not the actual DBMS

• Test the correctness of the data model through
– Normalization
– Validation against user transactions
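Normalization rests on functional dependencies between attributes. The sketch below, under assumed sample data, checks whether a dependency is contradicted by a set of rows; the enrolment rows, the attribute names and the `holds` helper are all hypothetical. Sample data can only refute a dependency, never prove it, so this supports the designer's judgment rather than replacing it.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not contradicted by rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical enrolment data held in one unnormalized table
rows = [
    {"student_id": 1, "course_id": "ITEC313",
     "course_title": "Database Programming", "grade": "A"},
    {"student_id": 2, "course_id": "ITEC313",
     "course_title": "Database Programming", "grade": "B"},
    {"student_id": 1, "course_id": "ITEC101",
     "course_title": "Intro to IT", "grade": "B"},
]

key = ["student_id", "course_id"]

# course_title depends on only part of the key: a partial dependency,
# which suggests moving course_title into a separate course table.
partial_dependency = holds(rows, ["course_id"], ["course_title"])

# grade is not determined by course_id alone, but is by the whole key.
grade_needs_full_key = (not holds(rows, ["course_id"], ["grade"])
                        and holds(rows, key, ["grade"]))

print(partial_dependency, grade_needs_full_key)
```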

Database selection

A crucial stage in the database application lifecycle is choosing the DBMS.

The aim is to choose a system that
• allows expansion
• enables speedy retrieval
• gives easy application development etc.

All data should have been collected and documented before DBMS selection.

Many organizations in practice choose a DBMS purely on the basis of cost.

Steps in database selection

• Define terms of reference
– the scope of the study should be stated
– potential list of the products to be assessed
– the criteria to be used, timescales …

• Identify products
– hardware
– compatibility with existing systems
– cost
– user support
– upgrades …

• Produce shortlist of products
– Shortlist 2-3 products

• Evaluate products
– Ask vendors
– Involve users

• Recommend selection and produce report
– Give details of criteria used
– Compare/contrast alternatives

Physical Database Design

HOW to physically implement the logical data model

– derive tables & constraints
– identify storage structures and access methods
– design security features

Application Design

Design of software programs which will process the data

• Design transactions
– data to be used by transactions
– functions of the transactions
– output of transactions
– programs

• Design human interface
– Various guidelines

Prototyping

• Used to check
– developer’s understanding of what is required
– interpretation of requirements

• Building a working model

• Inexpensive & quick to build

Implementation

• Database created using DDL
• Implement application programs using selected language
• Implement security & integrity controls
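A minimal sketch of this stage, assuming SQLite as the target DBMS and Python as the selected language. The `course` and `takes` tables, the credit range and the `try_insert` helper are invented for illustration. Only integrity controls are shown; SQLite has no user accounts, so security controls such as access privileges would need a server DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Database created using DDL, with integrity constraints declared in it.
conn.executescript("""
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        credits   INTEGER NOT NULL CHECK (credits BETWEEN 1 AND 10)
    );
    CREATE TABLE takes (
        student_id INTEGER NOT NULL,
        course_id  TEXT NOT NULL REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO course VALUES ('ITEC313', 3);
""")

def try_insert(sql, params):
    """Application code: run one insert and report whether the DBMS accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert("INSERT INTO takes VALUES (?, ?)", (1, "ITEC313")),  # valid row
    try_insert("INSERT INTO takes VALUES (?, ?)", (1, "NOPE")),     # unknown course
    try_insert("INSERT INTO course VALUES (?, ?)", ("ITEC999", 0)), # credits out of range
]
print(results)
```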

Data Loading/Conversion

• Transfer any existing data
• Insert any new data
• Usually there is a facility within the DBMS to load data into a database
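Where no built-in loader is used, a short script can do the transfer. The sketch below assumes the existing data has been exported as CSV with `student_id` and `name` columns; the file contents and the table are hypothetical, and an in-memory string stands in for the exported file.

```python
import csv
import io
import sqlite3

# Stand-in for an exported legacy file; in practice this would be an open file.
legacy_csv = io.StringIO("student_id,name\n1,Ada\n2,Alan\n")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Conversion step: turn text fields into the types the new schema expects.
reader = csv.DictReader(legacy_csv)
rows = [(int(r["student_id"]), r["name"].strip()) for r in reader]

# Loading step: one transaction, so either every row loads or none does.
with conn:
    conn.executemany(
        "INSERT INTO student (student_id, name) VALUES (?, ?)", rows)

loaded = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(loaded)
```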

Testing

• The process of executing the application programs with the intention of finding errors.
– Use realistic data
– Involve users

• There are various strategies that can be used:
– White box testing
– Black box testing

Maintenance

• Monitoring Performance
– Various tools are available

• Maintaining and Upgrading

Overview of Database Design

Purpose of Data Modeling
• Assist in understanding of the semantics of data
• Facilitate the communication about information requirements

Criteria for Optimal Data Models

Shareability

Diagrammatic Representation

Extensibility

Expressibility

Structural Validity

Nonredundancy

Integrity

Simplicity

Database Design Methodology

• A structured approach that uses procedures, techniques, tools and documentation aids to support and facilitate the process of design

• Interaction with users
• Structured methodology
• Data-driven approach
• Structural and integrity considerations
• Data dictionary
• Validate
• Diagrams
• DBDL
• Repeat

Broad Goals of Database Development

• Develop a common vocabulary
• Define data meaning
• Ensure data quality
• Provide efficient implementation

Develop a Common Vocabulary

• Diverse groups of users
• Difficult to obtain acceptance of a common vocabulary
• Compromise to find least objectionable solution
• Unify organization by establishing a common vocabulary

Define Meaning of Data

• Business rules support organizational policies

• Restrictiveness of business rules
– Too restrictive: reject valid business interactions
– Too loose: allow erroneous business interactions

• Exceptions allow flexibility

Data Quality

• Poor data quality leads to poor decision making
– Difficult customer communication
– Inventory shortages

• Cost-benefit tradeoff to achieve desired level of data quality

• Long-term effects of poor data quality

Data Quality Measures

• Completeness: database represents all important parts of an information system
• Lack of ambiguity: each part of a database has only one meaning
• Timeliness: business changes are posted to a database without excessive delays
• Correctness: database contains values perceived by the user
• Consistency: different parts of a database do not conflict
• Reliability: failures or interference do not corrupt database

Importance of measure depends on the database, system, and organization

Each measure can be quantified
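One possible way to quantify two of these measures is as simple ratios. The formulas, the sample records and the function names below are assumptions chosen for illustration; the lecture does not prescribe a particular metric.

```python
# Hypothetical customer records; None marks a missing value.
customers = [
    {"id": 1, "email": "ada@example.com", "city": "London"},
    {"id": 2, "email": None, "city": "Paris"},
    {"id": 3, "email": "alan@example.com", "city": "Rome"},
    {"id": 4, "email": "grace@example.com", "city": "Oslo"},
]

# Hypothetical invoices that repeat the customer's city: (customer id, city)
invoices = [(1, "London"), (2, "Lyon"), (3, "Rome"), (4, "Oslo")]

def completeness(rows, field):
    """Fraction of rows in which the field has a value."""
    return sum(r[field] is not None for r in rows) / len(rows)

def consistency(customer_rows, invoice_rows):
    """Fraction of invoices whose city agrees with the customer record."""
    city_of = {c["id"]: c["city"] for c in customer_rows}
    agreeing = sum(city_of.get(cid) == city for cid, city in invoice_rows)
    return agreeing / len(invoice_rows)

email_completeness = completeness(customers, "email")  # 3 of 4 rows have an email
city_consistency = consistency(customers, invoices)    # 3 of 4 invoices agree
print(email_completeness, city_consistency)
```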

Efficient Implementation

• Supersedes other goals

• Optimization problem
– Maximize performance
– Subject to constraints of data quality, data meaning, and resource usage

• Difficult problem:
– Number of choices
– Relationships among choices
– DBMS specific

Database Development Phases

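[Figure: phases and the outputs they produce]
• Input: Data requirements
• Conceptual Data Modeling, producing an ERD
• Logical Database Design, producing Tables
• Distributed Database Design (OPTIONAL), producing a Distribution Schema
• Physical Database Design, producing an Internal Schema and Populated DB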

Database Design

• Conceptual database design - the process of constructing a model of the information used in an organization, independent of all physical considerations

Step 1 Build local conceptual data model for each user view


Database Design

• Logical database design for the relational model - the process of constructing a model of the information used in an organization based on a specific data model, but independent of a particular DBMS and other physical considerations

Step 2 Build and validate local logical data model for each user view

Step 3 Build and validate global logical data model

Database Design

• Physical database design for relational databases - the process of producing a description of the implementation of the database on secondary storage.

Step 4 Translate global logical data model for target DBMS

Step 5 Design physical representation

Step 6 Design security mechanisms

Step 7 Monitor and tune the operational system

Phases of Database Design

• Conceptual Database Design: process of constructing a model of the information used in an enterprise, independent of all physical considerations

• Logical Database Design: process of constructing a model of the information used in an enterprise based on a specific data model, but independent of a particular DBMS or any other physical considerations

• Distributed Database Design (optional): process of deciding about the placement of data across the sites of a computer network. Involves designing the network itself, as well as the distribution of DBMS software, DB applications and data

• Physical Database Design: description of the implementation of the database on secondary storage. It describes the storage structures and access methods for efficient access.

Overview of Database Design

Conceptual
• Build local conceptual data model for each user view

Logical
• Build and validate local logical data model for each user view
• Build and validate global logical data model

Physical
• Translate global logical data model for target DBMS
• Design physical representation
• Design security mechanisms
• Monitor and tune operational system

Centralized Approach to Managing Multiple User Views

Pearson Education © 2009

View Integration Approach to Managing Multiple User Views

Conceptual Database Design

1. Build local conceptual data model for each user view

1.1 Identify entity types
1.2 Identify relationship types
1.3 Identify and associate attributes with entity or relationship types
1.4 Determine attribute domains
1.5 Determine candidate and primary key attributes
1.6 Specialize/generalize entity types
1.7 Draw Entity-Relationship diagram
1.8 Review local conceptual data model with user
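The outputs of steps 1.1 to 1.5 can be recorded in a DBMS-independent form before any diagram is drawn. The sketch below uses plain Python dataclasses; the class names, the Student/Course example and the `check_model` helper are assumptions for illustration. The helper is only a mechanical sanity check that could precede the review with the user in step 1.8, not a replacement for it.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    domain: str           # step 1.4, e.g. "integer" or "string(50)"
    is_key: bool = False  # step 1.5

@dataclass
class EntityType:         # step 1.1
    name: str
    attributes: list = field(default_factory=list)  # step 1.3

    def primary_key(self):
        return [a.name for a in self.attributes if a.is_key]

@dataclass
class RelationshipType:   # step 1.2
    name: str
    participants: tuple   # names of the participating entity types
    multiplicity: str     # e.g. "many-to-many"

def check_model(entities, relationships):
    """Return a list of problems: missing keys or dangling participants."""
    problems = []
    names = {e.name for e in entities}
    for e in entities:
        if not e.primary_key():
            problems.append(f"{e.name} has no primary key")
    for r in relationships:
        for p in r.participants:
            if p not in names:
                problems.append(f"{r.name} refers to unknown entity {p}")
    return problems

student = EntityType("Student", [Attribute("student_id", "integer", True),
                                 Attribute("name", "string(50)")])
course = EntityType("Course", [Attribute("course_id", "string(7)", True),
                               Attribute("title", "string(100)")])
takes = RelationshipType("Takes", ("Student", "Course"), "many-to-many")

problems = check_model([student, course], [takes])
print(problems)
```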

Logical Database Design

2. Build and validate local logical data model

2.1 Map local conceptual data model to local logical data model
2.2 Derive relations from local logical data model
2.3 Validate model using normalization
2.4 Validate model against user transactions
2.5 Draw Entity-Relationship diagram
2.6 Define integrity constraints
2.7 Review local logical data model with user

Logical Database Design

3. Build and validate global logical data model

3.1 Merge local logical data models into global model
3.2 Validate global logical data model
3.3 Check for future growth
3.4 Draw final Entity-Relationship diagram
3.5 Review global logical data model with users

Physical Database Design

4. Translate Global Logical Data Model for target DBMS
4.1 Design base relations for target DBMS
4.2 Design enterprise constraints for target DBMS

5. Design Physical Representations
5.1 Analyze transactions
5.2 Choose file organizations
5.3 Choose secondary indexes
5.4 Consider introduction of controlled redundancy

6. Design Security Mechanisms
6.1 Design user views
6.2 Design access rules

7. Monitor and tune operational system
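Steps 5.1, 5.3 and 7 can be illustrated with SQLite's `EXPLAIN QUERY PLAN`, which reports how a query will be executed. This is a sketch under assumptions: the `student` table, the generated rows and the `idx_student_city` index are invented, and other DBMSs expose query plans through different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [(f"student{i}", f"city{i % 50}") for i in range(1000)])
conn.commit()

def plan(sql, params=()):
    """Return the textual plan steps SQLite reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

# Step 5.1: a transaction that is expected to run often
query = "SELECT * FROM student WHERE city = ?"
plan_before = plan(query, ("city7",))

# Step 5.3: add a secondary index on the column used for searching
conn.execute("CREATE INDEX idx_student_city ON student (city)")

# Step 7: check that the operational system actually uses the new access path
plan_after = plan(query, ("city7",))

print(plan_before)
print(plan_after)
```

Before the index exists the plan is a full scan of the table; afterwards it names `idx_student_city`. The exact wording of the plan text varies between SQLite versions.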

END OF LECTURE