File Processing

30
1 File Processing Data are stored in files with interface between programs and files. Various access methods exist (e.g., Sequential, indexed, random) One file corresponds to one or several programs. PROGRAM 1 Data Management FILE 1 FILE 2 Redundant Data PROGRAM 2 Data Management PROGRAM 3 Data Management File System Services

description

File Processing. Data are stored in files with interface between programs and files. Various access methods exist (e.g., Sequential, indexed, random) One file corresponds to one or several programs. PROGRAM 1. Data Management. FILE 1. File System Services. PROGRAM 2. Redundant Data. - PowerPoint PPT Presentation

Transcript of File Processing

Page 1: File Processing

1

File Processing Data are stored in files with interface between programs and

files. Various access methods exist (e.g., Sequential, indexed, random) One file corresponds to one or several programs.

PROGRAM 1

DataManagement FILE 1

FILE 2 Red

unda

nt D

ata

PROGRAM 2

DataManagement

PROGRAM 3

DataManagement

FileSystemServices

Page 2: File Processing

2

Problems With File Systems

Data are still high redundant sharing limited and at the file level

Data is unstructured “flat” files

High maintenance costs data dependence; accessing data is difficult ensuring data consistency and controlling access to data

Sharing granularity is very coarse Difficulties in developing new applications

Page 3: File Processing

3

PROGRAM 1

PROGRAM 1

PROGRAM 2

IntegratedDatabase

DBMS

Query Processor

Transaction Mgr

Database Approach

Page 4: File Processing

4

What is a Database?

A database is an integrated and structured collection of stored operational data used (shared) by application systems of an enterprise

Manufacturing Product data

University Student data, courses

Hospital Patient data, facilities

Bank Account data

Page 5: File Processing

5

What is a Database?

A database (DB) is a structured collection of data about the entities that exist in the environment that is being modeled.

The structure of the database is determined by the data model that is used.

A database management system (DBMS) is the generalized tool that facilitates the management of and access to the database.

Page 6: File Processing

6

Data Model

Formalism that defines how data are organized Within a file Between files

File systems can at best specify data organization within one file

Alternatives Old ones (hierarchical, network) Relational Object-oriented

Page 7: File Processing

7

Example Relation Instances

ENO ENAME TITLE

E1 J. Doe Elect. Eng.E2 M. Smith Syst. Anal.E3 A. Lee Mech. Eng.E4 J. Miller ProgrammerE5 B. Casey Syst. Anal.E6 L. Chu Elect. Eng.E7 R. Davis Mech. Eng.E8 J. Jones Syst. Anal.

EMP

ENO PNO RESP

E1 P1 Manager 12

DUR

E2 P1 Analyst 24E2 P2 Analyst 6E3 P3 Consultant 10E3 P4 Engineer 48E4 P2 Programmer 18E5 P2 Manager 24E6 P4 Manager 48E7 P3 Engineer 36

E8 P3 Manager 40

WORKS

E7 P5 Engineer 23

PROJ

PNO PNAME BUDGET

P1 Instrumentation 150000

P3 CAD/CAM 250000P2 Database Develop. 135000

P4 Maintenance 310000P5 CAD/CAM 500000

Page 8: File Processing

8

Data constitute an organizational asset Integrated control Reduction of redundancy Avoidance of inconsistency Sharability Standards Improved security Integrity integrated centralized

Programmer productivity Data Independence

Why Database Technology

Page 9: File Processing

9

Programmer productivity High data independence

Why Database Technology

Time

Maintenance

New systemdevelopment

To

tal

Sys

tem

Co

st

Page 10: File Processing

10

Invisibility (transparency) of the details of conceptual organization, storage structure and access strategy to the users Logical

transparency of the conceptual organization transparency of logical access strategy

Physical transparency of the physical storage organization transparency of physical access paths

Data Independence

Page 11: File Processing

11

Database Functionality

Integrated schema Users have uniform view of data They see things only as relations (tables) in the relational model

Declarative integrity and consistency enforcement 24000 Salary 250000 No employee can have a salary greater than his/her manager. User specifies and system enforces.

Individualized views Restrictions to certain relations Reorganization of relations for certain classes of users

Page 12: File Processing

12

Database Functionality (cont’d)

Declarative access Query language SQL

Find the names of all electrical engineers.SELECT ENAMEFROM EMPWHERE TITLE = “Elect. Eng.”

Find the names of all employees who have worked on a project as a manager for more than 12 months.SELECT EMP.ENAMEFROM EMP,ASGWHERE RESP = “Manager”AND DUR > 12AND EMP.ENO = ASG.ENO

Page 13: File Processing

13

Database Functionality (cont’d)

System determined execution Query optimizer Relational algebra

Find the names of all electrical engineers.PROJECTENAME(SELECTTITLE = “Elect. Eng.”EMP)

Find the names of all employees who have worked on a project as a manager for more than 12 months.PROJECTENAME(SELECTRESP=“Manager” AND DUR>12(EMP JOINENO ASG)

Page 14: File Processing

14

Database Functionality (cont’d)

Transactions Execute user requests as atomic units May contain one query or multiple queries Provide

Concurrency transparency Multiple users may access the database, but they each see the

database as their own personal data Concurrency control

Failure transparency Even when system failure occurs, database consistency is not

violated Logging and recovery

Page 15: File Processing

15

Database Functionality (cont’d)

Transaction Properties Atomicity

All-or-nothing property Consistency

Each transaction is correct and does not violate database consistency

Isolation Concurrent transactions do not interfere with each other

Durability Once the transaction completes its work (commits), its effects

are guaranteed to be reflected in the database regardless of what may occur

Page 16: File Processing

16G.C. Everest. Database Management, McGraw Hill, 1986

DecisionsComparative Reports

ReportsQueries

Decision Support

MODELING FORECASTING

PLANS & EXPECTATIONS

Answers

REALITY

Actual results

Input data transactions

Actions

CORPORATE DATABASE

• The "image" • Basis of decisions and actions in the organization

OPERATIONAL ACTIVITIES & MANAGEMENT

MANAGEMENT PLANNING & CONTROL

POLICIES GOALS & STRATEGIC PLANNING

Role in an Information System

Page 17: File Processing

17

Application Programs

Operating System

DBMSApp. development tools

Hardware

Place in a Computer System

Page 18: File Processing

18H.F. Korth and A. Silberschatz. Database System Concepts, McGraw-Hill, 1986.

System Structure

data manipulationlanguage

precompiler

queryprocessor

data definition language compiler

applicationprogram

object code

databasemanager

applicationprograms

querydatabase scheme

system calls

naiveusers

applicationprogrammers

casualusers

databaseadministrator

data files

data dictionary

file manager

DBMS

Page 19: File Processing

19

DBMS Architecture

DBMS Languages Data Definition Language (DDL)

defines conceptual schema, external schema, and internal schema, as well as mappings between them.

Data Manipulation Language (DML) embedded query language in a host language “stand-alone” query language

Procedural vs. non-procedural Procedural: specify where and how Non-procedural: specify what

Page 20: File Processing

20

DBMS Architecture

DBMS Interfaces Menu-based Interface Graphical Interface Forms-based Interface Interface for DBA

Page 21: File Processing

21

DBMS Architecture

Main DBMS Modules DDL compiler DML compiler Ad-hoc (interactive) query compiler Run-time database processor Stored Data Manager Concurrency/back-up/recovery subsystem

Page 22: File Processing

22

ANSI/SPARC Architecture

ExternalSchema

ConceptualSchema

InternalSchema

Internal view

Conceptualview

Externalview

Externalview

External view

Users

DBMS

Page 23: File Processing

23

ENTITY ENGINEER [ATTRIBUTES = {

ENG-NO:CHARACTER(9)ENG-NAME :CHARACTER(15)TITLE : CHARACTER(10)SALARY : NUMERIC(6)}

KEY = {ENG-NO}]ENGINEER(ENG-NO,ENG-NAME,TITLE,SALARY)

ENTITY PROJECT [ATTRIBUTES = {

PROJ-NO : CHARACTER(7)PROJ-NAME : CHARACTER(20)BUDGET : NUMERIC(7)}

KEY = {PROJ-NO)]PROJECT(PROJ-NO,PROJ-NAME,BUDGET)

Conceptual Schema

Page 24: File Processing

24

RELATIONSHIP WORKS_IN [

BETWEEN = (ENGINEER, PROJECT}

ATTRIBUTES = {

RESP : CHARACTER(10)

DURATION : NUMERIC(3)

}

]

WORKS_IN(ENG-NO,PROJ-NO,RESP,DURATION)

Conceptual Schema

Page 25: File Processing

25

ENTITY ENGINEER [ATTRIBUTES = {

ENG-NO: CHARACTER(9)ENG-NAME: CHARACTER(15)TITLE : CHARACTER(10)SALARY : NUMERIC(6)}

KEY = {ENGINEER_NUMBER}]

FILE ENG [

INDEX ON E# CALL EMINXFIELD = {

E# : BYTE(9)ENAME : BYTE(15)TIT : BYTE(10)SALARY : INTEGER(6)}

BLOCKING FACTOR 100]

Internal Schema

Page 26: File Processing

26

Create an BUDGET view from the PROJECT entity.

CREATE VIEW BUDGET (PNAME, BUD)AS SELECT PROJ-NAME, BUDGET FROM PROJECT

External Schema

Page 27: File Processing

27

Create an ASSIGNMENT view from the entities ENGINEER and PROJECT that show the names of engineers and the projects that they work on.

CREATE VIEW ASSIGNMENT(ENO,PNO,ENAME,PNAME)

AS SELECT ENGINEER.ENG-NO,ENGINEER.ENG-NAME,

PROJECT.PROJ-NO,PROJECT.PROJ-NAME

FROM ENGINEER,PROJECT,WORKS_IN

WHERE ENGINEER.ENG-NO = WORKS_IN.ENG-NO

AND WORKS_IN.PROJ-NO = PROJECT.PROJ-NO

External Schema

Page 28: File Processing

28

DBMSSystem Buffer

User Program A

LanguageUser Work Area (UWA)

external schemaused by user program A

Schema

Physical/InternalData Schema

Operating System

Database

1 2

3

45

6

78

910

Database Access

(DBMS)

Page 29: File Processing

29

Database Access

User program A sends to DBMS an invoke command to retrieve a (set of) record

DBMS analyzes the external schema of the user program A and finds the database description of the record.

DBMS checks with the schema to get the data types and location information of record

DBMS checks with the physical schema to find out which device the record is in and what access methods can be used.

According to 4, DBMS sends OS a read command to execute the search.

Page 30: File Processing

30

Database Access

OS issues the page invoke command to the correspond device, and then puts the page fetched into the system buffer.

DBMS uses the schema and the external schema to infer the logical structure of the retrieving record.

DBMS places the relevant data to the UWA, and provides the status information at the program invocation

exit.