Chapter 12 File Processing and Data Management Concepts
Presentation Outline
I. Terminology
II. Database Technology
III. The Architecture of a Database Management System (DBMS)
IV. The Database Administrator
I. Terminology
A. Field
B. Data Occurrences
C. Fixed vs. Variable Length Records
D. Record Key
E. Sort Keys
A. Field
A field is the smallest block of data that will be stored and retrieved in the information system.
Other names for field include data item, attribute, or element.
Field 1 Field 2
B. Data Occurrences
A specific set of data values for a record in a file.
[Table: five general ledger account records, numbered 1 through 5]
The above table contains 5 occurrences of account records for the general ledger account file.
C. Fixed vs. Variable Length Records
Fixed Length Records
Both the number of fields and the length of each field are fixed.
Strength: Easier to manipulate records.
Weakness: Must accommodate maximum sizes.
Variable Length Records
Both the number of fields and the length of each field are variable. (See Fig. 15-1 on p. 603)
Strength: Less waste of memory when maximum sizes do not have to be accommodated.
Weakness: Record manipulation is more difficult.
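As a rough illustration of the fixed-length trade-off, the sketch below packs records with Python's standard `struct` module. The layout (4-byte account number, 20-byte name, 8-byte balance) and the names `RECORD_FORMAT`, `pack_record`, and `read_record` are assumptions made for this example, not part of the chapter.

```python
import struct

# Hypothetical layout: 4-byte unsigned int, 20-byte name, 8-byte float.
# "<" means little-endian with no padding, so every record is 32 bytes.
RECORD_FORMAT = "<I20sd"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)


def pack_record(account_no, name, balance):
    """Encode one record; struct pads (or truncates) the name to 20 bytes."""
    return struct.pack(RECORD_FORMAT, account_no, name.encode("ascii"), balance)


def read_record(data, n):
    """Fetch the n-th record (0-based) by computing its byte offset."""
    start = n * RECORD_SIZE
    account_no, raw_name, balance = struct.unpack(
        RECORD_FORMAT, data[start:start + RECORD_SIZE]
    )
    return account_no, raw_name.rstrip(b"\x00").decode("ascii"), balance


file_bytes = pack_record(1110, "Cash", 500.0) + pack_record(1500, "Inventory", 1200.0)
second = read_record(file_bytes, 1)
```

The strength shows in `read_record`: locating a record is simple arithmetic. The weakness shows in `pack_record`: the name "Cash" still occupies the full 20 bytes.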
D. Record Key
A record key is a field or combination of fields that uniquely identifies a particular record in a file.
1110
1500
2105
2110
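One way to picture the uniqueness requirement is a lookup table that refuses duplicate keys. In this sketch the account numbers come from the slide, while the account names and the helper `build_key_index` are hypothetical.

```python
def build_key_index(records, key_field):
    """Map each key value to its record, rejecting duplicate keys."""
    index = {}
    for record in records:
        key = record[key_field]
        if key in index:
            raise ValueError(f"duplicate record key: {key}")
        index[key] = record
    return index


# Account numbers are from the slide; the names are made up for illustration.
accounts = [
    {"account_no": 1110, "name": "Cash"},
    {"account_no": 1500, "name": "Inventory"},
    {"account_no": 2105, "name": "Accounts Payable"},
    {"account_no": 2110, "name": "Notes Payable"},
]
by_account_no = build_key_index(accounts, "account_no")
```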
E. Sort Keys
Primary sort key – The first field used to sort the data occurrences in a record set.
Secondary sort key – A field used to determine relative position among data occurrences that share the same primary sort key value.
Tertiary sort key – Additional fields beyond primary and secondary sort keys that are required to uniquely identify data occurrences in a record set.
Last Name First Name Age
Adams Tom 25
Jones Alisa 36
Jones Julie 19
Jones Julie 21
Young Sam 22
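The ordering in the table can be reproduced with a composite sort key. This is a minimal sketch: the rows are the slide's data in shuffled order, and the tuple `(last name, first name, age)` plays the roles of primary, secondary, and tertiary sort keys.

```python
# Rows from the slide, deliberately out of order.
people = [
    ("Jones", "Julie", 21),
    ("Young", "Sam", 22),
    ("Jones", "Alisa", 36),
    ("Adams", "Tom", 25),
    ("Jones", "Julie", 19),
]

# Tuples compare element by element, so last name is the primary key,
# first name breaks ties, and age separates the two Julie Jones rows.
ordered = sorted(people, key=lambda row: (row[0], row[1], row[2]))
```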
II. Database Technology
A. The Problem of Redundancy
B. The Components of a Database Management System
A. The Problem of Redundancy
Redundancy occurs when different areas of an organization use the information system to store the same information in more than one place.
Results in update anomalies: a change made to one copy is not reflected in the others.
That is not what we show
per our records.
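A small sketch of how redundancy leads to an update anomaly. The two department files, the customer "C100", and the helper `find_inconsistencies` are all hypothetical.

```python
# Two departments each keep their own copy of a customer's address.
sales_file = {"C100": {"name": "Acme Co.", "address": "12 Oak St."}}
billing_file = {"C100": {"name": "Acme Co.", "address": "12 Oak St."}}


def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose value for `field` differs between two files."""
    shared_keys = file_a.keys() & file_b.keys()
    return sorted(k for k in shared_keys if file_a[k][field] != file_b[k][field])


# The customer moves, but only the sales copy is updated.
sales_file["C100"]["address"] = "98 Elm Ave."
conflicts = find_inconsistencies(sales_file, billing_file, "address")
```

After the partial update, the two departments disagree about the same customer, which is the situation the cartoon caption describes.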
B. The Components of a Database Management System
1. Data Description Language (DDL)
2. Data Manipulation Language (DML)
3. Data Query Language (DQL)
1. Data Description Language (DDL)
Defines the logical structure of the database (known as the schema).
Defines the following:
Name of data fields.
Type of data (numeric, alphabetic, etc.).
Number of positions (length of field).
May also define subschemas (i.e., individual user views).
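A minimal sketch of DDL statements, run through Python's built-in `sqlite3` module. The table `gl_account` and the view `account_names` are hypothetical names chosen for this example. Note that SQLite accepts the declared lengths but does not enforce them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: field names, data types, and declared lengths.
# The view acts as a subschema, exposing only part of the table to a user.
conn.executescript("""
    CREATE TABLE gl_account (
        account_no INTEGER PRIMARY KEY,
        name       VARCHAR(30) NOT NULL,
        balance    NUMERIC(12, 2) NOT NULL
    );
    CREATE VIEW account_names AS
        SELECT account_no, name FROM gl_account;
""")

# Read the schema back: each row is (cid, name, type, notnull, default, pk).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(gl_account)")]
views = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'")]
```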
2. Data Manipulation Language (DML)
The DML consists of the commands for updating, editing, manipulating, and extracting data.
Structured Query Language (SQL) is a common DML in relational settings.
Pull a trial balance.
Structured Query Language (SQL)
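To make "pull a trial balance" concrete, here is a sketch using SQL through Python's `sqlite3` module. The table layout, account names, and amounts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gl_account ("
    "account_no INTEGER PRIMARY KEY, name TEXT, debit REAL, credit REAL)"
)

# Inserting: load three hypothetical accounts.
conn.executemany("INSERT INTO gl_account VALUES (?, ?, ?, ?)", [
    (1110, "Cash", 500.0, 0.0),
    (1500, "Inventory", 300.0, 0.0),
    (2105, "Accounts Payable", 0.0, 800.0),
])

# Updating: record a 100.00 cash loan as a matching debit and credit.
conn.execute("UPDATE gl_account SET debit = debit + 100.0 WHERE account_no = 1110")
conn.execute("UPDATE gl_account SET credit = credit + 100.0 WHERE account_no = 2105")

# Extracting: the trial balance lists every account, then compares totals.
trial_balance = conn.execute(
    "SELECT account_no, name, debit, credit FROM gl_account ORDER BY account_no"
).fetchall()
total_debit, total_credit = conn.execute(
    "SELECT SUM(debit), SUM(credit) FROM gl_account"
).fetchone()
```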
3. Data Query Language (DQL)
A data query language is a user friendly language or interface that allows the user to request information by simply filling in blanks.
Represents a special type of DML.
Query by Example (QBE)
III. The Architecture of a Database Management System (DBMS)
A. The Database Architecture
B. The Conceptual Architecture and Entity-Relationship (ER) Diagrams
C. Logical Data Structures
D. The Physical Structure
A. The Database Architecture
Conceptual Level
Database contents, uses of database, desired reports, information to be viewed
Logical Level
Logical data structures: tree, network, relational
Physical Level
Access methods: sequential access, indexed files
B. The Conceptual Architecture and Entity-Relationship (ER) Diagrams
Square boxes are used for entities (separate tables).
Ellipses are used for attributes (table columns).
Diamond-shaped boxes depict relationships.
PART
PART_NO NAME
COST
STORED AT
LOCATION
WHSE ADDRESS
C. Logical Data Structures
1. Tree or Hierarchical Structure
2. Network Structure
3. Relational Structure
a. Selection
b. Projection
c. Join
1. Tree or Hierarchical Structure
A parent record can have many children. However, a child record can have only one parent.
Can only model 1:1 (one-to-one) and 1:* (one-to-many) relationships.
Commonly used with accounting data.
Can only access data by going from a parent to child.
Balance Sheet
Assets Liabilities Equity
Current Assets
Long-term Assets
Current Liabilities
Long-term Liabilities
Revenues
Expenses
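A sketch of parent-to-child access in a tree structure. The grouping of the labels under Assets, Liabilities, and Equity is an assumption read from the figure labels, and `TREE` and `find_path` are hypothetical names.

```python
# Each parent lists its children; every child appears under exactly one parent.
TREE = {
    "Balance Sheet": ["Assets", "Liabilities", "Equity"],
    "Assets": ["Current Assets", "Long-term Assets"],
    "Liabilities": ["Current Liabilities", "Long-term Liabilities"],
    "Equity": ["Revenues", "Expenses"],
}


def find_path(target, node="Balance Sheet"):
    """Walk from the root down, parent to child, until the target is found."""
    if node == target:
        return [node]
    for child in TREE.get(node, []):
        path = find_path(target, child)
        if path:
            return [node] + path
    return None


path = find_path("Long-term Liabilities")
```

There is no shortcut to a record: every lookup starts at the root and descends through the parents.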
2. Network Structure
Eliminates the distinction of parent and child records. A parent can have many children and a child can have many parents.
Can model 1:1 (one-to-one), 1:* (one-to-many), and *:* (many-to-many) relationships.
Must know the physical structure of the data in order to access it.
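A many-to-many relationship can be sketched as links that are navigable in both directions. The suppliers and parts below are hypothetical, and the two dictionaries only stand in for the pointers a network database would maintain.

```python
from collections import defaultdict

# Each link connects one supplier to one part it provides.
links = [
    ("Supplier A", "Bolt"),
    ("Supplier A", "Nut"),
    ("Supplier B", "Bolt"),
]

parts_of = defaultdict(set)      # supplier -> parts it provides
suppliers_of = defaultdict(set)  # part -> suppliers that provide it
for supplier, part in links:
    parts_of[supplier].add(part)
    suppliers_of[part].add(supplier)
```

"Bolt" has two owners here, which a tree structure could not represent.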
3. Relational Structure
Relational databases organize and store data in two-dimensional tables consisting of rows and columns.
Relationships among tables are represented by common data values in different tables.
Straightforward in terms of organizing and searching the data.
Possesses ad hoc search capabilities.
3a. Selection
Produces a horizontal subset of a relation: the entire rows that satisfy a Boolean predicate.
Name Acct # Balance
John 123 35.75
Bill 205 3.95
Mary 707 7.95
Joe 127 4.05
Savings Table
Selection: Balance < 5.00 (Savings)
Name Acct # Balance
Bill 205 3.95
Joe 127 4.05
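The selection on the Savings table can be sketched in a few lines. The data is the slide's, while `select` and the dictionary field names are hypothetical.

```python
savings = [
    {"name": "John", "acct": 123, "balance": 35.75},
    {"name": "Bill", "acct": 205, "balance": 3.95},
    {"name": "Mary", "acct": 707, "balance": 7.95},
    {"name": "Joe", "acct": 127, "balance": 4.05},
]


def select(relation, predicate):
    """Keep whole rows for which the Boolean predicate is true."""
    return [row for row in relation if predicate(row)]


low_balances = select(savings, lambda row: row["balance"] < 5.00)
```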
3b. Projection
Constructs a vertical subset of a relation. The subset is obtained by selecting specified attributes and removing others.
Name Acct # Balance
John 123 35.75
Bill 205 3.95
Mary 707 7.95
Joe 127 4.05
Savings Table
Projection on Name of the selection Balance < 5.00 (Savings)
Name
Bill
Joe
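A sketch of projection applied to the result of a selection, using the slide's Savings data. The helper `project` is hypothetical. It also drops duplicate rows, as relational algebra does, although none arise in this data.

```python
savings = [
    {"name": "John", "acct": 123, "balance": 35.75},
    {"name": "Bill", "acct": 205, "balance": 3.95},
    {"name": "Mary", "acct": 707, "balance": 7.95},
    {"name": "Joe", "acct": 127, "balance": 4.05},
]


def project(relation, attributes):
    """Keep only the named attributes, removing any duplicate rows."""
    seen, result = set(), []
    for row in relation:
        reduced = tuple(row[a] for a in attributes)
        if reduced not in seen:
            seen.add(reduced)
            result.append(dict(zip(attributes, reduced)))
    return result


# Selection first (balance under 5.00), then projection onto the name column.
selected = [row for row in savings if row["balance"] < 5.00]
names = project(selected, ["name"])
```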
3c. Join
A join is used to combine two tables. The attribute used to join must be in both tables.
A B C
a1 b1 c1
a2 b2 c2
a2 b2 c3
a4 b2 c2
Table R
C D E
c2 d1 e1
c3 d2 e3
c2 d1 e2
Table S
A B C D E
a2 b2 c2 d1 e1
a2 b2 c2 d1 e2
a2 b2 c3 d2 e3
a4 b2 c2 d1 e1
a4 b2 c2 d1 e2
R |X| S
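The join of R and S on the shared attribute C can be sketched as follows. The rows are the slide's, and the function name `natural_join_on_c` is hypothetical.

```python
R = [("a1", "b1", "c1"), ("a2", "b2", "c2"), ("a2", "b2", "c3"), ("a4", "b2", "c2")]
S = [("c2", "d1", "e1"), ("c3", "d2", "e3"), ("c2", "d1", "e2")]


def natural_join_on_c(r_rows, s_rows):
    """Pair every R row with every S row that has the same C value."""
    s_by_c = {}
    for c, d, e in s_rows:
        s_by_c.setdefault(c, []).append((d, e))
    return [
        (a, b, c, d, e)
        for a, b, c in r_rows
        for d, e in s_by_c.get(c, [])
    ]


joined = natural_join_on_c(R, S)
```

The row with c1 has no partner in S, so it does not appear in the result.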
D. The Physical Structure
1. Sequential Access
2. Indexed Files
1. Sequential Access
Records can only be accessed in a predefined sequence. For example, if there are 100 records in a file, one must access the first 99 records before accessing the last record.
Generally useful for batch processing when nearly all records must be accessed.
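The 100-record example can be sketched with an in-memory stream standing in for a sequential file. The record contents and the helper `read_nth_record` are hypothetical.

```python
import io


def read_nth_record(stream, n):
    """Return the n-th record (1-based) and how many records were read."""
    reads = 0
    for line in stream:
        reads += 1
        if reads == n:
            return line.rstrip("\n"), reads
    raise IndexError(f"file has fewer than {n} records")


# A 100-record file, one record per line.
tape = io.StringIO("".join(f"record {i}\n" for i in range(1, 101)))
last_record, reads_needed = read_nth_record(tape, 100)
```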
2. Indexed Files
Any attribute can be extracted from the records in a primary file and used to build a new file whose purpose is to provide an index to the original file.
First, the index is searched to find a specified value of an attribute such as a customer account number.
Second, the disk addresses stored with that index entry are used to directly retrieve the desired records.
See Fig. 12-13 on p. 427.
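The two-step lookup can be sketched as follows. List positions stand in for disk addresses, which is an assumption of this example, and the records and helper names are hypothetical.

```python
# Primary file: the list position of each record plays the role of its address.
primary_file = [
    {"account_no": 2110, "customer": "Young"},
    {"account_no": 1110, "customer": "Adams"},
    {"account_no": 2105, "customer": "Jones"},
    {"account_no": 1500, "customer": "Jones"},
]


def build_index(records, attribute):
    """Build a separate index: attribute value -> list of record addresses."""
    index = {}
    for address, record in enumerate(records):
        index.setdefault(record[attribute], []).append(address)
    return index


def lookup(records, index, value):
    addresses = index.get(value, [])         # step 1: search the index
    return [records[a] for a in addresses]   # step 2: retrieve by address


account_index = build_index(primary_file, "account_no")
customer_index = build_index(primary_file, "customer")
jones_records = lookup(primary_file, customer_index, "Jones")
```

Because any attribute can be indexed, the same primary file supports lookups by account number and by customer name.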
IV. The Database Administrator
The database administrator is a person who coordinates data management activities such as approving the physical contents and user views of the database.
This is not quite what we
need.
Summary
Fields and keys
Three Components of a DBMS
Three Levels of Database Architecture
The Database Administrator