CS 435 Database Systems
Chapter 1
An Overview of Database Management
What is a database?
“An electronic filing cabinet”
“A repository for a collection of computerized data files”
A collection of interrelated data.
“Descriptions”--not definitions
So, how is that different from a file?
File processing systems
• Independent systems
• Each has its own definition of data
• Each has its own data formats
Faculty Data File -> Payroll System -> Reports
Class Data File -> Class Scheduling System -> Reports
Student Data File -> Grade Posting System -> Reports
Problems of inconsistency.
May need faculty member name in each file. May be recorded differently in each.
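The inconsistency problem can be sketched in a few lines of Python. This is only an illustration: the two "files," their field names, and the faculty member shown are hypothetical, not taken from any real system.

```python
# Two independent "files," each with its own format for the same person.
# All records and field names here are hypothetical illustrations.
payroll_file = [{"emp_name": "Smith, J. A.", "salary_grade": 7}]
schedule_file = [{"instructor": "J. Smith", "course": "CS 435"}]


def names_agree(payroll, schedule):
    """Return True only if every instructor name also appears,
    spelled identically, in the payroll file."""
    payroll_names = {rec["emp_name"] for rec in payroll}
    return all(rec["instructor"] in payroll_names for rec in schedule)


# Same person, recorded differently in each file, so the check fails.
consistent = names_agree(payroll_file, schedule_file)
print(consistent)
```

Nothing ties the two files together, so neither system can detect that the names have drifted apart.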
Database systems
• A single data definition
• All data (potentially) accessible from each application
• Less paperwork exchange between applications
Faculty Data
Class Data
Student Data
Data Definition
Database Management System
Payroll System
Reports
Class Scheduling System
Reports
Grade Posting System
Reports
So, a database is a collection of "files," or at least a collection of data that would otherwise usually exist in multiple files.
What is a database management system?
The software that makes it possible for multiple applications and multiple users to access the same (single) set of data.
The software that enables users to access and share the single set of integrated data without concern about files and file structure.
What is a database system?
• The data
• The database software (database management system)
• The other software (applications)
• The hardware where the data and software reside (and execute)
• The users who use the system
A database system is the collection of data, the software that provides access to that data, and the hardware on which the data and software reside and execute.
To that we can also add the users. They are also part of the "system."
An example of a database and some “database operations.”
(The CELLAR example)
http://www.csupomona.edu/~hnriley/web_mysql/cellar.html
+-----+----------------+---------------+------+---------+-------+
| bin | wine           | producer      | year | bottles | ready |
+-----+----------------+---------------+------+---------+-------+
|   2 | Chardonnay     | Buena Vista   | 2001 |       1 |  2003 |
|   3 | Chardonnay     | Geyser Peak   | 2001 |       5 |  2003 |
|   6 | Chardonnay     | Simi          | 2000 |       4 |  2002 |
|  12 | Joh. Riesling  | Jekel         | 2002 |       1 |  2003 |
|  21 | Fume Blanc     | Ch. St. Jean  | 2001 |       4 |  2003 |
|  22 | Fume Blanc     | Robt. Mondavi | 2000 |       2 |  2002 |
|  30 | Gewurztraminer | Ch. St. Jean  | 2002 |       3 |  2003 |
|  43 | Cab. Sauvignon | Windsor       | 1995 |      12 |  2004 |
|  45 | Cab. Sauvignon | Geyser Peak   | 1998 |      12 |  2006 |
|  48 | Cab. Sauvignon | Robt. Mondavi | 1997 |      12 |  2008 |
|  50 | Pinot Noir     | Gary Farrell  | 2000 |       3 |  2003 |
|  51 | Pinot Noir     | Fetzer        | 1997 |       3 |  2004 |
|  52 | Pinot Noir     | Dehlinger     | 1999 |       2 |  2002 |
|  58 | Merlot         | Clos du Bois  | 1998 |       9 |  2004 |
|  64 | Zinfandel      | Cline         | 1998 |       9 |  2007 |
|  72 | Zinfandel      | Rafanelli     | 1999 |       2 |  2007 |
+-----+----------------+---------------+------+---------+-------+
Date’s “CELLAR” Example
Retrieval:
select wine, bin_num, producer
from Cellar
where ready = 2004;
Result:
+----------------+-----+--------------+
| wine           | bin | producer     |
+----------------+-----+--------------+
| Cab. Sauvignon |  43 | Windsor      |
| Pinot Noir     |  51 | Fetzer       |
| Merlot         |  58 | Clos du Bois |
+----------------+-----+--------------+
3 rows in set (0.00 sec)
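The retrieval can be tried without a MySQL server. The sketch below is a minimal stand-in, assuming Python's built-in sqlite3 module in place of MySQL and a column named bin_num (as written in the queries; the listings show it under the heading bin). The ready year 2004 is used because it is the value that selects the three rows shown in the result.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server of the slides (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Cellar (bin_num integer primary key, wine text, "
    "producer text, year integer, bottles integer, ready integer)"
)
cellar_rows = [
    (2, "Chardonnay", "Buena Vista", 2001, 1, 2003),
    (3, "Chardonnay", "Geyser Peak", 2001, 5, 2003),
    (6, "Chardonnay", "Simi", 2000, 4, 2002),
    (12, "Joh. Riesling", "Jekel", 2002, 1, 2003),
    (21, "Fume Blanc", "Ch. St. Jean", 2001, 4, 2003),
    (22, "Fume Blanc", "Robt. Mondavi", 2000, 2, 2002),
    (30, "Gewurztraminer", "Ch. St. Jean", 2002, 3, 2003),
    (43, "Cab. Sauvignon", "Windsor", 1995, 12, 2004),
    (45, "Cab. Sauvignon", "Geyser Peak", 1998, 12, 2006),
    (48, "Cab. Sauvignon", "Robt. Mondavi", 1997, 12, 2008),
    (50, "Pinot Noir", "Gary Farrell", 2000, 3, 2003),
    (51, "Pinot Noir", "Fetzer", 1997, 3, 2004),
    (52, "Pinot Noir", "Dehlinger", 1999, 2, 2002),
    (58, "Merlot", "Clos du Bois", 1998, 9, 2004),
    (64, "Zinfandel", "Cline", 1998, 9, 2007),
    (72, "Zinfandel", "Rafanelli", 1999, 2, 2007),
]
conn.executemany("insert into Cellar values (?, ?, ?, ?, ?, ?)", cellar_rows)

# The retrieval: which wines are ready to drink in 2004?
ready_2004 = conn.execute(
    "select wine, bin_num, producer from Cellar where ready = 2004"
).fetchall()
for row in ready_2004:
    print(row)
```

The query returns a list of rows, which is itself a small table with three columns.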
Inserting new data:
insert
into Cellar
values (53, 'Pinot Noir', 'Saintsbury', 2003, 6, 2008);
Changing existing data:
update Cellar
set bottles = 4
where bin_num = 3;
Deleting existing data:
delete
from Cellar
where bin_num = 2;
+-----+----------------+---------------+------+---------+-------+
| bin | wine           | producer      | year | bottles | ready |
+-----+----------------+---------------+------+---------+-------+
|   3 | Chardonnay     | Geyser Peak   | 2001 |       4 |  2003 |
|   6 | Chardonnay     | Simi          | 2000 |       4 |  2002 |
|  12 | Joh. Riesling  | Jekel         | 2002 |       1 |  2003 |
|  21 | Fume Blanc     | Ch. St. Jean  | 2001 |       4 |  2003 |
|  22 | Fume Blanc     | Robt. Mondavi | 2000 |       2 |  2002 |
|  30 | Gewurztraminer | Ch. St. Jean  | 2002 |       3 |  2003 |
|  43 | Cab. Sauvignon | Windsor       | 1995 |      12 |  2004 |
|  45 | Cab. Sauvignon | Geyser Peak   | 1998 |      12 |  2006 |
|  48 | Cab. Sauvignon | Robt. Mondavi | 1997 |      12 |  2008 |
|  50 | Pinot Noir     | Gary Farrell  | 2000 |       3 |  2003 |
|  51 | Pinot Noir     | Fetzer        | 1997 |       3 |  2004 |
|  52 | Pinot Noir     | Dehlinger     | 1999 |       2 |  2002 |
|  58 | Merlot         | Clos du Bois  | 1998 |       9 |  2004 |
|  64 | Zinfandel      | Cline         | 1998 |       9 |  2007 |
|  72 | Zinfandel      | Rafanelli     | 1999 |       2 |  2007 |
|  53 | Pinot Noir     | Saintsbury    | 2003 |       6 |  2008 |
+-----+----------------+---------------+------+---------+-------+
bottles for bin 3 changed from 5 to 4
row for bin 2 deleted
new row inserted
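The three changes can be replayed together in a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, a column named bin_num as in the statements above, and only the first three rows of the table, to keep the example short.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); only three of the
# original rows are loaded to keep the sketch short.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Cellar (bin_num integer primary key, wine text, "
    "producer text, year integer, bottles integer, ready integer)"
)
conn.executemany(
    "insert into Cellar values (?, ?, ?, ?, ?, ?)",
    [
        (2, "Chardonnay", "Buena Vista", 2001, 1, 2003),
        (3, "Chardonnay", "Geyser Peak", 2001, 5, 2003),
        (6, "Chardonnay", "Simi", 2000, 4, 2002),
    ],
)

# Insert a new row, change an existing row, delete an existing row.
conn.execute(
    "insert into Cellar values (53, 'Pinot Noir', 'Saintsbury', 2003, 6, 2008)"
)
conn.execute("update Cellar set bottles = 4 where bin_num = 3")
conn.execute("delete from Cellar where bin_num = 2")

after = conn.execute(
    "select bin_num, bottles from Cellar order by bin_num"
).fetchall()
print(after)
```

Afterwards bin 2 is gone, bin 3 holds 4 bottles, and bin 53 is present, matching the three annotations on the table.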
Note that the “CELLAR database” looks like a “table,” and in fact, that is what it is.
In particular it is a relational table, or just a "relation."
Aside regarding tables…
and, “looks like a table.”
“Tables”
rows
column “headings”
columns
Columns are aligned:
i.e., strings left justified
numbers right justified
+-----+----------------+---------------+------+---------+-------+
| bin | wine           | producer      | year | bottles | ready |
+-----+----------------+---------------+------+---------+-------+
|   2 | Chardonnay     | Buena Vista   | 2001 |       1 |  2003 |
|   3 | Chardonnay     | Geyser Peak   | 2001 |       5 |  2003 |
|   6 | Chardonnay     | Simi          | 2000 |       4 |  2002 |
|  12 | Joh. Riesling  | Jekel         | 2002 |       1 |  2003 |
|  21 | Fume Blanc     | Ch. St. Jean  | 2001 |       4 |  2003 |
|  22 | Fume Blanc     | Robt. Mondavi | 2000 |       2 |  2002 |
|  30 | Gewurztraminer | Ch. St. Jean  | 2002 |       3 |  2003 |
|  43 | Cab. Sauvignon | Windsor       | 1995 |      12 |  2004 |
|  45 | Cab. Sauvignon | Geyser Peak   | 1998 |      12 |  2006 |
|  48 | Cab. Sauvignon | Robt. Mondavi | 1997 |      12 |  2008 |
|  50 | Pinot Noir     | Gary Farrell  | 2000 |       3 |  2003 |
|  51 | Pinot Noir     | Fetzer        | 1997 |       3 |  2004 |
|  52 | Pinot Noir     | Dehlinger     | 1999 |       2 |  2002 |
|  58 | Merlot         | Clos du Bois  | 1998 |       9 |  2004 |
|  64 | Zinfandel      | Cline         | 1998 |       9 |  2007 |
|  72 | Zinfandel      | Rafanelli     | 1999 |       2 |  2007 |
+-----+----------------+---------------+------+---------+-------+
Separating lines provided by MySQL
Separating lines provided by textbook publisher
bin  wine            producer       year  bottles  ready
  2  Chardonnay      Buena Vista    2001        1   2003
  3  Chardonnay      Geyser Peak    2001        5   2003
  6  Chardonnay      Simi           2000        4   2002
 12  Joh. Riesling   Jekel          2002        1   2003
 21  Fume Blanc      Ch. St. Jean   2001        4   2003
 22  Fume Blanc      Robt. Mondavi  2000        2   2002
 30  Gewurztraminer  Ch. St. Jean   2002        3   2003
 43  Cab. Sauvignon  Windsor        1995       12   2004
 45  Cab. Sauvignon  Geyser Peak    1998       12   2006
 48  Cab. Sauvignon  Robt. Mondavi  1997       12   2008
 50  Pinot Noir      Gary Farrell   2000        3   2003
 51  Pinot Noir      Fetzer         1997        3   2004
 52  Pinot Noir      Dehlinger      1999        2   2002
 58  Merlot          Clos du Bois   1998        9   2004
 64  Zinfandel       Cline          1998        9   2007
 72  Zinfandel       Rafanelli      1999        2   2007
No separating lines
So, a database is usually said to consist of tables rather than files.
The rows of the tables would be the "records" of a file.
The columns of the table are the "fields" of those records.
Note that the "database," the collection of tables, is a logical concept, a data structure.
The database software (the database manager, the DBMS) provides the mapping of the logical database into one or more logical files, and ultimately into a physical representation on disk.
Thus, there are stored files, stored records, and stored fields.
(Sometimes the DBMS shares this mapping with the operating system)
Parts
+------+-------+-------+--------+--------+
| pnum | pname | color | weight | city   |
+------+-------+-------+--------+--------+
| P1   | Nut   | Red   |   12.0 | London |
| P2   | Bolt  | Green |   17.0 | Paris  |
| P3   | Screw | Blue  |   17.0 | Rome   |
| P4   | Screw | Red   |   14.0 | London |
| P5   | Cam   | Blue  |   12.0 | Paris  |
| P6   | Cog   | Red   |   19.0 | London |
+------+-------+-------+--------+--------+
Say this parts table is stored in the database as a file
The stored database holds the “Parts” stored file along with other stored files.
“Parts” stored file:
P1  Nut   Red    12.0
P2  Bolt  Green  17.0
These are two occurrences of the “part” stored record type. Each value within them is a stored field occurrence.
…and the table rows become records in the file, and the columns become fields within each record.
But, for example, the data for a part (a table row):
P1 Nut Red 12.0
might be stored as two records:
P1 Nut Red
and
P1 12.0
The DBMS maps the Parts table to the “Parts” stored file.
Data independence--the immunity of applications to change in physical representation.
The DBMS relieves the user of any concern about how the data is represented physically.
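Data independence can be illustrated with a small Python sketch. It is not how any particular DBMS works internally: the two store classes and the describe_part function are hypothetical names, standing in for two physical layouts of the same part data (one stored record per part, or two) and for an application that sees only the logical row.

```python
class SingleRecordStore:
    """Hypothetical layout 1: each part is kept as one stored record."""

    def __init__(self):
        self._parts = {"P1": ("Nut", "Red", 12.0), "P2": ("Bolt", "Green", 17.0)}

    def fetch(self, pnum):
        pname, color, weight = self._parts[pnum]
        return {"pnum": pnum, "pname": pname, "color": color, "weight": weight}


class SplitRecordStore:
    """Hypothetical layout 2: each part is kept as two stored records,
    one holding (name, color) and one holding the weight."""

    def __init__(self):
        self._descriptions = {"P1": ("Nut", "Red"), "P2": ("Bolt", "Green")}
        self._weights = {"P1": 12.0, "P2": 17.0}

    def fetch(self, pnum):
        pname, color = self._descriptions[pnum]
        weight = self._weights[pnum]
        return {"pnum": pnum, "pname": pname, "color": color, "weight": weight}


def describe_part(store, pnum):
    """Application code: it sees only the logical row, never the layout."""
    row = store.fetch(pnum)
    return f"{row['pnum']} {row['pname']} {row['color']} {row['weight']}"


single = describe_part(SingleRecordStore(), "P1")
split = describe_part(SplitRecordStore(), "P1")
print(single)
print(split)
```

The application function is unchanged when the layout changes, which is the sense in which it is immune to the physical representation.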
Look again at the CELLAR example to see how this table relates to other tables that might exist.
http://www.csupomona.edu/~hnriley/web_mysql/cellar.html
Suppose we want to add some information about each wine, for example:
+-----+----------------+-------+---------------+------+-----+-------+
| bin | wine           | type  | producer      | year | qty | ready |
+-----+----------------+-------+---------------+------+-----+-------+
|   2 | Chardonnay     | white | Buena Vista   | 2001 |   1 |  2003 |
|   3 | Chardonnay     | white | Geyser Peak   | 2001 |   5 |  2003 |
|   6 | Chardonnay     | white | Simi          | 2000 |   4 |  2002 |
|  12 | Joh. Riesling  | white | Jekel         | 2002 |   1 |  2003 |
|  21 | Fume Blanc     | white | Ch. St. Jean  | 2001 |   4 |  2003 |
|  22 | Fume Blanc     | white | Robt. Mondavi | 2000 |   2 |  2002 |
|  30 | Gewurztraminer | white | Ch. St. Jean  | 2002 |   3 |  2003 |
|  43 | Cab. Sauvignon | red   | Windsor       | 1995 |  12 |  2004 |
|  45 | Cab. Sauvignon | red   | Geyser Peak   | 1998 |  12 |  2006 |
|  48 | Cab. Sauvignon | red   | Robt. Mondavi | 1997 |  12 |  2008 |
|  50 | Pinot Noir     | red   | Gary Farrell  | 2000 |   3 |  2003 |
|  51 | Pinot Noir     | red   | Fetzer        | 1997 |   3 |  2004 |
|  52 | Pinot Noir     | red   | Dehlinger     | 1999 |   2 |  2002 |
|  58 | Merlot         | red   | Clos du Bois  | 1998 |   9 |  2004 |
|  64 | Zinfandel      | red   | Cline         | 1998 |   9 |  2007 |
|  72 | Zinfandel      | red   | Rafanelli     | 1999 |   2 |  2007 |
+-----+----------------+-------+---------------+------+-----+-------+
“Redundancies”:
Chardonnay is white--3 times
Pinot Noir is red--3 times
So, more tables--for example:
• Wine: Name, Type, Description, Characteristic
• Producer: Name, Area, Appellation
Wines
+----------------+-----------+------------------+---------------------+
| wine_name      | wine_type | wine_description | wine_characteristic |
+----------------+-----------+------------------+---------------------+
| Chardonnay     | white     | dry              | buttery             |
| Joh. Riesling  | white     | semi-sweet       | fruity              |
| Fume Blanc     | white     | dry              | smoky               |
| Gewurztraminer | white     | semi-sweet       | spicy               |
| Cab. Sauvignon | red       | dry              | oaky                |
| Pinot Noir     | red       | dry              | fruity              |
| Merlot         | red       | dry              | plummy              |
| Zinfandel      | red       | dry              | spicy               |
+----------------+-----------+------------------+---------------------+
Note that this gives us the ability to describe a wine as "Red" in one place, rather than adding it to the CELLAR table and repeating it each time that wine appears.
This eliminates "redundancy."
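Keeping the wine type in one place does not stop us from seeing it next to each bin: the two tables can be combined when needed. The sketch below assumes Python's built-in sqlite3 module in place of MySQL, a bin_num column name as in the earlier queries, and only a few rows and columns of each table.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); both tables are
# cut down to a few rows and columns for brevity.
conn = sqlite3.connect(":memory:")
conn.execute("create table Cellar (bin_num integer primary key, wine text, producer text)")
conn.execute("create table Wines (wine_name text primary key, wine_type text)")
conn.executemany(
    "insert into Cellar values (?, ?, ?)",
    [
        (2, "Chardonnay", "Buena Vista"),
        (3, "Chardonnay", "Geyser Peak"),
        (50, "Pinot Noir", "Gary Farrell"),
    ],
)
# Each wine's type is recorded exactly once.
conn.executemany(
    "insert into Wines values (?, ?)",
    [("Chardonnay", "white"), ("Pinot Noir", "red")],
)

# Combine the two tables to show the type beside each bin.
with_type = conn.execute(
    "select c.bin_num, c.wine, w.wine_type "
    "from Cellar c join Wines w on w.wine_name = c.wine "
    "order by c.bin_num"
).fetchall()
print(with_type)
```

"Chardonnay is white" is stored once in Wines, yet it appears with both Chardonnay bins in the combined result.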
Similarly,
Producers
+--------------+-----------------+-------------+
| name         | area            | appellation |
+--------------+-----------------+-------------+
| Fetzer       | Hopland         | Mendocino   |
| Gary Farrell | Russian River V | Sonoma      |
| Geyser Peak  | Alexander Valle | Sonoma      |
| Jekel        | Arroyo Seco     | Monterey    |
| .            | .               | .           |
| .            | .               | .           |
| .            | .               | .           |
| .            | etc.            | .           |
+--------------+-----------------+-------------+
Thus the database (the collection of tables) is "integrated," i.e., the entirety of the data is formed by use of all of the tables.
The database: a collection of "files," or at least a collection of data that would otherwise usually exist in multiple files.
And, we can add
Maps to the wineries
Photographs of wineries, wine bottles
Recordings of our own “tasting notes”
Etc.
The data can also be "shared," by programs or by users.
(Single-user vs. multi-user systems.)
Persistent vs. Transient data
The persistent data. Once put into the database, it stays there until explicitly removed.
Transient or ephemeral data. Input, output, intermediate results.
Another definition:
A database is a collection of persistent data that is used by the application systems of some given enterprise.
What are applications? Application programs? Application systems?
What is an “enterprise”?
The Benefits of a Database
• Makes possible, supports, enhances
  – Rapid availability of current data
  – Reduced redundancy
  – Less inconsistency
  – Sharing of data among users, applications
  – Enforcement of standards
  – Enforcement of security
  – Maintenance of integrity
Entity Relationship Modeling
Entities: The “things” that we need to record data about.
Relationships: How these things are related to one another.
Entities: The “things” that we need to record data about.
People
Products
Places
Processes
Policies
Paper (documents)
Relationships: How these things are related to one another--connections between and among the “things” and their data;
Which people make which products
Which products are stored in which places
What places use what processes
What processes require what policies
the relationships
Entity: People
Entity: Products
Relationship: People make products. Products are made by people.
Relationships are bi-directional.
For example:
Given a person, find which products that person makes.
Given a product, find which people make that product.
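Both directions can be answered from a single record of the relationship. The sketch below assumes Python's built-in sqlite3 module and a hypothetical Makes table; the people and products in it are invented for illustration.

```python
import sqlite3

# A hypothetical "Makes" table records the relationship between the two
# entities; the names in it are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table Makes (person text, product text)")
conn.executemany(
    "insert into Makes values (?, ?)",
    [("Ann", "Nut"), ("Ann", "Bolt"), ("Bob", "Bolt")],
)


def products_made_by(person):
    """Given a person, find which products that person makes."""
    rows = conn.execute(
        "select product from Makes where person = ? order by product", (person,)
    )
    return [r[0] for r in rows]


def people_who_make(product):
    """Given a product, find which people make that product."""
    rows = conn.execute(
        "select person from Makes where product = ? order by person", (product,)
    )
    return [r[0] for r in rows]


print(products_made_by("Ann"))
print(people_who_make("Bolt"))
```

The same stored rows serve both questions, which is what makes the relationship bi-directional.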
binary relationship
ternary relationship
recursive relationship
A “horizontal,” or row, subset of the table CELLAR
A “vertical,” or column, subset
Operations on tables produce only tables.
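Row and column subsets, and the fact that each result is again a table, can be sketched as follows. The example assumes Python's built-in sqlite3 module in place of MySQL, a bin_num column name as in the earlier queries, three rows of the Cellar table, and a selection condition (year >= 2001) chosen only for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); three rows only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Cellar (bin_num integer primary key, wine text, "
    "producer text, year integer, bottles integer, ready integer)"
)
conn.executemany(
    "insert into Cellar values (?, ?, ?, ?, ?, ?)",
    [
        (2, "Chardonnay", "Buena Vista", 2001, 1, 2003),
        (6, "Chardonnay", "Simi", 2000, 4, 2002),
        (43, "Cab. Sauvignon", "Windsor", 1995, 12, 2004),
    ],
)

# "Horizontal" subset: some rows, all columns.
row_subset = conn.execute(
    "select * from Cellar where year >= 2001"
).fetchall()

# "Vertical" subset: all rows, some columns.
column_subset = conn.execute(
    "select wine, producer from Cellar order by bin_num"
).fetchall()

# Because a result is itself a table, another operation can be applied to it.
both = conn.execute(
    "select wine, producer from (select * from Cellar where year >= 2001)"
).fetchall()

print(row_subset)
print(column_subset)
print(both)
```

The last query takes the output of the row subset as its input, which only works because that output is a table.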
Relational Database Management Systems
DB2
Ingres
Informix
Microsoft SQL Server
Oracle
Sybase
Other Database Management System “Models”
(pre-relational)
Hierarchical (tree structure)
Network (graph structure)
Inverted list
Other (new) Approaches
(post-relational)
Deductive
Expert
Extendable
Object Oriented
Semantic
Universal Relation
DBMSs
People
Data Administrator
High level position
Responsible for defining the data to be maintained
Makes policy (regarding security, etc.)
Non-“technical”
Database Administrator
Creates the database
Implements the policies
IT professional
People
Application Programmers
Write the programs to maintain the database
and provide access to it
Need to know only the external view of the DB
End Users
Interact with the programs to enter data,
change data, generate reports
May not need to know anything about the DB
Application Programs
4GL Systems
Interfaces
Query Language Processor
Command Driven
Menu or Forms Driven
1.1 Define the following terms:
binary relationship            menu-driven interface
command-driven interface       multi-user system
concurrent access              online application
data administration            persistent data
database                       property
database system                query language
data independence              redundancy
DBA                            relationship
DBMS                           security
entity                         sharing
entity/relationship diagram    stored field
forms-driven interface         stored file
integration                    stored record
integrity                      transaction