SALEH database lect2
8/6/2019 SALEH database lect2
1/56
CS200 Database Systems
CS200 DATABASE SYSTEMS
LECTURE 1: INTRODUCTION
Instructor: ALLY S. Nyamawe, [email protected] +255 715 016 580
-
Cont...
chances are that our activities will involve someone or some computer program accessing a database. Even purchasing items from a supermarket nowadays in many cases involves an automatic update of the database that keeps the inventory of supermarket items.
These interactions are examples of what we may call traditional database applications, in which most of the information that is stored and accessed is either textual or numeric.
-
Database Definition
Databases and database technology are having a major impact on the growing use of computers. It is fair to say that databases play a critical role in almost all areas where computers are used, including business, electronic commerce, engineering, medicine, law, education, and library science.
Generally, a database is a collection of related data.
-
Cont...
By data, we mean known facts that can be recorded and that have implicit meaning. For example, consider the names, telephone numbers, and addresses of the people you know. You may have recorded this data in an indexed address book, or you may have stored it on a hard drive, using a personal computer and software such as Microsoft Access or Excel. This is a collection of related data with an implicit meaning and hence is a database.
-
Cont...
A database can also be defined as a collection of information organized in such a way that it can be accessed easily.
Examples
Tracking customer orders
Maintaining employee records
Maintaining student information
-
Database Properties
A database has the following implicit properties:
A database represents some aspect of the real world, sometimes called the miniworld or the universe of discourse. Changes to the miniworld are reflected in the database.
A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot correctly be referred to as a database.
A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.
-
History of Databases
Manual systems
File Processing Systems (FPS)
Database Management systems (DBMS)
-
Manual Systems
Structure
Information can be stored in a dedicated room or in separate offices.
The room or office will be furnished with shelves.
Different shelves will hold records for different subjects.
Records will be stored in hard flat files; each file will carry one record.
Each file will have a specific number to identify it.
A person will use the file number to retrieve the specific file (record).
-
Manual Systems
[Figure: User, File keeper, Files Cabinet]
-
File Processing Systems (FPS)
Information is stored as groups of records in separate files
File processing systems consisted of a few data files and many application programs
Each file is called a soft flat file
A flat file contains processed information for one specific function
Programming languages are used to write applications
Little flexibility
High maintenance
Many limitations
-
Limitations of File Processing Systems
Separate and isolated data.
Data redundancy.
Program-data interdependence involving file formats and access techniques.
Difficulty in representing data from the user's view.
Data inflexibility.
-
Database Management Systems (DBMS)
A program that allows users to define, create, manipulate, store, maintain, retrieve, and process the data in a database in order to produce meaningful information.
Focus on information representation
Data stored as records in various database files that can be combined to produce meaningful information for users
DBMS controls all functions of capturing, processing, storing, and retrieving data and generates various forms of data output
Manages access by multiple users and multiple programs to a common store of data
-
DBMS Overcomes All Limitations of FPS
Eliminates separation and isolation of data
Reduces data redundancy
Eliminates dependence between programs and data
Allows for representation of data from the user's view
Increases data flexibility
Superior flexibility and security over spreadsheet applications
-
Characteristics of a DBMS
Computerized record-keeping system
Contains facilities that allow the user to:
Add and delete files; insert, retrieve, update, and delete data
Collection of databases; each can be used for separate purposes or combined
Examples of DBMSs are SQL Server, MS Access, MySQL, and Oracle.
-
Functions and Uses of a DBMS
To store data
To organize data
To control access to data
To protect data
To provide decision support
To provide transaction processing
-
Components of a DBMS
users/programmers
application programs/queries
software to process queries/programs
software to access stored data
stored database definition (meta-data)
stored database
-
Architecture of a DBMS
[Figure: DBMS architecture with the labels user queries, schema modifications, modifications, query processor, transaction manager, storage manager, stored database definition (meta-data), and stored database]
-
Overview of DBMS Components
Stored Database and Meta-data: The stored database resides on secondary and tertiary devices. (At any given moment some portion of the database will also be mirrored in cache, but we will ignore this for the moment.)
Meta-data is data about data. In this case the meta-data is a description of the data components of the database: offsets of fields within records, typing information, schema information, index information, and so forth.
For a given database, a DBMS may maintain many different indices designed to provide fast access to random data.
-
Cont...
Storage Manager: In a simple database system, the storage manager is nothing more than the file system of the underlying OS. In larger systems, for the purposes of efficiency, the DBMS normally controls storage on the disk directly.
The storage manager consists of two basic components: (1) the buffer manager, and (2) the file manager.
-
Cont...
File Manager: Keeps track of the location of files on the disks and obtains the block or blocks containing a file on request from the buffer manager. Disks are typically blocked into regions of contiguous space ranging between 2^12 and 2^14 bytes (roughly 4,000 to 16,000 bytes per block).
Buffer Manager: Handles main memory. It obtains blocks of data from the disk, via the file manager, and chooses a page of main memory in which to store the block. The paging algorithm will determine how long a page will remain in main memory. However, the transaction manager can also force a page in main memory to be returned to disk.
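The interplay between the buffer manager, the paging algorithm, and a forced write can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not name a paging algorithm, so least-recently-used (LRU) replacement is assumed here, and the `BufferPool` class, its method names, and the dictionary standing in for a disk are all hypothetical.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # 2**12 bytes, the low end of the block sizes quoted above


class BufferPool:
    """Toy buffer manager using least-recently-used (LRU) replacement."""

    def __init__(self, read_block, write_block, capacity):
        self.read_block = read_block    # stands in for the file manager's read
        self.write_block = write_block  # stands in for the file manager's write
        self.capacity = capacity        # number of pages kept in main memory
        self.pages = OrderedDict()      # block number -> bytearray, oldest first
        self.dirty = set()              # blocks modified since they were loaded

    def get(self, block_no):
        """Return the in-memory page for a block, loading it if necessary."""
        if block_no in self.pages:
            self.pages.move_to_end(block_no)  # mark as most recently used
        else:
            if len(self.pages) >= self.capacity:
                victim, data = self.pages.popitem(last=False)  # evict LRU page
                self._write_back(victim, data)
            self.pages[block_no] = bytearray(self.read_block(block_no))
        return self.pages[block_no]

    def mark_dirty(self, block_no):
        self.dirty.add(block_no)

    def flush(self, block_no):
        """Force a page back to disk, as a transaction manager might request."""
        if block_no in self.pages:
            self._write_back(block_no, self.pages[block_no])

    def _write_back(self, block_no, data):
        if block_no in self.dirty:
            self.write_block(block_no, bytes(data))
            self.dirty.discard(block_no)


# A dictionary stands in for the disk (hypothetical, for illustration only).
disk = {n: bytes(BLOCK_SIZE) for n in range(4)}
pool = BufferPool(disk.__getitem__, disk.__setitem__, capacity=2)

pool.get(0)[0] = 7   # modify block 0 in memory only
pool.mark_dirty(0)
pool.get(1)
pool.get(2)          # pool is full, so block 0 is evicted and written back
```

A real DBMS would also pin pages that are in use and coordinate writes with its log; those details are left out of this sketch.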
-
Cont...
Query Manager: Turns a query or database manipulation, which may be expressed at a very high level (e.g., SQL), into a sequence of requests for stored data such as specific tuples of a relation or parts of an index to a relation.
Often the hardest part of query processing is query optimization, which involves the formulation of a good query execution strategy.
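One way to watch a query processor choose an execution strategy is to ask it for its plan. The sketch below uses SQLite through Python's standard `sqlite3` module, which is an assumption made for illustration (the lecture does not prescribe a DBMS); the `employees` table, its rows, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Asha", "IT"), ("Juma", "HR"), ("Neema", "IT")],
)

QUERY = "SELECT name FROM employees WHERE dept = ?"


def plan(conn, sql, params):
    """Return SQLite's textual description of its chosen execution strategy."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


plan_before = plan(conn, QUERY, ("IT",))  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_emp_dept ON employees (dept)")
plan_after = plan(conn, QUERY, ("IT",))   # same query, now answered via the index

print(plan_before)
print(plan_after)
```

The query text is identical in both calls; only the strategy picked by the optimizer changes once the index exists. The exact wording of the plan varies between SQLite versions.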
-
Cont...
Transaction Manager: There are certain guarantees that a DBMS must make when performing operations on a database. These guarantees are often referred to as the ACID properties.
Atomicity: all of a transaction is executed or none of it is executed.
Consistency: a transaction must leave the database in a consistent state.
Isolation: concurrent transactions must be isolated from each other both in effect and in visibility.
Durability: changes to the database caused by a transaction must not be lost even if the system fails immediately after the transaction completes.
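Atomicity and consistency can be illustrated with a small money transfer. This is a minimal sketch assuming SQLite via Python's standard `sqlite3` module; the `accounts` table, the account names, and the `transfer` helper are hypothetical. A CHECK constraint plays the role of the consistency rule, and the failed transfer is rolled back as a whole.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move an amount between two accounts as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected the debit


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

# The credit to bob succeeds first, then the debit from alice violates the
# constraint, so the whole transaction (including the credit) is undone.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed call both balances are unchanged, which is the "none of it is executed" half of atomicity. Isolation and durability are not exercised by this single-connection, in-memory example.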
-
Database Design
For the system to be acceptable to the end users, the database design activity is crucial.
A poorly designed database will generate errors that may lead to bad decisions being made, which may have serious repercussions for the organization. On the other hand, a well-designed database produces, in an efficient way, a system that provides the correct information for the decision-making process to succeed.
-
Roles in the Database Environment
Data and Database Administrators
The Data Administrator (DA) is responsible for the management of the data resource, including database planning; development and maintenance of standards, policies, and procedures; and conceptual/logical database design.
The Database Administrator (DBA) is responsible for the physical realization of the database, including physical database design and implementation, security and integrity control, maintenance of the operational system, and ensuring satisfactory performance of the applications for users. The role of the DBA is more technically oriented than that of the DA.
-
Database Administrator
A database administrator (DBA) controls and manages the database.
Functions of a DBA
Makes decisions concerning the content of the database
Plans storage structures and access strategies
Provides support to users
Defines security and integrity checks
Interprets backup and recovery strategies
-
Roles in the Database Environment (Cont.)
Database Designers
In large database design projects, we can distinguish between two types of designers: logical database designers and physical database designers.
Logical database designers are concerned with identifying the data (the entities and attributes), the relationships between the data, and the constraints on the data that will be stored in the database.
Physical database design is highly dependent on the target DBMS, and there may be more than one way of implementing a mechanism. The physical database designer must be fully aware of the functionality of the target DBMS.
-
Cont...
Application Developers
Once the database has been implemented, the application programs that provide the required functionality for the end users must be implemented. This is the responsibility of the application developers.
-
Cont...
End Users
End users are the clients for the database and can be broadly categorized into two groups based upon how they utilize the system.
Naive users are typically unaware of the DBMS. They access the database through specially written application programs which attempt to make the operations as simple as possible. They typically know nothing about the database or the DBMS.
Sophisticated users are familiar with the structure of the database and the facilities offered by the DBMS. They will typically use a high-level query language like SQL to perform their required operations and may even write their own application programs.
-
Advantages and Disadvantages of a DBMS
Advantages:
Centralized data reduces management problems
Data redundancy and consistency are controllable
Program-data interdependency is diminished
Flexibility of data is increased
More information from the same amount of data
Sharing of data
Improved data integrity
Improved security
Enforcement of standards
-
Cont...
Disadvantages:
Reduced data access speed
Requires special knowledge
Possible dependency of application programs on specific DBMS versions
-
More DBMS Advantages
control of data redundancy
economy of scale
data consistency
more information from the same amount of data
sharing of data
improved data integrity
improved data security
enforcement of standards
balance of conflicting requirements
improved data accessibility
increased productivity
improved maintenance
increased concurrency
improved backup and recovery
improved responsiveness
-
More DBMS Disadvantages
complexity
size
cost of DBMSs
additional hardware costs
cost of conversion
performance (specific cases)
higher impact of failure
-
Three Levels of Abstraction in a Database System
[Figure: three-level architecture]
external level: View 1, View 2, ..., View n (for user 1, user 2, ..., user n)
external to conceptual mapping
conceptual level: Conceptual Schema
conceptual to internal mapping
internal level: Internal Schema
physical data organization: db
-
The External Level
The external level is the user's view of the database. This level describes that part of the database which is relevant to each user.
The external level consists of a number of different external views of the database. Each user has a view of the real world represented in a form that is familiar for that user.
The external view includes only those entities, attributes, and relationships in the real world that the user is interested in. Other entities, attributes, and relationships may exist, but the user will be unaware that they even exist.
-
The External Level Cont...
It is often the case that different external views will have different representations of the same data.
Example: one view may represent dates in the form of (month, day, year) while another view may represent dates in the form of (day, month, year).
Some views may include derived or calculated data. This is data that is not actually stored in the database as such, but created when needed.
Example: one view may need to see a person's age. However, this is probably not a stored value in the database since it would require daily updates. Rather, it is probably derived from stored data representing the person's date of birth and the current date.
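Both examples on this slide can be sketched in Python: one stored date of birth, two presentation formats, and an age derived on demand. The function names and the sample date are hypothetical and exist only for illustration.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Derive a person's age from the stored date of birth and a given date."""
    years = today.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in this year.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def as_month_day_year(d):
    """One external view's representation of a date."""
    return f"{d.month}/{d.day}/{d.year}"


def as_day_month_year(d):
    """Another external view's representation of the same stored date."""
    return f"{d.day}/{d.month}/{d.year}"


dob = date(2000, 3, 15)  # the only value that is actually stored
```

Because the age is computed from `dob` whenever it is requested, nothing in storage has to be updated as time passes.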
-
The Conceptual Level
The conceptual level is the community view of the database. This level describes what data is stored in the database and the relationships among the data.
This is the level at which the logical structure of the entire database as seen by the DBA is contained. It represents a complete view of the data requirements of the organization that is independent of any storage considerations.
The conceptual level supports each external view, in that any data available to a user must be contained in, or derivable from, the conceptual level.
This level contains no storage-dependent details. For example, an entity may be defined as represented by an integer data type at this level, but the number of bytes it occupies is not specified at this level.
-
The Internal Level
The internal level is the physical representation of the database on the computer. This level describes how the data is stored in the database.
The internal level describes the physical implementation necessary to achieve optimal runtime performance and storage space utilization.
It covers the data structures and file organizations used to store the data on the storage devices.
It interfaces with the OS access methods (file management techniques for storing and retrieving data records) to place the data on the storage devices, build indexes, retrieve the data, and so on.
-
The Physical Level
Below the internal level is the physical level that may be managed by the OS under the direction of the DBMS.
The functions of the DBMS and the OS at the physical level are not clear cut and will vary from system to system.
Some DBMSs take advantage of many of the OS access methods, while others will use only the most basic ones and create their own file organizations.
The physical level below the DBMS consists of items only the OS knows, such as exactly how the sequencing is implemented and whether the fields of internal records are stored as contiguous bytes on the disk.
-
Data Independence
One of the major objectives of the three-level architecture is to provide data independence, which means that the upper levels are unaffected by changes to lower levels.
There are two types of data independence: logical and physical.
Logical data independence refers to the immunity of the external schemas to changes in the conceptual schema.
Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema.
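Both kinds of data independence can be approximated with a view in SQLite, used here through Python's standard `sqlite3` module as an assumed stand-in DBMS. The `student` table, the `student_names` view, and the index name are hypothetical. The view plays the role of an external schema, adding a column changes the conceptual schema, and adding an index changes the internal schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO student (name, phone) VALUES ('Asha', '555-0100')")

# External schema: this user only ever sees ids and names.
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")


def read_view(conn):
    """What an application written against the external view retrieves."""
    return conn.execute("SELECT id, name FROM student_names ORDER BY id").fetchall()


before = read_view(conn)

# Conceptual schema change: a new attribute is added to the table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
# Internal schema change: a new access path is added.
conn.execute("CREATE INDEX idx_student_name ON student (name)")

after = read_view(conn)
```

The application code in `read_view` is untouched by either change and returns the same rows before and after.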
-
Data Independence (cont.)
[Figure: the three-level architecture (external, conceptual, and internal levels above the physical data organization), annotated with logical data independence and physical data independence]
-
Database Languages
A data sublanguage consists of two parts: a Data Definition Language (DDL) and a Data Manipulation Language (DML). The DDL is used to specify the database schema and the DML is used to both read and update the database.
These languages are called data sublanguages because they do not include constructs for all computing needs such as conditional or iterative statements, which are provided by the high-level programming languages.
Most DBMSs have a facility for embedding the sublanguage in a high-level programming language such as COBOL, Pascal, C, C++, Java, or Visual Basic, which is then called the host language.
Most data sublanguages also provide a non-embedded or interactive version of the language to be input directly from a terminal.
-
Data Definition Language (DDL)
A Data Definition Language is a language that allows the DBA or user to describe and name the entities, attributes, and relationships required for the application, together with any associated integrity and security constraints.
The result of the compilation/execution of the DDL statements is a set of tables stored in special files collectively referred to as the system catalog. The system catalog is also commonly referred to as the data dictionary or data directory.
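As a small illustration, the sketch below runs two DDL statements and then reads the system catalog back. It assumes SQLite via Python's standard `sqlite3` module, where the catalog is exposed as the `sqlite_master` table; other DBMSs use different catalog names. The `department` and `employee` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: name the entities, their attributes, and some integrity constraints.
conn.execute("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    )
""")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES department (dept_id)
    )
""")

# The DBMS recorded both definitions in its catalog, which is itself queryable.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
```

The `sql` column of `sqlite_master` additionally stores the text of each CREATE statement, which is the meta-data the DBMS consults when processing later queries.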
-
Data Manipulation Language (DML)
A Data Manipulation Language is a language that provides a set of operations to support the basic data manipulation operations on the data held in the database.
DML operations usually include the following:
insertion of new data into the database
modification of data stored in the database
retrieval of data contained in the database
deletion of data from the database
The part of the DML that involves data retrieval is called a query language.
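The four operations listed above map directly onto SQL's INSERT, UPDATE, SELECT, and DELETE statements. Here is a minimal sketch using SQLite through Python's standard `sqlite3` module (an assumed DBMS); the `item` table and its rows are hypothetical, loosely following the supermarket inventory example from the introduction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER)")

# Insertion of new data.
conn.execute("INSERT INTO item (name, qty) VALUES ('soap', 10)")
conn.execute("INSERT INTO item (name, qty) VALUES ('rice', 25)")

# Modification of stored data: one bar of soap is sold.
conn.execute("UPDATE item SET qty = qty - 1 WHERE name = 'soap'")

# Deletion of data: rice is no longer stocked.
conn.execute("DELETE FROM item WHERE name = 'rice'")

# Retrieval: the query-language part of the DML.
remaining = conn.execute("SELECT name, qty FROM item").fetchall()
```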
-
DMLs (cont.)
DMLs are distinguished by their underlying retrieval constructs. We can distinguish two basic types of DMLs: procedural and non-procedural.
Procedural DMLs are languages in which the user informs the system what data is required and exactly how to retrieve that data.
Non-procedural DMLs are languages in which the user informs the system only of what data is required and leaves the how to retrieve the data entirely up to the system.
It is common for procedural DMLs to be embedded in high-level programming languages.
Procedural DMLs tend to be more focused on individual records while non-procedural DMLs tend to operate on sets of records.
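The contrast can be sketched by answering the same question twice: once by stepping through records one at a time in host-language code, and once by stating only what is wanted. SQLite via Python's standard `sqlite3` module is assumed; the `orders` table and its rows are hypothetical. The first half only imitates the record-at-a-time style, since SQL itself is non-procedural.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("Asha", 40), ("Juma", 15), ("Asha", 60)],
)

# Procedural style: the program spells out how to find the answer,
# visiting each record, testing it, and accumulating the result.
procedural_total = 0
for customer, total in conn.execute("SELECT customer, total FROM orders"):
    if customer == "Asha":
        procedural_total += total

# Non-procedural style: the program states what is required and the
# DBMS decides how to retrieve it, operating on the whole set of records.
(declarative_total,) = conn.execute(
    "SELECT SUM(total) FROM orders WHERE customer = ?", ("Asha",)
).fetchone()
```

Both approaches produce the same total, but only the second leaves the DBMS free to choose an access strategy such as an index lookup.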
-
Fourth Generation Languages
There is no consensus as to what constitutes a 4GL. In essence it is a shorthand programming language. What requires several hundred lines of code in a 3GL will require only a few lines of code in a 4GL.
3GLs are procedural while 4GLs are non-procedural.
4GLs include spreadsheets and database languages.
SQL and QBE are examples of 4GLs.
-
Types of Databases
Flat Databases
Relational Database
-
Flat Databases
A single kind of record with a fixed number of fields.
A way of organizing all information in a single table.
Suitable for extremely simple databases.
Inherent data redundancy.
-
Relational Database
Data stored in a collection of columns and rows called a table, or a relation
Tables may be electronically linked via a key field containing common data
Easy to add, delete, and modify the data and the table structures
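Linking tables through a key field can be sketched as follows, assuming SQLite via Python's standard `sqlite3` module; the `customer` and `purchase` tables and their rows are hypothetical. Each customer's name is stored once, and purchases refer to it through the shared `customer_id` field, which avoids the redundancy of a single flat table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id),
        item        TEXT
    )
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Asha"), (2, "Juma")])
conn.executemany(
    "INSERT INTO purchase (customer_id, item) VALUES (?, ?)",
    [(1, "soap"), (1, "rice"), (2, "tea")],
)

# The key field customer_id links the two tables in a join.
rows = conn.execute("""
    SELECT c.name, p.item
    FROM customer AS c
    JOIN purchase AS p ON p.customer_id = c.customer_id
    ORDER BY c.name, p.item
""").fetchall()
```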
-
Summary
In this lecture we defined a database as a collection of related data, where data means recorded facts. A typical database represents some aspect of the real world and is used for specific purposes by one or more groups of users.
A DBMS is a generalized software package for implementing and maintaining a computerized database. The database and software together form a database system. We identified several characteristics that distinguish the database approach from traditional file-processing applications.
We discussed database history and its types as well as its users (a DBA, DA, etc.).
-
Challenge
Discuss the capabilities that should be provided by a DBMS.
Discuss the main characteristics of the database and how it differs from traditional file systems.
Why would you choose a database system instead of simply storing data in operating system files? When would it make sense not to use a database system?
-
Questions