INF 280 Database Systems
BASIC CONCEPTS
D. Christozov / G.Tuparov INF 280 Database Systems: Basic Concepts 1
Typical software application
[Figure: a typical software application. The Interface passes requests to the Business Logic (Data Processing), which transforms interface input into data requests (SQL queries) sent to the Database and transforms the returned datasets into reports/forms.]
Basic Concepts - Topics
1. A database as a collection of related data
2. Database and Database Management System
3. Characteristics and advantages of DB approach
4. DB users
5. DB Architecture
6. DBMS Architecture
DB as a collection of related data (1)
• Data: facts that can be recorded and that have implicit meaning.
• Database implicit properties:
– A database represents some aspect of the real world, sometimes called the miniworld or the Universe of Discourse (UoD).
– A database is a logically coherent collection of data with some inherent meaning.
– A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some applications in which these users are interested.
DB as a collection of related data (2)
Basic characteristics
1. Self-Describing Nature of a Database System
2. Insulation between Programs and Data, Data Abstraction
3. Support of Multiple Views of the Data
4. Sharing of Data and Multiuser Transaction Processing
Basic characteristics (1)
Self-Describing Nature of a Database System
• The system catalogue contains information about the structure of each file, the type and storage format of each data item, and various constraints on the data. The information stored in the catalogue is called meta-data, and it describes the structure of the primary database.
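A minimal sketch of a self-describing database, assuming Python's standard sqlite3 module and SQLite's built-in catalogue table sqlite_master. The student table and its columns are hypothetical and serve only to give the catalogue something to describe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table, created only so the catalogue has something to describe.
conn.execute(
    "CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT NOT NULL, address TEXT)"
)

# The catalogue itself is queried like ordinary data: it holds the
# definition (meta-data) of every table in the database.
catalogue = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()
table_names = [name for name, _sql in catalogue]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(table_names)
print(columns)
```

The point of the sketch is that a program can discover the structure of the data from the database itself rather than hard-coding it.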
Basic characteristics (2)
Insulation between Programs and Data, and Data Abstraction
• The characteristic that allows program-data independence and program-operation independence is called data abstraction.
• A DBMS provides users with a conceptual representation of data that does not include many of the details of how the data is stored or how the operations are implemented. A data model (or logical data model) is a type of data abstraction used to provide this conceptual representation; it hides storage and implementation details.
Basic characteristics (3)
Support of Multiple Views of the Data
• A view may be a subset of the database, or it may contain virtual data that is derived from the database files but is not explicitly stored.
• Different categories of users need different views of the database.
• A single user may need to solve different problems with the database, and each problem may need a different view of the data.
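The two kinds of view named above (a subset, and derived data that is not stored) can be sketched with Python's standard sqlite3 module. The grade table, the view names cos480 and total_credits, and the column names are assumptions made for illustration; the rows mirror the sample student data used later in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade (student_id TEXT, course TEXT, credits INTEGER, grade TEXT)")
conn.executemany(
    "INSERT INTO grade VALUES (?, ?, ?, ?)",
    [
        ("000101", "COS 480", 3, "B-"),
        ("000101", "COS 221", 3, "B+"),
        ("000102", "COS 480", 3, "B+"),
    ],
)

# View 1: a subset of the stored data (one course only).
conn.execute(
    "CREATE VIEW cos480 AS "
    "SELECT student_id, grade FROM grade WHERE course = 'COS 480'"
)

# View 2: virtual data derived from the stored rows; the totals are
# computed on demand and are not stored anywhere.
conn.execute(
    "CREATE VIEW total_credits AS "
    "SELECT student_id, SUM(credits) AS credits FROM grade GROUP BY student_id"
)

subset = conn.execute("SELECT * FROM cos480 ORDER BY student_id").fetchall()
derived = conn.execute("SELECT * FROM total_credits ORDER BY student_id").fetchall()
print(subset)
print(derived)
```

An instructor might work with the first view and an advisor with the second, while both read the same stored table.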
Basic characteristics (4)
Sharing of Data and Multiuser Transaction Processing
• Multiple users may need to access the database simultaneously.
• The DBMS must include concurrency control software to ensure that several users trying to update the same data do so in a controlled manner, so that the result of the updates is correct.
• On-line transaction processing (OLTP) applications.
D. Christozov INF 280: Database Systems Concurrency Control 11
Transaction Processing and Concurrency Control (1)
Transaction: execution of a program that accesses and/or changes the content of a file.
Concurrency: concurrent execution of two or more transactions.
Concurrency Control: mechanisms to avoid failures, losses, etc. in concurrent execution of transactions.
Single vs. Multiuser/Multitasking Systems: Time Sharing
System Log: journal (file) that holds the history of changes to the state of a database.
Transaction Processing and Concurrency Control (2)
ACID Properties of Transaction:
A  Atomicity: a transaction is either performed in its entirety or not at all.
C  Consistency Preservation: a correct execution of a transaction takes the database from one consistent state to another consistent state.
I  Isolation: a transaction should not make its updates visible to other transactions until it is committed.
D  Durability: once a transaction changes the DB and the changes are committed, these changes must never be lost because of a subsequent failure.
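Atomicity can be illustrated with a small sketch using Python's standard sqlite3 module. The account table, the item names X and Y, the balances, and the CHECK constraint are hypothetical; the connection's context manager commits when the block succeeds and rolls back when an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 100), ("Y", 50)])
conn.commit()


def transfer(conn, amount):
    """Move `amount` from X to Y as one transaction; return True on commit."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'Y'", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'X'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 30)       # both updates are applied
failed = transfer(conn, 500)  # the debit violates CHECK, so the credit is undone too
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to Y has already been executed when the debit from X is rejected; the rollback removes it, so the transaction is performed either in its entirety or not at all.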
Transaction Processing and Concurrency Control (3)
Schedule: a schedule S of n transactions T1, T2, …, Tn is an ordering of the execution of the operations of the transactions. Operations of two transactions Ti and Tj can be interleaved.
Recoverability: the ability to recover from transaction failure. A schedule S is recoverable if no transaction T in S commits until all transactions T’ that have written an item that T reads have committed.
Serializability: the concurrent execution of transactions is equivalent to a serial execution. Types of schedules: serial, non-serial, and conflict schedules.
Protocols: sets of rules to guarantee “serializability”.
Transaction Processing and Concurrency Control (4)
Locking: prevents multiple transactions from accessing the same item concurrently.
Timestamps: uses a unique identifier for each transaction.
Multiversion: the system uses multiple versions of the same data item.
Optimistic: validation and certification of transactions.
Granularity: what portion of the DB (DB space) the data item represents: a field of a record, a record, a block, a file, or the whole database.
Transaction Processing and Concurrency Control (5)
Locks:
• Binary lock: two states (locked/unlocked) for each item.
• Shared/exclusive (read/write) lock: three states (read-locked, write-locked, unlocked).
• Two-phase locking: all locking operations precede the first unlock operation. First phase: expanding; second phase: shrinking. Variants: basic, conservative, and strict two-phase locking.
• Deadlock: each of two transactions is waiting for the other to unlock a given data item.
• Livelock: a transaction keeps waiting while other transactions continue.
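The two-phase rule can be checked mechanically. The following is a minimal sketch, assuming a transaction is written as a list of (action, item) pairs with the hypothetical action names read_lock, write_lock and unlock; it checks only the basic rule that no lock is acquired after the first unlock.

```python
def is_two_phase(operations):
    """Return True if no lock operation follows the first unlock operation."""
    shrinking = False  # becomes True once the first unlock has been seen
    for action, _item in operations:
        if action == "unlock":
            shrinking = True
        elif action in ("read_lock", "write_lock") and shrinking:
            return False  # a lock acquired in the shrinking phase breaks the rule
    return True


# Expanding phase (all locks) followed by shrinking phase (all unlocks).
t_ok = [("read_lock", "Y"), ("write_lock", "X"), ("unlock", "Y"), ("unlock", "X")]

# Y is released before X is locked, so the two phases overlap.
t_bad = [("read_lock", "Y"), ("unlock", "Y"), ("write_lock", "X"), ("unlock", "X")]

print(is_two_phase(t_ok), is_two_phase(t_bad))
```

The sketch ignores reads, writes, and lock compatibility between transactions; it covers only the ordering of lock and unlock operations within a single transaction.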
Transaction Processing and Concurrency Control (6)
Timestamps: order transactions according to their timestamps.
Multiversion: keeps the old values when the item is updated.
Optimistic: no checking during execution of the transaction; all updates are applied to a local copy of the data item. After execution, a validation phase checks serializability.
Transaction Processing and Concurrency Control (7)
Testing schedules for serializability:
1. Only read_item and write_item operations are of interest.
2. The algorithm is based on constructing a precedence (serialization) graph for the schedule: a directed graph G = {N, E}, where N = {T1, T2, …, Tn} is the set of nodes and E = {e1, e2, …, em} is the set of edges.
There is one node for each transaction Ti. Each edge ei has the form (Tj → Tk), where Tj is the starting node and Tk is the ending node; it means that some operation in Tj appears in the schedule BEFORE a conflicting operation in Tk.
Transaction Processing and Concurrency Control (8)
Algorithm for testing “serializability” of a schedule S:
1. For each transaction Ti, create a node in a precedence graph G.
2. If in S Tj:read_item(X) is after Ti:write_item(X), create an edge (Ti → Tj).
3. If in S Tj:write_item(X) is after Ti:read_item(X), create an edge (Ti → Tj).
4. If in S Tj:write_item(X) is after Ti:write_item(X), create an edge (Ti → Tj).
The schedule S is serializable if and only if G has no cycles.
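The precedence-graph test can be sketched in Python as follows. The representation of a schedule as a list of (transaction, operation, item) triples and the function names are assumptions made for this example; the test is the conflict-serializability check described by the four steps, with cycle detection done by depth-first search.

```python
from itertools import combinations


def precedence_edges(schedule):
    """Build the edge set from (transaction, operation, item) triples.

    `operation` is "read" or "write". An edge (Ti, Tj) is added when an
    operation of Ti precedes a conflicting operation of Tj on the same item.
    """
    edges = set()
    # combinations() keeps schedule order, so the first triple is the earlier one.
    for (ti, op_i, item_i), (tj, op_j, item_j) in combinations(schedule, 2):
        if ti != tj and item_i == item_j and "write" in (op_i, op_j):
            edges.add((ti, tj))
    return edges


def has_cycle(nodes, edges):
    """Depth-first search for a cycle in a directed graph."""
    successors = {n: [] for n in nodes}
    for a, b in edges:
        successors[a].append(b)
    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on current path, 2 = finished

    def visit(n):
        state[n] = 1
        for m in successors[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in nodes)


def is_conflict_serializable(schedule):
    nodes = {txn for txn, _op, _item in schedule}
    return not has_cycle(nodes, precedence_edges(schedule))


# T1 runs to completion, then T2 (a serial schedule).
serial = [
    ("T1", "read", "X"), ("T1", "write", "X"),
    ("T1", "read", "Y"), ("T1", "write", "Y"),
    ("T2", "read", "X"), ("T2", "write", "X"),
]
# T2 reads X before T1 writes it, and writes X after T1 does.
interleaved = [
    ("T1", "read", "X"), ("T2", "read", "X"),
    ("T1", "write", "X"), ("T1", "read", "Y"),
    ("T2", "write", "X"), ("T1", "write", "Y"),
]
print(is_conflict_serializable(serial), is_conflict_serializable(interleaved))
```

For the serial schedule the only edge is T1 → T2; the interleaved schedule produces both T1 → T2 and T2 → T1, which form a cycle on item X.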
Transaction Processing and Concurrency Control (9)
Examples: Serial Schedules
T1                      T2
Read_item(X)
X := X - N
Write_item(X)
Read_item(Y)
Y := Y + N
Write_item(Y)
                        Read_item(X)
                        X := X + M
                        Write_item(X)
Precedence graph: T1 → T2 (no cycle)
Transaction Processing and Concurrency Control (10)
Examples: Non Serial Schedules
T1                      T2
Read_item(X)
X := X - N
                        Read_item(X)
                        X := X + M
Write_item(X)
Read_item(Y)
                        Write_item(X)
Y := Y + N
Write_item(Y)
Precedence graph: T1 → T2 (X), T2 → T1 (X)
Cycle: {X}
Transaction Processing and Concurrency Control (11)
Lost update problem: the two transactions access and update the same DB item simultaneously.
Transactions:
T1: Read_item(X); X:=X-N; Write_item(X); Read_item(Y); Y:=Y+N; Write_item(Y);
T2: Read_item(X); X:=X+M; Write_item(X);
Schedule:
T1                              T2
Read_item(X); X:=X-N;
                                Read_item(X); X:=X+M;
Write_item(X); Read_item(Y);
                                Write_item(X);
Y:=Y+N; Write_item(Y);
T2 writes a value of X computed from what it read before T1's write, so T1's update of X is lost.
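The lost update can be reproduced with a small simulation. This is a sketch under stated assumptions: the run function, the step format (transaction, operation, item, amount), and the starting values X = 80, Y = 20, N = 5, M = 4 are all hypothetical and chosen only to make the effect visible.

```python
def run(schedule, db):
    """Execute (txn, op, item, amount) steps and return the final database.

    "read" copies an item into the transaction's private workspace,
    "add" changes the private copy, and "write" copies it back to the database.
    """
    db = dict(db)  # work on a copy so the starting state can be reused
    workspace = {}
    for txn, op, item, amount in schedule:
        local = workspace.setdefault(txn, {})
        if op == "read":
            local[item] = db[item]
        elif op == "add":
            local[item] += amount
        elif op == "write":
            db[item] = local[item]
    return db


N, M = 5, 4                  # hypothetical amounts
start = {"X": 80, "Y": 20}   # hypothetical starting values

serial = [
    ("T1", "read", "X", 0), ("T1", "add", "X", -N), ("T1", "write", "X", 0),
    ("T1", "read", "Y", 0), ("T1", "add", "Y", N), ("T1", "write", "Y", 0),
    ("T2", "read", "X", 0), ("T2", "add", "X", M), ("T2", "write", "X", 0),
]
interleaved = [
    ("T1", "read", "X", 0), ("T1", "add", "X", -N),
    ("T2", "read", "X", 0), ("T2", "add", "X", M),
    ("T1", "write", "X", 0), ("T1", "read", "Y", 0),
    ("T2", "write", "X", 0),
    ("T1", "add", "Y", N), ("T1", "write", "Y", 0),
]

print(run(serial, start))
print(run(interleaved, start))
```

With these values the serial schedule ends with X = 79, while the interleaved schedule ends with X = 84: the subtraction of N by T1 has been overwritten by T2.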
Transaction Processing and Concurrency Control (11)
Dirty Read (temporary update problem): one transaction updates an item and then fails before correctly updating item Y; meanwhile, another transaction has used the already updated item.
Transactions:
T1: Read_item(X); X:=X-N; Write_item(X); Read_item(Y); Y:=Y+N; Write_item(Y);
T2: Read_item(X); X:=X+M; Write_item(X);
Schedule:
T1                                      T2
Read_item(X); X:=X-N; Write_item(X);
                                        Read_item(X); X:=X+M; Write_item(X);
Read_item(Y); failure
T2 has read the value of X written by T1, which must be undone when T1 fails.
Transaction Processing and Concurrency Control (12)
Incorrect summary problem: one transaction calculates an aggregate function while another updates the same records.
Transactions:
T1: Read_item(X); X:=X-N; Write_item(X); Read_item(Y); Y:=Y+N; Write_item(Y);
T2: Read_item(A); Sum:=Sum+A; Read_item(X); Sum:=Sum+X; Read_item(Y); Sum:=Sum+Y;
Schedule:
T1                                      T2
                                        Read_item(A); Sum:=Sum+A;
Read_item(X); X:=X-N; Write_item(X);
                                        Read_item(X); Sum:=Sum+X; Read_item(Y); Sum:=Sum+Y;
Read_item(Y); Y:=Y+N; Write_item(Y);
T2 reads X after N has been subtracted and Y before N has been added, so the sum is off by N.
Advantages of Using DBMS
• Controlling Redundancy (reducing)
• Preserving Data Integrity
• Restricting Unauthorized Access
• Providing Persistent Storage for Program Objects and Data Structures (Object-Oriented DB)
• Permitting Inferencing and Actions Using Rules
• Providing Multiple User Interfaces
• Representing Complex Relationships Among Data
• Providing Backup and Recovery
Redundant Data
Id#     Name             Address   Code     Title      Cr.  Instructor  Section  Grade
000101  Ivan Ivanov      Scapto 1  COS 480  DB System  3    Christozov  A        B-
000101  Ivan Ivanov      Scapto 1  COS 221  FDS        3    Christozov  B        B+
000101  Ivan Ivanov      Scapto 1  AUB 102  Writing    3    Colman      C        D+
000102  Georgi Georgiev  Scapto 2  COS 480  DB System  3    Christozov  A        B+
000102  Georgi Georgiev  Scapto 2  AUB 102  Writing    3    Colman      C        C+
Column groups: student’s information (Id#, Name, Address); course information (Code, Title, Cr., Instructor, Section); grade information (Grade).
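To see how much of the table above is repeated, the rows can be split into the three kinds of information. The sketch below is an illustration only: the variable names are hypothetical, and treating Section as a course attribute is an assumption.

```python
rows = [
    ("000101", "Ivan Ivanov", "Scapto 1", "COS 480", "DB System", 3, "Christozov", "A", "B-"),
    ("000101", "Ivan Ivanov", "Scapto 1", "COS 221", "FDS", 3, "Christozov", "B", "B+"),
    ("000101", "Ivan Ivanov", "Scapto 1", "AUB 102", "Writing", 3, "Colman", "C", "D+"),
    ("000102", "Georgi Georgiev", "Scapto 2", "COS 480", "DB System", 3, "Christozov", "A", "B+"),
    ("000102", "Georgi Georgiev", "Scapto 2", "AUB 102", "Writing", 3, "Colman", "C", "C+"),
]

students = {}  # Id# -> (Name, Address), stored once per student
courses = {}   # Code -> (Title, Cr., Instructor, Section), stored once per course
grades = []    # (Id#, Code, Grade), one entry per enrolment

for sid, name, address, code, title, credits, instructor, section, grade in rows:
    students[sid] = (name, address)
    courses[code] = (title, credits, instructor, section)
    grades.append((sid, code, grade))

print(len(students), len(courses), len(grades))
```

Five wide rows reduce to two student records, three course records, and five short grade records, so a change of address, for example, has to be made in only one place.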
Integrity
Grades
Id#     Name             Address   Code     Title      Cr.  Instructor  Section  Grade
000101  Ivan Ivanov      Scapto 1  COS 480  DB System  3    Christozov  A        B-
000101  Ivan Ivanov      Scapto 1  COS 221  FDS        3    Christozov  B        B+
000101  Ivan Ivanov      Scapto 1  AUB 102  Writing    3    Colman      C        D+
000102  Georgi Georgiev  Scapto 2  COS 480  DB System  3    Christozov  A        B+
000102  Georgi Georgiev  Scapto 2  AUB 102  Writing    3    Colman      C        C+
Faculty
Family Name  Given Name  Title             Office
Bonev        Stoyan      Assoc. Professor  221
Colman       Mark        Professor         231
Missing: Christozov is listed as an Instructor in Grades but has no row in Faculty.
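A DBMS can reject such a dangling reference automatically. The following is a minimal sketch using Python's standard sqlite3 module, assuming a build of SQLite with foreign-key support (the usual case); the table and column names are hypothetical, and foreign-key enforcement has to be switched on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute(
    "CREATE TABLE faculty (family_name TEXT PRIMARY KEY, "
    "given_name TEXT, title TEXT, office TEXT)"
)
conn.execute(
    "CREATE TABLE grade (student_id TEXT, code TEXT, "
    "instructor TEXT REFERENCES faculty(family_name), grade TEXT)"
)
conn.executemany(
    "INSERT INTO faculty VALUES (?, ?, ?, ?)",
    [("Bonev", "Stoyan", "Assoc. Professor", "221"),
     ("Colman", "Mark", "Professor", "231")],
)

# Colman exists in faculty, so this row is accepted.
conn.execute("INSERT INTO grade VALUES ('000101', 'AUB 102', 'Colman', 'D+')")

# Christozov has no faculty row, so the DBMS refuses the insert.
try:
    conn.execute("INSERT INTO grade VALUES ('000101', 'COS 480', 'Christozov', 'B-')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM grade").fetchone()[0]
print(rejected, stored)
```

Declaring the constraint once in the schema means every application that writes grades is held to the same rule.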
Actors on the Scene
• DB Administrators
• DB Designers
• End Users:
– Casual
– Naive (parametric)
– Sophisticated
– Stand-alone
DB Administrators
• The DBA is responsible for authorizing access to the database, coordinating and monitoring its use, and acquiring software and hardware resources as needed.
• The DBA is accountable for problems such as security breaches and poor system response time. In large organizations, the DBA is assisted by a staff that carries out these functions.
Database designers
• Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. These tasks are mostly undertaken before the database is actually implemented and populated with data.
• Database designers are also responsible for communicating with all prospective database users in order to understand their requirements and to create a design that meets these requirements.
End Users
• Casual end users occasionally access the database, but they may need different information each time. They use a sophisticated database query language to specify their requests and are typically middle- or high-level managers or other occasional browsers.
• Naive or parametric end users: their main job function revolves around constantly querying and updating the database, using standard types of queries and updates, called canned transactions, that have been carefully programmed and tested. Examples: bank tellers, reservation agents, etc.
End Users (cont.)
• Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS in order to implement their own applications to meet their complex requirements.
• Standalone users maintain personal databases by using ready-made program packages that provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax package that stores a variety of personal financial data for tax purposes.
Actors Behind the Scene
• DBMS system designers and implementers design and implement the DBMS modules and interfaces as a software package.
• Tool developers design and implement tools: the software packages that facilitate database modeling and design, database system design, and improved performance.
• Operators and maintenance personnel are responsible for the actual running and maintenance of the hardware and software environment for the database system.
DB History
Database Systems: the success story of Computer Science
• Early applications: use of File Systems
• 1960s: Hierarchical and Network DB models
• 1970s: Codd’s Relational Model (published in 1970)
• Late 1980s: OODB -> R-OO DB
• 1990s: SQL standards, WWW, E-Commerce
• Spatial DB, Data Warehouses, Data Mining
DB Model
Object-Oriented Models incorporate both structure and behavior. In “classical” models (hierarchical, network or relational), behavior is limited to generic operations.
Data Model: collection of concepts that can be used to describe the structure of a database
Structure: data types; relationships; constraints
Operation: retrieve, insert, delete, modify, user-defined operations
Behavior: dynamic
DB Model: Categories of Data Models
High-level (conceptual): how users perceive data.
Low-level (physical): how data is actually stored on the computer.
Representational (logical): close to the way users understand data, but allows direct interpretation by a given DBMS.
Database schema: description of the database model. Most data models have certain conventions for displaying schemas as diagrams.
Schema, Instance, State
• In any data model, it is important to distinguish between the description of the database and the database itself. The description of a database is called the database schema, which is specified during database design and is not expected to change frequently.
• The data in the database at a particular moment in time is called a database state or snapshot. It is also called the current set of occurrences or instances in the database.
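The difference between schema and state can be made concrete with a short sketch using Python's standard sqlite3 module. The faculty table and the helper names schema and state are hypothetical; the sample row is taken from the Faculty table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faculty (family_name TEXT, office TEXT)")


def schema(conn):
    """The description of the database: the stored table definitions."""
    return [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]


def state(conn):
    """The data in the database at this moment."""
    return conn.execute("SELECT * FROM faculty ORDER BY family_name").fetchall()


schema_before, state_before = schema(conn), state(conn)
conn.execute("INSERT INTO faculty VALUES ('Colman', '231')")
schema_after, state_after = schema(conn), state(conn)

print(schema_before == schema_after)  # the description is unchanged
print(state_before, state_after)      # the snapshot differs
```

Every insert, delete, or update produces a new state, while the schema stays the same until the design itself is changed.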
DB and DBMS
• A database management system (DBMS) is a collection of programs that enables users to create and maintain a database.
• The DBMS is a general-purpose software system that facilitates the processes of
– defining,
– constructing, and
– manipulating
databases for various applications.
DBMS
• A DBMS supports the following categories of languages:
– Data definition language (DDL)
– Storage definition language (SDL)
– View definition language (VDL)
– Data manipulation language (DML), including the query language
• Note: In current DBMSs, these types of languages are not considered distinct languages.
DBMS Components
[Figure: DBMS components. Labels: DB administrators, DB designers, sophisticated users, naive users; boundaries of DBMS; DDL interpreter, DML compiler, query compiler, run-time processor, system catalogue, data manager, and the concurrency control, recovery, and backup subsystems; recorded DB.]
DBMS Architecture
• The Three-Schema Architecture
1. The internal level (internal schema) describes the physical storage structure of the database.
2. The conceptual level (conceptual schema) describes the structure of the whole database for a community of users.
3. The external or view level includes a number of external schemas or user views.
• Data Independence
1. Logical data independence
2. Physical data independence
The Three-Schema Architecture
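[Figure: the three-schema architecture. Categories of users work with External Schemas, which map to the Logical Schema, which maps to the Physical Schema (indexes, data files, master files; meta data in the system catalog). Logical data independence lies between the external and logical levels; physical data independence lies between the logical and physical levels.]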
Database System Utilities
1. Loading
2. Backup
3. File reorganization
4. Performance monitoring
Q & A