INF 280 Database Systems BASIC CONCEPTS D. Christozov / G.Tuparov INF 280 Database Systems: Basic...


INF 280 Database Systems

BASIC CONCEPTS

D. Christozov / G.Tuparov INF 280 Database Systems: Basic Concepts 1

Typical software application

[Figure: diagram with the labels Interface, Business Logic, Data Processing, and Database. The business logic transforms interface input into a data request (Query, SQL) and transforms the returned datasets into reports/forms.]

Basic Concepts - Topics

1. A database as a collection of related data
2. Database and Database Management System
3. Characteristics and advantages of DB approach
4. DB users
5. DB Architecture
6. DBMS Architecture

DB as a collection of related data (1)

• Data: facts that can be recorded and that have implicit meaning.
• Database implicit properties:
– A database represents some aspect of the real world, sometimes called the miniworld or the Universe of Discourse (UoD).
– A database is a logically coherent collection of data with some inherent meaning.
– A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some applications in which these users are interested.

DB as a collection of related data (2)

Basic characteristics

1. Self-Describing Nature of a Database System
2. Insulation between Programs and Data, Data Abstraction
3. Support of Multiple Views of the Data
4. Sharing of Data and Multiuser Transaction Processing

Basic characteristics (1)

Self-Describing Nature of a Database System
• The system catalogue contains information about the structure of each file, the type and storage format of each data item, and various constraints on the data. The information stored in the catalogue is called meta-data, and it describes the structure of the primary database.
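As a small illustration of a self-describing database, the sketch below uses Python's built-in sqlite3 module. In SQLite the catalogue is exposed as the sqlite_master table and through PRAGMA table_info; other DBMSs expose their catalogue under different names. The student table and its columns are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database; the "student" table is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " id TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT)"
)

# The catalogue lists every table together with its definition (meta-data).
catalogue = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()
table_names = [name for name, _sql in catalogue]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk)
columns = conn.execute("PRAGMA table_info(student)").fetchall()
column_types = {col[1]: col[2] for col in columns}

print(table_names)
print(column_types)
conn.close()
```

The program never hard-codes the structure of the table when reading it back: both the table name and the column types come from the meta-data.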

Basic characteristics (2)

Insulation between Programs and Data, and Data Abstraction
• The characteristic that allows program-data independence and program-operation independence is called data abstraction.
• A DBMS provides users with a conceptual representation of data that does not include many of the details of how the data is stored or how the operations are implemented. A data model (or logical data model) is a type of data abstraction that is used to provide this conceptual representation. The data model hides storage and implementation details.

Basic characteristics (3)

Support of Multiple Views of the Data
• A view may be a subset of the database or it may contain virtual data that is derived from the database files but is not explicitly stored.
• Different categories of users need different views of the database.
• One user may need to solve different problems with the database and may need a different view of the data for each problem.
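The two kinds of view mentioned above (a subset of the stored data, and virtual data derived on demand) can be sketched with SQL views in Python's sqlite3 module. The grade table, the view names, and the numeric points are hypothetical values chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade (student_id TEXT, course TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO grade VALUES (?, ?, ?)",
    [("000101", "DB", 70), ("000101", "FDS", 90), ("000102", "DB", 85)],
)

# View 1: a subset of the stored data (rows of one course only).
conn.execute(
    "CREATE VIEW db_grades AS "
    "SELECT student_id, points FROM grade WHERE course = 'DB'"
)

# View 2: virtual data, computed when queried and never stored.
conn.execute(
    "CREATE VIEW student_average AS "
    "SELECT student_id, AVG(points) AS average FROM grade GROUP BY student_id"
)

db_rows = conn.execute("SELECT * FROM db_grades ORDER BY student_id").fetchall()
averages = dict(conn.execute("SELECT student_id, average FROM student_average"))

print(db_rows)
print(averages)
conn.close()
```

An instructor of the DB course might work only with db_grades, while an advisor might work only with student_average; neither needs to know how the grade table is laid out.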

Basic characteristics (4)

Sharing of Data and Multiuser Transaction Processing
• Multiple users may need to access the database simultaneously.
• The DBMS must include concurrency control software to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct.
• Typical examples are on-line transaction processing (OLTP) applications.

Transaction Processing and Concurrency Control (1)

Transaction: execution of a program that accesses and/or changes the content of a file.

Concurrency: concurrent execution of two or more transactions.

Concurrency Control: mechanisms to avoid failures, losses, etc. in concurrent execution of transactions.

Single-user vs. Multiuser/Multitasking Systems: Time Sharing

System Log: journal (file), which holds the history of changes to the state of a database.

D. Christozov, INF 280: Database Systems, Concurrency Control

Transaction Processing and Concurrency Control (2)

ACID Properties of a Transaction:

A Atomicity: A transaction is either performed in its entirety or not at all.

C Consistency Preservation: A correct execution of a transaction takes the database from one consistent state to another consistent state.

I Isolation: A transaction should not make its updates visible to other transactions until it is committed.

D Durability: Once a transaction changes the DB and the changes are committed, these changes must never be lost because of a subsequent failure.
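Atomicity can be observed directly in a small experiment. The sketch below uses Python's sqlite3 module; the account table, the CHECK constraint, and the amounts are assumptions made for this example. A transfer consists of two updates, and if the second one fails the first one is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 80), ("Y", 20)])
conn.commit()


def transfer(conn, amount):
    """Move `amount` from X to Y as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'Y'",
                (amount,),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'X'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 50)       # both updates succeed
failed = transfer(conn, 100)  # second update violates the CHECK constraint
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

After the failed transfer the balance of Y is unchanged, even though its update had already been executed inside the transaction: the transaction was performed "not at all".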

Transaction Processing and Concurrency Control (3)

Schedule: A schedule S of n transactions T1, T2, …, Tn is an ordering of the execution of operations of the transactions. Operations of two transactions Ti and Tj can be interleaved.

Recoverability: Ability to recover from transaction failure. A schedule S is recoverable if no transaction T in S commits until all transactions T’ that have written an item that T reads have committed.

Serializability: The concurrent execution of transactions is equivalent to a serial execution: Serial, Non-Serial, and Conflict Schedules.

Protocols: sets of rules to guarantee “serializability”.
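The recoverability condition can be checked mechanically. The function below is a minimal sketch under simplifying assumptions that are not part of the slides: a schedule is a list of (transaction, operation, item) tuples with operations "r", "w" and "c" (commit), and aborts are not modelled.

```python
def is_recoverable(schedule):
    """Check that no transaction commits before every transaction it read from."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # transaction -> set of transactions it read from
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            # Every writer this transaction read from must already be committed.
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True


# T2 reads X written by T1 and commits first: not recoverable.
bad = [("T1", "w", "X"), ("T2", "r", "X"), ("T2", "c", None), ("T1", "c", None)]
# Same operations, but T1 commits before T2: recoverable.
good = [("T1", "w", "X"), ("T2", "r", "X"), ("T1", "c", None), ("T2", "c", None)]

print(is_recoverable(bad), is_recoverable(good))
```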

Transaction Processing and Concurrency Control (4)

Locking: prevents multiple transactions from accessing the same item concurrently

Timestamps: uses a unique identifier for each transaction

Multiversion: the system uses multiple versions of the same data item

Optimistic: validation and certification of transactions

Granularity: What portion of the DB the data item represents

[Figure: granularity levels labelled field of a record, record, block, file, DB space, whole database.]

Transaction Processing and Concurrency Control (5)

Locks:
• Binary lock: two states (locked/unlocked) for each item;
• Shared: three states: read-lock, write-lock, unlocked;
• Two-phase lock: all locking operations precede the first unlock operation. First phase – expanding; second phase – shrinking. Variants: Basic, Conservative, Strict Two-phase locking.
• Deadlock: each of two transactions is waiting for the other to unlock a given data item.
• Livelock: a transaction keeps waiting, while the others continue.
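The two-phase rule is easy to state as a check on the sequence of lock operations issued by one transaction. The sketch below assumes a hypothetical encoding of that sequence as (action, item) pairs with actions "read_lock", "write_lock" and "unlock"; it tests only the basic two-phase property, not the conservative or strict variants.

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is requested after the first unlock."""
    shrinking = False  # becomes True once the first unlock is seen
    for action, _item in operations:
        if action == "unlock":
            shrinking = True
        elif action in ("read_lock", "write_lock") and shrinking:
            return False  # a lock in the shrinking phase breaks the rule
    return True


two_phase = [
    ("read_lock", "X"), ("write_lock", "Y"),  # expanding phase
    ("unlock", "X"), ("unlock", "Y"),         # shrinking phase
]
not_two_phase = [
    ("read_lock", "X"), ("unlock", "X"),
    ("write_lock", "Y"), ("unlock", "Y"),     # lock after an unlock
]

print(follows_two_phase_locking(two_phase))
print(follows_two_phase_locking(not_two_phase))
```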

Transaction Processing and Concurrency Control (6)

Timestamps: order transactions according to their timestamps

Multiversion: keeps the old values when the item is updated

Optimistic: no checking during execution of the transaction; all updates are applied to a local copy of the data item. After execution, a validation phase is performed to check serializability.
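The optimistic approach can be sketched as a toy, single-threaded store. This is an illustration only, not the protocol of any particular DBMS: the class OptimisticStore and its methods are hypothetical, and the validation rule used here (reject a transaction if an item it read was written by a transaction that committed after it started) is one simple form of validation.

```python
class OptimisticStore:
    """Toy store: updates go to local copies and are validated at commit."""

    def __init__(self, data):
        self.data = dict(data)
        self.commit_log = []  # write sets of committed transactions, in order

    def begin(self):
        # Remember how many commits had happened when the transaction started.
        return {"start": len(self.commit_log), "reads": set(), "writes": {}}

    def read(self, txn, item):
        txn["reads"].add(item)
        # A transaction sees its own local updates first.
        return txn["writes"].get(item, self.data[item])

    def write(self, txn, item, value):
        txn["writes"][item] = value  # local copy only; the database is untouched

    def commit(self, txn):
        # Validation phase: look at transactions committed since we started.
        for write_set in self.commit_log[txn["start"]:]:
            if write_set & txn["reads"]:
                return False  # conflict: the transaction must be restarted
        # Write phase: apply the local copies to the database.
        self.data.update(txn["writes"])
        self.commit_log.append(set(txn["writes"]))
        return True


store = OptimisticStore({"X": 80})
t1, t2 = store.begin(), store.begin()
store.write(t1, "X", store.read(t1, "X") - 5)
store.write(t2, "X", store.read(t2, "X") + 4)
first = store.commit(t1)   # validates: nothing committed since T1 started
second = store.commit(t2)  # fails: T1 wrote X, which T2 had read
print(first, second, store.data)
```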

Transaction Processing and Concurrency Control (7)

Testing schedules for serializability:

1. Only read_item and write_item operations are interesting.

2. The algorithm is based on constructing a precedence (serialization) graph for the schedule: a directed graph G = (N, E), where

N = {T1, T2, …, Tn} is the set of nodes and E = {e1, e2, …, em} is the set of edges.

There is one node for each transaction Ti. An edge ei = (Tj → Tk), where Tj is the starting node and Tk is the ending node, means that one operation in Tj appears in the schedule BEFORE some conflicting operation in Tk.

Transaction Processing and Concurrency Control (8)

Algorithm for testing “serializability” of a schedule S:

1. For each transaction Ti create a node in a precedence graph G.

2. If in S Tj:read_item(X) is after Ti:write_item(X), create an edge (Ti → Tj).

3. If in S Tj:write_item(X) is after Ti:read_item(X), create an edge (Ti → Tj).

4. If in S Tj:write_item(X) is after Ti:write_item(X), create an edge (Ti → Tj).

The schedule S is (conflict) serializable if and only if G has no cycles.
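The four steps above translate almost directly into code. The sketch below assumes a schedule is given as a list of (transaction, operation, item) tuples with operations "read" and "write"; this encoding and the function names are choices made for the example. Cycle detection uses a depth-first search.

```python
def precedence_graph(schedule):
    """Build (nodes, edges) of the precedence graph of a schedule."""
    nodes = {txn for txn, _op, _item in schedule}
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and item_i == item_j and "write" in (op_i, op_j):
                edges.add((ti, tj))
    return nodes, edges


def has_cycle(nodes, edges):
    """Depth-first search for a cycle in a directed graph."""
    successors = {n: [] for n in nodes}
    for a, b in edges:
        successors[a].append(b)
    state = {n: "new" for n in nodes}  # new -> active -> done

    def visit(n):
        state[n] = "active"
        for m in successors[n]:
            if state[m] == "active" or (state[m] == "new" and visit(m)):
                return True
        state[n] = "done"
        return False

    return any(state[n] == "new" and visit(n) for n in nodes)


def is_conflict_serializable(schedule):
    return not has_cycle(*precedence_graph(schedule))


serial = [
    ("T1", "read", "X"), ("T1", "write", "X"),
    ("T1", "read", "Y"), ("T1", "write", "Y"),
    ("T2", "read", "X"), ("T2", "write", "X"),
]
interleaved = [
    ("T1", "read", "X"), ("T2", "read", "X"),
    ("T1", "write", "X"), ("T1", "read", "Y"),
    ("T2", "write", "X"), ("T1", "write", "Y"),
]

print(is_conflict_serializable(serial))
print(is_conflict_serializable(interleaved))
```

For the interleaved schedule the graph contains both T1 → T2 and T2 → T1 on item X, so it has a cycle and the schedule is not serializable.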

Transaction Processing and Concurrency Control (9)

Examples: Serial Schedules

T1                  T2
Read_item(X)
X := X - N
Write_item(X)
Read_item(Y)
Y := Y + N
Write_item(Y)
                    Read_item(X)
                    X := X + M
                    Write_item(X)

[Figure: precedence graph with the nodes T1 and T2.]

Transaction Processing and Concurrency Control (10)

Examples: Non Serial Schedules

T1                  T2
Read_item(X)
X := X - N
                    Read_item(X)
                    X := X + M
Write_item(X)
Read_item(Y)
                    Write_item(X)
Y := Y + N
Write_item(Y)

[Figure: precedence graph with the nodes T1 and T2 and two edges between them, each labelled X.]

Cycle: {X}

Transaction Processing and Concurrency Control (11)

Lost update problem:

Transactions

T1                  T2
Read_item(X);       Read_item(X);
X := X - N;         X := X + M;
Write_item(X);      Write_item(X);
Read_item(Y);
Y := Y + N;
Write_item(Y);

Schedule

T1                  T2
Read_item(X);
X := X - N;
                    Read_item(X);
                    X := X + M;
Write_item(X);
Read_item(Y);
                    Write_item(X);
Y := Y + N;
Write_item(Y);

The two transactions access and update the same DB item simultaneously.
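The effect of the lost update can be seen by replaying the schedule with concrete numbers. The starting values X = 80, Y = 20 and the amounts N = 5, M = 4 are assumptions chosen for this sketch; local variables stand for the private copies each transaction holds between its read and its write.

```python
def run_lost_update(x, y, n, m):
    """Replay the interleaved schedule step by step on a dictionary 'database'."""
    db = {"X": x, "Y": y}
    t1_x = db["X"]      # T1: Read_item(X)
    t1_x = t1_x - n     # T1: X := X - N
    t2_x = db["X"]      # T2: Read_item(X), still the old value
    t2_x = t2_x + m     # T2: X := X + M
    db["X"] = t1_x      # T1: Write_item(X)
    t1_y = db["Y"]      # T1: Read_item(Y)
    db["X"] = t2_x      # T2: Write_item(X), overwrites T1's update
    t1_y = t1_y + n     # T1: Y := Y + N
    db["Y"] = t1_y      # T1: Write_item(Y)
    return db


def run_serial(x, y, n, m):
    """T1 followed by T2, with no interleaving."""
    db = {"X": x, "Y": y}
    db["X"] -= n        # T1 on X
    db["Y"] += n        # T1 on Y
    db["X"] += m        # T2 on X
    return db


interleaved_result = run_lost_update(80, 20, 5, 4)
serial_result = run_serial(80, 20, 5, 4)
print(interleaved_result)
print(serial_result)
```

In the interleaved run the subtraction of N from X disappears, because T2 writes a value computed from the X it read before T1's write.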

Transaction Processing and Concurrency Control (11)

Dirty Read (temporary update problem):

Transactions

T1                  T2
Read_item(X);       Read_item(X);
X := X - N;         X := X + M;
Write_item(X);      Write_item(X);
Read_item(Y);
Y := Y + N;
Write_item(Y);

Schedule

T1                  T2
Read_item(X);
X := X - N;
Write_item(X);
                    Read_item(X);
                    X := X + M;
                    Write_item(X);
Read_item(Y);
failure

One transaction updates item X and then fails before it correctly updates item Y; meanwhile, another transaction uses the already updated item.

Transaction Processing and Concurrency Control (12)

Incorrect summary problem:

Transactions

T1                  T2
Read_item(X);       Read_item(A);
X := X - N;         Sum := Sum + A;
Write_item(X);      Read_item(X);
Read_item(Y);       Sum := Sum + X;
Y := Y + N;         Read_item(Y);
Write_item(Y);      Sum := Sum + Y;

Schedule

T1                  T2
                    Read_item(A);
                    Sum := Sum + A;
Read_item(X);
X := X - N;
Write_item(X);
                    Read_item(X);
                    Sum := Sum + X;
                    Read_item(Y);
                    Sum := Sum + Y;
Read_item(Y);
Y := Y + N;
Write_item(Y);

One transaction calculates an aggregate function while another updates the same records.
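Replaying this schedule with concrete numbers shows why the summary is wrong. The values A = 10, X = 80, Y = 20 and N = 5 are assumptions chosen for this sketch. Since T1 only moves N from X to Y, the true total of A, X and Y is the same before and after T1.

```python
def run_incorrect_summary(a, x, y, n):
    """Replay the interleaved schedule; return T2's sum and the final database."""
    db = {"A": a, "X": x, "Y": y}
    total = 0
    total += db["A"]        # T2: Read_item(A); Sum := Sum + A
    db["X"] = db["X"] - n   # T1: Read_item(X); X := X - N; Write_item(X)
    total += db["X"]        # T2: reads X after N was subtracted
    total += db["Y"]        # T2: reads Y before N is added
    db["Y"] = db["Y"] + n   # T1: Read_item(Y); Y := Y + N; Write_item(Y)
    return total, db


summary, final_db = run_incorrect_summary(10, 80, 20, 5)
true_total = sum(final_db.values())
print(summary, true_total)
```

The sum computed by T2 is off by exactly N, because it saw X after the update and Y before it.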

Advantages of Using DBMS

• Controlling Redundancy (reducing)
• Preserving Data Integrity
• Restricting Unauthorized Access
• Providing Persistent Storage for Program Objects and Data Structures (Object-Oriented DB)
• Permitting Inferencing and Actions Using Rules
• Providing Multiple User Interfaces
• Representing Complex Relationships Among Data
• Providing Backup and Recovery

Redundant Data

Id#     Name             Address   Code     Title      Cr.  Instructor  Section  Grade
000101  Ivan Ivanov      Scapto 1  COS480   DB System  3    Christozov  A        B-
000101  Ivan Ivanov      Scapto 1  COS 221  FDS        3    Christozov  B        B+
000101  Ivan Ivanov      Scapto 1  AUB 102  Writing    3    Colman      C        D+
000102  Georgi Georgiev  Scapto 2  COS 480  DB System  3    Christozov  A        B+
000102  Georgi Georgiev  Scapto 2  AUB 102  Writing    3    Colman      C        C+

[Figure labels: the columns are grouped into student’s information, course information, and grade information.]
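One way to see the redundancy is to split the wide table into the three groups of information it mixes. The sketch below does this in plain Python with the rows from the table; two assumptions are made for the example: the first row's course code is written as "COS 480" (the slide prints "COS480"), and each course is assumed to have a single section and instructor.

```python
rows = [
    ("000101", "Ivan Ivanov", "Scapto 1", "COS 480", "DB System", 3, "Christozov", "A", "B-"),
    ("000101", "Ivan Ivanov", "Scapto 1", "COS 221", "FDS", 3, "Christozov", "B", "B+"),
    ("000101", "Ivan Ivanov", "Scapto 1", "AUB 102", "Writing", 3, "Colman", "C", "D+"),
    ("000102", "Georgi Georgiev", "Scapto 2", "COS 480", "DB System", 3, "Christozov", "A", "B+"),
    ("000102", "Georgi Georgiev", "Scapto 2", "AUB 102", "Writing", 3, "Colman", "C", "C+"),
]

students = {}  # Id# -> (name, address): stored once per student
courses = {}   # code -> (title, credits, instructor, section): once per course
grades = []    # (Id#, code, grade): one entry per enrolment

for sid, name, address, code, title, credits, instructor, section, grade in rows:
    students[sid] = (name, address)
    courses[code] = (title, credits, instructor, section)
    grades.append((sid, code, grade))

print(len(students), len(courses), len(grades))
```

After the split, a student's address or a course title is stored in exactly one place, so it can be changed without touching several rows.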

Integrity

Grades

Id#     Name             Address   Code     Title      Cr.  Instructor  Section  Grade
000101  Ivan Ivanov      Scapto 1  COS480   DB System  3    Christozov  A        B-
000101  Ivan Ivanov      Scapto 1  COS 221  FDS        3    Christozov  B        B+
000101  Ivan Ivanov      Scapto 1  AUB 102  Writing    3    Colman      C        D+
000102  Georgi Georgiev  Scapto 2  COS 480  DB System  3    Christozov  A        B+
000102  Georgi Georgiev  Scapto 2  AUB 102  Writing    3    Colman      C        C+

Faculty

Family Name  Given Name  Title             Office
Bonev        Stoyan      Assoc. Professor  221
Colman       Mark        Professor         231

[Figure annotation: "missing". The instructor Christozov appears in Grades but has no row in Faculty.]
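A DBMS can prevent this kind of inconsistency with a referential integrity constraint. The sketch below uses Python's sqlite3 module with simplified, hypothetical versions of the two tables; the column names are assumptions made for the example. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE faculty ("
    " family_name TEXT PRIMARY KEY,"
    " given_name TEXT, title TEXT, office TEXT)"
)
conn.execute(
    "CREATE TABLE grades ("
    " student_id TEXT, course_code TEXT,"
    " instructor TEXT REFERENCES faculty(family_name),"
    " grade TEXT)"
)
conn.executemany(
    "INSERT INTO faculty VALUES (?, ?, ?, ?)",
    [("Bonev", "Stoyan", "Assoc. Professor", "221"),
     ("Colman", "Mark", "Professor", "231")],
)

# Accepted: Colman has a row in faculty.
conn.execute("INSERT INTO grades VALUES ('000101', 'AUB 102', 'Colman', 'D+')")

# Rejected: Christozov is missing from faculty.
try:
    conn.execute("INSERT INTO grades VALUES ('000101', 'COS 480', 'Christozov', 'B-')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM grades").fetchone()[0]
print(rejected, stored)
```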

Actors on the Scene

• DB Administrators
• DB Designers
• End Users:
– Casual
– Naive (parametric)
– Sophisticated
– Stand-alone

DB Administrators

• The DBA is responsible for authorizing access to the database, coordinating and monitoring its use, and acquiring software and hardware resources as needed.
• The DBA is accountable for problems such as security breaches and poor system response time. In large organizations, the DBA is assisted by a staff that carries out these functions.

Database designers

• Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. These tasks are mostly undertaken before the database is actually implemented and populated with data.
• Database designers are also responsible for communicating with all prospective database users in order to understand their requirements and to create a design that meets these requirements.

End Users

• Casual end users occasionally access the database, but they may need different information each time. They use a sophisticated database query language to specify their requests and are typically middle- or high-level managers or other occasional browsers.
• The main job function of naive or parametric end users revolves around constantly querying and updating the database, using standard types of queries and updates, called canned transactions, that have been carefully programmed and tested. Examples: bank tellers, reservation agents, etc.

End Users (cont.)

• Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS in order to implement their own applications to meet their complex requirements.
• Standalone users maintain personal databases by using ready-made program packages that provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax package that stores a variety of personal financial data for tax purposes.

Actors Behind the Scene

• DBMS system designers and implementers design and implement the DBMS modules and interfaces as a software package.
• Tool developers design and implement tools: the software packages that facilitate database modeling and design, database system design, and improved performance.
• Operators and maintenance personnel are responsible for the actual running and maintenance of the hardware and software environment for the database system.

DB History

Database Systems: the success story of Computer Science

• Early applications: use of File Systems
• 1960s: Hierarchical and Network DB models
• 1970s: Codd’s Relational Model (proposed in 1970; first relational systems in the late 1970s)
• Late 1980s: OODB -> R-OO DB
• 1990s: SQL standards, WWW, E-Commerce
• Spatial DB, Data Warehouses, Data Mining

DB Model

Data Model: collection of concepts that can be used to describe the structure of a database

Structure: data types; relationships; constraints
Operation: retrieve, insert, delete, modify, user-defined operations
Behavior: dynamic

Object-Oriented Models incorporate both structure and behavior. In “classical” models (hierarchical, network or relational) behavior is limited to generic operations.

DB Model: Categories of Data Models

High-level – conceptual: How users perceive data.

Low-level – physical: How data is actually stored on the computer.

Representational – logical: Close to the way users understand data, but allows direct interpretation by a given DBMS.

Database schema: Description of the database model. Most data models have certain conventions for displaying schemas as diagrams.

Schema, Instance, State

• In any data model, it is important to distinguish between the description of the database and the database itself. The description of a database is called the database schema, which is specified during database design and is not expected to change frequently.
• The data in the database at a particular moment in time is called a database state or snapshot. It is also called the current set of occurrences or instances in the database.

DB and DBMS

• A database management system (DBMS) is a collection of programs that enables users to create and maintain a database.
• The DBMS is a general-purpose software system that facilitates the processes of
– defining,
– constructing, and
– manipulating
databases for various applications.

DBMS

• A DBMS supports the following categories of languages:
– Data definition language (DDL)
– Storage definition language (SDL)
– View definition language (VDL)
– Data manipulation language (DML), including a query language
• Note: In current DBMSs, these types of languages are not considered distinct languages.
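In SQL the language categories appear as different kinds of statements within one language, which illustrates the note above. The sketch below runs a few of them through Python's sqlite3 module; the course table and its rows are hypothetical example data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of a hypothetical "course" table.
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT, credits INTEGER)"
)

# DML: insert, modify and delete rows.
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [("COS 480", "DB System", 3), ("COS 221", "FDS", 3), ("AUB 102", "Writing", 3)],
)
conn.execute("UPDATE course SET credits = 4 WHERE code = 'COS 480'")
conn.execute("DELETE FROM course WHERE code = 'AUB 102'")

# Query language (the retrieval part of the DML).
result = conn.execute("SELECT code, credits FROM course ORDER BY code").fetchall()
print(result)
conn.close()
```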

DBMS Components

[Figure: DBMS components. Labels: DB administrators, DB designers, Sophisticated Users, Naive Users; Boundaries of DBMS; DDL interpreter, Query compiler, DML compiler, Run-time processor, System catalogue, Data manager, Concurrency control, recovery, backup subsystems; Recorded DB.]

DBMS Architecture

• The Three-Schema Architecture
1. The internal level (internal schema) describes the physical storage structure of the database.
2. The conceptual level (conceptual schema) describes the structure of the whole database for a community of users.
3. The external or view level includes a number of external schemas or user views.
• Data Independence
1. Logical data independence
2. Physical data independence
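Both kinds of data independence can be demonstrated on a small scale: a query written against an external view keeps working, and returns the same answer, after a change at the internal level (a new index) and after a change at the conceptual level (a new column). The sketch uses Python's sqlite3 module; the table, view, index, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id TEXT, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("000101", "Ivan Ivanov", "Scapto 1"),
     ("000102", "Georgi Georgiev", "Scapto 2")],
)

# External schema: a user view over the conceptual schema.
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")
query = "SELECT name FROM student_names WHERE id = '000102'"
before = conn.execute(query).fetchall()

# Internal-level change: add an index (physical data independence).
conn.execute("CREATE INDEX student_id_idx ON student (id)")
after_index = conn.execute(query).fetchall()

# Conceptual-level change: add a column (logical data independence).
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_column = conn.execute(query).fetchall()

print(before, after_index, after_column)
conn.close()
```

The query text never changes; only the mappings between the levels, maintained by the DBMS, are affected.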

The Three-Schema Architecture

[Figure: the three-schema architecture. Labels: Categories of Users; External Schemas; Logical Schema; Physical Schema; Logical Data Independence; Physical Data Independence; Indexes, Data Files, Master Files; Meta Data, System Catalog.]

Database System Utilities

1. Loading
2. Backup
3. File reorganization
4. Performance monitoring

Q & A