DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

42
2 December 2005 Introduction to Databases DBMS Architectures and Features Prof. Beat Signer Department of Computer Science Vrije Universiteit Brussel http://www.beatsigner.com

description

This lecture is part of an Introduction to Databases course given at the Vrije Universiteit Brussel.

Transcript of DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Page 1: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

2 December 2005

Introduction to Databases DBMS Architectures and Features

Prof. Beat Signer

Department of Computer Science

Vrije Universiteit Brussel

http://www.beatsigner.com

Page 2: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

2 March 27, 2014

DBMS Components

Access

Methods

System

Buffers

Authorisation

Control

Integrity

Checker

Command

Processor

Program

Object Code

DDL

Compiler

File

Manager

Buffer

Manager

Recovery

Manager

Scheduler

Query

Optimiser

Transaction

Manager

Query

Compiler

Queries

Catalogue

Manager

DML

Preprocessor

Database

Schema

Application

Programs

Database

Manager

Data

Manager

DBMS

Programmers Users DB Admins

Based on 'Components of a DBMS', Database Systems,

T. Connolly and C. Begg, Addison-Wesley 2010

Data, Indices and

System Catalogue

Page 3: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

3 March 27, 2014

DBMS Components ...

DML preprocessor transforms embedded SQL statements into statements of the host

language

interacts with the query compiler to generate the appropriate host language code

Query compiler transforms queries into a set of low-level instructions (query plan)

which are forwarded to the database manager component

DDL compiler converts a set of DDL statements into a set of tables

tables and metadata are stored in the system catalogue (catalogue manager)

Page 4: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

4 March 27, 2014

DBMS Components ...

Catalogue manager provides access and manages the system catalogue

used by most DBMS components

Database manager processes user-submitted queries

interfaces with application programs

contains a set of components

- query optimiser

- transaction manager

- ...

Page 5: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

5 March 27, 2014

Database Manager Components

Authorisation control checks whether as user has the necessary rights to execute a

specific operation

Command processor executes the steps of a given query plan handed over by the

authorisation control component

Integrity checker ensures that the operation is not going to violate any integrity

constraints (e.g. key constraints)

Query optimiser computes an optimal query execution strategy

transforms the initial query plan into the best available sequence of operations on the given data

Page 6: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

6 March 27, 2014

Database Manager Components ...

Transaction manager processes any transaction-specific operations

Scheduler manages the relative order in which transaction operations are

executed

Recovery manager deals with commits and aborts of transactions

ensures that the database remains in a consistent state in case of failures

Buffer manager transfers data between main memory and secondary storage

Page 7: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

7 March 27, 2014

DBMS Architectures

There is a wide variety of different DBMS architectures Teleprocessing

File-Server Architecture

Two-Tier Client-Server Architecture

Three-Tier Client Server Architecture

N-Tier Architecture

Peer-to-Peer Architecture

Distributed DBMS

Service-Oriented Architecture

Cloud Architecture

...

Page 8: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

8 March 27, 2014

Teleprocessing

Traditional multi-user system architecture single mainframe and multiple (dumb) terminals

Heavy load on the central mainframe runs application programs and DBMS

formats data for presentation on terminals

Mainframe

Terminal 1

Terminal 2

Terminal 3

Terminal n

Page 9: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

9 March 27, 2014

Teleprocessing ...

Tendency to replace mainframes with network of

personal computers (downsizing) file-server architectures

client-server architectures

...

Page 10: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

10 March 27, 2014

File-Server Architecture

A file-server is a computer that is connected to a network

and mainly serves as a shared storage e.g. for realising shared access to a database

In a file-server architecture the processing is distributed

over the network workstations (application and DBMS) request data (files)

Workstation 2 (DBMS) Workstation n (DBMS)

Workstation 1 (DBMS) File-Server

Database

data request

file

LAN or

WAN

Page 11: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

11 March 27, 2014

File-Server Architecture ...

SQL Request Example

Since the file-server is not SQL-aware, the Customer and

Order relations (files) have to be transferred to the client

Disadvantages heavy network traffic

high total costs of ownership (TCO)

- maintain a full instance of the DBMS on each client (workstation)

complex integrity, concurrency and recovery control

- multiple DBMSs may concurrently access the same shared file

SELECT name, street FROM Customer, Order WHERE Order.customerID = Customer.customerID AND name = 'Max Frisch';

Page 12: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

12 March 27, 2014

Two-Tier Client-Server Architecture

Application consists of a client (first tier) and a server

(second tier) that might run on different machines clear separation of concerns between client and server

thin client vs. thick client

- less or more application logic on the client side

Supports decentralised business environments

LAN or

WAN

Client 2 Client n

Client 1 Server (DBMS)

Database

data request

selected data

Page 13: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

13 March 27, 2014

Two-Tier Client-Server Architecture ...

Client (first tier) tasks presentation of data (user interface)

business and data application logic

send database requests to the server and process the results

Server (second tier) tasks manage (concurrent) database access (data services)

- authorisation, integrity checks, query/update processing, recovery control, ...

business logic (e.g. validation of data)

Different possible client-server topologies single client and single server

multiple clients and single server

multiple clients and multiple servers

Page 14: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

14 March 27, 2014

Two-Tier Client-Server Architecture ...

Communication between client and server via inter-

process communication (IPC) or over network special protocols such as ODBC (JDBC) introduced earlier in the

course when discussing the Call Level Interface (CLI)

Advantages increased performance

- certain tasks performed in parallel and server can be tuned for DB processing

reduced communication costs

- only selected data is transferred

reduced hardware costs

- only server has to run a DBMS

increased consistency/security through separation of concerns

- constraint checking in a single place (server)

Page 15: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

15 March 27, 2014

Two-Tier Client-Server Architecture ...

Disadvantages limitations in terms of enterprise scalability with thousands of

potential clients

- significant client-side administration overhead

• e.g. expensive deployment of new busines and data application logic

- thick client requires a considerable amount of resources (CPU, RAM, ...) to run

applications efficiently

Page 16: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

16 March 27, 2014

Three-Tier Client-Server Architecture

In the 1990s, the three-tier client-server architecture was

introduced to address the enterprise scalability problem e.g. driven by emerging web applications

Application consists of a presentation tier (client),

a logic tier (application server) and a data tier

(database server) that might run on different platforms

LAN or

WAN

Client 1 Database Server

Database

data request

selected data

Client 2 Client n

Application Server

Page 17: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

17 March 27, 2014

Three-Tier Client-Server Architecture ...

Presentation tier (first tier) tasks presentation of data (user interface)

basic input validation (thin client)

send requests to the server and visualise results

Logic tier (second tier / middle tier) tasks business logic

data processing logic

Data tier (third tier) tasks basic data validation

manage (concurrent) database access (data services)

- authorisation, integrity checks, query/update processing, recovery control, ...

Page 18: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

18 March 27, 2014

Three-Tier Client-Server Architecture ...

Advantages reduced costs for thin clients due to lower resource requirements

(CPU, RAM, ...)

- e.g. applications running in a web browser

application logic is centralised in a single application server

- reduces the software distribution problem (updates) that is present in

two-tier client-server architectures

increased modularity

- easier to replace one tier without affecting the other tiers

load balancing becomes easier with a clear separation between the core business logic and the database functionality

Page 19: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

19 March 27, 2014

Web Information Systems (WIS)

The tree-tier architecture maps very naturally to web

environments browser as a thin client, application server and database server

The move towards thin browser clients has dramatically

reduced the costs for software deployment

Internet

HTTP Request

HTTP Response

Client Application

Server

DB Server

Database

Page 20: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

20 March 27, 2014

N-Tier Architecture

The three-tier architecture can be extended with

additional intermediary tiers for increased flexibility e.g. separation between web server and application server in the

previous web information system example

- increases the flexibility for load balancing by introducing multiple web servers

- only dynamic content delivered by the application server whereas static

content is directly managed by the web server

Internet

HTTP Request

HTTP Response

Client

Application

Server

HTML Pages

Web

Server

DB Server

Database

Page 21: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

21 March 27, 2014

Peer-to-Peer Architecture

Systems exchanging information and services in a peer-to-peer-like manner without a central authority

no global schema need for schema integration (matching)

Data and service sharing no dedicated clients and servers

sites may dynamically form new client/server relationships

LAN or

WAN Site 3

Site n

Database 3

Site 2

Database 2

Site 1

Database 1

Database n

Page 22: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

22 March 27, 2014

Middleware

Software that connects (mediates) between software components or applications hide complexity of heterogenous and distributed components

(e.g. servers) and provide a uniform interface

There exist different types of middleware remote procedure call (RPC)

- Java RMI

- CORBA

- XML RPC

asynchronous publish/subscribe

- subscribe for different types of messages

SQL-oriented data access

- open database connectivity (ODBC), JDBC, ...

...

Page 23: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

23 March 27, 2014

Transaction Processing Monitor

Complex applications can be built on top of several

resource managers e.g. multiple DBMSs, operating systems, ...

A Transaction Processing Monitor (TP Monitor) is a

middleware component that provides uniform access to

the services of a number of resource managers

DB Server 1

Database 1

DB Server n

Database n

TP Monitor

Service 1

Service r

.

.

.

Client 1

Client m Application Server

Page 24: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

24 March 27, 2014

Transaction Processing Monitor ...

A TP Monitor offers a number of advantages transaction routing

- increase scalability by directing transactions to specific DBMSs

distributed transaction management

- manage transactions that require access to multiple heterogeneous DBMSs

- e.g. based on XA standard for distributed transaction processing

load balancing

- balance requests across multiple DBMSs by directing to least loaded server

funelling

- establish connections with DBMSs and funnel user requests through these

connections thereby reducing the number of required connections

increased reliability

- if a DBMS fails, the TP monitor can resubmit the request to another DBMS or

hold the transaction request until the DBMS becomes available and resubmit

Page 25: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

25 March 27, 2014

Parallel Database Architectures

Parallel machines (multiple processors) can be used to

speed up the transaction processing

Different models for parallel database architectures shared memory, shared disk, shared nothing and hierarchical

Processor Memory

Processor

Processor Disk

Disk

shared memory

Processor

Processor

Processor Disk

Disk

shared disk

Memory

Memory

Memory

Processor

Disk

shared nothing

Memory

Processor

Disk

Memory

Page 26: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

26 March 27, 2014

Parallel Database Architectures ...

Shared memory processors and disks have access to a shared memory via a bus

very efficient communication between processors

not scalable since bus becomes a bottleneck

Shared disk all processors can access all disks via an interconnection network

each processor has its own memory

certain degree of fault tolerance if processor/memory fails

also disks maybe have fault tollerance (e.g. RAID architecture)

interconnection to the disk systems becomes bottleneck

Page 27: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

27 March 27, 2014

Parallel Database Architectures ...

Shared nothing each node consists of a processor, memory and one or

more disks

high-speed interconnection network between processors

more scalable than shared memory or shared disk model

increased communication costs for non-local disk access

Hierarchical combines the different models (composition)

top level is shared nothing between nodes

- each node can be a shared memory or shared disk "subsystem"

Page 28: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

28 March 27, 2014

Distributed DBMS (DDBMS)

Distributed database logically related collection of shared data and metadata that is

distributed over a network

Distributed DBMS software system to manage the distributed database in a

transparent way

LAN or

WAN Site 3

Site n

Database

Site 2

Database

Site 1

Database Database

Page 29: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

29 March 27, 2014

Distributed DBMS (DDBMS) ...

Distinction between local and global transactions local transaction

- accesses only data from the site from which the transaction was initiated

global transaction

- accesses data from several different sites

Reasons for building a distributed DBMS data sharing

- possibility to access data that resides at other sides

autonomy

- each site retains a certain degree of control over the local data

availability

- if one site fails the other sites may still be able to continue operating

- data might be replicated at serveral sites for increased availability

Page 30: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

30 March 27, 2014

Distributed DBMS (DDBMS) ...

Reasons for building a distributed DBMS ... costs and scalability

- use cluster of PCs instead of large mainframe systems

integration of existing DBMS

- coexistence of legacy systems with new applications

dynamic organisational structure

- mergers and acquisitions

Implementation issues transactions have to be executed atomically accross different

sites (two-phase commit protocol)

- commit decission is left to a single coordinator

distributed concurrency control

- deadlock detection has to be carried out across multiple sites

Page 31: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

31 March 27, 2014

Service-Oriented Architectures (SOA)

Architecture that modularises functionality as

interoperable services loose coupling of services

service encapsulation

interoperability between different operating systems and programming languages

new services can be defined as a mashup of existing services

Service-oriented database architecture (SODA) e.g. single SQL Server processes acting as service providers

Software as a service (SaaS) software is offered as a service and on the other hand might be a

composition of other services

- e.g. service for data storage etc.

Page 32: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

32 March 27, 2014

Service-Oriented Architectures (SOA) ...

Share business logic, data and processes via web

service APIs Big Web Services

- Universal Description, Discovery and Integration (UDDI)

- Web Services Description Language (WSDL)

- Simple Object Access Protocol (SOAP)

RESTful Web Services

Web services are based on established technologies

such as the Extensible Markup Language (XML)

Special service orchestration languages for the use of

services e.g. Business Process Execution Language (BPEL)

Page 33: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

33 March 27, 2014

Cloud Computing

Internet-based computing with on-demand and pay-per-

use access to shared resources, data and software

Main characteristics web-based access (e.g. Web Service API or browser)

pay only for the services that are actually used (pay-per-use)

no initial investment (e.g. for resources) required

The Cloud

Google Microsoft

Amazon

Yahoo

Client 1

Client 2

Client 3

Client n

Page 34: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

34 March 27, 2014

Cloud Computing ...

Cloud services infrastructure as a service (IaaS)

- OS virtualisation

- e.g. Amazon EC2, Rackspace, ...

platform as a service (PaaS)

- provide platform to run applications

- e.g. Google App Engine, Windows Azure Platform, ...

software as a service (SaaS)

- provide software as a service over the Internet

- e.g. Google Docs, ...

Cloud service vendor gets some degree of control

Page 35: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

35 March 27, 2014

Cloud Computing ...

New challenges for database management

in cloud computing cloud database server might be less reliable

- might become difficult to guarantee a specific quality of service (QoS) for an

application realised in the cloud

backup, replication, ...

Online or web-based databases store data in the cloud or on servers on the Internet

examples

- QuickBase, quickbase.intuit.com

- Blist, www.blist.com

- ...

Page 36: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

36 March 27, 2014

Cloud Data Service Example

Amazon Simple Storage Service (Amazon S3) online storage service with unlimited storage space

store objects (up to 5 TB in size) in buckets

Web Service API

Amazon SimpleDB distributed database written in Erlang

offers a Web Service API

makes use of S3 and EC2

on demand scaling

non-relational data store

- schemaless

- hashtables with set of key value pairs

- eventual consistency

Page 37: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

37 March 27, 2014

Mobile DBMS

Users want access to information on the move via mobile

devices tourist information systems

salesperson who is visiting their customers

emergency services

...

New requirements for mobile DBMSs small footprint databases that can run on mobile devices with

limited resources

- e.g. db4objects, http://www.db4o.com

location-dependent queries

context-aware queries

Page 38: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

38 March 27, 2014

Mobile DBMS ...

New requirements for mobile DBMSs ... communicate with centralised database server via wireless

network or fixed Internet connections

replicate data on a centralised server and on a mobile device

- synchronisation challenges

caching of data and transactions to cope with potential network connection failures

opportunistic (peer-to-peer based) information exchange with other mobile DBMSs

- e.g. dynamic P2P Bluetooth connections with other devices in range

(proximity-based information exchange)

security

- which portion of a database can/should be replicated on a mobile device?

...

Page 39: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

39 March 27, 2014

Homework

Study the following chapter of the

Database System Concepts book chapter 17

- sections 17.1-17.6

- Database-System Architectures

Page 40: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

40 March 27, 2014

Exercise 7

Structured Query Language (SQL)

Page 41: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

41 March 27, 2014

References

A. Silberschatz, H. Korth and S. Sudarshan,

Database System Concepts (Sixth Edition),

McGraw-Hill, 2010

T. Connolly and C. Begg, Database Systems: A Practical

Approach to Design, Implementation and Management

(Fifth Edition), Addison-Wesley, 2010

Page 42: DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

2 December 2005

Next Lecture Storage Management