Avi 1.Introduction

  • 8/8/2019 Avi 1.Introduction

    Advanced Database Systems

    Course outline

    Introduction
    - Objectives
    - A Database Management System (DBMS)
    - Structure of a DBMS
    - File and data models: an overview
    - Evolution of Database Technology
    - Advanced database contents/topics
        - Centralized database system related issues
        - Distributed database system related issues
        - Introduction to Database system adapted architectures
        - Modern database systems
        - Database Security and Authorization
        - Database development applications
        - Recapitulative mini project

    Objectives
    - To understand the fundamental concepts and advanced technology underlying database systems:
        - The roles of database management systems and advanced database fundamentals
        - Modern database systems: object-relational databases, spatial databases, active databases, etc.
        - Physical data organization
        - Query processing and optimization
        - Transaction management: concurrency control and crash recovery
        - Security and authorization
        - Distributed database system issues
        - Parallel and system architectures
        - OLAP, data mining, and data warehousing
    - To gain hands-on experience with database application systems by developing a small application system using MS SQL Server, Java/Visual Basic, and database-backed web sites

    Advanced Database Systems- Introduction

    A Database Management System (DBMS)
    - What is a DBMS?
        - It answers the need for information management.
        - A database is a very large, integrated collection of data that models a real-world enterprise.
        - A DBMS contains information about a particular enterprise.
        - A Database Management System (DBMS) is a software package designed to store and manage databases.
        - A DBMS provides an environment that is both convenient and efficient to use.
    - Why use a DBMS? Purpose of database systems
      Database management systems were developed to handle the following issues and difficulties of typical file-processing systems supported by conventional operating systems:
        - Difficulty in accessing data
        - Data redundancy and inconsistency
        - Data isolation: multiple files and formats
      A DBMS in turn offers:
        - Data independence and efficient access
        - Data integrity and security
        - Uniform data administration
        - Concurrent access and recovery from crashes
        - Replication control
        - Reduced application development time

    Structure of a DBMS (1/2)
    - A typical DBMS has a layered architecture.
    - The figure does not show the concurrency control and recovery components.
    - This is one of several possible architectures; each system has its own variations.

    [Figure: DBMS layers, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]

    Structure of a DBMS (2/2)
    The enclosed figure shows the structure (with some simplification) of a typical DBMS based on the relational data model.

    [Figure: architecture of a typical relational DBMS. Web forms, application front ends, and the SQL interface issue SQL statements to the query evaluation engine (parser, optimizer, plan executor, operator evaluator). Inside the DBMS, the files and access methods, buffer manager, and disk space manager layers work together with the transaction manager and lock manager (concurrency control) and the recovery manager. The database comprises index files, data files, and the system catalog.]

    File and data models: an overview
    - Data Models
        - A data model is a collection of concepts for describing data.
        - A schema is a description of a particular collection of data, using a given data model.
    - History of File Organizations: sequential search, index sequential, B-tree, hashing
    - Classification of Database Models: entity-relationship, network, hierarchical, relational, object-oriented, deductive, object-relational, semi-structured data (XML)

    Various database models provide logical and physical data independence to separate simple logical database structures from complicated physical file structures.

    Evolution of Database Technology
    - 1960s: Hierarchical (IMS) and network (CODASYL) DBMS.
    - 1970s: Relational data model, relational DBMS implementation.
    - 1980: RDBMS rules the earth.
    - 1985: Advanced data models (extended-relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.).
    - 1990s: ORDB, OLAP, data mining, data warehousing, multimedia databases, and network databases.
    - 2000s: Databases for XML, bioinformatics, stream data and sensor

    Advanced database contents/topics
    - Centralized database system related issues
        - Storage and File Structure
        - Indexing and Hashing
        - Query Processing
        - Transaction Fundamentals
        - Concurrency Control Techniques
        - Database Recovery
    - Distributed database system related issues
        - Distributed Databases Fundamentals
        - Distributed Transactions: Commit Protocols
        - Distributed Databases Concurrency Control
        - Heterogeneous Distributed Databases

    Advanced and modern database systems
    - Introduction to Database system adapted architectures
    - Modern database systems
        - Object-Oriented Databases
        - Spatial and Geographic Databases
        - Data Mining and Data Warehousing: Concepts and Techniques
    - Database Security and Authorization
    - Database development applications
    - Recapitulative mini project

    File Organizations and Indexing
    - Storage and File Structure
        - Physical storage media: an introduction to the different types of storage media and technologies. Topics covered:
            - Volatile and nonvolatile storage
            - Storage devices and magnetic disks
            - Performance measures
            - Introduction to RAID technology
        - Storage and file organization. Topics covered:
            - Storage access: buffer manager, buffer-replacement policies
            - File organization: organization of records in files; unordered, ordered, and hashed files
            - Data-dictionary storage
    - Indexing and Hashing
        - Introduction and basic concepts
        - Ordered indexes: clustered / unclustered
        - Multi-level indexes
        - Index update: deletion / insertion
        - B+-tree index files, B-tree index files
        - Example of a B+-tree: insert / delete
        - Static hashing / dynamic hashing
        - Example of a hash index
        - Hashing vs. other schemes
        - Grid files
        - Bitmap indices
        - Index definition in SQL

    Centralized database system related issues

    Query Processing
    - Basic Steps in Query Processing: an overview
    - Measures of Query Cost
    - Query Processing: several algorithms
        - Selection operation
        - Join operation: different algorithms to implement joins
            - Nested-loop join
            - Block nested-loop join
            - Indexed nested-loop join
        - Other operations: an overview
    - Query Optimization using Heuristics
        - Query tree, graph tree
        - Transformation of relational expressions
            - Equivalence rules
            - Pushing selections, join ordering, etc.
        - Choice of evaluation plans
        - Structure of query optimizers
    - Evaluation of Expressions: materialization, pipelining
    - Statistics for Cost Estimation

    Centralized database system related issues
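
The block nested-loop join listed above can be illustrated with a minimal in-memory sketch. It is not taken from the course material: the `blocks` helper, the tuple row layout, and the sample tables are assumptions made for illustration, and Python list slices stand in for disk blocks.

```python
from typing import Callable, Iterator, List, Tuple

Row = Tuple  # hypothetical row type: a plain tuple of attribute values


def blocks(rows: List[Row], block_size: int) -> Iterator[List[Row]]:
    """Yield consecutive chunks of rows, standing in for disk blocks."""
    for start in range(0, len(rows), block_size):
        yield rows[start:start + block_size]


def block_nested_loop_join(
    outer: List[Row],
    inner: List[Row],
    predicate: Callable[[Row, Row], bool],
    block_size: int = 2,
) -> List[Row]:
    """Join two relations block by block.

    Each inner block is read once per outer block rather than once per
    outer row, which is the saving over the plain nested-loop join.
    """
    result: List[Row] = []
    for outer_block in blocks(outer, block_size):
        for inner_block in blocks(inner, block_size):
            for r in outer_block:
                for s in inner_block:
                    if predicate(r, s):
                        result.append(r + s)
    return result


# Hypothetical sample data: (id, name, dept_id) and (dept_id, dept_name).
employees = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)]
departments = [(10, "Sales"), (20, "Research"), (30, "Legal")]

joined = block_nested_loop_join(
    employees, departments, lambda e, d: e[2] == d[0]
)
for row in joined:
    print(row)
```

With a real storage layer the block size would be chosen from the number of buffer pages available, and the smaller relation would normally be used as the outer one.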

    Transaction and Concurrency Control
    - Transaction Fundamentals
        - Transaction concept
        - Transaction ACID properties
        - Transaction states
        - Concurrent executions
        - Schedules: scheduling transactions
        - Serializability
            - Conflict serializability
            - Testing for conflict serializability: precedence graph
            - View serializability
        - Recoverability: why is recovery needed?
        - Cascading rollback
        - Concurrency control: an overview
        - Levels of consistency: levels of consistency in SQL-92
        - Transaction definition in SQL
    - Database Concurrency Control Techniques
        - Purpose of concurrency control
        - Lock-based protocols: pitfalls and serializability issues
        - The Two-Phase Locking (2PL) protocol
        - Timestamp-based protocols: recoverability and cascade freedom
        - Deadlock handling
            - Deadlock prevention strategies
            - Deadlock detection (graph-based strategy), deadlock recovery
        - Locking and insert/delete operations
        - Other protocols and schemes: an overview
            - Graph-based protocols
            - Validation-based protocol
            - Granularity of data items, intention lock modes
            - Multi-version schemes, index locking protocol

    Centralized database system related issues
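
The precedence-graph test for conflict serializability named above can be sketched as follows. This is an illustrative sketch only: the encoding of a schedule as `(transaction, action, item)` triples and the function names are assumptions, not notation from the course.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding of one schedule step:
# (transaction name, "R" for read or "W" for write, data item).
Op = Tuple[str, str, str]


def precedence_graph(schedule: List[Op]) -> Dict[str, Set[str]]:
    """Add an edge Ti -> Tj for every pair of conflicting operations
    (same item, different transactions, at least one write) in which
    Ti's operation comes first in the schedule."""
    edges: Dict[str, Set[str]] = {t: set() for t, _, _ in schedule}
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges[ti].add(tj)
    return edges


def is_conflict_serializable(schedule: List[Op]) -> bool:
    """A schedule is conflict serializable iff its precedence graph
    has no cycle; the cycle check is a depth-first search."""
    graph = precedence_graph(schedule)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in graph}

    def has_cycle(node: str) -> bool:
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: a cycle exists
            if colour[nxt] == WHITE and has_cycle(nxt):
                return True
        colour[node] = BLACK
        return False

    return not any(colour[t] == WHITE and has_cycle(t) for t in graph)


# T1 finishes its work on A before T2 touches it: only the edge T1 -> T2.
ordered = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
# T2 writes A between T1's read and write: edges T1 -> T2 and T2 -> T1.
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]

print(is_conflict_serializable(ordered))
print(is_conflict_serializable(interleaved))
```

When the graph is acyclic, any topological order of its nodes gives an equivalent serial schedule.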

    Database Recovery
    - Database Recovery: an overview
        - Failure classification
        - Algorithms/techniques and storage structures
    - Data Access
    - Recovery and Atomicity
    - Log-Based Recovery
        - Deferred database modification
        - Immediate database modification
    - Checkpoints: an overview
        - Checkpoint recovery steps: example
    - Recovery With Concurrent Transactions
    - Buffer Management: log record buffering
    - Failure with Loss of Nonvolatile Storage
    - Shadow Paging

    Centralized database system related issues

    Distributed Databases Fundamentals
    - Distributed Database concepts
        - Distributed database system: an overview
        - Data distribution: advantages and benefits
    - Types of Distributed Databases
        - Heterogeneous and homogeneous databases
    - Distributed DBMS Architectures
    - Distributed Data Storage
        - Data replication
        - Data fragmentation
    - Distributed Catalog Management
    - Data transparency: naming of data items
    - Transparency and updates

    Distributed database system related issues

    Distributed Transactions and Concurrency Control
    - Distributed Transactions: Commit Protocols
        - Distributed transactions: overview
        - System failure modes
        - Commit protocols: Two-Phase Commit protocol (2PC)
            - Phase 1: obtaining a decision
            - Phase 2: recording the decision
            - Handling of failures
            - Recovery and concurrency control
        - Alternative models: persistent messaging systems
            - Error conditions with persistent messaging
            - Persistent messaging and workflows
            - Implementation of persistent messaging
        - Annex: Three-Phase Commit (3PC), transactional workflows
    - Distributed Databases Concurrency Control
        - Concurrency control: an overview
        - Centralized: single-lock-manager approach
        - Distributed lock manager: primary copy, majority protocol, biased protocol, quorum consensus
        - Time-stamping
        - Replication with weak consistency
        - Multi-master lazy replication
        - Deadlock handling: prevention strategies, centralized approach
    - Distributed Query Processing
        - Simple join processing / possible strategies, semijoin strategy
        - Join strategies that exploit parallelism

    Distributed database system related issues
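
The two phases of 2PC named above (obtaining a decision, then recording it) can be sketched in a few lines. This is a simplified single-process illustration under stated assumptions: the `Participant` class, its `can_commit` flag, and the in-memory logs are hypothetical, and the sketch deliberately leaves out timeouts, message loss, and site recovery, which the failure-handling topics cover.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Participant:
    """Hypothetical site taking part in a distributed transaction."""
    name: str
    can_commit: bool = True  # assumption: the local vote is known up front
    log: List[str] = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1, participant side: record the vote before replying.
        self.log.append("ready" if self.can_commit else "no")
        return self.can_commit

    def finish(self, decision: str) -> None:
        # Phase 2, participant side: record the coordinator's decision.
        self.log.append(decision)


def two_phase_commit(participants: List[Participant],
                     coordinator_log: List[str]) -> str:
    # Phase 1: obtaining a decision. The coordinator asks every site to vote.
    coordinator_log.append("prepare")
    votes = [p.prepare() for p in participants]

    # Phase 2: recording the decision. Commit only if every site voted yes;
    # the decision is logged before it is sent to the participants.
    decision = "commit" if all(votes) else "abort"
    coordinator_log.append(decision)
    for p in participants:
        p.finish(decision)
    return decision


sites_ok = [Participant("S1"), Participant("S2")]
log_ok: List[str] = []
print(two_phase_commit(sites_ok, log_ok))

sites_bad = [Participant("S1"), Participant("S2", can_commit=False)]
log_bad: List[str] = []
print(two_phase_commit(sites_bad, log_bad))
```

A single "no" vote is enough to abort the whole transaction, which is why every participant must log its vote durably before answering in a real system.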

    Database System Architectures
    - Centralized Systems
    - Client-Server Systems
        - Transaction servers
        - Data servers
    - Parallel DBMS
        - Interconnection network architectures
        - Architecture issue: shared what?
        - Parallel DBMS techniques and different types
        - Data partitioning
        - Parallel processing

    Object-Oriented Databases (1/2)
    - Motivation
        - Introduction
        - Motivating example
        - Why object databases (ODBs)?
        - Need for complex data types
    - Engineering Database Design: overview
        - Database design process
        - Logical/physical database design
    - Object-oriented concepts
        - ODBs are more natural and direct
        - Object-oriented terminology: an overview
        - Investigation and analysis: RDBs vs. ORDBs, RDBs vs. ODBs, etc.
    - The Object-Oriented Data Model: an example of a class in UML
    - Object-Oriented Data Modelling: rapid overview
    - OO Data Modelling: example

    Spatial and Geographic Databases
    - Fundamentals of GIS: overview
        - Spatial and geographic data(bases)
        - Why study GIS?
        - What is GIS?
        - What's in a GIS?
    - GIS vs. Other Systems: how GIS differs from related systems
    - GIS System Architecture and Components
        - GIS spatial data model
        - GIS spatial and attribute data
        - How does a GIS organize spatial data?
        - Raster and vector data models: spaghetti and topological vector data models
        - Representing surfaces: DEM, TIN, contour lines (isolines)
        - File formats for raster and vector data models
        - Spatial database management: querying and indexing of spatial data
    - Sources of Geographic Data

    Data Mining and Data Warehousing
    - Motivation
    - Evolution of Database Technology: overview
    - Why Data Mining? Potential applications
    - What Is Data Mining? Data mining: a KDD process
    - Data Mining: On What Kind of Data?
    - What Is a Data Warehouse?
        - Data warehouse vs. other systems, OLTP vs. OLAP
        - Conceptual modeling of data warehouses
        - Defining a snowflake schema in the Data Mining Query Language (DMQL)
        - Multi-tiered architecture: approaches to building an OLAP server
        - Indexing OLAP data: bitmap index
        - Data warehouse back-end tools and utilities
        - From OLAP to On-Line Analytical Mining (OLAM); an OLAM architecture
    - Data Mining Functionalities
    - Are All the Discovered Patterns Interesting?
    - Market-Basket Data: typical case
    - Frequent Pairs in SQL, A-Priori Algorithm
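
The A-Priori idea behind the frequent-pairs topic can be shown on a toy market-basket data set. The sketch below is written in Python rather than SQL, and the baskets, the `frequent_pairs` name, and the support threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def frequent_pairs(baskets: List[List[str]],
                   min_support: int) -> Dict[Tuple[str, str], int]:
    """Return item pairs that occur together in at least min_support baskets.

    Pass 1 counts single items. Pass 2 counts only pairs whose two items
    are both frequent, since a pair cannot be frequent otherwise
    (the A-Priori pruning step).
    """
    # Pass 1: support of each individual item.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count candidate pairs built from frequent items only.
    pair_counts: Counter = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {pair: c for pair, c in pair_counts.items() if c >= min_support}


# Hypothetical market-basket data.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

pairs = frequent_pairs(baskets, min_support=3)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

The SQL formulation expresses the same second pass as a self-join of the basket table on the basket identifier, grouped by item pair and filtered on the count.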

    Database Security and Authorization
    - Introduction to DB Security
    - Access Controls
    - Database Security and the DBA
    - Discretionary Access Control
        - Privileges at the account/relation levels
        - Granting and revoking of relation privileges
        - Views and security
        - Propagation of privileges
        - Role-based authorization
    - Mandatory Access Control
        - Access control for multilevel security
        - Multilevel relations
    - Discretionary Access Control vs. Mandatory Access Control
    - Introduction to Statistical Database Security

    Database Application Development
    - Database Programming
        - Embedded SQL
        - Dynamic SQL
        - Embedded SQL in Java
        - Database APIs: alternative to embedding
        - Embedded SQL in Java using SQLJ: an example
        - Database stored procedures
        - SQL Persistent Stored Modules (SQL/PSM)
    - Client-Server and Modern Database Architectures
        - Client-server computing
        - Two-tier architecture
        - Multiple-tier architecture
    - Active Database Concepts and Triggers: an introduction
        - Generalized (ECA) model for active DB
        - Database triggers: context
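
The event-condition-action (ECA) model for triggers can be tried out with a small script. The course itself names MS SQL Server; the sketch below uses SQLite through Python's standard `sqlite3` module purely as a stand-in, and the `account` and `audit_log` tables and the trigger name are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER, new_balance INTEGER);

CREATE TRIGGER log_overdraft
AFTER UPDATE OF balance ON account      -- Event: a balance is updated
WHEN NEW.balance < 0                    -- Condition: the account is overdrawn
BEGIN
    -- Action: record the change in the audit table
    INSERT INTO audit_log VALUES (NEW.id, OLD.balance, NEW.balance);
END;
""")

conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 50 WHERE id = 1")   # condition false
conn.execute("UPDATE account SET balance = -20 WHERE id = 1")  # condition true

rows = conn.execute("SELECT * FROM audit_log").fetchall()
print(rows)
```

Only the second update satisfies the condition, so the audit table ends up with a single row. Trigger syntax differs between engines, so the statement would need adapting for SQL Server.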

    Mini-Projects, Recapitulation
    - University Education Information System
        - Object-oriented database design
        - Relational / object mapping
        - Relational / object replication issues
    - Private University Distributed Information System
        - Many relational sites, central repository site
        - Database design issues
        - Issues related to data sharing
        - SQL queries for internet access
        - Website implementing and using the SQL queries