Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational...

56
Syllabus 1. Introduction 2. ER Model 3. Relational Model 4. SQL 5. Integrity and Security 6. Relational Database Design 7. File Structure, Indexing and Hashing 8. Transaction 9. Concurrency control 10. Recovery System

Transcript of Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational...

Page 1: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Syllabus1. Introduction

2. ER Model

3. Relational Model

4. SQL

5. Integrity and Security

6. Relational Database Design

7. File Structure, Indexing and Hashing

8. Transaction

9. Concurrency control

10. Recovery System

Page 2: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Chapter 1: Introduction• What is Database?• Purpose of Database Systems• View of Data• Data Models • Data Definition Language • Data Manipulation Language• Transaction Management • Storage Management• Database Administrator• Database Users• Overall System Structure

Page 3: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Definition of Database

• A shared collection of logically related data, designed to meet the information needs of multiple users in an organization.

• The term database is often erroneously referred to as a synonym for a “DataBase Management System (DBMS)"

Page 4: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Contd…• A collection of data: part numbers, product codes, customer information, etc.

It usually refers to data organized and stored on a computer that can be searched and retrieved by a computer program.

 

• A data structure that stores metadata, i.e. data about data. More generally we can say an organized collection of information.

 

• A collection of information organized and presented to serve a specific purpose. (A telephone book is a common database.) A computerized database is an updated, organized file of machine readable information that is rapidly searched and retrieved by computer.

 

• An organized collection of information in computerized format.

 

• A collection of related information about a subject organized in a useful manner that provides a base or foundation for procedures such as retrieving information, drawing conclusions, and making decisions.

Page 5: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

ExampleThing Data (Facts/Figures)

Cricket PlayerCountry, name, date of birth, specialty, matches played, runs etc.

ScholarsName, data of birth, age, country, field, books published etc.

Food Name, ingredients, taste, preferred time, origin, etc.

VehicleRegistration number, make, owner, type, price, etc.

Page 6: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Purpose / Need of Database Systems

Let us discuss an example

Page 7: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• We might start by building a file with the following structure:

• This text file is easy to deal with. So there's no need for a DBMS!

Example: Personal Calendar

What Day When Who Where

Lunch 10/24 1pm Rick Joe’s DinerCS123 10/25 9am Dr. Egghead Morris234Biking 10/26 9am Jane Jane’s houseDinner 10/26 6PM Jane Café Le Boeuf

Page 8: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Problem 1: Data Organization

• Consider the all-important ``who'' field. Do we also want to keep e-mail addresses, telephone numbers etc?

• Expand our file to look like:

• Now we are keeping our address book in our calendar and doing so redundantly.

What When Who-name Who-email Who-tel …. Where …

Page 9: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

“Link” Calendar with Address Book?

• Two conceptual “entities” -- contact information and calendar -- with a relationship between them, linking people in the calendar to their contact information.

• This link could be based on something as simple as the person's name.

Page 10: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Problem 2: Efficiency• Size of personal address book is probably less

than one hundred entries, but there are things we'd like to do quickly and efficiently. – “Give me all appointments on 10/28”– “When am I next meeting Jim?”

• “Program” these as quickly as possible. • Have these programs executed efficiently. • What would happen if you were using a

“corporate” calendar with hundreds of thousands of entries?

Page 11: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Problem 3. Concurrency and Reliability

• Suppose other people are allowed access to your calendar and are allowed to modify it? How do we stop two people changing the file at the same time and leaving it in a physical (or logical) mess?

• Suppose the system crashes while we are changing the calendar. How do we recover our work?

Page 12: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Example

Suppose a manager schedule a meeting with his staff today (3:00pm) and at the same time his secretary schedules him to meet with the Chairman. They both see that the time is open, but presumably only one of the two meetings will show on the calendar later.

Page 13: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

What is a DBMS?• A database (DB) is a large, integrated

collection of data.• A DB models a real-world enterprise /

DBMS contains information about a particular enterprise.

• A database management system (DBMS) is a software package designed to store and manage databases / set of programs to access the data.

• DBMS provides an environment that is simultaneously convenient, secure and efficient to use.

Page 14: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Is the software or tool that is used to manage the database and its users.

• A DBMS consist of different components or subsystem.

• Each subsystem or component of the DBMS performs different function(s).

• So a DBMS is collection of different programs but they all work jointly to manage the data stored in the database and its users.

Page 15: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Database is collection of data, DBMS is tool to manage this data, and both jointly are called database system.

Page 16: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Organization of data• Efficient retrieval of data• Reliable storage of data• Maintaining consistent data• All these topics are interrelated.

What the DBMS is about

Page 17: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Drawbacks of file systems• In the early days, database applications

were built directly on top of file systems• Drawbacks of using file systems to store

data:– Data redundancy and inconsistency

• Multiple file formats, duplication of information in different files

– Difficulty in accessing data • Need to write a new program to carry out each

new task

– Data isolation — multiple files and formats

Page 18: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Drawbacks of file systems (Cont.)

– Integrity problems• Integrity constraints (e.g. account balance >

0) become “buried” in program code rather than being stated explicitly

• Hard to add new constraints or change existing ones

– Atomicity of updates• Failures may leave database in an inconsistent

state with partial updates carried out• Example: Transfer of funds from one account to

another should either complete or not happen at all

Page 19: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Drawbacks of file systems (Cont.)

• Concurrent access by multiple users• Concurrent accessed needed for

performance• Uncontrolled concurrent accesses can

lead to inconsistencies– Example: Two people reading a

balance and updating it at the same time

• Security problems• Hard to provide user access to some, but

not all, data• Database systems offer solutions to all the

above problems

Page 20: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database Applications

– Banking: all business transactions– Airlines: reservations, schedules– Universities: registration, grades– Sales: customers, products, purchases– Manufacturing: production, inventory,

orders, supply chain– Human resources: employee records,

salaries, tax deductions

Page 21: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Data and Information• Data is the collection of raw facts collected

from any specific environment for a specific purpose.

• Data in itself does not show anything about its environment.

• So to get desired types of results from the data we transform it into information by applying certain processing on it.

• Once we have processed data using different methods data is converted into meaningful form and that form of the Data is called information

Page 22: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Levels of AbstractionDBMS users are not computer trained, developers hide the complexity from users thro’ levels of abstraction, to simplify user’s interactions with the system

• Physical level: lowest level describes how a record (e.g., customer) is stored.

• Logical level: next higher level describes what data stored in database, and the relationships among the data.

Page 23: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

type customer = recordcustomer_id : string; customer_name : string;customer_street : string;customer_city : integer;

end;

• View level: highest level describes only part of the entire database. DBMS may provide many views for the same database.

Page 24: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

View of DataAn architecture for a database system

Page 25: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Data Abstraction

LogicalLevel

Physical Level

View Level

View 1 View 2 View n…

How data is actually stored ? e.g. are we using disks ? Which file system ?

What data is stored ? describe data properties such as data semantics, data relationships

What data users and application programs see ?

Page 26: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Instances and Schemas

Similar to types and variables in programming languages

• Schema – the logical structure / overall design of the database – Example: The database consists of information about a

set of customers and accounts and the relationship between them

– Analogous to type information of a variable in a program

– Physical schema: database design at the physical level

– Logical schema: database design at the logical level

Page 27: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Instance – the actual content of the database at a particular point in time – Analogous to the value of a variable

Page 28: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Data Independence• Physical Data Independence – the ability

to modify the physical schema without changing the logical schema– Applications depend on the logical schema– In general, the interfaces between the various

levels and components should be well defined so that changes in some parts do not seriously influence others.

• Logical Data Independence – the ability to modify the logical schema without causing application programs to be rewritten. It is difficult to achieve since application programs are heavily dependent on logical structure of data that they access.

Page 29: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database LanguageLanguage for accessing and manipulating the data organized by the appropriate data model• One to specify database schema, storage

structure and access methods (DDL) and • other to express database queries and

updates (DML)Data Definition Language (DDL) • Specification notation for defining the database

schema• DDL compiler generates a set of tables stored in

a data dictionary, a file that contains metadata, i.e. data about data. Data dictionary is consulted before reading or modifying the actual data

Page 30: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Data Manipulation Language (DML)

• Language for accessing and manipulating the data organized by the appropriate data model– DML also known as query language

• Two classes of languages – Procedural – user specifies what data is

required and how to get those data – Declarative (nonprocedural) – user

specifies what data is required without specifying how to get those data

• SQL is the most widely used query language

Page 31: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Data Modeling• A data model is a collection of concepts for describing

data properties and domain knowledge:– Data relationships– Data semantics– Data constraints– Relational Model

• Only one abstract concept• Closer to the physical representation on disk• Normalization

Page 32: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Entity-Relationship data model (mainly for database design)

• Relational model• Object-based data models (Object-

oriented and Object-relational)• Semistructured data model (XML)• Other older models:

– Network model – Hierarchical model

Page 33: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Models an enterprise as a collection of entities and relationships– Entity: a “thing” or “object” in the enterprise that is

distinguishable from other objects• Described by a set of attributes

– Relationship: an association among several entities

e.g. each employee is an entity described by empno, empname, designation etc.

– Entity-relationship Model• Diagrammatic representation• Easier to work with• Syntax not important, but remember the “meaning”• Remember what you can model

Entity-Relationship Model

Page 34: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Entity set – set of all entities of the same type

• Relationship set – set of all relationships of same type

• Mapping cardinality – number of entities to which another entity can be associated via relationship set

Page 35: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• The overall logical design of database can be expressed graphically by an E-R diagram, which has following components:

• Rectangles - entity set• Ellipses –attributes of an entity• Diamonds –relationship among

entity sets• Lines – link attributes to entity sets

and entity sets to relationship sets

Page 36: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Each component of E-R diagram is labeled with entity or relationship that it represents

Example of schema in the entity-relationship model

Page 37: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Relational Model• It uses collection of tables to

represent data as well as relationship among those data.

• Each table has multiple columns, each column has unique name.

Page 38: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.
Page 39: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Other Models Network Model

• Data are represented by collection of records, and relationships among those data are represented by links, which are viewed as pointers.

• Records are organized as a collection of arbitrary graph

Page 40: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Other Models Cont.Hierarchical Model

• Data are represented by collection of records, and relationships among those data are represented by links, which are viewed as pointers.

• Records are organized as a collection of trees rather than arbitrary graph

Page 41: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database UsersUsers are differentiated by the way they

expect to interact with the system• End Users

access to the database for querying, updating, and generating reports

Casual end users: occasionally access the databaseneed different information each time learn only a few facilities that they may use repeatedly.use a sophisticated database query language to specify their

requests typically middle- or high-level managers or other occasional

browsers

• Application programmers – interact with system through DML calls

• Sophisticated users – form requests in a database query language

Page 42: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database Users• Specialized users – write specialized

database applications that do not fit into the traditional data processing framework

• Naïve users – invoke one of the permanent application programs that have been written previously– E.g. people accessing database over the

web, bank tellers, clerical staff

Page 43: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database Users

• System Analysts and Application Programmers

– Determine the requirements of end users, especially naive and parametric end users, and develop specifications for canned transactions that meet these requirements

– Application programmers implement these specifications as programs; then they test, debug, document, and maintain these canned transactions

• Workers behind the Scene

– Typically do not use the database for their own purposes

– DBMS system designers and implementers

– design and implement the DBMS modules (for implementing the catalog, query language, interface processors, data access, concurrency control, recovery, and security. ) and interfaces as a software package

Page 44: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database Users• Tool developers

– Tools are optional packages that are often purchased separately

– include packages for database design, performance monitoring, natural language or graphical interfaces, prototyping, simulation, and test data generation.

• Operators and maintenance personnel – system administration personnel who are

responsible for the actual running and maintenance of the hardware and software environment for the database system

Page 45: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database Administrator• Coordinates all the activities of the database

system; the database administrator has a good understanding of the enterprise’s information resources and needs.

• Database administrator's duties include:– Schema definition– Storage structure and access method definition– Schema and physical organization modification– Granting user authority to access the database– Specifying integrity constraints– Monitoring performance and responding to

changes in requirements

Page 46: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Database Actors• Database Administrators

– In a database environment, the primary resource is the database itself and the secondary resource is the DBMS and related software

– authorizing access to the database– coordinating and monitoring its use– acquiring software and hardware resources as

needed

Page 47: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

• Database Designers

– identifying the data to be stored in the database

– choosing appropriate structures to represent and store this data undertaken before the database is actually implemented and populated with data

– communicate with all prospective database users, in order to understand their requirements

– develop a view of the database that meets the data and processing requirements for each group of users

– These views are then analyzed and integrated with the views of other user groups. The final database design must be capable of supporting the requirements of all user groups

Database Actors

Page 48: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Overall Database System Structure

Page 49: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.
Page 50: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Query Processor Components• DML Compiler – translates DML statements in

a query language into low level instructions

• DDL interpreter – interprets DDL statements and records them in a set of tables containing metadata.

• Query evaluation engine – executes low-level instructions generated by DML compiler

Page 51: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Storage Manager Components• Authorization & integrity manager –

tests for satisfaction of integrity constraints and checks the authority of user to access the data

• Transaction manager – ensures the consistency of the database despite system failures, and concurrent transaction executions proceed with conflicting

• File Manager – Manages the allocation of space on disk storage and the data structures used to represent information on disk

• Buffer Manager – responsible for fetching data from disk storage into main memory, and deciding what data to catch in memory

Page 52: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Data structures are required as a part of physical implementation

• Data files – stores database itself• Data dictionary - stores meta data

about the structure of database • Indices – which provides fast access to

data items that hold particular values• Statistical data – which stores

statistical information about the data in the database, used by strategy selector

Page 53: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

DATABASE SYSTEM ARCHITECTURE

Page 54: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.
Page 55: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Storage Management• Storage manager is a program module

that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

• The storage manager is responsible to the following tasks: – Interaction with the file manager – Efficient storing, retrieving and updating of data

• Issues:– Storage access– File organization– Indexing and hashing

Page 56: Syllabus 1.Introduction 2.ER Model 3.Relational Model 4.SQL 5.Integrity and Security 6.Relational Database Design 7.File Structure, Indexing and Hashing.

Transaction Management• A transaction is a collection of operations

that performs a single logical function in a database application

• Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.

• Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.