RELATIONAL Database Management System (RDBMS) Concepts...
Venkatesh Vinayakarao (Vv)
RELATIONAL Database Management System (RDBMS) Concepts and SQL
Venkatesh Vinayakarao
http://vvtesh.co.in
Chennai Mathematical Institute
The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient. - Silberschatz, Korth and Sudarshan.
A database-management system (DBMS) is a collection of:
1. interrelated data and 2. a set of programs to access those data.
The collection of data is usually referred to as the database.
Why are files insufficient to store data?
• Major disadvantages of using files to store and retrieve data
  • Data redundancy and inconsistency
    • Classic example – Address book. The same phone number and address may repeat and may also be inconsistent
  • Difficulty in access
    • Unrestricted file format implies custom programs need to be written
  • Data integrity (Atomicity of transactions)
    • Assume having a debit ledger and credit ledger!
  • Backups
  • Concurrent access and related anomalies
  • Security
An Architecture for a Database Management System (DBMS)
How is the data stored?
Description of the data and its relationships.
Architecture
Data Models
• We are interested in understanding, describing and analyzing several aspects of data such as:
  • Data
  • Data relationships
  • Data semantics
  • Data constraints
A data model captures these aspects.
Several models exist: Relational, Entity-Relationship, Object-Oriented, etc.
Schema
• Logical schema – the overall logical structure of the database
  • Example: Set of customers and accounts in a bank and the relationship between them
• Physical schema – the overall physical structure of the database
Physical Data Independence
The ability to modify the physical schema without changing the logical schema
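Physical data independence can be sketched in a few lines. The example below assumes Python's built-in sqlite3 module and a hypothetical account table with made-up rows (none of this is from the slides). Adding an index changes how the data is accessed on disk, but the table definition and the query result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema: a hypothetical account table (illustrative only).
conn.execute(
    "CREATE TABLE account (acc_no TEXT PRIMARY KEY, owner TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Asha", 500), ("A-102", "Ravi", 900), ("A-103", "Asha", 700)],
)

QUERY = "SELECT acc_no, balance FROM account WHERE owner = ? ORDER BY acc_no"
TABLE_SQL = "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'account'"

before_rows = conn.execute(QUERY, ("Asha",)).fetchall()
before_schema = conn.execute(TABLE_SQL).fetchone()[0]

# Physical change only: add an index on owner.
conn.execute("CREATE INDEX idx_account_owner ON account (owner)")

after_rows = conn.execute(QUERY, ("Asha",)).fetchall()
after_schema = conn.execute(TABLE_SQL).fetchone()[0]

# Same answers and same table definition before and after the index.
print(before_rows == after_rows, before_schema == after_schema)
```

The application query is written once against the logical schema and does not need to change when the index is added or dropped.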
History
1950s: Tapes and punched cards – sequential data access.
1960s and 1970s: Direct access with hard disks. Codd introduces the relational data model (1970). UC Berkeley builds Ingres.
1980s: SQL becomes a standard. Parallel, object-oriented and distributed DBMSs are built.
1990s: Multi-terabyte data warehouses, web commerce.
2000s: XML standards emerge.
Now: Big Data, NoSQL databases.
Course Dynamics
Assessment
Component                  Weight
Assignments (2 * 10%)      20%
Exam                       30%
Course Plan*
Topics                                         Text Book Reference
Introduction to RDBMS                          Chapters 1, 2
SQL                                            Chapters 3, 4
Relational Model                               Chapter 6
  Relational Algebra
Relational Database Design                     Chapters 7, 8
  Functional Dependencies, Normal Forms, Keys, Decomposition
Data Storage and Querying                      Chapters 10, 11
  Storage, Indexing Structures
Transaction Management                         Chapter 14
  ACID Properties, Transactions, Concurrency control, Serialization
* Tentative
Acknowledgment
• Slide contents are borrowed from the official website of the course text. For the authors’ original version of slides, visit:
  • https://www.db-book.com/db6/slide-dir/index.html
Introduction to RDBMS
Relational Model
• All the data is stored in various tables.
• Example of tabular data in the relational model
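As a small illustration of tabular data, the sketch below builds two tables with Python's built-in sqlite3 module. The instructor and department tables, their columns, and their rows are assumed sample data in the spirit of the course text, not a copy of the slide's figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two assumed tables: each row is a tuple, each column an attribute.
conn.execute("CREATE TABLE department (dept_name TEXT, building TEXT, budget INTEGER)")
conn.execute("CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)")

conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [("Physics", "Watson", 70000), ("Finance", "Painter", 120000)],
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [("22222", "Einstein", "Physics", 95000), ("12121", "Wu", "Finance", 90000)],
)

# Read the instructor table back: column names first, then one tuple per row.
cursor = conn.execute("SELECT * FROM instructor ORDER BY ID")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
```

The shared dept_name column is what relates a row in one table to a row in the other.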
A Sample Relational Database
Exercise: Describe these two tables in your own words (in two to three
sentences each).
Data Definition Language (DDL)
• Specification notation for defining the database schema
• DDL compiler generates a set of table templates stored in a data dictionary
• Data dictionary contains metadata (i.e., data about data)
  • Database schema
  • Integrity constraints
    • Primary key (ID uniquely identifies instructors)
  • Authorization
Defining Instructor Relation
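The slide's own definition is not part of this transcript, so the following is only a sketch of what defining the instructor relation could look like. It uses Python's built-in sqlite3 module; the column names and types are assumptions. It also shows the primary-key constraint being enforced, and the schema being recorded in SQLite's catalog table sqlite_master, which plays the role of a data dictionary here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema (column names and types are assumed for illustration).
conn.execute(
    """
    CREATE TABLE instructor (
        ID        CHAR(5),
        name      VARCHAR(20) NOT NULL,
        dept_name VARCHAR(20),
        salary    NUMERIC(8, 2),
        PRIMARY KEY (ID)
    )
    """
)
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics', 95000)")

# Integrity constraint: a second row with the same ID must be rejected.
try:
    conn.execute("INSERT INTO instructor VALUES ('22222', 'Wu', 'Finance', 90000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Metadata: the table definition itself is stored as data in the catalog.
stored_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(duplicate_rejected, stored_tables)
```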
Data Manipulation Language (DML)
• Language for accessing and manipulating the data organized by the appropriate data model
• DML also known as query language
• Two classes of languages
  • Pure – used for proving properties about computational power and for optimization
    • Relational Algebra
    • Tuple relational calculus
    • Domain relational calculus
  • Commercial – used in commercial systems
    • SQL is the most widely used commercial language
SQL
• The most widely used commercial language
• Application programs generally access databases through one of:
  • Language extensions to allow embedded SQL
  • Application program interface (e.g., ODBC/JDBC) which allows SQL queries to be sent to a database
Sample SQL statements
select name
from instructor;

update instructor
set salary = salary * 1.03
where salary > 100000;
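To see the two sample statements run, and to illustrate an application program sending SQL through an API, here is a minimal sketch using Python's built-in sqlite3 module in place of ODBC/JDBC. The three instructor rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [
        ("10101", "Srinivasan", 65000.0),
        ("22222", "Einstein", 95000.0),
        ("98345", "Kim", 120000.0),
    ],
)

# Query: select name from instructor
names = [row[0] for row in conn.execute("SELECT name FROM instructor ORDER BY ID")]

# Update: give a 3% raise to instructors earning more than 100000.
conn.execute("UPDATE instructor SET salary = salary * 1.03 WHERE salary > 100000")
salaries = dict(conn.execute("SELECT name, salary FROM instructor"))

print(names)
print(salaries)
```

Only Kim's salary changes, since the where clause filters out the other two rows.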
Database Design
• The process of designing the general structure of the database:
  • Logical Design – Deciding on the database schema. Database design requires that we find a “good” collection of relation schemas.
    • Business decision – What attributes should we record in the database?
    • Computer Science decision – What relation schemas should we have and how should the attributes be distributed among the various relation schemas?
  • Physical Design – Deciding on the physical layout of the database
Design Approaches
• Need to come up with a methodology to ensure that each of the relations in the database is “good”
• Two ways of doing so:
  • Entity Relationship Model (Chapter 7)
    • Models an enterprise as a collection of entities and relationships
    • Represented diagrammatically by an entity-relationship diagram
  • Normalization Theory (Chapter 8)
    • Formalize what designs are bad, and test for them
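One thing normalization theory formalizes is the functional dependency. The sketch below is a rough Python illustration, assuming a hypothetical combined instructor-and-department relation; fd_holds is a helper invented for this example. Note that a sample of rows can only refute a dependency, it cannot prove that the dependency holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical relation mixing instructor and department attributes.
inst_dept = [
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "building": "Watson"},
    {"ID": "33456", "name": "Gold", "dept_name": "Physics", "building": "Watson"},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "building": "Painter"},
]

dept_determines_building = fd_holds(inst_dept, ["dept_name"], ["building"])
dept_is_key = fd_holds(inst_dept, ["dept_name"], ["ID", "name"])

print(dept_determines_building, dept_is_key)
```

Here dept_name determines building but does not identify a row, so the building is repeated for every instructor in the department. That repetition is the kind of redundancy a normal-form test is meant to flag.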
A Part of an ER Diagram
A More Elaborate ER Model
Database Design (Cont.)
• Is there any problem with this relation?
Database Engine
• Storage manager
• Query processing
• Transaction manager
Storage Management
• Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
• The storage manager deals with:
  • File organization
  • Indexing and hashing
  • and anything that is low-level concerning the data storage.
File Organization
• The database is stored as a collection of files. Each file is a sequence of records. A record is a sequence of fields.
• One approach:
  • assume record size is fixed
  • each file has records of one particular type only
  • different files are used for different relations
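The fixed-size approach above can be sketched with Python's struct module. The record layout (5-byte ID, 20-byte name, 8-byte salary) and the helper names are assumptions made for illustration. Because every record has the same size, record i starts at byte offset i * record_size.

```python
import struct

# Assumed layout: 5-byte ID, 20-byte name, 8-byte float salary, no padding.
RECORD = struct.Struct("<5s20sd")


def pack_record(instructor_id, name, salary):
    """Encode one record; short strings are padded with null bytes."""
    return RECORD.pack(instructor_id.encode(), name.encode(), salary)


def read_record(data, index):
    """Decode the record at position index by jumping to its byte offset."""
    raw_id, raw_name, salary = RECORD.unpack_from(data, index * RECORD.size)
    return raw_id.decode().rstrip("\x00"), raw_name.decode().rstrip("\x00"), salary


# A "file" holding records of one type only, as a sequence of bytes.
file_bytes = b"".join(
    pack_record(*record)
    for record in [
        ("10101", "Srinivasan", 65000.0),
        ("22222", "Einstein", 95000.0),
        ("98345", "Kim", 80000.0),
    ]
)

print(RECORD.size)                 # 33 bytes per record
print(read_record(file_bytes, 1))  # second record, found without scanning
```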
Query Processing
How to effectively execute the query?
Select Operation – selection of rows (tuples)
Relation r
$\sigma_{(A = B) \land (D > 5)}(r)$
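A minimal Python sketch of the select operation with the predicate (A = B) and (D > 5). The rows of r are assumed sample data, since the slide's table is not in this transcript, and select is a helper written for this example.

```python
def select(relation, predicate):
    """Relational select: keep the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]


# Assumed sample relation r with attributes A, B, C, D.
r = [
    {"A": "a", "B": "a", "C": 1, "D": 7},
    {"A": "a", "B": "b", "C": 5, "D": 7},
    {"A": "b", "B": "b", "C": 12, "D": 3},
    {"A": "b", "B": "b", "C": 23, "D": 10},
]

result = select(r, lambda row: row["A"] == row["B"] and row["D"] > 5)

for row in result:
    print(row)
```

Select only filters rows. The attributes of the result are the same as those of r.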
Transaction Management
• A transaction is a unit of program execution that accesses and possibly updates various data items.
• E.g., transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Two main issues to deal with:
  • Failures of various kinds, such as hardware failures and system crashes
  • Concurrent execution of multiple transactions
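The failure issue can be sketched with Python's built-in sqlite3 module. The account table, the starting balances, and the crash_midway flag are assumptions made for illustration. The second call simulates a failure between write(A) and write(B), and the rollback undoes the debit so that no money disappears.

```python
import sqlite3


def transfer(conn, src, dst, amount, crash_midway=False):
    """Debit src and credit dst as one transaction: both happen or neither."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if crash_midway:
            raise RuntimeError("simulated failure between write(A) and write(B)")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

transfer(conn, "A", "B", 50)  # succeeds: A = 50, B = 70

try:
    transfer(conn, "A", "B", 50, crash_midway=True)  # fails after debiting A
except RuntimeError:
    pass  # the debit was rolled back

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)
```

This covers only the failure case. Concurrent execution of several transactions needs separate concurrency-control machinery, which this sketch does not attempt.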
Summary and Course Plan*
Topics                                         Text Book Reference
Introduction to RDBMS                          Chapters 1, 2
SQL                                            Chapters 3, 4
Relational Model                               Chapter 6
  Relational Algebra
Relational Database Design                     Chapters 7, 8
  Functional Dependencies, Normal Forms, Keys, Decomposition
Data Storage and Querying                      Chapters 10, 11
  Storage, Indexing Structures
Transaction Management                         Chapter 14
  ACID Properties, Transactions, Concurrency control, Serialization
* Tentative