Introduction to DBMSefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en... · Introduction to...

Introduction to Database Management Systems V2.0 2011/09/09 Jeffrey D. Ullman – Jean-Michel Busca



Interesting Stuff About Databases

It used to be about boring stuff: employee records, bank records, etc.

Today, the field covers all the largest sources of data, with many new ideas.

• Web search.

• Scientific and medical databases.

• Data mining.

• Integrating information.


More Interesting Stuff

Database programming centers around limited programming languages.

• One of the few areas where non-Turing-complete languages make sense.

• Leads to very succinct programming, but also to unique query-optimization problems.


Still More …

You may not notice it, but databases are behind almost everything you do on the Web.

• Google searches.

• Queries at Amazon, eBay, etc.


And More…

Databases often have unique concurrency-control problems:

• Many activities (transactions) at the database at all times.

• Must not confuse actions, e.g., two withdrawals from the same account must each debit the account.
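The two-withdrawals requirement can be illustrated with a minimal sketch. It is not how a DBMS is implemented: the `Account` class and `run_withdrawals` helper are hypothetical names, and a plain in-process lock stands in for the locking or transaction machinery a real system would use. The point is only that the read-check-write sequence must behave as one indivisible step.

```python
import threading


class Account:
    """Toy in-memory account; a hypothetical stand-in for a database row."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        # Holding the lock makes "read balance, check, write balance"
        # a single step, so two withdrawals cannot overwrite each other.
        with self._lock:
            if self.balance < amount:
                return False
            self.balance -= amount
            return True


def run_withdrawals(account, amounts):
    """Run one withdrawal per thread and return the final balance."""
    threads = [
        threading.Thread(target=account.withdraw, args=(amount,))
        for amount in amounts
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance


account = Account(100)
final_balance = run_withdrawals(account, [30, 20])
print(final_balance)  # both withdrawals are debited: 100 - 30 - 20
```

Without the lock, both threads could read the same starting balance and one debit could be lost, which is exactly the confusion the slide warns about.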


Why Use a Database Management System?


A Simplistic "Database"

Assume you want to store your address book on your computer:

Name, Telephone, Address

John Smith, 212 549 123, 10 bank street – NY, NY … … …

You don't need a DBMS for this: a plain CSV file or Excel file will do. Why?


A Simplistic "Database" (2)

The structure of the data is simple:

• one relation, represented by lines

• no strong consistency constraints

The data set is small:

• no performance issues

• a simple, linear search will do
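As a rough illustration of why a flat file is enough here, the sketch below does a linear search over an address book held as CSV text. The `find_by_name` helper is a hypothetical name, and the "Jane Doe" row is made up for the example; only the John Smith row comes from the slide.

```python
import csv
import io

# Address book as CSV text. The second row is a hypothetical entry
# added only so that the search has more than one line to scan.
ADDRESS_BOOK_CSV = """\
Name,Telephone,Address
John Smith,212 549 123,"10 bank street – NY, NY"
Jane Doe,000 000 000,"1 example road – NY, NY"
"""


def find_by_name(csv_text, name):
    """Linear search: read rows one by one until the name matches."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if row["Name"] == name:
            return row
    return None


match = find_by_name(ADDRESS_BOOK_CSV, "John Smith")
print(match["Telephone"])
```

The cost grows with the number of lines, which is harmless for a personal address book and becomes a problem only for much larger data sets.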


A Simplistic "Database" (3)

You are the only user:

• one update at a time

• you have access to every data item


Why use a DBMS?

In contrast, a DBMS is designed to support

• very large data sets,

• with complex structure,

• accessed by many users.

Let's see what issues are raised and how DBMSs address them.


Large Data Sets

A DBMS implements sophisticated techniques to store and retrieve data efficiently (indexes, caching of data, of temporary results, etc.).

It also lets you compile and store the code of the most common queries so that they execute efficiently.
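The effect of an index can be sketched with SQLite through Python's standard `sqlite3` module. The `contact` table, the index name, and the generated rows are assumptions made for this example. The parameterized query text is reused as-is, which gives the driver the opportunity to reuse its compiled statement; how a particular DBMS stores compiled queries varies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, telephone TEXT)")

# Hypothetical data: 1000 generated contacts.
conn.executemany(
    "INSERT INTO contact VALUES (?, ?)",
    [(f"user{i}", f"555-{i:04d}") for i in range(1000)],
)

# The index lets the engine locate a name without scanning every row.
conn.execute("CREATE INDEX idx_contact_name ON contact (name)")

# One parameterized query text, reused for every lookup.
LOOKUP = "SELECT telephone FROM contact WHERE name = ?"

# Ask SQLite how it intends to run the query; the last column of each
# plan row is a human-readable description of the step.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + LOOKUP, ("user42",)).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan_rows)
print(plan_text)

telephone = conn.execute(LOOKUP, ("user42",)).fetchone()[0]
print(telephone)
```

The printed plan should mention `idx_contact_name`, showing that the lookup goes through the index rather than a full table scan.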


Complex Data Structures

A DBMS defines a data model that

• makes it possible to describe complex data structures and relationships among data

• provides an abstract view of the data that hides how data is actually stored

It provides a Data Manipulation Language (DML) for expressing complex queries and updates over the data.
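As an illustration of what a DML looks like in practice, the sketch below uses SQL (the DML of relational systems) through Python's `sqlite3` module. The `person` and `phone` tables and all rows other than John Smith's first number are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE phone (person_id INTEGER REFERENCES person(id), number TEXT);
INSERT INTO person VALUES (1, 'John Smith');
INSERT INTO person VALUES (2, 'Jane Doe');
INSERT INTO phone VALUES (1, '212 549 123');
INSERT INTO phone VALUES (1, '212 549 456');
INSERT INTO phone VALUES (2, '000 000 000');
""")

# A query that relates two tables and aggregates the result:
# how many phone numbers does each person have?
rows = conn.execute("""
    SELECT p.name, COUNT(ph.number)
    FROM person AS p
    JOIN phone AS ph ON ph.person_id = p.id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)

# An update is expressed just as declaratively as a query.
conn.execute("DELETE FROM phone WHERE number = '212 549 456'")
remaining = conn.execute("SELECT COUNT(*) FROM phone").fetchone()[0]
print(remaining)
```

Nothing in the query says how the tables are stored or in which order they are scanned; that is the abstract view the data model provides.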


Complex Data Structures (2)

It also provides a Data Definition Language (DDL) for describing the data and their integrity constraints.

It enforces integrity constraints on every update, thus preserving database consistency.
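A small sketch of constraint enforcement, again using SQLite via Python's `sqlite3` module. The `account` table and its "balance may not go negative" rule are assumptions chosen for the example, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describe the data and state its integrity constraints.
conn.execute("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        owner   TEXT    NOT NULL,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO account VALUES (1, 'John Smith', 100)")

# An update that would break the constraint is refused by the DBMS.
try:
    conn.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(rejected, balance)
```

The constraint is declared once, in the schema, and checked on every update, so no application can leave the table in an inconsistent state by forgetting the check.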


Many Users

Concurrent access: a DBMS schedules concurrent accesses to the data so that

• users always get consistent results

• database consistency is always preserved

Crash recovery: the database is a critical resource; a DBMS protects users from the effect of system failures.
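Real crash recovery relies on logging machinery that cannot be shown in a few lines, but its visible effect, all-or-nothing transactions, can be sketched. In the example below a failure is simulated with an exception between two updates; the `account` table and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("INSERT INTO account VALUES (2, 0)")
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move money between accounts as a single transaction."""
    # "with conn" commits if the block succeeds and rolls back
    # if an exception escapes from it.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )


def balances(conn):
    return conn.execute("SELECT balance FROM account ORDER BY id").fetchall()


try:
    transfer(conn, 1, 2, 30, fail_midway=True)
except RuntimeError:
    pass
after_failure = balances(conn)  # the half-done debit has been undone

transfer(conn, 1, 2, 30)
after_success = balances(conn)
print(after_failure, after_success)
```

After the simulated failure the debit is rolled back, so users never observe money that has left one account without reaching the other.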


Many Users (2)

Access control: a DBMS manages privileges and roles that govern which data each class of users can access.

A DBMS also implements views that restrict the part of the database that specific classes of users can see.
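A view can be sketched with SQLite through Python's `sqlite3` module. The `employee` table and the `employee_directory` view are hypothetical. SQLite has no user accounts, so the privilege side is not shown; in a multi-user DBMS, access would typically be granted on the view rather than on the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER);
INSERT INTO employee VALUES ('John Smith', 'Sales', 50000);

-- The view exposes names and departments but hides salaries.
CREATE VIEW employee_directory AS
    SELECT name, department FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

A user who can query only `employee_directory` sees the directory information and nothing else.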


A Bit of History


The Network Model

The first DBMSs implemented rudimentary data models.

Early 60s: network data model

• data arranged in graphs

• standardized by CODASYL

• ex: Integrated Data Store (General Electric)


The Hierarchical Model

Late 60s: hierarchical data model

• data arranged in trees

• ex: Information Management System (IBM)

Those models were close to the actual implementation of data: queries were difficult to program and to maintain.


The Relational Model

1970: E. Codd defined the relational data model, still widely used today:

• data arranged in logical tables

• linked by (foreign) keys.

Today, DBMSs are evolving toward XML- or object-based data models to handle more complex structures or data types.


The Relational Model (2)

The relational model is very popular because it has a strong mathematical foundation.

Every query translates to one or more operations (selection, intersection, …) on sets (tables): this is relational algebra.
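To make the "tables are sets" idea concrete, here is a minimal sketch that models relations as Python sets of tuples. The helper names `select` and `project` and the sample rows are hypothetical; a real DBMS does not evaluate queries this way, but the operations have the same meaning.

```python
# Two relations with the same schema (name, city), as sets of tuples.
# The rows are made up for the example.
R = {("John Smith", "NY"), ("Jane Doe", "LA")}
S = {("Jane Doe", "LA"), ("Ann Lee", "NY")}


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


def project(relation, indexes):
    """Projection: keep only the listed columns (duplicates vanish)."""
    return {tuple(row[i] for i in indexes) for row in relation}


in_ny = select(R, lambda row: row[1] == "NY")
in_both = R & S                  # intersection
cities = project(R | S, [1])     # union, then projection on city

print(in_ny)
print(in_both)
print(sorted(cities))
```

Because the result of each operation is again a set of tuples, operations compose freely, which is what lets a query be expressed as an algebraic expression.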


The Relational Model (3)

As a result:

• the semantics of queries are clear and unambiguous,

• queries can be optimized using well-known mathematical transformations (query rewriting).
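One classic rewriting rule is pushing a selection below a join when the selection's condition involves only one of the two tables. The sketch below checks the equivalence on a tiny example, with relations modeled as Python sets of tuples; the `select` and `join` helpers and the sample rows are hypothetical.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


def join(left, right, left_key, right_key):
    """Equi-join: concatenate pairs of tuples whose key columns match."""
    return {
        l_row + r_row
        for l_row in left
        for r_row in right
        if l_row[left_key] == r_row[right_key]
    }


# person(id, name, city) and phone(person_id, number); made-up rows.
person = {(1, "John Smith", "NY"), (2, "Jane Doe", "LA")}
phone = {(1, "212 549 123"), (2, "000 000 000")}


def lives_in_ny(row):
    # Column 2 is the city, both before and after the join,
    # because person's columns come first in the joined tuple.
    return row[2] == "NY"


# Plan A: join everything, then filter.
join_then_select = select(join(person, phone, 0, 0), lives_in_ny)

# Plan B: filter person first, then join the smaller relation.
select_then_join = join(select(person, lives_in_ny), phone, 0, 0)

print(join_then_select == select_then_join)
```

Both plans return the same tuples, but plan B joins fewer rows. An optimizer can apply such a rule safely precisely because the algebra guarantees the two expressions are equivalent.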
