SQL & intro to db architectures - Harvard...

71
SQL & intro to db architectures prof. Stratos Idreos HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/ class 3

Transcript of SQL & intro to db architectures - Harvard...

Page 1: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

SQL & intro to db architecturesprof. Stratos Idreos

HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/

class 3

Page 2: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

welcome brave cs165 students!

2

35+62

Page 3: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /553

Laura Haas

Data Systems Researcher

Director of IBM Research’s Accelerated Discovery Lab

& Harvard alumna

The Power Behind the Throne: Information Integration in the Age of Data-Driven Discovery

October 5

a 1 hour discussion with students will follow after class

guest lecture

1 hour wics meeting 3:30-4:30

Page 4: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /554

Peter Haas

Data Systems Researcher

IBM Almaden Research Center & Harvard alumnus, class 78

October 5, 3:30pm

Balancing Recency and Continuity in Massive Scale Dynamic Interaction Graphs

guest lecture

Page 5: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /555

Nga Tran

Data Systems Researcher

Head of Optimizer Group at HP Vertica

Nov 9

guest lecture

Page 6: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

when should I start my project?

6

next week

in the meantime: 1) play with code base & tools 2) linked list, binary tree 3) MonetDB, PostgreSQL

Page 7: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

which tests should I pass?

7

we care about DSL tests by Wed it should be obvious why

leaderboard will test all DSL tests so you will know exactly your status at any time

more tests soon

Page 8: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

how much should I optimize?

8

enough…e.g., see m3

we expect a generally elegant design & implementation

often OH & sectionsrule of thumb: at least 2 areas

Page 9: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /559

slides are not notes!

starting today we will do collaborative note taking:

http://tinyurl.com/cs165-notes

ping for Wasay for access

How: 1) register for a class or two 2) take notes during class and put

them in later on 3) edit notes “your name: note”

Page 10: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5510

assume an array of N integers: find all positions where value>x

data

qualifying positions

exists in all systems: sql, nosql, newsqlselect operator

not as simple as it looks…

from last time

Page 11: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

today

11

logical design

physical design

system design

Page 12: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5512

essential steps in using a database system

clean schema load tune

query

experts/system admins

user/apps

Page 13: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5513

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

table/relationcolumn/attribute

key

Page 14: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5513

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

table/relationcolumn/attribute

create table for professors: create table professors (id:integer, name: char(40), telephone: char(10), …)

key

Page 15: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5513

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

table/relationcolumn/attribute

create table for professors: create table professors (id:integer, name: char(40), telephone: char(10), …) insert into professors (76897689, “john smith”, …)

key

Page 16: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5513

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

table/relationcolumn/attribute

give me the names of all students: select name from students

create table for professors: create table professors (id:integer, name: char(40), telephone: char(10), …) insert into professors (76897689, “john smith”, …)

key

Page 17: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5513

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

table/relationcolumn/attribute

give me the names of all students: select name from students

create table for professors: create table professors (id:integer, name: char(40), telephone: char(10), …) insert into professors (76897689, “john smith”, …)

where GPA>3.0

key

Page 18: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5514

employee(id:int, name:varchar(50), office:char(5), telephone:char(10), city:varchar(30), salary:int)

(1, name1, office1, tel1, city1, salary1) (2, name2, office2, tel2, city2, salary2) (3, name3, office3, tel3, city3, salary3) (4, name4, office4, tel4, city4, salary4) (5, name5, office5, tel5, city5, salary5) (6, name6, office6, tel6, city6, salary6) (7, name7, office7, tel7, city7, salary7) (8, name8, office8, tel8, city8, salary8) (9, name9, office9, NULL, city9, salary9)

schema

data

cardinality=9

SQL:insert into employee (1, name1, office1, tel1, city1, salary1)

value does not exist

Page 19: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5515

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

give me all students enrolled in cs165

Page 20: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5515

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

give me all students enrolled in cs165

enrolled(studentId,

courseId,…)

Page 21: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5515

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

give me all students enrolled in cs165

enrolled(studentId,

courseId,…) foreign key

Page 22: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5515

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

give me all students enrolled in cs165select student.name from students, enrolled, courses where courses.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId

enrolled(studentId,

courseId,…) foreign key

Page 23: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5515

relational model+SQL

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

database

give me all students enrolled in cs165select student.name from students, enrolled, courses where courses.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId

enrolled(studentId,

courseId,…) foreign key

join

Page 24: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5516

students(id,name,…)

enrolled(studentId,courseId,…)

how do we join

Page 25: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

normalization

17

AllData(student ID,student name,student address, course name, grade, professor name, professor ID, professor telephone,…)

good

say schema about university db contains one table…

Page 26: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

normalization

17

AllData(student ID,student name,student address, course name, grade, professor name, professor ID, professor telephone,…)

good

say schema about university db contains one table…

duplicates - tons of data - updates - but no joins

Page 27: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5518

fact table(id1,id2,…)

dimension table 1(id1,…)

dimension table 2(id2,…)

star schema

Page 28: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5519

snowflake schema

Page 29: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5520

create table employee (id:integer, name:varchar(50) not null, office:char(5), telephone:char(10), city:varchar(30), salary:integer, primary key (id) check (salary<100000))

constraints

at most 5 charsmust have a value

must be uniquemust not become rich

Page 30: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5520

create table employee (id:integer, name:varchar(50) not null, office:char(5), telephone:char(10), city:varchar(30), salary:integer, primary key (id) check (salary<100000))

constraints

at most 5 charsmust have a value

must be uniquemust not become rich

when and how do we enforce constraints

Page 31: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5521

select max(GPA),avg(GPA),min(GPA) from students

aggregations

select R.a - R.b + R.c from R

math

select * from R where R.a IN (select b from S where C<10)

nested

select * from R where a =10UNIONselect * from B where b =20

set ops

more SQL examples

Page 32: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5522

select avg(GPA), class, major from students where GPA>3.0 and class>1990 group by class, major order by class

Page 33: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5523

Employee(id:int, name:varchar(50), office:char(5),

telephone:char(10), city:varchar(30), salary:int)

base table

Employee-Berlin-Managerselect * from employee where city=“berlin”

view to be used by managers in Berlin

Employee-Berlin-Allselect id,name,city,office from employee where city=“berlin”

view to be used by all employees in Berlin

Page 34: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5523

Employee(id:int, name:varchar(50), office:char(5),

telephone:char(10), city:varchar(30), salary:int)

base table

Employee-Berlin-Managerselect * from employee where city=“berlin”

view to be used by managers in Berlin

Employee-Berlin-Allselect id,name,city,office from employee where city=“berlin”

view to be used by all employees in Berlin

how should we store views

Page 35: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

design

24

logical design

physical design

system design

physical/logical independence

app/user: no need to know how data is stored/accessed

we can safely change lower layers

Page 36: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

quiz 1

25

~15 minutes - groups of 2+

it is summer 2015 - now you know all about data systems

you are building an augmented reality startup using Google Glass

people wearing Google Glass can tag places/objects - voice/image recognition works fine

tagging means assigning values, comments, etc to an object

you can then query this data - again assume voice recognition works fine and a black box translates natural language to SQL

how does the schema of your app look like? (tables, attributes, keys, relationships) (assume a limited working environment/features, say walking around Harvard square/yard)

describe 2 interesting queries in SQL

Page 37: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5526

user(id,name,location,device,…)

object(id,name,location,telephone,date,url,color,taste,…many more)

comment(id,user_id,oject_id,text,…)

likes_object(user_id,oject_id,)

likes_comment(user_id,comment_id)

a possible example

trust(user_id,user_id)

q1: get all places where jenny said “awesome” q2: get all users that like what I like and are close by

Page 38: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5526

user(id,name,location,device,…)

object(id,name,location,telephone,date,url,color,taste,…many more)

comment(id,user_id,oject_id,text,…)

likes_object(user_id,oject_id,)

likes_comment(user_id,comment_id)

a possible example

trust(user_id,user_id)

q1: get all places where jenny said “awesome” q2: get all users that like what I like and are close by

select object.location from object, user where user.name = “jenny” and

comment.user_id=user.id and comment.text LIKE “%awesome%”

Page 39: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5526

user(id,name,location,device,…)

object(id,name,location,telephone,date,url,color,taste,…many more)

comment(id,user_id,oject_id,text,…)

likes_object(user_id,oject_id,)

likes_comment(user_id,comment_id)

a possible example

trust(user_id,user_id)

q1: get all places where jenny said “awesome” q2: get all users that like what I like and are close by

Page 40: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5526

user(id,name,location,device,…)

object(id,name,location,telephone,date,url,color,taste,…many more)

comment(id,user_id,oject_id,text,…)

likes_object(user_id,oject_id,)

likes_comment(user_id,comment_id)

a possible example

trust(user_id,user_id)

q1: get all places where jenny said “awesome” q2: get all users that like what I like and are close by

select user.name, user.location from user, likes_object as L1, likes_object as L2 where L1.user_id=my_id and L1.object_id=L2.object_id and L2.user_id !=my_id

and user.id=L2.user_id and close(user.location,mylocation)=true

Page 41: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5527

object(id,name,location,telephone,date,url,color,taste,…many more)

how do we store the object table?

what if we want to add another kind of

object?

Page 42: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

design

28

logical design

physical design

system design

Page 43: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

so do db systems “just work”?

29

declarative interface ask what you want

db system

Page 44: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5530

declarative interface ask what you want

db system

DBAindexes/views/tuning knobs

Page 45: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5530

declarative interface ask what you want

db system

DBAindexes/views/tuning knobsbut … db cracking, adaptive* ideas

Page 46: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5531

clean schema load tune

query

experts/system admins

user/apps

essential steps in using a database system

Page 47: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

design

32

logical design

physical design

system design next up: db architectures 101

Page 48: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5533

declarative interface ask what you want

db system

the db system decides how to “best store and access data”

Page 49: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5534

design/implement numerous possible algorithms + data representations

choose the bestdata source, algorithms and path for each query

database kernel

data data data

algo

rithm

s/op

erat

ors

applications

sql

Page 50: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5535

database kernel

data data data

algo

rithm

s/op

erat

ors

select min(A) from R where B<10 and C<80

optimizer

parser

execution

storage

Page 51: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5536

database kernel

applications

sql

optimizer

parser

execution

storage

in/out

admission

Page 52: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5537

database kernel

applications

sql

client programs

thread1

thread2

thread3 db

program

thread pool

thread5

thread4

Page 53: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5538

database kernel

applications

sql

optimizer

parser

execution

storage buffer pool

in/out

thread pool

transactions

disk

memory

cpu

Page 54: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5538

database kernel

applications

sql

optimizer

parser

execution

storage buffer pool

in/out

thread pool

transactions

disk

memory

cpu

is it “good” to have modules

Page 55: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5539

query plan

database kernel

data data data

algo

rithm

s/op

erat

ors

Page 56: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

select name from student

where GPA>3.0

40

select GPA>3.0

project name

student(id,name,GPA,address,class,…)

result

scan all the data?

logical plan

Page 57: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

select name from student

where GPA>3.0

41

scan GPA>3.0

project name

result

studentsstudents

index scan GPA>3.0

project name

result

physical plans

Page 58: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

select avg(GPA) from student

where class=2017

42

select year=2017

project GPA

student(id,name,GPA,address,class,…)

avg GPA

result

Page 59: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5543

professors(id,name,…)

courses(id,name, profId,…)

students(id,name,…)

give me all students enrolled in cs165select student.name from students, enrolled, courses where courses.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId

enrolled(studentId,

courseId,…)

Page 60: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5544

select courses.name=“cs165”

join enrolled.courseid=course.id

students enrolled courses

join student.id=enrolled.studentid

project student.name

good plan

select student.name from students, enrolled, courses where courses.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId

Page 61: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5545

select courses.name=“cs165”

join enrolled.courseid=course.id

students enrolled courses

join student.id=enrolled.studentid

project student.name

select student.name from students, enrolled, courses where courses.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId

pushing selects down

Page 62: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5546

select min(A) from R where B<10 and C<80

internal languagelogical plan

optimizer rules/cost model/statistics

physical plan execution

internal language

Page 63: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

concurrency

47

how many queries should a db run in parallel and how

reads - writes

Page 64: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

concurrency

48

transactions ACID properties locks

Jim Gray, IBM, Tandem, DEC, Microsoft ACM Turing award ACM SIGMOD Edgar F. Codd Inovations award 1993

Page 65: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

recovery

49

what should happen if something fails during a query?

reads - writes

Page 66: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

recovery

50

classic examplejoe owes mike 100$both joe and mike have a Bank of Bla account

possible actions

joe -100 mike + 100

mike + 100 joe - 100

1) read mike; 2) mike+100; 3) write new mike;

Page 67: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /55

recovery

51

logs write ahead replay checkpoints

C Mohan, IBM ACM SIGMOD Edgar F. Codd Inovations award 1993

Page 68: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5552

optimizer

execution

storage

tuning

db kernel

can DBAs make wrong decisions?

can optimizers make

wrong decisions?

Page 69: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5553

Architecture of a Database System (Sections 1,2,3,4)by J. Hellerstein, M. Stonebraker and J. Hamilton

Page 70: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

CS165, Fall 2015 Stratos Idreos /5554

IEEE Data Engineering Bulletin, 35(1), March 2012 Special Issue on Column-stores (9 short overview papers)

next class we start discussing data layouts and column-stores

The Design and Implementation of Modern Column-store Database Systems (Sections: all -4.6 & 4.8)by D. Abadi, P. Boncz, S. Harizopoulos, S. Idreos, S. Madden

readings for next few classes

Page 71: SQL & intro to db architectures - Harvard Universitydaslab.seas.harvard.edu/classes/cs165/doc/class_slides/Class3CS16… · CS165, Fall 2015 Stratos Idreos 4 /55 Peter Haas Data Systems

SQL & intro to db architectures

DATA SYSTEMSprof. Stratos Idreos

class 3