CS145: Intro to Databases - Stanford...

52
CS145: Intro to Databases Lecture 1: Course Overview 1

Transcript of CS145: Intro to Databases - Stanford...

Page 1: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

CS145:IntrotoDatabases

Lecture1:CourseOverview

1

Page 2: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Theworldisincreasinglydrivenbydata…

2

Thisclassteachesthebasicsofhowtouse&managedata.

Page 3: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

KeyQuestionsWeWillAnswer

• Howcanwecollectandstorelargeamountsofdata?• Bybuildingtoolsanddatastructurestoefficientlyindexandservedata

• Howcanweefficientlyquerydata?• Bycompilinghigh-leveldeclarativequeriesintoefficientlow-levelplans

• Howcanwesafelyupdatedata?• Bymanagingconcurrentaccesstostateasitisreadandwritten

• Howdodifferentdatabasesystemsmanagedesigntrade-offs?• e.g.,atscale,inadistributedenvironment?

3

Page 4: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whenyou’llusethismaterial

• Buildingalmostanysoftwareapplication• e.g.,mobile,cloud,consumer,enterprise,analytics,machinelearning• Corollary:everyapplicationyouuseusesadatabase• Bonus:everyprogramconsumesdata(evenifonlytheprogramtext!)

• Performingdataanalytics• Businessintelligence,datascience,predictivemodeling• (Evenifyou’reusingPandas,you’reusingrelationalalgebra!)

• Buildingdata-intensivetoolsandapplications• Manycoreconceptspowerdeeplearningframeworkstoself-drivingcars

4

Page 5: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Today’sLecture

1. Introduction,admin&setup• ACTIVITY:Jupyter “HelloWorld!”

2. Overviewoftherelationaldatamodel• ACTIVITY:SQLinJupyter

3. OverviewofDBMStopics:Keyconcepts&challenges

5

Page 6: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

1.Introduction,admin&setup

6

Section1

Page 7: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whatyouwilllearnaboutinthissection

1. MotivationforstudyingDBs

2. Administrativestructure

3. Courselogistics

4. Overviewoflecturecoverage

5. ACTIVITY:Jupyter “HelloWorld!”

7

Section1

Page 8: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

BigDataLandscape…InfrastructureisChanging

8http://www.bigdatalandscape.com/

New tech.Same Principles.

Section1>Introduction

Page 9: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whyshouldyou studydatabases?

• Mercenary-makemore$$$:• StartupsneedDBtalentrightaway=lowemployee#• Massiveindustry…

• Intellectual:• Science:datapoortodatarich

• Noideahowtohandlethedata!• Fundamentalideasto/fromallofCS:

• Systems,theory,AI,logic,stats,analysis….

9ManygreatcomputersystemsideasstartedinDB.

Section1>Introduction

Page 10: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whatthiscourseis(andisnot)

• Discussfundamentalsofdatamanagement• Howtodesigndatabases,querydatabases,buildapplicationswiththem.• Howtodebugthemwhentheygowrong!• Not howtobeaDBAorhowtotuneOracle12g.

• We’llcoverhowdatabasemanagementsystemswork

• Andsome(butnotallof)theprinciplesofhowtobuildthem• see245,345,and346.

10

Section1>Introduction

Page 11: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whoweare…

Instructor(me)PeterBailis• FacultyintheInfoLab• SecondyearatStanford,firsttimeteachingCS145!• Research:tools+systemsforlarge-scaledataanalytics• Officehours:T/Th 4:30-5:30,Gates410

11

Section1>Administrative>CourseStaff

Page 12: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

CourseAssistants(CAs)

12

Section1>Administrative>CourseStaff

Page 13: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

13

Section1>Administrative>CourseStaff

DevBhargavaWilliamChenSorooshHemmatiWoncheol JeongLingtongSunStephanieTangAmeliaVu

Page 14: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

14

Section1>Administrative>CourseStaff

TaraHeadCA

Page 15: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

15

CS145.stanford.edu

Page 16: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Communicationw/CourseStaff

• Piazza

• Officehours

• Byappointment!

16

OHsarelistedonthecoursewebsite!

Section1>Administrative

Page 17: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Piazza

Thegoalistogetyoutoanswereachother’squestionssoyoucanbenefitandlearnfromeachother.

17

Section1>Administrative

Page 18: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

18

Page 19: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Important!

StudentswithdocumenteddisabilitiesshouldsendintheiraccommodationletterfromO.A.E.(OfficeofAccessibleEducation)bytheendofthisweektoTaraBalakrishnan(HeadCA)&cc’me.

19

Section1>Administrative

Page 20: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

CourseWebsite:

cs145.stanford.edu

20

Section1>Administrative

Page 21: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

21

Lectures

• Lectureslidescoveressentialmaterial• Thisisyourbestreference.• Wearetryingtogetawayfrombook,butdohavepointers

• Trytocoversamethinginmanyways:Lecture,lecturenotes,homework,exams(noshock)• Attendancemakesyourlifeeasier…

Section1>Logistics

Page 22: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

22

Attendance

• Idislikemandatoryattendance…butinthepastwenoticed…• PeoplewhodidnotattenddidworseL• PeoplewhodidnotattendusedmorecourseresourcesL• PeoplewhodidnotattendwerelesshappywiththecourseL

• Lastyear:mandatoryattendance• Thisyear:voluntary(tostart!)-- reserverighttochange

Section1>Logistics

Page 23: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

GradedElements

• ProblemSets(25%)

• Programmingproject(25%)

• Midterm(20%)

• Finalexam(30%)

23

AssignmentsaretypicallydueTuesdaybeforeclass,typically2weekstocomplete

Section1>Logistics

Page 24: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Un-GradedElements

• Readingsprovided tohelpyou!• Onlyitemsinlecture,homework,orprojectarefairgame.

• Activitiesareagainmainlytohelp/befun!• Willoccurduringclass- notgraded,butcountaspartoflecturematerial(fairgameaswell)

• Jupyter Notebooksprovided• Theseareoptionalbuthopefullyhelpful.• Redesignedsothatyoucan‘interactivelyreplay’partsoflecture

24

Section1>Logistics

Page 25: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whatisexpectedfromyou

• Attendlectures• Ifyoudon’t,it’satyourownperil

• Beactiveandthinkcritically• Askquestions,postcommentsonforums

• Doprogrammingandhomeworkprojects• Startearlyandbehonest

• Studyfortestsandexams

25

Section1>Logistics

Page 26: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Lectures:1st half- fromauser’sperspective

1. Foundations:Relationaldatamodels&SQL• Lectures2-3• HowtomanipulatedatawithSQL,adeclarativelanguage

• reducedexpressivepowerbutthesystemcandomoreforyou

2. DatabaseDesign: Designtheoryandconstraints• Lectures4-6• Designingrelationalschematokeepyourdatafromgettingcorrupted

3. Transactions: Syntax&supportingsystems• Lectures7-8• Aprogrammer’sabstractionfordataconsistency

26

Section1>Lectures

Page 27: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Lectures:2nd half- understandinghowitworks

4.Introductiontodatabasesystems• Lectures12-16• Indexing• ExternalMemoryAlgorithms(IOmodel)forsorting,joins,etc.• Basicsofqueryoptimization(CostEstimates)• Relationalalgebra

5.SpecializedandNewDataProcessingSystems• Lectures17-19• Key-ValueStores• Hadoopandits10yearanniversary• SparkSQL.There-riseofSQL• Next-genanalyticssystems&currentintersectionswithML&AI

27

Section1>Lectures

Page 28: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Lectures:Anoteaboutformatofnotes

28

Section1>Lectures

Theseareasides/notes(stillneedtoknowtheseingeneral!)

Mainpointofslide/keytakeawayatbottom

Definitionsinbluewithconceptbeingdefined bold&underlined

Warnings- payattentionhere!

Takenote!!

Page 29: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Jupyter Notebook“HelloWorld”

• Jupyter notebooksareinteractiveshellswhichsaveoutputinanicenotebookformat• Theyalsocandisplaymarkdown,LaTeX,HTML,js…

• You’llusethesefor• in-classactivities• interactivelecturesupplements/recaps• homeworks,projects,etc.- ifhelpful!

29

Section1>ACTIVITY

FYI:“Jupyter Notebook”arealsocallediPython notebooksbuttheyhandleotherlanguagestoo.

Note:youdo needtoknoworlearnpythonforthiscourse!

Page 30: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Jupyter NotebookSetup

30

Section1>ACTIVITY

Asageneralpolicyinupper-levelCScourses,Windowsisnotofficiallysupported.Howeverwearemakingabest-effortattempttoprovidesomesolutionshere!

1. HIGHLYRECOMMENDED.Installonyourlaptop viatheinstructionsonthenextslide/Piazza

2. Otheroptionsrunningviaoneofthealternativemethods:1. UbuntuVM.2. Corn

3. CometoourInstallationOfficeHoursafterthisclassandtomorrow!

Pleasehelpoutyourpeersbypostingissues/solutionsonPiazza!

Page 31: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Jupyter NotebookSetup

31

Section1>ACTIVITY

CAswillbecomingaroundtohelpwithsetup&installation

https://github.com/stanford-futuredata/cs145-2017/blob/master/jupyter_install.md

Page 32: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Activity-1-1.ipynb

32

Section1>ACTIVITY

Page 33: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

2.Overviewoftherelationaldatamodel

33

Section2

Page 34: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whatyouwilllearnaboutinthissection

1. DefinitionofDBMS

2. Datamodels&therelationaldatamodel

3. Schemas&dataindependence

4. ACTIVITY:Jupyter +SQL

34

Section2

Page 35: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

WhatisaDBMS?

• Alarge,integratedcollectionofdata

• Modelsareal-worldenterprise• Entities(e.g.,Students,Courses)• Relationships(e.g., Aliceisenrolledin145)

ADatabaseManagementSystem(DBMS) isapieceofsoftwaredesignedtostoreandmanagedatabases

35

Section2>DBMS

Page 36: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

36

AMotivating,RunningExample

• Considerbuildingacoursemanagementsystem(CMS):

• Students• Courses• Professors

• Whotakeswhat• Whoteacheswhat

Entities

Relationships

Section2>Datamodels

Page 37: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Datamodels• Adatamodelisacollectionofconceptsfordescribingdata

• Therelationalmodelofdata isthemostwidelyusedmodeltoday• MainConcept:therelation- essentially,atable

• Aschema isadescriptionofaparticularcollectionofdata,usingthegivendatamodel

• E.g.everyrelation inarelationaldatamodelhasaschema describingtypes,etc.

37

Section2>Datamodels

Page 38: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

“Relationaldatabasesformthebedrockofwesterncivilization”

- BruceLindsay,IBMResearch

38

Section2>Datamodels

Page 39: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

ModelingtheCMS• LogicalSchema• Students(sid:string,name:string,gpa:float)• Courses(cid:string,cname:string,credits:int)• Enrolled(sid:string,cid:string,grade:string)

sid Name Gpa101 Bob 3.2123 Mary 3.8

Students

cid cname credits564 564-2 4308 417 2

Coursessid cid Grade123 564 A

Enrolled

Relations

39

Section2>Datamodels

Page 40: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

ModelingtheCMS• LogicalSchema• Students(sid:string,name:string,gpa:float)• Courses(cid:string,cname:string,credits:int)• Enrolled(sid:string,cid:string,grade:string)

sid Name Gpa101 Bob 3.2123 Mary 3.8

Students

cid cname credits564 564-2 4308 417 2

Coursessid cid Grade123 564 A

Enrolled40

Correspondingkeys

Section2>Datamodels

Page 41: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

OtherSchemata…

• PhysicalSchema:describesdatalayout• Relationsasunorderedfiles• Somedatainsortedorder(index)

• LogicalSchema:Previousslide

• ExternalSchema:(Views)• Course_info(cid:string,enrollment:integer)• Derivedfromothertables

Applications

Administrators

41

Section2>Schemata

Page 42: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

DataindependenceConcept: Applicationsdonotneedtoworryabouthowthedataisstructuredandstored

Logicaldataindependence:protectionfromchangesinthelogicalstructureofthedata

Physicaldataindependence:protectionfromphysicallayoutchanges

OneofthemostimportantreasonstouseaDBMS 42

Section2>Schemata

I.e.shouldnotneedtoask:canweaddanewentityorattributewithoutrewritingtheapplication?

I.e.shouldnotneedtoask:whichdisksarethedatastoredon?Isthedataindexed?

Page 43: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Activity-1-2.ipynb

43

Section2>ACTIVITY

Page 44: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

3.OverviewofDBMStopicsKeyconcepts&challenges

44

Section3

Page 45: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Whatyouwilllearnaboutinthissection

1. Transactions

2. Concurrency&locking

3. Atomicity&logging

4. Summary

45

Section3

Page 46: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

ChallengeswithManyUsers• SupposethatourCMSapplicationserves1000’sofusersormore-whataresomechallenges?

DBMSallowsusertowriteprogramsasiftheyweretheonly user

Disk/SSDaccessisslow,DBMShidethelatencybydoingmoreCPUworkconcurrently

46

Section3 >DBMSChallenges

• Security:Differentusers,differentroles

• Performance:Needtoprovideconcurrentaccess

• Consistency:Concurrencycanleadtoupdateproblems

Wewon’tlookattoomuchinthiscourse,butisextremely important

Page 47: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Transactions• Akeyconceptisthetransaction(TXN):an atomicsequenceofdbactions(reads/writes)

Atomicity:Anactioneithercompletesentirely ornotatall

47

Section3>DBMSChallenges

Acct Balancea10 20,000a20 15,000

Acct Balancea10 17,000a20 18,000

Transfer$3kfroma10toa20:1. Debit$3kfroma102. Credit$3ktoa20

• Crashbefore1,• After1butbefore2,• After2.

Writtennaively,inwhichstatesis

atomicity preserved?

DBAlwayspreservesatomicity!

Page 48: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Transactions• Akeyconceptisthetransaction(TXN):an atomicsequenceofdbactions(reads/writes)• IfausercancelsaTXN,itshouldbeasifnothinghappened!

• TransactionsleavetheDBinaconsistent state• Usersmaywriteintegrityconstraints, e.g.,‘eachcourseisassignedtoexactlyoneroom’

Atomicity:Anactioneithercompletesentirely ornotatall

48

Section3>DBMSChallenges

Consistency:Anactionresultsinastatewhichconformstoallintegrityconstraints

However, notethattheDBMSdoesnotunderstandtherealmeaningoftheconstraints– consistencyburdenisstillontheuser!

Page 49: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

Challenge:SchedulingConcurrentTransactions• TheDBMSensuresthattheexecutionof{T1,…,Tn}isequivalenttosomeserial execution

• Onewaytoaccomplishthis:Locking• Beforereadingorwriting,transactionrequiresalockfromDBMS,holdsuntiltheend

• KeyIdea: IfTi wantstowritetoanitemxandTjwantstoreadx,thenTi,Tj conflict.Solutionvialocking:• onlyonewinnergetsthelock• loserisblocked(waits)untilwinnerfinishes

AsetofTXNsisisolated iftheireffectisasifallwereexecutedserially

49

Section3 >DBMSChallenges

WhatifTiandTj needXandY,andTi asksforXbeforeTj,andTj asksforYbeforeTi?->Deadlock!Oneisaborted…

AllconcurrencyissueshandledbytheDBMS…

Page 50: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

EnsuringAtomicity&Durability• DBMSensuresatomicity evenifaTXNcrashes!

• Onewaytoaccomplishthis:Write-aheadlogging(WAL)

• KeyIdea: Keepalogofallthewritesdone.• Afteracrash,thepartiallyexecutedTXNsareundoneusingthelog

Write-aheadLogging(WAL): Beforeanyactionisfinalized,acorrespondinglogentryisforcedtodisk

50

Section3 >DBMSChallenges

Weassumethatthelogison“stable”storage

AllatomicityissuesalsohandledbytheDBMS…

Page 51: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

AWell-DesignedDBMSmakesmanypeoplehappy!

• EndusersandDBMSvendors• Reducescostandmakesmoney

• DBapplicationprogrammers• Canhandlemoreusers,faster,forcheaper,andwithbetterreliability/securityguarantees!

• Databaseadministrators(DBA)• Easiertimeofdesigninglogical/physicalschema,handlingsecurity/authorization,tuning,crashrecovery,andmore…

MuststillunderstandDBinternals

51

Section3 >Summary

Page 52: CS145: Intro to Databases - Stanford Universityweb.stanford.edu/class/cs145/lectures/lecture-1/Lecture... ·  · 2017-12-087 Section 1. Big Data Landscape… ... 1sthalf -from a

SummaryofDBMS

• DBMSareusedtomaintain,query,andmanagelargedatasets.• Provideconcurrency,recoveryfromcrashes,quickapplicationdevelopment,integrity,andsecurity

• Keyabstractionsgivedataindependence

• DBMSR&Disoneofthebroadest,mostexcitingfieldsinCS.Fact!

52

Section3 >Summary