Post on 04-Jan-2016
CSE 5330/7330
Database Introduction
Fall 2009
Margaret H. Dunham
Department of Computer Science and Engineering
Southern Methodist University
PO Box 750122
Dallas, Texas 75275-0122
CSE 5330/7330 Fall 2009 1

Database Introduction
NOTE: These slides provide an overview of the basic database concepts. During the semester we will return to them to provide an overview and summary of each section covered.

Database Outline
Introduction
File Organization & Indexing
Data Models
Relational Model
SQL/Query Processing
Transactions

Database History
A Short Database History by John Vaughn
http://math.hws.edu/vaughn/cpsc/343/2003/history.html
A Brief History of Database Systems
http://www.comphist.org/computing_history/new_page_9.htm

DB History Snapshots
CFMS/DBOMP (Late 60s)
EF Codd Paper (1970)
DBTG Report (1974)
IMS/IDMS (Early 1970s)
System R (1970s)
Transactions – Jim Gray (1970s)
ER Model (1976)
OODB (1985)
XML/Internet (1990s)

What is a Database?
Collection of Related Data
  – Data
  – Hardware
  – Software (DBMS)
  – Users

File vs. Database
Single user vs. multiple users
Simple relationships vs. complex relationships
Integrity support
Concurrency control
Recovery
Query language
Security
Different Views of data

Some DB Terms
Data/Information/Knowledge
DataBase Management System (DBMS)
Data Dictionary/Directory/Metadata
Data Model
Data Definition Language (DDL)
Data Manipulation Language (DML)
DataBase Administrator (DBA)
Data Administrator (DA)
Database designer
Information Resource Manager (IRM)
Chief Information Officer (CIO)
…

DBMS Components
DDL Compiler
DML Compiler
Precompiler (embedded language support)
Access methods
Concurrency Control
Recovery
Security
Data Dictionary (Metadata)
Utility Services
…

Views of Data
Levels of Abstraction
  – External view
  – Conceptual schema
  – Physical (Internal schema)
Data independence

Data Model
Way to “picture” and access data independent of how it is actually stored.
  – Data Description
  – Data Relationships
  – Operations
  – Integrity/Consistency constraints
Examples:
  – Entity-Relationship (ER/ERA)
  – Relational
  – Object Oriented
  – Object/Relational
  – Older – Network/Hierarchical

Relational Model*
Based on tables, as:
  acct #   Name    Balance
  12345    Sally   1000.21
  34567    Sue     285.48
  …        …       …
Rows (tuples)
Columns (attributes)
Today used in most DBMS's.
* Most of the following slides were obtained from the home page for A First Course in Database Systems by Jeffrey D. Ullman and Jennifer Widom, Prentice Hall, 2002, http://www-db.stanford.edu/~ullman/fcdb.html

Data Relationships
One-to-one (1:1)
  Ex: Name to SSN
One-to-many (1:M)
  Ex: Name to Phone
Many-to-many (M:N)
  Ex: Part to Supplier
What data structures can be used to store these relationships?
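
One possible answer to the question above, as a minimal in-memory sketch. The sample names, numbers, parts, and suppliers are made up, and the helper functions `suppliers_of` and `parts_from` are hypothetical; a DBMS would use files and indices rather than Python dictionaries.

```python
# 1:1 - each name maps to exactly one SSN (made-up values).
name_to_ssn = {"Sally": "000-00-0001", "Sue": "000-00-0002"}

# 1:M - each name maps to a list of phone numbers.
name_to_phones = {"Sally": ["555-1234", "555-9876"], "Sue": ["555-2221"]}

# M:N - a set of (part, supplier) pairs; either side may repeat.
part_supplier = {("bolt", "Acme"), ("bolt", "Globex"), ("nut", "Acme")}


def suppliers_of(part):
    """Return the suppliers of one part, sorted by name."""
    return sorted(s for p, s in part_supplier if p == part)


def parts_from(supplier):
    """Return the parts offered by one supplier, sorted by name."""
    return sorted(p for p, s in part_supplier if s == supplier)
```

The set of pairs used for the M:N case is the same idea as a relation with one row per connected pair.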

Database Outline
Introduction
File Organization & Indexing
Data Models
Relational Model
SQL/Query Processing
Transactions

Typical Data Structures used in DBMSs
Sequential Files
Hash Table
  – Extendible Hash Table
B-Tree (Multiway Search Tree)
  – B+-Tree
Combinations of these
Indices

Placement of Data on Disk
Record/Cylinder/Block/Sector
Blocking Factor
Allocation
  – Contiguous
  – Linked
  – Extents
  – Indexed
Clustering
Partitioning
RAID

Data Structure Pointers
Logical
  – Key
  – Relative Block
Physical
  – Memory – Physical Address (offset)
  – Disk – Physical Address
    » Device/Cylinder/Track/Sector/Block/Offset

Disk vs. Memory Data Structures
Objective
  – Disk – minimize I/O
  – Memory – minimize memory accesses or CPU time
Tree
  – Disk – large nodes, shallow
  – Memory – small nodes, deep

Access
Sequential – Retrieve records in order (logical/physical)
Random – Retrieve record based on key
Direct – Retrieve record based on physical address
Relative – Retrieve record based on relative position in file
Binary Search – Randomly retrieve record doing binary search

Organization
Sequential – Records stored in logical order of key.
  Access: Sequential, relative, binary search
Heap – Records added at the end or wherever there is space.
  Access: Direct
Btree – Multiway balanced search tree.
  Access: Sequential, random
Hash – Store and access record based on address determined when key is hashed.
  Access: Random

Indexing
Speed up processing of data by providing alternative access path.
Both index and primary storage of data provide access method.
Ex: Employee Data
  – Hash on ID
  – BTree index on last name
  – BTree index on job type

Index Types
Number of entries
  – Dense – One index entry for each record in file
  – Sparse – One index entry for many records
Key
  – Primary – Same key as main file
  – Secondary – Different key from original file
Organizations: Hash, BTree, Sequential, BST
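
The dense/sparse distinction above can be sketched on a small sequential file. This is an illustration only: the keys, the blocking factor of 3, and the function name `lookup_sparse` are assumptions, and Python lists stand in for disk blocks.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # records per block (assumed blocking factor)

# Sequential file: records sorted by key and grouped into blocks.
records = [(k, f"rec{k}") for k in (3, 7, 12, 15, 21, 30, 42, 57)]
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Dense index: one entry per record (key -> block number).
dense = {key: b for b, block in enumerate(blocks) for key, _ in block}

# Sparse index: one entry per block (the first key stored in that block).
sparse = [block[0][0] for block in blocks]


def lookup_sparse(key):
    """Pick the one block that could hold key, then scan only that block."""
    b = bisect_right(sparse, key) - 1
    if b < 0:
        return None  # key is smaller than every key in the file
    for k, value in blocks[b]:
        if k == key:
            return value
    return None
```

The sparse index holds one entry per block instead of one per record, but it only works because the file itself is kept in key order.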

Index Search Times
  Organization   Worst               Expected
  Sequential     O(n)                O(n/2)
  Hash           O(n)                O(1.??)
  Tree           O(n) (unbalanced)   O(lg n) (balanced)
  B+-Tree        O(lg n)             O(lg n)

Hashing
Bucket in one block (or cluster thereof)
Hash value may be precise address or relative block (bucket) number
Collisions handled by linked lists
Dynamic Hashing – Allows hash table size to grow
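
A toy version of the scheme above, assuming integer keys and a fixed number of buckets. The class name `HashFile` is hypothetical, and a Python list per bucket stands in for the linked list of colliding records; dynamic hashing (growing the table) is not shown.

```python
class HashFile:
    """Toy hash organization: key -> relative bucket number, collisions chained."""

    def __init__(self, n_buckets=4):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # The hash value is a relative bucket number, not a physical address.
        return key % len(self.buckets)

    def insert(self, key, record):
        self.buckets[self._bucket(key)].append((key, record))

    def find(self, key):
        # Only the one bucket the key hashes to is searched.
        for k, record in self.buckets[self._bucket(key)]:
            if k == key:
                return record
        return None


f = HashFile()
for acct, name in [(12345, "Sally"), (34567, "Sue"), (12349, "Joe")]:
    f.insert(acct, name)
```

Here 12345 and 12349 both hash to bucket 1, so that bucket's chain holds two records, which is why the worst-case search time for hashing is O(n).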

Multiple Key Indexing
Key composed of many subkeys
Access based on all or subset of these
Some indexing structures specifically targeted to n-dimensional accessing

Database Outline
Introduction
File Organization & Indexing
Data Models
Relational Model
SQL/Query Processing
Transactions

Data Model Evolution
[Figure: timeline from the 60’s through the 70's, 80's, and 90’s to now, showing the Hierarchical, Network, and Relational models (Relational: choice for most new applications), followed by Object Bases and Knowledge Bases]

Entity/Relationship Model
Diagrams to represent designs.
Entity like object, = “thing.”
Entity set like class = set of “similar” entities/objects.
Attribute = property of entities in an entity set.
In diagrams:
  – entity set → rectangle
  – attribute → oval.
[Diagram: entity set Students with attributes ID, name, phone, height]

Relationships
Connect two or more entity sets.
Represented by diamonds.
[Diagram: entity sets Students and Courses connected by relationship Taking]

Relationship Set
Think of the “value” of a relationship set as a table.
One column for each of the connected entity sets.
One row for each list of entities, one from each set, that are connected by the relationship.
  Students   Courses
  Sally      CS180
  Sally      CS111
  Joe        CS180
  …          …

[Diagram: relationship Enrolls connecting entity sets Students, Courses, and TAs]
  Students   Courses   TAs
  Ann        CS180     Jan
  Sue        CS180     Pat
  Bob        CS180     Jan
  …          …         …

Beers-Bars-Drinkers Example
[Diagram: entity sets Bars (name, addr, license), Beers (name, manf), and Drinkers (name, addr); relationships Serves (Bars, Beers), Frequents (Drinkers, Bars), and Likes (Drinkers, Beers)]

Multiplicity of Relationships
[Diagram: three panels labeled Many-many, Many-one, and One-one]
Representation of Many-One
E/R: arrow pointing to “one.”
  – Rounded arrow = “exactly one.”

Example: Drinkers Have Favorite Beers
[Diagram: the Beers-Bars-Drinkers diagram (Bars with name, addr, license; Beers with name, manf; Drinkers with name, addr; relationships Serves, Frequents, Likes) with an added relationship Favorite between Drinkers and Beers]

One-One Relationships
Put arrows in both directions.
[Diagram: entity sets Manfs and Beers connected by relationship Best-seller]
Design Issue: Is the rounded arrow justified?
Design Issue: Here, manufacturer is an E.S. In earlier diagrams it is an attribute. Which is right?

Attributes on Relationships
[Diagram: Bars and Beers connected by relationship Sells, with attribute price attached to Sells]
Shorthand for 3-way relationship:
[Diagram: relationship Sells connecting Bars, Beers, and entity set Prices, which has attribute price]

Roles
Sometimes an E.S. participates more than once in a relationship.
Label edges with roles to distinguish.
[Diagram: entity set Drinkers connected twice to relationship Married, with edges labeled husband and wife]
  Husband   Wife
  d1        d2
  d3        d4
  …         …

[Diagram: entity set Drinkers connected twice to relationship Buddies, with edges labeled 1 and 2]
  Buddy1   Buddy2
  d1       d2
  d1       d3
  d2       d1
  d2       d4
  …        …
Notice Buddies is symmetric, Married not.
  – No way to say “symmetric” in E/R.
Design Question
Should we replace husband and wife by one relationship spouse?

Multiple Inheritance
Theoretically, an E.S. could be a subclass of several other entity sets.
[Diagram: entity sets Beers (name, manf) and Wines (name, manf); entity set GrapeBeers connected to each of them by an isa relationship]

Keys
A key is a set of attributes whose values can belong to at most one entity.
In E/R model, every E.S. must have a key.
  – It could have more than one key, but one set of attributes is the “designated” key.
In E/R diagrams, you should underline all attributes of the designated key.

Example
Suppose name is key for Beers.
Beer name is also key for ales.
  – In general, key at root is key for all.
[Diagram: entity set Beers with attributes name, manf; entity set Ales with attribute color, connected to Beers by isa]

Example: A Multiattribute Key
[Diagram: entity set Courses with attributes dept, number, hours, room]
Possibly, the combination of hours + room also forms a key, but we have not designated it as such.

Database Outline
Introduction
File Organization & Indexing
Data Models
Relational Model
SQL/Query Processing
Transactions

Relational Model
Table = relation.
Column headers = attributes.
Row = tuple
Beers
  name         manf
  WinterBrew   Pete’s
  BudLite      A.B.
  …            …
Relation schema = name(attributes) + other structure info., e.g., keys, other constraints. Example: Beers(name, manf)
  – Order of attributes is arbitrary, but in practice we need to assume the order given in the relation schema.
Relation instance is current set of rows for a relation schema.
Database schema = collection of relation schemas.

Relation Instance
  Name    Address        Telephone
  Bob     123 Main St    555-1234
  Bob     128 Main St    555-1235
  Pat     123 Main St    555-1235
  Harry   456 Main St    555-2221
  Sally   456 Main St    555-2221
  Sally   456 Main St    555-2223
  Pat     12 State St    555-1235

Why Relations?
Very simple model.
Often a good match for the way we think about our data.
Abstract model that underlies SQL, the most important language in DBMS’s today.

Relational Design
Simplest approach (not always best): convert each E.S. to a relation and each relationship to a relation.
Entity Set → Relation
E.S. attributes become relational attributes.
[Diagram: entity set Beers with attributes name, manf]
Becomes:
Beers(name, manf)

Keys in Relations
An attribute or set of attributes K is a key for a relation R if we expect that in no instance of R will two different tuples agree on all the attributes of K.
Indicate a key by underlining the key attributes.
Example: If name is a key for Beers:
Beers(name, manf)

E/R Relationships → Relations
Relation has attribute for key attributes of each E.S. that participates in the relationship.
Add any attributes that belong to the relationship itself.
Renaming attributes OK.
  – Essential if multiple roles for an E.S.

[Diagram: entity sets Drinkers (name, addr) and Beers (name, manf); relationships Likes and Favorite between Drinkers and Beers; relationships Buddies (roles 1, 2) and Married (roles husband, wife) on Drinkers]
Likes(drinker, beer)
Favorite(drinker, beer)
Married(husband, wife)
Buddies(name1, name2)
For one-one relation Married, we can choose either husband or wife as key.

Combining Relations
Sometimes it makes sense to combine relations.
Common case: Relation for an E.S. E plus the relation for some many-one relationship from E to another E.S.
Example
Combine Drinker(name, addr) with Favorite(drinker, beer) to get Drinker1(name, addr, favBeer).
Danger in pushing this idea too far: redundancy.
e.g., combining Drinker with Likes causes the drinker's address to be repeated, viz.:
  name    addr        beer
  Sally   123 Maple   Bud
  Sally   123 Maple   Miller
Notice the difference: Favorite is many-one; Likes is many-many.

Keys of Relations
K is a key for relation R if:
1. K → all attributes of R. (Uniqueness)
2. For no proper subset of K is (1) true. (Minimality)
If K at least satisfies (1), then K is a superkey.
Conventions
Pick one key; underline key attributes in the relation schema.

Example
Drinkers(name, addr, beersLiked, manf, favoriteBeer)
{name, beersLiked} functionally determines all attributes, as seen.
  – Shows {name, beersLiked} is a superkey.
name → beersLiked is false, so name is not a superkey.
beersLiked → name is also false, so beersLiked is not a superkey.
Thus, {name, beersLiked} is a key.
No other keys in this example.
  – Neither name nor beersLiked is on the right of any observed FD, so they must be part of any superkey.
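
The superkey and key tests in this example can be checked mechanically with an attribute-closure computation. The sketch below assumes the three FD's given for Drinkers in these slides (name → addr, name → favoriteBeer, beersLiked → manf); the function names `closure`, `is_superkey`, and `is_key` are made up for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD whose left side is already covered and adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


ALL = {"name", "addr", "beersLiked", "manf", "favoriteBeer"}
FDS = [
    ({"name"}, {"addr"}),
    ({"name"}, {"favoriteBeer"}),
    ({"beersLiked"}, {"manf"}),
]


def is_superkey(attrs):
    """Uniqueness: attrs determines all attributes of the relation."""
    return closure(attrs, FDS) == ALL


def is_key(attrs):
    """Minimality: a superkey that stops being one if any attribute is dropped."""
    attrs = set(attrs)
    return is_superkey(attrs) and not any(is_superkey(attrs - {a}) for a in attrs)
```

Dropping one attribute at a time is enough for the minimality test, because any superset of a superkey is itself a superkey.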

Example 2
[Diagram: attributes Lastname, Firstname, Student ID, Major, with brackets labeled Key (2 attributes), Key, and Superkey]
Keys are {Lastname, Firstname} and {StudentID}
Note: There are alternate keys

Normalization
Process of simplifying relational design:
  – Avoid redundancy
Functional Dependencies (FD)
  – Identify relationships between data values
  – SSN → Name
    » In any tuple, the value for SSN determines a unique value for Name.
    » If the same SSN exists in two tuples, you’ll have the same Name duplicated.
FDs are used by algorithms to determine best relations to be used given a set of attributes.

Example of Problems
Drinkers(name, addr, beersLiked, manf, favoriteBeer)
  name      addr         beersLiked   manf     favoriteBeer
  Janeway   Voyager      Bud          A.B.     WickedAle
  Janeway   ???          WickedAle    Pete's   ???
  Spock     Enterprise   Bud          ???      Bud
FD’s:
1. name → addr
2. name → favoriteBeer
3. beersLiked → manf
???’s are redundant, since we can figure them out from the FD’s.
Update anomalies: If Janeway gets transferred to the Intrepid, will we change addr in each of her tuples?
Deletion anomalies: If nobody likes Bud, we lose track of Bud’s manufacturer.

Database Outline
Introduction
File Organization & Indexing
Data Models
Relational Model
SQL/Query Processing
Transactions

“Core” Relational Algebra
A small set of operators that allow us to manipulate relations in limited but useful ways. The operators are:
1. Union, intersection, and difference: the usual set operators.
  – But the relation schemas must be the same.
2. Selection: Picking certain rows from a relation.
3. Projection: Picking certain columns.
4. Products and joins: Composing relations in useful ways.
5. Renaming of relations and their attributes.

Selection
R1 = σ_C(R2)
where C is a condition involving the attributes of relation R2.
Example
Relation Sells:
  bar     beer     price
  Joe's   Bud      2.50
  Joe's   Miller   2.75
  Sue's   Bud      2.50
  Sue's   Coors    3.00
JoeMenu = σ_{bar=Joe's}(Sells)
  bar     beer     price
  Joe's   Bud      2.50
  Joe's   Miller   2.75
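
Selection on the Sells relation above can be sketched in a few lines. Representing each tuple as a Python dict and the condition C as a function is an assumption made for illustration; the name `select` is not from the slides.

```python
SELLS = [
    {"bar": "Joe's", "beer": "Bud", "price": 2.50},
    {"bar": "Joe's", "beer": "Miller", "price": 2.75},
    {"bar": "Sue's", "beer": "Bud", "price": 2.50},
    {"bar": "Sue's", "beer": "Coors", "price": 3.00},
]


def select(relation, condition):
    """Keep exactly the tuples of relation for which condition holds."""
    return [t for t in relation if condition(t)]


# JoeMenu: the rows of Sells whose bar attribute is Joe's.
joe_menu = select(SELLS, lambda t: t["bar"] == "Joe's")
```

The result has the same schema as Sells; only the set of rows changes.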

Projection
R1 = π_L(R2)
where L is a list of attributes from the schema of R2.
Example
π_{beer,price}(Sells)
  beer     price
  Bud      2.50
  Miller   2.75
  Coors    3.00
Notice elimination of duplicate tuples.
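
A small sketch of projection with duplicate elimination, using the same Sells data. The dict-per-tuple representation and the name `project` are assumptions for illustration.

```python
SELLS = [
    {"bar": "Joe's", "beer": "Bud", "price": 2.50},
    {"bar": "Joe's", "beer": "Miller", "price": 2.75},
    {"bar": "Sue's", "beer": "Bud", "price": 2.50},
    {"bar": "Sue's", "beer": "Coors", "price": 3.00},
]


def project(relation, attrs):
    """Keep only the listed attributes and drop duplicate result tuples."""
    seen, out = set(), []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:  # set semantics: each tuple appears once
            seen.add(row)
            out.append(row)
    return out


beer_prices = project(SELLS, ["beer", "price"])
```

Bud at 2.50 is sold by both bars, so the four input rows yield three output rows.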

Product
R = R1 × R2
pairs each tuple t1 of R1 with each tuple t2 of R2 and puts in R a tuple t1t2.
[Diagram: R1(A, B, C, D) × R2(D, E, F) = R(A, B, C, D, D', E, F)]

Join
Sells:
  bar     beer     price
  Joe's   Bud      2.50
  Joe's   Miller   2.75
  Sue's   Bud      2.50
  Sue's   Coors    3.00
Bars:
  name    addr
  Joe's   Maple St.
  Sue's   River Rd.
BarInfo = Sells ⋈_{Sells.Bar=Bars.Name} Bars
  bar     beer     price   name    addr
  Joe's   Bud      2.50    Joe's   Maple St.
  Joe's   Miller   2.75    Joe's   Maple St.
  Sue's   Bud      2.50    Sue's   River Rd.
  Sue's   Coors    3.00    Sue's   River Rd.
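
Product and join on the Sells and Bars data can be sketched as follows. Tuples are plain Python tuples here, attributes are addressed by position, and the names `product` and `theta_join` are assumptions for illustration. A real DBMS would not build the full product first.

```python
from itertools import product as cartesian

SELLS = [  # (bar, beer, price)
    ("Joe's", "Bud", 2.50),
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Sue's", "Coors", 3.00),
]
BARS = [  # (name, addr)
    ("Joe's", "Maple St."),
    ("Sue's", "River Rd."),
]


def product(r1, r2):
    """Pair each tuple t1 of r1 with each tuple t2 of r2, giving t1t2."""
    return [t1 + t2 for t1, t2 in cartesian(r1, r2)]


def theta_join(r1, r2, condition):
    """Product followed by a selection on the combined tuples."""
    return [t for t in product(r1, r2) if condition(t)]


# Position 0 is Sells.bar and position 3 is Bars.name in the combined tuple.
bar_info = theta_join(SELLS, BARS, lambda t: t[0] == t[3])
```

The product has 4 × 2 = 8 tuples; the join condition keeps the 4 in which the bar names agree.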

SQL
SEQUEL in System R
Structured English QUEry Language
DDL and DML
Standard Relational query language

SQL Operations
SELECT … FROM … WHERE …
UPDATE … SET … WHERE …
INSERT INTO … VALUES (…)
DELETE FROM … WHERE …
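
The four statement forms can be tried against an in-memory SQLite database through Python's standard sqlite3 module. This is a sketch only: SQLite is an assumption (the course may use a different DBMS), and the Sells rows are sample data.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define the relation schema.
con.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")

# DML: insert, update, delete, then query.
con.execute("INSERT INTO Sells VALUES ('Joe''s', 'Bud', 2.50)")
con.execute("INSERT INTO Sells VALUES ('Joe''s', 'Miller', 2.75)")
con.execute("UPDATE Sells SET price = 3.00 WHERE beer = 'Miller'")
con.execute("DELETE FROM Sells WHERE beer = 'Bud'")

rows = con.execute(
    "SELECT beer, price FROM Sells WHERE bar = 'Joe''s'"
).fetchall()
```

Inside an SQL string literal, a single quote is written as two single quotes, as in 'Joe''s'.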

SQL
Employee(Name, Dept)
Department(Dept, Manager)
SQL:
    SELECT Manager
    FROM Employee, Department
    WHERE Employee.Name = 'Clark Kent'
      AND Employee.Dept = Department.Dept
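
The query above can be run end to end on a small instance. In this sketch the department names, the second employee, and the managers are invented sample rows, and SQLite (via Python's sqlite3 module) is an assumed DBMS.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Employee (Name TEXT, Dept TEXT);
    CREATE TABLE Department (Dept TEXT, Manager TEXT);
    -- Sample rows (hypothetical).
    INSERT INTO Employee VALUES ('Clark Kent', 'News'), ('Lois Lane', 'News');
    INSERT INTO Department VALUES ('News', 'Perry White'), ('Sports', 'Steve Lombard');
""")

managers = con.execute("""
    SELECT Manager
    FROM Employee, Department
    WHERE Employee.Name = 'Clark Kent'
      AND Employee.Dept = Department.Dept
""").fetchall()
```

In relational algebra terms this is a product of Employee and Department, a selection on the WHERE condition, and a projection onto Manager.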

Host Languages
C, C++, Fortran, Lisp, COBOL
[Diagram: application program with local variables (memory) makes calls to the DB through the DBMS (storage)]
Different DBMSs support different host language interfaces
Precompiler
ODBC/JDBC

Embedded SQL
Add to a conventional programming language (C in our examples) certain statements that represent SQL operations.
Each embedded SQL statement introduced with EXEC SQL.
Preprocessor converts C + SQL to pure C.
  – SQL statements become procedure calls.

Example
Find the price for a given beer at a given bar.
Sells(bar, beer, price)
    EXEC SQL BEGIN DECLARE SECTION;
        char theBar[21], theBeer[21];
        float thePrice;
    EXEC SQL END DECLARE SECTION;
    . . .
    /* assign to theBar and theBeer */
    . . .
    EXEC SQL SELECT price
        INTO :thePrice
        FROM Sells
        WHERE beer = :theBeer AND
              bar = :theBar;
    . . .

Call-Level Interfaces
A more modern approach to the host-language/SQL connection is a call-level interface, in which the C (or other language) program creates SQL statements as character strings and passes them to functions that are part of a library.
Similar to what really happens in embedded SQL implementations.
Two major approaches: SQL/CLI (standard of ODBC = open database connectivity) and JDBC (Java database connectivity).
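
Python's sqlite3 module follows the same call-level pattern: the program passes an SQL string to a library function. The sketch below is an analogy, not SQL/CLI or JDBC itself. It repeats the price lookup from the embedded SQL example, with `?` placeholders playing the role of the host variables; the function name `price_of` and the sample rows are assumptions.

```python
import sqlite3


def price_of(con, bar, beer):
    """Return the price of beer at bar, or None if the bar does not sell it."""
    row = con.execute(
        "SELECT price FROM Sells WHERE bar = ? AND beer = ?",
        (bar, beer),  # values are bound to the ? placeholders by the library
    ).fetchone()
    return None if row is None else row[0]


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
con.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe's", "Bud", 2.50), ("Joe's", "Miller", 2.75)],
)
```

Binding values through placeholders, rather than pasting them into the SQL string, avoids quoting problems with values such as Joe's.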

JDBC
Start with a Connection object, obtained from the DBMS (see text).
Method createStatement() returns an object of class Statement, with no SQL attached. Method prepareStatement() takes an SQL statement as argument and returns a PreparedStatement.
Example
    Statement stat1 = myCon.createStatement();
    PreparedStatement stat2 =
        myCon.prepareStatement(
            "SELECT beer, price " +
            "FROM Sells " +
            "WHERE bar = 'Joe''s Bar'"
        );
myCon is a connection, stat1 is an “empty” statement object, and stat2 is a (prepared) statement object that has an SQL statement associated.

Executing Statements
JDBC distinguishes queries from updates.
Methods executeQuery() and executeUpdate() are used to execute these two kinds of SQL statements.
When a query is executed, it returns an object of class ResultSet.
Example
    stat1.executeUpdate(
        "INSERT INTO Sells " +
        "VALUES('Brass Rail', 'Bud', 3.00)"
    );
    ResultSet Menu = stat2.executeQuery();
CSE 5330/7330 Fall 2009 72
Getting the Tuples of a ResultSet
Method next() applies to a ResultSet and moves a “cursor” to the next tuple in that set.
– Apply next() once to get to the first tuple.
– next() returns false if there are no more tuples.
While a given tuple is the current tuple of the cursor, you can get its ith component by applying to a ResultSet a method of the form getX(i), where X is the name for the type of that component.
Example
    while (Menu.next()) {
        theBeer = Menu.getString(1);
        thePrice = Menu.getFloat(2);
        ...
    }
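The JDBC steps on these slides (prepare a statement, execute the query, walk the ResultSet) can be put together roughly as follows. This is only a sketch: the class name MenuReader, the helper formatEntry, and the use of a "?" placeholder with setString (instead of the literal bar name used on the slide) are assumptions made for illustration, and readMenu needs a live connection to a database that contains the Sells table.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of the course material.
class MenuReader {
    // Query text with a "?" placeholder for the bar name (an assumption;
    // the slide embeds the literal 'Joe''s Bar' instead).
    static final String MENU_QUERY =
        "SELECT beer, price " +
        "FROM Sells " +
        "WHERE bar = ?";

    // Formats one menu entry; kept separate so it can be checked without a database.
    static String formatEntry(String beer, float price) {
        return String.format(Locale.ROOT, "%s: %.2f", beer, price);
    }

    // Runs the query for one bar and collects the tuples of the ResultSet.
    static List<String> readMenu(Connection myCon, String bar) throws SQLException {
        List<String> menu = new ArrayList<>();
        try (PreparedStatement stat2 = myCon.prepareStatement(MENU_QUERY)) {
            stat2.setString(1, bar);                    // fill in the placeholder
            try (ResultSet rs = stat2.executeQuery()) {
                while (rs.next()) {                     // false when no tuples remain
                    String theBeer = rs.getString(1);   // 1st component: beer
                    float thePrice = rs.getFloat(2);    // 2nd component: price
                    menu.add(formatEntry(theBeer, thePrice));
                }
            }
        }
        return menu;
    }
}
```

The try-with-resources blocks close the statement and the result set even if an exception is thrown, which the slide example leaves out for brevity.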
Database Outline
Introduction
File Organization & Indexing
Data Models
Relational Model
SQL/Query Processing
Transactions
Transactions
= units of work that must be:
1. Atomic = either all work is done, or none of it.
2. Consistent = relationships among values maintained.
3. Isolated = appear to have been executed when no other DB operations were being performed.
   – Often called serializable behavior.
4. Durable = effects are permanent even if system crashes.
Commit/Abort Decision
Each transaction ends with either:
1. Commit = the work of the transaction is installed in the database; previously its changes may be invisible to other transactions.
2. Abort = no changes by the transaction appear in the database; it is as if the transaction never occurred.
   – ROLLBACK is the term used in SQL and the Oracle system.
Example
Sells(bar, beer, price)
Joe's Bar sells Bud for $2.50 and Miller for $3.00.
Sally is querying the database for the highest and lowest price Joe charges:
    (1) SELECT MAX(price) FROM Sells
        WHERE bar = 'Joe''s Bar';
    (2) SELECT MIN(price) FROM Sells
        WHERE bar = 'Joe''s Bar';
At the same time, Joe has decided to replace Miller and Bud by Heineken at $3.50:
    (3) DELETE FROM Sells
        WHERE bar = 'Joe''s Bar' AND
              (beer = 'Miller' OR beer = 'Bud');
    (4) INSERT INTO Sells
        VALUES('Joe''s Bar', 'Heineken', 3.50);
Example: Problem With Rollback
Suppose Joe executes statement 4 (insert Heineken), but then, during the transaction, thinks better of it and issues a ROLLBACK statement.
If Sally is allowed to execute her statement 1 (find max) just before the rollback, she gets the answer $3.50, even though Joe doesn't sell any beer for $3.50.
Fix by making statement 4 a transaction, or part of a transaction, so its effects cannot be seen by Sally unless there is a COMMIT action.
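The fix can be pictured with a toy in-memory model, sketched here under strong simplifying assumptions: one writer, inserts only, and no real DBMS. The class ToySells and its methods are hypothetical names. Joe's insert sits in a pending buffer until commit(), and Sally's MAX query reads committed rows only, so a rollback leaves no trace of the $3.50 price.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Sells(bar, beer, price); illustrative only, not how a DBMS is built.
class ToySells {
    // One row of the relation.
    static final class Row {
        final String bar;
        final String beer;
        final double price;
        Row(String bar, String beer, double price) {
            this.bar = bar;
            this.beer = beer;
            this.price = price;
        }
    }

    private final List<Row> committed = new ArrayList<>(); // visible to everyone
    private final List<Row> pending = new ArrayList<>();   // Joe's uncommitted inserts

    // Loads a row that is already committed (initial database state).
    void seed(String bar, String beer, double price) {
        committed.add(new Row(bar, beer, price));
    }

    // Statement 4 inside Joe's open transaction: buffered, not yet installed.
    void insert(String bar, String beer, double price) {
        pending.add(new Row(bar, beer, price));
    }

    // COMMIT installs the pending work; ROLLBACK discards it.
    void commit() { committed.addAll(pending); pending.clear(); }
    void rollback() { pending.clear(); }

    // Sally's statement 1: reads committed rows only.
    double maxPrice(String bar) {
        double max = Double.NEGATIVE_INFINITY;
        for (Row r : committed) {
            if (r.bar.equals(bar)) max = Math.max(max, r.price);
        }
        return max;
    }

    public static void main(String[] args) {
        ToySells db = new ToySells();
        db.seed("Joe's Bar", "Bud", 2.50);
        db.seed("Joe's Bar", "Miller", 3.00);
        db.insert("Joe's Bar", "Heineken", 3.50);
        System.out.println(db.maxPrice("Joe's Bar")); // Heineken is still pending
        db.rollback();
        System.out.println(db.maxPrice("Joe's Bar")); // the insert never became visible
    }
}
```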
Deadlock
Deadlock requires ALL of the following (AND):
1. Wait and hold: hold some locks while you wait for others
2. Circular chain of waiters
   [Figure: wait-for graph over T1, T2, T3, T4]
3. No pre-emption
We can avoid deadlock by doing at least ONE of:
1. Get all your locks at once
2. Apply an ordering to acquiring locks
3. Allow preemption (for example, use timeout on waits)
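Condition 2, a circular chain of waiters, can be checked mechanically with a depth-first search over a wait-for graph. The sketch below is one common way to do that, not a method taken from the slides. The class name WaitForGraph and the edge convention (an edge from X to Y means "X waits for a lock held by Y") are assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical wait-for graph: an edge X -> Y means X waits for a lock held by Y.
class WaitForGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    // Records that 'waiter' is blocked on a lock held by 'holder'.
    void addWait(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    // True if some chain of waiters leads back to its own start.
    boolean hasCycle() {
        Set<String> done = new HashSet<>();   // fully explored, known cycle-free
        Set<String> onPath = new HashSet<>(); // transactions on the current DFS path
        for (String t : waitsFor.keySet()) {
            if (dfs(t, done, onPath)) return true;
        }
        return false;
    }

    private boolean dfs(String t, Set<String> done, Set<String> onPath) {
        if (onPath.contains(t)) return true;  // came back to a waiter on this path
        if (done.contains(t)) return false;
        onPath.add(t);
        for (String next : waitsFor.getOrDefault(t, Collections.<String>emptySet())) {
            if (dfs(next, done, onPath)) return true;
        }
        onPath.remove(t);
        done.add(t);
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph g = new WaitForGraph();
        g.addWait("T1", "T2");
        g.addWait("T2", "T3");
        g.addWait("T3", "T4");
        g.addWait("T4", "T1"); // closes the circular chain
        System.out.println(g.hasCycle());
    }
}
```

A system that detects such a cycle can break it by preempting one transaction in the chain, which corresponds to option 3 in the list above.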
Serializability of schedules
T1                  T2
Read (A)            Read (A)
A := A - 50         temp := A * 0.1
Write (A)           A := A + temp
Read (B)            Write (A)
B := B + 50         Read (B)
Write (B)           B := B - temp
                    Write (B)
Schedule is serializable if its effect is the same as that of a serial schedule.
Initial values on disk: A = 100, B = 200
T1 –> T2: A =     B =
T2 –> T1: A =     B =
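The two serial orders can be worked through in a few lines of code. The sketch below applies T1 and T2 exactly as written on the slide, starting from A = 100 and B = 200; the class and method names are made up for illustration. Running T1 then T2 gives A = 55, B = 245, while T2 then T1 gives A = 60, B = 240. The two results differ, yet each is the outcome of a serial schedule, so an interleaved schedule is serializable if its effect matches either one.

```java
// Hypothetical helper for evaluating the two serial schedules of T1 and T2.
class SerialSchedules {
    // T1: move 50 from A to B. Returns the new {A, B}.
    static double[] t1(double a, double b) {
        return new double[] {a - 50, b + 50};
    }

    // T2: temp = 10% of A; add temp to A and subtract it from B, as on the slide.
    static double[] t2(double a, double b) {
        double temp = a * 0.1;
        return new double[] {a + temp, b - temp};
    }

    // Serial schedule T1 -> T2.
    static double[] t1ThenT2(double a, double b) {
        double[] s = t1(a, b);
        return t2(s[0], s[1]);
    }

    // Serial schedule T2 -> T1.
    static double[] t2ThenT1(double a, double b) {
        double[] s = t2(a, b);
        return t1(s[0], s[1]);
    }

    public static void main(String[] args) {
        double[] first = t1ThenT2(100, 200);
        double[] second = t2ThenT1(100, 200);
        System.out.printf("T1 -> T2: A=%.1f B=%.1f%n", first[0], first[1]);
        System.out.printf("T2 -> T1: A=%.1f B=%.1f%n", second[0], second[1]);
    }
}
```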
If no progress is possible, then there is a cycle
[Figure: graph of transactions T1 to T6 and data items A, B, C, D]
Cascading Abort
T1                  T2
LOCK A
Read A
change A
Write A
UNLOCK A
                    LOCK A
                    Read A
                    change A
                    Write A
                    UNLOCK A
LOCK B
Read B
Discover problem
ABORT
T2 read the value of A written by T1, so when T1 aborts, T2 must be aborted as well.
Two-Phase Locking (2PL)
Phase I: All requesting of locks precedes
Phase II: Any releasing of locks
Theorem: Any schedule for 2-phase locked transactions is serializable
[Figure: number of locks held over time]
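Whether a single transaction follows the two-phase rule can be checked by scanning its operations: once any lock has been released, no further lock may be requested. The sketch below assumes operations are written as strings such as "LOCK A" and "UNLOCK A"; that encoding and the name TwoPhaseCheck are illustrative choices, not part of the slides.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical checker for the two-phase rule on one transaction's operation list.
class TwoPhaseCheck {
    // True if no LOCK request appears after the first UNLOCK.
    static boolean isTwoPhase(List<String> ops) {
        boolean releasing = false; // becomes true once Phase II has started
        for (String op : ops) {
            if (op.startsWith("UNLOCK")) {
                releasing = true;
            } else if (op.startsWith("LOCK") && releasing) {
                return false;      // a lock requested after a release
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Unlocks A before locking B, so it is not two-phase.
        List<String> early = Arrays.asList(
            "LOCK A", "Read A", "Write A", "UNLOCK A", "LOCK B", "Read B", "UNLOCK B");
        // Acquires both locks before releasing either.
        List<String> twoPhase = Arrays.asList(
            "LOCK A", "Read A", "Write A", "LOCK B", "Read B", "UNLOCK A", "UNLOCK B");
        System.out.println(isTwoPhase(early));
        System.out.println(isTwoPhase(twoPhase));
    }
}
```

The first list mirrors what T1 appears to do on the Cascading Abort slide: it unlocks A and only then locks B, so it breaks the two-phase rule.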