1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10,...
-
Upload
janel-matthews -
Category
Documents
-
view
215 -
download
1
Transcript of 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10,...
1
Data Models and Query Languages
CSE 590DB, Winter 1999Theory of Databases
Zack IvesJanuary 10, 1999
2
The Relational Model
Database consists of relations, where columns are attributes and rows are tuples
Declarative queries using an algebraic or logic language (or SQL, which we shall ignore)
attribute
tuple
arity = 3, cardinality = 3NewMovies(Name, Rating, Director)
NewMoviesName Rating DirectorThe Phantom Menace PG George LucasPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
3
Schema
Each attribute in a relation has a type:NewMovies(Name: String, Rating: RatingDomain, Director:
String), where RatingDomain = {G, PG, PG-13, R, NC-17}(In many cases, we will omit the type, and assume the type
most appropriate to the expression)
Relation schema: relation name, attributes, and types Relation instance: a set of tuples for a given schema Database schema: set of relation schemas Database instance: relation instance for every
relation schema
4
More on Tuples
Formally, a tuple is a mapping from attribute names to (correctly typed) values: Director -> “George Lucas” Name -> “Stepmom”
May specify a tuple using values & schema notation:NewMovies(“The Phantom Menace”, “PG”, “George Lucas”)
Can have constraints on attribute values: Integrity constraints Keys Foreign keys Functional dependencies
5
Relational Algebra
All operators take one or more sets of tuples as input, and produce a new set as output Basic set operators (, , —, but normally no
complement) Selection () Projection () Cartesian Product () Join () Division ()
6
Set Operations
Sets must be compatible! Same number of attributes Same types/domains
R1 R2 = {all tuples in either R1 or R2}R1 R2 = {all tuples in both R1 and R2}Complement???Negation: R1 — R2 = {all tuples in R1 not
in R2}Query without unions is conjunctive
7
Selection
Chooses a subset of tuples satisfying a predicate
Rating = “PG-13” (NewMovies)
Name Rating DirectorPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
NewMoviesName Rating DirectorThe Phantom Menace PG George LucasPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
8
Projection
Chooses a subset of attributes, removes duplicates
NewMoviesName Rating DirectorThe Phantom Menace PG George LucasPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
Name DirectorThe Phantom Menace George LucasPatch Adams Tom ShadyacStepmom Chris Columbus
Name, Director (NewMovies)
9
Cartesian Product
Combine two relations - all pairings of tuples
NewMovieName DirectorPhantom Menace George LucasPatch Adams Tom ShadyacStepmom Chris Columbus
MetroShowsName TimePatch Adams 7:00Patch Adams 9:40Stepmom 6:50Stepmom 9:00
NewMovie MetroShowsName Director Name TimePhantom Menace George Lucas Patch Adams 7:00Phantom Menace George Lucas Patch Adams 9:40Phantom Menace George Lucas Stepmom 6:50Phantom Menace George Lucas Stepmom 9:00Patch Adams Tom Shadyac Patch Adams 7:00… … … ...
10
Join
Combine tuples from two relations by predicate. If predicate is equality, remove duplicate attribute.
NewMovieName DirectorPhantom Menace George LucasPatch Adams Tom ShadyacStepmom Chris Columbus
MetroShowsName TimePatch Adams 7:00Patch Adams 9:40Stepmom 6:50Stepmom 9:00
NewMovie NewMovie.Name = MetroShows.Name MetroShows
Name Director TimePatch Adams Tom Shadyac 7:00Patch Adams Tom Shadyac 9:40Stepmom Chris Columbus 6:50Stepmom Chris Columbus 9:00
Equivalent to:Name, Director, Time ( NewMovie.Name = MetroShows.Name
(NewMovie MetroShows))
11
Division
Only returns results from dividend which match attributes of all of tuples in divisor (“for all”)Customersname branchJohnson DowntownSmith BrightonLindsay RedwoodGreen BrightonGreen Downtown
BranchesbranchBrightonDowntown
Customers BranchesnameGreen
12
Logic-Based Query Languages
Tuple-relational calculus Based on first-order predicate calculus, but
imposes restrictions (e.g., no function symbols)
Equivalent in power to relational algebra
Datalog More powerful than tuple-relational calculus
Supports recursion and thus transitive closure
Equivalent to set of Horn clauses
13
Predicates & Atoms
Relations represented by predicatesTuples represented by atoms:
Flight(“Alon”, “British Airways”, 1234, “Seattle”, “Israel”, “1/9/1999”, “10:00”)
Arithmetic atoms:X < 100, X + Y + 5 > Z * 2
In certain cases, negated atoms:not Flight(“Zack”, “British Airways”, 1234, “Seattle”,
“Israel”, “1/9/1999”, “10:00”)
14
Datalog Rules & Programs
“Pure” datalog rules:head :- atom1, atom2, atom3
where all atoms are non-negated and relational
Datalog program is set of datalog rules Conjunctive query has single rule:
HanksDirector(D) :- ActorIn(“Hanks”, M) & DirectedBy(D, M)
Disjunctive query:HanksDirector(D) :- ActorIn(“Hanks”, M) & DirectedBy(D, M)HanksDirector(D) :- ProducedBy(“Hanks”, M) & DirectedBy(D, M)
15
Intension & Extension
Distinction between EDB & IDB: Extensional DB
what’s stored in the databasecan only be in the body of a Datalog rule
Intensional DBresult of a datalog rulecan be in head or body of a Datalog rule
16
Evaluating Rules
Evaluating rules: Start with EDB, iteratively derive facts for IDB’s Repeat until cannot derive any new facts:
Consider every possible assignment from database to variables in body
If each atom in body is made true by assignment, add tuple for the head into the head’s relation
Conjunctive queries are inexpensive Disjunctive are more expensive (even though
they’re Horn clauses)
17
Transitive Closure
Suppose we to know all possible destination airports reachable from SeaTac, given an EDB Flight(From, To):
SEA
SFO
LAX
JFK
LGA DEN
We need to express the query: Find all airports reachable from SEA
18
Recursion in Datalog
Program:Dest(X, Y) :- Flight(X, Y)Dest(X, Y) :- Dest(X, Z), Flight(Z, Y)
Evaluate unknown # of times until fixpoint Flight: {(SEA,SFO), (SEA,LAX), (SEA,JFK), (LAX,LGA),
(JFK,LGA),(LGA,DEN)}Dest0: {}
Dest1: Dest0 {(SEA,SFO), (SEA,LAX), (SEA,JFK)} Dest2: Dest1 {(SEA,LGA),(LAX,DEN),(JFK,DEN)} Dest3: Dest2 {(SEA,DEN)} No more changes, so stop (minimum fixpoint)
19
Built-in Predicates
Rules may include atoms with built-in predicates:Expensive(X) :- Product(X, Y, P) & P > 100
But need to restrict use of built-in atoms:P(X) :- R(X) & X < YWhen is X < Y?
We require that every variable from a built-in atom appear in a relational atom
20
Negated Subgoals
Restrict forms:P(X, Y) :- Between(X, Y, Z) & NOT Direct(X,Z)
Bad:Q(X, Y) :- From(X) & NOT To(Y)
Bad but fixable:P(X) :- From(X) & NOT Connects(X, Y)
Rewrite as:Connects’(X) :- Connects(X,Y)P(X) :- End(X) & NOT Connects’(X)
21
Stratified/Stratisfiable Rules
Predicate P depends on predicate Q if Q appears negated in a rule defining P
If cycle in dependency graph, program is not stratifiable:p(X) :- r(X) & NOT q(X)
q(X) :- r(X) & NOT p(X)
Suppose r has tuple {1}
Intuitively, stratification provides a way of executing program P as sequence of subprograms P1 … Pn defining IDB relations w/o “forward references”. Results independent of stratification.