SQL REVIEW. Example Schema We will use these table definitions in our examples. Sailors(sid:...

16
SQL REVIEW

Transcript of SQL REVIEW. Example Schema We will use these table definitions in our examples. Sailors(sid:...

SQL REVIEW

Example Schema

We will use these table definitions in our examples.

Sailors(sid: integer, sname: string, rating: integer, age: real)

Boats(bid: integer, bname: string, color: string)

Reserves(sid: integer, bid: integer, day: date)

Make reservations

Basic SQL Query

relation-list A list of relation names

target-list A list of attributes of relations in relation-list

qualification Comparisons (“Attr op const” or “Attr1 op Attr2,” where op is one of ˂, ˃, ≤, ≥, =, ≠ ) combined using AND, OR, and NOT.

DISTINCT is an optional keyword indicating that the answer should not contain duplicates.

SELECT [DISTINCT] target-listFROM relation-listWHERE qualification

Querying Relations (1)

sid cid grade53831 Carnatic101 C53831 Reggae203 B53650 Topology112 A53666 History105 B

SELECT S.name, E.cidFROM Students S, Enrolled EWHERE S.sid=E.sid AND E.grade=“A”

Enrolled

A student with “sid” has an entry in Enrolled

The Enrolled entry has a grade of “A”

What does the following query compute?

sid name login age gpa

53831 Zhang zhang@ee 19 3.2

53650 Smith smith@cs 21 3.9

53666 Jones jones@cs 20 3.5

Students

Retrieve names of students and the courses they received an “A” grade

Querying Relations (2)

sid cid grade53831 Carnatic101 C53831 Reggae203 B53650 Topology112 A53666 History105 B

SELECT S.name, E.cidFROM Students S, Enrolled EWHERE S.sid=E.sid AND E.grade=“A”

Enrolled

sid name login age gpa

53831 Zhang zhang@ee 19 3.2

53650 Smith smith@cs 21 3.9

53666 Jones jones@cs 20 3.5

Students

S.name E.cid

Smith Topology112

Query Result

Define search space

Semantics of SQL

SELECT [DISTINCT] target-listFROM relation-listWHERE qualification

R1 × R2 × R3 × · · ·

Select rows

qualification

Select columns

target-listrelation-list

Projection

˂, ˃, ≤, ≥, =, ≠

QUERY PROCESSOR

This strategy is probably the least efficient way to compute a query! An optimizer will find more efficient strategies to compute the same answers.

Example of Conceptual EvaluationSELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND R.bid=103

Range variable

sid sname rating age

22 dustin 7 45.0

31 lubber 8 55.558 rusty 10 35.0

sid bid day

22 101 10/10/9658 103 11/12/96

Reservation

Sailors

Example of Conceptual EvaluationSELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND R.bid=103

(sid) sname rating age (sid) bid day22 dustin 7 45.0 22 101 10/10/9622 dustin 7 45.0 58 103 11/12/9631 lubber 8 55.5 22 101 10/10/9631 lubber 8 55.5 58 103 11/12/9658 rusty 10 35.0 22 101 10/10/9658 rusty 10 35.0 58 103 11/12/96

S×R

Disqualified

RemoveIrrelevantcolumns

Answer

A Note on Range Variables

Really needed only if the same relation appears twice in the FROM clause. The previous query can be written in two ways:

SELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND R.bid=103

SELECT snameFROM Sailors, Reserves WHERE Sailors.sid=Reserves.sid AND bid=103

It is good style,however, to userange variablesalways!

OR

Range variable

Aggregate OperatorsSignificant extension of relational algebra

COUNT (*) The number of rows in the relation

COUNT ([DISTINCT] A) The number of (unique) values in the A column

SUM ([DISTINCT] A) The sum of all (unique) values in the A column

AVG ([DISTINCT] A) The average of all (unique) values in the A column

MAX (A) The maximum value in the A column

MIN (A) The minimum value in the A column

SELECT S.snameFROM Sailors SWHERE S.rating= (SELECT MAX(S2.rating) FROM Sailors S2)

Aggregate Operators

SELECT AVG (S.age)FROM Sailors SWHERE S.rating=10

SELECT COUNT (*)FROM Sailors S

SELECT AVG (DISTINCT S.age)FROM Sailors SWHERE S.rating=10

SELECT COUNT (DISTINCT S.rating)FROM Sailors SWHERE S.sname=‘Bob’

Count the number of sailors

Find the average age of sailors with

a rating of 10 Find the average of the distinct ages of sailors with a rating

of 10

Count the number of distinct ratings of

sailors called “Bob”

Find the name of sailors with

the highest rating

Compute maximum rating

GROUP BY and HAVING (1)

• So far, we’ve applied aggregate operators to all (qualifying) tuples.

RelationQualifierAggregator32

SELECT AVG (S.age)FROM Sailors SWHERE S.rating=10

Find the average age of sailors with

a rating of 10

Qualifier

Aggregator

GROUP BY and HAVING (1)

• So far, we’ve applied aggregate operators to all (qualifying) tuples.

• Sometimes, we want to apply them to each of several groups of tuples.

RelationQualifierAggregator32

RelationGroup 1Aggregator12

Group 2

Group 3

Aggregator

Aggregator

9

11

Queries With GROUP BY and HAVING

SELECT [DISTINCT] target-listFROM relation-listWHERE qualificationGROUP BY grouping-listHAVING group-qualification

SELECTFROM

WHERE

Group 1

Group 2

Group 3

Qualifier selecting groups

Aggregator12

9 Aggregator

GROUP BYHAVING

MIN(Attribute)

Output a table

rating age1 33.07 45.07 35.08 55.510 35.0

Find the age of the youngest sailor with age ≥ 18, for each rating with at least 2 such sailors

SELECT S.rating, MIN (S.age)FROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

sid sname rating age22 dustin 7 45.031 lubber 8 55.571 zorba 10 16.064 horatio 7 35.029 brutus 1 33.058 rusty 10 35.0

Input relation

Disqualify

Only S.rating and S.age are mentioned in SELECT

4 rating groups

rating7 35.0

age Answer

Only one group

satisfies HAVING

Sailors

Summary• SQL was an important factor in the early acceptance of

the relational model; more natural than earlier, procedural query languages.

• Relationally complete; in fact, significantly more expressive power than relational algebra.

• Even queries that can be expressed in RA can often be expressed more naturally in SQL.

• Many alternative ways to write a query; optimizer should look for most efficient evaluation plan. In practice, users need to be aware of how queries are

optimized and evaluated for best results.