MATH IN SQL
AGGREGATION OPERATORS
Operators on sets of tuples.
Significant extension of relational algebra.
SUM ([DISTINCT] A): the sum of all (unique) values in attribute A.
AVG ([DISTINCT] A): the average of all (unique) values in attribute A.
SELECT AVG (DISTINCT S.age)
FROM Sailors S
WHERE S.rating = 10;
SELECT AVG (S.age)
FROM Sailors S;
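To see what DISTINCT changes, the sketch below runs the two queries above, plus unfiltered SUM and AVG variants, against the small Sailors instance that appears later in these slides. Python's sqlite3 module and the helper name `scalar` are assumptions made only so the example can be run; they are not part of the slides.

```python
import sqlite3

# Sailors instance copied from the example table shown later in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)


def scalar(sql):
    """Hypothetical helper: run a single-value query and return that value."""
    return conn.execute(sql).fetchone()[0]


# Two sailors are 35.0 years old, so DISTINCT drops one of those values.
sum_all = scalar("SELECT SUM(S.age) FROM Sailors S")
sum_distinct = scalar("SELECT SUM(DISTINCT S.age) FROM Sailors S")
avg_all = scalar("SELECT AVG(S.age) FROM Sailors S")
avg_distinct = scalar("SELECT AVG(DISTINCT S.age) FROM Sailors S")
avg_distinct_r10 = scalar(
    "SELECT AVG(DISTINCT S.age) FROM Sailors S WHERE S.rating = 10"
)

print(sum_all, sum_distinct)
print(avg_all, avg_distinct, avg_distinct_r10)
```

On this instance the duplicate age 35.0 is counted once under DISTINCT, so the sum shrinks and the average shifts.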
MAX (A): the maximum value in attribute A.
MIN (A): the minimum value in attribute A.
SELECT S.sname
FROM Sailors S
WHERE S.rating = (SELECT MAX (S2.rating)
                  FROM Sailors S2);
SELECT MAX (rating)
FROM Sailors;
COUNT (*): the number of tuples.
SELECT COUNT (*)
FROM Sailors S;
COUNT ([DISTINCT] A): the number of (unique) values in attribute A.
SELECT COUNT (DISTINCT S.rating)
FROM Sailors S
WHERE S.sname = 'Bob';
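A minimal sketch of the difference between COUNT (*) and COUNT (DISTINCT A), again on the Sailors instance that appears later in these slides. Python's sqlite3 module is an assumption chosen for convenience. That instance contains no sailor named Bob, so the last query is expected to count nothing.

```python
import sqlite3

# Sailors instance copied from the example table shown later in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# COUNT(*) counts tuples; COUNT(DISTINCT A) counts unique values of A.
n_tuples = conn.execute("SELECT COUNT(*) FROM Sailors S").fetchone()[0]
n_ratings = conn.execute(
    "SELECT COUNT(DISTINCT S.rating) FROM Sailors S"
).fetchone()[0]

# The query from the slide: no tuple qualifies, so the count is zero.
n_bob_ratings = conn.execute(
    "SELECT COUNT(DISTINCT S.rating) FROM Sailors S WHERE S.sname = 'Bob'"
).fetchone()[0]

print(n_tuples, n_ratings, n_bob_ratings)
```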
Find the name and age of the oldest sailor(s).
The first query looks correct, but is illegal. Thoughts as to why? (S.sname is neither aggregated nor grouped, so it has no single value to pair with MAX (S.age).)
SELECT S.sname, MAX (S.age)
FROM Sailors S;
The second query is a correct and legal solution.
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = (SELECT MAX (S2.age)
               FROM Sailors S2);
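The legal form can be tried out as follows. This is a sketch that assumes Python's sqlite3 module and the Sailors instance shown later in these slides. SQLite happens to accept "bare" columns next to an aggregate as a documented extension, so it is not a good tool for demonstrating the error in the first query; only the second query is run here.

```python
import sqlite3

# Sailors instance copied from the example table shown later in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# The subquery computes the maximum age once; the outer query keeps
# every sailor whose age equals it, so ties would all be returned.
oldest = conn.execute(
    """
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
    """
).fetchall()

print(oldest)
```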
GROUP BY AND HAVING
So far, we’ve applied aggregation operators to all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples.
Find the age of the youngest sailor for each rating value.
Suppose we know that rating values go from 1 to 10; we can write ten (!) queries that look like this:
For i = 1, 2, ... , 10:
    SELECT MIN (S.age)
    FROM Sailors S
    WHERE S.rating = i;
But in general, we don’t know how many rating values exist, and what these rating values are.
Plus, it’s a waste of time to write so many queries.
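The ten-query approach can be sketched as a loop that issues one query per rating value. Python's sqlite3 module and the dictionary name `youngest` are assumptions for illustration; the data is the Sailors instance shown later in these slides. Most of the ten queries find no sailors at all on this instance, which illustrates the waste.

```python
import sqlite3

# Sailors instance copied from the example table shown later in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# One query per assumed rating value i = 1..10.
youngest = {}
for i in range(1, 11):
    row = conn.execute(
        "SELECT MIN(S.age) FROM Sailors S WHERE S.rating = ?", (i,)
    ).fetchone()
    # MIN over an empty set is NULL, which sqlite3 returns as None.
    youngest[i] = row[0]

for rating, age in youngest.items():
    print(rating, age)
```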
Notice the notation:
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification
A group is a set of tuples that have the same value for all attributes in grouping-list.
The target-list contains (i) attribute names and (ii) terms with aggregation operations.
The attribute list (i) must be a subset of grouping-list.
Each answer tuple corresponds to a group, and output attributes must have a single value per group.
CONCEPTUAL EVALUATION
Given:
SELECT S.rating, MIN (S.age) AS minage
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT (*) > 1
Step 1: The cross-product of relation-list is computed. In this instance, it’s only Sailors.
Step 2: Tuples that fail qualification are discarded, and ‘unnecessary’ attributes are deleted.
Step 3: Remaining tuples are partitioned into groups by the value of attributes in grouping-list.
Step 4: The group-qualification is then applied to eliminate groups that do not satisfy this condition.
Step 5: One answer tuple is generated per qualifying group by applying the aggregation operator.
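The five conceptual steps can be traced by hand in a few lines of Python. This is a sketch of the conceptual evaluation only, not of how a real query engine works. The intermediate names (`kept`, `groups`, `qualifying`) and the use of sqlite3 as a cross-check are assumptions; the data is the Sailors instance shown later in these slides.

```python
import sqlite3
from collections import defaultdict

# Sailors instance copied from the example table shown later in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

# Step 1: cross-product of relation-list (only Sailors here).
rows = list(SAILORS)

# Step 2: discard tuples failing the qualification (age >= 18) and
# delete the 'unnecessary' attributes sid and sname.
kept = [(rating, age) for (_sid, _sname, rating, age) in rows if age >= 18]

# Step 3: partition the remaining tuples by the grouping-list (rating).
groups = defaultdict(list)
for rating, age in kept:
    groups[rating].append(age)

# Step 4: apply the group-qualification COUNT(*) > 1.
qualifying = {r: ages for r, ages in groups.items() if len(ages) > 1}

# Step 5: one answer tuple per qualifying group, using MIN(age).
answer = sorted((r, min(ages)) for r, ages in qualifying.items())

# Cross-check against an actual SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)
sql_answer = conn.execute(
    """
    SELECT S.rating, MIN(S.age) AS minage
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    """
).fetchall()

print(answer, sql_answer)
```

Only rating 7 keeps two sailors after Step 2, so it is the only group that survives Step 4.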
GROUP BY AND HAVING
Find the age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors.
SELECT S.rating, MIN (S.age)
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT (*) > 1;
Only S.rating and S.age are mentioned in the SELECT, GROUP BY or HAVING clauses; other attributes are ‘unnecessary’.
The 2nd column of the result is unnamed. What to do? Name it with AS, e.g. MIN (S.age) AS minage.

Sailors instance:
sid  sname    rating  age
22   dustin   7       45.0
31   lubber   8       55.5
71   zorba    10      16.0
64   horatio  7       35.0
29   brutus   1       33.0
58   rusty    10      35.0

After the WHERE clause, with unnecessary attributes deleted:
rating  age
1       33.0
7       45.0
7       35.0
8       55.5
10      35.0

Answer relation:
rating  (unnamed)
7       35.0
For each red boat, find the number of reservations for this boat.
SELECT B.bid, COUNT (*) AS scount
FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
GROUP BY B.bid;
Grouping over a join of three relations.
What do we get if we remove B.color = 'red' from the WHERE clause and add a HAVING clause with this condition?
What if we drop Sailors and the condition involving S.sid?
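One way to explore the last question is to run the query with and without Sailors and compare. The sketch below assumes Python's sqlite3 module. The Boats and Reserves rows, the `bname` column, and the reduced Reserves schema are hypothetical, invented only for illustration. The two results agree here because every Reserves.sid matches exactly one Sailors tuple; without that guarantee they could differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sailors  (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Boats    (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    """
)
# Sailors rows come from the instance in these slides.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (64, "horatio", 7, 35.0)],
)
# Hypothetical boats and reservations, made up for this example.
conn.executemany(
    "INSERT INTO Boats VALUES (?, ?, ?)",
    [(101, "boat-a", "blue"), (102, "boat-b", "red"),
     (103, "boat-c", "green"), (104, "boat-d", "red")],
)
conn.executemany(
    "INSERT INTO Reserves VALUES (?, ?)",
    [(22, 101), (22, 102), (22, 104), (31, 102), (31, 104), (64, 102)],
)

# The query from the slide; ORDER BY is added only to make output deterministic.
three_way = conn.execute(
    """
    SELECT B.bid, COUNT(*) AS scount
    FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
    GROUP BY B.bid
    ORDER BY B.bid
    """
).fetchall()

# Same query with Sailors and the S.sid condition dropped.
two_way = conn.execute(
    """
    SELECT B.bid, COUNT(*) AS scount
    FROM Boats B, Reserves R
    WHERE R.bid = B.bid AND B.color = 'red'
    GROUP BY B.bid
    ORDER BY B.bid
    """
).fetchall()

print(three_way, two_way)
```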
Find the age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age).
SELECT S.rating, MIN (S.age)
FROM Sailors S
WHERE S.age > 18
GROUP BY S.rating
HAVING 1 < (SELECT COUNT (*)
            FROM Sailors S2
            WHERE S.rating = S2.rating);
Shows HAVING clause can also contain a subquery.
What if HAVING clause is replaced by: HAVING COUNT (*) > 1?
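The two HAVING variants can be compared directly on the Sailors instance from these slides. This is a sketch assuming Python's sqlite3 module; the ORDER BY clause is an addition that only makes the output order deterministic. The subquery counts all sailors with the group's rating, whereas COUNT (*) counts only the tuples that survived the WHERE clause.

```python
import sqlite3

# Sailors instance copied from the example table in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# Variant 1: at least 2 sailors of any age share the rating.
with_subquery = conn.execute(
    """
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age > 18
    GROUP BY S.rating
    HAVING 1 < (SELECT COUNT(*) FROM Sailors S2 WHERE S.rating = S2.rating)
    ORDER BY S.rating
    """
).fetchall()

# Variant 2: at least 2 sailors older than 18 share the rating.
with_count = conn.execute(
    """
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age > 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
    """
).fetchall()

print(with_subquery)
print(with_count)
```

Rating 10 has two sailors, but one of them is 16, so it appears only in the first result.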
Find those ratings for which the average age is the minimum over all ratings.
Aggregation operations cannot be nested! WRONG:
SELECT S.rating
FROM Sailors S
WHERE S.age = (SELECT MIN (AVG (S2.age))
               FROM Sailors S2);
Correct solution:
SELECT Temp.rating, Temp.avgage
FROM (SELECT S.rating, AVG (S.age) AS avgage
      FROM Sailors S
      GROUP BY S.rating) AS Temp
WHERE Temp.avgage = (SELECT MIN (Temp.avgage)
                     FROM Temp);
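Some systems, SQLite among them, may not let the nested subquery refer to the derived table Temp by name. A WITH clause (common table expression) is one way to express the same idea so that Temp is visible in both places. The sketch below is a variant under that assumption, not the form given in the slides. It uses Python's sqlite3 module and the Sailors instance from these slides.

```python
import sqlite3

# Sailors instance copied from the example table in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# Temp holds one (rating, average age) tuple per rating; the outer
# query then keeps the rating(s) whose average is the smallest.
min_avg_ratings = conn.execute(
    """
    WITH Temp AS (
        SELECT S.rating, AVG(S.age) AS avgage
        FROM Sailors S
        GROUP BY S.rating
    )
    SELECT Temp.rating, Temp.avgage
    FROM Temp
    WHERE Temp.avgage = (SELECT MIN(Temp.avgage) FROM Temp)
    """
).fetchall()

print(min_avg_ratings)
```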
ORDER BY
The ORDER BY keyword is used to sort the result-set by a specified column.
The ORDER BY keyword sorts the records in ascending order by default.
If you want to sort the records in descending order, you can use the DESC keyword.
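A minimal sketch of both sort directions, assuming Python's sqlite3 module and the Sailors instance from the earlier slides. Sorting by sname is an arbitrary choice made because the names are all different, which keeps the order unambiguous.

```python
import sqlite3

# Sailors instance copied from the example table in these slides.
SAILORS = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)

# Ascending order is the default.
ascending = [
    name for (name,) in conn.execute("SELECT S.sname FROM Sailors S ORDER BY S.sname")
]

# DESC reverses the sort direction.
descending = [
    name
    for (name,) in conn.execute("SELECT S.sname FROM Sailors S ORDER BY S.sname DESC")
]

print(ascending)
print(descending)
```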
TOP/BOTTOM
The TOP clause is used to specify the number of records to return.
The TOP clause can be very useful on large tables with thousands of records.
Returning a large number of records can hurt performance.
Can ‘sample’ the table using TOP.
Not all database systems support the TOP clause, and those that offer the feature implement it in different ways.
SQL Server:
SELECT TOP number [PERCENT] column_name(s)
FROM table_name
Ex: SELECT TOP 5 * FROM Persons
MySQL:
SELECT column_name(s)
FROM table_name
LIMIT number
Ex: SELECT * FROM Persons LIMIT 5
Oracle:
SELECT column_name(s)
FROM table_name
WHERE ROWNUM <= number
Ex: SELECT * FROM Persons WHERE ROWNUM <= 5
DB2:
SELECT column_name(s)
FROM table_name
FETCH FIRST number ROWS ONLY
Ex: SELECT * FROM Persons FETCH FIRST 5 ROWS ONLY
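As a runnable illustration of limiting the number of rows, the sketch below uses SQLite through Python's sqlite3 module; SQLite accepts the LIMIT form shown above for MySQL. The Persons columns and rows are hypothetical, since the slides do not define them. The ORDER BY is an addition: without it, which five rows come back is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; the slides only name the table.
conn.execute("CREATE TABLE Persons (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [(i, f"person{i}") for i in range(1, 11)],
)

# Return at most five rows, in a well-defined order.
first_five = conn.execute("SELECT * FROM Persons ORDER BY id LIMIT 5").fetchall()

print(first_five)
```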
SUMMARY
SQL was an important factor in the early acceptance of the relational model; more natural than earlier, procedural query languages.
All queries that can be expressed in relational algebra can also be formulated in SQL.
In addition, SQL has significantly more expressive power than relational algebra, in particular aggregation operations and grouping.
Many alternative ways to write a query; query optimizer looks for most efficient evaluation plan.
In practice, users need to be aware of how queries are optimized and evaluated for most efficient results.