Section 5 - Grouping Data

Post on 24-Dec-2015

1

Section 5 - Grouping Data

The GROUP BY clause allows the grouping of data

Aggregate functions are most often used with the GROUP BY clause

GROUP BY divides a table's rows into sets; aggregate functions then return one summary value for each set.

2

GROUP BY Syntax

SELECT select_list
FROM table_list
[WHERE conditions]
GROUP BY group_by_list;

3

Example

SELECT pub_id, COUNT(title)
FROM titles
GROUP BY pub_id;

All items in the Select list that are not in the Group By list must generate a single value for each group
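The example above can be tried out with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the course database, and the three sample rows are made up for illustration; the real titles table is not shown in the slides.

```python
import sqlite3

# Hypothetical sample data, not taken from the course database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, pub_id TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("Book A", "0736"), ("Book B", "0736"), ("Book C", "0877")],
)

# One output row per pub_id; COUNT(title) is evaluated once for each group.
# ORDER BY is added only to make the printed order predictable.
rows = conn.execute(
    "SELECT pub_id, COUNT(title) FROM titles GROUP BY pub_id ORDER BY pub_id"
).fetchall()
print(rows)
```

Both pub_id (a grouping column) and COUNT(title) (an aggregate) yield exactly one value per group, which is what the rule above requires.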

4

Groups within Groups

You may nest Groups within other groups by separating the columns with commas

Example:

SELECT pub_id, type, COUNT(type)
FROM titles
GROUP BY pub_id, type;
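A minimal sketch of nested groups, assuming SQLite via Python's sqlite3 module and invented sample rows. Each distinct (pub_id, type) pair becomes its own group.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [
        ("0736", "business"),
        ("0736", "business"),
        ("0736", "psychology"),
        ("0877", "mod_cook"),
    ],
)

# Grouping by two columns: one row per (pub_id, type) combination.
rows = conn.execute(
    "SELECT pub_id, type, COUNT(type) FROM titles "
    "GROUP BY pub_id, type ORDER BY pub_id, type"
).fetchall()
for row in rows:
    print(row)
```

Publisher 0736 appears twice in the result, once for each type it publishes.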

5

Restrictions

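Again: Each item in the SELECT list must produce a single value for each group

Wrong:

SELECT pub_id, type, COUNT(type)
FROM titles
GROUP BY pub_id;

Here type is neither in the GROUP BY list nor inside an aggregate, so it can take several values within one pub_id group.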

6

More Restrictions

You can NOT use aggregate expressions in the GROUP BY clause

Wrong:

SELECT pub_id, SUM(price)
FROM titles
GROUP BY pub_id, SUM(price);
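The failing statement above can be checked with a short sketch. It assumes SQLite via Python's sqlite3 module, which rejects an aggregate inside GROUP BY with an OperationalError; other products report the error in their own way. The sample rows are invented.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, price REAL)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("0736", 10.0), ("0736", 5.0), ("0877", 20.0)],
)

# The "Wrong" statement: SUM(price) cannot be a grouping key, because the
# groups must exist before any aggregate can be computed.
try:
    conn.execute(
        "SELECT pub_id, SUM(price) FROM titles GROUP BY pub_id, SUM(price)"
    )
    rejected = False
except sqlite3.OperationalError as exc:
    rejected = True
    print("rejected:", exc)

# The corrected statement groups by the column only.
fixed = conn.execute(
    "SELECT pub_id, SUM(price) FROM titles GROUP BY pub_id ORDER BY pub_id"
).fetchall()
print(fixed)
```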

7

No Column Numbers

Unlike the ORDER BY clause, you cannot use the select list column position number in the GROUP BY clause (some products, such as PostgreSQL, MySQL, and SQLite, accept it as an extension)

Wrong:

SELECT pub_id, SUM(price)
FROM titles
GROUP BY 1;

8

Multiple Summaries

To see the summary values for a publisher and for the type of books within that publisher you will need two SELECT statements

SELECT pub_id, SUM(price)
FROM titles
GROUP BY pub_id;

SELECT pub_id, type, SUM(price)
FROM titles
GROUP BY pub_id, type;
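As a sketch of the two-statement approach, the following runs both queries against the same invented rows, assuming SQLite via Python's sqlite3 module. The per-type sums for a publisher add up to that publisher's total.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, type TEXT, price REAL)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("0736", "business", 10.0),
        ("0736", "business", 5.0),
        ("0736", "psychology", 7.0),
        ("0877", "mod_cook", 20.0),
    ],
)

# First summary: one total per publisher.
by_pub = conn.execute(
    "SELECT pub_id, SUM(price) FROM titles GROUP BY pub_id ORDER BY pub_id"
).fetchall()

# Second summary: one total per type within each publisher.
by_pub_type = conn.execute(
    "SELECT pub_id, type, SUM(price) FROM titles "
    "GROUP BY pub_id, type ORDER BY pub_id, type"
).fetchall()

print(by_pub)
print(by_pub_type)
```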

9

Exercise

Display a list of the authors and the state they live in. Sort the list by the author’s last name within state

10

Discussion

SELECT au_lname, au_fname, state
FROM authors
ORDER BY state, au_lname;

We don't need a Group By for this statement because no summary information was asked for

11

Exercise

Display a list of states and the number of authors that are from each state. Also, show how many different cities are in each state. Sort in state order.

12

Discussion

SELECT state, count(*), count(distinct city)
FROM authors
GROUP BY state
ORDER BY state;

This gets us the number of authors per state and the number of distinct cities in each state. If we didn't use the DISTINCT keyword, count(city) would count one city value for every author, so a city with several authors would be counted several times.
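To make the effect of DISTINCT concrete, here is a small sketch assuming SQLite via Python's sqlite3 module and four invented authors. It adds a plain COUNT(city) column next to COUNT(DISTINCT city) for comparison.

```python
import sqlite3

# Hypothetical sample data: three California authors in two cities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (au_lname TEXT, city TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO authors VALUES (?, ?, ?)",
    [
        ("White", "Oakland", "CA"),
        ("Green", "Oakland", "CA"),
        ("Carson", "Berkeley", "CA"),
        ("Ringer", "Salt Lake City", "UT"),
    ],
)

# Columns: state, authors per state, distinct cities, non-distinct city count.
rows = conn.execute(
    "SELECT state, COUNT(*), COUNT(DISTINCT city), COUNT(city) "
    "FROM authors GROUP BY state ORDER BY state"
).fetchall()
for row in rows:
    print(row)
```

For CA the distinct count is 2 while the plain count is 3, because Oakland is counted once per author without DISTINCT.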

13

NULLs and GROUPS

NULLs never equal another NULL

BUT... GROUP BY will create a separate group for the NULLs

Think of it as a Group of Unknowns

14

Example

The Type column contains NULLs

SELECT type, COUNT(*)
FROM titles
GROUP BY type;

The NULL group returns a count of 1. If we used COUNT(type) instead of COUNT(*), we'd get back a zero for that group instead. Why?

15

Discussion

COUNT(*) counts whole rows, and there is 1 row in the NULL type group

COUNT(type) counts the non-NULL type values in the NULL type group, and there are zero non-NULL values in the NULL group.
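The difference between the two counts can be seen side by side in a sketch like this one, assuming SQLite via Python's sqlite3 module and invented rows with a single NULL type.

```python
import sqlite3

# Hypothetical sample data: one title has no type (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("Book A", "business"), ("Book B", "business"), ("Book C", None)],
)

rows = conn.execute(
    "SELECT type, COUNT(*), COUNT(type) FROM titles GROUP BY type"
).fetchall()

# Map each group key to (COUNT(*), COUNT(type)); None is the NULL group.
counts = {t: (star, col) for t, star, col in rows}
print(counts)
```

The NULL group shows 1 for COUNT(*) and 0 for COUNT(type), while the business group shows 2 for both.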

16

More NULLs

More than one NULL in a column?

SELECT advance, COUNT(*)
FROM titles
GROUP BY advance;

Two books have a NULL advance and they are grouped. [Note: zero is a different group]
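A brief sketch of this behavior, assuming SQLite via Python's sqlite3 module and invented rows: two NULL advances land in one group, and an advance of zero forms a separate group.

```python
import sqlite3

# Hypothetical sample data: two NULL advances, one zero, two of 5000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, advance INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [
        ("Book A", None),
        ("Book B", None),
        ("Book C", 0),
        ("Book D", 5000),
        ("Book E", 5000),
    ],
)

rows = conn.execute(
    "SELECT advance, COUNT(*) FROM titles GROUP BY advance"
).fetchall()

# Key None is the NULL group; key 0 is the zero-advance group.
groups = dict(rows)
print(groups)
```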

17

GROUP BY with WHERE

The WHERE clause allows grouping of a subset of rows.

The WHERE clause acts first to find the rows you want

Then the GROUP BY clause divides the rows into groups

SELECT type, AVG(price)
FROM titles
WHERE advance > 5000
GROUP BY type;

18

No WHERE

Same statement, no WHERE

SELECT type, AVG(price)
FROM titles
GROUP BY type;

NULL group returned [In the previous example, the WHERE clause eliminated the NULLs]
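The two statements can be compared in a sketch like the following, assuming SQLite via Python's sqlite3 module. The rows are invented, and the sketch additionally assumes that the title with a NULL type also has a NULL advance, which is why the condition advance > 5000 drops it.

```python
import sqlite3

# Hypothetical sample data; the last row has NULL type, price and advance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (type TEXT, price REAL, advance INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("business", 20.0, 10000),
        ("business", 10.0, 6000),
        ("psychology", 8.0, 2000),
        (None, None, None),
    ],
)

# WHERE runs first: only rows with advance > 5000 reach GROUP BY.
with_where = dict(conn.execute(
    "SELECT type, AVG(price) FROM titles WHERE advance > 5000 GROUP BY type"
).fetchall())

# Without WHERE every row is grouped, including the NULL-type row.
without_where = dict(conn.execute(
    "SELECT type, AVG(price) FROM titles GROUP BY type"
).fetchall())

print(with_where)
print(without_where)
```

A comparison against NULL is never true, so the NULL row fails the WHERE test and its group never forms in the first query.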

19

ORDER the GROUPS

GROUP BY puts rows into sets, but doesn't put them in order.

SELECT type, AVG(price)
FROM titles
WHERE advance > 5000
GROUP BY type
ORDER BY 2;
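A minimal sketch of ordering the groups, assuming SQLite via Python's sqlite3 module and invented rows. ORDER BY 2 sorts on the second select-list item, here AVG(price).

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (type TEXT, price REAL, advance INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("business", 20.0, 10000),
        ("business", 10.0, 6000),
        ("psychology", 8.0, 7000),
        ("mod_cook", 30.0, 8000),
    ],
)

# Groups are sorted by their average price, lowest first.
rows = conn.execute(
    "SELECT type, AVG(price) FROM titles "
    "WHERE advance > 5000 GROUP BY type ORDER BY 2"
).fetchall()
for row in rows:
    print(row)
```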

20

Exercise

Show the average position in which an author appears on a book if the author has a royalty share of less than 100%. Also, show the number of books written by the author. Identify each author by social security number and sort by social security number within number-of-books order. Show the authors with the most books first.

21

Discussion

SELECT au_id, AVG(au_ord), COUNT(title_id)
FROM titleauthors
WHERE royaltyshare < 1.0
GROUP BY au_id
ORDER BY 3 DESC, au_id;

22

HAVING Clause

HAVING is like a WHERE clause for a GROUP

WHERE limits rows

HAVING limits GROUPs

23

HAVING Syntax

SELECT select_list
FROM table_list
[WHERE conditions]
GROUP BY group_list
[HAVING conditions];

24

HAVING Aggregates

The WHERE conditions apply before Aggregates are calculated

Then the HAVING conditions apply after Aggregates are calculated

25

HAVING vs. WHERE

WHERE comes after the FROM

HAVING comes after the GROUP BY

WHERE conditions cannot include Aggregates

HAVING conditions almost always include Aggregates

26

Example

SELECT type, count(*)
FROM titles
GROUP BY type
HAVING COUNT(*) > 1;

NOTE: Cannot use WHERE instead of HAVING since WHERE does not allow Aggregates
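The note above can be verified with a short sketch, assuming SQLite via Python's sqlite3 module and invented rows. SQLite raises an OperationalError when an aggregate appears in WHERE; other products word the error differently.

```python
import sqlite3

# Hypothetical sample data: two business titles, one psychology title.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("Book A", "business"), ("Book B", "business"), ("Book C", "psychology")],
)

# HAVING filters groups after COUNT(*) has been computed.
rows = conn.execute(
    "SELECT type, COUNT(*) FROM titles GROUP BY type HAVING COUNT(*) > 1"
).fetchall()
print(rows)

# WHERE filters individual rows before grouping, so no aggregate exists yet.
try:
    conn.execute(
        "SELECT type, COUNT(*) FROM titles WHERE COUNT(*) > 1 GROUP BY type"
    )
    where_rejected = False
except sqlite3.OperationalError as exc:
    where_rejected = True
    print("rejected:", exc)
```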

27

HAVING without Aggregates

Applies to grouping columns

SELECT type
FROM titles
GROUP BY type
HAVING type LIKE 'p%';

You could have used the WHERE clause to find types that began with 'p', as well
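As a sketch of that equivalence, the following runs both forms against invented rows, assuming SQLite via Python's sqlite3 module. The only difference is when the filter is applied: HAVING discards whole groups after grouping, WHERE discards rows before grouping.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [
        ("Book A", "business"),
        ("Book B", "psychology"),
        ("Book C", "psychology"),
        ("Book D", "popular_comp"),
    ],
)

# Filter on the grouping column after the groups are formed.
via_having = sorted(conn.execute(
    "SELECT type FROM titles GROUP BY type HAVING type LIKE 'p%'"
).fetchall())

# Filter the rows first, then group what is left.
via_where = sorted(conn.execute(
    "SELECT type FROM titles WHERE type LIKE 'p%' GROUP BY type"
).fetchall())

print(via_having)
print(via_where)
```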

28

Exercise

List the editor positions that have at least three editors

29

Answer

SELECT ed_pos, count(*)
FROM editors
GROUP BY ed_pos
HAVING count(*) >= 3;

30

HAVING Conditions

You may use more than one condition on a HAVING clause

SELECT pub_id, SUM(advance), AVG(price)
FROM titles
GROUP BY pub_id
HAVING SUM(advance) > 15000
AND AVG(price) < 20
AND pub_id > '0800';
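A small sketch of a HAVING clause with several conditions, assuming SQLite via Python's sqlite3 module. The rows are invented so that each publisher except 0877 fails exactly one of the conditions.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, advance INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("0736", 10000, 15.0),  # fails pub_id > '0800'
        ("0736", 8000, 10.0),
        ("0877", 9000, 12.0),   # passes all three conditions
        ("0877", 7000, 18.0),
        ("0900", 5000, 10.0),   # fails SUM(advance) > 15000
        ("1389", 20000, 25.0),  # fails AVG(price) < 20
        ("1389", 1000, 30.0),
    ],
)

rows = conn.execute(
    "SELECT pub_id, SUM(advance), AVG(price) FROM titles "
    "GROUP BY pub_id "
    "HAVING SUM(advance) > 15000 AND AVG(price) < 20 AND pub_id > '0800'"
).fetchall()
print(rows)
```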

31

Exercise

List the publisher id and the average advance for each book that the publisher sells and the total number of books they sell, but only if the total cost of all the books they sell (that are priced more than $10.00) is more than eighty dollars and they sell more than one book. Sort by pub_id and book count.

32

Discussion

SELECT pub_id, AVG(advance), COUNT(*)
FROM titles
WHERE price > 10
GROUP BY pub_id
HAVING SUM(price) > 80
AND Count(*) > 1
ORDER BY 1, 3;

33

Discussion

The WHERE clause first eliminates all books that do not cost more than $10

Then the GROUP BY forms the pub_id groups

Then the HAVING clause eliminates any groups whose total cost ( sum(price) ) is not greater than $80 and any pub_id that does not have more than one book left after the WHERE clause.
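The three steps described above can be traced in a sketch like this one, assuming SQLite via Python's sqlite3 module and invented rows. The first query shows the groups as they stand after WHERE and GROUP BY; the second adds HAVING and ORDER BY.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, price REAL, advance INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("0736", 12.0, 4000),
        ("0736", 9.0, 2000),   # dropped by WHERE (price not above 10)
        ("0877", 50.0, 1000),
        ("0877", 40.0, 3000),
        ("0877", 5.0, 9000),   # dropped by WHERE
        ("1389", 20.0, 6000),
        ("1389", 30.0, 8000),
    ],
)

# Steps 1 and 2: WHERE removes cheap books, GROUP BY forms pub_id groups.
after_grouping = conn.execute(
    "SELECT pub_id, SUM(price), COUNT(*) FROM titles "
    "WHERE price > 10 GROUP BY pub_id ORDER BY 1"
).fetchall()
print(after_grouping)

# Step 3: HAVING keeps only groups with total price > 80 and several books.
final = conn.execute(
    "SELECT pub_id, AVG(advance), COUNT(*) FROM titles "
    "WHERE price > 10 GROUP BY pub_id "
    "HAVING SUM(price) > 80 AND COUNT(*) > 1 ORDER BY 1, 3"
).fetchall()
print(final)
```

Publisher 0736 is left with a single book after WHERE and so fails COUNT(*) > 1, while 1389 has two books but a total of only 50.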

34

Section 5 - Last Slide

Please complete Assignment 4