1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

34
1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT

Transcript of 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

Page 1: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

1

CS 430Database Theory

Winter 2005

Lecture 12: SQL DML - SELECT

Page 2: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

2

SELECT

One statement for retrieving data:SELECT <attributes>FROM <table>WHERE <condition>;

Approximately: <attributes>(<condition>(<table>)) {t.<attributes> | <table>(t) AND <condition>} Difference: SELECT does not eliminate duplicate

rows

Page 3: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

3

SELECTMultiple Tables Can have multiple tables in the FROM

clause Retrieve names and address of Research

department employees SELECT FNAME, LNAME, ADDRESS

FROM EMPLOYEE, DEPARMENT

WHERE DNAME = ‘Research’

AND DEPARTMENT.DNUMBER

= EMPLOYEE.DNUMBER;

Page 4: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

4

Notes:

SQL is officially case insensitive However, there are situations where it makes a

difference E.g. MySQL stores MyISAM tables in individual files

making table names case sensitive

Attribute names don’t require qualification if they are unique (among all the tables that are part of the query)

Page 5: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

5

SELECTQualifying Attribute Names We can qualify attribute names

Tablename.attribute Assume EMPLOYEE and DEPARTMENT

both have NAME and DNUMBER attributes: SELECT FNAME, EMPLOYEE.NAME, ADDRESS

FROM EMPLOYEE, DEPARMENT

WHERE DEPARTMENT.NAME = ‘Research’

AND DEPARTMENT.DNUMBER

= EMPLOYEE.DNUMBER;

Page 6: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

6

SELECTQualifying Attribute Names (2) <table> [AS] <synonym>:

SELECT FNAME, E.NAME, ADDRESS

FROM EMPLOYEE AS E, DEPARMENT AS D

WHERE D.NAME = ‘Research’

AND D.NUMBER = E.NUMBER; Also can refer to same table more than once:

SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME

FROM EMPLOYEE E, EMPLOYEE S

WHERE E.SUPERSSN = S.SSN;

Page 7: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

7

SELECTWhere Clause is Optional SELECT FNAME, LNAME, SSN

FROM EMPLOYEE; Including multiple tables:

SELECT FNAME, LNAME, DNAME

FROM EMPLOYEE, DEPARTMENT Result is a Cartesian Product (all combinations of

one row from each table)

Page 8: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

8

SELECT *

Can use * in place of attributes:SELECT *

FROM EMPLOYEE; Attributes are listed in order of definition

Can also qualify * with a table name:SELECT EMPLOYEE.*, DNAME

FROM EMPLOYEE, DEPARTMENT

WHERE DNO = DNUMBER; Style: Never use * in a program

Always ask for attributes in order expected by program

Page 9: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

9

SELECT DISTINCT

What is we want duplicates eliminated That is, when we want a set of results Usually costs some time to eliminate duplicates In many cases there are no duplicates, e.g. when we select

the key of a table Usually don’t want to eliminate duplicates when using

aggregate functions (e.g. SUM) Do SELECT DISTINCT:

SELECT DISTINCT SALARY

FROM EMPLOYEE; Note: ALL is opposite of DISTINCT

Page 10: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

10

SELECTOrdering the result Use an ORDER BY clause E.g.

SELECT DNAME, LNAME, FNAME

FROM EMPLOYEE, DEPARTMENT

WHERE DNO = DNUMBER

ORDER BY DNAME, LNAME, FNAME;

Page 11: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

11

SELECTOrdering the result (2) By default ORDER BY sorts in ascending

order, if you want the opposite use ASC and DESC: SELECT DNAME, LNAME, FNAME

FROM EMPLOYEE, DEPARTMENT

WHERE DNO = DNUMBER

ORDER BY DNAME DESC, LNAME ASC,

FNAME ASC;

Page 12: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

12

Operations

There is a plethora of operations available. Some examples: Arithmetic: +, -, *, / Comparison: <, <=, =, <> or !=, >=, > Pattern Matching: LIKE

Operations can appear as part of a condition and as part of attributes to be selected

Page 13: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

13

Operations Examples

SELECT FNAME, LNAME, 1.1 * SALARYFROM EMPLOYEE;

SELECT FNAME, LNAMEFROM EMPLOYEE E, EMPLOYEE SWHERE E.MGRSSN = S.SSN AND E.SALARY > 1.5 * S.SALARY;

SELECT FNAME, LNAME,1.1 * SALARY AS NEW_SALARY

FROM EMPLOYEE; Renames the result column (when it makes a difference)

Page 14: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

14

Pattern Matching

Field LIKE Pattern, Field NOT LIKE Pattern Patterns

% matches zero or more characters like Regular Expression *

_ matches one character like Regular Expression .

Use \ to escape _ and % Example:

ADDRESS LIKE ‘%Houston%’ Houston is anywhere in the address string (would include a

street named Houston)

Page 15: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

15

Note on Pattern Matching

Consider performance when using pattern matching. SELECT FNAME, LNAME

FROM EMPLOYEE

WHERE ADDRESS LIKE ‘%Houston%’; This would require a full table scan of the

EMPLOYEE table

Page 16: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

16

Three Valued Logic

NULLs have multiple meanings All comparison involving a NULL yield NULL

or UNKNOWN Even NULL = NULL, all NULLs are considered

distinct AND, OR, NOT

FALSE AND UNKNOWN is FALSE TRUE OR UNKNOWN is TRUE NOT UNKNOWN is UNKNOWN

Page 17: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

17

More Three Valued Logic

Can explicitly test for NULL with IS NULL, IS NOT NULL

Example:SELECT FNAME, LNAME

FROM EMPLOYEE

WHERE SUPERSSN IS NULL;

Page 18: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

18

JOINing Tables

SQL incorporate notion of JOINed tables Can specify a join in the FROM clause

Example: SELECT FNAME, LNAME, ADDRESS

FROM (EMPLOYEE JOIN DEPARMENT

ON DNO = DNUMBER)

WHERE DNAME = ‘Research’;

Page 19: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

19

More JOINing Tables

NATURAL JOIN SELECT DNAME, DLOCATION

FROM (DEPARTMENTNATURAL JOIN DEPT_LOCATIONS);

OUTER JOINS SELECT E.LNAME AS ENAME, S.LNAME AS

SNAMEFROM (EMPLOYEE E LEFT OUTER JOIN

EMPLOYEE S ON E.SUPERSSN = S.SSN);

Page 20: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

20

More JOINing Tables

Kinds of JOINs [INNER] JOIN

Normal Join CROSS JOIN

Cross Join is a cartesian product (no join condition) LEFT [OUTER] JOIN RIGHT [OUTER] JOIN

Left and Right Outer Joins FULL [OUTER] JOIN

Two sided outer join Not supported in MySQL

Page 21: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

21

Aggregate Functions

Aggregate Functions: COUNT, SUM, MAX, MIN, AVG, and some more SUM, AVG work on numeric values MAX, MIN work on values with a “total order” COUNT returns number of tuples or values

Simple example:SELECT AVG(SALARY), MAX(SALARY)FROM EMPLOYEE;

Note: Can include WHERE Clause and JoinsSELECT AVG(SALARY), MAX(SALARY) FROM (EMPLOYEE JOIN DEPARTMENT ON

DNO=DNUMBER) WHERE DNAME = ‘Research’;

Page 22: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

22

COUNT()

COUNT counts the number of tuples or values

COUNT(*) will count the number of tuples in the query

COUNT(SALARY) will count the number of tuples with a non-null SALARY

COUNT(DISTINCT SALARY) will count the number of DISTINCT values for SALARY

Page 23: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

23

GROUP BY

Problem with simple aggregates: Can’t return more than one row Relational Algebra: Group rows together and

compute aggregate function over the group (“script F” operator)

SQL: GROUP BY CLAUSE Simple example:

SELECT DNO, COUNT(*), AVG(SALARY)FROM EMPLOYEEGROUP BY DNO;

Page 24: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

24

MORE GROUP BY

Restriction (Queries with GROUP BY): All attributes used in the SELECT clause (includes used in expressions) must be either: Named in the GROUP BY clause Or, Used in an Aggregate function

Illegal Example (should be): SELECT DNO, SALARY FROM EMPLOYEE

GROUP BY DNO; Legal Example:

SELECT DNO, MAX(SALARY) FROM EMPLOYEE GROUP BY DNO;

Page 25: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

25

GROUP BY and NULLs

If some of the rows being grouped have null attributes for a GROUPing attribute a separate group is created for those rows with that attribute being NULL

Example: Get number of employees supervised by each supervisor:

SELECT SUPERSSN, COUNT(*)

FROM EMPLOYEE

GROUP BY SUPERSSN;

Page 26: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

26

HAVING

HAVING provides a condition that is tested after the rows are grouped: Example: HAVING COUNT(*) > 2 (more than two

tuples in the group). EXAMPLE:

SELECT SUPERSSN, MAX(SALARY)

FROM EMPLOYEE

GROUP BY SUPERSSN

HAVING COUNT(*) >= 2;

Page 27: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

27

MORE GROUP BY and HAVING Semantics

The rest of the query (without GROUP BY and HAVING clauses) is processed first, resulting in a table of tuples

That table is then grouped according to the GROUP BY and Aggregate functions are computed as needed for the SELECT and HAVING clauses

Any HAVING clause is applied The appropriate attributes from the resulting

tuples are returned as the result

Page 28: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

28

Nested Queries

(Need MySQL 4.1 or later) Mini-example:

SELECT DISTINCT PNUMBER

FROM PROJECT

WHERE PNUMBER IN

(SELECT PNUMBER

FROM … WHERE … ); In general: Anything that can be done with a

nested query can be done without

Page 29: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

29

More Nested Queries

IN can work with tuples: Example:

Find all employees who work the same (project, hours) combination on some project that SSN ‘123456789’ works on.SELECT DISTINCT ESSNFROM WORKS_ONWHERE (PNO, HOURS) IN (SELECT PNO, HOURS FROM WORKS_ON WHERE SSN = ‘123456789’);

Page 30: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

30

More Nested Queries

IN is the same as = ANY You can use ANY and ALL with any comparison (>,

=, etc.) operator Example:

Find all employees whose salary is greater than all the employees in department 5

SELECT LNAME, FNAMEFROM EMPLOYEEWHERE SALARY > ALL (SELECT SALARY FROM EMPLOYEE WHERE DNO = 5);

Page 31: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

31

Correlated Nested Queries

Refers to queries where a nested query refers to attributes in an outer query

Example: Find the name of all employees who have a dependent with

the same first name and sex SELECT E.FNAME, E.LNAME

FROM EMPLOYEE E

WHERE E.SSN IN

(SELECT ESSN FROM DEPENDENT

E.FNAME = DEPENDENT_NAME

AND E.SEX = SEX);

Page 32: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

32

EXISTS and NOT EXISTS

EXISTS and NOT EXISTS are predicates that can be applied to a subquery

Example: Find all employees having dependents SELECT FNAME, LNAME FROM EMPLOYEE

WHERE NOT EXISTS

(SELECT * FROM DEPENDENT

WHERE SSN = ESSN);

Page 33: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

33

UNION

You can get the result of more than one query combined into a single result:

SELECT …

UNION

SELECT … Results in a table whose rows are the union of the

results from the two queries SELECT DISTINCT is implied with UNIONS

This is a set operation

Page 34: 1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.

34

SELECT Statement Summary

Lots of options You can mix and match pretty much

arbitrarily Build up queries much as you would

relational algebra expressions