Junping Sun Database Systems 4-1

Part 4: Database Language - SQL

Database Languages and Implementation Data Model

Data Model = Data Schema + Database Operations + Constraints

• Database languages such as SQL and QUEL can be viewed as tools for implementing the database schema and data operations at the logical or implementation level.

• Database Language = Data Definition Language (DDL) + Data Manipulation Language (DML)

• DDL implements database schema

• DML implements database operations

• The separation of DDL and DML is the major distinction between application systems developed with database languages and those developed with programming languages.

SQL - Structured Query Language

SQL:

• It is the most widely accepted and implemented interface language for relational database systems ("intergalactic dataspeak").

History of Relational Database Languages:

• SEQUEL (1974 -- 1975)
  • It was the Application Programming Interface (API) to System R.
  • It was revised to SEQUEL/2 after several years, and later SEQUEL/2 was changed to SQL.
• SQL/DS (1981)
• DB2 (1983)
• SQL (ANSI-86), the first standardized version of SQL, called SQL1
• SQL (ANSI-89)
• SQL (ANSI-92), called SQL2
• SQL3, which supports recursive operations and the object-oriented paradigm
• SQL-99 Standard

Data Definition

Schema Definition at Three Levels of a Database:

View data schema (table) definition: A view table can be defined on top of one or more base tables.

Base data table schema definition: A base table corresponds to one physical data file in the storage system.

Physical:

• Each base table can be stored using a different type of storage schema or data organization structure, such as a sequential file, hash index, ISAM, VSAM, B-Tree, B+-Tree, B*-Tree, K-D Tree, KDB Tree, R-Tree, R+-Tree, or R*-Tree.
• Integrity constraints on the schema
• Authorization and security mechanisms on user-defined database operations such as query, update, and insert/delete operations.

Data Definition

Create Statements:

• create table statement (to define a base table)
• create index statement (to define an index at the internal level)
• create view statement (to define a view at the user level)
• create schema statement (to treat a database as a whole unit in SQL89 & SQL2)

Drop Statements:

• drop table statement (to delete the definition and all instances of the table)
• drop index statement (to remove an existing index)
• drop view statement (to delete the view)
• drop schema statement (to delete the schema)

Schema and Catalog in ANSI-SQL Standard

SQL Schema:

• It is identified by a schema name and includes an authorization identifier that indicates the user or account who owns the schema.

Example:

CREATE SCHEMA COMPANY AUTHORIZATION JSMITH;

• It creates a schema called COMPANY, owned by the user with authorization identifier JSMITH.

Syntax:

schema ::= CREATE SCHEMA schema-name
           AUTHORIZATION user
           [ schema-element-list ]

CREATE TABLE EMPLOYEE Statement

CREATE TABLE EMPLOYEE
(NAME VARCHAR2(19) NOT NULL,
 SSN CHAR(9),
 BDATE DATE,
 ADDRESS VARCHAR(30),
 SEX CHAR,
 SALARY NUMBER(10,2),
 SUPERSSN CHAR(9),
 DNO VARCHAR(8) NOT NULL,
 CONSTRAINT EMPPK PRIMARY KEY (SSN),
 CONSTRAINT EMPSUPERFRK FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE (SSN) DISABLE,
 CONSTRAINT EMPDUMFRK FOREIGN KEY (DNO) REFERENCES DEPARTMENT (DNUMBER) DISABLE);

• The constraint can be enabled by using the ALTER TABLE statement after the data is loaded into the table.

ALTER TABLE EMPLOYEE ENABLE CONSTRAINT EMPSUPERFRK;

Specifying Referential Triggered Actions

CREATE TABLE EMPLOYEE
(NAME VARCHAR2(19) NOT NULL,
 SSN CHAR(9),
 BDATE DATE,
 ADDRESS VARCHAR(30),
 SEX CHAR,
 SALARY NUMBER(10,2) CHECK (SALARY BETWEEN 10000 AND 99000),
 SUPERSSN CHAR(9),
 DNO VARCHAR(9) DEFAULT '1' NOT NULL,
 CONSTRAINT EMPPK PRIMARY KEY (SSN),
 CONSTRAINT EMPSUPERFK FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE (SSN)
   ON DELETE CASCADE DISABLE);

• ORACLE supports ON DELETE CASCADE.
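
The effect of ON DELETE CASCADE can be tried out with a small sketch. It uses Python's sqlite3 module as a stand-in engine rather than ORACLE, so the DISABLE/ENABLE clauses are left out, and the three sample rows are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch, off by default

# Simplified EMPLOYEE table (assumed columns) with a self-referencing foreign key.
conn.executescript("""
CREATE TABLE employee (
    ssn      CHAR(9) PRIMARY KEY,
    name     VARCHAR(19) NOT NULL,
    superssn CHAR(9) REFERENCES employee (ssn) ON DELETE CASCADE
);
INSERT INTO employee VALUES ('888665555', 'Borg',  NULL);
INSERT INTO employee VALUES ('333445555', 'Wong',  '888665555');
INSERT INTO employee VALUES ('123456789', 'Smith', '333445555');
""")

# Deleting Wong also deletes Smith, whose superssn references Wong.
conn.execute("DELETE FROM employee WHERE ssn = '333445555'")
remaining = [row[0] for row in conn.execute("SELECT name FROM employee ORDER BY name")]
print(remaining)
```

Only the row that nobody references survives, which is why cascading deletes on a recursive relationship such as supervision should be specified with care.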

Specifying Referential Triggered Actions

CREATE TABLE DEPARTMENT
(DNAME VARCHAR2(15) NOT NULL,
 DNUMBER VARCHAR(8),
 MGRSSN CHAR(9) DEFAULT '888665555' NOT NULL,
 CONSTRAINT DEPTPK PRIMARY KEY (DNUMBER),
 CONSTRAINT DEPTSK UNIQUE (DNAME),
 CONSTRAINT DEPTMGRFRK FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE (SSN)
   ON DELETE CASCADE DISABLE);

ALTER TABLE EMPLOYEE ADD (CONSTRAINT EMPDNOFRK FOREIGN KEY (DNO) REFERENCES DEPARTMENT (DNUMBER));

Data Types

SQL Data Types (ANSI-SQL)           SQL Data Types (ORACLE)

CHARACTER(n)                        CHAR(n)
CHARACTER VARYING(n)                VARCHAR(n), VARCHAR2(n)
NUMERIC(p,s), DECIMAL(p,s)          NUMBER(p,s)
INTEGER, INT, SMALLINT              NUMBER(38)
FLOAT(b), DOUBLE PRECISION, REAL    NUMBER
DATE                                DATE
(no ANSI counterpart listed)        RAW, LONG, LONG RAW, ROWID

Data Manipulation in SQL

Data Manipulation at Base Table Level:

• Query the database via the select statement.
• Modify data (tuples) in a table of the database via the update statement.
• Remove data (tuples) from a table of the database via the delete statement.
• Append data (tuples) to a table in the database via the insert statement.

Data Manipulation at View (virtual table) Level:

• Query part of the database via a select statement on a view.
• Update or modify the partial data defined at the view level, which requires mapping the view update to the underlying base tables:
  • single table update
  • multiple table update, which still has unsolved problems.

Query Database In SQL

• Querying a database in SQL is done via the select statement.

General format of the select statement:

select <attribute list>
from <table list>
where <condition>

• <attribute list> is a list of attribute names whose values are to be retrieved by the query.

• <table list> is a list of the relation names required to process the query. Multiple tables listed in the <table list> imply that a join operation is involved.

• <condition> is a conditional (Boolean) expression that identifies the tuples to be retrieved by the query. <condition> specifies the selection and join operations. <condition> can include another select statement as a subquery (nested query).

SELECT-PROJECT QUERY

Q0: Retrieve the birth date and address of the employee whose name is ‘John B. Smith’.

SQL Script for Q0:

Q0: select bdate, address
    from employee
    where fname = 'John' and minit = 'B' and lname = 'Smith';

Relational Algebra Expression for Q0:

π <bdate, address> ( σ fname = 'John' and minit = 'B' and lname = 'Smith' (employee) )

Target Attributes: bdate, address
Constraint: fname = 'John' and minit = 'B' and lname = 'Smith'
Target Relation: employee

SELECT-PROJECT-JOIN QUERY

Q1. Retrieve the first and last names and addresses of all employees who work for the 'Research' department.

select fname, lname, address

from employee, department

where dname = 'Research' and dnumber = dno;

Target Attributes: fname, lname, address

Constraints:

Select Condition: dname = 'Research'

Join Condition: dnumber = dno

Target Relations: employee, department

• This query involves one selection on the department relation and one join of the employee and department relations.
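
As a rough illustration, Q1 can be executed against a tiny in-memory database. The sketch below uses Python's sqlite3 module as a convenient engine; the two cut-down tables and their three employee rows are made-up sample data, not the course database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, reduced versions of the department and employee tables.
conn.executescript("""
CREATE TABLE department (dname TEXT, dnumber INTEGER PRIMARY KEY);
CREATE TABLE employee (fname TEXT, lname TEXT, address TEXT, dno INTEGER);
INSERT INTO department VALUES ('Research', 5), ('Administration', 4);
INSERT INTO employee VALUES
    ('John', 'Smith', '731 Fondren, Houston, TX', 5),
    ('Alicia', 'Zelaya', '3321 Castle, Spring, TX', 4),
    ('Franklin', 'Wong', '638 Voss, Houston, TX', 5);
""")

# dname = 'Research' is the select condition, dnumber = dno the join condition.
q1 = """
SELECT fname, lname, address
FROM employee, department
WHERE dname = 'Research' AND dnumber = dno
ORDER BY lname
"""
q1_rows = conn.execute(q1).fetchall()
for row in q1_rows:
    print(row)
```

Dropping the join condition from the where clause would pair every employee with the 'Research' tuple, so all three employees would be returned.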

Q2. For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birth date.

select pnumber, dnum, lname, address, bdate

from project, department, employee

where plocation = 'Stafford' and dnum = dnumber and mgrssn = ssn;

Target Attributes: pnumber, dnum, lname, address, bdate

Constraints:

Select Condition: plocation='Stafford'

Join Condition: dnum=dnumber, mgrssn = ssn

Target Relations: project, department, employee

• a selection operation on the project relation selects the project tuples located in 'Stafford'.

• a join of the project and department relations finds the controlling department.

• a join of the department and employee relations finds the manager's information in the employee relation.

• the two join operations implement two relationships in the ER schema of the database, MANAGES and CONTROLS.

Dealing with Ambiguous Attribute Names and Aliasing

Q1A: select fname, lname, address

from employee, department

where department.dname = 'Research' and

department.dnumber = employee.dnumber ;

• if the attribute names for department number are the same in both the employee and department tables, then a qualifier (the table name) is necessary when specifying the query, to avoid ambiguity.

Q8. For each employee, retrieve the employee's first and last name and the first and last name of his or her immediate supervisor.

select e.fname, e.lname, s.fname, s.lname

from employee e, employee s

where e.superssn = s.ssn;

Discussion on Aliasing

• ambiguity also arises in queries that refer to the same relation name twice.

• the above query statement declares two alternative names (aliases), e and s, for the employee relation.

• e and s can be imagined as two different copies of the employee relation.

e represents employees in the role of supervisees

s represents employees in the role of supervisors

• join and selection operations are involved.

• the join attributes are superssn and ssn.

the join condition e.superssn = s.ssn links each employee to the corresponding supervisor's information, such as fname and lname.

• the join condition implements the recursive relationship supervision in the original ER schema.

• this is an example of one-level recursion.

• a general recursive query, with an unknown number of levels, cannot be specified this way.

Query Examples

Query with PROJECT:

Q9: List the social security numbers of all employees.

select ssn
from employee;

Query with SELECT:

Q1C: Retrieve all employee tuples from department 5.

select *
from employee
where dno = 5;

Query Examples

Query with CARTESIAN PRODUCT:

Q10: List all combinations of EMPLOYEE SSN and DEPARTMENT DNAME.

select ssn, dname
from employee, department;

Query with Retrieving Distinct Attribute Values:

Q11: Retrieve the salary of every employee.

select ALL salary
from employee;

Q11A: Retrieve all distinct salary values.

select DISTINCT salary
from employee;

Query Involving Union

Q4. Make a list of all project numbers for projects that involve an employee whose last name is 'Smith' as a worker or as a manager of the department that controls the project.

(select distinct pnumber
 from project, employee, department
 where lname = 'Smith' and dnum = dnumber and mgrssn = ssn)
union
(select distinct pnumber
 from project, employee, works_on
 where lname = 'Smith' and pnumber = pno and essn = ssn);

• the first select query retrieves the projects that involve a 'Smith' as manager of the department that controls the project.

• the second select query retrieves the projects that involve a 'Smith' as a worker on the project.

• if several employees have the last name 'Smith', the project numbers involving any of them are retrieved.

Discussion

The first part of the union:

Target Attributes: pnumber
Constraints:
  Select Condition: lname = 'Smith'
  Join Condition: dnum = dnumber (implements relationship control)
                  mgrssn = ssn (implements relationship manager)
Target Relations: project, employee, department

The second part of the union:

Target Attributes: pnumber
Constraints:
  Select Condition: lname = 'Smith'
  Join Condition: pnumber = pno and essn = ssn (implements M:N relationship works_on)
Target Relations: project, employee, works_on

Predicate IN

• The IN predicate selects those rows for which a specified value appears in a list of constant values enclosed in parentheses or in the result of a subquery.

Q13: Retrieve the social security numbers of all employees who work on any of the projects with project number 1, 2, or 3.

select distinct essn
from works_on
where pno in (1, 2, 3);

Result from the query:

essn
123456789
666884444
453453453
333445555

Workson Table

Predicate NOT IN

• The NOT IN predicate is true if the expression preceding the keyword IN does not match any value in the list.

Q13b: Retrieve the social security numbers of all employees who work on projects other than projects 1, 2, and 3.

select essn
from works_on
where pno not in (1, 2, 3);

Result from the query:

essn
333445555
888665555
987654321
987987987
999887777

Quantifier ANY/SOME

Predicate ANY/SOME:

• The = ANY/SOME predicate selects those rows for which a specified value appears in the results of a subquery.

Query: Retrieve the social security numbers of employees who work on some project controlled by department 5.

select distinct essn
from works_on
where pno = any (select pnumber
                 from project
                 where dnum = 5);

• The = any predicate is the same as the IN predicate.

• ANSI-SQL supports both the ANY and SOME predicates, even though they are equivalent.

• ORACLE only supports ANY predicate not SOME.

• The difference between the IN and = ANY (= SOME) predicates is that IN can be followed by either a list of values or a subquery, whereas ANY (SOME) accepts only a subquery.

Quantifier SOME and ANY

• Both SOME and ANY are designed to link a simple relational operator with a subquery that returns a multi-row result.

• The sequence preceding the subquery has the format {expression relational-operator quantifier} and is called a quantifier predicate:

Expression    Comparison-operator    Quantifier    Subquery
quantity      >                      ANY           (select ... )

• The whole quantifier predicate is applied to each row of the subquery result in turn. The logical expression is true if and only if one or more rows in the subquery result satisfy the comparison. It is false if and only if none of the subquery result rows satisfy the comparison.

Quantifier ALL

Quantifier ALL:

• The ALL predicate evaluates to true if and only if the comparison between a single value and the set of values retrieved by the subquery is true for all values retrieved by the subquery.

Query: List the names of employees whose salary is greater than the salary of all the employees in department 5.

select lname, fname
from employee
where salary > all (select salary
                    from employee
                    where dno = 5);

• The quantifiers ANY, SOME, and ALL can be prefixed with any comparison operator in { =, ≠, <, ≤, >, ≥ }.

• ≠ can be expressed by <> or != in the SQL condition expression.
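
The quantifier semantics described above can be mimicked outside SQL. The following is only a sketch in Python: the helper names `any_predicate` and `all_predicate` and the salary list are hypothetical, and SQL's three-valued logic for NULL values is deliberately ignored.

```python
import operator

def any_predicate(value, compare, subquery_rows):
    """value <op> ANY (subquery): true if at least one row satisfies the comparison."""
    return any(compare(value, row) for row in subquery_rows)

def all_predicate(value, compare, subquery_rows):
    """value <op> ALL (subquery): true if every row satisfies the comparison."""
    return all(compare(value, row) for row in subquery_rows)

# Hypothetical result of: select salary from employee where dno = 5
dept5_salaries = [30000, 40000, 38000]

gt_any = any_predicate(35000, operator.gt, dept5_salaries)  # 35000 > 30000 holds
gt_all = all_predicate(35000, operator.gt, dept5_salaries)  # 35000 > 40000 fails
print(gt_any, gt_all)
```

With an empty subquery result the two helpers return False and True respectively, which matches the usual reading of ANY and ALL over an empty set.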

Discussions on Predicates IN and NOT IN

• The predicate a IN (x, y, z) is equivalent to a = x OR a = y OR a = z

select essn
from works_on
where pno = 1 or pno = 2 or pno = 3;

• The predicate a NOT IN (x, y, z) is equivalent to a <> x AND a <> y AND a <> z

• a NOT IN (x, y, z) is equivalent to a <> ALL (x, y, z)

select essn
from works_on
where pno <> 1 and pno <> 2 and pno <> 3;

• The predicate a <> ANY/SOME (x, y, z) is equivalent to (a <> x) or (a <> y) or (a <> z).
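
These equivalences can be checked mechanically. The sketch below runs both forms of each predicate through Python's sqlite3 module; the works_on rows are hypothetical sample data, and the helper `essns` is introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE works_on (essn TEXT, pno INTEGER);
INSERT INTO works_on VALUES
    ('123456789', 1), ('123456789', 2), ('666884444', 3),
    ('333445555', 10), ('987654321', 20), ('888665555', 20);
""")

def essns(condition):
    """Return the sorted distinct essn values satisfying the given where condition."""
    sql = "SELECT DISTINCT essn FROM works_on WHERE " + condition + " ORDER BY essn"
    return [row[0] for row in conn.execute(sql)]

in_rows = essns("pno IN (1, 2, 3)")
or_rows = essns("pno = 1 OR pno = 2 OR pno = 3")
not_in_rows = essns("pno NOT IN (1, 2, 3)")
and_rows = essns("pno <> 1 AND pno <> 2 AND pno <> 3")

print(in_rows == or_rows, not_in_rows == and_rows)
```

The sample data contains no NULL project numbers; with NULLs in the list, NOT IN behaves differently from what intuition suggests, so the equivalence is best read for non-null values.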

Nested Query (Type-N)

Q4A. Make a list of all project names for projects that involve an employee whose last name is 'Smith' as a worker, or as a manager of the department that controls the project.

select distinct pname
from project
where pnumber in (select pnumber
                  from project, department, employee
                  where lname = 'Smith' and dnum = dnumber and mgrssn = ssn)
      or
      pnumber in (select pno
                  from works_on, employee
                  where lname = 'Smith' and essn = ssn);

• The comparison operator IN compares a value v (here v is pnumber) with a set (or multiset) of values V and evaluates to TRUE if v is one of the elements of V.

Decomposition of Nested Query

Subquery 1:

temp1: select pnumber

from project, department, employee

where dnum = dnumber and mgrssn =ssn and lname ='Smith'

Subquery 2:

temp2: select pno

from works_on, employee

where essn = ssn and lname = 'Smith'

Subquery 3:

select distinct pname
from project
where pnumber = temp1.pnumber or pnumber = temp2.pno

Comparison of Nested and Flattened Queries

Query: Retrieve the social security numbers of employees who work on some project controlled by department 5.

select distinct essn
from works_on
where pno in (select pnumber
              from project
              where dnum = 5);

Equivalent Query:

select distinct essn
from works_on, project
where dnum = 5 and pno = pnumber;

• The first implementation, which uses a subquery, can avoid the join operation.
• The second implementation has to use a join operation, where pno = pnumber is the join condition or join path.

Correlated Nested Query (Type-J)

Q12. Retrieve the name of each employee who has a dependent with the same first name and same sex as the employee.

select e.fname, e.lname

from employee e

where e.ssn in (select essn

from dependent

where essn = e.ssn and sex = e.sex and e.fname = dependent_name);

• The where clause of the inner query block contains join predicates that reference a table of the outer query block (a table that is not included in the from clause of the inner query block).

• essn = e.ssn correlates the current dependent tuple with the employee that the dependent belongs to.

• sex = e.sex and e.fname = dependent_name check that the sex and first name values of the employee and dependent tuples are equal.

Rule for Subqueries and Nested Queries

1. The subquery should be enclosed within parentheses.

2. Subqueries may contain nested subqueries. When subqueries are nested, SQL evaluates them from the inside out.
   a. The innermost query is processed first.
   b. Then the result of that query is passed to the next outer query.

3. In general, we might have several levels of nested queries. Ambiguity among attribute names is possible if attributes of the same name exist, one in a relation in the from-clause of the outer query and the other in a relation in the from-clause of the nested query (inner query). The rule is that a reference to an unqualified attribute refers to the relation declared in the innermost nested query.

4. Column names in a subquery are implicitly qualified by the table name in the FROM clause of the subquery (that is, the FROM clause at the same level).

5. A subquery may refer only to column names from tables that are named in outer queries or in the subquery's own FROM clause. A subquery may not access tables that are used only by a child query.

6. When a subquery is one of the two operands involved in a comparison, the subquery must be written as the second operand.

Query with Exists Function

Q12B: Retrieve the name of each employee who has a dependent with the same first name and same sex as the employee.

select e.fname, e.lname

from employee e

where exists (select *

from dependent

where essn = e.ssn and sex = e.sex and e.fname = dependent_name);

The Exists Function in SQL

• exists and not exists in SQL are used to check whether the result of a correlated query is empty.

• exists and not exists in SQL are usually used in conjunction with a correlated nested query.

• In example Q12B, the nested query within the exists function references the ssn, fname, and sex attributes of the employee relation from the outer query.

• For each employee tuple, evaluate the nested query, which retrieves all dependent tuples with the same social security number ssn, sex, and name as the employee tuple.

If at least one tuple exists in the result of the nested query, then select that employee tuple.

In general,

exists(Q) returns TRUE if there is at least one tuple in the result of query Q and returns FALSE otherwise.

not exists(Q) returns TRUE if there are no tuples in the result of query Q and returns FALSE otherwise.
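
A small runnable sketch may help to see both functions at work. It uses Python's sqlite3 module with hypothetical employee and dependent rows: the first query follows Q12B, and the second applies not exists to find employees without any dependent tuple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (fname TEXT, lname TEXT, ssn TEXT PRIMARY KEY, sex TEXT);
CREATE TABLE dependent (essn TEXT, dependent_name TEXT, sex TEXT);
INSERT INTO employee VALUES
    ('John', 'Smith', '123456789', 'M'),
    ('Franklin', 'Wong', '333445555', 'M'),
    ('Joyce', 'English', '453453453', 'F');
INSERT INTO dependent VALUES
    ('123456789', 'John', 'M'),
    ('123456789', 'Alice', 'F'),
    ('333445555', 'Joy', 'F');
""")

# exists: the unqualified sex inside the subquery refers to dependent.sex,
# because the innermost from clause takes precedence.
same_name_and_sex = conn.execute("""
SELECT e.fname, e.lname
FROM employee e
WHERE EXISTS (SELECT *
              FROM dependent
              WHERE essn = e.ssn AND sex = e.sex AND e.fname = dependent_name)
""").fetchall()

# not exists: ssn is not a column of dependent, so it refers to the outer employee.
no_dependents = conn.execute("""
SELECT fname, lname
FROM employee
WHERE NOT EXISTS (SELECT *
                  FROM dependent
                  WHERE ssn = essn)
""").fetchall()

print(same_name_and_sex, no_dependents)
```

In both queries the subquery is re-evaluated conceptually once per employee tuple, which is what makes it correlated.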

Query with Not Exists Function

Q6: Retrieve the names of employees who have no dependents.

select fname, lname

from employee

where not exists (select *

from dependent

where ssn = essn);

• The correlated nested query retrieves all dependent tuples related to an employee tuple; if none exist, the employee tuple is selected.

• For each employee tuple, the nested query selects all dependent tuples whose essn value matches the employee ssn.

• If the result of the nested query is empty, then no dependents are related to the employee, so that employee tuple is selected and its fname and lname are retrieved.

• This is an implementation of the difference operation.

Nested Query with Two Exists Function

Q7. List the names of managers who have at least one dependent.

select fname, lname

from employee

where exists (select *

from dependent

where ssn = essn)

and

exists (select *

from department

where ssn = mgrssn);

• the first nested query selects all dependent tuples related to an employee.

• the second nested query selects all department tuples managed by the employee.

• if at least one tuple of the first kind and at least one of the second kind exist with the same ssn, the employee tuple is selected and its fname and lname are retrieved.

• this is an implementation of the intersection operation.

Query with Division (use contains)

Q3. Retrieve the name of each employee who works on all the projects controlled by department 5.

select fname, lname

from employee

where ((select pno

from works_on

where ssn = essn)

contains

(select pnumber

from project

where dnum = 5));

• the second nested query, which is not correlated to the outer query, retrieves the project numbers of all projects controlled by department 5.

• for each employee tuple, the first nested query, which is correlated, retrieves the project numbers on which the employee works; if these contain all projects controlled by department 5, the employee tuple is selected and the name from that tuple is retrieved.

• ANSI-SQL and most SQL engines do not support the contains operator.

Query with Division

Q3: Retrieve the name of each employee who works on all the projects controlled by department 5.

select fname, lname

from employee e

where not exists

( (select pnumber

from project

where dnum = 5)

minus

(select pno

from works_on w

where e.ssn = w.essn) );

Query with Division

Q3: Retrieve the name of each employee who works on all the projects controlled by department 5.

select fname, lname

from employee

where not exists

(select *

from works_on b

where (b.pno in (select pnumber

from project

where dnum = 5))

and

not exists (select *

from works_on c

where c.essn = ssn and

c.pno = b.pno));

Discussion

• The outer nested query selects any works_on (b) tuple whose pno is that of a project controlled by department 5 and for which there is no works_on (c) tuple with the same pno and the same ssn as that of the employee tuple under consideration in the outer query.

If no such tuple exists, we select the employee tuple and retrieve the fname and lname of that employee tuple.

The equivalent interpretation of the query script is as follows:

there does not exist a project controlled by department 5 that the employee does not work on.

Equivalently,

select each employee who works on all the projects controlled by department 5.
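
The double not exists formulation of division can be traced on a toy database. The sketch below uses Python's sqlite3 module; the three employees, four projects, and their assignments are hypothetical, and the table is spelled works_on throughout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (fname TEXT, lname TEXT, ssn TEXT PRIMARY KEY);
CREATE TABLE project (pnumber INTEGER PRIMARY KEY, dnum INTEGER);
CREATE TABLE works_on (essn TEXT, pno INTEGER);
INSERT INTO employee VALUES
    ('John', 'Smith', '123456789'),
    ('Franklin', 'Wong', '333445555'),
    ('Alicia', 'Zelaya', '999887777');
INSERT INTO project VALUES (1, 5), (2, 5), (3, 5), (10, 4);
INSERT INTO works_on VALUES
    ('123456789', 1), ('123456789', 2),
    ('333445555', 1), ('333445555', 2), ('333445555', 3), ('333445555', 10),
    ('999887777', 10);
""")

# "There is no department-5 project (seen in works_on b) that the employee
# does not work on (no matching works_on c tuple)."
division_rows = conn.execute("""
SELECT fname, lname
FROM employee
WHERE NOT EXISTS
      (SELECT *
       FROM works_on b
       WHERE b.pno IN (SELECT pnumber FROM project WHERE dnum = 5)
         AND NOT EXISTS (SELECT *
                         FROM works_on c
                         WHERE c.essn = ssn AND c.pno = b.pno))
""").fetchall()
print(division_rows)
```

Smith misses project 3 and Zelaya works only on project 10, so only Wong qualifies. Because the outer subquery ranges over works_on rather than project, a department-5 project with no assigned workers would not be noticed by this formulation.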

Renaming Attributes and Join Tables

Q8a: Retrieve the last name of each employee and his or her supervisor, while renaming the resulting attribute names as employee_name and supervisor_name.

select e.lname as employee_name, s.lname as supervisor_name
from employee as e, employee as s
where e.superssn = s.ssn;

Q1a: Retrieve the names and addresses of the employees who work for the 'Research' department.

select fname, lname, address
from (employee join department on dno = dnumber)
where dname = 'Research';

• The concept of a joined table is supported only from ANSI SQL-92 (SQL2) onward.

Natural Join, Outer Join, and Nested Join

Q1b: select fname, lname, address
     from (employee natural join
           (department as dept (dname, dno, mssn, msdate)))
     where dname = 'Research';

Q8b: Retrieve the last name of each employee and, if the employee has a supervisor, the last name of that supervisor.

select e.lname as employee_name, s.lname as supervisor_name
from (employee e left outer join employee s
      on e.superssn = s.ssn);

Q2A: select pnumber, dnum, lname, address, bdate
     from ((project join department on dnum = dnumber) join
           employee on mgrssn = ssn)
     where plocation = 'Stafford';
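
The difference between an inner join and the left outer join of Q8b shows up as soon as one employee has no supervisor. A minimal sketch with Python's sqlite3 module follows; the three employee rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (lname TEXT, ssn TEXT PRIMARY KEY, superssn TEXT);
INSERT INTO employee VALUES
    ('Borg', '888665555', NULL),
    ('Wong', '333445555', '888665555'),
    ('Smith', '123456789', '333445555');
""")

# Inner join: employees without a supervisor are dropped.
inner_rows = conn.execute("""
SELECT e.lname AS employee_name, s.lname AS supervisor_name
FROM employee e JOIN employee s ON e.superssn = s.ssn
ORDER BY e.lname
""").fetchall()

# Left outer join: every employee appears; a missing supervisor becomes NULL.
outer_rows = conn.execute("""
SELECT e.lname AS employee_name, s.lname AS supervisor_name
FROM employee e LEFT OUTER JOIN employee s ON e.superssn = s.ssn
ORDER BY e.lname
""").fetchall()

print(inner_rows)
print(outer_rows)
```

Python reports the SQL NULL in the supervisor column as `None`.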

Outer Join in ORACLE

Q8b: Retrieve the last name of each employee and, if the employee has a supervisor, the last name of that supervisor.

select e.lname as employee_name, s.lname as supervisor_name
from employee e, employee s
where e.superssn = s.ssn (+);

• This is equivalent to the employee table in the role of employee being left outer joined with the employee table in the role of supervisor.

Q8c: Retrieve the last name of each employee and, if the employee has supervisees, the last names of those supervisees.

select s.lname as employee_name, e.lname as supervisee_name
from employee s, employee e
where s.ssn = e.superssn (+);

• This is equivalent to the employee table in the role of supervisor being left outer joined with the employee table in the role of supervisee.

Aggregation Functions

Aggregate Functions:

• An aggregate function takes an entire column as an argument and computes a single value based on the contents of the column.

• The function result is an "aggregate" of the individual data values in the rows of the column.

Q15': Find the total number of employees in the company, the sum of the salaries of all employees, the maximum, the minimum, and the average salary.

select count(*), sum(salary), max(salary), min(salary), avg(salary)
from employee;

• count(*) counts the total number of tuples in the employee table.

• the sum(), max(), min(), and avg() functions are applied to the salary column values of the tuples in the employee table.

Q16’: Find the total number of employees of the ‘Research’ department, as well as the summation of the salaries, the maximum salary, the minimum salary, and the average salary in this department.

select count(*), sum(salary), max(salary), min(salary), avg(salary)
from employee, department
where dno = dnumber and dname = 'Research';

• all the aggregation functions, count(), sum(), max(), min(), and avg() are applied to these employee tuples from ‘Research’ department.

• the constraints dno = dnumber and dname = ‘Research’ in where clause are evaluated first before aggregate functions are evaluated.

Q19: Count the number of distinct salary values in the database.

select count(distinct salary)
from employee;
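The following sketch runs Q16’ and Q19 with Python's sqlite3 module. The department and employee rows are hypothetical, and the tables carry only the columns the two queries need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer);
    create table employee (lname text, salary integer, dno integer);
    insert into department values ('Research', 5), ('Administration', 4);
    insert into employee values
        ('Smith', 30000, 5), ('Wong', 40000, 5), ('Narayan', 38000, 5),
        ('Zelaya', 25000, 4), ('Wallace', 38000, 4);
""")

# Q16': the where clause first restricts the joined rows to 'Research';
# the aggregates are then computed over the rows that remain.
q16 = conn.execute("""
    select count(*), sum(salary), max(salary), min(salary), avg(salary)
    from employee, department
    where dno = dnumber and dname = 'Research'
""").fetchone()

# Q19: the salary 38000 occurs twice but is counted once.
q19 = conn.execute("select count(distinct salary) from employee").fetchone()[0]

print(q16)
print(q19)
```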


Q5: Retrieve the names of all employees who have two or more dependents.

Incorrect one (under the early SQL standards):

select lname, fname
from employee
where (select count(*)
       from dependent
       where ssn = essn) >= 2;

• Under the early SQL standards (SQL-86 and SQL-89), when a subquery is one of the two operands involved in a comparison, the subquery must be written as the second operand. SQL-92 and most current systems accept either order.

Correct one:

select lname, fname
from employee
where 2 <= (select count(*)
            from dependent
            where ssn = essn);
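As an illustration of the correlated count subquery in Q5, the sketch below runs both operand orders through Python's sqlite3 module. SQLite is assumed here only as a convenient engine; it happens to accept the subquery on either side of the comparison, so both forms return the same rows. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (fname text, lname text, ssn text);
    create table dependent (essn text, dependent_name text);
    insert into employee values
        ('John', 'Smith', '123456789'),
        ('Franklin', 'Wong', '333445555'),
        ('Alicia', 'Zelaya', '999887777');
    insert into dependent values
        ('123456789', 'Alice'), ('123456789', 'Michael'),
        ('333445555', 'Joy'), ('333445555', 'Theodore'), ('333445555', 'Ann');
""")

# Subquery written as the second operand.
subquery_second = conn.execute("""
    select lname, fname from employee
    where 2 <= (select count(*) from dependent where ssn = essn)
    order by lname
""").fetchall()

# Subquery written as the first operand (accepted by SQLite).
subquery_first = conn.execute("""
    select lname, fname from employee
    where (select count(*) from dependent where ssn = essn) >= 2
    order by lname
""").fetchall()

print(subquery_second)
print(subquery_first)
```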


Group By Clause

• In many cases, we want to apply aggregate functions to subgroups of tuples in a relation based on some attribute values.

Example:

Find the average salary of employees in each department.

Find the number of employees who work on each project.

• In these cases, we want to group the tuples that have the same value of some attribute(s), called the grouping attribute(s), and apply the function to each such group independently.

• SQL has a group by clause for this purpose.

• The group by clause specifies the grouping attributes, which should also appear in the select clause, so that the value resulting from applying each function to a group of tuples appears along with the value of the grouping attribute(s).


Group by Clause

Q20: For each department, retrieve the department number, the number of employees in the department, and their average salary.

select dno, count(*), avg(salary)
from employee
group by dno;

Q21: For each project, retrieve the project number, the project name, and number of employees who work on that project.

select pnumber, pname, count(*)
from project, works_on
where pnumber = pno
group by pnumber, pname;

• the grouping and aggregate functions are applied after the joining of the two relations.
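A small sketch of Q20 and Q21, run with Python's sqlite3 module. The rows are hypothetical, and an order by clause is added to each query only to make the printed output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (ssn text, salary integer, dno integer);
    create table project (pnumber integer, pname text);
    create table works_on (essn text, pno integer);
    insert into employee values ('1', 30000, 5), ('2', 40000, 5), ('3', 25000, 4);
    insert into project values (1, 'ProductX'), (2, 'ProductY');
    insert into works_on values ('1', 1), ('2', 1), ('1', 2);
""")

# Q20: one output row per department number.
q20 = conn.execute("""
    select dno, count(*), avg(salary)
    from employee
    group by dno
    order by dno
""").fetchall()

# Q21: the join is formed first; grouping then counts the joined rows
# that share the same (pnumber, pname).
q21 = conn.execute("""
    select pnumber, pname, count(*)
    from project, works_on
    where pnumber = pno
    group by pnumber, pname
    order by pnumber
""").fetchall()

print(q20)
print(q21)
```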


Having Clause

Q22. For each project on which more than two employees work, retrieve the project number, project name, and number of employees who work on that project.

select pnumber, pname, count(*)

from project, works_on

where pnumber = pno

group by pnumber, pname

having count(*) > 2;

• SQL provides a having clause, which can appear only in conjunction with a group by clause.

• having provides a condition on the group of tuples associated with each value of the grouping attributes, and only the groups that satisfy the condition are retrieved in the result of the query.

• the selection condition in the where clause limits the tuples to which the group functions are applied.

• the having clause limits the whole groups.
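To make the effect of having concrete, the sketch below runs Q22 with and without its having clause using Python's sqlite3 module. The project and works_on rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table project (pnumber integer, pname text);
    create table works_on (essn text, pno integer);
    insert into project values (1, 'ProductX'), (2, 'ProductY'), (3, 'ProductZ');
    insert into works_on values
        ('1', 1), ('2', 1), ('3', 1),
        ('1', 2), ('2', 2),
        ('3', 3);
""")

GROUPED = """
    select pnumber, pname, count(*)
    from project, works_on
    where pnumber = pno
    group by pnumber, pname
"""

# Without having: every project group is returned.
all_groups = conn.execute(GROUPED + " order by pnumber").fetchall()

# Q22: having discards whole groups with two or fewer employees.
q22 = conn.execute(GROUPED + " having count(*) > 2 order by pnumber").fetchall()

print(all_groups)
print(q22)
```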


Q23. For each project, retrieve the project number, project name, and number of employees from department 5 who work on that project.

select pnumber, pname, count(*)

from project, works_on, employee

where pnumber = pno and ssn = essn and dno = 5

group by pnumber, pname;

Q5. Retrieve the names of all employees who have two or more dependents.

select lname, fname

from employee

where ssn in (select essn

from dependent

where ssn = essn

group by essn

having count (essn) >= 2);


Where Condition before Having

Q24. Count the total number of employees with salaries greater than $40,000 who work in each department, but only for those departments with more than five employees.

select dname, count(*)

from department, employee

where dnumber = dno and salary > 40000

group by dname

having count(*) > 5;

• this is not the correct query statement.

• the selection condition (salary > 40000) eliminates the employee tuples whose salary <= 40000 before the group by and having clauses are applied.

• it therefore selects only departments that have more than five employees who each earn more than $40,000.

• the rule is that the where clause is executed first to select individual tuples; the having clause is applied later to select individual groups of tuples.

• the tuples are already restricted to employees earning more than $40,000 before the function in the having clause is applied.


The correct one:

select dname, count(*)

from department, employee

where dnumber = dno and salary > 40000 and

dno in (select dno

from employee

group by dno

having count(*) > 5)

group by dname;

• the constraints dnumber = dno and salary > 40000 in where clause join the department tuples with employee tuples whose salary is greater than 40000.

• the subquery selects the numbers of the departments in which more than five employees work.
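The difference between the two versions of Q24 can be checked on a small data set. In this sketch (Python's sqlite3 module, hypothetical rows), 'Research' has six employees of whom two earn more than $40,000, and 'Administration' has only two employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer);
    create table employee (lname text, salary integer, dno integer);
    insert into department values ('Research', 5), ('Administration', 4);
""")
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [("E1", 30000, 5), ("E2", 35000, 5), ("E3", 38000, 5),
     ("E4", 25000, 5), ("E5", 45000, 5), ("E6", 50000, 5),
     ("E7", 43000, 4), ("E8", 55000, 4)],
)

# Incorrect version: having counts only rows that survived salary > 40000.
incorrect = conn.execute("""
    select dname, count(*)
    from department, employee
    where dnumber = dno and salary > 40000
    group by dname
    having count(*) > 5
""").fetchall()

# Correct version: the subquery picks departments by their total headcount.
correct = conn.execute("""
    select dname, count(*)
    from department, employee
    where dnumber = dno and salary > 40000 and
          dno in (select dno from employee group by dno having count(*) > 5)
    group by dname
""").fetchall()

print(incorrect)
print(correct)
```

The first query returns nothing because no department has more than five high earners; the second reports the two high earners of the one department that has more than five employees in total.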


Having Clause

• The HAVING clause is designed for use in conjunction with GROUP BY when it is desired to restrict the groups which appear in the final result.

• HAVING conditions often involve aggregation functions, permitting the filtering of groups based on summary calculations.

• Aggregation functions may not be used within a WHERE clause.

• WHERE clause filters individual rows going to the final result or intermediate result.

• HAVING filters groups going into the final result.

• WHERE and HAVING may be used together cooperatively: WHERE is applied first to filter single rows, then groups are formed from the rows which remain, and finally the HAVING clause is applied to filter the groups.

• Generally, the HAVING clause immediately follows the GROUP BY clause.


Summary of GROUP BY/HAVING Clauses

1. Attribute names or column names not listed in the GROUP BY clause may not appear in the HAVING condition, except as arguments of aggregation functions, in ANSI-1989 and ANSI-1992 SQL.

2. Aggregation functions may always be used in the HAVING clause, even if they do not appear in the SELECT attribute list.

3. The HAVING condition can involve compound conditions formed by combining simple logical expressions with the logical operators AND, OR, and NOT.

4. HAVING and WHERE can work together.

• The HAVING condition is always applied to the groups formed by the GROUP BY clause.

• The WHERE condition is always applied to the attributes involved in selection or join.

5. Non-aggregation expressions may be used in the HAVING clause, provided the expressions involve only columns which are named in the GROUP BY clause.


Syntax Structure of SELECT Statements

SELECT <attribute list>
FROM <table list>
[WHERE <condition>]
[GROUP BY <grouping attribute(s)>]
[HAVING <grouping condition>]
[ORDER BY <attribute list>]

• The SELECT clause lists the attributes or functions to be retrieved.

• The FROM clause specifies all relations needed in the query, but not those needed only in nested queries.

• The WHERE clause specifies the conditions for selection of tuples from these relations.

• GROUP BY specifies the grouping attribute(s), whereas the HAVING clause specifies a condition on the groups being selected rather than on the individual tuples.

• The built-in aggregation functions COUNT, SUM, MIN, MAX, and AVG are used in conjunction with grouping.

• ORDER BY specifies an order for displaying the result of the query.


Sequence

1. FROM: The FROM clause is processed first. It specifies the table(s) or views which serve as the source of all data for the final result. If multiple tables are involved, the join operation is necessary.

2. WHERE: The WHERE clause is processed second. It eliminates those rows produced by the FROM clause which do not satisfy the search condition.

3. GROUP BY: The GROUP BY clause groups the remaining rows on the basis of shared values in the GROUP BY column(s). The partial result now has the form of a set of groups.

4. HAVING: The HAVING clause is now applied to eliminate those groups which do not satisfy the HAVING condition.

5. SELECT: The SELECT list is used to remove unwanted columns or attributes from the partial result. Only elements which appear in the SELECT list remain.

6. ORDER BY: The final result is sorted according to the ORDER BY list.
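The six steps can be mimicked by hand. The sketch below is only a conceptual model, not how a DBMS actually executes a query: it applies the steps one by one to a list of hypothetical employee tuples in plain Python and then compares the outcome with the equivalent SQL statement run through sqlite3.

```python
import sqlite3
from collections import defaultdict

rows = [  # (lname, salary, dno): hypothetical employee tuples
    ("Smith", 30000, 5), ("Wong", 40000, 5), ("Narayan", 38000, 5),
    ("Zelaya", 25000, 4), ("Wallace", 43000, 4), ("Jabbar", 31000, 4),
    ("Borg", 55000, 1),
]

# 1. FROM: start from all rows of the source table.
source = rows
# 2. WHERE: keep the rows that satisfy the search condition.
kept = [r for r in source if r[1] > 25000]
# 3. GROUP BY: partition the remaining rows on dno.
groups = defaultdict(list)
for r in kept:
    groups[r[2]].append(r)
# 4. HAVING: keep the groups that satisfy the group condition.
surviving = {d: g for d, g in groups.items() if len(g) >= 2}
# 5. SELECT: compute the output columns for each surviving group.
result = [(d, len(g), sum(r[1] for r in g) / len(g)) for d, g in surviving.items()]
# 6. ORDER BY: sort the final result.
result.sort(key=lambda t: t[0])

conn = sqlite3.connect(":memory:")
conn.execute("create table employee (lname text, salary integer, dno integer)")
conn.executemany("insert into employee values (?, ?, ?)", rows)
sql_result = conn.execute("""
    select dno, count(*), avg(salary)
    from employee
    where salary > 25000
    group by dno
    having count(*) >= 2
    order by dno
""").fetchall()

print(result)
print(sql_result)
```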


Insert Statement in SQL

Insert Statement:

Insert a new tuple into employee table:

insert into employee

values ('Richard', 'K', 'Marini', '653298653', '30-DEC-52', '98 Oak Forest, Katy, TX', 'M', 37000, '987654321', 4);

insert into employee(fname, lname, ssn)

values ('Richard', 'Marini', '653298653');

• Attributes that are not specified in the insert statement are set to their DEFAULT value if one is defined, or to NULL otherwise.

• The insert operation will be rejected if an unspecified attribute is declared NOT NULL and has no default value.
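The treatment of unspecified attributes can be observed directly. In this sketch (Python's sqlite3 module), the NOT NULL constraints and the default value 1 for dno are assumptions chosen for illustration; they are not taken from the schema used elsewhere in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table employee (
        fname  text not null,
        lname  text not null,
        ssn    text primary key,
        salary integer,            -- no default: unspecified means NULL
        dno    integer default 1   -- hypothetical default value
    )
""")

# Only three attributes are supplied; the rest are filled in by the DBMS.
conn.execute(
    "insert into employee (fname, lname, ssn) "
    "values ('Richard', 'Marini', '653298653')"
)
filled_in = conn.execute(
    "select salary, dno from employee where ssn = '653298653'"
).fetchone()

# lname is NOT NULL and has no default, so omitting it is rejected.
try:
    conn.execute("insert into employee (fname, ssn) values ('Robert', '980760540')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(filled_in, rejected)
```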


Insert a set of tuples into a table:

• create a relation and load it with the result of a query.

create table depts_info (deptname varchar(15),

noofemps integer,

totalsal integer);

insert into depts_info (deptname, noofemps, totalsal)

select dname, count(*), sum(salary)

from department, employee

where dnumber = dno

group by dname;
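A runnable sketch of the create table and insert ... select pair above, using Python's sqlite3 module and a few hypothetical department and employee rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer);
    create table employee (lname text, salary integer, dno integer);
    insert into department values ('Research', 5), ('Administration', 4);
    insert into employee values ('Smith', 30000, 5), ('Wong', 40000, 5), ('Zelaya', 25000, 4);

    create table depts_info (deptname varchar(15),
                             noofemps integer,
                             totalsal integer);

    -- One new tuple is inserted per group produced by the query.
    insert into depts_info (deptname, noofemps, totalsal)
    select dname, count(*), sum(salary)
    from department, employee
    where dnumber = dno
    group by dname;
""")

loaded = conn.execute("select * from depts_info order by deptname").fetchall()
print(loaded)
```

Unlike a view, depts_info is a stored copy: later changes to employee are not reflected in it unless it is reloaded.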


Delete Statement in SQL

Delete a tuple:

to delete the employee tuple with lname 'Brown'

delete from employee
where lname = 'Brown';

Delete a set of tuples:

to delete the employee tuples from 'Research' department

delete from employee
where dno in (select dnumber
              from department
              where dname = 'Research');

To delete all the tuples in employee table:

delete from employee; (this gives an empty table)
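The three forms of delete can be traced on a small table. This sketch uses Python's sqlite3 module with hypothetical rows and lists the surviving last names after each statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer);
    create table employee (lname text, dno integer);
    insert into department values ('Research', 5), ('Administration', 4);
    insert into employee values ('Smith', 5), ('Wong', 5), ('Brown', 4), ('Zelaya', 4);
""")

def remaining():
    """Return the last names still stored in employee, sorted."""
    return [r[0] for r in conn.execute("select lname from employee order by lname")]

# Delete a single tuple.
conn.execute("delete from employee where lname = 'Brown'")
after_brown = remaining()

# Delete a set of tuples selected through a nested query.
conn.execute("""
    delete from employee
    where dno in (select dnumber from department where dname = 'Research')
""")
after_research = remaining()

# Delete every tuple; the table itself still exists afterwards.
conn.execute("delete from employee")
after_all = remaining()

print(after_brown, after_research, after_all)
```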


Update Statement in SQL

Update a single tuple:

to change the location and controlling department number of project number 10 to 'Bellaire' and 5.

update project
set plocation = 'Bellaire', dnum = 5
where pnumber = 10;

Update a set of tuples in a table:

to raise the salary of employees from 'Research' department by 10%.

update employee
set salary = salary * 1.1
where dno in (select dnumber
              from department
              where dname = 'Research');
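Both update statements are exercised in the sketch below with Python's sqlite3 module. The rows are hypothetical, and salary is declared as real here (an assumption) because multiplying by 1.1 produces a floating-point value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer);
    create table employee (lname text, salary real, dno integer);
    create table project (pnumber integer, pname text, plocation text, dnum integer);
    insert into department values ('Research', 5), ('Administration', 4);
    insert into employee values ('Smith', 30000, 5), ('Wong', 40000, 5), ('Zelaya', 25000, 4);
    insert into project values (10, 'Computerization', 'Stafford', 4);
""")

# Update a single tuple, identified by its key value.
conn.execute("update project set plocation = 'Bellaire', dnum = 5 where pnumber = 10")
project_row = conn.execute(
    "select plocation, dnum from project where pnumber = 10"
).fetchone()

# Update a set of tuples: only 'Research' employees get the raise.
conn.execute("""
    update employee
    set salary = salary * 1.1
    where dno in (select dnumber from department where dname = 'Research')
""")
salaries = dict(conn.execute("select lname, salary from employee"))

print(project_row)
print(salaries)
```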


Views in SQL

View:

• It is a single table that is derived from other tables; these other tables can be base tables or previously defined views.

• A view does not necessarily exist in physical form; it is considered a virtual table, in contrast to base tables, whose tuples are actually stored in the database.

Advantages and Disadvantages of View:

• The advantage is that a frequently issued query involving join operations can be defined once as a view. Users then query the view and do not have to specify the joins every time.

• The disadvantage is that the possible update operations applied to views are limited.


Specification of Views in SQL

Create a view on fname, lname, pname, hours:

V1: create view works_on1
as select fname, lname, pname, hours
from employee, project, works_on
where ssn = essn and pno = pnumber;

works_on1 (fname, lname, pname, hours)

V2: create view dept_info (dept_name, no_of_emps, total_sal)
as select dname, count(*), sum(salary)
from department, employee
where dnumber = dno
group by dname;

dept_info (dept_name, no_of_emps, total_sal)


Querying on View

QV1: To retrieve the last name and first name of all employees who work on 'ProductX':

select pname, fname, lname
from works_on1
where pname = 'ProductX';

• A view is always up to date: if we modify the tuples in the base tables on which the view is defined, the view automatically reflects these changes.

• The view is not realized at the time of view definition but rather at the time we specify a query on the view.

• It is the responsibility of the DBMS and not the user to make sure that the view is up to date.

• If the view is no longer useful, it can be disposed of with the drop view command.

V1d: drop view works_on1;

V2d: drop view dept_info;
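The "always up to date" property can be demonstrated with the dept_info view. The sketch below uses Python's sqlite3 module and hypothetical rows: it queries the view, changes a base table, queries the view again, and finally drops the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer);
    create table employee (lname text, salary integer, dno integer);
    insert into department values ('Research', 5);
    insert into employee values ('Smith', 30000, 5), ('Wong', 40000, 5);

    create view dept_info (dept_name, no_of_emps, total_sal)
    as select dname, count(*), sum(salary)
       from department, employee
       where dnumber = dno
       group by dname;
""")

QUERY = "select * from dept_info where dept_name = 'Research'"
before = conn.execute(QUERY).fetchone()

# Modify a base table only; the view definition is untouched.
conn.execute("insert into employee values ('Narayan', 38000, 5)")
after = conn.execute(QUERY).fetchone()

# After drop view, the name can no longer be queried.
conn.execute("drop view dept_info")
try:
    conn.execute(QUERY)
    view_gone = False
except sqlite3.Error:
    view_gone = True

print(before, after, view_gone)
```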


Updating in Views

Single Table View Update:

An update on a view defined on a single table can be mapped to an update on the underlying base table.

Multi Table View Update:

For a view involving joins, an update operation may be mapped to update operations on the underlying base relations in multiple ways.

Suppose a view update changes the PNAME attribute of 'John Smith' from 'ProductX' to 'ProductY'.

UV1: update works_on1
set pname = 'ProductY'
where lname = 'Smith' and fname = 'John'
and pname = 'ProductX';

This update can be mapped into several updates on the base relations to give the desired update on the view.


• There are two possible updates, (a) and (b), on the base relations corresponding to UV1.

(a). update works_on

set pno = (select pnumber

from project

where pname = 'ProductY')

where essn = (select ssn

from employee

where lname = 'Smith' and fname ='John') and

pno = (select pnumber

from project

where pname ='ProductX')

(b). update project

set pname = 'ProductY'

where pname = 'ProductX'


Discussion

• Update (a) relates 'John Smith' to the 'ProductY' project tuple in place of the 'ProductX' tuple, and is the most likely desired update.

• The original update changes the project name pname in the works_on1 view. It is unlikely that the user wants to change the PNAME value itself; the intended semantics is to change the project that 'John Smith' works on.

• So update (a) replaces the project number in the works_on base table with the number of the project whose PNAME = 'ProductY'.

• Update (b) would also give the desired update effect on the view, but it accomplishes this by changing the name of the 'ProductX' tuple in the project relation to 'ProductY'.

It is quite unlikely that the user who specified the view update UV1 wants the update to be interpreted as in update (b).
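The side effect of update (b) becomes visible once a second employee works on 'ProductX'. The sketch below uses Python's sqlite3 module with hypothetical rows. SQLite treats views as read-only, so UV1 itself is rejected there; updates (a) and (b) are then applied to two separate copies of the base tables and the resulting view contents are compared.

```python
import sqlite3

def fresh_db():
    """Build a small in-memory COMPANY fragment with the works_on1 view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table employee (fname text, lname text, ssn text);
        create table project (pnumber integer, pname text);
        create table works_on (essn text, pno integer, hours real);
        insert into employee values ('John', 'Smith', '123456789'),
                                    ('Joyce', 'English', '453453453');
        insert into project values (1, 'ProductX'), (2, 'ProductY');
        insert into works_on values ('123456789', 1, 32.5), ('453453453', 1, 20.0);
        create view works_on1 as
            select fname, lname, pname, hours
            from employee, project, works_on
            where ssn = essn and pno = pnumber;
    """)
    return conn

VIEW_QUERY = "select fname, lname, pname from works_on1 order by lname"

conn_a = fresh_db()
try:  # UV1 issued directly against the view
    conn_a.execute("""update works_on1 set pname = 'ProductY'
                      where lname = 'Smith' and fname = 'John' and pname = 'ProductX'""")
    view_update_rejected = False
except sqlite3.Error:
    view_update_rejected = True

# Update (a): move John Smith's works_on tuple to project 'ProductY'.
conn_a.execute("""
    update works_on
    set pno = (select pnumber from project where pname = 'ProductY')
    where essn = (select ssn from employee where lname = 'Smith' and fname = 'John')
      and pno = (select pnumber from project where pname = 'ProductX')
""")
after_a = conn_a.execute(VIEW_QUERY).fetchall()

# Update (b): rename the project itself, which affects every worker on it.
conn_b = fresh_db()
conn_b.execute("update project set pname = 'ProductY' where pname = 'ProductX'")
after_b = conn_b.execute(VIEW_QUERY).fetchall()

print(view_update_rejected)
print(after_a)
print(after_b)
```

Under update (a) only John Smith's row changes, while under update (b) Joyce English also appears to work on 'ProductY'.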


Observation

• A view with a single defining table is updatable if the view attributes contain the primary key or some other candidate key of the base relation, because this maps each (virtual) view tuple to a single base tuple.

• Views defined on multiple tables using joins are generally not updatable.

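• Views defined using grouping and aggregate functions are not updatable.

Example:

UV2: update dept_info
set total_sal = 100000
where dept_name = 'Research';

• UV2 makes little sense, because total_sal is a sum over many employee tuples and many different updates of the base relation would produce the same total.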

• A view update is feasible when only one possible update on the base relations can accomplish the desired update effect on the view.

• Whenever an update on the view can be mapped to more than one update on the underlying base relations, we must have a certain procedure to choose the desired update.

• Some researchers have developed methods for choosing the most likely update.

• Other researchers prefer to have the user choose the desired update mapping during view definition.


Specifying Additional Constraints as Assertions

• To specify the constraint “The salary of an employee must not be greater than the salary of the manager of the department that the employee works for”:

create assertion salary_constraint
check ( not exists ( select *
                     from employee e, employee m, department d
                     where e.salary > m.salary and e.dno = d.dnumber
                     and d.mgrssn = m.ssn ) );

• If tuples in the database cause the condition of an assertion to evaluate to FALSE, the constraint is violated.
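SQLite does not implement create assertion, so the sketch below only evaluates the assertion's condition by hand with Python's sqlite3 module; it does not enforce anything. The helper constraint_holds and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer, mgrssn text);
    create table employee (lname text, ssn text, salary integer, dno integer);
    insert into department values ('Research', 5, '333445555');
    insert into employee values ('Wong', '333445555', 40000, 5),
                                ('Smith', '123456789', 30000, 5);
""")

# The condition of salary_constraint, evaluated as an ordinary query.
CHECK = """
    select not exists (
        select *
        from employee e, employee m, department d
        where e.salary > m.salary and e.dno = d.dnumber and d.mgrssn = m.ssn
    )
"""

def constraint_holds():
    """Return True when no employee earns more than his or her manager."""
    return bool(conn.execute(CHECK).fetchone()[0])

holds_before = constraint_holds()

# Narayan earns more than the department manager Wong.
conn.execute("insert into employee values ('Narayan', '666884444', 45000, 5)")
holds_after = constraint_holds()

print(holds_before, holds_after)
```

A DBMS that supports assertions would reject the second insert instead of merely reporting the violation afterwards.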


Specifying Index in SQL

Specifying index on single attribute:

I1: create index lname_index
on employee (lname);

Specifying index on multiple attributes:

I2: create index names_index
on employee (lname asc, fname desc, minit);

Specifying index on the attribute with unique value:

I3: create unique index ssn_index
on employee (ssn);

Specifying cluster index:

I4: create index dno_index
on employee (dno)
cluster;


Cluster in ORACLE

create cluster deptandemp (deptemp varchar(9));

create table department (
    dname varchar(19),
    dnumber varchar(9),
    ......)
cluster deptandemp (dnumber);

create table employee (
    name varchar(19),
    ......
    dno varchar(9))
cluster deptandemp (dno);


Discussion on Index

• The reason and motivation for an index is to support efficient search and maintenance.

Advantages:
Indices support binary search.
Indices support dynamic maintenance.

Disadvantages:
An index costs extra storage space.
Algorithms to support indices are more complex.

• The keyword unique can be used to enforce the key constraint. The reason for linking the definition of a key constraint with the specification of an index is that it is much more efficient to enforce uniqueness of key values on a file if an index is defined on the key attribute, since a search on the index is much more efficient.

• A clustering and unique index is similar to a primary index.

• A clustering and non-unique index is similar to a cluster index.

• A nonclustering index is similar to a secondary index.
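The index definitions I1 to I3 and the use of unique to enforce a key constraint can be tried out as follows. This is a sketch with Python's sqlite3 module and hypothetical rows; I4 is left out because SQLite has no cluster option. The query plan is printed only for inspection, since its exact wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table employee (fname text, minit text, lname text, ssn text, dno integer)"
)
conn.execute("insert into employee values ('John', 'B', 'Smith', '123456789', 5)")

# I1, I2, I3 from the notes.
conn.execute("create index lname_index on employee (lname)")
conn.execute("create index names_index on employee (lname asc, fname desc, minit)")
conn.execute("create unique index ssn_index on employee (ssn)")

index_names = [r[0] for r in conn.execute(
    "select name from sqlite_master "
    "where type = 'index' and tbl_name = 'employee' order by name"
)]

# The unique index rejects a second tuple with the same ssn value.
try:
    conn.execute("insert into employee values ('Jon', 'C', 'Smyth', '123456789', 4)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask SQLite how it would evaluate a lookup by ssn.
plan = conn.execute(
    "explain query plan select * from employee where ssn = '123456789'"
).fetchall()

print(index_names)
print(duplicate_rejected)
print(plan)
```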