SQL CSCI 4380 Monday, October 4, 2010


  • SQL (CSCI 4380)

    Monday, October 4, 2010

  • SQL

    • A logical/declarative query language

    • Express what you want, not how to get it

    • Each SQL expression can be translated to multiple equivalent relational algebra expressions

    • SQL is tuple based: each statement refers to individual tuples in relations

    • SQL has bag semantics

    • Recall that RDBMS implementations of relations as tables do not require tables to always have a key, hence allowing the possibility of duplicate tuples

    • The same is true for SQL: an SQL expression may return duplicate tuples unless they are removed explicitly.
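Bag semantics can be observed directly. The following is a minimal sketch, assuming Python's standard `sqlite3` module as the DBMS (the slides do not name one) and a made-up, key-less Transcript table with three rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table declared without a key, so duplicate rows are allowed.
conn.execute("CREATE TABLE Transcript (StudId INTEGER, CrsCode TEXT, Grade TEXT)")
conn.executemany(
    "INSERT INTO Transcript VALUES (?, ?, ?)",
    [(1, "CSCI4380", "A"), (1, "CSCI2300", "A"), (2, "CSCI4380", "B")],
)

# Projecting on StudId keeps one output tuple per input tuple (bag semantics).
bag = conn.execute("SELECT StudId FROM Transcript ORDER BY StudId").fetchall()

# DISTINCT removes the duplicates explicitly.
dedup = conn.execute(
    "SELECT DISTINCT StudId FROM Transcript ORDER BY StudId"
).fetchall()

print(bag)
print(dedup)
```

Student 1 appears twice in the first result and once in the second.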

  • SQL

    • SQL is case insensitive (though strings are case sensitive, of course)

    • It is best to imagine the control flow of SQL as:

      • From: read the relations involved in the FROM clause

      • Where: check for each tuple whether it passes the WHERE clause

      • Select: for tuples that pass the WHERE clause, construct the output from the projection attributes in SELECT

  • Example Database

    STUDENT(Id, Name, Password, Address)
    FACULTY(Id, Name, DeptId, Password, Address)
    COURSE(CrsCode, DeptId, CrsName, CreditHours)
    REQUIRES(CrsCode, PrereqCrsCode, EnforcedSince)
    CLASS(CrsCode, SectionNo, Semester, Year, Textbook, ClassTime, Enrollment, MaxEnrollment, ClassroomId, InstructorId)
    CLASSROOM(ClassroomId, Seats)
    TRANSCRIPT(StudId, CrsCode, SectionNo, Semester, Year, Grade)

  • SQL

    SELECT C.CrsCode, C.SectionNo
    FROM CLASS C
    WHERE C.Semester = 'Fall' AND C.Year = 2002 AND C.Enrollment > C.MaxEnrollment

    • Select from C all tuples that satisfy the WHERE clause

    • For these tuples, project C.CrsCode and C.SectionNo

    • Π_{CrsCode, SectionNo} (σ_{Semester='Fall' AND Year=2002 AND Enrollment > MaxEnrollment} (CLASS))

  • SQL - simple expressions

    SELECT R.PrereqCrsCode AS Reqfor4380, 2009 AS currentSemester
    FROM Requires R
    WHERE R.CrsCode = 'CSCI4380'

    • For each tuple R of the Requires relation (R is a relation alias), check the WHERE clause

    • Reqfor4380 is an alias for the output (or projection) column, i.e. the renaming operator

    • currentSemester is a new column with the value 2009 for all tuples

  • SQL - SELECT

    SELECT StudId, CrsCode, SectionNo, Semester, 2009 - Year AS timeSpan
    FROM Transcript

    An arithmetic, string or date expression can be used in SELECT to compute a new column, or in WHERE to express a complex selection condition.

  • SQL - dates

    • The date data type allows specific operations:

      • date '2001-09-28' + integer '7' = date '2001-10-05'

      • date '2001-09-28' + interval '1 hour' = timestamp '2001-09-28 01:00:00'

      • date '2001-09-28' + time '03:00' = timestamp '2001-09-28 03:00:00'

      • date '2001-10-01' - date '2001-09-28' = integer '3' (days)
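The same date arithmetic can be checked outside a database. This is a small sketch using Python's standard `datetime` module as a stand-in for the SQL date, time and interval types (an assumption made for illustration; the SQL types themselves behave as listed in the slide).

```python
from datetime import date, datetime, time, timedelta

# date + integer number of days
plus_days = date(2001, 9, 28) + timedelta(days=7)

# date + interval of one hour: the date is treated as midnight of that day
plus_hour = datetime.combine(date(2001, 9, 28), time()) + timedelta(hours=1)

# date + time of day gives a timestamp
plus_time = datetime.combine(date(2001, 9, 28), time(3, 0))

# date - date gives a number of days
diff_days = (date(2001, 10, 1) - date(2001, 9, 28)).days

print(plus_days, plus_hour, plus_time, diff_days)
```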

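  • SQL - dates

    • Suppose Requires.EnforcedSince is a date.

    SELECT CrsCode, PrereqCrsCode
    FROM Requires
    WHERE EnforcedSince <= date '2009-10-01' - interval '4 years'

    Find prerequisites that have been in place for 4 or more years as of 2009-10-01. Note that the difference of two dates is an integer number of days, not an interval, so the interval is subtracted from the reference date.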

  • Strings

    • Strings are delimited by single quotes ' (a quote inside a string is escaped by doubling it: '')

    • In a LIKE pattern, choose any character as the escape character with ESCAPE char

    • String concatenation: a || b

    • Pattern matching (simple regular expressions):

      • % stands for 0 or more characters

      • _ stands for exactly 1 character

      • To test whether a string matches a pattern, use LIKE.

  • Strings

    SELECT Name || Address
    FROM Faculty

    SELECT *
    FROM Class
    WHERE Textbook LIKE '%Database%'

    Note that the match is case sensitive.

  • NULL VALUES

    • A null value usually means that there is no value for a specific attribute. The reasons may be:

      • The value does not exist (yet). The grade for a course in progress does not exist.

      • The value is not known. We may know that a person has a phone, but we do not know the phone number.

      • It is not known whether a value exists or not. A student may or may not have a non-campus e-mail address.

  • NULL VALUES

    • To check whether a value is null or not, a specific predicate is used:

      WHERE T.grade IS NULL
      WHERE T.grade IS NOT NULL

    • For regular comparison conditions and other predicates, when a compared value is null, the condition evaluates to "unknown".

      • WHERE T.grade = 'A' evaluates to unknown if T.grade is null

      • WHERE T.grade = T2.grade evaluates to unknown if either T.grade or T2.grade (or both) is null.

  • NULL VALUES

    • Furthermore, we are given the following:

      • UNKNOWN AND TRUE = UNKNOWN
      • UNKNOWN OR TRUE = TRUE
      • UNKNOWN AND FALSE = FALSE
      • UNKNOWN OR FALSE = UNKNOWN
      • NOT (UNKNOWN) = UNKNOWN
      • UNKNOWN OR UNKNOWN = UNKNOWN
      • UNKNOWN AND UNKNOWN = UNKNOWN

    • Note that a tuple is returned by the WHERE clause only if the condition evaluates to true.

  • NULL VALUES

    SELECT T.studId, T.grade
    FROM Transcript T
    WHERE T.semester = 'Fall' AND T.year = 2002 AND
          NOT (T.grade IN ('A','B','C','D','F'))

    Does this select all null grades? No: for a null grade the IN predicate is unknown, NOT (unknown) is unknown, and the tuple is not returned.

    Better to select with WHERE T.grade IS NULL OR T.grade = 'I'
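The effect of three-valued logic on a WHERE clause can be tried on a tiny table. This is a minimal sketch, assuming Python's standard `sqlite3` module and three made-up transcript rows, one of which has a null grade.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Transcript (StudId INTEGER, Grade TEXT)")
conn.executemany(
    "INSERT INTO Transcript VALUES (?, ?)",
    [(1, "A"), (2, "I"), (3, None)],  # None is stored as NULL
)

# NOT (NULL IN (...)) is unknown, so student 3 is not returned.
not_in = conn.execute(
    "SELECT StudId FROM Transcript "
    "WHERE NOT (Grade IN ('A','B','C','D','F')) ORDER BY StudId"
).fetchall()

# Testing for NULL explicitly returns both the incomplete and the missing grade.
explicit = conn.execute(
    "SELECT StudId FROM Transcript "
    "WHERE Grade IS NULL OR Grade = 'I' ORDER BY StudId"
).fetchall()

print(not_in, explicit)
```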

  • SQL - multiple tables

    Recall:

    FACULTY(Id, Name, DeptId, Password, Address)
    CLASS(CrsCode, SectionNo, Semester, Year, Textbook, ClassTime, Enrollment, MaxEnrollment, ClassroomId, InstructorId)

    What does this query do?

    SELECT F.Name, C.CrsCode
    FROM Faculty F, Class C

  • SQL - multiple tables

    To find Faculty and the classes they teach:

    SELECT F.Name, C.CrsCode
    FROM Faculty F, Class C
    WHERE C.InstructorId = F.Id

    F and C are aliases for the tables, to be used for disambiguation.

  • SQL

    Let:

    SELECT A1 B1, A2 B2, … , Am Bm
    FROM R1, R2, … , Rn
    WHERE selection-condition

    be a valid SQL statement where

    • A1,…,Am are attributes in R1,…,Rn (disregarding relation aliases) and B1,…,Bm are attribute aliases

    • selection-condition is a valid boolean expression involving only relations R1,…,Rn

    Then the result of this statement is equivalent to the relational algebra expression:

    (Π_{A1,…,Am} (σ_{selection-condition} (R1 × R2 × … × Rn))) [B1,…,Bm]
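The product, selection, projection reading of a multi-table query can be spelled out by hand and compared with what a DBMS returns. The sketch below assumes Python's standard `sqlite3` module and made-up Faculty and Class rows reduced to two columns each; the list comprehension plays the role of Π(σ(R1 × R2)).

```python
import sqlite3
from itertools import product

faculty = [(1, "Oak"), (2, "Elm")]  # hypothetical (Id, Name) rows
classes = [("CSCI4380", 1), ("CSCI2300", 1), ("CSCI1100", 3)]  # (CrsCode, InstructorId)

# FROM: cross product; WHERE: selection; SELECT: projection.
by_hand = [
    (f[1], c[0])
    for f, c in product(faculty, classes)
    if c[1] == f[0]
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Faculty (Id INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Class (CrsCode TEXT, InstructorId INTEGER)")
conn.executemany("INSERT INTO Faculty VALUES (?, ?)", faculty)
conn.executemany("INSERT INTO Class VALUES (?, ?)", classes)

by_sql = conn.execute(
    "SELECT F.Name, C.CrsCode FROM Faculty F, Class C "
    "WHERE C.InstructorId = F.Id"
).fetchall()

# Without a WHERE clause the query returns the full cross product.
cross_size = conn.execute("SELECT count(*) FROM Faculty F, Class C").fetchone()[0]

print(sorted(by_hand) == sorted(by_sql), cross_size)
```

With 2 faculty rows and 3 class rows, the unfiltered query yields 2 × 3 = 6 tuples.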

  • SQL - multiple tables

    SELECT DISTINCT T.StudId, C.DeptId
    FROM Course C, Transcript T
    WHERE T.Semester = 'Spring' AND
          T.Year = 2002 AND
          T.CrsCode = C.CrsCode AND
          T.Grade = 'A'

  • Examples

    • Find all the courses the student named 'Jill Pecan' has completed.

    SELECT T.CrsCode, T.Semester, T.Year
    FROM Student S, Transcript T
    WHERE S.Id = T.StudId AND T.Grade IS NOT NULL AND
          T.Grade <> 'I' AND S.Name = 'Jill Pecan'

    Suppose that when you register for a course, you have no value for the grade. Note that the incomplete grade (I) is different from this.

  • Examples

    • Find all faculty who taught courses both in 'Fall' and 'Spring' 2002.

    SELECT DISTINCT F.Name
    FROM CLASS C1, CLASS C2, Faculty F
    WHERE C1.InstructorId = C2.InstructorId AND
          C1.Semester = 'Fall' AND C1.Year = 2002 AND
          C2.Semester = 'Spring' AND C2.Year = 2002 AND
          F.Id = C1.InstructorId

  • Examples

    • Find all students who are taking a course by Prof. Acorn in 'Fall 2002'.

    SELECT DISTINCT S.Name
    FROM Student S, CLASS C, Faculty F, Transcript T
    WHERE S.Id = T.StudId AND C.InstructorId = F.Id AND
          C.CrsCode = T.CrsCode AND C.Semester = T.Semester AND
          C.Year = T.Year AND F.Name LIKE '%Acorn' AND
          C.SectionNo = T.SectionNo AND
          T.Semester = 'Fall' AND T.Year = 2002

  • Examples

    • Find all faculty who teach a course offered by a department other than their own. List the name of the faculty, and the courses they teach from other departments.

    SELECT F.Name, C.CrsCode
    FROM Faculty F, Course C, Class CL
    WHERE F.Id = CL.InstructorId AND
          C.CrsCode = CL.CrsCode AND
          C.DeptId <> F.DeptId

  • SQL - set/bag operators

    (SELECT S.Name
     FROM Student S)
    UNION
    (SELECT F.Name
     FROM Faculty F)

    • Union compatibility is still needed for this operation

  • SQL - set/bag operators

    • The SQL set operators are UNION, INTERSECT and EXCEPT (set difference)

    • Each operator is a set operator, i.e. it removes duplicate tuples, even though SQL does not otherwise do so automatically.

    • In cases where the duplicates are important, you can tell SQL not to remove them by using UNION ALL.

    • UNION is supported by all DBMSs; support for the other operators may differ from system to system.
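The difference between the set operators and UNION ALL shows up as soon as a name occurs more than once. This is a minimal sketch, assuming Python's standard `sqlite3` module and made-up one-column Student and Faculty tables. SQLite does not accept parentheses around the operands of a compound query, so they are omitted here; this is a quirk of the assumed DBMS, not of SQL in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Name TEXT)")
conn.execute("CREATE TABLE Faculty (Name TEXT)")
conn.executemany("INSERT INTO Student VALUES (?)", [("Ann",), ("Bob",), ("Bob",)])
conn.executemany("INSERT INTO Faculty VALUES (?)", [("Bob",), ("Cy",)])


def names(sql):
    """Run a one-column query and return the values as a list."""
    return [row[0] for row in conn.execute(sql)]


union = names("SELECT Name FROM Student UNION SELECT Name FROM Faculty ORDER BY Name")
union_all = names(
    "SELECT Name FROM Student UNION ALL SELECT Name FROM Faculty ORDER BY Name"
)
intersect = names(
    "SELECT Name FROM Student INTERSECT SELECT Name FROM Faculty ORDER BY Name"
)
except_ = names("SELECT Name FROM Student EXCEPT SELECT Name FROM Faculty ORDER BY Name")

print(union, union_all, intersect, except_)
```

Only UNION ALL keeps all three occurrences of 'Bob'; the other operators return each name at most once.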

  • Examples

    • Find all faculty who taught courses both in 'Fall' and 'Spring' 2002.

    (SELECT DISTINCT F.Name
     FROM CLASS C, Faculty F
     WHERE C.Semester = 'Fall' AND C.Year = 2002 AND
           F.Id = C.InstructorId)
    INTERSECT
    (SELECT DISTINCT F.Name
     FROM CLASS C, Faculty F
     WHERE C.Semester = 'Spring' AND C.Year = 2002 AND
           F.Id = C.InstructorId)

  • Examples

    • Find faculty who never taught courses

    (SELECT Id, Name FROM Faculty)
    EXCEPT
    (SELECT F.Id, F.Name FROM Faculty F, Class C
     WHERE F.Id = C.InstructorId)

  • SQL - distinct

    SELECT DISTINCT T.StudId
    FROM Transcript T
    WHERE T.grade = 'A' AND T.CrsCode = 'CSCI4380'

    SELECT DISTINCT C.InstructorId
    FROM CLASS C
    WHERE C.CrsCode = '4380' AND
          C.Year IN (1998, 2000, 2002) AND
          C.Textbook LIKE '%Transaction%'

    DISTINCT removes duplicate tuples.

  • COUNTING

    • SQL has the ability to count and aggregate values across tuples.

    • However, since each SQL expression in the WHERE clause refers to individual tuples, counting is usually the last step.

    • Recall that an SQL expression returns a bag of tuples.

    SELECT count(T.studId), count(DISTINCT T.sectionNo)
    FROM Transcript T
    WHERE T.semester = 'Fall' AND T.year = 2002 AND
          T.grade = 'A' AND T.crscode = 'CSCI4380'

    How many tuples are returned by this expression?

  • Aggregates

    • An aggregate operation operates over all the tuples and returns a single value

    • Examples: sum, avg, min, max, stddev

  • GROUP BY

    • Suppose we want to find how generous each professor is by counting the total number of 'A's they give every semester in each class, and even compare the count with the class size.

    • We would like to return a relation with the schema

      ProfId, SpecificClass, TotalAs, TotalAsByPercentage

      which lists the total number of As given by the specific faculty member in a specific class.

  • GROUP BY

    First, match professors with the grades.

    SELECT C.instructorId, C.crscode, C.semester, C.year, C.sectionNo, T.grade
    FROM Transcript T, Class C
    WHERE C.crscode = T.crscode AND C.semester = T.semester AND
          C.year = T.year AND C.sectionNo = T.sectionNo AND T.grade = 'A'

    The next step is to count, but the following does not give us what we wanted:

    SELECT count(T.studId)
    FROM Transcript T, Class C
    WHERE C.crscode = T.crscode AND C.semester = T.semester AND
          C.year = T.year AND C.sectionNo = T.sectionNo AND T.grade = 'A'

  • GROUP BY

    Instead, first group the tuples into groups, one for each specific instructor and class that they taught.

    FROM Transcript T, Class C
    WHERE C.crscode = T.crscode AND C.semester = T.semester AND
          C.year = T.year AND C.sectionNo = T.sectionNo AND
          T.grade = 'A'
    GROUP BY C.instructorId, C.crscode, C.semester, C.year, C.sectionNo

    Example groups:

    • Instructor 1, 4380, Fall 2002, Section 1

    • Instructor 2, 4380, Fall 2002, Section 2

    • Instructor 3, 4380, Spring 2002, Section 1

    • Instructor 1, 4380, Fall 2001, Section 1

  • GROUP BY

    For each group of tuples, we can now compute the necessary aggregates. We generate a new tuple for each different group.

    SELECT C.instructorId, C.crscode, C.semester, C.year, C.sectionNo,
           count(T.studId) TotalAs
    FROM Transcript T, Class C
    WHERE C.crscode = T.crscode AND C.semester = T.semester AND
          C.year = T.year AND C.sectionNo = T.sectionNo AND
          T.grade = 'A'
    GROUP BY C.instructorId, C.crscode, C.semester, C.year, C.sectionNo

    Note that in this grouping, we cannot compare the total number of As with the total class size. We have already eliminated the students who did not get As.

  • GROUP BY

    For all students, find the total number of credit hours they have completed.

    SELECT T.studId, sum(C.credithours) TotalHrs,
           count(T.crscode)/count(DISTINCT T.crscode) Repeats
    FROM Transcript T, Course C
    WHERE C.crscode = T.crscode AND
          UPPER(T.grade) IN ('A','B','C','D')
    GROUP BY T.studId

    Note that this query does not take into account the case where the student took the same course more than once. In that case, the credit hours should count only once.

  • GROUP BY / HAVING

    For all students, find the total number of credit hours they have completed. But only return the students with at least 100 total credit hours.

    SELECT T.studId, sum(C.credithours) TotalHrs,
           count(T.crscode)/count(DISTINCT T.crscode) Repeats
    FROM Transcript T, Course C
    WHERE C.crscode = T.crscode AND
          UPPER(T.grade) IN ('A','B','C','D','F')
    GROUP BY T.studId
    HAVING sum(C.credithours) >= 100

  • GROUP BY / HAVING

    • GROUP BY creates groups, each containing a bag of tuples.

    • The HAVING clause applies to each single group.

      • If the group satisfies the HAVING condition, then the whole group passes. Otherwise, the whole group is eliminated.

    • For each group, create a new tuple in the SELECT clause. You can include some or all of the grouping attributes, and aggregate expressions.

    • Each aggregate expression is applied to each group separately.
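Grouping followed by a HAVING filter can be seen on a small data set. This is a minimal sketch, assuming Python's standard `sqlite3` module, made-up course codes C1 to C3, and a threshold of 8 credit hours instead of 100 so that the tiny example has something to filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Course (CrsCode TEXT, CreditHours INTEGER)")
conn.execute("CREATE TABLE Transcript (StudId INTEGER, CrsCode TEXT)")
conn.executemany("INSERT INTO Course VALUES (?, ?)", [("C1", 4), ("C2", 4), ("C3", 3)])
conn.executemany(
    "INSERT INTO Transcript VALUES (?, ?)",
    [(1, "C1"), (1, "C2"), (1, "C3"), (2, "C1")],
)

# One output tuple per group (per student), carrying the aggregate.
grouped = conn.execute(
    "SELECT T.StudId, sum(C.CreditHours) AS TotalHrs "
    "FROM Transcript T, Course C WHERE C.CrsCode = T.CrsCode "
    "GROUP BY T.StudId ORDER BY T.StudId"
).fetchall()

# HAVING keeps or drops whole groups based on an aggregate condition.
with_having = conn.execute(
    "SELECT T.StudId, sum(C.CreditHours) AS TotalHrs "
    "FROM Transcript T, Course C WHERE C.CrsCode = T.CrsCode "
    "GROUP BY T.StudId HAVING sum(C.CreditHours) >= 8 ORDER BY T.StudId"
).fetchall()

print(grouped, with_having)
```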

  • WRONG!!! WRONG!!!

    Find two things that are wrong with this statement!!!

    SELECT T.studId, T.sectionNo, sum(C.credithours) TotalHrs
    FROM Transcript T, Course C
    WHERE C.crscode = T.crscode AND
          UPPER(T.grade) IN ('A','B','C','D','F')
    GROUP BY T.studId
    HAVING C.credithours >= 4

  • ORDER BY

    • The ORDER BY clause, placed after the SELECT expression, is used to order the tuples in the result. It is a simple sort statement.

    SQL complete:

    (SELECT […] FROM […] WHERE […] GROUP BY […] HAVING […] )
    UNION […] UNION
    (SELECT […] FROM […] WHERE […] GROUP BY […] HAVING […] )
    ORDER BY […]

  • SQL control flow

    • For each individual SELECT-FROM-WHERE-GROUP BY-HAVING expression, execute: FROM -> WHERE -> GROUP BY -> HAVING -> SELECT

    • Execute all UNION/INTERSECT/EXCEPT operations, if there are any

    • Execute the ORDER BY statement

  • Inner join

    SELECT F.Name, C.CrsCode
    FROM Faculty F, Class C
    WHERE C.InstructorId = F.Id

    • is equivalent to

    SELECT F.Name, C.CrsCode
    FROM Faculty F INNER JOIN Class C
         ON C.InstructorId = F.Id

  • INNER JOIN

    Find all faculty and the classes they are teaching in Spring 2002. If they are not teaching a course, then simply return a null value next to the faculty name.

    First try:

    SELECT F.name, C.crscode, C.sectionNo
    FROM Faculty F, Class C
    WHERE C.instructorId = F.id AND C.semester = 'Spring' AND C.year = 2002

    Unfortunately, this eliminates all instructors who do not teach in this semester.

  • OUTER JOIN

    • A JOIN B (inner join) selects the tuples that satisfy the join condition and eliminates all tuples that do not. A is called the left operand and B the right operand of the join operation.

    • A LEFT OUTER JOIN B returns all tuples in the inner join as well as the tuples in A that do not join with any tuples in B.

    • A RIGHT OUTER JOIN B returns all tuples in the inner join as well as the tuples in B that do not join with any tuples in A.

    • A FULL OUTER JOIN B returns all tuples in the inner join as well as the tuples from A and B that do not participate in the inner join.

  • OUTER JOIN

    PARTSUPP
    Part  Supplier  Quantity
    P1    S1        10
    P1    S2        20
    P3    S1        40
    P4    S4        50

    SUPPLY
    Id  Name
    S1  WorldCom
    S2  Enron
    S3  Lucent

    PARTSUPP JOIN SUPPLY ON Supplier=Id (3 tuples)

    PARTSUPP LEFT OUTER JOIN SUPPLY ON Supplier=Id

  • INNER JOIN

    PARTSUPP INNER JOIN SUPPLY ON Supplier=Id

    Part  Supplier  Quantity  Id    Name
    P1    S1        10        S1    WorldCom
    P1    S2        20        S2    Enron
    P3    S1        40        S1    WorldCom

  • LEFT OUTER JOIN

    PARTSUPP LEFT OUTER JOIN SUPPLY ON Supplier=Id

    Part  Supplier  Quantity  Id    Name
    P1    S1        10        S1    WorldCom
    P1    S2        20        S2    Enron
    P3    S1        40        S1    WorldCom
    P4    S4        50        null  null

  • RIGHT OUTER JOIN

    PARTSUPP RIGHT OUTER JOIN SUPPLY ON Supplier=Id

    Part  Supplier  Quantity  Id    Name
    P1    S1        10        S1    WorldCom
    P1    S2        20        S2    Enron
    P3    S1        40        S1    WorldCom
    null  null      null      S3    Lucent

  • FULL OUTER JOIN

    PARTSUPP FULL OUTER JOIN SUPPLY ON Supplier=Id

    Part  Supplier  Quantity  Id    Name
    P1    S1        10        S1    WorldCom
    P1    S2        20        S2    Enron
    P3    S1        40        S1    WorldCom
    null  null      null      S3    Lucent
    P4    S4        50        null  null

  • OUTER JOIN

    Find all faculty and the classes they are teaching in Spring 2002. If they are not teaching a course, then simply return a null value next to the faculty name.

    SELECT F.name, C.crscode, C.sectionNo
    FROM Faculty F LEFT OUTER JOIN Class C
         ON F.id = C.instructorId AND C.semester = 'Spring' AND C.year = 2002

    If the faculty member is not teaching any courses in that semester, then the crscode and sectionNo fields will simply be null. Note that the semester and year conditions belong in the ON clause: in a WHERE clause they would evaluate to unknown for the null-padded tuples and eliminate them.
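A filter on the right-hand table of a left outer join behaves differently depending on whether it is written in ON or in WHERE. The sketch below assumes Python's standard `sqlite3` module and made-up Faculty and Class rows (three faculty members, only one of whom teaches in Spring 2002).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Faculty (Id INTEGER, Name TEXT)")
conn.execute(
    "CREATE TABLE Class (CrsCode TEXT, Semester TEXT, Year INTEGER, InstructorId INTEGER)"
)
conn.executemany("INSERT INTO Faculty VALUES (?, ?)", [(1, "Oak"), (2, "Elm"), (3, "Ash")])
conn.executemany(
    "INSERT INTO Class VALUES (?, ?, ?, ?)",
    [("CSCI4380", "Spring", 2002, 1), ("CSCI2300", "Fall", 2002, 2)],
)

# Filter in WHERE: null-padded rows fail the condition and disappear.
where_filter = conn.execute(
    "SELECT F.Name, C.CrsCode "
    "FROM Faculty F LEFT OUTER JOIN Class C ON F.Id = C.InstructorId "
    "WHERE C.Semester = 'Spring' AND C.Year = 2002 ORDER BY F.Name"
).fetchall()

# Filter in ON: every faculty row survives, padded with NULL when nothing matches.
on_filter = conn.execute(
    "SELECT F.Name, C.CrsCode "
    "FROM Faculty F LEFT OUTER JOIN Class C "
    "ON F.Id = C.InstructorId AND C.Semester = 'Spring' AND C.Year = 2002 "
    "ORDER BY F.Name"
).fetchall()

print(where_filter)
print(on_filter)
```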

  • MORE OUTER JOIN

    Return faculty, the courses they are teaching and the number of students in each class. If the faculty is not teaching a course, just return the faculty name.

    First, write the statement without any outer joins:

    SELECT F.name, CR.CrsName, count(T.StudId) numStudents
    FROM Faculty F, Course CR, Class C, Transcript T
    WHERE F.id = C.instructorId AND
          C.CrsCode = CR.CrsCode AND
          C.CrsCode = T.CrsCode AND C.SectionNo = T.SectionNo AND
          C.Semester = T.Semester AND C.Year = T.Year
    GROUP BY F.id, F.name, CR.CrsName;

    We would like to keep F tuples even if they do not match any C tuple; we would also like to keep C tuples even if they do not match any T tuples. C tuples will always match CR tuples, right?

  • MORE OUTER JOIN

    Return faculty, the courses they are teaching and the number of students in each class. If the faculty is not teaching a course, just return the faculty name.

    SELECT F.id, CR.CrsName, count(T.StudId) numStudents
    FROM (((Faculty F LEFT OUTER JOIN Class C ON F.id = C.instructorId)
           LEFT OUTER JOIN Course CR ON C.CrsCode = CR.CrsCode)
          LEFT OUTER JOIN Transcript T ON C.CrsCode = T.CrsCode AND
                                          C.SectionNo = T.SectionNo AND
                                          C.Semester = T.Semester AND C.Year = T.Year)
    GROUP BY F.id, CR.CrsName;

  • Counting and null values

    • When counting:

      • count(*) counts the number of tuples

      • count(r.a) counts the number of values of r.a, but does not count any null values

      • count(DISTINCT r.a) counts the number of distinct values of r.a, again not counting nulls.
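The three counting forms can be compared on one column that contains a duplicate and a null. This is a minimal sketch, assuming Python's standard `sqlite3` module and a made-up relation R(a).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a INTEGER)")
conn.executemany("INSERT INTO R VALUES (?)", [(1,), (1,), (None,), (2,)])

# An aggregate query without GROUP BY returns exactly one tuple.
rows = conn.execute("SELECT count(*), count(a), count(DISTINCT a) FROM R").fetchall()

print(rows)
```

The single result tuple holds 4 tuples in total, 3 non-null values and 2 distinct non-null values.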

  • Anonymous Relations

    • FROM clauses may involve SELECT expressions that describe a new unnamed relation which can be used in the query expression

    SELECT T.studId
    FROM (SELECT DISTINCT T.studId, T.semester, T.year
          FROM Transcript T, Requires P
          WHERE P.crscode = '4380' AND
                P.prereqcrscode = T.crscode AND
                T.grade IS NOT NULL) PT, Transcript T
    WHERE T.crscode = '4380' AND T.studId = PT.studId AND
          (T.year > PT.year OR
           (T.year = PT.year AND T.semester = 'Fall' AND PT.semester = 'Spring'));

  • Scalar Queries

    • Any query that returns a single number with an aggregate function is called a scalar query. You can use a scalar query as if it were a number.

    SELECT count(T.studId) numStudents
    FROM Transcript T
    WHERE T.semester = 'Fall' AND T.year = 2002 AND
          T.crscode = '4380' AND T.sectionNo = '01'

  • Scalar Queries

    SELECT count(T.studId) / (SELECT count(T.studId) numStudents
                              FROM Transcript T
                              WHERE T.semester = 'Fall' AND
                                    T.year = 2002 AND
                                    T.crscode = '4380' AND
                                    T.sectionNo = '01') PercentAs
    FROM Transcript T
    WHERE T.semester = 'Fall' AND T.year = 2002 AND
          T.crscode = '4380' AND T.sectionNo = '01' AND
          T.grade = 'A'
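When a scalar subquery is used as a divisor, the type of the division matters. The sketch below assumes Python's standard `sqlite3` module and four made-up transcript rows; in SQLite, as in several other systems, dividing one integer count by another truncates, so a fraction needs one real-valued operand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Transcript (StudId INTEGER, Grade TEXT)")
conn.executemany(
    "INSERT INTO Transcript VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "B"), (4, "C")],
)

# Scalar subquery: the total number of students, usable like a number.
total_sql = "(SELECT count(StudId) FROM Transcript)"

# Integer / integer truncates: 1 / 4 gives 0.
int_ratio = conn.execute(
    f"SELECT count(StudId) / {total_sql} FROM Transcript WHERE Grade = 'A'"
).fetchone()[0]

# Multiplying by 1.0 forces real-valued division.
real_ratio = conn.execute(
    f"SELECT 1.0 * count(StudId) / {total_sql} FROM Transcript WHERE Grade = 'A'"
).fetchone()[0]

print(int_ratio, real_ratio)
```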

  • Nested Expressions

    • You can treat SELECT…FROM…WHERE… expressions as if they were multisets.

    • Check membership

      • value IN set

      • value NOT IN set

    • Check whether the set is empty or not

      • EXISTS set -> returns true if set is not empty (there exists an element in the set)

      • NOT EXISTS set -> returns true if set is empty

    • Take the union and set difference of sets

      • set1 UNION set2, set1 EXCEPT set2

  • Set operations

    5 IN (1,2,3,4)           FALSE
    5 NOT IN (1,2,3,4)       TRUE
    2 IN (1,2,3,4)           TRUE
    EXISTS (1,2,3,4)         TRUE
    NOT EXISTS (1,2,3,4)     FALSE
    NOT EXISTS ()            TRUE
    5 <>ALL (1,2,3,4)        TRUE
    2 =ANY (1,2,3,4)         TRUE
    2 =ALL (1,2,3,4)         FALSE

  • Nested Expressions

    SELECT F.name
    FROM Faculty F
    WHERE F.Id IN
          ( SELECT C.instructorId
            FROM Class C
            WHERE C.Year = 2002 )

    The parenthesized SELECT is the inner expression; the enclosing SELECT is the outer expression.

  • Nested Expressions

    WHERE R.attr IN
          (SELECT … FROM … WHERE …)

    For each tuple R, return true if the value R.attr is in the set resulting from the execution of the inner SQL expression.

    WHERE R.attr NOT IN
          (SELECT … FROM … WHERE …)

    For each tuple R, return true if the value R.attr is not in the set resulting from the execution of the inner SQL expression.

  • Nested Expressions

    WHERE R.attr >ANY (SELECT … FROM … WHERE …)

    For each tuple R, return true if the value R.attr is greater than at least one value in the set resulting from the execution of the inner SQL expression.

    We also have <ANY, <=ANY, >=ANY, =ANY (also known as IN), <>ANY, …

  • Nested Expressions

    WHERE R.attr >ALL
          (SELECT … FROM … WHERE …)

    For each tuple R, return true if the value R.attr is greater than all values in the set resulting from the execution of the inner SQL expression.

    We also have <ALL, <=ALL, >=ALL, =ALL, <>ALL, …

  • Scope of terms

    SELECT F.name
    FROM Faculty F
    WHERE 2 < ( SELECT count(*)
                FROM Class C
                WHERE C.Year = 2002 AND C.Semester = 'Fall' AND
                      C.instructorId = F.id )

    For each faculty tuple F, find all tuples in the Class relation for the courses he/she taught in Fall 2002. Return the count.

    If the total number of courses taught by a faculty member in Fall 2002 is greater than 2, then return the name of the faculty member.

    A term is visible in all the inner expressions nested below the expression that declares it, but not in the enclosing outer expressions.
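A correlated scalar subquery like the one above is evaluated once per outer tuple. This is a minimal sketch, assuming Python's standard `sqlite3` module and made-up rows in which one faculty member teaches three Fall 2002 classes and the other only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Faculty (Id INTEGER, Name TEXT)")
conn.execute(
    "CREATE TABLE Class (CrsCode TEXT, Semester TEXT, Year INTEGER, InstructorId INTEGER)"
)
conn.executemany("INSERT INTO Faculty VALUES (?, ?)", [(1, "Oak"), (2, "Elm")])
conn.executemany(
    "INSERT INTO Class VALUES (?, ?, ?, ?)",
    [
        ("C1", "Fall", 2002, 1),
        ("C2", "Fall", 2002, 1),
        ("C3", "Fall", 2002, 1),
        ("C4", "Fall", 2002, 2),
        ("C5", "Spring", 2002, 2),
        ("C6", "Spring", 2002, 2),
    ],
)

# F is declared in the outer query and is visible inside the subquery.
busy = conn.execute(
    "SELECT F.Name FROM Faculty F "
    "WHERE 2 < (SELECT count(*) FROM Class C "
    "           WHERE C.Year = 2002 AND C.Semester = 'Fall' "
    "             AND C.InstructorId = F.Id)"
).fetchall()

print(busy)
```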

  • Nested Expressions

    WHERE EXISTS
          (SELECT … FROM … WHERE …)

    For each tuple examined in the WHERE clause, return true if the inner SQL expression has at least one tuple in it.

    WHERE NOT EXISTS
          (SELECT … FROM … WHERE …)

    For each tuple examined in the WHERE clause, return true if the inner SQL expression has no tuples in it.

  • NOT EXISTS

    Find all students who did not take any courses yet.

    SELECT S.name
    FROM Student S
    WHERE NOT EXISTS
          ( SELECT *
            FROM Transcript T
            WHERE T.studId = S.id )

  • NOT EXISTS

    Find all students who did not take any courses yet - alternate solution

    SELECT S.name
    FROM Student S LEFT OUTER JOIN Transcript T
         ON T.studId = S.id
    WHERE T.studId IS NULL
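The NOT EXISTS form and the outer-join form of "students with no transcript entries" can be run side by side. This is a minimal sketch, assuming Python's standard `sqlite3` module and made-up Student and Transcript rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Id INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Transcript (StudId INTEGER, CrsCode TEXT)")
conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO Transcript VALUES (?, ?)", [(1, "C1"), (1, "C2")])

# Correlated NOT EXISTS: keep a student if no transcript tuple refers to them.
not_exists = conn.execute(
    "SELECT S.Name FROM Student S "
    "WHERE NOT EXISTS (SELECT * FROM Transcript T WHERE T.StudId = S.Id) "
    "ORDER BY S.Name"
).fetchall()

# Outer join: unmatched students are padded with NULL and then selected.
outer_join = conn.execute(
    "SELECT S.Name FROM Student S LEFT OUTER JOIN Transcript T "
    "ON T.StudId = S.Id WHERE T.StudId IS NULL ORDER BY S.Name"
).fetchall()

print(not_exists, outer_join)
```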

  • NOT EXISTS

    Find the total credits a student has completed

    SELECT T.studId, sum(C.credithours) TotalHrs,
           count(T.crscode)/count(DISTINCT T.crscode) Repeats
    FROM Transcript T, Course C
    WHERE C.crscode = T.crscode AND
          UPPER(T.grade) IN ('A','B','C','D')
    GROUP BY T.studId

    We would like to count a course only once even if the student took it multiple times. For each course, check whether it is the last time they took the course.

  • NOT EXISTS

    For each student and the course they took, check if this is the last time they took this course.

    For each student and the course they took, return the course if there does not exist a tuple in Transcript for the same course and a later time.

  • NOT EXISTS

    SELECT T.studId, sum(C.credithours) TotalHrs
    FROM Transcript T, Course C
    WHERE C.crscode = T.crscode AND T.grade IN ('A','B','C','D') AND
          NOT EXISTS (SELECT *
                      FROM Transcript T2
                      WHERE T2.studId = T.studId AND
                            T2.crscode = T.crscode AND
                            T2.grade IN ('A','B','C','D') AND
                            (T2.year > T.year OR
                             (T2.year = T.year AND T2.semester = 'Fall'
                              AND T.semester = 'Spring'))
                     )
    GROUP BY T.studId

  • FOR ALL QUERIES

    • "For all" queries, which normally require a division (or two set subtractions), can be solved using NOT EXISTS.

    • Find all students who have taken all prerequisites for CSCI4380.

    • Solve by finding students for whom there does not exist a prerequisite for CSCI4380 that they have not yet taken.

  • FOR ALL QUERIES

    SELECT S.name
    FROM Student S
    WHERE NOT EXISTS (SELECT * FROM Requires R
                      WHERE R.crscode = 'CSCI4380' AND
                            NOT EXISTS
                            ( SELECT * FROM Transcript T
                              WHERE T.studId = S.Id AND
                                    T.grade IN ('A', 'B', 'C', 'D') AND
                                    T.crscode = R.prereqcrscode
                            )
                     )

  • FOR ALL QUERIES

    • Alternate solution by counting

    SELECT S.name
    FROM Student S
    WHERE (SELECT count(R.prereqcrscode) FROM Requires R
           WHERE R.crscode = 'CSCI4380')
          =
          (SELECT count(DISTINCT T.crscode)
           FROM Transcript T, Requires R2
           WHERE T.studId = S.id AND T.grade IN ('A','B','C','D') AND
                 R2.prereqcrscode = T.crscode AND
                 R2.crscode = 'CSCI4380')
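Both formulations of the "for all" query, double NOT EXISTS and counting, can be checked against each other. The sketch below assumes Python's standard `sqlite3` module and made-up data: two prerequisites for CSCI4380 (the prerequisite course codes are hypothetical) and three students, only one of whom has passed both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Id INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Requires (CrsCode TEXT, PrereqCrsCode TEXT)")
conn.execute("CREATE TABLE Transcript (StudId INTEGER, CrsCode TEXT, Grade TEXT)")
conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany(
    "INSERT INTO Requires VALUES (?, ?)",
    [("CSCI4380", "CSCI2300"), ("CSCI4380", "CSCI1200")],
)
conn.executemany(
    "INSERT INTO Transcript VALUES (?, ?, ?)",
    [
        (1, "CSCI2300", "A"), (1, "CSCI1200", "B"),  # Ann passed both
        (2, "CSCI2300", "A"),                        # Bob is missing one
        (3, "CSCI2300", "C"), (3, "CSCI1200", "F"),  # Cy failed one
    ],
)

# No prerequisite exists that the student has not passed.
double_not_exists = conn.execute("""
    SELECT S.Name FROM Student S
    WHERE NOT EXISTS (
        SELECT * FROM Requires R
        WHERE R.CrsCode = 'CSCI4380' AND NOT EXISTS (
            SELECT * FROM Transcript T
            WHERE T.StudId = S.Id AND T.Grade IN ('A','B','C','D')
              AND T.CrsCode = R.PrereqCrsCode))
    ORDER BY S.Name
""").fetchall()

# Number of prerequisites equals number of distinct prerequisites passed.
by_counting = conn.execute("""
    SELECT S.Name FROM Student S
    WHERE (SELECT count(R.PrereqCrsCode) FROM Requires R
           WHERE R.CrsCode = 'CSCI4380')
        = (SELECT count(DISTINCT T.CrsCode) FROM Transcript T, Requires R2
           WHERE T.StudId = S.Id AND T.Grade IN ('A','B','C','D')
             AND R2.PrereqCrsCode = T.CrsCode AND R2.CrsCode = 'CSCI4380')
    ORDER BY S.Name
""").fetchall()

print(double_not_exists, by_counting)
```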

  • Things to remember

    • Most queries that use IN or EXISTS can be rewritten using simple joins. Joins are much easier to optimize.

    • Set subtraction can usually be expressed using NOT IN or NOT EXISTS.

    • Using anonymous relations in the FROM clause may cause the optimizer to miss some optimizations. The simpler the query, the better.

  • Things to remember

    • There is a subtle difference in the syntax of the two statements:

      • Attribute NOT IN (select statement)

      • NOT EXISTS (select statement)

    • "For all" queries usually require two NOT EXISTS.

    • SQL aggregates and outer joins are powerful constructs for formulating complex queries, even those involving some sort of negation.

  • INSERT

    • To insert a new tuple:

    INSERT INTO Faculty(Id, Name, DeptId)
    VALUES (10, 'Legolas', 'ELF')

    All attributes that are not named will be given NULL values for this tuple.

    • To insert a number of tuples, use a select statement:

    INSERT INTO Student(Id, Name)
    SELECT 10000 + F.Id, F.Name
    FROM Faculty F
    WHERE F.DeptId = 'CS'

  • DELETE

    • Deleting tuples that satisfy a specific condition:

    DELETE FROM CLASS C
    WHERE C.year < 1998

    • Delete all tuples:

    DELETE FROM CLASS C

  • UPDATE

    • Update values in the tuples that satisfy the WHERE condition:

    UPDATE Transcript T
    SET grade = 'I'
    WHERE T.year = 2002 AND
          T.semester = 'Spring' AND
          T.grade IS NULL
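The three modification statements can be tried together on throwaway tables. This is a minimal sketch, assuming Python's standard `sqlite3` module and made-up rows; table aliases are left out of UPDATE and DELETE to stay within syntax that SQLite is known to accept, and DELETE is applied to a reduced Transcript table rather than Class purely to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Faculty (Id INTEGER, Name TEXT, DeptId TEXT, Password TEXT, Address TEXT)"
)
conn.execute(
    "CREATE TABLE Transcript (StudId INTEGER, Semester TEXT, Year INTEGER, Grade TEXT)"
)

# INSERT naming only some attributes: the rest become NULL.
conn.execute("INSERT INTO Faculty (Id, Name, DeptId) VALUES (10, 'Legolas', 'ELF')")
inserted = conn.execute(
    "SELECT Id, Name, DeptId, Password, Address FROM Faculty"
).fetchall()

conn.executemany(
    "INSERT INTO Transcript VALUES (?, ?, ?, ?)",
    [(1, "Spring", 2002, None), (2, "Spring", 2002, "A"), (3, "Fall", 1997, None)],
)

# UPDATE only the tuples that satisfy the WHERE condition.
conn.execute(
    "UPDATE Transcript SET Grade = 'I' "
    "WHERE Year = 2002 AND Semester = 'Spring' AND Grade IS NULL"
)

# DELETE the tuples that satisfy the WHERE condition.
conn.execute("DELETE FROM Transcript WHERE Year < 1998")

remaining = conn.execute(
    "SELECT StudId, Grade FROM Transcript ORDER BY StudId"
).fetchall()

print(inserted, remaining)
```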