UFCE8V-20-3 Information Systems Development 3 (SHAPE HK)
Lecture 9: Database Theory & Practice (5): Introduction to the Structured Query Language (SQL)
Origins & history
• Early 1970s – IBM develops Sequel as part of the System R project at its San Jose Research Laboratory;
• 1986 – ANSI & ISO publish the standard SQL-86;
• 1987 – IBM publishes its own “standard” SQL, called the Systems Application Architecture Database Interface (SAA-SQL);
• 1989 – SQL-89 published by ANSI (extended version of SQL-86);
• 1992 – SQL-92 published with better support for algebraic operations;
• 1999 – SQL:1999 published with support for typing, stored procedures, triggers, BLOBs etc.
SQL-92 remains the most widely implemented standard, and most database vendors also provide their own (proprietary) extensions.
Components of SQL
The SQL language has several parts:
• Data-definition language (DDL). The SQL DDL provides commands for defining relation schemas, deleting relations, and modifying relation schemas.
• Interactive data-manipulation language (DML). The SQL DML includes a query language based on both the relational algebra and the tuple relational calculus. It also includes commands to insert tuples into, delete tuples from, and modify tuples in the database.
• View definition. The SQL DDL includes commands for defining views.
• Transaction control. SQL includes commands for specifying the beginning and ending of transactions.
• Embedded SQL and dynamic SQL. Embedded and dynamic SQL define how SQL statements can be embedded within general-purpose programming languages, such as C, C++, Java, PL/I, Cobol, Pascal, and Fortran.
• Integrity. The SQL DDL includes commands for specifying integrity constraints that the data stored in the database must satisfy. Updates that violate integrity constraints are disallowed.
• Authorization. The SQL DDL includes commands for specifying access rights to relations and views.
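Several of these components can be tried out in a few lines. The sketch below is only an illustration, assuming Python's built-in sqlite3 module and an in-memory SQLite database rather than whatever system the lecture used. The CHECK constraint on status and the view name london_suppliers are hypothetical, and authorization is left out because SQLite has no GRANT or REVOKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a relation schema, including integrity constraints.
# The CHECK constraint is a hypothetical rule added for illustration.
conn.execute("""
    CREATE TABLE s (
        sno    INTEGER PRIMARY KEY,
        sname  TEXT NOT NULL,
        status INTEGER CHECK (status >= 0),
        city   TEXT
    )
""")

# DML: insert tuples, then commit the transaction that sqlite3 opened.
conn.execute("INSERT INTO s VALUES (1, 'Smith', 20, 'London')")
conn.execute("INSERT INTO s VALUES (2, 'Jones', 10, 'Paris')")
conn.commit()

# View definition (london_suppliers is a hypothetical name).
conn.execute(
    "CREATE VIEW london_suppliers AS SELECT sno, sname FROM s WHERE city = 'London'"
)
view_rows = conn.execute("SELECT * FROM london_suppliers").fetchall()

# Integrity: an update that violates a constraint is disallowed.
try:
    conn.execute("UPDATE s SET status = -5 WHERE sno = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Transaction control: undo an accidental delete with ROLLBACK.
conn.execute("DELETE FROM s")
conn.rollback()
remaining = conn.execute("SELECT COUNT(*) FROM s").fetchone()[0]

print(view_rows, rejected, remaining)
```

Calling SQL from Python in this way is also a small instance of using SQL from inside a general-purpose programming language.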
SQL Example (example db)
• The Supplier-Parts Database

Table s
sno  sname  status  city
1    Smith  20      London
2    Jones  10      Paris
3    Blake  30      Paris
4    Clark  20      London
5    Adams  30      Athens

Table p
pno  pname  color  weight  city
1    Nut    Red    12.0    London
2    Bolt   Green  17.0    Paris
3    Screw  Blue   17.0    Oslo
4    Screw  Red    14.0    London
5    Cam    Blue   12.0    Paris
6    Cog    Red    19.0    London

Table sp
sno  pno  qty
1    1    300
1    2    200
1    3    400
1    4    200
1    5    100
1    6    100
2    1    300
2    2    400
3    2    200
4    2    200
4    4    300
4    5    400
SQL Example (project)
• Project the columns
SELECT sname FROM s

sname
Smith
Jones
Blake
Clark
Adams

computed columns:
SELECT sname, status * 5 FROM s

sname  status * 5
Smith  100
Jones  50
Blake  150
Clark  100
Adams  150

renamed columns:
SELECT sname AS Supplier, status * 5 AS 'Status times Five' FROM s

Supplier  Status times Five
Smith     100
Jones     50
Blake     150
Clark     100
Adams     150
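The projection queries can be run as they stand. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory copy of table s (SQLite is an assumption here, not the lecture's system). The alias is written with double quotes, which is the standard SQL way to quote an identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Project a single column.
names = [row[0] for row in conn.execute("SELECT sname FROM s")]

# Computed column: each status multiplied by five.
computed = conn.execute("SELECT sname, status * 5 FROM s").fetchall()

# Renamed columns: the aliases become the headings of the result.
cur = conn.execute('SELECT sname AS Supplier, status * 5 AS "Status times Five" FROM s')
headers = [col[0] for col in cur.description]

print(names)
print(computed)
print(headers)
```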
SELECT statement (restrict)
• Restrict the rows
SELECT * FROM s WHERE city='London'

sno  sname  status  city
1    Smith  20      London
4    Clark  20      London

complex condition:
SELECT * FROM s WHERE city='London' OR status = 30

sno  sname  status  city
1    Smith  20      London
3    Blake  30      Paris
4    Clark  20      London
5    Adams  30      Athens
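When a restriction is issued from a program, the comparison values are usually passed as parameters rather than pasted into the SQL text. This is a sketch under the assumption of Python's sqlite3 module and an in-memory copy of table s; the ? placeholders are sqlite3's parameter style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Simple restriction: only the rows whose city is London.
london = conn.execute("SELECT * FROM s WHERE city = ?", ("London",)).fetchall()

# Complex condition: rows that satisfy either test are kept.
london_or_30 = conn.execute(
    "SELECT * FROM s WHERE city = ? OR status = ?", ("London", 30)
).fetchall()

print(london)
print(london_or_30)
```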
SELECT statement (restrict & project)
• Restrict & Project
SELECT city FROM s WHERE sname='Smith' OR status=20

city
London
London

remove duplicate rows:
SELECT DISTINCT city FROM s WHERE sname='Smith' OR status=20

city
London
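The effect of DISTINCT can be checked directly. The sketch below assumes Python's sqlite3 module with an in-memory copy of table s; the status value is compared as a number, which avoids relying on implicit string-to-number conversion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

condition = "sname = 'Smith' OR status = 20"

# Restrict and project: one output row per matching supplier.
cities = [row[0] for row in conn.execute(f"SELECT city FROM s WHERE {condition}")]

# DISTINCT collapses identical output rows into one.
distinct_cities = [
    row[0] for row in conn.execute(f"SELECT DISTINCT city FROM s WHERE {condition}")
]

print(cities)
print(distinct_cities)
```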
SELECT statement (group by & having)
• Use the GROUP BY clause to aggregate related rows
SELECT city, SUM(status) AS 'Total Status' FROM s GROUP BY city

city    Total Status
Athens  30
London  40
Paris   40

• Group By and Having
• Use the HAVING clause to restrict rows aggregated with GROUP BY
SELECT city, SUM(status) AS 'Total Status' FROM s GROUP BY city HAVING SUM(status) > 30

city    Total Status
London  40
Paris   40
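The difference between the two clauses is that WHERE filters rows before grouping, whereas HAVING filters the groups after aggregation. A minimal sketch, assuming Python's sqlite3 module and an in-memory copy of table s (the alias total_status is a hypothetical name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# One output row per city, carrying the sum of that city's status values.
totals = dict(conn.execute("SELECT city, SUM(status) AS total_status FROM s GROUP BY city"))

# HAVING is applied to the aggregated groups, not to the individual rows.
large_totals = dict(conn.execute(
    "SELECT city, SUM(status) AS total_status FROM s "
    "GROUP BY city HAVING SUM(status) > 30"
))

print(totals)
print(large_totals)
```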
SELECT statement summarized :
For many of the modern uses of databases, it is often necessary to select some subset of the records from a table, and let some other program manipulate the results. In SQL the SELECT statement is the workhorse for these operations.
A summary of the SELECT statement:
SELECT columns or computations
FROM table
WHERE condition
GROUP BY columns
HAVING condition
ORDER BY column [ASC | DESC]
LIMIT offset,count;
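To see all of the clauses working together, here is one query over table sp that uses every part of the summary. It is a sketch, assuming Python's sqlite3 module; the thresholds (200 and 500) and the alias total_qty are arbitrary choices for illustration. The LIMIT offset,count form is not part of standard SQL, but SQLite accepts it as well as MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno INTEGER, pno INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [(1, 1, 300), (1, 2, 200), (1, 3, 400), (1, 4, 200), (1, 5, 100), (1, 6, 100),
     (2, 1, 300), (2, 2, 400), (3, 2, 200), (4, 2, 200), (4, 4, 300), (4, 5, 400)],
)

top_suppliers = conn.execute("""
    SELECT sno, SUM(qty) AS total_qty   -- columns or computations
    FROM sp                             -- table
    WHERE qty >= 200                    -- keep only rows with qty of at least 200
    GROUP BY sno                        -- one group per supplier
    HAVING SUM(qty) > 500               -- keep only groups totalling more than 500
    ORDER BY total_qty DESC             -- largest total first
    LIMIT 0, 2                          -- skip 0 rows, return at most 2
""").fetchall()

print(top_suppliers)
```

The clauses are evaluated in the order FROM, WHERE, GROUP BY, HAVING, then the SELECT list, ORDER BY and LIMIT, which is why the HAVING condition can refer to an aggregate while the WHERE condition cannot.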
SQL Comparison operators :
In SQL, the WHERE clause is used to operate on subsets of a table. The following comparison operators are available:
• Usual comparison operators: < > <= >= = <>
• BETWEEN used to test for a range
• IN used to test group membership
• Keyword NOT used for negation
• LIKE operator allows wildcards
  • _ matches any single character, % matches any sequence of characters (including none)
  • SELECT salary FROM table WHERE name LIKE 'Fred %';
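Each of these operators can be tried against the parts table p. The sketch below assumes Python's sqlite3 module and an in-memory copy of the table; the particular ranges and patterns are arbitrary illustrations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (pno INTEGER PRIMARY KEY, pname TEXT, color TEXT, weight REAL, city TEXT)")
conn.executemany(
    "INSERT INTO p VALUES (?, ?, ?, ?, ?)",
    [(1, "Nut", "Red", 12.0, "London"), (2, "Bolt", "Green", 17.0, "Paris"),
     (3, "Screw", "Blue", 17.0, "Oslo"), (4, "Screw", "Red", 14.0, "London"),
     (5, "Cam", "Blue", 12.0, "Paris"), (6, "Cog", "Red", 19.0, "London")],
)

def pnos(where):
    """Return the sorted part numbers of the rows matching a WHERE condition."""
    return sorted(row[0] for row in conn.execute(f"SELECT pno FROM p WHERE {where}"))

not_red = pnos("color <> 'Red'")                       # inequality
mid_weight = pnos("weight BETWEEN 12 AND 14")          # range test, both ends included
london_oslo = pnos("city IN ('London', 'Oslo')")       # group membership
elsewhere = pnos("city NOT IN ('London', 'Oslo')")     # negation
starts_with_c = pnos("pname LIKE 'C%'")                # % matches any run of characters
second_is_o = pnos("pname LIKE '_o%'")                 # _ matches exactly one character

print(not_red, mid_weight, london_oslo, elsewhere, starts_with_c, second_is_o)
```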
SQL data types :
SQL supports a very large number of data types & formats for internal storage of data.
Numeric
• INTEGER, SMALLINT, BIGINT
• NUMERIC(w,d), DECIMAL(w,d) - numbers with width w and d decimal places
• REAL, DOUBLE PRECISION - machine and database dependent
• FLOAT(p) - floating point number with p binary digits of precision
SQL data types (cont.) :
Character
• CHARACTER(L) - a fixed-length character string of length L
• CHARACTER VARYING(L) or VARCHAR(L) - a variable-length character string with maximum length L
Binary
• BIT(L), BIT VARYING(L) - like the corresponding character types
• BINARY LARGE OBJECT(L) or BLOB(L)
Temporal
• DATE
• TIME
• TIMESTAMP
SQL Functions :
• SQL provides a wide range of predefined functions to perform data manipulation.
• Four types of functions:
  arithmetic (sqrt(), log(), mod(), round() …)
  date (sysdate(), month(), dayname() …)
  character (length(), lower(), upper() …)
  aggregate (min(), max(), avg(), sum() …)
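The set of available functions differs between database systems. As a hedged illustration, the sketch below uses Python's sqlite3 module, so it sticks to functions that SQLite provides: strftime() stands in for the date functions listed above (sysdate(), month() and dayname() are not SQLite functions), and the % operator stands in for mod(). The date literal is an arbitrary example value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Character functions applied to one supplier name.
text_info = conn.execute(
    "SELECT upper(sname), lower(sname), length(sname) FROM s WHERE sno = 1"
).fetchone()

# Arithmetic: rounding to two decimal places, and remainder via the % operator.
arith = conn.execute(
    "SELECT round(status / 3.0, 2), status % 3 FROM s WHERE sno = 1"
).fetchone()

# Date: extract the month from a date string (arbitrary example date).
month = conn.execute("SELECT strftime('%m', '2024-03-15')").fetchone()[0]

# Aggregate functions summarise a whole column in a single row.
agg = conn.execute("SELECT min(status), max(status), avg(status), sum(status) FROM s").fetchone()

print(text_info, arith, month, agg)
```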
Database & Table description commands :
Since a single server can support many databases, each containing many tables, with each table having a variety of columns, it is often necessary to view which databases are available and what the table structures are within a particular database.
The following commands (as implemented in MySQL) are often used for these purposes:
• SHOW DATABASES;
• SHOW TABLES IN database;
• SHOW COLUMNS IN table;
• DESCRIBE table; - shows the columns and their types
Inserting Records :
Individual records can be entered using the INSERT command (string values must be quoted):
INSERT INTO s VALUES(6, 'Thomas', 40, 'Cardiff');
Using the column names:
INSERT INTO s (sno, sname, status, city) VALUES(6, 'Thomas', 40, 'Cardiff');
Insert multiple records:
INSERT INTO s (sno, sname, status, city) VALUES(6, 'Thomas', 40, 'Cardiff'), (7, 'Hamish', 30, 'Glasgow');
Upload from file:
LOAD DATA INFILE 'supplier.tab' INTO TABLE s FIELDS TERMINATED BY '\t';
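The same operations can be driven from a program. The following sketch assumes Python's sqlite3 module. Since LOAD DATA INFILE is a MySQL command, the file upload is imitated by parsing tab-separated text with the csv module; the text is held in a string here so the example is self-contained, and its contents are simply the five supplier rows from table s.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")

# Stand-in for the contents of a tab-separated file such as supplier.tab.
supplier_tab = (
    "1\tSmith\t20\tLondon\n"
    "2\tJones\t10\tParis\n"
    "3\tBlake\t30\tParis\n"
    "4\tClark\t20\tLondon\n"
    "5\tAdams\t30\tAthens\n"
)

# Bulk load: one INSERT per parsed line, with numeric fields converted explicitly.
reader = csv.reader(io.StringIO(supplier_tab), delimiter="\t")
rows = [(int(sno), sname, int(status), city) for sno, sname, status, city in reader]
conn.executemany("INSERT INTO s (sno, sname, status, city) VALUES (?, ?, ?, ?)", rows)

# Multi-row INSERT with named columns; values are passed as parameters,
# so no manual quoting of the strings is needed.
conn.execute(
    "INSERT INTO s (sno, sname, status, city) VALUES (?, ?, ?, ?), (?, ?, ?, ?)",
    (6, "Thomas", 40, "Cardiff", 7, "Hamish", 30, "Glasgow"),
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM s").fetchone()[0]
newest = conn.execute("SELECT * FROM s WHERE sno = 7").fetchone()
print(count, newest)
```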
Updating (Editing) Existing Records :
To change one or more values of columns of a table, the UPDATE command can be used.
Edits are provided as a comma-separated list of column/value pairs.
UPDATE s SET status=status + 10 WHERE city='London';
Note that the UPDATE command without a WHERE clause will update all the rows of a table.
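Because an UPDATE can touch more rows than intended, it is useful to check how many rows were actually changed. A minimal sketch, assuming Python's sqlite3 module and an in-memory copy of table s; rowcount is the cursor attribute that reports the number of modified rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Raise the status of every London supplier by 10.
cur = conn.execute("UPDATE s SET status = status + 10 WHERE city = ?", ("London",))
changed = cur.rowcount  # number of rows the UPDATE modified
conn.commit()

statuses = dict(conn.execute("SELECT sname, status FROM s"))
print(changed, statuses)
```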
Deleting Records :
To delete one or more existing records, the DELETE FROM command is used.
Note the WHERE clause in the DELETE syntax. The WHERE clause specifies which record or records should be deleted. If the WHERE clause is omitted, all records will be deleted!
DELETE FROM s WHERE city='London';
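One cautious habit is to run the same WHERE condition in a SELECT first, so the number of rows about to be removed is known in advance. The sketch below assumes Python's sqlite3 module and an in-memory copy of table s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Preview: how many rows does the condition match?
to_delete = conn.execute("SELECT COUNT(*) FROM s WHERE city = ?", ("London",)).fetchone()[0]

# Delete exactly those rows and confirm the count.
cur = conn.execute("DELETE FROM s WHERE city = ?", ("London",))
deleted = cur.rowcount
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT sname FROM s ORDER BY sno")]
print(to_delete, deleted, remaining)
```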
Normalization (avoiding redundancy) :
Repeating data (the same column values across many records) wastes space (redundancy) and introduces insert & update anomalies. To avoid this, tables are often normalized and repeating fields are moved to their own tables. These are then related to the base or parent table using foreign keys.
For instance, in the Quote example, author and category are moved to their own tables, since a specific category can have many associated quotes and an author can be the source of many quotes.
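A normalized layout of this kind might look like the sketch below. The table and column names (author, category, quote, body, author_id, category_id) and the placeholder rows are all hypothetical, since the schema of the Quote example is not shown here. The code assumes Python's sqlite3 module; SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

# Hypothetical schema: each author and category is stored once and
# referenced from quote by its key.
conn.executescript("""
    CREATE TABLE author   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE quote (
        id          INTEGER PRIMARY KEY,
        body        TEXT NOT NULL,
        author_id   INTEGER NOT NULL REFERENCES author(id),
        category_id INTEGER NOT NULL REFERENCES category(id)
    );
    INSERT INTO author   VALUES (1, 'Author A');
    INSERT INTO category VALUES (1, 'Category X');
    INSERT INTO quote    VALUES (1, 'first quote', 1, 1), (2, 'second quote', 1, 1);
""")

# A join re-combines the data that normalization spread over three tables.
joined = conn.execute("""
    SELECT quote.body, author.name, category.name
    FROM quote
    JOIN author   ON quote.author_id = author.id
    JOIN category ON quote.category_id = category.id
    ORDER BY quote.id
""").fetchall()

# The foreign key rejects a quote whose author does not exist.
try:
    conn.execute("INSERT INTO quote VALUES (3, 'orphan quote', 99, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(joined, orphan_rejected)
```

With this layout an author's name is stored in exactly one row, so correcting it is a single-row update rather than a change to every quote.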
Joins (1)
Joins are used to re-combine records which have data spread across many tables. The following simple example database with two tables, m and f, is used to illustrate the various kinds of joins.
• The m-f database

Table m
id  name   age
1   tom    23
2   dick   20
3   harry  30

Table f
id  name  age
1   mary  23
2   anne  30
3   sue   34
Joins (2)
• Product (or Cartesian Product)
SELECT * FROM m, f

id  name   age  id  name  age
1   tom    23   1   mary  23
2   dick   20   1   mary  23
3   harry  30   1   mary  23
1   tom    23   2   anne  30
2   dick   20   2   anne  30
3   harry  30   2   anne  30
1   tom    23   3   sue   34
2   dick   20   3   sue   34
3   harry  30   3   sue   34

Synonymous with the CROSS JOIN, hence: SELECT * FROM m CROSS JOIN f; would return the same result. This is not very useful but is the basis for all other joins.
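The equivalence of the two spellings is easy to confirm. A minimal sketch, assuming Python's sqlite3 module and in-memory copies of tables m and f:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE m (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE f (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO m VALUES (?, ?, ?)", [(1, "tom", 23), (2, "dick", 20), (3, "harry", 30)])
conn.executemany("INSERT INTO f VALUES (?, ?, ?)", [(1, "mary", 23), (2, "anne", 30), (3, "sue", 34)])

# Comma-separated tables with no condition give the Cartesian product.
implicit = conn.execute("SELECT * FROM m, f").fetchall()

# CROSS JOIN is the explicit spelling of the same operation.
explicit = conn.execute("SELECT * FROM m CROSS JOIN f").fetchall()

# Every row of m is paired with every row of f: 3 x 3 = 9 rows.
print(len(implicit), sorted(implicit) == sorted(explicit))
```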
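Joins (3)
• Natural join
Joins tables using some shared characteristic – usually (but not necessarily) a foreign key. Strictly speaking, the query below is an equi-join on age: both copies of the matching column are kept, whereas SQL's NATURAL JOIN keyword matches on every identically named column and returns each such column only once.
SELECT * FROM m,f WHERE m.age = f.age

id  name   age  id  name  age
1   tom    23   1   mary  23
3   harry  30   2   anne  30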
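It is worth seeing how the query on m.age = f.age differs from the NATURAL JOIN keyword that SQL also offers. In the sketch below (assuming Python's sqlite3 module and in-memory copies of m and f), NATURAL JOIN matches on every column name the two tables share, here id, name and age, so with this data it returns no rows at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE m (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE f (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO m VALUES (?, ?, ?)", [(1, "tom", 23), (2, "dick", 20), (3, "harry", 30)])
conn.executemany("INSERT INTO f VALUES (?, ?, ?)", [(1, "mary", 23), (2, "anne", 30), (3, "sue", 34)])

# Join on equal ages only: pairs tom with mary and harry with anne.
same_age = conn.execute("SELECT * FROM m, f WHERE m.age = f.age").fetchall()

# NATURAL JOIN requires id, name AND age to agree, which never happens here.
natural = conn.execute("SELECT * FROM m NATURAL JOIN f").fetchall()

print(same_age)
print(natural)
```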
Joins (4)
• Inner joins
The previous example, besides being a natural join, is also an example of an inner join. An inner join retrieves data only from those rows where the join condition is met.
SELECT * FROM m,f WHERE m.age > f.age

id  name   age  id  name  age
3   harry  30   1   mary  23
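An inner join can also be written with the explicit INNER JOIN ... ON syntax, which keeps the join condition apart from any other filtering in WHERE. The sketch below (assuming Python's sqlite3 module and in-memory copies of m and f) checks that both spellings return the same row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE m (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE f (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO m VALUES (?, ?, ?)", [(1, "tom", 23), (2, "dick", 20), (3, "harry", 30)])
conn.executemany("INSERT INTO f VALUES (?, ?, ?)", [(1, "mary", 23), (2, "anne", 30), (3, "sue", 34)])

# Implicit form: Cartesian product filtered by the WHERE condition.
implicit = conn.execute("SELECT * FROM m, f WHERE m.age > f.age").fetchall()

# Explicit form: the join condition lives in the ON clause.
explicit = conn.execute("SELECT * FROM m INNER JOIN f ON m.age > f.age").fetchall()

# Only harry (30) is strictly older than a row of f, namely mary (23).
print(implicit, explicit)
```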
Joins (5)
• Outer joins
Unmatched rows can be included in the output using an outer join.
Left outer join (every row of m is kept; a row with no match in f is padded with NULLs):
SELECT * FROM m LEFT OUTER JOIN f ON m.age = f.age

id  name   age  id    name  age
1   tom    23   1     mary  23
2   dick   20   NULL  NULL  NULL
3   harry  30   2     anne  30

Right outer join (every row of f is kept; a row with no match in m is padded with NULLs):
SELECT * FROM m RIGHT OUTER JOIN f ON m.age = f.age

id    name   age   id  name  age
1     tom    23    1   mary  23
3     harry  30    2   anne  30
NULL  NULL   NULL  3   sue   34
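Both outer joins can be reproduced as follows. This is a sketch assuming Python's sqlite3 module; because only newer SQLite releases accept RIGHT OUTER JOIN, the right outer join is emulated here by swapping the tables in a LEFT OUTER JOIN and listing the columns of m first, which gives the same rows. SQL NULLs arrive in Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE m (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE f (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO m VALUES (?, ?, ?)", [(1, "tom", 23), (2, "dick", 20), (3, "harry", 30)])
conn.executemany("INSERT INTO f VALUES (?, ?, ?)", [(1, "mary", 23), (2, "anne", 30), (3, "sue", 34)])

# Left outer join: every row of m appears, matched or not.
left = conn.execute("SELECT * FROM m LEFT OUTER JOIN f ON m.age = f.age").fetchall()

# Right outer join, emulated: keep every row of f by making f the left-hand
# table, but still output the columns of m first.
right = conn.execute(
    "SELECT m.*, f.* FROM f LEFT OUTER JOIN m ON m.age = f.age"
).fetchall()

for row in left:
    print("left ", row)
for row in right:
    print("right", row)
```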
Joins (6)
• Self Join
Special case of the inner join – here the table employee shows employees and their managers. Ruth manages Joe who manages Tom, Dick and Harry.

Table employee
emp_id  emp_name  mgr_id
1       Tom       4
2       Dick      4
3       Harry     4
4       Joe       5
5       Ruth      NULL

Show who manages who by name:
SELECT E1.emp_name AS Employee, E2.emp_name AS Manager
FROM employee AS E1
INNER JOIN employee AS E2 ON E1.mgr_id = E2.emp_id

Employee  Manager
Tom       Joe
Dick      Joe
Harry     Joe
Joe       Ruth
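The self join can be run as written once the employee table exists. The sketch below assumes Python's sqlite3 module and an in-memory copy of the table. Ruth does not appear as an employee in the result because her mgr_id is NULL, and NULL never equals any emp_id in an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, mgr_id INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Tom", 4), (2, "Dick", 4), (3, "Harry", 4), (4, "Joe", 5), (5, "Ruth", None)],
)

# The table is joined to itself under two aliases: E1 plays the role of
# the employee, E2 the role of that employee's manager.
pairs = conn.execute("""
    SELECT E1.emp_name AS Employee, E2.emp_name AS Manager
    FROM employee AS E1
    INNER JOIN employee AS E2 ON E1.mgr_id = E2.emp_id
""").fetchall()

for employee, manager in pairs:
    print(employee, "is managed by", manager)
```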