Teradata advanced-sql

38
1 Advanced Teradata SQL GLOBAL Temporary Vs VOLATILE Temporary Vs Derived tables WITH and WITH BY Special Index function Trigger Online Analytical Function Ordered Analytical function

description

Teradata advanced-sql

Transcript of Teradata advanced-sql

Page 1: Teradata advanced-sql

1

Advanced Teradata SQL

GLOBAL Temporary Vs VOLATILE Temporary Vs Derived tables

WITH and WITH BY

Special Index function

Trigger

Online Analytical Function

Ordered Analytical function

Page 2: Teradata advanced-sql

2

Global and Volatile Temporary Tables

Temporary tables are an additional tool that assist us in performing

SQL operations that either:

Might not be possible against a normalized table, or

May require multiple SQL statements to complete.

There are three types of temporary tables implemented in Teradata:

Global

Volatile

Derived

Page 3: Teradata advanced-sql

3

Derived Tables

Some characteristics of a derived table include:

Local to the query - it exists for the duration of the query.

When the query is done the table is discarded.

Incorporated into SQL query syntax.

Spool rows are also discarded when query finishes.

There is no data dictionary involvement - less system overhead.

Example

SELECT t.prodid,

t.sumsales,

RANK(t.sumsales)

FROM (SELECT prodid,

SUM(sales)

FROM salestbl

GROUP BY 1) AS t(prodid, sumsales)

QUALIFY RANK(sumsales)<=3;

Page 4: Teradata advanced-sql

4

Volatile Temporary Tables

Volatile tables have a lot of the advantages of derived tables, and

additional benefits such as:

Local to a session - it exists throughout the entire session, not just a single query.

It must be explicitly created using the CREATE VOLATILE TABLE syntax.

It is discarded automatically at the end of the session.

There is no data dictionary involvement.

Example

CREATE VOLATILE TABLE vt_deptsal, LOG

(deptno SMALLINT

,avgsal DEC(9,2)

,maxsal DEC(9,2)

,sumsal DEC(9,2)

,empcnt SMALLINT)

ON COMMIT PRESERVE ROWS;

Page 5: Teradata advanced-sql

5

Global Temporary Tables

The major difference between a global temporary table and a volatiletemporary table is that the global table has a definition in the datadictionary, thus the definition may be shared by many users. Each usersession can materialize its own local instance of the table.

Attributes of a global temporary table include: Local to a session, however each user session may have its own instance. Uses CREATE GLOBAL TEMPORARY TABLE syntax. Materialized instance of table discarded at session end. Creates and keeps table definition in data dictionary.

ExampleCREATE SET GLOBAL TEMPORARY TABLE PED.gt_deptsal,NO FALLBACK,LOG ( deptno SMALLINT, avgsal DECIMAL(9,2)) PRIMARY INDEX (deptno ) INDEX ( empcnt) ON COMMIT PRESERVE ROWS;

Page 6: Teradata advanced-sql

6

Exercise

Can we COLLECT STATISTICS on GLOBAL TEMP Table, volatile temp

table, derived tables.

Page 7: Teradata advanced-sql

7

WITH...BY for Subtotals

The WITH...BY clause is a Teradata extension that creates subtotallines for a detailed list. The WITH...BY clause allows subtotal"breaks" on more than one column and generates an automatic sorton all "BY" columns.

Creating a Report Using WITH..BY

Problem

To display employee name and salary with subtotals by department.

Solution

SELECT last_name AS NAME

,salary_amount AS SALARY

,department_number AS DEPT

FROM employee

WITH SUM(salary) BY DEPT ;

Page 8: Teradata advanced-sql

8

ResultNAME SALARY DEPTStein 29450.00 301 Kanieski 29250.00 301

------------Sum(Salary) 58700.00

Johnson 36300.00 401 Trader 37850.00 401

------------Sum(Salary) 74150.00

Villegas 49700.00 403 Ryan 31200.00 403

------------Sum(Salary) 80900.00

NOTE:"WITH BY" is a non-ANSI Teradata extension. It is supported by BTEQ, but not supported by ODBC clients.

Page 9: Teradata advanced-sql

9

Creating Final Totals Using WITH

The WITH clause without 'BY' is used to create a grand totals. Problem

Display employee numbers and salary amounts for department 301 and a final total for the department.

SolutionSELECT employee_number

,salary_amount FROM employee WHERE department_number = 301 WITH SUM(salary_amount) (TITLE 'GRAND TOTAL') ORDER BY employee_number ;

Resultemployee_number salary_amount-------------------------------------------------

1006 29450.001008 29250.001019 57700.00

------------GRAND TOTAL 116400.00

NOTE:WITH is a non-ANSI Teradata extension and is not supported by ODBC clients.

Page 10: Teradata advanced-sql

10

DISTINCT Modifier

Use the DISTINCT modifier is used in conjuction with the COUNT aggregate to prevent the same value from being counted more than once.

ProblemCount the number of managers for employees numbered between 1003 and 1008.

SolutionSELECT employee_number

,department_number ,manager_employee_number AS manager

FROM employee WHERE employee_number BETWEEN 1003 AND 1008 WITH COUNT (DISTINCT manager) (TITLE 'TOTAL MANAGERS') ;

Resultemployee_number department_number manager

----------------------------------------------------------1006 301 10191005 403 8011003 401 8011007 403 10051008 301 10191004 401 1003

----------TOTAL MANAGERS 4

This gives you an actual count for the number of managers.

Page 11: Teradata advanced-sql

11

Legal Example:

SELECT COUNT(DISTINCT(job_code)),COUNT(DISTINCT(employee_number)).....

Illegal Example:

SELECT COUNT(DISTINCT(job_code, employee_number)).......

Page 12: Teradata advanced-sql

12

Combining WITH and WITH BY

WITH and WITH BY may be combined within the same query.

ProblemShow the salary for each employee with subtotals by department, a final total, and results sorted by employee name.

Solution

SELECT last_name AS NAME

,salary_amount AS SALARY

,department_number AS DEPT

FROM employee

WHERE employee_number BETWEEN 1003 and 1008

WITH SUM (SALARY) BY DEPT

WITH SUM (SALARY) (TITLE 'GRAND TOTAL')

ORDER BY NAME ;

Page 13: Teradata advanced-sql

13

Results

NAME SALARY DEPT

Kanieski 29250.00 301Stein 29450.00 301

------------

Sum (Salary) 58700.00

Johnson 36300.00 401

Trader 37850.00 401------------

Sum (Salary) 58700.00

Ryan 31200.00 403 Villegas 49700.00 403

------------

Sum (Salary) 58700.00

------------

GRAND TOTAL 213750.00

Page 14: Teradata advanced-sql

14

Summary of WITH...BY and WITH

WITH BY :Will provide subtotals, subcounts, and subaverages, and also show detail rows. Summarylist can specify more than one column. Breaklist can specify more than one column. Implied ORDER BY on the breaklist columns. WITH...BY determines the major sort key(s). ORDER BY specifies any additional minor sorts. An SQL statement can have several WITH...BY clauses. The highest level of sort is the last specified WITH...BY clause. Is not supported by ANSI standard or ODBC clients.

WITH :The WITH clause produces grand total results for the entire answer set. This clause is typically used when you need to produce final totals and also wish to see the detail rows.Will provide a final or grand total, count or average. Summary list may specify more than one column. Only a single WITH is allowed in a query. WITH must follow any WITH...BY syntax. Not supported by ANSI standard or ODBC clients.

Page 15: Teradata advanced-sql

15

Special Index Functions

Create an 'employee' table with a NUSI on the job code.

CREATE SET TABLE employee ,FALLBACK ,

(employee_number INTEGER,manager_employee_number INTEGER,department_number INTEGER,job_code INTEGER,last_name CHAR(20) NOT NULL,first_name VARCHAR(30) NOT NULL,hire_date DATE FORMAT 'YY/MM/DD' NOT NULL,birthdate DATE FORMAT 'YY/MM/DD' NOT NULL,salary_amount DECIMAL(10,2) NOT NULL)

UNIQUE PRIMARY INDEX ( employee_number )INDEX (job_code);

CREATE INDEX (job_code) ON employee;

Page 16: Teradata advanced-sql

16

Drop the NUSI on the job code column of the 'employee' table. DROP INDEX (job_code) ON employee;

Value Ordered NUSI's

CREATE SET TABLE employee ,FALLBACK , ( employee_number INTEGER,manager_employee_number INTEGER, department_number INTEGER, job_code INTEGER,last_name CHAR(20) NOT NULL, first_name VARCHAR(30) NOT NULL,hire_date DATE FORMAT 'YY/MM/DD' NOT NULL, birthdate DATE FORMAT 'YY/MM/DD' NOT NULL,salary_amount DECIMAL(10,2) NOT NULL) UNIQUE PRIMARY INDEX ( employee_number ) INDEX (job_code) ORDER BY VALUES (job_code);

Page 17: Teradata advanced-sql

17

Create a value-ordered NUSI on the job code column of existing

'employee' table.

CREATE INDEX (job_code) ORDER BY VALUES (job_code)

ON employee;

Limitations of Value-Ordered NUSI's

A column defined as a value-ordered index column must be:

A single column

A column which is a part of or all of the index definition

A numeric column – non-numeric are not allowed

No greater than four bytes in length – INT, SMALLINT, BYTEINT, DATE, DEC are valid

Page 18: Teradata advanced-sql

18

Triggers

A trigger is an object in a database, like a macro or view. A trigger iscreated with a CREATE TRIGGER statement and defines events that willhappen when some other event, called a triggering event, occurs. A triggerconsists of one or more SQL statements which are associated with a tableand which are executed when the trigger is 'fired'.

All of the following statements are valid with triggers:

CREATE TRIGGER

DROP TRIGGER

SHOW TRIGGER

ALTER TRIGGER

RENAME TRIGGER

REPLACE TRIGGER

HELP TRIGGER

NOTE: Join indexes are never permitted on tables which have defined triggers.

Page 19: Teradata advanced-sql

19

Triggered and Triggering Statements

A trigger is said to ‘fire’ when the triggering event occurs andvarious conditions are met. When a trigger fires, it causes otherevents, called triggered events to occur. A triggered eventconsists of one or more triggered statements.

A triggering statement is an SQL statement which causes atrigger to fire. It is the 'launching' statement.

Page 21: Teradata advanced-sql

21

Online Analytical Functions

OLAP stands for On-Line Analytical Processing. The new OLAP functions available with Teradata V2R5 permit data mining on a database via SQL.

These functions include: RANK - (Rankings) QUANTILE - (Quantiles) CSUM - (Cumulation) MAVG - (Moving Averages) MSUM - (Moving Sums) MDIFF - (Moving Differences) MLINREG - (Moving Linear Regression)

OLAP functions are similar to aggregate functions in that they: Operate on groups of rows (like the GROUP BY clause) Can filter groups using QUALIFY (like the HAVING clause)OLAP functions are unlike aggregate functions because they: Return a data value for each qualifying row - not group May not be performed within subqueries

OLAP functions may be performed on the following database objects or activities: Tables (Perm, Temp, Derived) Views INSERT/SELECT populations

Page 22: Teradata advanced-sql

22

Cumulative Sum

Cumulative sum (CSUM) computes a running or cumulative total of a column’s value.

The syntax is:

CSUM(colname, sort list)

The 'daily_sales' table is used in many of the subsequent examples and its definition is repeated here for convenience.

CREATE SET TABLE daily_sales

,NO FALLBACK

,NO BEFORE JOURNAL

,NO AFTER JOURNAL

(

itemid INTEGER

,salesdate DATE FORMAT 'YY/MM/DD'

,sales DECIMAL(9,2))

PRIMARY INDEX ( itemid );

Page 23: Teradata advanced-sql

23

ProblemCreate a running daily total for item 10 for Jan and Feb 1998.

SolutionSELECT salesdate, sales, CSUM(sales, salesdate)FROM daily_sales WHERE salesdate BETWEEN 980101 AND 980301AND itemid = 10;

Resultsalesdate sales Csum 98/01/01 150.00 150.00 98/01/02 200.00 350.0098/01/03 250.00 600.00 98/01/05 350.00 950.0098/01/10 550.00 500.00 98/01/21 150.00 1650.00 98/01/25 200.00 1850.0098/01/31 100.00 1950.00 98/02/01 150.00 100.00 98/02/03 250.00 2350.00 98/02/06 350.00 2700.00 98/02/17 550.00 3250.00 98/02/20 450.00 3700.00 98/02/27 350.00 4050.00

Page 24: Teradata advanced-sql

24

Moving Averages

The Moving Average (MAVG) function permits the calculation of a moving average on a specified column, based on a defined number of rows known as the query width. The sortlist column provides the column(s) for the ascending sort of the final result.

This function computes the moving AVG of a column based on some number of preceding rows.

The syntax for Moving Averages is: MAVG(colname, n, sortlist)

colname = the column on which the moving average is computedn = the number of rows (< 4096) which will be used in the calculation including the current row ('n' is also refered to as the 'width' of the average)sortlist = the column(s) which determine the sequencing of the rows

Page 25: Teradata advanced-sql

25

ProblemShow a moving average on a 7 day window for sales of item 10.

SELECT salesdate,itemid, sales,MAVG(sales, 7, salesdate)

WHERE itemid = 10 FROM daily_sales;

Result salesdate itemid sales MAvg

98/01/01 10 150.00 150.00 98/01/02 10 200.00 175.00 Avg of the first 2 rows98/01/03 10 250.00 200.00 98/01/05 10 350.00 237.50 98/01/10 10 550.00 300.00 98/01/21 10 150.00 275.00 98/01/25 10 200.00 264.29 Avg of the first 7 rows98/01/31 10 100.00 257.14 98/02/01 10 150.00 250.0098/02/03 10 250.00 250.00

Page 26: Teradata advanced-sql

26

Moving Sums

The syntax for using Moving Sums is: MSUM(colname, n, sortlist)

ProblemShow a moving sum on a three day window for sales of item 10.

SolutionSELECT salesdate,

itemid,sales, MSUM(sales, 3, salesdate)

WHERE itemid = 10 FROM daily_sales;

Resultsalesdate itemid sales MSum98/01/01 10 150.00 150.00 98/01/02 10 200.00 350.00 Sum of 2 rows98/01/03 10 250.00 600.00 98/01/05 10 350.00 800.00 98/01/10 10 550.00 1150.00 98/01/21 10 150.00 1050.00 98/01/25 10 200.00 900.00 Sum of 3 rows98/01/31 10 100.00 450.00

98/02/01 10 150.00 450.00

Page 27: Teradata advanced-sql

27

Moving Differences

The Moving Difference (MDIFF) function permits a calculation of a moving difference of a specified column, based on a defined query width (n). The syntax for MDIFF is: MDIFF(colname, n, sortlist)

ProblemShow a moving difference on a three day window for sales of item 10.

SolutionSELECT salesdate,

itemid, sales,MDIFF(sales, 3, salesdate)

FROM daily_sales; Result

salesdate itemid sales MDiff98/01/01 10 150.00 ? 98/01/02 10 200.00 ? 98/01/03 10 250.00 ? 98/01/05 10 350.00 200.00 Difference of 2 rows98/01/10 10 550.00 350.00 98/01/21 10 150.00 -100.00 98/01/25 10 200.00 -150.00 98/01/31 10 100.00 -450.00 98/02/01 10 150.00 .OO 98/02/03 10 250.00 50.00 Difference of 2 rows98/02/06 10 350.00 250.00

Page 28: Teradata advanced-sql

28

Cumulative Sum

Cumulative sum (CSUM) computes a running or cumulative total of a column’s value.

The syntax is:

CSUM(colname, sort list)

The 'daily_sales' table is used in many of the subsequent examples and its definition is repeated here for convenience.

CREATE SET TABLE daily_sales

,NO FALLBACK

,NO BEFORE JOURNAL

,NO AFTER JOURNAL

(

itemid INTEGER

,salesdate DATE FORMAT 'YY/MM/DD'

,sales DECIMAL(9,2))

PRIMARY INDEX ( itemid );

Page 29: Teradata advanced-sql

29

ProblemCreate a running daily total for item 10 for Jan and Feb 1998.

SolutionSELECT salesdate, sales, CSUM(sales, salesdate)FROM daily_sales WHERE salesdate BETWEEN 980101 AND 980301AND itemid = 10;

Resultsalesdate sales Csum 98/01/01 150.00 150.00 98/01/02 200.00 350.0098/01/03 250.00 600.00 98/01/05 350.00 950.0098/01/10 550.00 500.00 98/01/21 150.00 1650.00 98/01/25 200.00 1850.0098/01/31 100.00 1950.00 98/02/01 150.00 100.00 98/02/03 250.00 2350.00 98/02/06 350.00 2700.00 98/02/17 550.00 3250.00 98/02/20 450.00 3700.00 98/02/27 350.00 4050.00

Page 30: Teradata advanced-sql

30

Moving Averages

The Moving Average (MAVG) function permits the calculation of a moving average on a specified column, based on a defined number of rows known as the query width. The sortlist column provides the column(s) for the ascending sort of the final result.

This function computes the moving AVG of a column based on some number of preceding rows.

The syntax for Moving Averages is: MAVG(colname, n, sortlist)

colname = the column on which the moving average is computedn = the number of rows (< 4096) which will be used in the calculation including the current row ('n' is also refered to as the 'width' of the average)sortlist = the column(s) which determine the sequencing of the rows

Page 31: Teradata advanced-sql

31

ProblemShow a moving average on a 7 day window for sales of item 10.

SELECT salesdate,itemid, sales,MAVG(sales, 7, salesdate)

WHERE itemid = 10 FROM daily_sales;

Result salesdate itemid sales MAvg

98/01/01 10 150.00 150.00 98/01/02 10 200.00 175.00 Avg of the first 2 rows98/01/03 10 250.00 200.00 98/01/05 10 350.00 237.50 98/01/10 10 550.00 300.00 98/01/21 10 150.00 275.00 98/01/25 10 200.00 264.29 Avg of the first 7 rows98/01/31 10 100.00 257.14 98/02/01 10 150.00 250.0098/02/03 10 250.00 250.00

Page 32: Teradata advanced-sql

32

Moving Sums

The syntax for using Moving Sums is: MSUM(colname, n, sortlist)

ProblemShow a moving sum on a three day window for sales of item 10.

SolutionSELECT salesdate,

itemid,sales, MSUM(sales, 3, salesdate)

WHERE itemid = 10 FROM daily_sales;

Resultsalesdate itemid sales MSum98/01/01 10 150.00 150.00 98/01/02 10 200.00 350.00 Sum of 2 rows98/01/03 10 250.00 600.00 98/01/05 10 350.00 800.00 98/01/10 10 550.00 1150.00 98/01/21 10 150.00 1050.00 98/01/25 10 200.00 900.00 Sum of 3 rows98/01/31 10 100.00 450.00

98/02/01 10 150.00 450.00

Page 33: Teradata advanced-sql

33

Moving Differences

The Moving Difference (MDIFF) function permits a calculation of a moving difference of a specified column, based on a defined query width (n). The syntax for MDIFF is: MDIFF(colname, n, sortlist)

ProblemShow a moving difference on a three day window for sales of item 10.

SolutionSELECT salesdate,

itemid, sales,MDIFF(sales, 3, salesdate)

FROM daily_sales; Result

salesdate itemid sales MDiff98/01/01 10 150.00 ? 98/01/02 10 200.00 ? 98/01/03 10 250.00 ? 98/01/05 10 350.00 200.00 Difference of 2 rows98/01/10 10 550.00 350.00 98/01/21 10 150.00 -100.00 98/01/25 10 200.00 -150.00 98/01/31 10 100.00 -450.00 98/02/01 10 150.00 .OO 98/02/03 10 250.00 50.00 Difference of 2 rows98/02/06 10 350.00 250.00

Page 34: Teradata advanced-sql

34

Ordered Analytic Functions

ROW_NUMBER Function

The ROW_NUMBER function returns the sequential row number of a group starting with the number one.

ROW_NUMBER:

is the same as ANSI RANK function except in the event of a tie.

doesn't report duplicate values, unlike RANK.

Page 35: Teradata advanced-sql

35

Example

SELECT storeid, prodid, sales, ROW_NUMBER() OVER (PARTITION BY storeid ORDER BY sales DESC)AS rank_sales FROM salestbl QUALIFY rank_sales <= 3;

storeid prodid sales rank_sales----------- ------ ----------- -----------1001 F 150000.00 11001 A 100000.00 21001 C 60000.00 31002 A 40000.00 11002 C 35000.00 21002 D 25000.00 31003 B 65000.00 11003 D 50000.00 21003 A 30000.00 3

Page 36: Teradata advanced-sql

36

SUM Window Group Function

SELECT storeid, prodid, sales, SUM(sales) OVER (PARTITION BY prodid ORDER BY sales DESC) FROM salestbl ;

storeid prodid sales Group Sum(sales)----------- ------ ----------- ----------------1001 A 100000.00 170000.001002 A 40000.00 170000.001003 A 30000.00 170000.001003 B 65000.00 65000.001001 C 60000.00 115000.001002 C 35000.00 115000.001003 C 20000.00 115000.001003 D 50000.00 110000.001001 D 35000.00 110000.001002 D 25000.00 110000.001001 F 150000.00 150000.00

Page 37: Teradata advanced-sql

37

SUM Window Cumulative Function

SELECT storeid, prodid, sales, SUM(sales) OVER (ORDER BY sales DESC ROWS UNBOUNDED PRECEDING) FROM salestbl ;

storeid prodid sales Cumulative Sum(sales)----------- ------ ----------- ---------------------1001 F 150000.00 150000.001001 A 100000.00 250000.001003 B 65000.00 315000.001001 C 60000.00 375000.001003 D 50000.00 425000.001002 A 40000.00 465000.001001 D 35000.00 500000.001002 C 35000.00 535000.001003 A 30000.00 565000.001002 D 25000.00 590000.001003 C 20000.00 610000.00

Page 38: Teradata advanced-sql

38

SUM Window Moving Function

SELECT storeid, prodid, sales, SUM(sales) OVER (ORDER BY sales DESC ROWS 2 PRECEDING) FROM salestbl;

storeid prodid sales Moving Sum(sales)----------- ------ ----------- -----------------1001 F 150000.00 150000.001001 A 100000.00 250000.001003 B 65000.00 315000.001001 C 60000.00 225000.001003 D 50000.00 175000.001002 A 40000.00 150000.001001 D 35000.00 125000.001002 C 35000.00 110000.001003 A 30000.00 100000.001002 D 25000.00 90000.001003 C 20000.00 75000.00