Modern SQL in PostgreSQL

174
Still using Windows 3.1? So why stick to SQL-92? @ModernSQL - http://modern-sql.com/ @MarkusWinand

Transcript of Modern SQL in PostgreSQL

Page 1: Modern SQL in PostgreSQL

Still using Windows 3.1? So why stick to

SQL-92?

@ModernSQL - http://modern-sql.com/ @MarkusWinand

Page 2: Modern SQL in PostgreSQL

SQL:1999

Page 3: Modern SQL in PostgreSQL

LATERAL

Page 4: Modern SQL in PostgreSQL

Select-list sub-queries must be scalar[0]:

LATERAL Before SQL:1999

SELECT…,(SELECTcolumn_1FROMt1WHEREt1.x=t2.y)AScFROMt2…

(an atomic quantity that can hold only one value at a time[1])

[0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar

Page 5: Modern SQL in PostgreSQL

Select-list sub-queries must be scalar[0]:

LATERAL Before SQL:1999

SELECT…,(SELECTcolumn_1FROMt1WHEREt1.x=t2.y)AScFROMt2…

(an atomic quantity that can hold only one value at a time[1])

[0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar

✗,column_2

More thanone column? ⇒Syntax error

}More thanone row?

⇒Runtime error!

Page 6: Modern SQL in PostgreSQL

Lateral derived tables lift both limitations and can be correlated:

LATERAL Since SQL:1999

SELECT…,ldt.*FROMt2LEFTJOINLATERAL(SELECTcolumn_1,column_2FROMt1WHEREt1.x=t2.y)ASldtON(true)…

Page 7: Modern SQL in PostgreSQL

Lateral derived tables lift both limitations and can be correlated:

LATERAL Since SQL:1999

SELECT…,ldt.*FROMt2LEFTJOINLATERAL(SELECTcolumn_1,column_2FROMt1WHEREt1.x=t2.y)ASldtON(true)…

“Derived table” meansit’s in the

FROM/JOIN clause

Still “correlated”

Regular join semantics

Page 8: Modern SQL in PostgreSQL

FROMtJOINLATERAL(SELECT…FROM…WHEREt.c=…ORDERBY…LIMIT10)derived_table

‣ Top-N per group

inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.

‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”).

Use-CasesLATERAL

Add proper indexfor Top-N query

http://use-the-index-luke.com/sql/partial-results/top-n-queries

Page 9: Modern SQL in PostgreSQL

FROMtJOINLATERAL(SELECT…FROM…WHEREt.c=…ORDERBY…LIMIT10)derived_table

‣ Top-N per group

inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.

‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”).

‣ Table function arguments

(TABLE often implies LATERAL)

Use-CasesLATERAL

FROMtJOINTABLE(your_func(t.c))

Page 10: Modern SQL in PostgreSQL

LATERAL is the "for each" loop of SQL

LATERAL plays well with outer and cross joins

LATERAL is great for Top-N subqueries

LATERAL can join table functions (unnest!)

LATERAL In a Nutshell

Page 11: Modern SQL in PostgreSQL

LATERAL Availability19

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1 MariaDBMySQL

9.3 PostgreSQLSQLite

9.1 DB2 LUW11gR1[0] 12c Oracle

2005[1] SQL Server[0]Undocumented. Requires setting trace event 22829.[1]LATERAL is not supported as of SQL Server 2016 but [CROSS|OUTER]APPLY can be used for the same effect.

Page 12: Modern SQL in PostgreSQL

GROUPINGSETS

Page 13: Modern SQL in PostgreSQL

Only one GROUPBY operation at a time:

GROUPINGSETS Before SQL:1999

SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

Monthly revenue Yearly revenue

SELECTyear,sum(revenue)FROMtblGROUPBYyear

Page 14: Modern SQL in PostgreSQL

GROUPINGSETS Before SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

SELECTyear,sum(revenue)FROMtblGROUPBYyear

Page 15: Modern SQL in PostgreSQL

GROUPINGSETS Before SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

SELECTyear,sum(revenue)FROMtblGROUPBYyear

UNIONALL

,null

Page 16: Modern SQL in PostgreSQL

GROUPINGSETS Since SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

SELECTyear,sum(revenue)FROMtblGROUPBYyear

UNIONALL

,null

SELECTyear,month,sum(revenue)FROMtblGROUPBYGROUPINGSETS((year,month),(year))

Page 17: Modern SQL in PostgreSQL

GROUPINGSETS are multiple GROUPBYs in one go

() (empty brackets) build a group over all rows

GROUPING (function) disambiguates the meaning of NULL(was the grouped data NULL or is this column not currently grouped?)

Permutations can be created using ROLLUP and CUBE(ROLLUP(a,b,c) = GROUPINGSETS((a,b,c),(a,b),(a),())

GROUPINGSETS In a Nutshell

Page 18: Modern SQL in PostgreSQL

GROUPINGSETS Availability19

9920

0120

0320

0520

0720

0920

1120

1320

15

5.1[0] MariaDB5.0[0] MySQL

9.5 PostgreSQLSQLite

5 DB2 LUW9iR1 Oracle

2008 SQL Server[0]Only ROLLUP

Page 19: Modern SQL in PostgreSQL

WITH (Common Table Expressions)

Page 20: Modern SQL in PostgreSQL

WITH (non-recursive) The ProblemNested queries are hard to read:

SELECT…FROM(SELECT…FROMt1JOIN(SELECT…FROM…)aON(…))bJOIN(SELECT…FROM…)cON(…)

Page 21: Modern SQL in PostgreSQL

Understand

this first

WITH (non-recursive) The ProblemNested queries are hard to read:

SELECT…FROM(SELECT…FROMt1JOIN(SELECT…FROM…)aON(…))bJOIN(SELECT…FROM…)cON(…)

Page 22: Modern SQL in PostgreSQL

Then this...

WITH (non-recursive) The ProblemNested queries are hard to read:

SELECT…FROM(SELECT…FROMt1JOIN(SELECT…FROM…)aON(…))bJOIN(SELECT…FROM…)cON(…)

Page 23: Modern SQL in PostgreSQL

Then this...

WITH (non-recursive) The ProblemNested queries are hard to read:

SELECT…FROM(SELECT…FROMt1JOIN(SELECT…FROM…)aON(…))bJOIN(SELECT…FROM…)cON(…)

Page 24: Modern SQL in PostgreSQL

Finally the first line makes sense

WITH (non-recursive) The ProblemNested queries are hard to read:

SELECT…FROM(SELECT…FROMt1JOIN(SELECT…FROM…)aON(…))bJOIN(SELECT…FROM…)cON(…)

Page 25: Modern SQL in PostgreSQL

CTEs are statement-scoped views:

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)

WITH (non-recursive) Since SQL:1999

Page 26: Modern SQL in PostgreSQL

CTEs are statement-scoped views:

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)

Keyword

WITH (non-recursive) Since SQL:1999

Page 27: Modern SQL in PostgreSQL

CTEs are statement-scoped views:

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)

Name of CTE and (here optional) column names

WITH (non-recursive) Since SQL:1999

Page 28: Modern SQL in PostgreSQL

CTEs are statement-scoped views:

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)

Definition

WITH (non-recursive) Since SQL:1999

Page 29: Modern SQL in PostgreSQL

CTEs are statement-scoped views:

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)

Introduces another CTE

Don't repeat WITH

WITH (non-recursive) Since SQL:1999

Page 30: Modern SQL in PostgreSQL

CTEs are statement-scoped views:

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)

May refer toprevious CTEs

WITH (non-recursive) Since SQL:1999

Page 31: Modern SQL in PostgreSQL

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)AS(SELECT…FROM…)

SELECT…FROMbJOINcON(…)

Third CTE

WITH (non-recursive) Since SQL:1999

Page 32: Modern SQL in PostgreSQL

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)AS(SELECT…FROM…)

SELECT…FROMbJOINcON(…)

No comma!

WITH (non-recursive) Since SQL:1999

Page 33: Modern SQL in PostgreSQL

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)AS(SELECT…FROM…)

SELECT…FROMbJOINcON(…)

Main query

WITH (non-recursive) Since SQL:1999

Page 34: Modern SQL in PostgreSQL

CTEs are statement-scoped views:

WITHa(c1,c2,c3)AS(SELECTc1,c2,c3FROM…),

b(c4,…)AS(SELECTc4,…FROMt1JOINaON(…)),

c(…)AS(SELECT…FROM…)

SELECT…FROMbJOINcON(…)

Read top down

WITH (non-recursive) Since SQL:1999

Page 35: Modern SQL in PostgreSQL

‣ Literate SQL

Organize SQL code toimprove maintainability

‣ Assign column names

to tables produced by valuesor unnest.

‣ Overload tables (for testing)

with queries hide tablesof the same name.

Use-CasesWITH (non-recursive)

http://modern-sql.com/use-case/literate-sql

http://modern-sql.com/use-case/naming-unnamed-columns

http://modern-sql.com/use-case/unit-tests-on-transient-data

Page 36: Modern SQL in PostgreSQL

WITH are the "private methods" of SQL

WITH is a prefix to SELECT

WITH queries are only visible in the SELECT they precede

WITH in detail: http://modern-sql.com/feature/with

WITH (non-recursive) In a Nutshell

Page 37: Modern SQL in PostgreSQL

PostgreSQL “issues”WITH (non-recursive)

In PostgreSQL WITH queries are “optimizer fences”:

WITHcteAS(SELECT*FROMnews)SELECT*FROMcteWHEREtopic=1

Page 38: Modern SQL in PostgreSQL

CTEScanoncte(rows=6370)Filter:topic=1CTEcte->SeqScanonnews(rows=10000001)

PostgreSQL “issues”WITH (non-recursive)

In PostgreSQL WITH queries are “optimizer fences”:

WITHcteAS(SELECT*FROMnews)SELECT*FROMcteWHEREtopic=1

Page 39: Modern SQL in PostgreSQL

CTEScanoncte(rows=6370)Filter:topic=1CTEcte->SeqScanonnews(rows=10000001)

PostgreSQL “issues”WITH (non-recursive)

In PostgreSQL WITH queries are “optimizer fences”:

WITHcteAS(SELECT*FROMnews)SELECT*FROMcteWHEREtopic=1

Page 40: Modern SQL in PostgreSQL

CTEScanoncte(rows=6370)Filter:topic=1CTEcte->SeqScanonnews(rows=10000001)

PostgreSQL “issues”WITH (non-recursive)

In PostgreSQL WITH queries are “optimizer fences”:

WITHcteAS(SELECT*FROMnews)SELECT*FROMcteWHEREtopic=1

CTE doesn't

know about the outer

filter

Page 41: Modern SQL in PostgreSQL

Views and derived tables support "predicate pushdown":SELECT*FROM(SELECT*FROMnews)nWHEREtopic=1;

PostgreSQL “issues”WITH (non-recursive)

Page 42: Modern SQL in PostgreSQL

Views and derived tables support "predicate pushdown":SELECT*FROM(SELECT*FROMnews)nWHEREtopic=1;

BitmapHeapScanonnews(rows=6370)->BitmapIndexScanonidx(rows=6370)Cond:topic=1

PostgreSQL “issues”WITH (non-recursive)

Page 43: Modern SQL in PostgreSQL

PostgreSQL 9.1+ allows DML within WITH:

WITHdeleted_rowsAS(DELETEFROMsource_tblRETURNING*)INSERTINTOdestination_tblSELECT*FROMdeleted_rows;

PostgreSQL ExtensionWITH (non-recursive)

Page 44: Modern SQL in PostgreSQL

1999

2001

2003

2005

2007

2009

2011

2013

2015

5.1[0] MariaDBMySQL[1]

8.4 PostgreSQL3.8.3[2] SQLite

7 DB2 LUW9iR2 Oracle

2005 SQL Server[0]Available MariaDB 10.2 alpha[1]Announced for 8.0: http://www.percona.com/blog/2016/09/01/percona-live-europe-featured-talk-manyi-lu[2]Only for top-level SELECT statements

AvailabilityWITH (non-recursive)

Page 45: Modern SQL in PostgreSQL

WITHRECURSIVE (Common Table Expressions)

Page 46: Modern SQL in PostgreSQL

CREATETABLEt(idNUMERICNOTNULL,parent_idNUMERIC,…PRIMARYKEY(id))

Coping with hierarchies in the Adjacency List Model[0]

WITHRECURSIVE The Problem

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

Page 47: Modern SQL in PostgreSQL

SELECT*FROMtASd0LEFTJOINtASd1ON(d1.parent_id=d0.id)LEFTJOINtASd2ON(d2.parent_id=d1.id)

Coping with hierarchies in the Adjacency List Model[0]

WITHRECURSIVE The Problem

WHEREd0.id=?

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

Page 48: Modern SQL in PostgreSQL

SELECT*FROMtASd0LEFTJOINtASd1ON(d1.parent_id=d0.id)LEFTJOINtASd2ON(d2.parent_id=d1.id)

Coping with hierarchies in the Adjacency List Model[0]

WITHRECURSIVE The Problem

WHEREd0.id=?

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

Page 49: Modern SQL in PostgreSQL

SELECT*FROMtASd0LEFTJOINtASd1ON(d1.parent_id=d0.id)LEFTJOINtASd2ON(d2.parent_id=d1.id)

Coping with hierarchies in the Adjacency List Model[0]

WITHRECURSIVE The Problem

WHEREd0.id=?

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”

Page 50: Modern SQL in PostgreSQL

SELECT*FROMtASd0LEFTJOINtASd1ON(d1.parent_id=d0.id)LEFTJOINtASd2ON(d2.parent_id=d1.id)

WITHRECURSIVE Since SQL:1999

WHEREd0.id=?

WITHRECURSIVEd(id,parent,…)AS(SELECTid,parent,…FROMtblWHEREid=?UNIONALLSELECTid,parent,…FROMdLEFTJOINtblON(tbl.parent=d.id))SELECT*FROMsubtree

Page 51: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

Since SQL:1999WITHRECURSIVE

Page 52: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

Keyword

Since SQL:1999WITHRECURSIVE

Page 53: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

Column list mandatory here

Since SQL:1999WITHRECURSIVE

Page 54: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

Executed first

Since SQL:1999WITHRECURSIVE

Page 55: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

Result sent there

Since SQL:1999WITHRECURSIVE

Page 56: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

Result visible twice

Since SQL:1999WITHRECURSIVE

Page 57: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

Once it becomes

part of the final result

Since SQL:1999WITHRECURSIVE

Page 58: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

Since SQL:1999WITHRECURSIVE

Page 59: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

Second leg of UNION is executed

Since SQL:1999WITHRECURSIVE

Page 60: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

Result sent there again

Since SQL:1999WITHRECURSIVE

Page 61: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

Since SQL:1999WITHRECURSIVE

Page 62: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

It's a loop!

Since SQL:1999WITHRECURSIVE

Page 63: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

It's a loop!

Since SQL:1999WITHRECURSIVE

Page 64: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

It's a loop!

Since SQL:1999WITHRECURSIVE

Page 65: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

n=3 doesn't match

Since SQL:1999WITHRECURSIVE

Page 66: Modern SQL in PostgreSQL

Recursive common table expressions may refer to themselves in the second leg of a UNION[ALL]:

WITHRECURSIVEcte(n)AS(SELECT1UNIONALLSELECTn+1FROMcteWHEREn<3)SELECT*FROMcte

n---123(3rows)

n=3 doesn't matchLoop terminates

Since SQL:1999WITHRECURSIVE

Page 67: Modern SQL in PostgreSQL

Use Cases‣ Row generators

To fill gaps (e.g., in time series), generate test data.

‣ Processing graphs

Shortest route from person A to B in LinkedIn/Facebook/Twitter/…

‣ Finding distinct values

with n*log(N)† time complexity. […many more…]

As shown on previous slide

http://aprogrammerwrites.eu/?p=1391

“[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer performance superior to that of the dedicated graph databases.” event.cwi.nl/grades2013/07-welc.pdf

http://wiki.postgresql.org/wiki/Loose_indexscan

† n … # distinct values, N … # of table rows. Suitable index required

WITHRECURSIVE

Page 68: Modern SQL in PostgreSQL

WITHRECURSIVE is the “while” of SQL

WITHRECURSIVE "supports" infinite loops

Except PostgreSQL, databases generally don't require the RECURSIVE keyword.

DB2, SQL Server & Oracle don’t even know the keyword RECURSIVE, but allow recursive CTEs anyway.

In a NutshellWITHRECURSIVE

Page 69: Modern SQL in PostgreSQL

AvailabilityWITHRECURSIVE19

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1[0] MariaDBMySQL[1]

8.4 PostgreSQL3.8.3[2] SQLite

7 DB2 LUW11gR2 Oracle

2005 SQL Server[0]Expected in 10.2.2[1]Announced for 8.0: http://www.percona.com/blog/2016/09/01/percona-live-europe-featured-talk-manyi-lu[2]Only for top-level SELECT statements

Page 70: Modern SQL in PostgreSQL

SQL:2003

Page 71: Modern SQL in PostgreSQL

FILTER

Page 72: Modern SQL in PostgreSQL

SELECTYEAR,SUM(CASEWHENMONTH=1THENsalesELSE0END)JAN,SUM(CASEWHENMONTH=2THENsalesELSE0END)FEB,…FROMsale_dataGROUPBYYEAR

FILTER The Problem

Pivot table: Years on the Y axis, month on X:

Page 73: Modern SQL in PostgreSQL

SELECTYEAR,SUM(sales)FILTER(WHEREMONTH=1)JAN,SUM(sales)FILTER(WHEREMONTH=2)FEB,…FROMsale_dataGROUPBYYEAR;

FILTER Since SQL:2003SQL:2003 allows FILTER(WHERE…) after aggregates:

Page 74: Modern SQL in PostgreSQL

FILTER Availability19

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1 MariaDBMySQL

9.4 PostgreSQLSQLite

DB2 LUWOracleSQL Server

Page 75: Modern SQL in PostgreSQL

OVER and

PARTITIONBY

Page 76: Modern SQL in PostgreSQL

OVER (PARTITION BY) The ProblemTwo distinct concepts could not be used independently:

‣Merge rows with the same key properties

‣ GROUPBY to specify key properties

‣ DISTINCT to use full row as key

‣ Aggregate data from related rows ‣ Requires GROUPBY to segregate the rows

‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows

Page 77: Modern SQL in PostgreSQL

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) The Problem

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMt

SELECTc1,SUM(c2)totFROMtGROUPBYc1

Page 78: Modern SQL in PostgreSQL

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) The Problem

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

,tot

Page 79: Modern SQL in PostgreSQL

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) The Problem

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

,tot

Page 80: Modern SQL in PostgreSQL

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) Since SQL:2003

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMt

FROMt

,SUM(c2)OVER(PARTITIONBYc1)

Page 81: Modern SQL in PostgreSQL

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works

Page 82: Modern SQL in PostgreSQL

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works

Page 83: Modern SQL in PostgreSQL

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works

Page 84: Modern SQL in PostgreSQL

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works

Page 85: Modern SQL in PostgreSQL

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works

Page 86: Modern SQL in PostgreSQL

SELECTdep,salary,SUM(salary)OVER()FROMemp

OVER (PARTITION BY) How it works

)

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

Page 87: Modern SQL in PostgreSQL

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000

OVER (PARTITION BY) How it works

)PARTITIONBYdep

Page 88: Modern SQL in PostgreSQL

OVER and

ORDERBY(Framing & Ranking)

Page 89: Modern SQL in PostgreSQL

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

OVER (ORDER BY) The Problem

SELECTid,value,FROMtransactionst

Page 90: Modern SQL in PostgreSQL

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

OVER (ORDER BY) The Problem

SELECTid,value,

(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)

FROMtransactionst

Page 91: Modern SQL in PostgreSQL

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

OVER (ORDER BY) The Problem

SELECTid,value,

(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)

FROMtransactionst

Range segregation (<=)not possible with

GROUP BY orPARTITION BY

Page 92: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYid

Page 93: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDING

Page 94: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 95: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 96: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 97: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 98: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 99: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 100: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +20

22 3 -10 +10

333 4 +50 +50

333 5 -30 +20

333 6 -20 .0

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

PARTITIONBYacnt

Page 101: Modern SQL in PostgreSQL

OVER (ORDER BY) Since SQL:2003With OVER(ORDERBYn) a new type of functions make sense:

n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DIST1 1 1 1 0 0.252 2 2 2 0.33… 0.753 3 2 2 0.33… 0.754 4 4 3 1 1

Page 102: Modern SQL in PostgreSQL

‣ Aggregates without GROUPBY

‣ Running totals, moving averages

‣ Ranking‣ Top-N per Group

‣ Avoiding self-joins

[… many more …]

Use Cases

SELECT*FROM(SELECTROW_NUMBER()OVER(PARTITIONBY…ORDERBY…)rn,t.*FROMt)numbered_tWHERErn<=3

AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg

OVER (SQL:2003)

Page 103: Modern SQL in PostgreSQL

OVER may follow any aggregate function

OVER defines which rows are visible at each row

OVER() makes all rows visible at every row

OVER(PARTITIONBY …) segregates like GROUPBY

OVER(ORDERBY…BETWEEN) segregates using <, >

In a NutshellOVER (SQL:2003)

Page 104: Modern SQL in PostgreSQL

1999

2001

2003

2005

2007

2009

2011

2013

2015

5.1[0] MariaDBMySQL[1]

8.4 PostgreSQLSQLite

7 DB2 LUW8i Oracle

2005 SQL Server[0]Available MariaDB 10.2 alpha[1]On the roadmap: http://www.slideshare.net/ManyiLu/optimizer-percona-liveams2015/47

OVER (SQL:2003) AvailabilityHive

Impala

Spark

NuoDB

Page 105: Modern SQL in PostgreSQL

WITHINGROUP

Page 106: Modern SQL in PostgreSQL

SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)

WITHINGROUP The ProblemGrouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Page 107: Modern SQL in PostgreSQL

SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)

WITHINGROUP The ProblemGrouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Number rows

Pick middle one

Page 108: Modern SQL in PostgreSQL

SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)

WITHINGROUP The ProblemGrouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Number rows

Pick middle one

Page 109: Modern SQL in PostgreSQL

SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)

WITHINGROUP The ProblemGrouped rows cannot be ordered prior aggregation.

(how to get the middle value (median) of a set)

Number rows

Pick middle one

Page 110: Modern SQL in PostgreSQL

SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata

Median

Which value?

WITHINGROUP Since 2013SQL:2003 introduced ordered set functions:

Page 111: Modern SQL in PostgreSQL

SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata

WITHINGROUP Since 2013SQL:2003 introduced ordered set functions:

SELECTRANK(123)WITHINGROUP(ORDERBYval)FROMdata

…and hypothetical set-functions:

Page 112: Modern SQL in PostgreSQL

WITHINGROUP Availability19

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1 MariaDBMySQL

9.4 PostgreSQLSQLite

DB2 LUW9iR1 Oracle

2012[0] SQL Server[0]Only as window function (OVER required). Feature request 728969 closed as "won't fix"

Page 113: Modern SQL in PostgreSQL

TABLESAMPLE

Page 114: Modern SQL in PostgreSQL

TABLESAMPLE Availability19

9920

0120

0320

0520

0720

0920

1120

1320

15

5.1 MariaDBMySQL

9.5[0] PostgreSQLSQLite

8.2[0] DB2 LUW8i[0] Oracle

2005[0] SQL Server[0]Not for derived tables

Page 115: Modern SQL in PostgreSQL

SQL:2008

Page 116: Modern SQL in PostgreSQL

FETCHFIRST

Page 117: Modern SQL in PostgreSQL

SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn<=10

FETCHFIRST The ProblemLimit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)

SQL:2003 introduced ROW_NUMBER() to number rows.But this still requires wrapping to limit the result.

And how about databases not supporting ROW_NUMBER()?

Page 118: Modern SQL in PostgreSQL

SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn<=10

FETCHFIRST The ProblemLimit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)

SQL:2003 introduced ROW_NUMBER() to number rows.But this still requires wrapping to limit the result.

And how about databases not supporting ROW_NUMBER()?

Dammit! Let's takeLIMIT

Page 119: Modern SQL in PostgreSQL

SELECT*FROMdataORDERBYxFETCHFIRST10ROWSONLY

FETCHFIRST Since SQL:2008SQL:2008 introduced the FETCHFIRST…ROWSONLY clause:

Page 120: Modern SQL in PostgreSQL

FETCHFIRST Availability19

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1 MariaDB3.19.3[0] MySQL

6.5[1] 8.4 PostgreSQL2.1.0[1] SQLite

7 DB2 LUW12c Oracle

7.0[2] 2012 SQL Server[0]Earliest mention of LIMIT. Probably inherited from mSQL[1]Functionality available using LIMIT[2]SELECTTOPn... SQL Server 2000 also supports expressions and bind parameters

Page 121: Modern SQL in PostgreSQL

SQL:2011

Page 122: Modern SQL in PostgreSQL

OFFSET

Page 123: Modern SQL in PostgreSQL

SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn>10andrn<=20

OFFSET The Problem

How to fetch the rows after a limit?(pagination anybody?)

Page 124: Modern SQL in PostgreSQL

SELECT*FROMdataORDERBYxOFFSET10ROWSFETCHNEXT10ROWSONLY

OFFSET Since SQL:2011SQL:2011 introduced OFFSET, unfortunately!

Page 125: Modern SQL in PostgreSQL

SELECT*FROMdataORDERBYxOFFSET10ROWSFETCHNEXT10ROWSONLY

OFFSET Since SQL:2011SQL:2011 introduced OFFSET, unfortunately!

OFFSETGrab coasters & stickers!

http://use-the-index-luke.com/no-offset

Page 126: Modern SQL in PostgreSQL

OFFSET Since SQL:201119

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1 MariaDB3.20.3[0] 4.0.6[1] MySQL

6.5 PostgreSQL2.1.0 SQLite

9.7[2] 11.1 DB2 LUW12c Oracle

2012 SQL Server[0]LIMIT[offset,]limit: "With this it's easy to do a poor man's next page/previous page WWW application."[1]The release notes say "Added PostgreSQL compatible LIMIT syntax"[2]Requires enabling the MySQL compatibility vector: db2setDB2_COMPATIBILITY_VECTOR=MYS

Page 127: Modern SQL in PostgreSQL

OVER

Page 128: Modern SQL in PostgreSQL

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

Page 129: Modern SQL in PostgreSQL

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt

Page 130: Modern SQL in PostgreSQL

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

Page 131: Modern SQL in PostgreSQL

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

Page 132: Modern SQL in PostgreSQL

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

prevbalance … rn

50 … 190 … 270 … 330 … 4

Page 133: Modern SQL in PostgreSQL

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

prevbalance … rn

50 … 190 … 270 … 330 … 4

Page 134: Modern SQL in PostgreSQL

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

prevbalance … rn

50 … 190 … 270 … 330 … 4

+50+40-20-40

Page 135: Modern SQL in PostgreSQL

SELECT*,balance-COALESCE(LAG(balance)OVER(ORDERBYx),0)FROMt

Available functions:LEAD/LAGFIRST_VALUE/LAST_VALUENTH_VALUE(col,n)FROMFIRST/LASTRESPECT/IGNORENULLS

OVER (SQL:2011) Since SQL:2011SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that:

Page 136: Modern SQL in PostgreSQL

OVER (LEAD, LAG, …) Since SQL:201119

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1[0] MariaDBMySQL

8.4[1] PostgreSQLSQLite

9.5[2] 11.1 DB2 LUW8i[2] 11gR2 Oracle

2012[2] SQL Server[0]Not yet available in MariaDB 10.2.2 (alpha). MDEV-8091[1]No IGNORENULLS and FROMLAST as of PostgreSQL 9.6[2]No NTH_VALUE

Page 137: Modern SQL in PostgreSQL

Temporal Tables (Time Traveling)

Page 138: Modern SQL in PostgreSQL

INSERTUPDATEDELETE

are DESTRUCTIVE

Temporal Tables The Problem

Page 139: Modern SQL in PostgreSQL

CREATETABLEt(...,start_tsTIMESTAMP(9)GENERATEDALWAYSASROWSTART,end_tsTIMESTAMP(9)GENERATEDALWAYSASROWEND,

PERIODFORSYSTEMTIME(start_ts,end_ts))WITHSYSTEMVERSIONING

Temporal Tables Since SQL:2011Table can be system versioned, application versioned or both.

Page 140: Modern SQL in PostgreSQL

ID Data start_ts end_ts1 X 10:00:00

UPDATE...SETDATA='Y'...

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00

DELETE...WHEREID=1

INSERT...(ID,DATA)VALUES(1,'X')

Temporal Tables Since SQL:2011

Page 141: Modern SQL in PostgreSQL

ID Data start_ts end_ts1 X 10:00:00

UPDATE...SETDATA='Y'...

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00

DELETE...WHEREID=1

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00 12:00:00

INSERT...(ID,DATA)VALUES(1,'X')

Temporal Tables Since SQL:2011

Page 142: Modern SQL in PostgreSQL

Although multiple versions exist, only the “current” one is visible per default.

After 12:00:00, SELECT*FROMt doesn’t return anything anymore.

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00 12:00:00

Temporal Tables Since SQL:2011

Page 143: Modern SQL in PostgreSQL

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00 12:00:00

With FOR…ASOF you can query anything you like: SELECT*FROMtFORSYSTEM_TIMEASOFTIMESTAMP'2015-04-0210:30:00'

ID Data start_ts end_ts

1 X 10:00:00 11:00:00

Temporal Tables Since SQL:2011

Page 144: Modern SQL in PostgreSQL

It isn’t possible to define constraints to avoid overlapping periods.

Workarounds are possible, but no fun: CREATETRIGGER

id begin end1 8:00 9:001 9:00 11:001 10:00 12:00

Temporal Tables The Problem

Page 145: Modern SQL in PostgreSQL

SQL:2011 provides means to cope with temporal tables:

PRIMARYKEY(id,periodWITHOUTOVERLAPS)

Temporal Tables Since SQL:2011

Temporal support in SQL:2011 goes way further.

Please read this paper to get the idea:

Temporal features in SQL:2011 http://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf

Page 146: Modern SQL in PostgreSQL

Temporal Tables Since SQL:201119

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1 MariaDBMySQLPostgreSQLSQLite

10.1 DB2 LUW10gR1[0] 12cR1[1] Oracle

2016[2] SQL Server[0]Limited system versioning via Flashback[1]Limited application versioning added (e.g. no WITHOUT OVERLAPS)[2]Only system versioning

Page 147: Modern SQL in PostgreSQL

SQL:2016 (released: 2016-12-14)

Page 148: Modern SQL in PostgreSQL

MATCH_RECOGNIZE (Row Pattern Matching)

Page 149: Modern SQL in PostgreSQL

Row Pattern Matching

Example: Logfile

Page 150: Modern SQL in PostgreSQL

Row Pattern Matching

Time

30 minutes

Example: Logfile

Page 151: Modern SQL in PostgreSQL

Row Pattern Matching

Example: Logfile

Time

30 minutes

Session 1 Session 2 Session 3

Session 4

Example problem:

‣ Average session duration

Two approaches:

‣ Row pattern matching

‣ Start-of-group tagging

Page 152: Modern SQL in PostgreSQL

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(newcont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t

Since SQL:2016Row Pattern Matching

Time

30 minutes

definecontinuation

Oracle doesn’t support avg on intervals — query doesn’t work as shown

Page 153: Modern SQL in PostgreSQL

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(newcont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t

Since SQL:2016Row Pattern Matching

Time

30 minutes

undefinedpattern variable: matches any row

Oracle doesn’t support avg on intervals — query doesn’t work as shown

Page 154: Modern SQL in PostgreSQL

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(newcont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t

Since SQL:2016Row Pattern Matching

Time

30 minutes

any numberof “cont”

rows

Oracle doesn’t support avg on intervals — query doesn’t work as shown

Page 155: Modern SQL in PostgreSQL

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(newcont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t

Since SQL:2016Row Pattern Matching

Time

30 minutes

Very muchlike GROUP BY

Oracle doesn’t support avg on intervals — query doesn’t work as shown

Page 156: Modern SQL in PostgreSQL

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(newcont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t

Since SQL:2016Row Pattern Matching

Time

30 minutes

Very muchlike SELECT

Oracle doesn’t support avg on intervals — query doesn’t work as shown

Page 157: Modern SQL in PostgreSQL

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(newcont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t

Since SQL:2016Row Pattern Matching

Time

30 minutes

Oracle doesn’t support avg on intervals — query doesn’t work as shown

Page 158: Modern SQL in PostgreSQL

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(newcont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t

Since SQL:2016Row Pattern Matching

Time

30 minutes

Oracle doesn’t support avg on intervals — query doesn’t work as shown

Page 159: Modern SQL in PostgreSQL

Row Pattern Matching Before SQL:2016

Time

30 minutes

Now, let’s try using window functions

Page 160: Modern SQL in PostgreSQL

SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,COUNT(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE’1900-01-1')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped

Row Pattern Matching Before SQL:2016

Time

30 minutes

Start-of-group tags

Page 161: Modern SQL in PostgreSQL

SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,COUNT(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE’1900-01-1')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped

Row Pattern Matching Before SQL:2016

Time

30 minutes

number sessions

2222 2 33 3 44 42 3 41

Page 162: Modern SQL in PostgreSQL

SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,COUNT(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE’1900-01-1')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped

Row Pattern Matching Before SQL:2016

Time

30 minutes 2222 2 33 3 44 42 3 41

Page 163: Modern SQL in PostgreSQL

Row Pattern Matching Since SQL:2016

https://www.slideshare.net/MarkusWinand/row-pattern-matching-in-sql2016

Page 164: Modern SQL in PostgreSQL

Row Pattern Matching Availability19

99

2001

2003

2005

2007

2009

2011

2013

2015

MariaDBMySQLPostgreSQLSQLite

DB2 LUW12cR1 Oracle

SQL Server

Page 165: Modern SQL in PostgreSQL

LIST_AGG

Page 166: Modern SQL in PostgreSQL

Since SQL:2016

grp val1 B1 A1 C2 X

LIST_AGG

Page 167: Modern SQL in PostgreSQL

Since SQL:2016

grp val1 B1 A1 C2 X

SELECTgrp,LIST_AGG(val,',')WITHINGROUP(ORDERBYval)FROMtGROUPBYgrp

LIST_AGG

Page 168: Modern SQL in PostgreSQL

Since SQL:2016

grp val1 B1 A1 C2 X

grp val1 A, B, C2 X

SELECTgrp,LIST_AGG(val,',')WITHINGROUP(ORDERBYval)FROMtGROUPBYgrp

LIST_AGG

Page 169: Modern SQL in PostgreSQL

Since SQL:2016

grp val1 B1 A1 C2 X

grp val1 A, B, C2 X

SELECTgrp,LIST_AGG(val,',')WITHINGROUP(ORDERBYval)FROMtGROUPBYgrp

LIST_AGG(val,','ONOVERFLOWTRUNCATE'...'WITHCOUNT)➔'A,B,...(1)'

LIST_AGG(val,','ONOVERFLOWERROR)

Default

LIST_AGG

LIST_AGG(val,','ONOVERFLOWTRUNCATE'...'WITHOUTCOUNT)➔'A,B,...'

Default

Page 170: Modern SQL in PostgreSQL

LIST_AGG Availability19

99

2001

2003

2005

2007

2009

2011

2013

2015

5.1[0] MariaDB4.1[0] MySQL

7.4[1] 8.4[2]9.0[3] PostgreSQL3.5.4[4] SQLite

10.5[5] DB2 LUW11gR1 12cR2 Oracle

SQL Server[6]

[0]group_concat[1]array_to_string[2]array_agg[3]string_agg[4]group_concat without ORDER BY[5]No ON OVERFLOW clause[6]string_agg announced for vNext

[0] group_concat [1] array_to_string [2] array_agg [3] string_agg

[4] group_concat w/o ORDERBY [5] No ONOVERFLOW clause [6] string_agg announced for vNext

Page 171: Modern SQL in PostgreSQL

Also new in SQL:2016 JSON

DATE FORMAT

POLYMORPHIC TABLE FUNCTIONS

Page 172: Modern SQL in PostgreSQL

About @MarkusWinand

‣Training for Developers ‣ SQL Performance (Indexing) ‣ Modern SQL ‣ On-Site or Online

‣SQL Tuning ‣ Index-Redesign ‣ Query Improvements ‣ On-Site or Online

http://winand.at/

Page 173: Modern SQL in PostgreSQL

About @MarkusWinand€0,-

€10-30‣Training for Developers ‣ SQL Performance (Indexing) ‣ Modern SQL ‣ On-Site or Online

‣SQL Tuning ‣ Index-Redesign ‣ Query Improvements ‣ On-Site or Online

sql-performance-explained.com

Page 174: Modern SQL in PostgreSQL

About @MarkusWinand@ModernSQL

http://modern-sql.com‣Training for Developers ‣ SQL Performance (Indexing) ‣ Modern SQL ‣ On-Site or Online

‣SQL Tuning ‣ Index-Redesign ‣ Query Improvements ‣ On-Site or Online