Post on 28-Mar-2015
SQL Top-Nand Pagination Pattern
Maxym Kharchenko
What is top-N
• Give me the top 10 salaries in the “Sales” dept• Give me the top 10 best selling books• Give me the 10 latest orders
What is top-N
SetupSQL> @desc cities Name Null? Type -------------------------- -------- --------------- NAME NOT NULL VARCHAR2(100) STATE NOT NULL VARCHAR2(100) POPULATION NOT NULL NUMBER
PCTFREE 99 PCTUSED 1
http://www.census.gov
Naïve Top-N
SELECT name, populationFROM citiesWHERE rownum <= 5ORDER BY population DESC;
NAME Pop---------------------- ------Robertsdale city 5,276Glen Allen town (pt.) 458Boligee town 328Riverview town 184Altoona town (pt.) 30
Give me the top 5 cities by population
Statistics 7 consistent gets
Naïve Top-N explained
-----------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Time |-----------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 110 | 00:00:01 || 1 | SORT ORDER BY | | 5 | 110 | 00:00:01 ||* 2 | COUNT STOPKEY | | | | || 3 | TABLE ACCESS FULL| CITIES | 10 | 220 | 00:00:01 |-----------------------------------------------------------------
Correct top-N query
SELECT name, populationFROM citiesORDER BY population DESCFETCH FIRST 5 ROWS ONLY;
SELECT * FROM ( SELECT name, population FROM cities ORDER BY population DESC) WHERE rownum <= 5;
>= 12c <= 11g
Correct top-N query: Execution
SELECT * FROM ( SELECT name, population FROM cities ORDER BY population DESC) WHERE rownum <= 5;
NAME Pop-------------------- ----------Los Angeles city 3,792,621Chicago city (pt.) 2,695,598Chicago city (pt.) 2,695,598Chicago city 2,695,598New York city (pt.) 2,504,700
Statistics 56024 consistent gets
Reading, filtering and sorting
---------------------------------------------------------------------| Id | Operation | Name | Rows |TempSpc| Time |---------------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | | 00:01:58 ||* 1 | COUNT STOPKEY | | | | || 2 | VIEW | | 56072 | | 00:01:58 ||* 3 | SORT ORDER BY STOPKEY| | 56072 | 1768K| 00:01:58 || 4 | TABLE ACCESS FULL | CITIES | 56072 | | 00:01:54 |---------------------------------------------------------------------
Proper data structure
--------------------------------------------------------------------| Id | Operation | Name | Rows | Time |--------------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 10 | 00:00:01 || 3 | TABLE ACCESS BY INDEX ROWID | CITIES | 56072 | 00:00:01 || 4 | INDEX RANGE SCAN DESCENDING| I_POP | 10 | 00:00:01 |--------------------------------------------------------------------
Ordered By: Population
Statistics 12 consistent gets
CREATE INDEX i_pop ON cities(population);
Why index works
• Colocation• Can stop after reading N rows• No Sort
Ordered By: Population
CREATE INDEX i_pop ON cities(population);
More elaborate top-N
SELECT * FROM ( SELECT name, population FROM cities WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 5;
NAME Pop-------------------- ----------Jacksonville city 821,784Miami city 399,457Tampa city 335,709St. Petersburg city 244,769Orlando city 238,300
Give me the top 5 cities by population in Florida
Statistics 264 consistent gets
Uncertain nature of filtering
WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 5;
WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 200;
Ordered By: Population
Statistics19747 consistent gets
Statistics264 consistent gets
Multi column indexes
State
Population
AKAL CO FLAZ
Ordered By:StateState+Population
Not Ordered by: Population
WHERE state=‘FL’
NOW: Ordered By: Population
MA WA
where state=‘FL’
CREATE INDEX i_state_pop ON cities(state, population);
Multicolumn indexes-------------------------------------------------------------------------| Id | Operation | Name | Rows | Time |-------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 11 | 00:00:01 || 3 | TABLE ACCESS BY INDEX ROWID | CITIES | 1099 | 00:00:01 ||* 4 | INDEX RANGE SCAN DESCENDING| I_STATE_POP | 11 | 00:00:01 |-------------------------------------------------------------------------
Predicate Information (identified by operation id):
1 - filter(ROWNUM<=5) 4 - access("STATE"='Florida')
Statistics 12 consistent gets
Trips to the table-------------------------------------------------------------------------| Id | Operation | Name | Rows | Time |-------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 11 | 00:00:01 || 3 | TABLE ACCESS BY INDEX ROWID | CITIES | 1099 | 00:00:01 ||* 4 | INDEX RANGE SCAN DESCENDING| I_STATE_POP | 11 | 00:00:01 |-------------------------------------------------------------------------
Predicate Information (identified by operation id):
1 - filter(ROWNUM<=5) 4 - access("STATE"='Florida')
Statistics 12 consistent gets
Index range scan: cost mathWindow: 500 records
4-5 logical reads
~ 5-10 logical reads
~ 10-500 logical reads
Covering indexCREATE INDEX i_state_pop_cON cities
(state, population, name);
CREATE INDEX i_state_pop ON cities
(state, population);
--------------------------------------------------------------------------| Id | Operation | Name | Rows | Time |--------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 10 | 00:00:01 ||* 3 | INDEX RANGE SCAN DESCENDING| I_STATE_POP_C | 506 | 00:00:01 |--------------------------------------------------------------------------
Statistics 12 consistent gets
Statistics 7 consistent gets
Ideal top-N
• Use the index• Make the best index• And read only from the index
Less than ideal top-N
• Effect of query conditions
• Effect of deletes and updates
• Technicalities
Condition better!
WHERE active != 'N' ORDER BY order_date DESC) WHERE rownum <= 10;
WHERE active = 'Y' ORDER BY order_date DESC) WHERE rownum <= 10;
Statistics12345 consistent gets
Statistics10 consistent gets
CREATE TABLE orders ( … active char(1) NOT NULL CHECK (active IN ('Y', 'N'))
Trade WHERE for ORDER BY
SELECT * FROM (SELECT * FROM t WHERE a=12 ORDER BY c) ) WHERE rownum <= 10;
WHERE a=12 ORDER BY c
WHERE a=12 ORDER BY b, c
WHERE a=12 AND b=0ORDER BY c
CREATE INDEX t_idx ON t(a, b, c);
Statistics 1200 consistent gets
Statistics 12 consistent gets
Statistics 12 consistent gets
Tolerate filteringSELECT * FROM ( SELECT name, population FROM cities WHERE state != 'Florida' ORDER BY population DESC) WHERE rownum <= 10;
Statistics 28 consistent gets
Tolerate filtering--------------------------------------------------------------------| Id | Operation | Name | Rows | Time |--------------------------------------------------------------------| 0 | SELECT STATEMENT | | 11 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 11 | 00:00:01 ||* 3 | TABLE ACCESS BY INDEX ROWID | CITIES | 55566 | 00:00:01 || 4 | INDEX RANGE SCAN DESCENDING| I_POP | 12 | 00:00:01 |-------------------------------------------------------------------
Predicate Information (identified by operation id):---------------------------------------------------
1 - filter(ROWNUM<=10) 3 - filter("STATE"<>'Florida')
Updates and DeletesSQL> @desc cities2 Name Null? Type ---------------------- -------- ---------------- NAME NOT NULL VARCHAR2(100) STATE NOT NULL VARCHAR2(100) POPULATION NOT NULL NUMBER BUDGET_SURPLUS NOT NULL VARCHAR2(1)
CREATE INDEX i2_pop ON cities2(budget_surplus, population, name);
Updates and Deletes
SELECT * FROM ( SELECT name, population FROM cities2 WHERE budget_surplus='Y' ORDER BY population DESC) WHERE rownum <= 5;
-------------------------------------------------------------------| Id | Operation | Name | Rows | Time |-------------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 12 | 00:00:01 ||* 3 | INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |-------------------------------------------------------------------
Statistics 7 consistent gets
Updates and Deletes
UPDATE cities2 SET budget_surplus='N' WHERE rowid IN ( SELECT * FROM ( SELECT rowid FROM cities2 ORDER BY population DESC ) WHERE rownum <= 200);
-------------------------------------------------------------------| Id | Operation | Name | Rows | Time |-------------------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 12 | 00:00:01 ||* 3 | INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |-------------------------------------------------------------------
Statistics 207 consistent gets
Updates and Deletes
Updates and DeletesALTER TABLE cities2 ADD (version number default 0 NOT NULL);
CREATE INDEX i2_vpop ON cities2(budget_surplus, version, population);
UPDATE cities2 SET version=1WHERE budget_surplus='Y' AND version=0;
0
Y
1
Budget_surplus
Version
Population Y
Y
Budget_surplus
Updates and Deletes
--------------------------------------------------------------------| Id | Operation | Name | Rows | Time |--------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | 00:00:01 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 1 | 00:00:01 ||* 3 | INDEX RANGE SCAN DESCENDING| I2_VPOP | 1 | 00:00:01 |--------------------------------------------------------------------
Statistics 9 consistent gets
SELECT * FROM ( SELECT name, population FROM cities2 WHERE budget_surplus='Y' AND version=1 ORDER BY population DESC) WHERE rownum <= 5;
Pagination
SELECT * FROM ( SELECT name, population FROM cities WHERE state='Florida' ORDER BY population DESC) WHERE rownum <= 10;
SELECT * FROM ( SELECT * FROM ( SELECT name, population, rownum AS rn FROM cities WHERE state='Florida' ORDER BY population DESC ) WHERE rownum <= 20) WHERE rn > 10;
Dumb Pagination
) WHERE rownum <= 20) WHERE rn > 10;
Statistics 22 consistent gets
) WHERE rownum <= 30) WHERE rn > 20;
Statistics 32 consistent gets
Smart paginationSELECT * FROM ( SELECT name, population FROM cities WHERE state='Florida' AND population < 154750 ORDER BY population DESC) WHERE rownum <= 10;
SELECT * FROM ( SELECT * FROM ( SELECT name, population, rownum AS rn FROM cities WHERE state='Florida' ORDER BY population DESC ) WHERE rownum <= 20) WHERE rn > 10;
Statistics 22 consistent gets
Statistics 12 consistent gets
Top-N with joinsSELECT * FROM ( SELECT c.name as city,
c.population, s.capital FROM cities c, states s WHERE c.state_id = s.id AND c.state='Florida' ORDER BY c.population DESC) WHERE rownum <= 5/
state
population
state_id
name
Filter
Order By
Join
Select
Drivingtable:
Joined totable:
idJoin
capitalSelect
Use Nested Loops! Build indexes like this!
Top-N with joins: Good
-------------------------------------------------------| Id | Operation | Name | Rows | Time |-------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:13 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 10 | 00:00:13 || 3 | NESTED LOOPS | | 10 | 00:00:13 ||* 4 | INDEX RANGE SCAN| I_C | 506 | 00:00:07 ||* 5 | INDEX RANGE SCAN| I_S | 1 | 00:00:01 |-------------------------------------------------------
Top-N with joins: Bad
-----------------------------------------------------------| Id | Operation | Name | Rows | Time |-----------------------------------------------------------| 0 | SELECT STATEMENT | | 5 | 00:00:07 ||* 1 | COUNT STOPKEY | | | || 2 | VIEW | | 10 | 00:00:07 ||* 3 | SORT ORDER BY STOPKEY| | 10 | 00:00:07 ||* 4 | HASH JOIN | | 10 | 00:00:07 ||* 5 | INDEX RANGE SCAN | I_C | 506 | 00:00:07 ||* 6 | INDEX RANGE SCAN | I_S | 1 | 00:00:01 |-----------------------------------------------------------
Gotchas?
TMI“Too many indexes”
Thank you!
Query conditions
WHERE state = 'Florida'
State
Population
AKAL CO FLAZ MA WA
where state=‘FL’
WHERE state != 'Florida'
where state != ‘FL’
AKAL FL MA WA
… …GA HI
Watch out for DESC/ASC
WHERE state >= 'Florida'ORDER BY state, population DESC) WHERE rownum <= 10
WHERE state >= 'Florida'ORDER BY state, population) WHERE rownum <= 10
Statistics 12 consistent gets
Statistics 107408 consistent gets
CREATE INDEX i_s_pop ON cities(state, population);
+ SORTNO SORT