Fun with Analytic Functions
FUN WITH ANALYTIC FUNCTIONS
UTOUG TRAINING DAYS 2017
ABOUT ME
• Born and raised here in UT
• In IT for 10 years, DBA for the last 6
• Databases and data are my hobbies; I’m rather boring
• This isn’t why you’re here though
ANALYTIC FUNCTIONS… SAY WHAT?
• Analytic Functions compute a value based upon a subset of the rows in a query result
• The subset is referred to as “the partition” (unrelated to table partitioning)
• The best way to understand these functions is to compare them to standard aggregate functions (SUM, MIN, MAX, etc.)
AGGREGATE VS. ANALYTIC
[Figure: three result sets side by side: The Data, Aggregate AVG, Analytic Function AVG]
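The comparison can be sketched with two queries against the SCOTT sample schema that the later samples use (this assumes scott.emp is installed and readable). The aggregate form collapses each department to one row. The analytic form keeps every employee row and repeats the department average beside it.

```sql
-- Aggregate AVG: one row per department, employee detail is gone.
select deptno,
       avg(sal) avg_sal_by_deptno
from   scott.emp
group  by deptno
order  by deptno;

-- Analytic AVG: every employee row survives; the department
-- average is repeated on each row of that department's partition.
select ename,
       deptno,
       sal,
       avg(sal) over (partition by deptno) avg_sal_by_deptno
from   scott.emp
order  by deptno, ename;
```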
41 FLAVORS
• 41 different Analytic Functions
• Positional (FIRST, LAST, ROW_NUMBER, LEAD, LAG, RANK, etc.)
• Statistical (CORR, REGR_ functions, NTILE, STDDEV, etc.)
• Aggregate (SUM, AVG, MIN, MAX, etc.)
• Pattern Matching (Find patterns, like V shaped dips in stock ticker data)
• LISTAGG
SAMPLES!
• Samples based on SCOTT schema
• View -> Snippets
THE SYNTAX
It’s not as complicated as it looks
QUICK EXAMPLES
[Figure: The Data, shown next to the Analytic Function AVG result]
select
    ename,
    job,
    deptno,
    avg(sal) over (partition by deptno) avg_sal_by_deptno,
    sal,
    sal/(avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
order by deptno desc;
FUNCTION(<field a>) OVER (PARTITION BY <field b>)
MIX ‘N MATCH
select
    ename,
    job,
    deptno,
    avg(sal) over (partition by deptno) avg_sal_by_deptno,
    sal,
    sal/(avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
order by deptno desc;
select
    ename,
    job,
    deptno,
    min(sal) over (partition by deptno) min_sal_by_deptno,
    sal,
    sal/(min(sal) over (partition by deptno)) pct_of_min
from scott.emp
order by deptno desc;
REAL LIFE
C-LEVEL ASKS EASY QUESTION
“Can you tell me the order that accounts were opened in?” “Can you give me an ordinal number (1st, 2nd, 3rd)?”
row_number() over (partition by acct order by acct_open_date)
WHAT ABOUT WHEN TWO SUB ACCOUNTS ARE OPENED ON THE SAME DAY, CAN YOU MAKE THOSE BE THE SAME?
Original query:
    row_number() over (partition by acct order by acct_open_date)
Alternatives that give ties the same number:
    rank() over (partition by acct order by acct_open_date)
    dense_rank() over (partition by acct order by acct_open_date)
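A small sketch of how the three functions treat a tie. The sub_accounts data set and its values are made up for illustration; only the acct, acct_description and acct_open_date column names come from the slides.

```sql
-- Hypothetical data: account 100 opens two sub accounts on the same day.
with sub_accounts as (
  select 100 acct, 'CHECKING' acct_description, date '2016-01-05' acct_open_date from dual union all
  select 100, 'SAVINGS', date '2016-03-10' from dual union all
  select 100, 'CD',      date '2016-03-10' from dual union all
  select 100, 'LOAN',    date '2016-07-01' from dual
)
select acct,
       acct_description,
       acct_open_date,
       -- always 1, 2, 3, 4; the order of the two tied rows is arbitrary
       row_number() over (partition by acct order by acct_open_date) rn,
       -- ties share a number and the next number is skipped: 1, 2, 2, 4
       rank()       over (partition by acct order by acct_open_date) rnk,
       -- ties share a number and nothing is skipped: 1, 2, 2, 3
       dense_rank() over (partition by acct order by acct_open_date) dense_rnk
from   sub_accounts
order  by acct, acct_open_date;
```

Which one fits depends on whether the ordinal after a tie should skip ahead (RANK) or continue (DENSE_RANK).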
CAN YOU TELL ME HOW LONG IT TAKES BETWEEN ONE ACCOUNT AND ANOTHER?
lag(acct_open_date) over (partition by acct order by acct_open_date)
acct_open_date - lag(acct_open_date) over (partition by acct order by acct_open_date)
• LAG looks at the previous row in the partition
• LEAD looks at the next row in the partition
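As a rough illustration, the two expressions above can be run against a few made-up rows. The sub_accounts data and the output column aliases are assumptions, not the presenter's query. Subtracting two Oracle DATE values yields a number of days.

```sql
-- Hypothetical data: three sub accounts opened 10 and then 20 days apart.
with sub_accounts as (
  select 100 acct, date '2016-01-01' acct_open_date from dual union all
  select 100, date '2016-01-11' from dual union all
  select 100, date '2016-01-31' from dual
)
select acct,
       acct_open_date,
       -- previous open date in the same account (NULL on the first row)
       lag(acct_open_date)  over (partition by acct order by acct_open_date) prev_open_date,
       -- days since the previous open: NULL, 10, 20
       acct_open_date
         - lag(acct_open_date) over (partition by acct order by acct_open_date) days_since_prev_open,
       -- next open date in the same account (NULL on the last row)
       lead(acct_open_date) over (partition by acct order by acct_open_date) next_open_date
from   sub_accounts
order  by acct, acct_open_date;
```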
WHAT SHE REALLY WANTED…
• I just need the sequence patterns, in general
This uses LISTAGG
LISTAGG
• LISTAGG(<string to concatenate>, '<concatenator>') within group (order by <field>)
• LISTAGG(job, ' -> ') within group (order by hiredate)
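A minimal sketch of the second bullet in a full query, assuming the SCOTT schema is available. The job_sequence alias is made up. The first query uses LISTAGG as a plain aggregate; the second adds an OVER clause so that the employee detail rows are kept.

```sql
-- Aggregate form: one row per department, jobs strung together in hire order.
select deptno,
       listagg(job, ' -> ') within group (order by hiredate) job_sequence
from   scott.emp
group  by deptno
order  by deptno;

-- Analytic form: every employee row is kept and the department's
-- full job sequence is repeated on each of them.
select ename,
       deptno,
       listagg(job, ' -> ') within group (order by hiredate)
         over (partition by deptno) job_sequence
from   scott.emp
order  by deptno, hiredate;
```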
NOT GOOD ENOUGH…
• “Can you order those by how common each pattern is?”
• Sure…?
SELECT
    DISTINCT listagg(acct_description, ' -> ') WITHIN GROUP (order by ACCT_OPEN_DATE),
    count(DISTINCT listagg(acct_description, ' -> ') WITHIN GROUP (order by ACCT_OPEN_DATE)) pattern_observance_count
…
Analytic functions can’t go in a GROUP BY clause
DON’T PUT YOUR AF’S WHERE THEY DON’T BELONG
• Use a subquery to get around this
select
    deptno,
    avg(sal) over (partition by deptno) avg_sal_by_deptno,
    sal,
    sal/(avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
order by deptno desc;
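-- Fails with ORA-30483 (window functions are not allowed here):
-- an analytic function cannot appear in the WHERE clause
select
    deptno,
    avg(sal) over (partition by deptno) avg_sal_by_deptno,
    sal,
    sal/(avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
where sal/(avg(sal) over (partition by deptno)) > 1
order by deptno desc;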
-- Works: compute the analytic function in a subquery, filter outside it
select
    deptno, avg_sal_by_deptno, sal, pct_of_average
from (
    select
        deptno,
        avg(sal) over (partition by deptno) avg_sal_by_deptno,
        sal,
        sal/(avg(sal) over (partition by deptno)) pct_of_average
    from scott.emp
)
where pct_of_average > 1
order by deptno desc;
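The same subquery trick is one way to answer the earlier "how common is each pattern" request: build the LISTAGG string per account in an inner query, then count it in an outer one. This is a sketch under assumptions. The sub_accounts data and the open_pattern alias are invented, and the presenter's final query is not shown in the slides.

```sql
-- Hypothetical data: two accounts share a pattern, one differs.
with sub_accounts as (
  select 100 acct, 'CHECKING' acct_description, date '2016-01-05' acct_open_date from dual union all
  select 100, 'SAVINGS',  date '2016-02-01' from dual union all
  select 200, 'CHECKING', date '2016-03-01' from dual union all
  select 200, 'SAVINGS',  date '2016-04-15' from dual union all
  select 300, 'SAVINGS',  date '2016-05-01' from dual union all
  select 300, 'LOAN',     date '2016-06-20' from dual
)
select open_pattern,
       count(*) pattern_observance_count
from (
    -- inner query: one pattern string per account
    select acct,
           listagg(acct_description, ' -> ')
             within group (order by acct_open_date) open_pattern
    from   sub_accounts
    group  by acct
)
-- outer query: the pattern is now an ordinary column, so it can be grouped
group  by open_pattern
order  by pattern_observance_count desc;
```

With this data, 'CHECKING -> SAVINGS' is counted twice and 'SAVINGS -> LOAN' once.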
GETTING ROLLED…
Can you tell me the transactions an account has done? Can you sum the Amounts?
NO, COULD YOU SUM UP THE AMOUNTS FOR EACH MONTH, BUT DON'T HIDE THE TRANSACTION DETAILS?
[Figure: Original Data, plain sum(amount), and the analytic monthly_total side by side]
sum(amount) over (partition by trunc(business_date,'MM'), acct_num) monthly_total
COULD YOU BREAK IT OUT BY THE TYPE OF TRANSACTION IT WAS? DEBIT VS. CREDIT?
sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type) monthly_total
sum(amount) over (partition by trunc(business_date,'MM'), acct_num) monthly_total
• Nulls in a partition column are treated together
• Same partition => same total
• Different partition => different total
COULD YOU MAKE A ROLLING SUM TOO, BROKEN OUT THE SAME WAY?
sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type) monthly_total,
sum(amount) over (partition by trunc(business_date,'MM'), acct, suffix, tran_type
    order by acct_seq_num) rolling_monthly_total
PERFECT, BUT COULD YOU EXCLUDE THE CURRENT TRANSACTION FROM THE ROLLING MONTHLY TOTAL?
sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type) monthly_total,
sum(amount) over (partition by trunc(business_date,'MM'), acct, suffix, tran_type
    order by acct_seq_num) rolling_monthly_total,
sum(amount) over (partition by trunc(business_date,'MM'), acct, suffix, tran_type
    order by acct_seq_num
    ROWS BETWEEN UNBOUNDED PRECEDING and 1 PRECEDING) roll_mnthly_tot_excl_cur_tran
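To see the three totals next to each other, here is a sketch over a handful of invented transactions. The transactions data set is hypothetical, and it partitions on acct_num throughout because the acct and suffix columns from the slide are not defined anywhere in the deck. A windowing clause such as ROWS BETWEEN needs an ORDER BY inside the OVER clause.

```sql
-- Hypothetical data: one account, November and December 2016.
with transactions as (
  select 1001 acct_num, 1 acct_seq_num, date '2016-11-02' business_date, 'DEBIT' tran_type, 50 amount from dual union all
  select 1001, 2, date '2016-11-10', 'DEBIT',  25  from dual union all
  select 1001, 3, date '2016-11-20', 'CREDIT', 200 from dual union all
  select 1001, 4, date '2016-11-28', 'DEBIT',  10  from dual union all
  select 1001, 5, date '2016-12-03', 'DEBIT',  40  from dual
)
select acct_num, acct_seq_num, business_date, tran_type, amount,
       -- whole-partition total: November debits all show 85
       sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type) monthly_total,
       -- running total up to and including this row: November debits show 50, 75, 85
       sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type
                         order by acct_seq_num) rolling_monthly_total,
       -- running total of earlier rows only: November debits show NULL, 50, 75
       sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type
                         order by acct_seq_num
                         rows between unbounded preceding and 1 preceding) roll_mnthly_tot_excl_cur_tran
from   transactions
order  by acct_num, acct_seq_num;
```

The first row of every partition gets NULL in the last column because no rows precede it; wrap the expression in NVL(..., 0) if a zero is preferred.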
ROWS AND RANGE – SUB PARTITIONS
• ROWS BETWEEN UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING
• ROWS BETWEEN UNBOUNDED PRECEDING and X PRECEDING
• ROWS counts physical rows relative to the current row
• RANGE is a numeric or date offset from the current row’s ORDER BY value
• PRECEDING is before the current row
• FOLLOWING is after the current row
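The difference between ROWS and RANGE shows up most clearly when two rows share the same ORDER BY value. The sketch below uses an invented payments data set: ROWS steps back one physical row, while RANGE steps back one day and includes every row that falls in that date range.

```sql
-- Hypothetical data: two payments fall on the same day.
with payments as (
  select 1 pay_id, date '2016-11-01' pay_date, 10 amount from dual union all
  select 2, date '2016-11-02', 20 from dual union all
  select 3, date '2016-11-02', 30 from dual union all
  select 4, date '2016-11-05', 40 from dual
)
select pay_id, pay_date, amount,
       -- previous physical row plus this one: 10, 30, 50, 70
       sum(amount) over (order by pay_date, pay_id
                         rows between 1 preceding and current row) sum_rows,
       -- every row dated from one day earlier through this row's date: 10, 60, 60, 40
       sum(amount) over (order by pay_date
                         range between 1 preceding and current row) sum_range
from   payments
order  by pay_date, pay_id;
```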
SIMPLE EXAMPLE
lead(row_number) over (partition by 'X' order by row_number) next_number,
first_value(row_number) over (partition by 'X' order by row_number
    rows between 2 FOLLOWING and 3 FOLLOWING) number_after_the_next_number,
sum(row_number) over (partition by 'X' order by row_number
    rows between 1 FOLLOWING and 2 FOLLOWING) sum_of_next_2_nums,
sum(row_number) over (partition by 'X' order by row_number
    rows between 1 FOLLOWING and UNBOUNDED FOLLOWING) sum_nums_from_next_to_the_end,
sum(row_number) over (partition by 'X' order by row_number
    rows between 1 PRECEDING and 1 FOLLOWING) sum_nums_1_before_to_1_after
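To try these windows out, something like the following should work. It generates the numbers 1 through 5 with CONNECT BY and calls the column n, which is an assumption; the slide's source table is not shown. Leaving out PARTITION BY treats the whole result as one partition, the same effect as partitioning by the constant 'X'.

```sql
-- Hypothetical row source: the numbers 1..5.
with nums as (
  select level n from dual connect by level <= 5
)
select n,
       lead(n) over (order by n) next_number,
       -- first value of the window that starts two rows ahead
       first_value(n) over (order by n
                            rows between 2 following and 3 following) number_after_the_next_number,
       sum(n) over (order by n
                    rows between 1 following and 2 following) sum_of_next_2_nums,
       sum(n) over (order by n
                    rows between 1 preceding and 1 following) sum_nums_1_before_to_1_after
from   nums
order  by n;
```

For n = 1 this gives next_number 2, number_after_the_next_number 3, sum_of_next_2_nums 5 and sum_nums_1_before_to_1_after 3. Windows that run past the end of the partition simply contain fewer rows, or none, in which case the result is NULL.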
FILLING HOLES
Can you tell me what a drawer’s end-of-day totals are each day?
Lots of missing days. How can we fill in those gaps?
LET’S GET THE NEXT USED DATE ON EACH ROW
lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date) next_used_date
Let’s fix this null (the last row in each partition has no next row, so LEAD returns NULL)
AF’S CAN BE USED ALMOST ANYWHERE
case
    when lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date) is null
    then branch_date
    else lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date)
end next_used_date,
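A shorter alternative worth knowing: LEAD accepts an offset and a default as its second and third arguments, and the default is returned when there is no next row. The sketch below uses an invented drawer_totals data set. It gives the same result as the CASE expression as long as branch_date itself is never NULL.

```sql
-- Hypothetical data: one cash drawer used on three days.
with drawer_totals as (
  select 'B1' branch_code, 1 cashbox_id, date '2016-11-15' branch_date from dual union all
  select 'B1', 1, date '2016-11-16' from dual union all
  select 'B1', 1, date '2016-11-19' from dual
)
select branch_code, cashbox_id, branch_date,
       -- offset 1 = next row; default branch_date when there is no next row
       lead(branch_date, 1, branch_date)
         over (partition by branch_code, cashbox_id order by branch_date) next_used_date
from   drawer_totals
order  by branch_code, cashbox_id, branch_date;
```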
NULLS FIXED!
[Figure: result set before and after the null fix]
But we still have gaps…
JOIN THIS TO A “CALENDAR”
Begin Date: to_date('20161101','YYYYMMDD')
Row limit (10000 here): some big number larger than how far you want to go back. This would calculate out the “End Date”
SELECT
    to_date('20161101','YYYYMMDD') + ROWNUM - 1 calendar_date
FROM (
    SELECT 1 just_a_column
    FROM dual
    CONNECT BY LEVEL <= 10000
)
* 20161101 is shorthand for to_date('20161101','YYYYMMDD')
JOINING TO A CALENDAR
WHERE calendar_date BETWEEN branch_date and next_used_date - 1
• 20161115 is between 20161115 and (20161116 - 1)
• The 20th is missing, but 20161120 is between 20161119 and (20161121 - 1)
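Putting the pieces together, a gap-filling query might look roughly like this. It is a sketch under assumptions: the drawer_totals data and the end_of_day_total column are invented, the calendar covers only November 2016, and the last row's next_used_date defaults to branch_date + 1 (rather than branch_date) so that the final day survives the BETWEEN filter.

```sql
with drawer_totals as (
  -- Hypothetical data: the 17th, 18th and 20th are missing.
  select 'B1' branch_code, 1 cashbox_id, date '2016-11-15' branch_date, 100 end_of_day_total from dual union all
  select 'B1', 1, date '2016-11-16', 150 from dual union all
  select 'B1', 1, date '2016-11-19', 175 from dual union all
  select 'B1', 1, date '2016-11-21', 90  from dual
),
with_next as (
  -- Each row learns the next date on which the same drawer was used.
  select t.*,
         lead(branch_date, 1, branch_date + 1)
           over (partition by branch_code, cashbox_id order by branch_date) next_used_date
  from   drawer_totals t
),
calendar as (
  -- One row per day, starting at the begin date.
  select date '2016-11-01' + level - 1 calendar_date
  from   dual
  connect by level <= 30
)
select c.calendar_date,
       w.branch_code,
       w.cashbox_id,
       w.end_of_day_total
from   with_next w
join   calendar  c
  on   c.calendar_date between w.branch_date and w.next_used_date - 1
order  by w.branch_code, w.cashbox_id, c.calendar_date;
```

With this data the result has one row for every day from the 15th through the 21st, and each missing day carries forward the total from the most recent day the drawer was used.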
FILLED GAPS – THANKS TO AN AF
[Figure: result set before and after the gaps were filled]
HOW BIG IS THAT CANYON?
• Department wanted to know details of accounts going negative
• They wanted to know how deep and how wide the “canyon” was when looking at a daily history of account balances
[Figure: line chart of daily account balance, y-axis from -2000 to 1500, annotated: How deep? How wide? Start time? End time?]
USE PATTERN MATCHING (12C)
[Figure: The Data (balance line chart, y-axis from -500 to 1500) next to The Result]
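The presenter's query is not reproduced in the slides, so the following is only a guess at the general shape: a MATCH_RECOGNIZE clause (Oracle 12c and later) that treats each run of consecutive negative-balance days as one match. The balances data set, its column names, and the measure aliases are all invented for illustration.

```sql
-- Hypothetical data: account 1001 is negative for three days in a row.
with balances as (
  select 1001 acct_num, date '2016-11-01' balance_date, 500 balance from dual union all
  select 1001, date '2016-11-02',   200 from dual union all
  select 1001, date '2016-11-03',  -300 from dual union all
  select 1001, date '2016-11-04', -1500 from dual union all
  select 1001, date '2016-11-05',  -400 from dual union all
  select 1001, date '2016-11-06',   250 from dual
)
select *
from   balances
match_recognize (
  partition by acct_num
  order by balance_date
  measures
    first(neg.balance_date) as start_date,       -- start time
    last(neg.balance_date)  as end_date,         -- end time
    count(*)                as days_wide,        -- how wide
    min(neg.balance)        as deepest_balance   -- how deep
  one row per match
  pattern (neg+)                  -- one or more consecutive negative days
  define neg as neg.balance < 0
);
```

For this data the query returns a single canyon for account 1001, running from 2016-11-03 to 2016-11-05, three days wide, with a deepest balance of -1500.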
THINGS YOU CAN DO WITH IT:
• Find V, W and other patterns in Stock Prices
• Find timeframes of high database use
• Group clicks in web logs into sessions
• Detect traversal patterns of Finite State Machines
• We won’t go much deeper… but look into these, they’re neat!
NOT COMPLICATED, JUST INVOLVED
• Used wherever you can put data into a line graph, i.e. data is a log of events
• Lots of great resources:
• Ask Tom - http://www.oracle.com/technetwork/issue-archive/2013/13-nov/o63asktom-2034271.html
• GitHub - https://github.com/oracle/analytical-sql-examples/tree/master/pattern-matching
• Burleson - http://www.dba-oracle.com/t_sql_match_recognize.htm
• YouTube has some good demos too
AF PERFORMANCE?
• Keep an eye on performance – these do lots of sorts
• Try to use indexes, filter your data before applying analytic functions
• Sometimes AF’s can improve performance, other times they can reduce it
• Tom Kyte says: In general, analytics are great for answering "really big" questions or questions against "small sets"
  https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1137250200346660664
QUESTIONS?