Robust query processing
Goetz Graefe, Christian König, Harumi Kuno, Volker Markl, Kai-Uwe Sattler
Dagstuhl – September 2010
April 18, 2023 Dagstuhl - Robust Query Processing 2
Max-diff histograms
True distribution Average value
Equal width
Equal area
Max-diff
Equal height?
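The slide only names the histogram types. As a rough illustration of the max-diff idea, the sketch below places bucket boundaries at the largest differences between adjacent frequencies and stores the average value per bucket. It assumes the frequency-difference variant of max-diff; the slide does not say which variant it plots. The function name `max_diff_buckets` and the sample data are hypothetical.

```python
def max_diff_buckets(freqs, num_buckets):
    """Split a frequency vector (one entry per distinct value, in value order)
    into buckets whose boundaries fall at the largest adjacent differences.
    Assumption: differences are taken on frequencies, not on areas."""
    if num_buckets < 1:
        raise ValueError("need at least one bucket")
    if not freqs:
        return []
    # Difference between each pair of neighbouring frequencies,
    # tagged with the position where a boundary would be placed.
    gaps = [(abs(freqs[i + 1] - freqs[i]), i + 1) for i in range(len(freqs) - 1)]
    # Keep the (num_buckets - 1) largest gaps; ties go to the leftmost gap.
    gaps.sort(key=lambda g: (-g[0], g[1]))
    cuts = sorted(pos for _, pos in gaps[:num_buckets - 1])
    bounds = [0] + cuts + [len(freqs)]
    # Each bucket records its half-open range and the average frequency inside it.
    return [(lo, hi, sum(freqs[lo:hi]) / (hi - lo))
            for lo, hi in zip(bounds, bounds[1:])]


# Hypothetical data: three plateaus separated by two large jumps.
freqs = [10, 11, 9, 40, 42, 41, 5, 6]
buckets = max_diff_buckets(freqs, 3)
```

With three buckets, the two boundaries land on the two large jumps, so each bucket covers one plateau and its average is close to every value inside it.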
Histograms with slope
True distribution Average value
Linear regression
Max-diff with slope
Max-diff
Slope, patterns, extrapolation
Detecting query slowdown
[Figure: elapsed time (axis 0.00 to 45.00) over query executions 1 to 59; series: measured values, slow average, fast average]
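The chart shows measured values together with a slow and a fast average. One plausible reading, sketched below, is that a slowdown is flagged when the fast average rises well above the slow one. This is an assumption: the slide does not define either average or the trigger condition. The exponential smoothing, the function name `detect_slowdown`, and the parameters `slow_alpha`, `fast_alpha`, and `ratio` are all hypothetical.

```python
def detect_slowdown(times, slow_alpha=0.05, fast_alpha=0.5, ratio=1.5):
    """Return the index of the first execution at which the fast average
    exceeds ratio * slow average, or None if no slowdown is flagged.
    Both averages are exponential moving averages (an assumption)."""
    slow = fast = None
    for i, t in enumerate(times):
        if slow is None:
            slow = fast = t          # seed both averages with the first sample
            continue
        fast += fast_alpha * (t - fast)   # reacts quickly to recent executions
        if fast > ratio * slow:
            return i
        slow += slow_alpha * (t - slow)   # tracks the long-term baseline
    return None


# Hypothetical traces: a steady workload and one that triples at execution 20.
steady = [10.0] * 25
degraded = [10.0] * 20 + [30.0] * 5
first_flag = detect_slowdown(degraded)
```

The slow average is updated only after the check, so a sudden regression is compared against the baseline from before it occurred.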
External merge sort
• Initial runs: size M, count N/M
• Merge fan-in F = M − read-ahead buffers
• Merge depth = merge levels = log_F (N/M)
[Figure: merge with fan-in F; input runs of size M, output run of size F×M]
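The three quantities on this slide can be computed directly. The sketch below is a simplified model, assuming N and M are measured in pages and that the merge depth is rounded up to whole levels; the function name `merge_plan` and the example sizes are hypothetical.

```python
import math


def merge_plan(n_pages, m_pages, read_ahead_buffers=0):
    """Return (initial run count, merge fan-in, merge levels) for an input of
    n_pages sorted with m_pages of memory. Simplified model of the slide:
    runs of size M, fan-in F = M - read-ahead buffers."""
    runs = math.ceil(n_pages / m_pages)        # count N/M, rounded up
    fan_in = m_pages - read_ahead_buffers      # F = M - read-ahead buffers
    if fan_in < 2:
        raise ValueError("fan-in must be at least 2 to make progress")
    # Count merge levels by repeated division instead of a floating-point
    # logarithm; the result equals ceil(log_F(runs)).
    levels = 0
    remaining = runs
    while remaining > 1:
        remaining = math.ceil(remaining / fan_in)
        levels += 1
    return runs, fan_in, levels


# Hypothetical sizes: 1,000,000 input pages, 100 memory pages, 10 read-ahead buffers.
plan = merge_plan(1_000_000, 100, read_ahead_buffers=10)
```

The example shows the tradeoff between fan-in and read-ahead: giving up 10 pages for read-ahead lowers the fan-in from 100 to 90, which here raises the merge depth from 2 levels to 3.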
Hybrid hash join
• Applies if M < N1 ≤ F×M
– 1 < N1/M ≤ F
– 0 < log_F (N1/M) ≤ 1
• Actual fan-out K: 1 < K ≤ F
– Hash table + K output buffers
– (M−K) + (K×M) ≥ N1
– K ≥ (N1−M) / (M−1)
• Fairly smooth cost function
– Eases query optimization
– Eases memory management
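The fan-out bound above follows from rearranging (M−K) + (K×M) ≥ N1 into M + K×(M−1) ≥ N1. The sketch below picks the smallest integer K that satisfies it, assuming N1, M, and F are given in pages and F ≤ M; the function name `hybrid_fan_out` and the example sizes are hypothetical.

```python
def hybrid_fan_out(n1, m, f):
    """Smallest fan-out K with (M - K) + K*M >= N1, per the slide's bound
    K >= (N1 - M) / (M - 1). Returns 0 when the build input fits in memory."""
    if m < 2:
        raise ValueError("need at least two pages of memory")
    if n1 <= m:
        return 0                       # plain in-memory hash join, no spilling
    if n1 > f * m:
        raise ValueError("input too large for a single partitioning level")
    # Integer ceiling of (N1 - M) / (M - 1), avoiding floating point.
    return -(-(n1 - m) // (m - 1))


# Hypothetical sizes: 1,000-page build input, 100 pages of memory, fan-out limit 90.
k = hybrid_fan_out(1000, 100, 90)
in_memory_pages = 100 - k              # pages left for the in-memory hash table
```

Because K grows by one for roughly every M−1 additional input pages, the amount spilled rises gradually with N1, which is the smoothness the slide refers to.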
Merging vs. partitioning
Duality of sorting & hashing
Issue | Sorting | Hashing
In-memory algorithm | Quicksort etc. | “Classic” hashing
Large (& very large) inputs | (Multi-level) merging | (Recursive) partitioning
Tradeoffs | Fan-in vs. large I/O & read-ahead | Fan-out vs. large I/O & write-behind
Partial levels | On-demand spilling (SIGMOD 98) | Hybrid hashing
Multiple inputs | “Interesting orders” | Hash teams
Multiple optimization techniques are needed to find this plan
• Join clause inferred between line item & part supply
• Group-by list reduced by functional dependencies
• Grouping (on alternative column) pushed down through join
• “Interesting orderings” between scans, joins, grouping
Multiple optimization techniques in a hash-based plan
Same as previous example, plus
• Integrated hash operation … … within a hash team
• Disk-order scans
Star joins: semi-join reduction
First, join each dimension table with an index of the fact table; then, (hash-) intersect bookmark lists; finally, fetch fact table rows
Also considered: Cartesian products of dimension tables
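The three steps of the semi-join reduction can be sketched with in-memory dictionaries standing in for tables and indexes. This is an illustration under stated assumptions, not an engine implementation: the helper names `build_index` and `star_join`, the row-id bookmarks, and the sample schema are all hypothetical.

```python
from collections import defaultdict


def build_index(fact, column):
    """Hypothetical single-column index: key value -> set of row ids (bookmarks)."""
    index = defaultdict(set)
    for rid, row in fact.items():
        index[row[column]].add(rid)
    return index


def star_join(fact, indexes, dim_keys):
    """dim_keys maps a fact-table column to the dimension keys that survive
    that dimension's predicate. Returns the qualifying fact rows."""
    bookmark_lists = []
    for column, keys in dim_keys.items():
        # Step 1: join the dimension keys with the fact-table index.
        rids = set()
        for key in keys:
            rids |= indexes[column].get(key, set())
        bookmark_lists.append(rids)
    # Step 2: intersect the bookmark lists (Python sets are hash-based).
    if bookmark_lists:
        qualifying = set.intersection(*bookmark_lists)
    else:
        qualifying = set(fact)
    # Step 3: fetch only the qualifying fact rows, in bookmark order.
    return [fact[rid] for rid in sorted(qualifying)]


# Hypothetical fact table keyed by row id, with two dimension foreign keys.
fact = {
    1: {"date_id": 1, "store_id": 10, "amount": 5},
    2: {"date_id": 1, "store_id": 20, "amount": 7},
    3: {"date_id": 2, "store_id": 10, "amount": 9},
    4: {"date_id": 3, "store_id": 10, "amount": 4},
}
indexes = {c: build_index(fact, c) for c in ("date_id", "store_id")}
rows = star_join(fact, indexes, {"date_id": {1, 2}, "store_id": {10}})
```

Only rows 1 and 3 are fetched from the fact table; rows 2 and 4 are eliminated while working on bookmarks alone.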
Symmetric semi-join reduction
Index T1 (a, s) Index T2 (a, s)
Join “T1.a = T2.a”
Select … from T1 join T2 on T1.a = T2.a where …
Fetch using T1.s
Fetch using T1.s
Fields T1.s, T2.s
Fields T1.*, T2.s
Fields T2.a, T2.s
Fields T1.*, T2.*
Index-to-index navigation performance
Single-table execution times
[Figure: time in seconds (axis 0.00 to 1,000.00) versus row count; series: Scan plan, Fetch plan, Join plan, Fetch 9115, Hash join, Merge join, Join + fetch, Trad. fetch]
2-dimensional parameter space
Fast loads and fast queries
[Figure: query performance versus load bandwidth; points: multiple indexes, no indexes or statistics, zone maps, partitioned B-trees, zone filters, zone indexes, “?”]
Traditional index choices
• Don’t index. Scan for each query
– No cost for index creation
• Index creation before query processing
– Useful for predictable workloads
• “Monitoring and tuning” wizard
– Extra effort, hard to predict
[Figure labels: scan; index creation; index searches; adaptive indexing; index tuning]
Adaptive merging in partitioned B-trees
[Figure: partitions #0 to #4 of a partitioned B-tree, each spanning keys a to z, shown for run generation, for merging, and after merging key range a-j]
Adaptive merging vs database cracking
[Figure panels: database cracking; improved cracking; adaptive merging]
Tree of losers
• Traditional priority queue
– Enter and exit at root
– 2 log_2 M comparisons
• Tree of winners
– Enter at leaf, exit at root
– log_2 M comparisons
– Specific entry points
– Duplicate entries
– M/2 entries
• Tree of losers
– Enter at leaf, exit at root
– No duplicates, M entries
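[Figure: tree of losers over 8 runs with current keys 0: F, 1: G, 2: E, 3: D, 4: A, 5: D, 6: C, 7: B. Array slot 0 holds the overall winner (run 4, key A); slot 1 holds run 3, key D; slots 2 and 3 hold 0: F and 7: B; slots 4 to 7 hold 1: G, 2: E, 5: D, 6: C]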
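A minimal sketch of a tree of losers, using the example keys on this slide, may help make the "enter at leaf, exit at root" behavior concrete. It assumes the number of runs is a power of two and stores run numbers (not keys) in the array; the function names `build_loser_tree`, `replace_winner`, and `merge_runs` are hypothetical.

```python
def build_loser_tree(keys):
    """keys[r] is the current key of run r; len(keys) must be a power of two.
    Returns an array where slot 0 holds the winning run and slots 1..M-1
    hold the loser of the comparison played at each internal node."""
    m = len(keys)
    tree = [None] * m
    winners = [None] * (2 * m)            # scratch: winner of each subtree
    for r in range(m):
        winners[m + r] = r                # leaves are the runs themselves
    for node in range(m - 1, 0, -1):
        a, b = winners[2 * node], winners[2 * node + 1]
        win, lose = (a, b) if keys[a] <= keys[b] else (b, a)
        winners[node] = win
        tree[node] = lose
    tree[0] = winners[1]
    return tree


def replace_winner(tree, keys, new_key):
    """Give the winning run a new key, then replay its path: enter at the
    leaf, exit at the root, with one comparison per level."""
    m = len(tree)
    run = tree[0]
    keys[run] = new_key
    node = (m + run) // 2
    while node >= 1:
        if keys[tree[node]] < keys[run]:  # stored loser beats the candidate
            tree[node], run = run, tree[node]
        node //= 2
    tree[0] = run
    return run


def merge_runs(runs):
    """Merge sorted runs (count must be a power of two) with a tree of losers."""
    iterators = [iter(r) for r in runs]
    end = (1, None)                       # sorts after every real key (0, v)

    def next_key(r):
        for value in iterators[r]:
            return (0, value)
        return end

    keys = [next_key(r) for r in range(len(runs))]
    tree = build_loser_tree(keys)
    out = []
    while keys[tree[0]] != end:
        run = tree[0]
        out.append(keys[run][1])
        replace_winner(tree, keys, next_key(run))
    return out


# The slide's example: runs 0..7 with current keys F, G, E, D, A, D, C, B.
slide_tree = build_loser_tree(list("FGEDADCB"))
merged = merge_runs([[1, 5, 9], [2, 6], [3, 7, 8], [4]])
```

For the slide's keys, `slide_tree` puts run 4 (key A) in slot 0 and run 3 (key D) in slot 1, matching the figure. Each replacement costs one comparison per tree level, and every run appears exactly once in the array.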
Graceful degradation
• Exploit large memory
– Even during small merge
– Merge from memory
• Smooth transition
– Run generation to merging
• Continuous cost function
– Effect of hybrid hash join
– 2 × 6 GB ÷ 100 MB/s = 120 sec = 2 min
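The arithmetic on the last line can be checked with a few lines of code. The sketch below also contrasts an abrupt cost function with a continuous one. It rests on assumptions the slide does not state: that the factor 2 stands for writing the data once and reading it back once, that 1 GB = 1000 MB, and that graceful degradation spills only the part of the input that exceeds memory. All function names are hypothetical.

```python
def spill_seconds(spilled_gb, bandwidth_mb_per_s, passes=2):
    """I/O time for moving spilled_gb through the device 'passes' times
    (assumed: one write plus one read)."""
    return passes * spilled_gb * 1000 / bandwidth_mb_per_s


def abrupt_cost(input_gb, memory_gb, bandwidth_mb_per_s):
    """All-or-nothing spilling: once the input exceeds memory, all of it spills."""
    if input_gb <= memory_gb:
        return 0.0
    return spill_seconds(input_gb, bandwidth_mb_per_s)


def graceful_cost(input_gb, memory_gb, bandwidth_mb_per_s):
    """Graceful degradation (assumed model): only the overflow spills."""
    return spill_seconds(max(0.0, input_gb - memory_gb), bandwidth_mb_per_s)


slide_value = spill_seconds(6, 100)           # the slide's 2 x 6 GB / 100 MB/s
cliff = abrupt_cost(6.1, 6.0, 100)            # input barely exceeds memory
smooth = graceful_cost(6.1, 6.0, 100)
```

Under these assumptions, an input just over the memory size costs about two minutes of extra I/O with all-or-nothing spilling but only about two seconds with graceful degradation.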
Graceful degradation in memory hierarchy
Output
Main memory
Flash memory
A few runs on disk
Rotating disk drive
Run in memory
A few runs on flash
Buffer for large disk pages
High fan-in merge
SQL Server lock modes
Optimal B-tree node sizes in 1997
Hilbert space-filling curve
Nicolas Bruno and Surajit Chaudhuri, Automatic Physical Database Tuning: A Relaxation-based Approach, in Proceedings of the ACM International Conference on Management of Data (SIGMOD), Association for Computing Machinery, Inc., 2005
Automatic Tuning: Relaxation-based
Sanjay Agrawal, Nicolas Bruno, Surajit Chaudhuri, and Vivek Narasayya, AutoAdmin: Self-Tuning Database Systems Technology, in Data Engineering Bulletin, IEEE Computer Society, 2006
Self-Tuning DB: AutoAdmin
Surajit Chaudhuri, Arnd Christian König, and Vivek Narasayya, SQLCM: A Continuous Monitoring Framework for Relational Database Engines, in ICDE 2004.
Continuous Monitoring: SQLCM