Database Architecture Optimized for the New Bottleneck: Memory Access Peter Boncz Data Distilleries...
-
Upload
kenneth-pitts -
Category
Documents
-
view
213 -
download
0
Transcript of Database Architecture Optimized for the New Bottleneck: Memory Access Peter Boncz Data Distilleries...
Database Architecture Optimized for the New Bottleneck: Memory Access
Peter Boncz
Data Distilleries B.V.Amsterdam
The Netherlands
Stefan Manegold Martin Kersten
CWIAmsterdam
The Netherlands
{S.Manegold,M.Kersten}@cwi.nl
2
Contents
• How Memory Access works• Simple Scan Experiment• Consequences for DBMS
– Data Structures: vertical decomposition– Algorithms: tune random memory access
• Partitioned Join Algorithms– Monet Experiments– Accurate Cost Models
• Conclusion
3
CPU Speed vs. Memory Speed
Moore’s Law:CPU speed doublesevery 3 years
4
Memory Access in Hierarchical Systems
5
Simple Scan Experiment
6
Consequences for DBMS
• Memory access is a bottleneck• Prevent cache & TLB misses• Cache lines must be used fully
• DBMS must optimize– Data structures– Algorithms (focus: join)
7
Vertical Decomposition in Monet
8
Partitioned Joins
• Cluster both input relations• Create clusters that fit in
memory cache• Join matching clusters• Two algorithms:
– Partitioned hash-join– Radix-Join
(partitioned nested-loop)
9
Partitioned Joins: Straightforward Clustering
• Problem:Number of clusters exceeds number of– TLB entries ==> TLB trashing– Cache lines ==> cache trashing
• Solution:Multi-pass radix-cluster
10
Partitioned Joins: Multi-Pass Radix-Cluster
• Multiple clustering passes
• Limit number of clusters per pass
• Avoid cache/TLB trashing
• Trade memory cost for CPU cost
• Any data type (hashing)
11
Monet Experiments: Setup
• Platform: – SGI Origin2000 (MIPS R10000, 250 MHz)
• System:– Monet DBMS
• Data sets: – Integer join columns– Join hit-rate of 1– Cardinalities: 15,625 - 64,000,000
• Hardware event counters – to analyze cache & TLB misses
12
Monet Experiments: Radix-Cluster (64,000,000 tuples)
13
Accurate Cost Modeling: Radix-Cluster
14
Monet Experiments: Partitioned Hash-Join
15
Monet Experiments: Radix-Join
16
Monet Experiments: Overall Performance
(64,000,000 tuples)
17
Conclusion
• Problem:– Memory access is increasingly the most important
bottleneck for database performance
• Solutions:– Vertical decomposition improves column-wise data access– Radix-algorithms optimize join performance
• General:– Algorithms can be tuned to achieve optimal memory access– Detailed and accurate estimation of memory cost is possible
Monet homepage: www.cwi.nl/~monet