Database Architecture Optimized for the New Bottleneck: Memory Access

Peter Boncz
Data Distilleries B.V., Amsterdam, The Netherlands
P.Boncz@ddi.nl

Stefan Manegold, Martin Kersten
CWI, Amsterdam, The Netherlands
{S.Manegold,M.Kersten}@cwi.nl

Contents

• How Memory Access Works
• Simple Scan Experiment
• Consequences for DBMS
  – Data Structures: vertical decomposition
  – Algorithms: tune random memory access
• Partitioned Join Algorithms
  – Monet Experiments
  – Accurate Cost Models
• Conclusion

CPU Speed vs. Memory Speed

Moore’s Law: CPU speed doubles every 3 years

Memory Access in Hierarchical Systems

Simple Scan Experiment

Consequences for DBMS

• Memory access is a bottleneck
• Prevent cache & TLB misses
• Cache lines must be used fully
• DBMS must optimize
  – Data structures
  – Algorithms (focus: join)

Vertical Decomposition in Monet
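
The figure for this slide is not preserved in the transcript. As a rough illustration of the idea, the sketch below splits a row-wise table into one two-column table per attribute, so that a scan over one attribute touches only that attribute's data. The sample rows, the column names, the `decompose` helper, and the (oid, value) pair layout are assumptions made for this example, not Monet's actual storage code.

```python
# Hypothetical row-wise table: (id, name, age) per record.
rows = [
    (1, "Ann", 34),
    (2, "Bob", 27),
    (3, "Cid", 45),
]

def decompose(rows, column_names):
    """Split row-wise tuples into one list of (oid, value) pairs per column.

    The oid is simply the row position; it lets the columns be re-joined
    into full tuples later.
    """
    columns = {name: [] for name in column_names}
    for oid, row in enumerate(rows):
        for name, value in zip(column_names, row):
            columns[name].append((oid, value))
    return columns

columns = decompose(rows, ["id", "name", "age"])

# A column-wise scan reads only the "age" column, not the whole records.
total_age = sum(value for _, value in columns["age"])
```

In a real system each column would be a dense array, so every cache line fetched during the scan holds only values the query needs.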

Partitioned Joins

• Cluster both input relations
• Create clusters that fit in memory cache
• Join matching clusters
• Two algorithms:
  – Partitioned hash-join
  – Radix-join (partitioned nested-loop)
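
The three steps above can be sketched as a small partitioned hash-join. This is a minimal illustration, assuming integer join keys and clustering on the lowest bits of the key; the function names `cluster` and `partitioned_hash_join` and the `bits` parameter are hypothetical and not taken from Monet.

```python
from collections import defaultdict

def cluster(relation, bits):
    """Partition (key, payload) tuples on the lowest `bits` bits of the key."""
    mask = (1 << bits) - 1
    clusters = [[] for _ in range(1 << bits)]
    for key, payload in relation:
        clusters[key & mask].append((key, payload))
    return clusters

def partitioned_hash_join(left, right, bits):
    """Cluster both inputs, then hash-join each pair of matching clusters."""
    result = []
    for l_part, r_part in zip(cluster(left, bits), cluster(right, bits)):
        # Build a hash table on the (small) left cluster only.
        table = defaultdict(list)
        for key, payload in l_part:
            table[key].append(payload)
        # Probe it with the matching right cluster.
        for key, r_payload in r_part:
            for l_payload in table.get(key, ()):
                result.append((key, l_payload, r_payload))
    return result

left = [(1, "a"), (2, "b"), (5, "c")]
right = [(5, "x"), (1, "y"), (3, "z")]
joined = partitioned_hash_join(left, right, bits=2)
```

Choosing `bits` so that each cluster and its hash table fit in the cache keeps the random accesses of the build and probe phases inside the cache. Replacing the per-cluster hash table with a nested loop over the two clusters would give the radix-join variant.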

Partitioned Joins: Straightforward Clustering

• Problem: Number of clusters exceeds number of
  – TLB entries ==> TLB thrashing
  – Cache lines ==> cache thrashing
• Solution: Multi-pass radix-cluster
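
As a back-of-the-envelope aid, the helper below estimates how many clustering passes are needed when each pass may create only a limited number of clusters. The function name `passes_needed` and the limit of 64 clusters per pass used in the example are assumptions for illustration; the real limit depends on the TLB and cache sizes of the machine.

```python
def passes_needed(total_bits, max_clusters_per_pass):
    """Passes required to create 2**total_bits clusters when a single pass
    may create at most `max_clusters_per_pass` clusters."""
    if max_clusters_per_pass < 2:
        raise ValueError("each pass must be able to create at least 2 clusters")
    # Largest number of radix bits one pass can handle: floor(log2(limit)).
    bits_per_pass = max_clusters_per_pass.bit_length() - 1
    # Ceiling division without floating point.
    return -(-total_bits // bits_per_pass)

# Hypothetical limit of 64 clusters per pass, i.e. 6 radix bits per pass.
print(passes_needed(18, 64))  # 3 passes of 6 bits each
```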

Partitioned Joins: Multi-Pass Radix-Cluster

• Multiple clustering passes

• Limit number of clusters per pass

• Avoid cache/TLB thrashing

• Trade memory cost for CPU cost

• Any data type (hashing)
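
A minimal sketch of the multi-pass idea, assuming integer keys and clustering on the lowest radix bits: each pass subdivides every existing cluster on a few more bits, so no single pass writes to more than a small number of output clusters at once. The function name `radix_cluster` and the `bits_per_pass` parameter are hypothetical; for other data types the same code would run on a hash of the value rather than the value itself.

```python
def radix_cluster(keys, bits_per_pass):
    """Cluster integer keys on their lowest sum(bits_per_pass) bits.

    Pass p looks at bits_per_pass[p] bits, starting with the most
    significant of the radix bits, and refines each existing cluster.
    """
    shift = sum(bits_per_pass)
    clusters = [list(keys)]
    for bits in bits_per_pass:
        shift -= bits
        mask = (1 << bits) - 1
        refined = []
        for chunk in clusters:
            # Only 2**bits output buckets are open at any time.
            buckets = [[] for _ in range(1 << bits)]
            for key in chunk:
                buckets[(key >> shift) & mask].append(key)
            refined.extend(buckets)
        clusters = refined
    return clusters

keys = [7, 3, 12, 8, 1, 15, 4, 0]
two_pass = radix_cluster(keys, [1, 1])  # two passes of 1 bit each
one_pass = radix_cluster(keys, [2])     # one pass of 2 bits
```

Both calls produce the same four clusters; the two-pass version reads and writes the data twice but keeps fewer buckets open per pass, which is the trade of extra CPU work for lower memory access cost that the slide describes.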

Monet Experiments: Setup

• Platform:
  – SGI Origin2000 (MIPS R10000, 250 MHz)
• System:
  – Monet DBMS
• Data sets:
  – Integer join columns
  – Join hit-rate of 1
  – Cardinalities: 15,625 - 64,000,000
• Hardware event counters
  – to analyze cache & TLB misses

Monet Experiments: Radix-Cluster (64,000,000 tuples)

Accurate Cost Modeling: Radix-Cluster

Monet Experiments: Partitioned Hash-Join

Monet Experiments: Radix-Join

Monet Experiments: Overall Performance (64,000,000 tuples)

Conclusion

• Problem:
  – Memory access is increasingly the most important bottleneck for database performance
• Solutions:
  – Vertical decomposition improves column-wise data access
  – Radix-algorithms optimize join performance
• General:
  – Algorithms can be tuned to achieve optimal memory access
  – Detailed and accurate estimation of memory cost is possible

Monet homepage: www.cwi.nl/~monet