Performance Tradeoffs in Read-Optimized Databases
Stavros Harizopoulos* MIT CSAIL
joint work with: Velen Liang, Daniel Abadi, and Sam Madden
massachusetts institute of technology
*seeking an academic or research lab position in 2007
Read-optimized databases
• Row stores: SQL Server, DB2, Oracle
– 1 Joe 45
– 2 Sue 37
– … … …
• Column stores: Sybase IQ, MonetDB, CStore
– 1 2 …
– Joe Sue …
– 45 37 …
• Read optimizations: materialized views, multiple indices, compression
• How does column-orientation affect performance?
Rows vs. columns
• Row data: single file
– 1 Joe 45
– 2 Sue 37
– … … …
– project → Joe 45
• Column data: 3 files
– 1 2 …
– Joe Sue …
– 45 37 …
– seek, then reconstruct Joe and 45 → Joe 45
• Study performance tradeoffs solely in data storage
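The two layouts on this slide can be sketched as follows. This is a minimal illustration in Python, not the storage manager described in the talk: in-memory lists stand in for files, and the function names `project_row` and `reconstruct_columns` are hypothetical.

```python
# Row layout: one "file" holding whole tuples (id, name, age).
row_file = [(1, "Joe", 45), (2, "Sue", 37)]

# Column layout: three "files", one per attribute, aligned by position.
id_file = [1, 2]
name_file = ["Joe", "Sue"]
age_file = [45, 37]


def project_row(rows, position):
    """Read the whole tuple, then project away the unwanted attribute."""
    _id, name, age = rows[position]
    return (name, age)


def reconstruct_columns(names, ages, position):
    """Read only the needed column files and stitch values by position."""
    return (names[position], ages[position])


print(project_row(row_file, 0))                     # ('Joe', 45)
print(reconstruct_columns(name_file, age_file, 0))  # ('Joe', 45)
```

Both calls return the same tuple; the difference is which bytes had to be read to produce it.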
Performance study
• Methodology
– Built storage manager from scratch
– Sequential scans
– Analyze CPU, disk, memory
• Findings
– Columns are generally more I/O efficient
– Competing traffic favors columns
– Conditions where columns are CPU-constrained
– Conditions where rows are MemBW-constrained
Talk outline
• System architecture
• Workload and Experiments
• Analysis
• Conclusions
System architecture
• Block-iterator operators
– Single-threaded, C++, Linux AIO
• No buffer pool
– Use filesystem, bypass OS cache
• Compression
• Dense-pack
– 60% full vs. 100% full
Storage engine
• Query: SELECT name, age WHERE age > 40
• Row scanner
– scan (S) → apply predicate(s) → Joe 45, …
• Column scanner
– scan (S) age → apply predicate #1 → #POS 45, #POS …
– scan (S) name at those positions → Joe 45, …
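The difference between the two scanners can be sketched as below. This is a simplified Python illustration, not the C++ block-iterator code from the talk: the function names are hypothetical, lists stand in for files, and the position list is kept as plain (position, value) pairs.

```python
def row_scan(rows, min_age):
    """Row scanner: apply the predicate to each full tuple, then project."""
    return [(name, age) for (_id, name, age) in rows if age > min_age]


def column_scan(name_col, age_col, min_age):
    """Column scanner: filter one column first, then fetch by position."""
    # Step 1: scan 'age', apply predicate #1, emit (position, value) pairs.
    qualifying = [(pos, age) for pos, age in enumerate(age_col) if age > min_age]
    # Step 2: touch 'name' only at the qualifying positions.
    return [(name_col[pos], age) for pos, age in qualifying]


rows = [(1, "Joe", 45), (2, "Sue", 37)]
names = ["Joe", "Sue"]
ages = [45, 37]

print(row_scan(rows, 40))            # [('Joe', 45)]
print(column_scan(names, ages, 40))  # [('Joe', 45)]
```

With a selective predicate, the second step of the column scanner processes only a small fraction of the positions, which is one plausible reading of why column CPU cost drops at lower selectivity.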
Platform
• CPU: 3.2GHz
• L2: 1MB
– L2 cache prefetching: read 128 bytes
• RAM: 1GB, 3.2 GB/sec
• Disks (striped): 180 MB/sec, direct IO
– prefetching: 100ms read, 10ms seek
Workload
• LINEITEM (wide)
– 60m rows → 9.5 GB
– tuple sizes: 150 bytes, 50 bytes
• ORDERS (narrow)
– 60m rows → 1.9 GB
– tuple sizes: 32 bytes, 12 bytes
• Query
– SELECT a1, a2, a3, … WHERE a1 yields variable selectivity
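A back-of-the-envelope estimate shows why projection width matters for this workload. The sketch below is an idealized I/O-only model, not a result from the talk: it ignores seeks, CPU time, and page overhead, and the function names are hypothetical. The constants come from the Workload and Platform slides.

```python
ROWS = 60_000_000       # rows in LINEITEM (Workload slide)
ROW_FILE_BYTES = 9.5e9  # LINEITEM size on disk (Workload slide)
DISK_BW = 180e6         # bytes/sec (Platform slide)


def row_scan_io_seconds():
    """A row scan reads the whole file regardless of projection."""
    return ROW_FILE_BYTES / DISK_BW


def column_scan_io_seconds(selected_bytes_per_tuple):
    """Assumption: a column scan reads only the selected columns."""
    return ROWS * selected_bytes_per_tuple / DISK_BW


print(round(row_scan_io_seconds(), 1))        # 52.8
print(round(column_scan_io_seconds(4), 1))    # 1.3
print(round(column_scan_io_seconds(148), 1))  # 49.3
```

Under these assumptions the row scan costs roughly the same I/O time however few attributes are selected, while the column scan cost grows with the selected bytes per tuple.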
Wide tuple: 10% selectivity
[Figure: time (sec), 0 to 60, vs. selected bytes per tuple, 4 to 148; series: Row, Row (CPU only), Column, Column (CPU only); attribute widths marked on the axis: int 4B, text 25B, text 10B, text 69B, char 1B]
• Large prefetch hides disk seeks in columns
Wide tuple: 10% sel. (CPU)
[Figure: time (sec), 0 to 12, vs. # attributes selected, 1 to 16, for row store and column store; breakdown: System, Busy (user), Memory stalls (user), Other stalls (user)]
• Row-CPU suffers from memory stalls
Wide tuple: 10% sel. (CPU)
[Figure: time (sec), 0 to 12, vs. # attributes selected, 1 to 16, for row store and column store, with an additional column store panel at 0.1% selectivity; breakdown: System, Busy (user), Memory stalls (user), Other stalls (user)]
• Column-CPU efficiency with lower selectivity
Narrow tuple: 10% selectivity
[Figure, left: time (sec), 0 to 12, vs. selected bytes per tuple, 4 to 32; series: Row, Column]
[Figure, right: time (sec), 0 to 12, vs. # attributes selected, 1 to 7, for row store and column store; breakdown: CPU system, CPU user, Memory, Other]
• Memory stalls disappear in narrow tuples
• Compression: similar to narrow (not shown)
Varying prefetch size
[Figure: time (sec), 0 to 40, vs. selected bytes per tuple, 4 to 32, no competing disk traffic; series: Row (any prefetch size), Column 48 (x 128KB), Column 16, Column 8, Column 2]
• No prefetching hurts columns in single scans
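One way to see why prefetch size matters is to amortize a seek over each prefetch-sized read. The sketch below is an illustrative model only, not the analysis from the talk or the paper: it assumes every prefetch unit is preceded by exactly one seek (as when a scanner alternates between column files), and it is not calibrated against the measured curves. The bandwidth and seek time come from the Platform slide; `effective_bandwidth` is a hypothetical name.

```python
DISK_BW = 180e6   # bytes/sec (Platform slide)
SEEK_SEC = 0.010  # 10 ms seek (Platform slide)


def effective_bandwidth(prefetch_bytes):
    """Bytes/sec when each prefetch-sized read is preceded by one seek."""
    read_sec = prefetch_bytes / DISK_BW
    return prefetch_bytes / (read_sec + SEEK_SEC)


for mb in (0.25, 1, 4, 18):
    bw = effective_bandwidth(mb * 1e6)
    print(f"{mb:>5} MB per read: {bw / 1e6:6.1f} MB/sec")
```

At 180 MB/sec an 18 MB read takes 100 ms, so a 10 ms seek adds about 10% to it; this appears to be the 100ms read / 10ms seek ratio shown on the Platform slide. Smaller reads spend a growing share of their time seeking.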
Varying prefetch size
• No prefetching hurts columns in single scans
• Under competing traffic, columns outperform rows for any prefetch size
[Figure: time (sec), 0 to 40, vs. selected bytes per tuple, 4 to 32; panel "no competing disk traffic"; two panels "with competing disk traffic": Column, 48 vs. Row, 48 and Column, 8 vs. Row, 8]
Analysis
• Central parameter in analysis: cycles per disk byte (cpdb)
• What can it model:
– More / fewer disks
– More / fewer CPUs
– CPU / disk competing traffic
• Trends in cpdb:
– 10 → 30 from 1995 to 2006
– Further increase with multicore chips
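As a rough illustration of the parameter, cpdb can be computed from the figures on the Platform slide. The definition used here (aggregate CPU cycles per second divided by sequential disk bytes per second) is an assumption about what the slide means, and the `cpus` and `disk_scale` parameters are hypothetical knobs added to show the "more / fewer" cases.

```python
def cpdb(cpu_hz, disk_bytes_per_sec, cpus=1, disk_scale=1.0):
    """Cycles per disk byte: CPU cycles available per byte of disk bandwidth."""
    return (cpu_hz * cpus) / (disk_bytes_per_sec * disk_scale)


# Platform slide: 3.2GHz CPU, 180 MB/sec striped disks.
print(round(cpdb(3.2e9, 180e6), 1))                # 17.8
# Twice the CPUs doubles cpdb; twice the disk bandwidth halves it.
print(round(cpdb(3.2e9, 180e6, cpus=2), 1))        # 35.6
print(round(cpdb(3.2e9, 180e6, disk_scale=2), 1))  # 8.9
```

Competing disk traffic can be read the same way: if a scan only gets a fraction of the disk bandwidth, `disk_scale` drops below 1 and cpdb rises.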
Analysis
• Rows favored by narrow tuples and low cpdb
– Disk-bound workloads have higher cpdb
[Figure: speedup of cols over rows as a function of tuple width (8 to 36) and cycles per disk byte (cpdb: 9, 18, 36, 72, 144), at 10% selectivity and 50% projection; speedup bands: 0.4 – 0.8, 0.8 – 1.2, 1.2 – 1.6, 1.6 – 2, 2]
See our paper for the rest
• CPU time breakdowns, L2 prefetcher
• Disk prefetching implementation
• Compression results
• Non-pipelined column scanner
• Analysis
Conclusions
• Given enough space for prefetching, columns outperform rows in most workloads
• Competing traffic favors columns
• Memory-bandwidth bottleneck in rows
• Future work
– Column scanners, random I/O, write performance
Thank you
db.csail.mit.edu/projects/cstore
Compression methods
• Dictionary
– … ‘low’ … ‘high’ … ‘low’ … ‘normal’ … → … 00 … 10 … 00 … 01 …
• Bit-pack
– Pack several attributes inside a 4-byte word
– Use as many bits as the max value requires
• Delta
– Base value per page
– Arithmetic differences
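The three methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the talk: the function names are hypothetical, the packing order (first code in the low bits) is an arbitrary choice, and the delta scheme assumes differences are taken against the page's base value rather than between consecutive values.

```python
def dictionary_encode(values, codes):
    """Replace each value by its small integer code."""
    return [codes[v] for v in values]


def bit_pack(codes, bits):
    """Pack fixed-width codes into one 32-bit word, first code in the low bits."""
    assert len(codes) * bits <= 32, "codes must fit in a 4-byte word"
    word = 0
    for i, code in enumerate(codes):
        assert 0 <= code < (1 << bits)
        word |= code << (i * bits)
    return word


def bit_unpack(word, bits, count):
    """Recover `count` fixed-width codes from a packed word."""
    mask = (1 << bits) - 1
    return [(word >> (i * bits)) & mask for i in range(count)]


def delta_encode(page):
    """Store one base value per page plus each value's difference from it."""
    base = page[0]
    return base, [v - base for v in page]


def delta_decode(base, deltas):
    return [base + d for d in deltas]


codes = {"low": 0b00, "normal": 0b01, "high": 0b10}  # mapping from the slide
encoded = dictionary_encode(["low", "high", "low", "normal"], codes)
print(encoded)                               # [0, 2, 0, 1]

bits = max(codes.values()).bit_length()      # 2 bits cover the max code
word = bit_pack(encoded, bits)
print(bit_unpack(word, bits, len(encoded)))  # [0, 2, 0, 1]

base, deltas = delta_encode([1000, 1003, 1001, 1010])
print(base, deltas)                          # 1000 [0, 3, 1, 10]
```

The example values in the delta page are made up for illustration; the point is that the differences need far fewer bits than the original values.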
Analysis
parameter                     what it can model
SizeFile, TupleWidth          various DB schemas
MemBytesCycle                 memory bus speed
f                             # of selected attributes
I                             CPU work
cpdb (cycles per disk byte)   more / fewer disks, more / fewer CPUs, CPU / disk competing traffic