CMU SCS 15-721 (Spring 2017) :: Larger-than-Memory Databases · 2017. 6. 24. · Andy Pavlo //...
ADVANCED DATABASE SYSTEMS
Lecture #23 – Larger-than-Memory Databases
15-721
@Andy_Pavlo // Carnegie Mellon University // Spring 2017
CMU 15-721 (Spring 2017)
ADMINISTRIVIA
Final Exam: May 4th @ 12:00pm
→ Multiple choice + short-answer questions.
→ I will provide sample questions this week.
Code Review #2: May 3rd @ 11:59pm
→ We will use the same group pairings as before.
Final Presentations: May 9th @ 5:30pm
→ WEH Hall 7500
→ 12 minutes per group
→ Food and prizes for everyone!
SIGMOD 2017 INNOVATION AWARD
SIGMOD Edgar F. Codd Innovations Award: Goetz Graefe
TODAY’S AGENDA
Background
Implementation Issues
Real-world Examples
Evaluation
MOTIVATION
DRAM is expensive, son.
It would be nice if our in-memory DBMS could use cheaper storage.
LARGER-THAN-MEMORY DATABASES
Allow an in-memory DBMS to store/access data on disk without bringing back all the slow parts of a disk-oriented DBMS.
Need to be aware of hardware access methods
→ In-memory Storage = Tuple-Oriented
→ Disk Storage = Block-Oriented
OLAP
OLAP queries generally access the entire table. Thus, there isn’t anything about the workload for the DBMS to exploit that a disk-oriented buffer pool can’t handle.
[Figure: column A stored as disk data, with an in-memory zone map for A that holds pre-computed aggregates (MIN, MAX, SUM, COUNT, AVG, STDEV, ...).]
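A minimal sketch of how an in-memory zone map like the one pictured could let a scan skip disk blocks. The block layout and the names `ZoneMap`, `build_zone_map`, and `blocks_to_read` are hypothetical, chosen only for illustration; they are not taken from any system discussed in the lecture.

```python
from dataclasses import dataclass


@dataclass
class ZoneMap:
    """Pre-computed aggregates for one on-disk block of column A."""
    min_val: int
    max_val: int
    sum_val: int
    count: int


def build_zone_map(block):
    """Compute the aggregates once, when the block is written."""
    return ZoneMap(min(block), max(block), sum(block), len(block))


def blocks_to_read(zone_maps, lo, hi):
    """Return ids of blocks whose [min, max] range overlaps [lo, hi]."""
    return [i for i, z in enumerate(zone_maps)
            if z.max_val >= lo and z.min_val <= hi]


# Hypothetical column A split into three disk blocks.
disk_blocks = [[1, 5, 9], [20, 25, 31], [40, 44, 48]]
zone_maps = [build_zone_map(b) for b in disk_blocks]
```

With these assumptions, a predicate such as `A BETWEEN 22 AND 30` only needs to read block 1; the other blocks are ruled out from memory without touching disk.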
OLTP
OLTP workloads almost always have hot and cold portions of the database.
→ We can assume that txns will almost always access hot tuples.
The DBMS needs a mechanism to move cold data out to disk and then retrieve it if it is ever needed again.
LARGER-THAN-MEMORY DATABASES
[Figure, animated over six slides: an in-memory table heap holding Tuples #00–#04, an in-memory index, and cold-data storage. Tuples #01, #03, and #04 are moved into an evicted tuple block with a header in cold-data storage, leaving Tuples #00 and #02 in the heap. Later builds annotate the diagram with "???" markers and pose the query below.]
SELECT * FROM table WHERE id = <Tuple #01>
AGAIN, WHY NOT MMAP?
Write-ahead logging requires that a modified page cannot be written to disk before the log records that made those changes are written.
There are no mechanisms for asynchronous read-ahead or writing multiple pages concurrently.
IN-MEMORY PERFORMANCE FOR BIG DATA VLDB 2014
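The write-ahead logging constraint above can be sketched as a check that a DBMS-managed buffer manager performs before every page write. With mmap the OS may write a dirty page back whenever it chooses, so there is no place to put this check. The class and its fields (`BufferManager`, `flushed_lsn`, `page_lsn`) are hypothetical names for illustration only.

```python
class BufferManager:
    """Toy buffer manager that enforces the write-ahead logging rule."""

    def __init__(self):
        self.log = []            # log tail: list of (lsn, record)
        self.flushed_lsn = 0     # highest LSN known to be durable
        self.page_lsn = {}       # page id -> LSN of its latest change
        self.disk_pages = set()  # pages that have been written out

    def update(self, page_id, record):
        """Append a log record and remember which page it modified."""
        lsn = len(self.log) + 1
        self.log.append((lsn, record))
        self.page_lsn[page_id] = lsn
        return lsn

    def flush_log(self, up_to_lsn):
        """Pretend to make the log durable up to the given LSN."""
        self.flushed_lsn = max(self.flushed_lsn,
                               min(up_to_lsn, len(self.log)))

    def write_page(self, page_id):
        """Log first, page second: flush the log if it lags the page."""
        if self.page_lsn.get(page_id, 0) > self.flushed_lsn:
            self.flush_log(self.page_lsn[page_id])
        self.disk_pages.add(page_id)
```

The point of the sketch is the ordering inside `write_page`: the page reaches disk only after the log covers its latest change.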
OLTP ISSUES
Run-time Operations
→ Cold Tuple Identification
Eviction Policies
→ Timing
→ Evicted Tuple Metadata
Data Retrieval Policies
→ Granularity
→ Retrieval Mechanism
→ Merging back to memory
COLD TUPLE IDENTIFICATION
Choice #1: On-line
→ The DBMS monitors txn access patterns and tracks how often tuples are used.
→ Embed the tracking meta-data directly in tuples.
Choice #2: Off-line
→ Maintain a tuple access log during txn execution.
→ Process in background to compute frequencies.
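As a rough illustration of the off-line choice, the background step could count accesses in the log and rank tuples by frequency. The function `coldest_tuples` and the sample log are assumptions made for this sketch; a real system would sample or window the log rather than scan all of it.

```python
from collections import Counter


def coldest_tuples(access_log, all_tuple_ids, k):
    """Return the k least frequently accessed tuple ids.

    access_log: iterable of tuple ids, one entry per access.
    Ties are broken by tuple id so the result is deterministic.
    """
    freq = Counter(access_log)
    return sorted(all_tuple_ids, key=lambda t: (freq[t], t))[:k]


# Hypothetical log written during txn execution.
access_log = [0, 2, 0, 2, 2, 1, 0]
candidates = coldest_tuples(access_log, range(5), 3)
```

Here Tuples #03 and #04 never appear in the log and Tuple #01 appears once, so those three are the eviction candidates.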
EVICTION TIMING
Choice #1: Threshold
→ The DBMS monitors memory usage and begins evicting tuples when it reaches a threshold.
→ The DBMS has to manually move data.
Choice #2: OS Virtual Memory
→ The OS decides when it wants to move data out to disk. This is done in the background.
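A minimal sketch of the threshold choice, assuming the DBMS already has a coldest-first ordering of tuples. The function name, the byte sizes, and the default threshold of 0.8 are all hypothetical values for illustration.

```python
def evict_until_below(heap, cold_order, memory_limit, threshold=0.8):
    """Evict cold tuples until usage is at most threshold * memory_limit.

    heap: dict of tuple id -> size in bytes (mutated in place).
    cold_order: tuple ids, coldest first.
    Returns the evicted tuple ids in eviction order.
    """
    used = sum(heap.values())
    evicted = []
    for tid in cold_order:
        if used <= threshold * memory_limit:
            break
        used -= heap.pop(tid)  # the DBMS moves the data itself
        evicted.append(tid)
    return evicted
```

For example, with five 100-byte tuples, a 400-byte limit, and a 0.75 threshold, the two coldest tuples are evicted and the loop stops once usage reaches 300 bytes.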
EVICTED TUPLE METADATA
Choice #1: Tombstones
→ Leave a marker that points to the on-disk tuple.
→ Update indexes to point to the tombstone tuples.
Choice #2: Bloom Filters
→ Use an approximate data structure for each index.
→ Check both index + filter for each query.
Choice #3: OS Virtual Memory
→ The OS tracks what data is on disk. The DBMS does not need to maintain any additional metadata.
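The first two choices can be sketched side by side. The tombstone layout `("TOMBSTONE", block, offset)`, the `BloomFilter` class, and `lookup` are hypothetical and kept deliberately simple (one byte per bit, SHA-256 as the hash); they only illustrate the idea of checking the index first and the filter second.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        return all(self.bits[p] for p in self._positions(key))


# Choice #1: the heap slot of an evicted tuple holds a tombstone
# that records where the tuple lives on disk.
heap_with_tombstone = {0: "row-0", 1: ("TOMBSTONE", 7, 0), 2: "row-2"}

# Choice #2: evicted keys leave the index; a filter remembers them.
index = {0: "row-0", 2: "row-2"}
evicted_filter = BloomFilter()
for key in (1, 3, 4):
    evicted_filter.add(key)


def lookup(key):
    """Check the index first, then the filter."""
    if key in index:
        return "memory"
    if evicted_filter.might_contain(key):
        return "check cold storage"  # may be a false positive
    return "absent"
```

The trade-off in this sketch is that tombstones cost one heap slot per evicted tuple, while the filter has a fixed size but can send a lookup to cold storage for a key that was never stored.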
[Figure, animated over six slides: the in-memory table heap (Tuples #00–#04), the in-memory index, and cold-data storage, next to a per-tuple access frequency list. Tuples #01, #03, and #04 are moved into a block with a header in cold-data storage, leaving Tuples #00 and #02 in the heap. One build then shows three <Block,Offset> entries, and the final build shows a Bloom filter next to the index.]
DATA RETRIEVAL GRANULARITY
Choice #1: Only Tuples Needed
→ Only merge the tuples that were accessed by a query back into the in-memory table heap.
→ Requires additional bookkeeping to track holes.
Choice #2: All Tuples in Block
→ Merge all the tuples retrieved from a block regardless of whether they are needed.
→ More CPU overhead to update indexes.
→ Tuples are likely to be evicted again.
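The two granularities differ only in which tuples of a fetched block get merged. A small sketch, with the hypothetical function `merge` and a block modeled as a dict of tuple id to row:

```python
def merge(heap, block, needed, all_in_block=False):
    """Merge tuples from a fetched block into the in-memory heap.

    Returns (holes, remaining): the tuple ids that were moved out of
    the block, and the tuple ids that still live only in the block.
    """
    if all_in_block:
        wanted = set(block)
    else:
        wanted = set(needed) & set(block)
    for tid in wanted:
        heap[tid] = block[tid]
    holes = sorted(wanted)
    remaining = sorted(set(block) - wanted)
    return holes, remaining
```

Under Choice #1 the `holes` list is the extra bookkeeping the slide refers to; under Choice #2 the block is emptied, but every merged tuple also needs its index entries updated.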
RETRIEVAL MECHANISM
Choice #1: Abort-and-Restart
→ Abort the txn that accessed the evicted tuple.
→ Retrieve the data from disk and merge it into memory with a separate background thread.
→ Restart the txn when the data is ready.
→ Cannot guarantee consistency for large queries.
Choice #2: Synchronous Retrieval
→ Stall the txn when it accesses an evicted tuple while the DBMS fetches the data and merges it back into memory.
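A single-threaded sketch of abort-and-restart, assuming an exception signals the access to an evicted tuple. `EvictedAccess`, `fetch_and_merge`, and `run_with_restart` are hypothetical names, and the fetch runs inline here as a stand-in for the background thread described above.

```python
class EvictedAccess(Exception):
    """Raised when a txn touches a tuple that is not in memory."""

    def __init__(self, tuple_id):
        super().__init__(tuple_id)
        self.tuple_id = tuple_id


heap = {0: "row-0", 2: "row-2"}
cold_storage = {1: "row-1", 3: "row-3", 4: "row-4"}


def read(tuple_id):
    if tuple_id not in heap:
        raise EvictedAccess(tuple_id)
    return heap[tuple_id]


def fetch_and_merge(tuple_id):
    """Stand-in for the background retrieval: disk -> table heap."""
    heap[tuple_id] = cold_storage.pop(tuple_id)


def run_with_restart(txn, max_restarts=10):
    """Run txn; on an evicted access, abort, fetch, and restart."""
    restarts = 0
    while True:
        try:
            return txn(), restarts
        except EvictedAccess as miss:
            if restarts == max_restarts:
                raise
            fetch_and_merge(miss.tuple_id)
            restarts += 1
```

A txn that reads Tuples #00, #01, and #03 restarts twice in this sketch, once per evicted tuple it discovers. Synchronous retrieval would instead block inside `read` until the fetch completes, with no restart.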
MERGING THRESHOLD
Choice #1: Always Merge
→ Retrieved tuples are always put into table heap.
Choice #2: Merge Only on Update
→ Retrieved tuples are only merged into table heap if they are used in an UPDATE query.
→ All other tuples are put in a temporary buffer.
Choice #3: Selective Merge
→ Keep track of how often each block is retrieved.
→ If a block’s access frequency is above some threshold, merge it back into the table heap.
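Selective merge can be sketched as a per-block retrieval counter that decides between the table heap and a temporary buffer. The class `SelectiveMerger` and the rule "merge once the count exceeds the threshold" are assumptions for illustration; the slides do not specify how the frequency is measured.

```python
from collections import Counter


class SelectiveMerger:
    """Merge a block into the heap only once it is retrieved often."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.block_hits = Counter()  # block id -> retrieval count
        self.temp_buffer = {}        # tuples served without merging

    def on_retrieve(self, block_id, tuples, heap):
        """Record one retrieval and place the tuples accordingly."""
        self.block_hits[block_id] += 1
        if self.block_hits[block_id] > self.threshold:
            heap.update(tuples)
            for tid in tuples:
                self.temp_buffer.pop(tid, None)
            return "merged"
        self.temp_buffer.update(tuples)
        return "buffered"
```

With a threshold of 2, the first two retrievals of a block are served from the temporary buffer and the third merges the block into the table heap.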
REAL-WORLD IMPLEMENTATIONS
H-Store – Anti-Caching
Hekaton – Project Siberia
EPFL’s VoltDB Prototype
Apache Geode – Overflow Tables
MemSQL – Columnar Tables
H-STORE – ANTI-CACHING
On-line Identification
Administrator-defined Threshold
Tombstones
Abort-and-restart Retrieval
Block-level Granularity
Always Merge
ANTI-CACHING: A NEW APPROACH TO DATABASE MANAGEMENT SYSTEM ARCHITECTURE VLDB 2013
HEKATON – PROJECT SIBERIA
Off-line Identification
Administrator-defined Threshold
Bloom Filters
Synchronous Retrieval
Tuple-level Granularity
Always Merge
TREKKING THROUGH SIBERIA: MANAGING COLD DATA IN A MEMORY-OPTIMIZED DATABASE VLDB 2014
EPFL VOLTDB
Off-line Identification
OS Virtual Memory
Synchronous Retrieval
Page-level Granularity
Always Merge
ENABLING EFFICIENT OS PAGING FOR MAIN-MEMORY OLTP DATABASES DAMON 2013
[Figure, animated over ten slides: the in-memory table heap is divided into a hot-tuple region (Tuples #00 and #02) and a cold-tuple region (Tuples #01 and #03), shown next to cold-data storage.]
APACHE GEODE – OVERFLOW TABLES
On-line Identification
Administrator-defined Threshold
Tombstones (?)
Synchronous Retrieval
Tuple-level Granularity
Merge Only on Update (?)
Source: Apache Geode Documentation
MEMSQL – COLUMNAR TABLES
Administrator manually declares a table as a distinct disk-resident columnar table.
→ Appears as a separate logical table to the application.
→ Uses mmap to manage buffer pool.
→ Pre-computed aggregates per block always in memory.
Manual Identification
No Evicted Metadata is needed.
Synchronous Retrieval
Always Merge
Source: MemSQL Documentation
EVALUATION
Compare different design decisions in H-Store with anti-caching.
Storage Devices:
→ Hard-Disk Drive (HDD)
→ Shingled Magnetic Recording Drive (SMR)
→ Solid-State Drive (SSD)
→ 3D XPoint (3DX)
→ Non-volatile Memory (NVRAM)
LARGER-THAN-MEMORY DATA MANAGEMENT ON MODERN STORAGE HARDWARE FOR IN-MEMORY OLTP DATABASE SYSTEMS DAMON 2016
MICROBENCHMARK
[Figure: bar chart of latency (nanosec, log scale from 10^0 to 10^8) for 1KB Read, 1KB Write, 64KB Read, and 64KB Write on HDD, SMR, SSD, 3D XPoint, NVRAM, and DRAM.]
10m tuples – 1KB each
50% Reads / 50% Writes – Synchronization Enabled
MERGING THRESHOLD
[Figure: bar chart of throughput (txn/sec, 0 to 250,000) for Merge (Update-Only), Merge (Top-5%), Merge (Top-20%), and Merge (All) on HDD (AR), HDD (SR), SMR (AR), SMR (SR), SSD, 3DX, and NVRAM, with a DRAM label for reference.]
YCSB Workload – 90% Reads / 10% Writes
10GB Database using 1.25GB Memory
CONFIGURATION COMPARISON
Generic Configuration
→ Abort-and-Restart Retrieval
→ Merge (All) Threshold
→ 1024 KB Block Size
Optimized Configuration
→ Synchronous Retrieval
→ Top-5% Merge Threshold
→ Block Sizes (HDD/SMR - 1024 KB) (SSD/3DX - 16 KB)
TATP BENCHMARK
[Figure: bar chart of throughput (txn/sec, 0 to 320,000) for the Generic and Optimized configurations on HDD, SMR, SSD, 3D XPoint, and NVRAM, with a DRAM label for reference.]
Optimal Configuration per Storage Device
1.25GB Memory
VOTER BENCHMARK
[Figure: bar chart of throughput (txn/sec, 0 to 150,000) for the Generic and Optimized configurations on HDD, SMR, SSD, 3DX, and NVRAM, with a DRAM label for reference.]
Optimal Configuration per Storage Device
1.25GB Memory
PARTING THOUGHTS
Today was about working around the block-oriented access and slowness of secondary storage.
None of these techniques handle index memory.
Fast & cheap byte-addressable NVM will make this lecture unnecessary.
NEXT CLASS
Non-Volatile Memory
Sample Final Exam