Let’s Talk About Storage & Recovery Methods for Non ...jarulraj/talks/2015.storage.sigmod.pdf ·...
Let’s Talk About
Storage & Recovery Methods for
Non-Volatile Memory Database Systems
Joy Arulraj, Andrew Pavlo, Subramanya R. Dulloor
Non-Volatile Memory (NVM)

| | DRAM | SSD | DISK | NVM |
|---|---|---|---|---|
| Read Latency | 1x | 500x | 10^5x | 2-4x |
| Write Latency | 1x | 5000x | 10^5x | 2-8x |
| Persistence | ✗ | ✓ | ✓ | ✓ |
| Byte-level access | ✓ | ✗ | ✗ | ✓ |
| Write endurance | ✓ | ✗ | ✓ | ✗ |
Executive Summary
Design a DBMS storage engine for NVM
• Re-examine traditional assumptions
• Storage and recovery optimizations
NVM Hardware Emulator
• Configure DRAM load and store latency
• Throttle memory bandwidth
• Two interfaces
  – Filesystem Interface (fread/fwrite)
  – Allocator Interface (malloc/free)
DBMS Platform
• Lightweight DBMS
  – NVM-only design
  – No volatile DRAM
  – Runs on the hardware emulator
• Pluggable backend storage architecture
• Timestamp-based concurrency control
3 Storage Engines

| ENGINE TYPE | TABLE STORAGE | LOGGING | EXAMPLE |
|---|---|---|---|
| In-Place Updates | ✓ | ✓ | VoltDB |
| Copy-on-Write Updates | ✓ | ✗ | LMDB |
| Log-Structured Updates | ✗ | ✓ | LevelDB |
#1: In-Place Updates
[Figure: in-place update path, steps numbered 1 to 3. Labels: TABLE, TUPLE, UPDATED TUPLE, WRITE AHEAD LOG, TUPLE DELTA, SNAPSHOTS, INDEX.]
Optimizing for NVM
• Non-volatile pointer
  – Non-volatile data structures
  – Valid even after system restarts
• Exclusively use allocator interface
  – Byte-addressable NVM
  – Not filesystem interface
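The non-volatile pointer idea can be sketched as follows. This is a minimal illustration, not the talk's implementation: it assumes a memory-mapped file stands in for an NVM region, and the names `Region`, `push`, and `values` are hypothetical. A raw process address is meaningless after a restart, so the links in this linked list are stored as offsets from the start of the region, which stay valid wherever the region is mapped next.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<QQ")   # (next free offset, offset of list head)
NODE = struct.Struct("<qQ")     # (value, offset of next node; 0 means null)


class Region:
    """File-backed byte range standing in for an NVM region (assumption)."""

    def __init__(self, path, size=4096):
        fresh = not os.path.exists(path) or os.path.getsize(path) == 0
        self.file = open(path, "w+b" if fresh else "r+b")
        if fresh:
            self.file.truncate(size)
        self.buf = mmap.mmap(self.file.fileno(), size)
        if fresh:
            # First node goes right after the header; the list starts empty.
            HEADER.pack_into(self.buf, 0, HEADER.size, 0)

    def push(self, value):
        """Prepend a node; links are region offsets, never addresses."""
        free, head = HEADER.unpack_from(self.buf, 0)
        if free + NODE.size > len(self.buf):
            raise MemoryError("region full")
        NODE.pack_into(self.buf, free, value, head)
        HEADER.pack_into(self.buf, 0, free + NODE.size, free)

    def values(self):
        """Walk the list from the head by following stored offsets."""
        _, off = HEADER.unpack_from(self.buf, 0)
        out = []
        while off:
            value, off = NODE.unpack_from(self.buf, off)
            out.append(value)
        return out

    def close(self):
        self.buf.flush()
        self.buf.close()
        self.file.close()
```

Closing a `Region` and opening a new one on the same path simulates a restart: the list is walked again from the stored head offset without rebuilding anything.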
#1: NVM In-Place Updates
[Figure: NVM in-place update path. Labels: WRITE AHEAD LOG (ONLY POINTERS, NOT ACTUAL DATA), TUPLE DELTA (step 1), TABLE, TUPLE, UPDATED TUPLE (step 2), NON-VOLATILE INDEX, SNAPSHOTS ✗.]
Benefits
• Reduce data duplication
• Almost instantaneous recovery
• No redo log, only an undo log
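The undo-only, pointer-based log can be sketched roughly as below. This is a simplified model under stated assumptions, not the engine from the talk: a Python dict plays the role of the NVM heap, integer keys into it play the role of non-volatile pointers, and the class and method names (`NvmInPlaceEngine`, `update`, `commit`, `recover`) are hypothetical. Because every store to the heap is assumed durable, recovery only rolls back uncommitted work and never replays anything.

```python
class NvmInPlaceEngine:
    """Toy model: log entries hold pointers into the heap, not tuple copies."""

    def __init__(self):
        self.heap = {}        # stand-in for NVM: address -> tuple value
        self.table = {}       # key -> address of the tuple
        self.undo_log = []    # (txn_id, key, tuple_addr, before_image_addr)
        self._next_addr = 1

    def _alloc(self, value):
        addr = self._next_addr
        self._next_addr += 1
        self.heap[addr] = value
        return addr

    def update(self, txn_id, key, value):
        addr = self.table.get(key)
        if addr is None:
            # Insert: there is no before-image to point at.
            addr = self._alloc(value)
            self.undo_log.append((txn_id, key, addr, None))
            self.table[key] = addr
            return
        before = self._alloc(self.heap[addr])               # keep old value
        self.undo_log.append((txn_id, key, addr, before))   # 1: log pointers
        self.heap[addr] = value                             # 2: update in place

    def commit(self, txn_id):
        # Changes are already in place, so commit only drops undo entries.
        self.undo_log = [e for e in self.undo_log if e[0] != txn_id]

    def recover(self):
        """Roll back uncommitted changes, newest first; there is no redo."""
        for _txn, key, addr, before in reversed(self.undo_log):
            if before is None:
                self.table.pop(key, None)
                self.heap.pop(addr, None)
            else:
                self.heap[addr] = self.heap[before]
        self.undo_log.clear()

    def read(self, key):
        addr = self.table.get(key)
        return None if addr is None else self.heap[addr]
```

Recovery cost in this model depends only on the number of undo entries left by in-flight transactions, which is one way to read the "almost instantaneous recovery" benefit.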
#2: Copy-on-Write Updates
[Figure: Copy-on-Write B+Tree, steps numbered 1 to 3. Labels: Master Record, Current Directory, Dirty Directory, Leaf 1, Leaf 2, Updated Leaf 1, TUPLE, UPDATED TUPLE.]
#2: NVM Copy-on-Write Updates
[Figure: allocator-based (not file-based) copy-on-write B+tree, steps numbered 1 to 3. Labels: Master Record, Current Directory, Dirty Directory, Leaf (x2), ONLY POINTERS (x2), Updated Leaf 1, TUPLE, UPDATED TUPLE.]
Benefits
• Support smaller B+tree nodes
• Cheaper dirty directory creation
• Reduces data duplication
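A rough sketch of the copy-on-write scheme, under simplifying assumptions: a flat, fixed-size directory of leaves replaces the B+tree, Python object references stand in for non-volatile pointers, and the names `CowStore`, `put`, `commit`, and `abort` are hypothetical. An update copies only the affected leaf, the dirty directory copies pointers rather than leaf contents, and commit is a single swap of the master record.

```python
from types import MappingProxyType

NUM_LEAVES = 4  # illustrative directory size


class CowStore:
    """Toy copy-on-write store with a current and a dirty directory."""

    def __init__(self):
        empty = MappingProxyType({})
        # The master record points at the current (committed) directory.
        self.master = tuple(empty for _ in range(NUM_LEAVES))
        self.dirty = None

    @staticmethod
    def _leaf_index(key):
        return hash(key) % NUM_LEAVES

    def put(self, key, value):
        base = self.dirty if self.dirty is not None else self.master
        i = self._leaf_index(key)
        new_leaf = dict(base[i])        # copy the one leaf being changed
        new_leaf[key] = value
        directory = list(base)          # copy pointers only, not leaves
        directory[i] = MappingProxyType(new_leaf)
        self.dirty = tuple(directory)

    def commit(self):
        if self.dirty is not None:
            self.master = self.dirty    # single pointer swap
            self.dirty = None

    def abort(self):
        # Also models a crash: the dirty directory is simply discarded.
        self.dirty = None

    def get(self, key):
        """Read committed state through the master record."""
        return self.master[self._leaf_index(key)].get(key)
```

Readers never see a partially applied update, since the current directory stays untouched until `commit` replaces the master record.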
#3: Log-Structured Updates
[Figure: log-structured update path, steps numbered 1 to 3. Labels: MEMTABLE, INDEX, WRITE AHEAD LOG, TUPLE DELTA, SSTABLES, TUPLE.]
#3: NVM Log-Structured Updates
[Figure: NVM log-structured update path, steps numbered 1 to 3. Labels: WRITE AHEAD LOG, MEMTABLE, IMMUTABLE MEMTABLE, SSTABLES, INDEX, TUPLE DELTA, TUPLE, ONLY POINTERS, ✗.]
Benefits
• Cheaper SSTable creation
• Reduces data duplication
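One way to picture the immutable MemTable is the sketch below. It is an illustrative model only: the write-ahead log and compaction are left out, Python dicts stand in for NVM-resident structures, and the names `NvmLogEngine`, `write`, and `read` are hypothetical. When the MemTable fills up it is marked immutable by moving a reference, instead of being serialized into a separate SSTable file, and reads rebuild a tuple by applying deltas from oldest to newest.

```python
class NvmLogEngine:
    """Toy log-structured store that freezes MemTables in place."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}     # key -> list of deltas (changed fields)
        self.immutable = []    # frozen MemTables, oldest first
        self.limit = memtable_limit

    def write(self, key, delta):
        self.memtable.setdefault(key, []).append(dict(delta))
        if len(self.memtable) >= self.limit:
            # Freeze by moving a reference: the deltas are not copied.
            self.immutable.append(self.memtable)
            self.memtable = {}

    def read(self, key):
        """Coalesce all deltas for the key, oldest table first."""
        result = {}
        for table in self.immutable + [self.memtable]:
            for delta in table.get(key, []):
                result.update(delta)
        return result or None
```

A read that touches several tables pays for coalescing deltas, which is the usual trade-off of log-structured designs.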
Summary
• Storage optimizations
  – Avoid unnecessary data duplication
  – Leverage byte-addressability
• Recovery optimizations
  – NVM-optimized recovery protocols
  – Non-volatile data structures
Experimental Evaluation
• NVM Hardware Emulator
  – NVM latency = 2x DRAM latency
• Yahoo! Cloud Serving Benchmark
  – 6 storage engines
  – 2 million records + 1 million transactions
  – Write-heavy workload
  – High-skew setting
Throughput
[Figure: throughput (txn/sec), axis 0 to 1,200,000, by storage engine: INP, COW, LOG (Traditional) and NVM-INP, NVM-COW, NVM-LOG (NVM-Optimized). Annotations: 2x, 4x, 2x.]
Write Endurance
[Figure: NVM stores (M), axis 0 to 300, by storage engine: INP, COW, LOG, NVM-INP, NVM-COW, NVM-LOG. Annotations: 75%, 60%, 80%.]
Recovery Latency
[Figure: recovery latency (ms), axis 0.01 to 10000, versus number of transactions (1000, 10000, 100000) for INP, COW, LOG, NVM-INP, NVM-COW, NVM-LOG. Annotation: Instant Recovery.]
Takeaways
• Designing for NVM is important
  – Higher throughput
  – Longer device lifetime
  – Faster recovery
• System design principles
  – Non-volatile data structures
  – Need a system-level rethink
Peloton @ CMU
• Hybrid storage hierarchy
  – NVM + DRAM oriented design
• HTAP workloads
  – Real-time analytics and fast transactions
END jarulraj@
Thanks !
Filesystem Interface
• Optimized for byte-addressable NVM
• Bypass page-cache & block device layer
• File I/O requires only one copy
  – 7x better performance than EXT4
Allocator Interface
• NVM-aware memory allocator
  – No system calls
  – Bypass kernel’s VFS layer
• Flush CPU cache for durable writes
  – 10x better performance than FS interface
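The store-then-flush ordering behind a durable write can be illustrated as below. This is an analogy under a loud assumption: Python cannot issue CPU cache-line flush instructions, so `mmap.flush` on a file-backed mapping (a page-granular sync that does go through the kernel) stands in for the cache flush the slide describes. The helper name `durable_write` is hypothetical. The point is only the ordering: the write counts as durable once the flush covering its bytes has returned.

```python
import mmap
import tempfile


def durable_write(buf, offset, payload):
    """Store bytes into the mapping, then flush the pages that cover them."""
    end = offset + len(payload)
    buf[offset:end] = payload
    # mmap.flush expects a page-aligned start offset, so round down.
    start = (offset // mmap.PAGESIZE) * mmap.PAGESIZE
    buf.flush(start, end - start)


def open_region(size):
    """Create a temporary file-backed mapping (stand-in for an NVM region)."""
    backing = tempfile.TemporaryFile()
    backing.truncate(size)
    return backing, mmap.mmap(backing.fileno(), size)
```

A caller would typically write the payload with `durable_write` first and only then publish a pointer to it, so that a crash never exposes a pointer to bytes that were not yet persisted.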