Page 1: RAMCloud: A Low-Latency Datacenter Storage System

Ankita Kejriwal
Stanford University
(Joint work with Diego Ongaro, Ryan Stutsman, Steve Rumble, Mendel Rosenblum and John Ousterhout)

Page 2: What if you had…

… a storage system that provides:

● Scale
  • Data size: 10 PB
  • Accessible by 100,000 nodes (10 million cores)
● Uniform fast random access time to all data
  • 100 B read: 2 µs RPC
  • 100 B write: 5 µs RPC
● Durable and available

February 07, 2013 RAMCloud Slide 2

Page 3: RAMCloud

● General-purpose storage system
● All data always in DRAM
● Scale: 1000 – 10000 servers, 1 PB data
● Performance goals:
  • High throughput: 1M ops/sec/server
  • Low-latency access: 5-10 µs RPC
● Durable and available
● Potential impact: enable new class of applications
  • Primary motivation: Web sphere
  • Maybe HPC?

Page 4: DRAM in Storage Systems

[Figure: timeline of DRAM use in storage systems, 1970 – 2010]

● UNIX buffer cache
● Main-memory databases
● Large file caches
● Web indexes entirely in DRAM
● memcached
● Facebook: 200 TB total data, 150 TB cache!
● Main-memory DBs, again

Page 5: DRAM in Storage Systems

● DRAM usage specialized/limited
● Clumsy (consistency with backing store)
● Lost performance (backing store, cache misses)

[Figure: timeline of DRAM use in storage systems, 1970 – 2010, repeated from the previous slide]

Page 6: DRAM is cheaper!

[Figure: storage technology with the lowest TCO (Disk, Flash, or DRAM) as a function of query rate (millions/sec, 0.1 to 1000) and dataset size (TB, 0.1 to 10000). From Andersen et al., "FAWN: A Fast Array of Wimpy Nodes", Proc. 22nd Symposium on Operating Systems Principles, 2009, pp. 1-14.]

Page 7: Why Does Latency Matter?

● Large-scale apps struggle with high latency
  • Random access data rate has not scaled!
  • Facebook: can only make 100-150 internal requests per page

[Figure: a traditional application (UI, application logic, and data structures on a single machine, << 1 µs latency) next to a web application (UI and application logic on application servers, data on storage servers in a datacenter, 0.5-10 ms latency)]

Page 8: RAMCloud Goal: Scale and Latency

● Enable new class of applications

[Figure: a traditional application (UI, application logic, and data structures on a single machine, << 1 µs latency) next to a web application (application servers and storage servers in a datacenter), with the 0.5-10 ms storage latency replaced by 5-10 µs]
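To make the latency figures concrete, the sketch below counts how many back-to-back (dependent) storage requests fit into a page-generation time budget. The 100 ms budget and the function name are assumptions chosen for illustration; only the latency ranges come from the slides.

```python
def sequential_requests(budget_us: int, latency_us: int) -> int:
    """Number of dependent requests that fit in the budget, one after another."""
    return budget_us // latency_us


BUDGET_US = 100_000  # assumed 100 ms budget per page (not from the slides)

# Latencies quoted on the slides, in microseconds.
for label, latency_us in [
    ("datacenter storage, 0.5 ms", 500),
    ("datacenter storage, 10 ms", 10_000),
    ("RAMCloud goal, 5 µs", 5),
    ("RAMCloud goal, 10 µs", 10),
]:
    print(f"{label}: {sequential_requests(BUDGET_US, latency_us)} requests")
```

Under this assumed budget, millisecond-scale latency leaves room for roughly tens to hundreds of dependent requests, while microsecond-scale latency leaves room for thousands.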

Page 9: RAMCloud Architecture

[Figure: 1000 – 100,000 application servers, each running an application linked with a library, connect through the datacenter network to 1000 – 10,000 storage servers, each running a master and a backup; a coordinator is attached to the network]

● Commodity servers
● DRAM: 32-256 GB per server
● High-speed networking:
  • 5 µs round-trip
  • Full bisection bandwidth
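As a rough check on the scale figures, the following sketch multiplies out server counts and per-server DRAM. It assumes decimal units (1 TB = 1000 GB) and that all DRAM holds data, which is a simplification; the function names are illustrative only.

```python
GB_PER_TB = 1000  # decimal units, an assumption


def total_capacity_tb(servers: int, dram_gb_per_server: int) -> float:
    """Aggregate DRAM across the cluster, in TB."""
    return servers * dram_gb_per_server / GB_PER_TB


def aggregate_ops_per_sec(servers: int, ops_per_server: int = 1_000_000) -> int:
    """Cluster throughput if every server reaches the 1M ops/sec goal."""
    return servers * ops_per_server


print(total_capacity_tb(1000, 32))     # smallest configuration on the slide
print(total_capacity_tb(10_000, 256))  # largest configuration on the slide
print(aggregate_ops_per_sec(10_000))   # throughput goal at 10,000 servers
```

On these assumptions, 1 PB spread over 10,000 servers works out to 100 GB per server, which falls inside the 32-256 GB range.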

Page 10: Data Model: Key-Value Store

Tables hold objects; each object consists of:

● Key (≤ 64KB)
● Version (64b)
● Blob (≤ 1MB)

Operations:

read(tableId, key) => blob, version
write(tableId, key, blob) => version
cwrite(tableId, key, blob, version) => version
delete(tableId, key)
enumerate(tableId)

Richer model in the future:

• Indexes?
• Transactions?
• Graphs?
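The operations listed on this slide can be mimicked with a small single-process toy. This is not the RAMCloud client library. Here cwrite is read as a conditional write that succeeds only when the supplied version matches the stored one; that reading, the class and exception names, the store-wide version counter, and the treatment of a missing object as version 0 are all assumptions made for illustration.

```python
from typing import Dict, Iterator, Tuple

MAX_KEY_BYTES = 64 * 1024     # "Key (≤ 64KB)", read as binary units (assumption)
MAX_BLOB_BYTES = 1024 * 1024  # "Blob (≤ 1MB)", read as binary units (assumption)


class VersionMismatch(Exception):
    """Raised by cwrite when the stored version differs from the expected one."""


class ToyStore:
    """Hypothetical in-memory stand-in for the slide's key-value operations."""

    def __init__(self) -> None:
        # tableId -> key -> (blob, version)
        self._tables: Dict[int, Dict[bytes, Tuple[bytes, int]]] = {}
        self._last_version = 0  # store-wide counter, a simplification

    def read(self, table_id: int, key: bytes) -> Tuple[bytes, int]:
        return self._tables[table_id][key]  # KeyError if the object is absent

    def write(self, table_id: int, key: bytes, blob: bytes) -> int:
        if len(key) > MAX_KEY_BYTES or len(blob) > MAX_BLOB_BYTES:
            raise ValueError("key or blob exceeds the size limit")
        self._last_version += 1
        self._tables.setdefault(table_id, {})[key] = (blob, self._last_version)
        return self._last_version

    def cwrite(self, table_id: int, key: bytes, blob: bytes, version: int) -> int:
        # A missing object is treated as version 0 (assumption).
        current = self._tables.get(table_id, {}).get(key, (b"", 0))[1]
        if current != version:
            raise VersionMismatch(f"expected {version}, found {current}")
        return self.write(table_id, key, blob)

    def delete(self, table_id: int, key: bytes) -> None:
        self._tables.get(table_id, {}).pop(key, None)

    def enumerate(self, table_id: int) -> Iterator[Tuple[bytes, bytes, int]]:
        for key, (blob, version) in self._tables.get(table_id, {}).items():
            yield key, blob, version
```

A caller would typically read an object, compute a new blob, and pass the version it read to cwrite, retrying if VersionMismatch is raised.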

Page 11: Durability and Availability

● Goals:
  • No impact on performance
  • Minimum cost, energy
● Keep replicas in DRAM of other servers?
  • 3x system cost, energy
  • Still have to handle power failures
● RAMCloud approach:
  • 1 copy in DRAM
  • Backup copies on disk/flash: durability ~ free!
● Issues to resolve:
  • Synchronous disk I/Os during writes??
  • Data unavailable after crashes??

Page 12: Buffered Logging

● No disk I/O during write requests
● Log-structured: backup disks and master’s memory
● Log cleaning

[Figure: a write request goes to the master, which holds a hash table and an in-memory log; three backups each hold a buffered segment and a disk]
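A minimal sketch of the buffered-logging idea follows, assuming a single process and plain Python lists in place of real segments and disks. The class and method names are hypothetical, and the explicit flush() call stands in for whatever asynchronous disk write the real system performs; log cleaning is not modeled.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (key, value)


class Backup:
    """Holds log entries in a memory buffer; the disk is only touched by flush()."""

    def __init__(self) -> None:
        self.buffered_segment: List[Entry] = []
        self.disk: List[List[Entry]] = []  # each item models one segment on disk

    def append(self, entry: Entry) -> None:
        self.buffered_segment.append(entry)  # no disk I/O on the write path

    def flush(self) -> None:
        # Stand-in for writing the buffered segment to disk in the background.
        if self.buffered_segment:
            self.disk.append(list(self.buffered_segment))
            self.buffered_segment.clear()


class Master:
    """Keeps all data in an append-only in-memory log indexed by a hash table."""

    def __init__(self, backups: List[Backup]) -> None:
        self.log: List[Entry] = []
        self.hash_table: Dict[str, int] = {}  # key -> position of newest entry
        self.backups = backups

    def write(self, key: str, value: str) -> None:
        self.log.append((key, value))
        self.hash_table[key] = len(self.log) - 1
        for backup in self.backups:
            backup.append((key, value))  # replicate to each backup's buffer

    def read(self, key: str) -> str:
        return self.log[self.hash_table[key]][1]
```

In this toy, overwriting a key leaves the old entry in the log, which is the kind of dead space a log cleaner would later reclaim.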

Page 13: Crash Recovery

● Server crashes:
  • Must replay log to reconstruct data
● Crash recovery:
  • Choose recovery master
  • Backup reads log info from disk
  • Transfers logs to recovery master
  • Recovery master replays log
● Meanwhile, data is unavailable
● RAMCloud approach: fast crash recovery
  • 1-2 seconds for 100 GB of data
  • Use system scale to get around bottlenecks

[Figure: the backups of a dead master send log data to a single recovery master]
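The need for system scale can be illustrated with back-of-the-envelope arithmetic on disk read time alone. The 100 MB/s per-disk bandwidth is an assumed round number, not a figure from the slides, and the sketch ignores network transfer and log replay costs.

```python
def disk_read_time_s(data_gb: float, disks: int, disk_mb_per_s: float = 100.0) -> float:
    """Seconds to read data_gb when it is spread evenly over `disks` disks
    that are read in parallel (decimal units: 1 GB = 1000 MB)."""
    return data_gb * 1000 / (disks * disk_mb_per_s)


print(disk_read_time_s(100, disks=1))     # one backup disk
print(disk_read_time_s(100, disks=1000))  # log scattered over 1000 disks
```

Under these assumptions, a single disk needs on the order of a thousand seconds for 100 GB, so reaching 1-2 seconds requires reading from on the order of a thousand disks at once.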

Page 14: Fast Crash Recovery

● Scatter backup data across backups
● Divide each master’s data into partitions
  • Recover each partition on a separate recovery master
  • Each backup divides its log data among recovery masters

[Figure: the backups of a dead master send log data to several recovery masters]
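The two bullets above can be sketched in a few lines. This toy assumes round-robin placement of segments on backups and CRC32 hashing of keys into partitions; both choices, and all function names, are illustrative assumptions, not the scheme RAMCloud itself uses.

```python
import zlib
from typing import Dict, List, Tuple

Entry = Tuple[int, str, str]  # (sequence number, key, value)


def partition_of(key: str, num_partitions: int) -> int:
    """Deterministic key-to-partition mapping (assumed: CRC32 modulo)."""
    return zlib.crc32(key.encode()) % num_partitions


def scatter(log: List[Entry], num_backups: int, segment_size: int) -> List[List[Entry]]:
    """Cut the log into segments and spread them round-robin over backups."""
    backups: List[List[Entry]] = [[] for _ in range(num_backups)]
    for start in range(0, len(log), segment_size):
        target = (start // segment_size) % num_backups
        backups[target].extend(log[start:start + segment_size])
    return backups


def recover(backups: List[List[Entry]], num_masters: int) -> List[Dict[str, str]]:
    """Each backup divides its entries among recovery masters by partition;
    each recovery master then replays its share in sequence order."""
    inboxes: List[List[Entry]] = [[] for _ in range(num_masters)]
    for stored in backups:
        for entry in stored:
            inboxes[partition_of(entry[1], num_masters)].append(entry)
    recovered = []
    for inbox in inboxes:
        table: Dict[str, str] = {}
        for _, key, value in sorted(inbox):  # later writes overwrite earlier ones
            table[key] = value
        recovered.append(table)
    return recovered
```

Because every key maps to exactly one partition, the recovered tables are disjoint and their union equals the dead master's final state.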

Page 15: RAMCloud Project Status

● Goal: build production-quality implementation
● Nearing 1.0-level release
● Current test cluster:
  • 80 servers, 2 TB data
  • High-speed InfiniBand networking
  • Performance, 100 B read: 5.3 µs RPC
  • Performance, 100 B write: 15 µs RPC
● Interested in finding applications for RAMCloud

Page 16: Is RAMCloud right for HPC apps?

Properties of RAMCloud relevant to application developers:

● Durability and availability
● Key-value store
● Commodity hardware
● Read / write access latency
● Random access to small objects

Page 17: Conclusion

● General-purpose storage system
● All data always in DRAM
● Designed for:
  • Scale: 1000 – 10000 servers, 1 PB data
  • Performance: 5-10 µs RPC
● Durable and available

Page 18: Questions

● Is RAMCloud appropriate for HPC applications?
  • Durability and availability
  • Key-value store
  • Commodity hardware
  • Read / write access latency
  • Random access to small objects
● One thing that we could change to make RAMCloud interesting to you!

Page 19: Thank you!