SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

16
1 SEDCL, June 7-8, 2012 Table Enumeration in RAMCloud Elliott Slaughter

Transcript of SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

Page 1: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

1SEDCL, June 7-8, 2012

Table Enumeration in RAMCloud

Elliott Slaughter

Page 2: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

2SEDCL, June 7-8, 2012

Overview

•Motivation

•Consistency Goals

•RAMCloud Internals

•Enumerate Algorithm

Page 3: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

3SEDCL, June 7-8, 2012

What is enumeration?

•Distributed key/value store

•Iterate over all key/value pairs in a table

•SELECT *

•Use cases:

•“ls”

•Offline backups

Page 4: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

4SEDCL, June 7-8, 2012

Pitfalls in enumeration

•Table can be modified during enumeration

•Table might be distributed on multiple servers

•Need multiple requests to fetch a table

•Servers can go up and down

•Table can be redistributed across the cluster

•Etc.

Page 5: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

5SEDCL, June 7-8, 2012

Consistency goals• “Once-only” consistency model:

• Any object whose lifetime spans the entire enumeration will be returned exactly once

• No object will be returned more than once

Page 6: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

6SEDCL, June 7-8, 2012

How clients locate objects on servers• Clients locate objects by hashing keys

• Table is divided into chunks in the hash space and distributed among servers

• Tables can be redistributed at any time

Page 7: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

7SEDCL, June 7-8, 2012

How servers locate objects for clients•Servers use a hash table to locate

objects in their log

•Many different tablets may be collocated in the hash table

Page 8: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

8SEDCL, June 7-8, 2012

The client algorithm

•Suppose we can encapsulate the state of iteration in an opaque blob

•Enumerate(tabletStartHash, iterator) =>nextTabletStartHash, nextIterator, objects

•Just call this RPC repeatedly until we receive no objects back

Page 9: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

9SEDCL, June 7-8, 2012

The server algorithm: Naive

version•Client iterates tablets front to back

• Server walks hash table in bucket order and collects objects

Iterator contains:bucketIndex

Page 10: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

10

SEDCL, June 7-8, 2012

Problem: Large buckets

•No guarantees on ordering of elements in bucket

Page 11: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

11

SEDCL, June 7-8, 2012

Solution: Large buckets

• Sort large buckets by hash value, and keep track of the next bucket hash in the iterator

Iterator contains:bucketIndex

nextBucketHash

Page 12: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

12

SEDCL, June 7-8, 2012

Problem: Tablet reconfiguration

Before merge After merge

Page 13: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

13

SEDCL, June 7-8, 2012

Solution: Tablet reconfiguration

• Iterator is now a stack

• Keep track of tablet boundaries

Iterator contains:bucketIndex

nextBucketHashtabletStartHashtabletEndHash

*

After mergeBefore merge

Page 14: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

14

SEDCL, June 7-8, 2012

Problem: Rehashing

•Mapping to buckets depends on size of hash table

•Different servers might have different sizes of hash tables

Page 15: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

15

SEDCL, June 7-8, 2012

Solution: Rehashing

• Track the size of the hash table to handle resizing the table

Iterator contains:bucketIndex

nextBucketHashtabletStartHashtabletEndHashnumBuckets

*

Page 16: SEDCL, June 7-8, 2012 1 Table Enumeration in RAMCloud Elliott Slaughter.

16

SEDCL, June 7-8, 2012

Conclusions

Iterator contains:bucketIndex

nextBucketHashtabletStartHashtabletEndHashnumBuckets

*

P.S. I know of at least two more problems. Can you find them?

• Pros:

• Iterator contains entire state (server is stateless)

•No extra data structures

•Cons:

• Efficiency?

•Consistency?