Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon,...

37
Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout and Mendel Rosenblum 1

Transcript of Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon,...

Page 1: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

1

Copysets: Reducing the Frequency of Data Loss in Cloud Storage

Stanford University

Asaf Cidon, Stephen M. Rumble, Ryan Stutsman,Sachin Katti, John Ousterhout and Mendel Rosenblum

Page 2: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Goal: Tolerate Node Failures

Random replication used by:• HDFS• GFS• Windows Azure• RAMCloud• …

Choose random

Page 3: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Not All Failures are Independent

• Power outages– 1-2 times a year [Google, LinkedIn, Yahoo]

• Large scale network failures– 5-10 times a year [Google, LinkedIn]

• And more:– Rolling software/hardware upgrades– Power down

Page 4: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication Fails Under Simultaneous Failures

Confirmed by:Facebook, Yahoo, LinkedIn

Page 5: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Page 6: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Page 7: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Page 8: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Page 9: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Page 10: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Copysets:{1, 5, 6}, {2, 6, 8}

Page 11: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Page 12: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Copysets:{1, 2, 3}, {1, 2, 4}, {1, 2, 5},{1, 2, 6}, {1, 2, 7}, {1, 2, 8},

Page 13: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Replication Causes Frequent Data Loss

• Random replication eventually creates maximum number of copysets– Any combination of 3 nodes– = 84 copysets

• If 3 nodes fail, 100% probability of data loss–

Page 14: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

MinCopysets

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Page 15: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

MinCopysets

Node 1

Node 4

Node 7

Node 2

Node 5

Node 8 Node 9

Node 6

Node 3

Copysets:{1, 5, 7}, {2, 4, 9}, {3, 6, 8}

Page 16: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

MinCopysets Minimizes Data Loss Frequency

• MinCopysets creates minimum number of copysets– Only {1, 5, 7}, {2, 4, 9}, {3, 6, 8}

• If 3 nodes fail, 3.5% of data loss–

Page 17: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

MinCopysets Reduces Probability of Data Loss

Page 18: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

The Trade-off

MinCopysets Random Replication

Mean Time to Failure 625 years 1 year

Amount of Data Lost 1 TB 5.5 GB

• 5000-node cluster• Power outage occurs every year Confirmed by:

Facebook, LinkedIn, NetApp, Google

Page 19: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Problem: MinCopysets Increases Single Node Recovery Time

Random Replication MinCopysets0

100

200

300

400

500

600

700

800

Time to Recovery a 100 GB Node in 39-node HDFS cluster

Reco

very

Tim

e (s

econ

ds)

Page 20: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Facebook Extension to HDFS

Choose random

Buddy Group

Page 21: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

A Compromise

HDFS Random Replica-tion

Facebook Extension to HDFS

MinCopysets0

100

200

300

400

500

600

700

800

Time to Recovery a 100 GB Node in 39-node HDFS cluster

Reco

very

Tim

e (s

econ

ds)

Page 22: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Can We Do Better?

Page 23: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Definition: Scatter Width

Facebook Extension to HDFSScatter Width = 10

MinCopysetsScatter Width = 2

Page 24: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Facebook Extension to HDFS

• Node 1’s copysets:– {1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 3, 4}, {1, 3, 5}, {1, 4, 5}

• Overall: 54 copysets• If 3 nodes fail simultaneously:•

1 2 3 4 5 6 8 97

Buddy group

Page 25: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Copyset Replication: Intuition

• Same scatter width (4), different scheme:{1, 2, 3}, {4, 5, 6}, {7, 8, 9}{1, 4, 7}, {2, 5, 8}, {3, 6, 9}

Ingredients of ideal scheme1. Maximize scatter width2. Minimize overlaps•

1

2 3

4 7

Page 26: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Random Permutation

1 2 3 4 5 6 8 97

7 3 5 6 2 9 8 41

Copyset Replication: Initialization

Split into copysets (Scatter width = 2)

7 3 5 6 2 9 8 41

Copyset Copyset Copyset

Page 27: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

1 2 3 4 5 6 8 97

Copyset Replication: Initialization

Permutation 1: Scatter width = 2

7 3 5 6 2 9 8 41

Permutation 2: Scatter width = 4

9 7 1 5 6 8 2 34

Permutation 5: Scatter width = 10

Page 28: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

1 2 3 4 5 6 8 97

Copyset Replication: Replication

7 3 5 6 2 9 8 41

9 7 1 5 6 8 2 34

Randomly choose copyset

Page 29: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Insignificant Overhead

HDFS

Face

book Exte

nsion to

HDFS

Copyset R

eplication

MinCopysets

0100200300400500600700800

Time to Recovery a 100 GB Node in 39-node HDFS cluster

Reco

very

Tim

e (s

econ

ds)

Page 30: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Copyset Replication

Page 31: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Inherent Trade-off

Page 32: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Related Work

• BIBD (Balanced Incomplete Block Designs)– Originally proposed for designing agricultural

experiments in the 1930’s! [Fisher, ’40]

• Other applications– Power downs [Harnik et al ’09, Leverich et al ’10, Thereska ’11]

– Multi-fabric interconnects [Mehra, ’99]

Page 33: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

Summary

1. Many storage systems randomly spray their data across a large number of nodes

2. Serious problem with correlated failures3. Copyset Replication is a better way of

spraying data that decreases the probability of correlated failures

Page 34: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

34

Thank You!

Stanford University

Page 35: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

More Failures (Facebook)

Page 36: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

RAMCloud

Page 37: Copysets: Reducing the Frequency of Data Loss in Cloud Storage Stanford University Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout.

HDFS