Transcript of "Naming and Integrity: Self-Verifying Data in Peer-to-Peer Systems" (Hakim Weatherspoon, Chris Wells, John Kubiatowicz, University of California, Berkeley; Future Directions in Distributed Computing, Thursday, June 6, 2002)

Page 1:

Naming and Integrity: Self-Verifying Data in Peer-to-Peer Systems

Hakim Weatherspoon, Chris Wells, John Kubiatowicz
University of California, Berkeley

Future Directions in Distributed Computing. Thursday, June 6, 2002

Page 2:

OceanStore Context: Ubiquitous Computing

• Computing everywhere:
  – Desktop, laptop, palmtop.
  – Cars, cellphones.
  – Shoes? Clothing? Walls?
• Connectivity everywhere:
  – Rapid growth of bandwidth in the interior of the net.
  – Broadband to the home and office.
  – Wireless technologies such as CDMA, satellite, laser.

Page 3:

Archival Storage

• Where is persistent information stored?
  – Want: geographic independence for availability, durability, and freedom to adapt to circumstances.
• How is it protected?
  – Want: encryption for privacy, secure naming and signatures for authenticity, and Byzantine commitment for integrity.
• Is it available/durable?
  – Want: redundancy with continuous repair and redistribution for long-term durability.

Page 4:

Path of an Update

Page 5:

Questions about Data?

• How can redundancy be used to protect against data loss?

• How can data be verified?

Page 6:

Archival Dissemination Built into Update

• Erasure codes:
  – Redundancy without the overhead of strict replication.
  – Produce n fragments, where any m are sufficient to reconstruct the data (m < n).
  – Rate r = m/n; storage overhead is 1/r (see the sketch below).
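To make the m-of-n property concrete, here is a toy, self-contained erasure coder in Python. It is a sketch only: it treats the m data words as samples of a degree-(m-1) polynomial over a prime field and hands out n evaluations as fragments, so any m fragments reconstruct the data by interpolation. OceanStore's real encoder differs in the details; all names here are illustrative.

```python
# Toy (m, n) erasure code sketch: the m data words are the values of a
# degree-(m-1) polynomial at x = 0..m-1; fragments are its values at
# x = 0..n-1, so the first m fragments literally are the data (a
# "systematic" code). Any m fragments recover everything.

P = 2**31 - 1  # prime modulus; all arithmetic is in GF(P)

def interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

def encode(data_words, n):
    """Produce n fragments from m = len(data_words) words (m < n)."""
    pts = list(enumerate(data_words))            # (x, word) for x = 0..m-1
    return [(x, interpolate(pts, x)) for x in range(n)]

def decode(fragments, m):
    """Reconstruct the m data words from ANY m surviving fragments."""
    pts = fragments[:m]
    return [interpolate(pts, x) for x in range(m)]

data = [104, 101, 108, 112]                      # m = 4 words
frags = encode(data, 8)                          # n = 8: r = 1/2, overhead 2x
survivors = [frags[1], frags[4], frags[6], frags[7]]  # any 4 of the 8
assert decode(survivors, 4) == data
```

With m = 16 and n = 32, as in the performance experiments later in the talk, the same scheme has rate r = 1/2 and a storage overhead of 2x.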

Page 7:

Durability

[Figure: probability of block failure per year (log scale, 1e-70 to 1) versus repair time (0 to 24 months), for n = 4, 8, 16, 32, and 64 fragments.]

• Fraction of Blocks Lost Per Year (FBLPY)*
  – r = 1/4 erasure-encoded block (e.g., m = 16, n = 64).
  – Increasing the number of fragments increases the durability of a block, at the same storage cost and repair time (a simple model is sketched below).
  – The n = 4 fragment case is equivalent to replication on four servers.

* Erasure Coding vs. Replication, H. Weatherspoon and J. Kubiatowicz, in Proc. of IPTPS 2002.
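The shape of the plot follows from a simple binomial model. Assume each fragment independently survives a repair epoch with probability p; a block is lost only when fewer than m of its n fragments survive. The sketch below uses this model with an assumed p; the cited IPTPS 2002 paper develops the full analysis.

```python
# Binomial-tail model of block failure, a minimal sketch: a rate
# r = 1/4 block with n fragments needs any m = n/4 of them to survive
# the repair epoch. p is an assumed per-fragment survival probability.
from math import comb

def block_failure_prob(n, m, p):
    """P(fewer than m of the n fragments survive one epoch)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m))

p = 0.9  # illustrative; the real value depends on repair time
for n in (4, 8, 16, 32, 64):
    print(n, block_failure_prob(n, n // 4, p))
# The failure probability drops by orders of magnitude as n grows,
# even though the storage overhead (1/r = 4) stays fixed.
```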

Page 8:

Naming and Verification Algorithm

• Use a cryptographically secure hash algorithm to detect corrupted fragments.
• Verification tree:
  – n is the number of fragments.
  – Store log(n) + 1 hashes with each fragment.
  – Total of n·(log(n) + 1) hashes.
• The top hash is the block GUID (B-GUID).
  – Fragments and blocks are self-verifying (see the sketch below).

[Figure: verification tree over n = 4 encoded fragments F1..F4, with Hi = hash(Fi), H12 = hash(H1, H2), H34 = hash(H3, H4), H14 = hash(H12, H34), Hd = hash(data), and the B-GUID at the top.]

Each fragment is stored with the sibling hashes on its path to the root, plus Hd:

  Fragment 1: H2, H34, Hd, F1 fragment data
  Fragment 2: H1, H34, Hd, F2 fragment data
  Fragment 3: H4, H12, Hd, F3 fragment data
  Fragment 4: H3, H12, Hd, F4 fragment data
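A minimal sketch of this naming scheme for the n = 4 case, assuming SHA-1 as the secure hash and assuming the B-GUID binds H14 together with the data hash Hd, as the figure suggests; the function names are illustrative.

```python
# Sketch of the slide's verification tree for n = 4 fragments. Each
# fragment carries log(n) + 1 = 3 hashes, 12 in total for the block.
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha1(b"".join(parts)).digest()

def name_block(data: bytes, frags: list) -> tuple:
    """Build the tree; return (B-GUID, the 3 hashes stored per fragment)."""
    hd = h(data)
    h1, h2, h3, h4 = (h(f) for f in frags)
    h12, h34 = h(h1, h2), h(h3, h4)
    b_guid = h(h(h12, h34), hd)                  # top of the tree
    proofs = [(h2, h34, hd), (h1, h34, hd),      # fragments 1 and 2
              (h4, h12, hd), (h3, h12, hd)]      # fragments 3 and 4
    return b_guid, proofs

def verify(b_guid: bytes, i: int, frag: bytes, proof: tuple) -> bool:
    """Recompute the root from fragment i (0-based) and its stored hashes."""
    sibling, uncle, hd = proof
    hf = h(frag)
    inner = h(hf, sibling) if i % 2 == 0 else h(sibling, hf)
    h14 = h(inner, uncle) if i < 2 else h(uncle, inner)
    return h(h14, hd) == b_guid

frags = [b"F1", b"F2", b"F3", b"F4"]        # stand-ins for encoded fragments
b_guid, proofs = name_block(b"data", frags)
assert all(verify(b_guid, i, f, proofs[i]) for i, f in enumerate(frags))
assert not verify(b_guid, 0, b"corrupted", proofs[0])
```

A storage server can thus check a single fragment against the B-GUID without seeing any other fragment, which is what makes fragments self-verifying.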

Page 9:

Enabling Technology

[Figure: fragments named and located by GUID through the location and routing layer.]

Page 10:

Complex Objects I

[Figure: a block of data d is the unit of coding; its verification tree yields the GUID of d; the encoded fragments are the unit of archival storage.]

Page 11:

Complex Objects II

[Figure: a complex object. Data blocks d1..d9 hang off a data B-tree of indirect blocks rooted at a top block M, whose GUID is the version GUID (VGUID). Each block, e.g. d1 with its own GUID, is a unit of coding with its own verification tree; its encoded fragments are the unit of archival storage.]

Page 12:

Complex Objects III

[Figure: versioning via copy-on-write. Version i (root M, named VGUIDi) and version i+1 (named VGUIDi+1, holding a backpointer to M) share all unchanged indirect and data blocks; modified data blocks d'8 and d'9 are written copy-on-write.]

AGUID = hash{name + keys}
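A minimal sketch of the AGUID derivation, assuming SHA-1 and taking "keys" to be the owner's public-key bytes; both choices are assumptions, not fixed by the slide.

```python
# AGUID = hash{name + keys}: the permanent, self-certifying name of a
# mutable object. SHA-1 and the key encoding are assumptions here.
import hashlib

def aguid(name: str, owner_public_key: bytes) -> bytes:
    return hashlib.sha1(name.encode("utf-8") + owner_public_key).digest()
```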

Page 13:

Mutable Data

• A real system needs mutable data:
  – an entity in the network that maintains the A-GUID to V-GUID mapping (a toy mapping is sketched below);
  – Byzantine commitment for integrity, which verifies client privileges, creates a serial order, and atomically applies each update.

• Versioning system:
  – Each version is inherently read-only.
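A hypothetical sketch of the A-GUID to V-GUID mapping; the class and method names are illustrative, and the Byzantine-agreement step that must precede each commit is elided.

```python
# Toy A-GUID -> V-GUID mapping: each committed update appends the new
# version's V-GUID, so every existing version stays read-only.
from dataclasses import dataclass, field

@dataclass
class MutableObject:
    aguid: bytes
    vguids: list = field(default_factory=list)  # V-GUIDs in serial order

    def commit(self, vguid: bytes) -> None:
        # In OceanStore this happens only after Byzantine commitment has
        # verified the client's privileges and serialized the update.
        self.vguids.append(vguid)

    def latest(self) -> bytes:
        return self.vguids[-1]
```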

Page 14:

Archiver I

• Archiver server architecture:
  – Requests to archive objects are received through the network layers.
  – The consistency mechanism decides when to archive an object.

[Figure: staged, event-driven server architecture. Application stages (Consistency, Location & Routing, Archiver, Introspection Modules) exchange events through a dispatch layer above a thread scheduler and Java virtual machine, with asynchronous network and disk I/O below the operating system.]

Page 15:

Archiver Control Flow

[Figure: archiver control flow. The ConsistencyStage(s) issue a GenerateFragsChkptReq; the GenerateChkptStage and GenerateFragsStage answer with GenerateFragsChkptResp and GenerateFragsResp (driven by GenerateFragsReq); a DisseminateFragsReq then has the DisseminatorStage send fragments to storage servers and return DisseminateFragsResp.]
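A hypothetical sketch of this staged, message-driven flow; the stage and message names come from the slide, but the queue-based plumbing and the exact wiring of requests to stages are assumptions.

```python
# Each stage owns an inbox of typed events and emits a response event
# that drives the next stage; illustrative plumbing only.
import queue

class Stage:
    def __init__(self, name, handler):
        self.name, self.handler = name, handler
        self.inbox = queue.Queue()

    def run_once(self):
        """Handle one queued event and return the emitted response."""
        return self.handler(self.inbox.get())

gen_frags = Stage("GenerateFragsStage",
                  lambda ev: ("GenerateFragsResp", f"fragments of {ev[1]}"))
disseminator = Stage("DisseminatorStage",
                     lambda ev: ("DisseminateFragsResp",
                                 "fragments sent to storage servers"))

gen_frags.inbox.put(("GenerateFragsReq", "block d1"))
_, frags = gen_frags.run_once()
disseminator.inbox.put(("DisseminateFragsReq", frags))
print(disseminator.run_once())
```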

Page 16:

Performance

• Performance of the archival layer:
  – Performance of an OceanStore server archiving objects.
  – Analyze the operations of archiving data (this includes signing updates in a BFT protocol):
    • No archiving.
    • Inlined archiving (synchronous), m = 16, n = 32.
    • Delayed archiving (asynchronous), m = 16, n = 32.

• Experiment environment:
  – OceanStore servers were analyzed on a 42-node cluster.
  – Each machine in the cluster is an IBM xSeries 330 1U rackmount PC with:
    • two 1.0 GHz Pentium III CPUs,
    • 1.5 GB ECC PC133 SDRAM,
    • two 36 GB IBM UltraStar 36LZX hard drives,
    • a single Intel PRO/1000 XF gigabit Ethernet adaptor connecting to a Packet Engines gigabit switch,
    • a Linux 2.4.17 SMP kernel.

Page 17:

Performance: Throughput

• Data throughput:
  – No archive: 5 MB/s.
  – Delayed: 3 MB/s.
  – Inlined: 2.5 MB/s.

[Figure: update throughput (MB/s, 0 to 5) versus update size (1 to 10000 kB) for Delayed Archive, Inlined Archive, and No Archive.]

Page 18:

Performance: Latency

• Latency:
  – Archive only: y-intercept 3 ms, slope 0.3 s/MB.
  – Inlined archive ≈ no archive + archive only.

[Figure: update latency (ms, 0 to 80) versus update size (2 to 32 kB), with linear fits: Inlined Archive y = 1.2x + 36.4, No Archive y = 0.6x + 29.6, Only Archive y = 0.3x + 3.0.]

Page 19:

Future Directions

• Caching for performance
  – Automatic replica placement

• Automatic Repair

Page 20:

Caching

• Automatic replica placement:
  – Replicas are soft state.
  – They can be constructed and destroyed as necessary.
• Prefetching:
  – Reconstruct replicas from fragments in advance of use.

Page 21:

Efficient Repair

• Global:
  – Global sweep and repair is not efficient.
  – Want detection and notification of node removal from the system.
  – Not as effective as distributed mechanisms.
• Distributed:
  – Exploit the DOLR's distributed information and locality properties.
  – Efficient detection, then reconstruction of fragments (sketched below).

[Figure: DOLR neighbor graph. Nodes (e.g., 3274, 4577, 5544, AE87, 9098) are connected by level L1, L2, and L3 links; each node monitors its nearest neighbors through a ring of L1 heartbeats.]
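A hypothetical sketch of failure detection over the ring of L1 heartbeats: each node tracks when it last heard from each level-1 neighbor and flags silent ones for fragment reconstruction. The names and the timeout value are illustrative, not from the slides.

```python
import time

HEARTBEAT_TIMEOUT = 30.0  # seconds (assumed)

class RepairMonitor:
    def __init__(self, l1_neighbors):
        now = time.monotonic()
        self.last_seen = {node: now for node in l1_neighbors}

    def on_heartbeat(self, node):
        self.last_seen[node] = time.monotonic()

    def suspected_failures(self):
        """Neighbors whose fragments should be reconstructed from the
        surviving m-of-n and redistributed to fresh servers."""
        now = time.monotonic()
        return [n for n, t in self.last_seen.items()
                if now - t > HEARTBEAT_TIMEOUT]

monitor = RepairMonitor(["4577", "9098", "AE87"])  # IDs from the figure
monitor.on_heartbeat("4577")
print(monitor.suspected_failures())  # [] until a timeout elapses
```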

Page 22:

Conclusion

• Storage-efficient, self-verifying mechanism:
  – Erasure codes are good.

• Self-verifying data assist in:
  – secure read-only data,
  – secure caching infrastructures,
  – continuous adaptation and repair.

For more information: http://oceanstore.cs.berkeley.edu/