
The Center for Cloud and Autonomic Computing is supported by the National Science Foundation under Grant No. 1362134.

NSF CAC Semi-Annual Meeting, April 28-29, 2016

UniStore Project Updates
Wei Xie, Jiang Zhou, Mark Reyes, Jason Nobel and Yong Chen
Department of Computer Science and Nimboxx, Inc.

Abstract
• 2 new papers completed:
  • Tiered-CRUSH
  • Pattern-directed Replication Scheme
• 1 new paper in preparation: version consistent hashing
• 2 papers under review: PRS and SUORA
• Simulation code and a prototype developed to evaluate the proposed schemes
• The proposed schemes improve the performance of heterogeneous storage systems while maintaining balanced storage utilization

• Most data centers use heterogeneous storage that combines hard disk drives (HDDs) with emerging storage class memory (SCM), e.g., solid state drives (SSDs) and phase change memory (PCM)
• There is no storage system that unifies the management of heterogeneous storage devices efficiently
• Data are replicated for availability, but not for performance
• Goal: take full advantage of the performance of faster storage devices and the cost-efficiency of hard disk drives

• Overview of a pattern-directed replication scheme

• Object distance calculation

• Evaluation

Motivation and Goals

Research Paper 2: Pattern-directed Replication Scheme

Acknowledgements
We are grateful to the Cloud and Autonomic Computing site at Texas Tech University for its valuable support of this project. We also thank the High Performance Computing Center at Texas Tech University for providing computing resources and support.

• CRUSH ensures data are placed across multiple independent locations to improve data availability
• Tiered-CRUSH integrates storage tiering into CRUSH data placement
• The virtualized volumes have different access patterns
• The access frequency of each object is recorded per volume; hotter data are more likely to be placed on faster tiers (see the sketch below)
• Fair storage utilization is maintained
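As a rough, hypothetical sketch of the idea above (class and function names, the threshold, and the tier weights are assumptions, not the Tiered-CRUSH implementation), per-volume access counts can drive which tier an object's primary copy lands on:

```python
import random

# Hypothetical sketch of hotness-based tiering (not the actual Tiered-CRUSH code).
# Tier 1 is the fastest (e.g., NVMe SSD), tier 3 the slowest (HDD);
# the weights are assumed capacity ratios used for the cold-data fallback.
TIER_WEIGHTS = {1: 0.1, 2: 0.3, 3: 0.6}

class VolumeStats:
    """Per-volume access-frequency tracking, as described above."""
    def __init__(self):
        self.access_count = {}            # object id -> access count

    def record_access(self, obj_id):
        self.access_count[obj_id] = self.access_count.get(obj_id, 0) + 1

    def hotness(self, obj_id):
        total = sum(self.access_count.values()) or 1
        return self.access_count.get(obj_id, 0) / total

def choose_tier(stats, obj_id, hot_threshold=0.01):
    """Hot objects are steered to the fastest tier; cold objects fall back to
    a capacity-weighted choice, which keeps storage utilization balanced."""
    if stats.hotness(obj_id) >= hot_threshold:
        return 1
    tiers, weights = zip(*TIER_WEIGHTS.items())
    return random.choices(tiers, weights=weights, k=1)[0]

# Usage: stats = VolumeStats(); stats.record_access(42); tier = choose_tier(stats, 42)
```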

(1) The original objects are placed on hybrid nodes in the default layout
(2) The runtime object I/O requests are traced by a trace collector
(3) The scheme analyzes the trace, reorganizes objects according to the identified access patterns, and creates replicas
(4) When applications run again, objects are read from the replicas in the optimized layout
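A minimal, hypothetical sketch of steps (2) to (4); the function names are illustrative, not the prototype's API:

```python
from collections import defaultdict

def collect_trace(io_requests):
    """Step (2): record runtime object I/O requests as (obj_id, op) pairs."""
    return list(io_requests)

def analyze_trace(trace):
    """Step (3), first half: derive per-object access counts from the trace;
    these counts feed the distance calculations shown below."""
    counts = defaultdict(int)
    for obj_id, _op in trace:
        counts[obj_id] += 1
    return counts

def redirect_read(obj_id, optimized_layout, default_layout):
    """Step (4): serve reads from the optimized replica layout when available."""
    return optimized_layout.get(obj_id, default_layout[obj_id])

# Usage: counts = analyze_trace(collect_trace([(1, "read"), (2, "read"), (1, "read")]))
```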

[Evaluation figure panel titles: (a) Random Read, (c) Overhead]

[Figure: PRS workflow. An application running on a real or virtual machine issues object I/O requests to heterogeneous nodes in the default layout; a trace collector produces an I/O trace; a trace analyzer, distribution algorithm, object reorganization, and data replicator build the optimized layout; on subsequent runs, object I/O is redirected to the replicas.]

Pattern-directed Replication Scheme (PRS)

(a) Local access pattern

(b) Global access pattern

Local access pattern distance:
$$\mathrm{dist}(o_1, o_2) = 1 - \min\left(\frac{\mathrm{count}(o_1, o_2)}{\mathrm{count}(o_1)},\ \frac{\mathrm{count}(o_1, o_2)}{\mathrm{count}(o_2)}\right)$$
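As a rough sketch (function and variable names are assumptions, not the paper's code), the local-pattern distance can be computed from access counts and co-access counts extracted from the I/O trace:

```python
def local_distance(o1, o2, count, co_count):
    """Local access pattern distance:
    dist(o1, o2) = 1 - min(count(o1, o2) / count(o1), count(o1, o2) / count(o2)).
    count: object id -> access count; co_count: (o1, o2) pair -> co-access count."""
    together = co_count.get((o1, o2), co_count.get((o2, o1), 0))
    if count.get(o1, 0) == 0 or count.get(o2, 0) == 0:
        return 1.0  # never accessed together: maximum distance
    return 1.0 - min(together / count[o1], together / count[o2])

# Example: two objects accessed 10 times each, 8 of them together:
# local_distance(2, 4, {2: 10, 4: 10}, {(2, 4): 8}) ≈ 0.2 (close, same replica group)
```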

[Figure: local access pattern example. Objects with IDs 1 to 10; co-accessed objects 2, 4, 6 and 7, 9, 10 form Replica 1 and Replica 2, and a new object is assigned to a group.]

Global access pattern distance:
$$\mathrm{dist}(o_1, o_2) = \frac{\left|\mathrm{count}(o_1) - \mathrm{count}(o_2)\right|}{\mathrm{count}(o_1) + \mathrm{count}(o_2)}$$
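A corresponding sketch for the global-pattern distance (again with assumed names; the absolute value follows the reconstruction above): objects with similar access frequency end up close together, and the hottest ones become candidates for an extra replica on SCM.

```python
def global_distance(o1, o2, count):
    """Global access pattern distance:
    dist(o1, o2) = |count(o1) - count(o2)| / (count(o1) + count(o2))."""
    c1, c2 = count.get(o1, 0), count.get(o2, 0)
    if c1 + c2 == 0:
        return 0.0  # two never-accessed objects look equally cold
    return abs(c1 - c2) / (c1 + c2)

def hot_objects(count, threshold):
    """Objects accessed more often than the threshold are candidates for an
    extra replica on the faster tier (SCM)."""
    return [obj for obj, c in count.items() if c > threshold]

# Example: count = {50: 90, 51: 80, 53: 85, 1: 3}
# global_distance(50, 51, count) ≈ 0.06 (similar hotness),
# global_distance(50, 1, count) ≈ 0.94, so object 1 stays on HDD.
```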

[Figure: global access pattern example on a heterogeneous device architecture (HDD and SCM). Objects with IDs 1 to 5 and 50 to 53; Replica 1 and Replica 2 reside on HDD, while hot data (objects 50, 51, 53) get an additional Replica 3 on SCM; ratio = SCM capacity / total capacity.]

[Plots (two panels): FIO with different numbers of processes. Aggregated bandwidth (MB/s) vs. number of processes (1 to 512); series: Original-1 rep, Original-3 rep, PD only.]

(b) Sequential Read

Research Paper 1: Tiered-CRUSH

[Plot: FIO with different numbers of processes. Aggregated bandwidth (MB/s) vs. number of processes (1 to 512); series: Original-1 rep, PRS scheme.]

[Figure: Tiered-CRUSH placement. The same nodes (NVMe SSD: N, SATA SSD: S, hard disk drive: H) appear in two hierarchies: a tier map (Root > Tier1, Tier2, Tier3) and a CRUSH map (Root > Cab1, Cab2, Cab3). The primary copy of a data object is placed using the tiering information and the tier map; Replica1 and Replica2 are placed using the CRUSH map. A server monitors per-volume object read and write counts and calculates and maintains the hotness/tiering information for the storage tiers.]

• Build versions into the virtual nodes (a minimal sketch follows below)
• Avoid data migration when nodes are added or fail
• Maintain efficient data lookup
• Example of 3 versions of a consistent hashing ring
• Data lookup algorithm
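A minimal sketch of the versioned-ring idea under simplifying assumptions (the class, method names, hash, and commit model are illustrative, not the actual design): adding a node creates a new uncommitted ring version instead of immediately remapping keys, so existing data needs no migration until the new version is committed.

```python
import bisect, hashlib

def _hash(value):
    # Toy hash onto a small ring space; purely for illustration.
    return int(hashlib.md5(str(value).encode()).hexdigest(), 16) % 1000

class VersionedRing:
    """Each ring version keeps its own sorted list of (position, node)."""
    def __init__(self, nodes):
        self.versions = {1: sorted((_hash(n), n) for n in nodes)}
        self.committed = 1

    def add_node(self, node):
        """Create a new (uncommitted) version that includes the node."""
        new_version = max(self.versions) + 1
        ring = sorted(self.versions[self.committed] + [(_hash(node), node)])
        self.versions[new_version] = ring
        return new_version

    def commit(self, version):
        self.committed = version

    def lookup(self, key, version=None):
        """Successor node of the key on the chosen (default committed) version."""
        ring = self.versions[version or self.committed]
        idx = bisect.bisect_right([pos for pos, _ in ring], _hash(key)) % len(ring)
        return ring[idx][1]

# Usage: ring = VersionedRing([1, 2, 3]); v2 = ring.add_node(4); ring.commit(v2)
```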

Research Paper 3: Version Consistent Hashing

[Figure: three versions of a consistent hashing ring. Version 1 (committed) has nodes 1, 2, 3 (marked c) holding data D1 and serving updates and lookups; Version 2 (uncommitted) adds node 4 marked (n, v2); Version 3 (uncommitted) further adds node 5 marked (n, v3); once Version 3 is committed, nodes 4 and 5 are marked c as well.]

[Figure: data lookup example on a ring with positions 1 to 7 tagged with versions v2, v3, v3, v4.]

• v1: 1, 2
• v2: 4, 1
• v3: 4, 6
• v4: 4, 6
• Lookup locations: {4, 6, 1, 2}
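One plausible reading of this example, sketched below with assumed names (not necessarily the paper's exact algorithm): each ring version resolves the key to a small set of nodes, and a lookup walks the versions from newest to oldest, accumulating candidate locations that have not already been scheduled for probing.

```python
def lookup_locations(per_version_nodes):
    """Collect candidate locations across ring versions, newest first,
    skipping nodes already scheduled for probing.
    per_version_nodes: mapping of version label -> node list for the key."""
    seen, locations = set(), []
    for version in sorted(per_version_nodes, reverse=True):  # v4, v3, v2, v1
        for node in per_version_nodes[version]:
            if node not in seen:
                seen.add(node)
                locations.append(node)
    return locations

# The example above: v1 -> {1, 2}, v2 -> {4, 1}, v3 -> {4, 6}, v4 -> {4, 6}
print(lookup_locations({"v1": [1, 2], "v2": [4, 1], "v3": [4, 6], "v4": [4, 6]}))
# -> [4, 6, 1, 2]
```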

• Performance Improvement

[Plot: average lookup amplification (log scale, 10^0 to 10^4) vs. number of nodes, 0 to 2000 (half are added); series: Consistent Hashing, Commit CH, Version CH.]