Computer Architecture and Memory Systems Lab.


Understanding System Characteristics of Online Erasure Coding on Scalable, Distributed and Large-Scale SSD Array Systems

Sungjoon Koh, Jie Zhang, Miryeong Kwon, Jungyeon Yoon, David Donofrio, Nam Sung Kim and Myoungsoo Jung


Take-away

• Motivation: Distributed systems are starting to adopt erasure coding as a fault-tolerance mechanism in place of replication because of replication's high storage overhead.

• Goal: Understanding system characteristics of online erasure coding by analyzing and comparing them with those of replication.

• Observations in Online Erasure Coding:
  1) Up to 13× I/O performance degradation compared to replication.
  2) 50% CPU usage and many context switches.
  3) Up to 700× I/O amplification relative to the total request volume.
  4) Up to 500× more network traffic among the storage nodes than the total request amount.

• Summary of Our Work:
  • We quantitatively observe and measure the various overheads imposed by online erasure coding on a distributed system that consists of 52 SSDs.
  • We collect block-level traces from the all-flash-array-based storage clusters, which can be downloaded freely.

Overall Results

[Figure: 4 KB random read and write requests, RS(10,4) normalized to 3-replication (3-Rep.) — throughput, latency, CPU utilization, relative context switches, private-network overhead, and I/O amplification; the normalized values range from almost 0 up to 57.7.]

Introduction

• Demand for scalable, high-performance distributed storage systems.

Employing SSDs in HPC & DC Systems

[Figure: HPC and data-center (DC) systems replacing HDDs with SSDs, which offer higher bandwidth, shorter latency, and lower power consumption.]

Storage System Failures

• Typically, storage systems experience failures regularly.

  1) Storage failures: although SSDs are more reliable than HDDs, daily failures cannot be ignored.
     ex) Facebook reports that up to 3% of its HDDs fail each day. (Ref. M. Sathiamoorthy et al., "XORing elephants: Novel erasure codes for big data," in PVLDB, 2013.)

  2) Network switch errors, power outages, and soft/hard errors.

"So we need a fault-tolerance mechanism."

Fault Tolerance Mechanisms in Distributed Systems: Replication

[Figure: an object stored as Data, Replica, Replica on different nodes.]

• The traditional fault-tolerance mechanism.
• A simple and effective way to make the system resilient.
× High storage overhead (3×).
× For SSDs in particular, replication
  1) is expensive because of the SSDs' high cost per GB, and
  2) degrades performance due to SSD-specific characteristics.
     ex) garbage collection, wear-out, ...

We need an alternative method to reduce the storage overhead.

Fault Tolerance Mechanisms in Distributed Systems: Erasure Coding

[Figure: an object encoded into data chunks plus coding chunks.]

• An alternative to replication.
• Lower storage overhead than replication.
× High reconstruction cost (a well-known problem).
  ex) A Facebook cluster with EC increases network traffic by more than 100 TB in a day.
  Much research tries to reduce the reconstruction cost.

We observed significant overheads imposed during I/O services in a distributed system employing erasure codes.

Background

• Reed-Solomon: erasure-coding algorithm.
• Ceph: distributed system used in this research.
  - Architecture
  - Data path
  - Storage stack

Reed-Solomon

• The most famous erasure-coding algorithm.
• Divides data into k equal data chunks and generates m coding chunks.
• Encoding: multiplication of a generator matrix with the data chunks treated as a vector.
• Stripe: the k data chunks.
• Reed-Solomon with k data chunks and m coding chunks is written "RS(k, m)".
• The data can be recovered from up to m failures.

[Figure: data chunks D0 … D(k-1) and coding chunks C0 … C(m-1); the generator matrix maps (D0, D1, D2, D3) to (D0, D1, D2, D3, C0, C1, C2), i.e., RS(4, 3), which tolerates 3 failures.]
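As a concrete illustration of the generator-matrix view above, here is a minimal Python sketch of RS-style encoding. It is not Ceph's implementation: real plugins (e.g., jerasure/ISA-L) work over GF(2^8), whereas this sketch uses the prime field GF(257) and a Cauchy parity matrix purely to keep the arithmetic readable; all names are illustrative.

    P = 257  # small prime field for readability; production RS codes use GF(2^8)

    def cauchy_parity_rows(k, m):
        # m x k Cauchy matrix: entry (i, j) = 1 / (x_i + y_j) mod P with distinct x_i, y_j.
        # Any k rows of [identity; this matrix] are invertible, which gives the MDS property.
        xs, ys = range(m), range(m, m + k)
        return [[pow(x + y, P - 2, P) for y in ys] for x in xs]

    def rs_encode(data_chunks, m):
        # data_chunks: k equal-length lists of symbols (ints in 0..P-1) -> m coding chunks.
        k, length = len(data_chunks), len(data_chunks[0])
        parity = cauchy_parity_rows(k, m)
        return [[sum(row[j] * data_chunks[j][s] for j in range(k)) % P
                 for s in range(length)]
                for row in parity]

    # RS(4, 3): 4 data chunks form the stripe, 3 coding chunks are generated;
    # any 4 of the 7 chunks are enough to rebuild the stripe (decoding not shown).
    stripe = [[1, 2], [3, 4], [5, 6], [7, 8]]
    coding = rs_encode(stripe, 3)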

Ceph Architecture

[Figure: client nodes running App / libRBD / libRADOS, connected through the public network to storage nodes (Node 0 … Node n); each storage node runs OSDs and a Monitor, and the storage nodes are interconnected by the private network.]

• Client nodes are connected to the storage nodes through the "public network".
• Storage nodes are connected to each other through the "private network".
• Each storage node consists of several object storage device daemons (OSDs) and monitors.
  - OSDs handle read/write services.
  - Monitors manage the access permissions and the status of the OSDs.

Data Path

[Figure: file/block → object (libRBD) → libRADOS: HASH(oid) → pgid within a pool → CRUSH() → primary OSD → Data 1 … Data 4.]

1. A file/block is handled as an object.
2. The object is assigned to a placement group (PG), which consists of several OSDs, according to the result of a hash function.
3. The CRUSH algorithm determines the primary OSD in the PG.
4. The object is sent to the primary OSD.
5. The primary OSD sends the object to the other OSDs (secondary, tertiary, …) in the form of replicas or chunks, depending on the fault-tolerance mechanism. The storage stack slide below shows this step in detail; a simplified placement sketch follows first.
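To make steps 1–4 concrete, here is a heavily simplified placement sketch in Python. The hash and the pseudo-random OSD selection below are stand-ins, not Ceph's actual rjenkins hash or CRUSH algorithm, and the object/OSD names are made up for illustration.

    import hashlib
    import random

    PG_NUM = 128                              # placement groups in the pool (illustrative)
    OSDS = [f"osd.{i}" for i in range(24)]    # hypothetical cluster of 24 OSDs

    def object_to_pg(oid: str) -> int:
        # Step 2: a hash of the object name selects the placement group.
        return int(hashlib.sha1(oid.encode()).hexdigest(), 16) % PG_NUM

    def pg_to_osds(pgid: int, size: int) -> list[str]:
        # Step 3 (stand-in for CRUSH): deterministically map a PG to `size` OSDs;
        # the first one acts as the primary OSD.
        rng = random.Random(pgid)             # seeded so the mapping is stable
        return rng.sample(OSDS, size)

    oid = "rbd_data.1234.0000000000000007"    # hypothetical RBD object name
    pgid = object_to_pg(oid)
    acting_set = pg_to_osds(pgid, size=3)     # 3-replication: primary + 2 replicas
    print(pgid, acting_set[0], acting_set[1:])  # step 4: the client sends the object to the primary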

Storage Stack

[Figure: the object enters the primary OSD through the Client Messenger, passes through the Dispatcher and PrimaryLogPG to the PG Backend (Replicated or EC), and the resulting chunks/replicas are forwarded through the Cluster Messenger to the other OSDs, which run the same stack down to their SSDs.]

"Implemented in user space." — the whole OSD pipeline runs in user space; only the final accesses to the SSDs go through kernel space.

Analysis Overview

[Figure: Ceph storage cluster — a client (24 cores, 256 GB DDR4 DRAM) connected over a 10 Gb public network to SSD-backed storage nodes (6 TB of SSDs, OSD1–OSD6 and a monitor per node), which share a 10 Gb private network.]

1) Overall performance: throughput & latency.
2) CPU utilization & number of context switches.
3) Actual amount of reads & writes served from the disks.
4) Private-network traffic.

Object Management in Erasure Coding

We observe that erasure coding uses a different object-management scheme than replication:
- it has to manage data/coding chunks, and
- it works in two phases: object initialization and object update.

i) Object initialization: a k KB write into a (4 MB) object that does not exist yet pads the rest of the object with dummy data and generates the whole object together with its coding chunks.

ii) Object update (see the read-modify-write sketch below):
  i) read the whole stripe (e.g., data chunks 0–5),
  ii) regenerate the coding chunks (e.g., coding chunks 0–2),
  iii) write the chunks back to storage.
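The update path above is essentially a stripe-wide read-modify-write. The following sketch (reusing rs_encode from the Reed-Solomon sketch above, with a hypothetical in-memory chunk store standing in for the OSDs) shows why even a small update touches the whole stripe.

    storage = {}   # hypothetical chunk store: {(stripe_id, chunk_idx): list of symbols}

    def read_chunk(stripe_id, idx):
        return list(storage[(stripe_id, idx)])

    def write_chunk(stripe_id, idx, chunk):
        storage[(stripe_id, idx)] = list(chunk)

    def update(stripe_id, chunk_idx, offset, new_symbols, k, m):
        # i) Read the whole stripe: all k data chunks, even for a tiny update.
        data_chunks = [read_chunk(stripe_id, i) for i in range(k)]
        # Apply the small user write to the one affected data chunk.
        data_chunks[chunk_idx][offset:offset + len(new_symbols)] = new_symbols
        # ii) Regenerate all m coding chunks from the updated stripe.
        coding_chunks = rs_encode(data_chunks, m)   # from the Reed-Solomon sketch
        # iii) Write the updated data chunk and every coding chunk back to storage.
        write_chunk(stripe_id, chunk_idx, data_chunks[chunk_idx])
        for j, c in enumerate(coding_chunks):
            write_chunk(stripe_id, k + j, c)

For a small user write under RS(10, 4), this pattern reads k chunks and writes 1 + m chunks, which is where the extra disk traffic and inter-OSD transfers come from.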

Workload Description

Fault-tolerance configurations: 3-replication, RS(6,3), RS(10,4)

Micro-benchmark: Flexible I/O (FIO)

  Request size (KB):  1, 2, 4, 8, 16, 32, 64, 128
  Access type:        Sequential             | Random
  Pre-write:          X       | O            | X       | O
  Operation type:     Write   | Read, Write  | Write   | Read, Write
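The slides do not give the exact fio job options; the snippet below only enumerates the workload matrix from the table above and prints plausible fio command lines (libaio engine, direct I/O against an RBD image mapped at a hypothetical /dev/rbd0), so every path and parameter here is an assumption, not the authors' configuration.

    SIZES = ["1k", "2k", "4k", "8k", "16k", "32k", "64k", "128k"]
    PATTERNS = ["write", "read", "randwrite", "randread"]   # sequential/random x write/read

    for rw in PATTERNS:
        for bs in SIZES:
            print("fio --name={}-{} --filename=/dev/rbd0 --ioengine=libaio --direct=1 "
                  "--rw={} --bs={} --iodepth=16 --runtime=60 --time_based"
                  .format(rw, bs, rw, bs))

Reads are only issued against a pre-written image (the "Pre-write O" columns of the table), since reading never-written regions would not exercise the erasure-coded data path.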


Performance Comparison (Sequential Write)

[Figure: throughput (MB/s) and latency (msec) vs. request size (1 KB–128 KB) for 3-Rep., RS(6,3), and RS(10,4), with zoomed views of the 1–4 KB and 4–16 KB request sizes.]

- Significant performance degradation with Reed-Solomon.
  - Throughput: up to 11.3× worse in RS.
  - Latency: up to 12.9× longer in RS.
- The degradation for 4–16 KB request sizes is not acceptable.
- The computation for encoding, the data management, and the additional network traffic cause the degradation in erasure coding.

Performance Comparison (Sequential Read)

[Figure: throughput (MB/s) and latency (msec) vs. request size (1 KB–128 KB) for 3-Rep., RS(6,3), and RS(10,4).]

- Performance degradation with Reed-Solomon.
  - Throughput: 3.4× worse in RS (4 KB).
  - Latency: 3.4× longer in RS (4 KB).
- Even though there was no failure, performance degradation occurred.
- It is caused by RS-concatenation, which generates extra data transfers: to serve a read, the data chunks of the target stripe (data chunks 0–5 in the figure) are gathered and concatenated before the requested range is returned.


Computing and Software Overheads (CPU Utilization)

[Figure: CPU utilization for random writes and random reads.]

- RS requires many more CPU cycles than replication.
- User-mode CPU utilization accounts for 70–75% of the total CPU cycles.
  - This is uncommon in RAID systems.
  - The reason is that the stack is implemented at the user level (e.g., the OSD daemon, the PG backend, and the fault-tolerance modules).

Computing and Software Overheads (Context Switches)

[Figure: relative number of context switches (#/MB) vs. request size (1 KB–128 KB) for random writes and random reads, for 3-Rep., RS(6,3), and RS(10,4), with zoomed views of 32–128 KB.]

Relative number of context switches = (number of context switches) / (total amount of requests (MB))

- Many more context switches occur in RS than in replication.
  i) Read: data transfers through the OSDs and the computation during RS-concatenation.
  ii) Write:
    1) Initializing an object issues many writes and a significant amount of computation.
    2) Updating an object introduces many transfers among the OSDs through the user-level modules.
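As a purely illustrative application of this metric (the numbers are made up, not measured): if a run issues 2,560 random 4 KB writes — 10 MB of requests in total — and the storage node records 400,000 context switches during that run, the relative number of context switches is 400,000 / 10 MB = 40,000 per MB.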


I/O Amplification (Random Write)

[Figure: write amplification and read amplification (MB written/read from storage per MB of requests) vs. request size (1 KB–128 KB) for 3-Rep., RS(6,3), and RS(10,4), with zoomed views of 32–128 KB.]

I/O amplification = (read/write amount from storage (MB)) / (total amount of requests (MB))

- Erasure coding causes write amplification of up to 700× the total request volume.
- Why is the write amplification for random writes so large? Recall object initialization: a small write into a not-yet-written object generates the whole object, padded with dummy data, together with its coding chunks.

I/O Amplification (Read)

[Figure: read amplification (MB read from storage per MB of requests) vs. request size (1 KB–128 KB) for random and sequential reads, for 3-Rep., RS(6,3), and RS(10,4).]

- The read amplification is caused by RS-concatenation.
  i) Random reads mostly hit different stripes, so there is a lot of read amplification.
  ii) Sequential reads: consecutive I/O requests read data from the same stripe, so there is almost no read amplification.
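To see why random reads amplify so much more than sequential reads, here is a small sketch. It assumes, as the slides describe, that serving a read fetches every data chunk of each stripe the request touches (RS-concatenation); the stripe and chunk sizes are illustrative, not the cluster's actual configuration.

    import random

    K = 10                      # data chunks per stripe, as in RS(10, 4)
    CHUNK = 64 * 1024           # illustrative chunk size (bytes)
    STRIPE = K * CHUNK          # bytes of user data per stripe

    def read_amplification(requests):
        # requests: list of (offset, length); each touched stripe is read in full.
        stripes, requested = set(), 0
        for offset, length in requests:
            requested += length
            for s in range(offset // STRIPE, (offset + length - 1) // STRIPE + 1):
                stripes.add(s)
        return (len(stripes) * STRIPE) / requested

    seq = [(i * 4096, 4096) for i in range(1024)]                          # sequential 4 KB reads
    rnd = [(random.randrange(0, 2**36, 4096), 4096) for _ in range(1024)]  # random 4 KB reads
    print(read_amplification(seq), read_amplification(rnd))                # ~1x vs. ~160x

Sequential 4 KB reads fall into a handful of consecutive stripes, so the amplification stays near 1×; random 4 KB reads scattered over a large image each drag in a whole stripe, so the amplification approaches STRIPE / 4 KB under this model.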


Network Traffic Among the Storage Nodes

[Figure: private-network traffic (MB per MB of requests) vs. request size (1 KB–128 KB) for random writes and random reads, for 3-Rep., RS(6,3), and RS(10,4), with a zoomed view of 32–128 KB.]

- Shows a trend similar to the I/O amplification results.
- Erasure coding:
  i) Write: initializing & updating objects causes heavy traffic among the storage nodes.
  ii) Read: RS-concatenation causes heavy traffic among the storage nodes.
- Replication exhibits only the minimum data transfers related to necessary communications.
  (ex. OSD interaction: monitoring the status of each OSD)

Conclusion

- We studied the overheads imposed by erasure coding on a distributed SSD array system.

- In contrast to the common expectation about erasure codes, we observed that they exhibit heavy network traffic and more I/O amplification than replication.

- Erasure coding also requires many more CPU cycles and context switches than replication due to its user-level implementation.

Q&A

Object Management in Erasure Coding (backup)

[Figure: time-series analysis of CPU utilization (user/system), context switches per second, and private-network throughput (RX/TX, MB/s) observed for random writes on a pristine image (object initialization) and random overwrites (object update).]