Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default...

23
Adding Storage Simulation Capacities to the SimGrid Toolkit A. Lebre, A. Legrand, F. Suter , P. Veyre Inria, Ecole des Mines de Nantes/LINA CNRS/INRIA/University of Grenoble IN2P3 Computing Center, CNRS CCGrid 2015 Shenzen, China May 06, 2015 Storage Simulation with the SimGrid Toolkit 1/17

Transcript of Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default...

Page 1: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Adding Storage Simulation Capacitiesto the SimGrid Toolkit

A. Lebre, A. Legrand, F. Suter, P. Veyre

Inria, Ecole des Mines de Nantes/LINACNRS/INRIA/University of Grenoble

IN2P3 Computing Center, CNRS

CCGrid 2015Shenzen, ChinaMay 06, 2015

Storage Simulation with the SimGrid Toolkit 1/17

Page 2: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Data is Everywhere!

Age of Data-Intensive Computational Sciences

I Data is the new source of scientific resultsI Fourth paradigm, Data deluge, Big Data, . . .I ↗ Volume, ↗ Velocity, ↗ Variety,

↗ Veracity, ↗ Value

Storage becomes more and more important

I Not only for historical big playersI E.g., High Energy Physics and LHC data

processing on data grids

I But in every scientific fieldI And on any large scale distributed infrastructure

I Clusters, Clouds, Grids, . . .

Storage Simulation with the SimGrid Toolkit 2/17

Page 3: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Data is Everywhere!

Age of Data-Intensive Computational Sciences

I Data is the new source of scientific resultsI Fourth paradigm, Data deluge, Big Data, . . .I ↗ Volume, ↗ Velocity, ↗ Variety,

↗ Veracity, ↗ Value

Storage becomes more and more important

I Not only for historical big playersI E.g., High Energy Physics and LHC data

processing on data grids

I But in every scientific fieldI And on any large scale distributed infrastructure

I Clusters, Clouds, Grids, . . .

Storage Simulation with the SimGrid Toolkit 2/17

Page 4: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Why Simulate Storage?

Storage: a performance driver to understand

I Independent of scale and type of the computing infrastructure

I As much important as computing and networkingI Simulation is a classical approach in performance evaluation

I Accuracy, Scalability, and Versatility are the keys

Specifics and concerns of storage subsystems may vary

I Data Centers ; Hierarchial (mass) storage subsystemsI Different types of media involved

I Supercomputers ; Large scale dedicated storage networkI High-speed network interconnect

I (MapReduce) Clusters ; Specific and tuned file systemI Reliable, scalable, and simple

I Grids and Clouds ; Set of services offered by multiple data centersI Hidden underlying infrastructures

Storage Simulation with the SimGrid Toolkit 3/17

Page 5: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Why Simulate Storage?

Storage: a performance driver to understand

I Independent of scale and type of the computing infrastructure

I As much important as computing and networkingI Simulation is a classical approach in performance evaluation

I Accuracy, Scalability, and Versatility are the keys

Specifics and concerns of storage subsystems may vary

I Data Centers ; Hierarchial (mass) storage subsystemsI Different types of media involved

I Supercomputers ; Large scale dedicated storage networkI High-speed network interconnect

I (MapReduce) Clusters ; Specific and tuned file systemI Reliable, scalable, and simple

I Grids and Clouds ; Set of services offered by multiple data centersI Hidden underlying infrastructures

Storage Simulation with the SimGrid Toolkit 3/17

Page 6: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Simulating Storage with SimGrid

What is SimGrid?I 15-years old project for the simulation

of distributed systemsI but lacking of a storage component

for about 10 years

I Open source, sustainable, widely used

I Available on http://simgrid.org

Main StrengthsI Versatility: simulates Grids, Clouds,

HPC, and P2P systems

I Fast and scalable simulation kernel

I Tractable models: fluid models andMax-Min fairness sharing

I (In)validation studies: simulationresults can be trusted

SURF

MSG SMPI SIMDAG

SIMIX

372435work

remaining

variable

530530

50664

245245

Concurrent

{Condition

...

...

processes

variables

App. spec. astask graph

...

x1

x2

x3

x3

+

xn

...

+

+ xn

...

...

Variables

x1

App. spec. as concurrent code

... ...

Activities

ResourceCapacities

6 CL2

6 CP1

6 CL1

6 CLm

Our claim

I Building a simulatorfrom scratch should beavoided

Storage Simulation with the SimGrid Toolkit 4/17

Page 7: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Simulating Storage with SimGrid

What is SimGrid?I 15-years old project for the simulation

of distributed systemsI but lacking of a storage component

for about 10 years

I Open source, sustainable, widely used

I Available on http://simgrid.org

Main StrengthsI Versatility: simulates Grids, Clouds,

HPC, and P2P systems

I Fast and scalable simulation kernel

I Tractable models: fluid models andMax-Min fairness sharing

I (In)validation studies: simulationresults can be trusted

SURF

MSG SMPI SIMDAG

SIMIX

372435work

remaining

variable

530530

50664

245245

Concurrent

{Condition

...

...

processes

variables

App. spec. astask graph

...

x1

x2

x3

x3

+

xn

...

+

+ xn

...

...

Variables

x1

App. spec. as concurrent code

... ...

Activities

ResourceCapacities

6 CL2

6 CP1

6 CL1

6 CLm

Our claim

I Building a simulatorfrom scratch should beavoided

Storage Simulation with the SimGrid Toolkit 4/17

Page 8: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Disclaimer

This talk is end-to-end-study-free

I Problem ; Idea ; Implementation ; Evaluation ; Problem solved!

But not contribution-free

I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators

I Leveraging a sound and reliable simulation kernel

I Performance analysis of various types of disks ; Derived models

Our objective

I Convince you to use our proposal to conduct your storage-relatedsimulation studies

Storage Simulation with the SimGrid Toolkit 5/17

Page 9: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Disclaimer

This talk is end-to-end-study-free

I Problem ; Idea ; Implementation ; Evaluation ; Problem solved!

But not contribution-free

I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators

I Leveraging a sound and reliable simulation kernel

I Performance analysis of various types of disks ; Derived models

Our objective

I Convince you to use our proposal to conduct your storage-relatedsimulation studies

Storage Simulation with the SimGrid Toolkit 5/17

Page 10: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Outline

Introduction

Adding Storage to SimGridConcepts and ModelsImplementation Highlights

Checking Modeling Assumptions

Added Value of Using SimGrid

Conclusions and Future Work

Storage Simulation with the SimGrid Toolkit 6/17

Page 11: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Concepts and Models

Basic Concepts

I File descriptorsI Description: Name (= full path) + size [+ user-level properties]

I Remark: no UNIX info, no contents

I Life cycle: Simulated entity created by open, destroyed by closeI Local operations: open, close, read, write, seek, tell, move, and deleteI Remote operations: move and copy

I Storage volumesI Description: Name + type + capacity + file list + mount point +

attach point + simulation modelI Remark: inert file list and no navigation in tree

I Life cycle: Instantiated at parsing timeI Operations: get file list and get [total, used, available] capacity

Fluid models: Tractable and fastI Assumptions (to be experimentally confirmed)

I Linearity, negligible latency, fair sharing

Maximize mina∈A ρaunder constraints{∑

a ∈ A using resource r ρa 6 Cr ,

Storage Simulation with the SimGrid Toolkit 7/17

Page 12: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Implementation Highlights

Comprehensive platform description

I Scalable XML format

Seamless remote operations

I I/O operations ; network transfersI in a store-and-forward mode

<storage_type id="SATA-II_HDD" size="500GB"

content_type="txt_unix"

content="unix_content.txt"

model="linear">

<model_prop id="r_bw" value="92MBps"/>

<model_prop id="w_bw" value="62MBps"/>

</storage_type>

<storage id="Disk1" typeId="SATA-II_HDD"

attach="bob"/>

<storage id="Disk2" typeId="SATA-II_HDD"

attach="alice"

content_type="txt_windows"

content="windows_content.txt" />

<host id="bob" power="1Gf">

<mount id="Disk1" name="/home"/>

<mount id="Disk2" name="/windows"/>

</host>

<host id="alice" power="1Gf">

<mount id="Disk2" name="c:"/>

</host>

<link id="link1" bandwidth="125MBps"

latency="50us"/>

<route src="bob" dst="alice"

symmetrical="YES">

<link_ctn id="link1"/>

</route>

Storage Simulation with the SimGrid Toolkit 8/17

Page 13: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Implementation Highlights

Comprehensive platform description

I Scalable XML format

Seamless remote operations

I I/O operations ; network transfersI in a store-and-forward mode

<storage_type id="SATA-II_HDD" size="500GB"

content_type="txt_unix"

content="unix_content.txt"

model="linear">

<model_prop id="r_bw" value="92MBps"/>

<model_prop id="w_bw" value="62MBps"/>

</storage_type>

<storage id="Disk1" typeId="SATA-II_HDD"

attach="bob"/>

<storage id="Disk2" typeId="SATA-II_HDD"

attach="alice"

content_type="txt_windows"

content="windows_content.txt" />

<host id="bob" power="1Gf">

<mount id="Disk1" name="/home"/>

<mount id="Disk2" name="/windows"/>

</host>

<host id="alice" power="1Gf">

<mount id="Disk2" name="c:"/>

</host>

<link id="link1" bandwidth="125MBps"

latency="50us"/>

<route src="bob" dst="alice"

symmetrical="YES">

<link_ctn id="link1"/>

</route>

Storage Simulation with the SimGrid Toolkit 8/17

Page 14: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Outline

Introduction

Adding Storage to SimGrid

Checking Modeling AssumptionsExperimental SetupIndependent AccessesConcurrent Accesses

Added Value of Using SimGrid

Conclusions and Future Work

Storage Simulation with the SimGrid Toolkit 9/17

Page 15: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Experimental Setup

TestbedI Grid’5000 experimental platform (http://www.grid5000.fr)

Name Model Interface Size Max. Bandwidthgriffon Hitachi HDP72503 SATA-II 320 GiB 79 MiB/secgranduc Seagate ST9146802SS SAS 146 GiB 84.7 MiB/secedel C400-MTFDDAA SATA/SSD 128 GiB 244.8 MiB/sec

Methodology

I Randomized benchmarks with FIO 2.0.8 managed with execoI Additional dd benchmark on granduc to cope with faulty raid controller

I Synchronous, non-buffered I/O operationsI Independent: From 32kiB to 2GiB with a fixed block size of 32KiBI Concurrent: 1 to 15 operations

I for 10, 50, 100, 500, 1024, and 2048 MiB files

Feel free to check and/or reproduce our results

I Everything is available online (http://dx.doi.org/10.6084/m9.figshare.1175156)

I Engines, raw data, analysis scripts, graphs and article sources

Storage Simulation with the SimGrid Toolkit 10/17

Page 16: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Modeling the Behavior of SATA-II Disks

I Top: Size vs. DurationI Confirms the linearity assumptionI But heteroscedastic behavior

I Variability proportional to sizeI Negligible latency

I Middle: Size vs. BandwidthI Independent of file sizeI Variability ; Random variables

I Bottom: Bandwidth distributionI Single mode but not following any

well-known distribution

Griffon (SATA II), Read Griffon (SATA II), Write

0

50

100

150

0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)

Dur

atio

n (

in s

econ

ds)

67.6 MiB/s 50.4 MiB/s

Griffon (SATA II), Read Griffon (SATA II), Write

0

25

50

75

0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)

Effe

ctiv

e B

andw

idth

(in

MiB

/sec

ond)

Griffon (SATA II), Read Griffon (SATA II), Write

0

100

200

300

400

40 60 80 20 40 60 80Effective Bandwidth (in MiB/second) Distribution

Cou

nt

Properties of derived model

I Linear w.r.t. bandwidth with no latencyI Modeling the bandwidth-dependent variability

I Inject sample distribution and draw random variable upon access

Storage Simulation with the SimGrid Toolkit 11/17

Page 17: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Modeling the Behavior of SSD Disks

I Top: Size vs. DurationI Linear with very little variability

I Middle: Size vs. BandwidthI Far from hdparm resultsI Default ext4 config prevents

getting maximum performance

I Bottom: Bandwidth distributionI Regular but not following any

well-known distribution

Edel (SSD), Read Edel (SSD), Write

0

5

10

15

0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)

Dur

atio

n (

in s

econ

ds)

152.4 MiB/s 132.2 MiB/s

Edel (SSD), Read Edel (SSD), Write

0

50

100

150

0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)

Effe

ctiv

e B

andw

idth

(in

MiB

/sec

ond)

Edel (SSD), Read Edel (SSD), Write

0

100

200

300

400

130 140 150 160 120 130 140 150Effective Bandwidth (in MiB/second) Distribution

Cou

nt

Properties of derived model (similar to SATA-II)

I Linear w.r.t. bandwidth with no latencyI Modeling the bandwidth-dependent variability

I Inject sample distribution and draw random variable upon access

Storage Simulation with the SimGrid Toolkit 12/17

Page 18: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Modeling Concurrent Accesses

Performance improvements on SSD

I Significant and non-linear for readsI When having more than one write

I Likely because of bad ext4 setup

On SAS and SATA-II

I Fixed bandwidth for writesI Linear decay fro reads

I Explained by arm movements

Edel (SSD), Read Edel (SSD), Write

Granduc (SAS), Read Granduc (SAS), Write

Griffon (SATA II), Read Griffon (SATA II), Write

0

50

100

150

200

250

0

50

100

150

200

250

0

25

50

75

0

25

50

75

0

25

50

75

100

0

25

50

75

100

0 5 10 15 0 5 10 15Number of concurrent operations

Agg

rega

ted

Ban

dwid

th (

MiB

/s)

Properties of derived model

I Modify resource capacity as concurrency increases

I Reevaluation each time a transfer begins or ends

I Easy to implement in SimGrid’s kernel

Storage Simulation with the SimGrid Toolkit 13/17

Page 19: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Outline

Introduction

Adding Storage to SimGrid

Checking Modeling Assumptions

Added Value of Using SimGridBuild (and Trust) your own SimulatorDesign (and Plug) your own Storage Model

Conclusions and Future Work

Storage Simulation with the SimGrid Toolkit 14/17

Page 20: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Build (and Trust) your own Simulator

RationaleI Developing a full DES from scratch is counterproductive!

I Already there: open, fast, and scalable kernel

I Better focus on the applicative part of the simulatorI With confidence on lower layers: (in)validated and reliable models

I Leverage versatilityI Mixing concepts 6= stacking features

Examples of added value

I Versatility ; Study more performance drivers w/o oversimplificationI Storage study + network interconnect + CPU heterogeneity

I (In)validation studies ; get realistic results, not just some resultsI Leverage predictive value in performance studies

I Scalable does not necessarily means inaccurateI Both can be obtained simultaneously

Storage Simulation with the SimGrid Toolkit 15/17

Page 21: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Design (and Plug) your own Storage Model

There is more than disks to modelI Tape libraries ; Access time (arm movements) + I/O time

I Combination of models

I Parallel/Distributed File Systems ; Disks + management layerI File system simulator + disks modelsI Model experienced throughput

I Storage on unknown infrastructures (Clouds) ; Black boxesI Model with bandwidth vs. #requests matrices

How to design and plug a new model?

I Designing and plugging a fluid model is pretty straightforwardI Behavior for a single operation + Sharing policy

I Instantiation is more complex (yet crucial)I Benchmarking and analysis procedures available online

I Contributions are welcomed!

Storage Simulation with the SimGrid Toolkit 16/17

Page 22: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Conclusion and Future Work

Conclusions

I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators

I Leveraging a sound and reliable simulation kernel

I Thorough Performance analysis of various types of disksI Derived Fluid models ; tractable, fast, and accurate

I Only a first step . . .

Future Work

I Extend API to handle block storage, handle cache policiesI Integrate other resource models

I Only after thorough (in)validation studies

I Study other performance metrics (e.g., energy consumption)I Welcome contributions from external users

I Now I hope you are convinced to use SimGrid ©

Storage Simulation with the SimGrid Toolkit 17/17

Page 23: Adding Storage Simulation Capacities to the SimGrid Toolkit · I Farfrom hdparm results I Default ext4 con g prevents getting maximum performance I Bottom:Bandwidth distribution I

Conclusion and Future Work

Conclusions

I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators

I Leveraging a sound and reliable simulation kernel

I Thorough Performance analysis of various types of disksI Derived Fluid models ; tractable, fast, and accurate

I Only a first step . . .

Future Work

I Extend API to handle block storage, handle cache policiesI Integrate other resource models

I Only after thorough (in)validation studies

I Study other performance metrics (e.g., energy consumption)I Welcome contributions from external users

I Now I hope you are convinced to use SimGrid ©

Storage Simulation with the SimGrid Toolkit 17/17