Scibox: Online Sharing of Scientific Data via the Cloud (scibox-ipdps14-slides.pdf)

Page 1:

Scibox: Online Sharing of Scientific Data via the Cloud

Jian Huang†, Xuechen Zhang†, Greg Eisenhauer†, Karsten Schwan†, Matthew Wolf†,‡, Stephane Ethierǂ, Scott Klasky‡

† CERCS Research Center, Georgia Tech
ǂ Princeton Plasma Physics Laboratory
‡ Oak Ridge National Laboratory

Supported in part by funding from the US Department of Energy for DOE SDAV SciDAC

Page 2:

Outline

• Background and Motivation
• Problems and Challenges
• Design and Implementation
• Evaluation
• Conclusion and Future Work

Page 5:

Cloud Storage is Popular

• Ease of use
• Pay-as-you-go model
• Universal accessibility
• Good scalability and durability

Works based on cloud storage
• Dropbox, Google Drive, iCloud, SkyDrive, etc.

Scibox: focus on scientific data sharing

Page 12:

Use Cases for Cloud Storage

[Figure: workloads (Combustion Experimental Data, GTS/LAMMPS, Image Processing, Visualization) shown with Aero Cluster, Vogue, a Student PC, and the sites GeorgiaTech (Atlanta), WSU (Detroit), and OSU (Columbus), connected through a Private Cloud and a Public Cloud.]

1. Ease of use
2. Universal accessibility
3. Good scalability

Page 17:

Cloud Storage is Not Ready for HEC

Scientific applications are data-intensive
• Generate large amounts of data

Networking bandwidth is limited
• Inadequate levels of ingress and egress bandwidth available to/from remote cloud stores

High costs imposed by cloud providers
• Expensive for large amounts of data when using the pay-as-you-go model

An example:
• A GTS run on 29K cores on the Jaguar machine at OLCF generates over 54 terabytes of data in a 24-hour period.
• Amazon S3: ~$0.03/GB for storage and $0.09/GB for data transfer out.
• Cost: $6635.52/day, which increases with the number of collaborators
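The daily figure on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes 1 TB = 1024 GB and that the data is stored once and downloaded in full once per collaborator; both are assumptions chosen because they reproduce the slide's number, and the names are illustrative.

```python
# Reproduce the slide's daily cost estimate (assumes 1 TB = 1024 GB).
DATA_TB_PER_DAY = 54          # GTS output per 24 hours, from the slide
STORAGE_USD_PER_GB = 0.03     # approximate Amazon S3 storage price, from the slide
EGRESS_USD_PER_GB = 0.09      # Amazon S3 transfer-out price, from the slide


def daily_cost(collaborators: int = 1) -> float:
    """Store the data once; each collaborator downloads all of it once."""
    data_gb = DATA_TB_PER_DAY * 1024
    storage = data_gb * STORAGE_USD_PER_GB
    egress = data_gb * EGRESS_USD_PER_GB * collaborators
    return round(storage + egress, 2)


print(daily_cost(1))  # 6635.52, matching the slide
print(daily_cost(3))  # storage is paid once, egress three times
```

Under these assumptions only the egress term grows with the number of collaborators, which is why the slide notes that the cost rises as more people share the data.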

Page 18:

Problem: Too Much Data Movement

Issue: the naïve approach transfers lots of data, even if only some of it is needed

[Figure: the data producer holds variables A0 … An, B0 … Bn, and C0 … Cn for time step 0 and time step 1; the same full set appears on the data consumer side after passing through the cloud.]

Example
• Output of the GTS fusion modeling simulation: checkpoint data, diagnosis data, visualization data, etc.
• Each data subset includes many elements

Page 23:

Problem: Too Much Data Movement

• Scientific data formats: BP/HDF5 (structured and metadata-rich)
• Standard I/O interface: ADIOS (almost transparent to users)

Goal: Reduce data transfer from producers to consumers

Page 29:

Solutions for Minimizing Data Transfers for Data Sharing

Compression
• Helps, but the compression ratio can be low for floating-point scientific data

Data selection
• Files: users can ask for subsets of data files by specifying file offsets
• Requires knowledge about data layout in files

Content-based data indexing
• Useful, but may require large amounts of metadata

Scibox approaches:
• Filter unnecessary data at the producer side via metadata (uploads)
• Merge overlapping subsets when multiple users share the same data (uploads)
• Minimize data sharing cost in cloud storage via a new software protocol (downloads)

Page 31:

Challenge: How to Filter Data

Recall: scientific data is structured and metadata-rich

[Figure: variables A0 … An, B0 … Bn, and C0 … Cn at time step 0 and time step 1, annotated with "Time steps" and "Variable {dimensions, attributes}".]

Analytics users know what can be, or needs to be, filtered

Page 36:

Scibox Design

D(ata)R(eduction)-Function for data filtering
• Reduce cloud upload/download volumes
• Permit end users to identify the exact data needed for each specific analytics activity, using filters

Merge shared data objects before uploading
• Promote data sharing in the cloud to reduce data redundancy on the storage server and data volumes transferred across the network

Utilize ADIOS metadata-rich I/O methods

New ADIOS I/O transport
• Write output to the cloud that can be directly read by subsequent, potentially remote data analytics or visualization codes
• Transparent on both the producer and consumer sides

Partial object access for private cloud storage
• Patch the OpenStack Swift object store

Page 37:

Cloud-IO Transport

[Figure: on the HPC platform, HPC applications call the ADIOS API, which offers the transport methods MPI-IO, POSIX-IO, LIVE/DataTap, DART, HDF-5, pnetCDF, Viz Engines, CLOUD-IO, and Others (plug-in). Swift object storage consists of a proxy node, an auth node, and four store nodes, each store node running an account server, a container server, and an object server. Also labeled: DR-Function Run-time Execution; Authentication.]

Page 38:

Scibox Architecture

[Figure: a data producer runs the Scibox client with Cloud-IO (Write) and a data consumer runs the Scibox client with Cloud-IO (Read); both connect to cloud storage (Amazon S3/OpenStack Swift). Legend: control flow, data flow, user group.]

Page 47:

Sample XML File

<?xml version="1.0"?>
<adios-config host-language="Fortran">
  <adios-group name="restart" coordination-communicator="comm">
    <var name="mype" type="integer"/>
    <var name="numberpe" type="integer"/>
    <var name="istep" type="integer"/>
    <var name="MIMAX_VAR" type="integer"/>
    <var name="NX" type="integer"/>
    <var name="NY" type="integer"/>
    <var name="zion0_1Darray" gwrite="zion0" type="double" dimensions="MIMAX_VAR"/>
    <var name="phi_2Darray" gwrite="phi" type="double" dimensions="NX, NY"/>
    <!-- for reader.xml -->
    <rd type="8" name="phi_2Darray"
        cod="int i; double sum = 0.0; for(i = 0; i&lt;input.count; i = i+1) { sum = sum + input.vals[i]; } return sum;"/>
  </adios-group>
  <method group="restart" method="CLOUD-IO">;</method>
  <buffer size-MB="20" allocate-time="now"/>
</adios-config>
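To illustrate how a reader-side request like the `rd` element above could be picked out of a configuration file, here is a minimal sketch using Python's standard XML parser. The element and attribute names (`rd`, `type`, `name`, `cod`) come from the slide; the helper `list_dr_requests` is hypothetical, and the `<` in the CoD string is written as `&lt;` because a standard XML parser requires it. How Scibox's own parser handles this is not stated in the slides.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the slide's configuration, with attribute quoting normalized.
READER_XML = """<?xml version="1.0"?>
<adios-config host-language="Fortran">
  <adios-group name="restart" coordination-communicator="comm">
    <var name="NX" type="integer"/>
    <var name="NY" type="integer"/>
    <var name="phi_2Darray" gwrite="phi" type="double" dimensions="NX, NY"/>
    <rd type="8" name="phi_2Darray"
        cod="int i; double sum = 0.0; for(i = 0; i&lt;input.count; i = i+1) { sum = sum + input.vals[i]; } return sum;"/>
  </adios-group>
  <method group="restart" method="CLOUD-IO"/>
</adios-config>
"""


def list_dr_requests(xml_text):
    """Return (group, variable, DR type, CoD source) for every <rd> element."""
    root = ET.fromstring(xml_text)
    requests = []
    for group in root.iter("adios-group"):
        for rd in group.iter("rd"):
            requests.append(
                (group.get("name"), rd.get("name"), int(rd.get("type")), rd.get("cod"))
            )
    return requests


for group, variable, dr_type, cod in list_dr_requests(READER_XML):
    print(group, variable, dr_type)
    print(cod)
```

The parser decodes `&lt;` back to `<`, so the CoD string recovered here is the same C-like source shown on the slide.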

Page 50:

Data Reduction (DR) Functions

Definition
• Function defined by end users that can transform/reduce data to prepare it for cloud storage

Creation
• Programmed by end users
• Generated from higher-level descriptions
• Derived from users' I/O access patterns

Current implementation
• Customized CoD (C on Demand): requires producer-side computational resources
• DR-function library: when the same DR-function is specified by multiple clients, it is executed only once, and its output data is reused for multiple consumers
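The "executed only once" behavior of the DR-function library can be pictured as a cache keyed by the request. The following is a minimal sketch under that assumption; the class `DRCache`, its key layout (function spec, variable name, time step), and the `executions` counter are hypothetical and are not taken from the Scibox implementation.

```python
class DRCache:
    """Run each distinct DR-function once per variable and time step, then reuse the result."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # how many times a DR-function actually ran

    def run(self, spec, func, variable, step, data):
        key = (spec, variable, step)
        if key not in self._results:
            # First request for this (function, variable, step): execute and remember.
            self._results[key] = func(data)
            self.executions += 1
        return self._results[key]


def mean(values):
    return sum(values) / len(values)


cache = DRCache()
phi = [1.0, 2.0, 3.0]
for_consumer_a = cache.run("Mean", mean, "phi", 0, phi)
for_consumer_b = cache.run("Mean", mean, "phi", 0, phi)  # served from the cache
print(for_consumer_a, for_consumer_b, cache.executions)
```

Two consumers asking for the same reduction of the same variable at the same time step cost the producer one execution; a request for a different time step runs the function again.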

Page 52:

DR-functions Provided by SciBox

Type | Description | Example
DR1 | Max(variable) | Max(var_double_2Darray)
DR2 | Min(variable) | Min(var_double_2Darray)
DR3 | Mean(variable) | Mean(var_double_2Darray)
DR4 | Range(variable, dimension, start_pos, end_pos) | Range(var_int_1Darray, 1, 100, 1000)
DR5 | Select(variable, threshold1, threshold2) | Select var.value where var.value in (threshold1, threshold2)
DR6 | Select(variable, DR_Function1, DR_Function2) | Select var.value where var.value ≥ Mean(var)
DR7 | Select(variable1, variable2, threshold1, threshold2) | Select var2.value where var1.value in (threshold1, threshold2)
DR8 | Self-defined function | double proc(cod_exec_context ec, input_type *input, int k, int m) { int i; int j; double sum = 0.0; double average = 0.0; for (i = 0; i < m; i++) sum += input.tmpbuf[i + k*m]; average = sum / m; return average; }

C on Demand (CoD):
• Consumer: a string (the DR-function source)
• Producer: 1. registration; 2. compile and execute on demand
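The built-in DR-functions in the table map onto ordinary list operations. A minimal Python sketch of DR1 to DR7 on flat lists follows. The function names are illustrative, thresholds are treated as an open interval, the function-based select treats its bounds as inclusive (following the table's "≥"), and Range uses Python's half-open slicing on a one-dimensional array (so the table's dimension argument is omitted). All of these are assumptions; none of this is Scibox's actual CoD implementation.

```python
def dr_max(values):                       # DR1
    return max(values)


def dr_min(values):                       # DR2
    return min(values)


def dr_mean(values):                      # DR3
    return sum(values) / len(values)


def dr_range(values, start_pos, end_pos):  # DR4, 1-D case, half-open slice (assumed)
    return values[start_pos:end_pos]


def dr_select(values, low, high):         # DR5, open interval (assumed)
    return [v for v in values if low < v < high]


def dr_select_fn(values, low_fn, high_fn=None):  # DR6, bounds computed by DR-functions
    low = low_fn(values)
    high = high_fn(values) if high_fn is not None else None
    return [v for v in values if v >= low and (high is None or v <= high)]


def dr_select_by(selector, target, low, high):   # DR7, filter target by selector's value
    return [t for s, t in zip(selector, target) if low < s < high]


data = [1.0, 5.0, 3.0, 9.0]
print(dr_max(data), dr_min(data), dr_mean(data))
print(dr_select(data, 2, 6))
print(dr_select_fn(data, dr_mean))  # values at or above the mean
```

Each function returns less data than it receives, which is the point of running it on the producer side before anything is uploaded.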

Page 54:

Data Object Management

[Figure: shows Writer.xml and Reader0.xml, Reader1.xml, …, ReaderN.xml; a Scibox Super File made up of Group Metadata and User 0, User 1, …, User N Metadata; a Group Metadata Container; a User Metadata Container holding User0, User1, …, UserN Metadata Objects; and a Data Container holding Object 0, Object 1, …, Object N. Annotation: "Enable object sharing".]

Page 57:

Determination of Object Size

Merge overlapping data sets
• Reduce upload data size

Partial object access
• Current Amazon S3 and OpenStack Swift stores do not support this
• Users have to download the whole object, even if only a small portion of its data is needed

Two approaches used in Scibox
• Private cloud: modify the software to enable partial object access; object size is determined by predicting upload throughput
• Public cloud: limit object size considering storage pricing; object size is determined by comparing the cost with sharing and without sharing
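Merging overlapping data sets can be illustrated with a standard interval merge. The sketch below assumes each consumer's request is a half-open index range `(start, end)` over one variable. That representation and the helper names are assumptions for illustration, not the slide's data model.

```python
def merge_ranges(requests):
    """Merge overlapping or touching half-open (start, end) ranges."""
    merged = []
    for start, end in sorted(requests):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_elements(ranges):
    return sum(end - start for start, end in ranges)


requests = [(0, 100), (50, 150), (200, 300)]   # three consumers, two of them overlap
merged = merge_ranges(requests)
print(merged)
print(total_elements(requests), "requested vs", total_elements(merged), "uploaded")
```

Here the three requests ask for 300 elements in total, but only 250 distinct elements need to be uploaded once the overlap is merged.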

Cost Model

Definition
  α: $/GB of standard cloud storage
  β: $/GB of data transfer into cloud
  γ: $/GB of data transfer out from cloud

Assumption
  n clients request Data_1, Data_2, …, Data_n respectively

Without sharing:

With sharing (n objects can be merged into one):
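The two cost formulas are not preserved in this transcript. The sketch below is only one plausible reading of the definitions above, not the authors' equations. It assumes that without sharing each Data_i is stored, uploaded, and downloaded once, giving (α + β + γ) · Σ|Data_i|, and that with sharing the merged object is stored and uploaded once but downloaded in full by each of the n clients, giving (α + β) · |merged| + n · γ · |merged|. The function names and the prices in the example are hypothetical.

```python
from typing import Sequence


def cost_without_sharing(sizes_gb: Sequence[float],
                         alpha: float, beta: float, gamma: float) -> float:
    """Assumed model: every Data_i is stored, uploaded, and downloaded once."""
    return (alpha + beta + gamma) * sum(sizes_gb)


def cost_with_sharing(merged_gb: float, n_clients: int,
                      alpha: float, beta: float, gamma: float) -> float:
    """Assumed model: one merged object, downloaded whole by each client."""
    return (alpha + beta) * merged_gb + n_clients * gamma * merged_gb


def sharing_pays_off(sizes_gb: Sequence[float], merged_gb: float,
                     alpha: float, beta: float, gamma: float) -> bool:
    """True when the merged object is cheaper than n separate objects."""
    shared = cost_with_sharing(merged_gb, len(sizes_gb), alpha, beta, gamma)
    return shared < cost_without_sharing(sizes_gb, alpha, beta, gamma)


# Illustrative prices only (not real provider pricing).
ALPHA, BETA, GAMMA = 0.03, 0.0, 0.09
sizes = [1.0, 1.0, 1.0]  # three clients, 1 GB each

print(sharing_pays_off(sizes, merged_gb=1.1, alpha=ALPHA, beta=BETA, gamma=GAMMA))
print(sharing_pays_off(sizes, merged_gb=1.5, alpha=ALPHA, beta=BETA, gamma=GAMMA))
```

Under these assumptions, merging only pays off when the requested data sets overlap heavily, which is consistent with limiting the object size on a public cloud.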

Outline

• Background and Motivation

• Problems and Challenges

• Design and Implementation

• Evaluation

• Conclusion and Future Work

Experimental Setup

[Figure: testbed diagram. Labels: Aerospace App, GTS, Images Processing, Visualization; Aero Cluster, Vogue, Jedi Cluster; Georgia Tech (Atlanta), WSU (Detroit), OSU (Columbus); OpenStack Swift.]

Workloads

Synthetic Workloads
• 10 variables shared by multiple consumers
• 1,000 requests generated by each consumer
• 8 types of DR-functions, uniformly distributed
• 1 data producer serving 1,000 × (number of client servers) requests
• 3 self-defined DR-functions: FFT, histogram diagnosis, and average of row values of a matrix

Real Workloads
• GTS workload
  • 128 parallel processes; consumers are in 3 different U.S. states
• Combustion workload
  • 10,000 512×512 12-bit double-framed images (~1.5 MB per image)
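A synthetic request stream of the kind listed above can be sketched as follows. The request format, the names `var0` to `var9` and `DR1` to `DR8`, and the uniform choice of variable are assumptions made for illustration; the slides state only that the DR-function types are uniformly distributed. `row_means` is a minimal stand-in for the "average of row values of a matrix" function.

```python
import random
from typing import List, Sequence, Tuple

NUM_VARIABLES = 10            # variables shared by the consumers
NUM_DR_TYPES = 8              # DR-function types, drawn uniformly
REQUESTS_PER_CONSUMER = 1000

Request = Tuple[str, str]     # (variable name, DR-function type); hypothetical format


def generate_requests(seed: int) -> List[Request]:
    """Generate one consumer's request stream with a reproducible seed."""
    rng = random.Random(seed)
    return [
        (f"var{rng.randrange(NUM_VARIABLES)}", f"DR{rng.randint(1, NUM_DR_TYPES)}")
        for _ in range(REQUESTS_PER_CONSUMER)
    ]


def row_means(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Average of the values in each row (rows must be non-empty)."""
    return [sum(row) / len(row) for row in matrix]


requests = generate_requests(seed=42)
print(len(requests))                   # 1000
print(row_means([[1, 2, 3], [4, 6]]))  # [2.0, 5.0]
```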

Latency Breakdown of Swift Object Store

[Figure: two stacked-bar charts, Put and Get. x-axis: Object Size (MB); y-axis: Percentage.]

Put components: Headers Transfer; Data Transfer & Server Processing; Disk Write & CheckSum; Authorization & Container Checking; Software Overhead & Others
Get components: Data Transfer & Server Processing; Disk Read; Authorization & Container Checking; Software Overhead & Others

Data labels (object size: latency, labeled percentage)
Put: 1 MB: 0.48 s, 34.63%; 4 MB: 0.88 s, 63.82%; 8 MB: 1.28 s, 73.58%; 16 MB: 2.17 s, 84.93%; 32 MB: 3.65 s, 88.69%; 64 MB: 6.99 s, 91.30%; 128 MB: 14.64 s, 93.28%
Get: 1 MB: 0.43 s, 17.04%; 4 MB: 0.61 s, 41.77%; 8 MB: 0.92 s, 54.38%; 16 MB: 1.49 s, 67.33%; 32 MB: 2.62 s, 75.62%; 64 MB: 5.05 s, 78.78%; 128 MB: 9.36 s, 83.20%

Data transfer dominates the request latency.

Execution Time of DR-functions

[Figure: three charts of Execution Time (seconds) vs. Variable Size (64 MB, 128 MB, 256 MB, 512 MB, 1 GB, 1.5 GB). Series: DR1, DR2, DR3; DR5, DR6, DR7; DR4-1K, DR4-1M, DR4-16M, DR4-64M, DR4-128M.]

Recall DR7: select var2.value where var1.value in (r1, r2). var1 is small.

Recall DR6: var.value where var.value > Mean(var). Double scan of input data.
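The two recalled DR-functions can be sketched in a few lines. This is an illustration of their semantics, not the Scibox implementation. It assumes that var1 and var2 are element-aligned sequences, that "(r1, r2)" denotes an open interval, and that inputs are non-empty; the function names are hypothetical.

```python
from typing import List, Sequence


def dr6_above_mean(values: Sequence[float]) -> List[float]:
    """DR6: keep values greater than the mean; needs two scans of the input."""
    mean = sum(values) / len(values)        # first scan: compute the mean
    return [v for v in values if v > mean]  # second scan: filter


def dr7_select(var1: Sequence[float], var2: Sequence[float],
               r1: float, r2: float) -> List[float]:
    """DR7: select var2 values whose matching var1 value lies in (r1, r2)."""
    return [b for a, b in zip(var1, var2) if r1 < a < r2]


print(dr6_above_mean([1, 2, 3, 10]))                         # [10]
print(dr7_select([0.1, 0.5, 0.9], [10, 20, 30], 0.2, 1.0))  # [20, 30]
```

DR6 cannot filter until the mean is known, so it has to scan the input twice. DR7 only has to test the small selector variable var1.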

SciBox System Scalability

[Figure, left: Execution Time (seconds) vs. Number of Producers.]
Number of Producers: 1, 2, 4, 8, 16
Stock System: 842.81, 968.86, 1329.83, 2643.06, 4962.91
Scibox: 92.39, 109.65, 151.44, 271.46, 412.12

[Figure, right: Latency (seconds) vs. Number of Consumers in the Same Sharing Group.]
Number of Consumers: 2, 4, 8, 16, 32, 64
Stock System: 4.34, 5.1, 9.54, 19.66, 42.25, 81.91
Scibox: 1.98, 2, 3.09, 6.48, 12.68, 21.94

• With Scibox, data is merged before upload

• With Scibox, partial object access is supported
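The improvement factors implied by the chart's data labels can be worked out directly. The sketch assumes that the larger series in each chart belongs to the stock system, as the legend order suggests; the variable names are hypothetical.

```python
from typing import List, Sequence

producers = [1, 2, 4, 8, 16]
stock_exec = [842.81, 968.86, 1329.83, 2643.06, 4962.91]  # execution time, seconds
scibox_exec = [92.39, 109.65, 151.44, 271.46, 412.12]

consumers = [2, 4, 8, 16, 32, 64]
stock_lat = [4.34, 5.1, 9.54, 19.66, 42.25, 81.91]        # latency, seconds
scibox_lat = [1.98, 2, 3.09, 6.48, 12.68, 21.94]


def speedups(baseline: Sequence[float], improved: Sequence[float]) -> List[float]:
    """Ratio baseline / improved for each configuration."""
    return [b / s for b, s in zip(baseline, improved)]


for n, s in zip(producers, speedups(stock_exec, scibox_exec)):
    print(f"{n:>2} producers: {s:.2f}x")
for n, s in zip(consumers, speedups(stock_lat, scibox_lat)):
    print(f"{n:>2} consumers: {s:.2f}x")
```

With these labels, the ratio is roughly 9x to 12x across the producer counts and roughly 2x to 4x across the consumer counts.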

GTS Workload

[Figure: stacked bars of Latency (seconds) per data sharing scheme. Schemes: Gatech-Scibox, Gatech-FTP, OSU-Scibox, OSU-FTP, WSU-Scibox, WSU-FTP. Components: Post-Computation, Download, Filter-Computation, GTS-Computation.]

      WSU        OSU        GT
GT    900 KB/s   4.4 MB/s   44 MB/s

Combustion Workload

[Figure: Speedup of Performance vs. Number of Clients in the Same Sharing Group (1, 2, 4, 8, 16). Series: 1, 2, 4, 8, 16, and 32 Processes.]

• DR-function: (ImageName, DR8, FFT)

• 10,000 images (~1.5 MB/image) shared via Scibox

Outline

• Background and Motivation

• Problems and Challenges

• Design and Implementation

• Evaluation

• Conclusion and Future Work

Conclusions and Future Work

Scibox: Cloud-based support for scientific data sharing
• Can operate across both public and private cloud stores

Data Reduction functions
• Exploit the structured nature of scientific data
• Reduce the amount of transferred data

Transparent to Applications/Data Producers
• Implemented in ADIOS
• Can also be applied to other I/O libraries

Future Work
• Deploy Scibox on national labs' facilities to better understand potential use cases
• Additional optimizations of cloud storage for scientific data management

Thanks!