1 FreeLoader: borrowing desktop resources for large transient data Vincent Freeh 1 Xiaosong Ma 1,2 Stephen Scott 2 Jonathan Strickland 1 Nandan Tammineedi 1 Sudharshan Vazhkudai 2 1. North Carolina State University 2. Oak Ridge National Laboratory September, 2004


FreeLoader: borrowing desktop resources for large transient data

Vincent Freeh (1)

Xiaosong Ma (1, 2)

Stephen Scott (2)

Jonathan Strickland (1)

Nandan Tammineedi (1)

Sudharshan Vazhkudai (2)

1. North Carolina State University 2. Oak Ridge National Laboratory

September, 2004


Roadmap

- Motivation
- FreeLoader architecture
- Design choices
- Results
- Future work


Motivation: Data Avalanche

- More data to process
    - Science, industry, government
- Example: scientific data
    - Better instruments
    - More simulation power
    - Higher resolution

(Picture courtesy: Jim Gray, SLAC Data Management Workshop)

[Images: space telescope; P&E gene sequencer, from http://www.genome.uci.edu/]


Data acquisition and storage

Data acquisition, reduction, analysis, visualization, storage

[Diagram labels: Data Acquisition System; raw data; metadata; High Speed Network; supercomputers; local users; remote storage; remote users; remote users with local computing and storage]


Remote Data Sources

- Data serving at supercomputing sites
    - Shared file systems: GPFS
    - Archiving systems: HPSS
- Data centers
    - Expensive, high-end solutions with guaranteed capacity and access rates
- Tools used in access
    - FTP, GridFTP
    - Grid file systems
    - Customized data migration programs
    - Web browser


User perspective

- End users typically process data locally
    - Convenience and control
    - Better CPU/memory configurations
    - Problem 1: needs local space to hold data
    - Problem 2: getting data from remote sources is slow
        - Central point of failure
        - High contention for the resource from multiple incoming requests, so availability suffers
- Dataset characteristics
    - Write-once, read-many access patterns
    - Raw data often discarded
    - Shared interest in the same data among groups
    - Primary copy archived elsewhere
    - Squirrel: P2P web cache


Harnessing idle disk storage

- Harnessing the storage resources of individual workstations is analogous to harnessing idle CPU cycles
- LAN environments
    - Desktops with 100Mbps or Gbps connectivity
    - Increasing hard disk capacities
    - An increasing percentage of the total is unused: 50% and upwards
    - Even if each contribution is far smaller than the available space, the aggregate storage is impressive
    - Increasing numbers of workstations are online most of the time
- Access locality, aggregate I/O and network bandwidth, data sharing


Use Cases

FreeLoader storage cloud as a:
- Cache
- Local, client-side scratch
- Intermediate hop
- Grid replica


Intended Role of FreeLoader

- What the scavenged storage "is not":
    - Not a replacement for high-end storage
    - Not a file system
    - Not intended for integrating resources at wide-area scale
    - Does not emphasize replica discovery, routing protocols, and consistency as P2P storage systems do
- What it "is":
    - Low-cost, best-effort alternative to remote high-end storage
    - Intended to facilitate
        - transient access to large, read-only datasets
        - data sharing within an administrative domain
    - To be used in conjunction with higher-end storage systems


FreeLoader Architecture

[Architecture diagram labels:
- Grid Data Access Tools
- Management Layer: Data Placement, Replication, Grid Awareness, Metadata Management
- Storage Layer: Pool A ... Pool m ... Pool n, each with Registration; Morsel Access, Data Integrity, Non-invasiveness]


Storage Layer

- Donors/Benefactors:
    - Morsels as a unit of contribution
    - Basic morsel operations [new(), free(), get(), put()…]
    - Space reclaim: user withdrawal / space shrinkage
    - Data integrity through checksums
    - Performance history per benefactor
- Pools:
    - Benefactor registrations (soft state)
    - Dataset distributions
    - Proximity and performance characteristics

[Diagram: dataset 1 (morsels 1, 2, 3) and dataset n (morsels 1a, 2a, 3a, 4a), with their morsels spread across several benefactors]
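As a rough illustration of the morsel operations and checksum-based integrity listed on this slide, here is a minimal in-memory sketch. The class name `Benefactor`, the use of SHA-256, and the method signatures are assumptions made for illustration; the slides only name the operations new(), free(), get(), and put().

```python
import hashlib


class Benefactor:
    """Toy in-memory benefactor (hypothetical; not FreeLoader's actual code)."""

    def __init__(self, capacity_morsels):
        self.capacity = capacity_morsels
        self.morsels = {}   # morsel id -> (data, checksum)
        self.next_id = 0

    @staticmethod
    def _checksum(data):
        # SHA-256 is an assumption; the slides do not name a checksum.
        return hashlib.sha256(data).hexdigest()

    def new(self):
        """Reserve an empty morsel and return its id."""
        if len(self.morsels) >= self.capacity:
            raise RuntimeError("no free morsels on this benefactor")
        morsel_id = self.next_id
        self.next_id += 1
        self.morsels[morsel_id] = (b"", self._checksum(b""))
        return morsel_id

    def free(self, morsel_id):
        """Release a morsel, e.g. when the donor reclaims space."""
        del self.morsels[morsel_id]

    def put(self, morsel_id, data):
        """Store data in a reserved morsel and record its checksum."""
        if morsel_id not in self.morsels:
            raise KeyError(morsel_id)
        self.morsels[morsel_id] = (data, self._checksum(data))

    def get(self, morsel_id):
        """Return the data, verifying it against the stored checksum."""
        data, checksum = self.morsels[morsel_id]
        if self._checksum(data) != checksum:
            raise IOError("checksum mismatch for morsel %d" % morsel_id)
        return data
```

A client would call new() to reserve space, put() to fill it, and get() to read it back; a corrupted morsel surfaces as an error rather than as silently wrong data.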


Management Layer

- Manager:
    - Pool registrations
    - Metadata: datasets-to-pools, pools-to-benefactors, etc.
- Availability:
    - Redundant Array of Replicated Morsels
    - Minimum replication factor for morsels
    - Where to replicate?
    - Which morsel replica to choose?
- Clients are oblivious to metadata: all metadata requests are sent to the manager
- Cache replacement policy
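The two availability questions above (is a morsel replicated enough, and which replica should serve a read) can be sketched as follows. This is only one plausible policy, assuming the manager keeps a morsel-to-benefactor map and the per-benefactor performance history mentioned on the storage-layer slide; the function names and data shapes are hypothetical.

```python
def under_replicated(replicas, min_factor):
    """Return the ids of morsels with fewer replicas than min_factor.

    replicas maps a morsel id to the list of benefactors holding a copy.
    """
    return sorted(m for m, hosts in replicas.items() if len(hosts) < min_factor)


def choose_replica(hosts, throughput_history):
    """Pick the benefactor with the best observed throughput (MB/s).

    Benefactors with no recorded history are treated as 0 MB/s. This
    selection rule is an assumption, not the policy from the slides.
    """
    return max(hosts, key=lambda host: throughput_history.get(host, 0.0))


if __name__ == "__main__":
    replicas = {"m1": ["a", "b"], "m2": ["a"]}
    print(under_replicated(replicas, min_factor=2))
    print(choose_replica(["a", "b"], {"a": 5.0, "b": 9.5}))
```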


Dataset Striping

- Stripe datasets across benefactors
    - Morsel doubles as the basic unit of striping
    - Manager decides the allocation of data blocks to morsels across benefactors
- Multiple-fold benefits
    - Higher aggregate access bandwidth
    - Lower impact per benefactor
    - Load balancing
- Greedy algorithm to make best use of available space
- Stripe width and stripe size can be varied as striping parameters
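To make the two striping parameters concrete, the sketch below maps morsel indices to benefactors using plain round-robin placement. Note that this is a simplification swapped in for illustration: the slides describe a greedy, space-aware allocation, which is not reproduced here. It also assumes that stripe width is the number of benefactors and stripe size is the number of consecutive morsels placed on one benefactor before moving to the next, as the experiment slide titles suggest.

```python
def stripe_layout(num_morsels, stripe_width, stripe_size):
    """Assign morsel indices to benefactors in round-robin stripes.

    stripe_width: number of benefactors the dataset is spread over.
    stripe_size:  consecutive morsels placed on one benefactor per turn.
    Returns a dict mapping benefactor index -> list of morsel indices.
    """
    layout = {benefactor: [] for benefactor in range(stripe_width)}
    for morsel in range(num_morsels):
        benefactor = (morsel // stripe_size) % stripe_width
        layout[benefactor].append(morsel)
    return layout


if __name__ == "__main__":
    # 8 morsels over 2 benefactors, 2 morsels per stripe unit.
    print(stripe_layout(8, stripe_width=2, stripe_size=2))
    # prints {0: [0, 1, 4, 5], 1: [2, 3, 6, 7]}
```

With a wider stripe, each benefactor serves a smaller share of any read, which is the "lower impact per benefactor" benefit listed above.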


Client interface

- Obtains metadata from the manager
- Performs gets or puts directly to the benefactors
- All control messages are exchanged via UDP
- All data transfers use TCP
- Morsel requests are sent to benefactors in parallel; the striping strategy ensures these blocks are contiguous
- Efficient buffering strategy:
    - Buffer pool of size (stripesize+1)*stripewidth
    - Double buffering scheme allows network and I/O to proceed in parallel
    - After the pool is filled up, buffer contents are flushed to disk
    - Reduces disk seeks: waits for filled buffer contents to form contiguous blocks before writing to disk
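The buffering ideas above can be sketched in two small pieces: the pool-size formula, and a writer that holds out-of-order morsels until they form a contiguous run. The sketch assumes the pool size is counted in morsels, and it flushes as soon as a contiguous prefix exists rather than waiting for the pool to fill, which is a simplification. The class `ContiguousWriter` is hypothetical, and a Python list stands in for sequential disk writes.

```python
def buffer_pool_morsels(stripe_size, stripe_width):
    """Pool size from the slide's formula, assumed to be in morsels."""
    return (stripe_size + 1) * stripe_width


class ContiguousWriter:
    """Hold morsels that arrive out of order; write only contiguous runs."""

    def __init__(self):
        self.pending = {}     # morsel index -> data, not yet written
        self.next_index = 0   # next morsel index the "disk" expects
        self.written = []     # stands in for sequential disk writes

    def receive(self, index, data):
        """Buffer one morsel, then flush any contiguous prefix."""
        self.pending[index] = data
        while self.next_index in self.pending:
            self.written.append(self.pending.pop(self.next_index))
            self.next_index += 1


if __name__ == "__main__":
    # Stripe size 8 and stripe width 4 give a pool of 36 morsels.
    print(buffer_pool_morsels(8, 4))
    writer = ContiguousWriter()
    writer.receive(1, b"B")   # arrives early, stays buffered
    writer.receive(0, b"A")   # completes the run, both are written
    print(writer.written)
```

Because benefactors respond in parallel, morsels rarely arrive in file order; writing only contiguous runs keeps the disk access pattern sequential.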


Current Status

[Diagram: an Application sits on top of the Client through an I/O interface; the Client, the Manager, and two Benefactors (each on its own OS) are connected by links labeled UDP (A), UDP/TCP (B), UDP (C), and UDP/TCP (D).]

Interface calls shown:
- reserve(), cancel(), store(), retrieve(), delete(), open(), close(), read(), write()
- new(), free(), get(), put()

Services by link:
- (A): dataset creation/deletion, space reservation
- (B): dataset retrieval, hints
- (C): registration; benefactor alerts, warnings, alarms to manager
- (D): dataset store, morsel request

Simple data striping


Results: Experiment Setup

- FreeLoader prototype running at ORNL
- Client box
    - AMD Athlon 700MHz
    - 400MB memory
    - Gig-E card
    - Linux 2.4.20-8
- Benefactors
    - Group of heterogeneous Linux workstations
    - Contributing 7GB-30GB each
    - 100Mb cards


Data Sources

- Local GPFS
    - Attached to ORNL SCs
    - Accessed through GridFTP
    - 1MB TCP buffer, 4 parallel streams
- Local HPSS
    - Accessed through HSI client, highly optimized
    - Hot: data in disk cache without tape unloading
    - Cold: data purged, retrieval done in large intervals
- Remote NFS
    - At NCSU HPC center
    - Accessed through GridFTP
    - 1MB TCP buffer, 4 parallel streams
- FreeLoader
    - 1 MB morsel size for all experiments
    - Varying configurations


Testbed


Best of class performance comparisons

[Figure: y-axis: Throughput (MB/s)]


Effect of stripe width variation (stripe size = 1 morsel)


Effect of stripe width variation (stripe size = 8 morsels)


Effect of stripe size variation (stripe width = 4 benefactors)


Impact Tests

- How uncomfortable do the donors feel
    - When running CPU-intensive tasks?
    - Disk-intensive tasks?
    - Network-intensive tasks?
- A set of tests at NCSU
    - Benefactor performing local tasks
    - Client retrieving datasets at a given rate
    - Rate is varied to study the impact on the user
    - Pentium 4, 512MB memory, 100Mbps connectivity


CPU-intensive and Mixed

[Figure: y-axis: Time (s)]


Network-intensive Task

[Figure: y-axis: Normalized Download Time]


Disk-intensive Task

[Figure: Impact on I/O performance. x-axis: Request rate (MB/s), 0 to 7; y-axis: Throughput (MB/s), 0 to 60; series: write, read]


Sample application - formatdb

- Subset of basic file APIs implemented
- formatdb, from the NCBI BLAST toolkit, preprocesses a biological sequence database to create a set of sequence and index files
- The raw database is an ideal candidate for caching on FreeLoader
- formatdb is not the ideal application for FreeLoader

Configuration                Time (sec)
Local                        598
NFS                          585
FreeLoader, 1 benefactor     599
FreeLoader, 2 benefactors    563
FreeLoader, 4 benefactors    556


Significant results


Significant results – contd.

- 2x and 4x speedup with respect to GPFS and HPSS
- Management overhead is minimal
- 14% worst-case performance hit for CPU-intensive tasks
- <= 25% for network-intensive tasks
- formatdb tests the upper bound of FreeLoader's internal overhead
    - Same as local for 1 benefactor, 2% slower than NFS
    - 5% faster than NFS for 4 benefactors
- 10 MB/s performance gain for each benefactor added until saturation
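The formatdb percentages can be checked against the run times on the sample-application slide (598, 585, 599, 563, and 556 seconds). The sketch below assumes those values correspond, in order, to local disk, NFS, and FreeLoader with 1, 2, and 4 benefactors; that mapping is inferred from the flattened table and the dictionary keys are hypothetical labels.

```python
# formatdb run times in seconds. The assignment of values to
# configurations is an assumption inferred from the slide's table.
TIMES = {
    "local": 598,
    "nfs": 585,
    "freeloader_1": 599,
    "freeloader_2": 563,
    "freeloader_4": 556,
}


def percent_change(time_s, baseline_s):
    """Positive means slower than the baseline, negative means faster."""
    return 100.0 * (time_s - baseline_s) / baseline_s


if __name__ == "__main__":
    for name in ("freeloader_1", "freeloader_2", "freeloader_4"):
        vs_nfs = percent_change(TIMES[name], TIMES["nfs"])
        vs_local = percent_change(TIMES[name], TIMES["local"])
        print(f"{name}: {vs_nfs:+.1f}% vs NFS, {vs_local:+.1f}% vs local")
```

Under that mapping, one benefactor comes out about 2.4% slower than NFS and within 0.2% of local disk, and four benefactors come out about 5.0% faster than NFS, which agrees with the rounded figures on the slide.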


Conclusions

- Goal is to achieve saturation from the client side
    - Striping helps achieve this
- Low-cost commodity parts
- Harnessing idle disk bandwidth
- Low impact on donor, controlled by throttling request rate
- Better availability, more suitable for large transient data sets than a regular FS


In-progress and Future Work

- In progress
    - Windows support
- Future
    - Complete pool structure, registration
    - Intelligent data distribution, service profiling
    - Benefactor impact control, self-configuration
    - Naming and replication
    - Grid awareness
- Potential extensions
    - Harnessing local storage at cluster nodes?
    - Complementing commercial storage servers?


Further Information

http://www.csm.ornl.gov/~vazhkuda/Morsels/
