Page 1: IMCa: A High-Performance Caching Front-end for GlusterFS ...mvapich.cse.ohio-state.edu/static/media/publications/slide/imca... · IMCa: A High-Performance Caching Front-end for GlusterFS

IMCa: A High-Performance Caching Front-end for GlusterFS on InfiniBand

Ranjit Noronha and Dhabaleswar K. Panda

Network Based Computing Lab

The Ohio State University

{noronha, panda}@cse.ohio-state.edu

Outline of the Talk

• Background and Motivation

• Architecture and Design of IMCa

• Experimental Evaluation of IMCa

• Conclusions and Future Work

Background

• Large Scale Scientific and Commercial Workloads

• Petascale Computers have arrived

• High-Performance access to the I/O data is crucial
  – Performance of parallel applications is often limited by I/O

• Clustered/Parallel File Systems have evolved to meet this challenge

• File System performance still dependent on disk performance

• Single Server Bandwidth Drop With Multiple Clients

• Parallel I/O Bandwidth From Multiple Servers

[Figure: Bandwidth (MegaBytes/s) versus number of clients (1 to 8) for the RDMA, IPoIB, and GigE transports, in two panels: 4GB Server Memory and 8GB Server Memory.]

• Performance for Small Files
  – Generally difficult to achieve
  – Many environments with a large number of small files
  – Storing on the same disk block provides limited benefit
  – Striping does not provide benefit
  – Store on different servers

• Cache Coherency Problems
  – Client side cache provides good performance
  – Non-coherent client cache limited when there is sharing
  – Limited Scalability of coherent caches

• Server Load Problems
  – RDMA reduces overhead from TCP/IP
  – RDMA based transport protocols cannot reduce copying costs within the file system

Problem Statement

• Which file-system operations are potential targets for caching?

• What are the alternatives to the traditional client cache/server cache architecture?

• What are the advantages and disadvantages of alternate cache architectures?

• How do we provide the performance of the non-coherent client cache without the scalability problems of the coherent client cache?

Potential File System Operations That May Be Cached

• Potential Targets For Caching
  – Should be something the client reads
  – Should be possible to uniquely identify cache target
  – Should be possible to chunk the data element

• Small Operations: Stat, Create, Delete, Open

• Stat
  – Read by the client
  – Used as a form of update by many applications
  – Should be used
  – Should be updated on read/write operations on the server

• Create/delete
  – Not read by the client
  – Delete should invalidate previous cache entries

• File Open
  – Not a target for caching, but may be used for prefetching

• Data Transfer Operations
  – Reads and Writes
  – Blocks Needed
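
A minimal sketch of how the criteria above might map onto cache entries, assuming a plain in-memory dictionary stands in for the cache. The class name `ToyCache`, the tuple key layout, and the method names are hypothetical and are not taken from IMCa; the sketch only illustrates uniquely identified entries for stat and data blocks, and invalidation on delete.

```python
class ToyCache:
    """Illustrative cache keyed by (kind, path[, block_index]) tuples."""

    def __init__(self):
        self._entries = {}

    def put_stat(self, path, stat_info):
        # Stat results are read by the client, so they qualify as a target.
        self._entries[("stat", path)] = stat_info

    def get_stat(self, path):
        return self._entries.get(("stat", path))

    def put_block(self, path, block_index, data):
        # File data is chunked; each chunk gets its own unique key.
        self._entries[("data", path, block_index)] = data

    def get_block(self, path, block_index):
        return self._entries.get(("data", path, block_index))

    def invalidate(self, path):
        # A delete should drop every earlier entry for the same file.
        stale = [key for key in self._entries if key[1] == path]
        for key in stale:
            del self._entries[key]


cache = ToyCache()
cache.put_stat("/data/a.txt", {"size": 10})
cache.put_block("/data/a.txt", 0, b"0123456789")
cache.invalidate("/data/a.txt")  # e.g. after the file is deleted
```

Keying every entry by path lets a single delete find and drop both the stat entry and all data blocks of that file.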

Intermediate Cache Architecture (IMCa)

• Easy to maintain coherency

• Extensible

• Can multiple Cache nodes provide benefit?

[Diagram: FS Client, CMCache, SMCache, Underlying FS Cache, ext3, and cache nodes Cache1, Cache2, ..., Cachen. Each Cache is a node (MCD Array). A hash function (CRC32) is used to find the cache server.]
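
The CRC32-based lookup could be sketched as follows. The slide only states that a CRC32 hash function is used to find the cache server; reducing the checksum modulo the number of servers, the key string format, and the function name `pick_cache_server` are assumptions made for illustration.

```python
import zlib


def pick_cache_server(key: str, num_servers: int) -> int:
    """Map a cache key to a cache-server index via CRC32 (illustrative)."""
    if num_servers <= 0:
        raise ValueError("need at least one cache server")
    # zlib.crc32 returns an unsigned 32-bit checksum in Python 3.
    checksum = zlib.crc32(key.encode("utf-8"))
    # Assumption: a simple modulo spreads keys over the servers.
    return checksum % num_servers


# The same key always maps to the same server, so any client (or the
# file server) can locate an entry without coordination.
index = pick_cache_server("/data/a.txt:0", 4)
```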

Need for Blocks In IMCa

• Most file systems store data on the disk as blocks

• Parallel file-systems stripe data across multiple servers

• IMCa uses a fixed block size to store data across the cache servers
  – Block size should provide good performance for most small files
  – Should avoid excessive fragmentation

[Diagram: requested data shown against data block boundaries. File data is segmented by the IMCa block size, so extra data beyond the request is included.]
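
To make the block-boundary effect concrete, the following sketch computes which fixed-size blocks cover a byte range and how many extra bytes come along with them. The function names and the 2048-byte default are illustrative assumptions (2KB is one of the block sizes used later in the evaluation).

```python
def covering_blocks(offset, length, block_size=2048):
    """Return the indices of the fixed-size blocks that cover a request."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


def extra_bytes(offset, length, block_size=2048):
    """Bytes transferred beyond the request when whole blocks are moved."""
    return len(covering_blocks(offset, length, block_size)) * block_size - max(length, 0)


# A 2000-byte request at offset 3000 straddles a block boundary.
blocks = covering_blocks(3000, 2000)   # blocks 1 and 2
overhead = extra_bytes(3000, 2000)     # 2 * 2048 - 2000 = 2096 extra bytes
```

A larger block size reduces the number of cache entries per file but increases this overhead for small or unaligned requests, which is the trade-off the bullets describe.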

Design-Read Operations (Hit)

[Diagram: Client, CMCache, SMCache, Underlying FS Cache, ext3, and cache nodes Cache1, Cache2, ..., Cachen.]

Design-Read Operations (Miss)

[Diagram: Client, CMCache, SMCache, Underlying FS Cache, ext3, and cache nodes Cache1, Cache2, ..., Cachen.]

Design-Write Operations

[Diagram: Client, CMCache, SMCache, Underlying FS Cache, ext3, and cache nodes Cache1, Cache2, ..., Cachen.]
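
The three design slides survive only as diagram labels, so the exact message flow is not recoverable from this transcript. The following is a toy sketch under stated assumptions: a read is served from the cache on a hit, falls through to the file server on a miss and then populates the cache, and a write goes to the server and refreshes the cached copy. The class `ToyIMCa` and its two dictionaries are hypothetical stand-ins, not the actual CMCache/SMCache implementation.

```python
class ToyIMCa:
    """Toy model of an intermediate cache in front of a file server."""

    def __init__(self):
        self.cache = {}    # stands in for the cache nodes (MCDs)
        self.server = {}   # stands in for the file server and its disk
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            # Assumed hit path: the request never reaches the server.
            self.hits += 1
            return self.cache[key]
        # Assumed miss path: fetch from the server, then fill the cache.
        self.misses += 1
        data = self.server.get(key)
        if data is not None:
            self.cache[key] = data
        return data

    def write(self, key, data):
        # Assumed write path: the server stays authoritative and the
        # cached copy is refreshed so later reads see the new data.
        self.server[key] = data
        self.cache[key] = data


fs = ToyIMCa()
fs.server["/data/a.txt:0"] = b"cold"
fs.read("/data/a.txt:0")   # cold miss, served by the server
fs.read("/data/a.txt:0")   # hit, served by the cache
```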

Advantages/Disadvantages of IMCa

• Fewer Requests Hit the Server

• Latency for requests read from the cache is lower

• MCDs are self-managing

• Failures in MCDs do not impact correctness

• Additional node elements needed especially for caching

• Cold Misses are expensive

• Additional Blocks/Data Transfers Needed

• Overhead and delayed updates

GlusterFS File System

• Clustered File System

• Client and Server in userspace

• Use FUSE interface to translate FS calls from the kernel to the user daemons

• No striping; data distributed across servers

• Possible to apply translators at the server and client to perform different functions

• www.glusterfs.org

Experimental Setup

• 64-node cluster
  – 8-core Intel Clovertown
  – 8 GB memory

• InfiniBand DDR is the interconnect

• GlusterFS file-system

• The data servers each have 8 RAID highpoint disks

• Communication protocol is IPoIB in Reliable Connected (RC) mode

• MCDs run on independent nodes and use up to 6GB of memory

• CMCache and SMCache use a CRC32 hash function for locating data on the MCDs

• Lustre 1.6.4.3 is used with a socklnd for comparison

Experiment-stat

– Consists of two stages

– First stage (untimed)
  • 262144 files created by a single node

– Second stage (timed)
  • each node tries to perform a stat on each of the 262144 files sequentially
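
The two stages can be sketched on a single node as follows. This is a local illustration only: the function name `run_stat_benchmark`, the file naming, and the use of a temporary directory are assumptions, and the real experiment runs the timed stage from many nodes against a GlusterFS mount.

```python
import os
import tempfile
import time


def run_stat_benchmark(directory, num_files):
    """Create num_files empty files (untimed), then time a stat on each."""
    paths = [os.path.join(directory, f"file_{i}") for i in range(num_files)]

    # Stage 1 (untimed): create the files.
    for path in paths:
        with open(path, "w"):
            pass

    # Stage 2 (timed): stat every file sequentially.
    start = time.perf_counter()
    for path in paths:
        os.stat(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The slides use 262144 files; a small count keeps this demo quick.
    with tempfile.TemporaryDirectory() as scratch:
        elapsed = run_stat_benchmark(scratch, 1000)
        print(f"stat of 1000 files took {elapsed:.4f} s")
```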

Stat performance

• Time to stat 262144 different files

• Benchmark has two phases: create (untimed), followed by stat (timed)

• 82% improvement at 64 nodes

[Figure: Time (seconds) versus Number of Nodes, in two panels (1 to 8 nodes and 16 to 64 nodes). Series: No Cache, 1 Cache Server, 2 Cache Servers, 4 Cache Servers, 6 Cache Servers, Lustre-4DS.]

Experiment-Write Single Client

– One Client

– Writes 1,024 records of size r sequentially to the file

– Measure time for this to complete
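
A minimal local sketch of this measurement, assuming unbuffered file I/O so that each record is issued as its own write call. The function name `time_sequential_writes` and the final `fsync` are illustrative choices, not details given in the slides.

```python
import os
import tempfile
import time


def time_sequential_writes(path, record_size, num_records=1024):
    """Write num_records records of record_size bytes and return seconds."""
    record = b"x" * record_size
    # buffering=0 gives a raw file object: one write call per record.
    with open(path, "wb", buffering=0) as f:
        start = time.perf_counter()
        for _ in range(num_records):
            f.write(record)
        os.fsync(f.fileno())  # assumption: include the flush to storage
        elapsed = time.perf_counter() - start
    return elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "write_test.dat")
        total = time_sequential_writes(target, record_size=2048)
        print(f"average per record: {total / 1024 * 1e6:.1f} microseconds")
```

Dividing the total by the record count gives an average per-record latency for one record size; repeating the call over a range of sizes yields one point per size.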

Write – Single Client

[Figure: Write Latency (microseconds) versus I/O Record Size (bytes), 1 to 32768. Series: No Cache, IMCa (2K), IMCa (Server Threads).]

• 2KB block size

• Server thread helps performance

Experiments-Read

• Single Client Read
  – Follows Write component of the benchmark
  – Move file pointer to the beginning of the file
  – Read 1,024 records of size r sequentially from the file
  – Measure time for this to complete

• Multiple Client Read
  – Each client uses a separate file

• Multiple Client Read Shared
  – Same file used by every client

• Lustre configurations
  – Cold Client Cache: unmount between Write and Read
  – Warm Client Cache: no unmount between Write and Read
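
The single-client read phase could be sketched locally like this, assuming the file was already produced by a preceding write phase. The function name `time_sequential_reads` and the returned byte count are illustrative additions; cold and warm cache behavior depends on the mounted file system and is not modeled here.

```python
import os
import tempfile
import time


def time_sequential_reads(path, record_size, num_records=1024):
    """Read num_records records from the start of the file.

    Returns (elapsed_seconds, bytes_read).
    """
    with open(path, "rb", buffering=0) as f:
        f.seek(0)  # move the file pointer to the beginning of the file
        bytes_read = 0
        start = time.perf_counter()
        for _ in range(num_records):
            bytes_read += len(f.read(record_size))
        elapsed = time.perf_counter() - start
    return elapsed, bytes_read


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "read_test.dat")
        with open(target, "wb") as out:   # stand-in for the write phase
            out.write(b"x" * (1024 * 2048))
        seconds, count = time_sequential_reads(target, record_size=2048)
        print(f"read {count} bytes in {seconds:.4f} s")
```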

Read Latency (Single Client)

[Figure: Latency (us) versus Bytes, in two panels with different latency scales. Series: No Cache, Cache (256), Cache (2K), Cache (8K), Lustre-1DS (Cold), Lustre-4DS (Cold), Lustre-4DS (Warm).]

• Lustre shows best latency

• Cache provides benefit for small message sizes

Read Multiple Client (32 clients)

[Figure: Time (microseconds) versus Bytes (1 to 65536). Series: NoCache, IMCa (1), IMCa (2), IMCa (4), Lustre (Cold), Lustre (Warm).]

• 51% improvement in latency at 16K

• Multiple MCDs help reduce capacity misses

Iozone throughput

[Figure: Throughput (MegaBytes/second) for No Cache, Cache (1), Cache (2), Cache (4), and Lustre-1DS (Cold). Legend: 1, 2, 4, 8.]

• 1, 2, 4, 8 IOzone threads, 1GB files, 2KB block size

• 325 MB/s (NoCache) -> 868 MB/s (4 MCDs)
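
As a worked check of the two throughput figures quoted above, the relative gain can be computed directly. The variable names are illustrative; only the 325 MB/s and 868 MB/s values come from the slide.

```python
no_cache_mb_per_s = 325.0   # NoCache throughput from the slide
imca_mb_per_s = 868.0       # throughput with 4 MCDs from the slide

speedup = imca_mb_per_s / no_cache_mb_per_s
percent_gain = (speedup - 1.0) * 100.0

print(f"speedup: {speedup:.2f}x")              # speedup: 2.67x
print(f"throughput gain: {percent_gain:.0f}%")  # throughput gain: 167%
```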

Read-Shared Latency

[Figure: Time (microseconds) versus Number of nodes (2 to 32). Series: No Cache, Lustre-1DS (Cold), MCD (1).]

• IMCa helps improve performance over NoCache case

Conclusions and Future Work

• Proposed, Designed and Evaluated an Intermediate Cache for GlusterFS

• Good improvement in stat performance

• Improvement in latency/throughput of read operations

• Depends on block size

• Would like to evaluate the performance with RDMA

• Would like to evaluate distribution algorithms

Acknowledgements

Our research is supported by the following organizations

Thank you

{noronha, panda}@cse.ohio-state.edu

Network-Based Computing Laboratory

http://nowlab.cse.ohio-state.edu/