
Implementing Babel RMI with ARMCI

Jian Yin, Khushbu Agarwal, Daniel Chavarría, Manoj Krishnan, Ian Gorton, Vidhya Gurumoorthi, Patrick Nichols

Motivation

Remote Method Invocation provides a useful abstraction for distributed computing

Example: event service for CCA framework

Existing TCP/IP-based implementation has performance problems

Question: can we speed up Babel RMI with high-performance communication protocols?

Objectives

Demonstrate that it is feasible to build high performance Babel RMI

Prototype a Babel RMI with ARMCI and measure its performance experimentally

Produce a quality implementation of high performance RMI

Outline

Motivation

Objectives

Background

Babel RMI

ARMCI

Preliminary performance results

Future work

Babel RMI

Babel supports Remote Method Invocation

Transparent

Flexible

Implemented with extensive code marshalling and a runtime library

Existing TCP/IP-based implementation incurs high overhead

Multiple copying

Context switching

TCP RMI Performance

ARMCI

Middleware for remote memory access (RMA)

Supports many networks and HPC systems

Myrinet, InfiniBand, Quadrics, Giganet, …

Cray XT4, XT, X1, IBM BlueGene, …

Efficient

Minimal copying

Truly one-sided communication protocol

Put, get, accumulate

Atomic read-modify-write, mutexes

Blocking and non-blocking interfaces

Experiment Setup

Hardware

Cluster with 11 nodes

4-core 2.4 GHz Intel Xeon processor

InfiniBand DDR network

Software

Babel 1.4.0

ARMCI 1.4

OpenMPI 1.2.6

Implementation

Implemented an extensive set of functions in the runtime library

InstanceHandle, Server, Invocation, Response, Call, Return, …

Usage examples

hello_World h = hello_World__createRemote("armcihandler://<process_id>:<mutex_id>", &_ex);

hello_World h2 = hello_World__connect("armcihandler://<process_id>:<mutex_id>/<object_id>", &_ex);
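
The two URL forms above can be split into their fields with a small parser. The following is only a sketch under stated assumptions: the slides do not say how the prototype parses URLs, so the `armci_url` struct and `parse_armci_url` function are hypothetical names, and the code assumes that `<process_id>` and `<mutex_id>` are non-negative decimal integers and that `<object_id>` is an opaque string shorter than 64 bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the URL fields shown on the slide. */
typedef struct {
    long process_id;     /* assumed: non-negative decimal integer */
    long mutex_id;       /* assumed: non-negative decimal integer */
    char object_id[64];  /* empty string when the URL names no object */
} armci_url;

/* Parse "armcihandler://<process_id>:<mutex_id>[/<object_id>]".
 * Returns 0 on success and -1 on malformed input.
 * Overflow of the numeric fields is not checked in this sketch. */
int parse_armci_url(const char *url, armci_url *out)
{
    static const char prefix[] = "armcihandler://";
    const size_t prefix_len = sizeof prefix - 1;
    const char *p;
    char *end;
    size_t len;

    assert(url != NULL && out != NULL);

    if (strncmp(url, prefix, prefix_len) != 0)
        return -1;

    /* <process_id>, which must be followed by ':' */
    p = url + prefix_len;
    if (*p < '0' || *p > '9')
        return -1;
    out->process_id = strtol(p, &end, 10);
    if (*end != ':')
        return -1;

    /* <mutex_id>, followed by end of string or '/' */
    p = end + 1;
    if (*p < '0' || *p > '9')
        return -1;
    out->mutex_id = strtol(p, &end, 10);

    out->object_id[0] = '\0';
    if (*end == '\0')
        return 0;        /* createRemote form: no object id */
    if (*end != '/')
        return -1;

    /* <object_id>: everything after the '/' (connect form) */
    p = end + 1;
    len = strlen(p);
    if (len == 0 || len >= sizeof out->object_id)
        return -1;
    memcpy(out->object_id, p, len + 1);
    return 0;
}
```

Under these assumptions, a URL without a trailing `/<object_id>` corresponds to the `createRemote` form and leaves `object_id` empty, while the `connect` form fills all three fields.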

ARMCI RMI Performance

Next Step

Reduce protocol overhead

Reduce function call overhead

Reduce copying

Batch RMI calls

Reduce RDMA overhead

Prefetch in the background

Preload libraries

Prefetch arguments
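
One way to picture the call-batching idea is a buffer that packs several call records into a single message and hands it to the transport only when it is full or explicitly flushed. The sketch below is illustrative and not taken from the prototype: `rmi_batch`, `rmi_batch_add`, `rmi_batch_flush`, the 256-byte capacity, the 4-byte length prefix, and the `counting_send` stand-in transport are all hypothetical, and no ARMCI call is made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BATCH_CAPACITY 256  /* hypothetical buffer size in bytes */

/* Transport callback: sends one packed message (hypothetical signature). */
typedef void (*send_fn)(const void *buf, size_t len, void *ctx);

typedef struct {
    unsigned char buf[BATCH_CAPACITY];
    size_t used;   /* bytes currently packed into buf */
    send_fn send;
    void *ctx;
} rmi_batch;

/* Stand-in transport for illustration: only counts messages and bytes. */
typedef struct { int messages; size_t bytes; } send_stats;

void counting_send(const void *buf, size_t len, void *ctx)
{
    send_stats *s = ctx;
    (void)buf;
    s->messages++;
    s->bytes += len;
}

void rmi_batch_init(rmi_batch *b, send_fn send, void *ctx)
{
    assert(b != NULL && send != NULL);
    b->used = 0;
    b->send = send;
    b->ctx = ctx;
}

/* Send everything packed so far as one message, then reset the buffer. */
void rmi_batch_flush(rmi_batch *b)
{
    if (b->used == 0)
        return;
    b->send(b->buf, b->used, b->ctx);
    b->used = 0;
}

/* Append one length-prefixed call record, flushing first if it does not fit.
 * Returns -1 if the record can never fit in an empty buffer. */
int rmi_batch_add(rmi_batch *b, const void *call, size_t len)
{
    uint32_t prefix;
    size_t need;

    if (len > BATCH_CAPACITY - sizeof prefix)
        return -1;
    need = sizeof prefix + len;
    if (b->used + need > BATCH_CAPACITY)
        rmi_batch_flush(b);

    prefix = (uint32_t)len;
    memcpy(b->buf + b->used, &prefix, sizeof prefix);
    memcpy(b->buf + b->used + sizeof prefix, call, len);
    b->used += need;
    return 0;
}
```

With 20-byte call records, ten records fit in one 256-byte message, so the per-message protocol cost is paid once per ten calls rather than once per call. The trade-off is added latency for calls that wait in the buffer, which is why an explicit flush is exposed.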

Where to Use High Performance Babel RMI

Applications for high-performance RMI

Fine-grained distribution

Hybrid computing

Suggestions …
