SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes...

42
SWIFT Simon Pickartz, Pablo Reble, Carsten Clauß, and Stefan Lankes A Transparent and Flexible communication Layer for PCIe-coupled Accelerators and (Co-)Processors 05/19/2014

Transcript of SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes...

Page 1: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT

Simon Pickartz, Pablo Reble, Carsten Clauß, and Stefan Lankes

A Transparent and Flexible communication Layer for PCIe-coupledAccelerators and (Co-)Processors

05/19/2014

Page 2: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Today’s HPC Systems

Homogeneous hardware landscapeSeparate computer systemsconnected to increase theaggregated compute powerDifferent interconnects forLAN and SANRDMA capabilities

I/O

StorageHost

I/O

StorageHost

...

I/O

StorageHost

I/O

StorageHost

...Fabric

→ For portability concerns one standardised interface to layers on topis sufficient (e. g. uDAPL)

05/19/2014 | ACS Automation for Complex Power Systems | 2Simon Pickartz

Page 3: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Tomorrow’s HPC Systems

Intra-rack network connectingHostsI/O devicesStorage

Heterogeneous compute nodesCPUsGPUsAcceleratorsEtc.

Peer-to-peer communicationRDMA and RMA capabilitiesStill different interconnects

PCIeFabric

Host

Host

Host

Host

...

I/O

I/O

...

Storage

Storage

...

→ Computer systems connected to share resourcesand to increase the aggregated compute power

05/19/2014 | ACS Automation for Complex Power Systems | 3Simon Pickartz

Page 4: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Agenda

Socket Wheeled Intelligent Fabric Transport (SWIFT)

Hardware

Results

Outlook

05/19/2014 | ACS Automation for Complex Power Systems | 4Simon Pickartz

Page 5: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT – Requirements

Support for heterogeneous network landscapes

→ A transparent solution is targeted

High portability

→ Hardware abstraction

Supply of different programming models

→ Service-oriented, SPMD, etc.

Consideration of the hardware’s RDMA and RMA capabilities

→ Offer one-sided communication primitives

High performance

→ Low latencies and high data rates

05/19/2014 | ACS Automation for Complex Power Systems | 5Simon Pickartz

Page 6: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT – Requirements

Support for heterogeneous network landscapes→ A transparent solution is targetedHigh portability

→ Hardware abstraction

Supply of different programming models

→ Service-oriented, SPMD, etc.

Consideration of the hardware’s RDMA and RMA capabilities

→ Offer one-sided communication primitives

High performance

→ Low latencies and high data rates

05/19/2014 | ACS Automation for Complex Power Systems | 5Simon Pickartz

Page 7: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT – Requirements

Support for heterogeneous network landscapes→ A transparent solution is targetedHigh portability

→ Hardware abstractionSupply of different programming models

→ Service-oriented, SPMD, etc.

Consideration of the hardware’s RDMA and RMA capabilities

→ Offer one-sided communication primitives

High performance

→ Low latencies and high data rates

05/19/2014 | ACS Automation for Complex Power Systems | 5Simon Pickartz

Page 8: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT – Requirements

Support for heterogeneous network landscapes→ A transparent solution is targetedHigh portability

→ Hardware abstractionSupply of different programming models

→ Service-oriented, SPMD, etc.Consideration of the hardware’s RDMA and RMA capabilities

→ Offer one-sided communication primitives

High performance

→ Low latencies and high data rates

05/19/2014 | ACS Automation for Complex Power Systems | 5Simon Pickartz

Page 9: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT – Requirements

Support for heterogeneous network landscapes→ A transparent solution is targetedHigh portability

→ Hardware abstractionSupply of different programming models

→ Service-oriented, SPMD, etc.Consideration of the hardware’s RDMA and RMA capabilities

→ Offer one-sided communication primitivesHigh performance

→ Low latencies and high data rates

05/19/2014 | ACS Automation for Complex Power Systems | 5Simon Pickartz

Page 10: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT – Requirements

Support for heterogeneous network landscapes→ A transparent solution is targetedHigh portability

→ Hardware abstractionSupply of different programming models

→ Service-oriented, SPMD, etc.Consideration of the hardware’s RDMA and RMA capabilities

→ Offer one-sided communication primitivesHigh performance

→ Low latencies and high data rates

05/19/2014 | ACS Automation for Complex Power Systems | 5Simon Pickartz

Page 11: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT – Basic Concepts

TopologyHostsNodesEndpoints

Communication modesAsynchronous signaling via mailsNon-blocking two-sided communicationOne-sided communication (including atomics)

Gateway-based fabric connection

05/19/2014 | ACS Automation for Complex Power Systems | 6Simon Pickartz

Page 12: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Topology

Fabric 0

Fabric 1

Fabric 2

Fabric 3

SWIFT Network

05/19/2014 | ACS Automation for Complex Power Systems | 7Simon Pickartz

Page 13: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Topology

Fabric 0

Fabric 1

Fabric 2

Fabric 3

SWIFT Network

05/19/2014 | ACS Automation for Complex Power Systems | 7Simon Pickartz

Page 14: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Topology

Fabric 0

Fabric 1

Fabric 2

Fabric 3

SWIFT Network

05/19/2014 | ACS Automation for Complex Power Systems | 7Simon Pickartz

Page 15: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Layered Architecture

Application layerHigher level libraries(e. g. MPI)Parallel applicationsService-oriented apps

SWIFT layerRoutingTopologyMessaging services

Device layerHardware abstractionOptimization

Applications /High Level Communications Libraries

SWIFT

Device #0 Device #N...

Network #0 Network #N

Software

Hardware

→ Well-defined interfaces to layers above and below

05/19/2014 | ACS Automation for Complex Power Systems | 8Simon Pickartz

Page 16: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT Device

Small interface (around 20 prototypes only)Administration module

Constructor and destructorAutomatic discovery of fabric nodes

Channel moduleBi-directional FIFO channelFixed channel sizeAsynchronous connection establishmentvia create() and connect()Three transfer modes: PIO, DMA, and AUTO

→ High portability

put

getA B

get

put

05/19/2014 | ACS Automation for Complex Power Systems | 9Simon Pickartz

Page 17: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Connection Setup

A distributed Directory Service (DS)Dedicated process for holding topologyinformationAutomatically maintains connectionsto other DS on neighbor hostsManages node IDs for the local nodesNo single point of failure

On-demand connection setup via DSDirect communication between nodes on different hosts

→ Minimization of the hop countAutomatically connect to destination DS if necessaryA bit of proactivity

05/19/2014 | ACS Automation for Complex Power Systems | 10Simon Pickartz

Page 18: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Connection Setup

Host #0 Host #1

A

DS

B

DS

1

2

3

4

1

23

4

05/19/2014 | ACS Automation for Complex Power Systems | 11Simon Pickartz

Page 19: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Connection Setup

Host #0 Host #1

A

DS

B

DS

1

2

3

4

1

23

4

05/19/2014 | ACS Automation for Complex Power Systems | 11Simon Pickartz

Page 20: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Connection Setup

Host #0 Host #1

A

DS

B

DS

1

2

3

4

1

23

4

05/19/2014 | ACS Automation for Complex Power Systems | 11Simon Pickartz

Page 21: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Connection Setup

Host #0 Host #1

A

DS

B

DS

1

2

3

4

1

23

4

05/19/2014 | ACS Automation for Complex Power Systems | 11Simon Pickartz

Page 22: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

The ACS Cluster

Intel Xeon(E5-2650)

#0

Intel Xeon(E5-2650)

#1

Intel Xeon Phi#0

Intel Xeon Phi#1

Host #0

Dolphin IXH610 PCIe Fabric

Intel Xeon(E5-2650)

#0

Intel Xeon(E5-2650)

#1

Intel Xeon Phi#0

Intel Xeon Phi#1

Host #1

05/19/2014 | ACS Automation for Complex Power Systems | 12Simon Pickartz

Page 23: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Mapping SWIFT onto the Cluster

SCIF Fabrics

SISCI Fabric

05/19/2014 | ACS Automation for Complex Power Systems | 13Simon Pickartz

Page 24: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT Overhead – Latencies

Host-to-Host Host-to-Phi Phi-to-Phi0

5

10

15

20La

tenc

yin

µs(R

TT/2

)DeviceSWIFTMPI

05/19/2014 | ACS Automation for Complex Power Systems | 14Simon Pickartz

Page 25: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT Overhead – Throughput

64 512 4 k 32 k 256 k 2 M 16 M0

500

1,000

1,500

2,000

2,500

Size in Byte

Thro

ughp

utin

MiB

/s

Host-to-Host (RMA)

DeviceSWIFT

MPITPP

05/19/2014 | ACS Automation for Complex Power Systems | 15Simon Pickartz

Page 26: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

SWIFT Overhead – Throughput

64 512 4 k 32 k 256 k 2 M 16 M0

200

400

600

800

1,000

Size in Byte

Thro

ughp

utin

MiB

/s

Host-to-Phi (RMA)

DeviceSWIFT

MPITPP

05/19/2014 | ACS Automation for Complex Power Systems | 16Simon Pickartz

Page 27: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Multi-Hop PingPong – Latency

1-Hop 2-Hop 3-Hop0

5

10

15La

tenc

yin

µsRealPhi-HostHost-Host

05/19/2014 | ACS Automation for Complex Power Systems | 17Simon Pickartz

Page 28: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Multi-Hop PingPong – Throughput

Latencies accumulateAverage throughput equals that of the bottleneck link

A B C DtX tY tZ

05/19/2014 | ACS Automation for Complex Power Systems | 18Simon Pickartz

Page 29: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Multi-Hop PingPong – Throughput

64 512 4 k 32 k 256 k 2 M 16 M0

200

400

600

800

Size in Byte

Thro

ughp

utin

MiB

/s

1-Hop2-Hop

64 512 4 k 32 k 256 k 2 M 16 M0

200

400

600

800

Size in Byte

Thro

ughp

utin

MiB

/s

1-Hop2-Hop3-Hop

05/19/2014 | ACS Automation for Complex Power Systems | 19Simon Pickartz

Page 30: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Multi-Hop PingPong – Throughput

64 512 4 k 32 k 256 k 2 M 16 M0

200

400

600

800

Size in Byte

Thro

ughp

utin

MiB

/s

1-Hop2-Hop

64 512 4 k 32 k 256 k 2 M 16 M0

200

400

600

800

Size in Byte

Thro

ughp

utin

MiB

/s

1-Hop2-Hop3-Hop

05/19/2014 | ACS Automation for Complex Power Systems | 19Simon Pickartz

Page 31: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Multi-Hop PingPong – Throughput

Latencies accumulateAverage throughput equals that of the bottleneck link

A B C DtX tY tZ

tAB

tBA

tBC

tCB

tCD

tDC

→ Asymetric links→ Average throughput corresponds to the harmonic mean of the two

bottleneck links

05/19/2014 | ACS Automation for Complex Power Systems | 20Simon Pickartz

Page 32: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Multi-Hop PingPong – Throughput

Latencies accumulateAverage throughput equals that of the bottleneck link

A B C D

tX tY tZ

tAB

tBA

tBC

tCB

tCD

tDC

→ Asymetric links→ Average throughput corresponds to the harmonic mean of the two

bottleneck links

05/19/2014 | ACS Automation for Complex Power Systems | 20Simon Pickartz

Page 33: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Multi-Hop PingPong – Throughput

64 512 4 k 32 k 256 k 2 M 16 M0

100

200

300

400

500

Size in Byte

Thro

ughp

utin

MiB

/s

3-HopPhi-to-Host

05/19/2014 | ACS Automation for Complex Power Systems | 21Simon Pickartz

Page 34: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

What we have . . .

Support for heterogeneous network landscapesHigh portability

Device abstractionThree devices: SCIF, SISCI, and SHMEM

Supply of different programming models

Three-layered topologyAutomatic node ID assignment

Consideration of the hardware’s RDMA and RMA capabilities

One-sided communicationZero-copy forwarding

High performance

Good latency results (asynchronous signaling)Promising multi-hop throughput results

05/19/2014 | ACS Automation for Complex Power Systems | 22Simon Pickartz

Page 35: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

What we have . . .

Support for heterogeneous network landscapesHigh portability

Device abstractionThree devices: SCIF, SISCI, and SHMEM

Supply of different programming models

Three-layered topologyAutomatic node ID assignment

Consideration of the hardware’s RDMA and RMA capabilities

One-sided communicationZero-copy forwarding

High performance

Good latency results (asynchronous signaling)Promising multi-hop throughput results

05/19/2014 | ACS Automation for Complex Power Systems | 22Simon Pickartz

Page 36: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

What we have . . .

Support for heterogeneous network landscapesHigh portability

Device abstractionThree devices: SCIF, SISCI, and SHMEM

Supply of different programming modelsThree-layered topologyAutomatic node ID assignment

Consideration of the hardware’s RDMA and RMA capabilities

One-sided communicationZero-copy forwarding

High performance

Good latency results (asynchronous signaling)Promising multi-hop throughput results

05/19/2014 | ACS Automation for Complex Power Systems | 22Simon Pickartz

Page 37: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

What we have . . .

Support for heterogeneous network landscapesHigh portability

Device abstractionThree devices: SCIF, SISCI, and SHMEM

Supply of different programming modelsThree-layered topologyAutomatic node ID assignment

Consideration of the hardware’s RDMA and RMA capabilitiesOne-sided communicationZero-copy forwarding

High performance

Good latency results (asynchronous signaling)Promising multi-hop throughput results

05/19/2014 | ACS Automation for Complex Power Systems | 22Simon Pickartz

Page 38: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

What we have . . .

Support for heterogeneous network landscapesHigh portability

Device abstractionThree devices: SCIF, SISCI, and SHMEM

Supply of different programming modelsThree-layered topologyAutomatic node ID assignment

Consideration of the hardware’s RDMA and RMA capabilitiesOne-sided communicationZero-copy forwarding

High performanceGood latency results (asynchronous signaling)Promising multi-hop throughput results

05/19/2014 | ACS Automation for Complex Power Systems | 22Simon Pickartz

Page 39: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

What we need . . .

DMA over the whole platformImplementation of swift_put()/_get() (w. i. p.)

Multicast or PUB/SUB communication modeDynamic routing (e. g. Bellman-Ford)

05/19/2014 | ACS Automation for Complex Power Systems | 23Simon Pickartz

Page 40: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Thank you for your kind attention!

Simon Pickartz – [email protected]

Institute for Automation of Complex Power SystemsE.ON Energy Research Center, RWTH Aachen UniversityMathieustraße 1052074 Aachen

www.eonerc.rwth-aachen.de

Page 41: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

Asymmetric Channels

A BtAB

tBA

time to send a message of length L from A to B and back

T = LtAB

+ LtBA

resulting throughput

tres = LT2

= 2LL

tAB+ L

tBA

= 2 · tAB · tBAtAB + tBA

05/19/2014 | ACS Automation for Complex Power Systems | 25Simon Pickartz

Page 42: SWIFT - stefan's techblog · SWIFT – Requirements Support for heterogeneous network landscapes → A transparent solution is targeted High portability → Hardware abstraction Supply

RDMA Results

64 512 4 k 32 k 256 k 2 M 16 M0

500

1,000

1,500

Size in Byte

Thro

ughp

utin

MiB

/s

Phi-to-Phi

DeviceSWIFT (host-routed)

MPI (host-routed)

05/19/2014 | ACS Automation for Complex Power Systems | 26Simon Pickartz