Synthetic Traffic Generation - SLIP onlinesliponline.org/SLIP07/presentations/4C_Heirman.pdf ·...

Synthetic Traffic Generation: a Tool for Dynamic Interconnect Evaluation

W. Heirman, J. Dambre, J. Van Campenhout
ELIS Department, Ghent University, Belgium

Sponsored by IAP-V PHOTON & IAP-VI photonics@be, Belgian Science Policy Office

2

Outline

• Introduction
• Synthetic traffic generation
• Results
• Conclusions

3

Distributed shared-memory architecture

[Figure: nodes, each with a CPU, memory (MEM) and network interface (NetIF); labeled scales: supercomputer, server, on-chip]

Network is part of the memory hierarchy

• instruction: 0.5 ns
• cache: 5 ns
• DDR: 50 ns
• network: 500 ns

4

Interconnect requirements

Non-uniform network traffic in space and time

=> Reconfigurable network?

[Figure: network of nodes (CPU, MEM, NetIF) with plots of load versus time for Link #5, Link #9 and Link #13]

5

Reconfiguration implementation: base network + extra reconfigurable links

[Figure: network of nodes (CPU, MEM, NetIF); legend: Base network (fixed), Extra links (reconfigurable)]

[Figure labels: Processor nodes (CPU 1, CPU 2, ..., CPU n), Tunable lasers, Fiber links, Broadcast element, Photodetectors]

Other ‘dynamic networks’: e.g. per-link voltage scaling

6

Evaluate networks with synthetic traffic

• Mimics the behavior of real traffic
• But without the computational cost of modeling application, OS, CPUs, caches, …

[Figure: two simulator setups. In one, Application, OS, CPU and Caches produce the network traffic for the Reconfigurable network; in the other, a Synthetic traffic generator produces it. Simulation time ÷10]

7

We need better synthetic traffic

Reconfiguration exploits low-frequency dynamics in the network traffic

• Trace-driven simulation using static traffic patterns (uniform, hotspot, shuffle, …) won’t do!

• Full execution-driven simulation (traffic is driven by the application: FFT, weather forecast, database) is too slow!
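For reference, the static patterns named on this slide can be sketched as below. This is a minimal illustration, not the generators used in the study: the function names, the hotspot probability `p_hot`, and the reading of "shuffle" as the perfect shuffle (rotate the node address bits left by one) are assumptions. None of the three depends on time, which is why they cannot show the low-frequency dynamics that reconfiguration exploits.

```python
import random

def uniform_dest(src, n, rng):
    """Uniform: every node other than the source is equally likely."""
    dst = rng.randrange(n - 1)
    return dst if dst < src else dst + 1

def hotspot_dest(src, n, rng, hot=0, p_hot=0.2):
    """Hotspot: with probability p_hot send to one fixed node, else uniform.
    hot and p_hot are hypothetical parameters."""
    if src != hot and rng.random() < p_hot:
        return hot
    return uniform_dest(src, n, rng)

def shuffle_dest(src, n):
    """Perfect shuffle: rotate the address bits of src left by one.
    Assumes n is a power of two."""
    bits = n.bit_length() - 1
    return ((src << 1) | (src >> (bits - 1))) & (n - 1)
```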

8

Outline

• Introduction
• Synthetic traffic generation
• Results
• Conclusions

9

Realistic synthetic traffic generation

• One execution-driven simulation

• Resulting traffic profile re-used many times

[Figure: Application, OS, CPU and Caches produce network traffic once (×1); Parameter extraction turns it into a Statistical traffic profile; the Synthetic traffic generator then drives the Reconfigurable network many times (×n)]
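The ×1 / ×n split can be turned into a rough cost estimate. The sketch below is a back-of-the-envelope model, not the authors' accounting: it assumes one execution-driven profiling run costs 1 unit, each synthetic run costs 1/`synth_speedup` units, and the alternative is `n_runs` execution-driven runs. The function name and both parameters are hypothetical.

```python
def amortized_speedup(n_runs, synth_speedup):
    """Speedup of (1 profiling run + n_runs synthetic runs)
    over n_runs execution-driven runs, with the profiling run as cost unit."""
    cost_execution_driven = n_runs
    cost_synthetic = 1 + n_runs / synth_speedup
    return cost_execution_driven / cost_synthetic
```

Under these assumptions a single synthetic run never pays off (the speedup is below 1), and the speedup approaches `synth_speedup` as the profile is re-used more often.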

10

Preserve packet-interdependencies by using packet groups

Packets are processed/generated in groups, corresponding to one memory operation each

[Figure: three packet groups; numbers give the order of the packets]
• processor, home: REQ (1), REPLY (2)
• processor, home, owner: REQ (1), WBreq (2), WBreply (3), REPLY (4)
• processor, home, sharer, sharer ...: REQ (1), INVreq (2), INVreply (3), REPLY (4)
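The packet groups on this slide can be written down as a small data structure. The sketch below is illustrative only: the `Packet` class, the `packet_group` function and the packet directions (home forwards to owner or sharers, which answer to home) are assumptions based on the labels and ordering numbers in the figure, not a description of the simulator's code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    kind: str   # REQ, REPLY, WBreq, WBreply, INVreq or INVreply
    src: str
    dst: str
    step: int   # ordering number as shown in the figure

def packet_group(proc, home, owner=None, sharers=()):
    """Build the packets of one memory operation (one packet group)."""
    if owner is not None and sharers:
        raise ValueError("a group involves either an owner or sharers, not both")
    third_parties = [owner] if owner is not None else list(sharers)
    if owner is not None:
        fwd_kind, ack_kind = "WBreq", "WBreply"
    else:
        fwd_kind, ack_kind = "INVreq", "INVreply"
    packets = [Packet("REQ", proc, home, 1)]
    for node in third_parties:          # home contacts owner or sharers
        packets.append(Packet(fwd_kind, home, node, 2))
    for node in third_parties:          # they answer to home
        packets.append(Packet(ack_kind, node, home, 3))
    # The reply to the processor comes last: step 2 or step 4.
    packets.append(Packet("REPLY", home, proc, 4 if third_parties else 2))
    return packets
```

Generating whole groups rather than independent packets keeps the dependency "REPLY only after all earlier packets of the group" intact in the synthetic traffic.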

11

Distribution of # involved nodes

[Figure: distribution of the number of nodes involved in a packet group; axis values 2, 3, ... n]
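One way to use such a distribution in a generator is to record it as an empirical histogram and sample from it. The sketch below assumes exactly that; `fit_distribution` and `sample` are hypothetical names, and the slides do not say how the distribution is stored.

```python
import random
from collections import Counter

def fit_distribution(samples):
    """Turn observed values (e.g. nodes involved per packet group)
    into a mapping value -> relative frequency."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {value: count / total for value, count in sorted(counts.items())}

def sample(dist, rng):
    """Draw one value according to the recorded frequencies."""
    values = list(dist)
    weights = list(dist.values())
    return rng.choices(values, weights=weights, k=1)[0]
```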

12

Reuse distance of home nodes: introduce locality

[Figure: previous destinations: node C, node B, node A, node D, …; next destination: node A; marked value: 2]
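A minimal sketch of the reuse-distance idea, assuming the usual definition: the number of distinct other nodes addressed since the last packet to the same node, tracked with a most-recently-used stack. Reading the "2" in the figure as the reuse distance of node A is an assumption, and the function names are hypothetical.

```python
def reuse_distances(destinations):
    """Return the reuse distance of each destination in a trace
    (None the first time a node is addressed)."""
    stack = []        # distinct nodes, most recently used first
    distances = []
    for node in destinations:
        if node in stack:
            d = stack.index(node)
            stack.pop(d)
        else:
            d = None
        distances.append(d)
        stack.insert(0, node)
    return distances

def next_destination(stack, distance):
    """Generator side: pick the node at the given reuse distance
    and move it to the front of the stack."""
    node = stack.pop(distance)
    stack.insert(0, node)
    return node
```

With previous destinations C, B, A, D (most recent first), a sampled distance of 2 selects node A again, which is how small distances introduce locality.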

13

Computation or ‘think’ time

Models time delay between subsequent requests
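A small sketch of how think time could drive request issue times. It assumes the think time starts when the previous request completes; the slide does not state whether the delay is measured from issue or from completion, and `issue_times` is a hypothetical helper.

```python
def issue_times(think_times, latencies, start=0.0):
    """Issue time of each request when a node waits think_times[i]
    before request i and is blocked for latencies[i] while it is served."""
    t = start
    times = []
    for think, latency in zip(think_times, latencies):
        t += think          # computation between requests
        times.append(t)     # request is issued
        t += latency        # node waits for the memory operation
    return times
```

Under this assumption a faster network shortens the gaps between requests, so the generated traffic reacts to the network being evaluated.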

14

Outline

• Introduction
• Synthetic traffic generation
• Results
• Conclusions

15

Simulations

• Simulation platform: Simics, providing functional multiprocessor simulation
– 16 UltraSPARC III processors
– SPLASH-2 parallel benchmarks

• Timing model:
– Computes the latency for each memory access
– Models caches, interconnection network
– Base network: 4x4 torus
– Extra links: configurable number, fan-out, reconfiguration interval
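To make the base network concrete, the hop count between two nodes of a 4x4 torus can be computed as below. The row-major node numbering and the function name are assumptions; the slides do not give the node layout or the routing used in the timing model.

```python
def torus_hops(a, b, width=4, height=4):
    """Minimal hop count between nodes a and b on a width x height torus,
    with nodes numbered row by row starting at 0."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    dx = abs(ax - bx)
    dy = abs(ay - by)
    # Wrap-around links make the shorter way around each ring available.
    return min(dx, width - dx) + min(dy, height - dy)
```

On this base network no pair of nodes is more than 4 hops apart; an extra link between two nodes would act as a 1-hop shortcut for that pair.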

16

Simulations

Once per benchmark: simulate execution of the benchmark, base network only, measuring traffic profile (1)

For each set of extra link parameters:
• Execution-driven simulation with reconfigurable network (2) => “correct” result
• Trace-driven simulation using (simplified) traffic from (2) => tracing error
• Trace-driven simulation using (simplified) traffic from (1) => traffic-dependence on network
• Trace-driven simulation using synthetic traffic => total error

17

Several parameters can be measured

18

Detailed view of “average packet latency”

19

Variability for shorter traces

[Figure legend: trace-driven + profiling*, exec-driven; synthetic traffic, execution-driven]

* assuming traffic profile is re-used 100 times

20

Outline

• Introduction
• Synthetic traffic generation
• Results
• Conclusions

21

Conclusions

• Synthetic traffic generation was extended to
– shared-memory cache-coherence protocols,
– reconfigurable networks

• Good relative accuracy for different network topologies

• Much less computationally expensive (x10), even more so for shorter traces (x100)

• Reproducibility equal to or better than execution-driven simulations