Efficient Link Capacity and QoS Design for Wormhole Network-on-Chip

Post on 25-Feb-2016


Efficient Link Capacity and QoS Design for Wormhole Network-on-Chip. Zvika Guz, Isask’har Walter, Evgeny Bolotin, Israel Cidon, Ran Ginosar and Avinoam Kolodny. Technion, Israel Institute of Technology. Problem essence: how much capacity [bits/sec] should be assigned to each link?

Transcript of Efficient Link Capacity and QoS Design for Wormhole Network-on-Chip


Efficient Link Capacity and QoS Design for Wormhole Network-on-Chip

Zvika Guz, Isask’har Walter, Evgeny Bolotin, Israel Cidon, Ran Ginosar and Avinoam Kolodny

Technion, Israel Institute of Technology

DATE’06 NoC Capacity Allocation 2

Problem Essence

How much capacity [bits/sec] should be assigned to each link?
- All flows must meet delay requirements
- Minimize total resources


Outline

Wormhole-based NoC
The problem of link capacity allocation
Solution:
- Wormhole delay model
- Capacity allocation algorithm
Design examples
Summary


Wormhole Switching

Suits on-chip interconnect
Small number of buffers
Low latency
Virtual channels
- Interleaving packets on the same link

[Figure: IP1 and IP2 attached to the network through interfaces]


NoC Design Flow

Define inter-module traffic
Place modules
Allocate link capacities
Verify QoS and cost

Too little capacity results in poor QoS
Too much capacity wastes power and area


Capacity Allocation Problem

Simulation takes too long
- A simulation-based solution is not scalable
If no simulations are used:
- How to extract the flows’ delays?
- How to reassign capacity?
Our solution:
- Analytical model to forecast QoS
- Capacity allocation algorithm that exploits the model


Delay Analysis

Approximate per-flow latencies, given:
- Network topology
- Link capacities
- Communication demands

[Figure: router mesh with two example flows, s1 to d1 and s2 to d2]


Why Previous Models Do Not Apply?

Because they assume:
- Symmetrical communication demands
- No virtual channels
- Identical link capacity!
Generally, they calculate the delay of an “average flow”
- A per-flow analysis is needed

Wormhole Delay Analysis

The delivery resembles a pipeline pass
Packet transmission can be divided into two separate phases:
- Path acquisition
- Packet delivery
We focus on the packet delivery phase

Packet Delivery Time

Packet delivery time is dominated by the slowest link
- Transmission rate
- Link sharing

[Figure: path from IP1 to IP2 containing one low-capacity link]


Analysis Basics

Determine the flow’s effective bandwidth
Per link
Account for interleaving

Single Hop Flow, no Sharing

$$t_j^i = \frac{1}{\frac{1}{l} C_j}$$

- $t_j^i$: mean time to deliver a flit of flow i over link j [sec]
- $C_j$: capacity of link j [bits per sec]
- $l$: flit length [bits/flit]
- $\Lambda_j^i$: total flit injection rate of all flows sharing link j except for flow i [flits/sec]

Single Hop Flow, with Sharing

$$t_j^i = \frac{1}{\frac{1}{l} C_j - \Lambda_j^i}$$

- $t_j^i$: mean time to deliver a flit of flow i over link j [sec]
- $C_j$: capacity of link j [bits per sec]
- $l$: flit length [bits/flit]
- $\Lambda_j^i$: total flit injection rate of all flows sharing link j except for flow i [flits/sec]; this is the bandwidth used by other flows on link j
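
The single-hop expressions can be sketched in a few lines of Python. This is only an illustration, assuming the slide's formula reads as one over the link's flit rate minus the flit rate of the other flows; the function and parameter names are hypothetical and not taken from the presentation.

```python
def flit_delay(capacity_bps: float, flit_bits: float,
               other_flit_rate: float = 0.0) -> float:
    """Mean time [sec] to deliver one flit of a flow over one link.

    capacity_bps    -- link capacity C_j [bits/sec]
    flit_bits       -- flit length l [bits/flit]
    other_flit_rate -- flit rate of the other flows on the link [flits/sec]
                       (0.0 reproduces the no-sharing case)
    """
    link_flit_rate = capacity_bps / flit_bits     # flits/sec the link can carry
    residual = link_flit_rate - other_flit_rate   # flits/sec left for this flow
    if residual <= 0:
        raise ValueError("link is saturated by the other flows")
    return 1.0 / residual


if __name__ == "__main__":
    # Made-up numbers: a 1.6 Gbit/sec link with 16-bit flits.
    print(flit_delay(1.6e9, 16))         # no sharing
    print(flit_delay(1.6e9, 16, 5e7))    # half of the link used by other flows
```

With these made-up numbers, sharing half of the link doubles the per-flit delay, which is the interleaving effect the slide describes.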

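The Convoy Effect

Consider inter-link dependencies
- Wormhole backpressure
- Traffic jams down the road

$$\hat{t}_j^i = \frac{1}{\frac{1}{l} C_j - \Lambda_j^i}$$

$$t_j^i = \hat{t}_j^i + \sum_{k \mid k \in \pi_j^i} \frac{l \, \Lambda_k^i}{C_k} \cdot \frac{\hat{t}_k^i}{dist^i(j,k)}$$

- $\hat{t}_j^i$: single-hop delay of flow i over link j, with sharing
- $l \, \Lambda_k^i / C_k$: link load
- Sum over $k \in \pi_j^i$: accounts for all subsequent hops of flow i after link j
- $\hat{t}_k^i / dist^i(j,k)$: basic delay weighted by distance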
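
A minimal sketch of how the convoy correction could be evaluated, under stated assumptions: the basic delay of each hop is the single-hop delay with sharing, the link load of hop k is (flit length x other flows' flit rate) / capacity, and the distance between two hops is the difference of their positions on the path. The data layout and names are hypothetical.

```python
def basic_delay(capacity_bps, flit_bits, other_flit_rate):
    """Single-hop flit delay with sharing [sec]."""
    return 1.0 / (capacity_bps / flit_bits - other_flit_rate)


def convoy_delay(path, j, flit_bits):
    """Flit delay on hop j of a flow, including the effect of later hops.

    path -- hops of the flow in order; each hop is a dict with
            'capacity' [bits/sec] and 'other_rate' [flits/sec]
    j    -- index of the hop of interest
    """
    t_hat = [basic_delay(h["capacity"], flit_bits, h["other_rate"]) for h in path]
    total = t_hat[j]
    for k in range(j + 1, len(path)):            # all subsequent hops
        load = flit_bits * path[k]["other_rate"] / path[k]["capacity"]
        distance = k - j                         # assumed: hop-count distance
        total += load * t_hat[k] / distance
    return total


if __name__ == "__main__":
    # Made-up two-hop path: the second hop is half loaded by other flows.
    path = [{"capacity": 1.6e9, "other_rate": 0.0},
            {"capacity": 1.6e9, "other_rate": 5e7}]
    print(convoy_delay(path, 0, 16))   # first hop slowed by the loaded second hop
    print(convoy_delay(path, 1, 16))   # last hop: basic delay only
```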

Total Packet Transmission Time

Weakest link dominates packet delivery time

$$T^i = m^i \cdot \max\left( t_j^i \mid j \in \pi^i \right)$$

- $T^i$: mean packet latency for flow i [sec]
- $m^i$: packet size [flits/packet]
- $\max(\cdot)$ over the links $\pi^i$ on the path of flow i: accounts for the weakest link
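
The packet latency rule is short enough to state directly in Python. This is a sketch under the assumption that the per-link flit delays of the flow have already been computed; the names are hypothetical.

```python
def packet_latency(packet_flits, flit_delays):
    """Mean packet latency [sec]: packet size times the slowest per-flit delay.

    packet_flits -- packet size [flits/packet]
    flit_delays  -- per-link flit delays along the flow's path [sec]
    """
    if not flit_delays:
        raise ValueError("the flow must traverse at least one link")
    return packet_flits * max(flit_delays)   # the weakest link dominates


if __name__ == "__main__":
    # Made-up delays for a three-hop path and a 10-flit packet.
    print(packet_latency(10, [1e-8, 2e-8, 1.5e-8]))
```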


Capacity Allocation Algorithm

Greedy, iterative algorithm
For each src-dst pair:
- Use delay model to identify most sensitive link
- Increase its capacity
Repeat until delay requirements are met
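
The greedy loop can be sketched as follows. This is not the authors' implementation: it uses a simplified delay model (single-hop delay with sharing, packet latency set by the slowest link), starts every link just above its offered load, and grows capacity in fixed steps. The starting point, the step size, and all names are assumptions made for illustration.

```python
def link_delay(capacity, flit_bits, other_rate):
    """Single-hop flit delay with sharing [sec]; infinite if saturated."""
    residual = capacity / flit_bits - other_rate
    return float("inf") if residual <= 0 else 1.0 / residual


def latency(flow, flows, caps, flit_bits):
    """Packet latency of one flow: packet size times slowest per-flit delay."""
    worst = 0.0
    for link in flow["path"]:
        other = sum(f["rate"] for f in flows
                    if f is not flow and link in f["path"])
        worst = max(worst, link_delay(caps[link], flit_bits, other))
    return flow["packet_flits"] * worst


def allocate(flows, flit_bits, step, max_passes=100_000):
    """Greedy capacity allocation; returns {link: capacity in bits/sec}."""
    links = {link for f in flows for link in f["path"]}
    # Assumed starting point: offered load on each link plus one step.
    caps = {link: flit_bits * sum(f["rate"] for f in flows if link in f["path"]) + step
            for link in links}
    for _ in range(max_passes):
        changed = False
        for flow in flows:
            if latency(flow, flows, caps, flit_bits) <= flow["max_latency"]:
                continue
            # Most sensitive link: the one whose increase helps this flow most.
            best, best_latency = None, float("inf")
            for link in flow["path"]:
                trial = dict(caps)
                trial[link] += step
                t = latency(flow, flows, trial, flit_bits)
                if t < best_latency:
                    best, best_latency = link, t
            caps[best] += step
            changed = True
        if not changed:
            return caps
    raise RuntimeError("delay requirements not met within max_passes")


if __name__ == "__main__":
    # Made-up example: two flows sharing link "b".
    flows = [
        {"path": ["a", "b"], "rate": 1e6, "packet_flits": 10, "max_latency": 1e-6},
        {"path": ["b"], "rate": 1e6, "packet_flits": 10, "max_latency": 1e-6},
    ]
    print(allocate(flows, flit_bits=16, step=1.6e6))
```

In this made-up example the shared link ends up with more capacity than the unshared one, which is the non-uniform outcome the algorithm is meant to produce.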


Capacity Allocation – Example #1

Uniform traffic with identical requirements
Uniform allocation: 74.4 Gbit/sec
Capacity allocation algorithm: 69 Gbit/sec
Total capacity reduced by 7%

[Figure: 4x4 mesh (nodes 00 to 33), link capacities before and after optimization]

Capacity Allocation – Example #2

A SoC-like system
- Heterogeneous traffic demands and delay requirements
Uniform allocation: 41.8 Gbit/sec
Capacity allocation algorithm: 28.7 Gbit/sec
Total capacity reduced by 30%

[Figure: 3x4 mesh (nodes 00 to 23), link capacities before and after optimization]
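
As a quick arithmetic check, the reductions can be recomputed from the total capacities quoted on the two example slides. The helper name is hypothetical; the inputs are the slide figures.

```python
def reduction_percent(uniform_total, optimized_total):
    """Relative saving [%] of the optimized allocation over the uniform one."""
    return 100.0 * (uniform_total - optimized_total) / uniform_total


if __name__ == "__main__":
    print(round(reduction_percent(74.4, 69.0), 1))   # Example #1
    print(round(reduction_percent(41.8, 28.7), 1))   # Example #2
```

The results, about 7.3% and 31.3%, are in line with the rounded 7% and 30% quoted on the slides.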


Summary

SoCs need non-uniform link capacities
- Capacity allocation
Wormhole delay analysis
- Heterogeneous link capacities
- Heterogeneous communication demands
- Multiple VCs
Greedy allocation algorithm
Design examples
- NoC cost considerably reduced

Questions?

QNoC Research Group

Backup

QNoC Architecture

Grid topology
Packet-switched
Wormhole switching
Fixed path XY routing
Heterogeneous link capacities
Quality-of-Service

[Figure: grid of modules, each attached to a router; routers connected by links]

E. Bolotin, I. Cidon, R. Ginosar, A. Kolodny, “QoS Architecture and Design Process for Cost-Effective Network on Chip”, Journal of Systems Architecture, 2004
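
Fixed-path XY routing on a grid can be illustrated with a short sketch: a packet first travels along the X dimension until it reaches the destination column, then along Y. The coordinate convention and the function name are assumptions for illustration, not details taken from the QNoC papers.

```python
def xy_route(src, dst):
    """Routers visited from src to dst, as (x, y) pairs, X first and then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                 # move along X until the column matches
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                 # then move along Y
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Because the path depends only on source and destination, the set of flows crossing each link is known in advance, which is what makes a per-link analytical delay model applicable.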

Analysis Validation

Analytical model was validated using simulations
- Different link capacities
- Different communication demands

[Figure: Analysis and Simulation vs. Load; normalized delay versus utilization]

Slack Elimination

[Figure: Packet Delay Slack; slack [%] per flow]