CoNA: Dynamic Application Mapping for Congestion Reduction in Many-Core Systems. 2012 IEEE 30th International Conference on Computer Design (ICCD). M. Fattah, M. Ramirez, M. Daneshtalab, P. Liljeberg, J. Plosila

CoNA: Dynamic Application Mapping for Congestion Reduction in Many-Core Systems

2012 IEEE 30th International Conference on Computer Design (ICCD)

M. Fattah, M. Ramirez, M. Daneshtalab, P. Liljeberg, J. Plosila

Outline

Introduction

Mapping Problem and Evaluation Metrics

Contiguous Neighborhood Allocation Mapping

Experimental Setup

Results and Analysis

Conclusion

Introduction

An efficient algorithm for the run-time application mapping problem

Three novel contributions

First node selection

First task selection

Mapping the rest of the tasks onto the nearest neighborhood

Mapping Problem and Evaluation Metrics

Applications

$A_p = TG(T, E)$, with tasks $t_i \in T$ and edges $e_{i,j} \in E$

Communication platform

$AG(\tilde{N}, L)$

$\tilde{n}_{i,j} = \{(r_{i,j}, pe_{i,j}) \mid \tilde{n}_{i,j} \in \tilde{N},\ 0 \le i < M,\ 0 \le j < N\}$

Manhattan Distance: $MD(\tilde{n}_{i,j}, \tilde{n}_{m,n}) = |i - m| + |j - n|$

Mapping function

$map: T \to \tilde{N}$, s.t. $map(t_i) = \tilde{n}_{m,n}$; $\forall t_i \in T$, $\exists \tilde{n}_{m,n} \in \tilde{N}$
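As a minimal sketch of the definitions on this slide, the Manhattan distance and the mapping function can be written in Python as follows. The names `manhattan_distance` and `is_valid_mapping`, and the choice to represent a node by its `(i, j)` index, are assumptions made for illustration and do not come from the slides.

```python
from typing import Dict, Iterable, Tuple

# A mesh node is identified by its (i, j) index, 0 <= i < M, 0 <= j < N.
Node = Tuple[int, int]


def manhattan_distance(a: Node, b: Node) -> int:
    """MD between two mesh nodes: |i - m| + |j - n|."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def is_valid_mapping(mapping: Dict[str, Node], tasks: Iterable[str],
                     rows: int, cols: int) -> bool:
    """Check that every task is mapped to some node inside the rows x cols mesh."""
    return all(
        t in mapping
        and 0 <= mapping[t][0] < rows
        and 0 <= mapping[t][1] < cols
        for t in tasks
    )


# Hypothetical three-task application placed on a 3 x 3 mesh.
example_mapping = {"t0": (0, 0), "t1": (0, 1), "t2": (2, 1)}
print(manhattan_distance(example_mapping["t0"], example_mapping["t2"]))
print(is_valid_mapping(example_mapping, ["t0", "t1", "t2"], rows=3, cols=3))
```

The check only covers what the slide states (every task has a node); whether two tasks may share a node is not stated here, so the sketch does not enforce it.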

Evaluation Metrics

Packet latency

Average Manhattan Distance

Average Weighted Manhattan Distance
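The slide names these metrics without giving their formulas. A sketch under assumed definitions: AMD is taken as the mean Manhattan distance over all edges of a mapped application, and AWMD as the same mean with each edge weighted by its communication volume. Both definitions, the function names, and the `(source, destination, weight)` edge format are assumptions.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]
Edge = Tuple[str, str, int]  # (source task, destination task, weight in flits)


def md(a: Node, b: Node) -> int:
    """Manhattan distance between two mesh nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def amd(edges: List[Edge], mapping: Dict[str, Node]) -> float:
    """Assumed AMD: mean distance between the nodes of communicating tasks."""
    return sum(md(mapping[s], mapping[d]) for s, d, _ in edges) / len(edges)


def awmd(edges: List[Edge], mapping: Dict[str, Node]) -> float:
    """Assumed AWMD: distances weighted by edge volume, divided by total volume."""
    total_weight = sum(w for _, _, w in edges)
    weighted = sum(w * md(mapping[s], mapping[d]) for s, d, w in edges)
    return weighted / total_weight


# Hypothetical application: the heavy edge is also the long one.
edges = [("t0", "t1", 4), ("t1", "t2", 12)]
mapping = {"t0": (0, 0), "t1": (0, 1), "t2": (2, 1)}
print(amd(edges, mapping))   # 1.5
print(awmd(edges, mapping))  # 1.75
```

In this example AWMD exceeds AMD because the heavier edge spans the longer distance, which is the situation a weighted metric is meant to expose.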

Evaluation Metrics (cont.)

Mapped Region Dispersion

Internal Congestion Ratio (ICR)

The number of an application's edges that use the same channel, relative to the application's total number of edges

Contiguous Neighborhood Allocation Mapping (CoNA)

Three steps

First node selection

Choosing the first task of the application

Contiguous neighborhood allocation

CoNA (cont.)

First node selection

The nearest node to the central manager among the nodes with the largest number of available neighbors
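The selection rule above can be sketched as follows. The sketch assumes that "available neighbors" means the free nodes among the four directly adjacent mesh nodes and that remaining ties are broken by coordinate order; neither detail is stated on the slide, and `select_first_node` is a hypothetical name.

```python
from typing import Optional, Set, Tuple

Node = Tuple[int, int]


def md(a: Node, b: Node) -> int:
    """Manhattan distance between two mesh nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def free_neighbors(node: Node, free: Set[Node]) -> int:
    """Count free nodes among the four direct mesh neighbors (assumption)."""
    i, j = node
    return sum(
        1 for n in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)) if n in free
    )


def select_first_node(free: Set[Node], manager: Node) -> Optional[Node]:
    """Among free nodes with the most free neighbors, pick the one nearest the manager."""
    if not free:
        return None
    best = max(free_neighbors(n, free) for n in free)
    candidates = [n for n in free if free_neighbors(n, free) == best]
    # Ties on distance are broken by coordinate order (assumption).
    return min(candidates, key=lambda n: (md(n, manager), n))


# Hypothetical 4 x 4 mesh, all nodes free, central manager at (0, 0).
mesh = {(i, j) for i in range(4) for j in range(4)}
print(select_first_node(mesh, manager=(0, 0)))  # (1, 1)
```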

CoNA (cont.)

Choosing the first task of the application

Select the task with the largest number of edges

This is the task with the most intensive communication
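A minimal sketch of this step: count the edges touching each task and take the maximum. The edge list below is hypothetical, and breaking ties by first occurrence is an assumption, since the slide does not say how ties are resolved.

```python
from collections import Counter
from typing import List, Tuple


def select_first_task(edges: List[Tuple[str, str]]) -> str:
    """Return the task with the largest number of incident edges."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # max() keeps the first task seen among equals (tie-break is an assumption).
    return max(degree, key=degree.get)


# Hypothetical six-task application in which t4 has three edges.
edges = [("t4", "t1"), ("t4", "t2"), ("t4", "t5"), ("t1", "t0"), ("t2", "t3")]
print(select_first_task(edges))  # t4
```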

CoNA (cont.)

Contiguous neighborhood allocation

The task graph is traversed in breadth-first order; the resulting list of tasks paired with their predecessors is: {(t1, t4), (t2, t4), (t5, t4), (t0, t1), (t3, t2)}

For each task, select the node that fits in the smallest square together with the first node
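The traversal and placement can be sketched as below. This is one possible reading, not the authors' implementation: the adjacency list is inferred from the pairs shown on the slide, each task is placed on the free node closest to its predecessor's node, and "smallest square" is interpreted as the smallest square centered on the first node (the Chebyshev distance). The names `bfs_pairs` and `place_tasks` are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Node = Tuple[int, int]


def bfs_pairs(adjacency: Dict[str, List[str]], first_task: str) -> List[Tuple[str, str]]:
    """Breadth-first traversal returning (task, predecessor) pairs in visit order."""
    visited = {first_task}
    queue = deque([first_task])
    pairs = []
    while queue:
        parent = queue.popleft()
        for child in adjacency[parent]:
            if child not in visited:
                visited.add(child)
                pairs.append((child, parent))
                queue.append(child)
    return pairs


def place_tasks(pairs: List[Tuple[str, str]], first_task: str,
                first_node: Node, free_nodes: Iterable[Node]) -> Dict[str, Node]:
    """Place each task near its predecessor, preferring small squares around first_node."""
    mapping = {first_task: first_node}
    free = set(free_nodes) - {first_node}
    fx, fy = first_node
    for task, parent in pairs:
        if not free:
            raise ValueError("not enough free nodes for the application")
        px, py = mapping[parent]

        def cost(node: Node):
            to_parent = abs(node[0] - px) + abs(node[1] - py)   # Manhattan distance
            square = max(abs(node[0] - fx), abs(node[1] - fy))  # half-side of the square
            return (to_parent, square, node)  # coordinate tie-break is an assumption

        chosen = min(free, key=cost)
        mapping[task] = chosen
        free.remove(chosen)
    return mapping


# Adjacency inferred from the pairs on the slide (the real graph may have more edges).
adjacency = {
    "t4": ["t1", "t2", "t5"], "t1": ["t4", "t0"], "t2": ["t4", "t3"],
    "t5": ["t4"], "t0": ["t1"], "t3": ["t2"],
}
pairs = bfs_pairs(adjacency, "t4")
print(pairs)
mesh = {(i, j) for i in range(3) for j in range(3)}
print(place_tasks(pairs, "t4", (1, 1), mesh))
```

Starting from t4, the traversal reproduces the pair list shown on the slide, and on an empty 3 x 3 mesh every task ends up adjacent to its predecessor.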

Experimental Setup

NoC platform

Plasma processor

Local memory

DMA controller

Tra-NI interface

Central manager (CM)

The maximum number of applications that can be injected into the system per second is denoted as $\lambda_{full}$

Experimental Setup (cont.)

Simulation

To extract packet latency

FPGA

To investigate CoNA time complexity

Xilinx ML605

Experimental Setup (cont.)

Application set

Task graphs (set1) are randomly generated using the Task graph generator

Number of nodes: 4–11

Weight of edges: 4–16 flits

For a second set (set16), the edge weights of all applications are multiplied by 16
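To make the two application sets concrete, here is a hedged sketch that produces random task graphs in the stated ranges and derives a scaled copy. It does not reproduce the generator tool used in the experiments: building each graph as a random tree is an assumption, and `random_task_graph` and `scale_weights` are hypothetical names.

```python
import random
from typing import List, Tuple

Edge = Tuple[str, str, int]  # (source task, destination task, weight in flits)


def random_task_graph(rng: random.Random, min_tasks: int = 4, max_tasks: int = 11,
                      min_weight: int = 4, max_weight: int = 16) -> List[Edge]:
    """Random connected task graph built as a tree (structure is an assumption)."""
    n = rng.randint(min_tasks, max_tasks)
    edges = []
    for child in range(1, n):
        parent = rng.randrange(child)  # attach to an earlier task, keeping it connected
        edges.append((f"t{parent}", f"t{child}", rng.randint(min_weight, max_weight)))
    return edges


def scale_weights(edges: List[Edge], factor: int = 16) -> List[Edge]:
    """Multiply every edge weight by the same factor, as described for set16."""
    return [(s, d, w * factor) for s, d, w in edges]


rng = random.Random(0)  # fixed seed so the sets are reproducible
set1 = [random_task_graph(rng) for _ in range(5)]
set16 = [scale_weights(app) for app in set1]
print(len(set1[0]) + 1, "tasks in the first application")
```

Scaling the weights leaves the graph structure untouched, so the two sets differ only in communication volume.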

Results and Analysis

Packet latency evaluation

Time complexity evaluation

Conclusion

An efficient run-time task allocation algorithm is proposed

It reduces both internal and external congestion

Three novel contributions

Thank you!