MultiNoC


Page 1: MultiNoC

MultiNoC

What is it?
  a programmable on-chip multiprocessing platform using a network-on-chip (NoC) as the communication medium

To whom is it addressed?
  industrial: applications requiring high bandwidth and/or parallel processing
  academic: advanced undergraduate/graduate courses

Why use it?
  a multiprocessor system in a small FPGA
  demonstrates that parallel processing can be easily prototyped in low-cost FPGAs
  NoCs are more scalable and offer higher performance than buses

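MultiNoC System

[Block diagram: a host computer connects over an RS-232 serial link (rx/tx) to the MultiNoC system on a Spartan-IIe FPGA. The HERMES NoC has four routers (00, 01, 10, 11), each attached to one IP:
  Router 00: Serial IP (RS-232 protocol)
  Router 01: Processor 1 IP (R8 processor, 1K-word memory)
  Router 10: Processor 2 IP (R8 processor, 1K-word memory)
  Router 11: Memory IP (1K words)]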

MultiNoC Architecture

NUMA architecture
  each processor has its own local memory and may access the (remote) memories of other processors
  wait/notify instructions synchronize the IPs
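The wait/notify synchronization described above can be modeled on a host machine. The following is a minimal sketch, not the R8 instruction semantics: the `Mailbox` class, the `run_demo` function, and the use of Python threads as stand-ins for the two processors are all assumptions made for illustration.

```python
import threading


class Mailbox:
    """Hypothetical stand-in for the wait/notify state of one IP."""

    def __init__(self):
        self._event = threading.Event()

    def wait(self):
        # Block until another IP notifies, then re-arm for the next wait.
        self._event.wait()
        self._event.clear()

    def notify(self):
        # Wake the waiting IP (or let its next wait return immediately).
        self._event.set()


def run_demo():
    remote_memory = [0] * 4      # memory local to "processor 2"
    mailbox_p1 = Mailbox()       # processor 1 waits on this
    result = []

    def processor_2():
        remote_memory[0] = 42    # produce a value in its local memory
        mailbox_p1.notify()      # signal that the data is ready

    def processor_1():
        mailbox_p1.wait()                  # synchronize first
        result.append(remote_memory[0])    # then do the remote read

    threads = [threading.Thread(target=processor_1),
               threading.Thread(target=processor_2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

Calling `run_demo()` returns the value written by the second thread, regardless of which thread starts first, because the notification is remembered until the waiter consumes it.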

MultiNoC Synthesis

Resources reported: occupied slices, 4-input LUTs, Block RAMs.

Chip Constrained Floorplan

[Flow diagram. Boxes: Start; Write the Assembly Code; Simulate the Assembly Code; Simulation Correct? (Yes/No); Configure FPGA; Reset FPGA; Start serial application; Synchronize SW/HW; Send Generated Object Code; Fill Memory Contents; Activate Processors; Debug; I/O Operations; Correct Hardware Execution? (Yes/No); End. Other labels: "optional".]

Application 1: Parallel edge detection
  The host computer sends an image line.
  Each embedded processor computes one gradient (one computes gx, the other gy).
  One embedded processor adds gx and gy and notifies the host.
  The host receives the processed line and sends a new line.
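The division of work in Application 1 can be sketched in software. This is an illustration under stated assumptions, not the R8 assembly used on the board: the poster does not name the gradient operator, so simple absolute differences are assumed here, and the function names, the three-row window, the clamp to 255, and the use of two Python worker threads in place of the two processors are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def gradient_x(row):
    """Horizontal gradient: |right - left| per pixel, 0 at the borders."""
    n = len(row)
    return [0 if i in (0, n - 1) else abs(row[i + 1] - row[i - 1])
            for i in range(n)]


def gradient_y(above, below):
    """Vertical gradient: |below - above| per pixel (assumed operator)."""
    return [abs(b - a) for a, b in zip(above, below)]


def edge_line(above, row, below):
    """Compute gx and gy on two workers, then add them (clamped to 255)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        gx = pool.submit(gradient_x, row)            # "processor 1"
        gy = pool.submit(gradient_y, above, below)   # "processor 2"
        return [min(255, x + y) for x, y in zip(gx.result(), gy.result())]
```

A host-side loop would call `edge_line` once per image line, mirroring the "send a line, receive the processed line" exchange described above.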

Application 2: Parallel Sort
  The bubble sort algorithm is executed in parallel by both processors, one on each half of the vector.
  The merge sort algorithm is executed by only one processor, generating the final result from the two sorted halves.
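The two phases of Application 2 can be sketched as follows. This is a host-side illustration only: the function names are hypothetical, and two Python worker threads stand in for the two R8 processors.

```python
from concurrent.futures import ThreadPoolExecutor


def bubble_sort(values):
    """Return a sorted copy of values using bubble sort."""
    data = list(values)
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:      # already sorted, stop early
            break
    return data


def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def parallel_sort(vector):
    """Sort each half on its own worker, then merge on a single one."""
    mid = len(vector) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        low = pool.submit(bubble_sort, vector[:mid])    # "processor 1"
        high = pool.submit(bubble_sort, vector[mid:])   # "processor 2"
        return merge(low.result(), high.result())
```

Splitting the vector halves the quadratic bubble-sort work per processor, while the final merge is linear and runs on one processor only.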

MultiNoC Demo Application

MultiNoC System Flow Diagram

Synthesis results:
  Occupied slices:         2,325 out of 2,352 (98%)
  Number of 4-input LUTs:  3,699 out of 4,704 (78%)
  Number of Block RAMs:    12 out of 14 (85%)
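The Block RAM count can be cross-checked against the memory sizes given for the IPs. The sketch below assumes a 4,096-bit Block RAM (the size commonly documented for Spartan-IIE devices, not stated on the poster) and assumes that the two processor local memories are built from Block RAMs in the same way as the Memory IP. All names are hypothetical.

```python
BLOCK_RAM_BITS = 4096   # assumption: bits per Block RAM on the target FPGA


def block_rams_per_memory(words=1024, word_bits=16, bram_bits=BLOCK_RAM_BITS):
    """Block RAMs needed for one memory, rounded up."""
    total_bits = words * word_bits
    return -(-total_bits // bram_bits)   # ceiling division


def utilization_percent(used, available):
    """Utilization as a whole percentage, truncated (not rounded)."""
    return 100 * used // available


per_memory = block_rams_per_memory()   # one 1K x 16-bit memory
total_block_rams = 3 * per_memory      # two processor memories + Memory IP
```

Under these assumptions each 1K x 16-bit memory needs 4 Block RAMs, matching the figure given for the Memory IP, and three such memories account for the 12 Block RAMs reported. The reported percentages are consistent with truncation rather than rounding.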

[Flow diagram labels: R8 Simulator, Spartan-IIe, Application]

Page 2: MultiNoC

A Multiprocessing System Enabled by a Network on Chip

Xilinx Design Contest

CONTACT

Leandro Möller, Graduate Computer Science Student
Aline Mello, Graduate Computer Science Student
Everton Carara, Undergraduate Computer Science Student
Fernando Moraes, Professor
Ney Calazans, Professor
{moller, alinev, carara, moraes, calazans}@inf.pucrs.br

SOME HERMES REFERENCES

Moraes, F.; Mello, A.; Möller, L.; Ost, L.; Calazans, N. A Low Area Overhead Packet-switched Network on Chip: Architecture and Prototyping. In: IFIP VLSI SOC 2003, pp. 318-323.

Moraes, F.; Calazans, N.; Mello, A.; Möller, L.; Ost, L. HERMES: an Infrastructure for Low Area Overhead Packet-switching Networks on Chip. Integration, the VLSI Journal, in press, 2004.

ADDRESS

Faculdade de Informática - PUCRS
Av. Ipiranga, 6681 - Prédio 16
90619-900 - PORTO ALEGRE - BRASIL

Telephone: +55 51 3320 3611
Fax: +55 51 3320 3621
http://www.inf.pucrs.br/~gaph

Grupo de Apoio ao Projeto de Hardware (Hardware Design Support Group)

NoC – Network on a Chip

Interconnection Structures

                     Network-on-Chip (NoC)   Shared Bus             Dedicated Wires
Parallelism          ++                      -                      ++
Power Consumption    Lower, shorter wires    Higher, longer wires   Lower, shorter wires
Scalability          ++                      -+                     -
Reusability          ++                      ++                     -

MultiNoC IPs

HERMES IP
  2 x 2 mesh network
  wormhole packet switching
  no global address map (NUMA)
  8 different message formats: read memory, write memory, activate processor, printf, scanf, scanf return, printf return, notify

R8 embedded processor IP
  load-store 16-bit processor architecture
  16 x 16-bit register file
  36 distinct instructions
  1K 16-bit words of local memory for program and data

Memory IP
  1K 16-bit words, using 4 Block RAMs

Serial IP
  RS-232 protocol
  provides bidirectional communication with a host processor
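To make the 2 x 2 mesh and the message formats concrete, the sketch below routes a packet between two router addresses and enumerates the eight message formats listed for the HERMES IP. Several things here are assumptions not stated on the poster: that packets follow deterministic XY routing (X first, then Y), that the first address digit is X and the second is Y, and the names `Service` and `xy_route`. No numeric service codes are implied; `auto()` values are placeholders.

```python
from enum import Enum, auto


class Service(Enum):
    """The eight message formats named on the poster (codes are placeholders)."""
    READ_MEMORY = auto()
    WRITE_MEMORY = auto()
    ACTIVATE_PROCESSOR = auto()
    PRINTF = auto()
    SCANF = auto()
    SCANF_RETURN = auto()
    PRINTF_RETURN = auto()
    NOTIFY = auto()


def xy_route(src, dst):
    """Return the router addresses visited from src to dst.

    Assumes XY routing: move along X until the column matches,
    then along Y. Addresses are two-digit strings such as "01".
    """
    x, y = int(src[0]), int(src[1])
    dst_x, dst_y = int(dst[0]), int(dst[1])
    path = [src]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append(f"{x}{y}")
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append(f"{x}{y}")
    return path
```

For example, under these assumptions a packet from the Serial IP (router 00) to the Memory IP (router 11) passes through one intermediate router, so no pair of IPs in the 2 x 2 mesh is more than two hops apart.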