Lecture 7: Interconnection Architectures - UCY

ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ ([email protected]), ΗΜΥ 656 Advanced Computer Architecture, Spring Semester 2007, Lecture 7: Interconnection Architectures


Page 1


Page 2

Interconnect

An interconnect consists of:
– Medium.
– Channels.
– Nodes and switches.

A host connects to the network through a node.

Information is divided into transmission units called packets.

Page 3

The Ubiquitous Microchip

Sources: Sony, Philips, McLaren, Mercedes, Apple, Airbus, Lexus, Toshiba

Page 4

System-Level Design Issues

System Complexity
• Higher level abstraction and specification
• System-level reuse

System Reliability
• Robustness to internal and external noise
• Self-Sufficient Recovery

System Power Consumption
• Energy-Performance Trade-Offs

Integration of Heterogeneous Technologies
• Component variety
• Top-Down Planning, Synchronization
• Interconnect

Source: International Technology Roadmap for Semiconductors, June 2005

Page 5

A Critical Bottleneck - Interconnect

[Figure: Delay (psec, axis from 0 to 300) versus technology generation (μm), comparing transistor/gate delay with interconnect delay. Source: Gordon Moore, Chairman Emeritus, Intel Corp.]

Page 6

The Billion Transistor Era

Intel Itanium 2 (Codename Montecito): 1.7 BILLION transistors per die! (Photo by Intel)

• Feature sizes diminishing RAPIDLY into the nanometer regime
• Transistor densities skyrocketing
• Gate delays are scaling down
• What about Global Wiring delays? As wire cross-sections decrease, resistance INCREASES!
• Interconnects are also an issue in terms of AREA, POWER, and RELIABILITY

The INTERCONNECT can no longer be ignored!

Page 7

Wiring Delays Keep Increasing Relentlessly!

[Figure: Delay for Metal 1 and Global Wiring versus Feature Size (2005 ITRS). Curves: Gate Delay, Global with Repeaters, Global w/o Repeaters.]

Page 8

Relative Delays

[Figure: Gate delay and global wiring delay compared at 250nm and 32nm. Photo by IBM.]

Global Interconnect Delays are NOT Scaling like Gate Delays!

Page 9

System-on-Chip (SoC) Design Revolution

[Figure: Until now: ASIC design, Physical Components, System-on-Board Integration. Now/Future: Design IP Blocks, Logical Components, System-on-Chip Integration.]

• Increasing Circuit Complexity
• IP Re-Use
• Platform-Based Design
• On-chip Interconnect Scalability

Page 10

Features of current SoCs

Reuse of design and test
• Hard cores: available as layouts or netlists
• Soft cores: available as synthesizable HDL code

SoC Design
• Selection and specialization of cores
• Example: In a processor core you may have the option of selecting the number of registers
• Standard interfaces

Plug’n Play Approach
• Plug the core into a “predefined” area and expect it to work.

Page 11

On Chip Interconnects

• Requirements for an on-chip interconnect
• Buses
• Switching networks
  – Circuit-switching
  – Packet-switching
• Comparison
• Immutability and uniqueness

Page 12

Requirements: General goals

A reusable on-chip interconnect must:
• Support a modern, IP-block based methodology.
• Provide a pre-made/black-box/push-a-button product to the system designer.
• Provide a standard interface for access.
• Support a wide range of configuration parameters.
• Be effective and efficient!

Generic performance goals:
• Latency and latency jitter.
• Bandwidth.
• Power.
• Performance scalability.

Page 13

Requirements

Topology and protocol requirements:
• Long connections must be asynchronous.
• Architectural scalability.
• General purpose.
• Programmable.

Reliability requirements:
• Performance guarantees.
• Noise resistance.
• Dynamic fault tolerance.

Design environment requirements:
• Early performance estimates must be possible.
• Economic use of resources.
• Implemented as an IP-block (or a set of them).

Page 14

Buses: Definition (I)

A bus is an interconnection structure in which all connected hosts share the communication mechanism spatially.

Communication is broadcast to all hosts and multiplexed in time.

Hierarchical buses are an extension of this idea.

Page 15

Buses: Definition (II)

Page 16

Buses: Advantages vs. Disadvantages

Advantages:
• Economic.
• Simple communication mechanism.
• Simple priority (arbitration) implementation.
• Memory mapped communication.

Disadvantages:
• Deficient scalability.
• Contention.
• Full functional test impossible.
• Lack of modularity.
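
The time multiplexing and priority arbitration mentioned for buses can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `arbitrate` and `run_bus` are hypothetical, host 0 is assumed to have the highest priority, and one grant is issued per cycle.

```python
def arbitrate(requests):
    """Fixed-priority arbiter: return the index of the winning host.

    `requests` is a list of booleans, one per host. Index 0 is assumed
    to have the highest priority. Returns None when no host requests.
    """
    for host, wants_bus in enumerate(requests):
        if wants_bus:
            return host
    return None


def run_bus(request_schedule):
    """Grant the shared bus to at most one host per cycle."""
    return [arbitrate(requests) for requests in request_schedule]


schedule = [
    [False, True, True],    # hosts 1 and 2 contend
    [True, False, True],    # hosts 0 and 2 contend
    [False, False, False],  # nobody requests, bus stays idle
]
grants = run_bus(schedule)
print(grants)
```

In this run host 2 requests twice and is never served, which is the contention drawback listed for buses.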

Page 17

Switching Networks: Definition (I)

The communication medium is divided into segments called links.

Connections between links are dynamically controlled by switches.

The network architecture defines the topology and routing scheme.

Page 18

Switching Networks: Definition (II)

If a path between hosts is held for the whole transmission, we talk about circuit-switching.

Page 19

Circuit-switching: Advantages and Disadvantages

Advantages:
• Resources decoupled.
• High accumulated bandwidth.
• Stable connection parameters.

Disadvantages:
• Blocking.
• Circuit set-up penalty.

Page 20

Switching Networks: Definition (III)

If each packet is routed independently, we talk about packet-switching.

Page 21

Packet-switching: Advantages and Disadvantages

Advantages:
• Alternate paths for each packet available: congestion avoidance, fault tolerance.
• Flexibility and programmability.
• Modularity.

Disadvantages:
• Routing and reordering.
• Header penalty.
• Nodes need buffering.
• Difficult QoS guarantees.

Page 22

Comparison

The flaws of packet switching can be alleviated at design time, using software tools.

Page 23

Immutability and Uniqueness: Definition

Once the on-chip interconnect is built on silicon, it is immutable.

A NoC is unique in the sense that it fits its target application's requirements better than those of any other application; for those other applications, a better-suited NoC can be instantiated.

Page 24

Immutability and Uniqueness: Applications

Because of immutability, NoCs can be optimized further than LANs or other macro-networks.

Uniqueness allows NoCs to take part in the system level tasks that are carried out with a certain level of knowledge about the system and the application running in it.

Page 25

On-chip Interconnect Scalability

Non-scalable

Global Wiring Complexity

Shared-Medium, Bus-Based Architectures
• Segmented Bus
• Hierarchical Bus

Ring-Based Architectures
• IBM Cell Microprocessor (8 cores)

Crossbar-Based Architectures
• Sun UltraSPARC T1 (Niagara) (8 cores)
• Microsoft Xbox 360 CPU (by IBM) (3 cores)

Point-to-Point Architectures

Page 26

Buses are becoming spaghetti

[Figure: Block diagram of a CEVA-X1620 based system. Labeled blocks include the core with data controller and program controller, internal data and program memories, TAG, L2 SRAM, DMA, PMU, ICU, timers, GPIO, I/O, TDM, CRU, user peripherals, APB system control, and AHB/APB bridges. Labeled buses: ARM data bus, DMA data bus 1, DMA data bus 2, core data bus, core program bus, accelerator bus, peripheral APB.]

Page 27

Learning from FPGAs

• Universal Logic Blocks
• Regular layout and interconnection resources
• Programmability

Page 28

Enter the Network-on-Chip (NoC)!

• Replace Global Wires with a Resource-Constrained Network
• Structured Interconnect Layout
• Electrical Properties OPTIMIZED and WELL CONTROLLED
• NoCs are like IP Blocks for Wiring!

[Figure: Grid of processing elements (PEs).]

Page 29

Systems-on-Chip

[Figure: Systems-on-Chip and Networks-on-Chip. Example cores: VGA core, ADC/DAC, analog, DSP, ALU core.]

Page 30

What are Networks-on-Chip (NoC)?

Processing Elements (PEs) interconnected via a packet-based network

[Figure: Nine NIC and router (R) pairs arranged in a grid. Each NIC connects to its router, and routers are connected by b-bit links.]

Page 31

Regular Network on Chip

[Figure: Regular grid of PEs, each attached to a router.]

Page 32

Networks On Chip

Messages are packetized at the PE-Network Interface and routed to their destinations, where they are de-packetized into data.

[Figure: A message (MSG) becomes a packetized message, crosses the network, and is decoded back into the message (MSG).]
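
The packetize, route, de-packetize flow described on this slide can be sketched as follows. This is a minimal illustration, not the lecture's design: the `Packet` fields, the 4-byte payload size, and the function names are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: int        # sending PE (hypothetical field)
    dst: int        # receiving PE (hypothetical field)
    seq: int        # position of this packet within the message
    last: bool      # True for the final packet of the message
    payload: bytes


def packetize(message, src, dst, payload_size=4):
    """Split a message into fixed-size payloads, one packet each."""
    chunks = [message[i:i + payload_size]
              for i in range(0, len(message), payload_size)] or [b""]
    return [Packet(src, dst, seq, seq == len(chunks) - 1, chunk)
            for seq, chunk in enumerate(chunks)]


def depacketize(packets):
    """Reorder packets by sequence number and rebuild the message."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


packets = packetize(b"hello, NoC!", src=0, dst=5)
# Deliver the packets in reverse order to mimic out-of-order arrival.
restored = depacketize(reversed(packets))
print(len(packets), restored)
```

The sequence number is what lets the receiving interface tolerate out-of-order arrival, which matters once packets are routed independently.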

Page 33

The NoC Paradigm Shift

• Architectural paradigm shift: Replace the spaghetti by a customized network
• Usage paradigm shift: Pack everything in packets
• Organizational paradigm shift: Confiscate communications from logic designers. Create a new discipline, a new back-end responsibility (already done for power grid, clock grid, …)

[Figure: A bus replaced by a network of computing modules, network routers, and network links.]

Page 34

Why go there?

• Efficient sharing of wires
• Lower area / lower power / faster operation
• Shorter design time, lower design effort
• Scalability

Page 35

NoC Customization

• Trim routers / ports / links
• Place modules
• Adjust link capacities

Page 36

Network Components

Network Interface
• Hardware located between each PE and router
• Creates data/control packets for outgoing data
• Decodes incoming data/control packets

Network Router/Switch
• Receives packets, routes them based on routing algorithm
• Crossbar switch used for switching
• Contains buffering capacity for switching
• Optional error control, QoS hardware, etc.

Network Links
• Physical channels for each router-to-router and router-to-PE connection
• Unidirectional links typically
• Low-swing signals used for low-power consumption

Page 37

Related Work in NoCs

Architectural Impact of NoCs
• Network on a Chip: An architecture for billion transistor era [Hemani, Jantsch, Kumar, Postula, Oberg, Millberg, Lindqvist – IEEE NorChip Conference 2000]
• Route Packets, Not Wires: On-Chip Interconnection Networks [Dally, Towles – DAC 2001]
• Networks on Chips: A New SoC Paradigm [Benini, De Micheli – IEEE Computer January 2002]
• The Raw Microprocessor: A Computational Fabric for Software Circuits and General Purpose Programs [Taylor et al. – IEEE Micro March/April 2002]

High Performance
• Low-Latency Virtual-Channel Routers for On-Chip Networks [Mullins, West, Moore – ISCA 2004]

Power-Performance, Temperature-Performance
• Power-Driven Design of Router Microarchitectures in On-Chip Networks [Wang, Peh, Malik – MICRO 2003]
• Thermal Modeling, Characterization and Management of On-Chip Networks [Shang, Peh, Kumar, Jha – MICRO 2004]

Performance-Reliability
• Networks-On-Chip: The Quest for On-Chip Fault-Tolerant Communication [Marculescu – ISVLSI 2003]

Page 38

Micronetwork Control

• Protocol stack is employed to effectively utilize micronetwork architecture
• Abstraction into data link layer, network layer and transport layer

[Figure: Protocol stack layers: Physical, Data link, Network, Transport, Application.]

Page 39

Data Link Layer

Requirement - To increase the reliability of the physical link up to a minimum required level

• Physical layer is not sufficiently reliable
• Packetizing data
  – Performance vs. error probability tradeoff depending on packet size
• Error-correcting code
• Retransmission protocols: alternating-bit, go-back-N and selective repeat

[Figure: Protocol stack layers: Physical, Data link, Network, Transport, Application.]
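
The packet-size tradeoff named on this slide can be made concrete with a simple model. This is a sketch under assumptions that the slides do not state: bit errors are independent with a fixed bit error rate, a corrupted packet is useless, and every packet carries a fixed-size header. The function names and the numbers are illustrative only.

```python
def packet_error_probability(bit_error_rate, packet_bits):
    """Probability that at least one bit of the packet is corrupted,
    assuming independent bit errors."""
    return 1.0 - (1.0 - bit_error_rate) ** packet_bits


def efficiency(bit_error_rate, payload_bits, header_bits):
    """Fraction of transmitted bits that end up as useful payload:
    payload share of the packet times the chance it arrives intact."""
    total_bits = payload_bits + header_bits
    p_ok = 1.0 - packet_error_probability(bit_error_rate, total_bits)
    return (payload_bits / total_bits) * p_ok


# Sweep the payload size for an assumed 16-bit header and BER of 1e-4.
for payload in (16, 64, 256, 1024, 4096):
    print(payload, round(efficiency(1e-4, payload, header_bits=16), 3))
```

Under this model small packets waste bandwidth on headers and large packets are lost too often, so efficiency peaks at an intermediate packet size.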

Page 40

Network Layer

Requirement – To implement end-to-end delivery control in network architectures with many communication channels

Switching algorithms
• Circuit, packet and cut-through switching

Routing algorithms
• Deterministic routing – good for regular traffic patterns
• Adaptive routing – good for irregular traffic (case of SoCs)

[Figure: Protocol stack layers: Physical, Data link, Network, Transport, Application.]

Page 41

Transport Layer

Requirement – To provide reliable end-to-end services (e.g. TCP)

• Packetization – at the source
• Resequencing and reassembling – at the destination
• Flow control and negotiation
  – Deterministic approach – service quality guarantee with resource underutilization
  – Statistical approach – more efficient but no quality guarantee

[Figure: Protocol stack layers: Physical, Data link, Network, Transport, Application.]

Page 42

Micronetwork Control

• Further work to predict the tradeoff curves
• Architecture and protocol can be tailored to the target system or applications
• Impact of architecture and control design on communication energy consumption

Page 43

Commercial NoCs

ST Microelectronics
• STNoC, dubbed “Spidergon”
• Complex multimedia chips
• Proprietary topology

Philips Electronics NV
• Æthereal NoC
• Quality-of-Service
• Network connections configurable at run-time

Arteris SA
• Licensable NoC design tools
• IP cores of NoC components

Page 44

AEthereal Network-on-Silicon

• Research in progress (Philips)
• IP cores are connected by network
• Packet-switched router network
• Protocol stack-based design
• Provide guaranteed service
• Simplifies IP design and composition of IPs

Page 45

Generic Router I/O and Architecture

[Figure: Generic M x N router with a routing decision block.]

Page 46

Generic NoC Router Architecture

[Figure: Router datapath from the nth input port to the nth output port. Labeled blocks: error detection/error correction, routing decision unit, virtual channel registers, virtual channel arbitration, crossbar arbitration, crossbar switch, retransmission registers. Labeled signals: incoming flit, outgoing flit, forward flow, ACK/NACK and corrected data, (N)ACK from next router.]

Page 47

A Conventional NoC Router

[Figure: Router with five input ports (From East, From West, From North, From South, From PE), each buffered with virtual channels VC 0 to VC 2 and selected by a VC identifier. Control logic: Routing Unit (RC), VC Allocator (VA), Switch Allocator (SA). A 5 x 5 crossbar drives the outputs To East, To West, To North, To South, and To PE.]

Page 48

The Typical NoC Router Pipeline

[Figure: Router pipeline from Flit In to Flit Out: Routing, VC Alloc. (VC arbiter), Switch Alloc. (SA arbiter), Crossbar. Input virtual channels VC 1, VC 2, ..., VC V.]

• L.S. Peh et al. (HPCA 2001): 3-stage pipeline
• Look-Ahead Routing (ISCA 2006, DAC 2005): 2-stage pipeline
• R. Mullins et al. (ISCA 2004): 1-stage pipeline

Page 49

Network-On-Chip Issues

Power Consumption
• Overhead power consumed in routers, network interfaces, and overhead data transmission/encoding
  – Data such as addresses, control bits, etc.

Reliability
• Reliable data transmission is a necessary concept for any on-chip network
• Network guarantees data transmission from PE A to PE B.

Performance
• Network net throughput
  – Defined as the rate of useful data that can be sent over the network
• Network utilization

In general, similar to traditional networks!

Page 50

Topologies

Different interconnection topologies tend to have different traffic patterns. The way in which nodes are connected in a network impacts:
• Latency
• Bandwidth
• Traffic pattern

2D mesh
• Low cost
• Some nodes connect to more neighbors than others
• Tends to generate “hot spots” in the center of the topology

2D Torus
• Lower message latency
• Folded torus is used to avoid wire delays
• All nodes in a torus connect to the same number of neighbors
• Uniform traffic density
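
The difference in neighbor counts between the two topologies can be checked with a short sketch. The coordinate convention (x, y) and the function names are assumptions made for illustration.

```python
def mesh_neighbors(x, y, width, height):
    """Neighbors of node (x, y) in a 2D mesh: links stop at the edges."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(nx, ny) for nx, ny in candidates
            if 0 <= nx < width and 0 <= ny < height]


def torus_neighbors(x, y, width, height):
    """Neighbors of node (x, y) in a 2D torus: links wrap around."""
    candidates = [((x - 1) % width, y), ((x + 1) % width, y),
                  (x, (y - 1) % height), (x, (y + 1) % height)]
    return sorted(set(candidates))


# Compare a corner, an edge, and an interior node of a 4x4 network.
for node in [(0, 0), (1, 0), (1, 1)]:
    print(node,
          len(mesh_neighbors(*node, 4, 4)),
          len(torus_neighbors(*node, 4, 4)))
```

In the mesh, corner and edge nodes have fewer links than interior nodes, while every torus node has the same degree, which matches the uniform traffic density claimed for the torus.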

Page 51

Packets and Addressing

Packet Size
• Number of packets per message
• Both header and payload are packets
• Packet length
  – Explicit
  – Implicit

Addressing scheme
• E.g. 6-bit encoding for at most an 8*8 array
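
The 6-bit example follows from needing 3 bits per coordinate for 8 positions in each dimension. A minimal sketch, assuming the y coordinate occupies the upper 3 bits and x the lower 3 bits (the slide does not specify a layout, and the function names are hypothetical):

```python
def encode_address(x, y, bits_per_dim=3):
    """Pack (x, y) into a single address, y in the upper bits."""
    limit = 1 << bits_per_dim          # 8 positions per dimension
    if not (0 <= x < limit and 0 <= y < limit):
        raise ValueError("coordinate outside the array")
    return (y << bits_per_dim) | x


def decode_address(address, bits_per_dim=3):
    """Unpack an address back into (x, y)."""
    mask = (1 << bits_per_dim) - 1
    return address & mask, address >> bits_per_dim


address = encode_address(5, 2)
print(address, decode_address(address))
```

With 3 bits per dimension the largest address is 63, so all 64 nodes of an 8*8 array fit in 6 bits.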

Page 52

Switching activity

Virtual Cut-Through
• Packet is forwarded as soon as the destination can accept it in its entirety
• Buffering requirements are high

Store-and-Forward
• Packet is received in its entirety and then it is forwarded
• Again, high buffering requirements…

Wormhole Routing
• Packets are broken down into flits (smallest bufferable chunk)
• Flits are routed as soon as the destination can accept a single flit
• Much smaller buffering requirements
• Preferred method of switching in NoCs today
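
The buffering and latency differences between these schemes can be illustrated with an idealized model. Everything here is an assumption for the sake of the sketch: no contention, one flit transferred per cycle per link, a fixed per-hop router delay, and hypothetical function names.

```python
def store_and_forward_latency(hops, flits_per_packet, router_delay=1):
    """Each hop waits for the whole packet before forwarding it."""
    return hops * (router_delay + flits_per_packet)


def wormhole_latency(hops, flits_per_packet, router_delay=1):
    """The header flit pays the per-hop delay; the remaining flits
    follow in a pipeline behind it."""
    return hops * router_delay + flits_per_packet


def min_buffer_flits(scheme, flits_per_packet):
    """Smallest input buffer (in flits) that the scheme needs per port."""
    if scheme in ("store-and-forward", "virtual-cut-through"):
        return flits_per_packet   # must be able to hold a whole packet
    if scheme == "wormhole":
        return 1                  # a single flit is enough to advance
    raise ValueError(f"unknown scheme: {scheme}")


print(store_and_forward_latency(hops=4, flits_per_packet=8))
print(wormhole_latency(hops=4, flits_per_packet=8))
print(min_buffer_flits("wormhole", 8))
```

The model ignores blocking, which is exactly where wormhole switching pays for its small buffers: a stalled packet keeps holding links along its path.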

Page 53

Routing Algorithms (Deterministic)

• X-Y routing
• Hierarchical routing
• Hot-potato routing

[Figure: Three example routes, one per algorithm, each from a source (S) to a destination (D).]
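
X-Y routing, the first algorithm listed, can be sketched for a 2D mesh as follows: the packet first travels along the X dimension until its column matches the destination, then along Y. The coordinate convention and the function name `xy_route` are assumptions for illustration.

```python
def xy_route(src, dst):
    """Return the list of nodes visited from src to dst, moving along
    X first and then along Y (dimension-ordered routing on a mesh)."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                 # phase 1: correct the X coordinate
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                 # phase 2: correct the Y coordinate
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))
print(xy_route((3, 3), (1, 3)))
```

The route depends only on source and destination, which is what makes the algorithm deterministic; it cannot steer around a congested or faulty link.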

Page 54

Wiring and Tiling

Page 55

Crossbar Switches

[Figure: Crossbar with inputs E, W, N, S, PE and outputs E, W, N, S, PE, driven by a control block.]

• Delay through a crossbar increases significantly with the number of ports
• Places a limit on the connectivity of the network
• Power-hungry operation

Page 56

Virtual Channels

[Figure: Inputs N, S, E, W, PE pass through a crossbar into virtual channels VC # 1 to VC # n. A priority determination unit issues the virtual channel selection signal that picks which VC drives the output link.]

• Virtual Channel concept used to provide QoS and deadlock avoidance
• Buffer capacity a limiting factor
• VC hardware can be complex

Page 57

Virtualization

• Mapping of more than one node onto physical PE hardware
• Allows larger number of nodes on-chip.
• Shared hardware
• Configuration Memory overhead at each PE

[Figure: Nodes B1 to B4 and C1, C2, each with its own router, are mapped in pairs (B1B2, C1C2, B3B4) onto shared PEs with fewer routers.]

Page 58

Network Parameters

• Channel width
• Flit size – In a combined guaranteed-throughput/best-effort (GT-BE) router, flit size should be a multiple of block size to avoid alignment problems
• Number of channels – two nodes can have more than one channel between them
• Buffer memory parameters – Critical since we cannot drop packets
  – Flit buffer depth
  – Flit buffer organization: shared between channels, or individual buffers for each channel

Page 59

Hot Spots in NoC

Hot Spot: A module that occasionally cannot digest all the traffic addressed to it
• Results in temporary massive delay build-up
• Results in blocking the net!

This is NOT congestion on the net
• Higher network capacity won’t help

Examples
• Port to off-chip DRAM
• Shared resource on chip

Page 60

HotSpots in QNoC (cont’d)

When the HotSpot (HS) clogs, worms “get stuck” in the network, and block other worms

Two problems:
• Performance
• Fairness

[Figure: Hot-spot module IP (HS) attached to the network through its interface.]

Page 61

HS Affects the System

• HS is not a local problem. Traffic destined elsewhere suffers too!
• The Green packet experiences long delay even though it does NOT share any link with HS traffic

[Figure: IP1 (HS), IP2, and IP3, each attached to the network through an interface.]

Page 62

Network Performance

As HS module utilization grows, a large part of the system becomes clogged

Page 63

Source (un)Fairness

• Module location greatly affects QoS
• Example: At 90% utilization, a distant module experiences 10x the latency of a close one

Simulation results for a 4x4 NoC with 10 Gbit/s links and a 6 Gbit/s HS module

[Figure: 4x4 grid of routers (R) with numbered modules and the HS module.]

Page 64

Blocked Output Ports…

Page 65

Cooling down the Hot Spot

• When the spot gets hot, block new packets to it
• This is prevention
• How? With credit-based allocation
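
One way to picture credit-based allocation is a scheduler at the hot-spot interface that hands out a limited number of credits, so sources inject a packet only when the hot spot can absorb it. The slides name the technique but not its details, so the class `HotSpotScheduler`, its methods, and the first-come-first-served waiting queue below are all assumptions for illustration.

```python
from collections import deque


class HotSpotScheduler:
    """Hypothetical credit scheduler sitting at the hot-spot interface."""

    def __init__(self, credits):
        self.free_credits = credits   # packets the hot spot can absorb
        self.waiting = deque()        # sources blocked at their interface

    def request(self, source):
        """Return True if `source` may inject one packet now."""
        if self.free_credits > 0:
            self.free_credits -= 1
            return True
        self.waiting.append(source)   # hold the packet outside the network
        return False

    def packet_served(self):
        """The hot spot finished one packet. Pass the freed credit to the
        longest-waiting source, or bank it if nobody is waiting."""
        if self.waiting:
            return self.waiting.popleft()
        self.free_credits += 1
        return None


scheduler = HotSpotScheduler(credits=2)
print(scheduler.request("IP1"), scheduler.request("IP3"), scheduler.request("IP4"))
print(scheduler.packet_served())
```

Because a blocked packet waits at its source interface instead of inside the network, it does not hold links that traffic to other destinations needs.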

Page 66

HotSpot Credit-Based Allocation

[Figure: IP1, IP3, and IP4 attach to the NoC through interfaces with flow control. The hot-spot module IP2 (HS) attaches through an enhanced interface that contains a scheduler.]

Page 67

HotSpot Credit-Based Allocation

[Figure: Same components as on Page 66.]

Page 68

HotSpot Credit-Based Allocation

[Figure: Same components as on Page 66.]

Page 69: ΔΙΑΛΕΞΗ 7: Interconnection Architectures - UCY · 7: Interconnection Architectures. ... zPoint-to-Point Architectures. Buses are becoming spaghetti Buses are becoming spaghetti

Power Consumption

• Network power identified as a limitation: ~40% of total power!
• Energy(flit) = [E(write buffer) + E(read buffer) + E(arbitration) + E(crossbar) + E(link)] × (# of hops)
  = [E(buffers) + E(arbitration) + E(crossbar) + E(link)] × (# of hops)
• Energy(packet) = Energy(flit) × (# of flits per packet)
• Energy per packet depends on the number of flits per packet and the number of hops the packet travels through the network
• The larger the network, the more hops
• Need better routing algorithms, topologies, etc.
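The energy expressions above can be written out as two small functions. The per-hop energy values below are placeholders in arbitrary units, chosen only so the arithmetic is easy to follow; they are not measurements. The function and parameter names are also assumptions of this sketch.

```python
def flit_energy(e_write, e_read, e_arb, e_xbar, e_link, hops):
    """Energy of one flit: the per-hop cost is paid at every hop."""
    per_hop = e_write + e_read + e_arb + e_xbar + e_link
    return per_hop * hops


def packet_energy(e_flit, flits_per_packet):
    """Energy of one packet: every flit pays the flit energy."""
    return e_flit * flits_per_packet


# Placeholder per-hop energies in arbitrary units (not measured data).
per_hop = dict(e_write=2, e_read=2, e_arb=1, e_xbar=3, e_link=4)

e3 = packet_energy(flit_energy(hops=3, **per_hop), flits_per_packet=4)
e6 = packet_energy(flit_energy(hops=6, **per_hop), flits_per_packet=4)
print(e3, e6)  # doubling the hop count doubles the packet energy
```

Because the hop count multiplies the whole per-hop sum, shortening routes reduces buffer, arbitration, crossbar and link energy together.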


NoC Reliability

• Shrinking feature size results in decreasing Vdd and Vt
• Crosstalk, coupling noise, soft errors and process variations affect reliability
• Reliability is a critical design issue
• Communication protocols (NoCs) require error protection mechanisms
• Error protection consumes energy and increases latency
• Traditional macro-networks provide ideas on error detection/correction schemes
• Error detection + data retransmission vs. error correction

Recall: NoC Designs

[Figure: 4x4 grid of switches (S), each paired with a RESOURCE and its network interface (NI)]

• PE to switch(es) to PE communication
• While we view one PE as the sender and another PE as the receiver, the same concept can be applied switch to switch
• As such, there are multiple transmissions for each data packet
• Level of protection?


Error Detection Schemes

End-to-End Error Detection
• Parity check or CRC codes can be added to each packet/flit
• CRC/parity encoder integrated into each sender NI
• Packets are encoded, transmitted and stored (for retransmission)
• Receiver NI checks for errors
• ACK/NACK signal to sender, either piggybacked w/ a response packet or sent individually
– Open Core Protocol requires request-response transactions
• Time-out mechanism necessary
• Need sequence numbers for each packet (for re-ordering and identification of duplicate packets)
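The end-to-end steps listed above can be sketched as follows. This is a minimal model, assuming a 1-byte sequence number in front of the payload and a CRC-32 (from Python's `zlib`) appended as check bits; the packet layout and the class names `SenderNI` and `ReceiverNI` are illustrative choices, not a format defined in the lecture. The time-out mechanism is left out: here the sender simply keeps every packet until it sees an ACK.

```python
import zlib


def encode(seq, payload):
    """Sender NI: prepend a 1-byte sequence number, append a CRC-32."""
    body = bytes([seq]) + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


class ReceiverNI:
    def __init__(self):
        self.delivered = {}                 # seq -> payload, used for duplicate rejection

    def receive(self, packet):
        """Return ('ACK', seq) or ('NACK', None) for the sender."""
        body, crc = packet[:-4], packet[-4:]
        if zlib.crc32(body).to_bytes(4, "big") != crc:
            return ("NACK", None)           # corrupted: the header cannot be trusted
        seq, payload = body[0], body[1:]
        if seq not in self.delivered:       # duplicates are acknowledged, not re-delivered
            self.delivered[seq] = payload
        return ("ACK", seq)


class SenderNI:
    def __init__(self):
        self.unacked = {}                   # seq -> encoded packet kept for retransmission

    def send(self, seq, payload):
        self.unacked[seq] = encode(seq, payload)
        return self.unacked[seq]

    def on_reply(self, reply):
        kind, seq = reply
        if kind == "ACK":
            self.unacked.pop(seq, None)     # retransmission buffer entry can be freed


sender, receiver = SenderNI(), ReceiverNI()
pkt = sender.send(0, b"data")
corrupted = bytes([pkt[0] ^ 0x01]) + pkt[1:]   # flip one bit in transit
first = receiver.receive(corrupted)            # check fails at the receiver NI
sender.on_reply(first)
second = receiver.receive(sender.unacked[0])   # retransmit the stored copy
sender.on_reply(second)
print(first, second)
```

Note that the stored copy stays in the sender NI until the ACK arrives, which is why NI buffering becomes the main cost of this scheme.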

End-to-End Retransmission

[Figure: end-to-end retransmission. Labels: Core, sender NI, Encoder, packet buffers, switch A, switch B, queuing buffers, credit signal, Network, switch, receiver NI, Decoder, Core]


Error Detection Schemes

Switch-to-Switch Error Detection
• Similar to end-to-end, only done at each switch
• Can be done at packet or flit level
– switch-switch flit w/ parity, switch-switch flit w/ CRC: each flit contains its own check bits
– switch-switch packet w/ parity, switch-switch packet w/ CRC: check bits added to tail flit
• Need two sets of buffers
– Regular operation (queuing buffers)
– Storing packets not acknowledged by the receiver (for retransmission)
– Capacity as w/ queuing buffers: 2NL+1 for flit level, 2NL+f for packet level with f flits/packet
• ACK/NACK can be a single wire now
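A flit-level, switch-to-switch check with parity might look like the sketch below. It assumes a single even-parity bit per flit and models the ACK/NACK wire as a boolean; the helper names and the `flip_mask` error-injection parameter are invented for the example.

```python
def parity(bits):
    """Even-parity check bit for an integer flit payload."""
    return bin(bits).count("1") & 1


def make_flit(payload):
    return (payload, parity(payload))       # each flit carries its own check bit


def link(flit, flip_mask=0):
    """Model one switch-to-switch hop; flip_mask injects bit errors into the payload."""
    payload, check = flit
    return (payload ^ flip_mask, check)


def receive(flit):
    """Downstream switch: True drives ACK, False drives NACK on the single wire."""
    payload, check = flit
    return parity(payload) == check


retransmit_buffer = make_flit(0b1011_0010)                       # upstream copy held until ACK
ack1 = receive(link(retransmit_buffer, flip_mask=0b0000_0100))   # single-bit upset
ack2 = receive(link(retransmit_buffer))                          # clean retry
missed = receive(link(retransmit_buffer, flip_mask=0b0000_0110)) # two flipped bits
print(ack1, ack2, missed)
```

The last case shows the known limit of a single parity bit: an even number of flipped bits passes the check, which is one reason the slide lists CRC as the alternative.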

Switch to Switch Retransmission

[Figure: switch-to-switch retransmission. Labels: Core, sender NI, Encoder, packet buffers, switch A, switch B, circular buffers (queuing + retransmission buffers), Decoder (in switch and in receiver NI), TMR, "nodatavalid", ACK, mf, data, credit signal, receiver NI, Core]


Observations

For End-End and Hybrid schemes
• Power consumption is mainly in the buffers at the NI
– The communication pattern can provide buffer requirement information
• Increase in traffic (ACK/NACK packets)
– Merge multiple ACKs/NACKs in a single packet?

For Switch-Switch (packet/flit) schemes
• Retransmission buffers are responsible for most of the power consumption
• Efficient buffer allocation based on application demands needs to be explored

Out-of-order arrival and duplicate rejection mechanisms are also necessary and add power overhead.

It is not efficient to block network traffic while waiting for a retransmitted packet.
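One possible answer to the question of merging multiple ACKs/NACKs is a bitmap acknowledgement that covers a window of consecutive sequence numbers. The sketch below is only one way to do it; the tuple layout `(base_seq, count, bitmap)` and both function names are assumptions of this example.

```python
def merge_acks(results, base_seq):
    """Pack per-packet outcomes into one reply.

    Bit i of the bitmap is set when packet base_seq + i was received correctly.
    """
    bitmap = 0
    for i, ok in enumerate(results):
        if ok:
            bitmap |= 1 << i
    return (base_seq, len(results), bitmap)


def to_retransmit(merged):
    """Sender side: list the sequence numbers whose bit is clear (NACKed)."""
    base_seq, count, bitmap = merged
    return [base_seq + i for i in range(count) if not (bitmap >> i) & 1]


merged = merge_acks([True, False, True, True], base_seq=8)
resend = to_retransmit(merged)
print(merged, resend)
```

Four individual replies collapse into one packet here, at the price of the sender holding its retransmission copies a little longer while the window fills.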


Observations

• End-End is more efficient when links are long (multi-cycle links)
• Switch-Switch is more efficient when links are short and hop count is high (NI buffering is an issue)
• Low error rates result in similar performance across the schemes
• Higher error rates favor the hybrid mechanism
• End-End is a subset of the hybrid scheme, hence we can selectively disable the correction circuitry
• Hierarchical networks
– Switch-based for local communication
– End-based for global communication
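The hop-count part of these observations can be illustrated with a toy model that counts only link traversals. It assumes an independent error probability `p` on every hop, that an end-to-end error is discovered only at the receiver NI (so every attempt uses all hops), and it ignores ACK traffic, buffer energy and link length. None of these assumptions come from the slides, so the numbers indicate a trend and nothing more.

```python
def expected_traversals_switch_to_switch(hops, p):
    """Each hop retries locally: on average 1 / (1 - p) traversals per hop."""
    return hops / (1 - p)


def expected_traversals_end_to_end(hops, p):
    """A whole-path attempt succeeds with probability (1 - p)**hops and always uses `hops` links."""
    return hops / (1 - p) ** hops


for hops in (2, 8):
    s2s = expected_traversals_switch_to_switch(hops, p=0.05)
    e2e = expected_traversals_end_to_end(hops, p=0.05)
    print(hops, round(s2s, 2), round(e2e, 2))
```

Under these assumptions the gap between the two schemes widens as the hop count grows, which agrees with the remark that switch-to-switch protection pays off on paths with many hops.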

Case Study: A Neural Network

[Figure: neural-network face detection. Labels: Input Image, Lighting Correction, Input Receptive Fields (10x10, 5x5, 5x20), 1st Hidden Layer of Neurons, 2nd Hidden Layer of Neurons, Output Neuron, FACE?]


From Spaghetti Wires to NoC!

• Traffic flow: regular!
• Some computation values are initialized during configuration
• Data moves in and out through I/O (ingress/egress) nodes


Research Ideas - Challenges

• Performance
– Throughput, bandwidth, frequency
• Energy
– Reduce # of hops and packets; optimizations on individual components
• Reliability
– Transient errors (e.g. soft errors) occurring within a router
– Energy reduction via application-specific characteristics
• What about the application space?
– Applications that benefit from NoC implementations
– Cases where NoC overhead is a negative factor?
• Vision and multimedia applications are huge beneficiaries!


3D Chip Design

New Challenges = New Opportunities

How about the third dimension?


3D Stacking = Increased Locality!

Many more neighbors within a few minutes of reach!

Chip-Level 3D Integration

[Figure: 2D chip (one device layer on silicon) vs. 3D chip (Device Layer 1 and Device Layer 2 joined by a vertical interconnect). Courtesy: K. Bernstein, IBM]

• Multiple layers of active devices
• Vertical interconnects between layers

Reduced Global Interconnect Length

[Figure: interconnect lengths labelled L]

• Delay/power reduction
• Bandwidth increase
• Smaller footprint
• Mixed technology integration

3D Benefit: Increased Locality

[Figure: 2D vicinity vs. 3D vicinity of a CPU. Legend: CPU, nodes within 1 hop, nodes within 2 hops, nodes within 3 hops]

• Bus-based inter-layer communication (dTDMA bus pillar)
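The locality benefit can be checked by counting how many nodes lie within a given number of hops of a central node. The sketch below assumes a plain mesh in both cases, with one hop per step in any dimension and enough layers that the 3D neighbourhood is not cut off. That is a simplification: the slide's dTDMA bus pillar handles vertical traffic differently, and real stacks have only a few layers. The function name and mesh size are choices made for the example.

```python
from itertools import product


def nodes_within(hops, dims, size):
    """Count nodes (excluding the centre) within `hops` mesh hops of the centre node."""
    centre = (size // 2,) * dims
    count = 0
    for node in product(range(size), repeat=dims):
        dist = sum(abs(a - b) for a, b in zip(node, centre))   # Manhattan distance
        if 0 < dist <= hops:
            count += 1
    return count


for h in (1, 2, 3):
    print(h, nodes_within(h, dims=2, size=9), nodes_within(h, dims=3, size=9))
```

Within three hops, the 3D mesh in this model reaches 62 nodes against 24 in 2D.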


3-D Networks on Chip


To NoC or not to NoC?

Adopting just any net feature for NoC may be a mistake

• You can create an elegant regular topology
– But ASICs are irregular
• You can create a non-blocking network
– But hot spots can block networks of infinite capacity
• You can guarantee service (it's easy to verify)
– But extremely hard to configure. Best Effort is simpler
• You can use lots of buffers
– And dissipate lots of power
• You can create complex routing
– Fixed, simple single-path routing saves energy and area
• You can try to balance traffic
– Single-path routing works better with links of uneven capacity
• You can make packets conflict with each other
– Better use priority levels and pre-emption
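As an example of what fixed, simple single-path routing can mean in practice, the sketch below implements dimension-order (XY) routing on a 2D mesh. XY routing is picked here as a well-known instance of the idea; the slide does not name a specific algorithm, and the coordinate convention is an assumption of this example.

```python
def xy_route(src, dst):
    """Dimension-order route on a 2D mesh: finish the X dimension first, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
print(route)
```

Every source-destination pair always gets the same path, so a router needs only two coordinate comparisons and no routing tables or path-selection state.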