CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies...
Transcript of CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies...
![Page 1: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/1.jpg)
Page 1
CS/ECE 757: Advanced Computer Architecture II(Parallel Computer Architecture)
Interconnection Network Design (Chapter 10)
Copyright 2001 Mark D. HillUniversity of Wisconsin-Madison
Slides are derived from work bySarita Adve (Illinois), Babak Falsafi (CMU),
Alvy Lebeck (Duke), Steve Reinhardt (Michigan),and J. P. Singh (Princeton). Thanks!
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 2
Interconnection Networks
• Goal: Communication between computers• Warning: Terminology-rich environment• Focus on Networks for Parallel Computing
– today’s System Area Networks exhibit many of the same properties
• Traffic Patterns– Arbitrary (bisection bandwidth matters)– Near-neighbor– Permutations (e.g., FFT)– Multicasts or Broadcasts?
– Latency or Bandwidth Critical?
![Page 2: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/2.jpg)
Page 2
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 3
Terms
Network characterized by• Topology
– physical structure of the graph
• Routing Algorithm– through which paths can message flow
• Switching Strategy– How data in message traverses its route – Circuit Switched vs Packet Switched
• Flow Control– When does a packet (or portions of it) move along its route
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 4
More Terms
• Given topology constructed by linking switches and network interfaces, must deliver packet from node A to node B
• Link: cable with connectors on each end– connect switches to other switches or network interfaces
• Switch: N inputs N outputs (degree N)• Phit: Minimum # of bits physically moved across link
in one cycle• Flit: Minimum # of bits move across link as a single
unit• Packet: unit that requires routing information, some
number of flits
![Page 3: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/3.jpg)
Page 3
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 5
Outline
• Topology
• Switching, Routing, & Deadlock
• Switch Design
• Flow Control
• Case Studies
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 6
Topology
• Structure of the interconnect• Determines
– Switch Degree: number of links from a node– Diameter: number of links crossed between nodes on maximum
shortest path– Average distance: number of hops to random destination– Bisection: minimum number of links that separate the network into
two halves
• Don’t forget Network Interface (NI)– “ On-” and “ off-ramp” to the interconnect “ freeway”– NI latency can dominate interconnect latency
(reducing the importance of topology)
![Page 4: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/4.jpg)
Page 4
(C) 2003 J. E. Smith CS/ECE 757 7
Topology
Node
SW Interface
HW Interface
linklink
bandwidth
Node
SW Interface
HW Interface
linklink
bandwidth
. . .
. . .
Interconnect
Bisection Bandwidth
Overhead
Latency
Application Application
(C) 2003 J. E. Smith CS/ECE 757 8
Buses
• Direct interconnect• Performance/Cost:
– Switch cost: N– Wire cost: const– Avg. latency: const– Bisection B/W: const– Not neighbor optimized– May be local optimized
• Capable of broadcast (good for MP coherence)• Bandwidth not scalable => major problem
– Hierarchical buses?» Bisection B/W remains constant» Becomes neighbor optimized
MemoryMemoryMemoryMemory
Proc
cache
Proc
cache
Proc
cache
Proc
cache
![Page 5: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/5.jpg)
Page 5
(C) 2003 J. E. Smith CS/ECE 757 9
Rings
• Direct interconnect• Performance/Cost:
– Switch cost: N– Wire cost: N– Avg. latency: N / 2– Bisection B/W: const– neighbor optimized, if bi-directional– probably local optimized
• Not easily scalable– Hierarchical rings?
» Bisection B/W remains constant» Becomes neighbor optimized
M
P
RING
S
M
P
S
M
P
S
(C) 2003 J. E. Smith CS/ECE 757 10
Hypercubes
• n-dimensional unit cube• Direct interconnect• Performance/Cost:
– Switch cost: N log2 N– Wire cost: (N log2 N) /2– Avg. latency: (log2 N) /2– Bisection B/W: N /2– neighbor optimized– probably local optimized
• latency and bandwidth scale well, BUT– individual switch complexity
grows=> max size is built in
S
M P
S
M P
S
M P
S
M P
S
M P
S
M P
S
M P
S
M P
000
110001 101
111011
010
100
![Page 6: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/6.jpg)
Page 6
(C) 2003 J. E. Smith CS/ECE 757 11
2D Torus
• Direct interconnect• Performance/Cost:
– Switch cost: N– Wire cost: 2 N– Avg. latency: N 1/2
– Bisection B/W: 2 N 1/2
– neighbor optimized– probably local optimized
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
(C) 2003 J. E. Smith CS/ECE 757 12
2D Torus, contd.
• Cost scales well• Latency and bandwidth do not scale as well as
hypercube, BUT– difference is relatively small for practical-sized systems
• Weave nodes to make inter-node latencies const.• Routing can be tricky (deadlocks)• 2D Mesh similar, but without wraparound
SM P
SM P
SM P
SM P
SM P
SM P
SM P
SM P
![Page 7: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/7.jpg)
Page 7
(C) 2003 J. E. Smith CS/ECE 757 13
3D Torus
• Direct interconnect• Performance/Cost:
– Switch cost: N– Wire cost: 3 N– Avg. latency: 3(N1/3 / 2)– Bisection B/W: 2 N2/3
– neighbor optimized– probably local optimized
(C) 2003 J. E. Smith CS/ECE 757 14
3D Torus, contd.
• Cost scales well• Latency and bandwidth do not scale as well
as hypercube, BUT– difference is relatively small for practical-sized systems
• Seems to have become an interconnect of choice:– Cray, Intel, Tera, DASH, etc.
• Routing can be tricky (deadlocks)• 3D Mesh similar, but without wraparound
![Page 8: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/8.jpg)
Page 8
(C) 2003 J. E. Smith CS/ECE 757 15
Crossbars
• Indirect interconnect• Performance/Cost:
– Switch cost: N 2
– Wire cost: 2 N– Avg. latency: const– Bisection B/W: N– Not neighbor optimized– Not local optimized
Processor
MemoryMemoryMemoryMemory
MemoryMemoryMemoryMemory
Processor
Processor
Processor
corssbar
(C) 2003 J. E. Smith CS/ECE 757 16
Crossbars, contd.
• Capable of broadcast• No network conflicts• Cost does not scale well• Often implemented with
muxes
mux
P
M
P
P
P
mux
M
mux
M
mux
M
![Page 9: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/9.jpg)
Page 9
(C) 2003 J. E. Smith CS/ECE 757 17
Multistage Interconnection Networks (MINs)
• AKA: Banyan, Baseline, Omega networks, etc.
• Indirect interconnect• Crossbars interconnected
with shuffles• Can be viewed as
overlapped MUX trees• Destination address
specifies the path• The shuffle interconnect
routes addresses in a binary fashion
• This can most easily be seen with MINs in the form at right
0101
1011
(C) 2003 J. E. Smith CS/ECE 757 18
MINs, contd.
• f switch outputs => decode log2 f bits in switch• Performance/Cost:
– Switch cost: (N log f N) / f– Wire cost: N log f N– Avg. latency: log f N– Bisection B/W: N– Not neighbor optimized
• Problem (in combination with latency growth)– Not local optimized
• Capable of broadcast• Commonly used in large UMA systems
![Page 10: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/10.jpg)
Page 10
(C) 2003 J. E. Smith CS/ECE 757 19
Multistage Nets, Equivalence
• By rearranging switches, multistage nets can be shown to be equivalent
A
B
C
D
A
B
C
D
A
B
C
D
E
F
G
H
A
B
C
D
E
F
G
H
E
F
G
H
E
F
G
H
I
J
K
L
M
N
O
P
I
J
M
N
K
L
O
P
(C) 2003 J. E. Smith CS/ECE 757 20
Fat Trees
• Tree-like, with constant bandwidth at all levels
• Closely related to MINs• Indirect interconnect• Performance/Cost:
– Switch cost: N log f N
– Wire cost: f N log f N
– Avg. latency: approx 2 log f N– Bisection B/W: f N– neighbor optimized– may be local optimized
• Capable of broadcastP
SM
P
SM
P
SM
P
SM
P
SM
P
SM
P
SM
P
SM
![Page 11: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/11.jpg)
Page 11
(C) 2003 J. E. Smith CS/ECE 757 21
Fat Trees, Contd.
• The MIN-derived Fat Tree, is, in fact, a Fat Tree:• However, the switching "nodes" in effect do not have full
crossbar connectivity
P
SM
P
SM
P
SM
P
SM
P
SM
P
SM
P
SM
P
SM
A B C D E F G H
I J K L M N O P
Q R S T U V W X
A B C D E F G H
I J K L M N O P
Q R S T U V W X
S S S S S S S S
M M M M M M M M
P P P P P P P P
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 22
Important Topologies
N = 1024
Type Degree Diameter Ave Dist Bisection Diam Ave D
1D mesh 2 N-1 2N/3 1
2D mesh 4 2(N1/2 - 1) 2N1/2 / 3 N1/2 63 21
3D mesh 6 3(N1/3 - 1) 3N1/3 / 3 N2/3 ~30 ~10
nD mesh 2n n(N1/n - 1) nN1/n / 3 N(n-1) / n
(N = kn)
Ring 2 N / 2 N/4 2
2D torus 4 N1/2 N1/2 / 2 2N1/2 32 16
k-ary n-cube 2n n(N1/n) nN1/n/2 15 8 (3D) (N = kn) nk/2 nk/4 2kn-1
Hypercube n n = LogN n/2 N/2 10 5
![Page 12: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/12.jpg)
Page 12
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 23
N = 1024
Type Degree Diameter Ave Dist Bisection Diam Ave D
2D Tree 3 2Log2 N ~2Log2 N 1 20 ~20
4D Tree 5 2Log4 N 2Log4 N - 2/3 1 10 9.33
kD k+1 Logk N
2D fat tree 4 Log2 N N
2D butterfly 4 Log2 N Log2 N N/2 20 20
Topologies (cont)
CM-5 Thinned Fat Tree
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 24
Butterfly
N/2 Butterfly
°°°
N/2 Butterfly
°°°
Benes Network
N Butterfly
ReversedN
Butterfly
°°°
• Routes all permutations w/o conflict
• Notice similarity to Fat Tree (Fold in half)
• Randomization is major breakthrough
• All paths equal length
• Unique path from anyinput to any output
• Conflicts cause tree saturation
Multistage: nodes at ends, switches in middle
![Page 13: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/13.jpg)
Page 13
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 25
Outline
• Topology
• Switching, Routing, & Deadlock
• Switch Design
• Flow Control
• Case Studies
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 26
Queue on each end• Can send both ways (“ Bi-directional, Full
Duplex” )• Rules for communication? “ protocol”
– Synchronous send » Need Request & Response signaling
– Name for standard group of bits sent: Packet
ABCs of Networks
• Starting Point: Send bits between 2 computers
![Page 14: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/14.jpg)
Page 14
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 27
A Simple Example
• What is the packet format?– Fixed? (for HW Interpretation)– Number bytes?
Request/Response
Address/Data
1 bit 32 bits
0: Please send data from Address1: Packet contains data corresponding to request
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 28
Questions About Simple Example
• What if more than 2 computers want to communicate?
– Need node identifier field (destination) in packet– Routing and topology
• What if packet is garbled in transit?– Add error detection field in packet (e.g., CRC)
• What if packet is lost?– More elaborate protocols to detect loss (e.g., NAK, time outs)
• What if multiple processes/machine?– Dispatch– Queue per process
• Questions such as these lead to more complex protocols and packet formats
![Page 15: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/15.jpg)
Page 15
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 29
General Packet Format
• Header– routing and control information
• Payload– carries data (non HW specific information)– can be further divided (framing, protocol stacks…)
• Error Code– generally at tail of packet so it can be generated on the way out
Header Payload Error Code
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 30
Message vs. Packet
• A Message may be composed of several packets• Applications reason about messages• Network transfers packets• Small fixed size packets. Problems?
Fragmentation and reassembly (SW overhead)• Variable Size packets. Problems?
Congestion
![Page 16: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/16.jpg)
Page 16
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 31
Packet Switched vs Circuit Switched
Circuit Switched• Establish Route then Send Data• Telephone systemPacket Switched• Route each packet individually• Delivery Guarantees
– Reliable– In order, what if not?
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 32
Routing
• Store-and-forward• Cut-through• Virtual cut-through• Wormhole
![Page 17: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/17.jpg)
Page 17
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 33
Store and Forward
• Store-and-forward policy: each switch waits for the full packet to arrive in the switch before it is sent on to the next switch
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 34
Cut Through
• Cut-through routing: switch examines the header, decides where to send the message, and then starts forwarding it immediately
![Page 18: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/18.jpg)
Page 18
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 35
Virtual Cut-Through
• What to do if output port is blocked?• Lets the tail continue when the head is blocked,
absorbing the whole message into a single switch. – Requires a buffer large enough to hold the largest packet.
• Degenerates to store-and-forward with high contention
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 36
Wormhole
• When the head of the message is blocked the message stays strung out over the network
– Potentially blocks other messages (needs only buffer the piece of the packet that is sent between switches).
– CM-5 used it, with each switch buffer being 4 bits per port.– Myrinet uses it
• Interaction with Pack Size• Can cause tree saturation…
![Page 19: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/19.jpg)
Page 19
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 37
Store and Forward vs. Cut-Through
• Advantage– Latency reduces from function of:
number of intermediate switches times the size of the packet
to
time for 1st part of the packet to negotiate the switches + the packet size ÷ interconnect BW
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 38
Routing Algorithm
• How do I know where a packet should go?• Arithmetic• Source-Based• Table Lookup• Adaptive—route based on network state
(e.g., contention)
![Page 20: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/20.jpg)
Page 20
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 39
Arithmetic Routing
• For regular topology, simple arithmetic to determine route
• 2D Mesh (Also called NEWS network)– packet header contains signed offset to destination– switch ++ or -- one field of header (x or y dimension)– when x == 0 and y == 0, then at correct processor
• Requires ALU in switch• Must recompute CRC
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 40
Source Based and Table Lookup Routing
Source Based• Source specifies output port for each switch in route• Very Simple Switches
– no control state– strip output port off header
• Myrinet uses thisTable Lookup• Very Small Header, index into table for output port• Big tables, must be kept up to date...
![Page 21: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/21.jpg)
Page 21
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 41
001
000
101
100
010 110
111011
Deterministic v.s. Adaptive Routing
• Deterministic—follows a pre-specified route
– mesh: dimension-order routing» (x1, y1) -> (x2, y2)» first Dx = x2 - x1,» then Dy = y2 - y1,
– hypercube: edge-cube routing» X = x0x1x2 . . .xn -> Y = y0y1y2 . . .yn» R = X xor Y» Traverse dimensions of differing
address in order– tree: common ancestor
• Adaptive—route determined by contention for output port
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 42
Adaptive Routing
• Essential for fault tolerance– at least multipath
• Can improve utilization of the network• Simple deterministic algorithms easily run into bad
permutations
• Fully/partially adaptive, minimal/non-minimal• Can introduce complexity or anomalies• Little adaptation goes a long way!
![Page 22: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/22.jpg)
Page 22
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 43
Deadlock
Guri Sohi
• Break a Necessary Condition– Use more than one resource– Not willing to release resource in use– Cycle in order of recourse use
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 44
Deadlock Free Routing
• Virtual Channels– Not virtual cut-through– Add buffers so flits of wormhole packets can be interleaved
• Up*-Down*– Number switches: higher = farther away from processors– route up, make one turn, route down
• Turn Model Routing– Restrict order of turns
» West First» North Last» Negative First
– Can increase number of hops
![Page 23: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/23.jpg)
Page 23
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 45
Minimal turn restrictions in 2D
West-f irst
north-last negative first
-x +x
+y
-y
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 46
Outline
• Topology
• Switching, Routing, & Deadlock
• Switch Design
• Flow Control
• Case Studies
![Page 24: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/24.jpg)
Page 24
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 47
A Generic Switch
• At minimum, must route inputs to outputs
InputBuffer
OutputBuffer
Cross-bar
ControlRouting, Scheduling
Receiver Transmitter
OutputPorts
InputPorts
VLSI makes it easier to create larger fully connected switches
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 48
Switch Design Issues
• ports, pin limited• data path (cross bar designs)• non-blocking crossbar• routing logic per input
– ALU– table– Finite State Machine for cut-through
![Page 25: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/25.jpg)
Page 25
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 49
Switch Buffering
• need to absorb some transient peaks• shared buffer pool
– need high bandwidth– one output could hog all buffer space
• input buffers
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 50
Input Buffering
• buffer per port• routing logic• head of line (HOL) blocking
– subsequent packet may be routed to unused output port
![Page 26: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/26.jpg)
Page 26
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 51
Output Buffering
• Buffers “ logically” associated with output– split on either side of cross bar
• Arbitration for physical link (output scheduling)– static priority– random– round-robin– oldest-first
• Effects of adaptive routing?– Select output based on availability– requires feedback from output port
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 52
Stacked Dimension Switch
• Uses only 2x2 switch to build higher dimension switch
2x2
2x2
2x2
zin zout
yin
xin
yout
xout
Hostin
Hostout
![Page 27: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/27.jpg)
Page 27
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 53
Outline
• Topology
• Switching, Routing, & Deadlock
• Switch Design
• Flow Control
• Case Studies
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 54
Congestion Control
• Packet switched networks do not reserve bandwidth; this leads to contention
• Solution: prevent packets from entering until contention is reduced (e.g., metering lights)
• Options:– End-to-end Flow Control– Link-level Flow Control
![Page 28: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/28.jpg)
Page 28
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 55
Flow Control
• Packet discarding: If a packet arrives at a switch and there is no room in the buffer, the packet is discarded
– no communication between switches, requires higher level protocol
• Flow control: between pairs of receivers and senders; use feedback to tell the sender when it is allowed to send the next packet
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 56
Link-Level Flow Control
Data
Ready
• Transfer single flit when receiver is ready• Could have long links with many flits in flight
![Page 29: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/29.jpg)
Page 29
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 57
Credit-based (Window) Flow Control
• Receiver gives N credits to sender– sender decrements count– stops sending if zero– receiver sends back credit as it drains its buffer– bundle credits to reduce overhead
• Must account for link latency
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 58
Water Level
• high water, low water• stop & go back to source switch (Myrinet)• can send redundant stop/go
Stop
Go
Incoming phits
Outgoing phits
![Page 30: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/30.jpg)
Page 30
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 59
Outline
• Topology
• Switching, Routing, & Deadlock
• Switch Design
• Flow Control
• Case Studies
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 60
Case Study Cray T3D
• 1024 switch nodes each connected to 2 processors• 3D Torus, bidirectional, 300 MB/s• Link: 16 bits, 8 control bits• Variable size packet (multiple of 16 bits)• Logical request & response networks
– 2 virtual channels each for deadlock
• Stacked dimension routing• Wormhole for large packets, virtual cut-through for
small packets
![Page 31: CS/ECE 757: Advanced Computer Architecture II (Parallel ...mr56/ece757/Lecture17.pdfTopologies (cont) CM-5 Thinned Fat Tree (C) 2001 Mark D. Hill from Adve,Falsafi, Lebeck, Reinhardt,](https://reader035.fdocuments.in/reader035/viewer/2022071404/60f7dc1a8f6dd7168c68b6d7/html5/thumbnails/31.jpg)
Page 31
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 61
IBM SP-2 (Vulcan)
• Switch has eight bidirectional 40 MB/s links• Link: 8 data bits, 1 tag, 1 reverse flow-control• Flit is 16 bits, phit is 8• input FIFO + output FIFO + central buffer 128 8-byte
segments
(C) 2001 Mark D. Hill from Adve,Falsafi , Lebeck, Reinhardt, & Singh CS/ECE 757 62
Real Machines
• Wide links, smaller routing delay• Tremendous variation