CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.
Date posted: 22-Dec-2015
![Page 1: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/1.jpg)
CSIT560 by M. Hamdi
Packet Scheduling/Arbitration in Virtual Output Queues and Others
![Page 2: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/2.jpg)
Key Characteristics in Designing Internet Switches and Routers
1. Scalability in terms of line rates
2. Scalability in terms of number of interfaces (port numbers)
![Page 3: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/3.jpg)
Switch/Router Architecture Comparison
http://www.lightreading.com/document.asp?doc_id=47959
![Page 4: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/4.jpg)
Head-of-Line Blocking
[Figure: queued cells marked "Blocked!"]
![Page 5: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/5.jpg)
![Page 6: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/6.jpg)
![Page 7: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/7.jpg)
Crossbar Switches: Virtual Output Queues
• Virtual Output Queues:
– At each input port, there are N queues, each associated with an output port
– Only one packet can go from an input port at a time
– Only one packet can be received by an output port at a time
• It retains the scalability of FIFO input-queued switches (no memory bandwidth problem)
• It eliminates the HoL problem of FIFO input queues
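The queue organization described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `VOQInputPort` and its methods are hypothetical and do not come from the slides.

```python
from collections import deque


class VOQInputPort:
    """One input port holding N FIFO queues, one per output port (hypothetical sketch)."""

    def __init__(self, num_outputs):
        self.queues = [deque() for _ in range(num_outputs)]

    def enqueue(self, cell, output):
        # An arriving cell joins the queue of the output it is destined for.
        self.queues[output].append(cell)

    def requests(self):
        # One boolean per output: True if a cell is waiting for that output.
        return [len(q) > 0 for q in self.queues]

    def dequeue(self, output):
        # Called when the scheduler matches this input to `output`.
        return self.queues[output].popleft()


port = VOQInputPort(num_outputs=3)
port.enqueue("cell-a", output=0)
port.enqueue("cell-b", output=1)
# Even if output 0 is busy, the cell for output 1 can still be sent,
# which is why a cell at the head of one queue cannot block the others.
sent = port.dequeue(1)
```

With a single FIFO per input, "cell-b" would have had to wait behind "cell-a".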
![Page 8: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/8.jpg)
Virtual Output Queues
![Page 9: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/9.jpg)
VOQs: How Packets Move
[Figure: VOQs at the inputs and the scheduler]
![Page 10: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/10.jpg)
Crossbar Scheduler in VOQ Architecture
[Figure: crossbar with scheduler; memory b/w = 2R. The scheduler can be quite complex!]
![Page 11: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/11.jpg)
Question: do more lanes help?
• Answer: it depends on the scheduling
[Figure panels: Head of Line Blocking; VOQs with Bad Scheduling; Good Scheduling? Ayalon: depends on traffic matrix…]
![Page 12: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/12.jpg)
Crossbar Scheduler in VOQ Architecture
Which packets can I send during each configuration of the crossbar?
![Page 13: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/13.jpg)
Switch core architecture
[Figure: Port #1 through Port #256, each with a port processor and optics running the LCS protocol, attached to a crossbar. The scheduler (like the processor of a computer) exchanges Request and Grant/Credit messages with the ports; Cell Data passes through the crossbar.]
![Page 14: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/14.jpg)
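Basic Switch Model
[Figure: N×N switch. Arrivals A1(n), ..., AN(n) at the inputs are split into per-queue arrivals A11(n), ..., A1N(n), ..., AN1(n), ..., ANN(n); queue occupancies are L11(n), ..., LNN(n); the crossbar configuration is S(n); departures at the outputs are D1(n), ..., DN(n).]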
![Page 15: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/15.jpg)
Some definitions
1. Traffic matrix: $\Lambda = [\lambda_{ij}]$, where $\lambda_{ij} = E[A_{ij}(n)]$.
If $\sum_i \lambda_{ij} < 1$ and $\sum_j \lambda_{ij} < 1$, we say the traffic is "admissible."
2. Service matrix: $S = [s_{ij}]$, where $s_{ij} \in \{0, 1\}$ and $S$ is a permutation matrix.
3. Queue occupancies: $L_{11}(n), \ldots, L_{NN}(n)$.
![Page 16: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/16.jpg)
Some possible performance goals
When traffic is admissible:
1. Work conservation
2. "100% throughput"
3. $L_{ij}(n) \le C, \ \forall n$
4. Delay is bounded
5. Other metrics...?
![Page 17: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/17.jpg)
VOQ Switch Scheduling
[Figure: bipartite graph with input ports A–F on the left and output ports 1–6 on the right]
• The VOQ switch scheduling can be represented by a bipartite graph
– The left-hand side nodes of the bipartite graph are the input ports
– The right-hand side nodes of the bipartite graph are the output ports
– The edges between the nodes are requests for packet transmission between input ports and output ports.
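As a rough illustration of this representation, the request graph can be derived directly from the VOQ occupancies. The function name `request_graph` and the list-of-lists layout are assumptions made for this sketch.

```python
def request_graph(occupancy):
    """Return the edges (input, output) of the bipartite request graph.

    occupancy[i][j] is the number of cells queued at input i for output j.
    An edge exists whenever that VOQ is non-empty.
    """
    return [
        (i, j)
        for i, row in enumerate(occupancy)
        for j, queued in enumerate(row)
        if queued > 0
    ]


# Hypothetical 3x3 occupancy matrix used only for illustration.
occupancy = [
    [2, 0, 1],
    [0, 0, 0],
    [4, 3, 0],
]
edges = request_graph(occupancy)
```

The scheduler then has to pick a matching, that is, a subset of these edges in which no input and no output appears twice.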
![Page 18: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/18.jpg)
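Maximum size bipartite match
• Intuition: maximizes instantaneous throughput
[Figure: the “Request” graph has an edge from input i to output j whenever Lij(n) > 0 (for example L11(n) > 0, LN1(n) > 0); the maximum size match is the bipartite match with the most edges]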
![Page 19: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/19.jpg)
Network flows and bipartite matching
Finding a maximum size bipartite matching is equivalent to solving a network flow problem with capacities and flows of size “1”.
[Figure: source s connected to inputs A–F, request edges from inputs to outputs 1–6, and outputs connected to sink t]
![Page 20: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/20.jpg)
Network Flows
[Figure: flow network with source s, sink t, and intermediate nodes a, b, c, d; edge capacities are 10 or 1]
• Let G=[V,E] be a directed graph with capacity cap(v,w) on edge [v,w].
• A flow is an (integer) function, f, that is chosen for each edge so that f(v,w) <= cap(v,w), and so that the flow entering each node other than s and t equals the flow leaving it.
• We wish to maximize the flow allocation.
![Page 21: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/21.jpg)
A maximum network flow example: by inspection
[Figure: the same flow network (source s, sink t, nodes a, b, c, d), with edges labeled "capacity, flow"]
Step 1: Flow is of size 10
![Page 22: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/22.jpg)
A maximum network flow example
[Figure: the same flow network, with edges labeled "capacity, flow"]
Step 2: Flow is of size 10+1 = 11
Maximum flow: Flow is of size 10+2 = 12
Not obvious
![Page 23: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/23.jpg)
Ford-Fulkerson method of augmenting paths
1. Set f(v,w) = -f(w,v) on all edges.
2. Define a Residual Graph, R, in which res(v,w) = cap(v,w) – f(v,w)
3. Find paths from s to t for which there is positive residue.
4. Increase the flow along the paths to augment them by the minimum residue along the path.
5. Keep augmenting paths until there are no more to augment.
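The augmenting-path steps above can be sketched in Python as follows. This is a minimal sketch rather than the course's own code: it picks the shortest augmenting path with a breadth-first search (the Edmonds-Karp variant of the method), and the function name `max_flow` and the dictionary layout are assumptions. The example graph reuses the node names and capacities of the six-node worked example later in the deck; the edge directions are assumed.

```python
from collections import deque


def max_flow(cap, s, t):
    """Maximum flow from s to t; cap[v][w] is the capacity of edge (v, w)."""
    # Residual capacities, with reverse edges initialised to 0.
    res = {}
    for v, nbrs in cap.items():
        for w, c in nbrs.items():
            res.setdefault(v, {})[w] = res.get(v, {}).get(w, 0) + c
            res.setdefault(w, {}).setdefault(v, 0)
    res.setdefault(s, {})
    res.setdefault(t, {})

    total = 0
    while True:
        # Breadth-first search for a path with positive residue.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            v = queue.popleft()
            for w, r in res[v].items():
                if r > 0 and w not in parent:
                    parent[w] = v
                    queue.append(w)
        if t not in parent:
            return total  # no augmenting path is left

        # Minimum residue along the path.
        bottleneck = float("inf")
        w = t
        while parent[w] is not None:
            bottleneck = min(bottleneck, res[parent[w]][w])
            w = parent[w]

        # Augment: push flow forward and open the reverse edges.
        w = t
        while parent[w] is not None:
            v = parent[w]
            res[v][w] -= bottleneck
            res[w][v] += bottleneck
            w = v
        total += bottleneck


# Assumed edge layout for the s, a, b, c, d, t example.
cap = {
    "s": {"a": 16, "c": 13},
    "a": {"b": 12, "c": 10},
    "b": {"c": 9, "t": 20},
    "c": {"a": 4, "d": 11},
    "d": {"b": 7, "t": 4},
}
flow_value = max_flow(cap, "s", "t")  # 23 for this graph
```

The reverse edges in `res` are what let a later path undo part of an earlier choice, which a "by inspection" allocation cannot do.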
![Page 24: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/24.jpg)
Example of Residual Graph
[Figure: left, the flow of Step 1 (flow is of size 10); right, the residual graph R, where res(v,w) = cap(v,w) – f(v,w), with an augmenting path marked]
![Page 25: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/25.jpg)
![Page 26: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/26.jpg)
Example of Residual Graph
[Figure: left, Step 2 (flow is of size 10+1 = 11); right, the residual graph, with an augmenting path marked]
![Page 27: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/27.jpg)
Example of Residual Graph
[Figure: left, Step 3 (flow is of size 10+2 = 12); right, the residual graph]
![Page 28: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/28.jpg)
Another Example: Ford-Fulkerson method
[Figure: graph G with nodes s, a, b, c, d, t and its residual graph Gf. Starting from f=0, find augmenting path p and push 4 units along it: f=4]
![Page 29: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/29.jpg)
Another Example: Ford-Fulkerson method
[Figure: G and Gf at f=4. Find augmenting path p and push 12 units along it: f=4+12]
![Page 30: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/30.jpg)
Another Example: Ford-Fulkerson method
[Figure: G and Gf at f=16. Find augmenting path p and push 7 units along it: f=16+7]
![Page 31: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/31.jpg)
Another Example: Ford-Fulkerson method
[Figure: G and Gf at f=23]
No more augmenting path
Maximum Flow is 23
![Page 32: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/32.jpg)
An example for Flow: Obvious solution
[Figure: input graph G from S to T with edge capacities 10 and 9, shown with its flow graph Gf and residual graph Gr after each step]
Total flow = 10, Sub-optimal solution!
![Page 33: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/33.jpg)
Flow algorithm – Optimal version
[Figure: the same input graph G, shown with its flow graph Gf and residual graph Gr after each step]
Total flow = 10 + 9 = 19 units!
![Page 34: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/34.jpg)
Complexity of network flow problems
• In general, it is possible to find a solution by considering at most V·E paths, by picking the shortest augmenting path first.
• There are many variations, such as picking the path with the largest augmenting capacity first.
• The complexity of the algorithm is lower when the graph is bipartite.
• There are techniques other than the Ford-Fulkerson method.
![Page 35: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/35.jpg)
Ford-Fulkerson Algorithm – 1
[Figure: source connected to inputs a–f, request edges to outputs 1–6, and outputs connected to sink]
Network flows and bipartite matching
Finding a maximum size bipartite matching is equivalent to solving a network flow problem with capacities and flows of size “1”.
![Page 36: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/36.jpg)
Ford-Fulkerson Algorithm – 2
[Figure: same bipartite flow network (source, inputs a–f, outputs 1–6, sink)]
Increasing the flow by 1.
![Page 37: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/37.jpg)
Ford-Fulkerson Algorithm – 3
[Figure: same bipartite flow network (source, inputs a–f, outputs 1–6, sink)]
Increasing the flow by 1.
![Page 38: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/38.jpg)
Ford-Fulkerson Algorithm – 4
[Figure: same bipartite flow network (source, inputs a–f, outputs 1–6, sink)]
Increasing the flow by 1.
![Page 39: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/39.jpg)
Ford-Fulkerson Algorithm – 5
[Figure: same bipartite flow network (source, inputs a–f, outputs 1–6, sink)]
Increasing the flow by 1.
![Page 40: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/40.jpg)
Ford-Fulkerson Algorithm – 6
[Figure: same bipartite flow network (source, inputs a–f, outputs 1–6, sink)]
Increasing the flow by 1.
![Page 41: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/41.jpg)
Ford-Fulkerson Algorithm – 7
[Figure: same bipartite flow network (source, inputs a–f, outputs 1–6, sink)]
Augmenting flow along the augmenting path.
![Page 42: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/42.jpg)
Ford-Fulkerson Algorithm – 8
[Figure: same bipartite flow network (source, inputs a–f, outputs 1–6, sink)]
Maximum flow found! Thus maximum matching found.
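For unit capacities, the flow computation specializes to repeatedly searching for an augmenting path from each unmatched input. The sketch below illustrates that idea under assumptions: the function name `maximum_matching`, the adjacency-list input, and the small example graph are hypothetical and are not the graph drawn on the slides.

```python
def maximum_matching(adj, num_outputs):
    """Maximum size bipartite matching by augmenting paths.

    adj[i] lists the outputs that input i requests.
    Returns (size, match_of_output), where match_of_output[j] is the input
    matched to output j, or None if output j is unmatched.
    """
    match_of_output = [None] * num_outputs

    def try_augment(i, visited):
        # Try to match input i, re-routing earlier matches if needed.
        for j in adj[i]:
            if j in visited:
                continue
            visited.add(j)
            if match_of_output[j] is None or try_augment(match_of_output[j], visited):
                match_of_output[j] = i
                return True
        return False

    size = 0
    for i in range(len(adj)):
        if try_augment(i, set()):
            size += 1
    return size, match_of_output


# Input 0 requests outputs 0 and 1, input 1 only output 0, input 2 outputs 1 and 2.
size, match_of_output = maximum_matching([[0, 1], [0], [1, 2]], num_outputs=3)
```

Here input 0 is first matched to output 0, then moved to output 1 so that input 1 can take output 0; this re-routing is the "augmenting flow along the augmenting path" step.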
![Page 43: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/43.jpg)
Complexity of Maximum Matchings
• Maximum Size/Cardinality Matchings:
– Algorithm by Dinic: $O(N^{5/2})$
• Maximum Weight Matchings:
– Algorithm by Kuhn: $O(N^3 \log N)$
• ftp://dimacs.rutgers.edu/pub/netflow/matching/ (contains code for maximum size/weight matching algorithms)
• In general:
– Hard to implement in hardware
– Slooooow.
![Page 44: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/44.jpg)
Maximum size bipartite match
• Intuition: maximizes instantaneous throughput
• for uniform traffic.
[Figure: “Request” graph (edges where L11(n)>0, ..., LN1(n)>0) and its maximum size bipartite match, annotated with $E[L_{ij}(n)]$]
![Page 45: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/45.jpg)
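Why doesn’t maximizing instantaneous throughput give 100% throughput for non-uniform traffic?
[Figure: 3×3 switch with arrival rates $\lambda_{11} = \lambda_{12} = \lambda_{21} = \lambda_{32} = 1/2 - \delta$; three possible matches, S(n)]
Assume that at time $n$, $L_{11}(n) > 0$, $L_{12}(n) > 0$, and both $Q_{21}(n)$ and $Q_{32}(n)$ have arrivals (w.p. $(1/2 - \delta)^2$). Input 1 is serviced w.p. $2/3$.
The total rate at which input 1 is served is at most
$$\tfrac{2}{3}\left(\tfrac{1}{2} - \delta\right)^2 + 1 \cdot \left[1 - \left(\tfrac{1}{2} - \delta\right)^2\right] = 1 - \tfrac{1}{3}\left(\tfrac{1}{2} - \delta\right)^2.$$
But
$$\lambda_{11} + \lambda_{12} = 1 - 2\delta > 1 - \tfrac{1}{3}\left(\tfrac{1}{2} - \delta\right)^2 \quad \text{for small } \delta.$$
And so if $\delta < 0.0358$ the switch is not stable (throughput < 100%).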
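The threshold on δ can be checked numerically. The sketch below assumes the reading of the example used here: input 1 receives traffic at rate 1 − 2δ and is served at a rate of at most 1 − (1/3)(1/2 − δ)². The function names are hypothetical.

```python
import math


def arrival_rate_input1(delta):
    # lambda_11 + lambda_12, each equal to 1/2 - delta.
    return 1 - 2 * delta


def service_bound_input1(delta):
    # Upper bound on the rate at which input 1 is served.
    return 1 - (0.5 - delta) ** 2 / 3


def critical_delta():
    # Arrival rate equals the service bound when
    # 2*delta = (1/3) * (1/2 - delta)^2, i.e. delta^2 - 7*delta + 1/4 = 0.
    return (7 - math.sqrt(48)) / 2


threshold = critical_delta()
```

The smaller root of the quadratic is about 0.0359, in line with the 0.0358 quoted on the slide; below it, arrivals at input 1 exceed what maximum size matching can serve.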
![Page 46: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/46.jpg)
Maximum weight matching
[Figure: N×N switch with arrivals A11(n), ..., ANN(n), queue occupancies L11(n), ..., LNN(n) and departures D1(n), ..., DN(n); the “Request” graph has edge weights L11(n), ..., LN1(n), and the maximum weight match is S*(n)]
$$S^*(n) = \arg\max_{S(n)} \left( L^T(n)\, S(n) \right)$$
• Weight could be length of queue or age of packet
• Achieves 100% throughput under all admissible traffic patterns
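For a very small switch, the arg max can be evaluated by brute force over all permutations. This is an illustrative sketch only (the cost grows as N!, so it is not how a real scheduler would do it); the function name `max_weight_match` and the example weights are assumptions.

```python
from itertools import permutations


def max_weight_match(weights):
    """Return (perm, total) maximizing sum of weights[i][perm[i]].

    weights[i][j] is the weight of VOQ (i, j), for example its queue length;
    perm[i] is the output matched to input i.
    """
    n = len(weights)

    def total(perm):
        return sum(weights[i][perm[i]] for i in range(n))

    best = max(permutations(range(n)), key=total)
    return best, total(best)


# Hypothetical 2x2 queue lengths.
weights = [
    [8, 6],
    [4, 0],
]
best_perm, best_weight = max_weight_match(weights)
```

In this example the heaviest single queue (weight 8) is not served: matching input 0 to output 1 and input 1 to output 0 gives total weight 10, against 8 for the other permutation.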
![Page 47: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/47.jpg)
Packet Scheduling/Arbitration in Virtual Output Queues:
Maximal Matching Algorithms
![Page 48: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/48.jpg)
Maximum Matching in VOQ Architecture
[Figure: 4×4 request graphs (inputs 1–4, outputs 1–4) comparing a maximum size matching with a maximum weight matching; the edge weights shown are 8, 6, 4, 2, 1, 3, 1]
![Page 49: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/49.jpg)
Complexity of Maximum Matchings
• Maximum Size/Cardinality Matchings:
– Algorithm by Dinic: $O(N^{5/2})$
• Maximum Weight Matchings:
– Algorithm by Kuhn: $O(N^3 \log N)$
• In general:
– Hard to implement in hardware
– Slooooow.
![Page 50: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/50.jpg)
Maximal Matching
• A maximal matching is a matching in which each edge is added one at a time, and is not later removed from the matching.
• i.e., no augmenting paths are allowed (they remove edges added earlier), as in the "by inspection" flow example.
• No input and output are left unnecessarily idle.
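A greedy pass over the request edges is one simple way to obtain such a matching. The sketch below is an illustration under assumptions (the function name and the edge order are hypothetical); edges are added one at a time and never removed.

```python
def greedy_maximal_matching(edges):
    """Add each (input, output) edge if both endpoints are still free."""
    used_inputs, used_outputs, matching = set(), set(), []
    for i, j in edges:
        if i not in used_inputs and j not in used_outputs:
            matching.append((i, j))
            used_inputs.add(i)
            used_outputs.add(j)
    return matching


# Input 0 requests outputs 0 and 1; input 1 requests output 0 only.
edges = [(0, 0), (0, 1), (1, 0)]
matching = greedy_maximal_matching(edges)
```

Here the greedy result has one edge, while the maximum matching {(0, 1), (1, 0)} has two. The result is still maximal: no remaining edge joins a free input to a free output.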
![Page 51: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/51.jpg)
Example of Maximal Size Matching
[Figure: two bipartite graphs with inputs A–F and outputs 1–6, one showing a maximal matching and the other a maximum matching]
![Page 52: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/52.jpg)
Comments on Maximal Matchings
• In general, maximal matching is much simpler to implement, and has a much faster running time.
• A maximal size matching is at least half the size of a maximum size matching.
• A maximal weight matching is defined in the obvious way.
• A maximal weight matching is at least half the weight of a maximum weight matching.
![Page 53: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/53.jpg)
PIM Maximal Size Matching Algorithm: Performance and Properties
• It is among the very first practical schedulers proposed for VOQ architectures (used by DEC).
• It is based on having arbiters at the inputs and outputs
• It iterates the following steps until no more requests can be accepted (or for a given number of iterations):
1. Request: Each unmatched input sends a request to every output for which it has a queued cell
2. Grant (outputs): If an unmatched output receives any request, it grants one by randomly selecting a request uniformly over all requests.
3. Accept (inputs): If an unmatched input receives a grant, it accepts one by selecting an output randomly among those granted to this input.
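The three steps can be simulated in a few lines. This is a sketch under assumptions, not the DEC implementation: the function name `pim`, the boolean request matrix, and the injectable random generator are choices made for illustration.

```python
import random


def pim(requests, iterations=4, rng=None):
    """Parallel Iterative Matching on an N x N boolean request matrix.

    requests[i][j] is True if input i has a queued cell for output j.
    Returns a dict mapping each matched input to its output.
    """
    rng = rng or random.Random()
    n = len(requests)
    match = {}              # input -> output
    matched_outputs = set()
    for _ in range(iterations):
        # Request + Grant: each unmatched output picks one requesting
        # unmatched input uniformly at random.
        grants = {}         # input -> outputs that granted it
        for out in range(n):
            if out in matched_outputs:
                continue
            reqs = [i for i in range(n) if i not in match and requests[i][out]]
            if reqs:
                grants.setdefault(rng.choice(reqs), []).append(out)
        if not grants:
            break           # no more requests can be accepted
        # Accept: each granted input picks one grant uniformly at random.
        for inp, outs in grants.items():
            out = rng.choice(outs)
            match[inp] = out
            matched_outputs.add(out)
    return match


all_requests = [[True] * 4 for _ in range(4)]
result = pim(all_requests, iterations=4, rng=random.Random(0))
```

Every iteration that still has a request between an unmatched input and an unmatched output adds at least one match, so N iterations always suffice to reach a maximal matching.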
![Page 54: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/54.jpg)
Implementation of the parallel maximal matching algorithms
[Figure: state of input queues (N² bits) feeding grant arbiters 1, 2, ..., N, then request arbiters 1, 2, ..., N, then the decision register]
![Page 55: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/55.jpg)
Implementation of the parallel maximal matching algorithms (another similar way)
[Figure: three identical stages, each consisting of Request Buffer, Grant Arbiter and Accept Arbiter, with "New Request" and "Decision" signals]
![Page 56: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/56.jpg)
PIM Maximal Size Matching Algorithm: Performance and Properties
PIM: 1st Iteration
[Figure: 4×4 example. Step 1: Request. Step 2: Grant (random selection). Step 3: Accept (random selection).]
![Page 57: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/57.jpg)
PIM Maximal Size Matching Algorithm: Performance and Properties
PIM: 2nd Iteration
[Figure: 4×4 example. Step 1: Request. Step 2: Grant. Step 3: Accept.]
![Page 58: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/58.jpg)
Traffic Types to evaluate Algorithms
[Figure: three example traffic matrices: Uniform traffic, Unbalanced traffic, Hotspot traffic]
![Page 59: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/59.jpg)
Parallel Iterative Matching
PIM with a single iteration
![Page 60: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/60.jpg)
Parallel Iterative Matching
PIM with 4 iterations
![Page 61: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/61.jpg)
Parallel Iterative Matching: Analytical Results
Number of iterations to converge:
$$E[U_i] \le \frac{N^2}{4^i}, \qquad E[C] \le \log N$$
where
C = # of iterations required to resolve connections
N = # of ports
U_i = # of unresolved connections after iteration i
![Page 62: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/62.jpg)
PIM Maximal Size Matching Algorithm: Performance and Properties
• It is a fair algorithm in servicing inputs
• Can have 100% throughput under uniform traffic
• It converges in log N iterations (on average) to a maximal size matching
• It has a very poor performance (63% throughput) with 1 iteration, because several outputs can grant the same input and all but one of those grants are wasted
• It is not easy to build random arbiters in hardware
• The best iterative maximal size matching algorithm takes $O(N^2 \log N)$ serial or $O(\log N)$ parallel time steps.
• If the number of iterations is constant, then it can be implemented in constant time (that is why it is practical); however, the hardware design is not trivial.
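The 63% figure can be reproduced with a short calculation. The sketch assumes a saturated switch in which every input requests every output, so each of the N outputs grants one of the N inputs uniformly at random; an input is matched if at least one output picks it. The function name is hypothetical.

```python
import math


def pim_single_iteration_throughput(n):
    """Probability that a given input receives at least one grant.

    Each of the n outputs independently grants this input with
    probability 1/n, so it is missed by all of them w.p. (1 - 1/n)^n.
    """
    return 1 - (1 - 1 / n) ** n


small_switch = pim_single_iteration_throughput(16)
large_switch = pim_single_iteration_throughput(10**6)
limit = 1 - math.exp(-1)  # about 0.632
```

As N grows the value approaches 1 − 1/e, which is the roughly 63% throughput quoted for a single iteration.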
![Page 63: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/63.jpg)
RRM Maximal Size Matching Algorithm: Performance and Properties
• Round Robin Matching (RRM) is easier to implement than PIM (in terms of designing the I/O arbiters).
• The pointers of the arbiters move in a straightforward way
• It iterates the following steps until no more requests can be accepted (or for a given number of iterations):
• Request. Each input sends a request to every output for which it has a queued cell.
• Grant. If an output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output notifies each input whether or not its request was granted. The pointer g_i to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input. If no request is received, the pointer stays unchanged.
![Page 64: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/64.jpg)
RRM Maximal Size Matching Algorithm: Performance and Properties
• Accept. If an input receives a grant, it accepts the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The pointer a_i to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the accepted output. If no grant is received, the pointer stays unchanged.
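One time slot of RRM with a single iteration can be sketched as below. The names `rr_pick` and `rrm_slot` and the boolean request matrix are assumptions made for illustration; the pointer updates follow the Grant and Accept rules stated above.

```python
def rr_pick(candidates, pointer, n):
    """First index at or after `pointer` (wrapping modulo n) that is in candidates."""
    for k in range(n):
        idx = (pointer + k) % n
        if idx in candidates:
            return idx
    return None


def rrm_slot(requests, grant_ptr, accept_ptr):
    """Run one Request/Grant/Accept round; pointers are updated in place."""
    n = len(requests)
    grants = {}  # input -> set of outputs that granted it
    for out in range(n):
        reqs = {i for i in range(n) if requests[i][out]}
        chosen = rr_pick(reqs, grant_ptr[out], n)
        if chosen is not None:
            grants.setdefault(chosen, set()).add(out)
            # RRM: the grant pointer moves whether or not the grant is accepted.
            grant_ptr[out] = (chosen + 1) % n
    match = {}   # input -> accepted output
    for inp, outs in grants.items():
        out = rr_pick(outs, accept_ptr[inp], n)
        match[inp] = out
        accept_ptr[inp] = (out + 1) % n
    return match


# 2x2 switch in which every VOQ always has a cell waiting.
requests = [[True, True], [True, True]]
grant_ptr, accept_ptr = [0, 0], [0, 0]
cells_sent = sum(len(rrm_slot(requests, grant_ptr, accept_ptr)) for _ in range(100))
```

In this saturated 2×2 case both grant pointers move in lockstep, so both outputs keep granting the same input and only one cell is sent per slot: 100 cells in 100 slots out of a possible 200.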
![Page 65: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/65.jpg)
RRM Maximal Matching Algorithm (1)
[Figure: 4×4 example with inputs 0–3 and outputs 0–3. Step 1: Request]
![Page 66: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/66.jpg)
RRM Maximal Matching Algorithm (2)
[Figure: same 4×4 example, with the round-robin grant pointers shown. Step 2: Grant]
![Page 67: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/67.jpg)
![Page 68: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/68.jpg)
![Page 69: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/69.jpg)
![Page 70: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/70.jpg)
RRM Maximal Matching Algorithm (3)
[Figure: same 4×4 example, with the round-robin grant and accept pointers shown. Step 3: Accept]
![Page 71: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/71.jpg)
![Page 72: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/72.jpg)
![Page 73: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/73.jpg)
Poor performance of RRM Maximal Matching Algorithm
[Figure: 2×2 example (inputs 0–1, outputs 0–1) shown over successive time slots]
50% Throughput
![Page 74: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/74.jpg)
iSLIP Maximum Size Matching Algorithm: Performance and Properties
• It is a scheduler used in most VOQ switches (e.g., Cisco).
• It is exactly like the RRM algorithm, with the following change to the Grant step:
• Grant. If an output receives any requests, it chooses the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The output notifies each input whether or not its request was granted. The pointer gi to the highest-priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input if and only if the grant is accepted in the Accept phase.
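The difference between the two grant-pointer rules can be sketched in a few lines of Python. This is a minimal single-iteration model, not the hardware design; the function names `schedule_slot` and `throughput`, the `policy` strings, and the choice of a fully backlogged switch are assumptions made for illustration.

```python
def schedule_slot(requests, grant_ptr, accept_ptr, policy):
    """One request-grant-accept iteration; returns a list of (input, output) matches.

    requests[i][j] is True when input i has a cell queued for output j.
    grant_ptr[j] and accept_ptr[i] are round-robin pointers, updated in place.
    policy is "rrm" (grant pointer always advances) or "islip" (grant pointer
    advances only when the grant is accepted). Both names are illustrative.
    """
    n = len(requests)
    # Grant: each output picks the first requesting input at or after its pointer.
    grants = {}  # output -> granted input
    for j in range(n):
        for k in range(n):
            i = (grant_ptr[j] + k) % n
            if requests[i][j]:
                grants[j] = i
                break
    # Accept: each input picks the first granting output at or after its pointer.
    matches = []
    for i in range(n):
        offers = [j for j, g in grants.items() if g == i]
        if not offers:
            continue
        j = min(offers, key=lambda out: (out - accept_ptr[i]) % n)
        matches.append((i, j))
        accept_ptr[i] = (j + 1) % n
    # Grant pointer update: the only place where the two policies differ.
    accepted = {j for _, j in matches}
    for j, i in grants.items():
        if policy == "rrm" or j in accepted:
            grant_ptr[j] = (i + 1) % n
    return matches


def throughput(policy, n=2, slots=1000):
    """Fraction of output capacity used when every VOQ is always backlogged."""
    requests = [[True] * n for _ in range(n)]
    grant_ptr, accept_ptr = [0] * n, [0] * n
    served = sum(len(schedule_slot(requests, grant_ptr, accept_ptr, policy))
                 for _ in range(slots))
    return served / (n * slots)
```

Under these assumptions, `throughput("rrm")` stays at one match per slot on the 2x2 switch (50%), because both grant pointers move together, while `throughput("islip")` approaches 1.0 once the pointers desynchronize after the first slot.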
![Page 75: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/75.jpg)
iSLIP Maximum Size Matching Algorithm
iSlip: 1st Iteration
[Figure: 4x4 switch with inputs 1-4 and outputs 1-4, in three panels: Step 1: Request, Step 2: Grant, Step 3: Accept. Legend: original pointer, selected one, updated pointer]
![Page 76: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/76.jpg)
iSLIP Maximum Size Matching Algorithm
iSlip: 2nd Iteration
[Figure: 4x4 switch with inputs 1-4 and outputs 1-4, in three panels: Step 1: Request, Step 2: Grant, Step 3: Accept. Legend: original pointer, selected one, updated pointer. Annotation: No change]
![Page 77: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/77.jpg)
Simple Iterative Algorithms: iSlip
Step 1: Request
[Figure: 4x4 switch with inputs 0-3 and outputs 0-3]
![Page 78: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/78.jpg)
Simple Iterative Algorithms: iSlip
Step 2: Grant
[Figure: 4x4 switch with inputs 0-3, outputs 0-3, and round-robin pointers]
![Page 79: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/79.jpg)
![Page 80: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/80.jpg)
Simple Iterative Algorithms: iSlip
Step 3: Accept
[Figure: 4x4 switch with inputs 0-3, outputs 0-3, and round-robin pointers]
![Page 81: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/81.jpg)
![Page 82: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/82.jpg)
![Page 83: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/83.jpg)
![Page 84: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/84.jpg)
![Page 85: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/85.jpg)
iSLIP Implementation
[Figure: N grant arbiters (1, 2, ..., N) connected to N accept arbiters (1, 2, ..., N). Labels: State (N bits), Decision (log2 N bits), Programmable Priority Encoder]
![Page 86: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/86.jpg)
Hardware Design
Layout of the 256-bit priority encoder
• Layout size: 292 μm x 273 μm
• Post-layout simulation delay: 2.7 ns
![Page 87: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/87.jpg)
Hardware Design
Layout of the 256-bit grant arbiter
[Figure: block diagram with filter, two priority encoders (P.E.), MUX, flipping stages, pointer & mask, and latch]
• Layout size: 1016 μm x 985 μm
• Post-layout simulation delay (filter to the latch): 2.3 ns
• Post-layout simulation delay (P.E. to the flipping): 4.06 ns
![Page 88: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/88.jpg)
FIRM Maximum Size Matching Algorithm: Performance and Properties
• It is exactly like iSLIP, with one very small yet significant modification.
• Grant (outputs): If an unmatched output receives a request, it grants the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The output notifies each input whether or not its request is granted. If the input accepts the grant, the pointer to the highest-priority element of the round-robin schedule is incremented to one location beyond the granted input. If the input does not accept, the pointer is set to the granted input.
![Page 89: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/89.jpg)
Simple Iterative Algorithms: FIRM
Step 3: Accept
[Figure: 4x4 switch with inputs 0-3, outputs 0-3, and round-robin pointers]
![Page 90: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/90.jpg)
Pointer Synchronization
• Why this is good: this small change prevents the output arbiters from moving in lock-step (being synchronized, i.e., pointing to the same input), which leads to a dramatic improvement in performance.
• If several outputs grant the same input, then whichever grant this input accepts, only one match can be made, and the other outputs will be idle.
• To get as many matches as possible, it is better that each output grants a different input.
• Since each output selects its highest-priority input whenever that input sends a request, it is better to keep the output pointers desynchronized (pointing to different locations).
![Page 91: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/91.jpg)
iSLIP Maximal Matching Algorithm
[Figure: 2x2 switch with inputs 0-1 and outputs 0-1, shown over successive time slots]
100% Throughput
![Page 92: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/92.jpg)
Pointer Synchronization: Differences between RRM, iSlip & FIRM
[Plot: average number of synchronized output schedulers (0 to 35) versus normalized load (0.1 to 1) for RRM, iSlip, and FIRM; 32x32 switch under uniform traffic]
![Page 93: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/93.jpg)
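Differences between RRM, iSlip & FIRM

| Pointer | Event | RRM | iSlip | FIRM |
|---|---|---|---|---|
| Input | No grant | unchanged | unchanged | unchanged |
| Input | Granted | one location beyond the accepted one | one location beyond the accepted one | one location beyond the accepted one |
| Output | No request | unchanged | unchanged | unchanged |
| Output | Grant accepted | one location beyond the granted one | one location beyond the granted one | one location beyond the granted one |
| Output | Grant not accepted | one location beyond the previously granted one | unchanged | the granted one |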
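The output (grant) pointer rules in the comparison above can be written as one small function. This is a sketch only: the function name `next_grant_pointer`, its parameters, and the policy strings are hypothetical, and the RRM "not accepted" case is read as advancing one location beyond the input that was just granted.

```python
def next_grant_pointer(policy, pointer, granted, accepted, n):
    """Return the new grant pointer of one output after a time slot.

    pointer  : current pointer value (0..n-1)
    granted  : the input this output granted, or None if it got no request
    accepted : whether that input accepted the grant
    policy   : "rrm", "islip", or "firm" (illustrative names)
    """
    if granted is None:
        return pointer                # no request: unchanged for all three
    if accepted or policy == "rrm":
        return (granted + 1) % n      # one location beyond the granted input
    if policy == "islip":
        return pointer                # grant not accepted: unchanged
    if policy == "firm":
        return granted                # grant not accepted: stay on the granted input
    raise ValueError(f"unknown policy: {policy}")
```

For example, with `n = 4`, a pointer at 0, and an unaccepted grant to input 2, the sketch returns 3 for RRM, 0 for iSLIP, and 2 for FIRM.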
![Page 94: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/94.jpg)
General remarks
• Since all of these algorithms try to approximate maximum size matching, they can be unstable under non-uniform traffic
• They can achieve 100% throughput under uniform traffic
• Under a large number of iterations, their performance is similar
• They have similar implementation complexity
![Page 95: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/95.jpg)
Input Queueing: Longest Queue First or Oldest Cell First
[Figure: 4x4 switch (ports 1-4) whose requests carry weights of 1 and 10]
Maximum weight matching, with Weight = { Waiting Time, Queue Length }: 100%
![Page 96: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/96.jpg)
Input Queueing: Why is serving long/old queues better than serving the maximum number of queues?
• When traffic is uniformly distributed, servicing the maximum number of queues leads to 100% throughput.
• When traffic is non-uniform, some queues become longer than others.
• A good algorithm keeps the queue lengths matched, and services a large number of queues.
[Figure: average occupancy versus VOQ number, one plot for uniform traffic and one for non-uniform traffic]
![Page 97: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/97.jpg)
Maximum/Maximal Weight Matching
• 100% throughput for admissible traffic (uniform or non-uniform)
• Maximum Weight Matching
– OCF (Oldest Cell First): w = cell waiting time
– LQF (Longest Queue First): w = input queue occupancy
– LPF (Longest Port First): w = QL (queue length) of the source port + sum of QL from the source port to the destination port
• Maximal Weight Matching (practical algorithms)
– iOCF
– iLQF
– iLPF (comparators in the critical path of iLQF are removed)
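To make the notion of a maximum weight matching concrete, here is a brute-force sketch for very small switches. It simply tries all N! matchings, which is only practical for illustration; the function name `max_weight_matching` and the example queue lengths are assumptions, and the weights follow the LQF rule (w = queue occupancy).

```python
from itertools import permutations


def max_weight_matching(weight):
    """Return (matching, total_weight) maximizing the sum of weight[i][matching[i]].

    weight[i][j] is the weight of connecting input i to output j,
    e.g. the VOQ occupancy for LQF. Exhaustive search over all N! matchings.
    """
    n = len(weight)

    def total(p):
        return sum(weight[i][p[i]] for i in range(n))

    best = max(permutations(range(n)), key=total)
    return list(best), total(best)


# Hypothetical LQF weights: two long queues and a few short ones.
queue_len = [
    [10, 1, 0],
    [1, 0, 0],
    [0, 1, 10],
]
matching, w = max_weight_matching(queue_len)
```

With these example weights the result is the matching `[0, 1, 2]` with weight 20: the two long queues are served, even though a different matching (`[1, 0, 2]`, weight 12) would serve three non-empty queues. That is the difference between maximizing weight and maximizing size.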
![Page 98: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/98.jpg)
Maximal Weight Matching Algorithms: iLQF
• Request. Each unmatched input sends a request word of width bits to each output for which it has a queued cell, indicating the number of cells that it has queued to that output.
• Grant. If an unmatched output receives any requests, it chooses the largest valued request. Ties are broken randomly.
• Accept. If an unmatched input receives one or more grants, it accepts the one to which it made the largest valued request. Ties are broken randomly.
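The three steps above can be sketched as follows. This is a software illustration under simplifying assumptions, not the hardware design: the function name `ilqf`, the dictionary-based bookkeeping, and the example queue lengths are hypothetical, and the request word is modeled directly as the integer queue length.

```python
import random


def ilqf(queue_len, iterations=1, rng=random):
    """Iterative Longest Queue First; returns a dict mapping input -> output.

    queue_len[i][j] is the number of cells queued at input i for output j.
    Ties are broken randomly, as in the description above.
    """
    n = len(queue_len)
    match = {}            # input -> output
    matched_out = set()
    for _ in range(iterations):
        # Request + Grant: each unmatched output grants its largest request.
        grants = {}       # input -> list of outputs that granted it
        for j in range(n):
            if j in matched_out:
                continue
            reqs = [i for i in range(n)
                    if i not in match and queue_len[i][j] > 0]
            if not reqs:
                continue
            best = max(queue_len[i][j] for i in reqs)
            winner = rng.choice([i for i in reqs if queue_len[i][j] == best])
            grants.setdefault(winner, []).append(j)
        if not grants:
            break         # converged: no further matches are possible
        # Accept: each input accepts the grant for its largest request.
        for i, outs in grants.items():
            best = max(queue_len[i][j] for j in outs)
            j = rng.choice([j for j in outs if queue_len[i][j] == best])
            match[i] = j
            matched_out.add(j)
    return match


# Hypothetical occupancy matrix with no ties, so the result is deterministic.
q = [[5, 2, 0],
     [4, 1, 0],
     [0, 3, 6]]
```

With this example, one iteration matches input 0 to output 0 and input 2 to output 2 (the longest queue is served), and a second iteration adds input 1 to output 1.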
![Page 99: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/99.jpg)
Maximal Weight Matching Algorithms: iLQF
• The i-LQF algorithm has the following properties:
• Property 1. Independent of the number of iterations, the longest input queue is always served.
• Property 2. As with i-SLIP, the algorithm converges in at most logN iterations.
• Property 3. For an inadmissible offered load, an input queue may be starved.
![Page 100: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/100.jpg)
Maximal Weight Matching Algorithms: iOCF
• The i-OCF algorithm works in a similar fashion to iLQF, and has the following properties:
• Property 1. Independent of the number of iterations, the cell that has been waiting the longest time in the input queues (it must be at the head of its queue) is always served.
• Property 2. As with i-LQF, the algorithm converges in at most logN iterations.
• Property 3. No input queue can be starved indefinitely.
• Property 4. It is difficult to keep time stamps on the cells.
![Page 101: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/101.jpg)
iLQF - Implementation
![Page 102: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/102.jpg)
iLQF - Implementation
Complicated hardware
![Page 103: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/103.jpg)
Other research efforts
• Packet-based arbitration
• Exhaustive-based arbitration
• Numerous other efforts
![Page 104: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/104.jpg)
Packet Scheduling/Arbitration in Virtual Output Queues:
Randomized Algorithms and Others
![Page 105: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/105.jpg)
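Input-Queued Packet Switch
[Figure: N x N crossbar with scheduler, inputs 1..N and outputs 1..N; arrival rates $\lambda_{1,1}, \ldots, \lambda_{i,j}, \ldots, \lambda_{N,N}$; VOQ occupancy $X_{i,j}$]
Admissible traffic: $\sum_i \lambda_{i,j} < 1$ for every output $j$, and $\sum_j \lambda_{i,j} < 1$ for every input $i$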
![Page 106: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/106.jpg)
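Bipartite Graph and Matrix
[Figure: bipartite graph between inputs 1-3 and outputs 1-3, shown next to the equivalent matrix]
$$\begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$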
![Page 107: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/107.jpg)
Stability of Scheduling
Definition:
Let Xi,j(t) be the number of packets queued at input i for output j at time-slot t.
Then an algorithm is stable iff:
$$E\Big[\sum_{i,j} X_{i,j}(t)\Big] < \infty$$
![Page 108: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/108.jpg)
Motivation
• Networking problems suffer from the “curse of dimensionality”
– algorithmic solutions do not scale well
• Typical causes
– size: large number of users or large number of I/O
– time: very high speeds of operation
• A good deterministic algorithm exists (Max Flow), but …
– it needs state information, and “state” is too big
– it “starts from scratch” in each iteration
![Page 109: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/109.jpg)
Randomization
• Randomized algorithms have frequently been used in situations where the state space is very large (e.g., the N! possible connection patterns between inputs and outputs)
• Randomized algorithms
– are a powerful way of approximating the optimal solution
– it is often possible to randomize deterministic algorithms
– this simplifies the implementation while retaining a (surprisingly) high level of performance
• The main idea is
– to simplify the decision-making process
– by basing decisions upon a small, randomly chosen sample of the state
– rather than upon the complete state
![Page 110: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/110.jpg)
Randomizing Iterative Schemes (e.g., iSLIP)
• Often, we want to perform some operation iteratively
• Example: find the heaviest matching in a switch in every time slot
• Since, in each time slot
– at most one packet can arrive at each input
– and at most one packet can depart from each output
the size of the queues, or the “state” of the switch, doesn’t change by much between successive time slots, so a matching that was heavy at time t will quite likely continue to be heavy at time t+1
• This suggests that
– knowing a heavy matching at time t should help in determining a heavy matching at time t+1
– there is no need to start from scratch in each time slot
![Page 111: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/111.jpg)
Summarizing Randomized Algorithms
• Randomized algorithms can help simplify the implementation
– by reducing the amount of work in each iteration
• If the state of the system doesn’t change by much between iterations, then
– we can reduce the work even further by carrying information between iterations
• The big pay-off is that, even though it is an approximation, the performance of a randomized scheme can be surprisingly good
![Page 112: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/112.jpg)
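Randomized Scheduling Algorithms: Example
• Consider a 3 x 3 input-queued switch
– input traffic is Bernoulli IID with λij = α/3 for all i, j, and α < 1:
$$\Lambda = \frac{\alpha}{3}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} = \begin{pmatrix} \alpha/3 & \alpha/3 & \alpha/3 \\ \alpha/3 & \alpha/3 & \alpha/3 \\ \alpha/3 & \alpha/3 & \alpha/3 \end{pmatrix}$$
– This is admissible
– note: there are a total of 6 (= 3!) possible service matrices:
$$\begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix},\ \begin{pmatrix} 0&1&0 \\ 1&0&0 \\ 0&0&1 \end{pmatrix},\ \begin{pmatrix} 1&0&0 \\ 0&0&1 \\ 0&1&0 \end{pmatrix},\ \begin{pmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \end{pmatrix},\ \begin{pmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0 \end{pmatrix},\ \begin{pmatrix} 0&0&1 \\ 0&1&0 \\ 1&0&0 \end{pmatrix}$$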
![Page 113: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/113.jpg)
Random Scheduling Algorithms
• In time slot n, let S(n) be one of the 6 possible matchings, chosen independently and uniformly at random (u.a.r.)
• Stability of Random
– Consider L11(n), the number of packets in VOQ11
– arrivals to VOQ11 occur according to A11(n), which is Bernoulli IID
– input rate = λ11 = α/3
– this queue gets served whenever the service matrix connects input 1 to output 1
– there are 2 service matrices that connect input 1 to output 1
– since Random chooses service matrices u.a.r., input 1 is connected to output 1 for a fraction of time = 2/6 = 1/3, which is the service rate between input 1 and output 1
– E(L11(n)) < ∞ iff λ11 < 1/3, i.e., α < 1
• Random is stable under this uniform traffic.
![Page 114: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/114.jpg)
Random Scheduling Algorithms
• Instability of Random
• Now suppose λii = α for all i and λij = 0 for i ≠ j
– clearly, this is admissible traffic for all α < 1
– but, under Random, the service rate at VOQ11 is 1/3 at best
– hence VOQ11, and the switch, will be unstable as soon as α ≥ 1/3
• Stability (or 100% throughput) means that the algorithm is stable under all admissible traffic!
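The service-rate argument for Random, for both the uniform and the diagonal traffic patterns, can be checked by enumerating the matchings. This is a small sketch; the helper names `random_service_rate` and `voq_is_stable`, the 0-based port indices, and the value α = 9/10 are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations


def random_service_rate(n, i, j):
    """Fraction of the n! matchings that connect input i to output j (0-based)."""
    matchings = list(permutations(range(n)))
    hits = sum(1 for p in matchings if p[i] == j)
    return Fraction(hits, len(matchings))


def voq_is_stable(arrival_rate, service_rate):
    """A single queue is stable when its arrival rate is below its service rate."""
    return arrival_rate < service_rate


alpha = Fraction(9, 10)               # hypothetical load, admissible since alpha < 1
mu = random_service_rate(3, 0, 0)     # service rate of VOQ11 under Random

uniform_ok = voq_is_stable(alpha / 3, mu)   # uniform traffic: lambda_11 = alpha/3
diagonal_ok = voq_is_stable(alpha, mu)      # diagonal traffic: lambda_11 = alpha
```

Here `mu` evaluates to 1/3 (2 of the 6 matchings), so the uniform case is stable while the diagonal case is not, even though both loads are admissible.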
![Page 115: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/115.jpg)
Obvious Randomized Schemes
• Choose a matching at random and use it as the schedule: this doesn’t give 100% throughput (already shown)
• Choose 2 matchings at random and use the heavier one as the schedule
• Choose N matchings at random and use the heaviest one as the schedule
None of these can give 100% throughput!
![Page 116: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/116.jpg)
[Plot: Mean IQ Len (log scale, 0.001 to 10000) versus Normalized Load (0.0 to 1.0) under Diagonal Traffic, for MWM, R32, and R1]
![Page 117: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/117.jpg)
Iterative Randomized Scheme (Tassiulas)
• Say M is the matching used at time t
• Let R be a new matching chosen uniformly at random (u.a.r.) among the N! different matchings
• At time t+1, use the heavier of M and R
• Complexity is very low: O(1) iterations
• This gives 100% throughput!
– note: the boost in throughput is due to memory (saving previous matchings)
• But delays are very large
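One time slot of the scheme above can be sketched as follows. The names `weight` and `tassiulas_step`, the list encoding of a matching (index = input, value = output), and the use of queue lengths as edge weights are assumptions for illustration.

```python
import random


def weight(matching, queue_len):
    """Total weight of a matching given as a list: input i -> output matching[i]."""
    return sum(queue_len[i][j] for i, j in enumerate(matching))


def tassiulas_step(prev, queue_len, rng=random):
    """Return the schedule for time t+1: the heavier of M (prev) and a random R."""
    candidate = list(range(len(prev)))
    rng.shuffle(candidate)            # a uniformly random permutation, i.e. matching
    if weight(candidate, queue_len) > weight(prev, queue_len):
        return candidate
    return prev                       # memory: keep the previous matching
```

Because the previous matching is kept whenever the random one is not heavier, the weight of the schedule (measured against the same queue state) never decreases from one step to the next, which is the "memory" effect noted above.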
![Page 118: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/118.jpg)
[Plot: Mean IQ Len (log scale, 0.01 to 10000) versus Normalized Load (0.1 to 1.0) under Diagonal Traffic, for MWM and Tassiulas]
![Page 119: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/119.jpg)
Finer Observations
• Let M be the schedule used at time t
• Choose a “good” random matching R
• M’ = Merge(M,R)
• M’ includes the best edges from M and R
• Use M’ as the schedule at time t+1
• The above procedure yields an algorithm called LAURA
• There are many other small variations of this algorithm.
![Page 120: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/120.jpg)
Merging Procedure
[Figure: matchings X (W(X)=12) and R (W(R)=10) are merged into M (W(M)=13). Per-cycle weight comparisons shown: 3-1+2-2=2 and 2-1+2-4=-1]
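A merge of two matchings can be sketched as a cycle-by-cycle comparison: the union of two perfect matchings splits into alternating cycles, and on each cycle the heavier side is kept. This is one common way to realize such a merge and is an assumption here, as are the function name `merge`, the list encoding of matchings, and the example weights.

```python
def merge(m, r, w):
    """Merge two perfect matchings (lists: input -> output) cycle by cycle.

    w[i][j] is the weight of edge (input i, output j). On every alternating
    cycle formed by the edges of m and r, keep whichever matching is heavier.
    """
    n = len(m)
    r_inv = [0] * n                   # r_inv[j] = input matched to output j in r
    for i, j in enumerate(r):
        r_inv[j] = i
    merged = [None] * n
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        cycle, i = [], start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = r_inv[m[i]]           # follow the m edge out, then the r edge back
        w_m = sum(w[i][m[i]] for i in cycle)
        w_r = sum(w[i][r[i]] for i in cycle)
        source = m if w_m >= w_r else r
        for i in cycle:
            merged[i] = source[i]
    return merged


# Illustrative weights: the first cycle favors m (5 vs 3), the second favors r (4 vs 5).
w = [[3, 1, 0, 0],
     [2, 2, 0, 0],
     [0, 0, 2, 1],
     [0, 0, 4, 2]]
m = [0, 1, 2, 3]
r = [1, 0, 3, 2]
```

For this example `merge(m, r, w)` returns `[0, 1, 3, 2]` with total weight 10, which is heavier than both m (9) and r (8).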
![Page 121: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/121.jpg)
[Plot: Mean IQ Len (log scale, 0.01 to 10000) versus Normalized Load (0.1 to 1.0) under Diagonal Traffic, for MWM, M-LAURA, LAURA, iLQF, and Tassiulas]
![Page 122: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/122.jpg)
Can we avoid having schedulers altogether?
![Page 123: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/123.jpg)
Recap: Two Successive Scaling Problems
OQ routers:
+ work-conserving (QoS)
- memory bandwidth = (N+1)R
IQ routers:
+ memory bandwidth = 2R
- arbitration complexity
[Figures: an OQ router and an IQ router with line rate R per port; the IQ router is labeled Bipartite Matching]
![Page 124: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/124.jpg)
IQ Arbitration Complexity
Today: 64 ports at 10 Gbps, 64-byte cells.
• Arbitration Time = 64 bytes / 10 Gbps = 51.2 ns
• Request/Grant Communication BW = 17.5 Gbps
Scaling to 160 Gbps:
• Arbitration Time = 3.2 ns
• Request/Grant Communication BW = 280 Gbps
Two main alternatives for scaling:
1. Increase cell size
2. Eliminate arbitration
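The arbitration times quoted above follow from the cell transmission time, which is the budget for one arbitration decision. A short sketch of the arithmetic (the function name `arbitration_time_ns` is an assumption):

```python
def arbitration_time_ns(cell_bytes, line_rate_gbps):
    """Time to transmit one cell at the line rate, in nanoseconds.

    bits / (Gbit/s) gives nanoseconds directly.
    """
    return cell_bytes * 8 / line_rate_gbps


today = arbitration_time_ns(64, 10)     # 64-byte cells at 10 Gbps
scaled = arbitration_time_ns(64, 160)   # 64-byte cells at 160 Gbps
```

These give 51.2 ns and 3.2 ns. The line rate grows by a factor of 16, the same factor that separates the two request/grant bandwidth figures on the slide (17.5 Gbps and 280 Gbps).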
![Page 125: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/125.jpg)
Desirable Characteristics for Router Architecture
Ideal: OQ
• 100% throughput
• Minimum delay
• Maintains packet order
Necessary: able to regularly connect any input to any output
What if the world was perfect? Assume Bernoulli iid uniform arrival traffic...
![Page 126: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/126.jpg)
Round-Robin Scheduling
• Uniform & non-bursty traffic => 100% throughput
• Problem: traffic is non-uniform & bursty
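A scheduler-free round-robin crossbar can be sketched as a fixed cyclic shift. The function name `round_robin_matching` and the specific shift rule (input i to output (i + t) mod N) are assumptions; any fixed sequence that visits every input-output pair equally often would serve the same purpose.

```python
def round_robin_matching(t, n):
    """Crossbar configuration at time slot t: input i -> output (i + t) mod n."""
    return [(i + t) % n for i in range(n)]


def pairs_served(n):
    """All (input, output) pairs connected during n consecutive time slots."""
    return {(i, out)
            for t in range(n)
            for i, out in enumerate(round_robin_matching(t, n))}
```

Over N consecutive slots every input is connected to every output exactly once, so each VOQ gets a service rate of 1/N with no arbitration at all. That is enough for uniform traffic with λij < 1/N, but not for non-uniform or bursty traffic.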
![Page 127: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/127.jpg)
Two-Stage Switch (I)
[Figure: external inputs 1..N, first round-robin stage, internal inputs 1..N, second round-robin stage, external outputs 1..N]
![Page 128: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/128.jpg)
Two-Stage Switch (I)
[Figure: external inputs 1..N, first round-robin stage (labeled Load Balancing), internal inputs 1..N, second round-robin stage, external outputs 1..N]
![Page 129: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/129.jpg)
Two-Stage Switch Characteristics
[Figure: external inputs 1..N, first cyclic-shift stage, internal inputs 1..N, second cyclic-shift stage, external outputs 1..N, with cells labeled 1 and 2]
• 100% throughput
• Problem: unbounded mis-sequencing
![Page 130: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/130.jpg)
Two-Stage Switch (II)
[Figure: external inputs 1..N with flow splitter and load balancer, VOQs, first-stage round-robin, internal outputs and internal inputs 1..N, VOQs, second-stage round-robin, external outputs 1..N. A flow F ik from external input i to external output k passes through internal port j]
New: N^3 instead of N^2
![Page 131: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/131.jpg)
Expanding VOQ Structure
Solution: expand VOQ structure by distinguishing among switch inputs
[Figure: example with elements labeled 1, 2, 3 and a, b]
![Page 132: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/132.jpg)
What is being done in practice (Cisco for example)
• They want schedulers that achieve 100% throughput and very low delay (like MWM)
• They want them to be as simple as iSLIP in terms of hardware implementation
• Is there any solution to this?
![Page 133: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/133.jpg)
Typical Performance of ISLIP-like Algorithms
PIM with 4 iterations
![Page 134: CSIT560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues and Others.](https://reader037.fdocuments.in/reader037/viewer/2022110207/56649d775503460f94a58dfb/html5/thumbnails/134.jpg)
What is being done in practice (Cisco for example)

| Company | Switching Capacity | Switch Architecture | Fabric Overspeed |
|---|---|---|---|
| Agere | 40 Gbit/s-2.5 Tbit/s | Arbitrated crossbar | 2x |
| AMCC | 20-160 Gbit/s | Shared memory | 1.0x |
| AMCC | 40 Gbit/s-1.2 Tbit/s | Arbitrated crossbar | 1-2x |
| Broadcom | 40-640 Gbit/s | Buffered crossbar | 1-4x |
| Cisco | 40-320 Gbit/s | Arbitrated crossbar | 2x |