The Parallel Packet Switch

Sundar Iyer, Amr Awadallah, & Nick McKeown
High Performance Networking Group, Stanford University
Web Site: http://klamath.stanford.edu/fjr
Stanford University © 1999

Contents

Motivation
Key Ideas
Speedup, Concentration, Constraints
Mimicking an OQ Switch
PFAT: Enabling QoS in a PPS
PIFO: A Speedup of 3 suffices
Motivation for a Distributed Algorithm
Work Conservation: A Speedup of sqrt(k) suffices
Multicasting
Conclusions
To support:
an extremely high-speed packet switch
a switch with a highly scalable architecture
a switch with memories running slower than the line rate
Architecture Alternatives - Refresher
An Ideal Switch:
Supports QoS
A parallel packet-switch (PPS) is comprised of multiple identical lower-speed packet-switches operating
independently and in parallel. An incoming stream of packets is spread, packet-by-packet, by a de-multiplexor
across the slower packet-switches, then recombined by a multiplexor at the output.
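The spreading-and-recombining idea can be sketched in a few lines of Python. This is a toy model, not the actual switch logic: the round-robin spreading order and k = 3 layers are assumptions made for illustration.

```python
# Toy model of inverse multiplexing in a PPS: a demultiplexor spreads
# packets across k slower layers, and a multiplexor recombines them.
# Each layer's queue only needs to run at rate R/k.
from collections import deque

k = 3  # number of internal lower-speed layers (assumed)
layers = [deque() for _ in range(k)]

def demultiplex(packets):
    """Spread an incoming stream packet-by-packet across the k layers."""
    for i, pkt in enumerate(packets):
        layers[i % k].append(pkt)

def multiplex():
    """Recombine the spread packets at the output in arrival order."""
    out = []
    i = 0
    while any(layers):
        if layers[i % k]:
            out.append(layers[i % k].popleft())
        i += 1
    return out

demultiplex(["p0", "p1", "p2", "p3", "p4"])
print(multiplex())  # round-robin recombination restores arrival order
```

With strict round-robin on both sides the arrival order is trivially preserved; the interesting questions, taken up below, are what happens under less convenient traffic patterns.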
Key Concept - “Inverse Multiplexing”
Buffering occurs only in the internal switches!
By choosing a large value of k, we would like to arbitrarily reduce the memory speeds within the switch.
Can such a switch work “ideally”?
Can it give the advantages of an output-queued switch?
What should the multiplexor and de-multiplexor do?
Doesn’t the switch trivially behave well?
Output Queued Switch
A switch in which arriving packets are placed immediately in queues at the output, where they contend with packets destined to the same output waiting their turn to depart.
“We would like to perform as well as an output queued switch”
Mimic (Black Box Model)
Two different switches are said to mimic each other if, under identical inputs, identical packets depart from each switch at the same time.
Work Conserving
A system is said to be work-conserving if its outputs never idle unnecessarily.
“If you’ve got something to do, do it now!”
Potential Pitfalls - Concentration
[Figure: concentration example - packets destined to output port two; internal link rates R/3 and 2R/3.]
“Concentration is when a large number of cells destined to the same output are concentrated on a small fraction of the internal layers.”
Link Constraints
Input Link Constraint- An external input port is constrained to send a cell to a specific layer at most once every ceil(k/S) time slots.
This constraint is due to the switch architecture
Each arriving cell must adhere to this constraint
Output Link Constraint - An external output port is constrained to receive a cell from a specific layer at most once every ceil(k/S) time slots.
AIL and AOL Sets
Available Input Link Set: AIL(i,n), is the set of layers to which external input port i can start sending a cell in time slot n.
This is the set of layers that external input i has not started sending any cells to within the last ceil(k/S) time slots.
AIL(i,n) evolves over time
AIL(i,n) is full when no cells have arrived at input i for ceil(k/S) time slots.
Available Output Link Set: AOL(j,n’), is the set of layers that can send a cell to external output j at time slot n’ in the future.
This is the set of layers that have not started to send a new cell to external output j in the last ceil(k/S) time slots before time slot n’
AOL(j,n’) evolves over time as cells arrive for output j
AOL(j,n’) is never full as long as there are cells in the system destined to output j.
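A minimal sketch of how AIL(i,n) could be maintained, assuming a per-(input, layer) record of when the input last started sending a cell to that layer (all names here are hypothetical; the AOL set is symmetric on the output side):

```python
# Sketch of the Available Input Link set AIL(i, n): a layer is available
# to input i at slot n if input i has not started sending a cell to it
# within the last ceil(k/S) time slots.
import math

k, S = 10, 2
window = math.ceil(k / S)  # the ceil(k/S)-slot constraint window

last_sent = {}  # (input_port, layer) -> time slot of last cell sent

def AIL(i, n):
    """Layers to which input i may start sending a cell at slot n."""
    return {l for l in range(k)
            if n - last_sent.get((i, l), -window) >= window}

def send(i, l, n):
    assert l in AIL(i, n), "input link constraint (ILC) violated"
    last_sent[(i, l)] = n

send(0, 3, 0)
print(3 in AIL(0, 1))       # False: layer 3 is busy for ceil(k/S) slots
print(3 in AIL(0, window))  # True: the constraint window has passed
```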
Theorems
Theorem 1: (Sufficiency) A PPS can exactly mimic an FCFS-OQ switch if it guarantees that each arriving cell is allocated to a layer l such that l ∈ AIL(i,n) and l ∈ AOL(j,n’) (i.e. if it meets both the ILC and the OLC).
[Figure: the layer is chosen from the intersection of AIL(i,n) and AOL(j,n’).]
Theorem 2: (Sufficiency) A speedup of 2k/(k+2) is sufficient for a PPS to meet both the input and output link constraints for every cell.
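The bound in Theorem 2 can be motivated by a counting argument: AIL(i,n) and AOL(j,n’) each exclude at most ceil(k/S) − 1 of the k layers, so their intersection is non-empty whenever the two exclusions cannot cover all layers. Ignoring the ceilings (a sketch of the intuition, not the exact integrality argument of the proof):

```latex
(\lceil k/S \rceil - 1) + (\lceil k/S \rceil - 1) < k
\;\Longrightarrow\;
2\left(\frac{k}{S} - 1\right) < k
\iff \frac{2k}{S} < k + 2
\iff S > \frac{2k}{k+2}.
```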
[Figure: FCFS example - cells numbered 1-8 spread across the layers and recombined at the output.]
[Figure: PIFO example - cells buffered in individual output queues in each layer; each internal link runs at rate R/k and the demultiplexor obeys the ILC.]
A cell must not be sent to a layer belonging to OLC(j,n’). Each of the three constraint sets excludes at most ceil(k/S) - 1 layers, so an eligible layer exists whenever
(ceil(k/S) - 1) + (ceil(k/S) - 1) + (ceil(k/S) - 1) < k
Theorem 3: (Sufficiency) A speedup of 3k/(k+3) is sufficient for a PPS to mimic a PIFO OQ switch.
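The inequality on this slide counts the layers excluded by the three constraints. Ignoring the ceilings (a sketch of the intuition, not the exact proof), the speedup bound falls out directly:

```latex
3\left(\lceil k/S \rceil - 1\right) < k
\;\Longrightarrow\;
3\left(\frac{k}{S} - 1\right) < k
\iff \frac{3k}{S} < k + 3
\iff S > \frac{3k}{k+3}.
```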
Multicasting
FIFO: a speedup of m + 1 suffices
[Figure: for a multicast cell with fanout m, the chosen layer must lie in AIL(i,n) ∩ AOL(j,n1’) ∩ AOL(k,n2’).]
A speedup of 2m + 1 suffices
Summary of Results
The AIL sets are broadcast to a central scheduler
CPA computes the intersection between AIL and one or more AOLs
CPA timestamps the cells
Cells are output in the order of the global timestamp
If the speedup S >= 2, then CPA can perfectly mimic an FCFS OQ switch
If the speedup S >= 3, then CPA can perfectly mimic a PIFO OQ switch
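The CPA steps above can be sketched as follows. This is a toy FCFS-only model with hypothetical names; the real scheduler also handles PIFO insertion and the full broadcast machinery.

```python
# Toy Centralized PPS Algorithm (CPA): each arriving cell is stamped
# with a global timestamp and assigned a layer from AIL ∩ AOL; outputs
# then release cells in timestamp order.
import math

k, S = 4, 2
window = math.ceil(k / S)

in_busy = {}   # (input, layer)  -> last slot the input used that layer
out_busy = {}  # (output, layer) -> last slot the layer served that output

def available(busy, port, n):
    return {l for l in range(k) if n - busy.get((port, l), -window) >= window}

def cpa_assign(i, j, n):
    """Pick a layer meeting both the ILC and the OLC; stamp the cell with n."""
    choices = available(in_busy, i, n) & available(out_busy, j, n)
    l = min(choices)  # any member works; Theorem 2 guarantees non-emptiness
    in_busy[(i, l)] = n
    out_busy[(j, l)] = n
    return (l, n)  # (layer, global timestamp)

cells = [cpa_assign(0, 1, n) for n in range(4)]
print(sorted(cells, key=lambda c: c[1]))  # cells depart in timestamp order
```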
Caveats
The centralized algorithm is not practical:
It does not scale with N, the number of input ports
Ideally, we would like a distributed algorithm where each input makes its decision independently.
Potential Pitfall
“If inputs act independently, the PPS can immediately become non work conserving”
Decrease the number of inputs which request simultaneously
Give the scheduler choice
Increase the speedup appropriately
N schedulers
Broadcast phase
Request phase
Each input requests a layer which satisfies ILC & OLC (primary request)
Each input also requests a duplicate layer (duplicate request)
Duplication function
Grant phase
The scheduler grants each input one of its two requests
Duplication function: l’ = (l + g) mod k, where l’ is the duplicate request layer, k is the number of layers, and g is the group to which the input belongs.
[Figure: inputs grouped by their duplicate requests; input 4 belongs to group 3 and does not duplicate.]
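The duplication function itself is a one-liner; in this sketch g is assumed to be the requesting input's group number (the slide does not spell out how g is derived):

```python
# Sketch of DPA's duplication function: the duplicate request layer is
# offset from the primary request layer by the input's group number g.
k = 3  # number of layers (assumed)

def duplicate_layer(l, g):
    """l: primary request layer, g: input's group -> duplicate layer l'."""
    return (l + g) % k

print(duplicate_layer(1, 2))  # l' = (1 + 2) mod 3 = 0
```

Offsetting by the group number spreads the duplicate requests of different groups over different layers, giving the scheduler a choice of two layers per input.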
[Figure: PPS architecture with k = 3 layers - each demultiplexor spreads an external line at rate R across internal links at rate R/k, and multiplexors recombine cells onto the output lines at rate R.]
Understanding the Scheduling Stage in DPA
A set of x nodes can pack at most x(x-1) + 1 request tuples
A set of x request tuples spans at least ceil(sqrt(x)) layers
The maximum number of requests that need to be granted to a single layer in a given scheduling stage is bounded by ceil(sqrt(k))
So a speedup of about sqrt(k) suffices?
Fact 1: (Work Conservation - necessary condition for the PPS)
For the PPS to be work conserving, we require that no more than S cells be scheduled to depart from the same layer in a given window of k time slots.
Fact 2: (Work Conservation - sufficiency for DPA)
If in any scheduling stage we present only layers which have fewer than S - ceil(sqrt(k)) cells belonging to the present k-slot window in the AOL, then DPA will always remain work conserving.
Fact 3: We have to ensure that there always exist two layers l and l’ such that
l ∈ AIL ∩ AOL
l’ ∈ AIL ∩ AOL
Conclusions & Future Work
DPA has to be made simpler
Complete the multicasting study of the PPS