Queueing and Scheduling
- Traffic is moved by connecting end-systems to switches, and switches to each other
- Switches: packet switches vs. circuit switches; connectionless vs. connection-oriented
- Data arriving at an input port of a switch must be moved to one or more of the output ports
Blocking in Packet Switches
- Can have both internal and output blocking
  - internal: no path to the output
  - output: link unavailable
- Unlike a circuit switch, cannot predict whether packets will block
- If a packet is blocked, must either buffer or drop it
- Dealing with blocking:
  - match input rate to service rate
  - overprovisioning: internal links much faster than inputs
  - buffering: at the input port, in the fabric, at the output port, in shared memory, or a hybrid
Input Buffering (input queueing)
- No speedup in buffers or trunks (unlike an output-queued switch)
- Needs an arbiter
- Problem: head-of-line (HOL) blocking
Dealing with HOL blocking
- Per-output queues at the inputs
- Arbiter must choose one of the input ports for each output port
- Parallel iterated matching:
  - inputs tell the arbiter which outputs they are interested in
  - each output selects one of its requesting inputs
  - some inputs may get more than one grant, others may get none
  - if an input gets more than one grant, it picks one at random and tells the output
  - losing inputs and outputs try again
- Used in the DEC Autonet 2 switch
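The request/grant/accept steps above can be sketched in a few lines. This is a minimal illustration, not the DEC Autonet 2 implementation: the dict-of-sets representation of requests, the function name, and the iteration count are all assumptions made for the example.

```python
import random

def parallel_iterated_matching(requests, iterations=3):
    """One arbitration cycle of parallel iterated matching (sketch).

    `requests` maps input port -> set of output ports it has packets for
    (a hypothetical representation). Returns accepted input -> output matches.
    """
    matched = {}
    free_inputs = set(requests)
    free_outputs = {o for outs in requests.values() for o in outs}
    for _ in range(iterations):
        # Grant phase: each free output picks one of its requesting free inputs.
        grants = {}  # input -> list of outputs that granted it
        for out in free_outputs:
            requesters = [i for i in free_inputs if out in requests[i]]
            if requesters:
                grants.setdefault(random.choice(requesters), []).append(out)
        if not grants:
            break  # no progress possible
        # Accept phase: an input with more than one grant picks one at random.
        for inp, outs in grants.items():
            out = random.choice(outs)
            matched[inp] = out
            free_inputs.discard(inp)
            free_outputs.discard(out)
        # Losing inputs and outputs automatically retry in the next iteration.
    return matched
```

Each output grants at most once per iteration, so no two inputs can accept the same output; repeated iterations let the losers retry, which is what makes the matching converge quickly in practice.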
Output Queueing
- Doesn’t suffer from head-of-line blocking
- Output buffers need to run much faster than trunk speed
- Can reduce some of the cost by using the knockout principle: it is unlikely that all N inputs will have packets for the same output
- Most commonly used mechanism in routers
Shared Memory
- Route only the header to the output port
- Bottleneck is the time taken to read and write the multiported memory
- Doesn’t scale to large switches
Scheduling
- Scheduling disciplines:
  - resolve contention
  - allocate bandwidth, delay, and loss
  - determine the fairness of the network
  - give different qualities of service and performance guarantees
- Components: decides the service order; manages the queue of service requests
- Example: consider queries awaiting a web server; the scheduling discipline decides the service order, and also whether some query should be ignored
Scheduling
- Use scheduling wherever contention may occur
  - usually studied at the network layer, at the output queues of switches
- Application types:
  - best-effort (adaptive, non-real-time): e.g. email, some types of file transfer
  - guaranteed-service (non-adaptive, real-time): e.g. packet voice, interactive video, stock quotes
- Requirements:
  - implementation ease: few instructions or simple hardware; has to make a decision once every few microseconds! Work per packet should scale less than linearly with the number of active connections
  - fairness: protection against traffic hogs
  - performance bounds: on bandwidth, delay, and loss
  - admission control: needed to provide QoS
Max-Min Fairness
- A scheduling discipline allocates a resource
- An allocation is fair if it satisfies max-min fairness
- Intuitively:
  - each connection gets no more than what it wants
  - the excess, if any, is equally shared
[Figure: demands from connections A, B, C; transfer half of the excess; unsatisfied demand]
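The intuition above (satisfy the smallest demands first, share the excess equally) is the classic water-filling computation. A minimal sketch, assuming demands and capacity are in the same units; the function name is made up for this example.

```python
def max_min_fair(demands, capacity):
    """Max-min fair allocation of `capacity` among `demands`:
    no source gets more than it asks for, and any excess is split
    equally among the still-unsatisfied sources."""
    alloc = {}
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    while unsatisfied:
        fair_share = capacity / len(unsatisfied)
        i = unsatisfied[0]  # smallest remaining demand
        if demands[i] <= fair_share:
            # Demand fits within the fair share: grant it fully and
            # redistribute its unused share among the rest.
            alloc[i] = demands[i]
            capacity -= demands[i]
            unsatisfied.pop(0)
        else:
            # No remaining demand fits: everyone left gets an equal share.
            for j in unsatisfied:
                alloc[j] = fair_share
            break
    return [alloc[i] for i in range(len(demands))]
```

For example, demands of 2, 4, and 10 on a link of capacity 12 yield the allocation [2, 4, 6]: the first two connections are fully satisfied, and the third receives all of the remaining capacity.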
Scheduling Design Choices
- Priority:
  - a packet is served from a given priority level only if no packets exist at higher levels
  - the highest level gets the lowest delay; lower levels risk starvation
- Work conservation:
  - conservation law: Σᵢ ρᵢqᵢ = constant, where ρᵢ = λᵢxᵢ; λᵢ is the traffic arrival rate, xᵢ the mean service time per packet, and qᵢ the mean waiting time at the scheduler, for connection i
  - i.e. the sum of the mean queueing delays received by a set of multiplexed connections, weighted by their share of the link, is independent of the scheduling discipline
  - work-conserving vs. non-work-conserving disciplines
- Service algorithm:
  - FCFS: bandwidth hogs win; no delay guarantees
  - service tags: arbitrarily reorder the queue; provide guarantees; expensive sorting
Work Conserving v/s Non-Work-Conserving
- A work-conserving discipline is never idle when packets await service
- A non-work-conserving discipline may be idle even when packets await service
  - main idea: delay a packet until it is eligible
  - reduces delay-jitter => fewer buffers in the network
  - choosing the eligibility time:
    - rate-jitter regulator: bounds the maximum outgoing rate
    - delay-jitter regulator: compensates for variable delay at the previous hop
  - always punishes a misbehaving source
  - increases mean delay; wastes bandwidth; implementation cost
Scheduling Best-effort Connections
- Main requirement is fairness
- Achievable using Generalized Processor Sharing (GPS):
  - visit each non-empty queue in turn
  - serve an infinitesimal amount from each
- GPS is not implementable; we can serve only whole packets
- No packet discipline can be as fair as GPS
Weighted Round Robin
- Serve a packet from each non-empty queue in turn
- Unfair if packets are of different lengths or weights are not equal
- Different weights, fixed packet size:
  - serve more than one packet per visit, after normalizing to obtain integer weights
- Different weights, variable-size packets:
  - normalize the weights by mean packet size
  - e.g. weights {0.5, 0.75, 1.0}, mean packet sizes {50, 500, 1500}
  - normalized weights: {0.5/50, 0.75/500, 1.0/1500} = {0.01, 0.0015, 0.000666…}; normalize again to integers: {60, 9, 4}
  - problem: need to know the mean packet size in advance
- Used in some ATM switches
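The normalization in the example can be reproduced exactly with rational arithmetic: divide each weight by its mean packet size, clear the fractions, and reduce to the smallest integers. A small sketch (the function name is invented for this example):

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def wrr_packets_per_visit(weights, mean_sizes):
    """Normalize WRR weights by mean packet size, then scale to the
    smallest integer packet counts served per round."""
    # Per-connection service rate in packets: weight / mean packet size.
    # (0.5, 0.75, 1.0 are exactly representable as binary floats, so
    # Fraction() recovers them exactly.)
    rates = [Fraction(w) / s for w, s in zip(weights, mean_sizes)]
    # Multiply by the LCM of the denominators to clear all fractions.
    denom_lcm = reduce(lambda a, b: a * b // gcd(a, b),
                       (r.denominator for r in rates))
    counts = [int(r * denom_lcm) for r in rates]
    # Reduce to lowest terms: divide by the GCD of all counts.
    g = reduce(gcd, counts)
    return [c // g for c in counts]

print(wrr_packets_per_visit([0.5, 0.75, 1.0], [50, 500, 1500]))  # → [60, 9, 4]
```

This matches the slide's answer: per round, serve 60 packets from the first queue, 9 from the second, and 4 from the third.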
Weighted Fair Queueing (WFQ)
- Deals better with variable-size packets and weights
- Also known as packet-by-packet GPS (PGPS)
- Find the finish time of a packet, had we been doing GPS; serve packets in order of their finish times
- WFQ details:
  - suppose, in each round, the server served one bit from each active connection
  - the round number is the number of rounds already completed; it can be fractional
  - if a packet of length p arrives to an empty queue when the round number is R, it will complete service when the round number is R + p, so its finish number is R + p; this is independent of the number of other connections!
  - if a packet arrives to a non-empty queue, and the previous packet has a finish number of f, then the packet's finish number is f + p
  - serve packets in order of finish numbers
  - the finish time of a packet is not the same as its finish number
WFQ continued
- A queue may need to be considered non-empty even if it has no packets in it
  - e.g. packets of length 1 from connections A and B, on a link of speed 1 bit/sec
  - at time 1, the packet from A has been served, and the round number is 0.5
  - A has no packets in its queue, yet should be considered non-empty, because a packet arriving to it at time 1 should have finish number 1 + p
- A connection is active if the last packet served from it, or in its queue, has a finish number greater than the current round number
- Assuming we know the current round number R, the finish number of a packet of length p is:
  - previous finish number + p, if it arrives to an active connection
  - R + p, if it arrives to an inactive connection
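Both finish-number rules collapse into one expression, max(R, previous finish) + p, since an inactive connection is exactly one whose previous finish number is at most R. A minimal sketch of this bookkeeping, assuming equal weights and a deliberately simplified round-number update (it sidesteps the iterated-deletion issue discussed below); the class and method names are invented for the example.

```python
import heapq

class WFQSketch:
    """Toy WFQ bookkeeping: finish numbers per the slide's rules.
    Equal weights; the round-number update is a simplification."""

    def __init__(self):
        self.round_number = 0.0
        self.last_finish = {}  # connection -> finish number of its last packet
        self.heap = []         # (finish_number, seq, connection, length)
        self.seq = 0           # tie-breaker so tuples never compare connections

    def arrive(self, conn, length):
        # Active connection: previous finish + p.  Inactive: R + p.
        # max() covers both cases in one expression.
        base = max(self.round_number, self.last_finish.get(conn, 0.0))
        fn = base + length
        self.last_finish[conn] = fn
        heapq.heappush(self.heap, (fn, self.seq, conn, length))
        self.seq += 1
        return fn

    def serve(self):
        # Transmit the packet with the smallest finish number.
        fn, _, conn, length = heapq.heappop(self.heap)
        active = len({c for _, _, c, _ in self.heap} | {conn})
        # Simplified: round number grows inversely with active connections.
        self.round_number += length / active
        return conn, fn
```

With two length-1 and length-2 packets arriving to an idle scheduler from A and B, A's packet gets finish number 1, B's gets 2, and A is served first, matching the finish-number ordering above.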
WFQ: computing the round number
- Naively: round number = number of rounds of service completed so far
  - what if the server has not served all connections in a round?
  - what if new conversations join halfway through a round?
- Redefine the round number as a real-valued variable that increases at a rate inversely proportional to the number of currently active connections
- With this change, WFQ emulates GPS instead of bit-by-bit round robin
- Iterated deletion:
  - the server recomputes the round number on each packet arrival
  - at any recomputation, the number of conversations can go up by at most one, but can go down to zero, leading to a change in the round number
  - solution: use the previous count to compute the round number; if this makes some conversation inactive, recompute; repeat until no conversations become inactive
WFQ Implementation
- On packet arrival:
  - use source + destination address (or VCI) to classify the packet and look up the finish number of the last packet served (or waiting to be served)
  - recompute the round number
  - compute the finish number
  - insert into a priority queue sorted by finish numbers
  - if there is no space, drop the packet with the largest finish number
- On service completion: select the packet with the lowest finish number
- Pros: like GPS, provides fairness and protection; can obtain a worst-case end-to-end delay bound
- Cons: needs per-connection state; requires a priority queue
- Used in most Cisco routers
Scheduling Guaranteed-Service Connections
- With best-effort connections, the goal is fairness
- Guaranteed-service scheduling:
  - WFQ: provides performance (end-to-end delay) guarantees
  - Delay-Earliest-Due-Date (Delay-EDD):
    - earliest-due-date: the packet with the earliest deadline is selected
    - Delay-EDD prescribes how to assign deadlines to packets
    - a source is required to send slower than its peak rate
    - bandwidth at the scheduler is reserved at the peak rate
    - deadline = expected arrival time + delay bound
    - the delay bound is independent of the bandwidth requirement
    - implementation requires per-connection state and a priority queue
  - Rate-controlled scheduling: a regulator shapes the traffic; a scheduler provides the performance guarantees
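The deadline rule above can be made concrete. The slide gives only "deadline = expected arrival time + delay bound"; the recurrence for the expected arrival time below (a packet is treated as arriving no earlier than one peak-rate spacing after the previous expected arrival) is an assumption of this sketch, and the function name is invented.

```python
def delay_edd_deadline(arrival, prev_expected, peak_spacing, delay_bound):
    """Delay-EDD deadline assignment (sketch).

    arrival       -- actual arrival time of this packet
    prev_expected -- expected arrival time of the previous packet
    peak_spacing  -- minimum inter-packet time at the source's peak rate
    delay_bound   -- per-hop delay bound granted to this connection

    Because bandwidth is reserved at the peak rate, a packet arriving
    early is treated as if it had arrived on schedule, so a source
    cannot gain by sending faster than its declared peak rate.
    """
    expected = max(arrival, prev_expected + peak_spacing)
    return expected, expected + delay_bound
```

For instance, with a peak spacing of 1 s and a delay bound of 2 s, a packet arriving at t = 0.5 after a previous expected arrival at t = 0 gets expected arrival 1.0 and deadline 3.0, not 2.5: its early arrival buys it nothing.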
Packet Dropping
- Packets that cannot be served immediately are buffered
- Full buffers => need a packet drop strategy
- Packet losses happen from best-effort connections
- Shouldn't drop packets unless imperative (wasted resources)
- Common strategies:
  - aggregation: classify packets into classes and drop a packet from the class with the longest queue
  - priorities: drop lower-priority packets
    - an endpoint or regulator marks the CLP bit in packets
    - if the network has capacity, all traffic is carried; otherwise low-priority traffic is dropped
    - separating priorities within a single connection is hard
  - early drop: drop even if space is available
  - drop position: drop a packet from some position in the queue
Early Random Drop and RED
- Early drop => drop even if space is available
  - signals endpoints to reduce their rate
  - cooperative sources get lower overall delays; uncooperative sources get severe packet loss
- Early random drop:
  - drop an arriving packet with a fixed drop probability if the queue length exceeds a threshold
  - intuition: misbehaving sources are more likely to send packets, and hence more likely to see packet losses
- Random early detection (RED) makes three improvements:
  - the metric is a moving average of the queue length
  - the packet drop probability is a function of the mean queue length
  - packets can be marked instead of dropped
- RED improves the performance of a network of cooperating TCP sources:
  - small bursts pass through unharmed
  - prevents a severe reaction to mild overload
  - allows sources to detect the network state without losses
  - controls queue length regardless of endpoint cooperation
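The two RED ingredients named above, a moving average of queue length and a drop probability that grows with it, can be sketched directly. This is the basic linear drop profile under assumed parameter names (min/max thresholds, maximum probability, averaging weight); real implementations add refinements such as count-based probability scaling and "gentle" RED.

```python
def update_avg(avg_q, sample, w=0.002):
    """Exponentially weighted moving average of the instantaneous
    queue length; small w makes the average ignore short bursts."""
    return (1 - w) * avg_q + w * sample

def red_drop_probability(avg_q, min_th, max_th, max_p):
    """RED drop (or mark) probability as a function of the average
    queue length: zero below min_th, rising linearly to max_p at
    max_th, and 1 beyond it."""
    if avg_q < min_th:
        return 0.0          # small bursts pass through unharmed
    if avg_q >= max_th:
        return 1.0          # sustained overload: drop everything arriving
    return max_p * (avg_q - min_th) / (max_th - min_th)
```

Because the decision uses the average rather than the instantaneous queue length, a brief burst barely moves the drop probability, while a persistently long queue steadily raises it, which is exactly the "prevents severe reaction to mild overload" behavior claimed above.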
Drop Position
- Can drop a packet from the head, the tail, or a random position in the queue
- Tail: easy; the default approach
- Head: harder; lets the source detect the loss earlier
- Random: hardest; if there is no aggregation, hurts hogs the most