Queueing and Scheduling
- Traffic is moved by connecting end-systems to switches, and switches to each other
- Switches: packet switches vs. circuit switches; connectionless vs. connection-oriented
- Data arriving at an input port of a switch must be moved to one or more of the output ports
Blocking in Packet Switches
- Can have both internal and output blocking
  - internal: no path to the output
  - output: link unavailable
- Unlike a circuit switch, cannot predict whether packets will block
- If a packet is blocked, must either buffer or drop it
- Dealing with blocking:
  - match input rate to service rate
  - overprovisioning: internal links much faster than inputs
  - buffering: at the input port, in the fabric, at the output port, in shared memory, or a hybrid
Input Buffering (input queueing)
- No speedup in buffers or trunks (unlike an output-queued switch)
- Needs an arbiter
- Problem: head-of-line (HOL) blocking
Dealing with HOL blocking
- Per-output queues at the inputs
- Arbiter must choose one of the input ports for each output port
- Parallel iterated matching:
  - inputs tell the arbiter which outputs they are interested in
  - each output selects one of its requesting inputs
  - some inputs may get more than one grant, others may get none
  - if an input gets more than one grant, it picks one at random and tells the output
  - losing inputs and outputs try again
- Used in the DEC Autonet 2 switch
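The request/grant/accept steps above can be sketched in a few lines. This is a minimal illustration, not the DEC Autonet 2 implementation: the dict-of-sets representation of requests, the function name, and the iteration count are all assumptions made for the example.

```python
import random

def parallel_iterated_matching(requests, iterations=3):
    """One arbitration cycle of parallel iterated matching (sketch).

    `requests` maps input port -> set of output ports it has packets for
    (a hypothetical representation). Returns accepted input -> output matches.
    """
    matched = {}
    free_inputs = set(requests)
    free_outputs = {o for outs in requests.values() for o in outs}
    for _ in range(iterations):
        # Grant phase: each free output picks one of its requesting free inputs.
        grants = {}  # input -> list of outputs that granted it
        for out in free_outputs:
            requesters = [i for i in free_inputs if out in requests[i]]
            if requesters:
                grants.setdefault(random.choice(requesters), []).append(out)
        if not grants:
            break  # no progress possible
        # Accept phase: an input with more than one grant picks one at random.
        for inp, outs in grants.items():
            out = random.choice(outs)
            matched[inp] = out
            free_inputs.discard(inp)
            free_outputs.discard(out)
        # Losing inputs and outputs automatically retry in the next iteration.
    return matched
```

Each output grants at most once per iteration, so no two inputs can accept the same output; repeated iterations let the losers retry, which is what makes the matching converge quickly in practice.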
Output Queueing
- Doesn’t suffer from head-of-line blocking
- Output buffers need to run much faster than trunk speed
- Can reduce some of the cost by using the knockout principle: it is unlikely that all N inputs will have packets for the same output
- Most commonly used mechanism in routers
Shared Memory
- Route only the header to the output port
- Bottleneck is the time taken to read and write the multiported memory
- Doesn’t scale to large switches
Scheduling
- Scheduling disciplines:
  - resolve contention
  - allocate bandwidth, delay, and loss
  - determine the fairness of the network
  - give different qualities of service and performance guarantees
- Components: decides the service order; manages the queue of service requests
- Example: consider queries awaiting a web server; the scheduling discipline decides the service order, and also whether some query should be ignored
Scheduling
- Use scheduling wherever contention may occur
  - usually studied at the network layer, at the output queues of switches
- Application types:
  - best-effort (adaptive, non-real-time): e.g. email, some types of file transfer
  - guaranteed-service (non-adaptive, real-time): e.g. packet voice, interactive video, stock quotes
- Requirements:
  - implementation ease: few instructions or simple hardware; has to make a decision once every few microseconds! Work per packet should scale less than linearly with the number of active connections
  - fairness: protection against traffic hogs
  - performance bounds: on bandwidth, delay, and loss
  - admission control: needed to provide QoS
Max-Min Fairness
- A scheduling discipline allocates a resource
- An allocation is fair if it satisfies max-min fairness
- Intuitively:
  - each connection gets no more than what it wants
  - the excess, if any, is equally shared
[Figure: demands from connections A, B, C; transfer half of the excess; unsatisfied demand]
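The intuition above (satisfy the smallest demands first, share the excess equally) is the classic water-filling computation. A minimal sketch, assuming demands and capacity are in the same units; the function name is made up for this example.

```python
def max_min_fair(demands, capacity):
    """Max-min fair allocation of `capacity` among `demands`:
    no source gets more than it asks for, and any excess is split
    equally among the still-unsatisfied sources."""
    alloc = {}
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    while unsatisfied:
        fair_share = capacity / len(unsatisfied)
        i = unsatisfied[0]  # smallest remaining demand
        if demands[i] <= fair_share:
            # Demand fits within the fair share: grant it fully and
            # redistribute its unused share among the rest.
            alloc[i] = demands[i]
            capacity -= demands[i]
            unsatisfied.pop(0)
        else:
            # No remaining demand fits: everyone left gets an equal share.
            for j in unsatisfied:
                alloc[j] = fair_share
            break
    return [alloc[i] for i in range(len(demands))]
```

For example, demands of 2, 4, and 10 on a link of capacity 12 yield the allocation [2, 4, 6]: the first two connections are fully satisfied, and the third receives all of the remaining capacity.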
Scheduling Design Choices
- Priority:
  - a packet is served from a given priority level only if no packets exist at higher levels
  - the highest level gets the lowest delay; lower levels risk starvation
- Work conservation:
  - conservation law: Σᵢ ρᵢqᵢ = constant, where ρᵢ = λᵢxᵢ; λᵢ is the traffic arrival rate, xᵢ the mean service time per packet, and qᵢ the mean waiting time at the scheduler, for connection i
  - i.e. the sum of the mean queueing delays received by a set of multiplexed connections, weighted by their share of the link, is independent of the scheduling discipline
  - work-conserving vs. non-work-conserving disciplines
- Service algorithm:
  - FCFS: bandwidth hogs win; no delay guarantees
  - service tags: arbitrarily reorder the queue; provide guarantees; expensive sorting
Work Conserving v/s Non-Work-Conserving
- A work-conserving discipline is never idle when packets await service
- A non-work-conserving discipline may be idle even when packets await service
  - main idea: delay a packet until it is eligible
  - reduces delay-jitter => fewer buffers in the network
  - choosing the eligibility time:
    - rate-jitter regulator: bounds the maximum outgoing rate
    - delay-jitter regulator: compensates for variable delay at the previous hop
  - always punishes a misbehaving source
  - increases mean delay; wastes bandwidth; implementation cost
Scheduling Best-effort Connections
- Main requirement is fairness
- Achievable using Generalized Processor Sharing (GPS):
  - visit each non-empty queue in turn
  - serve an infinitesimal amount from each
- GPS is not implementable; we can serve only whole packets
- No packet discipline can be as fair as GPS
Weighted Round Robin
- Serve a packet from each non-empty queue in turn
- Unfair if packets are of different lengths or weights are not equal
- Different weights, fixed packet size:
  - serve more than one packet per visit, after normalizing to obtain integer weights
- Different weights, variable-size packets:
  - normalize the weights by mean packet size
  - e.g. weights {0.5, 0.75, 1.0}, mean packet sizes {50, 500, 1500}
  - normalized weights: {0.5/50, 0.75/500, 1.0/1500} = {0.01, 0.0015, 0.000666…}; normalize again to integers: {60, 9, 4}
  - problem: need to know the mean packet size in advance
- Used in some ATM switches
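The normalization in the example can be reproduced exactly with rational arithmetic: divide each weight by its mean packet size, clear the fractions, and reduce to the smallest integers. A small sketch (the function name is invented for this example):

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def wrr_packets_per_visit(weights, mean_sizes):
    """Normalize WRR weights by mean packet size, then scale to the
    smallest integer packet counts served per round."""
    # Per-connection service rate in packets: weight / mean packet size.
    # (0.5, 0.75, 1.0 are exactly representable as binary floats, so
    # Fraction() recovers them exactly.)
    rates = [Fraction(w) / s for w, s in zip(weights, mean_sizes)]
    # Multiply by the LCM of the denominators to clear all fractions.
    denom_lcm = reduce(lambda a, b: a * b // gcd(a, b),
                       (r.denominator for r in rates))
    counts = [int(r * denom_lcm) for r in rates]
    # Reduce to lowest terms: divide by the GCD of all counts.
    g = reduce(gcd, counts)
    return [c // g for c in counts]

print(wrr_packets_per_visit([0.5, 0.75, 1.0], [50, 500, 1500]))  # → [60, 9, 4]
```

This matches the slide's answer: per round, serve 60 packets from the first queue, 9 from the second, and 4 from the third.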
Weighted Fair Queueing (WFQ)
- Deals better with variable-size packets and weights
- Also known as packet-by-packet GPS (PGPS)
- Find the finish time of a packet, had we been doing GPS; serve packets in order of their finish times
- WFQ details:
  - suppose, in each round, the server served one bit from each active connection
  - the round number is the number of rounds already completed; it can be fractional
  - if a packet of length p arrives to an empty queue when the round number is R, it will complete service when the round number is R + p, so its finish number is R + p; this is independent of the number of other connections!
  - if a packet arrives to a non-empty queue, and the previous packet has a finish number of f, then the packet's finish number is f + p
  - serve packets in order of finish numbers
  - the finish time of a packet is not the same as its finish number
WFQ continued
- A queue may need to be considered non-empty even if it has no packets in it
  - e.g. packets of length 1 from connections A and B, on a link of speed 1 bit/sec
  - at time 1, the packet from A has been served, and the round number is 0.5
  - A has no packets in its queue, yet should be considered non-empty, because a packet arriving to it at time 1 should have finish number 1 + p
- A connection is active if the last packet served from it, or in its queue, has a finish number greater than the current round number
- Assuming we know the current round number R, the finish number of a packet of length p is:
  - previous finish number + p, if it arrives to an active connection
  - R + p, if it arrives to an inactive connection
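Both finish-number rules collapse into one expression, max(R, previous finish) + p, since an inactive connection is exactly one whose previous finish number is at most R. A minimal sketch of this bookkeeping, assuming equal weights and a deliberately simplified round-number update (it sidesteps the iterated-deletion issue discussed below); the class and method names are invented for the example.

```python
import heapq

class WFQSketch:
    """Toy WFQ bookkeeping: finish numbers per the slide's rules.
    Equal weights; the round-number update is a simplification."""

    def __init__(self):
        self.round_number = 0.0
        self.last_finish = {}  # connection -> finish number of its last packet
        self.heap = []         # (finish_number, seq, connection, length)
        self.seq = 0           # tie-breaker so tuples never compare connections

    def arrive(self, conn, length):
        # Active connection: previous finish + p.  Inactive: R + p.
        # max() covers both cases in one expression.
        base = max(self.round_number, self.last_finish.get(conn, 0.0))
        fn = base + length
        self.last_finish[conn] = fn
        heapq.heappush(self.heap, (fn, self.seq, conn, length))
        self.seq += 1
        return fn

    def serve(self):
        # Transmit the packet with the smallest finish number.
        fn, _, conn, length = heapq.heappop(self.heap)
        active = len({c for _, _, c, _ in self.heap} | {conn})
        # Simplified: round number grows inversely with active connections.
        self.round_number += length / active
        return conn, fn
```

With two length-1 and length-2 packets arriving to an idle scheduler from A and B, A's packet gets finish number 1, B's gets 2, and A is served first, matching the finish-number ordering above.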
WFQ: computing the round number
- Naively: round number = number of rounds of service completed so far
  - what if the server has not served all connections in a round?
  - what if new conversations join halfway through a round?
- Redefine the round number as a real-valued variable that increases at a rate inversely proportional to the number of currently active connections
- With this change, WFQ emulates GPS instead of bit-by-bit round robin
- Iterated deletion:
  - the server recomputes the round number on each packet arrival
  - at any recomputation, the number of conversations can go up by at most one, but can go down to zero, leading to a change in the round number
  - solution: use the previous count to compute the round number; if this makes some conversation inactive, recompute; repeat until no conversations become inactive
WFQ Implementation
- On packet arrival:
  - use source + destination address (or VCI) to classify the packet and look up the finish number of the last packet served (or waiting to be served)
  - recompute the round number
  - compute the finish number
  - insert into a priority queue sorted by finish numbers
  - if there is no space, drop the packet with the largest finish number
- On service completion: select the packet with the lowest finish number
- Pros: like GPS, provides fairness and protection; can obtain a worst-case end-to-end delay bound
- Cons: needs per-connection state; requires a priority queue
- Used in most Cisco routers
Scheduling Guaranteed-Service Connections
- With best-effort connections, the goal is fairness
- Guaranteed-service scheduling:
  - WFQ: provides performance (end-to-end delay) guarantees
  - Delay-Earliest-Due-Date (Delay-EDD):
    - earliest-due-date: the packet with the earliest deadline is selected
    - Delay-EDD prescribes how to assign deadlines to packets
    - a source is required to send slower than its peak rate
    - bandwidth at the scheduler is reserved at the peak rate
    - deadline = expected arrival time + delay bound
    - the delay bound is independent of the bandwidth requirement
    - implementation requires per-connection state and a priority queue
  - Rate-controlled scheduling: a regulator shapes the traffic; a scheduler provides the performance guarantees
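The deadline rule above can be made concrete. The slide gives only "deadline = expected arrival time + delay bound"; the recurrence for the expected arrival time below (a packet is treated as arriving no earlier than one peak-rate spacing after the previous expected arrival) is an assumption of this sketch, and the function name is invented.

```python
def delay_edd_deadline(arrival, prev_expected, peak_spacing, delay_bound):
    """Delay-EDD deadline assignment (sketch).

    arrival       -- actual arrival time of this packet
    prev_expected -- expected arrival time of the previous packet
    peak_spacing  -- minimum inter-packet time at the source's peak rate
    delay_bound   -- per-hop delay bound granted to this connection

    Because bandwidth is reserved at the peak rate, a packet arriving
    early is treated as if it had arrived on schedule, so a source
    cannot gain by sending faster than its declared peak rate.
    """
    expected = max(arrival, prev_expected + peak_spacing)
    return expected, expected + delay_bound
```

For instance, with a peak spacing of 1 s and a delay bound of 2 s, a packet arriving at t = 0.5 after a previous expected arrival at t = 0 gets expected arrival 1.0 and deadline 3.0, not 2.5: its early arrival buys it nothing.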
Packet Dropping
- Packets that cannot be served immediately are buffered
- Full buffers => need a packet drop strategy
- Packet losses happen from best-effort connections
- Shouldn't drop packets unless imperative (wasted resources)
- Common strategies:
  - aggregation: classify packets into classes and drop a packet from the class with the longest queue
  - priorities: drop lower-priority packets
    - an endpoint or regulator marks the CLP bit in packets
    - if the network has capacity, all traffic is carried; otherwise low-priority traffic is dropped
    - separating priorities within a single connection is hard
  - early drop: drop even if space is available
  - drop position: drop a packet from some position in the queue
Early Random Drop and RED
- Early drop => drop even if space is available
  - signals endpoints to reduce their rate
  - cooperative sources get lower overall delays; uncooperative sources get severe packet loss
- Early random drop:
  - drop an arriving packet with a fixed drop probability if the queue length exceeds a threshold
  - intuition: misbehaving sources are more likely to send packets, and hence more likely to see packet losses
- Random early detection (RED) makes three improvements:
  - the metric is a moving average of the queue length
  - the packet drop probability is a function of the mean queue length
  - packets can be marked instead of dropped
- RED improves the performance of a network of cooperating TCP sources:
  - small bursts pass through unharmed
  - prevents a severe reaction to mild overload
  - allows sources to detect the network state without losses
  - controls queue length regardless of endpoint cooperation
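The two RED ingredients named above, a moving average of queue length and a drop probability that grows with it, can be sketched directly. This is the basic linear drop profile under assumed parameter names (min/max thresholds, maximum probability, averaging weight); real implementations add refinements such as count-based probability scaling and "gentle" RED.

```python
def update_avg(avg_q, sample, w=0.002):
    """Exponentially weighted moving average of the instantaneous
    queue length; small w makes the average ignore short bursts."""
    return (1 - w) * avg_q + w * sample

def red_drop_probability(avg_q, min_th, max_th, max_p):
    """RED drop (or mark) probability as a function of the average
    queue length: zero below min_th, rising linearly to max_p at
    max_th, and 1 beyond it."""
    if avg_q < min_th:
        return 0.0          # small bursts pass through unharmed
    if avg_q >= max_th:
        return 1.0          # sustained overload: drop everything arriving
    return max_p * (avg_q - min_th) / (max_th - min_th)
```

Because the decision uses the average rather than the instantaneous queue length, a brief burst barely moves the drop probability, while a persistently long queue steadily raises it, which is exactly the "prevents severe reaction to mild overload" behavior claimed above.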
Drop Position
- Can drop a packet from the head, the tail, or a random position in the queue
- Tail: easy; the default approach
- Head: harder; lets the source detect the loss earlier
- Random: hardest; if there is no aggregation, hurts hogs the most