MPLS and Traffic Engineering CMPT771 - 2008 Spring Liang Zhou Jiang (Jeff)
CMPT771 Network Layer 1 Network Layer Issues and Applications CMPT 771 Internet Architecture and...
-
Upload
dominic-fisher -
Category
Documents
-
view
221 -
download
0
Transcript of CMPT771 Network Layer 1 Network Layer Issues and Applications CMPT 771 Internet Architecture and...
CMPT771 Network Layer 1
Network Layer Issues and Applications
CMPT 771 Internet Architecture and Protocols
CMPT771 Network Layer 2
Network Layer Issues and Applications
Contents1. Network Service Models2. Routing Principles
Link state routing Distance vector routing Hierarchical routing
3. Multicast Routing4. Video Multicast5. Proxy Caching6. Peer-to-Peer7. Internet QoS
CMPT771 Network Layer 3
Network layer functions
deliver packets from sending to receiving hosts
network layer protocols in every host, router
three important functions: path determination: route
taken by packets from source to dest. Routing algorithms
forwarding: move packets from router’s input to appropriate router output
call setup: some network architectures require router call setup along path before data flows
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
application
transportnetworkdata linkphysical
application
transportnetworkdata linkphysical
CMPT771 Network Layer 4
Network service model
Q: What service model for “channel” transporting packets from sender to receiver?
guaranteed bandwidth? preservation of inter-packet
timing (no jitter)? loss-free delivery? in-order delivery? congestion feedback to
sender?
? ??virtual circuit
or datagram?
The most important abstraction provided
by network layer:
serv
ice a
bst
ract
ion
CMPT771 Network Layer 5
Virtual circuits
“ source-to-dest path behaves much like telephone circuit” performance-wise network actions along source-to-dest path
application
transportnetworkdata linkphysical
application
transportnetworkdata linkphysical
1. Initiate call 2. incoming call
3. Accept call4. Call connected5. Data flow begins 6. Receive data
CMPT771 Network Layer 6
Virtual circuits call setup, teardown for each call before data can flow each packet carries VC identifier (not destination host ID) every router on source-dest path maintains “state” for each
passing connection transport-layer connection only involved two end systems
link, router resources (bandwidth, buffers) may be allocated to VC
to get circuit-like perf.
used to setup, maintain teardown VC used in ATM, frame-relay, X.25 not used in today’s Internet
CMPT771 Network Layer 7
Datagram networks: Internet’s model
no call setup at network layer routers: no state about end-to-end connections
no network-level concept of “connection” Forwarded: using destination host address
packets between same source-dest pair may take different paths
application
transportnetworkdata linkphysical
application
transportnetworkdata linkphysical
1. Send data 2. Receive data
CMPT771 Network Layer 8
Datagram or VC network: why?
Internet (Datagram) data exchange among
computers “elastic” service, no strict
timing req. “smart” end systems
(computers) can adapt, perform
control, error recovery simple inside network,
complexity at “edge” heterogeneous link types
different characteristics uniform service difficult
Asynchronous Transfer Mode - ATM (VC)
evolved from telephony human conversation:
strict timing, reliability requirements
need for guaranteed service
“dumb” end systems telephones complexity inside
network
Hard state vs Soft state
CMPT771 Network Layer 9
Outline
1. Network Service Models2. Routing Principles
Link state routing Distance vector routing Hierarchical routing
3. Multicast Routing4. Video Multicast5. Proxy Caching6. Peer-to-Peer7. Internet QoS
CMPT771 Network Layer 10
Routing
Graph abstraction for routing algorithms:
graph nodes are routers graph edges are physical
links link cost: delay, $ cost,
or congestion level
Goal: determine a “good” path
(sequence of routers) thru network from source to
dest.
Routing protocol
A
ED
CB
F
2
2
13
1
1
2
53
5
“ good” path: typically means
minimum cost path other def’s possible
CMPT771 Network Layer 11
Routing Algorithm classification
Global or decentralized information?
Global: all routers have complete
topology, link cost info “link state” algorithmsDecentralized: router knows physically-
connected neighbors, link costs to neighbors
iterative process of computation, exchange of info with neighbors
“distance vector” algorithms
Static or dynamic?Static: routes change slowly
over timeDynamic: routes change more
quickly periodic update in response to link
cost changes
CMPT771 Network Layer 12
Link-State Routing Algorithm
Dijkstra’s algorithm net topology, link costs
known to all nodes accomplished via “link
state broadcast” all nodes have same info
computes least cost paths from one node (‘source”) to all other nodes gives routing table for
that node
Notation: c(i,j): link cost from node i to
j. cost infinite if not direct neighbors
D(v): current value of cost of path from source to dest V
p(v): predecessor node along path from source to v, that is next v
N: set of nodes whose least cost path definitively known
CMPT771 Network Layer 13
Distance Vector Routing Algorithm
iterative: continues until no nodes
exchange info. self-terminating: no
“signal” to stop
asynchronous: nodes need not exchange
info/iterate in lock step!distributed: each node communicates
only with directly-attached neighbors
Key Idea Given my distance to a
neighboring node Given the distances from the
neighboring nodes to remote nodes
My distances to remote nodes
CMPT771 Network Layer 14
Distance Vector Routing Algorithm
Distance Table data structure each node has its own row for each possible destination column for each directly-attached
neighbor to node example: in node X, for dest. Y via
neighbor Z:
D (Y,Z)X
distance from X toY, via Z as next hop
c(X,Z) + min {D (Y,w)}Z
w
=
=
D ()
Y
Z
Y
1
7
Z
2
5
Xvia
dest
inat
ion
CMPT771 Network Layer 15
Distance Vector Routing: overview
Iterative, asynchronous: each local iteration caused by:
message from neighbor: its least cost path change from neighbor
Distributed: each node notifies
neighbors only when its least cost path to any destination changes
neighbors then notify their neighbors if necessary
wait for (msg from neighbor)
recompute distance table
if least cost path to any dest
has changed, notify neighbors
Each node:
CMPT771 Network Layer 16
Distance Vector: Count-to-Infinity Problem
11
AB C
3
1
CMPT771 Network Layer 17
Comparison of LS and DV algorithms
Message complexity LS: with n nodes, E links,
O(nE) msgs sent each DV: exchange between
neighbors only convergence time varies
Speed of Convergence LS: O(n2) algorithm requires
O(nE) msgs may have oscillations
DV: convergence time varies may be routing loops count-to-infinity problem
Robustness: what happens if router malfunctions?
LS: node can advertise
incorrect link cost each node computes only
its own table
DV: DV node can advertise
incorrect path cost each node’s table used by
others • error propagate thru
network
CMPT771 Network Layer 18
Routing in the Internet
The Global Internet consists of Autonomous Systems (AS) interconnected with each other: Stub AS: small corporation: one connection to other AS’s Multihomed AS: large corporation (no transit): multiple
connections to other AS’s Transit AS: provider, hooking many AS’s together
Two-level routing: Intra-AS: administrator responsible for choice of routing
algorithm within network Inter-AS: unique standard for inter-AS routing: BGP
CMPT771 Network Layer 19
Internet AS Hierarchy
Intra-AS border (exterior gateway) routers
Inter-AS interior (gateway) routers
CMPT771 Network Layer 20
Topology of Tier-1 ISP
CMPT771 Network Layer 21
Redundancy in AS Routes
CMPT771 Network Layer 22
Intra-AS Routing
Also known as Interior Gateway Protocols (IGP) Most common Intra-AS routing protocols:
RIP: Routing Information Protocol
OSPF: Open Shortest Path First
IGRP: Interior Gateway Routing Protocol (Cisco proprietary)
CMPT771 Network Layer 23
Internet inter-AS routing: BGP
BGP (Border Gateway Protocol): the de facto standard Path Vector protocol:
similar to Distance Vector protocol each Border Gateway broadcast to neighbors (peers)
entire path (i.e., sequence of AS’s) to destination BGP routes to networks (ASs), not individual hosts E.g., Gateway X may send its path to dest. Z:
Path (X,Z) = X,Y1,Y2,Y3,…,Z
CMPT771 Network Layer 24
Why different Intra-/Inter-AS routing ?
Shaw
Bell
Telus
C. Huitema, Routing on the Internet. Prentice-Hall, 2000.
CMPT771 Network Layer 25
Outline
1. Network Service Models2. Routing Principles
Link state routing Distance vector routing Hierarchical routing
3. Multicast Routing4. Video Multicast5. Proxy Caching6. Peer-to-Peer7. Internet QoS
CMPT771 Network Layer 26
Multicast: one sender to many receivers
Multicast: act of sending datagram to multiple receivers with single “transmit” operation analogy: one teacher to many students
Question: how to achieve multicast (which layer ?)
Multicast via unicast source sends N unicast
datagrams, one addressed to each of N receivers
multicast receiver (red)
not a multicast receiver (red)
routersforward unicastdatagrams
CMPT771 Network Layer 27
Multicast: one sender to many receivers
Multicast: act of sending datagram to multiple receivers with single “transmit” operation analogy: one teacher to many students
Question: how to achieve multicast
Multicast via unicast: Problems
1. Inefficient2. Addressing
multicast receiver (red)
not a multicast receiver (red)
routersforward unicastdatagrams
CMPT771 Network Layer 28
Multicast: one sender to many receivers
Multicast: act of sending datagram to multiple receivers with single “transmit” operation analogy: one teacher to many students
Question: how to achieve multicast
Network multicast Router actively participate in
multicast, making copies of packets as needed and forwarding towards multicast receivers
Multicastrouters (red) duplicate and forward multicast datagrams
CMPT771 Network Layer 29
Multicast: one sender to many receivers
Multicast: act of sending datagram to multiple receivers with single “transmit” operation analogy: one teacher to many students
Question: how to achieve multicast
Application-layer multicast end systems involved in
multicast copy and forward unicast datagrams among themselves
CMPT771 Network Layer 30
Multicast Routing: Problem Statement
Goal: find a tree (or trees) connecting routers having local mcast group members tree: not all paths between routers used source-based: different tree from each sender to rcvrs shared-tree: same tree used by all group members (senders)
Shared tree Source-based trees
CMPT771 Network Layer 31
Approaches for building mcast trees
Approaches: source-based tree: one tree per source
shortest path trees reverse path forwarding
group-shared tree: group uses one tree minimal spanning (Steiner) center-based trees
…we first look at basic approaches, then specific protocols adopting these approaches
CMPT771 Network Layer 32
Source based Tree: Shortest Path Tree
mcast forwarding tree: tree of shortest path routes from source to all receivers Dijkstra’s algorithm
R1
R2
R3
R4
R5
R6 R7
21
6
3 4
5
i
router with attachedgroup member
router with no attachedgroup member
link used for forwarding,i indicates order linkadded by algorithm
LEGENDS: source
CMPT771 Network Layer 33
Source based Tree: Flooding
• Problem: Broadcast storm
R1
R2
R3
R4
R5
R6 R7
S: source
CMPT771 Network Layer 34
Source based Tree: Reverse Path Forwarding
if (mcast datagram received on incoming link on shortest path back to center)
then flood datagram onto all outgoing links else ignore datagram
rely on router’s knowledge of unicast shortest path from it to sender
each router has simple forwarding behavior:
CMPT771 Network Layer 35
Reverse Path Forwarding: example
• result is a source-specific reverse SPT– may be a bad choice with asymmetric links
R1
R2
R3
R4
R5
R6 R7
router with attachedgroup member
router with no attachedgroup member
datagram will be forwarded
LEGENDS: source
datagram will not be forwarded
CMPT771 Network Layer 36
Reverse Path Forwarding: pruning
forwarding tree contains subtrees with no mcast group members no need to forward datagrams down subtree “prune” msgs sent upstream by router with no
downstream group members
R1
R2
R3
R4
R5
R6 R7
router with attachedgroup member
router with no attachedgroup member
prune message
LEGENDS: source
links with multicastforwarding
P
P
P
CMPT771 Network Layer 37
Reverse Path Forwarding: Multiple trees for multi-sender
R1
R2
R3
R4
R5
R6 R7
S: sourceR1
R2
R3
R4
R5
R6 R7
S: source
CMPT771 Network Layer 38
Shared-Tree: General Problem
Minimum Spanning Tree: minimum cost tree connecting all routers with attached group members Algorithms ?
Steiner Tree : minimum cost tree connecting a set of routers, which includes all that with attached group members
CMPT771 Network Layer 39
Shared-Tree: General Problem
Minimum Spanning Tree: minimum cost tree connecting all routers with attached group members Prim, Kurskal algorithms
Steiner Tree : minimum cost tree connecting a set of routers, which includes all that with attached group members
CMPT771 Network Layer 40
Spanning Tree vs Steiner Tree
CMPT771 Network Layer 41
An example of a minimum Steiner Tree
AB
D G
H
I
C
F
K J
E
4
54
52
2
1
64
15
1
132
3
Mcast group members
Relay Nodes
AG = 6AE = 5AD = 3AC = 3AI = 8
Shortest Paths from Atotal distribution cost = 16
*
AG = 8AE = 5AD = 3AC = 3AI = 8
Steiner Tree paths from Atotal distribution cost=13
CMPT771 Network Layer 42
Shared-Tree: General Problem
Minimum Spanning Tree: minimum cost tree connecting all routers with attached group members Prim, Kurskal algorithms
Minimum Steiner Tree : minimum cost tree connecting a set of routers, which includes all that with attached group members problem is NP-complete excellent heuristics exists
CMPT771 Network Layer 43
KMB - A Fast Algorithm for Steiner Trees(Acta
Informatica) Output a Stiener Tree Th for G and the Z-vertices.
Step 1: Construct a complete directed distance graph G1=(V1,E1,c1) from G and Z.
Step 2: Find the minimum spanning tree T1 of G1. (pick any to break ties)
Step3: Construct a subgraph GS of G by replacing each edge in T1 by its corresponding shortest path in G. (break ties arbitrarily).
Step 4: Find the minimum spanning tree TS of GS (break ties arbitrarily).
Step 5: Construct a Steiner tree TH from TS by deleting edges in TS if necessary, so that all the leaves in TH are Steiner points.
Worst case time complexity O(|S||V|2). Cost no more than 2(1 - 1/l) *optimal cost
where l = number of leaves in the steiner tree.
CMPT771 Network Layer 44
Working of KMB
A
C
D4
4
444 4
B
A
C
D4
4
4
B
A
B
C D
EF
G
HI
10
1
1
2
9
8
1 1
1/2
2
1/2
1
B
C D
EF
G
HI
1
1
2
1 1
1/2
2
1/2
1
A
CMPT771 Network Layer 45
Working of KMB
B
C D
EF
G
HI
1
1
2
1 1
1/2
2
1/2
A
B
C D
EF
I
1
1
2
1 1
2
A
Destination Nodes
Intermediate Nodes
CMPT771 Network Layer 46
From Theory to Practice (2)
Does the heuristic algorithm(s) work ? Demands
Distributed implementation Support reliable transmission
– link failure should not increase delay or reduce resource availability
Return optimal routes taking into consideration • price to be paid (bandwidth consumed)• end to end delay. (no. of links traversed)
Minimize network load.– Avoid loops– Avoid traffic concentration on a few links or sub-nets
Minimize the state stored in routers Asymmetric links
Steiner tree is for undirectional
CMPT771 Network Layer 47
From Theory to Practice (2)
Performance metrics (costs?) Delay
• End to end delay between source and receiver relative to the shortest unicast path delay.
Cost• Cost of total bandwidth consumption• Cost of tree state info
Traffic Concentration • Maximum number of flows on a unidirectional link.• How evenly the routes are distributed.
CMPT771 Network Layer 48
Internet Multicast Service Model
multicast group concept: use of indirection hosts addresses IP datagram to multicast group routers forward multicast datagrams to hosts that have
“joined” that multicast group Many-to-many communications
128.119.40.186
128.59.16.12
128.34.108.63
128.34.108.60
multicast group
226.17.30.197
CMPT771 Network Layer 49
Multicast groups
class D Internet addresses reserved for multicast:
host group semantics:o anyone can “join” (receive) multicast groupo anyone can send to multicast groupo no network-layer identification to hosts of members
needed: infrastructure to deliver mcast-addressed datagrams to all hosts that have joined that multicast group
CMPT771 Network Layer 50
Joining a mcast group: two-step process
local: host informs local mcast router of desire to join group: IGMP (Internet Group Management Protocol)
wide area: local router interacts with other routers to receive mcast datagram flow many protocols (e.g., DVMRP, MOSPF, PIM)
IGMPIGMP
IGMP
wide-areamulticast
routing
CMPT771 Network Layer 51
IGMP: Internet Group Management Protocol
host: sends IGMP report when application joins mcast group IP_ADD_MEMBERSHIP socket option host need not explicitly “unjoin” group when leaving
router: sends IGMP query at regular intervals host belonging to a mcast group must reply to query
query report
CMPT771 Network Layer 52
IGMP
IGMP version 1 router: Host Membership
Query msg broadcast on LAN to all hosts
host: Host Membership Report msg to indicate group membership randomized delay before
responding implicit leave via no
reply to Query RFC 1112
IGMP v2: additions include Leave Group msg
last host replying to Query can send explicit Leave Group msg
router performs group-specific query to see if any hosts left in group
RFC 2236
IGMP v3: under development as Internet draft
CMPT771 Network Layer 53
Shared Tree:
Center (Core)-based trees
one router identified as “center” of tree
to join: edge router sends unicast
join-msg addressed to center router
join-msg “processed” by intermediate routers and forwarded towards center
join-msg either hits existing tree branch for this center, or arrives at center
path taken by join-msg becomes new branch of tree for this router
A
B
C
CMPT771 Network Layer 54
Center-based trees: an example
Suppose R6 chosen as center:
R1
R2
R3
R4
R5
R6 R7
router with attachedgroup member
router with no attachedgroup member
path order in which join messages generated
LEGEND
21
3
1
CMPT771 Network Layer 55
PIM: Protocol Independent Multicast
not dependent on any specific underlying unicast routing algorithm (works with all)
two different multicast distribution scenarios :
Dense: group members
densely packed, in “close” proximity.
bandwidth more plentiful
Sparse: # networks with group
members small wrt # interconnected networks
group members “widely dispersed”
bandwidth not plentiful
CMPT771 Network Layer 56
Consequences of Sparse-Dense Dichotomy: Dense data-driven construction on
mcast tree (e.g., RPF) group membership by
routers assumed until routers explicitly prune
bandwidth and non-group-router processing profligate
Sparse: receiver- driven
construction of mcast tree (e.g., center-based)
no membership until routers explicitly join
bandwidth and non-group-router processing conservative
CMPT771 Network Layer 57
Tunneling
Q: How to connect “islands” of multicast routers in a “sea” of unicast routers?
mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router receiving mcast router unencapsulates to get mcast datagram
physical topology logical topology
CMPT771 Network Layer 58
Outline
1. Network Service Models2. Routing Principles
Link state routing Distance vector routing Hierarchical routing
3. Multicast Routing4. Video Multicast5. Proxy Caching6. Peer-to-Peer7. Internet QoS
CMPT771 Network Layer 59
Video Multicast
Why attractive ? Bandwidth efficiency Multi-receiver nature of video programs
What is the support ? IP multicast – network layer RTP protocol – transport layer But, generic interfaces only …
CMPT771 Network Layer 60
Challenges: Best-Effort Service
IP multicast is best-effort only No quality-of-service guarantee What if non-adaptive ?
- Unfair to adaptive traffic starve TCP
- Lead to congestion poor user experience
Solutions Quality-of-service deployment
- IntServ (RSVP), DiffServ, QoS routing
- Long-term objective, will be addressed later
Adaptive Multicast
CMPT771 Network Layer 61
Challenges: Heterogeneity
Receivers Bandwidth
- 56K Modem, 1.5 ADSL,10M/100M Ethernet …
Computation power- PDA, Laptop, Workstation …
Sessions Uneven session populations
CMPT771 Network Layer 62
Not an Atypical Network
PSTN
I nternet
Router
10M Hub
Receiver 6Receiver 5
56K Modem
Receiver 4
Receiver 2Receiver 1
ATM Switch
Router
Video Server(Sender)
1 2 3 4 5 6
7 8 9 10111 2
AB
1 2x
6x
8x
2x
9 x
3 x
10x
4x
1 1x
5x
7x
1x
Eth
ern
et
A
12x
6x
8x
2x
9x
3x
10 x
4x
11 x
5 x
7 x
1 x
C 100M LAN Switch
VideoRecorder
128K ISDN
Receiver 3
CMPT771 Network Layer 63
Challenges: Heterogeneity
Receivers Bandwidth
- 56K Modem, 1.5 ADSL,10M/100M Ethernet …
Computation power- PDA, Laptop, Workstation …
Sessions Uneven session populations
How can a single rate satisfy everyone simultaneous ?
Solution Multi-rate Multicast
CMPT771 Network Layer 64
Solution 1: Simulcast
Replicated streams Same video program Different rates
Each stream for a cluster of receivers Same access technology Same bottleneck
1.5 Mbps encoding
28.8 Kbps encoding
CMPT771 Network Layer 65
Solution 1: Simulcast
Pros Simplicity Compatibility
Cons Redundancy
Number of Streams
Redundancy
Mismatch
Single-rate Multicast Simulcast Multiple-unicast
CMPT771 Network Layer 66
Solution 2: Layered Multicast
Cumulative layered coding (Scalable coding) Base layer
- Most important feature
- Low rate, low quality
Enhancement layers- Progressively refine the
quality
Receiver-driven Adaptation Layer Multicast group Congestion ?
- No: Join group- Yes: Leave group
CMPT771 Network Layer 67
Solution 2: Layered Multicast
Pros No replication redundancy End-to-end control
Cons Incompatibility Overhead for scalability
- Computation time, bandwidth
Base Layer
Enhancement Layer
I
B
P
B
P
B
P
Base Layer
Enhancement Layer
P B B B
PPPI
Base Layer
Enhancement Layer
I P P P
P B B B
Temporal scalability Spatial scalability SNR scalability.
CMPT771 Network Layer 68
Solution 3: Proxy-based Video Filtering
Multiple video proxies (agents, gateways,..) Each proxy for a confined region
Bandwidth monitoring Video filtering
- Frame dropping, transcoding …
Pros Fine-grained rate control Work with no IP multicast
Cons Expensive High computation cost
Sender Receiver Router Proxy
CMPT771 Network Layer 69
Active Node (Proxy) vs. Network Node
VideoTranscoder
Rate Controller
QoS Monitor
RTP Layer
UDP Layer
IP Layer
Upstream Node
Downstream Node
RTCP
Packet Scheduler
IGMP
(a)
Application Level
IP Layer
Upstream Node
Packet Scheduler
IGMP
Network Level
Downstream Node
(b)
Network Level
CMPT771 Network Layer 70
Taxonomy
Unicast
Multicast
Single-Rate
Multi-Rate
Simulcasting
Layering
Proxy Filtering
Cumulative
Noncumulative
VideoTransmission
CMPT771 Network Layer 71
Example: Receiver-Driven Layered Multicast
Server sends layered encoded stream where each layer is sent to a separate multicast group
Receivers can control delivered quality by joining proper number of multicast groups. i.e. controls no of delivered layers Goal: to accommodate receiver bw heterogeneity!!
Basic idea: When no cong. => receiver adds a new layer When cong occurs => receiver drops top layer
CMPT771 Network Layer 72
RLM: example
Time
Lay
er #
1
2
3
4
CMPT771 Network Layer 73
Issues for adding layers
When? Sustained performance with little to no loss.
Hysterics problems When available bandwidth is between the rates of
two different layers. Probing by new members.
Join latency Takes a while for multicast joins to occur
CMPT771 Network Layer 74
Issues for dropping layers.
When? Usually as a reaction to congestion.
Co-located receiver must drop the top layer at about the same time!
New members probing for appropriate layer will cause “false” congestion.
Leave latency Similar to the problem with join latency.
CMPT771 Network Layer 75
Join Experiments
Receivers use “join-experiments” to determine if a layer should be added. A join timer is associated with each layer
• Initially, timers are fairly short. If timer goes off and no congestion is experienced, then
next level joined. If congestion is experienced, then level is dropped and
join timer is exponentially backed off. Receiver learns from past experiences
CMPT771 Network Layer 76
RLM Problems (1)
Scaling issues. Too many participants attempting join-experiments
leads to chronic congestion. Possible fix?
Coordinate join experiments with others. Shared learning Still problematic
• Want to limit shared learning to topologically close neighbors.
• Complexity of control mechanisms increases.
CMPT771 Network Layer 77
RLM Problems (2)
Bandwidth adaptation How to allocate among sessions
- Uniform ? But session population is uneven How to allocate among layers
- Heterogeneous bandwidth demands from receivers- Non-linear video utility- Num of layers
Objective: Better Fairness Intra-session fairness
• Fairly treat all the users• Ideal: each user receives video at a rate
commensurate with its own bandwidth demand
CMPT771 Network Layer 78
Hybrid Sender/Receiver Adaptation
Layered Bit-stream
Bandwidth Report
Sender
A Recei ver
Layered VideoCoder
RateController
FeedbackCollector
LayeredRate
Calculator
LayeredDecoder
LayerAdapter
BandwidthEstimator
Multicast Network
Receiver 1
Receiver 2
Receiver N
Layered Bit-stream
......
BandwidthReport
CMPT771 Network Layer 79
Control Bandwidth Scaling
Case Study: RTCP for Multicast
- To limit traffic, each participant reduces his RTCP traffic as the number of conference participants increases.
CMPT771 Network Layer 80
RTCP Bandwidth Scaling
RTCP attempts to limit its traffic to 5% of the session bandwidth.
Example Suppose one sender,
sending video at a rate of 2 Mbps. Then RTCP attempts to limit its traffic to 100 Kbps.
RTCP gives 75% of this rate to the receivers; remaining 25% to the sender
The 75 kbps is equally shared among receivers: With R receivers, each
receiver gets to send RTCP traffic at 75/R kbps.
Sender gets to send RTCP traffic at 25 kbps.
Participant determines RTCP packet transmission period by calculating avg RTCP packet size (across the entire session) and dividing by allocated rate.
CMPT771 Network Layer 81
Randomization in Feedback
Case Study: Reliable Multicast (RM)
how to transfer data “reliably” from source to receivers ?
ACK/NAK ?
Recall
• TCP/UDP vs Multicast
CMPT771 Network Layer 82
Randomized Feedback Suppression
randomly delay NAK generation “listen” to NAKs generated by
others if no NAK for lost pkt when delay
expires, transmit NAK tradeoffs:
more complexity (backoff timer), more pkts (data, NAKs) at rcvs
reduced feedback at sender what’s the optimal delay
distribution?
CMPT771 Network Layer 83
Local Recovery
Allow rcvr to recover lost pkt from “nearby” rcvr “ask your neighbor”:
send localized NAK (repair request)
multicast: randomize local repair transmission time to avoid too many replies
orthogonal (complementary) to feedback supression
who to recover from? don’t want repair
request to go to everyone
scoping: how to restrict how far request will travel: IP time-to-live field
CMPT771 Network Layer 84
Local Recovery: example
R2 detects lost pkt waits random amount of
time, multicasts repair request
limited scope not seen by R4
R1 and R3 have pkt R3 times out first and
multicasts repair, with limited scope
Other possibilities Feedback aggregation
CMPT771 Network Layer 85
Outline
1. Network Layer Service Models2. Routing Principles
Link state routing Distance vector routing Hierarchical routing
3. Multicast Routing4. Video Multicast5. Proxy Caching6. Peer-to-Peer7. Internet QoS
CMPT771 Network Layer 86
More about Proxy: Caching
user sets browser: Web accesses via proxy
browser sends all HTTP requests to proxy
object in cache: cache returns object
else cache requests object from origin server, then returns object to client
Proxy: both client and server; typically installed by ISP (university, company, residential ISP)
Goal: satisfy client request without involving origin server
client
Proxyserver
client
HTTP request
HTTP request
HTTP response
HTTP response
HTTP request
HTTP response
origin server
origin server
CMPT771 Network Layer 87
Caching example (1)
Assumptions average object size = 100,000
bits avg. request rate from
institution’s browser to origin serves = 15/sec
delay from institutional router to any origin server and back to router = 2 sec
Consequences utilization on LAN = 15% utilization on access link = 100% total delay = Internet delay +
access delay + LAN delay = 2 sec + minutes + milliseconds
originservers
public Internet
institutionalnetwork 10 Mbps LAN
1.5 Mbps access link
institutionalcache
CMPT771 Network Layer 88
Caching example (2)
Possible solution increase bandwidth of access
link to, say, 10 Mbps
Consequences (if 10 Mbps) utilization on LAN = 15% utilization on access link = 15% Total delay = Internet delay +
access delay + LAN delay = 2 sec + msecs + msecs often a costly upgrade
originservers
public Internet
institutionalnetwork 10 Mbps LAN
10 Mbps access link
institutionalcache
CMPT771 Network Layer 89
Caching example (3)
Install cache suppose hit rate is .4
Consequence 40% requests will be satisfied
almost immediately 60% requests satisfied by origin
server utilization of access link reduced
to 60%, resulting in negligible delays (say 10 msec)
total delay = Internet delay + access delay + LAN delay
= .6*2 sec + .4*.01 secs + milliseconds < 1.3 secs
originservers
public Internet
institutionalnetwork 10 Mbps LAN
1.5 Mbps access link
institutionalcache
CMPT771 Network Layer 90
Hierarchical cache
Calculation HR of Proxy 1 = 90% HR of Proxy 2 = 95% Independent Joint HR = 1-(1-0.9)*(1-
0.95)*100%
= 99.5%
How to measure cache ? Hit Ratio (HR)
client
ProxyServer 1
client
origin server
ProxyServer 2
CMPT771 Network Layer 91
Why Web caching ?
Reduce response time for client request. Reduce traffic on an institution’s access link. Avoid overloading origin server Internet dense with caches enables “poor”
content providers to effectively deliver content Enhance data availability (Related: Google)
CMPT771 Network Layer 92
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1 2 3 4 5 6 7 8 9 10
Obejct ID
Acc
ess
Prob
abili
ty
More about Web caching
Why caching is effective for Web, even if cache space is quite limited ?
Zipf distribution|,|,...,2,1,
)/1(
)/1(||
1
S
jj
jp
S
j
j
CMPT771 Network Layer 93
Proxy
Proxy
Video Server
Backbone Network
Video Caching
Proxy caching saving objects at proxies close to clients
Temporal locality Geographical locality Skewed access rates
• Zipf distribution, too• Theta = 0.271
CMPT771 Network Layer 94
Video objects vs. Web objects
High data rate, yet adaptive Long playback duration
► Various interactions: • random access • early termination
► Huge volume • one-hour MPEG-1, about 675 MB
Semi-static Low validation cost (consistency)
CMPT771 Network Layer 95
Revisit: Consistency of Web Objects
Inconsistency out-of-date objects… News, stock websites
Solution: Conditional GET
client server
HTTP request msgIf-modified-since:
<date>
HTTP responseHTTP/1.0
304 Not Modified
object not
modified
HTTP request msgIf-modified-since:
<date>
HTTP responseHTTP/1.0 200 OK
<data>
object modified
CMPT771 Network Layer 96
Solution: Partial Video Caching
Consistency is not a big issue for video caching
But data volume is …
Control Channel(RTSP, RTCP)
Scheduler
MediaRepository
ServerMedia Proxy
Proxy Manager
EnterpriseNetwork
Cache
Player
Buffer
Client
BackboneNetwork
Data Channel(RTP)
Ass
embl
er/S
wit
cher
Data Channel(RTP)
Control Channel(RTSP,RTCP)
CMPT771 Network Layer 97
Partial Caching: Protocol Operations
OPTIONS
DESCRIBE
SETUP
PLAY
TEARDOWN
OK
OK
OK
OK
SETUP
OK
PLAY Range
OK
TEARDOWN
OK
Client Proxy Server
Portion 1(Cached)
Portion 2(Uncached)
Portion 2
RTSP
RTP
RTSP
CMPT771 Network Layer 98
Partial Caching Algorithms (1)
Problem: Which part(s) to cache ?
Interval caching I/O bandwidth ?
Prefix caching Reducing startup latency Cache allocation problem
• Given proxy cache size, video bandwidth, access rates to different videos, what are the prefix lengths ?
1 4 5 6 7 8 9 3 4 5 6 7 8 9
r1r2 r1r2
1 2 2 3
CMPT771 Network Layer 99
Partial Caching Algorithms (2)
Segment caching Which segment ? Segment size
• Uniform/Exponential VCR-support
Video staging Cache upper or lower part ? Cut-off rate ?
cut-off rate
CMPT771 Network Layer 100
Partial Caching for Homogeneous Clients
Sliding-Interval
Caching
Prefix
Caching
Segment
Caching
Rate-Split
Caching
Cached Portion Sliding
intervals Prefix Segments
Portion of
higher rate
VCR-like support No No Yes No
Disk I/O High Moderate Moderate Moderate
Disk Space Low Moderate High High Resource
Demand
Sync Overhead Low Moderate High High
Bandwidth
Reduction High* Moderate Moderate Moderate
Performance
Improvement Start-up Latency
Reduction High* High High** Moderate
CMPT771 Network Layer 101
Generalizing Proxy Functionalities (1): Content Distribution Networks
Content replication CDN company installs
hundreds of CDN servers throughout Internet in lower-tier ISPs, close to
users CDN replicates its customers’
content in CDN servers. When provider updates content, CDN updates servers
origin server in North America
CDN distribution node
CDN serverin S. America CDN server
in Europe
CDN serverin Asia
CMPT771 Network Layer 102
CDN example
origin server www.foo.com distributes HTML Replaces: http://www.foo.com/sports.ruth.gif
with
http://www.cdn.com/www.foo.com/sports/ruth.gif
HTTP request for
www.foo.com/sports/sports.html
DNS query for www.cdn.com
HTTP request for
www.cdn.com/www.foo.com/sports/ruth.gif
1
2
3
Origin server
CDNs authoritative DNS server
NearbyCDN server
CDN company cdn.com distributes gif files uses its authoritative
DNS server to route redirect requests (CNAME)
CMPT771 Network Layer 103
More about CDNs
routing requests CDN creates a “map”, indicating distances from
leaf ISPs and CDN nodes when query arrives at authoritative DNS server:
server determines ISP from which query originates uses “map” to determine best CDN server
Caching vs. CDN Pull: passive Push: active
CMPT771 Network Layer 104
Outline
1. Network Layer Service Models2. Routing Principles
Link state routing Distance vector routing Hierarchical routing
3. Multicast Routing4. Video Multicast5. Proxy Caching6. Peer-to-Peer7. Internet QoS
CMPT771 Network Layer 105
Generalizing Proxy Functionalities (2): Peer-to-Peer Paradigm
A peer is both client and server
application
transportnetworkdata linkphysical
application
transportnetworkdata linkphysical
Recall Client/Server paradigmClient: initiates contact with server
(“speaks first”) typically requests service from
server, Web: client implemented in
browser; e-mail: in mail reader
request
reply
Server: provides requested service to client e.g., Web server sends requested Web
page, mail server delivers e-mail
CMPT771 Network Layer 106
Peer-to-Peer Communications
Example Alice runs P2P client
application on her notebook computer
Intermittently connects to Internet; gets new IP address for each connection
Asks for “X.mp3” Application displays
other peers that have copy of X.mp3.
Alice chooses one of the peers, Bob.
File is copied from Bob’s PC to Alice’s notebook: HTTP
While Alice downloads, other users uploading from Alice.
Alice’s peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!
CMPT771 Network Layer 107
P2P: centralized directory
For file sharing Key issue: File searching - which peer has the file
?
original “Napster” design1) when peer connects, it
informs central server: IP address content
2) Alice queries for “X.mp3”3) Alice requests file from Bob
centralizeddirectory server
peers
Alice
Bob
1
1
1
12
3
CMPT771 Network Layer 108
P2P: problems with centralized directory
Single point of failure Performance bottleneck Copyright infringement
file transfer is decentralized, but locating content is highly decentralized
CMPT771 Network Layer 109
P2P: decentralized directory
Each peer is either a group leader or assigned to a group leader.
Group leader tracks the content in all its children.
Peer queries group leader; group leader may query other group leaders.
ordinary peer
group-leader peer
neighoring re la tionshipsin overlay network
CMPT771 Network Layer 110
More about decentralized directory
advantages of approach no centralized directory server
location service distributed over peers more difficult to shut down
disadvantages of approach bootstrap node needed group leaders can get overloaded
CMPT771 Network Layer 111
P2P: Query flooding
Gnutella no hierarchy use bootstrap node to
learn about others join message
Send query to neighbors Neighbors forward query If queried peer has object, it
sends message back to querying peer
join
CMPT771 Network Layer 112
P2P: more on query flooding
Pros peers have similar
responsibilities: no group leaders
highly decentralized no peer maintains
directory info
Cons excessive query traffic query radius: may not
have content when present
bootstrap node maintenance of overlay
network
CMPT771 Network Layer 113
Aside: Search Time?
CMPT771 Network Layer 114
Aside: All Peers Equal?
56kbps Modem
10Mbps LAN
1.5Mbps DSL
56kbps Modem56kbps Modem
1.5Mbps DSL
1.5Mbps DSL
1.5Mbps DSL
QuickTime™ and aTIFF (Uncompressed) decompressor
are needed to see this picture.
CMPT771 Network Layer 115
Aside: Network Resilience
Partial Topology Random 30% die Targeted 4% die
from Saroiu et al., MMCN 2002
CMPT771 Network Layer 116
DHT: A New Story…
In 2000-2001, academic researchers said “we want to play too!” Motivation:
Frustrated by popularity of all these “half-baked” P2P apps We can do better! Guaranteed lookup success for files in system Provable bounds on search time Provable scalability to millions of node
Hot Topic in networking ever since No practical use so far (?)
CMPT771 Network Layer 117
P2P: Content Addressing (Hash Routing)
Hash routing Given an object identifier I, calculate its hash value
H=hash(I), and (hopefully) find it (or its location info) in peer H
Not a new idea Load balancing – hash IP address, re-direct to different
servers
hash table
application
get (key) data
node node node….
put(key, data)
CMPT771 Network Layer 118
Hash Routing
Two alternatives Node can cache each (existing) object that hashes
within its range Pointer-based: level of indirection - node caches
pointer to location(s) of object
What’s new in P2P? Dynamic overlay
• peer join/leave• number of peers is not fixed
Traditional hash function doesn’t work• SHA-1
0-9999500-9999
1000-19991500-4999
9000-9500
4500-6999
8000-8999 7000-8500
CMPT771 Network Layer 119
Distributed Hash Table (DHT)
Challenges For each object, node(s) whose range(s) cover that
object must be reachable via a “short” path # neighbors for each node should scale well (e.g.,
should not be O(N)) Fully distributed (no centralized bottleneck/single
point of failure) DHT mechanism should gracefully handle nodes
joining/leaving need to repartition the range space over existing
nodes need to reorganize neighbor set need bootstrap mechanism to connect new nodes into
the existing DHT infrastructure
CMPT771 Network Layer 120
Case Studies
Structure overlay (p2p) systems Chord CAN (Content Addressable Network)
Key Questions Q1: How is hash space divided “evenly” among
existing nodes? Q2: How is routing implemented that connects an
arbitrary node to the node responsible for a given object?
Q3: How is the hash space repartitioned when nodes join/leave?
Let N be the number of nodes in the overlay Let H be the size of the range of the hash
function (when applicable)
CMPT771 Network Layer 121
Chord
Associate to each node and file a unique id in an uni-dimensional space (a Ring)
E.g., pick from the range [0...2m] Usually the hash of the file or IP address
Properties: Routing table size is O(log N) , where N is the total number
of nodes Guarantees that a file is found in O(log N) hops
from MIT in 2001
CMPT771 Network Layer 122
Consistent Hashing
N32
N90
N105
K80
K20
K5
Circular ID space
Key 5Node 105
A key is stored at its successor: node with next higher ID
CMPT771 Network Layer 123
Chord Basic Lookup
N32
N90
N105
N60
N10N120
K80
“ Where is key 80?”
“ N90 has K80”
CMPT771 Network Layer 124
Chord “Finger Table”
N80
1/21/4
1/8
1/161/321/641/128
Entry i in the finger table of node n is the first node that succeeds or equals n + 2i
In other words, the ith finger points 1/2n-i way around the ring
CMPT771 Network Layer 125
Chord Join
Assume an identifier space [0..8]
Node n1 joins
01
2
34
5
6
7
i id+2i succ0 2 11 3 12 5 1
Succ. Table
CMPT771 Network Layer 126
Chord Join
Node n2 joins
01
2
34
5
6
7
i id+2i succ0 2 21 3 12 5 1
Succ. Table
i id+2i succ0 3 11 4 12 6 1
Succ. Table
CMPT771 Network Layer 127
Chord Join
Nodes n0, n6 join
01
2
34
5
6
7
i id+2i succ0 2 21 3 62 5 6
Succ. Table
i id+2i succ0 3 61 4 62 6 6
Succ. Table
i id+2i succ0 1 11 2 22 4 6
Succ. Table
i id+2i succ0 7 01 0 02 2 2
Succ. Table
CMPT771 Network Layer 128
Chord Join
Nodes: n1, n2, n0, n6
Items: f7, f2 0
1
2
34
5
6
7 i id+2i succ0 2 21 3 62 5 6
Succ. Table
i id+2i succ0 3 61 4 62 6 6
Succ. Table
i id+2i succ0 1 11 2 22 4 6
Succ. Table
7
Items 1
Items
i id+2i succ0 7 01 0 02 2 2
Succ. Table
CMPT771 Network Layer 129
Chord Routing
Upon receiving a query for item id, a node:
Checks whether stores the item locally
If not, forwards the query to the largest node in its successor table that does not exceed id
01
2
34
5
6
7 i id+2i succ0 2 21 3 62 5 6
Succ. Table
i id+2i succ0 3 61 4 62 6 6
Succ. Table
i id+2i succ0 1 11 2 22 4 6
Succ. Table
7
Items 1
Items
i id+2i succ0 7 01 0 02 2 2
Succ. Table
query(7)
CMPT771 Network Layer 130
Chord Summary
Routing table size? Log N fingers
Routing time? Each hop expects to 1/2 the distance to the desired
id => expect O(log N) hops.
CMPT771 Network Layer 131
CAN (Content Addressable Network)
Hyper-cube space hash value is viewed as a
point in a D-dimensional cartesian space
each node responsible for a D-dimensional “cube” in the space
The space is covered by all the nodes
Example: Initial node n1:(1, 2) 1 2 3 4 5 6 70
1
2
3
4
5
6
7
0
n1
CMPT771 Network Layer 132
CAN Illustration: 2-D
node n2:(4, 2) joins
1 2 3 4 5 6 70
1
2
3
4
5
6
7
0
n1 n2
CMPT771 Network Layer 133
CAN Illustration: 2-D
node n3:(3, 5) joins。
1 2 3 4 5 6 70
1
2
3
4
5
6
7
0
n1 n2
n3
CMPT771 Network Layer 134
CAN Illustration: 2-D
node n4:(5, 5) and n5:(6,6) join
1 2 3 4 5 6 70
1
2
3
4
5
6
7
0
n1 n2
n3 n4n5
CMPT771 Network Layer 135
CAN Illustration: 2-D
nodes: n1:(1, 2); n2:(4,2); n3:(3, 5); n4:(5,5);n5:(6,6)
Data (key): f1:(2,3); f2:(5,0); f3:(2,1); f4:(7,5)
1 2 3 4 5 6 70
1
2
3
4
5
6
7
0
n1 n2
n3 n4n5
f1
f2
f3
f4
CMPT771 Network Layer 136
CAN Illustration: 2-D
Association
1 2 3 4 5 6 70
1
2
3
4
5
6
7
0
n1 n2
n3 n4n5
f1
f2
f3
f4
CMPT771 Network Layer 137
CAN Illustration: 2-D
A node knows its neighbors
Forward query to the neighbor that is nearest to key
Example: n1 want to find f4
Average length (d/4)(n1/d), and each node maintains O(2d) neighboring info
Multiple paths -> reliable
1 2 3 4 5 6 70
1
2
3
4
5
6
7
0
n1 n2
n3 n4n5
f1
f2
f3
f4
CMPT771 Network Layer 138
DHT: Discussion
Pros: Guaranteed Lookup O(log N) per node state and search scope
Cons: No one uses them? Supporting non-exact match search is hard Topology-awareness ?
CMPT771 Network Layer 139
P2P: Media Streaming
Can media streaming benefit from p2p ? Yes, Overlay/Application-layer multicast
Problems ? File searching – not a big issue Dynamicity
• Bandwidth fluctuation• Peer join/leave
Delay• Real-time requirement
J. Liu, S. G. Rao, B. Li, and H. Zhang, Opportunities and Challenges of Peer-to-Peer Internet Video Broadcast, Proceedings of the IEEE, Special Issue on Recent Advances in Distributed Multimedia Communications, 2007.
CMPT771 Network Layer 140
Outline
1. Network Layer Service Models2. Routing Principles
Link state routing Distance vector routing Hierarchical routing
3. Multicast Routing4. Video Multicast5. Proxy Caching6. Peer-to-Peer7. Internet QoS
Scheduling and policing IntServ/DiffServ
CMPT771 Network Layer 141
Beyond Best Effort
Internet service modelBest-effort (or least-effort) Guarantee only one thing (sometimes even cannot) -- delivery a packet to destination
Thus far: “making the best of best effort” TCP Multicast Optimal adaptation Proxy filtering/caching Content distribution (replication) P2P …
But, many things cannot be guaranteed in transport/application layer if network layer does not guarantee them Delay, bandwidth – important to media applications
CMPT771 Network Layer 142
Improving QOS in IP Networks
Future: next generation Internet with QoS guarantees RSVP: signaling for resource reservations Differentiated Services: differential guarantees Integrated Services: firm guarantees
simple model for sharing and congestion studies:
CMPT771 Network Layer 143
Principles for QOS Guarantees
Example: 1Mbps IP phone, FTP share 1.5 Mbps link. bursts of FTP can congest router, cause audio loss want to give priority to audio over FTP
packet marking needed for router to distinguish between different classes; and new router policy to treat packets accordingly
Principle 1
CMPT771 Network Layer 144
Principles for QOS Guarantees (more) what if applications misbehave (audio sends higher
than declared rate) policing: force source adherence to bandwidth allocations
marking and policing at network edge: similar to ATM UNI (User Network Interface)
provide protection (isolation) for one class from othersPrinciple 2
CMPT771 Network Layer 145
Principles for QOS Guarantees (more)
Allocating fixed (non-sharable) bandwidth to flow: inefficient use of bandwidth if flows doesn’t use its allocation
While providing isolation, it is desirable to use resources as efficiently as possible
Principle 3
CMPT771 Network Layer 146
Principles for QOS Guarantees (more)
Basic fact of life: can not support traffic demands beyond link capacity
Call Admission: flow declares its needs, network may block call (e.g., busy signal) if it cannot meet needs
Principle 4
CMPT771 Network Layer 147
Summary of QoS Principles
Let’s next look at mechanisms for achieving this ….
CMPT771 Network Layer 148
Scheduling Polices (1)
scheduling: choose next packet to send on link FIFO (first in first out) scheduling: send in order of arrival to queue
discard policy: if packet arrives to full queue: who to discard?• Tail drop: drop arriving packet• priority: drop/remove on priority basis• random: drop/remove randomly
CMPT771 Network Layer 149
Work-Conserving vs. Non-Work-Conserving
Work conserving Idle only if queue(s) is empty
Conservation law
Do we need non-work-conserving ? Jitter Buffer ?
i iq Cr =åUtility*mean waiting time
CMPT771 Network Layer 150
Scheduling Policies (2)
Priority scheduling: transmit highest priority queued packet
multiple classes, with different priorities class may depend on marking or other header info,
e.g. IP source/dest, port numbers, etc..
CMPT771 Network Layer 151
Scheduling Policies (3)
round robin scheduling: multiple classes cyclically scan class queues, serving one from each class (if available)
CMPT771 Network Layer 152
Weighted round robin
WA=1.4 WB=0.2 WC=0.8
WA=7 WB=1 WC=4
Normalize
…
round length = 13
Normalize the weights so that they become integer
CMPT771 Network Layer 153
Weighted RR – variable length packet
If different connection have different packet size, then divide the weight of each connection with connection’s mean packet size before normalizing weights {0.5, 0.75, 1.0}, mean packet sizes {50, 500, 1500} normalize weights: {0.5/50, 0.75/500, 1.0/1500} = { 0.01,
0.0015, 0.000666}, normalize again {60, 9, 4}
CMPT771 Network Layer 154
Scheduling Policies (4)
Generalized Processor Sharing Round-robin, sounds good
Is it fair ? Only if packets are of same size Does it guarantee bandwidth (in a small time scale) ?
Fairness, protection. GPS can provide fairness and protection Visit each non-empty queue in turn, serve infinitesimal
data (1 bit) GPS is not implementable; we can serve only packets
CMPT771 Network Layer 155
Weighted Fair Queueing (WFQ)
Deals better with variable size packets and weights
Also known as packet-by-packet GPS (PGPS)
Find finish time of a packet, had we been doing GPS; serve packets in order of their finish times
CMPT771 Network Layer 156
WFQ details
Uses notions of round number and finish number
Suppose, in each round, the server served one bit from each active connection
Round number is the number of rounds already completed – a virtual time can be fractional depends on the number of active connections
CMPT771 Network Layer 157
WFQ – finish number
If a packet of length p arrives to an empty queue when the round number is R, it will complete service when the round number is R + p => finish number is R + p independent of the number of other
connections.
CMPT771 Network Layer 158
WFQ – service order
If a packet arrives to a queue, and the previous packet has/had a finish number of f, then the packet’s finish number is f+p Serve packets in order of finish numbers
Finish time of a packet is not the same as the finish number
CMPT771 Network Layer 159
Active connections
A connection is active if the last packet served from it, or in its queue, has a finish number greater than the current round number
finish number of packet of length p if arriving to active connection = previous finish number
+ p if to inactive connection = R + p
CMPT771 Network Layer 160
WFQ Example
A L=1 L=2
B L=2
C L=2
0
0.5
1
1.5
2
2.5
3
3.5
4
0 1 2 3 4 5 6 7 8real time
virt
ual t
ime
1/3
1/2
1/3
1
F1=1
F1=2
F1=2
F2=3.5
A
r = 1
A = B = C = 1
B
C
C
t=0: Packets of sizes 1,2,2 arrive at connections A, B, C.
t=4: Packet of size 2 arrives at connection A
CMPT771 Network Layer 161
Example (contd.)
At time 0, slope of 1/3, Finish number of A = 1, Finish number of B, C = 2
At time 3, connection A become inactive, slope becomes 1/2
At time 4, second packet at A gets finish number 2 + 1.5 = 3.5,
Slope decreases to 1/3
CMPT771 Network Layer 162
Example (contd.)
At time 5.5, round number becomes 2 and connection B and C
become inactive, Slope becomes 1
At time 7, round number becomes 3.5 and A becomes inactive.
CMPT771 Network Layer 163
WFQ implementation
On packet arrival: first updates round number compute finish number for the packet insert in priority queue sorted by finish numbers, so that it is
served according to the finish number if no space, drop the packet with largest finish number
On service completion After servicing one packet, select the packet with the lowest
finish number
CMPT771 Network Layer 164
Computing the round number
Iterated deletion: re-computes round number
on each packet arrival At any re-computation, the
number of connections can go up at most by one, but can go down to zero, leading to change in round number
Solution: use previous count of connections to compute round number; if this makes some connection inactive, re-compute; repeat until no connections become inactive
A L=1 L=2
B L=2
C L=2
0
0.5
1
1.5
2
2.5
3
3.5
4
0 1 2 3 4 5 6 7 8real time
virt
ual t
ime
1/3
1/2
1/3
1
F1=1
F1=2
F1=2
F2=3.5A L=1 L=2
B L=2
C L=2
A L=1A L=1 L=2L=2
B L=2B L=2B L=2
C L=2C L=2
0
0.5
1
1.5
2
2.5
3
3.5
4
0 1 2 3 4 5 6 7 8real time
virt
ual t
ime
1/3
1/2
1/3
1
0
0.5
1
1.5
2
2.5
3
3.5
4
0 1 2 3 4 5 6 7 8real time
virt
ual t
ime
1/3
1/2
1/3
1
0
0.5
1
1.5
2
2.5
3
3.5
4
0 1 2 3 4 5 6 7 8real time
virt
ual t
ime
1/3
1/2
1/3
1
F1=1
F1=2
F1=2
F2=3.5F1=1
F1=2
F1=2
F2=3.5
4/3
CMPT771 Network Layer 165
WFQ Evaluation
Let Pmax(i) be the largest possible packet sent on flow i, and Pmax be the largest possible packet in the network
Let G(i, τ, t) to be the service received by flow i, using GPS, in the interval (τ, t), and g(i) to be the smallest rate allocated to flow i along it’s path
It can be shown that WFQ can lag GPS by at most Pmax/g(i) that is,
G(i, τ, t) /g(i) - S(i, τ, t) /g(i) <= Pmax/g(i) G(i, τ, t) /g(i) - S(i, τ, t) /g(i) is defined as Absolute Fairness
Bound (ABS) of a scheduling policy with respect to GPS.
CMPT771 Network Layer 166
WFQ Evaluation (Contd..) Protection
The rate provided to each flow is not affected by the sending rate of other flows.
This firewalling property is desirable by public networks. Incentive to intelligent flow controller
no need to send slower than allocated rate if sent more then likely packet drop
Requires per connection scheduler state Computing round numbers with iterated deletion is
costly Bound on end to end delay (?)
CMPT771 Network Layer 167
Virtual Clock vs Shadow Configuration
The Shadow Cabinet is a feature of the Westminster system of government. It comprises a senior group of opposition spokespeople who, under the leadership of the Leader of the Opposition, form an alternative cabinet to that of the government, and whose members shadow or mark each individual member of the Cabinet.
The Official Opposition and other major political parties not in the Government, will mirror the governmental organisation with their own Shadow Cabinet made up of Shadow Ministers.
CMPT771 Network Layer 168
Policing Mechanisms
Goal: limit traffic to not exceed declared parameters
Three common-used criteria: (Long term) Average Rate: how many pkts can be sent per unit time
(in the long run) crucial question: what is the interval length: 100 packets per sec or 6000
packets per min have same average!
Peak Rate: e.g., 6000 pkts per min. (ppm) avg.; 1500 ppm peak rate (Max.) Burst Size: max. number of pkts sent consecutively (with no
intervening idle)
CMPT771 Network Layer 169
Policing Mechanisms
Token Bucket: limit input to specified Burst Size and Average Rate.
bucket can hold b tokens tokens generated at rate r token/sec unless
bucket full over interval of length t: number of packets
admitted less than or equal to (r t + b).
CMPT771 Network Layer 170
Policing Mechanisms (more)
token bucket, WFQ combine to provide guaranteed upper bound on delay, i.e., QoS guarantee!
WFQ
token rate, r
bucket size, b
per-flowrate, R
D = b/Rmax
arrivingtraffic
CMPT771 Network Layer 171
IETF Integrated Services
architecture for providing QOS guarantees in IP networks for individual application sessions
resource reservation: routers maintain state info (a la VC) of allocated resources, QoS req’s
admit/deny new call setup requests:
Question: can newly arriving flow be admitted with performance guarantees while not violated QoS guarantees made to already admitted flows?
CMPT771 Network Layer 172
Intserv: QoS guarantee scenario
Resource reservation call setup, signaling (RSVP) traffic, QoS declaration per-element admission control
QoS-sensitive scheduling (e.g.,
WFQ)
request/reply
CMPT771 Network Layer 173
Call Admission
Arriving session must : declare its QOS requirement
R-spec: defines the QOS being requested characterize traffic it will send into network
T-spec: defines traffic characteristics signaling protocol: needed to carry R-spec and T-
spec to routers (where reservation is required) RSVP
CMPT771 Network Layer 174
Intserv QoS: Service models [rfc2211, rfc 2212]
Guaranteed service: worst case traffic arrival: leaky-
bucket-policed source simple (mathematically
provable) bound on delay [Parekh 1992, Cruz 1988]
Controlled load service: "a quality of service closely
approximating the QoS that same flow would receive from an unloaded network element."
WFQ
token rate, r
bucket size, b
per-flowrate, R
D = b/Rmax
arrivingtraffic
CMPT771 Network Layer 175
Outline
1. Multimedia Networking Applications
2. Streaming stored audio and video RTSP
3. Real-time Multimedia: Internet Phone Case Study
4. Protocols for Real-Time Streaming App RTP,RTCP SIP
5. Beyond Best Effort 6. Scheduling and
Policing Mechanisms 7. Integrated Services 8. RSVP 9. Differentiated
Services
CMPT771 Network Layer 176
Signaling in the Internet
connectionless (stateless)
forwarding by IP routers
best effort service
no network signaling protocols
in initial IP design
+ =
New requirement: reserve resources along end-to-end path (end system, routers) for QoS for multimedia applications
RSVP: Resource Reservation Protocol [RFC 2205] “ … allow users to communicate requirements to
network in robust and efficient way.” i.e., signaling ! earlier Internet Signaling protocol: ST-II [RFC 1819]
CMPT771 Network Layer 177
RSVP Design Goals
1. accommodate heterogeneous receivers (different bandwidth along paths)
2. accommodate different applications with different resource requirements
3. make multicast a first class service, with adaptation to multicast group membership
4. leverage existing multicast/unicast routing, with adaptation to changes in underlying unicast, multicast routes
5. control protocol overhead to grow (at worst) linear in # receivers
6. modular design for heterogeneous underlying technologies
CMPT771 Network Layer 178
RSVP: does not…
specify how resources are to be reserved rather: a mechanism for communicating needs
determine routes packets will take that’s the job of routing protocols signaling decoupled from routing
interact with forwarding of packets separation of control (signaling) and data (forwarding)
planes
CMPT771 Network Layer 179
RSVP: overview of operation
senders, receiver join a multicast group done outside of RSVP senders need not join group
sender-to-network signaling path message: make sender presence known to routers path teardown: delete sender’s path state from routers
receiver-to-network signaling reservation message: reserve resources from sender(s) to
receiver reservation teardown: remove receiver reservations
network-to-end-system signaling path error reservation error
CMPT771 Network Layer 180
IETF Differentiated Services
Concerns with Intserv: Scalability: signaling, maintaining per-flow router state difficult with large
number of flows Flexible Service Models: Intserv has only two classes. Also want “qualitative”
service classes “behaves like a wire” relative service distinction: Platinum, Gold, Silver
Diffserv approach: simple functions in network core, relatively complex functions at edge routers
(or hosts) Do’t define define service classes, provide functional components to build
service classes
CMPT771 Network Layer 181
Diffserv Architecture
Edge router:- per-flow traffic management
- marks packets as in-profile and out-profile
Core router:
- per class traffic management
- buffering and scheduling
based on marking at edge
- preference given to in-profile packets- Assured Forwarding
scheduling
...
r
b
marking
CMPT771 Network Layer 182
Edge-router Packet Marking
class-based marking: packets of different classes marked differently
intra-class marking: conforming portion of flow marked differently than non-conforming one
profile: pre-negotiated rate A, bucket size B packet marking at edge based on per-flow profile
Possible usage of marking:
User packets
Rate A
B
CMPT771 Network Layer 183
Classification and Conditioning
Packet is marked in the Type of Service (TOS) in IPv4, and Traffic Class in IPv6
6 bits used for Differentiated Service Code Point (DSCP) and determine PHB that the packet will receive
2 bits are currently unused
CMPT771 Network Layer 184
Classification and Conditioning
may be desirable to limit traffic injection rate of some class:
user declares traffic profile (e.g., rate, burst size)
traffic metered, shaped if non-conforming
CMPT771 Network Layer 185
Forwarding (PHB)
PHB result in a different observable (measurable) forwarding performance behavior
PHB does not specify what mechanisms to use to ensure required PHB performance behavior
Examples: Class A gets x% of outgoing link bandwidth over time
intervals of a specified length Class A packets leave first before packets from class
B
CMPT771 Network Layer 186
Randomization in Router Queue Management
normally, packets dropped only when queue overflows “Drop-tail” queueing
ISPISProuter
ISPISPInternet
P1P2P3P4P5P6FCFS
SchedulerFCFS
Scheduler
router
CMPT771 Network Layer 187
The case against drop-tail queue management
large queues in routers are “a bad thing” End-to-end latency dominated by length of queues at
switches in network
allowing queues to overflow is “a bad thing” connections transmitting at high rates can starve
connections transmitting at low rates connections can synchronize their response to
congestion
P1P2P3P4FCFS
Scheduler
FCFSScheduler
P5P6
CMPT771 Network Layer 188
Idea: early random packet drop
When queue length exceeds threshold, packets dropped with fixed probability probabilistic packet drop: flows see same loss rate problem: bursty traffic (burst arrives when queue is
near full) can be overpenalized
P1P2P3P4P5P6FCFS
Scheduler
FCFSScheduler
CMPT771 Network Layer 189
Random early detection (RED) packet drop
use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts React to longer term trends
tie drop prob. to weighted avg. queue length avoids over-reaction to mild overload conditions
Maxthreshold
Minthreshold
Average queue length
Forced drop
Probabilisticearly drop
No drop
Time
Drop probabilityMax
queue length
CMPT771 Network Layer 190
Random early detection (RED) packet drop
Maxthreshold
Minthreshold
Average queue length
Forced drop
Probabilisticearly drop
No drop
Time
Drop probabilityMax
queue length
100%
Drop probability
maxp
Weighted AverageQueue Length
min max
CMPT771 Network Layer 191
Random early detection (RED) packet drop
large number (5) of parameters: difficult to tune (at least for http traffic)
gains over drop-tail FCFS not that significant still not widely deployed …
CMPT771 Network Layer 192
RED: why probabilistic drop?
provide gentle transition from no-drop to all-drop provide “gentle” early warning
provide same loss rate to all sessions: with tail-drop, low-sending-rate sessions can be
completely starved
avoid synchronized loss bursts among sources avoid cycles of large-loss followed by no-
transmission
WRED (Cisco)