Reliable multicast from end-to-end solutions to active solutions
Reliable multicast
from end-to-end solutions to active solutions
C. Pham
RESO/LIP-Univ. Lyon 1, France
DEA DIF, Nov. 13th 2002, ENS-Lyon, France
Q&A
Q1: How many people in the audience have heard about multicast?
Q2: How many people in the audience know basically what multicast is?
Q3: How many people in the audience have ever tried multicast technologies?
Q4: How many people think they need multicast?
My guess on the answers
Q1: How many people in the audience have heard about multicast? 100%
Q2: How many people in the audience know basically what multicast is? about 40%
Q3: How many people in the audience have ever tried multicast technologies? 0%
Q4: How many people think they need multicast? 0%
Purpose of this tutorial
Provide a comprehensive overview of current multicast technologies
Show the evolution in multicast technologies
Achieve 100%, 100%, 30% and 70% on the previous answers next time!
How can multicast change the way people use the Internet?
(figure: a lone user surrounded by "multicast!" shouts: "Everybody's talking about multicast! Really annoying! Why would I need multicast anyway?")
From unicast…
Problem: sending the same data to many receivers via unicast is inefficient
Example: popular WWW sites become serious bottlenecks
(figure: the Sender transmits a separate copy of the data to each Receiver)
…to multicast on the Internet.
Not n-unicast from the sender's perspective
Efficient one-to-many data distribution
Towards low latency, high bandwidth
(figure: the Sender emits one copy of the data; IP multicast replicates it in the network towards each Receiver)
Think about… new applications for the Internet:
high-speed WWW, video-conferencing, video-on-demand, interactive TV programs, remote archival systems, tele-medicine, white board, high-performance computing, grids, virtual reality, immersion systems, distributed interactive simulations/gaming…
A whole new world for multicast…
A very simple example
File replication: a 10-MByte file, 1 source, n receivers (replication sites), 512 Kbit/s upstream access
n=100: Tx = 4.55 hours
n=1000: Tx = 1 day 21 hours 30 mins!
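The Tx figures above can be checked with a back-of-the-envelope computation, assuming 10 MBytes = 10 × 2^20 bytes and 512 Kbit/s = 512,000 bit/s:

```python
# Sequential unicast push: each receiver gets its own copy over the same uplink.
FILE_BITS = 10 * 2**20 * 8      # 10-MByte file, in bits
UPLINK_BPS = 512 * 1000         # 512 Kbit/s upstream access

def unicast_push_time(n_receivers: int) -> float:
    """Seconds to send one copy of the file to each of n receivers."""
    return n_receivers * FILE_BITS / UPLINK_BPS

for n in (100, 1000):
    t = unicast_push_time(n)
    print(f"n={n}: {t:.0f} s = {t / 3600:.2f} h")
```

For n=100 this gives 16,384 s, i.e. the slide's 4.55 hours; for n=1000 it gives 163,840 s, i.e. 1 day 21 h 30 min. One multicast transmission would take 163.84 s regardless of n.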
A real example: LHC (DataGrid)
(figure, source DataGrid — the LHC tiered computing model: the Online System receives raw detector data at ~PBytes/sec (one bunch crossing per 25 ns, 100 triggers per second, each event ~1 MByte in size) and feeds an Offline Farm (~20 TIPS) and the CERN Computer Center, Tier 0 (>~20 TIPS), at ~100 MBytes/sec; Tier 1 Regional Centers (Fermilab, France, Italy, UK; ~4 TIPS each) connect at ~2.4 Gbit/s; Tier 2 Centers at ~622 Mbit/s or air freight; Tier 3 institutes (~0.25 TIPS) and workstations at 100–1000 Mbit/s)
Physicists work on analysis “channels”; each institute has ~10 physicists working on one or more channels; data for these channels should be cached by the institute server (physics data cache).
1 TIPS = 25,000 SpecInt95; a PC (1999) = ~15 SpecInt95.
Multicast for computational grids
(figure, from Dorian Arnold: NetSolve Happenings — application users submitting jobs to grid resources)
Some grid applications
Distributed & interactive simulations: DIS, HLA, training.
Astrophysics: black holes, neutron stars, supernovae.
Mechanics: fluid dynamics, CAD, simulation.
Chemistry & biology: molecular simulations, genomic simulations.
(figure: wide-area interactive simulations over the INTERNET — a human-in-the-loop flight simulator, a battlefield simulation, a display, a computer-based submarine simulator)
Reliable multicast: a big win for grids
Data replication
Code & data transfers, interactive job submissions
Data communications for distributed applications (collective & gather operations, sync. barriers)
Databases, directory services
(figure: multicast group 224.2.0.1 spanning the SDSC IBM SP (1024 procs, 5x12x17 = 1020), the NCSA Origin Array (256+128+128, 5x12x(4+2+2) = 480) and the CPlant cluster (256 nodes))
From reliable multicast to Nobel prize!
“We see something, but too weak. Please simulate to enhance the signal!”
“Resource Broker: 7 sites OK, but need to send data fast…”
“Resource Broker: LANL is the best match… but down for the moment”
“OK! The Resource Estimator says I need 5 TB, 2 TF. Where can I do this?”
From [email protected]
Congratulations, you have done a great job; it's the discovery of the century!!
The phenomenon was short but we managed to react quickly. This would not have been possible without efficient multicast facilities enabling quick reaction and fast distribution of data.
The Nobel Prize is on the way :-)
Wide-area interactive simulations
human-in-the-loop flight simulator
battlefield simulation
display, computer-based submarine simulator
INTERNET
The challenges of multicast
SCALABILITY
Part I
The IP multicast model
A look back at the history of multicast
Long history of usage on shared-medium networks: data distribution; resource discovery (ARP, BOOTP, DHCP)
Ethernet: broadcast (software filtered) and multicast (hardware filtered)
Multiple LAN multicast protocols: DECnet, AppleTalk, IP
IP Multicast - Introduction
Efficient one-to-many data distribution: tree-style distribution; packets traverse network links only once; the replication/multicast engine sits at the network layer
Location-independent addressing: one IP address per multicast group
Receiver-oriented service model: receivers subscribe to any group; senders do not know who is listening; routers find the receivers. Similar to the television model; contrasts with the telephone network and ATM
The Internet group model
multicast/group communications means... 1→n as well as n→m
a group is identified by a class D IP address (224.0.0.0 to 239.255.255.255)
an abstract notion that does not identify any host!
(figure, from V. Roca — from the logical view: multicast group 225.1.2.3 with host_1 as source (194.199.25.100) and host_2 (194.199.25.101) and host_3 (133.121.11.22) as receivers; to the physical view: two Ethernet sites connected by multicast routers across the Internet, forming the multicast distribution tree)
Example: video-conferencing
(figure, from UREC, http://www.urec.fr — multicast address group 224.2.0.1)
The Internet group model... (cont’)
local-area multicast: uses the diffusion capabilities of the physical layer (e.g. Ethernet); efficient and straightforward
wide-area multicast: requires going through multicast routers, using IGMP/multicast routing/... (e.g. DVMRP, PIM-DM, PIM-SM, PIM-SSM, MSDP, MBGP, BGMP, etc.); routing within a single administrative domain is simple and efficient; inter-domain routing is complex and not fully operational
from V. Roca
IP Multicast Architecture
Hosts
Routers
Service model
Host-to-router protocol (IGMP)
Multicast routing protocols (various)
Multicast and the TCP/IP layered model
TCP UDP
IP / IP multicast
device drivers
ICMP IGMP
Application
Socket layer
multicast routing
higher-level services
user space / kernel space
congestion control
reliability mgmt
other building blocks
security
from V. Roca
Internet Group Management Protocol
IGMP: a “signaling” protocol to establish, maintain and remove groups on a subnet.
Objective: keep the router up-to-date with the group membership of the entire LAN
Routers need not know who all the members are, only that members exist
Each host keeps track of which mcast groups it is subscribed to
The socket API informs the IGMP process of all joins
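From the application side, joining a group is a single socket option; the kernel's IGMP code then emits the Report messages shown on the following slides. A minimal sketch (the group is the example address from these slides; the helper names are ours):

```python
import socket
import struct

def group_membership(group: str, ifaddr: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure: group address + local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(ifaddr))

def join_group(sock: socket.socket, group: str) -> bytes:
    """Subscribe `sock` to `group`; the kernel's IGMP code sends the Report."""
    mreq = group_membership(group)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return mreq

def leave_group(sock: socket.socket, mreq: bytes) -> None:
    """Unsubscribe; the router may then Query the subnet, as the slides show."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, mreq)

# Typical receiver: bind the group's UDP port, join, then recvfrom() datagrams.
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.bind(("", 5004))                 # arbitrary example port
#   mreq = join_group(sock, "224.2.0.1")
#   ...
#   leave_group(sock, mreq)
```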
IGMP: subscribe to a group (1)
(figure, from UREC: three hosts with pending subscriptions to 224.2.0.1 and 224.5.5.5; the router's membership lists are still empty; the router periodically sends an IGMP Query to 224.0.0.1)
Note: 224.0.0.1 reaches all multicast hosts on the subnet
IGMP: subscribe to a group (2)
(figure, from UREC: one host sends a Report for 224.2.0.1; the router adds 224.2.0.1 to its list; the other hosts see that somebody has already subscribed for the group)
IGMP: subscribe to a group (3)
(figure, from UREC: another host sends a Report for 224.5.5.5; the router's list now holds 224.2.0.1 and 224.5.5.5)
Data distribution example
(figure, from UREC: data sent to 224.2.0.1 is forwarded onto the subnet because 224.2.0.1 is in the router's membership list — OK)
IGMP: leave a group (1)
(figure, from UREC: a host sends a Leave for 224.2.0.1 to 224.0.0.2)
Note: 224.0.0.2 reaches the multicast-enabled routers on the subnet
IGMP: leave a group (2)
(figure, from UREC: the router sends an IGMP Query for 224.2.0.1 to check whether members remain)
IGMP: leave a group (3)
(figure, from UREC: another member answers with a Report for 224.2.0.1 — “Hey, I'm still here!” — so the group is kept)
IGMP: leave a group (4)
(figure, from UREC: the last member of 224.5.5.5 sends a Leave to 224.0.0.2)
IGMP: leave a group (5)
(figure, from UREC: the router sends an IGMP Query for 224.5.5.5; nobody answers, so the group is removed)
Part II
Introducing reliability
User perspective of the Internet
from UREC, http://www.urec.fr
Links: the basic element in networks
Backbone links: optical fibers, 2.5 to 160 Gbit/s with DWDM techniques
End-user access:
9.6 Kbit/s (GSM) to 2 Mbit/s (UMTS)
V.90 56 Kbit/s modem on twisted pair
64 Kbit/s to 1930 Kbit/s ISDN access
512 Kbit/s to 2 Mbit/s with xDSL modems
1 Mbit/s to 10 Mbit/s cable modems
155 Mbit/s to 2.5 Gbit/s SONET/SDH
Routers: key elements of internetworking
Routers run routing protocols and build routing tables, receive data packets and perform relaying, may have to consider Quality-of-Service constraints when scheduling packets, and are highly optimized for packet-forwarding functions.
The Wild Wild Web
important data
heterogeneity, link failures, congested routers, packet loss, packet drop, bit errors…
?
Multicast difficulties
At the routing level: management of the group address (IGMP); dynamic nature of the group membership; construction of the multicast tree (DVMRP, PIM, CBT…); multicast packet forwarding
At the transport level: reliability and loss recovery strategies; flow control; congestion avoidance
Reliability Models
Reliability requires redundancy to recover from uncertain loss or other failure modes.
Two types of redundancy:
Spatial redundancy: independent backup copies, e.g. forward error correction (FEC) codes. Problem: requires extra overhead, and since the FEC is part of the packet(s) it cannot recover from the erasure of all packets
Temporal redundancy: retransmit if packets are lost or in error. Lazy: trades off response time for reliability. The design of status reports and retransmission optimization is important
Temporal Redundancy Model
Packets: sequence numbers, CRC or checksum
Status reports: ACKs, NAKs, SACKs, bitmaps
Retransmissions: packets, FEC information
Timeout
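The spatial-redundancy idea can be illustrated with the simplest FEC code, a single XOR parity packet per block: any one erased packet can be rebuilt, but, as noted above, no code recovers from the erasure of all packets. A toy sketch (real schemes use stronger erasure codes):

```python
from functools import reduce

def make_parity(packets):
    """XOR equal-length packets together into one repair packet."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

def recover(received, parity, block_size):
    """Rebuild the single missing packet of a block, if exactly one is missing."""
    missing = [i for i in range(block_size) if i not in received]
    if len(missing) == 1:
        # XOR of everything that arrived, plus parity, equals the lost packet
        received[missing[0]] = make_parity(list(received.values()) + [parity])
    return received

block = [b"pkt0", b"pkt1", b"pkt2", b"pkt3"]
parity = make_parity(block)
got = recover({0: b"pkt0", 1: b"pkt1", 3: b"pkt3"}, parity, 4)  # pkt 2 erased
```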
Part III
End-to-end solutions
End-to-end solutions for reliability
Sender-reliable: the sender detects packet losses by gaps in the ACK sequence; easy resource management
Receiver-reliable: receivers detect packet losses and send NACKs towards the source
Challenge: Reliable multicast scalability
many problems arise with 10,000 receivers...
Problem 1: scalable control traffic
ACK each data packet (à la TCP)... oops, 10,000 ACKs per packet!
NAK (negative ack) only on failure... oops, if a packet is lost close to the src, 10,000 NAKs!
Source implosion!
(figure: NACK4 from every receiver converging on the source)
Challenge: Reliable multicast scalability
Problem 2: exposure — receivers may receive the same packet several times
(figure: NACK4 from a few receivers triggers retransmissions of data4 that reach the whole group, including receivers that already had the packet)
One example – SRM (Scalable Reliable Multicast)
Receiver-reliable, NACK-based
Not much per-receiver state at the sender
Every member may multicast a NACK or a retransmission
SRM (cont')
NACK/retransmission suppression: delay before sending, based on RTT estimation; deterministic + stochastic
Periodic session messages: sequence numbers (detection of loss); estimation of the distance matrix among members
SRM Request Suppression
Src
from Haobo Yu , Christos Papadopoulos
Deterministic Suppression
(figure: a chain topology — the sender's data and the requestor's NACK, repair and session messages propagate over hops of delay d; members at distances d, 2d, 3d, 4d set their request timers proportionally, so the nearest requestor fires first and its NACK suppresses the others)
Delay = C1 · dS,R, where dS,R is the estimated distance from member R to sender S
Legend: sender, repairer, requestor
from Haobo Yu , Christos Papadopoulos
SRM Star Topology
Src
from Haobo Yu , Christos Papadopoulos
SRM: Stochastic Suppression
(figure: a star topology — all requestors are at the same distance d from the sender, so deterministic delays would collide; each requestor instead draws a random delay before NACKing, and the first NACK suppresses the rest)
Delay = U[0, C2] · dS,R
Legend: sender, repairer, requestor
from Haobo Yu , Christos Papadopoulos
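The two timers combine into SRM's request timer: a member at distance dS,R draws its NACK delay from [C1·dS,R, (C1+C2)·dS,R]; whoever fires first multicasts the NACK and everyone else suppresses. A sketch (C1 = C2 = 2 and the distances are illustrative, not SRM's tuned values):

```python
import random

def request_delay(d_sr: float, c1: float = 2.0, c2: float = 2.0) -> float:
    """Deterministic part spreads members by distance; the stochastic part
    breaks ties between members at the same distance (star topology)."""
    return (c1 + c2 * random.random()) * d_sr

# Members at hop distances 1d..4d on a chain: nearer members usually fire first.
random.seed(0)
delays = {f"member@{k}d": request_delay(k) for k in (1, 2, 3, 4)}
first = min(delays, key=delays.get)   # this member's NACK suppresses the rest
```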
What’s missing?
A loss on link (A,C) causes retransmission to the whole group
Better: only retransmit to those members who lost the packet
[Only request from the nearest responder]
(figure: sender S with routers A, B and receivers C–F in a tree)
from Haobo Yu, Christos Papadopoulos
Idea: Perform Local Recovery with scope limitation
TTL-scoped multicast: use the TTL field of IP packets to limit the scope of the repair packet
Src
TTL=1 TTL=2 TTL=3
Example: RMTP
Reliable Multicast Transport Protocol by Purdue and AT&T Research Labs
Designed for file dissemination (single-sender)
Deployed in AT&T’s billing network
from Haobo Yu , Christos Papadopoulos
RMTP: Fixed hierarchy
Rcvrs are grouped into local regions
Each rcvr unicasts periodic ACKs to its ACK Processor (AP); the AP unicasts its own ACK to its parent
Each rcvr dynamically chooses the closest statically configured Designated Receiver (DR) as its AP
(figure: sender S, routers R1–R5, with DR/APs and receivers arranged in a tree)
from Haobo Yu , Christos Papadopoulos
RMTP: Error control
The DR checks retx “requests” periodically
Mcast or unicast retransmission, based on the percentage of requests; scoped mcast for local recovery
from Haobo Yu , Christos Papadopoulos
RMTP: Comments
Pro: heterogeneity — a lossy link or slow receiver will only affect a local region
Con: the position of the DR is critical — the static hierarchy cannot adapt the local recovery zone to the loss points
from Haobo Yu , Christos Papadopoulos
Summary: reliability problems
What is the problem of loss recovery? Feedback (ACK or NACK) implosion; duplication of replies/repairs; difficult adaptability to dynamic membership changes
Design goals: reduce the feedback traffic; reduce recovery latencies; improve recovery isolation
Summary: end-to-end solutions
ACK/NACK aggregation based on timers is approximative!
TTL-scoped retransmissions are approximative!
Not really scalable! Cannot exploit in-network information.
Part IV
Active solutions
What are active networks?
Programmable nodes/routers; customized computations on packets; a standardized execution environment and programming interface
However, this adds extra processing cost
Motivations behind active networking
user applications can implement and deploy customized services and protocols: specific data filtering criteria (DIS, HLA); fast collective and gather operations…
globally better performance by reducing the amount of traffic: high throughput, low end-to-end latency
Active networks implementations
Discrete approach (operator's approach): adds dynamic deployment features to nodes/routers; new services can be downloaded into the router's kernel
Integrated approach: adds executable code to data packets; capsule = data + code; granularity set at the packet level
The discrete approach
Separates the injection of programs from the processing of packets
(figure: data packets flow through routers into which active code A1 and A2 have been injected separately)
The integrated approach
User packets carry code to be applied on the data part of the packet
High flexibility to define new services
(figure: capsules carrying both code and data traverse the network; routers execute the code on the data part)
An active router
(figure: IP input processing feeds a filter/action stage backed by the forwarding table and routing agent; matched packets are handed to some layer for executing code — let's call it the Active Layer (AL packets) — then go through IP output processing and the packet scheduler)
Where to put active components?
In the core network? Routers already have to process millions of packets per second, and gigabit rates make additional processing difficult without a dramatic slowdown
At the edge? To efficiently handle the heterogeneity of user accesses; to provide QoS and implement intelligent congestion avoidance mechanisms…
Users' accesses
(figure: offices, campus and residential users reach the Internet and Internet Data Centers through Network Provider metro rings, via PSTN, ADSL, cable…)
Solutions
Traditional end-to-end retransmission schemes: scoped retransmission with the TTL field; receiver-based local NACK suppression
Active contributions: caching of data to allow local recoveries; feedback aggregation; subcast; early lost packet detection…
The reliable multicast universe (10 human years — which means much more in computer years)
(figure, a map of protocols by family:)
Application-based: YOID, ALMI, HBM
Router supported / active networking: RMANP, ARM, DyRAM, AER, PGM
Layered/FEC: RLC, RLM, CIFL, FLID
Logging server/replier: LBRM, SRM, TRAM, RMTP, LMS
End to end: XTP, MTP, RMF, AFDP
Routers have specific functionalities/services for supporting multicast flows.
Active networking goes a step further by opening routers to dynamic code provided by end-users.
This opens new perspectives for efficient in-network services and rapid deployment.
(examples of router-supported / active-networking protocols: RMANP, ARM, DyRAM, AER, PGM)
A step toward active services: LBRM
Active local recovery
Routers cache data packets; repair packets are sent by a router when available
(figure: on receiving NACK4, the router replays data4 from its cache instead of forwarding the NACK to the source)
Global NACKs suppression
(figure: NACK4 arriving from several downstream links is aggregated at the router)
Only one NACK is forwarded to the source
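A sketch of this aggregation logic at a router (the names and per-sequence bookkeeping are our illustration): the first NACK for a given sequence number is forwarded upstream, later duplicates are absorbed, and the set of requesting links is kept for the later subcast of the repair.

```python
def on_nack(pending: dict, seq: int, link: int) -> bool:
    """Return True if this NACK should be forwarded towards the source."""
    first = seq not in pending
    pending.setdefault(seq, set()).add(link)   # remember who asked, for subcast
    return first

pending: dict = {}
# Four NACKs for packet 4 arrive, from links 0, 1, 2 and 1 again
forwarded = [on_nack(pending, 4, link) for link in (0, 1, 2, 1)]
```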
Local NACKs suppression
(figure: redundant NACKs from receivers in the same subtree are suppressed locally, near where the loss occurred)
Early lost packet detection
NACK4
NACK4
NACK4
NACK4
NACK4
A NACK is sent by the router
data3data4
data5
The repair latency can be reduced if the lost packet could be requested as soon as possible
These NACKs are ignored!
9191
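The gap test itself is cheap, which is what makes it practical in a router's fast path. A simplified sketch (the per-session state layout and names are illustrative):

```python
def on_data_packet(state: dict, seq: int) -> list:
    """Update per-session state; return sequence numbers to NACK immediately."""
    expected = state.get("last_received", -1) + 1
    nacks = []
    if seq > expected:                  # gap: packets expected..seq-1 are missing
        nacks = list(range(expected, seq))
        state.setdefault("lost", set()).update(nacks)
    else:                               # in-order, duplicate, or a repair
        state.setdefault("lost", set()).discard(seq)
    state["last_received"] = max(state.get("last_received", -1), seq)
    return nacks

state: dict = {}
first = on_data_packet(state, 0)    # in order: nothing to request
second = on_data_packet(state, 1)
gap = on_data_packet(state, 4)      # data2, data3 never seen: NACK them now
```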
Active subcast features
Send the repair packet only to the relevant set of receivers
(figure: the router forwards the data4 repair only on the links from which a NACK4 was received)
The DyRAM framework (Dynamic Replier Active Reliable Multicast)
Motivations for DyRAM:
low recovery latency, using local recovery
low memory usage in routers: local recovery is performed from the receivers (no cache in routers)
low processing overheads in routers: light active services
DyRAM's main active services
DyRAM is NACK-based with…
global NACK suppression; early packet loss detection; subcast of repair packets; dynamic replier election
Replier election
A receiver is elected to be a replier for each lost packet (one recovery tree per packet)
Load balancing can be taken into account for the replier election
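A sketch of the election (not the actual DyRAM code): among the downstream receivers known to hold the lost packet, the router picks one replier, and rotating the choice gives the load balancing mentioned above.

```python
def elect_replier(seq: int, downstream: dict, turn: int = 0):
    """Pick a downstream receiver holding packet `seq`; None => NACK goes upstream."""
    candidates = sorted(r for r, have in downstream.items() if seq in have)
    if not candidates:
        return None                             # nobody can repair locally
    return candidates[turn % len(candidates)]   # rotate for load balancing

# R5 and R7 received packet 2, R6 lost it (receiver names echo the next figure)
links = {"R5": {0, 1, 2}, "R6": {0, 1}, "R7": {0, 1, 2}}
```

With `turn=0` the router elects R5 to multicast the repair; with `turn=1` it elects R7; for a packet nobody holds, the NACK is forwarded upstream.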
Replier election and repair subcast
(figure: a source and two levels of DyRAM routers D0 and D1 over IP multicast, with receivers R1–R7; NAK 2 arrives at D1 from two links and is aggregated, a single NAK 2 goes upstream, a replier holding packet 2 is elected, and Repair 2 is subcast only on the links that requested it)
The DyRAM framework for grids
(figure: the source sits on a very fast core network at Gbit/s rates (1000 Base FX), so the backbone performs nothing other than fast forwarding; active routers at the edges (100 Base FX) perform NACK suppression, subcast, loss detection and replier election)
A hierarchy of active routers can be used to process specific functions at different layers of the hierarchy.
Any receiver can be elected as a replier for a lost packet.
Network model
10 MBytes file transfer
Source router
Local recovery from the receivers
Local recoveries reduce the end-to-end delay (especially for high loss rates and a large number of receivers).
#grp: 6…24
4 receivers/group
p=0.25
Local recovery from the receivers
As the group size increases, doing the recoveries from the receivers greatly reduces the bandwidth consumption
48 receivers distributed in g groups; #grp: 2…24
DyRAM vs ARM
ARM performs better than DyRAM only for very low loss rates and with considerable caching requirements
Simulation results
p=0.25; #grp: 6…24; 4 receivers/group
EPLD is very beneficial to DyRAM
The simulation results are very close to those of the analytical study
DyRAM implementation
Preliminary experimental results
Testbed configuration:
TAMANOIR active execution environment
Java 1.3.1 and a Linux 2.4 kernel
A set of PC receivers and 2 PC-based routers (Pentium II 400 MHz, 512 KB cache, 128 MB RAM)
Data packets of 4 KB
Packet format: ANEP
Data/repair packets: S@IP | SEQ | SVC | D@IP | isR | Payload
NACK packets: S@IP | SEQ | SVC | D@IP | S@IP
The data path
(figure: an FTP flow from source S — IP | UDP | S@IP | data — is wrapped in ANEP packets and redirected from the FTP port to the TAMANOIR port of active router S1, which processes and forwards it)
Router’s data structures
The track list TL maintains, for each multicast session:
lastOrdered: the sequence number of the last packet received in order
lastReceived: the sequence number of the last data packet received
lostList: the list of data packets not received in between
The NACK structure NS keeps, for each lost data packet:
seq: the sequence number of the lost data packet
subList: the list of IP addresses of the downstream receivers (active routers) that have lost it
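The two structures can be written down directly (field names follow the slide; the types and the helper method are our assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class NackState:                     # "NS": one entry per lost data packet
    seq: int                                       # lost packet's seq number
    subList: list = field(default_factory=list)    # downstream losers' addresses

@dataclass
class TrackList:                     # "TL": one entry per multicast session
    lastOrdered: int = -1            # last packet received in order
    lastReceived: int = -1           # last data packet received at all
    lostList: list = field(default_factory=list)   # seq numbers lost in between

    def on_data(self, seq: int) -> None:
        """Record a data packet; any sequence gap feeds lostList (cf. EPLD)."""
        self.lostList.extend(range(self.lastReceived + 1, seq))
        self.lastReceived = max(self.lastReceived, seq)
        if not self.lostList and seq == self.lastOrdered + 1:
            self.lastOrdered = seq

tl = TrackList()
for s in (0, 1, 3):                  # packet 2 is missing
    tl.on_data(s)
```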
The first configuration
(figure: testbed machines ike, resama, resamo, resamdstan)
Active service costs:
NACK: 135 μs
DP: 20 μs if no seq gap, 12–17 ms otherwise; only 256 μs without timer setting
Repair: 123 μs
The second configuration
(figure: testbed machines ike and resamo exchanging NACKs)
The replier election cost:
The election is performed on-the-fly and depends on the number of downstream links.
It ranges from 0.1 to 1 ms for 5 to 25 links per router.
Conclusions
Reliability on large-scale multicast is difficult.
End-to-end solutions have to face many critical problems with only approximative solutions
Active services can provide more efficient solutions.
The main DyRAM design goal is reducing the end-to-end latencies using active services
Preliminary results are very encouraging