Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh
Introduction
Given a sender and a large set of receivers spread across the Internet, how can we maximize the bandwidth?
Problem domain:
Software or video distribution
Real-time multimedia streaming
Existing Solutions
IP multicast does not consider bandwidth when constructing its distribution tree
A promising alternative: overlay multicast
Attempts to mimic the multicast routing trees
Instead of high-speed routers, use programmable end hosts as interior nodes in the overlay tree
Existing Solutions
A tree structure is problematic
Bandwidth decreases moving down the tree
Any loss high up the tree reduces the bandwidth available lower down the tree
The bandwidth of a node is limited by its single parent
A New Approach
Transmit disjoint data sets to various points in the network
A node downloads from multiple sources rather than a single parent
Higher reliability
Conceptual Model / Conventional Model / Bullet
[Slide animation: a root with two children, A and B, each link 1 Mbps, contrasting the conventional tree model with Bullet]
Bullet Properties
TCP friendly
Must respond to congestion signals
Low control overhead
Probing resources
Locating multiple downloading sources
Decentralized and scalable
No global knowledge
Robust to failures, even high up in a tree
Bullet Overview
Use meshes as opposed to trees
Bandwidth independent of the underlying overlay tree
Evaluated with a 1,000-node overlay over 20,000 network topologies
Up to 2x bandwidth improvement over a bandwidth-optimized tree
Overhead of 30 Kbps
System Components
Split the data into packet-sized objects
Disseminate disjoint objects to clients at a rate determined by the bandwidth to each client
Nodes need to locate and retrieve disjoint data from their peers
Periodically exchange summary tickets
Minimize overlapping objects from each peer
Illustration
[Slide figure: sender S disseminates packets 1-7; nodes A-E each hold different, partially overlapping subsets of the packets]
Data Encoding
Multimedia: MDC encoding
Large files: erasure encoding (Tornado codes)
Only need to locate 5% extra packets to reconstruct the original message
Faster encoding and decoding time than standard erasure codes such as Reed-Solomon
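The core idea of erasure coding can be illustrated with something far simpler than a Tornado code: a single XOR parity block that lets a receiver reconstruct any one lost data block. (Tornado codes generalize this with many sparse parity constraints; this sketch is illustrative only.)

```python
# Minimal erasure-coding sketch (simple XOR parity, not a real Tornado code):
# k data blocks plus one parity block; any single lost block is recoverable.
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length blocks byte by byte."""
    return bytes(reduce(lambda a, b: [x ^ y for x, y in zip(a, b)], blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # k = 3 data blocks
parity = xor_blocks(data)            # 1 parity block

# Suppose block 1 is lost in transit: recover it from the survivors + parity.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```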
RanSub
Distributes random subsets of participating nodes
During the collect phase, each node sends a random subset of its descendant nodes up the tree
During the distribute phase, each node sends a random subset of collected nodes down the tree
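The two phases can be sketched over a static tree as follows. This is a simplification under stated assumptions: node names and the fixed subset size are illustrative, and real RanSub composites fixed-size samples scalably rather than gathering the full descendant pool.

```python
# Sketch of RanSub's collect/distribute phases over a static tree
# (simplified: real RanSub composites fixed-size samples, never full pools).
import random

SUBSET = 2  # size of the random subset exchanged at each step

class Node:
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)
        self.collected = []    # random sample of descendants, built bottom-up
        self.distributed = []  # random sample of non-descendants, pushed down

def collect(node):
    """Collect phase: each node sends a random subset of its descendants up."""
    pool = [node.name]
    for c in node.children:
        pool += collect(c)
    node.collected = random.sample(pool, min(SUBSET, len(pool)))
    return pool

def distribute(node, from_above):
    """Distribute phase: each node pushes a random subset of nodes
    outside each child's subtree down to that child."""
    node.distributed = from_above
    for c in node.children:
        others = [sib.collected for sib in node.children if sib is not c]
        pool = from_above + [node.name] + sum(others, [])
        distribute(c, random.sample(pool, min(SUBSET, len(pool))))

root = Node("root", [Node("A", [Node("C"), Node("D")]), Node("B")])
collect(root)
distribute(root, [])
# Each node now holds a small random subset of nodes outside its own subtree.
```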
Informed Content Delivery Techniques
Use summary tickets
A summary ticket is an array where array[i] = hash_i(working set)
Check ticket elements against a Bloom filter
It is possible to have false positives
It is possible that B will not send a packet to A even though A is missing it
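A minimal Bloom filter makes the false-positive case concrete (the bit-field size and hashing scheme below are illustrative, not Bullet's actual parameters): membership tests can wrongly return true, so B may believe A already holds a packet and skip sending it.

```python
# Minimal Bloom filter sketch (sizes and hash scheme are illustrative).
# A false positive here means B thinks A already has a packet and skips it.
import hashlib

class BloomFilter:
    def __init__(self, bits=64, hashes=3):
        self.bits, self.hashes, self.field = bits, hashes, 0

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item):
        for p in self._positions(item):
            self.field |= 1 << p

    def might_contain(self, item):
        # True can be a false positive; False is always correct.
        return all(self.field & (1 << p) for p in self._positions(item))

received = BloomFilter()
for seq in [1, 2, 3, 5]:          # packets A actually holds
    received.add(seq)

assert received.might_contain(3)  # no false negatives, ever
# might_contain(4) may return True even though A never received packet 4 —
# in that case B would (incorrectly) not send packet 4 to A.
```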
TCP Friendly Rate Control (TFRC)
TCP halves the sending rate as soon as a single packet loss is detected
Too severe for streaming
TFRC is based on loss events: multiple packets dropped within one round-trip time count as a single event
TCP Friendly Rate Control (TFRC)
Bullet eliminates retransmission from TFRC
Easier to recover from other sources than from the initial sender
TFRC does not aggressively seek newly available bandwidth like TCP
Bullet
Layers a mesh on top of an overlay tree to increase overall bandwidth
Finding Overlay Peers
RanSub periodically delivers subsets of uniformly randomly selected nodes
Via summary tickets (120 bytes per node)
The working set of each node is associated with a Bloom filter
Peer with the nodes having the lowest similarity ratio
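The summary-ticket construction (array[i] = hash_i of the working set) is a min-wise sketch, so the fraction of matching ticket entries estimates how similar two working sets are; Bullet prefers peers with the lowest ratio. A sketch, assuming a ticket of 10 entries (the actual tickets are 120 bytes per node):

```python
# Min-wise summary-ticket sketch: the fraction of matching ticket entries
# estimates working-set similarity (ticket length here is illustrative).
import hashlib

def ticket(working_set, length=10):
    def h(i, item):
        return int.from_bytes(
            hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big")
    # Entry i is the minimum of hash_i over the whole working set.
    return [min(h(i, x) for x in working_set) for i in range(length)]

def similarity(t1, t2):
    return sum(a == b for a, b in zip(t1, t2)) / len(t1)

a = ticket(range(0, 100))    # node A's packets
b = ticket(range(50, 150))   # node B's packets, partial overlap
c = ticket(range(0, 100))    # identical working set

assert similarity(a, c) == 1.0
# similarity(a, b) approximates the Jaccard overlap of the two working
# sets; a receiver would prefer the peer with the lowest ratio.
```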
Recovering Data from Peers
A receiver assigns a portion of the sequence space to each of its senders, to avoid duplication among senders
A receiver periodically updates each sender with its current Bloom filter and the range of sequences covered by that filter
Less than 10% of all received packets are duplicates
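Assigning disjoint portions of the sequence space can be sketched as below. The even split is an assumption for illustration; a receiver could also weight the ranges by per-sender bandwidth.

```python
# Sketch: partition sequence numbers [low, high) into one disjoint range
# per sender so no two senders push the same packets (even split is
# illustrative; ranges could be weighted by per-sender bandwidth).
def assign_ranges(low, high, senders):
    span = high - low
    step, extra = divmod(span, len(senders))
    ranges, start = {}, low
    for i, s in enumerate(senders):
        end = start + step + (1 if i < extra else 0)
        ranges[s] = range(start, end)
        start = end
    return ranges

ranges = assign_ranges(1000, 1010, ["peer-1", "peer-2", "peer-3"])
# peer-1 covers 1000-1003, peer-2 covers 1004-1006, peer-3 covers 1007-1009
```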
Making Data Disjoint
Given a randomly chosen subset of peer nodes, each node is about equally likely to have any particular data packet
A parent decides the portion of its data sent to each child
A function of limiting and sending factors
Making Data Disjoint
The portion of data a child should own is proportional to:
The number of its descendants
Its bandwidth
If there is not enough bandwidth
Each child receives a completely disjoint data stream
Making Data Disjoint
If there is ample bandwidth
Each child receives the entire parent stream
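The proportional rule on the previous slides can be sketched as follows; the exact limiting and sending factors in Bullet are more involved, so this only shows the descendant-count weighting.

```python
# Sketch of a parent splitting its stream among children in proportion to
# descendant count (Bullet's actual limiting/sending factors also account
# for per-child bandwidth; this shows the proportional rule only).
def child_portions(descendants):
    """Fraction of the parent's stream each child should own."""
    total = sum(descendants.values())
    return {child: n / total for child, n in descendants.items()}

# A child with three descendants owns three times the share of a leaf child.
portions = child_portions({"A": 3, "B": 1})
assert portions == {"A": 0.75, "B": 0.25}
```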
Improving the Bullet Mesh
What can go wrong?
Not enough peers
Constantly changing network
Use trial senders and receivers
Bullet periodically evaluates the performance of its peers
Replaces the worst-performing sender/receiver
Evaluation
Used Internet environments and ModelNet IP emulation
Deployed on PlanetLab
Built on MACEDON, which specifies the overlay algorithms
Core logic under 1,000 lines of code
Evaluation
ModelNet experiments
50 2-GHz Pentium 4s running Linux 2.4.20
100 Mbps and 1 Gbps Ethernet switches
1,000 emulated instances
20,000 INET-generated topologies
Offline Bottleneck Bandwidth Tree
Given global knowledge, what is the overlay tree that delivers the highest bandwidth to a set of overlay nodes?
Finding a tree with a maximum bottleneck is NP-hard in general
Offline Bottleneck Bandwidth Tree
Assumptions:
The path between two overlay nodes is fixed
The overlay tree uses TCP-friendly unicast connections to transfer data point-to-point
In the absence of other flows, the throughput of a TCP-friendly flow can be estimated using a steady-state formula
When n flows share a link, each flow can achieve at most 1/n of the total throughput
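The steady-state formula referred to here is commonly the TCP throughput equation of Padhye et al., which TFRC also uses; a sketch with illustrative parameter values:

```python
# Steady-state TCP throughput estimate (the equation used by TFRC;
# the parameter values in the example below are illustrative).
from math import sqrt

def tcp_throughput(s, rtt, p, t_rto=None, b=1):
    """Estimated bytes/sec for packet size s (bytes), round-trip time rtt
    (sec), loss event rate p, retransmit timeout t_rto, b packets per ACK."""
    if t_rto is None:
        t_rto = 4 * rtt  # common simplification
    denom = (rtt * sqrt(2 * b * p / 3)
             + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p))
    return s / denom

# Example: 1500-byte packets, 100 ms RTT, 1% loss event rate.
rate = tcp_throughput(1500, 0.1, 0.01)   # roughly 1.3 Mbps
```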
Bullet vs. Streaming
The maximum bottleneck tree achieves 5x the bandwidth of random trees
Bullet outperforms the bottleneck tree by up to 100%
Creating Disjoint Data
Without disjoint transmission of data, Bullet's performance degrades by 25%
Epidemic Approaches
Bullet is 60% better than anti-entropy (gossiping) approaches
Epidemic algorithms produce an excessive number of duplicate packets
Bullet on a Lossy Network
Bullet achieves 2x bandwidth compared to maximum bottleneck trees
Performance Under Failure
With failure detection/recovery disabled
30% performance degradation when a child of the root fails
With failure detection/recovery enabled
Negligible disruption of performance
PlanetLab
47 nodes in the deployment
Results similar to the emulation
Related Work
Kazaa
Perpendicular downloads
Does not use erasure codes
Bandwidth consuming
BitTorrent
Centralized tracker
FastReplica
Not bandwidth aware
Related Work
Scalable Reliable Multicast
Difficult to configure
Epidemic approaches
Do not avoid duplicates
Narada
Uses overlay meshes
Bandwidth still limited by the parent
Related Work
Overcast
More heavyweight when nodes leave a tree
SplitStream
Not bandwidth aware
CoopNet
Centralized