P2P Content Distribution - uvigo tv
Traditional Content Distribution
Large content often needs to be distributed to millions of clients.
• Currently:
– Huge server farms
– Infrastructure-based solutions (e.g. Akamai)
• These approaches are slow, expensive, and do not scale
[Figure: server farm]
Content Distribution Evolution
[Figure: hype curve of content distribution technologies, 1999 to 2004. Phase labels: Hype, Disappointment, Realism, Growth. Technologies shown: Caching, IP Multicast, CDNs (Akamai), Enterprise CDNs, Layer-7 Switches, Satellite CDNs, P2P.]
P2P Content Distribution
Desktop PCs can help each other!
• Clients become new servers
• Capacity increases with the number of clients
• Limitless scalability and fast speeds at extremely low cost!!
[Figure: number of clients served (log scale, 10 to 10,000,000) vs. time (0 to 57 sec), comparing cooperative and client/server distribution from a server farm. Setup: 4 MB file, 100 Mbps server, 1 Mbps clients.]
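The claim that capacity grows with the number of clients can be illustrated with a back-of-the-envelope model. This is only a sketch and not the model behind the plot: it assumes clients upload at the same 1 Mbps they download at (the slide does not state upload speeds), that a client starts uploading only once it has the whole file, and that transfers happen in rounds of one full download each. The function names are made up for this example.

```python
FILE_MBIT = 4 * 8        # 4 MB file, expressed in megabits
SERVER_MBPS = 100        # server uplink
CLIENT_MBPS = 1          # client link (assumed symmetric)

ROUND_SEC = FILE_MBIT / CLIENT_MBPS          # 32 s for one client to fetch the file
SERVER_SLOTS = SERVER_MBPS // CLIENT_MBPS    # 100 clients fed by the server at once


def client_server(rounds: int) -> int:
    """Clients served when only the server uploads."""
    return SERVER_SLOTS * rounds


def cooperative(rounds: int) -> int:
    """Clients served when every finished client also feeds one new client per round."""
    served = 0
    for _ in range(rounds):
        served += SERVER_SLOTS + served   # server slots plus one upload per finished client
    return served


for r in (1, 2, 5, 10):
    print(f"{r * ROUND_SEC:5.0f} s  client/server={client_server(r):>7}  "
          f"cooperative={cooperative(r):>7}")
```

In this model the client/server count grows linearly with time, while the cooperative count roughly doubles every round. Swarming (splitting the file into pieces) lets clients start uploading much earlier than this whole-file model allows.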
Examples
• Updates/Critical Patches (combat virus/worm propagation)
– Adding more servers and egress capacity to absorb peak load is quite expensive
– The alternative is to artificially delay clients
» Patches do not arrive on time
• Software Distribution
– BitTorrent: successfully distributed the 1.77 GB Red Hat 9 release
• PodCasting
• Group Information Sharing
• Enterprise content distribution
P2P Content Distribution
• Benefits:
– Dramatically improves speed
– Limitless scalability
– Minimum server requirements
– Very cheap
• Challenges:
– Requires incentives for cooperation
– Hard to ensure full end-to-end connectivity
– Security
– Manageability
– Lack of locality increases transit costs for ISPs
– Asymmetric links (traffic engineering)
– Variable bandwidth, peers come and go
– Need for more sophisticated distribution algorithms
P2P Swarming
• File is divided into many small pieces for distribution
• Clients request different pieces from the server or from other clients
• Clients become servers for the pieces they have downloaded
• When all pieces are downloaded, clients can re-construct the whole file
[Figure: a server and several clients, each holding a different subset of pieces 1 to 6]
[Rodriguez, Biersack, Infocom’00]
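The swarming steps can be sketched in a few lines. This is a toy illustration, not any real client's protocol: the piece size, the `split` and `reassemble` helpers, and the choice of which pieces come from a peer are all assumptions made for the example.

```python
PIECE_SIZE = 4  # bytes; deliberately tiny so the pieces are easy to inspect


def split(data: bytes, size: int = PIECE_SIZE) -> list[bytes]:
    """Divide the file into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def reassemble(pieces: dict[int, bytes], total: int) -> bytes:
    """Rebuild the file once every piece index 0..total-1 is present."""
    if set(pieces) != set(range(total)):
        raise ValueError("some pieces are still missing")
    return b"".join(pieces[i] for i in range(total))


original = b"peer-to-peer swarming!"
server = dict(enumerate(split(original)))    # the server holds every piece
total = len(server)

# Another client that already downloaded pieces 0 and 3 now serves them.
peer = {i: server[i] for i in (0, 3)}

# A new client fetches pieces in any order, from the peer where possible.
client: dict[int, bytes] = {}
client[3] = peer[3]
client[0] = peer[0]
for i in range(total):
    client.setdefault(i, server[i])          # remaining pieces come from the server

rebuilt = reassemble(client, total)
print(rebuilt == original)
```

Because pieces are addressed by index, the order in which they arrive and the node they come from do not matter for reconstruction.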
The Challenge
[Figure: a server and several clients, each holding a different subset of pieces 1 to 6]
If there are many users, deciding which piece is best to download can be very hard!
⇒ Incorrect decisions result in low throughput, nodes unable to finish, wasted bandwidth, etc.
Solutions that require full knowledge of who has what do not scale.
Goal
• Provide a very fast and robust P2P file distribution solution
• Current problems in existing P2P solutions:
– Rare blocks are hard to obtain
– Tit-for-tat incentive mechanisms decrease speeds
– Arrival of new users slows down old users
– Heterogeneous nodes do not interact well
– The same information travels repeatedly over bottleneck links
– Too much dependency on seeds
– Sudden departures can prevent peers from finishing
The Problem of Efficient Scheduling of Information
[Figure: a Source and Nodes A, B and C; labels: Block 1, Block 2, and Block 1 at Node C]
Which block should be sent: Block 1, Block 2, or 1⊕2?
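A small sketch of why the combined block 1⊕2 is a safe choice when the sender does not know what the receiver already holds. It assumes equal-length blocks and uses a hypothetical `xor_blocks` helper; the block contents are made up.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


block1 = b"AAAA"
block2 = b"BBBB"
coded = xor_blocks(block1, block2)          # the "1⊕2" option

# A receiver that already holds Block 1 recovers Block 2 from the coded block...
recovered2 = xor_blocks(coded, block1)
# ...and a receiver that already holds Block 2 recovers Block 1 from the same block.
recovered1 = xor_blocks(coded, block2)

print(recovered1 == block1, recovered2 == block2)
```

Sending plain Block 1 is wasted on a receiver that already has Block 1, and likewise for Block 2, whereas the coded block is useful to a receiver holding either one.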
The Avalanche Magic
• To solve problems of existing P2P file distribution solutions, Avalanche uses special encoding algorithms
• Each encoded piece has the “DNA” of all pieces in the file.
=> A given encoded piece can be used by any peer in place of any piece
• Encoded pieces are created using linear equations that involve all pieces in the file
• Reconstructing the file requires collecting enough encoded pieces and solving the set of mathematical equations
Coding in general
• Assume file: $F = [x_1\ x_2]$, where $x_i$ is a block.
• Define code $E_i(a_{i,1}, a_{i,2}) = a_{i,1} x_1 + a_{i,2} x_2$, where $a_{i,1}$, $a_{i,2}$ are numbers.
• “Infinite” number of $E_i$’s.
• Any two linearly independent $E_i(a_{i,1}, a_{i,2})$ can recover $[x_1\ x_2]$.
– Similar to solving a system of linear equations.
• Operations are in finite fields [such as $GF(2^{16})$].
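The two-block case can be worked through in code. This is a sketch under one loud assumption: arithmetic is done modulo the prime 65537 as a simple stand-in for a finite field, not in GF(2^16) itself, which needs polynomial arithmetic. The `encode` and `decode` helpers and the sample values are invented for the example.

```python
P = 65537  # prime modulus; an assumed stand-in for a field such as GF(2^16)


def encode(a1: int, a2: int, x1: list[int], x2: list[int]) -> list[int]:
    """E(a1, a2) = a1*x1 + a2*x2, computed symbol by symbol modulo P."""
    return [(a1 * u + a2 * v) % P for u, v in zip(x1, x2)]


def decode(c1, e1, c2, e2):
    """Solve the 2x2 system for [x1 x2]; c1 and c2 are the coefficient pairs."""
    det = (c1[0] * c2[1] - c1[1] * c2[0]) % P
    if det == 0:
        raise ValueError("coefficient vectors are linearly dependent")
    inv = pow(det, -1, P)  # modular inverse (Python 3.8+)
    x1 = [((c2[1] * u - c1[1] * v) * inv) % P for u, v in zip(e1, e2)]
    x2 = [((c1[0] * v - c2[0] * u) * inv) % P for u, v in zip(e1, e2)]
    return x1, x2


x1 = [10, 20, 30]          # block x1 as a list of symbols
x2 = [7, 8, 9]             # block x2

e1 = encode(3, 5, x1, x2)  # E_1 with (a_11, a_12) = (3, 5)
e2 = encode(2, 7, x1, x2)  # E_2 with (a_21, a_22) = (2, 7)

print(decode((3, 5), e1, (2, 7), e2) == (x1, x2))
```

Any other pair of coefficient vectors with a non-zero determinant would decode equally well, which is what lets a peer use whichever encoded blocks it happens to receive. A dependent pair such as (3, 5) and (6, 10) is rejected.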
Avalanche Coding
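[Figure: the server holds the file as blocks B1, B2, ..., Bn; coefficient sets α1 ... αn and β1 ... βn; encoded blocks E1 and E2 at Client A; coefficients ω1, ω2 and encoded block E3 at Client B]
• Content is encoded at the server
• Clients can produce new encoded packets out of partial files
[Chou et al., ’03]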
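The two bullets can be sketched as random linear coding with re-encoding at a client. This is an illustration rather than Avalanche's actual implementation. It assumes arithmetic modulo the prime 65537 instead of a field like GF(2^16), that each packet carries its coefficient vector alongside the payload, and hypothetical helpers `combine` and `decode`.

```python
import random

P = 65537  # prime modulus; an assumed stand-in for the finite field


def combine(packets, rng):
    """Random linear combination of (coeffs, payload) packets, modulo P."""
    weights = [rng.randrange(1, P) for _ in packets]
    n, m = len(packets[0][0]), len(packets[0][1])
    coeffs = [sum(w * c[j] for w, (c, _) in zip(weights, packets)) % P for j in range(n)]
    payload = [sum(w * d[k] for w, (_, d) in zip(weights, packets)) % P for k in range(m)]
    return coeffs, payload


def decode(packets, n):
    """Gauss-Jordan elimination modulo P; returns the n blocks, or None if rank < n."""
    rows = [list(c) + list(d) for c, d in packets]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)
        rows[col] = [v * inv % P for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
    return [rows[i][n:] for i in range(n)]


rng = random.Random(0)
blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]           # file blocks B1..Bn, n = 3
n = len(blocks)
# The server's view: each original block tagged with a unit coefficient vector.
source = [([int(i == j) for j in range(n)], b) for i, b in enumerate(blocks)]

client_a = [combine(source, rng) for _ in range(2)]  # E1, E2: only a partial file
e3 = combine(client_a, rng)                          # A re-encodes without decoding

client_b = [e3]                                      # B starts with A's packet
decoded = None
while decoded is None:                               # then asks the server for more
    client_b.append(combine(source, rng))
    decoded = decode(client_b, n)

print(decoded == blocks)
```

Client A never needs the full file to be useful: E3 is a valid combination of the original blocks because E1 and E2 are. Client B simply keeps collecting packets until it has n independent ones.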
Avalanche Robustness
[Figure: number of peers finished vs. time; legend: NC, FEC, LR]
If the server suddenly goes down (after serving the full file once), all Avalanche users are able to complete the download. Only 10% of BitTorrent-like users are able to complete it.
Avalanche Download Time
[Figure: finish times vs. nodes (sorted by order of arrival); legend: NC, Random; annotations: Avalanche, BitTorrent, BitTorrent peers not yet finished]
=> Much lower and more predictable download times
No need for nodes to stay around…
• With Avalanche, there is no need for nodes to stay after they finish the download to help other nodes (the performance remains unchanged)
[Figure: finish times vs. nodes (sorted by order of arrival) in two cases: nodes stay forever, and nodes leave immediately]
Minimum Server Requirements
Less than half the server requirements of BitTorrent-like systems
[Figure: server load vs. time; legend: NC, FEC, Simple]
Decoding Performance
Avalanche trades more processing power at each node for better speeds and lower server load.
File Size (MB)   Blocks   Time
10               100      5 sec
50               100      37 sec
100              100      2m 21 sec
200              100      3m 38 sec
Note: Pentium III, 650 MHz, 512 MB RAM.
Decoding time is less than 4% of the total download time.