Understanding the Power of Pull-based P2P
Streaming Protocol: We Can Do Even Better
Qian Zhang, Hong Kong University of Science and Technology
Aug. 2007
2
Internet Video Streaming
• Enable video distribution from anywhere, to any number of people anywhere in the world
• Unlimited number of channels
– Everyone can be a content producer/provider
3
Evolution of Internet Streaming Technology
Native Unicast Approach
IP Multicast
Content Distribution Networks
P2P Internet Video Streaming/Broadcast
4
P2P Traffic Really Matters
• At the end of 2006, P2P represented ~65% of Internet traffic
• 1999: Napster, the first widely used P2P application
5
P2P Protocols:
• 1999: Napster, End System Multicast (ESM)
• 2000: Gnutella, eDonkey
• 2001: Kazaa
• 2002: eMule, BitTorrent
• 2003: Skype
• 2004: Coolstreaming, PPLive
• Today: GridMedia, TVKoo, TVAnts, PPStream, SopCast…
• Next: Video-on-Demand, Gaming

Application Types: File Download, Streaming, Telephony, Video-on-Demand, Gaming
P2P is More Than File Download
Our focus is on live streaming !
6
Popular Deployed Systems
• Live P2P streaming has become an increasingly popular approach
• Many real deployed systems. Just to name a few …
• Coolstreaming: Cooperative Overlay Streaming
– First release: May 2004
– Till Oct 2006: downloads > 1,000,000; average online users: 20,000; peak-time online users: 80,000; Google entries (CoolStreaming): 370,000
– CoolStreaming is the base technology for Roxbeam Corp., which launched live IPTV programs jointly with Yahoo Japan in October 2006
7
Popular Deployed Systems (Cont.)
• PPlive: P2P-based IPTV system
– 3.5 M subscribers in 2005
– 36.9 M subscribers predicted for 2009
– May 2006 – over 200 distinct online channels
• New direction: need to understand current systems better
– CMU, “Measurement of Commercial Peer-To-Peer Live Video Streaming”, on PPlive and SOPCast
– PolyTech, “A Measurement Study of a Large-Scale P2P IPTV System”, on PPlive
– UIUC, “Measurement of a Large-scale Overlay for Multimedia Streaming”, on PPLive
– HKUST, “An Empirical Study of the Coolstreaming System”, on Coolstreaming
• More to come …
8
Pull-based Streaming
• Almost all real-deployed P2P streaming systems are based on a pull-based protocol (data-driven/swarming)
• Basic idea
– Live media content is divided into segments, and every node periodically notifies its neighbors of what packets it has
– Each node explicitly requests the segments of interest from its neighbors according to these notifications
– Very similar to BitTorrent
• The well-acknowledged advantages
– Robustness and simplicity
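The notify/request cycle described above can be sketched in a few lines; this is a minimal illustration (class and method names are my own, not from any deployed system), with each round advertising a buffer map and pulling missing segments from whichever neighbor holds them:

```python
# Minimal sketch of one pull-based scheduling round (illustrative names,
# not the actual protocol implementation of any deployed system).

import random

class Node:
    def __init__(self, name):
        self.name = name
        self.have = set()        # segment IDs this node holds
        self.neighbors = []      # other Node objects
        self.buffer_maps = {}    # neighbor name -> advertised segment set

    def notify_neighbors(self):
        # Periodically tell every neighbor which segments we hold.
        for nb in self.neighbors:
            nb.buffer_maps[self.name] = set(self.have)

    def request_missing(self, window):
        # Request each missing segment in the window from a random neighbor
        # that advertises it -- no bandwidth measurement, no intelligent
        # scheduling, i.e. the "simplest" pull protocol.
        pulled = []
        for seg in window:
            if seg in self.have:
                continue
            holders = [nb for nb in self.neighbors
                       if seg in self.buffer_maps.get(nb.name, set())]
            if holders:
                sender = random.choice(holders)
                self.have.add(seg)   # assume the request succeeds
                pulled.append((seg, sender.name))
        return pulled

src, peer = Node("source"), Node("peer")
src.have = {0, 1, 2, 3}
src.neighbors, peer.neighbors = [peer], [src]
src.notify_neighbors()
print(peer.request_missing(window=range(4)))  # pulls segments 0..3 from source
```

The same notify-then-request loop, repeated every interval, is all the "simplest" pull protocol needs.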
9
Pull-based Streaming
• Current focus
– Design schemes to enhance the throughput
• A graceful characteristic of the pull-based protocol has not received enough attention
– The simplest pull-based protocol is nearly optimal
– In terms of bandwidth utilization and system throughput
– With appropriate protocol design and parameter settings
– Without any intelligent scheduling or bandwidth measurement
10
What to Deliver Through this Talk
• How good is the pull-based streaming protocol?
• Can we do even better?
• Any deployed real system to support the claim?
11
Simulation and Experiment Methodology
• In simulation, relying on two 4-CPU, 8 GB-memory machines, we are able to simulate
– 10,000-node sessions
– 300 kbps streaming rate
• In the real-world PlanetLab experiments, we use in total
– 409 nodes all over the world
– 300 kbps streaming rate
12
Metrics We Care About
• Capacity supply ratio:
– the total upload capacity among all peers / (the raw streaming rate (300 kbps) × the number of receivers)
– i.e., bandwidth supply / minimum bandwidth demand
• Deliverable rate:
– The available streaming rate received by the node (not counting redundant streaming packets and control packets)
• Delivery ratio:
– deliverable rate / packetized streaming rate (309 kbps)
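The three metrics follow directly from their definitions; a small Python sketch (function names are mine, the 300/309 kbps constants are from the slides):

```python
# Compute the talk's metrics from their definitions.
RAW_RATE = 300.0         # kbps, raw streaming rate
PACKETIZED_RATE = 309.0  # kbps, streaming rate after packetization overhead

def capacity_supply_ratio(upload_capacities_kbps, num_receivers):
    # bandwidth supply / minimum bandwidth demand
    supply = sum(upload_capacities_kbps)
    demand = RAW_RATE * num_receivers
    return supply / demand

def delivery_ratio(deliverable_rate_kbps):
    # deliverable rate / packetized streaming rate
    return deliverable_rate_kbps / PACKETIZED_RATE

# Example: 4 peers uploading 400 kbps each, serving 4 receivers.
print(capacity_supply_ratio([400] * 4, 4))  # 1600 / 1200, i.e. supply 33% above demand
print(delivery_ratio(309.0))                # 1.0 means every packet was delivered
```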
13
The Near Optimality of Pull-based Protocol
[Figure: (a) Bit rate (kbps) vs. capacity supply ratio — average deliverable rate, average download/upload rate, and average upload capacity. (b) Average delivery ratio and number of online nodes over the session duration (sec).]
(a) Deliverable rate by simulation with 10,000 nodes (b) PlanetLab experiment with 409 nodes driven by GridMedia trace
14
The Near Optimality of Pull-based Protocol
Note that when the capacity supply ratio is only 15% more than the minimum bandwidth demand, the deliverable rate can achieve the best streaming rate (309kbps)
15
Analysis
• Use the simplest packet scheduling
– No bandwidth measurement and no intelligent packet scheduling
– The packets being requested in one interval are evenly allocated to every sender
• Evaluate under different scenarios
n-to-1 scenario n-to-m scenario
n-hop scenario
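The even allocation used in this analysis can be sketched as a plain round-robin; a minimal illustration in Python (the function name is mine, not the authors' implementation):

```python
# Sketch of the "simplest" request allocation analyzed on this slide:
# the packets requested in one interval are spread evenly over the
# available senders, with no bandwidth measurement at all.

def allocate_evenly(packet_ids, senders):
    """Round-robin the requested packets over the senders."""
    schedule = {s: [] for s in senders}
    for i, pkt in enumerate(packet_ids):
        schedule[senders[i % len(senders)]].append(pkt)
    return schedule

sched = allocate_evenly(list(range(10)), ["A", "B", "C"])
print(sched)  # A gets packets 0,3,6,9; B gets 1,4,7; C gets 2,5,8
```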
16
Analysis (Cont.)
• The upload capacity utilization and system throughput will be very close to the optimal (with proper settings)
– Request window is larger than 20 sec, and
– Request interval is between 400 ms and 1 sec
[Figure: Delivery ratio vs. source-to-end time (sec) under different source-to-end delays (1 to 10 hops).]
17
Overhead and Delay
30 packets/sec (packet size: 1250 bytes) implies a dramatic number of control packets (even more than the streaming packets themselves)
The playback delay at which 90% of users reach a 99% delivery ratio is at least around 20 sec
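The 30 packets/sec figure follows directly from the streaming rate and packet size, which is what makes per-packet notifications and requests so costly; a quick check:

```python
# Sanity check of the slide's numbers: a 300 kbps stream cut into
# 1250-byte packets yields 30 data packets per second, so per-packet
# notify/request control traffic alone already rivals the data traffic.
rate_bps = 300_000      # 300 kbps streaming rate
packet_bytes = 1250
packets_per_sec = rate_bps / (packet_bytes * 8)
print(packets_per_sec)  # 30.0
```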
18
What to Deliver Through this Talk
• How good is the pull-based streaming protocol?
• Can we do even better?
• Any deployed real system to support the claim?
19
Hybrid Pull-Push Protocol
• The pull-based protocol has a tradeoff between control overhead and delay
– To minimize the delay
• A node notifies its neighbors of a packet arrival immediately
• Neighbors also request the packet immediately
• Result: remarkable control overhead
– To diminish the overhead
• A node can wait until dozens of packets have arrived before informing its neighbors
• Neighbors can also request a bunch of packets each time
• Result: considerable delay
20
Push-Pull Streaming Mechanism
• How to reduce the delay of the pull mechanism while keeping its advantages?
– Use the pull mechanism at startup
– Then packets are pushed directly from the neighbors whenever possible
– In each interval, every node subscribes to the packets to be pushed from its neighbors
– Packets lost during a push interval are recovered by the pull mechanism
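One interval of the push-pull idea can be sketched as below; an illustrative toy in Python (names and structure are my assumptions, not GridMedia's actual code):

```python
# Illustrative sketch of one push-pull interval: pushed packets arrive
# with low delay, and the pull mechanism only recovers the gaps the
# push path missed.

def push_pull_round(pushed, expected, have):
    """One interval of the hybrid mechanism.

    pushed:   packets the subscribed push sender actually delivered
    expected: packet IDs that should arrive this interval
    have:     set of packets already received (updated in place)
    Returns the packets that had to be recovered by pull.
    """
    have.update(pushed)                               # push path (low delay)
    to_pull = [p for p in expected if p not in have]  # detect losses
    have.update(to_pull)                              # pull path fills the gaps
    return to_pull

have = set()
# Push delivers packets 0, 1, 3 but loses packet 2; pull recovers it.
pulled = push_pull_round({0, 1, 3}, expected=range(4), have=have)
print(pulled)        # [2]
print(sorted(have))  # [0, 1, 2, 3]
```

Because most packets travel over the push path, control traffic drops to one subscription per interval plus the occasional recovery request.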
21
Performance
[Figure: (a) Bit rate (kbps) vs. capacity supply ratio — average deliverable rate, average download/upload rate, and average upload capacity. (b) Control packet rate (packets/sec) vs. capacity supply ratio — control packet rate of pull-push, control plus redundant packet rate of pull-push, and control packet rate of pull.]
(a) Deliverable rate of hybrid Pull-Push protocol (b) Control packet rate comparison
22
Performance
Note that when the capacity supply ratio is only 10% more than the minimum bandwidth demand, the deliverable rate can achieve the best streaming rate (309kbps)
The overhead of the hybrid pull-push protocol is much smaller than that of the pull-based protocol
23
Performance
Hybrid pull-push protocol on PlanetLab experiment
Significant delay reduction!
24
Insights Obtained
• The delivery ratio of the proposed push-pull hybrid protocol can reach 1, as long as the capacity supply is only 10% higher than the minimum bandwidth demand
• So there is little room left for other advanced techniques (e.g., network coding) to further improve throughput, provided the server has reasonable capacity
25
What to Deliver Through this Talk
• How good is the pull-based streaming protocol?
• Can we do even better?
• Any deployed real system to support the claim?
26
GridMedia
• Gridmedia is designed to support large-scale live video streaming over the global Internet
http://www.gridmedia.com.cn/
• The first generation: Gridmedia I
– Mesh-based multi-sender structure
– Combined with IP multicast
– First release: May 2004
• The second generation: Gridmedia II
– Unstructured overlay
– Push-pull streaming mechanism
– First release: Jan. 2005
27
Real Deployment
• Gala Evening for Spring Festival, 2005 and 2006
– Streaming server: dual-core Xeon server
– Video encoding rate = 300 kbps
– Maximum connections from the server
• 2005: 200
• 2006: 800
– Number of partners = about 10
– Buffer deadline = 20 s
• For the largest TV station in China (CCTV)
28
Performance Analysis
• Gala Evening for Spring Festival 2005
– More than 500,000 person-times in total; maximum concurrent users: 15,239
– Users from 66 countries, 78.0% from China
– Achieved about 76× (15,239/200 ≈ 76) capacity amplification over the bounded server outgoing bandwidth
[Figure: Number of concurrent online users from 21:00 to 0:00 (roughly 6,000–16,000). User distribution: China 78%, others 22%; within the non-China share: Canada 20%, USA 18%, UK 15%, Japan 13%, GM 6%, others 28%.]
29
Performance Analysis (Cont.)
• Gala Evening for Spring Festival 2006
– More than 1,800,000 person-times in total; maximum concurrent users: 224,453
– Users from 69 countries, 79.2% from China
– Achieved about 280× (224,453/800 ≈ 280) capacity amplification over the bounded server outgoing bandwidth
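The amplification figures on these two slides are simply the peak concurrent audience divided by the server's connection cap; a one-line check per year:

```python
# Capacity amplification = peak concurrent users / server connection cap.
# Figures are taken from the 2005 and 2006 Gala Evening deployments.
peak_2005, cap_2005 = 15_239, 200
peak_2006, cap_2006 = 224_453, 800
print(peak_2005 // cap_2005)  # 76  -> ~76x amplification in 2005
print(peak_2006 // cap_2006)  # 280 -> ~280x amplification in 2006
```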
[Figure: Number of concurrent online users from 20:00 to 1:00 (up to ~2.4 × 10^5). Non-China users came mainly from the USA, Canada, Japan, Australia, the UK, IANA-registered addresses, New Zealand, APNIC-registered addresses, and Singapore.]
30
Deployment Experience: Connection Heterogeneity
[Figure: Average streaming rate (incoming vs. outgoing, 0–350 kbps) from 20:30 to 23:00.]
• In 2005, about 60.8% of users were behind various types of NATs, while at least 16.0% of users (in China) accessed the Internet via DSL connections
• In 2006, about 59.2% of users were behind various types of NATs, while at least 14.2% of users (in China) accessed the Internet via DSL connections
An effective NAT traversal scheme should be carefully considered in the system design of P2P-based live streaming applications
31
– In 2005, nearly 50% of users spent less than 3 minutes and about 18% of users stayed active for more than 30 minutes
– In 2006, roughly 30% of users left the system within 3 minutes and more than 35% of users enjoyed the show for more than 30 minutes
– Peers with longer online durations are expected to have larger average remaining online times
[Figure: CDF of online time (sec) for 2005 and 2006; average remaining online time (sec) vs. online time.]
32
Taking online duration information into consideration when designing overlay structure or selecting upstream peers can improve system performance
33
[Figure: Request rate per 30 seconds from 23:10 to 0:00 — 2005 (0–4,000) and 2006 (0–4 × 10^4).]
Request rate per 30 seconds from 23:00 to 0:00 in 2005 and 2006
• The average request rate stayed in the hundreds in 2005 and in the thousands in 2006
• Occasionally the request rate spiked beyond 3,700 in 2005 and 32,000 in 2006
34
The high request rate and sporadic flash crowds pose great challenges to the reliability and stability of the RP server and the whole system
35
Conclusions and Future Directions
• The simplest pull-based P2P streaming is nearly optimal in throughput and upload capacity utilization
– Conducted detailed simulation, experiments, and mathematical analysis to estimate the lower bound of the delivery ratio
• Proposed a novel push-pull hybrid protocol
– Nearly optimal throughput and bandwidth utilization
– Far lower playback delay and much smaller overhead
• The GridMedia system has been adopted by the largest TV station in China (CCTV) for online TV broadcasting
– Supported over 220,000 concurrent online users watching high-quality Internet TV with only one server
36
Conclusions and Future Directions
• Throughput improvement should not be the only key focus
• Interesting future directions
– Minimize ISP core-network and cross-ISP traffic
• Evaluate the impact of pull/push-pull protocols on link stress
• Use proxy caches and locality-aware techniques to relieve the link stress
– Server bandwidth reduction
• How to let home users broadcast video with high quality?
– Real Internet environment
• Connections across the peering links between ISPs have low rates
• NATs/firewalls prevent end hosts from connecting with each other
37
Acknowledgement
Meng ZHANG, Yun TANG, Ji-Guang LUO, Shiqiang YANG from Tsinghua University
Q&A?
Thanks!
Contact Info: [email protected]