Analysis of Movie Replication and Benefits of Coding in P2P VoD
23/4/21 CUHK
Analysis of Movie Replication and Benefits of Coding in P2P VoD
Yipeng Zhou
Aug 29, 2012
Outline
Movie Replication
  Introduction
  Problem Formulation
  Analysis of Scheduling Algorithm
  Simulation Results
Benefits of Coding for VoD
  Background
  Analysis
  Simulation Results
Conclusion
Introduction
Objective is to minimize server load by optimizing movies replicated by different peers.
Practical System:
PPTV
PPStream
UUSee
Challenge:
How to organize peers to share content? Scheduling
How to place the right content on peers? Replication
Related Work
Scheduling strategy and movie replication strategy are not analyzed separately.
Not covered
Topology: Any pair of peers can talk with each other. However, the number of peers communicating simultaneously is limited.
No Coding: Only a complete copy is replicated by a peer, to simplify model complexity.
Assumption
To simplify analysis, we assume:
Homogeneous movies.
Homogeneous peers (same upload capacity and storage).
Total peers' uplink capacity is equal to total demand.
View-upload decoupling.
No start-up delay; buffer is not considered.
User Behavior Model
N users continuously generate N viewing requests.
Closed queuing network model [D. Wu et al, Infocom'09 best paper]:
N users, continuously watching movies.
Select a movie, watch for a random period.
After viewing a movie, select another movie based on the transition probability matrix.
By solving a fixed-point equation, derive the stationary popularity of movies.
Relative popularity: $\eta_j$ for movie $j$, with $\sum_{j=1}^{K} \eta_j = 1$.
The peer population viewing movie j follows a Binomial distribution.
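The fixed-point step above can be sketched as follows. This is only an illustration: the 3-movie transition matrix `P`, the user count, and the function name `stationary_popularity` are hypothetical, and viewing durations are assumed to have the same mean for every movie, so the stationary popularity is simply the solution of pi = pi P.

```python
def stationary_popularity(P, tol=1e-12, max_iter=10_000):
    """Solve pi = pi * P by fixed-point (power) iteration."""
    k = len(P)
    pi = [1.0 / k] * k  # start from uniform popularity
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical transition probabilities between 3 movies (each row sums to 1).
P = [
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
]
popularity = stationary_popularity(P)

N = 1000  # number of users (assumed)
# Mean of the Binomial(N, popularity[j]) viewer population of each movie.
expected_viewers = [N * p for p in popularity]
```

With N users choosing independently, the number of viewers of movie j is Binomial(N, popularity[j]), which is where the binomial peer population comes from.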
Movie Popularity
Zipf distribution is used for movie popularity. All movies are ranked in descending order of popularity.
Θ is a parameter in the range [0.271, 1]. [N. Venkatasubramanian et al, ICDCS 97]
Θ is a key parameter.
Solution: Derive a bound on server load that ignores the effect of Θ, without considering the long tail.
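A minimal sketch of Zipf popularity over ranked movies. The parameterization p_j proportional to 1 / j^(1 - Θ) is an assumption (a form commonly used in VoD studies, under which Θ = 1 gives uniform popularity and Θ = 0.271 a skewed one); the function name `zipf_popularity` is illustrative.

```python
def zipf_popularity(K, theta):
    """Relative popularity of K ranked movies, assuming p_j ~ 1 / j**(1 - theta)."""
    weights = [1.0 / (j ** (1.0 - theta)) for j in range(1, K + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # normalized so the values sum to 1

skewed = zipf_popularity(K=400, theta=0.271)   # strongly skewed popularity
uniform = zipf_popularity(K=400, theta=1.0)    # every movie equally popular
```

Sweeping `theta` between the two ends of the range shows how much the head of the ranking dominates demand, which is why a bound that does not depend on Θ is convenient.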
Formulation
Qi is the set of movies replicated by peer i. L is the storage size of each peer.
Xj is a random variable denoting the bandwidth that peers watching movie j receive from the P2P system.
Xj is determined by the request scheduling strategy and the replication strategy.
Formulation Cont.
It is still difficult to minimize the weighted variance. Fortunately, we can bound the average server load.
Objective: balanced BW allocation.
[Fig. 1 and Fig. 2: Xj versus time, with the playback rate and the server load marked.]
Request Scheduling Strategy
Fixed BW allocation (FBA)
Fair Sharing (FS)
FBA
A virtual super server can be used to derive the average server load, as the figure shows.
[Figure: super server]
Replication strategy: Proportional (to popularity) in homogeneous network.
It is easy to calculate the bandwidth allocated to a particular movie.
[D. Wu et al, Infocom mini 09]
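FBA Cont.
Bandwidth allocated to movie j, proportional to movie popularity (Sj is the set of peers storing movie j):
$$\mathrm{Capacity}_j = \sum_{i \in S_j} \frac{U_i}{L} = N_j$$
Expected number of requests for movie j:
$$E[\#\mathrm{Req}_j] = N_j$$
Server load is:
$$B = \sum_{j=1}^{K} \sum_{q > \mathrm{Capacity}_j} (q - \mathrm{Capacity}_j) \cdot \Pr(\#\mathrm{Req}_j = q)$$
where $\#\mathrm{Req}_j$ follows a Binomial distribution.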
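The FBA server load can be evaluated numerically. The sketch below assumes the number of requests for movie j is Binomial(N, eta_j), where eta_j is the relative popularity, that the playback rate is normalized to 1, and that proportional replication gives movie j a capacity of eta_j * N. The function names and the example numbers are illustrative.

```python
import math

def binom_pmf(n, p, q):
    """Pr(Binomial(n, p) = q), evaluated in log space to avoid overflow."""
    if p <= 0.0:
        return 1.0 if q == 0 else 0.0
    if p >= 1.0:
        return 1.0 if q == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(q + 1) - math.lgamma(n - q + 1)
               + q * math.log(p) + (n - q) * math.log(1.0 - p))
    return math.exp(log_pmf)

def fba_server_load(N, popularity):
    """B = sum_j sum_{q > Capacity_j} (q - Capacity_j) * Pr(#Req_j = q)."""
    load = 0.0
    for eta in popularity:
        capacity = eta * N  # proportional replication (assumed): Capacity_j = eta_j * N
        for q in range(math.floor(capacity) + 1, N + 1):
            load += (q - capacity) * binom_pmf(N, eta, q)
    return load

# Hypothetical example: 400 users, 40 equally popular movies.
B = fba_server_load(N=400, popularity=[1.0 / 40] * 40)
```

Only the demand in excess of each movie's fixed allocation reaches the server, so the load is driven by the spread of the binomial demand around its mean.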
PFS and FSFD
Both perfect fair sharing (PFS) and fair sharing with fixed degree (FSFD) are special cases of FS.
PFS: When a peer wants to stream movie j, it sends out sub-requests to all peers storing movie j to fetch parts of that movie. When serving other peers, a peer treats all sub-requests the same.
FSFD: When a peer wants to stream movie j, it sends out sub-requests to exactly y peers that store movie j.
PFS
Xj(i) is the random variable denoting the BW received by sending a sub-request to peer i for movie j.
The number of sub-requests received by peer i in PFS is:
We use the Poisson distribution as an approximation of the Binomial distribution.
The distribution of Xj(i) is:
$$\Pr\left[X_j(i) = \frac{U_i}{k}\right] = \Pr(\#\mathrm{req}_i = k)$$
We can derive the expected value and variance of Xj(i).
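The Poisson approximation used for the number of sub-requests at a peer can be checked numerically. This is a sketch with hypothetical numbers (1000 potential requesters, each sending a sub-request to peer i with probability 0.004); it compares the Binomial(n, p) probabilities with Poisson(np).

```python
import math

def binomial_pmf(n, p, k):
    """Pr(Binomial(n, p) = k)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(lam, k):
    """Pr(Poisson(lam) = k)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.004  # hypothetical values
lam = n * p         # matching Poisson mean

# Largest pointwise gap between the two distributions over k = 0..29.
max_gap = max(abs(binomial_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(30))
```

The gap shrinks as n grows with np held fixed, which is the regime of many peers each contributing a small share of sub-requests.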
PFS Cont.
The variance of Xj
The correlation determines total variance.
The distribution of Xj(i) depends on the number of sub-requests received by peer i.
The number of sub-requests received by peer i depends on Qi
It is very complicated to get the distribution of Xj
PFS Worst Case
Correlation equal to 1 means that peers form K/L clusters. In each cluster, all peers store the same movie set. The movie set is randomly selected from the whole movie set.
The received requests are the same for all peers in the same cluster. A cluster behaves like a super server, so the server load can be derived exactly.
Cluster 1 stores movies 1, 2, ..., L
Cluster 2 stores movies L+1, L+2, ..., 2L
...
Cluster K/L stores movies K-L+1, ..., K
$R_1 = R_2 = \dots = R_{K/L}$
$H = NL/K$
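The worst case can be evaluated directly once each cluster is treated as a super server. The sketch below rests on several assumptions not spelled out here: movies are equally popular, each peer has unit upload capacity, the playback rate is 1, and every cluster holds H = NL/K peers, so demand on a cluster is Binomial(N, L/K) against a capacity of H. Function names are illustrative.

```python
import math

def binom_pmf(n, p, q):
    """Pr(Binomial(n, p) = q), evaluated in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(q + 1) - math.lgamma(n - q + 1)
               + q * math.log(p) + (n - q) * math.log(1.0 - p))
    return math.exp(log_pmf)

def worst_case_load(N, K, L):
    """Server load when peers form K/L clusters that each act as a super server."""
    clusters = K // L
    H = N * L / K  # peers per cluster, also the cluster's upload capacity
    p = L / K      # chance that a request targets a given cluster (uniform popularity)
    shortfall = sum((d - H) * binom_pmf(N, p, d)
                    for d in range(math.floor(H) + 1, N + 1))
    return clusters * shortfall

K, L = 200, 4  # K/L = 50 clusters
# Ratio of the computed load to sqrt(N*K/L) for growing N.
ratios = [worst_case_load(N, K, L) / math.sqrt(N * K / L)
          for N in (1000, 4000, 16000)]
```

Under these assumptions the ratio stays roughly constant as N grows, so the worst-case load scales like sqrt(NK/L).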
PFS Best Case
The upper bound is achieved when all peers have the same load λi and the bandwidth from different peers is independent.
$\lambda_1 = \lambda_2 = \dots = \lambda_N$, $H = NL/K$
The Xj(i) are independent and identically distributed for different i. The Normal distribution is used as an approximation of Xj.
The required server load to support one peer is:
The total server load is:
Random Load Balancing Algorithm
Initialization
To minimize correlation
To balance bandwidth allocation
Bj = E[Xj]
FSFD
Each peer sends out exactly y sub-requests to randomly selected peers replicating the target movie.
Similar to PFS, the expected BW received from one sub-request is:
$$E[X_j(i)] = \frac{1}{\lambda_i}$$
The proportional replication strategy achieves balanced bandwidth allocation, since λi = y:
$$E[X_j] = y \cdot E[X_j(i)] = 1$$
[J. Wu et al, Infocom mini 2009] [K. Suh et al, JSAC 2007]
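Why proportional replication gives λi = y can be checked with a small calculation. The sketch assumes that movie j attracts N * eta_j viewers on average (eta_j is its relative popularity), that each viewer spreads y sub-requests uniformly over the replicas of that movie, and that replica counts under proportional replication are eta_j * N * L. All names and numbers are illustrative.

```python
def expected_subrequests(stored_movies, popularity, replicas, N, y):
    """lambda_i = sum over movies j stored by peer i of (N * eta_j) * y / |S_j|."""
    return sum(N * popularity[j] * y / replicas[j] for j in stored_movies)

N, L, y = 1200, 2, 3
popularity = [0.5, 0.25, 0.125, 0.125]  # hypothetical relative popularity

# Proportional replication: replica count of movie j is eta_j * N * L.
proportional = [eta * N * L for eta in popularity]
# Baseline for contrast: the same total storage split evenly across movies.
uniform = [N * L / len(popularity)] * len(popularity)

lam_prop = expected_subrequests([0, 3], popularity, proportional, N, y)
lam_unif = expected_subrequests([0, 1], popularity, uniform, N, y)
```

With proportional replication every stored movie contributes y / L sub-requests, so any peer storing L movies sees λi = y. With even replication, a peer holding popular movies is overloaded (4.5 instead of 3 in this example).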
FSFD Worst Case
The received requests are perfectly correlated for all peers in the same cluster. A cluster behaves like a super server, so the server load can be derived exactly.
Cluster 1 stores movies 1, 2, ..., L
Cluster 2 stores movies L+1, L+2, ..., 2L
...
Cluster K/L stores movies K-L+1, ..., K
Here, the difference from PFS is that each peer sends only y sub-requests instead of sending sub-requests to all peers.
FBA, PFS vs FSFD
Scheduling Strategy    Optimal Replication Strategy
FBA                    Proportional
PFS                    RLB
FSFD                   Proportional
H = NL/K, which is the average storage resource.
FSBD
Fair sharing with bounded degree (FSBD): when a peer wants to stream movie j, it sends out at most Y sub-requests to randomly selected peers that store movie j.
Balanced BW allocation is equivalent to E[Xj] = 1.
Nk is the expected peer population viewing movie k.
FSBD Worst Case
The worst case is similar to the worst case of PFS, but there are two types of clusters.
In a type I cluster: y = Y, similar to FSFD.
In a type II cluster: y = number of peers, similar to PFS.
[Figure: an example with Y = 3, showing requests and sub-requests in a type I cluster and in type II clusters.]
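The bounded out-degree rule can be written in a few lines. This is a sketch: `pick_targets` is a hypothetical helper that chooses which peers receive sub-requests, sending Y of them when enough holders exist (type I) and one to every holder otherwise (type II).

```python
import random

def pick_targets(holders, Y, rng):
    """Choose the peers that receive sub-requests under FSBD (at most Y of them)."""
    y = min(Y, len(holders))       # actual out-degree
    return rng.sample(holders, y)  # y distinct holders, chosen uniformly at random

rng = random.Random(0)
type_one = pick_targets(list(range(10)), 3, rng)  # 10 holders, Y = 3: y = Y
type_two = pick_targets([7, 8], 3, rng)           # 2 holders, Y = 3: y = 2
```

Setting Y to 1 or to the number of holders recovers the fixed-degree and perfect-fair-sharing behaviours, respectively.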
FSBD Cont.
$$\sum_{i=1}^{K/L} R_i = N$$
Ri is the peer population of cluster i.
Type I, Type II
B is maximized when γ = 1.
FSBD Cont.
Performance comparison of FSBD with FSFD and PFS
The next question: design a replication strategy that works regardless of the bound on out-degree, i.e. Y.
DAR Algorithm
Bound Validation of PFS
N = 10000, fixed ratio K/L = 50, homogeneous movie popularity and peer uplink bandwidth.
[Figure labels: COV 0; B = O(K/L); B = O(Sqrt(NK/L)); COV 1]
Model Validation
N=4000, K=400, L=4
[Figure curves: FBA, Bound of PFS, FSFD]
FSBD
N=4000, K=400, L=4
[Figure curves: DAR, ARLB, Proportional]
Outline
Movie Replication
  Introduction
  Problem Formulation
  Analysis of Scheduling Algorithm
  Simulation Results
Benefits of Coding for VoD
  Background
  Analysis
  Simulation Results
Conclusion
Background
For P2P, the number of helpers equals the number of peers.
Previous Work
[F. Liu et al, Infocom'11] adopts RS Coding.
[Y. Kao et al, TPDS'11] adopts Network Coding.
Model & Assumption
To simplify analysis, we assume:
Perfect view-upload decoupling.
Enough randomly selected neighbors.
Limited downloading.
No encoding or decoding overhead.
Discrete time slots.
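Model with d=1
Helper Selection: FF Selection, Greedy Selection
[Figure: buffer map with chunk positions 1 to 8 and playback at position 8, some positions marked X; shown for the Greedy strategy and for the FF strategy.]
$$p(8) = \frac{1}{N} \sum_{i=1}^{N} I_i(8)$$
Performance depends on p(n). Streaming cost is 1-p(n).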
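A small sketch of how p(n) and the streaming cost relate, assuming p(n) is the fraction of peers whose buffer holds the chunk at position n when it is due for playback. The four buffer maps and the function name `availability` are made up for illustration (1 = chunk present, 0 = missing).

```python
def availability(buffer_maps, n):
    """p(n): fraction of peers whose buffer holds the chunk at position n (1-based)."""
    return sum(1 for bm in buffer_maps if bm[n - 1]) / len(buffer_maps)

# Hypothetical buffer maps of 4 peers over positions 1..8 (position 8 = playback).
buffer_maps = [
    [1, 0, 0, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 1, 1, 1, 1, 1],
]

p8 = availability(buffer_maps, 8)
streaming_cost = 1 - p8  # share of playback chunks that must come from the server
```

Any selection strategy that raises p(n) at the playback position lowers the streaming cost by the same amount.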
Main Result
Proposition 1: In a P2P system with perfect view-upload decoupling, the Greedy strategy is always the optimal strategy to maximize p(n, d).
Proposition 2: For two coding schemes using the Greedy strategy with block sizes d1 and d2, if d1 < d2 and d2 is divisible by d1, the streaming cost for coding scheme d2 is smaller than that for d1.
There is a tradeoff between streaming cost and movie replication cost.
Simulation
Helpers are assumed to have stored necessary encoded chunks.
Streaming cost decreases with d
Simulation Cont.
A scenario with a new movie: no helper replicates the new movie.
Two ways for new movie replication:
1. Pushed from server.
2. Distributed among helpers.
Conclusion
We use a new approach to analyze three kinds of request scheduling strategies.
Real-world systems are likely to be in between fair sharing (with some fixed degree) and perfect fair sharing. Therefore, we propose a novel FSBD model with varying out-degree. This allows us to illustrate the effect of out-degree in request scheduling.
We use a simple mean-field stochastic model to analyze the benefits of adopting coding for movie replication.
The end
Thank you
Q & A