Modeling dynamics of file diffusion behaviour


    Z. Du (Ed.): Proceedings of the 2012 International Conference of MCSA, AISC 191, pp. 117–122.

    springerlink.com © Springer-Verlag Berlin Heidelberg 2013

    Modeling Dynamic File Diffusion Behavior

    in P2P Networks

    Baogang Chen1 and Jinlong Hu2 

    1 College of Information and Management Science, Henan Agriculture University, Zhengzhou, China, 450002

    2 Communication and Computer Network Laboratory of Guangdong Province, South China University of Technology, Guangzhou, China, 510641

    Abstract. Based on the characteristics of downloading and transmitting popular files, this paper studies the states that nodes pass through in a P2P file-sharing system. From these node states, a file diffusion model for P2P file-sharing systems is derived from a multi-species epidemic model with spatial dynamics. Experiments show that the model agrees with the observed data and can simulate the behavior of peers in a P2P network.

    Keywords: P2P Networks, file diffusion, SEIR epidemic model, spatial

    dynamics.

    1 Introduction

    In a P2P file-sharing system, the downloading of a popular file by a large number of users resembles the spread of an infectious disease and can therefore be described with the dynamics of infectious diseases. In medicine, models of infectious disease propagation have long been studied; they are effective tools for analyzing how a disease spreads and for predicting outbreaks.

    Most existing research on file replication and diffusion considers only the steady-state performance of P2P networks and ignores the transient states of the file diffusion process [1,2], or does not fully reflect the states of the nodes in the system [3,4]. This paper draws on the theory of infectious disease dynamics and examines the states of user nodes in a P2P file-sharing system comprehensively. On this basis, a new model of file diffusion in P2P file-sharing systems is proposed.

    2 Dynamic Model of Infectious Diseases

    In 1927, Kermack and McKendrick proposed the well-known SIR compartment model in their study of epidemics [5]. Later, the SEIR epidemic model with a latent period was developed on this basis. In the SEIR model, the population is divided


    into four groups: susceptible individuals, denoted by S; infectious individuals, denoted by I; exposed individuals, denoted by E, who have been infected but are still in an incubation period during which they are not contagious; and recovered individuals, denoted by R.

    In 2005, Julien Arino et al. proposed a disease transmission model with spatial dynamics [6]. An SEIR epidemic model with spatial dynamics is considered for a population consisting of $s$ species and occupying $n$ spatial patches. The total population of species $i$ in patch $p$ is $N_{ip}$, and the total population of species $i$ is $N_i^0 > 0$, a fixed constant. At time $t$, the numbers of susceptible, exposed, infectious and recovered individuals of species $i$ in patch $p$ are denoted by $S_{ip}$, $E_{ip}$, $I_{ip}$ and $R_{ip}$, respectively. $1/d_{ip} > 0$, $1/\omega_{ip} > 0$ and $1/\gamma_{ip} > 0$ are the average lifetime, latent period and infectious period of species $i$ in patch $p$, respectively. The disease is assumed to be horizontally transmitted within and between species according to standard incidence, with $\beta_{ijp} \ge 0$ the rate of disease transfer from species $j$ to species $i$ in patch $p$. Travel between patches is described by $m_{ipq} \ge 0$, the rate at which individuals of species $i$ move from patch $q$ to patch $p$. The dynamics for species $i = 1, \dots, s$ in patch $p = 1, \dots, n$ are given by the following system of $4sn$ equations:

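    $$\frac{dS_{ip}}{dt} = d_{ip}(N_{ip} - S_{ip}) - \sum_{j=1}^{s} \beta_{ijp} S_{ip} \frac{I_{jp}}{N_{jp}} + \sum_{q=1}^{n} m_{ipq} S_{iq} - \sum_{q=1}^{n} m_{iqp} S_{ip} \tag{1}$$

    $$\frac{dE_{ip}}{dt} = \sum_{j=1}^{s} \beta_{ijp} S_{ip} \frac{I_{jp}}{N_{jp}} - (d_{ip} + \omega_{ip}) E_{ip} + \sum_{q=1}^{n} m_{ipq} E_{iq} - \sum_{q=1}^{n} m_{iqp} E_{ip} \tag{2}$$

    $$\frac{dI_{ip}}{dt} = \omega_{ip} E_{ip} - (d_{ip} + \gamma_{ip}) I_{ip} + \sum_{q=1}^{n} m_{ipq} I_{iq} - \sum_{q=1}^{n} m_{iqp} I_{ip} \tag{3}$$

    $$\frac{dR_{ip}}{dt} = \gamma_{ip} I_{ip} - d_{ip} R_{ip} + \sum_{q=1}^{n} m_{ipq} R_{iq} - \sum_{q=1}^{n} m_{iqp} R_{ip} \tag{4}$$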
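The right-hand sides of equations (1)-(4) can be sketched in plain Python as follows. This is only an illustrative sketch: the function name `seir_patch_rhs`, the nested-list layout and all numeric values are assumptions made for the example, and $N_{ip}$ is taken to be the current sum of the four compartments in each patch.

```python
def seir_patch_rhs(S, E, I, R, d, omega, gamma, beta, m):
    """Right-hand sides of equations (1)-(4).

    S, E, I, R, d, omega, gamma are indexed [species][patch];
    beta[i][j][p] is the transfer rate from species j to species i in patch p;
    m[i][p][q] is the travel rate of species i from patch q to patch p.
    """
    s, n = len(S), len(S[0])
    # Current population N_ip of each species in each patch.
    N = [[S[i][p] + E[i][p] + I[i][p] + R[i][p] for p in range(n)]
         for i in range(s)]

    def travel(X, i, p):
        # Net travel term: sum_q m_ipq X_iq - sum_q m_iqp X_ip.
        inflow = sum(m[i][p][q] * X[i][q] for q in range(n))
        outflow = sum(m[i][q][p] for q in range(n)) * X[i][p]
        return inflow - outflow

    dS = [[0.0] * n for _ in range(s)]
    dE = [[0.0] * n for _ in range(s)]
    dI = [[0.0] * n for _ in range(s)]
    dR = [[0.0] * n for _ in range(s)]
    for i in range(s):
        for p in range(n):
            # Standard incidence: sum_j beta_ijp * S_ip * I_jp / N_jp.
            infection = S[i][p] * sum(
                beta[i][j][p] * I[j][p] / N[j][p]
                for j in range(s) if N[j][p] > 0
            )
            dS[i][p] = (d[i][p] * (N[i][p] - S[i][p]) - infection
                        + travel(S, i, p))
            dE[i][p] = (infection - (d[i][p] + omega[i][p]) * E[i][p]
                        + travel(E, i, p))
            dI[i][p] = (omega[i][p] * E[i][p]
                        - (d[i][p] + gamma[i][p]) * I[i][p]
                        + travel(I, i, p))
            dR[i][p] = (gamma[i][p] * I[i][p] - d[i][p] * R[i][p]
                        + travel(R, i, p))
    return dS, dE, dI, dR


# Hypothetical example: one species, two patches (all values are made up).
S = [[990.0, 500.0]]
E = [[0.0, 0.0]]
I = [[10.0, 0.0]]
R = [[0.0, 0.0]]
d = [[0.01, 0.01]]
omega = [[0.2, 0.2]]
gamma = [[0.1, 0.1]]
beta = [[[0.5, 0.5]]]
m = [[[0.0, 0.02], [0.03, 0.0]]]

dS, dE, dI, dR = seir_patch_rhs(S, E, I, R, d, omega, gamma, beta, m)
# Births balance deaths and travel only moves individuals between patches,
# so the total population of the species should not change.
total_change = sum(dS[0]) + sum(dE[0]) + sum(dI[0]) + sum(dR[0])
print(total_change)
```

Summing the four derivatives over all patches gives a value that is zero up to rounding, which reflects the assumption that the total population of each species is a fixed constant.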

    3 Construction of the P2P File Diffusion Model

    In a P2P file-sharing system, a user node is in the susceptible state before it searches for the file and issues a download request; this state is denoted by class W. After issuing the request, the node enters the download queue; this state corresponds to the latent period and is denoted by class D. After completing the download, the node shares the file for a period of time; this state corresponds to infection and is denoted by class S. When the user is no longer interested in the file and deletes it, or does not share the file after downloading it, the node is in the recovered state, denoted by class I. Note that S and I here name the sharing and recovered classes, not the S and I compartments of the SEIR model.

    Let X be any class of nodes, and let $P_X$ denote the number of nodes in class X. A user node in the system is either online or offline and can switch between these two states. Therefore, all nodes within the system are divided


    into four different compartments, and each compartment has an online and an offline subclass, such as $W_{on}$ and $W_{off}$.

    In the following, the file diffusion model is obtained by analyzing how the number of nodes in each class changes.

    (1) Rate of change of class $W_{on}$

    A node of class $W_{on}$ moves to class $D_{on}$ when it searches for the file and issues a download request. Assume that the rate at which a file is queried is proportional to the ratio of the number of nodes currently sharing the file to the total number of nodes in the system, and let $\lambda$ be the average rate at which a user sends file queries and download requests. By analogy with the standard incidence rate $\beta S I / N$ of infectious diseases, nodes of class $W_{on}$ move to class $D_{on}$ at rate $\lambda P_{W_{on}} P_{S_{on}} / N_P$, where $N_P$ is the total number of nodes. Meanwhile, nodes that go offline reduce the number of online nodes: if the rate of going offline is $\lambda_{on-off}$, nodes of class $W_{on}$ move to class $W_{off}$ at rate $\lambda_{on-off} P_{W_{on}}$. Downloading nodes that lose their download source are forced to look for a new one, so they return from class $D_{on}$ to class $W_{on}$; let this transfer occur at rate $r_1$. Finally, nodes that switch from offline to online increase the number of nodes in class $W_{on}$; this conversion occurs at rate $\lambda_{off-on}$. The rate of change of class $W_{on}$ can then be expressed as:

    $$\frac{dP_{W_{on}}}{dt} = -\lambda \frac{P_{W_{on}} P_{S_{on}}}{N_P} - \lambda_{on-off} P_{W_{on}} + r_1 P_{D_{on}} + \lambda_{off-on} P_{W_{off}} \tag{5}$$

    (2) Rate of change of class $D_{on}$

    Nodes leave class $D_{on}$ in four situations: nodes terminate the download and return to class $W_{on}$ because the source nodes no longer share the file; nodes enter class $S_{on}$ because they share the file after downloading it; nodes enter class $I_{on}$ because they do not share the file after downloading it; and nodes switch from online to offline. Let the rate from class $D_{on}$ to class $W_{on}$ be $r_1$, the download rate be $\mu$, and the probability of sharing the file be $p_{share}$. Then nodes of class $D_{on}$ enter class $S_{on}$ at rate $p_{share}\mu$ and enter class $I_{on}$ at rate $(1 - p_{share})\mu$. At the same time, the number of nodes in class $D_{on}$ increases in two cases: nodes move from class $W_{on}$ to class $D_{on}$, and nodes of class $D_{off}$ switch from


    offline to online. The flow from $W_{on}$ to $D_{on}$ is $\lambda P_{W_{on}} P_{S_{on}} / N_P$, and the flow from offline to online is $\lambda_{off-on} P_{D_{off}}$. The rate of change of class $D_{on}$ can then be expressed as:

    $$\frac{dP_{D_{on}}}{dt} = \lambda \frac{P_{W_{on}} P_{S_{on}}}{N_P} - \mu P_{D_{on}} - r_1 P_{D_{on}} - \lambda_{on-off} P_{D_{on}} + \lambda_{off-on} P_{D_{off}} \tag{6}$$

    (3) Rate of change of class $S_{on}$

    Nodes leave class $S_{on}$ in two situations: they stop sharing the file, or they switch from online to offline. Let the average time for which a node shares the file be $1/\delta$, so that nodes of class $S_{on}$ enter class $I_{on}$ at rate $\delta$. Meanwhile, the number of nodes in class $S_{on}$ increases in two cases: nodes of class $D_{on}$ enter class $S_{on}$, and nodes of class $S_{off}$ switch from offline to online. The transition from class $D_{on}$ to class $S_{on}$ occurs at rate $p_{share}\mu$, and the flow from class $S_{off}$ to class $S_{on}$ is $\lambda_{off-on} P_{S_{off}}$. The rate of change of class $S_{on}$ is then given by:

    $$\frac{dP_{S_{on}}}{dt} = p_{share}\,\mu P_{D_{on}} - \delta P_{S_{on}} - \lambda_{on-off} P_{S_{on}} + \lambda_{off-on} P_{S_{off}} \tag{7}$$

    (4) Rate of change of class $I_{on}$

    Nodes of class $D_{on}$ that choose not to share the downloaded file enter class $I_{on}$ directly at rate $(1 - p_{share})\mu$. Meanwhile, nodes of class $S_{on}$ stop sharing the file at rate $\delta$. Together with the transitions between the online and offline states, the rate of change of class $I_{on}$ can be expressed as:

    $$\frac{dP_{I_{on}}}{dt} = \delta P_{S_{on}} + (1 - p_{share})\,\mu P_{D_{on}} - \lambda_{on-off} P_{I_{on}} + \lambda_{off-on} P_{I_{off}} \tag{8}$$

    (5) Rate of change of the offline classes

    Offline nodes fall into four classes: $W_{off}$, $D_{off}$, $S_{off}$ and $I_{off}$. The transition rates from online to offline, and from offline to online, are assumed to be the same for all classes. So we have

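    $$\frac{dP_{W_{off}}}{dt} = \lambda_{on-off} P_{W_{on}} - \lambda_{off-on} P_{W_{off}} \tag{9}$$

    $$\frac{dP_{D_{off}}}{dt} = \lambda_{on-off} P_{D_{on}} - \lambda_{off-on} P_{D_{off}} \tag{10}$$

    $$\frac{dP_{S_{off}}}{dt} = \lambda_{on-off} P_{S_{on}} - \lambda_{off-on} P_{S_{off}} \tag{11}$$

    $$\frac{dP_{I_{off}}}{dt} = \lambda_{on-off} P_{I_{on}} - \lambda_{off-on} P_{I_{off}} \tag{12}$$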
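Equations (5)-(12) form a closed system of eight ordinary differential equations, which can be sketched in Python as below. This is a minimal sketch under stated assumptions: the names `Params`, `p2p_rhs` and `integrate` are hypothetical, the forward Euler scheme is simply a convenient choice (the paper does not state its integration method), and the value of $\mu$ and the initial state are made-up placeholders. The remaining parameter values are taken from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    lam: float         # lambda: rate of file queries and download requests
    mu: float          # download rate
    r1: float          # rate at which downloaders return to W_on
    delta: float       # rate at which sharing nodes stop sharing
    p_share: float     # probability of sharing after downloading
    lam_on_off: float  # online -> offline rate
    lam_off_on: float  # offline -> online rate
    n_total: float     # N_P, total number of nodes


def p2p_rhs(y, prm):
    """Right-hand sides of equations (5)-(12).

    y = (W_on, D_on, S_on, I_on, W_off, D_off, S_off, I_off).
    """
    w_on, d_on, s_on, i_on, w_off, d_off, s_off, i_off = y
    request = prm.lam * w_on * s_on / prm.n_total  # flow W_on -> D_on
    a, b = prm.lam_on_off, prm.lam_off_on
    return (
        -request - a * w_on + prm.r1 * d_on + b * w_off,       # (5)
        request - (prm.mu + prm.r1 + a) * d_on + b * d_off,    # (6)
        prm.p_share * prm.mu * d_on
        - (prm.delta + a) * s_on + b * s_off,                  # (7)
        prm.delta * s_on + (1 - prm.p_share) * prm.mu * d_on
        - a * i_on + b * i_off,                                # (8)
        a * w_on - b * w_off,                                  # (9)
        a * d_on - b * d_off,                                  # (10)
        a * s_on - b * s_off,                                  # (11)
        a * i_on - b * i_off,                                  # (12)
    )


def integrate(y0, prm, dt, steps):
    """Forward Euler integration (an assumed, deliberately simple scheme)."""
    y = tuple(y0)
    for _ in range(steps):
        y = tuple(v + dt * dv for v, dv in zip(y, p2p_rhs(y, prm)))
    return y


# Table 1 values; mu = 0.01 is a hypothetical placeholder.
prm = Params(lam=0.00612, mu=0.01, r1=1.98e-4, delta=1.98e-4,
             p_share=0.122, lam_on_off=0.00138, lam_off_on=0.00138,
             n_total=100068)
# Hypothetical initial state that sums to N_P.
y0 = (49900.0, 0.0, 134.0, 0.0, 49900.0, 0.0, 134.0, 0.0)
# 156 hours in one-minute steps, assuming the rates are per minute.
y_end = integrate(y0, prm, dt=1.0, steps=156 * 60)
print(sum(y0), sum(y_end))
```

Every outflow term in one equation appears as an inflow term in another, so the eight derivatives sum to zero and the total number of nodes stays at $N_P$. Checking this is a quick way to catch a sign error in a reimplementation.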


    4 Experiment and Analysis

    The model assumes that all user nodes are initially interested in a particular file. We therefore select the seven most-downloaded RMVB files in the MAZE log and name them A, B, …, G.

    Because users are assumed to be interested in the files from the start, the download request interval is taken to be the average time interval between all users' file download requests. Because users of the MAZE system show an obvious periodicity (a "day mode"), the average online and offline times of nodes are both set to 12 hours. As the user log covers one week, the average file sharing time is taken to be half a week (84 hours). Assuming that user nodes are evenly distributed over the download queues, when a user stops sharing the file the corresponding proportion of downloading users is forced to choose a download source again. Therefore, the average leaving rate $r_1$ of downloading nodes is set equal to the rate $\delta$ at which nodes stop sharing the file, that is, the reciprocal of the average sharing time. The parameter values are listed in Table 1.

    Table 1. Experimental parameter values

    r_1       δ         p_share   λ         λ_on-off   λ_off-on   N_P
    1.98E-4   1.98E-4   0.122     0.00612   0.00138    0.00138    100068
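The rates in Table 1 can be checked against the durations stated in the text with a few lines of arithmetic. The sketch below assumes that all rates are expressed per minute, which is inferred from the one-minute time granularity used in the experiment and is not stated explicitly with the table. The dictionary keys and the helper `mean_duration_hours` are hypothetical names.

```python
MINUTES_PER_HOUR = 60.0

# Values from Table 1, assumed here to be rates per minute.
table1 = {
    "r1": 1.98e-4,
    "delta": 1.98e-4,
    "p_share": 0.122,
    "lam": 0.00612,
    "lam_on_off": 0.00138,
    "lam_off_on": 0.00138,
    "n_total": 100068,
}


def mean_duration_hours(rate_per_minute):
    """Mean time, in hours, spent in a state that is left at the given rate."""
    return 1.0 / rate_per_minute / MINUTES_PER_HOUR


online_hours = mean_duration_hours(table1["lam_on_off"])
sharing_hours = mean_duration_hours(table1["delta"])
print(round(online_hours, 1))   # close to the 12 hours stated in the text
print(round(sharing_hours, 1))  # close to the 84 hours stated in the text
```

Under the per-minute assumption, the reciprocals of $\lambda_{on-off}$ and $\delta$ come out at roughly 12 and 84 hours, which matches the online/offline time and the average sharing time given in the text.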

    To obtain the initial number of nodes that share the file, the log data is divided into two parts: the first 12 hours and the remaining time. The initial number of sharing nodes is the number of nodes that finish downloading the file within the first 12 hours multiplied by the factor $p_{share}$, plus the number of nodes that already share the file at the beginning. The initial values of $P_{S_{on}}$ and $P_{S_{off}}$ are each set to half of this number. The initial values of $P_{D_{on}}$, $P_{D_{off}}$, $P_{I_{on}}$ and $P_{I_{off}}$ are set to 0; $P_{W_{on}}$ and $P_{W_{off}}$ are each set to half of $N_P - P_{S_{on}} - P_{S_{off}}$. The average download rate is defined as the total download traffic of all users divided by the time between the nodes entering the download queue and the end of the download. The average download rate divided by the file size is the rate of completed downloads per unit time.
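The initialization rule described above can be written as a small helper. This is a sketch only: the function name `initial_state`, its argument names and the example inputs (1000 completed downloads and 50 initial sharers) are hypothetical and are not taken from the MAZE log.

```python
def initial_state(n_total, finished_first_12h, sharers_at_start, p_share):
    """Initial class sizes following the rule described in the text."""
    # Nodes sharing the file initially: completed downloads in the first
    # 12 hours scaled by p_share, plus nodes sharing from the beginning.
    sharing = finished_first_12h * p_share + sharers_at_start
    # Sharing nodes are split evenly between online and offline.
    s_on = s_off = sharing / 2.0
    # All remaining nodes are susceptible, again split evenly.
    w_on = w_off = (n_total - s_on - s_off) / 2.0
    return {
        "W_on": w_on, "D_on": 0.0, "S_on": s_on, "I_on": 0.0,
        "W_off": w_off, "D_off": 0.0, "S_off": s_off, "I_off": 0.0,
    }


# Hypothetical inputs, for illustration only.
state = initial_state(n_total=100068, finished_first_12h=1000,
                      sharers_at_start=50, p_share=0.122)
print(state["S_on"], state["W_on"])
```

By construction the eight class sizes add up to $N_P$, so the result can be passed directly to an integrator of equations (5)-(12).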

    With a time granularity of one minute, and using the parameter values obtained from the first 12 hours, we calculate the number of completed downloads of each file in the following 156 hours. The results are shown in Table 2, and comparisons of the user log data with the model results are shown in Figures 1 and 2. As Table 2 shows, for the top four RMVB files the differences between the user log data and the model results are small, whereas for the lower-ranked files the differences are noticeable. This discrepancy is related to the model assumptions: the model supposes that, apart from the users already sharing the file, all other users are initially interested in the file and will download it, whereas in reality the lower-ranked files are less popular and not all users want to download them. The model results are therefore biased for these files.


    Table 2. Experimental result and real data

    1   A   386.52   5838   157   2849   3106
    2   B   383.33   5304   146   2678   2905
    3   C   441.87   4368   138   2455   2791
    4   D   465.76   3313   114   2021   2269
    5   E   158.65   2629    96   1690   1997
    6   F   383.33   1722    33    586    720
    7   G   734.61   1399    66    874   1263

    5 Conclusion

    This paper analyzes the state changes of user nodes on the basis of the dynamic model theory of infectious diseases and describes a file diffusion model for P2P file-sharing systems. Because the model takes the online and offline states of user nodes into account, it is closer to the actual situation. Experimental analysis indicates that the model describes the diffusion of the most popular files in a P2P file-sharing system well. Using the model for more in-depth analysis, and studying further behavior and performance characteristics of P2P file-sharing systems, are the main directions of future work.

    References

    1. Lo, P.F., Giovannil, N., Giuseppe, B.: The effect of heterogeneous link capacities in BitTorrent-like file sharing systems. In: Proceedings of the First International Workshop on Hot Topics in Peer-to-Peer Systems, Volendam, The Netherlands, pp. 40–47 (2004)

    2. Qiu, D.Y., Srikant, R.: Modeling and performance analysis of BitTorrent-like peer-to-peer networks. In: Proceedings of the ACM SIGCOMM 2004: Conference on Computer Communications, New York, USA, pp. 367–377 (2004)

    3. Leibnitz, K., Hossfeld, T., Wakamiya, N., et al.: Modeling of epidemic diffusion in Peer-to-Peer file-sharing networks. In: Proceedings of the 2nd International Workshop on Biologically Inspired Approaches for Advanced Information Technology, Osaka, Japan, pp. 322–329 (2006)

    4. Ni, J., Lin, J., Harrington, S.J., et al.: Designing File Replication Schemes for Peer-to-Peer File Sharing Systems. In: IEEE International Conference on Communications, Beijing, China, pp. 5609–5612 (2008)

    5. Kermack, W.O., McKendrick, A.G.: Contributions to the mathematical theory of epidemics. In: Proceedings of the Royal Society. Series A, vol. 115, pp. 700–721 (1927)

    6. Arino, J., Davis, J., Hartley, D., et al.: A multi-species epidemic model with spatial dynamics. Mathematical Medicine and Biology 22(2), 129–142 (2005)

    7. Arino, J., Jordan, R., van den Driessche, P.: Quarantine in a multi-species epidemic model with spatial dynamics. Mathematical Biosciences 206(1), 46–60 (2007)