Z. Du (Ed.): Proceedings of the 2012 International Conference of MCSA, AISC 191, pp. 117–122.

springerlink.com © Springer-Verlag Berlin Heidelberg 2013

Modeling Dynamic File Diffusion Behavior in P2P Networks

Baogang Chen1 and Jinlong Hu2

1 College of Information and Management Science, Henan Agriculture University, Zhengzhou, China, 450002

2 Communication and Computer Network Laboratory of Guangdong Province, South China University of Technology, Guangzhou, China, 510641

Abstract. Based on the characteristics of popular file downloading and transmission, this paper studies the states of nodes in a P2P file-sharing system. Building on these states and on a multi-species epidemic model with spatial dynamics, a file diffusion model for P2P file-sharing systems is proposed. Experiments show that the model agrees with the actual situation and can simulate the behavior of peers in a P2P network.

Keywords: P2P networks, file diffusion, SEIR epidemic model, spatial dynamics.

1 Introduction

In a P2P file-sharing system, the downloading of a popular file by a large number of users resembles the spread of an infectious disease and can therefore be described with infectious disease dynamics. In medicine, infectious disease propagation models have long been studied; they are effective tools for analyzing how diseases spread and for predicting outbreaks.

Most existing research on file replication and diffusion considers only the steady-state performance of P2P networks and ignores the unstable states of the file diffusion process [1,2], or does not fully reflect the states of nodes in the system [3,4]. In this paper, drawing on the theory of infectious disease dynamics, the states of user nodes in a P2P file-sharing system are examined comprehensively, and a new model of file diffusion in P2P file-sharing systems is proposed.

2 Dynamic Model of Infectious Diseases

In 1927, Kermack and McKendrick introduced the well-known SIR compartment model while investigating the laws of epidemics [5]. Later, SEIR epidemic models with a latent period were developed on this basis. In the SEIR model, the population is divided into four groups: susceptible individuals, denoted by S; exposed individuals, denoted by E, who have been infected but are in an incubation period during which they are not yet contagious; infectious individuals, denoted by I; and recovered individuals, denoted by R.

In 2005, Julien Arino et al. proposed a disease transmission model with spatial dynamics [6]. An SEIR epidemic model with spatial dynamics is considered for a population consisting of $s$ species and occupying $n$ spatial patches. The total population of species $i$ in patch $p$ is $N_{ip}$, and the population of species $i$ is $N_i^0 > 0$, a fixed constant. At time $t$, the numbers of susceptible, exposed, infectious and recovered individuals of species $i$ in patch $p$ are denoted by $S_{ip}$, $E_{ip}$, $I_{ip}$ and $R_{ip}$, respectively. $1/d_{ip} > 0$, $1/\omega_{ip} > 0$ and $1/\gamma_{ip} > 0$ are the average lifetime, latent period and infectious period of species $i$ in patch $p$, respectively. The disease is assumed to be horizontally transmitted within and between species according to standard incidence, with $\beta_{ijp} \ge 0$ the rate of disease transfer from species $j$ to species $i$ in patch $p$. Individuals of species $i$ travel from patch $q$ to patch $p$ at rate $m_{ipq} \ge 0$. The dynamics for species $i = 1, \dots, s$ in patch $p = 1, \dots, n$ are given by the following system of $4sn$ equations:

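$$\frac{dS_{ip}}{dt} = d_{ip}\,(N_{ip} - S_{ip}) - \sum_{j=1}^{s} \beta_{ijp} \frac{S_{ip} I_{jp}}{N_{jp}} + \sum_{q=1}^{n} m_{ipq} S_{iq} - \sum_{q=1}^{n} m_{iqp} S_{ip} \tag{1}$$

$$\frac{dE_{ip}}{dt} = \sum_{j=1}^{s} \beta_{ijp} \frac{S_{ip} I_{jp}}{N_{jp}} - (d_{ip} + \omega_{ip})\,E_{ip} + \sum_{q=1}^{n} m_{ipq} E_{iq} - \sum_{q=1}^{n} m_{iqp} E_{ip} \tag{2}$$

$$\frac{dI_{ip}}{dt} = \omega_{ip} E_{ip} - (d_{ip} + \gamma_{ip})\,I_{ip} + \sum_{q=1}^{n} m_{ipq} I_{iq} - \sum_{q=1}^{n} m_{iqp} I_{ip} \tag{3}$$

$$\frac{dR_{ip}}{dt} = \gamma_{ip} I_{ip} - d_{ip} R_{ip} + \sum_{q=1}^{n} m_{ipq} R_{iq} - \sum_{q=1}^{n} m_{iqp} R_{ip} \tag{4}$$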
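The right-hand sides of Eqs. (1)–(4) can be sketched in a few lines of plain Python. This is only an illustration: the function name `seir_rhs`, the nested-list layout and the numerical values in the example (one species, two patches) are assumptions made here, not part of the model in [6].

```python
def seir_rhs(S, E, I, R, d, omega, gamma, beta, m):
    """Right-hand sides of Eqs. (1)-(4).

    S, E, I, R, d, omega, gamma are indexed [i][p] (species, patch);
    beta is indexed [i][j][p]; m is indexed [i][p][q] (travel from q to p).
    """
    s, n = len(S), len(S[0])
    N = [[S[i][p] + E[i][p] + I[i][p] + R[i][p] for p in range(n)]
         for i in range(s)]

    def travel(X, i, p):
        # Net inflow into patch p: arrivals from each q minus departures to each q.
        return sum(m[i][p][q] * X[i][q] - m[i][q][p] * X[i][p] for q in range(n))

    def zeros():
        return [[0.0] * n for _ in range(s)]

    dS, dE, dI, dR = zeros(), zeros(), zeros(), zeros()
    for i in range(s):
        for p in range(n):
            # Standard incidence: new infections of species i in patch p.
            infection = sum(beta[i][j][p] * S[i][p] * I[j][p] / N[j][p]
                            for j in range(s))
            dS[i][p] = d[i][p] * (N[i][p] - S[i][p]) - infection + travel(S, i, p)
            dE[i][p] = infection - (d[i][p] + omega[i][p]) * E[i][p] + travel(E, i, p)
            dI[i][p] = (omega[i][p] * E[i][p]
                        - (d[i][p] + gamma[i][p]) * I[i][p] + travel(I, i, p))
            dR[i][p] = gamma[i][p] * I[i][p] - d[i][p] * R[i][p] + travel(R, i, p)
    return dS, dE, dI, dR


# Hypothetical example: one species, two patches, infection seeded in patch 0.
S = [[990.0, 500.0]]
E = [[0.0, 0.0]]
I = [[10.0, 0.0]]
R = [[0.0, 0.0]]
d = [[1e-4, 1e-4]]
omega = [[0.2, 0.2]]
gamma = [[0.1, 0.1]]
beta = [[[0.5, 0.5]]]
m = [[[0.0, 0.01], [0.02, 0.0]]]

dS, dE, dI, dR = seir_rhs(S, E, I, R, d, omega, gamma, beta, m)
total_change = sum(sum(row) for block in (dS, dE, dI, dR) for row in block)
print(total_change)  # close to zero: births balance deaths, travel only moves individuals
```

In this form the total population of each species stays constant, because the birth term $d_{ip} N_{ip}$ balances the deaths in all four compartments and travel only moves individuals between patches.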

3 Construction of the P2P File Diffusion Model

In a P2P file-sharing system, a user node that has not yet searched for the file or issued a download request is in the susceptible state, denoted by class W. After issuing a download request, the node enters the download queue; this state corresponds to the latent period and is denoted by class D. After completing the download, a node that shares the file for a period of time is in the infectious state, denoted by class S. When a user is no longer interested in the file and deletes it, or does not share the file after downloading it, the node is in the recovered state, denoted by class I. Note that the letters S and I used here do not carry their SEIR meanings.

Let $X$ be any class of nodes, and let $P_X$ denote the number of nodes in class $X$. A user node in the system is either online or offline and can switch between the two states. Therefore, all nodes in the system are divided into four compartments, and each compartment has an online and an offline part, such as $W_{on}$ and $W_{off}$.

In the following, the file diffusion model is derived by analyzing how the number of nodes in each class changes.

(1) Change rate of the node class $W_{on}$

First, a node of class $W_{on}$ changes its state from $W_{on}$ to $D_{on}$ when it searches for the file and issues a download request. Suppose that the rate at which the file is queried is proportional to the ratio of the number of nodes currently sharing the file to the total number of nodes in the system, $N_P$, and let $\lambda$ be the average rate at which a user sends file queries and download requests. By analogy with the standard incidence rate $\beta S I / N$ of infectious diseases, nodes of class $W_{on}$ are converted to class $D_{on}$ at rate $\lambda P_{W_{on}} P_{S_{on}} / N_P$. Meanwhile, when nodes go offline, the number of online nodes is reduced. If the offline rate is $\lambda_{on\text{-}off}$, nodes of class $W_{on}$ transfer to class $W_{off}$ at rate $\lambda_{on\text{-}off} P_{W_{on}}$. Downloading nodes that lose their download source are forced to seek a new one; these nodes re-enter class $W_{on}$ from class $D_{on}$, and the rate of this transfer is denoted by $r_1$. When nodes change state from offline to online, the number of $W_{on}$ nodes increases. Assuming that this conversion occurs with rate $\lambda_{off\text{-}on}$, the change rate of class $W_{on}$ can be expressed as:

$$\frac{dP_{W_{on}}}{dt} = -\lambda \frac{P_{W_{on}} P_{S_{on}}}{N_P} - \lambda_{on\text{-}off} P_{W_{on}} + r_1 P_{D_{on}} + \lambda_{off\text{-}on} P_{W_{off}} \tag{5}$$


(2) Change rate of the node class $D_{on}$

There are four situations in which nodes leave class $D_{on}$: nodes terminate their download and return to class $W_{on}$ because the source nodes no longer share the file; nodes enter class $S_{on}$ because they share the file after downloading it; nodes enter class $I_{on}$ because they do not share the file after downloading it; and nodes change state from online to offline. Let the rate from class $D_{on}$ to $W_{on}$ be $r_1$, the download rate be $\mu$, and the file sharing probability be $p_{share}$; then nodes of class $D_{on}$ enter $S_{on}$ at rate $p_{share}\,\mu$ and enter $I_{on}$ at rate $(1 - p_{share})\,\mu$. At the same time, the number of $D_{on}$ nodes increases in two cases: nodes moving from class $W_{on}$ to class $D_{on}$, and nodes of class $D_{off}$ changing from offline to online. The flow from $W_{on}$ to $D_{on}$ is $\lambda P_{W_{on}} P_{S_{on}} / N_P$ and the flow from offline to online is $\lambda_{off\text{-}on} P_{D_{off}}$. Then the change rate of class $D_{on}$ can be expressed as:

$$\frac{dP_{D_{on}}}{dt} = \lambda \frac{P_{W_{on}} P_{S_{on}}}{N_P} - \mu P_{D_{on}} - r_1 P_{D_{on}} - \lambda_{on\text{-}off} P_{D_{on}} + \lambda_{off\text{-}on} P_{D_{off}} \tag{6}$$


(3) Change rate of the node class $S_{on}$

Nodes leave class $S_{on}$ in two situations: they stop sharing the file, or they change state from online to offline. Let the average time for which a node shares the file be $1/\delta$, so that nodes of class $S_{on}$ enter class $I_{on}$ at rate $\delta$. Meanwhile, the number of $S_{on}$ nodes increases in two cases: nodes of class $D_{on}$ enter class $S_{on}$, and nodes of class $S_{off}$ change state from offline to online. The transition from class $D_{on}$ to class $S_{on}$ occurs at rate $p_{share}\,\mu$, and the flow from class $S_{off}$ to class $S_{on}$ is $\lambda_{off\text{-}on} P_{S_{off}}$. Then the change rate of class $S_{on}$ is given as:

$$\frac{dP_{S_{on}}}{dt} = p_{share}\,\mu P_{D_{on}} - \delta P_{S_{on}} - \lambda_{on\text{-}off} P_{S_{on}} + \lambda_{off\text{-}on} P_{S_{off}} \tag{7}$$


(4) Change rate of the node class $I_{on}$

Nodes of class $D_{on}$ that decline to share the downloaded file enter class $I_{on}$ directly at rate $(1 - p_{share})\,\mu$. Meanwhile, nodes of class $S_{on}$ stop sharing the file at rate $\delta$. Then the total change rate of class $I_{on}$ can be expressed as:

$$\frac{dP_{I_{on}}}{dt} = \delta P_{S_{on}} + (1 - p_{share})\,\mu P_{D_{on}} - \lambda_{on\text{-}off} P_{I_{on}} + \lambda_{off\text{-}on} P_{I_{off}} \tag{8}$$

(5) Change rate of offline nodes

Offline nodes fall into four classes: $W_{off}$, $D_{off}$, $S_{off}$ and $I_{off}$. The online-to-offline and offline-to-online transition rates are assumed to be the same for all classes of nodes. So, we have

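$$\frac{dP_{W_{off}}}{dt} = \lambda_{on\text{-}off} P_{W_{on}} - \lambda_{off\text{-}on} P_{W_{off}} \tag{9}$$

$$\frac{dP_{D_{off}}}{dt} = \lambda_{on\text{-}off} P_{D_{on}} - \lambda_{off\text{-}on} P_{D_{off}} \tag{10}$$

$$\frac{dP_{S_{off}}}{dt} = \lambda_{on\text{-}off} P_{S_{on}} - \lambda_{off\text{-}on} P_{S_{off}} \tag{11}$$

$$\frac{dP_{I_{off}}}{dt} = \lambda_{on\text{-}off} P_{I_{on}} - \lambda_{off\text{-}on} P_{I_{off}} \tag{12}$$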
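As a quick consistency check on Eqs. (5)–(12), which is not part of the original derivation, one can sum all eight equations. Every flow leaves one compartment and enters another, so, reading the signs as described in the text, the terms cancel in pairs:

$$\frac{d}{dt} \sum_{X \in \{W, D, S, I\}} \left( P_{X_{on}} + P_{X_{off}} \right) = 0$$

The request flow $\lambda P_{W_{on}} P_{S_{on}} / N_P$ cancels between Eqs. (5) and (6), the return flow $r_1 P_{D_{on}}$ cancels between the same two equations, the completed downloads satisfy $-\mu P_{D_{on}} + p_{share}\,\mu P_{D_{on}} + (1 - p_{share})\,\mu P_{D_{on}} = 0$, the flow $\delta P_{S_{on}}$ cancels between Eqs. (7) and (8), and each online/offline exchange in Eqs. (5)–(8) is matched by the opposite term in Eqs. (9)–(12). Under this reading the total number of nodes stays equal to $N_P$ at all times.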



4 Experiment and Analysis

The model assumes that all user nodes are initially interested in a particular file. We therefore select the seven most-downloaded RMVB files in the MAZE log, named A, B, …, G.

Because users are assumed to be interested in the file from the start, the download request interval is taken to be the average interval between file download requests over all users. Users of the MAZE system show obvious periodicity and a "day mode", so the online and offline durations of nodes are both set to 12 hours. Since the user log covers one week, the average file sharing time is taken as half a week (84 hours). Assuming that nodes are evenly distributed in the download queue, when a user stops sharing the file, the corresponding proportion of downloading users is forced to choose a new download source. Therefore, the average leaving rate $r_1$ of downloading nodes is set equal to $\delta$, the reciprocal of the average file sharing time. The parameter values are listed in Table 1.

Table 1. Experimental parameter values (rates per minute)

$r_1$      $\delta$   $p_{share}$   $\lambda$   $\lambda_{on\text{-}off}$   $\lambda_{off\text{-}on}$   $N_P$
1.98E-4    1.98E-4    0.122         0.00612     0.00138                     0.00138                     100068
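The time-based entries of Table 1 can be reproduced from the durations given in the text, assuming each rate is the reciprocal of the corresponding mean duration expressed in minutes. The helper name `rate_per_minute` is introduced here for illustration only.

```python
def rate_per_minute(mean_duration_hours):
    """Rate of a transition whose mean waiting time is given in hours."""
    return 1.0 / (mean_duration_hours * 60.0)


lam_on_off = rate_per_minute(12)   # nodes stay online for 12 hours on average
lam_off_on = rate_per_minute(12)   # nodes stay offline for 12 hours on average
delta = rate_per_minute(84)        # files are shared for 84 hours on average
r1 = delta                         # leaving rate of downloading nodes, set equal to delta

# Long-run share of nodes that are online under these two switching rates.
online_fraction = lam_off_on / (lam_on_off + lam_off_on)

print(f"{lam_on_off:.5f} {delta:.3e} {online_fraction:.2f}")
```

The computed values agree with Table 1 to the precision given there; the table appears to truncate rather than round the last digit. The equal switching rates also imply that half of the nodes are online in the long run, which is consistent with splitting the initial populations evenly between online and offline.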

To obtain the initial number of nodes sharing the file, the log data is divided into two parts: the first 12 hours and the remaining time. The initial number of sharing nodes is the number of nodes that finish downloading the file within the first 12 hours multiplied by the factor $p_{share}$, plus the number of nodes that can share the file at the beginning. The initial values of $P_{S_{on}}$ and $P_{S_{off}}$ are each set to half of the initial number of sharing nodes. The initial values of $P_{D_{on}}$, $P_{D_{off}}$, $P_{I_{on}}$ and $P_{I_{off}}$ are set to 0, and $P_{W_{on}}$ and $P_{W_{off}}$ are each set to half of $N_P - P_{S_{on}} - P_{S_{off}}$. The average download rate is defined as the total download traffic of all users divided by the time between the nodes entering the download queue and the end of their downloads. The average download rate divided by the file size gives the rate of download completion per unit time.
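The initialization rule described above can be written as a small function. This is a sketch under stated assumptions: the function name, the dictionary layout and the example inputs (500 finished downloads, 40 initial sharers) are hypothetical and are not taken from the MAZE log.

```python
def initial_state(n_p, finished_first_12h, sharers_at_start, p_share):
    """Initial node counts for the eight compartments.

    sharing = finished downloads in the first 12 h * p_share + initial sharers;
    sharing and waiting nodes are split evenly between online and offline.
    """
    sharing = finished_first_12h * p_share + sharers_at_start
    waiting = n_p - sharing
    return {
        "W_on": waiting / 2, "W_off": waiting / 2,
        "D_on": 0.0, "D_off": 0.0,
        "S_on": sharing / 2, "S_off": sharing / 2,
        "I_on": 0.0, "I_off": 0.0,
    }


# Hypothetical inputs; N_P and p_share are the Table 1 values.
state = initial_state(n_p=100068, finished_first_12h=500,
                      sharers_at_start=40, p_share=0.122)
print(state["S_on"], state["W_on"])
```

By construction the eight initial values add up to $N_P$.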

With a time granularity of one minute, and using the parameter values obtained from the first 12 hours, we calculate the number of completed downloads in the remaining 156 hours. The results are shown in Table 2. The comparison of user log data and model results is shown in Figures 1 and 2. As can be seen from Table 2, for the top four RMVB files the differences between the user log data and the model results are small, but for the lower-ranked files the differences are more obvious. This bias is related to the model assumptions: the model supposes that, apart from the users already sharing the file, all other users are initially interested in the file and will download it, whereas in reality the lower-ranked files are less popular and not all users want to download them. The model results are therefore biased for these files.
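A minimal simulation of Eqs. (5)–(12) with one-minute steps might look as follows. Several things here are assumptions rather than the authors' procedure: the forward Euler scheme, the per-file download completion rate `MU` and the initial number of sharers `SHARING_0` (neither is listed in Table 1), and the idea of counting completed downloads as the accumulated flow $\mu P_{D_{on}}$.

```python
# Table 1 values (rates per minute).
R1 = 1.98e-4
DELTA = 1.98e-4
P_SHARE = 0.122
LAM = 0.00612
LAM_ON_OFF = 0.00138
LAM_OFF_ON = 0.00138
N_P = 100068
# Hypothetical values, chosen only to make the sketch runnable.
MU = 0.005         # download completion rate per minute (assumed)
SHARING_0 = 200    # nodes sharing the file at the start (assumed)


def derivatives(y):
    """Right-hand sides of Eqs. (5)-(12); y maps compartment name to node count."""
    request = LAM * y["W_on"] * y["S_on"] / N_P   # flow from W_on to D_on
    d = {
        "W_on": -request + R1 * y["D_on"],
        "D_on": request - (MU + R1) * y["D_on"],
        "S_on": P_SHARE * MU * y["D_on"] - DELTA * y["S_on"],
        "I_on": DELTA * y["S_on"] + (1 - P_SHARE) * MU * y["D_on"],
    }
    for c in "WDSI":
        # Net flow from the online to the offline part of each class.
        going_offline = LAM_ON_OFF * y[c + "_on"] - LAM_OFF_ON * y[c + "_off"]
        d[c + "_on"] -= going_offline
        d[c + "_off"] = going_offline
    return d


def simulate(minutes, dt=1.0):
    """Forward Euler integration; returns the final state and completed downloads."""
    half_s = SHARING_0 / 2
    half_w = (N_P - SHARING_0) / 2
    y = {"W_on": half_w, "W_off": half_w, "S_on": half_s, "S_off": half_s,
         "D_on": 0.0, "D_off": 0.0, "I_on": 0.0, "I_off": 0.0}
    completed = 0.0
    for _ in range(int(minutes / dt)):
        dy = derivatives(y)
        completed += MU * y["D_on"] * dt   # downloads finished during this step
        for name in y:
            y[name] += dy[name] * dt
    return y, completed


final_state, completed = simulate(156 * 60)   # the latter 156 hours of the log
print(round(completed), round(sum(final_state.values())))
```

With a one-minute step every per-node outflow rate is far below 1, so the Euler update keeps all compartments non-negative; a higher-order integrator could be substituted without changing the structure.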




Table 2. Experimental result and real data

No.   File
1     A      386.52   5838   157   2849   3106
2     B      383.33   5304   146   2678   2905
3     C      441.87   4368   138   2455   2791
4     D      465.76   3313   114   2021   2269
5     E      158.65   2629    96   1690   1997
6     F      383.33   1722    33    586    720
7     G      734.61   1399    66    874   1263
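The header row of Table 2 is not legible in this copy, so the following is only a sketch under the assumption that the last two numeric columns hold the two download counts being compared (log data and model result, in an unknown order). Under that reading, the relative gap per file can be computed as:

```python
# Last two numeric columns of Table 2 (their meaning is an assumption, see above).
rows = [
    ("A", 2849, 3106),
    ("B", 2678, 2905),
    ("C", 2455, 2791),
    ("D", 2021, 2269),
    ("E", 1690, 1997),
    ("F", 586, 720),
    ("G", 874, 1263),
]


def relative_gap(a, b):
    """Absolute difference relative to the smaller of the two counts."""
    return abs(b - a) / min(a, b)


gaps = {name: relative_gap(a, b) for name, a, b in rows}
for name, gap in gaps.items():
    print(f"{name}: {gap:.1%}")
```

If the assumption about the columns holds, the gap stays below 15% for files A to D and grows for the lower-ranked files, which matches the discussion in the text.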

5 Conclusion

This paper analyzes the state changes of user nodes based on the theory of infectious disease dynamics and describes a file diffusion model for P2P file-sharing systems. Because the model takes the online and offline states of user nodes into account, it is closer to the actual situation. Experimental analysis indicates that the model describes the diffusion of the most popular files in a P2P file-sharing system well. Using the model for in-depth analysis, and for studying further behavior and performance characteristics of P2P file-sharing systems, is one of the main directions for future work.

References

1. Lo Piccolo, F., Neglia, G., Bianchi, G.: The effect of heterogeneous link capacities in BitTorrent-like file sharing systems. In: Proceedings of the First International Workshop on Hot Topics in Peer-to-Peer Systems, Volendam, The Netherlands, pp. 40–47 (2004)

2. Qiu, D.Y., Srikant, R.: Modeling and performance analysis of BitTorrent-like peer-to-peer networks. In: Proceedings of the ACM SIGCOMM 2004: Conference on Computer Communications, New York, USA, pp. 367–377 (2004)

3. Leibnitz, K., Hossfeld, T., Wakamiya, N., et al.: Modeling of epidemic diffusion in Peer-to-Peer file-sharing networks. In: Proceedings of the 2nd International Workshop on Biologically Inspired Approaches for Advanced Information Technology, Osaka, Japan, pp. 322–329 (2006)

4. Ni, J., Lin, J., Harrington, S.J., et al.: Designing File Replication Schemes for Peer-to-Peer

File Sharing Systems. In: IEEE International Conference on Communications, Beijing,

China, pp. 5609–5612 (2008)

5. Kermack, W.O., McKendrick, A.G.: Contributions to the mathematical theory of epidemics.

In: Proceedings of the Royal Society. Series A, vol. 115, pp. 700–721 (1927)

6. Arino, J., Davis, J., Hartley, D., et al.: A multi-species epidemic model with spatial dynamics. Mathematical Medicine and Biology 22(2), 129–142 (2005)

7. Arino, J., Jordan, R., van den Driessche, P.: Quarantine in a multi-species epidemic model with spatial dynamics. Mathematical Biosciences 206(1), 46–60 (2007)