Targeting Communities to Maximise Information Diffusion

Post on 28-Nov-2014


© Copyright 2010 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Targeting Communities to Maximise Information Diffusion

Václav Belák, Samantha Lam, Conor Hayes

Motivation

•  Many companies have started to utilise online communities as a means of communicating and targeting their customers

•  A common approach is to maximise information diffusion by targeting influential actors

•  In the context of many online communities (e.g. discussion fora) the information is shared to the community as a whole and not to individual actors

Objectives

•  Our main hypothesis is that it is possible to efficiently spread a message over the information flow network by targeting highly influential communities

•  We derive the information flow network from the reply-to network between the actors

•  The main problem is then formulated as a prediction of the set of communities to target such that the message is spread over the network as much as possible

Methods: Definition of Impact

•  We propose (Belák et al., ‘12) to take two factors into account:

   1.  degree of community membership of the users
   2.  centrality of the users within each community; we used in-degree (# replies of a user)

•  For the general case of n users and k communities, define:

   •  n × k membership matrix M
   •  n × k centrality matrix C

•  The cross-community k × k impact matrix J can then be obtained as a product of the two matrices: $J = M^{T} C$

[Figure: example with users a to g and two communities, forum A and forum B.]
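The product above can be sketched in a few lines of NumPy. This is only an illustration: the toy matrices, the function name `impact_matrix`, and the choice of "share of posts" for membership are assumptions made for the example, not values or definitions from the study.

```python
import numpy as np

def impact_matrix(M, C):
    """Return the k x k cross-community impact matrix J = M^T C.

    M[u, i]: degree of membership of user u in community i (n x k).
    C[u, j]: centrality of user u within community j (n x k).
    """
    M = np.asarray(M, dtype=float)
    C = np.asarray(C, dtype=float)
    if M.shape != C.shape:
        raise ValueError("M and C must both be n x k")
    return M.T @ C

# Hypothetical toy data: 4 users (rows), 2 communities (columns).
# Membership is assumed here to be the share of a user's posts per community.
M = np.array([
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
    [0.0, 1.0],
])
# Centrality is assumed here to be the in-degree (replies) per community.
C = np.array([
    [3.0, 0.0],
    [2.0, 4.0],
    [0.0, 1.0],
    [0.0, 2.0],
])

J = impact_matrix(M, C)
print(J)
```

Entry J[i, j] sums, over all users, their membership in community i weighted by their centrality in community j, so off-diagonal entries capture cross-community impact.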

Methods: Targeting Communities

•  The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix

•  Is a community broadly influential or does it influence only few other communities?

•  We propose to target communities by means of the product of the total impact of community i and its entropy: impact focus (IF)

•  IF was compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, ‘99)

•  We simulated the diffusion by extending the Independent Cascade Model (ICM) (Kempe et al., ‘03):

   1.  Take q target communities and sample s users from each of them
   2.  Run the original ICM from the union of the sampled users

•  Performance was measured by the fraction of all users that were activated during the simulation
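The targeting and simulation steps above might be sketched as follows. Several choices here are assumptions rather than details from the slides: total impact is taken as the row sum of J, the entropy is the Shannon entropy of the normalised row, a single uniform activation probability `p` is used on every edge, and all function and parameter names are hypothetical.

```python
import numpy as np

def impact_focus(J):
    """Score each community as total impact times the entropy of its row of J.

    Assumptions: total impact = row sum; entropy = Shannon entropy
    (natural log) of the row normalised to sum to 1.
    """
    J = np.asarray(J, dtype=float)
    total = J.sum(axis=1)
    has_impact = total[:, None] > 0
    # Normalise rows into distributions; all-zero rows stay zero.
    p = np.divide(J, total[:, None], out=np.zeros_like(J), where=has_impact)
    # Treat 0 * log(0) as 0.
    logp = np.log(p, out=np.zeros_like(p), where=p > 0)
    entropy = -(p * logp).sum(axis=1)
    return total * entropy

def independent_cascade(out_neighbours, seeds, p, rng):
    """Basic ICM: each newly activated user gets one chance to activate
    each inactive out-neighbour, succeeding with probability p."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in out_neighbours.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active

def simulate(J, members, out_neighbours, n_users, q, s, p, rng):
    """Seed s users from each of the q top-scoring communities, run the ICM
    from the union of seeds, and return the fraction of users activated."""
    targets = np.argsort(impact_focus(J))[::-1][:q]
    seeds = set()
    for c in targets:
        pool = members[int(c)]
        size = min(s, len(pool))
        seeds.update(rng.choice(pool, size=size, replace=False).tolist())
    active = independent_cascade(out_neighbours, seeds, p, rng)
    return len(active) / n_users

# Hypothetical toy network: information flows along a chain 0 -> 1 -> 2 -> 3.
J = np.array([[4.0, 2.0], [1.0, 5.0]])
members = {0: [0, 1], 1: [2, 3]}
out_neighbours = {0: [1], 1: [2], 2: [3], 3: []}
rng = np.random.default_rng(0)
fraction = simulate(J, members, out_neighbours, n_users=4, q=1, s=1, p=0.5, rng=rng)
```

Swapping `impact_focus` for a random ranking or a group in-degree score would give the R and GI baselines under the same simulation loop; averaging `fraction` over many runs gives a mean activation fraction.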

Evaluation Data-Set

•  51 weeks of data from the largest Irish discussion board system

•  Segmented using a 1-week sliding window

   •  A 1-week window represents approx. 84% of cross-fora posting activity

•  540 communities in total

•  5,298 avg. nodes per snapshot

•  26,484 avg. edges per snapshot
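As a rough illustration of the segmentation, the sketch below cuts a list of timestamped replies into 1-week snapshots. The record layout `(timestamp, replier, replied_to, forum)`, the function name, and the step size of one week are assumptions; the slides state only the window length.

```python
from datetime import datetime, timedelta

def weekly_snapshots(replies, window=timedelta(weeks=1), step=timedelta(weeks=1)):
    """Split replies into snapshots covering [t, t + window), advancing t by step.

    replies: iterable of (timestamp, replier, replied_to, forum) tuples
    (a hypothetical layout chosen for this example).
    """
    replies = sorted(replies, key=lambda r: r[0])
    if not replies:
        return []
    start, end = replies[0][0], replies[-1][0]
    snapshots = []
    t = start
    while t <= end:
        snapshots.append([r for r in replies if t <= r[0] < t + window])
        t += step
    return snapshots

# Hypothetical toy data: three replies spread over nine days.
replies = [
    (datetime(2010, 1, 1), "b", "a", "forum A"),
    (datetime(2010, 1, 4), "c", "a", "forum A"),
    (datetime(2010, 1, 9), "e", "d", "forum B"),
]
snapshots = weekly_snapshots(replies)
```

Each snapshot can then be turned into its own reply-to network, from which the membership and centrality matrices are computed.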

Results: Avg. Performance

[Figure: five panels, one per number of targeted communities (q = 1 to 5). x-axis: user sample size (s), ticks from 2 to 14; y-axis: mean activation fraction (a), ticks from 0.1 to 0.6; series: IF, GI, R.]

•  Impact focus outperformed the other two methods, particularly for small numbers of targeted communities and of seed users sampled from them

•  The diffusion process saturated, on average, at approx. 60% of the users activated

Results: IF outperforms GI, R

[Figure: two panels, user sample size s = 1 (y-axis ticks from 0.1 to 0.5) and s = 12 (y-axis ticks from 0.45 to 0.65). x-axis: targeting method (IF, GI, R); y-axis: activation fraction (a).]

Conclusion

•  The evaluation demonstrated that the framework

   •  is able to identify highly influential communities
   •  can predict which communities to stimulate (e.g. by posting a message) such that the stimulus spreads efficiently

•  We aim to extend it with content analysis

   •  E.g. what are the most influential communities with respect to a particular topic?

•  We will also investigate empirically observed topic cascades and modify our models accordingly if needed

References

•  V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.

•  M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.

•  D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.