Targeting Communities to Maximise Information Diffusion

© Copyright 2010 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Václav Belák, Samantha Lam, Conor Hayes

Motivation

•  Many companies have started to use online communities as a means of communicating with and targeting their customers

•  A common approach is to maximise information diffusion by targeting influential actors

•  In many online communities (e.g. discussion fora), information is shared with the community as a whole rather than with individual actors

Objectives

•  Our main hypothesis is that it is possible to efficiently spread a message over the information flow network by targeting highly influential communities

•  We derive the information flow network from the reply-to network between the actors

•  The main problem is then formulated as predicting the set of communities to target so that the message spreads as widely as possible over the network

Methods: Definition of Impact

•  We propose (Belák et al., ’12) to take two factors into account:

1.  degree of community membership of the users
2.  centrality of the users within each community

•  we used in-degree (# replies of a user)

•  For the general case of n users and k communities define:

•  n × k membership matrix M

•  n × k centrality matrix C

•  Cross-community k × k impact matrix J can then be obtained as a product of the two matrices: $J = M^{\top} C$

[Figure: example network of users a to g spread across forum A and forum B]
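The product defined above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `impact_matrix` and the toy values in `M` and `C` are hypothetical and do not come from the slides or the evaluation data.

```python
def impact_matrix(M, C):
    """Return J = M^T C for an n x k membership matrix M and an n x k centrality matrix C."""
    n = len(M)       # number of users
    k = len(M[0])    # number of communities
    # J[i][j] sums, over all users u, membership of u in community i
    # times centrality of u within community j.
    return [[sum(M[u][i] * C[u][j] for u in range(n)) for j in range(k)]
            for i in range(k)]


# Hypothetical toy data: 3 users, 2 communities.
M = [[1.0, 0.0],   # user 0 belongs only to community 0
     [0.5, 0.5],   # user 1 splits activity between both
     [0.0, 1.0]]   # user 2 belongs only to community 1
C = [[2, 0],       # in-degree (replies received) per community
     [1, 3],
     [0, 4]]

J = impact_matrix(M, C)
print(J)
```

In this toy case the off-diagonal entries of J are non-zero only because user 1 is active in both communities, which is the cross-community effect the matrix is meant to capture.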

Methods: Targeting Communities

•  The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix

•  Is a community broadly influential or does it influence only few other communities?

•  We propose to target communities by means of the product of the total impact of community i and its entropy: impact focus (IF)

•  IF was compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, ’99)
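A minimal sketch of the impact focus score follows. The slides do not fix the details, so several choices here are assumptions: the total impact of community i is taken as the i-th row sum of J, the entropy is the Shannon entropy (natural log) of that row normalised to sum to one, and the function name `impact_focus` and the toy matrix are hypothetical.

```python
import math


def impact_focus(J):
    """Score each community as (row total of J) * (Shannon entropy of the normalised row)."""
    scores = []
    for row in J:
        total = sum(row)
        if total <= 0:
            scores.append(0.0)   # a community with no impact gets score 0
            continue
        probs = [x / total for x in row if x > 0]
        entropy = sum(-p * math.log(p) for p in probs)
        scores.append(total * entropy)
    return scores


# Hypothetical toy impact matrix: both communities have total impact 8,
# but community 0 spreads it evenly while community 1 concentrates it.
J = [[4.0, 4.0],
     [8.0, 0.0]]

scores = impact_focus(J)
# Rank communities from highest to lowest impact focus.
ranking = sorted(range(len(J)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

With equal total impact, the broadly influential community ranks first, which matches the question of whether a community influences many communities or only a few.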

•  We simulated the diffusion by extending the Independent Cascade Model (ICM) (Kempe et al., ’03):

1.  Take q target communities and sample s users from each of them
2.  Run the original ICM from the union of the sampled users

•  Performance was measured by the fraction of all users activated during the simulation
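The two simulation steps and the performance measure can be sketched as below. This is an illustrative sketch, not the authors' implementation: it assumes a single uniform activation probability `p` on every edge, uniform sampling of seed users without replacement, and a graph in which every user appears as a key. The names `independent_cascade` and `community_seeded_cascade` and the toy data are hypothetical.

```python
import random


def independent_cascade(graph, seeds, p, rng):
    """Plain ICM: each newly activated user gets one chance to activate each inactive neighbour."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def community_seeded_cascade(graph, members, targets, s, p, rng):
    """Sample s users from each target community, run ICM, return the activated fraction."""
    seeds = set()
    for c in targets:
        pool = sorted(members[c])                 # sorted for reproducible sampling
        seeds.update(rng.sample(pool, min(s, len(pool))))
    active = independent_cascade(graph, seeds, p, rng)
    return len(active) / len(graph)               # fraction of all users activated


# Hypothetical toy information-flow network: a -> b -> c -> d.
graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
members = {"A": {"a", "b"}, "B": {"c", "d"}}

rng = random.Random(0)
frac_certain = community_seeded_cascade(graph, members, ["A"], s=1, p=1.0, rng=rng)
frac_blocked = community_seeded_cascade(graph, members, ["A"], s=1, p=0.0, rng=rng)
print(frac_certain, frac_blocked)
```

In practice the fraction would be averaged over many runs for each combination of q and s, since both the seed sampling and the cascade are random.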

Evaluation Data-Set

•  51 weeks of data of the largest Irish discussion board system

•  Segmented using 1 week sliding window

•  1 week window represents approx. 84% of cross-fora posting activity

•  540 communities in total

•  5,298 avg. nodes per snapshot

•  26,484 avg. edges per snapshot
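The segmentation into snapshots could look roughly like the sketch below. The slides give the window length (1 week) but not the step by which the window slides, so the `step` default of one week is an assumption, as are the function name `weekly_snapshots`, the `(timestamp, replier, replied_to)` record layout, and the toy records.

```python
from datetime import datetime, timedelta


def weekly_snapshots(replies, window=timedelta(weeks=1), step=timedelta(weeks=1)):
    """Split (timestamp, replier, replied_to) records into one edge list per window."""
    if not replies:
        return []
    replies = sorted(replies, key=lambda r: r[0])
    start, end = replies[0][0], replies[-1][0]
    snapshots = []
    t = start
    while t <= end:
        # Keep the reply edges whose timestamp falls inside [t, t + window).
        edges = [(u, v) for ts, u, v in replies if t <= ts < t + window]
        snapshots.append(edges)
        t += step
    return snapshots


# Hypothetical toy reply log.
replies = [
    (datetime(2010, 1, 1), "b", "a"),
    (datetime(2010, 1, 3), "c", "a"),
    (datetime(2010, 1, 9), "d", "c"),
]
snaps = weekly_snapshots(replies)
print([len(s) for s in snaps])
```

A smaller `step` than `window` would give overlapping snapshots; each snapshot's edge list is what the membership and centrality matrices would be computed from.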

Results: Avg. Performance

[Figure: five panels, one per number of targeted communities q = 1, 2, 3, 4, 5. x-axis: user sample size (s), 2 to 14; y-axis: mean activation fraction (a), 0.1 to 0.6; one curve each for IF, GI and R]

•  Impact focus outperformed the other two methods, particularly for small numbers of targeted communities and of seed users sampled from them

•  The diffusion process saturated on avg. at approx. 60% of the users activated

Results: IF outperforms GI, R

[Figure: two panels comparing the activation fraction (a) of IF, GI and R, for user sample size s = 1 (y-axis 0.1 to 0.5) and s = 12 (y-axis 0.45 to 0.65)]

Conclusion

•  The evaluation demonstrated that the framework

•  is able to identify highly influential communities

•  can predict which communities to stimulate (e.g. by posting a message) such that the stimulus spreads efficiently

•  We aim to extend it with content analysis

•  E.g. What are the most influential communities with respect to a particular topic?

•  We will also investigate empirically-observed topic cascades and modify our models accordingly if needed

References

•  V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.

•  M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.

•  D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.