Targeting Communities to Maximise Information Diffusion
© Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Targeting Communities to Maximise Information Diffusion
Václav Belák, Samantha Lam, Conor Hayes
Motivation
• Many companies have started to utilise online communities as a means of communicating and targeting their customers
• A common approach is to maximise information diffusion by targeting influential actors
• In many online communities (e.g. discussion fora), however, information is shared with the community as a whole rather than with individual actors
Objectives
• Our main hypothesis is that it is possible to efficiently spread a message over the information flow network by targeting highly influential communities
• We derive the information flow network from the reply-to network between the actors
• The main problem is then formulated as a prediction of the set of communities to target such that the message is spread over the network as much as possible
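The slides do not spell out how the information flow network is derived from the reply-to network. A minimal sketch, assuming that a reply from one user to another means information flowed from the original author to the replier, so each reply edge is reversed and repeated replies accumulate as weights. The function name `flow_network`, the input format, and the decision to drop self-replies are assumptions made for illustration.

```python
from collections import defaultdict


def flow_network(replies):
    """Build a weighted information-flow graph from (replier, author) pairs.

    Assumption: if `replier` replied to `author`, information flowed
    from `author` to `replier`, so the reply edge is reversed.
    Self-replies are ignored (also an assumption).
    """
    flow = defaultdict(lambda: defaultdict(int))
    for replier, author in replies:
        if replier != author:
            flow[author][replier] += 1  # edge weight = number of replies
    return {src: dict(dst) for src, dst in flow.items()}


# Hypothetical toy data: b replied to a twice, c replied to a, a replied to c.
replies = [("b", "a"), ("c", "a"), ("b", "a"), ("a", "c")]
print(flow_network(replies))
```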
Methods: Definition of Impact
• We propose (Belák et al., ‘12) to take two factors into account:
  1. degree of community membership of the users
  2. centrality of the users within each community
    • we used in-degree (# replies of a user)
• For the general case of n users and k communities define:
  • n × k membership matrix M
  • n × k centrality matrix C
• The cross-community k × k impact matrix J can then be obtained as a product of the two matrices: $J = M^{T} C$
[Figure: example network of users a to g spread over forum A and forum B]
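The product J = M^T C can be illustrated on a toy example. The sketch below assumes that M[u][c] holds the share of user u's activity in community c and that C[u][c] holds user u's in-degree within community c; the values and the helper name `impact_matrix` are hypothetical. Under this reading, J[i][j] aggregates how central the members of community i are in community j.

```python
def impact_matrix(M, C):
    """Return J = M^T C for two n x k matrices given as lists of rows."""
    if len(M) != len(C) or any(len(m) != len(c) for m, c in zip(M, C)):
        raise ValueError("M and C must have the same n x k shape")
    k = len(M[0])
    # J[i][j] = sum over users u of M[u][i] * C[u][j]
    return [[sum(m[i] * c[j] for m, c in zip(M, C)) for j in range(k)]
            for i in range(k)]


# Hypothetical toy data: 4 users, 2 communities.
M = [[1.0, 0.0],   # user 0 posts only in community 0
     [0.5, 0.5],   # user 1 splits activity evenly
     [0.0, 1.0],
     [0.0, 1.0]]
C = [[3.0, 0.0],   # in-degree of each user within each community
     [2.0, 1.0],
     [0.0, 4.0],
     [0.0, 1.0]]

J = impact_matrix(M, C)
print(J)  # [[4.0, 0.5], [1.0, 5.5]]
```

The off-diagonal entries are the cross-community part: here community 1 has more impact on community 0 (1.0) than the reverse (0.5), purely because of the shared user 1.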
Methods: Targeting Communities
• The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix
• Is a community broadly influential or does it influence only few other communities?
• We propose to target communities by means of the product of the total impact of community i and its entropy: impact focus (IF)
• IF is compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, ‘99)
• We simulated the diffusion by extending the Independent Cascade Model (ICM) (Kempe et al., ‘03)
  1. Take q target communities and sample s users from each of them
  2. Run the original ICM from the union of the sampled users
• Performance is measured by the fraction of all users that have been activated during the simulation
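The targeting and simulation steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the entropy is taken over the normalised rows of J with the natural logarithm (the slides allow rows or columns and do not fix a base), every edge shares one activation probability `p`, and all function names and data structures are hypothetical.

```python
import math
import random


def impact_focus(J):
    """Score each community i as (total impact of row i) * (entropy of row i)."""
    scores = []
    for row in J:
        total = sum(row)
        if total <= 0:
            scores.append(0.0)
            continue
        probs = [x / total for x in row if x > 0]
        entropy = sum(p * math.log(1 / p) for p in probs)
        scores.append(total * entropy)
    return scores


def top_communities(scores, q):
    """Indices of the q highest-scoring communities."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:q]


def simulate(flow, members, targets, s, p, rng):
    """Extended ICM: seed s users from each target community, then cascade.

    flow:    dict user -> iterable of users the information can reach next
    members: dict community -> list of its users
    p:       activation probability, assumed identical for every edge
    """
    seeds = set()
    for community in targets:
        pool = sorted(members[community])
        seeds.update(rng.sample(pool, min(s, len(pool))))
    active, frontier = set(seeds), sorted(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in flow.get(u, ()):
                # each newly active user gets one attempt per inactive neighbour
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active


def activation_fraction(active, all_users):
    """Performance measure: share of all users activated."""
    return len(active) / len(all_users)
```

A run would rank communities with `top_communities(impact_focus(J), q)`, pass the result as `targets` to `simulate`, and average `activation_fraction` over many repetitions, since both the seed sampling and the cascade are random.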
Evaluation Data-Set
• 51 weeks of data from the largest Irish discussion board system
• Segmented using a 1-week sliding window
  • A 1-week window represents approx. 84% of cross-fora posting activity
• 540 communities in total
• 5,298 avg. nodes per snapshot
• 26,484 avg. edges per snapshot
Results: Avg. Performance
[Figure: five panels, one per number of targeted communities q = 1, ..., 5. x-axis: user sample size (s), 2 to 14; y-axis: mean activation fraction (a), 0.1 to 0.6; series: IF, GI, R]
• Impact focus outperformed the other two methods, particularly for small numbers of targeted communities and of seed users sampled from them
• The diffusion process became saturated on avg. at approx. 60% of the users activated
Results: IF outperforms GI, R
[Figure: two panels comparing the activation fraction (a) of IF, GI and R, for user sample sizes s = 1 and s = 12]
Conclusion
• The evaluation demonstrated that the framework
  • is able to identify highly influential communities
  • can predict which communities to stimulate (e.g. by posting a message) such that the stimulus spreads efficiently
• We aim to extend it with content analysis
  • E.g. What are the most influential communities with respect to a particular topic?
• We will also investigate empirically-observed topic cascades and modify our models accordingly if needed
References
• Belák V., Lam S., Hayes C. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.
• Everett M., Borgatti S. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.
• Kempe D., Kleinberg J., Tardos É. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.