On Bharathi-Kempe-Salek Conjecture about Influence Maximization Ding-Zhu Du University of Texas at...

On Bharathi-Kempe-Salek Conjecture about Influence Maximization

Ding-Zhu Du
University of Texas at Dallas

Outline
• Influence Max
• BKS-conjecture

What is a Social Network?

Wikipedia Definition: Social Structure
• Nodes: Social actors (individuals or organizations)
• Links: Social relations

What is Social Influence?

• Social influence occurs when one's opinions, emotions, or behaviors are affected by others, intentionally or unintentionally.[1]

– Informational social influence: to accept information from another;

– Normative social influence: to conform to the positive expectations of others.

[1] http://en.wikipedia.org/wiki/Social_influence

Kate Middleton effect

“Kate Middleton effect”: the trend effect that Kate, Duchess of Cambridge, has on others, from cosmetic surgery for brides to sales of coral-colored jeans.

Hike in Sales of Special Products

According to Newsweek, "The Kate Effect may be worth £1 billion to the UK fashion industry."

Tony DiMasso, L. K. Bennett’s US president, stated in 2012, "...when she does wear something, it always seems to go on a waiting list."

How to Find Kate?

• Influential persons often have many friends.

• Kate is one of the people who have many friends in this social network.

For more Kates, it’s not as easy as you might think!

Influence Maximization

• Given a digraph and k > 0,

• Find k seeds (Kates) to maximize the number of influenced persons (possibly in many steps).

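Theorem. Influence Max is NP-hard.

Proof. Reduce from Set-Cover: $\text{Set-Cover} \le_m^p \text{Influence Max}$.

Set-Cover: Given a collection of subsets $S_1, \ldots, S_m$ of a ground set $U = \{u_1, \ldots, u_n\}$ and integer $k > 0$, find $k$ of the subsets whose union is equal to $U$.

[Figure: bipartite graph with nodes $S_1, S_2, \ldots, S_m$ in one row and $u_1, u_2, \ldots, u_n$ in the other; $S_j$ and $u_i$ are joined whenever $u_i \in S_j$.]

Set-Cover has a solution $\iff$ $k$ seeds influence $n + k$ nodes.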
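The reduction can be checked by brute force on a tiny instance. The sketch below is only illustrative: it assumes edges point from each subset node $S_j$ to its element nodes $u_i$ and uses the deterministic diffusion model; the function names and the node encoding are hypothetical.

```python
from itertools import combinations

def reduction_graph(subsets):
    """Assumed construction: an edge S_j -> u for every element u in S_j."""
    return {("S", j): [("u", u) for u in sorted(s)] for j, s in enumerate(subsets)}

def influenced(graph, seeds):
    """Deterministic diffusion; one step suffices because element nodes have no out-edges."""
    active = set(seeds)
    for s in seeds:
        active.update(graph.get(s, ()))
    return len(active)

def has_cover(universe, subsets, k):
    """Brute-force Set-Cover: do some k subsets have union equal to the universe?"""
    return any(set().union(*c) == set(universe) for c in combinations(subsets, k))

def k_seeds_reach(universe, subsets, k):
    """Brute-force Influence Max: can some k seeds influence n + k nodes?"""
    graph = reduction_graph(subsets)
    nodes = list(graph) + [("u", u) for u in universe]
    target = len(universe) + k
    return any(influenced(graph, c) >= target for c in combinations(nodes, k))
```

On any small instance the two predicates should agree, which is exactly the equivalence stated on the slide.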

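Modularity of Influence

Theorem. Let $\sigma(A)$ denote the number of nodes influenced by seed set $A$. Then $\sigma(A)$ is monotone increasing and submodular:

$$A \subset B \;\Rightarrow\; \Delta_v \sigma(A) \ge \Delta_v \sigma(B), \qquad \text{where } \Delta_v \sigma(A) = \sigma(A \cup \{v\}) - \sigma(A).$$

Theorem. Greedy is a $(1 - e^{-1})$-approximation for Max. Influence.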
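A minimal sketch of the greedy rule, assuming the deterministic model, where the set influenced by a seed set is the union of the sets influenced by each seed alone. The dictionary `reach` (node to the set of nodes it influences) and both function names are hypothetical.

```python
def coverage(reach, seeds):
    """sigma(seeds): size of the union of the individual reach sets."""
    covered = set()
    for s in seeds:
        covered |= reach[s]
    return len(covered)

def greedy(reach, k):
    """Pick k seeds, each time adding the node with the largest resulting sigma."""
    seeds = []
    for _ in range(k):
        candidates = sorted(v for v in reach if v not in seeds)
        if not candidates:
            break
        best = max(candidates, key=lambda v: coverage(reach, seeds + [v]))
        seeds.append(best)
    return seeds
```

Maximizing $\sigma(A \cup \{v\})$ is the same as maximizing the marginal gain $\Delta_v \sigma(A)$, since $\sigma(A)$ is fixed within a round.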

Diffusion Model

• Deterministic diffusion model
• Independent Cascade (IC)
• Linear Threshold (LT)

Independent Cascade (IC) Model

• When node v becomes active, it has a single chance of activating each currently inactive neighbor w.

• The activation attempt succeeds with probability $p_{vw}$.

• The deterministic model is a special case of the IC model. In this case, $p_{vw} = 1$ for all $(v, w)$.
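The cascade rule above can be simulated directly. This is a sketch under assumptions: the nested dictionary `prob[v][w]` holding $p_{vw}$ and the function name `simulate_ic` are hypothetical, and the random generator is passed in so runs are reproducible.

```python
import random

def simulate_ic(prob, seeds, rng):
    """One run of Independent Cascade; prob[v][w] is the probability p_vw."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for v in frontier:
            # v gets a single chance at each currently inactive neighbor w.
            for w, p in prob.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    newly_active.append(w)
        frontier = newly_active
    return active
```

Averaging `len(simulate_ic(...))` over many runs gives a Monte Carlo estimate of the expected spread; with every probability equal to 1 the run is the deterministic model.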

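Example

[Figure: IC diffusion on a small digraph with nodes including v, w, U, X, Y and edge probabilities between 0.1 and 0.6. Legend: inactive node, active node, newly active node, successful attempt, unsuccessful attempt. The animation ends with "Stop!".]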

IC Model

• Each person can tell only one person at each moment.

• However, each person may hear from many persons.


Linear Threshold (LT) Model

• A node v has a random threshold $\theta_v \sim U[0,1]$.
• A node v is influenced by each neighbor w according to a weight $b_{w,v}$ such that $\sum_{w \text{ neighbor of } v} b_{w,v} \le 1$.
• A node v becomes active when at least a (weighted) $\theta_v$ fraction of its neighbors are active: $\sum_{w \text{ active neighbor of } v} b_{w,v} \ge \theta_v$.
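A sketch of one Linear Threshold run, assuming weights are stored as `weight[v][w]` for the influence of neighbor w on v, with the weights into each node summing to at most 1. The names are hypothetical. A node with no active neighbor is never activated here, even if its threshold is drawn as exactly 0; that guard is a choice of this sketch.

```python
import random

def simulate_lt(weight, seeds, rng):
    """One run of Linear Threshold; weight[v][w] is the influence of neighbor w on v."""
    nodes = set(weight) | {w for ws in weight.values() for w in ws} | set(seeds)
    theta = {v: rng.random() for v in nodes}  # thresholds drawn uniformly from [0, 1)
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in nodes - active:
            total = sum(b for w, b in weight.get(v, {}).items() if w in active)
            if total > 0 and total >= theta[v]:
                active.add(v)
                changed = True
    return active
```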

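Example

[Figure: LT diffusion on a small digraph with nodes including v, w, U, X, Y and edge weights between 0.1 and 0.6. Legend: inactive node, active node, threshold, active neighbors. The animation ends with "Stop!".]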

Influence Maximization Problem

• Influence spread of node set S: σ(S), the expected number of active nodes at the end of the diffusion process, if set S is the initial active set.

• Problem Definition (by Kempe et al., 2003): (Influence Maximization). Given a directed and edge-weighted social graph G = (V, E, p), a diffusion model m, and an integer k ≤ |V|, find a set S ⊆ V, |S| = k, such that the expected influence spread σ_m(S) is maximum.

Known Results

• Bad news: NP-hard optimization problem for both IC and LT models.
• Good news:
  • σ_m(S) is monotone and submodular.
  • We can use the Greedy algorithm!
• Theorem: The resulting set S activates at least (1 − 1/e) (> 63%) of the number of nodes that any size-k set could activate.

Outline
• Influence Max
• BKS-Conjecture

Bharathi-Kempe-Salek Conjecture

Influence maximization is NP-hard for arborescence directed into a root.

Shishir Bharathi, David Kempe, Mahyar Salek: Competitive Influence Maximization in Social Networks. WINE 2007: 306-311

Diffusion Model

• Deterministic diffusion model: polynomial-time.
• Linear Threshold (LT): polynomial-time.
• Independent Cascade (IC): PTAS

Deterministic Diffusion Model

When a node becomes active (infected or protected), it activates all of its currently inactive (not infected and not protected) neighbors.

The activation attempts succeed with probability 1.

Deterministic Model

Example

[Figure: digraph on nodes 1 to 6.]

Both 1 and 6 are source nodes.

Step 1: 1 → 2, 3; 6 → 2, 4.

Step 2: 4 → 5.
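The two steps of this example can be replayed with a level-by-level traversal. The edge list below is an assumption read off the steps (1 to 2 and 3, 6 to 2 and 4, 4 to 5); the figure may contain other edges. The function name `diffuse` is hypothetical.

```python
def diffuse(graph, sources):
    """Deterministic diffusion; returns the nodes newly activated at each step."""
    active = set(sources)
    frontier = sorted(active)
    steps = []
    while frontier:
        newly_active = []
        for v in frontier:
            for w in graph.get(v, ()):
                if w not in active:
                    active.add(w)
                    newly_active.append(w)
        if newly_active:
            steps.append(sorted(newly_active))
        frontier = newly_active
    return steps

# Edges assumed from the example's steps.
example = {1: [2, 3], 6: [2, 4], 4: [5]}
```

With sources 1 and 6, `diffuse(example, [1, 6])` yields nodes 2, 3, 4 in the first step and node 5 in the second.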

A Property of Optimal Solution

In an optimal solution, every seed should be located at a leaf.

For simplicity, assume the number of leaves is at least $k$.

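Naïve Dynamic Programming

[Figure: node $v$ with in-neighbors $u_1, \ldots, u_i, \ldots, u_{d_v}$; the subtree rooted at $u_i$ receives $k_i$ seeds.]

$d_v$ is the degree of $v$.

$f(v, k)$ is the maximum number of influenced nodes when $k$ seeds are placed in the arborescence rooted at $v$.

If $v$ is a leaf, then

$$f(v, k) = \begin{cases} 0, & \text{if } k = 0, \\ 1, & \text{if } k > 0. \end{cases}$$

If $v$ is not a leaf, then (for $k > 0$; with no seeds, $f(v, 0) = 0$)

$$f(v, k) = 1 + \max_{k_1 + \cdots + k_{d_v} = k} \{ f(u_1, k_1) + \cdots + f(u_{d_v}, k_{d_v}) \}.$$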

Running Time

$d_v$ denotes the in-degree of $v$.

$O((k + d_v)^{d_v})$ allocations are examined to distribute $k$ seeds to the subtrees rooted at the in-neighbors of $v$.

It is not polynomial-time!

Counting

$k$ persons are allocated into $d_v$ different rooms.

$$\#\text{ of allocations} = \binom{k + d_v - 1}{d_v - 1} = O\big((k + d_v)^{d_v - 1}\big), \qquad \binom{k + d_v - 1}{k} = O\big((k + d_v)^{k}\big).$$
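The count of allocations can be cross-checked by enumeration. In this sketch `d` stands for $d_v$; both function names are hypothetical.

```python
from itertools import product
from math import comb

def count_allocations(k, d):
    """Stars and bars: ways to split k identical seeds among d distinct subtrees."""
    return comb(k + d - 1, d - 1)

def brute_force_allocations(k, d):
    """Enumerate all (k_1, ..., k_d) with nonnegative entries that sum to k."""
    return sum(1 for ks in product(range(k + 1), repeat=d) if sum(ks) == k)
```

For fixed `k` the count grows polynomially in `d`, and for fixed `d` polynomially in `k`, but when both grow together it is exponential, which is why the naive recurrence is not polynomial-time.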

Virtual Nodes

[Figure: a node $v$ with many in-neighbors is replaced by a binary subtree using virtual nodes.]

Change arborescence to binary arborescence.

At most n virtual nodes can be introduced.

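Weight

$$w_u = \begin{cases} 0 & \text{if } u \text{ is a virtual node}, \\ 1 & \text{otherwise}. \end{cases}$$

$f(v, k)$ is the maximum total weight of influenced nodes when $k$ seeds are placed in the arborescence rooted at $v$.

[Figure: node $v$ with in-neighbors $u_1$ and $u_2$; the subtree rooted at $u_1$ receives $k_1$ seeds.]

If $v$ is a leaf, then

$$f(v, k) = \begin{cases} 0, & \text{if } k = 0, \\ 1, & \text{if } k > 0. \end{cases}$$

If $v$ is not a leaf, then (for $k > 0$; with no seeds, $f(v, 0) = 0$)

$$f(v, k) = w_v + \max_{k_1 + k_2 = k} \{ f(u_1, k_1) + f(u_2, k_2) \}.$$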
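A minimal sketch of this recurrence on a binary arborescence under the deterministic model. The representation is an assumption: `children[v]` lists the in-neighbors of `v` (at most two), and `weight` maps virtual nodes to 0 while every other node defaults to weight 1. The function name is hypothetical.

```python
from functools import lru_cache

def max_influence(children, weight, root, k):
    """f(root, k): maximum total weight influenced with k seeds below root."""
    @lru_cache(maxsize=None)
    def f(v, seeds_left):
        if seeds_left == 0:
            return 0  # no seed in this subtree, so nothing is influenced
        kids = children.get(v, ())
        w_v = weight.get(v, 1)
        if not kids:
            return w_v  # leaf holding at least one seed
        if len(kids) == 1:
            return w_v + f(kids[0], seeds_left)
        u1, u2 = kids
        # Only seeds_left + 1 splits to try, instead of exponentially many.
        return w_v + max(f(u1, k1) + f(u2, seeds_left - k1)
                         for k1 in range(seeds_left + 1))
    return f(root, k)

# A star r <- {a, b, c} binarized with one virtual node x of weight 0.
children = {"r": ("a", "x"), "x": ("b", "c")}
weight = {"x": 0}
```

In this star, one seed at any leaf influences that leaf and the root, so `max_influence(children, weight, "r", 1)` is 2, and three seeds reach all four real nodes.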

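Linear Threshold (LT) Model

• A node v has a random threshold $\theta_v \sim U[0,1]$.
• A node v is influenced by each neighbor w according to a weight $b_{w,v}$ such that $\sum_{w \text{ neighbor of } v} b_{w,v} \le 1$.
• A node v becomes active when at least a (weighted) $\theta_v$ fraction of its neighbors are active: $\sum_{w \text{ active neighbor of } v} b_{w,v} \ge \theta_v$.

Example

[Figure: LT diffusion on a small digraph with nodes including v, w, U, X, Y and edge weights between 0.1 and 0.6. Legend: inactive node, active node, threshold, active neighbors. The animation ends with "Stop!".]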

A property

For a given seed set $A$, the following two distributions over sets of nodes are the same:

(1) The distribution over active sets obtained by running the Linear Threshold process to completion starting from $A$.

(2) The distribution over sets reachable from $A$ via live-edge paths, under the random selection of live edges.

$p_{u,w} = b_{u,w} \;\Rightarrow\; \sigma_{LT}(A) = \sigma_{@C}(A)$?

@C model

• Influence can be made only through private talk of person to person.

Important understanding on IC

If a node $v$ has $k$ out-neighbors $u_1, u_2, \ldots, u_k$, then there are only $k + 1$ possible mutually exclusive events:

(1) $v$ makes $u_1$ active with probability $p_{vu_1}$.
...
($k$) $v$ makes $u_k$ active with probability $p_{vu_k}$.
($k+1$) $v$ makes nobody active with probability $1 - p_{vu_1} - \cdots - p_{vu_k}$.

Equivalent Networks

[Figure: two equivalent networks with edge probabilities $p_1$, $p_2$, $p_3$.]

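Additional Condition in @C

If a node $v$ has $k$ (incoming) neighbors $u_1, u_2, \ldots, u_k$, then there are only $k + 1$ possible mutually exclusive events:

(1) $u_1$ makes $v$ active with probability $p_{u_1 v}$.
...
($k$) $u_k$ makes $v$ active with probability $p_{u_k v}$.
($k+1$) nobody makes $v$ active with probability $1 - p_{u_1 v} - \cdots - p_{u_k v}$.

Equivalent Networks

[Figure: two equivalent networks with edge probabilities $p_1$, $p_2$, $p_3$.]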

A Property of @C

$$\sigma_{@C}(A) = \sum_{v \in V} \sum_{u \in A} \sum_{P \in P(u,v)} \prod_{e \in P} p_e$$

$p_e$: probability of edge $e$ being alive.

$P(u, v)$: set of paths from $u$ to $v$.

[Figure: small examples with edge probabilities $x$ and $y$.]

In an optimal solution, each seed may not need to be located at a leaf.

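$$f(v, k) = \max\{ f(v, k \mid v \text{ is a seed}),\; f(v, k \mid v \text{ is not a seed}) \}$$

At seed v

[Figure: node $v$ with in-neighbors $u_1$ and $u_2$; the subtree rooted at $u_1$ receives $k_1$ seeds.]

$$f(v, k \mid v \text{ is a seed}) = 1 + \max_{k_1 + k_2 = k - 1} \{ f(u_1, k_1) + f(u_2, k_2) \}$$

At non-seed v

$$f(v, k \mid v \text{ is not a seed}) = w_v \Pr(v \text{ becomes active}) + \max_{k_1 + k_2 = k} \{ f(u_1, k_1) + f(u_2, k_2) \}$$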

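At non-seed v

[Figure: node $v$ with in-neighbors $u_1$ and $u_2$; the subtree rooted at $u_1$ receives $k_1$ seeds.]

$$f(v, k \mid v \text{ is not a seed}) = \max_{k_1 + k_2 = k} \{ f(u_1, 1, k_1) + f(u_2, 1, k_2) \}$$

At non-seed v

$$f(v, i, k \mid v \text{ is not a seed}) = \max_{k_1 + k_2 = k} \{ f(u_1, i + 1, k_1) + f(u_2, i + 1, k_2) \}$$

At seed v

[Figure: path from $v$ toward the root through $v_1, v_2, \ldots$ with edge probabilities $p_1, p_2, \ldots$]

$$f(v, i, k \mid v \text{ is a seed}) = 1 + p_1 w_{v_1} + p_1 p_2 w_{v_2} + \cdots + p_1 p_2 \cdots p_i w_{v_i} + \max_{k_1 + k_2 = k - 1} \{ f(u_1, k_1) + f(u_2, k_2) \}$$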

Independent Cascade (IC) Model

• When node v becomes active, it has a single chance of activating each currently inactive neighbor w.

• The activation attempt succeeds with probability $p_{vw}$.

• The deterministic model is a special case of the IC model. In this case, $p_{vw} = 1$ for all $(v, w)$.

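Example

[Figure: IC diffusion on a small digraph with nodes including v, w, U, X, Y and edge probabilities between 0.1 and 0.6. Legend: inactive node, active node, newly active node, successful attempt, unsuccessful attempt. The animation ends with "Stop!".]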

IC Model

• Each person can tell only one person at each moment.

• However, each person may hear from many persons.

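Important understanding on IC

If a node $v$ has $k$ out-neighbors $u_1, u_2, \ldots, u_k$, then there are only $k + 1$ possible mutually exclusive events:

(1) $v$ makes $u_1$ active with probability $p_{vu_1}$.
...
($k$) $v$ makes $u_k$ active with probability $p_{vu_k}$.
($k+1$) $v$ makes nobody active with probability $1 - p_{vu_1} - \cdots - p_{vu_k}$.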

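At non-seed v

[Figure: node $v$ with in-neighbors $u_1, \ldots, u_i$; the subtree rooted at $u_i$ receives $k_i$ seeds.]

$$f(v, k \mid v \text{ is not a seed}) = w_v \Pr(v \text{ becomes active}) + \max_{k_1 + k_2 = k} \{ f(u_1, k_1) + f(u_2, k_2) \}$$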

Another Dynamic Programming

$f(v, k, q) = f(v, k)$ under the condition $\Pr(v \text{ active}) = q$.

$$f(v, k, q) = \max_{\substack{k_1 + k_2 = k \\ 1 - (1 - q_1)(1 - q_2) = q}} \{ f(u_1, k_1, q_1) + f(u_2, k_2, q_2) \}$$

where $q_1 = \Pr(u_1 \text{ becomes active})$ and $q_2 = \Pr(u_2 \text{ becomes active})$.

How many possible values of $q$?
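To see how activation probabilities combine bottom-up, here is a sketch that computes the exact expected IC spread on an arborescence directed into a root, and then searches seed sets by brute force. It assumes explicit edge probabilities `p[(u, v)]` on each edge u → v (setting them all to 1 gives the combination rule $q = 1 - (1 - q_1)(1 - q_2)$ above); `children[v]` lists the in-neighbors of `v`. All names are hypothetical, and the brute-force search is exponential, so it is only for tiny examples.

```python
from itertools import combinations

def expected_spread(children, p, root, seeds):
    """Exact expected number of active nodes under IC on an in-arborescence."""
    def visit(v):
        total, miss = 0.0, 1.0
        for u in children.get(v, ()):
            q_u, subtotal = visit(u)
            total += subtotal
            # Subtrees are disjoint, so the children's attempts are independent.
            miss *= 1.0 - p[(u, v)] * q_u
        q_v = 1.0 if v in seeds else 1.0 - miss
        return q_v, total + q_v
    return visit(root)[1]

def best_seed_set(children, p, root, k):
    """Brute force over all size-k seed sets (exponential; illustration only)."""
    nodes = {root} | set(children) | {u for us in children.values() for u in us}
    return max(combinations(sorted(nodes), k),
               key=lambda s: expected_spread(children, p, root, set(s)))
```

Different seed sets generally give different values of $q$ at the same node, which is the difficulty the closing question of the slide points at.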

Open Problem

• IC model
• Parameterized algorithms with treewidth as parameter in IC model.

Bharathi-Kempe-Salek Conjecture

Influence maximization with IC model is NP-hard for arborescence directed into a root.

Shishir Bharathi, David Kempe, Mahyar Salek: Competitive Influence Maximization in Social Networks. WINE 2007: 306-311

Open!!!

Polynomial-time Algorithm


Primal or incremental method

duality

Primal-dual

Dynamic program

Divide and conquer

greedy

Local ratio

THANK YOU!
