
Post on 18-Jan-2016


On Bharathi-Kempe-Salek Conjecture about Influence Maximization

Ding-Zhu Du, University of Texas at Dallas

Outline
• Influence Max
• BKS-conjecture

What is a Social Network?
Wikipedia Definition: Social Structure
• Nodes: Social actors (individuals or organizations)
• Links: Social relations

What is Social Influence?

• Social influence occurs when one's opinions, emotions, or behaviors are affected by others, intentionally or unintentionally.[1]

– Informational social influence: to accept information from another;

– Normative social influence: to conform to the positive expectations of others.

[1] http://en.wikipedia.org/wiki/Social_influence

Kate Middleton effect

“Kate Middleton effect: the trend effect that Kate, Duchess of Cambridge has on others, from cosmetic surgery for brides, to sales of coral-colored jeans.”

According to Newsweek, "The Kate Effect may be worth £1 billion to the UK fashion industry."

Tony DiMasso, L. K. Bennett’s US president, stated in 2012, "...when she does wear something, it always seems to go on a waiting list."

Hike in Sales of Special Products


• Influential persons often have many friends.

• Kate is one of the persons that have many friends in this social network.

For more Kates, it’s not as easy as you might think!

How to Find Kate?


Influence Maximization

• Given a digraph and k > 0,

• Find k seeds (Kates) to maximize the number of influenced persons (possibly in many steps).

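Theorem

Influence Max is NP-hard.

Proof

Set-Cover $\le_m^p$ Influence Max.

Set-Cover: Given a collection of subsets $S_1, \ldots, S_m$ of a ground set $U = \{u_1, \ldots, u_n\}$ and an integer $k > 0$, find $k$ of the subsets whose union is equal to $U$.

Construction: one node for each subset $S_1, S_2, \ldots, S_m$ and one node for each element $u_1, u_2, \ldots, u_n$, with an edge $(S_j, u_i)$ whenever $u_i \in S_j$.

Set-Cover has a solution $\iff$ $k$ seeds influence $n + k$ nodes.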
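The reduction can be checked on a toy instance. The following is only a minimal sketch under the deterministic diffusion model; the function names `build_reduction_graph` and `spread` and the sample instance are illustrative assumptions, not part of the slides.

```python
from collections import deque

def build_reduction_graph(subsets, ground):
    """Digraph of the reduction: edge S_j -> u_i whenever u_i is in S_j."""
    graph = {("S", j): [("u", u) for u in sorted(s)] for j, s in enumerate(subsets)}
    for u in ground:
        graph[("u", u)] = []          # element nodes have no out-edges
    return graph

def spread(graph, seeds):
    """Number of nodes reached from the seeds (deterministic diffusion)."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen)

# Hypothetical instance: n = 4 elements, m = 3 subsets, k = 2.
ground = {1, 2, 3, 4}
subsets = [{1, 2}, {3, 4}, {2, 3}]
g = build_reduction_graph(subsets, ground)
n, k = len(ground), 2

# S_0 and S_1 cover U, so these 2 seeds influence n + k = 6 nodes.
print(spread(g, [("S", 0), ("S", 1)]) == n + k)   # True
# S_0 and S_2 miss element 4, so fewer than n + k nodes are influenced.
print(spread(g, [("S", 0), ("S", 2)]) == n + k)   # False
```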

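Modularity of Influence

Theorem

Let $\sigma(A)$ denote the # of nodes influenced by seed set $A$. Then $\sigma(A)$ is monotone increasing and submodular:

$$A \subset B \;\Rightarrow\; \Delta_v \sigma(A) \ge \Delta_v \sigma(B),$$

where $\Delta_v \sigma(A) = \sigma(A \cup \{v\}) - \sigma(A)$.

Theorem

Greedy is a $(1 - e^{-1})$-approximation for Max. Influence.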
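A minimal sketch of the greedy rule follows, assuming the deterministic diffusion model so that the spread can be computed exactly by reachability. Under a probabilistic model the spread would have to be estimated instead. The names `reachable` and `greedy_seeds` and the sample graph are illustrative assumptions.

```python
def reachable(graph, seeds):
    """Set of nodes influenced by the seeds under deterministic diffusion."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        v = stack.pop()
        for w in graph.get(v, []):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def greedy_seeds(graph, k):
    """Pick k seeds one at a time, each maximizing the resulting spread."""
    chosen = []
    for _ in range(k):
        candidates = [v for v in graph if v not in chosen]
        best = max(candidates, key=lambda v: len(reachable(graph, chosen + [v])))
        chosen.append(best)
    return chosen

# Hypothetical digraph given as adjacency lists.
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": [], "e": ["f"], "f": []}
seeds = greedy_seeds(graph, 2)
print(seeds, len(reachable(graph, seeds)))   # ['a', 'e'] 6
```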

Diffusion Model

• Deterministic diffusion model
• Independent Cascade (IC)
• Linear Threshold (LT)

Independent Cascade (IC) Model

• When node v becomes active, it has a single chance of activating each currently inactive neighbor w.

• The activation attempt succeeds with probability pvw .

• The deterministic model is a special case of IC model. In this case, pvw =1 for all (v,w).
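One run of the process described in these bullets can be simulated as below. This is a sketch only: the adjacency format (`graph[v]` as a list of `(w, p_vw)` pairs) and the names `simulate_ic` and `estimate_spread` are assumptions made for illustration.

```python
import random

def simulate_ic(graph, seeds, rng):
    """One random run of Independent Cascade; returns the final active set."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for v in frontier:
            # v gets a single chance at each currently inactive neighbor w.
            for w, p in graph.get(v, []):
                if w not in active and rng.random() < p:
                    active.add(w)
                    newly_active.append(w)
        frontier = newly_active
    return active

def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs

# With every probability equal to 1 the process is the deterministic model.
det = {"v": [("w", 1.0)], "w": [("y", 1.0)], "y": []}
print(simulate_ic(det, ["v"], random.Random(0)) == {"v", "w", "y"})   # True
```

Because the estimate averages independent runs, its accuracy depends on the number of runs chosen.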

Example

[Figure: IC diffusion on a small network with edge probabilities from 0.1 to 0.6. Legend: inactive node, active node, newly active node, successful attempt, unsuccessful attempt. The process runs until it stops.]

IC Model

• Each person can tell only one person at each moment.

• However, each person may hear from many persons.


Linear Threshold (LT) Model

• A node v has random threshold $\theta_v \sim U[0,1]$.
• A node v is influenced by each neighbor w according to a weight $b_{w,v}$ such that
  $$\sum_{w \text{ neighbor of } v} b_{w,v} \le 1.$$
• A node v becomes active when at least a (weighted) $\theta_v$ fraction of its neighbors are active:
  $$\sum_{w \text{ active neighbor of } v} b_{w,v} \ge \theta_v.$$
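The activation rule can be sketched as follows. Thresholds are passed in explicitly so that a run is reproducible; drawing them uniformly from [0, 1] gives the random process. The data layout (`in_weights[v]` maps each in-neighbor w to its weight) and the function name `simulate_lt` are assumptions for illustration.

```python
import random

def simulate_lt(in_weights, seeds, thresholds):
    """Run Linear Threshold to completion; returns the final active set."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, weights in in_weights.items():
            if v in active:
                continue
            # Total weight of the currently active in-neighbors of v.
            total = sum(b for w, b in weights.items() if w in active)
            if total >= thresholds[v]:
                active.add(v)
                changed = True
    return active

# Hypothetical network: a influences b and c, b influences c.
in_weights = {"a": {}, "b": {"a": 0.5}, "c": {"a": 0.3, "b": 0.3}}

print(sorted(simulate_lt(in_weights, ["a"], {"a": 0.9, "b": 0.4, "c": 0.5})))
# ['a', 'b', 'c']
print(sorted(simulate_lt(in_weights, ["a"], {"a": 0.9, "b": 0.4, "c": 0.7})))
# ['a', 'b']

# Random thresholds, as in the model definition.
rng = random.Random(0)
random_thresholds = {v: rng.random() for v in in_weights}
```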

Example

[Figure: LT diffusion on a small network with edge weights from 0.1 to 0.6. Legend: inactive node, active node, threshold, active neighbors. The process runs until it stops.]

Influence Maximization Problem

• Influence spread of node set S: σ(S) – expected number of active nodes at the end of the diffusion process, if set S is the initial active set.

• Problem Definition (by Kempe et al., 2003): (Influence Maximization). Given a directed and edge-weighted social graph G = (V, E, p), a diffusion model m, and an integer k ≤ |V|, find a set S ⊆ V, |S| = k, such that the expected influence spread σm(S) is maximum.

Known Results

• Bad news: NP-hard optimization problem for both IC and LT models.
• Good news:
  • σm(S) is monotone and submodular.
  • We can use the Greedy algorithm!

• Theorem: The resulting set S activates at least a (1 − 1/e) fraction (> 63%) of the number of nodes that any size-k set could activate.

Outline
• Influence Max
• BKS-Conjecture

Bharathi-Kempe-Salek Conjecture

Influence maximization is NP-hard for arborescence directed into a root.

Shishir Bharathi, David Kempe, Mahyar Salek: Competitive Influence Maximization in Social Networks. WINE 2007: 306-311

Diffusion Model

• Deterministic diffusion model – polynomial-time.
• Linear Threshold (LT) – polynomial-time.
• Independent Cascade (IC) – PTAS

Deterministic Diffusion Model

When a node becomes active (infected or protected), it activates all of its currently inactive (not infected and not protected) neighbors.

The activation attempts succeed with probability 1.

Deterministic Model: Example

[Figure: a digraph on nodes 1 to 6.]

Both 1 and 6 are source nodes.

Step 1: 1 activates 2 and 3; 6 activates 2 and 4.

Step 2: 4 activates 5.
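The two steps of this example can be replayed in code. The sketch below assumes the edges 1→2, 1→3, 6→2, 6→4 and 4→5, as read off the step descriptions; the function name `deterministic_rounds` is hypothetical.

```python
def deterministic_rounds(graph, seeds):
    """Return the list of newly activated nodes in each step."""
    active = set(seeds)
    frontier = sorted(seeds)
    rounds = []
    while frontier:
        # Every active node activates all of its inactive out-neighbors.
        newly = sorted({w for v in frontier for w in graph.get(v, [])
                        if w not in active})
        if not newly:
            break
        active.update(newly)
        rounds.append(newly)
        frontier = newly
    return rounds

graph = {1: [2, 3], 6: [2, 4], 4: [5]}
print(deterministic_rounds(graph, [1, 6]))   # [[2, 3, 4], [5]]
```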

A Property of Optimal Solution

In an optimal solution, every seed should be located at a leaf.

For simplicity, assume the number of leaves is at least k.

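Naïve Dynamic Programming

$d_v$ is the degree of $v$, and $u_1, \ldots, u_{d_v}$ are its in-neighbors.

$f(v, k)$ is the maximum number of influenced nodes when $k$ seeds are placed in the arborescence rooted at $v$.

If $v$ is not a leaf, then for $k > 0$

$$f(v, k) = 1 + \max_{k_1 + \cdots + k_{d_v} = k} \{ f(u_1, k_1) + \cdots + f(u_{d_v}, k_{d_v}) \},$$

and $f(v, 0) = 0$.

If $v$ is a leaf, then

$$f(v, k) = \begin{cases} 1, & \text{if } k > 0, \\ 0, & \text{if } k = 0. \end{cases}$$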

Running Time

$d_v$ denotes the in-degree of $v$.

$O((k + d_v)^{k})$ allocations are examined to distribute $k$ seeds to the subtrees rooted at in-neighbors of $v$.

It is not polynomial-time!

Counting

$k$ persons are allocated into $d_v$ different rooms.

# of allocations $= \binom{k + d_v - 1}{d_v - 1} = O((k + d_v)^{d_v})$, and also $= \binom{k + d_v - 1}{k} = O((k + d_v)^{k})$.
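The count of allocations can be checked by brute force on small values. This sketch uses only the standard library; `count_allocations` is a hypothetical helper that enumerates every way to split k seeds among d subtrees.

```python
from itertools import product
from math import comb

def count_allocations(k, d):
    """Count tuples (k_1, ..., k_d) of non-negative integers summing to k."""
    return sum(1 for ks in product(range(k + 1), repeat=d) if sum(ks) == k)

k, d = 4, 3
print(count_allocations(k, d))                    # 15
print(comb(k + d - 1, d - 1), comb(k + d - 1, k)) # 15 15
```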

Virtual Nodes

Change arborescence to binary arborescence.

At most n virtual nodes can be introduced.

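Weight

$$w_u = \begin{cases} 0, & \text{if } u \text{ is a virtual node}, \\ 1, & \text{otherwise}. \end{cases}$$

$f(v, k)$ is the maximum total weight of influenced nodes when $k$ seeds are placed in the arborescence rooted at $v$.

Naïve Dynamic Programming

If $v$ is not a leaf, with in-neighbors $u_1$ and $u_2$, then for $k > 0$

$$f(v, k) = w_v + \max_{k_1 + k_2 = k} \{ f(u_1, k_1) + f(u_2, k_2) \},$$

and $f(v, 0) = 0$.

If $v$ is a leaf, then

$$f(v, k) = \begin{cases} 1, & \text{if } k > 0, \\ 0, & \text{if } k = 0. \end{cases}$$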
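The recurrence on a binary arborescence can be sketched as a memoized recursion. The input format (`children[v]` lists the at most two in-neighbors of v, `weight[v]` is 0 for virtual nodes and 1 otherwise), the name `max_influence`, and the sample tree are assumptions for illustration.

```python
from functools import lru_cache

def max_influence(children, weight, root, k):
    """Maximum total weight influenced with k seeds, deterministic model."""
    @lru_cache(maxsize=None)
    def f(v, j):
        if j == 0:
            return 0                      # no seed below v, nothing is active
        kids = children.get(v, [])
        if not kids:
            return 1                      # a seed at a leaf activates the leaf
        if len(kids) == 1:
            return weight[v] + f(kids[0], j)
        u1, u2 = kids
        # Try every split j = k1 + k2 between the two subtrees.
        return weight[v] + max(f(u1, k1) + f(u2, j - k1) for k1 in range(j + 1))
    return f(root, k)

# Hypothetical tree: r has in-neighbors a, b, c and a has in-neighbor d.
# The virtual node x (weight 0) makes r binary: r <- {a, x}, x <- {b, c}.
children = {"r": ["a", "x"], "a": ["d"], "x": ["b", "c"]}
weight = {"r": 1, "a": 1, "d": 1, "b": 1, "c": 1, "x": 0}

print([max_influence(children, weight, "r", k) for k in (1, 2, 3)])   # [3, 4, 5]
```

Each node is evaluated for at most k + 1 seed counts and each evaluation tries at most k + 1 splits, so this sketch does on the order of n·k² work, which is polynomial.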

Linear Threshold (LT) Model

• A node v has random threshold $\theta_v \sim U[0,1]$.
• A node v is influenced by each neighbor w according to a weight $b_{w,v}$ such that
  $$\sum_{w \text{ neighbor of } v} b_{w,v} \le 1.$$
• A node v becomes active when at least a (weighted) $\theta_v$ fraction of its neighbors are active:
  $$\sum_{w \text{ active neighbor of } v} b_{w,v} \ge \theta_v.$$

A property

For a given seed set $A$, the following two distributions over sets of nodes are the same:

(1) The distribution over active sets obtained by running the Linear Threshold process to completion starting from $A$.

(2) The distribution over sets reachable from $A$ via live-edge paths, under the random selection of live edges.

$$p_{u,w} = b_{u,w} \;\Rightarrow\; \sigma_{LT}(A) = \sigma_{@C}(A)\,?$$

Important understanding on IC

If a node $v$ has $k$ out-neighbors $u_1, u_2, \ldots, u_k$, then there are only $k + 1$ possible mutually exclusive events:

(1) $v$ makes $u_1$ active with probability $p_{vu_1}$.
⋮
(k) $v$ makes $u_k$ active with probability $p_{vu_k}$.
(k+1) $v$ makes nobody active with probability $1 - p_{vu_1} - \cdots - p_{vu_k}$.

Equivalent Networks

[Figure: equivalent networks, with edges labeled $p_1$, $p_2$, $p_3$.]

Additional Condition in @C

If a node $v$ has $k$ (coming) neighbors $u_1, u_2, \ldots, u_k$, then there are only $k + 1$ possible mutually exclusive events:

(1) $u_1$ makes $v$ active with probability $p_{u_1 v}$.
⋮
(k) $u_k$ makes $v$ active with probability $p_{u_k v}$.
(k+1) nobody makes $v$ active with probability $1 - p_{u_1 v} - \cdots - p_{u_k v}$.

Equivalent Networks

[Figure: equivalent networks, with edges labeled $p_1$, $p_2$, $p_3$.]

A Property of @C

$$\sigma_{@C}(A) = \sum_{v \in V} \sum_{u \in A} \sum_{P \in P(u, v)} \prod_{e \in P} p_e$$

$p_e$ = probability of edge $e$ being alive.

$P(u, v)$ = set of paths from $u$ to $v$.

[Figure: example networks with edge probabilities x and y.]

In an optimal solution, each seed may not need to be located at a leaf.

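$$f(v, k) = \max\{ f(v, k \mid v \text{ is a seed}),\; f(v, k \mid v \text{ is not a seed}) \}$$

At seed v

$$f(v, k \mid v \text{ is a seed}) = 1 + \max_{k_1 + k_2 = k - 1} \{ f(u_1, k_1) + f(u_2, k_2) \}$$

At non-seed v

$$f(v, k \mid v \text{ is not a seed}) = w_v \cdot \Pr(v \text{ becomes active}) + \max_{k_1 + k_2 = k} \{ f(u_1, k_1) + f(u_2, k_2) \}$$

At non-seed v

$$f(v, k \mid v \text{ is not a seed}) = \max_{k_1 + k_2 = k} \{ f(u_1, 1, k_1) + f(u_2, 1, k_2) \}$$

At non-seed v

$$f(v, i, k \mid v \text{ is not a seed}) = \max_{k_1 + k_2 = k} \{ f(u_1, i + 1, k_1) + f(u_2, i + 1, k_2) \}$$

At seed v

$$f(v, i, k \mid v \text{ is a seed}) = 1 + p_1 w_{v_1} + p_1 p_2 w_{v_2} + \cdots + p_1 p_2 \cdots p_i w_{v_i} + \max_{k_1 + k_2 = k - 1} \{ f(u_1, k_1) + f(u_2, k_2) \}$$

[Figure: the path from $v$ toward the root through $v_1, v_2, \ldots$, with edge probabilities $p_1, p_2, \ldots$]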

Independent Cascade (IC) Model

• When node v becomes active, it has a single chance of activating each currently inactive neighbor w.

• The activation attempt succeeds with probability pvw .

• The deterministic model is a special case of IC model. In this case, pvw =1 for all (v,w).


At non-seed v

$$f(v, k \mid v \text{ is not a seed}) = w_v \cdot \Pr(v \text{ becomes active}) + \max_{k_1 + k_2 = k} \{ f(u_1, k_1) + f(u_2, k_2) \}$$

Another Dynamic Programming

$$f(v, k, q) = \max_{\substack{k_1 + k_2 = k \\ 1 - (1 - q_1)(1 - q_2) = q}} \{ f(u_1, k_1, q_1) + f(u_2, k_2, q_2) \}$$

where $q_1 = \Pr(u_1 \text{ becomes active})$ and $q_2 = \Pr(u_2 \text{ becomes active})$.

$f(v, k, q) = f(v, k)$ under condition $\Pr(v \text{ active}) = q$.

How many possible values of $q$?

Open Problem

• IC model
• Parameterized algorithms with treewidth as parameter in IC model.

Bharathi-Kempe-Salek Conjecture

Influence maximization with IC model is NP-hard for arborescence directed into a root.

Shishir Bharathi, David Kempe, Mahyar Salek: Competitive Influence Maximization in Social Networks. WINE 2007: 306-311

Open!!!

Polynomial-time Algorithm


Primal or incremental method

duality

Primal-dual

Dynamic program

Divide and conquer

greedy

Local ratio

THANK YOU!