Spreading Process in Multilayer Networks


  • 7/24/2019 Spreading Process in Multilayer Networks

    1/48

    Spreading Process in Multilayer Networks

    Luca Casini

    [email protected]

    Master's Degree Programme in Computer Science (Corso di Laurea Magistrale in Informatica)

    Academic Year 2015/2016


    Introduction


    Networks are used to model many real-world systems,
    e.g. transportation networks, computer networks and (online) social networks.

    Spreading processes may involve humans, information of various kinds, or viral agents.

    Biologists were the first to study the diffusion of pathogens; network science later built on their work.

    These models have gained attention in recent years because of their application to information spreading over communication systems and social networks.

    Early works focused on simple (monoplex) networks, but multilayer networks are becoming more and more important.


    Preliminaries and Definitions


    Multilayer Network

    Multilayer networks are composed of monoplex networks, which are modeled as traditional graphs. We use the following notation:

    V: the set of nodes in a multilayer network

    L: the set of layers in a multilayer network

    n: the number of nodes

    (u, l_u): node u on layer l_u

    ((u, l_u), (v, l_v)): edge between node u on layer l_u and node v on layer l_v
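    The notation above maps naturally onto a small data structure. The following is only a minimal sketch: the class name MultilayerNetwork, its methods and the example node and layer names are illustrative assumptions, and edges are taken to be undirected.

```python
from collections import defaultdict


class MultilayerNetwork:
    """Minimal multilayer graph: vertices are (u, l_u) pairs (hypothetical helper)."""

    def __init__(self):
        # (u, l_u) -> set of neighbouring (v, l_v) pairs
        self.adj = defaultdict(set)

    def add_edge(self, u, lu, v, lv):
        # Undirected edge ((u, l_u), (v, l_v)); assumption: edges have no direction.
        self.adj[(u, lu)].add((v, lv))
        self.adj[(v, lv)].add((u, lu))

    @property
    def V(self):
        # Set of nodes, regardless of the layer they appear on.
        return {u for (u, _) in self.adj}

    @property
    def L(self):
        # Set of layers that contain at least one node.
        return {layer for (_, layer) in self.adj}

    @property
    def n(self):
        return len(self.V)


net = MultilayerNetwork()
net.add_edge("a", "twitter", "b", "twitter")    # intra-layer edge
net.add_edge("a", "twitter", "a", "facebook")   # inter-layer edge, same node
net.add_edge("b", "facebook", "c", "facebook")  # intra-layer edge
print(net.V, net.L, net.n)
```

    Storing each vertex as a (node, layer) pair keeps intra-layer and inter-layer edges in one adjacency map, which is convenient when a process may switch layers.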



    Types of Multilayer Network

    We can define two types of multilayer network:

    Multiplex Network: all layers contain (almost) the same nodes,
    e.g. the same group of people on multiple social networks.

    Interdependent Network: each node belongs to just one layer. This kind of network may be seen as a set of interconnected communities within a large monoplex network,
    e.g. power and communication infrastructure networks.


    Cascade and Diffusion

    We call Cascade the trace of information diffusion starting from a node called the Seed. A cascade generates an implicit network, called the Diffusion Network.

    The multilayer network in which the cascade takes place is referred to as the Underlying Network.

    We use the following notation:

    C: an information cascade

    (u, l_u, v, l_v, t)_C: the entries of the set denoted by the cascade C

    D: a multilayer diffusion network


    [Figure: cascade c_1 with seed (v_4, l_2); cascade c_2 with seed (v_4, l_1); the diffusion network resulting from the aggregation of c_1 and c_2]


    There are four possible kinds of spreading step:

    same-node inter-layer: the cascade switches layer but not node,
    e.g. a user sharing content on different social networks.

    other-node inter-layer: the cascade goes from one node to another node on a different layer,
    e.g. sharing a YouTube video on Facebook.

    other-node intra-layer: the cascade moves inside one layer,
    e.g. simple spreading inside a social network, like retweeting.

    same-node intra-layer: trivial, generally omitted in studies.
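    The four cases can be told apart by comparing the two endpoints of a cascade entry. The sketch below assumes a cascade is stored as a list of (u, l_u, v, l_v, t) tuples; the function name spreading_type and the example users and layers are hypothetical. It also shows how a diffusion network can be obtained by aggregating the entries and dropping the time stamps.

```python
def spreading_type(u, lu, v, lv):
    """Label one cascade step with one of the four spreading kinds."""
    node_part = "same-node" if u == v else "other-node"
    layer_part = "intra-layer" if lu == lv else "inter-layer"
    return f"{node_part} {layer_part}"


# Hypothetical cascade: entries are (u, l_u, v, l_v, t).
cascade = [
    ("alice", "twitter", "alice", "facebook", 1),
    ("alice", "facebook", "bob", "facebook", 2),
    ("bob", "facebook", "carol", "twitter", 3),
]

labels = [spreading_type(u, lu, v, lv) for (u, lu, v, lv, t) in cascade]

# Diffusion network: the set of edges traversed by the cascade, ignoring time.
diffusion_edges = {((u, lu), (v, lv)) for (u, lu, v, lv, t) in cascade}

for entry, label in zip(cascade, labels):
    print(entry, "->", label)
```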


    Variables

    Since it is difficult to obtain real datasets, research on multilayer networks consists mostly of either simulation-based studies or analytical studies based on mathematical models.

    Both rely on the observation of some variables of interest and input parameters. Many different metrics can be found in the literature; we take a look at some of the most important.


    Input Parameters

    Transmissibility represents the probability of transmitting an item from one node to another. If nodes have different types we can distinguish between homogeneous and heterogeneous transmissibility.

    The type of underlying network (e.g. random, small-world, scale-free)

    The relationship between different layers (e.g. the correlation between node degrees)

    Variations of these last two parameters can produce substantial differences in the outcome of the spreading process.


    Static Variables

    Epidemic Threshold is one of the fundamental variables in epidemic-like models. It indicates a value of transmissibility above which the diffusion involves most of the network.

    Survival Threshold indicates whether a diffusion process will survive.

    Absolute-Dominance Threshold indicates whether a diffusion process can completely remove its competitor.

    Infection Size (also called outbreak size or cascade size) is the number of nodes in the diffusion network.

    Infection Rate is the average rate of being in contact over a link.


    Temporal Variables

    The variables introduced so far are all static. Taking time into account, we have:

    Epidemic Dynamics is the fraction of infected nodes at a given time, in those models that allow for recovery or death after some time.

    Cascade Velocity measures how fast a cascade reaches a certain size or some relevant nodes.

    Survival Probability is the probability that an infection started from a single node is still active at a time t.


    Target-Based Variables

    Sometimes there is a subset of nodes that we consider relevant based on some particular features,
    e.g. popular people on social networks.

    In this context we can use the measures of Recall and Precision that are commonly found in the field of information retrieval.

    Recall is defined as the ratio of relevant nodes in the diffusion network over the total number of relevant nodes.

    Precision is defined as the ratio of relevant nodes in the diffusion network over the total number of nodes in the diffusion network.
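    As a quick worked example, both measures reduce to set operations. The sketch below takes recall in its usual information-retrieval sense (the share of relevant nodes that the diffusion actually reached); the function name recall_precision and the node labels are made up for illustration.

```python
def recall_precision(diffusion_nodes, relevant_nodes):
    """Return (recall, precision) of a diffusion network w.r.t. a target set."""
    diffusion_nodes = set(diffusion_nodes)
    relevant_nodes = set(relevant_nodes)
    hits = diffusion_nodes & relevant_nodes  # relevant nodes that were reached
    recall = len(hits) / len(relevant_nodes) if relevant_nodes else 0.0
    precision = len(hits) / len(diffusion_nodes) if diffusion_nodes else 0.0
    return recall, precision


reached = {"a", "b", "c", "d"}   # nodes in the diffusion network
relevant = {"b", "d", "e"}       # nodes we care about
recall, precision = recall_precision(reached, relevant)
print(recall, precision)
```

    Here two of the three relevant nodes are reached (recall 2/3), and two of the four reached nodes are relevant (precision 1/2).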


    Models


    We will now review the most important models used to study spreading processes on multilayer networks. First we will categorize the models into two groups:

    Epidemic-Like

    Decision-Based

    Then we will present some of the mathematical approaches used in the analysis of those models:

    Generating Functions

    Markov Chain Approximation

    Mean-Field Theory

    Game Theory


    Epidemic Models

    Epidemic-Like Models are generally applied to either disease or influence diffusion. Most of the works on multilayer networks are based on the SIR, SIS or SI_1I_2R models.

    These are stateful models in which a node can be either susceptible, infected or recovered (the latter in the SIR models).

    Infected nodes spread the disease to their susceptible neighbors with infection rate β.

    Infected nodes can recover (or return to the susceptible state) after a time τ.

    Transmissibility is defined as T = 1 - e^(-βτ)
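    A small numeric illustration of the transmissibility formula, assuming the standard form T = 1 - exp(-βτ), with β the infection rate and τ the time a node stays infected. The symbol names and the function name transmissibility are assumptions made for this sketch.

```python
import math


def transmissibility(beta, tau):
    """T = 1 - exp(-beta * tau): chance that an infected node transmits
    over one link before it recovers (beta: infection rate, tau: infectious time)."""
    return 1.0 - math.exp(-beta * tau)


# Longer infectious periods push T towards 1 for a fixed infection rate.
for tau in (0.5, 1.0, 2.0, 5.0):
    print(tau, round(transmissibility(0.5, tau), 4))
```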


    Many variations on the SIR model can be found in the literature. Some add new states to account for birth and death events, or for the effects of isolation on the spreading process.

    An important one is the Independent Cascade Model, which is a discrete-time version of the SIR model. A node u infected at time t can try to infect its neighbor v; if it succeeds, v becomes infected at time t+1.

    This model is often used in influence spreading studies.
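    The Independent Cascade Model is simple to simulate. The following sketch assumes a single activation probability p shared by all links and a graph given as a dictionary from (node, layer) pairs to lists of neighbours; the function name independent_cascade and the toy graph are illustrative, not taken from any particular study.

```python
import random


def independent_cascade(adj, seeds, p, rng=None):
    """Simulate one run of the Independent Cascade Model.

    adj   : dict mapping a vertex to an iterable of its neighbours
    seeds : vertices infected at time 0
    p     : probability that an infected vertex infects a given neighbour
    Returns a dict vertex -> infection time.
    """
    rng = rng or random.Random()
    infected_at = {s: 0 for s in seeds}
    frontier = list(seeds)
    t = 0
    while frontier:
        next_frontier = []
        for u in frontier:
            # u gets exactly one chance to infect each susceptible neighbour.
            for v in adj.get(u, ()):
                if v not in infected_at and rng.random() < p:
                    infected_at[v] = t + 1
                    next_frontier.append(v)
        frontier = next_frontier
        t += 1
    return infected_at


# Toy two-layer graph: vertices are (node, layer) pairs.
adj = {
    ("a", 1): [("b", 1), ("a", 2)],
    ("a", 2): [("c", 2)],
    ("b", 1): [("c", 1)],
    ("c", 1): [],
    ("c", 2): [],
}
times = independent_cascade(adj, [("a", 1)], p=1.0)
print(times)
```

    Because a vertex enters the frontier only once, it stops spreading after its single round of attempts, which plays the role of the recovered state in SIR.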


    Decision-Based Models

    Decision-Based Models, also called Threshold Models in the physics literature, are based on the idea that each agent decides whether or not to adopt a behaviour depending on its neighbors,
    e.g. people may start smoking if their social network includes many smokers.

    There are two main approaches in decision-based studies:

    Informational Effects Approach

    Direct-Benefit Effects Approach


    In the Informational Effects Approach decisions are made based on indirect information about others' choices.

    Linear Threshold Model: if a fraction of neighbors greater than T_LTM has adopted a new behaviour, then a node will take up the same behaviour.

    The Direct-Benefit Effects Approach is a game-theoretic perspective on the problem, in which an agent takes up a behaviour if it is convenient.

    Ramezanian proposed a model where each node plays a game with its neighbors. At each round nodes update their strategy (adoption of behaviour A or B) based on a payoff matrix.
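    The threshold rule of the Linear Threshold Model can be sketched in a few lines. The example assumes one global threshold shared by all nodes and unweighted neighbours; the function name linear_threshold and the four-node path graph are hypothetical.

```python
def linear_threshold(adj, seeds, threshold):
    """Iterate the rule 'adopt if the fraction of adopting neighbours > threshold'
    until no node changes state. Returns the final set of adopters."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbours in adj.items():
            if node in adopted or not neighbours:
                continue
            fraction = sum(1 for v in neighbours if v in adopted) / len(neighbours)
            if fraction > threshold:
                adopted.add(node)
                changed = True
    return adopted


# Path graph a - b - c - d with a single initial adopter.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(linear_threshold(adj, {"a"}, threshold=0.4))  # behaviour spreads along the path
print(linear_threshold(adj, {"a"}, threshold=0.5))  # 1/2 is not > 0.5, so it stalls
```

    Adoption is never undone in this sketch, so the final set does not depend on the order in which nodes are visited.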


    Theoretical Approaches

    Widely used in the analysis of stochastic processes, Generating Functions can uniquely determine a discrete sequence of numbers, and can be useful for computing:

    probability density functions

    moments

    limit distributions

    solutions of linked differential-difference equations

    Generating functions have also been used to study branching and percolation processes, two important stochastic processes for modeling the spread of epidemics over networks.


    Branching Process

    The branching process model is a simple framework for modeling epidemics on a network. While infected, an agent meets k other agents and may spread the disease to each of them with probability p (first wave). Each of those can then infect k other agents, so that the disease can reach k^2 individuals (second wave), and so on.

    A question of major interest is how many waves a process can survive.

    When state is important (e.g. in the SIR model) the branching process cannot be used and bond percolation is preferred.
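    To make the wave picture concrete, the sketch below assumes that every infected agent meets k fresh agents and infects each one independently with probability p. Under that assumption the expected size of wave number w is (p·k)^w, so the product p·k decides whether the waves tend to grow or to die out. The function names are illustrative.

```python
import random


def expected_wave_size(p, k, wave):
    """Expected number of newly infected agents in the given wave."""
    return (p * k) ** wave


def simulate_waves(p, k, max_waves, rng=None):
    """Simulate successive waves starting from one infected agent.
    Returns the list of wave sizes; stops early if a wave is empty."""
    rng = rng or random.Random()
    sizes = []
    infected = 1
    for _ in range(max_waves):
        contacts = infected * k
        # Each contact is infected independently with probability p.
        infected = sum(1 for _ in range(contacts) if rng.random() < p)
        sizes.append(infected)
        if infected == 0:
            break
    return sizes


print(expected_wave_size(0.5, 4, 3))                 # p*k = 2: waves grow on average
print(simulate_waves(0.2, 3, 10, random.Random(1)))  # p*k = 0.6: waves tend to die out
```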


    Percolation

    Percolation theory studies the structure of connected clusters in random graphs. p_c is the critical probability such that for p > p_c the random graph has a giant connected component (GCC). A percolation transition occurs at the critical occupation probability p_c, which is the point of appearance/disappearance of a GCC.

    In [102] the authors extend percolation theory to multiplex networks by introducing the concepts of weak bootstrap percolation and weak pruning percolation.

    These two models are distinct and give rise to different critical behaviors in the emergence of critical transitions, unlike in the single-layer case, where they are equivalent.
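    The appearance of a giant component can be observed numerically with plain single-layer bond percolation (not the weak bootstrap or weak pruning variants of [102]). In this sketch each edge is kept with occupation probability p and the largest cluster is found with a union-find structure; the function name, the complete-graph substrate and the chosen values of p are assumptions for illustration.

```python
import random
from itertools import combinations


def largest_component_fraction(n, edges, p, rng=None):
    """Keep each edge with probability p and return the fraction of the
    n vertices (labelled 0..n-1) that lie in the largest connected cluster."""
    rng = rng or random.Random()
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if rng.random() < p:          # edge is occupied
            parent[find(u)] = find(v)

    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n


n = 50
edges = list(combinations(range(n), 2))  # complete graph as a toy substrate
for p in (0.0, 0.01, 0.05, 0.2, 1.0):
    print(p, largest_component_fraction(n, edges, p, random.Random(0)))
```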


    Markov-Chain Approximation

    The Microscopic Markov-Chain Approximation (MMA) is an established approach to study the microscopic behavior of epidemic dynamics,
    e.g. the probability that a given node will be infected.

    This approach can be further categorized as:

    Discrete-time version

    Continuous-time version

    The discrete-time version has been used to study malware diffusion with the SIS model, showing equivalence between multilayer and single-layer dynamics when the state of a node is the same in all layers.


    Mean-Field Theory

    Large Markovian models may become intractable.

    In Mean-Field Theory, a small averaged effect and an external field are considered instead of computing all interactions between agents.

    This allows the model to be described by a number of nonlinear differential equations whose state space grows linearly, instead of exponentially.

    This method has been used to review and generalize epidemic-like models.


    Game Theory

    Game-theoretical approaches take into account the effect of cooperation and competition between agents. Studies on social networks showed that community emergence and information spreading can be explained in terms of payoff maximization and are influenced by features of each agent:

    Reputation

    Desire for popularity

    Knowledge

    Information belief


    Spreading Dynamics on Multilayer Networks


    Interconnected Networks

    Diffusion processes in interconnected networks are affected by the spectral properties of the combinatorial supra-Laplacian of the underlying graph, which are linked to layer coupling.

    In particular, as the coupling changes, the second eigenvalue shows two very distinct regimes, with layers either decoupled or indistinguishable.

    Spreading in interconnected networks has been studied in terms of:

    Interaction strength between layers

    Inter-layer patterns


    Interaction Strength

    second-nearest neighbors: expected number of neighbors = ⟨k²⟩/⟨k⟩, where ⟨k⟩ and ⟨k²⟩ are the first and second moments of the degree distribution. In weakly coupled networks (condition on A, T and B not preserved in this transcript) we find a phase in which a layer may be in an epidemic state independently of the others, depending on transmissibility and average inter-layer degree.

    interconnection topology measure: a quantitative measure of coupling given by a formula [not preserved in this transcript].

    inter-layer link density: the ratio of existing inter-layer connections to the total possible ones, d = m/(n_A × n_B)
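    Two of these quantities are easy to compute directly. The sketch below evaluates the moment ratio ⟨k²⟩/⟨k⟩ from a degree sequence and the inter-layer link density d = m/(n_A × n_B); the function names and the sample numbers are illustrative assumptions.

```python
def moment_ratio(degrees):
    """<k^2>/<k> for a degree sequence (second moment over first moment)."""
    n = len(degrees)
    first = sum(degrees) / n
    second = sum(d * d for d in degrees) / n
    return second / first


def interlayer_link_density(m, n_a, n_b):
    """d = m / (n_A * n_B): existing inter-layer links over all possible ones."""
    return m / (n_a * n_b)


print(moment_ratio([2, 2, 2, 2]))        # regular graph: ratio equals the degree
print(moment_ratio([1, 1, 1, 5]))        # a hub raises the ratio above the mean degree
print(interlayer_link_density(6, 3, 4))  # 6 of 12 possible inter-layer links
```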


    Inter-Layer Patterns

    Some studies highlighted the importance of inter-layer links and the patterns they create. Simulation studies showed that the degree of the connections has less impact than their density.

    A new definition of the Epidemic Threshold in the SIS model was proposed, which takes the degree of the connected nodes into account:

    T_E = 1/λ_1(M + αN), where λ_1 is the largest eigenvalue, α = infection rate, M = adjacency matrix, N = inter-layer matrix

    Another study observed that if the correlation between intra-layer and inter-layer degree is very strong, an outbreak may appear even below the epidemic threshold of each layer.
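    A threshold of this spectral kind can be evaluated numerically. The sketch below reads the formula as T_E = 1/λ_1(M + αN), with λ_1 the largest eigenvalue of the combined matrix; the symbol name α for the weight on the inter-layer matrix is an assumption, as are the function names and the toy matrices. The eigenvalue is found by power iteration on a shifted matrix, which assumes M and N are symmetric and non-negative.

```python
def largest_eigenvalue(matrix, iterations=1000):
    """Largest eigenvalue of a symmetric non-negative matrix by power iteration.
    The matrix is shifted by the identity so that the iteration also converges
    on bipartite structures, where +lambda and -lambda tie in modulus."""
    n = len(matrix)
    shifted = [[matrix[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(shifted[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(abs(v) for v in y)
        x = [v / lam for v in y]
    return lam - 1.0  # undo the shift


def epidemic_threshold(M, N, alpha):
    """1 / lambda_1(M + alpha * N)."""
    n = len(M)
    combined = [[M[i][j] + alpha * N[i][j] for j in range(n)] for i in range(n)]
    return 1.0 / largest_eigenvalue(combined)


# Two layers with one edge each (0-1 and 2-3), coupled by links 0-2 and 1-3.
M = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
N = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
print(epidemic_threshold(M, N, 0.0))  # uncoupled layers
print(epidemic_threshold(M, N, 0.5))  # coupling lowers the threshold
```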


    Intra-Layer Structure

    Epidemic dynamics depend not only on inter-layer links but also on intra-layer ones. A study on cliques (groups of people who are close) showed how they influence the epidemic threshold, the infection size and the infection speed. The authors defined three types of link: (1) intra-clique, (2) inter-clique, and (3) online.

    Let d_w and d_f be the number of type 1 and type 2 links per node; there is an epidemic state when a condition holds [formula not preserved in this transcript], with E representing the moment of the degree distribution.


    Inter-Layer Similarity

    Similarity (or the lack of it) may influence the spreading behaviour.

    Degree-Degree Correlation is described by factors [not preserved in this transcript], where k is the number of intra-layer nodes in each layer. Inter-layer correlation can be measured similarly.

    Average Similarity of Neighbors is defined by a formula [not preserved in this transcript], where K_A represents the number of neighbors on layer A and K_C represents the number of common neighbors.

    Strong degree correlation leads to a low epidemic threshold and a small infection size. Interestingly, this outcome is not influenced by the average similarity of neighbors.


    Layer-Switching Cost

    Some models must consider that diffusion across different layers involves some kind of cost or overhead,
    e.g. changing means of transport, or sharing content from one social network to another.

    A recent study takes this into account and observes the behaviour of the epidemic threshold as a function of node degree and infection rate.

    A large difference in infection rates among layers means a higher overhead and a higher epidemic threshold.

    If a layer is denser, the epidemic threshold decreases because spreading becomes easier.


    Diffusion Velocity

    The presence of multiple layers impacts the speed of the diffusion process. Intuitively, multiple layers speed up diffusion because there are more links over which information can spread. Some studies confirmed this, showing a correlation between coupling and velocity.

    However, some empirical studies pointed out that inefficient topologies in monoplex networks and obstructed inter-layer links in multiplex networks lead to decreasing speed.


    Partial Overlap

    In partially overlapped multilayer networks only a fraction of the nodes is present in all layers. A study on the effect of overlap in the SIR model found that the epidemic threshold T_c is directly linked to the fraction q of nodes present in both layers.

    [Formula not preserved in this transcript; its legend defines a symbol with subscript A as the branching factor of layer A.]

    This means that the epidemic threshold of the layer with lower diffusion capability affects the threshold of the other.


    Interacting Spreading Process

    In the real world different spreading processes coexist and interact with each other. Epidemic and game-theoretic models have been proposed to address this.

    Game-theoretic studies mainly focus on competing rumors or on companies trying to sell their products.

    Epidemic models consider two competing viruses or memes. The two viruses can coexist, or one can dominate the other, eventually driving it to extinction.

    An interesting application of the model studies a virus versus an immunization process.

    All studies concluded that the interaction dynamics are linked to the underlying network topology.


    Innovation Diffusion

    The diffusion of innovation (new behaviours, technologies, ideas, products) has received considerable interest in social science and economics.

    The problem has been studied using an extension of the Watts threshold model.

    A content-dependent threshold was introduced, which accounts for the specific bias each link has towards certain content.

    Taking the direct-benefit effects approach in a game-theoretic framework, a lower bound for the success of an innovation can be found,
    i.e. how many people in the network adopt a specific strategy.


    Resource Constraints

    In real life the nodes of a multiplex network share limited resources, which affects the dynamics of spreading processes,
    e.g. people split their time between different online social networks.

    A variation of the SIR model, called constrained SIR, has been introduced: in each step only a limited number of neighbors can be infected.

    The authors find that, in agreement with previous studies, in the absence of constraints positive correlation leads to a lower epidemic threshold than negative correlation. In the presence of constraints, however, spreading is less efficient in positively correlated couplings than in negatively correlated ones.


    Applications


    Spreading processes in multilayer networks have a large number of applications:

    understanding the dynamics of cascades

    maximizing influence in the context of viral marketing

    placing sensors to detect spreading in a network as quickly as possible

    The application areas can be roughly categorized into two classes:

    Forward Prediction: applications that need to steer the network into a particular desired state.

    Backward Prediction: applications that need to predict how a given piece of information will diffuse in a network.


    Influence Maximization

    Influence maximization has the goal of spreading information as quickly as possible. This can be achieved by choosing the most influential nodes as seeds. These nodes are chosen according to some measure of centrality, such as PageRank, betweenness or eigenvector centrality.

    Alternatively, we can choose the messages that are likely to survive longer than the others and therefore propagate to more nodes, obtaining the same effect.
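    Centrality-based seed selection can be sketched as "rank the nodes, take the top k". The example below uses a plain power-iteration PageRank on a single-layer directed adjacency list; the function names, the damping value 0.85 and the star-shaped toy graph are assumptions for illustration, and any of the other centrality measures could be swapped in.

```python
def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank. adj maps every node to the list of nodes
    it links to; every node must appear as a key."""
    nodes = list(adj)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = adj[u]
            if out:
                share = damping * rank[u] / len(out)
                for v in out:
                    new_rank[v] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
        rank = new_rank
    return rank


def top_k_seeds(adj, k):
    """Pick the k highest-ranked nodes as seeds."""
    rank = pagerank(adj)
    return sorted(rank, key=rank.get, reverse=True)[:k]


# Star graph: the hub should be chosen as the single seed.
adj = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
seeds = top_k_seeds(adj, 1)
print(seeds)
```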


    Immunization

    Resilience to a disease can be achieved through information dissemination. Various works investigated this question using a model based on two layers:

    the infection layer, where the disease spreads

    the prevention layer, where awareness spreads

    Studies have observed that awareness can raise the infection threshold and in many cases almost stop the infection.

    An important application is studying the effectiveness of vaccination campaigns.


    Delay-Tolerant Networks

    Delay-Tolerant Networks (DTNs) are networks designed to cope with the lack of continuous connectivity,
    e.g. deep-space communication, sensor networks.

    Routing on DTNs is more challenging than on traditional networks and is an important application of forward prediction, usually addressed using epidemic algorithms over the active connection graph.

    As every sensor may have more than one communication device, the spreading process can be mapped onto a multilayer network where the best routing is bound by latency and energy constraints.


    Malware Propagation

    Studying malware propagation and designing solutions to contain outbreaks is very important and involves both forward and backward prediction.

    This problem is intrinsically multilayer: alongside computers we have mobile devices connected through multiple wireless interfaces (3G, Bluetooth, Wi-Fi), and applications allow communication with devices that may not be immediate neighbors.

    Each of these factors should be taken into account as a separate layer when modelling such a spreading process.


    Conclusions

    Information diffusion in multilayer networks is an active and not yet consolidated research field, and therefore offers many unsolved problems to address. In some cases, phenomena that are quite well understood in monoplex networks are comparatively poorly understood in the context of multilayer networks; in other cases, completely novel ideas, algorithms and analyses specific to multilayer networks have to be developed.

    Some research directions are illustrated below.


    Open Problems

    empirical study of information diffusion: real datasets are difficult both to obtain and to study, due to their size.

    metrics and measurements: new metrics specific to multilayer networks should be researched, alongside those derived from monoplex networks.

    new models: some phenomena may require new models to be described. An example is the data-mining approach for heterogeneous networks.

    data visualization: visualization is a great tool for researchers; the muxViz project is the best contribution at the moment.

    time-varying networks: time is central to many processes. Studies on time-varying multiplex networks are yet to appear.

    evolution of the underlying structure: there are studies on adaptive monoplex networks, but the multilayer perspective needs to be explored further.

    outbreak detection: detecting a spreading process as quickly as possible is a field worth exploring in multilayer networks.
