Research Report - francescoficarola.com

  • North Carolina State University, Department of Computer Science

    890 Oval Drive, Raleigh, NC

    Algorithms and Protocols on Social Graphs: Real vs Virtual Social Networks

    Research Report

    Francesco Ficarola, Ph.D. Student

  • Who am I?

    - I'm 29 and I have a master's degree in Computer Engineering

    - I'm a Ph.D. student from Sapienza, University of Rome (Italy)

    - I currently work (part-time) for a company on research projects

    - I really like photography, nature and science.

    Why am I here?

    - I really like cultural exchanges

    - I won a TEE scholarship to study here for 6 months

    - I'd like to improve my English

  • Towards a more social world… or not?

    … a photo with you …

    … a new job …

    … video …

  • Social Networks as networks of contacts

    A definition of Social Graph

    A social graph is a network of connections and relationships between people.

    Representation

    $G = (V, E)$, where each node is an element of $V$ and each edge is an element of $E$
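As a minimal sketch of the representation G = (V, E), the snippet below stores a tiny undirected social graph as a node set and an edge set. The people's names and the helper functions `neighbors` and `degree` are hypothetical and serve only as an illustration.

```python
# Toy undirected social graph G = (V, E); names are illustrative only.
V = {"alice", "bob", "carol", "dave"}
E = {frozenset(pair) for pair in [("alice", "bob"),
                                  ("bob", "carol"),
                                  ("alice", "carol")]}


def neighbors(node, edges):
    """Return the set of nodes that share an edge with `node`."""
    return {other
            for edge in edges if node in edge
            for other in edge if other != node}


def degree(node, edges):
    """Number of distinct contacts of `node`."""
    return len(neighbors(node, edges))
```

Storing each edge as a `frozenset` makes the relation symmetric, which suits contacts and friendships; a directed relation such as Twitter followers would need ordered pairs instead.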

  • Real and Virtual Social Graphs

    Real Social Graph

    • Person-to-Person interactions: when individual A physically meets another individual B.

    A particular technology must be used to record and measure them.

    Virtual Social Graph

    • Virtual interactions:

    - Facebook friendships

    - Twitter followers

    - …

    Easy tracking by APIs

    Applications and Related Work

    - Facebook applications (e.g., games)

    - Analysis of activities and friendships

    - “Like button” to suggest targeted advertising

    - Intel Imotes (Bluetooth)

    - Smartphones (Bluetooth)

    - SocioPatterns projects

  • Contribution and main goals in a few words

    - Deployment of several testbeds to collect real interactions

    - Analysis of virtual and real social networks

    - Population protocols

    - Distinct Counting

    - Distributed PageRank

    - The Wisdom of Crowds

    WHY?

  • SocioPatterns platform: the OpenBeacon project

    The SocioPatterns sensing platform employs wearable electronic badges to sense sustained face-to-face proximity among people.

    Radio Frequency IDentification technology

    OpenBeacon Reader

    OpenBeacon Tag

  • Our contribution in detail…

    DIAG experiment set-up:

    - Five-day experiment

    - DIAG LAN

    - 116 attendees

    MACRO experiment set-up:

    - Two-and-a-half-hour experiment

    - LAN built with powerline and Wi-Fi bridge devices

    - 114 attendees

    [Figure: data-collection architecture; labels: tags, readers, forwarding data, Python collector]

  • Brief analysis of our networks

    [Figure panels: DIAG experiment, MACRO experiment]

    Similarity in terms of grouping: triangles and couples
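A rough sketch of how grouping into triangles and couples could be measured on a contact graph is given below. It assumes an undirected edge list without self-loops and interprets a "couple" as two people who are in contact only with each other; that interpretation, and all function names, are assumptions rather than the definitions used in the experiments.

```python
from itertools import combinations


def build_adjacency(edges):
    """Map every node to the set of its contacts (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(edges):
    """Count groups of three nodes that are all in contact with each other."""
    adj = build_adjacency(edges)
    return sum(1 for a, b, c in combinations(sorted(adj), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])


def count_isolated_pairs(edges):
    """Count pairs whose two members have no other contact (assumed 'couples')."""
    adj = build_adjacency(edges)
    members = sum(1 for nbrs in adj.values()
                  if len(nbrs) == 1 and len(adj[next(iter(nbrs))]) == 1)
    return members // 2  # each pair was counted once per member
```

The brute-force triangle count is cubic in the number of nodes, which is acceptable for graphs with roughly a hundred attendees.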

  • Algorithms and Protocols on Social Graphs

    Population Protocols

    • Threshold: for any given a ∈ X, it is true if #a ≥ T, where T is a given threshold

    • Modulo: it is true whenever #a ≡ j (mod k), for given j, k

    • Comparison: it is true whenever #a ≥ #b, for a, b ∈ X

    [Figure: example network of 12 nodes, each labeled with the symbol a or b]

    Example set-up

    - 12 nodes

    - 8 nodes with symbol a

    - 4 nodes with symbol b

    - Threshold value: T = 7

    - Modulo parameters: j = 7, k = 4

    Example Results

    - Threshold: verified

    - Modulo: not verified

    - Comparison: verified
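The three predicates can be checked centrally on the example set-up (8 nodes with symbol a, 4 with symbol b, T = 7, j = 7, k = 4). The sketch below only computes the ground truth that a population protocol should converge to; it is not a distributed protocol, and the function names are chosen here for illustration.

```python
from collections import Counter


def threshold(counts, a, T):
    """True if the number of nodes holding symbol `a` is at least T."""
    return counts[a] >= T


def modulo(counts, a, j, k):
    """True if #a is congruent to j modulo k."""
    return counts[a] % k == j % k


def comparison(counts, a, b):
    """True if #a is at least #b."""
    return counts[a] >= counts[b]


# Example set-up from the slide: 8 nodes hold "a", 4 nodes hold "b".
counts = Counter(["a"] * 8 + ["b"] * 4)

print(threshold(counts, "a", 7))     # True: 8 >= 7
print(modulo(counts, "a", 7, 4))     # False: 8 mod 4 = 0, but 7 mod 4 = 3
print(comparison(counts, "a", "b"))  # True: 8 >= 4
```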

  • Simulation of Population Protocols

    - Threshold and Modulo

    [Figure panels: DIAG Graph, MACRO Graph, Erdős-Rényi Random Graph]

    A significant fraction of nodes (≥ 80%) converges to the correct output well before the final convergence time.

    Time normalization required:

    - DIAG experiment: 5 days

    - MACRO experiment: 2.5 hours

    The time required by the whole population to stabilize strongly depends on the interaction patterns.

    Faster convergence = Greater density
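To give a feel for how a threshold predicate can be computed through pairwise interactions, here is a small simulation sketch. It is not the NetLogo model used for the results above: it uses a simple token-accumulation rule, a uniformly random scheduler over the given edges (the complete graph by default), and it assumes T ≥ 1. The function name, parameters, and rule details are all assumptions made for illustration.

```python
import random
from itertools import combinations


def simulate_threshold(symbols, T, edges=None, interactions=20000, seed=0):
    """Simulate a token-accumulation protocol for the predicate '#a >= T'.

    Every node starts with one token if it holds "a". When two nodes
    interact, one of them collects up to T tokens and the other keeps the
    rest. A node that reaches T tokens outputs True, and a True output
    spreads to every node it later interacts with.
    Returns the Boolean outputs of all nodes after `interactions` steps.
    """
    rng = random.Random(seed)
    n = len(symbols)
    if edges is None:
        edges = list(combinations(range(n), 2))  # complete interaction graph
    count = [1 if s == "a" else 0 for s in symbols]
    output = [c >= T for c in count]
    for _step in range(interactions):
        u, v = rng.choice(edges)
        if rng.random() < 0.5:          # random orientation of the pair
            u, v = v, u
        total = count[u] + count[v]
        count[u] = min(T, total)        # u collects up to T tokens
        count[v] = total - count[u]     # v keeps the remainder
        if count[u] >= T or output[u] or output[v]:
            output[u] = output[v] = True
    return output


outputs = simulate_threshold(["a"] * 8 + ["b"] * 4, T=7)
print(all(outputs))
```

Passing the edge list of a measured contact graph instead of the default complete graph restricts interactions to observed contacts, which is one way to explore how the interaction pattern affects stabilization time.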

  • Simulation of Population Protocols

    - Comparison

    [Figure panels: DIAG Graph, MACRO Graph, Erdős-Rényi Random Graph]

    Similar results to the threshold and modulo predicates: 80% of nodes converge to the correct output well before the final convergence time.

    Trend curves are very similar to the case of the random graph.

  • Population Protocols on physical devices

    Implementation of Population Protocols on physical devices

    Sub-Networks and Parallel Computing

    [Figure labels: Initial network, threshold, threshold, counter]

    [Unexpected result] The threshold predicate running directly on physical devices converges faster than simulations on NetLogo

  • Algorithms and Protocols on Social Graphs

    Distinct Counting: the Flajolet-Martin algorithm

    for i = 0, ..., L-1 do M[i] = 0
    foreach (data item with value x) do
        i = 0
        while hash(x, i) = 0 do
            i = i + 1
        M[i] = 1
    let Z = min{i : M[i] = 0}
    return $2^Z / 0.77351$

    [Figure: nodes exchange bitmaps such as [0000001] and [0000010] and merge them with the OR operator, obtaining [0000011], hence Z = 2]

    Estimation: $2^Z / 0.77351 = 2^2 / 0.77351 \cong 5$
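The following is a minimal Python sketch of the Flajolet-Martin bitmap, written under a few assumptions that are not in the slides: `hash(x, i)` is taken to be the i-th bit of a SHA-256 digest of the item, the bitmap length is L = 32, and the helper names (`hash_bit`, `fm_bitmap`, `merge`, `estimate`) are hypothetical. It also shows the OR-merge that makes the sketch easy to distribute.

```python
import hashlib

PHI = 0.77351  # correction constant used in the estimate
L = 32         # bitmap length (assumed)


def hash_bit(x, i):
    """i-th bit of a deterministic hash of x (assumed form of hash(x, i))."""
    digest = hashlib.sha256(str(x).encode()).digest()
    return (int.from_bytes(digest[:8], "big") >> i) & 1


def fm_bitmap(items):
    """Build the bitmap M: set M[i] where i is the first nonzero hash bit."""
    M = [0] * L
    for x in items:
        i = 0
        while i < L - 1 and hash_bit(x, i) == 0:
            i += 1
        M[i] = 1
    return M


def merge(m1, m2):
    """Combine two bitmaps with the OR operator."""
    return [a | b for a, b in zip(m1, m2)]


def estimate(M):
    """Return 2^Z / PHI, where Z is the index of the first zero bit."""
    Z = next((i for i, bit in enumerate(M) if bit == 0), len(M))
    return 2 ** Z / PHI


# The slide's example: M[0] = M[1] = 1 and M[2] = 0, so Z = 2.
print(round(estimate([1, 1, 0, 0, 0, 0, 0])))  # 5
```

Because duplicates set the same bit and OR is idempotent, merging the bitmaps of two nodes gives exactly the bitmap of the union of their items, which is what allows nodes to exchange bitmaps on contact. A single bitmap gives a coarse estimate; averaging several independent bitmaps is the usual way to reduce the variance.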

  • The experiment at WSDM 2013

    Distinct Counting

    [Figure panels: Average of 10 simulations on NetLogo; Estimations vs true number of nodes]

    Estimation (NetLogo): very good approximation of the curve depicting the true number of nodes over time.

    Estimation (WSDM): a proof of concept. Running a distributed version of the FM algorithm on devices.

  • The Wisdom of Crowds

    Sir Francis Galton

    What does the ox weigh?

    • Aggregation

    • Diversity of opinion

    • Independence

    • Decentralization

    Crowd

  • The experiment at WSDM 2013

    The Wisdom of Crowds: setup

    • 73 attendees

    • 2 rounds

    - In the first round we gave people some questions to be answered alone.

    - In the second round we proposed the same set of questions to be answered after a social interaction.

    • 4 questions

    - What was the total value in euros of all the coins thrown at the Trevi fountain in 2011?

    - What is the total length (in meters) of the corridor of the Auditorium Antonianum?

    - What is the average number of journal papers among the WSDM 2013 participants according to DBLP?

    - What was the number of Internet users in New Zealand by the end of 2011?

  • The experiment at WSDM 2013

    The Wisdom of Crowds: preliminary results

    Relation between the error made by users in the first round and the improvement in the second one.

    Pearson coefficient ≈ 1
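One way such a relation could be quantified is sketched below. The definitions are assumptions, since the slides do not state them: the first-round error of a user is taken as the absolute distance of the guess from the true value, and the improvement as the reduction of that distance in the second round. The function names are hypothetical, and no experimental data is reproduced here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)  # undefined if either input is constant


def error_and_improvement(first_round, second_round, truth):
    """Per-user first-round error and second-round improvement (assumed definitions)."""
    errors = [abs(g - truth) for g in first_round]
    improvements = [abs(g1 - truth) - abs(g2 - truth)
                    for g1, g2 in zip(first_round, second_round)]
    return errors, improvements
```

A coefficient close to 1 under these definitions would mean that users who were furthest from the true value in the first round are the ones who moved closest to it after the social interaction.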

  • Algorithms and Protocols on Social Graphs

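    PageRank π

    Adjacency Matrix (rows and columns ordered a, b, c, d)

    $M = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$

    Stochastic Adjacency Matrix (rows and columns ordered a, b, c, d)

    $A = (A_{ij}) = \begin{pmatrix} 0 & 1/3 & 1/3 & 1/3 \\ 0 & 0 & 1/2 & 1/2 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1 & 0 & 0 & 0 \end{pmatrix}$

    $\pi_i = \frac{1-\alpha}{n} \sum_{k=0}^{\infty} \sum_{j} \alpha^k A^k[j,i]$

    [Figure: example graph with nodes a, b, c, d]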
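The series form of PageRank, π_i = (1-α)/n · Σ_k Σ_j α^k A^k[j,i], can be evaluated numerically by truncating the sum. The sketch below does this for the four-node example with the stochastic matrix A whose dangling row (node c) is uniform. The value α = 0.85 and the number of terms are assumptions; the slides do not give them.

```python
# Stochastic adjacency matrix of the example graph (rows/columns: a, b, c, d).
A = [
    [0.0, 1 / 3, 1 / 3, 1 / 3],
    [0.0, 0.0, 1 / 2, 1 / 2],
    [1 / 4, 1 / 4, 1 / 4, 1 / 4],  # dangling node c: uniform row
    [1.0, 0.0, 0.0, 0.0],
]


def pagerank_series(A, alpha=0.85, terms=200):
    """Approximate pi_i = (1-alpha)/n * sum_k sum_j alpha^k A^k[j, i]."""
    n = len(A)
    col_sums = [1.0] * n          # column sums of A^k, starting with k = 0
    pi = [0.0] * n
    weight = (1 - alpha) / n      # (1-alpha)/n * alpha^k
    for _k in range(terms):
        for i in range(n):
            pi[i] += weight * col_sums[i]
        # Column sums of A^(k+1): multiply the row vector by A.
        col_sums = [sum(col_sums[j] * A[j][i] for j in range(n))
                    for i in range(n)]
        weight *= alpha
    return pi


pi = pagerank_series(A)
print([round(x, 4) for x in pi])
```

Since A is row-stochastic, the column sums of every power A^k add up to n, so the resulting vector sums to 1 up to the truncation error, which is of order α^terms.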

  • Algorithms and Protocols on Social Graphs

    Distributed PageRank p

    IDEALSAMPLE (for each node i)

    T_i = generate_tokens(r) ∪ incoming_tokens()
    foreach token in T_i do:
        if rnd(0,1) > α then:
            C_i = C_i + 1
        else:
            if node is not a dangling node then:
                send token to a neighbor j with probability A_ij
            else:
                send token u.a.r. to another node
        remove token from T_i

    Stochastic Adjacency Matrix (rows and columns ordered a, b, c, d)

    $A = (A_{ij}) = \begin{pmatrix} 0 & 1/3 & 1/3 & 1/3 \\ 0 & 0 & 1/2 & 1/2 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1 & 0 & 0 & 0 \end{pmatrix}$

    [Figure: example graph with nodes a, b, c, d]

  • Algorithms and Protocols on Social Graphs

    Distributed PageRank p

    FDSAMPLE (for each node i)

    T_i = generate_tokens(r) ∪ incoming_tokens()
    foreach token in T_i do:
        if rnd(0,1) > α then:
            C_i = C_i + 1
        else:
            if node is not a dangling node then:
                send token to a neighbor j with probability Q_ij
            else:
                send token u.a.r. to another node
        remove token from T_i

    Modified Adjacency Matrix (rows and columns ordered a, b, c, d)

    $Q = (Q_{ij}) = \begin{pmatrix} 0 & 1/3 & 1/3 & 1/3 \\ 0 & 0 & 1/2 & 1/2 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$

    [Figure: example graph with nodes a, b, c, d]
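The token-passing scheme can be simulated in a single process to see how the counters C_i approximate PageRank. The sketch below is an illustration under stated assumptions, not the authors' implementation: rounds are synchronous, every node generates r tokens per round, α = 0.85, a non-dangling node forwards to a uniformly chosen out-neighbor, a dangling node forwards uniformly at random among all nodes (itself included), and the estimate is p_i = C_i / Σ_j C_j.

```python
import random

# Out-neighbors of the example graph; node c is dangling.
OUT = {"a": ["b", "c", "d"], "b": ["c", "d"], "c": [], "d": ["a"]}


def distributed_pagerank(out, alpha=0.85, r=5, rounds=2000, seed=0):
    """Estimate PageRank by counting where random-walk tokens terminate."""
    rng = random.Random(seed)
    nodes = sorted(out)
    counters = {i: 0 for i in nodes}   # C_i
    incoming = {i: 0 for i in nodes}   # tokens are anonymous: keep counts only
    for _round in range(rounds):
        next_incoming = {i: 0 for i in nodes}
        for i in nodes:
            tokens = r + incoming[i]   # newly generated plus received tokens
            for _token in range(tokens):
                if rng.random() > alpha:
                    counters[i] += 1   # the token terminates at node i
                elif out[i]:
                    next_incoming[rng.choice(out[i])] += 1
                else:
                    # Dangling node: forward uniformly at random (assumption:
                    # chosen among all nodes, including itself).
                    next_incoming[rng.choice(nodes)] += 1
        incoming = next_incoming
    total = sum(counters.values())
    return {i: counters[i] / total for i in nodes}


print(distributed_pagerank(OUT))
```

Tokens still in flight when the simulation stops are ignored, which introduces a small bias that shrinks as the number of rounds grows.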

  • The distributed PageRank

    ArXiv HEP-PH citations graph

    - Nodes: 34546

    - Edges: 421578

    - Dangling nodes: 2393

  • The distributed PageRank

    Performance measures and preliminary results

    Average Relative Error: $E = \frac{1}{n} \sum_{i=1}^{n} e_i$, where $e_i = \frac{|\pi_i - p_i|}{\pi_i}$

  • The distributed PageRank

    Performance measures and preliminary results

    Precision: the fraction of the top k nodes (ordered by non-increasing $\pi_i$) that are also among the top k when nodes are ordered by non-increasing $p_i$.
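Both performance measures are straightforward to compute once the exact scores π and the estimates p are available. The sketch below assumes that the relative error of node i is |π_i - p_i| / π_i (the absolute value is an assumption) and that ties in the rankings are broken by node index; the function names are illustrative.

```python
def average_relative_error(pi, p):
    """Mean over all nodes of |pi_i - p_i| / pi_i."""
    n = len(pi)
    return sum(abs(pi[i] - p[i]) / pi[i] for i in range(n)) / n


def precision_at_k(pi, p, k):
    """Fraction of the top-k nodes by pi that are also in the top k by p."""
    def top(scores):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return set(ranked[:k])

    return len(top(pi) & top(p)) / k
```

For example, `precision_at_k(pi, p, 10)` returns 1.0 when the ten highest-ranked nodes coincide in both orderings, regardless of their order within the top ten.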

  • What I'd like to do at NCSU…

    - Deployment of two more social experiments:

    • Real social experiment: SocioPatterns infrastructure

    • Virtual social experiment: Chat game

    - The Wisdom of Crowds in the real and virtual social experiments

    - Distributed PageRank on physical devices during real social experiments

  • Publications

    L. Becchetti, L. Bergamini, F. Ficarola, and A. Vitaletti. Population protocols on real social networks (POSTER). In Proceedings of the Fifth Workshop on Social Network Systems, SNS '12, pages 15:1-15:2, New York, NY, USA, 2012. ACM.

    L. Becchetti, L. Bergamini, F. Ficarola, and A. Vitaletti. Population protocols on real social networks. In PE-WASUN '12: Proceedings of the 9th ACM workshop on Performance evaluation of wireless ad hoc, sensor, and ubiquitous networks, 2012.

    L. Becchetti, L. Bergamini, F. Ficarola, F. Salvatore, and A. Vitaletti. First Experiences with the Implementation and Evaluation of Population Protocols on Physical Devices. In The IEEE International Conference on Cyber, Physical and Social Computing, 2012.

    T. Arzilli, F. Ficarola, K. Massri, A. Vitaletti, F. Loriga, I. De Marinis, A. Ferraresi, R. Bloise, and M. Goretti. ProvinciaSense: extending the capillary WiFi infrastructure of Lazio region with static and mobile sensor networks (DEMO). To appear in SENSYS '13: Proceedings of the 11th ACM Conference on Embedded Network Sensor Systems, 2013.

    L. Becchetti, F. Ficarola, G. Persiano and A. Vitaletti. Decentralized computation of authority scores over an evolving network. To be accepted in WWW '14 Companion: Proceedings of the 23rd international conference companion on World Wide Web.
