DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense: Social network analysis for everyone

Luca Mearelli - @lmea

Hi, I’m Luca

Collective Intelligence

Emergence

larger entities, patterns, and regularities arise through interactions among smaller or simpler entities that themselves do not exhibit such properties

Online collaboration

it works!

Online communities

• Exhibit emergence

• Strong design properties

• Hackable

The Blueprint

• Map the community social network

• Measure the structural properties

• Visualize the structure & the metrics

• Tweak the interaction

Edgesense

Edgesense Architecture

HTML5 / JavaScript

JSON files

Python

JSON source

Edgesense Source Data

• users.json

• nodes.json

• comments.json

users.json

nodes.json

comments.json

Edgesense Backend

• Python

• NetworkX

Edgesense Parsing Pipeline

• Parse source JSON files

• Build network from interactions

• Extract metrics

• Export network + metrics to JSON files
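The four steps above can be sketched end to end with the standard library alone. This is a minimal illustration, not the Edgesense implementation: the inlined `COMMENTS_JSON` data and the `run_pipeline` function are hypothetical, and only the field names (`author_id`, `recipient_id`, `created_ts`, `length`) follow the code shown later in the slides.

```python
import json
from collections import Counter

# Hypothetical stand-in for comments.json (values made up for illustration).
COMMENTS_JSON = """
[
  {"author_id": "2", "recipient_id": "1", "created_ts": 1315491000, "length": 4},
  {"author_id": "1", "recipient_id": "2", "created_ts": 1315492000, "length": 9},
  {"author_id": "3", "recipient_id": "1", "created_ts": 1315493000, "length": 2},
  {"author_id": "3", "recipient_id": null, "created_ts": 1315494000, "length": 7}
]
"""

def run_pipeline(comments_json):
    # 1. Parse the source JSON.
    comments = json.loads(comments_json)
    # 2. Build the network: one directed edge per comment that has
    #    both an author and a recipient.
    edges = [
        {"source": c["author_id"], "target": c["recipient_id"],
         "ts": c["created_ts"], "effort": c["length"]}
        for c in comments
        if c.get("author_id") and c.get("recipient_id")
    ]
    nodes = sorted({e["source"] for e in edges} | {e["target"] for e in edges})
    # 3. Extract a couple of simple metrics.
    out_degree = Counter(e["source"] for e in edges)
    metrics = {
        "nodes_count": len(nodes),
        "edges_count": len(edges),
        "out_degree": {n: out_degree.get(n, 0) for n in nodes},
    }
    # 4. Export network + metrics as one JSON document.
    return json.dumps({"nodes": nodes, "edges": edges, "metrics": metrics},
                      sort_keys=True)

exported = json.loads(run_pipeline(COMMENTS_JSON))
print(exported["metrics"])
```

The comment without a recipient is dropped in step 2, mirroring the "valid comment" rule used by the real parser.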

Network construction

• Persons are nodes

• Comments make links

• Edges are aggregated

• Metadata is added

import logging

def extract_edges(nodes_map, comments_map):
    # build the list of edges
    edges_list = []
    # a comment is 'valid' if it has a recipient and an author
    valid_comments = [e for e in comments_map.values() if e.get('recipient_id', None) and e.get('author_id', None)]
    logging.info("%(v)i valid comments on %(t)i total" % {'v':len(valid_comments), 't':len(comments_map.values())})

    # build the whole network to use for metrics
    for comment in valid_comments:
        link = {
            'id': "{0}_{1}_{2}".format(comment['author_id'],comment['recipient_id'],comment['created_ts']),
            'source': comment['author_id'],
            'target': comment['recipient_id'],
            'ts': comment['created_ts'],
            'effort': comment['length'],
            'team': comment['team']
        }
        # mark both endpoints as active; log when an endpoint is unknown
        if comment['author_id'] in nodes_map:
            nodes_map[comment['author_id']]['active'] = True
        else:
            logging.info("error: node %(n)s was linked but not found in the nodes_map" % {'n':comment['author_id']})

        if comment['recipient_id'] in nodes_map:
            nodes_map[comment['recipient_id']]['active'] = True
        else:
            logging.info("error: node %(n)s was linked but not found in the nodes_map" % {'n':comment['recipient_id']})
        edges_list.append(link)

    # eu is the project's utilities module (its import is not shown on the slide)
    return sorted(edges_list, key=eu.sort_by('ts'))

def build_network(network):
    MDG = nx.MultiDiGraph()

    for node in network['nodes']:
        MDG.add_node(node['id'], node)

    for edge in network['edges']:
        MDG.add_edge(edge['source'], edge['target'], attr_dict=edge)

    set_isolated(network['nodes'], MDG)
    return MDG

# written against the NetworkX 1.x API (nodes_iter, G.node)
def extract_dpsg(mdg, ts, team=True):
    dg = nx.DiGraph()
    # add all the nodes present at the time ts
    for node in mdg.nodes_iter():
        if mdg.node[node]['created_ts'] <= ts and (team or not mdg.node[node]['team']):
            dg.add_node(node, mdg.node[node])

    for node in mdg.nodes_iter():
        for neighbour in mdg[node].keys():
            count = sum(1 for e in mdg[node][neighbour].values() if e['ts'] <= ts and (team or not e['team']))
            effort = sum(e['effort'] for e in mdg[node][neighbour].values() if e['ts'] <= ts and (team or not e['team']))
            team_edge = sum(1 for e in mdg[node][neighbour].values() if e['ts'] <= ts and e['team'])>0
            if count > 0 and (team or not team_edge):
                dg.add_edge(node, neighbour, {'source': node, 'target': neighbour, 'effort': effort, 'count': count, 'team': team_edge})

    return dg
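The aggregation performed by `extract_dpsg` (many comments between the same two people collapse into one weighted edge, as of a timestamp `ts`) can be illustrated without NetworkX. This is a simplified sketch under assumptions: `aggregate_edges` is a hypothetical helper, the sample edges are made up, and the team filtering of the original is left out.

```python
from collections import defaultdict

def aggregate_edges(edges, ts):
    """Collapse parallel comment edges created up to time ts into weighted edges."""
    totals = defaultdict(lambda: {"count": 0, "effort": 0})
    for e in edges:
        if e["ts"] <= ts:
            key = (e["source"], e["target"])
            totals[key]["count"] += 1             # number of comments
            totals[key]["effort"] += e["effort"]  # summed comment length
    return dict(totals)

# Made-up sample: two comments from "2" to "1", one reply later on.
edges = [
    {"source": "2", "target": "1", "ts": 10, "effort": 4},
    {"source": "2", "target": "1", "ts": 20, "effort": 6},
    {"source": "1", "target": "2", "ts": 30, "effort": 5},
]
snapshot = aggregate_edges(edges, ts=20)
print(snapshot)
```

At `ts=20` the reply does not exist yet, so the snapshot holds a single edge from "2" to "1" with `count` 2 and `effort` 10.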

• Content metrics

  • Number of users (active/inactive)

  • Number of connections

  • Number of community contributions

• Network metrics

  • Degree

  • Distance

  • Centrality

  • Modularity

Network Metrics: Degree

• Number of inbound / outbound edges incident on a node
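As a small illustration of the definition, in-degree and out-degree can be counted directly from a list of directed edges. The edge list below is made up; in Edgesense itself these values come from the NetworkX graph methods shown later in the slides.

```python
from collections import Counter

# Made-up directed edges (source, target).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]

out_degree = Counter(src for src, _ in edges)  # edges leaving each node
in_degree = Counter(dst for _, dst in edges)   # edges arriving at each node
nodes = sorted({n for edge in edges for n in edge})

for n in nodes:
    print(n, "in:", in_degree[n], "out:", out_degree[n])

avg_in_degree = sum(in_degree[n] for n in nodes) / float(len(nodes))
```

In a directed graph the average in-degree always equals the average out-degree, since every edge has exactly one source and one target.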

Network Metrics: Distance

• The average number of hops needed to go from a randomly chosen node to another.

• A lower distance implies that information spreads more easily across the network.
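A minimal sketch of the distance metric: run a breadth-first search from every node and average the hop counts over all reachable ordered pairs. The helper names and the two sample graphs are assumptions for illustration; NetworkX offers `average_shortest_path_length` for the same purpose on connected graphs.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop counts from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def average_distance(adj):
    """Mean shortest-path length over all reachable ordered pairs of nodes."""
    total, pairs = 0, 0
    for start in adj:
        for target, d in bfs_distances(adj, start).items():
            if target != start:
                total += d
                pairs += 1
    return total / float(pairs) if pairs else 0.0

# Same number of nodes, different shapes (adjacency lists, both directions).
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}

print(average_distance(chain), average_distance(star))
```

The star has the lower average distance: every member is at most two hops from any other, which is the sense in which information spreads more easily.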

Network Metrics: Centrality

• Refers to indicators which identify the most important vertices within a graph

• Betweenness centrality: for a given node, the number of shortest paths between pairs of other vertices that pass through it.
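To make the definition concrete, here is a compact sketch of Brandes' algorithm for an unweighted directed graph. It returns unnormalized scores and assumes every node appears as a key of the adjacency dict; the `betweenness` function and the sample graph are illustrative, not Edgesense code. When several shortest paths connect a pair, each path contributes a fraction, so scores are not always whole numbers. NetworkX's `betweenness_centrality` normalizes by default, so its numbers differ by a constant factor.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an unweighted directed graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / float(sigma[w]) * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Directed chain a -> b -> c: only b lies on a shortest path (from a to c).
graph = {"a": ["b"], "b": ["c"], "c": []}
scores = betweenness(graph)
print(scores)
```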

Network Metrics: Modularity

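• Measures how much denser the connections within subcommunities are than in a random network with the same degree distribution; for a detected partition the value typically falls between 0 and 1.

• Subcommunities are defined such that their members are more connected to each other than to the rest of the network.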
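The modularity score of a given partition can be computed by hand for a small undirected graph. The sketch below only scores a partition that is already known; it does not perform the community detection itself. The `modularity` function and the sample graph (two triangles joined by one edge) are assumptions for illustration.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = float(len(edges))
    community_of = {n: i for i, group in enumerate(communities) for n in group}
    internal = [0] * len(communities)    # edges inside each community
    degree_sum = [0] * len(communities)  # total degree of each community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    # Observed share of internal edges minus the share expected at random.
    return sum(internal[i] / m - (degree_sum[i] / (2 * m)) ** 2
               for i in range(len(communities)))

# Two triangles (1-2-3 and 4-5-6) joined by the single edge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
one_group = modularity(edges, [{1, 2, 3, 4, 5, 6}])
print(two_groups, one_group)
```

Splitting along the bridge scores 5/14 (about 0.357), while lumping everyone into one community scores 0: a partition is only rewarded when it captures more internal links than chance would produce.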

Network Metrics

# written against the NetworkX 1.x API (degree methods return dicts)
def extract_network_metrics(mdg, ts, team=True):
    met = {}
    dsg = extract_dpsg(mdg, ts, team)
    if team :
        pre = 'full:'
    else:
        pre = 'user:'

    # avoid trying to compute metrics for
    # the case of empty networks
    if dsg.number_of_nodes()==0:
        return met

    met[pre+'nodes_count'] = dsg.number_of_nodes()
    met[pre+'edges_count'] = dsg.number_of_edges()
    met[pre+'density'] = nx.density(dsg)
    met[pre+'betweenness'] = nx.betweenness_centrality(dsg)
    met[pre+'avg_betweenness'] = float(sum(met[pre+'betweenness'].values()))/float(len(met[pre+'betweenness'].values()))
    met[pre+'betweenness_count'] = nx.betweenness_centrality(dsg, weight='count')
    met[pre+'avg_betweenness_count'] = float(sum(met[pre+'betweenness_count'].values()))/float(len(met[pre+'betweenness_count'].values()))
    met[pre+'betweenness_effort'] = nx.betweenness_centrality(dsg, weight='effort')
    met[pre+'avg_betweenness_effort'] = float(sum(met[pre+'betweenness_effort'].values()))/float(len(met[pre+'betweenness_effort'].values()))
    met[pre+'in_degree'] = dsg.in_degree()
    met[pre+'avg_in_degree'] = float(sum(met[pre+'in_degree'].values()))/float(len(met[pre+'in_degree'].values()))
    met[pre+'out_degree'] = dsg.out_degree()
    met[pre+'avg_out_degree'] = float(sum(met[pre+'out_degree'].values()))/float(len(met[pre+'out_degree'].values()))
    met[pre+'degree'] = dsg.degree()
    met[pre+'avg_degree'] = float(sum(met[pre+'degree'].values()))/float(len(met[pre+'degree'].values()))
    met[pre+'degree_count'] = dsg.degree(weight='count')
    met[pre+'avg_degree_count'] = float(sum(met[pre+'degree_count'].values()))/float(len(met[pre+'degree_count'].values()))
    met[pre+'degree_effort'] = dsg.degree(weight='effort')
    met[pre+'avg_degree_effort'] = float(sum(met[pre+'degree_effort'].values()))/float(len(met[pre+'degree_effort'].values()))

Exported Format

{
  "edges": [
    {
      "effort": 4,
      "id": "2_1_1315491000",
      "source": "2",
      "target": "1",
      "team": false,
      "ts": 1315491000
    },
    ...
  ],
  "meta": {
    "generated": 1415788633
  },
  "metrics": [
    {
      "ts": 1315491000,
      ...
    }
  ],
  "nodes": [
    {
      "active": true,
      "created_on": "2011-09-08",
      "created_ts": 1315483000,
      "id": "1",
      "isolated": false,
      "name": "Alice",
      "team": true,
      "team_on": "2011-09-08",
      "team_ts": 1315483000
    },
    {...}
  ]
}
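A consumer of this format only needs a JSON parser. The sketch below reads a trimmed, hypothetical export (the node "Bob" and the reduced field set are made up so that the document is valid JSON) and resolves each edge to the names of the people it connects.

```python
import json

# Trimmed, hypothetical instance of the exported format.
EXPORT = """
{
  "edges": [
    {"effort": 4, "id": "2_1_1315491000", "source": "2", "target": "1",
     "team": false, "ts": 1315491000}
  ],
  "meta": {"generated": 1415788633},
  "metrics": [{"ts": 1315491000}],
  "nodes": [
    {"active": true, "id": "1", "name": "Alice", "team": true},
    {"active": true, "id": "2", "name": "Bob", "team": false}
  ]
}
"""

data = json.loads(EXPORT)
# Index nodes by id, then turn each edge into (author, recipient, effort).
names = {n["id"]: n["name"] for n in data["nodes"]}
links = [(names[e["source"]], names[e["target"]], e["effort"])
         for e in data["edges"]]
print(links)
```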

Edgesense Frontend

• Single page application

• D3.js

• Sigma.js

Dashboard: Network

• Uses sigma.js

• ForceAtlas layout *

• Contextual information

Dashboard: Metrics

• Sidebar, Bottom widgets

• Declaratively select metrics to display

<div class="small-box bg-maroon big-metric metric helped"
     data-metric-name="louvain_modularity"
     data-metric-round="3"
     data-help="modularity" >
  <div class="inner">
    <h3 class="value"> </h3>
    <p> Modularity </p>
  </div>
  <div class="minichart"> </div>
</div>

Dashboard: Filters

Extras

• Twitter parser

• GEXF exporting

Drupal!

• Module to embed Edgesense

• Configurator for the backend processing

• Configurator for the dashboard

Thank you!

P.S. Edgesense is open source:

github.com/Wikitalia/edgesense

Photo credits

https://www.flickr.com/photos/swedish_heritage_board/14141937687/
https://www.flickr.com/photos/nationaalarchief/5453358304/
https://www.flickr.com/photos/ul_digital_library/10922274335/
https://www.flickr.com/photos/texasstatearchives/9077251415/
https://www.flickr.com/photos/nasacommons/9465040235/
https://www.flickr.com/photos/nasacommons/9467807836
