Structure, Tie Persistence and Event Detection in Large Phone and SMS Networks

Leman Akoglu and Bhavana Dalvi
{lakoglu, bbd}@cs.cmu.edu

Carnegie Mellon University and iLab

Dataset used for this work was provided by iLab at Carnegie Mellon University.

• Tie Persistence (TP): The stability of a tie across time, measured as the number of time-ticks in which the link is observed divided by the total number P of time-ticks.

• User Perseverance (UP): The perseverance of a user is the average persistence of all his/her ties.
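
The two definitions above can be illustrated with a minimal Python sketch. The input format (a list of `(tick, u, v)` observations), the function names, and the choice to treat ties as undirected are assumptions made for illustration; the poster does not specify them.

```python
from collections import defaultdict


def tie_persistence(observations, num_ticks):
    """Return {tie: TP}, where TP = (# ticks in which the tie is observed) / P."""
    ticks_seen = defaultdict(set)
    for tick, u, v in observations:
        # Assumption: a tie is undirected, so (u, v) and (v, u) are the same tie.
        tie = (u, v) if u <= v else (v, u)
        ticks_seen[tie].add(tick)
    return {tie: len(ticks) / num_ticks for tie, ticks in ticks_seen.items()}


def user_perseverance(tp):
    """Return {user: UP}, the average persistence over all ties of that user."""
    per_user = defaultdict(list)
    for (u, v), persistence in tp.items():
        per_user[u].append(persistence)
        per_user[v].append(persistence)
    return {user: sum(ps) / len(ps) for user, ps in per_user.items()}


# Toy data: P = 4 time-ticks, three users (hypothetical values).
observations = [
    (0, "a", "b"), (1, "a", "b"), (2, "b", "a"), (3, "a", "b"),
    (0, "a", "c"),
]
tp = tie_persistence(observations, num_ticks=4)
up = user_perseverance(tp)
```

In this toy example the tie a-b is seen in all four ticks and a-c in only one, so user a's perseverance is the average of the two.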

Tie Persistence

(2) Event Detection:
• Define a sliding window of size W (set to 5 days).
• Generate a correlation matrix C, with Cij being Pearson’s correlation between the time series of pair (i,j) over window W.
• The principal eigenvector of C (the one with the largest eigenvalue) gives the “activity” of each node.
• Compare “activity” vectors over time by taking the dot product score Z (1 if same, 0 if perpendicular); flag time points with small Z.
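
The steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: power iteration stands in for a library eigen-solver, constant series are treated as uncorrelated with everything else, each window is compared only with the immediately preceding one, and the absolute value of the dot product is used because an eigenvector's sign is arbitrary. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def activity_vector(window, iters=200):
    """Unit-norm principal eigenvector of the correlation matrix of a W x N window."""
    n = len(window[0])
    cols = [[row[j] for row in window] for j in range(n)]
    C = [[1.0 if i == j else pearson(cols[i], cols[j]) for j in range(n)]
         for i in range(n)]
    # Power iteration from a fixed, non-uniform start vector. It fails only if
    # the start happens to be orthogonal to the principal eigenvector.
    v = [1.0 + i for i in range(n)]
    norm = math.sqrt(sum(a * a for a in v))
    v = [a / norm for a in v]
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0:
            break
        v = [a / norm for a in w]
    return v


def z_score(u, v):
    """|u . v|: 1 if the activity vectors coincide, 0 if they are perpendicular."""
    return abs(sum(a * b for a, b in zip(u, v)))


def sliding_z_scores(series, W):
    """Z between each length-W window of a T x N series and the previous window."""
    vectors = [activity_vector(series[s:s + W]) for s in range(len(series) - W + 1)]
    return [z_score(vectors[s], vectors[s - 1]) for s in range(1, len(vectors))]


# Toy series (hypothetical): nodes 0 and 1 move together for 5 days,
# then nodes 1 and 2 move in opposite directions for 5 days.
window_a = [[1, 2, 7], [2, 4, 7], [3, 6, 7], [4, 8, 7], [5, 10, 7]]
window_b = [[7, 1, 5], [7, 2, 4], [7, 3, 3], [7, 4, 2], [7, 5, 1]]
z_same = z_score(activity_vector(window_a), activity_vector(window_a))
z_changed = z_score(activity_vector(window_a), activity_vector(window_b))
scores = sliding_z_scores(window_a + window_b, W=5)
```

Identical windows give a score of 1, while the change in correlation structure between the two toy windows lowers the score, which is the situation the method flags.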

Structure Analysis

Phone and SMS network

Tie Attributes

• Reciprocity (R): 1 if the tie is reciprocal in the time tick
• Topological Overlap (TO): based on the # common neighbours

Node Attributes

• Degree (K): node degree
• Cluster Coefficient (C): based on the # triads in which the node is involved
• User reciprocity (r): fraction of ties containing both incoming and outgoing calls
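
As a rough illustration of these attributes, the sketch below computes them on a small directed call graph. The poster's exact formulas did not survive in this transcript, so the normalizations used here are assumptions: topological overlap divides the number of common neighbours by (k_u - 1) + (k_v - 1) - common, and the cluster coefficient divides the number of closed triads by k(k - 1)/2. Function names and the edge-set input format are likewise hypothetical.

```python
from collections import defaultdict


def build_neighbors(directed_edges):
    """Undirected neighbour sets derived from a set of directed (caller, callee) edges."""
    nbrs = defaultdict(set)
    for u, v in directed_edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def reciprocity(directed_edges, u, v):
    """R: 1 if calls go in both directions on the tie, else 0."""
    return 1 if (u, v) in directed_edges and (v, u) in directed_edges else 0


def topological_overlap(nbrs, u, v):
    """TO: common neighbours over the other neighbours of u and v (assumed normalization)."""
    common = len(nbrs[u] & nbrs[v])
    denom = (len(nbrs[u]) - 1) + (len(nbrs[v]) - 1) - common
    return common / denom if denom > 0 else 0.0


def clustering_coefficient(nbrs, u):
    """C: closed triads around u over the number of neighbour pairs (assumed normalization)."""
    k = len(nbrs[u])
    if k < 2:
        return 0.0
    ns = sorted(nbrs[u])
    triads = sum(1 for i, a in enumerate(ns) for b in ns[i + 1:] if b in nbrs[a])
    return triads / (k * (k - 1) / 2)


def user_reciprocity(directed_edges, nbrs, u):
    """r: fraction of u's ties that carry both incoming and outgoing calls."""
    if not nbrs[u]:
        return 0.0
    return sum(reciprocity(directed_edges, u, v) for v in nbrs[u]) / len(nbrs[u])


# Toy graph (hypothetical): a and b call each other, both call c, c calls d.
edges = {("a", "b"), ("b", "a"), ("a", "c"), ("b", "c"), ("c", "d")}
nbrs = build_neighbors(edges)
degree_c = len(nbrs["c"])
```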

Tie Persistence Event Detection

Methodology

(1) Feature extraction:
• Characterize nodes with 12 network-features F: degree (number of contacts), total weight (phone call duration), …
• One T×N time-series matrix per feature, T=183 days, N=1.8M users
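
To make the T×N layout concrete, here is a small sketch that builds the matrix for one feature, inweight (number of calls received per day). The record format `(day, caller, callee)` and the function name are assumptions for illustration; at the scale reported on the poster a sparse or array-based representation would presumably be needed instead of nested lists.

```python
def inweight_matrix(calls, num_days, users):
    """Return a T x N matrix X where X[t][j] = number of calls user j received on day t."""
    index = {user: j for j, user in enumerate(users)}
    X = [[0] * len(users) for _ in range(num_days)]
    for day, _caller, callee in calls:
        X[day][index[callee]] += 1
    return X


# Toy call log (hypothetical): T = 3 days, N = 3 users.
calls = [(0, "a", "b"), (0, "c", "b"), (1, "b", "a"), (2, "a", "c")]
X = inweight_matrix(calls, num_days=3, users=["a", "b", "c"])
```

Each row of `X` is one day and each column is one user's time series for the chosen feature.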

(left) Z score vs time with W=5 and F=inweight (number of calls received). The top 10 days with the largest Z scores are highlighted with red bars. (middle) u(t) vs r(t-1) for each node at T=Dec 26th. The top 5 nodes with the largest change are marked with red stars. (right) inweight vs time for the top 5 marked nodes; notice the change in calling behavior during the Christmas week.

• How to predict whether a link will persist in the future?

• Which link and node attributes are important in prediction?

Around 2M users, 50M edges, 500M phone calls/SMS

6 months of data

Tie strength based on (a) # SMS (b) # Phone calls (c) Duration of phone calls